
Merge tag 'f2fs-for-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
"In this round, we've added two small interfaces: (a) GC_URGENT_LOW
mode for performance and (b) F2FS_IOC_SEC_TRIM_FILE ioctl for
security.

The new GC mode allows Android to run some lower-priority GCs in the
background, while the new ioctl discards user information without a
race condition when an account is removed.

In addition, some patches were merged to address latency-related
issues. We've also fixed some compression-related bugs as well as some
edge-case race conditions.

Enhancements:
- add GC_URGENT_LOW mode in gc_urgent
- introduce F2FS_IOC_SEC_TRIM_FILE ioctl
- bypass racy readahead to improve read latencies
- shrink node_write lock coverage to avoid long latency

Bug fixes:
- fix missing compression flag control, i_size, and mount option
- fix deadlock between quota writes and checkpoint
- remove inode eviction path in synchronous path to avoid deadlock
- fix to wait GCed compressed page writeback
- fix a kernel panic in f2fs_is_compressed_page
- check page dirty status before writeback
- wait page writeback before update in node page write flow
- fix a race condition between f2fs_write_end_io and f2fs_del_fsync_node_entry

We've added some minor sanity checks and refactored trivial code
blocks for better readability and debugging information"

* tag 'f2fs-for-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (52 commits)
f2fs: prepare a waiter before entering io_schedule
f2fs: update_sit_entry: Make the judgment condition of f2fs_bug_on more intuitive
f2fs: replace test_and_set/clear_bit() with set/clear_bit()
f2fs: make file immutable even if releasing zero compression block
f2fs: compress: disable compression mount option if compression is off
f2fs: compress: add sanity check during compressed cluster read
f2fs: use macro instead of f2fs verity version
f2fs: fix deadlock between quota writes and checkpoint
f2fs: correct comment of f2fs_exist_written_data
f2fs: compress: delay temp page allocation
f2fs: compress: fix to update isize when overwriting compressed file
f2fs: space related cleanup
f2fs: fix use-after-free issue
f2fs: Change the type of f2fs_flush_inline_data() to void
f2fs: add F2FS_IOC_SEC_TRIM_FILE ioctl
f2fs: should avoid inode eviction in synchronous path
f2fs: segment.h: delete a duplicated word
f2fs: compress: fix to avoid memory leak on cc->cpages
f2fs: use generic names for generic ioctls
f2fs: don't keep meta inode pages used for compressed block migration
...

+815 -292
+3 -1
Documentation/ABI/testing/sysfs-fs-f2fs
···
229 229  Contact:	"Jaegeuk Kim" <jaegeuk@kernel.org>
230 230  Description:	Do background GC agressively when set. When gc_urgent = 1,
231 231  		background thread starts to do GC by given gc_urgent_sleep_time
232 -  		interval. It is set to 0 by default.
232 +  		interval. When gc_urgent = 2, F2FS will lower the bar of
233 +  		checking idle in order to process outstanding discard commands
234 +  		and GC a little bit aggressively. It is set to 0 by default.
233 235  
234 236  What:		/sys/fs/f2fs/<disk>/gc_urgent_sleep_time
235 237  Date:		August 2017
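For reference, the new low-urgency mode is driven entirely through this sysfs knob. A minimal userspace sketch, assuming the volume sits on a device named "sda1" (the path component is whatever block device backs the mount, not something this patch defines):

    /* Switch a mounted f2fs volume into GC_URGENT_LOW from userspace. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/sys/fs/f2fs/sda1/gc_urgent", O_WRONLY);

        if (fd < 0) {
            perror("open");
            return 1;
        }
        /* per the doc above: 0 = normal, 1 = urgent-high, 2 = urgent-low */
        if (write(fd, "2", 1) != 1)
            perror("write");
        close(fd);
        return 0;
    }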
+4 -2
Documentation/filesystems/f2fs.rst
···
258 258  		on compression extension list and enable compression on
259 259  		these file by default rather than to enable it via ioctl.
260 260  		For other files, we can still enable compression via ioctl.
261 +  		Note that, there is one reserved special extension '*', it
262 +  		can be set to enable compression for all files.
261 263  inlinecrypt	When possible, encrypt/decrypt the contents of encrypted
262 264  		files using the blk-crypto framework rather than
263 265  		filesystem-layer encryption. This allows the use of
···
745 743  
746 744  - In order to eliminate write amplification during overwrite, F2FS only
747 745    support compression on write-once file, data can be compressed only when
748 -    all logical blocks in file are valid and cluster compress ratio is lower
749 -    than specified threshold.
746 +    all logical blocks in cluster contain valid data and compress ratio of
747 +    cluster data is lower than specified threshold.
750 748  
751 749  - To enable compression on regular inode, there are three ways:
752 750  
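An illustrative sketch of the reserved '*' extension, mounting so that every regular file defaults to compression. The device, mount point, and algorithm choice are placeholders, and this assumes a kernel built with CONFIG_F2FS_FS_COMPRESSION:

    #include <stdio.h>
    #include <sys/mount.h>

    int main(void)
    {
        /* '*' in compress_extension opts every new file into compression */
        if (mount("/dev/sda1", "/mnt", "f2fs", 0,
                  "compress_algorithm=lz4,compress_extension=*"))
            perror("mount");
        return 0;
    }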
+9 -6
fs/f2fs/checkpoint.c
···
523 523  	__remove_ino_entry(sbi, ino, type);
524 524  }
525 525  
526 -  /* mode should be APPEND_INO or UPDATE_INO */
526 +  /* mode should be APPEND_INO, UPDATE_INO or TRANS_DIR_INO */
527 527  bool f2fs_exist_written_data(struct f2fs_sb_info *sbi, nid_t ino, int mode)
528 528  {
529 529  	struct inode_management *im = &sbi->im[mode];
···
1258 1258  	DEFINE_WAIT(wait);
1259 1259  
1260 1260  	for (;;) {
1261 -  		prepare_to_wait(&sbi->cp_wait, &wait, TASK_UNINTERRUPTIBLE);
1262 -  
1263 1261  		if (!get_pages(sbi, type))
1264 1262  			break;
···
1267 1269  		if (type == F2FS_DIRTY_META)
1268 1270  			f2fs_sync_meta_pages(sbi, META, LONG_MAX,
1269 1271  						FS_CP_META_IO);
1272 +  		else if (type == F2FS_WB_CP_DATA)
1273 +  			f2fs_submit_merged_write(sbi, DATA);
1274 +  
1275 +  		prepare_to_wait(&sbi->cp_wait, &wait, TASK_UNINTERRUPTIBLE);
1270 1276  		io_schedule_timeout(DEFAULT_IO_TIMEOUT);
1271 1277  	}
1272 1278  	finish_wait(&sbi->cp_wait, &wait);
···
1417 1415  			curseg_alloc_type(sbi, i + CURSEG_HOT_DATA);
1418 1416  	}
1419 1417  
1420 -  	/* 2 cp + n data seg summary + orphan inode blocks */
1418 +  	/* 2 cp + n data seg summary + orphan inode blocks */
1421 1419  	data_sum_blocks = f2fs_npages_for_summary_flush(sbi, false);
1422 1420  	spin_lock_irqsave(&sbi->cp_lock, flags);
1423 1421  	if (data_sum_blocks < NR_CURSEG_DATA_TYPE)
···
1517 1515  
1518 1516  	/*
1519 1517  	 * invalidate intermediate page cache borrowed from meta inode which are
1520 -  	 * used for migration of encrypted or verity inode's blocks.
1518 +  	 * used for migration of encrypted, verity or compressed inode's blocks.
1521 1519  	 */
1522 -  	if (f2fs_sb_has_encrypt(sbi) || f2fs_sb_has_verity(sbi))
1520 +  	if (f2fs_sb_has_encrypt(sbi) || f2fs_sb_has_verity(sbi) ||
1521 +  			f2fs_sb_has_compression(sbi))
1523 1522  		invalidate_mapping_pages(META_MAPPING(sbi),
1524 1523  				MAIN_BLKADDR(sbi), MAX_BLKADDR(sbi) - 1);
1525 1524  
+64 -25
fs/f2fs/compress.c
···
49 49  		return false;
50 50  	if (IS_ATOMIC_WRITTEN_PAGE(page) || IS_DUMMY_WRITTEN_PAGE(page))
51 51  		return false;
52 +  	/*
53 +  	 * page->private may be set with pid.
54 +  	 * pid_max is enough to check if it is traced.
55 +  	 */
56 +  	if (IS_IO_TRACED_PAGE(page))
57 +  		return false;
58 +  
52 59  	f2fs_bug_on(F2FS_M_SB(page->mapping),
53 60  		*((u32 *)page_private(page)) != F2FS_COMPRESSED_PAGE_MAGIC);
54 61  	return true;
···
513 506  	return f2fs_cops[F2FS_I(inode)->i_compress_algorithm];
514 507  }
515 508  
516 -  static mempool_t *compress_page_pool = NULL;
509 +  static mempool_t *compress_page_pool;
517 510  static int num_compress_pages = 512;
518 511  module_param(num_compress_pages, uint, 0444);
519 512  MODULE_PARM_DESC(num_compress_pages,
···
670 663  	const struct f2fs_compress_ops *cops =
671 664  		f2fs_cops[fi->i_compress_algorithm];
672 665  	int ret;
666 +  	int i;
673 667  
674 668  	dec_page_count(sbi, F2FS_RD_DATA);
675 669  
···
687 679  	if (dic->failed) {
688 680  		ret = -EIO;
689 681  		goto out_free_dic;
682 +  	}
683 +  
684 +  	dic->tpages = f2fs_kzalloc(sbi, sizeof(struct page *) *
685 +  					dic->cluster_size, GFP_NOFS);
686 +  	if (!dic->tpages) {
687 +  		ret = -ENOMEM;
688 +  		goto out_free_dic;
689 +  	}
690 +  
691 +  	for (i = 0; i < dic->cluster_size; i++) {
692 +  		if (dic->rpages[i]) {
693 +  			dic->tpages[i] = dic->rpages[i];
694 +  			continue;
695 +  		}
696 +  
697 +  		dic->tpages[i] = f2fs_compress_alloc_page();
698 +  		if (!dic->tpages[i]) {
699 +  			ret = -ENOMEM;
700 +  			goto out_free_dic;
701 +  		}
690 702  	}
691 703  
692 704  	if (cops->init_decompress_ctx) {
···
849 821  }
850 822  
851 823  /* return # of valid blocks in compressed cluster */
852 -  static int f2fs_cluster_blocks(struct compress_ctx *cc, bool compr)
824 +  static int f2fs_cluster_blocks(struct compress_ctx *cc)
853 825  {
854 826  	return __f2fs_cluster_blocks(cc, false);
855 827  }
···
863 835  		.cluster_idx = index >> F2FS_I(inode)->i_log_cluster_size,
864 836  	};
865 837  
866 -  	return f2fs_cluster_blocks(&cc, false);
838 +  	return f2fs_cluster_blocks(&cc);
867 839  }
···
914 886  	bool prealloc;
915 887  
916 888  retry:
917 -  	ret = f2fs_cluster_blocks(cc, false);
889 +  	ret = f2fs_cluster_blocks(cc);
918 890  	if (ret <= 0)
919 891  		return ret;
920 892  
···
977 949  	}
978 950  
979 951  	if (prealloc) {
980 -  		__do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO, true);
952 +  		f2fs_do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO, true);
981 953  
982 954  		set_new_dnode(&dn, cc->inode, NULL, NULL, 0);
983 955  
···
992 964  				break;
993 965  		}
994 966  
995 -  		__do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO, false);
967 +  		f2fs_do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO, false);
996 968  	}
997 969  
998 970  	if (likely(!ret)) {
···
1124 1096  	loff_t psize;
1125 1097  	int i, err;
1126 1098  
1127 -  	if (!IS_NOQUOTA(inode) && !f2fs_trylock_op(sbi))
1099 +  	if (IS_NOQUOTA(inode)) {
1100 +  		/*
1101 +  		 * We need to wait for node_write to avoid block allocation during
1102 +  		 * checkpoint. This can only happen to quota writes which can cause
1103 +  		 * the below discard race condition.
1104 +  		 */
1105 +  		down_read(&sbi->node_write);
1106 +  	} else if (!f2fs_trylock_op(sbi)) {
1128 1107  		return -EAGAIN;
1108 +  	}
1129 1109  
1130 1110  	set_new_dnode(&dn, cc->inode, NULL, NULL, 0);
1131 1111  
···
1173 1137  		f2fs_set_compressed_page(cc->cpages[i], inode,
1174 1138  					cc->rpages[i + 1]->index, cic);
1175 1139  		fio.compressed_page = cc->cpages[i];
1140 +  
1141 +  		fio.old_blkaddr = data_blkaddr(dn.inode, dn.node_page,
1142 +  						dn.ofs_in_node + i + 1);
1143 +  
1144 +  		/* wait for GCed page writeback via META_MAPPING */
1145 +  		f2fs_wait_on_block_writeback(inode, fio.old_blkaddr);
1146 +  
1176 1147  		if (fio.encrypted) {
1177 1148  			fio.page = cc->rpages[i + 1];
1178 1149  			err = f2fs_encrypt_one_page(&fio);
···
1246 1203  		set_inode_flag(inode, FI_FIRST_BLOCK_WRITTEN);
1247 1204  
1248 1205  	f2fs_put_dnode(&dn);
1249 -  	if (!IS_NOQUOTA(inode))
1206 +  	if (IS_NOQUOTA(inode))
1207 +  		up_read(&sbi->node_write);
1208 +  	else
1250 1209  		f2fs_unlock_op(sbi);
1251 1210  
1252 1211  	spin_lock(&fi->i_size_lock);
···
1275 1230  out_put_dnode:
1276 1231  	f2fs_put_dnode(&dn);
1277 1232  out_unlock_op:
1278 -  	if (!IS_NOQUOTA(inode))
1233 +  	if (IS_NOQUOTA(inode))
1234 +  		up_read(&sbi->node_write);
1235 +  	else
1279 1236  		f2fs_unlock_op(sbi);
1280 1237  	return -EAGAIN;
1281 1238  }
···
1357 1310  				congestion_wait(BLK_RW_ASYNC,
1358 1311  						DEFAULT_IO_TIMEOUT);
1359 1312  				lock_page(cc->rpages[i]);
1313 +  
1314 +  				if (!PageDirty(cc->rpages[i])) {
1315 +  					unlock_page(cc->rpages[i]);
1316 +  					continue;
1317 +  				}
1318 +  
1360 1319  				clear_page_dirty_for_io(cc->rpages[i]);
1361 1320  				goto retry_write;
1362 1321  			}
···
1406 1353  		err = f2fs_write_compressed_pages(cc, submitted,
1407 1354  							wbc, io_type);
1408 1355  		cops->destroy_compress_ctx(cc);
1356 +  		kfree(cc->cpages);
1357 +  		cc->cpages = NULL;
1409 1358  		if (!err)
1410 1359  			return 0;
1411 1360  		f2fs_bug_on(F2FS_I_SB(cc->inode), err != -EAGAIN);
···
1468 1413  		f2fs_set_compressed_page(page, cc->inode,
1469 1414  					start_idx + i + 1, dic);
1470 1415  		dic->cpages[i] = page;
1471 -  	}
1472 -  
1473 -  	dic->tpages = f2fs_kzalloc(sbi, sizeof(struct page *) *
1474 -  					dic->cluster_size, GFP_NOFS);
1475 -  	if (!dic->tpages)
1476 -  		goto out_free;
1477 -  
1478 -  	for (i = 0; i < dic->cluster_size; i++) {
1479 -  		if (cc->rpages[i]) {
1480 -  			dic->tpages[i] = cc->rpages[i];
1481 -  			continue;
1482 -  		}
1483 -  
1484 -  		dic->tpages[i] = f2fs_compress_alloc_page();
1485 -  		if (!dic->tpages[i])
1486 -  			goto out_free;
1487 1416  	}
1488 1417  
1489 1418  	return dic;
+70 -23
fs/f2fs/data.c
···
87 87  	sbi = F2FS_I_SB(inode);
88 88  
89 89  	if (inode->i_ino == F2FS_META_INO(sbi) ||
90 -  		inode->i_ino == F2FS_NODE_INO(sbi) ||
90 +  			inode->i_ino == F2FS_NODE_INO(sbi) ||
91 91  			S_ISDIR(inode->i_mode) ||
92 92  			(S_ISREG(inode->i_mode) &&
93 93  			(f2fs_is_atomic_file(inode) || IS_NOQUOTA(inode))) ||
···
1073 1073  
1074 1074  /* This can handle encryption stuffs */
1075 1075  static int f2fs_submit_page_read(struct inode *inode, struct page *page,
1076 -  				block_t blkaddr, bool for_write)
1076 +  				block_t blkaddr, int op_flags, bool for_write)
1077 1077  {
1078 1078  	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
1079 1079  	struct bio *bio;
1080 1080  
1081 -  	bio = f2fs_grab_read_bio(inode, blkaddr, 1, 0, page->index, for_write);
1081 +  	bio = f2fs_grab_read_bio(inode, blkaddr, 1, op_flags,
1082 +  					page->index, for_write);
1082 1083  	if (IS_ERR(bio))
1083 1084  		return PTR_ERR(bio);
1084 1085  
···
1194 1193  
1195 1194  int f2fs_get_block(struct dnode_of_data *dn, pgoff_t index)
1196 1195  {
1197 -  	struct extent_info ei = {0,0,0};
1196 +  	struct extent_info ei = {0, 0, 0};
1198 1197  	struct inode *inode = dn->inode;
1199 1198  
1200 1199  	if (f2fs_lookup_extent_cache(inode, index, &ei)) {
···
1266 1265  		return page;
1267 1266  	}
1268 1267  
1269 -  	err = f2fs_submit_page_read(inode, page, dn.data_blkaddr, for_write);
1268 +  	err = f2fs_submit_page_read(inode, page, dn.data_blkaddr,
1269 +  						op_flags, for_write);
1270 1270  	if (err)
1271 1271  		goto put_err;
1272 1272  	return page;
···
1416 1414  	set_summary(&sum, dn->nid, dn->ofs_in_node, ni.version);
1417 1415  	old_blkaddr = dn->data_blkaddr;
1418 1416  	f2fs_allocate_data_block(sbi, NULL, old_blkaddr, &dn->data_blkaddr,
1419 -  					&sum, seg_type, NULL, false);
1417 +  					&sum, seg_type, NULL);
1420 1418  	if (GET_SEGNO(sbi, old_blkaddr) != NULL_SEGNO)
1421 1419  		invalidate_mapping_pages(META_MAPPING(sbi),
1422 1420  					old_blkaddr, old_blkaddr);
···
1476 1474  	return err;
1477 1475  }
1478 1476  
1479 -  void __do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock)
1477 +  void f2fs_do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock)
1480 1478  {
1481 1479  	if (flag == F2FS_GET_BLOCK_PRE_AIO) {
1482 1480  		if (lock)
···
1541 1539  
1542 1540  next_dnode:
1543 1541  	if (map->m_may_create)
1544 -  		__do_map_lock(sbi, flag, true);
1542 +  		f2fs_do_map_lock(sbi, flag, true);
1545 1543  
1546 1544  	/* When reading holes, we need its node page */
1547 1545  	set_new_dnode(&dn, inode, NULL, NULL, 0);
···
1690 1688  	f2fs_put_dnode(&dn);
1691 1689  
1692 1690  	if (map->m_may_create) {
1693 -  		__do_map_lock(sbi, flag, false);
1691 +  		f2fs_do_map_lock(sbi, flag, false);
1694 1692  		f2fs_balance_fs(sbi, dn.node_changed);
1695 1693  	}
1696 1694  	goto next_dnode;
···
1716 1714  	f2fs_put_dnode(&dn);
1717 1715  unlock_out:
1718 1716  	if (map->m_may_create) {
1719 -  		__do_map_lock(sbi, flag, false);
1717 +  		f2fs_do_map_lock(sbi, flag, false);
1720 1718  		f2fs_balance_fs(sbi, dn.node_changed);
1721 1719  	}
1722 1720  out:
···
1863 1861  			flags |= FIEMAP_EXTENT_LAST;
1864 1862  
1865 1863  		err = fiemap_fill_next_extent(fieinfo, 0, phys, len, flags);
1864 +  		trace_f2fs_fiemap(inode, 0, phys, len, flags, err);
1866 1865  		if (err || err == 1)
1867 1866  			return err;
1868 1867  	}
···
1887 1884  		flags = FIEMAP_EXTENT_LAST;
1888 1885  	}
1889 1886  
1890 -  	if (phys)
1887 +  	if (phys) {
1891 1888  		err = fiemap_fill_next_extent(fieinfo, 0, phys, len, flags);
1889 +  		trace_f2fs_fiemap(inode, 0, phys, len, flags, err);
1890 +  	}
1892 1891  
1893 1892  	return (err < 0 ? err : 0);
1894 1893  }
···
1984 1979  
1985 1980  			ret = fiemap_fill_next_extent(fieinfo, logical,
1986 1981  					phys, size, flags);
1982 +  			trace_f2fs_fiemap(inode, logical, phys, size, flags, ret);
1987 1983  			if (ret)
1988 1984  				goto out;
1989 1985  			size = 0;
···
2219 2213  	if (ret)
2220 2214  		goto out;
2221 2215  
2222 -  	/* cluster was overwritten as normal cluster */
2223 -  	if (dn.data_blkaddr != COMPRESS_ADDR)
2224 -  		goto out;
2216 +  	f2fs_bug_on(sbi, dn.data_blkaddr != COMPRESS_ADDR);
2225 2217  
2226 2218  	for (i = 1; i < cc->cluster_size; i++) {
2227 2219  		block_t blkaddr;
···
2346 2342  	unsigned nr_pages = rac ? readahead_count(rac) : 1;
2347 2343  	unsigned max_nr_pages = nr_pages;
2348 2344  	int ret = 0;
2345 +  	bool drop_ra = false;
2349 2346  
2350 2347  	map.m_pblk = 0;
2351 2348  	map.m_lblk = 0;
···
2357 2352  	map.m_seg_type = NO_CHECK_TYPE;
2358 2353  	map.m_may_create = false;
2359 2354  
2355 +  	/*
2356 +  	 * Two readahead threads for same address range can cause race condition
2357 +  	 * which fragments sequential read IOs. So let's avoid each other.
2358 +  	 */
2359 +  	if (rac && readahead_count(rac)) {
2360 +  		if (READ_ONCE(F2FS_I(inode)->ra_offset) == readahead_index(rac))
2361 +  			drop_ra = true;
2362 +  		else
2363 +  			WRITE_ONCE(F2FS_I(inode)->ra_offset,
2364 +  						readahead_index(rac));
2365 +  	}
2366 +  
2360 2367  	for (; nr_pages; nr_pages--) {
2361 2368  		if (rac) {
2362 2369  			page = readahead_page(rac);
2363 2370  			prefetchw(&page->flags);
2371 +  			if (drop_ra) {
2372 +  				f2fs_put_page(page, 1);
2373 +  				continue;
2374 +  			}
2364 2375  		}
2365 2376  
2366 2377  #ifdef CONFIG_F2FS_FS_COMPRESSION
···
2439 2418  	}
2440 2419  	if (bio)
2441 2420  		__submit_bio(F2FS_I_SB(inode), bio, DATA);
2421 +  
2422 +  	if (rac && readahead_count(rac) && !drop_ra)
2423 +  		WRITE_ONCE(F2FS_I(inode)->ra_offset, -1);
2442 2424  	return ret;
2443 2425  }
···
2796 2772  
2797 2773  	/* Dentry/quota blocks are controlled by checkpoint */
2798 2774  	if (S_ISDIR(inode->i_mode) || IS_NOQUOTA(inode)) {
2775 +  		/*
2776 +  		 * We need to wait for node_write to avoid block allocation during
2777 +  		 * checkpoint. This can only happen to quota writes which can cause
2778 +  		 * the below discard race condition.
2779 +  		 */
2780 +  		if (IS_NOQUOTA(inode))
2781 +  			down_read(&sbi->node_write);
2782 +  
2799 2783  		fio.need_lock = LOCK_DONE;
2800 2784  		err = f2fs_do_write_data_page(&fio);
2785 +  
2786 +  		if (IS_NOQUOTA(inode))
2787 +  			up_read(&sbi->node_write);
2788 +  
2801 2789  		goto done;
2802 2790  	}
···
3304 3268  
3305 3269  	if (f2fs_has_inline_data(inode) ||
3306 3270  			(pos & PAGE_MASK) >= i_size_read(inode)) {
3307 -  		__do_map_lock(sbi, flag, true);
3271 +  		f2fs_do_map_lock(sbi, flag, true);
3308 3272  		locked = true;
3309 3273  	}
3310 3274  
···
3341 3305  		err = f2fs_get_dnode_of_data(&dn, index, LOOKUP_NODE);
3342 3306  		if (err || dn.data_blkaddr == NULL_ADDR) {
3343 3307  			f2fs_put_dnode(&dn);
3344 -  			__do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO,
3308 +  			f2fs_do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO,
3345 3309  							true);
3346 3310  			WARN_ON(flag != F2FS_GET_BLOCK_PRE_AIO);
3347 3311  			locked = true;
···
3357 3321  	f2fs_put_dnode(&dn);
3358 3322  unlock_out:
3359 3323  	if (locked)
3360 -  		__do_map_lock(sbi, flag, false);
3324 +  		f2fs_do_map_lock(sbi, flag, false);
3361 3325  	return err;
3362 3326  }
···
3469 3433  			err = -EFSCORRUPTED;
3470 3434  			goto fail;
3471 3435  		}
3472 -  		err = f2fs_submit_page_read(inode, page, blkaddr, true);
3436 +  		err = f2fs_submit_page_read(inode, page, blkaddr, 0, true);
3473 3437  		if (err)
3474 3438  			goto fail;
3475 3439  
···
3519 3483  	if (f2fs_compressed_file(inode) && fsdata) {
3520 3484  		f2fs_compress_write_end(inode, fsdata, page->index, copied);
3521 3485  		f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
3486 +  
3487 +  		if (pos + copied > i_size_read(inode) &&
3488 +  				!f2fs_verity_in_progress(inode))
3489 +  			f2fs_i_size_write(inode, pos + copied);
3522 3490  		return copied;
3523 3491  	}
3524 3492  #endif
···
3782 3742  	}
3783 3743  
3784 3744  	f2fs_put_dnode(&dn);
3785 -  
3786 3745  	return blknr;
3787 3746  #else
3788 -  	return -EOPNOTSUPP;
3747 +  	return 0;
3789 3748  #endif
3790 3749  }
···
3792 3753  static sector_t f2fs_bmap(struct address_space *mapping, sector_t block)
3793 3754  {
3794 3755  	struct inode *inode = mapping->host;
3756 +  	struct buffer_head tmp = {
3757 +  		.b_size = i_blocksize(inode),
3758 +  	};
3759 +  	sector_t blknr = 0;
3795 3760  
3796 3761  	if (f2fs_has_inline_data(inode))
3797 -  		return 0;
3762 +  		goto out;
3798 3763  
3799 3764  	/* make sure allocating whole blocks */
3800 3765  	if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
3801 3766  		filemap_write_and_wait(mapping);
3802 3767  
3803 3768  	if (f2fs_compressed_file(inode))
3804 -  		return f2fs_bmap_compress(inode, block);
3769 +  		blknr = f2fs_bmap_compress(inode, block);
3805 3770  
3806 -  	return generic_block_bmap(mapping, block, get_data_block_bmap);
3771 +  	if (!get_data_block_bmap(inode, block, &tmp, 0))
3772 +  		blknr = tmp.b_blocknr;
3773 +  out:
3774 +  	trace_f2fs_bmap(inode, block, blknr);
3775 +  	return blknr;
3807 3776  }
3808 3777  
3809 3778  #ifdef CONFIG_MIGRATION
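The reworked f2fs_bmap above now reports 0 for inline-data files (and traces every lookup). Userspace exercises this through the generic FIBMAP ioctl; a minimal sketch, assuming a placeholder file name and recent kernel headers (FIBMAP needs CAP_SYS_RAWIO):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <linux/fs.h>    /* FIBMAP */

    int main(void)
    {
        int fd = open("file.bin", O_RDONLY);
        int blk = 0;    /* logical block in, physical block out */

        if (fd < 0 || ioctl(fd, FIBMAP, &blk))
            perror("fibmap");
        else
            printf("physical block: %d\n", blk);    /* 0 if unmapped/inline */
        return 0;
    }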
+52 -12
fs/f2fs/debug.c
···
174 174  	for (i = META_CP; i < META_MAX; i++)
175 175  		si->meta_count[i] = atomic_read(&sbi->meta_count[i]);
176 176  
177 +  	for (i = 0; i < NO_CHECK_TYPE; i++) {
178 +  		si->dirty_seg[i] = 0;
179 +  		si->full_seg[i] = 0;
180 +  		si->valid_blks[i] = 0;
181 +  	}
182 +  
183 +  	for (i = 0; i < MAIN_SEGS(sbi); i++) {
184 +  		int blks = get_seg_entry(sbi, i)->valid_blocks;
185 +  		int type = get_seg_entry(sbi, i)->type;
186 +  
187 +  		if (!blks)
188 +  			continue;
189 +  
190 +  		if (blks == sbi->blocks_per_seg)
191 +  			si->full_seg[type]++;
192 +  		else
193 +  			si->dirty_seg[type]++;
194 +  		si->valid_blks[type] += blks;
195 +  	}
196 +  
177 197  	for (i = 0; i < 2; i++) {
178 198  		si->segment_count[i] = sbi->segment_count[i];
179 199  		si->block_count[i] = sbi->block_count[i];
···
349 329  	seq_printf(s, "\nMain area: %d segs, %d secs %d zones\n",
350 330  			si->main_area_segs, si->main_area_sections,
351 331  			si->main_area_zones);
352 -  	seq_printf(s, " - COLD data: %d, %d, %d\n",
332 +  	seq_printf(s, " TYPE %8s %8s %8s %10s %10s %10s\n",
333 +  			"segno", "secno", "zoneno", "dirty_seg", "full_seg", "valid_blk");
334 +  	seq_printf(s, " - COLD data: %8d %8d %8d %10u %10u %10u\n",
353 335  			si->curseg[CURSEG_COLD_DATA],
354 336  			si->cursec[CURSEG_COLD_DATA],
355 -  			si->curzone[CURSEG_COLD_DATA]);
356 -  	seq_printf(s, " - WARM data: %d, %d, %d\n",
337 +  			si->curzone[CURSEG_COLD_DATA],
338 +  			si->dirty_seg[CURSEG_COLD_DATA],
339 +  			si->full_seg[CURSEG_COLD_DATA],
340 +  			si->valid_blks[CURSEG_COLD_DATA]);
341 +  	seq_printf(s, " - WARM data: %8d %8d %8d %10u %10u %10u\n",
357 342  			si->curseg[CURSEG_WARM_DATA],
358 343  			si->cursec[CURSEG_WARM_DATA],
359 -  			si->curzone[CURSEG_WARM_DATA]);
360 -  	seq_printf(s, " - HOT data: %d, %d, %d\n",
344 +  			si->curzone[CURSEG_WARM_DATA],
345 +  			si->dirty_seg[CURSEG_WARM_DATA],
346 +  			si->full_seg[CURSEG_WARM_DATA],
347 +  			si->valid_blks[CURSEG_WARM_DATA]);
348 +  	seq_printf(s, " - HOT data: %8d %8d %8d %10u %10u %10u\n",
361 349  			si->curseg[CURSEG_HOT_DATA],
362 350  			si->cursec[CURSEG_HOT_DATA],
363 -  			si->curzone[CURSEG_HOT_DATA]);
364 -  	seq_printf(s, " - Dir dnode: %d, %d, %d\n",
351 +  			si->curzone[CURSEG_HOT_DATA],
352 +  			si->dirty_seg[CURSEG_HOT_DATA],
353 +  			si->full_seg[CURSEG_HOT_DATA],
354 +  			si->valid_blks[CURSEG_HOT_DATA]);
355 +  	seq_printf(s, " - Dir dnode: %8d %8d %8d %10u %10u %10u\n",
365 356  			si->curseg[CURSEG_HOT_NODE],
366 357  			si->cursec[CURSEG_HOT_NODE],
367 -  			si->curzone[CURSEG_HOT_NODE]);
368 -  	seq_printf(s, " - File dnode: %d, %d, %d\n",
358 +  			si->curzone[CURSEG_HOT_NODE],
359 +  			si->dirty_seg[CURSEG_HOT_NODE],
360 +  			si->full_seg[CURSEG_HOT_NODE],
361 +  			si->valid_blks[CURSEG_HOT_NODE]);
362 +  	seq_printf(s, " - File dnode: %8d %8d %8d %10u %10u %10u\n",
369 363  			si->curseg[CURSEG_WARM_NODE],
370 364  			si->cursec[CURSEG_WARM_NODE],
371 -  			si->curzone[CURSEG_WARM_NODE]);
372 -  	seq_printf(s, " - Indir nodes: %d, %d, %d\n",
365 +  			si->curzone[CURSEG_WARM_NODE],
366 +  			si->dirty_seg[CURSEG_WARM_NODE],
367 +  			si->full_seg[CURSEG_WARM_NODE],
368 +  			si->valid_blks[CURSEG_WARM_NODE]);
369 +  	seq_printf(s, " - Indir nodes: %8d %8d %8d %10u %10u %10u\n",
373 370  			si->curseg[CURSEG_COLD_NODE],
374 371  			si->cursec[CURSEG_COLD_NODE],
375 -  			si->curzone[CURSEG_COLD_NODE]);
372 +  			si->curzone[CURSEG_COLD_NODE],
373 +  			si->dirty_seg[CURSEG_COLD_NODE],
374 +  			si->full_seg[CURSEG_COLD_NODE],
375 +  			si->valid_blks[CURSEG_COLD_NODE]);
376 376  	seq_printf(s, "\n - Valid: %d\n - Dirty: %d\n",
377 377  			si->main_area_segs - si->dirty_count -
378 378  			si->prefree_count - si->free_segs,
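The extended per-type table lands in the f2fs status file, which normally lives at /sys/kernel/debug/f2fs/status when debugfs is mounted (path assumed here). A trivial reader sketch:

    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/sys/kernel/debug/f2fs/status", "r");
        char line[256];

        if (!f) {
            perror("fopen");
            return 1;
        }
        /* dump the stats, including the new dirty_seg/full_seg/valid_blk columns */
        while (fgets(line, sizeof(line), f))
            fputs(line, stdout);
        fclose(f);
        return 0;
    }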
+1 -1
fs/f2fs/dir.c
···
779 779  		return err;
780 780  
781 781  	/*
782 -  	 * An immature stakable filesystem shows a race condition between lookup
782 +  	 * An immature stackable filesystem shows a race condition between lookup
783 783  	 * and create. If we have same task when doing lookup and create, it's
784 784  	 * definitely fine as expected by VFS normally. Otherwise, let's just
785 785  	 * verify on-disk dentry one more time, which guarantees filesystem
+9 -9
fs/f2fs/extent_cache.c
···
325 325  }
326 326  
327 327  /* return true, if inode page is changed */
328 -  static bool __f2fs_init_extent_tree(struct inode *inode, struct f2fs_extent *i_ext)
328 +  static void __f2fs_init_extent_tree(struct inode *inode, struct page *ipage)
329 329  {
330 330  	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
331 +  	struct f2fs_extent *i_ext = ipage ? &F2FS_INODE(ipage)->i_ext : NULL;
331 332  	struct extent_tree *et;
332 333  	struct extent_node *en;
333 334  	struct extent_info ei;
···
336 335  	if (!f2fs_may_extent_tree(inode)) {
337 336  		/* drop largest extent */
338 337  		if (i_ext && i_ext->len) {
338 +  			f2fs_wait_on_page_writeback(ipage, NODE, true, true);
339 339  			i_ext->len = 0;
340 -  			return true;
340 +  			set_page_dirty(ipage);
341 +  			return;
341 342  		}
342 -  		return false;
343 +  		return;
343 344  	}
344 345  
345 346  	et = __grab_extent_tree(inode);
346 347  
347 348  	if (!i_ext || !i_ext->len)
348 -  		return false;
349 +  		return;
349 350  
350 351  	get_extent_info(&ei, i_ext);
351 352  
···
363 360  	}
364 361  out:
365 362  	write_unlock(&et->lock);
366 -  	return false;
367 363  }
368 -  
369 -  bool f2fs_init_extent_tree(struct inode *inode, struct f2fs_extent *i_ext)
364 +  
365 +  void f2fs_init_extent_tree(struct inode *inode, struct page *ipage)
370 -  {
371 -  	bool ret = __f2fs_init_extent_tree(inode, i_ext);
366 +  {
367 +  	__f2fs_init_extent_tree(inode, ipage);
372 368  
373 369  	if (!F2FS_I(inode)->extent_tree)
374 370  		set_inode_flag(inode, FI_NO_EXTENT);
375 -  
376 -  	return ret;
377 371  }
378 372  
379 373  static bool f2fs_lookup_extent_tree(struct inode *inode, pgoff_t pgofs,
+46 -35
fs/f2fs/f2fs.h
···
402 402  }
403 403  
404 404  /*
405 -   * ioctl commands
405 +   * f2fs-specific ioctl commands
406 406   */
407 -  #define F2FS_IOC_GETFLAGS		FS_IOC_GETFLAGS
408 -  #define F2FS_IOC_SETFLAGS		FS_IOC_SETFLAGS
409 -  #define F2FS_IOC_GETVERSION		FS_IOC_GETVERSION
410 -  
411 407  #define F2FS_IOCTL_MAGIC		0xf5
412 408  #define F2FS_IOC_START_ATOMIC_WRITE	_IO(F2FS_IOCTL_MAGIC, 1)
413 409  #define F2FS_IOC_COMMIT_ATOMIC_WRITE	_IO(F2FS_IOCTL_MAGIC, 2)
···
430 434  					_IOR(F2FS_IOCTL_MAGIC, 18, __u64)
431 435  #define F2FS_IOC_RESERVE_COMPRESS_BLOCKS	\
432 436  					_IOR(F2FS_IOCTL_MAGIC, 19, __u64)
433 -  
434 -  #define F2FS_IOC_GET_VOLUME_NAME	FS_IOC_GETFSLABEL
435 -  #define F2FS_IOC_SET_VOLUME_NAME	FS_IOC_SETFSLABEL
436 -  
437 -  #define F2FS_IOC_SET_ENCRYPTION_POLICY	FS_IOC_SET_ENCRYPTION_POLICY
438 -  #define F2FS_IOC_GET_ENCRYPTION_POLICY	FS_IOC_GET_ENCRYPTION_POLICY
439 -  #define F2FS_IOC_GET_ENCRYPTION_PWSALT	FS_IOC_GET_ENCRYPTION_PWSALT
437 +  #define F2FS_IOC_SEC_TRIM_FILE		_IOW(F2FS_IOCTL_MAGIC, 20, \
438 +  						struct f2fs_sectrim_range)
440 439  
441 440  /*
442 441   * should be same as XFS_IOC_GOINGDOWN.
···
444 453  #define F2FS_GOING_DOWN_METAFLUSH	0x3	/* going down with meta flush */
445 454  #define F2FS_GOING_DOWN_NEED_FSCK	0x4	/* going down to trigger fsck */
446 455  
447 -  #if defined(__KERNEL__) && defined(CONFIG_COMPAT)
448 456  /*
449 -   * ioctl commands in 32 bit emulation
457 +   * Flags used by F2FS_IOC_SEC_TRIM_FILE
450 458   */
451 -  #define F2FS_IOC32_GETFLAGS		FS_IOC32_GETFLAGS
452 -  #define F2FS_IOC32_SETFLAGS		FS_IOC32_SETFLAGS
453 -  #define F2FS_IOC32_GETVERSION		FS_IOC32_GETVERSION
454 -  #endif
455 -  
456 -  #define F2FS_IOC_FSGETXATTR		FS_IOC_FSGETXATTR
457 -  #define F2FS_IOC_FSSETXATTR		FS_IOC_FSSETXATTR
459 +  #define F2FS_TRIM_FILE_DISCARD		0x1	/* send discard command */
460 +  #define F2FS_TRIM_FILE_ZEROOUT		0x2	/* zero out */
461 +  #define F2FS_TRIM_FILE_MASK		0x3
458 462  
459 463  struct f2fs_gc_range {
460 464  	u32 sync;
···
472 486  struct f2fs_flush_device {
473 487  	u32 dev_num;		/* device number to flush */
474 488  	u32 segments;		/* # of segments to flush */
489 +  };
490 +  
491 +  struct f2fs_sectrim_range {
492 +  	u64 start;
493 +  	u64 len;
494 +  	u64 flags;
475 495  };
476 496  
477 497  /* for inline stuff */
···
786 794  	struct list_head inmem_pages;	/* inmemory pages managed by f2fs */
787 795  	struct task_struct *inmem_task;	/* store inmemory task */
788 796  	struct mutex inmem_lock;	/* lock for inmemory pages */
797 +  	pgoff_t ra_offset;		/* ongoing readahead offset */
789 798  	struct extent_tree *extent_tree;	/* cached extent_tree entry */
790 799  
791 800  	/* avoid racing between foreground op and gc */
···
1260 1267  	GC_NORMAL,
1261 1268  	GC_IDLE_CB,
1262 1269  	GC_IDLE_GREEDY,
1263 -  	GC_URGENT,
1270 +  	GC_URGENT_HIGH,
1271 +  	GC_URGENT_LOW,
1264 1272  };
1265 1273  
1266 1274  enum {
···
1306 1312  		(page_private(page) == (unsigned long)ATOMIC_WRITTEN_PAGE)
1307 1313  #define IS_DUMMY_WRITTEN_PAGE(page)			\
1308 1314  		(page_private(page) == (unsigned long)DUMMY_WRITTEN_PAGE)
1315 +  
1316 +  #ifdef CONFIG_F2FS_IO_TRACE
1317 +  #define IS_IO_TRACED_PAGE(page)			\
1318 +  		(page_private(page) > 0 &&		\
1319 +  			page_private(page) < (unsigned long)PID_MAX_LIMIT)
1320 +  #else
1321 +  #define IS_IO_TRACED_PAGE(page) (0)
1322 +  #endif
1309 1323  
1310 1324  #ifdef CONFIG_FS_ENCRYPTION
1311 1325  #define DUMMY_ENCRYPTION_ENABLED(sbi)			\
···
1440 1438  	unsigned long last_time[MAX_TIME];	/* to store time in jiffies */
1441 1439  	long interval_time[MAX_TIME];		/* to store thresholds */
1442 1440  
1443 -  	struct inode_management im[MAX_INO_ENTRY];	/* manage inode cache */
1441 +  	struct inode_management im[MAX_INO_ENTRY];	/* manage inode cache */
1444 1442  
1445 1443  	spinlock_t fsync_node_lock;		/* for node entry lock */
1446 1444  	struct list_head fsync_node_list;	/* node list head */
···
1518 1516  	unsigned int cur_victim_sec;		/* current victim section num */
1519 1517  	unsigned int gc_mode;			/* current GC state */
1520 1518  	unsigned int next_victim_seg[2];	/* next segment in victim section */
1519 +  
1521 1520  	/* for skip statistic */
1522 -  	unsigned int atomic_files;	/* # of opened atomic file */
1521 +  	unsigned int atomic_files;		/* # of opened atomic file */
1523 1522  	unsigned long long skipped_atomic_files[2];	/* FG_GC and BG_GC */
1524 1523  	unsigned long long skipped_gc_rwsem;		/* FG_GC only */
···
2459 2456  
2460 2457  static inline bool is_idle(struct f2fs_sb_info *sbi, int type)
2461 2458  {
2462 -  	if (sbi->gc_mode == GC_URGENT)
2459 +  	if (sbi->gc_mode == GC_URGENT_HIGH)
2463 2460  		return true;
2464 2461  
2465 2462  	if (get_pages(sbi, F2FS_RD_DATA) || get_pages(sbi, F2FS_RD_NODE) ||
···
2476 2473  	if (SM_I(sbi) && SM_I(sbi)->fcc_info &&
2477 2474  		atomic_read(&SM_I(sbi)->fcc_info->queued_flush))
2478 2475  		return false;
2476 +  
2477 +  	if (sbi->gc_mode == GC_URGENT_LOW &&
2478 +  			(type == DISCARD_TIME || type == GC_TIME))
2479 +  		return true;
2479 2480  
2480 2481  	return f2fs_time_over(sbi, type);
2481 2482  }
···
2656 2649  
2657 2650  static inline void set_inode_flag(struct inode *inode, int flag)
2658 2651  {
2659 -  	test_and_set_bit(flag, F2FS_I(inode)->flags);
2652 +  	set_bit(flag, F2FS_I(inode)->flags);
2660 2653  	__mark_inode_dirty_flag(inode, flag, true);
2661 2654  }
···
2667 2660  
2668 2661  static inline void clear_inode_flag(struct inode *inode, int flag)
2669 2662  {
2670 -  	test_and_clear_bit(flag, F2FS_I(inode)->flags);
2663 +  	clear_bit(flag, F2FS_I(inode)->flags);
2671 2664  	__mark_inode_dirty_flag(inode, flag, false);
2672 2665  }
2673 2666  
···
3282 3275  struct page *f2fs_get_node_page(struct f2fs_sb_info *sbi, pgoff_t nid);
3283 3276  struct page *f2fs_get_node_page_ra(struct page *parent, int start);
3284 3277  int f2fs_move_node_page(struct page *node_page, int gc_type);
3285 -  int f2fs_flush_inline_data(struct f2fs_sb_info *sbi);
3278 +  void f2fs_flush_inline_data(struct f2fs_sb_info *sbi);
3286 3279  int f2fs_fsync_node_pages(struct f2fs_sb_info *sbi, struct inode *inode,
3287 3280  			struct writeback_control *wbc, bool atomic,
3288 3281  			unsigned int *seq_id);
···
3294 3287  void f2fs_alloc_nid_done(struct f2fs_sb_info *sbi, nid_t nid);
3295 3288  void f2fs_alloc_nid_failed(struct f2fs_sb_info *sbi, nid_t nid);
3296 3289  int f2fs_try_to_free_nids(struct f2fs_sb_info *sbi, int nr_shrink);
3297 -  void f2fs_recover_inline_xattr(struct inode *inode, struct page *page);
3290 +  int f2fs_recover_inline_xattr(struct inode *inode, struct page *page);
3298 3291  int f2fs_recover_xattr_data(struct inode *inode, struct page *page);
3299 3292  int f2fs_recover_inode_page(struct f2fs_sb_info *sbi, struct page *page);
3300 3293  int f2fs_restore_node_summary(struct f2fs_sb_info *sbi,
···
3332 3325  int f2fs_disable_cp_again(struct f2fs_sb_info *sbi, block_t unusable);
3333 3326  void f2fs_release_discard_addrs(struct f2fs_sb_info *sbi);
3334 3327  int f2fs_npages_for_summary_flush(struct f2fs_sb_info *sbi, bool for_ra);
3335 -  void allocate_segment_for_resize(struct f2fs_sb_info *sbi, int type,
3328 +  void f2fs_allocate_segment_for_resize(struct f2fs_sb_info *sbi, int type,
3336 3329  					unsigned int start, unsigned int end);
3337 -  void f2fs_allocate_new_segments(struct f2fs_sb_info *sbi, int type);
3330 +  void f2fs_allocate_new_segment(struct f2fs_sb_info *sbi, int type);
3331 +  void f2fs_allocate_new_segments(struct f2fs_sb_info *sbi);
3338 3332  int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range);
3339 3333  bool f2fs_exist_trim_candidates(struct f2fs_sb_info *sbi,
3340 3334  					struct cp_control *cpc);
···
3358 3350  void f2fs_allocate_data_block(struct f2fs_sb_info *sbi, struct page *page,
3359 3351  			block_t old_blkaddr, block_t *new_blkaddr,
3360 3352  			struct f2fs_summary *sum, int type,
3361 -  			struct f2fs_io_info *fio, bool add_list);
3353 +  			struct f2fs_io_info *fio);
3362 3354  void f2fs_wait_on_page_writeback(struct page *page,
3363 3355  			enum page_type type, bool ordered, bool locked);
3364 3356  void f2fs_wait_on_block_writeback(struct inode *inode, block_t blkaddr);
···
3456 3448  struct page *f2fs_get_new_data_page(struct inode *inode,
3457 3449  			struct page *ipage, pgoff_t index, bool new_i_size);
3458 3450  int f2fs_do_write_data_page(struct f2fs_io_info *fio);
3459 -  void __do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock);
3451 +  void f2fs_do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock);
3460 3452  int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
3461 3453  			int create, int flag);
3462 3454  int f2fs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
···
3544 3536  	int curseg[NR_CURSEG_TYPE];
3545 3537  	int cursec[NR_CURSEG_TYPE];
3546 3538  	int curzone[NR_CURSEG_TYPE];
3539 +  	unsigned int dirty_seg[NR_CURSEG_TYPE];
3540 +  	unsigned int full_seg[NR_CURSEG_TYPE];
3541 +  	unsigned int valid_blks[NR_CURSEG_TYPE];
3547 3542  
3548 3543  	unsigned int meta_count[META_MAX];
3549 3544  	unsigned int segment_count[2];
···
3761 3750  int f2fs_convert_inline_inode(struct inode *inode);
3762 3751  int f2fs_try_convert_inline_dir(struct inode *dir, struct dentry *dentry);
3763 3752  int f2fs_write_inline_data(struct inode *inode, struct page *page);
3764 -  bool f2fs_recover_inline_data(struct inode *inode, struct page *npage);
3753 +  int f2fs_recover_inline_data(struct inode *inode, struct page *npage);
3765 3754  struct f2fs_dir_entry *f2fs_find_in_inline_dir(struct inode *dir,
3766 3755  			const struct f2fs_filename *fname,
3767 3756  			struct page **res_page);
···
3806 3795  bool f2fs_check_rb_tree_consistence(struct f2fs_sb_info *sbi,
3807 3796  			struct rb_root_cached *root);
3808 3797  unsigned int f2fs_shrink_extent_tree(struct f2fs_sb_info *sbi, int nr_shrink);
3809 -  bool f2fs_init_extent_tree(struct inode *inode, struct f2fs_extent *i_ext);
3798 +  void f2fs_init_extent_tree(struct inode *inode, struct page *ipage);
3810 3799  void f2fs_drop_extent_tree(struct inode *inode);
3811 3800  unsigned int f2fs_destroy_extent_node(struct inode *inode);
3812 3801  void f2fs_destroy_extent_tree(struct inode *inode);
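A sketch of calling the new ioctl from userspace, mirroring the ABI declared above. Redefining the structure and constants locally stands in for a uapi header; "dummy.txt" and the whole-file range are made up for illustration:

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <linux/ioctl.h>    /* _IOW */

    struct f2fs_sectrim_range {
        uint64_t start;
        uint64_t len;
        uint64_t flags;
    };

    #define F2FS_IOCTL_MAGIC        0xf5
    #define F2FS_IOC_SEC_TRIM_FILE  _IOW(F2FS_IOCTL_MAGIC, 20, \
                                         struct f2fs_sectrim_range)
    #define F2FS_TRIM_FILE_DISCARD  0x1
    #define F2FS_TRIM_FILE_ZEROOUT  0x2

    int main(void)
    {
        /* start/len must be F2FS block aligned; len == -1 means "to EOF" */
        struct f2fs_sectrim_range range = {
            .start = 0,
            .len = (uint64_t)-1,
            .flags = F2FS_TRIM_FILE_DISCARD | F2FS_TRIM_FILE_ZEROOUT,
        };
        int fd = open("dummy.txt", O_WRONLY);

        if (fd < 0 || ioctl(fd, F2FS_IOC_SEC_TRIM_FILE, &range))
            perror("sec_trim");
        return 0;
    }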
+229 -35
fs/f2fs/file.c
···
21 21  #include <linux/uuid.h>
22 22  #include <linux/file.h>
23 23  #include <linux/nls.h>
24 +  #include <linux/sched/signal.h>
24 25  
25 26  #include "f2fs.h"
26 27  #include "node.h"
···
106 105  
107 106  	if (need_alloc) {
108 107  		/* block allocation */
109 -  		__do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO, true);
108 +  		f2fs_do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO, true);
110 109  		set_new_dnode(&dn, inode, NULL, NULL, 0);
111 110  		err = f2fs_get_block(&dn, page->index);
112 111  		f2fs_put_dnode(&dn);
113 -  		__do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO, false);
112 +  		f2fs_do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO, false);
114 113  	}
115 114  
116 115  #ifdef CONFIG_F2FS_FS_COMPRESSION
···
1374 1373  	truncate_pagecache(inode, offset);
1375 1374  
1376 1375  	new_size = i_size_read(inode) - len;
1377 -  	truncate_pagecache(inode, new_size);
1378 -  
1379 1376  	ret = f2fs_truncate_blocks(inode, new_size, true);
1380 1377  	up_write(&F2FS_I(inode)->i_mmap_sem);
1381 1378  	if (!ret)
···
1659 1660  	map.m_seg_type = CURSEG_COLD_DATA_PINNED;
1660 1661  
1661 1662  	f2fs_lock_op(sbi);
1662 -  	f2fs_allocate_new_segments(sbi, CURSEG_COLD_DATA);
1663 +  	f2fs_allocate_new_segment(sbi, CURSEG_COLD_DATA);
1663 1664  	f2fs_unlock_op(sbi);
1664 1665  
1665 1666  	err = f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_DIO);
···
2526 2527  	}
2527 2528  
2528 2529  	ret = f2fs_gc(sbi, range.sync, true, GET_SEGNO(sbi, range.start));
2530 +  	if (ret) {
2531 +  		if (ret == -EBUSY)
2532 +  			ret = -EAGAIN;
2533 +  		goto out;
2534 +  	}
2529 2535  	range.start += BLKS_PER_SEC(sbi);
2530 2536  	if (range.start <= end)
2531 2537  		goto do_more;
···
3363 3359  	return fsverity_ioctl_measure(filp, (void __user *)arg);
3364 3360  }
3365 3361  
3366 -  static int f2fs_get_volume_name(struct file *filp, unsigned long arg)
3362 +  static int f2fs_ioc_getfslabel(struct file *filp, unsigned long arg)
3367 3363  {
3368 3364  	struct inode *inode = file_inode(filp);
3369 3365  	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
···
3389 3385  	return err;
3390 3386  }
3391 3387  
3392 -  static int f2fs_set_volume_name(struct file *filp, unsigned long arg)
3388 +  static int f2fs_ioc_setfslabel(struct file *filp, unsigned long arg)
3393 3389  {
3394 3390  	struct inode *inode = file_inode(filp);
3395 3391  	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
···
3535 3531  	if (ret)
3536 3532  		goto out;
3537 3533  
3538 -  	if (!F2FS_I(inode)->i_compr_blocks)
3539 -  		goto out;
3540 -  
3541 3534  	F2FS_I(inode)->i_flags |= F2FS_IMMUTABLE_FL;
3542 3535  	f2fs_set_inode_flags(inode);
3543 3536  	inode->i_ctime = current_time(inode);
3544 3537  	f2fs_mark_inode_dirty_sync(inode, true);
3538 +  
3539 +  	if (!F2FS_I(inode)->i_compr_blocks)
3540 +  		goto out;
3541 +  
3545 3542  	down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
3546 3543  	down_write(&F2FS_I(inode)->i_mmap_sem);
···
3760 3756  	return ret;
3761 3757  }
3762 3758  
3759 +  static int f2fs_secure_erase(struct block_device *bdev, struct inode *inode,
3760 +  		pgoff_t off, block_t block, block_t len, u32 flags)
3761 +  {
3762 +  	struct request_queue *q = bdev_get_queue(bdev);
3763 +  	sector_t sector = SECTOR_FROM_BLOCK(block);
3764 +  	sector_t nr_sects = SECTOR_FROM_BLOCK(len);
3765 +  	int ret = 0;
3766 +  
3767 +  	if (!q)
3768 +  		return -ENXIO;
3769 +  
3770 +  	if (flags & F2FS_TRIM_FILE_DISCARD)
3771 +  		ret = blkdev_issue_discard(bdev, sector, nr_sects, GFP_NOFS,
3772 +  						blk_queue_secure_erase(q) ?
3773 +  						BLKDEV_DISCARD_SECURE : 0);
3774 +  
3775 +  	if (!ret && (flags & F2FS_TRIM_FILE_ZEROOUT)) {
3776 +  		if (IS_ENCRYPTED(inode))
3777 +  			ret = fscrypt_zeroout_range(inode, off, block, len);
3778 +  		else
3779 +  			ret = blkdev_issue_zeroout(bdev, sector, nr_sects,
3780 +  					GFP_NOFS, 0);
3781 +  	}
3782 +  
3783 +  	return ret;
3784 +  }
3785 +  
3786 +  static int f2fs_sec_trim_file(struct file *filp, unsigned long arg)
3787 +  {
3788 +  	struct inode *inode = file_inode(filp);
3789 +  	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
3790 +  	struct address_space *mapping = inode->i_mapping;
3791 +  	struct block_device *prev_bdev = NULL;
3792 +  	struct f2fs_sectrim_range range;
3793 +  	pgoff_t index, pg_end, prev_index = 0;
3794 +  	block_t prev_block = 0, len = 0;
3795 +  	loff_t end_addr;
3796 +  	bool to_end = false;
3797 +  	int ret = 0;
3798 +  
3799 +  	if (!(filp->f_mode & FMODE_WRITE))
3800 +  		return -EBADF;
3801 +  
3802 +  	if (copy_from_user(&range, (struct f2fs_sectrim_range __user *)arg,
3803 +  				sizeof(range)))
3804 +  		return -EFAULT;
3805 +  
3806 +  	if (range.flags == 0 || (range.flags & ~F2FS_TRIM_FILE_MASK) ||
3807 +  			!S_ISREG(inode->i_mode))
3808 +  		return -EINVAL;
3809 +  
3810 +  	if (((range.flags & F2FS_TRIM_FILE_DISCARD) &&
3811 +  			!f2fs_hw_support_discard(sbi)) ||
3812 +  			((range.flags & F2FS_TRIM_FILE_ZEROOUT) &&
3813 +  			 IS_ENCRYPTED(inode) && f2fs_is_multi_device(sbi)))
3814 +  		return -EOPNOTSUPP;
3815 +  
3816 +  	file_start_write(filp);
3817 +  	inode_lock(inode);
3818 +  
3819 +  	if (f2fs_is_atomic_file(inode) || f2fs_compressed_file(inode) ||
3820 +  			range.start >= inode->i_size) {
3821 +  		ret = -EINVAL;
3822 +  		goto err;
3823 +  	}
3824 +  
3825 +  	if (range.len == 0)
3826 +  		goto err;
3827 +  
3828 +  	if (inode->i_size - range.start > range.len) {
3829 +  		end_addr = range.start + range.len;
3830 +  	} else {
3831 +  		end_addr = range.len == (u64)-1 ?
3832 +  			sbi->sb->s_maxbytes : inode->i_size;
3833 +  		to_end = true;
3834 +  	}
3835 +  
3836 +  	if (!IS_ALIGNED(range.start, F2FS_BLKSIZE) ||
3837 +  			(!to_end && !IS_ALIGNED(end_addr, F2FS_BLKSIZE))) {
3838 +  		ret = -EINVAL;
3839 +  		goto err;
3840 +  	}
3841 +  
3842 +  	index = F2FS_BYTES_TO_BLK(range.start);
3843 +  	pg_end = DIV_ROUND_UP(end_addr, F2FS_BLKSIZE);
3844 +  
3845 +  	ret = f2fs_convert_inline_inode(inode);
3846 +  	if (ret)
3847 +  		goto err;
3848 +  
3849 +  	down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
3850 +  	down_write(&F2FS_I(inode)->i_mmap_sem);
3851 +  
3852 +  	ret = filemap_write_and_wait_range(mapping, range.start,
3853 +  			to_end ? LLONG_MAX : end_addr - 1);
3854 +  	if (ret)
3855 +  		goto out;
3856 +  
3857 +  	truncate_inode_pages_range(mapping, range.start,
3858 +  			to_end ? -1 : end_addr - 1);
3859 +  
3860 +  	while (index < pg_end) {
3861 +  		struct dnode_of_data dn;
3862 +  		pgoff_t end_offset, count;
3863 +  		int i;
3864 +  
3865 +  		set_new_dnode(&dn, inode, NULL, NULL, 0);
3866 +  		ret = f2fs_get_dnode_of_data(&dn, index, LOOKUP_NODE);
3867 +  		if (ret) {
3868 +  			if (ret == -ENOENT) {
3869 +  				index = f2fs_get_next_page_offset(&dn, index);
3870 +  				continue;
3871 +  			}
3872 +  			goto out;
3873 +  		}
3874 +  
3875 +  		end_offset = ADDRS_PER_PAGE(dn.node_page, inode);
3876 +  		count = min(end_offset - dn.ofs_in_node, pg_end - index);
3877 +  		for (i = 0; i < count; i++, index++, dn.ofs_in_node++) {
3878 +  			struct block_device *cur_bdev;
3879 +  			block_t blkaddr = f2fs_data_blkaddr(&dn);
3880 +  
3881 +  			if (!__is_valid_data_blkaddr(blkaddr))
3882 +  				continue;
3883 +  
3884 +  			if (!f2fs_is_valid_blkaddr(sbi, blkaddr,
3885 +  						DATA_GENERIC_ENHANCE)) {
3886 +  				ret = -EFSCORRUPTED;
3887 +  				f2fs_put_dnode(&dn);
3888 +  				goto out;
3889 +  			}
3890 +  
3891 +  			cur_bdev = f2fs_target_device(sbi, blkaddr, NULL);
3892 +  			if (f2fs_is_multi_device(sbi)) {
3893 +  				int di = f2fs_target_device_index(sbi, blkaddr);
3894 +  
3895 +  				blkaddr -= FDEV(di).start_blk;
3896 +  			}
3897 +  
3898 +  			if (len) {
3899 +  				if (prev_bdev == cur_bdev &&
3900 +  						index == prev_index + len &&
3901 +  						blkaddr == prev_block + len) {
3902 +  					len++;
3903 +  				} else {
3904 +  					ret = f2fs_secure_erase(prev_bdev,
3905 +  						inode, prev_index, prev_block,
3906 +  						len, range.flags);
3907 +  					if (ret) {
3908 +  						f2fs_put_dnode(&dn);
3909 +  						goto out;
3910 +  					}
3911 +  
3912 +  					len = 0;
3913 +  				}
3914 +  			}
3915 +  
3916 +  			if (!len) {
3917 +  				prev_bdev = cur_bdev;
3918 +  				prev_index = index;
3919 +  				prev_block = blkaddr;
3920 +  				len = 1;
3921 +  			}
3922 +  		}
3923 +  
3924 +  		f2fs_put_dnode(&dn);
3925 +  
3926 +  		if (fatal_signal_pending(current)) {
3927 +  			ret = -EINTR;
3928 +  			goto out;
3929 +  		}
3930 +  		cond_resched();
3931 +  	}
3932 +  
3933 +  	if (len)
3934 +  		ret = f2fs_secure_erase(prev_bdev, inode, prev_index,
3935 +  				prev_block, len, range.flags);
3936 +  out:
3937 +  	up_write(&F2FS_I(inode)->i_mmap_sem);
3938 +  	up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
3939 +  err:
3940 +  	inode_unlock(inode);
3941 +  	file_end_write(filp);
3942 +  
3943 +  	return ret;
3944 +  }
3945 +  
3763 3946  long f2fs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
3764 3947  {
3765 3948  	if (unlikely(f2fs_cp_error(F2FS_I_SB(file_inode(filp)))))
···
3955 3764  		return -ENOSPC;
3956 3765  
3957 3766  	switch (cmd) {
3958 -  	case F2FS_IOC_GETFLAGS:
3767 +  	case FS_IOC_GETFLAGS:
3959 3768  		return f2fs_ioc_getflags(filp, arg);
3960 -  	case F2FS_IOC_SETFLAGS:
3769 +  	case FS_IOC_SETFLAGS:
3961 3770  		return f2fs_ioc_setflags(filp, arg);
3962 -  	case F2FS_IOC_GETVERSION:
3771 +  	case FS_IOC_GETVERSION:
3963 3772  		return f2fs_ioc_getversion(filp, arg);
3964 3773  	case F2FS_IOC_START_ATOMIC_WRITE:
3965 3774  		return f2fs_ioc_start_atomic_write(filp);
···
3975 3784  		return f2fs_ioc_shutdown(filp, arg);
3976 3785  	case FITRIM:
3977 3786  		return f2fs_ioc_fitrim(filp, arg);
3978 -  	case F2FS_IOC_SET_ENCRYPTION_POLICY:
3787 +  	case FS_IOC_SET_ENCRYPTION_POLICY:
3979 3788  		return f2fs_ioc_set_encryption_policy(filp, arg);
3980 -  	case F2FS_IOC_GET_ENCRYPTION_POLICY:
3789 +  	case FS_IOC_GET_ENCRYPTION_POLICY:
3981 3790  		return f2fs_ioc_get_encryption_policy(filp, arg);
3982 -  	case F2FS_IOC_GET_ENCRYPTION_PWSALT:
3791 +  	case FS_IOC_GET_ENCRYPTION_PWSALT:
3983 3792  		return f2fs_ioc_get_encryption_pwsalt(filp, arg);
3984 3793  	case FS_IOC_GET_ENCRYPTION_POLICY_EX:
3985 3794  		return f2fs_ioc_get_encryption_policy_ex(filp, arg);
···
4007 3816  		return f2fs_ioc_flush_device(filp, arg);
4008 3817  	case F2FS_IOC_GET_FEATURES:
4009 3818  		return f2fs_ioc_get_features(filp, arg);
4010 -  	case F2FS_IOC_FSGETXATTR:
3819 +  	case FS_IOC_FSGETXATTR:
4011 3820  		return f2fs_ioc_fsgetxattr(filp, arg);
4012 -  	case F2FS_IOC_FSSETXATTR:
3821 +  	case FS_IOC_FSSETXATTR:
4013 3822  		return f2fs_ioc_fssetxattr(filp, arg);
4014 3823  	case F2FS_IOC_GET_PIN_FILE:
4015 3824  		return f2fs_ioc_get_pin_file(filp, arg);
···
4023 3832  		return f2fs_ioc_enable_verity(filp, arg);
4024 3833  	case FS_IOC_MEASURE_VERITY:
4025 3834  		return f2fs_ioc_measure_verity(filp, arg);
4026 -  	case F2FS_IOC_GET_VOLUME_NAME:
4027 -  		return f2fs_get_volume_name(filp, arg);
4028 -  	case F2FS_IOC_SET_VOLUME_NAME:
4029 -  		return f2fs_set_volume_name(filp, arg);
3835 +  	case FS_IOC_GETFSLABEL:
3836 +  		return f2fs_ioc_getfslabel(filp, arg);
3837 +  	case FS_IOC_SETFSLABEL:
3838 +  		return f2fs_ioc_setfslabel(filp, arg);
4030 3839  	case F2FS_IOC_GET_COMPRESS_BLOCKS:
4031 3840  		return f2fs_get_compress_blocks(filp, arg);
4032 3841  	case F2FS_IOC_RELEASE_COMPRESS_BLOCKS:
4033 3842  		return f2fs_release_compress_blocks(filp, arg);
4034 3843  	case F2FS_IOC_RESERVE_COMPRESS_BLOCKS:
4035 3844  		return f2fs_reserve_compress_blocks(filp, arg);
3845 +  	case F2FS_IOC_SEC_TRIM_FILE:
3846 +  		return f2fs_sec_trim_file(filp, arg);
4036 3847  	default:
4037 3848  		return -ENOTTY;
4038 3849  	}
···
4159 3966  long f2fs_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
4160 3967  {
4161 3968  	switch (cmd) {
4162 -  	case F2FS_IOC32_GETFLAGS:
4163 -  		cmd = F2FS_IOC_GETFLAGS;
3969 +  	case FS_IOC32_GETFLAGS:
3970 +  		cmd = FS_IOC_GETFLAGS;
4164 3971  		break;
4165 -  	case F2FS_IOC32_SETFLAGS:
4166 -  		cmd = F2FS_IOC_SETFLAGS;
3972 +  	case FS_IOC32_SETFLAGS:
3973 +  		cmd = FS_IOC_SETFLAGS;
4167 3974  		break;
4168 -  	case F2FS_IOC32_GETVERSION:
4169 -  		cmd = F2FS_IOC_GETVERSION;
3975 +  	case FS_IOC32_GETVERSION:
3976 +  		cmd = FS_IOC_GETVERSION;
4170 3977  		break;
4171 3978  	case F2FS_IOC_START_ATOMIC_WRITE:
4172 3979  	case F2FS_IOC_COMMIT_ATOMIC_WRITE:
···
4175 3982  	case F2FS_IOC_ABORT_VOLATILE_WRITE:
4176 3983  	case F2FS_IOC_SHUTDOWN:
4177 3984  	case FITRIM:
4178 -  	case F2FS_IOC_SET_ENCRYPTION_POLICY:
4179 -  	case F2FS_IOC_GET_ENCRYPTION_PWSALT:
4180 -  	case F2FS_IOC_GET_ENCRYPTION_POLICY:
3985 +  	case FS_IOC_SET_ENCRYPTION_POLICY:
3986 +  	case FS_IOC_GET_ENCRYPTION_PWSALT:
3987 +  	case FS_IOC_GET_ENCRYPTION_POLICY:
4181 3988  	case FS_IOC_GET_ENCRYPTION_POLICY_EX:
4182 3989  	case FS_IOC_ADD_ENCRYPTION_KEY:
4183 3990  	case FS_IOC_REMOVE_ENCRYPTION_KEY:
···
4191 3998  	case F2FS_IOC_MOVE_RANGE:
4192 3999  	case F2FS_IOC_FLUSH_DEVICE:
4193 4000  	case F2FS_IOC_GET_FEATURES:
4194 -  	case F2FS_IOC_FSGETXATTR:
4195 -  	case F2FS_IOC_FSSETXATTR:
4001 +  	case FS_IOC_FSGETXATTR:
4002 +  	case FS_IOC_FSSETXATTR:
4196 4003  	case F2FS_IOC_GET_PIN_FILE:
4197 4004  	case F2FS_IOC_SET_PIN_FILE:
4198 4005  	case F2FS_IOC_PRECACHE_EXTENTS:
4199 4006  	case F2FS_IOC_RESIZE_FS:
4200 4007  	case FS_IOC_ENABLE_VERITY:
4201 4008  	case FS_IOC_MEASURE_VERITY:
4202 -  	case F2FS_IOC_GET_VOLUME_NAME:
4203 -  	case F2FS_IOC_SET_VOLUME_NAME:
4009 +  	case FS_IOC_GETFSLABEL:
4010 +  	case FS_IOC_SETFSLABEL:
4204 4011  	case F2FS_IOC_GET_COMPRESS_BLOCKS:
4205 4012  	case F2FS_IOC_RELEASE_COMPRESS_BLOCKS:
4206 4013  	case F2FS_IOC_RESERVE_COMPRESS_BLOCKS:
4014 +  	case F2FS_IOC_SEC_TRIM_FILE:
4207 4015  		break;
4208 4016  	default:
4209 4017  		return -ENOIOCTLCMD;
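Because the dropped f2fs-private names were plain aliases of the generic ones (see the deleted defines in fs/f2fs/f2fs.h above), the binary interface is unchanged; existing callers keep working. A sketch reading the label through the generic ioctl, assuming recent kernel headers and a placeholder mount point:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <linux/fs.h>    /* FS_IOC_GETFSLABEL, FSLABEL_MAX */

    int main(void)
    {
        char label[FSLABEL_MAX] = "";
        int fd = open("/mnt", O_RDONLY);

        if (fd < 0 || ioctl(fd, FS_IOC_GETFSLABEL, label))
            perror("getfslabel");
        else
            printf("label: %s\n", label);
        return 0;
    }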
+45 -28
fs/f2fs/gc.c
···
21 21  #include "gc.h"
22 22  #include <trace/events/f2fs.h>
23 23  
24 +  static unsigned int count_bits(const unsigned long *addr,
25 +  				unsigned int offset, unsigned int len);
26 +  
24 27  static int gc_thread_func(void *data)
25 28  {
26 29  	struct f2fs_sb_info *sbi = data;
···
82 79  		 * invalidated soon after by user update or deletion.
83 80  		 * So, I'd like to wait some time to collect dirty segments.
84 81  		 */
85 -  		if (sbi->gc_mode == GC_URGENT) {
82 +  		if (sbi->gc_mode == GC_URGENT_HIGH) {
86 83  			wait_ms = gc_th->urgent_sleep_time;
87 84  			down_write(&sbi->gc_lock);
88 85  			goto do_gc;
···
176 173  		gc_mode = GC_CB;
177 174  		break;
178 175  	case GC_IDLE_GREEDY:
179 -  	case GC_URGENT:
176 +  	case GC_URGENT_HIGH:
180 177  		gc_mode = GC_GREEDY;
181 178  		break;
182 179  	}
···
190 187  
191 188  	if (p->alloc_mode == SSR) {
192 189  		p->gc_mode = GC_GREEDY;
193 -  		p->dirty_segmap = dirty_i->dirty_segmap[type];
190 +  		p->dirty_bitmap = dirty_i->dirty_segmap[type];
194 191  		p->max_search = dirty_i->nr_dirty[type];
195 192  		p->ofs_unit = 1;
196 193  	} else {
197 194  		p->gc_mode = select_gc_type(sbi, gc_type);
198 -  		p->dirty_segmap = dirty_i->dirty_segmap[DIRTY];
199 -  		p->max_search = dirty_i->nr_dirty[DIRTY];
200 195  		p->ofs_unit = sbi->segs_per_sec;
196 +  		if (__is_large_section(sbi)) {
197 +  			p->dirty_bitmap = dirty_i->dirty_secmap;
198 +  			p->max_search = count_bits(p->dirty_bitmap,
199 +  						0, MAIN_SECS(sbi));
200 +  		} else {
201 +  			p->dirty_bitmap = dirty_i->dirty_segmap[DIRTY];
202 +  			p->max_search = dirty_i->nr_dirty[DIRTY];
203 +  		}
201 204  	}
···
211 202  	 * foreground GC and urgent GC cases.
212 203  	 */
213 204  	if (gc_type != FG_GC &&
214 -  			(sbi->gc_mode != GC_URGENT) &&
205 +  			(sbi->gc_mode != GC_URGENT_HIGH) &&
215 206  			p->max_search > sbi->max_victim_search)
216 207  		p->max_search = sbi->max_victim_search;
217 208  
···
330 321  	unsigned int secno, last_victim;
331 322  	unsigned int last_segment;
332 323  	unsigned int nsearched = 0;
324 +  	int ret = 0;
333 325  
334 326  	mutex_lock(&dirty_i->seglist_lock);
335 327  	last_segment = MAIN_SECS(sbi) * sbi->segs_per_sec;
···
342 332  	p.min_cost = get_max_cost(sbi, &p);
343 333  
344 334  	if (*result != NULL_SEGNO) {
345 -  		if (get_valid_blocks(sbi, *result, false) &&
346 -  			!sec_usage_check(sbi, GET_SEC_FROM_SEG(sbi, *result)))
335 +  		if (!get_valid_blocks(sbi, *result, false)) {
336 +  			ret = -ENODATA;
337 +  			goto out;
338 +  		}
339 +  
340 +  		if (sec_usage_check(sbi, GET_SEC_FROM_SEG(sbi, *result)))
341 +  			ret = -EBUSY;
342 +  		else
347 343  			p.min_segno = *result;
348 344  		goto out;
349 345  	}
350 346  
347 +  	ret = -ENODATA;
351 348  	if (p.max_search == 0)
352 349  		goto out;
···
382 365  	}
383 366  
384 367  	while (1) {
385 -  		unsigned long cost;
386 -  		unsigned int segno;
368 +  		unsigned long cost, *dirty_bitmap;
369 +  		unsigned int unit_no, segno;
387 370  
388 -  		segno = find_next_bit(p.dirty_segmap, last_segment, p.offset);
371 +  		dirty_bitmap = p.dirty_bitmap;
372 +  		unit_no = find_next_bit(dirty_bitmap,
373 +  				last_segment / p.ofs_unit,
374 +  				p.offset / p.ofs_unit);
375 +  		segno = unit_no * p.ofs_unit;
389 376  		if (segno >= last_segment) {
390 377  			if (sm->last_victim[p.gc_mode]) {
391 378  				last_segment =
···
402 381  		}
403 382  
404 383  		p.offset = segno + p.ofs_unit;
405 -  		if (p.ofs_unit > 1) {
406 -  			p.offset -= segno % p.ofs_unit;
407 -  			nsearched += count_bits(p.dirty_segmap,
408 -  					p.offset - p.ofs_unit,
409 -  					p.ofs_unit);
410 -  		} else {
411 -  			nsearched++;
412 -  		}
384 +  		nsearched++;
413 385  
414 386  #ifdef CONFIG_F2FS_CHECK_FS
415 387  		/*
···
435 421  next:
436 422  		if (nsearched >= p.max_search) {
437 423  			if (!sm->last_victim[p.gc_mode] && segno <= last_victim)
438 -  				sm->last_victim[p.gc_mode] = last_victim + 1;
424 +  				sm->last_victim[p.gc_mode] =
425 +  					last_victim + p.ofs_unit;
439 426  			else
440 -  				sm->last_victim[p.gc_mode] = segno + 1;
427 +  				sm->last_victim[p.gc_mode] = segno + p.ofs_unit;
441 428  			sm->last_victim[p.gc_mode] %=
442 429  				(MAIN_SECS(sbi) * sbi->segs_per_sec);
443 430  			break;
···
455 440  		else
456 441  			set_bit(secno, dirty_i->victim_secmap);
457 442  	}
443 +  	ret = 0;
458 444  
459 445  	}
460 446  out:
···
465 449  				prefree_segments(sbi), free_segments(sbi));
466 450  	mutex_unlock(&dirty_i->seglist_lock);
467 451  
468 -  	return (p.min_segno == NULL_SEGNO) ? 0 : 1;
452 +  	return ret;
469 453  }
470 454  
471 455  static const struct victim_selection default_v_ops = {
···
849 833  
850 834  	mpage = f2fs_grab_cache_page(META_MAPPING(fio.sbi),
851 835  					fio.old_blkaddr, false);
852 -  	if (!mpage)
836 +  	if (!mpage) {
837 +  		err = -ENOMEM;
853 838  		goto up_out;
839 +  	}
854 840  
855 841  	fio.encrypted_page = mpage;
856 842  
···
877 859  	}
878 860  
879 861  	f2fs_allocate_data_block(fio.sbi, NULL, fio.old_blkaddr, &newaddr,
880 -  					&sum, CURSEG_COLD_DATA, NULL, false);
862 +  					&sum, CURSEG_COLD_DATA, NULL);
881 863  
882 864  	fio.encrypted_page = f2fs_pagecache_get_page(META_MAPPING(fio.sbi),
883 865  				newaddr, FGP_LOCK | FGP_CREAT, GFP_NOFS);
···
1351 1333  		ret = -EINVAL;
1352 1334  		goto stop;
1353 1335  	}
1354 -  	if (!__get_victim(sbi, &segno, gc_type)) {
1355 -  		ret = -ENODATA;
1336 +  	ret = __get_victim(sbi, &segno, gc_type);
1337 +  	if (ret)
1356 1338  		goto stop;
1357 -  	}
1358 1339  
1359 1340  	seg_freed = do_garbage_collect(sbi, segno, &gc_list, gc_type);
1360 1341  	if (gc_type == FG_GC && seg_freed == sbi->segs_per_sec)
···
1451 1434  
1452 1435  	/* Move out cursegs from the target range */
1453 1436  	for (type = CURSEG_HOT_DATA; type < NR_CURSEG_TYPE; type++)
1454 -  		allocate_segment_for_resize(sbi, type, start, end);
1437 +  		f2fs_allocate_segment_for_resize(sbi, type, start, end);
1455 1438  
1456 1439  	/* do GC to move out valid blocks in the range */
1457 1440  	for (segno = start; segno <= end; segno += sbi->segs_per_sec) {
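With __get_victim now propagating real error codes, and the fs/f2fs/file.c hunk above converting -EBUSY to -EAGAIN for the range-GC ioctl, callers can retry a busy section. A hedged userspace sketch, with the struct layout mirrored from fs/f2fs/f2fs.h and a placeholder mount point and range:

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <linux/ioctl.h>    /* _IOW */

    struct f2fs_gc_range {
        uint32_t sync;
        uint64_t start;
        uint64_t len;
    };

    #define F2FS_IOCTL_MAGIC                0xf5
    #define F2FS_IOC_GARBAGE_COLLECT_RANGE  _IOW(F2FS_IOCTL_MAGIC, 11, \
                                                 struct f2fs_gc_range)

    int main(void)
    {
        struct f2fs_gc_range range = { .sync = 1, .start = 0, .len = 1 << 30 };
        int fd = open("/mnt", O_RDONLY);

        if (fd < 0)
            return 1;
        /* retry while the kernel reports the section as busy */
        while (ioctl(fd, F2FS_IOC_GARBAGE_COLLECT_RANGE, &range)) {
            if (errno != EAGAIN) {
                perror("gc_range");
                break;
            }
        }
        return 0;
    }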
+14 -7
fs/f2fs/inline.c
···
12 12  
13 13  #include "f2fs.h"
14 14  #include "node.h"
15 +  #include <trace/events/f2fs.h>
15 16  
16 17  bool f2fs_may_inline_data(struct inode *inode)
17 18  {
···
254 253  	return 0;
255 254  }
256 255  
257 -  bool f2fs_recover_inline_data(struct inode *inode, struct page *npage)
256 +  int f2fs_recover_inline_data(struct inode *inode, struct page *npage)
258 257  {
259 258  	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
260 259  	struct f2fs_inode *ri = NULL;
···
276 275  			ri && (ri->i_inline & F2FS_INLINE_DATA)) {
277 276  process_inline:
278 277  		ipage = f2fs_get_node_page(sbi, inode->i_ino);
279 -  		f2fs_bug_on(sbi, IS_ERR(ipage));
278 +  		if (IS_ERR(ipage))
279 +  			return PTR_ERR(ipage);
280 280  
281 281  		f2fs_wait_on_page_writeback(ipage, NODE, true, true);
282 282  
···
290 288  
291 289  		set_page_dirty(ipage);
292 290  		f2fs_put_page(ipage, 1);
293 -  		return true;
291 +  		return 1;
294 292  	}
295 293  
296 294  	if (f2fs_has_inline_data(inode)) {
297 295  		ipage = f2fs_get_node_page(sbi, inode->i_ino);
298 -  		f2fs_bug_on(sbi, IS_ERR(ipage));
296 +  		if (IS_ERR(ipage))
297 +  			return PTR_ERR(ipage);
299 298  		f2fs_truncate_inline_inode(inode, ipage, 0);
300 299  		clear_inode_flag(inode, FI_INLINE_DATA);
301 300  		f2fs_put_page(ipage, 1);
302 301  	} else if (ri && (ri->i_inline & F2FS_INLINE_DATA)) {
303 -  		if (f2fs_truncate_blocks(inode, 0, false))
304 -  			return false;
302 +  		int ret;
303 +  
304 +  		ret = f2fs_truncate_blocks(inode, 0, false);
305 +  		if (ret)
306 +  			return ret;
305 307  		goto process_inline;
306 308  	}
307 -  	return false;
309 +  	return 0;
308 310  }
···
782 776  	byteaddr += (char *)inline_data_addr(inode, ipage) -
783 777  					(char *)F2FS_INODE(ipage);
784 778  	err = fiemap_fill_next_extent(fieinfo, start, byteaddr, ilen, flags);
779 +  	trace_f2fs_fiemap(inode, start, byteaddr, ilen, flags, err);
785 780  out:
786 781  	f2fs_put_page(ipage, 1);
787 782  	return err;
+2 -2
fs/f2fs/inode.c
···
367 367  	fi->i_pino = le32_to_cpu(ri->i_pino);
368 368  	fi->i_dir_level = ri->i_dir_level;
369 369  
370 -  	if (f2fs_init_extent_tree(inode, &ri->i_ext))
371 -  		set_page_dirty(node_page);
370 +  	f2fs_init_extent_tree(inode, node_page);
372 371  
373 372  	get_inline_info(inode, ri);
374 373  
···
401 402  
402 403  	/* try to recover cold bit for non-dir inode */
403 404  	if (!S_ISDIR(inode->i_mode) && !is_cold_node(node_page)) {
405 +  		f2fs_wait_on_page_writeback(node_page, NODE, true, true);
404 406  		set_cold_node(node_page, false);
405 407  		set_page_dirty(node_page);
406 408  	}
+10 -8
fs/f2fs/namei.c
···
569 569  
570 570  	trace_f2fs_unlink_enter(dir, dentry);
571 571  
572 -  	if (unlikely(f2fs_cp_error(sbi)))
573 -  		return -EIO;
572 +  	if (unlikely(f2fs_cp_error(sbi))) {
573 +  		err = -EIO;
574 +  		goto fail;
575 +  	}
574 576  
575 577  	err = dquot_initialize(dir);
576 578  	if (err)
577 -  		return err;
579 +  		goto fail;
578 580  	err = dquot_initialize(inode);
579 581  	if (err)
580 -  		return err;
582 +  		goto fail;
581 583  
582 584  	de = f2fs_find_entry(dir, &dentry->d_name, &page);
583 585  	if (!de) {
···
602 600  	/* VFS negative dentries are incompatible with Encoding and
603 601  	 * Case-insensitiveness. Eventually we'll want avoid
604 602  	 * invalidating the dentries here, alongside with returning the
605 -  	 * negative dentries at f2fs_lookup(), when it is  better
603 +  	 * negative dentries at f2fs_lookup(), when it is better
606 604  	 * supported by the VFS for the CI case.
607 605  	 */
608 606  	if (IS_CASEFOLDED(dir))
···
1287 1285  }
1288 1286  
1289 1287  const struct inode_operations f2fs_encrypted_symlink_inode_operations = {
1290 -  	.get_link       = f2fs_encrypted_get_link,
1288 +  	.get_link	= f2fs_encrypted_get_link,
1291 1289  	.getattr	= f2fs_getattr,
1292 1290  	.setattr	= f2fs_setattr,
1293 1291  	.listxattr	= f2fs_listxattr,
···
1313 1311  };
1314 1312  
1315 1313  const struct inode_operations f2fs_symlink_inode_operations = {
1316 -  	.get_link       = f2fs_get_link,
1314 +  	.get_link	= f2fs_get_link,
1317 1315  	.getattr	= f2fs_getattr,
1318 1316  	.setattr	= f2fs_setattr,
1319 1317  	.listxattr	= f2fs_listxattr,
···
1321 1319  
1322 1320  const struct inode_operations f2fs_special_inode_operations = {
1323 1321  	.getattr	= f2fs_getattr,
1324 -  	.setattr        = f2fs_setattr,
1322 +  	.setattr	= f2fs_setattr,
1325 1323  	.get_acl	= f2fs_get_acl,
1326 1324  	.set_acl	= f2fs_set_acl,
1327 1325  	.listxattr	= f2fs_listxattr,
+21 -17
fs/f2fs/node.c
··· 1041 1041 trace_f2fs_truncate_inode_blocks_enter(inode, from); 1042 1042 1043 1043 level = get_node_path(inode, from, offset, noffset); 1044 - if (level < 0) 1044 + if (level < 0) { 1045 + trace_f2fs_truncate_inode_blocks_exit(inode, level); 1045 1046 return level; 1047 + } 1046 1048 1047 1049 page = f2fs_get_node_page(sbi, inode->i_ino); 1048 1050 if (IS_ERR(page)) { ··· 1728 1726 set_dentry_mark(page, 1729 1727 f2fs_need_dentry_mark(sbi, ino)); 1730 1728 } 1731 - /* may be written by other thread */ 1729 + /* may be written by other thread */ 1732 1730 if (!PageDirty(page)) 1733 1731 set_page_dirty(page); 1734 1732 } ··· 1816 1814 return true; 1817 1815 } 1818 1816 1819 - int f2fs_flush_inline_data(struct f2fs_sb_info *sbi) 1817 + void f2fs_flush_inline_data(struct f2fs_sb_info *sbi) 1820 1818 { 1821 1819 pgoff_t index = 0; 1822 1820 struct pagevec pvec; 1823 1821 int nr_pages; 1824 - int ret = 0; 1825 1822 1826 1823 pagevec_init(&pvec); 1827 1824 ··· 1859 1858 pagevec_release(&pvec); 1860 1859 cond_resched(); 1861 1860 } 1862 - return ret; 1863 1861 } 1864 1862 1865 1863 int f2fs_sync_node_pages(struct f2fs_sb_info *sbi, ··· 1924 1924 goto continue_unlock; 1925 1925 } 1926 1926 1927 - /* flush inline_data, if it's async context. */ 1928 - if (do_balance && is_inline_node(page)) { 1927 + /* flush inline_data/inode, if it's async context. */ 1928 + if (!do_balance) 1929 + goto write_node; 1930 + 1931 + /* flush inline_data */ 1932 + if (is_inline_node(page)) { 1929 1933 clear_inline_node(page); 1930 1934 unlock_page(page); 1931 1935 flush_inline_data(sbi, ino_of_node(page)); ··· 1942 1938 if (flush_dirty_inode(page)) 1943 1939 goto lock_node; 1944 1940 } 1945 - 1941 + write_node: 1946 1942 f2fs_wait_on_page_writeback(page, NODE, true, true); 1947 1943 1948 1944 if (!clear_page_dirty_for_io(page)) ··· 2101 2097 .invalidatepage = f2fs_invalidate_page, 2102 2098 .releasepage = f2fs_release_page, 2103 2099 #ifdef CONFIG_MIGRATION 2104 - .migratepage = f2fs_migrate_page, 2100 + .migratepage = f2fs_migrate_page, 2105 2101 #endif 2106 2102 }; 2107 2103 ··· 2112 2108 } 2113 2109 2114 2110 static int __insert_free_nid(struct f2fs_sb_info *sbi, 2115 - struct free_nid *i, enum nid_state state) 2111 + struct free_nid *i) 2116 2112 { 2117 2113 struct f2fs_nm_info *nm_i = NM_I(sbi); 2118 2114 ··· 2120 2116 if (err) 2121 2117 return err; 2122 2118 2123 - f2fs_bug_on(sbi, state != i->state); 2124 - nm_i->nid_cnt[state]++; 2125 - if (state == FREE_NID) 2126 - list_add_tail(&i->list, &nm_i->free_nid_list); 2119 + nm_i->nid_cnt[FREE_NID]++; 2120 + list_add_tail(&i->list, &nm_i->free_nid_list); 2127 2121 return 0; 2128 2122 } 2129 2123 ··· 2243 2241 } 2244 2242 } 2245 2243 ret = true; 2246 - err = __insert_free_nid(sbi, i, FREE_NID); 2244 + err = __insert_free_nid(sbi, i); 2247 2245 err_out: 2248 2246 if (update) { 2249 2247 update_free_nid_bitmap(sbi, nid, ret, build); ··· 2574 2572 return nr - nr_shrink; 2575 2573 } 2576 2574 2577 - void f2fs_recover_inline_xattr(struct inode *inode, struct page *page) 2575 + int f2fs_recover_inline_xattr(struct inode *inode, struct page *page) 2578 2576 { 2579 2577 void *src_addr, *dst_addr; 2580 2578 size_t inline_size; ··· 2582 2580 struct f2fs_inode *ri; 2583 2581 2584 2582 ipage = f2fs_get_node_page(F2FS_I_SB(inode), inode->i_ino); 2585 - f2fs_bug_on(F2FS_I_SB(inode), IS_ERR(ipage)); 2583 + if (IS_ERR(ipage)) 2584 + return PTR_ERR(ipage); 2586 2585 2587 2586 ri = F2FS_INODE(page); 2588 2587 if (ri->i_inline & F2FS_INLINE_XATTR) { ··· 2602 2599 update_inode: 2603 
2600 f2fs_update_inode(inode, ipage); 2604 2601 f2fs_put_page(ipage, 1); 2602 + return 0; 2605 2603 } 2606 2604 2607 2605 int f2fs_recover_xattr_data(struct inode *inode, struct page *page)
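Two simplifications stand out in the node.c hunks: f2fs_flush_inline_data() becomes void, since the ret it returned was never set to anything but 0 and no caller used it, and __insert_free_nid() drops its state argument because every caller passed FREE_NID, leaving an unconditional counter bump plus list append. A toy version of the latter (types and helpers invented):

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy version of the simplified __insert_free_nid(): with the state
 * argument gone, the entry is always counted as FREE_NID and appended
 * to the free list. */
struct free_nid {
	unsigned int nid;
	struct free_nid *next;
};

static struct free_nid *free_list;
static struct free_nid **free_tail = &free_list;
static unsigned int nid_cnt;	/* nm_i->nid_cnt[FREE_NID] analogue */

static int insert_free_nid(unsigned int nid)
{
	struct free_nid *i = malloc(sizeof(*i));

	if (!i)
		return -ENOMEM;
	i->nid = nid;
	i->next = NULL;
	*free_tail = i;		/* list_add_tail() analogue */
	free_tail = &i->next;
	nid_cnt++;
	return 0;
}

int main(void)
{
	insert_free_nid(7);
	insert_free_nid(8);
	printf("free nids: %u, first: %u\n", nid_cnt, free_list->nid);
	return 0;
}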
+9 -3
fs/f2fs/recovery.c
··· 544 544 545 545 /* step 1: recover xattr */ 546 546 if (IS_INODE(page)) { 547 - f2fs_recover_inline_xattr(inode, page); 547 + err = f2fs_recover_inline_xattr(inode, page); 548 + if (err) 549 + goto out; 548 550 } else if (f2fs_has_xattr_block(ofs_of_node(page))) { 549 551 err = f2fs_recover_xattr_data(inode, page); 550 552 if (!err) ··· 555 553 } 556 554 557 555 /* step 2: recover inline data */ 558 - if (f2fs_recover_inline_data(inode, page)) 556 + err = f2fs_recover_inline_data(inode, page); 557 + if (err) { 558 + if (err == 1) 559 + err = 0; 559 560 goto out; 561 + } 560 562 561 563 /* step 3: recover data indices */ 562 564 start = f2fs_start_bidx_of_node(ofs_of_node(page), inode); ··· 748 742 f2fs_put_page(page, 1); 749 743 } 750 744 if (!err) 751 - f2fs_allocate_new_segments(sbi, NO_CHECK_TYPE); 745 + f2fs_allocate_new_segments(sbi); 752 746 return err; 753 747 } 754 748
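recovery.c is the consumer of the new f2fs_recover_inline_data() return convention: the internal 1 ("handled, stop early") is squashed to 0 at the boundary, so callers only ever observe 0 or a negative errno. Sketched with invented names:

#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Boundary translation sketch: 1 means inline data was recovered and
 * block recovery should be skipped, but that status is private, so it
 * is rewritten to 0 before returning. */
static int recover_inline(void)
{
	return 1;	/* pretend the inode carried inline data */
}

static int recover_one_inode(void)
{
	int err = recover_inline();

	if (err) {
		if (err == 1)
			err = 0;	/* handled; not an error */
		goto out;
	}
	/* ... step 3: recover data indices ... */
out:
	if (err < 0)
		fprintf(stderr, "recovery: %s\n", strerror(-err));
	return err;
}

int main(void)
{
	return recover_one_inode() ? 1 : 0;
}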
+92 -39
fs/f2fs/segment.c
··· 174 174 175 175 if (f2fs_lfs_mode(sbi)) 176 176 return false; 177 - if (sbi->gc_mode == GC_URGENT) 177 + if (sbi->gc_mode == GC_URGENT_HIGH) 178 178 return true; 179 179 if (unlikely(is_sbi_flag_set(sbi, SBI_CP_DISABLED))) 180 180 return true; ··· 796 796 } 797 797 if (!test_and_set_bit(segno, dirty_i->dirty_segmap[t])) 798 798 dirty_i->nr_dirty[t]++; 799 + 800 + if (__is_large_section(sbi)) { 801 + unsigned int secno = GET_SEC_FROM_SEG(sbi, segno); 802 + unsigned short valid_blocks = 803 + get_valid_blocks(sbi, segno, true); 804 + 805 + f2fs_bug_on(sbi, unlikely(!valid_blocks || 806 + valid_blocks == BLKS_PER_SEC(sbi))); 807 + 808 + if (!IS_CURSEC(sbi, secno)) 809 + set_bit(secno, dirty_i->dirty_secmap); 810 + } 799 811 } 800 812 } 801 813 ··· 815 803 enum dirty_type dirty_type) 816 804 { 817 805 struct dirty_seglist_info *dirty_i = DIRTY_I(sbi); 806 + unsigned short valid_blocks; 818 807 819 808 if (test_and_clear_bit(segno, dirty_i->dirty_segmap[dirty_type])) 820 809 dirty_i->nr_dirty[dirty_type]--; ··· 827 814 if (test_and_clear_bit(segno, dirty_i->dirty_segmap[t])) 828 815 dirty_i->nr_dirty[t]--; 829 816 830 - if (get_valid_blocks(sbi, segno, true) == 0) { 817 + valid_blocks = get_valid_blocks(sbi, segno, true); 818 + if (valid_blocks == 0) { 831 819 clear_bit(GET_SEC_FROM_SEG(sbi, segno), 832 820 dirty_i->victim_secmap); 833 821 #ifdef CONFIG_F2FS_CHECK_FS 834 822 clear_bit(segno, SIT_I(sbi)->invalid_segmap); 835 823 #endif 824 + } 825 + if (__is_large_section(sbi)) { 826 + unsigned int secno = GET_SEC_FROM_SEG(sbi, segno); 827 + 828 + if (!valid_blocks || 829 + valid_blocks == BLKS_PER_SEC(sbi)) { 830 + clear_bit(secno, dirty_i->dirty_secmap); 831 + return; 832 + } 833 + 834 + if (!IS_CURSEC(sbi, secno)) 835 + set_bit(secno, dirty_i->dirty_secmap); 836 836 } 837 837 } 838 838 } ··· 1759 1733 continue; 1760 1734 } 1761 1735 1762 - if (sbi->gc_mode == GC_URGENT) 1736 + if (sbi->gc_mode == GC_URGENT_HIGH) 1763 1737 __init_discard_policy(sbi, &dpolicy, DPOLICY_FORCE, 1); 1764 1738 1765 1739 sb_start_intwrite(sbi->sb); ··· 2166 2140 new_vblocks = se->valid_blocks + del; 2167 2141 offset = GET_BLKOFF_FROM_SEG0(sbi, blkaddr); 2168 2142 2169 - f2fs_bug_on(sbi, (new_vblocks >> (sizeof(unsigned short) << 3) || 2143 + f2fs_bug_on(sbi, (new_vblocks < 0 || 2170 2144 (new_vblocks > sbi->blocks_per_seg))); 2171 2145 2172 2146 se->valid_blocks = new_vblocks; ··· 2631 2605 bool reversed = false; 2632 2606 2633 2607 /* f2fs_need_SSR() already forces to do this */ 2634 - if (v_ops->get_victim(sbi, &segno, BG_GC, type, SSR)) { 2608 + if (!v_ops->get_victim(sbi, &segno, BG_GC, type, SSR)) { 2635 2609 curseg->next_segno = segno; 2636 2610 return 1; 2637 2611 } ··· 2658 2632 for (; cnt-- > 0; reversed ? 
i-- : i++) { 2659 2633 if (i == type) 2660 2634 continue; 2661 - if (v_ops->get_victim(sbi, &segno, BG_GC, i, SSR)) { 2635 + if (!v_ops->get_victim(sbi, &segno, BG_GC, i, SSR)) { 2662 2636 curseg->next_segno = segno; 2663 2637 return 1; 2664 2638 } ··· 2700 2674 stat_inc_seg_type(sbi, curseg); 2701 2675 } 2702 2676 2703 - void allocate_segment_for_resize(struct f2fs_sb_info *sbi, int type, 2677 + void f2fs_allocate_segment_for_resize(struct f2fs_sb_info *sbi, int type, 2704 2678 unsigned int start, unsigned int end) 2705 2679 { 2706 2680 struct curseg_info *curseg = CURSEG_I(sbi, type); ··· 2733 2707 up_read(&SM_I(sbi)->curseg_lock); 2734 2708 } 2735 2709 2736 - void f2fs_allocate_new_segments(struct f2fs_sb_info *sbi, int type) 2710 + static void __allocate_new_segment(struct f2fs_sb_info *sbi, int type) 2737 2711 { 2738 - struct curseg_info *curseg; 2712 + struct curseg_info *curseg = CURSEG_I(sbi, type); 2739 2713 unsigned int old_segno; 2714 + 2715 + if (!curseg->next_blkoff && 2716 + !get_valid_blocks(sbi, curseg->segno, false) && 2717 + !get_ckpt_valid_blocks(sbi, curseg->segno)) 2718 + return; 2719 + 2720 + old_segno = curseg->segno; 2721 + SIT_I(sbi)->s_ops->allocate_segment(sbi, type, true); 2722 + locate_dirty_segment(sbi, old_segno); 2723 + } 2724 + 2725 + void f2fs_allocate_new_segment(struct f2fs_sb_info *sbi, int type) 2726 + { 2727 + down_write(&SIT_I(sbi)->sentry_lock); 2728 + __allocate_new_segment(sbi, type); 2729 + up_write(&SIT_I(sbi)->sentry_lock); 2730 + } 2731 + 2732 + void f2fs_allocate_new_segments(struct f2fs_sb_info *sbi) 2733 + { 2740 2734 int i; 2741 2735 2742 2736 down_write(&SIT_I(sbi)->sentry_lock); 2743 - 2744 - for (i = CURSEG_HOT_DATA; i <= CURSEG_COLD_DATA; i++) { 2745 - if (type != NO_CHECK_TYPE && i != type) 2746 - continue; 2747 - 2748 - curseg = CURSEG_I(sbi, i); 2749 - if (type == NO_CHECK_TYPE || curseg->next_blkoff || 2750 - get_valid_blocks(sbi, curseg->segno, false) || 2751 - get_ckpt_valid_blocks(sbi, curseg->segno)) { 2752 - old_segno = curseg->segno; 2753 - SIT_I(sbi)->s_ops->allocate_segment(sbi, i, true); 2754 - locate_dirty_segment(sbi, old_segno); 2755 - } 2756 - } 2757 - 2737 + for (i = CURSEG_HOT_DATA; i <= CURSEG_COLD_DATA; i++) 2738 + __allocate_new_segment(sbi, i); 2758 2739 up_write(&SIT_I(sbi)->sentry_lock); 2759 2740 } 2760 2741 ··· 3122 3089 void f2fs_allocate_data_block(struct f2fs_sb_info *sbi, struct page *page, 3123 3090 block_t old_blkaddr, block_t *new_blkaddr, 3124 3091 struct f2fs_summary *sum, int type, 3125 - struct f2fs_io_info *fio, bool add_list) 3092 + struct f2fs_io_info *fio) 3126 3093 { 3127 3094 struct sit_info *sit_i = SIT_I(sbi); 3128 3095 struct curseg_info *curseg = CURSEG_I(sbi, type); ··· 3139 3106 } else if (type == CURSEG_COLD_DATA_PINNED) { 3140 3107 type = CURSEG_COLD_DATA; 3141 3108 } 3142 - 3143 - /* 3144 - * We need to wait for node_write to avoid block allocation during 3145 - * checkpoint. This can only happen to quota writes which can cause 3146 - * the below discard race condition. 
3147 - */ 3148 - if (IS_DATASEG(type)) 3149 - down_write(&sbi->node_write); 3150 3109 3151 3110 down_read(&SM_I(sbi)->curseg_lock); 3152 3111 ··· 3190 3165 if (F2FS_IO_ALIGNED(sbi)) 3191 3166 fio->retry = false; 3192 3167 3193 - if (add_list) { 3168 + if (fio) { 3194 3169 struct f2fs_bio_info *io; 3195 3170 3196 3171 INIT_LIST_HEAD(&fio->list); ··· 3204 3179 mutex_unlock(&curseg->curseg_mutex); 3205 3180 3206 3181 up_read(&SM_I(sbi)->curseg_lock); 3207 - 3208 - if (IS_DATASEG(type)) 3209 - up_write(&sbi->node_write); 3210 3182 3211 3183 if (put_pin_sem) 3212 3184 up_read(&sbi->pin_sem); ··· 3239 3217 down_read(&fio->sbi->io_order_lock); 3240 3218 reallocate: 3241 3219 f2fs_allocate_data_block(fio->sbi, fio->page, fio->old_blkaddr, 3242 - &fio->new_blkaddr, sum, type, fio, true); 3220 + &fio->new_blkaddr, sum, type, fio); 3243 3221 if (GET_SEGNO(fio->sbi, fio->old_blkaddr) != NULL_SEGNO) 3244 3222 invalidate_mapping_pages(META_MAPPING(fio->sbi), 3245 3223 fio->old_blkaddr, fio->old_blkaddr); ··· 4315 4293 { 4316 4294 struct dirty_seglist_info *dirty_i = DIRTY_I(sbi); 4317 4295 struct free_segmap_info *free_i = FREE_I(sbi); 4318 - unsigned int segno = 0, offset = 0; 4296 + unsigned int segno = 0, offset = 0, secno; 4319 4297 unsigned short valid_blocks; 4298 + unsigned short blks_per_sec = BLKS_PER_SEC(sbi); 4320 4299 4321 4300 while (1) { 4322 4301 /* find dirty segment based on free segmap */ ··· 4336 4313 __locate_dirty_segment(sbi, segno, DIRTY); 4337 4314 mutex_unlock(&dirty_i->seglist_lock); 4338 4315 } 4316 + 4317 + if (!__is_large_section(sbi)) 4318 + return; 4319 + 4320 + mutex_lock(&dirty_i->seglist_lock); 4321 + for (segno = 0; segno < MAIN_SECS(sbi); segno += blks_per_sec) { 4322 + valid_blocks = get_valid_blocks(sbi, segno, true); 4323 + secno = GET_SEC_FROM_SEG(sbi, segno); 4324 + 4325 + if (!valid_blocks || valid_blocks == blks_per_sec) 4326 + continue; 4327 + if (IS_CURSEC(sbi, secno)) 4328 + continue; 4329 + set_bit(secno, dirty_i->dirty_secmap); 4330 + } 4331 + mutex_unlock(&dirty_i->seglist_lock); 4339 4332 } 4340 4333 4341 4334 static int init_victim_secmap(struct f2fs_sb_info *sbi) ··· 4385 4346 dirty_i->dirty_segmap[i] = f2fs_kvzalloc(sbi, bitmap_size, 4386 4347 GFP_KERNEL); 4387 4348 if (!dirty_i->dirty_segmap[i]) 4349 + return -ENOMEM; 4350 + } 4351 + 4352 + if (__is_large_section(sbi)) { 4353 + bitmap_size = f2fs_bitmap_size(MAIN_SECS(sbi)); 4354 + dirty_i->dirty_secmap = f2fs_kvzalloc(sbi, 4355 + bitmap_size, GFP_KERNEL); 4356 + if (!dirty_i->dirty_secmap) 4388 4357 return -ENOMEM; 4389 4358 } 4390 4359 ··· 4821 4774 /* discard pre-free/dirty segments list */ 4822 4775 for (i = 0; i < NR_DIRTY_TYPE; i++) 4823 4776 discard_dirty_segmap(sbi, i); 4777 + 4778 + if (__is_large_section(sbi)) { 4779 + mutex_lock(&dirty_i->seglist_lock); 4780 + kvfree(dirty_i->dirty_secmap); 4781 + mutex_unlock(&dirty_i->seglist_lock); 4782 + } 4824 4783 4825 4784 destroy_victim_secmap(sbi); 4826 4785 SM_I(sbi)->dirty_info = NULL;
+7 -3
fs/f2fs/segment.h
··· 166 166 struct victim_sel_policy { 167 167 int alloc_mode; /* LFS or SSR */ 168 168 int gc_mode; /* GC_CB or GC_GREEDY */ 169 - unsigned long *dirty_segmap; /* dirty segment bitmap */ 170 - unsigned int max_search; /* maximum # of segments to search */ 169 + unsigned long *dirty_bitmap; /* dirty segment/section bitmap */ 170 + unsigned int max_search; /* 171 + * maximum # of segments/sections 172 + * to search 173 + */ 171 174 unsigned int offset; /* last scanned bitmap offset */ 172 175 unsigned int ofs_unit; /* bitmap search unit */ 173 176 unsigned int min_cost; /* minimum cost */ ··· 187 184 unsigned char *cur_valid_map_mir; /* mirror of current valid bitmap */ 188 185 #endif 189 186 /* 190 - * # of valid blocks and the validity bitmap stored in the the last 187 + * # of valid blocks and the validity bitmap stored in the last 191 188 * checkpoint pack. This information is used by the SSR mode. 192 189 */ 193 190 unsigned char *ckpt_valid_map; /* validity bitmap of blocks last cp */ ··· 269 266 struct dirty_seglist_info { 270 267 const struct victim_selection *v_ops; /* victim selction operation */ 271 268 unsigned long *dirty_segmap[NR_DIRTY_TYPE]; 269 + unsigned long *dirty_secmap; 272 270 struct mutex seglist_lock; /* lock for segment bitmaps */ 273 271 int nr_dirty[NR_DIRTY_TYPE]; /* # of dirty segments */ 274 272 unsigned long *victim_secmap; /* background GC victims */
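Together, the segment.c and segment.h changes introduce dirty_secmap, a section-granularity dirty bitmap maintained when sections span multiple segments (__is_large_section()); that is also why victim_sel_policy's dirty_segmap field becomes the more general dirty_bitmap, and why the get_victim() checks in get_ssr_segment() are inverted, consistent with get_victim() now returning zero on success. A section is tracked only while it is partially valid, and sections holding an active log are skipped, as in this toy model (the constants and the 64-section limit are invented):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Toy model of dirty_secmap maintenance: fully free or fully valid
 * sections are cleared, partially valid ones become GC candidates,
 * and current sections are left alone. */
#define BLKS_PER_SEC	512u

static uint64_t dirty_secmap;

static void update_dirty_secmap(unsigned int secno,
				unsigned int valid_blocks, bool is_cursec)
{
	if (valid_blocks == 0 || valid_blocks == BLKS_PER_SEC)
		dirty_secmap &= ~(1ULL << secno);	/* clear_bit() */
	else if (!is_cursec)
		dirty_secmap |= 1ULL << secno;		/* set_bit() */
}

int main(void)
{
	update_dirty_secmap(3, 100, false);		/* partial -> dirty */
	update_dirty_secmap(5, BLKS_PER_SEC, false);	/* full -> clean */
	update_dirty_secmap(6, 10, true);		/* current -> skipped */
	printf("dirty_secmap = %#llx\n", (unsigned long long)dirty_secmap);
	return 0;
}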
+42 -26
fs/f2fs/super.c
··· 350 350 set_opt(sbi, QUOTA); 351 351 return 0; 352 352 errout: 353 - kvfree(qname); 353 + kfree(qname); 354 354 return ret; 355 355 } 356 356 ··· 362 362 f2fs_err(sbi, "Cannot change journaled quota options when quota turned on"); 363 363 return -EINVAL; 364 364 } 365 - kvfree(F2FS_OPTION(sbi).s_qf_names[qtype]); 365 + kfree(F2FS_OPTION(sbi).s_qf_names[qtype]); 366 366 F2FS_OPTION(sbi).s_qf_names[qtype] = NULL; 367 367 return 0; 368 368 } ··· 462 462 { 463 463 struct f2fs_sb_info *sbi = F2FS_SB(sb); 464 464 substring_t args[MAX_OPT_ARGS]; 465 + #ifdef CONFIG_F2FS_FS_COMPRESSION 465 466 unsigned char (*ext)[F2FS_EXTENSION_LEN]; 467 + int ext_cnt; 468 + #endif 466 469 char *p, *name; 467 - int arg = 0, ext_cnt; 470 + int arg = 0; 468 471 kuid_t uid; 469 472 kgid_t gid; 470 473 int ret; ··· 499 496 } else if (!strcmp(name, "sync")) { 500 497 F2FS_OPTION(sbi).bggc_mode = BGGC_MODE_SYNC; 501 498 } else { 502 - kvfree(name); 499 + kfree(name); 503 500 return -EINVAL; 504 501 } 505 - kvfree(name); 502 + kfree(name); 506 503 break; 507 504 case Opt_disable_roll_forward: 508 505 set_opt(sbi, DISABLE_ROLL_FORWARD); ··· 659 656 if (!strcmp(name, "adaptive")) { 660 657 if (f2fs_sb_has_blkzoned(sbi)) { 661 658 f2fs_warn(sbi, "adaptive mode is not allowed with zoned block device feature"); 662 - kvfree(name); 659 + kfree(name); 663 660 return -EINVAL; 664 661 } 665 662 F2FS_OPTION(sbi).fs_mode = FS_MODE_ADAPTIVE; 666 663 } else if (!strcmp(name, "lfs")) { 667 664 F2FS_OPTION(sbi).fs_mode = FS_MODE_LFS; 668 665 } else { 669 - kvfree(name); 666 + kfree(name); 670 667 return -EINVAL; 671 668 } 672 - kvfree(name); 669 + kfree(name); 673 670 break; 674 671 case Opt_io_size_bits: 675 672 if (args->from && match_int(args, &arg)) ··· 795 792 } else if (!strcmp(name, "fs-based")) { 796 793 F2FS_OPTION(sbi).whint_mode = WHINT_MODE_FS; 797 794 } else { 798 - kvfree(name); 795 + kfree(name); 799 796 return -EINVAL; 800 797 } 801 - kvfree(name); 798 + kfree(name); 802 799 break; 803 800 case Opt_alloc: 804 801 name = match_strdup(&args[0]); ··· 810 807 } else if (!strcmp(name, "reuse")) { 811 808 F2FS_OPTION(sbi).alloc_mode = ALLOC_MODE_REUSE; 812 809 } else { 813 - kvfree(name); 810 + kfree(name); 814 811 return -EINVAL; 815 812 } 816 - kvfree(name); 813 + kfree(name); 817 814 break; 818 815 case Opt_fsync: 819 816 name = match_strdup(&args[0]); ··· 827 824 F2FS_OPTION(sbi).fsync_mode = 828 825 FSYNC_MODE_NOBARRIER; 829 826 } else { 830 - kvfree(name); 827 + kfree(name); 831 828 return -EINVAL; 832 829 } 833 - kvfree(name); 830 + kfree(name); 834 831 break; 835 832 case Opt_test_dummy_encryption: 836 833 ret = f2fs_set_test_dummy_encryption(sb, p, &args[0], ··· 865 862 case Opt_checkpoint_enable: 866 863 clear_opt(sbi, DISABLE_CHECKPOINT); 867 864 break; 865 + #ifdef CONFIG_F2FS_FS_COMPRESSION 868 866 case Opt_compress_algorithm: 869 867 if (!f2fs_sb_has_compression(sbi)) { 870 868 f2fs_err(sbi, "Compression feature if off"); ··· 931 927 F2FS_OPTION(sbi).compress_ext_cnt++; 932 928 kfree(name); 933 929 break; 930 + #else 931 + case Opt_compress_algorithm: 932 + case Opt_compress_log_size: 933 + case Opt_compress_extension: 934 + f2fs_info(sbi, "compression options not supported"); 935 + break; 936 + #endif 934 937 default: 935 938 f2fs_err(sbi, "Unrecognized mount option \"%s\" or missing value", 936 939 p); ··· 1034 1023 1035 1024 /* Will be used by directory only */ 1036 1025 fi->i_dir_level = F2FS_SB(sb)->dir_level; 1026 + 1027 + fi->ra_offset = -1; 1037 1028 1038 1029 return &fi->vfs_inode; 1039 1030 } ··· 
1195 1182 int i; 1196 1183 bool dropped; 1197 1184 1185 + /* unregister procfs/sysfs entries in advance to avoid race case */ 1186 + f2fs_unregister_sysfs(sbi); 1187 + 1198 1188 f2fs_quota_off_umount(sb); 1199 1189 1200 1190 /* prevent remaining shrinker jobs */ ··· 1263 1247 1264 1248 kvfree(sbi->ckpt); 1265 1249 1266 - f2fs_unregister_sysfs(sbi); 1267 - 1268 1250 sb->s_fs_info = NULL; 1269 1251 if (sbi->s_chksum_driver) 1270 1252 crypto_free_shash(sbi->s_chksum_driver); 1271 - kvfree(sbi->raw_super); 1253 + kfree(sbi->raw_super); 1272 1254 1273 1255 destroy_device_list(sbi); 1274 1256 f2fs_destroy_xattr_caches(sbi); 1275 1257 mempool_destroy(sbi->write_io_dummy); 1276 1258 #ifdef CONFIG_QUOTA 1277 1259 for (i = 0; i < MAXQUOTAS; i++) 1278 - kvfree(F2FS_OPTION(sbi).s_qf_names[i]); 1260 + kfree(F2FS_OPTION(sbi).s_qf_names[i]); 1279 1261 #endif 1280 1262 fscrypt_free_dummy_context(&F2FS_OPTION(sbi).dummy_enc_ctx); 1281 1263 destroy_percpu_info(sbi); ··· 1282 1268 #ifdef CONFIG_UNICODE 1283 1269 utf8_unload(sbi->s_encoding); 1284 1270 #endif 1285 - kvfree(sbi); 1271 + kfree(sbi); 1286 1272 } 1287 1273 1288 1274 int f2fs_sync_fs(struct super_block *sb, int sync) ··· 1631 1617 else if (F2FS_OPTION(sbi).fsync_mode == FSYNC_MODE_NOBARRIER) 1632 1618 seq_printf(seq, ",fsync_mode=%s", "nobarrier"); 1633 1619 1620 + #ifdef CONFIG_F2FS_FS_COMPRESSION 1634 1621 f2fs_show_compress_options(seq, sbi->sb); 1622 + #endif 1635 1623 return 0; 1636 1624 } 1637 1625 ··· 1784 1768 GFP_KERNEL); 1785 1769 if (!org_mount_opt.s_qf_names[i]) { 1786 1770 for (j = 0; j < i; j++) 1787 - kvfree(org_mount_opt.s_qf_names[j]); 1771 + kfree(org_mount_opt.s_qf_names[j]); 1788 1772 return -ENOMEM; 1789 1773 } 1790 1774 } else { ··· 1909 1893 #ifdef CONFIG_QUOTA 1910 1894 /* Release old quota file names */ 1911 1895 for (i = 0; i < MAXQUOTAS; i++) 1912 - kvfree(org_mount_opt.s_qf_names[i]); 1896 + kfree(org_mount_opt.s_qf_names[i]); 1913 1897 #endif 1914 1898 /* Update the POSIXACL Flag */ 1915 1899 sb->s_flags = (sb->s_flags & ~SB_POSIXACL) | ··· 1930 1914 #ifdef CONFIG_QUOTA 1931 1915 F2FS_OPTION(sbi).s_jquota_fmt = org_mount_opt.s_jquota_fmt; 1932 1916 for (i = 0; i < MAXQUOTAS; i++) { 1933 - kvfree(F2FS_OPTION(sbi).s_qf_names[i]); 1917 + kfree(F2FS_OPTION(sbi).s_qf_names[i]); 1934 1918 F2FS_OPTION(sbi).s_qf_names[i] = org_mount_opt.s_qf_names[i]; 1935 1919 } 1936 1920 #endif ··· 3188 3172 3189 3173 /* No valid superblock */ 3190 3174 if (!*raw_super) 3191 - kvfree(super); 3175 + kfree(super); 3192 3176 else 3193 3177 err = 0; 3194 3178 ··· 3862 3846 free_options: 3863 3847 #ifdef CONFIG_QUOTA 3864 3848 for (i = 0; i < MAXQUOTAS; i++) 3865 - kvfree(F2FS_OPTION(sbi).s_qf_names[i]); 3849 + kfree(F2FS_OPTION(sbi).s_qf_names[i]); 3866 3850 #endif 3867 3851 fscrypt_free_dummy_context(&F2FS_OPTION(sbi).dummy_enc_ctx); 3868 3852 kvfree(options); 3869 3853 free_sb_buf: 3870 - kvfree(raw_super); 3854 + kfree(raw_super); 3871 3855 free_sbi: 3872 3856 if (sbi->s_chksum_driver) 3873 3857 crypto_free_shash(sbi->s_chksum_driver); 3874 - kvfree(sbi); 3858 + kfree(sbi); 3875 3859 3876 3860 /* give only one another chance */ 3877 3861 if (retry_cnt > 0 && skip_recovery) {
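Two themes run through the super.c hunks: memory obtained with kmalloc()/kstrdup() (quota file names, the raw superblock, sbi itself) is now released with kfree() instead of kvfree(), matching the free routine to the allocator, and the compress_* mount options are wrapped in CONFIG_F2FS_FS_COMPRESSION so that a compression-less build logs a message rather than failing the mount. A sketch of the latter shape (CONFIG_COMPRESSION is an invented stand-in for the real config symbol):

#include <stdio.h>

/* Sketch of compiling options out without breaking mounts: when the
 * feature is configured off, the option names stay recognized but
 * only produce an informational message. */
#define CONFIG_COMPRESSION 0	/* flip to 1 to model a compression build */

enum opt {
	OPT_COMPRESS_ALGORITHM,
	OPT_COMPRESS_LOG_SIZE,
	OPT_UNKNOWN,
};

static int parse_option(enum opt o)
{
	switch (o) {
#if CONFIG_COMPRESSION
	case OPT_COMPRESS_ALGORITHM:
	case OPT_COMPRESS_LOG_SIZE:
		/* real option parsing would live here */
		return 0;
#else
	case OPT_COMPRESS_ALGORITHM:
	case OPT_COMPRESS_LOG_SIZE:
		puts("compression options not supported");
		return 0;	/* tolerated, mount continues */
#endif
	default:
		fprintf(stderr, "unrecognized mount option\n");
		return -1;
	}
}

int main(void)
{
	return parse_option(OPT_COMPRESS_ALGORITHM) ? 1 : 0;
}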
+17 -6
fs/f2fs/sysfs.c
··· 27 27 NM_INFO, /* struct f2fs_nm_info */ 28 28 F2FS_SBI, /* struct f2fs_sb_info */ 29 29 #ifdef CONFIG_F2FS_STAT_FS 30 - STAT_INFO, /* struct f2fs_stat_info */ 30 + STAT_INFO, /* struct f2fs_stat_info */ 31 31 #endif 32 32 #ifdef CONFIG_F2FS_FAULT_INJECTION 33 33 FAULT_INFO_RATE, /* struct f2fs_fault_info */ ··· 223 223 } 224 224 #endif 225 225 226 + static ssize_t main_blkaddr_show(struct f2fs_attr *a, 227 + struct f2fs_sb_info *sbi, char *buf) 228 + { 229 + return snprintf(buf, PAGE_SIZE, "%llu\n", 230 + (unsigned long long)MAIN_BLKADDR(sbi)); 231 + } 232 + 226 233 static ssize_t f2fs_sbi_show(struct f2fs_attr *a, 227 234 struct f2fs_sb_info *sbi, char *buf) 228 235 { ··· 357 350 return -EINVAL; 358 351 359 352 if (!strcmp(a->attr.name, "gc_urgent")) { 360 - if (t >= 1) { 361 - sbi->gc_mode = GC_URGENT; 353 + if (t == 0) { 354 + sbi->gc_mode = GC_NORMAL; 355 + } else if (t == 1) { 356 + sbi->gc_mode = GC_URGENT_HIGH; 362 357 if (sbi->gc_thread) { 363 358 sbi->gc_thread->gc_wake = 1; 364 359 wake_up_interruptible_all( 365 360 &sbi->gc_thread->gc_wait_queue_head); 366 361 wake_up_discard_thread(sbi, true); 367 362 } 363 + } else if (t == 2) { 364 + sbi->gc_mode = GC_URGENT_LOW; 368 365 } else { 369 - sbi->gc_mode = GC_NORMAL; 366 + return -EINVAL; 370 367 } 371 368 return count; 372 369 } ··· 533 522 F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_idle, gc_mode); 534 523 F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_urgent, gc_mode); 535 524 F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, reclaim_segments, rec_prefree_segments); 536 - F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, main_blkaddr, main_blkaddr); 537 525 F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, max_small_discards, max_discards); 538 526 F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, discard_granularity, discard_granularity); 539 527 F2FS_RW_ATTR(RESERVED_BLOCKS, f2fs_sb_info, reserved_blocks, reserved_blocks); ··· 575 565 F2FS_GENERAL_RO_ATTR(unusable); 576 566 F2FS_GENERAL_RO_ATTR(encoding); 577 567 F2FS_GENERAL_RO_ATTR(mounted_time_sec); 568 + F2FS_GENERAL_RO_ATTR(main_blkaddr); 578 569 #ifdef CONFIG_F2FS_STAT_FS 579 570 F2FS_STAT_ATTR(STAT_INFO, f2fs_stat_info, cp_foreground_calls, cp_count); 580 571 F2FS_STAT_ATTR(STAT_INFO, f2fs_stat_info, cp_background_calls, bg_cp_count); ··· 717 706 }; 718 707 719 708 static struct kset f2fs_kset = { 720 - .kobj = {.ktype = &f2fs_ktype}, 709 + .kobj = {.ktype = &f2fs_ktype}, 721 710 }; 722 711 723 712 static struct kobj_type f2fs_feat_ktype = {
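The sysfs change turns gc_urgent into a three-way knob: 0 restores GC_NORMAL, 1 selects GC_URGENT_HIGH and wakes the GC and discard threads, 2 selects the new GC_URGENT_LOW, and anything else is now rejected with -EINVAL instead of being folded into urgent mode; main_blkaddr also moves from a RW attribute to a read-only one with its own show routine. The store-side parsing, sketched:

#include <errno.h>
#include <stdio.h>

/* Sketch of the gc_urgent store path: three valid values, everything
 * else rejected. In the kernel, selecting GC_URGENT_HIGH also wakes
 * the GC and discard threads. */
enum gc_mode { GC_NORMAL, GC_URGENT_HIGH, GC_URGENT_LOW };

static int store_gc_urgent(unsigned long t, enum gc_mode *mode)
{
	switch (t) {
	case 0:
		*mode = GC_NORMAL;
		break;
	case 1:
		*mode = GC_URGENT_HIGH;
		break;
	case 2:
		*mode = GC_URGENT_LOW;
		break;
	default:
		return -EINVAL;
	}
	return 0;
}

int main(void)
{
	enum gc_mode mode;

	if (store_gc_urgent(2, &mode) == 0)
		printf("gc_mode = %d\n", mode);
	return 0;
}

From userspace this is simply, for example, echo 2 > /sys/fs/f2fs/<disk>/gc_urgent.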
+4 -2
fs/f2fs/verity.c
··· 29 29 #include "f2fs.h" 30 30 #include "xattr.h" 31 31 32 + #define F2FS_VERIFY_VER (1) 33 + 32 34 static inline loff_t f2fs_verity_metadata_pos(const struct inode *inode) 33 35 { 34 36 return round_up(inode->i_size, 65536); ··· 154 152 struct inode *inode = file_inode(filp); 155 153 u64 desc_pos = f2fs_verity_metadata_pos(inode) + merkle_tree_size; 156 154 struct fsverity_descriptor_location dloc = { 157 - .version = cpu_to_le32(1), 155 + .version = cpu_to_le32(F2FS_VERIFY_VER), 158 156 .size = cpu_to_le32(desc_size), 159 157 .pos = cpu_to_le64(desc_pos), 160 158 }; ··· 201 199 F2FS_XATTR_NAME_VERITY, &dloc, sizeof(dloc), NULL); 202 200 if (res < 0 && res != -ERANGE) 203 201 return res; 204 - if (res != sizeof(dloc) || dloc.version != cpu_to_le32(1)) { 202 + if (res != sizeof(dloc) || dloc.version != cpu_to_le32(F2FS_VERIFY_VER)) { 205 203 f2fs_warn(F2FS_I_SB(inode), "unknown verity xattr format"); 206 204 return -EINVAL; 207 205 }
+2 -2
fs/f2fs/xattr.c
··· 175 175 const struct xattr_handler f2fs_xattr_advise_handler = { 176 176 .name = F2FS_SYSTEM_ADVISE_NAME, 177 177 .flags = F2FS_XATTR_INDEX_ADVISE, 178 - .get = f2fs_xattr_advise_get, 179 - .set = f2fs_xattr_advise_set, 178 + .get = f2fs_xattr_advise_get, 179 + .set = f2fs_xattr_advise_set, 180 180 }; 181 181 182 182 const struct xattr_handler f2fs_xattr_security_handler = {
+63 -0
include/trace/events/f2fs.h
··· 1891 1891 __entry->fs_cdrio, __entry->fs_nrio, __entry->fs_mrio) 1892 1892 ); 1893 1893 1894 + TRACE_EVENT(f2fs_bmap, 1895 + 1896 + TP_PROTO(struct inode *inode, sector_t lblock, sector_t pblock), 1897 + 1898 + TP_ARGS(inode, lblock, pblock), 1899 + 1900 + TP_STRUCT__entry( 1901 + __field(dev_t, dev) 1902 + __field(ino_t, ino) 1903 + __field(sector_t, lblock) 1904 + __field(sector_t, pblock) 1905 + ), 1906 + 1907 + TP_fast_assign( 1908 + __entry->dev = inode->i_sb->s_dev; 1909 + __entry->ino = inode->i_ino; 1910 + __entry->lblock = lblock; 1911 + __entry->pblock = pblock; 1912 + ), 1913 + 1914 + TP_printk("dev = (%d,%d), ino = %lu, lblock:%lld, pblock:%lld", 1915 + show_dev_ino(__entry), 1916 + (unsigned long long)__entry->lblock, 1917 + (unsigned long long)__entry->pblock) 1918 + ); 1919 + 1920 + TRACE_EVENT(f2fs_fiemap, 1921 + 1922 + TP_PROTO(struct inode *inode, sector_t lblock, sector_t pblock, 1923 + unsigned long long len, unsigned int flags, int ret), 1924 + 1925 + TP_ARGS(inode, lblock, pblock, len, flags, ret), 1926 + 1927 + TP_STRUCT__entry( 1928 + __field(dev_t, dev) 1929 + __field(ino_t, ino) 1930 + __field(sector_t, lblock) 1931 + __field(sector_t, pblock) 1932 + __field(unsigned long long, len) 1933 + __field(unsigned int, flags) 1934 + __field(int, ret) 1935 + ), 1936 + 1937 + TP_fast_assign( 1938 + __entry->dev = inode->i_sb->s_dev; 1939 + __entry->ino = inode->i_ino; 1940 + __entry->lblock = lblock; 1941 + __entry->pblock = pblock; 1942 + __entry->len = len; 1943 + __entry->flags = flags; 1944 + __entry->ret = ret; 1945 + ), 1946 + 1947 + TP_printk("dev = (%d,%d), ino = %lu, lblock:%lld, pblock:%lld, " 1948 + "len:%llu, flags:%u, ret:%d", 1949 + show_dev_ino(__entry), 1950 + (unsigned long long)__entry->lblock, 1951 + (unsigned long long)__entry->pblock, 1952 + __entry->len, 1953 + __entry->flags, 1954 + __entry->ret) 1955 + ); 1956 + 1894 1957 #endif /* _TRACE_F2FS_H */ 1895 1958 1896 1959 /* This part must be outside protection */
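The two new tracepoints, f2fs_bmap and f2fs_fiemap, log the logical-to-physical translation (plus length, flags and return code for fiemap); the fiemap hook is the one called from the fs/f2fs/inline.c hunk near the top of this diff. They are consumed like any other ftrace event; a minimal enabling sketch, assuming tracefs is mounted at /sys/kernel/tracing (older setups expose the same tree under /sys/kernel/debug/tracing) and the process has sufficient privileges:

#include <stdio.h>

/* Minimal tracefs consumer sketch for the new f2fs_fiemap event. */
int main(void)
{
	const char *path =
		"/sys/kernel/tracing/events/f2fs/f2fs_fiemap/enable";
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return 1;
	}
	fputs("1", f);
	fclose(f);
	puts("run a fiemap workload, then read /sys/kernel/tracing/trace_pipe");
	return 0;
}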