Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'f2fs-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
"The main changes include converting major IO paths to use folio, and
adding various knobs to control GC more flexibly for Zoned devices.

In addition, there are several patches that address corner cases of
atomic file operations and improve support for file pinning on zoned
devices.

Enhancement:
- add knobs to tune foreground/background GCs for Zoned devices
- convert IO paths to use folio
- reduce expensive checkpoint trigger frequency
- allow F2FS_IPU_NOCACHE for pinned file
- forcibly migrate to secure space for zoned device file pinning
- get rid of buffer_head use
- add write priority option based on zone UFS
- get rid of online repair on corrupted directory

Bug fixes:
- fix to don't panic system for no free segment fault injection
- fix to don't set SB_RDONLY in f2fs_handle_critical_error()
- avoid unused block when dio write in LFS mode
- compress: don't redirty sparse cluster during {,de}compress
- check discard support for conventional zones
- atomic: prevent atomic file from being dirtied before commit
- atomic: fix to check atomic_file in f2fs ioctl interfaces
- atomic: fix to forbid dio in atomic_file
- atomic: fix to truncate pagecache before on-disk metadata truncation
- atomic: create COW inode from parent dentry
- atomic: fix to avoid racing w/ GC
- atomic: require FMODE_WRITE for atomic write ioctls
- fix to wait page writeback before setting gcing flag
- fix to avoid racing in between read and OPU dio write, dio completion
- fix several potential integer overflows in file offsets and dir_block_index
- fix to avoid use-after-free in f2fs_stop_gc_thread()

As usual, there are several code clean-ups and refactorings"
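Several of the integer-overflow fixes listed above share one root cause: an offset computed in 32-bit arithmetic before being widened to a 64-bit file offset. A minimal userspace sketch of the pattern (the names and the `loff_t_demo` type are illustrative stand-ins, not the kernel code itself):

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t loff_t_demo;	/* stand-in for the kernel's loff_t */

/* Buggy form: the shift happens in 32-bit arithmetic, so for page
 * indexes >= 2^20 (with 4 KiB pages) the byte offset wraps. */
static loff_t_demo offset_buggy(uint32_t page_index)
{
	return (uint32_t)((page_index + 1) << 12);
}

/* Fixed form: widen to 64 bits before shifting, mirroring casts such
 * as (loff_t)(folio->index + 1) << PAGE_SHIFT in the f2fs patches. */
static loff_t_demo offset_fixed(uint32_t page_index)
{
	return ((loff_t_demo)page_index + 1) << 12;
}
```

For small indexes both forms agree, which is why this class of bug only shows up on files larger than 4 GiB.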

* tag 'f2fs-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (60 commits)
f2fs: allow F2FS_IPU_NOCACHE for pinned file
f2fs: forcibly migrate to secure space for zoned device file pinning
f2fs: remove unused parameters
f2fs: fix to don't panic system for no free segment fault injection
f2fs: fix to don't set SB_RDONLY in f2fs_handle_critical_error()
f2fs: add valid block ratio not to do excessive GC for one time GC
f2fs: create gc_no_zoned_gc_percent and gc_boost_zoned_gc_percent
f2fs: do FG_GC when GC boosting is required for zoned devices
f2fs: increase BG GC migration window granularity when boosted for zoned devices
f2fs: add reserved_segments sysfs node
f2fs: introduce migration_window_granularity
f2fs: make BG GC more aggressive for zoned devices
f2fs: avoid unused block when dio write in LFS mode
f2fs: fix to check atomic_file in f2fs ioctl interfaces
f2fs: get rid of online repaire on corrupted directory
f2fs: prevent atomic file from being dirtied before commit
f2fs: get rid of page->index
f2fs: convert read_node_page() to use folio
f2fs: convert __write_node_page() to use folio
f2fs: convert f2fs_write_data_page() to use folio
...

+799 -464
+56
Documentation/ABI/testing/sysfs-fs-f2fs
···
 		candidates whose age is not beyond the threshold, by default it was
 		initialized as 604800 seconds (equals to 7 days).

+What:		/sys/fs/f2fs/<disk>/atgc_enabled
+Date:		Feb 2024
+Contact:	"Jinbao Liu" <liujinbao1@xiaomi.com>
+Description:	It represents whether ATGC is on or off. The value is 1, which
+		indicates that ATGC is on, and 0 indicates that it is off.
+
 What:		/sys/fs/f2fs/<disk>/gc_reclaimed_segments
 Date:		July 2021
 Contact:	"Daeho Jeong" <daehojeong@google.com>
···
 Contact:	"Chao Yu" <chao@kernel.org>
 Description:	It controls to enable/disable IO aware feature for background discard.
 		By default, the value is 1 which indicates IO aware is on.
+
+What:		/sys/fs/f2fs/<disk>/blkzone_alloc_policy
+Date:		July 2024
+Contact:	"Yuanhong Liao" <liaoyuanhong@vivo.com>
+Description:	The zoned UFS we are currently using consists of two parts:
+		conventional zones and sequential zones. It can be used to control
+		which part to prioritize for writes, with a default value of 0.
+
+		======================== =========================================
+		value                    description
+		blkzone_alloc_policy = 0 Prioritize writing to sequential zones
+		blkzone_alloc_policy = 1 Only allow writing to sequential zones
+		blkzone_alloc_policy = 2 Prioritize writing to conventional zones
+		======================== =========================================
+
+What:		/sys/fs/f2fs/<disk>/migration_window_granularity
+Date:		September 2024
+Contact:	"Daeho Jeong" <daehojeong@google.com>
+Description:	Controls the migration window granularity of garbage collection
+		on a large section. It controls the scanning window granularity
+		for GC migration in a unit of segments, while migration_granularity
+		controls the number of segments which can be migrated in the same turn.
+
+What:		/sys/fs/f2fs/<disk>/reserved_segments
+Date:		September 2024
+Contact:	"Daeho Jeong" <daehojeong@google.com>
+Description:	In order to fine tune GC behavior, we can control the number of
+		reserved segments.
+
+What:		/sys/fs/f2fs/<disk>/gc_no_zoned_gc_percent
+Date:		September 2024
+Contact:	"Daeho Jeong" <daehojeong@google.com>
+Description:	If the percentage of free sections over total sections is above this
+		number, F2FS does not perform garbage collection for zoned devices
+		through the background GC thread. The default number is "60".
+
+What:		/sys/fs/f2fs/<disk>/gc_boost_zoned_gc_percent
+Date:		September 2024
+Contact:	"Daeho Jeong" <daehojeong@google.com>
+Description:	If the percentage of free sections over total sections is under this
+		number, F2FS boosts garbage collection for zoned devices through the
+		background GC thread. The default number is "25".
+
+What:		/sys/fs/f2fs/<disk>/gc_valid_thresh_ratio
+Date:		September 2024
+Contact:	"Daeho Jeong" <daehojeong@google.com>
+Description:	It controls the valid block ratio threshold not to trigger excessive GC
+		for zoned devices. Its initial value is 95 (%). F2FS will stop the
+		background GC thread from initiating GC for sections having valid blocks
+		exceeding the ratio.
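The zoned-GC knobs above combine into a simple three-way decision in the background GC thread. A userspace sketch using the documented defaults (the enum and function names here are illustrative, not kernel APIs):

```c
#include <assert.h>

/* Sketch of the zoned-device GC tuning described by the sysfs knobs:
 * gc_no_zoned_gc_percent (default 60) and gc_boost_zoned_gc_percent
 * (default 25). Names are illustrative, not kernel symbols. */
enum zoned_gc_mode { GC_NONE, GC_NORMAL, GC_BOOST };

static enum zoned_gc_mode pick_gc_mode(unsigned int free_sec,
				       unsigned int total_sec,
				       unsigned int no_gc_percent,
				       unsigned int boost_percent)
{
	/* widen before multiplying to avoid overflow on huge devices */
	unsigned int free_pct =
		(unsigned int)((unsigned long long)free_sec * 100 / total_sec);

	if (free_pct > no_gc_percent)
		return GC_NONE;		/* plenty of free sections: skip BG GC */
	if (free_pct < boost_percent)
		return GC_BOOST;	/* space is tight: boost GC */
	return GC_NORMAL;
}
```

With the defaults, a device that is 70% free does no background GC at all, while one under 25% free gets boosted collection; gc_valid_thresh_ratio (default 95%) additionally excludes nearly-full sections from being chosen as victims.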
+9 -8
fs/f2fs/checkpoint.c
···
 	}

 	if (unlikely(!PageUptodate(page))) {
-		f2fs_handle_page_eio(sbi, page->index, META);
+		f2fs_handle_page_eio(sbi, page_folio(page), META);
 		f2fs_put_page(page, 1);
 		return ERR_PTR(-EIO);
 	}
···
 				enum iostat_type io_type)
 {
 	struct f2fs_sb_info *sbi = F2FS_P_SB(page);
+	struct folio *folio = page_folio(page);

-	trace_f2fs_writepage(page_folio(page), META);
+	trace_f2fs_writepage(folio, META);

 	if (unlikely(f2fs_cp_error(sbi))) {
 		if (is_sbi_flag_set(sbi, SBI_IS_CLOSE)) {
-			ClearPageUptodate(page);
+			folio_clear_uptodate(folio);
 			dec_page_count(sbi, F2FS_DIRTY_META);
-			unlock_page(page);
+			folio_unlock(folio);
 			return 0;
 		}
 		goto redirty_out;
 	}
 	if (unlikely(is_sbi_flag_set(sbi, SBI_POR_DOING)))
 		goto redirty_out;
-	if (wbc->for_reclaim && page->index < GET_SUM_BLOCK(sbi, 0))
+	if (wbc->for_reclaim && folio->index < GET_SUM_BLOCK(sbi, 0))
 		goto redirty_out;

-	f2fs_do_write_meta_page(sbi, page, io_type);
+	f2fs_do_write_meta_page(sbi, folio, io_type);
 	dec_page_count(sbi, F2FS_DIRTY_META);

 	if (wbc->for_reclaim)
 		f2fs_submit_merged_write_cond(sbi, NULL, page, 0, META);

-	unlock_page(page);
+	folio_unlock(folio);

 	if (unlikely(f2fs_cp_error(sbi)))
 		f2fs_submit_merged_write(sbi, META);
···
 	blk = start_blk + BLKS_PER_SEG(sbi) - nm_i->nat_bits_blocks;
 	for (i = 0; i < nm_i->nat_bits_blocks; i++)
 		f2fs_update_meta_page(sbi, nm_i->nat_bits +
-				(i << F2FS_BLKSIZE_BITS), blk + i);
+				F2FS_BLK_TO_BYTES(i), blk + i);
 }

 /* write out checkpoint buffer at block 0 */
+43 -20
fs/f2fs/compress.c
···
 static void f2fs_set_compressed_page(struct page *page,
 		struct inode *inode, pgoff_t index, void *data)
 {
-	attach_page_private(page, (void *)data);
+	struct folio *folio = page_folio(page);
+
+	folio_attach_private(folio, (void *)data);

 	/* i_crypto_info and iv index */
-	page->index = index;
-	page->mapping = inode->i_mapping;
+	folio->index = index;
+	folio->mapping = inode->i_mapping;
 }
···
 	cc->cluster_idx = NULL_CLUSTER;
 }

-void f2fs_compress_ctx_add_page(struct compress_ctx *cc, struct page *page)
+void f2fs_compress_ctx_add_page(struct compress_ctx *cc, struct folio *folio)
 {
 	unsigned int cluster_ofs;

-	if (!f2fs_cluster_can_merge_page(cc, page->index))
+	if (!f2fs_cluster_can_merge_page(cc, folio->index))
 		f2fs_bug_on(F2FS_I_SB(cc->inode), 1);

-	cluster_ofs = offset_in_cluster(cc, page->index);
-	cc->rpages[cluster_ofs] = page;
+	cluster_ofs = offset_in_cluster(cc, folio->index);
+	cc->rpages[cluster_ofs] = folio_page(folio, 0);
 	cc->nr_rpages++;
-	cc->cluster_idx = cluster_idx(cc, page->index);
+	cc->cluster_idx = cluster_idx(cc, folio->index);
 }

 #ifdef CONFIG_F2FS_FS_LZO
···
 		f2fs_bug_on(F2FS_I_SB(cc->inode), !page);

 		/* beyond EOF */
-		if (page->index >= nr_pages)
+		if (page_folio(page)->index >= nr_pages)
 			return true;
 	}
 	return false;
···
 	unsigned int cluster_size = F2FS_I(inode)->i_cluster_size;
 	int count, i;

-	for (i = 1, count = 1; i < cluster_size; i++) {
+	for (i = 0, count = 0; i < cluster_size; i++) {
 		block_t blkaddr = data_blkaddr(dn->inode, dn->node_page,
 				dn->ofs_in_node + i);
···
 	return count;
 }

-static int __f2fs_cluster_blocks(struct inode *inode,
-		unsigned int cluster_idx, bool compr_blks)
+static int __f2fs_cluster_blocks(struct inode *inode, unsigned int cluster_idx,
+		enum cluster_check_type type)
 {
 	struct dnode_of_data dn;
 	unsigned int start_idx = cluster_idx <<
···
 	}

 	if (dn.data_blkaddr == COMPRESS_ADDR) {
-		if (compr_blks)
-			ret = __f2fs_get_cluster_blocks(inode, &dn);
-		else
+		if (type == CLUSTER_COMPR_BLKS)
+			ret = 1 + __f2fs_get_cluster_blocks(inode, &dn);
+		else if (type == CLUSTER_IS_COMPR)
 			ret = 1;
+	} else if (type == CLUSTER_RAW_BLKS) {
+		ret = __f2fs_get_cluster_blocks(inode, &dn);
 	}
 fail:
 	f2fs_put_dnode(&dn);
···
 /* return # of compressed blocks in compressed cluster */
 static int f2fs_compressed_blocks(struct compress_ctx *cc)
 {
-	return __f2fs_cluster_blocks(cc->inode, cc->cluster_idx, true);
+	return __f2fs_cluster_blocks(cc->inode, cc->cluster_idx,
+			CLUSTER_COMPR_BLKS);
+}
+
+/* return # of raw blocks in non-compressed cluster */
+static int f2fs_decompressed_blocks(struct inode *inode,
+		unsigned int cluster_idx)
+{
+	return __f2fs_cluster_blocks(inode, cluster_idx,
+			CLUSTER_RAW_BLKS);
 }

 /* return whether cluster is compressed one or not */
···
 {
 	return __f2fs_cluster_blocks(inode,
 		index >> F2FS_I(inode)->i_log_cluster_size,
-		false);
+		CLUSTER_IS_COMPR);
+}
+
+/* return whether cluster contains non raw blocks or not */
+bool f2fs_is_sparse_cluster(struct inode *inode, pgoff_t index)
+{
+	unsigned int cluster_idx = index >> F2FS_I(inode)->i_log_cluster_size;
+
+	return f2fs_decompressed_blocks(inode, cluster_idx) !=
+		F2FS_I(inode)->i_cluster_size;
 }

 static bool cluster_may_compress(struct compress_ctx *cc)
···
 		if (PageUptodate(page))
 			f2fs_put_page(page, 1);
 		else
-			f2fs_compress_ctx_add_page(cc, page);
+			f2fs_compress_ctx_add_page(cc, page_folio(page));
 	}

 	if (!f2fs_cluster_is_empty(cc)) {
···
 		}

 		f2fs_wait_on_page_writeback(page, DATA, true, true);
-		f2fs_compress_ctx_add_page(cc, page);
+		f2fs_compress_ctx_add_page(cc, page_folio(page));

 		if (!PageUptodate(page)) {
 release_and_retry:
···
 		if (!clear_page_dirty_for_io(cc->rpages[i]))
 			goto continue_unlock;

-		ret = f2fs_write_single_data_page(cc->rpages[i], &submitted,
+		ret = f2fs_write_single_data_page(page_folio(cc->rpages[i]),
+						&submitted,
 						NULL, NULL, wbc, io_type,
 						compr_blocks, false);
 		if (ret) {
+92 -72
fs/f2fs/data.c
···
  */
 #include <linux/fs.h>
 #include <linux/f2fs_fs.h>
-#include <linux/buffer_head.h>
 #include <linux/sched/mm.h>
 #include <linux/mpage.h>
 #include <linux/writeback.h>
···
 	}

 	f2fs_bug_on(sbi, page->mapping == NODE_MAPPING(sbi) &&
-			page->index != nid_of_node(page));
+			page_folio(page)->index != nid_of_node(page));

 	dec_page_count(sbi, type);
 	if (f2fs_in_warm_node_list(sbi, page))
···
 	bio = __bio_alloc(fio, 1);

 	f2fs_set_bio_crypt_ctx(bio, fio->page->mapping->host,
-			fio->page->index, fio, GFP_NOIO);
+			page_folio(fio->page)->index, fio, GFP_NOIO);

 	if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) {
 		bio_put(bio);
···
 				fio->new_blkaddr));
 		if (f2fs_crypt_mergeable_bio(*bio,
 				fio->page->mapping->host,
-				fio->page->index, fio) &&
+				page_folio(fio->page)->index, fio) &&
 		    bio_add_page(*bio, page, PAGE_SIZE, 0) ==
 		    PAGE_SIZE) {
 			ret = 0;
···
 	if (!bio) {
 		bio = __bio_alloc(fio, BIO_MAX_VECS);
 		f2fs_set_bio_crypt_ctx(bio, fio->page->mapping->host,
-				fio->page->index, fio, GFP_NOIO);
+				page_folio(fio->page)->index, fio, GFP_NOIO);

 		add_bio_entry(fio->sbi, bio, page, fio->temp);
 	} else {
···
 	    (!io_is_mergeable(sbi, io->bio, io, fio, io->last_block_in_bio,
 			      fio->new_blkaddr) ||
 	     !f2fs_crypt_mergeable_bio(io->bio, fio->page->mapping->host,
-				       bio_page->index, fio)))
+				       page_folio(bio_page)->index, fio)))
 		__submit_merged_bio(io);
 alloc_new:
 	if (io->bio == NULL) {
 		io->bio = __bio_alloc(fio, BIO_MAX_VECS);
 		f2fs_set_bio_crypt_ctx(io->bio, fio->page->mapping->host,
-				bio_page->index, fio, GFP_NOIO);
+				page_folio(bio_page)->index, fio, GFP_NOIO);
 		io->fio = *fio;
 	}
···
 }

 /* This can handle encryption stuffs */
-static int f2fs_submit_page_read(struct inode *inode, struct page *page,
+static int f2fs_submit_page_read(struct inode *inode, struct folio *folio,
 				 block_t blkaddr, blk_opf_t op_flags,
 				 bool for_write)
 {
···
 	struct bio *bio;

 	bio = f2fs_grab_read_bio(inode, blkaddr, 1, op_flags,
-					page->index, for_write);
+					folio->index, for_write);
 	if (IS_ERR(bio))
 		return PTR_ERR(bio);

 	/* wait for GCed page writeback via META_MAPPING */
 	f2fs_wait_on_block_writeback(inode, blkaddr);

-	if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) {
+	if (!bio_add_folio(bio, folio, PAGE_SIZE, 0)) {
 		iostat_update_and_unbind_ctx(bio);
 		if (bio->bi_private)
 			mempool_free(bio->bi_private, bio_post_read_ctx_pool);
···
 		return page;
 	}

-	err = f2fs_submit_page_read(inode, page, dn.data_blkaddr,
+	err = f2fs_submit_page_read(inode, page_folio(page), dn.data_blkaddr,
 					op_flags, for_write);
 	if (err)
 		goto put_err;
···
 		dn.ofs_in_node = end_offset;
 	}

+	if (flag == F2FS_GET_BLOCK_DIO && f2fs_lfs_mode(sbi) &&
+	    map->m_may_create) {
+		/* the next block to be allocated may not be contiguous. */
+		if (GET_SEGOFF_FROM_SEG0(sbi, blkaddr) % BLKS_PER_SEC(sbi) ==
+		    CAP_BLKS_PER_SEC(sbi) - 1)
+			goto sync_out;
+	}
+
 	if (pgofs >= end)
 		goto sync_out;
 	else if (dn.ofs_in_node < end_offset)
···
 	inode_lock_shared(inode);

-	maxbytes = max_file_blocks(inode) << F2FS_BLKSIZE_BITS;
+	maxbytes = F2FS_BLK_TO_BYTES(max_file_blocks(inode));
 	if (start > maxbytes) {
 		ret = -EFBIG;
 		goto out;
···
 static inline loff_t f2fs_readpage_limit(struct inode *inode)
 {
 	if (IS_ENABLED(CONFIG_FS_VERITY) && IS_VERITY(inode))
-		return inode->i_sb->s_maxbytes;
+		return F2FS_BLK_TO_BYTES(max_file_blocks(inode));

 	return i_size_read(inode);
 }
···
 	/* get rid of pages beyond EOF */
 	for (i = 0; i < cc->cluster_size; i++) {
 		struct page *page = cc->rpages[i];
+		struct folio *folio;

 		if (!page)
 			continue;
-		if ((sector_t)page->index >= last_block_in_file) {
-			zero_user_segment(page, 0, PAGE_SIZE);
-			if (!PageUptodate(page))
-				SetPageUptodate(page);
-		} else if (!PageUptodate(page)) {
+
+		folio = page_folio(page);
+		if ((sector_t)folio->index >= last_block_in_file) {
+			folio_zero_segment(folio, 0, folio_size(folio));
+			if (!folio_test_uptodate(folio))
+				folio_mark_uptodate(folio);
+		} else if (!folio_test_uptodate(folio)) {
 			continue;
 		}
-		unlock_page(page);
+		folio_unlock(folio);
 		if (for_write)
-			put_page(page);
+			folio_put(folio);
 		cc->rpages[i] = NULL;
 		cc->nr_rpages--;
 	}
···
 	}

 	for (i = 0; i < cc->nr_cpages; i++) {
-		struct page *page = dic->cpages[i];
+		struct folio *folio = page_folio(dic->cpages[i]);
 		block_t blkaddr;
 		struct bio_post_read_ctx *ctx;

···
 		f2fs_wait_on_block_writeback(inode, blkaddr);

-		if (f2fs_load_compressed_page(sbi, page, blkaddr)) {
+		if (f2fs_load_compressed_page(sbi, folio_page(folio, 0),
+								blkaddr)) {
 			if (atomic_dec_and_test(&dic->remaining_pages)) {
 				f2fs_decompress_cluster(dic, true);
 				break;
···
 		if (bio && (!page_is_mergeable(sbi, bio,
 					*last_block_in_bio, blkaddr) ||
-		    !f2fs_crypt_mergeable_bio(bio, inode, page->index, NULL))) {
+		    !f2fs_crypt_mergeable_bio(bio, inode, folio->index, NULL))) {
 submit_and_realloc:
 			f2fs_submit_read_bio(sbi, bio, DATA);
 			bio = NULL;
···
 		if (!bio) {
 			bio = f2fs_grab_read_bio(inode, blkaddr, nr_pages,
 					f2fs_ra_op_flags(rac),
-					page->index, for_write);
+					folio->index, for_write);
 			if (IS_ERR(bio)) {
 				ret = PTR_ERR(bio);
 				f2fs_decompress_end_io(dic, ret, true);
···
 			}
 		}

-		if (bio_add_page(bio, page, blocksize, 0) < blocksize)
+		if (!bio_add_folio(bio, folio, blocksize, 0))
 			goto submit_and_realloc;

 		ctx = get_post_read_ctx(bio);
···
 		if (ret)
 			goto set_error_page;

-		f2fs_compress_ctx_add_page(&cc, &folio->page);
+		f2fs_compress_ctx_add_page(&cc, folio);

 		goto next_page;
 read_single_page:
···

 int f2fs_do_write_data_page(struct f2fs_io_info *fio)
 {
-	struct page *page = fio->page;
-	struct inode *inode = page->mapping->host;
+	struct folio *folio = page_folio(fio->page);
+	struct inode *inode = folio->mapping->host;
 	struct dnode_of_data dn;
 	struct node_info ni;
 	bool ipu_force = false;
+	bool atomic_commit;
 	int err = 0;

 	/* Use COW inode to make dnode_of_data for atomic write */
-	if (f2fs_is_atomic_file(inode))
+	atomic_commit = f2fs_is_atomic_file(inode) &&
+				page_private_atomic(folio_page(folio, 0));
+	if (atomic_commit)
 		set_new_dnode(&dn, F2FS_I(inode)->cow_inode, NULL, NULL, 0);
 	else
 		set_new_dnode(&dn, inode, NULL, NULL, 0);

 	if (need_inplace_update(fio) &&
-	    f2fs_lookup_read_extent_cache_block(inode, page->index,
+	    f2fs_lookup_read_extent_cache_block(inode, folio->index,
 						&fio->old_blkaddr)) {
 		if (!f2fs_is_valid_blkaddr(fio->sbi, fio->old_blkaddr,
 						DATA_GENERIC_ENHANCE))
···
 	if (fio->need_lock == LOCK_REQ && !f2fs_trylock_op(fio->sbi))
 		return -EAGAIN;

-	err = f2fs_get_dnode_of_data(&dn, page->index, LOOKUP_NODE);
+	err = f2fs_get_dnode_of_data(&dn, folio->index, LOOKUP_NODE);
 	if (err)
 		goto out;
···

 	/* This page is already truncated */
 	if (fio->old_blkaddr == NULL_ADDR) {
-		ClearPageUptodate(page);
-		clear_page_private_gcing(page);
+		folio_clear_uptodate(folio);
+		clear_page_private_gcing(folio_page(folio, 0));
 		goto out_writepage;
 	}
 got_it:
···
 		if (err)
 			goto out_writepage;

-		set_page_writeback(page);
+		folio_start_writeback(folio);
 		f2fs_put_dnode(&dn);
 		if (fio->need_lock == LOCK_REQ)
 			f2fs_unlock_op(fio->sbi);
···
 		if (err) {
 			if (fscrypt_inode_uses_fs_layer_crypto(inode))
 				fscrypt_finalize_bounce_page(&fio->encrypted_page);
-			end_page_writeback(page);
+			folio_end_writeback(folio);
 		} else {
 			set_inode_flag(inode, FI_UPDATE_WRITE);
 		}
-		trace_f2fs_do_write_data_page(page_folio(page), IPU);
+		trace_f2fs_do_write_data_page(folio, IPU);
 		return err;
 	}
···
 	if (err)
 		goto out_writepage;

-	set_page_writeback(page);
+	folio_start_writeback(folio);

 	if (fio->compr_blocks && fio->old_blkaddr == COMPRESS_ADDR)
 		f2fs_i_compr_blocks_update(inode, fio->compr_blocks - 1, false);

 	/* LFS mode write path */
 	f2fs_outplace_write_data(&dn, fio);
-	trace_f2fs_do_write_data_page(page_folio(page), OPU);
+	trace_f2fs_do_write_data_page(folio, OPU);
 	set_inode_flag(inode, FI_APPEND_WRITE);
+	if (atomic_commit)
+		clear_page_private_atomic(folio_page(folio, 0));
 out_writepage:
 	f2fs_put_dnode(&dn);
 out:
···
 	return err;
 }

-int f2fs_write_single_data_page(struct page *page, int *submitted,
+int f2fs_write_single_data_page(struct folio *folio, int *submitted,
 				struct bio **bio,
 				sector_t *last_block,
 				struct writeback_control *wbc,
···
 				int compr_blocks,
 				bool allow_balance)
 {
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = folio->mapping->host;
+	struct page *page = folio_page(folio, 0);
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
 	loff_t i_size = i_size_read(inode);
 	const pgoff_t end_index = ((unsigned long long)i_size)
 							>> PAGE_SHIFT;
-	loff_t psize = (loff_t)(page->index + 1) << PAGE_SHIFT;
+	loff_t psize = (loff_t)(folio->index + 1) << PAGE_SHIFT;
 	unsigned offset = 0;
 	bool need_balance_fs = false;
 	bool quota_inode = IS_NOQUOTA(inode);
···
 		.last_block = last_block,
 	};

-	trace_f2fs_writepage(page_folio(page), DATA);
+	trace_f2fs_writepage(folio, DATA);

 	/* we should bypass data pages to proceed the kworker jobs */
 	if (unlikely(f2fs_cp_error(sbi))) {
-		mapping_set_error(page->mapping, -EIO);
+		mapping_set_error(folio->mapping, -EIO);
 		/*
 		 * don't drop any dirty dentry pages for keeping lastest
 		 * directory structure.
···
 	if (unlikely(is_sbi_flag_set(sbi, SBI_POR_DOING)))
 		goto redirty_out;

-	if (page->index < end_index ||
+	if (folio->index < end_index ||
 	    f2fs_verity_in_progress(inode) ||
 	    compr_blocks)
 		goto write;
···
 	 * this page does not have to be written to disk.
 	 */
 	offset = i_size & (PAGE_SIZE - 1);
-	if ((page->index >= end_index + 1) || !offset)
+	if ((folio->index >= end_index + 1) || !offset)
 		goto out;

-	zero_user_segment(page, offset, PAGE_SIZE);
+	folio_zero_segment(folio, offset, folio_size(folio));
 write:
 	/* Dentry/quota blocks are controlled by checkpoint */
 	if (S_ISDIR(inode->i_mode) || quota_inode) {
···

 	err = -EAGAIN;
 	if (f2fs_has_inline_data(inode)) {
-		err = f2fs_write_inline_data(inode, page);
+		err = f2fs_write_inline_data(inode, folio);
 		if (!err)
 			goto out;
 	}
···
 out:
 	inode_dec_dirty_pages(inode);
 	if (err) {
-		ClearPageUptodate(page);
+		folio_clear_uptodate(folio);
 		clear_page_private_gcing(page);
 	}
···
 		f2fs_remove_dirty_inode(inode);
 		submitted = NULL;
 	}
-	unlock_page(page);
+	folio_unlock(folio);
 	if (!S_ISDIR(inode->i_mode) && !IS_NOQUOTA(inode) &&
 			!F2FS_I(inode)->wb_task && allow_balance)
 		f2fs_balance_fs(sbi, need_balance_fs);
···
 	return 0;

 redirty_out:
-	redirty_page_for_writepage(wbc, page);
+	folio_redirty_for_writepage(wbc, folio);
 	/*
 	 * pageout() in MM translates EAGAIN, so calls handle_write_error()
 	 * -> mapping_set_error() -> set_bit(AS_EIO, ...).
···
 	 */
 	if (!err || wbc->for_reclaim)
 		return AOP_WRITEPAGE_ACTIVATE;
-	unlock_page(page);
+	folio_unlock(folio);
 	return err;
 }

 static int f2fs_write_data_page(struct page *page,
 		struct writeback_control *wbc)
 {
+	struct folio *folio = page_folio(page);
 #ifdef CONFIG_F2FS_FS_COMPRESSION
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = folio->mapping->host;

 	if (unlikely(f2fs_cp_error(F2FS_I_SB(inode))))
 		goto out;

 	if (f2fs_compressed_file(inode)) {
-		if (f2fs_is_compressed_cluster(inode, page->index)) {
-			redirty_page_for_writepage(wbc, page);
+		if (f2fs_is_compressed_cluster(inode, folio->index)) {
+			folio_redirty_for_writepage(wbc, folio);
 			return AOP_WRITEPAGE_ACTIVATE;
 		}
 	}
 out:
 #endif

-	return f2fs_write_single_data_page(page, NULL, NULL, NULL,
+	return f2fs_write_single_data_page(folio, NULL, NULL, NULL,
 						wbc, FS_DATA_IO, 0, true);
 }
···
 #ifdef CONFIG_F2FS_FS_COMPRESSION
 			if (f2fs_compressed_file(inode)) {
 				folio_get(folio);
-				f2fs_compress_ctx_add_page(&cc, &folio->page);
+				f2fs_compress_ctx_add_page(&cc, folio);
 				continue;
 			}
 #endif
-			ret = f2fs_write_single_data_page(&folio->page,
+			ret = f2fs_write_single_data_page(folio,
 					&submitted, &bio, &last_block,
 					wbc, io_type, 0, true);
 			if (ret == AOP_WRITEPAGE_ACTIVATE)
···
 }

 static int prepare_write_begin(struct f2fs_sb_info *sbi,
-			struct page *page, loff_t pos, unsigned len,
+			struct folio *folio, loff_t pos, unsigned int len,
 			block_t *blk_addr, bool *node_changed)
 {
-	struct inode *inode = page->mapping->host;
-	pgoff_t index = page->index;
+	struct inode *inode = folio->mapping->host;
+	pgoff_t index = folio->index;
 	struct dnode_of_data dn;
 	struct page *ipage;
 	bool locked = false;
···

 	if (f2fs_has_inline_data(inode)) {
 		if (pos + len <= MAX_INLINE_DATA(inode)) {
-			f2fs_do_read_inline_data(page_folio(page), ipage);
+			f2fs_do_read_inline_data(folio, ipage);
 			set_inode_flag(inode, FI_DATA_EXIST);
 			if (inode->i_nlink)
 				set_page_private_inline(ipage);
 			goto out;
 		}
-		err = f2fs_convert_inline_page(&dn, page);
+		err = f2fs_convert_inline_page(&dn, folio_page(folio, 0));
 		if (err || dn.data_blkaddr != NULL_ADDR)
 			goto out;
 	}
···
 }

 static int prepare_atomic_write_begin(struct f2fs_sb_info *sbi,
-			struct page *page, loff_t pos, unsigned int len,
+			struct folio *folio, loff_t pos, unsigned int len,
 			block_t *blk_addr, bool *node_changed, bool *use_cow)
 {
-	struct inode *inode = page->mapping->host;
+	struct inode *inode = folio->mapping->host;
 	struct inode *cow_inode = F2FS_I(inode)->cow_inode;
-	pgoff_t index = page->index;
+	pgoff_t index = folio->index;
 	int err = 0;
 	block_t ori_blk_addr = NULL_ADDR;
···
 	*foliop = folio;

 	if (f2fs_is_atomic_file(inode))
-		err = prepare_atomic_write_begin(sbi, &folio->page, pos, len,
+		err = prepare_atomic_write_begin(sbi, folio, pos, len,
 					&blkaddr, &need_balance, &use_cow);
 	else
-		err = prepare_write_begin(sbi, &folio->page, pos, len,
+		err = prepare_write_begin(sbi, folio, pos, len,
 					&blkaddr, &need_balance);
 	if (err)
 		goto put_folio;
···

 	if (!(pos & (PAGE_SIZE - 1)) && (pos + len) >= i_size_read(inode) &&
 	    !f2fs_verity_in_progress(inode)) {
-		folio_zero_segment(folio, len, PAGE_SIZE);
+		folio_zero_segment(folio, len, folio_size(folio));
 		return 0;
 	}
···
 		goto put_folio;
 	}
 	err = f2fs_submit_page_read(use_cow ?
-			F2FS_I(inode)->cow_inode : inode, &folio->page,
-			blkaddr, 0, true);
+			F2FS_I(inode)->cow_inode : inode,
+			folio, blkaddr, 0, true);
 	if (err)
 		goto put_folio;
···
 		goto unlock_out;

 	folio_mark_dirty(folio);
+
+	if (f2fs_is_atomic_file(inode))
+		set_page_private_atomic(folio_page(folio, 0));

 	if (pos + copied > i_size_read(inode) &&
 	    !f2fs_verity_in_progress(inode)) {
···
 	.swap_deactivate = f2fs_swap_deactivate,
 };

-void f2fs_clear_page_cache_dirty_tag(struct page *page)
+void f2fs_clear_page_cache_dirty_tag(struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	struct address_space *mapping = folio->mapping;
 	unsigned long flags;
+1 -1
fs/f2fs/debug.c
···
 	/* build nm */
 	si->base_mem += sizeof(struct f2fs_nm_info);
 	si->base_mem += __bitmap_size(sbi, NAT_BITMAP);
-	si->base_mem += (NM_I(sbi)->nat_bits_blocks << F2FS_BLKSIZE_BITS);
+	si->base_mem += F2FS_BLK_TO_BYTES(NM_I(sbi)->nat_bits_blocks);
 	si->base_mem += NM_I(sbi)->nat_blocks *
 				f2fs_bitmap_size(NAT_ENTRY_PER_BLOCK);
 	si->base_mem += NM_I(sbi)->nat_blocks / 8;
fs/f2fs/dir.c (+5 -3)

···
         unsigned long bidx = 0;

         for (i = 0; i < level; i++)
-                bidx += dir_buckets(i, dir_level) * bucket_blocks(i);
+                bidx += mul_u32_u32(dir_buckets(i, dir_level),
+                                bucket_blocks(i));
         bidx += idx * bucket_blocks(level);
         return bidx;
 }
···
         struct f2fs_dentry_block *dentry_blk;
         unsigned int bit_pos;
         int slots = GET_DENTRY_SLOTS(le16_to_cpu(dentry->name_len));
+        pgoff_t index = page_folio(page)->index;
         int i;

         f2fs_update_time(F2FS_I_SB(dir), REQ_TIME);
···
         set_page_dirty(page);

         if (bit_pos == NR_DENTRY_IN_BLOCK &&
-                !f2fs_truncate_hole(dir, page->index, page->index + 1)) {
-                f2fs_clear_page_cache_dirty_tag(page);
+                !f2fs_truncate_hole(dir, index, index + 1)) {
+                f2fs_clear_page_cache_dirty_tag(page_folio(page));
                 clear_page_dirty_for_io(page);
                 ClearPageUptodate(page);
                 clear_page_private_all(page);
fs/f2fs/extent_cache.c (+2 -2)

···
 static void __drop_largest_extent(struct extent_tree *et,
                                         pgoff_t fofs, unsigned int len)
 {
-        if (fofs < et->largest.fofs + et->largest.len &&
+        if (fofs < (pgoff_t)et->largest.fofs + et->largest.len &&
                         fofs + len > et->largest.fofs) {
                 et->largest.len = 0;
                 et->largest_updated = true;
···
         if (type == EX_READ &&
                         et->largest.fofs <= pgofs &&
-                        et->largest.fofs + et->largest.len > pgofs) {
+                        (pgoff_t)et->largest.fofs + et->largest.len > pgofs) {
                 *ei = et->largest;
                 ret = true;
                 stat_inc_largest_node_hit(sbi);
fs/f2fs/f2fs.h (+92 -58)

···
 #include <linux/uio.h>
 #include <linux/types.h>
 #include <linux/page-flags.h>
-#include <linux/buffer_head.h>
 #include <linux/slab.h>
 #include <linux/crc32.h>
 #include <linux/magic.h>
···
 typedef u32 nid_t;

 #define COMPRESS_EXT_NUM                16
+
+enum blkzone_allocation_policy {
+        BLKZONE_ALLOC_PRIOR_SEQ,        /* Prioritize writing to sequential zones */
+        BLKZONE_ALLOC_ONLY_SEQ,         /* Only allow writing to sequential zones */
+        BLKZONE_ALLOC_PRIOR_CONV,       /* Prioritize writing to conventional zones */
+};

 /*
  * An implementation of an rwsem that is explicitly unfair to readers. This
···
         APPEND_INO,             /* for append ino list */
         UPDATE_INO,             /* for update ino list */
         TRANS_DIR_INO,          /* for transactions dir ino list */
+        XATTR_DIR_INO,          /* for xattr updated dir ino list */
         FLUSH_INO,              /* for multiple device flushing */
         MAX_INO_ENTRY,          /* max. list */
 };
···
         FI_NEED_IPU,            /* used for ipu per file */
         FI_ATOMIC_FILE,         /* indicate atomic file */
         FI_DATA_EXIST,          /* indicate data exists */
-        FI_INLINE_DOTS,         /* indicate inline dot dentries */
         FI_SKIP_WRITES,         /* should skip data page writeback */
         FI_OPU_WRITE,           /* used for opu per file */
         FI_DIRTY_FILE,          /* indicate regular/symlink has dirty pages */
···
         FI_ALIGNED_WRITE,       /* enable aligned write */
         FI_COW_FILE,            /* indicate COW file */
         FI_ATOMIC_COMMITTED,    /* indicate atomic commit completed except disk sync */
+        FI_ATOMIC_DIRTIED,      /* indicate atomic file is dirtied */
         FI_ATOMIC_REPLACE,      /* indicate atomic replace */
         FI_OPENED_FILE,         /* indicate file has been opened */
         FI_MAX,                 /* max flag, never be used */
···
         CP_FASTBOOT_MODE,
         CP_SPEC_LOG_NUM,
         CP_RECOVER_DIR,
+        CP_XATTR_DIR,
 };

 enum iostat_type {
···
         bool no_bg_gc;                  /* check the space and stop bg_gc */
         bool should_migrate_blocks;     /* should migrate blocks */
         bool err_gc_skipped;            /* return EAGAIN if GC skipped */
+        bool one_time;                  /* require one time GC in one migration unit */
         unsigned int nr_free_secs;      /* # of free sections to do GC */
 };

···
  * bit 1        PAGE_PRIVATE_ONGOING_MIGRATION
  * bit 2        PAGE_PRIVATE_INLINE_INODE
  * bit 3        PAGE_PRIVATE_REF_RESOURCE
- * bit 4-       f2fs private data
+ * bit 4        PAGE_PRIVATE_ATOMIC_WRITE
+ * bit 5-       f2fs private data
  *
  * Layout B: lowest bit should be 0
  * page.private is a wrapped pointer.
···
         PAGE_PRIVATE_ONGOING_MIGRATION, /* data page which is on-going migrating */
         PAGE_PRIVATE_INLINE_INODE,      /* inode page contains inline data */
         PAGE_PRIVATE_REF_RESOURCE,      /* dirty page has referenced resources */
+        PAGE_PRIVATE_ATOMIC_WRITE,      /* data page from atomic write path */
         PAGE_PRIVATE_MAX
 };

···
 #ifdef CONFIG_BLK_DEV_ZONED
         unsigned int blocks_per_blkz;   /* F2FS blocks per zone */
         unsigned int max_open_zones;    /* max open zone resources of the zoned device */
+        /* For adjust the priority writing position of data in zone UFS */
+        unsigned int blkzone_alloc_policy;
 #endif

         /* for node-related operations */
···
         unsigned int max_victim_search;
         /* migration granularity of garbage collection, unit: segment */
         unsigned int migration_granularity;
+        /* migration window granularity of garbage collection, unit: segment */
+        unsigned int migration_window_granularity;

         /*
          * for stat information.
···
 static inline struct f2fs_super_block *F2FS_RAW_SUPER(struct f2fs_sb_info *sbi)
 {
         return (struct f2fs_super_block *)(sbi->raw_super);
+}
+
+static inline struct f2fs_super_block *F2FS_SUPER_BLOCK(struct folio *folio,
+                                                pgoff_t index)
+{
+        pgoff_t idx_in_folio = index % (1 << folio_order(folio));
+
+        return (struct f2fs_super_block *)
+                (page_address(folio_page(folio, idx_in_folio)) +
+                F2FS_SUPER_OFFSET);
 }

 static inline struct f2fs_checkpoint *F2FS_CKPT(struct f2fs_sb_info *sbi)
···
 PAGE_PRIVATE_GET_FUNC(nonpointer, NOT_POINTER);
 PAGE_PRIVATE_GET_FUNC(inline, INLINE_INODE);
 PAGE_PRIVATE_GET_FUNC(gcing, ONGOING_MIGRATION);
+PAGE_PRIVATE_GET_FUNC(atomic, ATOMIC_WRITE);

 PAGE_PRIVATE_SET_FUNC(reference, REF_RESOURCE);
 PAGE_PRIVATE_SET_FUNC(inline, INLINE_INODE);
 PAGE_PRIVATE_SET_FUNC(gcing, ONGOING_MIGRATION);
+PAGE_PRIVATE_SET_FUNC(atomic, ATOMIC_WRITE);

 PAGE_PRIVATE_CLEAR_FUNC(reference, REF_RESOURCE);
 PAGE_PRIVATE_CLEAR_FUNC(inline, INLINE_INODE);
 PAGE_PRIVATE_CLEAR_FUNC(gcing, ONGOING_MIGRATION);
+PAGE_PRIVATE_CLEAR_FUNC(atomic, ATOMIC_WRITE);

 static inline unsigned long get_page_private_data(struct page *page)
 {
···
         clear_page_private_reference(page);
         clear_page_private_gcing(page);
         clear_page_private_inline(page);
+        clear_page_private_atomic(page);

         f2fs_bug_on(F2FS_P_SB(page), page_private(page));
 }
···
         return false;
 }

+static inline bool is_inflight_read_io(struct f2fs_sb_info *sbi)
+{
+        return get_pages(sbi, F2FS_RD_DATA) || get_pages(sbi, F2FS_DIO_READ);
+}
+
 static inline bool is_idle(struct f2fs_sb_info *sbi, int type)
 {
+        bool zoned_gc = (type == GC_TIME &&
+                        F2FS_HAS_FEATURE(sbi, F2FS_FEATURE_BLKZONED));
+
         if (sbi->gc_mode == GC_URGENT_HIGH)
                 return true;

-        if (is_inflight_io(sbi, type))
-                return false;
+        if (zoned_gc) {
+                if (is_inflight_read_io(sbi))
+                        return false;
+        } else {
+                if (is_inflight_io(sbi, type))
+                        return false;
+        }

         if (sbi->gc_mode == GC_URGENT_MID)
                 return true;

         if (sbi->gc_mode == GC_URGENT_LOW &&
                         (type == DISCARD_TIME || type == GC_TIME))
+                return true;
+
+        if (zoned_gc)
                 return true;

         return f2fs_time_over(sbi, type);
···
 }

 static inline int f2fs_has_extra_attr(struct inode *inode);
+static inline unsigned int get_dnode_base(struct inode *inode,
+                                        struct page *node_page)
+{
+        if (!IS_INODE(node_page))
+                return 0;
+
+        return inode ? get_extra_isize(inode) :
+                        offset_in_addr(&F2FS_NODE(node_page)->i);
+}
+
+static inline __le32 *get_dnode_addr(struct inode *inode,
+                                        struct page *node_page)
+{
+        return blkaddr_in_node(F2FS_NODE(node_page)) +
+                        get_dnode_base(inode, node_page);
+}
+
 static inline block_t data_blkaddr(struct inode *inode,
                         struct page *node_page, unsigned int offset)
 {
-        struct f2fs_node *raw_node;
-        __le32 *addr_array;
-        int base = 0;
-        bool is_inode = IS_INODE(node_page);
-
-        raw_node = F2FS_NODE(node_page);
-
-        if (is_inode) {
-                if (!inode)
-                        /* from GC path only */
-                        base = offset_in_addr(&raw_node->i);
-                else if (f2fs_has_extra_attr(inode))
-                        base = get_extra_isize(inode);
-        }
-
-        addr_array = blkaddr_in_node(raw_node);
-        return le32_to_cpu(addr_array[base + offset]);
+        return le32_to_cpu(*(get_dnode_addr(inode, node_page) + offset));
 }

 static inline block_t f2fs_data_blkaddr(struct dnode_of_data *dn)
···
                 return;
                 fallthrough;
         case FI_DATA_EXIST:
-        case FI_INLINE_DOTS:
         case FI_PIN_FILE:
         case FI_COMPRESS_RELEASED:
-        case FI_ATOMIC_COMMITTED:
                 f2fs_mark_inode_dirty_sync(inode, true);
         }
 }
···
                 set_bit(FI_INLINE_DENTRY, fi->flags);
         if (ri->i_inline & F2FS_DATA_EXIST)
                 set_bit(FI_DATA_EXIST, fi->flags);
-        if (ri->i_inline & F2FS_INLINE_DOTS)
-                set_bit(FI_INLINE_DOTS, fi->flags);
         if (ri->i_inline & F2FS_EXTRA_ATTR)
                 set_bit(FI_EXTRA_ATTR, fi->flags);
         if (ri->i_inline & F2FS_PIN_FILE)
···
                 ri->i_inline |= F2FS_INLINE_DENTRY;
         if (is_inode_flag_set(inode, FI_DATA_EXIST))
                 ri->i_inline |= F2FS_DATA_EXIST;
-        if (is_inode_flag_set(inode, FI_INLINE_DOTS))
-                ri->i_inline |= F2FS_INLINE_DOTS;
         if (is_inode_flag_set(inode, FI_EXTRA_ATTR))
                 ri->i_inline |= F2FS_EXTRA_ATTR;
         if (is_inode_flag_set(inode, FI_PIN_FILE))
···
         return is_inode_flag_set(inode, FI_DATA_EXIST);
 }

-static inline int f2fs_has_inline_dots(struct inode *inode)
-{
-        return is_inode_flag_set(inode, FI_INLINE_DOTS);
-}
-
 static inline int f2fs_is_mmap_file(struct inode *inode)
 {
         return is_inode_flag_set(inode, FI_MMAP_FILE);
···
         return is_inode_flag_set(inode, FI_COW_FILE);
 }

-static inline __le32 *get_dnode_addr(struct inode *inode,
-                                        struct page *node_page);
 static inline void *inline_data_addr(struct inode *inode, struct page *page)
 {
         __le32 *addr = get_dnode_addr(inode, page);
···
         return F2FS_I(inode)->i_inline_xattr_size;
 }

-static inline __le32 *get_dnode_addr(struct inode *inode,
-                                        struct page *node_page)
-{
-        int base = 0;
-
-        if (IS_INODE(node_page) && f2fs_has_extra_attr(inode))
-                base = get_extra_isize(inode);
-
-        return blkaddr_in_node(F2FS_NODE(node_page)) + base;
-}
-
 #define f2fs_get_inode_mode(i) \
         ((is_inode_flag_set(i, FI_ACL_MODE)) ? \
          (F2FS_I(i)->i_acl_mode) : ((i)->i_mode))
···
 int f2fs_truncate_hole(struct inode *inode, pgoff_t pg_start, pgoff_t pg_end);
 void f2fs_truncate_data_blocks_range(struct dnode_of_data *dn, int count);
 int f2fs_do_shutdown(struct f2fs_sb_info *sbi, unsigned int flag,
-                                                bool readonly);
+                                bool readonly, bool need_lock);
 int f2fs_precache_extents(struct inode *inode);
 int f2fs_fileattr_get(struct dentry *dentry, struct fileattr *fa);
 int f2fs_fileattr_set(struct mnt_idmap *idmap,
···
 struct page *f2fs_get_sum_page(struct f2fs_sb_info *sbi, unsigned int segno);
 void f2fs_update_meta_page(struct f2fs_sb_info *sbi, void *src,
                                         block_t blk_addr);
-void f2fs_do_write_meta_page(struct f2fs_sb_info *sbi, struct page *page,
+void f2fs_do_write_meta_page(struct f2fs_sb_info *sbi, struct folio *folio,
                                                 enum iostat_type io_type);
 void f2fs_do_write_node_page(unsigned int nid, struct f2fs_io_info *fio);
 void f2fs_outplace_write_data(struct dnode_of_data *dn,
···
 int f2fs_rw_hint_to_seg_type(struct f2fs_sb_info *sbi, enum rw_hint hint);
 enum rw_hint f2fs_io_type_to_rw_hint(struct f2fs_sb_info *sbi,
                         enum page_type type, enum temp_type temp);
-unsigned int f2fs_usable_segs_in_sec(struct f2fs_sb_info *sbi,
-                        unsigned int segno);
+unsigned int f2fs_usable_segs_in_sec(struct f2fs_sb_info *sbi);
 unsigned int f2fs_usable_blks_in_seg(struct f2fs_sb_info *sbi,
                         unsigned int segno);

···
 int f2fs_encrypt_one_page(struct f2fs_io_info *fio);
 bool f2fs_should_update_inplace(struct inode *inode, struct f2fs_io_info *fio);
 bool f2fs_should_update_outplace(struct inode *inode, struct f2fs_io_info *fio);
-int f2fs_write_single_data_page(struct page *page, int *submitted,
+int f2fs_write_single_data_page(struct folio *folio, int *submitted,
                                 struct bio **bio, sector_t *last_block,
                                 struct writeback_control *wbc,
                                 enum iostat_type io_type,
···
 void f2fs_invalidate_folio(struct folio *folio, size_t offset, size_t length);
 bool f2fs_release_folio(struct folio *folio, gfp_t wait);
 bool f2fs_overwrite_io(struct inode *inode, loff_t pos, size_t len);
-void f2fs_clear_page_cache_dirty_tag(struct page *page);
+void f2fs_clear_page_cache_dirty_tag(struct folio *folio);
 int f2fs_init_post_read_processing(void);
 void f2fs_destroy_post_read_processing(void);
 int f2fs_init_post_read_wq(struct f2fs_sb_info *sbi);
···
 /* victim selection function for cleaning and SSR */
 int f2fs_get_victim(struct f2fs_sb_info *sbi, unsigned int *result,
                         int gc_type, int type, char alloc_mode,
-                        unsigned long long age);
+                        unsigned long long age, bool one_time);

 /*
  * recovery.c
···

 #define stat_inc_cp_call_count(sbi, foreground) \
                 atomic_inc(&sbi->cp_call_count[(foreground)])
-#define stat_inc_cp_count(si)           (F2FS_STAT(sbi)->cp_count++)
+#define stat_inc_cp_count(sbi)          (F2FS_STAT(sbi)->cp_count++)
 #define stat_io_skip_bggc_count(sbi)    ((sbi)->io_skip_bggc++)
 #define stat_other_skip_bggc_count(sbi) ((sbi)->other_skip_bggc++)
 #define stat_inc_dirty_inode(sbi, type) ((sbi)->ndirty_inode[type]++)
···
 int f2fs_convert_inline_page(struct dnode_of_data *dn, struct page *page);
 int f2fs_convert_inline_inode(struct inode *inode);
 int f2fs_try_convert_inline_dir(struct inode *dir, struct dentry *dentry);
-int f2fs_write_inline_data(struct inode *inode, struct page *page);
+int f2fs_write_inline_data(struct inode *inode, struct folio *folio);
 int f2fs_recover_inline_data(struct inode *inode, struct page *npage);
 struct f2fs_dir_entry *f2fs_find_in_inline_dir(struct inode *dir,
                                         const struct f2fs_filename *fname,
···
  * compress.c
  */
 #ifdef CONFIG_F2FS_FS_COMPRESSION
+enum cluster_check_type {
+        CLUSTER_IS_COMPR,       /* check only if compressed cluster */
+        CLUSTER_COMPR_BLKS,     /* return # of compressed blocks in a cluster */
+        CLUSTER_RAW_BLKS        /* return # of raw blocks in a cluster */
+};
 bool f2fs_is_compressed_page(struct page *page);
 struct page *f2fs_compress_control_page(struct page *page);
 int f2fs_prepare_compress_overwrite(struct inode *inode,
···
 bool f2fs_all_cluster_page_ready(struct compress_ctx *cc, struct page **pages,
                                 int index, int nr_pages, bool uptodate);
 bool f2fs_sanity_check_cluster(struct dnode_of_data *dn);
-void f2fs_compress_ctx_add_page(struct compress_ctx *cc, struct page *page);
+void f2fs_compress_ctx_add_page(struct compress_ctx *cc, struct folio *folio);
 int f2fs_write_multi_pages(struct compress_ctx *cc,
                                                 int *submitted,
                                                 struct writeback_control *wbc,
                                                 enum iostat_type io_type);
 int f2fs_is_compressed_cluster(struct inode *inode, pgoff_t index);
+bool f2fs_is_sparse_cluster(struct inode *inode, pgoff_t index);
 void f2fs_update_read_extent_tree_range_compressed(struct inode *inode,
                                 pgoff_t fofs, block_t blkaddr,
                                 unsigned int llen, unsigned int c_len);
···
 static inline void f2fs_invalidate_compress_pages(struct f2fs_sb_info *sbi,
                                                         nid_t ino) { }
 #define inc_compr_inode_stat(inode)             do { } while (0)
+static inline int f2fs_is_compressed_cluster(
+                                struct inode *inode,
+                                pgoff_t index) { return 0; }
+static inline bool f2fs_is_sparse_cluster(
+                                struct inode *inode,
+                                pgoff_t index) { return true; }
 static inline void f2fs_update_read_extent_tree_range_compressed(
                                 struct inode *inode,
                                 pgoff_t fofs, block_t blkaddr,
···
         io_schedule_timeout(timeout);
 }

-static inline void f2fs_handle_page_eio(struct f2fs_sb_info *sbi, pgoff_t ofs,
-                                                enum page_type type)
+static inline void f2fs_handle_page_eio(struct f2fs_sb_info *sbi,
+                                struct folio *folio, enum page_type type)
 {
+        pgoff_t ofs = folio->index;
+
         if (unlikely(f2fs_cp_error(sbi)))
                 return;

+129 -70
fs/f2fs/file.c
··· 8 8 #include <linux/fs.h> 9 9 #include <linux/f2fs_fs.h> 10 10 #include <linux/stat.h> 11 - #include <linux/buffer_head.h> 12 11 #include <linux/writeback.h> 13 12 #include <linux/blkdev.h> 14 13 #include <linux/falloc.h> ··· 53 54 54 55 static vm_fault_t f2fs_vm_page_mkwrite(struct vm_fault *vmf) 55 56 { 56 - struct page *page = vmf->page; 57 + struct folio *folio = page_folio(vmf->page); 57 58 struct inode *inode = file_inode(vmf->vma->vm_file); 58 59 struct f2fs_sb_info *sbi = F2FS_I_SB(inode); 59 60 struct dnode_of_data dn; ··· 85 86 86 87 #ifdef CONFIG_F2FS_FS_COMPRESSION 87 88 if (f2fs_compressed_file(inode)) { 88 - int ret = f2fs_is_compressed_cluster(inode, page->index); 89 + int ret = f2fs_is_compressed_cluster(inode, folio->index); 89 90 90 91 if (ret < 0) { 91 92 err = ret; ··· 105 106 106 107 file_update_time(vmf->vma->vm_file); 107 108 filemap_invalidate_lock_shared(inode->i_mapping); 108 - lock_page(page); 109 - if (unlikely(page->mapping != inode->i_mapping || 110 - page_offset(page) > i_size_read(inode) || 111 - !PageUptodate(page))) { 112 - unlock_page(page); 109 + folio_lock(folio); 110 + if (unlikely(folio->mapping != inode->i_mapping || 111 + folio_pos(folio) > i_size_read(inode) || 112 + !folio_test_uptodate(folio))) { 113 + folio_unlock(folio); 113 114 err = -EFAULT; 114 115 goto out_sem; 115 116 } ··· 117 118 set_new_dnode(&dn, inode, NULL, NULL, 0); 118 119 if (need_alloc) { 119 120 /* block allocation */ 120 - err = f2fs_get_block_locked(&dn, page->index); 121 + err = f2fs_get_block_locked(&dn, folio->index); 121 122 } else { 122 - err = f2fs_get_dnode_of_data(&dn, page->index, LOOKUP_NODE); 123 + err = f2fs_get_dnode_of_data(&dn, folio->index, LOOKUP_NODE); 123 124 f2fs_put_dnode(&dn); 124 125 if (f2fs_is_pinned_file(inode) && 125 126 !__is_valid_data_blkaddr(dn.data_blkaddr)) ··· 127 128 } 128 129 129 130 if (err) { 130 - unlock_page(page); 131 + folio_unlock(folio); 131 132 goto out_sem; 132 133 } 133 134 134 - 
f2fs_wait_on_page_writeback(page, DATA, false, true); 135 + f2fs_wait_on_page_writeback(folio_page(folio, 0), DATA, false, true); 135 136 136 137 /* wait for GCed page writeback via META_MAPPING */ 137 138 f2fs_wait_on_block_writeback(inode, dn.data_blkaddr); ··· 139 140 /* 140 141 * check to see if the page is mapped already (no holes) 141 142 */ 142 - if (PageMappedToDisk(page)) 143 + if (folio_test_mappedtodisk(folio)) 143 144 goto out_sem; 144 145 145 146 /* page is wholly or partially inside EOF */ 146 - if (((loff_t)(page->index + 1) << PAGE_SHIFT) > 147 + if (((loff_t)(folio->index + 1) << PAGE_SHIFT) > 147 148 i_size_read(inode)) { 148 149 loff_t offset; 149 150 150 151 offset = i_size_read(inode) & ~PAGE_MASK; 151 - zero_user_segment(page, offset, PAGE_SIZE); 152 + folio_zero_segment(folio, offset, folio_size(folio)); 152 153 } 153 - set_page_dirty(page); 154 + folio_mark_dirty(folio); 154 155 155 156 f2fs_update_iostat(sbi, inode, APP_MAPPED_IO, F2FS_BLKSIZE); 156 157 f2fs_update_time(sbi, REQ_TIME); ··· 162 163 out: 163 164 ret = vmf_fs_error(err); 164 165 165 - trace_f2fs_vm_page_mkwrite(inode, page->index, vmf->vma->vm_flags, ret); 166 + trace_f2fs_vm_page_mkwrite(inode, folio->index, vmf->vma->vm_flags, ret); 166 167 return ret; 167 168 } 168 169 ··· 217 218 f2fs_exist_written_data(sbi, F2FS_I(inode)->i_pino, 218 219 TRANS_DIR_INO)) 219 220 cp_reason = CP_RECOVER_DIR; 221 + else if (f2fs_exist_written_data(sbi, F2FS_I(inode)->i_pino, 222 + XATTR_DIR_INO)) 223 + cp_reason = CP_XATTR_DIR; 220 224 221 225 return cp_reason; 222 226 } ··· 375 373 f2fs_remove_ino_entry(sbi, ino, APPEND_INO); 376 374 clear_inode_flag(inode, FI_APPEND_WRITE); 377 375 flush_out: 378 - if ((!atomic && F2FS_OPTION(sbi).fsync_mode != FSYNC_MODE_NOBARRIER) || 379 - (atomic && !test_opt(sbi, NOBARRIER) && f2fs_sb_has_blkzoned(sbi))) 376 + if (!atomic && F2FS_OPTION(sbi).fsync_mode != FSYNC_MODE_NOBARRIER) 380 377 ret = f2fs_issue_flush(sbi, inode->i_ino); 381 378 if (!ret) { 382 
379 f2fs_remove_ino_entry(sbi, ino, UPDATE_INO); ··· 432 431 static loff_t f2fs_seek_block(struct file *file, loff_t offset, int whence) 433 432 { 434 433 struct inode *inode = file->f_mapping->host; 435 - loff_t maxbytes = inode->i_sb->s_maxbytes; 434 + loff_t maxbytes = F2FS_BLK_TO_BYTES(max_file_blocks(inode)); 436 435 struct dnode_of_data dn; 437 436 pgoff_t pgofs, end_offset; 438 437 loff_t data_ofs = offset; ··· 514 513 static loff_t f2fs_llseek(struct file *file, loff_t offset, int whence) 515 514 { 516 515 struct inode *inode = file->f_mapping->host; 517 - loff_t maxbytes = inode->i_sb->s_maxbytes; 518 - 519 - if (f2fs_compressed_file(inode)) 520 - maxbytes = max_file_blocks(inode) << F2FS_BLKSIZE_BITS; 516 + loff_t maxbytes = F2FS_BLK_TO_BYTES(max_file_blocks(inode)); 521 517 522 518 switch (whence) { 523 519 case SEEK_SET: ··· 1049 1051 if (err) 1050 1052 return err; 1051 1053 } 1054 + 1055 + /* 1056 + * wait for inflight dio, blocks should be removed after 1057 + * IO completion. 1058 + */ 1059 + if (attr->ia_size < old_size) 1060 + inode_dio_wait(inode); 1052 1061 1053 1062 f2fs_down_write(&fi->i_gc_rwsem[WRITE]); 1054 1063 filemap_invalidate_lock(inode->i_mapping); ··· 1893 1888 if (ret) 1894 1889 goto out; 1895 1890 1891 + /* 1892 + * wait for inflight dio, blocks should be removed after IO 1893 + * completion. 
1894 + */ 1895 + inode_dio_wait(inode); 1896 + 1896 1897 if (mode & FALLOC_FL_PUNCH_HOLE) { 1897 1898 if (offset >= inode->i_size) 1898 1899 goto out; ··· 2127 2116 struct mnt_idmap *idmap = file_mnt_idmap(filp); 2128 2117 struct f2fs_inode_info *fi = F2FS_I(inode); 2129 2118 struct f2fs_sb_info *sbi = F2FS_I_SB(inode); 2130 - struct inode *pinode; 2131 2119 loff_t isize; 2132 2120 int ret; 2121 + 2122 + if (!(filp->f_mode & FMODE_WRITE)) 2123 + return -EBADF; 2133 2124 2134 2125 if (!inode_owner_or_capable(idmap, inode)) 2135 2126 return -EACCES; ··· 2162 2149 goto out; 2163 2150 2164 2151 f2fs_down_write(&fi->i_gc_rwsem[WRITE]); 2152 + f2fs_down_write(&fi->i_gc_rwsem[READ]); 2165 2153 2166 2154 /* 2167 2155 * Should wait end_io to count F2FS_WB_CP_DATA correctly by ··· 2172 2158 f2fs_warn(sbi, "Unexpected flush for atomic writes: ino=%lu, npages=%u", 2173 2159 inode->i_ino, get_dirty_pages(inode)); 2174 2160 ret = filemap_write_and_wait_range(inode->i_mapping, 0, LLONG_MAX); 2175 - if (ret) { 2176 - f2fs_up_write(&fi->i_gc_rwsem[WRITE]); 2177 - goto out; 2178 - } 2161 + if (ret) 2162 + goto out_unlock; 2179 2163 2180 2164 /* Check if the inode already has a COW inode */ 2181 2165 if (fi->cow_inode == NULL) { 2182 2166 /* Create a COW inode for atomic write */ 2183 - pinode = f2fs_iget(inode->i_sb, fi->i_pino); 2184 - if (IS_ERR(pinode)) { 2185 - f2fs_up_write(&fi->i_gc_rwsem[WRITE]); 2186 - ret = PTR_ERR(pinode); 2187 - goto out; 2188 - } 2167 + struct dentry *dentry = file_dentry(filp); 2168 + struct inode *dir = d_inode(dentry->d_parent); 2189 2169 2190 - ret = f2fs_get_tmpfile(idmap, pinode, &fi->cow_inode); 2191 - iput(pinode); 2192 - if (ret) { 2193 - f2fs_up_write(&fi->i_gc_rwsem[WRITE]); 2194 - goto out; 2195 - } 2170 + ret = f2fs_get_tmpfile(idmap, dir, &fi->cow_inode); 2171 + if (ret) 2172 + goto out_unlock; 2196 2173 2197 2174 set_inode_flag(fi->cow_inode, FI_COW_FILE); 2198 2175 clear_inode_flag(fi->cow_inode, FI_INLINE_DATA); ··· 2192 2187 
F2FS_I(fi->cow_inode)->atomic_inode = inode; 2193 2188 } else { 2194 2189 /* Reuse the already created COW inode */ 2190 + f2fs_bug_on(sbi, get_dirty_pages(fi->cow_inode)); 2191 + 2192 + invalidate_mapping_pages(fi->cow_inode->i_mapping, 0, -1); 2193 + 2195 2194 ret = f2fs_do_truncate_blocks(fi->cow_inode, 0, true); 2196 - if (ret) { 2197 - f2fs_up_write(&fi->i_gc_rwsem[WRITE]); 2198 - goto out; 2199 - } 2195 + if (ret) 2196 + goto out_unlock; 2200 2197 } 2201 2198 2202 2199 f2fs_write_inode(inode, NULL); ··· 2217 2210 } 2218 2211 f2fs_i_size_write(fi->cow_inode, isize); 2219 2212 2213 + out_unlock: 2214 + f2fs_up_write(&fi->i_gc_rwsem[READ]); 2220 2215 f2fs_up_write(&fi->i_gc_rwsem[WRITE]); 2216 + if (ret) 2217 + goto out; 2221 2218 2222 2219 f2fs_update_time(sbi, REQ_TIME); 2223 2220 fi->atomic_write_task = current; ··· 2238 2227 struct inode *inode = file_inode(filp); 2239 2228 struct mnt_idmap *idmap = file_mnt_idmap(filp); 2240 2229 int ret; 2230 + 2231 + if (!(filp->f_mode & FMODE_WRITE)) 2232 + return -EBADF; 2241 2233 2242 2234 if (!inode_owner_or_capable(idmap, inode)) 2243 2235 return -EACCES; ··· 2274 2260 struct mnt_idmap *idmap = file_mnt_idmap(filp); 2275 2261 int ret; 2276 2262 2263 + if (!(filp->f_mode & FMODE_WRITE)) 2264 + return -EBADF; 2265 + 2277 2266 if (!inode_owner_or_capable(idmap, inode)) 2278 2267 return -EACCES; 2279 2268 ··· 2296 2279 } 2297 2280 2298 2281 int f2fs_do_shutdown(struct f2fs_sb_info *sbi, unsigned int flag, 2299 - bool readonly) 2282 + bool readonly, bool need_lock) 2300 2283 { 2301 2284 struct super_block *sb = sbi->sb; 2302 2285 int ret = 0; ··· 2343 2326 if (readonly) 2344 2327 goto out; 2345 2328 2329 + /* grab sb->s_umount to avoid racing w/ remount() */ 2330 + if (need_lock) 2331 + down_read(&sbi->sb->s_umount); 2332 + 2346 2333 f2fs_stop_gc_thread(sbi); 2347 2334 f2fs_stop_discard_thread(sbi); 2348 2335 2349 2336 f2fs_drop_discard_cmd(sbi); 2350 2337 clear_opt(sbi, DISCARD); 2338 + 2339 + if (need_lock) 2340 + 
up_read(&sbi->sb->s_umount); 2351 2341 2352 2342 f2fs_update_time(sbi, REQ_TIME); 2353 2343 out: ··· 2392 2368 } 2393 2369 } 2394 2370 2395 - ret = f2fs_do_shutdown(sbi, in, readonly); 2371 + ret = f2fs_do_shutdown(sbi, in, readonly, true); 2396 2372 2397 2373 if (need_drop) 2398 2374 mnt_drop_write_file(filp); ··· 2710 2686 (range->start + range->len) >> PAGE_SHIFT, 2711 2687 DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE)); 2712 2688 2713 - if (is_inode_flag_set(inode, FI_COMPRESS_RELEASED)) { 2689 + if (is_inode_flag_set(inode, FI_COMPRESS_RELEASED) || 2690 + f2fs_is_atomic_file(inode)) { 2714 2691 err = -EINVAL; 2715 2692 goto unlock_out; 2716 2693 } ··· 2735 2710 * block addresses are continuous. 2736 2711 */ 2737 2712 if (f2fs_lookup_read_extent_cache(inode, pg_start, &ei)) { 2738 - if (ei.fofs + ei.len >= pg_end) 2713 + if ((pgoff_t)ei.fofs + ei.len >= pg_end) 2739 2714 goto out; 2740 2715 } 2741 2716 ··· 2817 2792 err = PTR_ERR(page); 2818 2793 goto clear_out; 2819 2794 } 2795 + 2796 + f2fs_wait_on_page_writeback(page, DATA, true, true); 2820 2797 2821 2798 set_page_dirty(page); 2822 2799 set_page_private_gcing(page); ··· 2944 2917 goto out_unlock; 2945 2918 } 2946 2919 2920 + if (f2fs_is_atomic_file(src) || f2fs_is_atomic_file(dst)) { 2921 + ret = -EINVAL; 2922 + goto out_unlock; 2923 + } 2924 + 2947 2925 ret = -EINVAL; 2948 2926 if (pos_in + len > src->i_size || pos_in + len < pos_in) 2949 2927 goto out_unlock; ··· 3000 2968 } 3001 2969 3002 2970 f2fs_lock_op(sbi); 3003 - ret = __exchange_data_block(src, dst, pos_in >> F2FS_BLKSIZE_BITS, 3004 - pos_out >> F2FS_BLKSIZE_BITS, 3005 - len >> F2FS_BLKSIZE_BITS, false); 2971 + ret = __exchange_data_block(src, dst, F2FS_BYTES_TO_BLK(pos_in), 2972 + F2FS_BYTES_TO_BLK(pos_out), 2973 + F2FS_BYTES_TO_BLK(len), false); 3006 2974 3007 2975 if (!ret) { 3008 2976 if (dst_max_i_size) ··· 3331 3299 return ret; 3332 3300 3333 3301 inode_lock(inode); 3302 + 3303 + if (f2fs_is_atomic_file(inode)) { 3304 + ret = -EINVAL; 3305 + 
goto out; 3306 + } 3334 3307 3335 3308 if (!pin) { 3336 3309 clear_inode_flag(inode, FI_PIN_FILE); ··· 4230 4193 /* It will never fail, when page has pinned above */ 4231 4194 f2fs_bug_on(F2FS_I_SB(inode), !page); 4232 4195 4196 + f2fs_wait_on_page_writeback(page, DATA, true, true); 4197 + 4233 4198 set_page_dirty(page); 4234 4199 set_page_private_gcing(page); 4235 4200 f2fs_put_page(page, 1); ··· 4246 4207 struct inode *inode = file_inode(filp); 4247 4208 struct f2fs_sb_info *sbi = F2FS_I_SB(inode); 4248 4209 struct f2fs_inode_info *fi = F2FS_I(inode); 4249 - pgoff_t page_idx = 0, last_idx; 4250 - int cluster_size = fi->i_cluster_size; 4251 - int count, ret; 4210 + pgoff_t page_idx = 0, last_idx, cluster_idx; 4211 + int ret; 4252 4212 4253 4213 if (!f2fs_sb_has_compression(sbi) || 4254 4214 F2FS_OPTION(sbi).compress_mode != COMPR_MODE_USER) ··· 4282 4244 goto out; 4283 4245 4284 4246 last_idx = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE); 4247 + last_idx >>= fi->i_log_cluster_size; 4285 4248 4286 - count = last_idx - page_idx; 4287 - while (count && count >= cluster_size) { 4288 - ret = redirty_blocks(inode, page_idx, cluster_size); 4249 + for (cluster_idx = 0; cluster_idx < last_idx; cluster_idx++) { 4250 + page_idx = cluster_idx << fi->i_log_cluster_size; 4251 + 4252 + if (!f2fs_is_compressed_cluster(inode, page_idx)) 4253 + continue; 4254 + 4255 + ret = redirty_blocks(inode, page_idx, fi->i_cluster_size); 4289 4256 if (ret < 0) 4290 4257 break; 4291 4258 ··· 4299 4256 if (ret < 0) 4300 4257 break; 4301 4258 } 4302 - 4303 - count -= cluster_size; 4304 - page_idx += cluster_size; 4305 4259 4306 4260 cond_resched(); 4307 4261 if (fatal_signal_pending(current)) { ··· 4326 4286 { 4327 4287 struct inode *inode = file_inode(filp); 4328 4288 struct f2fs_sb_info *sbi = F2FS_I_SB(inode); 4329 - pgoff_t page_idx = 0, last_idx; 4330 - int cluster_size = F2FS_I(inode)->i_cluster_size; 4331 - int count, ret; 4289 + struct f2fs_inode_info *fi = F2FS_I(inode); 4290 + pgoff_t 
page_idx = 0, last_idx, cluster_idx; 4291 + int ret; 4332 4292 4333 4293 if (!f2fs_sb_has_compression(sbi) || 4334 4294 F2FS_OPTION(sbi).compress_mode != COMPR_MODE_USER) ··· 4362 4322 set_inode_flag(inode, FI_ENABLE_COMPRESS); 4363 4323 4364 4324 last_idx = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE); 4325 + last_idx >>= fi->i_log_cluster_size; 4365 4326 4366 - count = last_idx - page_idx; 4367 - while (count && count >= cluster_size) { 4368 - ret = redirty_blocks(inode, page_idx, cluster_size); 4327 + for (cluster_idx = 0; cluster_idx < last_idx; cluster_idx++) { 4328 + page_idx = cluster_idx << fi->i_log_cluster_size; 4329 + 4330 + if (f2fs_is_sparse_cluster(inode, page_idx)) 4331 + continue; 4332 + 4333 + ret = redirty_blocks(inode, page_idx, fi->i_cluster_size); 4369 4334 if (ret < 0) 4370 4335 break; 4371 4336 ··· 4379 4334 if (ret < 0) 4380 4335 break; 4381 4336 } 4382 - 4383 - count -= cluster_size; 4384 - page_idx += cluster_size; 4385 4337 4386 4338 cond_resched(); 4387 4339 if (fatal_signal_pending(current)) { ··· 4580 4538 f2fs_down_read(&fi->i_gc_rwsem[READ]); 4581 4539 } 4582 4540 4541 + /* dio is not compatible w/ atomic file */ 4542 + if (f2fs_is_atomic_file(inode)) { 4543 + f2fs_up_read(&fi->i_gc_rwsem[READ]); 4544 + ret = -EOPNOTSUPP; 4545 + goto out; 4546 + } 4547 + 4583 4548 /* 4584 4549 * We have to use __iomap_dio_rw() and iomap_dio_complete() instead of 4585 4550 * the higher-level function iomap_dio_rw() in order to ensure that the ··· 4645 4596 if (trace_f2fs_dataread_start_enabled()) 4646 4597 f2fs_trace_rw_file_path(iocb->ki_filp, iocb->ki_pos, 4647 4598 iov_iter_count(to), READ); 4599 + 4600 + /* In LFS mode, if there is inflight dio, wait for its completion */ 4601 + if (f2fs_lfs_mode(F2FS_I_SB(inode))) 4602 + inode_dio_wait(inode); 4648 4603 4649 4604 if (f2fs_should_use_dio(inode, iocb, to)) { 4650 4605 ret = f2fs_dio_read_iter(iocb, to); ··· 5001 4948 5002 4949 /* Determine whether we will do a direct write or a buffered write. 
*/ 5003 4950 dio = f2fs_should_use_dio(inode, iocb, from); 4951 + 4952 + /* dio is not compatible w/ atomic write */ 4953 + if (dio && f2fs_is_atomic_file(inode)) { 4954 + ret = -EOPNOTSUPP; 4955 + goto out_unlock; 4956 + } 5004 4957 5005 4958 /* Possibly preallocate the blocks for the write. */ 5006 4959 target_size = iocb->ki_pos + iov_iter_count(from);
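The decompress/compress ioctl changes in fs/f2fs/file.c above replace page counting with cluster-indexed iteration: the loop bound is the page count shifted down by `i_log_cluster_size`, and each cluster's first page index is recovered by shifting back up. A minimal user-space sketch of that index arithmetic (plain helpers and a stand-in page size, not the kernel's macros):

```c
#include <assert.h>

#define PAGE_BYTES 4096UL	/* stand-in for PAGE_SIZE */

/* Number of compression clusters covering an inode of i_size bytes:
 * pages rounded up, then shifted down by log2(pages per cluster).
 * The shift truncates, so a trailing partial cluster is skipped,
 * matching the old "count >= cluster_size" loop condition. */
static unsigned long cluster_count(unsigned long i_size,
				   unsigned int log_cluster_size)
{
	unsigned long pages = (i_size + PAGE_BYTES - 1) / PAGE_BYTES;

	return pages >> log_cluster_size;
}

/* First page index of a given cluster. */
static unsigned long cluster_first_page(unsigned long cluster_idx,
					unsigned int log_cluster_size)
{
	return cluster_idx << log_cluster_size;
}
```

With a 4-page cluster (`log_cluster_size == 2`), a 17-page file still yields 4 whole clusters, and cluster 3 starts at page index 12.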
+86 -27
fs/f2fs/gc.c
··· 81 81 continue; 82 82 } 83 83 84 + gc_control.one_time = false; 85 + 84 86 /* 85 87 * [GC triggering condition] 86 88 * 0. GC is not conducted currently. ··· 118 116 goto next; 119 117 } 120 118 121 - if (has_enough_invalid_blocks(sbi)) 119 + if (f2fs_sb_has_blkzoned(sbi)) { 120 + if (has_enough_free_blocks(sbi, 121 + gc_th->no_zoned_gc_percent)) { 122 + wait_ms = gc_th->no_gc_sleep_time; 123 + f2fs_up_write(&sbi->gc_lock); 124 + goto next; 125 + } 126 + if (wait_ms == gc_th->no_gc_sleep_time) 127 + wait_ms = gc_th->max_sleep_time; 128 + } 129 + 130 + if (need_to_boost_gc(sbi)) { 122 131 decrease_sleep_time(gc_th, &wait_ms); 123 - else 132 + if (f2fs_sb_has_blkzoned(sbi)) 133 + gc_control.one_time = true; 134 + } else { 124 135 increase_sleep_time(gc_th, &wait_ms); 136 + } 125 137 do_gc: 126 138 stat_inc_gc_call_count(sbi, foreground ? 127 139 FOREGROUND : BACKGROUND); 128 140 129 - sync_mode = F2FS_OPTION(sbi).bggc_mode == BGGC_MODE_SYNC; 141 + sync_mode = (F2FS_OPTION(sbi).bggc_mode == BGGC_MODE_SYNC) || 142 + gc_control.one_time; 130 143 131 144 /* foreground GC was been triggered via f2fs_balance_fs() */ 132 145 if (foreground) ··· 196 179 return -ENOMEM; 197 180 198 181 gc_th->urgent_sleep_time = DEF_GC_THREAD_URGENT_SLEEP_TIME; 199 - gc_th->min_sleep_time = DEF_GC_THREAD_MIN_SLEEP_TIME; 200 - gc_th->max_sleep_time = DEF_GC_THREAD_MAX_SLEEP_TIME; 201 - gc_th->no_gc_sleep_time = DEF_GC_THREAD_NOGC_SLEEP_TIME; 182 + gc_th->valid_thresh_ratio = DEF_GC_THREAD_VALID_THRESH_RATIO; 183 + 184 + if (f2fs_sb_has_blkzoned(sbi)) { 185 + gc_th->min_sleep_time = DEF_GC_THREAD_MIN_SLEEP_TIME_ZONED; 186 + gc_th->max_sleep_time = DEF_GC_THREAD_MAX_SLEEP_TIME_ZONED; 187 + gc_th->no_gc_sleep_time = DEF_GC_THREAD_NOGC_SLEEP_TIME_ZONED; 188 + gc_th->no_zoned_gc_percent = LIMIT_NO_ZONED_GC; 189 + gc_th->boost_zoned_gc_percent = LIMIT_BOOST_ZONED_GC; 190 + } else { 191 + gc_th->min_sleep_time = DEF_GC_THREAD_MIN_SLEEP_TIME; 192 + gc_th->max_sleep_time = 
DEF_GC_THREAD_MAX_SLEEP_TIME; 193 + gc_th->no_gc_sleep_time = DEF_GC_THREAD_NOGC_SLEEP_TIME; 194 + gc_th->no_zoned_gc_percent = 0; 195 + gc_th->boost_zoned_gc_percent = 0; 196 + } 202 197 203 198 gc_th->gc_wake = false; 204 199 ··· 368 339 unsigned char age = 0; 369 340 unsigned char u; 370 341 unsigned int i; 371 - unsigned int usable_segs_per_sec = f2fs_usable_segs_in_sec(sbi, segno); 342 + unsigned int usable_segs_per_sec = f2fs_usable_segs_in_sec(sbi); 372 343 373 344 for (i = 0; i < usable_segs_per_sec; i++) 374 345 mtime += get_seg_entry(sbi, start + i)->mtime; ··· 396 367 { 397 368 if (p->alloc_mode == SSR) 398 369 return get_seg_entry(sbi, segno)->ckpt_valid_blocks; 370 + 371 + if (p->one_time_gc && (get_valid_blocks(sbi, segno, true) >= 372 + CAP_BLKS_PER_SEC(sbi) * sbi->gc_thread->valid_thresh_ratio / 373 + 100)) 374 + return UINT_MAX; 399 375 400 376 /* alloc_mode == LFS */ 401 377 if (p->gc_mode == GC_GREEDY) ··· 776 742 */ 777 743 int f2fs_get_victim(struct f2fs_sb_info *sbi, unsigned int *result, 778 744 int gc_type, int type, char alloc_mode, 779 - unsigned long long age) 745 + unsigned long long age, bool one_time) 780 746 { 781 747 struct dirty_seglist_info *dirty_i = DIRTY_I(sbi); 782 748 struct sit_info *sm = SIT_I(sbi); ··· 793 759 p.alloc_mode = alloc_mode; 794 760 p.age = age; 795 761 p.age_threshold = sbi->am.age_threshold; 762 + p.one_time_gc = one_time; 796 763 797 764 retry: 798 765 select_policy(sbi, gc_type, type, &p); ··· 1705 1670 } 1706 1671 1707 1672 static int __get_victim(struct f2fs_sb_info *sbi, unsigned int *victim, 1708 - int gc_type) 1673 + int gc_type, bool one_time) 1709 1674 { 1710 1675 struct sit_info *sit_i = SIT_I(sbi); 1711 1676 int ret; 1712 1677 1713 1678 down_write(&sit_i->sentry_lock); 1714 - ret = f2fs_get_victim(sbi, victim, gc_type, NO_CHECK_TYPE, LFS, 0); 1679 + ret = f2fs_get_victim(sbi, victim, gc_type, NO_CHECK_TYPE, 1680 + LFS, 0, one_time); 1715 1681 up_write(&sit_i->sentry_lock); 1716 1682 return ret; 1717 
1683 } ··· 1720 1684 static int do_garbage_collect(struct f2fs_sb_info *sbi, 1721 1685 unsigned int start_segno, 1722 1686 struct gc_inode_list *gc_list, int gc_type, 1723 - bool force_migrate) 1687 + bool force_migrate, bool one_time) 1724 1688 { 1725 1689 struct page *sum_page; 1726 1690 struct f2fs_summary_block *sum; 1727 1691 struct blk_plug plug; 1728 1692 unsigned int segno = start_segno; 1729 1693 unsigned int end_segno = start_segno + SEGS_PER_SEC(sbi); 1694 + unsigned int sec_end_segno; 1730 1695 int seg_freed = 0, migrated = 0; 1731 1696 unsigned char type = IS_DATASEG(get_seg_entry(sbi, segno)->type) ? 1732 1697 SUM_TYPE_DATA : SUM_TYPE_NODE; 1733 1698 unsigned char data_type = (type == SUM_TYPE_DATA) ? DATA : NODE; 1734 1699 int submitted = 0; 1735 1700 1736 - if (__is_large_section(sbi)) 1737 - end_segno = rounddown(end_segno, SEGS_PER_SEC(sbi)); 1701 + if (__is_large_section(sbi)) { 1702 + sec_end_segno = rounddown(end_segno, SEGS_PER_SEC(sbi)); 1738 1703 1739 - /* 1740 - * zone-capacity can be less than zone-size in zoned devices, 1741 - * resulting in less than expected usable segments in the zone, 1742 - * calculate the end segno in the zone which can be garbage collected 1743 - */ 1744 - if (f2fs_sb_has_blkzoned(sbi)) 1745 - end_segno -= SEGS_PER_SEC(sbi) - 1746 - f2fs_usable_segs_in_sec(sbi, segno); 1704 + /* 1705 + * zone-capacity can be less than zone-size in zoned devices, 1706 + * resulting in less than expected usable segments in the zone, 1707 + * calculate the end segno in the zone which can be garbage 1708 + * collected 1709 + */ 1710 + if (f2fs_sb_has_blkzoned(sbi)) 1711 + sec_end_segno -= SEGS_PER_SEC(sbi) - 1712 + f2fs_usable_segs_in_sec(sbi); 1713 + 1714 + if (gc_type == BG_GC || one_time) { 1715 + unsigned int window_granularity = 1716 + sbi->migration_window_granularity; 1717 + 1718 + if (f2fs_sb_has_blkzoned(sbi) && 1719 + !has_enough_free_blocks(sbi, 1720 + sbi->gc_thread->boost_zoned_gc_percent)) 1721 + window_granularity *= 
1722 + BOOST_GC_MULTIPLE; 1723 + 1724 + end_segno = start_segno + window_granularity; 1725 + } 1726 + 1727 + if (end_segno > sec_end_segno) 1728 + end_segno = sec_end_segno; 1729 + } 1747 1730 1748 1731 sanity_check_seg_type(sbi, get_seg_entry(sbi, segno)->type); 1749 1732 ··· 1841 1786 1842 1787 if (__is_large_section(sbi)) 1843 1788 sbi->next_victim_seg[gc_type] = 1844 - (segno + 1 < end_segno) ? segno + 1 : NULL_SEGNO; 1789 + (segno + 1 < sec_end_segno) ? 1790 + segno + 1 : NULL_SEGNO; 1845 1791 skip: 1846 1792 f2fs_put_page(sum_page, 0); 1847 1793 } ··· 1919 1863 goto stop; 1920 1864 } 1921 1865 retry: 1922 - ret = __get_victim(sbi, &segno, gc_type); 1866 + ret = __get_victim(sbi, &segno, gc_type, gc_control->one_time); 1923 1867 if (ret) { 1924 1868 /* allow to search victim from sections has pinned data */ 1925 1869 if (ret == -ENODATA && gc_type == FG_GC && ··· 1931 1875 } 1932 1876 1933 1877 seg_freed = do_garbage_collect(sbi, segno, &gc_list, gc_type, 1934 - gc_control->should_migrate_blocks); 1878 + gc_control->should_migrate_blocks, 1879 + gc_control->one_time); 1935 1880 if (seg_freed < 0) 1936 1881 goto stop; 1937 1882 1938 1883 total_freed += seg_freed; 1939 1884 1940 - if (seg_freed == f2fs_usable_segs_in_sec(sbi, segno)) { 1885 + if (seg_freed == f2fs_usable_segs_in_sec(sbi)) { 1941 1886 sec_freed++; 1942 1887 total_sec_freed++; 1943 1888 } 1889 + 1890 + if (gc_control->one_time) 1891 + goto stop; 1944 1892 1945 1893 if (gc_type == FG_GC) { 1946 1894 sbi->cur_victim_sec = NULL_SEGNO; ··· 2070 2010 .iroot = RADIX_TREE_INIT(gc_list.iroot, GFP_NOFS), 2071 2011 }; 2072 2012 2073 - do_garbage_collect(sbi, segno, &gc_list, FG_GC, 2074 - dry_run_sections == 0); 2013 + do_garbage_collect(sbi, segno, &gc_list, FG_GC, true, false); 2075 2014 put_gc_inode(&gc_list); 2076 2015 2077 2016 if (!dry_run && get_valid_blocks(sbi, segno, true))
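The `one_time_gc` path added to victim selection in fs/f2fs/gc.c prices sections at `UINT_MAX` once their valid-block count reaches `valid_thresh_ratio` percent (default 95) of the section's capacity, so one-time GC never migrates nearly-full sections. A hedged sketch of that cost filter, with plain integers standing in for the sbi fields:

```c
#include <assert.h>
#include <limits.h>

/* Illustrative cost for one-time GC victim selection: sections at or
 * above the valid-block threshold become unselectable (maximum cost);
 * otherwise cost falls back to the valid block count, as in the
 * existing GC_GREEDY policy. Names are stand-ins, not the kernel's. */
static unsigned int one_time_gc_cost(unsigned int valid_blocks,
				     unsigned int cap_blks_per_sec,
				     unsigned int valid_thresh_ratio)
{
	if (valid_blocks >= cap_blks_per_sec * valid_thresh_ratio / 100)
		return UINT_MAX;
	return valid_blocks;
}
```

At the default ratio of 95, a 100-block section with 95 valid blocks is filtered out while one with 94 remains a candidate.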
+29
fs/f2fs/gc.h
··· 15 15 #define DEF_GC_THREAD_MAX_SLEEP_TIME 60000 16 16 #define DEF_GC_THREAD_NOGC_SLEEP_TIME 300000 /* wait 5 min */ 17 17 18 + /* GC sleep parameters for zoned devices */ 19 + #define DEF_GC_THREAD_MIN_SLEEP_TIME_ZONED 10 20 + #define DEF_GC_THREAD_MAX_SLEEP_TIME_ZONED 20 21 + #define DEF_GC_THREAD_NOGC_SLEEP_TIME_ZONED 60000 22 + 18 23 /* choose candidates from sections which has age of more than 7 days */ 19 24 #define DEF_GC_THREAD_AGE_THRESHOLD (60 * 60 * 24 * 7) 20 25 #define DEF_GC_THREAD_CANDIDATE_RATIO 20 /* select 20% oldest sections as candidates */ 21 26 #define DEF_GC_THREAD_MAX_CANDIDATE_COUNT 10 /* select at most 10 sections as candidates */ 22 27 #define DEF_GC_THREAD_AGE_WEIGHT 60 /* age weight */ 28 + #define DEF_GC_THREAD_VALID_THRESH_RATIO 95 /* do not GC over 95% valid block ratio for one time GC */ 23 29 #define DEFAULT_ACCURACY_CLASS 10000 /* accuracy class */ 24 30 25 31 #define LIMIT_INVALID_BLOCK 40 /* percentage over total user space */ 26 32 #define LIMIT_FREE_BLOCK 40 /* percentage over invalid + free space */ 33 + 34 + #define LIMIT_NO_ZONED_GC 60 /* percentage over total user space of no gc for zoned devices */ 35 + #define LIMIT_BOOST_ZONED_GC 25 /* percentage over total user space of boosted gc for zoned devices */ 36 + #define DEF_MIGRATION_WINDOW_GRANULARITY_ZONED 3 37 + #define BOOST_GC_MULTIPLE 5 27 38 28 39 #define DEF_GC_FAILED_PINNED_FILES 2048 29 40 #define MAX_GC_FAILED_PINNED_FILES USHRT_MAX ··· 62 51 * caller of f2fs_balance_fs() 63 52 * will wait on this wait queue. 
64 53 */ 54 + 55 + /* for gc control for zoned devices */ 56 + unsigned int no_zoned_gc_percent; 57 + unsigned int boost_zoned_gc_percent; 58 + unsigned int valid_thresh_ratio; 65 59 }; 66 60 67 61 struct gc_inode_list { ··· 168 152 *wait -= min_time; 169 153 } 170 154 155 + static inline bool has_enough_free_blocks(struct f2fs_sb_info *sbi, 156 + unsigned int limit_perc) 157 + { 158 + return free_sections(sbi) > ((sbi->total_sections * limit_perc) / 100); 159 + } 160 + 171 161 static inline bool has_enough_invalid_blocks(struct f2fs_sb_info *sbi) 172 162 { 173 163 block_t user_block_count = sbi->user_block_count; ··· 188 166 limit_invalid_user_blocks(user_block_count) && 189 167 free_user_blocks(sbi) < 190 168 limit_free_user_blocks(invalid_user_blocks)); 169 + } 170 + 171 + static inline bool need_to_boost_gc(struct f2fs_sb_info *sbi) 172 + { 173 + if (f2fs_sb_has_blkzoned(sbi)) 174 + return !has_enough_free_blocks(sbi, LIMIT_BOOST_ZONED_GC); 175 + return has_enough_invalid_blocks(sbi); 191 176 }
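Two of the zoned-GC knobs defined above combine into a simple policy: background GC idles while free sections exceed `LIMIT_NO_ZONED_GC` (60%), and the per-pass migration window grows by `BOOST_GC_MULTIPLE` once free space drops under `LIMIT_BOOST_ZONED_GC` (25%). A user-space sketch with plain integers in place of the sbi fields:

```c
#include <assert.h>
#include <stdbool.h>

#define LIMIT_NO_ZONED_GC	60
#define LIMIT_BOOST_ZONED_GC	25
#define BOOST_GC_MULTIPLE	5

/* Mirrors has_enough_free_blocks(): true when strictly more than
 * limit_perc percent of all sections are free. */
static bool has_enough_free(unsigned int free_secs,
			    unsigned int total_secs,
			    unsigned int limit_perc)
{
	return free_secs > total_secs * limit_perc / 100;
}

/* Migration window for one background pass: boosted when free space
 * is below the boost threshold, as in do_garbage_collect(). */
static unsigned int migration_window(unsigned int granularity,
				     unsigned int free_secs,
				     unsigned int total_secs)
{
	if (!has_enough_free(free_secs, total_secs, LIMIT_BOOST_ZONED_GC))
		return granularity * BOOST_GC_MULTIPLE;
	return granularity;
}
```

With the default zoned granularity of 3 segments, a device at 20% free sections migrates 15 segments per pass instead of 3.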
+15 -16
fs/f2fs/inline.c
··· 260 260 return err; 261 261 } 262 262 263 - int f2fs_write_inline_data(struct inode *inode, struct page *page) 263 + int f2fs_write_inline_data(struct inode *inode, struct folio *folio) 264 264 { 265 - struct dnode_of_data dn; 266 - int err; 265 + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); 266 + struct page *ipage; 267 267 268 - set_new_dnode(&dn, inode, NULL, NULL, 0); 269 - err = f2fs_get_dnode_of_data(&dn, 0, LOOKUP_NODE); 270 - if (err) 271 - return err; 268 + ipage = f2fs_get_node_page(sbi, inode->i_ino); 269 + if (IS_ERR(ipage)) 270 + return PTR_ERR(ipage); 272 271 273 272 if (!f2fs_has_inline_data(inode)) { 274 - f2fs_put_dnode(&dn); 273 + f2fs_put_page(ipage, 1); 275 274 return -EAGAIN; 276 275 } 277 276 278 - f2fs_bug_on(F2FS_I_SB(inode), page->index); 277 + f2fs_bug_on(F2FS_I_SB(inode), folio->index); 279 278 280 - f2fs_wait_on_page_writeback(dn.inode_page, NODE, true, true); 281 - memcpy_from_page(inline_data_addr(inode, dn.inode_page), 282 - page, 0, MAX_INLINE_DATA(inode)); 283 - set_page_dirty(dn.inode_page); 279 + f2fs_wait_on_page_writeback(ipage, NODE, true, true); 280 + memcpy_from_folio(inline_data_addr(inode, ipage), 281 + folio, 0, MAX_INLINE_DATA(inode)); 282 + set_page_dirty(ipage); 284 283 285 - f2fs_clear_page_cache_dirty_tag(page); 284 + f2fs_clear_page_cache_dirty_tag(folio); 286 285 287 286 set_inode_flag(inode, FI_APPEND_WRITE); 288 287 set_inode_flag(inode, FI_DATA_EXIST); 289 288 290 - clear_page_private_inline(dn.inode_page); 291 - f2fs_put_dnode(&dn); 289 + clear_page_private_inline(ipage); 290 + f2fs_put_page(ipage, 1); 292 291 return 0; 293 292 } 294 293
+7 -2
fs/f2fs/inode.c
··· 7 7 */ 8 8 #include <linux/fs.h> 9 9 #include <linux/f2fs_fs.h> 10 - #include <linux/buffer_head.h> 11 10 #include <linux/writeback.h> 12 11 #include <linux/sched/mm.h> 13 12 #include <linux/lz4.h> ··· 33 34 34 35 if (f2fs_inode_dirtied(inode, sync)) 35 36 return; 37 + 38 + if (f2fs_is_atomic_file(inode)) { 39 + set_inode_flag(inode, FI_ATOMIC_DIRTIED); 40 + return; 41 + } 36 42 37 43 mark_inode_dirty_sync(inode); 38 44 } ··· 179 175 180 176 if (provided != calculated) 181 177 f2fs_warn(sbi, "checksum invalid, nid = %lu, ino_of_node = %x, %x vs. %x", 182 - page->index, ino_of_node(page), provided, calculated); 178 + page_folio(page)->index, ino_of_node(page), 179 + provided, calculated); 183 180 184 181 return provided == calculated; 185 182 }
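The `FI_ATOMIC_DIRTIED` change in fs/f2fs/inode.c above defers the inode mark-dirty while an atomic write is open; the deferred dirty is replayed when the atomic context ends (the matching replay sites appear in the fs/f2fs/segment.c hunks below in this commit). A toy model of that defer-and-replay flow, with plain booleans instead of the kernel's inode flags:

```c
#include <assert.h>
#include <stdbool.h>

struct toy_inode {
	bool atomic_file;	/* stands in for FI_ATOMIC_FILE */
	bool atomic_dirtied;	/* stands in for FI_ATOMIC_DIRTIED */
	bool dirtied;		/* stands in for mark_inode_dirty_sync() */
};

static void mark_dirty(struct toy_inode *i)
{
	if (i->atomic_file) {
		i->atomic_dirtied = true;	/* defer while atomic */
		return;
	}
	i->dirtied = true;
}

static void end_atomic(struct toy_inode *i)
{
	i->atomic_file = false;
	if (i->atomic_dirtied) {
		i->atomic_dirtied = false;
		mark_dirty(i);			/* replay deferred dirty */
	}
}

/* Scenario: dirty during an atomic write is deferred, then replayed
 * once the atomic context is torn down. */
static bool scenario_deferred_then_replayed(void)
{
	struct toy_inode i = { true, false, false };

	mark_dirty(&i);
	if (i.dirtied || !i.atomic_dirtied)
		return false;
	end_atomic(&i);
	return i.dirtied && !i.atomic_dirtied;
}
```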
-68
fs/f2fs/namei.c
··· 457 457 return d_obtain_alias(f2fs_iget(child->d_sb, ino)); 458 458 } 459 459 460 - static int __recover_dot_dentries(struct inode *dir, nid_t pino) 461 - { 462 - struct f2fs_sb_info *sbi = F2FS_I_SB(dir); 463 - struct qstr dot = QSTR_INIT(".", 1); 464 - struct f2fs_dir_entry *de; 465 - struct page *page; 466 - int err = 0; 467 - 468 - if (f2fs_readonly(sbi->sb)) { 469 - f2fs_info(sbi, "skip recovering inline_dots inode (ino:%lu, pino:%u) in readonly mountpoint", 470 - dir->i_ino, pino); 471 - return 0; 472 - } 473 - 474 - if (!S_ISDIR(dir->i_mode)) { 475 - f2fs_err(sbi, "inconsistent inode status, skip recovering inline_dots inode (ino:%lu, i_mode:%u, pino:%u)", 476 - dir->i_ino, dir->i_mode, pino); 477 - set_sbi_flag(sbi, SBI_NEED_FSCK); 478 - return -ENOTDIR; 479 - } 480 - 481 - err = f2fs_dquot_initialize(dir); 482 - if (err) 483 - return err; 484 - 485 - f2fs_balance_fs(sbi, true); 486 - 487 - f2fs_lock_op(sbi); 488 - 489 - de = f2fs_find_entry(dir, &dot, &page); 490 - if (de) { 491 - f2fs_put_page(page, 0); 492 - } else if (IS_ERR(page)) { 493 - err = PTR_ERR(page); 494 - goto out; 495 - } else { 496 - err = f2fs_do_add_link(dir, &dot, NULL, dir->i_ino, S_IFDIR); 497 - if (err) 498 - goto out; 499 - } 500 - 501 - de = f2fs_find_entry(dir, &dotdot_name, &page); 502 - if (de) 503 - f2fs_put_page(page, 0); 504 - else if (IS_ERR(page)) 505 - err = PTR_ERR(page); 506 - else 507 - err = f2fs_do_add_link(dir, &dotdot_name, NULL, pino, S_IFDIR); 508 - out: 509 - if (!err) 510 - clear_inode_flag(dir, FI_INLINE_DOTS); 511 - 512 - f2fs_unlock_op(sbi); 513 - return err; 514 - } 515 - 516 460 static struct dentry *f2fs_lookup(struct inode *dir, struct dentry *dentry, 517 461 unsigned int flags) 518 462 { ··· 466 522 struct dentry *new; 467 523 nid_t ino = -1; 468 524 int err = 0; 469 - unsigned int root_ino = F2FS_ROOT_INO(F2FS_I_SB(dir)); 470 525 struct f2fs_filename fname; 471 526 472 527 trace_f2fs_lookup_start(dir, dentry, flags); ··· 501 558 goto out; 502 559 } 
503 560 504 - if ((dir->i_ino == root_ino) && f2fs_has_inline_dots(dir)) { 505 - err = __recover_dot_dentries(dir, root_ino); 506 - if (err) 507 - goto out_iput; 508 - } 509 - 510 - if (f2fs_has_inline_dots(inode)) { 511 - err = __recover_dot_dentries(inode, dir->i_ino); 512 - if (err) 513 - goto out_iput; 514 - } 515 561 if (IS_ENCRYPTED(dir) && 516 562 (S_ISDIR(inode->i_mode) || S_ISLNK(inode->i_mode)) && 517 563 !fscrypt_has_permitted_context(dir, inode)) {
+24 -22
fs/f2fs/node.c
··· 20 20 #include "iostat.h" 21 21 #include <trace/events/f2fs.h> 22 22 23 - #define on_f2fs_build_free_nids(nmi) mutex_is_locked(&(nm_i)->build_lock) 23 + #define on_f2fs_build_free_nids(nm_i) mutex_is_locked(&(nm_i)->build_lock) 24 24 25 25 static struct kmem_cache *nat_entry_slab; 26 26 static struct kmem_cache *free_nid_slab; ··· 123 123 static void clear_node_page_dirty(struct page *page) 124 124 { 125 125 if (PageDirty(page)) { 126 - f2fs_clear_page_cache_dirty_tag(page); 126 + f2fs_clear_page_cache_dirty_tag(page_folio(page)); 127 127 clear_page_dirty_for_io(page); 128 128 dec_page_count(F2FS_P_SB(page), F2FS_DIRTY_NODES); 129 129 } ··· 919 919 clear_node_page_dirty(dn->node_page); 920 920 set_sbi_flag(sbi, SBI_IS_DIRTY); 921 921 922 - index = dn->node_page->index; 922 + index = page_folio(dn->node_page)->index; 923 923 f2fs_put_page(dn->node_page, 1); 924 924 925 925 invalidate_mapping_pages(NODE_MAPPING(sbi), ··· 1369 1369 */ 1370 1370 static int read_node_page(struct page *page, blk_opf_t op_flags) 1371 1371 { 1372 + struct folio *folio = page_folio(page); 1372 1373 struct f2fs_sb_info *sbi = F2FS_P_SB(page); 1373 1374 struct node_info ni; 1374 1375 struct f2fs_io_info fio = { ··· 1382 1381 }; 1383 1382 int err; 1384 1383 1385 - if (PageUptodate(page)) { 1384 + if (folio_test_uptodate(folio)) { 1386 1385 if (!f2fs_inode_chksum_verify(sbi, page)) { 1387 - ClearPageUptodate(page); 1386 + folio_clear_uptodate(folio); 1388 1387 return -EFSBADCRC; 1389 1388 } 1390 1389 return LOCKED_PAGE; 1391 1390 } 1392 1391 1393 - err = f2fs_get_node_info(sbi, page->index, &ni, false); 1392 + err = f2fs_get_node_info(sbi, folio->index, &ni, false); 1394 1393 if (err) 1395 1394 return err; 1396 1395 1397 1396 /* NEW_ADDR can be seen, after cp_error drops some dirty node pages */ 1398 1397 if (unlikely(ni.blk_addr == NULL_ADDR || ni.blk_addr == NEW_ADDR)) { 1399 - ClearPageUptodate(page); 1398 + folio_clear_uptodate(folio); 1400 1399 return -ENOENT; 1401 1400 } 1402 1401 ··· 
1493 1492 out_put_err: 1494 1493 /* ENOENT comes from read_node_page which is not an error. */ 1495 1494 if (err != -ENOENT) 1496 - f2fs_handle_page_eio(sbi, page->index, NODE); 1495 + f2fs_handle_page_eio(sbi, page_folio(page), NODE); 1497 1496 f2fs_put_page(page, 1); 1498 1497 return ERR_PTR(err); 1499 1498 } ··· 1536 1535 if (!clear_page_dirty_for_io(page)) 1537 1536 goto page_out; 1538 1537 1539 - ret = f2fs_write_inline_data(inode, page); 1538 + ret = f2fs_write_inline_data(inode, page_folio(page)); 1540 1539 inode_dec_dirty_pages(inode); 1541 1540 f2fs_remove_dirty_inode(inode); 1542 1541 if (ret) ··· 1609 1608 enum iostat_type io_type, unsigned int *seq_id) 1610 1609 { 1611 1610 struct f2fs_sb_info *sbi = F2FS_P_SB(page); 1611 + struct folio *folio = page_folio(page); 1612 1612 nid_t nid; 1613 1613 struct node_info ni; 1614 1614 struct f2fs_io_info fio = { ··· 1626 1624 }; 1627 1625 unsigned int seq; 1628 1626 1629 - trace_f2fs_writepage(page_folio(page), NODE); 1627 + trace_f2fs_writepage(folio, NODE); 1630 1628 1631 1629 if (unlikely(f2fs_cp_error(sbi))) { 1632 1630 /* keep node pages in remount-ro mode */ 1633 1631 if (F2FS_OPTION(sbi).errors == MOUNT_ERRORS_READONLY) 1634 1632 goto redirty_out; 1635 - ClearPageUptodate(page); 1633 + folio_clear_uptodate(folio); 1636 1634 dec_page_count(sbi, F2FS_DIRTY_NODES); 1637 - unlock_page(page); 1635 + folio_unlock(folio); 1638 1636 return 0; 1639 1637 } 1640 1638 ··· 1648 1646 1649 1647 /* get old block addr of this node page */ 1650 1648 nid = nid_of_node(page); 1651 - f2fs_bug_on(sbi, page->index != nid); 1649 + f2fs_bug_on(sbi, folio->index != nid); 1652 1650 1653 1651 if (f2fs_get_node_info(sbi, nid, &ni, !do_balance)) 1654 1652 goto redirty_out; ··· 1662 1660 1663 1661 /* This page is already truncated */ 1664 1662 if (unlikely(ni.blk_addr == NULL_ADDR)) { 1665 - ClearPageUptodate(page); 1663 + folio_clear_uptodate(folio); 1666 1664 dec_page_count(sbi, F2FS_DIRTY_NODES); 1667 1665 
f2fs_up_read(&sbi->node_write); 1668 - unlock_page(page); 1666 + folio_unlock(folio); 1669 1667 return 0; 1670 1668 } 1671 1669 ··· 1676 1674 goto redirty_out; 1677 1675 } 1678 1676 1679 - if (atomic && !test_opt(sbi, NOBARRIER) && !f2fs_sb_has_blkzoned(sbi)) 1677 + if (atomic && !test_opt(sbi, NOBARRIER)) 1680 1678 fio.op_flags |= REQ_PREFLUSH | REQ_FUA; 1681 1679 1682 1680 /* should add to global list before clearing PAGECACHE status */ ··· 1686 1684 *seq_id = seq; 1687 1685 } 1688 1686 1689 - set_page_writeback(page); 1687 + folio_start_writeback(folio); 1690 1688 1691 1689 fio.old_blkaddr = ni.blk_addr; 1692 1690 f2fs_do_write_node_page(nid, &fio); ··· 1699 1697 submitted = NULL; 1700 1698 } 1701 1699 1702 - unlock_page(page); 1700 + folio_unlock(folio); 1703 1701 1704 1702 if (unlikely(f2fs_cp_error(sbi))) { 1705 1703 f2fs_submit_merged_write(sbi, NODE); ··· 1713 1711 return 0; 1714 1712 1715 1713 redirty_out: 1716 - redirty_page_for_writepage(wbc, page); 1714 + folio_redirty_for_writepage(wbc, folio); 1717 1715 return AOP_WRITEPAGE_ACTIVATE; 1718 1716 } 1719 1717 ··· 1869 1867 } 1870 1868 if (!ret && atomic && !marked) { 1871 1869 f2fs_debug(sbi, "Retry to write fsync mark: ino=%u, idx=%lx", 1872 - ino, last_page->index); 1870 + ino, page_folio(last_page)->index); 1873 1871 lock_page(last_page); 1874 1872 f2fs_wait_on_page_writeback(last_page, NODE, true, true); 1875 1873 set_page_dirty(last_page); ··· 3168 3166 3169 3167 nm_i->nat_bits_blocks = F2FS_BLK_ALIGN((nat_bits_bytes << 1) + 8); 3170 3168 nm_i->nat_bits = f2fs_kvzalloc(sbi, 3171 - nm_i->nat_bits_blocks << F2FS_BLKSIZE_BITS, GFP_KERNEL); 3169 + F2FS_BLK_TO_BYTES(nm_i->nat_bits_blocks), GFP_KERNEL); 3172 3170 if (!nm_i->nat_bits) 3173 3171 return -ENOMEM; 3174 3172 ··· 3187 3185 if (IS_ERR(page)) 3188 3186 return PTR_ERR(page); 3189 3187 3190 - memcpy(nm_i->nat_bits + (i << F2FS_BLKSIZE_BITS), 3188 + memcpy(nm_i->nat_bits + F2FS_BLK_TO_BYTES(i), 3191 3189 page_address(page), F2FS_BLKSIZE); 3192 3190 
f2fs_put_page(page, 1); 3193 3191 }
+56 -16
fs/f2fs/segment.c
··· 199 199 clear_inode_flag(inode, FI_ATOMIC_COMMITTED); 200 200 clear_inode_flag(inode, FI_ATOMIC_REPLACE); 201 201 clear_inode_flag(inode, FI_ATOMIC_FILE); 202 + if (is_inode_flag_set(inode, FI_ATOMIC_DIRTIED)) { 203 + clear_inode_flag(inode, FI_ATOMIC_DIRTIED); 204 + f2fs_mark_inode_dirty_sync(inode, true); 205 + } 202 206 stat_dec_atomic_inode(inode); 203 207 204 208 F2FS_I(inode)->atomic_write_task = NULL; ··· 370 366 } else { 371 367 sbi->committed_atomic_block += fi->atomic_write_cnt; 372 368 set_inode_flag(inode, FI_ATOMIC_COMMITTED); 369 + if (is_inode_flag_set(inode, FI_ATOMIC_DIRTIED)) { 370 + clear_inode_flag(inode, FI_ATOMIC_DIRTIED); 371 + f2fs_mark_inode_dirty_sync(inode, true); 372 + } 373 373 } 374 374 375 375 __complete_revoke_list(inode, &revoke_list, ret ? true : false); ··· 1290 1282 wait_list, issued); 1291 1283 return 0; 1292 1284 } 1285 + 1286 + /* 1287 + * Issue discard for conventional zones only if the device 1288 + * supports discard. 1289 + */ 1290 + if (!bdev_max_discard_sectors(bdev)) 1291 + return -EOPNOTSUPP; 1293 1292 } 1294 1293 #endif 1295 1294 ··· 2701 2686 goto got_it; 2702 2687 } 2703 2688 2689 + #ifdef CONFIG_BLK_DEV_ZONED 2704 2690 /* 2705 2691 * If we format f2fs on zoned storage, let's try to get pinned sections 2706 2692 * from beginning of the storage, which should be a conventional one. 2707 2693 */ 2708 2694 if (f2fs_sb_has_blkzoned(sbi)) { 2709 - segno = pinning ? 
0 : max(first_zoned_segno(sbi), *newseg); 2695 + /* Prioritize writing to conventional zones */ 2696 + if (sbi->blkzone_alloc_policy == BLKZONE_ALLOC_PRIOR_CONV || pinning) 2697 + segno = 0; 2698 + else 2699 + segno = max(first_zoned_segno(sbi), *newseg); 2710 2700 hint = GET_SEC_FROM_SEG(sbi, segno); 2711 2701 } 2702 + #endif 2712 2703 2713 2704 find_other_zone: 2714 2705 secno = find_next_zero_bit(free_i->free_secmap, MAIN_SECS(sbi), hint); 2706 + 2707 + #ifdef CONFIG_BLK_DEV_ZONED 2708 + if (secno >= MAIN_SECS(sbi) && f2fs_sb_has_blkzoned(sbi)) { 2709 + /* Write only to sequential zones */ 2710 + if (sbi->blkzone_alloc_policy == BLKZONE_ALLOC_ONLY_SEQ) { 2711 + hint = GET_SEC_FROM_SEG(sbi, first_zoned_segno(sbi)); 2712 + secno = find_next_zero_bit(free_i->free_secmap, MAIN_SECS(sbi), hint); 2713 + } else 2714 + secno = find_first_zero_bit(free_i->free_secmap, 2715 + MAIN_SECS(sbi)); 2716 + if (secno >= MAIN_SECS(sbi)) { 2717 + ret = -ENOSPC; 2718 + f2fs_bug_on(sbi, 1); 2719 + goto out_unlock; 2720 + } 2721 + } 2722 + #endif 2723 + 2715 2724 if (secno >= MAIN_SECS(sbi)) { 2716 2725 secno = find_first_zero_bit(free_i->free_secmap, 2717 2726 MAIN_SECS(sbi)); 2718 2727 if (secno >= MAIN_SECS(sbi)) { 2719 2728 ret = -ENOSPC; 2729 + f2fs_bug_on(sbi, 1); 2720 2730 goto out_unlock; 2721 2731 } 2722 2732 } ··· 2783 2743 out_unlock: 2784 2744 spin_unlock(&free_i->segmap_lock); 2785 2745 2786 - if (ret == -ENOSPC) { 2746 + if (ret == -ENOSPC) 2787 2747 f2fs_stop_checkpoint(sbi, false, STOP_CP_REASON_NO_SEGMENT); 2788 - f2fs_bug_on(sbi, 1); 2789 - } 2790 2748 return ret; 2791 2749 } 2792 2750 ··· 3090 3052 sanity_check_seg_type(sbi, seg_type); 3091 3053 3092 3054 /* f2fs_need_SSR() already forces to do this */ 3093 - if (!f2fs_get_victim(sbi, &segno, BG_GC, seg_type, alloc_mode, age)) { 3055 + if (!f2fs_get_victim(sbi, &segno, BG_GC, seg_type, 3056 + alloc_mode, age, false)) { 3094 3057 curseg->next_segno = segno; 3095 3058 return 1; 3096 3059 } ··· 3118 3079 for (; cnt-- > 
0; reversed ? i-- : i++) { 3119 3080 if (i == seg_type) 3120 3081 continue; 3121 - if (!f2fs_get_victim(sbi, &segno, BG_GC, i, alloc_mode, age)) { 3082 + if (!f2fs_get_victim(sbi, &segno, BG_GC, i, 3083 + alloc_mode, age, false)) { 3122 3084 curseg->next_segno = segno; 3123 3085 return 1; 3124 3086 } ··· 3562 3522 if (file_is_cold(inode) || f2fs_need_compress_data(inode)) 3563 3523 return CURSEG_COLD_DATA; 3564 3524 3565 - type = __get_age_segment_type(inode, fio->page->index); 3525 + type = __get_age_segment_type(inode, 3526 + page_folio(fio->page)->index); 3566 3527 if (type != NO_CHECK_TYPE) 3567 3528 return type; 3568 3529 ··· 3822 3781 f2fs_up_read(&fio->sbi->io_order_lock); 3823 3782 } 3824 3783 3825 - void f2fs_do_write_meta_page(struct f2fs_sb_info *sbi, struct page *page, 3784 + void f2fs_do_write_meta_page(struct f2fs_sb_info *sbi, struct folio *folio, 3826 3785 enum iostat_type io_type) 3827 3786 { 3828 3787 struct f2fs_io_info fio = { ··· 3831 3790 .temp = HOT, 3832 3791 .op = REQ_OP_WRITE, 3833 3792 .op_flags = REQ_SYNC | REQ_META | REQ_PRIO, 3834 - .old_blkaddr = page->index, 3835 - .new_blkaddr = page->index, 3836 - .page = page, 3793 + .old_blkaddr = folio->index, 3794 + .new_blkaddr = folio->index, 3795 + .page = folio_page(folio, 0), 3837 3796 .encrypted_page = NULL, 3838 3797 .in_list = 0, 3839 3798 }; 3840 3799 3841 - if (unlikely(page->index >= MAIN_BLKADDR(sbi))) 3800 + if (unlikely(folio->index >= MAIN_BLKADDR(sbi))) 3842 3801 fio.op_flags &= ~REQ_META; 3843 3802 3844 - set_page_writeback(page); 3803 + folio_start_writeback(folio); 3845 3804 f2fs_submit_page_write(&fio); 3846 3805 3847 - stat_inc_meta_count(sbi, page->index); 3806 + stat_inc_meta_count(sbi, folio->index); 3848 3807 f2fs_update_iostat(sbi, NULL, io_type, F2FS_BLKSIZE); 3849 3808 } 3850 3809 ··· 5422 5381 return BLKS_PER_SEG(sbi); 5423 5382 } 5424 5383 5425 - unsigned int f2fs_usable_segs_in_sec(struct f2fs_sb_info *sbi, 5426 - unsigned int segno) 5384 + unsigned int 
f2fs_usable_segs_in_sec(struct f2fs_sb_info *sbi) 5427 5385 { 5428 5386 if (f2fs_sb_has_blkzoned(sbi)) 5429 5387 return CAP_SEGS_PER_SEC(sbi);
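The allocator change in fs/f2fs/segment.c above makes the free-section search start depend on the new `blkzone_alloc_policy`: pinned files and the prioritize-conventional policy scan from segment 0, while everything else starts at the first sequential-zone segment. A hedged sketch of that selection (the enum ordering and helper name are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdbool.h>

enum blkzone_alloc_policy {	/* illustrative ordering */
	BLKZONE_ALLOC_PRIOR_SEQ,
	BLKZONE_ALLOC_ONLY_SEQ,
	BLKZONE_ALLOC_PRIOR_CONV,
};

/* Pick the segment where the free-section search begins on a zoned
 * device: conventional zones sit at the front of the address space,
 * so policies that prefer them (and pinned files) start at 0. */
static unsigned int pick_start_segno(enum blkzone_alloc_policy policy,
				     bool pinning,
				     unsigned int first_zoned_segno,
				     unsigned int newseg)
{
	if (policy == BLKZONE_ALLOC_PRIOR_CONV || pinning)
		return 0;
	return first_zoned_segno > newseg ? first_zoned_segno : newseg;
}
```

The `max()` in the non-pinning branch keeps the search from moving backwards past the allocation hint `newseg` once it is already inside the sequential region.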
+3 -2
fs/f2fs/segment.h
··· 188 188 unsigned int min_segno; /* segment # having min. cost */ 189 189 unsigned long long age; /* mtime of GCed section*/ 190 190 unsigned long long age_threshold;/* age threshold */ 191 + bool one_time_gc; /* one time GC */ 191 192 }; 192 193 193 194 struct seg_entry { ··· 431 430 unsigned int secno = GET_SEC_FROM_SEG(sbi, segno); 432 431 unsigned int start_segno = GET_SEG_FROM_SEC(sbi, secno); 433 432 unsigned int next; 434 - unsigned int usable_segs = f2fs_usable_segs_in_sec(sbi, segno); 433 + unsigned int usable_segs = f2fs_usable_segs_in_sec(sbi); 435 434 436 435 spin_lock(&free_i->segmap_lock); 437 436 clear_bit(segno, free_i->free_segmap); ··· 465 464 unsigned int secno = GET_SEC_FROM_SEG(sbi, segno); 466 465 unsigned int start_segno = GET_SEG_FROM_SEC(sbi, secno); 467 466 unsigned int next; 468 - unsigned int usable_segs = f2fs_usable_segs_in_sec(sbi, segno); 467 + unsigned int usable_segs = f2fs_usable_segs_in_sec(sbi); 469 468 470 469 spin_lock(&free_i->segmap_lock); 471 470 if (test_and_clear_bit(segno, free_i->free_segmap)) {
+74 -45
fs/f2fs/super.c
···
 #include <linux/fs_context.h>
 #include <linux/sched/mm.h>
 #include <linux/statfs.h>
-#include <linux/buffer_head.h>
 #include <linux/kthread.h>
 #include <linux/parser.h>
 #include <linux/mount.h>
···
 		if (!strcmp(name, "on")) {
 			F2FS_OPTION(sbi).bggc_mode = BGGC_MODE_ON;
 		} else if (!strcmp(name, "off")) {
+			if (f2fs_sb_has_blkzoned(sbi)) {
+				f2fs_warn(sbi, "zoned devices need bggc");
+				kfree(name);
+				return -EINVAL;
+			}
 			F2FS_OPTION(sbi).bggc_mode = BGGC_MODE_OFF;
 		} else if (!strcmp(name, "sync")) {
 			F2FS_OPTION(sbi).bggc_mode = BGGC_MODE_SYNC;
···
 static void f2fs_shutdown(struct super_block *sb)
 {
-	f2fs_do_shutdown(F2FS_SB(sb), F2FS_GOING_DOWN_NOSYNC, false);
+	f2fs_do_shutdown(F2FS_SB(sb), F2FS_GOING_DOWN_NOSYNC, false, false);
 }
 
 #ifdef CONFIG_QUOTA
···
 	 * fit within U32_MAX + 1 data units.
 	 */
 
-	result = min(result, (((loff_t)U32_MAX + 1) * 4096) >> F2FS_BLKSIZE_BITS);
+	result = min(result, F2FS_BYTES_TO_BLK(((loff_t)U32_MAX + 1) * 4096));
 
 	return result;
 }
 
-static int __f2fs_commit_super(struct buffer_head *bh,
-				struct f2fs_super_block *super)
+static int __f2fs_commit_super(struct f2fs_sb_info *sbi, struct folio *folio,
+				pgoff_t index, bool update)
 {
-	lock_buffer(bh);
-	if (super)
-		memcpy(bh->b_data + F2FS_SUPER_OFFSET, super, sizeof(*super));
-	set_buffer_dirty(bh);
-	unlock_buffer(bh);
-
+	struct bio *bio;
 	/* it's rare case, we can do fua all the time */
-	return __sync_dirty_buffer(bh, REQ_SYNC | REQ_PREFLUSH | REQ_FUA);
+	blk_opf_t opf = REQ_OP_WRITE | REQ_SYNC | REQ_PREFLUSH | REQ_FUA;
+	int ret;
+
+	folio_lock(folio);
+	folio_wait_writeback(folio);
+	if (update)
+		memcpy(F2FS_SUPER_BLOCK(folio, index), F2FS_RAW_SUPER(sbi),
+				sizeof(struct f2fs_super_block));
+	folio_mark_dirty(folio);
+	folio_clear_dirty_for_io(folio);
+	folio_start_writeback(folio);
+	folio_unlock(folio);
+
+	bio = bio_alloc(sbi->sb->s_bdev, 1, opf, GFP_NOFS);
+
+	/* it doesn't need to set crypto context for superblock update */
+	bio->bi_iter.bi_sector = SECTOR_FROM_BLOCK(folio_index(folio));
+
+	if (!bio_add_folio(bio, folio, folio_size(folio), 0))
+		f2fs_bug_on(sbi, 1);
+
+	ret = submit_bio_wait(bio);
+	folio_end_writeback(folio);
+
+	return ret;
 }
 
 static inline bool sanity_check_area_boundary(struct f2fs_sb_info *sbi,
-					struct buffer_head *bh)
+				struct folio *folio, pgoff_t index)
 {
-	struct f2fs_super_block *raw_super = (struct f2fs_super_block *)
-		(bh->b_data + F2FS_SUPER_OFFSET);
+	struct f2fs_super_block *raw_super = F2FS_SUPER_BLOCK(folio, index);
 	struct super_block *sb = sbi->sb;
 	u32 segment0_blkaddr = le32_to_cpu(raw_super->segment0_blkaddr);
 	u32 cp_blkaddr = le32_to_cpu(raw_super->cp_blkaddr);
···
 	u32 segment_count = le32_to_cpu(raw_super->segment_count);
 	u32 log_blocks_per_seg = le32_to_cpu(raw_super->log_blocks_per_seg);
 	u64 main_end_blkaddr = main_blkaddr +
-		(segment_count_main << log_blocks_per_seg);
+		((u64)segment_count_main << log_blocks_per_seg);
 	u64 seg_end_blkaddr = segment0_blkaddr +
-		(segment_count << log_blocks_per_seg);
+		((u64)segment_count << log_blocks_per_seg);
 
 	if (segment0_blkaddr != cp_blkaddr) {
 		f2fs_info(sbi, "Mismatch start address, segment0(%u) cp_blkaddr(%u)",
···
 		set_sbi_flag(sbi, SBI_NEED_SB_WRITE);
 		res = "internally";
 	} else {
-		err = __f2fs_commit_super(bh, NULL);
+		err = __f2fs_commit_super(sbi, folio, index, false);
 		res = err ? "failed" : "done";
 	}
 	f2fs_info(sbi, "Fix alignment : %s, start(%u) end(%llu) block(%u)",
···
 }
 
 static int sanity_check_raw_super(struct f2fs_sb_info *sbi,
-					struct buffer_head *bh)
+				struct folio *folio, pgoff_t index)
 {
 	block_t segment_count, segs_per_sec, secs_per_zone, segment_count_main;
 	block_t total_sections, blocks_per_seg;
-	struct f2fs_super_block *raw_super = (struct f2fs_super_block *)
-		(bh->b_data + F2FS_SUPER_OFFSET);
+	struct f2fs_super_block *raw_super = F2FS_SUPER_BLOCK(folio, index);
 	size_t crc_offset = 0;
 	__u32 crc = 0;
···
 	}
 
 	/* check CP/SIT/NAT/SSA/MAIN_AREA area boundary */
-	if (sanity_check_area_boundary(sbi, bh))
+	if (sanity_check_area_boundary(sbi, folio, index))
 		return -EFSCORRUPTED;
 
 	return 0;
···
 	sbi->next_victim_seg[FG_GC] = NULL_SEGNO;
 	sbi->max_victim_search = DEF_MAX_VICTIM_SEARCH;
 	sbi->migration_granularity = SEGS_PER_SEC(sbi);
+	sbi->migration_window_granularity = f2fs_sb_has_blkzoned(sbi) ?
+		DEF_MIGRATION_WINDOW_GRANULARITY_ZONED : SEGS_PER_SEC(sbi);
 	sbi->seq_file_ra_mul = MIN_RA_MUL;
 	sbi->max_fragment_chunk = DEF_FRAGMENT_SIZE;
 	sbi->max_fragment_hole = DEF_FRAGMENT_SIZE;
···
 {
 	struct super_block *sb = sbi->sb;
 	int block;
-	struct buffer_head *bh;
+	struct folio *folio;
 	struct f2fs_super_block *super;
 	int err = 0;
···
 		return -ENOMEM;
 
 	for (block = 0; block < 2; block++) {
-		bh = sb_bread(sb, block);
-		if (!bh) {
+		folio = read_mapping_folio(sb->s_bdev->bd_mapping, block, NULL);
+		if (IS_ERR(folio)) {
 			f2fs_err(sbi, "Unable to read %dth superblock",
 				 block + 1);
-			err = -EIO;
+			err = PTR_ERR(folio);
 			*recovery = 1;
 			continue;
 		}
 
 		/* sanity checking of raw super */
-		err = sanity_check_raw_super(sbi, bh);
+		err = sanity_check_raw_super(sbi, folio, block);
 		if (err) {
 			f2fs_err(sbi, "Can't find valid F2FS filesystem in %dth superblock",
 				 block + 1);
-			brelse(bh);
+			folio_put(folio);
 			*recovery = 1;
 			continue;
 		}
 
 		if (!*raw_super) {
-			memcpy(super, bh->b_data + F2FS_SUPER_OFFSET,
+			memcpy(super, F2FS_SUPER_BLOCK(folio, block),
 				sizeof(*super));
 			*valid_super_block = block;
 			*raw_super = super;
 		}
-		brelse(bh);
+		folio_put(folio);
 	}
 
 	/* No valid superblock */
···
 int f2fs_commit_super(struct f2fs_sb_info *sbi, bool recover)
 {
-	struct buffer_head *bh;
+	struct folio *folio;
+	pgoff_t index;
 	__u32 crc = 0;
 	int err;
···
 	}
 
 	/* write back-up superblock first */
-	bh = sb_bread(sbi->sb, sbi->valid_super_block ? 0 : 1);
-	if (!bh)
-		return -EIO;
-	err = __f2fs_commit_super(bh, F2FS_RAW_SUPER(sbi));
-	brelse(bh);
+	index = sbi->valid_super_block ? 0 : 1;
+	folio = read_mapping_folio(sbi->sb->s_bdev->bd_mapping, index, NULL);
+	if (IS_ERR(folio))
+		return PTR_ERR(folio);
+	err = __f2fs_commit_super(sbi, folio, index, true);
+	folio_put(folio);
 
 	/* if we are in recovery path, skip writing valid superblock */
 	if (recover || err)
 		return err;
 
 	/* write current valid superblock */
-	bh = sb_bread(sbi->sb, sbi->valid_super_block);
-	if (!bh)
-		return -EIO;
-	err = __f2fs_commit_super(bh, F2FS_RAW_SUPER(sbi));
-	brelse(bh);
+	index = sbi->valid_super_block;
+	folio = read_mapping_folio(sbi->sb->s_bdev->bd_mapping, index, NULL);
+	if (IS_ERR(folio))
+		return PTR_ERR(folio);
+	err = __f2fs_commit_super(sbi, folio, index, true);
+	folio_put(folio);
 	return err;
 }
···
 	}
 
 	f2fs_warn(sbi, "Remounting filesystem read-only");
+
 	/*
-	 * Make sure updated value of ->s_mount_flags will be visible before
-	 * ->s_flags update
+	 * We have already set CP_ERROR_FLAG flag to stop all updates
+	 * to filesystem, so it doesn't need to set SB_RDONLY flag here
+	 * because the flag should be set covered w/ sb->s_umount semaphore
+	 * via remount procedure, otherwise, it will confuse code like
+	 * freeze_super() which will lead to deadlocks and other problems.
 	 */
-	smp_wmb();
-	sb->s_flags |= SB_RDONLY;
 }
 
 static void f2fs_record_error_work(struct work_struct *work)
···
 	sbi->aligned_blksize = true;
 #ifdef CONFIG_BLK_DEV_ZONED
 	sbi->max_open_zones = UINT_MAX;
+	sbi->blkzone_alloc_policy = BLKZONE_ALLOC_PRIOR_SEQ;
 #endif
 
 	for (i = 0; i < max_devices; i++) {
fs/f2fs/sysfs.c (+57 -25)
···
 			SM_I(sbi)->dcc_info->undiscard_blks);
 }
 
+static ssize_t atgc_enabled_show(struct f2fs_attr *a,
+		struct f2fs_sb_info *sbi, char *buf)
+{
+	return sysfs_emit(buf, "%d\n", sbi->am.atgc_enabled ? 1 : 0);
+}
+
 static ssize_t gc_mode_show(struct f2fs_attr *a,
 		struct f2fs_sb_info *sbi, char *buf)
 {
···
 	int len = 0;
 
 	if (f2fs_sb_has_encrypt(sbi))
-		len += scnprintf(buf, PAGE_SIZE - len, "%s",
+		len += sysfs_emit_at(buf, len, "%s",
				"encryption");
 	if (f2fs_sb_has_blkzoned(sbi))
-		len += scnprintf(buf + len, PAGE_SIZE - len, "%s%s",
+		len += sysfs_emit_at(buf, len, "%s%s",
				len ? ", " : "", "blkzoned");
 	if (f2fs_sb_has_extra_attr(sbi))
-		len += scnprintf(buf + len, PAGE_SIZE - len, "%s%s",
+		len += sysfs_emit_at(buf, len, "%s%s",
				len ? ", " : "", "extra_attr");
 	if (f2fs_sb_has_project_quota(sbi))
-		len += scnprintf(buf + len, PAGE_SIZE - len, "%s%s",
+		len += sysfs_emit_at(buf, len, "%s%s",
				len ? ", " : "", "projquota");
 	if (f2fs_sb_has_inode_chksum(sbi))
-		len += scnprintf(buf + len, PAGE_SIZE - len, "%s%s",
+		len += sysfs_emit_at(buf, len, "%s%s",
				len ? ", " : "", "inode_checksum");
 	if (f2fs_sb_has_flexible_inline_xattr(sbi))
-		len += scnprintf(buf + len, PAGE_SIZE - len, "%s%s",
+		len += sysfs_emit_at(buf, len, "%s%s",
				len ? ", " : "", "flexible_inline_xattr");
 	if (f2fs_sb_has_quota_ino(sbi))
-		len += scnprintf(buf + len, PAGE_SIZE - len, "%s%s",
+		len += sysfs_emit_at(buf, len, "%s%s",
				len ? ", " : "", "quota_ino");
 	if (f2fs_sb_has_inode_crtime(sbi))
-		len += scnprintf(buf + len, PAGE_SIZE - len, "%s%s",
+		len += sysfs_emit_at(buf, len, "%s%s",
				len ? ", " : "", "inode_crtime");
 	if (f2fs_sb_has_lost_found(sbi))
-		len += scnprintf(buf + len, PAGE_SIZE - len, "%s%s",
+		len += sysfs_emit_at(buf, len, "%s%s",
				len ? ", " : "", "lost_found");
 	if (f2fs_sb_has_verity(sbi))
-		len += scnprintf(buf + len, PAGE_SIZE - len, "%s%s",
+		len += sysfs_emit_at(buf, len, "%s%s",
				len ? ", " : "", "verity");
 	if (f2fs_sb_has_sb_chksum(sbi))
-		len += scnprintf(buf + len, PAGE_SIZE - len, "%s%s",
+		len += sysfs_emit_at(buf, len, "%s%s",
				len ? ", " : "", "sb_checksum");
 	if (f2fs_sb_has_casefold(sbi))
-		len += scnprintf(buf + len, PAGE_SIZE - len, "%s%s",
+		len += sysfs_emit_at(buf, len, "%s%s",
				len ? ", " : "", "casefold");
 	if (f2fs_sb_has_readonly(sbi))
-		len += scnprintf(buf + len, PAGE_SIZE - len, "%s%s",
+		len += sysfs_emit_at(buf, len, "%s%s",
				len ? ", " : "", "readonly");
 	if (f2fs_sb_has_compression(sbi))
-		len += scnprintf(buf + len, PAGE_SIZE - len, "%s%s",
+		len += sysfs_emit_at(buf, len, "%s%s",
				len ? ", " : "", "compression");
-	len += scnprintf(buf + len, PAGE_SIZE - len, "%s%s",
+	len += sysfs_emit_at(buf, len, "%s%s",
			len ? ", " : "", "pin_file");
-	len += scnprintf(buf + len, PAGE_SIZE - len, "\n");
+	len += sysfs_emit_at(buf, len, "\n");
 	return len;
 }
···
 	int hot_count = sbi->raw_super->hot_ext_count;
 	int len = 0, i;
 
-	len += scnprintf(buf + len, PAGE_SIZE - len,
-						"cold file extension:\n");
+	len += sysfs_emit_at(buf, len, "cold file extension:\n");
 	for (i = 0; i < cold_count; i++)
-		len += scnprintf(buf + len, PAGE_SIZE - len, "%s\n",
-								extlist[i]);
+		len += sysfs_emit_at(buf, len, "%s\n", extlist[i]);
 
-	len += scnprintf(buf + len, PAGE_SIZE - len,
-						"hot file extension:\n");
+	len += sysfs_emit_at(buf, len, "hot file extension:\n");
 	for (i = cold_count; i < cold_count + hot_count; i++)
-		len += scnprintf(buf + len, PAGE_SIZE - len, "%s\n",
-								extlist[i]);
+		len += sysfs_emit_at(buf, len, "%s\n", extlist[i]);
+
 	return len;
 }
···
 		return -EINVAL;
 	}
 
+	if (!strcmp(a->attr.name, "migration_window_granularity")) {
+		if (t == 0 || t > SEGS_PER_SEC(sbi))
+			return -EINVAL;
+	}
+
 	if (!strcmp(a->attr.name, "gc_urgent")) {
 		if (t == 0) {
 			sbi->gc_mode = GC_NORMAL;
···
 		spin_lock_irq(&sbi->iostat_lock);
 		sbi->iostat_period_ms = (unsigned int)t;
 		spin_unlock_irq(&sbi->iostat_lock);
 		return count;
 	}
 #endif
 
+#ifdef CONFIG_BLK_DEV_ZONED
+	if (!strcmp(a->attr.name, "blkzone_alloc_policy")) {
+		if (t < BLKZONE_ALLOC_PRIOR_SEQ || t > BLKZONE_ALLOC_PRIOR_CONV)
+			return -EINVAL;
+		sbi->blkzone_alloc_policy = t;
+		return count;
+	}
+#endif
···
 	if (!strcmp(a->attr.name, "ipu_policy")) {
 		if (t >= BIT(F2FS_IPU_MAX))
 			return -EINVAL;
-		if (t && f2fs_lfs_mode(sbi))
+		/* allow F2FS_IPU_NOCACHE only for IPU in the pinned file */
+		if (f2fs_lfs_mode(sbi) && (t & ~BIT(F2FS_IPU_NOCACHE)))
 			return -EINVAL;
 		SM_I(sbi)->ipu_policy = (unsigned int)t;
 		return count;
···
 GC_THREAD_RW_ATTR(gc_min_sleep_time, min_sleep_time);
 GC_THREAD_RW_ATTR(gc_max_sleep_time, max_sleep_time);
 GC_THREAD_RW_ATTR(gc_no_gc_sleep_time, no_gc_sleep_time);
+GC_THREAD_RW_ATTR(gc_no_zoned_gc_percent, no_zoned_gc_percent);
+GC_THREAD_RW_ATTR(gc_boost_zoned_gc_percent, boost_zoned_gc_percent);
+GC_THREAD_RW_ATTR(gc_valid_thresh_ratio, valid_thresh_ratio);
 
 /* SM_INFO ATTR */
 SM_INFO_RW_ATTR(reclaim_segments, rec_prefree_segments);
···
 SM_INFO_GENERAL_RW_ATTR(min_seq_blocks);
 SM_INFO_GENERAL_RW_ATTR(min_hot_blocks);
 SM_INFO_GENERAL_RW_ATTR(min_ssr_sections);
+SM_INFO_GENERAL_RW_ATTR(reserved_segments);
 
 /* DCC_INFO ATTR */
 DCC_INFO_RW_ATTR(max_small_discards, max_discards);
···
 F2FS_SBI_RW_ATTR(gc_reclaimed_segments, gc_reclaimed_segs);
 F2FS_SBI_GENERAL_RW_ATTR(max_victim_search);
 F2FS_SBI_GENERAL_RW_ATTR(migration_granularity);
+F2FS_SBI_GENERAL_RW_ATTR(migration_window_granularity);
 F2FS_SBI_GENERAL_RW_ATTR(dir_level);
 #ifdef CONFIG_F2FS_IOSTAT
 F2FS_SBI_GENERAL_RW_ATTR(iostat_enable);
···
 F2FS_SBI_GENERAL_RW_ATTR(last_age_weight);
 #ifdef CONFIG_BLK_DEV_ZONED
 F2FS_SBI_GENERAL_RO_ATTR(unusable_blocks_per_sec);
+F2FS_SBI_GENERAL_RW_ATTR(blkzone_alloc_policy);
 #endif
 
 /* STAT_INFO ATTR */
···
 F2FS_GENERAL_RO_ATTR(mounted_time_sec);
 F2FS_GENERAL_RO_ATTR(main_blkaddr);
 F2FS_GENERAL_RO_ATTR(pending_discard);
+F2FS_GENERAL_RO_ATTR(atgc_enabled);
 F2FS_GENERAL_RO_ATTR(gc_mode);
 #ifdef CONFIG_F2FS_STAT_FS
 F2FS_GENERAL_RO_ATTR(moved_blocks_background);
···
 	ATTR_LIST(gc_min_sleep_time),
 	ATTR_LIST(gc_max_sleep_time),
 	ATTR_LIST(gc_no_gc_sleep_time),
+	ATTR_LIST(gc_no_zoned_gc_percent),
+	ATTR_LIST(gc_boost_zoned_gc_percent),
+	ATTR_LIST(gc_valid_thresh_ratio),
 	ATTR_LIST(gc_idle),
 	ATTR_LIST(gc_urgent),
 	ATTR_LIST(reclaim_segments),
···
 	ATTR_LIST(min_seq_blocks),
 	ATTR_LIST(min_hot_blocks),
 	ATTR_LIST(min_ssr_sections),
+	ATTR_LIST(reserved_segments),
 	ATTR_LIST(max_victim_search),
 	ATTR_LIST(migration_granularity),
+	ATTR_LIST(migration_window_granularity),
 	ATTR_LIST(dir_level),
 	ATTR_LIST(ram_thresh),
 	ATTR_LIST(ra_nid_pages),
···
 #endif
 #ifdef CONFIG_BLK_DEV_ZONED
 	ATTR_LIST(unusable_blocks_per_sec),
+	ATTR_LIST(blkzone_alloc_policy),
 #endif
 #ifdef CONFIG_F2FS_FS_COMPRESSION
 	ATTR_LIST(compr_written_block),
···
 	ATTR_LIST(atgc_candidate_count),
 	ATTR_LIST(atgc_age_weight),
 	ATTR_LIST(atgc_age_threshold),
+	ATTR_LIST(atgc_enabled),
 	ATTR_LIST(seq_file_ra_mul),
 	ATTR_LIST(gc_segment_mode),
 	ATTR_LIST(gc_reclaimed_segments),
fs/f2fs/verity.c (+3 -2)
···
 	struct address_space *mapping = inode->i_mapping;
 	const struct address_space_operations *aops = mapping->a_ops;
 
-	if (pos + count > inode->i_sb->s_maxbytes)
+	if (pos + count > F2FS_BLK_TO_BYTES(max_file_blocks(inode)))
 		return -EFBIG;
 
 	while (count) {
···
 	pos = le64_to_cpu(dloc.pos);
 
 	/* Get the descriptor */
-	if (pos + size < pos || pos + size > inode->i_sb->s_maxbytes ||
+	if (pos + size < pos ||
+	    pos + size > F2FS_BLK_TO_BYTES(max_file_blocks(inode)) ||
 	    pos < f2fs_verity_metadata_pos(inode) || size > INT_MAX) {
 		f2fs_warn(F2FS_I_SB(inode), "invalid verity xattr");
 		f2fs_handle_error(F2FS_I_SB(inode),
fs/f2fs/xattr.c (+12 -2)
···
 		const char *name, const void *value, size_t size,
 		struct page *ipage, int flags)
 {
+	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
 	struct f2fs_xattr_entry *here, *last;
 	void *base_addr, *last_base_addr;
 	int found, newsize;
···
 	if (index == F2FS_XATTR_INDEX_ENCRYPTION &&
 			!strcmp(name, F2FS_XATTR_NAME_ENCRYPTION_CONTEXT))
 		f2fs_set_encrypted_inode(inode);
-	if (S_ISDIR(inode->i_mode))
-		set_sbi_flag(F2FS_I_SB(inode), SBI_NEED_CP);
 
+	if (!S_ISDIR(inode->i_mode))
+		goto same;
+	/*
+	 * In restrict mode, fsync() always try to trigger checkpoint for all
+	 * metadata consistency, in other mode, it triggers checkpoint when
+	 * parent's xattr metadata was updated.
+	 */
+	if (F2FS_OPTION(sbi).fsync_mode == FSYNC_MODE_STRICT)
+		set_sbi_flag(sbi, SBI_NEED_CP);
+	else
+		f2fs_add_ino_entry(sbi, inode->i_ino, XATTR_DIR_INO);
 same:
 	if (is_inode_flag_set(inode, FI_ACL_MODE)) {
 		inode->i_mode = F2FS_I(inode)->i_acl_mode;
include/linux/f2fs_fs.h (+2 -2)
···
 #define F2FS_BLKSIZE_BITS	PAGE_SHIFT	/* bits for F2FS_BLKSIZE */
 #define F2FS_MAX_EXTENSION	64	/* # of extension entries */
 #define F2FS_EXTENSION_LEN	8	/* max size of extension */
-#define F2FS_BLK_ALIGN(x)	(((x) + F2FS_BLKSIZE - 1) >> F2FS_BLKSIZE_BITS)
 
 #define NULL_ADDR		((block_t)0)	/* used as block_t addresses */
 #define NEW_ADDR		((block_t)-1)	/* used as block_t addresses */
···
 #define F2FS_BYTES_TO_BLK(bytes)	((bytes) >> F2FS_BLKSIZE_BITS)
 #define F2FS_BLK_TO_BYTES(blk)		((blk) << F2FS_BLKSIZE_BITS)
 #define F2FS_BLK_END_BYTES(blk)	(F2FS_BLK_TO_BYTES(blk + 1) - 1)
+#define F2FS_BLK_ALIGN(x)	(F2FS_BYTES_TO_BLK((x) + F2FS_BLKSIZE - 1))
 
 /* 0, 1(node nid), 2(meta nid) are reserved node id */
 #define F2FS_RESERVED_NODE_NUM		3
···
 #define F2FS_INLINE_DATA	0x02	/* file inline data flag */
 #define F2FS_INLINE_DENTRY	0x04	/* file inline dentry flag */
 #define F2FS_DATA_EXIST		0x08	/* file inline data exist flag */
-#define F2FS_INLINE_DOTS	0x10	/* file having implicit dot dentries */
+#define F2FS_INLINE_DOTS	0x10	/* file having implicit dot dentries (obsolete) */
 #define F2FS_EXTRA_ATTR		0x20	/* file having extra attribute */
 #define F2FS_PIN_FILE		0x40	/* file should not be gced */
 #define F2FS_COMPRESS_RELEASED	0x80	/* file released compressed blocks */
include/trace/events/f2fs.h (+2 -1)
···
 		{ CP_NODE_NEED_CP,	"node needs cp" },		\
 		{ CP_FASTBOOT_MODE,	"fastboot mode" },		\
 		{ CP_SPEC_LOG_NUM,	"log type is 2" },		\
-		{ CP_RECOVER_DIR,	"dir needs recovery" })
+		{ CP_RECOVER_DIR,	"dir needs recovery" },		\
+		{ CP_XATTR_DIR,		"dir's xattr updated" })
 
 #define show_shutdown_mode(type)					\
 	__print_symbolic(type,						\