Merge tag 'f2fs-for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
"In this round, we've mostly tuned f2fs to provide better user
experience for Android. Especially, we've worked on atomic write
feature again with SQLite community in order to support it officially.
And we added or modified several facilities to analyze and enhance IO
behaviors.

Major changes include:
- add app/fs io stat
- add inode checksum feature
- support project/journalled quota
- enhance atomic write with a new ioctl() that exposes the feature set (see the example below)
- enhance background gc/discard/fstrim flows with new gc_urgent mode
- add F2FS_IOC_FS{GET,SET}XATTR
- fix some quota flows"
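
As a concrete illustration of the feature-set ioctl mentioned above, here is a
minimal userspace sketch (not part of this merge; F2FS_IOCTL_MAGIC is 0xf5 in
mainline, the request and flag values mirror the fs/f2fs/f2fs.h hunk below,
and the path argument is a placeholder):

	#include <stdio.h>
	#include <fcntl.h>
	#include <unistd.h>
	#include <sys/ioctl.h>
	#include <linux/types.h>

	#define F2FS_IOCTL_MAGIC		0xf5
	#define F2FS_IOC_GET_FEATURES		_IOR(F2FS_IOCTL_MAGIC, 12, __u32)
	#define F2FS_FEATURE_ATOMIC_WRITE	0x0004

	int main(int argc, char **argv)
	{
		__u32 features = 0;
		int fd = open(argv[1], O_RDONLY);	/* any file on the f2fs mount */

		if (fd < 0 || ioctl(fd, F2FS_IOC_GET_FEATURES, &features) < 0) {
			perror("F2FS_IOC_GET_FEATURES");
			return 1;
		}
		printf("atomic write %ssupported\n",
			(features & F2FS_FEATURE_ATOMIC_WRITE) ? "" : "not ");
		close(fd);
		return 0;
	}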

* tag 'f2fs-for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (63 commits)
f2fs: hurry up to issue discard after io interruption
f2fs: fix to show correct discard_granularity in sysfs
f2fs: detect dirty inode in evict_inode
f2fs: clear radix tree dirty tag of pages whose dirty flag is cleared
f2fs: speed up gc_urgent mode with SSR
f2fs: better to wait for fstrim completion
f2fs: avoid race in between read xattr & write xattr
f2fs: make get_lock_data_page to handle encrypted inode
f2fs: use generic terms used for encrypted block management
f2fs: introduce f2fs_encrypted_file for clean-up
Revert "f2fs: add a new function get_ssr_cost"
f2fs: constify super_operations
f2fs: fix to wake up all sleeping flusher
f2fs: avoid race in between atomic_read & atomic_inc
f2fs: remove unneeded parameter of change_curseg
f2fs: update i_flags correctly
f2fs: don't check inode's checksum if it was dirtied or writebacked
f2fs: don't need to update inode checksum for recovery
f2fs: trigger fdatasync for non-atomic_write file
f2fs: fix to avoid race in between aio and gc
...

+2230 -513
+21
Documentation/ABI/testing/sysfs-fs-f2fs
···
 Description:
	 Controls the issue rate of small discard commands.
 
+What:		/sys/fs/f2fs/<disk>/discard_granularity
+Date:		July 2017
+Contact:	"Chao Yu" <yuchao0@huawei.com>
+Description:
+	 Controls the discard granularity of the inner discard thread; the
+	 thread will not issue discards smaller than the granularity.
+	 The unit size is one block, and it currently supports values in
+	 the range [1, 512].
+
 What:		/sys/fs/f2fs/<disk>/max_victim_search
 Date:		January 2014
 Contact:	"Jaegeuk Kim" <jaegeuk.kim@samsung.com>
···
 Contact:	"Chao Yu" <yuchao0@huawei.com>
 Description:
	 Controls current reserved blocks in system.
+
+What:		/sys/fs/f2fs/<disk>/gc_urgent
+Date:		August 2017
+Contact:	"Jaegeuk Kim" <jaegeuk@kernel.org>
+Description:
+	 Do background GC aggressively
+
+What:		/sys/fs/f2fs/<disk>/gc_urgent_sleep_time
+Date:		August 2017
+Contact:	"Jaegeuk Kim" <jaegeuk@kernel.org>
+Description:
+	 Controls sleep time of GC urgent mode
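
To see how the two new GC knobs compose, here is a small sketch that puts one
device into urgent-GC mode and back; "sda1" is a placeholder for the <disk>
name above, and the 50 ms value is only an example:

	#include <stdio.h>

	static void sysfs_write(const char *path, const char *val)
	{
		FILE *f = fopen(path, "w");

		if (f) {
			fputs(val, f);
			fclose(f);
		}
	}

	int main(void)
	{
		/* wake the GC thread every 50 ms while urgent mode is on */
		sysfs_write("/sys/fs/f2fs/sda1/gc_urgent_sleep_time", "50");
		sysfs_write("/sys/fs/f2fs/sda1/gc_urgent", "1");
		/* ... let cleaning catch up, then restore normal behavior ... */
		sysfs_write("/sys/fs/f2fs/sda1/gc_urgent", "0");
		return 0;
	}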
+19
Documentation/filesystems/f2fs.txt
···
 			  with "mode=lfs".
 usrquota		  Enable plain user disk quota accounting.
 grpquota		  Enable plain group disk quota accounting.
+prjquota		  Enable plain project quota accounting.
+usrjquota=<file>	  Appoint specified file and type during mount, so that quota
+grpjquota=<file>	  information can be properly updated during recovery flow,
+prjjquota=<file>	  <quota file>: must be in root directory;
+jqfmt=<quota type>	  <quota type>: [vfsold,vfsv0,vfsv1].
+offusrjquota		  Turn off user journalled quota.
+offgrpjquota		  Turn off group journalled quota.
+offprjjquota		  Turn off project journalled quota.
+quota			  Enable plain user disk quota accounting.
+noquota		  Disable all plain disk quota options.
 
 ================================================================================
 DEBUGFS ENTRIES
···
 			  (default) will disable this option. Setting
 			  gc_idle = 1 will select the Cost Benefit approach
 			  & setting gc_idle = 2 will select the greedy approach.
+
+gc_urgent		  This parameter controls whether background GCs are
+			  triggered urgently. Setting gc_urgent = 0 [default]
+			  restores the default behavior, while setting it to 1
+			  makes the background GC thread run at the given
+			  gc_urgent_sleep_time interval.
+
+gc_urgent_sleep_time	  This parameter controls the sleep time for gc_urgent.
+			  500 ms is set by default. See gc_urgent above.
 
 reclaim_segments	  This parameter controls the number of prefree
 			  segments to be reclaimed. If the number of prefree
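
For reference, wiring the new quota options into a mount(2) call might look
like the sketch below; the device, mountpoint, and quota file name are
placeholders (the quota file must live in the root directory, per the table
above):

	#include <sys/mount.h>

	int mount_f2fs_with_quota(void)
	{
		/* journalled user quota plus plain project quota */
		return mount("/dev/sdb1", "/mnt/f2fs", "f2fs", 0,
			     "usrjquota=aquota.user,jqfmt=vfsv1,prjquota");
	}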
+3 -2
fs/f2fs/acl.c
···
 	void *value = NULL;
 	size_t size = 0;
 	int error;
+	umode_t mode = inode->i_mode;
 
 	switch (type) {
 	case ACL_TYPE_ACCESS:
 		name_index = F2FS_XATTR_INDEX_POSIX_ACL_ACCESS;
 		if (acl && !ipage) {
-			error = posix_acl_update_mode(inode, &inode->i_mode, &acl);
+			error = posix_acl_update_mode(inode, &mode, &acl);
 			if (error)
 				return error;
-			set_acl_inode(inode, inode->i_mode);
+			set_acl_inode(inode, mode);
 		}
 		break;
 
+47 -13
fs/f2fs/checkpoint.c
···
 	ra_meta_pages(sbi, index, BIO_MAX_PAGES, META_POR, true);
 }
 
-static int f2fs_write_meta_page(struct page *page,
-				struct writeback_control *wbc)
+static int __f2fs_write_meta_page(struct page *page,
+				struct writeback_control *wbc,
+				enum iostat_type io_type)
 {
 	struct f2fs_sb_info *sbi = F2FS_P_SB(page);
 
···
 	if (unlikely(f2fs_cp_error(sbi)))
 		goto redirty_out;
 
-	write_meta_page(sbi, page);
+	write_meta_page(sbi, page, io_type);
 	dec_page_count(sbi, F2FS_DIRTY_META);
 
 	if (wbc->for_reclaim)
···
 redirty_out:
 	redirty_page_for_writepage(wbc, page);
 	return AOP_WRITEPAGE_ACTIVATE;
+}
+
+static int f2fs_write_meta_page(struct page *page,
+				struct writeback_control *wbc)
+{
+	return __f2fs_write_meta_page(page, wbc, FS_META_IO);
 }
 
 static int f2fs_write_meta_pages(struct address_space *mapping,
···
 
 	trace_f2fs_writepages(mapping->host, wbc, META);
 	diff = nr_pages_to_write(sbi, META, wbc);
-	written = sync_meta_pages(sbi, META, wbc->nr_to_write);
+	written = sync_meta_pages(sbi, META, wbc->nr_to_write, FS_META_IO);
 	mutex_unlock(&sbi->cp_mutex);
 	wbc->nr_to_write = max((long)0, wbc->nr_to_write - written - diff);
 	return 0;
···
 }
 
 long sync_meta_pages(struct f2fs_sb_info *sbi, enum page_type type,
-						long nr_to_write)
+				long nr_to_write, enum iostat_type io_type)
 {
 	struct address_space *mapping = META_MAPPING(sbi);
 	pgoff_t index = 0, end = ULONG_MAX, prev = ULONG_MAX;
···
 			if (!clear_page_dirty_for_io(page))
 				goto continue_unlock;
 
-			if (mapping->a_ops->writepage(page, &wbc)) {
+			if (__f2fs_write_meta_page(page, &wbc, io_type)) {
 				unlock_page(page);
 				break;
 			}
···
 int recover_orphan_inodes(struct f2fs_sb_info *sbi)
 {
 	block_t start_blk, orphan_blocks, i, j;
-	int err;
+	unsigned int s_flags = sbi->sb->s_flags;
+	int err = 0;
 
 	if (!is_set_ckpt_flags(sbi, CP_ORPHAN_PRESENT_FLAG))
 		return 0;
+
+	if (s_flags & MS_RDONLY) {
+		f2fs_msg(sbi->sb, KERN_INFO, "orphan cleanup on readonly fs");
+		sbi->sb->s_flags &= ~MS_RDONLY;
+	}
+
+#ifdef CONFIG_QUOTA
+	/* Needed for iput() to work correctly and not trash data */
+	sbi->sb->s_flags |= MS_ACTIVE;
+	/* Turn on quotas so that they are updated correctly */
+	f2fs_enable_quota_files(sbi);
+#endif
 
 	start_blk = __start_cp_addr(sbi) + 1 + __cp_payload(sbi);
 	orphan_blocks = __start_sum_addr(sbi) - 1 - __cp_payload(sbi);
···
 			err = recover_orphan_inode(sbi, ino);
 			if (err) {
 				f2fs_put_page(page, 1);
-				return err;
+				goto out;
 			}
 		}
 		f2fs_put_page(page, 1);
 	}
 	/* clear Orphan Flag */
 	clear_ckpt_flags(sbi, CP_ORPHAN_PRESENT_FLAG);
-	return 0;
+out:
+#ifdef CONFIG_QUOTA
+	/* Turn quotas off */
+	f2fs_quota_off_umount(sbi->sb);
+#endif
+	sbi->sb->s_flags = s_flags; /* Restore MS_RDONLY status */
+
+	return err;
 }
 
 static void write_orphan_inodes(struct f2fs_sb_info *sbi, block_t start_blk)
···
 	if (inode) {
 		unsigned long cur_ino = inode->i_ino;
 
+		if (is_dir)
+			F2FS_I(inode)->cp_task = current;
+
 		filemap_fdatawrite(inode->i_mapping);
+
+		if (is_dir)
+			F2FS_I(inode)->cp_task = NULL;
+
 		iput(inode);
 		/* We need to give cpu to another writers. */
 		if (ino == cur_ino) {
···
 
 	if (get_pages(sbi, F2FS_DIRTY_NODES)) {
 		up_write(&sbi->node_write);
-		err = sync_node_pages(sbi, &wbc);
+		err = sync_node_pages(sbi, &wbc, false, FS_CP_NODE_IO);
 		if (err) {
 			up_write(&sbi->node_change);
 			f2fs_unlock_all(sbi);
···
 
 	/* Flush all the NAT/SIT pages */
 	while (get_pages(sbi, F2FS_DIRTY_META)) {
-		sync_meta_pages(sbi, META, LONG_MAX);
+		sync_meta_pages(sbi, META, LONG_MAX, FS_CP_META_IO);
 		if (unlikely(f2fs_cp_error(sbi)))
 			return -EIO;
 	}
···
 
 	/* Flush all the NAT BITS pages */
 	while (get_pages(sbi, F2FS_DIRTY_META)) {
-		sync_meta_pages(sbi, META, LONG_MAX);
+		sync_meta_pages(sbi, META, LONG_MAX, FS_CP_META_IO);
 		if (unlikely(f2fs_cp_error(sbi)))
 			return -EIO;
 	}
···
 	percpu_counter_set(&sbi->alloc_valid_block_count, 0);
 
 	/* Here, we only have one bio having CP pack */
-	sync_meta_pages(sbi, META_FLUSH, LONG_MAX);
+	sync_meta_pages(sbi, META_FLUSH, LONG_MAX, FS_CP_META_IO);
 
 	/* wait for previous submitted meta pages writeback */
 	wait_on_all_pages_writeback(sbi);
+96 -81
fs/f2fs/data.c
···
 	return err;
 }
 
+static struct bio *f2fs_grab_read_bio(struct inode *inode, block_t blkaddr,
+							unsigned nr_pages)
+{
+	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+	struct fscrypt_ctx *ctx = NULL;
+	struct bio *bio;
+
+	if (f2fs_encrypted_file(inode)) {
+		ctx = fscrypt_get_ctx(inode, GFP_NOFS);
+		if (IS_ERR(ctx))
+			return ERR_CAST(ctx);
+
+		/* wait the page to be moved by cleaning */
+		f2fs_wait_on_block_writeback(sbi, blkaddr);
+	}
+
+	bio = bio_alloc(GFP_KERNEL, min_t(int, nr_pages, BIO_MAX_PAGES));
+	if (!bio) {
+		if (ctx)
+			fscrypt_release_ctx(ctx);
+		return ERR_PTR(-ENOMEM);
+	}
+	f2fs_target_device(sbi, blkaddr, bio);
+	bio->bi_end_io = f2fs_read_end_io;
+	bio->bi_private = ctx;
+	bio_set_op_attrs(bio, REQ_OP_READ, 0);
+
+	return bio;
+}
+
+/* This can handle encryption stuffs */
+static int f2fs_submit_page_read(struct inode *inode, struct page *page,
+							block_t blkaddr)
+{
+	struct bio *bio = f2fs_grab_read_bio(inode, blkaddr, 1);
+
+	if (IS_ERR(bio))
+		return PTR_ERR(bio);
+
+	if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) {
+		bio_put(bio);
+		return -EFAULT;
+	}
+	__submit_bio(F2FS_I_SB(inode), bio, DATA);
+	return 0;
+}
+
 static void __set_data_blkaddr(struct dnode_of_data *dn)
 {
 	struct f2fs_node *rn = F2FS_NODE(dn->node_page);
 	__le32 *addr_array;
+	int base = 0;
+
+	if (IS_INODE(dn->node_page) && f2fs_has_extra_attr(dn->inode))
+		base = get_extra_isize(dn->inode);
 
 	/* Get physical address of data block */
 	addr_array = blkaddr_in_node(rn);
-	addr_array[dn->ofs_in_node] = cpu_to_le32(dn->data_blkaddr);
+	addr_array[base + dn->ofs_in_node] = cpu_to_le32(dn->data_blkaddr);
 }
 
 /*
···
 	f2fs_wait_on_page_writeback(dn->node_page, NODE, true);
 
 	for (; count > 0; dn->ofs_in_node++) {
-		block_t blkaddr =
-			datablock_addr(dn->node_page, dn->ofs_in_node);
+		block_t blkaddr = datablock_addr(dn->inode,
+					dn->node_page, dn->ofs_in_node);
 		if (blkaddr == NULL_ADDR) {
 			dn->data_blkaddr = NEW_ADDR;
 			__set_data_blkaddr(dn);
···
 	struct page *page;
 	struct extent_info ei = {0,0,0};
 	int err;
-	struct f2fs_io_info fio = {
-		.sbi = F2FS_I_SB(inode),
-		.type = DATA,
-		.op = REQ_OP_READ,
-		.op_flags = op_flags,
-		.encrypted_page = NULL,
-	};
-
-	if (f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode))
-		return read_mapping_page(mapping, index, NULL);
 
 	page = f2fs_grab_cache_page(mapping, index, for_write);
 	if (!page)
···
 		return page;
 	}
 
-	fio.new_blkaddr = fio.old_blkaddr = dn.data_blkaddr;
-	fio.page = page;
-	err = f2fs_submit_page_bio(&fio);
+	err = f2fs_submit_page_read(inode, page, dn.data_blkaddr);
 	if (err)
 		goto put_err;
 	return page;
···
 	if (unlikely(is_inode_flag_set(dn->inode, FI_NO_ALLOC)))
 		return -EPERM;
 
-	dn->data_blkaddr = datablock_addr(dn->node_page, dn->ofs_in_node);
+	dn->data_blkaddr = datablock_addr(dn->inode,
+				dn->node_page, dn->ofs_in_node);
 	if (dn->data_blkaddr == NEW_ADDR)
 		goto alloc;
 
···
 
 static inline bool __force_buffered_io(struct inode *inode, int rw)
 {
-	return ((f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode)) ||
+	return (f2fs_encrypted_file(inode) ||
 			(rw == WRITE && test_opt(F2FS_I_SB(inode), LFS)) ||
 			F2FS_I_SB(inode)->s_ndevs);
 }
···
 					F2FS_GET_BLOCK_PRE_AIO :
 					F2FS_GET_BLOCK_PRE_DIO);
 	}
-	if (iocb->ki_pos + iov_iter_count(from) > MAX_INLINE_DATA) {
+	if (iocb->ki_pos + iov_iter_count(from) > MAX_INLINE_DATA(inode)) {
 		err = f2fs_convert_inline_inode(inode);
 		if (err)
 			return err;
···
 	end_offset = ADDRS_PER_PAGE(dn.node_page, inode);
 
 next_block:
-	blkaddr = datablock_addr(dn.node_page, dn.ofs_in_node);
+	blkaddr = datablock_addr(dn.inode, dn.node_page, dn.ofs_in_node);
 
 	if (blkaddr == NEW_ADDR || blkaddr == NULL_ADDR) {
 		if (create) {
···
 			struct buffer_head *bh_result, int create)
 {
 	return __get_data_block(inode, iblock, bh_result, create,
-						F2FS_GET_BLOCK_DIO, NULL);
+						F2FS_GET_BLOCK_DEFAULT, NULL);
 }
 
 static int get_data_block_bmap(struct inode *inode, sector_t iblock,
···
 	return ret;
 }
 
-static struct bio *f2fs_grab_bio(struct inode *inode, block_t blkaddr,
-							unsigned nr_pages)
-{
-	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
-	struct fscrypt_ctx *ctx = NULL;
-	struct bio *bio;
-
-	if (f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode)) {
-		ctx = fscrypt_get_ctx(inode, GFP_NOFS);
-		if (IS_ERR(ctx))
-			return ERR_CAST(ctx);
-
-		/* wait the page to be moved by cleaning */
-		f2fs_wait_on_encrypted_page_writeback(sbi, blkaddr);
-	}
-
-	bio = bio_alloc(GFP_KERNEL, min_t(int, nr_pages, BIO_MAX_PAGES));
-	if (!bio) {
-		if (ctx)
-			fscrypt_release_ctx(ctx);
-		return ERR_PTR(-ENOMEM);
-	}
-	f2fs_target_device(sbi, blkaddr, bio);
-	bio->bi_end_io = f2fs_read_end_io;
-	bio->bi_private = ctx;
-
-	return bio;
-}
-
 /*
  * This function was originally taken from fs/mpage.c, and customized for f2fs.
  * Major change was from block_size == page_size in f2fs by default.
···
 			map.m_len = last_block - block_in_file;
 
 			if (f2fs_map_blocks(inode, &map, 0,
-						F2FS_GET_BLOCK_READ))
+						F2FS_GET_BLOCK_DEFAULT))
 				goto set_error_page;
 		}
 got_it:
···
 			bio = NULL;
 		}
 		if (bio == NULL) {
-			bio = f2fs_grab_bio(inode, block_nr, nr_pages);
+			bio = f2fs_grab_read_bio(inode, block_nr, nr_pages);
 			if (IS_ERR(bio)) {
 				bio = NULL;
 				goto set_error_page;
 			}
-			bio_set_op_attrs(bio, REQ_OP_READ, 0);
 		}
 
 		if (bio_add_page(bio, page, blocksize, 0) < blocksize)
···
 	struct inode *inode = fio->page->mapping->host;
 	gfp_t gfp_flags = GFP_NOFS;
 
-	if (!f2fs_encrypted_inode(inode) || !S_ISREG(inode->i_mode))
+	if (!f2fs_encrypted_file(inode))
 		return 0;
 
 	/* wait for GCed encrypted page writeback */
-	f2fs_wait_on_encrypted_page_writeback(fio->sbi, fio->old_blkaddr);
+	f2fs_wait_on_block_writeback(fio->sbi, fio->old_blkaddr);
 
 retry_encrypt:
 	fio->encrypted_page = fscrypt_encrypt_page(inode, fio->page,
···
 }
 
 static int __write_data_page(struct page *page, bool *submitted,
-				struct writeback_control *wbc)
+				struct writeback_control *wbc,
+				enum iostat_type io_type)
 {
 	struct inode *inode = page->mapping->host;
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
···
 		.encrypted_page = NULL,
 		.submitted = false,
 		.need_lock = LOCK_RETRY,
+		.io_type = io_type,
 	};
 
 	trace_f2fs_writepage(page, DATA);
···
 static int f2fs_write_data_page(struct page *page,
 					struct writeback_control *wbc)
 {
-	return __write_data_page(page, NULL, wbc);
+	return __write_data_page(page, NULL, wbc, FS_DATA_IO);
 }
 
 /*
···
  * warm/hot data page.
  */
 static int f2fs_write_cache_pages(struct address_space *mapping,
-					struct writeback_control *wbc)
+					struct writeback_control *wbc,
+					enum iostat_type io_type)
 {
 	int ret = 0;
 	int done = 0;
···
 			if (!clear_page_dirty_for_io(page))
 				goto continue_unlock;
 
-			ret = __write_data_page(page, &submitted, wbc);
+			ret = __write_data_page(page, &submitted, wbc, io_type);
 			if (unlikely(ret)) {
 				/*
 				 * keep nr_to_write, since vfs uses this to
···
 	return ret;
 }
 
-static int f2fs_write_data_pages(struct address_space *mapping,
-			    struct writeback_control *wbc)
+int __f2fs_write_data_pages(struct address_space *mapping,
+			    struct writeback_control *wbc,
+			    enum iostat_type io_type)
 {
 	struct inode *inode = mapping->host;
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
···
 		goto skip_write;
 
 	blk_start_plug(&plug);
-	ret = f2fs_write_cache_pages(mapping, wbc);
+	ret = f2fs_write_cache_pages(mapping, wbc, io_type);
 	blk_finish_plug(&plug);
 
 	if (wbc->sync_mode == WB_SYNC_ALL)
···
 	wbc->pages_skipped += get_dirty_pages(inode);
 	trace_f2fs_writepages(mapping->host, wbc, DATA);
 	return 0;
+}
+
+static int f2fs_write_data_pages(struct address_space *mapping,
+			    struct writeback_control *wbc)
+{
+	struct inode *inode = mapping->host;
+
+	return __f2fs_write_data_pages(mapping, wbc,
+			F2FS_I(inode)->cp_task == current ?
+			FS_CP_DATA_IO : FS_DATA_IO);
 }
 
 static void f2fs_write_failed(struct address_space *mapping, loff_t to)
···
 	set_new_dnode(&dn, inode, ipage, ipage, 0);
 
 	if (f2fs_has_inline_data(inode)) {
-		if (pos + len <= MAX_INLINE_DATA) {
+		if (pos + len <= MAX_INLINE_DATA(inode)) {
 			read_inline_data(page, ipage);
 			set_inode_flag(inode, FI_DATA_EXIST);
 			if (inode->i_nlink)
···
 	f2fs_wait_on_page_writeback(page, DATA, false);
 
 	/* wait for GCed encrypted page writeback */
-	if (f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode))
-		f2fs_wait_on_encrypted_page_writeback(sbi, blkaddr);
+	if (f2fs_encrypted_file(inode))
+		f2fs_wait_on_block_writeback(sbi, blkaddr);
 
 	if (len == PAGE_SIZE || PageUptodate(page))
 		return 0;
···
 		zero_user_segment(page, 0, PAGE_SIZE);
 		SetPageUptodate(page);
 	} else {
-		struct bio *bio;
-
-		bio = f2fs_grab_bio(inode, blkaddr, 1);
-		if (IS_ERR(bio)) {
-			err = PTR_ERR(bio);
+		err = f2fs_submit_page_read(inode, page, blkaddr);
+		if (err)
 			goto fail;
-		}
-		bio->bi_opf = REQ_OP_READ;
-		if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) {
-			bio_put(bio);
-			err = -EFAULT;
-			goto fail;
-		}
-
-		__submit_bio(sbi, bio, DATA);
 
 		lock_page(page);
 		if (unlikely(page->mapping != mapping)) {
···
 	up_read(&F2FS_I(inode)->dio_rwsem[rw]);
 
 	if (rw == WRITE) {
-		if (err > 0)
+		if (err > 0) {
+			f2fs_update_iostat(F2FS_I_SB(inode), APP_DIRECT_IO,
+									err);
 			set_inode_flag(inode, FI_UPDATE_WRITE);
-		else if (err < 0)
+		} else if (err < 0) {
 			f2fs_write_failed(mapping, offset + count);
+		}
 	}
 
 	trace_f2fs_direct_IO_exit(inode, offset, count, rw, err);
+7
fs/f2fs/dir.c
···
 	struct f2fs_dentry_block *dentry_blk;
 	unsigned int bit_pos;
 	int slots = GET_DENTRY_SLOTS(le16_to_cpu(dentry->name_len));
+	struct address_space *mapping = page_mapping(page);
+	unsigned long flags;
 	int i;
 
 	f2fs_update_time(F2FS_I_SB(dir), REQ_TIME);
···
 
 	if (bit_pos == NR_DENTRY_IN_BLOCK &&
 		!truncate_hole(dir, page->index, page->index + 1)) {
+		spin_lock_irqsave(&mapping->tree_lock, flags);
+		radix_tree_tag_clear(&mapping->page_tree, page_index(page),
+				     PAGECACHE_TAG_DIRTY);
+		spin_unlock_irqrestore(&mapping->tree_lock, flags);
+
 		clear_page_dirty_for_io(page);
 		ClearPagePrivate(page);
 		ClearPageUptodate(page);
+253 -32
fs/f2fs/f2fs.h
···
 #define F2FS_MOUNT_LFS			0x00040000
 #define F2FS_MOUNT_USRQUOTA		0x00080000
 #define F2FS_MOUNT_GRPQUOTA		0x00100000
+#define F2FS_MOUNT_PRJQUOTA		0x00200000
+#define F2FS_MOUNT_QUOTA		0x00400000
 
 #define clear_opt(sbi, option)	((sbi)->mount_opt.opt &= ~F2FS_MOUNT_##option)
 #define set_opt(sbi, option)	((sbi)->mount_opt.opt |= F2FS_MOUNT_##option)
···
 	unsigned int opt;
 };
 
-#define F2FS_FEATURE_ENCRYPT	0x0001
-#define F2FS_FEATURE_BLKZONED	0x0002
+#define F2FS_FEATURE_ENCRYPT		0x0001
+#define F2FS_FEATURE_BLKZONED		0x0002
+#define F2FS_FEATURE_ATOMIC_WRITE	0x0004
+#define F2FS_FEATURE_EXTRA_ATTR		0x0008
+#define F2FS_FEATURE_PRJQUOTA		0x0010
+#define F2FS_FEATURE_INODE_CHKSUM	0x0020
 
 #define F2FS_HAS_FEATURE(sb, mask)					\
 	((F2FS_SB(sb)->raw_super->feature & cpu_to_le32(mask)) != 0)
···
 	(BATCHED_TRIM_SEGMENTS(sbi) << (sbi)->log_blocks_per_seg)
 #define MAX_DISCARD_BLOCKS(sbi)		BLKS_PER_SEC(sbi)
 #define DISCARD_ISSUE_RATE		8
+#define DEF_MIN_DISCARD_ISSUE_TIME	50	/* 50 ms, if exists */
+#define DEF_MAX_DISCARD_ISSUE_TIME	60000	/* 60 s, if no candidates */
 #define DEF_CP_INTERVAL			60	/* 60 secs */
 #define DEF_IDLE_INTERVAL		5	/* 5 secs */
 
···
 	unsigned char discard_map[SIT_VBLOCK_MAP_SIZE];	/* segment discard bitmap */
 };
 
+/* default discard granularity of inner discard thread, unit: block count */
+#define DEFAULT_DISCARD_GRANULARITY		16
+
 /* max discard pend list number */
 #define MAX_PLIST_NUM		512
 #define plist_idx(blk_num)	((blk_num) >= MAX_PLIST_NUM ?		\
 					(MAX_PLIST_NUM - 1) : (blk_num - 1))
+
+#define P_ACTIVE	0x01
+#define P_TRIM		0x02
+#define plist_issue(tag)	(((tag) & P_ACTIVE) || ((tag) & P_TRIM))
 
 enum {
 	D_PREP,
···
 	struct task_struct *f2fs_issue_discard;	/* discard thread */
 	struct list_head entry_list;		/* 4KB discard entry list */
 	struct list_head pend_list[MAX_PLIST_NUM];/* store pending entries */
+	unsigned char pend_list_tag[MAX_PLIST_NUM];/* tag for pending entries */
 	struct list_head wait_list;		/* store on-flushing entries */
 	wait_queue_head_t discard_wait_queue;	/* waiting queue for wake-up */
+	unsigned int discard_wake;		/* to wake up discard thread */
 	struct mutex cmd_lock;
 	unsigned int nr_discards;		/* # of discards in the list */
 	unsigned int max_discards;		/* max. discards to be issued */
+	unsigned int discard_granularity;	/* discard granularity */
 	unsigned int undiscard_blks;		/* # of undiscard blocks */
 	atomic_t issued_discard;		/* # of issued discard */
 	atomic_t issing_discard;		/* # of issing discard */
···
 					struct f2fs_flush_device)
 #define F2FS_IOC_GARBAGE_COLLECT_RANGE	_IOW(F2FS_IOCTL_MAGIC, 11,	\
 					struct f2fs_gc_range)
+#define F2FS_IOC_GET_FEATURES		_IOR(F2FS_IOCTL_MAGIC, 12, __u32)
 
 #define F2FS_IOC_SET_ENCRYPTION_POLICY	FS_IOC_SET_ENCRYPTION_POLICY
 #define F2FS_IOC_GET_ENCRYPTION_POLICY	FS_IOC_GET_ENCRYPTION_POLICY
···
 #define F2FS_IOC32_GETVERSION		FS_IOC32_GETVERSION
 #endif
 
+#define F2FS_IOC_FSGETXATTR		FS_IOC_FSGETXATTR
+#define F2FS_IOC_FSSETXATTR		FS_IOC_FSSETXATTR
+
 struct f2fs_gc_range {
 	u32 sync;
 	u64 start;
···
 	u32 segments;		/* # of segments to flush */
 };
 
+/* for inline stuff */
+#define DEF_INLINE_RESERVED_SIZE	1
+static inline int get_extra_isize(struct inode *inode);
+#define MAX_INLINE_DATA(inode)	(sizeof(__le32) *			\
+				(CUR_ADDRS_PER_INODE(inode) -		\
+				DEF_INLINE_RESERVED_SIZE -		\
+				F2FS_INLINE_XATTR_ADDRS))
+
+/* for inline dir */
+#define NR_INLINE_DENTRY(inode)	(MAX_INLINE_DATA(inode) * BITS_PER_BYTE / \
+				((SIZE_OF_DIR_ENTRY + F2FS_SLOT_LEN) * \
+				BITS_PER_BYTE + 1))
+#define INLINE_DENTRY_BITMAP_SIZE(inode)	((NR_INLINE_DENTRY(inode) + \
+					BITS_PER_BYTE - 1) / BITS_PER_BYTE)
+#define INLINE_RESERVED_SIZE(inode)	(MAX_INLINE_DATA(inode) - \
+				((SIZE_OF_DIR_ENTRY + F2FS_SLOT_LEN) * \
+				NR_INLINE_DENTRY(inode) + \
+				INLINE_DENTRY_BITMAP_SIZE(inode)))
+
 /*
  * For INODE and NODE manager
  */
 /* for directory operations */
 struct f2fs_dentry_ptr {
 	struct inode *inode;
-	const void *bitmap;
+	void *bitmap;
 	struct f2fs_dir_entry *dentry;
 	__u8 (*filename)[F2FS_SLOT_LEN];
 	int max;
+	int nr_bitmap;
 };
 
 static inline void make_dentry_ptr_block(struct inode *inode,
···
 {
 	d->inode = inode;
 	d->max = NR_DENTRY_IN_BLOCK;
+	d->nr_bitmap = SIZE_OF_DENTRY_BITMAP;
 	d->bitmap = &t->dentry_bitmap;
 	d->dentry = t->dentry;
 	d->filename = t->filename;
 }
 
 static inline void make_dentry_ptr_inline(struct inode *inode,
-		struct f2fs_dentry_ptr *d, struct f2fs_inline_dentry *t)
+					struct f2fs_dentry_ptr *d, void *t)
 {
+	int entry_cnt = NR_INLINE_DENTRY(inode);
+	int bitmap_size = INLINE_DENTRY_BITMAP_SIZE(inode);
+	int reserved_size = INLINE_RESERVED_SIZE(inode);
+
 	d->inode = inode;
-	d->max = NR_INLINE_DENTRY;
-	d->bitmap = &t->dentry_bitmap;
-	d->dentry = t->dentry;
-	d->filename = t->filename;
+	d->max = entry_cnt;
+	d->nr_bitmap = bitmap_size;
+	d->bitmap = t;
+	d->dentry = t + bitmap_size + reserved_size;
+	d->filename = t + bitmap_size + reserved_size +
+					SIZE_OF_DIR_ENTRY * entry_cnt;
 }
 
 /*
···
 };
 
 /* for flag in get_data_block */
-#define F2FS_GET_BLOCK_READ		0
-#define F2FS_GET_BLOCK_DIO		1
-#define F2FS_GET_BLOCK_FIEMAP		2
-#define F2FS_GET_BLOCK_BMAP		3
-#define F2FS_GET_BLOCK_PRE_DIO		4
-#define F2FS_GET_BLOCK_PRE_AIO		5
+enum {
+	F2FS_GET_BLOCK_DEFAULT,
+	F2FS_GET_BLOCK_FIEMAP,
+	F2FS_GET_BLOCK_BMAP,
+	F2FS_GET_BLOCK_PRE_DIO,
+	F2FS_GET_BLOCK_PRE_AIO,
+};
 
 /*
  * i_advise uses FADVISE_XXX_BIT. We can add additional hints later.
···
 	f2fs_hash_t chash;		/* hash value of given file name */
 	unsigned int clevel;		/* maximum level of given file name */
 	struct task_struct *task;	/* lookup and create consistency */
+	struct task_struct *cp_task;	/* separate cp/wb IO stats*/
 	nid_t i_xattr_nid;		/* node id that contains xattrs */
 	loff_t	last_disk_size;		/* lastly written file size */
 
···
 	struct list_head dirty_list;	/* dirty list for dirs and files */
 	struct list_head gdirty_list;	/* linked in global dirty list */
 	struct list_head inmem_pages;	/* inmemory pages managed by f2fs */
+	struct task_struct *inmem_task;	/* store inmemory task */
 	struct mutex inmem_lock;	/* lock for inmemory pages */
 	struct extent_tree *extent_tree;	/* cached extent_tree entry */
 	struct rw_semaphore dio_rwsem[2];/* avoid racing between dio and gc */
 	struct rw_semaphore i_mmap_sem;
+	struct rw_semaphore i_xattr_sem; /* avoid racing between reading and changing EAs */
+
+	int i_extra_isize;		/* size of extra space located in i_addr */
+	kprojid_t i_projid;		/* id for project quota */
 };
 
 static inline void get_extent_info(struct extent_info *ext,
···
 	LOCK_RETRY,
 };
 
+enum iostat_type {
+	APP_DIRECT_IO,			/* app direct IOs */
+	APP_BUFFERED_IO,		/* app buffered IOs */
+	APP_WRITE_IO,			/* app write IOs */
+	APP_MAPPED_IO,			/* app mapped IOs */
+	FS_DATA_IO,			/* data IOs from kworker/fsync/reclaimer */
+	FS_NODE_IO,			/* node IOs from kworker/fsync/reclaimer */
+	FS_META_IO,			/* meta IOs from kworker/reclaimer */
+	FS_GC_DATA_IO,			/* data IOs from forground gc */
+	FS_GC_NODE_IO,			/* node IOs from forground gc */
+	FS_CP_DATA_IO,			/* data IOs from checkpoint */
+	FS_CP_NODE_IO,			/* node IOs from checkpoint */
+	FS_CP_META_IO,			/* meta IOs from checkpoint */
+	FS_DISCARD,			/* discard */
+	NR_IO_TYPE,
+};
+
 struct f2fs_io_info {
 	struct f2fs_sb_info *sbi;	/* f2fs_sb_info pointer */
 	enum page_type type;	/* contains DATA/NODE/META/META_FLUSH */
···
 	bool submitted;		/* indicate IO submission */
 	int need_lock;		/* indicate we need to lock cp_rwsem */
 	bool in_list;		/* indicate fio is in io_list */
+	enum iostat_type io_type;	/* io type */
 };
 
 #define is_read_io(rw) ((rw) == READ)
···
 #endif
 	spinlock_t stat_lock;			/* lock for stat operations */
 
+	/* For app/fs IO statistics */
+	spinlock_t iostat_lock;
+	unsigned long long write_iostat[NR_IO_TYPE];
+	bool iostat_enable;
+
 	/* For sysfs suppport */
 	struct kobject s_kobj;
 	struct completion s_kobj_unregister;
···
 	/* Reference to checksum algorithm driver via cryptoapi */
 	struct crypto_shash *s_chksum_driver;
 
+	/* Precomputed FS UUID checksum for seeding other checksums */
+	__u32 s_chksum_seed;
+
 	/* For fault injection */
 #ifdef CONFIG_F2FS_FAULT_INJECTION
 	struct f2fs_fault_info fault_info;
+#endif
+
+#ifdef CONFIG_QUOTA
+	/* Names of quota files with journalled quota */
+	char *s_qf_names[MAXQUOTAS];
+	int s_jquota_fmt;			/* Format of quota to use */
 #endif
 };
 
···
 				  void *buf, size_t buf_size)
 {
 	return f2fs_crc32(sbi, buf, buf_size) == blk_crc;
+}
+
+static inline u32 f2fs_chksum(struct f2fs_sb_info *sbi, u32 crc,
+			      const void *address, unsigned int length)
+{
+	struct {
+		struct shash_desc shash;
+		char ctx[4];
+	} desc;
+	int err;
+
+	BUG_ON(crypto_shash_descsize(sbi->s_chksum_driver) != sizeof(desc.ctx));
+
+	desc.shash.tfm = sbi->s_chksum_driver;
+	desc.shash.flags = 0;
+	*(u32 *)desc.ctx = crc;
+
+	err = crypto_shash_update(&desc.shash, address, length);
+	BUG_ON(err);
+
+	return *(u32 *)desc.ctx;
 }
 
 static inline struct f2fs_inode_info *F2FS_I(struct inode *inode)
···
 	return RAW_IS_INODE(p);
 }
 
+static inline int offset_in_addr(struct f2fs_inode *i)
+{
+	return (i->i_inline & F2FS_EXTRA_ATTR) ?
+			(le16_to_cpu(i->i_extra_isize) / sizeof(__le32)) : 0;
+}
+
 static inline __le32 *blkaddr_in_node(struct f2fs_node *node)
 {
 	return RAW_IS_INODE(node) ? node->i.i_addr : node->dn.addr;
 }
 
-static inline block_t datablock_addr(struct page *node_page,
-		unsigned int offset)
+static inline int f2fs_has_extra_attr(struct inode *inode);
+static inline block_t datablock_addr(struct inode *inode,
+			struct page *node_page, unsigned int offset)
 {
 	struct f2fs_node *raw_node;
 	__le32 *addr_array;
+	int base = 0;
+	bool is_inode = IS_INODE(node_page);
 
 	raw_node = F2FS_NODE(node_page);
+
+	/* from GC path only */
+	if (!inode) {
+		if (is_inode)
+			base = offset_in_addr(&raw_node->i);
+	} else if (f2fs_has_extra_attr(inode) && is_inode) {
+		base = get_extra_isize(inode);
+	}
+
 	addr_array = blkaddr_in_node(raw_node);
-	return le32_to_cpu(addr_array[offset]);
+	return le32_to_cpu(addr_array[base + offset]);
 }
 
 static inline int f2fs_test_bit(unsigned int nr, char *addr)
···
 	*addr ^= mask;
 }
 
+#define F2FS_REG_FLMASK		(~(FS_DIRSYNC_FL | FS_TOPDIR_FL))
+#define F2FS_OTHER_FLMASK	(FS_NODUMP_FL | FS_NOATIME_FL)
+#define F2FS_FL_INHERITED	(FS_PROJINHERIT_FL)
+
+static inline __u32 f2fs_mask_flags(umode_t mode, __u32 flags)
+{
+	if (S_ISDIR(mode))
+		return flags;
+	else if (S_ISREG(mode))
+		return flags & F2FS_REG_FLMASK;
+	else
+		return flags & F2FS_OTHER_FLMASK;
+}
+
 /* used for f2fs_inode_info->flags */
 enum {
 	FI_NEW_INODE,		/* indicate newly allocated inode */
···
 	FI_DIRTY_FILE,		/* indicate regular/symlink has dirty pages */
 	FI_NO_PREALLOC,		/* indicate skipped preallocated blocks */
 	FI_HOT_DATA,		/* indicate file is hot */
+	FI_EXTRA_ATTR,		/* indicate file has extra attribute */
+	FI_PROJ_INHERIT,	/* indicate file inherits projectid */
 };
 
 static inline void __mark_inode_dirty_flag(struct inode *inode,
···
 		set_bit(FI_DATA_EXIST, &fi->flags);
 	if (ri->i_inline & F2FS_INLINE_DOTS)
 		set_bit(FI_INLINE_DOTS, &fi->flags);
+	if (ri->i_inline & F2FS_EXTRA_ATTR)
+		set_bit(FI_EXTRA_ATTR, &fi->flags);
 }
 
 static inline void set_raw_inline(struct inode *inode, struct f2fs_inode *ri)
···
 		ri->i_inline |= F2FS_DATA_EXIST;
 	if (is_inode_flag_set(inode, FI_INLINE_DOTS))
 		ri->i_inline |= F2FS_INLINE_DOTS;
+	if (is_inode_flag_set(inode, FI_EXTRA_ATTR))
+		ri->i_inline |= F2FS_EXTRA_ATTR;
+}
+
+static inline int f2fs_has_extra_attr(struct inode *inode)
+{
+	return is_inode_flag_set(inode, FI_EXTRA_ATTR);
 }
 
 static inline int f2fs_has_inline_xattr(struct inode *inode)
···
 static inline unsigned int addrs_per_inode(struct inode *inode)
 {
 	if (f2fs_has_inline_xattr(inode))
-		return DEF_ADDRS_PER_INODE - F2FS_INLINE_XATTR_ADDRS;
-	return DEF_ADDRS_PER_INODE;
+		return CUR_ADDRS_PER_INODE(inode) - F2FS_INLINE_XATTR_ADDRS;
+	return CUR_ADDRS_PER_INODE(inode);
 }
 
 static inline void *inline_xattr_addr(struct page *page)
···
 	return is_inode_flag_set(inode, FI_DROP_CACHE);
 }
 
-static inline void *inline_data_addr(struct page *page)
+static inline void *inline_data_addr(struct inode *inode, struct page *page)
 {
 	struct f2fs_inode *ri = F2FS_INODE(page);
+	int extra_size = get_extra_isize(inode);
 
-	return (void *)&(ri->i_addr[1]);
+	return (void *)&(ri->i_addr[extra_size + DEF_INLINE_RESERVED_SIZE]);
 }
 
 static inline int f2fs_has_inline_dentry(struct inode *inode)
···
 	return kmalloc(size, flags);
 }
 
+static inline int get_extra_isize(struct inode *inode)
+{
+	return F2FS_I(inode)->i_extra_isize / sizeof(__le32);
+}
+
 #define get_inode_mode(i) \
 	((is_inode_flag_set(i, FI_ACL_MODE)) ? \
 	 (F2FS_I(i)->i_acl_mode) : ((i)->i_mode))
+
+#define F2FS_TOTAL_EXTRA_ATTR_SIZE			\
+	(offsetof(struct f2fs_inode, i_extra_end) -	\
+	offsetof(struct f2fs_inode, i_extra_isize))	\
+
+#define F2FS_OLD_ATTRIBUTE_SIZE	(offsetof(struct f2fs_inode, i_addr))
+#define F2FS_FITS_IN_INODE(f2fs_inode, extra_isize, field)	\
+		((offsetof(typeof(*f2fs_inode), field) +	\
+		sizeof((f2fs_inode)->field))			\
+		<= (F2FS_OLD_ATTRIBUTE_SIZE + extra_isize))	\
+
+static inline void f2fs_reset_iostat(struct f2fs_sb_info *sbi)
+{
+	int i;
+
+	spin_lock(&sbi->iostat_lock);
+	for (i = 0; i < NR_IO_TYPE; i++)
+		sbi->write_iostat[i] = 0;
+	spin_unlock(&sbi->iostat_lock);
+}
+
+static inline void f2fs_update_iostat(struct f2fs_sb_info *sbi,
+			enum iostat_type type, unsigned long long io_bytes)
+{
+	if (!sbi->iostat_enable)
+		return;
+	spin_lock(&sbi->iostat_lock);
+	sbi->write_iostat[type] += io_bytes;
+
+	if (type == APP_WRITE_IO || type == APP_DIRECT_IO)
+		sbi->write_iostat[APP_BUFFERED_IO] =
+			sbi->write_iostat[APP_WRITE_IO] -
+			sbi->write_iostat[APP_DIRECT_IO];
+	spin_unlock(&sbi->iostat_lock);
+}
 
 /*
  * file.c
···
  * inode.c
  */
 void f2fs_set_inode_flags(struct inode *inode);
+bool f2fs_inode_chksum_verify(struct f2fs_sb_info *sbi, struct page *page);
+void f2fs_inode_chksum_set(struct f2fs_sb_info *sbi, struct page *page);
 struct inode *f2fs_iget(struct super_block *sb, unsigned long ino);
 struct inode *f2fs_iget_retry(struct super_block *sb, unsigned long ino);
 int try_to_free_nats(struct f2fs_sb_info *sbi, int nr_shrink);
···
  */
 int f2fs_inode_dirtied(struct inode *inode, bool sync);
 void f2fs_inode_synced(struct inode *inode);
+void f2fs_enable_quota_files(struct f2fs_sb_info *sbi);
+void f2fs_quota_off_umount(struct super_block *sb);
 int f2fs_commit_super(struct f2fs_sb_info *sbi, bool recover);
 int f2fs_sync_fs(struct super_block *sb, int sync);
 extern __printf(3, 4)
···
 int wait_on_node_pages_writeback(struct f2fs_sb_info *sbi, nid_t ino);
 int remove_inode_page(struct inode *inode);
 struct page *new_inode_page(struct inode *inode);
-struct page *new_node_page(struct dnode_of_data *dn,
-		unsigned int ofs, struct page *ipage);
+struct page *new_node_page(struct dnode_of_data *dn, unsigned int ofs);
 void ra_node_page(struct f2fs_sb_info *sbi, nid_t nid);
 struct page *get_node_page(struct f2fs_sb_info *sbi, pgoff_t nid);
 struct page *get_node_page_ra(struct page *parent, int start);
 void move_node_page(struct page *node_page, int gc_type);
 int fsync_node_pages(struct f2fs_sb_info *sbi, struct inode *inode,
 			struct writeback_control *wbc, bool atomic);
-int sync_node_pages(struct f2fs_sb_info *sbi, struct writeback_control *wbc);
+int sync_node_pages(struct f2fs_sb_info *sbi, struct writeback_control *wbc,
+			bool do_balance, enum iostat_type io_type);
 void build_free_nids(struct f2fs_sb_info *sbi, bool sync, bool mount);
 bool alloc_nid(struct f2fs_sb_info *sbi, nid_t *nid);
 void alloc_nid_done(struct f2fs_sb_info *sbi, nid_t nid);
···
 /*
  * segment.c
  */
+bool need_SSR(struct f2fs_sb_info *sbi);
 void register_inmem_page(struct inode *inode, struct page *page);
 void drop_inmem_pages(struct inode *inode);
 void drop_inmem_page(struct inode *inode, struct page *page);
···
 bool exist_trim_candidates(struct f2fs_sb_info *sbi, struct cp_control *cpc);
 struct page *get_sum_page(struct f2fs_sb_info *sbi, unsigned int segno);
 void update_meta_page(struct f2fs_sb_info *sbi, void *src, block_t blk_addr);
-void write_meta_page(struct f2fs_sb_info *sbi, struct page *page);
+void write_meta_page(struct f2fs_sb_info *sbi, struct page *page,
+						enum iostat_type io_type);
 void write_node_page(unsigned int nid, struct f2fs_io_info *fio);
 void write_data_page(struct dnode_of_data *dn, struct f2fs_io_info *fio);
 int rewrite_data_page(struct f2fs_io_info *fio);
···
 			struct f2fs_io_info *fio, bool add_list);
 void f2fs_wait_on_page_writeback(struct page *page,
 			enum page_type type, bool ordered);
-void f2fs_wait_on_encrypted_page_writeback(struct f2fs_sb_info *sbi,
-			block_t blkaddr);
+void f2fs_wait_on_block_writeback(struct f2fs_sb_info *sbi, block_t blkaddr);
 void write_data_summaries(struct f2fs_sb_info *sbi, block_t start_blk);
 void write_node_summaries(struct f2fs_sb_info *sbi, block_t start_blk);
 int lookup_journal_in_cursum(struct f2fs_journal *journal, int type,
···
 			int type, bool sync);
 void ra_meta_pages_cond(struct f2fs_sb_info *sbi, pgoff_t index);
 long sync_meta_pages(struct f2fs_sb_info *sbi, enum page_type type,
-			long nr_to_write);
+			long nr_to_write, enum iostat_type io_type);
 void add_ino_entry(struct f2fs_sb_info *sbi, nid_t ino, int type);
 void remove_ino_entry(struct f2fs_sb_info *sbi, nid_t ino, int type);
 void release_ino_entry(struct f2fs_sb_info *sbi, bool all);
···
 int f2fs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 			u64 start, u64 len);
 void f2fs_set_page_dirty_nobuffers(struct page *page);
+int __f2fs_write_data_pages(struct address_space *mapping,
+			struct writeback_control *wbc,
+			enum iostat_type io_type);
 void f2fs_invalidate_page(struct page *page, unsigned int offset,
 			unsigned int length);
 int f2fs_release_page(struct page *page, gfp_t wait);
···
 /*
  * sysfs.c
  */
-int __init f2fs_register_sysfs(void);
-void f2fs_unregister_sysfs(void);
-int f2fs_init_sysfs(struct f2fs_sb_info *sbi);
-void f2fs_exit_sysfs(struct f2fs_sb_info *sbi);
+int __init f2fs_init_sysfs(void);
+void f2fs_exit_sysfs(void);
+int f2fs_register_sysfs(struct f2fs_sb_info *sbi);
+void f2fs_unregister_sysfs(struct f2fs_sb_info *sbi);
 
 /*
  * crypto support
···
 static inline bool f2fs_encrypted_inode(struct inode *inode)
 {
 	return file_is_encrypt(inode);
+}
+
+static inline bool f2fs_encrypted_file(struct inode *inode)
+{
+	return f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode);
 }
 
 static inline void f2fs_set_encrypted_inode(struct inode *inode)
···
 static inline int f2fs_sb_mounted_blkzoned(struct super_block *sb)
 {
 	return F2FS_HAS_FEATURE(sb, F2FS_FEATURE_BLKZONED);
+}
+
+static inline int f2fs_sb_has_extra_attr(struct super_block *sb)
+{
+	return F2FS_HAS_FEATURE(sb, F2FS_FEATURE_EXTRA_ATTR);
+}
+
+static inline int f2fs_sb_has_project_quota(struct super_block *sb)
+{
+	return F2FS_HAS_FEATURE(sb, F2FS_FEATURE_PRJQUOTA);
+}
+
+static inline int f2fs_sb_has_inode_chksum(struct super_block *sb)
+{
+	return F2FS_HAS_FEATURE(sb, F2FS_FEATURE_INODE_CHKSUM);
 }
 
 #ifdef CONFIG_BLK_DEV_ZONED
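
To make the new MAX_INLINE_DATA(inode) concrete, here is an illustrative
computation (not from this diff) assuming the long-standing defaults of 923
i_addr slots per inode and 50 inline-xattr slots, with no extra attribute
space reserved, so CUR_ADDRS_PER_INODE(inode) equals DEF_ADDRS_PER_INODE:

	#include <stdint.h>
	#include <stdio.h>

	#define DEF_ADDRS_PER_INODE		923	/* assumed default */
	#define F2FS_INLINE_XATTR_ADDRS		50	/* assumed default */
	#define DEF_INLINE_RESERVED_SIZE	1	/* from this series */

	int main(void)
	{
		/* sizeof(__le32) * (923 - 1 - 50) = 4 * 872 = 3488 bytes */
		unsigned max_inline = sizeof(uint32_t) *
			(DEF_ADDRS_PER_INODE - DEF_INLINE_RESERVED_SIZE -
			 F2FS_INLINE_XATTR_ADDRS);

		printf("max inline data: %u bytes\n", max_inline);
		return 0;
	}

Reserving an extra-attribute area (i_extra_isize) shrinks this capacity by
the same number of address slots.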
+305 -55
fs/f2fs/file.c
··· 98 98 if (!PageUptodate(page)) 99 99 SetPageUptodate(page); 100 100 101 + f2fs_update_iostat(sbi, APP_MAPPED_IO, F2FS_BLKSIZE); 102 + 101 103 trace_f2fs_vm_page_mkwrite(page, DATA); 102 104 mapped: 103 105 /* fill the page */ 104 106 f2fs_wait_on_page_writeback(page, DATA, false); 105 107 106 108 /* wait for GCed encrypted page writeback */ 107 - if (f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode)) 108 - f2fs_wait_on_encrypted_page_writeback(sbi, dn.data_blkaddr); 109 + if (f2fs_encrypted_file(inode)) 110 + f2fs_wait_on_block_writeback(sbi, dn.data_blkaddr); 109 111 110 112 out_sem: 111 113 up_read(&F2FS_I(inode)->i_mmap_sem); ··· 276 274 goto sync_nodes; 277 275 } 278 276 279 - ret = wait_on_node_pages_writeback(sbi, ino); 280 - if (ret) 281 - goto out; 277 + /* 278 + * If it's atomic_write, it's just fine to keep write ordering. So 279 + * here we don't need to wait for node write completion, since we use 280 + * node chain which serializes node blocks. If one of node writes are 281 + * reordered, we can see simply broken chain, resulting in stopping 282 + * roll-forward recovery. It means we'll recover all or none node blocks 283 + * given fsync mark. 284 + */ 285 + if (!atomic) { 286 + ret = wait_on_node_pages_writeback(sbi, ino); 287 + if (ret) 288 + goto out; 289 + } 282 290 283 291 /* once recovery info is written, don't need to tack this */ 284 292 remove_ino_entry(sbi, ino, APPEND_INO); ··· 394 382 dn.ofs_in_node++, pgofs++, 395 383 data_ofs = (loff_t)pgofs << PAGE_SHIFT) { 396 384 block_t blkaddr; 397 - blkaddr = datablock_addr(dn.node_page, dn.ofs_in_node); 385 + blkaddr = datablock_addr(dn.inode, 386 + dn.node_page, dn.ofs_in_node); 398 387 399 388 if (__found_offset(blkaddr, dirty, pgofs, whence)) { 400 389 f2fs_put_dnode(&dn); ··· 480 467 struct f2fs_node *raw_node; 481 468 int nr_free = 0, ofs = dn->ofs_in_node, len = count; 482 469 __le32 *addr; 470 + int base = 0; 471 + 472 + if (IS_INODE(dn->node_page) && f2fs_has_extra_attr(dn->inode)) 473 + base = get_extra_isize(dn->inode); 483 474 484 475 raw_node = F2FS_NODE(dn->node_page); 485 - addr = blkaddr_in_node(raw_node) + ofs; 476 + addr = blkaddr_in_node(raw_node) + base + ofs; 486 477 487 478 for (; count > 0; count--, addr++, dn->ofs_in_node++) { 488 479 block_t blkaddr = le32_to_cpu(*addr); ··· 664 647 struct f2fs_inode_info *fi = F2FS_I(inode); 665 648 unsigned int flags; 666 649 667 - flags = fi->i_flags & FS_FL_USER_VISIBLE; 650 + flags = fi->i_flags & (FS_FL_USER_VISIBLE | FS_PROJINHERIT_FL); 668 651 if (flags & FS_APPEND_FL) 669 652 stat->attributes |= STATX_ATTR_APPEND; 670 653 if (flags & FS_COMPR_FL) ··· 944 927 done = min((pgoff_t)ADDRS_PER_PAGE(dn.node_page, inode) - 945 928 dn.ofs_in_node, len); 946 929 for (i = 0; i < done; i++, blkaddr++, do_replace++, dn.ofs_in_node++) { 947 - *blkaddr = datablock_addr(dn.node_page, dn.ofs_in_node); 930 + *blkaddr = datablock_addr(dn.inode, 931 + dn.node_page, dn.ofs_in_node); 948 932 if (!is_checkpointed_data(sbi, *blkaddr)) { 949 933 950 934 if (test_opt(sbi, LFS)) { ··· 1021 1003 ADDRS_PER_PAGE(dn.node_page, dst_inode) - 1022 1004 dn.ofs_in_node, len - i); 1023 1005 do { 1024 - dn.data_blkaddr = datablock_addr(dn.node_page, 1025 - dn.ofs_in_node); 1006 + dn.data_blkaddr = datablock_addr(dn.inode, 1007 + dn.node_page, dn.ofs_in_node); 1026 1008 truncate_data_blocks_range(&dn, 1); 1027 1009 1028 1010 if (do_replace[i]) { ··· 1191 1173 int ret; 1192 1174 1193 1175 for (; index < end; index++, dn->ofs_in_node++) { 1194 - if (datablock_addr(dn->node_page, 
dn->ofs_in_node) == NULL_ADDR) 1176 + if (datablock_addr(dn->inode, dn->node_page, 1177 + dn->ofs_in_node) == NULL_ADDR) 1195 1178 count++; 1196 1179 } 1197 1180 ··· 1203 1184 1204 1185 dn->ofs_in_node = ofs_in_node; 1205 1186 for (index = start; index < end; index++, dn->ofs_in_node++) { 1206 - dn->data_blkaddr = 1207 - datablock_addr(dn->node_page, dn->ofs_in_node); 1187 + dn->data_blkaddr = datablock_addr(dn->inode, 1188 + dn->node_page, dn->ofs_in_node); 1208 1189 /* 1209 1190 * reserve_new_blocks will not guarantee entire block 1210 1191 * allocation. ··· 1514 1495 return 0; 1515 1496 } 1516 1497 1517 - #define F2FS_REG_FLMASK (~(FS_DIRSYNC_FL | FS_TOPDIR_FL)) 1518 - #define F2FS_OTHER_FLMASK (FS_NODUMP_FL | FS_NOATIME_FL) 1519 - 1520 - static inline __u32 f2fs_mask_flags(umode_t mode, __u32 flags) 1498 + static int f2fs_file_flush(struct file *file, fl_owner_t id) 1521 1499 { 1522 - if (S_ISDIR(mode)) 1523 - return flags; 1524 - else if (S_ISREG(mode)) 1525 - return flags & F2FS_REG_FLMASK; 1526 - else 1527 - return flags & F2FS_OTHER_FLMASK; 1500 + struct inode *inode = file_inode(file); 1501 + 1502 + /* 1503 + * If the process doing a transaction is crashed, we should do 1504 + * roll-back. Otherwise, other reader/write can see corrupted database 1505 + * until all the writers close its file. Since this should be done 1506 + * before dropping file lock, it needs to do in ->flush. 1507 + */ 1508 + if (f2fs_is_atomic_file(inode) && 1509 + F2FS_I(inode)->inmem_task == current) 1510 + drop_inmem_pages(inode); 1511 + return 0; 1528 1512 } 1529 1513 1530 1514 static int f2fs_ioc_getflags(struct file *filp, unsigned long arg) 1531 1515 { 1532 1516 struct inode *inode = file_inode(filp); 1533 1517 struct f2fs_inode_info *fi = F2FS_I(inode); 1534 - unsigned int flags = fi->i_flags & FS_FL_USER_VISIBLE; 1518 + unsigned int flags = fi->i_flags & 1519 + (FS_FL_USER_VISIBLE | FS_PROJINHERIT_FL); 1535 1520 return put_user(flags, (int __user *)arg); 1521 + } 1522 + 1523 + static int __f2fs_ioc_setflags(struct inode *inode, unsigned int flags) 1524 + { 1525 + struct f2fs_inode_info *fi = F2FS_I(inode); 1526 + unsigned int oldflags; 1527 + 1528 + /* Is it quota file? Do not allow user to mess with it */ 1529 + if (IS_NOQUOTA(inode)) 1530 + return -EPERM; 1531 + 1532 + flags = f2fs_mask_flags(inode->i_mode, flags); 1533 + 1534 + oldflags = fi->i_flags; 1535 + 1536 + if ((flags ^ oldflags) & (FS_APPEND_FL | FS_IMMUTABLE_FL)) 1537 + if (!capable(CAP_LINUX_IMMUTABLE)) 1538 + return -EPERM; 1539 + 1540 + flags = flags & (FS_FL_USER_MODIFIABLE | FS_PROJINHERIT_FL); 1541 + flags |= oldflags & ~(FS_FL_USER_MODIFIABLE | FS_PROJINHERIT_FL); 1542 + fi->i_flags = flags; 1543 + 1544 + if (fi->i_flags & FS_PROJINHERIT_FL) 1545 + set_inode_flag(inode, FI_PROJ_INHERIT); 1546 + else 1547 + clear_inode_flag(inode, FI_PROJ_INHERIT); 1548 + 1549 + inode->i_ctime = current_time(inode); 1550 + f2fs_set_inode_flags(inode); 1551 + f2fs_mark_inode_dirty_sync(inode, false); 1552 + return 0; 1536 1553 } 1537 1554 1538 1555 static int f2fs_ioc_setflags(struct file *filp, unsigned long arg) 1539 1556 { 1540 1557 struct inode *inode = file_inode(filp); 1541 - struct f2fs_inode_info *fi = F2FS_I(inode); 1542 1558 unsigned int flags; 1543 - unsigned int oldflags; 1544 1559 int ret; 1545 1560 1546 1561 if (!inode_owner_or_capable(inode)) ··· 1589 1536 1590 1537 inode_lock(inode); 1591 1538 1592 - /* Is it quota file? 
Do not allow user to mess with it */ 1593 - if (IS_NOQUOTA(inode)) { 1594 - ret = -EPERM; 1595 - goto unlock_out; 1596 - } 1539 + ret = __f2fs_ioc_setflags(inode, flags); 1597 1540 1598 - flags = f2fs_mask_flags(inode->i_mode, flags); 1599 - 1600 - oldflags = fi->i_flags; 1601 - 1602 - if ((flags ^ oldflags) & (FS_APPEND_FL | FS_IMMUTABLE_FL)) { 1603 - if (!capable(CAP_LINUX_IMMUTABLE)) { 1604 - ret = -EPERM; 1605 - goto unlock_out; 1606 - } 1607 - } 1608 - 1609 - flags = flags & FS_FL_USER_MODIFIABLE; 1610 - flags |= oldflags & ~FS_FL_USER_MODIFIABLE; 1611 - fi->i_flags = flags; 1612 - 1613 - inode->i_ctime = current_time(inode); 1614 - f2fs_set_inode_flags(inode); 1615 - f2fs_mark_inode_dirty_sync(inode, false); 1616 - unlock_out: 1617 1541 inode_unlock(inode); 1618 1542 mnt_drop_write_file(filp); 1619 1543 return ret; ··· 1640 1610 ret = filemap_write_and_wait_range(inode->i_mapping, 0, LLONG_MAX); 1641 1611 if (ret) { 1642 1612 clear_inode_flag(inode, FI_ATOMIC_FILE); 1613 + clear_inode_flag(inode, FI_HOT_DATA); 1643 1614 goto out; 1644 1615 } 1645 1616 1646 1617 inc_stat: 1618 + F2FS_I(inode)->inmem_task = current; 1647 1619 stat_inc_atomic_write(inode); 1648 1620 stat_update_max_atomic_write(inode); 1649 1621 out: ··· 1679 1647 ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true); 1680 1648 if (!ret) { 1681 1649 clear_inode_flag(inode, FI_ATOMIC_FILE); 1650 + clear_inode_flag(inode, FI_HOT_DATA); 1682 1651 stat_dec_atomic_write(inode); 1683 1652 } 1684 1653 } else { 1685 - ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true); 1654 + ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false); 1686 1655 } 1687 1656 err_out: 1688 1657 inode_unlock(inode); ··· 1819 1786 f2fs_stop_checkpoint(sbi, false); 1820 1787 break; 1821 1788 case F2FS_GOING_DOWN_METAFLUSH: 1822 - sync_meta_pages(sbi, META, LONG_MAX); 1789 + sync_meta_pages(sbi, META, LONG_MAX, FS_META_IO); 1823 1790 f2fs_stop_checkpoint(sbi, false); 1824 1791 break; 1825 1792 default: ··· 2076 2043 */ 2077 2044 while (map.m_lblk < pg_end) { 2078 2045 map.m_len = pg_end - map.m_lblk; 2079 - err = f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_READ); 2046 + err = f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_DEFAULT); 2080 2047 if (err) 2081 2048 goto out; 2082 2049 ··· 2118 2085 2119 2086 do_map: 2120 2087 map.m_len = pg_end - map.m_lblk; 2121 - err = f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_READ); 2088 + err = f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_DEFAULT); 2122 2089 if (err) 2123 2090 goto clear_out; 2124 2091 ··· 2417 2384 return ret; 2418 2385 } 2419 2386 2387 + static int f2fs_ioc_get_features(struct file *filp, unsigned long arg) 2388 + { 2389 + struct inode *inode = file_inode(filp); 2390 + u32 sb_feature = le32_to_cpu(F2FS_I_SB(inode)->raw_super->feature); 2391 + 2392 + /* Must validate to set it with SQLite behavior in Android. 
*/ 2393 + sb_feature |= F2FS_FEATURE_ATOMIC_WRITE; 2394 + 2395 + return put_user(sb_feature, (u32 __user *)arg); 2396 + } 2397 + 2398 + #ifdef CONFIG_QUOTA 2399 + static int f2fs_ioc_setproject(struct file *filp, __u32 projid) 2400 + { 2401 + struct inode *inode = file_inode(filp); 2402 + struct f2fs_inode_info *fi = F2FS_I(inode); 2403 + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); 2404 + struct super_block *sb = sbi->sb; 2405 + struct dquot *transfer_to[MAXQUOTAS] = {}; 2406 + struct page *ipage; 2407 + kprojid_t kprojid; 2408 + int err; 2409 + 2410 + if (!f2fs_sb_has_project_quota(sb)) { 2411 + if (projid != F2FS_DEF_PROJID) 2412 + return -EOPNOTSUPP; 2413 + else 2414 + return 0; 2415 + } 2416 + 2417 + if (!f2fs_has_extra_attr(inode)) 2418 + return -EOPNOTSUPP; 2419 + 2420 + kprojid = make_kprojid(&init_user_ns, (projid_t)projid); 2421 + 2422 + if (projid_eq(kprojid, F2FS_I(inode)->i_projid)) 2423 + return 0; 2424 + 2425 + err = mnt_want_write_file(filp); 2426 + if (err) 2427 + return err; 2428 + 2429 + err = -EPERM; 2430 + inode_lock(inode); 2431 + 2432 + /* Is it quota file? Do not allow user to mess with it */ 2433 + if (IS_NOQUOTA(inode)) 2434 + goto out_unlock; 2435 + 2436 + ipage = get_node_page(sbi, inode->i_ino); 2437 + if (IS_ERR(ipage)) { 2438 + err = PTR_ERR(ipage); 2439 + goto out_unlock; 2440 + } 2441 + 2442 + if (!F2FS_FITS_IN_INODE(F2FS_INODE(ipage), fi->i_extra_isize, 2443 + i_projid)) { 2444 + err = -EOVERFLOW; 2445 + f2fs_put_page(ipage, 1); 2446 + goto out_unlock; 2447 + } 2448 + f2fs_put_page(ipage, 1); 2449 + 2450 + dquot_initialize(inode); 2451 + 2452 + transfer_to[PRJQUOTA] = dqget(sb, make_kqid_projid(kprojid)); 2453 + if (!IS_ERR(transfer_to[PRJQUOTA])) { 2454 + err = __dquot_transfer(inode, transfer_to); 2455 + dqput(transfer_to[PRJQUOTA]); 2456 + if (err) 2457 + goto out_dirty; 2458 + } 2459 + 2460 + F2FS_I(inode)->i_projid = kprojid; 2461 + inode->i_ctime = current_time(inode); 2462 + out_dirty: 2463 + f2fs_mark_inode_dirty_sync(inode, true); 2464 + out_unlock: 2465 + inode_unlock(inode); 2466 + mnt_drop_write_file(filp); 2467 + return err; 2468 + } 2469 + #else 2470 + static int f2fs_ioc_setproject(struct file *filp, __u32 projid) 2471 + { 2472 + if (projid != F2FS_DEF_PROJID) 2473 + return -EOPNOTSUPP; 2474 + return 0; 2475 + } 2476 + #endif 2477 + 2478 + /* Transfer internal flags to xflags */ 2479 + static inline __u32 f2fs_iflags_to_xflags(unsigned long iflags) 2480 + { 2481 + __u32 xflags = 0; 2482 + 2483 + if (iflags & FS_SYNC_FL) 2484 + xflags |= FS_XFLAG_SYNC; 2485 + if (iflags & FS_IMMUTABLE_FL) 2486 + xflags |= FS_XFLAG_IMMUTABLE; 2487 + if (iflags & FS_APPEND_FL) 2488 + xflags |= FS_XFLAG_APPEND; 2489 + if (iflags & FS_NODUMP_FL) 2490 + xflags |= FS_XFLAG_NODUMP; 2491 + if (iflags & FS_NOATIME_FL) 2492 + xflags |= FS_XFLAG_NOATIME; 2493 + if (iflags & FS_PROJINHERIT_FL) 2494 + xflags |= FS_XFLAG_PROJINHERIT; 2495 + return xflags; 2496 + } 2497 + 2498 + #define F2FS_SUPPORTED_FS_XFLAGS (FS_XFLAG_SYNC | FS_XFLAG_IMMUTABLE | \ 2499 + FS_XFLAG_APPEND | FS_XFLAG_NODUMP | \ 2500 + FS_XFLAG_NOATIME | FS_XFLAG_PROJINHERIT) 2501 + 2502 + /* Flags we can manipulate with through EXT4_IOC_FSSETXATTR */ 2503 + #define F2FS_FL_XFLAG_VISIBLE (FS_SYNC_FL | \ 2504 + FS_IMMUTABLE_FL | \ 2505 + FS_APPEND_FL | \ 2506 + FS_NODUMP_FL | \ 2507 + FS_NOATIME_FL | \ 2508 + FS_PROJINHERIT_FL) 2509 + 2510 + /* Transfer xflags flags to internal */ 2511 + static inline unsigned long f2fs_xflags_to_iflags(__u32 xflags) 2512 + { 2513 + unsigned long iflags = 0; 2514 + 2515 + if 
(xflags & FS_XFLAG_SYNC) 2516 + iflags |= FS_SYNC_FL; 2517 + if (xflags & FS_XFLAG_IMMUTABLE) 2518 + iflags |= FS_IMMUTABLE_FL; 2519 + if (xflags & FS_XFLAG_APPEND) 2520 + iflags |= FS_APPEND_FL; 2521 + if (xflags & FS_XFLAG_NODUMP) 2522 + iflags |= FS_NODUMP_FL; 2523 + if (xflags & FS_XFLAG_NOATIME) 2524 + iflags |= FS_NOATIME_FL; 2525 + if (xflags & FS_XFLAG_PROJINHERIT) 2526 + iflags |= FS_PROJINHERIT_FL; 2527 + 2528 + return iflags; 2529 + } 2530 + 2531 + static int f2fs_ioc_fsgetxattr(struct file *filp, unsigned long arg) 2532 + { 2533 + struct inode *inode = file_inode(filp); 2534 + struct f2fs_inode_info *fi = F2FS_I(inode); 2535 + struct fsxattr fa; 2536 + 2537 + memset(&fa, 0, sizeof(struct fsxattr)); 2538 + fa.fsx_xflags = f2fs_iflags_to_xflags(fi->i_flags & 2539 + (FS_FL_USER_VISIBLE | FS_PROJINHERIT_FL)); 2540 + 2541 + if (f2fs_sb_has_project_quota(inode->i_sb)) 2542 + fa.fsx_projid = (__u32)from_kprojid(&init_user_ns, 2543 + fi->i_projid); 2544 + 2545 + if (copy_to_user((struct fsxattr __user *)arg, &fa, sizeof(fa))) 2546 + return -EFAULT; 2547 + return 0; 2548 + } 2549 + 2550 + static int f2fs_ioc_fssetxattr(struct file *filp, unsigned long arg) 2551 + { 2552 + struct inode *inode = file_inode(filp); 2553 + struct f2fs_inode_info *fi = F2FS_I(inode); 2554 + struct fsxattr fa; 2555 + unsigned int flags; 2556 + int err; 2557 + 2558 + if (copy_from_user(&fa, (struct fsxattr __user *)arg, sizeof(fa))) 2559 + return -EFAULT; 2560 + 2561 + /* Make sure caller has proper permission */ 2562 + if (!inode_owner_or_capable(inode)) 2563 + return -EACCES; 2564 + 2565 + if (fa.fsx_xflags & ~F2FS_SUPPORTED_FS_XFLAGS) 2566 + return -EOPNOTSUPP; 2567 + 2568 + flags = f2fs_xflags_to_iflags(fa.fsx_xflags); 2569 + if (f2fs_mask_flags(inode->i_mode, flags) != flags) 2570 + return -EOPNOTSUPP; 2571 + 2572 + err = mnt_want_write_file(filp); 2573 + if (err) 2574 + return err; 2575 + 2576 + inode_lock(inode); 2577 + flags = (fi->i_flags & ~F2FS_FL_XFLAG_VISIBLE) | 2578 + (flags & F2FS_FL_XFLAG_VISIBLE); 2579 + err = __f2fs_ioc_setflags(inode, flags); 2580 + inode_unlock(inode); 2581 + mnt_drop_write_file(filp); 2582 + if (err) 2583 + return err; 2584 + 2585 + err = f2fs_ioc_setproject(filp, fa.fsx_projid); 2586 + if (err) 2587 + return err; 2588 + 2589 + return 0; 2590 + } 2420 2591 2421 2592 long f2fs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) 2422 2593 { ··· 2663 2426 return f2fs_ioc_move_range(filp, arg); 2664 2427 case F2FS_IOC_FLUSH_DEVICE: 2665 2428 return f2fs_ioc_flush_device(filp, arg); 2429 + case F2FS_IOC_GET_FEATURES: 2430 + return f2fs_ioc_get_features(filp, arg); 2431 + case F2FS_IOC_FSGETXATTR: 2432 + return f2fs_ioc_fsgetxattr(filp, arg); 2433 + case F2FS_IOC_FSSETXATTR: 2434 + return f2fs_ioc_fssetxattr(filp, arg); 2666 2435 default: 2667 2436 return -ENOTTY; 2668 2437 } ··· 2698 2455 ret = __generic_file_write_iter(iocb, from); 2699 2456 blk_finish_plug(&plug); 2700 2457 clear_inode_flag(inode, FI_NO_PREALLOC); 2458 + 2459 + if (ret > 0) 2460 + f2fs_update_iostat(F2FS_I_SB(inode), APP_WRITE_IO, ret); 2701 2461 } 2702 2462 inode_unlock(inode); 2703 2463 ··· 2737 2491 case F2FS_IOC_DEFRAGMENT: 2738 2492 case F2FS_IOC_MOVE_RANGE: 2739 2493 case F2FS_IOC_FLUSH_DEVICE: 2494 + case F2FS_IOC_GET_FEATURES: 2495 + case F2FS_IOC_FSGETXATTR: 2496 + case F2FS_IOC_FSSETXATTR: 2740 2497 break; 2741 2498 default: 2742 2499 return -ENOIOCTLCMD; ··· 2755 2506 .open = f2fs_file_open, 2756 2507 .release = f2fs_release_file, 2757 2508 .mmap = f2fs_file_mmap, 2509 + .flush = 
f2fs_file_flush,
2758 2510 .fsync = f2fs_sync_file,
2759 2511 .fallocate = f2fs_fallocate,
2760 2512 .unlocked_ioctl = f2fs_ioctl,
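
For context on the two ioctls wired up above: they are the generic FS_IOC_FSGETXATTR / FS_IOC_FSSETXATTR interface from <linux/fs.h>, so existing xfs/ext4 tooling works unchanged. A minimal userspace sketch, not part of the patch, that prints the current attributes and changes only the project ID (error handling is reduced to perror()):

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

int main(int argc, char **argv)
{
        struct fsxattr fa;
        int fd;

        if (argc != 3) {
                fprintf(stderr, "usage: %s <file> <projid>\n", argv[0]);
                return 1;
        }

        fd = open(argv[1], O_RDONLY);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        if (ioctl(fd, FS_IOC_FSGETXATTR, &fa) < 0) {
                perror("FS_IOC_FSGETXATTR");
                return 1;
        }
        printf("xflags=%#x projid=%u\n", fa.fsx_xflags,
               (unsigned int)fa.fsx_projid);

        /* keep xflags as-is, update only the project ID */
        fa.fsx_projid = (unsigned int)strtoul(argv[2], NULL, 0);
        if (ioctl(fd, FS_IOC_FSSETXATTR, &fa) < 0) {
                perror("FS_IOC_FSSETXATTR");
                return 1;
        }

        close(fd);
        return 0;
}

As in f2fs_ioc_setproject() above, setting a non-default project ID on a filesystem without the project quota feature fails with EOPNOTSUPP, and an inode without extra attributes is rejected the same way.
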
+78 -37
fs/f2fs/gc.c
··· 28 28 struct f2fs_sb_info *sbi = data; 29 29 struct f2fs_gc_kthread *gc_th = sbi->gc_thread; 30 30 wait_queue_head_t *wq = &sbi->gc_thread->gc_wait_queue_head; 31 - long wait_ms; 31 + unsigned int wait_ms; 32 32 33 33 wait_ms = gc_th->min_sleep_time; 34 34 35 35 set_freezable(); 36 36 do { 37 37 wait_event_interruptible_timeout(*wq, 38 - kthread_should_stop() || freezing(current), 38 + kthread_should_stop() || freezing(current) || 39 + gc_th->gc_wake, 39 40 msecs_to_jiffies(wait_ms)); 41 + 42 + /* give it a try one time */ 43 + if (gc_th->gc_wake) 44 + gc_th->gc_wake = 0; 40 45 41 46 if (try_to_freeze()) 42 47 continue; ··· 60 55 } 61 56 #endif 62 57 58 + if (!sb_start_write_trylock(sbi->sb)) 59 + continue; 60 + 63 61 /* 64 62 * [GC triggering condition] 65 63 * 0. GC is not conducted currently. ··· 77 69 * So, I'd like to wait some time to collect dirty segments. 78 70 */ 79 71 if (!mutex_trylock(&sbi->gc_mutex)) 80 - continue; 72 + goto next; 73 + 74 + if (gc_th->gc_urgent) { 75 + wait_ms = gc_th->urgent_sleep_time; 76 + goto do_gc; 77 + } 81 78 82 79 if (!is_idle(sbi)) { 83 80 increase_sleep_time(gc_th, &wait_ms); 84 81 mutex_unlock(&sbi->gc_mutex); 85 - continue; 82 + goto next; 86 83 } 87 84 88 85 if (has_enough_invalid_blocks(sbi)) 89 86 decrease_sleep_time(gc_th, &wait_ms); 90 87 else 91 88 increase_sleep_time(gc_th, &wait_ms); 92 - 89 + do_gc: 93 90 stat_inc_bggc_count(sbi); 94 91 95 92 /* if return value is not zero, no victim was selected */ ··· 106 93 107 94 /* balancing f2fs's metadata periodically */ 108 95 f2fs_balance_fs_bg(sbi); 96 + next: 97 + sb_end_write(sbi->sb); 109 98 110 99 } while (!kthread_should_stop()); 111 100 return 0; ··· 125 110 goto out; 126 111 } 127 112 113 + gc_th->urgent_sleep_time = DEF_GC_THREAD_URGENT_SLEEP_TIME; 128 114 gc_th->min_sleep_time = DEF_GC_THREAD_MIN_SLEEP_TIME; 129 115 gc_th->max_sleep_time = DEF_GC_THREAD_MAX_SLEEP_TIME; 130 116 gc_th->no_gc_sleep_time = DEF_GC_THREAD_NOGC_SLEEP_TIME; 131 117 132 118 gc_th->gc_idle = 0; 119 + gc_th->gc_urgent = 0; 120 + gc_th->gc_wake= 0; 133 121 134 122 sbi->gc_thread = gc_th; 135 123 init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head); ··· 277 259 valid_blocks * 2 : valid_blocks; 278 260 } 279 261 280 - static unsigned int get_ssr_cost(struct f2fs_sb_info *sbi, 281 - unsigned int segno) 282 - { 283 - struct seg_entry *se = get_seg_entry(sbi, segno); 284 - 285 - return se->ckpt_valid_blocks > se->valid_blocks ? 286 - se->ckpt_valid_blocks : se->valid_blocks; 287 - } 288 - 289 262 static inline unsigned int get_gc_cost(struct f2fs_sb_info *sbi, 290 263 unsigned int segno, struct victim_sel_policy *p) 291 264 { 292 265 if (p->alloc_mode == SSR) 293 - return get_ssr_cost(sbi, segno); 266 + return get_seg_entry(sbi, segno)->ckpt_valid_blocks; 294 267 295 268 /* alloc_mode == LFS */ 296 269 if (p->gc_mode == GC_GREEDY) ··· 591 582 } 592 583 593 584 *nofs = ofs_of_node(node_page); 594 - source_blkaddr = datablock_addr(node_page, ofs_in_node); 585 + source_blkaddr = datablock_addr(NULL, node_page, ofs_in_node); 595 586 f2fs_put_page(node_page, 1); 596 587 597 588 if (source_blkaddr != blkaddr) ··· 599 590 return true; 600 591 } 601 592 602 - static void move_encrypted_block(struct inode *inode, block_t bidx, 603 - unsigned int segno, int off) 593 + /* 594 + * Move data block via META_MAPPING while keeping locked data page. 595 + * This can be used to move blocks, aka LBAs, directly on disk. 
596 + */ 597 + static void move_data_block(struct inode *inode, block_t bidx, 598 + unsigned int segno, int off) 604 599 { 605 600 struct f2fs_io_info fio = { 606 601 .sbi = F2FS_I_SB(inode), ··· 697 684 fio.new_blkaddr = newaddr; 698 685 f2fs_submit_page_write(&fio); 699 686 687 + f2fs_update_iostat(fio.sbi, FS_GC_DATA_IO, F2FS_BLKSIZE); 688 + 700 689 f2fs_update_data_blkaddr(&dn, newaddr); 701 690 set_inode_flag(inode, FI_APPEND_WRITE); 702 691 if (page->index == 0) ··· 746 731 .page = page, 747 732 .encrypted_page = NULL, 748 733 .need_lock = LOCK_REQ, 734 + .io_type = FS_GC_DATA_IO, 749 735 }; 750 736 bool is_dirty = PageDirty(page); 751 737 int err; ··· 835 819 continue; 836 820 837 821 /* if encrypted inode, let's go phase 3 */ 838 - if (f2fs_encrypted_inode(inode) && 839 - S_ISREG(inode->i_mode)) { 822 + if (f2fs_encrypted_file(inode)) { 840 823 add_gc_inode(gc_list, inode); 841 824 continue; 842 825 } ··· 869 854 continue; 870 855 } 871 856 locked = true; 857 + 858 + /* wait for all inflight aio data */ 859 + inode_dio_wait(inode); 872 860 } 873 861 874 862 start_bidx = start_bidx_of_node(nofs, inode) 875 863 + ofs_in_node; 876 - if (f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode)) 877 - move_encrypted_block(inode, start_bidx, segno, off); 864 + if (f2fs_encrypted_file(inode)) 865 + move_data_block(inode, start_bidx, segno, off); 878 866 else 879 - move_data_page(inode, start_bidx, gc_type, segno, off); 867 + move_data_page(inode, start_bidx, gc_type, 868 + segno, off); 880 869 881 870 if (locked) { 882 871 up_write(&fi->dio_rwsem[WRITE]); ··· 917 898 struct blk_plug plug; 918 899 unsigned int segno = start_segno; 919 900 unsigned int end_segno = start_segno + sbi->segs_per_sec; 920 - int sec_freed = 0; 901 + int seg_freed = 0; 921 902 unsigned char type = IS_DATASEG(get_seg_entry(sbi, segno)->type) ? 922 903 SUM_TYPE_DATA : SUM_TYPE_NODE; 923 904 ··· 963 944 gc_type); 964 945 965 946 stat_inc_seg_count(sbi, type, gc_type); 947 + 948 + if (gc_type == FG_GC && 949 + get_valid_blocks(sbi, segno, false) == 0) 950 + seg_freed++; 966 951 next: 967 952 f2fs_put_page(sum_page, 0); 968 953 } ··· 977 954 978 955 blk_finish_plug(&plug); 979 956 980 - if (gc_type == FG_GC && 981 - get_valid_blocks(sbi, start_segno, true) == 0) 982 - sec_freed = 1; 983 - 984 957 stat_inc_call_count(sbi->stat_info); 985 958 986 - return sec_freed; 959 + return seg_freed; 987 960 } 988 961 989 962 int f2fs_gc(struct f2fs_sb_info *sbi, bool sync, 990 963 bool background, unsigned int segno) 991 964 { 992 965 int gc_type = sync ? FG_GC : BG_GC; 993 - int sec_freed = 0; 994 - int ret; 966 + int sec_freed = 0, seg_freed = 0, total_freed = 0; 967 + int ret = 0; 995 968 struct cp_control cpc; 996 969 unsigned int init_segno = segno; 997 970 struct gc_inode_list gc_list = { 998 971 .ilist = LIST_HEAD_INIT(gc_list.ilist), 999 972 .iroot = RADIX_TREE_INIT(GFP_NOFS), 1000 973 }; 974 + 975 + trace_f2fs_gc_begin(sbi->sb, sync, background, 976 + get_pages(sbi, F2FS_DIRTY_NODES), 977 + get_pages(sbi, F2FS_DIRTY_DENTS), 978 + get_pages(sbi, F2FS_DIRTY_IMETA), 979 + free_sections(sbi), 980 + free_segments(sbi), 981 + reserved_segments(sbi), 982 + prefree_segments(sbi)); 1001 983 1002 984 cpc.reason = __get_cp_reason(sbi); 1003 985 gc_more: ··· 1030 1002 gc_type = FG_GC; 1031 1003 } 1032 1004 1033 - ret = -EINVAL; 1034 1005 /* f2fs_balance_fs doesn't need to do BG_GC in critical path. 
*/ 1035 - if (gc_type == BG_GC && !background) 1006 + if (gc_type == BG_GC && !background) { 1007 + ret = -EINVAL; 1036 1008 goto stop; 1037 - if (!__get_victim(sbi, &segno, gc_type)) 1009 + } 1010 + if (!__get_victim(sbi, &segno, gc_type)) { 1011 + ret = -ENODATA; 1038 1012 goto stop; 1039 - ret = 0; 1013 + } 1040 1014 1041 - if (do_garbage_collect(sbi, segno, &gc_list, gc_type) && 1042 - gc_type == FG_GC) 1015 + seg_freed = do_garbage_collect(sbi, segno, &gc_list, gc_type); 1016 + if (gc_type == FG_GC && seg_freed == sbi->segs_per_sec) 1043 1017 sec_freed++; 1018 + total_freed += seg_freed; 1044 1019 1045 1020 if (gc_type == FG_GC) 1046 1021 sbi->cur_victim_sec = NULL_SEGNO; ··· 1060 1029 stop: 1061 1030 SIT_I(sbi)->last_victim[ALLOC_NEXT] = 0; 1062 1031 SIT_I(sbi)->last_victim[FLUSH_DEVICE] = init_segno; 1032 + 1033 + trace_f2fs_gc_end(sbi->sb, ret, total_freed, sec_freed, 1034 + get_pages(sbi, F2FS_DIRTY_NODES), 1035 + get_pages(sbi, F2FS_DIRTY_DENTS), 1036 + get_pages(sbi, F2FS_DIRTY_IMETA), 1037 + free_sections(sbi), 1038 + free_segments(sbi), 1039 + reserved_segments(sbi), 1040 + prefree_segments(sbi)); 1041 + 1063 1042 mutex_unlock(&sbi->gc_mutex); 1064 1043 1065 1044 put_gc_inode(&gc_list);
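
The new gc_urgent behavior above is toggled from sysfs; while set, the loop skips the idle test, jumps straight to do_gc, and re-arms with gc_urgent_sleep_time. A throwaway sketch of driving it from userspace, with "sda1" standing in for the real <disk> name:

#include <stdio.h>
#include <unistd.h>

static int write_knob(const char *path, const char *val)
{
        FILE *f = fopen(path, "w");

        if (!f) {
                perror(path);
                return -1;
        }
        fputs(val, f);
        fclose(f);
        return 0;
}

int main(void)
{
        /* placeholder path; substitute the actual device name */
        const char *knob = "/sys/fs/f2fs/sda1/gc_urgent";

        if (write_knob(knob, "1"))              /* enter urgent mode */
                return 1;
        sleep(30);                              /* let background GC churn */
        return write_knob(knob, "0") ? 1 : 0;   /* back to default pacing */
}
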
+19 -8
fs/f2fs/gc.h
··· 13 13 * whether IO subsystem is idle 14 14 * or not 15 15 */ 16 + #define DEF_GC_THREAD_URGENT_SLEEP_TIME 500 /* 500 ms */ 16 17 #define DEF_GC_THREAD_MIN_SLEEP_TIME 30000 /* milliseconds */ 17 18 #define DEF_GC_THREAD_MAX_SLEEP_TIME 60000 18 19 #define DEF_GC_THREAD_NOGC_SLEEP_TIME 300000 /* wait 5 min */ ··· 28 27 wait_queue_head_t gc_wait_queue_head; 29 28 30 29 /* for gc sleep time */ 30 + unsigned int urgent_sleep_time; 31 31 unsigned int min_sleep_time; 32 32 unsigned int max_sleep_time; 33 33 unsigned int no_gc_sleep_time; 34 34 35 35 /* for changing gc mode */ 36 36 unsigned int gc_idle; 37 + unsigned int gc_urgent; 38 + unsigned int gc_wake; 37 39 }; 38 40 39 41 struct gc_inode_list { ··· 69 65 } 70 66 71 67 static inline void increase_sleep_time(struct f2fs_gc_kthread *gc_th, 72 - long *wait) 68 + unsigned int *wait) 73 69 { 70 + unsigned int min_time = gc_th->min_sleep_time; 71 + unsigned int max_time = gc_th->max_sleep_time; 72 + 74 73 if (*wait == gc_th->no_gc_sleep_time) 75 74 return; 76 75 77 - *wait += gc_th->min_sleep_time; 78 - if (*wait > gc_th->max_sleep_time) 79 - *wait = gc_th->max_sleep_time; 76 + if ((long long)*wait + (long long)min_time > (long long)max_time) 77 + *wait = max_time; 78 + else 79 + *wait += min_time; 80 80 } 81 81 82 82 static inline void decrease_sleep_time(struct f2fs_gc_kthread *gc_th, 83 - long *wait) 83 + unsigned int *wait) 84 84 { 85 + unsigned int min_time = gc_th->min_sleep_time; 86 + 85 87 if (*wait == gc_th->no_gc_sleep_time) 86 88 *wait = gc_th->max_sleep_time; 87 89 88 - *wait -= gc_th->min_sleep_time; 89 - if (*wait <= gc_th->min_sleep_time) 90 - *wait = gc_th->min_sleep_time; 90 + if ((long long)*wait - (long long)min_time < (long long)min_time) 91 + *wait = min_time; 92 + else 93 + *wait -= min_time; 91 94 } 92 95 93 96 static inline bool has_enough_invalid_blocks(struct f2fs_sb_info *sbi)
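
The widened arithmetic above is easy to miss: wait, min_sleep_time and max_sleep_time are all unsigned ints settable from sysfs, so the old "add then clamp" could wrap before the clamp ever ran. A standalone sketch of the fixed increase path:

#include <stdio.h>
#include <limits.h>

static void increase_sleep_time(unsigned int min_time, unsigned int max_time,
                                unsigned int no_gc_time, unsigned int *wait)
{
        if (*wait == no_gc_time)
                return;

        /* compare in a wider type so the sum cannot wrap */
        if ((long long)*wait + (long long)min_time > (long long)max_time)
                *wait = max_time;
        else
                *wait += min_time;
}

int main(void)
{
        unsigned int wait = UINT_MAX - 10;

        /* a bare "*wait += min_time" would wrap to 29989 here */
        increase_sleep_time(30000, UINT_MAX, 300000, &wait);
        printf("wait=%u\n", wait);
        return 0;
}
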
+78 -64
fs/f2fs/inline.c
··· 22 22 if (!S_ISREG(inode->i_mode) && !S_ISLNK(inode->i_mode)) 23 23 return false; 24 24 25 - if (i_size_read(inode) > MAX_INLINE_DATA) 25 + if (i_size_read(inode) > MAX_INLINE_DATA(inode)) 26 26 return false; 27 27 28 - if (f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode)) 28 + if (f2fs_encrypted_file(inode)) 29 29 return false; 30 30 31 31 return true; ··· 44 44 45 45 void read_inline_data(struct page *page, struct page *ipage) 46 46 { 47 + struct inode *inode = page->mapping->host; 47 48 void *src_addr, *dst_addr; 48 49 49 50 if (PageUptodate(page)) ··· 52 51 53 52 f2fs_bug_on(F2FS_P_SB(page), page->index); 54 53 55 - zero_user_segment(page, MAX_INLINE_DATA, PAGE_SIZE); 54 + zero_user_segment(page, MAX_INLINE_DATA(inode), PAGE_SIZE); 56 55 57 56 /* Copy the whole inline data block */ 58 - src_addr = inline_data_addr(ipage); 57 + src_addr = inline_data_addr(inode, ipage); 59 58 dst_addr = kmap_atomic(page); 60 - memcpy(dst_addr, src_addr, MAX_INLINE_DATA); 59 + memcpy(dst_addr, src_addr, MAX_INLINE_DATA(inode)); 61 60 flush_dcache_page(page); 62 61 kunmap_atomic(dst_addr); 63 62 if (!PageUptodate(page)) ··· 68 67 { 69 68 void *addr; 70 69 71 - if (from >= MAX_INLINE_DATA) 70 + if (from >= MAX_INLINE_DATA(inode)) 72 71 return; 73 72 74 - addr = inline_data_addr(ipage); 73 + addr = inline_data_addr(inode, ipage); 75 74 76 75 f2fs_wait_on_page_writeback(ipage, NODE, true); 77 - memset(addr + from, 0, MAX_INLINE_DATA - from); 76 + memset(addr + from, 0, MAX_INLINE_DATA(inode) - from); 78 77 set_page_dirty(ipage); 79 78 80 79 if (from == 0) ··· 117 116 .op_flags = REQ_SYNC | REQ_PRIO, 118 117 .page = page, 119 118 .encrypted_page = NULL, 119 + .io_type = FS_DATA_IO, 120 120 }; 121 121 int dirty, err; 122 122 ··· 202 200 { 203 201 void *src_addr, *dst_addr; 204 202 struct dnode_of_data dn; 203 + struct address_space *mapping = page_mapping(page); 204 + unsigned long flags; 205 205 int err; 206 206 207 207 set_new_dnode(&dn, inode, NULL, NULL, 0); ··· 220 216 221 217 f2fs_wait_on_page_writeback(dn.inode_page, NODE, true); 222 218 src_addr = kmap_atomic(page); 223 - dst_addr = inline_data_addr(dn.inode_page); 224 - memcpy(dst_addr, src_addr, MAX_INLINE_DATA); 219 + dst_addr = inline_data_addr(inode, dn.inode_page); 220 + memcpy(dst_addr, src_addr, MAX_INLINE_DATA(inode)); 225 221 kunmap_atomic(src_addr); 226 222 set_page_dirty(dn.inode_page); 223 + 224 + spin_lock_irqsave(&mapping->tree_lock, flags); 225 + radix_tree_tag_clear(&mapping->page_tree, page_index(page), 226 + PAGECACHE_TAG_DIRTY); 227 + spin_unlock_irqrestore(&mapping->tree_lock, flags); 227 228 228 229 set_inode_flag(inode, FI_APPEND_WRITE); 229 230 set_inode_flag(inode, FI_DATA_EXIST); ··· 264 255 265 256 f2fs_wait_on_page_writeback(ipage, NODE, true); 266 257 267 - src_addr = inline_data_addr(npage); 268 - dst_addr = inline_data_addr(ipage); 269 - memcpy(dst_addr, src_addr, MAX_INLINE_DATA); 258 + src_addr = inline_data_addr(inode, npage); 259 + dst_addr = inline_data_addr(inode, ipage); 260 + memcpy(dst_addr, src_addr, MAX_INLINE_DATA(inode)); 270 261 271 262 set_inode_flag(inode, FI_INLINE_DATA); 272 263 set_inode_flag(inode, FI_DATA_EXIST); ··· 294 285 struct fscrypt_name *fname, struct page **res_page) 295 286 { 296 287 struct f2fs_sb_info *sbi = F2FS_SB(dir->i_sb); 297 - struct f2fs_inline_dentry *inline_dentry; 298 288 struct qstr name = FSTR_TO_QSTR(&fname->disk_name); 299 289 struct f2fs_dir_entry *de; 300 290 struct f2fs_dentry_ptr d; 301 291 struct page *ipage; 292 + void *inline_dentry; 302 293 f2fs_hash_t 
namehash; 303 294 304 295 ipage = get_node_page(sbi, dir->i_ino); ··· 309 300 310 301 namehash = f2fs_dentry_hash(&name, fname); 311 302 312 - inline_dentry = inline_data_addr(ipage); 303 + inline_dentry = inline_data_addr(dir, ipage); 313 304 314 - make_dentry_ptr_inline(NULL, &d, inline_dentry); 305 + make_dentry_ptr_inline(dir, &d, inline_dentry); 315 306 de = find_target_dentry(fname, namehash, NULL, &d); 316 307 unlock_page(ipage); 317 308 if (de) ··· 325 316 int make_empty_inline_dir(struct inode *inode, struct inode *parent, 326 317 struct page *ipage) 327 318 { 328 - struct f2fs_inline_dentry *inline_dentry; 329 319 struct f2fs_dentry_ptr d; 320 + void *inline_dentry; 330 321 331 - inline_dentry = inline_data_addr(ipage); 322 + inline_dentry = inline_data_addr(inode, ipage); 332 323 333 - make_dentry_ptr_inline(NULL, &d, inline_dentry); 324 + make_dentry_ptr_inline(inode, &d, inline_dentry); 334 325 do_make_empty_dir(inode, parent, &d); 335 326 336 327 set_page_dirty(ipage); 337 328 338 329 /* update i_size to MAX_INLINE_DATA */ 339 - if (i_size_read(inode) < MAX_INLINE_DATA) 340 - f2fs_i_size_write(inode, MAX_INLINE_DATA); 330 + if (i_size_read(inode) < MAX_INLINE_DATA(inode)) 331 + f2fs_i_size_write(inode, MAX_INLINE_DATA(inode)); 341 332 return 0; 342 333 } 343 334 ··· 346 337 * release ipage in this function. 347 338 */ 348 339 static int f2fs_move_inline_dirents(struct inode *dir, struct page *ipage, 349 - struct f2fs_inline_dentry *inline_dentry) 340 + void *inline_dentry) 350 341 { 351 342 struct page *page; 352 343 struct dnode_of_data dn; 353 344 struct f2fs_dentry_block *dentry_blk; 345 + struct f2fs_dentry_ptr src, dst; 354 346 int err; 355 347 356 348 page = f2fs_grab_cache_page(dir->i_mapping, 0, false); ··· 366 356 goto out; 367 357 368 358 f2fs_wait_on_page_writeback(page, DATA, true); 369 - zero_user_segment(page, MAX_INLINE_DATA, PAGE_SIZE); 359 + zero_user_segment(page, MAX_INLINE_DATA(dir), PAGE_SIZE); 370 360 371 361 dentry_blk = kmap_atomic(page); 372 362 363 + make_dentry_ptr_inline(dir, &src, inline_dentry); 364 + make_dentry_ptr_block(dir, &dst, dentry_blk); 365 + 373 366 /* copy data from inline dentry block to new dentry block */ 374 - memcpy(dentry_blk->dentry_bitmap, inline_dentry->dentry_bitmap, 375 - INLINE_DENTRY_BITMAP_SIZE); 376 - memset(dentry_blk->dentry_bitmap + INLINE_DENTRY_BITMAP_SIZE, 0, 377 - SIZE_OF_DENTRY_BITMAP - INLINE_DENTRY_BITMAP_SIZE); 367 + memcpy(dst.bitmap, src.bitmap, src.nr_bitmap); 368 + memset(dst.bitmap + src.nr_bitmap, 0, dst.nr_bitmap - src.nr_bitmap); 378 369 /* 379 370 * we do not need to zero out remainder part of dentry and filename 380 371 * field, since we have used bitmap for marking the usage status of 381 372 * them, besides, we can also ignore copying/zeroing reserved space 382 373 * of dentry block, because them haven't been used so far. 
383 374 */ 384 - memcpy(dentry_blk->dentry, inline_dentry->dentry, 385 - sizeof(struct f2fs_dir_entry) * NR_INLINE_DENTRY); 386 - memcpy(dentry_blk->filename, inline_dentry->filename, 387 - NR_INLINE_DENTRY * F2FS_SLOT_LEN); 375 + memcpy(dst.dentry, src.dentry, SIZE_OF_DIR_ENTRY * src.max); 376 + memcpy(dst.filename, src.filename, src.max * F2FS_SLOT_LEN); 388 377 389 378 kunmap_atomic(dentry_blk); 390 379 if (!PageUptodate(page)) ··· 404 395 return err; 405 396 } 406 397 407 - static int f2fs_add_inline_entries(struct inode *dir, 408 - struct f2fs_inline_dentry *inline_dentry) 398 + static int f2fs_add_inline_entries(struct inode *dir, void *inline_dentry) 409 399 { 410 400 struct f2fs_dentry_ptr d; 411 401 unsigned long bit_pos = 0; 412 402 int err = 0; 413 403 414 - make_dentry_ptr_inline(NULL, &d, inline_dentry); 404 + make_dentry_ptr_inline(dir, &d, inline_dentry); 415 405 416 406 while (bit_pos < d.max) { 417 407 struct f2fs_dir_entry *de; ··· 452 444 } 453 445 454 446 static int f2fs_move_rehashed_dirents(struct inode *dir, struct page *ipage, 455 - struct f2fs_inline_dentry *inline_dentry) 447 + void *inline_dentry) 456 448 { 457 - struct f2fs_inline_dentry *backup_dentry; 449 + void *backup_dentry; 458 450 int err; 459 451 460 452 backup_dentry = f2fs_kmalloc(F2FS_I_SB(dir), 461 - sizeof(struct f2fs_inline_dentry), GFP_F2FS_ZERO); 453 + MAX_INLINE_DATA(dir), GFP_F2FS_ZERO); 462 454 if (!backup_dentry) { 463 455 f2fs_put_page(ipage, 1); 464 456 return -ENOMEM; 465 457 } 466 458 467 - memcpy(backup_dentry, inline_dentry, MAX_INLINE_DATA); 459 + memcpy(backup_dentry, inline_dentry, MAX_INLINE_DATA(dir)); 468 460 truncate_inline_inode(dir, ipage, 0); 469 461 470 462 unlock_page(ipage); ··· 481 473 return 0; 482 474 recover: 483 475 lock_page(ipage); 484 - memcpy(inline_dentry, backup_dentry, MAX_INLINE_DATA); 476 + memcpy(inline_dentry, backup_dentry, MAX_INLINE_DATA(dir)); 485 477 f2fs_i_depth_write(dir, 0); 486 - f2fs_i_size_write(dir, MAX_INLINE_DATA); 478 + f2fs_i_size_write(dir, MAX_INLINE_DATA(dir)); 487 479 set_page_dirty(ipage); 488 480 f2fs_put_page(ipage, 1); 489 481 ··· 492 484 } 493 485 494 486 static int f2fs_convert_inline_dir(struct inode *dir, struct page *ipage, 495 - struct f2fs_inline_dentry *inline_dentry) 487 + void *inline_dentry) 496 488 { 497 489 if (!F2FS_I(dir)->i_dir_level) 498 490 return f2fs_move_inline_dirents(dir, ipage, inline_dentry); ··· 508 500 struct page *ipage; 509 501 unsigned int bit_pos; 510 502 f2fs_hash_t name_hash; 511 - struct f2fs_inline_dentry *inline_dentry = NULL; 503 + void *inline_dentry = NULL; 512 504 struct f2fs_dentry_ptr d; 513 505 int slots = GET_DENTRY_SLOTS(new_name->len); 514 506 struct page *page = NULL; ··· 518 510 if (IS_ERR(ipage)) 519 511 return PTR_ERR(ipage); 520 512 521 - inline_dentry = inline_data_addr(ipage); 522 - bit_pos = room_for_filename(&inline_dentry->dentry_bitmap, 523 - slots, NR_INLINE_DENTRY); 524 - if (bit_pos >= NR_INLINE_DENTRY) { 513 + inline_dentry = inline_data_addr(dir, ipage); 514 + make_dentry_ptr_inline(dir, &d, inline_dentry); 515 + 516 + bit_pos = room_for_filename(d.bitmap, slots, d.max); 517 + if (bit_pos >= d.max) { 525 518 err = f2fs_convert_inline_dir(dir, ipage, inline_dentry); 526 519 if (err) 527 520 return err; ··· 543 534 f2fs_wait_on_page_writeback(ipage, NODE, true); 544 535 545 536 name_hash = f2fs_dentry_hash(new_name, NULL); 546 - make_dentry_ptr_inline(NULL, &d, inline_dentry); 547 537 f2fs_update_dentry(ino, mode, &d, new_name, name_hash, bit_pos); 548 538 549 539 
set_page_dirty(ipage); ··· 565 557 void f2fs_delete_inline_entry(struct f2fs_dir_entry *dentry, struct page *page, 566 558 struct inode *dir, struct inode *inode) 567 559 { 568 - struct f2fs_inline_dentry *inline_dentry; 560 + struct f2fs_dentry_ptr d; 561 + void *inline_dentry; 569 562 int slots = GET_DENTRY_SLOTS(le16_to_cpu(dentry->name_len)); 570 563 unsigned int bit_pos; 571 564 int i; ··· 574 565 lock_page(page); 575 566 f2fs_wait_on_page_writeback(page, NODE, true); 576 567 577 - inline_dentry = inline_data_addr(page); 578 - bit_pos = dentry - inline_dentry->dentry; 568 + inline_dentry = inline_data_addr(dir, page); 569 + make_dentry_ptr_inline(dir, &d, inline_dentry); 570 + 571 + bit_pos = dentry - d.dentry; 579 572 for (i = 0; i < slots; i++) 580 - __clear_bit_le(bit_pos + i, 581 - &inline_dentry->dentry_bitmap); 573 + __clear_bit_le(bit_pos + i, d.bitmap); 582 574 583 575 set_page_dirty(page); 584 576 f2fs_put_page(page, 1); ··· 596 586 struct f2fs_sb_info *sbi = F2FS_I_SB(dir); 597 587 struct page *ipage; 598 588 unsigned int bit_pos = 2; 599 - struct f2fs_inline_dentry *inline_dentry; 589 + void *inline_dentry; 590 + struct f2fs_dentry_ptr d; 600 591 601 592 ipage = get_node_page(sbi, dir->i_ino); 602 593 if (IS_ERR(ipage)) 603 594 return false; 604 595 605 - inline_dentry = inline_data_addr(ipage); 606 - bit_pos = find_next_bit_le(&inline_dentry->dentry_bitmap, 607 - NR_INLINE_DENTRY, 608 - bit_pos); 596 + inline_dentry = inline_data_addr(dir, ipage); 597 + make_dentry_ptr_inline(dir, &d, inline_dentry); 598 + 599 + bit_pos = find_next_bit_le(d.bitmap, d.max, bit_pos); 609 600 610 601 f2fs_put_page(ipage, 1); 611 602 612 - if (bit_pos < NR_INLINE_DENTRY) 603 + if (bit_pos < d.max) 613 604 return false; 614 605 615 606 return true; ··· 620 609 struct fscrypt_str *fstr) 621 610 { 622 611 struct inode *inode = file_inode(file); 623 - struct f2fs_inline_dentry *inline_dentry = NULL; 624 612 struct page *ipage = NULL; 625 613 struct f2fs_dentry_ptr d; 614 + void *inline_dentry = NULL; 626 615 int err; 627 616 628 - if (ctx->pos == NR_INLINE_DENTRY) 617 + make_dentry_ptr_inline(inode, &d, inline_dentry); 618 + 619 + if (ctx->pos == d.max) 629 620 return 0; 630 621 631 622 ipage = get_node_page(F2FS_I_SB(inode), inode->i_ino); 632 623 if (IS_ERR(ipage)) 633 624 return PTR_ERR(ipage); 634 625 635 - inline_dentry = inline_data_addr(ipage); 626 + inline_dentry = inline_data_addr(inode, ipage); 636 627 637 628 make_dentry_ptr_inline(inode, &d, inline_dentry); 638 629 639 630 err = f2fs_fill_dentries(ctx, &d, 0, fstr); 640 631 if (!err) 641 - ctx->pos = NR_INLINE_DENTRY; 632 + ctx->pos = d.max; 642 633 643 634 f2fs_put_page(ipage, 1); 644 635 return err < 0 ? err : 0; ··· 665 652 goto out; 666 653 } 667 654 668 - ilen = min_t(size_t, MAX_INLINE_DATA, i_size_read(inode)); 655 + ilen = min_t(size_t, MAX_INLINE_DATA(inode), i_size_read(inode)); 669 656 if (start >= ilen) 670 657 goto out; 671 658 if (start + len < ilen) ··· 674 661 675 662 get_node_info(F2FS_I_SB(inode), inode->i_ino, &ni); 676 663 byteaddr = (__u64)ni.blk_addr << inode->i_sb->s_blocksize_bits; 677 - byteaddr += (char *)inline_data_addr(ipage) - (char *)F2FS_INODE(ipage); 664 + byteaddr += (char *)inline_data_addr(inode, ipage) - 665 + (char *)F2FS_INODE(ipage); 678 666 err = fiemap_fill_next_extent(fieinfo, start, byteaddr, ilen, flags); 679 667 out: 680 668 f2fs_put_page(ipage, 1);
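
With struct f2fs_inline_dentry gone, the inline dentry geometry is derived from MAX_INLINE_DATA(inode), which shrinks when extra inode attributes are enabled. A sketch of the arithmetic as I understand the on-disk format (one bitmap bit per entry, 11-byte entries, 8-byte name slots; 3488 is the classic inline size, 3392 is only an illustrative reduced one):

#include <stdio.h>

#define SIZE_OF_DIR_ENTRY       11      /* bytes per struct f2fs_dir_entry */
#define F2FS_SLOT_LEN           8       /* bytes per filename slot */
#define BITS_PER_BYTE           8

static unsigned int nr_inline_dentry(unsigned int max_inline_data)
{
        /* each entry costs one bitmap bit plus entry + name-slot bytes */
        return max_inline_data * BITS_PER_BYTE /
                ((SIZE_OF_DIR_ENTRY + F2FS_SLOT_LEN) * BITS_PER_BYTE + 1);
}

int main(void)
{
        printf("%u entries in a 3488-byte inline area\n",
               nr_inline_dentry(3488));         /* 182, the old constant */
        printf("%u entries in a 3392-byte inline area\n",
               nr_inline_dentry(3392));
        return 0;
}
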
+119 -13
fs/f2fs/inode.c
··· 49 49 50 50 static void __get_inode_rdev(struct inode *inode, struct f2fs_inode *ri) 51 51 { 52 + int extra_size = get_extra_isize(inode); 53 + 52 54 if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode) || 53 55 S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) { 54 - if (ri->i_addr[0]) 55 - inode->i_rdev = 56 - old_decode_dev(le32_to_cpu(ri->i_addr[0])); 56 + if (ri->i_addr[extra_size]) 57 + inode->i_rdev = old_decode_dev( 58 + le32_to_cpu(ri->i_addr[extra_size])); 57 59 else 58 - inode->i_rdev = 59 - new_decode_dev(le32_to_cpu(ri->i_addr[1])); 60 + inode->i_rdev = new_decode_dev( 61 + le32_to_cpu(ri->i_addr[extra_size + 1])); 60 62 } 61 63 } 62 64 63 65 static bool __written_first_block(struct f2fs_inode *ri) 64 66 { 65 - block_t addr = le32_to_cpu(ri->i_addr[0]); 67 + block_t addr = le32_to_cpu(ri->i_addr[offset_in_addr(ri)]); 66 68 67 69 if (addr != NEW_ADDR && addr != NULL_ADDR) 68 70 return true; ··· 73 71 74 72 static void __set_inode_rdev(struct inode *inode, struct f2fs_inode *ri) 75 73 { 74 + int extra_size = get_extra_isize(inode); 75 + 76 76 if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode)) { 77 77 if (old_valid_dev(inode->i_rdev)) { 78 - ri->i_addr[0] = 78 + ri->i_addr[extra_size] = 79 79 cpu_to_le32(old_encode_dev(inode->i_rdev)); 80 - ri->i_addr[1] = 0; 80 + ri->i_addr[extra_size + 1] = 0; 81 81 } else { 82 - ri->i_addr[0] = 0; 83 - ri->i_addr[1] = 82 + ri->i_addr[extra_size] = 0; 83 + ri->i_addr[extra_size + 1] = 84 84 cpu_to_le32(new_encode_dev(inode->i_rdev)); 85 - ri->i_addr[2] = 0; 85 + ri->i_addr[extra_size + 2] = 0; 86 86 } 87 87 } 88 88 } 89 89 90 90 static void __recover_inline_status(struct inode *inode, struct page *ipage) 91 91 { 92 - void *inline_data = inline_data_addr(ipage); 92 + void *inline_data = inline_data_addr(inode, ipage); 93 93 __le32 *start = inline_data; 94 - __le32 *end = start + MAX_INLINE_DATA / sizeof(__le32); 94 + __le32 *end = start + MAX_INLINE_DATA(inode) / sizeof(__le32); 95 95 96 96 while (start < end) { 97 97 if (*start++) { ··· 108 104 return; 109 105 } 110 106 107 + static bool f2fs_enable_inode_chksum(struct f2fs_sb_info *sbi, struct page *page) 108 + { 109 + struct f2fs_inode *ri = &F2FS_NODE(page)->i; 110 + int extra_isize = le32_to_cpu(ri->i_extra_isize); 111 + 112 + if (!f2fs_sb_has_inode_chksum(sbi->sb)) 113 + return false; 114 + 115 + if (!RAW_IS_INODE(F2FS_NODE(page)) || !(ri->i_inline & F2FS_EXTRA_ATTR)) 116 + return false; 117 + 118 + if (!F2FS_FITS_IN_INODE(ri, extra_isize, i_inode_checksum)) 119 + return false; 120 + 121 + return true; 122 + } 123 + 124 + static __u32 f2fs_inode_chksum(struct f2fs_sb_info *sbi, struct page *page) 125 + { 126 + struct f2fs_node *node = F2FS_NODE(page); 127 + struct f2fs_inode *ri = &node->i; 128 + __le32 ino = node->footer.ino; 129 + __le32 gen = ri->i_generation; 130 + __u32 chksum, chksum_seed; 131 + __u32 dummy_cs = 0; 132 + unsigned int offset = offsetof(struct f2fs_inode, i_inode_checksum); 133 + unsigned int cs_size = sizeof(dummy_cs); 134 + 135 + chksum = f2fs_chksum(sbi, sbi->s_chksum_seed, (__u8 *)&ino, 136 + sizeof(ino)); 137 + chksum_seed = f2fs_chksum(sbi, chksum, (__u8 *)&gen, sizeof(gen)); 138 + 139 + chksum = f2fs_chksum(sbi, chksum_seed, (__u8 *)ri, offset); 140 + chksum = f2fs_chksum(sbi, chksum, (__u8 *)&dummy_cs, cs_size); 141 + offset += cs_size; 142 + chksum = f2fs_chksum(sbi, chksum, (__u8 *)ri + offset, 143 + F2FS_BLKSIZE - offset); 144 + return chksum; 145 + } 146 + 147 + bool f2fs_inode_chksum_verify(struct f2fs_sb_info *sbi, struct page *page) 148 + { 149 
+ struct f2fs_inode *ri; 150 + __u32 provided, calculated; 151 + 152 + if (!f2fs_enable_inode_chksum(sbi, page) || 153 + PageDirty(page) || PageWriteback(page)) 154 + return true; 155 + 156 + ri = &F2FS_NODE(page)->i; 157 + provided = le32_to_cpu(ri->i_inode_checksum); 158 + calculated = f2fs_inode_chksum(sbi, page); 159 + 160 + if (provided != calculated) 161 + f2fs_msg(sbi->sb, KERN_WARNING, 162 + "checksum invalid, ino = %x, %x vs. %x", 163 + ino_of_node(page), provided, calculated); 164 + 165 + return provided == calculated; 166 + } 167 + 168 + void f2fs_inode_chksum_set(struct f2fs_sb_info *sbi, struct page *page) 169 + { 170 + struct f2fs_inode *ri = &F2FS_NODE(page)->i; 171 + 172 + if (!f2fs_enable_inode_chksum(sbi, page)) 173 + return; 174 + 175 + ri->i_inode_checksum = cpu_to_le32(f2fs_inode_chksum(sbi, page)); 176 + } 177 + 111 178 static int do_read_inode(struct inode *inode) 112 179 { 113 180 struct f2fs_sb_info *sbi = F2FS_I_SB(inode); 114 181 struct f2fs_inode_info *fi = F2FS_I(inode); 115 182 struct page *node_page; 116 183 struct f2fs_inode *ri; 184 + projid_t i_projid; 117 185 118 186 /* Check if ino is within scope */ 119 187 if (check_nid_range(sbi, inode->i_ino)) { ··· 229 153 230 154 get_inline_info(inode, ri); 231 155 156 + fi->i_extra_isize = f2fs_has_extra_attr(inode) ? 157 + le16_to_cpu(ri->i_extra_isize) : 0; 158 + 232 159 /* check data exist */ 233 160 if (f2fs_has_inline_data(inode) && !f2fs_exist_data(inode)) 234 161 __recover_inline_status(inode, node_page); ··· 244 165 245 166 if (!need_inode_block_update(sbi, inode->i_ino)) 246 167 fi->last_disk_size = inode->i_size; 168 + 169 + if (fi->i_flags & FS_PROJINHERIT_FL) 170 + set_inode_flag(inode, FI_PROJ_INHERIT); 171 + 172 + if (f2fs_has_extra_attr(inode) && f2fs_sb_has_project_quota(sbi->sb) && 173 + F2FS_FITS_IN_INODE(ri, fi->i_extra_isize, i_projid)) 174 + i_projid = (projid_t)le32_to_cpu(ri->i_projid); 175 + else 176 + i_projid = F2FS_DEF_PROJID; 177 + fi->i_projid = make_kprojid(&init_user_ns, i_projid); 247 178 248 179 f2fs_put_page(node_page, 1); 249 180 ··· 381 292 ri->i_generation = cpu_to_le32(inode->i_generation); 382 293 ri->i_dir_level = F2FS_I(inode)->i_dir_level; 383 294 295 + if (f2fs_has_extra_attr(inode)) { 296 + ri->i_extra_isize = cpu_to_le16(F2FS_I(inode)->i_extra_isize); 297 + 298 + if (f2fs_sb_has_project_quota(F2FS_I_SB(inode)->sb) && 299 + F2FS_FITS_IN_INODE(ri, F2FS_I(inode)->i_extra_isize, 300 + i_projid)) { 301 + projid_t i_projid; 302 + 303 + i_projid = from_kprojid(&init_user_ns, 304 + F2FS_I(inode)->i_projid); 305 + ri->i_projid = cpu_to_le32(i_projid); 306 + } 307 + } 308 + 384 309 __set_inode_rdev(inode, ri); 385 310 set_cold_node(inode, node_page); 386 311 ··· 518 415 stat_dec_inline_xattr(inode); 519 416 stat_dec_inline_dir(inode); 520 417 stat_dec_inline_inode(inode); 418 + 419 + if (!is_set_ckpt_flags(sbi, CP_ERROR_FLAG)) 420 + f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE)); 521 421 522 422 /* ino == 0, if f2fs_new_inode() was failed t*/ 523 423 if (inode->i_ino)
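
f2fs_inode_chksum() layers the checksum as: seed -> inode number -> generation -> inode bytes up to the checksum field -> a zero in place of the field -> the rest of the 4KB node block. A userspace sketch of the same layering; the bitwise CRC mirrors the kernel's crc32_le convention (no pre/post inversion), and the 120-byte offset below is only a placeholder for offsetof(struct f2fs_inode, i_inode_checksum):

#include <stdint.h>
#include <stdio.h>

#define F2FS_BLKSIZE 4096

/* bitwise CRC32, little-endian polynomial, standing in for f2fs_chksum() */
static uint32_t crc32_le(uint32_t crc, const void *buf, size_t len)
{
        const uint8_t *p = buf;

        while (len--) {
                crc ^= *p++;
                for (int i = 0; i < 8; i++)
                        crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320u : 0);
        }
        return crc;
}

/* node points at the raw 4KB node block (the inode sits at offset 0) */
static uint32_t inode_chksum(uint32_t s_chksum_seed, uint32_t ino,
                             uint32_t gen, const uint8_t *node,
                             size_t chksum_offset)
{
        uint32_t zero = 0, chksum;

        chksum = crc32_le(s_chksum_seed, &ino, sizeof(ino));
        chksum = crc32_le(chksum, &gen, sizeof(gen));
        chksum = crc32_le(chksum, node, chksum_offset);
        chksum = crc32_le(chksum, &zero, sizeof(zero)); /* field as zero */
        chksum = crc32_le(chksum, node + chksum_offset + sizeof(zero),
                          F2FS_BLKSIZE - chksum_offset - sizeof(zero));
        return chksum;
}

int main(void)
{
        uint8_t node[F2FS_BLKSIZE] = { 0 };

        printf("chksum=%08x\n", inode_chksum(0x12345678, 3, 1, node, 120));
        return 0;
}

Note that the verify path above deliberately trusts dirty and writeback pages; the checksum is only recomputed when the node block is written out.
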
+43
fs/f2fs/namei.c
··· 58 58 goto fail; 59 59 } 60 60 61 + if (f2fs_sb_has_project_quota(sbi->sb) && 62 + (F2FS_I(dir)->i_flags & FS_PROJINHERIT_FL)) 63 + F2FS_I(inode)->i_projid = F2FS_I(dir)->i_projid; 64 + else 65 + F2FS_I(inode)->i_projid = make_kprojid(&init_user_ns, 66 + F2FS_DEF_PROJID); 67 + 61 68 err = dquot_initialize(inode); 62 69 if (err) 63 70 goto fail_drop; ··· 79 72 80 73 set_inode_flag(inode, FI_NEW_INODE); 81 74 75 + if (f2fs_sb_has_extra_attr(sbi->sb)) { 76 + set_inode_flag(inode, FI_EXTRA_ATTR); 77 + F2FS_I(inode)->i_extra_isize = F2FS_TOTAL_EXTRA_ATTR_SIZE; 78 + } 79 + 82 80 if (test_opt(sbi, INLINE_XATTR)) 83 81 set_inode_flag(inode, FI_INLINE_XATTR); 84 82 if (test_opt(sbi, INLINE_DATA) && f2fs_may_inline_data(inode)) ··· 96 84 stat_inc_inline_xattr(inode); 97 85 stat_inc_inline_inode(inode); 98 86 stat_inc_inline_dir(inode); 87 + 88 + F2FS_I(inode)->i_flags = 89 + f2fs_mask_flags(mode, F2FS_I(dir)->i_flags & F2FS_FL_INHERITED); 90 + 91 + if (S_ISDIR(inode->i_mode)) 92 + F2FS_I(inode)->i_flags |= FS_INDEX_FL; 93 + 94 + if (F2FS_I(inode)->i_flags & FS_PROJINHERIT_FL) 95 + set_inode_flag(inode, FI_PROJ_INHERIT); 99 96 100 97 trace_f2fs_new_inode(inode, 0); 101 98 return inode; ··· 225 204 !fscrypt_has_permitted_context(dir, inode)) 226 205 return -EPERM; 227 206 207 + if (is_inode_flag_set(dir, FI_PROJ_INHERIT) && 208 + (!projid_eq(F2FS_I(dir)->i_projid, 209 + F2FS_I(old_dentry->d_inode)->i_projid))) 210 + return -EXDEV; 211 + 228 212 err = dquot_initialize(dir); 229 213 if (err) 230 214 return err; ··· 286 260 "in readonly mountpoint", dir->i_ino, pino); 287 261 return 0; 288 262 } 263 + 264 + err = dquot_initialize(dir); 265 + if (err) 266 + return err; 289 267 290 268 f2fs_balance_fs(sbi, true); 291 269 ··· 754 724 goto out; 755 725 } 756 726 727 + if (is_inode_flag_set(new_dir, FI_PROJ_INHERIT) && 728 + (!projid_eq(F2FS_I(new_dir)->i_projid, 729 + F2FS_I(old_dentry->d_inode)->i_projid))) 730 + return -EXDEV; 731 + 757 732 err = dquot_initialize(old_dir); 758 733 if (err) 759 734 goto out; ··· 946 911 (!fscrypt_has_permitted_context(new_dir, old_inode) || 947 912 !fscrypt_has_permitted_context(old_dir, new_inode))) 948 913 return -EPERM; 914 + 915 + if ((is_inode_flag_set(new_dir, FI_PROJ_INHERIT) && 916 + !projid_eq(F2FS_I(new_dir)->i_projid, 917 + F2FS_I(old_dentry->d_inode)->i_projid)) || 918 + (is_inode_flag_set(new_dir, FI_PROJ_INHERIT) && 919 + !projid_eq(F2FS_I(old_dir)->i_projid, 920 + F2FS_I(new_dentry->d_inode)->i_projid))) 921 + return -EXDEV; 949 922 950 923 err = dquot_initialize(old_dir); 951 924 if (err)
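
The EXDEV returns above surface the project boundary to userspace: once a directory carries FS_XFLAG_PROJINHERIT, hard links and renames of inodes with a different project ID into it are refused, and tools are expected to fall back to copy + unlink, as they already do across mounts. A sketch with placeholder paths:

#include <stdio.h>
#include <errno.h>
#include <unistd.h>

int main(void)
{
        if (link("/mnt/f2fs/projA/file", "/mnt/f2fs/projB/file") < 0) {
                if (errno == EXDEV)
                        fprintf(stderr,
                                "crosses a project boundary: copy instead\n");
                else
                        perror("link");
                return 1;
        }
        return 0;
}
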
+52 -27
fs/f2fs/node.c
··· 19 19 #include "f2fs.h" 20 20 #include "node.h" 21 21 #include "segment.h" 22 + #include "xattr.h" 22 23 #include "trace.h" 23 24 #include <trace/events/f2fs.h> 24 25 ··· 555 554 level = 3; 556 555 goto got; 557 556 } else { 558 - BUG(); 557 + return -E2BIG; 559 558 } 560 559 got: 561 560 return level; ··· 579 578 int err = 0; 580 579 581 580 level = get_node_path(dn->inode, index, offset, noffset); 581 + if (level < 0) 582 + return level; 582 583 583 584 nids[0] = dn->inode->i_ino; 584 585 npage[0] = dn->inode_page; ··· 616 613 } 617 614 618 615 dn->nid = nids[i]; 619 - npage[i] = new_node_page(dn, noffset[i], NULL); 616 + npage[i] = new_node_page(dn, noffset[i]); 620 617 if (IS_ERR(npage[i])) { 621 618 alloc_nid_failed(sbi, nids[i]); 622 619 err = PTR_ERR(npage[i]); ··· 657 654 dn->nid = nids[level]; 658 655 dn->ofs_in_node = offset[level]; 659 656 dn->node_page = npage[level]; 660 - dn->data_blkaddr = datablock_addr(dn->node_page, dn->ofs_in_node); 657 + dn->data_blkaddr = datablock_addr(dn->inode, 658 + dn->node_page, dn->ofs_in_node); 661 659 return 0; 662 660 663 661 release_pages: ··· 880 876 trace_f2fs_truncate_inode_blocks_enter(inode, from); 881 877 882 878 level = get_node_path(inode, from, offset, noffset); 879 + if (level < 0) 880 + return level; 883 881 884 882 page = get_node_page(sbi, inode->i_ino); 885 883 if (IS_ERR(page)) { ··· 1028 1022 set_new_dnode(&dn, inode, NULL, NULL, inode->i_ino); 1029 1023 1030 1024 /* caller should f2fs_put_page(page, 1); */ 1031 - return new_node_page(&dn, 0, NULL); 1025 + return new_node_page(&dn, 0); 1032 1026 } 1033 1027 1034 - struct page *new_node_page(struct dnode_of_data *dn, 1035 - unsigned int ofs, struct page *ipage) 1028 + struct page *new_node_page(struct dnode_of_data *dn, unsigned int ofs) 1036 1029 { 1037 1030 struct f2fs_sb_info *sbi = F2FS_I_SB(dn->inode); 1038 1031 struct node_info new_ni; ··· 1175 1170 err = -EIO; 1176 1171 goto out_err; 1177 1172 } 1173 + 1174 + if (!f2fs_inode_chksum_verify(sbi, page)) { 1175 + err = -EBADMSG; 1176 + goto out_err; 1177 + } 1178 1178 page_hit: 1179 1179 if(unlikely(nid != nid_of_node(page))) { 1180 1180 f2fs_msg(sbi->sb, KERN_WARNING, "inconsistent node block, " ··· 1187 1177 nid, nid_of_node(page), ino_of_node(page), 1188 1178 ofs_of_node(page), cpver_of_node(page), 1189 1179 next_blkaddr_of_node(page)); 1190 - ClearPageUptodate(page); 1191 1180 err = -EINVAL; 1192 1181 out_err: 1182 + ClearPageUptodate(page); 1193 1183 f2fs_put_page(page, 1); 1194 1184 return ERR_PTR(err); 1195 1185 } ··· 1336 1326 } 1337 1327 1338 1328 static int __write_node_page(struct page *page, bool atomic, bool *submitted, 1339 - struct writeback_control *wbc) 1329 + struct writeback_control *wbc, bool do_balance, 1330 + enum iostat_type io_type) 1340 1331 { 1341 1332 struct f2fs_sb_info *sbi = F2FS_P_SB(page); 1342 1333 nid_t nid; ··· 1350 1339 .page = page, 1351 1340 .encrypted_page = NULL, 1352 1341 .submitted = false, 1342 + .io_type = io_type, 1353 1343 }; 1354 1344 1355 1345 trace_f2fs_writepage(page, NODE); ··· 1407 1395 if (submitted) 1408 1396 *submitted = fio.submitted; 1409 1397 1398 + if (do_balance) 1399 + f2fs_balance_fs(sbi, false); 1410 1400 return 0; 1411 1401 1412 1402 redirty_out: ··· 1419 1405 static int f2fs_write_node_page(struct page *page, 1420 1406 struct writeback_control *wbc) 1421 1407 { 1422 - return __write_node_page(page, false, NULL, wbc); 1408 + return __write_node_page(page, false, NULL, wbc, false, FS_NODE_IO); 1423 1409 } 1424 1410 1425 1411 int fsync_node_pages(struct 
f2fs_sb_info *sbi, struct inode *inode, ··· 1507 1493 1508 1494 ret = __write_node_page(page, atomic && 1509 1495 page == last_page, 1510 - &submitted, wbc); 1496 + &submitted, wbc, true, 1497 + FS_NODE_IO); 1511 1498 if (ret) { 1512 1499 unlock_page(page); 1513 1500 f2fs_put_page(last_page, 0); ··· 1545 1530 return ret ? -EIO: 0; 1546 1531 } 1547 1532 1548 - int sync_node_pages(struct f2fs_sb_info *sbi, struct writeback_control *wbc) 1533 + int sync_node_pages(struct f2fs_sb_info *sbi, struct writeback_control *wbc, 1534 + bool do_balance, enum iostat_type io_type) 1549 1535 { 1550 1536 pgoff_t index, end; 1551 1537 struct pagevec pvec; ··· 1624 1608 set_fsync_mark(page, 0); 1625 1609 set_dentry_mark(page, 0); 1626 1610 1627 - ret = __write_node_page(page, false, &submitted, wbc); 1611 + ret = __write_node_page(page, false, &submitted, 1612 + wbc, do_balance, io_type); 1628 1613 if (ret) 1629 1614 unlock_page(page); 1630 1615 else if (submitted) ··· 1714 1697 diff = nr_pages_to_write(sbi, NODE, wbc); 1715 1698 wbc->sync_mode = WB_SYNC_NONE; 1716 1699 blk_start_plug(&plug); 1717 - sync_node_pages(sbi, wbc); 1700 + sync_node_pages(sbi, wbc, true, FS_NODE_IO); 1718 1701 blk_finish_plug(&plug); 1719 1702 wbc->nr_to_write = max((long)0, wbc->nr_to_write - diff); 1720 1703 return 0; ··· 2208 2191 { 2209 2192 struct f2fs_sb_info *sbi = F2FS_I_SB(inode); 2210 2193 nid_t prev_xnid = F2FS_I(inode)->i_xattr_nid; 2211 - nid_t new_xnid = nid_of_node(page); 2194 + nid_t new_xnid; 2195 + struct dnode_of_data dn; 2212 2196 struct node_info ni; 2213 2197 struct page *xpage; 2214 2198 ··· 2225 2207 2226 2208 recover_xnid: 2227 2209 /* 2: update xattr nid in inode */ 2228 - remove_free_nid(sbi, new_xnid); 2229 - f2fs_i_xnid_write(inode, new_xnid); 2230 - if (unlikely(inc_valid_node_count(sbi, inode, false))) 2231 - f2fs_bug_on(sbi, 1); 2210 + if (!alloc_nid(sbi, &new_xnid)) 2211 + return -ENOSPC; 2212 + 2213 + set_new_dnode(&dn, inode, NULL, NULL, new_xnid); 2214 + xpage = new_node_page(&dn, XATTR_NODE_OFFSET); 2215 + if (IS_ERR(xpage)) { 2216 + alloc_nid_failed(sbi, new_xnid); 2217 + return PTR_ERR(xpage); 2218 + } 2219 + 2220 + alloc_nid_done(sbi, new_xnid); 2232 2221 update_inode_page(inode); 2233 2222 2234 2223 /* 3: update and set xattr node page dirty */ 2235 - xpage = grab_cache_page(NODE_MAPPING(sbi), new_xnid); 2236 - if (!xpage) 2237 - return -ENOMEM; 2224 + memcpy(F2FS_NODE(xpage), F2FS_NODE(page), VALID_XATTR_BLOCK_SIZE); 2238 2225 2239 - memcpy(F2FS_NODE(xpage), F2FS_NODE(page), PAGE_SIZE); 2240 - 2241 - get_node_info(sbi, new_xnid, &ni); 2242 - ni.ino = inode->i_ino; 2243 - set_node_addr(sbi, &ni, NEW_ADDR, false); 2244 2226 set_page_dirty(xpage); 2245 2227 f2fs_put_page(xpage, 1); 2246 2228 ··· 2280 2262 dst->i_blocks = cpu_to_le64(1); 2281 2263 dst->i_links = cpu_to_le32(1); 2282 2264 dst->i_xattr_nid = 0; 2283 - dst->i_inline = src->i_inline & F2FS_INLINE_XATTR; 2265 + dst->i_inline = src->i_inline & (F2FS_INLINE_XATTR | F2FS_EXTRA_ATTR); 2266 + if (dst->i_inline & F2FS_EXTRA_ATTR) { 2267 + dst->i_extra_isize = src->i_extra_isize; 2268 + if (f2fs_sb_has_project_quota(sbi->sb) && 2269 + F2FS_FITS_IN_INODE(src, le16_to_cpu(src->i_extra_isize), 2270 + i_projid)) 2271 + dst->i_projid = src->i_projid; 2272 + } 2284 2273 2285 2274 new_ni = old_ni; 2286 2275 new_ni.ino = ino;
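
get_node_path() can now fail with -E2BIG instead of hitting BUG() for an offset beyond the index tree. A standalone sketch of the depth calculation it performs, using the default 4KB-block geometry (923 in-inode pointers, 1018 pointers per node block):

#include <stdio.h>
#include <errno.h>

#define ADDRS_PER_INODE 923     /* data pointers held in the inode itself */
#define ADDRS_PER_BLOCK 1018    /* data pointers per direct node block */
#define NIDS_PER_BLOCK  1018    /* node ids per indirect node block */

static int node_level(long long block)
{
        long long direct = ADDRS_PER_BLOCK;
        long long indirect = direct * NIDS_PER_BLOCK;

        if (block < ADDRS_PER_INODE)
                return 0;                       /* inline pointers */
        block -= ADDRS_PER_INODE;
        if (block < 2 * direct)
                return 1;                       /* two direct node blocks */
        block -= 2 * direct;
        if (block < 2 * indirect)
                return 2;                       /* two indirect trees */
        block -= 2 * indirect;
        if (block < indirect * NIDS_PER_BLOCK)
                return 3;                       /* one double-indirect tree */
        return -E2BIG;                          /* offset too large */
}

int main(void)
{
        printf("level(500) = %d\n", node_level(500));
        printf("level(5000000000) = %d\n", node_level(5000000000LL));
        return 0;
}
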
+69 -14
fs/f2fs/recovery.c
··· 69 69 } 70 70 71 71 static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi, 72 - struct list_head *head, nid_t ino) 72 + struct list_head *head, nid_t ino, bool quota_inode) 73 73 { 74 74 struct inode *inode; 75 75 struct fsync_inode_entry *entry; 76 + int err; 76 77 77 78 inode = f2fs_iget_retry(sbi->sb, ino); 78 79 if (IS_ERR(inode)) 79 80 return ERR_CAST(inode); 81 + 82 + err = dquot_initialize(inode); 83 + if (err) 84 + goto err_out; 85 + 86 + if (quota_inode) { 87 + err = dquot_alloc_inode(inode); 88 + if (err) 89 + goto err_out; 90 + } 80 91 81 92 entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO); 82 93 entry->inode = inode; 83 94 list_add_tail(&entry->list, head); 84 95 85 96 return entry; 97 + err_out: 98 + iput(inode); 99 + return ERR_PTR(err); 86 100 } 87 101 88 102 static void del_fsync_inode(struct fsync_inode_entry *entry) ··· 121 107 122 108 entry = get_fsync_inode(dir_list, pino); 123 109 if (!entry) { 124 - entry = add_fsync_inode(F2FS_I_SB(inode), dir_list, pino); 110 + entry = add_fsync_inode(F2FS_I_SB(inode), dir_list, 111 + pino, false); 125 112 if (IS_ERR(entry)) { 126 113 dir = ERR_CAST(entry); 127 114 err = PTR_ERR(entry); ··· 155 140 err = -EEXIST; 156 141 goto out_unmap_put; 157 142 } 143 + 144 + err = dquot_initialize(einode); 145 + if (err) { 146 + iput(einode); 147 + goto out_unmap_put; 148 + } 149 + 158 150 err = acquire_orphan_inode(F2FS_I_SB(inode)); 159 151 if (err) { 160 152 iput(einode); ··· 248 226 249 227 entry = get_fsync_inode(head, ino_of_node(page)); 250 228 if (!entry) { 229 + bool quota_inode = false; 230 + 251 231 if (!check_only && 252 232 IS_INODE(page) && is_dent_dnode(page)) { 253 233 err = recover_inode_page(sbi, page); 254 234 if (err) 255 235 break; 236 + quota_inode = true; 256 237 } 257 238 258 239 /* 259 240 * CP | dnode(F) | inode(DF) 260 241 * For this case, we should not give up now. 
261 242 */ 262 - entry = add_fsync_inode(sbi, head, ino_of_node(page)); 243 + entry = add_fsync_inode(sbi, head, ino_of_node(page), 244 + quota_inode); 263 245 if (IS_ERR(entry)) { 264 246 err = PTR_ERR(entry); 265 247 if (err == -ENOENT) { ··· 317 291 return 0; 318 292 319 293 /* Get the previous summary */ 320 - for (i = CURSEG_WARM_DATA; i <= CURSEG_COLD_DATA; i++) { 294 + for (i = CURSEG_HOT_DATA; i <= CURSEG_COLD_DATA; i++) { 321 295 struct curseg_info *curseg = CURSEG_I(sbi, i); 322 296 if (curseg->segno == segno) { 323 297 sum = curseg->sum_blk->entries[blkoff]; ··· 354 328 f2fs_put_page(node_page, 1); 355 329 356 330 if (ino != dn->inode->i_ino) { 331 + int ret; 332 + 357 333 /* Deallocate previous index in the node page */ 358 334 inode = f2fs_iget_retry(sbi->sb, ino); 359 335 if (IS_ERR(inode)) 360 336 return PTR_ERR(inode); 337 + 338 + ret = dquot_initialize(inode); 339 + if (ret) { 340 + iput(inode); 341 + return ret; 342 + } 361 343 } else { 362 344 inode = dn->inode; 363 345 } ··· 395 361 return 0; 396 362 397 363 truncate_out: 398 - if (datablock_addr(tdn.node_page, tdn.ofs_in_node) == blkaddr) 364 + if (datablock_addr(tdn.inode, tdn.node_page, 365 + tdn.ofs_in_node) == blkaddr) 399 366 truncate_data_blocks_range(&tdn, 1); 400 367 if (dn->inode->i_ino == nid && !dn->inode_page_locked) 401 368 unlock_page(dn->inode_page); ··· 449 414 for (; start < end; start++, dn.ofs_in_node++) { 450 415 block_t src, dest; 451 416 452 - src = datablock_addr(dn.node_page, dn.ofs_in_node); 453 - dest = datablock_addr(page, dn.ofs_in_node); 417 + src = datablock_addr(dn.inode, dn.node_page, dn.ofs_in_node); 418 + dest = datablock_addr(dn.inode, page, dn.ofs_in_node); 454 419 455 420 /* skip recovering if dest is the same as src */ 456 421 if (src == dest) ··· 592 557 struct list_head dir_list; 593 558 int err; 594 559 int ret = 0; 560 + unsigned long s_flags = sbi->sb->s_flags; 595 561 bool need_writecp = false; 562 + 563 + if (s_flags & MS_RDONLY) { 564 + f2fs_msg(sbi->sb, KERN_INFO, "orphan cleanup on readonly fs"); 565 + sbi->sb->s_flags &= ~MS_RDONLY; 566 + } 567 + 568 + #ifdef CONFIG_QUOTA 569 + /* Needed for iput() to work correctly and not trash data */ 570 + sbi->sb->s_flags |= MS_ACTIVE; 571 + /* Turn on quotas so that they are updated correctly */ 572 + f2fs_enable_quota_files(sbi); 573 + #endif 596 574 597 575 fsync_entry_slab = f2fs_kmem_cache_create("f2fs_fsync_inode_entry", 598 576 sizeof(struct fsync_inode_entry)); 599 - if (!fsync_entry_slab) 600 - return -ENOMEM; 577 + if (!fsync_entry_slab) { 578 + err = -ENOMEM; 579 + goto out; 580 + } 601 581 602 582 INIT_LIST_HEAD(&inode_list); 603 583 INIT_LIST_HEAD(&dir_list); ··· 623 573 /* step #1: find fsynced inode numbers */ 624 574 err = find_fsync_dnodes(sbi, &inode_list, check_only); 625 575 if (err || list_empty(&inode_list)) 626 - goto out; 576 + goto skip; 627 577 628 578 if (check_only) { 629 579 ret = 1; 630 - goto out; 580 + goto skip; 631 581 } 632 582 633 583 need_writecp = true; ··· 636 586 err = recover_data(sbi, &inode_list, &dir_list); 637 587 if (!err) 638 588 f2fs_bug_on(sbi, !list_empty(&inode_list)); 639 - out: 589 + skip: 640 590 destroy_fsync_dnodes(&inode_list); 641 591 642 592 /* truncate meta pages to be used by the recovery */ ··· 649 599 } 650 600 651 601 clear_sbi_flag(sbi, SBI_POR_DOING); 652 - if (err) 653 - set_ckpt_flags(sbi, CP_ERROR_FLAG); 654 602 mutex_unlock(&sbi->cp_mutex); 655 603 656 604 /* let's drop all the directory inodes for clean checkpoint */ ··· 662 614 } 663 615 664 616 
kmem_cache_destroy(fsync_entry_slab);
617 + out:
618 + #ifdef CONFIG_QUOTA
619 + /* Turn quotas off */
620 + f2fs_quota_off_umount(sbi->sb);
621 + #endif
622 + sbi->sb->s_flags = s_flags; /* Restore MS_RDONLY status */
623 +
665 624 return ret ? ret: err;
666 625 }
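
Two superblock-flag tricks make this quota-aware recovery possible on a read-only mount: MS_RDONLY is lifted for the duration, and MS_ACTIVE is set so iput() really evicts inodes rather than parking them. The save/tweak/restore idiom in isolation (the struct is illustrative; the flag values mirror the kernel's):

#include <stdio.h>

#define MS_RDONLY 0x00000001UL
#define MS_ACTIVE 0x40000000UL

struct super { unsigned long s_flags; };

static int recover_fsync_data(struct super *sb)
{
        unsigned long s_flags = sb->s_flags;    /* save */

        sb->s_flags &= ~MS_RDONLY;      /* allow writes during replay */
        sb->s_flags |= MS_ACTIVE;       /* let iput() evict for real */

        /* ... enable quotas, replay fsync'd dnodes, disable quotas ... */

        sb->s_flags = s_flags;          /* restore MS_RDONLY status */
        return 0;
}

int main(void)
{
        struct super sb = { .s_flags = MS_RDONLY };

        recover_fsync_data(&sb);
        printf("flags restored: %#lx\n", sb.s_flags);
        return 0;
}
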
+237 -59
fs/f2fs/segment.c
··· 17 17 #include <linux/swap.h> 18 18 #include <linux/timer.h> 19 19 #include <linux/freezer.h> 20 + #include <linux/sched/signal.h> 20 21 21 22 #include "f2fs.h" 22 23 #include "segment.h" 23 24 #include "node.h" 25 + #include "gc.h" 24 26 #include "trace.h" 25 27 #include <trace/events/f2fs.h> 26 28 ··· 169 167 return result - size + __reverse_ffz(tmp); 170 168 } 171 169 170 + bool need_SSR(struct f2fs_sb_info *sbi) 171 + { 172 + int node_secs = get_blocktype_secs(sbi, F2FS_DIRTY_NODES); 173 + int dent_secs = get_blocktype_secs(sbi, F2FS_DIRTY_DENTS); 174 + int imeta_secs = get_blocktype_secs(sbi, F2FS_DIRTY_IMETA); 175 + 176 + if (test_opt(sbi, LFS)) 177 + return false; 178 + if (sbi->gc_thread && sbi->gc_thread->gc_urgent) 179 + return true; 180 + 181 + return free_sections(sbi) <= (node_secs + 2 * dent_secs + imeta_secs + 182 + 2 * reserved_sections(sbi)); 183 + } 184 + 172 185 void register_inmem_page(struct inode *inode, struct page *page) 173 186 { 174 187 struct f2fs_inode_info *fi = F2FS_I(inode); ··· 230 213 struct node_info ni; 231 214 232 215 trace_f2fs_commit_inmem_page(page, INMEM_REVOKE); 233 - 216 + retry: 234 217 set_new_dnode(&dn, inode, NULL, NULL, 0); 235 - if (get_dnode_of_data(&dn, page->index, LOOKUP_NODE)) { 218 + err = get_dnode_of_data(&dn, page->index, LOOKUP_NODE); 219 + if (err) { 220 + if (err == -ENOMEM) { 221 + congestion_wait(BLK_RW_ASYNC, HZ/50); 222 + cond_resched(); 223 + goto retry; 224 + } 236 225 err = -EAGAIN; 237 226 goto next; 238 227 } ··· 271 248 mutex_unlock(&fi->inmem_lock); 272 249 273 250 clear_inode_flag(inode, FI_ATOMIC_FILE); 251 + clear_inode_flag(inode, FI_HOT_DATA); 274 252 stat_dec_atomic_write(inode); 275 253 } 276 254 ··· 316 292 .type = DATA, 317 293 .op = REQ_OP_WRITE, 318 294 .op_flags = REQ_SYNC | REQ_PRIO, 295 + .io_type = FS_DATA_IO, 319 296 }; 320 297 pgoff_t last_idx = ULONG_MAX; 321 298 int err = 0; ··· 334 309 inode_dec_dirty_pages(inode); 335 310 remove_dirty_inode(inode); 336 311 } 337 - 312 + retry: 338 313 fio.page = page; 339 314 fio.old_blkaddr = NULL_ADDR; 340 315 fio.encrypted_page = NULL; 341 316 fio.need_lock = LOCK_DONE; 342 317 err = do_write_data_page(&fio); 343 318 if (err) { 319 + if (err == -ENOMEM) { 320 + congestion_wait(BLK_RW_ASYNC, HZ/50); 321 + cond_resched(); 322 + goto retry; 323 + } 344 324 unlock_page(page); 345 325 break; 346 326 } 347 - 348 327 /* record old blkaddr for revoking */ 349 328 cur->old_addr = fio.old_blkaddr; 350 329 last_idx = page->index; ··· 510 481 if (kthread_should_stop()) 511 482 return 0; 512 483 484 + sb_start_intwrite(sbi->sb); 485 + 513 486 if (!llist_empty(&fcc->issue_list)) { 514 487 struct flush_cmd *cmd, *next; 515 488 int ret; ··· 529 498 } 530 499 fcc->dispatch_list = NULL; 531 500 } 501 + 502 + sb_end_intwrite(sbi->sb); 532 503 533 504 wait_event_interruptible(*q, 534 505 kthread_should_stop() || !llist_empty(&fcc->issue_list)); ··· 552 519 return ret; 553 520 } 554 521 555 - if (!atomic_read(&fcc->issing_flush)) { 556 - atomic_inc(&fcc->issing_flush); 522 + if (atomic_inc_return(&fcc->issing_flush) == 1) { 557 523 ret = submit_flush_wait(sbi); 558 524 atomic_dec(&fcc->issing_flush); 559 525 ··· 562 530 563 531 init_completion(&cmd.wait); 564 532 565 - atomic_inc(&fcc->issing_flush); 566 533 llist_add(&cmd.llnode, &fcc->issue_list); 567 534 568 - if (!fcc->dispatch_list) 535 + /* update issue_list before we wake up issue_flush thread */ 536 + smp_mb(); 537 + 538 + if (waitqueue_active(&fcc->flush_wait_queue)) 569 539 wake_up(&fcc->flush_wait_queue); 570 540 571 
541 if (fcc->f2fs_issue_flush) { 572 542 wait_for_completion(&cmd.wait); 573 543 atomic_dec(&fcc->issing_flush); 574 544 } else { 575 - llist_del_all(&fcc->issue_list); 576 - atomic_set(&fcc->issing_flush, 0); 545 + struct llist_node *list; 546 + 547 + list = llist_del_all(&fcc->issue_list); 548 + if (!list) { 549 + wait_for_completion(&cmd.wait); 550 + atomic_dec(&fcc->issing_flush); 551 + } else { 552 + struct flush_cmd *tmp, *next; 553 + 554 + ret = submit_flush_wait(sbi); 555 + 556 + llist_for_each_entry_safe(tmp, next, list, llnode) { 557 + if (tmp == &cmd) { 558 + cmd.ret = ret; 559 + atomic_dec(&fcc->issing_flush); 560 + continue; 561 + } 562 + tmp->ret = ret; 563 + complete(&tmp->wait); 564 + } 565 + } 577 566 } 578 567 579 568 return cmd.ret; ··· 831 778 sentry = get_seg_entry(sbi, segno); 832 779 offset = GET_BLKOFF_FROM_SEG0(sbi, blk); 833 780 834 - size = min((unsigned long)(end - blk), max_blocks); 781 + if (end < START_BLOCK(sbi, segno + 1)) 782 + size = GET_BLKOFF_FROM_SEG0(sbi, end); 783 + else 784 + size = max_blocks; 835 785 map = (unsigned long *)(sentry->cur_valid_map); 836 786 offset = __find_rev_next_bit(map, size, offset); 837 787 f2fs_bug_on(sbi, offset != size); 838 - blk += size; 788 + blk = START_BLOCK(sbi, segno + 1); 839 789 } 840 790 #endif 841 791 } ··· 871 815 submit_bio(bio); 872 816 list_move_tail(&dc->list, &dcc->wait_list); 873 817 __check_sit_bitmap(sbi, dc->start, dc->start + dc->len); 818 + 819 + f2fs_update_iostat(sbi, FS_DISCARD, 1); 874 820 } 875 821 } else { 876 822 __remove_discard_cmd(sbi, dc); ··· 1054 996 return 0; 1055 997 } 1056 998 1057 - static void __issue_discard_cmd(struct f2fs_sb_info *sbi, bool issue_cond) 999 + static int __issue_discard_cmd(struct f2fs_sb_info *sbi, bool issue_cond) 1058 1000 { 1059 1001 struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info; 1060 1002 struct list_head *pend_list; 1061 1003 struct discard_cmd *dc, *tmp; 1062 1004 struct blk_plug plug; 1063 - int i, iter = 0; 1005 + int iter = 0, issued = 0; 1006 + int i; 1007 + bool io_interrupted = false; 1064 1008 1065 1009 mutex_lock(&dcc->cmd_lock); 1066 1010 f2fs_bug_on(sbi, 1067 1011 !__check_rb_tree_consistence(sbi, &dcc->root)); 1068 1012 blk_start_plug(&plug); 1069 - for (i = MAX_PLIST_NUM - 1; i >= 0; i--) { 1013 + for (i = MAX_PLIST_NUM - 1; 1014 + i >= 0 && plist_issue(dcc->pend_list_tag[i]); i--) { 1070 1015 pend_list = &dcc->pend_list[i]; 1071 1016 list_for_each_entry_safe(dc, tmp, pend_list, list) { 1072 1017 f2fs_bug_on(sbi, dc->state != D_PREP); 1073 1018 1074 - if (!issue_cond || is_idle(sbi)) 1019 + /* Hurry up to finish fstrim */ 1020 + if (dcc->pend_list_tag[i] & P_TRIM) { 1075 1021 __submit_discard_cmd(sbi, dc); 1076 - if (issue_cond && iter++ > DISCARD_ISSUE_RATE) 1022 + issued++; 1023 + 1024 + if (fatal_signal_pending(current)) 1025 + break; 1026 + continue; 1027 + } 1028 + 1029 + if (!issue_cond) { 1030 + __submit_discard_cmd(sbi, dc); 1031 + issued++; 1032 + continue; 1033 + } 1034 + 1035 + if (is_idle(sbi)) { 1036 + __submit_discard_cmd(sbi, dc); 1037 + issued++; 1038 + } else { 1039 + io_interrupted = true; 1040 + } 1041 + 1042 + if (++iter >= DISCARD_ISSUE_RATE) 1077 1043 goto out; 1078 1044 } 1045 + if (list_empty(pend_list) && dcc->pend_list_tag[i] & P_TRIM) 1046 + dcc->pend_list_tag[i] &= (~P_TRIM); 1079 1047 } 1080 1048 out: 1081 1049 blk_finish_plug(&plug); 1050 + mutex_unlock(&dcc->cmd_lock); 1051 + 1052 + if (!issued && io_interrupted) 1053 + issued = -1; 1054 + 1055 + return issued; 1056 + } 1057 + 1058 + static void 
__drop_discard_cmd(struct f2fs_sb_info *sbi) 1059 + { 1060 + struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info; 1061 + struct list_head *pend_list; 1062 + struct discard_cmd *dc, *tmp; 1063 + int i; 1064 + 1065 + mutex_lock(&dcc->cmd_lock); 1066 + for (i = MAX_PLIST_NUM - 1; i >= 0; i--) { 1067 + pend_list = &dcc->pend_list[i]; 1068 + list_for_each_entry_safe(dc, tmp, pend_list, list) { 1069 + f2fs_bug_on(sbi, dc->state != D_PREP); 1070 + __remove_discard_cmd(sbi, dc); 1071 + } 1072 + } 1082 1073 mutex_unlock(&dcc->cmd_lock); 1083 1074 } 1084 1075 ··· 1209 1102 } 1210 1103 } 1211 1104 1212 - /* This comes from f2fs_put_super */ 1105 + /* This comes from f2fs_put_super and f2fs_trim_fs */ 1213 1106 void f2fs_wait_discard_bios(struct f2fs_sb_info *sbi) 1214 1107 { 1215 1108 __issue_discard_cmd(sbi, false); 1109 + __drop_discard_cmd(sbi); 1216 1110 __wait_discard_cmd(sbi, false); 1111 + } 1112 + 1113 + static void mark_discard_range_all(struct f2fs_sb_info *sbi) 1114 + { 1115 + struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info; 1116 + int i; 1117 + 1118 + mutex_lock(&dcc->cmd_lock); 1119 + for (i = 0; i < MAX_PLIST_NUM; i++) 1120 + dcc->pend_list_tag[i] |= P_TRIM; 1121 + mutex_unlock(&dcc->cmd_lock); 1217 1122 } 1218 1123 1219 1124 static int issue_discard_thread(void *data) ··· 1233 1114 struct f2fs_sb_info *sbi = data; 1234 1115 struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info; 1235 1116 wait_queue_head_t *q = &dcc->discard_wait_queue; 1117 + unsigned int wait_ms = DEF_MIN_DISCARD_ISSUE_TIME; 1118 + int issued; 1236 1119 1237 1120 set_freezable(); 1238 1121 1239 1122 do { 1240 - wait_event_interruptible(*q, kthread_should_stop() || 1241 - freezing(current) || 1242 - atomic_read(&dcc->discard_cmd_cnt)); 1123 + wait_event_interruptible_timeout(*q, 1124 + kthread_should_stop() || freezing(current) || 1125 + dcc->discard_wake, 1126 + msecs_to_jiffies(wait_ms)); 1243 1127 if (try_to_freeze()) 1244 1128 continue; 1245 1129 if (kthread_should_stop()) 1246 1130 return 0; 1247 1131 1248 - __issue_discard_cmd(sbi, true); 1249 - __wait_discard_cmd(sbi, true); 1132 + if (dcc->discard_wake) { 1133 + dcc->discard_wake = 0; 1134 + if (sbi->gc_thread && sbi->gc_thread->gc_urgent) 1135 + mark_discard_range_all(sbi); 1136 + } 1250 1137 1251 - congestion_wait(BLK_RW_SYNC, HZ/50); 1138 + sb_start_intwrite(sbi->sb); 1139 + 1140 + issued = __issue_discard_cmd(sbi, true); 1141 + if (issued) { 1142 + __wait_discard_cmd(sbi, true); 1143 + wait_ms = DEF_MIN_DISCARD_ISSUE_TIME; 1144 + } else { 1145 + wait_ms = DEF_MAX_DISCARD_ISSUE_TIME; 1146 + } 1147 + 1148 + sb_end_intwrite(sbi->sb); 1149 + 1252 1150 } while (!kthread_should_stop()); 1253 1151 return 0; 1254 1152 } ··· 1456 1320 1457 1321 void clear_prefree_segments(struct f2fs_sb_info *sbi, struct cp_control *cpc) 1458 1322 { 1459 - struct list_head *head = &(SM_I(sbi)->dcc_info->entry_list); 1323 + struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info; 1324 + struct list_head *head = &dcc->entry_list; 1460 1325 struct discard_entry *entry, *this; 1461 1326 struct dirty_seglist_info *dirty_i = DIRTY_I(sbi); 1462 1327 unsigned long *prefree_map = dirty_i->dirty_segmap[PRE]; ··· 1539 1402 goto find_next; 1540 1403 1541 1404 list_del(&entry->list); 1542 - SM_I(sbi)->dcc_info->nr_discards -= total_len; 1405 + dcc->nr_discards -= total_len; 1543 1406 kmem_cache_free(discard_entry_slab, entry); 1544 1407 } 1545 1408 1546 - wake_up(&SM_I(sbi)->dcc_info->discard_wait_queue); 1409 + wake_up_discard_thread(sbi, false); 1547 1410 } 1548 1411 1549 1412 static int 
create_discard_cmd_control(struct f2fs_sb_info *sbi) ··· 1561 1424 if (!dcc) 1562 1425 return -ENOMEM; 1563 1426 1427 + dcc->discard_granularity = DEFAULT_DISCARD_GRANULARITY; 1564 1428 INIT_LIST_HEAD(&dcc->entry_list); 1565 - for (i = 0; i < MAX_PLIST_NUM; i++) 1429 + for (i = 0; i < MAX_PLIST_NUM; i++) { 1566 1430 INIT_LIST_HEAD(&dcc->pend_list[i]); 1431 + if (i >= dcc->discard_granularity - 1) 1432 + dcc->pend_list_tag[i] |= P_ACTIVE; 1433 + } 1567 1434 INIT_LIST_HEAD(&dcc->wait_list); 1568 1435 mutex_init(&dcc->cmd_lock); 1569 1436 atomic_set(&dcc->issued_discard, 0); ··· 1632 1491 struct seg_entry *se; 1633 1492 unsigned int segno, offset; 1634 1493 long int new_vblocks; 1494 + bool exist; 1495 + #ifdef CONFIG_F2FS_CHECK_FS 1496 + bool mir_exist; 1497 + #endif 1635 1498 1636 1499 segno = GET_SEGNO(sbi, blkaddr); 1637 1500 ··· 1652 1507 1653 1508 /* Update valid block bitmap */ 1654 1509 if (del > 0) { 1655 - if (f2fs_test_and_set_bit(offset, se->cur_valid_map)) { 1510 + exist = f2fs_test_and_set_bit(offset, se->cur_valid_map); 1656 1511 #ifdef CONFIG_F2FS_CHECK_FS 1657 - if (f2fs_test_and_set_bit(offset, 1658 - se->cur_valid_map_mir)) 1659 - f2fs_bug_on(sbi, 1); 1660 - else 1661 - WARN_ON(1); 1662 - #else 1512 + mir_exist = f2fs_test_and_set_bit(offset, 1513 + se->cur_valid_map_mir); 1514 + if (unlikely(exist != mir_exist)) { 1515 + f2fs_msg(sbi->sb, KERN_ERR, "Inconsistent error " 1516 + "when setting bitmap, blk:%u, old bit:%d", 1517 + blkaddr, exist); 1663 1518 f2fs_bug_on(sbi, 1); 1664 - #endif 1665 1519 } 1520 + #endif 1521 + if (unlikely(exist)) { 1522 + f2fs_msg(sbi->sb, KERN_ERR, 1523 + "Bitmap was wrongly set, blk:%u", blkaddr); 1524 + f2fs_bug_on(sbi, 1); 1525 + se->valid_blocks--; 1526 + del = 0; 1527 + } 1528 + 1666 1529 if (f2fs_discard_en(sbi) && 1667 1530 !f2fs_test_and_set_bit(offset, se->discard_map)) 1668 1531 sbi->discard_blks--; ··· 1681 1528 se->ckpt_valid_blocks++; 1682 1529 } 1683 1530 } else { 1684 - if (!f2fs_test_and_clear_bit(offset, se->cur_valid_map)) { 1531 + exist = f2fs_test_and_clear_bit(offset, se->cur_valid_map); 1685 1532 #ifdef CONFIG_F2FS_CHECK_FS 1686 - if (!f2fs_test_and_clear_bit(offset, 1687 - se->cur_valid_map_mir)) 1688 - f2fs_bug_on(sbi, 1); 1689 - else 1690 - WARN_ON(1); 1691 - #else 1533 + mir_exist = f2fs_test_and_clear_bit(offset, 1534 + se->cur_valid_map_mir); 1535 + if (unlikely(exist != mir_exist)) { 1536 + f2fs_msg(sbi->sb, KERN_ERR, "Inconsistent error " 1537 + "when clearing bitmap, blk:%u, old bit:%d", 1538 + blkaddr, exist); 1692 1539 f2fs_bug_on(sbi, 1); 1693 - #endif 1694 1540 } 1541 + #endif 1542 + if (unlikely(!exist)) { 1543 + f2fs_msg(sbi->sb, KERN_ERR, 1544 + "Bitmap was wrongly cleared, blk:%u", blkaddr); 1545 + f2fs_bug_on(sbi, 1); 1546 + se->valid_blocks++; 1547 + del = 0; 1548 + } 1549 + 1695 1550 if (f2fs_discard_en(sbi) && 1696 1551 f2fs_test_and_clear_bit(offset, se->discard_map)) 1697 1552 sbi->discard_blks++; ··· 2061 1900 * This function always allocates a used segment(from dirty seglist) by SSR 2062 1901 * manner, so it should recover the existing segment information of valid blocks 2063 1902 */ 2064 - static void change_curseg(struct f2fs_sb_info *sbi, int type, bool reuse) 1903 + static void change_curseg(struct f2fs_sb_info *sbi, int type) 2065 1904 { 2066 1905 struct dirty_seglist_info *dirty_i = DIRTY_I(sbi); 2067 1906 struct curseg_info *curseg = CURSEG_I(sbi, type); ··· 2082 1921 curseg->alloc_type = SSR; 2083 1922 __next_free_blkoff(sbi, curseg, 0); 2084 1923 2085 - if (reuse) { 2086 - sum_page = 
get_sum_page(sbi, new_segno); 2087 - sum_node = (struct f2fs_summary_block *)page_address(sum_page); 2088 - memcpy(curseg->sum_blk, sum_node, SUM_ENTRY_SIZE); 2089 - f2fs_put_page(sum_page, 1); 2090 - } 1924 + sum_page = get_sum_page(sbi, new_segno); 1925 + sum_node = (struct f2fs_summary_block *)page_address(sum_page); 1926 + memcpy(curseg->sum_blk, sum_node, SUM_ENTRY_SIZE); 1927 + f2fs_put_page(sum_page, 1); 2091 1928 } 2092 1929 2093 1930 static int get_ssr_segment(struct f2fs_sb_info *sbi, int type) ··· 2149 1990 else if (curseg->alloc_type == LFS && is_next_segment_free(sbi, type)) 2150 1991 new_curseg(sbi, type, false); 2151 1992 else if (need_SSR(sbi) && get_ssr_segment(sbi, type)) 2152 - change_curseg(sbi, type, true); 1993 + change_curseg(sbi, type); 2153 1994 else 2154 1995 new_curseg(sbi, type, false); 2155 1996 ··· 2242 2083 2243 2084 schedule(); 2244 2085 } 2086 + /* It's time to issue all the filed discards */ 2087 + mark_discard_range_all(sbi); 2088 + f2fs_wait_discard_bios(sbi); 2245 2089 out: 2246 2090 range->len = F2FS_BLK_TO_BYTES(cpc.trimmed); 2247 2091 return err; ··· 2364 2202 2365 2203 mutex_unlock(&sit_i->sentry_lock); 2366 2204 2367 - if (page && IS_NODESEG(type)) 2205 + if (page && IS_NODESEG(type)) { 2368 2206 fill_node_footer_blkaddr(page, NEXT_FREE_BLKADDR(sbi, curseg)); 2207 + 2208 + f2fs_inode_chksum_set(sbi, page); 2209 + } 2369 2210 2370 2211 if (add_list) { 2371 2212 struct f2fs_bio_info *io; ··· 2401 2236 } 2402 2237 } 2403 2238 2404 - void write_meta_page(struct f2fs_sb_info *sbi, struct page *page) 2239 + void write_meta_page(struct f2fs_sb_info *sbi, struct page *page, 2240 + enum iostat_type io_type) 2405 2241 { 2406 2242 struct f2fs_io_info fio = { 2407 2243 .sbi = sbi, ··· 2421 2255 2422 2256 set_page_writeback(page); 2423 2257 f2fs_submit_page_write(&fio); 2258 + 2259 + f2fs_update_iostat(sbi, io_type, F2FS_BLKSIZE); 2424 2260 } 2425 2261 2426 2262 void write_node_page(unsigned int nid, struct f2fs_io_info *fio) ··· 2431 2263 2432 2264 set_summary(&sum, nid, 0, 0); 2433 2265 do_write_page(&sum, fio); 2266 + 2267 + f2fs_update_iostat(fio->sbi, fio->io_type, F2FS_BLKSIZE); 2434 2268 } 2435 2269 2436 2270 void write_data_page(struct dnode_of_data *dn, struct f2fs_io_info *fio) ··· 2446 2276 set_summary(&sum, dn->nid, dn->ofs_in_node, ni.version); 2447 2277 do_write_page(&sum, fio); 2448 2278 f2fs_update_data_blkaddr(dn, fio->new_blkaddr); 2279 + 2280 + f2fs_update_iostat(sbi, fio->io_type, F2FS_BLKSIZE); 2449 2281 } 2450 2282 2451 2283 int rewrite_data_page(struct f2fs_io_info *fio) 2452 2284 { 2285 + int err; 2286 + 2453 2287 fio->new_blkaddr = fio->old_blkaddr; 2454 2288 stat_inc_inplace_blocks(fio->sbi); 2455 - return f2fs_submit_page_bio(fio); 2289 + 2290 + err = f2fs_submit_page_bio(fio); 2291 + 2292 + f2fs_update_iostat(fio->sbi, fio->io_type, F2FS_BLKSIZE); 2293 + 2294 + return err; 2456 2295 } 2457 2296 2458 2297 void __f2fs_replace_block(struct f2fs_sb_info *sbi, struct f2fs_summary *sum, ··· 2503 2324 /* change the current segment */ 2504 2325 if (segno != curseg->segno) { 2505 2326 curseg->next_segno = segno; 2506 - change_curseg(sbi, type, true); 2327 + change_curseg(sbi, type); 2507 2328 } 2508 2329 2509 2330 curseg->next_blkoff = GET_BLKOFF_FROM_SEG0(sbi, new_blkaddr); ··· 2522 2343 if (recover_curseg) { 2523 2344 if (old_cursegno != curseg->segno) { 2524 2345 curseg->next_segno = old_cursegno; 2525 - change_curseg(sbi, type, true); 2346 + change_curseg(sbi, type); 2526 2347 } 2527 2348 curseg->next_blkoff = old_blkoff; 2528 2349 } ··· 
2561 2382 } 2562 2383 } 2563 2384 2564 - void f2fs_wait_on_encrypted_page_writeback(struct f2fs_sb_info *sbi, 2565 - block_t blkaddr) 2385 + void f2fs_wait_on_block_writeback(struct f2fs_sb_info *sbi, block_t blkaddr) 2566 2386 { 2567 2387 struct page *cpage; 2568 2388
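The discard thread above no longer blocks indefinitely on its wait queue: it wakes on a timeout (or an explicit discard_wake), stays on the short interval while __issue_discard_cmd() keeps finding work, and backs off to the long interval once the queues run dry. A minimal userspace sketch of that adaptive back-off loop; the millisecond values and issue_pending() are illustrative stand-ins, not the kernel's constants or API:

#include <stdio.h>
#include <time.h>

#define MIN_ISSUE_MS	50	/* stand-ins for the kernel's issue timers */
#define MAX_ISSUE_MS	500

/* pretend to drain a pending queue; returns how many commands were issued */
static int issue_pending(void)
{
	static int backlog = 3;	/* fake work that dries up */

	return backlog > 0 ? backlog-- : 0;
}

static void sleep_ms(unsigned int ms)
{
	struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };

	nanosleep(&ts, NULL);
}

int main(void)
{
	unsigned int wait_ms = MIN_ISSUE_MS;
	int i;

	for (i = 0; i < 6; i++) {
		int issued = issue_pending();

		/* stay aggressive while work was found, back off when idle */
		wait_ms = issued ? MIN_ISSUE_MS : MAX_ISSUE_MS;
		printf("issued=%d, next wait=%ums\n", issued, wait_ms);
		sleep_ms(wait_ms);
	}
	return 0;
}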
+29 -18
fs/f2fs/segment.h
··· 492 492 return SM_I(sbi)->ovp_segments; 493 493 } 494 494 495 - static inline int overprovision_sections(struct f2fs_sb_info *sbi) 496 - { 497 - return GET_SEC_FROM_SEG(sbi, (unsigned int)overprovision_segments(sbi)); 498 - } 499 - 500 495 static inline int reserved_sections(struct f2fs_sb_info *sbi) 501 496 { 502 497 return GET_SEC_FROM_SEG(sbi, (unsigned int)reserved_segments(sbi)); 503 - } 504 - 505 - static inline bool need_SSR(struct f2fs_sb_info *sbi) 506 - { 507 - int node_secs = get_blocktype_secs(sbi, F2FS_DIRTY_NODES); 508 - int dent_secs = get_blocktype_secs(sbi, F2FS_DIRTY_DENTS); 509 - int imeta_secs = get_blocktype_secs(sbi, F2FS_DIRTY_IMETA); 510 - 511 - if (test_opt(sbi, LFS)) 512 - return false; 513 - 514 - return free_sections(sbi) <= (node_secs + 2 * dent_secs + imeta_secs + 515 - 2 * reserved_sections(sbi)); 516 498 } 517 499 518 500 static inline bool has_not_enough_free_secs(struct f2fs_sb_info *sbi, ··· 558 576 559 577 if (test_opt(sbi, LFS)) 560 578 return false; 579 + 580 + /* if this is cold file, we should overwrite to avoid fragmentation */ 581 + if (file_is_cold(inode)) 582 + return true; 561 583 562 584 if (policy & (0x1 << F2FS_IPU_FORCE)) 563 585 return true; ··· 784 798 785 799 wbc->nr_to_write = desired; 786 800 return desired - nr_to_write; 801 + } 802 + 803 + static inline void wake_up_discard_thread(struct f2fs_sb_info *sbi, bool force) 804 + { 805 + struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info; 806 + bool wakeup = false; 807 + int i; 808 + 809 + if (force) 810 + goto wake_up; 811 + 812 + mutex_lock(&dcc->cmd_lock); 813 + for (i = MAX_PLIST_NUM - 1; 814 + i >= 0 && plist_issue(dcc->pend_list_tag[i]); i--) { 815 + if (!list_empty(&dcc->pend_list[i])) { 816 + wakeup = true; 817 + break; 818 + } 819 + } 820 + mutex_unlock(&dcc->cmd_lock); 821 + if (!wakeup) 822 + return; 823 + wake_up: 824 + dcc->discard_wake = 1; 825 + wake_up_interruptible_all(&dcc->discard_wait_queue); 787 826 }
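wake_up_discard_thread() above inspects the pending lists under cmd_lock, scanning from the largest-granularity list downward, and only sets discard_wake and signals the wait queue when it finds issuable work (force skips the scan). The same check-under-lock-then-signal shape in portable C, with the lists collapsed to a single counter for illustration:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t cmd_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t discard_cv = PTHREAD_COND_INITIALIZER;
static int pending = 1;		/* stand-in for non-empty pend_list[] */
static int discard_wake;	/* mirrors dcc->discard_wake */

static void wake_up_discard(bool force)
{
	pthread_mutex_lock(&cmd_lock);
	/* only signal when forced or when there is issuable work */
	if (force || pending > 0) {
		discard_wake = 1;
		pthread_cond_broadcast(&discard_cv);
	}
	pthread_mutex_unlock(&cmd_lock);
}

static void *discard_thread(void *arg)
{
	pthread_mutex_lock(&cmd_lock);
	while (!discard_wake)
		pthread_cond_wait(&discard_cv, &cmd_lock);
	discard_wake = 0;
	pthread_mutex_unlock(&cmd_lock);
	printf("woken: issuing pending discards\n");
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, discard_thread, NULL);
	wake_up_discard(false);	/* wakes only because pending > 0 */
	pthread_join(t, NULL);
	return 0;
}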
+401 -32
fs/f2fs/super.c
··· 25 25 #include <linux/quotaops.h> 26 26 #include <linux/f2fs_fs.h> 27 27 #include <linux/sysfs.h> 28 + #include <linux/quota.h> 28 29 29 30 #include "f2fs.h" 30 31 #include "node.h" ··· 108 107 Opt_fault_injection, 109 108 Opt_lazytime, 110 109 Opt_nolazytime, 110 + Opt_quota, 111 + Opt_noquota, 111 112 Opt_usrquota, 112 113 Opt_grpquota, 114 + Opt_prjquota, 115 + Opt_usrjquota, 116 + Opt_grpjquota, 117 + Opt_prjjquota, 118 + Opt_offusrjquota, 119 + Opt_offgrpjquota, 120 + Opt_offprjjquota, 121 + Opt_jqfmt_vfsold, 122 + Opt_jqfmt_vfsv0, 123 + Opt_jqfmt_vfsv1, 113 124 Opt_err, 114 125 }; 115 126 ··· 157 144 {Opt_fault_injection, "fault_injection=%u"}, 158 145 {Opt_lazytime, "lazytime"}, 159 146 {Opt_nolazytime, "nolazytime"}, 147 + {Opt_quota, "quota"}, 148 + {Opt_noquota, "noquota"}, 160 149 {Opt_usrquota, "usrquota"}, 161 150 {Opt_grpquota, "grpquota"}, 151 + {Opt_prjquota, "prjquota"}, 152 + {Opt_usrjquota, "usrjquota=%s"}, 153 + {Opt_grpjquota, "grpjquota=%s"}, 154 + {Opt_prjjquota, "prjjquota=%s"}, 155 + {Opt_offusrjquota, "usrjquota="}, 156 + {Opt_offgrpjquota, "grpjquota="}, 157 + {Opt_offprjjquota, "prjjquota="}, 158 + {Opt_jqfmt_vfsold, "jqfmt=vfsold"}, 159 + {Opt_jqfmt_vfsv0, "jqfmt=vfsv0"}, 160 + {Opt_jqfmt_vfsv1, "jqfmt=vfsv1"}, 162 161 {Opt_err, NULL}, 163 162 }; 164 163 ··· 182 157 va_start(args, fmt); 183 158 vaf.fmt = fmt; 184 159 vaf.va = &args; 185 - printk("%sF2FS-fs (%s): %pV\n", level, sb->s_id, &vaf); 160 + printk_ratelimited("%sF2FS-fs (%s): %pV\n", level, sb->s_id, &vaf); 186 161 va_end(args); 187 162 } 188 163 ··· 193 168 inode_init_once(&fi->vfs_inode); 194 169 } 195 170 171 + #ifdef CONFIG_QUOTA 172 + static const char * const quotatypes[] = INITQFNAMES; 173 + #define QTYPE2NAME(t) (quotatypes[t]) 174 + static int f2fs_set_qf_name(struct super_block *sb, int qtype, 175 + substring_t *args) 176 + { 177 + struct f2fs_sb_info *sbi = F2FS_SB(sb); 178 + char *qname; 179 + int ret = -EINVAL; 180 + 181 + if (sb_any_quota_loaded(sb) && !sbi->s_qf_names[qtype]) { 182 + f2fs_msg(sb, KERN_ERR, 183 + "Cannot change journaled " 184 + "quota options when quota turned on"); 185 + return -EINVAL; 186 + } 187 + qname = match_strdup(args); 188 + if (!qname) { 189 + f2fs_msg(sb, KERN_ERR, 190 + "Not enough memory for storing quotafile name"); 191 + return -EINVAL; 192 + } 193 + if (sbi->s_qf_names[qtype]) { 194 + if (strcmp(sbi->s_qf_names[qtype], qname) == 0) 195 + ret = 0; 196 + else 197 + f2fs_msg(sb, KERN_ERR, 198 + "%s quota file already specified", 199 + QTYPE2NAME(qtype)); 200 + goto errout; 201 + } 202 + if (strchr(qname, '/')) { 203 + f2fs_msg(sb, KERN_ERR, 204 + "quotafile must be on filesystem root"); 205 + goto errout; 206 + } 207 + sbi->s_qf_names[qtype] = qname; 208 + set_opt(sbi, QUOTA); 209 + return 0; 210 + errout: 211 + kfree(qname); 212 + return ret; 213 + } 214 + 215 + static int f2fs_clear_qf_name(struct super_block *sb, int qtype) 216 + { 217 + struct f2fs_sb_info *sbi = F2FS_SB(sb); 218 + 219 + if (sb_any_quota_loaded(sb) && sbi->s_qf_names[qtype]) { 220 + f2fs_msg(sb, KERN_ERR, "Cannot change journaled quota options" 221 + " when quota turned on"); 222 + return -EINVAL; 223 + } 224 + kfree(sbi->s_qf_names[qtype]); 225 + sbi->s_qf_names[qtype] = NULL; 226 + return 0; 227 + } 228 + 229 + static int f2fs_check_quota_options(struct f2fs_sb_info *sbi) 230 + { 231 + /* 232 + * We do the test below only for project quotas. 'usrquota' and 233 + * 'grpquota' mount options are allowed even without quota feature 234 + * to support legacy quotas in quota files. 
235 + */ 236 + if (test_opt(sbi, PRJQUOTA) && !f2fs_sb_has_project_quota(sbi->sb)) { 237 + f2fs_msg(sbi->sb, KERN_ERR, "Project quota feature not enabled. " 238 + "Cannot enable project quota enforcement."); 239 + return -1; 240 + } 241 + if (sbi->s_qf_names[USRQUOTA] || sbi->s_qf_names[GRPQUOTA] || 242 + sbi->s_qf_names[PRJQUOTA]) { 243 + if (test_opt(sbi, USRQUOTA) && sbi->s_qf_names[USRQUOTA]) 244 + clear_opt(sbi, USRQUOTA); 245 + 246 + if (test_opt(sbi, GRPQUOTA) && sbi->s_qf_names[GRPQUOTA]) 247 + clear_opt(sbi, GRPQUOTA); 248 + 249 + if (test_opt(sbi, PRJQUOTA) && sbi->s_qf_names[PRJQUOTA]) 250 + clear_opt(sbi, PRJQUOTA); 251 + 252 + if (test_opt(sbi, GRPQUOTA) || test_opt(sbi, USRQUOTA) || 253 + test_opt(sbi, PRJQUOTA)) { 254 + f2fs_msg(sbi->sb, KERN_ERR, "old and new quota " 255 + "format mixing"); 256 + return -1; 257 + } 258 + 259 + if (!sbi->s_jquota_fmt) { 260 + f2fs_msg(sbi->sb, KERN_ERR, "journaled quota format " 261 + "not specified"); 262 + return -1; 263 + } 264 + } 265 + return 0; 266 + } 267 + #endif 268 + 196 269 static int parse_options(struct super_block *sb, char *options) 197 270 { 198 271 struct f2fs_sb_info *sbi = F2FS_SB(sb); ··· 298 175 substring_t args[MAX_OPT_ARGS]; 299 176 char *p, *name; 300 177 int arg = 0; 178 + #ifdef CONFIG_QUOTA 179 + int ret; 180 + #endif 301 181 302 182 if (!options) 303 183 return 0; ··· 512 386 sb->s_flags &= ~MS_LAZYTIME; 513 387 break; 514 388 #ifdef CONFIG_QUOTA 389 + case Opt_quota: 515 390 case Opt_usrquota: 516 391 set_opt(sbi, USRQUOTA); 517 392 break; 518 393 case Opt_grpquota: 519 394 set_opt(sbi, GRPQUOTA); 520 395 break; 396 + case Opt_prjquota: 397 + set_opt(sbi, PRJQUOTA); 398 + break; 399 + case Opt_usrjquota: 400 + ret = f2fs_set_qf_name(sb, USRQUOTA, &args[0]); 401 + if (ret) 402 + return ret; 403 + break; 404 + case Opt_grpjquota: 405 + ret = f2fs_set_qf_name(sb, GRPQUOTA, &args[0]); 406 + if (ret) 407 + return ret; 408 + break; 409 + case Opt_prjjquota: 410 + ret = f2fs_set_qf_name(sb, PRJQUOTA, &args[0]); 411 + if (ret) 412 + return ret; 413 + break; 414 + case Opt_offusrjquota: 415 + ret = f2fs_clear_qf_name(sb, USRQUOTA); 416 + if (ret) 417 + return ret; 418 + break; 419 + case Opt_offgrpjquota: 420 + ret = f2fs_clear_qf_name(sb, GRPQUOTA); 421 + if (ret) 422 + return ret; 423 + break; 424 + case Opt_offprjjquota: 425 + ret = f2fs_clear_qf_name(sb, PRJQUOTA); 426 + if (ret) 427 + return ret; 428 + break; 429 + case Opt_jqfmt_vfsold: 430 + sbi->s_jquota_fmt = QFMT_VFS_OLD; 431 + break; 432 + case Opt_jqfmt_vfsv0: 433 + sbi->s_jquota_fmt = QFMT_VFS_V0; 434 + break; 435 + case Opt_jqfmt_vfsv1: 436 + sbi->s_jquota_fmt = QFMT_VFS_V1; 437 + break; 438 + case Opt_noquota: 439 + clear_opt(sbi, QUOTA); 440 + clear_opt(sbi, USRQUOTA); 441 + clear_opt(sbi, GRPQUOTA); 442 + clear_opt(sbi, PRJQUOTA); 443 + break; 521 444 #else 445 + case Opt_quota: 522 446 case Opt_usrquota: 523 447 case Opt_grpquota: 448 + case Opt_prjquota: 449 + case Opt_usrjquota: 450 + case Opt_grpjquota: 451 + case Opt_prjjquota: 452 + case Opt_offusrjquota: 453 + case Opt_offgrpjquota: 454 + case Opt_offprjjquota: 455 + case Opt_jqfmt_vfsold: 456 + case Opt_jqfmt_vfsv0: 457 + case Opt_jqfmt_vfsv1: 458 + case Opt_noquota: 524 459 f2fs_msg(sb, KERN_INFO, 525 460 "quota operations not supported"); 526 461 break; ··· 593 406 return -EINVAL; 594 407 } 595 408 } 409 + #ifdef CONFIG_QUOTA 410 + if (f2fs_check_quota_options(sbi)) 411 + return -EINVAL; 412 + #endif 596 413 597 414 if (F2FS_IO_SIZE_BITS(sbi) && !test_opt(sbi, LFS)) { 598 415 f2fs_msg(sb, 
KERN_ERR, ··· 630 439 init_rwsem(&fi->dio_rwsem[READ]); 631 440 init_rwsem(&fi->dio_rwsem[WRITE]); 632 441 init_rwsem(&fi->i_mmap_sem); 442 + init_rwsem(&fi->i_xattr_sem); 633 443 634 444 #ifdef CONFIG_QUOTA 635 445 memset(&fi->i_dquot, 0, sizeof(fi->i_dquot)); ··· 638 446 #endif 639 447 /* Will be used by directory only */ 640 448 fi->i_dir_level = F2FS_SB(sb)->dir_level; 449 + 641 450 return &fi->vfs_inode; 642 451 } 643 452 ··· 777 584 kfree(sbi->devs); 778 585 } 779 586 780 - static void f2fs_quota_off_umount(struct super_block *sb); 781 587 static void f2fs_put_super(struct super_block *sb) 782 588 { 783 589 struct f2fs_sb_info *sbi = F2FS_SB(sb); ··· 834 642 835 643 kfree(sbi->ckpt); 836 644 837 - f2fs_exit_sysfs(sbi); 645 + f2fs_unregister_sysfs(sbi); 838 646 839 647 sb->s_fs_info = NULL; 840 648 if (sbi->s_chksum_driver) ··· 843 651 844 652 destroy_device_list(sbi); 845 653 mempool_destroy(sbi->write_io_dummy); 654 + #ifdef CONFIG_QUOTA 655 + for (i = 0; i < MAXQUOTAS; i++) 656 + kfree(sbi->s_qf_names[i]); 657 + #endif 846 658 destroy_percpu_info(sbi); 847 659 for (i = 0; i < NR_PAGE_TYPE; i++) 848 660 kfree(sbi->write_io[i]); ··· 859 663 int err = 0; 860 664 861 665 trace_f2fs_sync_fs(sb, sync); 666 + 667 + if (unlikely(is_sbi_flag_set(sbi, SBI_POR_DOING))) 668 + return -EAGAIN; 862 669 863 670 if (sync) { 864 671 struct cp_control cpc; ··· 896 697 { 897 698 return 0; 898 699 } 700 + 701 + #ifdef CONFIG_QUOTA 702 + static int f2fs_statfs_project(struct super_block *sb, 703 + kprojid_t projid, struct kstatfs *buf) 704 + { 705 + struct kqid qid; 706 + struct dquot *dquot; 707 + u64 limit; 708 + u64 curblock; 709 + 710 + qid = make_kqid_projid(projid); 711 + dquot = dqget(sb, qid); 712 + if (IS_ERR(dquot)) 713 + return PTR_ERR(dquot); 714 + spin_lock(&dq_data_lock); 715 + 716 + limit = (dquot->dq_dqb.dqb_bsoftlimit ? 717 + dquot->dq_dqb.dqb_bsoftlimit : 718 + dquot->dq_dqb.dqb_bhardlimit) >> sb->s_blocksize_bits; 719 + if (limit && buf->f_blocks > limit) { 720 + curblock = dquot->dq_dqb.dqb_curspace >> sb->s_blocksize_bits; 721 + buf->f_blocks = limit; 722 + buf->f_bfree = buf->f_bavail = 723 + (buf->f_blocks > curblock) ? 724 + (buf->f_blocks - curblock) : 0; 725 + } 726 + 727 + limit = dquot->dq_dqb.dqb_isoftlimit ? 728 + dquot->dq_dqb.dqb_isoftlimit : 729 + dquot->dq_dqb.dqb_ihardlimit; 730 + if (limit && buf->f_files > limit) { 731 + buf->f_files = limit; 732 + buf->f_ffree = 733 + (buf->f_files > dquot->dq_dqb.dqb_curinodes) ? 
734 + (buf->f_files - dquot->dq_dqb.dqb_curinodes) : 0; 735 + } 736 + 737 + spin_unlock(&dq_data_lock); 738 + dqput(dquot); 739 + return 0; 740 + } 741 + #endif 899 742 900 743 static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf) 901 744 { ··· 974 733 buf->f_fsid.val[0] = (u32)id; 975 734 buf->f_fsid.val[1] = (u32)(id >> 32); 976 735 736 + #ifdef CONFIG_QUOTA 737 + if (is_inode_flag_set(dentry->d_inode, FI_PROJ_INHERIT) && 738 + sb_has_quota_limits_enabled(sb, PRJQUOTA)) { 739 + f2fs_statfs_project(sb, F2FS_I(dentry->d_inode)->i_projid, buf); 740 + } 741 + #endif 977 742 return 0; 743 + } 744 + 745 + static inline void f2fs_show_quota_options(struct seq_file *seq, 746 + struct super_block *sb) 747 + { 748 + #ifdef CONFIG_QUOTA 749 + struct f2fs_sb_info *sbi = F2FS_SB(sb); 750 + 751 + if (sbi->s_jquota_fmt) { 752 + char *fmtname = ""; 753 + 754 + switch (sbi->s_jquota_fmt) { 755 + case QFMT_VFS_OLD: 756 + fmtname = "vfsold"; 757 + break; 758 + case QFMT_VFS_V0: 759 + fmtname = "vfsv0"; 760 + break; 761 + case QFMT_VFS_V1: 762 + fmtname = "vfsv1"; 763 + break; 764 + } 765 + seq_printf(seq, ",jqfmt=%s", fmtname); 766 + } 767 + 768 + if (sbi->s_qf_names[USRQUOTA]) 769 + seq_show_option(seq, "usrjquota", sbi->s_qf_names[USRQUOTA]); 770 + 771 + if (sbi->s_qf_names[GRPQUOTA]) 772 + seq_show_option(seq, "grpjquota", sbi->s_qf_names[GRPQUOTA]); 773 + 774 + if (sbi->s_qf_names[PRJQUOTA]) 775 + seq_show_option(seq, "prjjquota", sbi->s_qf_names[PRJQUOTA]); 776 + #endif 978 777 } 979 778 980 779 static int f2fs_show_options(struct seq_file *seq, struct dentry *root) ··· 1090 809 sbi->fault_info.inject_rate); 1091 810 #endif 1092 811 #ifdef CONFIG_QUOTA 812 + if (test_opt(sbi, QUOTA)) 813 + seq_puts(seq, ",quota"); 1093 814 if (test_opt(sbi, USRQUOTA)) 1094 815 seq_puts(seq, ",usrquota"); 1095 816 if (test_opt(sbi, GRPQUOTA)) 1096 817 seq_puts(seq, ",grpquota"); 818 + if (test_opt(sbi, PRJQUOTA)) 819 + seq_puts(seq, ",prjquota"); 1097 820 #endif 821 + f2fs_show_quota_options(seq, sbi->sb); 1098 822 1099 823 return 0; 1100 824 } ··· 1148 862 #ifdef CONFIG_F2FS_FAULT_INJECTION 1149 863 struct f2fs_fault_info ffi = sbi->fault_info; 1150 864 #endif 865 + #ifdef CONFIG_QUOTA 866 + int s_jquota_fmt; 867 + char *s_qf_names[MAXQUOTAS]; 868 + int i, j; 869 + #endif 1151 870 1152 871 /* 1153 872 * Save the old mount options in case we ··· 1161 870 org_mount_opt = sbi->mount_opt; 1162 871 old_sb_flags = sb->s_flags; 1163 872 active_logs = sbi->active_logs; 873 + 874 + #ifdef CONFIG_QUOTA 875 + s_jquota_fmt = sbi->s_jquota_fmt; 876 + for (i = 0; i < MAXQUOTAS; i++) { 877 + if (sbi->s_qf_names[i]) { 878 + s_qf_names[i] = kstrdup(sbi->s_qf_names[i], 879 + GFP_KERNEL); 880 + if (!s_qf_names[i]) { 881 + for (j = 0; j < i; j++) 882 + kfree(s_qf_names[j]); 883 + return -ENOMEM; 884 + } 885 + } else { 886 + s_qf_names[i] = NULL; 887 + } 888 + } 889 + #endif 1164 890 1165 891 /* recover superblocks we couldn't write due to previous RO mount */ 1166 892 if (!(*flags & MS_RDONLY) && is_sbi_flag_set(sbi, SBI_NEED_SB_WRITE)) { ··· 1260 952 goto restore_gc; 1261 953 } 1262 954 skip: 955 + #ifdef CONFIG_QUOTA 956 + /* Release old quota file names */ 957 + for (i = 0; i < MAXQUOTAS; i++) 958 + kfree(s_qf_names[i]); 959 + #endif 1263 960 /* Update the POSIXACL Flag */ 1264 961 sb->s_flags = (sb->s_flags & ~MS_POSIXACL) | 1265 962 (test_opt(sbi, POSIX_ACL) ? 
MS_POSIXACL : 0); ··· 1279 966 stop_gc_thread(sbi); 1280 967 } 1281 968 restore_opts: 969 + #ifdef CONFIG_QUOTA 970 + sbi->s_jquota_fmt = s_jquota_fmt; 971 + for (i = 0; i < MAXQUOTAS; i++) { 972 + kfree(sbi->s_qf_names[i]); 973 + sbi->s_qf_names[i] = s_qf_names[i]; 974 + } 975 + #endif 1282 976 sbi->mount_opt = org_mount_opt; 1283 977 sbi->active_logs = active_logs; 1284 978 sb->s_flags = old_sb_flags; ··· 1385 1065 } 1386 1066 1387 1067 if (len == towrite) 1388 - return err; 1068 + return 0; 1389 1069 inode->i_version++; 1390 1070 inode->i_mtime = inode->i_ctime = current_time(inode); 1391 1071 f2fs_mark_inode_dirty_sync(inode, false); ··· 1400 1080 static qsize_t *f2fs_get_reserved_space(struct inode *inode) 1401 1081 { 1402 1082 return &F2FS_I(inode)->i_reserved_quota; 1083 + } 1084 + 1085 + static int f2fs_quota_on_mount(struct f2fs_sb_info *sbi, int type) 1086 + { 1087 + return dquot_quota_on_mount(sbi->sb, sbi->s_qf_names[type], 1088 + sbi->s_jquota_fmt, type); 1089 + } 1090 + 1091 + void f2fs_enable_quota_files(struct f2fs_sb_info *sbi) 1092 + { 1093 + int i, ret; 1094 + 1095 + for (i = 0; i < MAXQUOTAS; i++) { 1096 + if (sbi->s_qf_names[i]) { 1097 + ret = f2fs_quota_on_mount(sbi, i); 1098 + if (ret < 0) 1099 + f2fs_msg(sbi->sb, KERN_ERR, 1100 + "Cannot turn on journaled " 1101 + "quota: error %d", ret); 1102 + } 1103 + } 1403 1104 } 1404 1105 1405 1106 static int f2fs_quota_sync(struct super_block *sb, int type) ··· 1460 1119 struct inode *inode; 1461 1120 int err; 1462 1121 1463 - err = f2fs_quota_sync(sb, -1); 1122 + err = f2fs_quota_sync(sb, type); 1464 1123 if (err) 1465 1124 return err; 1466 1125 ··· 1488 1147 if (!inode || !igrab(inode)) 1489 1148 return dquot_quota_off(sb, type); 1490 1149 1491 - f2fs_quota_sync(sb, -1); 1150 + f2fs_quota_sync(sb, type); 1492 1151 1493 1152 err = dquot_quota_off(sb, type); 1494 1153 if (err) ··· 1504 1163 return err; 1505 1164 } 1506 1165 1507 - static void f2fs_quota_off_umount(struct super_block *sb) 1166 + void f2fs_quota_off_umount(struct super_block *sb) 1508 1167 { 1509 1168 int type; 1510 1169 1511 1170 for (type = 0; type < MAXQUOTAS; type++) 1512 1171 f2fs_quota_off(sb, type); 1172 + } 1173 + 1174 + int f2fs_get_projid(struct inode *inode, kprojid_t *projid) 1175 + { 1176 + *projid = F2FS_I(inode)->i_projid; 1177 + return 0; 1513 1178 } 1514 1179 1515 1180 static const struct dquot_operations f2fs_quota_operations = { ··· 1527 1180 .write_info = dquot_commit_info, 1528 1181 .alloc_dquot = dquot_alloc, 1529 1182 .destroy_dquot = dquot_destroy, 1183 + .get_projid = f2fs_get_projid, 1530 1184 .get_next_id = dquot_get_next_id, 1531 1185 }; 1532 1186 ··· 1542 1194 .get_nextdqblk = dquot_get_next_dqblk, 1543 1195 }; 1544 1196 #else 1545 - static inline void f2fs_quota_off_umount(struct super_block *sb) 1197 + void f2fs_quota_off_umount(struct super_block *sb) 1546 1198 { 1547 1199 } 1548 1200 #endif 1549 1201 1550 - static struct super_operations f2fs_sops = { 1202 + static const struct super_operations f2fs_sops = { 1551 1203 .alloc_inode = f2fs_alloc_inode, 1552 1204 .drop_inode = f2fs_drop_inode, 1553 1205 .destroy_inode = f2fs_destroy_inode, ··· 1651 1303 1652 1304 static loff_t max_file_blocks(void) 1653 1305 { 1654 - loff_t result = (DEF_ADDRS_PER_INODE - F2FS_INLINE_XATTR_ADDRS); 1306 + loff_t result = 0; 1655 1307 loff_t leaf_count = ADDRS_PER_BLOCK; 1308 + 1309 + /* 1310 + * note: previously, result is equal to (DEF_ADDRS_PER_INODE - 1311 + * F2FS_INLINE_XATTR_ADDRS), but now f2fs try to reserve more 1312 + * space in 
inode.i_addr, it will be more safe to reassign 1313 + * result as zero. 1314 + */ 1656 1315 1657 1316 /* two direct node blocks */ 1658 1317 result += (leaf_count * 2); ··· 2277 1922 sb->s_fs_info = sbi; 2278 1923 sbi->raw_super = raw_super; 2279 1924 1925 + /* precompute checksum seed for metadata */ 1926 + if (f2fs_sb_has_inode_chksum(sb)) 1927 + sbi->s_chksum_seed = f2fs_chksum(sbi, ~0, raw_super->uuid, 1928 + sizeof(raw_super->uuid)); 1929 + 2280 1930 /* 2281 1931 * The BLKZONED feature indicates that the drive was formatted with 2282 1932 * zone alignment optimization. This is optional for host-aware ··· 2316 1956 #ifdef CONFIG_QUOTA 2317 1957 sb->dq_op = &f2fs_quota_operations; 2318 1958 sb->s_qcop = &f2fs_quotactl_ops; 2319 - sb->s_quota_types = QTYPE_MASK_USR | QTYPE_MASK_GRP; 1959 + sb->s_quota_types = QTYPE_MASK_USR | QTYPE_MASK_GRP | QTYPE_MASK_PRJ; 2320 1960 #endif 2321 1961 2322 1962 sb->s_op = &f2fs_sops; ··· 2339 1979 /* disallow all the data/node/meta page writes */ 2340 1980 set_sbi_flag(sbi, SBI_POR_DOING); 2341 1981 spin_lock_init(&sbi->stat_lock); 1982 + 1983 + /* init iostat info */ 1984 + spin_lock_init(&sbi->iostat_lock); 1985 + sbi->iostat_enable = false; 2342 1986 2343 1987 for (i = 0; i < NR_PAGE_TYPE; i++) { 2344 1988 int n = (i == META) ? 1: NR_TEMP_TYPE; ··· 2462 2098 if (err) 2463 2099 goto free_nm; 2464 2100 2465 - /* if there are nt orphan nodes free them */ 2466 - err = recover_orphan_inodes(sbi); 2467 - if (err) 2468 - goto free_node_inode; 2469 - 2470 2101 /* read root inode and dentry */ 2471 2102 root = f2fs_iget(sb, F2FS_ROOT_INO(sbi)); 2472 2103 if (IS_ERR(root)) { ··· 2481 2122 goto free_root_inode; 2482 2123 } 2483 2124 2484 - err = f2fs_init_sysfs(sbi); 2125 + err = f2fs_register_sysfs(sbi); 2485 2126 if (err) 2486 2127 goto free_root_inode; 2128 + 2129 + /* if there are nt orphan nodes free them */ 2130 + err = recover_orphan_inodes(sbi); 2131 + if (err) 2132 + goto free_sysfs; 2487 2133 2488 2134 /* recover fsynced data */ 2489 2135 if (!test_opt(sbi, DISABLE_ROLL_FORWARD)) { ··· 2499 2135 if (bdev_read_only(sb->s_bdev) && 2500 2136 !is_set_ckpt_flags(sbi, CP_UMOUNT_FLAG)) { 2501 2137 err = -EROFS; 2502 - goto free_sysfs; 2138 + goto free_meta; 2503 2139 } 2504 2140 2505 2141 if (need_fsck) ··· 2513 2149 need_fsck = true; 2514 2150 f2fs_msg(sb, KERN_ERR, 2515 2151 "Cannot recover all fsync data errno=%d", err); 2516 - goto free_sysfs; 2152 + goto free_meta; 2517 2153 } 2518 2154 } else { 2519 2155 err = recover_fsync_data(sbi, true); ··· 2537 2173 /* After POR, we can run background GC thread.*/ 2538 2174 err = start_gc_thread(sbi); 2539 2175 if (err) 2540 - goto free_sysfs; 2176 + goto free_meta; 2541 2177 } 2542 2178 kfree(options); 2543 2179 ··· 2555 2191 f2fs_update_time(sbi, REQ_TIME); 2556 2192 return 0; 2557 2193 2558 - free_sysfs: 2194 + free_meta: 2559 2195 f2fs_sync_inode_meta(sbi); 2560 - f2fs_exit_sysfs(sbi); 2196 + /* 2197 + * Some dirty meta pages can be produced by recover_orphan_inodes() 2198 + * failed by EIO. Then, iput(node_inode) can trigger balance_fs_bg() 2199 + * followed by write_checkpoint() through f2fs_write_node_pages(), which 2200 + * falls into an infinite loop in sync_meta_pages(). 
2201 + */ 2202 + truncate_inode_pages_final(META_MAPPING(sbi)); 2203 + free_sysfs: 2204 + f2fs_unregister_sysfs(sbi); 2561 2205 free_root_inode: 2562 2206 dput(sb->s_root); 2563 2207 sb->s_root = NULL; ··· 2574 2202 mutex_lock(&sbi->umount_mutex); 2575 2203 release_ino_entry(sbi, true); 2576 2204 f2fs_leave_shrinker(sbi); 2577 - /* 2578 - * Some dirty meta pages can be produced by recover_orphan_inodes() 2579 - * failed by EIO. Then, iput(node_inode) can trigger balance_fs_bg() 2580 - * followed by write_checkpoint() through f2fs_write_node_pages(), which 2581 - * falls into an infinite loop in sync_meta_pages(). 2582 - */ 2583 - truncate_inode_pages_final(META_MAPPING(sbi)); 2584 2205 iput(sbi->node_inode); 2585 2206 mutex_unlock(&sbi->umount_mutex); 2586 2207 f2fs_destroy_stats(sbi); ··· 2593 2228 for (i = 0; i < NR_PAGE_TYPE; i++) 2594 2229 kfree(sbi->write_io[i]); 2595 2230 destroy_percpu_info(sbi); 2231 + #ifdef CONFIG_QUOTA 2232 + for (i = 0; i < MAXQUOTAS; i++) 2233 + kfree(sbi->s_qf_names[i]); 2234 + #endif 2596 2235 kfree(options); 2597 2236 free_sb_buf: 2598 2237 kfree(raw_super); ··· 2680 2311 err = create_extent_cache(); 2681 2312 if (err) 2682 2313 goto free_checkpoint_caches; 2683 - err = f2fs_register_sysfs(); 2314 + err = f2fs_init_sysfs(); 2684 2315 if (err) 2685 2316 goto free_extent_cache; 2686 2317 err = register_shrinker(&f2fs_shrinker_info); ··· 2699 2330 free_shrinker: 2700 2331 unregister_shrinker(&f2fs_shrinker_info); 2701 2332 free_sysfs: 2702 - f2fs_unregister_sysfs(); 2333 + f2fs_exit_sysfs(); 2703 2334 free_extent_cache: 2704 2335 destroy_extent_cache(); 2705 2336 free_checkpoint_caches: ··· 2719 2350 f2fs_destroy_root_stats(); 2720 2351 unregister_filesystem(&f2fs_fs_type); 2721 2352 unregister_shrinker(&f2fs_shrinker_info); 2722 - f2fs_unregister_sysfs(); 2353 + f2fs_exit_sysfs(); 2723 2354 destroy_extent_cache(); 2724 2355 destroy_checkpoint_caches(); 2725 2356 destroy_segment_manager_caches();
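On the mount side, parse_options() now accepts the classic journalled-quota vocabulary: usrjquota=, grpjquota= and prjjquota= name quota files that must sit in the filesystem root, jqfmt= selects vfsold/vfsv0/vfsv1, and f2fs_check_quota_options() rejects mixing old- and new-style quotas or leaving jqfmt unset. A hedged sketch of a mount(2) call using these options; the device, mount point and quota file names are placeholders:

#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
	/* journalled user/group quota files in the fs root, vfsv0 format */
	const char *opts = "usrjquota=aquota.user,grpjquota=aquota.group,"
			   "jqfmt=vfsv0";

	if (mount("/dev/sdb1", "/mnt/f2fs", "f2fs", 0, opts)) {
		perror("mount");
		return 1;
	}
	return 0;
}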
+221 -30
fs/f2fs/sysfs.c
··· 18 18 #include "gc.h" 19 19 20 20 static struct proc_dir_entry *f2fs_proc_root; 21 - static struct kset *f2fs_kset; 22 21 23 22 /* Sysfs support for f2fs */ 24 23 enum { ··· 40 41 const char *, size_t); 41 42 int struct_type; 42 43 int offset; 44 + int id; 43 45 }; 44 46 45 47 static unsigned char *__struct_ptr(struct f2fs_sb_info *sbi, int struct_type) ··· 74 74 return snprintf(buf, PAGE_SIZE, "%llu\n", 75 75 (unsigned long long)(sbi->kbytes_written + 76 76 BD_PART_WRITTEN(sbi))); 77 + } 78 + 79 + static ssize_t features_show(struct f2fs_attr *a, 80 + struct f2fs_sb_info *sbi, char *buf) 81 + { 82 + struct super_block *sb = sbi->sb; 83 + int len = 0; 84 + 85 + if (!sb->s_bdev->bd_part) 86 + return snprintf(buf, PAGE_SIZE, "0\n"); 87 + 88 + if (f2fs_sb_has_crypto(sb)) 89 + len += snprintf(buf, PAGE_SIZE - len, "%s", 90 + "encryption"); 91 + if (f2fs_sb_mounted_blkzoned(sb)) 92 + len += snprintf(buf + len, PAGE_SIZE - len, "%s%s", 93 + len ? ", " : "", "blkzoned"); 94 + if (f2fs_sb_has_extra_attr(sb)) 95 + len += snprintf(buf + len, PAGE_SIZE - len, "%s%s", 96 + len ? ", " : "", "extra_attr"); 97 + if (f2fs_sb_has_project_quota(sb)) 98 + len += snprintf(buf + len, PAGE_SIZE - len, "%s%s", 99 + len ? ", " : "", "projquota"); 100 + if (f2fs_sb_has_inode_chksum(sb)) 101 + len += snprintf(buf + len, PAGE_SIZE - len, "%s%s", 102 + len ? ", " : "", "inode_checksum"); 103 + len += snprintf(buf + len, PAGE_SIZE - len, "\n"); 104 + return len; 77 105 } 78 106 79 107 static ssize_t f2fs_sbi_show(struct f2fs_attr *a, ··· 152 124 spin_unlock(&sbi->stat_lock); 153 125 return count; 154 126 } 127 + 128 + if (!strcmp(a->attr.name, "discard_granularity")) { 129 + struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info; 130 + int i; 131 + 132 + if (t == 0 || t > MAX_PLIST_NUM) 133 + return -EINVAL; 134 + if (t == *ui) 135 + return count; 136 + 137 + mutex_lock(&dcc->cmd_lock); 138 + for (i = 0; i < MAX_PLIST_NUM; i++) { 139 + if (i >= t - 1) 140 + dcc->pend_list_tag[i] |= P_ACTIVE; 141 + else 142 + dcc->pend_list_tag[i] &= (~P_ACTIVE); 143 + } 144 + mutex_unlock(&dcc->cmd_lock); 145 + 146 + *ui = t; 147 + return count; 148 + } 149 + 155 150 *ui = t; 151 + 152 + if (!strcmp(a->attr.name, "iostat_enable") && *ui == 0) 153 + f2fs_reset_iostat(sbi); 154 + if (!strcmp(a->attr.name, "gc_urgent") && t == 1 && sbi->gc_thread) { 155 + sbi->gc_thread->gc_wake = 1; 156 + wake_up_interruptible_all(&sbi->gc_thread->gc_wait_queue_head); 157 + wake_up_discard_thread(sbi, true); 158 + } 159 + 156 160 return count; 157 161 } 158 162 ··· 215 155 complete(&sbi->s_kobj_unregister); 216 156 } 217 157 158 + enum feat_id { 159 + FEAT_CRYPTO = 0, 160 + FEAT_BLKZONED, 161 + FEAT_ATOMIC_WRITE, 162 + FEAT_EXTRA_ATTR, 163 + FEAT_PROJECT_QUOTA, 164 + FEAT_INODE_CHECKSUM, 165 + }; 166 + 167 + static ssize_t f2fs_feature_show(struct f2fs_attr *a, 168 + struct f2fs_sb_info *sbi, char *buf) 169 + { 170 + switch (a->id) { 171 + case FEAT_CRYPTO: 172 + case FEAT_BLKZONED: 173 + case FEAT_ATOMIC_WRITE: 174 + case FEAT_EXTRA_ATTR: 175 + case FEAT_PROJECT_QUOTA: 176 + case FEAT_INODE_CHECKSUM: 177 + return snprintf(buf, PAGE_SIZE, "supported\n"); 178 + } 179 + return 0; 180 + } 181 + 218 182 #define F2FS_ATTR_OFFSET(_struct_type, _name, _mode, _show, _store, _offset) \ 219 183 static struct f2fs_attr f2fs_attr_##_name = { \ 220 184 .attr = {.name = __stringify(_name), .mode = _mode }, \ ··· 256 172 #define F2FS_GENERAL_RO_ATTR(name) \ 257 173 static struct f2fs_attr f2fs_attr_##name = __ATTR(name, 0444, name##_show, NULL) 258 174 175 + 
#define F2FS_FEATURE_RO_ATTR(_name, _id) \ 176 + static struct f2fs_attr f2fs_attr_##_name = { \ 177 + .attr = {.name = __stringify(_name), .mode = 0444 }, \ 178 + .show = f2fs_feature_show, \ 179 + .id = _id, \ 180 + } 181 + 182 + F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_urgent_sleep_time, 183 + urgent_sleep_time); 259 184 F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_min_sleep_time, min_sleep_time); 260 185 F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_max_sleep_time, max_sleep_time); 261 186 F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_no_gc_sleep_time, no_gc_sleep_time); 262 187 F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_idle, gc_idle); 188 + F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_urgent, gc_urgent); 263 189 F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, reclaim_segments, rec_prefree_segments); 264 190 F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, max_small_discards, max_discards); 191 + F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, discard_granularity, discard_granularity); 265 192 F2FS_RW_ATTR(RESERVED_BLOCKS, f2fs_sb_info, reserved_blocks, reserved_blocks); 266 193 F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, batched_trim_sections, trim_sections); 267 194 F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, ipu_policy, ipu_policy); ··· 286 191 F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, dir_level, dir_level); 287 192 F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, cp_interval, interval_time[CP_TIME]); 288 193 F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, idle_interval, interval_time[REQ_TIME]); 194 + F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, iostat_enable, iostat_enable); 289 195 #ifdef CONFIG_F2FS_FAULT_INJECTION 290 196 F2FS_RW_ATTR(FAULT_INFO_RATE, f2fs_fault_info, inject_rate, inject_rate); 291 197 F2FS_RW_ATTR(FAULT_INFO_TYPE, f2fs_fault_info, inject_type, inject_type); 292 198 #endif 293 199 F2FS_GENERAL_RO_ATTR(lifetime_write_kbytes); 200 + F2FS_GENERAL_RO_ATTR(features); 201 + 202 + #ifdef CONFIG_F2FS_FS_ENCRYPTION 203 + F2FS_FEATURE_RO_ATTR(encryption, FEAT_CRYPTO); 204 + #endif 205 + #ifdef CONFIG_BLK_DEV_ZONED 206 + F2FS_FEATURE_RO_ATTR(block_zoned, FEAT_BLKZONED); 207 + #endif 208 + F2FS_FEATURE_RO_ATTR(atomic_write, FEAT_ATOMIC_WRITE); 209 + F2FS_FEATURE_RO_ATTR(extra_attr, FEAT_EXTRA_ATTR); 210 + F2FS_FEATURE_RO_ATTR(project_quota, FEAT_PROJECT_QUOTA); 211 + F2FS_FEATURE_RO_ATTR(inode_checksum, FEAT_INODE_CHECKSUM); 294 212 295 213 #define ATTR_LIST(name) (&f2fs_attr_##name.attr) 296 214 static struct attribute *f2fs_attrs[] = { 215 + ATTR_LIST(gc_urgent_sleep_time), 297 216 ATTR_LIST(gc_min_sleep_time), 298 217 ATTR_LIST(gc_max_sleep_time), 299 218 ATTR_LIST(gc_no_gc_sleep_time), 300 219 ATTR_LIST(gc_idle), 220 + ATTR_LIST(gc_urgent), 301 221 ATTR_LIST(reclaim_segments), 302 222 ATTR_LIST(max_small_discards), 223 + ATTR_LIST(discard_granularity), 303 224 ATTR_LIST(batched_trim_sections), 304 225 ATTR_LIST(ipu_policy), 305 226 ATTR_LIST(min_ipu_util), ··· 328 217 ATTR_LIST(dirty_nats_ratio), 329 218 ATTR_LIST(cp_interval), 330 219 ATTR_LIST(idle_interval), 220 + ATTR_LIST(iostat_enable), 331 221 #ifdef CONFIG_F2FS_FAULT_INJECTION 332 222 ATTR_LIST(inject_rate), 333 223 ATTR_LIST(inject_type), 334 224 #endif 335 225 ATTR_LIST(lifetime_write_kbytes), 226 + ATTR_LIST(features), 336 227 ATTR_LIST(reserved_blocks), 228 + NULL, 229 + }; 230 + 231 + static struct attribute *f2fs_feat_attrs[] = { 232 + #ifdef CONFIG_F2FS_FS_ENCRYPTION 233 + ATTR_LIST(encryption), 234 + #endif 235 + #ifdef CONFIG_BLK_DEV_ZONED 236 + ATTR_LIST(block_zoned), 237 + #endif 238 + ATTR_LIST(atomic_write), 239 + ATTR_LIST(extra_attr), 240 + ATTR_LIST(project_quota), 
241 + ATTR_LIST(inode_checksum), 337 242 NULL, 338 243 }; 339 244 ··· 358 231 .store = f2fs_attr_store, 359 232 }; 360 233 361 - static struct kobj_type f2fs_ktype = { 234 + static struct kobj_type f2fs_sb_ktype = { 362 235 .default_attrs = f2fs_attrs, 363 236 .sysfs_ops = &f2fs_attr_ops, 364 237 .release = f2fs_sb_release, 238 + }; 239 + 240 + static struct kobj_type f2fs_ktype = { 241 + .sysfs_ops = &f2fs_attr_ops, 242 + }; 243 + 244 + static struct kset f2fs_kset = { 245 + .kobj = {.ktype = &f2fs_ktype}, 246 + }; 247 + 248 + static struct kobj_type f2fs_feat_ktype = { 249 + .default_attrs = f2fs_feat_attrs, 250 + .sysfs_ops = &f2fs_attr_ops, 251 + }; 252 + 253 + static struct kobject f2fs_feat = { 254 + .kset = &f2fs_kset, 365 255 }; 366 256 367 257 static int segment_info_seq_show(struct seq_file *seq, void *offset) ··· 432 288 return 0; 433 289 } 434 290 291 + static int iostat_info_seq_show(struct seq_file *seq, void *offset) 292 + { 293 + struct super_block *sb = seq->private; 294 + struct f2fs_sb_info *sbi = F2FS_SB(sb); 295 + time64_t now = ktime_get_real_seconds(); 296 + 297 + if (!sbi->iostat_enable) 298 + return 0; 299 + 300 + seq_printf(seq, "time: %-16llu\n", now); 301 + 302 + /* print app IOs */ 303 + seq_printf(seq, "app buffered: %-16llu\n", 304 + sbi->write_iostat[APP_BUFFERED_IO]); 305 + seq_printf(seq, "app direct: %-16llu\n", 306 + sbi->write_iostat[APP_DIRECT_IO]); 307 + seq_printf(seq, "app mapped: %-16llu\n", 308 + sbi->write_iostat[APP_MAPPED_IO]); 309 + 310 + /* print fs IOs */ 311 + seq_printf(seq, "fs data: %-16llu\n", 312 + sbi->write_iostat[FS_DATA_IO]); 313 + seq_printf(seq, "fs node: %-16llu\n", 314 + sbi->write_iostat[FS_NODE_IO]); 315 + seq_printf(seq, "fs meta: %-16llu\n", 316 + sbi->write_iostat[FS_META_IO]); 317 + seq_printf(seq, "fs gc data: %-16llu\n", 318 + sbi->write_iostat[FS_GC_DATA_IO]); 319 + seq_printf(seq, "fs gc node: %-16llu\n", 320 + sbi->write_iostat[FS_GC_NODE_IO]); 321 + seq_printf(seq, "fs cp data: %-16llu\n", 322 + sbi->write_iostat[FS_CP_DATA_IO]); 323 + seq_printf(seq, "fs cp node: %-16llu\n", 324 + sbi->write_iostat[FS_CP_NODE_IO]); 325 + seq_printf(seq, "fs cp meta: %-16llu\n", 326 + sbi->write_iostat[FS_CP_META_IO]); 327 + seq_printf(seq, "fs discard: %-16llu\n", 328 + sbi->write_iostat[FS_DISCARD]); 329 + 330 + return 0; 331 + } 332 + 435 333 #define F2FS_PROC_FILE_DEF(_name) \ 436 334 static int _name##_open_fs(struct inode *inode, struct file *file) \ 437 335 { \ ··· 489 303 490 304 F2FS_PROC_FILE_DEF(segment_info); 491 305 F2FS_PROC_FILE_DEF(segment_bits); 306 + F2FS_PROC_FILE_DEF(iostat_info); 492 307 493 - int __init f2fs_register_sysfs(void) 308 + int __init f2fs_init_sysfs(void) 494 309 { 495 - f2fs_proc_root = proc_mkdir("fs/f2fs", NULL); 310 + int ret; 496 311 497 - f2fs_kset = kset_create_and_add("f2fs", NULL, fs_kobj); 498 - if (!f2fs_kset) 499 - return -ENOMEM; 500 - return 0; 312 + kobject_set_name(&f2fs_kset.kobj, "f2fs"); 313 + f2fs_kset.kobj.parent = fs_kobj; 314 + ret = kset_register(&f2fs_kset); 315 + if (ret) 316 + return ret; 317 + 318 + ret = kobject_init_and_add(&f2fs_feat, &f2fs_feat_ktype, 319 + NULL, "features"); 320 + if (ret) 321 + kset_unregister(&f2fs_kset); 322 + else 323 + f2fs_proc_root = proc_mkdir("fs/f2fs", NULL); 324 + return ret; 501 325 } 502 326 503 - void f2fs_unregister_sysfs(void) 327 + void f2fs_exit_sysfs(void) 504 328 { 505 - kset_unregister(f2fs_kset); 329 + kobject_put(&f2fs_feat); 330 + kset_unregister(&f2fs_kset); 506 331 remove_proc_entry("fs/f2fs", NULL); 332 + f2fs_proc_root = 
NULL; 507 333 } 508 334 509 - int f2fs_init_sysfs(struct f2fs_sb_info *sbi) 335 + int f2fs_register_sysfs(struct f2fs_sb_info *sbi) 510 336 { 511 337 struct super_block *sb = sbi->sb; 512 338 int err; 339 + 340 + sbi->s_kobj.kset = &f2fs_kset; 341 + init_completion(&sbi->s_kobj_unregister); 342 + err = kobject_init_and_add(&sbi->s_kobj, &f2fs_sb_ktype, NULL, 343 + "%s", sb->s_id); 344 + if (err) 345 + return err; 513 346 514 347 if (f2fs_proc_root) 515 348 sbi->s_proc = proc_mkdir(sb->s_id, f2fs_proc_root); ··· 538 333 &f2fs_seq_segment_info_fops, sb); 539 334 proc_create_data("segment_bits", S_IRUGO, sbi->s_proc, 540 335 &f2fs_seq_segment_bits_fops, sb); 336 + proc_create_data("iostat_info", S_IRUGO, sbi->s_proc, 337 + &f2fs_seq_iostat_info_fops, sb); 541 338 } 542 - 543 - sbi->s_kobj.kset = f2fs_kset; 544 - init_completion(&sbi->s_kobj_unregister); 545 - err = kobject_init_and_add(&sbi->s_kobj, &f2fs_ktype, NULL, 546 - "%s", sb->s_id); 547 - if (err) 548 - goto err_out; 549 339 return 0; 550 - err_out: 551 - if (sbi->s_proc) { 552 - remove_proc_entry("segment_info", sbi->s_proc); 553 - remove_proc_entry("segment_bits", sbi->s_proc); 554 - remove_proc_entry(sb->s_id, f2fs_proc_root); 555 - } 556 - return err; 557 340 } 558 341 559 - void f2fs_exit_sysfs(struct f2fs_sb_info *sbi) 342 + void f2fs_unregister_sysfs(struct f2fs_sb_info *sbi) 560 343 { 561 - kobject_del(&sbi->s_kobj); 562 - kobject_put(&sbi->s_kobj); 563 - wait_for_completion(&sbi->s_kobj_unregister); 564 - 565 344 if (sbi->s_proc) { 345 + remove_proc_entry("iostat_info", sbi->s_proc); 566 346 remove_proc_entry("segment_info", sbi->s_proc); 567 347 remove_proc_entry("segment_bits", sbi->s_proc); 568 348 remove_proc_entry(sbi->sb->s_id, f2fs_proc_root); 569 349 } 350 + kobject_del(&sbi->s_kobj); 570 351 }
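Registration is now split between module and superblock lifetime: f2fs_init_sysfs() sets up the shared f2fs kset and the static "features" kobject once at module load, while f2fs_register_sysfs() adds each volume's kobject plus its /proc directory, including the new iostat_info dump. A small userspace sketch that enables iostat accounting, flips gc_urgent on, and reads the counters; "sda1" stands in for the real volume name:

#include <stdio.h>

static void write_attr(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (f) {
		fputs(val, f);
		fclose(f);
	}
}

int main(void)
{
	char line[128];
	FILE *f;

	write_attr("/sys/fs/f2fs/sda1/iostat_enable", "1");
	/* urgent GC also wakes the discard thread via wake_up_discard_thread() */
	write_attr("/sys/fs/f2fs/sda1/gc_urgent", "1");

	f = fopen("/proc/fs/f2fs/sda1/iostat_info", "r");
	if (f) {
		while (fgets(line, sizeof(line), f))
			fputs(line, stdout);
		fclose(f);
	}
	return 0;
}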
+7 -1
fs/f2fs/xattr.c
··· 442 442 } else { 443 443 struct dnode_of_data dn; 444 444 set_new_dnode(&dn, inode, NULL, NULL, new_nid); 445 - xpage = new_node_page(&dn, XATTR_NODE_OFFSET, ipage); 445 + xpage = new_node_page(&dn, XATTR_NODE_OFFSET); 446 446 if (IS_ERR(xpage)) { 447 447 alloc_nid_failed(sbi, new_nid); 448 448 return PTR_ERR(xpage); ··· 473 473 if (len > F2FS_NAME_LEN) 474 474 return -ERANGE; 475 475 476 + down_read(&F2FS_I(inode)->i_xattr_sem); 476 477 error = lookup_all_xattrs(inode, ipage, index, len, name, 477 478 &entry, &base_addr); 479 + up_read(&F2FS_I(inode)->i_xattr_sem); 478 480 if (error) 479 481 return error; 480 482 ··· 505 503 int error = 0; 506 504 size_t rest = buffer_size; 507 505 506 + down_read(&F2FS_I(inode)->i_xattr_sem); 508 507 error = read_all_xattrs(inode, NULL, &base_addr); 508 + up_read(&F2FS_I(inode)->i_xattr_sem); 509 509 if (error) 510 510 return error; 511 511 ··· 690 686 f2fs_lock_op(sbi); 691 687 /* protect xattr_ver */ 692 688 down_write(&F2FS_I(inode)->i_sem); 689 + down_write(&F2FS_I(inode)->i_xattr_sem); 693 690 err = __f2fs_setxattr(inode, index, name, value, size, ipage, flags); 691 + up_write(&F2FS_I(inode)->i_xattr_sem); 694 692 up_write(&F2FS_I(inode)->i_sem); 695 693 f2fs_unlock_op(sbi); 696 694
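The new per-inode i_xattr_sem closes the race between xattr readers and __f2fs_setxattr(): f2fs_getxattr() and f2fs_listxattr() now walk the entry block under the read side while setters hold the write side. The same reader/writer discipline with a POSIX rwlock, the xattr block reduced to one buffer for illustration:

#include <pthread.h>
#include <stdio.h>
#include <string.h>

static pthread_rwlock_t i_xattr_sem = PTHREAD_RWLOCK_INITIALIZER;
static char xattr_blob[256];	/* stand-in for the inode's xattr block */

/* readers may run concurrently, but never against a writer */
static void xattr_lookup(char *out, size_t len)
{
	pthread_rwlock_rdlock(&i_xattr_sem);
	strncpy(out, xattr_blob, len - 1);
	out[len - 1] = '\0';
	pthread_rwlock_unlock(&i_xattr_sem);
}

/* setters rewrite the block under exclusive access */
static void xattr_set(const char *blob)
{
	pthread_rwlock_wrlock(&i_xattr_sem);
	strncpy(xattr_blob, blob, sizeof(xattr_blob) - 1);
	pthread_rwlock_unlock(&i_xattr_sem);
}

int main(void)
{
	char buf[256];

	xattr_set("user.demo=1");
	xattr_lookup(buf, sizeof(buf));
	printf("%s\n", buf);
	return 0;
}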
+16 -24
include/linux/f2fs_fs.h
··· 186 186 #define F2FS_NAME_LEN 255 187 187 #define F2FS_INLINE_XATTR_ADDRS 50 /* 200 bytes for inline xattrs */ 188 188 #define DEF_ADDRS_PER_INODE 923 /* Address Pointers in an Inode */ 189 + #define CUR_ADDRS_PER_INODE(inode) (DEF_ADDRS_PER_INODE - \ 190 + get_extra_isize(inode)) 189 191 #define DEF_NIDS_PER_INODE 5 /* Node IDs in an Inode */ 190 192 #define ADDRS_PER_INODE(inode) addrs_per_inode(inode) 191 193 #define ADDRS_PER_BLOCK 1018 /* Address Pointers in a Direct Block */ ··· 207 205 #define F2FS_INLINE_DENTRY 0x04 /* file inline dentry flag */ 208 206 #define F2FS_DATA_EXIST 0x08 /* file inline data exist flag */ 209 207 #define F2FS_INLINE_DOTS 0x10 /* file having implicit dot dentries */ 210 - 211 - #define MAX_INLINE_DATA (sizeof(__le32) * (DEF_ADDRS_PER_INODE - \ 212 - F2FS_INLINE_XATTR_ADDRS - 1)) 208 + #define F2FS_EXTRA_ATTR 0x20 /* file having extra attribute */ 213 209 214 210 struct f2fs_inode { 215 211 __le16 i_mode; /* file mode */ ··· 235 235 236 236 struct f2fs_extent i_ext; /* caching a largest extent */ 237 237 238 - __le32 i_addr[DEF_ADDRS_PER_INODE]; /* Pointers to data blocks */ 239 - 238 + union { 239 + struct { 240 + __le16 i_extra_isize; /* extra inode attribute size */ 241 + __le16 i_padding; /* padding */ 242 + __le32 i_projid; /* project id */ 243 + __le32 i_inode_checksum;/* inode meta checksum */ 244 + __le32 i_extra_end[0]; /* for attribute size calculation */ 245 + }; 246 + __le32 i_addr[DEF_ADDRS_PER_INODE]; /* Pointers to data blocks */ 247 + }; 240 248 __le32 i_nid[DEF_NIDS_PER_INODE]; /* direct(2), indirect(2), 241 249 double_indirect(1) node id */ 242 250 } __packed; ··· 473 465 #define MAX_DIR_BUCKETS (1 << ((MAX_DIR_HASH_DEPTH / 2) - 1)) 474 466 475 467 /* 476 - * space utilization of regular dentry and inline dentry 468 + * space utilization of regular dentry and inline dentry (w/o extra reservation) 477 469 * regular dentry inline dentry 478 470 * bitmap 1 * 27 = 27 1 * 23 = 23 479 471 * reserved 1 * 3 = 3 1 * 7 = 7 ··· 509 501 __u8 filename[NR_DENTRY_IN_BLOCK][F2FS_SLOT_LEN]; 510 502 } __packed; 511 503 512 - /* for inline dir */ 513 - #define NR_INLINE_DENTRY (MAX_INLINE_DATA * BITS_PER_BYTE / \ 514 - ((SIZE_OF_DIR_ENTRY + F2FS_SLOT_LEN) * \ 515 - BITS_PER_BYTE + 1)) 516 - #define INLINE_DENTRY_BITMAP_SIZE ((NR_INLINE_DENTRY + \ 517 - BITS_PER_BYTE - 1) / BITS_PER_BYTE) 518 - #define INLINE_RESERVED_SIZE (MAX_INLINE_DATA - \ 519 - ((SIZE_OF_DIR_ENTRY + F2FS_SLOT_LEN) * \ 520 - NR_INLINE_DENTRY + INLINE_DENTRY_BITMAP_SIZE)) 521 - 522 - /* inline directory entry structure */ 523 - struct f2fs_inline_dentry { 524 - __u8 dentry_bitmap[INLINE_DENTRY_BITMAP_SIZE]; 525 - __u8 reserved[INLINE_RESERVED_SIZE]; 526 - struct f2fs_dir_entry dentry[NR_INLINE_DENTRY]; 527 - __u8 filename[NR_INLINE_DENTRY][F2FS_SLOT_LEN]; 528 - } __packed; 529 - 530 504 /* file types used in inode_info->flags */ 531 505 enum { 532 506 F2FS_FT_UNKNOWN, ··· 523 533 }; 524 534 525 535 #define S_SHIFT 12 536 + 537 + #define F2FS_DEF_PROJID 0 /* default project ID */ 526 538 527 539 #endif /* _LINUX_F2FS_FS_H */
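In the on-disk format, f2fs_inode now overlays the front of i_addr[] with an optional extra-attribute area, so images without the feature keep all 923 block pointers while newer ones spend a few slots on i_extra_isize, i_projid and i_inode_checksum; CUR_ADDRS_PER_INODE() subtracts whatever was reserved. A compilable miniature of that union-overlay trick, with the array shrunk and the field set abbreviated (not the real layout):

#include <stdint.h>
#include <stdio.h>

#define ADDRS_PER_INODE 8	/* shrunk from 923 for the demo */

struct demo_inode {
	union {
		struct {
			uint16_t extra_isize;	/* bytes taken from i_addr */
			uint16_t padding;
			uint32_t projid;
			uint32_t inode_checksum;
		};
		uint32_t i_addr[ADDRS_PER_INODE];
	};
};

static int cur_addrs(const struct demo_inode *di)
{
	/* usable block pointers shrink by the reserved extra area */
	return ADDRS_PER_INODE - (int)(di->extra_isize / sizeof(uint32_t));
}

int main(void)
{
	struct demo_inode di = { .extra_isize = 12 };

	printf("usable address slots: %d of %d\n",
	       cur_addrs(&di), ADDRS_PER_INODE);
	return 0;
}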
+110 -3
include/trace/events/f2fs.h
··· 543 543 544 544 TRACE_EVENT(f2fs_background_gc, 545 545 546 - TP_PROTO(struct super_block *sb, long wait_ms, 546 + TP_PROTO(struct super_block *sb, unsigned int wait_ms, 547 547 unsigned int prefree, unsigned int free), 548 548 549 549 TP_ARGS(sb, wait_ms, prefree, free), 550 550 551 551 TP_STRUCT__entry( 552 552 __field(dev_t, dev) 553 - __field(long, wait_ms) 553 + __field(unsigned int, wait_ms) 554 554 __field(unsigned int, prefree) 555 555 __field(unsigned int, free) 556 556 ), ··· 562 562 __entry->free = free; 563 563 ), 564 564 565 - TP_printk("dev = (%d,%d), wait_ms = %ld, prefree = %u, free = %u", 565 + TP_printk("dev = (%d,%d), wait_ms = %u, prefree = %u, free = %u", 566 566 show_dev(__entry->dev), 567 567 __entry->wait_ms, 568 568 __entry->prefree, 569 569 __entry->free) 570 + ); 571 + 572 + TRACE_EVENT(f2fs_gc_begin, 573 + 574 + TP_PROTO(struct super_block *sb, bool sync, bool background, 575 + long long dirty_nodes, long long dirty_dents, 576 + long long dirty_imeta, unsigned int free_sec, 577 + unsigned int free_seg, int reserved_seg, 578 + unsigned int prefree_seg), 579 + 580 + TP_ARGS(sb, sync, background, dirty_nodes, dirty_dents, dirty_imeta, 581 + free_sec, free_seg, reserved_seg, prefree_seg), 582 + 583 + TP_STRUCT__entry( 584 + __field(dev_t, dev) 585 + __field(bool, sync) 586 + __field(bool, background) 587 + __field(long long, dirty_nodes) 588 + __field(long long, dirty_dents) 589 + __field(long long, dirty_imeta) 590 + __field(unsigned int, free_sec) 591 + __field(unsigned int, free_seg) 592 + __field(int, reserved_seg) 593 + __field(unsigned int, prefree_seg) 594 + ), 595 + 596 + TP_fast_assign( 597 + __entry->dev = sb->s_dev; 598 + __entry->sync = sync; 599 + __entry->background = background; 600 + __entry->dirty_nodes = dirty_nodes; 601 + __entry->dirty_dents = dirty_dents; 602 + __entry->dirty_imeta = dirty_imeta; 603 + __entry->free_sec = free_sec; 604 + __entry->free_seg = free_seg; 605 + __entry->reserved_seg = reserved_seg; 606 + __entry->prefree_seg = prefree_seg; 607 + ), 608 + 609 + TP_printk("dev = (%d,%d), sync = %d, background = %d, nodes = %lld, " 610 + "dents = %lld, imeta = %lld, free_sec:%u, free_seg:%u, " 611 + "rsv_seg:%d, prefree_seg:%u", 612 + show_dev(__entry->dev), 613 + __entry->sync, 614 + __entry->background, 615 + __entry->dirty_nodes, 616 + __entry->dirty_dents, 617 + __entry->dirty_imeta, 618 + __entry->free_sec, 619 + __entry->free_seg, 620 + __entry->reserved_seg, 621 + __entry->prefree_seg) 622 + ); 623 + 624 + TRACE_EVENT(f2fs_gc_end, 625 + 626 + TP_PROTO(struct super_block *sb, int ret, int seg_freed, 627 + int sec_freed, long long dirty_nodes, 628 + long long dirty_dents, long long dirty_imeta, 629 + unsigned int free_sec, unsigned int free_seg, 630 + int reserved_seg, unsigned int prefree_seg), 631 + 632 + TP_ARGS(sb, ret, seg_freed, sec_freed, dirty_nodes, dirty_dents, 633 + dirty_imeta, free_sec, free_seg, reserved_seg, prefree_seg), 634 + 635 + TP_STRUCT__entry( 636 + __field(dev_t, dev) 637 + __field(int, ret) 638 + __field(int, seg_freed) 639 + __field(int, sec_freed) 640 + __field(long long, dirty_nodes) 641 + __field(long long, dirty_dents) 642 + __field(long long, dirty_imeta) 643 + __field(unsigned int, free_sec) 644 + __field(unsigned int, free_seg) 645 + __field(int, reserved_seg) 646 + __field(unsigned int, prefree_seg) 647 + ), 648 + 649 + TP_fast_assign( 650 + __entry->dev = sb->s_dev; 651 + __entry->ret = ret; 652 + __entry->seg_freed = seg_freed; 653 + __entry->sec_freed = sec_freed; 654 + __entry->dirty_nodes 
= dirty_nodes; 655 + __entry->dirty_dents = dirty_dents; 656 + __entry->dirty_imeta = dirty_imeta; 657 + __entry->free_sec = free_sec; 658 + __entry->free_seg = free_seg; 659 + __entry->reserved_seg = reserved_seg; 660 + __entry->prefree_seg = prefree_seg; 661 + ), 662 + 663 + TP_printk("dev = (%d,%d), ret = %d, seg_freed = %d, sec_freed = %d, " 664 + "nodes = %lld, dents = %lld, imeta = %lld, free_sec:%u, " 665 + "free_seg:%u, rsv_seg:%d, prefree_seg:%u", 666 + show_dev(__entry->dev), 667 + __entry->ret, 668 + __entry->seg_freed, 669 + __entry->sec_freed, 670 + __entry->dirty_nodes, 671 + __entry->dirty_dents, 672 + __entry->dirty_imeta, 673 + __entry->free_sec, 674 + __entry->free_seg, 675 + __entry->reserved_seg, 676 + __entry->prefree_seg) 570 677 ); 571 678 572 679 TRACE_EVENT(f2fs_get_victim,
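The paired f2fs_gc_begin/f2fs_gc_end events bracket each GC pass with the dirty node/dent/imeta counts and free/reserved/prefree segment state, so a pass's effect can be read as the delta between the two records. A sketch that enables both events and tails the trace pipe; /sys/kernel/debug/tracing is the common tracefs mount point but may differ per system:

#include <stdio.h>

int main(void)
{
	static const char *events[] = {
		"/sys/kernel/debug/tracing/events/f2fs/f2fs_gc_begin/enable",
		"/sys/kernel/debug/tracing/events/f2fs/f2fs_gc_end/enable",
	};
	char line[256];
	FILE *f;
	int i;

	for (i = 0; i < 2; i++) {
		f = fopen(events[i], "w");
		if (f) {
			fputs("1", f);
			fclose(f);
		}
	}

	/* blocks until the next GC pass; each pass emits a begin/end pair */
	f = fopen("/sys/kernel/debug/tracing/trace_pipe", "r");
	if (!f)
		return 1;
	while (fgets(line, sizeof(line), f))
		fputs(line, stdout);
	fclose(f);
	return 0;
}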