
Merge tag 'f2fs-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
"In this cycle, we mainly fixed some corner cases that manipulate a
per-file compression flag inappropriately. We also found that f2fs
counted valid blocks in a section incorrectly when zone capacity is
set; that is fixed here, along with an additional sysfs entry to
check it easily.

Lastly, this series includes several patches related to the new
atomic write support, such as a couple of bug fixes and the
re-addition of atomic_write_abort support that was removed by mistake
in the previous release.

Enhancements:
- add sysfs entries to understand atomic write operations and zone
capacity
- introduce memory mode to get a hint for low-memory devices
- adjust the waiting time of foreground GC
- decompress clusters under softirq to avoid non-deterministic
latency
- do not skip updating inode when retrying to flush node page
- enforce single zone capacity

Bug fixes:
- set the compression/no-compression flags correctly
- revive F2FS_IOC_ABORT_VOLATILE_WRITE
- check inline_data during compressed inode conversion
- understand zone capacity when calculating valid block count

As usual, the series includes several minor clean-ups and sanity
checks"

* tag 'f2fs-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (29 commits)
f2fs: use onstack pages instead of pvec
f2fs: intorduce f2fs_all_cluster_page_ready
f2fs: clean up f2fs_abort_atomic_write()
f2fs: handle decompress only post processing in softirq
f2fs: do not allow to decompress files have FI_COMPRESS_RELEASED
f2fs: do not set compression bit if kernel doesn't support
f2fs: remove device type check for direct IO
f2fs: fix null-ptr-deref in f2fs_get_dnode_of_data
f2fs: revive F2FS_IOC_ABORT_VOLATILE_WRITE
f2fs: fix to do sanity check on segment type in build_sit_entries()
f2fs: obsolete unused MAX_DISCARD_BLOCKS
f2fs: fix to avoid use f2fs_bug_on() in f2fs_new_node_page()
f2fs: fix to remove F2FS_COMPR_FL and tag F2FS_NOCOMP_FL at the same time
f2fs: introduce sysfs atomic write statistics
f2fs: don't bother wait_ms by foreground gc
f2fs: invalidate meta pages only for post_read required inode
f2fs: allow compression of files without blocks
f2fs: fix to check inline_data during compressed inode conversion
f2fs: Delete f2fs_copy_page() and replace with memcpy_page()
f2fs: fix to invalidate META_MAPPING before DIO write
...

+562 -254
+30
Documentation/ABI/testing/sysfs-fs-f2fs
···
 Contact:	"Jaegeuk Kim" <jaegeuk@kernel.org>
 Description:	Controls max # of node block writes to be used for roll forward
 		recovery. This can limit the roll forward recovery time.
+
+What:		/sys/fs/f2fs/<disk>/unusable_blocks_per_sec
+Date:		June 2022
+Contact:	"Jaegeuk Kim" <jaegeuk@kernel.org>
+Description:	Shows the number of unusable blocks in a section which was defined by
+		the zone capacity reported by underlying zoned device.
+
+What:		/sys/fs/f2fs/<disk>/current_atomic_write
+Date:		July 2022
+Contact:	"Daeho Jeong" <daehojeong@google.com>
+Description:	Show the total current atomic write block count, which is not committed yet.
+		This is a read-only entry.
+
+What:		/sys/fs/f2fs/<disk>/peak_atomic_write
+Date:		July 2022
+Contact:	"Daeho Jeong" <daehojeong@google.com>
+Description:	Show the peak value of total current atomic write block count after boot.
+		If you write "0" here, you can initialize to "0".
+
+What:		/sys/fs/f2fs/<disk>/committed_atomic_block
+Date:		July 2022
+Contact:	"Daeho Jeong" <daehojeong@google.com>
+Description:	Show the accumulated total committed atomic write block count after boot.
+		If you write "0" here, you can initialize to "0".
+
+What:		/sys/fs/f2fs/<disk>/revoked_atomic_block
+Date:		July 2022
+Contact:	"Daeho Jeong" <daehojeong@google.com>
+Description:	Show the accumulated total revoked atomic write block count after boot.
+		If you write "0" here, you can initialize to "0".
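The four atomic-write counters above are plain decimal u64 sysfs files. A minimal user-space sketch of reading one of them; the only assumption is that the entry, when present, holds a single unsigned integer (the path is whichever `/sys/fs/f2fs/<disk>/` entry you choose):

```c
#include <stdio.h>

/* Read one u64-valued f2fs sysfs counter, e.g.
 * /sys/fs/f2fs/<disk>/current_atomic_write.
 * Returns 0 on success, -1 if the entry is missing
 * (older kernel, or not an f2fs disk). */
static int read_f2fs_counter(const char *path, unsigned long long *val)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%llu", val) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

Per the descriptions above, `peak_atomic_write`, `committed_atomic_block` and `revoked_atomic_block` can additionally be reset by writing "0" to them; `current_atomic_write` is read-only.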
+5
Documentation/filesystems/f2fs.rst
···
 	default, it is helpful for large sized SMR or ZNS devices to
 	reduce memory cost by getting rid of fs metadata supports small
 	discard.
+memory=%s		 Control memory mode. This supports "normal" and "low" modes.
+			 "low" mode is introduced to support low memory devices.
+			 Because of the nature of low memory devices, in this mode, f2fs
+			 will try to save memory sometimes by sacrificing performance.
+			 "normal" mode is the default mode and same as before.
 ======================== ============================================================

 Debugfs Entries
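A sketch of how the new option string maps to a mode. Only the two documented values exist; the enum constants follow the kernel's `MEMORY_MODE_*` spelling, but the parser itself is illustrative:

```c
#include <string.h>

enum f2fs_memory_mode {
	MEMORY_MODE_NORMAL,	/* default, same behavior as before */
	MEMORY_MODE_LOW,	/* save memory, possibly at a performance cost */
	MEMORY_MODE_INVALID,	/* illustrative: unknown option string */
};

/* Map the "memory=%s" mount option argument to a mode. */
static enum f2fs_memory_mode parse_memory_mode(const char *arg)
{
	if (strcmp(arg, "normal") == 0)
		return MEMORY_MODE_NORMAL;
	if (strcmp(arg, "low") == 0)
		return MEMORY_MODE_LOW;
	return MEMORY_MODE_INVALID;
}
```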
+153 -76
fs/f2fs/compress.c
··· 729 729 return ret; 730 730 } 731 731 732 - void f2fs_decompress_cluster(struct decompress_io_ctx *dic) 732 + static int f2fs_prepare_decomp_mem(struct decompress_io_ctx *dic, 733 + bool pre_alloc); 734 + static void f2fs_release_decomp_mem(struct decompress_io_ctx *dic, 735 + bool bypass_destroy_callback, bool pre_alloc); 736 + 737 + void f2fs_decompress_cluster(struct decompress_io_ctx *dic, bool in_task) 733 738 { 734 739 struct f2fs_sb_info *sbi = F2FS_I_SB(dic->inode); 735 740 struct f2fs_inode_info *fi = F2FS_I(dic->inode); 736 741 const struct f2fs_compress_ops *cops = 737 742 f2fs_cops[fi->i_compress_algorithm]; 743 + bool bypass_callback = false; 738 744 int ret; 739 - int i; 740 745 741 746 trace_f2fs_decompress_pages_start(dic->inode, dic->cluster_idx, 742 747 dic->cluster_size, fi->i_compress_algorithm); ··· 751 746 goto out_end_io; 752 747 } 753 748 754 - dic->tpages = page_array_alloc(dic->inode, dic->cluster_size); 755 - if (!dic->tpages) { 756 - ret = -ENOMEM; 757 - goto out_end_io; 758 - } 759 - 760 - for (i = 0; i < dic->cluster_size; i++) { 761 - if (dic->rpages[i]) { 762 - dic->tpages[i] = dic->rpages[i]; 763 - continue; 764 - } 765 - 766 - dic->tpages[i] = f2fs_compress_alloc_page(); 767 - if (!dic->tpages[i]) { 768 - ret = -ENOMEM; 769 - goto out_end_io; 770 - } 771 - } 772 - 773 - if (cops->init_decompress_ctx) { 774 - ret = cops->init_decompress_ctx(dic); 775 - if (ret) 776 - goto out_end_io; 777 - } 778 - 779 - dic->rbuf = f2fs_vmap(dic->tpages, dic->cluster_size); 780 - if (!dic->rbuf) { 781 - ret = -ENOMEM; 782 - goto out_destroy_decompress_ctx; 783 - } 784 - 785 - dic->cbuf = f2fs_vmap(dic->cpages, dic->nr_cpages); 786 - if (!dic->cbuf) { 787 - ret = -ENOMEM; 788 - goto out_vunmap_rbuf; 749 + ret = f2fs_prepare_decomp_mem(dic, false); 750 + if (ret) { 751 + bypass_callback = true; 752 + goto out_release; 789 753 } 790 754 791 755 dic->clen = le32_to_cpu(dic->cbuf->clen); ··· 762 788 763 789 if (dic->clen > PAGE_SIZE * dic->nr_cpages 
- COMPRESS_HEADER_SIZE) { 764 790 ret = -EFSCORRUPTED; 765 - goto out_vunmap_cbuf; 791 + goto out_release; 766 792 } 767 793 768 794 ret = cops->decompress_pages(dic); ··· 783 809 } 784 810 } 785 811 786 - out_vunmap_cbuf: 787 - vm_unmap_ram(dic->cbuf, dic->nr_cpages); 788 - out_vunmap_rbuf: 789 - vm_unmap_ram(dic->rbuf, dic->cluster_size); 790 - out_destroy_decompress_ctx: 791 - if (cops->destroy_decompress_ctx) 792 - cops->destroy_decompress_ctx(dic); 812 + out_release: 813 + f2fs_release_decomp_mem(dic, bypass_callback, false); 814 + 793 815 out_end_io: 794 816 trace_f2fs_decompress_pages_end(dic->inode, dic->cluster_idx, 795 817 dic->clen, ret); 796 - f2fs_decompress_end_io(dic, ret); 818 + f2fs_decompress_end_io(dic, ret, in_task); 797 819 } 798 820 799 821 /* ··· 799 829 * (or in the case of a failure, cleans up without actually decompressing). 800 830 */ 801 831 void f2fs_end_read_compressed_page(struct page *page, bool failed, 802 - block_t blkaddr) 832 + block_t blkaddr, bool in_task) 803 833 { 804 834 struct decompress_io_ctx *dic = 805 835 (struct decompress_io_ctx *)page_private(page); ··· 809 839 810 840 if (failed) 811 841 WRITE_ONCE(dic->failed, true); 812 - else if (blkaddr) 842 + else if (blkaddr && in_task) 813 843 f2fs_cache_compressed_page(sbi, page, 814 844 dic->inode->i_ino, blkaddr); 815 845 816 846 if (atomic_dec_and_test(&dic->remaining_pages)) 817 - f2fs_decompress_cluster(dic); 847 + f2fs_decompress_cluster(dic, in_task); 818 848 } 819 849 820 850 static bool is_page_in_cluster(struct compress_ctx *cc, pgoff_t index) ··· 841 871 return is_page_in_cluster(cc, index); 842 872 } 843 873 844 - bool f2fs_all_cluster_page_loaded(struct compress_ctx *cc, struct pagevec *pvec, 845 - int index, int nr_pages) 874 + bool f2fs_all_cluster_page_ready(struct compress_ctx *cc, struct page **pages, 875 + int index, int nr_pages, bool uptodate) 846 876 { 847 - unsigned long pgidx; 848 - int i; 877 + unsigned long pgidx = pages[index]->index; 878 + int i = 
uptodate ? 0 : 1; 879 + 880 + /* 881 + * when uptodate set to true, try to check all pages in cluster is 882 + * uptodate or not. 883 + */ 884 + if (uptodate && (pgidx % cc->cluster_size)) 885 + return false; 849 886 850 887 if (nr_pages - index < cc->cluster_size) 851 888 return false; 852 889 853 - pgidx = pvec->pages[index]->index; 854 - 855 - for (i = 1; i < cc->cluster_size; i++) { 856 - if (pvec->pages[index + i]->index != pgidx + i) 890 + for (; i < cc->cluster_size; i++) { 891 + if (pages[index + i]->index != pgidx + i) 892 + return false; 893 + if (uptodate && !PageUptodate(pages[index + i])) 857 894 return false; 858 895 } 859 896 ··· 1529 1552 return err; 1530 1553 } 1531 1554 1532 - static void f2fs_free_dic(struct decompress_io_ctx *dic); 1555 + static inline bool allow_memalloc_for_decomp(struct f2fs_sb_info *sbi, 1556 + bool pre_alloc) 1557 + { 1558 + return pre_alloc ^ f2fs_low_mem_mode(sbi); 1559 + } 1560 + 1561 + static int f2fs_prepare_decomp_mem(struct decompress_io_ctx *dic, 1562 + bool pre_alloc) 1563 + { 1564 + const struct f2fs_compress_ops *cops = 1565 + f2fs_cops[F2FS_I(dic->inode)->i_compress_algorithm]; 1566 + int i; 1567 + 1568 + if (!allow_memalloc_for_decomp(F2FS_I_SB(dic->inode), pre_alloc)) 1569 + return 0; 1570 + 1571 + dic->tpages = page_array_alloc(dic->inode, dic->cluster_size); 1572 + if (!dic->tpages) 1573 + return -ENOMEM; 1574 + 1575 + for (i = 0; i < dic->cluster_size; i++) { 1576 + if (dic->rpages[i]) { 1577 + dic->tpages[i] = dic->rpages[i]; 1578 + continue; 1579 + } 1580 + 1581 + dic->tpages[i] = f2fs_compress_alloc_page(); 1582 + if (!dic->tpages[i]) 1583 + return -ENOMEM; 1584 + } 1585 + 1586 + dic->rbuf = f2fs_vmap(dic->tpages, dic->cluster_size); 1587 + if (!dic->rbuf) 1588 + return -ENOMEM; 1589 + 1590 + dic->cbuf = f2fs_vmap(dic->cpages, dic->nr_cpages); 1591 + if (!dic->cbuf) 1592 + return -ENOMEM; 1593 + 1594 + if (cops->init_decompress_ctx) { 1595 + int ret = cops->init_decompress_ctx(dic); 1596 + 1597 + if 
(ret) 1598 + return ret; 1599 + } 1600 + 1601 + return 0; 1602 + } 1603 + 1604 + static void f2fs_release_decomp_mem(struct decompress_io_ctx *dic, 1605 + bool bypass_destroy_callback, bool pre_alloc) 1606 + { 1607 + const struct f2fs_compress_ops *cops = 1608 + f2fs_cops[F2FS_I(dic->inode)->i_compress_algorithm]; 1609 + 1610 + if (!allow_memalloc_for_decomp(F2FS_I_SB(dic->inode), pre_alloc)) 1611 + return; 1612 + 1613 + if (!bypass_destroy_callback && cops->destroy_decompress_ctx) 1614 + cops->destroy_decompress_ctx(dic); 1615 + 1616 + if (dic->cbuf) 1617 + vm_unmap_ram(dic->cbuf, dic->nr_cpages); 1618 + 1619 + if (dic->rbuf) 1620 + vm_unmap_ram(dic->rbuf, dic->cluster_size); 1621 + } 1622 + 1623 + static void f2fs_free_dic(struct decompress_io_ctx *dic, 1624 + bool bypass_destroy_callback); 1533 1625 1534 1626 struct decompress_io_ctx *f2fs_alloc_dic(struct compress_ctx *cc) 1535 1627 { 1536 1628 struct decompress_io_ctx *dic; 1537 1629 pgoff_t start_idx = start_idx_of_cluster(cc); 1538 - int i; 1630 + struct f2fs_sb_info *sbi = F2FS_I_SB(cc->inode); 1631 + int i, ret; 1539 1632 1540 - dic = f2fs_kmem_cache_alloc(dic_entry_slab, GFP_F2FS_ZERO, 1541 - false, F2FS_I_SB(cc->inode)); 1633 + dic = f2fs_kmem_cache_alloc(dic_entry_slab, GFP_F2FS_ZERO, false, sbi); 1542 1634 if (!dic) 1543 1635 return ERR_PTR(-ENOMEM); 1544 1636 ··· 1633 1587 dic->nr_rpages = cc->cluster_size; 1634 1588 1635 1589 dic->cpages = page_array_alloc(dic->inode, dic->nr_cpages); 1636 - if (!dic->cpages) 1590 + if (!dic->cpages) { 1591 + ret = -ENOMEM; 1637 1592 goto out_free; 1593 + } 1638 1594 1639 1595 for (i = 0; i < dic->nr_cpages; i++) { 1640 1596 struct page *page; 1641 1597 1642 1598 page = f2fs_compress_alloc_page(); 1643 - if (!page) 1599 + if (!page) { 1600 + ret = -ENOMEM; 1644 1601 goto out_free; 1602 + } 1645 1603 1646 1604 f2fs_set_compressed_page(page, cc->inode, 1647 1605 start_idx + i + 1, dic); 1648 1606 dic->cpages[i] = page; 1649 1607 } 1650 1608 1609 + ret = 
f2fs_prepare_decomp_mem(dic, true); 1610 + if (ret) 1611 + goto out_free; 1612 + 1651 1613 return dic; 1652 1614 1653 1615 out_free: 1654 - f2fs_free_dic(dic); 1655 - return ERR_PTR(-ENOMEM); 1616 + f2fs_free_dic(dic, true); 1617 + return ERR_PTR(ret); 1656 1618 } 1657 1619 1658 - static void f2fs_free_dic(struct decompress_io_ctx *dic) 1620 + static void f2fs_free_dic(struct decompress_io_ctx *dic, 1621 + bool bypass_destroy_callback) 1659 1622 { 1660 1623 int i; 1624 + 1625 + f2fs_release_decomp_mem(dic, bypass_destroy_callback, true); 1661 1626 1662 1627 if (dic->tpages) { 1663 1628 for (i = 0; i < dic->cluster_size; i++) { ··· 1694 1637 kmem_cache_free(dic_entry_slab, dic); 1695 1638 } 1696 1639 1697 - static void f2fs_put_dic(struct decompress_io_ctx *dic) 1640 + static void f2fs_late_free_dic(struct work_struct *work) 1698 1641 { 1699 - if (refcount_dec_and_test(&dic->refcnt)) 1700 - f2fs_free_dic(dic); 1642 + struct decompress_io_ctx *dic = 1643 + container_of(work, struct decompress_io_ctx, free_work); 1644 + 1645 + f2fs_free_dic(dic, false); 1646 + } 1647 + 1648 + static void f2fs_put_dic(struct decompress_io_ctx *dic, bool in_task) 1649 + { 1650 + if (refcount_dec_and_test(&dic->refcnt)) { 1651 + if (in_task) { 1652 + f2fs_free_dic(dic, false); 1653 + } else { 1654 + INIT_WORK(&dic->free_work, f2fs_late_free_dic); 1655 + queue_work(F2FS_I_SB(dic->inode)->post_read_wq, 1656 + &dic->free_work); 1657 + } 1658 + } 1701 1659 } 1702 1660 1703 1661 /* 1704 1662 * Update and unlock the cluster's pagecache pages, and release the reference to 1705 1663 * the decompress_io_ctx that was being held for I/O completion. 
1706 1664 */ 1707 - static void __f2fs_decompress_end_io(struct decompress_io_ctx *dic, bool failed) 1665 + static void __f2fs_decompress_end_io(struct decompress_io_ctx *dic, bool failed, 1666 + bool in_task) 1708 1667 { 1709 1668 int i; 1710 1669 ··· 1741 1668 unlock_page(rpage); 1742 1669 } 1743 1670 1744 - f2fs_put_dic(dic); 1671 + f2fs_put_dic(dic, in_task); 1745 1672 } 1746 1673 1747 1674 static void f2fs_verify_cluster(struct work_struct *work) ··· 1758 1685 SetPageError(rpage); 1759 1686 } 1760 1687 1761 - __f2fs_decompress_end_io(dic, false); 1688 + __f2fs_decompress_end_io(dic, false, true); 1762 1689 } 1763 1690 1764 1691 /* 1765 1692 * This is called when a compressed cluster has been decompressed 1766 1693 * (or failed to be read and/or decompressed). 1767 1694 */ 1768 - void f2fs_decompress_end_io(struct decompress_io_ctx *dic, bool failed) 1695 + void f2fs_decompress_end_io(struct decompress_io_ctx *dic, bool failed, 1696 + bool in_task) 1769 1697 { 1770 1698 if (!failed && dic->need_verity) { 1771 1699 /* ··· 1778 1704 INIT_WORK(&dic->verity_work, f2fs_verify_cluster); 1779 1705 fsverity_enqueue_verify_work(&dic->verity_work); 1780 1706 } else { 1781 - __f2fs_decompress_end_io(dic, failed); 1707 + __f2fs_decompress_end_io(dic, failed, in_task); 1782 1708 } 1783 1709 } 1784 1710 ··· 1787 1713 * 1788 1714 * This is called when the page is no longer needed and can be freed. 
1789 1715 */ 1790 - void f2fs_put_page_dic(struct page *page) 1716 + void f2fs_put_page_dic(struct page *page, bool in_task) 1791 1717 { 1792 1718 struct decompress_io_ctx *dic = 1793 1719 (struct decompress_io_ctx *)page_private(page); 1794 1720 1795 - f2fs_put_dic(dic); 1721 + f2fs_put_dic(dic, in_task); 1796 1722 } 1797 1723 1798 1724 /* ··· 1976 1902 { 1977 1903 dev_t dev = sbi->sb->s_bdev->bd_dev; 1978 1904 char slab_name[32]; 1905 + 1906 + if (!f2fs_sb_has_compression(sbi)) 1907 + return 0; 1979 1908 1980 1909 sprintf(slab_name, "f2fs_page_array_entry-%u:%u", MAJOR(dev), MINOR(dev)); 1981 1910
+50 -32
fs/f2fs/data.c
··· 119 119 block_t fs_blkaddr; 120 120 }; 121 121 122 - static void f2fs_finish_read_bio(struct bio *bio) 122 + static void f2fs_finish_read_bio(struct bio *bio, bool in_task) 123 123 { 124 124 struct bio_vec *bv; 125 125 struct bvec_iter_all iter_all; ··· 133 133 134 134 if (f2fs_is_compressed_page(page)) { 135 135 if (bio->bi_status) 136 - f2fs_end_read_compressed_page(page, true, 0); 137 - f2fs_put_page_dic(page); 136 + f2fs_end_read_compressed_page(page, true, 0, 137 + in_task); 138 + f2fs_put_page_dic(page, in_task); 138 139 continue; 139 140 } 140 141 ··· 192 191 fsverity_verify_bio(bio); 193 192 } 194 193 195 - f2fs_finish_read_bio(bio); 194 + f2fs_finish_read_bio(bio, true); 196 195 } 197 196 198 197 /* ··· 204 203 * can involve reading verity metadata pages from the file, and these verity 205 204 * metadata pages may be encrypted and/or compressed. 206 205 */ 207 - static void f2fs_verify_and_finish_bio(struct bio *bio) 206 + static void f2fs_verify_and_finish_bio(struct bio *bio, bool in_task) 208 207 { 209 208 struct bio_post_read_ctx *ctx = bio->bi_private; 210 209 ··· 212 211 INIT_WORK(&ctx->work, f2fs_verify_bio); 213 212 fsverity_enqueue_verify_work(&ctx->work); 214 213 } else { 215 - f2fs_finish_read_bio(bio); 214 + f2fs_finish_read_bio(bio, in_task); 216 215 } 217 216 } 218 217 ··· 225 224 * that the bio includes at least one compressed page. The actual decompression 226 225 * is done on a per-cluster basis, not a per-bio basis. 227 226 */ 228 - static void f2fs_handle_step_decompress(struct bio_post_read_ctx *ctx) 227 + static void f2fs_handle_step_decompress(struct bio_post_read_ctx *ctx, 228 + bool in_task) 229 229 { 230 230 struct bio_vec *bv; 231 231 struct bvec_iter_all iter_all; ··· 239 237 /* PG_error was set if decryption failed. 
*/ 240 238 if (f2fs_is_compressed_page(page)) 241 239 f2fs_end_read_compressed_page(page, PageError(page), 242 - blkaddr); 240 + blkaddr, in_task); 243 241 else 244 242 all_compressed = false; 245 243 ··· 264 262 fscrypt_decrypt_bio(ctx->bio); 265 263 266 264 if (ctx->enabled_steps & STEP_DECOMPRESS) 267 - f2fs_handle_step_decompress(ctx); 265 + f2fs_handle_step_decompress(ctx, true); 268 266 269 - f2fs_verify_and_finish_bio(ctx->bio); 267 + f2fs_verify_and_finish_bio(ctx->bio, true); 270 268 } 271 269 272 270 static void f2fs_read_end_io(struct bio *bio) 273 271 { 274 272 struct f2fs_sb_info *sbi = F2FS_P_SB(bio_first_page_all(bio)); 275 273 struct bio_post_read_ctx *ctx; 274 + bool intask = in_task(); 276 275 277 276 iostat_update_and_unbind_ctx(bio, 0); 278 277 ctx = bio->bi_private; ··· 284 281 } 285 282 286 283 if (bio->bi_status) { 287 - f2fs_finish_read_bio(bio); 284 + f2fs_finish_read_bio(bio, intask); 288 285 return; 289 286 } 290 287 291 - if (ctx && (ctx->enabled_steps & (STEP_DECRYPT | STEP_DECOMPRESS))) { 292 - INIT_WORK(&ctx->work, f2fs_post_read_work); 293 - queue_work(ctx->sbi->post_read_wq, &ctx->work); 294 - } else { 295 - f2fs_verify_and_finish_bio(bio); 288 + if (ctx) { 289 + unsigned int enabled_steps = ctx->enabled_steps & 290 + (STEP_DECRYPT | STEP_DECOMPRESS); 291 + 292 + /* 293 + * If we have only decompression step between decompression and 294 + * decrypt, we don't need post processing for this. 
295 + */ 296 + if (enabled_steps == STEP_DECOMPRESS && 297 + !f2fs_low_mem_mode(sbi)) { 298 + f2fs_handle_step_decompress(ctx, intask); 299 + } else if (enabled_steps) { 300 + INIT_WORK(&ctx->work, f2fs_post_read_work); 301 + queue_work(ctx->sbi->post_read_wq, &ctx->work); 302 + return; 303 + } 296 304 } 305 + 306 + f2fs_verify_and_finish_bio(bio, intask); 297 307 } 298 308 299 309 static void f2fs_write_end_io(struct bio *bio) ··· 1698 1682 */ 1699 1683 f2fs_wait_on_block_writeback_range(inode, 1700 1684 map->m_pblk, map->m_len); 1701 - invalidate_mapping_pages(META_MAPPING(sbi), 1702 - map->m_pblk, map->m_pblk); 1703 1685 1704 1686 if (map->m_multidev_dio) { 1705 1687 block_t blk_addr = map->m_pblk; ··· 2237 2223 2238 2224 if (f2fs_load_compressed_page(sbi, page, blkaddr)) { 2239 2225 if (atomic_dec_and_test(&dic->remaining_pages)) 2240 - f2fs_decompress_cluster(dic); 2226 + f2fs_decompress_cluster(dic, true); 2241 2227 continue; 2242 2228 } 2243 2229 ··· 2255 2241 page->index, for_write); 2256 2242 if (IS_ERR(bio)) { 2257 2243 ret = PTR_ERR(bio); 2258 - f2fs_decompress_end_io(dic, ret); 2244 + f2fs_decompress_end_io(dic, ret, true); 2259 2245 f2fs_put_dnode(&dn); 2260 2246 *bio_ret = NULL; 2261 2247 return ret; ··· 2745 2731 .submitted = false, 2746 2732 .compr_blocks = compr_blocks, 2747 2733 .need_lock = LOCK_RETRY, 2734 + .post_read = f2fs_post_read_required(inode), 2748 2735 .io_type = io_type, 2749 2736 .io_wbc = wbc, 2750 2737 .bio = bio, ··· 2917 2902 { 2918 2903 int ret = 0; 2919 2904 int done = 0, retry = 0; 2920 - struct pagevec pvec; 2905 + struct page *pages[F2FS_ONSTACK_PAGES]; 2921 2906 struct f2fs_sb_info *sbi = F2FS_M_SB(mapping); 2922 2907 struct bio *bio = NULL; 2923 2908 sector_t last_block; ··· 2948 2933 int submitted = 0; 2949 2934 int i; 2950 2935 2951 - pagevec_init(&pvec); 2952 - 2953 2936 if (get_dirty_pages(mapping->host) <= 2954 2937 SM_I(F2FS_M_SB(mapping))->min_hot_blocks) 2955 2938 set_inode_flag(mapping->host, FI_HOT_DATA); ··· 
2973 2960 tag_pages_for_writeback(mapping, index, end); 2974 2961 done_index = index; 2975 2962 while (!done && !retry && (index <= end)) { 2976 - nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end, 2977 - tag); 2963 + nr_pages = find_get_pages_range_tag(mapping, &index, end, 2964 + tag, F2FS_ONSTACK_PAGES, pages); 2978 2965 if (nr_pages == 0) 2979 2966 break; 2980 2967 2981 2968 for (i = 0; i < nr_pages; i++) { 2982 - struct page *page = pvec.pages[i]; 2969 + struct page *page = pages[i]; 2983 2970 bool need_readd; 2984 2971 readd: 2985 2972 need_readd = false; ··· 3010 2997 if (!f2fs_cluster_is_empty(&cc)) 3011 2998 goto lock_page; 3012 2999 3000 + if (f2fs_all_cluster_page_ready(&cc, 3001 + pages, i, nr_pages, true)) 3002 + goto lock_page; 3003 + 3013 3004 ret2 = f2fs_prepare_compress_overwrite( 3014 3005 inode, &pagep, 3015 3006 page->index, &fsdata); ··· 3024 3007 } else if (ret2 && 3025 3008 (!f2fs_compress_write_end(inode, 3026 3009 fsdata, page->index, 1) || 3027 - !f2fs_all_cluster_page_loaded(&cc, 3028 - &pvec, i, nr_pages))) { 3010 + !f2fs_all_cluster_page_ready(&cc, 3011 + pages, i, nr_pages, false))) { 3029 3012 retry = 1; 3030 3013 break; 3031 3014 } ··· 3115 3098 if (need_readd) 3116 3099 goto readd; 3117 3100 } 3118 - pagevec_release(&pvec); 3101 + release_pages(pages, nr_pages); 3119 3102 cond_resched(); 3120 3103 } 3121 3104 #ifdef CONFIG_F2FS_FS_COMPRESSION ··· 3425 3408 struct inode *cow_inode = F2FS_I(inode)->cow_inode; 3426 3409 pgoff_t index = page->index; 3427 3410 int err = 0; 3428 - block_t ori_blk_addr; 3411 + block_t ori_blk_addr = NULL_ADDR; 3429 3412 3430 3413 /* If pos is beyond the end of file, reserve a new block in COW inode */ 3431 3414 if ((pos & PAGE_MASK) >= i_size_read(inode)) 3432 - return __reserve_data_block(cow_inode, index, blk_addr, 3433 - node_changed); 3415 + goto reserve_block; 3434 3416 3435 3417 /* Look for the block in COW inode first */ 3436 3418 err = __find_data_block(cow_inode, index, blk_addr); 
··· 3443 3427 if (err) 3444 3428 return err; 3445 3429 3430 + reserve_block: 3446 3431 /* Finally, we should reserve a new block in COW inode for the update */ 3447 3432 err = __reserve_data_block(cow_inode, index, blk_addr, node_changed); 3448 3433 if (err) 3449 3434 return err; 3435 + inc_atomic_write_cnt(inode); 3450 3436 3451 3437 if (ori_blk_addr != NULL_ADDR) 3452 3438 *blk_addr = ori_blk_addr;
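The reworked `f2fs_read_end_io()` in the data.c hunk above distinguishes three cases: decryption always goes to the post-read workqueue, decompression alone may run directly in the completion (softirq) context, and `memory=low` forces even that onto the workqueue so the pre-allocated, may-sleep path is used. A small sketch of that decision, reusing the `STEP_*` bit names; the enum is illustrative:

```c
#include <stdbool.h>

#define STEP_DECRYPT	0x1
#define STEP_DECOMPRESS	0x2

enum post_read_path {
	POST_READ_NONE,		/* finish the bio directly */
	POST_READ_INLINE,	/* decompress in the completion context */
	POST_READ_WORKQUEUE,	/* defer to sbi->post_read_wq */
};

/* Sketch of the dispatch in f2fs_read_end_io(). */
static enum post_read_path choose_post_read(unsigned int steps, bool low_mem)
{
	steps &= STEP_DECRYPT | STEP_DECOMPRESS;
	if (steps == STEP_DECOMPRESS && !low_mem)
		return POST_READ_INLINE;
	if (steps)
		return POST_READ_WORKQUEUE;
	return POST_READ_NONE;
}
```

This is what buys the "decompress clusters under softirq" latency improvement from the summary: the common read path skips a workqueue round trip.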
+1 -1
fs/f2fs/debug.c
···
 	bimodal = 0;
 	total_vblocks = 0;
-	blks_per_sec = BLKS_PER_SEC(sbi);
+	blks_per_sec = CAP_BLKS_PER_SEC(sbi);
 	hblks_per_sec = blks_per_sec / 2;
 	for (segno = 0; segno < MAIN_SEGS(sbi); segno += sbi->segs_per_sec) {
 		vblocks = get_valid_blocks(sbi, segno, true);
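The one-line debug.c fix above switches the bimodality statistic to the capacity-limited section size. Conceptually (parameter names are illustrative, not the kernel's), `CAP_BLKS_PER_SEC` is just the raw section size minus the unusable tail that a zoned device reports when zone capacity is smaller than zone size:

```c
/* Capacity-aware section size: subtract the blocks beyond the zone
 * capacity (exported via the unusable_blocks_per_sec sysfs entry)
 * from the nominal blocks-per-section. */
static unsigned int cap_blks_per_sec(unsigned int blks_per_sec,
				     unsigned int unusable_blks_per_sec)
{
	return blks_per_sec - unusable_blks_per_sec;
}
```

Using the nominal size here was the "understand zone capacity when calculating valid block count" bug from the summary: sections could never appear full, skewing the statistics.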
+73 -29
fs/f2fs/f2fs.h
··· 159 159 int fsync_mode; /* fsync policy */ 160 160 int fs_mode; /* fs mode: LFS or ADAPTIVE */ 161 161 int bggc_mode; /* bggc mode: off, on or sync */ 162 + int memory_mode; /* memory mode */ 162 163 int discard_unit; /* 163 164 * discard command's offset/size should 164 165 * be aligned to this unit: block, ··· 230 229 #define CP_PAUSE 0x00000040 231 230 #define CP_RESIZE 0x00000080 232 231 233 - #define MAX_DISCARD_BLOCKS(sbi) BLKS_PER_SEC(sbi) 234 232 #define DEF_MAX_DISCARD_REQUEST 8 /* issue 8 discards per round */ 235 233 #define DEF_MIN_DISCARD_ISSUE_TIME 50 /* 50 ms, if exists */ 236 234 #define DEF_MID_DISCARD_ISSUE_TIME 500 /* 500 ms, if device busy */ ··· 598 598 #define RECOVERY_MAX_RA_BLOCKS BIO_MAX_VECS 599 599 #define RECOVERY_MIN_RA_BLOCKS 1 600 600 601 + #define F2FS_ONSTACK_PAGES 16 /* nr of onstack pages */ 602 + 601 603 struct rb_entry { 602 604 struct rb_node rb_node; /* rb node located in rb-tree */ 603 605 union { ··· 759 757 FI_ENABLE_COMPRESS, /* enable compression in "user" compression mode */ 760 758 FI_COMPRESS_RELEASED, /* compressed blocks were released */ 761 759 FI_ALIGNED_WRITE, /* enable aligned write */ 760 + FI_COW_FILE, /* indicate COW file */ 762 761 FI_MAX, /* max flag, never be used */ 763 762 }; 764 763 ··· 815 812 unsigned char i_compress_level; /* compress level (lz4hc,zstd) */ 816 813 unsigned short i_compress_flag; /* compress flag */ 817 814 unsigned int i_cluster_size; /* cluster size */ 815 + 816 + unsigned int atomic_write_cnt; 818 817 }; 819 818 820 819 static inline void get_extent_info(struct extent_info *ext, ··· 1203 1198 bool retry; /* need to reallocate block address */ 1204 1199 int compr_blocks; /* # of compressed block addresses */ 1205 1200 bool encrypted; /* indicate file is encrypted */ 1201 + bool post_read; /* require post read */ 1206 1202 enum iostat_type io_type; /* io type */ 1207 1203 struct writeback_control *io_wbc; /* writeback control */ 1208 1204 struct bio **bio; /* bio for ipu */ ··· 
1240 1234 #ifdef CONFIG_BLK_DEV_ZONED 1241 1235 unsigned int nr_blkz; /* Total number of zones */ 1242 1236 unsigned long *blkz_seq; /* Bitmap indicating sequential zones */ 1243 - block_t *zone_capacity_blocks; /* Array of zone capacity in blks */ 1244 1237 #endif 1245 1238 }; 1246 1239 ··· 1364 1359 DISCARD_UNIT_SEGMENT, /* basic discard unit is segment */ 1365 1360 DISCARD_UNIT_SECTION, /* basic discard unit is section */ 1366 1361 }; 1362 + 1363 + enum { 1364 + MEMORY_MODE_NORMAL, /* memory mode for normal devices */ 1365 + MEMORY_MODE_LOW, /* memory mode for low memry devices */ 1366 + }; 1367 + 1368 + 1367 1369 1368 1370 static inline int f2fs_test_bit(unsigned int nr, char *addr); 1369 1371 static inline void f2fs_set_bit(unsigned int nr, char *addr); ··· 1592 1580 void *private; /* payload buffer for specified decompression algorithm */ 1593 1581 void *private2; /* extra payload buffer */ 1594 1582 struct work_struct verity_work; /* work to verify the decompressed pages */ 1583 + struct work_struct free_work; /* work for late free this structure itself */ 1595 1584 }; 1596 1585 1597 1586 #define NULL_CLUSTER ((unsigned int)(~0)) ··· 1677 1664 unsigned int meta_ino_num; /* meta inode number*/ 1678 1665 unsigned int log_blocks_per_seg; /* log2 blocks per segment */ 1679 1666 unsigned int blocks_per_seg; /* blocks per segment */ 1667 + unsigned int unusable_blocks_per_sec; /* unusable blocks per section */ 1680 1668 unsigned int segs_per_sec; /* segments per section */ 1681 1669 unsigned int secs_per_zone; /* sections per zone */ 1682 1670 unsigned int total_sections; /* total section count */ ··· 1817 1803 1818 1804 int max_fragment_chunk; /* max chunk size for block fragmentation mode */ 1819 1805 int max_fragment_hole; /* max hole size for block fragmentation mode */ 1806 + 1807 + /* For atomic write statistics */ 1808 + atomic64_t current_atomic_write; 1809 + s64 peak_atomic_write; 1810 + u64 committed_atomic_block; 1811 + u64 revoked_atomic_block; 1820 
1812 1821 1813 #ifdef CONFIG_F2FS_FS_COMPRESSION 1822 1814 struct kmem_cache *page_array_slab; /* page array entry */ ··· 2438 2418 dec_page_count(F2FS_I_SB(inode), F2FS_DIRTY_QDATA); 2439 2419 } 2440 2420 2421 + static inline void inc_atomic_write_cnt(struct inode *inode) 2422 + { 2423 + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); 2424 + struct f2fs_inode_info *fi = F2FS_I(inode); 2425 + u64 current_write; 2426 + 2427 + fi->atomic_write_cnt++; 2428 + atomic64_inc(&sbi->current_atomic_write); 2429 + current_write = atomic64_read(&sbi->current_atomic_write); 2430 + if (current_write > sbi->peak_atomic_write) 2431 + sbi->peak_atomic_write = current_write; 2432 + } 2433 + 2434 + static inline void release_atomic_write_cnt(struct inode *inode) 2435 + { 2436 + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); 2437 + struct f2fs_inode_info *fi = F2FS_I(inode); 2438 + 2439 + atomic64_sub(fi->atomic_write_cnt, &sbi->current_atomic_write); 2440 + fi->atomic_write_cnt = 0; 2441 + } 2442 + 2441 2443 static inline s64 get_pages(struct f2fs_sb_info *sbi, int count_type) 2442 2444 { 2443 2445 return atomic_read(&sbi->nr_pages[count_type]); ··· 2736 2694 } 2737 2695 2738 2696 return pagecache_get_page(mapping, index, fgp_flags, gfp_mask); 2739 - } 2740 - 2741 - static inline void f2fs_copy_page(struct page *src, struct page *dst) 2742 - { 2743 - char *src_kaddr = kmap(src); 2744 - char *dst_kaddr = kmap(dst); 2745 - 2746 - memcpy(dst_kaddr, src_kaddr, PAGE_SIZE); 2747 - kunmap(dst); 2748 - kunmap(src); 2749 2697 } 2750 2698 2751 2699 static inline void f2fs_put_page(struct page *page, int unlock) ··· 3238 3206 static inline bool f2fs_is_atomic_file(struct inode *inode) 3239 3207 { 3240 3208 return is_inode_flag_set(inode, FI_ATOMIC_FILE); 3209 + } 3210 + 3211 + static inline bool f2fs_is_cow_file(struct inode *inode) 3212 + { 3213 + return is_inode_flag_set(inode, FI_COW_FILE); 3241 3214 } 3242 3215 3243 3216 static inline bool f2fs_is_first_block_written(struct inode *inode) ··· 
4191 4154 bool f2fs_is_compress_backend_ready(struct inode *inode); 4192 4155 int f2fs_init_compress_mempool(void); 4193 4156 void f2fs_destroy_compress_mempool(void); 4194 - void f2fs_decompress_cluster(struct decompress_io_ctx *dic); 4157 + void f2fs_decompress_cluster(struct decompress_io_ctx *dic, bool in_task); 4195 4158 void f2fs_end_read_compressed_page(struct page *page, bool failed, 4196 - block_t blkaddr); 4159 + block_t blkaddr, bool in_task); 4197 4160 bool f2fs_cluster_is_empty(struct compress_ctx *cc); 4198 4161 bool f2fs_cluster_can_merge_page(struct compress_ctx *cc, pgoff_t index); 4199 - bool f2fs_all_cluster_page_loaded(struct compress_ctx *cc, struct pagevec *pvec, 4200 - int index, int nr_pages); 4162 + bool f2fs_all_cluster_page_ready(struct compress_ctx *cc, struct page **pages, 4163 + int index, int nr_pages, bool uptodate); 4201 4164 bool f2fs_sanity_check_cluster(struct dnode_of_data *dn); 4202 4165 void f2fs_compress_ctx_add_page(struct compress_ctx *cc, struct page *page); 4203 4166 int f2fs_write_multi_pages(struct compress_ctx *cc, ··· 4212 4175 unsigned nr_pages, sector_t *last_block_in_bio, 4213 4176 bool is_readahead, bool for_write); 4214 4177 struct decompress_io_ctx *f2fs_alloc_dic(struct compress_ctx *cc); 4215 - void f2fs_decompress_end_io(struct decompress_io_ctx *dic, bool failed); 4216 - void f2fs_put_page_dic(struct page *page); 4178 + void f2fs_decompress_end_io(struct decompress_io_ctx *dic, bool failed, 4179 + bool in_task); 4180 + void f2fs_put_page_dic(struct page *page, bool in_task); 4217 4181 unsigned int f2fs_cluster_blocks_are_contiguous(struct dnode_of_data *dn); 4218 4182 int f2fs_init_compress_ctx(struct compress_ctx *cc); 4219 4183 void f2fs_destroy_compress_ctx(struct compress_ctx *cc, bool reuse); ··· 4260 4222 } 4261 4223 static inline int f2fs_init_compress_mempool(void) { return 0; } 4262 4224 static inline void f2fs_destroy_compress_mempool(void) { } 4263 - static inline void 
f2fs_decompress_cluster(struct decompress_io_ctx *dic) { } 4225 + static inline void f2fs_decompress_cluster(struct decompress_io_ctx *dic, 4226 + bool in_task) { } 4264 4227 static inline void f2fs_end_read_compressed_page(struct page *page, 4265 - bool failed, block_t blkaddr) 4228 + bool failed, block_t blkaddr, bool in_task) 4266 4229 { 4267 4230 WARN_ON_ONCE(1); 4268 4231 } 4269 - static inline void f2fs_put_page_dic(struct page *page) 4232 + static inline void f2fs_put_page_dic(struct page *page, bool in_task) 4270 4233 { 4271 4234 WARN_ON_ONCE(1); 4272 4235 } ··· 4293 4254 unsigned int c_len) { } 4294 4255 #endif 4295 4256 4296 - static inline void set_compress_context(struct inode *inode) 4257 + static inline int set_compress_context(struct inode *inode) 4297 4258 { 4259 + #ifdef CONFIG_F2FS_FS_COMPRESSION 4298 4260 struct f2fs_sb_info *sbi = F2FS_I_SB(inode); 4299 4261 4300 4262 F2FS_I(inode)->i_compress_algorithm = ··· 4318 4278 stat_inc_compr_inode(inode); 4319 4279 inc_compr_inode_stat(inode); 4320 4280 f2fs_mark_inode_dirty_sync(inode, true); 4281 + return 0; 4282 + #else 4283 + return -EOPNOTSUPP; 4284 + #endif 4321 4285 } 4322 4286 4323 4287 static inline bool f2fs_disable_compressed_file(struct inode *inode) ··· 4438 4394 return F2FS_OPTION(sbi).fs_mode == FS_MODE_LFS; 4439 4395 } 4440 4396 4397 + static inline bool f2fs_low_mem_mode(struct f2fs_sb_info *sbi) 4398 + { 4399 + return F2FS_OPTION(sbi).memory_mode == MEMORY_MODE_LOW; 4400 + } 4401 + 4441 4402 static inline bool f2fs_may_compress(struct inode *inode) 4442 4403 { 4443 4404 if (IS_SWAPFILE(inode) || f2fs_is_pinned_file(inode) || 4444 - f2fs_is_atomic_file(inode)) 4405 + f2fs_is_atomic_file(inode) || f2fs_has_inline_data(inode)) 4445 4406 return false; 4446 4407 return S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode); 4447 4408 } ··· 4508 4459 /* disallow direct IO if any of devices has unaligned blksize */ 4509 4460 if (f2fs_is_multi_device(sbi) && !sbi->aligned_blksize) 4510 4461 return 
true; 4511 - /* 4512 - * for blkzoned device, fallback direct IO to buffered IO, so 4513 - * all IOs can be serialized by log-structured write. 4514 - */ 4515 - if (f2fs_sb_has_blkzoned(sbi)) 4516 - return true; 4462 + 4517 4463 if (f2fs_lfs_mode(sbi) && (rw == WRITE)) { 4518 4464 if (block_unaligned_IO(inode, iocb, iter)) 4519 4465 return true;
+53 -26
fs/f2fs/file.c
··· 1272 1272 f2fs_put_page(psrc, 1); 1273 1273 return PTR_ERR(pdst); 1274 1274 } 1275 - f2fs_copy_page(psrc, pdst); 1275 + memcpy_page(pdst, 0, psrc, 0, PAGE_SIZE); 1276 1276 set_page_dirty(pdst); 1277 1277 f2fs_put_page(pdst, 1); 1278 1278 f2fs_put_page(psrc, 1); ··· 1675 1675 return 0; 1676 1676 1677 1677 if (f2fs_is_pinned_file(inode)) { 1678 - block_t sec_blks = BLKS_PER_SEC(sbi); 1678 + block_t sec_blks = CAP_BLKS_PER_SEC(sbi); 1679 1679 block_t sec_len = roundup(map.m_len, sec_blks); 1680 1680 1681 1681 map.m_len = sec_blks; ··· 1816 1816 atomic_read(&inode->i_writecount) != 1) 1817 1817 return 0; 1818 1818 1819 - if (f2fs_is_atomic_file(inode)) 1820 - f2fs_abort_atomic_write(inode, true); 1819 + f2fs_abort_atomic_write(inode, true); 1821 1820 return 0; 1822 1821 } 1823 1822 ··· 1830 1831 * until all the writers close its file. Since this should be done 1831 1832 * before dropping file lock, it needs to do in ->flush. 1832 1833 */ 1833 - if (f2fs_is_atomic_file(inode) && 1834 - F2FS_I(inode)->atomic_write_task == current) 1834 + if (F2FS_I(inode)->atomic_write_task == current) 1835 1835 f2fs_abort_atomic_write(inode, true); 1836 1836 return 0; 1837 1837 } ··· 1865 1867 if (masked_flags & F2FS_COMPR_FL) { 1866 1868 if (!f2fs_disable_compressed_file(inode)) 1867 1869 return -EINVAL; 1868 - } 1869 - if (iflags & F2FS_NOCOMP_FL) 1870 - return -EINVAL; 1871 - if (iflags & F2FS_COMPR_FL) { 1870 + } else { 1872 1871 if (!f2fs_may_compress(inode)) 1873 1872 return -EINVAL; 1874 - if (S_ISREG(inode->i_mode) && inode->i_size) 1873 + if (S_ISREG(inode->i_mode) && F2FS_HAS_BLOCKS(inode)) 1875 1874 return -EINVAL; 1876 - 1877 - set_compress_context(inode); 1875 + if (set_compress_context(inode)) 1876 + return -EOPNOTSUPP; 1878 1877 } 1879 - } 1880 - if ((iflags ^ masked_flags) & F2FS_NOCOMP_FL) { 1881 - if (masked_flags & F2FS_COMPR_FL) 1882 - return -EINVAL; 1883 1878 } 1884 1879 1885 1880 fi->i_flags = iflags | (fi->i_flags & ~mask); ··· 2053 2062 
spin_unlock(&sbi->inode_lock[ATOMIC_FILE]); 2054 2063 2055 2064 set_inode_flag(inode, FI_ATOMIC_FILE); 2056 - set_inode_flag(fi->cow_inode, FI_ATOMIC_FILE); 2065 + set_inode_flag(fi->cow_inode, FI_COW_FILE); 2057 2066 clear_inode_flag(fi->cow_inode, FI_INLINE_DATA); 2058 2067 f2fs_up_write(&fi->i_gc_rwsem[WRITE]); 2059 2068 2060 2069 f2fs_update_time(sbi, REQ_TIME); 2061 2070 fi->atomic_write_task = current; 2062 2071 stat_update_max_atomic_write(inode); 2072 + fi->atomic_write_cnt = 0; 2063 2073 out: 2064 2074 inode_unlock(inode); 2065 2075 mnt_drop_write_file(filp); ··· 2098 2106 unlock_out: 2099 2107 inode_unlock(inode); 2100 2108 mnt_drop_write_file(filp); 2109 + return ret; 2110 + } 2111 + 2112 + static int f2fs_ioc_abort_atomic_write(struct file *filp) 2113 + { 2114 + struct inode *inode = file_inode(filp); 2115 + struct user_namespace *mnt_userns = file_mnt_user_ns(filp); 2116 + int ret; 2117 + 2118 + if (!inode_owner_or_capable(mnt_userns, inode)) 2119 + return -EACCES; 2120 + 2121 + ret = mnt_want_write_file(filp); 2122 + if (ret) 2123 + return ret; 2124 + 2125 + inode_lock(inode); 2126 + 2127 + f2fs_abort_atomic_write(inode, true); 2128 + 2129 + inode_unlock(inode); 2130 + 2131 + mnt_drop_write_file(filp); 2132 + f2fs_update_time(F2FS_I_SB(inode), REQ_TIME); 2101 2133 return ret; 2102 2134 } 2103 2135 ··· 2442 2426 ret = -EAGAIN; 2443 2427 goto out; 2444 2428 } 2445 - range->start += BLKS_PER_SEC(sbi); 2429 + range->start += CAP_BLKS_PER_SEC(sbi); 2446 2430 if (range->start <= end) 2447 2431 goto do_more; 2448 2432 out: ··· 2567 2551 goto out; 2568 2552 } 2569 2553 2570 - sec_num = DIV_ROUND_UP(total, BLKS_PER_SEC(sbi)); 2554 + sec_num = DIV_ROUND_UP(total, CAP_BLKS_PER_SEC(sbi)); 2571 2555 2572 2556 /* 2573 2557 * make sure there are enough free section for LFS allocation, this can ··· 3913 3897 3914 3898 for (i = 0; i < page_len; i++, redirty_idx++) { 3915 3899 page = find_lock_page(mapping, redirty_idx); 3916 - if (!page) { 3917 - ret = -ENOMEM; 3918 - 
break; 3919 - } 3900 + 3901 + /* It will never fail, when page has pinned above */ 3902 + f2fs_bug_on(F2FS_I_SB(inode), !page); 3903 + 3920 3904 set_page_dirty(page); 3921 3905 f2fs_put_page(page, 1); 3922 3906 f2fs_put_page(page, 0); ··· 3952 3936 3953 3937 if (!f2fs_is_compress_backend_ready(inode)) { 3954 3938 ret = -EOPNOTSUPP; 3939 + goto out; 3940 + } 3941 + 3942 + if (is_inode_flag_set(inode, FI_COMPRESS_RELEASED)) { 3943 + ret = -EINVAL; 3955 3944 goto out; 3956 3945 } 3957 3946 ··· 4027 4006 goto out; 4028 4007 } 4029 4008 4009 + if (is_inode_flag_set(inode, FI_COMPRESS_RELEASED)) { 4010 + ret = -EINVAL; 4011 + goto out; 4012 + } 4013 + 4030 4014 ret = filemap_write_and_wait_range(inode->i_mapping, 0, LLONG_MAX); 4031 4015 if (ret) 4032 4016 goto out; ··· 4080 4054 return f2fs_ioc_start_atomic_write(filp); 4081 4055 case F2FS_IOC_COMMIT_ATOMIC_WRITE: 4082 4056 return f2fs_ioc_commit_atomic_write(filp); 4057 + case F2FS_IOC_ABORT_ATOMIC_WRITE: 4058 + return f2fs_ioc_abort_atomic_write(filp); 4083 4059 case F2FS_IOC_START_VOLATILE_WRITE: 4084 4060 case F2FS_IOC_RELEASE_VOLATILE_WRITE: 4085 - case F2FS_IOC_ABORT_VOLATILE_WRITE: 4086 4061 return -EOPNOTSUPP; 4087 4062 case F2FS_IOC_SHUTDOWN: 4088 4063 return f2fs_ioc_shutdown(filp, arg); ··· 4752 4725 case F2FS_IOC_COMMIT_ATOMIC_WRITE: 4753 4726 case F2FS_IOC_START_VOLATILE_WRITE: 4754 4727 case F2FS_IOC_RELEASE_VOLATILE_WRITE: 4755 - case F2FS_IOC_ABORT_VOLATILE_WRITE: 4728 + case F2FS_IOC_ABORT_ATOMIC_WRITE: 4756 4729 case F2FS_IOC_SHUTDOWN: 4757 4730 case FITRIM: 4758 4731 case FS_IOC_SET_ENCRYPTION_POLICY:
+7 -4
fs/f2fs/gc.c
··· 150 150 gc_control.nr_free_secs = foreground ? 1 : 0; 151 151 152 152 /* if return value is not zero, no victim was selected */ 153 - if (f2fs_gc(sbi, &gc_control)) 154 - wait_ms = gc_th->no_gc_sleep_time; 153 + if (f2fs_gc(sbi, &gc_control)) { 154 + /* don't bother wait_ms by foreground gc */ 155 + if (!foreground) 156 + wait_ms = gc_th->no_gc_sleep_time; 157 + } 155 158 156 159 if (foreground) 157 160 wake_up_all(&gc_th->fggc_wq); ··· 490 487 unsigned long long age, u, accu; 491 488 unsigned long long max_mtime = sit_i->dirty_max_mtime; 492 489 unsigned long long min_mtime = sit_i->dirty_min_mtime; 493 - unsigned int sec_blocks = BLKS_PER_SEC(sbi); 490 + unsigned int sec_blocks = CAP_BLKS_PER_SEC(sbi); 494 491 unsigned int vblocks; 495 492 unsigned int dirty_threshold = max(am->max_candidate_count, 496 493 am->candidate_ratio * ··· 1490 1487 */ 1491 1488 if ((gc_type == BG_GC && has_not_enough_free_secs(sbi, 0, 0)) || 1492 1489 (!force_migrate && get_valid_blocks(sbi, segno, true) == 1493 - BLKS_PER_SEC(sbi))) 1490 + CAP_BLKS_PER_SEC(sbi))) 1494 1491 return submitted; 1495 1492 1496 1493 if (check_valid_map(sbi, segno, off) == 0)
+10 -11
fs/f2fs/gc.h
··· 120 120 return free_blks - ovp_blks; 121 121 } 122 122 123 - static inline block_t limit_invalid_user_blocks(struct f2fs_sb_info *sbi) 123 + static inline block_t limit_invalid_user_blocks(block_t user_block_count) 124 124 { 125 - return (long)(sbi->user_block_count * LIMIT_INVALID_BLOCK) / 100; 125 + return (long)(user_block_count * LIMIT_INVALID_BLOCK) / 100; 126 126 } 127 127 128 - static inline block_t limit_free_user_blocks(struct f2fs_sb_info *sbi) 128 + static inline block_t limit_free_user_blocks(block_t reclaimable_user_blocks) 129 129 { 130 - block_t reclaimable_user_blocks = sbi->user_block_count - 131 - written_block_count(sbi); 132 130 return (long)(reclaimable_user_blocks * LIMIT_FREE_BLOCK) / 100; 133 131 } 134 132 ··· 161 163 162 164 static inline bool has_enough_invalid_blocks(struct f2fs_sb_info *sbi) 163 165 { 164 - block_t invalid_user_blocks = sbi->user_block_count - 165 - written_block_count(sbi); 166 + block_t user_block_count = sbi->user_block_count; 167 + block_t invalid_user_blocks = user_block_count - 168 + written_block_count(sbi); 166 169 /* 167 170 * Background GC is triggered with the following conditions. 168 171 * 1. There are a number of invalid blocks. 169 172 * 2. There is not enough free space. 170 173 */ 171 - if (invalid_user_blocks > limit_invalid_user_blocks(sbi) && 172 - free_user_blocks(sbi) < limit_free_user_blocks(sbi)) 173 - return true; 174 - return false; 174 + return (invalid_user_blocks > 175 + limit_invalid_user_blocks(user_block_count) && 176 + free_user_blocks(sbi) < 177 + limit_free_user_blocks(invalid_user_blocks)); 175 178 }
+1 -2
fs/f2fs/inode.c
··· 744 744 nid_t xnid = F2FS_I(inode)->i_xattr_nid; 745 745 int err = 0; 746 746 747 - if (f2fs_is_atomic_file(inode)) 748 - f2fs_abort_atomic_write(inode, true); 747 + f2fs_abort_atomic_write(inode, true); 749 748 750 749 trace_f2fs_evict_inode(inode); 751 750 truncate_inode_pages_final(&inode->i_data);
+7 -7
fs/f2fs/node.c
··· 1292 1292 dec_valid_node_count(sbi, dn->inode, !ofs); 1293 1293 goto fail; 1294 1294 } 1295 - f2fs_bug_on(sbi, new_ni.blk_addr != NULL_ADDR); 1295 + if (unlikely(new_ni.blk_addr != NULL_ADDR)) { 1296 + err = -EFSCORRUPTED; 1297 + set_sbi_flag(sbi, SBI_NEED_FSCK); 1298 + goto fail; 1299 + } 1296 1300 #endif 1297 1301 new_ni.nid = dn->nid; 1298 1302 new_ni.ino = dn->inode->i_ino; ··· 1949 1945 for (i = 0; i < nr_pages; i++) { 1950 1946 struct page *page = pvec.pages[i]; 1951 1947 bool submitted = false; 1952 - bool may_dirty = true; 1953 1948 1954 1949 /* give a priority to WB_SYNC threads */ 1955 1950 if (atomic_read(&sbi->wb_sync_req[NODE]) && ··· 2001 1998 } 2002 1999 2003 2000 /* flush dirty inode */ 2004 - if (IS_INODE(page) && may_dirty) { 2005 - may_dirty = false; 2006 - if (flush_dirty_inode(page)) 2007 - goto lock_node; 2008 - } 2001 + if (IS_INODE(page) && flush_dirty_inode(page)) 2002 + goto lock_node; 2009 2003 write_node: 2010 2004 f2fs_wait_on_page_writeback(page, NODE, true, true); 2011 2005
+49 -30
fs/f2fs/segment.c
··· 190 190 struct f2fs_sb_info *sbi = F2FS_I_SB(inode); 191 191 struct f2fs_inode_info *fi = F2FS_I(inode); 192 192 193 - if (f2fs_is_atomic_file(inode)) { 194 - if (clean) 195 - truncate_inode_pages_final(inode->i_mapping); 196 - clear_inode_flag(fi->cow_inode, FI_ATOMIC_FILE); 197 - iput(fi->cow_inode); 198 - fi->cow_inode = NULL; 199 - clear_inode_flag(inode, FI_ATOMIC_FILE); 193 + if (!f2fs_is_atomic_file(inode)) 194 + return; 200 195 201 - spin_lock(&sbi->inode_lock[ATOMIC_FILE]); 202 - sbi->atomic_files--; 203 - spin_unlock(&sbi->inode_lock[ATOMIC_FILE]); 204 - } 196 + if (clean) 197 + truncate_inode_pages_final(inode->i_mapping); 198 + clear_inode_flag(fi->cow_inode, FI_COW_FILE); 199 + iput(fi->cow_inode); 200 + fi->cow_inode = NULL; 201 + release_atomic_write_cnt(inode); 202 + clear_inode_flag(inode, FI_ATOMIC_FILE); 203 + 204 + spin_lock(&sbi->inode_lock[ATOMIC_FILE]); 205 + sbi->atomic_files--; 206 + spin_unlock(&sbi->inode_lock[ATOMIC_FILE]); 205 207 } 206 208 207 209 static int __replace_atomic_write_block(struct inode *inode, pgoff_t index, ··· 337 335 } 338 336 339 337 out: 338 + if (ret) 339 + sbi->revoked_atomic_block += fi->atomic_write_cnt; 340 + else 341 + sbi->committed_atomic_block += fi->atomic_write_cnt; 342 + 340 343 __complete_revoke_list(inode, &revoke_list, ret ? 
true : false); 341 344 342 345 return ret; ··· 735 728 get_valid_blocks(sbi, segno, true); 736 729 737 730 f2fs_bug_on(sbi, unlikely(!valid_blocks || 738 - valid_blocks == BLKS_PER_SEC(sbi))); 731 + valid_blocks == CAP_BLKS_PER_SEC(sbi))); 739 732 740 733 if (!IS_CURSEC(sbi, secno)) 741 734 set_bit(secno, dirty_i->dirty_secmap); ··· 771 764 unsigned int secno = GET_SEC_FROM_SEG(sbi, segno); 772 765 773 766 if (!valid_blocks || 774 - valid_blocks == BLKS_PER_SEC(sbi)) { 767 + valid_blocks == CAP_BLKS_PER_SEC(sbi)) { 775 768 clear_bit(secno, dirty_i->dirty_secmap); 776 769 return; 777 770 } ··· 3173 3166 return CURSEG_COLD_DATA; 3174 3167 if (file_is_hot(inode) || 3175 3168 is_inode_flag_set(inode, FI_HOT_DATA) || 3176 - f2fs_is_atomic_file(inode)) 3169 + f2fs_is_cow_file(inode)) 3177 3170 return CURSEG_HOT_DATA; 3178 3171 return f2fs_rw_hint_to_seg_type(inode->i_write_hint); 3179 3172 } else { ··· 3440 3433 goto drop_bio; 3441 3434 } 3442 3435 3443 - invalidate_mapping_pages(META_MAPPING(sbi), 3436 + if (fio->post_read) 3437 + invalidate_mapping_pages(META_MAPPING(sbi), 3444 3438 fio->new_blkaddr, fio->new_blkaddr); 3445 3439 3446 3440 stat_inc_inplace_blocks(fio->sbi); ··· 3624 3616 void f2fs_wait_on_block_writeback_range(struct inode *inode, block_t blkaddr, 3625 3617 block_t len) 3626 3618 { 3619 + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); 3627 3620 block_t i; 3621 + 3622 + if (!f2fs_post_read_required(inode)) 3623 + return; 3628 3624 3629 3625 for (i = 0; i < len; i++) 3630 3626 f2fs_wait_on_block_writeback(inode, blkaddr + i); 3627 + 3628 + invalidate_mapping_pages(META_MAPPING(sbi), blkaddr, blkaddr + len - 1); 3631 3629 } 3632 3630 3633 3631 static int read_compacted_summaries(struct f2fs_sb_info *sbi) ··· 4376 4362 return err; 4377 4363 seg_info_from_raw_sit(se, &sit); 4378 4364 4365 + if (se->type >= NR_PERSISTENT_LOG) { 4366 + f2fs_err(sbi, "Invalid segment type: %u, segno: %u", 4367 + se->type, start); 4368 + return -EFSCORRUPTED; 4369 + } 4370 + 4379 
4371 sit_valid_blocks[SE_PAGETYPE(se)] += se->valid_blocks; 4380 4372 4381 4373 if (f2fs_block_unit_discard(sbi)) { ··· 4429 4409 if (err) 4430 4410 break; 4431 4411 seg_info_from_raw_sit(se, &sit); 4412 + 4413 + if (se->type >= NR_PERSISTENT_LOG) { 4414 + f2fs_err(sbi, "Invalid segment type: %u, segno: %u", 4415 + se->type, start); 4416 + err = -EFSCORRUPTED; 4417 + break; 4418 + } 4432 4419 4433 4420 sit_valid_blocks[SE_PAGETYPE(se)] += se->valid_blocks; 4434 4421 ··· 4510 4483 struct free_segmap_info *free_i = FREE_I(sbi); 4511 4484 unsigned int segno = 0, offset = 0, secno; 4512 4485 block_t valid_blocks, usable_blks_in_seg; 4513 - block_t blks_per_sec = BLKS_PER_SEC(sbi); 4514 4486 4515 4487 while (1) { 4516 4488 /* find dirty segment based on free segmap */ ··· 4538 4512 valid_blocks = get_valid_blocks(sbi, segno, true); 4539 4513 secno = GET_SEC_FROM_SEG(sbi, segno); 4540 4514 4541 - if (!valid_blocks || valid_blocks == blks_per_sec) 4515 + if (!valid_blocks || valid_blocks == CAP_BLKS_PER_SEC(sbi)) 4542 4516 continue; 4543 4517 if (IS_CURSEC(sbi, secno)) 4544 4518 continue; ··· 4921 4895 static inline unsigned int f2fs_usable_zone_segs_in_sec( 4922 4896 struct f2fs_sb_info *sbi, unsigned int segno) 4923 4897 { 4924 - unsigned int dev_idx, zone_idx, unusable_segs_in_sec; 4898 + unsigned int dev_idx, zone_idx; 4925 4899 4926 4900 dev_idx = f2fs_target_device_index(sbi, START_BLOCK(sbi, segno)); 4927 4901 zone_idx = get_zone_idx(sbi, GET_SEC_FROM_SEG(sbi, segno), dev_idx); ··· 4930 4904 if (is_conv_zone(sbi, zone_idx, dev_idx)) 4931 4905 return sbi->segs_per_sec; 4932 4906 4933 - /* 4934 - * If the zone_capacity_blocks array is NULL, then zone capacity 4935 - * is equal to the zone size for all zones 4936 - */ 4937 - if (!FDEV(dev_idx).zone_capacity_blocks) 4907 + if (!sbi->unusable_blocks_per_sec) 4938 4908 return sbi->segs_per_sec; 4939 4909 4940 4910 /* Get the segment count beyond zone capacity block */ 4941 - unusable_segs_in_sec = (sbi->blocks_per_blkz - 
4942 - FDEV(dev_idx).zone_capacity_blocks[zone_idx]) >> 4943 - sbi->log_blocks_per_seg; 4944 - return sbi->segs_per_sec - unusable_segs_in_sec; 4911 + return sbi->segs_per_sec - (sbi->unusable_blocks_per_sec >> 4912 + sbi->log_blocks_per_seg); 4945 4913 } 4946 4914 4947 4915 /* ··· 4964 4944 if (is_conv_zone(sbi, zone_idx, dev_idx)) 4965 4945 return sbi->blocks_per_seg; 4966 4946 4967 - if (!FDEV(dev_idx).zone_capacity_blocks) 4947 + if (!sbi->unusable_blocks_per_sec) 4968 4948 return sbi->blocks_per_seg; 4969 4949 4970 4950 sec_start_blkaddr = START_BLOCK(sbi, GET_SEG_FROM_SEC(sbi, secno)); 4971 - sec_cap_blkaddr = sec_start_blkaddr + 4972 - FDEV(dev_idx).zone_capacity_blocks[zone_idx]; 4951 + sec_cap_blkaddr = sec_start_blkaddr + CAP_BLKS_PER_SEC(sbi); 4973 4952 4974 4953 /* 4975 4954 * If segment starts before zone capacity and spans beyond
+7 -4
fs/f2fs/segment.h
··· 101 101 GET_SEGNO_FROM_SEG0(sbi, blk_addr))) 102 102 #define BLKS_PER_SEC(sbi) \ 103 103 ((sbi)->segs_per_sec * (sbi)->blocks_per_seg) 104 + #define CAP_BLKS_PER_SEC(sbi) \ 105 + ((sbi)->segs_per_sec * (sbi)->blocks_per_seg - \ 106 + (sbi)->unusable_blocks_per_sec) 104 107 #define GET_SEC_FROM_SEG(sbi, segno) \ 105 108 (((segno) == -1) ? -1: (segno) / (sbi)->segs_per_sec) 106 109 #define GET_SEG_FROM_SEC(sbi, secno) \ ··· 612 609 get_pages(sbi, F2FS_DIRTY_DENTS) + 613 610 get_pages(sbi, F2FS_DIRTY_IMETA); 614 611 unsigned int total_dent_blocks = get_pages(sbi, F2FS_DIRTY_DENTS); 615 - unsigned int node_secs = total_node_blocks / BLKS_PER_SEC(sbi); 616 - unsigned int dent_secs = total_dent_blocks / BLKS_PER_SEC(sbi); 617 - unsigned int node_blocks = total_node_blocks % BLKS_PER_SEC(sbi); 618 - unsigned int dent_blocks = total_dent_blocks % BLKS_PER_SEC(sbi); 612 + unsigned int node_secs = total_node_blocks / CAP_BLKS_PER_SEC(sbi); 613 + unsigned int dent_secs = total_dent_blocks / CAP_BLKS_PER_SEC(sbi); 614 + unsigned int node_blocks = total_node_blocks % CAP_BLKS_PER_SEC(sbi); 615 + unsigned int dent_blocks = total_dent_blocks % CAP_BLKS_PER_SEC(sbi); 619 616 unsigned int free, need_lower, need_upper; 620 617 621 618 if (unlikely(is_sbi_flag_set(sbi, SBI_POR_DOING)))
+59 -31
fs/f2fs/super.c
··· 8 8 #include <linux/module.h> 9 9 #include <linux/init.h> 10 10 #include <linux/fs.h> 11 + #include <linux/fs_context.h> 11 12 #include <linux/sched/mm.h> 12 13 #include <linux/statfs.h> 13 14 #include <linux/buffer_head.h> ··· 160 159 Opt_gc_merge, 161 160 Opt_nogc_merge, 162 161 Opt_discard_unit, 162 + Opt_memory_mode, 163 163 Opt_err, 164 164 }; 165 165 ··· 237 235 {Opt_gc_merge, "gc_merge"}, 238 236 {Opt_nogc_merge, "nogc_merge"}, 239 237 {Opt_discard_unit, "discard_unit=%s"}, 238 + {Opt_memory_mode, "memory=%s"}, 240 239 {Opt_err, NULL}, 241 240 }; 242 241 ··· 495 492 bool is_remount) 496 493 { 497 494 struct f2fs_sb_info *sbi = F2FS_SB(sb); 498 - #ifdef CONFIG_FS_ENCRYPTION 495 + struct fs_parameter param = { 496 + .type = fs_value_is_string, 497 + .string = arg->from ? arg->from : "", 498 + }; 499 + struct fscrypt_dummy_policy *policy = 500 + &F2FS_OPTION(sbi).dummy_enc_policy; 499 501 int err; 502 + 503 + if (!IS_ENABLED(CONFIG_FS_ENCRYPTION)) { 504 + f2fs_warn(sbi, "test_dummy_encryption option not supported"); 505 + return -EINVAL; 506 + } 500 507 501 508 if (!f2fs_sb_has_encrypt(sbi)) { 502 509 f2fs_err(sbi, "Encrypt feature is off"); ··· 519 506 * needed to allow it to be set or changed during remount. We do allow 520 507 * it to be specified during remount, but only if there is no change. 
521 508 */ 522 - if (is_remount && !F2FS_OPTION(sbi).dummy_enc_policy.policy) { 509 + if (is_remount && !fscrypt_is_dummy_policy_set(policy)) { 523 510 f2fs_warn(sbi, "Can't set test_dummy_encryption on remount"); 524 511 return -EINVAL; 525 512 } 526 - err = fscrypt_set_test_dummy_encryption( 527 - sb, arg->from, &F2FS_OPTION(sbi).dummy_enc_policy); 513 + 514 + err = fscrypt_parse_test_dummy_encryption(&param, policy); 528 515 if (err) { 529 516 if (err == -EEXIST) 530 517 f2fs_warn(sbi, ··· 537 524 opt, err); 538 525 return -EINVAL; 539 526 } 527 + err = fscrypt_add_test_dummy_key(sb, policy); 528 + if (err) { 529 + f2fs_warn(sbi, "Error adding test dummy encryption key [%d]", 530 + err); 531 + return err; 532 + } 540 533 f2fs_warn(sbi, "Test dummy encryption mode enabled"); 541 534 return 0; 542 - #else 543 - f2fs_warn(sbi, "test_dummy_encryption option not supported"); 544 - return -EINVAL; 545 - #endif 546 535 } 547 536 548 537 #ifdef CONFIG_F2FS_FS_COMPRESSION ··· 1237 1222 } 1238 1223 kfree(name); 1239 1224 break; 1225 + case Opt_memory_mode: 1226 + name = match_strdup(&args[0]); 1227 + if (!name) 1228 + return -ENOMEM; 1229 + if (!strcmp(name, "normal")) { 1230 + F2FS_OPTION(sbi).memory_mode = 1231 + MEMORY_MODE_NORMAL; 1232 + } else if (!strcmp(name, "low")) { 1233 + F2FS_OPTION(sbi).memory_mode = 1234 + MEMORY_MODE_LOW; 1235 + } else { 1236 + kfree(name); 1237 + return -EINVAL; 1238 + } 1239 + kfree(name); 1240 + break; 1240 1241 default: 1241 1242 f2fs_err(sbi, "Unrecognized mount option \"%s\" or missing value", 1242 1243 p); ··· 1411 1380 atomic_inc(&inode->i_count); 1412 1381 spin_unlock(&inode->i_lock); 1413 1382 1414 - if (f2fs_is_atomic_file(inode)) 1415 - f2fs_abort_atomic_write(inode, true); 1383 + f2fs_abort_atomic_write(inode, true); 1416 1384 1417 1385 /* should remain fi->extent_tree for writepage */ 1418 1386 f2fs_destroy_extent_node(inode); ··· 1521 1491 blkdev_put(FDEV(i).bdev, FMODE_EXCL); 1522 1492 #ifdef CONFIG_BLK_DEV_ZONED 1523 1493 
kvfree(FDEV(i).blkz_seq); 1524 - kfree(FDEV(i).zone_capacity_blocks); 1525 1494 #endif 1526 1495 } 1527 1496 kvfree(sbi->devs); ··· 2022 1993 else if (F2FS_OPTION(sbi).discard_unit == DISCARD_UNIT_SECTION) 2023 1994 seq_printf(seq, ",discard_unit=%s", "section"); 2024 1995 1996 + if (F2FS_OPTION(sbi).memory_mode == MEMORY_MODE_NORMAL) 1997 + seq_printf(seq, ",memory=%s", "normal"); 1998 + else if (F2FS_OPTION(sbi).memory_mode == MEMORY_MODE_LOW) 1999 + seq_printf(seq, ",memory=%s", "low"); 2000 + 2025 2001 return 0; 2026 2002 } 2027 2003 ··· 2048 2014 F2FS_OPTION(sbi).compress_ext_cnt = 0; 2049 2015 F2FS_OPTION(sbi).compress_mode = COMPR_MODE_FS; 2050 2016 F2FS_OPTION(sbi).bggc_mode = BGGC_MODE_ON; 2017 + F2FS_OPTION(sbi).memory_mode = MEMORY_MODE_NORMAL; 2051 2018 2052 2019 sbi->sb->s_flags &= ~SB_INLINECRYPT; 2053 2020 ··· 3614 3579 sbi->max_fragment_chunk = DEF_FRAGMENT_SIZE; 3615 3580 sbi->max_fragment_hole = DEF_FRAGMENT_SIZE; 3616 3581 spin_lock_init(&sbi->gc_urgent_high_lock); 3582 + atomic64_set(&sbi->current_atomic_write, 0); 3617 3583 3618 3584 sbi->dir_level = DEF_DIR_LEVEL; 3619 3585 sbi->interval_time[CP_TIME] = DEF_CP_INTERVAL; ··· 3672 3636 #ifdef CONFIG_BLK_DEV_ZONED 3673 3637 3674 3638 struct f2fs_report_zones_args { 3639 + struct f2fs_sb_info *sbi; 3675 3640 struct f2fs_dev_info *dev; 3676 - bool zone_cap_mismatch; 3677 3641 }; 3678 3642 3679 3643 static int f2fs_report_zone_cb(struct blk_zone *zone, unsigned int idx, 3680 3644 void *data) 3681 3645 { 3682 3646 struct f2fs_report_zones_args *rz_args = data; 3647 + block_t unusable_blocks = (zone->len - zone->capacity) >> 3648 + F2FS_LOG_SECTORS_PER_BLOCK; 3683 3649 3684 3650 if (zone->type == BLK_ZONE_TYPE_CONVENTIONAL) 3685 3651 return 0; 3686 3652 3687 3653 set_bit(idx, rz_args->dev->blkz_seq); 3688 - rz_args->dev->zone_capacity_blocks[idx] = zone->capacity >> 3689 - F2FS_LOG_SECTORS_PER_BLOCK; 3690 - if (zone->len != zone->capacity && !rz_args->zone_cap_mismatch) 3691 - 
rz_args->zone_cap_mismatch = true; 3692 - 3654 + if (!rz_args->sbi->unusable_blocks_per_sec) { 3655 + rz_args->sbi->unusable_blocks_per_sec = unusable_blocks; 3656 + return 0; 3657 + } 3658 + if (rz_args->sbi->unusable_blocks_per_sec != unusable_blocks) { 3659 + f2fs_err(rz_args->sbi, "F2FS supports single zone capacity\n"); 3660 + return -EINVAL; 3661 + } 3693 3662 return 0; 3694 3663 } 3695 3664 ··· 3735 3694 if (!FDEV(devi).blkz_seq) 3736 3695 return -ENOMEM; 3737 3696 3738 - /* Get block zones type and zone-capacity */ 3739 - FDEV(devi).zone_capacity_blocks = f2fs_kzalloc(sbi, 3740 - FDEV(devi).nr_blkz * sizeof(block_t), 3741 - GFP_KERNEL); 3742 - if (!FDEV(devi).zone_capacity_blocks) 3743 - return -ENOMEM; 3744 - 3697 + rep_zone_arg.sbi = sbi; 3745 3698 rep_zone_arg.dev = &FDEV(devi); 3746 - rep_zone_arg.zone_cap_mismatch = false; 3747 3699 3748 3700 ret = blkdev_report_zones(bdev, 0, BLK_ALL_ZONES, f2fs_report_zone_cb, 3749 3701 &rep_zone_arg); 3750 3702 if (ret < 0) 3751 3703 return ret; 3752 - 3753 - if (!rep_zone_arg.zone_cap_mismatch) { 3754 - kfree(FDEV(devi).zone_capacity_blocks); 3755 - FDEV(devi).zone_capacity_blocks = NULL; 3756 - } 3757 - 3758 3704 return 0; 3759 3705 } 3760 3706 #endif
+56
fs/f2fs/sysfs.c
··· 339 339 sbi->gc_reclaimed_segs[sbi->gc_segment_mode]); 340 340 } 341 341 342 + if (!strcmp(a->attr.name, "current_atomic_write")) { 343 + s64 current_write = atomic64_read(&sbi->current_atomic_write); 344 + 345 + return sysfs_emit(buf, "%lld\n", current_write); 346 + } 347 + 348 + if (!strcmp(a->attr.name, "peak_atomic_write")) 349 + return sysfs_emit(buf, "%lld\n", sbi->peak_atomic_write); 350 + 351 + if (!strcmp(a->attr.name, "committed_atomic_block")) 352 + return sysfs_emit(buf, "%llu\n", sbi->committed_atomic_block); 353 + 354 + if (!strcmp(a->attr.name, "revoked_atomic_block")) 355 + return sysfs_emit(buf, "%llu\n", sbi->revoked_atomic_block); 356 + 342 357 ui = (unsigned int *)(ptr + a->offset); 343 358 344 359 return sprintf(buf, "%u\n", *ui); ··· 623 608 return count; 624 609 } 625 610 611 + if (!strcmp(a->attr.name, "peak_atomic_write")) { 612 + if (t != 0) 613 + return -EINVAL; 614 + sbi->peak_atomic_write = 0; 615 + return count; 616 + } 617 + 618 + if (!strcmp(a->attr.name, "committed_atomic_block")) { 619 + if (t != 0) 620 + return -EINVAL; 621 + sbi->committed_atomic_block = 0; 622 + return count; 623 + } 624 + 625 + if (!strcmp(a->attr.name, "revoked_atomic_block")) { 626 + if (t != 0) 627 + return -EINVAL; 628 + sbi->revoked_atomic_block = 0; 629 + return count; 630 + } 631 + 626 632 *ui = (unsigned int)t; 627 633 628 634 return count; ··· 749 713 .offset = _offset \ 750 714 } 751 715 716 + #define F2FS_RO_ATTR(struct_type, struct_name, name, elname) \ 717 + F2FS_ATTR_OFFSET(struct_type, name, 0444, \ 718 + f2fs_sbi_show, NULL, \ 719 + offsetof(struct struct_name, elname)) 720 + 752 721 #define F2FS_RW_ATTR(struct_type, struct_name, name, elname) \ 753 722 F2FS_ATTR_OFFSET(struct_type, name, 0644, \ 754 723 f2fs_sbi_show, f2fs_sbi_store, \ ··· 852 811 #endif /* CONFIG_FS_ENCRYPTION */ 853 812 #ifdef CONFIG_BLK_DEV_ZONED 854 813 F2FS_FEATURE_RO_ATTR(block_zoned); 814 + F2FS_RO_ATTR(F2FS_SBI, f2fs_sb_info, unusable_blocks_per_sec, 815 + 
unusable_blocks_per_sec); 855 816 #endif 856 817 F2FS_FEATURE_RO_ATTR(atomic_write); 857 818 F2FS_FEATURE_RO_ATTR(extra_attr); ··· 890 847 F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_reclaimed_segments, gc_reclaimed_segs); 891 848 F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, max_fragment_chunk, max_fragment_chunk); 892 849 F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, max_fragment_hole, max_fragment_hole); 850 + 851 + /* For atomic write */ 852 + F2FS_RO_ATTR(F2FS_SBI, f2fs_sb_info, current_atomic_write, current_atomic_write); 853 + F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, peak_atomic_write, peak_atomic_write); 854 + F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, committed_atomic_block, committed_atomic_block); 855 + F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, revoked_atomic_block, revoked_atomic_block); 893 856 894 857 #define ATTR_LIST(name) (&f2fs_attr_##name.attr) 895 858 static struct attribute *f2fs_attrs[] = { ··· 968 919 ATTR_LIST(moved_blocks_background), 969 920 ATTR_LIST(avg_vblocks), 970 921 #endif 922 + #ifdef CONFIG_BLK_DEV_ZONED 923 + ATTR_LIST(unusable_blocks_per_sec), 924 + #endif 971 925 #ifdef CONFIG_F2FS_FS_COMPRESSION 972 926 ATTR_LIST(compr_written_block), 973 927 ATTR_LIST(compr_saved_block), ··· 986 934 ATTR_LIST(gc_reclaimed_segments), 987 935 ATTR_LIST(max_fragment_chunk), 988 936 ATTR_LIST(max_fragment_hole), 937 + ATTR_LIST(current_atomic_write), 938 + ATTR_LIST(peak_atomic_write), 939 + ATTR_LIST(committed_atomic_block), 940 + ATTR_LIST(revoked_atomic_block), 989 941 NULL, 990 942 }; 991 943 ATTRIBUTE_GROUPS(f2fs);
+1 -1
include/uapi/linux/f2fs.h
··· 13 13 #define F2FS_IOC_COMMIT_ATOMIC_WRITE _IO(F2FS_IOCTL_MAGIC, 2) 14 14 #define F2FS_IOC_START_VOLATILE_WRITE _IO(F2FS_IOCTL_MAGIC, 3) 15 15 #define F2FS_IOC_RELEASE_VOLATILE_WRITE _IO(F2FS_IOCTL_MAGIC, 4) 16 - #define F2FS_IOC_ABORT_VOLATILE_WRITE _IO(F2FS_IOCTL_MAGIC, 5) 16 + #define F2FS_IOC_ABORT_ATOMIC_WRITE _IO(F2FS_IOCTL_MAGIC, 5) 17 17 #define F2FS_IOC_GARBAGE_COLLECT _IOW(F2FS_IOCTL_MAGIC, 6, __u32) 18 18 #define F2FS_IOC_WRITE_CHECKPOINT _IO(F2FS_IOCTL_MAGIC, 7) 19 19 #define F2FS_IOC_DEFRAGMENT _IOWR(F2FS_IOCTL_MAGIC, 8, \