Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

xfs: detect and fix bad summary counts at mount

Filippo Giunchedi complained that xfs doesn't even perform basic sanity
checks of the fs summary counters at mount time. Therefore, recalculate
the summary counters from the AGFs after log recovery if the counts were
bad (or we had to recover the fs). Enhance the recalculation routine to
fail the mount entirely if the new values are also obviously incorrect.

We use a mount state flag to record the "bad summary count" state so
that the (subsequent) online fsck patches can detect subtly incorrect
counts and set the flag; clear it if userspace asks for a repair; or force
a recalculation at the next mount if nobody fixes it by unmount time.

Reported-by: Filippo Giunchedi <fgiunchedi@wikimedia.org>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

+73 -29
+18 -3
fs/xfs/libxfs/xfs_sb.c
···
 	uint64_t	bfree = 0;
 	uint64_t	bfreelst = 0;
 	uint64_t	btree = 0;
+	uint64_t	fdblocks;
 	int		error;
 
 	for (index = 0; index < agcount; index++) {
···
 		btree += pag->pagf_btreeblks;
 		xfs_perag_put(pag);
 	}
+	fdblocks = bfree + bfreelst + btree;
+
+	/*
+	 * If the new summary counts are obviously incorrect, fail the
+	 * mount operation because that implies the AGFs are also corrupt.
+	 * Clear BAD_SUMMARY so that we don't unmount with a dirty log, which
+	 * will prevent xfs_repair from fixing anything.
+	 */
+	if (fdblocks > sbp->sb_dblocks || ifree > ialloc) {
+		xfs_alert(mp, "AGF corruption. Please run xfs_repair.");
+		error = -EFSCORRUPTED;
+		goto out;
+	}
 
 	/* Overwrite incore superblock counters with just-read data */
 	spin_lock(&mp->m_sb_lock);
 	sbp->sb_ifree = ifree;
 	sbp->sb_icount = ialloc;
-	sbp->sb_fdblocks = bfree + bfreelst + btree;
+	sbp->sb_fdblocks = fdblocks;
 	spin_unlock(&mp->m_sb_lock);
 
 	xfs_reinit_percpu_counters(mp);
-
-	return 0;
+out:
+	mp->m_flags &= ~XFS_MOUNT_BAD_SUMMARY;
+	return error;
 }
 
 /*
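The check added to this hunk can be modelled outside the kernel. The following is a minimal userspace sketch, not the kernel code: `struct ag_counts`, `struct summary`, `recompute_summary()` and `ERR_CORRUPT` are hypothetical names invented for illustration, and the locking, per-CPU counter reinitialisation and BAD_SUMMARY flag handling of the real routine are deliberately left out. It only shows the idea of summing the per-AG values and refusing to overwrite the superblock counters when the totals are impossible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error code standing in for the kernel's -EFSCORRUPTED. */
#define ERR_CORRUPT (-1)

/* Hypothetical stand-in for the counters read from one allocation group. */
struct ag_counts {
	uint64_t freeblks;   /* free blocks in the AG */
	uint64_t flcount;    /* blocks on the AG free list */
	uint64_t btreeblks;  /* blocks held by the free-space btrees */
	uint64_t icount;     /* allocated inodes */
	uint64_t ifree;      /* free inodes */
};

/* Hypothetical stand-in for the superblock summary counters. */
struct summary {
	uint64_t dblocks;    /* total data blocks in the filesystem */
	uint64_t fdblocks;   /* free data blocks */
	uint64_t icount;     /* allocated inodes */
	uint64_t ifree;      /* free inodes */
};

/*
 * Sum the per-AG counters and store them in *sb, unless the totals are
 * obviously impossible (more free blocks than blocks, or more free inodes
 * than allocated inodes). On failure *sb is left untouched.
 */
static int recompute_summary(const struct ag_counts *ags, size_t agcount,
			     struct summary *sb)
{
	uint64_t ifree = 0, ialloc = 0, fdblocks = 0;
	size_t i;

	for (i = 0; i < agcount; i++) {
		ifree += ags[i].ifree;
		ialloc += ags[i].icount;
		fdblocks += ags[i].freeblks + ags[i].flcount +
			    ags[i].btreeblks;
	}

	if (fdblocks > sb->dblocks || ifree > ialloc)
		return ERR_CORRUPT;

	sb->fdblocks = fdblocks;
	sb->icount = ialloc;
	sb->ifree = ifree;
	return 0;
}
```

Validating before the overwrite is the design point here: a caller that receives the error still sees the old counters, which mirrors how the patch jumps to `out` before taking `m_sb_lock`.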
+54 -26
fs/xfs/xfs_mount.c
···
 	return resblks;
 }
 
+/* Ensure the summary counts are correct. */
+STATIC int
+xfs_check_summary_counts(
+	struct xfs_mount	*mp)
+{
+	/*
+	 * The AG0 superblock verifier rejects in-progress filesystems,
+	 * so we should never see the flag set this far into mounting.
+	 */
+	if (mp->m_sb.sb_inprogress) {
+		xfs_err(mp, "sb_inprogress set after log recovery??");
+		WARN_ON(1);
+		return -EFSCORRUPTED;
+	}
+
+	/*
+	 * Now the log is mounted, we know if it was an unclean shutdown or
+	 * not. If it was, with the first phase of recovery has completed, we
+	 * have consistent AG blocks on disk. We have not recovered EFIs yet,
+	 * but they are recovered transactionally in the second recovery phase
+	 * later.
+	 *
+	 * If the log was clean when we mounted, we can check the summary
+	 * counters. If any of them are obviously incorrect, we can recompute
+	 * them from the AGF headers in the next step.
+	 */
+	if (XFS_LAST_UNMOUNT_WAS_CLEAN(mp) &&
+	    (mp->m_sb.sb_fdblocks > mp->m_sb.sb_dblocks ||
+	     mp->m_sb.sb_ifree > mp->m_sb.sb_icount))
+		mp->m_flags |= XFS_MOUNT_BAD_SUMMARY;
+
+	/*
+	 * We can safely re-initialise incore superblock counters from the
+	 * per-ag data. These may not be correct if the filesystem was not
+	 * cleanly unmounted, so we waited for recovery to finish before doing
+	 * this.
+	 *
+	 * If the filesystem was cleanly unmounted or the previous check did
+	 * not flag anything weird, then we can trust the values in the
+	 * superblock to be correct and we don't need to do anything here.
+	 * Otherwise, recalculate the summary counters.
+	 */
+	if ((!xfs_sb_version_haslazysbcount(&mp->m_sb) ||
+	     XFS_LAST_UNMOUNT_WAS_CLEAN(mp)) &&
+	    !(mp->m_flags & XFS_MOUNT_BAD_SUMMARY))
+		return 0;
+
+	return xfs_initialize_perag_data(mp, mp->m_sb.sb_agcount);
+}
+
 /*
  * This function does the following on an initial mount of a file system:
  *	- reads the superblock from disk and init the mount struct
···
 		goto out_fail_wait;
 	}
 
-	/*
-	 * Now the log is mounted, we know if it was an unclean shutdown or
-	 * not. If it was, with the first phase of recovery has completed, we
-	 * have consistent AG blocks on disk. We have not recovered EFIs yet,
-	 * but they are recovered transactionally in the second recovery phase
-	 * later.
-	 *
-	 * Hence we can safely re-initialise incore superblock counters from
-	 * the per-ag data. These may not be correct if the filesystem was not
-	 * cleanly unmounted, so we need to wait for recovery to finish before
-	 * doing this.
-	 *
-	 * If the filesystem was cleanly unmounted, then we can trust the
-	 * values in the superblock to be correct and we don't need to do
-	 * anything here.
-	 *
-	 * If we are currently making the filesystem, the initialisation will
-	 * fail as the perag data is in an undefined state.
-	 */
-	if (xfs_sb_version_haslazysbcount(&mp->m_sb) &&
-	    !XFS_LAST_UNMOUNT_WAS_CLEAN(mp) &&
-	    !mp->m_sb.sb_inprogress) {
-		error = xfs_initialize_perag_data(mp, sbp->sb_agcount);
-		if (error)
-			goto out_log_dealloc;
-	}
+	/* Make sure the summary counts are ok. */
+	error = xfs_check_summary_counts(mp);
+	if (error)
+		goto out_log_dealloc;
 
 	/*
 	 * Get and sanity-check the root inode.
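The two conditions in xfs_check_summary_counts() interact in a way that is easier to see as a pure predicate. The sketch below is a userspace model under stated assumptions: `struct mount_model`, `summary_obviously_bad()` and `needs_recalculation()` are hypothetical names, the two booleans stand in for xfs_sb_version_haslazysbcount() and XFS_LAST_UNMOUNT_WAS_CLEAN(), and the model assumes the BAD_SUMMARY flag was not already set by some other caller before the check runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the mount state the check looks at. */
struct mount_model {
	bool lazy_sb_count;  /* models xfs_sb_version_haslazysbcount() */
	bool was_clean;      /* models XFS_LAST_UNMOUNT_WAS_CLEAN() */
	uint64_t dblocks;    /* total data blocks */
	uint64_t fdblocks;   /* free data blocks from the superblock */
	uint64_t icount;     /* allocated inodes from the superblock */
	uint64_t ifree;      /* free inodes from the superblock */
};

/* The "obviously incorrect" test applied to the on-disk summary. */
static bool summary_obviously_bad(const struct mount_model *m)
{
	return m->fdblocks > m->dblocks || m->ifree > m->icount;
}

/* Returns true when the counters would be recomputed from the AGs. */
static bool needs_recalculation(const struct mount_model *m)
{
	/* The patch only flags bad counts when the log was clean. */
	bool bad_summary = m->was_clean && summary_obviously_bad(m);

	/* Trust the superblock unless it was flagged or recovery ran. */
	if ((!m->lazy_sb_count || m->was_clean) && !bad_summary)
		return false;

	return true;
}
```

Under this model, a lazy-count filesystem that needed log recovery is always recomputed, a cleanly unmounted filesystem is recomputed only when its counters are impossible, and a filesystem without lazy counters that needed recovery is not recomputed, because the sanity test is gated on a clean log.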
+1
fs/xfs/xfs_mount.h
···
 						   must be synchronous except
 						   for space allocations */
 #define XFS_MOUNT_UNMOUNTING	(1ULL << 1)	/* filesystem is unmounting */
+#define XFS_MOUNT_BAD_SUMMARY	(1ULL << 2)	/* summary counters are bad */
 #define XFS_MOUNT_WAS_CLEAN	(1ULL << 3)
 #define XFS_MOUNT_FS_SHUTDOWN	(1ULL << 4)	/* atomic stop of all filesystem
 						   operations, typically for
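The new flag takes the previously unused bit 2 of the mount flags word. As a small illustration of the set and clear operations the patch performs on `m_flags`, the sketch below uses a plain `uint64_t` and hypothetical `MODEL_*` names and helper functions; only the bit positions are taken from the header above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the header; the MODEL_ names are hypothetical. */
#define MODEL_UNMOUNTING   (1ULL << 1)
#define MODEL_BAD_SUMMARY  (1ULL << 2)
#define MODEL_WAS_CLEAN    (1ULL << 3)

/* Record that the summary counters are known to be bad. */
static uint64_t mark_bad_summary(uint64_t flags)
{
	return flags | MODEL_BAD_SUMMARY;
}

/* Forget the bad-summary state, e.g. after a recalculation attempt. */
static uint64_t clear_bad_summary(uint64_t flags)
{
	return flags & ~MODEL_BAD_SUMMARY;
}

/* Test whether the bad-summary state is currently recorded. */
static bool has_bad_summary(uint64_t flags)
{
	return (flags & MODEL_BAD_SUMMARY) != 0;
}
```

Setting and clearing touch only bit 2, so neighbouring state such as the was-clean bit survives a full set and clear cycle.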