Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

xfs: move bmbt owner change to last step of extent swap

The extent swap operation currently resets bmbt block owners before
the inode forks are swapped. The bmbt buffers are marked as ordered
so they do not have to be physically logged in the transaction.

This use of ordered buffers is not safe as bmbt buffers may have
been previously physically logged. The bmbt owner change algorithm
needs to be updated to physically log buffers that are already dirty
when/if they are encountered. This means that an extent swap will
eventually require multiple rolling transactions to handle large
btrees. In addition, all inode-related changes must be logged before
the bmbt owner change scan begins and can roll the transaction for
the first time, so that log recovery preserves fs consistency.
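
The logging rule described above can be sketched as a toy model. This is only an illustration under stated assumptions, not the eventual XFS implementation: `toy_buf`, `toy_owner_change_scan()` and the per-transaction `budget` of physically logged buffers are hypothetical names invented here, and the real scan would work on bmbt buffers through the kernel transaction code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of how one buffer is attached to the transaction. */
enum toy_log_mode { TOY_LOG_ORDERED, TOY_LOG_PHYSICAL };

struct toy_buf {
	int dirty;              /* nonzero: buffer was physically logged earlier */
	enum toy_log_mode mode; /* choice made by the scan below */
};

/*
 * Walk every buffer. Clean buffers may stay ordered; buffers that are
 * already dirty must be physically logged. Each transaction is assumed
 * to have room for `budget` physically logged buffers, so the scan
 * "rolls" to a fresh transaction whenever that room is used up.
 * Returns the number of rolls that were needed.
 */
size_t toy_owner_change_scan(struct toy_buf *bufs, size_t nbufs, size_t budget)
{
	size_t used = 0;
	size_t rolls = 0;

	assert(budget > 0);

	for (size_t i = 0; i < nbufs; i++) {
		if (!bufs[i].dirty) {
			bufs[i].mode = TOY_LOG_ORDERED;
			continue;
		}
		if (used == budget) {
			/* Out of room: roll and start counting again. */
			rolls++;
			used = 0;
		}
		bufs[i].mode = TOY_LOG_PHYSICAL;
		used++;
	}
	return rolls;
}
```

In this model a large btree with many dirty buffers needs more rolls, which is why the inode changes have to be logged before the first roll can happen.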

In preparation for such fixes to the bmbt owner change algorithm,
refactor the bmbt scan out of the extent fork swap code to the last
operation before the transaction is committed. Update
xfs_swap_extent_forks() to only set the inode log flags when an
owner change scan is necessary. Update xfs_swap_extents() to trigger
the owner change based on the inode log flags. Note that since the
owner change now occurs after the extent fork swap, the inode btrees
must be fixed up with the inode number of the current inode (similar
to log recovery).
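
The reordered flow can be illustrated with a small standalone model. It is a sketch under stated assumptions, not kernel code: `toy_inode`, `toy_fork`, `TOY_ILOG_DOWNER` and the `toy_*` functions are hypothetical stand-ins for the XFS inode, its data fork, XFS_ILOG_DOWNER and the functions named in this commit, and transactions and logging are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ILOG_DOWNER 0x1u /* stand-in for XFS_ILOG_DOWNER */
#define TOY_MAX_BLOCKS  8

struct toy_fork {
	int      is_btree;              /* models a btree-format data fork */
	size_t   nblocks;               /* number of bmbt blocks in the fork */
	uint64_t owner[TOY_MAX_BLOCKS]; /* owner stamped in each block header */
};

struct toy_inode {
	uint64_t        ino;
	int             version;
	struct toy_fork data;
};

/* Swap the data forks; only record which inode needs an owner change. */
void toy_swap_forks(struct toy_inode *ip, struct toy_inode *tip,
		    unsigned *src_log_flags, unsigned *target_log_flags)
{
	struct toy_fork tmp;

	/* ip's btree ends up in tip, so the flag goes to the target. */
	if (ip->version == 3 && ip->data.is_btree)
		*target_log_flags |= TOY_ILOG_DOWNER;
	if (tip->version == 3 && tip->data.is_btree)
		*src_log_flags |= TOY_ILOG_DOWNER;

	tmp = ip->data;
	ip->data = tip->data;
	tip->data = tmp;
}

/* Stamp new_owner into every block of the inode's data fork. */
void toy_change_owner(struct toy_inode *ip, uint64_t new_owner)
{
	assert(ip->data.nblocks <= TOY_MAX_BLOCKS);
	for (size_t i = 0; i < ip->data.nblocks; i++)
		ip->data.owner[i] = new_owner;
}

void toy_swap_extents(struct toy_inode *ip, struct toy_inode *tip)
{
	unsigned src_log_flags = 0;
	unsigned target_log_flags = 0;

	toy_swap_forks(ip, tip, &src_log_flags, &target_log_flags);

	/* The real code logs both inodes with these flags at this point. */

	/* Last step: each inode fixes up the btree it now holds. */
	if (src_log_flags & TOY_ILOG_DOWNER)
		toy_change_owner(ip, ip->ino);
	if (target_log_flags & TOY_ILOG_DOWNER)
		toy_change_owner(tip, tip->ino);
}
```

Because the scan runs after the swap, each call passes the inode's own number as the new owner, whereas the old code ran before the swap and passed the other inode's number.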

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Authored by Brian Foster and committed by Darrick J. Wong
6fb10d6d 99c794c6

+26 -18
fs/xfs/xfs_bmap_util.c
@@ -1842,29 +1842,18 @@
 	}
 
 	/*
-	 * Before we've swapped the forks, lets set the owners of the forks
-	 * appropriately. We have to do this as we are demand paging the btree
-	 * buffers, and so the validation done on read will expect the owner
-	 * field to be correctly set. Once we change the owners, we can swap the
-	 * inode forks.
+	 * Btree format (v3) inodes have the inode number stamped in the bmbt
+	 * block headers. We can't start changing the bmbt blocks until the
+	 * inode owner change is logged so recovery does the right thing in the
+	 * event of a crash. Set the owner change log flags now and leave the
+	 * bmbt scan as the last step.
 	 */
 	if (ip->i_d.di_version == 3 &&
-	    ip->i_d.di_format == XFS_DINODE_FMT_BTREE) {
+	    ip->i_d.di_format == XFS_DINODE_FMT_BTREE)
 		(*target_log_flags) |= XFS_ILOG_DOWNER;
-		error = xfs_bmbt_change_owner(tp, ip, XFS_DATA_FORK,
-					      tip->i_ino, NULL);
-		if (error)
-			return error;
-	}
-
 	if (tip->i_d.di_version == 3 &&
-	    tip->i_d.di_format == XFS_DINODE_FMT_BTREE) {
+	    tip->i_d.di_format == XFS_DINODE_FMT_BTREE)
 		(*src_log_flags) |= XFS_ILOG_DOWNER;
-		error = xfs_bmbt_change_owner(tp, tip, XFS_DATA_FORK,
-					      ip->i_ino, NULL);
-		if (error)
-			return error;
-	}
 
 	/*
 	 * Swap the data forks of the inodes
@@ -2081,6 +2092,25 @@
 
 	xfs_trans_log_inode(tp, ip, src_log_flags);
 	xfs_trans_log_inode(tp, tip, target_log_flags);
+
+	/*
+	 * The extent forks have been swapped, but crc=1,rmapbt=0 filesystems
+	 * have inode number owner values in the bmbt blocks that still refer to
+	 * the old inode. Scan each bmbt to fix up the owner values with the
+	 * inode number of the current inode.
+	 */
+	if (src_log_flags & XFS_ILOG_DOWNER) {
+		error = xfs_bmbt_change_owner(tp, ip, XFS_DATA_FORK,
+					      ip->i_ino, NULL);
+		if (error)
+			goto out_trans_cancel;
+	}
+	if (target_log_flags & XFS_ILOG_DOWNER) {
+		error = xfs_bmbt_change_owner(tp, tip, XFS_DATA_FORK,
+					      tip->i_ino, NULL);
+		if (error)
+			goto out_trans_cancel;
+	}
 
 	/*
 	 * If this is a synchronous mount, make sure that the