iomap: support partial page discard on writeback block mapping failure

On a writeback mapping failure, iomap only calls ->discard_page() if
the current page has not been added to the ioend. Accordingly, the
XFS callback assumes a full page discard and invalidation. This is
problematic for sub-page block size filesystems where some portion
of a page might have been mapped successfully before a failure to
map a delalloc block occurs. ->discard_page() is not called in that
error scenario and the bio is explicitly failed by iomap via the
error return from ->prepare_ioend(). As a result, the filesystem
leaks delalloc blocks and corrupts the filesystem block counters.
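
The leak can be illustrated with a toy user-space model. This is only a
sketch, not kernel code: the block states, the 4-blocks-per-page
geometry and the sketch_leaked_blocks() helper are all hypothetical
names chosen for illustration. It walks the blocks of one page in file
order, fails at a given block, and counts the delalloc blocks left
behind depending on whether the discard runs only when nothing was
added to the ioend or on every mapping error.

```c
#include <stdbool.h>

/* Assumption for illustration: 4 fs blocks per page (e.g. 1k blocks, 4k pages). */
#define SKETCH_BLOCKS_PER_PAGE 4

/* Hypothetical per-block states; these are not kernel types. */
enum sketch_blk_state { SKETCH_DELALLOC, SKETCH_MAPPED, SKETCH_PUNCHED };

/*
 * Model one writeback pass over a page whose blocks all start as delalloc.
 * Mapping fails at block 'fail_at'. With 'unconditional' false, the discard
 * only runs if no block was added to the ioend; with it true, the discard
 * runs on every mapping error, starting at the failing block.
 * Returns the number of delalloc blocks left behind.
 */
static int
sketch_leaked_blocks(int fail_at, bool unconditional)
{
	enum sketch_blk_state blk[SKETCH_BLOCKS_PER_PAGE];
	int count = 0;	/* blocks added to the ioend so far */
	int leaked = 0;
	int i;

	for (i = 0; i < SKETCH_BLOCKS_PER_PAGE; i++)
		blk[i] = SKETCH_DELALLOC;

	/* Map blocks in file order until the failing block is reached. */
	for (i = 0; i < SKETCH_BLOCKS_PER_PAGE && i != fail_at; i++) {
		blk[i] = SKETCH_MAPPED;
		count++;
	}

	/* Mapping error: decide whether the discard callback runs. */
	if (i < SKETCH_BLOCKS_PER_PAGE && (unconditional || count == 0)) {
		for (; i < SKETCH_BLOCKS_PER_PAGE; i++)
			blk[i] = SKETCH_PUNCHED;
	}

	for (i = 0; i < SKETCH_BLOCKS_PER_PAGE; i++)
		if (blk[i] == SKETCH_DELALLOC)
			leaked++;
	return leaked;
}
```

In this model, sketch_leaked_blocks(2, false) leaves two delalloc
blocks behind because two blocks were already mapped, while
sketch_leaked_blocks(2, true) leaves none.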

Since XFS is the only user of ->discard_page(), tweak the semantics
to invoke the callback unconditionally on mapping errors and provide
the file offset that failed to map. Update xfs_discard_page() to
discard the corresponding portion of the file and pass the range
along to iomap_invalidatepage(). The latter already properly handles
both full and sub-page scenarios by not changing any iomap or page
state on sub-page invalidations.
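
The range arithmetic described above can be sketched in user space as
follows. This is an illustrative model only: struct
sketch_discard_range, sketch_discard_range() and the fixed 4096-byte
page size are assumptions made for the example, and the shifts stand in
for XFS_B_TO_FSBT() with a power-of-two block size.

```c
#include <stdint.h>

/* Assumption for illustration: 4k pages. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical result type; not a kernel structure. */
struct sketch_discard_range {
	uint64_t	start_fsb;	/* first fs block to punch */
	uint64_t	nr_fsb;		/* number of fs blocks to punch */
	unsigned int	inval_off;	/* in-page offset where invalidation starts */
	unsigned int	inval_len;	/* bytes invalidated, up to the end of the page */
};

/*
 * Given the file offset that failed to map and log2 of the block size,
 * compute the delalloc range to punch and the page range to invalidate.
 * Both ranges run from the failing offset to the end of the page.
 */
static struct sketch_discard_range
sketch_discard_range(uint64_t fileoff, unsigned int blkbits)
{
	struct sketch_discard_range r;
	unsigned int pageoff = (unsigned int)(fileoff & (SKETCH_PAGE_SIZE - 1));

	r.start_fsb = fileoff >> blkbits;
	r.nr_fsb = (SKETCH_PAGE_SIZE >> blkbits) - (pageoff >> blkbits);
	r.inval_off = pageoff;
	r.inval_len = SKETCH_PAGE_SIZE - pageoff;
	return r;
}
```

With 1k blocks (blkbits = 10) and a failure at file offset 10240, the
sketch punches two blocks starting at block 10 and invalidates the
second half of the page. A failure at a page-aligned offset covers the
whole page, matching the previous full-page behaviour.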

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

authored by Brian Foster and committed by Darrick J. Wong 763e4cdc 869ae85d

+17 -14
+8 -7
fs/iomap/buffered-io.c
···
 	 * appropriately.
 	 */
 	if (unlikely(error)) {
+		/*
+		 * Let the filesystem know what portion of the current page
+		 * failed to map. If the page wasn't been added to ioend, it
+		 * won't be affected by I/O completion and we must unlock it
+		 * now.
+		 */
+		if (wpc->ops->discard_page)
+			wpc->ops->discard_page(page, file_offset);
 		if (!count) {
-			/*
-			 * If the current page hasn't been added to ioend, it
-			 * won't be affected by I/O completions and we must
-			 * discard and unlock it right here.
-			 */
-			if (wpc->ops->discard_page)
-				wpc->ops->discard_page(page);
 			ClearPageUptodate(page);
 			unlock_page(page);
 			goto done;
+8 -6
fs/xfs/xfs_aops.c
···
  */
 static void
 xfs_discard_page(
-	struct page		*page)
+	struct page		*page,
+	loff_t			fileoff)
 {
 	struct inode		*inode = page->mapping->host;
 	struct xfs_inode	*ip = XFS_I(inode);
 	struct xfs_mount	*mp = ip->i_mount;
-	loff_t			offset = page_offset(page);
-	xfs_fileoff_t		start_fsb = XFS_B_TO_FSBT(mp, offset);
+	unsigned int		pageoff = offset_in_page(fileoff);
+	xfs_fileoff_t		start_fsb = XFS_B_TO_FSBT(mp, fileoff);
+	xfs_fileoff_t		pageoff_fsb = XFS_B_TO_FSBT(mp, pageoff);
 	int			error;
 
 	if (XFS_FORCED_SHUTDOWN(mp))
···
 
 	xfs_alert_ratelimited(mp,
 		"page discard on page "PTR_FMT", inode 0x%llx, offset %llu.",
-			page, ip->i_ino, offset);
+			page, ip->i_ino, fileoff);
 
 	error = xfs_bmap_punch_delalloc_range(ip, start_fsb,
-			i_blocks_per_page(inode, page));
+			i_blocks_per_page(inode, page) - pageoff_fsb);
 	if (error && !XFS_FORCED_SHUTDOWN(mp))
 		xfs_alert(mp, "page discard unable to remove delalloc mapping.");
 out_invalidate:
-	iomap_invalidatepage(page, 0, PAGE_SIZE);
+	iomap_invalidatepage(page, pageoff, PAGE_SIZE - pageoff);
 }
 
 static const struct iomap_writeback_ops xfs_writeback_ops = {
+1 -1
include/linux/iomap.h
···
 	 * Optional, allows the file system to discard state on a page where
 	 * we failed to submit any I/O.
 	 */
-	void (*discard_page)(struct page *page);
+	void (*discard_page)(struct page *page, loff_t fileoff);
 };
 
 struct iomap_writepage_ctx {