Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: dont clear PG_uptodate on truncate/invalidate

Brian Wang reported that a FUSE filesystem exported through NFS could
return I/O errors on read. This was traced to splice_direct_to_actor()
returning a short or zero count when racing with page invalidation.

However, this is not specific to FUSE or NFSD: other filesystems
(notably NFS) also call invalidate_inode_pages2() to purge stale data
from the cache.

If this happens while such pages are sitting in a pipe buffer, then
splice(2) from the pipe can return zero, and read(2) from the pipe can
return ENODATA.

The zero return is especially bad, since it implies end-of-file or
disconnected pipe/socket, and is documented as such for splice. But
returning an error for read() is also nasty, when in fact there was no
error (data becoming stale is not an error).
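
To illustrate why both outcomes mislead a consumer, here is a minimal
userspace sketch of a typical read loop on a pipe. The helper name
read_until_eof() is hypothetical and is not part of this patch; the loop
only shows how ordinary callers interpret the return values.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read from fd until end-of-file or until buf is full.
 * Returns the number of bytes read, or -1 with errno set on error. */
ssize_t read_until_eof(int fd, char *buf, size_t cap)
{
    size_t total = 0;

    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);

        if (n > 0) {
            total += (size_t)n;
            continue;
        }
        if (n == 0)
            break;              /* caller can only assume end-of-file */
        if (errno == EINTR)
            continue;           /* interrupted by a signal, retry */
        return -1;              /* any other errno (ENODATA included) is a hard failure */
    }
    return (ssize_t)total;
}
```

A caller written this way stops on a zero return and aborts on an error,
so a spurious zero silently truncates the data and a spurious ENODATA
fails a transfer that had nothing wrong with it.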

The same problems can be triggered by "hole punching" with
madvise(MADV_REMOVE).

Fix this by not clearing the PG_uptodate flag on truncation and
invalidation.
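
The effect of the change can be sketched with a toy userspace model. This
is not kernel code: struct toy_page, toy_truncate() and toy_pipe_confirm()
are hypothetical stand-ins, and the confirm step assumes that a pipe
reader rejects a buffer whose page is no longer marked up to date.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a page cache page; not the kernel's struct page. */
struct toy_page {
    bool uptodate;   /* models the PG_uptodate flag */
    bool in_cache;   /* models the page still being attached to a mapping */
    char data[8];    /* contents already handed to a pipe buffer */
};

/* Model of removing a page from the cache on truncate/invalidate.
 * clear_uptodate == true mimics the old behaviour, false the patched one. */
void toy_truncate(struct toy_page *pg, bool clear_uptodate)
{
    pg->in_cache = false;
    if (clear_uptodate)
        pg->uptodate = false;
}

/* Model of a pipe reader validating a buffer before using its data.
 * Returns 0 if the data may be used, -ENODATA otherwise (assumed check). */
int toy_pipe_confirm(const struct toy_page *pg)
{
    return pg->uptodate ? 0 : -ENODATA;
}
```

In this model a page truncated with clear_uptodate set fails confirmation
with -ENODATA, while the same page truncated without clearing the flag is
still readable from the pipe even though it has left the cache.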

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

authored by Miklos Szeredi and committed by Linus Torvalds
84209e02 2b12a4c5

mm/truncate.c | 2 --
···
 	cancel_dirty_page(page, PAGE_CACHE_SIZE);
 
 	remove_from_page_cache(page);
-	ClearPageUptodate(page);
 	ClearPageMappedToDisk(page);
 	page_cache_release(page);	/* pagecache ref */
 }
···
 	BUG_ON(PagePrivate(page));
 	__remove_from_page_cache(page);
 	spin_unlock_irq(&mapping->tree_lock);
-	ClearPageUptodate(page);
 	page_cache_release(page);	/* pagecache ref */
 	return 1;
 failed: