Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm, netfs, fscache: stop read optimisation when folio removed from pagecache

Fscache has an optimisation by which reads from the cache are skipped
until we know that (a) there's data there to be read and (b) that data
isn't entirely covered by pages resident in the netfs pagecache. This is
done with two flags manipulated by fscache_note_page_release():

	if (...
	    test_bit(FSCACHE_COOKIE_HAVE_DATA, &cookie->flags) &&
	    test_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags))
		clear_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);

where the NO_DATA_TO_READ flag causes cachefiles_prepare_read() to
indicate that netfslib should download from the server or clear the page
instead.
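
As a rough illustration of the interaction between the two flags, here is a
minimal userspace model. It is only a sketch: the model_* names, the
MODEL_* flag values and the read_source enum are hypothetical, the extra
condition elided as "..." above is left out, and the real
cachefiles_prepare_read() takes more into account than this one flag.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the cookie flag bits; not the kernel values. */
enum {
	MODEL_COOKIE_HAVE_DATA		= 1u << 0,
	MODEL_COOKIE_NO_DATA_TO_READ	= 1u << 1,
};

struct model_cookie {
	unsigned int flags;
};

enum read_source {
	READ_FROM_CACHE,
	DOWNLOAD_OR_CLEAR,
};

/* Models only the flag test: while NO_DATA_TO_READ is set, skip the cache. */
static inline enum read_source model_prepare_read(const struct model_cookie *cookie)
{
	if (cookie->flags & MODEL_COOKIE_NO_DATA_TO_READ)
		return DOWNLOAD_OR_CLEAR;
	return READ_FROM_CACHE;
}

/*
 * Models the release hook: once a page leaves the pagecache and the cache
 * holds data, the "nothing to read" shortcut can no longer be trusted.
 */
static inline void model_note_page_release(struct model_cookie *cookie)
{
	if ((cookie->flags & MODEL_COOKIE_HAVE_DATA) &&
	    (cookie->flags & MODEL_COOKIE_NO_DATA_TO_READ))
		cookie->flags &= ~MODEL_COOKIE_NO_DATA_TO_READ;
}
```

In this model, a cookie with both flags set skips the cache until
model_note_page_release() runs; if that hook is never called, reads keep
bypassing the cache, which is the failure this patch addresses.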

The fscache_note_page_release() function is intended to be called from
->releasepage() - but that only gets called if PG_private or PG_private_2
is set - and currently the former is at the discretion of the network
filesystem and the latter is only set whilst a page is being written to
the cache, so sometimes we miss clearing the optimisation.

Fix this by following Willy's suggestion[1] and adding an address_space
flag, AS_RELEASE_ALWAYS, that causes filemap_release_folio() to always call
->release_folio() if it's set, even if PG_private or PG_private_2 aren't
set.

Note that this would require folio_test_private() and page_has_private() to
become more complicated. To avoid that, in the places[*] where these are
used to conditionalise calls to filemap_release_folio() and
try_to_release_page(), the tests are removed, those functions are simply
called unconditionally, and the test is performed inside them.

[*] There are some exceptions in vmscan.c where the check guards more than
just a call to the releaser. I've added a function, folio_needs_release()
to wrap all the checks for that.
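
The effect of the new check can be sketched in userspace as follows. This is
an illustrative model only: the model_* types and the release_calls counter
are hypothetical, and the real filemap_release_folio() also deals with
locking, writeback and buffer heads, none of which is modelled here.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_mapping {
	bool release_always;	/* stands in for AS_RELEASE_ALWAYS */
	int release_calls;	/* counts simulated ->release_folio() calls */
};

struct model_folio {
	bool has_private;	/* stands in for PG_private / PG_private_2 */
	struct model_mapping *mapping;	/* may be NULL */
};

/* Same shape as the check in mm/internal.h after this patch. */
static inline bool model_folio_needs_release(const struct model_folio *folio)
{
	return folio->has_private ||
		(folio->mapping && folio->mapping->release_always);
}

/* Callers no longer test for private data; the test lives in here. */
static inline bool model_release_folio(struct model_folio *folio)
{
	if (!model_folio_needs_release(folio))
		return true;	/* nothing to do, folio may be freed */
	if (folio->mapping)
		folio->mapping->release_calls++;
	return true;
}
```

With release_always clear and no private data the releaser is never
counted; setting release_always makes every call reach it, which is what
lets the filesystem observe each folio leaving the pagecache.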

AS_RELEASE_ALWAYS should be set if a non-NULL cookie is obtained from
fscache and cleared in ->evict_inode() before truncate_inode_pages_final()
is called.
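
The intended lifecycle can be sketched with a small userspace model. All
names here (model_inode, model_get_cookie() and so on) are hypothetical and
merely mirror the ordering described above; they are not kernel interfaces.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_inode {
	const void *cache;		/* stands in for the fscache cookie */
	bool release_always;		/* stands in for AS_RELEASE_ALWAYS */
	bool truncated;			/* final truncation has run */
	bool flag_set_at_truncate;	/* records the flag state at that point */
};

/* Set the flag only if a non-NULL cookie was actually obtained. */
static inline void model_get_cookie(struct model_inode *inode, const void *cookie)
{
	inode->cache = cookie;
	if (inode->cache)
		inode->release_always = true;
}

/* Stand-in for truncate_inode_pages_final(); notes whether the flag was set. */
static inline void model_truncate_final(struct model_inode *inode)
{
	inode->flag_set_at_truncate = inode->release_always;
	inode->truncated = true;
}

/* Stand-in for ->evict_inode(): clear the flag first, then truncate. */
static inline void model_evict_inode(struct model_inode *inode)
{
	inode->release_always = false;
	model_truncate_final(inode);
}
```

An inode that never obtained a cookie never has the flag set, and an
evicted inode reaches the final truncation with the flag already clear.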

Additionally, the FSCACHE_COOKIE_NO_DATA_TO_READ flag needs to be cleared
and the optimisation cancelled if a cachefiles object already contains data
when we open it.

[dwysocha@redhat.com: call folio_mapping() inside folio_needs_release()]
Link: https://github.com/DaveWysochanskiRH/kernel/commit/902c990e311120179fa5de99d68364b2947b79ec
Link: https://lkml.kernel.org/r/20230628104852.3391651-3-dhowells@redhat.com
Fixes: 1f67e6d0b188 ("fscache: Provide a function to note the release of a page")
Fixes: 047487c947e8 ("cachefiles: Implement the I/O routines")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Reported-by: Rohith Surabattula <rohiths.msft@gmail.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Tested-by: SeongJae Park <sj@kernel.org>
Cc: Daire Byrne <daire.byrne@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steve French <sfrench@samba.org>
Cc: Shyam Prasad N <nspmangalore@gmail.com>
Cc: Rohith Surabattula <rohiths.msft@gmail.com>
Cc: Dave Wysochanski <dwysocha@redhat.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jingbo Xu <jefflexu@linux.alibaba.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by David Howells and committed by Andrew Morton
b4fa966f 0201ebf2

+33 -1
+2
fs/9p/cache.c
···
 				       &path, sizeof(path),
 				       &version, sizeof(version),
 				       i_size_read(&v9inode->netfs.inode));
+	if (v9inode->netfs.cache)
+		mapping_set_release_always(inode->i_mapping);
 
 	p9_debug(P9_DEBUG_FSC, "inode %p get cookie %p\n",
 		 inode, v9fs_inode_cookie(v9inode));
+2
fs/afs/internal.h
···
 {
 #ifdef CONFIG_AFS_FSCACHE
 	vnode->netfs.cache = cookie;
+	if (cookie)
+		mapping_set_release_always(vnode->netfs.inode.i_mapping);
 #endif
 }
 
+2
fs/cachefiles/namei.c
···
 	if (ret < 0)
 		goto check_failed;
 
+	clear_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &object->cookie->flags);
+
 	object->file = file;
 
 	/* Always update the atime on an object we've just looked up (this is
+2
fs/ceph/cache.c
···
 				       &ci->i_vino, sizeof(ci->i_vino),
 				       &ci->i_version, sizeof(ci->i_version),
 				       i_size_read(inode));
+	if (ci->netfs.cache)
+		mapping_set_release_always(inode->i_mapping);
 }
 
 void ceph_fscache_unregister_inode_cookie(struct ceph_inode_info *ci)
+3
fs/nfs/fscache.c
···
 					       &auxdata, /* aux_data */
 					       sizeof(auxdata),
 					       i_size_read(inode));
+
+	if (netfs_inode(inode)->cache)
+		mapping_set_release_always(inode->i_mapping);
 }
 
 /*
+2
fs/smb/client/fscache.c
···
 				       &cifsi->uniqueid, sizeof(cifsi->uniqueid),
 				       &cd, sizeof(cd),
 				       i_size_read(&cifsi->netfs.inode));
+	if (cifsi->netfs.cache)
+		mapping_set_release_always(inode->i_mapping);
 }
 
 void cifs_fscache_unuse_inode_cookie(struct inode *inode, bool update)
+16
include/linux/pagemap.h
···
 	/* writeback related tags are not used */
 	AS_NO_WRITEBACK_TAGS = 5,
 	AS_LARGE_FOLIO_SUPPORT = 6,
+	AS_RELEASE_ALWAYS,	/* Call ->release_folio(), even if no private data */
 };
 
 /**
···
 static inline int mapping_use_writeback_tags(struct address_space *mapping)
 {
 	return !test_bit(AS_NO_WRITEBACK_TAGS, &mapping->flags);
+}
+
+static inline bool mapping_release_always(const struct address_space *mapping)
+{
+	return test_bit(AS_RELEASE_ALWAYS, &mapping->flags);
+}
+
+static inline void mapping_set_release_always(struct address_space *mapping)
+{
+	set_bit(AS_RELEASE_ALWAYS, &mapping->flags);
+}
+
+static inline void mapping_clear_release_always(struct address_space *mapping)
+{
+	clear_bit(AS_RELEASE_ALWAYS, &mapping->flags);
 }
 
 static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
+4 -1
mm/internal.h
···
  */
 static inline bool folio_needs_release(struct folio *folio)
 {
-	return folio_has_private(folio);
+	struct address_space *mapping = folio_mapping(folio);
+
+	return folio_has_private(folio) ||
+		(mapping && mapping_release_always(mapping));
 }
 
 extern unsigned long highest_memmap_pfn;