Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

netfs: do not unlock and put the folio twice

check_write_begin() will unlock and put the folio when it returns
non-zero, so we should avoid unlocking and putting it twice in the
netfs layer.

Change the way ->check_write_begin() works in the following two ways:

(1) Pass it a pointer to the folio pointer, allowing it to unlock and put
the folio prior to doing the stuff it wants to do, provided it clears
the folio pointer.

(2) Change the return values such that 0 with folio pointer set means
continue, 0 with folio pointer cleared means re-get and all error
codes indicating an error (no special treatment for -EAGAIN).

[ bagasdotme: use Sphinx code text syntax for *foliop pointer ]

Cc: stable@vger.kernel.org
Link: https://tracker.ceph.com/issues/56423
Link: https://lore.kernel.org/r/cf169f43-8ee7-8697-25da-0204d1b4343e@redhat.com
Co-developed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Authored by Xiubo Li, committed by Ilya Dryomov
fac47b43 32346491

5 files changed (+23 -17)
Documentation/filesystems/netfs_library.rst (+5 -3)

@@ -301,7 +301,7 @@
 	void (*issue_read)(struct netfs_io_subrequest *subreq);
 	bool (*is_still_valid)(struct netfs_io_request *rreq);
 	int (*check_write_begin)(struct file *file, loff_t pos, unsigned len,
-				 struct folio *folio, void **_fsdata);
+				 struct folio **foliop, void **_fsdata);
 	void (*done)(struct netfs_io_request *rreq);
 };

@@ -381,8 +381,10 @@
   allocated/grabbed the folio to be modified to allow the filesystem to flush
   conflicting state before allowing it to be modified.

-  It should return 0 if everything is now fine, -EAGAIN if the folio should be
-  regrabbed and any other error code to abort the operation.
+  It may unlock and discard the folio it was given and set the caller's folio
+  pointer to NULL.  It should return 0 if everything is now fine (``*foliop``
+  left set) or the op should be retried (``*foliop`` cleared) and any other
+  error code to abort the operation.

 * ``done``
fs/afs/file.c (+1 -1)

@@ -375,7 +375,7 @@
 }

 static int afs_check_write_begin(struct file *file, loff_t pos, unsigned len,
-				 struct folio *folio, void **_fsdata)
+				 struct folio **foliop, void **_fsdata)
 {
 	struct afs_vnode *vnode = AFS_FS_I(file_inode(file));

fs/ceph/addr.c (+6 -5)

@@ -63,7 +63,7 @@
 	 (CONGESTION_ON_THRESH(congestion_kb) >> 2))

 static int ceph_netfs_check_write_begin(struct file *file, loff_t pos, unsigned int len,
-					struct folio *folio, void **_fsdata);
+					struct folio **foliop, void **_fsdata);

 static inline struct ceph_snap_context *page_snap_context(struct page *page)
 {

@@ -1288,18 +1288,19 @@
 }

 static int ceph_netfs_check_write_begin(struct file *file, loff_t pos, unsigned int len,
-					struct folio *folio, void **_fsdata)
+					struct folio **foliop, void **_fsdata)
 {
 	struct inode *inode = file_inode(file);
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_snap_context *snapc;

-	snapc = ceph_find_incompatible(folio_page(folio, 0));
+	snapc = ceph_find_incompatible(folio_page(*foliop, 0));
 	if (snapc) {
 		int r;

-		folio_unlock(folio);
-		folio_put(folio);
+		folio_unlock(*foliop);
+		folio_put(*foliop);
+		*foliop = NULL;
 		if (IS_ERR(snapc))
 			return PTR_ERR(snapc);

fs/netfs/buffered_read.c (+10 -7)

@@ -319,8 +319,9 @@
  * conflicting writes once the folio is grabbed and locked.  It is passed a
  * pointer to the fsdata cookie that gets returned to the VM to be passed to
  * write_end.  It is permitted to sleep.  It should return 0 if the request
- * should go ahead; unlock the folio and return -EAGAIN to cause the folio to
- * be regot; or return an error.
+ * should go ahead or it may return an error.  It may also unlock and put the
+ * folio, provided it sets ``*foliop`` to NULL, in which case a return of 0
+ * will cause the folio to be re-got and the process to be retried.
  *
  * The calling netfs must initialise a netfs context contiguous to the vfs
  * inode before calling this.

@@ -349,13 +348,13 @@

 	if (ctx->ops->check_write_begin) {
 		/* Allow the netfs (eg. ceph) to flush conflicts. */
-		ret = ctx->ops->check_write_begin(file, pos, len, folio, _fsdata);
+		ret = ctx->ops->check_write_begin(file, pos, len, &folio, _fsdata);
 		if (ret < 0) {
 			trace_netfs_failure(NULL, NULL, ret, netfs_fail_check_write_begin);
-			if (ret == -EAGAIN)
-				goto retry;
 			goto error;
 		}
+		if (!folio)
+			goto retry;
 	}

 	if (folio_test_uptodate(folio))

@@ -417,8 +416,10 @@
 error_put:
 	netfs_put_request(rreq, false, netfs_rreq_trace_put_failed);
 error:
-	folio_unlock(folio);
-	folio_put(folio);
+	if (folio) {
+		folio_unlock(folio);
+		folio_put(folio);
+	}
 	_leave(" = %d", ret);
 	return ret;
 }
include/linux/netfs.h (+1 -1)

@@ -214,7 +214,7 @@
 	void (*issue_read)(struct netfs_io_subrequest *subreq);
 	bool (*is_still_valid)(struct netfs_io_request *rreq);
 	int (*check_write_begin)(struct file *file, loff_t pos, unsigned len,
-				 struct folio *folio, void **_fsdata);
+				 struct folio **foliop, void **_fsdata);
 	void (*done)(struct netfs_io_request *rreq);
 };