erofs: fix unexpected EIO under memory pressure

erofs readahead could fail with ENOMEM under memory pressure because
it allocates pages with GFP_NOWAIT | __GFP_NORETRY, whereas a regular
read uses GFP_KERNEL. If readahead fails (leaving folios non-uptodate),
the original request falls back to a synchronous read, and
`.read_folio()` should return an appropriate errno.

However, when readahead and read operations compete, the read
operation could return an unintended EIO because of incorrect
error propagation.

To resolve this, this patch changes the behavior so that, when the
PCL is for read (which means pcl.besteffort is true), it attempts actual
decompression instead of propagating the previous error, except for the
initial EIO.

Consider the following scenario:

 - Page size: 4K
 - The original size of FileA: 16K
 - Compress-ratio per PCL: 50% (Uncompressed 8K -> Compressed 4K)
   [page0, page1] [page2, page3]
   [PCL0]---------[PCL1]

 - Function declarations:
   . pread(fd, buf, count, offset)
   . readahead(fd, offset, count)
 - Process A tries to read the last 4K
 - Process B tries to do readahead 8K from 4K
 - RA: readahead, besteffort == false
 - R: read, besteffort == true

        <process A>                   <process B>

pread(FileA, buf, 4K, 12K)
  do readahead(page3) // failed with ENOMEM
  wait_lock(page3)
    if (!uptodate(page3))
      goto do_read
                               readahead(FileA, 4K, 8K)
                               // Here create PCL-chain like below:
                               // [null, page1] [page2, null]
                               //   [PCL0:RA]-----[PCL1:RA]
...
  do read(page3)        // found [PCL1:RA] and add page3 into it,
                        // and then, change PCL1 from RA to R
...
                               // Now, PCL-chain is as below:
                               // [null, page1] [page2, page3]
                               //   [PCL0:RA]-----[PCL1:R]

                                 // try to decompress PCL-chain...
                                 z_erofs_decompress_queue
                                   err = 0;

                                   // failed with ENOMEM, so page 1
                                   // only for RA will not be uptodated.
                                   // it's okay.
                                   err = decompress([PCL0:RA], err)

                                   // However, ENOMEM propagated to next
                                   // PCL, even though PCL is not only
                                   // for RA but also for R. As a result,
                                   // it just failed with ENOMEM without
                                   // trying any decompression, so page2
                                   // and page3 will not be uptodated.
                ** BUG HERE ** --> err = decompress([PCL1:R], err)

                                   return err as ENOMEM
...
    wait_lock(page3)
      if (!uptodate(page3))
        return EIO      <-- Return an unexpected EIO!
...
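
The propagation problem in the timeline above can be modeled in plain
userspace C. This is only a rough sketch, not kernel code: struct
model_pcl, model_decompress_old(), model_decompress_new() and
model_run_queue() are hypothetical names invented for illustration, and
each pcluster's own failure is reduced to a single own_err value. The
besteffort field is carried only as a label, since the model follows the
shape of the diff and does not test it.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pcluster; not struct z_erofs_pcluster. */
struct model_pcl {
	bool besteffort;	/* label only: true once a sync read (R) needs it */
	int own_err;		/* error this pcluster hits on its own, 0 if none */
	bool uptodate;		/* true only if decompression ran and succeeded */
};

/* Before the patch: an inherited error skips decompression entirely. */
static int model_decompress_old(struct model_pcl *pcl, int err)
{
	if (!err)
		err = pcl->own_err;
	pcl->uptodate = (err == 0);
	return err;
}

/* After the patch: only the queue-wide I/O failure is inherited. */
static int model_decompress_new(struct model_pcl *pcl, bool eio)
{
	int err = eio ? -EIO : pcl->own_err;

	pcl->uptodate = (err == 0);
	return err;
}

/* Walk the chain like the queue loop does, keeping the latest error. */
static int model_run_queue(struct model_pcl *chain, size_t n, bool eio,
			   bool patched)
{
	int err = patched ? 0 : (eio ? -EIO : 0);

	for (size_t i = 0; i < n; i++) {
		int ret = patched ? model_decompress_new(&chain[i], eio)
				  : model_decompress_old(&chain[i], err);

		if (ret)
			err = ret;
	}
	return err;
}
```

With a two-entry chain where the first entry has own_err = -ENOMEM (PCL0,
RA) and the second has own_err = 0 (PCL1, R), the unpatched walk leaves
both entries non-uptodate, while the patched walk still reports -ENOMEM
for the queue but marks the second entry uptodate.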

Fixes: 2349d2fa02db ("erofs: sunset unneeded NOFAILs")
Cc: stable@vger.kernel.org
Reviewed-by: Jaewook Kim <jw5454.kim@samsung.com>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Junbeom Yeom <junbeom.yeom@samsung.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

authored by Junbeom Yeom and committed by Gao Xiang 4012d785 8f0b4cce

+4 -4
fs/erofs/zdata.c
@@ -1262,7 +1262,7 @@
 	return err;
 }
 
-static int z_erofs_decompress_pcluster(struct z_erofs_backend *be, int err)
+static int z_erofs_decompress_pcluster(struct z_erofs_backend *be, bool eio)
 {
 	struct erofs_sb_info *const sbi = EROFS_SB(be->sb);
 	struct z_erofs_pcluster *pcl = be->pcl;
@@ -1270,7 +1270,7 @@
 	const struct z_erofs_decompressor *alg =
 			z_erofs_decomp[pcl->algorithmformat];
 	bool try_free = true;
-	int i, j, jtop, err2;
+	int i, j, jtop, err2, err = eio ? -EIO : 0;
 	struct page *page;
 	bool overlapped;
 	const char *reason;
@@ -1413,12 +1413,12 @@
 		.pcl = io->head,
 	};
 	struct z_erofs_pcluster *next;
-	int err = io->eio ? -EIO : 0;
+	int err = 0;
 
 	for (; be.pcl != Z_EROFS_PCLUSTER_TAIL; be.pcl = next) {
 		DBG_BUGON(!be.pcl);
 		next = READ_ONCE(be.pcl->next);
-		err = z_erofs_decompress_pcluster(&be, err) ?: err;
+		err = z_erofs_decompress_pcluster(&be, io->eio) ?: err;
 	}
 	return err;
 }