Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

erofs: fix deadlock when shrink erofs slab

We observed the following deadlock in a stress test under a
low-memory scenario:

Thread A                        Thread B
- erofs_shrink_scan
 - erofs_try_to_release_workgroup
  - erofs_workgroup_try_to_freeze -- A
                                - z_erofs_do_read_page
                                 - z_erofs_collection_begin
                                  - z_erofs_register_collection
                                   - erofs_insert_workgroup
                                    - xa_lock(&sbi->managed_pslots) -- B
                                    - erofs_workgroup_get
                                     - erofs_wait_on_workgroup_freezed -- A
  - xa_erase
   - xa_lock(&sbi->managed_pslots) -- B

To fix this, xa_lock must be held before freezing the workgroup,
since the xarray will be modified afterwards. So let's hold the lock
before accessing each workgroup, just as we did with the radix tree
before.

[ Gao Xiang: Jianhua Hao also reports this issue at
https://lore.kernel.org/r/b10b85df30694bac8aadfe43537c897a@xiaomi.com ]

Link: https://lore.kernel.org/r/20211118135844.3559-1-huangjianan@oppo.com
Fixes: 64094a04414f ("erofs: convert workstn to XArray")
Reviewed-by: Chao Yu <chao@kernel.org>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Huang Jianan <huangjianan@oppo.com>
Reported-by: Jianhua Hao <haojianhua1@xiaomi.com>
Signed-off-by: Gao Xiang <xiang@kernel.org>

Authored by Huang Jianan and committed by Gao Xiang
57bbeacd 13605725

+6 -2
fs/erofs/utils.c
···
 	 * however in order to avoid some race conditions, add a
 	 * DBG_BUGON to observe this in advance.
 	 */
-	DBG_BUGON(xa_erase(&sbi->managed_pslots, grp->index) != grp);
+	DBG_BUGON(__xa_erase(&sbi->managed_pslots, grp->index) != grp);
 
 	/* last refcount should be connected with its managed pslot. */
 	erofs_workgroup_unfreeze(grp, 0);
···
 	unsigned int freed = 0;
 	unsigned long index;
 
+	xa_lock(&sbi->managed_pslots);
 	xa_for_each(&sbi->managed_pslots, index, grp) {
 		/* try to shrink each valid workgroup */
 		if (!erofs_try_to_release_workgroup(sbi, grp))
 			continue;
+		xa_unlock(&sbi->managed_pslots);
 
 		++freed;
 		if (!--nr_shrink)
-			break;
+			return freed;
+		xa_lock(&sbi->managed_pslots);
 	}
+	xa_unlock(&sbi->managed_pslots);
 	return freed;
 }
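
The lock handling in the shrinker loop above can be illustrated with a small userspace sketch. This is not the erofs code: a pthread mutex stands in for xa_lock, a fixed array stands in for the xarray, and the names entry, table, table_lock, try_release_locked and shrink are hypothetical, chosen only for illustration. It shows the same shape as the patch: the lock is taken before each entry is examined, the erase happens with the lock already held (as __xa_erase expects), and the lock is dropped after every successful release, so the early return leaves it unlocked.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SLOTS 8

/* Hypothetical stand-in for a workgroup: busy entries cannot be released. */
struct entry {
	bool present;
	bool busy;
};

/* Hypothetical stand-ins for xa_lock and the managed_pslots xarray. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct entry table[NR_SLOTS];

/*
 * Caller must hold table_lock. The check ("freeze") and the erase both
 * happen under the same lock, so no other thread can take the lock and
 * then wait on a half-released entry.
 */
static bool try_release_locked(struct entry *e)
{
	if (!e->present || e->busy)
		return false;
	e->present = false;	/* erase without retaking the lock */
	return true;
}

/* Release up to nr_shrink entries; returns the number released. */
static unsigned int shrink(unsigned long nr_shrink)
{
	unsigned int freed = 0;
	size_t i;

	pthread_mutex_lock(&table_lock);
	for (i = 0; i < NR_SLOTS; i++) {
		/* try to release each present entry */
		if (!try_release_locked(&table[i]))
			continue;
		pthread_mutex_unlock(&table_lock);

		++freed;
		if (!--nr_shrink)
			return freed;	/* lock was already dropped above */
		pthread_mutex_lock(&table_lock);
	}
	pthread_mutex_unlock(&table_lock);
	return freed;
}
```

Replacing the original break with a direct return matters in this shape: after a successful release the lock has already been dropped, so falling through to the unlock after the loop would unlock a mutex that is not held.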