[PATCH] shrinker->nr = LONG_MAX means deadlock for icache

With Andrew Morton <akpm@osdl.org>

The slab scanning code tries to balance the scanning rate of slabs versus the
scanning rate of LRU pages. To do this, it retains state concerning how many
slabs have been scanned - if a particular slab shrinker didn't scan enough
objects, we remember that for next time, and scan more objects on the next
pass.

The problem with this is that with (say) a huge number of GFP_NOIO
direct-reclaim attempts, the number of objects which are to be scanned when we
finally get a GFP_KERNEL request can be huge, because some shrinker handlers
just bail out if !__GFP_FS.
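
The growth described above can be reproduced with a small userspace sketch.
It is not the kernel code: struct toy_shrinker, toy_pass(), the objects field
and the fs_allowed flag are hypothetical names for illustration, and it assumes
that a handler which bails leaves the whole request to be carried over in nr,
as the description above implies. Only the delta arithmetic follows the diff.

```c
/* Hypothetical stand-in for the per-shrinker state. Only nr and seeks
 * correspond to fields used in the diff; objects is an assumption that
 * models what the handler reports when asked to scan 0 objects. */
struct toy_shrinker {
	long nr;		/* objects still owed from earlier passes */
	int seeks;
	unsigned long objects;
};

/* One reclaim pass. Returns how many objects the handler was asked to
 * scan; 0 models a handler that bails out because !__GFP_FS. */
static unsigned long toy_pass(struct toy_shrinker *s, unsigned long scanned,
			      unsigned long lru_pages, int fs_allowed)
{
	unsigned long long delta;
	unsigned long total_scan;

	/* Same arithmetic as the diff, with do_div() written as '/'. */
	delta = (4ULL * scanned) / s->seeks;
	delta *= s->objects;
	delta /= lru_pages + 1;
	s->nr += delta;

	total_scan = s->nr;
	s->nr = 0;
	if (!fs_allowed) {
		/* Nothing was scanned, so the whole amount is owed again. */
		s->nr += total_scan;
		return 0;
	}
	return total_scan;
}
```

With 1000 objects, seeks = 2, scanned = 1000 and lru_pages = 999, each bailed
pass adds 2000 to nr. After 100 such passes the next pass that may enter the
filesystem is asked to scan 202000 objects in a cache that holds 1000.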

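So the patch clamps the number of objects-to-be-scanned to twice the total
number of objects in the slab cache. A shrinker->nr that has wrapped negative
is now logged and reset to that object count instead of to LONG_MAX.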
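
As a rough illustration of the clamp, the two checks can be pulled out into a
standalone function. clamp_nr() is a hypothetical helper written for this
sketch; the patch performs the same checks inline in mm/vmscan.c, and max_pass
is the object count the shrinker reports when asked to scan 0 objects.

```c
/* Hypothetical helper mirroring the two checks the patch adds. */
static long clamp_nr(long nr, unsigned long max_pass)
{
	/* A negative value means the counter wrapped: fall back to the
	 * reported object count rather than LONG_MAX. */
	if (nr < 0)
		nr = max_pass;

	/* Never ask for more than twice the reported object count.
	 * nr is non-negative here, so the unsigned comparison is safe. */
	if ((unsigned long)nr > max_pass * 2)
		nr = max_pass * 2;

	return nr;
}
```

For a cache reporting 1000 objects, a backlog of 202000 is cut to 2000, a
backlog of 1500 is left alone, and a wrapped value becomes 1000.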

Signed-off-by: Andrea Arcangeli <andrea@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Authored by Andrea Arcangeli and committed by Linus Torvalds
ea164d73 154f484b

+15 -3
mm/vmscan.c
···
 	list_for_each_entry(shrinker, &shrinker_list, list) {
 		unsigned long long delta;
 		unsigned long total_scan;
+		unsigned long max_pass = (*shrinker->shrinker)(0, gfp_mask);
 
 		delta = (4 * scanned) / shrinker->seeks;
-		delta *= (*shrinker->shrinker)(0, gfp_mask);
+		delta *= max_pass;
 		do_div(delta, lru_pages + 1);
 		shrinker->nr += delta;
-		if (shrinker->nr < 0)
-			shrinker->nr = LONG_MAX;	/* It wrapped! */
+		if (shrinker->nr < 0) {
+			printk(KERN_ERR "%s: nr=%ld\n",
+				__FUNCTION__, shrinker->nr);
+			shrinker->nr = max_pass;
+		}
+
+		/*
+		 * Avoid risking looping forever due to too large nr value:
+		 * never try to free more than twice the estimate number of
+		 * freeable entries.
+		 */
+		if (shrinker->nr > max_pass * 2)
+			shrinker->nr = max_pass * 2;
 
 		total_scan = shrinker->nr;
 		shrinker->nr = 0;