Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

more aggressively use lumpy reclaim

During an AIM7 run on a 16GB system, fork started failing around 32000
threads, despite the system having plenty of free swap and 15GB of
pageable memory. This was on x86-64, so 8k stacks.
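The stack size matters because, assuming 4k base pages, an 8k kernel stack is an order-1 allocation: two physically contiguous pages. Fork therefore needs contiguous free memory, not just free memory. A minimal userspace sketch of that arithmetic follows; the page size constant and the helper names `size_to_order` and `stack_bytes` are assumptions for illustration, not kernel code.

```c
#define SKETCH_PAGE_SIZE   4096UL   /* assumed x86-64 base page size */
#define SKETCH_THREAD_SIZE 8192UL   /* the 8k kernel stack from the report */

/*
 * Hypothetical helper: smallest order such that
 * (SKETCH_PAGE_SIZE << order) >= size.
 */
static unsigned int size_to_order(unsigned long size)
{
	unsigned int order = 0;

	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Total bytes of kernel stack needed for nr_threads threads. */
static unsigned long stack_bytes(unsigned long nr_threads)
{
	return nr_threads * SKETCH_THREAD_SIZE;
}
```

Here `size_to_order(SKETCH_THREAD_SIZE)` is 1, and `stack_bytes(32000)` is 262144000 bytes (250 MiB). That is small next to 15GB of pageable memory, which suggests the failures came from a lack of contiguous pairs of pages rather than a shortage of memory.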

If a higher order allocation fails, we can either:
- keep evicting pages off the end of the LRUs and hope that
  we eventually create a contiguous region; this is somewhat
  unlikely if the system is under enough stress by new
  allocations
- after trying normal eviction for a bit, use lumpy reclaim
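For context, lumpy reclaim takes a page from the end of the LRU and then also tries to reclaim the other pages in the same naturally aligned block of 2^order pages, so that a successful pass frees a contiguous region. The sketch below shows only the block arithmetic, in userspace; `lumpy_block_start` and `lumpy_block_end` are hypothetical names, not kernel functions.

```c
/*
 * First page frame number of the naturally aligned 2^order block
 * that contains pfn.
 */
static unsigned long lumpy_block_start(unsigned long pfn, unsigned int order)
{
	return pfn & ~((1UL << order) - 1);
}

/* One past the last page frame number of that block. */
static unsigned long lumpy_block_end(unsigned long pfn, unsigned int order)
{
	return lumpy_block_start(pfn, order) + (1UL << order);
}
```

For an order-1 request that picks pfn 1001 off the LRU, the candidate block covers pfns 1000 and 1001.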

This patch switches the system to lumpy reclaim if the VM is having
trouble freeing enough pages, using the same threshold for detection as
used by pageout congestion wait.
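The detection rule can be sketched as a standalone function. This is a userspace illustration, not the kernel code: the constant values (12 for DEF_PRIORITY, 3 for PAGE_ALLOC_COSTLY_ORDER) are assumed from kernels of this era, and the `sketch_` names are hypothetical. Reclaim priority starts at DEF_PRIORITY and counts down as passes fail to free enough pages, so a lower value means the VM is having more trouble.

```c
#define SKETCH_DEF_PRIORITY            12  /* assumed starting reclaim priority */
#define SKETCH_PAGE_ALLOC_COSTLY_ORDER 3   /* assumed costly-order cutoff */

enum sketch_isolate_mode {
	SKETCH_ISOLATE_INACTIVE,  /* isolate pages from the inactive list only */
	SKETCH_ISOLATE_BOTH,      /* isolate both active and inactive pages */
};

static enum sketch_isolate_mode sketch_pick_mode(int order, int priority)
{
	/* Large contiguous requests always isolate from both lists. */
	if (order > SKETCH_PAGE_ALLOC_COSTLY_ORDER)
		return SKETCH_ISOLATE_BOTH;

	/* Smaller higher-order requests switch once reclaim is struggling. */
	if (order && priority < SKETCH_DEF_PRIORITY - 2)
		return SKETCH_ISOLATE_BOTH;

	return SKETCH_ISOLATE_INACTIVE;
}
```

Under these assumed values, an order-1 request such as an 8k stack stays on inactive-only isolation at priorities 12, 11 and 10, and switches to both lists from priority 9 downward. Order-0 requests never switch.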

Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Rik van Riel and committed by Linus Torvalds
33c120ed c5fdae46

+16 -4
mm/vmscan.c
···
  * of reclaimed pages
  */
 static unsigned long shrink_inactive_list(unsigned long max_scan,
-				struct zone *zone, struct scan_control *sc, int file)
+			struct zone *zone, struct scan_control *sc,
+			int priority, int file)
 {
 	LIST_HEAD(page_list);
 	struct pagevec pvec;
···
 		unsigned long nr_freed;
 		unsigned long nr_active;
 		unsigned int count[NR_LRU_LISTS] = { 0, };
-		int mode = (sc->order > PAGE_ALLOC_COSTLY_ORDER) ?
-					ISOLATE_BOTH : ISOLATE_INACTIVE;
+		int mode = ISOLATE_INACTIVE;
+
+		/*
+		 * If we need a large contiguous chunk of memory, or have
+		 * trouble getting a small set of contiguous pages, we
+		 * will reclaim both active and inactive pages.
+		 *
+		 * We use the same threshold as pageout congestion_wait below.
+		 */
+		if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
+			mode = ISOLATE_BOTH;
+		else if (sc->order && priority < DEF_PRIORITY - 2)
+			mode = ISOLATE_BOTH;
 
 		nr_taken = sc->isolate_pages(sc->swap_cluster_max,
 			     &page_list, &nr_scan, sc->order, mode,
···
 		shrink_active_list(nr_to_scan, zone, sc, priority, file);
 		return 0;
 	}
-	return shrink_inactive_list(nr_to_scan, zone, sc, file);
+	return shrink_inactive_list(nr_to_scan, zone, sc, priority, file);
 }
 
 /*