Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

selftests: memcg: adjust expected reclaim values of protected cgroups

The expected numbers are not easy to derive in closed form (plain
protection ratios certainly do not apply), so use a simulation to obtain
them.

Link: https://lkml.kernel.org/r/20220518161859.21565-4-mkoutny@suse.com
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: David Vernet <void@manifault.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Richard Palethorpe <rpalethorpe@suse.de>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
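
To illustrate why a fixed ratio is not enough, the sketch below evaluates the model's first step, e = min(n, c) * min(1, E / sum(min(n, c))), for a given consumption. It is a minimal illustration, not kernel code: the function name `effective_protection` and its signature are assumptions made here, and it deliberately leaves out the redistribution of unclaimed protection that the full model also performs.

```c
#include <stddef.h>

/*
 * Illustrative helper (hypothetical, not from the kernel): scale each
 * sibling's usage-capped protection by the parent's effective protection E.
 * n[] is the nominal protection, c[] the current consumption, e[] the result.
 */
static void effective_protection(double E, const double *n, const double *c,
				 double *e, size_t count)
{
	double siblings = 0.0;
	double scale;
	size_t i;

	/* Protection only counts up to what a sibling actually uses. */
	for (i = 0; i < count; i++)
		siblings += (c[i] < n[i]) ? c[i] : n[i];

	/* Overcommitted siblings share E proportionally; otherwise no scaling. */
	scale = (siblings > E) ? E / siblings : 1.0;

	for (i = 0; i < count; i++)
		e[i] = ((c[i] < n[i]) ? c[i] : n[i]) * scale;
}
```

With the testcase values in MB (E = 50, n = {75, 25, 0, 500}, c = {50, 50, 50, 0}) this gives roughly 33.3 and 16.7 for the first two siblings, which matches the old expectations. Once the first sibling has been reclaimed down to 40, the same formula gives roughly 30.8 and 19.2, so the split moves as consumption changes.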

authored by Michal Koutný and committed by akpm
f10b6e9a 1d09069f

+107 -12
+1
MAINTAINERS
···
 S:	Maintained
 F:	mm/memcontrol.c
 F:	mm/swap_cgroup.c
+F:	tools/testing/selftests/cgroup/memcg_protection.m
 F:	tools/testing/selftests/cgroup/test_kmem.c
 F:	tools/testing/selftests/cgroup/test_memcontrol.c
+89
tools/testing/selftests/cgroup/memcg_protection.m
···
+% SPDX-License-Identifier: GPL-2.0
+%
+% run as: octave-cli memcg_protection.m
+%
+% This script simulates reclaim protection behavior on a single level of memcg
+% hierarchy to illustrate how overcommitted protection spreads among siblings
+% (as it depends also on their current consumption).
+%
+% Simulation assumes siblings consumed the initial amount of memory (w/out
+% reclaim) and then the reclaim starts, all memory is reclaimable, i.e. treated
+% same. It simulates only non-low reclaim and assumes all memory.min = 0.
+%
+% Input configurations
+% --------------------
+% E number	parent effective protection
+% n vector	nominal protection of siblings set at the given level (memory.low)
+% c vector	current consumption -,,- (memory.current)
+
+% example from testcase (values in GB)
+E = 50 / 1024;
+n = [75 25 0 500 ] / 1024;
+c = [50 50 50 0] / 1024;
+
+% Reclaim parameters
+% ------------------
+
+% Minimal reclaim amount (GB)
+cluster = 32*4 / 2**20;
+
+% Reclaim coefficient (think as 0.5^sc->priority)
+alpha = .1
+
+% Simulation parameters
+% ---------------------
+epsilon = 1e-7;
+timeout = 1000;
+
+% Simulation loop
+% ---------------
+
+ch = [];
+eh = [];
+rh = [];
+
+for t = 1:timeout
+	% low_usage
+	u = min(c, n);
+	siblings = sum(u);
+
+	% effective_protection()
+	protected = min(n, c);                % start with nominal
+	e = protected * min(1, E / siblings); % normalize overcommit
+
+	% recursive protection
+	unclaimed = max(0, E - siblings);
+	parent_overuse = sum(c) - siblings;
+	if (unclaimed > 0 && parent_overuse > 0)
+		overuse = max(0, c - protected);
+		e += unclaimed * (overuse / parent_overuse);
+	endif
+
+	% get_scan_count()
+	r = alpha * c;             % assume all memory is in a single LRU list
+
+	% commit 1bc63fb1272b ("mm, memcg: make scan aggression always exclude protection")
+	sz = max(e, c);
+	r .*= (1 - (e+epsilon) ./ (sz+epsilon));
+
+	% uncomment to debug prints
+	% e, c, r
+
+	% nothing to reclaim, reached equilibrium
+	if max(r) < epsilon
+		break;
+	endif
+
+	% SWAP_CLUSTER_MAX roundup
+	r = max(r, (r > epsilon) .* cluster);
+	% XXX here I do parallel reclaim of all siblings
+	% in reality reclaim is serialized and each sibling recalculates own residual
+	c = max(c - r, 0);
+
+	ch = [ch ; c];
+	eh = [eh ; e];
+	rh = [rh ; r];
+endfor
+
+t
+c, e
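
Since the selftest that consumes these numbers is written in C, a C rendering of the same loop may be easier to step through for some readers. The following is a hypothetical port, not part of the commit: `simulate_protection`, `NSIB`, `dmin` and `dmax` are names introduced here, and the history arrays (`ch`, `eh`, `rh`) are dropped. It keeps the model's simplification that all siblings are reclaimed in parallel.

```c
#include <stddef.h>

#define NSIB 4	/* number of siblings in the testcase (assumption of this sketch) */

static double dmin(double a, double b) { return a < b ? a : b; }
static double dmax(double a, double b) { return a > b ? a : b; }

/*
 * Runs the model in place on c[] (current consumption) and returns the
 * iteration at which equilibrium was detected (1001 if it never was).
 * E: parent effective protection, n[]: nominal protection (memory.low),
 * alpha: reclaim coefficient, cluster: minimal reclaim amount.
 */
static int simulate_protection(double E, const double n[NSIB], double c[NSIB],
			       double alpha, double cluster)
{
	const double epsilon = 1e-7;
	const int timeout = 1000;
	int t;

	for (t = 1; t <= timeout; t++) {
		double prot[NSIB], e[NSIB], r[NSIB];
		double siblings = 0.0, total = 0.0, rmax = 0.0;
		double scale, unclaimed, parent_overuse;
		int i;

		/* low_usage: protection counts only up to actual consumption */
		for (i = 0; i < NSIB; i++) {
			prot[i] = dmin(n[i], c[i]);
			siblings += prot[i];
			total += c[i];
		}

		/* normalize overcommit (explicit guard instead of relying on Inf) */
		scale = siblings > E ? E / siblings : 1.0;
		unclaimed = dmax(0.0, E - siblings);
		parent_overuse = total - siblings;

		for (i = 0; i < NSIB; i++) {
			double sz;

			e[i] = prot[i] * scale;
			/* recursive protection: hand out what siblings left unclaimed */
			if (unclaimed > 0 && parent_overuse > 0)
				e[i] += unclaimed *
					(dmax(0.0, c[i] - prot[i]) / parent_overuse);

			/* scan pressure excludes the protected part */
			sz = dmax(e[i], c[i]);
			r[i] = alpha * c[i] * (1 - (e[i] + epsilon) / (sz + epsilon));
			rmax = dmax(rmax, r[i]);
		}

		/* nothing to reclaim, reached equilibrium */
		if (rmax < epsilon)
			break;

		for (i = 0; i < NSIB; i++) {
			/* SWAP_CLUSTER_MAX roundup for siblings that are reclaimed at all */
			if (r[i] > epsilon)
				r[i] = dmax(r[i], cluster);
			c[i] = dmax(c[i] - r[i], 0.0);
		}
	}
	return t;
}
```

Called with the script's inputs (values in GB, alpha = 0.1, cluster = 32*4 / 2^20), the first two siblings should settle within the test's 10% tolerance of 29M and 21M. The third ends close to zero and the fourth stays at exactly zero.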
+17 -12
tools/testing/selftests/cgroup/test_memcontrol.c
···
 /*
  * First, this test creates the following hierarchy:
  * A       memory.min = 50M,  memory.max = 200M
- * A/B     memory.min = 50M,  memory.current = 50M
+ * A/B     memory.min = 50M
  * A/B/C   memory.min = 75M,  memory.current = 50M
  * A/B/D   memory.min = 25M,  memory.current = 50M
  * A/B/E   memory.min = 0,    memory.current = 50M
···
  * Then it creates A/G and creates a significant
  * memory pressure in it.
  *
+ * Then it checks actual memory usages and expects that:
  * A/B    memory.current ~= 50M
- * A/B/C  memory.current ~= 33M
- * A/B/D  memory.current ~= 17M
- * A/B/F  memory.current ~= 0
+ * A/B/C  memory.current ~= 29M
+ * A/B/D  memory.current ~= 21M
+ * A/B/E  memory.current ~= 0
+ * A/B/F  memory.current  = 0
+ * (for origin of the numbers, see model in memcg_protection.m.)
  *
  * After that it tries to allocate more than there is
  * unprotected memory in A available, and checks
···
 	for (i = 0; i < ARRAY_SIZE(children); i++)
 		c[i] = cg_read_long(children[i], "memory.current");
 
-	if (!values_close(c[0], MB(33), 10))
+	if (!values_close(c[0], MB(29), 10))
 		goto cleanup;
 
-	if (!values_close(c[1], MB(17), 10))
+	if (!values_close(c[1], MB(21), 10))
 		goto cleanup;
 
 	if (c[3] != 0)
···
 /*
  * First, this test creates the following hierarchy:
  * A       memory.low = 50M,  memory.max = 200M
- * A/B     memory.low = 50M,  memory.current = 50M
+ * A/B     memory.low = 50M
  * A/B/C   memory.low = 75M,  memory.current = 50M
  * A/B/D   memory.low = 25M,  memory.current = 50M
  * A/B/E   memory.low = 0,    memory.current = 50M
···
  *
  * Then it checks actual memory usages and expects that:
  * A/B    memory.current ~= 50M
- * A/B/   memory.current ~= 33M
- * A/B/D  memory.current ~= 17M
- * A/B/F  memory.current ~= 0
+ * A/B/C  memory.current ~= 29M
+ * A/B/D  memory.current ~= 21M
+ * A/B/E  memory.current ~= 0
+ * A/B/F  memory.current  = 0
+ * (for origin of the numbers, see model in memcg_protection.m.)
  *
  * After that it tries to allocate more than there is
  * unprotected memory in A available,
···
 	for (i = 0; i < ARRAY_SIZE(children); i++)
 		c[i] = cg_read_long(children[i], "memory.current");
 
-	if (!values_close(c[0], MB(33), 10))
+	if (!values_close(c[0], MB(29), 10))
 		goto cleanup;
 
-	if (!values_close(c[1], MB(17), 10))
+	if (!values_close(c[1], MB(21), 10))
 		goto cleanup;
 
 	if (c[3] != 0)
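
To get a feel for how much the adjusted constants matter, the sketch below reimplements the tolerance check under an assumption: that `values_close(a, b, err)` accepts when the absolute difference is at most `err` percent of `a + b`, and that `MB(x)` is a shift by 20 bits. Both definitions are written out here for illustration; the authoritative ones live in the selftest's shared helpers and may differ in detail.

```c
#include <stdlib.h>

/* Assumed equivalent of the selftest macro: megabytes to bytes. */
#define MB(x) ((long)(x) << 20)

/*
 * Assumed tolerance check: a and b are "close" when they differ by at most
 * err percent of their sum (integer arithmetic, as in a typical C helper).
 */
static int values_close(long a, long b, int err)
{
	return labs(a - b) <= (a + b) / 100 * err;
}
```

Under these assumed definitions, a reading of 29M would still have been accepted against the old 33M expectation, while a reading of 21M would have been rejected against the old 17M. The A/B/D check is therefore the one where the change of constants decides the outcome.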