Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: memcg: normalize the value passed into memcg_rstat_updated()

memcg_rstat_updated() uses the value of the state update to keep track of
the magnitude of pending updates, so that we only do a stats flush when
it's worth the work. Most values passed into memcg_rstat_updated() are in
pages; however, a few of them are actually in bytes or KBs.

To put this into perspective, a 512-byte slab allocation today would look
the same as allocating 512 pages. This may result in premature flushes,
which means unnecessary work and latency.

Normalize all the state values passed into memcg_rstat_updated() to pages.
Round up non-zero sub-page updates to 1 page, because memcg_rstat_updated()
ignores 0 page updates.
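
As a rough illustration of the rounding rule described above, here is a minimal user-space sketch in C. It is not the kernel helper: `sketch_val_in_pages()`, `SKETCH_PAGE_SIZE` (assumed to be 4 KiB), and passing the unit as an explicit argument are all assumptions made for this example. Unlike the kernel code, it computes on the magnitude and restores the sign, so negative updates are handled explicitly.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumption for this sketch only: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096L

/*
 * Hypothetical user-space analogue of the normalization: convert an
 * update of `val` items, each `unit` bytes large, into pages. Zero stays
 * zero, page-sized units pass through unchanged, and any non-zero
 * sub-page update is rounded up to a magnitude of one page.
 */
static long sketch_val_in_pages(long val, long unit)
{
	long pages;

	assert(unit > 0);

	if (val == 0 || unit == SKETCH_PAGE_SIZE)
		return val;

	/* Work on the magnitude, then restore the sign of the update. */
	pages = labs(val) * unit / SKETCH_PAGE_SIZE;
	if (pages == 0)
		pages = 1;

	return val < 0 ? -pages : pages;
}
```

With these assumptions, a 512-byte update (`sketch_val_in_pages(512, 1)`) counts as 1 page rather than 512, while an update of 512 page-sized items is passed through as 512.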

Link: https://lkml.kernel.org/r/20230922175741.635002-3-yosryahmed@google.com
Fixes: 5b3be698a872 ("memcg: better bounds on the memcg stats updates")
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Yosry Ahmed and committed by Andrew Morton
7bd5bc3c ff841a06

+18 -2
mm/memcontrol.c
@@ -763,6 +763,22 @@
 	return x;
 }
 
+static int memcg_page_state_unit(int item);
+
+/*
+ * Normalize the value passed into memcg_rstat_updated() to be in pages. Round
+ * up non-zero sub-page updates to 1 page as zero page updates are ignored.
+ */
+static int memcg_state_val_in_pages(int idx, int val)
+{
+	int unit = memcg_page_state_unit(idx);
+
+	if (!val || unit == PAGE_SIZE)
+		return val;
+	else
+		return max(val * unit / PAGE_SIZE, 1UL);
+}
+
 /**
  * __mod_memcg_state - update cgroup memory statistics
  * @memcg: the memory cgroup
@@ -775,7 +791,7 @@
 		return;
 
 	__this_cpu_add(memcg->vmstats_percpu->state[idx], val);
-	memcg_rstat_updated(memcg, val);
+	memcg_rstat_updated(memcg, memcg_state_val_in_pages(idx, val));
 }
 
 /* idx can be of type enum memcg_stat_item or node_stat_item. */
@@ -826,7 +842,7 @@
 	/* Update lruvec */
 	__this_cpu_add(pn->lruvec_stats_percpu->state[idx], val);
 
-	memcg_rstat_updated(memcg, val);
+	memcg_rstat_updated(memcg, memcg_state_val_in_pages(idx, val));
 	memcg_stats_unlock();
 }
 