Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: kmem: make mem_cgroup_from_obj() vmalloc()-safe

Currently mem_cgroup_from_obj() does not work properly with objects
allocated using vmalloc().  This causes problems in some cases when it is
called for static objects belonging to modules, or more generally for
objects allocated using vmalloc().

This patch makes mem_cgroup_from_obj() safe to be called on objects
allocated using vmalloc().

It also introduces mem_cgroup_from_slab_obj(), a faster version for use
in places where we know the object is either a slab object or a generic
slab page (e.g. when adding an object to an lru list).

Link: https://lkml.kernel.org/r/20220610180310.1725111-1-roman.gushchin@linux.dev
Suggested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Acked-by: Shakeel Butt <shakeelb@google.com>
Tested-by: Vasily Averin <vvs@openvz.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Qian Cai <quic_qiancai@quicinc.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Roman Gushchin, committed by akpm (fc4db90f 1e57ffb6)

+57 -22

include/linux/memcontrol.h (+6)

···
 }

 struct mem_cgroup *mem_cgroup_from_obj(void *p);
+struct mem_cgroup *mem_cgroup_from_slab_obj(void *p);

 static inline void count_objcg_event(struct obj_cgroup *objcg,
				     enum vm_event_item idx)
···
 static inline struct mem_cgroup *mem_cgroup_from_obj(void *p)
 {
	return NULL;
+}
+
+static inline struct mem_cgroup *mem_cgroup_from_slab_obj(void *p)
+{
+	return NULL;
 }

 static inline void count_objcg_event(struct obj_cgroup *objcg,
mm/list_lru.c (+1 -1)

···
	if (!list_lru_memcg_aware(lru))
		goto out;

-	memcg = mem_cgroup_from_obj(ptr);
+	memcg = mem_cgroup_from_slab_obj(ptr);
	if (!memcg)
		goto out;
mm/memcontrol.c (+50 -21)

···
	struct lruvec *lruvec;

	rcu_read_lock();
-	memcg = mem_cgroup_from_obj(p);
+	memcg = mem_cgroup_from_slab_obj(p);

	/*
	 * Untracked pages have no memcg, no lruvec. Update only the
···
	return 0;
 }

-/*
- * Returns a pointer to the memory cgroup to which the kernel object is charged.
- *
- * A passed kernel object can be a slab object or a generic kernel page, so
- * different mechanisms for getting the memory cgroup pointer should be used.
- * In certain cases (e.g. kernel stacks or large kmallocs with SLUB) the caller
- * can not know for sure how the kernel object is implemented.
- * mem_cgroup_from_obj() can be safely used in such cases.
- *
- * The caller must ensure the memcg lifetime, e.g. by taking rcu_read_lock(),
- * cgroup_mutex, etc.
- */
-struct mem_cgroup *mem_cgroup_from_obj(void *p)
+static __always_inline
+struct mem_cgroup *mem_cgroup_from_obj_folio(struct folio *folio, void *p)
 {
-	struct folio *folio;
-
-	if (mem_cgroup_disabled())
-		return NULL;
-
-	folio = virt_to_folio(p);
-
	/*
	 * Slab objects are accounted individually, not per-page.
	 * Memcg membership data for each individual object is saved in
···
	 * cgroup pointer or NULL will be returned.
	 */
	return page_memcg_check(folio_page(folio, 0));
+}
+
+/*
+ * Returns a pointer to the memory cgroup to which the kernel object is charged.
+ *
+ * A passed kernel object can be a slab object, vmalloc object or a generic
+ * kernel page, so different mechanisms for getting the memory cgroup pointer
+ * should be used.
+ *
+ * In certain cases (e.g. kernel stacks or large kmallocs with SLUB) the caller
+ * can not know for sure how the kernel object is implemented.
+ * mem_cgroup_from_obj() can be safely used in such cases.
+ *
+ * The caller must ensure the memcg lifetime, e.g. by taking rcu_read_lock(),
+ * cgroup_mutex, etc.
+ */
+struct mem_cgroup *mem_cgroup_from_obj(void *p)
+{
+	struct folio *folio;
+
+	if (mem_cgroup_disabled())
+		return NULL;
+
+	if (unlikely(is_vmalloc_addr(p)))
+		folio = page_folio(vmalloc_to_page(p));
+	else
+		folio = virt_to_folio(p);
+
+	return mem_cgroup_from_obj_folio(folio, p);
+}
+
+/*
+ * Returns a pointer to the memory cgroup to which the kernel object is charged.
+ * Similar to mem_cgroup_from_obj(), but faster and not suitable for objects,
+ * allocated using vmalloc().
+ *
+ * A passed kernel object must be a slab object or a generic kernel page.
+ *
+ * The caller must ensure the memcg lifetime, e.g. by taking rcu_read_lock(),
+ * cgroup_mutex, etc.
+ */
+struct mem_cgroup *mem_cgroup_from_slab_obj(void *p)
+{
+	if (mem_cgroup_disabled())
+		return NULL;
+
+	return mem_cgroup_from_obj_folio(virt_to_folio(p), p);
 }

 static struct obj_cgroup *__get_obj_cgroup_from_memcg(struct mem_cgroup *memcg)