Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: memcontrol: fix swap counter leak from offline cgroup

Commit 6769183166b3 removed the id parameter from swap_cgroup_record()
and made it derive the memcg ID from mem_cgroup_id(folio_memcg(folio)).
However, a caller may update the swap counter of a memcg other than
folio_memcg(folio).

E.g. in the caller mem_cgroup_swapout(), @swap_memcg can differ from
@memcg. The counter of @swap_memcg is updated, but swap_cgroup_record()
records the ID of folio_memcg(folio) instead. When the entry is later
uncharged in __mem_cgroup_uncharge_swap(), the uncharge goes to the
memcg with the recorded ID, so the swap counter of @swap_memcg leaks.

Fix it by restoring the id parameter.
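
The accounting mismatch described above can be sketched with a toy model. This is not kernel code: the struct, the toy_* helpers, and the two fixed IDs are hypothetical and exist only for illustration. The model assumes a single swap entry and a case where the charged memcg differs from folio_memcg(folio), which the subject line ties to an offline cgroup.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name below is hypothetical, not a kernel symbol. */
enum { FOLIO_MEMCG_ID = 1, SWAP_MEMCG_ID = 2, NR_IDS = 3 };

struct toy_state {
	long swap_counter[NR_IDS];  /* per-memcg swap charge, indexed by ID */
	unsigned short recorded_id; /* ID stored for the single swap entry */
};

/* Swapout: charge swap_memcg, then record an owner ID for the entry. */
static void toy_swapout(struct toy_state *s, bool pass_id_explicitly)
{
	s->swap_counter[SWAP_MEMCG_ID] += 1;
	/* Buggy variant records the folio's memcg; fixed variant records
	 * the ID of the memcg that was actually charged. */
	s->recorded_id = pass_id_explicitly ? SWAP_MEMCG_ID : FOLIO_MEMCG_ID;
}

/* Uncharge: decrement the counter of whichever memcg was recorded. */
static void toy_uncharge_swap(struct toy_state *s)
{
	unsigned short id = s->recorded_id;

	assert(id < NR_IDS);
	s->swap_counter[id] -= 1;
	s->recorded_id = 0;
}

/* Returns the charge left on swap_memcg after one swapout/uncharge cycle. */
long toy_leaked_charge(bool pass_id_explicitly)
{
	struct toy_state s = { {0}, 0 };

	toy_swapout(&s, pass_id_explicitly);
	toy_uncharge_swap(&s);
	return s.swap_counter[SWAP_MEMCG_ID];
}
```

In this model, toy_leaked_charge(false) leaves one unit charged to the swap memcg, while toy_leaked_charge(true) returns the counter to zero. That is the effect of passing the ID explicitly.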

Link: https://lkml.kernel.org/r/20250306023133.44838-1-songmuchun@bytedance.com
Fixes: 6769183166b3 ("mm/swap_cgroup: decouple swap cgroup recording and clearing")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Kairui Song <kasong@tencent.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Muchun Song, committed by Andrew Morton
73f839b6 8c6ff7f1

+8 -7
+2 -2
include/linux/swap_cgroup.h
···

 #if defined(CONFIG_MEMCG) && defined(CONFIG_SWAP)

-extern void swap_cgroup_record(struct folio *folio, swp_entry_t ent);
+extern void swap_cgroup_record(struct folio *folio, unsigned short id, swp_entry_t ent);
 extern unsigned short swap_cgroup_clear(swp_entry_t ent, unsigned int nr_ents);
 extern unsigned short lookup_swap_cgroup_id(swp_entry_t ent);
 extern int swap_cgroup_swapon(int type, unsigned long max_pages);
···
 #else

 static inline
-void swap_cgroup_record(struct folio *folio, swp_entry_t ent)
+void swap_cgroup_record(struct folio *folio, unsigned short id, swp_entry_t ent)
 {
 }
+2 -2
mm/memcontrol.c
···
 	mem_cgroup_id_get_many(swap_memcg, nr_entries - 1);
 	mod_memcg_state(swap_memcg, MEMCG_SWAP, nr_entries);

-	swap_cgroup_record(folio, entry);
+	swap_cgroup_record(folio, mem_cgroup_id(swap_memcg), entry);

 	folio_unqueue_deferred_split(folio);
 	folio->memcg_data = 0;
···
 	mem_cgroup_id_get_many(memcg, nr_pages - 1);
 	mod_memcg_state(memcg, MEMCG_SWAP, nr_pages);

-	swap_cgroup_record(folio, entry);
+	swap_cgroup_record(folio, mem_cgroup_id(memcg), entry);

 	return 0;
 }
+4 -3
mm/swap_cgroup.c
···
  * entries must not have been charged
  *
  * @folio: the folio that the swap entry belongs to
+ * @id: mem_cgroup ID to be recorded
  * @ent: the first swap entry to be recorded
  */
-void swap_cgroup_record(struct folio *folio, swp_entry_t ent)
+void swap_cgroup_record(struct folio *folio, unsigned short id,
+			swp_entry_t ent)
 {
 	unsigned int nr_ents = folio_nr_pages(folio);
 	struct swap_cgroup *map;
···
 	map = swap_cgroup_ctrl[swp_type(ent)].map;

 	do {
-		old = __swap_cgroup_id_xchg(map, offset,
-					    mem_cgroup_id(folio_memcg(folio)));
+		old = __swap_cgroup_id_xchg(map, offset, id);
 		VM_BUG_ON(old);
 	} while (++offset != end);
 }
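
The recording loop in the last hunk can be approximated in user space as follows. This is a rough sketch, not the kernel implementation. The flat array map, the non-atomic toy_id_xchg(), TOY_MAP_SLOTS, and the assumption that end equals offset plus the number of entries are all hypothetical stand-ins. The diff does not show how the kernel computes offset and end.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAP_SLOTS 8 /* hypothetical map size for this sketch */

/* Store new_id in a slot and return the previous value. This is a
 * non-atomic stand-in for the exchange helper used in the hunk. */
static unsigned short toy_id_xchg(unsigned short *map, size_t offset,
				  unsigned short new_id)
{
	unsigned short old = map[offset];

	map[offset] = new_id;
	return old;
}

/* Record the caller-supplied id for nr_ents consecutive entries. */
void toy_record(unsigned short *map, size_t offset, size_t nr_ents,
		unsigned short id)
{
	size_t end = offset + nr_ents; /* assumed range: [offset, end) */

	assert(nr_ents > 0 && end <= TOY_MAP_SLOTS);
	do {
		unsigned short old = toy_id_xchg(map, offset, id);

		/* Mirrors VM_BUG_ON(old): entries must not be charged yet. */
		assert(old == 0);
	} while (++offset != end);
}
```

Because the id arrives as an argument, the loop no longer needs the folio to decide which owner to record. The caller picks the memcg whose counter it actually charged.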