
bpf: Fix kmemleak warning for percpu hashmap

Vlad Poenaru reported the following kmemleak issue:

unreferenced object 0x606fd7c44ac8 (size 32):
backtrace (crc 0):
pcpu_alloc_noprof+0x730/0xeb0
bpf_map_alloc_percpu+0x69/0xc0
prealloc_init+0x9d/0x1b0
htab_map_alloc+0x363/0x510
map_create+0x215/0x3a0
__sys_bpf+0x16b/0x3e0
__x64_sys_bpf+0x18/0x20
do_syscall_64+0x7b/0x150
entry_SYSCALL_64_after_hwframe+0x4b/0x53

Further investigation shows that the root cause is the store of the percpu
pointer in htab_elem_set_ptr(), which is not 8-byte aligned:

	*(void __percpu **)(l->key + key_size) = pptr;

Note that the whole htab_elem is 8-byte aligned (on x86_64). If key_size
is 4, pptr is stored at a location that is 4-byte but not 8-byte aligned.
In mm/kmemleak.c, scan_block() scans memory with an 8-byte stride, so it
never reads the above pptr as a whole word and falsely reports the percpu
allocation as a leak.

In htab_map_alloc(), we already have

	htab->elem_size = sizeof(struct htab_elem) +
			  round_up(htab->map.key_size, 8);
	if (percpu)
		htab->elem_size += sizeof(void *);
	else
		htab->elem_size += round_up(htab->map.value_size, 8);

So storing pptr at an 8-byte-aligned offset stays within elem_size and
fixes the kmemleak false positive as well.

The issue can be reproduced with the bpf selftests as well:
1. Enable the CONFIG_DEBUG_KMEMLEAK config.
2. Add a getchar() before skel destroy in test_hash_map() in
   prog_tests/for_each.c, to keep the map alive so the kmemleak scan can
   find the unaligned pointer.
3. Run './test_progs -t for_each/hash_map &' and trigger a kmemleak scan;
   a kmemleak warning should be reported.

Reported-by: Vlad Poenaru <thevlad@meta.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20250224175514.2207227-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

authored by Yonghong Song and committed by Alexei Starovoitov
11ba7ce0 23986082

kernel/bpf/hashtab.c | +3 -3
@@ -198,12 +198,12 @@
 static inline void htab_elem_set_ptr(struct htab_elem *l, u32 key_size,
				     void __percpu *pptr)
 {
-	*(void __percpu **)(l->key + key_size) = pptr;
+	*(void __percpu **)(l->key + roundup(key_size, 8)) = pptr;
 }
 
 static inline void __percpu *htab_elem_get_ptr(struct htab_elem *l, u32 key_size)
 {
-	return *(void __percpu **)(l->key + key_size);
+	return *(void __percpu **)(l->key + roundup(key_size, 8));
 }
 
 static void *fd_htab_map_get_ptr(const struct bpf_map *map, struct htab_elem *l)
@@ -2354,7 +2354,7 @@
 	*insn++ = BPF_EMIT_CALL(__htab_map_lookup_elem);
 	*insn++ = BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 3);
 	*insn++ = BPF_ALU64_IMM(BPF_ADD, BPF_REG_0,
-			offsetof(struct htab_elem, key) + map->key_size);
+			offsetof(struct htab_elem, key) + roundup(map->key_size, 8));
 	*insn++ = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0);
 	*insn++ = BPF_MOV64_PERCPU_REG(BPF_REG_0, BPF_REG_0);