Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

oom: task->mm == NULL doesn't mean the memory was freed

exit_mm() sets ->mm = NULL, and only then does mmput()->exit_mmap(),
which frees the memory.

However select_bad_process() checks ->mm != NULL before TIF_MEMDIE,
so it continues to kill other tasks even if we have the oom-killed
task freeing its memory.

Change select_bad_process() to check ->mm after TIF_MEMDIE, but skip
the tasks which have already passed exit_notify() to ensure a zombie
with TIF_MEMDIE set can't block oom-killer. Alternatively we could
probably clear TIF_MEMDIE after exit_mmap().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Oleg Nesterov and committed by Linus Torvalds
c027a474 cfe22345

+3 -1
mm/oom_kill.c
···
 	do_each_thread(g, p) {
 		unsigned int points;
 
-		if (!p->mm)
+		if (p->exit_state)
 			continue;
 		if (oom_unkillable_task(p, mem, nodemask))
 			continue;
···
 		 */
 		if (test_tsk_thread_flag(p, TIF_MEMDIE))
 			return ERR_PTR(-1UL);
+		if (!p->mm)
+			continue;
 
 		if (p->flags & PF_EXITING) {
 			/*
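
The effect of reordering the checks can be illustrated with a small user-space model. This is only a sketch, not kernel code: `struct mock_task`, `enum scan_result`, `check_old()`, `check_new()` and `model_selfcheck()` are hypothetical names invented for illustration, the three booleans stand in for `p->mm`, `p->exit_state` and `TIF_MEMDIE`, and the `oom_unkillable_task()` and `PF_EXITING` handling is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct task_struct. */
struct mock_task {
	bool has_mm;      /* models p->mm != NULL */
	bool exit_state;  /* models p->exit_state != 0, i.e. past exit_notify() */
	bool memdie;      /* models test_tsk_thread_flag(p, TIF_MEMDIE) */
};

/* What the scan loop does with one task (names are illustrative). */
enum scan_result {
	SCAN_CANDIDATE,   /* task goes on to be scored */
	SCAN_SKIP,        /* models "continue" */
	SCAN_ABORT        /* models "return ERR_PTR(-1UL)": kill nothing for now */
};

/* Check order before the patch: ->mm is tested ahead of TIF_MEMDIE. */
static enum scan_result check_old(const struct mock_task *p)
{
	if (!p->has_mm)
		return SCAN_SKIP;
	if (p->memdie)
		return SCAN_ABORT;
	return SCAN_CANDIDATE;
}

/* Check order after the patch: exit_state, then TIF_MEMDIE, then ->mm. */
static enum scan_result check_new(const struct mock_task *p)
{
	if (p->exit_state)
		return SCAN_SKIP;
	if (p->memdie)
		return SCAN_ABORT;
	if (!p->has_mm)
		return SCAN_SKIP;
	return SCAN_CANDIDATE;
}

/* Walks through the two cases described in the commit message. */
static void model_selfcheck(void)
{
	/* OOM victim inside exit_mm(): ->mm already NULL, memory still being freed. */
	struct mock_task dying = { .has_mm = false, .exit_state = false, .memdie = true };
	/* Zombie that still carries TIF_MEMDIE after exit_notify(). */
	struct mock_task zombie = { .has_mm = false, .exit_state = true, .memdie = true };

	assert(check_old(&dying) == SCAN_SKIP);   /* old order goes on to pick another victim */
	assert(check_new(&dying) == SCAN_ABORT);  /* new order waits for the victim */
	assert(check_new(&zombie) == SCAN_SKIP);  /* a zombie cannot block the oom-killer */
}
```

In this model, the old order skips the dying victim and keeps scanning for another task to kill. The new order aborts the scan while the victim is still freeing its memory, and the `exit_state` test keeps a zombie with TIF_MEMDIE from aborting every later scan.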