Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

perf: Fix sample vs do_exit()

Baisheng Gao reported an ARM64 crash, which Mark decoded as being a
synchronous external abort -- most likely due to trying to access
MMIO in bad ways.

The crash further shows perf trying to do a user stack sample while in
exit_mmap()'s tlb_finish_mmu() -- i.e. while tearing down the address
space it is trying to access.

It turns out that we stop perf after we tear down the userspace mm; a
recipe for disaster, since perf likes to access userspace for
various reasons.

Flip this order by moving up where we stop perf in do_exit().

Additionally, harden PERF_SAMPLE_CALLCHAIN and PERF_SAMPLE_STACK_USER
to abort when the current task does not have an mm (exit_mm() makes
sure to set current->mm = NULL; before commencing with the actual
teardown), so that CPU-wide events don't trip on this same problem.

Fixes: c5ebcedb566e ("perf: Add ability to attach user stack dump to sample")
Reported-by: Baisheng Gao <baisheng.gao@unisoc.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250605110815.GQ39944@noisy.programming.kicks-ass.net

+16 -8
+7
kernel/events/core.c
···
 	if (!regs)
 		return 0;

+	/* No mm, no stack, no dump. */
+	if (!current->mm)
+		return 0;
+
 	/*
 	 * Check if we fit in with the requested stack size into the:
 	 * - TASK_SIZE
···
 	bool crosstask = event->ctx->task && event->ctx->task != current;
 	const u32 max_stack = event->attr.sample_max_stack;
 	struct perf_callchain_entry *callchain;
+
+	if (!current->mm)
+		user = false;

 	if (!kernel && !user)
 		return &__empty_callchain;
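
The two guards added to kernel/events/core.c can be modelled outside the kernel. The following is a minimal userspace sketch, not kernel code: `struct model_mm`, `struct model_task`, `model_ustack_size()` and `model_want_user_callchain()` are hypothetical names invented for illustration, and the real functions perform further checks that are omitted here.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's mm and task structures. */
struct model_mm { int unused; };
struct model_task { struct model_mm *mm; };

/*
 * Model of the user stack dump guard: return how many bytes to dump,
 * or 0 when there are no user registers or no address space.
 */
static size_t model_ustack_size(const struct model_task *task,
				int have_user_regs, size_t requested)
{
	if (!have_user_regs)
		return 0;

	/* No mm, no stack, no dump. */
	if (!task->mm)
		return 0;

	return requested;
}

/*
 * Model of the callchain guard: the user part of the callchain is
 * dropped when the task no longer has an mm.
 */
static int model_want_user_callchain(const struct model_task *task, int user)
{
	if (!task->mm)
		user = 0;

	return user;
}
```

In this model, a task whose `mm` pointer is NULL gets a zero-sized stack dump and no user callchain, whichever event asked for the sample. That is the property the commit message relies on for CPU-wide events.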
+9 -8
kernel/exit.c
···
 	taskstats_exit(tsk, group_dead);
 	trace_sched_process_exit(tsk, group_dead);

+	/*
+	 * Since sampling can touch ->mm, make sure to stop everything before we
+	 * tear it down.
+	 *
+	 * Also flushes inherited counters to the parent - before the parent
+	 * gets woken up by child-exit notifications.
+	 */
+	perf_event_exit_task(tsk);
+
 	exit_mm();

 	if (group_dead)
···
 	exit_task_namespaces(tsk);
 	exit_task_work(tsk);
 	exit_thread(tsk);
-
-	/*
-	 * Flush inherited counters to the parent - before the parent
-	 * gets woken up by child-exit notifications.
-	 *
-	 * because of cgroup mode, must be called before cgroup_exit()
-	 */
-	perf_event_exit_task(tsk);

 	sched_autogroup_exit_task(tsk);
 	cgroup_exit(tsk);
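
The effect of the reordering in do_exit() can be illustrated with a small userspace model. This is only a sketch under simplifying assumptions: `struct exit_model` and the `model_*` functions are hypothetical, and a sample arriving during address-space teardown is simulated by a direct function call rather than an interrupt.

```c
#include <stdbool.h>

/* Hypothetical model of the parts of an exiting task that matter here. */
struct exit_model {
	bool perf_active;		/* sampling has not been stopped yet */
	bool mm_teardown_started;	/* address space is being torn down */
	int unsafe_samples;		/* samples taken during teardown */
};

/* A sample is unsafe if it fires while the address space is going away. */
static void model_sample(struct exit_model *t)
{
	if (t->perf_active && t->mm_teardown_started)
		t->unsafe_samples++;
}

/* Stands in for stopping perf for the exiting task. */
static void model_perf_exit(struct exit_model *t)
{
	t->perf_active = false;
}

/* Stands in for the mm teardown, with one sample arriving midway. */
static void model_exit_mm(struct exit_model *t)
{
	t->mm_teardown_started = true;
	model_sample(t);
}

/* Run the exit path in either order and report the unsafe samples. */
static int model_do_exit(struct exit_model *t, bool perf_first)
{
	if (perf_first)
		model_perf_exit(t);

	model_exit_mm(t);

	if (!perf_first)
		model_perf_exit(t);

	return t->unsafe_samples;
}
```

Calling `model_do_exit()` with `perf_first` set to false corresponds to the old order, where the simulated sample lands during teardown. With `perf_first` set to true, sampling is already stopped by then and no unsafe sample is recorded.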