
perf offcpu: Track child processes

When the -p option is used or a workload is given, perf needs to handle
child processes. The perf_event can inherit those task events
automatically. We can add a new BPF program on the task_newtask
tracepoint to track child processes.

Before:
$ sudo perf record --off-cpu -- perf bench sched messaging
$ sudo perf report --stat | grep -A1 offcpu
offcpu-time stats:
SAMPLE events: 1

After:
$ sudo perf record --off-cpu -- perf bench sched messaging
$ sudo perf report --stat | grep -A1 offcpu
offcpu-time stats:
SAMPLE events: 856

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220811185456.194721-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


 tools/perf/util/bpf_off_cpu.c          |  7 +++++++
 tools/perf/util/bpf_skel/off_cpu.bpf.c | 30 ++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/tools/perf/util/bpf_off_cpu.c b/tools/perf/util/bpf_off_cpu.c
--- a/tools/perf/util/bpf_off_cpu.c
+++ b/tools/perf/util/bpf_off_cpu.c
@@ -17,6 +17,7 @@
 #include "bpf_skel/off_cpu.skel.h"
 
 #define MAX_STACKS 32
+#define MAX_PROC 4096
 /* we don't need actual timestamp, just want to put the samples at last */
 #define OFF_CPU_TIMESTAMP (~0ull << 32)
 
@@ -165,10 +164,16 @@
 
 			ntasks++;
 		}
+
+		if (ntasks < MAX_PROC)
+			ntasks = MAX_PROC;
+
 		bpf_map__set_max_entries(skel->maps.task_filter, ntasks);
 	} else if (target__has_task(target)) {
 		ntasks = perf_thread_map__nr(evlist->core.threads);
 		bpf_map__set_max_entries(skel->maps.task_filter, ntasks);
+	} else if (target__none(target)) {
+		bpf_map__set_max_entries(skel->maps.task_filter, MAX_PROC);
 	}
 
 	if (evlist__first(evlist)->cgrp) {

diff --git a/tools/perf/util/bpf_skel/off_cpu.bpf.c b/tools/perf/util/bpf_skel/off_cpu.bpf.c
--- a/tools/perf/util/bpf_skel/off_cpu.bpf.c
+++ b/tools/perf/util/bpf_skel/off_cpu.bpf.c
@@ -12,6 +12,9 @@
 #define TASK_INTERRUPTIBLE	0x0001
 #define TASK_UNINTERRUPTIBLE	0x0002
 
+/* create a new thread */
+#define CLONE_THREAD  0x10000
+
 #define MAX_STACKS 32
 #define MAX_ENTRIES 102400
 
@@ -219,6 +216,33 @@
 		/* prevent to reuse the timestamp later */
 		pelem->timestamp = 0;
 	}
+
+	return 0;
+}
+
+SEC("tp_btf/task_newtask")
+int on_newtask(u64 *ctx)
+{
+	struct task_struct *task;
+	u64 clone_flags;
+	u32 pid;
+	u8 val = 1;
+
+	if (!uses_tgid)
+		return 0;
+
+	task = (struct task_struct *)bpf_get_current_task();
+
+	pid = BPF_CORE_READ(task, tgid);
+	if (!bpf_map_lookup_elem(&task_filter, &pid))
+		return 0;
+
+	task = (struct task_struct *)ctx[0];
+	clone_flags = ctx[1];
+
+	pid = task->tgid;
+	if (!(clone_flags & CLONE_THREAD))
+		bpf_map_update_elem(&task_filter, &pid, &val, BPF_NOEXIST);
 
 	return 0;
 }