Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

perf scripts python: intel-pt-events.py: Add ability interleave output

Intel PT timestamps are not provided for every branch, let alone every
instruction, so there can be many samples with the same timestamp. With
per-cpu contexts, decoding is done for each CPU in turn, which can make it
difficult to see what is happening on different CPUs at the same time.
Currently the interleaving from perf script --itrace=i0ns is quite coarse
grained. There are often long stretches executing on one CPU and nothing on
another.

Some people are interested in seeing what happened on multiple CPUs before
a crash to debug races etc.

To improve perf script interleaving for parallel execution, the
intel-pt-events.py script has been enhanced to enable interleaving the
output with the same timestamp from different CPUs. It is understood that
interleaving is not perfect or causal.

Add parameter --interleave [<n>] to interleave sample output for the same
timestamp so that no more than n samples for a CPU are displayed in a row.
'n' defaults to 4. Note this only affects the order of output, and only
when the timestamp is the same.

Example:

$ perf script intel-pt-events.py --insn-trace --interleave 3
...
bash 2267/2267 [004] 9323.692625625 563caa3c86f0 jz 0x563caa3c89c7 run_pending_traps+0x30 (/usr/bin/bash) IPC: 1.52 (38/25)
bash 2267/2267 [004] 9323.692625625 563caa3c89c7 movq 0x118(%rsp), %rax run_pending_traps+0x307 (/usr/bin/bash)
bash 2267/2267 [004] 9323.692625625 563caa3c89cf subq %fs:0x28, %rax run_pending_traps+0x30f (/usr/bin/bash)
bash 2270/2270 [007] 9323.692625625 55dc58cabf02 jz 0x55dc58cabf48 unquoted_glob_pattern_p+0x102 (/usr/bin/bash) IPC: 1.56 (25/16)
bash 2270/2270 [007] 9323.692625625 55dc58cabf04 cmp $0x5d, %al unquoted_glob_pattern_p+0x104 (/usr/bin/bash)
bash 2270/2270 [007] 9323.692625625 55dc58cabf06 jnz 0x55dc58cabf10 unquoted_glob_pattern_p+0x106 (/usr/bin/bash)
bash 2264/2264 [001] 9323.692625625 7fd556a4376c jbe 0x7fd556a43ac8 round_and_return+0x3fc (/usr/lib/x86_64-linux-gnu/libc.so.6) IPC: 4.30 (43/10)
bash 2264/2264 [001] 9323.692625625 7fd556a43772 and $0x8, %edx round_and_return+0x402 (/usr/lib/x86_64-linux-gnu/libc.so.6)
bash 2264/2264 [001] 9323.692625625 7fd556a43775 jnz 0x7fd556a43ac8 round_and_return+0x405 (/usr/lib/x86_64-linux-gnu/libc.so.6)
bash 2267/2267 [004] 9323.692625625 563caa3c89d8 jnz 0x563caa3c8b11 run_pending_traps+0x318 (/usr/bin/bash)
bash 2267/2267 [004] 9323.692625625 563caa3c89de add $0x128, %rsp run_pending_traps+0x31e (/usr/bin/bash)
bash 2267/2267 [004] 9323.692625625 563caa3c89e5 popq %rbx run_pending_traps+0x325 (/usr/bin/bash)
...

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221020152509.5298-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

authored by

Adrian Hunter and committed by
Arnaldo Carvalho de Melo
ad7ad6b5 b15cf900

+74 -3
+10 -2
tools/perf/Documentation/perf-intel-pt.txt
··· 189 189 unpack the raw data for power events and PTWRITE. The script also displays 190 190 branches, and supports 2 additional modes selected by option: 191 191 192 - --insn-trace - instruction trace 193 - --src-trace - source trace 192 + - --insn-trace - instruction trace 193 + - --src-trace - source trace 194 + 195 + The intel-pt-events.py script also has options: 196 + 197 + - --all-switch-events - display all switch events, not only the last consecutive. 198 + - --interleave [<n>] - interleave sample output for the same timestamp so that 199 + no more than n samples for a CPU are displayed in a row. 'n' defaults to 4. 200 + Note this only affects the order of output, and only when the timestamp is the 201 + same. 194 202 195 203 As mentioned above, it is easy to capture too much data. One way to limit the 196 204 data captured is to use 'snapshot' mode which is explained further below.
+64 -1
tools/perf/scripts/python/intel-pt-events.py
··· 13 13 14 14 from __future__ import print_function 15 15 16 + import io 16 17 import os 17 18 import sys 18 19 import struct 19 20 import argparse 21 + import contextlib 20 22 21 23 from libxed import LibXED 22 24 from ctypes import create_string_buffer, addressof ··· 41 39 glb_source_file_name = None 42 40 glb_line_number = None 43 41 glb_dso = None 42 + glb_stash_dict = {} 43 + glb_output = None 44 + glb_output_pos = 0 45 + glb_cpu = -1 46 + glb_time = 0 44 47 45 48 def get_optional_null(perf_dict, field): 46 49 if field in perf_dict: ··· 77 70 ap.add_argument("--insn-trace", action='store_true') 78 71 ap.add_argument("--src-trace", action='store_true') 79 72 ap.add_argument("--all-switch-events", action='store_true') 73 + ap.add_argument("--interleave", type=int, nargs='?', const=4, default=0) 80 74 global glb_args 81 75 global glb_insn 82 76 global glb_src ··· 102 94 perf_set_itrace_options(perf_script_context, itrace) 103 95 104 96 def trace_end(): 97 + if glb_args.interleave: 98 + flush_stashed_output() 105 99 print("End") 106 100 107 101 def trace_unhandled(event_name, context, event_fields_dict): 108 102 print(' '.join(['%s=%s'%(k,str(v))for k,v in sorted(event_fields_dict.items())])) 103 + 104 + def stash_output(): 105 + global glb_stash_dict 106 + global glb_output_pos 107 + output_str = glb_output.getvalue()[glb_output_pos:] 108 + n = len(output_str) 109 + if n: 110 + glb_output_pos += n 111 + if glb_cpu not in glb_stash_dict: 112 + glb_stash_dict[glb_cpu] = [] 113 + glb_stash_dict[glb_cpu].append(output_str) 114 + 115 + def flush_stashed_output(): 116 + global glb_stash_dict 117 + while glb_stash_dict: 118 + cpus = list(glb_stash_dict.keys()) 119 + # Output at most glb_args.interleave output strings per cpu 120 + for cpu in cpus: 121 + items = glb_stash_dict[cpu] 122 + countdown = glb_args.interleave 123 + while len(items) and countdown: 124 + sys.stdout.write(items[0]) 125 + del items[0] 126 + countdown -= 1 127 + if not items: 128 + del glb_stash_dict[cpu] 109 129 110 130 def print_ptwrite(raw_buf): 111 131 data = struct.unpack_from("<IQ", raw_buf) ··· 411 375 print_common_start(comm, sample, name) 412 376 print_common_ip(param_dict, sample, symbol, dso) 413 377 378 + def interleave_events(param_dict): 379 + global glb_cpu 380 + global glb_time 381 + global glb_output 382 + global glb_output_pos 383 + 384 + sample = param_dict["sample"] 385 + glb_cpu = sample["cpu"] 386 + ts = sample["time"] 387 + 388 + if glb_time != ts: 389 + glb_time = ts 390 + flush_stashed_output() 391 + 392 + glb_output_pos = 0 393 + with contextlib.redirect_stdout(io.StringIO()) as glb_output: 394 + do_process_event(param_dict) 395 + 396 + stash_output() 397 + 414 398 def process_event(param_dict): 415 399 try: 416 - do_process_event(param_dict) 400 + if glb_args.interleave: 401 + interleave_events(param_dict) 402 + else: 403 + do_process_event(param_dict) 417 404 except broken_pipe_exception: 418 405 # Stop python printing broken pipe errors and traceback 419 406 sys.stdout = open(os.devnull, 'w') 420 407 sys.exit(1) 421 408 422 409 def auxtrace_error(typ, code, cpu, pid, tid, ip, ts, msg, cpumode, *x): 410 + if glb_args.interleave: 411 + flush_stashed_output() 423 412 if len(x) >= 2 and x[0]: 424 413 machine_pid = x[0] 425 414 vcpu = x[1] ··· 464 403 sys.exit(1) 465 404 466 405 def context_switch(ts, cpu, pid, tid, np_pid, np_tid, machine_pid, out, out_preempt, *x): 406 + if glb_args.interleave: 407 + flush_stashed_output() 467 408 if out: 468 409 out_str = "Switch out " 469 410 else: