perf session: Fix Intel LBR callstack entries and nr print message

When generating callstack information from a branch_stack (Intel LBR), the
number of callstack entries should be one more than the number of
branch_stack entries. For example:

branch_stack records:
B() -> C()
A() -> B()
converted callstack records should be:
C()
B()
A()
That is, the number of callstack entries equals
the number of branch stack entries plus 1.

This patch fixes the above issue in branch_stack__printf(). For example:

# echo 'scale=2000; 4*a(1)' > cmd
# perf record --call-graph lbr bc -l < cmd

Before applying this patch, `perf script -D` output:

1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
... LBR call chain: nr:8
..... 0: fffffffffffffe00
..... 1: 000000000040a410
..... 2: 000000000040573c
..... 3: 0000000000408650
..... 4: 00000000004022f2
..... 5: 00000000004015f5
..... 6: 00007f5ed6dcb553
..... 7: 0000000000401698
... FP chain: nr:2
..... 0: fffffffffffffe00
..... 1: 000000000040a6d8
... branch callstack: nr:6 # which is not consistent with LBR records.
..... 0: 000000000040a410
..... 1: 0000000000408650 # 000000000040573c is missing before this entry
..... 2: 00000000004022f2
..... 3: 00000000004015f5
..... 4: 00007f5ed6dcb553
..... 5: 0000000000401698
... thread: bc:17990
...... dso: /usr/bin/bc
bc 17990 1220022.677386: 894172 cycles:
40a410 [unknown] (/usr/bin/bc)
40573c [unknown] (/usr/bin/bc)
408650 [unknown] (/usr/bin/bc)
4022f2 [unknown] (/usr/bin/bc)
4015f5 [unknown] (/usr/bin/bc)
7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
401698 [unknown] (/usr/bin/bc)

After applying this patch:

1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
... LBR call chain: nr:8
..... 0: fffffffffffffe00
..... 1: 000000000040a410
..... 2: 000000000040573c
..... 3: 0000000000408650
..... 4: 00000000004022f2
..... 5: 00000000004015f5
..... 6: 00007f5ed6dcb553
..... 7: 0000000000401698
... FP chain: nr:2
..... 0: fffffffffffffe00
..... 1: 000000000040a6d8
... branch callstack: nr:7
..... 0: 000000000040a410
..... 1: 000000000040573c
..... 2: 0000000000408650
..... 3: 00000000004022f2
..... 4: 00000000004015f5
..... 5: 00007f5ed6dcb553
..... 6: 0000000000401698
... thread: bc:17990
...... dso: /usr/bin/bc
bc 17990 1220022.677386: 894172 cycles:
40a410 [unknown] (/usr/bin/bc)
40573c [unknown] (/usr/bin/bc)
408650 [unknown] (/usr/bin/bc)
4022f2 [unknown] (/usr/bin/bc)
4015f5 [unknown] (/usr/bin/bc)
7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
401698 [unknown] (/usr/bin/bc)

Change from v1:
- refined code style according to Jiri's review comments.

Signed-off-by: Chengdong Li <chengdongli@tencent.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: likexu@tencent.com
Link: https://lore.kernel.org/r/20220517015726.96131-1-chengdongli@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

authored by Chengdong Li and committed by Arnaldo Carvalho de Melo 51d0bf99 8994e97b

Changed files
tools/perf/util/session.c (+21 -5)
···
 	struct branch_entry *entries = perf_sample__branch_entries(sample);
 	uint64_t i;
 
-	printf("%s: nr:%" PRIu64 "\n",
-	       !callstack ? "... branch stack" : "... branch callstack",
-	       sample->branch_stack->nr);
+	if (!callstack) {
+		printf("%s: nr:%" PRIu64 "\n", "... branch stack", sample->branch_stack->nr);
+	} else {
+		/* the reason of adding 1 to nr is because after expanding
+		 * branch stack it generates nr + 1 callstack records. e.g.,
+		 *         B()->C()
+		 *         A()->B()
+		 * the final callstack should be:
+		 *         C()
+		 *         B()
+		 *         A()
+		 */
+		printf("%s: nr:%" PRIu64 "\n", "... branch callstack", sample->branch_stack->nr+1);
+	}
 
 	for (i = 0; i < sample->branch_stack->nr; i++) {
 		struct branch_entry *e = &entries[i];
···
 				(unsigned)e->flags.reserved,
 				e->flags.type ? branch_type_name(e->flags.type) : "");
 		} else {
-			printf("..... %2"PRIu64": %016" PRIx64 "\n",
-				i, i > 0 ? e->from : e->to);
+			if (i == 0) {
+				printf("..... %2"PRIu64": %016" PRIx64 "\n"
+				       "..... %2"PRIu64": %016" PRIx64 "\n",
+						i, e->to, i+1, e->from);
+			} else {
+				printf("..... %2"PRIu64": %016" PRIx64 "\n", i+1, e->from);
+			}
 		}
 	}
 }