Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

perf tools: Enable LBR call stack support

Currently, there are two call chain recording options, fp and dwarf.

Haswell has a new feature that utilizes the existing LBR facility to
record call chains. Kernel side LBR support code provides this as a
third option to record call chains. This patch enables the lbr call
stack support on the tooling side.

LBR call stack has some limitations:

- It reuses current LBR facility, so LBR call stack and branch record
  can not be enabled at the same time.

- It is only available for user-space callchains.

However, it also offers some advantages:

- LBR call stack can work on user apps which don't have frame-pointers
  or dwarf debug info compiled. It is a good alternative when nothing
  else works.

Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Cody P Schafer <cody@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Jacob Shin <jacob.w.shin@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masanari Iida <standby24x7@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1420482185-29830-2-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Authored by Kan Liang and committed by Ingo Molnar
aad2b21c 2c44b193

+40 -6
+7 -1
tools/perf/Documentation/perf-record.txt
@@ -115,13 +115,19 @@
 	implies -g.

 	Allows specifying "fp" (frame pointer) or "dwarf"
-	(DWARF's CFI - Call Frame Information) as the method to collect
+	(DWARF's CFI - Call Frame Information) or "lbr"
+	(Hardware Last Branch Record facility) as the method to collect
 	the information used to show the call graphs.

 	In some systems, where binaries are build with gcc
 	--fomit-frame-pointer, using the "fp" method will produce bogus
 	call graphs, using "dwarf", if available (perf tools linked to
 	the libunwind library) should be used instead.
+	Using the "lbr" method doesn't require any compiler options. It
+	will produce call graphs from the hardware LBR registers. The
+	main limition is that it is only available on new Intel
+	platforms, such as Haswell. It can only get user call chain. It
+	doesn't work with branch stack sampling at the same time.

 -q::
 --quiet::
+3 -3
tools/perf/builtin-record.c
@@ -658,7 +658,7 @@

 static void callchain_debug(void)
 {
-	static const char *str[CALLCHAIN_MAX] = { "NONE", "FP", "DWARF" };
+	static const char *str[CALLCHAIN_MAX] = { "NONE", "FP", "DWARF", "LBR" };

 	pr_debug("callchain: type %s\n", str[callchain_param.record_mode]);

@@ -751,9 +751,9 @@
 #define CALLCHAIN_HELP "setup and enables call-graph (stack chain/backtrace) recording: "

 #ifdef HAVE_DWARF_UNWIND_SUPPORT
-const char record_callchain_help[] = CALLCHAIN_HELP "fp dwarf";
+const char record_callchain_help[] = CALLCHAIN_HELP "fp dwarf lbr";
 #else
-const char record_callchain_help[] = CALLCHAIN_HELP "fp";
+const char record_callchain_help[] = CALLCHAIN_HELP "fp lbr";
 #endif

 /*
+2
tools/perf/builtin-report.c
@@ -249,6 +249,8 @@
 		if ((sample_type & PERF_SAMPLE_REGS_USER) &&
 		    (sample_type & PERF_SAMPLE_STACK_USER))
 			callchain_param.record_mode = CALLCHAIN_DWARF;
+		else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
+			callchain_param.record_mode = CALLCHAIN_LBR;
 		else
 			callchain_param.record_mode = CALLCHAIN_FP;
 	}
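The report-side check above infers the recording method purely from the sample type bits stored in the data file. A minimal standalone sketch of that ordering follows, assuming placeholder bit values and hypothetical `rpt_`/`RPT_` names; the real constants come from the perf_event ABI headers and are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bits for illustration only; not the real ABI values. */
#define RPT_SAMPLE_BRANCH_STACK (1ULL << 0)
#define RPT_SAMPLE_REGS_USER    (1ULL << 1)
#define RPT_SAMPLE_STACK_USER   (1ULL << 2)

enum rpt_record_mode { RPT_MODE_FP, RPT_MODE_DWARF, RPT_MODE_LBR };

/*
 * Mirror the if / else-if / else chain from the hunk:
 * DWARF needs both user regs and user stack, LBR is detected
 * from the branch stack bit, and FP is the default.
 */
enum rpt_record_mode rpt_infer_mode(uint64_t sample_type)
{
	if ((sample_type & RPT_SAMPLE_REGS_USER) &&
	    (sample_type & RPT_SAMPLE_STACK_USER))
		return RPT_MODE_DWARF;
	if (sample_type & RPT_SAMPLE_BRANCH_STACK)
		return RPT_MODE_LBR;
	return RPT_MODE_FP;
}
```

One consequence of this ordering is that the DWARF check wins when both sets of bits are present, and any sample type carrying the branch stack bit without the DWARF bits is classified as LBR.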
+8
tools/perf/util/callchain.c
@@ -97,6 +97,14 @@
 				callchain_param.dump_size = size;
 			}
 #endif /* HAVE_DWARF_UNWIND_SUPPORT */
+		} else if (!strncmp(name, "lbr", sizeof("lbr"))) {
+			if (!strtok_r(NULL, ",", &saveptr)) {
+				callchain_param.record_mode = CALLCHAIN_LBR;
+				ret = 0;
+			} else
+				pr_err("callchain: No more arguments "
+					"needed for --call-graph lbr\n");
+			break;
 		} else {
 			pr_err("callchain: Unknown --call-graph option "
 				"value: %s\n", arg);
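Two details of the "lbr" branch above are easy to miss: `sizeof("lbr")` is 4, so `strncmp` also compares the terminating NUL and only an exact match passes, and a second `strtok_r` call is used to reject trailing arguments. The sketch below isolates just that behaviour. It is a simplification, not the perf parser: `demo_parse_lbr` is a hypothetical name, and the other methods are left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return 0 if arg is exactly "lbr" with no further comma-separated
 * arguments, -1 otherwise.
 */
int demo_parse_lbr(const char *arg)
{
	char *saveptr = NULL;
	char *buf, *name;
	int ret = -1;

	assert(arg != NULL);

	/* strtok_r() modifies its input, so tokenize a private copy. */
	buf = malloc(strlen(arg) + 1);
	if (!buf)
		return -1;
	strcpy(buf, arg);

	name = strtok_r(buf, ",", &saveptr);
	/* sizeof("lbr") == 4: the NUL is compared too, so "lbrx" fails. */
	if (name && !strncmp(name, "lbr", sizeof("lbr"))) {
		/* A second token means extra arguments were supplied. */
		if (!strtok_r(NULL, ",", &saveptr))
			ret = 0;
	}

	free(buf);
	return ret;
}
```

With this sketch, "lbr" is accepted, while "lbr,8192", "lbrx" and "lb" are all rejected.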
+1
tools/perf/util/callchain.h
@@ -11,6 +11,7 @@
 	CALLCHAIN_NONE,
 	CALLCHAIN_FP,
 	CALLCHAIN_DWARF,
+	CALLCHAIN_LBR,
 	CALLCHAIN_MAX
 };

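Adding an enumerator before CALLCHAIN_MAX means every table indexed by the record mode, such as the name table in callchain_debug(), needs a matching entry. The patch updates that table by hand. As an illustration only (this guard is not part of the patch, and all `names_`/`NAMES_` identifiers are hypothetical), a compile-time check could catch a missed update:

```c
#include <assert.h>
#include <string.h>

enum names_mode {
	NAMES_NONE,
	NAMES_FP,
	NAMES_DWARF,
	NAMES_LBR,
	NAMES_MAX
};

/* Unsized array: its length is the number of initializers given. */
static const char *const names_table[] = { "NONE", "FP", "DWARF", "LBR" };

/* Fails to compile if the enum grows and the table does not. */
static_assert(sizeof(names_table) / sizeof(names_table[0]) == NAMES_MAX,
	      "names_table must have one entry per mode");

/* Bounds-checked lookup; out-of-range values map to "UNKNOWN". */
const char *names_lookup(int mode)
{
	if (mode < 0 || mode >= NAMES_MAX)
		return "UNKNOWN";
	return names_table[mode];
}
```

The array is deliberately declared without an explicit size. With a fixed size of NAMES_MAX, a missing initializer would silently become a NULL entry instead of a build failure.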
+19 -2
tools/perf/util/evsel.c
@@ -537,12 +537,29 @@
 }

 static void
-perf_evsel__config_callgraph(struct perf_evsel *evsel)
+perf_evsel__config_callgraph(struct perf_evsel *evsel,
+			     struct record_opts *opts)
 {
 	bool function = perf_evsel__is_function_event(evsel);
 	struct perf_event_attr *attr = &evsel->attr;

 	perf_evsel__set_sample_bit(evsel, CALLCHAIN);
+
+	if (callchain_param.record_mode == CALLCHAIN_LBR) {
+		if (!opts->branch_stack) {
+			if (attr->exclude_user) {
+				pr_warning("LBR callstack option is only available "
+					   "to get user callchain information. "
+					   "Falling back to framepointers.\n");
+			} else {
+				perf_evsel__set_sample_bit(evsel, BRANCH_STACK);
+				attr->branch_sample_type = PERF_SAMPLE_BRANCH_USER |
+							PERF_SAMPLE_BRANCH_CALL_STACK;
+			}
+		} else
+			pr_warning("Cannot use LBR callstack with branch stack. "
+				   "Falling back to framepointers.\n");
+	}

 	if (callchain_param.record_mode == CALLCHAIN_DWARF) {
 		if (!function) {
@@ -667,7 +684,7 @@
 		evsel->attr.exclude_callchain_user = 1;

 	if (callchain_param.enabled && !evsel->no_aux_samples)
-		perf_evsel__config_callgraph(evsel);
+		perf_evsel__config_callgraph(evsel, opts);

 	if (opts->sample_intr_regs) {
 		attr->sample_regs_intr = PERF_REGS_MASK;
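The nested conditions in the first evsel.c hunk encode the two limitations from the commit message: LBR call stack cannot coexist with explicit branch stack sampling, and it only covers user space. A flattened sketch of the same decision is shown below. The struct, the bit values and the `cfg_`/`CFG_` names are hypothetical stand-ins for `perf_event_attr` and the PERF_SAMPLE_* constants, and warnings go to stderr instead of pr_warning().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Placeholder bits for illustration only; not the real ABI values. */
#define CFG_SAMPLE_BRANCH_STACK (1ULL << 0)
#define CFG_BRANCH_USER         (1ULL << 0)
#define CFG_BRANCH_CALL_STACK   (1ULL << 1)

/* Simplified stand-in for the fields of perf_event_attr used here. */
struct cfg_attr {
	uint64_t sample_type;
	uint64_t branch_sample_type;
	bool exclude_user;
};

/*
 * Try to enable LBR call stack on attr. Returns true on success,
 * false if the request falls back to frame pointers.
 */
bool cfg_enable_lbr(struct cfg_attr *attr, bool branch_stack_requested)
{
	assert(attr != NULL);

	if (branch_stack_requested) {
		/* The LBR registers are already claimed for branch records. */
		fprintf(stderr, "Cannot use LBR callstack with branch stack. "
				"Falling back to framepointers.\n");
		return false;
	}
	if (attr->exclude_user) {
		/* LBR call stack only records user-space call chains. */
		fprintf(stderr, "LBR callstack option is only available "
				"to get user callchain information. "
				"Falling back to framepointers.\n");
		return false;
	}

	attr->sample_type |= CFG_SAMPLE_BRANCH_STACK;
	attr->branch_sample_type = CFG_BRANCH_USER | CFG_BRANCH_CALL_STACK;
	return true;
}
```

In both fallback cases the attribute is left untouched, which matches the patch: the CALLCHAIN sample bit is already set before this check, so recording continues with frame pointers.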