Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

perf tools: Improve call graph documents and help messages

The --call-graph option is complex, so we should provide a better guide
for users. Also change the help message to be consistent with the config
option names. Now perf top will show help like below:

$ perf top --call-graph
Error: option `call-graph' requires a value

Usage: perf top [<options>]

    --call-graph <record_mode[,record_size],print_type,threshold[,print_limit],order,sort_key[,branch]>
                  setup and enables call-graph (stack chain/backtrace):

                    record_mode:    call graph recording mode (fp|dwarf|lbr)
                    record_size:    if record_mode is 'dwarf', max size of stack recording (<bytes>)
                                    default: 8192 (bytes)
                    print_type:     call graph printing style (graph|flat|fractal|none)
                    threshold:      minimum call graph inclusion threshold (<percent>)
                    print_limit:    maximum number of call graph entry (<number>)
                    order:          call graph order (caller|callee)
                    sort_key:       call graph sort key (function|address)
                    branch:         include last branch info to call graph (branch)

                    Default: fp,graph,0.5,caller,function

Requested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1445524112-5201-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Authored by Namhyung Kim and committed by Arnaldo Carvalho de Melo
76a26549 792aeafa

+62 -30
+7 -2
tools/perf/Documentation/perf-record.txt
···

 --call-graph::
 	Setup and enable call-graph (stack chain/backtrace) recording,
-	implies -g.
+	implies -g. Default is "fp".

 	Allows specifying "fp" (frame pointer) or "dwarf"
 	(DWARF's CFI - Call Frame Information) or "lbr"
···
 	In some systems, where binaries are build with gcc
 	--fomit-frame-pointer, using the "fp" method will produce bogus
 	call graphs, using "dwarf", if available (perf tools linked to
-	the libunwind library) should be used instead.
+	the libunwind or libdw library) should be used instead.
 	Using the "lbr" method doesn't require any compiler options. It
 	will produce call graphs from the hardware LBR registers. The
 	main limition is that it is only available on new Intel
 	platforms, such as Haswell. It can only get user call chain. It
 	doesn't work with branch stack sampling at the same time.
+
+	When "dwarf" recording is used, perf also records (user) stack dump
+	when sampled. Default size of the stack dump is 8192 (bytes).
+	User can change the size by passing the size after comma like
+	"--call-graph dwarf,4096".

 -q::
 --quiet::
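The record-side syntax documented above ("record_mode[,record_size]", with 8192 bytes as the default dump size for "dwarf") can be sketched as a small parser. This is only an illustration under stated assumptions: `enum record_mode`, `struct record_callchain` and `parse_record_callchain()` are hypothetical names, not perf's own code, and rejecting a size after "fp" or "lbr" is a choice made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration; not taken from the perf sources. */
enum record_mode { MODE_FP, MODE_DWARF, MODE_LBR };

struct record_callchain {
	enum record_mode mode;
	unsigned long dump_size;	/* bytes, only meaningful for MODE_DWARF */
};

/* Parse "record_mode[,record_size]". Returns 0 on success, -1 on error. */
static int parse_record_callchain(const char *arg, struct record_callchain *rc)
{
	size_t len;
	const char *num;
	char *end;
	unsigned long size;

	assert(arg != NULL && rc != NULL);

	len = strcspn(arg, ",");	/* length of the record_mode token */
	rc->dump_size = 0;

	if (len == 2 && strncmp(arg, "fp", 2) == 0) {
		rc->mode = MODE_FP;
	} else if (len == 3 && strncmp(arg, "lbr", 3) == 0) {
		rc->mode = MODE_LBR;
	} else if (len == 5 && strncmp(arg, "dwarf", 5) == 0) {
		rc->mode = MODE_DWARF;
		rc->dump_size = 8192;	/* documented default */
	} else {
		return -1;
	}

	if (arg[len] == '\0')
		return 0;		/* no record_size given */

	/* Assumption of this sketch: a size is accepted only after "dwarf". */
	if (rc->mode != MODE_DWARF)
		return -1;

	num = arg + len + 1;
	if (*num < '0' || *num > '9')
		return -1;
	size = strtoul(num, &end, 10);
	if (*end != '\0' || size == 0)
		return -1;

	rc->dump_size = size;
	return 0;
}
```

With this sketch, "dwarf,4096" yields a 4096-byte dump size, while plain "dwarf" falls back to 8192.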
+24 -14
tools/perf/Documentation/perf-report.txt
···
 --dump-raw-trace::
 	Dump raw trace in ASCII.

--g [type,min[,limit],order[,key][,branch]]::
---call-graph::
-	Display call chains using type, min percent threshold, optional print
-	limit and order.
-	type can be either:
+-g::
+--call-graph=<print_type,threshold[,print_limit],order,sort_key,branch>::
+	Display call chains using type, min percent threshold, print limit,
+	call order, sort key and branch. Note that ordering of parameters is not
+	fixed so any parement can be given in an arbitraty order. One exception
+	is the print_limit which should be preceded by threshold.
+
+	print_type can be either:
 	- flat: single column, linear exposure of call chains.
-	- graph: use a graph tree, displaying absolute overhead rates.
+	- graph: use a graph tree, displaying absolute overhead rates. (default)
 	- fractal: like graph, but displays relative rates. Each branch of
-	 the tree is considered as a new profiled object. +
+	 the tree is considered as a new profiled object.
+	- none: disable call chain display.
+
+	threshold is a percentage value which specifies a minimum percent to be
+	included in the output call graph. Default is 0.5 (%).
+
+	print_limit is only applied when stdio interface is used. It's to limit
+	number of call graph entries in a single hist entry. Note that it needs
+	to be given after threshold (but not necessarily consecutive).
+	Default is 0 (unlimited).

 	order can be either:
 	- callee: callee based call graph.
 	- caller: inverted caller based call graph.
+	Default is 'caller' when --children is used, otherwise 'callee'.

-	key can be:
-	- function: compare on functions
+	sort_key can be:
+	- function: compare on functions (default)
 	- address: compare on individual code addresses

 	branch can be:
-	- branch: include last branch information in callgraph
-	when available. Usually more convenient to use --branch-history
-	for this.
-
-	Default: graph,0.5,caller
+	- branch: include last branch information in callgraph when available.
+	 Usually more convenient to use --branch-history for this.

 --children::
 	Accumulate callchain of children to parent entry so that then can
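The ordering rule in the updated perf-report.txt (keywords in any order, print_limit only after threshold, though not necessarily adjacent to it) can be illustrated with a small tokenizer. This is a sketch under stated assumptions, not perf's actual parser: `struct report_callchain_opts`, `match()` and `parse_report_callchain()` are hypothetical names, and the defaults are taken from the "graph,0.5,caller,function" string in the help text (the documentation notes that the order default depends on --children, which is ignored here).

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for illustration only. */
struct report_callchain_opts {
	const char *print_type;
	double threshold;		/* percent */
	unsigned long print_limit;	/* 0 means unlimited */
	const char *order;
	const char *sort_key;
	bool branch;
};

static const char *const print_types[] = { "graph", "flat", "fractal", "none", NULL };
static const char *const orders[] = { "caller", "callee", NULL };
static const char *const sort_keys[] = { "function", "address", NULL };

/* Return the table entry equal to tok, or NULL if there is none. */
static const char *match(const char *tok, const char *const *names)
{
	for (; *names; names++)
		if (strcmp(tok, *names) == 0)
			return *names;
	return NULL;
}

/* Returns 0 on success, -1 on a malformed or unknown token. */
static int parse_report_callchain(const char *arg, struct report_callchain_opts *o)
{
	bool have_threshold = false, have_limit = false;
	const char *p = arg, *hit;

	assert(arg != NULL && o != NULL);

	/* Defaults, as listed in the help text. */
	o->print_type = "graph";
	o->threshold = 0.5;
	o->print_limit = 0;
	o->order = "caller";
	o->sort_key = "function";
	o->branch = false;

	while (*p) {
		char tok[32], *end;
		size_t len = strcspn(p, ",");

		if (len == 0 || len >= sizeof(tok))
			return -1;
		memcpy(tok, p, len);
		tok[len] = '\0';

		if ((hit = match(tok, print_types)) != NULL) {
			o->print_type = hit;
		} else if ((hit = match(tok, orders)) != NULL) {
			o->order = hit;
		} else if ((hit = match(tok, sort_keys)) != NULL) {
			o->sort_key = hit;
		} else if (strcmp(tok, "branch") == 0) {
			o->branch = true;
		} else if (!have_threshold) {
			/* The first number seen is the threshold. */
			o->threshold = strtod(tok, &end);
			if (end == tok || *end != '\0')
				return -1;
			have_threshold = true;
		} else if (!have_limit && isdigit((unsigned char)tok[0])) {
			/* A second number is the print_limit. */
			o->print_limit = strtoul(tok, &end, 10);
			if (*end != '\0')
				return -1;
			have_limit = true;
		} else {
			return -1;
		}

		p += len;
		if (*p == ',')
			p++;
	}
	return 0;
}
```

Because numbers are assigned positionally, "fractal,callee,2.5,10,address" and "2.5,fractal,10,address,callee" give the same result in this sketch, which is the behavior the documentation describes.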
+3 -2
tools/perf/builtin-record.c
···
 	},
 };

-const char record_callchain_help[] = CALLCHAIN_RECORD_HELP;
+const char record_callchain_help[] = CALLCHAIN_RECORD_HELP
+	"\n\t\t\t\tDefault: fp";

 /*
  * XXX Will stay a global variable till we fix builtin-script.c to stop messing
···
 			   NULL, "enables call-graph recording" ,
 			   &record_callchain_opt),
 	OPT_CALLBACK(0, "call-graph", &record.opts,
-		     "mode[,dump_size]", record_callchain_help,
+		     "record_mode[,record_size]", record_callchain_help,
 		     &record_parse_callchain_opt),
 	OPT_INCR('v', "verbose", &verbose,
 		    "be more verbose (show counter open errors, etc)"),
+7 -4
tools/perf/builtin-report.c
···
 	return 0;
 }

-const char report_callchain_help[] = "Display callchains using " CALLCHAIN_REPORT_HELP ". "
-				     "Default: graph,0.5,caller";
+#define CALLCHAIN_DEFAULT_OPT  "graph,0.5,caller,function"
+
+const char report_callchain_help[] = "Display call graph (stack chain/backtrace):\n\n"
+				     CALLCHAIN_REPORT_HELP
+				     "\n\t\t\t\tDefault: " CALLCHAIN_DEFAULT_OPT;

 int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 {
···
 	bool has_br_stack = false;
 	int branch_mode = -1;
 	bool branch_call_mode = false;
-	char callchain_default_opt[] = "graph,0.5,caller";
+	char callchain_default_opt[] = CALLCHAIN_DEFAULT_OPT;
 	const char * const report_usage[] = {
 		"perf report [<options>]",
 		NULL
···
 	OPT_BOOLEAN('x', "exclude-other", &symbol_conf.exclude_other,
 		    "Only display entries with parent-match"),
 	OPT_CALLBACK_DEFAULT('g', "call-graph", &report,
-			     "output_type,min_percent[,print_limit],call_order[,branch]",
+			     "print_type,threshold[,print_limit],order,sort_key[,branch]",
 			     report_callchain_help, &report_parse_callchain_opt,
 			     callchain_default_opt),
 	OPT_BOOLEAN(0, "children", &symbol_conf.cumulate_callchain,
+3 -2
tools/perf/builtin-top.c
···
 	return 0;
 }

-const char top_callchain_help[] = CALLCHAIN_RECORD_HELP ", " CALLCHAIN_REPORT_HELP;
+const char top_callchain_help[] = CALLCHAIN_RECORD_HELP CALLCHAIN_REPORT_HELP
+	"\n\t\t\t\tDefault: fp,graph,0.5,caller,function";

 int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 {
···
 			   NULL, "enables call-graph recording and display",
 			   &callchain_opt),
 	OPT_CALLBACK(0, "call-graph", &top.record_opts,
-		     "mode[,dump_size],output_type,min_percent[,print_limit],call_order[,branch]",
+		     "record_mode[,record_size],print_type,threshold[,print_limit],order,sort_key[,branch]",
 		     top_callchain_help, &parse_callchain_opt),
 	OPT_BOOLEAN(0, "children", &symbol_conf.cumulate_callchain,
 		    "Accumulate callchains of children and show total overhead as well"),
+18 -6
tools/perf/util/callchain.h
···
 #include "event.h"
 #include "symbol.h"

-#define CALLCHAIN_HELP "setup and enables call-graph (stack chain/backtrace) recording: "
+#define HELP_PAD "\t\t\t\t"
+
+#define CALLCHAIN_HELP "setup and enables call-graph (stack chain/backtrace):\n\n"

 #ifdef HAVE_DWARF_UNWIND_SUPPORT
-#define CALLCHAIN_RECORD_HELP CALLCHAIN_HELP "fp dwarf lbr"
+# define RECORD_MODE_HELP  HELP_PAD "record_mode:\tcall graph recording mode (fp|dwarf|lbr)\n"
 #else
-#define CALLCHAIN_RECORD_HELP CALLCHAIN_HELP "fp lbr"
+# define RECORD_MODE_HELP  HELP_PAD "record_mode:\tcall graph recording mode (fp|lbr)\n"
 #endif

-#define CALLCHAIN_REPORT_HELP "output_type (graph, flat, fractal, or none), " \
-	"min percent threshold, optional print limit, callchain order, " \
-	"key (function or address), add branches"
+#define RECORD_SIZE_HELP						\
+	HELP_PAD "record_size:\tif record_mode is 'dwarf', max size of stack recording (<bytes>)\n" \
+	HELP_PAD "\t\tdefault: 8192 (bytes)\n"
+
+#define CALLCHAIN_RECORD_HELP  CALLCHAIN_HELP RECORD_MODE_HELP RECORD_SIZE_HELP
+
+#define CALLCHAIN_REPORT_HELP						\
+	HELP_PAD "print_type:\tcall graph printing style (graph|flat|fractal|none)\n" \
+	HELP_PAD "threshold:\tminimum call graph inclusion threshold (<percent>)\n" \
+	HELP_PAD "print_limit:\tmaximum number of call graph entry (<number>)\n" \
+	HELP_PAD "order:\t\tcall graph order (caller|callee)\n" \
+	HELP_PAD "sort_key:\tcall graph sort key (function|address)\n" \
+	HELP_PAD "branch:\t\tinclude last branch info to call graph (branch)\n"

 enum perf_call_graph_mode {
 	CALLCHAIN_NONE,
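The new help macros rely on C's compile-time concatenation of adjacent string literals: each macro expands to one or more literals, and placing macros side by side joins them into a single string. A minimal sketch of that mechanism follows, copying the record-side macros from the callchain.h hunk and assuming HAVE_DWARF_UNWIND_SUPPORT is defined; `record_help_demo` and `count_newlines()` are hypothetical names added only to inspect the result.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Macros copied from the callchain.h hunk (dwarf-enabled variant assumed). */
#define HELP_PAD "\t\t\t\t"

#define CALLCHAIN_HELP "setup and enables call-graph (stack chain/backtrace):\n\n"

#define RECORD_MODE_HELP  HELP_PAD "record_mode:\tcall graph recording mode (fp|dwarf|lbr)\n"

#define RECORD_SIZE_HELP \
	HELP_PAD "record_size:\tif record_mode is 'dwarf', max size of stack recording (<bytes>)\n" \
	HELP_PAD "\t\tdefault: 8192 (bytes)\n"

#define CALLCHAIN_RECORD_HELP  CALLCHAIN_HELP RECORD_MODE_HELP RECORD_SIZE_HELP

/* Same shape as record_callchain_help in builtin-record.c after this commit. */
static const char record_help_demo[] = CALLCHAIN_RECORD_HELP
	"\n\t\t\t\tDefault: fp";

/* Hypothetical helper: count the line breaks in the assembled help text. */
static size_t count_newlines(const char *s)
{
	size_t n = 0;

	assert(s != NULL);
	for (; *s; s++)
		if (*s == '\n')
			n++;
	return n;
}
```

Printing `record_help_demo` with `fputs()` would show the header line, a blank line, the three padded record lines, another blank line, and the "Default: fp" line, all produced from a single array with no runtime formatting.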