Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

perf stat: Merge uncore events by default for hybrid platform

On a hybrid platform, by default 'perf stat' aggregates and reports the
event counts per PMU. For example,

# perf stat -e cycles -a true

Performance counter stats for 'system wide':

1,400,445 cpu_core/cycles/
680,881 cpu_atom/cycles/

0.001770773 seconds time elapsed

But that is not a suitable method for uncore events, because uncore PMUs
have nothing to do with hybrid core types. So for uncore events, we
aggregate the event counts from all PMUs and report them without the
PMU names.
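The aggregation described above can be pictured with a minimal, standalone
sketch. It is not perf code: struct pmu_count and merge_counts() are
hypothetical names introduced only for illustration, and the sketch simply
sums the per-PMU readings that share the same event string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-PMU reading; NOT a perf data structure. */
struct pmu_count {
	const char *pmu;		/* e.g. "uncore_arb_0" */
	const char *event;		/* e.g. "event=0x81,umask=0x1" */
	unsigned long long count;	/* value read from that PMU */
};

/*
 * Sum the counts of every reading whose event string matches 'event',
 * regardless of which PMU instance produced it.
 */
static unsigned long long merge_counts(const struct pmu_count *readings,
					size_t n, const char *event)
{
	unsigned long long total = 0;
	size_t i;

	assert(readings != NULL && event != NULL);

	for (i = 0; i < n; i++) {
		if (strcmp(readings[i].event, event) == 0)
			total += readings[i].count;
	}
	return total;
}
```

Feeding it two readings of the same event from uncore_arb_0 and
uncore_arb_1 yields a single total, which is the shape of the merged
output: one line per event, with no PMU name attached.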

Before:

# perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ -a true

Performance counter stats for 'system wide':

2,058 uncore_arb_0/event=0x81,umask=0x1/
2,028 uncore_arb_1/event=0x81,umask=0x1/
0 uncore_arb_0/event=0x84,umask=0x1/
0 uncore_arb_1/event=0x84,umask=0x1/

0.000614498 seconds time elapsed

After:

# perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ -a true

Performance counter stats for 'system wide':

3,996 arb/event=0x81,umask=0x1/
0 arb/event=0x84,umask=0x1/

0.000630046 seconds time elapsed

The '--no-merge' option keeps working for uncore events:

# perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ --no-merge true

Performance counter stats for 'system wide':

1,952 uncore_arb_0/event=0x81,umask=0x1/
1,921 uncore_arb_1/event=0x81,umask=0x1/
0 uncore_arb_0/event=0x84,umask=0x1/
0 uncore_arb_1/event=0x84,umask=0x1/

0.000575536 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210707055652.962-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Authored by Jin Yao and committed by Arnaldo Carvalho de Melo
e0a7ef2a de3d5fd8

+13 -4
tools/perf/builtin-stat.c (-3)

@@ -2445,9 +2445,6 @@
 
 	evlist__check_cpu_maps(evsel_list);
 
-	if (perf_pmu__has_hybrid())
-		stat_config.no_merge = true;
-
 	/*
 	 * Initialize thread_map with comm names,
 	 * so we could print it out on output.

tools/perf/util/stat-display.c (+13 -1)

@@ -596,6 +596,18 @@
 	}
 }
 
+static bool is_uncore(struct evsel *evsel)
+{
+	struct perf_pmu *pmu = evsel__find_pmu(evsel);
+
+	return pmu && pmu->is_uncore;
+}
+
+static bool hybrid_uniquify(struct evsel *evsel)
+{
+	return perf_pmu__has_hybrid() && !is_uncore(evsel);
+}
+
 static bool collect_data(struct perf_stat_config *config, struct evsel *counter,
 			    void (*cb)(struct perf_stat_config *config, struct evsel *counter, void *data,
 				       bool first),
@@ -604,7 +616,7 @@
 	if (counter->merged_stat)
 		return false;
 	cb(config, counter, data, true);
-	if (config->no_merge)
+	if (config->no_merge || hybrid_uniquify(counter))
 		uniquify_event_name(counter);
 	else if (counter->auto_merge_stats)
 		collect_all_aliases(config, counter, cb, data);
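
The decision made in collect_data() after this change can be modelled in
isolation. The sketch below is an illustration under stated assumptions,
not the perf implementation: struct model_event, enum display_action and
choose_action() are hypothetical names, and the PMU lookup done by
evsel__find_pmu() is reduced to a plain boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the evsel state the patch looks at. */
struct model_event {
	bool pmu_is_uncore;	/* models pmu && pmu->is_uncore */
	bool auto_merge_stats;	/* models counter->auto_merge_stats */
};

/* Hypothetical labels for the three outcomes in collect_data(). */
enum display_action {
	UNIQUIFY_NAME,	/* report per PMU, with the PMU name in the event */
	MERGE_ALIASES,	/* aggregate counts from all matching PMUs */
	PRINT_AS_IS,	/* neither uniquify nor merge */
};

/*
 * Mirror the patched condition: uniquify when --no-merge is set, or when
 * the platform is hybrid and the event is not an uncore event; otherwise
 * merge if the event allows automatic merging.
 */
static enum display_action choose_action(bool no_merge, bool has_hybrid,
					 const struct model_event *ev)
{
	bool hybrid_uniquify;

	assert(ev != NULL);
	hybrid_uniquify = has_hybrid && !ev->pmu_is_uncore;

	if (no_merge || hybrid_uniquify)
		return UNIQUIFY_NAME;
	if (ev->auto_merge_stats)
		return MERGE_ALIASES;
	return PRINT_AS_IS;
}
```

In this model, an uncore event on a hybrid platform is merged unless
'--no-merge' is given, while a core event on a hybrid platform is always
reported per PMU, which matches the before/after outputs in the commit
message.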