Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

perf arm-spe: Add SPE total latency as PERF_SAMPLE_WEIGHT

Use total latency info in the SPE counter packet as sample weight so
that we can see it in local_weight and (global) weight sort keys.

Maybe we can use PERF_SAMPLE_WEIGHT_STRUCT to support ins_lat as well,
but I'm not sure which latency it matches, so just add total latency
for now.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211201220855.1260688-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

authored by Namhyung Kim and committed by Arnaldo Carvalho de Melo
b0fde9c6 f0a29c96

+7 -1
+2
tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
···
 				decoder->record.phys_addr = ip;
 			break;
 		case ARM_SPE_COUNTER:
+			if (idx == SPE_CNT_PKT_HDR_INDEX_TOTAL_LAT)
+				decoder->record.latency = payload;
 			break;
 		case ARM_SPE_CONTEXT:
 			decoder->record.context_id = payload;
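
To illustrate the decoder change above in isolation, here is a minimal, self-contained sketch of the counter-packet case. The names `sketch_record` and `sketch_handle_counter` are hypothetical stand-ins, not part of perf, and the index values are assumed to match the `SPE_CNT_PKT_HDR_INDEX_*` definitions in arm-spe-pkt-decoder.h; check that header before relying on them.

```c
#include <stdint.h>

/* Assumed values; the real definitions live in arm-spe-pkt-decoder.h. */
enum sketch_cnt_index {
	SKETCH_CNT_TOTAL_LAT = 0x0,
	SKETCH_CNT_ISSUE_LAT = 0x1,
	SKETCH_CNT_TRANS_LAT = 0x2,
};

/* Cut-down, hypothetical stand-in for struct arm_spe_record. */
struct sketch_record {
	uint32_t op;
	uint32_t latency;	/* total latency from the counter packet */
};

/*
 * Handle one counter packet. Only the total-latency counter is stored;
 * every other counter index is ignored, as in the patch.
 */
static void sketch_handle_counter(struct sketch_record *rec,
				  unsigned int idx, uint64_t payload)
{
	if (idx == SKETCH_CNT_TOTAL_LAT)
		rec->latency = (uint32_t)payload;	/* narrowed to the u32 field */
}
```

Issue and translation latency packets leave the record untouched in this sketch, which matches the commit message's decision to add only total latency first.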
+1
tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
···
 	enum arm_spe_sample_type type;
 	int err;
 	u32 op;
+	u32 latency;
 	u64 from_ip;
 	u64 to_ip;
 	u64 timestamp;
+4 -1
tools/perf/util/arm-spe.c
···
 	sample.addr = record->virt_addr;
 	sample.phys_addr = record->phys_addr;
 	sample.data_src = data_src;
+	sample.weight = record->latency;
 
 	return arm_spe_deliver_synth_event(spe, speq, event, &sample);
 }
···
 	sample.id = spe_events_id;
 	sample.stream_id = spe_events_id;
 	sample.addr = record->to_ip;
+	sample.weight = record->latency;
 
 	return arm_spe_deliver_synth_event(spe, speq, event, &sample);
 }
···
 	attr.type = PERF_TYPE_HARDWARE;
 	attr.sample_type = evsel->core.attr.sample_type & PERF_SAMPLE_MASK;
 	attr.sample_type |= PERF_SAMPLE_IP | PERF_SAMPLE_TID |
-			    PERF_SAMPLE_PERIOD | PERF_SAMPLE_DATA_SRC;
+			    PERF_SAMPLE_PERIOD | PERF_SAMPLE_DATA_SRC |
+			    PERF_SAMPLE_WEIGHT;
 	if (spe->timeless_decoding)
 		attr.sample_type &= ~(u64)PERF_SAMPLE_TIME;
 	else
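
The sample_type change in the last hunk can be sketched as a standalone function. This is only an illustration: the `SKETCH_SAMPLE_*` macros are local copies whose bit positions are assumed to match the `PERF_SAMPLE_*` values in linux/perf_event.h, and `sketch_synth_sample_type` is a hypothetical helper. The `else` branch is cut off in the diff context, so it is left out here as well.

```c
#include <stdint.h>

/* Local copies of the UAPI bits, assumed to match linux/perf_event.h. */
#define SKETCH_SAMPLE_IP	(1ULL << 0)
#define SKETCH_SAMPLE_TID	(1ULL << 1)
#define SKETCH_SAMPLE_TIME	(1ULL << 2)
#define SKETCH_SAMPLE_PERIOD	(1ULL << 8)
#define SKETCH_SAMPLE_WEIGHT	(1ULL << 14)
#define SKETCH_SAMPLE_DATA_SRC	(1ULL << 15)

/*
 * Build the sample_type for a synthesized event. The weight bit is
 * requested alongside the existing ones so that the weight stored in
 * each synthesized sample is written out and parsed back.
 */
static uint64_t sketch_synth_sample_type(uint64_t base, int timeless)
{
	uint64_t type = base;

	type |= SKETCH_SAMPLE_IP | SKETCH_SAMPLE_TID |
		SKETCH_SAMPLE_PERIOD | SKETCH_SAMPLE_DATA_SRC |
		SKETCH_SAMPLE_WEIGHT;
	if (timeless)
		type &= ~SKETCH_SAMPLE_TIME;	/* timeless decoding drops the timestamp */
	/* The non-timeless branch is not shown in the diff, so it is omitted. */
	return type;
}
```

With the weight bit set, each synthesized sample carries the total latency, which is what makes it visible under the local_weight and weight sort keys named in the commit message.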