
Merge tag 'perf-tools-for-v5.12-2020-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tool updates from Arnaldo Carvalho de Melo:
"New features:

- Support instruction latency in 'perf report': with both memory
latency (weight) and instruction latency information, users can
locate expensive load instructions and understand the time spent in
different stages.

- Extend 'perf c2c' to display the number of loads which were blocked
by data or address conflict.

- Add 'perf stat' support for L2 topdown events in systems such as
Intel's Sapphire Rapids server.

- Add support for PERF_SAMPLE_CODE_PAGE_SIZE in various tools, as a
sort key, for instance:

perf report --stdio --sort=comm,symbol,code_page_size

- New 'perf daemon' command to run long-running sessions while
providing a way to control the enablement of events without
restarting a traditional 'perf record' session.

- Enable counting events for BPF programs in 'perf stat' just like
for other targets (tid, cgroup, cpu, etc), e.g.:

# perf stat -e ref-cycles,cycles -b 254 -I 1000
1.487903822 115,200 ref-cycles
1.487903822 86,012 cycles
2.489147029 80,560 ref-cycles
2.489147029 73,784 cycles
^C

The example above counts 'cycles' and 'ref-cycles' of the BPF program
with id 254. It is similar to the bpftool-prog-profile command, but
more flexible.

- Support the new layout for PERF_RECORD_MMAP2 to carry the DSO
build-id using infrastructure generalised from the eBPF subsystem,
removing the need for traversing the perf.data file to collect
build-ids at the end of 'perf record' sessions and helping with
long running sessions where binaries can get replaced in updates,
leading to possible mis-resolution of symbols.

- Support filtering by hex address in 'perf script'.

- Support DSO filter in 'perf script', like in other perf tools.

- Add namespaces support to 'perf inject'

- Add support for SDT (DTrace style markers) events on ARM64.

perf record:

- Fix handling of eventfd() when draining a buffer in 'perf record'.

- Improvements to the generation of metadata events for pre-existing
threads (mmaps, comm, etc), speeding up the work done at the start
of system wide or per CPU 'perf record' sessions.

Hardware tracing:

- Initial support for tracing KVM with Intel PT.

- Intel PT fixes for IPC

- Support Intel PT PSB (synchronization packets) events.

- Automatically group aux-output events to overcome --filter syntax.

- Enable PERF_SAMPLE_DATA_SRC on ARM's SPE.

- Update ARM's CoreSight hardware tracing OpenCSD library to v1.0.0.

perf annotate TUI:

- Fix handling of 'k' ("show line number") hotkey

- Fix jump parsing for C++ code.

perf probe:

- Add protection to avoid an endless loop.

cgroups:

- Avoid reading cgroup mountpoint multiple times, caching it.

- Fix handling of cgroup v1/v2 in mixed hierarchy.

Symbol resolving:

- Add OCaml symbol demangling.

- Further fixes for handling PE executables when using perf with Wine
and .exe/.dll files.

- Fix 'perf unwind' DSO handling.

- Resolve symbols against debug file first, to deal with artifacts
related to LTO.

- Fix gap between kernel end and module start on powerpc.

Reporting tools:

- The DSO filter shouldn't show samples in unresolved maps.

- Improve debuginfod support in various tools.

build ids:

- Fix 16-byte build ids in 'perf buildid-cache', add a 'perf test'
entry for that case.

perf test:

- Support for PERF_SAMPLE_WEIGHT_STRUCT.

- Add test case for PERF_SAMPLE_CODE_PAGE_SIZE.

- Shell-based tests for 'perf daemon' commands ('start', 'stop',
'reconfig', 'list', etc).

- ARM cs-etm 'perf test' fixes.

- Add parse-metric memory bandwidth testcase.

Compiler related:

- Fix 'perf probe' kretprobe issue caused by gcc 11 bug when used
with -fpatchable-function-entry.

- Fix ARM64 build with gcc 11's -Wformat-overflow.

- Fix unaligned access in sample parsing test.

- Fix printf conversion specifier for IP addresses on arm64, s390 and
powerpc.

Arch specific:

- Support exposing Performance Monitor Counter SPRs as part of
extended regs on powerpc.

- Add JSON 'perf stat' metrics for ARM64's imx8mp, imx8mq and imx8mn
DDR, fix imx8mm ones.

- Fix common and uarch events for ARM64's A76 and Ampere eMag"

* tag 'perf-tools-for-v5.12-2020-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (148 commits)
perf buildid-cache: Don't skip 16-byte build-ids
perf buildid-cache: Add test for 16-byte build-id
perf symbol: Remove redundant libbfd checks
perf test: Output the sub testing result in cs-etm
perf test: Suppress logs in cs-etm testing
perf tools: Fix arm64 build error with gcc-11
perf intel-pt: Add documentation for tracing virtual machines
perf intel-pt: Split VM-Entry and VM-Exit branches
perf intel-pt: Adjust sample flags for VM-Exit
perf intel-pt: Allow for a guest kernel address filter
perf intel-pt: Support decoding of guest kernel
perf machine: Factor out machine__idle_thread()
perf machine: Factor out machines__find_guest()
perf intel-pt: Amend decoder to track the NR flag
perf intel-pt: Retain the last PIP packet payload as is
perf intel_pt: Add vmlaunch and vmresume as branches
perf script: Add branch types for VM-Entry and VM-Exit
perf auxtrace: Automatically group aux-output events
perf test: Fix unaligned access in sample parsing test
perf tools: Support arch specific PERF_SAMPLE_WEIGHT_STRUCT processing
...

+6941 -977
+22 -6
tools/arch/powerpc/include/uapi/asm/perf_regs.h
···
 	PERF_REG_POWERPC_MMCR3,
 	PERF_REG_POWERPC_SIER2,
 	PERF_REG_POWERPC_SIER3,
+	PERF_REG_POWERPC_PMC1,
+	PERF_REG_POWERPC_PMC2,
+	PERF_REG_POWERPC_PMC3,
+	PERF_REG_POWERPC_PMC4,
+	PERF_REG_POWERPC_PMC5,
+	PERF_REG_POWERPC_PMC6,
 	/* Max regs without the extended regs */
 	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
 };

 #define PERF_REG_PMU_MASK ((1ULL << PERF_REG_POWERPC_MAX) - 1)

-/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */
-#define PERF_REG_PMU_MASK_300 (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) - PERF_REG_PMU_MASK)
-/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_31 */
-#define PERF_REG_PMU_MASK_31 (((1ULL << (PERF_REG_POWERPC_SIER3 + 1)) - 1) - PERF_REG_PMU_MASK)
+/* Exclude MMCR3, SIER2, SIER3 for CPU_FTR_ARCH_300 */
+#define PERF_EXCLUDE_REG_EXT_300 (7ULL << PERF_REG_POWERPC_MMCR3)

-#define PERF_REG_MAX_ISA_300 (PERF_REG_POWERPC_MMCR2 + 1)
-#define PERF_REG_MAX_ISA_31 (PERF_REG_POWERPC_SIER3 + 1)
+/*
+ * PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300
+ * includes 9 SPRS from MMCR0 to PMC6 excluding the
+ * unsupported SPRS in PERF_EXCLUDE_REG_EXT_300.
+ */
+#define PERF_REG_PMU_MASK_300 ((0xfffULL << PERF_REG_POWERPC_MMCR0) - PERF_EXCLUDE_REG_EXT_300)
+
+/*
+ * PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_31
+ * includes 12 SPRs from MMCR0 to PMC6.
+ */
+#define PERF_REG_PMU_MASK_31 (0xfffULL << PERF_REG_POWERPC_MMCR0)
+
+#define PERF_REG_EXTENDED_MAX (PERF_REG_POWERPC_PMC6 + 1)
 #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
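The mask arithmetic in the diff above can be sanity-checked in a few lines. MMCR0_IDX below is a hypothetical stand-in for PERF_REG_POWERPC_MMCR0; only the relative layout of the 12 extended SPRs (MMCR0..MMCR2, then MMCR3/SIER2/SIER3, then PMC1..PMC6) matters for the shape of the masks:

```python
# MMCR0_IDX is a made-up base index; the real PERF_REG_POWERPC_MMCR0 value
# only shifts both masks by the same amount.
MMCR0_IDX = 0

# ISA 3.1 (CPU_FTR_ARCH_31): all 12 extended SPRs from MMCR0 to PMC6
PERF_REG_PMU_MASK_31 = 0xFFF << MMCR0_IDX

# ISA 3.0 lacks MMCR3, SIER2 and SIER3, which sit 3 slots above MMCR0
PERF_EXCLUDE_REG_EXT_300 = 0b111 << (MMCR0_IDX + 3)
PERF_REG_PMU_MASK_300 = PERF_REG_PMU_MASK_31 - PERF_EXCLUDE_REG_EXT_300

print(bin(PERF_REG_PMU_MASK_31))   # 12 bits set
print(bin(PERF_REG_PMU_MASK_300))  # 9 bits set, hole where MMCR3/SIER2/SIER3 would be
```

Subtracting the exclude mask works because it is always a strict subset of the full 12-bit mask.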
+2
tools/bpf/bpftool/Makefile
···
 		/boot/vmlinux-$(shell uname -r)
 VMLINUX_BTF ?= $(abspath $(firstword $(wildcard $(VMLINUX_BTF_PATHS))))

+bootstrap: $(BPFTOOL_BOOTSTRAP)
+
 ifneq ($(VMLINUX_BTF)$(VMLINUX_H),)
 ifeq ($(feature-clang-bpf-co-re),1)
+3 -1
tools/build/Makefile.feature
···
          clang			\
          libbpf			\
          libpfm4			\
-         libdebuginfod
+         libdebuginfod		\
+         clang-bpf-co-re
+

 FEATURE_TESTS ?= $(FEATURE_TESTS_BASIC)
+2 -2
tools/build/feature/test-libopencsd.c
···
 /*
  * Check OpenCSD library version is sufficient to provide required features
  */
-#define OCSD_MIN_VER ((0 << 16) | (14 << 8) | (0))
+#define OCSD_MIN_VER ((1 << 16) | (0 << 8) | (0))
 #if !defined(OCSD_VER_NUM) || (OCSD_VER_NUM < OCSD_MIN_VER)
-#error "OpenCSD >= 0.14.0 is required"
+#error "OpenCSD >= 1.0.0 is required"
 #endif

 int main(void)
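The feature test packs the (major, minor, patch) triple into one integer so versions compare numerically; a minimal model of the OCSD_MIN_VER encoding from the diff above:

```python
# Same bit layout as OCSD_MIN_VER: major in bits 16+, minor in 8..15, patch in 0..7
def ocsd_ver(major, minor, patch):
    return (major << 16) | (minor << 8) | patch

OCSD_MIN_VER = ocsd_ver(1, 0, 0)  # v1.0.0 is now the minimum

print(ocsd_ver(0, 14, 0) < OCSD_MIN_VER)   # the old minimum 0.14.0 no longer passes
print(ocsd_ver(1, 0, 1) >= OCSD_MIN_VER)   # any 1.x release passes
```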
+87 -9
tools/include/uapi/linux/perf_event.h
···
 	PERF_SAMPLE_CGROUP			= 1U << 21,
 	PERF_SAMPLE_DATA_PAGE_SIZE		= 1U << 22,
 	PERF_SAMPLE_CODE_PAGE_SIZE		= 1U << 23,
+	PERF_SAMPLE_WEIGHT_STRUCT		= 1U << 24,

-	PERF_SAMPLE_MAX = 1U << 24,		/* non-ABI */
+	PERF_SAMPLE_MAX = 1U << 25,		/* non-ABI */

 	__PERF_SAMPLE_CALLCHAIN_EARLY		= 1ULL << 63, /* non-ABI; internal use */
 };

+#define PERF_SAMPLE_WEIGHT_TYPE	(PERF_SAMPLE_WEIGHT | PERF_SAMPLE_WEIGHT_STRUCT)
 /*
  * values to program into branch_sample_type when PERF_SAMPLE_BRANCH is set
  *
···
 				aux_output     :  1, /* generate AUX records instead of events */
 				cgroup         :  1, /* include cgroup events */
 				text_poke      :  1, /* include text poke events */
-				__reserved_1   : 30;
+				build_id       :  1, /* use build id in mmap2 events */
+				__reserved_1   : 29;

 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
···
 	__u64	aux_size;
 };

+/*
+ * The current state of perf_event_header::misc bits usage:
+ * ('|' used bit, '-' unused bit)
+ *
+ *  012         CDEF
+ *  |||---------||||
+ *
+ *  Where:
+ *    0-2	CPUMODE_MASK
+ *
+ *    C	PROC_MAP_PARSE_TIMEOUT
+ *    D	MMAP_DATA / COMM_EXEC / FORK_EXEC / SWITCH_OUT
+ *    E	MMAP_BUILD_ID / EXACT_IP / SCHED_OUT_PREEMPT
+ *    F	(reserved)
+ */
+
 #define PERF_RECORD_MISC_CPUMODE_MASK		(7 << 0)
 #define PERF_RECORD_MISC_CPUMODE_UNKNOWN	(0 << 0)
 #define PERF_RECORD_MISC_KERNEL			(1 << 0)
···
 *
 *   PERF_RECORD_MISC_EXACT_IP           - PERF_RECORD_SAMPLE of precise events
 *   PERF_RECORD_MISC_SWITCH_OUT_PREEMPT - PERF_RECORD_SWITCH* events
+*   PERF_RECORD_MISC_MMAP_BUILD_ID      - PERF_RECORD_MMAP2 event
 *
 *
 * PERF_RECORD_MISC_EXACT_IP:
···
 *
 * PERF_RECORD_MISC_SWITCH_OUT_PREEMPT:
 *   Indicates that thread was preempted in TASK_RUNNING state.
+ *
+ * PERF_RECORD_MISC_MMAP_BUILD_ID:
+ *   Indicates that mmap2 event carries build id data.
 */
 #define PERF_RECORD_MISC_EXACT_IP		(1 << 14)
 #define PERF_RECORD_MISC_SWITCH_OUT_PREEMPT	(1 << 14)
+#define PERF_RECORD_MISC_MMAP_BUILD_ID		(1 << 14)
 /*
  * Reserve the last bit to indicate some extended misc field
  */
···
 *	  char			data[size];
 *	  u64			dyn_size; } && PERF_SAMPLE_STACK_USER
 *
-*	{ u64			weight;   } && PERF_SAMPLE_WEIGHT
+*	{ union perf_sample_weight
+*	 {
+*		u64		full; && PERF_SAMPLE_WEIGHT
+*	#if defined(__LITTLE_ENDIAN_BITFIELD)
+*		struct {
+*			u32	var1_dw;
+*			u16	var2_w;
+*			u16	var3_w;
+*		} && PERF_SAMPLE_WEIGHT_STRUCT
+*	#elif defined(__BIG_ENDIAN_BITFIELD)
+*		struct {
+*			u16	var3_w;
+*			u16	var2_w;
+*			u32	var1_dw;
+*		} && PERF_SAMPLE_WEIGHT_STRUCT
+*	#endif
+*	 }
+*	}
 *	{ u64			data_src; } && PERF_SAMPLE_DATA_SRC
 *	{ u64			transaction; } && PERF_SAMPLE_TRANSACTION
 *	{ u64			abi; # enum perf_sample_regs_abi
···
 *	u64				addr;
 *	u64				len;
 *	u64				pgoff;
-*	u32				maj;
-*	u32				min;
-*	u64				ino;
-*	u64				ino_generation;
+*	union {
+*		struct {
+*			u32		maj;
+*			u32		min;
+*			u64		ino;
+*			u64		ino_generation;
+*		};
+*		struct {
+*			u8		build_id_size;
+*			u8		__reserved_1;
+*			u16		__reserved_2;
+*			u8		build_id[20];
+*		};
+*	};
 *	u32				prot, flags;
 *	char				filename[];
 *	struct sample_id		sample_id;
···
 			mem_lvl_num:4,	/* memory hierarchy level number */
 			mem_remote:1,   /* remote */
 			mem_snoopx:2,	/* snoop mode, ext */
-			mem_rsvd:24;
+			mem_blk:3,	/* access blocked */
+			mem_rsvd:21;
 	};
 };
 #elif defined(__BIG_ENDIAN_BITFIELD)
 union perf_mem_data_src {
 	__u64 val;
 	struct {
-		__u64	mem_rsvd:24,
+		__u64	mem_rsvd:21,
+			mem_blk:3,	/* access blocked */
 			mem_snoopx:2,	/* snoop mode, ext */
 			mem_remote:1,   /* remote */
 			mem_lvl_num:4,	/* memory hierarchy level number */
···
 #define PERF_MEM_TLB_OS		0x40 /* OS fault handler */
 #define PERF_MEM_TLB_SHIFT	26

+/* Access blocked */
+#define PERF_MEM_BLK_NA		0x01 /* not available */
+#define PERF_MEM_BLK_DATA	0x02 /* data could not be forwarded */
+#define PERF_MEM_BLK_ADDR	0x04 /* address conflict */
+#define PERF_MEM_BLK_SHIFT	40
+
 #define PERF_MEM_S(a, s) \
 	(((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
···
 		cycles:16,  /* cycle count to last branch */
 		type:4,     /* branch type */
 		reserved:40;
+};
+
+union perf_sample_weight {
+	__u64		full;
+#if defined(__LITTLE_ENDIAN_BITFIELD)
+	struct {
+		__u32	var1_dw;
+		__u16	var2_w;
+		__u16	var3_w;
+	};
+#elif defined(__BIG_ENDIAN_BITFIELD)
+	struct {
+		__u16	var3_w;
+		__u16	var2_w;
+		__u32	var1_dw;
+	};
+#else
+#error "Unknown endianness"
+#endif
 };

 #endif /* _UAPI_LINUX_PERF_EVENT_H */
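The new union perf_sample_weight above lets PERF_SAMPLE_WEIGHT and PERF_SAMPLE_WEIGHT_STRUCT view the same 64 bits. A sketch with Python's ctypes, assuming a little-endian host (the class names are illustrative, not from the kernel):

```python
import ctypes
import sys

class WeightFields(ctypes.Structure):
    # little-endian field order from the header; big-endian reverses it
    _fields_ = [("var1_dw", ctypes.c_uint32),
                ("var2_w",  ctypes.c_uint16),
                ("var3_w",  ctypes.c_uint16)]

class PerfSampleWeight(ctypes.Union):
    _fields_ = [("full", ctypes.c_uint64),   # the PERF_SAMPLE_WEIGHT view
                ("f",    WeightFields)]      # the PERF_SAMPLE_WEIGHT_STRUCT view

assert sys.byteorder == "little", "this sketch assumes a little-endian host"

w = PerfSampleWeight(full=(3 << 48) | (2 << 32) | 1)
print(w.f.var1_dw, w.f.var2_w, w.f.var3_w)  # 1 2 3
```

Tools that only understand the old format keep reading `full`; weight-struct-aware tools split out the extra fields (e.g. instruction latency) without changing the sample size.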
+3
tools/include/uapi/linux/prctl.h
···
 #define PR_SET_SYSCALL_USER_DISPATCH	59
 # define PR_SYS_DISPATCH_OFF		0
 # define PR_SYS_DISPATCH_ON		1
+/* The control values for the user space selector when dispatch is enabled */
+# define SYSCALL_DISPATCH_FILTER_ALLOW	0
+# define SYSCALL_DISPATCH_FILTER_BLOCK	1

 #endif /* _LINUX_PRCTL_H */
+65 -30
tools/lib/api/fs/cgroup.c
···
 #include <string.h>
 #include "fs.h"

+struct cgroupfs_cache_entry {
+	char	subsys[32];
+	char	mountpoint[PATH_MAX];
+};
+
+/* just cache last used one */
+static struct cgroupfs_cache_entry cached;
+
 int cgroupfs_find_mountpoint(char *buf, size_t maxlen, const char *subsys)
 {
 	FILE *fp;
-	char mountpoint[PATH_MAX + 1], tokens[PATH_MAX + 1], type[PATH_MAX + 1];
-	char path_v1[PATH_MAX + 1], path_v2[PATH_MAX + 2], *path;
-	char *token, *saved_ptr = NULL;
+	char *line = NULL;
+	size_t len = 0;
+	char *p, *path;
+	char mountpoint[PATH_MAX];
+
+	if (!strcmp(cached.subsys, subsys)) {
+		if (strlen(cached.mountpoint) < maxlen) {
+			strcpy(buf, cached.mountpoint);
+			return 0;
+		}
+		return -1;
+	}

 	fp = fopen("/proc/mounts", "r");
 	if (!fp)
···
 	/*
 	 * in order to handle split hierarchy, we need to scan /proc/mounts
 	 * and inspect every cgroupfs mount point to find one that has
-	 * perf_event subsystem
+	 * the given subsystem. If we found v1, just use it. If not we can
+	 * use v2 path as a fallback.
 	 */
-	path_v1[0] = '\0';
-	path_v2[0] = '\0';
+	mountpoint[0] = '\0';

-	while (fscanf(fp, "%*s %"__stringify(PATH_MAX)"s %"__stringify(PATH_MAX)"s %"
-				__stringify(PATH_MAX)"s %*d %*d\n",
-				mountpoint, type, tokens) == 3) {
+	/*
+	 * The /proc/mounts has the follow format:
+	 *
+	 *   <devname> <mount point> <fs type> <options> ...
+	 *
+	 */
+	while (getline(&line, &len, fp) != -1) {
+		/* skip devname */
+		p = strchr(line, ' ');
+		if (p == NULL)
+			continue;

-		if (!path_v1[0] && !strcmp(type, "cgroup")) {
+		/* save the mount point */
+		path = ++p;
+		p = strchr(p, ' ');
+		if (p == NULL)
+			continue;

-			token = strtok_r(tokens, ",", &saved_ptr);
+		*p++ = '\0';

-			while (token != NULL) {
-				if (subsys && !strcmp(token, subsys)) {
-					strcpy(path_v1, mountpoint);
-					break;
-				}
-				token = strtok_r(NULL, ",", &saved_ptr);
-			}
+		/* check filesystem type */
+		if (strncmp(p, "cgroup", 6))
+			continue;
+
+		if (p[6] == '2') {
+			/* save cgroup v2 path */
+			strcpy(mountpoint, path);
+			continue;
 		}

-		if (!path_v2[0] && !strcmp(type, "cgroup2"))
-			strcpy(path_v2, mountpoint);
+		/* now we have cgroup v1, check the options for subsystem */
+		p += 7;

-		if (path_v1[0] && path_v2[0])
-			break;
+		p = strstr(p, subsys);
+		if (p == NULL)
+			continue;
+
+		/* sanity check: it should be separated by a space or a comma */
+		if (!strchr(" ,", p[-1]) || !strchr(" ,", p[strlen(subsys)]))
+			continue;
+
+		strcpy(mountpoint, path);
+		break;
 	}
+	free(line);
 	fclose(fp);

-	if (path_v1[0])
-		path = path_v1;
-	else if (path_v2[0])
-		path = path_v2;
-	else
-		return -1;
+	strncpy(cached.subsys, subsys, sizeof(cached.subsys) - 1);
+	strcpy(cached.mountpoint, mountpoint);

-	if (strlen(path) < maxlen) {
-		strcpy(buf, path);
+	if (mountpoint[0] && strlen(mountpoint) < maxlen) {
+		strcpy(buf, mountpoint);
 		return 0;
 	}
 	return -1;
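The rewritten scan above prefers a cgroup v1 mount that lists the subsystem in its options and falls back to the last cgroup2 mount seen. A pure-Python sketch of that selection logic, with a made-up /proc/mounts sample (paths and options are illustrative):

```python
# Mirrors the C logic: first v1 mount whose options name the subsystem wins;
# otherwise the cgroup2 mount is the fallback.
def find_mountpoint(mounts, subsys):
    v2 = None
    for line in mounts.splitlines():
        fields = line.split(' ')
        if len(fields) < 4:
            continue
        _dev, path, fstype, opts = fields[:4]
        if fstype == "cgroup2":
            v2 = path                      # remember as fallback
        elif fstype == "cgroup" and subsys in opts.split(','):
            return path                    # v1 match wins immediately
    return v2

MOUNTS = """\
sysfs /sys sysfs rw 0 0
cgroup2 /sys/fs/cgroup/unified cgroup2 rw,nsdelegate 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,perf_event 0 0"""

print(find_mountpoint(MOUNTS, "perf_event"))  # v1 hit
print(find_mountpoint(MOUNTS, "memory"))      # falls back to the v2 path
```

Splitting the options on ',' gives the same separator guarantee that the C code checks by hand with strchr().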
+14 -4
tools/lib/perf/include/perf/event.h
···
 	__u64			 start;
 	__u64			 len;
 	__u64			 pgoff;
-	__u32			 maj;
-	__u32			 min;
-	__u64			 ino;
-	__u64			 ino_generation;
+	union {
+		struct {
+			__u32	 maj;
+			__u32	 min;
+			__u64	 ino;
+			__u64	 ino_generation;
+		};
+		struct {
+			__u8	 build_id_size;
+			__u8	 __reserved_1;
+			__u16	 __reserved_2;
+			__u8	 build_id[20];
+		};
+	};
 	__u32			 prot;
 	__u32			 flags;
 	char			 filename[PATH_MAX];
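A quick way to see that the build-id layout in the diff above overlays the old device/inode fields without growing the record is to model both structs with ctypes (class names are illustrative):

```python
import ctypes

class MmapIds(ctypes.Structure):
    # classic identification: device numbers plus inode
    _fields_ = [("maj", ctypes.c_uint32), ("min", ctypes.c_uint32),
                ("ino", ctypes.c_uint64), ("ino_generation", ctypes.c_uint64)]

class MmapBuildId(ctypes.Structure):
    # new identification: size-prefixed build id (up to 20 bytes)
    _fields_ = [("build_id_size", ctypes.c_uint8),
                ("reserved_1",    ctypes.c_uint8),
                ("reserved_2",    ctypes.c_uint16),
                ("build_id",      ctypes.c_uint8 * 20)]

class MmapUnion(ctypes.Union):
    _fields_ = [("ids", MmapIds), ("bid", MmapBuildId)]

# Both views occupy the same 24 bytes, so PERF_RECORD_MMAP2 keeps its size
print(ctypes.sizeof(MmapIds), ctypes.sizeof(MmapBuildId), ctypes.sizeof(MmapUnion))
```

Which view is valid for a given event is signalled by PERF_RECORD_MISC_MMAP_BUILD_ID in the event header.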
+1
tools/perf/Build
···
 perf-y += builtin-data.o
 perf-y += builtin-version.o
 perf-y += builtin-c2c.o
+perf-y += builtin-daemon.o

 perf-$(CONFIG_TRACE) += builtin-trace.o
 perf-$(CONFIG_LIBELF) += builtin-probe.o
+1 -1
tools/perf/Documentation/examples.txt
···
   ****** perf by examples ******
   ------------------------------

-[ From an e-mail by Ingo Molnar, http://lkml.org/lkml/2009/8/4/346 ]
+[ From an e-mail by Ingo Molnar, https://lore.kernel.org/lkml/20090804195717.GA5998@elte.hu ]


 First, discovery/enumeration of available counters can be done via
+1 -1
tools/perf/Documentation/itrace.txt
···
 		r	synthesize branches events (returns only)
 		x	synthesize transactions events
 		w	synthesize ptwrite events
-		p	synthesize power events
+		p	synthesize power events (incl. PSB events for Intel PT)
 		o	synthesize other events recorded due to the use
 			of aux-output (refer to perf record)
 		e	synthesize error events
+6
tools/perf/Documentation/perf-buildid-cache.txt
···
 	used when creating a uprobe for a process that resides in a
 	different mount namespace from the perf(1) utility.

+--debuginfod=URLs::
+	Specify debuginfod URL to be used when retrieving perf.data binaries,
+	it follows the same syntax as the DEBUGINFOD_URLS variable, like:
+
+	  buildid-cache.debuginfod=http://192.168.122.174:8002
+
 SEE ALSO
 --------
 linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-buildid-list[1]
+23 -1
tools/perf/Documentation/perf-config.txt
···
 	cache location, or to disable it altogether. If you want to disable it,
 	set buildid.dir to /dev/null. The default is $HOME/.debug

+buildid-cache.*::
+	buildid-cache.debuginfod=URLs
+		Specify debuginfod URLs to be used when retrieving perf.data binaries,
+		it follows the same syntax as the DEBUGINFOD_URLS variable, like:
+
+		  buildid-cache.debuginfod=http://192.168.122.174:8002
+
 annotate.*::
 	These are in control of addresses, jump function, source code
 	in lines of assembly code from a specific program.
···
 record.*::
 	record.build-id::
-		This option can be 'cache', 'no-cache' or 'skip'.
+		This option can be 'cache', 'no-cache', 'skip' or 'mmap'.
 		'cache' is to post-process data and save/update the binaries into
 		the build-id cache (in ~/.debug). This is the default.
 		But if this option is 'no-cache', it will not update the build-id cache.
 		'skip' skips post-processing and does not update the cache.
+		'mmap' skips post-processing and reads build-ids from MMAP events.

 	record.call-graph::
 		This is identical to 'call-graph.record-mode', except it is
···
 	can be changed using this option. Ex, auxtrace.dumpdir=/tmp.
 	If the directory does not exist or has the wrong file type,
 	the current directory is used.
+
+daemon.*::
+
+	daemon.base::
+		Base path for daemon data. All sessions data are stored under
+		this path.
+
+session-<NAME>.*::
+
+	session-<NAME>.run::
+
+		Defines new record session for daemon. The value is record's
+		command line without the 'record' keyword.
+

 SEE ALSO
 --------
+208
tools/perf/Documentation/perf-daemon.txt
···
perf-daemon(1)
==============


NAME
----
perf-daemon - Run record sessions on background


SYNOPSIS
--------
[verse]
'perf daemon'
'perf daemon' [<options>]
'perf daemon start' [<options>]
'perf daemon stop' [<options>]
'perf daemon signal' [<options>]
'perf daemon ping' [<options>]


DESCRIPTION
-----------
This command allows to run simple daemon process that starts and
monitors configured record sessions.

You can imagine 'perf daemon' of background process with several
'perf record' child tasks, like:

  # ps axjf
  ...
  1      916507 ... perf daemon start
  916507 916508 ...  \_ perf record --control=fifo:control,ack -m 10M -e cycles --overwrite --switch-output -a
  916507 916509 ...  \_ perf record --control=fifo:control,ack -m 20M -e sched:* --overwrite --switch-output -a

Not every 'perf record' session is suitable for running under daemon.
User need perf session that either produces data on query, like the
flight recorder sessions in above example or session that is configured
to produce data periodically, like with --switch-output configuration
for time and size.

Each session is started with control setup (with perf record --control
options).

Sessions are configured through config file, see CONFIG FILE section
with EXAMPLES.


OPTIONS
-------
-v::
--verbose::
	Be more verbose.

--config=<PATH>::
	Config file path. If not provided, perf will check system and default
	locations (/etc/perfconfig, $HOME/.perfconfig).

--base=<PATH>::
	Base directory path. Each daemon instance is running on top
	of base directory. Only one instance of server can run on
	top of one directory at the time.

All generic options are available also under commands.


START COMMAND
-------------
The start command creates the daemon process.

-f::
--foreground::
	Do not put the process in background.


STOP COMMAND
------------
The stop command stops all the session and the daemon process.


SIGNAL COMMAND
--------------
The signal command sends signal to configured sessions.

--session::
	Send signal to specific session.


PING COMMAND
------------
The ping command sends control ping to configured sessions.

--session::
	Send ping to specific session.


CONFIG FILE
-----------
The daemon is configured within standard perf config file by
following new variables:

daemon.base:
	Base path for daemon data. All sessions data are
	stored under this path.

session-<NAME>.run:
	Defines new record session. The value is record's command
	line without the 'record' keyword.

Each perf record session is run in daemon.base/<NAME> directory.


EXAMPLES
--------
Example with 2 record sessions:

  # cat ~/.perfconfig
  [daemon]
  base=/opt/perfdata

  [session-cycles]
  run = -m 10M -e cycles --overwrite --switch-output -a

  [session-sched]
  run = -m 20M -e sched:* --overwrite --switch-output -a


Starting the daemon:

  # perf daemon start


Check sessions:

  # perf daemon
  [603349:daemon] base: /opt/perfdata
  [603350:cycles] perf record -m 10M -e cycles --overwrite --switch-output -a
  [603351:sched] perf record -m 20M -e sched:* --overwrite --switch-output -a

First line is daemon process info with configured daemon base.


Check sessions with more info:

  # perf daemon -v
  [603349:daemon] base: /opt/perfdata
    output:  /opt/perfdata/output
    lock:    /opt/perfdata/lock
    up:      1 minutes
  [603350:cycles] perf record -m 10M -e cycles --overwrite --switch-output -a
    base:    /opt/perfdata/session-cycles
    output:  /opt/perfdata/session-cycles/output
    control: /opt/perfdata/session-cycles/control
    ack:     /opt/perfdata/session-cycles/ack
    up:      1 minutes
  [603351:sched] perf record -m 20M -e sched:* --overwrite --switch-output -a
    base:    /opt/perfdata/session-sched
    output:  /opt/perfdata/session-sched/output
    control: /opt/perfdata/session-sched/control
    ack:     /opt/perfdata/session-sched/ack
    up:      1 minutes

The 'base' path is daemon/session base.
The 'lock' file is daemon's lock file guarding that no other
daemon is running on top of the base.
The 'output' file is perf record output for specific session.
The 'control' and 'ack' files are perf control files.
The 'up' number shows minutes daemon/session is running.


Make sure control session is online:

  # perf daemon ping
  OK   cycles
  OK   sched


Send USR2 signal to session 'cycles' to generate perf.data file:

  # perf daemon signal --session cycles
  signal 12 sent to session 'cycles [603452]'

  # tail -2 /opt/perfdata/session-cycles/output
  [ perf record: dump data: Woken up 1 times ]
  [ perf record: Dump perf.data.2020123017013149 ]


Send USR2 signal to all sessions:

  # perf daemon signal
  signal 12 sent to session 'cycles [603452]'
  signal 12 sent to session 'sched [603453]'

  # tail -2 /opt/perfdata/session-cycles/output
  [ perf record: dump data: Woken up 1 times ]
  [ perf record: Dump perf.data.2020123017024689 ]
  # tail -2 /opt/perfdata/session-sched/output
  [ perf record: dump data: Woken up 1 times ]
  [ perf record: Dump perf.data.2020123017024713 ]


Stop daemon:

  # perf daemon stop


SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-config[1]
+88 -1
tools/perf/Documentation/perf-intel-pt.txt
···
 		b	synthesize "branches" events
 		x	synthesize "transactions" events
 		w	synthesize "ptwrite" events
-		p	synthesize "power" events
+		p	synthesize "power" events (incl. PSB events)
 		c	synthesize branches events (calls only)
 		r	synthesize branches events (returns only)
 		e	synthesize tracing error events
···
 	"pwrx" indicates return to C0
 For more details refer to the Intel 64 and IA-32 Architectures Software
 Developer Manuals.
+
+PSB events show when a PSB+ occurred and also the byte-offset in the trace.
+Emitting a PSB+ can cause a CPU a slight delay. When doing timing analysis
+of code with Intel PT, it is useful to know if a timing bubble was caused
+by Intel PT or not.

 Error events show where the decoder lost the trace.  Error events
 are quite important.  Users must know if what they are seeing is a complete
···
 ---

 include::build-xed.txt[]
+
+
+Tracing Virtual Machines
+------------------------
+
+Currently, only kernel tracing is supported and only with "timeless" decoding
+i.e. no TSC timestamps
+
+Other limitations and caveats
+
+ VMX controls may suppress packets needed for decoding resulting in decoding errors
+ VMX controls may block the perf NMI to the host potentially resulting in lost trace data
+ Guest kernel self-modifying code (e.g. jump labels or JIT-compiled eBPF) will result in decoding errors
+ Guest thread information is unknown
+ Guest VCPU is unknown but may be able to be inferred from the host thread
+ Callchains are not supported
+
+Example
+
+ Start VM
+
+   $ sudo virsh start kubuntu20.04
+   Domain kubuntu20.04 started
+
+ Mount the guest file system.  Note sshfs needs -o direct_io to enable reading of proc files.  root access is needed to read /proc/kcore.
+
+   $ mkdir vm0
+   $ sshfs -o direct_io root@vm0:/ vm0
+
+ Copy the guest /proc/kallsyms, /proc/modules and /proc/kcore
+
+   $ perf buildid-cache -v --kcore vm0/proc/kcore
+   kcore added to build-id cache directory /home/user/.debug/[kernel.kcore]/9600f316a53a0f54278885e8d9710538ec5f6a08/2021021807494306
+   $ KALLSYMS=/home/user/.debug/[kernel.kcore]/9600f316a53a0f54278885e8d9710538ec5f6a08/2021021807494306/kallsyms
+
+ Find the VM process
+
+   $ ps -eLl | grep 'KVM\|PID'
+   F S   UID     PID    PPID     LWP  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
+   3 S 64055    1430       1    1440  1  80   0 - 1921718 -    ?        00:02:47 CPU 0/KVM
+   3 S 64055    1430       1    1441  1  80   0 - 1921718 -    ?        00:02:41 CPU 1/KVM
+   3 S 64055    1430       1    1442  1  80   0 - 1921718 -    ?        00:02:38 CPU 2/KVM
+   3 S 64055    1430       1    1443  2  80   0 - 1921718 -    ?        00:03:18 CPU 3/KVM
+
+ Start an open-ended perf record, tracing the VM process, do something on the VM, and then ctrl-C to stop.
+ TSC is not supported and tsc=0 must be specified.  That means mtc is useless, so add mtc=0.
+ However, IPC can still be determined, hence cyc=1 can be added.
+ Only kernel decoding is supported, so 'k' must be specified.
+ Intel PT traces both the host and the guest so --guest and --host need to be specified.
+ Without timestamps, --per-thread must be specified to distinguish threads.
+
+   $ sudo perf kvm --guest --host --guestkallsyms $KALLSYMS record --kcore -e intel_pt/tsc=0,mtc=0,cyc=1/k -p 1430 --per-thread
+   ^C
+   [ perf record: Woken up 1 times to write data ]
+   [ perf record: Captured and wrote 5.829 MB ]
+
+ perf script can be used to provide an instruction trace
+
+   $ perf script --guestkallsyms $KALLSYMS --insn-trace --xed -F+ipc | grep -C10 vmresume | head -21
+   CPU 0/KVM  1440  ffffffff82133cdd __vmx_vcpu_run+0x3d ([kernel.kallsyms])  movq  0x48(%rax), %r9
+   CPU 0/KVM  1440  ffffffff82133ce1 __vmx_vcpu_run+0x41 ([kernel.kallsyms])  movq  0x50(%rax), %r10
+   CPU 0/KVM  1440  ffffffff82133ce5 __vmx_vcpu_run+0x45 ([kernel.kallsyms])  movq  0x58(%rax), %r11
+   CPU 0/KVM  1440  ffffffff82133ce9 __vmx_vcpu_run+0x49 ([kernel.kallsyms])  movq  0x60(%rax), %r12
+   CPU 0/KVM  1440  ffffffff82133ced __vmx_vcpu_run+0x4d ([kernel.kallsyms])  movq  0x68(%rax), %r13
+   CPU 0/KVM  1440  ffffffff82133cf1 __vmx_vcpu_run+0x51 ([kernel.kallsyms])  movq  0x70(%rax), %r14
+   CPU 0/KVM  1440  ffffffff82133cf5 __vmx_vcpu_run+0x55 ([kernel.kallsyms])  movq  0x78(%rax), %r15
+   CPU 0/KVM  1440  ffffffff82133cf9 __vmx_vcpu_run+0x59 ([kernel.kallsyms])  movq  (%rax), %rax
+   CPU 0/KVM  1440  ffffffff82133cfc __vmx_vcpu_run+0x5c ([kernel.kallsyms])  callq  0xffffffff82133c40
+   CPU 0/KVM  1440  ffffffff82133c40 vmx_vmenter+0x0 ([kernel.kallsyms])  jz 0xffffffff82133c46
+   CPU 0/KVM  1440  ffffffff82133c42 vmx_vmenter+0x2 ([kernel.kallsyms])  vmresume	IPC: 0.11 (50/445)
+   :1440  1440  ffffffffbb678b06 native_write_msr+0x6 ([guest.kernel.kallsyms])  nopl  %eax, (%rax,%rax,1)
+   :1440  1440  ffffffffbb678b0b native_write_msr+0xb ([guest.kernel.kallsyms])  retq	IPC: 0.04 (2/41)
+   :1440  1440  ffffffffbb666646 lapic_next_deadline+0x26 ([guest.kernel.kallsyms])  data16 nop
+   :1440  1440  ffffffffbb666648 lapic_next_deadline+0x28 ([guest.kernel.kallsyms])  xor %eax, %eax
+   :1440  1440  ffffffffbb66664a lapic_next_deadline+0x2a ([guest.kernel.kallsyms])  popq  %rbp
+   :1440  1440  ffffffffbb66664b lapic_next_deadline+0x2b ([guest.kernel.kallsyms])  retq	IPC: 0.16 (4/25)
+   :1440  1440  ffffffffbb74607f clockevents_program_event+0x8f ([guest.kernel.kallsyms])  test %eax, %eax
+   :1440  1440  ffffffffbb746081 clockevents_program_event+0x91 ([guest.kernel.kallsyms])  jz 0xffffffffbb74603c	IPC: 0.06 (2/30)
+   :1440  1440  ffffffffbb74603c clockevents_program_event+0x4c ([guest.kernel.kallsyms])  popq  %rbx
+   :1440  1440  ffffffffbb74603d clockevents_program_event+0x4d ([guest.kernel.kallsyms])  popq  %r12
+

 SEE ALSO
 --------
+3
tools/perf/Documentation/perf-mem.txt
··· 63 63 --phys-data:: 64 64 Record/Report sample physical addresses 65 65 66 + --data-page-size:: 67 + Record/Report sample data address page size 68 + 66 69 RECORD OPTIONS 67 70 -------------- 68 71 -e::
+18 -3
tools/perf/Documentation/perf-record.txt
··· 296 296 --data-page-size:: 297 297 Record the sampled data address data page size. 298 298 299 + --code-page-size:: 300 + Record the sampled code address (ip) page size. 301 + 299 302 -T:: 300 303 --timestamp:: 301 304 Record the sample timestamps. Use it with 'perf report -D' to see the ··· 488 485 --buildid-all:: 489 486 Record build-id of all DSOs regardless whether it's actually hit or not. 490 487 488 + --buildid-mmap:: 489 + Record build ids in mmap2 events, disables build id cache (implies --no-buildid). 490 + 491 491 --aio[=n]:: 492 492 Use <n> control blocks in asynchronous (Posix AIO) trace writing mode (default: 1, max: 4). 493 493 Asynchronous mode is supported only when linking Perf tool with libc library ··· 646 640 Listen on ctl-fd descriptor for command to control measurement. 647 641 648 642 Available commands: 649 - 'enable' : enable events 650 - 'disable' : disable events 651 - 'snapshot': AUX area tracing snapshot). 643 + 'enable' : enable events 644 + 'disable' : disable events 645 + 'enable name' : enable event 'name' 646 + 'disable name' : disable event 'name' 647 + 'snapshot' : AUX area tracing snapshot. 648 + 'stop' : stop perf record 649 + 'ping' : ping 650 + 651 + 'evlist [-v|-g|-F]' : display all events 652 + -F Show just the sample frequency used for each event. 653 + -v Show all fields. 654 + -g Show event group information. 652 655 653 656 Measurements can be started with events disabled using --delay=-1 option. Optionally 654 657 send control command completion ('ack\n') to ack-fd descriptor to synchronize with the
+8 -2
tools/perf/Documentation/perf-report.txt
··· 108 108 - period: Raw number of event count of sample 109 109 - time: Separate the samples by time stamp with the resolution specified by 110 110 --time-quantum (default 100ms). Specify with overhead and before it. 111 + - code_page_size: the code page size of sampled code address (ip) 112 + - ins_lat: Instruction latency in core cycles. This is the global instruction 113 + latency 114 + - local_ins_lat: Local instruction latency version 111 115 112 116 By default, comm, dso and symbol keys are used. 113 117 (i.e. --sort comm,dso,symbol) ··· 143 139 144 140 If the --mem-mode option is used, the following sort keys are also available 145 141 (incompatible with --branch-stack): 146 - symbol_daddr, dso_daddr, locked, tlb, mem, snoop, dcacheline. 142 + symbol_daddr, dso_daddr, locked, tlb, mem, snoop, dcacheline, blocked. 147 143 148 144 - symbol_daddr: name of data symbol being executed on at the time of sample 149 145 - dso_daddr: name of library or module containing the data being executed ··· 155 151 - dcacheline: the cacheline the data address is on at the time of the sample 156 152 - phys_daddr: physical address of data being executed on at the time of sample 157 153 - data_page_size: the data page size of data being executed on at the time of sample 154 + - blocked: reason of blocked load access for the data at the time of the sample 158 155 159 156 And the default sort keys are changed to local_weight, mem, sym, dso, 160 - symbol_daddr, dso_daddr, snoop, tlb, locked, see '--mem-mode'. 157 + symbol_daddr, dso_daddr, snoop, tlb, locked, blocked, local_ins_lat, 158 + see '--mem-mode'. 161 159 162 160 If the data file has tracepoint event(s), following (dynamic) sort keys 163 161 are also available:
+24 -1
tools/perf/Documentation/perf-script.txt
··· 118 118 comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff, 119 119 srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output, 120 120 brstackinsn, brstackoff, callindent, insn, insnlen, synth, phys_addr, 121 - metric, misc, srccode, ipc, data_page_size. 121 + metric, misc, srccode, ipc, data_page_size, code_page_size. 122 122 Field list can be prepended with the type, trace, sw or hw, 123 123 to indicate to which event type the field list applies. 124 124 e.g., -F sw:comm,tid,time,ip,sym and -F trace:time,cpu,trace ··· 422 422 Only consider the listed symbols. Symbols are typically a name 423 423 but they may also be hexadecimal address. 424 424 425 + The hexadecimal address may be the start address of a symbol or 426 + any other address to filter the trace records. 427 + 425 428 For example, to select the symbol noploop or the address 0x4007a0: 426 429 perf script --symbols=noploop,0x4007a0 430 + 431 + Support filtering trace records by symbol name, start address of 432 + symbol, any hexadecimal address and address range. 433 + 434 + The comparison order is: 435 + 436 + 1. symbol name comparison. 437 + 2. symbol start address comparison. 438 + 3. any hexadecimal address comparison. 439 + 4. address range comparison (see --addr-range). 440 + 441 + --addr-range:: 442 + Use with -S or --symbols to list traced records within address range. 443 + 444 + For example, to list the traced records within the address range 445 + [0x4007a0, 0x4007a9]: 446 + perf script -S 0x4007a0 --addr-range 10 447 + 448 + --dsos=:: 449 + Only consider symbols in these DSOs. 427 450 428 451 --call-trace:: 429 452 Show call stream for intel_pt traces. The CPUs are interleaved, but
+31 -1
tools/perf/Documentation/perf-stat.txt
··· 75 75 --tid=<tid>:: 76 76 stat events on existing thread id (comma separated list) 77 77 78 + -b:: 79 + --bpf-prog:: 80 + stat events on existing bpf program id (comma separated list), 81 + requiring root rights. bpftool-prog could be used to find the 82 + program ids of all bpf programs in the system. For example: 83 + 84 + # bpftool prog | head -n 1 85 + 17247: tracepoint name sys_enter tag 192d548b9d754067 gpl 86 + 87 + # perf stat -e cycles,instructions --bpf-prog 17247 --timeout 1000 88 + 89 + Performance counter stats for 'BPF program(s) 17247': 90 + 91 + 85,967 cycles 92 + 28,982 instructions # 0.34 insn per cycle 93 + 94 + 1.102235068 seconds time elapsed 95 + 78 96 ifdef::HAVE_LIBPFM[] 79 97 --pfm-events events:: 80 98 Select a PMU event using libpfm4 syntax (see http://perfmon2.sf.net) ··· 376 358 Do not aggregate counts across all monitored CPUs. 377 359 378 360 --topdown:: 379 - Print top down level 1 metrics if supported by the CPU. This allows to 361 + Print complete top-down metrics supported by the CPU. This allows to 380 362 determine bottle necks in the CPU pipeline for CPU bound workloads, 381 363 by breaking the cycles consumed down into frontend bound, backend bound, 382 364 bad speculation and retiring. ··· 410 392 To interpret the results it is usually needed to know on which 411 393 CPUs the workload runs on. If needed the CPUs can be forced using 412 394 taskset. 395 + 396 + --td-level:: 397 + Print the top-down statistics that are equal to or lower than the input level. 398 + It allows users to print the top-down metrics level of interest instead of 399 + the complete top-down metrics. 400 + 401 + The availability of the top-down metrics level depends on the hardware. For 402 + example, Ice Lake only supports L1 top-down metrics. Sapphire Rapids 403 + supports both L1 and L2 top-down metrics. 404 + 405 + Default: 0 means the max level that the current hardware supports. 406 + Errors out if the input is higher than the supported max level.
413 407 414 408 --no-merge:: 415 409 Do not merge results from same PMUs.
+74 -4
tools/perf/Documentation/topdown.txt
··· 121 121 #define RDPMC_METRIC (1 << 29) /* return metric counters */ 122 122 123 123 #define FIXED_COUNTER_SLOTS 3 124 - #define METRIC_COUNTER_TOPDOWN_L1 0 124 + #define METRIC_COUNTER_TOPDOWN_L1_L2 0 125 125 126 126 static inline uint64_t read_slots(void) 127 127 { ··· 130 130 131 131 static inline uint64_t read_metrics(void) 132 132 { 133 - return _rdpmc(RDPMC_METRIC | METRIC_COUNTER_TOPDOWN_L1); 133 + return _rdpmc(RDPMC_METRIC | METRIC_COUNTER_TOPDOWN_L1_L2); 134 134 } 135 135 136 136 Then the program can be instrumented to read these metrics at different ··· 152 152 153 153 #define GET_METRIC(m, i) (((m) >> (i*8)) & 0xff) 154 154 155 + /* L1 Topdown metric events */ 155 156 #define TOPDOWN_RETIRING(val) ((float)GET_METRIC(val, 0) / 0xff) 156 157 #define TOPDOWN_BAD_SPEC(val) ((float)GET_METRIC(val, 1) / 0xff) 157 158 #define TOPDOWN_FE_BOUND(val) ((float)GET_METRIC(val, 2) / 0xff) 158 159 #define TOPDOWN_BE_BOUND(val) ((float)GET_METRIC(val, 3) / 0xff) 160 + 161 + /* 162 + * L2 Topdown metric events. 163 + * Available on Sapphire Rapids and later platforms. 164 + */ 165 + #define TOPDOWN_HEAVY_OPS(val) ((float)GET_METRIC(val, 4) / 0xff) 166 + #define TOPDOWN_BR_MISPREDICT(val) ((float)GET_METRIC(val, 5) / 0xff) 167 + #define TOPDOWN_FETCH_LAT(val) ((float)GET_METRIC(val, 6) / 0xff) 168 + #define TOPDOWN_MEM_BOUND(val) ((float)GET_METRIC(val, 7) / 0xff) 159 169 160 170 and then converted to percent for printing. 161 171 ··· 200 190 fe_bound_slots = GET_METRIC(metric_b, 2) * slots_b - fe_bound_slots_a 201 191 be_bound_slots = GET_METRIC(metric_b, 3) * slots_b - be_bound_slots_a 202 192 203 - Later the individual ratios for the measurement period can be recreated 204 - from these counts. 193 + Later the individual ratios of L1 metric events for the measurement period can 194 + be recreated from these counts. 
205 195 206 196 slots_delta = slots_b - slots_a 207 197 retiring_ratio = (float)retiring_slots / slots_delta ··· 214 204 bad_spec_ratio * 100., 215 205 fe_bound_ratio * 100., 216 206 be_bound_ratio * 100.); 207 + 208 + The individual ratios of L2 metric events for the measurement period can be 209 + recreated from L1 and L2 metric counters. (Available on Sapphire Rapids and 210 + later platforms) 211 + 212 + # compute scaled metrics for measurement a 213 + heavy_ops_slots_a = GET_METRIC(metric_a, 4) * slots_a 214 + br_mispredict_slots_a = GET_METRIC(metric_a, 5) * slots_a 215 + fetch_lat_slots_a = GET_METRIC(metric_a, 6) * slots_a 216 + mem_bound_slots_a = GET_METRIC(metric_a, 7) * slots_a 217 + 218 + # compute delta scaled metrics between b and a 219 + heavy_ops_slots = GET_METRIC(metric_b, 4) * slots_b - heavy_ops_slots_a 220 + br_mispredict_slots = GET_METRIC(metric_b, 5) * slots_b - br_mispredict_slots_a 221 + fetch_lat_slots = GET_METRIC(metric_b, 6) * slots_b - fetch_lat_slots_a 222 + mem_bound_slots = GET_METRIC(metric_b, 7) * slots_b - mem_bound_slots_a 223 + 224 + slots_delta = slots_b - slots_a 225 + heavy_ops_ratio = (float)heavy_ops_slots / slots_delta 226 + light_ops_ratio = retiring_ratio - heavy_ops_ratio; 227 + 228 + br_mispredict_ratio = (float)br_mispredict_slots / slots_delta 229 + machine_clears_ratio = bad_spec_ratio - br_mispredict_ratio; 230 + 231 + fetch_lat_ratio = (float)fetch_lat_slots / slots_delta 232 + fetch_bw_ratio = fe_bound_ratio - fetch_lat_ratio; 233 + 234 + mem_bound_ratio = (float)mem_bound_slots / slots_delta 235 + core_bound_ratio = be_bound_ratio - mem_bound_ratio; 236 + 237 + printf("Heavy Operations %.2f%% Light Operations %.2f%% " 238 + "Branch Mispredict %.2f%% Machine Clears %.2f%% " 239 + "Fetch Latency %.2f%% Fetch Bandwidth %.2f%% " 240 + "Mem Bound %.2f%% Core Bound %.2f%%\n", 241 + heavy_ops_ratio * 100., 242 + light_ops_ratio * 100., 243 + br_mispredict_ratio * 100., 244 + machine_clears_ratio * 100., 245 +
fetch_lat_ratio * 100., 246 + fetch_bw_ratio * 100., 247 + mem_bound_ratio * 100., 248 + core_bound_ratio * 100.); 217 249 218 250 Resetting metrics counters 219 251 ========================== ··· 299 247 a sampling read group. Since the SLOTS event must be the leader of a TopDown 300 248 group, the second event of the group is the sampling event. 301 249 For example, perf record -e '{slots, $sampling_event, topdown-retiring}:S' 250 + 251 + Extension on Sapphire Rapids Server 252 + =================================== 253 + The metrics counter is extended to support TMA method level 2 metrics. 254 + The lower half of the register is the TMA level 1 metrics (legacy). 255 + The upper half is also divided into four 8-bit fields for the new level 2 256 + metrics. Four more TopDown metric events are exposed for the end-users, 257 + topdown-heavy-ops, topdown-br-mispredict, topdown-fetch-lat and 258 + topdown-mem-bound. 259 + 260 + Each of the new level 2 metrics in the upper half is a subset of the 261 + corresponding level 1 metric in the lower half. Software can deduce the 262 + other four level 2 metrics by subtracting corresponding metrics as below. 263 + 264 + Light_Operations = Retiring - Heavy_Operations 265 + Machine_Clears = Bad_Speculation - Branch_Mispredicts 266 + Fetch_Bandwidth = Frontend_Bound - Fetch_Latency 267 + Core_Bound = Backend_Bound - Memory_Bound 302 268 303 269 304 270 [1] https://software.intel.com/en-us/top-down-microarchitecture-analysis-method-win
+9
tools/perf/Makefile.config
··· 621 621 endif 622 622 endif 623 623 624 + ifdef BUILD_BPF_SKEL 625 + $(call feature_check,clang-bpf-co-re) 626 + ifeq ($(feature-clang-bpf-co-re), 0) 627 + dummy := $(error Error: clang too old. Please install recent clang) 628 + endif 629 + $(call detected,CONFIG_PERF_BPF_SKEL) 630 + CFLAGS += -DHAVE_BPF_SKEL 631 + endif 632 + 624 633 dwarf-post-unwind := 1 625 634 dwarf-post-unwind-text := BUG 626 635
+47 -2
tools/perf/Makefile.perf
··· 126 126 # 127 127 # Define NO_LIBDEBUGINFOD if you do not want support debuginfod 128 128 # 129 + # Define BUILD_BPF_SKEL to enable BPF skeletons 130 + # 129 131 130 132 # As per kernel Makefile, avoid funny character set dependencies 131 133 unexport LC_ALL ··· 176 174 endef 177 175 178 176 LD += $(EXTRA_LDFLAGS) 177 + 178 + HOSTCC ?= gcc 179 + HOSTLD ?= ld 180 + HOSTAR ?= ar 181 + CLANG ?= clang 182 + LLVM_STRIP ?= llvm-strip 179 183 180 184 PKG_CONFIG = $(CROSS_COMPILE)pkg-config 181 185 ··· 738 730 $(x86_arch_prctl_code_array) \ 739 731 $(rename_flags_array) \ 740 732 $(arch_errno_name_array) \ 741 - $(sync_file_range_arrays) 733 + $(sync_file_range_arrays) \ 734 + bpf-skel 742 735 743 736 $(OUTPUT)%.o: %.c prepare FORCE 744 737 $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(build-dir) $@ ··· 1012 1003 python-clean: 1013 1004 $(python-clean) 1014 1005 1015 - clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean 1006 + SKEL_OUT := $(abspath $(OUTPUT)util/bpf_skel) 1007 + SKEL_TMP_OUT := $(abspath $(SKEL_OUT)/.tmp) 1008 + SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h 1009 + 1010 + ifdef BUILD_BPF_SKEL 1011 + BPFTOOL := $(SKEL_TMP_OUT)/bootstrap/bpftool 1012 + LIBBPF_SRC := $(abspath ../lib/bpf) 1013 + BPF_INCLUDE := -I$(SKEL_TMP_OUT)/.. -I$(BPF_PATH) -I$(LIBBPF_SRC)/.. 
1014 + 1015 + $(SKEL_TMP_OUT): 1016 + $(Q)$(MKDIR) -p $@ 1017 + 1018 + $(BPFTOOL): | $(SKEL_TMP_OUT) 1019 + CFLAGS= $(MAKE) -C ../bpf/bpftool \ 1020 + OUTPUT=$(SKEL_TMP_OUT)/ bootstrap 1021 + 1022 + $(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) | $(SKEL_TMP_OUT) 1023 + $(QUIET_CLANG)$(CLANG) -g -O2 -target bpf $(BPF_INCLUDE) \ 1024 + -c $(filter util/bpf_skel/%.bpf.c,$^) -o $@ && $(LLVM_STRIP) -g $@ 1025 + 1026 + $(SKEL_OUT)/%.skel.h: $(SKEL_TMP_OUT)/%.bpf.o | $(BPFTOOL) 1027 + $(QUIET_GENSKEL)$(BPFTOOL) gen skeleton $< > $@ 1028 + 1029 + bpf-skel: $(SKELETONS) 1030 + 1031 + .PRECIOUS: $(SKEL_TMP_OUT)/%.bpf.o 1032 + 1033 + else # BUILD_BPF_SKEL 1034 + 1035 + bpf-skel: 1036 + 1037 + endif # BUILD_BPF_SKEL 1038 + 1039 + bpf-skel-clean: 1040 + $(call QUIET_CLEAN, bpf-skel) $(RM) -r $(SKEL_TMP_OUT) $(SKELETONS) 1041 + 1042 + clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clean $(LIBPERF)-clean config-clean fixdep-clean python-clean bpf-skel-clean 1016 1043 $(call QUIET_CLEAN, core-objs) $(RM) $(LIBPERF_A) $(OUTPUT)perf-archive $(OUTPUT)perf-with-kcore $(LANG_BINDINGS) 1017 1044 $(Q)find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete 1018 1045 $(Q)$(RM) $(OUTPUT).config-detected
+1 -1
tools/perf/arch/arm/include/perf_regs.h
··· 15 15 #define PERF_REG_IP PERF_REG_ARM_PC 16 16 #define PERF_REG_SP PERF_REG_ARM_SP 17 17 18 - static inline const char *perf_reg_name(int id) 18 + static inline const char *__perf_reg_name(int id) 19 19 { 20 20 switch (id) { 21 21 case PERF_REG_ARM_R0:
+1 -1
tools/perf/arch/arm64/include/perf_regs.h
··· 15 15 #define PERF_REG_IP PERF_REG_ARM64_PC 16 16 #define PERF_REG_SP PERF_REG_ARM64_SP 17 17 18 - static inline const char *perf_reg_name(int id) 18 + static inline const char *__perf_reg_name(int id) 19 19 { 20 20 switch (id) { 21 21 case PERF_REG_ARM64_X0:
+2 -1
tools/perf/arch/arm64/util/machine.c
··· 1 1 // SPDX-License-Identifier: GPL-2.0 2 2 3 + #include <inttypes.h> 3 4 #include <stdio.h> 4 5 #include <string.h> 5 6 #include "debug.h" ··· 24 23 p->end += SYMBOL_LIMIT; 25 24 else 26 25 p->end = c->start; 27 - pr_debug4("%s sym:%s end:%#lx\n", __func__, p->name, p->end); 26 + pr_debug4("%s sym:%s end:%#" PRIx64 "\n", __func__, p->name, p->end); 28 27 }
+94
tools/perf/arch/arm64/util/perf_regs.c
··· 1 1 // SPDX-License-Identifier: GPL-2.0 2 + #include <errno.h> 3 + #include <regex.h> 4 + #include <string.h> 5 + #include <linux/kernel.h> 6 + #include <linux/zalloc.h> 7 + 8 + #include "../../../util/debug.h" 9 + #include "../../../util/event.h" 2 10 #include "../../../util/perf_regs.h" 3 11 4 12 const struct sample_reg sample_reg_masks[] = { ··· 45 37 SMPL_REG(pc, PERF_REG_ARM64_PC), 46 38 SMPL_REG_END 47 39 }; 40 + 41 + /* %xNUM */ 42 + #define SDT_OP_REGEX1 "^(x[1-2]?[0-9]|3[0-1])$" 43 + 44 + /* [sp], [sp, NUM] */ 45 + #define SDT_OP_REGEX2 "^\\[sp(, )?([0-9]+)?\\]$" 46 + 47 + static regex_t sdt_op_regex1, sdt_op_regex2; 48 + 49 + static int sdt_init_op_regex(void) 50 + { 51 + static int initialized; 52 + int ret = 0; 53 + 54 + if (initialized) 55 + return 0; 56 + 57 + ret = regcomp(&sdt_op_regex1, SDT_OP_REGEX1, REG_EXTENDED); 58 + if (ret) 59 + goto error; 60 + 61 + ret = regcomp(&sdt_op_regex2, SDT_OP_REGEX2, REG_EXTENDED); 62 + if (ret) 63 + goto free_regex1; 64 + 65 + initialized = 1; 66 + return 0; 67 + 68 + free_regex1: 69 + regfree(&sdt_op_regex1); 70 + error: 71 + pr_debug4("Regex compilation error.\n"); 72 + return ret; 73 + } 74 + 75 + /* 76 + * SDT marker arguments on Arm64 uses %xREG or [sp, NUM], currently 77 + * support these two formats. 
78 + */ 79 + int arch_sdt_arg_parse_op(char *old_op, char **new_op) 80 + { 81 + int ret, new_len; 82 + regmatch_t rm[5]; 83 + 84 + ret = sdt_init_op_regex(); 85 + if (ret < 0) 86 + return ret; 87 + 88 + if (!regexec(&sdt_op_regex1, old_op, 3, rm, 0)) { 89 + /* Extract xNUM */ 90 + new_len = 2; /* % NULL */ 91 + new_len += (int)(rm[1].rm_eo - rm[1].rm_so); 92 + 93 + *new_op = zalloc(new_len); 94 + if (!*new_op) 95 + return -ENOMEM; 96 + 97 + scnprintf(*new_op, new_len, "%%%.*s", 98 + (int)(rm[1].rm_eo - rm[1].rm_so), old_op + rm[1].rm_so); 99 + } else if (!regexec(&sdt_op_regex2, old_op, 5, rm, 0)) { 100 + /* [sp], [sp, NUM] or [sp,NUM] */ 101 + new_len = 7; /* + ( % s p ) NULL */ 102 + 103 + /* If the argument is [sp], need to fill offset '0' */ 104 + if (rm[2].rm_so == -1) 105 + new_len += 1; 106 + else 107 + new_len += (int)(rm[2].rm_eo - rm[2].rm_so); 108 + 109 + *new_op = zalloc(new_len); 110 + if (!*new_op) 111 + return -ENOMEM; 112 + 113 + if (rm[2].rm_so == -1) 114 + scnprintf(*new_op, new_len, "+0(%%sp)"); 115 + else 116 + scnprintf(*new_op, new_len, "+%.*s(%%sp)", 117 + (int)(rm[2].rm_eo - rm[2].rm_so), 118 + old_op + rm[2].rm_so); 119 + } else { 120 + pr_debug4("Skipping unsupported SDT argument: %s\n", old_op); 121 + return SDT_ARG_SKIP; 122 + } 123 + 124 + return SDT_ARG_VALID; 125 + }
+1 -1
tools/perf/arch/csky/include/perf_regs.h
··· 15 15 #define PERF_REG_IP PERF_REG_CSKY_PC 16 16 #define PERF_REG_SP PERF_REG_CSKY_SP 17 17 18 - static inline const char *perf_reg_name(int id) 18 + static inline const char *__perf_reg_name(int id) 19 19 { 20 20 switch (id) { 21 21 case PERF_REG_CSKY_A0:
+7 -1
tools/perf/arch/powerpc/include/perf_regs.h
··· 71 71 [PERF_REG_POWERPC_MMCR3] = "mmcr3", 72 72 [PERF_REG_POWERPC_SIER2] = "sier2", 73 73 [PERF_REG_POWERPC_SIER3] = "sier3", 74 + [PERF_REG_POWERPC_PMC1] = "pmc1", 75 + [PERF_REG_POWERPC_PMC2] = "pmc2", 76 + [PERF_REG_POWERPC_PMC3] = "pmc3", 77 + [PERF_REG_POWERPC_PMC4] = "pmc4", 78 + [PERF_REG_POWERPC_PMC5] = "pmc5", 79 + [PERF_REG_POWERPC_PMC6] = "pmc6", 74 80 }; 75 81 76 - static inline const char *perf_reg_name(int id) 82 + static inline const char *__perf_reg_name(int id) 77 83 { 78 84 return reg_names[id]; 79 85 }
+1
tools/perf/arch/powerpc/util/Build
··· 1 1 perf-y += header.o 2 + perf-y += machine.o 2 3 perf-y += kvm-stat.o 3 4 perf-y += perf_regs.o 4 5 perf-y += mem-events.o
+25
tools/perf/arch/powerpc/util/machine.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #include <inttypes.h> 4 + #include <stdio.h> 5 + #include <string.h> 6 + #include <internal/lib.h> // page_size 7 + #include "debug.h" 8 + #include "symbol.h" 9 + 10 + /* On powerpc the kernel text segment starts at memory address 0xc000000000000000, 11 + * whereas the modules are located at very high memory addresses, 12 + * for example 0xc00800000xxxxxxx. The gap between the end of the kernel text segment 13 + * and the beginning of the first module's text segment is very large. 14 + * Therefore do not fill this gap and do not assign it to the kernel dso map. 15 + */ 16 + 17 + void arch__symbols__fixup_end(struct symbol *p, struct symbol *c) 18 + { 19 + if (strchr(p->name, '[') == NULL && strchr(c->name, '[')) 20 + /* Limit the range of last kernel symbol */ 21 + p->end += page_size; 22 + else 23 + p->end = c->start; 24 + pr_debug4("%s sym:%s end:%#" PRIx64 "\n", __func__, p->name, p->end); 25 + }
+6
tools/perf/arch/powerpc/util/perf_regs.c
··· 68 68 SMPL_REG(mmcr3, PERF_REG_POWERPC_MMCR3), 69 69 SMPL_REG(sier2, PERF_REG_POWERPC_SIER2), 70 70 SMPL_REG(sier3, PERF_REG_POWERPC_SIER3), 71 + SMPL_REG(pmc1, PERF_REG_POWERPC_PMC1), 72 + SMPL_REG(pmc2, PERF_REG_POWERPC_PMC2), 73 + SMPL_REG(pmc3, PERF_REG_POWERPC_PMC3), 74 + SMPL_REG(pmc4, PERF_REG_POWERPC_PMC4), 75 + SMPL_REG(pmc5, PERF_REG_POWERPC_PMC5), 76 + SMPL_REG(pmc6, PERF_REG_POWERPC_PMC6), 71 77 SMPL_REG_END 72 78 }; 73 79
+1 -1
tools/perf/arch/riscv/include/perf_regs.h
··· 19 19 #define PERF_REG_IP PERF_REG_RISCV_PC 20 20 #define PERF_REG_SP PERF_REG_RISCV_SP 21 21 22 - static inline const char *perf_reg_name(int id) 22 + static inline const char *__perf_reg_name(int id) 23 23 { 24 24 switch (id) { 25 25 case PERF_REG_RISCV_PC:
+1 -1
tools/perf/arch/s390/include/perf_regs.h
··· 14 14 #define PERF_REG_IP PERF_REG_S390_PC 15 15 #define PERF_REG_SP PERF_REG_S390_R15 16 16 17 - static inline const char *perf_reg_name(int id) 17 + static inline const char *__perf_reg_name(int id) 18 18 { 19 19 switch (id) { 20 20 case PERF_REG_S390_R0:
+2 -1
tools/perf/arch/s390/util/machine.c
··· 1 1 // SPDX-License-Identifier: GPL-2.0 2 + #include <inttypes.h> 2 3 #include <unistd.h> 3 4 #include <stdio.h> 4 5 #include <string.h> ··· 49 48 p->end = roundup(p->end, page_size); 50 49 else 51 50 p->end = c->start; 52 - pr_debug4("%s sym:%s end:%#lx\n", __func__, p->name, p->end); 51 + pr_debug4("%s sym:%s end:%#" PRIx64 "\n", __func__, p->name, p->end); 53 52 }
+1 -1
tools/perf/arch/x86/include/perf_regs.h
··· 23 23 #define PERF_REG_IP PERF_REG_X86_IP 24 24 #define PERF_REG_SP PERF_REG_X86_SP 25 25 26 - static inline const char *perf_reg_name(int id) 26 + static inline const char *__perf_reg_name(int id) 27 27 { 28 28 switch (id) { 29 29 case PERF_REG_X86_AX:
+1
tools/perf/arch/x86/tests/insn-x86.c
··· 48 48 {"int", INTEL_PT_OP_INT}, 49 49 {"syscall", INTEL_PT_OP_SYSCALL}, 50 50 {"sysret", INTEL_PT_OP_SYSRET}, 51 + {"vmentry", INTEL_PT_OP_VMENTRY}, 51 52 {NULL, 0}, 52 53 }; 53 54 struct val_data *val;
+2 -2
tools/perf/arch/x86/tests/intel-pt-pkt-decoder-test.c
··· 66 66 {7, {0x9d, 1, 2, 3, 4, 5, 6}, 0, {INTEL_PT_FUP, 4, 0x60504030201}, 0, 0 }, 67 67 {9, {0xdd, 1, 2, 3, 4, 5, 6, 7, 8}, 0, {INTEL_PT_FUP, 6, 0x807060504030201}, 0, 0 }, 68 68 /* Paging Information Packet */ 69 - {8, {0x02, 0x43, 2, 4, 6, 8, 10, 12}, 0, {INTEL_PT_PIP, 0, 0x60504030201}, 0, 0 }, 70 - {8, {0x02, 0x43, 3, 4, 6, 8, 10, 12}, 0, {INTEL_PT_PIP, 0, 0x60504030201 | (1ULL << 63)}, 0, 0 }, 69 + {8, {0x02, 0x43, 2, 4, 6, 8, 10, 12}, 0, {INTEL_PT_PIP, 0, 0xC0A08060402}, 0, 0 }, 70 + {8, {0x02, 0x43, 3, 4, 6, 8, 10, 12}, 0, {INTEL_PT_PIP, 0, 0xC0A08060403}, 0, 0 }, 71 71 /* Mode Exec Packet */ 72 72 {2, {0x99, 0x00}, 0, {INTEL_PT_MODE_EXEC, 0, 16}, 0, 0 }, 73 73 {2, {0x99, 0x01}, 0, {INTEL_PT_MODE_EXEC, 0, 64}, 0, 0 },
+3
tools/perf/arch/x86/util/Build
··· 6 6 perf-y += topdown.o 7 7 perf-y += machine.o 8 8 perf-y += event.o 9 + perf-y += evlist.o 10 + perf-y += mem-events.o 11 + perf-y += evsel.o 9 12 10 13 perf-$(CONFIG_DWARF) += dwarf-regs.o 11 14 perf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
+25
tools/perf/arch/x86/util/event.c
··· 75 75 } 76 76 77 77 #endif 78 + 79 + void arch_perf_parse_sample_weight(struct perf_sample *data, 80 + const __u64 *array, u64 type) 81 + { 82 + union perf_sample_weight weight; 83 + 84 + weight.full = *array; 85 + if (type & PERF_SAMPLE_WEIGHT) 86 + data->weight = weight.full; 87 + else { 88 + data->weight = weight.var1_dw; 89 + data->ins_lat = weight.var2_w; 90 + } 91 + } 92 + 93 + void arch_perf_synthesize_sample_weight(const struct perf_sample *data, 94 + __u64 *array, u64 type) 95 + { 96 + *array = data->weight; 97 + 98 + if (type & PERF_SAMPLE_WEIGHT_STRUCT) { 99 + *array &= 0xffffffff; 100 + *array |= ((u64)data->ins_lat << 32); 101 + } 102 + }
+15
tools/perf/arch/x86/util/evlist.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <stdio.h> 3 + #include "util/pmu.h" 4 + #include "util/evlist.h" 5 + #include "util/parse-events.h" 6 + 7 + #define TOPDOWN_L1_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}" 8 + 9 + int arch_evlist__add_default_attrs(struct evlist *evlist) 10 + { 11 + if (!pmu_have_event("cpu", "slots")) 12 + return 0; 13 + 14 + return parse_events(evlist, TOPDOWN_L1_EVENTS, NULL); 15 + }
+8
tools/perf/arch/x86/util/evsel.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <stdio.h> 3 + #include "util/evsel.h" 4 + 5 + void arch_evsel__set_sample_weight(struct evsel *evsel) 6 + { 7 + evsel__set_sample_bit(evsel, WEIGHT_STRUCT); 8 + }
+44
tools/perf/arch/x86/util/mem-events.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include "util/pmu.h" 3 + #include "map_symbol.h" 4 + #include "mem-events.h" 5 + 6 + static char mem_loads_name[100]; 7 + static bool mem_loads_name__init; 8 + 9 + #define MEM_LOADS_AUX 0x8203 10 + #define MEM_LOADS_AUX_NAME "{cpu/mem-loads-aux/,cpu/mem-loads,ldlat=%u/pp}:S" 11 + 12 + bool is_mem_loads_aux_event(struct evsel *leader) 13 + { 14 + if (!pmu_have_event("cpu", "mem-loads-aux")) 15 + return false; 16 + 17 + return leader->core.attr.config == MEM_LOADS_AUX; 18 + } 19 + 20 + char *perf_mem_events__name(int i) 21 + { 22 + struct perf_mem_event *e = perf_mem_events__ptr(i); 23 + 24 + if (!e) 25 + return NULL; 26 + 27 + if (i == PERF_MEM_EVENTS__LOAD) { 28 + if (mem_loads_name__init) 29 + return mem_loads_name; 30 + 31 + mem_loads_name__init = true; 32 + 33 + if (pmu_have_event("cpu", "mem-loads-aux")) { 34 + scnprintf(mem_loads_name, sizeof(mem_loads_name), 35 + MEM_LOADS_AUX_NAME, perf_mem_events__loads_ldlat); 36 + } else { 37 + scnprintf(mem_loads_name, sizeof(mem_loads_name), 38 + e->name, perf_mem_events__loads_ldlat); 39 + } 40 + return mem_loads_name; 41 + } 42 + 43 + return (char *)e->name; 44 + }
-1
tools/perf/bench/epoll-ctl.c
··· 21 21 #include <sys/resource.h> 22 22 #include <sys/epoll.h> 23 23 #include <sys/eventfd.h> 24 - #include <internal/cpumap.h> 25 24 #include <perf/cpumap.h> 26 25 27 26 #include "../util/stat.h"
-1
tools/perf/bench/epoll-wait.c
··· 76 76 #include <sys/epoll.h> 77 77 #include <sys/eventfd.h> 78 78 #include <sys/types.h> 79 - #include <internal/cpumap.h> 80 79 #include <perf/cpumap.h> 81 80 82 81 #include "../util/stat.h"
-1
tools/perf/bench/futex-hash.c
··· 20 20 #include <linux/kernel.h> 21 21 #include <linux/zalloc.h> 22 22 #include <sys/time.h> 23 - #include <internal/cpumap.h> 24 23 #include <perf/cpumap.h> 25 24 26 25 #include "../util/stat.h"
-1
tools/perf/bench/futex-lock-pi.c
··· 14 14 #include <linux/kernel.h> 15 15 #include <linux/zalloc.h> 16 16 #include <errno.h> 17 - #include <internal/cpumap.h> 18 17 #include <perf/cpumap.h> 19 18 #include "bench.h" 20 19 #include "futex.h"
-1
tools/perf/bench/futex-requeue.c
··· 20 20 #include <linux/kernel.h> 21 21 #include <linux/time64.h> 22 22 #include <errno.h> 23 - #include <internal/cpumap.h> 24 23 #include <perf/cpumap.h> 25 24 #include "bench.h" 26 25 #include "futex.h"
-1
tools/perf/bench/futex-wake-parallel.c
··· 29 29 #include <linux/time64.h> 30 30 #include <errno.h> 31 31 #include "futex.h" 32 - #include <internal/cpumap.h> 33 32 #include <perf/cpumap.h> 34 33 35 34 #include <err.h>
-1
tools/perf/bench/futex-wake.c
··· 20 20 #include <linux/kernel.h> 21 21 #include <linux/time64.h> 22 22 #include <errno.h> 23 - #include <internal/cpumap.h> 24 23 #include <perf/cpumap.h> 25 24 #include "bench.h" 26 25 #include "futex.h"
+25 -3
tools/perf/builtin-buildid-cache.c
··· 27 27 #include "util/time-utils.h" 28 28 #include "util/util.h" 29 29 #include "util/probe-file.h" 30 + #include "util/config.h" 30 31 #include <linux/string.h> 31 32 #include <linux/err.h> 32 33 ··· 349 348 return 0; 350 349 } 351 350 351 + static int perf_buildid_cache_config(const char *var, const char *value, void *cb) 352 + { 353 + const char **debuginfod = cb; 354 + 355 + if (!strcmp(var, "buildid-cache.debuginfod")) 356 + *debuginfod = strdup(value); 357 + 358 + return 0; 359 + } 360 + 352 361 int cmd_buildid_cache(int argc, const char **argv) 353 362 { 354 363 struct strlist *list; 355 364 struct str_node *pos; 356 - int ret = 0; 357 - int ns_id = -1; 365 + int ret, ns_id = -1; 358 366 bool force = false; 359 367 bool list_files = false; 360 368 bool opts_flag = false; ··· 373 363 *purge_name_list_str = NULL, 374 364 *missing_filename = NULL, 375 365 *update_name_list_str = NULL, 376 - *kcore_filename = NULL; 366 + *kcore_filename = NULL, 367 + *debuginfod = NULL; 377 368 char sbuf[STRERR_BUFSIZE]; 378 369 379 370 struct perf_data data = { ··· 399 388 OPT_BOOLEAN('f', "force", &force, "don't complain, do it"), 400 389 OPT_STRING('u', "update", &update_name_list_str, "file list", 401 390 "file(s) to update"), 391 + OPT_STRING(0, "debuginfod", &debuginfod, "debuginfod url", 392 + "set debuginfod url"), 402 393 OPT_INCR('v', "verbose", &verbose, "be more verbose"), 403 394 OPT_INTEGER(0, "target-ns", &ns_id, "target pid for namespace context"), 404 395 OPT_END() ··· 409 396 "perf buildid-cache [<options>]", 410 397 NULL 411 398 }; 399 + 400 + ret = perf_config(perf_buildid_cache_config, &debuginfod); 401 + if (ret) 402 + return ret; 412 403 413 404 argc = parse_options(argc, argv, buildid_cache_options, 414 405 buildid_cache_usage, 0); ··· 424 407 425 408 if (argc || !(list_files || opts_flag)) 426 409 usage_with_options(buildid_cache_usage, buildid_cache_options); 410 + 411 + if (debuginfod) { 412 + pr_debug("DEBUGINFOD_URLS=%s\n", debuginfod); 413 + 
setenv("DEBUGINFOD_URLS", debuginfod, 1); 414 + } 427 415 428 416 /* -l is exclusive. It can not be used with other options. */ 429 417 if (list_files && opts_flag) {
+3
tools/perf/builtin-buildid-list.c
··· 77 77 perf_header__has_feat(&session->header, HEADER_AUXTRACE)) 78 78 with_hits = false; 79 79 80 + if (!perf_header__has_feat(&session->header, HEADER_BUILD_ID)) 81 + with_hits = true; 82 + 80 83 /* 81 84 * in pipe-mode, the only way to get the buildids is to parse 82 85 * the record stream. Buildids are stored as RECORD_HEADER_BUILD_ID
+102 -69
tools/perf/builtin-c2c.c
··· 97 97 bool symbol_full; 98 98 bool stitch_lbr; 99 99 100 - /* HITM shared clines stats */ 101 - struct c2c_stats hitm_stats; 100 + /* Shared cache line stats */ 101 + struct c2c_stats shared_clines_stats; 102 102 int shared_clines; 103 103 104 104 int display; ··· 876 876 return &hists->stats; 877 877 } 878 878 879 - static double percent(int st, int tot) 879 + static double percent(u32 st, u32 tot) 880 880 { 881 881 return tot ? 100. * (double) st / (double) tot : 0; 882 882 } ··· 1048 1048 return 0; 1049 1049 } 1050 1050 1051 + static int display_metrics(struct perf_hpp *hpp, u32 val, u32 sum) 1052 + { 1053 + int ret; 1054 + 1055 + if (sum != 0) 1056 + ret = scnprintf(hpp->buf, hpp->size, "%5.1f%% ", 1057 + percent(val, sum)); 1058 + else 1059 + ret = scnprintf(hpp->buf, hpp->size, "%6s ", "n/a"); 1060 + 1061 + return ret; 1062 + } 1063 + 1051 1064 static int 1052 1065 node_entry(struct perf_hpp_fmt *fmt __maybe_unused, struct perf_hpp *hpp, 1053 1066 struct hist_entry *he) ··· 1104 1091 ret = scnprintf(hpp->buf, hpp->size, "%2d{%2d ", node, num); 1105 1092 advance_hpp(hpp, ret); 1106 1093 1107 - #define DISPLAY_HITM(__h) \ 1108 - if (c2c_he->stats.__h> 0) { \ 1109 - ret = scnprintf(hpp->buf, hpp->size, "%5.1f%% ", \ 1110 - percent(stats->__h, c2c_he->stats.__h));\ 1111 - } else { \ 1112 - ret = scnprintf(hpp->buf, hpp->size, "%6s ", "n/a"); \ 1113 - } 1114 - 1115 1094 switch (c2c.display) { 1116 1095 case DISPLAY_RMT: 1117 - DISPLAY_HITM(rmt_hitm); 1096 + ret = display_metrics(hpp, stats->rmt_hitm, 1097 + c2c_he->stats.rmt_hitm); 1118 1098 break; 1119 1099 case DISPLAY_LCL: 1120 - DISPLAY_HITM(lcl_hitm); 1100 + ret = display_metrics(hpp, stats->lcl_hitm, 1101 + c2c_he->stats.lcl_hitm); 1121 1102 break; 1122 1103 case DISPLAY_TOT: 1123 - DISPLAY_HITM(tot_hitm); 1104 + ret = display_metrics(hpp, stats->tot_hitm, 1105 + c2c_he->stats.tot_hitm); 1106 + break; 1124 1107 default: 1125 1108 break; 1126 1109 } 1127 - 1128 - #undef DISPLAY_HITM 1129 1110 1130 1111 
advance_hpp(hpp, ret); 1131 1112 ··· 1858 1851 1859 1852 #define DISPLAY_LINE_LIMIT 0.001 1860 1853 1854 + static u8 filter_display(u32 val, u32 sum) 1855 + { 1856 + if (sum == 0 || ((double)val / sum) < DISPLAY_LINE_LIMIT) 1857 + return HIST_FILTER__C2C; 1858 + 1859 + return 0; 1860 + } 1861 + 1861 1862 static bool he__display(struct hist_entry *he, struct c2c_stats *stats) 1862 1863 { 1863 1864 struct c2c_hist_entry *c2c_he; 1864 - double ld_dist; 1865 1865 1866 1866 if (c2c.show_all) 1867 1867 return true; 1868 1868 1869 1869 c2c_he = container_of(he, struct c2c_hist_entry, he); 1870 1870 1871 - #define FILTER_HITM(__h) \ 1872 - if (stats->__h) { \ 1873 - ld_dist = ((double)c2c_he->stats.__h / stats->__h); \ 1874 - if (ld_dist < DISPLAY_LINE_LIMIT) \ 1875 - he->filtered = HIST_FILTER__C2C; \ 1876 - } else { \ 1877 - he->filtered = HIST_FILTER__C2C; \ 1878 - } 1879 - 1880 1871 switch (c2c.display) { 1881 1872 case DISPLAY_LCL: 1882 - FILTER_HITM(lcl_hitm); 1873 + he->filtered = filter_display(c2c_he->stats.lcl_hitm, 1874 + stats->lcl_hitm); 1883 1875 break; 1884 1876 case DISPLAY_RMT: 1885 - FILTER_HITM(rmt_hitm); 1877 + he->filtered = filter_display(c2c_he->stats.rmt_hitm, 1878 + stats->rmt_hitm); 1886 1879 break; 1887 1880 case DISPLAY_TOT: 1888 - FILTER_HITM(tot_hitm); 1881 + he->filtered = filter_display(c2c_he->stats.tot_hitm, 1882 + stats->tot_hitm); 1883 + break; 1889 1884 default: 1890 1885 break; 1891 1886 } 1892 1887 1893 - #undef FILTER_HITM 1894 - 1895 1888 return he->filtered == 0; 1896 1889 } 1897 1890 1898 - static inline int valid_hitm_or_store(struct hist_entry *he) 1891 + static inline bool is_valid_hist_entry(struct hist_entry *he) 1899 1892 { 1900 1893 struct c2c_hist_entry *c2c_he; 1901 - bool has_hitm; 1894 + bool has_record = false; 1902 1895 1903 1896 c2c_he = container_of(he, struct c2c_hist_entry, he); 1904 - has_hitm = c2c.display == DISPLAY_TOT ? c2c_he->stats.tot_hitm : 1905 - c2c.display == DISPLAY_LCL ? 
c2c_he->stats.lcl_hitm : 1906 - c2c_he->stats.rmt_hitm; 1907 - return has_hitm || c2c_he->stats.store; 1897 + 1898 + /* It's a valid entry if contains stores */ 1899 + if (c2c_he->stats.store) 1900 + return true; 1901 + 1902 + switch (c2c.display) { 1903 + case DISPLAY_LCL: 1904 + has_record = !!c2c_he->stats.lcl_hitm; 1905 + break; 1906 + case DISPLAY_RMT: 1907 + has_record = !!c2c_he->stats.rmt_hitm; 1908 + break; 1909 + case DISPLAY_TOT: 1910 + has_record = !!c2c_he->stats.tot_hitm; 1911 + break; 1912 + default: 1913 + break; 1914 + } 1915 + 1916 + return has_record; 1908 1917 } 1909 1918 1910 1919 static void set_node_width(struct c2c_hist_entry *c2c_he, int len) ··· 1974 1951 1975 1952 calc_width(c2c_he); 1976 1953 1977 - if (!valid_hitm_or_store(he)) 1954 + if (!is_valid_hist_entry(he)) 1978 1955 he->filtered = HIST_FILTER__C2C; 1979 1956 1980 1957 return 0; ··· 1984 1961 { 1985 1962 struct c2c_hist_entry *c2c_he; 1986 1963 struct c2c_hists *c2c_hists; 1987 - bool display = he__display(he, &c2c.hitm_stats); 1964 + bool display = he__display(he, &c2c.shared_clines_stats); 1988 1965 1989 1966 c2c_he = container_of(he, struct c2c_hist_entry, he); 1990 1967 c2c_hists = c2c_he->hists; ··· 2071 2048 2072 2049 #define HAS_HITMS(__h) ((__h)->stats.lcl_hitm || (__h)->stats.rmt_hitm) 2073 2050 2074 - static int resort_hitm_cb(struct hist_entry *he, void *arg __maybe_unused) 2051 + static int resort_shared_cl_cb(struct hist_entry *he, void *arg __maybe_unused) 2075 2052 { 2076 2053 struct c2c_hist_entry *c2c_he; 2077 2054 c2c_he = container_of(he, struct c2c_hist_entry, he); 2078 2055 2079 2056 if (HAS_HITMS(c2c_he)) { 2080 2057 c2c.shared_clines++; 2081 - c2c_add_stats(&c2c.hitm_stats, &c2c_he->stats); 2058 + c2c_add_stats(&c2c.shared_clines_stats, &c2c_he->stats); 2082 2059 } 2083 2060 2084 2061 return 0; ··· 2134 2111 fprintf(out, " Load MESI State Exclusive : %10d\n", stats->ld_excl); 2135 2112 fprintf(out, " Load MESI State Shared : %10d\n", stats->ld_shared); 2136 
2113 fprintf(out, " Load LLC Misses : %10d\n", llc_misses); 2114 + fprintf(out, " Load access blocked by data : %10d\n", stats->blk_data); 2115 + fprintf(out, " Load access blocked by address : %10d\n", stats->blk_addr); 2137 2116 fprintf(out, " LLC Misses to Local DRAM : %10.1f%%\n", ((double)stats->lcl_dram/(double)llc_misses) * 100.); 2138 2117 fprintf(out, " LLC Misses to Remote DRAM : %10.1f%%\n", ((double)stats->rmt_dram/(double)llc_misses) * 100.); 2139 2118 fprintf(out, " LLC Misses to Remote cache (HIT) : %10.1f%%\n", ((double)stats->rmt_hit /(double)llc_misses) * 100.); ··· 2151 2126 2152 2127 static void print_shared_cacheline_info(FILE *out) 2153 2128 { 2154 - struct c2c_stats *stats = &c2c.hitm_stats; 2129 + struct c2c_stats *stats = &c2c.shared_clines_stats; 2155 2130 int hitm_cnt = stats->lcl_hitm + stats->rmt_hitm; 2156 2131 2157 2132 fprintf(out, "=================================================\n"); ··· 2164 2139 fprintf(out, " L2D hits on shared lines : %10d\n", stats->ld_l2hit); 2165 2140 fprintf(out, " LLC hits on shared lines : %10d\n", stats->ld_llchit + stats->lcl_hitm); 2166 2141 fprintf(out, " Locked Access on shared lines : %10d\n", stats->locks); 2142 + fprintf(out, " Blocked Access on shared lines : %10d\n", stats->blk_data + stats->blk_addr); 2167 2143 fprintf(out, " Store HITs on shared lines : %10d\n", stats->store); 2168 2144 fprintf(out, " Store L1D hits on shared lines : %10d\n", stats->st_l1hit); 2169 2145 fprintf(out, " Total Merged records : %10d\n", hitm_cnt + stats->store); ··· 2202 2176 struct perf_hpp_list hpp_list; 2203 2177 struct rb_node *nd; 2204 2178 int ret; 2179 + const char *cl_output; 2180 + 2181 + cl_output = "cl_num," 2182 + "cl_rmt_hitm," 2183 + "cl_lcl_hitm," 2184 + "cl_stores_l1hit," 2185 + "cl_stores_l1miss," 2186 + "dcacheline"; 2205 2187 2206 2188 perf_hpp_list__init(&hpp_list); 2207 - ret = hpp_list__parse(&hpp_list, 2208 - "cl_num," 2209 - "cl_rmt_hitm," 2210 - "cl_lcl_hitm," 2211 - "cl_stores_l1hit," 
2212 - "cl_stores_l1miss," 2213 - "dcacheline", 2214 - NULL); 2189 + ret = hpp_list__parse(&hpp_list, cl_output, NULL); 2215 2190 2216 2191 if (WARN_ONCE(ret, "failed to setup sort entries\n")) 2217 2192 return; ··· 2756 2729 OPT_END() 2757 2730 }; 2758 2731 int err = 0; 2732 + const char *output_str, *sort_str = NULL; 2759 2733 2760 2734 argc = parse_options(argc, argv, options, report_c2c_usage, 2761 2735 PARSE_OPT_STOP_AT_NON_OPTION); ··· 2833 2805 goto out_mem2node; 2834 2806 } 2835 2807 2836 - c2c_hists__reinit(&c2c.hists, 2837 - "cl_idx," 2838 - "dcacheline," 2839 - "dcacheline_node," 2840 - "dcacheline_count," 2841 - "percent_hitm," 2842 - "tot_hitm,lcl_hitm,rmt_hitm," 2843 - "tot_recs," 2844 - "tot_loads," 2845 - "tot_stores," 2846 - "stores_l1hit,stores_l1miss," 2847 - "ld_fbhit,ld_l1hit,ld_l2hit," 2848 - "ld_lclhit,lcl_hitm," 2849 - "ld_rmthit,rmt_hitm," 2850 - "dram_lcl,dram_rmt", 2851 - c2c.display == DISPLAY_TOT ? "tot_hitm" : 2852 - c2c.display == DISPLAY_LCL ? "lcl_hitm" : "rmt_hitm" 2853 - ); 2808 + output_str = "cl_idx," 2809 + "dcacheline," 2810 + "dcacheline_node," 2811 + "dcacheline_count," 2812 + "percent_hitm," 2813 + "tot_hitm,lcl_hitm,rmt_hitm," 2814 + "tot_recs," 2815 + "tot_loads," 2816 + "tot_stores," 2817 + "stores_l1hit,stores_l1miss," 2818 + "ld_fbhit,ld_l1hit,ld_l2hit," 2819 + "ld_lclhit,lcl_hitm," 2820 + "ld_rmthit,rmt_hitm," 2821 + "dram_lcl,dram_rmt"; 2822 + 2823 + if (c2c.display == DISPLAY_TOT) 2824 + sort_str = "tot_hitm"; 2825 + else if (c2c.display == DISPLAY_RMT) 2826 + sort_str = "rmt_hitm"; 2827 + else if (c2c.display == DISPLAY_LCL) 2828 + sort_str = "lcl_hitm"; 2829 + 2830 + c2c_hists__reinit(&c2c.hists, output_str, sort_str); 2854 2831 2855 2832 ui_progress__init(&prog, c2c.hists.hists.nr_entries, "Sorting..."); 2856 2833 2857 2834 hists__collapse_resort(&c2c.hists.hists, NULL); 2858 - hists__output_resort_cb(&c2c.hists.hists, &prog, resort_hitm_cb); 2835 + hists__output_resort_cb(&c2c.hists.hists, &prog, 
resort_shared_cl_cb); 2859 2836 hists__iterate_cb(&c2c.hists.hists, resort_cl_cb); 2860 2837 2861 2838 ui_progress__finish();
+1521
tools/perf/builtin-daemon.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <internal/lib.h> 3 + #include <subcmd/parse-options.h> 4 + #include <api/fd/array.h> 5 + #include <api/fs/fs.h> 6 + #include <linux/zalloc.h> 7 + #include <linux/string.h> 8 + #include <linux/limits.h> 9 + #include <linux/string.h> 10 + #include <string.h> 11 + #include <sys/file.h> 12 + #include <signal.h> 13 + #include <stdlib.h> 14 + #include <time.h> 15 + #include <stdio.h> 16 + #include <unistd.h> 17 + #include <errno.h> 18 + #include <sys/inotify.h> 19 + #include <libgen.h> 20 + #include <sys/types.h> 21 + #include <sys/socket.h> 22 + #include <sys/un.h> 23 + #include <sys/stat.h> 24 + #include <sys/signalfd.h> 25 + #include <sys/wait.h> 26 + #include <poll.h> 27 + #include <sys/stat.h> 28 + #include <time.h> 29 + #include "builtin.h" 30 + #include "perf.h" 31 + #include "debug.h" 32 + #include "config.h" 33 + #include "util.h" 34 + 35 + #define SESSION_OUTPUT "output" 36 + #define SESSION_CONTROL "control" 37 + #define SESSION_ACK "ack" 38 + 39 + /* 40 + * Session states: 41 + * 42 + * OK - session is up and running 43 + * RECONFIG - session is pending for reconfiguration, 44 + * new values are already loaded in session object 45 + * KILL - session is pending to be killed 46 + * 47 + * Session object life and its state is maintained by 48 + * following functions: 49 + * 50 + * setup_server_config 51 + * - reads config file and setup session objects 52 + * with following states: 53 + * 54 + * OK - no change needed 55 + * RECONFIG - session needs to be changed 56 + * (run variable changed) 57 + * KILL - session needs to be killed 58 + * (session is no longer in config file) 59 + * 60 + * daemon__reconfig 61 + * - scans session objects and does following actions 62 + * for states: 63 + * 64 + * OK - skip 65 + * RECONFIG - session is killed and re-run with new config 66 + * KILL - session is killed 67 + * 68 + * - all sessions have OK state on the function exit 69 + */ 70 + enum daemon_session_state { 71 + 
OK, 72 + RECONFIG, 73 + KILL, 74 + }; 75 + 76 + struct daemon_session { 77 + char *base; 78 + char *name; 79 + char *run; 80 + char *control; 81 + int pid; 82 + struct list_head list; 83 + enum daemon_session_state state; 84 + time_t start; 85 + }; 86 + 87 + struct daemon { 88 + const char *config; 89 + char *config_real; 90 + char *config_base; 91 + const char *csv_sep; 92 + const char *base_user; 93 + char *base; 94 + struct list_head sessions; 95 + FILE *out; 96 + char perf[PATH_MAX]; 97 + int signal_fd; 98 + time_t start; 99 + }; 100 + 101 + static struct daemon __daemon = { 102 + .sessions = LIST_HEAD_INIT(__daemon.sessions), 103 + }; 104 + 105 + static const char * const daemon_usage[] = { 106 + "perf daemon start [<options>]", 107 + "perf daemon [<options>]", 108 + NULL 109 + }; 110 + 111 + static bool done; 112 + 113 + static void sig_handler(int sig __maybe_unused) 114 + { 115 + done = true; 116 + } 117 + 118 + static struct daemon_session *daemon__add_session(struct daemon *config, char *name) 119 + { 120 + struct daemon_session *session = zalloc(sizeof(*session)); 121 + 122 + if (!session) 123 + return NULL; 124 + 125 + session->name = strdup(name); 126 + if (!session->name) { 127 + free(session); 128 + return NULL; 129 + } 130 + 131 + session->pid = -1; 132 + list_add_tail(&session->list, &config->sessions); 133 + return session; 134 + } 135 + 136 + static struct daemon_session *daemon__find_session(struct daemon *daemon, char *name) 137 + { 138 + struct daemon_session *session; 139 + 140 + list_for_each_entry(session, &daemon->sessions, list) { 141 + if (!strcmp(session->name, name)) 142 + return session; 143 + } 144 + 145 + return NULL; 146 + } 147 + 148 + static int get_session_name(const char *var, char *session, int len) 149 + { 150 + const char *p = var + sizeof("session-") - 1; 151 + 152 + while (*p != '.' && *p != 0x0 && len--) 153 + *session++ = *p++; 154 + 155 + *session = 0; 156 + return *p == '.' ? 
0 : -EINVAL; 157 + } 158 + 159 + static int session_config(struct daemon *daemon, const char *var, const char *value) 160 + { 161 + struct daemon_session *session; 162 + char name[100]; 163 + 164 + if (get_session_name(var, name, sizeof(name))) 165 + return -EINVAL; 166 + 167 + var = strchr(var, '.'); 168 + if (!var) 169 + return -EINVAL; 170 + 171 + var++; 172 + 173 + session = daemon__find_session(daemon, name); 174 + 175 + if (!session) { 176 + /* New session is defined. */ 177 + session = daemon__add_session(daemon, name); 178 + if (!session) 179 + return -ENOMEM; 180 + 181 + pr_debug("reconfig: found new session %s\n", name); 182 + 183 + /* Trigger reconfig to start it. */ 184 + session->state = RECONFIG; 185 + } else if (session->state == KILL) { 186 + /* Current session is defined, no action needed. */ 187 + pr_debug("reconfig: found current session %s\n", name); 188 + session->state = OK; 189 + } 190 + 191 + if (!strcmp(var, "run")) { 192 + bool same = false; 193 + 194 + if (session->run) 195 + same = !strcmp(session->run, value); 196 + 197 + if (!same) { 198 + if (session->run) { 199 + free(session->run); 200 + pr_debug("reconfig: session %s is changed\n", name); 201 + } 202 + 203 + session->run = strdup(value); 204 + if (!session->run) 205 + return -ENOMEM; 206 + 207 + /* 208 + * Either new or changed run value is defined, 209 + * trigger reconfig for the session. 
210 + */ 211 + session->state = RECONFIG; 212 + } 213 + } 214 + 215 + return 0; 216 + } 217 + 218 + static int server_config(const char *var, const char *value, void *cb) 219 + { 220 + struct daemon *daemon = cb; 221 + 222 + if (strstarts(var, "session-")) { 223 + return session_config(daemon, var, value); 224 + } else if (!strcmp(var, "daemon.base") && !daemon->base_user) { 225 + if (daemon->base && strcmp(daemon->base, value)) { 226 + pr_err("failed: can't redefine base, bailing out\n"); 227 + return -EINVAL; 228 + } 229 + daemon->base = strdup(value); 230 + if (!daemon->base) 231 + return -ENOMEM; 232 + } 233 + 234 + return 0; 235 + } 236 + 237 + static int client_config(const char *var, const char *value, void *cb) 238 + { 239 + struct daemon *daemon = cb; 240 + 241 + if (!strcmp(var, "daemon.base") && !daemon->base_user) { 242 + daemon->base = strdup(value); 243 + if (!daemon->base) 244 + return -ENOMEM; 245 + } 246 + 247 + return 0; 248 + } 249 + 250 + static int check_base(struct daemon *daemon) 251 + { 252 + struct stat st; 253 + 254 + if (!daemon->base) { 255 + pr_err("failed: base not defined\n"); 256 + return -EINVAL; 257 + } 258 + 259 + if (stat(daemon->base, &st)) { 260 + switch (errno) { 261 + case EACCES: 262 + pr_err("failed: permission denied for '%s' base\n", 263 + daemon->base); 264 + return -EACCES; 265 + case ENOENT: 266 + pr_err("failed: base '%s' does not exist\n", 267 + daemon->base); 268 + return -EACCES; 269 + default: 270 + pr_err("failed: can't access base '%s': %s\n", 271 + daemon->base, strerror(errno)); 272 + return -errno; 273 + } 274 + } 275 + 276 + if ((st.st_mode & S_IFMT) != S_IFDIR) { 277 + pr_err("failed: base '%s' is not a directory\n", 278 + daemon->base); 279 + return -EINVAL; 280 + } 281 + 282 + return 0; 283 + } 284 + 285 + static int setup_client_config(struct daemon *daemon) 286 + { 287 + struct perf_config_set *set = perf_config_set__load_file(daemon->config_real); 288 + int err = -ENOMEM; 289 + 290 + if (set) { 291 +
err = perf_config_set(set, client_config, daemon); 292 + perf_config_set__delete(set); 293 + } 294 + 295 + return err ?: check_base(daemon); 296 + } 297 + 298 + static int setup_server_config(struct daemon *daemon) 299 + { 300 + struct perf_config_set *set; 301 + struct daemon_session *session; 302 + int err = -ENOMEM; 303 + 304 + pr_debug("reconfig: started\n"); 305 + 306 + /* 307 + * Mark all sessions for kill, the server config 308 + * will set following states, see explanation at 309 + * enum daemon_session_state declaration. 310 + */ 311 + list_for_each_entry(session, &daemon->sessions, list) 312 + session->state = KILL; 313 + 314 + set = perf_config_set__load_file(daemon->config_real); 315 + if (set) { 316 + err = perf_config_set(set, server_config, daemon); 317 + perf_config_set__delete(set); 318 + } 319 + 320 + return err ?: check_base(daemon); 321 + } 322 + 323 + static int daemon_session__run(struct daemon_session *session, 324 + struct daemon *daemon) 325 + { 326 + char buf[PATH_MAX]; 327 + char **argv; 328 + int argc, fd; 329 + 330 + if (asprintf(&session->base, "%s/session-%s", 331 + daemon->base, session->name) < 0) { 332 + perror("failed: asprintf"); 333 + return -1; 334 + } 335 + 336 + if (mkdir(session->base, 0755) && errno != EEXIST) { 337 + perror("failed: mkdir"); 338 + return -1; 339 + } 340 + 341 + session->start = time(NULL); 342 + 343 + session->pid = fork(); 344 + if (session->pid < 0) 345 + return -1; 346 + if (session->pid > 0) { 347 + pr_info("reconfig: running session [%s:%d]: %s\n", 348 + session->name, session->pid, session->run); 349 + return 0; 350 + } 351 + 352 + if (chdir(session->base)) { 353 + perror("failed: chdir"); 354 + return -1; 355 + } 356 + 357 + fd = open("/dev/null", O_RDONLY); 358 + if (fd < 0) { 359 + perror("failed: open /dev/null"); 360 + return -1; 361 + } 362 + 363 + dup2(fd, 0); 364 + close(fd); 365 + 366 + fd = open(SESSION_OUTPUT, O_RDWR|O_CREAT|O_TRUNC, 0644); 367 + if (fd < 0) { 368 + perror("failed: open
session output"); 369 + return -1; 370 + } 371 + 372 + dup2(fd, 1); 373 + dup2(fd, 2); 374 + close(fd); 375 + 376 + if (mkfifo(SESSION_CONTROL, O_RDWR) && errno != EEXIST) { 377 + perror("failed: create control fifo"); 378 + return -1; 379 + } 380 + 381 + if (mkfifo(SESSION_ACK, O_RDWR) && errno != EEXIST) { 382 + perror("failed: create ack fifo"); 383 + return -1; 384 + } 385 + 386 + scnprintf(buf, sizeof(buf), "%s record --control=fifo:%s,%s %s", 387 + daemon->perf, SESSION_CONTROL, SESSION_ACK, session->run); 388 + 389 + argv = argv_split(buf, &argc); 390 + if (!argv) 391 + exit(-1); 392 + 393 + exit(execve(daemon->perf, argv, NULL)); 394 + return -1; 395 + } 396 + 397 + static pid_t handle_signalfd(struct daemon *daemon) 398 + { 399 + struct daemon_session *session; 400 + struct signalfd_siginfo si; 401 + ssize_t err; 402 + int status; 403 + pid_t pid; 404 + 405 + err = read(daemon->signal_fd, &si, sizeof(struct signalfd_siginfo)); 406 + if (err != sizeof(struct signalfd_siginfo)) 407 + return -1; 408 + 409 + list_for_each_entry(session, &daemon->sessions, list) { 410 + 411 + if (session->pid != (int) si.ssi_pid) 412 + continue; 413 + 414 + pid = waitpid(session->pid, &status, 0); 415 + if (pid == session->pid) { 416 + if (WIFEXITED(status)) { 417 + pr_info("session '%s' exited, status=%d\n", 418 + session->name, WEXITSTATUS(status)); 419 + } else if (WIFSIGNALED(status)) { 420 + pr_info("session '%s' killed (signal %d)\n", 421 + session->name, WTERMSIG(status)); 422 + } else if (WIFSTOPPED(status)) { 423 + pr_info("session '%s' stopped (signal %d)\n", 424 + session->name, WSTOPSIG(status)); 425 + } else { 426 + pr_info("session '%s' Unexpected status (0x%x)\n", 427 + session->name, status); 428 + } 429 + } 430 + 431 + session->state = KILL; 432 + session->pid = -1; 433 + return pid; 434 + } 435 + 436 + return 0; 437 + } 438 + 439 + static int daemon_session__wait(struct daemon_session *session, struct daemon *daemon, 440 + int secs) 441 + { 442 + struct pollfd 
pollfd = { 443 + .fd = daemon->signal_fd, 444 + .events = POLLIN, 445 + }; 446 + pid_t wpid = 0, pid = session->pid; 447 + time_t start; 448 + 449 + start = time(NULL); 450 + 451 + do { 452 + int err = poll(&pollfd, 1, 1000); 453 + 454 + if (err > 0) { 455 + wpid = handle_signalfd(daemon); 456 + } else if (err < 0) { 457 + perror("failed: poll"); 458 + return -1; 459 + } 460 + 461 + if (start + secs < time(NULL)) 462 + return -1; 463 + } while (wpid != pid); 464 + 465 + return 0; 466 + } 467 + 468 + static bool daemon__has_alive_session(struct daemon *daemon) 469 + { 470 + struct daemon_session *session; 471 + 472 + list_for_each_entry(session, &daemon->sessions, list) { 473 + if (session->pid != -1) 474 + return true; 475 + } 476 + 477 + return false; 478 + } 479 + 480 + static int daemon__wait(struct daemon *daemon, int secs) 481 + { 482 + struct pollfd pollfd = { 483 + .fd = daemon->signal_fd, 484 + .events = POLLIN, 485 + }; 486 + time_t start; 487 + 488 + start = time(NULL); 489 + 490 + do { 491 + int err = poll(&pollfd, 1, 1000); 492 + 493 + if (err > 0) { 494 + handle_signalfd(daemon); 495 + } else if (err < 0) { 496 + perror("failed: poll"); 497 + return -1; 498 + } 499 + 500 + if (start + secs < time(NULL)) 501 + return -1; 502 + } while (daemon__has_alive_session(daemon)); 503 + 504 + return 0; 505 + } 506 + 507 + static int daemon_session__control(struct daemon_session *session, 508 + const char *msg, bool do_ack) 509 + { 510 + struct pollfd pollfd = { .events = POLLIN, }; 511 + char control_path[PATH_MAX]; 512 + char ack_path[PATH_MAX]; 513 + int control, ack = -1, len; 514 + char buf[20]; 515 + int ret = -1; 516 + ssize_t err; 517 + 518 + /* open the control file */ 519 + scnprintf(control_path, sizeof(control_path), "%s/%s", 520 + session->base, SESSION_CONTROL); 521 + 522 + control = open(control_path, O_WRONLY|O_NONBLOCK); 523 + if (control < 0) 524 + return -1; 525 + 526 + if (do_ack) { 527 + /* open the ack file */ 528 + scnprintf(ack_path,
sizeof(ack_path), "%s/%s", 529 + session->base, SESSION_ACK); 530 + 531 + ack = open(ack_path, O_RDONLY|O_NONBLOCK); 532 + if (ack < 0) { 533 + close(control); 534 + return -1; 535 + } 536 + } 537 + 538 + /* write the command */ 539 + len = strlen(msg); 540 + 541 + err = writen(control, msg, len); 542 + if (err != len) { 543 + pr_err("failed: write to control pipe: %d (%s)\n", 544 + errno, control_path); 545 + goto out; 546 + } 547 + 548 + if (!do_ack) 549 + goto out; 550 + 551 + /* wait for an ack */ 552 + pollfd.fd = ack; 553 + 554 + if (!poll(&pollfd, 1, 2000)) { 555 + pr_err("failed: control ack timeout\n"); 556 + goto out; 557 + } 558 + 559 + if (!(pollfd.revents & POLLIN)) { 560 + pr_err("failed: did not receive an ack\n"); 561 + goto out; 562 + } 563 + 564 + err = read(ack, buf, sizeof(buf)); 565 + if (err > 0) 566 + ret = strcmp(buf, "ack\n"); 567 + else 568 + perror("failed: read ack"); 569 + 570 + out: 571 + if (ack != -1) 572 + close(ack); 573 + 574 + close(control); 575 + return ret; 576 + } 577 + 578 + static int setup_server_socket(struct daemon *daemon) 579 + { 580 + struct sockaddr_un addr; 581 + char path[PATH_MAX]; 582 + int fd = socket(AF_UNIX, SOCK_STREAM, 0); 583 + 584 + if (fd < 0) { 585 + fprintf(stderr, "socket: %s\n", strerror(errno)); 586 + return -1; 587 + } 588 + 589 + if (fcntl(fd, F_SETFD, FD_CLOEXEC)) { 590 + perror("failed: fcntl FD_CLOEXEC"); 591 + close(fd); 592 + return -1; 593 + } 594 + 595 + scnprintf(path, sizeof(path), "%s/control", daemon->base); 596 + 597 + if (strlen(path) + 1 >= sizeof(addr.sun_path)) { 598 + pr_err("failed: control path too long '%s'\n", path); 599 + close(fd); 600 + return -1; 601 + } 602 + 603 + memset(&addr, 0, sizeof(addr)); 604 + addr.sun_family = AF_UNIX; 605 + 606 + strlcpy(addr.sun_path, path, sizeof(addr.sun_path) - 1); 607 + unlink(path); 608 + 609 + if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) { 610 + perror("failed: bind"); 611 + close(fd); 612 + return -1; 613 + } 614 + 615
+ if (listen(fd, 1) == -1) { 616 + perror("failed: listen"); 617 + close(fd); 618 + return -1; 619 + } 620 + 621 + return fd; 622 + } 623 + 624 + enum { 625 + CMD_LIST = 0, 626 + CMD_SIGNAL = 1, 627 + CMD_STOP = 2, 628 + CMD_PING = 3, 629 + CMD_MAX, 630 + }; 631 + 632 + #define SESSION_MAX 64 633 + 634 + union cmd { 635 + int cmd; 636 + 637 + /* CMD_LIST */ 638 + struct { 639 + int cmd; 640 + int verbose; 641 + char csv_sep; 642 + } list; 643 + 644 + /* CMD_SIGNAL */ 645 + struct { 646 + int cmd; 647 + int sig; 648 + char name[SESSION_MAX]; 649 + } signal; 650 + 651 + /* CMD_PING */ 652 + struct { 653 + int cmd; 654 + char name[SESSION_MAX]; 655 + } ping; 656 + }; 657 + 658 + enum { 659 + PING_OK = 0, 660 + PING_FAIL = 1, 661 + PING_MAX, 662 + }; 663 + 664 + static int daemon_session__ping(struct daemon_session *session) 665 + { 666 + return daemon_session__control(session, "ping", true) ? PING_FAIL : PING_OK; 667 + } 668 + 669 + static int cmd_session_list(struct daemon *daemon, union cmd *cmd, FILE *out) 670 + { 671 + char csv_sep = cmd->list.csv_sep; 672 + struct daemon_session *session; 673 + time_t curr = time(NULL); 674 + 675 + if (csv_sep) { 676 + fprintf(out, "%d%c%s%c%s%c%s/%s", 677 + /* pid daemon */ 678 + getpid(), csv_sep, "daemon", 679 + /* base */ 680 + csv_sep, daemon->base, 681 + /* output */ 682 + csv_sep, daemon->base, SESSION_OUTPUT); 683 + 684 + fprintf(out, "%c%s/%s", 685 + /* lock */ 686 + csv_sep, daemon->base, "lock"); 687 + 688 + fprintf(out, "%c%lu", 689 + /* session up time */ 690 + csv_sep, (curr - daemon->start) / 60); 691 + 692 + fprintf(out, "\n"); 693 + } else { 694 + fprintf(out, "[%d:daemon] base: %s\n", getpid(), daemon->base); 695 + if (cmd->list.verbose) { 696 + fprintf(out, " output: %s/%s\n", 697 + daemon->base, SESSION_OUTPUT); 698 + fprintf(out, " lock: %s/lock\n", 699 + daemon->base); 700 + fprintf(out, " up: %lu minutes\n", 701 + (curr - daemon->start) / 60); 702 + } 703 + } 704 + 705 + list_for_each_entry(session, 
&daemon->sessions, list) { 706 + if (csv_sep) { 707 + fprintf(out, "%d%c%s%c%s", 708 + /* pid */ 709 + session->pid, 710 + /* name */ 711 + csv_sep, session->name, 712 + /* base */ 713 + csv_sep, session->run); 714 + 715 + fprintf(out, "%c%s%c%s/%s", 716 + /* session dir */ 717 + csv_sep, session->base, 718 + /* session output */ 719 + csv_sep, session->base, SESSION_OUTPUT); 720 + 721 + fprintf(out, "%c%s/%s%c%s/%s", 722 + /* session control */ 723 + csv_sep, session->base, SESSION_CONTROL, 724 + /* session ack */ 725 + csv_sep, session->base, SESSION_ACK); 726 + 727 + fprintf(out, "%c%lu", 728 + /* session up time */ 729 + csv_sep, (curr - session->start) / 60); 730 + 731 + fprintf(out, "\n"); 732 + } else { 733 + fprintf(out, "[%d:%s] perf record %s\n", 734 + session->pid, session->name, session->run); 735 + if (!cmd->list.verbose) 736 + continue; 737 + fprintf(out, " base: %s\n", 738 + session->base); 739 + fprintf(out, " output: %s/%s\n", 740 + session->base, SESSION_OUTPUT); 741 + fprintf(out, " control: %s/%s\n", 742 + session->base, SESSION_CONTROL); 743 + fprintf(out, " ack: %s/%s\n", 744 + session->base, SESSION_ACK); 745 + fprintf(out, " up: %lu minutes\n", 746 + (curr - session->start) / 60); 747 + } 748 + } 749 + 750 + return 0; 751 + } 752 + 753 + static int daemon_session__signal(struct daemon_session *session, int sig) 754 + { 755 + if (session->pid < 0) 756 + return -1; 757 + return kill(session->pid, sig); 758 + } 759 + 760 + static int cmd_session_kill(struct daemon *daemon, union cmd *cmd, FILE *out) 761 + { 762 + struct daemon_session *session; 763 + bool all = false; 764 + 765 + all = !strcmp(cmd->signal.name, "all"); 766 + 767 + list_for_each_entry(session, &daemon->sessions, list) { 768 + if (all || !strcmp(cmd->signal.name, session->name)) { 769 + daemon_session__signal(session, cmd->signal.sig); 770 + fprintf(out, "signal %d sent to session '%s [%d]'\n", 771 + cmd->signal.sig, session->name, session->pid); 772 + } 773 + } 774 + 775 + 
return 0; 776 + } 777 + 778 + static const char *ping_str[PING_MAX] = { 779 + [PING_OK] = "OK", 780 + [PING_FAIL] = "FAIL", 781 + }; 782 + 783 + static int cmd_session_ping(struct daemon *daemon, union cmd *cmd, FILE *out) 784 + { 785 + struct daemon_session *session; 786 + bool all = false, found = false; 787 + 788 + all = !strcmp(cmd->ping.name, "all"); 789 + 790 + list_for_each_entry(session, &daemon->sessions, list) { 791 + if (all || !strcmp(cmd->ping.name, session->name)) { 792 + int state = daemon_session__ping(session); 793 + 794 + fprintf(out, "%-4s %s\n", ping_str[state], session->name); 795 + found = true; 796 + } 797 + } 798 + 799 + if (!found && !all) { 800 + fprintf(out, "%-4s %s (not found)\n", 801 + ping_str[PING_FAIL], cmd->ping.name); 802 + } 803 + return 0; 804 + } 805 + 806 + static int handle_server_socket(struct daemon *daemon, int sock_fd) 807 + { 808 + int ret = -1, fd; 809 + FILE *out = NULL; 810 + union cmd cmd; 811 + 812 + fd = accept(sock_fd, NULL, NULL); 813 + if (fd < 0) { 814 + perror("failed: accept"); 815 + return -1; 816 + } 817 + 818 + if (sizeof(cmd) != readn(fd, &cmd, sizeof(cmd))) { 819 + perror("failed: read"); 820 + goto out; 821 + } 822 + 823 + out = fdopen(fd, "w"); 824 + if (!out) { 825 + perror("failed: fdopen"); 826 + goto out; 827 + } 828 + 829 + switch (cmd.cmd) { 830 + case CMD_LIST: 831 + ret = cmd_session_list(daemon, &cmd, out); 832 + break; 833 + case CMD_SIGNAL: 834 + ret = cmd_session_kill(daemon, &cmd, out); 835 + break; 836 + case CMD_STOP: 837 + done = 1; 838 + ret = 0; 839 + pr_debug("perf daemon is exiting\n"); 840 + break; 841 + case CMD_PING: 842 + ret = cmd_session_ping(daemon, &cmd, out); 843 + break; 844 + default: 845 + break; 846 + } 847 + 848 + fclose(out); 849 + out: 850 + /* If out is defined, then fd is closed via fclose.
*/ 851 + if (!out) 852 + close(fd); 853 + return ret; 854 + } 855 + 856 + static int setup_client_socket(struct daemon *daemon) 857 + { 858 + struct sockaddr_un addr; 859 + char path[PATH_MAX]; 860 + int fd = socket(AF_UNIX, SOCK_STREAM, 0); 861 + 862 + if (fd == -1) { 863 + perror("failed: socket"); 864 + return -1; 865 + } 866 + 867 + scnprintf(path, sizeof(path), "%s/control", daemon->base); 868 + 869 + if (strlen(path) + 1 >= sizeof(addr.sun_path)) { 870 + pr_err("failed: control path too long '%s'\n", path); 871 + close(fd); 872 + return -1; 873 + } 874 + 875 + memset(&addr, 0, sizeof(addr)); 876 + addr.sun_family = AF_UNIX; 877 + strlcpy(addr.sun_path, path, sizeof(addr.sun_path) - 1); 878 + 879 + if (connect(fd, (struct sockaddr *) &addr, sizeof(addr)) == -1) { 880 + perror("failed: connect"); 881 + close(fd); 882 + return -1; 883 + } 884 + 885 + return fd; 886 + } 887 + 888 + static void daemon_session__kill(struct daemon_session *session, 889 + struct daemon *daemon) 890 + { 891 + int how = 0; 892 + 893 + do { 894 + switch (how) { 895 + case 0: 896 + daemon_session__control(session, "stop", false); 897 + break; 898 + case 1: 899 + daemon_session__signal(session, SIGTERM); 900 + break; 901 + case 2: 902 + daemon_session__signal(session, SIGKILL); 903 + break; 904 + default: 905 + break; 906 + } 907 + how++; 908 + 909 + } while (daemon_session__wait(session, daemon, 10)); 910 + } 911 + 912 + static void daemon__signal(struct daemon *daemon, int sig) 913 + { 914 + struct daemon_session *session; 915 + 916 + list_for_each_entry(session, &daemon->sessions, list) 917 + daemon_session__signal(session, sig); 918 + } 919 + 920 + static void daemon_session__delete(struct daemon_session *session) 921 + { 922 + free(session->base); 923 + free(session->name); 924 + free(session->run); 925 + free(session); 926 + } 927 + 928 + static void daemon_session__remove(struct daemon_session *session) 929 + { 930 + list_del(&session->list); 931 + daemon_session__delete(session); 
932 + } 933 + 934 + static void daemon__stop(struct daemon *daemon) 935 + { 936 + struct daemon_session *session; 937 + 938 + list_for_each_entry(session, &daemon->sessions, list) 939 + daemon_session__control(session, "stop", false); 940 + } 941 + 942 + static void daemon__kill(struct daemon *daemon) 943 + { 944 + int how = 0; 945 + 946 + do { 947 + switch (how) { 948 + case 0: 949 + daemon__stop(daemon); 950 + break; 951 + case 1: 952 + daemon__signal(daemon, SIGTERM); 953 + break; 954 + case 2: 955 + daemon__signal(daemon, SIGKILL); 956 + break; 957 + default: 958 + break; 959 + } 960 + how++; 961 + 962 + } while (daemon__wait(daemon, 10)); 963 + } 964 + 965 + static void daemon__exit(struct daemon *daemon) 966 + { 967 + struct daemon_session *session, *h; 968 + 969 + list_for_each_entry_safe(session, h, &daemon->sessions, list) 970 + daemon_session__remove(session); 971 + 972 + free(daemon->config_real); 973 + free(daemon->config_base); 974 + free(daemon->base); 975 + } 976 + 977 + static int daemon__reconfig(struct daemon *daemon) 978 + { 979 + struct daemon_session *session, *n; 980 + 981 + list_for_each_entry_safe(session, n, &daemon->sessions, list) { 982 + /* No change. */ 983 + if (session->state == OK) 984 + continue; 985 + 986 + /* Remove session. */ 987 + if (session->state == KILL) { 988 + if (session->pid > 0) { 989 + daemon_session__kill(session, daemon); 990 + pr_info("reconfig: session '%s' killed\n", session->name); 991 + } 992 + daemon_session__remove(session); 993 + continue; 994 + } 995 + 996 + /* Reconfig session. 
*/ 997 + if (session->pid > 0) { 998 + daemon_session__kill(session, daemon); 999 + pr_info("reconfig: session '%s' killed\n", session->name); 1000 + } 1001 + if (daemon_session__run(session, daemon)) 1002 + return -1; 1003 + 1004 + session->state = OK; 1005 + } 1006 + 1007 + return 0; 1008 + } 1009 + 1010 + static int setup_config_changes(struct daemon *daemon) 1011 + { 1012 + char *basen = strdup(daemon->config_real); 1013 + char *dirn = strdup(daemon->config_real); 1014 + char *base, *dir; 1015 + int fd, wd = -1; 1016 + 1017 + if (!dirn || !basen) 1018 + goto out; 1019 + 1020 + fd = inotify_init1(IN_NONBLOCK|O_CLOEXEC); 1021 + if (fd < 0) { 1022 + perror("failed: inotify_init"); 1023 + goto out; 1024 + } 1025 + 1026 + dir = dirname(dirn); 1027 + base = basename(basen); 1028 + pr_debug("config file: %s, dir: %s\n", base, dir); 1029 + 1030 + wd = inotify_add_watch(fd, dir, IN_CLOSE_WRITE); 1031 + if (wd >= 0) { 1032 + daemon->config_base = strdup(base); 1033 + if (!daemon->config_base) { 1034 + close(fd); 1035 + wd = -1; 1036 + } 1037 + } else { 1038 + perror("failed: inotify_add_watch"); 1039 + } 1040 + 1041 + out: 1042 + free(basen); 1043 + free(dirn); 1044 + return wd < 0 ? -1 : fd; 1045 + } 1046 + 1047 + static bool process_inotify_event(struct daemon *daemon, char *buf, ssize_t len) 1048 + { 1049 + char *p = buf; 1050 + 1051 + while (p < (buf + len)) { 1052 + struct inotify_event *event = (struct inotify_event *) p; 1053 + 1054 + /* 1055 + * We monitor the config directory, check if our 1056 + * config file was changed. 
1057 + */ 1058 + if ((event->mask & IN_CLOSE_WRITE) && 1059 + !(event->mask & IN_ISDIR)) { 1060 + if (!strcmp(event->name, daemon->config_base)) 1061 + return true; 1062 + } 1063 + p += sizeof(*event) + event->len; 1064 + } 1065 + return false; 1066 + } 1067 + 1068 + static int handle_config_changes(struct daemon *daemon, int conf_fd, 1069 + bool *config_changed) 1070 + { 1071 + char buf[4096]; 1072 + ssize_t len; 1073 + 1074 + while (!(*config_changed)) { 1075 + len = read(conf_fd, buf, sizeof(buf)); 1076 + if (len == -1) { 1077 + if (errno != EAGAIN) { 1078 + perror("failed: read"); 1079 + return -1; 1080 + } 1081 + return 0; 1082 + } 1083 + *config_changed = process_inotify_event(daemon, buf, len); 1084 + } 1085 + return 0; 1086 + } 1087 + 1088 + static int setup_config(struct daemon *daemon) 1089 + { 1090 + if (daemon->base_user) { 1091 + daemon->base = strdup(daemon->base_user); 1092 + if (!daemon->base) 1093 + return -ENOMEM; 1094 + } 1095 + 1096 + if (daemon->config) { 1097 + char *real = realpath(daemon->config, NULL); 1098 + 1099 + if (!real) { 1100 + perror("failed: realpath"); 1101 + return -1; 1102 + } 1103 + daemon->config_real = real; 1104 + return 0; 1105 + } 1106 + 1107 + if (perf_config_system() && !access(perf_etc_perfconfig(), R_OK)) 1108 + daemon->config_real = strdup(perf_etc_perfconfig()); 1109 + else if (perf_config_global() && perf_home_perfconfig()) 1110 + daemon->config_real = strdup(perf_home_perfconfig()); 1111 + 1112 + return daemon->config_real ? 0 : -1; 1113 + } 1114 + 1115 + #ifndef F_TLOCK 1116 + #define F_TLOCK 2 1117 + 1118 + #include <sys/file.h> 1119 + 1120 + static int lockf(int fd, int cmd, off_t len) 1121 + { 1122 + if (cmd != F_TLOCK || len != 0) 1123 + return -1; 1124 + 1125 + return flock(fd, LOCK_EX | LOCK_NB); 1126 + } 1127 + #endif // F_TLOCK 1128 + 1129 + /* 1130 + * Each daemon tries to create and lock BASE/lock file, 1131 + * if it's successful we are sure we're the only daemon 1132 + * running over the BASE. 
1133 + * 1134 + * Once daemon is finished, file descriptor to lock file 1135 + * is closed and lock is released. 1136 + */ 1137 + static int check_lock(struct daemon *daemon) 1138 + { 1139 + char path[PATH_MAX]; 1140 + char buf[20]; 1141 + int fd, pid; 1142 + ssize_t len; 1143 + 1144 + scnprintf(path, sizeof(path), "%s/lock", daemon->base); 1145 + 1146 + fd = open(path, O_RDWR|O_CREAT|O_CLOEXEC, 0640); 1147 + if (fd < 0) 1148 + return -1; 1149 + 1150 + if (lockf(fd, F_TLOCK, 0) < 0) { 1151 + filename__read_int(path, &pid); 1152 + fprintf(stderr, "failed: another perf daemon (pid %d) owns %s\n", 1153 + pid, daemon->base); 1154 + close(fd); 1155 + return -1; 1156 + } 1157 + 1158 + scnprintf(buf, sizeof(buf), "%d", getpid()); 1159 + len = strlen(buf); 1160 + 1161 + if (write(fd, buf, len) != len) { 1162 + perror("failed: write"); 1163 + close(fd); 1164 + return -1; 1165 + } 1166 + 1167 + if (ftruncate(fd, len)) { 1168 + perror("failed: ftruncate"); 1169 + close(fd); 1170 + return -1; 1171 + } 1172 + 1173 + return 0; 1174 + } 1175 + 1176 + static int go_background(struct daemon *daemon) 1177 + { 1178 + int pid, fd; 1179 + 1180 + pid = fork(); 1181 + if (pid < 0) 1182 + return -1; 1183 + 1184 + if (pid > 0) 1185 + return 1; 1186 + 1187 + if (setsid() < 0) 1188 + return -1; 1189 + 1190 + if (check_lock(daemon)) 1191 + return -1; 1192 + 1193 + umask(0); 1194 + 1195 + if (chdir(daemon->base)) { 1196 + perror("failed: chdir"); 1197 + return -1; 1198 + } 1199 + 1200 + fd = open("output", O_RDWR|O_CREAT|O_TRUNC, 0644); 1201 + if (fd < 0) { 1202 + perror("failed: open"); 1203 + return -1; 1204 + } 1205 + 1206 + if (fcntl(fd, F_SETFD, FD_CLOEXEC)) { 1207 + perror("failed: fcntl FD_CLOEXEC"); 1208 + close(fd); 1209 + return -1; 1210 + } 1211 + 1212 + close(0); 1213 + dup2(fd, 1); 1214 + dup2(fd, 2); 1215 + close(fd); 1216 + 1217 + daemon->out = fdopen(1, "w"); 1218 + if (!daemon->out) { 1219 + close(1); 1220 + close(2); 1221 + return -1; 1222 + } 1223 + 1224 + 
setbuf(daemon->out, NULL); 1225 + return 0; 1226 + } 1227 + 1228 + static int setup_signalfd(struct daemon *daemon) 1229 + { 1230 + sigset_t mask; 1231 + 1232 + sigemptyset(&mask); 1233 + sigaddset(&mask, SIGCHLD); 1234 + 1235 + if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1) 1236 + return -1; 1237 + 1238 + daemon->signal_fd = signalfd(-1, &mask, SFD_NONBLOCK|SFD_CLOEXEC); 1239 + return daemon->signal_fd; 1240 + } 1241 + 1242 + static int __cmd_start(struct daemon *daemon, struct option parent_options[], 1243 + int argc, const char **argv) 1244 + { 1245 + bool foreground = false; 1246 + struct option start_options[] = { 1247 + OPT_BOOLEAN('f', "foreground", &foreground, "stay on console"), 1248 + OPT_PARENT(parent_options), 1249 + OPT_END() 1250 + }; 1251 + int sock_fd = -1, conf_fd = -1, signal_fd = -1; 1252 + int sock_pos, file_pos, signal_pos; 1253 + struct fdarray fda; 1254 + int err = 0; 1255 + 1256 + argc = parse_options(argc, argv, start_options, daemon_usage, 0); 1257 + if (argc) 1258 + usage_with_options(daemon_usage, start_options); 1259 + 1260 + daemon->start = time(NULL); 1261 + 1262 + if (setup_config(daemon)) { 1263 + pr_err("failed: config not found\n"); 1264 + return -1; 1265 + } 1266 + 1267 + if (setup_server_config(daemon)) 1268 + return -1; 1269 + 1270 + if (foreground && check_lock(daemon)) 1271 + return -1; 1272 + 1273 + if (!foreground) { 1274 + err = go_background(daemon); 1275 + if (err) { 1276 + /* original process, exit normally */ 1277 + if (err == 1) 1278 + err = 0; 1279 + daemon__exit(daemon); 1280 + return err; 1281 + } 1282 + } 1283 + 1284 + debug_set_file(daemon->out); 1285 + debug_set_display_time(true); 1286 + 1287 + pr_info("daemon started (pid %d)\n", getpid()); 1288 + 1289 + fdarray__init(&fda, 3); 1290 + 1291 + sock_fd = setup_server_socket(daemon); 1292 + if (sock_fd < 0) 1293 + goto out; 1294 + 1295 + conf_fd = setup_config_changes(daemon); 1296 + if (conf_fd < 0) 1297 + goto out; 1298 + 1299 + signal_fd = 
setup_signalfd(daemon); 1300 + if (signal_fd < 0) 1301 + goto out; 1302 + 1303 + sock_pos = fdarray__add(&fda, sock_fd, POLLIN|POLLERR|POLLHUP, 0); 1304 + if (sock_pos < 0) 1305 + goto out; 1306 + 1307 + file_pos = fdarray__add(&fda, conf_fd, POLLIN|POLLERR|POLLHUP, 0); 1308 + if (file_pos < 0) 1309 + goto out; 1310 + 1311 + signal_pos = fdarray__add(&fda, signal_fd, POLLIN|POLLERR|POLLHUP, 0); 1312 + if (signal_pos < 0) 1313 + goto out; 1314 + 1315 + signal(SIGINT, sig_handler); 1316 + signal(SIGTERM, sig_handler); 1317 + signal(SIGPIPE, SIG_IGN); 1318 + 1319 + while (!done && !err) { 1320 + err = daemon__reconfig(daemon); 1321 + 1322 + if (!err && fdarray__poll(&fda, -1)) { 1323 + bool reconfig = false; 1324 + 1325 + if (fda.entries[sock_pos].revents & POLLIN) 1326 + err = handle_server_socket(daemon, sock_fd); 1327 + if (fda.entries[file_pos].revents & POLLIN) 1328 + err = handle_config_changes(daemon, conf_fd, &reconfig); 1329 + if (fda.entries[signal_pos].revents & POLLIN) 1330 + err = handle_signalfd(daemon) < 0; 1331 + 1332 + if (reconfig) 1333 + err = setup_server_config(daemon); 1334 + } 1335 + } 1336 + 1337 + out: 1338 + fdarray__exit(&fda); 1339 + 1340 + daemon__kill(daemon); 1341 + daemon__exit(daemon); 1342 + 1343 + if (sock_fd != -1) 1344 + close(sock_fd); 1345 + if (conf_fd != -1) 1346 + close(conf_fd); 1347 + if (signal_fd != -1) 1348 + close(signal_fd); 1349 + 1350 + pr_info("daemon exited\n"); 1351 + fclose(daemon->out); 1352 + return err; 1353 + } 1354 + 1355 + static int send_cmd(struct daemon *daemon, union cmd *cmd) 1356 + { 1357 + int ret = -1, fd; 1358 + char *line = NULL; 1359 + size_t len = 0; 1360 + ssize_t nread; 1361 + FILE *in = NULL; 1362 + 1363 + if (setup_client_config(daemon)) 1364 + return -1; 1365 + 1366 + fd = setup_client_socket(daemon); 1367 + if (fd < 0) 1368 + return -1; 1369 + 1370 + if (sizeof(*cmd) != writen(fd, cmd, sizeof(*cmd))) { 1371 + perror("failed: write"); 1372 + goto out; 1373 + } 1374 + 1375 + in = fdopen(fd, 
"r"); 1376 + if (!in) { 1377 + perror("failed: fdopen"); 1378 + goto out; 1379 + } 1380 + 1381 + while ((nread = getline(&line, &len, in)) != -1) { 1382 + if (fwrite(line, nread, 1, stdout) != 1) 1383 + goto out_fclose; 1384 + fflush(stdout); 1385 + } 1386 + 1387 + ret = 0; 1388 + out_fclose: 1389 + fclose(in); 1390 + free(line); 1391 + out: 1392 + /* If in is defined, then fd is closed via fclose. */ 1393 + if (!in) 1394 + close(fd); 1395 + return ret; 1396 + } 1397 + 1398 + static int send_cmd_list(struct daemon *daemon) 1399 + { 1400 + union cmd cmd = { .cmd = CMD_LIST, }; 1401 + 1402 + cmd.list.verbose = verbose; 1403 + cmd.list.csv_sep = daemon->csv_sep ? *daemon->csv_sep : 0; 1404 + 1405 + return send_cmd(daemon, &cmd); 1406 + } 1407 + 1408 + static int __cmd_signal(struct daemon *daemon, struct option parent_options[], 1409 + int argc, const char **argv) 1410 + { 1411 + const char *name = "all"; 1412 + struct option start_options[] = { 1413 + OPT_STRING(0, "session", &name, "session", 1414 + "Sent signal to specific session"), 1415 + OPT_PARENT(parent_options), 1416 + OPT_END() 1417 + }; 1418 + union cmd cmd; 1419 + 1420 + argc = parse_options(argc, argv, start_options, daemon_usage, 0); 1421 + if (argc) 1422 + usage_with_options(daemon_usage, start_options); 1423 + 1424 + if (setup_config(daemon)) { 1425 + pr_err("failed: config not found\n"); 1426 + return -1; 1427 + } 1428 + 1429 + cmd.signal.cmd = CMD_SIGNAL, 1430 + cmd.signal.sig = SIGUSR2; 1431 + strncpy(cmd.signal.name, name, sizeof(cmd.signal.name) - 1); 1432 + 1433 + return send_cmd(daemon, &cmd); 1434 + } 1435 + 1436 + static int __cmd_stop(struct daemon *daemon, struct option parent_options[], 1437 + int argc, const char **argv) 1438 + { 1439 + struct option start_options[] = { 1440 + OPT_PARENT(parent_options), 1441 + OPT_END() 1442 + }; 1443 + union cmd cmd = { .cmd = CMD_STOP, }; 1444 + 1445 + argc = parse_options(argc, argv, start_options, daemon_usage, 0); 1446 + if (argc) 1447 + 
usage_with_options(daemon_usage, start_options); 1448 + 1449 + if (setup_config(daemon)) { 1450 + pr_err("failed: config not found\n"); 1451 + return -1; 1452 + } 1453 + 1454 + return send_cmd(daemon, &cmd); 1455 + } 1456 + 1457 + static int __cmd_ping(struct daemon *daemon, struct option parent_options[], 1458 + int argc, const char **argv) 1459 + { 1460 + const char *name = "all"; 1461 + struct option ping_options[] = { 1462 + OPT_STRING(0, "session", &name, "session", 1463 + "Ping to specific session"), 1464 + OPT_PARENT(parent_options), 1465 + OPT_END() 1466 + }; 1467 + union cmd cmd = { .cmd = CMD_PING, }; 1468 + 1469 + argc = parse_options(argc, argv, ping_options, daemon_usage, 0); 1470 + if (argc) 1471 + usage_with_options(daemon_usage, ping_options); 1472 + 1473 + if (setup_config(daemon)) { 1474 + pr_err("failed: config not found\n"); 1475 + return -1; 1476 + } 1477 + 1478 + scnprintf(cmd.ping.name, sizeof(cmd.ping.name), "%s", name); 1479 + return send_cmd(daemon, &cmd); 1480 + } 1481 + 1482 + int cmd_daemon(int argc, const char **argv) 1483 + { 1484 + struct option daemon_options[] = { 1485 + OPT_INCR('v', "verbose", &verbose, "be more verbose"), 1486 + OPT_STRING(0, "config", &__daemon.config, 1487 + "config file", "config file path"), 1488 + OPT_STRING(0, "base", &__daemon.base_user, 1489 + "directory", "base directory"), 1490 + OPT_STRING_OPTARG('x', "field-separator", &__daemon.csv_sep, 1491 + "field separator", "print counts with custom separator", ","), 1492 + OPT_END() 1493 + }; 1494 + 1495 + perf_exe(__daemon.perf, sizeof(__daemon.perf)); 1496 + __daemon.out = stdout; 1497 + 1498 + argc = parse_options(argc, argv, daemon_options, daemon_usage, 1499 + PARSE_OPT_STOP_AT_NON_OPTION); 1500 + 1501 + if (argc) { 1502 + if (!strcmp(argv[0], "start")) 1503 + return __cmd_start(&__daemon, daemon_options, argc, argv); 1504 + if (!strcmp(argv[0], "signal")) 1505 + return __cmd_signal(&__daemon, daemon_options, argc, argv); 1506 + else if (!strcmp(argv[0], 
"stop")) 1507 + return __cmd_stop(&__daemon, daemon_options, argc, argv); 1508 + else if (!strcmp(argv[0], "ping")) 1509 + return __cmd_ping(&__daemon, daemon_options, argc, argv); 1510 + 1511 + pr_err("failed: unknown command '%s'\n", argv[0]); 1512 + return -1; 1513 + } 1514 + 1515 + if (setup_config(&__daemon)) { 1516 + pr_err("failed: config not found\n"); 1517 + return -1; 1518 + } 1519 + 1520 + return send_cmd_list(&__daemon); 1521 + }
+2 -2
tools/perf/builtin-inject.c
··· 313 313 * if jit marker, then inject jit mmaps and generate ELF images 314 314 */ 315 315 ret = jit_process(inject->session, &inject->output, machine, 316 - event->mmap.filename, event->mmap.pid, &n); 316 + event->mmap.filename, event->mmap.pid, event->mmap.tid, &n); 317 317 if (ret < 0) 318 318 return ret; 319 319 if (ret) { ··· 413 413 * if jit marker, then inject jit mmaps and generate ELF images 414 414 */ 415 415 ret = jit_process(inject->session, &inject->output, machine, 416 - event->mmap2.filename, event->mmap2.pid, &n); 416 + event->mmap2.filename, event->mmap2.pid, event->mmap2.tid, &n); 417 417 if (ret < 0) 418 418 return ret; 419 419 if (ret) {
+59 -58
tools/perf/builtin-mem.c
··· 30 30 bool dump_raw; 31 31 bool force; 32 32 bool phys_addr; 33 + bool data_page_size; 33 34 int operation; 34 35 const char *cpu_list; 35 36 DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS); ··· 125 124 if (mem->phys_addr) 126 125 rec_argv[i++] = "--phys-data"; 127 126 127 + if (mem->data_page_size) 128 + rec_argv[i++] = "--data-page-size"; 129 + 128 130 for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) { 129 131 e = perf_mem_events__ptr(j); 130 132 if (!e->record) ··· 176 172 { 177 173 struct perf_mem *mem = container_of(tool, struct perf_mem, tool); 178 174 struct addr_location al; 179 - const char *fmt; 175 + const char *fmt, *field_sep; 176 + char str[PAGE_SIZE_NAME_LEN]; 180 177 181 178 if (machine__resolve(machine, &al, sample) < 0) { 182 179 fprintf(stderr, "problem processing %d event, skipping it.\n", ··· 191 186 if (al.map != NULL) 192 187 al.map->dso->hit = 1; 193 188 194 - if (mem->phys_addr) { 195 - if (symbol_conf.field_sep) { 196 - fmt = "%d%s%d%s0x%"PRIx64"%s0x%"PRIx64"%s0x%016"PRIx64 197 - "%s%"PRIu64"%s0x%"PRIx64"%s%s:%s\n"; 198 - } else { 199 - fmt = "%5d%s%5d%s0x%016"PRIx64"%s0x016%"PRIx64 200 - "%s0x%016"PRIx64"%s%5"PRIu64"%s0x%06"PRIx64 201 - "%s%s:%s\n"; 202 - symbol_conf.field_sep = " "; 203 - } 204 - 205 - printf(fmt, 206 - sample->pid, 207 - symbol_conf.field_sep, 208 - sample->tid, 209 - symbol_conf.field_sep, 210 - sample->ip, 211 - symbol_conf.field_sep, 212 - sample->addr, 213 - symbol_conf.field_sep, 214 - sample->phys_addr, 215 - symbol_conf.field_sep, 216 - sample->weight, 217 - symbol_conf.field_sep, 218 - sample->data_src, 219 - symbol_conf.field_sep, 220 - al.map ? (al.map->dso ? al.map->dso->long_name : "???") : "???", 221 - al.sym ? 
al.sym->name : "???"); 189 + field_sep = symbol_conf.field_sep; 190 + if (field_sep) { 191 + fmt = "%d%s%d%s0x%"PRIx64"%s0x%"PRIx64"%s"; 222 192 } else { 223 - if (symbol_conf.field_sep) { 224 - fmt = "%d%s%d%s0x%"PRIx64"%s0x%"PRIx64"%s%"PRIu64 225 - "%s0x%"PRIx64"%s%s:%s\n"; 226 - } else { 227 - fmt = "%5d%s%5d%s0x%016"PRIx64"%s0x016%"PRIx64 228 - "%s%5"PRIu64"%s0x%06"PRIx64"%s%s:%s\n"; 229 - symbol_conf.field_sep = " "; 230 - } 231 - 232 - printf(fmt, 233 - sample->pid, 234 - symbol_conf.field_sep, 235 - sample->tid, 236 - symbol_conf.field_sep, 237 - sample->ip, 238 - symbol_conf.field_sep, 239 - sample->addr, 240 - symbol_conf.field_sep, 241 - sample->weight, 242 - symbol_conf.field_sep, 243 - sample->data_src, 244 - symbol_conf.field_sep, 245 - al.map ? (al.map->dso ? al.map->dso->long_name : "???") : "???", 246 - al.sym ? al.sym->name : "???"); 193 + fmt = "%5d%s%5d%s0x%016"PRIx64"%s0x016%"PRIx64"%s"; 194 + symbol_conf.field_sep = " "; 247 195 } 196 + printf(fmt, 197 + sample->pid, 198 + symbol_conf.field_sep, 199 + sample->tid, 200 + symbol_conf.field_sep, 201 + sample->ip, 202 + symbol_conf.field_sep, 203 + sample->addr, 204 + symbol_conf.field_sep); 205 + 206 + if (mem->phys_addr) { 207 + printf("0x%016"PRIx64"%s", 208 + sample->phys_addr, 209 + symbol_conf.field_sep); 210 + } 211 + 212 + if (mem->data_page_size) { 213 + printf("%s%s", 214 + get_page_size_name(sample->data_page_size, str), 215 + symbol_conf.field_sep); 216 + } 217 + 218 + if (field_sep) 219 + fmt = "%"PRIu64"%s0x%"PRIx64"%s%s:%s\n"; 220 + else 221 + fmt = "%5"PRIu64"%s0x%06"PRIx64"%s%s:%s\n"; 222 + 223 + printf(fmt, 224 + sample->weight, 225 + symbol_conf.field_sep, 226 + sample->data_src, 227 + symbol_conf.field_sep, 228 + al.map ? (al.map->dso ? al.map->dso->long_name : "???") : "???", 229 + al.sym ? 
al.sym->name : "???"); 248 230 out_put: 249 231 addr_location__put(&al); 250 232 return 0; ··· 279 287 if (ret < 0) 280 288 goto out_delete; 281 289 290 + printf("# PID, TID, IP, ADDR, "); 291 + 282 292 if (mem->phys_addr) 283 - printf("# PID, TID, IP, ADDR, PHYS ADDR, LOCAL WEIGHT, DSRC, SYMBOL\n"); 284 - else 285 - printf("# PID, TID, IP, ADDR, LOCAL WEIGHT, DSRC, SYMBOL\n"); 293 + printf("PHYS ADDR, "); 294 + 295 + if (mem->data_page_size) 296 + printf("DATA PAGE SIZE, "); 297 + 298 + printf("LOCAL WEIGHT, DSRC, SYMBOL\n"); 286 299 287 300 ret = perf_session__process_events(session); 288 301 ··· 297 300 } 298 301 static char *get_sort_order(struct perf_mem *mem) 299 302 { 300 - bool has_extra_options = mem->phys_addr ? true : false; 303 + bool has_extra_options = (mem->phys_addr | mem->data_page_size) ? true : false; 301 304 char sort[128]; 302 305 303 306 /* ··· 309 312 "dso_daddr,tlb,locked"); 310 313 } else if (has_extra_options) { 311 314 strcpy(sort, "--sort=local_weight,mem,sym,dso,symbol_daddr," 312 - "dso_daddr,snoop,tlb,locked"); 315 + "dso_daddr,snoop,tlb,locked,blocked"); 313 316 } else 314 317 return NULL; 315 318 316 319 if (mem->phys_addr) 317 320 strcat(sort, ",phys_daddr"); 321 + 322 + if (mem->data_page_size) 323 + strcat(sort, ",data_page_size"); 318 324 319 325 return strdup(sort); 320 326 } ··· 464 464 " between columns '.' is reserved."), 465 465 OPT_BOOLEAN('f', "force", &mem.force, "don't complain, do it"), 466 466 OPT_BOOLEAN('p', "phys-data", &mem.phys_addr, "Record/Report sample physical addresses"), 467 + OPT_BOOLEAN(0, "data-page-size", &mem.data_page_size, "Record/Report sample data address page size"), 467 468 OPT_END() 468 469 }; 469 470 const char *const mem_subcommands[] = { "record", "report", NULL };
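The header change above replaces two hardcoded column strings with incremental composition, so each optional column ("PHYS ADDR", "DATA PAGE SIZE") is appended independently of the others. A hedged sketch of that pattern in isolation (the function name and buffer handling are illustrative, not the perf code, which prints directly):

```c
#include <stdbool.h>
#include <stdio.h>

/* Compose the 'perf mem report' header column list incrementally,
 * appending each optional column only when its option is enabled. */
static void mem_report_header(char *buf, size_t sz,
			      bool phys_addr, bool data_page_size)
{
	int n = snprintf(buf, sz, "# PID, TID, IP, ADDR, ");

	if (phys_addr)
		n += snprintf(buf + n, sz - n, "PHYS ADDR, ");
	if (data_page_size)
		n += snprintf(buf + n, sz - n, "DATA PAGE SIZE, ");

	snprintf(buf + n, sz - n, "LOCAL WEIGHT, DSRC, SYMBOL");
}
```

The same structure is used for the per-sample rows: the fixed prefix (pid/tid/ip/addr) prints first, the optional fields print in the middle, and the weight/data-src/symbol suffix prints last, so adding a new column means adding one conditional block instead of duplicating two whole format strings.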
+32 -7
tools/perf/builtin-record.c
··· 102 102 bool no_buildid_cache; 103 103 bool no_buildid_cache_set; 104 104 bool buildid_all; 105 + bool buildid_mmap; 105 106 bool timestamp_filename; 106 107 bool timestamp_boundary; 107 108 struct switch_output switch_output; ··· 730 729 rec->opts.auxtrace_sample_opts); 731 730 if (err) 732 731 return err; 732 + 733 + auxtrace_regroup_aux_output(rec->evlist); 733 734 734 735 return auxtrace_parse_filters(rec->evlist); 735 736 } ··· 1666 1663 status = -1; 1667 1664 goto out_delete_session; 1668 1665 } 1669 - err = evlist__add_pollfd(rec->evlist, done_fd); 1666 + err = evlist__add_wakeup_eventfd(rec->evlist, done_fd); 1670 1667 if (err < 0) { 1671 1668 pr_err("Failed to add wakeup eventfd to poll list\n"); 1672 1669 status = err; ··· 1940 1937 1941 1938 if (evlist__ctlfd_process(rec->evlist, &cmd) > 0) { 1942 1939 switch (cmd) { 1943 - case EVLIST_CTL_CMD_ENABLE: 1944 - pr_info(EVLIST_ENABLED_MSG); 1945 - break; 1946 - case EVLIST_CTL_CMD_DISABLE: 1947 - pr_info(EVLIST_DISABLED_MSG); 1948 - break; 1949 1940 case EVLIST_CTL_CMD_SNAPSHOT: 1950 1941 hit_auxtrace_snapshot_trigger(rec); 1951 1942 evlist__ctlfd_ack(rec->evlist); 1952 1943 break; 1944 + case EVLIST_CTL_CMD_STOP: 1945 + done = 1; 1946 + break; 1953 1947 case EVLIST_CTL_CMD_ACK: 1954 1948 case EVLIST_CTL_CMD_UNSUPPORTED: 1949 + case EVLIST_CTL_CMD_ENABLE: 1950 + case EVLIST_CTL_CMD_DISABLE: 1951 + case EVLIST_CTL_CMD_EVLIST: 1952 + case EVLIST_CTL_CMD_PING: 1955 1953 default: 1956 1954 break; 1957 1955 } ··· 2139 2135 rec->no_buildid_cache = true; 2140 2136 else if (!strcmp(value, "skip")) 2141 2137 rec->no_buildid = true; 2138 + else if (!strcmp(value, "mmap")) 2139 + rec->buildid_mmap = true; 2142 2140 else 2143 2141 return -1; 2144 2142 return 0; ··· 2480 2474 "Record the sample physical addresses"), 2481 2475 OPT_BOOLEAN(0, "data-page-size", &record.opts.sample_data_page_size, 2482 2476 "Record the sampled data address data page size"), 2477 + OPT_BOOLEAN(0, "code-page-size", 
&record.opts.sample_code_page_size, 2478 + "Record the sampled code address (ip) page size"), 2483 2479 OPT_BOOLEAN(0, "sample-cpu", &record.opts.sample_cpu, "Record the sample cpu"), 2484 2480 OPT_BOOLEAN_SET('T', "timestamp", &record.opts.sample_time, 2485 2481 &record.opts.sample_time_set, ··· 2560 2552 "file", "vmlinux pathname"), 2561 2553 OPT_BOOLEAN(0, "buildid-all", &record.buildid_all, 2562 2554 "Record build-id of all DSOs regardless of hits"), 2555 + OPT_BOOLEAN(0, "buildid-mmap", &record.buildid_mmap, 2556 + "Record build-id in map events"), 2563 2557 OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename, 2564 2558 "append timestamp to output filename"), 2565 2559 OPT_BOOLEAN(0, "timestamp-boundary", &record.timestamp_boundary, ··· 2663 2653 usage_with_options_msg(record_usage, record_options, 2664 2654 "cgroup monitoring only available in system-wide mode"); 2665 2655 2656 + } 2657 + 2658 + if (rec->buildid_mmap) { 2659 + if (!perf_can_record_build_id()) { 2660 + pr_err("Failed: no support to record build id in mmap events, update your kernel.\n"); 2661 + err = -EINVAL; 2662 + goto out_opts; 2663 + } 2664 + pr_debug("Enabling build id in mmap2 events.\n"); 2665 + /* Enable mmap build id synthesizing. */ 2666 + symbol_conf.buildid_mmap2 = true; 2667 + /* Enable perf_event_attr::build_id bit. */ 2668 + rec->opts.build_id = true; 2669 + /* Disable build id cache. */ 2670 + rec->no_buildid = true; 2666 2671 } 2667 2672 2668 2673 if (rec->opts.kcore)
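The control-fd loop in builtin-record.c above now enumerates the full EVLIST_CTL_CMD_* set (ENABLE, DISABLE, EVLIST, STOP, PING, ...), which are parsed from textual commands written to the control fifo. A minimal sketch of such a string-to-command parser, as an illustration only (the enum and function names here are hypothetical; the real parsing lives in the evlist control code):

```c
#include <string.h>

enum ctl_cmd {
	CTL_ENABLE, CTL_DISABLE, CTL_SNAPSHOT,
	CTL_STOP, CTL_PING, CTL_UNSUPPORTED,
};

/* Map a textual control command to its enum value; unknown strings
 * fall through to CTL_UNSUPPORTED, like the default: case above. */
static enum ctl_cmd parse_ctl_cmd(const char *s)
{
	static const struct { const char *name; enum ctl_cmd cmd; } tbl[] = {
		{ "enable",   CTL_ENABLE   },
		{ "disable",  CTL_DISABLE  },
		{ "snapshot", CTL_SNAPSHOT },
		{ "stop",     CTL_STOP     },
		{ "ping",     CTL_PING     },
	};

	for (size_t i = 0; i < sizeof(tbl) / sizeof(tbl[0]); i++) {
		if (!strcmp(s, tbl[i].name))
			return tbl[i].cmd;
	}
	return CTL_UNSUPPORTED;
}
```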
+33 -4
tools/perf/builtin-script.c
··· 117 117 PERF_OUTPUT_IPC = 1ULL << 31, 118 118 PERF_OUTPUT_TOD = 1ULL << 32, 119 119 PERF_OUTPUT_DATA_PAGE_SIZE = 1ULL << 33, 120 + PERF_OUTPUT_CODE_PAGE_SIZE = 1ULL << 34, 120 121 }; 121 122 122 123 struct perf_script { ··· 183 182 {.str = "ipc", .field = PERF_OUTPUT_IPC}, 184 183 {.str = "tod", .field = PERF_OUTPUT_TOD}, 185 184 {.str = "data_page_size", .field = PERF_OUTPUT_DATA_PAGE_SIZE}, 185 + {.str = "code_page_size", .field = PERF_OUTPUT_CODE_PAGE_SIZE}, 186 186 }; 187 187 188 188 enum { ··· 258 256 PERF_OUTPUT_DSO | PERF_OUTPUT_PERIOD | 259 257 PERF_OUTPUT_ADDR | PERF_OUTPUT_DATA_SRC | 260 258 PERF_OUTPUT_WEIGHT | PERF_OUTPUT_PHYS_ADDR | 261 - PERF_OUTPUT_DATA_PAGE_SIZE, 259 + PERF_OUTPUT_DATA_PAGE_SIZE | PERF_OUTPUT_CODE_PAGE_SIZE, 262 260 263 261 .invalid_fields = PERF_OUTPUT_TRACE | PERF_OUTPUT_BPF_OUTPUT, 264 262 }, ··· 523 521 524 522 if (PRINT_FIELD(DATA_PAGE_SIZE) && 525 523 evsel__check_stype(evsel, PERF_SAMPLE_DATA_PAGE_SIZE, "DATA_PAGE_SIZE", PERF_OUTPUT_DATA_PAGE_SIZE)) 524 + return -EINVAL; 525 + 526 + if (PRINT_FIELD(CODE_PAGE_SIZE) && 527 + evsel__check_stype(evsel, PERF_SAMPLE_CODE_PAGE_SIZE, "CODE_PAGE_SIZE", PERF_OUTPUT_CODE_PAGE_SIZE)) 526 528 return -EINVAL; 527 529 528 530 return 0; ··· 1537 1531 {PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TX_ABORT, "tx abrt"}, 1538 1532 {PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_BEGIN, "tr strt"}, 1539 1533 {PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_END, "tr end"}, 1534 + {PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_VMENTRY, "vmentry"}, 1535 + {PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_VMEXIT, "vmexit"}, 1540 1536 {0, NULL} 1541 1537 }; 1542 1538 ··· 1768 1760 return len + perf_sample__fprintf_pt_spacing(len, fp); 1769 1761 } 1770 1762 1763 + static int perf_sample__fprintf_synth_psb(struct perf_sample *sample, FILE *fp) 1764 + { 1765 + struct perf_synth_intel_psb *data = perf_sample__synth_ptr(sample); 1766 + int len; 1767 + 1768 + if (perf_sample__bad_synth_size(sample, *data)) 1769 
+ return 0; 1770 + 1771 + len = fprintf(fp, " psb offs: %#" PRIx64, data->offset); 1772 + return len + perf_sample__fprintf_pt_spacing(len, fp); 1773 + } 1774 + 1771 1775 static int perf_sample__fprintf_synth(struct perf_sample *sample, 1772 1776 struct evsel *evsel, FILE *fp) 1773 1777 { ··· 1796 1776 return perf_sample__fprintf_synth_pwrx(sample, fp); 1797 1777 case PERF_SYNTH_INTEL_CBR: 1798 1778 return perf_sample__fprintf_synth_cbr(sample, fp); 1779 + case PERF_SYNTH_INTEL_PSB: 1780 + return perf_sample__fprintf_synth_psb(sample, fp); 1799 1781 default: 1800 1782 break; 1801 1783 } ··· 2057 2035 2058 2036 if (PRINT_FIELD(DATA_PAGE_SIZE)) 2059 2037 fprintf(fp, " %s", get_page_size_name(sample->data_page_size, str)); 2038 + 2039 + if (PRINT_FIELD(CODE_PAGE_SIZE)) 2040 + fprintf(fp, " %s", get_page_size_name(sample->code_page_size, str)); 2060 2041 2061 2042 perf_sample__fprintf_ipc(sample, attr, fp); 2062 2043 ··· 2811 2786 break; 2812 2787 } 2813 2788 if (i == imax && strcmp(tok, "flags") == 0) { 2814 - print_flags = change == REMOVE ? false : true; 2789 + print_flags = change != REMOVE; 2815 2790 continue; 2816 2791 } 2817 2792 if (i == imax) { ··· 3259 3234 3260 3235 static bool is_top_script(const char *script_path) 3261 3236 { 3262 - return ends_with(script_path, "top") == NULL ? 
false : true; 3237 + return ends_with(script_path, "top") != NULL; 3263 3238 } 3264 3239 3265 3240 static int has_required_arg(char *script_path) ··· 3560 3535 "addr,symoff,srcline,period,iregs,uregs,brstack," 3561 3536 "brstacksym,flags,bpf-output,brstackinsn,brstackoff," 3562 3537 "callindent,insn,insnlen,synth,phys_addr,metric,misc,ipc,tod," 3563 - "data_page_size", 3538 + "data_page_size,code_page_size", 3564 3539 parse_output_fields), 3565 3540 OPT_BOOLEAN('a', "all-cpus", &system_wide, 3566 3541 "system-wide collection from all CPUs"), 3542 + OPT_STRING(0, "dsos", &symbol_conf.dso_list_str, "dso[,dso...]", 3543 + "only consider symbols in these DSOs"), 3567 3544 OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]", 3568 3545 "only consider these symbols"), 3546 + OPT_INTEGER(0, "addr-range", &symbol_conf.addr_range, 3547 + "Use with -S to list traced records within address range"), 3569 3548 OPT_CALLBACK_OPTARG(0, "insn-trace", &itrace_synth_opts, NULL, NULL, 3570 3549 "Decode instructions from itrace", parse_insn_trace), 3571 3550 OPT_CALLBACK_OPTARG(0, "xed", NULL, NULL, NULL,
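The new code_page_size field above is printed through get_page_size_name(), which renders the sampled page size in a compact human-readable form. A hypothetical helper in that spirit, assuming power-of-two page sizes and a "N/A" fallback for zero (the exact formatting of the real perf helper may differ):

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a page size in bytes as 4K / 2M / 1G, "N/A" when unknown. */
static const char *page_size_name(uint64_t bytes, char *buf, size_t sz)
{
	if (!bytes)
		snprintf(buf, sz, "N/A");
	else if (bytes >= (1ULL << 30) && !(bytes % (1ULL << 30)))
		snprintf(buf, sz, "%" PRIu64 "G", bytes >> 30);
	else if (bytes >= (1ULL << 20) && !(bytes % (1ULL << 20)))
		snprintf(buf, sz, "%" PRIu64 "M", bytes >> 20);
	else if (bytes >= (1ULL << 10) && !(bytes % (1ULL << 10)))
		snprintf(buf, sz, "%" PRIu64 "K", bytes >> 10);
	else
		snprintf(buf, sz, "%" PRIu64 "B", bytes);
	return buf;
}
```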
+105 -19
tools/perf/builtin-stat.c
··· 67 67 #include "util/top.h" 68 68 #include "util/affinity.h" 69 69 #include "util/pfm.h" 70 + #include "util/bpf_counter.h" 70 71 #include "asm/bug.h" 71 72 72 73 #include <linux/time64.h> ··· 135 134 "topdown-bad-spec", 136 135 "topdown-fe-bound", 137 136 "topdown-be-bound", 137 + NULL, 138 + }; 139 + 140 + static const char *topdown_metric_L2_attrs[] = { 141 + "slots", 142 + "topdown-retiring", 143 + "topdown-bad-spec", 144 + "topdown-fe-bound", 145 + "topdown-be-bound", 146 + "topdown-heavy-ops", 147 + "topdown-br-mispredict", 148 + "topdown-fetch-lat", 149 + "topdown-mem-bound", 138 150 NULL, 139 151 }; 140 152 ··· 423 409 return 0; 424 410 } 425 411 412 + static int read_bpf_map_counters(void) 413 + { 414 + struct evsel *counter; 415 + int err; 416 + 417 + evlist__for_each_entry(evsel_list, counter) { 418 + err = bpf_counter__read(counter); 419 + if (err) 420 + return err; 421 + } 422 + return 0; 423 + } 424 + 426 425 static void read_counters(struct timespec *rs) 427 426 { 428 427 struct evsel *counter; 428 + int err; 429 429 430 - if (!stat_config.stop_read_counter && (read_affinity_counters(rs) < 0)) 431 - return; 430 + if (!stat_config.stop_read_counter) { 431 + if (target__has_bpf(&target)) 432 + err = read_bpf_map_counters(); 433 + else 434 + err = read_affinity_counters(rs); 435 + if (err < 0) 436 + return; 437 + } 432 438 433 439 evlist__for_each_entry(evsel_list, counter) { 434 440 if (counter->err) ··· 530 496 return false; 531 497 } 532 498 533 - static void enable_counters(void) 499 + static int enable_counters(void) 534 500 { 501 + struct evsel *evsel; 502 + int err; 503 + 504 + if (target__has_bpf(&target)) { 505 + evlist__for_each_entry(evsel_list, evsel) { 506 + err = bpf_counter__enable(evsel); 507 + if (err) 508 + return err; 509 + } 510 + } 511 + 535 512 if (stat_config.initial_delay < 0) { 536 513 pr_info(EVLIST_DISABLED_MSG); 537 - return; 514 + return 0; 538 515 } 539 516 540 517 if (stat_config.initial_delay > 0) { ··· 563 518 if 
 		(stat_config.initial_delay > 0)
 			pr_info(EVLIST_ENABLED_MSG);
 	}
+	return 0;
 }
 
 static void disable_counters(void)
···
 		if (evlist__ctlfd_process(evlist, &cmd) > 0) {
 			switch (cmd) {
 			case EVLIST_CTL_CMD_ENABLE:
-				pr_info(EVLIST_ENABLED_MSG);
 				if (interval)
 					process_interval();
 				break;
 			case EVLIST_CTL_CMD_DISABLE:
 				if (interval)
 					process_interval();
-				pr_info(EVLIST_DISABLED_MSG);
 				break;
 			case EVLIST_CTL_CMD_SNAPSHOT:
 			case EVLIST_CTL_CMD_ACK:
 			case EVLIST_CTL_CMD_UNSUPPORTED:
+			case EVLIST_CTL_CMD_EVLIST:
+			case EVLIST_CTL_CMD_STOP:
+			case EVLIST_CTL_CMD_PING:
 			default:
 				break;
 			}
···
 	const bool forks = (argc > 0);
 	bool is_pipe = STAT_RECORD ? perf_stat.data.is_pipe : false;
 	struct affinity affinity;
-	int i, cpu;
+	int i, cpu, err;
 	bool second_pass = false;
 
 	if (forks) {
···
 	if (affinity__setup(&affinity) < 0)
 		return -1;
+
+	if (target__has_bpf(&target)) {
+		evlist__for_each_entry(evsel_list, counter) {
+			if (bpf_counter__load(counter, &target))
+				return -1;
+		}
+	}
 
 	evlist__for_each_cpu (evsel_list, i, cpu) {
 		affinity__set(&affinity, cpu);
···
 	}
 
 	if (STAT_RECORD) {
-		int err, fd = perf_data__fd(&perf_stat.data);
+		int fd = perf_data__fd(&perf_stat.data);
 
 		if (is_pipe) {
 			err = perf_header__write_pipe(perf_data__fd(&perf_stat.data));
···
 
 	if (forks) {
 		evlist__start_workload(evsel_list);
-		enable_counters();
+		err = enable_counters();
+		if (err)
+			return -1;
 
 		if (interval || timeout || evlist__ctlfd_initialized(evsel_list))
 			status = dispatch_events(forks, timeout, interval, &times);
···
 		if (WIFSIGNALED(status))
 			psignal(WTERMSIG(status), argv[0]);
 	} else {
-		enable_counters();
+		err = enable_counters();
+		if (err)
+			return -1;
 		status = dispatch_events(forks, timeout, interval, &times);
 	}
 
···
 		    "stat events on existing process id"),
 	OPT_STRING('t', "tid", &target.tid, "tid",
 		    "stat events on existing thread id"),
+#ifdef HAVE_BPF_SKEL
+	OPT_STRING('b', "bpf-prog", &target.bpf_str, "bpf-prog-id",
+		    "stat events on existing bpf program id"),
+#endif
 	OPT_BOOLEAN('a', "all-cpus", &target.system_wide,
 		    "system-wide collection from all CPUs"),
 	OPT_BOOLEAN('g', "group", &group,
···
 	OPT_BOOLEAN(0, "metric-no-merge", &stat_config.metric_no_merge,
 		       "don't try to share events between metrics in a group"),
 	OPT_BOOLEAN(0, "topdown", &topdown_run,
-			"measure topdown level 1 statistics"),
+			"measure top-down statistics"),
+	OPT_UINTEGER(0, "td-level", &stat_config.topdown_level,
+			"Set the metrics level for the top-down statistics (0: max level)"),
 	OPT_BOOLEAN(0, "smi-cost", &smi_cost,
 		    "measure SMI cost"),
 	OPT_CALLBACK('M', "metrics", &evsel_list, "metric/metric group list",
···
 	}
 
 	if (topdown_run) {
+		const char **metric_attrs = topdown_metric_attrs;
+		unsigned int max_level = 1;
 		char *str = NULL;
 		bool warn = false;
 
 		if (!force_metric_only)
 			stat_config.metric_only = true;
 
-		if (topdown_filter_events(topdown_metric_attrs, &str, 1) < 0) {
+		if (pmu_have_event("cpu", topdown_metric_L2_attrs[5])) {
+			metric_attrs = topdown_metric_L2_attrs;
+			max_level = 2;
+		}
+
+		if (stat_config.topdown_level > max_level) {
+			pr_err("Invalid top-down metrics level. The max level is %u.\n", max_level);
+			return -1;
+		} else if (!stat_config.topdown_level)
+			stat_config.topdown_level = max_level;
+
+		if (topdown_filter_events(metric_attrs, &str, 1) < 0) {
 			pr_err("Out of memory\n");
 			return -1;
 		}
-		if (topdown_metric_attrs[0] && str) {
+		if (metric_attrs[0] && str) {
 			if (!stat_config.interval && !stat_config.metric_only) {
 				fprintf(stat_config.output,
 					"Topdown accuracy may decrease when measuring long periods.\n"
···
 			return -1;
 		}
 		if (evlist__add_default_attrs(evsel_list, default_attrs1) < 0)
+			return -1;
+
+		if (arch_evlist__add_default_attrs(evsel_list) < 0)
 			return -1;
 	}
 
···
 		"perf stat [<options>] [<command>]",
 		NULL
 	};
-	int status = -EINVAL, run_idx;
+	int status = -EINVAL, run_idx, err;
 	const char *mode;
 	FILE *output = stderr;
 	unsigned int interval, timeout;
 	const char * const stat_subcommands[] = { "record", "report" };
+	char errbuf[BUFSIZ];
 
 	setlocale(LC_ALL, "");
···
 	} else if (big_num_opt == 0) /* User passed --no-big-num */
 		stat_config.big_num = false;
 
+	err = target__validate(&target);
+	if (err) {
+		target__strerror(&target, err, errbuf, BUFSIZ);
+		pr_warning("%s\n", errbuf);
+	}
+
 	setup_system_wide(argc);
 
 	/*
···
 			goto out;
 		}
 	}
-
-	target__validate(&target);
 
 	if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
 		target.per_thread = true;
···
 	 * tools remain  -acme
 	 */
 	int fd = perf_data__fd(&perf_stat.data);
-	int err = perf_event__synthesize_kernel_mmap((void *)&perf_stat,
-						     process_synthesized_event,
-						     &perf_stat.session->machines.host);
+
+	err = perf_event__synthesize_kernel_mmap((void *)&perf_stat,
+						 process_synthesized_event,
+						 &perf_stat.session->machines.host);
 	if (err) {
 		pr_warning("Couldn't synthesize the kernel mmap record, harmless, "
			   "older tools may produce warnings about this file\n.");
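The `--td-level` plumbing above reduces to a small decision: probe whether the PMU exposes the L2 topdown events, clamp the requested level against that maximum, and treat a request of 0 as "use the max". A minimal standalone sketch of that logic (the helper names here are illustrative stand-ins, not perf's actual API; perf probes with pmu_have_event() and stores the result in stat_config.topdown_level):

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for perf's pmu_have_event("cpu", ...) probe. */
static int cpu_has_l2_topdown(int simulated_support)
{
	return simulated_support;
}

/*
 * Mirror of the diff's level selection: the max level is 2 when the PMU
 * exposes the L2 topdown events, otherwise 1; a requested level of 0
 * means "use the max", and a request above the max is rejected (-1).
 */
static int pick_topdown_level(unsigned int requested, int simulated_l2_support)
{
	unsigned int max_level = cpu_has_l2_topdown(simulated_l2_support) ? 2 : 1;

	if (requested > max_level) {
		fprintf(stderr, "Invalid top-down metrics level. The max level is %u.\n",
			max_level);
		return -1;
	}
	return requested ? (int)requested : (int)max_level;
}
```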
+1
tools/perf/builtin.h
···
 int cmd_mem(int argc, const char **argv);
 int cmd_data(int argc, const char **argv);
 int cmd_ftrace(int argc, const char **argv);
+int cmd_daemon(int argc, const char **argv);
 
 int find_scripts(char **scripts_array, char **scripts_path_array, int num,
 		 int pathlen);
+1
tools/perf/command-list.txt
···
 perf-top			mainporcelain common
 perf-trace			mainporcelain audit
 perf-version			mainporcelain common
+perf-daemon			mainporcelain common
+1
tools/perf/perf.c
···
 	{ "mem",	cmd_mem,	0 },
 	{ "data",	cmd_data,	0 },
 	{ "ftrace",	cmd_ftrace,	0 },
+	{ "daemon",	cmd_daemon,	0 },
 };
 
 struct pager_config {
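Registering the new 'perf daemon' command is one more row in perf's dispatch table. A simplified sketch of how a subcommand name is matched to its handler (stub handlers and a trimmed-down struct, for illustration only):

```c
#include <assert.h>
#include <string.h>

/* Shape of perf's command table entries (simplified: no option flags). */
struct cmd_struct {
	const char *cmd;
	int (*fn)(int argc, const char **argv);
};

static int cmd_ftrace_stub(int argc, const char **argv)
{
	(void)argc; (void)argv;
	return 10; /* sentinel so the dispatch is observable */
}

static int cmd_daemon_stub(int argc, const char **argv)
{
	(void)argc; (void)argv;
	return 20;
}

static struct cmd_struct commands[] = {
	{ "ftrace", cmd_ftrace_stub },
	{ "daemon", cmd_daemon_stub },
};

/* Dispatch "perf <name>" by scanning the table, as perf.c does. */
static int run_command(const char *name, int argc, const char **argv)
{
	for (size_t i = 0; i < sizeof(commands) / sizeof(commands[0]); i++) {
		if (!strcmp(commands[i].cmd, name))
			return commands[i].fn(argc, argv);
	}
	return -1; /* unknown subcommand */
}
```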
+2 -6
tools/perf/pmu-events/arch/arm64/ampere/emag/branch.json
···
 		"ArchStdEvent": "BR_INDIRECT_SPEC"
 	},
 	{
-		"PublicDescription": "Mispredicted or not predicted branch speculatively executed",
-		"EventCode": "0x10",
-		"EventName": "BR_MIS_PRED",
+		"ArchStdEvent": "BR_MIS_PRED",
 		"BriefDescription": "Branch mispredicted"
 	},
 	{
-		"PublicDescription": "Predictable branch speculatively executed",
-		"EventCode": "0x12",
-		"EventName": "BR_PRED",
+		"ArchStdEvent": "BR_PRED",
 		"BriefDescription": "Predictable branch"
 	}
 ]
+1 -4
tools/perf/pmu-events/arch/arm64/ampere/emag/bus.json
···
 		"ArchStdEvent": "BUS_ACCESS_PERIPH"
 	},
 	{
-		"PublicDescription": "Bus access",
-		"EventCode": "0x19",
-		"EventName": "BUS_ACCESS",
-		"BriefDescription": "Bus access"
+		"ArchStdEvent": "BUS_ACCESS"
 	}
 ]
+14 -44
tools/perf/pmu-events/arch/arm64/ampere/emag/cache.json
···
 		"ArchStdEvent": "L2D_CACHE_INVAL"
 	},
 	{
-		"PublicDescription": "Level 1 instruction cache refill",
-		"EventCode": "0x01",
-		"EventName": "L1I_CACHE_REFILL",
-		"BriefDescription": "L1I cache refill"
+		"ArchStdEvent": "L1I_CACHE_REFILL"
 	},
 	{
-		"PublicDescription": "Level 1 instruction TLB refill",
-		"EventCode": "0x02",
-		"EventName": "L1I_TLB_REFILL",
-		"BriefDescription": "L1I TLB refill"
+		"ArchStdEvent": "L1I_TLB_REFILL"
 	},
 	{
-		"PublicDescription": "Level 1 data cache refill",
-		"EventCode": "0x03",
-		"EventName": "L1D_CACHE_REFILL",
-		"BriefDescription": "L1D cache refill"
+		"ArchStdEvent": "L1D_CACHE_REFILL"
 	},
 	{
-		"PublicDescription": "Level 1 data cache access",
-		"EventCode": "0x04",
-		"EventName": "L1D_CACHE_ACCESS",
-		"BriefDescription": "L1D cache access"
+		"ArchStdEvent": "L1D_CACHE"
 	},
 	{
-		"PublicDescription": "Level 1 data TLB refill",
-		"EventCode": "0x05",
-		"EventName": "L1D_TLB_REFILL",
-		"BriefDescription": "L1D TLB refill"
+		"ArchStdEvent": "L1D_TLB_REFILL"
 	},
 	{
-		"PublicDescription": "Level 1 instruction cache access",
-		"EventCode": "0x14",
-		"EventName": "L1I_CACHE_ACCESS",
-		"BriefDescription": "L1I cache access"
+		"ArchStdEvent": "L1I_CACHE"
 	},
 	{
-		"PublicDescription": "Level 2 data cache access",
-		"EventCode": "0x16",
-		"EventName": "L2D_CACHE_ACCESS",
-		"BriefDescription": "L2D cache access"
+		"ArchStdEvent": "L2D_CACHE"
 	},
 	{
-		"PublicDescription": "Level 2 data refill",
-		"EventCode": "0x17",
-		"EventName": "L2D_CACHE_REFILL",
-		"BriefDescription": "L2D cache refill"
+		"ArchStdEvent": "L2D_CACHE_REFILL"
 	},
 	{
-		"PublicDescription": "Level 2 data cache, Write-Back",
-		"EventCode": "0x18",
-		"EventName": "L2D_CACHE_WB",
-		"BriefDescription": "L2D cache Write-Back"
+		"ArchStdEvent": "L2D_CACHE_WB"
 	},
 	{
-		"PublicDescription": "Level 1 data TLB access. This event counts any load or store operation which accesses the data L1 TLB",
-		"EventCode": "0x25",
-		"EventName": "L1D_TLB_ACCESS",
+		"PublicDescription": "This event counts any load or store operation which accesses the data L1 TLB",
+		"ArchStdEvent": "L1D_TLB",
 		"BriefDescription": "L1D TLB access"
 	},
 	{
-		"PublicDescription": "Level 1 instruction TLB access. This event counts any instruction fetch which accesses the instruction L1 TLB",
-		"EventCode": "0x26",
-		"EventName": "L1I_TLB_ACCESS",
-		"BriefDescription": "L1I TLB access"
+		"PublicDescription": "This event counts any instruction fetch which accesses the instruction L1 TLB",
+		"ArchStdEvent": "L1I_TLB"
 	},
 	{
 		"PublicDescription": "Level 2 access to data TLB that caused a page table walk. This event counts on any data access which causes L2D_TLB_REFILL to count",
···
 		"PublicDescription": "Level 2 access to instruction TLB that caused a page table walk. This event counts on any instruction access which causes L2I_TLB_REFILL to count",
 		"EventCode": "0x35",
 		"EventName": "L2I_TLB_ACCESS",
-		"BriefDescription": "L2D TLB access"
+		"BriefDescription": "L2I TLB access"
 	},
 	{
 		"PublicDescription": "Branch target buffer misprediction",
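The per-CPU JSON conversions above and below replace literal EventCode/EventName entries with "ArchStdEvent" references, which the pmu-events build resolves against the common armv8 table added later in this series; any fields supplied locally (such as a shorter BriefDescription) are kept alongside the resolved ones. A rough sketch of that resolution rule, under the assumption that local fields override the common entry (the struct and table here are simplified stand-ins, not the real jevents parser):

```c
#include <assert.h>
#include <string.h>

/* Simplified event record; the real jevents parser tracks more fields. */
struct event {
	const char *name;	/* EventName */
	const char *code;	/* EventCode */
	const char *brief;	/* BriefDescription */
};

/* Entries standing in for armv8-common-and-microarch.json. */
static const struct event common_events[] = {
	{ "L1D_CACHE_REFILL", "0x03", "Level 1 data cache refill" },
	{ "L1D_CACHE",        "0x04", "Level 1 data cache access" },
};

/*
 * Resolve an "ArchStdEvent" reference: copy the event from the common
 * table, then let a per-CPU BriefDescription override the common one.
 */
static int resolve_arch_std_event(const char *arch_std,
				  const char *brief_override,
				  struct event *out)
{
	for (size_t i = 0; i < sizeof(common_events) / sizeof(common_events[0]); i++) {
		if (!strcmp(common_events[i].name, arch_std)) {
			*out = common_events[i];
			if (brief_override)
				out->brief = brief_override;
			return 0;
		}
	}
	return -1; /* unknown ArchStdEvent */
}
```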
+1 -3
tools/perf/pmu-events/arch/arm64/ampere/emag/clock.json
···
 [
 	{
 		"PublicDescription": "The number of core clock cycles",
-		"EventCode": "0x11",
-		"EventName": "CPU_CYCLES",
-		"BriefDescription": "Clock cycles"
+		"ArchStdEvent": "CPU_CYCLES"
 	},
 	{
 		"PublicDescription": "FSU clocking gated off cycle",
+2 -8
tools/perf/pmu-events/arch/arm64/ampere/emag/exception.json
···
 		"ArchStdEvent": "EXC_TRAP_FIQ"
 	},
 	{
-		"PublicDescription": "Exception taken",
-		"EventCode": "0x09",
-		"EventName": "EXC_TAKEN",
-		"BriefDescription": "Exception taken"
+		"ArchStdEvent": "EXC_TAKEN"
 	},
 	{
-		"PublicDescription": "Instruction architecturally executed, condition check pass, exception return",
-		"EventCode": "0x0a",
-		"EventName": "EXC_RETURN",
-		"BriefDescription": "Exception return"
+		"ArchStdEvent": "EXC_RETURN"
 	}
 ]
+9 -25
tools/perf/pmu-events/arch/arm64/ampere/emag/instruction.json
···
 	},
 	{
 		"PublicDescription": "Instruction architecturally executed, software increment",
-		"EventCode": "0x00",
-		"EventName": "SW_INCR",
+		"ArchStdEvent": "SW_INCR",
 		"BriefDescription": "Software increment"
 	},
 	{
-		"PublicDescription": "Instruction architecturally executed",
-		"EventCode": "0x08",
-		"EventName": "INST_RETIRED",
-		"BriefDescription": "Instruction retired"
+		"ArchStdEvent": "INST_RETIRED"
 	},
 	{
-		"PublicDescription": "Instruction architecturally executed, condition code check pass, write to CONTEXTIDR",
-		"EventCode": "0x0b",
-		"EventName": "CID_WRITE_RETIRED",
+		"ArchStdEvent": "CID_WRITE_RETIRED",
 		"BriefDescription": "Write to CONTEXTIDR"
 	},
 	{
-		"PublicDescription": "Operation speculatively executed",
-		"EventCode": "0x1b",
-		"EventName": "INST_SPEC",
-		"BriefDescription": "Speculatively executed"
+		"ArchStdEvent": "INST_SPEC"
 	},
 	{
-		"PublicDescription": "Instruction architecturally executed (condition check pass), write to TTBR",
-		"EventCode": "0x1c",
-		"EventName": "TTBR_WRITE_RETIRED",
-		"BriefDescription": "Instruction executed, TTBR write"
+		"ArchStdEvent": "TTBR_WRITE_RETIRED"
 	},
 	{
-		"PublicDescription": "Instruction architecturally executed, branch. This event counts all branches, taken or not. This excludes exception entries, debug entries and CCFAIL branches",
-		"EventCode": "0x21",
-		"EventName": "BR_RETIRED",
-		"BriefDescription": "Branch retired"
+		"PublicDescription": "This event counts all branches, taken or not. This excludes exception entries, debug entries and CCFAIL branches",
+		"ArchStdEvent": "BR_RETIRED"
 	},
 	{
-		"PublicDescription": "Instruction architecturally executed, mispredicted branch. This event counts any branch counted by BR_RETIRED which is not correctly predicted and causes a pipeline flush",
-		"EventCode": "0x22",
-		"EventName": "BR_MISPRED_RETIRED",
-		"BriefDescription": "Mispredicted branch retired"
+		"PublicDescription": "This event counts any branch counted by BR_RETIRED which is not correctly predicted and causes a pipeline flush",
+		"ArchStdEvent": "BR_MIS_PRED_RETIRED"
 	},
 	{
 		"PublicDescription": "Operation speculatively executed, NOP",
+3 -8
tools/perf/pmu-events/arch/arm64/ampere/emag/memory.json
···
 		"ArchStdEvent": "UNALIGNED_LDST_SPEC"
 	},
 	{
-		"PublicDescription": "Data memory access",
-		"EventCode": "0x13",
-		"EventName": "MEM_ACCESS",
-		"BriefDescription": "Memory access"
+		"ArchStdEvent": "MEM_ACCESS"
 	},
 	{
-		"PublicDescription": "Local memory error. This event counts any correctable or uncorrectable memory error (ECC or parity) in the protected core RAMs",
-		"EventCode": "0x1a",
-		"EventName": "MEM_ERROR",
-		"BriefDescription": "Memory error"
+		"PublicDescription": "This event counts any correctable or uncorrectable memory error (ECC or parity) in the protected core RAMs",
+		"ArchStdEvent": "MEMORY_ERROR"
 	}
 ]
+4 -8
tools/perf/pmu-events/arch/arm64/arm/cortex-a76-n1/branch.json
···
 [
 	{
-		"PublicDescription": "Mispredicted or not predicted branch speculatively executed. This event counts any predictable branch instruction which is mispredicted either due to dynamic misprediction or because the MMU is off and the branches are statically predicted not taken.",
-		"EventCode": "0x10",
-		"EventName": "BR_MIS_PRED",
-		"BriefDescription": "Mispredicted or not predicted branch speculatively executed."
+		"PublicDescription": "This event counts any predictable branch instruction which is mispredicted either due to dynamic misprediction or because the MMU is off and the branches are statically predicted not taken",
+		"ArchStdEvent": "BR_MIS_PRED"
 	},
 	{
-		"PublicDescription": "Predictable branch speculatively executed. This event counts all predictable branches.",
-		"EventCode": "0x12",
-		"EventName": "BR_PRED",
-		"BriefDescription": "Predictable branch speculatively executed."
+		"PublicDescription": "This event counts all predictable branches.",
+		"ArchStdEvent": "BR_PRED"
 	}
 ]
+8 -11
tools/perf/pmu-events/arch/arm64/arm/cortex-a76-n1/bus.json
···
 [
 	{
-		"EventCode": "0x11",
-		"EventName": "CPU_CYCLES",
+		"PublicDescription": "The number of core clock cycles",
+		"ArchStdEvent": "CPU_CYCLES",
 		"BriefDescription": "The number of core clock cycles."
 	},
 	{
-		"PublicDescription": "Bus access. This event counts for every beat of data transferred over the data channels between the core and the SCU. If both read and write data beats are transferred on a given cycle, this event is counted twice on that cycle. This event counts the sum of BUS_ACCESS_RD and BUS_ACCESS_WR.",
-		"EventCode": "0x19",
-		"EventName": "BUS_ACCESS",
-		"BriefDescription": "Bus access."
+		"PublicDescription": "This event counts for every beat of data transferred over the data channels between the core and the SCU. If both read and write data beats are transferred on a given cycle, this event is counted twice on that cycle. This event counts the sum of BUS_ACCESS_RD and BUS_ACCESS_WR.",
+		"ArchStdEvent": "BUS_ACCESS"
 	},
 	{
-		"EventCode": "0x1D",
-		"EventName": "BUS_CYCLES",
-		"BriefDescription": "Bus cycles. This event duplicates CPU_CYCLES."
+		"PublicDescription": "This event duplicates CPU_CYCLES.",
+		"ArchStdEvent": "BUS_CYCLES"
 	},
 	{
 		"ArchStdEvent": "BUS_ACCESS_RD"
 	},
 	{
 		"ArchStdEvent": "BUS_ACCESS_WR"
 	}
 ]
+40 -78
tools/perf/pmu-events/arch/arm64/arm/cortex-a76-n1/cache.json
···
 [
 	{
-		"PublicDescription": "L1 instruction cache refill. This event counts any instruction fetch which misses in the cache.",
-		"EventCode": "0x01",
-		"EventName": "L1I_CACHE_REFILL",
-		"BriefDescription": "L1 instruction cache refill"
+		"PublicDescription": "This event counts any instruction fetch which misses in the cache.",
+		"ArchStdEvent": "L1I_CACHE_REFILL"
 	},
 	{
-		"PublicDescription": "L1 instruction TLB refill. This event counts any refill of the instruction L1 TLB from the L2 TLB. This includes refills that result in a translation fault.",
-		"EventCode": "0x02",
-		"EventName": "L1I_TLB_REFILL",
-		"BriefDescription": "L1 instruction TLB refill"
+		"PublicDescription": "This event counts any refill of the instruction L1 TLB from the L2 TLB. This includes refills that result in a translation fault.",
+		"ArchStdEvent": "L1I_TLB_REFILL"
 	},
 	{
-		"PublicDescription": "L1 data cache refill. This event counts any load or store operation or page table walk access which causes data to be read from outside the L1, including accesses which do not allocate into L1.",
-		"EventCode": "0x03",
-		"EventName": "L1D_CACHE_REFILL",
-		"BriefDescription": "L1 data cache refill"
+		"PublicDescription": "This event counts any load or store operation or page table walk access which causes data to be read from outside the L1, including accesses which do not allocate into L1.",
+		"ArchStdEvent": "L1D_CACHE_REFILL"
 	},
 	{
-		"PublicDescription": "L1 data cache access. This event counts any load or store operation or page table walk access which looks up in the L1 data cache. In particular, any access which could count the L1D_CACHE_REFILL event causes this event to count.",
-		"EventCode": "0x04",
-		"EventName": "L1D_CACHE",
-		"BriefDescription": "L1 data cache access"
+		"PublicDescription": "This event counts any load or store operation or page table walk access which looks up in the L1 data cache. In particular, any access which could count the L1D_CACHE_REFILL event causes this event to count.",
+		"ArchStdEvent": "L1D_CACHE"
 	},
 	{
-		"PublicDescription": "L1 data TLB refill. This event counts any refill of the data L1 TLB from the L2 TLB. This includes refills that result in a translation fault.",
-		"EventCode": "0x05",
-		"EventName": "L1D_TLB_REFILL",
-		"BriefDescription": "L1 data TLB refill"
+		"PublicDescription": "This event counts any refill of the data L1 TLB from the L2 TLB. This includes refills that result in a translation fault.",
+		"ArchStdEvent": "L1D_TLB_REFILL"
 	},
 	{
 		"PublicDescription": "Level 1 instruction cache access or Level 0 Macro-op cache access. This event counts any instruction fetch which accesses the L1 instruction cache or L0 Macro-op cache.",
-		"EventCode": "0x14",
-		"EventName": "L1I_CACHE",
-		"BriefDescription": "L1 instruction cache access"
+		"ArchStdEvent": "L1I_CACHE"
 	},
 	{
-		"PublicDescription": "L1 data cache Write-Back. This event counts any write-back of data from the L1 data cache to L2 or L3. This counts both victim line evictions and snoops, including cache maintenance operations.",
-		"EventCode": "0x15",
-		"EventName": "L1D_CACHE_WB",
-		"BriefDescription": "L1 data cache Write-Back"
+		"PublicDescription": "This event counts any write-back of data from the L1 data cache to L2 or L3. This counts both victim line evictions and snoops, including cache maintenance operations.",
+		"ArchStdEvent": "L1D_CACHE_WB"
 	},
 	{
-		"PublicDescription": "L2 data cache access. This event counts any transaction from L1 which looks up in the L2 cache, and any write-back from the L1 to the L2. Snoops from outside the core and cache maintenance operations are not counted.",
-		"EventCode": "0x16",
-		"EventName": "L2D_CACHE",
-		"BriefDescription": "L2 data cache access"
+		"PublicDescription": "This event counts any transaction from L1 which looks up in the L2 cache, and any write-back from the L1 to the L2. Snoops from outside the core and cache maintenance operations are not counted.",
+		"ArchStdEvent": "L2D_CACHE"
 	},
 	{
 		"PublicDescription": "L2 data cache refill. This event counts any cacheable transaction from L1 which causes data to be read from outside the core. L2 refills caused by stashes into L2 should not be counted",
-		"EventCode": "0x17",
-		"EventName": "L2D_CACHE_REFILL",
-		"BriefDescription": "L2 data cache refill"
+		"ArchStdEvent": "L2D_CACHE_REFILL"
 	},
 	{
-		"PublicDescription": "L2 data cache write-back. This event counts any write-back of data from the L2 cache to outside the core. This includes snoops to the L2 which return data, regardless of whether they cause an invalidation. Invalidations from the L2 which do not write data outside of the core and snoops which return data from the L1 are not counted",
-		"EventCode": "0x18",
-		"EventName": "L2D_CACHE_WB",
-		"BriefDescription": "L2 data cache write-back"
+		"PublicDescription": "This event counts any write-back of data from the L2 cache to outside the core. This includes snoops to the L2 which return data, regardless of whether they cause an invalidation. Invalidations from the L2 which do not write data outside of the core and snoops which return data from the L1 are not counted",
+		"ArchStdEvent": "L2D_CACHE_WB"
 	},
 	{
-		"PublicDescription": "L2 data cache allocation without refill. This event counts any full cache line write into the L2 cache which does not cause a linefill, including write-backs from L1 to L2 and full-line writes which do not allocate into L1.",
-		"EventCode": "0x20",
-		"EventName": "L2D_CACHE_ALLOCATE",
-		"BriefDescription": "L2 data cache allocation without refill"
+		"PublicDescription": "This event counts any full cache line write into the L2 cache which does not cause a linefill, including write-backs from L1 to L2 and full-line writes which do not allocate into L1.",
+		"ArchStdEvent": "L2D_CACHE_ALLOCATE"
 	},
 	{
-		"PublicDescription": "Level 1 data TLB access. This event counts any load or store operation which accesses the data L1 TLB. If both a load and a store are executed on a cycle, this event counts twice. This event counts regardless of whether the MMU is enabled.",
-		"EventCode": "0x25",
-		"EventName": "L1D_TLB",
+		"PublicDescription": "This event counts any load or store operation which accesses the data L1 TLB. If both a load and a store are executed on a cycle, this event counts twice. This event counts regardless of whether the MMU is enabled.",
+		"ArchStdEvent": "L1D_TLB",
 		"BriefDescription": "Level 1 data TLB access."
 	},
 	{
-		"PublicDescription": "Level 1 instruction TLB access. This event counts any instruction fetch which accesses the instruction L1 TLB. This event counts regardless of whether the MMU is enabled.",
-		"EventCode": "0x26",
-		"EventName": "L1I_TLB",
+		"PublicDescription": "This event counts any instruction fetch which accesses the instruction L1 TLB. This event counts regardless of whether the MMU is enabled.",
+		"ArchStdEvent": "L1I_TLB",
 		"BriefDescription": "Level 1 instruction TLB access"
 	},
 	{
 		"PublicDescription": "This event counts any full cache line write into the L3 cache which does not cause a linefill, including write-backs from L2 to L3 and full-line writes which do not allocate into L2",
-		"EventCode": "0x29",
-		"EventName": "L3D_CACHE_ALLOCATE",
+		"ArchStdEvent": "L3D_CACHE_ALLOCATE",
 		"BriefDescription": "Allocation without refill"
 	},
 	{
-		"PublicDescription": "Attributable Level 3 unified cache refill. This event counts for any cacheable read transaction returning data from the SCU for which the data source was outside the cluster. Transactions such as ReadUnique are counted here as 'read' transactions, even though they can be generated by store instructions.",
-		"EventCode": "0x2A",
-		"EventName": "L3D_CACHE_REFILL",
+		"PublicDescription": "This event counts for any cacheable read transaction returning data from the SCU for which the data source was outside the cluster. Transactions such as ReadUnique are counted here as 'read' transactions, even though they can be generated by store instructions.",
+		"ArchStdEvent": "L3D_CACHE_REFILL",
 		"BriefDescription": "Attributable Level 3 unified cache refill."
 	},
 	{
-		"PublicDescription": "Attributable Level 3 unified cache access. This event counts for any cacheable read transaction returning data from the SCU, or for any cacheable write to the SCU.",
-		"EventCode": "0x2B",
-		"EventName": "L3D_CACHE",
+		"PublicDescription": "This event counts for any cacheable read transaction returning data from the SCU, or for any cacheable write to the SCU.",
+		"ArchStdEvent": "L3D_CACHE",
 		"BriefDescription": "Attributable Level 3 unified cache access."
 	},
 	{
-		"PublicDescription": "Attributable L2 data or unified TLB refill. This event counts on any refill of the L2 TLB, caused by either an instruction or data access. This event does not count if the MMU is disabled.",
-		"EventCode": "0x2D",
-		"EventName": "L2D_TLB_REFILL",
+		"PublicDescription": "This event counts on any refill of the L2 TLB, caused by either an instruction or data access. This event does not count if the MMU is disabled.",
+		"ArchStdEvent": "L2D_TLB_REFILL",
 		"BriefDescription": "Attributable L2 data or unified TLB refill"
 	},
 	{
-		"PublicDescription": "Attributable L2 data or unified TLB access. This event counts on any access to the L2 TLB (caused by a refill of any of the L1 TLBs). This event does not count if the MMU is disabled.",
-		"EventCode": "0x2F",
-		"EventName": "L2D_TLB",
-		"BriefDescription": "Attributable L2 data or unified TLB access"
+		"PublicDescription": "This event counts on any access to the L2 TLB (caused by a refill of any of the L1 TLBs). This event does not count if the MMU is disabled.",
+		"ArchStdEvent": "L2D_TLB"
 	},
 	{
-		"PublicDescription": "Access to data TLB that caused a page table walk. This event counts on any data access which causes L2D_TLB_REFILL to count.",
-		"EventCode": "0x34",
-		"EventName": "DTLB_WALK",
-		"BriefDescription": "Access to data TLB that caused a page table walk."
+		"PublicDescription": "This event counts on any data access which causes L2D_TLB_REFILL to count.",
+		"ArchStdEvent": "DTLB_WALK"
 	},
 	{
-		"PublicDescription": "Access to instruction TLB that caused a page table walk. This event counts on any instruction access which causes L2D_TLB_REFILL to count.",
-		"EventCode": "0x35",
-		"EventName": "ITLB_WALK",
-		"BriefDescription": "Access to instruction TLB that caused a page table walk."
+		"PublicDescription": "This event counts on any instruction access which causes L2D_TLB_REFILL to count.",
+		"ArchStdEvent": "ITLB_WALK"
 	},
 	{
-		"EventCode": "0x36",
-		"EventName": "LL_CACHE_RD",
-		"BriefDescription": "Last level cache access, read"
+		"ArchStdEvent": "LL_CACHE_RD"
 	},
 	{
-		"EventCode": "0x37",
-		"EventName": "LL_CACHE_MISS_RD",
-		"BriefDescription": "Last level cache miss, read"
+		"ArchStdEvent": "LL_CACHE_MISS_RD"
 	},
 	{
 		"ArchStdEvent": "L1D_CACHE_INVAL"
+3 -7
tools/perf/pmu-events/arch/arm64/arm/cortex-a76-n1/exception.json
···
 [
 	{
-		"EventCode": "0x09",
-		"EventName": "EXC_TAKEN",
-		"BriefDescription": "Exception taken."
+		"ArchStdEvent": "EXC_TAKEN"
 	},
 	{
-		"PublicDescription": "Local memory error. This event counts any correctable or uncorrectable memory error (ECC or parity) in the protected core RAMs",
-		"EventCode": "0x1A",
-		"EventName": "MEMORY_ERROR",
-		"BriefDescription": "Local memory error."
+		"PublicDescription": "This event counts any correctable or uncorrectable memory error (ECC or parity) in the protected core RAMs",
+		"ArchStdEvent": "MEMORY_ERROR"
 	},
 	{
 		"ArchStdEvent": "EXC_DABORT"
+15 -32
tools/perf/pmu-events/arch/arm64/arm/cortex-a76-n1/instruction.json
···
 [
 	{
-		"PublicDescription": "Software increment. Instruction architecturally executed (condition code check pass).",
-		"EventCode": "0x00",
-		"EventName": "SW_INCR",
-		"BriefDescription": "Software increment."
+		"ArchStdEvent": "SW_INCR"
 	},
 	{
-		"PublicDescription": "Instruction architecturally executed. This event counts all retired instructions, including those that fail their condition check.",
-		"EventCode": "0x08",
-		"EventName": "INST_RETIRED",
-		"BriefDescription": "Instruction architecturally executed."
+		"PublicDescription": "This event counts all retired instructions, including those that fail their condition check.",
+		"ArchStdEvent": "INST_RETIRED"
 	},
 	{
-		"EventCode": "0x0A",
-		"EventName": "EXC_RETURN",
-		"BriefDescription": "Instruction architecturally executed, condition code check pass, exception return."
+		"ArchStdEvent": "EXC_RETURN"
 	},
 	{
-		"PublicDescription": "Instruction architecturally executed, condition code check pass, write to CONTEXTIDR. This event only counts writes to CONTEXTIDR in AArch32 state, and via the CONTEXTIDR_EL1 mnemonic in AArch64 state.",
-		"EventCode": "0x0B",
-		"EventName": "CID_WRITE_RETIRED",
-		"BriefDescription": "Instruction architecturally executed, condition code check pass, write to CONTEXTIDR."
+		"PublicDescription": "This event only counts writes to CONTEXTIDR in AArch32 state, and via the CONTEXTIDR_EL1 mnemonic in AArch64 state.",
+		"ArchStdEvent": "CID_WRITE_RETIRED"
 	},
 	{
-		"EventCode": "0x1B",
-		"EventName": "INST_SPEC",
-		"BriefDescription": "Operation speculatively executed"
+		"ArchStdEvent": "INST_SPEC"
 	},
 	{
-		"PublicDescription": "Instruction architecturally executed, condition code check pass, write to TTBR. This event only counts writes to TTBR0/TTBR1 in AArch32 state and TTBR0_EL1/TTBR1_EL1 in AArch64 state.",
-		"EventCode": "0x1C",
-		"EventName": "TTBR_WRITE_RETIRED",
-		"BriefDescription": "Instruction architecturally executed, condition code check pass, write to TTBR"
+		"PublicDescription": "This event only counts writes to TTBR0/TTBR1 in AArch32 state and TTBR0_EL1/TTBR1_EL1 in AArch64 state.",
+		"ArchStdEvent": "TTBR_WRITE_RETIRED"
 	},
 	{
-		"PublicDescription": "Instruction architecturally executed, branch. This event counts all branches, taken or not. This excludes exception entries, debug entries and CCFAIL branches.",
-		"EventCode": "0x21",
-		"EventName": "BR_RETIRED",
-		"BriefDescription": "Instruction architecturally executed, branch."
+		"PublicDescription": "This event counts all branches, taken or not. This excludes exception entries, debug entries and CCFAIL branches.",
+		"ArchStdEvent": "BR_RETIRED"
 	},
 	{
-		"PublicDescription": "Instruction architecturally executed, mispredicted branch. This event counts any branch counted by BR_RETIRED which is not correctly predicted and causes a pipeline flush.",
-		"EventCode": "0x22",
-		"EventName": "BR_MIS_PRED_RETIRED",
-		"BriefDescription": "Instruction architecturally executed, mispredicted branch."
+		"PublicDescription": "This event counts any branch counted by BR_RETIRED which is not correctly predicted and causes a pipeline flush.",
+		"ArchStdEvent": "BR_MIS_PRED_RETIRED"
 	},
 	{
 		"ArchStdEvent": "ASE_SPEC"
+2 -4
tools/perf/pmu-events/arch/arm64/arm/cortex-a76-n1/memory.json
···
 [
 	{
-		"PublicDescription": "Data memory access. This event counts memory accesses due to load or store instructions. This event counts the sum of MEM_ACCESS_RD and MEM_ACCESS_WR.",
-		"EventCode": "0x13",
-		"EventName": "MEM_ACCESS",
-		"BriefDescription": "Data memory access"
+		"PublicDescription": "This event counts memory accesses due to load or store instructions. This event counts the sum of MEM_ACCESS_RD and MEM_ACCESS_WR.",
+		"ArchStdEvent": "MEM_ACCESS"
 	},
 	{
 		"ArchStdEvent": "MEM_ACCESS_RD"
+1 -3
tools/perf/pmu-events/arch/arm64/arm/cortex-a76-n1/other.json
···
  [
      {
-         "EventCode": "0x31",
-         "EventName": "REMOTE_ACCESS",
-         "BriefDescription": "Access to another socket in a multi-socket system"
+         "ArchStdEvent": "REMOTE_ACCESS"
      }
  ]
+4 -8
tools/perf/pmu-events/arch/arm64/arm/cortex-a76-n1/pipeline.json
···
  [
      {
-         "PublicDescription": "No operation issued because of the frontend. The counter counts on any cycle when there are no fetched instructions available to dispatch.",
-         "EventCode": "0x23",
-         "EventName": "STALL_FRONTEND",
-         "BriefDescription": "No operation issued because of the frontend."
+         "PublicDescription": "The counter counts on any cycle when there are no fetched instructions available to dispatch.",
+         "ArchStdEvent": "STALL_FRONTEND"
      },
      {
-         "PublicDescription": "No operation issued because of the backend. The counter counts on any cycle fetched instructions are not dispatched due to resource constraints.",
-         "EventCode": "0x24",
-         "EventName": "STALL_BACKEND",
-         "BriefDescription": "No operation issued because of the backend."
+         "PublicDescription": "The counter counts on any cycle fetched instructions are not dispatched due to resource constraints.",
+         "ArchStdEvent": "STALL_BACKEND"
      }
  ]
+248
tools/perf/pmu-events/arch/arm64/armv8-common-and-microarch.json
···
+ [
+     {
+         "PublicDescription": "Instruction architecturally executed, Condition code check pass, software increment",
+         "EventCode": "0x00",
+         "EventName": "SW_INCR",
+         "BriefDescription": "Instruction architecturally executed, Condition code check pass, software increment"
+     },
+     {
+         "PublicDescription": "Level 1 instruction cache refill",
+         "EventCode": "0x01",
+         "EventName": "L1I_CACHE_REFILL",
+         "BriefDescription": "Level 1 instruction cache refill"
+     },
+     {
+         "PublicDescription": "Attributable Level 1 instruction TLB refill",
+         "EventCode": "0x02",
+         "EventName": "L1I_TLB_REFILL",
+         "BriefDescription": "Attributable Level 1 instruction TLB refill"
+     },
+     {
+         "PublicDescription": "Level 1 data cache refill",
+         "EventCode": "0x03",
+         "EventName": "L1D_CACHE_REFILL",
+         "BriefDescription": "Level 1 data cache refill"
+     },
+     {
+         "PublicDescription": "Level 1 data cache access",
+         "EventCode": "0x04",
+         "EventName": "L1D_CACHE",
+         "BriefDescription": "Level 1 data cache access"
+     },
+     {
+         "PublicDescription": "Attributable Level 1 data TLB refill",
+         "EventCode": "0x05",
+         "EventName": "L1D_TLB_REFILL",
+         "BriefDescription": "Attributable Level 1 data TLB refill"
+     },
+     {
+         "PublicDescription": "Instruction architecturally executed",
+         "EventCode": "0x08",
+         "EventName": "INST_RETIRED",
+         "BriefDescription": "Instruction architecturally executed"
+     },
+     {
+         "PublicDescription": "Exception taken",
+         "EventCode": "0x09",
+         "EventName": "EXC_TAKEN",
+         "BriefDescription": "Exception taken"
+     },
+     {
+         "PublicDescription": "Instruction architecturally executed, condition check pass, exception return",
+         "EventCode": "0x0a",
+         "EventName": "EXC_RETURN",
+         "BriefDescription": "Instruction architecturally executed, condition check pass, exception return"
+     },
+     {
+         "PublicDescription": "Instruction architecturally executed, condition code check pass, write to CONTEXTIDR",
+         "EventCode": "0x0b",
+         "EventName": "CID_WRITE_RETIRED",
+         "BriefDescription": "Instruction architecturally executed, condition code check pass, write to CONTEXTIDR"
+     },
+     {
+         "PublicDescription": "Mispredicted or not predicted branch speculatively executed",
+         "EventCode": "0x10",
+         "EventName": "BR_MIS_PRED",
+         "BriefDescription": "Mispredicted or not predicted branch speculatively executed"
+     },
+     {
+         "PublicDescription": "Cycle",
+         "EventCode": "0x11",
+         "EventName": "CPU_CYCLES",
+         "BriefDescription": "Cycle"
+     },
+     {
+         "PublicDescription": "Predictable branch speculatively executed",
+         "EventCode": "0x12",
+         "EventName": "BR_PRED",
+         "BriefDescription": "Predictable branch speculatively executed"
+     },
+     {
+         "PublicDescription": "Data memory access",
+         "EventCode": "0x13",
+         "EventName": "MEM_ACCESS",
+         "BriefDescription": "Data memory access"
+     },
+     {
+         "PublicDescription": "Attributable Level 1 instruction cache access",
+         "EventCode": "0x14",
+         "EventName": "L1I_CACHE",
+         "BriefDescription": "Attributable Level 1 instruction cache access"
+     },
+     {
+         "PublicDescription": "Attributable Level 1 data cache write-back",
+         "EventCode": "0x15",
+         "EventName": "L1D_CACHE_WB",
+         "BriefDescription": "Attributable Level 1 data cache write-back"
+     },
+     {
+         "PublicDescription": "Level 2 data cache access",
+         "EventCode": "0x16",
+         "EventName": "L2D_CACHE",
+         "BriefDescription": "Level 2 data cache access"
+     },
+     {
+         "PublicDescription": "Level 2 data refill",
+         "EventCode": "0x17",
+         "EventName": "L2D_CACHE_REFILL",
+         "BriefDescription": "Level 2 data refill"
+     },
+     {
+         "PublicDescription": "Attributable Level 2 data cache write-back",
+         "EventCode": "0x18",
+         "EventName": "L2D_CACHE_WB",
+         "BriefDescription": "Attributable Level 2 data cache write-back"
+     },
+     {
+         "PublicDescription": "Attributable Bus access",
+         "EventCode": "0x19",
+         "EventName": "BUS_ACCESS",
+         "BriefDescription": "Attributable Bus access"
+     },
+     {
+         "PublicDescription": "Local memory error",
+         "EventCode": "0x1a",
+         "EventName": "MEMORY_ERROR",
+         "BriefDescription": "Local memory error"
+     },
+     {
+         "PublicDescription": "Operation speculatively executed",
+         "EventCode": "0x1b",
+         "EventName": "INST_SPEC",
+         "BriefDescription": "Operation speculatively executed"
+     },
+     {
+         "PublicDescription": "Instruction architecturally executed, Condition code check pass, write to TTBR",
+         "EventCode": "0x1c",
+         "EventName": "TTBR_WRITE_RETIRED",
+         "BriefDescription": "Instruction architecturally executed, Condition code check pass, write to TTBR"
+     },
+     {
+         "PublicDescription": "Bus cycle",
+         "EventCode": "0x1D",
+         "EventName": "BUS_CYCLES",
+         "BriefDescription": "Bus cycle"
+     },
+     {
+         "PublicDescription": "Attributable Level 2 data cache allocation without refill",
+         "EventCode": "0x20",
+         "EventName": "L2D_CACHE_ALLOCATE",
+         "BriefDescription": "Attributable Level 2 data cache allocation without refill"
+     },
+     {
+         "PublicDescription": "Instruction architecturally executed, branch",
+         "EventCode": "0x21",
+         "EventName": "BR_RETIRED",
+         "BriefDescription": "Instruction architecturally executed, branch"
+     },
+     {
+         "PublicDescription": "Instruction architecturally executed, mispredicted branch",
+         "EventCode": "0x22",
+         "EventName": "BR_MIS_PRED_RETIRED",
+         "BriefDescription": "Instruction architecturally executed, mispredicted branch"
+     },
+     {
+         "PublicDescription": "No operation issued because of the frontend",
+         "EventCode": "0x23",
+         "EventName": "STALL_FRONTEND",
+         "BriefDescription": "No operation issued because of the frontend"
+     },
+     {
+         "PublicDescription": "No operation issued due to the backend",
+         "EventCode": "0x24",
+         "EventName": "STALL_BACKEND",
+         "BriefDescription": "No operation issued due to the backend"
+     },
+     {
+         "PublicDescription": "Attributable Level 1 data or unified TLB access",
+         "EventCode": "0x25",
+         "EventName": "L1D_TLB",
+         "BriefDescription": "Attributable Level 1 data or unified TLB access"
+     },
+     {
+         "PublicDescription": "Attributable Level 1 instruction TLB access",
+         "EventCode": "0x26",
+         "EventName": "L1I_TLB",
+         "BriefDescription": "Attributable Level 1 instruction TLB access"
+     },
+     {
+         "PublicDescription": "Attributable Level 3 data cache allocation without refill",
+         "EventCode": "0x29",
+         "EventName": "L3D_CACHE_ALLOCATE",
+         "BriefDescription": "Attributable Level 3 data cache allocation without refill"
+     },
+     {
+         "PublicDescription": "Attributable Level 3 data cache refill",
+         "EventCode": "0x2A",
+         "EventName": "L3D_CACHE_REFILL",
+         "BriefDescription": "Attributable Level 3 data cache refill"
+     },
+     {
+         "PublicDescription": "Attributable Level 3 data cache access",
+         "EventCode": "0x2B",
+         "EventName": "L3D_CACHE",
+         "BriefDescription": "Attributable Level 3 data cache access"
+     },
+     {
+         "PublicDescription": "Attributable Level 2 data TLB refill",
+         "EventCode": "0x2D",
+         "EventName": "L2D_TLB_REFILL",
+         "BriefDescription": "Attributable Level 2 data TLB refill"
+     },
+     {
+         "PublicDescription": "Attributable Level 2 data or unified TLB access",
+         "EventCode": "0x2F",
+         "EventName": "L2D_TLB",
+         "BriefDescription": "Attributable Level 2 data or unified TLB access"
+     },
+     {
+         "PublicDescription": "Access to another socket in a multi-socket system",
+         "EventCode": "0x31",
+         "EventName": "REMOTE_ACCESS",
+         "BriefDescription": "Access to another socket in a multi-socket system"
+     },
+     {
+         "PublicDescription": "Access to data TLB causes a translation table walk",
+         "EventCode": "0x34",
+         "EventName": "DTLB_WALK",
+         "BriefDescription": "Access to data TLB causes a translation table walk"
+     },
+     {
+         "PublicDescription": "Access to instruction TLB that causes a translation table walk",
+         "EventCode": "0x35",
+         "EventName": "ITLB_WALK",
+         "BriefDescription": "Access to instruction TLB that causes a translation table walk"
+     },
+     {
+         "PublicDescription": "Attributable Last level cache memory read",
+         "EventCode": "0x36",
+         "EventName": "LL_CACHE_RD",
+         "BriefDescription": "Attributable Last level cache memory read"
+     },
+     {
+         "PublicDescription": "Last level cache miss, read",
+         "EventCode": "0x37",
+         "EventName": "LL_CACHE_MISS_RD",
+         "BriefDescription": "Last level cache miss, read"
+     }
+ ]
+2 -2
tools/perf/pmu-events/arch/arm64/freescale/imx8mm/sys/metrics.json
···
      "ScaleUnit": "9.765625e-4KB",
      "Unit": "imx8_ddr",
      "Compat": "i.MX8MM"
-    },
+     },
      {
          "BriefDescription": "bytes all masters write to ddr based on write-cycles event",
          "MetricName": "imx8mm_ddr_write.all",
···
      "ScaleUnit": "9.765625e-4KB",
      "Unit": "imx8_ddr",
      "Compat": "i.MX8MM"
-    }
+     }
  ]
+37
tools/perf/pmu-events/arch/arm64/freescale/imx8mn/sys/ddrc.json
···
+ [
+     {
+         "BriefDescription": "ddr cycles event",
+         "EventCode": "0x00",
+         "EventName": "imx8mn_ddr.cycles",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MN"
+     },
+     {
+         "BriefDescription": "ddr read-cycles event",
+         "EventCode": "0x2a",
+         "EventName": "imx8mn_ddr.read_cycles",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MN"
+     },
+     {
+         "BriefDescription": "ddr write-cycles event",
+         "EventCode": "0x2b",
+         "EventName": "imx8mn_ddr.write_cycles",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MN"
+     },
+     {
+         "BriefDescription": "ddr read event",
+         "EventCode": "0x35",
+         "EventName": "imx8mn_ddr.read",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MN"
+     },
+     {
+         "BriefDescription": "ddr write event",
+         "EventCode": "0x38",
+         "EventName": "imx8mn_ddr.write",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MN"
+     }
+ ]
+18
tools/perf/pmu-events/arch/arm64/freescale/imx8mn/sys/metrics.json
···
+ [
+     {
+         "BriefDescription": "bytes all masters read from ddr based on read-cycles event",
+         "MetricName": "imx8mn_ddr_read.all",
+         "MetricExpr": "imx8mn_ddr.read_cycles * 4 * 2",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MN"
+     },
+     {
+         "BriefDescription": "bytes all masters write to ddr based on write-cycles event",
+         "MetricName": "imx8mn_ddr_write.all",
+         "MetricExpr": "imx8mn_ddr.write_cycles * 4 * 2",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MN"
+     }
+ ]
+37
tools/perf/pmu-events/arch/arm64/freescale/imx8mp/sys/ddrc.json
···
+ [
+     {
+         "BriefDescription": "ddr cycles event",
+         "EventCode": "0x00",
+         "EventName": "imx8mp_ddr.cycles",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "ddr read-cycles event",
+         "EventCode": "0x2a",
+         "EventName": "imx8mp_ddr.read_cycles",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "ddr write-cycles event",
+         "EventCode": "0x2b",
+         "EventName": "imx8mp_ddr.write_cycles",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "ddr read event",
+         "EventCode": "0x35",
+         "EventName": "imx8mp_ddr.read",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "ddr write event",
+         "EventCode": "0x38",
+         "EventName": "imx8mp_ddr.write",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     }
+ ]
+466
tools/perf/pmu-events/arch/arm64/freescale/imx8mp/sys/metrics.json
···
+ [
+     {
+         "BriefDescription": "bytes of all masters read from ddr",
+         "MetricName": "imx8mp_ddr_read.all",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0xffff\\,axi_id\\=0x0000@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of all masters write to ddr",
+         "MetricName": "imx8mp_ddr_write.all",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0xffff\\,axi_id\\=0x0000@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of a53 core read from ddr",
+         "MetricName": "imx8mp_ddr_read.a53",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0000@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of a53 core write to ddr",
+         "MetricName": "imx8mp_ddr_write.a53",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0000@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of supermix(m7) core read from ddr",
+         "MetricName": "imx8mp_ddr_read.supermix",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x000f\\,axi_id\\=0x0020@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of supermix(m7) write to ddr",
+         "MetricName": "imx8mp_ddr_write.supermix",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x000f\\,axi_id\\=0x0020@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of gpu 3d read from ddr",
+         "MetricName": "imx8mp_ddr_read.3d",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0070@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of gpu 3d write to ddr",
+         "MetricName": "imx8mp_ddr_write.3d",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0070@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of gpu 2d read from ddr",
+         "MetricName": "imx8mp_ddr_read.2d",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0071@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of gpu 2d write to ddr",
+         "MetricName": "imx8mp_ddr_write.2d",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0071@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of display lcdif1 read from ddr",
+         "MetricName": "imx8mp_ddr_read.lcdif1",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0068@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of display lcdif1 write to ddr",
+         "MetricName": "imx8mp_ddr_write.lcdif1",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0068@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of display lcdif2 read from ddr",
+         "MetricName": "imx8mp_ddr_read.lcdif2",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0069@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of display lcdif2 write to ddr",
+         "MetricName": "imx8mp_ddr_write.lcdif2",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0069@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of display isi1 read from ddr",
+         "MetricName": "imx8mp_ddr_read.isi1",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x006a@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of display isi1 write to ddr",
+         "MetricName": "imx8mp_ddr_write.isi1",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x006a@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of display isi2 read from ddr",
+         "MetricName": "imx8mp_ddr_read.isi2",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x006b@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of display isi2 write to ddr",
+         "MetricName": "imx8mp_ddr_write.isi2",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x006b@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of display isi3 read from ddr",
+         "MetricName": "imx8mp_ddr_read.isi3",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x006c@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of display isi3 write to ddr",
+         "MetricName": "imx8mp_ddr_write.isi3",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x006c@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of display isp1 read from ddr",
+         "MetricName": "imx8mp_ddr_read.isp1",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x006d@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of display isp1 write to ddr",
+         "MetricName": "imx8mp_ddr_write.isp1",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x006d@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of display isp2 read from ddr",
+         "MetricName": "imx8mp_ddr_read.isp2",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x006e@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of display isp2 write to ddr",
+         "MetricName": "imx8mp_ddr_write.isp2",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x006e@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of display dewarp read from ddr",
+         "MetricName": "imx8mp_ddr_read.dewarp",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x006f@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of display dewarp write to ddr",
+         "MetricName": "imx8mp_ddr_write.dewarp",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x006f@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of vpu1 read from ddr",
+         "MetricName": "imx8mp_ddr_read.vpu1",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x007c@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of vpu1 write to ddr",
+         "MetricName": "imx8mp_ddr_write.vpu1",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x007c@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of vpu2 read from ddr",
+         "MetricName": "imx8mp_ddr_read.vpu2",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x007d@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of vpu2 write to ddr",
+         "MetricName": "imx8mp_ddr_write.vpu2",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x007d@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of vpu3 read from ddr",
+         "MetricName": "imx8mp_ddr_read.vpu3",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x007e@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of vpu3 write to ddr",
+         "MetricName": "imx8mp_ddr_write.vpu3",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x007e@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of npu read from ddr",
+         "MetricName": "imx8mp_ddr_read.npu",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0073@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of npu write to ddr",
+         "MetricName": "imx8mp_ddr_write.npu",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0073@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of hsio usb1 read from ddr",
+         "MetricName": "imx8mp_ddr_read.usb1",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0078@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of hsio usb1 write to ddr",
+         "MetricName": "imx8mp_ddr_write.usb1",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0078@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of hsio usb2 read from ddr",
+         "MetricName": "imx8mp_ddr_read.usb2",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0079@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of hsio usb2 write to ddr",
+         "MetricName": "imx8mp_ddr_write.usb2",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0079@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of hsio pci read from ddr",
+         "MetricName": "imx8mp_ddr_read.pci",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x007a@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of hsio pci write to ddr",
+         "MetricName": "imx8mp_ddr_write.pci",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x007a@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of hdmi_tx hrv_mwr read from ddr",
+         "MetricName": "imx8mp_ddr_read.hdmi_hrv_mwr",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0074@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of hdmi_tx hrv_mwr write to ddr",
+         "MetricName": "imx8mp_ddr_write.hdmi_hrv_mwr",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0074@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of hdmi_tx lcdif read from ddr",
+         "MetricName": "imx8mp_ddr_read.hdmi_lcdif",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0075@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of hdmi_tx lcdif write to ddr",
+         "MetricName": "imx8mp_ddr_write.hdmi_lcdif",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0075@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of hdmi_tx tx_hdcp read from ddr",
+         "MetricName": "imx8mp_ddr_read.hdmi_hdcp",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0076@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of hdmi_tx tx_hdcp write to ddr",
+         "MetricName": "imx8mp_ddr_write.hdmi_hdcp",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0076@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of audio dsp read from ddr",
+         "MetricName": "imx8mp_ddr_read.audio_dsp",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0041@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of audio dsp write to ddr",
+         "MetricName": "imx8mp_ddr_write.audio_dsp",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0041@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of audio sdma2_per read from ddr",
+         "MetricName": "imx8mp_ddr_read.audio_sdma2_per",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0062@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of audio sdma2_per write to ddr",
+         "MetricName": "imx8mp_ddr_write.audio_sdma2_per",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0062@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of audio sdma2_burst read from ddr",
+         "MetricName": "imx8mp_ddr_read.audio_sdma2_burst",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0063@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of audio sdma2_burst write to ddr",
+         "MetricName": "imx8mp_ddr_write.audio_sdma2_burst",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0063@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of audio sdma3_per read from ddr",
+         "MetricName": "imx8mp_ddr_read.audio_sdma3_per",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0064@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of audio sdma3_per write to ddr",
+         "MetricName": "imx8mp_ddr_write.audio_sdma3_per",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0064@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of audio sdma3_burst read from ddr",
+         "MetricName": "imx8mp_ddr_read.audio_sdma3_burst",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0065@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of audio sdma3_burst write to ddr",
+         "MetricName": "imx8mp_ddr_write.audio_sdma3_burst",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0065@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of audio sdma_pif read from ddr",
+         "MetricName": "imx8mp_ddr_read.audio_sdma_pif",
+         "MetricExpr": "imx8_ddr0@axid\\-read\\,axi_mask\\=0x0000\\,axi_id\\=0x0066@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     },
+     {
+         "BriefDescription": "bytes of audio sdma_pif write to ddr",
+         "MetricName": "imx8mp_ddr_write.audio_sdma_pif",
+         "MetricExpr": "imx8_ddr0@axid\\-write\\,axi_mask\\=0x0000\\,axi_id\\=0x0066@",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MP"
+     }
+ ]
+37
tools/perf/pmu-events/arch/arm64/freescale/imx8mq/sys/ddrc.json
···
+ [
+     {
+         "BriefDescription": "ddr cycles event",
+         "EventCode": "0x00",
+         "EventName": "imx8mq_ddr.cycles",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MQ"
+     },
+     {
+         "BriefDescription": "ddr read-cycles event",
+         "EventCode": "0x2a",
+         "EventName": "imx8mq_ddr.read_cycles",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MQ"
+     },
+     {
+         "BriefDescription": "ddr write-cycles event",
+         "EventCode": "0x2b",
+         "EventName": "imx8mq_ddr.write_cycles",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MQ"
+     },
+     {
+         "BriefDescription": "ddr read event",
+         "EventCode": "0x35",
+         "EventName": "imx8mq_ddr.read",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MQ"
+     },
+     {
+         "BriefDescription": "ddr write event",
+         "EventCode": "0x38",
+         "EventName": "imx8mq_ddr.write",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MQ"
+     }
+ ]
+18
tools/perf/pmu-events/arch/arm64/freescale/imx8mq/sys/metrics.json
···
+ [
+     {
+         "BriefDescription": "bytes all masters read from ddr based on read-cycles event",
+         "MetricName": "imx8mq_ddr_read.all",
+         "MetricExpr": "imx8mq_ddr.read_cycles * 4 * 4",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MQ"
+     },
+     {
+         "BriefDescription": "bytes all masters write to ddr based on write-cycles event",
+         "MetricName": "imx8mq_ddr_write.all",
+         "MetricExpr": "imx8mq_ddr.write_cycles * 4 * 4",
+         "ScaleUnit": "9.765625e-4KB",
+         "Unit": "imx8_ddr",
+         "Compat": "i.MX8MQ"
+     }
+ ]
+1
tools/perf/tests/Build
··· 58 58 perf-y += genelf.o 59 59 perf-y += api-io.o 60 60 perf-y += demangle-java-test.o 61 + perf-y += demangle-ocaml-test.o 61 62 perf-y += pfm.o 62 63 perf-y += parse-metric.o 63 64 perf-y += pe-file-parsing.o
+4
tools/perf/tests/builtin-test.c
··· 339 339 .func = test__demangle_java, 340 340 }, 341 341 { 342 + .desc = "Demangle OCaml", 343 + .func = test__demangle_ocaml, 344 + }, 345 + { 342 346 .desc = "Parse and process metrics", 343 347 .func = test__parse_metric, 344 348 },
+1 -9
tools/perf/tests/code-reading.c
··· 26 26 #include "event.h" 27 27 #include "record.h" 28 28 #include "util/mmap.h" 29 + #include "util/string2.h" 29 30 #include "util/synthetic-events.h" 30 31 #include "thread.h" 31 32 ··· 41 40 u64 done[1024]; 42 41 size_t done_cnt; 43 42 }; 44 - 45 - static unsigned int hex(char c) 46 - { 47 - if (c >= '0' && c <= '9') 48 - return c - '0'; 49 - if (c >= 'a' && c <= 'f') 50 - return c - 'a' + 10; 51 - return c - 'A' + 10; 52 - } 53 43 54 44 static size_t read_objdump_chunk(const char **line, unsigned char **buf, 55 45 size_t *buf_len)
+43
tools/perf/tests/demangle-ocaml-test.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <string.h> 3 + #include <stdlib.h> 4 + #include <stdio.h> 5 + #include "tests.h" 6 + #include "session.h" 7 + #include "debug.h" 8 + #include "demangle-ocaml.h" 9 + 10 + int test__demangle_ocaml(struct test *test __maybe_unused, int subtest __maybe_unused) 11 + { 12 + int ret = TEST_OK; 13 + char *buf = NULL; 14 + size_t i; 15 + 16 + struct { 17 + const char *mangled, *demangled; 18 + } test_cases[] = { 19 + { "main", 20 + NULL }, 21 + { "camlStdlib__array__map_154", 22 + "Stdlib.array.map" }, 23 + { "camlStdlib__anon_fn$5bstdlib$2eml$3a334$2c0$2d$2d54$5d_1453", 24 + "Stdlib.anon_fn[stdlib.ml:334,0--54]" }, 25 + { "camlStdlib__bytes__$2b$2b_2205", 26 + "Stdlib.bytes.++" }, 27 + }; 28 + 29 + for (i = 0; i < sizeof(test_cases) / sizeof(test_cases[0]); i++) { 30 + buf = ocaml_demangle_sym(test_cases[i].mangled); 31 + if ((buf == NULL && test_cases[i].demangled != NULL) 32 + || (buf != NULL && test_cases[i].demangled == NULL) 33 + || (buf != NULL && strcmp(buf, test_cases[i].demangled))) { 34 + pr_debug("FAILED: %s: %s != %s\n", test_cases[i].mangled, 35 + buf == NULL ? "(null)" : buf, 36 + test_cases[i].demangled == NULL ? "(null)" : test_cases[i].demangled); 37 + ret = TEST_FAIL; 38 + } 39 + free(buf); 40 + } 41 + 42 + return ret; 43 + }
-1
tools/perf/tests/openat-syscall-all-cpus.c
··· 15 15 #include "tests.h" 16 16 #include "thread_map.h" 17 17 #include <perf/cpumap.h> 18 - #include <internal/cpumap.h> 19 18 #include "debug.h" 20 19 #include "stat.h" 21 20 #include "util/counts.h"
+24
tools/perf/tests/parse-metric.c
··· 70 70 .metric_name = "M3", 71 71 }, 72 72 { 73 + .metric_expr = "64 * l1d.replacement / 1000000000 / duration_time", 74 + .metric_name = "L1D_Cache_Fill_BW", 75 + }, 76 + { 73 77 .name = NULL, 74 78 } 75 79 }; ··· 111 107 evlist__for_each_entry(evlist, evsel) { 112 108 count = find_value(evsel->name, vals); 113 109 perf_stat__update_shadow_stats(evsel, count, 0, st); 110 + if (!strcmp(evsel->name, "duration_time")) 111 + update_stats(&walltime_nsecs_stats, count); 114 112 } 115 113 } 116 114 ··· 327 321 return 0; 328 322 } 329 323 324 + static int test_memory_bandwidth(void) 325 + { 326 + double ratio; 327 + struct value vals[] = { 328 + { .event = "l1d.replacement", .val = 4000000 }, 329 + { .event = "duration_time", .val = 200000000 }, 330 + { .event = NULL, }, 331 + }; 332 + 333 + TEST_ASSERT_VAL("failed to compute metric", 334 + compute_metric("L1D_Cache_Fill_BW", vals, &ratio) == 0); 335 + TEST_ASSERT_VAL("L1D_Cache_Fill_BW, wrong ratio", 336 + 1.28 == ratio); 337 + 338 + return 0; 339 + } 340 + 330 341 static int test_metric_group(void) 331 342 { 332 343 double ratio1, ratio2; ··· 376 353 TEST_ASSERT_VAL("DCache_L2 failed", test_dcache_l2() == 0); 377 354 TEST_ASSERT_VAL("recursion fail failed", test_recursion_fail() == 0); 378 355 TEST_ASSERT_VAL("test metric group", test_metric_group() == 0); 356 + TEST_ASSERT_VAL("Memory bandwidth", test_memory_bandwidth() == 0); 379 357 return 0; 380 358 }
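The expected ratio in test_memory_bandwidth can be checked by hand: 64 bytes per l1d.replacement, scaled to GB, divided by the wall time in seconds (duration_time is fed in as nanoseconds via walltime_nsecs_stats). A quick sketch of that arithmetic with the test's values:

```shell
# L1D_Cache_Fill_BW = 64 * l1d.replacement / 1e9 / seconds
# With the test values: 64 * 4000000 bytes = 0.256 GB over 0.2 s -> 1.28
l1d_replacement=4000000
duration_ns=200000000
ratio=$(awk "BEGIN { print 64 * $l1d_replacement / 1000000000 / ($duration_ns / 1000000000) }")
echo "$ratio"
```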
+16 -4
tools/perf/tests/sample-parsing.c
··· 129 129 if (type & PERF_SAMPLE_WEIGHT) 130 130 COMP(weight); 131 131 132 + if (type & PERF_SAMPLE_WEIGHT_STRUCT) 133 + COMP(ins_lat); 134 + 132 135 if (type & PERF_SAMPLE_DATA_SRC) 133 136 COMP(data_src); 134 137 ··· 159 156 160 157 if (type & PERF_SAMPLE_DATA_PAGE_SIZE) 161 158 COMP(data_page_size); 159 + 160 + if (type & PERF_SAMPLE_CODE_PAGE_SIZE) 161 + COMP(code_page_size); 162 162 163 163 if (type & PERF_SAMPLE_AUX) { 164 164 COMP(aux_sample.size); ··· 202 196 .data = {1, -1ULL, 211, 212, 213}, 203 197 }; 204 198 u64 regs[64]; 205 - const u64 raw_data[] = {0x123456780a0b0c0dULL, 0x1102030405060708ULL}; 199 + const u32 raw_data[] = {0x12345678, 0x0a0b0c0d, 0x11020304, 0x05060708, 0 }; 206 200 const u64 data[] = {0x2211443366558877ULL, 0, 0xaabbccddeeff4321ULL}; 207 201 const u64 aux_data[] = {0xa55a, 0, 0xeeddee, 0x0282028202820282}; 208 202 struct perf_sample sample = { ··· 244 238 .phys_addr = 113, 245 239 .cgroup = 114, 246 240 .data_page_size = 115, 241 + .code_page_size = 116, 242 + .ins_lat = 117, 247 243 .aux_sample = { 248 244 .size = sizeof(aux_data), 249 245 .data = (void *)aux_data, ··· 352 344 * were added. Please actually update the test rather than just change 353 345 * the condition below. 354 346 */ 355 - if (PERF_SAMPLE_MAX > PERF_SAMPLE_CODE_PAGE_SIZE << 1) { 347 + if (PERF_SAMPLE_MAX > PERF_SAMPLE_WEIGHT_STRUCT << 1) { 356 348 pr_debug("sample format has changed, some new PERF_SAMPLE_ bit was introduced - test needs updating\n"); 357 349 return -1; 358 350 } ··· 382 374 return err; 383 375 } 384 376 385 - /* Test all sample format bits together */ 386 - sample_type = PERF_SAMPLE_MAX - 1; 377 + /* 378 + * Test all sample format bits together 379 + * Note: PERF_SAMPLE_WEIGHT and PERF_SAMPLE_WEIGHT_STRUCT cannot 380 + * be set simultaneously. 
380 + * be set simultaneously. 381 + */ 382 + sample_type = (PERF_SAMPLE_MAX - 1) & ~PERF_SAMPLE_WEIGHT; 387 383 sample_regs = 0x3fff; /* shared by intr and user regs */ 388 384 for (i = 0; i < ARRAY_SIZE(rf); i++) { 389 385 err = do_test(sample_type, sample_regs, rf[i]);
+6
tools/perf/tests/shell/buildid.sh
··· 50 50 exit 1 51 51 fi 52 52 53 + ${perf} buildid-cache -l | grep $id 54 + if [ $? -ne 0 ]; then 55 + echo "failed: ${id} is not reported by \"perf buildid-cache -l\"" 56 + exit 1 57 + fi 58 + 53 59 echo "OK for ${1}" 54 60 } 55 61
+475
tools/perf/tests/shell/daemon.sh
··· 1 + #!/bin/sh 2 + # daemon operations 3 + # SPDX-License-Identifier: GPL-2.0 4 + 5 + check_line_first() 6 + { 7 + local line=$1 8 + local name=$2 9 + local base=$3 10 + local output=$4 11 + local lock=$5 12 + local up=$6 13 + 14 + local line_name=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $2 }'` 15 + local line_base=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $3 }'` 16 + local line_output=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $4 }'` 17 + local line_lock=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $5 }'` 18 + local line_up=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $6 }'` 19 + 20 + if [ "${name}" != "${line_name}" ]; then 21 + echo "FAILED: wrong name" 22 + error=1 23 + fi 24 + 25 + if [ "${base}" != "${line_base}" ]; then 26 + echo "FAILED: wrong base" 27 + error=1 28 + fi 29 + 30 + if [ "${output}" != "${line_output}" ]; then 31 + echo "FAILED: wrong output" 32 + error=1 33 + fi 34 + 35 + if [ "${lock}" != "${line_lock}" ]; then 36 + echo "FAILED: wrong lock" 37 + error=1 38 + fi 39 + 40 + if [ "${up}" != "${line_up}" ]; then 41 + echo "FAILED: wrong up" 42 + error=1 43 + fi 44 + } 45 + 46 + check_line_other() 47 + { 48 + local line=$1 49 + local name=$2 50 + local run=$3 51 + local base=$4 52 + local output=$5 53 + local control=$6 54 + local ack=$7 55 + local up=$8 56 + 57 + local line_name=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $2 }'` 58 + local line_run=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $3 }'` 59 + local line_base=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $4 }'` 60 + local line_output=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $5 }'` 61 + local line_control=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $6 }'` 62 + local line_ack=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $7 }'` 63 + local line_up=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $8 }'` 64 + 65 + if [ "${name}" != "${line_name}" ]; then 66 + echo "FAILED: wrong name" 67 + 
error=1 68 + fi 69 + 70 + if [ "${run}" != "${line_run}" ]; then 71 + echo "FAILED: wrong run" 72 + error=1 73 + fi 74 + 75 + if [ "${base}" != "${line_base}" ]; then 76 + echo "FAILED: wrong base" 77 + error=1 78 + fi 79 + 80 + if [ "${output}" != "${line_output}" ]; then 81 + echo "FAILED: wrong output" 82 + error=1 83 + fi 84 + 85 + if [ "${control}" != "${line_control}" ]; then 86 + echo "FAILED: wrong control" 87 + error=1 88 + fi 89 + 90 + if [ "${ack}" != "${line_ack}" ]; then 91 + echo "FAILED: wrong ack" 92 + error=1 93 + fi 94 + 95 + if [ "${up}" != "${line_up}" ]; then 96 + echo "FAILED: wrong up" 97 + error=1 98 + fi 99 + } 100 + 101 + daemon_start() 102 + { 103 + local config=$1 104 + local session=$2 105 + 106 + perf daemon start --config ${config} 107 + 108 + # wait for the session to ping 109 + local state="FAIL" 110 + while [ "${state}" != "OK" ]; do 111 + state=`perf daemon ping --config ${config} --session ${session} | awk '{ print $1 }'` 112 + sleep 0.05 113 + done 114 + } 115 + 116 + daemon_exit() 117 + { 118 + local base=$1 119 + local config=$2 120 + 121 + local line=`perf daemon --config ${config} -x: | head -1` 122 + local pid=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $1 }'` 123 + 124 + # stop daemon 125 + perf daemon stop --config ${config} 126 + 127 + # ... 
and wait for the pid to go away 128 + tail --pid=${pid} -f /dev/null 129 + } 130 + 131 + test_list() 132 + { 133 + echo "test daemon list" 134 + 135 + local config=$(mktemp /tmp/perf.daemon.config.XXX) 136 + local base=$(mktemp -d /tmp/perf.daemon.base.XXX) 137 + 138 + cat <<EOF > ${config} 139 + [daemon] 140 + base=BASE 141 + 142 + [session-size] 143 + run = -e cpu-clock 144 + 145 + [session-time] 146 + run = -e task-clock 147 + EOF 148 + 149 + sed -i -e "s|BASE|${base}|" ${config} 150 + 151 + # start daemon 152 + daemon_start ${config} size 153 + 154 + # check first line 155 + # pid:daemon:base:base/output:base/lock 156 + local line=`perf daemon --config ${config} -x: | head -1` 157 + check_line_first ${line} daemon ${base} ${base}/output ${base}/lock "0" 158 + 159 + # check 1st session 160 + # pid:size:-e cpu-clock:base/size:base/size/output:base/size/control:base/size/ack:0 161 + local line=`perf daemon --config ${config} -x: | head -2 | tail -1` 162 + check_line_other "${line}" size "-e cpu-clock" ${base}/session-size \ 163 + ${base}/session-size/output ${base}/session-size/control \ 164 + ${base}/session-size/ack "0" 165 + 166 + # check 2nd session 167 + # pid:time:-e task-clock:base/time:base/time/output:base/time/control:base/time/ack:0 168 + local line=`perf daemon --config ${config} -x: | head -3 | tail -1` 169 + check_line_other "${line}" time "-e task-clock" ${base}/session-time \ 170 + ${base}/session-time/output ${base}/session-time/control \ 171 + ${base}/session-time/ack "0" 172 + 173 + # stop daemon 174 + daemon_exit ${base} ${config} 175 + 176 + rm -rf ${base} 177 + rm -f ${config} 178 + } 179 + 180 + test_reconfig() 181 + { 182 + echo "test daemon reconfig" 183 + 184 + local config=$(mktemp /tmp/perf.daemon.config.XXX) 185 + local base=$(mktemp -d /tmp/perf.daemon.base.XXX) 186 + 187 + # prepare config 188 + cat <<EOF > ${config} 189 + [daemon] 190 + base=BASE 191 + 192 + [session-size] 193 + run = -e cpu-clock 194 + 195 + [session-time] 196 + 
run = -e task-clock 197 + EOF 198 + 199 + sed -i -e "s|BASE|${base}|" ${config} 200 + 201 + # start daemon 202 + daemon_start ${config} size 203 + 204 + # check 2nd session 205 + # pid:time:-e task-clock:base/time:base/time/output:base/time/control:base/time/ack:0 206 + local line=`perf daemon --config ${config} -x: | head -3 | tail -1` 207 + check_line_other "${line}" time "-e task-clock" ${base}/session-time \ 208 + ${base}/session-time/output ${base}/session-time/control ${base}/session-time/ack "0" 209 + local pid=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $1 }'` 210 + 211 + # prepare new config 212 + local config_new=${config}.new 213 + cat <<EOF > ${config_new} 214 + [daemon] 215 + base=BASE 216 + 217 + [session-size] 218 + run = -e cpu-clock 219 + 220 + [session-time] 221 + run = -e cpu-clock 222 + EOF 223 + 224 + # TEST 1 - change config 225 + 226 + sed -i -e "s|BASE|${base}|" ${config_new} 227 + cp ${config_new} ${config} 228 + 229 + # wait for old session to finish 230 + tail --pid=${pid} -f /dev/null 231 + 232 + # wait for new one to start 233 + local state="FAIL" 234 + while [ "${state}" != "OK" ]; do 235 + state=`perf daemon ping --config ${config} --session time | awk '{ print $1 }'` 236 + done 237 + 238 + # check reconfigured 2nd session 239 + # pid:time:-e task-clock:base/time:base/time/output:base/time/control:base/time/ack:0 240 + local line=`perf daemon --config ${config} -x: | head -3 | tail -1` 241 + check_line_other "${line}" time "-e cpu-clock" ${base}/session-time \ 242 + ${base}/session-time/output ${base}/session-time/control ${base}/session-time/ack "0" 243 + 244 + # TEST 2 - empty config 245 + 246 + local config_empty=${config}.empty 247 + cat <<EOF > ${config_empty} 248 + [daemon] 249 + base=BASE 250 + EOF 251 + 252 + # change config 253 + sed -i -e "s|BASE|${base}|" ${config_empty} 254 + cp ${config_empty} ${config} 255 + 256 + # wait for sessions to finish 257 + local state="OK" 258 + while [ "${state}" != "FAIL" ]; do 259 + 
state=`perf daemon ping --config ${config} --session time | awk '{ print $1 }'` 260 + done 261 + 262 + local state="OK" 263 + while [ "${state}" != "FAIL" ]; do 264 + state=`perf daemon ping --config ${config} --session size | awk '{ print $1 }'` 265 + done 266 + 267 + local one=`perf daemon --config ${config} -x: | wc -l` 268 + 269 + if [ ${one} -ne "1" ]; then 270 + echo "FAILED: wrong list output" 271 + error=1 272 + fi 273 + 274 + # TEST 3 - config again 275 + 276 + cp ${config_new} ${config} 277 + 278 + # wait for size to start 279 + local state="FAIL" 280 + while [ "${state}" != "OK" ]; do 281 + state=`perf daemon ping --config ${config} --session size | awk '{ print $1 }'` 282 + done 283 + 284 + # wait for time to start 285 + local state="FAIL" 286 + while [ "${state}" != "OK" ]; do 287 + state=`perf daemon ping --config ${config} --session time | awk '{ print $1 }'` 288 + done 289 + 290 + # stop daemon 291 + daemon_exit ${base} ${config} 292 + 293 + rm -rf ${base} 294 + rm -f ${config} 295 + rm -f ${config_new} 296 + rm -f ${config_empty} 297 + } 298 + 299 + test_stop() 300 + { 301 + echo "test daemon stop" 302 + 303 + local config=$(mktemp /tmp/perf.daemon.config.XXX) 304 + local base=$(mktemp -d /tmp/perf.daemon.base.XXX) 305 + 306 + # prepare config 307 + cat <<EOF > ${config} 308 + [daemon] 309 + base=BASE 310 + 311 + [session-size] 312 + run = -e cpu-clock 313 + 314 + [session-time] 315 + run = -e task-clock 316 + EOF 317 + 318 + sed -i -e "s|BASE|${base}|" ${config} 319 + 320 + # start daemon 321 + daemon_start ${config} size 322 + 323 + local pid_size=`perf daemon --config ${config} -x: | head -2 | tail -1 | awk 'BEGIN { FS = ":" } ; { print $1 }'` 324 + local pid_time=`perf daemon --config ${config} -x: | head -3 | tail -1 | awk 'BEGIN { FS = ":" } ; { print $1 }'` 325 + 326 + # check that sessions are running 327 + if [ ! -d "/proc/${pid_size}" ]; then 328 + echo "FAILED: session size not up" 329 + fi 330 + 331 + if [ ! 
-d "/proc/${pid_time}" ]; then 332 + echo "FAILED: session time not up" 333 + fi 334 + 335 + # stop daemon 336 + daemon_exit ${base} ${config} 337 + 338 + # check that sessions are gone 339 + if [ -d "/proc/${pid_size}" ]; then 340 + echo "FAILED: session size still up" 341 + fi 342 + 343 + if [ -d "/proc/${pid_time}" ]; then 344 + echo "FAILED: session time still up" 345 + fi 346 + 347 + rm -rf ${base} 348 + rm -f ${config} 349 + } 350 + 351 + test_signal() 352 + { 353 + echo "test daemon signal" 354 + 355 + local config=$(mktemp /tmp/perf.daemon.config.XXX) 356 + local base=$(mktemp -d /tmp/perf.daemon.base.XXX) 357 + 358 + # prepare config 359 + cat <<EOF > ${config} 360 + [daemon] 361 + base=BASE 362 + 363 + [session-test] 364 + run = -e cpu-clock --switch-output 365 + EOF 366 + 367 + sed -i -e "s|BASE|${base}|" ${config} 368 + 369 + # start daemon 370 + daemon_start ${config} test 371 + 372 + # send 2 signals 373 + perf daemon signal --config ${config} --session test 374 + perf daemon signal --config ${config} 375 + 376 + # stop daemon 377 + daemon_exit ${base} ${config} 378 + 379 + # count is 2 perf.data for signals and 1 for perf record finished 380 + count=`ls ${base}/session-test/ | grep perf.data | wc -l` 381 + if [ ${count} -ne 3 ]; then 382 + error=1 383 + echo "FAILED: perf data not generated" 384 + fi 385 + 386 + rm -rf ${base} 387 + rm -f ${config} 388 + } 389 + 390 + test_ping() 391 + { 392 + echo "test daemon ping" 393 + 394 + local config=$(mktemp /tmp/perf.daemon.config.XXX) 395 + local base=$(mktemp -d /tmp/perf.daemon.base.XXX) 396 + 397 + # prepare config 398 + cat <<EOF > ${config} 399 + [daemon] 400 + base=BASE 401 + 402 + [session-size] 403 + run = -e cpu-clock 404 + 405 + [session-time] 406 + run = -e task-clock 407 + EOF 408 + 409 + sed -i -e "s|BASE|${base}|" ${config} 410 + 411 + # start daemon 412 + daemon_start ${config} size 413 + 414 + size=`perf daemon ping --config ${config} --session size | awk '{ print $1 }'` 415 + type=`perf 
daemon ping --config ${config} --session time | awk '{ print $1 }'` 416 + 417 + if [ ${size} != "OK" -o ${type} != "OK" ]; then 418 + error=1 419 + echo "FAILED: daemon ping failed" 420 + fi 421 + 422 + # stop daemon 423 + daemon_exit ${base} ${config} 424 + 425 + rm -rf ${base} 426 + rm -f ${config} 427 + } 428 + 429 + test_lock() 430 + { 431 + echo "test daemon lock" 432 + 433 + local config=$(mktemp /tmp/perf.daemon.config.XXX) 434 + local base=$(mktemp -d /tmp/perf.daemon.base.XXX) 435 + 436 + # prepare config 437 + cat <<EOF > ${config} 438 + [daemon] 439 + base=BASE 440 + 441 + [session-size] 442 + run = -e cpu-clock 443 + EOF 444 + 445 + sed -i -e "s|BASE|${base}|" ${config} 446 + 447 + # start daemon 448 + daemon_start ${config} size 449 + 450 + # start second daemon over the same config/base 451 + failed=`perf daemon start --config ${config} 2>&1 | awk '{ print $1 }'` 452 + 453 + # check that we failed properly 454 + if [ ${failed} != "failed:" ]; then 455 + error=1 456 + echo "FAILED: daemon lock failed" 457 + fi 458 + 459 + # stop daemon 460 + daemon_exit ${base} ${config} 461 + 462 + rm -rf ${base} 463 + rm -f ${config} 464 + } 465 + 466 + error=0 467 + 468 + test_list 469 + test_reconfig 470 + test_stop 471 + test_signal 472 + test_ping 473 + test_lock 474 + 475 + exit ${error}
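Pulling the pieces of the test script together, a minimal standalone config for the new 'perf daemon' command follows the same ini layout the script generates with its heredocs (the path and session name here are illustrative, not from the diff):

```ini
; /tmp/perf.daemon.config (illustrative path)
[daemon]
base=/tmp/perf-daemon-base

[session-cycles]
run = -e cycles --switch-output
```

Such a config would then be driven the same way the tests do: started with `perf daemon start --config <file>`, listed with `perf daemon --config <file> -x:`, probed with `perf daemon ping`, and torn down with `perf daemon stop`.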
+23 -22
tools/perf/tests/shell/test_arm_coresight.sh
··· 11 11 12 12 perfdata=$(mktemp /tmp/__perf_test.perf.data.XXXXX) 13 13 file=$(mktemp /tmp/temporary_file.XXXXX) 14 + glb_err=0 14 15 15 16 skip_if_no_cs_etm_event() { 16 17 perf list | grep -q 'cs_etm//' && return 0 ··· 34 33 echo "Recording trace (only user mode) with path: CPU$2 => $1" 35 34 rm -f $file 36 35 perf record -o ${perfdata} -e cs_etm/@$1/u --per-thread \ 37 - -- taskset -c $2 touch $file 36 + -- taskset -c $2 touch $file > /dev/null 2>&1 38 37 } 39 38 40 39 perf_script_branch_samples() { ··· 44 43 # touch 6512 1 branches:u: ffffb220824c strcmp+0xc (/lib/aarch64-linux-gnu/ld-2.27.so) 45 44 # touch 6512 1 branches:u: ffffb22082e0 strcmp+0xa0 (/lib/aarch64-linux-gnu/ld-2.27.so) 46 45 # touch 6512 1 branches:u: ffffb2208320 strcmp+0xe0 (/lib/aarch64-linux-gnu/ld-2.27.so) 47 - perf script -F,-time -i ${perfdata} | \ 48 - egrep " +$1 +[0-9]+ .* +branches:(.*:)? +" 46 + perf script -F,-time -i ${perfdata} 2>&1 | \ 47 + egrep " +$1 +[0-9]+ .* +branches:(.*:)? +" > /dev/null 2>&1 49 48 } 50 49 51 50 perf_report_branch_samples() { ··· 55 54 # 73.04% 73.04% touch libc-2.27.so [.] _dl_addr 56 55 # 7.71% 7.71% touch libc-2.27.so [.] getenv 57 56 # 2.59% 2.59% touch ld-2.27.so [.] strcmp 58 - perf report --stdio -i ${perfdata} | \ 59 - egrep " +[0-9]+\.[0-9]+% +[0-9]+\.[0-9]+% +$1 " 57 + perf report --stdio -i ${perfdata} 2>&1 | \ 58 + egrep " +[0-9]+\.[0-9]+% +[0-9]+\.[0-9]+% +$1 " > /dev/null 2>&1 60 59 } 61 60 62 61 perf_report_instruction_samples() { ··· 66 65 # 68.12% touch libc-2.27.so [.] _dl_addr 67 66 # 5.80% touch libc-2.27.so [.] getenv 68 67 # 4.35% touch ld-2.27.so [.] 
_dl_fixup 69 - perf report --itrace=i1000i --stdio -i ${perfdata} | \ 70 - egrep " +[0-9]+\.[0-9]+% +$1" 68 + perf report --itrace=i1000i --stdio -i ${perfdata} 2>&1 | \ 69 + egrep " +[0-9]+\.[0-9]+% +$1" > /dev/null 2>&1 70 + } 71 + 72 + arm_cs_report() { 73 + if [ $2 != 0 ]; then 74 + echo "$1: FAIL" 75 + glb_err=$2 76 + else 77 + echo "$1: PASS" 78 + fi 71 79 } 72 80 73 81 is_device_sink() { ··· 123 113 perf_report_instruction_samples touch 124 114 125 115 err=$? 126 - 127 - # Exit when find failure 128 - [ $err != 0 ] && exit $err 116 + arm_cs_report "CoreSight path testing (CPU$2 -> $device_name)" $err 129 117 fi 130 118 131 119 arm_cs_iterate_devices $dev $2 ··· 137 129 # Find the ETM device belonging to which CPU 138 130 cpu=`cat $dev/cpu` 139 131 140 - echo $dev 141 - echo $cpu 142 - 143 132 # Use depth-first search (DFS) to iterate outputs 144 133 arm_cs_iterate_devices $dev $cpu 145 134 done ··· 144 139 145 140 arm_cs_etm_system_wide_test() { 146 141 echo "Recording trace with system wide mode" 147 - perf record -o ${perfdata} -e cs_etm// -a -- ls 142 + perf record -o ${perfdata} -e cs_etm// -a -- ls > /dev/null 2>&1 148 143 149 144 perf_script_branch_samples perf && 150 145 perf_report_branch_samples perf && 151 146 perf_report_instruction_samples perf 152 147 153 148 err=$? 154 - 155 - # Exit when find failure 156 - [ $err != 0 ] && exit $err 149 + arm_cs_report "CoreSight system wide testing" $err 157 150 } 158 151 159 152 arm_cs_etm_snapshot_test() { 160 153 echo "Recording trace with snapshot mode" 161 154 perf record -o ${perfdata} -e cs_etm// -S \ 162 - -- dd if=/dev/zero of=/dev/null & 155 + -- dd if=/dev/zero of=/dev/null > /dev/null 2>&1 & 163 156 PERFPID=$! 164 157 165 158 # Wait for perf program ··· 175 172 perf_report_instruction_samples dd 176 173 177 174 err=$? 
178 - 179 - # Exit when find failure 180 - [ $err != 0 ] && exit $err 175 + arm_cs_report "CoreSight snapshot testing" $err 181 176 } 182 177 183 178 arm_cs_etm_traverse_path_test 184 179 arm_cs_etm_system_wide_test 185 180 arm_cs_etm_snapshot_test 186 - exit 0 181 + exit $glb_err
+1
tools/perf/tests/tests.h
··· 119 119 int test__jit_write_elf(struct test *test, int subtest); 120 120 int test__api_io(struct test *test, int subtest); 121 121 int test__demangle_java(struct test *test, int subtest); 122 + int test__demangle_ocaml(struct test *test, int subtest); 122 123 int test__pfm(struct test *test, int subtest); 123 124 const char *test__pfm_subtest_get_desc(int subtest); 124 125 int test__pfm_subtest_get_nr(void);
+1 -1
tools/perf/ui/browsers/annotate.c
··· 759 759 continue; 760 760 case 'k': 761 761 notes->options->show_linenr = !notes->options->show_linenr; 762 - break; 762 + continue; 763 763 case 'H': 764 764 nd = browser->curr_hot; 765 765 break;
+2
tools/perf/util/Build
··· 135 135 136 136 perf-$(CONFIG_LIBBPF) += bpf-loader.o 137 137 perf-$(CONFIG_LIBBPF) += bpf_map.o 138 + perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter.o 138 139 perf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o 139 140 perf-$(CONFIG_LIBELF) += symbol-elf.o 140 141 perf-$(CONFIG_LIBELF) += probe-file.o ··· 173 172 174 173 perf-$(CONFIG_LIBCAP) += cap.o 175 174 175 + perf-y += demangle-ocaml.o 176 176 perf-y += demangle-java.o 177 177 perf-y += demangle-rust.o 178 178
+8
tools/perf/util/annotate.c
··· 321 321 /* 322 322 * Prevents from matching commas in the comment section, e.g.: 323 323 * ffff200008446e70: b.cs ffff2000084470f4 <generic_exec_single+0x314> // b.hs, b.nlast 324 + * 325 + * and skip comma as part of function arguments, e.g.: 326 + * 1d8b4ac <linemap_lookup(line_maps const*, unsigned int)+0xcc> 324 327 */ 325 328 static inline const char *validate_comma(const char *c, struct ins_operands *ops) 326 329 { 327 330 if (ops->raw_comment && c > ops->raw_comment) 331 + return NULL; 332 + 333 + if (ops->raw_func_start && c > ops->raw_func_start) 328 334 return NULL; 329 335 330 336 return c; ··· 347 341 u64 start, end; 348 342 349 343 ops->raw_comment = strchr(ops->raw, arch->objdump.comment_char); 344 + ops->raw_func_start = strchr(ops->raw, '<'); 345 + 350 346 c = validate_comma(c, ops); 351 347 352 348 /*
+1
tools/perf/util/annotate.h
··· 32 32 struct ins_operands { 33 33 char *raw; 34 34 char *raw_comment; 35 + char *raw_func_start; 35 36 struct { 36 37 char *raw; 37 38 char *name;
+10
tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
··· 172 172 decoder->record.from_ip = ip; 173 173 else if (idx == SPE_ADDR_PKT_HDR_INDEX_BRANCH) 174 174 decoder->record.to_ip = ip; 175 + else if (idx == SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT) 176 + decoder->record.virt_addr = ip; 177 + else if (idx == SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS) 178 + decoder->record.phys_addr = ip; 175 179 break; 176 180 case ARM_SPE_COUNTER: 177 181 break; 178 182 case ARM_SPE_CONTEXT: 179 183 break; 180 184 case ARM_SPE_OP_TYPE: 185 + if (idx == SPE_OP_PKT_HDR_CLASS_LD_ST_ATOMIC) { 186 + if (payload & 0x1) 187 + decoder->record.op = ARM_SPE_ST; 188 + else 189 + decoder->record.op = ARM_SPE_LD; 190 + } 181 191 break; 182 192 case ARM_SPE_EVENTS: 183 193 if (payload & BIT(EV_L1D_REFILL))
+8
tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
··· 24 24 ARM_SPE_REMOTE_ACCESS = 1 << 7, 25 25 }; 26 26 27 + enum arm_spe_op_type { 28 + ARM_SPE_LD = 1 << 0, 29 + ARM_SPE_ST = 1 << 1, 30 + }; 31 + 27 32 struct arm_spe_record { 28 33 enum arm_spe_sample_type type; 29 34 int err; 35 + u32 op; 30 36 u64 from_ip; 31 37 u64 to_ip; 32 38 u64 timestamp; 39 + u64 virt_addr; 40 + u64 phys_addr; 33 41 }; 34 42 35 43 struct arm_spe_insn;
+112 -21
tools/perf/util/arm-spe.c
··· 53 53 u8 sample_tlb; 54 54 u8 sample_branch; 55 55 u8 sample_remote_access; 56 + u8 sample_memory; 56 57 57 58 u64 l1d_miss_id; 58 59 u64 l1d_access_id; ··· 63 62 u64 tlb_access_id; 64 63 u64 branch_miss_id; 65 64 u64 remote_access_id; 65 + u64 memory_id; 66 66 67 67 u64 kernel_start; 68 68 ··· 237 235 sample->cpumode = arm_spe_cpumode(spe, sample->ip); 238 236 sample->pid = speq->pid; 239 237 sample->tid = speq->tid; 240 - sample->addr = record->to_ip; 241 238 sample->period = 1; 242 239 sample->cpu = speq->cpu; 243 240 ··· 260 259 return ret; 261 260 } 262 261 263 - static int 264 - arm_spe_synth_spe_events_sample(struct arm_spe_queue *speq, 265 - u64 spe_events_id) 262 + static int arm_spe__synth_mem_sample(struct arm_spe_queue *speq, 263 + u64 spe_events_id, u64 data_src) 266 264 { 267 265 struct arm_spe *spe = speq->spe; 266 + struct arm_spe_record *record = &speq->decoder->record; 268 267 union perf_event *event = speq->event_buf; 269 268 struct perf_sample sample = { .ip = 0, }; 270 269 ··· 272 271 273 272 sample.id = spe_events_id; 274 273 sample.stream_id = spe_events_id; 274 + sample.addr = record->virt_addr; 275 + sample.phys_addr = record->phys_addr; 276 + sample.data_src = data_src; 275 277 276 278 return arm_spe_deliver_synth_event(spe, speq, event, &sample); 279 + } 280 + 281 + static int arm_spe__synth_branch_sample(struct arm_spe_queue *speq, 282 + u64 spe_events_id) 283 + { 284 + struct arm_spe *spe = speq->spe; 285 + struct arm_spe_record *record = &speq->decoder->record; 286 + union perf_event *event = speq->event_buf; 287 + struct perf_sample sample = { .ip = 0, }; 288 + 289 + arm_spe_prep_sample(spe, speq, event, &sample); 290 + 291 + sample.id = spe_events_id; 292 + sample.stream_id = spe_events_id; 293 + sample.addr = record->to_ip; 294 + 295 + return arm_spe_deliver_synth_event(spe, speq, event, &sample); 296 + } 297 + 298 + #define SPE_MEM_TYPE (ARM_SPE_L1D_ACCESS | ARM_SPE_L1D_MISS | \ 299 + ARM_SPE_LLC_ACCESS | ARM_SPE_LLC_MISS | \ 
300 + ARM_SPE_REMOTE_ACCESS) 301 + 302 + static bool arm_spe__is_memory_event(enum arm_spe_sample_type type) 303 + { 304 + if (type & SPE_MEM_TYPE) 305 + return true; 306 + 307 + return false; 308 + } 309 + 310 + static u64 arm_spe__synth_data_source(const struct arm_spe_record *record) 311 + { 312 + union perf_mem_data_src data_src = { 0 }; 313 + 314 + if (record->op == ARM_SPE_LD) 315 + data_src.mem_op = PERF_MEM_OP_LOAD; 316 + else 317 + data_src.mem_op = PERF_MEM_OP_STORE; 318 + 319 + if (record->type & (ARM_SPE_LLC_ACCESS | ARM_SPE_LLC_MISS)) { 320 + data_src.mem_lvl = PERF_MEM_LVL_L3; 321 + 322 + if (record->type & ARM_SPE_LLC_MISS) 323 + data_src.mem_lvl |= PERF_MEM_LVL_MISS; 324 + else 325 + data_src.mem_lvl |= PERF_MEM_LVL_HIT; 326 + } else if (record->type & (ARM_SPE_L1D_ACCESS | ARM_SPE_L1D_MISS)) { 327 + data_src.mem_lvl = PERF_MEM_LVL_L1; 328 + 329 + if (record->type & ARM_SPE_L1D_MISS) 330 + data_src.mem_lvl |= PERF_MEM_LVL_MISS; 331 + else 332 + data_src.mem_lvl |= PERF_MEM_LVL_HIT; 333 + } 334 + 335 + if (record->type & ARM_SPE_REMOTE_ACCESS) 336 + data_src.mem_lvl |= PERF_MEM_LVL_REM_CCE1; 337 + 338 + if (record->type & (ARM_SPE_TLB_ACCESS | ARM_SPE_TLB_MISS)) { 339 + data_src.mem_dtlb = PERF_MEM_TLB_WK; 340 + 341 + if (record->type & ARM_SPE_TLB_MISS) 342 + data_src.mem_dtlb |= PERF_MEM_TLB_MISS; 343 + else 344 + data_src.mem_dtlb |= PERF_MEM_TLB_HIT; 345 + } 346 + 347 + return data_src.val; 277 348 } 278 349 279 350 static int arm_spe_sample(struct arm_spe_queue *speq) 280 351 { 281 352 const struct arm_spe_record *record = &speq->decoder->record; 282 353 struct arm_spe *spe = speq->spe; 354 + u64 data_src; 283 355 int err; 356 + 357 + data_src = arm_spe__synth_data_source(record); 284 358 285 359 if (spe->sample_flc) { 286 360 if (record->type & ARM_SPE_L1D_MISS) { 287 - err = arm_spe_synth_spe_events_sample( 288 - speq, spe->l1d_miss_id); 361 + err = arm_spe__synth_mem_sample(speq, spe->l1d_miss_id, 362 + data_src); 289 363 if (err) 290 364 
 			return err;
 	}
 
 	if (record->type & ARM_SPE_L1D_ACCESS) {
-		err = arm_spe_synth_spe_events_sample(
-				speq, spe->l1d_access_id);
+		err = arm_spe__synth_mem_sample(speq, spe->l1d_access_id,
+						data_src);
 		if (err)
 			return err;
 	}
···
 
 	if (spe->sample_llc) {
 		if (record->type & ARM_SPE_LLC_MISS) {
-			err = arm_spe_synth_spe_events_sample(
-					speq, spe->llc_miss_id);
+			err = arm_spe__synth_mem_sample(speq, spe->llc_miss_id,
+							data_src);
 			if (err)
 				return err;
 		}
 
 		if (record->type & ARM_SPE_LLC_ACCESS) {
-			err = arm_spe_synth_spe_events_sample(
-					speq, spe->llc_access_id);
+			err = arm_spe__synth_mem_sample(speq, spe->llc_access_id,
+							data_src);
 			if (err)
 				return err;
 		}
···
 
 	if (spe->sample_tlb) {
 		if (record->type & ARM_SPE_TLB_MISS) {
-			err = arm_spe_synth_spe_events_sample(
-					speq, spe->tlb_miss_id);
+			err = arm_spe__synth_mem_sample(speq, spe->tlb_miss_id,
+							data_src);
 			if (err)
 				return err;
 		}
 
 		if (record->type & ARM_SPE_TLB_ACCESS) {
-			err = arm_spe_synth_spe_events_sample(
-					speq, spe->tlb_access_id);
+			err = arm_spe__synth_mem_sample(speq, spe->tlb_access_id,
+							data_src);
 			if (err)
 				return err;
 		}
 	}
 
 	if (spe->sample_branch && (record->type & ARM_SPE_BRANCH_MISS)) {
-		err = arm_spe_synth_spe_events_sample(speq,
-						      spe->branch_miss_id);
+		err = arm_spe__synth_branch_sample(speq, spe->branch_miss_id);
 		if (err)
 			return err;
 	}
 
 	if (spe->sample_remote_access &&
 	    (record->type & ARM_SPE_REMOTE_ACCESS)) {
-		err = arm_spe_synth_spe_events_sample(speq,
-						      spe->remote_access_id);
+		err = arm_spe__synth_mem_sample(speq, spe->remote_access_id,
+						data_src);
+		if (err)
+			return err;
+	}
+
+	if (spe->sample_memory && arm_spe__is_memory_event(record->type)) {
+		err = arm_spe__synth_mem_sample(speq, spe->memory_id, data_src);
 		if (err)
 			return err;
 	}
···
 	attr.type = PERF_TYPE_HARDWARE;
 	attr.sample_type = evsel->core.attr.sample_type & PERF_SAMPLE_MASK;
 	attr.sample_type |= PERF_SAMPLE_IP | PERF_SAMPLE_TID |
-			    PERF_SAMPLE_PERIOD;
+			    PERF_SAMPLE_PERIOD | PERF_SAMPLE_DATA_SRC;
 	if (spe->timeless_decoding)
 		attr.sample_type &= ~(u64)PERF_SAMPLE_TIME;
 	else
···
 		spe->remote_access_id = id;
 		arm_spe_set_event_name(evlist, id, "remote-access");
 		id += 1;
+	}
+
+	if (spe->synth_opts.mem) {
+		spe->sample_memory = true;
+
+		err = arm_spe_synth_event(session, &attr, id);
+		if (err)
+			return err;
+		spe->memory_id = id;
+		arm_spe_set_event_name(evlist, id, "memory");
 	}
 
 	return 0;
+15
tools/perf/util/auxtrace.c
···
 	return auxtrace_validate_aux_sample_size(evlist, opts);
 }
 
+void auxtrace_regroup_aux_output(struct evlist *evlist)
+{
+	struct evsel *evsel, *aux_evsel = NULL;
+	struct evsel_config_term *term;
+
+	evlist__for_each_entry(evlist, evsel) {
+		if (evsel__is_aux_event(evsel))
+			aux_evsel = evsel;
+		term = evsel__get_config_term(evsel, AUX_OUTPUT);
+		/* If possible, group with the AUX event */
+		if (term && aux_evsel)
+			evlist__regroup(evlist, aux_evsel, evsel);
+	}
+}
+
 struct auxtrace_record *__weak
 auxtrace_record__init(struct evlist *evlist __maybe_unused, int *err)
 {
+6
tools/perf/util/auxtrace.h
···
 int auxtrace_parse_sample_options(struct auxtrace_record *itr,
 				  struct evlist *evlist,
 				  struct record_opts *opts, const char *str);
+void auxtrace_regroup_aux_output(struct evlist *evlist);
 int auxtrace_record__options(struct auxtrace_record *itr,
 			     struct evlist *evlist,
 			     struct record_opts *opts);
···
 		return 0;
 	pr_err("AUX area tracing not supported\n");
 	return -EINVAL;
 }
+
+static inline
+void auxtrace_regroup_aux_output(struct evlist *evlist __maybe_unused)
+{
+}
 
 static inline
+314
tools/perf/util/bpf_counter.c
···
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2019 Facebook */
+
+#include <assert.h>
+#include <limits.h>
+#include <unistd.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+#include <linux/err.h>
+#include <linux/zalloc.h>
+#include <bpf/bpf.h>
+#include <bpf/btf.h>
+#include <bpf/libbpf.h>
+
+#include "bpf_counter.h"
+#include "counts.h"
+#include "debug.h"
+#include "evsel.h"
+#include "target.h"
+
+#include "bpf_skel/bpf_prog_profiler.skel.h"
+
+static inline void *u64_to_ptr(__u64 ptr)
+{
+	return (void *)(unsigned long)ptr;
+}
+
+static void set_max_rlimit(void)
+{
+	struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY };
+
+	setrlimit(RLIMIT_MEMLOCK, &rinf);
+}
+
+static struct bpf_counter *bpf_counter_alloc(void)
+{
+	struct bpf_counter *counter;
+
+	counter = zalloc(sizeof(*counter));
+	if (counter)
+		INIT_LIST_HEAD(&counter->list);
+	return counter;
+}
+
+static int bpf_program_profiler__destroy(struct evsel *evsel)
+{
+	struct bpf_counter *counter, *tmp;
+
+	list_for_each_entry_safe(counter, tmp,
+				 &evsel->bpf_counter_list, list) {
+		list_del_init(&counter->list);
+		bpf_prog_profiler_bpf__destroy(counter->skel);
+		free(counter);
+	}
+	assert(list_empty(&evsel->bpf_counter_list));
+
+	return 0;
+}
+
+static char *bpf_target_prog_name(int tgt_fd)
+{
+	struct bpf_prog_info_linear *info_linear;
+	struct bpf_func_info *func_info;
+	const struct btf_type *t;
+	char *name = NULL;
+	struct btf *btf;
+
+	info_linear = bpf_program__get_prog_info_linear(
+		tgt_fd, 1UL << BPF_PROG_INFO_FUNC_INFO);
+	if (IS_ERR_OR_NULL(info_linear)) {
+		pr_debug("failed to get info_linear for prog FD %d\n", tgt_fd);
+		return NULL;
+	}
+
+	if (info_linear->info.btf_id == 0 ||
+	    btf__get_from_id(info_linear->info.btf_id, &btf)) {
+		pr_debug("prog FD %d doesn't have valid btf\n", tgt_fd);
+		goto out;
+	}
+
+	func_info = u64_to_ptr(info_linear->info.func_info);
+	t = btf__type_by_id(btf, func_info[0].type_id);
+	if (!t) {
+		pr_debug("btf %d doesn't have type %d\n",
+			 info_linear->info.btf_id, func_info[0].type_id);
+		goto out;
+	}
+	name = strdup(btf__name_by_offset(btf, t->name_off));
+out:
+	free(info_linear);
+	return name;
+}
+
+static int bpf_program_profiler_load_one(struct evsel *evsel, u32 prog_id)
+{
+	struct bpf_prog_profiler_bpf *skel;
+	struct bpf_counter *counter;
+	struct bpf_program *prog;
+	char *prog_name;
+	int prog_fd;
+	int err;
+
+	prog_fd = bpf_prog_get_fd_by_id(prog_id);
+	if (prog_fd < 0) {
+		pr_err("Failed to open fd for bpf prog %u\n", prog_id);
+		return -1;
+	}
+	counter = bpf_counter_alloc();
+	if (!counter) {
+		close(prog_fd);
+		return -1;
+	}
+
+	skel = bpf_prog_profiler_bpf__open();
+	if (!skel) {
+		pr_err("Failed to open bpf skeleton\n");
+		goto err_out;
+	}
+
+	skel->rodata->num_cpu = evsel__nr_cpus(evsel);
+
+	bpf_map__resize(skel->maps.events, evsel__nr_cpus(evsel));
+	bpf_map__resize(skel->maps.fentry_readings, 1);
+	bpf_map__resize(skel->maps.accum_readings, 1);
+
+	prog_name = bpf_target_prog_name(prog_fd);
+	if (!prog_name) {
+		pr_err("Failed to get program name for bpf prog %u. Does it have BTF?\n", prog_id);
+		goto err_out;
+	}
+
+	bpf_object__for_each_program(prog, skel->obj) {
+		err = bpf_program__set_attach_target(prog, prog_fd, prog_name);
+		if (err) {
+			pr_err("bpf_program__set_attach_target failed.\n"
+			       "Does bpf prog %u have BTF?\n", prog_id);
+			goto err_out;
+		}
+	}
+	set_max_rlimit();
+	err = bpf_prog_profiler_bpf__load(skel);
+	if (err) {
+		pr_err("bpf_prog_profiler_bpf__load failed\n");
+		goto err_out;
+	}
+
+	assert(skel != NULL);
+	counter->skel = skel;
+	list_add(&counter->list, &evsel->bpf_counter_list);
+	close(prog_fd);
+	return 0;
+err_out:
+	bpf_prog_profiler_bpf__destroy(skel);
+	free(counter);
+	close(prog_fd);
+	return -1;
+}
+
+static int bpf_program_profiler__load(struct evsel *evsel, struct target *target)
+{
+	char *bpf_str, *bpf_str_, *tok, *saveptr = NULL, *p;
+	u32 prog_id;
+	int ret;
+
+	bpf_str_ = bpf_str = strdup(target->bpf_str);
+	if (!bpf_str)
+		return -1;
+
+	while ((tok = strtok_r(bpf_str, ",", &saveptr)) != NULL) {
+		prog_id = strtoul(tok, &p, 10);
+		if (prog_id == 0 || prog_id == UINT_MAX ||
+		    (*p != '\0' && *p != ',')) {
+			pr_err("Failed to parse bpf prog ids %s\n",
+			       target->bpf_str);
+			return -1;
+		}
+
+		ret = bpf_program_profiler_load_one(evsel, prog_id);
+		if (ret) {
+			bpf_program_profiler__destroy(evsel);
+			free(bpf_str_);
+			return -1;
+		}
+		bpf_str = NULL;
+	}
+	free(bpf_str_);
+	return 0;
+}
+
+static int bpf_program_profiler__enable(struct evsel *evsel)
+{
+	struct bpf_counter *counter;
+	int ret;
+
+	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
+		assert(counter->skel != NULL);
+		ret = bpf_prog_profiler_bpf__attach(counter->skel);
+		if (ret) {
+			bpf_program_profiler__destroy(evsel);
+			return ret;
+		}
+	}
+	return 0;
+}
+
+static int bpf_program_profiler__read(struct evsel *evsel)
+{
+	// perf_cpu_map uses /sys/devices/system/cpu/online
+	int num_cpu = evsel__nr_cpus(evsel);
+	// BPF_MAP_TYPE_PERCPU_ARRAY uses /sys/devices/system/cpu/possible
+	// Sometimes possible > online, like on a Ryzen 3900X that has 24
+	// threads but its possible showed 0-31 -acme
+	int num_cpu_bpf = libbpf_num_possible_cpus();
+	struct bpf_perf_event_value values[num_cpu_bpf];
+	struct bpf_counter *counter;
+	int reading_map_fd;
+	__u32 key = 0;
+	int err, cpu;
+
+	if (list_empty(&evsel->bpf_counter_list))
+		return -EAGAIN;
+
+	for (cpu = 0; cpu < num_cpu; cpu++) {
+		perf_counts(evsel->counts, cpu, 0)->val = 0;
+		perf_counts(evsel->counts, cpu, 0)->ena = 0;
+		perf_counts(evsel->counts, cpu, 0)->run = 0;
+	}
+	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
+		struct bpf_prog_profiler_bpf *skel = counter->skel;
+
+		assert(skel != NULL);
+		reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
+
+		err = bpf_map_lookup_elem(reading_map_fd, &key, values);
+		if (err) {
+			pr_err("failed to read value\n");
+			return err;
+		}
+
+		for (cpu = 0; cpu < num_cpu; cpu++) {
+			perf_counts(evsel->counts, cpu, 0)->val += values[cpu].counter;
+			perf_counts(evsel->counts, cpu, 0)->ena += values[cpu].enabled;
+			perf_counts(evsel->counts, cpu, 0)->run += values[cpu].running;
+		}
+	}
+	return 0;
+}
+
+static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu,
+					    int fd)
+{
+	struct bpf_prog_profiler_bpf *skel;
+	struct bpf_counter *counter;
+	int ret;
+
+	list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
+		skel = counter->skel;
+		assert(skel != NULL);
+
+		ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events),
+					  &cpu, &fd, BPF_ANY);
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+
+struct bpf_counter_ops bpf_program_profiler_ops = {
+	.load       = bpf_program_profiler__load,
+	.enable     = bpf_program_profiler__enable,
+	.read       = bpf_program_profiler__read,
+	.destroy    = bpf_program_profiler__destroy,
+	.install_pe = bpf_program_profiler__install_pe,
+};
+
+int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd)
+{
+	if (list_empty(&evsel->bpf_counter_list))
+		return 0;
+	return evsel->bpf_counter_ops->install_pe(evsel, cpu, fd);
+}
+
+int bpf_counter__load(struct evsel *evsel, struct target *target)
+{
+	if (target__has_bpf(target))
+		evsel->bpf_counter_ops = &bpf_program_profiler_ops;
+
+	if (evsel->bpf_counter_ops)
+		return evsel->bpf_counter_ops->load(evsel, target);
+	return 0;
+}
+
+int bpf_counter__enable(struct evsel *evsel)
+{
+	if (list_empty(&evsel->bpf_counter_list))
+		return 0;
+	return evsel->bpf_counter_ops->enable(evsel);
+}
+
+int bpf_counter__read(struct evsel *evsel)
+{
+	if (list_empty(&evsel->bpf_counter_list))
+		return -EAGAIN;
+	return evsel->bpf_counter_ops->read(evsel);
+}
+
+void bpf_counter__destroy(struct evsel *evsel)
+{
+	if (list_empty(&evsel->bpf_counter_list))
+		return;
+	evsel->bpf_counter_ops->destroy(evsel);
+	evsel->bpf_counter_ops = NULL;
+}
+72
tools/perf/util/bpf_counter.h
···
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_BPF_COUNTER_H
+#define __PERF_BPF_COUNTER_H 1
+
+#include <linux/list.h>
+
+struct evsel;
+struct target;
+struct bpf_counter;
+
+typedef int (*bpf_counter_evsel_op)(struct evsel *evsel);
+typedef int (*bpf_counter_evsel_target_op)(struct evsel *evsel,
+					   struct target *target);
+typedef int (*bpf_counter_evsel_install_pe_op)(struct evsel *evsel,
+					       int cpu,
+					       int fd);
+
+struct bpf_counter_ops {
+	bpf_counter_evsel_target_op load;
+	bpf_counter_evsel_op enable;
+	bpf_counter_evsel_op read;
+	bpf_counter_evsel_op destroy;
+	bpf_counter_evsel_install_pe_op install_pe;
+};
+
+struct bpf_counter {
+	void *skel;
+	struct list_head list;
+};
+
+#ifdef HAVE_BPF_SKEL
+
+int bpf_counter__load(struct evsel *evsel, struct target *target);
+int bpf_counter__enable(struct evsel *evsel);
+int bpf_counter__read(struct evsel *evsel);
+void bpf_counter__destroy(struct evsel *evsel);
+int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd);
+
+#else /* HAVE_BPF_SKEL */
+
+#include <linux/err.h>
+
+static inline int bpf_counter__load(struct evsel *evsel __maybe_unused,
+				    struct target *target __maybe_unused)
+{
+	return 0;
+}
+
+static inline int bpf_counter__enable(struct evsel *evsel __maybe_unused)
+{
+	return 0;
+}
+
+static inline int bpf_counter__read(struct evsel *evsel __maybe_unused)
+{
+	return -EAGAIN;
+}
+
+static inline void bpf_counter__destroy(struct evsel *evsel __maybe_unused)
+{
+}
+
+static inline int bpf_counter__install_pe(struct evsel *evsel __maybe_unused,
+					  int cpu __maybe_unused,
+					  int fd __maybe_unused)
+{
+	return 0;
+}
+
+#endif /* HAVE_BPF_SKEL */
+
+#endif /* __PERF_BPF_COUNTER_H */
+3
tools/perf/util/bpf_skel/.gitignore
···
+# SPDX-License-Identifier: GPL-2.0-only
+.tmp
+*.skel.h
+93
tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
···
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+// Copyright (c) 2020 Facebook
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+/* map of perf event fds, num_cpu * num_metric entries */
+struct {
+	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(int));
+} events SEC(".maps");
+
+/* readings at fentry */
+struct {
+	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(struct bpf_perf_event_value));
+	__uint(max_entries, 1);
+} fentry_readings SEC(".maps");
+
+/* accumulated readings */
+struct {
+	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(struct bpf_perf_event_value));
+	__uint(max_entries, 1);
+} accum_readings SEC(".maps");
+
+const volatile __u32 num_cpu = 1;
+
+SEC("fentry/XXX")
+int BPF_PROG(fentry_XXX)
+{
+	__u32 key = bpf_get_smp_processor_id();
+	struct bpf_perf_event_value *ptr;
+	__u32 zero = 0;
+	long err;
+
+	/* look up before reading, to reduce error */
+	ptr = bpf_map_lookup_elem(&fentry_readings, &zero);
+	if (!ptr)
+		return 0;
+
+	err = bpf_perf_event_read_value(&events, key, ptr, sizeof(*ptr));
+	if (err)
+		return 0;
+
+	return 0;
+}
+
+static inline void
+fexit_update_maps(struct bpf_perf_event_value *after)
+{
+	struct bpf_perf_event_value *before, diff, *accum;
+	__u32 zero = 0;
+
+	before = bpf_map_lookup_elem(&fentry_readings, &zero);
+	/* only account samples with a valid fentry_reading */
+	if (before && before->counter) {
+		struct bpf_perf_event_value *accum;
+
+		diff.counter = after->counter - before->counter;
+		diff.enabled = after->enabled - before->enabled;
+		diff.running = after->running - before->running;
+
+		accum = bpf_map_lookup_elem(&accum_readings, &zero);
+		if (accum) {
+			accum->counter += diff.counter;
+			accum->enabled += diff.enabled;
+			accum->running += diff.running;
+		}
+	}
+}
+
+SEC("fexit/XXX")
+int BPF_PROG(fexit_XXX)
+{
+	struct bpf_perf_event_value reading;
+	__u32 cpu = bpf_get_smp_processor_id();
+	__u32 one = 1, zero = 0;
+	int err;
+
+	/* read all events before updating the maps, to reduce error */
+	err = bpf_perf_event_read_value(&events, cpu, &reading, sizeof(reading));
+	if (err)
+		return 0;
+
+	fexit_update_maps(&reading);
+	return 0;
+}
+
+char LICENSE[] SEC("license") = "Dual BSD/GPL";
+3 -2
tools/perf/util/build-id.c
···
 	int i = 0;
 
 	while (isxdigit(d->d_name[i]) && i < SBUILD_ID_SIZE - 3)
 		i++;
-	return (i == SBUILD_ID_SIZE - 3) && (d->d_name[i] == '\0');
+	return (i >= SBUILD_ID_MIN_SIZE - 3) && (i <= SBUILD_ID_SIZE - 3) &&
+	       (d->d_name[i] == '\0');
 }
 
 struct strlist *build_id_cache__list_all(bool validonly)
···
 	}
 	strlist__for_each_entry(nd2, linklist) {
 		if (snprintf(sbuild_id, SBUILD_ID_SIZE, "%s%s",
-			     nd->s, nd2->s) != SBUILD_ID_SIZE - 1)
+			     nd->s, nd2->s) > SBUILD_ID_SIZE - 1)
 			goto err_out;
 		if (validonly && !build_id_cache__valid_id(sbuild_id))
 			continue;
+3 -1
tools/perf/util/build-id.h
···
 #ifndef PERF_BUILD_ID_H_
 #define PERF_BUILD_ID_H_ 1
 
-#define BUILD_ID_SIZE	20
+#define BUILD_ID_SIZE	20	/* SHA-1 length in bytes */
+#define BUILD_ID_MIN_SIZE	16	/* MD5/UUID/GUID length in bytes */
 #define SBUILD_ID_SIZE	(BUILD_ID_SIZE * 2 + 1)
+#define SBUILD_ID_MIN_SIZE	(BUILD_ID_MIN_SIZE * 2 + 1)
 
 #include "machine.h"
 #include "tool.h"
+4 -4
tools/perf/util/cgroup.c
···
 
 /* helper function for ftw() in match_cgroups and list_cgroups */
 static int add_cgroup_name(const char *fpath, const struct stat *sb __maybe_unused,
-			   int typeflag)
+			   int typeflag, struct FTW *ftwbuf __maybe_unused)
 {
 	struct cgroup_name *cn;
 
···
 		if (!s)
 			return -1;
 		/* pretend if it's added by ftw() */
-		ret = add_cgroup_name(s, NULL, FTW_D);
+		ret = add_cgroup_name(s, NULL, FTW_D, NULL);
 		free(s);
 		if (ret)
 			return -1;
 	} else {
-		if (add_cgroup_name("", NULL, FTW_D) < 0)
+		if (add_cgroup_name("", NULL, FTW_D, NULL) < 0)
 			return -1;
 	}
 
···
 	prefix_len = strlen(mnt);
 
 	/* collect all cgroups in the cgroup_list */
-	if (ftw(mnt, add_cgroup_name, 20) < 0)
+	if (nftw(mnt, add_cgroup_name, 20, 0) < 0)
 		return -1;
 
 	for (;;) {
+79 -44
tools/perf/util/config.c
···
 	return 0;
 }
 
-int perf_config_from_file(config_fn_t fn, const char *filename, void *data)
+static int perf_config_from_file(config_fn_t fn, const char *filename, void *data)
 {
 	int ret;
 	FILE *f = fopen(filename, "r");
···
 	return v ? perf_config_bool(k, v) : def;
 }
 
-static int perf_config_system(void)
+int perf_config_system(void)
 {
 	return !perf_env_bool("PERF_CONFIG_NOSYSTEM", 0);
 }
 
-static int perf_config_global(void)
+int perf_config_global(void)
 {
 	return !perf_env_bool("PERF_CONFIG_NOGLOBAL", 0);
+}
+
+static char *home_perfconfig(void)
+{
+	const char *home = NULL;
+	char *config;
+	struct stat st;
+
+	home = getenv("HOME");
+
+	/*
+	 * Skip reading user config if:
+	 * - there is no place to read it from (HOME)
+	 * - we are asked not to (PERF_CONFIG_NOGLOBAL=1)
+	 */
+	if (!home || !*home || !perf_config_global())
+		return NULL;
+
+	config = strdup(mkpath("%s/.perfconfig", home));
+	if (config == NULL) {
+		pr_warning("Not enough memory to process %s/.perfconfig, ignoring it.", home);
+		return NULL;
+	}
+
+	if (stat(config, &st) < 0)
+		goto out_free;
+
+	if (st.st_uid && (st.st_uid != geteuid())) {
+		pr_warning("File %s not owned by current user or root, ignoring it.", config);
+		goto out_free;
+	}
+
+	if (st.st_size)
+		return config;
+
+out_free:
+	free(config);
+	return NULL;
+}
+
+const char *perf_home_perfconfig(void)
+{
+	static const char *config;
+	static bool failed;
+
+	config = failed ? NULL : home_perfconfig();
+	if (!config)
+		failed = true;
+
+	return config;
 }
 
 static struct perf_config_section *find_section(struct list_head *sections,
···
 static int perf_config_set__init(struct perf_config_set *set)
 {
 	int ret = -1;
-	const char *home = NULL;
-	char *user_config;
-	struct stat st;
 
 	/* Setting $PERF_CONFIG makes perf read _only_ the given config file. */
 	if (config_exclusive_filename)
···
 		if (perf_config_from_file(collect_config, perf_etc_perfconfig(), set) < 0)
 			goto out;
 	}
-
-	home = getenv("HOME");
-
-	/*
-	 * Skip reading user config if:
-	 * - there is no place to read it from (HOME)
-	 * - we are asked not to (PERF_CONFIG_NOGLOBAL=1)
-	 */
-	if (!home || !*home || !perf_config_global())
-		return 0;
-
-	user_config = strdup(mkpath("%s/.perfconfig", home));
-	if (user_config == NULL) {
-		pr_warning("Not enough memory to process %s/.perfconfig, ignoring it.", home);
-		goto out;
+	if (perf_config_global() && perf_home_perfconfig()) {
+		if (perf_config_from_file(collect_config, perf_home_perfconfig(), set) < 0)
+			goto out;
 	}
 
-	if (stat(user_config, &st) < 0) {
-		if (errno == ENOENT)
-			ret = 0;
-		goto out_free;
-	}
-
-	ret = 0;
-
-	if (st.st_uid && (st.st_uid != geteuid())) {
-		pr_warning("File %s not owned by current user or root, ignoring it.", user_config);
-		goto out_free;
-	}
-
-	if (st.st_size)
-		ret = perf_config_from_file(collect_config, user_config, set);
-
-out_free:
-	free(user_config);
 out:
 	return ret;
 }
···
 	return set;
 }
 
+struct perf_config_set *perf_config_set__load_file(const char *file)
+{
+	struct perf_config_set *set = zalloc(sizeof(*set));
+
+	if (set) {
+		INIT_LIST_HEAD(&set->sections);
+		perf_config_from_file(collect_config, file, set);
+	}
+
+	return set;
+}
+
 static int perf_config__init(void)
 {
 	if (config_set == NULL)
···
 	return config_set == NULL;
 }
 
-int perf_config(config_fn_t fn, void *data)
+int perf_config_set(struct perf_config_set *set,
+		    config_fn_t fn, void *data)
 {
 	int ret = 0;
 	char key[BUFSIZ];
 	struct perf_config_section *section;
 	struct perf_config_item *item;
 
-	if (config_set == NULL && perf_config__init())
-		return -1;
-
-	perf_config_set__for_each_entry(config_set, section, item) {
+	perf_config_set__for_each_entry(set, section, item) {
 		char *value = item->value;
 
 		if (value) {
···
 	}
 out:
 	return ret;
+}
+
+int perf_config(config_fn_t fn, void *data)
+{
+	if (config_set == NULL && perf_config__init())
+		return -1;
+
+	return perf_config_set(config_set, fn, data);
 }
 
 void perf_config__exit(void)
+6 -1
tools/perf/util/config.h
···
 
 typedef int (*config_fn_t)(const char *, const char *, void *);
 
-int perf_config_from_file(config_fn_t fn, const char *filename, void *data);
 int perf_default_config(const char *, const char *, void *);
 int perf_config(config_fn_t fn, void *);
+int perf_config_set(struct perf_config_set *set,
+		    config_fn_t fn, void *data);
 int perf_config_int(int *dest, const char *, const char *);
 int perf_config_u8(u8 *dest, const char *name, const char *value);
 int perf_config_u64(u64 *dest, const char *, const char *);
 int perf_config_bool(const char *, const char *);
 int config_error_nonbool(const char *);
 const char *perf_etc_perfconfig(void);
+const char *perf_home_perfconfig(void);
+int perf_config_system(void);
+int perf_config_global(void);
 
 struct perf_config_set *perf_config_set__new(void);
+struct perf_config_set *perf_config_set__load_file(const char *file);
 void perf_config_set__delete(struct perf_config_set *set);
 int perf_config_set__collect(struct perf_config_set *set, const char *file_name,
 			     const char *var, const char *value);
+4 -11
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
···
 	packet->last_instr_subtype = elem->last_i_subtype;
 	packet->last_instr_cond = elem->last_instr_cond;
 
-	switch (elem->last_i_type) {
-	case OCSD_INSTR_BR:
-	case OCSD_INSTR_BR_INDIRECT:
+	if (elem->last_i_type == OCSD_INSTR_BR || elem->last_i_type == OCSD_INSTR_BR_INDIRECT)
 		packet->last_instr_taken_branch = elem->last_instr_exec;
-		break;
-	case OCSD_INSTR_ISB:
-	case OCSD_INSTR_DSB_DMB:
-	case OCSD_INSTR_WFI_WFE:
-	case OCSD_INSTR_OTHER:
-	default:
+	else
 		packet->last_instr_taken_branch = false;
-		break;
-	}
 
 	packet->last_instr_size = elem->last_instr_sz;
 
···
 	case OCSD_GEN_TRC_ELEM_EVENT:
 	case OCSD_GEN_TRC_ELEM_SWTRACE:
 	case OCSD_GEN_TRC_ELEM_CUSTOM:
+	case OCSD_GEN_TRC_ELEM_SYNC_MARKER:
+	case OCSD_GEN_TRC_ELEM_MEMTRANS:
 	default:
 		break;
 	}
+1 -1
tools/perf/util/data-convert-bt.c
···
 		goto out;
 	/*
 	 * Add '_' prefix to potential keywork. According to
-	 * Mathieu Desnoyers (https://lkml.org/lkml/2015/1/23/652),
+	 * Mathieu Desnoyers (https://lore.kernel.org/lkml/1074266107.40857.1422045946295.JavaMail.zimbra@efficios.com),
 	 * futher CTF spec updating may require us to use '$'.
 	 */
 	if (dup < 0)
+2
tools/perf/util/db-export.c
···
 	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TX_ABORT, "transaction abort"},
 	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_BEGIN, "trace begin"},
 	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_END, "trace end"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_VMENTRY, "vm entry"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_VMEXIT, "vm exit"},
 	{0, NULL}
 };
 
+31 -3
tools/perf/util/debug.c
···
 #include <api/debug.h>
 #include <linux/kernel.h>
 #include <linux/time64.h>
+#include <sys/time.h>
 #ifdef HAVE_BACKTRACE_SUPPORT
 #include <execinfo.h>
 #endif
···
 static int redirect_to_stderr;
 int debug_data_convert;
 static FILE *debug_file;
+bool debug_display_time;
 
 void debug_set_file(FILE *file)
 {
 	debug_file = file;
+}
+
+void debug_set_display_time(bool set)
+{
+	debug_display_time = set;
+}
+
+static int fprintf_time(FILE *file)
+{
+	struct timeval tod;
+	struct tm ltime;
+	char date[64];
+
+	if (!debug_display_time)
+		return 0;
+
+	if (gettimeofday(&tod, NULL) != 0)
+		return 0;
+
+	if (localtime_r(&tod.tv_sec, &ltime) == NULL)
+		return 0;
+
+	strftime(date, sizeof(date), "%F %H:%M:%S", &ltime);
+	return fprintf(file, "[%s.%06lu] ", date, (long)tod.tv_usec);
 }
 
 int veprintf(int level, int var, const char *fmt, va_list args)
···
 	int ret = 0;
 
 	if (var >= level) {
-		if (use_browser >= 1 && !redirect_to_stderr)
+		if (use_browser >= 1 && !redirect_to_stderr) {
 			ui_helpline__vshow(fmt, args);
-		else
-			ret = vfprintf(debug_file, fmt, args);
+		} else {
+			ret = fprintf_time(debug_file);
+			ret += vfprintf(debug_file, fmt, args);
+		}
 	}
 
 	return ret;
+1
tools/perf/util/debug.h
···
 
 int perf_debug_option(const char *str);
 void debug_set_file(FILE *file);
+void debug_set_display_time(bool set);
 void perf_debug_setup(void);
 int perf_quiet_option(void);
 
+80
tools/perf/util/demangle-ocaml.c
···
+// SPDX-License-Identifier: GPL-2.0
+#include <string.h>
+#include <stdlib.h>
+#include "util/string2.h"
+
+#include "demangle-ocaml.h"
+
+#include <linux/ctype.h>
+
+static const char *caml_prefix = "caml";
+static const size_t caml_prefix_len = 4;
+
+/* mangled OCaml symbols start with "caml" followed by an upper-case letter */
+static bool
+ocaml_is_mangled(const char *sym)
+{
+	return 0 == strncmp(sym, caml_prefix, caml_prefix_len)
+		&& isupper(sym[caml_prefix_len]);
+}
+
+/*
+ * input:
+ *   sym: a symbol which may have been mangled by the OCaml compiler
+ * return:
+ *   if the input doesn't look like a mangled OCaml symbol, NULL is returned
+ *   otherwise, a newly allocated string containing the demangled symbol is returned
+ */
+char *
+ocaml_demangle_sym(const char *sym)
+{
+	char *result;
+	int j = 0;
+	int i;
+	int len;
+
+	if (!ocaml_is_mangled(sym)) {
+		return NULL;
+	}
+
+	len = strlen(sym);
+
+	/* the demangled symbol is always smaller than the mangled symbol */
+	result = malloc(len + 1);
+	if (!result)
+		return NULL;
+
+	/* skip "caml" prefix */
+	i = caml_prefix_len;
+
+	while (i < len) {
+		if (sym[i] == '_' && sym[i + 1] == '_') {
+			/* "__" -> "." */
+			result[j++] = '.';
+			i += 2;
+		}
+		else if (sym[i] == '$' && isxdigit(sym[i + 1]) && isxdigit(sym[i + 2])) {
+			/* "$xx" is a hex-encoded character */
+			result[j++] = (hex(sym[i + 1]) << 4) | hex(sym[i + 2]);
+			i += 3;
+		}
+		else {
+			result[j++] = sym[i++];
+		}
+	}
+	result[j] = '\0';
+
+	/* scan backwards to remove an "_" followed by decimal digits */
+	if (j != 0 && isdigit(result[j - 1])) {
+		while (--j) {
+			if (!isdigit(result[j])) {
+				break;
+			}
+		}
+		if (result[j] == '_') {
+			result[j] = '\0';
+		}
+	}
+
+	return result;
+}
+7
tools/perf/util/demangle-ocaml.h
···
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_DEMANGLE_OCAML
+#define __PERF_DEMANGLE_OCAML 1
+
+char * ocaml_demangle_sym(const char *str);
+
+#endif /* __PERF_DEMANGLE_OCAML */
+56 -11
tools/perf/util/event.c
··· 288 288 289 289 size_t perf_event__fprintf_mmap2(union perf_event *event, FILE *fp) 290 290 { 291 - return fprintf(fp, " %d/%d: [%#" PRI_lx64 "(%#" PRI_lx64 ") @ %#" PRI_lx64 292 - " %02x:%02x %"PRI_lu64" %"PRI_lu64"]: %c%c%c%c %s\n", 293 - event->mmap2.pid, event->mmap2.tid, event->mmap2.start, 294 - event->mmap2.len, event->mmap2.pgoff, event->mmap2.maj, 295 - event->mmap2.min, event->mmap2.ino, 296 - event->mmap2.ino_generation, 297 - (event->mmap2.prot & PROT_READ) ? 'r' : '-', 298 - (event->mmap2.prot & PROT_WRITE) ? 'w' : '-', 299 - (event->mmap2.prot & PROT_EXEC) ? 'x' : '-', 300 - (event->mmap2.flags & MAP_SHARED) ? 's' : 'p', 301 - event->mmap2.filename); 291 + if (event->header.misc & PERF_RECORD_MISC_MMAP_BUILD_ID) { 292 + char sbuild_id[SBUILD_ID_SIZE]; 293 + struct build_id bid; 294 + 295 + build_id__init(&bid, event->mmap2.build_id, 296 + event->mmap2.build_id_size); 297 + build_id__sprintf(&bid, sbuild_id); 298 + 299 + return fprintf(fp, " %d/%d: [%#" PRI_lx64 "(%#" PRI_lx64 ") @ %#" PRI_lx64 300 + " <%s>]: %c%c%c%c %s\n", 301 + event->mmap2.pid, event->mmap2.tid, event->mmap2.start, 302 + event->mmap2.len, event->mmap2.pgoff, sbuild_id, 303 + (event->mmap2.prot & PROT_READ) ? 'r' : '-', 304 + (event->mmap2.prot & PROT_WRITE) ? 'w' : '-', 305 + (event->mmap2.prot & PROT_EXEC) ? 'x' : '-', 306 + (event->mmap2.flags & MAP_SHARED) ? 's' : 'p', 307 + event->mmap2.filename); 308 + } else { 309 + return fprintf(fp, " %d/%d: [%#" PRI_lx64 "(%#" PRI_lx64 ") @ %#" PRI_lx64 310 + " %02x:%02x %"PRI_lu64" %"PRI_lu64"]: %c%c%c%c %s\n", 311 + event->mmap2.pid, event->mmap2.tid, event->mmap2.start, 312 + event->mmap2.len, event->mmap2.pgoff, event->mmap2.maj, 313 + event->mmap2.min, event->mmap2.ino, 314 + event->mmap2.ino_generation, 315 + (event->mmap2.prot & PROT_READ) ? 'r' : '-', 316 + (event->mmap2.prot & PROT_WRITE) ? 'w' : '-', 317 + (event->mmap2.prot & PROT_EXEC) ? 'x' : '-', 318 + (event->mmap2.flags & MAP_SHARED) ? 
's' : 'p', 319 + event->mmap2.filename); 320 + } 302 321 } 303 322 304 323 size_t perf_event__fprintf_thread_map(union perf_event *event, FILE *fp) ··· 645 626 return al->sym; 646 627 } 647 628 629 + static bool check_address_range(struct intlist *addr_list, int addr_range, 630 + unsigned long addr) 631 + { 632 + struct int_node *pos; 633 + 634 + intlist__for_each_entry(pos, addr_list) { 635 + if (addr >= pos->i && addr < pos->i + addr_range) 636 + return true; 637 + } 638 + 639 + return false; 640 + } 641 + 648 642 /* 649 643 * Callers need to drop the reference to al->thread, obtained in 650 644 * machine__findnew_thread() ··· 705 673 } 706 674 707 675 al->sym = map__find_symbol(al->map, al->addr); 676 + } else if (symbol_conf.dso_list) { 677 + al->filtered |= (1 << HIST_FILTER__DSO); 708 678 } 709 679 710 680 if (symbol_conf.sym_list) { ··· 724 690 ret = strlist__has_entry(symbol_conf.sym_list, 725 691 al_addr_str); 726 692 } 693 + if (!ret && symbol_conf.addr_list && al->map) { 694 + unsigned long addr = al->map->unmap_ip(al->map, al->addr); 695 + 696 + ret = intlist__has_entry(symbol_conf.addr_list, addr); 697 + if (!ret && symbol_conf.addr_range) { 698 + ret = check_address_range(symbol_conf.addr_list, 699 + symbol_conf.addr_range, 700 + addr); 701 + } 702 + } 703 + 727 704 if (!ret) 728 705 al->filtered |= (1 << HIST_FILTER__SYMBOL); 729 706 }
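The event.c hunk above adds `check_address_range()`, which accepts an address that falls in the half-open window `[base, base + addr_range)` for any base in the `--addr` filter list. A standalone sketch of that test, with a plain array standing in for perf's intlist:

```c
#include <stdbool.h>
#include <stddef.h>

/* Same half-open range test as check_address_range(): an address
 * passes if it lands in [base, base + range) for some base in the
 * filter list.  perf keeps the bases in an intlist; a plain array
 * stands in for it here. */
static bool addr_in_ranges(const unsigned long *bases, size_t nr,
			   unsigned long range, unsigned long addr)
{
	for (size_t i = 0; i < nr; i++) {
		if (addr >= bases[i] && addr < bases[i] + range)
			return true;
	}
	return false;
}
```

Note the range is half-open: `base + range` itself is rejected, matching the `addr < pos->i + addr_range` comparison in the patch.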
+17 -1
tools/perf/util/event.h
··· 96 96 PERF_IP_FLAG_TRACE_BEGIN = 1ULL << 8, 97 97 PERF_IP_FLAG_TRACE_END = 1ULL << 9, 98 98 PERF_IP_FLAG_IN_TX = 1ULL << 10, 99 + PERF_IP_FLAG_VMENTRY = 1ULL << 11, 100 + PERF_IP_FLAG_VMEXIT = 1ULL << 12, 99 101 }; 100 102 101 103 #define PERF_IP_FLAG_CHARS "bcrosyiABEx" ··· 112 110 PERF_IP_FLAG_INTERRUPT |\ 113 111 PERF_IP_FLAG_TX_ABORT |\ 114 112 PERF_IP_FLAG_TRACE_BEGIN |\ 115 - PERF_IP_FLAG_TRACE_END) 113 + PERF_IP_FLAG_TRACE_END |\ 114 + PERF_IP_FLAG_VMENTRY |\ 115 + PERF_IP_FLAG_VMEXIT) 116 116 117 117 #define MAX_INSN 16 118 118 ··· 140 136 u64 data_src; 141 137 u64 phys_addr; 142 138 u64 data_page_size; 139 + u64 code_page_size; 143 140 u64 cgroup; 144 141 u32 flags; 145 142 u16 insn_len; 146 143 u8 cpumode; 147 144 u16 misc; 145 + u16 ins_lat; 148 146 bool no_hw_idx; /* No hw_idx collected in branch_stack */ 149 147 char insn[MAX_INSN]; 150 148 void *raw_data; ··· 177 171 PERF_SYNTH_INTEL_EXSTOP, 178 172 PERF_SYNTH_INTEL_PWRX, 179 173 PERF_SYNTH_INTEL_CBR, 174 + PERF_SYNTH_INTEL_PSB, 180 175 }; 181 176 182 177 /* ··· 268 261 }; 269 262 u32 freq; 270 263 u32 reserved3; 264 + }; 265 + 266 + struct perf_synth_intel_psb { 267 + u32 padding; 268 + u32 reserved; 269 + u64 offset; 271 270 }; 272 271 273 272 /* ··· 424 411 425 412 #define PAGE_SIZE_NAME_LEN 32 426 413 char *get_page_size_name(u64 size, char *str); 414 + 415 + void arch_perf_parse_sample_weight(struct perf_sample *data, const __u64 *array, u64 type); 416 + void arch_perf_synthesize_sample_weight(const struct perf_sample *data, __u64 *array, u64 type); 427 417 428 418 #endif /* __PERF_RECORD_H */
+122 -3
tools/perf/util/evlist.c
··· 24 24 #include "bpf-event.h" 25 25 #include "util/string2.h" 26 26 #include "util/perf_api_probe.h" 27 + #include "util/evsel_fprintf.h" 27 28 #include <signal.h> 28 29 #include <unistd.h> 29 30 #include <sched.h> ··· 304 303 return evlist__add_attrs(evlist, attrs, nr_attrs); 305 304 } 306 305 306 + __weak int arch_evlist__add_default_attrs(struct evlist *evlist __maybe_unused) 307 + { 308 + return 0; 309 + } 310 + 307 311 struct evsel *evlist__find_tracepoint_by_id(struct evlist *evlist, int id) 308 312 { 309 313 struct evsel *evsel; ··· 577 571 { 578 572 return perf_evlist__filter_pollfd(&evlist->core, revents_and_mask); 579 573 } 574 + 575 + #ifdef HAVE_EVENTFD_SUPPORT 576 + int evlist__add_wakeup_eventfd(struct evlist *evlist, int fd) 577 + { 578 + return perf_evlist__add_pollfd(&evlist->core, fd, NULL, POLLIN, 579 + fdarray_flag__nonfilterable); 580 + } 581 + #endif 580 582 581 583 int evlist__poll(struct evlist *evlist, int timeout) 582 584 { ··· 1950 1936 (sizeof(EVLIST_CTL_CMD_SNAPSHOT_TAG)-1))) { 1951 1937 *cmd = EVLIST_CTL_CMD_SNAPSHOT; 1952 1938 pr_debug("is snapshot\n"); 1939 + } else if (!strncmp(cmd_data, EVLIST_CTL_CMD_EVLIST_TAG, 1940 + (sizeof(EVLIST_CTL_CMD_EVLIST_TAG)-1))) { 1941 + *cmd = EVLIST_CTL_CMD_EVLIST; 1942 + } else if (!strncmp(cmd_data, EVLIST_CTL_CMD_STOP_TAG, 1943 + (sizeof(EVLIST_CTL_CMD_STOP_TAG)-1))) { 1944 + *cmd = EVLIST_CTL_CMD_STOP; 1945 + } else if (!strncmp(cmd_data, EVLIST_CTL_CMD_PING_TAG, 1946 + (sizeof(EVLIST_CTL_CMD_PING_TAG)-1))) { 1947 + *cmd = EVLIST_CTL_CMD_PING; 1953 1948 } 1954 1949 } 1955 1950 ··· 1980 1957 return err; 1981 1958 } 1982 1959 1960 + static int get_cmd_arg(char *cmd_data, size_t cmd_size, char **arg) 1961 + { 1962 + char *data = cmd_data + cmd_size; 1963 + 1964 + /* no argument */ 1965 + if (!*data) 1966 + return 0; 1967 + 1968 + /* there's argument */ 1969 + if (*data == ' ') { 1970 + *arg = data + 1; 1971 + return 1; 1972 + } 1973 + 1974 + /* malformed */ 1975 + return -1; 1976 + } 1977 + 1978 
+ static int evlist__ctlfd_enable(struct evlist *evlist, char *cmd_data, bool enable) 1979 + { 1980 + struct evsel *evsel; 1981 + char *name; 1982 + int err; 1983 + 1984 + err = get_cmd_arg(cmd_data, 1985 + enable ? sizeof(EVLIST_CTL_CMD_ENABLE_TAG) - 1 : 1986 + sizeof(EVLIST_CTL_CMD_DISABLE_TAG) - 1, 1987 + &name); 1988 + if (err < 0) { 1989 + pr_info("failed: wrong command\n"); 1990 + return -1; 1991 + } 1992 + 1993 + if (err) { 1994 + evsel = evlist__find_evsel_by_str(evlist, name); 1995 + if (evsel) { 1996 + if (enable) 1997 + evlist__enable_evsel(evlist, name); 1998 + else 1999 + evlist__disable_evsel(evlist, name); 2000 + pr_info("Event %s %s\n", evsel->name, 2001 + enable ? "enabled" : "disabled"); 2002 + } else { 2003 + pr_info("failed: can't find '%s' event\n", name); 2004 + } 2005 + } else { 2006 + if (enable) { 2007 + evlist__enable(evlist); 2008 + pr_info(EVLIST_ENABLED_MSG); 2009 + } else { 2010 + evlist__disable(evlist); 2011 + pr_info(EVLIST_DISABLED_MSG); 2012 + } 2013 + } 2014 + 2015 + return 0; 2016 + } 2017 + 2018 + static int evlist__ctlfd_list(struct evlist *evlist, char *cmd_data) 2019 + { 2020 + struct perf_attr_details details = { .verbose = false, }; 2021 + struct evsel *evsel; 2022 + char *arg; 2023 + int err; 2024 + 2025 + err = get_cmd_arg(cmd_data, 2026 + sizeof(EVLIST_CTL_CMD_EVLIST_TAG) - 1, 2027 + &arg); 2028 + if (err < 0) { 2029 + pr_info("failed: wrong command\n"); 2030 + return -1; 2031 + } 2032 + 2033 + if (err) { 2034 + if (!strcmp(arg, "-v")) { 2035 + details.verbose = true; 2036 + } else if (!strcmp(arg, "-g")) { 2037 + details.event_group = true; 2038 + } else if (!strcmp(arg, "-F")) { 2039 + details.freq = true; 2040 + } else { 2041 + pr_info("failed: wrong command\n"); 2042 + return -1; 2043 + } 2044 + } 2045 + 2046 + evlist__for_each_entry(evlist, evsel) 2047 + evsel__fprintf(evsel, &details, stderr); 2048 + 2049 + return 0; 2050 + } 2051 + 1983 2052 int evlist__ctlfd_process(struct evlist *evlist, enum evlist_ctl_cmd 
*cmd) 1984 2053 { 1985 2054 int err = 0; ··· 2088 1973 if (err > 0) { 2089 1974 switch (*cmd) { 2090 1975 case EVLIST_CTL_CMD_ENABLE: 2091 - evlist__enable(evlist); 2092 - break; 2093 1976 case EVLIST_CTL_CMD_DISABLE: 2094 - evlist__disable(evlist); 1977 + err = evlist__ctlfd_enable(evlist, cmd_data, 1978 + *cmd == EVLIST_CTL_CMD_ENABLE); 1979 + break; 1980 + case EVLIST_CTL_CMD_EVLIST: 1981 + err = evlist__ctlfd_list(evlist, cmd_data); 2095 1982 break; 2096 1983 case EVLIST_CTL_CMD_SNAPSHOT: 1984 + case EVLIST_CTL_CMD_STOP: 1985 + case EVLIST_CTL_CMD_PING: 2097 1986 break; 2098 1987 case EVLIST_CTL_CMD_ACK: 2099 1988 case EVLIST_CTL_CMD_UNSUPPORTED:
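`get_cmd_arg()` in the hunk above distinguishes three shapes of control message: a bare tag (`"evlist"`), a tag with an argument (`"enable cycles"`), and anything else, which is malformed. The same logic, extracted into self-contained form:

```c
#include <stddef.h>

/* Mirror of evlist.c's get_cmd_arg(): cmd_size is the length of the
 * command tag; what follows is either nothing (no argument), a space
 * plus the argument, or garbage. */
static int cmd_arg(char *cmd_data, size_t cmd_size, char **arg)
{
	char *data = cmd_data + cmd_size;

	if (!*data)
		return 0;		/* no argument */
	if (*data == ' ') {
		*arg = data + 1;	/* argument present */
		return 1;
	}
	return -1;			/* malformed */
}
```

This is why the callers pass `sizeof(EVLIST_CTL_CMD_ENABLE_TAG) - 1`: the tag's string length, excluding the NUL terminator.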
+12
tools/perf/util/evlist.h
··· 110 110 #define evlist__add_default_attrs(evlist, array) \ 111 111 __evlist__add_default_attrs(evlist, array, ARRAY_SIZE(array)) 112 112 113 + int arch_evlist__add_default_attrs(struct evlist *evlist); 114 + 113 115 int evlist__add_dummy(struct evlist *evlist); 114 116 115 117 int evlist__add_sb_event(struct evlist *evlist, struct perf_event_attr *attr, ··· 143 141 144 142 int evlist__add_pollfd(struct evlist *evlist, int fd); 145 143 int evlist__filter_pollfd(struct evlist *evlist, short revents_and_mask); 144 + 145 + #ifdef HAVE_EVENTFD_SUPPORT 146 + int evlist__add_wakeup_eventfd(struct evlist *evlist, int fd); 147 + #endif 146 148 147 149 int evlist__poll(struct evlist *evlist, int timeout); 148 150 ··· 336 330 #define EVLIST_CTL_CMD_DISABLE_TAG "disable" 337 331 #define EVLIST_CTL_CMD_ACK_TAG "ack\n" 338 332 #define EVLIST_CTL_CMD_SNAPSHOT_TAG "snapshot" 333 + #define EVLIST_CTL_CMD_EVLIST_TAG "evlist" 334 + #define EVLIST_CTL_CMD_STOP_TAG "stop" 335 + #define EVLIST_CTL_CMD_PING_TAG "ping" 339 336 340 337 #define EVLIST_CTL_CMD_MAX_LEN 64 341 338 ··· 348 339 EVLIST_CTL_CMD_DISABLE, 349 340 EVLIST_CTL_CMD_ACK, 350 341 EVLIST_CTL_CMD_SNAPSHOT, 342 + EVLIST_CTL_CMD_EVLIST, 343 + EVLIST_CTL_CMD_STOP, 344 + EVLIST_CTL_CMD_PING, 351 345 }; 352 346 353 347 int evlist__parse_control(const char *str, int *ctl_fd, int *ctl_fd_ack, bool *ctl_fd_close);
+55 -8
tools/perf/util/evsel.c
··· 25 25 #include <stdlib.h> 26 26 #include <perf/evsel.h> 27 27 #include "asm/bug.h" 28 + #include "bpf_counter.h" 28 29 #include "callchain.h" 29 30 #include "cgroup.h" 30 31 #include "counts.h" ··· 248 247 evsel->bpf_obj = NULL; 249 248 evsel->bpf_fd = -1; 250 249 INIT_LIST_HEAD(&evsel->config_terms); 250 + INIT_LIST_HEAD(&evsel->bpf_counter_list); 251 251 perf_evsel__object.init(evsel); 252 252 evsel->sample_size = __evsel__sample_size(attr->sample_type); 253 253 evsel__calc_id_pos(evsel); ··· 1014 1012 return found_term; 1015 1013 } 1016 1014 1015 + void __weak arch_evsel__set_sample_weight(struct evsel *evsel) 1016 + { 1017 + evsel__set_sample_bit(evsel, WEIGHT); 1018 + } 1019 + 1017 1020 /* 1018 1021 * The enable_on_exec/disabled value strategy: 1019 1022 * ··· 1173 1166 } 1174 1167 1175 1168 if (opts->sample_weight) 1176 - evsel__set_sample_bit(evsel, WEIGHT); 1169 + arch_evsel__set_sample_weight(evsel); 1177 1170 1178 - attr->task = track; 1179 - attr->mmap = track; 1180 - attr->mmap2 = track && !perf_missing_features.mmap2; 1181 - attr->comm = track; 1171 + attr->task = track; 1172 + attr->mmap = track; 1173 + attr->mmap2 = track && !perf_missing_features.mmap2; 1174 + attr->comm = track; 1175 + attr->build_id = track && opts->build_id; 1176 + 1182 1177 /* 1183 1178 * ksymbol is tracked separately with text poke because it needs to be 1184 1179 * system wide and enabled immediately. 
··· 1199 1190 1200 1191 if (opts->sample_data_page_size) 1201 1192 evsel__set_sample_bit(evsel, DATA_PAGE_SIZE); 1193 + 1194 + if (opts->sample_code_page_size) 1195 + evsel__set_sample_bit(evsel, CODE_PAGE_SIZE); 1202 1196 1203 1197 if (opts->record_switch_events) 1204 1198 attr->context_switch = track; ··· 1378 1366 { 1379 1367 assert(list_empty(&evsel->core.node)); 1380 1368 assert(evsel->evlist == NULL); 1369 + bpf_counter__destroy(evsel); 1381 1370 evsel__free_counts(evsel); 1382 1371 perf_evsel__free_fd(&evsel->core); 1383 1372 perf_evsel__free_id(&evsel->core); ··· 1748 1735 } 1749 1736 1750 1737 fallback_missing_features: 1738 + if (perf_missing_features.weight_struct) { 1739 + evsel__set_sample_bit(evsel, WEIGHT); 1740 + evsel__reset_sample_bit(evsel, WEIGHT_STRUCT); 1741 + } 1751 1742 if (perf_missing_features.clockid_wrong) 1752 1743 evsel->core.attr.clockid = CLOCK_MONOTONIC; /* should always work */ 1753 1744 if (perf_missing_features.clockid) { ··· 1797 1780 group_fd, flags); 1798 1781 1799 1782 FD(evsel, cpu, thread) = fd; 1783 + 1784 + bpf_counter__install_pe(evsel, cpu, fd); 1800 1785 1801 1786 if (unlikely(test_attr__enabled)) { 1802 1787 test_attr__open(&evsel->core.attr, pid, cpus->map[cpu], ··· 1892 1873 * Must probe features in the order they were added to the 1893 1874 * perf_event_attr interface. 
1894 1875 */ 1895 - if (!perf_missing_features.data_page_size && 1876 + if (!perf_missing_features.weight_struct && 1877 + (evsel->core.attr.sample_type & PERF_SAMPLE_WEIGHT_STRUCT)) { 1878 + perf_missing_features.weight_struct = true; 1879 + pr_debug2("switching off weight struct support\n"); 1880 + goto fallback_missing_features; 1881 + } else if (!perf_missing_features.code_page_size && 1882 + (evsel->core.attr.sample_type & PERF_SAMPLE_CODE_PAGE_SIZE)) { 1883 + perf_missing_features.code_page_size = true; 1884 + pr_debug2_peo("Kernel has no PERF_SAMPLE_CODE_PAGE_SIZE support, bailing out\n"); 1885 + goto out_close; 1886 + } else if (!perf_missing_features.data_page_size && 1896 1887 (evsel->core.attr.sample_type & PERF_SAMPLE_DATA_PAGE_SIZE)) { 1897 1888 perf_missing_features.data_page_size = true; 1898 1889 pr_debug2_peo("Kernel has no PERF_SAMPLE_DATA_PAGE_SIZE support, bailing out\n"); ··· 2103 2074 return -EFAULT; 2104 2075 2105 2076 return 0; 2077 + } 2078 + 2079 + void __weak arch_perf_parse_sample_weight(struct perf_sample *data, 2080 + const __u64 *array, 2081 + u64 type __maybe_unused) 2082 + { 2083 + data->weight = *array; 2106 2084 } 2107 2085 2108 2086 int evsel__parse_sample(struct evsel *evsel, union perf_event *event, ··· 2352 2316 } 2353 2317 } 2354 2318 2355 - if (type & PERF_SAMPLE_WEIGHT) { 2319 + if (type & PERF_SAMPLE_WEIGHT_TYPE) { 2356 2320 OVERFLOW_CHECK_u64(array); 2357 - data->weight = *array; 2321 + arch_perf_parse_sample_weight(data, array, type); 2358 2322 array++; 2359 2323 } 2360 2324 ··· 2402 2366 data->data_page_size = 0; 2403 2367 if (type & PERF_SAMPLE_DATA_PAGE_SIZE) { 2404 2368 data->data_page_size = *array; 2369 + array++; 2370 + } 2371 + 2372 + data->code_page_size = 0; 2373 + if (type & PERF_SAMPLE_CODE_PAGE_SIZE) { 2374 + data->code_page_size = *array; 2405 2375 array++; 2406 2376 } 2407 2377 ··· 2720 2678 "We found oprofile daemon running, please stop it and try again."); 2721 2679 break; 2722 2680 case EINVAL: 2681 + 
if (evsel->core.attr.sample_type & PERF_SAMPLE_CODE_PAGE_SIZE && perf_missing_features.code_page_size) 2682 + return scnprintf(msg, size, "Asking for the code page size isn't supported by this kernel."); 2723 2683 if (evsel->core.attr.sample_type & PERF_SAMPLE_DATA_PAGE_SIZE && perf_missing_features.data_page_size) 2724 2684 return scnprintf(msg, size, "Asking for the data page size isn't supported by this kernel."); 2725 2685 if (evsel->core.attr.write_backward && perf_missing_features.write_backward) ··· 2733 2689 if (perf_missing_features.aux_output) 2734 2690 return scnprintf(msg, size, "The 'aux_output' feature is not supported, update the kernel."); 2735 2691 break; 2692 + case ENODATA: 2693 + return scnprintf(msg, size, "Cannot collect data source with the load latency event alone. " 2694 + "Please add an auxiliary event in front of the load latency event."); 2736 2695 default: 2737 2696 break; 2738 2697 }
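The comment in the evsel.c hunk ("Must probe features in the order they were added") matters because the kernel rejects the whole attr when any one bit is unknown, so perf switches features off newest-first until the open succeeds. A simplified model of that negotiation (the bit values and helper names are invented; the real code retries for `weight_struct` but abandons the open outright for the page-size features):

```c
#include <stdbool.h>
#include <stdint.h>

#define FAKE_WEIGHT_STRUCT (1u << 2)	/* newest */
#define FAKE_CODE_PAGE_SZ  (1u << 1)
#define FAKE_DATA_PAGE_SZ  (1u << 0)	/* oldest */

/* Stand-in for perf_event_open(): accept only supported bits. */
static bool fake_open(uint32_t sample_type, uint32_t supported)
{
	return (sample_type & ~supported) == 0;
}

/* Probe newest-first, clearing one feature per failed attempt,
 * mirroring the fallback_missing_features retry loop. */
static uint32_t negotiate(uint32_t wanted, uint32_t supported)
{
	const uint32_t order[] = { FAKE_WEIGHT_STRUCT, FAKE_CODE_PAGE_SZ,
				   FAKE_DATA_PAGE_SZ };

	for (int i = 0; i < 3; i++) {
		if (fake_open(wanted, supported))
			return wanted;
		wanted &= ~order[i];	/* switch off the newest feature */
	}
	return fake_open(wanted, supported) ? wanted : 0;
}
```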
+9
tools/perf/util/evsel.h
··· 17 17 struct perf_counts; 18 18 struct perf_stat_evsel; 19 19 union perf_event; 20 + struct bpf_counter_ops; 21 + struct target; 20 22 21 23 typedef int (evsel__sb_cb_t)(union perf_event *event, void *data); 22 24 ··· 129 127 * See also evsel__has_callchain(). 130 128 */ 131 129 __u64 synth_sample_type; 130 + struct list_head bpf_counter_list; 131 + struct bpf_counter_ops *bpf_counter_ops; 132 132 }; 133 133 134 134 struct perf_missing_features { ··· 149 145 bool branch_hw_idx; 150 146 bool cgroup; 151 147 bool data_page_size; 148 + bool code_page_size; 149 + bool weight_struct; 152 150 }; 153 151 154 152 extern struct perf_missing_features perf_missing_features; ··· 244 238 __evsel__reset_sample_bit(evsel, PERF_SAMPLE_##bit) 245 239 246 240 void evsel__set_sample_id(struct evsel *evsel, bool use_sample_identifier); 241 + 242 + void arch_evsel__set_sample_weight(struct evsel *evsel); 247 243 248 244 int evsel__set_filter(struct evsel *evsel, const char *filter); 249 245 int evsel__append_tp_filter(struct evsel *evsel, const char *filter); ··· 432 424 struct perf_env *evsel__env(struct evsel *evsel); 433 425 434 426 int evsel__store_ids(struct evsel *evsel, struct evlist *evlist); 427 + 435 428 #endif /* __PERF_EVSEL_H */
+2
tools/perf/util/evsel_fprintf.c
···  100 100 		return ++printed;
     101 101 }
     102 102
         103 + #ifndef PYTHON_PERF
     103 104 int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
     104 105 			      unsigned int print_opts, struct callchain_cursor *cursor,
     105 106 			      struct strlist *bt_stop_list, FILE *fp)
···  240 239
     241 240 	return printed;
     242 241 }
         242 + #endif /* PYTHON_PERF */
+1 -1
tools/perf/util/header.c
··· 3806 3806 	 * check for the pipe header regardless of source.
    3807 3807 	 */
    3808 3808 	err = perf_header__read_pipe(session);
    3809      -	if (!err || (err && perf_data__is_pipe(data))) {
         3809 +	if (!err || perf_data__is_pipe(data)) {
    3810 3810 		data->is_pipe = true;
    3811 3811 		return err;
    3812 3812 	}
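The header.c change is a pure simplification: `!err || (err && pipe)` reduces to `!err || pipe`, because whenever `err` is false the left disjunct is already true, so the `err &&` guard contributes nothing. An exhaustive check of the identity over both booleans:

```c
#include <stdbool.h>

/* The two forms from the header.c hunk, checked for equivalence
 * over all four input combinations. */
static bool old_form(bool err, bool pipe) { return !err || (err && pipe); }
static bool new_form(bool err, bool pipe) { return !err || pipe; }

static bool forms_agree(void)
{
	for (int err = 0; err <= 1; err++)
		for (int pipe = 0; pipe <= 1; pipe++)
			if (old_form(err, pipe) != new_form(err, pipe))
				return false;
	return true;
}
```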
+12 -3
tools/perf/util/hist.c
··· 208 208 hists__new_col_len(hists, HISTC_MEM_LVL, 21 + 3); 209 209 hists__new_col_len(hists, HISTC_LOCAL_WEIGHT, 12); 210 210 hists__new_col_len(hists, HISTC_GLOBAL_WEIGHT, 12); 211 + hists__new_col_len(hists, HISTC_MEM_BLOCKED, 10); 212 + hists__new_col_len(hists, HISTC_LOCAL_INS_LAT, 13); 213 + hists__new_col_len(hists, HISTC_GLOBAL_INS_LAT, 13); 211 214 if (symbol_conf.nanosecs) 212 215 hists__new_col_len(hists, HISTC_TIME, 16); 213 216 else 214 217 hists__new_col_len(hists, HISTC_TIME, 12); 218 + hists__new_col_len(hists, HISTC_CODE_PAGE_SIZE, 6); 215 219 216 220 if (h->srcline) { 217 221 len = MAX(strlen(h->srcline), strlen(sort_srcline.se_header)); ··· 289 285 } 290 286 291 287 static void he_stat__add_period(struct he_stat *he_stat, u64 period, 292 - u64 weight) 288 + u64 weight, u64 ins_lat) 293 289 { 294 290 295 291 he_stat->period += period; 296 292 he_stat->weight += weight; 297 293 he_stat->nr_events += 1; 294 + he_stat->ins_lat += ins_lat; 298 295 } 299 296 300 297 static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src) ··· 307 302 dest->period_guest_us += src->period_guest_us; 308 303 dest->nr_events += src->nr_events; 309 304 dest->weight += src->weight; 305 + dest->ins_lat += src->ins_lat; 310 306 } 311 307 312 308 static void he_stat__decay(struct he_stat *he_stat) ··· 596 590 int64_t cmp; 597 591 u64 period = entry->stat.period; 598 592 u64 weight = entry->stat.weight; 593 + u64 ins_lat = entry->stat.ins_lat; 599 594 bool leftmost = true; 600 595 601 596 p = &hists->entries_in->rb_root.rb_node; ··· 615 608 616 609 if (!cmp) { 617 610 if (sample_self) { 618 - he_stat__add_period(&he->stat, period, weight); 611 + he_stat__add_period(&he->stat, period, weight, ins_lat); 619 612 hist_entry__add_callchain_period(he, period); 620 613 } 621 614 if (symbol_conf.cumulate_callchain) 622 - he_stat__add_period(he->stat_acc, period, weight); 615 + he_stat__add_period(he->stat_acc, period, weight, ins_lat); 623 616 624 617 /* 625 618 * This 
mem info was allocated from sample__resolve_mem ··· 725 718 .cpumode = al->cpumode, 726 719 .ip = al->addr, 727 720 .level = al->level, 721 + .code_page_size = sample->code_page_size, 728 722 .stat = { 729 723 .nr_events = 1, 730 724 .period = sample->period, 731 725 .weight = sample->weight, 726 + .ins_lat = sample->ins_lat, 732 727 }, 733 728 .parent = sym_parent, 734 729 .filtered = symbol__parent_filter(sym_parent) | al->filtered,
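`he_stat__add_period()` in the hist.c hunk now also sums instruction latency per hist entry, alongside period and weight; a mean per-sample latency then falls out of `nr_events`. (That the reported "local" latency is sum divided by event count is an assumption here; the display side lives in sort.c.) A minimal sketch of the accumulation:

```c
#include <stdint.h>

/* Cut-down he_stat: only the fields touched by the patch. */
struct he_stat_sketch {
	uint64_t period;
	uint64_t weight;
	uint64_t ins_lat;	/* new in this series */
	uint32_t nr_events;
};

/* Mirrors he_stat__add_period(): everything is summed per entry. */
static void add_period(struct he_stat_sketch *s, uint64_t period,
		       uint64_t weight, uint64_t ins_lat)
{
	s->period    += period;
	s->weight    += weight;
	s->ins_lat   += ins_lat;
	s->nr_events += 1;
}

/* Assumed derivation of the displayed per-sample latency. */
static uint64_t mean_ins_lat(const struct he_stat_sketch *s)
{
	return s->nr_events ? s->ins_lat / s->nr_events : 0;
}
```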
+4
tools/perf/util/hist.h
···   53  53 	HISTC_DSO_TO,
      54  54 	HISTC_LOCAL_WEIGHT,
      55  55 	HISTC_GLOBAL_WEIGHT,
          56 +	HISTC_CODE_PAGE_SIZE,
      56  57 	HISTC_MEM_DADDR_SYMBOL,
      57  58 	HISTC_MEM_DADDR_DSO,
      58  59 	HISTC_MEM_PHYS_DADDR,
···   72  71 	HISTC_SYM_SIZE,
      73  72 	HISTC_DSO_SIZE,
      74  73 	HISTC_SYMBOL_IPC,
          74 +	HISTC_MEM_BLOCKED,
          75 +	HISTC_LOCAL_INS_LAT,
          76 +	HISTC_GLOBAL_INS_LAT,
      75  77 	HISTC_NR_COLS, /* Last entry */
      76  78 };
+273 -61
tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
··· 24 24 #include "intel-pt-decoder.h" 25 25 #include "intel-pt-log.h" 26 26 27 + #define BITULL(x) (1ULL << (x)) 28 + 29 + /* IA32_RTIT_CTL MSR bits */ 30 + #define INTEL_PT_CYC_ENABLE BITULL(1) 31 + #define INTEL_PT_CYC_THRESHOLD (BITULL(22) | BITULL(21) | BITULL(20) | BITULL(19)) 32 + #define INTEL_PT_CYC_THRESHOLD_SHIFT 19 33 + 27 34 #define INTEL_PT_BLK_SIZE 1024 28 35 29 36 #define BIT63 (((uint64_t)1 << 63)) ··· 62 55 INTEL_PT_STATE_TIP_PGD, 63 56 INTEL_PT_STATE_FUP, 64 57 INTEL_PT_STATE_FUP_NO_TIP, 58 + INTEL_PT_STATE_FUP_IN_PSB, 65 59 INTEL_PT_STATE_RESAMPLE, 66 60 }; 67 61 ··· 81 73 case INTEL_PT_STATE_TIP_PGD: 82 74 case INTEL_PT_STATE_FUP: 83 75 case INTEL_PT_STATE_FUP_NO_TIP: 76 + case INTEL_PT_STATE_FUP_IN_PSB: 84 77 return false; 85 78 default: 86 79 return true; ··· 121 112 bool have_last_ip; 122 113 bool in_psb; 123 114 bool hop; 124 - bool hop_psb_fup; 125 115 bool leap; 116 + bool nr; 117 + bool next_nr; 126 118 enum intel_pt_param_flags flags; 127 119 uint64_t pos; 128 120 uint64_t last_ip; 129 121 uint64_t ip; 130 - uint64_t cr3; 122 + uint64_t pip_payload; 131 123 uint64_t timestamp; 132 124 uint64_t tsc_timestamp; 133 125 uint64_t ref_timestamp; ··· 177 167 uint64_t sample_tot_cyc_cnt; 178 168 uint64_t base_cyc_cnt; 179 169 uint64_t cyc_cnt_timestamp; 170 + uint64_t ctl; 171 + uint64_t cyc_threshold; 180 172 double tsc_to_cyc; 181 173 bool continuous_period; 182 174 bool overflow; ··· 201 189 int no_progress; 202 190 int stuck_ip_prd; 203 191 int stuck_ip_cnt; 192 + uint64_t psb_ip; 204 193 const unsigned char *next_buf; 205 194 size_t next_len; 206 195 unsigned char temp_buf[INTEL_PT_PKT_MAX_SZ]; ··· 215 202 x >>= 1; 216 203 217 204 return x << i; 205 + } 206 + 207 + static uint64_t intel_pt_cyc_threshold(uint64_t ctl) 208 + { 209 + if (!(ctl & INTEL_PT_CYC_ENABLE)) 210 + return 0; 211 + 212 + return (ctl & INTEL_PT_CYC_THRESHOLD) >> INTEL_PT_CYC_THRESHOLD_SHIFT; 218 213 } 219 214 220 215 static void intel_pt_setup_period(struct 
intel_pt_decoder *decoder) ··· 266 245 267 246 decoder->flags = params->flags; 268 247 248 + decoder->ctl = params->ctl; 269 249 decoder->period = params->period; 270 250 decoder->period_type = params->period_type; 271 251 272 252 decoder->max_non_turbo_ratio = params->max_non_turbo_ratio; 273 253 decoder->max_non_turbo_ratio_fp = params->max_non_turbo_ratio; 254 + 255 + decoder->cyc_threshold = intel_pt_cyc_threshold(decoder->ctl); 274 256 275 257 intel_pt_setup_period(decoder); 276 258 ··· 503 479 static inline void intel_pt_update_in_tx(struct intel_pt_decoder *decoder) 504 480 { 505 481 decoder->tx_flags = decoder->packet.payload & INTEL_PT_IN_TX; 482 + } 483 + 484 + static inline void intel_pt_update_pip(struct intel_pt_decoder *decoder) 485 + { 486 + decoder->pip_payload = decoder->packet.payload; 487 + } 488 + 489 + static inline void intel_pt_update_nr(struct intel_pt_decoder *decoder) 490 + { 491 + decoder->next_nr = decoder->pip_payload & 1; 492 + } 493 + 494 + static inline void intel_pt_set_nr(struct intel_pt_decoder *decoder) 495 + { 496 + decoder->nr = decoder->pip_payload & 1; 497 + decoder->next_nr = decoder->nr; 498 + } 499 + 500 + static inline void intel_pt_set_pip(struct intel_pt_decoder *decoder) 501 + { 502 + intel_pt_update_pip(decoder); 503 + intel_pt_set_nr(decoder); 506 504 } 507 505 508 506 static int intel_pt_bad_packet(struct intel_pt_decoder *decoder) ··· 1264 1218 decoder->continuous_period = false; 1265 1219 decoder->pkt_state = INTEL_PT_STATE_IN_SYNC; 1266 1220 decoder->state.type |= INTEL_PT_TRACE_END; 1221 + intel_pt_update_nr(decoder); 1267 1222 return 0; 1268 1223 } 1269 1224 if (err == INTEL_PT_RETURN) 1270 1225 return 0; 1271 1226 if (err) 1272 1227 return err; 1228 + 1229 + intel_pt_update_nr(decoder); 1273 1230 1274 1231 if (intel_pt_insn.branch == INTEL_PT_BR_INDIRECT) { 1275 1232 if (decoder->pkt_state == INTEL_PT_STATE_TIP_PGD) { ··· 1386 1337 decoder->state.from_ip = decoder->ip; 1387 1338 decoder->state.to_ip = 
decoder->last_ip; 1388 1339 decoder->ip = decoder->last_ip; 1340 + intel_pt_update_nr(decoder); 1389 1341 return 0; 1390 1342 } 1391 1343 ··· 1511 1461 { 1512 1462 intel_pt_log("ERROR: Buffer overflow\n"); 1513 1463 intel_pt_clear_tx_flags(decoder); 1464 + intel_pt_set_nr(decoder); 1514 1465 decoder->timestamp_insn_cnt = 0; 1515 1466 decoder->pkt_state = INTEL_PT_STATE_ERR_RESYNC; 1516 1467 decoder->overflow = true; ··· 1786 1735 break; 1787 1736 1788 1737 case INTEL_PT_PIP: 1789 - decoder->cr3 = decoder->packet.payload & (BIT63 - 1); 1738 + intel_pt_set_pip(decoder); 1790 1739 break; 1791 1740 1792 1741 case INTEL_PT_FUP: 1793 1742 decoder->pge = true; 1794 1743 if (decoder->packet.count) { 1795 1744 intel_pt_set_last_ip(decoder); 1796 - if (decoder->hop) { 1797 - /* Act on FUP at PSBEND */ 1798 - decoder->ip = decoder->last_ip; 1799 - decoder->hop_psb_fup = true; 1800 - } 1745 + decoder->psb_ip = decoder->last_ip; 1801 1746 } 1802 1747 break; 1803 1748 ··· 1808 1761 break; 1809 1762 1810 1763 case INTEL_PT_CYC: 1764 + intel_pt_calc_cyc_timestamp(decoder); 1765 + break; 1766 + 1811 1767 case INTEL_PT_VMCS: 1812 1768 case INTEL_PT_MNT: 1813 1769 case INTEL_PT_PAD: ··· 1885 1835 decoder->pge = false; 1886 1836 decoder->continuous_period = false; 1887 1837 decoder->state.type |= INTEL_PT_TRACE_END; 1838 + intel_pt_update_nr(decoder); 1888 1839 return 0; 1889 1840 1890 1841 case INTEL_PT_TIP_PGE: ··· 1901 1850 } 1902 1851 decoder->state.type |= INTEL_PT_TRACE_BEGIN; 1903 1852 intel_pt_mtc_cyc_cnt_pge(decoder); 1853 + intel_pt_set_nr(decoder); 1904 1854 return 0; 1905 1855 1906 1856 case INTEL_PT_TIP: ··· 1912 1860 intel_pt_set_ip(decoder); 1913 1861 decoder->state.to_ip = decoder->ip; 1914 1862 } 1863 + intel_pt_update_nr(decoder); 1915 1864 return 0; 1916 1865 1917 1866 case INTEL_PT_PIP: 1918 - decoder->cr3 = decoder->packet.payload & (BIT63 - 1); 1867 + intel_pt_update_pip(decoder); 1919 1868 break; 1920 1869 1921 1870 case INTEL_PT_MTC: ··· 1975 1922 return 
HOP_IGNORE; 1976 1923 1977 1924 case INTEL_PT_TIP_PGD: 1978 - if (!decoder->packet.count) 1925 + if (!decoder->packet.count) { 1926 + intel_pt_set_nr(decoder); 1979 1927 return HOP_IGNORE; 1928 + } 1980 1929 intel_pt_set_ip(decoder); 1981 1930 decoder->state.type |= INTEL_PT_TRACE_END; 1982 1931 decoder->state.from_ip = 0; 1983 1932 decoder->state.to_ip = decoder->ip; 1933 + intel_pt_update_nr(decoder); 1984 1934 return HOP_RETURN; 1985 1935 1986 1936 case INTEL_PT_TIP: 1987 - if (!decoder->packet.count) 1937 + if (!decoder->packet.count) { 1938 + intel_pt_set_nr(decoder); 1988 1939 return HOP_IGNORE; 1940 + } 1989 1941 intel_pt_set_ip(decoder); 1990 1942 decoder->state.type = INTEL_PT_INSTRUCTION; 1991 1943 decoder->state.from_ip = decoder->ip; 1992 1944 decoder->state.to_ip = 0; 1945 + intel_pt_update_nr(decoder); 1993 1946 return HOP_RETURN; 1994 1947 1995 1948 case INTEL_PT_FUP: ··· 2018 1959 return HOP_RETURN; 2019 1960 2020 1961 case INTEL_PT_PSB: 1962 + decoder->state.psb_offset = decoder->pos; 1963 + decoder->psb_ip = 0; 2021 1964 decoder->last_ip = 0; 2022 1965 decoder->have_last_ip = true; 2023 - decoder->hop_psb_fup = false; 2024 1966 *err = intel_pt_walk_psbend(decoder); 2025 1967 if (*err == -EAGAIN) 2026 1968 return HOP_AGAIN; 2027 1969 if (*err) 2028 1970 return HOP_RETURN; 2029 - if (decoder->hop_psb_fup) { 2030 - decoder->hop_psb_fup = false; 2031 - decoder->state.type = INTEL_PT_INSTRUCTION; 2032 - decoder->state.from_ip = decoder->ip; 2033 - decoder->state.to_ip = 0; 2034 - return HOP_RETURN; 1971 + decoder->state.type = INTEL_PT_PSB_EVT; 1972 + if (decoder->psb_ip) { 1973 + decoder->state.type |= INTEL_PT_INSTRUCTION; 1974 + decoder->ip = decoder->psb_ip; 2035 1975 } 2036 - if (decoder->cbr != decoder->cbr_seen) { 2037 - decoder->state.type = 0; 2038 - return HOP_RETURN; 2039 - } 2040 - return HOP_IGNORE; 1976 + decoder->state.from_ip = decoder->psb_ip; 1977 + decoder->state.to_ip = 0; 1978 + return HOP_RETURN; 2041 1979 2042 1980 case 
INTEL_PT_BAD: 2043 1981 case INTEL_PT_PAD: ··· 2068 2012 } 2069 2013 } 2070 2014 2015 + struct intel_pt_psb_info { 2016 + struct intel_pt_pkt fup_packet; 2017 + bool fup; 2018 + int after_psbend; 2019 + }; 2020 + 2021 + /* Lookahead and get the FUP packet from PSB+ */ 2022 + static int intel_pt_psb_lookahead_cb(struct intel_pt_pkt_info *pkt_info) 2023 + { 2024 + struct intel_pt_psb_info *data = pkt_info->data; 2025 + 2026 + switch (pkt_info->packet.type) { 2027 + case INTEL_PT_PAD: 2028 + case INTEL_PT_MNT: 2029 + case INTEL_PT_TSC: 2030 + case INTEL_PT_TMA: 2031 + case INTEL_PT_MODE_EXEC: 2032 + case INTEL_PT_MODE_TSX: 2033 + case INTEL_PT_MTC: 2034 + case INTEL_PT_CYC: 2035 + case INTEL_PT_VMCS: 2036 + case INTEL_PT_CBR: 2037 + case INTEL_PT_PIP: 2038 + if (data->after_psbend) { 2039 + data->after_psbend -= 1; 2040 + if (!data->after_psbend) 2041 + return 1; 2042 + } 2043 + break; 2044 + 2045 + case INTEL_PT_FUP: 2046 + if (data->after_psbend) 2047 + return 1; 2048 + if (data->fup || pkt_info->packet.count == 0) 2049 + return 1; 2050 + data->fup_packet = pkt_info->packet; 2051 + data->fup = true; 2052 + break; 2053 + 2054 + case INTEL_PT_PSBEND: 2055 + if (!data->fup) 2056 + return 1; 2057 + /* Keep going to check for a TIP.PGE */ 2058 + data->after_psbend = 6; 2059 + break; 2060 + 2061 + case INTEL_PT_TIP_PGE: 2062 + /* Ignore FUP in PSB+ if followed by TIP.PGE */ 2063 + if (data->after_psbend) 2064 + data->fup = false; 2065 + return 1; 2066 + 2067 + case INTEL_PT_PTWRITE: 2068 + case INTEL_PT_PTWRITE_IP: 2069 + case INTEL_PT_EXSTOP: 2070 + case INTEL_PT_EXSTOP_IP: 2071 + case INTEL_PT_MWAIT: 2072 + case INTEL_PT_PWRE: 2073 + case INTEL_PT_PWRX: 2074 + case INTEL_PT_BBP: 2075 + case INTEL_PT_BIP: 2076 + case INTEL_PT_BEP: 2077 + case INTEL_PT_BEP_IP: 2078 + if (data->after_psbend) { 2079 + data->after_psbend -= 1; 2080 + if (!data->after_psbend) 2081 + return 1; 2082 + break; 2083 + } 2084 + return 1; 2085 + 2086 + case INTEL_PT_OVF: 2087 + case INTEL_PT_BAD: 
2088 + case INTEL_PT_TNT: 2089 + case INTEL_PT_TIP_PGD: 2090 + case INTEL_PT_TIP: 2091 + case INTEL_PT_PSB: 2092 + case INTEL_PT_TRACESTOP: 2093 + default: 2094 + return 1; 2095 + } 2096 + 2097 + return 0; 2098 + } 2099 + 2100 + static int intel_pt_psb(struct intel_pt_decoder *decoder) 2101 + { 2102 + int err; 2103 + 2104 + decoder->last_ip = 0; 2105 + decoder->psb_ip = 0; 2106 + decoder->have_last_ip = true; 2107 + intel_pt_clear_stack(&decoder->stack); 2108 + err = intel_pt_walk_psbend(decoder); 2109 + if (err) 2110 + return err; 2111 + decoder->state.type = INTEL_PT_PSB_EVT; 2112 + decoder->state.from_ip = decoder->psb_ip; 2113 + decoder->state.to_ip = 0; 2114 + return 0; 2115 + } 2116 + 2117 + static int intel_pt_fup_in_psb(struct intel_pt_decoder *decoder) 2118 + { 2119 + int err; 2120 + 2121 + if (decoder->ip != decoder->last_ip) { 2122 + err = intel_pt_walk_fup(decoder); 2123 + if (!err || err != -EAGAIN) 2124 + return err; 2125 + } 2126 + 2127 + decoder->pkt_state = INTEL_PT_STATE_IN_SYNC; 2128 + err = intel_pt_psb(decoder); 2129 + if (err) { 2130 + decoder->pkt_state = INTEL_PT_STATE_ERR3; 2131 + return -ENOENT; 2132 + } 2133 + 2134 + return 0; 2135 + } 2136 + 2137 + static bool intel_pt_psb_with_fup(struct intel_pt_decoder *decoder, int *err) 2138 + { 2139 + struct intel_pt_psb_info data = { .fup = false }; 2140 + 2141 + if (!decoder->branch_enable || !decoder->pge) 2142 + return false; 2143 + 2144 + intel_pt_pkt_lookahead(decoder, intel_pt_psb_lookahead_cb, &data); 2145 + if (!data.fup) 2146 + return false; 2147 + 2148 + decoder->packet = data.fup_packet; 2149 + intel_pt_set_last_ip(decoder); 2150 + decoder->pkt_state = INTEL_PT_STATE_FUP_IN_PSB; 2151 + 2152 + *err = intel_pt_fup_in_psb(decoder); 2153 + 2154 + return true; 2155 + } 2156 + 2071 2157 static int intel_pt_walk_trace(struct intel_pt_decoder *decoder) 2072 2158 { 2159 + int last_packet_type = INTEL_PT_PAD; 2073 2160 bool no_tip = false; 2074 2161 int err; 2075 2162 ··· 2221 2022 if (err) 2222 
2023 return err; 2223 2024 next: 2025 + if (decoder->cyc_threshold) { 2026 + if (decoder->sample_cyc && last_packet_type != INTEL_PT_CYC) 2027 + decoder->sample_cyc = false; 2028 + last_packet_type = decoder->packet.type; 2029 + } 2030 + 2224 2031 if (decoder->hop) { 2225 2032 switch (intel_pt_hop_trace(decoder, &no_tip, &err)) { 2226 2033 case HOP_IGNORE: ··· 2260 2055 case INTEL_PT_TIP_PGE: { 2261 2056 decoder->pge = true; 2262 2057 intel_pt_mtc_cyc_cnt_pge(decoder); 2058 + intel_pt_set_nr(decoder); 2263 2059 if (decoder->packet.count == 0) { 2264 2060 intel_pt_log_at("Skipping zero TIP.PGE", 2265 2061 decoder->pos); ··· 2326 2120 break; 2327 2121 2328 2122 case INTEL_PT_PSB: 2329 - decoder->last_ip = 0; 2330 - decoder->have_last_ip = true; 2331 - intel_pt_clear_stack(&decoder->stack); 2332 - err = intel_pt_walk_psbend(decoder); 2123 + decoder->state.psb_offset = decoder->pos; 2124 + decoder->psb_ip = 0; 2125 + if (intel_pt_psb_with_fup(decoder, &err)) 2126 + return err; 2127 + err = intel_pt_psb(decoder); 2333 2128 if (err == -EAGAIN) 2334 2129 goto next; 2335 - if (err) 2336 - return err; 2337 - /* 2338 - * PSB+ CBR will not have changed but cater for the 2339 - * possibility of another CBR change that gets caught up 2340 - * in the PSB+. 
2341 - */ 2342 - if (decoder->cbr != decoder->cbr_seen) { 2343 - decoder->state.type = 0; 2344 - return 0; 2345 - } 2346 - break; 2130 + return err; 2347 2131 2348 2132 case INTEL_PT_PIP: 2349 - decoder->cr3 = decoder->packet.payload & (BIT63 - 1); 2133 + intel_pt_update_pip(decoder); 2350 2134 break; 2351 2135 2352 2136 case INTEL_PT_MTC: ··· 2547 2351 uint64_t current_ip = decoder->ip; 2548 2352 2549 2353 intel_pt_set_ip(decoder); 2354 + decoder->psb_ip = decoder->ip; 2550 2355 if (current_ip) 2551 2356 intel_pt_log_to("Setting IP", 2552 2357 decoder->ip); ··· 2575 2378 break; 2576 2379 2577 2380 case INTEL_PT_PIP: 2578 - decoder->cr3 = decoder->packet.payload & (BIT63 - 1); 2381 + intel_pt_set_pip(decoder); 2579 2382 break; 2580 2383 2581 2384 case INTEL_PT_MODE_EXEC: ··· 2694 2497 break; 2695 2498 2696 2499 case INTEL_PT_PIP: 2697 - decoder->cr3 = decoder->packet.payload & (BIT63 - 1); 2500 + intel_pt_set_pip(decoder); 2698 2501 break; 2699 2502 2700 2503 case INTEL_PT_MODE_EXEC: ··· 2719 2522 break; 2720 2523 2721 2524 case INTEL_PT_PSB: 2525 + decoder->state.psb_offset = decoder->pos; 2526 + decoder->psb_ip = 0; 2722 2527 decoder->last_ip = 0; 2723 2528 decoder->have_last_ip = true; 2724 2529 intel_pt_clear_stack(&decoder->stack); 2725 2530 err = intel_pt_walk_psb(decoder); 2726 2531 if (err) 2727 2532 return err; 2728 - if (decoder->ip) { 2729 - /* Do not have a sample */ 2730 - decoder->state.type = 0; 2731 - return 0; 2732 - } 2733 - break; 2533 + decoder->state.type = INTEL_PT_PSB_EVT; 2534 + decoder->state.from_ip = decoder->psb_ip; 2535 + decoder->state.to_ip = 0; 2536 + return 0; 2734 2537 2735 2538 case INTEL_PT_TNT: 2736 2539 case INTEL_PT_PSBEND: ··· 2774 2577 2775 2578 intel_pt_log("Scanning for full IP\n"); 2776 2579 err = intel_pt_walk_to_ip(decoder); 2777 - if (err) 2580 + if (err || ((decoder->state.type & INTEL_PT_PSB_EVT) && !decoder->ip)) 2778 2581 return err; 2779 2582 2780 2583 /* In hop mode, resample to get the to_ip as an "instruction" 
sample */ ··· 2886 2689 decoder->continuous_period = false; 2887 2690 decoder->have_last_ip = false; 2888 2691 decoder->last_ip = 0; 2692 + decoder->psb_ip = 0; 2889 2693 decoder->ip = 0; 2890 2694 intel_pt_clear_stack(&decoder->stack); 2891 2695 2892 - leap: 2893 2696 err = intel_pt_scan_for_psb(decoder); 2894 2697 if (err) 2895 2698 return err; ··· 2901 2704 if (err) 2902 2705 return err; 2903 2706 2707 + decoder->state.type = INTEL_PT_PSB_EVT; /* Only PSB sample */ 2708 + decoder->state.from_ip = decoder->psb_ip; 2709 + decoder->state.to_ip = 0; 2710 + 2904 2711 if (decoder->ip) { 2905 - decoder->state.type = 0; /* Do not have a sample */ 2906 2712 /* 2907 2713 * In hop mode, resample to get the PSB FUP ip as an 2908 2714 * "instruction" sample. ··· 2914 2714 decoder->pkt_state = INTEL_PT_STATE_RESAMPLE; 2915 2715 else 2916 2716 decoder->pkt_state = INTEL_PT_STATE_IN_SYNC; 2917 - } else if (decoder->leap) { 2918 - /* 2919 - * In leap mode, only PSB+ is decoded, so keeping leaping to the 2920 - * next PSB until there is an ip. 
2921 - */ 2922 - goto leap; 2923 - } else { 2924 - return intel_pt_sync_ip(decoder); 2925 2717 } 2926 2718 2927 2719 return 0; ··· 2975 2783 if (err == -EAGAIN) 2976 2784 err = intel_pt_walk_trace(decoder); 2977 2785 break; 2786 + case INTEL_PT_STATE_FUP_IN_PSB: 2787 + err = intel_pt_fup_in_psb(decoder); 2788 + break; 2978 2789 case INTEL_PT_STATE_RESAMPLE: 2979 2790 err = intel_pt_resample(decoder); 2980 2791 break; ··· 2992 2797 decoder->state.from_ip = decoder->ip; 2993 2798 intel_pt_update_sample_time(decoder); 2994 2799 decoder->sample_tot_cyc_cnt = decoder->tot_cyc_cnt; 2800 + intel_pt_set_nr(decoder); 2995 2801 } else { 2996 2802 decoder->state.err = 0; 2997 2803 if (decoder->cbr != decoder->cbr_seen) { ··· 3007 2811 } 3008 2812 if (intel_pt_sample_time(decoder->pkt_state)) { 3009 2813 intel_pt_update_sample_time(decoder); 3010 - if (decoder->sample_cyc) 2814 + if (decoder->sample_cyc) { 3011 2815 decoder->sample_tot_cyc_cnt = decoder->tot_cyc_cnt; 2816 + decoder->state.flags |= INTEL_PT_SAMPLE_IPC; 2817 + decoder->sample_cyc = false; 2818 + } 3012 2819 } 2820 + /* 2821 + * When using only TSC/MTC to compute cycles, IPC can be 2822 + * sampled as soon as the cycle count changes. 2823 + */ 2824 + if (!decoder->have_cyc) 2825 + decoder->state.flags |= INTEL_PT_SAMPLE_IPC; 3013 2826 } 2827 + 2828 + /* Let PSB event always have TSC timestamp */ 2829 + if ((decoder->state.type & INTEL_PT_PSB_EVT) && decoder->tsc_timestamp) 2830 + decoder->sample_timestamp = decoder->tsc_timestamp; 2831 + 2832 + decoder->state.from_nr = decoder->nr; 2833 + decoder->state.to_nr = decoder->next_nr; 2834 + decoder->nr = decoder->next_nr; 3014 2835 3015 2836 decoder->state.timestamp = decoder->sample_timestamp; 3016 2837 decoder->state.est_timestamp = intel_pt_est_timestamp(decoder); 3017 - decoder->state.cr3 = decoder->cr3; 3018 2838 decoder->state.tot_insn_cnt = decoder->tot_insn_cnt; 3019 2839 decoder->state.tot_cyc_cnt = decoder->sample_tot_cyc_cnt; 3020 2840
+6 -1
tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
··· 17 17 #define INTEL_PT_ABORT_TX (1 << 1) 18 18 #define INTEL_PT_ASYNC (1 << 2) 19 19 #define INTEL_PT_FUP_IP (1 << 3) 20 + #define INTEL_PT_SAMPLE_IPC (1 << 4) 20 21 21 22 enum intel_pt_sample_type { 22 23 INTEL_PT_BRANCH = 1 << 0, ··· 32 31 INTEL_PT_TRACE_BEGIN = 1 << 9, 33 32 INTEL_PT_TRACE_END = 1 << 10, 34 33 INTEL_PT_BLK_ITEMS = 1 << 11, 34 + INTEL_PT_PSB_EVT = 1 << 12, 35 35 }; 36 36 37 37 enum intel_pt_period_type { ··· 201 199 202 200 struct intel_pt_state { 203 201 enum intel_pt_sample_type type; 202 + bool from_nr; 203 + bool to_nr; 204 204 int err; 205 205 uint64_t from_ip; 206 206 uint64_t to_ip; 207 - uint64_t cr3; 208 207 uint64_t tot_insn_cnt; 209 208 uint64_t tot_cyc_cnt; 210 209 uint64_t timestamp; ··· 216 213 uint64_t pwre_payload; 217 214 uint64_t pwrx_payload; 218 215 uint64_t cbr_payload; 216 + uint64_t psb_offset; 219 217 uint32_t cbr; 220 218 uint32_t flags; 221 219 enum intel_pt_insn_op insn_op; ··· 247 243 void *data; 248 244 bool return_compression; 249 245 bool branch_enable; 246 + uint64_t ctl; 250 247 uint64_t period; 251 248 enum intel_pt_period_type period_type; 252 249 unsigned max_non_turbo_ratio;
+15
tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
··· 43 43 switch (insn->opcode.bytes[0]) { 44 44 case 0xf: 45 45 switch (insn->opcode.bytes[1]) { 46 + case 0x01: 47 + switch (insn->modrm.bytes[0]) { 48 + case 0xc2: /* vmlaunch */ 49 + case 0xc3: /* vmresume */ 50 + op = INTEL_PT_OP_VMENTRY; 51 + branch = INTEL_PT_BR_INDIRECT; 52 + break; 53 + default: 54 + break; 55 + } 56 + break; 46 57 case 0x05: /* syscall */ 47 58 case 0x34: /* sysenter */ 48 59 op = INTEL_PT_OP_SYSCALL; ··· 224 213 [INTEL_PT_OP_INT] = "Int", 225 214 [INTEL_PT_OP_SYSCALL] = "Syscall", 226 215 [INTEL_PT_OP_SYSRET] = "Sysret", 216 + [INTEL_PT_OP_VMENTRY] = "VMentry", 227 217 }; 228 218 229 219 const char *intel_pt_insn_name(enum intel_pt_insn_op op) ··· 279 267 case INTEL_PT_OP_SYSRET: 280 268 return PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_RETURN | 281 269 PERF_IP_FLAG_SYSCALLRET; 270 + case INTEL_PT_OP_VMENTRY: 271 + return PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | 272 + PERF_IP_FLAG_VMENTRY; 282 273 default: 283 274 return 0; 284 275 }
+1
tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.h
··· 24 24 INTEL_PT_OP_INT, 25 25 INTEL_PT_OP_SYSCALL, 26 26 INTEL_PT_OP_SYSRET, 27 + INTEL_PT_OP_VMENTRY, 27 28 }; 28 29 29 30 enum intel_pt_insn_branch {
+4 -8
tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c
··· 16 16 17 17 #define BIT63 ((uint64_t)1 << 63) 18 18 19 - #define NR_FLAG BIT63 20 - 21 19 #if __BYTE_ORDER == __BIG_ENDIAN 22 20 #define le16_to_cpu bswap_16 23 21 #define le32_to_cpu bswap_32 ··· 104 106 105 107 packet->type = INTEL_PT_PIP; 106 108 memcpy_le64(&payload, buf + 2, 6); 107 - packet->payload = payload >> 1; 108 - if (payload & 1) 109 - packet->payload |= NR_FLAG; 109 + packet->payload = payload; 110 110 111 111 return 8; 112 112 } ··· 715 719 name, (unsigned)(payload >> 1) & 1, 716 720 (unsigned)payload & 1); 717 721 case INTEL_PT_PIP: 718 - nr = packet->payload & NR_FLAG ? 1 : 0; 719 - payload &= ~NR_FLAG; 722 + nr = packet->payload & INTEL_PT_VMX_NR_FLAG ? 1 : 0; 723 + payload &= ~INTEL_PT_VMX_NR_FLAG; 720 724 ret = snprintf(buf, buf_len, "%s 0x%llx (NR=%d)", 721 - name, payload, nr); 725 + name, payload >> 1, nr); 722 726 return ret; 723 727 case INTEL_PT_PTWRITE: 724 728 return snprintf(buf, buf_len, "%s 0x%llx IP:0", name, payload);
+2
tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.h
··· 21 21 22 22 #define INTEL_PT_PKT_MAX_SZ 16 23 23 24 + #define INTEL_PT_VMX_NR_FLAG 1 25 + 24 26 enum intel_pt_pkt_type { 25 27 INTEL_PT_BAD, 26 28 INTEL_PT_PAD,
+183 -31
tools/perf/util/intel-pt.c
··· 108 108 u64 exstop_id; 109 109 u64 pwrx_id; 110 110 u64 cbr_id; 111 + u64 psb_id; 111 112 112 113 bool sample_pebs; 113 114 struct evsel *pebs_evsel; ··· 163 162 int switch_state; 164 163 pid_t next_tid; 165 164 struct thread *thread; 165 + struct machine *guest_machine; 166 + struct thread *unknown_guest_thread; 167 + pid_t guest_machine_pid; 166 168 bool exclude_kernel; 167 169 bool have_sample; 168 170 u64 time; ··· 553 549 auxtrace_cache__remove(dso->auxtrace_cache, offset); 554 550 } 555 551 556 - static inline u8 intel_pt_cpumode(struct intel_pt *pt, uint64_t ip) 552 + static inline bool intel_pt_guest_kernel_ip(uint64_t ip) 557 553 { 558 - return ip >= pt->kernel_start ? 554 + /* Assumes 64-bit kernel */ 555 + return ip & (1ULL << 63); 556 + } 557 + 558 + static inline u8 intel_pt_nr_cpumode(struct intel_pt_queue *ptq, uint64_t ip, bool nr) 559 + { 560 + if (nr) { 561 + return intel_pt_guest_kernel_ip(ip) ? 562 + PERF_RECORD_MISC_GUEST_KERNEL : 563 + PERF_RECORD_MISC_GUEST_USER; 564 + } 565 + 566 + return ip >= ptq->pt->kernel_start ? 559 567 PERF_RECORD_MISC_KERNEL : 560 568 PERF_RECORD_MISC_USER; 569 + } 570 + 571 + static inline u8 intel_pt_cpumode(struct intel_pt_queue *ptq, uint64_t from_ip, uint64_t to_ip) 572 + { 573 + /* No support for non-zero CS base */ 574 + if (from_ip) 575 + return intel_pt_nr_cpumode(ptq, from_ip, ptq->state->from_nr); 576 + return intel_pt_nr_cpumode(ptq, to_ip, ptq->state->to_nr); 577 + } 578 + 579 + static int intel_pt_get_guest(struct intel_pt_queue *ptq) 580 + { 581 + struct machines *machines = &ptq->pt->session->machines; 582 + struct machine *machine; 583 + pid_t pid = ptq->pid <= 0 ? 
DEFAULT_GUEST_KERNEL_ID : ptq->pid; 584 + 585 + if (ptq->guest_machine && pid == ptq->guest_machine_pid) 586 + return 0; 587 + 588 + ptq->guest_machine = NULL; 589 + thread__zput(ptq->unknown_guest_thread); 590 + 591 + machine = machines__find_guest(machines, pid); 592 + if (!machine) 593 + return -1; 594 + 595 + ptq->unknown_guest_thread = machine__idle_thread(machine); 596 + if (!ptq->unknown_guest_thread) 597 + return -1; 598 + 599 + ptq->guest_machine = machine; 600 + ptq->guest_machine_pid = pid; 601 + 602 + return 0; 561 603 } 562 604 563 605 static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn, ··· 622 572 u64 offset, start_offset, start_ip; 623 573 u64 insn_cnt = 0; 624 574 bool one_map = true; 575 + bool nr; 625 576 626 577 intel_pt_insn->length = 0; 627 578 628 579 if (to_ip && *ip == to_ip) 629 580 goto out_no_cache; 630 581 631 - cpumode = intel_pt_cpumode(ptq->pt, *ip); 582 + nr = ptq->state->to_nr; 583 + cpumode = intel_pt_nr_cpumode(ptq, *ip, nr); 632 584 633 - thread = ptq->thread; 634 - if (!thread) { 635 - if (cpumode != PERF_RECORD_MISC_KERNEL) 585 + if (nr) { 586 + if (cpumode != PERF_RECORD_MISC_GUEST_KERNEL || 587 + intel_pt_get_guest(ptq)) 636 588 return -EINVAL; 637 - thread = ptq->pt->unknown_thread; 589 + machine = ptq->guest_machine; 590 + thread = ptq->unknown_guest_thread; 591 + } else { 592 + thread = ptq->thread; 593 + if (!thread) { 594 + if (cpumode != PERF_RECORD_MISC_KERNEL) 595 + return -EINVAL; 596 + thread = ptq->pt->unknown_thread; 597 + } 638 598 } 639 599 640 600 while (1) { ··· 792 732 u8 cpumode; 793 733 u64 offset; 794 734 795 - if (ip >= ptq->pt->kernel_start) 735 + if (ptq->state->to_nr) { 736 + if (intel_pt_guest_kernel_ip(ip)) 737 + return intel_pt_match_pgd_ip(ptq->pt, ip, ip, NULL); 738 + /* No support for decoding guest user space */ 739 + return -EINVAL; 740 + } else if (ip >= ptq->pt->kernel_start) { 796 741 return intel_pt_match_pgd_ip(ptq->pt, ip, ip, NULL); 742 + } 797 743 798 744 cpumode = 
PERF_RECORD_MISC_USER; 799 745 ··· 959 893 return false; 960 894 } 961 895 896 + static u64 intel_pt_ctl(struct intel_pt *pt) 897 + { 898 + struct evsel *evsel; 899 + u64 config; 900 + 901 + evlist__for_each_entry(pt->session->evlist, evsel) { 902 + if (intel_pt_get_config(pt, &evsel->core.attr, &config)) 903 + return config; 904 + } 905 + return 0; 906 + } 907 + 962 908 static u64 intel_pt_ns_to_ticks(const struct intel_pt *pt, u64 ns) 963 909 { 964 910 u64 quot, rem; ··· 1104 1026 params.data = ptq; 1105 1027 params.return_compression = intel_pt_return_compression(pt); 1106 1028 params.branch_enable = intel_pt_branch_enable(pt); 1029 + params.ctl = intel_pt_ctl(pt); 1107 1030 params.max_non_turbo_ratio = pt->max_non_turbo_ratio; 1108 1031 params.mtc_period = intel_pt_mtc_period(pt); 1109 1032 params.tsc_ctc_ratio_n = pt->tsc_ctc_ratio_n; ··· 1166 1087 if (!ptq) 1167 1088 return; 1168 1089 thread__zput(ptq->thread); 1090 + thread__zput(ptq->unknown_guest_thread); 1169 1091 intel_pt_decoder_free(ptq->decoder); 1170 1092 zfree(&ptq->event_buf); 1171 1093 zfree(&ptq->last_branch); ··· 1201 1121 if (ptq->state->flags & INTEL_PT_ABORT_TX) { 1202 1122 ptq->flags = PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TX_ABORT; 1203 1123 } else if (ptq->state->flags & INTEL_PT_ASYNC) { 1204 - if (ptq->state->to_ip) 1124 + if (!ptq->state->to_ip) 1125 + ptq->flags = PERF_IP_FLAG_BRANCH | 1126 + PERF_IP_FLAG_TRACE_END; 1127 + else if (ptq->state->from_nr && !ptq->state->to_nr) 1128 + ptq->flags = PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | 1129 + PERF_IP_FLAG_VMEXIT; 1130 + else 1205 1131 ptq->flags = PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | 1206 1132 PERF_IP_FLAG_ASYNC | 1207 1133 PERF_IP_FLAG_INTERRUPT; 1208 - else 1209 - ptq->flags = PERF_IP_FLAG_BRANCH | 1210 - PERF_IP_FLAG_TRACE_END; 1211 1134 ptq->insn_len = 0; 1212 1135 } else { 1213 1136 if (ptq->state->from_ip) ··· 1384 1301 sample->time = tsc_to_perf_time(ptq->timestamp, &pt->tc); 1385 1302 1386 1303 sample->ip = 
ptq->state->from_ip; 1387 - sample->cpumode = intel_pt_cpumode(pt, sample->ip); 1388 1304 sample->addr = ptq->state->to_ip; 1305 + sample->cpumode = intel_pt_cpumode(ptq, sample->ip, sample->addr); 1389 1306 sample->period = 1; 1390 1307 sample->flags = ptq->flags; 1391 1308 ··· 1464 1381 sample.branch_stack = (struct branch_stack *)&dummy_bs; 1465 1382 } 1466 1383 1467 - sample.cyc_cnt = ptq->ipc_cyc_cnt - ptq->last_br_cyc_cnt; 1384 + if (ptq->state->flags & INTEL_PT_SAMPLE_IPC) 1385 + sample.cyc_cnt = ptq->ipc_cyc_cnt - ptq->last_br_cyc_cnt; 1468 1386 if (sample.cyc_cnt) { 1469 1387 sample.insn_cnt = ptq->ipc_insn_cnt - ptq->last_br_insn_cnt; 1470 1388 ptq->last_br_insn_cnt = ptq->ipc_insn_cnt; ··· 1515 1431 else 1516 1432 sample.period = ptq->state->tot_insn_cnt - ptq->last_insn_cnt; 1517 1433 1518 - sample.cyc_cnt = ptq->ipc_cyc_cnt - ptq->last_in_cyc_cnt; 1434 + if (ptq->state->flags & INTEL_PT_SAMPLE_IPC) 1435 + sample.cyc_cnt = ptq->ipc_cyc_cnt - ptq->last_in_cyc_cnt; 1519 1436 if (sample.cyc_cnt) { 1520 1437 sample.insn_cnt = ptq->ipc_insn_cnt - ptq->last_in_insn_cnt; 1521 1438 ptq->last_in_insn_cnt = ptq->ipc_insn_cnt; ··· 1610 1525 raw.flags = cpu_to_le32(flags); 1611 1526 raw.freq = cpu_to_le32(raw.cbr * pt->cbr2khz); 1612 1527 raw.reserved3 = 0; 1528 + 1529 + sample.raw_size = perf_synth__raw_size(raw); 1530 + sample.raw_data = perf_synth__raw_data(&raw); 1531 + 1532 + return intel_pt_deliver_synth_event(pt, event, &sample, 1533 + pt->pwr_events_sample_type); 1534 + } 1535 + 1536 + static int intel_pt_synth_psb_sample(struct intel_pt_queue *ptq) 1537 + { 1538 + struct intel_pt *pt = ptq->pt; 1539 + union perf_event *event = ptq->event_buf; 1540 + struct perf_sample sample = { .ip = 0, }; 1541 + struct perf_synth_intel_psb raw; 1542 + 1543 + if (intel_pt_skip_event(pt)) 1544 + return 0; 1545 + 1546 + intel_pt_prep_p_sample(pt, ptq, event, &sample); 1547 + 1548 + sample.id = ptq->pt->psb_id; 1549 + sample.stream_id = ptq->pt->psb_id; 1550 + sample.flags = 
0; 1551 + 1552 + raw.reserved = 0; 1553 + raw.offset = ptq->state->psb_offset; 1613 1554 1614 1555 sample.raw_size = perf_synth__raw_size(raw); 1615 1556 sample.raw_data = perf_synth__raw_data(&raw); ··· 1902 1791 else 1903 1792 sample.ip = ptq->state->from_ip; 1904 1793 1905 - /* No support for guest mode at this time */ 1906 - cpumode = sample.ip < ptq->pt->kernel_start ? 1907 - PERF_RECORD_MISC_USER : 1908 - PERF_RECORD_MISC_KERNEL; 1794 + cpumode = intel_pt_cpumode(ptq, sample.ip, 0); 1909 1795 1910 1796 event->sample.header.misc = cpumode | PERF_RECORD_MISC_EXACT_IP; 1911 1797 ··· 1961 1853 if (sample_type & PERF_SAMPLE_ADDR && items->has_mem_access_address) 1962 1854 sample.addr = items->mem_access_address; 1963 1855 1964 - if (sample_type & PERF_SAMPLE_WEIGHT) { 1856 + if (sample_type & PERF_SAMPLE_WEIGHT_TYPE) { 1965 1857 /* 1966 1858 * Refer to the kernel's setup_pebs_adaptive_sample_data() and 1967 1859 * intel_hsw_weight(). 1968 1860 */ 1969 - if (items->has_mem_access_latency) 1970 - sample.weight = items->mem_access_latency; 1861 + if (items->has_mem_access_latency) { 1862 + u64 weight = items->mem_access_latency >> 32; 1863 + 1864 + /* 1865 + * Starting from SPR, the mem access latency field 1866 + * contains both cache latency [47:32] and instruction 1867 + * latency [15:0]. The cache latency is the same as the 1868 + * mem access latency on previous platforms. 1869 + * 1870 + * In practice, no memory access could last longer than 4G 1871 + * cycles. Use latency >> 32 to distinguish the 1872 + * different format of the mem access latency field. 
1873 + */ 1874 + if (weight > 0) { 1875 + sample.weight = weight & 0xffff; 1876 + sample.ins_lat = items->mem_access_latency & 0xffff; 1877 + } else 1878 + sample.weight = items->mem_access_latency; 1879 + } 1971 1880 if (!sample.weight && items->has_tsx_aux_info) { 1972 1881 /* Cycles last block */ 1973 1882 sample.weight = (u32)items->tsx_aux_info; ··· 2091 1966 2092 1967 ptq->have_sample = false; 2093 1968 2094 - if (ptq->state->tot_cyc_cnt > ptq->ipc_cyc_cnt) { 2095 - /* 2096 - * Cycle count and instruction count only go together to create 2097 - * a valid IPC ratio when the cycle count changes. 2098 - */ 2099 - ptq->ipc_insn_cnt = ptq->state->tot_insn_cnt; 2100 - ptq->ipc_cyc_cnt = ptq->state->tot_cyc_cnt; 2101 - } 1969 + ptq->ipc_insn_cnt = ptq->state->tot_insn_cnt; 1970 + ptq->ipc_cyc_cnt = ptq->state->tot_cyc_cnt; 2102 1971 2103 1972 /* 2104 1973 * Do PEBS first to allow for the possibility that the PEBS timestamp ··· 2105 1986 } 2106 1987 2107 1988 if (pt->sample_pwr_events) { 1989 + if (state->type & INTEL_PT_PSB_EVT) { 1990 + err = intel_pt_synth_psb_sample(ptq); 1991 + if (err) 1992 + return err; 1993 + } 2108 1994 if (ptq->state->cbr != ptq->cbr_seen) { 2109 1995 err = intel_pt_synth_cbr_sample(ptq); 2110 1996 if (err) ··· 2171 2047 } 2172 2048 2173 2049 if (pt->sample_branches) { 2174 - err = intel_pt_synth_branch_sample(ptq); 2050 + if (state->from_nr != state->to_nr && 2051 + state->from_ip && state->to_ip) { 2052 + struct intel_pt_state *st = (struct intel_pt_state *)state; 2053 + u64 to_ip = st->to_ip; 2054 + u64 from_ip = st->from_ip; 2055 + 2056 + /* 2057 + * perf cannot handle having different machines for ip 2058 + * and addr, so create 2 branches. 
2059 + */ 2060 + st->to_ip = 0; 2061 + err = intel_pt_synth_branch_sample(ptq); 2062 + if (err) 2063 + return err; 2064 + st->from_ip = 0; 2065 + st->to_ip = to_ip; 2066 + err = intel_pt_synth_branch_sample(ptq); 2067 + st->from_ip = from_ip; 2068 + } else { 2069 + err = intel_pt_synth_branch_sample(ptq); 2070 + } 2175 2071 if (err) 2176 2072 return err; 2177 2073 } ··· 3226 3082 return err; 3227 3083 pt->cbr_id = id; 3228 3084 intel_pt_set_event_name(evlist, id, "cbr"); 3085 + id += 1; 3086 + 3087 + attr.config = PERF_SYNTH_INTEL_PSB; 3088 + err = intel_pt_synth_event(session, "psb", &attr, id); 3089 + if (err) 3090 + return err; 3091 + pt->psb_id = id; 3092 + intel_pt_set_event_name(evlist, id, "psb"); 3229 3093 id += 1; 3230 3094 } 3231 3095
+16 -11
tools/perf/util/intlist.c
··· 13 13 static struct rb_node *intlist__node_new(struct rblist *rblist __maybe_unused, 14 14 const void *entry) 15 15 { 16 - int i = (int)((long)entry); 16 + unsigned long i = (unsigned long)entry; 17 17 struct rb_node *rc = NULL; 18 18 struct int_node *node = malloc(sizeof(*node)); 19 19 ··· 41 41 42 42 static int intlist__node_cmp(struct rb_node *rb_node, const void *entry) 43 43 { 44 - int i = (int)((long)entry); 44 + unsigned long i = (unsigned long)entry; 45 45 struct int_node *node = container_of(rb_node, struct int_node, rb_node); 46 46 47 - return node->i - i; 47 + if (node->i > i) 48 + return 1; 49 + else if (node->i < i) 50 + return -1; 51 + 52 + return 0; 48 53 } 49 54 50 - int intlist__add(struct intlist *ilist, int i) 55 + int intlist__add(struct intlist *ilist, unsigned long i) 51 56 { 52 - return rblist__add_node(&ilist->rblist, (void *)((long)i)); 57 + return rblist__add_node(&ilist->rblist, (void *)i); 53 58 } 54 59 55 60 void intlist__remove(struct intlist *ilist, struct int_node *node) ··· 63 58 } 64 59 65 60 static struct int_node *__intlist__findnew(struct intlist *ilist, 66 - int i, bool create) 61 + unsigned long i, bool create) 67 62 { 68 63 struct int_node *node = NULL; 69 64 struct rb_node *rb_node; ··· 72 67 return NULL; 73 68 74 69 if (create) 75 - rb_node = rblist__findnew(&ilist->rblist, (void *)((long)i)); 70 + rb_node = rblist__findnew(&ilist->rblist, (void *)i); 76 71 else 77 - rb_node = rblist__find(&ilist->rblist, (void *)((long)i)); 72 + rb_node = rblist__find(&ilist->rblist, (void *)i); 78 73 79 74 if (rb_node) 80 75 node = container_of(rb_node, struct int_node, rb_node); ··· 82 77 return node; 83 78 } 84 79 85 - struct int_node *intlist__find(struct intlist *ilist, int i) 80 + struct int_node *intlist__find(struct intlist *ilist, unsigned long i) 86 81 { 87 82 return __intlist__findnew(ilist, i, false); 88 83 } 89 84 90 - struct int_node *intlist__findnew(struct intlist *ilist, int i) 85 + struct int_node 
*intlist__findnew(struct intlist *ilist, unsigned long i) 91 86 { 92 87 return __intlist__findnew(ilist, i, true); 93 88 } ··· 98 93 int err; 99 94 100 95 do { 101 - long value = strtol(s, &sep, 10); 96 + unsigned long value = strtol(s, &sep, 10); 102 97 err = -EINVAL; 103 98 if (*sep != ',' && *sep != '\0') 104 99 break;
+5 -5
tools/perf/util/intlist.h
··· 9 9 10 10 struct int_node { 11 11 struct rb_node rb_node; 12 - int i; 12 + unsigned long i; 13 13 void *priv; 14 14 }; 15 15 ··· 21 21 void intlist__delete(struct intlist *ilist); 22 22 23 23 void intlist__remove(struct intlist *ilist, struct int_node *in); 24 - int intlist__add(struct intlist *ilist, int i); 24 + int intlist__add(struct intlist *ilist, unsigned long i); 25 25 26 26 struct int_node *intlist__entry(const struct intlist *ilist, unsigned int idx); 27 - struct int_node *intlist__find(struct intlist *ilist, int i); 28 - struct int_node *intlist__findnew(struct intlist *ilist, int i); 27 + struct int_node *intlist__find(struct intlist *ilist, unsigned long i); 28 + struct int_node *intlist__findnew(struct intlist *ilist, unsigned long i); 29 29 30 - static inline bool intlist__has_entry(struct intlist *ilist, int i) 30 + static inline bool intlist__has_entry(struct intlist *ilist, unsigned long i) 31 31 { 32 32 return intlist__find(ilist, i) != NULL; 33 33 }
+1 -1
tools/perf/util/jit.h
··· 5 5 #include <data.h> 6 6 7 7 int jit_process(struct perf_session *session, struct perf_data *output, 8 - struct machine *machine, char *filename, pid_t pid, u64 *nbytes); 8 + struct machine *machine, char *filename, pid_t pid, pid_t tid, u64 *nbytes); 9 9 10 10 int jit_inject_record(const char *filename); 11 11
+65 -19
tools/perf/util/jitdump.c
··· 18 18 #include "event.h" 19 19 #include "debug.h" 20 20 #include "evlist.h" 21 + #include "namespaces.h" 21 22 #include "symbol.h" 22 23 #include <elf.h> 23 24 ··· 36 35 struct perf_data *output; 37 36 struct perf_session *session; 38 37 struct machine *machine; 38 + struct nsinfo *nsi; 39 39 union jr_entry *entry; 40 40 void *buf; 41 41 uint64_t sample_type; ··· 74 72 #define get_jit_tool(t) (container_of(tool, struct jit_tool, tool)) 75 73 76 74 static int 77 - jit_emit_elf(char *filename, 75 + jit_emit_elf(struct jit_buf_desc *jd, 76 + char *filename, 78 77 const char *sym, 79 78 uint64_t code_addr, 80 79 const void *code, ··· 86 83 uint32_t unwinding_header_size, 87 84 uint32_t unwinding_size) 88 85 { 89 - int ret, fd; 86 + int ret, fd, saved_errno; 87 + struct nscookie nsc; 90 88 91 89 if (verbose > 0) 92 90 fprintf(stderr, "write ELF image %s\n", filename); 93 91 92 + nsinfo__mountns_enter(jd->nsi, &nsc); 94 93 fd = open(filename, O_CREAT|O_TRUNC|O_WRONLY, 0644); 94 + saved_errno = errno; 95 + nsinfo__mountns_exit(&nsc); 95 96 if (fd == -1) { 96 - pr_warning("cannot create jit ELF %s: %s\n", filename, strerror(errno)); 97 + pr_warning("cannot create jit ELF %s: %s\n", filename, strerror(saved_errno)); 97 98 return -1; 98 99 } 99 100 ··· 106 99 107 100 close(fd); 108 101 109 - if (ret) 110 - unlink(filename); 102 + if (ret) { 103 + nsinfo__mountns_enter(jd->nsi, &nsc); 104 + unlink(filename); 105 + nsinfo__mountns_exit(&nsc); 106 + } 111 107 112 108 return ret; 113 109 } ··· 144 134 jit_open(struct jit_buf_desc *jd, const char *name) 145 135 { 146 136 struct jitheader header; 137 + struct nscookie nsc; 147 138 struct jr_prefix *prefix; 148 139 ssize_t bs, bsz = 0; 149 140 void *n, *buf = NULL; 150 141 int ret, retval = -1; 151 142 143 + nsinfo__mountns_enter(jd->nsi, &nsc); 152 144 jd->in = fopen(name, "r"); 145 + nsinfo__mountns_exit(&nsc); 153 146 if (!jd->in) 154 147 return -1; 155 148 ··· 380 367 return 0; 381 368 } 382 369 370 + static pid_t 
jr_entry_pid(struct jit_buf_desc *jd, union jr_entry *jr) 371 + { 372 + if (jd->nsi && jd->nsi->in_pidns) 373 + return jd->nsi->tgid; 374 + return jr->load.pid; 375 + } 376 + 377 + static pid_t jr_entry_tid(struct jit_buf_desc *jd, union jr_entry *jr) 378 + { 379 + if (jd->nsi && jd->nsi->in_pidns) 380 + return jd->nsi->pid; 381 + return jr->load.tid; 382 + } 383 + 383 384 static uint64_t convert_timestamp(struct jit_buf_desc *jd, uint64_t timestamp) 384 385 { 385 386 struct perf_tsc_conversion tc; ··· 429 402 const char *sym; 430 403 uint64_t count; 431 404 int ret, csize, usize; 432 - pid_t pid, tid; 405 + pid_t nspid, pid, tid; 433 406 struct { 434 407 u32 pid, tid; 435 408 u64 time; 436 409 } *id; 437 410 438 - pid = jr->load.pid; 439 - tid = jr->load.tid; 411 + nspid = jr->load.pid; 412 + pid = jr_entry_pid(jd, jr); 413 + tid = jr_entry_tid(jd, jr); 440 414 csize = jr->load.code_size; 441 415 usize = jd->unwinding_mapped_size; 442 416 addr = jr->load.code_addr; ··· 453 425 filename = event->mmap2.filename; 454 426 size = snprintf(filename, PATH_MAX, "%s/jitted-%d-%" PRIu64 ".so", 455 427 jd->dir, 456 - pid, 428 + nspid, 457 429 count); 458 430 459 431 size++; /* for \0 */ 460 432 461 433 size = PERF_ALIGN(size, sizeof(u64)); 462 434 uaddr = (uintptr_t)code; 463 - ret = jit_emit_elf(filename, sym, addr, (const void *)uaddr, csize, jd->debug_data, jd->nr_debug_entries, 435 + ret = jit_emit_elf(jd, filename, sym, addr, (const void *)uaddr, csize, jd->debug_data, jd->nr_debug_entries, 464 436 jd->unwinding_data, jd->eh_frame_hdr_size, jd->unwinding_size); 465 437 466 438 if (jd->debug_data && jd->nr_debug_entries) { ··· 479 451 free(event); 480 452 return -1; 481 453 } 482 - if (stat(filename, &st)) 454 + if (nsinfo__stat(filename, &st, jd->nsi)) 483 455 memset(&st, 0, sizeof(st)); 484 456 485 457 event->mmap2.header.type = PERF_RECORD_MMAP2; ··· 543 515 int usize; 544 516 u16 idr_size; 545 517 int ret; 546 - pid_t pid, tid; 518 + pid_t nspid, pid, tid; 547 519 
struct { 548 520 u32 pid, tid; 549 521 u64 time; 550 522 } *id; 551 523 552 - pid = jr->move.pid; 553 - tid = jr->move.tid; 524 + nspid = jr->load.pid; 525 + pid = jr_entry_pid(jd, jr); 526 + tid = jr_entry_tid(jd, jr); 554 527 usize = jd->unwinding_mapped_size; 555 528 idr_size = jd->machine->id_hdr_size; 556 529 ··· 565 536 filename = event->mmap2.filename; 566 537 size = snprintf(filename, PATH_MAX, "%s/jitted-%d-%" PRIu64 ".so", 567 538 jd->dir, 568 - pid, 539 + nspid, 569 540 jr->move.code_index); 570 541 571 542 size++; /* for \0 */ 572 543 573 - if (stat(filename, &st)) 544 + if (nsinfo__stat(filename, &st, jd->nsi)) 574 545 memset(&st, 0, sizeof(st)); 575 546 576 547 size = PERF_ALIGN(size, sizeof(u64)); ··· 729 700 * as captured in the RECORD_MMAP record 730 701 */ 731 702 static int 732 - jit_detect(char *mmap_name, pid_t pid) 703 + jit_detect(char *mmap_name, pid_t pid, struct nsinfo *nsi) 733 704 { 734 705 char *p; 735 706 char *end = NULL; ··· 769 740 * pid does not match mmap pid 770 741 * pid==0 in system-wide mode (synthesized) 771 742 */ 772 - if (pid && pid2 != pid) 743 + if (pid && pid2 != nsi->nstgid) 773 744 return -1; 774 745 /* 775 746 * validate suffix ··· 811 782 struct machine *machine, 812 783 char *filename, 813 784 pid_t pid, 785 + pid_t tid, 814 786 u64 *nbytes) 815 787 { 788 + struct thread *thread; 789 + struct nsinfo *nsi; 816 790 struct evsel *first; 817 791 struct jit_buf_desc jd; 818 792 int ret; 819 793 794 + thread = machine__findnew_thread(machine, pid, tid); 795 + if (thread == NULL) { 796 + pr_err("problem processing JIT mmap event, skipping it.\n"); 797 + return 0; 798 + } 799 + 800 + nsi = nsinfo__get(thread->nsinfo); 801 + thread__put(thread); 802 + 820 803 /* 821 804 * first, detect marker mmap (i.e., the jitdump mmap) 822 805 */ 823 - if (jit_detect(filename, pid)) { 806 + if (jit_detect(filename, pid, nsi)) { 807 + nsinfo__put(nsi); 808 + 824 809 // Strip //anon* mmaps if we processed a jitdump for this pid 825 810 if 
(jit_has_pid(machine, pid) && (strncmp(filename, "//anon", 6) == 0)) 826 811 return 1; ··· 847 804 jd.session = session; 848 805 jd.output = output; 849 806 jd.machine = machine; 807 + jd.nsi = nsi; 850 808 851 809 /* 852 810 * track sample_type to compute id_all layout ··· 864 820 *nbytes = jd.bytes_written; 865 821 ret = 1; 866 822 } 823 + 824 + nsinfo__put(jd.nsi); 867 825 868 826 return ret; 869 827 }
+46 -5
tools/perf/util/machine.c
··· 369 369 return machine; 370 370 } 371 371 372 + struct machine *machines__find_guest(struct machines *machines, pid_t pid) 373 + { 374 + struct machine *machine = machines__find(machines, pid); 375 + 376 + if (!machine) 377 + machine = machines__findnew(machines, DEFAULT_GUEST_KERNEL_ID); 378 + return machine; 379 + } 380 + 372 381 void machines__process_guests(struct machines *machines, 373 382 machine__process_t process, void *data) 374 383 { ··· 596 587 th = ____machine__findnew_thread(machine, threads, pid, tid, false); 597 588 up_read(&threads->lock); 598 589 return th; 590 + } 591 + 592 + /* 593 + * Threads are identified by pid and tid, and the idle task has pid == tid == 0. 594 + * So here a single thread is created for that, but actually there is a separate 595 + * idle task per cpu, so there should be one 'struct thread' per cpu, but there 596 + * is only 1. That causes problems for some tools, requiring workarounds. For 597 + * example get_idle_thread() in builtin-sched.c, or thread_stack__per_cpu(). 
598 + */ 599 + struct thread *machine__idle_thread(struct machine *machine) 600 + { 601 + struct thread *thread = machine__findnew_thread(machine, 0, 0); 602 + 603 + if (!thread || thread__set_comm(thread, "swapper", 0) || 604 + thread__set_namespaces(thread, 0, NULL)) 605 + pr_err("problem inserting idle task for machine pid %d\n", machine->pid); 606 + 607 + return thread; 599 608 } 600 609 601 610 struct comm *machine__thread_exec_comm(struct machine *machine, ··· 1626 1599 } 1627 1600 1628 1601 static int machine__process_kernel_mmap_event(struct machine *machine, 1629 - struct extra_kernel_map *xm) 1602 + struct extra_kernel_map *xm, 1603 + struct build_id *bid) 1630 1604 { 1631 1605 struct map *map; 1632 1606 enum dso_space_type dso_space; ··· 1652 1624 goto out_problem; 1653 1625 1654 1626 map->end = map->start + xm->end - xm->start; 1627 + 1628 + if (build_id__is_defined(bid)) 1629 + dso__set_build_id(map->dso, bid); 1630 + 1655 1631 } else if (is_kernel_mmap) { 1656 1632 const char *symbol_name = (xm->name + strlen(machine->mmap_name)); 1657 1633 /* ··· 1713 1681 1714 1682 machine__update_kernel_mmap(machine, xm->start, xm->end); 1715 1683 1684 + if (build_id__is_defined(bid)) 1685 + dso__set_build_id(kernel, bid); 1686 + 1716 1687 /* 1717 1688 * Avoid using a zero address (kptr_restrict) for the ref reloc 1718 1689 * symbol. 
Effectively having zero here means that at record ··· 1753 1718 .ino = event->mmap2.ino, 1754 1719 .ino_generation = event->mmap2.ino_generation, 1755 1720 }; 1721 + struct build_id __bid, *bid = NULL; 1756 1722 int ret = 0; 1757 1723 1758 1724 if (dump_trace) 1759 1725 perf_event__fprintf_mmap2(event, stdout); 1726 + 1727 + if (event->header.misc & PERF_RECORD_MISC_MMAP_BUILD_ID) { 1728 + bid = &__bid; 1729 + build_id__init(bid, event->mmap2.build_id, event->mmap2.build_id_size); 1730 + } 1760 1731 1761 1732 if (sample->cpumode == PERF_RECORD_MISC_GUEST_KERNEL || 1762 1733 sample->cpumode == PERF_RECORD_MISC_KERNEL) { ··· 1773 1732 }; 1774 1733 1775 1734 strlcpy(xm.name, event->mmap2.filename, KMAP_NAME_LEN); 1776 - ret = machine__process_kernel_mmap_event(machine, &xm); 1735 + ret = machine__process_kernel_mmap_event(machine, &xm, bid); 1777 1736 if (ret < 0) 1778 1737 goto out_problem; 1779 1738 return 0; ··· 1787 1746 map = map__new(machine, event->mmap2.start, 1788 1747 event->mmap2.len, event->mmap2.pgoff, 1789 1748 &dso_id, event->mmap2.prot, 1790 - event->mmap2.flags, 1749 + event->mmap2.flags, bid, 1791 1750 event->mmap2.filename, thread); 1792 1751 1793 1752 if (map == NULL) ··· 1830 1789 }; 1831 1790 1832 1791 strlcpy(xm.name, event->mmap.filename, KMAP_NAME_LEN); 1833 - ret = machine__process_kernel_mmap_event(machine, &xm); 1792 + ret = machine__process_kernel_mmap_event(machine, &xm, NULL); 1834 1793 if (ret < 0) 1835 1794 goto out_problem; 1836 1795 return 0; ··· 1846 1805 1847 1806 map = map__new(machine, event->mmap.start, 1848 1807 event->mmap.len, event->mmap.pgoff, 1849 - NULL, prot, 0, event->mmap.filename, thread); 1808 + NULL, prot, 0, NULL, event->mmap.filename, thread); 1850 1809 1851 1810 if (map == NULL) 1852 1811 goto out_problem_map;
+2
tools/perf/util/machine.h
··· 106 106 107 107 struct thread *machine__find_thread(struct machine *machine, pid_t pid, 108 108 pid_t tid); 109 + struct thread *machine__idle_thread(struct machine *machine); 109 110 struct comm *machine__thread_exec_comm(struct machine *machine, 110 111 struct thread *thread); 111 112 ··· 163 162 struct machine *machines__find_host(struct machines *machines); 164 163 struct machine *machines__find(struct machines *machines, pid_t pid); 165 164 struct machine *machines__findnew(struct machines *machines, pid_t pid); 165 + struct machine *machines__find_guest(struct machines *machines, pid_t pid); 166 166 167 167 void machines__set_id_hdr_size(struct machines *machines, u16 id_hdr_size); 168 168 void machines__set_comm_exec(struct machines *machines, bool comm_exec);
+6 -2
tools/perf/util/map.c
··· 130 130 131 131 struct map *map__new(struct machine *machine, u64 start, u64 len, 132 132 u64 pgoff, struct dso_id *id, 133 - u32 prot, u32 flags, char *filename, 134 - struct thread *thread) 133 + u32 prot, u32 flags, struct build_id *bid, 134 + char *filename, struct thread *thread) 135 135 { 136 136 struct map *map = malloc(sizeof(*map)); 137 137 struct nsinfo *nsi = NULL; ··· 194 194 dso__set_loaded(dso); 195 195 } 196 196 dso->nsinfo = nsi; 197 + 198 + if (build_id__is_defined(bid)) 199 + dso__set_build_id(dso, bid); 200 + 197 201 dso__put(dso); 198 202 } 199 203 return map;
+2 -1
tools/perf/util/map.h
··· 104 104 u64 start, u64 end, u64 pgoff, struct dso *dso); 105 105 106 106 struct dso_id; 107 + struct build_id; 107 108 108 109 struct map *map__new(struct machine *machine, u64 start, u64 len, 109 110 u64 pgoff, struct dso_id *id, u32 prot, u32 flags, 110 - char *filename, struct thread *thread); 111 + struct build_id *bid, char *filename, struct thread *thread); 111 112 struct map *map__new2(u64 start, struct dso *dso); 112 113 void map__delete(struct map *map); 113 114 struct map *map__clone(struct map *map);
+36
tools/perf/util/mem-events.c
··· 56 56 return (char *)e->name; 57 57 } 58 58 59 + __weak bool is_mem_loads_aux_event(struct evsel *leader __maybe_unused) 60 + { 61 + return false; 62 + } 63 + 59 64 int perf_mem_events__parse(const char *str) 60 65 { 61 66 char *tok, *saveptr = NULL; ··· 337 332 return l; 338 333 } 339 334 335 + int perf_mem__blk_scnprintf(char *out, size_t sz, struct mem_info *mem_info) 336 + { 337 + size_t l = 0; 338 + u64 mask = PERF_MEM_BLK_NA; 339 + 340 + sz -= 1; /* -1 for null termination */ 341 + out[0] = '\0'; 342 + 343 + if (mem_info) 344 + mask = mem_info->data_src.mem_blk; 345 + 346 + if (!mask || (mask & PERF_MEM_BLK_NA)) { 347 + l += scnprintf(out + l, sz - l, " N/A"); 348 + return l; 349 + } 350 + if (mask & PERF_MEM_BLK_DATA) 351 + l += scnprintf(out + l, sz - l, " Data"); 352 + if (mask & PERF_MEM_BLK_ADDR) 353 + l += scnprintf(out + l, sz - l, " Addr"); 354 + 355 + return l; 356 + } 357 + 340 358 int perf_script__meminfo_scnprintf(char *out, size_t sz, struct mem_info *mem_info) 341 359 { 342 360 int i = 0; ··· 371 343 i += perf_mem__tlb_scnprintf(out + i, sz - i, mem_info); 372 344 i += scnprintf(out + i, sz - i, "|LCK "); 373 345 i += perf_mem__lck_scnprintf(out + i, sz - i, mem_info); 346 + i += scnprintf(out + i, sz - i, "|BLK "); 347 + i += perf_mem__blk_scnprintf(out + i, sz - i, mem_info); 374 348 375 349 return i; 376 350 } ··· 385 355 u64 lvl = data_src->mem_lvl; 386 356 u64 snoop = data_src->mem_snoop; 387 357 u64 lock = data_src->mem_lock; 358 + u64 blk = data_src->mem_blk; 388 359 /* 389 360 * Skylake might report unknown remote level via this 390 361 * bit, consider it when evaluating remote HITMs. 
··· 404 373 stats->nr_entries++; 405 374 406 375 if (lock & P(LOCK, LOCKED)) stats->locks++; 376 + 377 + if (blk & P(BLK, DATA)) stats->blk_data++; 378 + if (blk & P(BLK, ADDR)) stats->blk_addr++; 407 379 408 380 if (op & P(OP, LOAD)) { 409 381 /* load */ ··· 519 485 stats->rmt_hit += add->rmt_hit; 520 486 stats->lcl_dram += add->lcl_dram; 521 487 stats->rmt_dram += add->rmt_dram; 488 + stats->blk_data += add->blk_data; 489 + stats->blk_addr += add->blk_addr; 522 490 stats->nomap += add->nomap; 523 491 stats->noparse += add->noparse; 524 492 }
+5
tools/perf/util/mem-events.h
··· 9 9 #include <linux/refcount.h> 10 10 #include <linux/perf_event.h> 11 11 #include "stat.h" 12 + #include "evsel.h" 12 13 13 14 struct perf_mem_event { 14 15 bool record; ··· 40 39 41 40 char *perf_mem_events__name(int i); 42 41 struct perf_mem_event *perf_mem_events__ptr(int i); 42 + bool is_mem_loads_aux_event(struct evsel *leader); 43 43 44 44 void perf_mem_events__list(void); 45 45 ··· 49 47 int perf_mem__lvl_scnprintf(char *out, size_t sz, struct mem_info *mem_info); 50 48 int perf_mem__snp_scnprintf(char *out, size_t sz, struct mem_info *mem_info); 51 49 int perf_mem__lck_scnprintf(char *out, size_t sz, struct mem_info *mem_info); 50 + int perf_mem__blk_scnprintf(char *out, size_t sz, struct mem_info *mem_info); 52 51 53 52 int perf_script__meminfo_scnprintf(char *bf, size_t size, struct mem_info *mem_info); 54 53 ··· 79 76 u32 rmt_hit; /* count of loads with remote hit clean; */ 80 77 u32 lcl_dram; /* count of loads miss to local DRAM */ 81 78 u32 rmt_dram; /* count of loads miss to remote DRAM */ 79 + u32 blk_data; /* count of loads blocked by data */ 80 + u32 blk_addr; /* count of loads blocked by address conflict */ 82 81 u32 nomap; /* count of load/stores with no phys adrs */ 83 82 u32 noparse; /* count of unparsable data sources */ 84 83 };
+1 -1
tools/perf/util/metricgroup.c
··· 379 379 metric_refs[i].metric_expr = ref->metric_expr; 380 380 i++; 381 381 } 382 - }; 382 + } 383 383 384 384 expr->metric_refs = metric_refs; 385 385 expr->metric_expr = m->metric_expr;
+21 -2
tools/perf/util/namespaces.c
··· 66 66 char spath[PATH_MAX]; 67 67 char *newns = NULL; 68 68 char *statln = NULL; 69 + char *nspid; 69 70 struct stat old_stat; 70 71 struct stat new_stat; 71 72 FILE *f = NULL; ··· 113 112 } 114 113 115 114 if (strstr(statln, "NStgid:") != NULL) { 116 - nsi->nstgid = (pid_t)strtol(strrchr(statln, '\t'), 117 - NULL, 10); 115 + nspid = strrchr(statln, '\t'); 116 + nsi->nstgid = (pid_t)strtol(nspid, NULL, 10); 117 + /* If innermost tgid is not the first, process is in a different 118 + * PID namespace. 119 + */ 120 + nsi->in_pidns = (statln + sizeof("NStgid:") - 1) != nspid; 118 121 break; 119 122 } 120 123 } ··· 145 140 nsi->tgid = pid; 146 141 nsi->nstgid = pid; 147 142 nsi->need_setns = false; 143 + nsi->in_pidns = false; 148 144 /* Init may fail if the process exits while we're trying to look 149 145 * at its proc information. In that case, save the pid but 150 146 * don't try to enter the namespace. ··· 172 166 nnsi->tgid = nsi->tgid; 173 167 nnsi->nstgid = nsi->nstgid; 174 168 nnsi->need_setns = nsi->need_setns; 169 + nnsi->in_pidns = nsi->in_pidns; 175 170 if (nsi->mntns_path) { 176 171 nnsi->mntns_path = strdup(nsi->mntns_path); 177 172 if (!nnsi->mntns_path) { ··· 286 279 nsinfo__mountns_exit(&nsc); 287 280 288 281 return rpath; 282 + } 283 + 284 + int nsinfo__stat(const char *filename, struct stat *st, struct nsinfo *nsi) 285 + { 286 + int ret; 287 + struct nscookie nsc; 288 + 289 + nsinfo__mountns_enter(nsi, &nsc); 290 + ret = stat(filename, st); 291 + nsinfo__mountns_exit(&nsc); 292 + 293 + return ret; 289 294 }
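The `NStgid:` handling added above keys off the position of the last tab-separated field in `/proc/<pid>/status`: the rightmost value is the tgid in the innermost pid namespace, and if it is not also the first value, the process lives inside a child pid namespace. A minimal standalone sketch of that parse, with `parse_nstgid()` as a hypothetical helper name:

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper mirroring the NStgid parsing in the diff above:
 * the last tab-separated value on the "NStgid:" line is the thread
 * group id in the innermost pid namespace; if that value is not also
 * the first one, the process runs in a child pid namespace. */
static bool parse_nstgid(const char *statln, pid_t *nstgid, bool *in_pidns)
{
	const char *nspid;

	if (strstr(statln, "NStgid:") == NULL)
		return false;

	nspid = strrchr(statln, '\t');
	if (nspid == NULL)
		return false;

	*nstgid = (pid_t)strtol(nspid, NULL, 10);
	/* the first field starts right after the "NStgid:" prefix */
	*in_pidns = (statln + sizeof("NStgid:") - 1) != nspid;
	return true;
}
```

A line such as `NStgid:\t1234\t1` therefore yields nstgid 1 with `in_pidns` set, while `NStgid:\t1234` yields 1234 in the root namespace.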
+3
tools/perf/util/namespaces.h
··· 8 8 #define __PERF_NAMESPACES_H 9 9 10 10 #include <sys/types.h> 11 + #include <sys/stat.h> 11 12 #include <linux/stddef.h> 12 13 #include <linux/perf_event.h> 13 14 #include <linux/refcount.h> ··· 34 33 pid_t tgid; 35 34 pid_t nstgid; 36 35 bool need_setns; 36 + bool in_pidns; 37 37 char *mntns_path; 38 38 refcount_t refcnt; 39 39 }; ··· 57 55 void nsinfo__mountns_exit(struct nscookie *nc); 58 56 59 57 char *nsinfo__realpath(const char *path, struct nsinfo *nsi); 58 + int nsinfo__stat(const char *filename, struct stat *st, struct nsinfo *nsi); 60 59 61 60 static inline void __nsinfo__zput(struct nsinfo **nsip) 62 61 {
+1
tools/perf/util/parse-events.l
··· 356 356 cycles-ct | 357 357 cycles-t | 358 358 mem-loads | 359 + mem-loads-aux | 359 360 mem-stores | 360 361 topdown-[a-z-]+ | 361 362 tx-capacity-[a-z-]+ |
+10
tools/perf/util/perf_api_probe.c
··· 98 98 evsel->core.attr.text_poke = 1; 99 99 } 100 100 101 + static void perf_probe_build_id(struct evsel *evsel) 102 + { 103 + evsel->core.attr.build_id = 1; 104 + } 105 + 101 106 bool perf_can_sample_identifier(void) 102 107 { 103 108 return perf_probe_api(perf_probe_sample_identifier); ··· 176 171 close(fd); 177 172 178 173 return true; 174 + } 175 + 176 + bool perf_can_record_build_id(void) 177 + { 178 + return perf_probe_api(perf_probe_build_id); 179 179 }
+1
tools/perf/util/perf_api_probe.h
··· 11 11 bool perf_can_record_switch_events(void); 12 12 bool perf_can_record_text_poke_events(void); 13 13 bool perf_can_sample_identifier(void); 14 + bool perf_can_record_build_id(void); 14 15 15 16 #endif // __PERF_API_PROBE_H
+4 -1
tools/perf/util/perf_event_attr_fprintf.c
··· 35 35 bit_name(BRANCH_STACK), bit_name(REGS_USER), bit_name(STACK_USER), 36 36 bit_name(IDENTIFIER), bit_name(REGS_INTR), bit_name(DATA_SRC), 37 37 bit_name(WEIGHT), bit_name(PHYS_ADDR), bit_name(AUX), 38 - bit_name(CGROUP), bit_name(DATA_PAGE_SIZE), 38 + bit_name(CGROUP), bit_name(DATA_PAGE_SIZE), bit_name(CODE_PAGE_SIZE), 39 + bit_name(WEIGHT_STRUCT), 39 40 { .name = NULL, } 40 41 }; 41 42 #undef bit_name ··· 135 134 PRINT_ATTRf(bpf_event, p_unsigned); 136 135 PRINT_ATTRf(aux_output, p_unsigned); 137 136 PRINT_ATTRf(cgroup, p_unsigned); 137 + PRINT_ATTRf(text_poke, p_unsigned); 138 + PRINT_ATTRf(build_id, p_unsigned); 138 139 139 140 PRINT_ATTRn("{ wakeup_events, wakeup_watermark }", wakeup_events, p_unsigned); 140 141 PRINT_ATTRf(bp_type, p_unsigned);
+7
tools/perf/util/perf_regs.h
··· 33 33 34 34 int perf_reg_value(u64 *valp, struct regs_dump *regs, int id); 35 35 36 + static inline const char *perf_reg_name(int id) 37 + { 38 + const char *reg_name = __perf_reg_name(id); 39 + 40 + return reg_name ?: "unknown"; 41 + } 42 + 36 43 #else 37 44 #define PERF_REGS_MASK 0 38 45 #define PERF_REGS_MAX 0
+11 -1
tools/perf/util/probe-event.c
··· 894 894 struct debuginfo *dinfo; 895 895 int ntevs, ret = 0; 896 896 897 + /* Workaround for gcc #98776 issue. 898 + * Perf failed to add kretprobe event with debuginfo of vmlinux which is 899 + * compiled by gcc with -fpatchable-function-entry option enabled. The 900 + * same issue occurs with kernel modules. The retprobe doesn't need debuginfo. 901 + * This workaround uses the map to query the probe function address 902 + * for retprobe event. 903 + */ 904 + if (pev->point.retprobe) 905 + return 0; 906 + 897 907 dinfo = open_debuginfo(pev->target, pev->nsi, !need_dwarf); 898 908 if (!dinfo) { 899 909 if (need_dwarf) ··· 1084 1074 } 1085 1075 1086 1076 intlist__for_each_entry(ln, lr->line_list) { 1087 - for (; ln->i > l; l++) { 1077 + for (; ln->i > (unsigned long)l; l++) { 1088 1078 ret = show_one_line(fp, l - lr->offset); 1089 1079 if (ret < 0) 1090 1080 goto end;
+36 -2
tools/perf/util/probe-file.c
··· 794 794 char *ret = NULL; 795 795 int i, args_count, err; 796 796 unsigned long long ref_ctr_offset; 797 + char *arg; 798 + int arg_idx = 0; 797 799 798 800 if (strbuf_init(&buf, 32) < 0) 799 801 return NULL; ··· 820 818 if (args == NULL) 821 819 goto error; 822 820 823 - for (i = 0; i < args_count; ++i) { 824 - if (synthesize_sdt_probe_arg(&buf, i, args[i]) < 0) { 821 + for (i = 0; i < args_count; ) { 822 + /* 823 + * FIXUP: Arm64 ELF section '.note.stapsdt' uses string 824 + * format "-4@[sp, NUM]" if a probe is to access data in 825 + * the stack, e.g. below is an example for the SDT 826 + * Arguments: 827 + * 828 + * Arguments: -4@[sp, 12] -4@[sp, 8] -4@[sp, 4] 829 + * 830 + * Since the string introduces an extra space character 831 + * in the middle of square brackets, the argument is 832 + * divided into two items. Fixup for this case, if an 833 + * item contains sub string "[sp,", need to concatenate 834 + * the two items. 835 + */ 836 + if (strstr(args[i], "[sp,") && (i+1) < args_count) { 837 + err = asprintf(&arg, "%s %s", args[i], args[i+1]); 838 + i += 2; 839 + } else { 840 + err = asprintf(&arg, "%s", args[i]); 841 + i += 1; 842 + } 843 + 844 + /* Failed to allocate memory */ 845 + if (err < 0) { 825 846 argv_free(args); 826 847 goto error; 827 848 } 849 + 850 + if (synthesize_sdt_probe_arg(&buf, arg_idx, arg) < 0) { 851 + free(arg); 852 + argv_free(args); 853 + goto error; 854 + } 855 + 856 + free(arg); 857 + arg_idx++; 828 858 } 829 859 830 860 argv_free(args);
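The `[sp,` fixup above rejoins an Arm64 SDT operand that argv-style splitting broke at the space inside `-4@[sp, 12]`. The core of that logic can be isolated into a small helper; this is a sketch with a hypothetical `fixup_sdt_arg()` that reports how many argv entries one operand consumed:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper capturing the fixup above: whitespace splitting
 * breaks an Arm64 SDT stack operand like "-4@[sp, 12]" into two argv
 * items, so an item containing "[sp," must be concatenated with the
 * item that follows it.  Returns the number of argv entries consumed. */
static int fixup_sdt_arg(char * const *args, int args_count, int i,
			 char *out, size_t sz)
{
	if (strstr(args[i], "[sp,") != NULL && i + 1 < args_count) {
		snprintf(out, sz, "%s %s", args[i], args[i + 1]);
		return 2;
	}
	snprintf(out, sz, "%s", args[i]);
	return 1;
}
```

The caller advances its loop index by the return value instead of by one, which is exactly why the diff above switches from `for (i = 0; i < args_count; ++i)` to manual increments.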
+6 -2
tools/perf/util/probe-finder.c
··· 1187 1187 while (!dwarf_nextcu(dbg->dbg, off, &noff, &cuhl, NULL, NULL, NULL)) { 1188 1188 /* Get the DIE(Debugging Information Entry) of this CU */ 1189 1189 diep = dwarf_offdie(dbg->dbg, off + cuhl, &pf->cu_die); 1190 - if (!diep) 1190 + if (!diep) { 1191 + off = noff; 1191 1192 continue; 1193 + } 1192 1194 1193 1195 /* Check if target file is included. */ 1194 1196 if (pp->file) ··· 1951 1949 1952 1950 /* Get the DIE(Debugging Information Entry) of this CU */ 1953 1951 diep = dwarf_offdie(dbg->dbg, off + cuhl, &lf.cu_die); 1954 - if (!diep) 1952 + if (!diep) { 1953 + off = noff; 1955 1954 continue; 1955 + } 1956 1956 1957 1957 /* Check if target file is included. */ 1958 1958 if (lr->file)
+1
tools/perf/util/python-ext-sources
··· 10 10 util/cap.c 11 11 util/evlist.c 12 12 util/evsel.c 13 + util/evsel_fprintf.c 13 14 util/perf_event_attr_fprintf.c 14 15 util/cpumap.c 15 16 util/memswap.c
+21
tools/perf/util/python.c
··· 80 80 } 81 81 82 82 /* 83 + * XXX: All these evsel destructors need some better mechanism, like a linked 84 + * list of destructors registered when the relevant code indeed is used instead 85 + * of having more and more calls in perf_evsel__delete(). -- acme 86 + * 87 + * For now, add some more: 88 + * 89 + * Not to drag the BPF bandwagon... 90 + */ 91 + void bpf_counter__destroy(struct evsel *evsel); 92 + int bpf_counter__install_pe(struct evsel *evsel, int cpu, int fd); 93 + 94 + void bpf_counter__destroy(struct evsel *evsel __maybe_unused) 95 + { 96 + } 97 + 98 + int bpf_counter__install_pe(struct evsel *evsel __maybe_unused, int cpu __maybe_unused, int fd __maybe_unused) 99 + { 100 + return 0; 101 + } 102 + 103 + /* 83 104 * Support debug printing even though util/debug.c is not linked. That means 84 105 * implementing 'verbose' and 'eprintf'. 85 106 */
+6 -3
tools/perf/util/record.c
··· 15 15 #include "record.h" 16 16 #include "../perf-sys.h" 17 17 #include "topdown.h" 18 + #include "map_symbol.h" 19 + #include "mem-events.h" 18 20 19 21 /* 20 22 * evsel__config_leader_sampling() uses special rules for leader sampling. ··· 27 25 { 28 26 struct evsel *leader = evsel->leader; 29 27 30 - if (evsel__is_aux_event(leader) || arch_topdown_sample_read(leader)) { 28 + if (evsel__is_aux_event(leader) || arch_topdown_sample_read(leader) || 29 + is_mem_loads_aux_event(leader)) { 31 30 evlist__for_each_entry(evlist, evsel) { 32 31 if (evsel->leader == leader && evsel != evsel->leader) 33 32 return evsel; ··· 204 201 * Default frequency is over current maximum. 205 202 */ 206 203 if (max_rate < opts->freq) { 207 - pr_warning("Lowering default frequency rate to %u.\n" 204 + pr_warning("Lowering default frequency rate from %u to %u.\n" 208 205 "Please consider tweaking " 209 206 "/proc/sys/kernel/perf_event_max_sample_rate.\n", 210 - max_rate); 207 + opts->freq, max_rate); 211 208 opts->freq = max_rate; 212 209 } 213 210
+2
tools/perf/util/record.h
··· 23 23 bool sample_address; 24 24 bool sample_phys_addr; 25 25 bool sample_data_page_size; 26 + bool sample_code_page_size; 26 27 bool sample_weight; 27 28 bool sample_time; 28 29 bool sample_time_set; ··· 51 50 bool no_bpf_event; 52 51 bool kcore; 53 52 bool text_poke; 53 + bool build_id; 54 54 unsigned int freq; 55 55 unsigned int mmap_pages; 56 56 unsigned int auxtrace_mmap_pages;
+20 -34
tools/perf/util/session.c
··· 593 593 event->mmap2.start = bswap_64(event->mmap2.start); 594 594 event->mmap2.len = bswap_64(event->mmap2.len); 595 595 event->mmap2.pgoff = bswap_64(event->mmap2.pgoff); 596 - event->mmap2.maj = bswap_32(event->mmap2.maj); 597 - event->mmap2.min = bswap_32(event->mmap2.min); 598 - event->mmap2.ino = bswap_64(event->mmap2.ino); 599 - event->mmap2.ino_generation = bswap_64(event->mmap2.ino_generation); 596 + 597 + if (!(event->header.misc & PERF_RECORD_MISC_MMAP_BUILD_ID)) { 598 + event->mmap2.maj = bswap_32(event->mmap2.maj); 599 + event->mmap2.min = bswap_32(event->mmap2.min); 600 + event->mmap2.ino = bswap_64(event->mmap2.ino); 601 + event->mmap2.ino_generation = bswap_64(event->mmap2.ino_generation); 602 + } 600 603 601 604 if (sample_id_all) { 602 605 void *data = &event->mmap2.filename; ··· 1300 1297 if (sample_type & PERF_SAMPLE_STACK_USER) 1301 1298 stack_user__printf(&sample->user_stack); 1302 1299 1303 - if (sample_type & PERF_SAMPLE_WEIGHT) 1304 - printf("... weight: %" PRIu64 "\n", sample->weight); 1300 + if (sample_type & PERF_SAMPLE_WEIGHT_TYPE) { 1301 + printf("... weight: %" PRIu64 "", sample->weight); 1302 + if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT) 1303 + printf(",0x%"PRIx16"", sample->ins_lat); 1304 + printf("\n"); 1305 + } 1305 1306 1306 1307 if (sample_type & PERF_SAMPLE_DATA_SRC) 1307 1308 printf(" . data_src: 0x%"PRIx64"\n", sample->data_src); ··· 1315 1308 1316 1309 if (sample_type & PERF_SAMPLE_DATA_PAGE_SIZE) 1317 1310 printf(" .. data page size: %s\n", get_page_size_name(sample->data_page_size, str)); 1311 + 1312 + if (sample_type & PERF_SAMPLE_CODE_PAGE_SIZE) 1313 + printf(" .. code page size: %s\n", get_page_size_name(sample->code_page_size, str)); 1318 1314 1319 1315 if (sample_type & PERF_SAMPLE_TRANSACTION) 1320 1316 printf("... 
transaction: %" PRIx64 "\n", sample->transaction); ··· 1356 1346 union perf_event *event, 1357 1347 struct perf_sample *sample) 1358 1348 { 1359 - struct machine *machine; 1360 - 1361 1349 if (perf_guest && 1362 1350 ((sample->cpumode == PERF_RECORD_MISC_GUEST_KERNEL) || 1363 1351 (sample->cpumode == PERF_RECORD_MISC_GUEST_USER))) { ··· 1367 1359 else 1368 1360 pid = sample->pid; 1369 1361 1370 - machine = machines__find(machines, pid); 1371 - if (!machine) 1372 - machine = machines__findnew(machines, DEFAULT_GUEST_KERNEL_ID); 1373 - return machine; 1362 + return machines__find_guest(machines, pid); 1374 1363 } 1375 1364 1376 1365 return &machines->host; ··· 1789 1784 return machine__findnew_thread(&session->machines.host, -1, pid); 1790 1785 } 1791 1786 1792 - /* 1793 - * Threads are identified by pid and tid, and the idle task has pid == tid == 0. 1794 - * So here a single thread is created for that, but actually there is a separate 1795 - * idle task per cpu, so there should be one 'struct thread' per cpu, but there 1796 - * is only 1. That causes problems for some tools, requiring workarounds. For 1797 - * example get_idle_thread() in builtin-sched.c, or thread_stack__per_cpu(). 
1798 - */ 1799 1787 int perf_session__register_idle_thread(struct perf_session *session) 1800 1788 { 1801 - struct thread *thread; 1802 - int err = 0; 1789 + struct thread *thread = machine__idle_thread(&session->machines.host); 1803 1790 1804 - thread = machine__findnew_thread(&session->machines.host, 0, 0); 1805 - if (thread == NULL || thread__set_comm(thread, "swapper", 0)) { 1806 - pr_err("problem inserting idle task.\n"); 1807 - err = -1; 1808 - } 1809 - 1810 - if (thread == NULL || thread__set_namespaces(thread, 0, NULL)) { 1811 - pr_err("problem inserting idle task.\n"); 1812 - err = -1; 1813 - } 1814 - 1815 - /* machine__findnew_thread() got the thread, so put it */ 1791 + /* machine__idle_thread() got the thread, so put it */ 1816 1792 thread__put(thread); 1817 - return err; 1793 + return thread ? 0 : -1; 1818 1794 } 1819 1795 1820 1796 static void
+1 -1
tools/perf/util/setup.py
··· 43 43 44 44 cflags = getenv('CFLAGS', '').split() 45 45 # switch off several checks (need to be at the end of cflags list) 46 - cflags += ['-fno-strict-aliasing', '-Wno-write-strings', '-Wno-unused-parameter', '-Wno-redundant-decls' ] 46 + cflags += ['-fno-strict-aliasing', '-Wno-write-strings', '-Wno-unused-parameter', '-Wno-redundant-decls', '-DPYTHON_PERF' ] 47 47 if not cc_is_clang: 48 48 cflags += ['-Wno-cast-function-type' ] 49 49
+108 -1
tools/perf/util/sort.c
··· 36 36 const char *parent_pattern = default_parent_pattern; 37 37 const char *default_sort_order = "comm,dso,symbol"; 38 38 const char default_branch_sort_order[] = "comm,dso_from,symbol_from,symbol_to,cycles"; 39 - const char default_mem_sort_order[] = "local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked"; 39 + const char default_mem_sort_order[] = "local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat"; 40 40 const char default_top_sort_order[] = "dso,symbol"; 41 41 const char default_diff_sort_order[] = "dso,symbol"; 42 42 const char default_tracepoint_sort_order[] = "trace"; ··· 1365 1365 .se_width_idx = HISTC_GLOBAL_WEIGHT, 1366 1366 }; 1367 1367 1368 + static u64 he_ins_lat(struct hist_entry *he) 1369 + { 1370 + return he->stat.nr_events ? he->stat.ins_lat / he->stat.nr_events : 0; 1371 + } 1372 + 1373 + static int64_t 1374 + sort__local_ins_lat_cmp(struct hist_entry *left, struct hist_entry *right) 1375 + { 1376 + return he_ins_lat(left) - he_ins_lat(right); 1377 + } 1378 + 1379 + static int hist_entry__local_ins_lat_snprintf(struct hist_entry *he, char *bf, 1380 + size_t size, unsigned int width) 1381 + { 1382 + return repsep_snprintf(bf, size, "%-*u", width, he_ins_lat(he)); 1383 + } 1384 + 1385 + struct sort_entry sort_local_ins_lat = { 1386 + .se_header = "Local INSTR Latency", 1387 + .se_cmp = sort__local_ins_lat_cmp, 1388 + .se_snprintf = hist_entry__local_ins_lat_snprintf, 1389 + .se_width_idx = HISTC_LOCAL_INS_LAT, 1390 + }; 1391 + 1392 + static int64_t 1393 + sort__global_ins_lat_cmp(struct hist_entry *left, struct hist_entry *right) 1394 + { 1395 + return left->stat.ins_lat - right->stat.ins_lat; 1396 + } 1397 + 1398 + static int hist_entry__global_ins_lat_snprintf(struct hist_entry *he, char *bf, 1399 + size_t size, unsigned int width) 1400 + { 1401 + return repsep_snprintf(bf, size, "%-*u", width, he->stat.ins_lat); 1402 + } 1403 + 1404 + struct sort_entry sort_global_ins_lat = { 1405 + .se_header 
= "INSTR Latency", 1406 + .se_cmp = sort__global_ins_lat_cmp, 1407 + .se_snprintf = hist_entry__global_ins_lat_snprintf, 1408 + .se_width_idx = HISTC_GLOBAL_INS_LAT, 1409 + }; 1410 + 1368 1411 struct sort_entry sort_mem_daddr_sym = { 1369 1412 .se_header = "Data Symbol", 1370 1413 .se_cmp = sort__daddr_cmp, ··· 1462 1419 .se_cmp = sort__dcacheline_cmp, 1463 1420 .se_snprintf = hist_entry__dcacheline_snprintf, 1464 1421 .se_width_idx = HISTC_MEM_DCACHELINE, 1422 + }; 1423 + 1424 + static int64_t 1425 + sort__blocked_cmp(struct hist_entry *left, struct hist_entry *right) 1426 + { 1427 + union perf_mem_data_src data_src_l; 1428 + union perf_mem_data_src data_src_r; 1429 + 1430 + if (left->mem_info) 1431 + data_src_l = left->mem_info->data_src; 1432 + else 1433 + data_src_l.mem_blk = PERF_MEM_BLK_NA; 1434 + 1435 + if (right->mem_info) 1436 + data_src_r = right->mem_info->data_src; 1437 + else 1438 + data_src_r.mem_blk = PERF_MEM_BLK_NA; 1439 + 1440 + return (int64_t)(data_src_r.mem_blk - data_src_l.mem_blk); 1441 + } 1442 + 1443 + static int hist_entry__blocked_snprintf(struct hist_entry *he, char *bf, 1444 + size_t size, unsigned int width) 1445 + { 1446 + char out[16]; 1447 + 1448 + perf_mem__blk_scnprintf(out, sizeof(out), he->mem_info); 1449 + return repsep_snprintf(bf, size, "%.*s", width, out); 1450 + } 1451 + 1452 + struct sort_entry sort_mem_blocked = { 1453 + .se_header = "Blocked", 1454 + .se_cmp = sort__blocked_cmp, 1455 + .se_snprintf = hist_entry__blocked_snprintf, 1456 + .se_width_idx = HISTC_MEM_BLOCKED, 1465 1457 }; 1466 1458 1467 1459 static int64_t ··· 1567 1489 .se_cmp = sort__data_page_size_cmp, 1568 1490 .se_snprintf = hist_entry__data_page_size_snprintf, 1569 1491 .se_width_idx = HISTC_MEM_DATA_PAGE_SIZE, 1492 + }; 1493 + 1494 + static int64_t 1495 + sort__code_page_size_cmp(struct hist_entry *left, struct hist_entry *right) 1496 + { 1497 + uint64_t l = left->code_page_size; 1498 + uint64_t r = right->code_page_size; 1499 + 1500 + return 
(int64_t)(r - l); 1501 + } 1502 + 1503 + static int hist_entry__code_page_size_snprintf(struct hist_entry *he, char *bf, 1504 + size_t size, unsigned int width) 1505 + { 1506 + char str[PAGE_SIZE_NAME_LEN]; 1507 + 1508 + return repsep_snprintf(bf, size, "%-*s", width, 1509 + get_page_size_name(he->code_page_size, str)); 1510 + } 1511 + 1512 + struct sort_entry sort_code_page_size = { 1513 + .se_header = "Code Page Size", 1514 + .se_cmp = sort__code_page_size_cmp, 1515 + .se_snprintf = hist_entry__code_page_size_snprintf, 1516 + .se_width_idx = HISTC_CODE_PAGE_SIZE, 1570 1517 }; 1571 1518 1572 1519 static int64_t ··· 1838 1735 DIM(SORT_CGROUP_ID, "cgroup_id", sort_cgroup_id), 1839 1736 DIM(SORT_SYM_IPC_NULL, "ipc_null", sort_sym_ipc_null), 1840 1737 DIM(SORT_TIME, "time", sort_time), 1738 + DIM(SORT_CODE_PAGE_SIZE, "code_page_size", sort_code_page_size), 1739 + DIM(SORT_LOCAL_INS_LAT, "local_ins_lat", sort_local_ins_lat), 1740 + DIM(SORT_GLOBAL_INS_LAT, "ins_lat", sort_global_ins_lat), 1841 1741 }; 1842 1742 1843 1743 #undef DIM ··· 1876 1770 DIM(SORT_MEM_DCACHELINE, "dcacheline", sort_mem_dcacheline), 1877 1771 DIM(SORT_MEM_PHYS_DADDR, "phys_daddr", sort_mem_phys_daddr), 1878 1772 DIM(SORT_MEM_DATA_PAGE_SIZE, "data_page_size", sort_mem_data_page_size), 1773 + DIM(SORT_MEM_BLOCKED, "blocked", sort_mem_blocked), 1879 1774 }; 1880 1775 1881 1776 #undef DIM
+6
tools/perf/util/sort.h
··· 50 50 u64 period_guest_sys; 51 51 u64 period_guest_us; 52 52 u64 weight; 53 + u64 ins_lat; 53 54 u32 nr_events; 54 55 }; 55 56 ··· 107 106 u64 transaction; 108 107 s32 socket; 109 108 s32 cpu; 109 + u64 code_page_size; 110 110 u8 cpumode; 111 111 u8 depth; 112 112 ··· 231 229 SORT_CGROUP_ID, 232 230 SORT_SYM_IPC_NULL, 233 231 SORT_TIME, 232 + SORT_CODE_PAGE_SIZE, 233 + SORT_LOCAL_INS_LAT, 234 + SORT_GLOBAL_INS_LAT, 234 235 235 236 /* branch stack specific sort keys */ 236 237 __SORT_BRANCH_STACK, ··· 261 256 SORT_MEM_IADDR_SYMBOL, 262 257 SORT_MEM_PHYS_DADDR, 263 258 SORT_MEM_DATA_PAGE_SIZE, 259 + SORT_MEM_BLOCKED, 264 260 }; 265 261 266 262 /*
+3 -1
tools/perf/util/stat-display.c
··· 1045 1045 if (!config->csv_output) { 1046 1046 fprintf(output, "\n"); 1047 1047 fprintf(output, " Performance counter stats for "); 1048 - if (_target->system_wide) 1048 + if (_target->bpf_str) 1049 + fprintf(output, "\'BPF program(s) %s", _target->bpf_str); 1050 + else if (_target->system_wide) 1049 1051 fprintf(output, "\'system wide"); 1050 1052 else if (_target->cpu_list) 1051 1053 fprintf(output, "\'CPU(s) %s", _target->cpu_list);
+92
tools/perf/util/stat-shadow.c
···
	else if (perf_stat_evsel__is(counter, TOPDOWN_BE_BOUND))
		update_runtime_stat(st, STAT_TOPDOWN_BE_BOUND,
				    cpu, count, &rsd);
+	else if (perf_stat_evsel__is(counter, TOPDOWN_HEAVY_OPS))
+		update_runtime_stat(st, STAT_TOPDOWN_HEAVY_OPS,
+				    cpu, count, &rsd);
+	else if (perf_stat_evsel__is(counter, TOPDOWN_BR_MISPREDICT))
+		update_runtime_stat(st, STAT_TOPDOWN_BR_MISPREDICT,
+				    cpu, count, &rsd);
+	else if (perf_stat_evsel__is(counter, TOPDOWN_FETCH_LAT))
+		update_runtime_stat(st, STAT_TOPDOWN_FETCH_LAT,
+				    cpu, count, &rsd);
+	else if (perf_stat_evsel__is(counter, TOPDOWN_MEM_BOUND))
+		update_runtime_stat(st, STAT_TOPDOWN_MEM_BOUND,
+				    cpu, count, &rsd);
	else if (evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_FRONTEND))
		update_runtime_stat(st, STAT_STALLED_CYCLES_FRONT,
				    cpu, count, &rsd);
···
			color = PERF_COLOR_RED;
		print_metric(config, ctxp, color, "%8.1f%%", "bad speculation",
			     bad_spec * 100.);
+	} else if (perf_stat_evsel__is(evsel, TOPDOWN_HEAVY_OPS) &&
+		   full_td(cpu, st, &rsd) && (config->topdown_level > 1)) {
+		double retiring = td_metric_ratio(cpu,
+						  STAT_TOPDOWN_RETIRING, st,
+						  &rsd);
+		double heavy_ops = td_metric_ratio(cpu,
+						   STAT_TOPDOWN_HEAVY_OPS, st,
+						   &rsd);
+		double light_ops = retiring - heavy_ops;
+
+		if (retiring > 0.7 && heavy_ops > 0.1)
+			color = PERF_COLOR_GREEN;
+		print_metric(config, ctxp, color, "%8.1f%%", "heavy operations",
+			     heavy_ops * 100.);
+		if (retiring > 0.7 && light_ops > 0.6)
+			color = PERF_COLOR_GREEN;
+		else
+			color = NULL;
+		print_metric(config, ctxp, color, "%8.1f%%", "light operations",
+			     light_ops * 100.);
+	} else if (perf_stat_evsel__is(evsel, TOPDOWN_BR_MISPREDICT) &&
+		   full_td(cpu, st, &rsd) && (config->topdown_level > 1)) {
+		double bad_spec = td_metric_ratio(cpu,
+						  STAT_TOPDOWN_BAD_SPEC, st,
+						  &rsd);
+		double br_mis = td_metric_ratio(cpu,
+						STAT_TOPDOWN_BR_MISPREDICT, st,
+						&rsd);
+		double m_clears = bad_spec - br_mis;
+
+		if (bad_spec > 0.1 && br_mis > 0.05)
+			color = PERF_COLOR_RED;
+		print_metric(config, ctxp, color, "%8.1f%%", "branch mispredict",
+			     br_mis * 100.);
+		if (bad_spec > 0.1 && m_clears > 0.05)
+			color = PERF_COLOR_RED;
+		else
+			color = NULL;
+		print_metric(config, ctxp, color, "%8.1f%%", "machine clears",
+			     m_clears * 100.);
+	} else if (perf_stat_evsel__is(evsel, TOPDOWN_FETCH_LAT) &&
+		   full_td(cpu, st, &rsd) && (config->topdown_level > 1)) {
+		double fe_bound = td_metric_ratio(cpu,
+						  STAT_TOPDOWN_FE_BOUND, st,
+						  &rsd);
+		double fetch_lat = td_metric_ratio(cpu,
+						   STAT_TOPDOWN_FETCH_LAT, st,
+						   &rsd);
+		double fetch_bw = fe_bound - fetch_lat;
+
+		if (fe_bound > 0.2 && fetch_lat > 0.15)
+			color = PERF_COLOR_RED;
+		print_metric(config, ctxp, color, "%8.1f%%", "fetch latency",
+			     fetch_lat * 100.);
+		if (fe_bound > 0.2 && fetch_bw > 0.1)
+			color = PERF_COLOR_RED;
+		else
+			color = NULL;
+		print_metric(config, ctxp, color, "%8.1f%%", "fetch bandwidth",
+			     fetch_bw * 100.);
+	} else if (perf_stat_evsel__is(evsel, TOPDOWN_MEM_BOUND) &&
+		   full_td(cpu, st, &rsd) && (config->topdown_level > 1)) {
+		double be_bound = td_metric_ratio(cpu,
+						  STAT_TOPDOWN_BE_BOUND, st,
+						  &rsd);
+		double mem_bound = td_metric_ratio(cpu,
+						   STAT_TOPDOWN_MEM_BOUND, st,
+						   &rsd);
+		double core_bound = be_bound - mem_bound;
+
+		if (be_bound > 0.2 && mem_bound > 0.2)
+			color = PERF_COLOR_RED;
+		print_metric(config, ctxp, color, "%8.1f%%", "memory bound",
+			     mem_bound * 100.);
+		if (be_bound > 0.2 && core_bound > 0.1)
+			color = PERF_COLOR_RED;
+		else
+			color = NULL;
+		print_metric(config, ctxp, color, "%8.1f%%", "Core bound",
+			     core_bound * 100.);
	} else if (evsel->metric_expr) {
		generic_metric(config, evsel->metric_expr, evsel->metric_events, NULL,
			       evsel->name, evsel->metric_name, NULL, 1, cpu, out, st);
+5 -1
tools/perf/util/stat.c
···
	ID(TOPDOWN_BAD_SPEC, topdown-bad-spec),
	ID(TOPDOWN_FE_BOUND, topdown-fe-bound),
	ID(TOPDOWN_BE_BOUND, topdown-be-bound),
+	ID(TOPDOWN_HEAVY_OPS, topdown-heavy-ops),
+	ID(TOPDOWN_BR_MISPREDICT, topdown-br-mispredict),
+	ID(TOPDOWN_FETCH_LAT, topdown-fetch-lat),
+	ID(TOPDOWN_MEM_BOUND, topdown-mem-bound),
	ID(SMI_NUM, msr/smi/),
	ID(APERF, msr/aperf/),
};
···
	if (leader->core.nr_members > 1)
		attr->read_format |= PERF_FORMAT_ID|PERF_FORMAT_GROUP;

-	attr->inherit = !config->no_inherit;
+	attr->inherit = !config->no_inherit && list_empty(&evsel->bpf_counter_list);

	/*
	 * Some events get initialized with sample_(period/type) set,
+9
tools/perf/util/stat.h
···
	PERF_STAT_EVSEL_ID__TOPDOWN_BAD_SPEC,
	PERF_STAT_EVSEL_ID__TOPDOWN_FE_BOUND,
	PERF_STAT_EVSEL_ID__TOPDOWN_BE_BOUND,
+	PERF_STAT_EVSEL_ID__TOPDOWN_HEAVY_OPS,
+	PERF_STAT_EVSEL_ID__TOPDOWN_BR_MISPREDICT,
+	PERF_STAT_EVSEL_ID__TOPDOWN_FETCH_LAT,
+	PERF_STAT_EVSEL_ID__TOPDOWN_MEM_BOUND,
	PERF_STAT_EVSEL_ID__SMI_NUM,
	PERF_STAT_EVSEL_ID__APERF,
	PERF_STAT_EVSEL_ID__MAX,
···
	STAT_TOPDOWN_BAD_SPEC,
	STAT_TOPDOWN_FE_BOUND,
	STAT_TOPDOWN_BE_BOUND,
+	STAT_TOPDOWN_HEAVY_OPS,
+	STAT_TOPDOWN_BR_MISPREDICT,
+	STAT_TOPDOWN_FETCH_LAT,
+	STAT_TOPDOWN_MEM_BOUND,
	STAT_SMI_NUM,
	STAT_APERF,
	STAT_MAX
···
	int ctl_fd_ack;
	bool ctl_fd_close;
	const char *cgroup_list;
+	unsigned int topdown_level;
};

void perf_stat__set_big_num(int set);
+9
tools/perf/util/string.c
···

	return ret;
}
+
+unsigned int hex(char c)
+{
+	if (c >= '0' && c <= '9')
+		return c - '0';
+	if (c >= 'a' && c <= 'f')
+		return c - 'a' + 10;
+	return c - 'A' + 10;
+}
+2
tools/perf/util/string2.h
···
char *strpbrk_esc(char *str, const char *stopset);
char *strdup_esc(const char *str);

+unsigned int hex(char c);
+
#endif /* PERF_STRING_H */
+22 -3
tools/perf/util/symbol-elf.c
···
#include "maps.h"
#include "symbol.h"
#include "symsrc.h"
+#include "demangle-ocaml.h"
#include "demangle-java.h"
#include "demangle-rust.h"
#include "machine.h"
···
		return demangled;

	demangled = bfd_demangle(NULL, elf_name, demangle_flags);
-	if (demangled == NULL)
-		demangled = java_demangle_sym(elf_name, JAVA_DEMANGLE_NORET);
+	if (demangled == NULL) {
+		demangled = ocaml_demangle_sym(elf_name);
+		if (demangled == NULL) {
+			demangled = java_demangle_sym(elf_name, JAVA_DEMANGLE_NORET);
+		}
+	}
	else if (rust_is_mangled(demangled))
		/*
		 * Input to Rust demangling is the BFD-demangled
···
		if (sym.st_shndx == SHN_ABS)
			continue;

-		sec = elf_getscn(runtime_ss->elf, sym.st_shndx);
+		sec = elf_getscn(syms_ss->elf, sym.st_shndx);
		if (!sec)
			goto out_elf_end;

		gelf_getshdr(sec, &shdr);
+
+		/*
+		 * We have to fallback to runtime when syms' section header has
+		 * NOBITS set. NOBITS results in file offset (sh_offset) not
+		 * being incremented. So sh_offset used below has different
+		 * values for syms (invalid) and runtime (valid).
+		 */
+		if (shdr.sh_type == SHT_NOBITS) {
+			sec = elf_getscn(runtime_ss->elf, sym.st_shndx);
+			if (!sec)
+				goto out_elf_end;
+
+			gelf_getshdr(sec, &shdr);
+		}

		if (is_label && !elf_sec__filter(&shdr, secstrs))
			continue;
+54 -19
tools/perf/util/symbol.c
···
int dso__load_bfd_symbols(struct dso *dso, const char *debugfile)
{
	int err = -1;
-	long symbols_size, symbols_count;
+	long symbols_size, symbols_count, i;
	asection *section;
	asymbol **symbols, *sym;
	struct symbol *symbol;
	bfd *abfd;
-	u_int i;
	u64 start, len;

-	abfd = bfd_openr(dso->long_name, NULL);
+	abfd = bfd_openr(debugfile, NULL);
	if (!abfd)
		return -1;
···
	section = bfd_get_section_by_name(abfd, ".text");
	if (section)
		dso->text_offset = section->vma - section->filepos;
-
-	bfd_close(abfd);
-
-	abfd = bfd_openr(debugfile, NULL);
-	if (!abfd)
-		return -1;
-
-	if (!bfd_check_format(abfd, bfd_object)) {
-		pr_debug2("%s: cannot read %s bfd file.\n", __func__,
-			  debugfile);
-		goto out_close;
-	}
-
-	if (bfd_get_flavour(abfd) == bfd_target_elf_flavour)
-		goto out_close;

	symbols_size = bfd_get_symtab_upper_bound(abfd);
	if (symbols_size == 0) {
···
	if (nsexit)
		nsinfo__mountns_enter(dso->nsinfo, &nsc);

-	if (bfdrc == 0)
+	if (bfdrc == 0) {
+		ret = 0;
		break;
+	}

	if (!is_reg || sirc < 0)
		continue;
···
	return 0;
}

+static int setup_addrlist(struct intlist **addr_list, struct strlist *sym_list)
+{
+	struct str_node *pos, *tmp;
+	unsigned long val;
+	char *sep;
+	const char *end;
+	int i = 0, err;
+
+	*addr_list = intlist__new(NULL);
+	if (!*addr_list)
+		return -1;
+
+	strlist__for_each_entry_safe(pos, tmp, sym_list) {
+		errno = 0;
+		val = strtoul(pos->s, &sep, 16);
+		if (errno || (sep == pos->s))
+			continue;
+
+		if (*sep != '\0') {
+			end = pos->s + strlen(pos->s) - 1;
+			while (end >= sep && isspace(*end))
+				end--;
+
+			if (end >= sep)
+				continue;
+		}
+
+		err = intlist__add(*addr_list, val);
+		if (err)
+			break;
+
+		strlist__remove(sym_list, pos);
+		i++;
+	}
+
+	if (i == 0) {
+		intlist__delete(*addr_list);
+		*addr_list = NULL;
+	}
+
+	return 0;
+}
+
static bool symbol__read_kptr_restrict(void)
{
	bool value = false;
···
			symbol_conf.sym_list_str, "symbol") < 0)
		goto out_free_tid_list;

+	if (symbol_conf.sym_list &&
+	    setup_addrlist(&symbol_conf.addr_list, symbol_conf.sym_list) < 0)
+		goto out_free_sym_list;
+
	if (setup_list(&symbol_conf.bt_stop_list,
		       symbol_conf.bt_stop_list_str, "symbol") < 0)
		goto out_free_sym_list;
···

out_free_sym_list:
	strlist__delete(symbol_conf.sym_list);
+	intlist__delete(symbol_conf.addr_list);
out_free_tid_list:
	intlist__delete(symbol_conf.tid_list);
out_free_pid_list:
···
	strlist__delete(symbol_conf.comm_list);
	intlist__delete(symbol_conf.tid_list);
	intlist__delete(symbol_conf.pid_list);
+	intlist__delete(symbol_conf.addr_list);
	vmlinux_path__exit();
	symbol_conf.sym_list = symbol_conf.dso_list = symbol_conf.comm_list = NULL;
	symbol_conf.bt_stop_list = NULL;
+5 -2
tools/perf/util/symbol_conf.h
···
			report_block,
			report_individual_block,
			inline_name,
-			disable_add2line_warn;
+			disable_add2line_warn,
+			buildid_mmap2;
	const char	*vmlinux_name,
			*kallsyms_name,
			*source_prefix,
···
			*sym_to_list,
			*bt_stop_list;
	struct intlist	*pid_list,
-			*tid_list;
+			*tid_list,
+			*addr_list;
	const char	*symfs;
	int		res_sample;
	int		pad_output_len_dso;
	int		group_sort_idx;
+	int		addr_range;
};

extern struct symbol_conf symbol_conf;
+161 -64
tools/perf/util/synthetic-events.c
···
#include <linux/perf_event.h>
#include <asm/bug.h>
#include <perf/evsel.h>
-#include <internal/cpumap.h>
#include <perf/cpumap.h>
#include <internal/lib.h> // page_size
#include <internal/threadmap.h>
···
 * Assumes that the first 4095 bytes of /proc/pid/stat contains
 * the comm, tgid and ppid.
 */
-static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
-				    pid_t *tgid, pid_t *ppid)
+static int perf_event__get_comm_ids(pid_t pid, pid_t tid, char *comm, size_t len,
+				    pid_t *tgid, pid_t *ppid, bool *kernel)
{
	char bf[4096];
	int fd;
	size_t size = 0;
	ssize_t n;
-	char *name, *tgids, *ppids;
+	char *name, *tgids, *ppids, *vmpeak, *threads;

	*tgid = -1;
	*ppid = -1;

-	snprintf(bf, sizeof(bf), "/proc/%d/status", pid);
+	if (pid)
+		snprintf(bf, sizeof(bf), "/proc/%d/task/%d/status", pid, tid);
+	else
+		snprintf(bf, sizeof(bf), "/proc/%d/status", tid);

	fd = open(bf, O_RDONLY);
	if (fd < 0) {
···
	close(fd);
	if (n <= 0) {
		pr_warning("Couldn't get COMM, tigd and ppid for pid %d\n",
-			   pid);
+			   tid);
		return -1;
	}
	bf[n] = '\0';

	name = strstr(bf, "Name:");
-	tgids = strstr(bf, "Tgid:");
-	ppids = strstr(bf, "PPid:");
+	tgids = strstr(name ?: bf, "Tgid:");
+	ppids = strstr(tgids ?: bf, "PPid:");
+	vmpeak = strstr(ppids ?: bf, "VmPeak:");
+
+	if (vmpeak)
+		threads = NULL;
+	else
+		threads = strstr(ppids ?: bf, "Threads:");

	if (name) {
		char *nl;
···
		memcpy(comm, name, size);
		comm[size] = '\0';
	} else {
-		pr_debug("Name: string not found for pid %d\n", pid);
+		pr_debug("Name: string not found for pid %d\n", tid);
	}

	if (tgids) {
		tgids += 5; /* strlen("Tgid:") */
		*tgid = atoi(tgids);
	} else {
-		pr_debug("Tgid: string not found for pid %d\n", pid);
+		pr_debug("Tgid: string not found for pid %d\n", tid);
	}

	if (ppids) {
		ppids += 5; /* strlen("PPid:") */
		*ppid = atoi(ppids);
	} else {
-		pr_debug("PPid: string not found for pid %d\n", pid);
+		pr_debug("PPid: string not found for pid %d\n", tid);
	}
+
+	if (!vmpeak && threads)
+		*kernel = true;
+	else
+		*kernel = false;

	return 0;
}

-static int perf_event__prepare_comm(union perf_event *event, pid_t pid,
+static int perf_event__prepare_comm(union perf_event *event, pid_t pid, pid_t tid,
				    struct machine *machine,
-				    pid_t *tgid, pid_t *ppid)
+				    pid_t *tgid, pid_t *ppid, bool *kernel)
{
	size_t size;
···
	memset(&event->comm, 0, sizeof(event->comm));

	if (machine__is_host(machine)) {
-		if (perf_event__get_comm_ids(pid, event->comm.comm,
+		if (perf_event__get_comm_ids(pid, tid, event->comm.comm,
					     sizeof(event->comm.comm),
-					     tgid, ppid) != 0) {
+					     tgid, ppid, kernel) != 0) {
			return -1;
		}
	} else {
···
	event->comm.header.size = (sizeof(event->comm) -
				(sizeof(event->comm.comm) - size) +
				machine->id_hdr_size);
-	event->comm.tid = pid;
+	event->comm.tid = tid;

	return 0;
}
···
				   struct machine *machine)
{
	pid_t tgid, ppid;
+	bool kernel_thread;

-	if (perf_event__prepare_comm(event, pid, machine, &tgid, &ppid) != 0)
+	if (perf_event__prepare_comm(event, 0, pid, machine, &tgid, &ppid,
+				     &kernel_thread) != 0)
		return -1;

	if (perf_tool__process_synth_event(tool, event, machine, process) != 0)
···
	}
}

+static void perf_record_mmap2__read_build_id(struct perf_record_mmap2 *event,
+					     bool is_kernel)
+{
+	struct build_id bid;
+	int rc;
+
+	if (is_kernel)
+		rc = sysfs__read_build_id("/sys/kernel/notes", &bid);
+	else
+		rc = filename__read_build_id(event->filename, &bid) > 0 ? 0 : -1;
+
+	if (rc == 0) {
+		memcpy(event->build_id, bid.data, sizeof(bid.data));
+		event->build_id_size = (u8) bid.size;
+		event->header.misc |= PERF_RECORD_MISC_MMAP_BUILD_ID;
+		event->__reserved_1 = 0;
+		event->__reserved_2 = 0;
+	} else {
+		if (event->filename[0] == '/') {
+			pr_debug2("Failed to read build ID for %s\n",
+				  event->filename);
+		}
+	}
+}
+
int perf_event__synthesize_mmap_events(struct perf_tool *tool,
				       union perf_event *event,
				       pid_t pid, pid_t tgid,
···
		event->mmap2.header.size += machine->id_hdr_size;
		event->mmap2.pid = tgid;
		event->mmap2.tid = pid;
+
+		if (symbol_conf.buildid_mmap2)
+			perf_record_mmap2__read_build_id(&event->mmap2, false);

		if (perf_tool__process_synth_event(tool, event, machine, process) != 0) {
			rc = -1;
···
	int rc = 0;
	struct map *pos;
	struct maps *maps = machine__kernel_maps(machine);
-	union perf_event *event = zalloc((sizeof(event->mmap) +
-					  machine->id_hdr_size));
+	union perf_event *event;
+	size_t size = symbol_conf.buildid_mmap2 ?
+			sizeof(event->mmap2) : sizeof(event->mmap);
+
+	event = zalloc(size + machine->id_hdr_size);
	if (event == NULL) {
		pr_debug("Not enough memory synthesizing mmap event "
			 "for kernel modules\n");
		return -1;
	}
-
-	event->header.type = PERF_RECORD_MMAP;

	/*
	 * kernel uses 0 for user space maps, see kernel/perf_event.c
···
		event->header.misc = PERF_RECORD_MISC_GUEST_KERNEL;

	maps__for_each_entry(maps, pos) {
-		size_t size;
-
		if (!__map__is_kmodule(pos))
			continue;

-		size = PERF_ALIGN(pos->dso->long_name_len + 1, sizeof(u64));
-		event->mmap.header.type = PERF_RECORD_MMAP;
-		event->mmap.header.size = (sizeof(event->mmap) -
-					(sizeof(event->mmap.filename) - size));
-		memset(event->mmap.filename + size, 0, machine->id_hdr_size);
-		event->mmap.header.size += machine->id_hdr_size;
-		event->mmap.start = pos->start;
-		event->mmap.len = pos->end - pos->start;
-		event->mmap.pid = machine->pid;
+		if (symbol_conf.buildid_mmap2) {
+			size = PERF_ALIGN(pos->dso->long_name_len + 1, sizeof(u64));
+			event->mmap2.header.type = PERF_RECORD_MMAP2;
+			event->mmap2.header.size = (sizeof(event->mmap2) -
+						(sizeof(event->mmap2.filename) - size));
+			memset(event->mmap2.filename + size, 0, machine->id_hdr_size);
+			event->mmap2.header.size += machine->id_hdr_size;
+			event->mmap2.start = pos->start;
+			event->mmap2.len = pos->end - pos->start;
+			event->mmap2.pid = machine->pid;

-		memcpy(event->mmap.filename, pos->dso->long_name,
-		       pos->dso->long_name_len + 1);
+			memcpy(event->mmap2.filename, pos->dso->long_name,
+			       pos->dso->long_name_len + 1);
+
+			perf_record_mmap2__read_build_id(&event->mmap2, false);
+		} else {
+			size = PERF_ALIGN(pos->dso->long_name_len + 1, sizeof(u64));
+			event->mmap.header.type = PERF_RECORD_MMAP;
+			event->mmap.header.size = (sizeof(event->mmap) -
+						(sizeof(event->mmap.filename) - size));
+			memset(event->mmap.filename + size, 0, machine->id_hdr_size);
+			event->mmap.header.size += machine->id_hdr_size;
+			event->mmap.start = pos->start;
+			event->mmap.len = pos->end - pos->start;
+			event->mmap.pid = machine->pid;
+
+			memcpy(event->mmap.filename, pos->dso->long_name,
+			       pos->dso->long_name_len + 1);
+		}
+
		if (perf_tool__process_synth_event(tool, event, machine, process) != 0) {
			rc = -1;
			break;
···
	return rc;
}

+static int filter_task(const struct dirent *dirent)
+{
+	return isdigit(dirent->d_name[0]);
+}
+
static int __event__synthesize_thread(union perf_event *comm_event,
				      union perf_event *mmap_event,
				      union perf_event *fork_event,
···
				      struct perf_tool *tool, struct machine *machine, bool mmap_data)
{
	char filename[PATH_MAX];
-	DIR *tasks;
-	struct dirent *dirent;
+	struct dirent **dirent;
	pid_t tgid, ppid;
	int rc = 0;
+	int i, n;

	/* special case: only send one comm event using passed in pid */
	if (!full) {
···
	snprintf(filename, sizeof(filename), "%s/proc/%d/task",
		 machine->root_dir, pid);

-	tasks = opendir(filename);
-	if (tasks == NULL) {
-		pr_debug("couldn't open %s\n", filename);
-		return 0;
-	}
+	n = scandir(filename, &dirent, filter_task, alphasort);
+	if (n < 0)
+		return n;

-	while ((dirent = readdir(tasks)) != NULL) {
+	for (i = 0; i < n; i++) {
		char *end;
		pid_t _pid;
+		bool kernel_thread;

-		_pid = strtol(dirent->d_name, &end, 10);
+		_pid = strtol(dirent[i]->d_name, &end, 10);
		if (*end)
			continue;

		rc = -1;
-		if (perf_event__prepare_comm(comm_event, _pid, machine,
-					     &tgid, &ppid) != 0)
+		if (perf_event__prepare_comm(comm_event, pid, _pid, machine,
+					     &tgid, &ppid, &kernel_thread) != 0)
			break;

		if (perf_event__synthesize_fork(tool, fork_event, _pid, tgid,
···
			break;

		rc = 0;
-		if (_pid == pid) {
+		if (_pid == pid && !kernel_thread) {
			/* process the parent's maps too */
			rc = perf_event__synthesize_mmap_events(tool, mmap_event, pid, tgid,
								process, machine, mmap_data);
···
		}
	}

-	closedir(tasks);
+	for (i = 0; i < n; i++)
+		zfree(&dirent[i]);
+	free(dirent);
+
	return rc;
}
···
		return 0;

	snprintf(proc_path, sizeof(proc_path), "%s/proc", machine->root_dir);
-	n = scandir(proc_path, &dirent, 0, alphasort);
+	n = scandir(proc_path, &dirent, filter_task, alphasort);
	if (n < 0)
		return err;
···
				       perf_event__handler_t process,
				       struct machine *machine)
{
-	size_t size;
+	union perf_event *event;
+	size_t size = symbol_conf.buildid_mmap2 ?
+			sizeof(event->mmap2) : sizeof(event->mmap);
	struct map *map = machine__kernel_map(machine);
	struct kmap *kmap;
	int err;
-	union perf_event *event;

	if (map == NULL)
		return -1;
···
	 * available use this, and after it is use this as a fallback for older
	 * kernels.
	 */
-	event = zalloc((sizeof(event->mmap) + machine->id_hdr_size));
+	event = zalloc(size + machine->id_hdr_size);
	if (event == NULL) {
		pr_debug("Not enough memory synthesizing mmap event "
			 "for kernel modules\n");
···
		event->header.misc = PERF_RECORD_MISC_GUEST_KERNEL;
	}

-	size = snprintf(event->mmap.filename, sizeof(event->mmap.filename),
-			"%s%s", machine->mmap_name, kmap->ref_reloc_sym->name) + 1;
-	size = PERF_ALIGN(size, sizeof(u64));
-	event->mmap.header.type = PERF_RECORD_MMAP;
-	event->mmap.header.size = (sizeof(event->mmap) -
-			(sizeof(event->mmap.filename) - size) + machine->id_hdr_size);
-	event->mmap.pgoff = kmap->ref_reloc_sym->addr;
-	event->mmap.start = map->start;
-	event->mmap.len = map->end - event->mmap.start;
-	event->mmap.pid = machine->pid;
+	if (symbol_conf.buildid_mmap2) {
+		size = snprintf(event->mmap2.filename, sizeof(event->mmap2.filename),
+				"%s%s", machine->mmap_name, kmap->ref_reloc_sym->name) + 1;
+		size = PERF_ALIGN(size, sizeof(u64));
+		event->mmap2.header.type = PERF_RECORD_MMAP2;
+		event->mmap2.header.size = (sizeof(event->mmap2) -
+				(sizeof(event->mmap2.filename) - size) + machine->id_hdr_size);
+		event->mmap2.pgoff = kmap->ref_reloc_sym->addr;
+		event->mmap2.start = map->start;
+		event->mmap2.len = map->end - event->mmap.start;
+		event->mmap2.pid = machine->pid;
+
+		perf_record_mmap2__read_build_id(&event->mmap2, true);
+	} else {
+		size = snprintf(event->mmap.filename, sizeof(event->mmap.filename),
+				"%s%s", machine->mmap_name, kmap->ref_reloc_sym->name) + 1;
+		size = PERF_ALIGN(size, sizeof(u64));
+		event->mmap.header.type = PERF_RECORD_MMAP;
+		event->mmap.header.size = (sizeof(event->mmap) -
+				(sizeof(event->mmap.filename) - size) + machine->id_hdr_size);
+		event->mmap.pgoff = kmap->ref_reloc_sym->addr;
+		event->mmap.start = map->start;
+		event->mmap.len = map->end - event->mmap.start;
+		event->mmap.pid = machine->pid;
+	}

	err = perf_tool__process_synth_event(tool, event, machine, process);
	free(event);
···
		}
	}

-	if (type & PERF_SAMPLE_WEIGHT)
+	if (type & PERF_SAMPLE_WEIGHT_TYPE)
		result += sizeof(u64);

	if (type & PERF_SAMPLE_DATA_SRC)
···
	if (type & PERF_SAMPLE_DATA_PAGE_SIZE)
		result += sizeof(u64);

+	if (type & PERF_SAMPLE_CODE_PAGE_SIZE)
+		result += sizeof(u64);
+
	if (type & PERF_SAMPLE_AUX) {
		result += sizeof(u64);
		result += sample->aux_sample.size;
	}

	return result;
+}
+
+void __weak arch_perf_synthesize_sample_weight(const struct perf_sample *data,
+					       __u64 *array, u64 type __maybe_unused)
+{
+	*array = data->weight;
}
···
		}
	}

-	if (type & PERF_SAMPLE_WEIGHT) {
-		*array = sample->weight;
+	if (type & PERF_SAMPLE_WEIGHT_TYPE) {
+		arch_perf_synthesize_sample_weight(sample, array, type);
		array++;
	}
···

	if (type & PERF_SAMPLE_DATA_PAGE_SIZE) {
		*array = sample->data_page_size;
+		array++;
+	}
+
+	if (type & PERF_SAMPLE_CODE_PAGE_SIZE) {
+		*array = sample->code_page_size;
		array++;
	}
+33 -1
tools/perf/util/target.c
···
		ret = TARGET_ERRNO__UID_OVERRIDE_SYSTEM;
	}

+	/* BPF and CPU are mutually exclusive */
+	if (target->bpf_str && target->cpu_list) {
+		target->cpu_list = NULL;
+		if (ret == TARGET_ERRNO__SUCCESS)
+			ret = TARGET_ERRNO__BPF_OVERRIDE_CPU;
+	}
+
+	/* BPF and PID/TID are mutually exclusive */
+	if (target->bpf_str && target->tid) {
+		target->tid = NULL;
+		if (ret == TARGET_ERRNO__SUCCESS)
+			ret = TARGET_ERRNO__BPF_OVERRIDE_PID;
+	}
+
+	/* BPF and UID are mutually exclusive */
+	if (target->bpf_str && target->uid_str) {
+		target->uid_str = NULL;
+		if (ret == TARGET_ERRNO__SUCCESS)
+			ret = TARGET_ERRNO__BPF_OVERRIDE_UID;
+	}
+
+	/* BPF and THREADS are mutually exclusive */
+	if (target->bpf_str && target->per_thread) {
+		target->per_thread = false;
+		if (ret == TARGET_ERRNO__SUCCESS)
+			ret = TARGET_ERRNO__BPF_OVERRIDE_THREAD;
+	}
+
	/* THREAD and SYSTEM/CPU are mutually exclusive */
	if (target->per_thread && (target->system_wide || target->cpu_list)) {
		target->per_thread = false;
···
	"PID/TID switch overriding SYSTEM",
	"UID switch overriding SYSTEM",
	"SYSTEM/CPU switch overriding PER-THREAD",
+	"BPF switch overriding CPU",
+	"BPF switch overriding PID/TID",
+	"BPF switch overriding UID",
+	"BPF switch overriding THREAD",
	"Invalid User: %s",
	"Problems obtaining information for user %s",
};
···

	switch (errnum) {
	case TARGET_ERRNO__PID_OVERRIDE_CPU ...
-	     TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD:
+	     TARGET_ERRNO__BPF_OVERRIDE_THREAD:
		snprintf(buf, buflen, "%s", msg);
		break;
+10
tools/perf/util/target.h
···
	const char *tid;
	const char *cpu_list;
	const char *uid_str;
+	const char *bpf_str;
	uid_t uid;
	bool system_wide;
	bool uses_mmap;
···
	TARGET_ERRNO__PID_OVERRIDE_SYSTEM,
	TARGET_ERRNO__UID_OVERRIDE_SYSTEM,
	TARGET_ERRNO__SYSTEM_OVERRIDE_THREAD,
+	TARGET_ERRNO__BPF_OVERRIDE_CPU,
+	TARGET_ERRNO__BPF_OVERRIDE_PID,
+	TARGET_ERRNO__BPF_OVERRIDE_UID,
+	TARGET_ERRNO__BPF_OVERRIDE_THREAD,

	/* for target__parse_uid() */
	TARGET_ERRNO__INVALID_UID,
···
static inline bool target__has_cpu(struct target *target)
{
	return target->system_wide || target->cpu_list;
}

+static inline bool target__has_bpf(struct target *target)
+{
+	return target->bpf_str;
+}
+
static inline bool target__none(struct target *target)
+5 -5
tools/perf/util/trace-event-info.c
···
	return false;
}

-#define for_each_event(dir, dent, tps)			\
+#define for_each_event_tps(dir, dent, tps)		\
	while ((dent = readdir(dir)))			\
		if (dent->d_type == DT_DIR &&		\
		    (strcmp(dent->d_name, ".")) &&	\
···
		return -errno;
	}

-	for_each_event(dir, dent, tps) {
+	for_each_event_tps(dir, dent, tps) {
		if (!name_in_tp_list(dent->d_name, tps))
			continue;
···
	}

	rewinddir(dir);
-	for_each_event(dir, dent, tps) {
+	for_each_event_tps(dir, dent, tps) {
		if (!name_in_tp_list(dent->d_name, tps))
			continue;
···
		goto out;
	}

-	for_each_event(dir, dent, tps) {
+	for_each_event_tps(dir, dent, tps) {
		if (strcmp(dent->d_name, "ftrace") == 0 ||
		    !system_in_tp_list(dent->d_name, tps))
			continue;
···
	}

	rewinddir(dir);
-	for_each_event(dir, dent, tps) {
+	for_each_event_tps(dir, dent, tps) {
		if (strcmp(dent->d_name, "ftrace") == 0 ||
		    !system_in_tp_list(dent->d_name, tps))
			continue;
+8 -3
tools/perf/util/unwind-libdw.c
···
	mod = dwfl_addrmodule(ui->dwfl, ip);
	if (mod) {
		Dwarf_Addr s;
-		void **userdatap;

-		dwfl_module_info(mod, &userdatap, &s, NULL, NULL, NULL, NULL, NULL);
-		*userdatap = dso;
+		dwfl_module_info(mod, NULL, &s, NULL, NULL, NULL, NULL, NULL);
		if (s != al->map->start - al->map->pgoff)
			mod = 0;
	}
···
		if (dso__build_id_filename(dso, filename, sizeof(filename), false))
			mod = dwfl_report_elf(ui->dwfl, dso->short_name, filename, -1,
					      al->map->start - al->map->pgoff, false);
	}

+	if (mod) {
+		void **userdatap;
+
+		dwfl_module_info(mod, &userdatap, NULL, NULL, NULL, NULL, NULL, NULL);
+		*userdatap = dso;
+	}
+
	return mod && dwfl_addrmodule(ui->dwfl, ip) == mod ? 0 : -1;
-33
tools/perf/util/xyarray.c
-// SPDX-License-Identifier: GPL-2.0
-#include "xyarray.h"
-#include <stdlib.h>
-#include <string.h>
-#include <linux/zalloc.h>
-
-struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size)
-{
-	size_t row_size = ylen * entry_size;
-	struct xyarray *xy = zalloc(sizeof(*xy) + xlen * row_size);
-
-	if (xy != NULL) {
-		xy->entry_size = entry_size;
-		xy->row_size = row_size;
-		xy->entries = xlen * ylen;
-		xy->max_x = xlen;
-		xy->max_y = ylen;
-	}
-
-	return xy;
-}
-
-void xyarray__reset(struct xyarray *xy)
-{
-	size_t n = xy->entries * xy->entry_size;
-
-	memset(xy->contents, 0, n);
-}
-
-void xyarray__delete(struct xyarray *xy)
-{
-	free(xy);
-}
+1
tools/scripts/Makefile.include
···
	$(MAKE) $(PRINT_DIR) -C $$subdir
QUIET_FLEX     = @echo '  FLEX    '$@;
QUIET_BISON    = @echo '  BISON   '$@;
+QUIET_GENSKEL  = @echo '  GEN-SKEL '$@;

descend = \
	+@echo '  DESCEND '$(1); \