Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'perf-core-for-mingo-4.21-20181217' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

- Introduce 'perf record --aio' to use asynchronous IO trace writing, disabled
by default (Alexey Budankov)

- Add fallback routines to be used in places where we don't have the CPU mode
  (kernel/userspace/hypervisor) and thus must fall back to looking at all map
  trees when trying to resolve symbols (Adrian Hunter)

- Fix error with config term "pt=0", where we should just force "pt=1" and
warn the user about the former being nonsensical (Adrian Hunter)

- Fix 'perf test' entry where we expect 'sleep' to come in a PERF_RECORD_COMM
but instead we get 'coreutils' when sleep is provided by some versions of
the 'coreutils' package (Adrian Hunter)

- Introduce 'perf top --kallsyms file' to match 'perf report --kallsyms', useful
when dealing with BPF, where symbol resolution happens via kallsyms, not via
the default vmlinux ELF symtabs (Arnaldo Carvalho de Melo)

- Support 'srccode' output field in 'perf script' (Andi Kleen)

- Introduce basic 'perf annotate' support for the ARC architecture (Eugeniy Paltsev)

- Compute and display average IPC and IPC coverage per symbol in 'perf annotate' and
'perf report' (Jin Yao)

- Make 'perf top' use ordered_events and process histograms in a separate thread (Jiri Olsa)

- Make 'perf trace' use ordered_events (Jiri Olsa)

- Add support for ETMv3 and PTMv1.1 decoding in cs-etm (Mathieu Poirier)

- Support for ARM A32/T32 instruction sets in CoreSight trace (cs-etm) (Robert Walker)

- Fix 'perf stat' shadow stats for clock events (Ravi Bangoria)

- Remove needless rb_tree extra indirection from map__find() (Eric Saint-Etienne)

- Fix CSV mode column output for non-cgroup events in 'perf stat' (Stephane Eranian)

- Add sanity check to libtraceevent's is_timestamp_in_us() (Tzvetomir Stoyanov)

- Use ERR_CAST instead of ERR_PTR(PTR_ERR()) (Wen Yang)

- Fix Load_Miss_Real_Latency on SKL/SKX Intel vendor event files (Andi Kleen)

- strncpy() fixes triggered by new warnings on gcc 8.2.0 (Arnaldo Carvalho de Melo)

- Handle the older 'nr' field of tracefs syscall tracepoints in 'perf trace';
  newer kernels renamed it to '__syscall_nr', and handling the old name keeps
  'perf trace' working on older kernels (Arnaldo Carvalho de Melo)

- Give better hint about devel package for libssl (Arnaldo Carvalho de Melo)

- Fix the 'perf trace' build in architectures lacking explicit mmap.h file (Arnaldo Carvalho de Melo)

- Disable breakpoint tests for 32-bit ARM (Florian Fainelli)

- Fix typos all over the place, mostly in comments, but also in some debug
messages and JSON files (Ingo Molnar)

- Allow specifying proc-map-timeout in config file (Mark Drayton)

- Fix mmap_flags table generation script (Sihyeon Jang)

- Fix 'size' parameter to snprintf in the 'perf config' code (Sihyeon Jang)

- More libtraceevent renames to make it a proper library (Tzvetomir Stoyanov)

- Implement new API tep_get_ref() in libtraceevent (Tzvetomir Stoyanov)

- Add support for pkg-config in libtraceevent (Tzvetomir Stoyanov)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

+2051 -572
+4 -2
tools/build/Makefile.feature
···
 		sched_getcpu			\
 		sdt				\
 		setns				\
-		libopencsd
+		libopencsd			\
+		libaio

 # FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list
 # of all feature tests
···
 		zlib			\
 		lzma			\
 		get_cpuid		\
-		bpf
+		bpf			\
+		libaio

 # Set FEATURE_CHECK_(C|LD)FLAGS-all for all FEATURE_TESTS features.
 # If in the future we need per-feature checks/flags for features not
+5 -1
tools/build/feature/Makefile
···
 	test-libopencsd.bin		\
 	test-clang.bin			\
 	test-llvm.bin			\
-	test-llvm-version.bin
+	test-llvm-version.bin		\
+	test-libaio.bin

 FILES := $(addprefix $(OUTPUT),$(FILES))
···
 	> $(@:.bin=.make.output) 2>&1

 -include $(OUTPUT)*.d
+
+$(OUTPUT)test-libaio.bin:
+	$(BUILD) -lrt

 ###############################
+5
tools/build/feature/test-all.c
···
 # include "test-libopencsd.c"
 #undef main

+#define main main_test_libaio
+# include "test-libaio.c"
+#undef main
+
 int main(int argc, char *argv[])
 {
 	main_test_libpython();
···
 	main_test_sdt();
 	main_test_setns();
 	main_test_libopencsd();
+	main_test_libaio();

 	return 0;
 }
+16
tools/build/feature/test-libaio.c
···
+// SPDX-License-Identifier: GPL-2.0
+#include <aio.h>
+
+int main(void)
+{
+	struct aiocb aiocb;
+
+	aiocb.aio_fildes = 0;
+	aiocb.aio_offset = 0;
+	aiocb.aio_buf = 0;
+	aiocb.aio_nbytes = 0;
+	aiocb.aio_reqprio = 0;
+	aiocb.aio_sigevent.sigev_notify = 1 /*SIGEV_NONE*/;
+
+	return (int)aio_return(&aiocb);
+}
+8
tools/build/feature/test-libopencsd.c
···
 // SPDX-License-Identifier: GPL-2.0
 #include <opencsd/c_api/opencsd_c_api.h>

+/*
+ * Check OpenCSD library version is sufficient to provide required features
+ */
+#define OCSD_MIN_VER ((0 << 16) | (10 << 8) | (0))
+#if !defined(OCSD_VER_NUM) || (OCSD_VER_NUM < OCSD_MIN_VER)
+#error "OpenCSD >= 0.10.0 is required"
+#endif
+
 int main(void)
 {
 	(void)ocsd_get_version();
+13
tools/include/linux/err.h
···
 	else
 		return 0;
 }
+
+/**
+ * ERR_CAST - Explicitly cast an error-valued pointer to another pointer type
+ * @ptr: The pointer to cast.
+ *
+ * Explicitly cast an error-valued pointer to another pointer type in such a
+ * way as to make it clear that's what's going on.
+ */
+static inline void * __must_check ERR_CAST(__force const void *ptr)
+{
+	/* cast away the const */
+	return (void *) ptr;
+}
 #endif /* _LINUX_ERR_H */
+2 -2
tools/lib/subcmd/parse-options.h
···
  *
  * `argh`::
  *   token to explain the kind of argument this option wants. Keep it
- *   homogenous across the repository.
+ *   homogeneous across the repository.
  *
  * `help`::
  *   the short help associated to what the option does.
···
  *
  * `flags`::
  *   mask of parse_opt_option_flags.
- *   PARSE_OPT_OPTARG: says that the argument is optionnal (not for BOOLEANs)
+ *   PARSE_OPT_OPTARG: says that the argument is optional (not for BOOLEANs)
  *   PARSE_OPT_NOARG: says that this option takes no argument, for CALLBACKs
  *   PARSE_OPT_NONEG: says that this option cannot be negated
  *   PARSE_OPT_HIDDEN this option is skipped in the default usage, showed in
+24 -3
tools/lib/traceevent/Makefile
···
 $(call allow-override,CC,$(CROSS_COMPILE)gcc)
 $(call allow-override,AR,$(CROSS_COMPILE)ar)
 $(call allow-override,NM,$(CROSS_COMPILE)nm)
+$(call allow-override,PKG_CONFIG,pkg-config)

 EXT = -std=gnu99
 INSTALL = install
···
 libdir = $(prefix)/$(libdir_relative)
 man_dir = $(prefix)/share/man
 man_dir_SQ = '$(subst ','\'',$(man_dir))'
+pkgconfig_dir ?= $(word 1,$(shell $(PKG_CONFIG) \
+			--variable pc_path pkg-config | tr ":" " "))

 export man_dir man_dir_SQ INSTALL
 export DESTDIR DESTDIR_SQ
···
 	fi
 endef

-install_lib: all_cmd install_plugins
+PKG_CONFIG_FILE = libtraceevent.pc
+define do_install_pkgconfig_file
+	if [ -n "${pkgconfig_dir}" ]; then 					\
+		cp -f ${PKG_CONFIG_FILE}.template ${PKG_CONFIG_FILE}; 		\
+		sed -i "s|INSTALL_PREFIX|${1}|g" ${PKG_CONFIG_FILE}; 		\
+		sed -i "s|LIB_VERSION|${EVENT_PARSE_VERSION}|g" ${PKG_CONFIG_FILE}; \
+		$(call do_install,$(PKG_CONFIG_FILE),$(pkgconfig_dir),644); 	\
+	else 									\
+		(echo Failed to locate pkg-config directory) 1>&2;		\
+	fi
+endef
+
+install_lib: all_cmd install_plugins install_headers install_pkgconfig
 	$(call QUIET_INSTALL, $(LIB_TARGET)) \
 		$(call do_install_mkdir,$(libdir_SQ)); \
 		cp -fpR $(LIB_INSTALL) $(DESTDIR)$(libdir_SQ)
···
 	$(call QUIET_INSTALL, trace_plugins) \
 		$(call do_install_plugins, $(PLUGINS))

+install_pkgconfig:
+	$(call QUIET_INSTALL, $(PKG_CONFIG_FILE)) \
+		$(call do_install_pkgconfig_file,$(prefix))
+
 install_headers:
 	$(call QUIET_INSTALL, headers) \
 		$(call do_install,event-parse.h,$(prefix)/include/traceevent,644); \
 		$(call do_install,event-utils.h,$(prefix)/include/traceevent,644); \
+		$(call do_install,trace-seq.h,$(prefix)/include/traceevent,644); \
 		$(call do_install,kbuffer.h,$(prefix)/include/traceevent,644)

 install: install_lib

 clean:
 	$(call QUIET_CLEAN, libtraceevent) \
-		$(RM) *.o *~ $(TARGETS) *.a *.so $(VERSION_FILES) .*.d .*.cmd \
-		$(RM) TRACEEVENT-CFLAGS tags TAGS
+		$(RM) *.o *~ $(TARGETS) *.a *.so $(VERSION_FILES) .*.d .*.cmd; \
+		$(RM) TRACEEVENT-CFLAGS tags TAGS; \
+		$(RM) $(PKG_CONFIG_FILE)

 PHONY += force plugins
 force:
+4 -4
tools/lib/traceevent/event-parse-api.c
···
  * This returns pointer to the first element of the events array
  * If @tep is NULL, NULL is returned.
  */
-struct tep_event_format *tep_get_first_event(struct tep_handle *tep)
+struct tep_event *tep_get_first_event(struct tep_handle *tep)
 {
 	if (tep && tep->events)
 		return tep->events[0];
···
 	tep->flags |= flag;
 }

-unsigned short __tep_data2host2(struct tep_handle *pevent, unsigned short data)
+unsigned short tep_data2host2(struct tep_handle *pevent, unsigned short data)
 {
 	unsigned short swap;
···
 	return swap;
 }

-unsigned int __tep_data2host4(struct tep_handle *pevent, unsigned int data)
+unsigned int tep_data2host4(struct tep_handle *pevent, unsigned int data)
 {
 	unsigned int swap;
···
 }

 unsigned long long
-__tep_data2host8(struct tep_handle *pevent, unsigned long long data)
+tep_data2host8(struct tep_handle *pevent, unsigned long long data)
 {
 	unsigned long long swap;
+10 -3
tools/lib/traceevent/event-parse-local.h
···
 	unsigned int printk_count;


-	struct tep_event_format **events;
+	struct tep_event **events;
 	int nr_events;
-	struct tep_event_format **sort_events;
+	struct tep_event **sort_events;
 	enum tep_event_sort_type last_type;

 	int type_offset;
···
 	struct tep_function_handler *func_handlers;

 	/* cache */
-	struct tep_event_format *last_event;
+	struct tep_event *last_event;

 	char *trace_clock;
 };
+
+void tep_free_event(struct tep_event *event);
+void tep_free_format_field(struct tep_format_field *field);
+
+unsigned short tep_data2host2(struct tep_handle *pevent, unsigned short data);
+unsigned int tep_data2host4(struct tep_handle *pevent, unsigned int data);
+unsigned long long tep_data2host8(struct tep_handle *pevent, unsigned long long data);

 #endif /* _PARSE_EVENTS_INT_H */
+122 -112
tools/lib/traceevent/event-parse.c
··· 96 96 97 97 static unsigned long long 98 98 process_defined_func(struct trace_seq *s, void *data, int size, 99 - struct tep_event_format *event, struct tep_print_arg *arg); 99 + struct tep_event *event, struct tep_print_arg *arg); 100 100 101 101 static void free_func_handle(struct tep_function_handler *func); 102 102 ··· 739 739 } 740 740 } 741 741 742 - static struct tep_event_format *alloc_event(void) 742 + static struct tep_event *alloc_event(void) 743 743 { 744 - return calloc(1, sizeof(struct tep_event_format)); 744 + return calloc(1, sizeof(struct tep_event)); 745 745 } 746 746 747 - static int add_event(struct tep_handle *pevent, struct tep_event_format *event) 747 + static int add_event(struct tep_handle *pevent, struct tep_event *event) 748 748 { 749 749 int i; 750 - struct tep_event_format **events = realloc(pevent->events, sizeof(event) * 751 - (pevent->nr_events + 1)); 750 + struct tep_event **events = realloc(pevent->events, sizeof(event) * 751 + (pevent->nr_events + 1)); 752 752 if (!events) 753 753 return -1; 754 754 ··· 1145 1145 } 1146 1146 1147 1147 /** 1148 - * tep_read_token - access to utilites to use the pevent parser 1148 + * tep_read_token - access to utilities to use the pevent parser 1149 1149 * @tok: The token to return 1150 1150 * 1151 1151 * This will parse tokens from the string given by ··· 1355 1355 return 0; 1356 1356 } 1357 1357 1358 - static int event_read_fields(struct tep_event_format *event, struct tep_format_field **fields) 1358 + static int event_read_fields(struct tep_event *event, struct tep_format_field **fields) 1359 1359 { 1360 1360 struct tep_format_field *field = NULL; 1361 1361 enum tep_event_type type; ··· 1642 1642 return -1; 1643 1643 } 1644 1644 1645 - static int event_read_format(struct tep_event_format *event) 1645 + static int event_read_format(struct tep_event *event) 1646 1646 { 1647 1647 char *token; 1648 1648 int ret; ··· 1675 1675 } 1676 1676 1677 1677 static enum tep_event_type 1678 - 
process_arg_token(struct tep_event_format *event, struct tep_print_arg *arg, 1678 + process_arg_token(struct tep_event *event, struct tep_print_arg *arg, 1679 1679 char **tok, enum tep_event_type type); 1680 1680 1681 1681 static enum tep_event_type 1682 - process_arg(struct tep_event_format *event, struct tep_print_arg *arg, char **tok) 1682 + process_arg(struct tep_event *event, struct tep_print_arg *arg, char **tok) 1683 1683 { 1684 1684 enum tep_event_type type; 1685 1685 char *token; ··· 1691 1691 } 1692 1692 1693 1693 static enum tep_event_type 1694 - process_op(struct tep_event_format *event, struct tep_print_arg *arg, char **tok); 1694 + process_op(struct tep_event *event, struct tep_print_arg *arg, char **tok); 1695 1695 1696 1696 /* 1697 1697 * For __print_symbolic() and __print_flags, we need to completely 1698 1698 * evaluate the first argument, which defines what to print next. 1699 1699 */ 1700 1700 static enum tep_event_type 1701 - process_field_arg(struct tep_event_format *event, struct tep_print_arg *arg, char **tok) 1701 + process_field_arg(struct tep_event *event, struct tep_print_arg *arg, char **tok) 1702 1702 { 1703 1703 enum tep_event_type type; 1704 1704 ··· 1712 1712 } 1713 1713 1714 1714 static enum tep_event_type 1715 - process_cond(struct tep_event_format *event, struct tep_print_arg *top, char **tok) 1715 + process_cond(struct tep_event *event, struct tep_print_arg *top, char **tok) 1716 1716 { 1717 1717 struct tep_print_arg *arg, *left, *right; 1718 1718 enum tep_event_type type; ··· 1768 1768 } 1769 1769 1770 1770 static enum tep_event_type 1771 - process_array(struct tep_event_format *event, struct tep_print_arg *top, char **tok) 1771 + process_array(struct tep_event *event, struct tep_print_arg *top, char **tok) 1772 1772 { 1773 1773 struct tep_print_arg *arg; 1774 1774 enum tep_event_type type; ··· 1870 1870 1871 1871 /* Note, *tok does not get freed, but will most likely be saved */ 1872 1872 static enum tep_event_type 1873 - 
process_op(struct tep_event_format *event, struct tep_print_arg *arg, char **tok) 1873 + process_op(struct tep_event *event, struct tep_print_arg *arg, char **tok) 1874 1874 { 1875 1875 struct tep_print_arg *left, *right = NULL; 1876 1876 enum tep_event_type type; ··· 2071 2071 } 2072 2072 2073 2073 static enum tep_event_type 2074 - process_entry(struct tep_event_format *event __maybe_unused, struct tep_print_arg *arg, 2074 + process_entry(struct tep_event *event __maybe_unused, struct tep_print_arg *arg, 2075 2075 char **tok) 2076 2076 { 2077 2077 enum tep_event_type type; ··· 2110 2110 return TEP_EVENT_ERROR; 2111 2111 } 2112 2112 2113 - static int alloc_and_process_delim(struct tep_event_format *event, char *next_token, 2113 + static int alloc_and_process_delim(struct tep_event *event, char *next_token, 2114 2114 struct tep_print_arg **print_arg) 2115 2115 { 2116 2116 struct tep_print_arg *field; ··· 2445 2445 } 2446 2446 2447 2447 static enum tep_event_type 2448 - process_fields(struct tep_event_format *event, struct tep_print_flag_sym **list, char **tok) 2448 + process_fields(struct tep_event *event, struct tep_print_flag_sym **list, char **tok) 2449 2449 { 2450 2450 enum tep_event_type type; 2451 2451 struct tep_print_arg *arg = NULL; ··· 2526 2526 } 2527 2527 2528 2528 static enum tep_event_type 2529 - process_flags(struct tep_event_format *event, struct tep_print_arg *arg, char **tok) 2529 + process_flags(struct tep_event *event, struct tep_print_arg *arg, char **tok) 2530 2530 { 2531 2531 struct tep_print_arg *field; 2532 2532 enum tep_event_type type; ··· 2579 2579 } 2580 2580 2581 2581 static enum tep_event_type 2582 - process_symbols(struct tep_event_format *event, struct tep_print_arg *arg, char **tok) 2582 + process_symbols(struct tep_event *event, struct tep_print_arg *arg, char **tok) 2583 2583 { 2584 2584 struct tep_print_arg *field; 2585 2585 enum tep_event_type type; ··· 2618 2618 } 2619 2619 2620 2620 static enum tep_event_type 2621 - 
process_hex_common(struct tep_event_format *event, struct tep_print_arg *arg, 2621 + process_hex_common(struct tep_event *event, struct tep_print_arg *arg, 2622 2622 char **tok, enum tep_print_arg_type type) 2623 2623 { 2624 2624 memset(arg, 0, sizeof(*arg)); ··· 2641 2641 } 2642 2642 2643 2643 static enum tep_event_type 2644 - process_hex(struct tep_event_format *event, struct tep_print_arg *arg, char **tok) 2644 + process_hex(struct tep_event *event, struct tep_print_arg *arg, char **tok) 2645 2645 { 2646 2646 return process_hex_common(event, arg, tok, TEP_PRINT_HEX); 2647 2647 } 2648 2648 2649 2649 static enum tep_event_type 2650 - process_hex_str(struct tep_event_format *event, struct tep_print_arg *arg, 2650 + process_hex_str(struct tep_event *event, struct tep_print_arg *arg, 2651 2651 char **tok) 2652 2652 { 2653 2653 return process_hex_common(event, arg, tok, TEP_PRINT_HEX_STR); 2654 2654 } 2655 2655 2656 2656 static enum tep_event_type 2657 - process_int_array(struct tep_event_format *event, struct tep_print_arg *arg, char **tok) 2657 + process_int_array(struct tep_event *event, struct tep_print_arg *arg, char **tok) 2658 2658 { 2659 2659 memset(arg, 0, sizeof(*arg)); 2660 2660 arg->type = TEP_PRINT_INT_ARRAY; ··· 2682 2682 } 2683 2683 2684 2684 static enum tep_event_type 2685 - process_dynamic_array(struct tep_event_format *event, struct tep_print_arg *arg, char **tok) 2685 + process_dynamic_array(struct tep_event *event, struct tep_print_arg *arg, char **tok) 2686 2686 { 2687 2687 struct tep_format_field *field; 2688 2688 enum tep_event_type type; ··· 2746 2746 } 2747 2747 2748 2748 static enum tep_event_type 2749 - process_dynamic_array_len(struct tep_event_format *event, struct tep_print_arg *arg, 2749 + process_dynamic_array_len(struct tep_event *event, struct tep_print_arg *arg, 2750 2750 char **tok) 2751 2751 { 2752 2752 struct tep_format_field *field; ··· 2782 2782 } 2783 2783 2784 2784 static enum tep_event_type 2785 - process_paren(struct 
tep_event_format *event, struct tep_print_arg *arg, char **tok) 2785 + process_paren(struct tep_event *event, struct tep_print_arg *arg, char **tok) 2786 2786 { 2787 2787 struct tep_print_arg *item_arg; 2788 2788 enum tep_event_type type; ··· 2845 2845 2846 2846 2847 2847 static enum tep_event_type 2848 - process_str(struct tep_event_format *event __maybe_unused, struct tep_print_arg *arg, 2848 + process_str(struct tep_event *event __maybe_unused, struct tep_print_arg *arg, 2849 2849 char **tok) 2850 2850 { 2851 2851 enum tep_event_type type; ··· 2874 2874 } 2875 2875 2876 2876 static enum tep_event_type 2877 - process_bitmask(struct tep_event_format *event __maybe_unused, struct tep_print_arg *arg, 2877 + process_bitmask(struct tep_event *event __maybe_unused, struct tep_print_arg *arg, 2878 2878 char **tok) 2879 2879 { 2880 2880 enum tep_event_type type; ··· 2935 2935 } 2936 2936 2937 2937 static enum tep_event_type 2938 - process_func_handler(struct tep_event_format *event, struct tep_function_handler *func, 2938 + process_func_handler(struct tep_event *event, struct tep_function_handler *func, 2939 2939 struct tep_print_arg *arg, char **tok) 2940 2940 { 2941 2941 struct tep_print_arg **next_arg; ··· 2993 2993 } 2994 2994 2995 2995 static enum tep_event_type 2996 - process_function(struct tep_event_format *event, struct tep_print_arg *arg, 2996 + process_function(struct tep_event *event, struct tep_print_arg *arg, 2997 2997 char *token, char **tok) 2998 2998 { 2999 2999 struct tep_function_handler *func; ··· 3049 3049 } 3050 3050 3051 3051 static enum tep_event_type 3052 - process_arg_token(struct tep_event_format *event, struct tep_print_arg *arg, 3052 + process_arg_token(struct tep_event *event, struct tep_print_arg *arg, 3053 3053 char **tok, enum tep_event_type type) 3054 3054 { 3055 3055 char *token; ··· 3137 3137 return type; 3138 3138 } 3139 3139 3140 - static int event_read_print_args(struct tep_event_format *event, struct tep_print_arg **list) 3140 + 
static int event_read_print_args(struct tep_event *event, struct tep_print_arg **list) 3141 3141 { 3142 3142 enum tep_event_type type = TEP_EVENT_ERROR; 3143 3143 struct tep_print_arg *arg; ··· 3195 3195 return args; 3196 3196 } 3197 3197 3198 - static int event_read_print(struct tep_event_format *event) 3198 + static int event_read_print(struct tep_event *event) 3199 3199 { 3200 3200 enum tep_event_type type; 3201 3201 char *token; ··· 3258 3258 * @name: the name of the common field to return 3259 3259 * 3260 3260 * Returns a common field from the event by the given @name. 3261 - * This only searchs the common fields and not all field. 3261 + * This only searches the common fields and not all field. 3262 3262 */ 3263 3263 struct tep_format_field * 3264 - tep_find_common_field(struct tep_event_format *event, const char *name) 3264 + tep_find_common_field(struct tep_event *event, const char *name) 3265 3265 { 3266 3266 struct tep_format_field *format; 3267 3267 ··· 3283 3283 * This does not search common fields. 3284 3284 */ 3285 3285 struct tep_format_field * 3286 - tep_find_field(struct tep_event_format *event, const char *name) 3286 + tep_find_field(struct tep_event *event, const char *name) 3287 3287 { 3288 3288 struct tep_format_field *format; 3289 3289 ··· 3302 3302 * @name: the name of the field 3303 3303 * 3304 3304 * Returns a field by the given @name. 3305 - * This searchs the common field names first, then 3305 + * This searches the common field names first, then 3306 3306 * the non-common ones if a common one was not found. 
3307 3307 */ 3308 3308 struct tep_format_field * 3309 - tep_find_any_field(struct tep_event_format *event, const char *name) 3309 + tep_find_any_field(struct tep_event *event, const char *name) 3310 3310 { 3311 3311 struct tep_format_field *format; 3312 3312 ··· 3328 3328 unsigned long long tep_read_number(struct tep_handle *pevent, 3329 3329 const void *ptr, int size) 3330 3330 { 3331 + unsigned long long val; 3332 + 3331 3333 switch (size) { 3332 3334 case 1: 3333 3335 return *(unsigned char *)ptr; 3334 3336 case 2: 3335 - return tep_data2host2(pevent, ptr); 3337 + return tep_data2host2(pevent, *(unsigned short *)ptr); 3336 3338 case 4: 3337 - return tep_data2host4(pevent, ptr); 3339 + return tep_data2host4(pevent, *(unsigned int *)ptr); 3338 3340 case 8: 3339 - return tep_data2host8(pevent, ptr); 3341 + memcpy(&val, (ptr), sizeof(unsigned long long)); 3342 + return tep_data2host8(pevent, val); 3340 3343 default: 3341 3344 /* BUG! */ 3342 3345 return 0; ··· 3378 3375 static int get_common_info(struct tep_handle *pevent, 3379 3376 const char *type, int *offset, int *size) 3380 3377 { 3381 - struct tep_event_format *event; 3378 + struct tep_event *event; 3382 3379 struct tep_format_field *field; 3383 3380 3384 3381 /* ··· 3465 3462 * 3466 3463 * Returns an event that has a given @id. 3467 3464 */ 3468 - struct tep_event_format *tep_find_event(struct tep_handle *pevent, int id) 3465 + struct tep_event *tep_find_event(struct tep_handle *pevent, int id) 3469 3466 { 3470 - struct tep_event_format **eventptr; 3471 - struct tep_event_format key; 3472 - struct tep_event_format *pkey = &key; 3467 + struct tep_event **eventptr; 3468 + struct tep_event key; 3469 + struct tep_event *pkey = &key; 3473 3470 3474 3471 /* Check cache first */ 3475 3472 if (pevent->last_event && pevent->last_event->id == id) ··· 3497 3494 * This returns an event with a given @name and under the system 3498 3495 * @sys. If @sys is NULL the first event with @name is returned. 
3499 3496 */ 3500 - struct tep_event_format * 3497 + struct tep_event * 3501 3498 tep_find_event_by_name(struct tep_handle *pevent, 3502 3499 const char *sys, const char *name) 3503 3500 { 3504 - struct tep_event_format *event; 3501 + struct tep_event *event = NULL; 3505 3502 int i; 3506 3503 3507 3504 if (pevent->last_event && ··· 3526 3523 } 3527 3524 3528 3525 static unsigned long long 3529 - eval_num_arg(void *data, int size, struct tep_event_format *event, struct tep_print_arg *arg) 3526 + eval_num_arg(void *data, int size, struct tep_event *event, struct tep_print_arg *arg) 3530 3527 { 3531 3528 struct tep_handle *pevent = event->pevent; 3532 3529 unsigned long long val = 0; ··· 3841 3838 /* 3842 3839 * data points to a bit mask of size bytes. 3843 3840 * In the kernel, this is an array of long words, thus 3844 - * endianess is very important. 3841 + * endianness is very important. 3845 3842 */ 3846 3843 if (pevent->file_bigendian) 3847 3844 index = size - (len + 1); ··· 3866 3863 } 3867 3864 3868 3865 static void print_str_arg(struct trace_seq *s, void *data, int size, 3869 - struct tep_event_format *event, const char *format, 3866 + struct tep_event *event, const char *format, 3870 3867 int len_arg, struct tep_print_arg *arg) 3871 3868 { 3872 3869 struct tep_handle *pevent = event->pevent; ··· 4065 4062 f = tep_find_any_field(event, arg->string.string); 4066 4063 arg->string.offset = f->offset; 4067 4064 } 4068 - str_offset = tep_data2host4(pevent, data + arg->string.offset); 4065 + str_offset = tep_data2host4(pevent, *(unsigned int *)(data + arg->string.offset)); 4069 4066 str_offset &= 0xffff; 4070 4067 print_str_to_seq(s, format, len_arg, ((char *)data) + str_offset); 4071 4068 break; ··· 4083 4080 f = tep_find_any_field(event, arg->bitmask.bitmask); 4084 4081 arg->bitmask.offset = f->offset; 4085 4082 } 4086 - bitmask_offset = tep_data2host4(pevent, data + arg->bitmask.offset); 4083 + bitmask_offset = tep_data2host4(pevent, *(unsigned int *)(data + 
arg->bitmask.offset)); 4087 4084 bitmask_size = bitmask_offset >> 16; 4088 4085 bitmask_offset &= 0xffff; 4089 4086 print_bitmask_to_seq(pevent, s, format, len_arg, ··· 4121 4118 4122 4119 static unsigned long long 4123 4120 process_defined_func(struct trace_seq *s, void *data, int size, 4124 - struct tep_event_format *event, struct tep_print_arg *arg) 4121 + struct tep_event *event, struct tep_print_arg *arg) 4125 4122 { 4126 4123 struct tep_function_handler *func_handle = arg->func.func; 4127 4124 struct func_params *param; ··· 4216 4213 } 4217 4214 } 4218 4215 4219 - static struct tep_print_arg *make_bprint_args(char *fmt, void *data, int size, struct tep_event_format *event) 4216 + static struct tep_print_arg *make_bprint_args(char *fmt, void *data, int size, struct tep_event *event) 4220 4217 { 4221 4218 struct tep_handle *pevent = event->pevent; 4222 4219 struct tep_format_field *field, *ip_field; ··· 4224 4221 unsigned long long ip, val; 4225 4222 char *ptr; 4226 4223 void *bptr; 4227 - int vsize; 4224 + int vsize = 0; 4228 4225 4229 4226 field = pevent->bprint_buf_field; 4230 4227 ip_field = pevent->bprint_ip_field; ··· 4393 4390 4394 4391 static char * 4395 4392 get_bprint_format(void *data, int size __maybe_unused, 4396 - struct tep_event_format *event) 4393 + struct tep_event *event) 4397 4394 { 4398 4395 struct tep_handle *pevent = event->pevent; 4399 4396 unsigned long long addr; ··· 4428 4425 } 4429 4426 4430 4427 static void print_mac_arg(struct trace_seq *s, int mac, void *data, int size, 4431 - struct tep_event_format *event, struct tep_print_arg *arg) 4428 + struct tep_event *event, struct tep_print_arg *arg) 4432 4429 { 4433 4430 unsigned char *buf; 4434 4431 const char *fmt = "%.2x:%.2x:%.2x:%.2x:%.2x:%.2x"; ··· 4581 4578 * %pISpc print an IP address based on sockaddr; p adds port. 
4582 4579 */ 4583 4580 static int print_ipv4_arg(struct trace_seq *s, const char *ptr, char i, 4584 - void *data, int size, struct tep_event_format *event, 4581 + void *data, int size, struct tep_event *event, 4585 4582 struct tep_print_arg *arg) 4586 4583 { 4587 4584 unsigned char *buf; ··· 4618 4615 } 4619 4616 4620 4617 static int print_ipv6_arg(struct trace_seq *s, const char *ptr, char i, 4621 - void *data, int size, struct tep_event_format *event, 4618 + void *data, int size, struct tep_event *event, 4622 4619 struct tep_print_arg *arg) 4623 4620 { 4624 4621 char have_c = 0; ··· 4668 4665 } 4669 4666 4670 4667 static int print_ipsa_arg(struct trace_seq *s, const char *ptr, char i, 4671 - void *data, int size, struct tep_event_format *event, 4668 + void *data, int size, struct tep_event *event, 4672 4669 struct tep_print_arg *arg) 4673 4670 { 4674 4671 char have_c = 0, have_p = 0; ··· 4750 4747 } 4751 4748 4752 4749 static int print_ip_arg(struct trace_seq *s, const char *ptr, 4753 - void *data, int size, struct tep_event_format *event, 4750 + void *data, int size, struct tep_event *event, 4754 4751 struct tep_print_arg *arg) 4755 4752 { 4756 4753 char i = *ptr; /* 'i' or 'I' */ ··· 4857 4854 } 4858 4855 4859 4856 void tep_print_fields(struct trace_seq *s, void *data, 4860 - int size __maybe_unused, struct tep_event_format *event) 4857 + int size __maybe_unused, struct tep_event *event) 4861 4858 { 4862 4859 struct tep_format_field *field; 4863 4860 ··· 4869 4866 } 4870 4867 } 4871 4868 4872 - static void pretty_print(struct trace_seq *s, void *data, int size, struct tep_event_format *event) 4869 + static void pretty_print(struct trace_seq *s, void *data, int size, struct tep_event *event) 4873 4870 { 4874 4871 struct tep_handle *pevent = event->pevent; 4875 4872 struct tep_print_fmt *print_fmt = &event->print_fmt; ··· 4884 4881 char format[32]; 4885 4882 int show_func; 4886 4883 int len_as_arg; 4887 - int len_arg; 4884 + int len_arg = 0; 4888 4885 int len; 
4889 4886 int ls; 4890 4887 ··· 5149 5146 static int migrate_disable_exists; 5150 5147 unsigned int lat_flags; 5151 5148 unsigned int pc; 5152 - int lock_depth; 5153 - int migrate_disable; 5149 + int lock_depth = 0; 5150 + int migrate_disable = 0; 5154 5151 int hardirq; 5155 5152 int softirq; 5156 5153 void *data = record->data; ··· 5232 5229 * 5233 5230 * This returns the event form a given @type; 5234 5231 */ 5235 - struct tep_event_format *tep_data_event_from_type(struct tep_handle *pevent, int type) 5232 + struct tep_event *tep_data_event_from_type(struct tep_handle *pevent, int type) 5236 5233 { 5237 5234 return tep_find_event(pevent, type); 5238 5235 } ··· 5316 5313 * This returns the cmdline structure that holds a pid for a given 5317 5314 * comm, or NULL if none found. As there may be more than one pid for 5318 5315 * a given comm, the result of this call can be passed back into 5319 - * a recurring call in the @next paramater, and then it will find the 5316 + * a recurring call in the @next parameter, and then it will find the 5320 5317 * next pid. 5321 - * Also, it does a linear seach, so it may be slow. 5318 + * Also, it does a linear search, so it may be slow. 5322 5319 */ 5323 5320 struct cmdline *tep_data_pid_from_comm(struct tep_handle *pevent, const char *comm, 5324 5321 struct cmdline *next) ··· 5390 5387 * This parses the raw @data using the given @event information and 5391 5388 * writes the print format into the trace_seq. 
5392 5389 */ 5393 - void tep_event_info(struct trace_seq *s, struct tep_event_format *event, 5390 + void tep_event_info(struct trace_seq *s, struct tep_event *event, 5394 5391 struct tep_record *record) 5395 5392 { 5396 5393 int print_pretty = 1; ··· 5412 5409 5413 5410 static bool is_timestamp_in_us(char *trace_clock, bool use_trace_clock) 5414 5411 { 5415 - if (!use_trace_clock) 5412 + if (!trace_clock || !use_trace_clock) 5416 5413 return true; 5417 5414 5418 5415 if (!strcmp(trace_clock, "local") || !strcmp(trace_clock, "global") ··· 5431 5428 * Returns the associated event for a given record, or NULL if non is 5432 5429 * is found. 5433 5430 */ 5434 - struct tep_event_format * 5431 + struct tep_event * 5435 5432 tep_find_event_by_record(struct tep_handle *pevent, struct tep_record *record) 5436 5433 { 5437 5434 int type; ··· 5456 5453 * Writes the tasks comm, pid and CPU to @s. 5457 5454 */ 5458 5455 void tep_print_event_task(struct tep_handle *pevent, struct trace_seq *s, 5459 - struct tep_event_format *event, 5456 + struct tep_event *event, 5460 5457 struct tep_record *record) 5461 5458 { 5462 5459 void *data = record->data; ··· 5484 5481 * Writes the timestamp of the record into @s. 5485 5482 */ 5486 5483 void tep_print_event_time(struct tep_handle *pevent, struct trace_seq *s, 5487 - struct tep_event_format *event, 5484 + struct tep_event *event, 5488 5485 struct tep_record *record, 5489 5486 bool use_trace_clock) 5490 5487 { ··· 5534 5531 * Writes the parsing of the record's data to @s. 
5535 5532 */ 5536 5533 void tep_print_event_data(struct tep_handle *pevent, struct trace_seq *s, 5537 - struct tep_event_format *event, 5534 + struct tep_event *event, 5538 5535 struct tep_record *record) 5539 5536 { 5540 5537 static const char *spaces = " "; /* 20 spaces */ ··· 5553 5550 void tep_print_event(struct tep_handle *pevent, struct trace_seq *s, 5554 5551 struct tep_record *record, bool use_trace_clock) 5555 5552 { 5556 - struct tep_event_format *event; 5553 + struct tep_event *event; 5557 5554 5558 5555 event = tep_find_event_by_record(pevent, record); 5559 5556 if (!event) { ··· 5575 5572 5576 5573 static int events_id_cmp(const void *a, const void *b) 5577 5574 { 5578 - struct tep_event_format * const * ea = a; 5579 - struct tep_event_format * const * eb = b; 5575 + struct tep_event * const * ea = a; 5576 + struct tep_event * const * eb = b; 5580 5577 5581 5578 if ((*ea)->id < (*eb)->id) 5582 5579 return -1; ··· 5589 5586 5590 5587 static int events_name_cmp(const void *a, const void *b) 5591 5588 { 5592 - struct tep_event_format * const * ea = a; 5593 - struct tep_event_format * const * eb = b; 5589 + struct tep_event * const * ea = a; 5590 + struct tep_event * const * eb = b; 5594 5591 int res; 5595 5592 5596 5593 res = strcmp((*ea)->name, (*eb)->name); ··· 5606 5603 5607 5604 static int events_system_cmp(const void *a, const void *b) 5608 5605 { 5609 - struct tep_event_format * const * ea = a; 5610 - struct tep_event_format * const * eb = b; 5606 + struct tep_event * const * ea = a; 5607 + struct tep_event * const * eb = b; 5611 5608 int res; 5612 5609 5613 5610 res = strcmp((*ea)->system, (*eb)->system); ··· 5621 5618 return events_id_cmp(a, b); 5622 5619 } 5623 5620 5624 - struct tep_event_format **tep_list_events(struct tep_handle *pevent, enum tep_event_sort_type sort_type) 5621 + struct tep_event **tep_list_events(struct tep_handle *pevent, enum tep_event_sort_type sort_type) 5625 5622 { 5626 - struct tep_event_format **events;
5623 + struct tep_event **events; 5627 5624 int (*sort)(const void *a, const void *b); 5628 5625 5629 5626 events = pevent->sort_events; ··· 5706 5703 * Returns an allocated array of fields. The last item in the array is NULL. 5707 5704 * The array must be freed with free(). 5708 5705 */ 5709 - struct tep_format_field **tep_event_common_fields(struct tep_event_format *event) 5706 + struct tep_format_field **tep_event_common_fields(struct tep_event *event) 5710 5707 { 5711 5708 return get_event_fields("common", event->name, 5712 5709 event->format.nr_common, ··· 5720 5717 * Returns an allocated array of fields. The last item in the array is NULL. 5721 5718 * The array must be freed with free(). 5722 5719 */ 5723 - struct tep_format_field **tep_event_fields(struct tep_event_format *event) 5720 + struct tep_format_field **tep_event_fields(struct tep_event *event) 5724 5721 { 5725 5722 return get_event_fields("event", event->name, 5726 5723 event->format.nr_fields, ··· 5962 5959 return 0; 5963 5960 } 5964 5961 5965 - static int event_matches(struct tep_event_format *event, 5962 + static int event_matches(struct tep_event *event, 5966 5963 int id, const char *sys_name, 5967 5964 const char *event_name) 5968 5965 { ··· 5985 5982 free(handle); 5986 5983 } 5987 5984 5988 - static int find_event_handle(struct tep_handle *pevent, struct tep_event_format *event) 5985 + static int find_event_handle(struct tep_handle *pevent, struct tep_event *event) 5989 5986 { 5990 5987 struct event_handler *handle, **next; 5991 5988 ··· 6026 6023 * 6027 6024 * /sys/kernel/debug/tracing/events/.../.../format 6028 6025 */ 6029 - enum tep_errno __tep_parse_format(struct tep_event_format **eventp, 6026 + enum tep_errno __tep_parse_format(struct tep_event **eventp, 6030 6027 struct tep_handle *pevent, const char *buf, 6031 6028 unsigned long size, const char *sys) 6032 6029 { 6033 - struct tep_event_format *event; 6030 + struct tep_event *event; 6034 6031 int ret; 6035 6032 6036 6033 init_input_buf(buf, size);
··· 6135 6132 6136 6133 static enum tep_errno 6137 6134 __parse_event(struct tep_handle *pevent, 6138 - struct tep_event_format **eventp, 6135 + struct tep_event **eventp, 6139 6136 const char *buf, unsigned long size, 6140 6137 const char *sys) 6141 6138 { 6142 6139 int ret = __tep_parse_format(eventp, pevent, buf, size, sys); 6143 - struct tep_event_format *event = *eventp; 6140 + struct tep_event *event = *eventp; 6144 6141 6145 6142 if (event == NULL) 6146 6143 return ret; ··· 6157 6154 return 0; 6158 6155 6159 6156 event_add_failed: 6160 - tep_free_format(event); 6157 + tep_free_event(event); 6161 6158 return ret; 6162 6159 } 6163 6160 ··· 6177 6174 * /sys/kernel/debug/tracing/events/.../.../format 6178 6175 */ 6179 6176 enum tep_errno tep_parse_format(struct tep_handle *pevent, 6180 - struct tep_event_format **eventp, 6177 + struct tep_event **eventp, 6181 6178 const char *buf, 6182 6179 unsigned long size, const char *sys) 6183 6180 { ··· 6201 6198 enum tep_errno tep_parse_event(struct tep_handle *pevent, const char *buf, 6202 6199 unsigned long size, const char *sys) 6203 6200 { 6204 - struct tep_event_format *event = NULL; 6201 + struct tep_event *event = NULL; 6205 6202 return __parse_event(pevent, &event, buf, size, sys); 6206 6203 } 6207 6204 ··· 6238 6235 * 6239 6236 * On failure, it returns NULL. 6240 6237 */ 6241 - void *tep_get_field_raw(struct trace_seq *s, struct tep_event_format *event, 6238 + void *tep_get_field_raw(struct trace_seq *s, struct tep_event *event, 6242 6239 const char *name, struct tep_record *record, 6243 6240 int *len, int err) 6244 6241 { ··· 6285 6282 * 6286 6283 * Returns 0 on success -1 on field not found. 
6287 6284 */ 6288 - int tep_get_field_val(struct trace_seq *s, struct tep_event_format *event, 6285 + int tep_get_field_val(struct trace_seq *s, struct tep_event *event, 6289 6286 const char *name, struct tep_record *record, 6290 6287 unsigned long long *val, int err) 6291 6288 { ··· 6310 6307 * 6311 6308 * Returns 0 on success -1 on field not found. 6312 6309 */ 6313 - int tep_get_common_field_val(struct trace_seq *s, struct tep_event_format *event, 6310 + int tep_get_common_field_val(struct trace_seq *s, struct tep_event *event, 6314 6311 const char *name, struct tep_record *record, 6315 6312 unsigned long long *val, int err) 6316 6313 { ··· 6335 6332 * 6336 6333 * Returns 0 on success -1 on field not found. 6337 6334 */ 6338 - int tep_get_any_field_val(struct trace_seq *s, struct tep_event_format *event, 6335 + int tep_get_any_field_val(struct trace_seq *s, struct tep_event *event, 6339 6336 const char *name, struct tep_record *record, 6340 6337 unsigned long long *val, int err) 6341 6338 { ··· 6361 6358 * Returns: 0 on success, -1 field not found, or 1 if buffer is full. 6362 6359 */ 6363 6360 int tep_print_num_field(struct trace_seq *s, const char *fmt, 6364 - struct tep_event_format *event, const char *name, 6361 + struct tep_event *event, const char *name, 6365 6362 struct tep_record *record, int err) 6366 6363 { 6367 6364 struct tep_format_field *field = tep_find_field(event, name); ··· 6393 6390 * Returns: 0 on success, -1 field not found, or 1 if buffer is full. 
6394 6391 */ 6395 6392 int tep_print_func_field(struct trace_seq *s, const char *fmt, 6396 - struct tep_event_format *event, const char *name, 6393 + struct tep_event *event, const char *name, 6397 6394 struct tep_record *record, int err) 6398 6395 { 6399 6396 struct tep_format_field *field = tep_find_field(event, name); ··· 6553 6550 return -1; 6554 6551 } 6555 6552 6556 - static struct tep_event_format *search_event(struct tep_handle *pevent, int id, 6557 - const char *sys_name, 6558 - const char *event_name) 6553 + static struct tep_event *search_event(struct tep_handle *pevent, int id, 6554 + const char *sys_name, 6555 + const char *event_name) 6559 6556 { 6560 - struct tep_event_format *event; 6557 + struct tep_event *event; 6561 6558 6562 6559 if (id >= 0) { 6563 6560 /* search by id */ ··· 6597 6594 const char *sys_name, const char *event_name, 6598 6595 tep_event_handler_func func, void *context) 6599 6596 { 6600 - struct tep_event_format *event; 6597 + struct tep_event *event; 6601 6598 struct event_handler *handle; 6602 6599 6603 6600 event = search_event(pevent, id, sys_name, event_name); ··· 6681 6678 const char *sys_name, const char *event_name, 6682 6679 tep_event_handler_func func, void *context) 6683 6680 { 6684 - struct tep_event_format *event; 6681 + struct tep_event *event; 6685 6682 struct event_handler *handle; 6686 6683 struct event_handler **next; 6687 6684 ··· 6733 6730 pevent->ref_count++; 6734 6731 } 6735 6732 6733 + int tep_get_ref(struct tep_handle *tep) 6734 + { 6735 + if (tep) 6736 + return tep->ref_count; 6737 + return 0; 6738 + } 6739 + 6736 6740 void tep_free_format_field(struct tep_format_field *field) 6737 6741 { 6738 6742 free(field->type); ··· 6766 6756 free_format_fields(format->fields); 6767 6757 } 6768 6758 6769 - void tep_free_format(struct tep_event_format *event) 6759 + void tep_free_event(struct tep_event *event) 6770 6760 { 6771 6761 free(event->name); 6772 6762 free(event->system); ··· 6852 6842 } 6853 6843 6854 6844 
for (i = 0; i < pevent->nr_events; i++) 6855 - tep_free_format(pevent->events[i]); 6845 + tep_free_event(pevent->events[i]); 6856 6846 6857 6847 while (pevent->handlers) { 6858 6848 handle = pevent->handlers;
+31 -46
tools/lib/traceevent/event-parse.h
··· 57 57 /* ----------------------- tep ----------------------- */ 58 58 59 59 struct tep_handle; 60 - struct tep_event_format; 60 + struct tep_event; 61 61 62 62 typedef int (*tep_event_handler_func)(struct trace_seq *s, 63 63 struct tep_record *record, 64 - struct tep_event_format *event, 64 + struct tep_event *event, 65 65 void *context); 66 66 67 67 typedef int (*tep_plugin_load_func)(struct tep_handle *pevent); ··· 143 143 144 144 struct tep_format_field { 145 145 struct tep_format_field *next; 146 - struct tep_event_format *event; 146 + struct tep_event *event; 147 147 char *type; 148 148 char *name; 149 149 char *alias; ··· 277 277 struct tep_print_arg *args; 278 278 }; 279 279 280 - struct tep_event_format { 280 + struct tep_event { 281 281 struct tep_handle *pevent; 282 282 char *name; 283 283 int id; ··· 409 409 typedef char *(tep_func_resolver_t)(void *priv, 410 410 unsigned long long *addrp, char **modp); 411 411 void tep_set_flag(struct tep_handle *tep, int flag); 412 - unsigned short __tep_data2host2(struct tep_handle *pevent, unsigned short data); 413 - unsigned int __tep_data2host4(struct tep_handle *pevent, unsigned int data); 414 - unsigned long long 415 - __tep_data2host8(struct tep_handle *pevent, unsigned long long data); 416 - 417 - #define tep_data2host2(pevent, ptr) __tep_data2host2(pevent, *(unsigned short *)(ptr)) 418 - #define tep_data2host4(pevent, ptr) __tep_data2host4(pevent, *(unsigned int *)(ptr)) 419 - #define tep_data2host8(pevent, ptr) \ 420 - ({ \ 421 - unsigned long long __val; \ 422 - \ 423 - memcpy(&__val, (ptr), sizeof(unsigned long long)); \ 424 - __tep_data2host8(pevent, __val); \ 425 - }) 426 412 427 413 static inline int tep_host_bigendian(void) 428 414 { ··· 440 454 int tep_pid_is_registered(struct tep_handle *pevent, int pid); 441 455 442 456 void tep_print_event_task(struct tep_handle *pevent, struct trace_seq *s, 443 - struct tep_event_format *event, 457 + struct tep_event *event, 444 458 struct tep_record *record); 
445 459 void tep_print_event_time(struct tep_handle *pevent, struct trace_seq *s, 446 - struct tep_event_format *event, 460 + struct tep_event *event, 447 461 struct tep_record *record, 448 462 bool use_trace_clock); 449 463 void tep_print_event_data(struct tep_handle *pevent, struct trace_seq *s, 450 - struct tep_event_format *event, 464 + struct tep_event *event, 451 465 struct tep_record *record); 452 466 void tep_print_event(struct tep_handle *pevent, struct trace_seq *s, 453 467 struct tep_record *record, bool use_trace_clock); ··· 458 472 enum tep_errno tep_parse_event(struct tep_handle *pevent, const char *buf, 459 473 unsigned long size, const char *sys); 460 474 enum tep_errno tep_parse_format(struct tep_handle *pevent, 461 - struct tep_event_format **eventp, 475 + struct tep_event **eventp, 462 476 const char *buf, 463 477 unsigned long size, const char *sys); 464 - void tep_free_format(struct tep_event_format *event); 465 - void tep_free_format_field(struct tep_format_field *field); 466 478 467 - void *tep_get_field_raw(struct trace_seq *s, struct tep_event_format *event, 479 + void *tep_get_field_raw(struct trace_seq *s, struct tep_event *event, 468 480 const char *name, struct tep_record *record, 469 481 int *len, int err); 470 482 471 - int tep_get_field_val(struct trace_seq *s, struct tep_event_format *event, 483 + int tep_get_field_val(struct trace_seq *s, struct tep_event *event, 472 484 const char *name, struct tep_record *record, 473 485 unsigned long long *val, int err); 474 - int tep_get_common_field_val(struct trace_seq *s, struct tep_event_format *event, 486 + int tep_get_common_field_val(struct trace_seq *s, struct tep_event *event, 475 487 const char *name, struct tep_record *record, 476 488 unsigned long long *val, int err); 477 - int tep_get_any_field_val(struct trace_seq *s, struct tep_event_format *event, 489 + int tep_get_any_field_val(struct trace_seq *s, struct tep_event *event, 478 490 const char *name, struct tep_record *record, 
479 491 unsigned long long *val, int err); 480 492 481 493 int tep_print_num_field(struct trace_seq *s, const char *fmt, 482 - struct tep_event_format *event, const char *name, 494 + struct tep_event *event, const char *name, 483 495 struct tep_record *record, int err); 484 496 485 497 int tep_print_func_field(struct trace_seq *s, const char *fmt, 486 - struct tep_event_format *event, const char *name, 498 + struct tep_event *event, const char *name, 487 499 struct tep_record *record, int err); 488 500 489 501 int tep_register_event_handler(struct tep_handle *pevent, int id, ··· 497 513 int tep_unregister_print_function(struct tep_handle *pevent, 498 514 tep_func_handler func, char *name); 499 515 500 - struct tep_format_field *tep_find_common_field(struct tep_event_format *event, const char *name); 501 - struct tep_format_field *tep_find_field(struct tep_event_format *event, const char *name); 502 - struct tep_format_field *tep_find_any_field(struct tep_event_format *event, const char *name); 516 + struct tep_format_field *tep_find_common_field(struct tep_event *event, const char *name); 517 + struct tep_format_field *tep_find_field(struct tep_event *event, const char *name); 518 + struct tep_format_field *tep_find_any_field(struct tep_event *event, const char *name); 503 519 504 520 const char *tep_find_function(struct tep_handle *pevent, unsigned long long addr); 505 521 unsigned long long ··· 508 524 int tep_read_number_field(struct tep_format_field *field, const void *data, 509 525 unsigned long long *value); 510 526 511 - struct tep_event_format *tep_get_first_event(struct tep_handle *tep); 527 + struct tep_event *tep_get_first_event(struct tep_handle *tep); 512 528 int tep_get_events_count(struct tep_handle *tep); 513 - struct tep_event_format *tep_find_event(struct tep_handle *pevent, int id); 529 + struct tep_event *tep_find_event(struct tep_handle *pevent, int id); 514 530 515 - struct tep_event_format * 531 + struct tep_event * 516 532 
tep_find_event_by_name(struct tep_handle *pevent, const char *sys, const char *name); 517 - struct tep_event_format * 533 + struct tep_event * 518 534 tep_find_event_by_record(struct tep_handle *pevent, struct tep_record *record); 519 535 520 536 void tep_data_lat_fmt(struct tep_handle *pevent, 521 537 struct trace_seq *s, struct tep_record *record); 522 538 int tep_data_type(struct tep_handle *pevent, struct tep_record *rec); 523 - struct tep_event_format *tep_data_event_from_type(struct tep_handle *pevent, int type); 539 + struct tep_event *tep_data_event_from_type(struct tep_handle *pevent, int type); 524 540 int tep_data_pid(struct tep_handle *pevent, struct tep_record *rec); 525 541 int tep_data_preempt_count(struct tep_handle *pevent, struct tep_record *rec); 526 542 int tep_data_flags(struct tep_handle *pevent, struct tep_record *rec); ··· 533 549 void tep_print_field(struct trace_seq *s, void *data, 534 550 struct tep_format_field *field); 535 551 void tep_print_fields(struct trace_seq *s, void *data, 536 - int size __maybe_unused, struct tep_event_format *event); 537 - void tep_event_info(struct trace_seq *s, struct tep_event_format *event, 538 - struct tep_record *record); 552 + int size __maybe_unused, struct tep_event *event); 553 + void tep_event_info(struct trace_seq *s, struct tep_event *event, 554 + struct tep_record *record); 539 555 int tep_strerror(struct tep_handle *pevent, enum tep_errno errnum, 540 - char *buf, size_t buflen); 556 + char *buf, size_t buflen); 541 557 542 - struct tep_event_format **tep_list_events(struct tep_handle *pevent, enum tep_event_sort_type); 543 - struct tep_format_field **tep_event_common_fields(struct tep_event_format *event); 544 - struct tep_format_field **tep_event_fields(struct tep_event_format *event); 558 + struct tep_event **tep_list_events(struct tep_handle *pevent, enum tep_event_sort_type); 559 + struct tep_format_field **tep_event_common_fields(struct tep_event *event);
560 + struct tep_format_field **tep_event_fields(struct tep_event *event); 545 561 546 562 enum tep_endian { 547 563 TEP_LITTLE_ENDIAN = 0, ··· 565 581 void tep_free(struct tep_handle *pevent); 566 582 void tep_ref(struct tep_handle *pevent); 567 583 void tep_unref(struct tep_handle *pevent); 584 + int tep_get_ref(struct tep_handle *tep); 568 585 569 586 /* access to the internal parser */ 570 587 void tep_buffer_init(const char *buf, unsigned long long size); ··· 697 712 698 713 struct tep_filter_type { 699 714 int event_id; 700 - struct tep_event_format *event; 715 + struct tep_event *event; 701 716 struct tep_filter_arg *filter; 702 717
+10
tools/lib/traceevent/libtraceevent.pc.template
··· 1 + prefix=INSTALL_PREFIX 2 + libdir=${prefix}/lib64 3 + includedir=${prefix}/include/traceevent 4 + 5 + Name: libtraceevent 6 + URL: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 7 + Description: Linux kernel trace event library 8 + Version: LIB_VERSION 9 + Cflags: -I${includedir} 10 + Libs: -L${libdir} -ltraceevent
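The new libtraceevent.pc.template is a pkg-config stub: INSTALL_PREFIX and LIB_VERSION are placeholders filled in when the library is installed. A hedged sketch of how that substitution and the resulting consumer workflow look — the sed invocation, the /usr/local prefix, the 1.0.0 version, and my_tool.c are all illustrative, not what the library's Makefile literally runs:

```shell
# Simulate the install-time substitution of the template placeholders
# (the real values come from the library's build system at install time).
printf 'prefix=INSTALL_PREFIX\nVersion: LIB_VERSION\n' > /tmp/demo.pc.template
sed -e 's|INSTALL_PREFIX|/usr/local|' -e 's|LIB_VERSION|1.0.0|' \
    /tmp/demo.pc.template > /tmp/demo.pc
cat /tmp/demo.pc

# Once the real file is installed as libtraceevent.pc, a consumer builds with:
#   cc $(pkg-config --cflags libtraceevent) -c my_tool.c
#   cc my_tool.o $(pkg-config --libs libtraceevent) -o my_tool
```

Shipping a .pc file means out-of-tree users no longer hard-code the include and library paths from the kernel tree.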
+21 -21
tools/lib/traceevent/parse-filter.c
··· 27 27 28 28 struct event_list { 29 29 struct event_list *next; 30 - struct tep_event_format *event; 30 + struct tep_event *event; 31 31 }; 32 32 33 33 static void show_error(char *error_buf, const char *fmt, ...) ··· 229 229 } 230 230 231 231 static int add_event(struct event_list **events, 232 - struct tep_event_format *event) 232 + struct tep_event *event) 233 233 { 234 234 struct event_list *list; 235 235 ··· 243 243 return 0; 244 244 } 245 245 246 - static int event_match(struct tep_event_format *event, 246 + static int event_match(struct tep_event *event, 247 247 regex_t *sreg, regex_t *ereg) 248 248 { 249 249 if (sreg) { ··· 259 259 find_event(struct tep_handle *pevent, struct event_list **events, 260 260 char *sys_name, char *event_name) 261 261 { 262 - struct tep_event_format *event; 262 + struct tep_event *event; 263 263 regex_t ereg; 264 264 regex_t sreg; 265 265 int match = 0; ··· 334 334 } 335 335 336 336 static enum tep_errno 337 - create_arg_item(struct tep_event_format *event, const char *token, 337 + create_arg_item(struct tep_event *event, const char *token, 338 338 enum tep_event_type type, struct tep_filter_arg **parg, char *error_str) 339 339 { 340 340 struct tep_format_field *field; ··· 940 940 } 941 941 942 942 static enum tep_errno 943 - process_filter(struct tep_event_format *event, struct tep_filter_arg **parg, 943 + process_filter(struct tep_event *event, struct tep_filter_arg **parg, 944 944 char *error_str, int not) 945 945 { 946 946 enum tep_event_type type; ··· 1180 1180 } 1181 1181 1182 1182 static enum tep_errno 1183 - process_event(struct tep_event_format *event, const char *filter_str, 1183 + process_event(struct tep_event *event, const char *filter_str, 1184 1184 struct tep_filter_arg **parg, char *error_str) 1185 1185 { 1186 1186 int ret; ··· 1205 1205 } 1206 1206 1207 1207 static enum tep_errno
1208 - filter_event(struct tep_event_filter *filter, struct tep_event_format *event, 1208 + filter_event(struct tep_event_filter *filter, struct tep_event *event, 1209 1209 const char *filter_str, char *error_str) 1210 1210 { 1211 1211 struct tep_filter_type *filter_type; ··· 1457 1457 struct tep_filter_type *filter_type) 1458 1458 { 1459 1459 struct tep_filter_arg *arg; 1460 - struct tep_event_format *event; 1460 + struct tep_event *event; 1461 1461 const char *sys; 1462 1462 const char *name; 1463 1463 char *str; ··· 1539 1539 { 1540 1540 struct tep_handle *src_pevent; 1541 1541 struct tep_handle *dest_pevent; 1542 - struct tep_event_format *event; 1542 + struct tep_event *event; 1543 1543 struct tep_filter_type *filter_type; 1544 1544 struct tep_filter_arg *arg; 1545 1545 char *str; ··· 1683 1683 } 1684 1684 } 1685 1685 1686 - static int test_filter(struct tep_event_format *event, struct tep_filter_arg *arg, 1686 + static int test_filter(struct tep_event *event, struct tep_filter_arg *arg, 1687 1687 struct tep_record *record, enum tep_errno *err); 1688 1688 1689 1689 static const char * 1690 - get_comm(struct tep_event_format *event, struct tep_record *record) 1690 + get_comm(struct tep_event *event, struct tep_record *record) 1691 1691 { 1692 1692 const char *comm; 1693 1693 int pid; ··· 1698 1698 } 1699 1699 1700 1700 static unsigned long long 1701 - get_value(struct tep_event_format *event, 1701 + get_value(struct tep_event *event, 1702 1702 struct tep_format_field *field, struct tep_record *record) 1703 1703 { 1704 1704 unsigned long long val; ··· 1734 1734 } 1735 1735 1736 1736 static unsigned long long 1737 - get_arg_value(struct tep_event_format *event, struct tep_filter_arg *arg, 1737 + get_arg_value(struct tep_event *event, struct tep_filter_arg *arg, 1738 1738 struct tep_record *record, enum tep_errno *err); 1739 1739 1740 1740 static unsigned long long 1741 - get_exp_value(struct tep_event_format *event, struct tep_filter_arg *arg, 1741 + get_exp_value(struct tep_event *event, struct tep_filter_arg *arg, 1742 1742 struct tep_record *record, enum tep_errno *err) 1743 1743 { 1744
1744 unsigned long long lval, rval; ··· 1793 1793 } 1794 1794 1795 1795 static unsigned long long 1796 - get_arg_value(struct tep_event_format *event, struct tep_filter_arg *arg, 1796 + get_arg_value(struct tep_event *event, struct tep_filter_arg *arg, 1797 1797 struct tep_record *record, enum tep_errno *err) 1798 1798 { 1799 1799 switch (arg->type) { ··· 1817 1817 return 0; 1818 1818 } 1819 1819 1820 - static int test_num(struct tep_event_format *event, struct tep_filter_arg *arg, 1820 + static int test_num(struct tep_event *event, struct tep_filter_arg *arg, 1821 1821 struct tep_record *record, enum tep_errno *err) 1822 1822 { 1823 1823 unsigned long long lval, rval; ··· 1860 1860 1861 1861 static const char *get_field_str(struct tep_filter_arg *arg, struct tep_record *record) 1862 1862 { 1863 - struct tep_event_format *event; 1863 + struct tep_event *event; 1864 1864 struct tep_handle *pevent; 1865 1865 unsigned long long addr; 1866 1866 const char *val = NULL; ··· 1908 1908 return val; 1909 1909 } 1910 1910 1911 - static int test_str(struct tep_event_format *event, struct tep_filter_arg *arg, 1911 + static int test_str(struct tep_event *event, struct tep_filter_arg *arg, 1912 1912 struct tep_record *record, enum tep_errno *err) 1913 1913 { 1914 1914 const char *val; ··· 1939 1939 } 1940 1940 } 1941 1941 1942 - static int test_op(struct tep_event_format *event, struct tep_filter_arg *arg, 1942 + static int test_op(struct tep_event *event, struct tep_filter_arg *arg, 1943 1943 struct tep_record *record, enum tep_errno *err) 1944 1944 { 1945 1945 switch (arg->op.type) { ··· 1961 1961 } 1962 1962 } 1963 1963 1964 - static int test_filter(struct tep_event_format *event, struct tep_filter_arg *arg, 1964 + static int test_filter(struct tep_event *event, struct tep_filter_arg *arg, 1965 1965 struct tep_record *record, enum tep_errno *err) 1966 1966 { 1967 1967 if (*err) {
+1 -1
tools/lib/traceevent/plugin_function.c
··· 124 124 } 125 125 126 126 static int function_handler(struct trace_seq *s, struct tep_record *record, 127 - struct tep_event_format *event, void *context) 127 + struct tep_event *event, void *context) 128 128 { 129 129 struct tep_handle *pevent = event->pevent; 130 130 unsigned long long function;
+2 -2
tools/lib/traceevent/plugin_hrtimer.c
··· 27 27 28 28 static int timer_expire_handler(struct trace_seq *s, 29 29 struct tep_record *record, 30 - struct tep_event_format *event, void *context) 30 + struct tep_event *event, void *context) 31 31 { 32 32 trace_seq_printf(s, "hrtimer="); 33 33 ··· 47 47 48 48 static int timer_start_handler(struct trace_seq *s, 49 49 struct tep_record *record, 50 - struct tep_event_format *event, void *context) 50 + struct tep_event *event, void *context) 51 51 { 52 52 trace_seq_printf(s, "hrtimer="); 53 53
+1 -1
tools/lib/traceevent/plugin_kmem.c
··· 25 25 #include "trace-seq.h" 26 26 27 27 static int call_site_handler(struct trace_seq *s, struct tep_record *record, 28 - struct tep_event_format *event, void *context) 28 + struct tep_event *event, void *context) 29 29 { 30 30 struct tep_format_field *field; 31 31 unsigned long long val, addr;
+8 -8
tools/lib/traceevent/plugin_kvm.c
··· 249 249 } 250 250 251 251 static int print_exit_reason(struct trace_seq *s, struct tep_record *record, 252 - struct tep_event_format *event, const char *field) 252 + struct tep_event *event, const char *field) 253 253 { 254 254 unsigned long long isa; 255 255 unsigned long long val; ··· 270 270 } 271 271 272 272 static int kvm_exit_handler(struct trace_seq *s, struct tep_record *record, 273 - struct tep_event_format *event, void *context) 273 + struct tep_event *event, void *context) 274 274 { 275 275 unsigned long long info1 = 0, info2 = 0; 276 276 ··· 293 293 294 294 static int kvm_emulate_insn_handler(struct trace_seq *s, 295 295 struct tep_record *record, 296 - struct tep_event_format *event, void *context) 296 + struct tep_event *event, void *context) 297 297 { 298 298 unsigned long long rip, csbase, len, flags, failed; 299 299 int llen; ··· 332 332 333 333 334 334 static int kvm_nested_vmexit_inject_handler(struct trace_seq *s, struct tep_record *record, 335 - struct tep_event_format *event, void *context) 335 + struct tep_event *event, void *context) 336 336 { 337 337 if (print_exit_reason(s, record, event, "exit_code") < 0) 338 338 return -1; ··· 346 346 } 347 347 348 348 static int kvm_nested_vmexit_handler(struct trace_seq *s, struct tep_record *record, 349 - struct tep_event_format *event, void *context) 349 + struct tep_event *event, void *context) 350 350 { 351 351 tep_print_num_field(s, "rip %llx ", event, "rip", record, 1); 352 352 ··· 372 372 }; 373 373 374 374 static int kvm_mmu_print_role(struct trace_seq *s, struct tep_record *record, 375 - struct tep_event_format *event, void *context) 375 + struct tep_event *event, void *context) 376 376 { 377 377 unsigned long long val; 378 378 static const char *access_str[] = { ··· 387 387 388 388 /* 389 389 * We can only use the structure if file is of the same 390 - * endianess. 390 + * endianness. 
391 391 */ 392 392 if (tep_is_file_bigendian(event->pevent) == 393 393 tep_is_host_bigendian(event->pevent)) { ··· 419 419 420 420 static int kvm_mmu_get_page_handler(struct trace_seq *s, 421 421 struct tep_record *record, 422 - struct tep_event_format *event, void *context) 422 + struct tep_event *event, void *context) 423 423 { 424 424 unsigned long long val; 425 425
+2 -2
tools/lib/traceevent/plugin_mac80211.c
··· 26 26 27 27 #define INDENT 65 28 28 29 - static void print_string(struct trace_seq *s, struct tep_event_format *event, 29 + static void print_string(struct trace_seq *s, struct tep_event *event, 30 30 const char *name, const void *data) 31 31 { 32 32 struct tep_format_field *f = tep_find_field(event, name); ··· 60 60 61 61 static int drv_bss_info_changed(struct trace_seq *s, 62 62 struct tep_record *record, 63 - struct tep_event_format *event, void *context) 63 + struct tep_event *event, void *context) 64 64 { 65 65 void *data = record->data; 66 66
+2 -2
tools/lib/traceevent/plugin_sched_switch.c
··· 67 67 68 68 static int sched_wakeup_handler(struct trace_seq *s, 69 69 struct tep_record *record, 70 - struct tep_event_format *event, void *context) 70 + struct tep_event *event, void *context) 71 71 { 72 72 struct tep_format_field *field; 73 73 unsigned long long val; ··· 96 96 97 97 static int sched_switch_handler(struct trace_seq *s, 98 98 struct tep_record *record, 99 - struct tep_event_format *event, void *context) 99 + struct tep_event *event, void *context) 100 100 { 101 101 struct tep_format_field *field; 102 102 unsigned long long val;
+6
tools/perf/Documentation/perf-config.txt
··· 199 199 Colors for headers in the output of a sub-commands (top, report). 200 200 Default values are 'white', 'blue'. 201 201 202 + core.*:: 203 + core.proc-map-timeout:: 204 + Sets a timeout (in milliseconds) for parsing /proc/<pid>/maps files. 205 + Can be overridden by the --proc-map-timeout option on supported 206 + subcommands. The default timeout is 500ms. 207 + 202 208 tui.*, gtk.*:: 203 209 Subcommands that can be configured here are 'top', 'report' and 'annotate'. 204 210 These values are booleans, for example:
+1 -1
tools/perf/Documentation/perf-list.txt
··· 172 172 Other PMUs and global measurements are normally root only. 173 173 Some event qualifiers, such as "any", are also root only. 174 174 175 - This can be overriden by setting the kernel.perf_event_paranoid 175 + This can be overridden by setting the kernel.perf_event_paranoid 176 176 sysctl to -1, which allows non root to use these events. 177 177 178 178 For accessing trace point events perf needs to have read access to
+5
tools/perf/Documentation/perf-record.txt
··· 435 435 --buildid-all:: 436 436 Record build-id of all DSOs regardless whether it's actually hit or not. 437 437 438 + --aio[=n]:: 439 + Use <n> control blocks in asynchronous (Posix AIO) trace writing mode (default: 1, max: 4). 440 + Asynchronous mode is supported only when linking Perf tool with libc library 441 + providing implementation for Posix AIO API. 442 + 438 443 --all-kernel:: 439 444 Configure all used events to run in kernel space. 440 445
+9 -1
tools/perf/Documentation/perf-report.txt
··· 126 126 And default sort keys are changed to comm, dso_from, symbol_from, dso_to 127 127 and symbol_to, see '--branch-stack'. 128 128 129 + When the sort key symbol is specified, columns "IPC" and "IPC Coverage" 130 + are enabled automatically. Column "IPC" reports the average IPC per function 131 + and column "IPC coverage" reports the percentage of instructions with 132 + sampled IPC in this function. IPC means Instructions Per Cycle. If it's low, 133 + it indicates there may be a performance bottleneck when the function is 134 + executed, such as a memory access bottleneck. If a function has high overhead 135 + and low IPC, it's worth further analyzing it to optimize its performance. 136 + 129 137 If the --mem-mode option is used, the following sort keys are also available 130 138 (incompatible with --branch-stack): 131 139 symbol_daddr, dso_daddr, locked, tlb, mem, snoop, dcacheline. ··· 252 244 Usually more convenient to use --branch-history for this. 253 245 254 246 value can be: 255 - - percent: diplay overhead percent (default) 247 + - percent: display overhead percent (default) 256 248 - period: display event period 257 249 - count: display event count 258 250
+1 -1
tools/perf/Documentation/perf-script.txt
··· 117 117 Comma separated list of fields to print. Options are: 118 118 comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff, 119 119 srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output, brstackinsn, 120 - brstackoff, callindent, insn, insnlen, synth, phys_addr, metric, misc. 120 + brstackoff, callindent, insn, insnlen, synth, phys_addr, metric, misc, srccode. 121 121 Field list can be prepended with the type, trace, sw or hw, 122 122 to indicate to which event type the field list applies. 123 123 e.g., -F sw:comm,tid,time,ip,sym and -F trace:time,cpu,trace
+2 -2
tools/perf/Documentation/perf-stat.txt
··· 50 50 /sys/bus/event_source/devices/<pmu>/format/* 51 51 52 52 Note that the last two syntaxes support prefix and glob matching in 53 - the PMU name to simplify creation of events accross multiple instances 53 + the PMU name to simplify creation of events across multiple instances 54 54 of the same type of PMU in large systems (e.g. memory controller PMUs). 55 55 Multiple PMU instances are typical for uncore PMUs, so the prefix 56 56 'uncore_' is also ignored when performing this match. ··· 277 277 for best results. Otherwise the bottlenecks may be inconsistent 278 278 on workload with changing phases. 279 279 280 - This enables --metric-only, unless overriden with --no-metric-only. 280 + This enables --metric-only, unless overridden with --no-metric-only. 281 281 282 282 To interpret the results it is usually needed to know on which 283 283 CPUs the workload runs on. If needed the CPUs can be forced using
+3
tools/perf/Documentation/perf-top.txt
··· 70 70 --ignore-vmlinux:: 71 71 Ignore vmlinux files. 72 72 73 + --kallsyms=<file>:: 74 + kallsyms pathname 75 + 73 76 -m <pages>:: 74 77 --mmap-pages=<pages>:: 75 78 Number of mmap data pages (must be a power of two) or size
+7 -1
tools/perf/Makefile.config
··· 365 365 CFLAGS += -DHAVE_GLIBC_SUPPORT 366 366 endif 367 367 368 + ifeq ($(feature-libaio), 1) 369 + ifndef NO_AIO 370 + CFLAGS += -DHAVE_AIO_SUPPORT 371 + endif 372 + endif 373 + 368 374 ifdef NO_DWARF 369 375 NO_LIBDW_DWARF_UNWIND := 1 370 376 endif ··· 594 588 595 589 ifndef NO_LIBCRYPTO 596 590 ifneq ($(feature-libcrypto), 1) 597 - msg := $(warning No libcrypto.h found, disables jitted code injection, please install libssl-devel or libssl-dev); 591 + msg := $(warning No libcrypto.h found, disables jitted code injection, please install openssl-devel or libssl-dev); 598 592 NO_LIBCRYPTO := 1 599 593 else 600 594 CFLAGS += -DHAVE_LIBCRYPTO_SUPPORT
+7 -2
tools/perf/Makefile.perf
··· 101 101 # Define LIBCLANGLLVM if you DO want builtin clang and llvm support. 102 102 # When selected, pass LLVM_CONFIG=/path/to/llvm-config to `make' if 103 103 # llvm-config is not in $PATH. 104 - 104 + # 105 105 # Define NO_CORESIGHT if you do not want support for CoreSight trace decoding. 106 + # 107 + # Define NO_AIO if you do not want support of Posix AIO based trace 108 + # streaming for record mode. Currently Posix AIO trace streaming is 109 + # supported only when linking with glibc. 110 + # 106 111 107 112 # As per kernel Makefile, avoid funny character set dependencies 108 113 unexport LC_ALL ··· 474 469 mmap_flags_array := $(beauty_outdir)/mmap_flags_array.c 475 470 mmap_flags_tbl := $(srctree)/tools/perf/trace/beauty/mmap_flags.sh 476 471 477 - $(mmap_flags_array): $(asm_generic_uapi_dir)/mman.h $(asm_generic_uapi_dir)/mman-common.h $(arch_asm_uapi_dir)/mman.h $(mmap_flags_tbl) 472 + $(mmap_flags_array): $(asm_generic_uapi_dir)/mman.h $(asm_generic_uapi_dir)/mman-common.h $(mmap_flags_tbl) 478 473 $(Q)$(SHELL) '$(mmap_flags_tbl)' $(asm_generic_uapi_dir) $(arch_asm_uapi_dir) > $@ 479 474 480 475 mount_flags_array := $(beauty_outdir)/mount_flags_array.c
+9
tools/perf/arch/arc/annotate/instructions.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <linux/compiler.h> 3 + 4 + static int arc__annotate_init(struct arch *arch, char *cpuid __maybe_unused) 5 + { 6 + arch->initialized = true; 7 + arch->objdump.comment_char = ';'; 8 + return 0; 9 + }
+20 -1
tools/perf/arch/common.c
··· 5 5 #include "../util/util.h" 6 6 #include "../util/debug.h" 7 7 8 + const char *const arc_triplets[] = { 9 + "arc-linux-", 10 + "arc-snps-linux-uclibc-", 11 + "arc-snps-linux-gnu-", 12 + NULL 13 + }; 14 + 8 15 const char *const arm_triplets[] = { 9 16 "arm-eabi-", 10 17 "arm-linux-androideabi-", ··· 154 147 zfree(&buf); 155 148 } 156 149 157 - if (!strcmp(arch, "arm")) 150 + if (!strcmp(arch, "arc")) 151 + path_list = arc_triplets; 152 + else if (!strcmp(arch, "arm")) 158 153 path_list = arm_triplets; 159 154 else if (!strcmp(arch, "arm64")) 160 155 path_list = arm64_triplets; ··· 208 199 return 0; 209 200 210 201 return perf_env__lookup_binutils_path(env, "objdump", path); 202 + } 203 + 204 + /* 205 + * Some architectures have a single address space for kernel and user addresses, 206 + * which makes it possible to determine if an address is in kernel space or user 207 + * space. 208 + */ 209 + bool perf_env__single_address_space(struct perf_env *env) 210 + { 211 + return strcmp(perf_env__arch(env), "sparc"); 211 212 }
+1
tools/perf/arch/common.h
··· 5 5 #include "../util/env.h" 6 6 7 7 int perf_env__lookup_objdump(struct perf_env *env, const char **path); 8 + bool perf_env__single_address_space(struct perf_env *env); 8 9 9 10 #endif /* ARCH_PERF_COMMON_H */
+1 -1
tools/perf/arch/x86/tests/insn-x86.c
··· 170 170 * 171 171 * If the test passes %0 is returned, otherwise %-1 is returned. Use the 172 172 * verbose (-v) option to see all the instructions and whether or not they 173 - * decoded successfuly. 173 + * decoded successfully. 174 174 */ 175 175 int test__insn_x86(struct test *test __maybe_unused, int subtest __maybe_unused) 176 176 {
+11
tools/perf/arch/x86/util/intel-pt.c
··· 524 524 struct perf_evsel *evsel) 525 525 { 526 526 int err; 527 + char c; 527 528 528 529 if (!evsel) 529 530 return 0; 531 + 532 + /* 533 + * If supported, force pass-through config term (pt=1) even if user 534 + * sets pt=0, which avoids senseless kernel errors. 535 + */ 536 + if (perf_pmu__scan_file(intel_pt_pmu, "format/pt", "%c", &c) == 1 && 537 + !(evsel->attr.config & 1)) { 538 + pr_warning("pt=0 doesn't make sense, forcing pt=1\n"); 539 + evsel->attr.config |= 1; 540 + } 530 541 531 542 err = intel_pt_val_config_term(intel_pt_pmu, "caps/cycle_thresholds", 532 543 "cyc_thresh", "caps/psb_cyc",
+1 -1
tools/perf/builtin-help.c
··· 189 189 while (*p) 190 190 p = &((*p)->next); 191 191 *p = zalloc(sizeof(**p) + len + 1); 192 - strncpy((*p)->name, name, len); 192 + strcpy((*p)->name, name); 193 193 } 194 194 195 195 static int supported_man_viewer(const char *name, size_t len)
+2 -4
tools/perf/builtin-kvm.c
··· 1364 1364 "show events other than" 1365 1365 " HLT (x86 only) or Wait state (s390 only)" 1366 1366 " that take longer than duration usecs"), 1367 - OPT_UINTEGER(0, "proc-map-timeout", &kvm->opts.proc_map_timeout, 1367 + OPT_UINTEGER(0, "proc-map-timeout", &proc_map_timeout, 1368 1368 "per thread proc mmap processing timeout in ms"), 1369 1369 OPT_END() 1370 1370 }; ··· 1394 1394 kvm->opts.target.uses_mmap = false; 1395 1395 kvm->opts.target.uid_str = NULL; 1396 1396 kvm->opts.target.uid = UINT_MAX; 1397 - kvm->opts.proc_map_timeout = 500; 1398 1397 1399 1398 symbol__init(NULL); 1400 1399 disable_buildid_cache(); ··· 1452 1453 perf_session__set_id_hdr_size(kvm->session); 1453 1454 ordered_events__set_copy_on_queue(&kvm->session->ordered_events, true); 1454 1455 machine__synthesize_threads(&kvm->session->machines.host, &kvm->opts.target, 1455 - kvm->evlist->threads, false, 1456 - kvm->opts.proc_map_timeout, 1); 1456 + kvm->evlist->threads, false, 1); 1457 1457 err = kvm_live_open_events(kvm); 1458 1458 if (err) 1459 1459 goto out;
+254 -9
tools/perf/builtin-record.c
··· 124 124 return 0; 125 125 } 126 126 127 + #ifdef HAVE_AIO_SUPPORT 128 + static int record__aio_write(struct aiocb *cblock, int trace_fd, 129 + void *buf, size_t size, off_t off) 130 + { 131 + int rc; 132 + 133 + cblock->aio_fildes = trace_fd; 134 + cblock->aio_buf = buf; 135 + cblock->aio_nbytes = size; 136 + cblock->aio_offset = off; 137 + cblock->aio_sigevent.sigev_notify = SIGEV_NONE; 138 + 139 + do { 140 + rc = aio_write(cblock); 141 + if (rc == 0) { 142 + break; 143 + } else if (errno != EAGAIN) { 144 + cblock->aio_fildes = -1; 145 + pr_err("failed to queue perf data, error: %m\n"); 146 + break; 147 + } 148 + } while (1); 149 + 150 + return rc; 151 + } 152 + 153 + static int record__aio_complete(struct perf_mmap *md, struct aiocb *cblock) 154 + { 155 + void *rem_buf; 156 + off_t rem_off; 157 + size_t rem_size; 158 + int rc, aio_errno; 159 + ssize_t aio_ret, written; 160 + 161 + aio_errno = aio_error(cblock); 162 + if (aio_errno == EINPROGRESS) 163 + return 0; 164 + 165 + written = aio_ret = aio_return(cblock); 166 + if (aio_ret < 0) { 167 + if (aio_errno != EINTR) 168 + pr_err("failed to write perf data, error: %m\n"); 169 + written = 0; 170 + } 171 + 172 + rem_size = cblock->aio_nbytes - written; 173 + 174 + if (rem_size == 0) { 175 + cblock->aio_fildes = -1; 176 + /* 177 + * md->refcount is incremented in perf_mmap__push() for 178 + * every enqueued aio write request so decrement it because 179 + * the request is now complete. 180 + */ 181 + perf_mmap__put(md); 182 + rc = 1; 183 + } else { 184 + /* 185 + * aio write request may require restart with the 186 + * reminder if the kernel didn't write whole 187 + * chunk at once. 
188 + */ 189 + rem_off = cblock->aio_offset + written; 190 + rem_buf = (void *)(cblock->aio_buf + written); 191 + record__aio_write(cblock, cblock->aio_fildes, 192 + rem_buf, rem_size, rem_off); 193 + rc = 0; 194 + } 195 + 196 + return rc; 197 + } 198 + 199 + static int record__aio_sync(struct perf_mmap *md, bool sync_all) 200 + { 201 + struct aiocb **aiocb = md->aio.aiocb; 202 + struct aiocb *cblocks = md->aio.cblocks; 203 + struct timespec timeout = { 0, 1000 * 1000 * 1 }; /* 1ms */ 204 + int i, do_suspend; 205 + 206 + do { 207 + do_suspend = 0; 208 + for (i = 0; i < md->aio.nr_cblocks; ++i) { 209 + if (cblocks[i].aio_fildes == -1 || record__aio_complete(md, &cblocks[i])) { 210 + if (sync_all) 211 + aiocb[i] = NULL; 212 + else 213 + return i; 214 + } else { 215 + /* 216 + * Started aio write is not complete yet 217 + * so it has to be waited before the 218 + * next allocation. 219 + */ 220 + aiocb[i] = &cblocks[i]; 221 + do_suspend = 1; 222 + } 223 + } 224 + if (!do_suspend) 225 + return -1; 226 + 227 + while (aio_suspend((const struct aiocb **)aiocb, md->aio.nr_cblocks, &timeout)) { 228 + if (!(errno == EAGAIN || errno == EINTR)) 229 + pr_err("failed to sync perf data, error: %m\n"); 230 + } 231 + } while (1); 232 + } 233 + 234 + static int record__aio_pushfn(void *to, struct aiocb *cblock, void *bf, size_t size, off_t off) 235 + { 236 + struct record *rec = to; 237 + int ret, trace_fd = rec->session->data->file.fd; 238 + 239 + rec->samples++; 240 + 241 + ret = record__aio_write(cblock, trace_fd, bf, size, off); 242 + if (!ret) { 243 + rec->bytes_written += size; 244 + if (switch_output_size(rec)) 245 + trigger_hit(&switch_output_trigger); 246 + } 247 + 248 + return ret; 249 + } 250 + 251 + static off_t record__aio_get_pos(int trace_fd) 252 + { 253 + return lseek(trace_fd, 0, SEEK_CUR); 254 + } 255 + 256 + static void record__aio_set_pos(int trace_fd, off_t pos) 257 + { 258 + lseek(trace_fd, pos, SEEK_SET); 259 + } 260 + 261 + static void 
record__aio_mmap_read_sync(struct record *rec) 262 + { 263 + int i; 264 + struct perf_evlist *evlist = rec->evlist; 265 + struct perf_mmap *maps = evlist->mmap; 266 + 267 + if (!rec->opts.nr_cblocks) 268 + return; 269 + 270 + for (i = 0; i < evlist->nr_mmaps; i++) { 271 + struct perf_mmap *map = &maps[i]; 272 + 273 + if (map->base) 274 + record__aio_sync(map, true); 275 + } 276 + } 277 + 278 + static int nr_cblocks_default = 1; 279 + static int nr_cblocks_max = 4; 280 + 281 + static int record__aio_parse(const struct option *opt, 282 + const char *str, 283 + int unset) 284 + { 285 + struct record_opts *opts = (struct record_opts *)opt->value; 286 + 287 + if (unset) { 288 + opts->nr_cblocks = 0; 289 + } else { 290 + if (str) 291 + opts->nr_cblocks = strtol(str, NULL, 0); 292 + if (!opts->nr_cblocks) 293 + opts->nr_cblocks = nr_cblocks_default; 294 + } 295 + 296 + return 0; 297 + } 298 + #else /* HAVE_AIO_SUPPORT */ 299 + static int nr_cblocks_max = 0; 300 + 301 + static int record__aio_sync(struct perf_mmap *md __maybe_unused, bool sync_all __maybe_unused) 302 + { 303 + return -1; 304 + } 305 + 306 + static int record__aio_pushfn(void *to __maybe_unused, struct aiocb *cblock __maybe_unused, 307 + void *bf __maybe_unused, size_t size __maybe_unused, off_t off __maybe_unused) 308 + { 309 + return -1; 310 + } 311 + 312 + static off_t record__aio_get_pos(int trace_fd __maybe_unused) 313 + { 314 + return -1; 315 + } 316 + 317 + static void record__aio_set_pos(int trace_fd __maybe_unused, off_t pos __maybe_unused) 318 + { 319 + } 320 + 321 + static void record__aio_mmap_read_sync(struct record *rec __maybe_unused) 322 + { 323 + } 324 + #endif 325 + 326 + static int record__aio_enabled(struct record *rec) 327 + { 328 + return rec->opts.nr_cblocks > 0; 329 + } 330 + 127 331 static int process_synthesized_event(struct perf_tool *tool, 128 332 union perf_event *event, 129 333 struct perf_sample *sample __maybe_unused, ··· 533 329 534 330 if (perf_evlist__mmap_ex(evlist, 
opts->mmap_pages, 535 331 opts->auxtrace_mmap_pages, 536 - opts->auxtrace_snapshot_mode) < 0) { 332 + opts->auxtrace_snapshot_mode, opts->nr_cblocks) < 0) { 537 333 if (errno == EPERM) { 538 334 pr_err("Permission error mapping pages.\n" 539 335 "Consider increasing " ··· 729 525 int i; 730 526 int rc = 0; 731 527 struct perf_mmap *maps; 528 + int trace_fd = rec->data.file.fd; 529 + off_t off; 732 530 733 531 if (!evlist) 734 532 return 0; ··· 742 536 if (overwrite && evlist->bkw_mmap_state != BKW_MMAP_DATA_PENDING) 743 537 return 0; 744 538 539 + if (record__aio_enabled(rec)) 540 + off = record__aio_get_pos(trace_fd); 541 + 745 542 for (i = 0; i < evlist->nr_mmaps; i++) { 746 543 struct perf_mmap *map = &maps[i]; 747 544 748 545 if (map->base) { 749 - if (perf_mmap__push(map, rec, record__pushfn) != 0) { 750 - rc = -1; 751 - goto out; 546 + if (!record__aio_enabled(rec)) { 547 + if (perf_mmap__push(map, rec, record__pushfn) != 0) { 548 + rc = -1; 549 + goto out; 550 + } 551 + } else { 552 + int idx; 553 + /* 554 + * Call record__aio_sync() to wait till map->data buffer 555 + * becomes available after previous aio write request. 
556 + */ 557 + idx = record__aio_sync(map, false); 558 + if (perf_mmap__aio_push(map, rec, idx, record__aio_pushfn, &off) != 0) { 559 + record__aio_set_pos(trace_fd, off); 560 + rc = -1; 561 + goto out; 562 + } 752 563 } 753 564 } 754 565 ··· 775 552 goto out; 776 553 } 777 554 } 555 + 556 + if (record__aio_enabled(rec)) 557 + record__aio_set_pos(trace_fd, off); 778 558 779 559 /* 780 560 * Mark the round finished in case we wrote ··· 867 641 err = perf_event__synthesize_thread_map(&rec->tool, thread_map, 868 642 process_synthesized_event, 869 643 &rec->session->machines.host, 870 - rec->opts.sample_address, 871 - rec->opts.proc_map_timeout); 644 + rec->opts.sample_address); 872 645 thread_map__put(thread_map); 873 646 return err; 874 647 } ··· 882 657 883 658 /* Same Size: "2015122520103046"*/ 884 659 char timestamp[] = "InvalidTimestamp"; 660 + 661 + record__aio_mmap_read_sync(rec); 885 662 886 663 record__synthesize(rec, true); 887 664 if (target__none(&rec->opts.target)) ··· 1084 857 1085 858 err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->threads, 1086 859 process_synthesized_event, opts->sample_address, 1087 - opts->proc_map_timeout, 1); 860 + 1); 1088 861 out: 1089 862 return err; 1090 863 } ··· 1395 1168 record__synthesize_workload(rec, true); 1396 1169 1397 1170 out_child: 1171 + record__aio_mmap_read_sync(rec); 1172 + 1398 1173 if (forks) { 1399 1174 int exit_status; 1400 1175 ··· 1530 1301 var = "call-graph.record-mode"; 1531 1302 return perf_default_config(var, value, cb); 1532 1303 } 1304 + #ifdef HAVE_AIO_SUPPORT 1305 + if (!strcmp(var, "record.aio")) { 1306 + rec->opts.nr_cblocks = strtol(value, NULL, 0); 1307 + if (!rec->opts.nr_cblocks) 1308 + rec->opts.nr_cblocks = nr_cblocks_default; 1309 + } 1310 + #endif 1533 1311 1534 1312 return 0; 1535 1313 } ··· 1782 1546 .uses_mmap = true, 1783 1547 .default_per_cpu = true, 1784 1548 }, 1785 - .proc_map_timeout = 500, 1786 1549 }, 1787 1550 .tool = { 1788 1551 .sample = 
process_sample_event, ··· 1911 1676 parse_clockid), 1912 1677 OPT_STRING_OPTARG('S', "snapshot", &record.opts.auxtrace_snapshot_opts, 1913 1678 "opts", "AUX area tracing Snapshot Mode", ""), 1914 - OPT_UINTEGER(0, "proc-map-timeout", &record.opts.proc_map_timeout, 1679 + OPT_UINTEGER(0, "proc-map-timeout", &proc_map_timeout, 1915 1680 "per thread proc mmap processing timeout in ms"), 1916 1681 OPT_BOOLEAN(0, "namespaces", &record.opts.record_namespaces, 1917 1682 "Record namespaces events"), ··· 1941 1706 "signal"), 1942 1707 OPT_BOOLEAN(0, "dry-run", &dry_run, 1943 1708 "Parse options then exit"), 1709 + #ifdef HAVE_AIO_SUPPORT 1710 + OPT_CALLBACK_OPTARG(0, "aio", &record.opts, 1711 + &nr_cblocks_default, "n", "Use <n> control blocks in asynchronous trace writing mode (default: 1, max: 4)", 1712 + record__aio_parse), 1713 + #endif 1944 1714 OPT_END() 1945 1715 }; 1946 1716 ··· 2137 1897 err = -EINVAL; 2138 1898 goto out; 2139 1899 } 1900 + 1901 + if (rec->opts.nr_cblocks > nr_cblocks_max) 1902 + rec->opts.nr_cblocks = nr_cblocks_max; 1903 + if (verbose > 0) 1904 + pr_info("nr_cblocks: %d\n", rec->opts.nr_cblocks); 2140 1905 2141 1906 err = __cmd_record(&record, argc, argv); 2142 1907 out:
+23 -3
tools/perf/builtin-report.c
··· 85 85 int socket_filter; 86 86 DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS); 87 87 struct branch_type_stat brtype_stat; 88 + bool symbol_ipc; 88 89 }; 89 90 90 91 static int report__config(const char *var, const char *value, void *cb) ··· 130 129 struct mem_info *mi; 131 130 struct branch_info *bi; 132 131 133 - if (!ui__has_annotation()) 132 + if (!ui__has_annotation() && !rep->symbol_ipc) 134 133 return 0; 135 134 136 135 hist__account_cycles(sample->branch_stack, al, sample, ··· 175 174 struct perf_evsel *evsel = iter->evsel; 176 175 int err; 177 176 178 - if (!ui__has_annotation()) 177 + if (!ui__has_annotation() && !rep->symbol_ipc) 179 178 return 0; 180 179 181 180 hist__account_cycles(sample->branch_stack, al, sample, ··· 1134 1133 .mode = PERF_DATA_MODE_READ, 1135 1134 }; 1136 1135 int ret = hists__init(); 1136 + char sort_tmp[128]; 1137 1137 1138 1138 if (ret < 0) 1139 1139 return ret; ··· 1286 1284 else 1287 1285 use_browser = 0; 1288 1286 1287 + if (sort_order && strstr(sort_order, "ipc")) { 1288 + parse_options_usage(report_usage, options, "s", 1); 1289 + goto error; 1290 + } 1291 + 1292 + if (sort_order && strstr(sort_order, "symbol")) { 1293 + if (sort__mode == SORT_MODE__BRANCH) { 1294 + snprintf(sort_tmp, sizeof(sort_tmp), "%s,%s", 1295 + sort_order, "ipc_lbr"); 1296 + report.symbol_ipc = true; 1297 + } else { 1298 + snprintf(sort_tmp, sizeof(sort_tmp), "%s,%s", 1299 + sort_order, "ipc_null"); 1300 + } 1301 + 1302 + sort_order = sort_tmp; 1303 + } 1304 + 1289 1305 if (setup_sorting(session->evlist) < 0) { 1290 1306 if (sort_order) 1291 1307 parse_options_usage(report_usage, options, "s", 1); ··· 1331 1311 * so don't allocate extra space that won't be used in the stdio 1332 1312 * implementation. 1333 1313 */ 1334 - if (ui__has_annotation()) { 1314 + if (ui__has_annotation() || report.symbol_ipc) { 1335 1315 ret = symbol__annotation_init(); 1336 1316 if (ret < 0) 1337 1317 goto error;
+51 -8
tools/perf/builtin-script.c
··· 96 96 PERF_OUTPUT_UREGS = 1U << 27, 97 97 PERF_OUTPUT_METRIC = 1U << 28, 98 98 PERF_OUTPUT_MISC = 1U << 29, 99 + PERF_OUTPUT_SRCCODE = 1U << 30, 99 100 }; 100 101 101 102 struct output_option { ··· 133 132 {.str = "phys_addr", .field = PERF_OUTPUT_PHYS_ADDR}, 134 133 {.str = "metric", .field = PERF_OUTPUT_METRIC}, 135 134 {.str = "misc", .field = PERF_OUTPUT_MISC}, 135 + {.str = "srccode", .field = PERF_OUTPUT_SRCCODE}, 136 136 }; 137 137 138 138 enum { ··· 426 424 pr_err("Display of DSO requested but no address to convert.\n"); 427 425 return -EINVAL; 428 426 } 429 - if (PRINT_FIELD(SRCLINE) && !PRINT_FIELD(IP)) { 427 + if ((PRINT_FIELD(SRCLINE) || PRINT_FIELD(SRCCODE)) && !PRINT_FIELD(IP)) { 430 428 pr_err("Display of source line number requested but sample IP is not\n" 431 429 "selected. Hence, no address to lookup the source line number.\n"); 432 430 return -EINVAL; ··· 726 724 if (PRINT_FIELD(DSO)) { 727 725 memset(&alf, 0, sizeof(alf)); 728 726 memset(&alt, 0, sizeof(alt)); 729 - thread__find_map(thread, sample->cpumode, from, &alf); 730 - thread__find_map(thread, sample->cpumode, to, &alt); 727 + thread__find_map_fb(thread, sample->cpumode, from, &alf); 728 + thread__find_map_fb(thread, sample->cpumode, to, &alt); 731 729 } 732 730 733 731 printed += fprintf(fp, " 0x%"PRIx64, from); ··· 773 771 from = br->entries[i].from; 774 772 to = br->entries[i].to; 775 773 776 - thread__find_symbol(thread, sample->cpumode, from, &alf); 777 - thread__find_symbol(thread, sample->cpumode, to, &alt); 774 + thread__find_symbol_fb(thread, sample->cpumode, from, &alf); 775 + thread__find_symbol_fb(thread, sample->cpumode, to, &alt); 778 776 779 777 printed += symbol__fprintf_symname_offs(alf.sym, &alf, fp); 780 778 if (PRINT_FIELD(DSO)) { ··· 818 816 from = br->entries[i].from; 819 817 to = br->entries[i].to; 820 818 821 - if (thread__find_map(thread, sample->cpumode, from, &alf) && 819 + if (thread__find_map_fb(thread, sample->cpumode, from, &alf) && 822 820 
!alf.map->dso->adjust_symbols) 823 821 from = map__map_ip(alf.map, from); 824 822 825 - if (thread__find_map(thread, sample->cpumode, to, &alt) && 823 + if (thread__find_map_fb(thread, sample->cpumode, to, &alt) && 826 824 !alt.map->dso->adjust_symbols) 827 825 to = map__map_ip(alt.map, to); 828 826 ··· 907 905 pr_debug("\tcannot fetch code for block at %" PRIx64 "-%" PRIx64 "\n", 908 906 start, end); 909 907 return len; 908 + } 909 + 910 + static int print_srccode(struct thread *thread, u8 cpumode, uint64_t addr) 911 + { 912 + struct addr_location al; 913 + int ret = 0; 914 + 915 + memset(&al, 0, sizeof(al)); 916 + thread__find_map(thread, cpumode, addr, &al); 917 + if (!al.map) 918 + return 0; 919 + ret = map__fprintf_srccode(al.map, al.addr, stdout, 920 + &thread->srccode_state); 921 + if (ret) 922 + ret += printf("\n"); 923 + return ret; 910 924 } 911 925 912 926 static int ip__fprintf_jump(uint64_t ip, struct branch_entry *en, ··· 1016 998 x.cpumode, x.cpu, &lastsym, attr, fp); 1017 999 printed += ip__fprintf_jump(br->entries[nr - 1].from, &br->entries[nr - 1], 1018 1000 &x, buffer, len, 0, fp, &total_cycles); 1001 + if (PRINT_FIELD(SRCCODE)) 1002 + printed += print_srccode(thread, x.cpumode, br->entries[nr - 1].from); 1019 1003 } 1020 1004 1021 1005 /* Print all blocks */ ··· 1047 1027 if (ip == end) { 1048 1028 printed += ip__fprintf_jump(ip, &br->entries[i], &x, buffer + off, len - off, insn, fp, 1049 1029 &total_cycles); 1030 + if (PRINT_FIELD(SRCCODE)) 1031 + printed += print_srccode(thread, x.cpumode, ip); 1050 1032 break; 1051 1033 } else { 1052 1034 printed += fprintf(fp, "\t%016" PRIx64 "\t%s\n", ip, 1053 1035 dump_insn(&x, ip, buffer + off, len - off, &ilen)); 1054 1036 if (ilen == 0) 1055 1037 break; 1038 + if (PRINT_FIELD(SRCCODE)) 1039 + print_srccode(thread, x.cpumode, ip); 1056 1040 insn++; 1057 1041 } 1058 1042 } ··· 1087 1063 1088 1064 printed += fprintf(fp, "\t%016" PRIx64 "\t%s\n", sample->ip, 1089 1065 dump_insn(&x, sample->ip, buffer, len, 
NULL)); 1066 + if (PRINT_FIELD(SRCCODE)) 1067 + print_srccode(thread, x.cpumode, sample->ip); 1090 1068 goto out; 1091 1069 } 1092 1070 for (off = 0; off <= end - start; off += ilen) { ··· 1096 1070 dump_insn(&x, start + off, buffer + off, len - off, &ilen)); 1097 1071 if (ilen == 0) 1098 1072 break; 1073 + if (PRINT_FIELD(SRCCODE)) 1074 + print_srccode(thread, x.cpumode, start + off); 1099 1075 } 1100 1076 out: 1101 1077 return printed; ··· 1280 1252 printed += map__fprintf_srcline(al->map, al->addr, "\n ", fp); 1281 1253 1282 1254 printed += perf_sample__fprintf_insn(sample, attr, thread, machine, fp); 1283 - return printed + fprintf(fp, "\n"); 1255 + printed += fprintf(fp, "\n"); 1256 + if (PRINT_FIELD(SRCCODE)) { 1257 + int ret = map__fprintf_srccode(al->map, al->addr, stdout, 1258 + &thread->srccode_state); 1259 + if (ret) { 1260 + printed += ret; 1261 + printed += printf("\n"); 1262 + } 1263 + } 1264 + return printed; 1284 1265 } 1285 1266 1286 1267 static struct { ··· 1828 1791 if (PRINT_FIELD(PHYS_ADDR)) 1829 1792 fprintf(fp, "%16" PRIx64, sample->phys_addr); 1830 1793 fprintf(fp, "\n"); 1794 + 1795 + if (PRINT_FIELD(SRCCODE)) { 1796 + if (map__fprintf_srccode(al->map, al->addr, stdout, 1797 + &thread->srccode_state)) 1798 + printf("\n"); 1799 + } 1831 1800 1832 1801 if (PRINT_FIELD(METRIC)) 1833 1802 perf_sample__fprint_metric(script, thread, evsel, sample, fp);
+212 -77
tools/perf/builtin-top.c
··· 46 46 #include "arch/common.h" 47 47 48 48 #include "util/debug.h" 49 + #include "util/ordered-events.h" 49 50 50 51 #include <assert.h> 51 52 #include <elf.h> ··· 272 271 273 272 perf_top__header_snprintf(top, bf, sizeof(bf)); 274 273 printf("%s\n", bf); 275 - 276 - perf_top__reset_sample_counters(top); 277 274 278 275 printf("%-*.*s\n", win_width, win_width, graph_dotted_line); 279 276 ··· 552 553 struct perf_evsel *evsel = t->sym_evsel; 553 554 struct hists *hists; 554 555 555 - perf_top__reset_sample_counters(t); 556 - 557 556 if (t->evlist->selected != NULL) 558 557 t->sym_evsel = t->evlist->selected; 559 558 ··· 568 571 569 572 hists__collapse_resort(hists, NULL); 570 573 perf_evsel__output_resort(evsel, NULL); 574 + 575 + if (t->lost || t->drop) 576 + pr_warning("Too slow to read ring buffer (change period (-c/-F) or limit CPUs (-C)\n"); 577 + } 578 + 579 + static void stop_top(void) 580 + { 581 + session_done = 1; 582 + done = 1; 571 583 } 572 584 573 585 static void *display_thread_tui(void *arg) ··· 601 595 602 596 /* 603 597 * Initialize the uid_filter_str, in the future the TUI will allow 604 - * Zooming in/out UIDs. For now juse use whatever the user passed 598 + * Zooming in/out UIDs. For now just use whatever the user passed 605 599 * via --uid. 
606 600 */ 607 601 evlist__for_each_entry(top->evlist, pos) { ··· 615 609 !top->record_opts.overwrite, 616 610 &top->annotation_opts); 617 611 618 - done = 1; 612 + stop_top(); 619 613 return NULL; 620 614 } 621 615 622 616 static void display_sig(int sig __maybe_unused) 623 617 { 624 - done = 1; 618 + stop_top(); 625 619 } 626 620 627 621 static void display_setup_sig(void) ··· 674 668 675 669 if (perf_top__handle_keypress(top, c)) 676 670 goto repeat; 677 - done = 1; 671 + stop_top(); 678 672 } 679 673 } 680 674 ··· 806 800 addr_location__put(&al); 807 801 } 808 802 803 + static void 804 + perf_top__process_lost(struct perf_top *top, union perf_event *event, 805 + struct perf_evsel *evsel) 806 + { 807 + struct hists *hists = evsel__hists(evsel); 808 + 809 + top->lost += event->lost.lost; 810 + top->lost_total += event->lost.lost; 811 + hists->stats.total_lost += event->lost.lost; 812 + } 813 + 814 + static void 815 + perf_top__process_lost_samples(struct perf_top *top, 816 + union perf_event *event, 817 + struct perf_evsel *evsel) 818 + { 819 + struct hists *hists = evsel__hists(evsel); 820 + 821 + top->lost += event->lost_samples.lost; 822 + top->lost_total += event->lost_samples.lost; 823 + hists->stats.total_lost_samples += event->lost_samples.lost; 824 + } 825 + 826 + static u64 last_timestamp; 827 + 809 828 static void perf_top__mmap_read_idx(struct perf_top *top, int idx) 810 829 { 811 830 struct record_opts *opts = &top->record_opts; 812 831 struct perf_evlist *evlist = top->evlist; 813 - struct perf_sample sample; 814 - struct perf_evsel *evsel; 815 832 struct perf_mmap *md; 816 - struct perf_session *session = top->session; 817 833 union perf_event *event; 818 - struct machine *machine; 819 - int ret; 820 834 821 835 md = opts->overwrite ? 
&evlist->overwrite_mmap[idx] : &evlist->mmap[idx]; 822 836 if (perf_mmap__read_init(md) < 0) 823 837 return; 824 838 825 839 while ((event = perf_mmap__read_event(md)) != NULL) { 826 - ret = perf_evlist__parse_sample(evlist, event, &sample); 827 - if (ret) { 828 - pr_err("Can't parse sample, err = %d\n", ret); 829 - goto next_event; 830 - } 840 + int ret; 831 841 832 - evsel = perf_evlist__id2evsel(session->evlist, sample.id); 833 - assert(evsel != NULL); 834 - 835 - if (event->header.type == PERF_RECORD_SAMPLE) 836 - ++top->samples; 837 - 838 - switch (sample.cpumode) { 839 - case PERF_RECORD_MISC_USER: 840 - ++top->us_samples; 841 - if (top->hide_user_symbols) 842 - goto next_event; 843 - machine = &session->machines.host; 842 + ret = perf_evlist__parse_sample_timestamp(evlist, event, &last_timestamp); 843 + if (ret && ret != -1) 844 844 break; 845 - case PERF_RECORD_MISC_KERNEL: 846 - ++top->kernel_samples; 847 - if (top->hide_kernel_symbols) 848 - goto next_event; 849 - machine = &session->machines.host; 850 - break; 851 - case PERF_RECORD_MISC_GUEST_KERNEL: 852 - ++top->guest_kernel_samples; 853 - machine = perf_session__find_machine(session, 854 - sample.pid); 855 - break; 856 - case PERF_RECORD_MISC_GUEST_USER: 857 - ++top->guest_us_samples; 858 - /* 859 - * TODO: we don't process guest user from host side 860 - * except simple counting. 
861 - */ 862 - goto next_event; 863 - default: 864 - if (event->header.type == PERF_RECORD_SAMPLE) 865 - goto next_event; 866 - machine = &session->machines.host; 867 - break; 868 - } 869 845 846 + ret = ordered_events__queue(top->qe.in, event, last_timestamp, 0); 847 + if (ret) 848 + break; 870 849 871 - if (event->header.type == PERF_RECORD_SAMPLE) { 872 - perf_event__process_sample(&top->tool, event, evsel, 873 - &sample, machine); 874 - } else if (event->header.type < PERF_RECORD_MAX) { 875 - hists__inc_nr_events(evsel__hists(evsel), event->header.type); 876 - machine__process_event(machine, event, &sample); 877 - } else 878 - ++session->evlist->stats.nr_unknown_events; 879 - next_event: 880 850 perf_mmap__consume(md); 851 + 852 + if (top->qe.rotate) { 853 + pthread_mutex_lock(&top->qe.mutex); 854 + top->qe.rotate = false; 855 + pthread_cond_signal(&top->qe.cond); 856 + pthread_mutex_unlock(&top->qe.mutex); 857 + } 881 858 } 882 859 883 860 perf_mmap__read_done(md); ··· 870 881 { 871 882 bool overwrite = top->record_opts.overwrite; 872 883 struct perf_evlist *evlist = top->evlist; 873 - unsigned long long start, end; 874 884 int i; 875 885 876 - start = rdclock(); 877 886 if (overwrite) 878 887 perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_DATA_PENDING); 879 888 ··· 882 895 perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_EMPTY); 883 896 perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_RUNNING); 884 897 } 885 - end = rdclock(); 886 - 887 - if ((end - start) > (unsigned long long)top->delay_secs * NSEC_PER_SEC) 888 - ui__warning("Too slow to read ring buffer.\n" 889 - "Please try increasing the period (-c) or\n" 890 - "decreasing the freq (-F) or\n" 891 - "limiting the number of CPUs (-C)\n"); 892 898 } 893 899 894 900 /* ··· 1043 1063 return 0; 1044 1064 } 1045 1065 1066 + static struct ordered_events *rotate_queues(struct perf_top *top) 1067 + { 1068 + struct ordered_events *in = top->qe.in; 1069 + 1070 + if (top->qe.in == &top->qe.data[1]) 1071 + top->qe.in = 
&top->qe.data[0]; 1072 + else 1073 + top->qe.in = &top->qe.data[1]; 1074 + 1075 + return in; 1076 + } 1077 + 1078 + static void *process_thread(void *arg) 1079 + { 1080 + struct perf_top *top = arg; 1081 + 1082 + while (!done) { 1083 + struct ordered_events *out, *in = top->qe.in; 1084 + 1085 + if (!in->nr_events) { 1086 + usleep(100); 1087 + continue; 1088 + } 1089 + 1090 + out = rotate_queues(top); 1091 + 1092 + pthread_mutex_lock(&top->qe.mutex); 1093 + top->qe.rotate = true; 1094 + pthread_cond_wait(&top->qe.cond, &top->qe.mutex); 1095 + pthread_mutex_unlock(&top->qe.mutex); 1096 + 1097 + if (ordered_events__flush(out, OE_FLUSH__TOP)) 1098 + pr_err("failed to process events\n"); 1099 + } 1100 + 1101 + return NULL; 1102 + } 1103 + 1104 + /* 1105 + * Allow only 'top->delay_secs' seconds behind samples. 1106 + */ 1107 + static int should_drop(struct ordered_event *qevent, struct perf_top *top) 1108 + { 1109 + union perf_event *event = qevent->event; 1110 + u64 delay_timestamp; 1111 + 1112 + if (event->header.type != PERF_RECORD_SAMPLE) 1113 + return false; 1114 + 1115 + delay_timestamp = qevent->timestamp + top->delay_secs * NSEC_PER_SEC; 1116 + return delay_timestamp < last_timestamp; 1117 + } 1118 + 1119 + static int deliver_event(struct ordered_events *qe, 1120 + struct ordered_event *qevent) 1121 + { 1122 + struct perf_top *top = qe->data; 1123 + struct perf_evlist *evlist = top->evlist; 1124 + struct perf_session *session = top->session; 1125 + union perf_event *event = qevent->event; 1126 + struct perf_sample sample; 1127 + struct perf_evsel *evsel; 1128 + struct machine *machine; 1129 + int ret = -1; 1130 + 1131 + if (should_drop(qevent, top)) { 1132 + top->drop++; 1133 + top->drop_total++; 1134 + return 0; 1135 + } 1136 + 1137 + ret = perf_evlist__parse_sample(evlist, event, &sample); 1138 + if (ret) { 1139 + pr_err("Can't parse sample, err = %d\n", ret); 1140 + goto next_event; 1141 + } 1142 + 1143 + evsel = perf_evlist__id2evsel(session->evlist, 
sample.id); 1144 + assert(evsel != NULL); 1145 + 1146 + if (event->header.type == PERF_RECORD_SAMPLE) 1147 + ++top->samples; 1148 + 1149 + switch (sample.cpumode) { 1150 + case PERF_RECORD_MISC_USER: 1151 + ++top->us_samples; 1152 + if (top->hide_user_symbols) 1153 + goto next_event; 1154 + machine = &session->machines.host; 1155 + break; 1156 + case PERF_RECORD_MISC_KERNEL: 1157 + ++top->kernel_samples; 1158 + if (top->hide_kernel_symbols) 1159 + goto next_event; 1160 + machine = &session->machines.host; 1161 + break; 1162 + case PERF_RECORD_MISC_GUEST_KERNEL: 1163 + ++top->guest_kernel_samples; 1164 + machine = perf_session__find_machine(session, 1165 + sample.pid); 1166 + break; 1167 + case PERF_RECORD_MISC_GUEST_USER: 1168 + ++top->guest_us_samples; 1169 + /* 1170 + * TODO: we don't process guest user from host side 1171 + * except simple counting. 1172 + */ 1173 + goto next_event; 1174 + default: 1175 + if (event->header.type == PERF_RECORD_SAMPLE) 1176 + goto next_event; 1177 + machine = &session->machines.host; 1178 + break; 1179 + } 1180 + 1181 + if (event->header.type == PERF_RECORD_SAMPLE) { 1182 + perf_event__process_sample(&top->tool, event, evsel, 1183 + &sample, machine); 1184 + } else if (event->header.type == PERF_RECORD_LOST) { 1185 + perf_top__process_lost(top, event, evsel); 1186 + } else if (event->header.type == PERF_RECORD_LOST_SAMPLES) { 1187 + perf_top__process_lost_samples(top, event, evsel); 1188 + } else if (event->header.type < PERF_RECORD_MAX) { 1189 + hists__inc_nr_events(evsel__hists(evsel), event->header.type); 1190 + machine__process_event(machine, event, &sample); 1191 + } else 1192 + ++session->evlist->stats.nr_unknown_events; 1193 + 1194 + ret = 0; 1195 + next_event: 1196 + return ret; 1197 + } 1198 + 1199 + static void init_process_thread(struct perf_top *top) 1200 + { 1201 + ordered_events__init(&top->qe.data[0], deliver_event, top); 1202 + ordered_events__init(&top->qe.data[1], deliver_event, top); 1203 + 
ordered_events__set_copy_on_queue(&top->qe.data[0], true); 1204 + ordered_events__set_copy_on_queue(&top->qe.data[1], true); 1205 + top->qe.in = &top->qe.data[0]; 1206 + pthread_mutex_init(&top->qe.mutex, NULL); 1207 + pthread_cond_init(&top->qe.cond, NULL); 1208 + } 1209 + 1046 1210 static int __cmd_top(struct perf_top *top) 1047 1211 { 1048 1212 char msg[512]; ··· 1194 1070 struct perf_evsel_config_term *err_term; 1195 1071 struct perf_evlist *evlist = top->evlist; 1196 1072 struct record_opts *opts = &top->record_opts; 1197 - pthread_t thread; 1073 + pthread_t thread, thread_process; 1198 1074 int ret; 1199 1075 1200 1076 top->session = perf_session__new(NULL, false, NULL); ··· 1218 1094 if (top->nr_threads_synthesize > 1) 1219 1095 perf_set_multithreaded(); 1220 1096 1097 + init_process_thread(top); 1098 + 1221 1099 machine__synthesize_threads(&top->session->machines.host, &opts->target, 1222 1100 top->evlist->threads, false, 1223 - opts->proc_map_timeout, 1224 1101 top->nr_threads_synthesize); 1225 1102 1226 1103 if (top->nr_threads_synthesize > 1) ··· 1260 1135 perf_evlist__enable(top->evlist); 1261 1136 1262 1137 ret = -1; 1138 + if (pthread_create(&thread_process, NULL, process_thread, top)) { 1139 + ui__error("Could not create process thread.\n"); 1140 + goto out_delete; 1141 + } 1142 + 1263 1143 if (pthread_create(&thread, NULL, (use_browser > 0 ? 
display_thread_tui : 1264 1144 display_thread), top)) { 1265 1145 ui__error("Could not create display thread.\n"); 1266 - goto out_delete; 1146 + goto out_join_thread; 1267 1147 } 1268 1148 1269 1149 if (top->realtime_prio) { ··· 1303 1173 ret = 0; 1304 1174 out_join: 1305 1175 pthread_join(thread, NULL); 1176 + out_join_thread: 1177 + pthread_cond_signal(&top->qe.cond); 1178 + pthread_join(thread_process, NULL); 1306 1179 out_delete: 1307 1180 perf_session__delete(top->session); 1308 1181 top->session = NULL; ··· 1389 1256 .target = { 1390 1257 .uses_mmap = true, 1391 1258 }, 1392 - .proc_map_timeout = 500, 1393 1259 /* 1394 1260 * FIXME: This will lose PERF_RECORD_MMAP and other metadata 1395 1261 * when we pause, fix that and reenable. Probably using a ··· 1397 1265 * stays in overwrite mode. -acme 1398 1266 * */ 1399 1267 .overwrite = 0, 1268 + .sample_time = true, 1400 1269 }, 1401 1270 .max_stack = sysctl__max_stack(), 1402 1271 .annotation_opts = annotation__default_options, ··· 1422 1289 "file", "vmlinux pathname"), 1423 1290 OPT_BOOLEAN(0, "ignore-vmlinux", &symbol_conf.ignore_vmlinux, 1424 1291 "don't load vmlinux even if found"), 1292 + OPT_STRING(0, "kallsyms", &symbol_conf.kallsyms_name, 1293 + "file", "kallsyms pathname"), 1425 1294 OPT_BOOLEAN('K', "hide_kernel_symbols", &top.hide_kernel_symbols, 1426 1295 "hide kernel symbols"), 1427 1296 OPT_CALLBACK('m', "mmap-pages", &opts->mmap_pages, "pages", ··· 1502 1367 OPT_STRING('w', "column-widths", &symbol_conf.col_width_list_str, 1503 1368 "width[,width...]", 1504 1369 "don't try to adjust column width, use these fixed values"), 1505 - OPT_UINTEGER(0, "proc-map-timeout", &opts->proc_map_timeout, 1370 + OPT_UINTEGER(0, "proc-map-timeout", &proc_map_timeout, 1506 1371 "per thread proc mmap processing timeout in ms"), 1507 1372 OPT_CALLBACK_NOOPT('b', "branch-any", &opts->branch_stack, 1508 1373 "branch any", "sample any taken branches",
+72 -15
tools/perf/builtin-trace.c
··· 127 127 bool force; 128 128 bool vfs_getname; 129 129 int trace_pgfaults; 130 + struct { 131 + struct ordered_events data; 132 + u64 last; 133 + } oe; 130 134 }; 131 135 132 136 struct tp_field { ··· 262 258 struct syscall_tp *sc = evsel->priv = malloc(sizeof(struct syscall_tp)); 263 259 264 260 if (evsel->priv != NULL) { 265 - if (perf_evsel__init_tp_uint_field(evsel, &sc->id, "__syscall_nr")) 261 + if (perf_evsel__init_tp_uint_field(evsel, &sc->id, "__syscall_nr") && 262 + perf_evsel__init_tp_uint_field(evsel, &sc->id, "nr")) 266 263 goto out_delete; 267 264 return 0; 268 265 } ··· 890 885 * args_size: sum of the sizes of the syscall arguments, anything after that is augmented stuff: pathname for openat, etc. 891 886 */ 892 887 struct syscall { 893 - struct tep_event_format *tp_format; 888 + struct tep_event *tp_format; 894 889 int nr_args; 895 890 int args_size; 896 891 bool is_exit; ··· 1269 1264 1270 1265 err = __machine__synthesize_threads(trace->host, &trace->tool, &trace->opts.target, 1271 1266 evlist->threads, trace__tool_process, false, 1272 - trace->opts.proc_map_timeout, 1); 1267 + 1); 1273 1268 out: 1274 1269 if (err) 1275 1270 symbol__exit(); ··· 2641 2636 return err; 2642 2637 } 2643 2638 2639 + static int trace__deliver_event(struct trace *trace, union perf_event *event) 2640 + { 2641 + struct perf_evlist *evlist = trace->evlist; 2642 + struct perf_sample sample; 2643 + int err; 2644 + 2645 + err = perf_evlist__parse_sample(evlist, event, &sample); 2646 + if (err) 2647 + fprintf(trace->output, "Can't parse sample, err = %d, skipping...\n", err); 2648 + else 2649 + trace__handle_event(trace, event, &sample); 2650 + 2651 + return 0; 2652 + } 2653 + 2654 + static int trace__flush_ordered_events(struct trace *trace) 2655 + { 2656 + u64 first = ordered_events__first_time(&trace->oe.data); 2657 + u64 flush = trace->oe.last - NSEC_PER_SEC; 2658 + 2659 + /* Is there some thing to flush.. 
*/ 2660 + if (first && first < flush) 2661 + return ordered_events__flush_time(&trace->oe.data, flush); 2662 + 2663 + return 0; 2664 + } 2665 + 2666 + static int trace__deliver_ordered_event(struct trace *trace, union perf_event *event) 2667 + { 2668 + struct perf_evlist *evlist = trace->evlist; 2669 + int err; 2670 + 2671 + err = perf_evlist__parse_sample_timestamp(evlist, event, &trace->oe.last); 2672 + if (err && err != -1) 2673 + return err; 2674 + 2675 + err = ordered_events__queue(&trace->oe.data, event, trace->oe.last, 0); 2676 + if (err) 2677 + return err; 2678 + 2679 + return trace__flush_ordered_events(trace); 2680 + } 2681 + 2682 + static int ordered_events__deliver_event(struct ordered_events *oe, 2683 + struct ordered_event *event) 2684 + { 2685 + struct trace *trace = container_of(oe, struct trace, oe.data); 2686 + 2687 + return trace__deliver_event(trace, event->event); 2688 + } 2689 + 2644 2690 static int trace__run(struct trace *trace, int argc, const char **argv) 2645 2691 { 2646 2692 struct perf_evlist *evlist = trace->evlist; ··· 2838 2782 * Now that we already used evsel->attr to ask the kernel to setup the 2839 2783 * events, lets reuse evsel->attr.sample_max_stack as the limit in 2840 2784 * trace__resolve_callchain(), allowing per-event max-stack settings 2841 - * to override an explicitely set --max-stack global setting. 2785 + * to override an explicitly set --max-stack global setting. 
2842 2786 */ 2843 2787 evlist__for_each_entry(evlist, evsel) { 2844 2788 if (evsel__has_callchain(evsel) && ··· 2857 2801 continue; 2858 2802 2859 2803 while ((event = perf_mmap__read_event(md)) != NULL) { 2860 - struct perf_sample sample; 2861 - 2862 2804 ++trace->nr_events; 2863 2805 2864 - err = perf_evlist__parse_sample(evlist, event, &sample); 2865 - if (err) { 2866 - fprintf(trace->output, "Can't parse sample, err = %d, skipping...\n", err); 2867 - goto next_event; 2868 - } 2806 + err = trace__deliver_ordered_event(trace, event); 2807 + if (err) 2808 + goto out_disable; 2869 2809 2870 - trace__handle_event(trace, event, &sample); 2871 - next_event: 2872 2810 perf_mmap__consume(md); 2873 2811 2874 2812 if (interrupted) ··· 2884 2834 draining = true; 2885 2835 2886 2836 goto again; 2837 + } else { 2838 + if (trace__flush_ordered_events(trace)) 2839 + goto out_disable; 2887 2840 } 2888 2841 } else { 2889 2842 goto again; ··· 2896 2843 thread__zput(trace->current); 2897 2844 2898 2845 perf_evlist__disable(evlist); 2846 + 2847 + ordered_events__flush(&trace->oe.data, OE_FLUSH__FINAL); 2899 2848 2900 2849 if (!err) { 2901 2850 if (trace->summary) ··· 3448 3393 .user_interval = ULLONG_MAX, 3449 3394 .no_buffering = true, 3450 3395 .mmap_pages = UINT_MAX, 3451 - .proc_map_timeout = 500, 3452 3396 }, 3453 3397 .output = stderr, 3454 3398 .show_comm = true, ··· 3518 3464 "Default: kernel.perf_event_max_stack or " __stringify(PERF_MAX_STACK_DEPTH)), 3519 3465 OPT_BOOLEAN(0, "print-sample", &trace.print_sample, 3520 3466 "print the PERF_RECORD_SAMPLE PERF_SAMPLE_ info, for debugging"), 3521 - OPT_UINTEGER(0, "proc-map-timeout", &trace.opts.proc_map_timeout, 3467 + OPT_UINTEGER(0, "proc-map-timeout", &proc_map_timeout, 3522 3468 "per thread proc mmap processing timeout in ms"), 3523 3469 OPT_CALLBACK('G', "cgroup", &trace, "name", "monitor event in cgroup name only", 3524 3470 trace__parse_cgroups), ··· 3608 3554 goto out; 3609 3555 } 3610 3556 } 3557 + 3558 + 
ordered_events__init(&trace.oe.data, ordered_events__deliver_event, &trace); 3559 + ordered_events__set_copy_on_queue(&trace.oe.data, true); 3611 3560 3612 3561 /* 3613 3562 * If we are augmenting syscalls, then combine what we put in the
+1 -1
tools/perf/perf.h
··· 82 82 bool use_clockid; 83 83 clockid_t clockid; 84 84 u64 clockid_res_ns; 85 - unsigned int proc_map_timeout; 85 + int nr_cblocks; 86 86 }; 87 87 88 88 struct option;
+2 -2
tools/perf/pmu-events/arch/x86/broadwell/cache.json
··· 433 433 }, 434 434 { 435 435 "PEBS": "1", 436 - "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-splitted load uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 436 + "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-split load uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 437 437 "EventCode": "0xD0", 438 438 "Counter": "0,1,2,3", 439 439 "UMask": "0x41", ··· 445 445 }, 446 446 { 447 447 "PEBS": "1", 448 - "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-splitted store uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 448 + "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-split store uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 449 449 "EventCode": "0xD0", 450 450 "Counter": "0,1,2,3", 451 451 "UMask": "0x42",
+1 -1
tools/perf/pmu-events/arch/x86/broadwell/pipeline.json
··· 317 317 "CounterHTOff": "0,1,2,3,4,5,6,7" 318 318 }, 319 319 { 320 - "PublicDescription": "This event counts stalls occured due to changing prefix length (66, 67 or REX.W when they change the length of the decoded instruction). Occurrences counting is proportional to the number of prefixes in a 16B-line. This may result in the following penalties: three-cycle penalty for each LCP in a 16-byte chunk.", 320 + "PublicDescription": "This event counts stalls occurred due to changing prefix length (66, 67 or REX.W when they change the length of the decoded instruction). Occurrences counting is proportional to the number of prefixes in a 16B-line. This may result in the following penalties: three-cycle penalty for each LCP in a 16-byte chunk.", 321 321 "EventCode": "0x87", 322 322 "Counter": "0,1,2,3", 323 323 "UMask": "0x1",
+2 -2
tools/perf/pmu-events/arch/x86/broadwellde/cache.json
··· 439 439 "PEBS": "1", 440 440 "Counter": "0,1,2,3", 441 441 "EventName": "MEM_UOPS_RETIRED.SPLIT_LOADS", 442 - "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-splitted load uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 442 + "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-split load uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 443 443 "SampleAfterValue": "100003", 444 444 "CounterHTOff": "0,1,2,3" 445 445 }, ··· 451 451 "PEBS": "1", 452 452 "Counter": "0,1,2,3", 453 453 "EventName": "MEM_UOPS_RETIRED.SPLIT_STORES", 454 - "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-splitted store uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 454 + "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-split store uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 455 455 "SampleAfterValue": "100003", 456 456 "L1_Hit_Indication": "1", 457 457 "CounterHTOff": "0,1,2,3"
+1 -1
tools/perf/pmu-events/arch/x86/broadwellde/pipeline.json
··· 322 322 "BriefDescription": "Stalls caused by changing prefix length of the instruction.", 323 323 "Counter": "0,1,2,3", 324 324 "EventName": "ILD_STALL.LCP", 325 - "PublicDescription": "This event counts stalls occured due to changing prefix length (66, 67 or REX.W when they change the length of the decoded instruction). Occurrences counting is proportional to the number of prefixes in a 16B-line. This may result in the following penalties: three-cycle penalty for each LCP in a 16-byte chunk.", 325 + "PublicDescription": "This event counts stalls occurred due to changing prefix length (66, 67 or REX.W when they change the length of the decoded instruction). Occurrences counting is proportional to the number of prefixes in a 16B-line. This may result in the following penalties: three-cycle penalty for each LCP in a 16-byte chunk.", 326 326 "SampleAfterValue": "2000003", 327 327 "CounterHTOff": "0,1,2,3,4,5,6,7" 328 328 },
+2 -2
tools/perf/pmu-events/arch/x86/broadwellx/cache.json
··· 439 439 "PEBS": "1", 440 440 "Counter": "0,1,2,3", 441 441 "EventName": "MEM_UOPS_RETIRED.SPLIT_LOADS", 442 - "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-splitted load uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 442 + "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-split load uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 443 443 "SampleAfterValue": "100003", 444 444 "CounterHTOff": "0,1,2,3" 445 445 }, ··· 451 451 "PEBS": "1", 452 452 "Counter": "0,1,2,3", 453 453 "EventName": "MEM_UOPS_RETIRED.SPLIT_STORES", 454 - "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-splitted store uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 454 + "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-split store uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 455 455 "SampleAfterValue": "100003", 456 456 "L1_Hit_Indication": "1", 457 457 "CounterHTOff": "0,1,2,3"
+1 -1
tools/perf/pmu-events/arch/x86/broadwellx/pipeline.json
··· 322 322 "BriefDescription": "Stalls caused by changing prefix length of the instruction.", 323 323 "Counter": "0,1,2,3", 324 324 "EventName": "ILD_STALL.LCP", 325 - "PublicDescription": "This event counts stalls occured due to changing prefix length (66, 67 or REX.W when they change the length of the decoded instruction). Occurrences counting is proportional to the number of prefixes in a 16B-line. This may result in the following penalties: three-cycle penalty for each LCP in a 16-byte chunk.", 325 + "PublicDescription": "This event counts stalls occurred due to changing prefix length (66, 67 or REX.W when they change the length of the decoded instruction). Occurrences counting is proportional to the number of prefixes in a 16B-line. This may result in the following penalties: three-cycle penalty for each LCP in a 16-byte chunk.", 326 326 "SampleAfterValue": "2000003", 327 327 "CounterHTOff": "0,1,2,3,4,5,6,7" 328 328 },
+2 -2
tools/perf/pmu-events/arch/x86/jaketown/cache.json
··· 31 31 }, 32 32 { 33 33 "PEBS": "1", 34 - "PublicDescription": "This event counts line-splitted load uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 34 + "PublicDescription": "This event counts line-split load uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 35 35 "EventCode": "0xD0", 36 36 "Counter": "0,1,2,3", 37 37 "UMask": "0x41", ··· 42 42 }, 43 43 { 44 44 "PEBS": "1", 45 - "PublicDescription": "This event counts line-splitted store uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 45 + "PublicDescription": "This event counts line-split store uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 46 46 "EventCode": "0xD0", 47 47 "Counter": "0,1,2,3", 48 48 "UMask": "0x42",
+1 -1
tools/perf/pmu-events/arch/x86/jaketown/pipeline.json
··· 778 778 "CounterHTOff": "0,1,2,3,4,5,6,7" 779 779 }, 780 780 { 781 - "PublicDescription": "This event counts loads that followed a store to the same address, where the data could not be forwarded inside the pipeline from the store to the load. The most common reason why store forwarding would be blocked is when a load's address range overlaps with a preceeding smaller uncompleted store. See the table of not supported store forwards in the Intel? 64 and IA-32 Architectures Optimization Reference Manual. The penalty for blocked store forwarding is that the load must wait for the store to complete before it can be issued.", 781 + "PublicDescription": "This event counts loads that followed a store to the same address, where the data could not be forwarded inside the pipeline from the store to the load. The most common reason why store forwarding would be blocked is when a load's address range overlaps with a preceding smaller uncompleted store. See the table of not supported store forwards in the Intel? 64 and IA-32 Architectures Optimization Reference Manual. The penalty for blocked store forwarding is that the load must wait for the store to complete before it can be issued.", 782 782 "EventCode": "0x03", 783 783 "Counter": "0,1,2,3", 784 784 "UMask": "0x2",
+15 -15
tools/perf/pmu-events/arch/x86/knightslanding/cache.json
··· 121 121 "EventName": "OFFCORE_RESPONSE.ANY_PF_L2.OUTSTANDING", 122 122 "MSRIndex": "0x1a6", 123 123 "SampleAfterValue": "100007", 124 - "BriefDescription": "Counts any Prefetch requests that are outstanding, per weighted cycle, from the time of the request to when any response is received. The oustanding response should be programmed only on PMC0. ", 124 + "BriefDescription": "Counts any Prefetch requests that are outstanding, per weighted cycle, from the time of the request to when any response is received. The outstanding response should be programmed only on PMC0. ", 125 125 "Offcore": "1" 126 126 }, 127 127 { ··· 187 187 "EventName": "OFFCORE_RESPONSE.ANY_READ.OUTSTANDING", 188 188 "MSRIndex": "0x1a6", 189 189 "SampleAfterValue": "100007", 190 - "BriefDescription": "Counts any Read request that are outstanding, per weighted cycle, from the time of the request to when any response is received. The oustanding response should be programmed only on PMC0. ", 190 + "BriefDescription": "Counts any Read request that are outstanding, per weighted cycle, from the time of the request to when any response is received. The outstanding response should be programmed only on PMC0. ", 191 191 "Offcore": "1" 192 192 }, 193 193 { ··· 253 253 "EventName": "OFFCORE_RESPONSE.ANY_CODE_RD.OUTSTANDING", 254 254 "MSRIndex": "0x1a6", 255 255 "SampleAfterValue": "100007", 256 - "BriefDescription": "Counts Demand code reads and prefetch code read requests that are outstanding, per weighted cycle, from the time of the request to when any response is received. The oustanding response should be programmed only on PMC0. ", 256 + "BriefDescription": "Counts Demand code reads and prefetch code read requests that are outstanding, per weighted cycle, from the time of the request to when any response is received. The outstanding response should be programmed only on PMC0. 
", 257 257 "Offcore": "1" 258 258 }, 259 259 { ··· 319 319 "EventName": "OFFCORE_RESPONSE.ANY_RFO.OUTSTANDING", 320 320 "MSRIndex": "0x1a6", 321 321 "SampleAfterValue": "100007", 322 - "BriefDescription": "Counts Demand cacheable data write requests that are outstanding, per weighted cycle, from the time of the request to when any response is received. The oustanding response should be programmed only on PMC0. ", 322 + "BriefDescription": "Counts Demand cacheable data write requests that are outstanding, per weighted cycle, from the time of the request to when any response is received. The outstanding response should be programmed only on PMC0. ", 323 323 "Offcore": "1" 324 324 }, 325 325 { ··· 385 385 "EventName": "OFFCORE_RESPONSE.ANY_DATA_RD.OUTSTANDING", 386 386 "MSRIndex": "0x1a6", 387 387 "SampleAfterValue": "100007", 388 - "BriefDescription": "Counts Demand cacheable data and L1 prefetch data read requests that are outstanding, per weighted cycle, from the time of the request to when any response is received. The oustanding response should be programmed only on PMC0. ", 388 + "BriefDescription": "Counts Demand cacheable data and L1 prefetch data read requests that are outstanding, per weighted cycle, from the time of the request to when any response is received. The outstanding response should be programmed only on PMC0. ", 389 389 "Offcore": "1" 390 390 }, 391 391 { ··· 451 451 "EventName": "OFFCORE_RESPONSE.ANY_REQUEST.OUTSTANDING", 452 452 "MSRIndex": "0x1a6", 453 453 "SampleAfterValue": "100007", 454 - "BriefDescription": "Counts any request that are outstanding, per weighted cycle, from the time of the request to when any response is received. The oustanding response should be programmed only on PMC0. ", 454 + "BriefDescription": "Counts any request that are outstanding, per weighted cycle, from the time of the request to when any response is received. The outstanding response should be programmed only on PMC0. 
", 455 455 "Offcore": "1" 456 456 }, 457 457 { ··· 539 539 "EventName": "OFFCORE_RESPONSE.PF_L1_DATA_RD.OUTSTANDING", 540 540 "MSRIndex": "0x1a6", 541 541 "SampleAfterValue": "100007", 542 - "BriefDescription": "Counts L1 data HW prefetches that are outstanding, per weighted cycle, from the time of the request to when any response is received. The oustanding response should be programmed only on PMC0. ", 542 + "BriefDescription": "Counts L1 data HW prefetches that are outstanding, per weighted cycle, from the time of the request to when any response is received. The outstanding response should be programmed only on PMC0. ", 543 543 "Offcore": "1" 544 544 }, 545 545 { ··· 605 605 "EventName": "OFFCORE_RESPONSE.PF_SOFTWARE.OUTSTANDING", 606 606 "MSRIndex": "0x1a6", 607 607 "SampleAfterValue": "100007", 608 - "BriefDescription": "Counts Software Prefetches that are outstanding, per weighted cycle, from the time of the request to when any response is received. The oustanding response should be programmed only on PMC0. ", 608 + "BriefDescription": "Counts Software Prefetches that are outstanding, per weighted cycle, from the time of the request to when any response is received. The outstanding response should be programmed only on PMC0. ", 609 609 "Offcore": "1" 610 610 }, 611 611 { ··· 682 682 "EventName": "OFFCORE_RESPONSE.BUS_LOCKS.OUTSTANDING", 683 683 "MSRIndex": "0x1a6", 684 684 "SampleAfterValue": "100007", 685 - "BriefDescription": "Counts Bus locks and split lock requests that are outstanding, per weighted cycle, from the time of the request to when any response is received. The oustanding response should be programmed only on PMC0. ", 685 + "BriefDescription": "Counts Bus locks and split lock requests that are outstanding, per weighted cycle, from the time of the request to when any response is received. The outstanding response should be programmed only on PMC0. 
", 686 686 "Offcore": "1" 687 687 }, 688 688 { ··· 748 748 "EventName": "OFFCORE_RESPONSE.UC_CODE_READS.OUTSTANDING", 749 749 "MSRIndex": "0x1a6", 750 750 "SampleAfterValue": "100007", 751 - "BriefDescription": "Counts UC code reads (valid only for Outstanding response type) that are outstanding, per weighted cycle, from the time of the request to when any response is received. The oustanding response should be programmed only on PMC0. ", 751 + "BriefDescription": "Counts UC code reads (valid only for Outstanding response type) that are outstanding, per weighted cycle, from the time of the request to when any response is received. The outstanding response should be programmed only on PMC0. ", 752 752 "Offcore": "1" 753 753 }, 754 754 { ··· 869 869 "EventName": "OFFCORE_RESPONSE.PARTIAL_READS.OUTSTANDING", 870 870 "MSRIndex": "0x1a6", 871 871 "SampleAfterValue": "100007", 872 - "BriefDescription": "Counts Partial reads (UC or WC and is valid only for Outstanding response type). that are outstanding, per weighted cycle, from the time of the request to when any response is received. The oustanding response should be programmed only on PMC0. ", 872 + "BriefDescription": "Counts Partial reads (UC or WC and is valid only for Outstanding response type). that are outstanding, per weighted cycle, from the time of the request to when any response is received. The outstanding response should be programmed only on PMC0. ", 873 873 "Offcore": "1" 874 874 }, 875 875 { ··· 935 935 "EventName": "OFFCORE_RESPONSE.PF_L2_CODE_RD.OUTSTANDING", 936 936 "MSRIndex": "0x1a6", 937 937 "SampleAfterValue": "100007", 938 - "BriefDescription": "Counts L2 code HW prefetches that are outstanding, per weighted cycle, from the time of the request to when any response is received. The oustanding response should be programmed only on PMC0. 
", 938 + "BriefDescription": "Counts L2 code HW prefetches that are outstanding, per weighted cycle, from the time of the request to when any response is received. The outstanding response should be programmed only on PMC0. ", 939 939 "Offcore": "1" 940 940 }, 941 941 { ··· 1067 1067 "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.OUTSTANDING", 1068 1068 "MSRIndex": "0x1a6", 1069 1069 "SampleAfterValue": "100007", 1070 - "BriefDescription": "Counts demand code reads and prefetch code reads that are outstanding, per weighted cycle, from the time of the request to when any response is received. The oustanding response should be programmed only on PMC0. ", 1070 + "BriefDescription": "Counts demand code reads and prefetch code reads that are outstanding, per weighted cycle, from the time of the request to when any response is received. The outstanding response should be programmed only on PMC0. ", 1071 1071 "Offcore": "1" 1072 1072 }, 1073 1073 { ··· 1133 1133 "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.OUTSTANDING", 1134 1134 "MSRIndex": "0x1a6", 1135 1135 "SampleAfterValue": "100007", 1136 - "BriefDescription": "Counts Demand cacheable data writes that are outstanding, per weighted cycle, from the time of the request to when any response is received. The oustanding response should be programmed only on PMC0. ", 1136 + "BriefDescription": "Counts Demand cacheable data writes that are outstanding, per weighted cycle, from the time of the request to when any response is received. The outstanding response should be programmed only on PMC0. ", 1137 1137 "Offcore": "1" 1138 1138 }, 1139 1139 { ··· 1199 1199 "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.OUTSTANDING", 1200 1200 "MSRIndex": "0x1a6", 1201 1201 "SampleAfterValue": "100007", 1202 - "BriefDescription": "Counts demand cacheable data and L1 prefetch data reads that are outstanding, per weighted cycle, from the time of the request to when any response is received. 
The oustanding response should be programmed only on PMC0. ", 1202 + "BriefDescription": "Counts demand cacheable data and L1 prefetch data reads that are outstanding, per weighted cycle, from the time of the request to when any response is received. The outstanding response should be programmed only on PMC0. ", 1203 1203 "Offcore": "1" 1204 1204 }, 1205 1205 {
+2 -2
tools/perf/pmu-events/arch/x86/sandybridge/cache.json
··· 31 31 }, 32 32 { 33 33 "PEBS": "1", 34 - "PublicDescription": "This event counts line-splitted load uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 34 + "PublicDescription": "This event counts line-split load uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 35 35 "EventCode": "0xD0", 36 36 "Counter": "0,1,2,3", 37 37 "UMask": "0x41", ··· 42 42 }, 43 43 { 44 44 "PEBS": "1", 45 - "PublicDescription": "This event counts line-splitted store uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 45 + "PublicDescription": "This event counts line-split store uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).", 46 46 "EventCode": "0xD0", 47 47 "Counter": "0,1,2,3", 48 48 "UMask": "0x42",
+1 -1
tools/perf/pmu-events/arch/x86/sandybridge/pipeline.json
··· 778 778 "CounterHTOff": "0,1,2,3,4,5,6,7" 779 779 }, 780 780 { 781 - "PublicDescription": "This event counts loads that followed a store to the same address, where the data could not be forwarded inside the pipeline from the store to the load. The most common reason why store forwarding would be blocked is when a load's address range overlaps with a preceeding smaller uncompleted store. See the table of not supported store forwards in the Intel? 64 and IA-32 Architectures Optimization Reference Manual. The penalty for blocked store forwarding is that the load must wait for the store to complete before it can be issued.", 781 + "PublicDescription": "This event counts loads that followed a store to the same address, where the data could not be forwarded inside the pipeline from the store to the load. The most common reason why store forwarding would be blocked is when a load's address range overlaps with a preceding smaller uncompleted store. See the table of not supported store forwards in the Intel? 64 and IA-32 Architectures Optimization Reference Manual. The penalty for blocked store forwarding is that the load must wait for the store to complete before it can be issued.", 782 782 "EventCode": "0x03", 783 783 "Counter": "0,1,2,3", 784 784 "UMask": "0x2",
+1 -1
tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json
··· 73 73 }, 74 74 { 75 75 "BriefDescription": "Actual Average Latency for L1 data-cache miss demand loads", 76 - "MetricExpr": "L1D_PEND_MISS.PENDING / ( MEM_LOAD_RETIRED.L1_MISS_PS + MEM_LOAD_RETIRED.FB_HIT_PS )", 76 + "MetricExpr": "L1D_PEND_MISS.PENDING / ( MEM_LOAD_RETIRED.L1_MISS + MEM_LOAD_RETIRED.FB_HIT )", 77 77 "MetricGroup": "Memory_Bound;Memory_Lat", 78 78 "MetricName": "Load_Miss_Real_Latency" 79 79 },
+1 -1
tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json
··· 73 73 }, 74 74 { 75 75 "BriefDescription": "Actual Average Latency for L1 data-cache miss demand loads", 76 - "MetricExpr": "L1D_PEND_MISS.PENDING / ( MEM_LOAD_RETIRED.L1_MISS_PS + MEM_LOAD_RETIRED.FB_HIT_PS )", 76 + "MetricExpr": "L1D_PEND_MISS.PENDING / ( MEM_LOAD_RETIRED.L1_MISS + MEM_LOAD_RETIRED.FB_HIT )", 77 77 "MetricGroup": "Memory_Bound;Memory_Lat", 78 78 "MetricName": "Load_Miss_Real_Latency" 79 79 },
+6 -6
tools/perf/pmu-events/arch/x86/skylakex/uncore-other.json
··· 428 428 "EventCode": "0x5C", 429 429 "EventName": "UNC_CHA_SNOOP_RESP.RSP_WBWB", 430 430 "PerPkg": "1", 431 - "PublicDescription": "Counts when a transaction with the opcode type Rsp*WB Snoop Response was received which indicates which indicates the data was written back to it's home. This is returned when a non-RFO request hits a cacheline in the Modified state. The Cache can either downgrade the cacheline to a S (Shared) or I (Invalid) state depending on how the system has been configured. This reponse will also be sent when a cache requests E (Exclusive) ownership of a cache line without receiving data, because the cache must acquire ownership.", 431 + "PublicDescription": "Counts when a transaction with the opcode type Rsp*WB Snoop Response was received which indicates which indicates the data was written back to it's home. This is returned when a non-RFO request hits a cacheline in the Modified state. The Cache can either downgrade the cacheline to a S (Shared) or I (Invalid) state depending on how the system has been configured. This response will also be sent when a cache requests E (Exclusive) ownership of a cache line without receiving data, because the cache must acquire ownership.", 432 432 "UMask": "0x10", 433 433 "Unit": "CHA" 434 434 }, ··· 967 967 "EventCode": "0x57", 968 968 "EventName": "UNC_M2M_PREFCAM_INSERTS", 969 969 "PerPkg": "1", 970 - "PublicDescription": "Counts when the M2M (Mesh to Memory) recieves a prefetch request and inserts it into its outstanding prefetch queue. Explanatory Side Note: the prefect queue is made from CAM: Content Addressable Memory", 970 + "PublicDescription": "Counts when the M2M (Mesh to Memory) receives a prefetch request and inserts it into its outstanding prefetch queue. 
Explanatory Side Note: the prefect queue is made from CAM: Content Addressable Memory", 971 971 "Unit": "M2M" 972 972 }, 973 973 { ··· 1041 1041 "EventCode": "0x31", 1042 1042 "EventName": "UNC_UPI_RxL_BYPASSED.SLOT0", 1043 1043 "PerPkg": "1", 1044 - "PublicDescription": "Counts incoming FLITs (FLow control unITs) which bypassed the slot0 RxQ buffer (Receive Queue) and passed directly to the Egress. This is a latency optimization, and should generally be the common case. If this value is less than the number of FLITs transfered, it implies that there was queueing getting onto the ring, and thus the transactions saw higher latency.", 1044 + "PublicDescription": "Counts incoming FLITs (FLow control unITs) which bypassed the slot0 RxQ buffer (Receive Queue) and passed directly to the Egress. This is a latency optimization, and should generally be the common case. If this value is less than the number of FLITs transferred, it implies that there was queueing getting onto the ring, and thus the transactions saw higher latency.", 1045 1045 "UMask": "0x1", 1046 1046 "Unit": "UPI LL" 1047 1047 }, ··· 1051 1051 "EventCode": "0x31", 1052 1052 "EventName": "UNC_UPI_RxL_BYPASSED.SLOT1", 1053 1053 "PerPkg": "1", 1054 - "PublicDescription": "Counts incoming FLITs (FLow control unITs) which bypassed the slot1 RxQ buffer (Receive Queue) and passed directly across the BGF and into the Egress. This is a latency optimization, and should generally be the common case. If this value is less than the number of FLITs transfered, it implies that there was queueing getting onto the ring, and thus the transactions saw higher latency.", 1054 + "PublicDescription": "Counts incoming FLITs (FLow control unITs) which bypassed the slot1 RxQ buffer (Receive Queue) and passed directly across the BGF and into the Egress. This is a latency optimization, and should generally be the common case. 
If this value is less than the number of FLITs transferred, it implies that there was queueing getting onto the ring, and thus the transactions saw higher latency.", 1055 1055 "UMask": "0x2", 1056 1056 "Unit": "UPI LL" 1057 1057 }, 1058 1058 { 1059 - "BriefDescription": "FLITs received which bypassed the Slot0 Recieve Buffer", 1059 + "BriefDescription": "FLITs received which bypassed the Slot0 Receive Buffer", 1060 1060 "Counter": "0,1,2,3", 1061 1061 "EventCode": "0x31", 1062 1062 "EventName": "UNC_UPI_RxL_BYPASSED.SLOT2", 1063 1063 "PerPkg": "1", 1064 - "PublicDescription": "Counts incoming FLITs (FLow control unITs) whcih bypassed the slot2 RxQ buffer (Receive Queue) and passed directly to the Egress. This is a latency optimization, and should generally be the common case. If this value is less than the number of FLITs transfered, it implies that there was queueing getting onto the ring, and thus the transactions saw higher latency.", 1064 + "PublicDescription": "Counts incoming FLITs (FLow control unITs) which bypassed the slot2 RxQ buffer (Receive Queue) and passed directly to the Egress. This is a latency optimization, and should generally be the common case. If this value is less than the number of FLITs transferred, it implies that there was queueing getting onto the ring, and thus the transactions saw higher latency.", 1065 1065 "UMask": "0x4", 1066 1066 "Unit": "UPI LL" 1067 1067 },
+1 -1
tools/perf/tests/attr.c
··· 182 182 char path_perf[PATH_MAX]; 183 183 char path_dir[PATH_MAX]; 184 184 185 - /* First try developement tree tests. */ 185 + /* First try development tree tests. */ 186 186 if (!lstat("./tests", &st)) 187 187 return run_dir("./tests", "./perf"); 188 188
+1 -1
tools/perf/tests/attr.py
··· 116 116 if not self.has_key(t) or not other.has_key(t): 117 117 continue 118 118 if not data_equal(self[t], other[t]): 119 - log.warning("expected %s=%s, got %s" % (t, self[t], other[t])) 119 + log.warning("expected %s=%s, got %s" % (t, self[t], other[t])) 120 120 121 121 # Test file description needs to have following sections: 122 122 # [config]
+14 -6
tools/perf/tests/bp_signal.c
··· 291 291 292 292 bool test__bp_signal_is_supported(void) 293 293 { 294 - /* 295 - * The powerpc so far does not have support to even create 296 - * instruction breakpoint using the perf event interface. 297 - * Once it's there we can release this. 298 - */ 299 - #if defined(__powerpc__) || defined(__s390x__) 294 + /* 295 + * PowerPC and S390 do not support creation of instruction 296 + * breakpoints using the perf_event interface. 297 + * 298 + * ARM requires explicit rounding down of the instruction 299 + * pointer in Thumb mode, and then requires the single-step 300 + * to be handled explicitly in the overflow handler to avoid 301 + * stepping into the SIGIO handler and getting stuck on the 302 + * breakpointed instruction. 303 + * 304 + * Just disable the test for these architectures until these 305 + * issues are resolved. 306 + */ 307 + #if defined(__powerpc__) || defined(__s390x__) || defined(__arm__) 300 308 return false; 301 309 #else 302 310 return true;
+1 -1
tools/perf/tests/code-reading.c
··· 599 599 } 600 600 601 601 ret = perf_event__synthesize_thread_map(NULL, threads, 602 - perf_event__process, machine, false, 500); 602 + perf_event__process, machine, false); 603 603 if (ret < 0) { 604 604 pr_debug("perf_event__synthesize_thread_map failed\n"); 605 605 goto out_err;
+1 -1
tools/perf/tests/dwarf-unwind.c
··· 34 34 pid_t pid = getpid(); 35 35 36 36 return perf_event__synthesize_mmap_events(NULL, &event, pid, pid, 37 - mmap_handler, machine, true, 500); 37 + mmap_handler, machine, true); 38 38 } 39 39 40 40 /*
+2 -2
tools/perf/tests/mmap-thread-lookup.c
··· 132 132 { 133 133 return perf_event__synthesize_threads(NULL, 134 134 perf_event__process, 135 - machine, 0, 500, 1); 135 + machine, 0, 1); 136 136 } 137 137 138 138 static int synth_process(struct machine *machine) ··· 144 144 145 145 err = perf_event__synthesize_thread_map(NULL, map, 146 146 perf_event__process, 147 - machine, 0, 500); 147 + machine, 0); 148 148 149 149 thread_map__put(map); 150 150 return err;
+5 -2
tools/perf/tests/perf-record.c
··· 58 58 char *bname, *mmap_filename; 59 59 u64 prev_time = 0; 60 60 bool found_cmd_mmap = false, 61 + found_coreutils_mmap = false, 61 62 found_libc_mmap = false, 62 63 found_vdso_mmap = false, 63 64 found_ld_mmap = false; ··· 255 254 if (bname != NULL) { 256 255 if (!found_cmd_mmap) 257 256 found_cmd_mmap = !strcmp(bname + 1, cmd); 257 + if (!found_coreutils_mmap) 258 + found_coreutils_mmap = !strcmp(bname + 1, "coreutils"); 258 259 if (!found_libc_mmap) 259 260 found_libc_mmap = !strncmp(bname + 1, "libc", 4); 260 261 if (!found_ld_mmap) ··· 295 292 } 296 293 297 294 found_exit: 298 - if (nr_events[PERF_RECORD_COMM] > 1) { 295 + if (nr_events[PERF_RECORD_COMM] > 1 + !!found_coreutils_mmap) { 299 296 pr_debug("Excessive number of PERF_RECORD_COMM events!\n"); 300 297 ++errs; 301 298 } ··· 305 302 ++errs; 306 303 } 307 304 308 - if (!found_cmd_mmap) { 305 + if (!found_cmd_mmap && !found_coreutils_mmap) { 309 306 pr_debug("PERF_RECORD_MMAP for %s missing!\n", cmd); 310 307 ++errs; 311 308 }
+2 -2
tools/perf/trace/beauty/mmap_flags.sh
··· 20 20 (egrep $regex ${arch_mman} | \ 21 21 sed -r "s/$regex/\2 \1/g" | \ 22 22 xargs printf "\t[ilog2(%s) + 1] = \"%s\",\n") 23 - egrep -q '#[[:space:]]*include[[:space:]]+<uapi/asm-generic/mman.*' ${arch_mman} && 23 + ([ ! -f ${arch_mman} ] || egrep -q '#[[:space:]]*include[[:space:]]+<uapi/asm-generic/mman.*' ${arch_mman}) && 24 24 (egrep $regex ${header_dir}/mman-common.h | \ 25 25 egrep -vw 'MAP_(UNINITIALIZED|TYPE|SHARED_VALIDATE)' | \ 26 26 sed -r "s/$regex/\2 \1/g" | \ 27 27 xargs printf "\t[ilog2(%s) + 1] = \"%s\",\n") 28 - egrep -q '#[[:space:]]*include[[:space:]]+<uapi/asm-generic/mman.h>.*' ${arch_mman} && 28 + ([ ! -f ${arch_mman} ] || egrep -q '#[[:space:]]*include[[:space:]]+<uapi/asm-generic/mman.h>.*' ${arch_mman}) && 29 29 (egrep $regex ${header_dir}/mman.h | \ 30 30 sed -r "s/$regex/\2 \1/g" | \ 31 31 xargs printf "\t[ilog2(%s) + 1] = \"%s\",\n")
+11
tools/perf/ui/browsers/hists.c
··· 2219 2219 if (!is_report_browser(hbt)) { 2220 2220 struct perf_top *top = hbt->arg; 2221 2221 2222 + printed += scnprintf(bf + printed, size - printed, 2223 + " lost: %" PRIu64 "/%" PRIu64, 2224 + top->lost, top->lost_total); 2225 + 2226 + printed += scnprintf(bf + printed, size - printed, 2227 + " drop: %" PRIu64 "/%" PRIu64, 2228 + top->drop, top->drop_total); 2229 + 2222 2230 if (top->zero) 2223 2231 printed += scnprintf(bf + printed, size - printed, " [z]"); 2232 + 2233 + perf_top__reset_sample_counters(top); 2224 2234 } 2235 + 2225 2236 2226 2237 return printed; 2227 2238 }
+1 -1
tools/perf/ui/tui/helpline.c
··· 24 24 SLsmg_set_color(0); 25 25 SLsmg_write_nstring((char *)msg, SLtt_Screen_Cols); 26 26 SLsmg_refresh(); 27 - strncpy(ui_helpline__current, msg, sz)[sz - 1] = '\0'; 27 + strlcpy(ui_helpline__current, msg, sz); 28 28 } 29 29 30 30 static int tui_helpline__show(const char *format, va_list ap)
+1
tools/perf/util/Build
··· 77 77 libperf-y += stat-display.o 78 78 libperf-y += record.o 79 79 libperf-y += srcline.o 80 + libperf-y += srccode.o 80 81 libperf-y += data.o 81 82 libperf-y += tsc.o 82 83 libperf-y += cloexec.o
+45 -4
tools/perf/util/annotate.c
··· 134 134 return 0; 135 135 } 136 136 137 + #include "arch/arc/annotate/instructions.c" 137 138 #include "arch/arm/annotate/instructions.c" 138 139 #include "arch/arm64/annotate/instructions.c" 139 140 #include "arch/x86/annotate/instructions.c" ··· 143 142 #include "arch/sparc/annotate/instructions.c" 144 143 145 144 static struct arch architectures[] = { 145 + { 146 + .name = "arc", 147 + .init = arc__annotate_init, 148 + }, 146 149 { 147 150 .name = "arm", 148 151 .init = arm__annotate_init, ··· 1005 1000 static void annotation__count_and_fill(struct annotation *notes, u64 start, u64 end, struct cyc_hist *ch) 1006 1001 { 1007 1002 unsigned n_insn; 1003 + unsigned int cover_insn = 0; 1008 1004 u64 offset; 1009 1005 1010 1006 n_insn = annotation__count_insn(notes, start, end); ··· 1019 1013 for (offset = start; offset <= end; offset++) { 1020 1014 struct annotation_line *al = notes->offsets[offset]; 1021 1015 1022 - if (al) 1016 + if (al && al->ipc == 0.0) { 1023 1017 al->ipc = ipc; 1018 + cover_insn++; 1019 + } 1020 + } 1021 + 1022 + if (cover_insn) { 1023 + notes->hit_cycles += ch->cycles; 1024 + notes->hit_insn += n_insn * ch->num; 1025 + notes->cover_insn += cover_insn; 1024 1026 } 1025 1027 } 1026 1028 } 1027 1029 1028 1030 void annotation__compute_ipc(struct annotation *notes, size_t size) 1029 1031 { 1030 - u64 offset; 1032 + s64 offset; 1031 1033 1032 1034 if (!notes->src || !notes->src->cycles_hist) 1033 1035 return; 1034 1036 1037 + notes->total_insn = annotation__count_insn(notes, 0, size - 1); 1038 + notes->hit_cycles = 0; 1039 + notes->hit_insn = 0; 1040 + notes->cover_insn = 0; 1041 + 1035 1042 pthread_mutex_lock(&notes->lock); 1036 - for (offset = 0; offset < size; ++offset) { 1043 + for (offset = size - 1; offset >= 0; --offset) { 1037 1044 struct cyc_hist *ch; 1038 1045 1039 1046 ch = &notes->src->cycles_hist[offset]; ··· 1777 1758 while (!feof(file)) { 1778 1759 /* 1779 1760 * The source code line number (lineno) needs to be kept in 1780 - * accross calls to symbol__parse_objdump_line(), so that it 1761 + * across calls to symbol__parse_objdump_line(), so that it 1781 1762 * can associate it with the instructions till the next one. 1782 1763 * See disasm_line__new() and struct disasm_line::line_nr. 1783 1764 */ ··· 2582 2563 disasm_line__scnprintf(dl, bf, size, !notes->options->use_offset); 2583 2564 } 2584 2565 2566 + static void ipc_coverage_string(char *bf, int size, struct annotation *notes) 2567 + { 2568 + double ipc = 0.0, coverage = 0.0; 2569 + 2570 + if (notes->hit_cycles) 2571 + ipc = notes->hit_insn / ((double)notes->hit_cycles); 2572 + 2573 + if (notes->total_insn) { 2574 + coverage = notes->cover_insn * 100.0 / 2575 + ((double)notes->total_insn); 2576 + } 2577 + 2578 + scnprintf(bf, size, "(Average IPC: %.2f, IPC Coverage: %.1f%%)", 2579 + ipc, coverage); 2580 + } 2581 + 2585 2582 static void __annotation_line__write(struct annotation_line *al, struct annotation *notes, 2586 2583 bool first_line, bool current_entry, bool change_color, int width, 2587 2584 void *obj, unsigned int percent_type, ··· 2692 2657 obj__printf(obj, "%*s ", 2693 2658 ANNOTATION__MINMAX_CYCLES_WIDTH - 1, 2694 2659 "Cycle(min/max)"); 2660 + } 2661 + 2662 + if (show_title && !*al->line) { 2663 + ipc_coverage_string(bf, sizeof(bf), notes); 2664 + obj__printf(obj, "%*s", ANNOTATION__AVG_IPC_WIDTH, bf); 2695 2665 } 2696 2666 } 2697 2667 ··· 2803 2763 notes->nr_events = nr_pcnt; 2804 2764 2805 2765 annotation__update_column_widths(notes); 2766 + sym->annotate2 = true; 2806 2767 2807 2768 return 0; 2808 2769
+5
tools/perf/util/annotate.h
··· 64 64 #define ANNOTATION__IPC_WIDTH 6 65 65 #define ANNOTATION__CYCLES_WIDTH 6 66 66 #define ANNOTATION__MINMAX_CYCLES_WIDTH 19 67 + #define ANNOTATION__AVG_IPC_WIDTH 36 67 68 68 69 struct annotation_options { 69 70 bool hide_src_code, ··· 263 262 pthread_mutex_t lock; 264 263 u64 max_coverage; 265 264 u64 start; 265 + u64 hit_cycles; 266 + u64 hit_insn; 267 + unsigned int total_insn; 268 + unsigned int cover_insn; 266 269 struct annotation_options *options; 267 270 struct annotation_line **offsets; 268 271 int nr_events;
+2 -2
tools/perf/util/bpf-loader.c
··· 99 99 if (err) 100 100 return ERR_PTR(-BPF_LOADER_ERRNO__COMPILE); 101 101 } else 102 - pr_debug("bpf: successfull builtin compilation\n"); 102 + pr_debug("bpf: successful builtin compilation\n"); 103 103 obj = bpf_object__open_buffer(obj_buf, obj_buf_sz, filename); 104 104 105 105 if (!IS_ERR_OR_NULL(obj) && llvm_param.dump_obj) ··· 1603 1603 1604 1604 op = bpf_map__add_newop(map, NULL); 1605 1605 if (IS_ERR(op)) 1606 - return ERR_PTR(PTR_ERR(op)); 1606 + return ERR_CAST(op); 1607 1607 op->op_type = BPF_MAP_OP_SET_EVSEL; 1608 1608 op->v.evsel = evsel; 1609 1609 }
+6 -2
tools/perf/util/config.c
··· 14 14 #include "util.h" 15 15 #include "cache.h" 16 16 #include <subcmd/exec-cmd.h> 17 + #include "util/event.h" /* proc_map_timeout */ 17 18 #include "util/hist.h" /* perf_hist_config */ 18 19 #include "util/llvm-utils.h" /* perf_llvm_config */ 19 20 #include "config.h" ··· 420 419 static int perf_default_core_config(const char *var __maybe_unused, 421 420 const char *value __maybe_unused) 422 421 { 422 + if (!strcmp(var, "core.proc-map-timeout")) 423 + proc_map_timeout = strtoul(value, NULL, 10); 424 + 423 425 /* Add other config variables here. */ 424 426 return 0; 425 427 } ··· 815 811 void set_buildid_dir(const char *dir) 816 812 { 817 813 if (dir) 818 - scnprintf(buildid_dir, MAXPATHLEN-1, "%s", dir); 814 + scnprintf(buildid_dir, MAXPATHLEN, "%s", dir); 819 815 820 816 /* default to $HOME/.debug */ 821 817 if (buildid_dir[0] == '\0') { 822 818 char *home = getenv("HOME"); 823 819 824 820 if (home) { 825 - snprintf(buildid_dir, MAXPATHLEN-1, "%s/%s", 821 + snprintf(buildid_dir, MAXPATHLEN, "%s/%s", 826 822 home, DEBUG_CACHE_DIR); 827 823 } else { 828 824 strncpy(buildid_dir, DEBUG_CACHE_DIR, MAXPATHLEN-1);
+60
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
··· 116 116 return 1; 117 117 } 118 118 119 + static int cs_etm_decoder__gen_etmv3_config(struct cs_etm_trace_params *params, 120 + ocsd_etmv3_cfg *config) 121 + { 122 + config->reg_idr = params->etmv3.reg_idr; 123 + config->reg_ctrl = params->etmv3.reg_ctrl; 124 + config->reg_ccer = params->etmv3.reg_ccer; 125 + config->reg_trc_id = params->etmv3.reg_trc_id; 126 + config->arch_ver = ARCH_V7; 127 + config->core_prof = profile_CortexA; 128 + 129 + return 0; 130 + } 131 + 119 132 static void cs_etm_decoder__gen_etmv4_config(struct cs_etm_trace_params *params, 120 133 ocsd_etmv4_cfg *config) 121 134 { ··· 250 237 struct cs_etm_decoder *decoder) 251 238 { 252 239 const char *decoder_name; 240 + ocsd_etmv3_cfg config_etmv3; 253 241 ocsd_etmv4_cfg trace_config_etmv4; 254 242 void *trace_config; 255 243 256 244 switch (t_params->protocol) { 245 + case CS_ETM_PROTO_ETMV3: 246 + case CS_ETM_PROTO_PTM: 247 + cs_etm_decoder__gen_etmv3_config(t_params, &config_etmv3); 248 + decoder_name = (t_params->protocol == CS_ETM_PROTO_ETMV3) ? 
249 + OCSD_BUILTIN_DCD_ETMV3 : 250 + OCSD_BUILTIN_DCD_PTM; 251 + trace_config = &config_etmv3; 252 + break; 257 253 case CS_ETM_PROTO_ETMV4i: 258 254 cs_etm_decoder__gen_etmv4_config(t_params, &trace_config_etmv4); 259 255 decoder_name = OCSD_BUILTIN_DCD_ETMV4I; ··· 285 263 decoder->tail = 0; 286 264 decoder->packet_count = 0; 287 265 for (i = 0; i < MAX_BUFFER; i++) { 266 + decoder->packet_buffer[i].isa = CS_ETM_ISA_UNKNOWN; 288 267 decoder->packet_buffer[i].start_addr = CS_ETM_INVAL_ADDR; 289 268 decoder->packet_buffer[i].end_addr = CS_ETM_INVAL_ADDR; 269 + decoder->packet_buffer[i].instr_count = 0; 290 270 decoder->packet_buffer[i].last_instr_taken_branch = false; 271 + decoder->packet_buffer[i].last_instr_size = 0; 291 272 decoder->packet_buffer[i].exc = false; 292 273 decoder->packet_buffer[i].exc_ret = false; 293 274 decoder->packet_buffer[i].cpu = INT_MIN; ··· 319 294 decoder->packet_count++; 320 295 321 296 decoder->packet_buffer[et].sample_type = sample_type; 297 + decoder->packet_buffer[et].isa = CS_ETM_ISA_UNKNOWN; 322 298 decoder->packet_buffer[et].exc = false; 323 299 decoder->packet_buffer[et].exc_ret = false; 324 300 decoder->packet_buffer[et].cpu = *((int *)inode->priv); 325 301 decoder->packet_buffer[et].start_addr = CS_ETM_INVAL_ADDR; 326 302 decoder->packet_buffer[et].end_addr = CS_ETM_INVAL_ADDR; 303 + decoder->packet_buffer[et].instr_count = 0; 304 + decoder->packet_buffer[et].last_instr_taken_branch = false; 305 + decoder->packet_buffer[et].last_instr_size = 0; 327 306 328 307 if (decoder->packet_count == MAX_BUFFER - 1) 329 308 return OCSD_RESP_WAIT; ··· 350 321 351 322 packet = &decoder->packet_buffer[decoder->tail]; 352 323 324 + switch (elem->isa) { 325 + case ocsd_isa_aarch64: 326 + packet->isa = CS_ETM_ISA_A64; 327 + break; 328 + case ocsd_isa_arm: 329 + packet->isa = CS_ETM_ISA_A32; 330 + break; 331 + case ocsd_isa_thumb2: 332 + packet->isa = CS_ETM_ISA_T32; 333 + break; 334 + case ocsd_isa_tee: 335 + case ocsd_isa_jazelle: 336 + case ocsd_isa_custom: 337 + case ocsd_isa_unknown: 338 + default: 339 + packet->isa = CS_ETM_ISA_UNKNOWN; 340 + } 341 + 353 342 packet->start_addr = elem->st_addr; 354 343 packet->end_addr = elem->en_addr; 344 + packet->instr_count = elem->num_instr_range; 345 + 355 346 switch (elem->last_i_type) { 356 347 case OCSD_INSTR_BR: 357 348 case OCSD_INSTR_BR_INDIRECT: ··· 384 335 packet->last_instr_taken_branch = false; 385 336 break; 386 337 } 338 + 339 + packet->last_instr_size = elem->last_instr_sz; 387 340 388 341 return ret; 389 342 } ··· 449 398 struct cs_etm_decoder *decoder) 450 399 { 451 400 const char *decoder_name; 401 + ocsd_etmv3_cfg config_etmv3; 452 402 ocsd_etmv4_cfg trace_config_etmv4; 453 403 void *trace_config; 454 404 u8 csid; 455 405 456 406 switch (t_params->protocol) { 407 + case CS_ETM_PROTO_ETMV3: 408 + case CS_ETM_PROTO_PTM: 409 + cs_etm_decoder__gen_etmv3_config(t_params, &config_etmv3); 410 + decoder_name = (t_params->protocol == CS_ETM_PROTO_ETMV3) ? 411 + OCSD_BUILTIN_DCD_ETMV3 : 412 + OCSD_BUILTIN_DCD_PTM; 413 + trace_config = &config_etmv3; 414 + break; 457 415 case CS_ETM_PROTO_ETMV4i: 458 416 cs_etm_decoder__gen_etmv4_config(t_params, &trace_config_etmv4); 459 417 decoder_name = OCSD_BUILTIN_DCD_ETMV4I;
+19
tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
··· 28 28 CS_ETM_TRACE_ON = 1 << 1, 29 29 }; 30 30 31 + enum cs_etm_isa { 32 + CS_ETM_ISA_UNKNOWN, 33 + CS_ETM_ISA_A64, 34 + CS_ETM_ISA_A32, 35 + CS_ETM_ISA_T32, 36 + }; 37 + 31 38 struct cs_etm_packet { 32 39 enum cs_etm_sample_type sample_type; 40 + enum cs_etm_isa isa; 33 41 u64 start_addr; 34 42 u64 end_addr; 43 + u32 instr_count; 35 44 u8 last_instr_taken_branch; 45 + u8 last_instr_size; 36 46 u8 exc; 37 47 u8 exc_ret; 38 48 int cpu; ··· 52 42 53 43 typedef u32 (*cs_etm_mem_cb_type)(struct cs_etm_queue *, u64, 54 44 size_t, u8 *); 45 + 46 + struct cs_etmv3_trace_params { 47 + u32 reg_ctrl; 48 + u32 reg_trc_id; 49 + u32 reg_ccer; 50 + u32 reg_idr; 51 + }; 55 52 56 53 struct cs_etmv4_trace_params { 57 54 u32 reg_idr0; ··· 72 55 struct cs_etm_trace_params { 73 56 int protocol; 74 57 union { 58 + struct cs_etmv3_trace_params etmv3; 75 59 struct cs_etmv4_trace_params etmv4; 76 60 }; 77 61 }; ··· 96 78 CS_ETM_PROTO_ETMV3 = 1, 97 79 CS_ETM_PROTO_ETMV4i, 98 80 CS_ETM_PROTO_ETMV4d, 81 + CS_ETM_PROTO_PTM, 99 82 }; 100 83 101 84 enum {
+90 -53
tools/perf/util/cs-etm.c
··· 31 31 32 32 #define MAX_TIMESTAMP (~0ULL) 33 33 34 - /* 35 - * A64 instructions are always 4 bytes 36 - * 37 - * Only A64 is supported, so can use this constant for converting between 38 - * addresses and instruction counts, calculting offsets etc 39 - */ 40 - #define A64_INSTR_SIZE 4 41 - 42 34 struct cs_etm_auxtrace { 43 35 struct auxtrace auxtrace; 44 36 struct auxtrace_queues queues; ··· 83 91 static int cs_etm__process_timeless_queues(struct cs_etm_auxtrace *etm, 84 92 pid_t tid, u64 time_); 85 93 94 + /* PTMs ETMIDR [11:8] set to b0011 */ 95 + #define ETMIDR_PTM_VERSION 0x00000300 96 + 97 + static u32 cs_etm__get_v7_protocol_version(u32 etmidr) 98 + { 99 + etmidr &= ETMIDR_PTM_VERSION; 100 + 101 + if (etmidr == ETMIDR_PTM_VERSION) 102 + return CS_ETM_PROTO_PTM; 103 + 104 + return CS_ETM_PROTO_ETMV3; 105 + } 106 + 86 107 static void cs_etm__packet_dump(const char *pkt_string) 87 108 { 88 109 const char *color = PERF_COLOR_BLUE; ··· 127 122 /* Use metadata to fill in trace parameters for trace decoder */ 128 123 t_params = zalloc(sizeof(*t_params) * etm->num_cpu); 129 124 for (i = 0; i < etm->num_cpu; i++) { 130 - t_params[i].protocol = CS_ETM_PROTO_ETMV4i; 131 - t_params[i].etmv4.reg_idr0 = etm->metadata[i][CS_ETMV4_TRCIDR0]; 132 - t_params[i].etmv4.reg_idr1 = etm->metadata[i][CS_ETMV4_TRCIDR1]; 133 - t_params[i].etmv4.reg_idr2 = etm->metadata[i][CS_ETMV4_TRCIDR2]; 134 - t_params[i].etmv4.reg_idr8 = etm->metadata[i][CS_ETMV4_TRCIDR8]; 135 - t_params[i].etmv4.reg_configr = 125 + if (etm->metadata[i][CS_ETM_MAGIC] == __perf_cs_etmv3_magic) { 126 + u32 etmidr = etm->metadata[i][CS_ETM_ETMIDR]; 127 + 128 + t_params[i].protocol = 129 + cs_etm__get_v7_protocol_version(etmidr); 130 + t_params[i].etmv3.reg_ctrl = 131 + etm->metadata[i][CS_ETM_ETMCR]; 132 + t_params[i].etmv3.reg_trc_id = 133 + etm->metadata[i][CS_ETM_ETMTRACEIDR]; 134 + } else if (etm->metadata[i][CS_ETM_MAGIC] == 135 + __perf_cs_etmv4_magic) { 136 + t_params[i].protocol = CS_ETM_PROTO_ETMV4i; 137 + t_params[i].etmv4.reg_idr0 = 138 + etm->metadata[i][CS_ETMV4_TRCIDR0]; 139 + t_params[i].etmv4.reg_idr1 = 140 + etm->metadata[i][CS_ETMV4_TRCIDR1]; 141 + t_params[i].etmv4.reg_idr2 = 142 + etm->metadata[i][CS_ETMV4_TRCIDR2]; 143 + t_params[i].etmv4.reg_idr8 = 144 + etm->metadata[i][CS_ETMV4_TRCIDR8]; 145 + t_params[i].etmv4.reg_configr = 136 146 etm->metadata[i][CS_ETMV4_TRCCONFIGR]; 137 - t_params[i].etmv4.reg_traceidr = 147 + t_params[i].etmv4.reg_traceidr = 138 148 etm->metadata[i][CS_ETMV4_TRCTRACEIDR]; 149 + } 139 150 } 140 151 141 152 /* Set decoder parameters to simply print the trace packets */ ··· 381 360 goto out_free; 382 361 383 362 for (i = 0; i < etm->num_cpu; i++) { 384 - t_params[i].protocol = CS_ETM_PROTO_ETMV4i; 385 - t_params[i].etmv4.reg_idr0 = etm->metadata[i][CS_ETMV4_TRCIDR0]; 386 - t_params[i].etmv4.reg_idr1 = etm->metadata[i][CS_ETMV4_TRCIDR1]; 387 - t_params[i].etmv4.reg_idr2 = etm->metadata[i][CS_ETMV4_TRCIDR2]; 388 - t_params[i].etmv4.reg_idr8 = etm->metadata[i][CS_ETMV4_TRCIDR8]; 389 - t_params[i].etmv4.reg_configr = 363 + if (etm->metadata[i][CS_ETM_MAGIC] == __perf_cs_etmv3_magic) { 364 + u32 etmidr = etm->metadata[i][CS_ETM_ETMIDR]; 365 + 366 + t_params[i].protocol = 367 + cs_etm__get_v7_protocol_version(etmidr); 368 + t_params[i].etmv3.reg_ctrl = 369 + etm->metadata[i][CS_ETM_ETMCR]; 370 + t_params[i].etmv3.reg_trc_id = 371 + etm->metadata[i][CS_ETM_ETMTRACEIDR]; 372 + } else if (etm->metadata[i][CS_ETM_MAGIC] == 373 + __perf_cs_etmv4_magic) { 374 + t_params[i].protocol = CS_ETM_PROTO_ETMV4i; 375 + t_params[i].etmv4.reg_idr0 = 376 + etm->metadata[i][CS_ETMV4_TRCIDR0]; 377 + t_params[i].etmv4.reg_idr1 = 378 + etm->metadata[i][CS_ETMV4_TRCIDR1]; 379 + t_params[i].etmv4.reg_idr2 = 380 + etm->metadata[i][CS_ETMV4_TRCIDR2]; 381 + t_params[i].etmv4.reg_idr8 = 382 + etm->metadata[i][CS_ETMV4_TRCIDR8]; 383 + t_params[i].etmv4.reg_configr = 390 384 etm->metadata[i][CS_ETMV4_TRCCONFIGR]; 391 - t_params[i].etmv4.reg_traceidr = 385 + t_params[i].etmv4.reg_traceidr = 392 386 etm->metadata[i][CS_ETMV4_TRCTRACEIDR]; 387 + } 393 388 } 394 389 395 390 /* Set decoder parameters to simply print the trace packets */ ··· 547 510 etmq->last_branch_rb->nr = 0; 548 511 } 549 512 550 - static inline u64 cs_etm__last_executed_instr(struct cs_etm_packet *packet) 551 - { 552 - /* Returns 0 for the CS_ETM_TRACE_ON packet */ 553 - if (packet->sample_type == CS_ETM_TRACE_ON) 554 - return 0; 513 + static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq, 514 + u64 addr) { 515 + u8 instrBytes[2]; 555 516 517 + cs_etm__mem_access(etmq, addr, ARRAY_SIZE(instrBytes), instrBytes); 556 518 /* 557 - * The packet records the execution range with an exclusive end address 558 - * 559 - * A64 instructions are constant size, so the last executed 560 - * instruction is A64_INSTR_SIZE before the end address 561 - * Will need to do instruction level decode for T32 instructions as 562 - * they can be variable size (not yet supported). 519 + * T32 instruction size is indicated by bits[15:11] of the first 520 + * 16-bit word of the instruction: 0b11101, 0b11110 and 0b11111 521 + * denote a 32-bit instruction. 563 522 */ 564 523 return ((instrBytes[1] & 0xF8) >= 0xE8) ? 4 : 2; 565 524 } 566 525 567 526 static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet) ··· 569 536 return packet->start_addr; 570 537 } 571 538 572 - static inline u64 cs_etm__instr_count(const struct cs_etm_packet *packet) 539 + static inline 540 + u64 cs_etm__last_executed_instr(const struct cs_etm_packet *packet) 573 541 { 574 - /* 575 - * Only A64 instructions are currently supported, so can get 576 - * instruction count by dividing. 577 - * Will need to do instruction level decode for T32 instructions as 578 - * they can be variable size (not yet supported).
579 - */ 580 - return (packet->end_addr - packet->start_addr) / A64_INSTR_SIZE; 542 + /* Returns 0 for the CS_ETM_TRACE_ON packet */ 543 + if (packet->sample_type == CS_ETM_TRACE_ON) 544 + return 0; 545 + 546 + return packet->end_addr - packet->last_instr_size; 581 547 } 582 548 583 - static inline u64 cs_etm__instr_addr(const struct cs_etm_packet *packet, 549 + static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq, 550 + const struct cs_etm_packet *packet, 584 551 u64 offset) 585 552 { 586 - /* 587 - * Only A64 instructions are currently supported, so can get 588 - * instruction address by muliplying. 589 - * Will need to do instruction level decode for T32 instructions as 590 - * they can be variable size (not yet supported). 591 - */ 592 - return packet->start_addr + offset * A64_INSTR_SIZE; 553 + if (packet->isa == CS_ETM_ISA_T32) { 554 + u64 addr = packet->start_addr; 555 + 556 + while (offset > 0) { 557 + addr += cs_etm__t32_instr_size(etmq, addr); 558 + offset--; 559 + } 560 + return addr; 561 + } 562 + 563 + /* Assume a 4 byte instruction size (A32/A64) */ 564 + return packet->start_addr + offset * 4; 593 565 } 594 566 595 567 static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq) ··· 926 888 struct cs_etm_auxtrace *etm = etmq->etm; 927 889 struct cs_etm_packet *tmp; 928 890 int ret; 929 - u64 instrs_executed; 891 + u64 instrs_executed = etmq->packet->instr_count; 930 892 931 - instrs_executed = cs_etm__instr_count(etmq->packet); 932 893 etmq->period_instructions += instrs_executed; 933 894 934 895 /* ··· 957 920 * executed, but PC has not advanced to next instruction) 958 921 */ 959 922 u64 offset = (instrs_executed - instrs_over - 1); 960 - u64 addr = cs_etm__instr_addr(etmq->packet, offset); 923 + u64 addr = cs_etm__instr_addr(etmq, etmq->packet, offset); 961 924 962 925 ret = cs_etm__synth_instruction_sample( 963 926 etmq, addr, etm->instructions_sample_period);
+1 -1
tools/perf/util/dso.c
···
 	unlink(tmpbuf);
 
 	if (pathname && (fd >= 0))
-		strncpy(pathname, tmpbuf, len);
+		strlcpy(pathname, tmpbuf, len);
 
 	return fd;
 }
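Several hunks in this series (dso.c, header.c, parse-events.c, probe-file.c) swap strncpy() for strlcpy(), whose contract is guaranteed NUL termination plus a return value that exposes truncation. A minimal sketch of that contract (hypothetical my_strlcpy; the real implementation comes from tools/lib or libbsd):

```c
#include <assert.h>
#include <string.h>

/*
 * Like strlcpy(): always NUL-terminates dst when size > 0 and returns
 * strlen(src), so truncation is detectable as ret >= size. strncpy(),
 * by contrast, leaves dst unterminated when src does not fit.
 */
static size_t my_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size) {
		size_t copy = len >= size ? size - 1 : len;

		memcpy(dst, src, copy);
		dst[copy] = '\0';
	}
	return len;
}
```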
+1 -1
tools/perf/util/env.c
···
 	struct utsname uts;
 	char *arch_name;
 
-	if (!env) { /* Assume local operation */
+	if (!env || !env->arch) { /* Assume local operation */
 		if (uname(&uts) < 0)
 			return NULL;
 		arch_name = uts.machine;
+41 -20
tools/perf/util/event.c
···
 #include "asm/bug.h"
 #include "stat.h"
 
+#define DEFAULT_PROC_MAP_PARSE_TIMEOUT 500
+
 static const char *perf_event__names[] = {
 	[0] = "TOTAL",
 	[PERF_RECORD_MMAP] = "MMAP",
···
 	[MNT_NS_INDEX]		= "mnt",
 	[CGROUP_NS_INDEX]	= "cgroup",
 };
+
+unsigned int proc_map_timeout = DEFAULT_PROC_MAP_PARSE_TIMEOUT;
 
 const char *perf_event__name(unsigned int id)
 {
···
 				       pid_t pid, pid_t tgid,
 				       perf_event__handler_t process,
 				       struct machine *machine,
-				       bool mmap_data,
-				       unsigned int proc_map_timeout)
+				       bool mmap_data)
 {
 	char filename[PATH_MAX];
 	FILE *fp;
···
 				      perf_event__handler_t process,
 				      struct perf_tool *tool,
 				      struct machine *machine,
-				      bool mmap_data,
-				      unsigned int proc_map_timeout)
+				      bool mmap_data)
 {
 	char filename[PATH_MAX];
 	DIR *tasks;
···
 	 */
 	if (pid == tgid &&
 	    perf_event__synthesize_mmap_events(tool, mmap_event, pid, tgid,
-					       process, machine, mmap_data,
-					       proc_map_timeout))
+					       process, machine, mmap_data))
 		return -1;
 
 	return 0;
···
 		if (_pid == pid) {
 			/* process the parent's maps too */
 			rc = perf_event__synthesize_mmap_events(tool, mmap_event, pid, tgid,
-						process, machine, mmap_data, proc_map_timeout);
+						process, machine, mmap_data);
 			if (rc)
 				break;
 		}
···
 				      struct thread_map *threads,
 				      perf_event__handler_t process,
 				      struct machine *machine,
-				      bool mmap_data,
-				      unsigned int proc_map_timeout)
+				      bool mmap_data)
 {
 	union perf_event *comm_event, *mmap_event, *fork_event;
 	union perf_event *namespaces_event;
···
 					    fork_event, namespaces_event,
 					    thread_map__pid(threads, thread), 0,
 					    process, tool, machine,
-					    mmap_data, proc_map_timeout)) {
+					    mmap_data)) {
 			err = -1;
 			break;
 		}
···
 					    fork_event, namespaces_event,
 					    comm_event->comm.pid, 0,
 					    process, tool, machine,
-					    mmap_data, proc_map_timeout)) {
+					    mmap_data)) {
 			err = -1;
 			break;
 		}
···
 				   perf_event__handler_t process,
 				   struct machine *machine,
 				   bool mmap_data,
-				   unsigned int proc_map_timeout,
 				   struct dirent **dirent,
 				   int start,
 				   int num)
···
 		 */
 		__event__synthesize_thread(comm_event, mmap_event, fork_event,
 					   namespaces_event, pid, 1, process,
-					   tool, machine, mmap_data,
-					   proc_map_timeout);
+					   tool, machine, mmap_data);
 	}
 	err = 0;
···
 	perf_event__handler_t process;
 	struct machine *machine;
 	bool mmap_data;
-	unsigned int proc_map_timeout;
 	struct dirent **dirent;
 	int num;
 	int start;
···
 
 	__perf_event__synthesize_threads(args->tool, args->process,
 					 args->machine, args->mmap_data,
-					 args->proc_map_timeout, args->dirent,
+					 args->dirent,
 					 args->start, args->num);
 	return NULL;
 }
···
 				  perf_event__handler_t process,
 				  struct machine *machine,
 				  bool mmap_data,
-				  unsigned int proc_map_timeout,
 				  unsigned int nr_threads_synthesize)
 {
 	struct synthesize_threads_arg *args = NULL;
···
 	if (thread_nr <= 1) {
 		err = __perf_event__synthesize_threads(tool, process,
 						       machine, mmap_data,
-						       proc_map_timeout,
 						       dirent, base, n);
 		goto free_dirent;
 	}
···
 		args[i].process = process;
 		args[i].machine = machine;
 		args[i].mmap_data = mmap_data;
-		args[i].proc_map_timeout = proc_map_timeout;
 		args[i].dirent = dirent;
 	}
 	for (i = 0; i < m; i++) {
···
 	return al->map;
 }
 
+/*
+ * For branch stacks or branch samples, the sample cpumode might not be correct
+ * because it applies only to the sample 'ip' and not necessarily to 'addr' or
+ * branch stack addresses. If possible, use a fallback to deal with those cases.
+ */
+struct map *thread__find_map_fb(struct thread *thread, u8 cpumode, u64 addr,
+				struct addr_location *al)
+{
+	struct map *map = thread__find_map(thread, cpumode, addr, al);
+	struct machine *machine = thread->mg->machine;
+	u8 addr_cpumode = machine__addr_cpumode(machine, cpumode, addr);
+
+	if (map || addr_cpumode == cpumode)
+		return map;
+
+	return thread__find_map(thread, addr_cpumode, addr, al);
+}
+
 struct symbol *thread__find_symbol(struct thread *thread, u8 cpumode,
 				   u64 addr, struct addr_location *al)
 {
 	al->sym = NULL;
 	if (thread__find_map(thread, cpumode, addr, al))
+		al->sym = map__find_symbol(al->map, al->addr);
+	return al->sym;
+}
+
+struct symbol *thread__find_symbol_fb(struct thread *thread, u8 cpumode,
+				      u64 addr, struct addr_location *al)
+{
+	al->sym = NULL;
+	if (thread__find_map_fb(thread, cpumode, addr, al))
 		al->sym = map__find_symbol(al->map, al->addr);
 	return al->sym;
 }
···
 void thread__resolve(struct thread *thread, struct addr_location *al,
 		     struct perf_sample *sample)
 {
-	thread__find_map(thread, sample->cpumode, sample->addr, al);
+	thread__find_map_fb(thread, sample->cpumode, sample->addr, al);
 
 	al->cpu = sample->cpu;
 	al->sym = NULL;
+3 -5
tools/perf/util/event.h
···
 int perf_event__synthesize_thread_map(struct perf_tool *tool,
 				      struct thread_map *threads,
 				      perf_event__handler_t process,
-				      struct machine *machine, bool mmap_data,
-				      unsigned int proc_map_timeout);
+				      struct machine *machine, bool mmap_data);
 int perf_event__synthesize_thread_map2(struct perf_tool *tool,
 				       struct thread_map *threads,
 				       perf_event__handler_t process,
···
 int perf_event__synthesize_threads(struct perf_tool *tool,
 				   perf_event__handler_t process,
 				   struct machine *machine, bool mmap_data,
-				   unsigned int proc_map_timeout,
 				   unsigned int nr_threads_synthesize);
 int perf_event__synthesize_kernel_mmap(struct perf_tool *tool,
 				       perf_event__handler_t process,
···
 				       pid_t pid, pid_t tgid,
 				       perf_event__handler_t process,
 				       struct machine *machine,
-				       bool mmap_data,
-				       unsigned int proc_map_timeout);
+				       bool mmap_data);
 
 int perf_event__synthesize_extra_kmaps(struct perf_tool *tool,
 				       perf_event__handler_t process,
···
 
 extern int sysctl_perf_event_max_stack;
 extern int sysctl_perf_event_max_contexts_per_stack;
+extern unsigned int proc_map_timeout;
 
 #endif /* __PERF_RECORD_H */
+3 -3
tools/perf/util/evlist.c
···
  */
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 			 unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite)
+			 bool auxtrace_overwrite, int nr_cblocks)
 {
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
···
 	 * Its value is decided by evsel's write_backward.
 	 * So &mp should not be passed through const pointer.
 	 */
-	struct mmap_params mp;
+	struct mmap_params mp = { .nr_cblocks = nr_cblocks };
 
 	if (!evlist->mmap)
 		evlist->mmap = perf_evlist__alloc_mmap(evlist, false);
···
 
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages)
 {
-	return perf_evlist__mmap_ex(evlist, pages, 0, false);
+	return perf_evlist__mmap_ex(evlist, pages, 0, false, 0);
 }
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
+1 -1
tools/perf/util/evlist.h
···
 
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 			 unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite);
+			 bool auxtrace_overwrite, int nr_cblocks);
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages);
 void perf_evlist__munmap(struct perf_evlist *evlist);
+2 -2
tools/perf/util/evsel.h
···
 	char *name;
 	double scale;
 	const char *unit;
-	struct tep_event_format *tp_format;
+	struct tep_event *tp_format;
 	off_t id_offset;
 	struct perf_stat_evsel *stats;
 	void *priv;
···
 
 struct perf_evsel *perf_evsel__new_cycles(bool precise);
 
-struct tep_event_format *event_format__new(const char *sys, const char *name);
+struct tep_event *event_format__new(const char *sys, const char *name);
 
 void perf_evsel__init(struct perf_evsel *evsel,
 		      struct perf_event_attr *attr, int idx);
+1
tools/perf/util/evsel_fprintf.c
···
 			if (!print_oneline)
 				printed += fprintf(fp, "\n");
 
+			/* Add srccode here too? */
 			if (symbol_conf.bt_stop_list &&
 			    node->sym &&
 			    strlist__has_entry(symbol_conf.bt_stop_list,
+4 -4
tools/perf/util/header.c
···
 	lseek(fd, sec_start, SEEK_SET);
 	/*
 	 * may write more than needed due to dropped feature, but
-	 * this is okay, reader will skip the mising entries
+	 * this is okay, reader will skip the missing entries
 	 */
 	err = do_write(&ff, feat_sec, sec_size);
 	if (err < 0)
···
 static int perf_evsel__prepare_tracepoint_event(struct perf_evsel *evsel,
 						struct tep_handle *pevent)
 {
-	struct tep_event_format *event;
+	struct tep_event *event;
 	char bf[128];
 
 	/* already prepared */
···
 	if (ev == NULL)
 		return -ENOMEM;
 
-	strncpy(ev->data, evsel->unit, size);
+	strlcpy(ev->data, evsel->unit, size + 1);
 	err = process(tool, (union perf_event *)ev, NULL, NULL);
 	free(ev);
 	return err;
···
 	if (ev == NULL)
 		return -ENOMEM;
 
-	strncpy(ev->data, evsel->name, len);
+	strlcpy(ev->data, evsel->name, len + 1);
 	err = process(tool, (union perf_event*) ev, NULL, NULL);
 	free(ev);
 	return err;
+1 -1
tools/perf/util/hist.c
···
 
 /*
  * If this is not the last column, then we need to pad it according to the
- * pre-calculated max lenght for this column, otherwise don't bother adding
+ * pre-calculated max length for this column, otherwise don't bother adding
  * spaces because that would break viewing this with, for instance, 'less',
  * that would show tons of trailing spaces when a long C++ demangled method
  * names is sampled.
+1
tools/perf/util/hist.h
···
 	HISTC_TRACE,
 	HISTC_SYM_SIZE,
 	HISTC_DSO_SIZE,
+	HISTC_SYMBOL_IPC,
 	HISTC_NR_COLS, /* Last entry */
 };
+1 -1
tools/perf/util/jitdump.c
···
 	uint64_t	sample_type;
 	size_t		bufsize;
 	FILE		*in;
-	bool		needs_bswap; /* handles cross-endianess */
+	bool		needs_bswap; /* handles cross-endianness */
 	bool		use_arch_timestamp;
 	void		*debug_data;
 	void		*unwinding_data;
+29 -4
tools/perf/util/machine.c
···
 	struct machine *machine = machine__new_host();
 	/*
 	 * FIXME:
-	 * 1) We should switch to machine__load_kallsyms(), i.e. not explicitely
+	 * 1) We should switch to machine__load_kallsyms(), i.e. not explicitly
 	 *    ask for not using the kcore parsing code, once this one is fixed
 	 *    to create a map per module.
 	 */
···
 int __machine__synthesize_threads(struct machine *machine, struct perf_tool *tool,
 				  struct target *target, struct thread_map *threads,
 				  perf_event__handler_t process, bool data_mmap,
-				  unsigned int proc_map_timeout,
 				  unsigned int nr_threads_synthesize)
 {
 	if (target__has_task(target))
-		return perf_event__synthesize_thread_map(tool, threads, process, machine, data_mmap, proc_map_timeout);
+		return perf_event__synthesize_thread_map(tool, threads, process, machine, data_mmap);
 	else if (target__has_cpu(target))
 		return perf_event__synthesize_threads(tool, process,
 						      machine, data_mmap,
-						      proc_map_timeout,
 						      nr_threads_synthesize);
 	/* command specified */
 	return 0;
···
 		machine->kernel_start = map->start;
 	}
 	return err;
+}
+
+u8 machine__addr_cpumode(struct machine *machine, u8 cpumode, u64 addr)
+{
+	u8 addr_cpumode = cpumode;
+	bool kernel_ip;
+
+	if (!machine->single_address_space)
+		goto out;
+
+	kernel_ip = machine__kernel_ip(machine, addr);
+	switch (cpumode) {
+	case PERF_RECORD_MISC_KERNEL:
+	case PERF_RECORD_MISC_USER:
+		addr_cpumode = kernel_ip ? PERF_RECORD_MISC_KERNEL :
+					   PERF_RECORD_MISC_USER;
+		break;
+	case PERF_RECORD_MISC_GUEST_KERNEL:
+	case PERF_RECORD_MISC_GUEST_USER:
+		addr_cpumode = kernel_ip ? PERF_RECORD_MISC_GUEST_KERNEL :
+					   PERF_RECORD_MISC_GUEST_USER;
+		break;
+	default:
+		break;
+	}
+out:
+	return addr_cpumode;
 }
 
 struct dso *machine__findnew_dso(struct machine *machine, const char *filename)
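machine__addr_cpumode() only reclassifies when the machine uses a single address space; the decision reduces to comparing the address against the kernel start. A simplified sketch of that idea (hypothetical names, host modes only; the real function also remaps the guest kernel/user pair the same way):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for PERF_RECORD_MISC_KERNEL / PERF_RECORD_MISC_USER */
enum { MISC_KERNEL = 1, MISC_USER = 2 };

/*
 * With a single kernel/user address space, the sample cpumode only
 * describes 'ip'; a branch target or 'addr' can still be classified
 * just by comparing it against the kernel start address.
 */
static int addr_cpumode(bool single_address_space, uint64_t kernel_start,
			int cpumode, uint64_t addr)
{
	if (!single_address_space)
		return cpumode;	/* trust the recorded cpumode */
	return addr >= kernel_start ? MISC_KERNEL : MISC_USER;
}
```

thread__find_map_fb() in the event.c hunk above uses exactly this reclassified mode as its fallback lookup.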
+3 -3
tools/perf/util/machine.h
···
 	u16		  id_hdr_size;
 	bool		  comm_exec;
 	bool		  kptr_restrict_warned;
+	bool		  single_address_space;
 	char		  *root_dir;
 	char		  *mmap_name;
 	struct threads    threads[THREADS__TABLE_SIZE];
···
 
 	return ip >= kernel_start;
 }
+
+u8 machine__addr_cpumode(struct machine *machine, u8 cpumode, u64 addr);
 
 struct thread *machine__find_thread(struct machine *machine, pid_t pid,
 				    pid_t tid);
···
 int __machine__synthesize_threads(struct machine *machine, struct perf_tool *tool,
 				  struct target *target, struct thread_map *threads,
 				  perf_event__handler_t process, bool data_mmap,
-				  unsigned int proc_map_timeout,
 				  unsigned int nr_threads_synthesize);
 static inline
 int machine__synthesize_threads(struct machine *machine, struct target *target,
 				struct thread_map *threads, bool data_mmap,
-				unsigned int proc_map_timeout,
 				unsigned int nr_threads_synthesize)
 {
 	return __machine__synthesize_threads(machine, NULL, target, threads,
 					     perf_event__process, data_mmap,
-					     proc_map_timeout,
 					     nr_threads_synthesize);
 }
+55 -7
tools/perf/util/map.c
···
 #include "srcline.h"
 #include "namespaces.h"
 #include "unwind.h"
+#include "srccode.h"
 
 static void __maps__insert(struct maps *maps, struct map *map);
 static void __maps__insert_name(struct maps *maps, struct map *map);
···
 		free_srcline(srcline);
 	}
 	return ret;
+}
+
+int map__fprintf_srccode(struct map *map, u64 addr,
+			 FILE *fp,
+			 struct srccode_state *state)
+{
+	char *srcfile;
+	int ret = 0;
+	unsigned line;
+	int len;
+	char *srccode;
+
+	if (!map || !map->dso)
+		return 0;
+	srcfile = get_srcline_split(map->dso,
+				    map__rip_2objdump(map, addr),
+				    &line);
+	if (!srcfile)
+		return 0;
+
+	/* Avoid redundant printing */
+	if (state &&
+	    state->srcfile &&
+	    !strcmp(state->srcfile, srcfile) &&
+	    state->line == line) {
+		free(srcfile);
+		return 0;
+	}
+
+	srccode = find_sourceline(srcfile, line, &len);
+	if (!srccode)
+		goto out_free_line;
+
+	ret = fprintf(fp, "|%-8d %.*s", line, len, srccode);
+	state->srcfile = srcfile;
+	state->line = line;
+	return ret;
+
+out_free_line:
+	free(srcfile);
+	return ret;
+}
+
+
+void srccode_state_free(struct srccode_state *state)
+{
+	zfree(&state->srcfile);
+	state->line = 0;
 }
 
 /**
···
 
 struct map *maps__find(struct maps *maps, u64 ip)
 {
-	struct rb_node **p, *parent = NULL;
+	struct rb_node *p;
 	struct map *m;
 
 	down_read(&maps->lock);
 
-	p = &maps->entries.rb_node;
-	while (*p != NULL) {
-		parent = *p;
-		m = rb_entry(parent, struct map, rb_node);
+	p = maps->entries.rb_node;
+	while (p != NULL) {
+		m = rb_entry(p, struct map, rb_node);
 		if (ip < m->start)
-			p = &(*p)->rb_left;
+			p = p->rb_left;
 		else if (ip >= m->end)
-			p = &(*p)->rb_right;
+			p = p->rb_right;
 		else
 			goto out;
 	}
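The maps__find() cleanup drops the insertion-style double indirection (`struct rb_node **p` plus an unused `parent`) for a plain read-only descent. The same left/right decision over sorted, non-overlapping [start, end) ranges can be sketched as a binary search over an array (hypothetical types standing in for the rb-tree):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct map { uint64_t start, end; };	/* half-open [start, end) */

/*
 * Mirror of the rb-tree walk: go left when ip < start, right when
 * ip >= end, stop when ip lands inside a map. Returns the index of
 * the containing map, or -1 when no map covers ip.
 */
static int maps_find(const struct map *maps, size_t n, uint64_t ip)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ip < maps[mid].start)
			hi = mid;
		else if (ip >= maps[mid].end)
			lo = mid + 1;
		else
			return (int)mid;	/* ip falls inside this map */
	}
	return -1;
}
```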
+16
tools/perf/util/map.h
···
 int map__fprintf_srcline(struct map *map, u64 addr, const char *prefix,
 			 FILE *fp);
 
+struct srccode_state {
+	char *srcfile;
+	unsigned line;
+};
+
+static inline void srccode_state_init(struct srccode_state *state)
+{
+	state->srcfile = NULL;
+	state->line = 0;
+}
+
+void srccode_state_free(struct srccode_state *state);
+
+int map__fprintf_srccode(struct map *map, u64 addr,
+			 FILE *fp, struct srccode_state *state);
+
 int map__load(struct map *map);
 struct symbol *map__find_symbol(struct map *map, u64 addr);
 struct symbol *map__find_symbol_by_name(struct map *map, const char *name);
+151 -1
tools/perf/util/mmap.c
···
 {
 }
 
+#ifdef HAVE_AIO_SUPPORT
+static int perf_mmap__aio_mmap(struct perf_mmap *map, struct mmap_params *mp)
+{
+	int delta_max, i, prio;
+
+	map->aio.nr_cblocks = mp->nr_cblocks;
+	if (map->aio.nr_cblocks) {
+		map->aio.aiocb = calloc(map->aio.nr_cblocks, sizeof(struct aiocb *));
+		if (!map->aio.aiocb) {
+			pr_debug2("failed to allocate aiocb for data buffer, error %m\n");
+			return -1;
+		}
+		map->aio.cblocks = calloc(map->aio.nr_cblocks, sizeof(struct aiocb));
+		if (!map->aio.cblocks) {
+			pr_debug2("failed to allocate cblocks for data buffer, error %m\n");
+			return -1;
+		}
+		map->aio.data = calloc(map->aio.nr_cblocks, sizeof(void *));
+		if (!map->aio.data) {
+			pr_debug2("failed to allocate data buffer, error %m\n");
+			return -1;
+		}
+		delta_max = sysconf(_SC_AIO_PRIO_DELTA_MAX);
+		for (i = 0; i < map->aio.nr_cblocks; ++i) {
+			map->aio.data[i] = malloc(perf_mmap__mmap_len(map));
+			if (!map->aio.data[i]) {
+				pr_debug2("failed to allocate data buffer area, error %m");
+				return -1;
+			}
+			/*
+			 * Use cblock.aio_fildes value different from -1
+			 * to denote started aio write operation on the
+			 * cblock so it requires explicit record__aio_sync()
+			 * call prior to the cblock being reused again.
+			 */
+			map->aio.cblocks[i].aio_fildes = -1;
+			/*
+			 * Allocate cblocks with priority delta to have
+			 * faster aio write system calls because queued requests
+			 * are kept in separate per-prio queues and adding
+			 * a new request will iterate thru shorter per-prio
+			 * list. Blocks with numbers higher than
+			 * _SC_AIO_PRIO_DELTA_MAX go with priority 0.
+			 */
+			prio = delta_max - i;
+			map->aio.cblocks[i].aio_reqprio = prio >= 0 ? prio : 0;
+		}
+	}
+
+	return 0;
+}
+
+static void perf_mmap__aio_munmap(struct perf_mmap *map)
+{
+	int i;
+
+	for (i = 0; i < map->aio.nr_cblocks; ++i)
+		zfree(&map->aio.data[i]);
+	if (map->aio.data)
+		zfree(&map->aio.data);
+	zfree(&map->aio.cblocks);
+	zfree(&map->aio.aiocb);
+}
+
+int perf_mmap__aio_push(struct perf_mmap *md, void *to, int idx,
+			int push(void *to, struct aiocb *cblock, void *buf, size_t size, off_t off),
+			off_t *off)
+{
+	u64 head = perf_mmap__read_head(md);
+	unsigned char *data = md->base + page_size;
+	unsigned long size, size0 = 0;
+	void *buf;
+	int rc = 0;
+
+	rc = perf_mmap__read_init(md);
+	if (rc < 0)
+		return (rc == -EAGAIN) ? 0 : -1;
+
+	/*
+	 * md->base data is copied into md->data[idx] buffer to
+	 * release space in the kernel buffer as fast as possible,
+	 * thru perf_mmap__consume() below.
+	 *
+	 * That lets the kernel proceed with storing more
+	 * profiling data into the kernel buffer earlier than other
+	 * per-cpu kernel buffers are handled.
+	 *
+	 * Copying can be done in two steps in case the chunk of
+	 * profiling data crosses the upper bound of the kernel buffer.
+	 * In this case we first move part of data from md->start
+	 * till the upper bound and then the remainder from the
+	 * beginning of the kernel buffer till the end of
+	 * the data chunk.
+	 */
+
+	size = md->end - md->start;
+
+	if ((md->start & md->mask) + size != (md->end & md->mask)) {
+		buf = &data[md->start & md->mask];
+		size = md->mask + 1 - (md->start & md->mask);
+		md->start += size;
+		memcpy(md->aio.data[idx], buf, size);
+		size0 = size;
+	}
+
+	buf = &data[md->start & md->mask];
+	size = md->end - md->start;
+	md->start += size;
+	memcpy(md->aio.data[idx] + size0, buf, size);
+
+	/*
+	 * Increment md->refcount to guard md->data[idx] buffer
+	 * from premature deallocation because md object can be
+	 * released earlier than aio write request started
+	 * on mmap->data[idx] is complete.
+	 *
+	 * perf_mmap__put() is done at record__aio_complete()
+	 * after started request completion.
+	 */
+	perf_mmap__get(md);
+
+	md->prev = head;
+	perf_mmap__consume(md);
+
+	rc = push(to, &md->aio.cblocks[idx], md->aio.data[idx], size0 + size, *off);
+	if (!rc) {
+		*off += size0 + size;
+	} else {
+		/*
+		 * Decrement md->refcount back if aio write
+		 * operation failed to start.
+		 */
+		perf_mmap__put(md);
+	}
+
+	return rc;
+}
+#else
+static int perf_mmap__aio_mmap(struct perf_mmap *map __maybe_unused,
+			       struct mmap_params *mp __maybe_unused)
+{
+	return 0;
+}
+
+static void perf_mmap__aio_munmap(struct perf_mmap *map __maybe_unused)
+{
+}
+#endif
+
 void perf_mmap__munmap(struct perf_mmap *map)
 {
+	perf_mmap__aio_munmap(map);
 	if (map->base != NULL) {
 		munmap(map->base, perf_mmap__mmap_len(map));
 		map->base = NULL;
···
 				&mp->auxtrace_mp, map->base, fd))
 		return -1;
 
-	return 0;
+	return perf_mmap__aio_mmap(map, mp);
 }
 
 static int overwrite_rb_find_range(void *buf, int mask, u64 *start, u64 *end)
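perf_mmap__aio_push() copies the pending chunk out of the kernel ring buffer in at most two memcpy() steps when the chunk wraps past the top. That split, isolated from the perf_mmap machinery (hypothetical helper; `mask` is size-1 of a power-of-two buffer, `start`/`end` are free-running positions):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy the bytes at positions [start, end) out of a power-of-two
 * circular buffer (capacity mask + 1) into a linear buffer, splitting
 * into two memcpy()s when the range wraps past the top. Returns the
 * number of bytes copied.
 */
static size_t ring_copy(const uint8_t *data, uint64_t mask,
			uint64_t start, uint64_t end, uint8_t *out)
{
	size_t size0 = 0, size = end - start;

	if ((start & mask) + size != (end & mask)) {
		/* chunk crosses the upper bound: copy the tail first */
		size0 = mask + 1 - (start & mask);
		memcpy(out, &data[start & mask], size0);
		start += size0;
		size -= size0;
	}
	/* then the remainder from the beginning of the buffer */
	memcpy(out + size0, &data[start & mask], size);
	return size0 + size;
}
```

The wrap test is the same `(md->start & md->mask) + size != (md->end & md->mask)` comparison used in the patch.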
+25 -1
tools/perf/util/mmap.h
···
 #include <linux/types.h>
 #include <linux/ring_buffer.h>
 #include <stdbool.h>
+#ifdef HAVE_AIO_SUPPORT
+#include <aio.h>
+#endif
 #include "auxtrace.h"
 #include "event.h"
 
+struct aiocb;
 /**
  * struct perf_mmap - perf's ring buffer mmap details
  *
···
 	bool		 overwrite;
 	struct auxtrace_mmap auxtrace_mmap;
 	char		 event_copy[PERF_SAMPLE_MAX_SIZE] __aligned(8);
+#ifdef HAVE_AIO_SUPPORT
+	struct {
+		void		 **data;
+		struct aiocb	 *cblocks;
+		struct aiocb	 **aiocb;
+		int		 nr_cblocks;
+	} aio;
+#endif
 };
 
 /*
···
 };
 
 struct mmap_params {
-	int prot, mask;
+	int prot, mask, nr_cblocks;
 	struct auxtrace_mmap_params auxtrace_mp;
 };
···
 
 int perf_mmap__push(struct perf_mmap *md, void *to,
 		    int push(struct perf_mmap *map, void *to, void *buf, size_t size));
+#ifdef HAVE_AIO_SUPPORT
+int perf_mmap__aio_push(struct perf_mmap *md, void *to, int idx,
+			int push(void *to, struct aiocb *cblock, void *buf, size_t size, off_t off),
+			off_t *off);
+#else
+static inline int perf_mmap__aio_push(struct perf_mmap *md __maybe_unused, void *to __maybe_unused, int idx __maybe_unused,
+	int push(void *to, struct aiocb *cblock, void *buf, size_t size, off_t off) __maybe_unused,
+	off_t *off __maybe_unused)
+{
+	return 0;
+}
+#endif
 
 size_t perf_mmap__mmap_len(struct perf_mmap *map);
+38 -6
tools/perf/util/ordered-events.c
···
 	return 0;
 }
 
-static int __ordered_events__flush(struct ordered_events *oe)
+static int do_flush(struct ordered_events *oe, bool show_progress)
 {
 	struct list_head *head = &oe->events;
 	struct ordered_event *tmp, *iter;
 	u64 limit = oe->next_flush;
 	u64 last_ts = oe->last ? oe->last->timestamp : 0ULL;
-	bool show_progress = limit == ULLONG_MAX;
 	struct ui_progress prog;
 	int ret;
···
 	return 0;
 }
 
-int ordered_events__flush(struct ordered_events *oe, enum oe_flush how)
+static int __ordered_events__flush(struct ordered_events *oe, enum oe_flush how,
+				   u64 timestamp)
 {
 	static const char * const str[] = {
 		"NONE",
···
 		"HALF ",
 	};
 	int err;
+	bool show_progress = false;
 
 	if (oe->nr_events == 0)
 		return 0;
 
 	switch (how) {
 	case OE_FLUSH__FINAL:
+		show_progress = true;
+		__fallthrough;
+	case OE_FLUSH__TOP:
 		oe->next_flush = ULLONG_MAX;
 		break;
···
 		break;
 	}
 
+	case OE_FLUSH__TIME:
+		oe->next_flush = timestamp;
+		show_progress = false;
+		break;
+
 	case OE_FLUSH__ROUND:
 	case OE_FLUSH__NONE:
 	default:
···
 		   str[how], oe->nr_events);
 	pr_oe_time(oe->max_timestamp, "max_timestamp\n");
 
-	err = __ordered_events__flush(oe);
+	err = do_flush(oe, show_progress);
 
 	if (!err) {
 		if (how == OE_FLUSH__ROUND)
···
 	return err;
 }
 
-void ordered_events__init(struct ordered_events *oe, ordered_events__deliver_t deliver)
+int ordered_events__flush(struct ordered_events *oe, enum oe_flush how)
+{
+	return __ordered_events__flush(oe, how, 0);
+}
+
+int ordered_events__flush_time(struct ordered_events *oe, u64 timestamp)
+{
+	return __ordered_events__flush(oe, OE_FLUSH__TIME, timestamp);
+}
+
+u64 ordered_events__first_time(struct ordered_events *oe)
+{
+	struct ordered_event *event;
+
+	if (list_empty(&oe->events))
+		return 0;
+
+	event = list_first_entry(&oe->events, struct ordered_event, list);
+	return event->timestamp;
+}
+
+void ordered_events__init(struct ordered_events *oe, ordered_events__deliver_t deliver,
+			  void *data)
 {
 	INIT_LIST_HEAD(&oe->events);
 	INIT_LIST_HEAD(&oe->cache);
···
 	oe->max_alloc_size = (u64) -1;
 	oe->cur_alloc_size = 0;
 	oe->deliver	   = deliver;
+	oe->data	   = data;
 }
 
 static void
···
 
 	ordered_events__free(oe);
 	memset(oe, '\0', sizeof(*oe));
-	ordered_events__init(oe, old_deliver);
+	ordered_events__init(oe, old_deliver, oe->data);
 }
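The new OE_FLUSH__TIME mode caps the flush at a caller-supplied timestamp instead of ULLONG_MAX, so 'perf top' and 'perf trace' can drain only events old enough to be safely ordered. The core of that policy over a time-sorted queue, sketched with a plain array (hypothetical helper standing in for the list walk in do_flush()):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Deliver (here: just count) all queued events with timestamp <= limit
 * from a queue sorted by timestamp, mirroring OE_FLUSH__TIME where the
 * flush limit is the caller's timestamp rather than ULLONG_MAX.
 */
static size_t flush_time(const uint64_t *ts, size_t n, uint64_t limit)
{
	size_t i = 0;

	while (i < n && ts[i] <= limit)
		i++;		/* event ts[i] would be delivered here */
	return i;
}
```

Events newer than the limit stay queued, which is what keeps out-of-order records from being delivered too early.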
+7 -1
tools/perf/util/ordered-events.h
···
 	OE_FLUSH__FINAL,
 	OE_FLUSH__ROUND,
 	OE_FLUSH__HALF,
+	OE_FLUSH__TOP,
+	OE_FLUSH__TIME,
 };
 
 struct ordered_events;
···
 	enum oe_flush			 last_flush_type;
 	u32				 nr_unordered_events;
 	bool				 copy_on_queue;
+	void				 *data;
 };
 
 int ordered_events__queue(struct ordered_events *oe, union perf_event *event,
 			  u64 timestamp, u64 file_offset);
 void ordered_events__delete(struct ordered_events *oe, struct ordered_event *event);
 int ordered_events__flush(struct ordered_events *oe, enum oe_flush how);
-void ordered_events__init(struct ordered_events *oe, ordered_events__deliver_t deliver);
+int ordered_events__flush_time(struct ordered_events *oe, u64 timestamp);
+void ordered_events__init(struct ordered_events *oe, ordered_events__deliver_t deliver,
+			  void *data);
 void ordered_events__free(struct ordered_events *oe);
 void ordered_events__reinit(struct ordered_events *oe);
+u64 ordered_events__first_time(struct ordered_events *oe);
 
 static inline
 void ordered_events__set_alloc_size(struct ordered_events *oe, u64 size)
+1 -1
tools/perf/util/parse-events.c
···
 		if (!name_only && strlen(syms->alias))
 			snprintf(name, MAX_NAME_LEN, "%s OR %s", syms->symbol, syms->alias);
 		else
-			strncpy(name, syms->symbol, MAX_NAME_LEN);
+			strlcpy(name, syms->symbol, MAX_NAME_LEN);
 
 		evt_list[evt_i] = strdup(name);
 		if (evt_list[evt_i] == NULL)
+2 -2
tools/perf/util/probe-event.c
···
 		return ret;
 
 	for (i = 0; i < ntevs && ret >= 0; i++) {
-		/* point.address is the addres of point.symbol + point.offset */
+		/* point.address is the address of point.symbol + point.offset */
 		tevs[i].point.address -= stext;
 		tevs[i].point.module = strdup(exec);
 		if (!tevs[i].point.module) {
···
 	/*
 	 * Give it a '0x' leading symbol name.
 	 * In __add_probe_trace_events, a NULL symbol is interpreted as
-	 * invalud.
+	 * invalid.
 	 */
 	if (asprintf(&tp->symbol, "0x%lx", tp->address) < 0)
 		goto errout;
+1 -1
tools/perf/util/probe-file.c
···
 
 	if (target && build_id_cache__cached(target)) {
 		/* This is a cached buildid */
-		strncpy(sbuildid, target, SBUILD_ID_SIZE);
+		strlcpy(sbuildid, target, SBUILD_ID_SIZE);
 		dir_name = build_id_cache__linkname(sbuildid, NULL, 0);
 		goto found;
 	}
+2 -2
tools/perf/util/python.c
···
 	struct tep_format_field *field;
 
 	if (!evsel->tp_format) {
-		struct tep_event_format *tp_format;
+		struct tep_event *tp_format;
 
 		tp_format = trace_event__tp_format_id(evsel->attr.config);
 		if (!tp_format)
···
 static PyObject *pyrf__tracepoint(struct pyrf_evsel *pevsel,
 				  PyObject *args, PyObject *kwargs)
 {
-	struct tep_event_format *tp_format;
+	struct tep_event *tp_format;
 	static char *kwlist[] = { "sys", "name", NULL };
 	char *sys = NULL;
 	char *name = NULL;
+3 -3
tools/perf/util/scripting-engines/trace-event-perl.c
···
 	LEAVE;
 }
 
-static void define_event_symbols(struct tep_event_format *event,
+static void define_event_symbols(struct tep_event *event,
 				 const char *ev_name,
 				 struct tep_print_arg *args)
 {
···
 				    struct addr_location *al)
 {
 	struct thread *thread = al->thread;
-	struct tep_event_format *event = evsel->tp_format;
+	struct tep_event *event = evsel->tp_format;
 	struct tep_format_field *field;
 	static char handler[256];
 	unsigned long long val;
···
 
 static int perl_generate_script(struct tep_handle *pevent, const char *outfile)
 {
-	struct tep_event_format *event = NULL;
+	struct tep_event *event = NULL;
 	struct tep_format_field *f;
 	char fname[PATH_MAX];
 	int not_first, count;
+12 -12
tools/perf/util/scripting-engines/trace-event-python.c
··· 264 264 Py_DECREF(t); 265 265 } 266 266 267 - static void define_event_symbols(struct tep_event_format *event, 267 + static void define_event_symbols(struct tep_event *event, 268 268 const char *ev_name, 269 269 struct tep_print_arg *args) 270 270 { ··· 332 332 define_event_symbols(event, ev_name, args->next); 333 333 } 334 334 335 - static PyObject *get_field_numeric_entry(struct tep_event_format *event, 335 + static PyObject *get_field_numeric_entry(struct tep_event *event, 336 336 struct tep_format_field *field, void *data) 337 337 { 338 338 bool is_array = field->flags & TEP_FIELD_IS_ARRAY; ··· 494 494 pydict_set_item_string_decref(pyelem, "cycles", 495 495 PyLong_FromUnsignedLongLong(br->entries[i].flags.cycles)); 496 496 497 - thread__find_map(thread, sample->cpumode, 498 - br->entries[i].from, &al); 497 + thread__find_map_fb(thread, sample->cpumode, 498 + br->entries[i].from, &al); 499 499 dsoname = get_dsoname(al.map); 500 500 pydict_set_item_string_decref(pyelem, "from_dsoname", 501 501 _PyUnicode_FromString(dsoname)); 502 502 503 - thread__find_map(thread, sample->cpumode, 504 - br->entries[i].to, &al); 503 + thread__find_map_fb(thread, sample->cpumode, 504 + br->entries[i].to, &al); 505 505 dsoname = get_dsoname(al.map); 506 506 pydict_set_item_string_decref(pyelem, "to_dsoname", 507 507 _PyUnicode_FromString(dsoname)); ··· 576 576 if (!pyelem) 577 577 Py_FatalError("couldn't create Python dictionary"); 578 578 579 - thread__find_symbol(thread, sample->cpumode, 580 - br->entries[i].from, &al); 579 + thread__find_symbol_fb(thread, sample->cpumode, 580 + br->entries[i].from, &al); 581 581 get_symoff(al.sym, &al, true, bf, sizeof(bf)); 582 582 pydict_set_item_string_decref(pyelem, "from", 583 583 _PyUnicode_FromString(bf)); 584 584 585 - thread__find_symbol(thread, sample->cpumode, 586 - br->entries[i].to, &al); 585 + thread__find_symbol_fb(thread, sample->cpumode, 586 + br->entries[i].to, &al); 587 587 get_symoff(al.sym, &al, true, bf, sizeof(bf)); 588 
588 pydict_set_item_string_decref(pyelem, "to", 589 589 _PyUnicode_FromString(bf)); ··· 790 790 struct perf_evsel *evsel, 791 791 struct addr_location *al) 792 792 { 793 - struct tep_event_format *event = evsel->tp_format; 793 + struct tep_event *event = evsel->tp_format; 794 794 PyObject *handler, *context, *t, *obj = NULL, *callchain; 795 795 PyObject *dict = NULL, *all_entries_dict = NULL; 796 796 static char handler_name[256]; ··· 1590 1590 1591 1591 static int python_generate_script(struct tep_handle *pevent, const char *outfile) 1592 1592 { 1593 - struct tep_event_format *event = NULL; 1593 + struct tep_event *event = NULL; 1594 1594 struct tep_format_field *f; 1595 1595 char fname[PATH_MAX]; 1596 1596 int not_first, count;
+6 -1
tools/perf/util/session.c
··· 24 24 #include "thread.h" 25 25 #include "thread-stack.h" 26 26 #include "stat.h" 27 + #include "arch/common.h" 27 28 28 29 static int perf_session__deliver_event(struct perf_session *session, 29 30 union perf_event *event, ··· 126 125 session->tool = tool; 127 126 INIT_LIST_HEAD(&session->auxtrace_index); 128 127 machines__init(&session->machines); 129 - ordered_events__init(&session->ordered_events, ordered_events__deliver_event); 128 + ordered_events__init(&session->ordered_events, 129 + ordered_events__deliver_event, NULL); 130 130 131 131 if (data) { 132 132 if (perf_data__open(data)) ··· 151 149 } else { 152 150 session->machines.host.env = &perf_env; 153 151 } 152 + 153 + session->machines.host.single_address_space = 154 + perf_env__single_address_space(session->machines.host.env); 154 155 155 156 if (!data || perf_data__is_write(data)) { 156 157 /*
+62 -1
tools/perf/util/sort.c
··· 13 13 #include "strlist.h" 14 14 #include <traceevent/event-parse.h> 15 15 #include "mem-events.h" 16 + #include "annotate.h" 16 17 #include <linux/kernel.h> 17 18 18 19 regex_t parent_regex; ··· 37 36 * -t, --field-separator 38 37 * 39 38 * option, that uses a special separator character and don't pad with spaces, 40 - * replacing all occurances of this separator in symbol names (and other 39 + * replacing all occurrences of this separator in symbol names (and other 41 40 * output) with a '.' character, that thus it's the only non valid separator. 42 41 */ 43 42 static int repsep_snprintf(char *bf, size_t size, const char *fmt, ...) ··· 421 420 .se_cmp = sort__srcline_to_cmp, 422 421 .se_snprintf = hist_entry__srcline_to_snprintf, 423 422 .se_width_idx = HISTC_SRCLINE_TO, 423 + }; 424 + 425 + static int hist_entry__sym_ipc_snprintf(struct hist_entry *he, char *bf, 426 + size_t size, unsigned int width) 427 + { 428 + 429 + struct symbol *sym = he->ms.sym; 430 + struct map *map = he->ms.map; 431 + struct perf_evsel *evsel = hists_to_evsel(he->hists); 432 + struct annotation *notes; 433 + double ipc = 0.0, coverage = 0.0; 434 + char tmp[64]; 435 + 436 + if (!sym) 437 + return repsep_snprintf(bf, size, "%-*s", width, "-"); 438 + 439 + if (!sym->annotate2 && symbol__annotate2(sym, map, evsel, 440 + &annotation__default_options, NULL) < 0) { 441 + return 0; 442 + } 443 + 444 + notes = symbol__annotation(sym); 445 + 446 + if (notes->hit_cycles) 447 + ipc = notes->hit_insn / ((double)notes->hit_cycles); 448 + 449 + if (notes->total_insn) { 450 + coverage = notes->cover_insn * 100.0 / 451 + ((double)notes->total_insn); 452 + } 453 + 454 + snprintf(tmp, sizeof(tmp), "%-5.2f [%5.1f%%]", ipc, coverage); 455 + return repsep_snprintf(bf, size, "%-*s", width, tmp); 456 + } 457 + 458 + struct sort_entry sort_sym_ipc = { 459 + .se_header = "IPC [IPC Coverage]", 460 + .se_cmp = sort__sym_cmp, 461 + .se_snprintf = hist_entry__sym_ipc_snprintf, 462 + .se_width_idx = 
HISTC_SYMBOL_IPC, 463 + }; 464 + 465 + static int hist_entry__sym_ipc_null_snprintf(struct hist_entry *he 466 + __maybe_unused, 467 + char *bf, size_t size, 468 + unsigned int width) 469 + { 470 + char tmp[64]; 471 + 472 + snprintf(tmp, sizeof(tmp), "%-5s %2s", "-", "-"); 473 + return repsep_snprintf(bf, size, "%-*s", width, tmp); 474 + } 475 + 476 + struct sort_entry sort_sym_ipc_null = { 477 + .se_header = "IPC [IPC Coverage]", 478 + .se_cmp = sort__sym_cmp, 479 + .se_snprintf = hist_entry__sym_ipc_null_snprintf, 480 + .se_width_idx = HISTC_SYMBOL_IPC, 424 481 }; 425 482 426 483 /* --sort srcfile */ ··· 1633 1574 DIM(SORT_SYM_SIZE, "symbol_size", sort_sym_size), 1634 1575 DIM(SORT_DSO_SIZE, "dso_size", sort_dso_size), 1635 1576 DIM(SORT_CGROUP_ID, "cgroup_id", sort_cgroup_id), 1577 + DIM(SORT_SYM_IPC_NULL, "ipc_null", sort_sym_ipc_null), 1636 1578 }; 1637 1579 1638 1580 #undef DIM ··· 1651 1591 DIM(SORT_CYCLES, "cycles", sort_cycles), 1652 1592 DIM(SORT_SRCLINE_FROM, "srcline_from", sort_srcline_from), 1653 1593 DIM(SORT_SRCLINE_TO, "srcline_to", sort_srcline_to), 1594 + DIM(SORT_SYM_IPC, "ipc_lbr", sort_sym_ipc), 1654 1595 }; 1655 1596 1656 1597 #undef DIM
+2
tools/perf/util/sort.h
··· 229 229 SORT_SYM_SIZE, 230 230 SORT_DSO_SIZE, 231 231 SORT_CGROUP_ID, 232 + SORT_SYM_IPC_NULL, 232 233 233 234 /* branch stack specific sort keys */ 234 235 __SORT_BRANCH_STACK, ··· 243 242 SORT_CYCLES, 244 243 SORT_SRCLINE_FROM, 245 244 SORT_SRCLINE_TO, 245 + SORT_SYM_IPC, 246 246 247 247 /* memory mode specific sort keys */ 248 248 __SORT_MEMORY_MODE,
+186
tools/perf/util/srccode.c
··· 1 + /* 2 + * Manage printing of source lines 3 + * Copyright (c) 2017, Intel Corporation. 4 + * Author: Andi Kleen 5 + * 6 + * This program is free software; you can redistribute it and/or modify it 7 + * under the terms and conditions of the GNU General Public License, 8 + * version 2, as published by the Free Software Foundation. 9 + * 10 + * This program is distributed in the hope it will be useful, but WITHOUT 11 + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 12 + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for 13 + * more details. 14 + */ 15 + #include "linux/list.h" 16 + #include <stdlib.h> 17 + #include <sys/mman.h> 18 + #include <sys/stat.h> 19 + #include <fcntl.h> 20 + #include <unistd.h> 21 + #include <assert.h> 22 + #include <string.h> 23 + #include "srccode.h" 24 + #include "debug.h" 25 + #include "util.h" 26 + 27 + #define MAXSRCCACHE (32*1024*1024) 28 + #define MAXSRCFILES 64 29 + #define SRC_HTAB_SZ 64 30 + 31 + struct srcfile { 32 + struct hlist_node hash_nd; 33 + struct list_head nd; 34 + char *fn; 35 + char **lines; 36 + char *map; 37 + unsigned numlines; 38 + size_t maplen; 39 + }; 40 + 41 + static struct hlist_head srcfile_htab[SRC_HTAB_SZ]; 42 + static LIST_HEAD(srcfile_list); 43 + static long map_total_sz; 44 + static int num_srcfiles; 45 + 46 + static unsigned shash(unsigned char *s) 47 + { 48 + unsigned h = 0; 49 + while (*s) 50 + h = 65599 * h + *s++; 51 + return h ^ (h >> 16); 52 + } 53 + 54 + static int countlines(char *map, int maplen) 55 + { 56 + int numl; 57 + char *end = map + maplen; 58 + char *p = map; 59 + 60 + if (maplen == 0) 61 + return 0; 62 + numl = 0; 63 + while (p < end && (p = memchr(p, '\n', end - p)) != NULL) { 64 + numl++; 65 + p++; 66 + } 67 + if (p < end) 68 + numl++; 69 + return numl; 70 + } 71 + 72 + static void fill_lines(char **lines, int maxline, char *map, int maplen) 73 + { 74 + int l; 75 + char *end = map + maplen; 76 + char *p = map; 77 + 78 + if 
(maplen == 0 || maxline == 0) 79 + return; 80 + l = 0; 81 + lines[l++] = map; 82 + while (p < end && (p = memchr(p, '\n', end - p)) != NULL) { 83 + if (l >= maxline) 84 + return; 85 + lines[l++] = ++p; 86 + } 87 + if (p < end) 88 + lines[l] = p; 89 + } 90 + 91 + static void free_srcfile(struct srcfile *sf) 92 + { 93 + list_del(&sf->nd); 94 + hlist_del(&sf->hash_nd); 95 + map_total_sz -= sf->maplen; 96 + munmap(sf->map, sf->maplen); 97 + free(sf->lines); 98 + free(sf->fn); 99 + free(sf); 100 + num_srcfiles--; 101 + } 102 + 103 + static struct srcfile *find_srcfile(char *fn) 104 + { 105 + struct stat st; 106 + struct srcfile *h; 107 + int fd; 108 + unsigned long sz; 109 + unsigned hval = shash((unsigned char *)fn) % SRC_HTAB_SZ; 110 + 111 + hlist_for_each_entry (h, &srcfile_htab[hval], hash_nd) { 112 + if (!strcmp(fn, h->fn)) { 113 + /* Move to front */ 114 + list_del(&h->nd); 115 + list_add(&h->nd, &srcfile_list); 116 + return h; 117 + } 118 + } 119 + 120 + /* Only prune if there is more than one entry */ 121 + while ((num_srcfiles > MAXSRCFILES || map_total_sz > MAXSRCCACHE) && 122 + srcfile_list.next != &srcfile_list) { 123 + assert(!list_empty(&srcfile_list)); 124 + h = list_entry(srcfile_list.prev, struct srcfile, nd); 125 + free_srcfile(h); 126 + } 127 + 128 + fd = open(fn, O_RDONLY); 129 + if (fd < 0 || fstat(fd, &st) < 0) { 130 + pr_debug("cannot open source file %s\n", fn); 131 + return NULL; 132 + } 133 + 134 + h = malloc(sizeof(struct srcfile)); 135 + if (!h) 136 + return NULL; 137 + 138 + h->fn = strdup(fn); 139 + if (!h->fn) 140 + goto out_h; 141 + 142 + h->maplen = st.st_size; 143 + sz = (h->maplen + page_size - 1) & ~(page_size - 1); 144 + h->map = mmap(NULL, sz, PROT_READ, MAP_SHARED, fd, 0); 145 + close(fd); 146 + if (h->map == (char *)-1) { 147 + pr_debug("cannot mmap source file %s\n", fn); 148 + goto out_fn; 149 + } 150 + h->numlines = countlines(h->map, h->maplen); 151 + h->lines = calloc(h->numlines, sizeof(char *)); 152 + if (!h->lines) 153 + 
goto out_map; 154 + fill_lines(h->lines, h->numlines, h->map, h->maplen); 155 + list_add(&h->nd, &srcfile_list); 156 + hlist_add_head(&h->hash_nd, &srcfile_htab[hval]); 157 + map_total_sz += h->maplen; 158 + num_srcfiles++; 159 + return h; 160 + 161 + out_map: 162 + munmap(h->map, sz); 163 + out_fn: 164 + free(h->fn); 165 + out_h: 166 + free(h); 167 + return NULL; 168 + } 169 + 170 + /* Result is not 0 terminated */ 171 + char *find_sourceline(char *fn, unsigned line, int *lenp) 172 + { 173 + char *l, *p; 174 + struct srcfile *sf = find_srcfile(fn); 175 + if (!sf) 176 + return NULL; 177 + line--; 178 + if (line >= sf->numlines) 179 + return NULL; 180 + l = sf->lines[line]; 181 + if (!l) 182 + return NULL; 183 + p = memchr(l, '\n', sf->map + sf->maplen - l); 184 + *lenp = p - l; 185 + return l; 186 + }
+7
tools/perf/util/srccode.h
··· 1 + #ifndef SRCCODE_H 2 + #define SRCCODE_H 1 3 + 4 + /* Result is not 0 terminated */ 5 + char *find_sourceline(char *fn, unsigned line, int *lenp); 6 + 7 + #endif
+28
tools/perf/util/srcline.c
··· 548 548 return srcline; 549 549 } 550 550 551 + /* Returns filename and fills in line number in line */ 552 + char *get_srcline_split(struct dso *dso, u64 addr, unsigned *line) 553 + { 554 + char *file = NULL; 555 + const char *dso_name; 556 + 557 + if (!dso->has_srcline) 558 + goto out; 559 + 560 + dso_name = dso__name(dso); 561 + if (dso_name == NULL) 562 + goto out; 563 + 564 + if (!addr2line(dso_name, addr, &file, line, dso, true, NULL, NULL)) 565 + goto out; 566 + 567 + dso->a2l_fails = 0; 568 + return file; 569 + 570 + out: 571 + if (dso->a2l_fails && ++dso->a2l_fails > A2L_FAIL_LIMIT) { 572 + dso->has_srcline = 0; 573 + dso__free_a2l(dso); 574 + } 575 + 576 + return NULL; 577 + } 578 + 551 579 void free_srcline(char *srcline) 552 580 { 553 581 if (srcline && strcmp(srcline, SRCLINE_UNKNOWN) != 0)
+1
tools/perf/util/srcline.h
··· 16 16 bool show_sym, bool show_addr, bool unwind_inlines, 17 17 u64 ip); 18 18 void free_srcline(char *srcline); 19 + char *get_srcline_split(struct dso *dso, u64 addr, unsigned *line); 19 20 20 21 /* insert the srcline into the DSO, which will take ownership */ 21 22 void srcline__tree_insert(struct rb_root *tree, u64 addr, char *srcline);
+11 -5
tools/perf/util/stat-display.c
··· 59 59 print_noise_pct(config, stddev_stats(&ps->res_stats[0]), avg); 60 60 } 61 61 62 + static void print_cgroup(struct perf_stat_config *config, struct perf_evsel *evsel) 63 + { 64 + if (nr_cgroups) { 65 + const char *cgrp_name = evsel->cgrp ? evsel->cgrp->name : ""; 66 + fprintf(config->output, "%s%s", config->csv_sep, cgrp_name); 67 + } 68 + } 69 + 70 + 62 71 static void aggr_printout(struct perf_stat_config *config, 63 72 struct perf_evsel *evsel, int id, int nr) 64 73 { ··· 345 336 346 337 fprintf(output, "%-*s", config->csv_output ? 0 : 25, perf_evsel__name(evsel)); 347 338 348 - if (evsel->cgrp) 349 - fprintf(output, "%s%s", config->csv_sep, evsel->cgrp->name); 339 + print_cgroup(config, evsel); 350 340 } 351 341 352 342 static bool is_mixed_hw_group(struct perf_evsel *counter) ··· 439 431 config->csv_output ? 0 : -25, 440 432 perf_evsel__name(counter)); 441 433 442 - if (counter->cgrp) 443 - fprintf(config->output, "%s%s", 444 - config->csv_sep, counter->cgrp->name); 434 + print_cgroup(config, counter); 445 435 446 436 if (!config->csv_output) 447 437 pm(config, &os, NULL, NULL, "", 0);
+2 -1
tools/perf/util/stat-shadow.c
··· 209 209 int cpu, struct runtime_stat *st) 210 210 { 211 211 int ctx = evsel_context(counter); 212 + u64 count_ns = count; 212 213 213 214 count *= counter->scale; 214 215 215 216 if (perf_evsel__is_clock(counter)) 216 - update_runtime_stat(st, STAT_NSECS, 0, cpu, count); 217 + update_runtime_stat(st, STAT_NSECS, 0, cpu, count_ns); 217 218 else if (perf_evsel__match(counter, HARDWARE, HW_CPU_CYCLES)) 218 219 update_runtime_stat(st, STAT_CYCLES, ctx, cpu, count); 219 220 else if (perf_stat_evsel__is(counter, CYCLES_IN_TX))
+1 -1
tools/perf/util/svghelper.c
··· 334 334 if (file) { 335 335 while (fgets(buf, 255, file)) { 336 336 if (strstr(buf, "model name")) { 337 - strncpy(cpu_m, &buf[13], 255); 337 + strlcpy(cpu_m, &buf[13], 255); 338 338 break; 339 339 } 340 340 }
+1
tools/perf/util/symbol.h
··· 63 63 u8 ignore:1; 64 64 u8 inlined:1; 65 65 u8 arch_sym; 66 + bool annotate2; 66 67 char name[0]; 67 68 }; 68 69
+2
tools/perf/util/thread.c
··· 64 64 RB_CLEAR_NODE(&thread->rb_node); 65 65 /* Thread holds first ref to nsdata. */ 66 66 thread->nsinfo = nsinfo__new(pid); 67 + srccode_state_init(&thread->srccode_state); 67 68 } 68 69 69 70 return thread; ··· 104 103 105 104 unwind__finish_access(thread); 106 105 nsinfo__zput(thread->nsinfo); 106 + srccode_state_free(&thread->srccode_state); 107 107 108 108 exit_rwsem(&thread->namespaces_lock); 109 109 exit_rwsem(&thread->comm_lock);
+6
tools/perf/util/thread.h
··· 8 8 #include <unistd.h> 9 9 #include <sys/types.h> 10 10 #include "symbol.h" 11 + #include "map.h" 11 12 #include <strlist.h> 12 13 #include <intlist.h> 13 14 #include "rwsem.h" ··· 39 38 void *priv; 40 39 struct thread_stack *ts; 41 40 struct nsinfo *nsinfo; 41 + struct srccode_state srccode_state; 42 42 #ifdef HAVE_LIBUNWIND_SUPPORT 43 43 void *addr_space; 44 44 struct unwind_libunwind_ops *unwind_libunwind_ops; ··· 98 96 99 97 struct map *thread__find_map(struct thread *thread, u8 cpumode, u64 addr, 100 98 struct addr_location *al); 99 + struct map *thread__find_map_fb(struct thread *thread, u8 cpumode, u64 addr, 100 + struct addr_location *al); 101 101 102 102 struct symbol *thread__find_symbol(struct thread *thread, u8 cpumode, 103 103 u64 addr, struct addr_location *al); 104 + struct symbol *thread__find_symbol_fb(struct thread *thread, u8 cpumode, 105 + u64 addr, struct addr_location *al); 104 106 105 107 void thread__find_cpumode_addr_location(struct thread *thread, u64 addr, 106 108 struct addr_location *al);
+5 -3
tools/perf/util/top.c
··· 46 46 samples_per_sec; 47 47 ret = SNPRINTF(bf, size, 48 48 " PerfTop:%8.0f irqs/sec kernel:%4.1f%%" 49 - " exact: %4.1f%% [", samples_per_sec, 50 - ksamples_percent, esamples_percent); 49 + " exact: %4.1f%% lost: %" PRIu64 "/%" PRIu64 " drop: %" PRIu64 "/%" PRIu64 " [", 50 + samples_per_sec, ksamples_percent, esamples_percent, 51 + top->lost, top->lost_total, top->drop, top->drop_total); 51 52 } else { 52 53 float us_samples_per_sec = top->us_samples / top->delay_secs; 53 54 float guest_kernel_samples_per_sec = top->guest_kernel_samples / top->delay_secs; ··· 107 106 top->evlist->cpus->nr > 1 ? "s" : ""); 108 107 } 109 108 109 + perf_top__reset_sample_counters(top); 110 110 return ret; 111 111 } 112 112 ··· 115 113 { 116 114 top->samples = top->us_samples = top->kernel_samples = 117 115 top->exact_samples = top->guest_kernel_samples = 118 - top->guest_us_samples = 0; 116 + top->guest_us_samples = top->lost = top->drop = 0; 119 117 }
+9 -1
tools/perf/util/top.h
··· 22 22 * Symbols will be added here in perf_event__process_sample and will 23 23 * get out after decayed. 24 24 */ 25 - u64 samples; 25 + u64 samples, lost, lost_total, drop, drop_total; 26 26 u64 kernel_samples, us_samples; 27 27 u64 exact_samples; 28 28 u64 guest_us_samples, guest_kernel_samples; ··· 40 40 const char *sym_filter; 41 41 float min_percent; 42 42 unsigned int nr_threads_synthesize; 43 + 44 + struct { 45 + struct ordered_events *in; 46 + struct ordered_events data[2]; 47 + bool rotate; 48 + pthread_mutex_t mutex; 49 + pthread_cond_t cond; 50 + } qe; 43 51 }; 44 52 45 53 #define CONSOLE_CLEAR ""
+8 -8
tools/perf/util/trace-event-parse.c
··· 33 33 int *offset, int *size, const char *type) 34 34 { 35 35 struct tep_handle *pevent = context->pevent; 36 - struct tep_event_format *event; 36 + struct tep_event *event; 37 37 struct tep_format_field *field; 38 38 39 39 if (!*size) { ··· 95 95 } 96 96 97 97 unsigned long long 98 - raw_field_value(struct tep_event_format *event, const char *name, void *data) 98 + raw_field_value(struct tep_event *event, const char *name, void *data) 99 99 { 100 100 struct tep_format_field *field; 101 101 unsigned long long val; ··· 109 109 return val; 110 110 } 111 111 112 - unsigned long long read_size(struct tep_event_format *event, void *ptr, int size) 112 + unsigned long long read_size(struct tep_event *event, void *ptr, int size) 113 113 { 114 114 return tep_read_number(event->pevent, ptr, size); 115 115 } 116 116 117 - void event_format__fprintf(struct tep_event_format *event, 117 + void event_format__fprintf(struct tep_event *event, 118 118 int cpu, void *data, int size, FILE *fp) 119 119 { 120 120 struct tep_record record; ··· 131 131 trace_seq_destroy(&s); 132 132 } 133 133 134 - void event_format__print(struct tep_event_format *event, 134 + void event_format__print(struct tep_event *event, 135 135 int cpu, void *data, int size) 136 136 { 137 137 return event_format__fprintf(event, cpu, data, size, stdout); ··· 190 190 return tep_parse_event(pevent, buf, size, sys); 191 191 } 192 192 193 - struct tep_event_format *trace_find_next_event(struct tep_handle *pevent, 194 - struct tep_event_format *event) 193 + struct tep_event *trace_find_next_event(struct tep_handle *pevent, 194 + struct tep_event *event) 195 195 { 196 196 static int idx; 197 197 int events_count; 198 - struct tep_event_format *all_events; 198 + struct tep_event *all_events; 199 199 200 200 all_events = tep_get_first_event(pevent); 201 201 events_count = tep_get_events_count(pevent);
+2 -2
tools/perf/util/trace-event-read.c
··· 102 102 103 103 if (do_read(&data, 4) < 0) 104 104 return 0; 105 - return __tep_data2host4(pevent, data); 105 + return tep_read_number(pevent, &data, 4); 106 106 } 107 107 108 108 static unsigned long long read8(struct tep_handle *pevent) ··· 111 111 112 112 if (do_read(&data, 8) < 0) 113 113 return 0; 114 - return __tep_data2host8(pevent, data); 114 + return tep_read_number(pevent, &data, 8); 115 115 } 116 116 117 117 static char *read_string(void)
+4 -4
tools/perf/util/trace-event.c
··· 72 72 /* 73 73 * Returns pointer with encoded error via <linux/err.h> interface. 74 74 */ 75 - static struct tep_event_format* 75 + static struct tep_event* 76 76 tp_format(const char *sys, const char *name) 77 77 { 78 78 char *tp_dir = get_events_file(sys); 79 79 struct tep_handle *pevent = tevent.pevent; 80 - struct tep_event_format *event = NULL; 80 + struct tep_event *event = NULL; 81 81 char path[PATH_MAX]; 82 82 size_t size; 83 83 char *data; ··· 102 102 /* 103 103 * Returns pointer with encoded error via <linux/err.h> interface. 104 104 */ 105 - struct tep_event_format* 105 + struct tep_event* 106 106 trace_event__tp_format(const char *sys, const char *name) 107 107 { 108 108 if (!tevent_initialized && trace_event__init2()) ··· 111 111 return tp_format(sys, name); 112 112 } 113 113 114 - struct tep_event_format *trace_event__tp_format_id(int id) 114 + struct tep_event *trace_event__tp_format_id(int id) 115 115 { 116 116 if (!tevent_initialized && trace_event__init2()) 117 117 return ERR_PTR(-ENOMEM);
+8 -8
tools/perf/util/trace-event.h
··· 22 22 void trace_event__cleanup(struct trace_event *t); 23 23 int trace_event__register_resolver(struct machine *machine, 24 24 tep_func_resolver_t *func); 25 - struct tep_event_format* 25 + struct tep_event* 26 26 trace_event__tp_format(const char *sys, const char *name); 27 27 28 - struct tep_event_format *trace_event__tp_format_id(int id); 28 + struct tep_event *trace_event__tp_format_id(int id); 29 29 30 30 int bigendian(void); 31 31 32 - void event_format__fprintf(struct tep_event_format *event, 32 + void event_format__fprintf(struct tep_event *event, 33 33 int cpu, void *data, int size, FILE *fp); 34 34 35 - void event_format__print(struct tep_event_format *event, 35 + void event_format__print(struct tep_event *event, 36 36 int cpu, void *data, int size); 37 37 38 38 int parse_ftrace_file(struct tep_handle *pevent, char *buf, unsigned long size); ··· 40 40 char *buf, unsigned long size, char *sys); 41 41 42 42 unsigned long long 43 - raw_field_value(struct tep_event_format *event, const char *name, void *data); 43 + raw_field_value(struct tep_event *event, const char *name, void *data); 44 44 45 45 void parse_proc_kallsyms(struct tep_handle *pevent, char *file, unsigned int size); 46 46 void parse_ftrace_printk(struct tep_handle *pevent, char *file, unsigned int size); ··· 48 48 49 49 ssize_t trace_report(int fd, struct trace_event *tevent, bool repipe); 50 50 51 - struct tep_event_format *trace_find_next_event(struct tep_handle *pevent, 52 - struct tep_event_format *event); 53 - unsigned long long read_size(struct tep_event_format *event, void *ptr, int size); 51 + struct tep_event *trace_find_next_event(struct tep_handle *pevent, 52 + struct tep_event *event); 53 + unsigned long long read_size(struct tep_event *event, void *ptr, int size); 54 54 unsigned long long eval_flag(const char *flag); 55 55 56 56 int read_tracing_data(int fd, struct list_head *pattrs);