Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits)
tracing/kprobes: unregister_trace_probe needs to be called under mutex
perf: expose event__process function
perf events: Fix mmap offset determination
perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
perf, powerpc: Convert the FSL driver to use local64_t
perf tools: Don't keep unreferenced maps when unmaps are detected
perf session: Invalidate last_match when removing threads from rb_tree
perf session: Free the ref_reloc_sym memory at the right place
x86,mmiotrace: Add support for tracing STOS instruction
perf, sched migration: Librarize task states and event headers helpers
perf, sched migration: Librarize the GUI class
perf, sched migration: Make the GUI class client agnostic
perf, sched migration: Make it vertically scrollable
perf, sched migration: Parameterize cpu height and spacing
perf, sched migration: Fix key bindings
perf, sched migration: Ignore unhandled task states
perf, sched migration: Handle ignored migrate out events
perf: New migration tool overview
tracing: Drop cpparg() macro
perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call
...

Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c

+5610 -4808
-71
Documentation/ABI/testing/debugfs-kmemtrace
··· 
   1 -	What:		/sys/kernel/debug/kmemtrace/
   2 -	Date:		July 2008
   3 -	Contact:	Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
   4 -	Description:
   5 -	
   6 -	In kmemtrace-enabled kernels, the following files are created:
   7 -	
   8 -	/sys/kernel/debug/kmemtrace/
   9 -		cpu<n>		(0400)	Per-CPU tracing data, see below. (binary)
  10 -		total_overruns	(0400)	Total number of bytes which were dropped from
  11 -				cpu<n> files because of full buffer condition,
  12 -				non-binary. (text)
  13 -		abi_version	(0400)	Kernel's kmemtrace ABI version. (text)
  14 -	
  15 -	Each per-CPU file should be read according to the relay interface. That is,
  16 -	the reader should set affinity to that specific CPU and, as currently done by
  17 -	the userspace application (though there are other methods), use poll() with
  18 -	an infinite timeout before every read(). Otherwise, erroneous data may be
  19 -	read. The binary data has the following _core_ format:
  20 -	
  21 -		Event ID	(1 byte)	Unsigned integer, one of:
  22 -			0 - represents an allocation (KMEMTRACE_EVENT_ALLOC)
  23 -			1 - represents a freeing of previously allocated memory
  24 -			    (KMEMTRACE_EVENT_FREE)
  25 -		Type ID		(1 byte)	Unsigned integer, one of:
  26 -			0 - this is a kmalloc() / kfree()
  27 -			1 - this is a kmem_cache_alloc() / kmem_cache_free()
  28 -			2 - this is a __get_free_pages() et al.
  29 -		Event size	(2 bytes)	Unsigned integer representing the
  30 -						size of this event. Used to extend
  31 -						kmemtrace. Discard the bytes you
  32 -						don't know about.
  33 -		Sequence number	(4 bytes)	Signed integer used to reorder data
  34 -						logged on SMP machines. Wraparound
  35 -						must be taken into account, although
  36 -						it is unlikely.
  37 -		Caller address	(8 bytes)	Return address to the caller.
  38 -		Pointer to mem	(8 bytes)	Pointer to target memory area. Can be
  39 -						NULL, but not all such calls might be
  40 -						recorded.
  41 -	
  42 -	In case of KMEMTRACE_EVENT_ALLOC events, the next fields follow:
  43 -	
  44 -		Requested bytes	(8 bytes)	Total number of requested bytes,
  45 -						unsigned, must not be zero.
  46 -		Allocated bytes	(8 bytes)	Total number of actually allocated
  47 -						bytes, unsigned, must not be lower
  48 -						than requested bytes.
  49 -		Requested flags	(4 bytes)	GFP flags supplied by the caller.
  50 -		Target CPU	(4 bytes)	Signed integer, valid for event id 1.
  51 -						If equal to -1, target CPU is the same
  52 -						as origin CPU, but the reverse might
  53 -						not be true.
  54 -	
  55 -	The data is made available in the same endianness the machine has.
  56 -	
  57 -	Other event ids and type ids may be defined and added. Other fields may be
  58 -	added by increasing event size, but see below for details.
  59 -	Every modification to the ABI, including new id definitions, are followed
  60 -	by bumping the ABI version by one.
  61 -	
  62 -	Adding new data to the packet (features) is done at the end of the mandatory
  63 -	data:
  64 -		Feature size	(2 byte)
  65 -		Feature ID	(1 byte)
  66 -		Feature data	(Feature size - 3 bytes)
  67 -	
  68 -	
  69 -	Users:
  70 -		kmemtrace-user - git://repo.or.cz/kmemtrace-user.git
  71 -
+2
Documentation/kernel-parameters.txt
··· 1816 1816 1817 1817 nousb [USB] Disable the USB subsystem 1818 1818 1819 + nowatchdog [KNL] Disable the lockup detector. 1820 + 1819 1821 nowb [ARM] 1820 1822 1821 1823 nox2apic [X86-64,APIC] Do not enable x2APIC mode.
+149 -6
Documentation/trace/ftrace-design.txt
··· 
  13  13   want more explanation of a feature in terms of common code, review the common
  14  14   ftrace.txt file.
  15  15   
      16 + Ideally, everyone who wishes to retain performance while supporting tracing in
      17 + their kernel should make it all the way to dynamic ftrace support.
      18 + 
  16  19   
  17  20   Prerequisites
  18  21   -------------
··· 
 218 215   exiting of a function. On exit, the value is compared and if it does not
 219 216   match, then it will panic the kernel. This is largely a sanity check for bad
 220 217   code generation with gcc. If gcc for your port sanely updates the frame
 221     - pointer under different opitmization levels, then ignore this option.
     218 + pointer under different optimization levels, then ignore this option.
 222 219   
 223 220   However, adding support for it isn't terribly difficult. In your assembly code
 224 221   that calls prepare_ftrace_return(), pass the frame pointer as the 3rd argument.
··· 
 237 234   
 238 235   
 239 236   HAVE_SYSCALL_TRACEPOINTS
 240     - ---------------------
     237 + ------------------------
 241 238   
 242 239   You need very few things to get the syscalls tracing in an arch.
 243 240   
··· 
 253 250   HAVE_FTRACE_MCOUNT_RECORD
 254 251   -------------------------
 255 252   
 256     - See scripts/recordmcount.pl for more info.
 257     - 
 258     - <details to be filled>
     253 + See scripts/recordmcount.pl for more info. Just fill in the arch-specific
     254 + details for how to locate the addresses of mcount call sites via objdump.
     255 + This option doesn't make much sense without also implementing dynamic ftrace.
 259 256   
 260 257   
 261 258   HAVE_DYNAMIC_FTRACE
 262     - ---------------------
     259 + -------------------
     260 + 
     261 + You will first need HAVE_FTRACE_MCOUNT_RECORD and HAVE_FUNCTION_TRACER, so
     262 + scroll your reader back up if you got over eager.
     263 + 
     264 + Once those are out of the way, you will need to implement:
     265 +	- asm/ftrace.h:
     266 +		- MCOUNT_ADDR
     267 +		- ftrace_call_adjust()
     268 +		- struct dyn_arch_ftrace{}
     269 +	- asm code:
     270 +		- mcount() (new stub)
     271 +		- ftrace_caller()
     272 +		- ftrace_call()
     273 +		- ftrace_stub()
     274 +	- C code:
     275 +		- ftrace_dyn_arch_init()
     276 +		- ftrace_make_nop()
     277 +		- ftrace_make_call()
     278 +		- ftrace_update_ftrace_func()
     279 + 
     280 + First you will need to fill out some arch details in your asm/ftrace.h.
     281 + 
     282 + Define MCOUNT_ADDR as the address of your mcount symbol similar to:
     283 +	#define MCOUNT_ADDR ((unsigned long)mcount)
     284 + Since no one else will have a decl for that function, you will need to:
     285 +	extern void mcount(void);
     286 + 
     287 + You will also need the helper function ftrace_call_adjust(). Most people
     288 + will be able to stub it out like so:
     289 +	static inline unsigned long ftrace_call_adjust(unsigned long addr)
     290 +	{
     291 +		return addr;
     292 +	}
     293 + <details to be filled>
     294 + 
     295 + Lastly you will need the custom dyn_arch_ftrace structure. If you need
     296 + some extra state when runtime patching arbitrary call sites, this is the
     297 + place. For now though, create an empty struct:
     298 +	struct dyn_arch_ftrace {
     299 +		/* No extra data needed */
     300 +	};
     301 + 
     302 + With the header out of the way, we can fill out the assembly code. While we
     303 + did already create a mcount() function earlier, dynamic ftrace only wants a
     304 + stub function. This is because the mcount() will only be used during boot
     305 + and then all references to it will be patched out never to return. Instead,
     306 + the guts of the old mcount() will be used to create a new ftrace_caller()
     307 + function. Because the two are hard to merge, it will most likely be a lot
     308 + easier to have two separate definitions split up by #ifdefs. Same goes for
     309 + the ftrace_stub() as that will now be inlined in ftrace_caller().
     310 + 
     311 + Before we get confused anymore, let's check out some pseudo code so you can
     312 + implement your own stuff in assembly:
     313 + 
     314 + void mcount(void)
     315 + {
     316 +	return;
     317 + }
     318 + 
     319 + void ftrace_caller(void)
     320 + {
     321 +	/* implement HAVE_FUNCTION_TRACE_MCOUNT_TEST if you desire */
     322 + 
     323 +	/* save all state needed by the ABI (see paragraph above) */
     324 + 
     325 +	unsigned long frompc = ...;
     326 +	unsigned long selfpc = <return address> - MCOUNT_INSN_SIZE;
     327 + 
     328 + ftrace_call:
     329 +	ftrace_stub(frompc, selfpc);
     330 + 
     331 +	/* restore all state needed by the ABI */
     332 + 
     333 + ftrace_stub:
     334 +	return;
     335 + }
     336 + 
     337 + This might look a little odd at first, but keep in mind that we will be runtime
     338 + patching multiple things. First, only functions that we actually want to trace
     339 + will be patched to call ftrace_caller(). Second, since we only have one tracer
     340 + active at a time, we will patch the ftrace_caller() function itself to call the
     341 + specific tracer in question. That is the point of the ftrace_call label.
     342 + 
     343 + With that in mind, let's move on to the C code that will actually be doing the
     344 + runtime patching. You'll need a little knowledge of your arch's opcodes in
     345 + order to make it through the next section.
     346 + 
     347 + Every arch has an init callback function. If you need to do something early on
     348 + to initialize some state, this is the time to do that. Otherwise, this simple
     349 + function below should be sufficient for most people:
     350 + 
     351 + int __init ftrace_dyn_arch_init(void *data)
     352 + {
     353 +	/* return value is done indirectly via data */
     354 +	*(unsigned long *)data = 0;
     355 + 
     356 +	return 0;
     357 + }
     358 + 
     359 + There are two functions that are used to do runtime patching of arbitrary
     360 + functions. The first is used to turn the mcount call site into a nop (which
     361 + is what helps us retain runtime performance when not tracing). The second is
     362 + used to turn the mcount call site into a call to an arbitrary location (but
     363 + typically that is ftracer_caller()). See the general function definition in
     364 + linux/ftrace.h for the functions:
     365 +	ftrace_make_nop()
     366 +	ftrace_make_call()
     367 + The rec->ip value is the address of the mcount call site that was collected
     368 + by the scripts/recordmcount.pl during build time.
     369 + 
     370 + The last function is used to do runtime patching of the active tracer. This
     371 + will be modifying the assembly code at the location of the ftrace_call symbol
     372 + inside of the ftrace_caller() function. So you should have sufficient padding
     373 + at that location to support the new function calls you'll be inserting. Some
     374 + people will be using a "call" type instruction while others will be using a
     375 + "branch" type instruction. Specifically, the function is:
     376 +	ftrace_update_ftrace_func()
     377 + 
     378 + 
     379 + HAVE_DYNAMIC_FTRACE + HAVE_FUNCTION_GRAPH_TRACER
     380 + ------------------------------------------------
     381 + 
     382 + The function grapher needs a few tweaks in order to work with dynamic ftrace.
     383 + Basically, you will need to:
     384 +	- update:
     385 +		- ftrace_caller()
     386 +		- ftrace_graph_call()
     387 +		- ftrace_graph_caller()
     388 +	- implement:
     389 +		- ftrace_enable_ftrace_graph_caller()
     390 +		- ftrace_disable_ftrace_graph_caller()
 263 391   
 264 392   <details to be filled>
     393 + Quick notes:
     394 +	- add a nop stub after the ftrace_call location named ftrace_graph_call;
     395 +	  stub needs to be large enough to support a call to ftrace_graph_caller()
     396 +	- update ftrace_graph_caller() to work with being called by the new
     397 +	  ftrace_caller() since some semantics may have changed
     398 +	- ftrace_enable_ftrace_graph_caller() will runtime patch the
     399 +	  ftrace_graph_call location with a call to ftrace_graph_caller()
     400 +	- ftrace_disable_ftrace_graph_caller() will runtime patch the
     401 +	  ftrace_graph_call location with nops
-126
Documentation/trace/kmemtrace.txt
··· 
   1 -	kmemtrace - Kernel Memory Tracer
   2 -	
   3 -	by Eduard - Gabriel Munteanu
   4 -	<eduard.munteanu@linux360.ro>
   5 -	
   6 -	I. Introduction
   7 -	===============
   8 -	
   9 -	kmemtrace helps kernel developers figure out two things:
  10 -	1) how different allocators (SLAB, SLUB etc.) perform
  11 -	2) how kernel code allocates memory and how much
  12 -	
  13 -	To do this, we trace every allocation and export information to the userspace
  14 -	through the relay interface. We export things such as the number of requested
  15 -	bytes, the number of bytes actually allocated (i.e. including internal
  16 -	fragmentation), whether this is a slab allocation or a plain kmalloc() and so
  17 -	on.
  18 -	
  19 -	The actual analysis is performed by a userspace tool (see section III for
  20 -	details on where to get it from). It logs the data exported by the kernel,
  21 -	processes it and (as of writing this) can provide the following information:
  22 -	- the total amount of memory allocated and fragmentation per call-site
  23 -	- the amount of memory allocated and fragmentation per allocation
  24 -	- total memory allocated and fragmentation in the collected dataset
  25 -	- number of cross-CPU allocation and frees (makes sense in NUMA environments)
  26 -	
  27 -	Moreover, it can potentially find inconsistent and erroneous behavior in
  28 -	kernel code, such as using slab free functions on kmalloc'ed memory or
  29 -	allocating less memory than requested (but not truly failed allocations).
  30 -	
  31 -	kmemtrace also makes provisions for tracing on some arch and analysing the
  32 -	data on another.
  33 -	
  34 -	II. Design and goals
  35 -	====================
  36 -	
  37 -	kmemtrace was designed to handle rather large amounts of data. Thus, it uses
  38 -	the relay interface to export whatever is logged to userspace, which then
  39 -	stores it. Analysis and reporting is done asynchronously, that is, after the
  40 -	data is collected and stored. By design, it allows one to log and analyse
  41 -	on different machines and different arches.
  42 -	
  43 -	As of writing this, the ABI is not considered stable, though it might not
  44 -	change much. However, no guarantees are made about compatibility yet. When
  45 -	deemed stable, the ABI should still allow easy extension while maintaining
  46 -	backward compatibility. This is described further in Documentation/ABI.
  47 -	
  48 -	Summary of design goals:
  49 -	- allow logging and analysis to be done across different machines
  50 -	- be fast and anticipate usage in high-load environments (*)
  51 -	- be reasonably extensible
  52 -	- make it possible for GNU/Linux distributions to have kmemtrace
  53 -	included in their repositories
  54 -	
  55 -	(*) - one of the reasons Pekka Enberg's original userspace data analysis
  56 -	tool's code was rewritten from Perl to C (although this is more than a
  57 -	simple conversion)
  58 -	
  59 -	
  60 -	III. Quick usage guide
  61 -	======================
  62 -	
  63 -	1) Get a kernel that supports kmemtrace and build it accordingly (i.e. enable
  64 -	CONFIG_KMEMTRACE).
  65 -	
  66 -	2) Get the userspace tool and build it:
  67 -	$ git clone git://repo.or.cz/kmemtrace-user.git		# current repository
  68 -	$ cd kmemtrace-user/
  69 -	$ ./autogen.sh
  70 -	$ ./configure
  71 -	$ make
  72 -	
  73 -	3) Boot the kmemtrace-enabled kernel if you haven't, preferably in the
  74 -	'single' runlevel (so that relay buffers don't fill up easily), and run
  75 -	kmemtrace:
  76 -	# '$' does not mean user, but root here.
  77 -	$ mount -t debugfs none /sys/kernel/debug
  78 -	$ mount -t proc none /proc
  79 -	$ cd path/to/kmemtrace-user/
  80 -	$ ./kmemtraced
  81 -	Wait a bit, then stop it with CTRL+C.
  82 -	$ cat /sys/kernel/debug/kmemtrace/total_overruns	# Check if we didn't
  83 -								# overrun, should
  84 -								# be zero.
  85 -	$ (Optionally) [Run kmemtrace_check separately on each cpu[0-9]*.out file to
  86 -	check its correctness]
  87 -	$ ./kmemtrace-report
  88 -	
  89 -	Now you should have a nice and short summary of how the allocator performs.
  90 -	
  91 -	IV. FAQ and known issues
  92 -	========================
  93 -	
  94 -	Q: 'cat /sys/kernel/debug/kmemtrace/total_overruns' is non-zero, how do I fix
  95 -	this? Should I worry?
  96 -	A: If it's non-zero, this affects kmemtrace's accuracy, depending on how
  97 -	large the number is. You can fix it by supplying a higher
  98 -	'kmemtrace.subbufs=N' kernel parameter.
  99 -	---
 100 -	
 101 -	Q: kmemtrace_check reports errors, how do I fix this? Should I worry?
 102 -	A: This is a bug and should be reported. It can occur for a variety of
 103 -	reasons:
 104 -	- possible bugs in relay code
 105 -	- possible misuse of relay by kmemtrace
 106 -	- timestamps being collected unorderly
 107 -	Or you may fix it yourself and send us a patch.
 108 -	---
 109 -	
 110 -	Q: kmemtrace_report shows many errors, how do I fix this? Should I worry?
 111 -	A: This is a known issue and I'm working on it. These might be true errors
 112 -	in kernel code, which may have inconsistent behavior (e.g. allocating memory
 113 -	with kmem_cache_alloc() and freeing it with kfree()). Pekka Enberg pointed
 114 -	out this behavior may work with SLAB, but may fail with other allocators.
 115 -	
 116 -	It may also be due to lack of tracing in some unusual allocator functions.
 117 -	
 118 -	We don't want bug reports regarding this issue yet.
 119 -	---
 120 -	
 121 -	V. See also
 122 -	===========
 123 -	
 124 -	Documentation/kernel-parameters.txt
 125 -	Documentation/ABI/testing/debugfs-kmemtrace
 126 -
+1 -1
Documentation/trace/kprobetrace.txt
··· 42 42 +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(**) 43 43 NAME=FETCHARG : Set NAME as the argument name of FETCHARG. 44 44 FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types 45 - (u8/u16/u32/u64/s8/s16/s32/s64) are supported. 45 + (u8/u16/u32/u64/s8/s16/s32/s64) and string are supported. 46 46 47 47 (*) only for return probe. 48 48 (**) this is useful for fetching a field of data structures.
+1 -8
MAINTAINERS
··· 3403 3403 F: mm/kmemleak.c 3404 3404 F: mm/kmemleak-test.c 3405 3405 3406 - KMEMTRACE 3407 - M: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> 3408 - S: Maintained 3409 - F: Documentation/trace/kmemtrace.txt 3410 - F: include/linux/kmemtrace.h 3411 - F: kernel/trace/kmemtrace.c 3412 - 3413 3406 KPROBES 3414 3407 M: Ananth N Mavinakayanahalli <ananth@in.ibm.com> 3415 3408 M: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> ··· 5678 5685 M: Steven Rostedt <rostedt@goodmis.org> 5679 5686 M: Frederic Weisbecker <fweisbec@gmail.com> 5680 5687 M: Ingo Molnar <mingo@redhat.com> 5681 - T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git tracing/core 5688 + T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git perf/core 5682 5689 S: Maintained 5683 5690 F: Documentation/trace/ftrace.txt 5684 5691 F: arch/*/*/*/ftrace.h
+3 -1
Makefile
··· 420 420 no-dot-config-targets := clean mrproper distclean \ 421 421 cscope TAGS tags help %docs check% coccicheck \ 422 422 include/linux/version.h headers_% \ 423 - kernelversion 423 + kernelversion %src-pkg 424 424 425 425 config-targets := 0 426 426 mixed-targets := 0 ··· 1168 1168 # rpm target kept for backward compatibility 1169 1169 package-dir := $(srctree)/scripts/package 1170 1170 1171 + %src-pkg: FORCE 1172 + $(Q)$(MAKE) $(build)=$(package-dir) $@ 1171 1173 %pkg: include/config/kernel.release FORCE 1172 1174 $(Q)$(MAKE) $(build)=$(package-dir) $@ 1173 1175 rpm: include/config/kernel.release FORCE
+7
arch/Kconfig
··· 151 151 config HAVE_USER_RETURN_NOTIFIER 152 152 bool 153 153 154 + config HAVE_PERF_EVENTS_NMI 155 + bool 156 + help 157 + System hardware can generate an NMI using the perf event 158 + subsystem. Also has support for calculating CPU cycle events 159 + to determine how many clock cycles in a given period. 160 + 154 161 source "kernel/gcov/Kconfig"
+1
arch/alpha/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+1
arch/arm/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+9 -9
arch/arm/kernel/perf_event.c
··· 164 164 struct hw_perf_event *hwc, 165 165 int idx) 166 166 { 167 - s64 left = atomic64_read(&hwc->period_left); 167 + s64 left = local64_read(&hwc->period_left); 168 168 s64 period = hwc->sample_period; 169 169 int ret = 0; 170 170 171 171 if (unlikely(left <= -period)) { 172 172 left = period; 173 - atomic64_set(&hwc->period_left, left); 173 + local64_set(&hwc->period_left, left); 174 174 hwc->last_period = period; 175 175 ret = 1; 176 176 } 177 177 178 178 if (unlikely(left <= 0)) { 179 179 left += period; 180 - atomic64_set(&hwc->period_left, left); 180 + local64_set(&hwc->period_left, left); 181 181 hwc->last_period = period; 182 182 ret = 1; 183 183 } ··· 185 185 if (left > (s64)armpmu->max_period) 186 186 left = armpmu->max_period; 187 187 188 - atomic64_set(&hwc->prev_count, (u64)-left); 188 + local64_set(&hwc->prev_count, (u64)-left); 189 189 190 190 armpmu->write_counter(idx, (u64)(-left) & 0xffffffff); 191 191 ··· 204 204 u64 delta; 205 205 206 206 again: 207 - prev_raw_count = atomic64_read(&hwc->prev_count); 207 + prev_raw_count = local64_read(&hwc->prev_count); 208 208 new_raw_count = armpmu->read_counter(idx); 209 209 210 - if (atomic64_cmpxchg(&hwc->prev_count, prev_raw_count, 210 + if (local64_cmpxchg(&hwc->prev_count, prev_raw_count, 211 211 new_raw_count) != prev_raw_count) 212 212 goto again; 213 213 214 214 delta = (new_raw_count << shift) - (prev_raw_count << shift); 215 215 delta >>= shift; 216 216 217 - atomic64_add(delta, &event->count); 218 - atomic64_sub(delta, &hwc->period_left); 217 + local64_add(delta, &event->count); 218 + local64_sub(delta, &hwc->period_left); 219 219 220 220 return new_raw_count; 221 221 } ··· 478 478 if (!hwc->sample_period) { 479 479 hwc->sample_period = armpmu->max_period; 480 480 hwc->last_period = hwc->sample_period; 481 - atomic64_set(&hwc->period_left, hwc->sample_period); 481 + local64_set(&hwc->period_left, hwc->sample_period); 482 482 } 483 483 484 484 err = 0;
+1
arch/avr32/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+1
arch/blackfin/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+1
arch/cris/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+1
arch/frv/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+1
arch/frv/kernel/local64.h
··· 1 + #include <asm-generic/local64.h>
+1
arch/h8300/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+1
arch/ia64/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+1
arch/m32r/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+1
arch/m68k/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+1
arch/microblaze/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+1
arch/mips/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+1
arch/mn10300/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+1
arch/parisc/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+1
arch/powerpc/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+12
arch/powerpc/include/asm/perf_event.h
··· 21 21 #ifdef CONFIG_FSL_EMB_PERF_EVENT 22 22 #include <asm/perf_event_fsl_emb.h> 23 23 #endif 24 + 25 + #ifdef CONFIG_PERF_EVENTS 26 + #include <asm/ptrace.h> 27 + #include <asm/reg.h> 28 + 29 + #define perf_arch_fetch_caller_regs(regs, __ip) \ 30 + do { \ 31 + (regs)->nip = __ip; \ 32 + (regs)->gpr[1] = *(unsigned long *)__get_SP(); \ 33 + asm volatile("mfmsr %0" : "=r" ((regs)->msr)); \ 34 + } while (0) 35 + #endif
-26
arch/powerpc/kernel/misc.S
··· 127 127 _GLOBAL(__restore_cpu_power7) 128 128 /* place holder */ 129 129 blr 130 - 131 - /* 132 - * Get a minimal set of registers for our caller's nth caller. 133 - * r3 = regs pointer, r5 = n. 134 - * 135 - * We only get R1 (stack pointer), NIP (next instruction pointer) 136 - * and LR (link register). These are all we can get in the 137 - * general case without doing complicated stack unwinding, but 138 - * fortunately they are enough to do a stack backtrace, which 139 - * is all we need them for. 140 - */ 141 - _GLOBAL(perf_arch_fetch_caller_regs) 142 - mr r6,r1 143 - cmpwi r5,0 144 - mflr r4 145 - ble 2f 146 - mtctr r5 147 - 1: PPC_LL r6,0(r6) 148 - bdnz 1b 149 - PPC_LL r4,PPC_LR_STKOFF(r6) 150 - 2: PPC_LL r7,0(r6) 151 - PPC_LL r7,PPC_LR_STKOFF(r7) 152 - PPC_STL r6,GPR1-STACK_FRAME_OVERHEAD(r3) 153 - PPC_STL r4,_NIP-STACK_FRAME_OVERHEAD(r3) 154 - PPC_STL r7,_LINK-STACK_FRAME_OVERHEAD(r3) 155 - blr
+21 -20
arch/powerpc/kernel/perf_event.c
··· 410 410 * Therefore we treat them like NMIs. 411 411 */ 412 412 do { 413 - prev = atomic64_read(&event->hw.prev_count); 413 + prev = local64_read(&event->hw.prev_count); 414 414 barrier(); 415 415 val = read_pmc(event->hw.idx); 416 - } while (atomic64_cmpxchg(&event->hw.prev_count, prev, val) != prev); 416 + } while (local64_cmpxchg(&event->hw.prev_count, prev, val) != prev); 417 417 418 418 /* The counters are only 32 bits wide */ 419 419 delta = (val - prev) & 0xfffffffful; 420 - atomic64_add(delta, &event->count); 421 - atomic64_sub(delta, &event->hw.period_left); 420 + local64_add(delta, &event->count); 421 + local64_sub(delta, &event->hw.period_left); 422 422 } 423 423 424 424 /* ··· 444 444 if (!event->hw.idx) 445 445 continue; 446 446 val = (event->hw.idx == 5) ? pmc5 : pmc6; 447 - prev = atomic64_read(&event->hw.prev_count); 447 + prev = local64_read(&event->hw.prev_count); 448 448 event->hw.idx = 0; 449 449 delta = (val - prev) & 0xfffffffful; 450 - atomic64_add(delta, &event->count); 450 + local64_add(delta, &event->count); 451 451 } 452 452 } 453 453 ··· 462 462 event = cpuhw->limited_counter[i]; 463 463 event->hw.idx = cpuhw->limited_hwidx[i]; 464 464 val = (event->hw.idx == 5) ? 
pmc5 : pmc6; 465 - atomic64_set(&event->hw.prev_count, val); 465 + local64_set(&event->hw.prev_count, val); 466 466 perf_event_update_userpage(event); 467 467 } 468 468 } ··· 666 666 } 667 667 val = 0; 668 668 if (event->hw.sample_period) { 669 - left = atomic64_read(&event->hw.period_left); 669 + left = local64_read(&event->hw.period_left); 670 670 if (left < 0x80000000L) 671 671 val = 0x80000000L - left; 672 672 } 673 - atomic64_set(&event->hw.prev_count, val); 673 + local64_set(&event->hw.prev_count, val); 674 674 event->hw.idx = idx; 675 675 write_pmc(idx, val); 676 676 perf_event_update_userpage(event); ··· 754 754 * skip the schedulability test here, it will be peformed 755 755 * at commit time(->commit_txn) as a whole 756 756 */ 757 - if (cpuhw->group_flag & PERF_EVENT_TXN_STARTED) 757 + if (cpuhw->group_flag & PERF_EVENT_TXN) 758 758 goto nocheck; 759 759 760 760 if (check_excludes(cpuhw->event, cpuhw->flags, n0, 1)) ··· 845 845 if (left < 0x80000000L) 846 846 val = 0x80000000L - left; 847 847 write_pmc(event->hw.idx, val); 848 - atomic64_set(&event->hw.prev_count, val); 849 - atomic64_set(&event->hw.period_left, left); 848 + local64_set(&event->hw.prev_count, val); 849 + local64_set(&event->hw.period_left, left); 850 850 perf_event_update_userpage(event); 851 851 perf_enable(); 852 852 local_irq_restore(flags); ··· 861 861 { 862 862 struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events); 863 863 864 - cpuhw->group_flag |= PERF_EVENT_TXN_STARTED; 864 + cpuhw->group_flag |= PERF_EVENT_TXN; 865 865 cpuhw->n_txn_start = cpuhw->n_events; 866 866 } 867 867 ··· 874 874 { 875 875 struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events); 876 876 877 - cpuhw->group_flag &= ~PERF_EVENT_TXN_STARTED; 877 + cpuhw->group_flag &= ~PERF_EVENT_TXN; 878 878 } 879 879 880 880 /* ··· 900 900 for (i = cpuhw->n_txn_start; i < n; ++i) 901 901 cpuhw->event[i]->hw.config = cpuhw->events[i]; 902 902 903 + cpuhw->group_flag &= ~PERF_EVENT_TXN; 903 904 return 0; 904 905 } 905 
906 ··· 1112 1111 event->hw.config = events[n]; 1113 1112 event->hw.event_base = cflags[n]; 1114 1113 event->hw.last_period = event->hw.sample_period; 1115 - atomic64_set(&event->hw.period_left, event->hw.last_period); 1114 + local64_set(&event->hw.period_left, event->hw.last_period); 1116 1115 1117 1116 /* 1118 1117 * See if we need to reserve the PMU. ··· 1150 1149 int record = 0; 1151 1150 1152 1151 /* we don't have to worry about interrupts here */ 1153 - prev = atomic64_read(&event->hw.prev_count); 1152 + prev = local64_read(&event->hw.prev_count); 1154 1153 delta = (val - prev) & 0xfffffffful; 1155 - atomic64_add(delta, &event->count); 1154 + local64_add(delta, &event->count); 1156 1155 1157 1156 /* 1158 1157 * See if the total period for this event has expired, 1159 1158 * and update for the next period. 1160 1159 */ 1161 1160 val = 0; 1162 - left = atomic64_read(&event->hw.period_left) - delta; 1161 + left = local64_read(&event->hw.period_left) - delta; 1163 1162 if (period) { 1164 1163 if (left <= 0) { 1165 1164 left += period; ··· 1197 1196 } 1198 1197 1199 1198 write_pmc(event->hw.idx, val); 1200 - atomic64_set(&event->hw.prev_count, val); 1201 - atomic64_set(&event->hw.period_left, left); 1199 + local64_set(&event->hw.prev_count, val); 1200 + local64_set(&event->hw.period_left, left); 1202 1201 perf_event_update_userpage(event); 1203 1202 } 1204 1203
+15 -14
arch/powerpc/kernel/perf_event_fsl_emb.c
··· 162 162 * Therefore we treat them like NMIs. 163 163 */ 164 164 do { 165 - prev = atomic64_read(&event->hw.prev_count); 165 + prev = local64_read(&event->hw.prev_count); 166 166 barrier(); 167 167 val = read_pmc(event->hw.idx); 168 - } while (atomic64_cmpxchg(&event->hw.prev_count, prev, val) != prev); 168 + } while (local64_cmpxchg(&event->hw.prev_count, prev, val) != prev); 169 169 170 170 /* The counters are only 32 bits wide */ 171 171 delta = (val - prev) & 0xfffffffful; 172 - atomic64_add(delta, &event->count); 173 - atomic64_sub(delta, &event->hw.period_left); 172 + local64_add(delta, &event->count); 173 + local64_sub(delta, &event->hw.period_left); 174 174 } 175 175 176 176 /* ··· 296 296 297 297 val = 0; 298 298 if (event->hw.sample_period) { 299 - s64 left = atomic64_read(&event->hw.period_left); 299 + s64 left = local64_read(&event->hw.period_left); 300 300 if (left < 0x80000000L) 301 301 val = 0x80000000L - left; 302 302 } 303 - atomic64_set(&event->hw.prev_count, val); 303 + local64_set(&event->hw.prev_count, val); 304 304 write_pmc(i, val); 305 305 perf_event_update_userpage(event); 306 306 ··· 371 371 if (left < 0x80000000L) 372 372 val = 0x80000000L - left; 373 373 write_pmc(event->hw.idx, val); 374 - atomic64_set(&event->hw.prev_count, val); 375 - atomic64_set(&event->hw.period_left, left); 374 + local64_set(&event->hw.prev_count, val); 375 + local64_set(&event->hw.period_left, left); 376 376 perf_event_update_userpage(event); 377 377 perf_enable(); 378 378 local_irq_restore(flags); ··· 500 500 return ERR_PTR(-ENOTSUPP); 501 501 502 502 event->hw.last_period = event->hw.sample_period; 503 - atomic64_set(&event->hw.period_left, event->hw.last_period); 503 + local64_set(&event->hw.period_left, event->hw.last_period); 504 504 505 505 /* 506 506 * See if we need to reserve the PMU. 
··· 541 541 int record = 0; 542 542 543 543 /* we don't have to worry about interrupts here */ 544 - prev = atomic64_read(&event->hw.prev_count); 544 + prev = local64_read(&event->hw.prev_count); 545 545 delta = (val - prev) & 0xfffffffful; 546 - atomic64_add(delta, &event->count); 546 + local64_add(delta, &event->count); 547 547 548 548 /* 549 549 * See if the total period for this event has expired, 550 550 * and update for the next period. 551 551 */ 552 552 val = 0; 553 - left = atomic64_read(&event->hw.period_left) - delta; 553 + left = local64_read(&event->hw.period_left) - delta; 554 554 if (period) { 555 555 if (left <= 0) { 556 556 left += period; ··· 569 569 struct perf_sample_data data; 570 570 571 571 perf_sample_data_init(&data, 0); 572 + data.period = event->hw.last_period; 572 573 573 574 if (perf_event_overflow(event, nmi, &data, regs)) { 574 575 /* ··· 585 584 } 586 585 587 586 write_pmc(event->hw.idx, val); 588 - atomic64_set(&event->hw.prev_count, val); 589 - atomic64_set(&event->hw.period_left, left); 587 + local64_set(&event->hw.prev_count, val); 588 + local64_set(&event->hw.period_left, left); 590 589 perf_event_update_userpage(event); 591 590 } 592 591
+1
arch/s390/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+1
arch/score/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+1
arch/sh/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+3 -3
arch/sh/kernel/perf_event.c
··· 185 185 * this is the simplest approach for maintaining consistency. 186 186 */ 187 187 again: 188 - prev_raw_count = atomic64_read(&hwc->prev_count); 188 + prev_raw_count = local64_read(&hwc->prev_count); 189 189 new_raw_count = sh_pmu->read(idx); 190 190 191 - if (atomic64_cmpxchg(&hwc->prev_count, prev_raw_count, 191 + if (local64_cmpxchg(&hwc->prev_count, prev_raw_count, 192 192 new_raw_count) != prev_raw_count) 193 193 goto again; 194 194 ··· 203 203 delta = (new_raw_count << shift) - (prev_raw_count << shift); 204 204 delta >>= shift; 205 205 206 - atomic64_add(delta, &event->count); 206 + local64_add(delta, &event->count); 207 207 } 208 208 209 209 static void sh_pmu_disable(struct perf_event *event)
+1
arch/sparc/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+8
arch/sparc/include/asm/perf_event.h
··· 6 6 #define PERF_EVENT_INDEX_OFFSET 0 7 7 8 8 #ifdef CONFIG_PERF_EVENTS 9 + #include <asm/ptrace.h> 10 + 9 11 extern void init_hw_perf_events(void); 12 + 13 + extern void 14 + __perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip); 15 + 16 + #define perf_arch_fetch_caller_regs(pt_regs, ip) \ 17 + __perf_arch_fetch_caller_regs(pt_regs, ip, 1); 10 18 #else 11 19 static inline void init_hw_perf_events(void) { } 12 20 #endif
+3 -3
arch/sparc/kernel/helpers.S
··· 47 47 .size stack_trace_flush,.-stack_trace_flush 48 48 49 49 #ifdef CONFIG_PERF_EVENTS 50 - .globl perf_arch_fetch_caller_regs 51 - .type perf_arch_fetch_caller_regs,#function 52 - perf_arch_fetch_caller_regs: 50 + .globl __perf_arch_fetch_caller_regs 51 + .type __perf_arch_fetch_caller_regs,#function 52 + __perf_arch_fetch_caller_regs: 53 53 /* We always read the %pstate into %o5 since we will use 54 54 * that to construct a fake %tstate to store into the regs. 55 55 */
+13 -12
arch/sparc/kernel/perf_event.c
··· 572 572 s64 delta; 573 573 574 574 again: 575 - prev_raw_count = atomic64_read(&hwc->prev_count); 575 + prev_raw_count = local64_read(&hwc->prev_count); 576 576 new_raw_count = read_pmc(idx); 577 577 578 - if (atomic64_cmpxchg(&hwc->prev_count, prev_raw_count, 578 + if (local64_cmpxchg(&hwc->prev_count, prev_raw_count, 579 579 new_raw_count) != prev_raw_count) 580 580 goto again; 581 581 582 582 delta = (new_raw_count << shift) - (prev_raw_count << shift); 583 583 delta >>= shift; 584 584 585 - atomic64_add(delta, &event->count); 586 - atomic64_sub(delta, &hwc->period_left); 585 + local64_add(delta, &event->count); 586 + local64_sub(delta, &hwc->period_left); 587 587 588 588 return new_raw_count; 589 589 } ··· 591 591 static int sparc_perf_event_set_period(struct perf_event *event, 592 592 struct hw_perf_event *hwc, int idx) 593 593 { 594 - s64 left = atomic64_read(&hwc->period_left); 594 + s64 left = local64_read(&hwc->period_left); 595 595 s64 period = hwc->sample_period; 596 596 int ret = 0; 597 597 598 598 if (unlikely(left <= -period)) { 599 599 left = period; 600 - atomic64_set(&hwc->period_left, left); 600 + local64_set(&hwc->period_left, left); 601 601 hwc->last_period = period; 602 602 ret = 1; 603 603 } 604 604 605 605 if (unlikely(left <= 0)) { 606 606 left += period; 607 - atomic64_set(&hwc->period_left, left); 607 + local64_set(&hwc->period_left, left); 608 608 hwc->last_period = period; 609 609 ret = 1; 610 610 } 611 611 if (left > MAX_PERIOD) 612 612 left = MAX_PERIOD; 613 613 614 - atomic64_set(&hwc->prev_count, (u64)-left); 614 + local64_set(&hwc->prev_count, (u64)-left); 615 615 616 616 write_pmc(idx, (u64)(-left) & 0xffffffff); 617 617 ··· 1006 1006 * skip the schedulability test here, it will be peformed 1007 1007 * at commit time(->commit_txn) as a whole 1008 1008 */ 1009 - if (cpuc->group_flag & PERF_EVENT_TXN_STARTED) 1009 + if (cpuc->group_flag & PERF_EVENT_TXN) 1010 1010 goto nocheck; 1011 1011 1012 1012 if (check_excludes(cpuc->event, 
n0, 1)) ··· 1088 1088 if (!hwc->sample_period) { 1089 1089 hwc->sample_period = MAX_PERIOD; 1090 1090 hwc->last_period = hwc->sample_period; 1091 - atomic64_set(&hwc->period_left, hwc->sample_period); 1091 + local64_set(&hwc->period_left, hwc->sample_period); 1092 1092 } 1093 1093 1094 1094 return 0; ··· 1103 1103 { 1104 1104 struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events); 1105 1105 1106 - cpuhw->group_flag |= PERF_EVENT_TXN_STARTED; 1106 + cpuhw->group_flag |= PERF_EVENT_TXN; 1107 1107 } 1108 1108 1109 1109 /* ··· 1115 1115 { 1116 1116 struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events); 1117 1117 1118 - cpuhw->group_flag &= ~PERF_EVENT_TXN_STARTED; 1118 + cpuhw->group_flag &= ~PERF_EVENT_TXN; 1119 1119 } 1120 1120 1121 1121 /* ··· 1138 1138 if (sparc_check_constraints(cpuc->event, cpuc->events, n)) 1139 1139 return -EAGAIN; 1140 1140 1141 + cpuc->group_flag &= ~PERF_EVENT_TXN; 1141 1142 return 0; 1142 1143 } 1143 1144
+1
arch/x86/Kconfig
··· 55 55 select HAVE_HW_BREAKPOINT 56 56 select HAVE_MIXED_BREAKPOINTS_REGS 57 57 select PERF_EVENTS 58 + select HAVE_PERF_EVENTS_NMI 58 59 select ANON_INODES 59 60 select HAVE_ARCH_KMEMCHECK 60 61 select HAVE_USER_RETURN_NOTIFIER
+1 -1
arch/x86/include/asm/hw_breakpoint.h
··· 20 20 #include <linux/list.h> 21 21 22 22 /* Available HW breakpoint length encodings */ 23 + #define X86_BREAKPOINT_LEN_X 0x00 23 24 #define X86_BREAKPOINT_LEN_1 0x40 24 25 #define X86_BREAKPOINT_LEN_2 0x44 25 26 #define X86_BREAKPOINT_LEN_4 0x4c 26 - #define X86_BREAKPOINT_LEN_EXECUTE 0x40 27 27 28 28 #ifdef CONFIG_X86_64 29 29 #define X86_BREAKPOINT_LEN_8 0x48
+1
arch/x86/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+2
arch/x86/include/asm/nmi.h
··· 17 17 18 18 extern void die_nmi(char *str, struct pt_regs *regs, int do_panic); 19 19 extern int check_nmi_watchdog(void); 20 + #if !defined(CONFIG_LOCKUP_DETECTOR) 20 21 extern int nmi_watchdog_enabled; 22 + #endif 21 23 extern int avail_to_resrv_perfctr_nmi_bit(unsigned int); 22 24 extern int reserve_perfctr_nmi(unsigned int); 23 25 extern void release_perfctr_nmi(unsigned int);
+16 -2
arch/x86/include/asm/perf_event.h
··· 68 68 69 69 union cpuid10_edx { 70 70 struct { 71 - unsigned int num_counters_fixed:4; 72 - unsigned int reserved:28; 71 + unsigned int num_counters_fixed:5; 72 + unsigned int bit_width_fixed:8; 73 + unsigned int reserved:19; 73 74 } split; 74 75 unsigned int full; 75 76 }; ··· 140 139 extern unsigned long perf_instruction_pointer(struct pt_regs *regs); 141 140 extern unsigned long perf_misc_flags(struct pt_regs *regs); 142 141 #define perf_misc_flags(regs) perf_misc_flags(regs) 142 + 143 + #include <asm/stacktrace.h> 144 + 145 + /* 146 + * We abuse bit 3 from flags to pass exact information, see perf_misc_flags 147 + * and the comment with PERF_EFLAGS_EXACT. 148 + */ 149 + #define perf_arch_fetch_caller_regs(regs, __ip) { \ 150 + (regs)->ip = (__ip); \ 151 + (regs)->bp = caller_frame_pointer(); \ 152 + (regs)->cs = __KERNEL_CS; \ 153 + regs->flags = 0; \ 154 + } 143 155 144 156 #else 145 157 static inline void init_hw_perf_events(void) { }
+47 -42
arch/x86/include/asm/perf_event_p4.h
··· 19 19 #define ARCH_P4_RESERVED_ESCR (2) /* IQ_ESCR(0,1) not always present */ 20 20 #define ARCH_P4_MAX_ESCR (ARCH_P4_TOTAL_ESCR - ARCH_P4_RESERVED_ESCR) 21 21 #define ARCH_P4_MAX_CCCR (18) 22 - #define ARCH_P4_MAX_COUNTER (ARCH_P4_MAX_CCCR / 2) 23 22 24 23 #define P4_ESCR_EVENT_MASK 0x7e000000U 25 24 #define P4_ESCR_EVENT_SHIFT 25 ··· 70 71 #define P4_CCCR_THRESHOLD(v) ((v) << P4_CCCR_THRESHOLD_SHIFT) 71 72 #define P4_CCCR_ESEL(v) ((v) << P4_CCCR_ESCR_SELECT_SHIFT) 72 73 73 - /* Custom bits in reerved CCCR area */ 74 - #define P4_CCCR_CACHE_OPS_MASK 0x0000003fU 75 - 76 - 77 74 /* Non HT mask */ 78 75 #define P4_CCCR_MASK \ 79 76 (P4_CCCR_OVF | \ ··· 101 106 * ESCR and CCCR but rather an only packed value should 102 107 * be unpacked and written to a proper addresses 103 108 * 104 - * the base idea is to pack as much info as 105 - * possible 109 + * the base idea is to pack as much info as possible 106 110 */ 107 111 #define p4_config_pack_escr(v) (((u64)(v)) << 32) 108 112 #define p4_config_pack_cccr(v) (((u64)(v)) & 0xffffffffULL) ··· 123 129 t = t >> P4_ESCR_EVENT_SHIFT; \ 124 130 t; \ 125 131 }) 126 - 127 - #define p4_config_unpack_cache_event(v) (((u64)(v)) & P4_CCCR_CACHE_OPS_MASK) 128 132 129 133 #define P4_CONFIG_HT_SHIFT 63 130 134 #define P4_CONFIG_HT (1ULL << P4_CONFIG_HT_SHIFT) ··· 206 214 return escr; 207 215 } 208 216 217 + /* 218 + * These are the events which should be used in "Event Select" 219 + * field of ESCR register, they are like unique keys which allow 220 + * the kernel to determine which CCCR and COUNTER should be 221 + * used to track an event 222 + */ 209 223 enum P4_EVENTS { 210 224 P4_EVENT_TC_DELIVER_MODE, 211 225 P4_EVENT_BPU_FETCH_REQUEST, ··· 559 561 * a caller should use P4_ESCR_EMASK_NAME helper to 560 562 * pick the EventMask needed, for example 561 563 * 562 - * P4_ESCR_EMASK_NAME(P4_EVENT_TC_DELIVER_MODE, DD) 564 + * P4_ESCR_EMASK_BIT(P4_EVENT_TC_DELIVER_MODE, DD) 563 565 enum P4_ESCR_EMASKS { 565 567 
P4_GEN_ESCR_EMASK(P4_EVENT_TC_DELIVER_MODE, DD, 0), ··· 751 753 P4_GEN_ESCR_EMASK(P4_EVENT_INSTR_COMPLETED, BOGUS, 1), 752 754 }; 753 755 754 - /* P4 PEBS: stale for a while */ 755 - #define P4_PEBS_METRIC_MASK 0x00001fffU 756 - #define P4_PEBS_UOB_TAG 0x01000000U 757 - #define P4_PEBS_ENABLE 0x02000000U 756 + /* 757 + * P4 PEBS specifics (Replay Event only) 758 + * 759 + * Format (bits): 760 + * 0-6: metric from P4_PEBS_METRIC enum 761 + * 7 : reserved 762 + * 8 : reserved 763 + * 9-11 : reserved 764 + * 765 + * Note we have UOP and PEBS bits reserved for now 766 + * just in case if we will need them once 767 + */ 768 + #define P4_PEBS_CONFIG_ENABLE (1 << 7) 769 + #define P4_PEBS_CONFIG_UOP_TAG (1 << 8) 770 + #define P4_PEBS_CONFIG_METRIC_MASK 0x3f 771 + #define P4_PEBS_CONFIG_MASK 0xff 758 772 759 - /* Replay metrics for MSR_IA32_PEBS_ENABLE and MSR_P4_PEBS_MATRIX_VERT */ 760 - #define P4_PEBS__1stl_cache_load_miss_retired 0x3000001 761 - #define P4_PEBS__2ndl_cache_load_miss_retired 0x3000002 762 - #define P4_PEBS__dtlb_load_miss_retired 0x3000004 763 - #define P4_PEBS__dtlb_store_miss_retired 0x3000004 764 - #define P4_PEBS__dtlb_all_miss_retired 0x3000004 765 - #define P4_PEBS__tagged_mispred_branch 0x3018000 766 - #define P4_PEBS__mob_load_replay_retired 0x3000200 767 - #define P4_PEBS__split_load_retired 0x3000400 768 - #define P4_PEBS__split_store_retired 0x3000400 773 + /* 774 + * mem: Only counters MSR_IQ_COUNTER4 (16) and 775 + * MSR_IQ_COUNTER5 (17) are allowed for PEBS sampling 776 + */ 777 + #define P4_PEBS_ENABLE 0x02000000U 778 + #define P4_PEBS_ENABLE_UOP_TAG 0x01000000U 769 779 770 - #define P4_VERT__1stl_cache_load_miss_retired 0x0000001 771 - #define P4_VERT__2ndl_cache_load_miss_retired 0x0000001 772 - #define P4_VERT__dtlb_load_miss_retired 0x0000001 773 - #define P4_VERT__dtlb_store_miss_retired 0x0000002 774 - #define P4_VERT__dtlb_all_miss_retired 0x0000003 775 - #define P4_VERT__tagged_mispred_branch 0x0000010 776 - #define 
P4_VERT__mob_load_replay_retired 0x0000001 777 - #define P4_VERT__split_load_retired 0x0000001 778 - #define P4_VERT__split_store_retired 0x0000002 780 + #define p4_config_unpack_metric(v) (((u64)(v)) & P4_PEBS_CONFIG_METRIC_MASK) 781 + #define p4_config_unpack_pebs(v) (((u64)(v)) & P4_PEBS_CONFIG_MASK) 779 782 780 - enum P4_CACHE_EVENTS { 781 - P4_CACHE__NONE, 783 + #define p4_config_pebs_has(v, mask) (p4_config_unpack_pebs(v) & (mask)) 782 784 783 - P4_CACHE__1stl_cache_load_miss_retired, 784 - P4_CACHE__2ndl_cache_load_miss_retired, 785 - P4_CACHE__dtlb_load_miss_retired, 786 - P4_CACHE__dtlb_store_miss_retired, 787 - P4_CACHE__itlb_reference_hit, 788 - P4_CACHE__itlb_reference_miss, 785 + enum P4_PEBS_METRIC { 786 + P4_PEBS_METRIC__none, 789 787 790 - P4_CACHE__MAX 788 + P4_PEBS_METRIC__1stl_cache_load_miss_retired, 789 + P4_PEBS_METRIC__2ndl_cache_load_miss_retired, 790 + P4_PEBS_METRIC__dtlb_load_miss_retired, 791 + P4_PEBS_METRIC__dtlb_store_miss_retired, 792 + P4_PEBS_METRIC__dtlb_all_miss_retired, 793 + P4_PEBS_METRIC__tagged_mispred_branch, 794 + P4_PEBS_METRIC__mob_load_replay_retired, 795 + P4_PEBS_METRIC__split_load_retired, 796 + P4_PEBS_METRIC__split_store_retired, 797 + 798 + P4_PEBS_METRIC__max 791 799 }; 792 800 793 801 #endif /* PERF_EVENT_P4_H */ 802 +
+49
arch/x86/include/asm/stacktrace.h
··· 1 + /* 2 + * Copyright (C) 1991, 1992 Linus Torvalds 3 + * Copyright (C) 2000, 2001, 2002 Andi Kleen, SuSE Labs 4 + */ 5 + 1 6 #ifndef _ASM_X86_STACKTRACE_H 2 7 #define _ASM_X86_STACKTRACE_H 8 + 9 + #include <linux/uaccess.h> 3 10 4 11 extern int kstack_depth_to_print; 5 12 ··· 48 41 void dump_trace(struct task_struct *tsk, struct pt_regs *regs, 49 42 unsigned long *stack, unsigned long bp, 50 43 const struct stacktrace_ops *ops, void *data); 44 + 45 + #ifdef CONFIG_X86_32 46 + #define STACKSLOTS_PER_LINE 8 47 + #define get_bp(bp) asm("movl %%ebp, %0" : "=r" (bp) :) 48 + #else 49 + #define STACKSLOTS_PER_LINE 4 50 + #define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :) 51 + #endif 52 + 53 + extern void 54 + show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, 55 + unsigned long *stack, unsigned long bp, char *log_lvl); 56 + 57 + extern void 58 + show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs, 59 + unsigned long *sp, unsigned long bp, char *log_lvl); 60 + 61 + extern unsigned int code_bytes; 62 + 63 + /* The form of the top of the frame on the stack */ 64 + struct stack_frame { 65 + struct stack_frame *next_frame; 66 + unsigned long return_address; 67 + }; 68 + 69 + struct stack_frame_ia32 { 70 + u32 next_frame; 71 + u32 return_address; 72 + }; 73 + 74 + static inline unsigned long caller_frame_pointer(void) 75 + { 76 + struct stack_frame *frame; 77 + 78 + get_bp(frame); 79 + 80 + #ifdef CONFIG_FRAME_POINTER 81 + frame = frame->next_frame; 82 + #endif 83 + 84 + return (unsigned long)frame; 85 + } 51 86 52 87 #endif /* _ASM_X86_STACKTRACE_H */
+6 -1
arch/x86/kernel/apic/Makefile
··· 2 2 # Makefile for local APIC drivers and for the IO-APIC code 3 3 # 4 4 5 - obj-$(CONFIG_X86_LOCAL_APIC) += apic.o apic_noop.o probe_$(BITS).o ipi.o nmi.o 5 + obj-$(CONFIG_X86_LOCAL_APIC) += apic.o apic_noop.o probe_$(BITS).o ipi.o 6 + ifneq ($(CONFIG_HARDLOCKUP_DETECTOR),y) 7 + obj-$(CONFIG_X86_LOCAL_APIC) += nmi.o 8 + endif 9 + obj-$(CONFIG_HARDLOCKUP_DETECTOR) += hw_nmi.o 10 + 6 11 obj-$(CONFIG_X86_IO_APIC) += io_apic.o 7 12 obj-$(CONFIG_SMP) += ipi.o 8 13
+107
arch/x86/kernel/apic/hw_nmi.c
··· 1 + /* 2 + * HW NMI watchdog support 3 + * 4 + * started by Don Zickus, Copyright (C) 2010 Red Hat, Inc. 5 + * 6 + * Arch specific calls to support NMI watchdog 7 + * 8 + * Bits copied from original nmi.c file 9 + * 10 + */ 11 + #include <asm/apic.h> 12 + 13 + #include <linux/cpumask.h> 14 + #include <linux/kdebug.h> 15 + #include <linux/notifier.h> 16 + #include <linux/kprobes.h> 17 + #include <linux/nmi.h> 18 + #include <linux/module.h> 19 + 20 + /* For reliability, we're prepared to waste bits here. */ 21 + static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly; 22 + 23 + u64 hw_nmi_get_sample_period(void) 24 + { 25 + return (u64)(cpu_khz) * 1000 * 60; 26 + } 27 + 28 + #ifdef ARCH_HAS_NMI_WATCHDOG 29 + void arch_trigger_all_cpu_backtrace(void) 30 + { 31 + int i; 32 + 33 + cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask); 34 + 35 + printk(KERN_INFO "sending NMI to all CPUs:\n"); 36 + apic->send_IPI_all(NMI_VECTOR); 37 + 38 + /* Wait for up to 10 seconds for all CPUs to do the backtrace */ 39 + for (i = 0; i < 10 * 1000; i++) { 40 + if (cpumask_empty(to_cpumask(backtrace_mask))) 41 + break; 42 + mdelay(1); 43 + } 44 + } 45 + 46 + static int __kprobes 47 + arch_trigger_all_cpu_backtrace_handler(struct notifier_block *self, 48 + unsigned long cmd, void *__args) 49 + { 50 + struct die_args *args = __args; 51 + struct pt_regs *regs; 52 + int cpu = smp_processor_id(); 53 + 54 + switch (cmd) { 55 + case DIE_NMI: 56 + case DIE_NMI_IPI: 57 + break; 58 + 59 + default: 60 + return NOTIFY_DONE; 61 + } 62 + 63 + regs = args->regs; 64 + 65 + if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) { 66 + static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED; 67 + 68 + arch_spin_lock(&lock); 69 + printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu); 70 + show_regs(regs); 71 + dump_stack(); 72 + arch_spin_unlock(&lock); 73 + cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask)); 74 + return NOTIFY_STOP; 75 + } 76 + 77 + return NOTIFY_DONE; 78 + } 79 + 80 + 
static __read_mostly struct notifier_block backtrace_notifier = { 81 + .notifier_call = arch_trigger_all_cpu_backtrace_handler, 82 + .next = NULL, 83 + .priority = 1 84 + }; 85 + 86 + static int __init register_trigger_all_cpu_backtrace(void) 87 + { 88 + register_die_notifier(&backtrace_notifier); 89 + return 0; 90 + } 91 + early_initcall(register_trigger_all_cpu_backtrace); 92 + #endif 93 + 94 + /* STUB calls to mimic old nmi_watchdog behaviour */ 95 + #if defined(CONFIG_X86_LOCAL_APIC) 96 + unsigned int nmi_watchdog = NMI_NONE; 97 + EXPORT_SYMBOL(nmi_watchdog); 98 + void acpi_nmi_enable(void) { return; } 99 + void acpi_nmi_disable(void) { return; } 100 + #endif 101 + atomic_t nmi_active = ATOMIC_INIT(0); /* oprofile uses this */ 102 + EXPORT_SYMBOL(nmi_active); 103 + int unknown_nmi_panic; 104 + void cpu_nmi_set_wd_enabled(void) { return; } 105 + void stop_apic_nmi_watchdog(void *unused) { return; } 106 + void setup_apic_nmi_watchdog(void *unused) { return; } 107 + int __init check_nmi_watchdog(void) { return 0; }
-7
arch/x86/kernel/apic/nmi.c
··· 401 401 int cpu = smp_processor_id(); 402 402 int rc = 0; 403 403 404 - /* check for other users first */ 405 - if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT) 406 - == NOTIFY_STOP) { 407 - rc = 1; 408 - touched = 1; 409 - } 410 - 411 404 sum = get_timer_irqs(cpu); 412 405 413 406 if (__get_cpu_var(nmi_touch)) {
+25 -37
arch/x86/kernel/cpu/perf_event.c
··· 220 220 struct perf_event *event); 221 221 struct event_constraint *event_constraints; 222 222 void (*quirks)(void); 223 + int perfctr_second_write; 223 224 224 225 int (*cpu_prepare)(int cpu); 225 226 void (*cpu_starting)(int cpu); ··· 296 295 * count to the generic event atomically: 297 296 */ 298 297 again: 299 - prev_raw_count = atomic64_read(&hwc->prev_count); 298 + prev_raw_count = local64_read(&hwc->prev_count); 300 299 rdmsrl(hwc->event_base + idx, new_raw_count); 301 300 302 - if (atomic64_cmpxchg(&hwc->prev_count, prev_raw_count, 301 + if (local64_cmpxchg(&hwc->prev_count, prev_raw_count, 303 302 new_raw_count) != prev_raw_count) 304 303 goto again; 305 304 ··· 314 313 delta = (new_raw_count << shift) - (prev_raw_count << shift); 315 314 delta >>= shift; 316 315 317 - atomic64_add(delta, &event->count); 318 - atomic64_sub(delta, &hwc->period_left); 316 + local64_add(delta, &event->count); 317 + local64_sub(delta, &hwc->period_left); 319 318 320 319 return new_raw_count; 321 320 } ··· 439 438 if (!hwc->sample_period) { 440 439 hwc->sample_period = x86_pmu.max_period; 441 440 hwc->last_period = hwc->sample_period; 442 - atomic64_set(&hwc->period_left, hwc->sample_period); 441 + local64_set(&hwc->period_left, hwc->sample_period); 443 442 } else { 444 443 /* 445 444 * If we have a PMU initialized but no APIC ··· 886 885 x86_perf_event_set_period(struct perf_event *event) 887 886 { 888 887 struct hw_perf_event *hwc = &event->hw; 889 - s64 left = atomic64_read(&hwc->period_left); 888 + s64 left = local64_read(&hwc->period_left); 890 889 s64 period = hwc->sample_period; 891 890 int ret = 0, idx = hwc->idx; 892 891 ··· 898 897 */ 899 898 if (unlikely(left <= -period)) { 900 899 left = period; 901 - atomic64_set(&hwc->period_left, left); 900 + local64_set(&hwc->period_left, left); 902 901 hwc->last_period = period; 903 902 ret = 1; 904 903 } 905 904 906 905 if (unlikely(left <= 0)) { 907 906 left += period; 908 - atomic64_set(&hwc->period_left, left); 907 + 
local64_set(&hwc->period_left, left); 909 908 hwc->last_period = period; 910 909 ret = 1; 911 910 } ··· 924 923 * The hw event starts counting from this event offset, 925 924 * mark it to be able to extra future deltas: 926 925 */ 927 - atomic64_set(&hwc->prev_count, (u64)-left); 926 + local64_set(&hwc->prev_count, (u64)-left); 928 927 929 - wrmsrl(hwc->event_base + idx, 928 + wrmsrl(hwc->event_base + idx, (u64)(-left) & x86_pmu.cntval_mask); 929 + 930 + /* 931 + * Due to erratum on certan cpu we need 932 + * a second write to be sure the register 933 + * is updated properly 934 + */ 935 + if (x86_pmu.perfctr_second_write) { 936 + wrmsrl(hwc->event_base + idx, 930 937 (u64)(-left) & x86_pmu.cntval_mask); 938 + } 931 939 932 940 perf_event_update_userpage(event); 933 941 ··· 979 969 * skip the schedulability test here, it will be peformed 980 970 * at commit time(->commit_txn) as a whole 981 971 */ 982 - if (cpuc->group_flag & PERF_EVENT_TXN_STARTED) 972 + if (cpuc->group_flag & PERF_EVENT_TXN) 983 973 goto out; 984 974 985 975 ret = x86_pmu.schedule_events(cpuc, n, assign); ··· 1106 1096 * The events never got scheduled and ->cancel_txn will truncate 1107 1097 * the event_list. 1108 1098 */ 1109 - if (cpuc->group_flag & PERF_EVENT_TXN_STARTED) 1099 + if (cpuc->group_flag & PERF_EVENT_TXN) 1110 1100 return; 1111 1101 1112 1102 x86_pmu_stop(event); ··· 1398 1388 { 1399 1389 struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); 1400 1390 1401 - cpuc->group_flag |= PERF_EVENT_TXN_STARTED; 1391 + cpuc->group_flag |= PERF_EVENT_TXN; 1402 1392 cpuc->n_txn = 0; 1403 1393 } 1404 1394 ··· 1411 1401 { 1412 1402 struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); 1413 1403 1414 - cpuc->group_flag &= ~PERF_EVENT_TXN_STARTED; 1404 + cpuc->group_flag &= ~PERF_EVENT_TXN; 1415 1405 /* 1416 1406 * Truncate the collected events. 
1417 1407 */ ··· 1445 1435 */ 1446 1436 memcpy(cpuc->assign, assign, n*sizeof(int)); 1447 1437 1448 - /* 1449 - * Clear out the txn count so that ->cancel_txn() which gets 1450 - * run after ->commit_txn() doesn't undo things. 1451 - */ 1452 - cpuc->n_txn = 0; 1438 + cpuc->group_flag &= ~PERF_EVENT_TXN; 1453 1439 1454 1440 return 0; 1455 1441 } ··· 1613 1607 .walk_stack = print_context_stack_bp, 1614 1608 }; 1615 1609 1616 - #include "../dumpstack.h" 1617 - 1618 1610 static void 1619 1611 perf_callchain_kernel(struct pt_regs *regs, struct perf_callchain_entry *entry) 1620 1612 { ··· 1732 1728 perf_do_callchain(regs, entry); 1733 1729 1734 1730 return entry; 1735 - } 1736 - 1737 - void perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip) 1738 - { 1739 - regs->ip = ip; 1740 - /* 1741 - * perf_arch_fetch_caller_regs adds another call, we need to increment 1742 - * the skip level 1743 - */ 1744 - regs->bp = rewind_frame_pointer(skip + 1); 1745 - regs->cs = __KERNEL_CS; 1746 - /* 1747 - * We abuse bit 3 to pass exact information, see perf_misc_flags 1748 - * and the comment with PERF_EFLAGS_EXACT. 1749 - */ 1750 - regs->flags = 0; 1751 1731 } 1752 1732 1753 1733 unsigned long perf_instruction_pointer(struct pt_regs *regs)
+120 -36
arch/x86/kernel/cpu/perf_event_p4.c
··· 21 21 char cntr[2][P4_CNTR_LIMIT]; /* counter index (offset), -1 on abscence */ 22 22 }; 23 23 24 - struct p4_cache_event_bind { 24 + struct p4_pebs_bind { 25 25 unsigned int metric_pebs; 26 26 unsigned int metric_vert; 27 27 }; 28 28 29 - #define P4_GEN_CACHE_EVENT_BIND(name) \ 30 - [P4_CACHE__##name] = { \ 31 - .metric_pebs = P4_PEBS__##name, \ 32 - .metric_vert = P4_VERT__##name, \ 29 + /* it sets P4_PEBS_ENABLE_UOP_TAG as well */ 30 + #define P4_GEN_PEBS_BIND(name, pebs, vert) \ 31 + [P4_PEBS_METRIC__##name] = { \ 32 + .metric_pebs = pebs | P4_PEBS_ENABLE_UOP_TAG, \ 33 + .metric_vert = vert, \ 33 34 } 34 35 35 - static struct p4_cache_event_bind p4_cache_event_bind_map[] = { 36 - P4_GEN_CACHE_EVENT_BIND(1stl_cache_load_miss_retired), 37 - P4_GEN_CACHE_EVENT_BIND(2ndl_cache_load_miss_retired), 38 - P4_GEN_CACHE_EVENT_BIND(dtlb_load_miss_retired), 39 - P4_GEN_CACHE_EVENT_BIND(dtlb_store_miss_retired), 36 + /* 37 + * note we have P4_PEBS_ENABLE_UOP_TAG always set here 38 + * 39 + * it's needed for mapping P4_PEBS_CONFIG_METRIC_MASK bits of 40 + * event configuration to find out which values are to be 41 + * written into MSR_IA32_PEBS_ENABLE and MSR_P4_PEBS_MATRIX_VERT 42 + * resgisters 43 + */ 44 + static struct p4_pebs_bind p4_pebs_bind_map[] = { 45 + P4_GEN_PEBS_BIND(1stl_cache_load_miss_retired, 0x0000001, 0x0000001), 46 + P4_GEN_PEBS_BIND(2ndl_cache_load_miss_retired, 0x0000002, 0x0000001), 47 + P4_GEN_PEBS_BIND(dtlb_load_miss_retired, 0x0000004, 0x0000001), 48 + P4_GEN_PEBS_BIND(dtlb_store_miss_retired, 0x0000004, 0x0000002), 49 + P4_GEN_PEBS_BIND(dtlb_all_miss_retired, 0x0000004, 0x0000003), 50 + P4_GEN_PEBS_BIND(tagged_mispred_branch, 0x0018000, 0x0000010), 51 + P4_GEN_PEBS_BIND(mob_load_replay_retired, 0x0000200, 0x0000001), 52 + P4_GEN_PEBS_BIND(split_load_retired, 0x0000400, 0x0000001), 53 + P4_GEN_PEBS_BIND(split_store_retired, 0x0000400, 0x0000002), 40 54 }; 41 55 42 56 /* ··· 295 281 }, 296 282 }; 297 283 298 - #define P4_GEN_CACHE_EVENT(event, 
bit, cache_event) \ 284 + #define P4_GEN_CACHE_EVENT(event, bit, metric) \ 299 285 p4_config_pack_escr(P4_ESCR_EVENT(event) | \ 300 286 P4_ESCR_EMASK_BIT(event, bit)) | \ 301 - p4_config_pack_cccr(cache_event | \ 287 + p4_config_pack_cccr(metric | \ 302 288 P4_CCCR_ESEL(P4_OPCODE_ESEL(P4_OPCODE(event)))) 303 289 304 290 static __initconst const u64 p4_hw_cache_event_ids ··· 310 296 [ C(OP_READ) ] = { 311 297 [ C(RESULT_ACCESS) ] = 0x0, 312 298 [ C(RESULT_MISS) ] = P4_GEN_CACHE_EVENT(P4_EVENT_REPLAY_EVENT, NBOGUS, 313 - P4_CACHE__1stl_cache_load_miss_retired), 299 + P4_PEBS_METRIC__1stl_cache_load_miss_retired), 314 300 }, 315 301 }, 316 302 [ C(LL ) ] = { 317 303 [ C(OP_READ) ] = { 318 304 [ C(RESULT_ACCESS) ] = 0x0, 319 305 [ C(RESULT_MISS) ] = P4_GEN_CACHE_EVENT(P4_EVENT_REPLAY_EVENT, NBOGUS, 320 - P4_CACHE__2ndl_cache_load_miss_retired), 306 + P4_PEBS_METRIC__2ndl_cache_load_miss_retired), 321 307 }, 322 308 }, 323 309 [ C(DTLB) ] = { 324 310 [ C(OP_READ) ] = { 325 311 [ C(RESULT_ACCESS) ] = 0x0, 326 312 [ C(RESULT_MISS) ] = P4_GEN_CACHE_EVENT(P4_EVENT_REPLAY_EVENT, NBOGUS, 327 - P4_CACHE__dtlb_load_miss_retired), 313 + P4_PEBS_METRIC__dtlb_load_miss_retired), 328 314 }, 329 315 [ C(OP_WRITE) ] = { 330 316 [ C(RESULT_ACCESS) ] = 0x0, 331 317 [ C(RESULT_MISS) ] = P4_GEN_CACHE_EVENT(P4_EVENT_REPLAY_EVENT, NBOGUS, 332 - P4_CACHE__dtlb_store_miss_retired), 318 + P4_PEBS_METRIC__dtlb_store_miss_retired), 333 319 }, 334 320 }, 335 321 [ C(ITLB) ] = { 336 322 [ C(OP_READ) ] = { 337 323 [ C(RESULT_ACCESS) ] = P4_GEN_CACHE_EVENT(P4_EVENT_ITLB_REFERENCE, HIT, 338 - P4_CACHE__itlb_reference_hit), 324 + P4_PEBS_METRIC__none), 339 325 [ C(RESULT_MISS) ] = P4_GEN_CACHE_EVENT(P4_EVENT_ITLB_REFERENCE, MISS, 340 - P4_CACHE__itlb_reference_miss), 326 + P4_PEBS_METRIC__none), 341 327 }, 342 328 [ C(OP_WRITE) ] = { 343 329 [ C(RESULT_ACCESS) ] = -1, ··· 428 414 return config; 429 415 } 430 416 417 + static int p4_validate_raw_event(struct perf_event *event) 418 + { 419 + unsigned 
int v; 420 + 421 + /* user data may have out-of-bound event index */ 422 + v = p4_config_unpack_event(event->attr.config); 423 + if (v >= ARRAY_SIZE(p4_event_bind_map)) { 424 + pr_warning("P4 PMU: Unknown event code: %d\n", v); 425 + return -EINVAL; 426 + } 427 + 428 + /* 429 + * it may have some screwed PEBS bits 430 + */ 431 + if (p4_config_pebs_has(event->attr.config, P4_PEBS_CONFIG_ENABLE)) { 432 + pr_warning("P4 PMU: PEBS are not supported yet\n"); 433 + return -EINVAL; 434 + } 435 + v = p4_config_unpack_metric(event->attr.config); 436 + if (v >= ARRAY_SIZE(p4_pebs_bind_map)) { 437 + pr_warning("P4 PMU: Unknown metric code: %d\n", v); 438 + return -EINVAL; 439 + } 440 + 441 + return 0; 442 + } 443 + 431 444 static int p4_hw_config(struct perf_event *event) 432 445 { 433 446 int cpu = get_cpu(); 434 447 int rc = 0; 435 - unsigned int evnt; 436 448 u32 escr, cccr; 437 449 438 450 /* ··· 478 438 479 439 if (event->attr.type == PERF_TYPE_RAW) { 480 440 481 - /* user data may have out-of-bound event index */ 482 - evnt = p4_config_unpack_event(event->attr.config); 483 - if (evnt >= ARRAY_SIZE(p4_event_bind_map)) { 484 - rc = -EINVAL; 441 + rc = p4_validate_raw_event(event); 442 + if (rc) 485 443 goto out; 486 - } 487 444 488 445 /* 489 446 * We don't control raw events so it's up to the caller ··· 488 451 * on HT machine but allow HT-compatible specifics to be 489 452 * passed on) 490 453 * 454 + * Note that for RAW events we allow user to use P4_CCCR_RESERVED 455 + * bits since we keep additional info here (for cache events and etc) 456 + * 491 457 * XXX: HT wide things should check perf_paranoid_cpu() && 492 458 * CAP_SYS_ADMIN 493 459 */ 494 460 event->hw.config |= event->attr.config & 495 461 (p4_config_pack_escr(P4_ESCR_MASK_HT) | 496 - p4_config_pack_cccr(P4_CCCR_MASK_HT)); 462 + p4_config_pack_cccr(P4_CCCR_MASK_HT | P4_CCCR_RESERVED)); 497 463 } 498 464 499 465 rc = x86_setup_perfctr(event); ··· 520 480 } 521 481 522 482 return overflow; 483 + } 484 + 485 + 
static void p4_pmu_disable_pebs(void) 486 + { 487 + /* 488 + * FIXME 489 + * 490 + * It's still allowed that two threads set up the same cache 491 + * events so we can't simply clear metrics until we know 492 + * no one is depending on us, so we need some kind of counter 493 + * for "ReplayEvent" users. 494 + * 495 + * What is more complex -- RAW events, if user (for some 496 + * reason) will pass some cache event metric with improper 497 + * event opcode -- it's fine from hardware point of view 498 + * but complete nonsense from the "meaning" of such action. 499 + * 500 + * So for the moment let's leave metrics turned on forever -- it's 501 + * ok for now but needs to be revisited! 502 + * 503 + * (void)checking_wrmsrl(MSR_IA32_PEBS_ENABLE, (u64)0); 504 + * (void)checking_wrmsrl(MSR_P4_PEBS_MATRIX_VERT, (u64)0); 505 + */ 523 506 } 524 507 525 508 static inline void p4_pmu_disable_event(struct perf_event *event) ··· 570 507 continue; 571 508 p4_pmu_disable_event(event); 572 509 } 510 + 511 + p4_pmu_disable_pebs(); 512 + } 513 + 514 + /* configuration must be valid */ 515 + static void p4_pmu_enable_pebs(u64 config) 516 + { 517 + struct p4_pebs_bind *bind; 518 + unsigned int idx; 519 + 520 + BUILD_BUG_ON(P4_PEBS_METRIC__max > P4_PEBS_CONFIG_METRIC_MASK); 521 + 522 + idx = p4_config_unpack_metric(config); 523 + if (idx == P4_PEBS_METRIC__none) 524 + return; 525 + 526 + bind = &p4_pebs_bind_map[idx]; 527 + 528 + (void)checking_wrmsrl(MSR_IA32_PEBS_ENABLE, (u64)bind->metric_pebs); 529 + (void)checking_wrmsrl(MSR_P4_PEBS_MATRIX_VERT, (u64)bind->metric_vert); 573 530 } 574 531 575 532 static void p4_pmu_enable_event(struct perf_event *event) ··· 598 515 int thread = p4_ht_config_thread(hwc->config); 599 516 u64 escr_conf = p4_config_unpack_escr(p4_clear_ht_bit(hwc->config)); 600 517 unsigned int idx = p4_config_unpack_event(hwc->config); 601 - unsigned int idx_cache = p4_config_unpack_cache_event(hwc->config); 602 518 struct p4_event_bind *bind; 603 - struct p4_cache_event_bind *bind_cache; 
604 519 u64 escr_addr, cccr; 605 520 606 521 bind = &p4_event_bind_map[idx]; ··· 618 537 cccr = p4_config_unpack_cccr(hwc->config); 619 538 620 539 /* 621 - * it could be Cache event so that we need to 622 - * set metrics into additional MSRs 540 + * it could be Cache event so we need to write metrics 541 + * into additional MSRs 623 542 */ 624 - BUILD_BUG_ON(P4_CACHE__MAX > P4_CCCR_CACHE_OPS_MASK); 625 - if (idx_cache > P4_CACHE__NONE && 626 - idx_cache < ARRAY_SIZE(p4_cache_event_bind_map)) { 627 - bind_cache = &p4_cache_event_bind_map[idx_cache]; 628 - (void)checking_wrmsrl(MSR_IA32_PEBS_ENABLE, (u64)bind_cache->metric_pebs); 629 - (void)checking_wrmsrl(MSR_P4_PEBS_MATRIX_VERT, (u64)bind_cache->metric_vert); 630 - } 543 + p4_pmu_enable_pebs(hwc->config); 631 544 632 545 (void)checking_wrmsrl(escr_addr, escr_conf); 633 546 (void)checking_wrmsrl(hwc->config_base + hwc->idx, ··· 904 829 .max_period = (1ULL << 39) - 1, 905 830 .hw_config = p4_hw_config, 906 831 .schedule_events = p4_pmu_schedule_events, 832 + /* 833 + * This handles erratum N15 in intel doc 249199-029, 834 + * the counter may not be updated correctly on write 835 + * so we need a second write operation to do the trick 836 + * (the official workaround didn't work) 837 + * 838 + * the former idea is taken from OProfile code 839 + */ 840 + .perfctr_second_write = 1, 907 841 }; 908 842 909 843 static __init int p4_pmu_init(void)
-1
arch/x86/kernel/dumpstack.c
··· 18 18 19 19 #include <asm/stacktrace.h> 20 20 21 - #include "dumpstack.h" 22 21 23 22 int panic_on_unrecovered_nmi; 24 23 int panic_on_io_nmi;
-56
arch/x86/kernel/dumpstack.h
··· 1 - /* 2 - * Copyright (C) 1991, 1992 Linus Torvalds 3 - * Copyright (C) 2000, 2001, 2002 Andi Kleen, SuSE Labs 4 - */ 5 - 6 - #ifndef DUMPSTACK_H 7 - #define DUMPSTACK_H 8 - 9 - #ifdef CONFIG_X86_32 10 - #define STACKSLOTS_PER_LINE 8 11 - #define get_bp(bp) asm("movl %%ebp, %0" : "=r" (bp) :) 12 - #else 13 - #define STACKSLOTS_PER_LINE 4 14 - #define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :) 15 - #endif 16 - 17 - #include <linux/uaccess.h> 18 - 19 - extern void 20 - show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs, 21 - unsigned long *stack, unsigned long bp, char *log_lvl); 22 - 23 - extern void 24 - show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs, 25 - unsigned long *sp, unsigned long bp, char *log_lvl); 26 - 27 - extern unsigned int code_bytes; 28 - 29 - /* The form of the top of the frame on the stack */ 30 - struct stack_frame { 31 - struct stack_frame *next_frame; 32 - unsigned long return_address; 33 - }; 34 - 35 - struct stack_frame_ia32 { 36 - u32 next_frame; 37 - u32 return_address; 38 - }; 39 - 40 - static inline unsigned long rewind_frame_pointer(int n) 41 - { 42 - struct stack_frame *frame; 43 - 44 - get_bp(frame); 45 - 46 - #ifdef CONFIG_FRAME_POINTER 47 - while (n--) { 48 - if (probe_kernel_address(&frame->next_frame, frame)) 49 - break; 50 - } 51 - #endif 52 - 53 - return (unsigned long)frame; 54 - } 55 - 56 - #endif /* DUMPSTACK_H */
-2
arch/x86/kernel/dumpstack_32.c
··· 16 16 17 17 #include <asm/stacktrace.h> 18 18 19 - #include "dumpstack.h" 20 - 21 19 22 20 void dump_trace(struct task_struct *task, struct pt_regs *regs, 23 21 unsigned long *stack, unsigned long bp,
-1
arch/x86/kernel/dumpstack_64.c
··· 16 16 17 17 #include <asm/stacktrace.h> 18 18 19 - #include "dumpstack.h" 20 19 21 20 #define N_EXCEPTION_STACKS_END \ 22 21 (N_EXCEPTION_STACKS + DEBUG_STKSZ/EXCEPTION_STKSZ - 2)
+36 -15
arch/x86/kernel/hw_breakpoint.c
··· 208 208 { 209 209 /* Len */ 210 210 switch (x86_len) { 211 + case X86_BREAKPOINT_LEN_X: 212 + *gen_len = sizeof(long); 213 + break; 211 214 case X86_BREAKPOINT_LEN_1: 212 215 *gen_len = HW_BREAKPOINT_LEN_1; 213 216 break; ··· 254 251 255 252 info->address = bp->attr.bp_addr; 256 253 254 + /* Type */ 255 + switch (bp->attr.bp_type) { 256 + case HW_BREAKPOINT_W: 257 + info->type = X86_BREAKPOINT_WRITE; 258 + break; 259 + case HW_BREAKPOINT_W | HW_BREAKPOINT_R: 260 + info->type = X86_BREAKPOINT_RW; 261 + break; 262 + case HW_BREAKPOINT_X: 263 + info->type = X86_BREAKPOINT_EXECUTE; 264 + /* 265 + * x86 inst breakpoints need to have a specific undefined len. 266 + * But we still need to check userspace is not trying to setup 267 + * an unsupported length, to get a range breakpoint for example. 268 + */ 269 + if (bp->attr.bp_len == sizeof(long)) { 270 + info->len = X86_BREAKPOINT_LEN_X; 271 + return 0; 272 + } 273 + default: 274 + return -EINVAL; 275 + } 276 + 257 277 /* Len */ 258 278 switch (bp->attr.bp_len) { 259 279 case HW_BREAKPOINT_LEN_1: ··· 293 267 info->len = X86_BREAKPOINT_LEN_8; 294 268 break; 295 269 #endif 296 - default: 297 - return -EINVAL; 298 - } 299 - 300 - /* Type */ 301 - switch (bp->attr.bp_type) { 302 - case HW_BREAKPOINT_W: 303 - info->type = X86_BREAKPOINT_WRITE; 304 - break; 305 - case HW_BREAKPOINT_W | HW_BREAKPOINT_R: 306 - info->type = X86_BREAKPOINT_RW; 307 - break; 308 - case HW_BREAKPOINT_X: 309 - info->type = X86_BREAKPOINT_EXECUTE; 310 - break; 311 270 default: 312 271 return -EINVAL; 313 272 } ··· 316 305 ret = -EINVAL; 317 306 318 307 switch (info->len) { 308 + case X86_BREAKPOINT_LEN_X: 309 + align = sizeof(long) -1; 310 + break; 319 311 case X86_BREAKPOINT_LEN_1: 320 312 align = 0; 321 313 break; ··· 479 465 } 480 466 481 467 perf_bp_event(bp, args->regs); 468 + 469 + /* 470 + * Set up resume flag to avoid breakpoint recursion when 471 + * returning back to origin. 
472 + */ 473 + if (bp->hw.info.type == X86_BREAKPOINT_EXECUTE) 474 + args->regs->flags |= X86_EFLAGS_RF; 482 475 483 476 rcu_read_unlock(); 484 477 }
+17 -16
arch/x86/kernel/kprobes.c
··· 126 126 } 127 127 128 128 /* 129 - * Check for the REX prefix which can only exist on X86_64 130 - * X86_32 always returns 0 129 + * Skip the prefixes of the instruction. 131 130 */ 132 - static int __kprobes is_REX_prefix(kprobe_opcode_t *insn) 131 + static kprobe_opcode_t *__kprobes skip_prefixes(kprobe_opcode_t *insn) 133 132 { 133 + insn_attr_t attr; 134 + 135 + attr = inat_get_opcode_attribute((insn_byte_t)*insn); 136 + while (inat_is_legacy_prefix(attr)) { 137 + insn++; 138 + attr = inat_get_opcode_attribute((insn_byte_t)*insn); 139 + } 134 140 #ifdef CONFIG_X86_64 135 - if ((*insn & 0xf0) == 0x40) 136 - return 1; 141 + if (inat_is_rex_prefix(attr)) 142 + insn++; 137 143 #endif 138 - return 0; 144 + return insn; 139 145 } 140 146 141 147 /* ··· 278 272 */ 279 273 static int __kprobes is_IF_modifier(kprobe_opcode_t *insn) 280 274 { 275 + /* Skip prefixes */ 276 + insn = skip_prefixes(insn); 277 + 281 278 switch (*insn) { 282 279 case 0xfa: /* cli */ 283 280 case 0xfb: /* sti */ ··· 288 279 case 0x9d: /* popf/popfd */ 289 280 return 1; 290 281 } 291 - 292 - /* 293 - * on X86_64, 0x40-0x4f are REX prefixes so we need to look 294 - * at the next byte instead.. but of course not recurse infinitely 295 - */ 296 - if (is_REX_prefix(insn)) 297 - return is_IF_modifier(++insn); 298 282 299 283 return 0; 300 284 } ··· 805 803 unsigned long orig_ip = (unsigned long)p->addr; 806 804 kprobe_opcode_t *insn = p->ainsn.insn; 807 805 808 - /*skip the REX prefix*/ 809 - if (is_REX_prefix(insn)) 810 - insn++; 806 + /* Skip prefixes */ 807 + insn = skip_prefixes(insn); 811 808 812 809 regs->flags &= ~X86_EFLAGS_TF; 813 810 switch (*insn) {
+4
arch/x86/kernel/process_32.c
··· 57 57 #include <asm/syscalls.h> 58 58 #include <asm/debugreg.h> 59 59 60 + #include <trace/events/power.h> 61 + 60 62 asmlinkage void ret_from_fork(void) __asm__("ret_from_fork"); 61 63 62 64 /* ··· 113 111 stop_critical_timings(); 114 112 pm_idle(); 115 113 start_critical_timings(); 114 + 115 + trace_power_end(smp_processor_id()); 116 116 } 117 117 tick_nohz_restart_sched_tick(); 118 118 preempt_enable_no_resched();
+5
arch/x86/kernel/process_64.c
··· 51 51 #include <asm/syscalls.h> 52 52 #include <asm/debugreg.h> 53 53 54 + #include <trace/events/power.h> 55 + 54 56 asmlinkage extern void ret_from_fork(void); 55 57 56 58 DEFINE_PER_CPU(unsigned long, old_rsp); ··· 140 138 stop_critical_timings(); 141 139 pm_idle(); 142 140 start_critical_timings(); 141 + 142 + trace_power_end(smp_processor_id()); 143 + 143 144 /* In many cases the interrupt that ended idle 144 145 has already called exit_idle. But some idle 145 146 loops can be woken up without interrupt. */
+16 -15
arch/x86/kernel/stacktrace.c
··· 23 23 return 0; 24 24 } 25 25 26 - static void save_stack_address(void *data, unsigned long addr, int reliable) 26 + static void 27 + __save_stack_address(void *data, unsigned long addr, bool reliable, bool nosched) 27 28 { 28 29 struct stack_trace *trace = data; 30 + #ifdef CONFIG_FRAME_POINTER 29 31 if (!reliable) 32 + return; 33 + #endif 34 + if (nosched && in_sched_functions(addr)) 30 35 return; 31 36 if (trace->skip > 0) { 32 37 trace->skip--; ··· 41 36 trace->entries[trace->nr_entries++] = addr; 42 37 } 43 38 39 + static void save_stack_address(void *data, unsigned long addr, int reliable) 40 + { 41 + return __save_stack_address(data, addr, reliable, false); 42 + } 43 + 44 44 static void 45 45 save_stack_address_nosched(void *data, unsigned long addr, int reliable) 46 46 { 47 - struct stack_trace *trace = (struct stack_trace *)data; 48 - if (!reliable) 49 - return; 50 - if (in_sched_functions(addr)) 51 - return; 52 - if (trace->skip > 0) { 53 - trace->skip--; 54 - return; 55 - } 56 - if (trace->nr_entries < trace->max_entries) 57 - trace->entries[trace->nr_entries++] = addr; 47 + return __save_stack_address(data, addr, reliable, true); 58 48 } 59 49 60 50 static const struct stacktrace_ops save_stack_ops = { ··· 96 96 97 97 /* Userspace stacktrace - based on kernel/trace/trace_sysprof.c */ 98 98 99 - struct stack_frame { 99 + struct stack_frame_user { 100 100 const void __user *next_fp; 101 101 unsigned long ret_addr; 102 102 }; 103 103 104 - static int copy_stack_frame(const void __user *fp, struct stack_frame *frame) 104 + static int 105 + copy_stack_frame(const void __user *fp, struct stack_frame_user *frame) 105 106 { 106 107 int ret; 107 108 ··· 127 126 trace->entries[trace->nr_entries++] = regs->ip; 128 127 129 128 while (trace->nr_entries < trace->max_entries) { 130 - struct stack_frame frame; 129 + struct stack_frame_user frame; 131 130 132 131 frame.next_fp = NULL; 133 132 frame.ret_addr = 0;
+7
arch/x86/kernel/traps.c
··· 392 392 if (notify_die(DIE_NMI_IPI, "nmi_ipi", regs, reason, 2, SIGINT) 393 393 == NOTIFY_STOP) 394 394 return; 395 + 395 396 #ifdef CONFIG_X86_LOCAL_APIC 397 + if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT) 398 + == NOTIFY_STOP) 399 + return; 400 + 401 + #ifndef CONFIG_LOCKUP_DETECTOR 396 402 /* 397 403 * Ok, so this is none of the documented NMI sources, 398 404 * so it must be the NMI watchdog. ··· 406 400 if (nmi_watchdog_tick(regs, reason)) 407 401 return; 408 402 if (!do_nmi_callback(regs, cpu)) 403 + #endif /* !CONFIG_LOCKUP_DETECTOR */ 409 404 unknown_nmi_error(reason, regs); 410 405 #else 411 406 unknown_nmi_error(reason, regs);
+17 -13
arch/x86/mm/pf_in.c
··· 40 40 static unsigned int reg_rop[] = { 41 41 0x8A, 0x8B, 0xB60F, 0xB70F, 0xBE0F, 0xBF0F 42 42 }; 43 - static unsigned int reg_wop[] = { 0x88, 0x89 }; 43 + static unsigned int reg_wop[] = { 0x88, 0x89, 0xAA, 0xAB }; 44 44 static unsigned int imm_wop[] = { 0xC6, 0xC7 }; 45 45 /* IA32 Manual 3, 3-432*/ 46 - static unsigned int rw8[] = { 0x88, 0x8A, 0xC6 }; 46 + static unsigned int rw8[] = { 0x88, 0x8A, 0xC6, 0xAA }; 47 47 static unsigned int rw32[] = { 48 - 0x89, 0x8B, 0xC7, 0xB60F, 0xB70F, 0xBE0F, 0xBF0F 48 + 0x89, 0x8B, 0xC7, 0xB60F, 0xB70F, 0xBE0F, 0xBF0F, 0xAB 49 49 }; 50 - static unsigned int mw8[] = { 0x88, 0x8A, 0xC6, 0xB60F, 0xBE0F }; 50 + static unsigned int mw8[] = { 0x88, 0x8A, 0xC6, 0xB60F, 0xBE0F, 0xAA }; 51 51 static unsigned int mw16[] = { 0xB70F, 0xBF0F }; 52 - static unsigned int mw32[] = { 0x89, 0x8B, 0xC7 }; 52 + static unsigned int mw32[] = { 0x89, 0x8B, 0xC7, 0xAB }; 53 53 static unsigned int mw64[] = {}; 54 54 #else /* not __i386__ */ 55 55 static unsigned char prefix_codes[] = { ··· 63 63 static unsigned int reg_rop[] = { 64 64 0x8A, 0x8B, 0xB60F, 0xB70F, 0xBE0F, 0xBF0F 65 65 }; 66 - static unsigned int reg_wop[] = { 0x88, 0x89 }; 66 + static unsigned int reg_wop[] = { 0x88, 0x89, 0xAA, 0xAB }; 67 67 static unsigned int imm_wop[] = { 0xC6, 0xC7 }; 68 - static unsigned int rw8[] = { 0xC6, 0x88, 0x8A }; 68 + static unsigned int rw8[] = { 0xC6, 0x88, 0x8A, 0xAA }; 69 69 static unsigned int rw32[] = { 70 - 0xC7, 0x89, 0x8B, 0xB60F, 0xB70F, 0xBE0F, 0xBF0F 70 + 0xC7, 0x89, 0x8B, 0xB60F, 0xB70F, 0xBE0F, 0xBF0F, 0xAB 71 71 }; 72 72 /* 8 bit only */ 73 - static unsigned int mw8[] = { 0xC6, 0x88, 0x8A, 0xB60F, 0xBE0F }; 73 + static unsigned int mw8[] = { 0xC6, 0x88, 0x8A, 0xB60F, 0xBE0F, 0xAA }; 74 74 /* 16 bit only */ 75 75 static unsigned int mw16[] = { 0xB70F, 0xBF0F }; 76 76 /* 16 or 32 bit */ 77 77 static unsigned int mw32[] = { 0xC7 }; 78 78 /* 16, 32 or 64 bit */ 79 - static unsigned int mw64[] = { 0x89, 0x8B }; 79 + static unsigned int mw64[] 
= { 0x89, 0x8B, 0xAB }; 80 80 #endif /* not __i386__ */ 81 81 82 82 struct prefix_bits { ··· 410 410 unsigned long get_ins_reg_val(unsigned long ins_addr, struct pt_regs *regs) 411 411 { 412 412 unsigned int opcode; 413 - unsigned char mod_rm; 414 413 int reg; 415 414 unsigned char *p; 416 415 struct prefix_bits prf; ··· 436 437 goto err; 437 438 438 439 do_work: 439 - mod_rm = *p; 440 - reg = ((mod_rm >> 3) & 0x7) | (prf.rexr << 3); 440 + /* for STOS, source register is fixed */ 441 + if (opcode == 0xAA || opcode == 0xAB) { 442 + reg = arg_AX; 443 + } else { 444 + unsigned char mod_rm = *p; 445 + reg = ((mod_rm >> 3) & 0x7) | (prf.rexr << 3); 446 + } 441 447 switch (get_ins_reg_width(ins_addr)) { 442 448 case 1: 443 449 return *get_reg_w8(reg, prf.rex, regs);
+14 -2
arch/x86/oprofile/nmi_int.c
··· 634 634 if (force_arch_perfmon && cpu_has_arch_perfmon) 635 635 return 0; 636 636 637 + /* 638 + * Documentation on identifying Intel processors by CPU family 639 + * and model can be found in the Intel Software Developer's 640 + * Manuals (SDM): 641 + * 642 + * http://www.intel.com/products/processor/manuals/ 643 + * 644 + * As of May 2010 the documentation for this was in the: 645 + * "Intel 64 and IA-32 Architectures Software Developer's 646 + * Manual Volume 3B: System Programming Guide", "Table B-1 647 + * CPUID Signature Values of DisplayFamily_DisplayModel". 648 + */ 637 649 switch (cpu_model) { 638 650 case 0 ... 2: 639 651 *cpu_type = "i386/ppro"; ··· 667 655 case 15: case 23: 668 656 *cpu_type = "i386/core_2"; 669 657 break; 658 + case 0x1a: 670 659 case 0x2e: 671 - case 26: 672 660 spec = &op_arch_perfmon_spec; 673 661 *cpu_type = "i386/core_i7"; 674 662 break; 675 - case 28: 663 + case 0x1c: 676 664 *cpu_type = "i386/atom"; 677 665 break; 678 666 default:
+1
arch/xtensa/include/asm/local64.h
··· 1 + #include <asm-generic/local64.h>
+2 -1
drivers/oprofile/event_buffer.c
··· 135 135 * echo 1 >/dev/oprofile/enable 136 136 */ 137 137 138 - return 0; 138 + return nonseekable_open(inode, file); 139 139 140 140 fail: 141 141 dcookie_unregister(file->private_data); ··· 205 205 .open = event_buffer_open, 206 206 .release = event_buffer_release, 207 207 .read = event_buffer_read, 208 + .llseek = no_llseek, 208 209 };
+1
fs/exec.c
··· 653 653 else 654 654 stack_base = vma->vm_start - stack_expand; 655 655 #endif 656 + current->mm->start_stack = bprm->p; 656 657 ret = expand_stack(vma, stack_base); 657 658 if (ret) 658 659 ret = -EFAULT;
+96
include/asm-generic/local64.h
··· 1 + #ifndef _ASM_GENERIC_LOCAL64_H 2 + #define _ASM_GENERIC_LOCAL64_H 3 + 4 + #include <linux/percpu.h> 5 + #include <asm/types.h> 6 + 7 + /* 8 + * A signed long type for operations which are atomic for a single CPU. 9 + * Usually used in combination with per-cpu variables. 10 + * 11 + * This is the default implementation, which uses atomic64_t. Which is 12 + * rather pointless. The whole point behind local64_t is that some processors 13 + * can perform atomic adds and subtracts in a manner which is atomic wrt IRQs 14 + * running on this CPU. local64_t allows exploitation of such capabilities. 15 + */ 16 + 17 + /* Implement in terms of atomics. */ 18 + 19 + #if BITS_PER_LONG == 64 20 + 21 + #include <asm/local.h> 22 + 23 + typedef struct { 24 + local_t a; 25 + } local64_t; 26 + 27 + #define LOCAL64_INIT(i) { LOCAL_INIT(i) } 28 + 29 + #define local64_read(l) local_read(&(l)->a) 30 + #define local64_set(l,i) local_set((&(l)->a),(i)) 31 + #define local64_inc(l) local_inc(&(l)->a) 32 + #define local64_dec(l) local_dec(&(l)->a) 33 + #define local64_add(i,l) local_add((i),(&(l)->a)) 34 + #define local64_sub(i,l) local_sub((i),(&(l)->a)) 35 + 36 + #define local64_sub_and_test(i, l) local_sub_and_test((i), (&(l)->a)) 37 + #define local64_dec_and_test(l) local_dec_and_test(&(l)->a) 38 + #define local64_inc_and_test(l) local_inc_and_test(&(l)->a) 39 + #define local64_add_negative(i, l) local_add_negative((i), (&(l)->a)) 40 + #define local64_add_return(i, l) local_add_return((i), (&(l)->a)) 41 + #define local64_sub_return(i, l) local_sub_return((i), (&(l)->a)) 42 + #define local64_inc_return(l) local_inc_return(&(l)->a) 43 + 44 + #define local64_cmpxchg(l, o, n) local_cmpxchg((&(l)->a), (o), (n)) 45 + #define local64_xchg(l, n) local_xchg((&(l)->a), (n)) 46 + #define local64_add_unless(l, _a, u) local_add_unless((&(l)->a), (_a), (u)) 47 + #define local64_inc_not_zero(l) local_inc_not_zero(&(l)->a) 48 + 49 + /* Non-atomic variants, ie. 
preemption disabled and won't be touched 50 + * in interrupt, etc. Some archs can optimize this case well. */ 51 + #define __local64_inc(l) local64_set((l), local64_read(l) + 1) 52 + #define __local64_dec(l) local64_set((l), local64_read(l) - 1) 53 + #define __local64_add(i,l) local64_set((l), local64_read(l) + (i)) 54 + #define __local64_sub(i,l) local64_set((l), local64_read(l) - (i)) 55 + 56 + #else /* BITS_PER_LONG != 64 */ 57 + 58 + #include <asm/atomic.h> 59 + 60 + /* Don't use typedef: don't want them to be mixed with atomic_t's. */ 61 + typedef struct { 62 + atomic64_t a; 63 + } local64_t; 64 + 65 + #define LOCAL64_INIT(i) { ATOMIC_LONG_INIT(i) } 66 + 67 + #define local64_read(l) atomic64_read(&(l)->a) 68 + #define local64_set(l,i) atomic64_set((&(l)->a),(i)) 69 + #define local64_inc(l) atomic64_inc(&(l)->a) 70 + #define local64_dec(l) atomic64_dec(&(l)->a) 71 + #define local64_add(i,l) atomic64_add((i),(&(l)->a)) 72 + #define local64_sub(i,l) atomic64_sub((i),(&(l)->a)) 73 + 74 + #define local64_sub_and_test(i, l) atomic64_sub_and_test((i), (&(l)->a)) 75 + #define local64_dec_and_test(l) atomic64_dec_and_test(&(l)->a) 76 + #define local64_inc_and_test(l) atomic64_inc_and_test(&(l)->a) 77 + #define local64_add_negative(i, l) atomic64_add_negative((i), (&(l)->a)) 78 + #define local64_add_return(i, l) atomic64_add_return((i), (&(l)->a)) 79 + #define local64_sub_return(i, l) atomic64_sub_return((i), (&(l)->a)) 80 + #define local64_inc_return(l) atomic64_inc_return(&(l)->a) 81 + 82 + #define local64_cmpxchg(l, o, n) atomic64_cmpxchg((&(l)->a), (o), (n)) 83 + #define local64_xchg(l, n) atomic64_xchg((&(l)->a), (n)) 84 + #define local64_add_unless(l, _a, u) atomic64_add_unless((&(l)->a), (_a), (u)) 85 + #define local64_inc_not_zero(l) atomic64_inc_not_zero(&(l)->a) 86 + 87 + /* Non-atomic variants, ie. preemption disabled and won't be touched 88 + * in interrupt, etc. Some archs can optimize this case well. 
*/ 89 + #define __local64_inc(l) local64_set((l), local64_read(l) + 1) 90 + #define __local64_dec(l) local64_set((l), local64_read(l) - 1) 91 + #define __local64_add(i,l) local64_set((l), local64_read(l) + (i)) 92 + #define __local64_sub(i,l) local64_set((l), local64_read(l) - (i)) 93 + 94 + #endif /* BITS_PER_LONG != 64 */ 95 + 96 + #endif /* _ASM_GENERIC_LOCAL64_H */
-4
include/asm-generic/vmlinux.lds.h
··· 156 156 CPU_KEEP(exit.data) \ 157 157 MEM_KEEP(init.data) \ 158 158 MEM_KEEP(exit.data) \ 159 - . = ALIGN(8); \ 160 - VMLINUX_SYMBOL(__start___markers) = .; \ 161 - *(__markers) \ 162 - VMLINUX_SYMBOL(__stop___markers) = .; \ 163 159 . = ALIGN(32); \ 164 160 VMLINUX_SYMBOL(__start___tracepoints) = .; \ 165 161 *(__tracepoints) \
+5
include/linux/ftrace.h
··· 1 + /* 2 + * Ftrace header. For implementation details beyond the random comments 3 + * scattered below, see: Documentation/trace/ftrace-design.txt 4 + */ 5 + 1 6 #ifndef _LINUX_FTRACE_H 2 7 #define _LINUX_FTRACE_H 3 8
+12 -6
include/linux/ftrace_event.h
··· 11 11 struct tracer; 12 12 struct dentry; 13 13 14 - DECLARE_PER_CPU(struct trace_seq, ftrace_event_seq); 15 - 16 14 struct trace_print_flags { 17 15 unsigned long mask; 18 16 const char *name; ··· 55 57 struct mutex mutex; 56 58 struct ring_buffer_iter *buffer_iter[NR_CPUS]; 57 59 unsigned long iter_flags; 60 + 61 + /* trace_seq for __print_flags() and __print_symbolic() etc. */ 62 + struct trace_seq tmp_seq; 58 63 59 64 /* The below is zeroed out in pipe_read */ 60 65 struct trace_seq seq; ··· 147 146 int (*raw_init)(struct ftrace_event_call *); 148 147 }; 149 148 149 + extern int ftrace_event_reg(struct ftrace_event_call *event, 150 + enum trace_reg type); 151 + 150 152 enum { 151 153 TRACE_EVENT_FL_ENABLED_BIT, 152 154 TRACE_EVENT_FL_FILTERED_BIT, 155 + TRACE_EVENT_FL_RECORDED_CMD_BIT, 153 156 }; 154 157 155 158 enum { 156 - TRACE_EVENT_FL_ENABLED = (1 << TRACE_EVENT_FL_ENABLED_BIT), 157 - TRACE_EVENT_FL_FILTERED = (1 << TRACE_EVENT_FL_FILTERED_BIT), 159 + TRACE_EVENT_FL_ENABLED = (1 << TRACE_EVENT_FL_ENABLED_BIT), 160 + TRACE_EVENT_FL_FILTERED = (1 << TRACE_EVENT_FL_FILTERED_BIT), 161 + TRACE_EVENT_FL_RECORDED_CMD = (1 << TRACE_EVENT_FL_RECORDED_CMD_BIT), 158 162 }; 159 163 160 164 struct ftrace_event_call { ··· 177 171 * 32 bit flags: 178 172 * bit 1: enabled 179 173 * bit 2: filter_active 174 + * bit 3: enabled cmd record 180 175 * 181 176 * Changes to flags must hold the event_mutex. 182 177 * ··· 264 257 perf_trace_buf_submit(void *raw_data, int size, int rctx, u64 addr, 265 258 u64 count, struct pt_regs *regs, void *head) 266 259 { 267 - perf_tp_event(addr, count, raw_data, size, regs, head); 268 - perf_swevent_put_recursion_context(rctx); 260 + perf_tp_event(addr, count, raw_data, size, regs, head, rctx); 269 261 } 270 262 #endif 271 263
-5
include/linux/kernel.h
··· 513 513 extern void tracing_stop(void); 514 514 extern void ftrace_off_permanent(void); 515 515 516 - extern void 517 - ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3); 518 - 519 516 static inline void __attribute__ ((format (printf, 1, 2))) 520 517 ____trace_printk_check_format(const char *fmt, ...) 521 518 { ··· 588 591 589 592 extern void ftrace_dump(enum ftrace_dump_mode oops_dump_mode); 590 593 #else 591 - static inline void 592 - ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3) { } 593 594 static inline int 594 595 trace_printk(const char *fmt, ...) __attribute__ ((format (printf, 1, 2))); 595 596
-25
include/linux/kmemtrace.h
··· 1 - /* 2 - * Copyright (C) 2008 Eduard - Gabriel Munteanu 3 - * 4 - * This file is released under GPL version 2. 5 - */ 6 - 7 - #ifndef _LINUX_KMEMTRACE_H 8 - #define _LINUX_KMEMTRACE_H 9 - 10 - #ifdef __KERNEL__ 11 - 12 - #include <trace/events/kmem.h> 13 - 14 - #ifdef CONFIG_KMEMTRACE 15 - extern void kmemtrace_init(void); 16 - #else 17 - static inline void kmemtrace_init(void) 18 - { 19 - } 20 - #endif 21 - 22 - #endif /* __KERNEL__ */ 23 - 24 - #endif /* _LINUX_KMEMTRACE_H */ 25 -
+13
include/linux/nmi.h
··· 20 20 extern void acpi_nmi_disable(void); 21 21 extern void acpi_nmi_enable(void); 22 22 #else 23 + #ifndef CONFIG_HARDLOCKUP_DETECTOR 23 24 static inline void touch_nmi_watchdog(void) 24 25 { 25 26 touch_softlockup_watchdog(); 26 27 } 28 + #else 29 + extern void touch_nmi_watchdog(void); 30 + #endif 27 31 static inline void acpi_nmi_disable(void) { } 28 32 static inline void acpi_nmi_enable(void) { } 29 33 #endif ··· 49 45 { 50 46 return false; 51 47 } 48 + #endif 49 + 50 + #ifdef CONFIG_LOCKUP_DETECTOR 51 + int hw_nmi_is_cpu_stuck(struct pt_regs *); 52 + u64 hw_nmi_get_sample_period(void); 53 + extern int watchdog_enabled; 54 + struct ctl_table; 55 + extern int proc_dowatchdog_enabled(struct ctl_table *, int , 56 + void __user *, size_t *, loff_t *); 52 57 #endif 53 58 54 59 #endif
+47 -48
include/linux/perf_event.h
··· 214 214 * See also PERF_RECORD_MISC_EXACT_IP 215 215 */ 216 216 precise_ip : 2, /* skid constraint */ 217 + mmap_data : 1, /* non-exec mmap data */ 217 218 218 - __reserved_1 : 47; 219 + __reserved_1 : 46; 219 220 220 221 union { 221 222 __u32 wakeup_events; /* wakeup every n events */ ··· 462 461 463 462 #ifdef CONFIG_PERF_EVENTS 464 463 # include <asm/perf_event.h> 464 + # include <asm/local64.h> 465 465 #endif 466 466 467 467 struct perf_guest_info_callbacks { ··· 533 531 struct hrtimer hrtimer; 534 532 }; 535 533 #ifdef CONFIG_HAVE_HW_BREAKPOINT 536 - /* breakpoint */ 537 - struct arch_hw_breakpoint info; 534 + struct { /* breakpoint */ 535 + struct arch_hw_breakpoint info; 536 + struct list_head bp_list; 537 + }; 538 538 #endif 539 539 }; 540 - atomic64_t prev_count; 540 + local64_t prev_count; 541 541 u64 sample_period; 542 542 u64 last_period; 543 - atomic64_t period_left; 543 + local64_t period_left; 544 544 u64 interrupts; 545 545 546 546 u64 freq_time_stamp; ··· 552 548 553 549 struct perf_event; 554 550 555 - #define PERF_EVENT_TXN_STARTED 1 551 + /* 552 + * Common implementation detail of pmu::{start,commit,cancel}_txn 553 + */ 554 + #define PERF_EVENT_TXN 0x1 556 555 557 556 /** 558 557 * struct pmu - generic performance monitoring unit ··· 569 562 void (*unthrottle) (struct perf_event *event); 570 563 571 564 /* 572 - * group events scheduling is treated as a transaction, 573 - * add group events as a whole and perform one schedulability test. 574 - * If test fails, roll back the whole group 565 + * Group events scheduling is treated as a transaction, add group 566 + * events as a whole and perform one schedulability test. If the test 567 + * fails, roll back the whole group 575 568 */ 576 569 570 + /* 571 + * Start the transaction, after this ->enable() doesn't need 572 + * to do schedulability tests. 
573 + */ 577 574 void (*start_txn) (const struct pmu *pmu); 578 - void (*cancel_txn) (const struct pmu *pmu); 575 + /* 576 + * If ->start_txn() disabled the ->enable() schedulability test 577 + * then ->commit_txn() is required to perform one. On success 578 + * the transaction is closed. On error the transaction is kept 579 + * open until ->cancel_txn() is called. 580 + */ 579 581 int (*commit_txn) (const struct pmu *pmu); 582 + /* 583 + * Will cancel the transaction; assumes ->disable() is called for 584 + * each successful ->enable() during the transaction. 585 + */ 586 + void (*cancel_txn) (const struct pmu *pmu); 580 587 }; 581 588 582 589 /** ··· 605 584 606 585 struct file; 607 586 608 - struct perf_mmap_data { 587 + #define PERF_BUFFER_WRITABLE 0x01 588 + 589 + struct perf_buffer { 609 590 atomic_t refcount; 610 591 struct rcu_head rcu_head; 611 592 #ifdef CONFIG_PERF_USE_VMALLOC ··· 673 650 674 651 enum perf_event_active_state state; 675 652 unsigned int attach_state; 676 - atomic64_t count; 653 + local64_t count; 654 + atomic64_t child_count; 677 655 678 656 /* 679 657 * These are the total time in nanoseconds that the event ··· 733 709 atomic_t mmap_count; 734 710 int mmap_locked; 735 711 struct user_struct *mmap_user; 736 - struct perf_mmap_data *data; 712 + struct perf_buffer *buffer; 737 713 738 714 /* poll related */ 739 715 wait_queue_head_t waitq; ··· 831 807 832 808 struct perf_output_handle { 833 809 struct perf_event *event; 834 - struct perf_mmap_data *data; 810 + struct perf_buffer *buffer; 835 811 unsigned long wakeup; 836 812 unsigned long size; 837 813 void *addr; ··· 934 910 935 911 extern void __perf_sw_event(u32, u64, int, struct pt_regs *, u64); 936 912 937 - extern void 938 - perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip); 913 + #ifndef perf_arch_fetch_caller_regs 914 + static inline void 915 + perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip) { } 916 + #endif 939 917 940 918 /* 941 919 
* Take a snapshot of the regs. Skip ip and frame pointer to ··· 947 921 * - bp for callchains 948 922 * - eflags, for future purposes, just in case 949 923 */ 950 - static inline void perf_fetch_caller_regs(struct pt_regs *regs, int skip) 924 + static inline void perf_fetch_caller_regs(struct pt_regs *regs) 951 925 { 952 - unsigned long ip; 953 - 954 926 memset(regs, 0, sizeof(*regs)); 955 927 956 - switch (skip) { 957 - case 1 : 958 - ip = CALLER_ADDR0; 959 - break; 960 - case 2 : 961 - ip = CALLER_ADDR1; 962 - break; 963 - case 3 : 964 - ip = CALLER_ADDR2; 965 - break; 966 - case 4: 967 - ip = CALLER_ADDR3; 968 - break; 969 - /* No need to support further for now */ 970 - default: 971 - ip = 0; 972 - } 973 - 974 - return perf_arch_fetch_caller_regs(regs, ip, skip); 928 + perf_arch_fetch_caller_regs(regs, CALLER_ADDR0); 975 929 } 976 930 977 931 static inline void ··· 961 955 struct pt_regs hot_regs; 962 956 963 957 if (!regs) { 964 - perf_fetch_caller_regs(&hot_regs, 1); 958 + perf_fetch_caller_regs(&hot_regs); 965 959 regs = &hot_regs; 966 960 } 967 961 __perf_sw_event(event_id, nr, nmi, regs, addr); 968 962 } 969 963 } 970 964 971 - extern void __perf_event_mmap(struct vm_area_struct *vma); 972 - 973 - static inline void perf_event_mmap(struct vm_area_struct *vma) 974 - { 975 - if (vma->vm_flags & VM_EXEC) 976 - __perf_event_mmap(vma); 977 - } 978 - 965 + extern void perf_event_mmap(struct vm_area_struct *vma); 979 966 extern struct perf_guest_info_callbacks *perf_guest_cbs; 980 967 extern int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks); 981 968 extern int perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks); ··· 1000 1001 extern void perf_event_init(void); 1001 1002 extern void perf_tp_event(u64 addr, u64 count, void *record, 1002 1003 int entry_size, struct pt_regs *regs, 1003 - struct hlist_head *head); 1004 + struct hlist_head *head, int rctx); 1004 1005 extern void perf_bp_event(struct 
perf_event *event, void *data); 1005 1006 1006 1007 #ifndef perf_misc_flags
+4 -20
include/linux/sched.h
··· 316 316 317 317 extern void sched_show_task(struct task_struct *p); 318 318 319 - #ifdef CONFIG_DETECT_SOFTLOCKUP 320 - extern void softlockup_tick(void); 319 + #ifdef CONFIG_LOCKUP_DETECTOR 321 320 extern void touch_softlockup_watchdog(void); 322 321 extern void touch_softlockup_watchdog_sync(void); 323 322 extern void touch_all_softlockup_watchdogs(void); 324 - extern int proc_dosoftlockup_thresh(struct ctl_table *table, int write, 325 - void __user *buffer, 326 - size_t *lenp, loff_t *ppos); 323 + extern int proc_dowatchdog_thresh(struct ctl_table *table, int write, 324 + void __user *buffer, 325 + size_t *lenp, loff_t *ppos); 327 326 extern unsigned int softlockup_panic; 328 327 extern int softlockup_thresh; 329 328 #else 330 - static inline void softlockup_tick(void) 331 - { 332 - } 333 329 static inline void touch_softlockup_watchdog(void) 334 330 { 335 331 } ··· 2430 2434 } 2431 2435 2432 2436 #endif /* CONFIG_SMP */ 2433 - 2434 - #ifdef CONFIG_TRACING 2435 - extern void 2436 - __trace_special(void *__tr, void *__data, 2437 - unsigned long arg1, unsigned long arg2, unsigned long arg3); 2438 - #else 2439 - static inline void 2440 - __trace_special(void *__tr, void *__data, 2441 - unsigned long arg1, unsigned long arg2, unsigned long arg3) 2442 - { 2443 - } 2444 - #endif 2445 2437 2446 2438 extern long sched_setaffinity(pid_t pid, const struct cpumask *new_mask); 2447 2439 extern long sched_getaffinity(pid_t pid, struct cpumask *mask);
+2 -1
include/linux/slab_def.h
··· 14 14 #include <asm/page.h> /* kmalloc_sizes.h needs PAGE_SIZE */ 15 15 #include <asm/cache.h> /* kmalloc_sizes.h needs L1_CACHE_BYTES */ 16 16 #include <linux/compiler.h> 17 - #include <linux/kmemtrace.h> 17 + 18 + #include <trace/events/kmem.h> 18 19 19 20 #ifndef ARCH_KMALLOC_MINALIGN 20 21 /*
+2 -1
include/linux/slub_def.h
··· 10 10 #include <linux/gfp.h> 11 11 #include <linux/workqueue.h> 12 12 #include <linux/kobject.h> 13 - #include <linux/kmemtrace.h> 14 13 #include <linux/kmemleak.h> 14 + 15 + #include <trace/events/kmem.h> 15 16 16 17 enum stat_item { 17 18 ALLOC_FASTPATH, /* Allocation from cpu slab */
-2
include/linux/syscalls.h
··· 167 167 .enter_event = &event_enter_##sname, \ 168 168 .exit_event = &event_exit_##sname, \ 169 169 .enter_fields = LIST_HEAD_INIT(__syscall_meta_##sname.enter_fields), \ 170 - .exit_fields = LIST_HEAD_INIT(__syscall_meta_##sname.exit_fields), \ 171 170 }; 172 171 173 172 #define SYSCALL_DEFINE0(sname) \ ··· 181 182 .enter_event = &event_enter__##sname, \ 182 183 .exit_event = &event_exit__##sname, \ 183 184 .enter_fields = LIST_HEAD_INIT(__syscall_meta__##sname.enter_fields), \ 184 - .exit_fields = LIST_HEAD_INIT(__syscall_meta__##sname.exit_fields), \ 185 185 }; \ 186 186 asmlinkage long sys_##sname(void) 187 187 #else
-60
include/trace/boot.h
··· 1 - #ifndef _LINUX_TRACE_BOOT_H 2 - #define _LINUX_TRACE_BOOT_H 3 - 4 - #include <linux/module.h> 5 - #include <linux/kallsyms.h> 6 - #include <linux/init.h> 7 - 8 - /* 9 - * Structure which defines the trace of an initcall 10 - * while it is called. 11 - * You don't have to fill the func field since it is 12 - * only used internally by the tracer. 13 - */ 14 - struct boot_trace_call { 15 - pid_t caller; 16 - char func[KSYM_SYMBOL_LEN]; 17 - }; 18 - 19 - /* 20 - * Structure which defines the trace of an initcall 21 - * while it returns. 22 - */ 23 - struct boot_trace_ret { 24 - char func[KSYM_SYMBOL_LEN]; 25 - int result; 26 - unsigned long long duration; /* nsecs */ 27 - }; 28 - 29 - #ifdef CONFIG_BOOT_TRACER 30 - /* Append the traces on the ring-buffer */ 31 - extern void trace_boot_call(struct boot_trace_call *bt, initcall_t fn); 32 - extern void trace_boot_ret(struct boot_trace_ret *bt, initcall_t fn); 33 - 34 - /* Tells the tracer that smp_pre_initcall is finished. 35 - * So we can start the tracing 36 - */ 37 - extern void start_boot_trace(void); 38 - 39 - /* Resume the tracing of other necessary events 40 - * such as sched switches 41 - */ 42 - extern void enable_boot_trace(void); 43 - 44 - /* Suspend this tracing. Actually, only sched_switches tracing have 45 - * to be suspended. Initcalls doesn't need it.) 46 - */ 47 - extern void disable_boot_trace(void); 48 - #else 49 - static inline 50 - void trace_boot_call(struct boot_trace_call *bt, initcall_t fn) { } 51 - 52 - static inline 53 - void trace_boot_ret(struct boot_trace_ret *bt, initcall_t fn) { } 54 - 55 - static inline void start_boot_trace(void) { } 56 - static inline void enable_boot_trace(void) { } 57 - static inline void disable_boot_trace(void) { } 58 - #endif /* CONFIG_BOOT_TRACER */ 59 - 60 - #endif /* __LINUX_TRACE_BOOT_H */
+7 -25
include/trace/events/sched.h
··· 50 50 ); 51 51 52 52 /* 53 - * Tracepoint for waiting on task to unschedule: 54 - */ 55 - TRACE_EVENT(sched_wait_task, 56 - 57 - TP_PROTO(struct task_struct *p), 58 - 59 - TP_ARGS(p), 60 - 61 - TP_STRUCT__entry( 62 - __array( char, comm, TASK_COMM_LEN ) 63 - __field( pid_t, pid ) 64 - __field( int, prio ) 65 - ), 66 - 67 - TP_fast_assign( 68 - memcpy(__entry->comm, p->comm, TASK_COMM_LEN); 69 - __entry->pid = p->pid; 70 - __entry->prio = p->prio; 71 - ), 72 - 73 - TP_printk("comm=%s pid=%d prio=%d", 74 - __entry->comm, __entry->pid, __entry->prio) 75 - ); 76 - 77 - /* 78 53 * Tracepoint for waking up a task: 79 54 */ 80 55 DECLARE_EVENT_CLASS(sched_wakeup_template, ··· 213 238 DEFINE_EVENT(sched_process_template, sched_process_exit, 214 239 TP_PROTO(struct task_struct *p), 215 240 TP_ARGS(p)); 241 + 242 + /* 243 + * Tracepoint for waiting on task to unschedule: 244 + */ 245 + DEFINE_EVENT(sched_process_template, sched_wait_task, 246 + TP_PROTO(struct task_struct *p), 247 + TP_ARGS(p)); 216 248 217 249 /* 218 250 * Tracepoint for a waiting task:
+32 -48
include/trace/events/timer.h
··· 8 8 #include <linux/hrtimer.h> 9 9 #include <linux/timer.h> 10 10 11 - /** 12 - * timer_init - called when the timer is initialized 13 - * @timer: pointer to struct timer_list 14 - */ 15 - TRACE_EVENT(timer_init, 11 + DECLARE_EVENT_CLASS(timer_class, 16 12 17 13 TP_PROTO(struct timer_list *timer), 18 14 ··· 23 27 ), 24 28 25 29 TP_printk("timer=%p", __entry->timer) 30 + ); 31 + 32 + /** 33 + * timer_init - called when the timer is initialized 34 + * @timer: pointer to struct timer_list 35 + */ 36 + DEFINE_EVENT(timer_class, timer_init, 37 + 38 + TP_PROTO(struct timer_list *timer), 39 + 40 + TP_ARGS(timer) 26 41 ); 27 42 28 43 /** ··· 101 94 * NOTE: Do NOT derefernce timer in TP_fast_assign. The pointer might 102 95 * be invalid. We solely track the pointer. 103 96 */ 104 - TRACE_EVENT(timer_expire_exit, 97 + DEFINE_EVENT(timer_class, timer_expire_exit, 105 98 106 99 TP_PROTO(struct timer_list *timer), 107 100 108 - TP_ARGS(timer), 109 - 110 - TP_STRUCT__entry( 111 - __field(void *, timer ) 112 - ), 113 - 114 - TP_fast_assign( 115 - __entry->timer = timer; 116 - ), 117 - 118 - TP_printk("timer=%p", __entry->timer) 101 + TP_ARGS(timer) 119 102 ); 120 103 121 104 /** 122 105 * timer_cancel - called when the timer is canceled 123 106 * @timer: pointer to struct timer_list 124 107 */ 125 - TRACE_EVENT(timer_cancel, 108 + DEFINE_EVENT(timer_class, timer_cancel, 126 109 127 110 TP_PROTO(struct timer_list *timer), 128 111 129 - TP_ARGS(timer), 130 - 131 - TP_STRUCT__entry( 132 - __field( void *, timer ) 133 - ), 134 - 135 - TP_fast_assign( 136 - __entry->timer = timer; 137 - ), 138 - 139 - TP_printk("timer=%p", __entry->timer) 112 + TP_ARGS(timer) 140 113 ); 141 114 142 115 /** ··· 211 224 (unsigned long long)ktime_to_ns((ktime_t) { .tv64 = __entry->now })) 212 225 ); 213 226 214 - /** 215 - * hrtimer_expire_exit - called immediately after the hrtimer callback returns 216 - * @timer: pointer to struct hrtimer 217 - * 218 - * When used in combination with the 
hrtimer_expire_entry tracepoint we can 219 - * determine the runtime of the callback function. 220 - */ 221 - TRACE_EVENT(hrtimer_expire_exit, 227 + DECLARE_EVENT_CLASS(hrtimer_class, 222 228 223 229 TP_PROTO(struct hrtimer *hrtimer), 224 230 ··· 229 249 ); 230 250 231 251 /** 232 - * hrtimer_cancel - called when the hrtimer is canceled 233 - * @hrtimer: pointer to struct hrtimer 252 + * hrtimer_expire_exit - called immediately after the hrtimer callback returns 253 + * @timer: pointer to struct hrtimer 254 + * 255 + * When used in combination with the hrtimer_expire_entry tracepoint we can 256 + * determine the runtime of the callback function. 234 257 */ 235 - TRACE_EVENT(hrtimer_cancel, 258 + DEFINE_EVENT(hrtimer_class, hrtimer_expire_exit, 236 259 237 260 TP_PROTO(struct hrtimer *hrtimer), 238 261 239 - TP_ARGS(hrtimer), 262 + TP_ARGS(hrtimer) 263 + ); 240 264 241 - TP_STRUCT__entry( 242 - __field( void *, hrtimer ) 243 - ), 265 + /** 266 + * hrtimer_cancel - called when the hrtimer is canceled 267 + * @hrtimer: pointer to struct hrtimer 268 + */ 269 + DEFINE_EVENT(hrtimer_class, hrtimer_cancel, 244 270 245 - TP_fast_assign( 246 - __entry->hrtimer = hrtimer; 247 - ), 271 + TP_PROTO(struct hrtimer *hrtimer), 248 272 249 - TP_printk("hrtimer=%p", __entry->hrtimer) 273 + TP_ARGS(hrtimer) 250 274 ); 251 275 252 276 /**
+8 -15
include/trace/ftrace.h
··· 75 75 #define DEFINE_EVENT_PRINT(template, name, proto, args, print) \ 76 76 DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args)) 77 77 78 - #undef __cpparg 79 - #define __cpparg(arg...) arg 80 - 81 78 /* Callbacks are meaningless to ftrace. */ 82 79 #undef TRACE_EVENT_FN 83 80 #define TRACE_EVENT_FN(name, proto, args, tstruct, \ 84 81 assign, print, reg, unreg) \ 85 - TRACE_EVENT(name, __cpparg(proto), __cpparg(args), \ 86 - __cpparg(tstruct), __cpparg(assign), __cpparg(print)) \ 82 + TRACE_EVENT(name, PARAMS(proto), PARAMS(args), \ 83 + PARAMS(tstruct), PARAMS(assign), PARAMS(print)) \ 87 84 88 85 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE) 89 86 ··· 142 145 * struct trace_seq *s = &iter->seq; 143 146 * struct ftrace_raw_<call> *field; <-- defined in stage 1 144 147 * struct trace_entry *entry; 145 - * struct trace_seq *p; 148 + * struct trace_seq *p = &iter->tmp_seq; 146 149 * int ret; 147 150 * 148 151 * entry = iter->ent; ··· 154 157 * 155 158 * field = (typeof(field))entry; 156 159 * 157 - * p = &get_cpu_var(ftrace_event_seq); 158 160 * trace_seq_init(p); 159 161 * ret = trace_seq_printf(s, "%s: ", <call>); 160 162 * if (ret) 161 163 * ret = trace_seq_printf(s, <TP_printk> "\n"); 162 - * put_cpu(); 163 164 * if (!ret) 164 165 * return TRACE_TYPE_PARTIAL_LINE; 165 166 * ··· 211 216 struct trace_seq *s = &iter->seq; \ 212 217 struct ftrace_raw_##call *field; \ 213 218 struct trace_entry *entry; \ 214 - struct trace_seq *p; \ 219 + struct trace_seq *p = &iter->tmp_seq; \ 215 220 int ret; \ 216 221 \ 217 222 event = container_of(trace_event, struct ftrace_event_call, \ ··· 226 231 \ 227 232 field = (typeof(field))entry; \ 228 233 \ 229 - p = &get_cpu_var(ftrace_event_seq); \ 230 234 trace_seq_init(p); \ 231 235 ret = trace_seq_printf(s, "%s: ", event->name); \ 232 236 if (ret) \ 233 237 ret = trace_seq_printf(s, print); \ 234 - put_cpu(); \ 235 238 if (!ret) \ 236 239 return TRACE_TYPE_PARTIAL_LINE; \ 237 240 \ ··· 248 255 struct trace_seq *s = 
&iter->seq; \ 249 256 struct ftrace_raw_##template *field; \ 250 257 struct trace_entry *entry; \ 251 - struct trace_seq *p; \ 258 + struct trace_seq *p = &iter->tmp_seq; \ 252 259 int ret; \ 253 260 \ 254 261 entry = iter->ent; \ ··· 260 267 \ 261 268 field = (typeof(field))entry; \ 262 269 \ 263 - p = &get_cpu_var(ftrace_event_seq); \ 264 270 trace_seq_init(p); \ 265 271 ret = trace_seq_printf(s, "%s: ", #call); \ 266 272 if (ret) \ 267 273 ret = trace_seq_printf(s, print); \ 268 - put_cpu(); \ 269 274 if (!ret) \ 270 275 return TRACE_TYPE_PARTIAL_LINE; \ 271 276 \ ··· 430 439 * .fields = LIST_HEAD_INIT(event_class_##call.fields), 431 440 * .raw_init = trace_event_raw_init, 432 441 * .probe = ftrace_raw_event_##call, 442 + * .reg = ftrace_event_reg, 433 443 * }; 434 444 * 435 445 * static struct ftrace_event_call __used ··· 559 567 .fields = LIST_HEAD_INIT(event_class_##call.fields),\ 560 568 .raw_init = trace_event_raw_init, \ 561 569 .probe = ftrace_raw_event_##call, \ 570 + .reg = ftrace_event_reg, \ 562 571 _TRACE_PERF_INIT(call) \ 563 572 }; 564 573 ··· 698 705 int __data_size; \ 699 706 int rctx; \ 700 707 \ 701 - perf_fetch_caller_regs(&__regs, 1); \ 708 + perf_fetch_caller_regs(&__regs); \ 702 709 \ 703 710 __data_size = ftrace_get_offsets_##call(&__data_offsets, args); \ 704 711 __entry_size = ALIGN(__data_size + sizeof(*entry) + sizeof(u32),\
-1
include/trace/syscall.h
··· 26 26 const char **types; 27 27 const char **args; 28 28 struct list_head enter_fields; 29 - struct list_head exit_fields; 30 29 31 30 struct ftrace_event_call *enter_event; 32 31 struct ftrace_event_call *exit_event;
+10 -19
init/main.c
··· 66 66 #include <linux/ftrace.h> 67 67 #include <linux/async.h> 68 68 #include <linux/kmemcheck.h> 69 - #include <linux/kmemtrace.h> 70 69 #include <linux/sfi.h> 71 70 #include <linux/shmem_fs.h> 72 71 #include <linux/slab.h> 73 - #include <trace/boot.h> 74 72 75 73 #include <asm/io.h> 76 74 #include <asm/bugs.h> ··· 662 664 #endif 663 665 page_cgroup_init(); 664 666 enable_debug_pagealloc(); 665 - kmemtrace_init(); 666 667 kmemleak_init(); 667 668 debug_objects_mem_init(); 668 669 idr_init_cache(); ··· 723 726 core_param(initcall_debug, initcall_debug, bool, 0644); 724 727 725 728 static char msgbuf[64]; 726 - static struct boot_trace_call call; 727 - static struct boot_trace_ret ret; 728 729 729 730 int do_one_initcall(initcall_t fn) 730 731 { 731 732 int count = preempt_count(); 732 733 ktime_t calltime, delta, rettime; 734 + unsigned long long duration; 735 + int ret; 733 736 734 737 if (initcall_debug) { 735 - call.caller = task_pid_nr(current); 736 - printk("calling %pF @ %i\n", fn, call.caller); 738 + printk("calling %pF @ %i\n", fn, task_pid_nr(current)); 737 739 calltime = ktime_get(); 738 - trace_boot_call(&call, fn); 739 - enable_boot_trace(); 740 740 } 741 741 742 - ret.result = fn(); 742 + ret = fn(); 743 743 744 744 if (initcall_debug) { 745 - disable_boot_trace(); 746 745 rettime = ktime_get(); 747 746 delta = ktime_sub(rettime, calltime); 748 - ret.duration = (unsigned long long) ktime_to_ns(delta) >> 10; 749 - trace_boot_ret(&ret, fn); 750 - printk("initcall %pF returned %d after %Ld usecs\n", fn, 751 - ret.result, ret.duration); 747 + duration = (unsigned long long) ktime_to_ns(delta) >> 10; 748 + printk("initcall %pF returned %d after %lld usecs\n", fn, 749 + ret, duration); 752 750 } 753 751 754 752 msgbuf[0] = 0; 755 753 756 - if (ret.result && ret.result != -ENODEV && initcall_debug) 757 - sprintf(msgbuf, "error code %d ", ret.result); 754 + if (ret && ret != -ENODEV && initcall_debug) 755 + sprintf(msgbuf, "error code %d ", ret); 758 756 
759 757 if (preempt_count() != count) { 760 758 strlcat(msgbuf, "preemption imbalance ", sizeof(msgbuf)); ··· 763 771 printk("initcall %pF returned with %s\n", fn, msgbuf); 764 772 } 765 773 766 - return ret.result; 774 + return ret; 767 775 } 768 776 769 777 ··· 887 895 smp_prepare_cpus(setup_max_cpus); 888 896 889 897 do_pre_smp_initcalls(); 890 - start_boot_trace(); 891 898 892 899 smp_init(); 893 900 sched_init_smp();
+1 -1
kernel/Makefile
··· 76 76 obj-$(CONFIG_AUDIT_TREE) += audit_tree.o 77 77 obj-$(CONFIG_KPROBES) += kprobes.o 78 78 obj-$(CONFIG_KGDB) += debug/ 79 - obj-$(CONFIG_DETECT_SOFTLOCKUP) += softlockup.o 80 79 obj-$(CONFIG_DETECT_HUNG_TASK) += hung_task.o 80 + obj-$(CONFIG_LOCKUP_DETECTOR) += watchdog.o 81 81 obj-$(CONFIG_GENERIC_HARDIRQS) += irq/ 82 82 obj-$(CONFIG_SECCOMP) += seccomp.o 83 83 obj-$(CONFIG_RCU_TORTURE_TEST) += rcutorture.o
+42 -38
kernel/hw_breakpoint.c
··· 41 41 #include <linux/sched.h> 42 42 #include <linux/init.h> 43 43 #include <linux/slab.h> 44 + #include <linux/list.h> 44 45 #include <linux/cpu.h> 45 46 #include <linux/smp.h> 46 47 ··· 62 61 static DEFINE_PER_CPU(unsigned int, nr_bp_flexible[TYPE_MAX]); 63 62 64 63 static int nr_slots[TYPE_MAX]; 64 + 65 + /* Keep track of the breakpoints attached to tasks */ 66 + static LIST_HEAD(bp_task_head); 65 67 66 68 static int constraints_initialized; 67 69 ··· 107 103 return 0; 108 104 } 109 105 110 - static int task_bp_pinned(struct task_struct *tsk, enum bp_type_idx type) 106 + /* 107 + * Count the number of breakpoints of the same type and same task. 108 + * The given event must be not on the list. 109 + */ 110 + static int task_bp_pinned(struct perf_event *bp, enum bp_type_idx type) 111 111 { 112 - struct perf_event_context *ctx = tsk->perf_event_ctxp; 113 - struct list_head *list; 114 - struct perf_event *bp; 115 - unsigned long flags; 112 + struct perf_event_context *ctx = bp->ctx; 113 + struct perf_event *iter; 116 114 int count = 0; 117 115 118 - if (WARN_ONCE(!ctx, "No perf context for this task")) 119 - return 0; 120 - 121 - list = &ctx->event_list; 122 - 123 - raw_spin_lock_irqsave(&ctx->lock, flags); 124 - 125 - /* 126 - * The current breakpoint counter is not included in the list 127 - * at the open() callback time 128 - */ 129 - list_for_each_entry(bp, list, event_entry) { 130 - if (bp->attr.type == PERF_TYPE_BREAKPOINT) 131 - if (find_slot_idx(bp) == type) 132 - count += hw_breakpoint_weight(bp); 116 + list_for_each_entry(iter, &bp_task_head, hw.bp_list) { 117 + if (iter->ctx == ctx && find_slot_idx(iter) == type) 118 + count += hw_breakpoint_weight(iter); 133 119 } 134 - 135 - raw_spin_unlock_irqrestore(&ctx->lock, flags); 136 120 137 121 return count; 138 122 } ··· 141 149 if (!tsk) 142 150 slots->pinned += max_task_bp_pinned(cpu, type); 143 151 else 144 - slots->pinned += task_bp_pinned(tsk, type); 152 + slots->pinned += task_bp_pinned(bp, type); 
145 153 slots->flexible = per_cpu(nr_bp_flexible[type], cpu); 146 154 147 155 return; ··· 154 162 if (!tsk) 155 163 nr += max_task_bp_pinned(cpu, type); 156 164 else 157 - nr += task_bp_pinned(tsk, type); 165 + nr += task_bp_pinned(bp, type); 158 166 159 167 if (nr > slots->pinned) 160 168 slots->pinned = nr; ··· 180 188 /* 181 189 * Add a pinned breakpoint for the given task in our constraint table 182 190 */ 183 - static void toggle_bp_task_slot(struct task_struct *tsk, int cpu, bool enable, 191 + static void toggle_bp_task_slot(struct perf_event *bp, int cpu, bool enable, 184 192 enum bp_type_idx type, int weight) 185 193 { 186 194 unsigned int *tsk_pinned; ··· 188 196 int old_idx = 0; 189 197 int idx = 0; 190 198 191 - old_count = task_bp_pinned(tsk, type); 199 + old_count = task_bp_pinned(bp, type); 192 200 old_idx = old_count - 1; 193 201 idx = old_idx + weight; 194 202 203 + /* tsk_pinned[n] is the number of tasks having n breakpoints */ 195 204 tsk_pinned = per_cpu(nr_task_bp_pinned[type], cpu); 196 205 if (enable) { 197 206 tsk_pinned[idx]++; ··· 215 222 int cpu = bp->cpu; 216 223 struct task_struct *tsk = bp->ctx->task; 217 224 218 - /* Pinned counter task profiling */ 219 - if (tsk) { 220 - if (cpu >= 0) { 221 - toggle_bp_task_slot(tsk, cpu, enable, type, weight); 222 - return; 223 - } 225 + /* Pinned counter cpu profiling */ 226 + if (!tsk) { 224 227 225 - for_each_online_cpu(cpu) 226 - toggle_bp_task_slot(tsk, cpu, enable, type, weight); 228 + if (enable) 229 + per_cpu(nr_cpu_bp_pinned[type], bp->cpu) += weight; 230 + else 231 + per_cpu(nr_cpu_bp_pinned[type], bp->cpu) -= weight; 227 232 return; 228 233 } 229 234 230 - /* Pinned counter cpu profiling */ 235 + /* Pinned counter task profiling */ 236 + 237 + if (!enable) 238 + list_del(&bp->hw.bp_list); 239 + 240 + if (cpu >= 0) { 241 + toggle_bp_task_slot(bp, cpu, enable, type, weight); 242 + } else { 243 + for_each_online_cpu(cpu) 244 + toggle_bp_task_slot(bp, cpu, enable, type, weight); 245 + } 246 + 
231 247 if (enable) 232 - per_cpu(nr_cpu_bp_pinned[type], bp->cpu) += weight; 233 - else 234 - per_cpu(nr_cpu_bp_pinned[type], bp->cpu) -= weight; 248 + list_add_tail(&bp->hw.bp_list, &bp_task_head); 235 249 } 236 250 237 251 /* ··· 312 312 weight = hw_breakpoint_weight(bp); 313 313 314 314 fetch_bp_busy_slots(&slots, bp, type); 315 + /* 316 + * Simulate the addition of this breakpoint to the constraints 317 + * and see the result. 318 + */ 315 319 fetch_this_slot(&slots, weight); 316 320 317 321 /* Flexible counters need to keep at least one slot */
+234 -224
kernel/perf_event.c
··· 675 675 struct perf_event *event, *partial_group = NULL; 676 676 const struct pmu *pmu = group_event->pmu; 677 677 bool txn = false; 678 - int ret; 679 678 680 679 if (group_event->state == PERF_EVENT_STATE_OFF) 681 680 return 0; ··· 702 703 } 703 704 } 704 705 705 - if (!txn) 706 + if (!txn || !pmu->commit_txn(pmu)) 706 707 return 0; 707 - 708 - ret = pmu->commit_txn(pmu); 709 - if (!ret) { 710 - pmu->cancel_txn(pmu); 711 - return 0; 712 - } 713 708 714 709 group_error: 715 710 /* ··· 1148 1155 * In order to keep per-task stats reliable we need to flip the event 1149 1156 * values when we flip the contexts. 1150 1157 */ 1151 - value = atomic64_read(&next_event->count); 1152 - value = atomic64_xchg(&event->count, value); 1153 - atomic64_set(&next_event->count, value); 1158 + value = local64_read(&next_event->count); 1159 + value = local64_xchg(&event->count, value); 1160 + local64_set(&next_event->count, value); 1154 1161 1155 1162 swap(event->total_time_enabled, next_event->total_time_enabled); 1156 1163 swap(event->total_time_running, next_event->total_time_running); ··· 1540 1547 1541 1548 hwc->sample_period = sample_period; 1542 1549 1543 - if (atomic64_read(&hwc->period_left) > 8*sample_period) { 1550 + if (local64_read(&hwc->period_left) > 8*sample_period) { 1544 1551 perf_disable(); 1545 1552 perf_event_stop(event); 1546 - atomic64_set(&hwc->period_left, 0); 1553 + local64_set(&hwc->period_left, 0); 1547 1554 perf_event_start(event); 1548 1555 perf_enable(); 1549 1556 } ··· 1584 1591 1585 1592 perf_disable(); 1586 1593 event->pmu->read(event); 1587 - now = atomic64_read(&event->count); 1594 + now = local64_read(&event->count); 1588 1595 delta = now - hwc->freq_count_stamp; 1589 1596 hwc->freq_count_stamp = now; 1590 1597 ··· 1736 1743 event->pmu->read(event); 1737 1744 } 1738 1745 1746 + static inline u64 perf_event_count(struct perf_event *event) 1747 + { 1748 + return local64_read(&event->count) + atomic64_read(&event->child_count); 1749 + } 1750 + 
1739 1751 static u64 perf_event_read(struct perf_event *event) 1740 1752 { 1741 1753 /* ··· 1760 1762 raw_spin_unlock_irqrestore(&ctx->lock, flags); 1761 1763 } 1762 1764 1763 - return atomic64_read(&event->count); 1765 + return perf_event_count(event); 1764 1766 } 1765 1767 1766 1768 /* ··· 1881 1883 } 1882 1884 1883 1885 static void perf_pending_sync(struct perf_event *event); 1884 - static void perf_mmap_data_put(struct perf_mmap_data *data); 1886 + static void perf_buffer_put(struct perf_buffer *buffer); 1885 1887 1886 1888 static void free_event(struct perf_event *event) 1887 1889 { ··· 1889 1891 1890 1892 if (!event->parent) { 1891 1893 atomic_dec(&nr_events); 1892 - if (event->attr.mmap) 1894 + if (event->attr.mmap || event->attr.mmap_data) 1893 1895 atomic_dec(&nr_mmap_events); 1894 1896 if (event->attr.comm) 1895 1897 atomic_dec(&nr_comm_events); ··· 1897 1899 atomic_dec(&nr_task_events); 1898 1900 } 1899 1901 1900 - if (event->data) { 1901 - perf_mmap_data_put(event->data); 1902 - event->data = NULL; 1902 + if (event->buffer) { 1903 + perf_buffer_put(event->buffer); 1904 + event->buffer = NULL; 1903 1905 } 1904 1906 1905 1907 if (event->destroy) ··· 2124 2126 static unsigned int perf_poll(struct file *file, poll_table *wait) 2125 2127 { 2126 2128 struct perf_event *event = file->private_data; 2127 - struct perf_mmap_data *data; 2129 + struct perf_buffer *buffer; 2128 2130 unsigned int events = POLL_HUP; 2129 2131 2130 2132 rcu_read_lock(); 2131 - data = rcu_dereference(event->data); 2132 - if (data) 2133 - events = atomic_xchg(&data->poll, 0); 2133 + buffer = rcu_dereference(event->buffer); 2134 + if (buffer) 2135 + events = atomic_xchg(&buffer->poll, 0); 2134 2136 rcu_read_unlock(); 2135 2137 2136 2138 poll_wait(file, &event->waitq, wait); ··· 2141 2143 static void perf_event_reset(struct perf_event *event) 2142 2144 { 2143 2145 (void)perf_event_read(event); 2144 - atomic64_set(&event->count, 0); 2146 + local64_set(&event->count, 0); 2145 2147 
perf_event_update_userpage(event); 2146 2148 } 2147 2149 ··· 2340 2342 void perf_event_update_userpage(struct perf_event *event) 2341 2343 { 2342 2344 struct perf_event_mmap_page *userpg; 2343 - struct perf_mmap_data *data; 2345 + struct perf_buffer *buffer; 2344 2346 2345 2347 rcu_read_lock(); 2346 - data = rcu_dereference(event->data); 2347 - if (!data) 2348 + buffer = rcu_dereference(event->buffer); 2349 + if (!buffer) 2348 2350 goto unlock; 2349 2351 2350 - userpg = data->user_page; 2352 + userpg = buffer->user_page; 2351 2353 2352 2354 /* 2353 2355 * Disable preemption so as to not let the corresponding user-space ··· 2357 2359 ++userpg->lock; 2358 2360 barrier(); 2359 2361 userpg->index = perf_event_index(event); 2360 - userpg->offset = atomic64_read(&event->count); 2362 + userpg->offset = perf_event_count(event); 2361 2363 if (event->state == PERF_EVENT_STATE_ACTIVE) 2362 - userpg->offset -= atomic64_read(&event->hw.prev_count); 2364 + userpg->offset -= local64_read(&event->hw.prev_count); 2363 2365 2364 2366 userpg->time_enabled = event->total_time_enabled + 2365 2367 atomic64_read(&event->child_total_time_enabled); ··· 2374 2376 rcu_read_unlock(); 2375 2377 } 2376 2378 2379 + static unsigned long perf_data_size(struct perf_buffer *buffer); 2380 + 2381 + static void 2382 + perf_buffer_init(struct perf_buffer *buffer, long watermark, int flags) 2383 + { 2384 + long max_size = perf_data_size(buffer); 2385 + 2386 + if (watermark) 2387 + buffer->watermark = min(max_size, watermark); 2388 + 2389 + if (!buffer->watermark) 2390 + buffer->watermark = max_size / 2; 2391 + 2392 + if (flags & PERF_BUFFER_WRITABLE) 2393 + buffer->writable = 1; 2394 + 2395 + atomic_set(&buffer->refcount, 1); 2396 + } 2397 + 2377 2398 #ifndef CONFIG_PERF_USE_VMALLOC 2378 2399 2379 2400 /* ··· 2400 2383 */ 2401 2384 2402 2385 static struct page * 2403 - perf_mmap_to_page(struct perf_mmap_data *data, unsigned long pgoff) 2386 + perf_mmap_to_page(struct perf_buffer *buffer, unsigned long 
pgoff) 2404 2387 { 2405 - if (pgoff > data->nr_pages) 2388 + if (pgoff > buffer->nr_pages) 2406 2389 return NULL; 2407 2390 2408 2391 if (pgoff == 0) 2409 - return virt_to_page(data->user_page); 2392 + return virt_to_page(buffer->user_page); 2410 2393 2411 - return virt_to_page(data->data_pages[pgoff - 1]); 2394 + return virt_to_page(buffer->data_pages[pgoff - 1]); 2412 2395 } 2413 2396 2414 2397 static void *perf_mmap_alloc_page(int cpu) ··· 2424 2407 return page_address(page); 2425 2408 } 2426 2409 2427 - static struct perf_mmap_data * 2428 - perf_mmap_data_alloc(struct perf_event *event, int nr_pages) 2410 + static struct perf_buffer * 2411 + perf_buffer_alloc(int nr_pages, long watermark, int cpu, int flags) 2429 2412 { 2430 - struct perf_mmap_data *data; 2413 + struct perf_buffer *buffer; 2431 2414 unsigned long size; 2432 2415 int i; 2433 2416 2434 - size = sizeof(struct perf_mmap_data); 2417 + size = sizeof(struct perf_buffer); 2435 2418 size += nr_pages * sizeof(void *); 2436 2419 2437 - data = kzalloc(size, GFP_KERNEL); 2438 - if (!data) 2420 + buffer = kzalloc(size, GFP_KERNEL); 2421 + if (!buffer) 2439 2422 goto fail; 2440 2423 2441 - data->user_page = perf_mmap_alloc_page(event->cpu); 2442 - if (!data->user_page) 2424 + buffer->user_page = perf_mmap_alloc_page(cpu); 2425 + if (!buffer->user_page) 2443 2426 goto fail_user_page; 2444 2427 2445 2428 for (i = 0; i < nr_pages; i++) { 2446 - data->data_pages[i] = perf_mmap_alloc_page(event->cpu); 2447 - if (!data->data_pages[i]) 2429 + buffer->data_pages[i] = perf_mmap_alloc_page(cpu); 2430 + if (!buffer->data_pages[i]) 2448 2431 goto fail_data_pages; 2449 2432 } 2450 2433 2451 - data->nr_pages = nr_pages; 2434 + buffer->nr_pages = nr_pages; 2452 2435 2453 - return data; 2436 + perf_buffer_init(buffer, watermark, flags); 2437 + 2438 + return buffer; 2454 2439 2455 2440 fail_data_pages: 2456 2441 for (i--; i >= 0; i--) 2457 - free_page((unsigned long)data->data_pages[i]); 2442 + free_page((unsigned 
long)buffer->data_pages[i]); 2458 2443 2459 - free_page((unsigned long)data->user_page); 2444 + free_page((unsigned long)buffer->user_page); 2460 2445 2461 2446 fail_user_page: 2462 - kfree(data); 2447 + kfree(buffer); 2463 2448 2464 2449 fail: 2465 2450 return NULL; ··· 2475 2456 __free_page(page); 2476 2457 } 2477 2458 2478 - static void perf_mmap_data_free(struct perf_mmap_data *data) 2459 + static void perf_buffer_free(struct perf_buffer *buffer) 2479 2460 { 2480 2461 int i; 2481 2462 2482 - perf_mmap_free_page((unsigned long)data->user_page); 2483 - for (i = 0; i < data->nr_pages; i++) 2484 - perf_mmap_free_page((unsigned long)data->data_pages[i]); 2485 - kfree(data); 2463 + perf_mmap_free_page((unsigned long)buffer->user_page); 2464 + for (i = 0; i < buffer->nr_pages; i++) 2465 + perf_mmap_free_page((unsigned long)buffer->data_pages[i]); 2466 + kfree(buffer); 2486 2467 } 2487 2468 2488 - static inline int page_order(struct perf_mmap_data *data) 2469 + static inline int page_order(struct perf_buffer *buffer) 2489 2470 { 2490 2471 return 0; 2491 2472 } ··· 2498 2479 * Required for architectures that have d-cache aliasing issues. 
2499 2480 */ 2500 2481 2501 - static inline int page_order(struct perf_mmap_data *data) 2482 + static inline int page_order(struct perf_buffer *buffer) 2502 2483 { 2503 - return data->page_order; 2484 + return buffer->page_order; 2504 2485 } 2505 2486 2506 2487 static struct page * 2507 - perf_mmap_to_page(struct perf_mmap_data *data, unsigned long pgoff) 2488 + perf_mmap_to_page(struct perf_buffer *buffer, unsigned long pgoff) 2508 2489 { 2509 - if (pgoff > (1UL << page_order(data))) 2490 + if (pgoff > (1UL << page_order(buffer))) 2510 2491 return NULL; 2511 2492 2512 - return vmalloc_to_page((void *)data->user_page + pgoff * PAGE_SIZE); 2493 + return vmalloc_to_page((void *)buffer->user_page + pgoff * PAGE_SIZE); 2513 2494 } 2514 2495 2515 2496 static void perf_mmap_unmark_page(void *addr) ··· 2519 2500 page->mapping = NULL; 2520 2501 } 2521 2502 2522 - static void perf_mmap_data_free_work(struct work_struct *work) 2503 + static void perf_buffer_free_work(struct work_struct *work) 2523 2504 { 2524 - struct perf_mmap_data *data; 2505 + struct perf_buffer *buffer; 2525 2506 void *base; 2526 2507 int i, nr; 2527 2508 2528 - data = container_of(work, struct perf_mmap_data, work); 2529 - nr = 1 << page_order(data); 2509 + buffer = container_of(work, struct perf_buffer, work); 2510 + nr = 1 << page_order(buffer); 2530 2511 2531 - base = data->user_page; 2512 + base = buffer->user_page; 2532 2513 for (i = 0; i < nr + 1; i++) 2533 2514 perf_mmap_unmark_page(base + (i * PAGE_SIZE)); 2534 2515 2535 2516 vfree(base); 2536 - kfree(data); 2517 + kfree(buffer); 2537 2518 } 2538 2519 2539 - static void perf_mmap_data_free(struct perf_mmap_data *data) 2520 + static void perf_buffer_free(struct perf_buffer *buffer) 2540 2521 { 2541 - schedule_work(&data->work); 2522 + schedule_work(&buffer->work); 2542 2523 } 2543 2524 2544 - static struct perf_mmap_data * 2545 - perf_mmap_data_alloc(struct perf_event *event, int nr_pages) 2525 + static struct perf_buffer * 2526 + 
perf_buffer_alloc(int nr_pages, long watermark, int cpu, int flags) 2546 2527 { 2547 - struct perf_mmap_data *data; 2528 + struct perf_buffer *buffer; 2548 2529 unsigned long size; 2549 2530 void *all_buf; 2550 2531 2551 - size = sizeof(struct perf_mmap_data); 2532 + size = sizeof(struct perf_buffer); 2552 2533 size += sizeof(void *); 2553 2534 2554 - data = kzalloc(size, GFP_KERNEL); 2555 - if (!data) 2535 + buffer = kzalloc(size, GFP_KERNEL); 2536 + if (!buffer) 2556 2537 goto fail; 2557 2538 2558 - INIT_WORK(&data->work, perf_mmap_data_free_work); 2539 + INIT_WORK(&buffer->work, perf_buffer_free_work); 2559 2540 2560 2541 all_buf = vmalloc_user((nr_pages + 1) * PAGE_SIZE); 2561 2542 if (!all_buf) 2562 2543 goto fail_all_buf; 2563 2544 2564 - data->user_page = all_buf; 2565 - data->data_pages[0] = all_buf + PAGE_SIZE; 2566 - data->page_order = ilog2(nr_pages); 2567 - data->nr_pages = 1; 2545 + buffer->user_page = all_buf; 2546 + buffer->data_pages[0] = all_buf + PAGE_SIZE; 2547 + buffer->page_order = ilog2(nr_pages); 2548 + buffer->nr_pages = 1; 2568 2549 2569 - return data; 2550 + perf_buffer_init(buffer, watermark, flags); 2551 + 2552 + return buffer; 2570 2553 2571 2554 fail_all_buf: 2572 - kfree(data); 2555 + kfree(buffer); 2573 2556 2574 2557 fail: 2575 2558 return NULL; ··· 2579 2558 2580 2559 #endif 2581 2560 2582 - static unsigned long perf_data_size(struct perf_mmap_data *data) 2561 + static unsigned long perf_data_size(struct perf_buffer *buffer) 2583 2562 { 2584 - return data->nr_pages << (PAGE_SHIFT + page_order(data)); 2563 + return buffer->nr_pages << (PAGE_SHIFT + page_order(buffer)); 2585 2564 } 2586 2565 2587 2566 static int perf_mmap_fault(struct vm_area_struct *vma, struct vm_fault *vmf) 2588 2567 { 2589 2568 struct perf_event *event = vma->vm_file->private_data; 2590 - struct perf_mmap_data *data; 2569 + struct perf_buffer *buffer; 2591 2570 int ret = VM_FAULT_SIGBUS; 2592 2571 2593 2572 if (vmf->flags & FAULT_FLAG_MKWRITE) { ··· 2597 2576 } 
2598 2577 2599 2578 rcu_read_lock(); 2600 - data = rcu_dereference(event->data); 2601 - if (!data) 2579 + buffer = rcu_dereference(event->buffer); 2580 + if (!buffer) 2602 2581 goto unlock; 2603 2582 2604 2583 if (vmf->pgoff && (vmf->flags & FAULT_FLAG_WRITE)) 2605 2584 goto unlock; 2606 2585 2607 - vmf->page = perf_mmap_to_page(data, vmf->pgoff); 2586 + vmf->page = perf_mmap_to_page(buffer, vmf->pgoff); 2608 2587 if (!vmf->page) 2609 2588 goto unlock; 2610 2589 ··· 2619 2598 return ret; 2620 2599 } 2621 2600 2622 - static void 2623 - perf_mmap_data_init(struct perf_event *event, struct perf_mmap_data *data) 2601 + static void perf_buffer_free_rcu(struct rcu_head *rcu_head) 2624 2602 { 2625 - long max_size = perf_data_size(data); 2603 + struct perf_buffer *buffer; 2626 2604 2627 - if (event->attr.watermark) { 2628 - data->watermark = min_t(long, max_size, 2629 - event->attr.wakeup_watermark); 2630 - } 2631 - 2632 - if (!data->watermark) 2633 - data->watermark = max_size / 2; 2634 - 2635 - atomic_set(&data->refcount, 1); 2636 - rcu_assign_pointer(event->data, data); 2605 + buffer = container_of(rcu_head, struct perf_buffer, rcu_head); 2606 + perf_buffer_free(buffer); 2637 2607 } 2638 2608 2639 - static void perf_mmap_data_free_rcu(struct rcu_head *rcu_head) 2609 + static struct perf_buffer *perf_buffer_get(struct perf_event *event) 2640 2610 { 2641 - struct perf_mmap_data *data; 2642 - 2643 - data = container_of(rcu_head, struct perf_mmap_data, rcu_head); 2644 - perf_mmap_data_free(data); 2645 - } 2646 - 2647 - static struct perf_mmap_data *perf_mmap_data_get(struct perf_event *event) 2648 - { 2649 - struct perf_mmap_data *data; 2611 + struct perf_buffer *buffer; 2650 2612 2651 2613 rcu_read_lock(); 2652 - data = rcu_dereference(event->data); 2653 - if (data) { 2654 - if (!atomic_inc_not_zero(&data->refcount)) 2655 - data = NULL; 2614 + buffer = rcu_dereference(event->buffer); 2615 + if (buffer) { 2616 + if (!atomic_inc_not_zero(&buffer->refcount)) 2617 + buffer = 
NULL; 2656 2618 } 2657 2619 rcu_read_unlock(); 2658 2620 2659 - return data; 2621 + return buffer; 2660 2622 } 2661 2623 2662 - static void perf_mmap_data_put(struct perf_mmap_data *data) 2624 + static void perf_buffer_put(struct perf_buffer *buffer) 2663 2625 { 2664 - if (!atomic_dec_and_test(&data->refcount)) 2626 + if (!atomic_dec_and_test(&buffer->refcount)) 2665 2627 return; 2666 2628 2667 - call_rcu(&data->rcu_head, perf_mmap_data_free_rcu); 2629 + call_rcu(&buffer->rcu_head, perf_buffer_free_rcu); 2668 2630 } 2669 2631 2670 2632 static void perf_mmap_open(struct vm_area_struct *vma) ··· 2662 2658 struct perf_event *event = vma->vm_file->private_data; 2663 2659 2664 2660 if (atomic_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex)) { 2665 - unsigned long size = perf_data_size(event->data); 2661 + unsigned long size = perf_data_size(event->buffer); 2666 2662 struct user_struct *user = event->mmap_user; 2667 - struct perf_mmap_data *data = event->data; 2663 + struct perf_buffer *buffer = event->buffer; 2668 2664 2669 2665 atomic_long_sub((size >> PAGE_SHIFT) + 1, &user->locked_vm); 2670 2666 vma->vm_mm->locked_vm -= event->mmap_locked; 2671 - rcu_assign_pointer(event->data, NULL); 2667 + rcu_assign_pointer(event->buffer, NULL); 2672 2668 mutex_unlock(&event->mmap_mutex); 2673 2669 2674 - perf_mmap_data_put(data); 2670 + perf_buffer_put(buffer); 2675 2671 free_uid(user); 2676 2672 } 2677 2673 } ··· 2689 2685 unsigned long user_locked, user_lock_limit; 2690 2686 struct user_struct *user = current_user(); 2691 2687 unsigned long locked, lock_limit; 2692 - struct perf_mmap_data *data; 2688 + struct perf_buffer *buffer; 2693 2689 unsigned long vma_size; 2694 2690 unsigned long nr_pages; 2695 2691 long user_extra, extra; 2696 - int ret = 0; 2692 + int ret = 0, flags = 0; 2697 2693 2698 2694 /* 2699 2695 * Don't allow mmap() of inherited per-task counters. 
This would ··· 2710 2706 nr_pages = (vma_size / PAGE_SIZE) - 1; 2711 2707 2712 2708 /* 2713 - * If we have data pages ensure they're a power-of-two number, so we 2709 + * If we have buffer pages ensure they're a power-of-two number, so we 2714 2710 * can do bitmasks instead of modulo. 2715 2711 */ 2716 2712 if (nr_pages != 0 && !is_power_of_2(nr_pages)) ··· 2724 2720 2725 2721 WARN_ON_ONCE(event->ctx->parent_ctx); 2726 2722 mutex_lock(&event->mmap_mutex); 2727 - if (event->data) { 2728 - if (event->data->nr_pages == nr_pages) 2729 - atomic_inc(&event->data->refcount); 2723 + if (event->buffer) { 2724 + if (event->buffer->nr_pages == nr_pages) 2725 + atomic_inc(&event->buffer->refcount); 2730 2726 else 2731 2727 ret = -EINVAL; 2732 2728 goto unlock; ··· 2756 2752 goto unlock; 2757 2753 } 2758 2754 2759 - WARN_ON(event->data); 2755 + WARN_ON(event->buffer); 2760 2756 2761 - data = perf_mmap_data_alloc(event, nr_pages); 2762 - if (!data) { 2757 + if (vma->vm_flags & VM_WRITE) 2758 + flags |= PERF_BUFFER_WRITABLE; 2759 + 2760 + buffer = perf_buffer_alloc(nr_pages, event->attr.wakeup_watermark, 2761 + event->cpu, flags); 2762 + if (!buffer) { 2763 2763 ret = -ENOMEM; 2764 2764 goto unlock; 2765 2765 } 2766 - 2767 - perf_mmap_data_init(event, data); 2768 - if (vma->vm_flags & VM_WRITE) 2769 - event->data->writable = 1; 2766 + rcu_assign_pointer(event->buffer, buffer); 2770 2767 2771 2768 atomic_long_add(user_extra, &user->locked_vm); 2772 2769 event->mmap_locked = extra; ··· 2946 2941 return NULL; 2947 2942 } 2948 2943 2949 - __weak 2950 - void perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip) 2951 - { 2952 - } 2953 - 2954 2944 2955 2945 /* 2956 2946 * We assume there is only KVM supporting the callbacks. 
··· 2971 2971 /* 2972 2972 * Output 2973 2973 */ 2974 - static bool perf_output_space(struct perf_mmap_data *data, unsigned long tail, 2974 + static bool perf_output_space(struct perf_buffer *buffer, unsigned long tail, 2975 2975 unsigned long offset, unsigned long head) 2976 2976 { 2977 2977 unsigned long mask; 2978 2978 2979 - if (!data->writable) 2979 + if (!buffer->writable) 2980 2980 return true; 2981 2981 2982 - mask = perf_data_size(data) - 1; 2982 + mask = perf_data_size(buffer) - 1; 2983 2983 2984 2984 offset = (offset - tail) & mask; 2985 2985 head = (head - tail) & mask; ··· 2992 2992 2993 2993 static void perf_output_wakeup(struct perf_output_handle *handle) 2994 2994 { 2995 - atomic_set(&handle->data->poll, POLL_IN); 2995 + atomic_set(&handle->buffer->poll, POLL_IN); 2996 2996 2997 2997 if (handle->nmi) { 2998 2998 handle->event->pending_wakeup = 1; ··· 3012 3012 */ 3013 3013 static void perf_output_get_handle(struct perf_output_handle *handle) 3014 3014 { 3015 - struct perf_mmap_data *data = handle->data; 3015 + struct perf_buffer *buffer = handle->buffer; 3016 3016 3017 3017 preempt_disable(); 3018 - local_inc(&data->nest); 3019 - handle->wakeup = local_read(&data->wakeup); 3018 + local_inc(&buffer->nest); 3019 + handle->wakeup = local_read(&buffer->wakeup); 3020 3020 } 3021 3021 3022 3022 static void perf_output_put_handle(struct perf_output_handle *handle) 3023 3023 { 3024 - struct perf_mmap_data *data = handle->data; 3024 + struct perf_buffer *buffer = handle->buffer; 3025 3025 unsigned long head; 3026 3026 3027 3027 again: 3028 - head = local_read(&data->head); 3028 + head = local_read(&buffer->head); 3029 3029 3030 3030 /* 3031 3031 * IRQ/NMI can happen here, which means we can miss a head update. 3032 3032 */ 3033 3033 3034 - if (!local_dec_and_test(&data->nest)) 3034 + if (!local_dec_and_test(&buffer->nest)) 3035 3035 goto out; 3036 3036 3037 3037 /* 3038 3038 * Publish the known good head. 
Rely on the full barrier implied 3039 - * by atomic_dec_and_test() order the data->head read and this 3039 + * by atomic_dec_and_test() order the buffer->head read and this 3040 3040 * write. 3041 3041 */ 3042 - data->user_page->data_head = head; 3042 + buffer->user_page->data_head = head; 3043 3043 3044 3044 /* 3045 3045 * Now check if we missed an update, rely on the (compiler) 3046 - * barrier in atomic_dec_and_test() to re-read data->head. 3046 + * barrier in atomic_dec_and_test() to re-read buffer->head. 3047 3047 */ 3048 - if (unlikely(head != local_read(&data->head))) { 3049 - local_inc(&data->nest); 3048 + if (unlikely(head != local_read(&buffer->head))) { 3049 + local_inc(&buffer->nest); 3050 3050 goto again; 3051 3051 } 3052 3052 3053 - if (handle->wakeup != local_read(&data->wakeup)) 3053 + if (handle->wakeup != local_read(&buffer->wakeup)) 3054 3054 perf_output_wakeup(handle); 3055 3055 3056 3056 out: ··· 3070 3070 buf += size; 3071 3071 handle->size -= size; 3072 3072 if (!handle->size) { 3073 - struct perf_mmap_data *data = handle->data; 3073 + struct perf_buffer *buffer = handle->buffer; 3074 3074 3075 3075 handle->page++; 3076 - handle->page &= data->nr_pages - 1; 3077 - handle->addr = data->data_pages[handle->page]; 3078 - handle->size = PAGE_SIZE << page_order(data); 3076 + handle->page &= buffer->nr_pages - 1; 3077 + handle->addr = buffer->data_pages[handle->page]; 3078 + handle->size = PAGE_SIZE << page_order(buffer); 3079 3079 } 3080 3080 } while (len); 3081 3081 } ··· 3084 3084 struct perf_event *event, unsigned int size, 3085 3085 int nmi, int sample) 3086 3086 { 3087 - struct perf_mmap_data *data; 3087 + struct perf_buffer *buffer; 3088 3088 unsigned long tail, offset, head; 3089 3089 int have_lost; 3090 3090 struct { ··· 3100 3100 if (event->parent) 3101 3101 event = event->parent; 3102 3102 3103 - data = rcu_dereference(event->data); 3104 - if (!data) 3103 + buffer = rcu_dereference(event->buffer); 3104 + if (!buffer) 3105 3105 goto out; 
3106 3106 3107 - handle->data = data; 3107 + handle->buffer = buffer; 3108 3108 handle->event = event; 3109 3109 handle->nmi = nmi; 3110 3110 handle->sample = sample; 3111 3111 3112 - if (!data->nr_pages) 3112 + if (!buffer->nr_pages) 3113 3113 goto out; 3114 3114 3115 - have_lost = local_read(&data->lost); 3115 + have_lost = local_read(&buffer->lost); 3116 3116 if (have_lost) 3117 3117 size += sizeof(lost_event); 3118 3118 ··· 3124 3124 * tail pointer. So that all reads will be completed before the 3125 3125 * write is issued. 3126 3126 */ 3127 - tail = ACCESS_ONCE(data->user_page->data_tail); 3127 + tail = ACCESS_ONCE(buffer->user_page->data_tail); 3128 3128 smp_rmb(); 3129 - offset = head = local_read(&data->head); 3129 + offset = head = local_read(&buffer->head); 3130 3130 head += size; 3131 - if (unlikely(!perf_output_space(data, tail, offset, head))) 3131 + if (unlikely(!perf_output_space(buffer, tail, offset, head))) 3132 3132 goto fail; 3133 - } while (local_cmpxchg(&data->head, offset, head) != offset); 3133 + } while (local_cmpxchg(&buffer->head, offset, head) != offset); 3134 3134 3135 - if (head - local_read(&data->wakeup) > data->watermark) 3136 - local_add(data->watermark, &data->wakeup); 3135 + if (head - local_read(&buffer->wakeup) > buffer->watermark) 3136 + local_add(buffer->watermark, &buffer->wakeup); 3137 3137 3138 - handle->page = offset >> (PAGE_SHIFT + page_order(data)); 3139 - handle->page &= data->nr_pages - 1; 3140 - handle->size = offset & ((PAGE_SIZE << page_order(data)) - 1); 3141 - handle->addr = data->data_pages[handle->page]; 3138 + handle->page = offset >> (PAGE_SHIFT + page_order(buffer)); 3139 + handle->page &= buffer->nr_pages - 1; 3140 + handle->size = offset & ((PAGE_SIZE << page_order(buffer)) - 1); 3141 + handle->addr = buffer->data_pages[handle->page]; 3142 3142 handle->addr += handle->size; 3143 - handle->size = (PAGE_SIZE << page_order(data)) - handle->size; 3143 + handle->size = (PAGE_SIZE << page_order(buffer)) - 
handle->size; 3144 3144 3145 3145 if (have_lost) { 3146 3146 lost_event.header.type = PERF_RECORD_LOST; 3147 3147 lost_event.header.misc = 0; 3148 3148 lost_event.header.size = sizeof(lost_event); 3149 3149 lost_event.id = event->id; 3150 - lost_event.lost = local_xchg(&data->lost, 0); 3150 + lost_event.lost = local_xchg(&buffer->lost, 0); 3151 3151 3152 3152 perf_output_put(handle, lost_event); 3153 3153 } ··· 3155 3155 return 0; 3156 3156 3157 3157 fail: 3158 - local_inc(&data->lost); 3158 + local_inc(&buffer->lost); 3159 3159 perf_output_put_handle(handle); 3160 3160 out: 3161 3161 rcu_read_unlock(); ··· 3166 3166 void perf_output_end(struct perf_output_handle *handle) 3167 3167 { 3168 3168 struct perf_event *event = handle->event; 3169 - struct perf_mmap_data *data = handle->data; 3169 + struct perf_buffer *buffer = handle->buffer; 3170 3170 3171 3171 int wakeup_events = event->attr.wakeup_events; 3172 3172 3173 3173 if (handle->sample && wakeup_events) { 3174 - int events = local_inc_return(&data->events); 3174 + int events = local_inc_return(&buffer->events); 3175 3175 if (events >= wakeup_events) { 3176 - local_sub(wakeup_events, &data->events); 3177 - local_inc(&data->wakeup); 3176 + local_sub(wakeup_events, &buffer->events); 3177 + local_inc(&buffer->wakeup); 3178 3178 } 3179 3179 } 3180 3180 ··· 3211 3211 u64 values[4]; 3212 3212 int n = 0; 3213 3213 3214 - values[n++] = atomic64_read(&event->count); 3214 + values[n++] = perf_event_count(event); 3215 3215 if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED) { 3216 3216 values[n++] = event->total_time_enabled + 3217 3217 atomic64_read(&event->child_total_time_enabled); ··· 3248 3248 if (leader != event) 3249 3249 leader->pmu->read(leader); 3250 3250 3251 - values[n++] = atomic64_read(&leader->count); 3251 + values[n++] = perf_event_count(leader); 3252 3252 if (read_format & PERF_FORMAT_ID) 3253 3253 values[n++] = primary_event_id(leader); 3254 3254 ··· 3260 3260 if (sub != event) 3261 3261 
sub->pmu->read(sub); 3262 3262 3263 - values[n++] = atomic64_read(&sub->count); 3263 + values[n++] = perf_event_count(sub); 3264 3264 if (read_format & PERF_FORMAT_ID) 3265 3265 values[n++] = primary_event_id(sub); 3266 3266 ··· 3491 3491 /* 3492 3492 * task tracking -- fork/exit 3493 3493 * 3494 - * enabled by: attr.comm | attr.mmap | attr.task 3494 + * enabled by: attr.comm | attr.mmap | attr.mmap_data | attr.task 3495 3495 */ 3496 3496 3497 3497 struct perf_task_event { ··· 3541 3541 if (event->cpu != -1 && event->cpu != smp_processor_id()) 3542 3542 return 0; 3543 3543 3544 - if (event->attr.comm || event->attr.mmap || event->attr.task) 3544 + if (event->attr.comm || event->attr.mmap || 3545 + event->attr.mmap_data || event->attr.task) 3545 3546 return 1; 3546 3547 3547 3548 return 0; ··· 3767 3766 } 3768 3767 3769 3768 static int perf_event_mmap_match(struct perf_event *event, 3770 - struct perf_mmap_event *mmap_event) 3769 + struct perf_mmap_event *mmap_event, 3770 + int executable) 3771 3771 { 3772 3772 if (event->state < PERF_EVENT_STATE_INACTIVE) 3773 3773 return 0; ··· 3776 3774 if (event->cpu != -1 && event->cpu != smp_processor_id()) 3777 3775 return 0; 3778 3776 3779 - if (event->attr.mmap) 3777 + if ((!executable && event->attr.mmap_data) || 3778 + (executable && event->attr.mmap)) 3780 3779 return 1; 3781 3780 3782 3781 return 0; 3783 3782 } 3784 3783 3785 3784 static void perf_event_mmap_ctx(struct perf_event_context *ctx, 3786 - struct perf_mmap_event *mmap_event) 3785 + struct perf_mmap_event *mmap_event, 3786 + int executable) 3787 3787 { 3788 3788 struct perf_event *event; 3789 3789 3790 3790 list_for_each_entry_rcu(event, &ctx->event_list, event_entry) { 3791 - if (perf_event_mmap_match(event, mmap_event)) 3791 + if (perf_event_mmap_match(event, mmap_event, executable)) 3792 3792 perf_event_mmap_output(event, mmap_event); 3793 3793 } 3794 3794 } ··· 3834 3830 if (!vma->vm_mm) { 3835 3831 name = strncpy(tmp, "[vdso]", sizeof(tmp)); 3836 3832 
goto got_name; 3833 + } else if (vma->vm_start <= vma->vm_mm->start_brk && 3834 + vma->vm_end >= vma->vm_mm->brk) { 3835 + name = strncpy(tmp, "[heap]", sizeof(tmp)); 3836 + goto got_name; 3837 + } else if (vma->vm_start <= vma->vm_mm->start_stack && 3838 + vma->vm_end >= vma->vm_mm->start_stack) { 3839 + name = strncpy(tmp, "[stack]", sizeof(tmp)); 3840 + goto got_name; 3837 3841 } 3838 3842 3839 3843 name = strncpy(tmp, "//anon", sizeof(tmp)); ··· 3858 3846 3859 3847 rcu_read_lock(); 3860 3848 cpuctx = &get_cpu_var(perf_cpu_context); 3861 - perf_event_mmap_ctx(&cpuctx->ctx, mmap_event); 3849 + perf_event_mmap_ctx(&cpuctx->ctx, mmap_event, vma->vm_flags & VM_EXEC); 3862 3850 ctx = rcu_dereference(current->perf_event_ctxp); 3863 3851 if (ctx) 3864 - perf_event_mmap_ctx(ctx, mmap_event); 3852 + perf_event_mmap_ctx(ctx, mmap_event, vma->vm_flags & VM_EXEC); 3865 3853 put_cpu_var(perf_cpu_context); 3866 3854 rcu_read_unlock(); 3867 3855 3868 3856 kfree(buf); 3869 3857 } 3870 3858 3871 - void __perf_event_mmap(struct vm_area_struct *vma) 3859 + void perf_event_mmap(struct vm_area_struct *vma) 3872 3860 { 3873 3861 struct perf_mmap_event mmap_event; 3874 3862 ··· 4030 4018 hwc->last_period = hwc->sample_period; 4031 4019 4032 4020 again: 4033 - old = val = atomic64_read(&hwc->period_left); 4021 + old = val = local64_read(&hwc->period_left); 4034 4022 if (val < 0) 4035 4023 return 0; 4036 4024 4037 4025 nr = div64_u64(period + val, period); 4038 4026 offset = nr * period; 4039 4027 val -= offset; 4040 - if (atomic64_cmpxchg(&hwc->period_left, old, val) != old) 4028 + if (local64_cmpxchg(&hwc->period_left, old, val) != old) 4041 4029 goto again; 4042 4030 4043 4031 return nr; ··· 4076 4064 { 4077 4065 struct hw_perf_event *hwc = &event->hw; 4078 4066 4079 - atomic64_add(nr, &event->count); 4067 + local64_add(nr, &event->count); 4080 4068 4081 4069 if (!regs) 4082 4070 return; ··· 4087 4075 if (nr == 1 && hwc->sample_period == 1 && !event->attr.freq) 4088 4076 return 
perf_swevent_overflow(event, 1, nmi, data, regs); 4089 4077 4090 - if (atomic64_add_negative(nr, &hwc->period_left)) 4078 + if (local64_add_negative(nr, &hwc->period_left)) 4091 4079 return; 4092 4080 4093 4081 perf_swevent_overflow(event, 0, nmi, data, regs); ··· 4225 4213 } 4226 4214 EXPORT_SYMBOL_GPL(perf_swevent_get_recursion_context); 4227 4215 4228 - void perf_swevent_put_recursion_context(int rctx) 4216 + void inline perf_swevent_put_recursion_context(int rctx) 4229 4217 { 4230 4218 struct perf_cpu_context *cpuctx = &__get_cpu_var(perf_cpu_context); 4231 4219 barrier(); 4232 4220 cpuctx->recursion[rctx]--; 4233 4221 } 4234 - EXPORT_SYMBOL_GPL(perf_swevent_put_recursion_context); 4235 - 4236 4222 4237 4223 void __perf_sw_event(u32 event_id, u64 nr, int nmi, 4238 4224 struct pt_regs *regs, u64 addr) ··· 4378 4368 u64 now; 4379 4369 4380 4370 now = cpu_clock(cpu); 4381 - prev = atomic64_xchg(&event->hw.prev_count, now); 4382 - atomic64_add(now - prev, &event->count); 4371 + prev = local64_xchg(&event->hw.prev_count, now); 4372 + local64_add(now - prev, &event->count); 4383 4373 } 4384 4374 4385 4375 static int cpu_clock_perf_event_enable(struct perf_event *event) ··· 4387 4377 struct hw_perf_event *hwc = &event->hw; 4388 4378 int cpu = raw_smp_processor_id(); 4389 4379 4390 - atomic64_set(&hwc->prev_count, cpu_clock(cpu)); 4380 + local64_set(&hwc->prev_count, cpu_clock(cpu)); 4391 4381 perf_swevent_start_hrtimer(event); 4392 4382 4393 4383 return 0; ··· 4419 4409 u64 prev; 4420 4410 s64 delta; 4421 4411 4422 - prev = atomic64_xchg(&event->hw.prev_count, now); 4412 + prev = local64_xchg(&event->hw.prev_count, now); 4423 4413 delta = now - prev; 4424 - atomic64_add(delta, &event->count); 4414 + local64_add(delta, &event->count); 4425 4415 } 4426 4416 4427 4417 static int task_clock_perf_event_enable(struct perf_event *event) ··· 4431 4421 4432 4422 now = event->ctx->time; 4433 4423 4434 - atomic64_set(&hwc->prev_count, now); 4424 + local64_set(&hwc->prev_count, 
now); 4435 4425 4436 4426 perf_swevent_start_hrtimer(event); 4437 4427 ··· 4611 4601 } 4612 4602 4613 4603 void perf_tp_event(u64 addr, u64 count, void *record, int entry_size, 4614 - struct pt_regs *regs, struct hlist_head *head) 4604 + struct pt_regs *regs, struct hlist_head *head, int rctx) 4615 4605 { 4616 4606 struct perf_sample_data data; 4617 4607 struct perf_event *event; ··· 4625 4615 perf_sample_data_init(&data, addr); 4626 4616 data.raw = &raw; 4627 4617 4628 - rcu_read_lock(); 4629 4618 hlist_for_each_entry_rcu(event, node, head, hlist_entry) { 4630 4619 if (perf_tp_event_match(event, &data, regs)) 4631 4620 perf_swevent_add(event, count, 1, &data, regs); 4632 4621 } 4633 - rcu_read_unlock(); 4622 + 4623 + perf_swevent_put_recursion_context(rctx); 4634 4624 } 4635 4625 EXPORT_SYMBOL_GPL(perf_tp_event); 4636 4626 ··· 4874 4864 hwc->sample_period = 1; 4875 4865 hwc->last_period = hwc->sample_period; 4876 4866 4877 - atomic64_set(&hwc->period_left, hwc->sample_period); 4867 + local64_set(&hwc->period_left, hwc->sample_period); 4878 4868 4879 4869 /* 4880 4870 * we currently do not support PERF_FORMAT_GROUP on inherited events ··· 4923 4913 4924 4914 if (!event->parent) { 4925 4915 atomic_inc(&nr_events); 4926 - if (event->attr.mmap) 4916 + if (event->attr.mmap || event->attr.mmap_data) 4927 4917 atomic_inc(&nr_mmap_events); 4928 4918 if (event->attr.comm) 4929 4919 atomic_inc(&nr_comm_events); ··· 5017 5007 static int 5018 5008 perf_event_set_output(struct perf_event *event, struct perf_event *output_event) 5019 5009 { 5020 - struct perf_mmap_data *data = NULL, *old_data = NULL; 5010 + struct perf_buffer *buffer = NULL, *old_buffer = NULL; 5021 5011 int ret = -EINVAL; 5022 5012 5023 5013 if (!output_event) ··· 5047 5037 5048 5038 if (output_event) { 5049 5039 /* get the buffer we want to redirect to */ 5050 - data = perf_mmap_data_get(output_event); 5051 - if (!data) 5040 + buffer = perf_buffer_get(output_event); 5041 + if (!buffer) 5052 5042 goto unlock; 
5053 5043 } 5054 5044 5055 - old_data = event->data; 5056 - rcu_assign_pointer(event->data, data); 5045 + old_buffer = event->buffer; 5046 + rcu_assign_pointer(event->buffer, buffer); 5057 5047 ret = 0; 5058 5048 unlock: 5059 5049 mutex_unlock(&event->mmap_mutex); 5060 5050 5061 - if (old_data) 5062 - perf_mmap_data_put(old_data); 5051 + if (old_buffer) 5052 + perf_buffer_put(old_buffer); 5063 5053 out: 5064 5054 return ret; 5065 5055 } ··· 5308 5298 hwc->sample_period = sample_period; 5309 5299 hwc->last_period = sample_period; 5310 5300 5311 - atomic64_set(&hwc->period_left, sample_period); 5301 + local64_set(&hwc->period_left, sample_period); 5312 5302 } 5313 5303 5314 5304 child_event->overflow_handler = parent_event->overflow_handler; ··· 5369 5359 if (child_event->attr.inherit_stat) 5370 5360 perf_event_read_event(child_event, child); 5371 5361 5372 - child_val = atomic64_read(&child_event->count); 5362 + child_val = perf_event_count(child_event); 5373 5363 5374 5364 /* 5375 5365 * Add back the child's count to the parent's count: 5376 5366 */ 5377 - atomic64_add(child_val, &parent_event->count); 5367 + atomic64_add(child_val, &parent_event->child_count); 5378 5368 atomic64_add(child_event->total_time_enabled, 5379 5369 &parent_event->child_total_time_enabled); 5380 5370 atomic64_add(child_event->total_time_running,
+3 -3
kernel/sched.c
··· 3726 3726 * off of preempt_enable. Kernel preemptions off return from interrupt 3727 3727 * occur there and call schedule directly. 3728 3728 */ 3729 - asmlinkage void __sched preempt_schedule(void) 3729 + asmlinkage void __sched notrace preempt_schedule(void) 3730 3730 { 3731 3731 struct thread_info *ti = current_thread_info(); 3732 3732 ··· 3738 3738 return; 3739 3739 3740 3740 do { 3741 - add_preempt_count(PREEMPT_ACTIVE); 3741 + add_preempt_count_notrace(PREEMPT_ACTIVE); 3742 3742 schedule(); 3743 - sub_preempt_count(PREEMPT_ACTIVE); 3743 + sub_preempt_count_notrace(PREEMPT_ACTIVE); 3744 3744 3745 3745 /* 3746 3746 * Check again in case we missed a preemption opportunity
-293
kernel/softlockup.c
··· 1 - /* 2 - * Detect Soft Lockups 3 - * 4 - * started by Ingo Molnar, Copyright (C) 2005, 2006 Red Hat, Inc. 5 - * 6 - * this code detects soft lockups: incidents in where on a CPU 7 - * the kernel does not reschedule for 10 seconds or more. 8 - */ 9 - #include <linux/mm.h> 10 - #include <linux/cpu.h> 11 - #include <linux/nmi.h> 12 - #include <linux/init.h> 13 - #include <linux/delay.h> 14 - #include <linux/freezer.h> 15 - #include <linux/kthread.h> 16 - #include <linux/lockdep.h> 17 - #include <linux/notifier.h> 18 - #include <linux/module.h> 19 - #include <linux/sysctl.h> 20 - 21 - #include <asm/irq_regs.h> 22 - 23 - static DEFINE_SPINLOCK(print_lock); 24 - 25 - static DEFINE_PER_CPU(unsigned long, softlockup_touch_ts); /* touch timestamp */ 26 - static DEFINE_PER_CPU(unsigned long, softlockup_print_ts); /* print timestamp */ 27 - static DEFINE_PER_CPU(struct task_struct *, softlockup_watchdog); 28 - static DEFINE_PER_CPU(bool, softlock_touch_sync); 29 - 30 - static int __read_mostly did_panic; 31 - int __read_mostly softlockup_thresh = 60; 32 - 33 - /* 34 - * Should we panic (and reboot, if panic_timeout= is set) when a 35 - * soft-lockup occurs: 36 - */ 37 - unsigned int __read_mostly softlockup_panic = 38 - CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE; 39 - 40 - static int __init softlockup_panic_setup(char *str) 41 - { 42 - softlockup_panic = simple_strtoul(str, NULL, 0); 43 - 44 - return 1; 45 - } 46 - __setup("softlockup_panic=", softlockup_panic_setup); 47 - 48 - static int 49 - softlock_panic(struct notifier_block *this, unsigned long event, void *ptr) 50 - { 51 - did_panic = 1; 52 - 53 - return NOTIFY_DONE; 54 - } 55 - 56 - static struct notifier_block panic_block = { 57 - .notifier_call = softlock_panic, 58 - }; 59 - 60 - /* 61 - * Returns seconds, approximately. We don't need nanosecond 62 - * resolution, and we don't need to waste time with a big divide when 63 - * 2^30ns == 1.074s. 
64 - */ 65 - static unsigned long get_timestamp(int this_cpu) 66 - { 67 - return cpu_clock(this_cpu) >> 30LL; /* 2^30 ~= 10^9 */ 68 - } 69 - 70 - static void __touch_softlockup_watchdog(void) 71 - { 72 - int this_cpu = raw_smp_processor_id(); 73 - 74 - __raw_get_cpu_var(softlockup_touch_ts) = get_timestamp(this_cpu); 75 - } 76 - 77 - void touch_softlockup_watchdog(void) 78 - { 79 - __raw_get_cpu_var(softlockup_touch_ts) = 0; 80 - } 81 - EXPORT_SYMBOL(touch_softlockup_watchdog); 82 - 83 - void touch_softlockup_watchdog_sync(void) 84 - { 85 - __raw_get_cpu_var(softlock_touch_sync) = true; 86 - __raw_get_cpu_var(softlockup_touch_ts) = 0; 87 - } 88 - 89 - void touch_all_softlockup_watchdogs(void) 90 - { 91 - int cpu; 92 - 93 - /* Cause each CPU to re-update its timestamp rather than complain */ 94 - for_each_online_cpu(cpu) 95 - per_cpu(softlockup_touch_ts, cpu) = 0; 96 - } 97 - EXPORT_SYMBOL(touch_all_softlockup_watchdogs); 98 - 99 - int proc_dosoftlockup_thresh(struct ctl_table *table, int write, 100 - void __user *buffer, 101 - size_t *lenp, loff_t *ppos) 102 - { 103 - touch_all_softlockup_watchdogs(); 104 - return proc_dointvec_minmax(table, write, buffer, lenp, ppos); 105 - } 106 - 107 - /* 108 - * This callback runs from the timer interrupt, and checks 109 - * whether the watchdog thread has hung or not: 110 - */ 111 - void softlockup_tick(void) 112 - { 113 - int this_cpu = smp_processor_id(); 114 - unsigned long touch_ts = per_cpu(softlockup_touch_ts, this_cpu); 115 - unsigned long print_ts; 116 - struct pt_regs *regs = get_irq_regs(); 117 - unsigned long now; 118 - 119 - /* Is detection switched off? 
*/ 120 - if (!per_cpu(softlockup_watchdog, this_cpu) || softlockup_thresh <= 0) { 121 - /* Be sure we don't false trigger if switched back on */ 122 - if (touch_ts) 123 - per_cpu(softlockup_touch_ts, this_cpu) = 0; 124 - return; 125 - } 126 - 127 - if (touch_ts == 0) { 128 - if (unlikely(per_cpu(softlock_touch_sync, this_cpu))) { 129 - /* 130 - * If the time stamp was touched atomically 131 - * make sure the scheduler tick is up to date. 132 - */ 133 - per_cpu(softlock_touch_sync, this_cpu) = false; 134 - sched_clock_tick(); 135 - } 136 - __touch_softlockup_watchdog(); 137 - return; 138 - } 139 - 140 - print_ts = per_cpu(softlockup_print_ts, this_cpu); 141 - 142 - /* report at most once a second */ 143 - if (print_ts == touch_ts || did_panic) 144 - return; 145 - 146 - /* do not print during early bootup: */ 147 - if (unlikely(system_state != SYSTEM_RUNNING)) { 148 - __touch_softlockup_watchdog(); 149 - return; 150 - } 151 - 152 - now = get_timestamp(this_cpu); 153 - 154 - /* 155 - * Wake up the high-prio watchdog task twice per 156 - * threshold timespan. 157 - */ 158 - if (time_after(now - softlockup_thresh/2, touch_ts)) 159 - wake_up_process(per_cpu(softlockup_watchdog, this_cpu)); 160 - 161 - /* Warn about unreasonable delays: */ 162 - if (time_before_eq(now - softlockup_thresh, touch_ts)) 163 - return; 164 - 165 - per_cpu(softlockup_print_ts, this_cpu) = touch_ts; 166 - 167 - spin_lock(&print_lock); 168 - printk(KERN_ERR "BUG: soft lockup - CPU#%d stuck for %lus! [%s:%d]\n", 169 - this_cpu, now - touch_ts, 170 - current->comm, task_pid_nr(current)); 171 - print_modules(); 172 - print_irqtrace_events(current); 173 - if (regs) 174 - show_regs(regs); 175 - else 176 - dump_stack(); 177 - spin_unlock(&print_lock); 178 - 179 - if (softlockup_panic) 180 - panic("softlockup: hung tasks"); 181 - } 182 - 183 - /* 184 - * The watchdog thread - runs every second and touches the timestamp. 
185 - */ 186 - static int watchdog(void *__bind_cpu) 187 - { 188 - struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 }; 189 - 190 - sched_setscheduler(current, SCHED_FIFO, &param); 191 - 192 - /* initialize timestamp */ 193 - __touch_softlockup_watchdog(); 194 - 195 - set_current_state(TASK_INTERRUPTIBLE); 196 - /* 197 - * Run briefly once per second to reset the softlockup timestamp. 198 - * If this gets delayed for more than 60 seconds then the 199 - * debug-printout triggers in softlockup_tick(). 200 - */ 201 - while (!kthread_should_stop()) { 202 - __touch_softlockup_watchdog(); 203 - schedule(); 204 - 205 - if (kthread_should_stop()) 206 - break; 207 - 208 - set_current_state(TASK_INTERRUPTIBLE); 209 - } 210 - __set_current_state(TASK_RUNNING); 211 - 212 - return 0; 213 - } 214 - 215 - /* 216 - * Create/destroy watchdog threads as CPUs come and go: 217 - */ 218 - static int __cpuinit 219 - cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu) 220 - { 221 - int hotcpu = (unsigned long)hcpu; 222 - struct task_struct *p; 223 - 224 - switch (action) { 225 - case CPU_UP_PREPARE: 226 - case CPU_UP_PREPARE_FROZEN: 227 - BUG_ON(per_cpu(softlockup_watchdog, hotcpu)); 228 - p = kthread_create(watchdog, hcpu, "watchdog/%d", hotcpu); 229 - if (IS_ERR(p)) { 230 - printk(KERN_ERR "watchdog for %i failed\n", hotcpu); 231 - return NOTIFY_BAD; 232 - } 233 - per_cpu(softlockup_touch_ts, hotcpu) = 0; 234 - per_cpu(softlockup_watchdog, hotcpu) = p; 235 - kthread_bind(p, hotcpu); 236 - break; 237 - case CPU_ONLINE: 238 - case CPU_ONLINE_FROZEN: 239 - wake_up_process(per_cpu(softlockup_watchdog, hotcpu)); 240 - break; 241 - #ifdef CONFIG_HOTPLUG_CPU 242 - case CPU_UP_CANCELED: 243 - case CPU_UP_CANCELED_FROZEN: 244 - if (!per_cpu(softlockup_watchdog, hotcpu)) 245 - break; 246 - /* Unbind so it can run. Fall thru. 
*/ 247 - kthread_bind(per_cpu(softlockup_watchdog, hotcpu), 248 - cpumask_any(cpu_online_mask)); 249 - case CPU_DEAD: 250 - case CPU_DEAD_FROZEN: 251 - p = per_cpu(softlockup_watchdog, hotcpu); 252 - per_cpu(softlockup_watchdog, hotcpu) = NULL; 253 - kthread_stop(p); 254 - break; 255 - #endif /* CONFIG_HOTPLUG_CPU */ 256 - } 257 - return NOTIFY_OK; 258 - } 259 - 260 - static struct notifier_block __cpuinitdata cpu_nfb = { 261 - .notifier_call = cpu_callback 262 - }; 263 - 264 - static int __initdata nosoftlockup; 265 - 266 - static int __init nosoftlockup_setup(char *str) 267 - { 268 - nosoftlockup = 1; 269 - return 1; 270 - } 271 - __setup("nosoftlockup", nosoftlockup_setup); 272 - 273 - static int __init spawn_softlockup_task(void) 274 - { 275 - void *cpu = (void *)(long)smp_processor_id(); 276 - int err; 277 - 278 - if (nosoftlockup) 279 - return 0; 280 - 281 - err = cpu_callback(&cpu_nfb, CPU_UP_PREPARE, cpu); 282 - if (err == NOTIFY_BAD) { 283 - BUG(); 284 - return 1; 285 - } 286 - cpu_callback(&cpu_nfb, CPU_ONLINE, cpu); 287 - register_cpu_notifier(&cpu_nfb); 288 - 289 - atomic_notifier_chain_register(&panic_notifier_list, &panic_block); 290 - 291 - return 0; 292 - } 293 - early_initcall(spawn_softlockup_task);
+33 -22
kernel/sysctl.c
··· 76 76 #include <scsi/sg.h> 77 77 #endif 78 78 79 + #ifdef CONFIG_LOCKUP_DETECTOR 80 + #include <linux/nmi.h> 81 + #endif 82 + 79 83 80 84 #if defined(CONFIG_SYSCTL) 81 85 ··· 110 106 #endif 111 107 112 108 /* Constants used for minimum and maximum */ 113 - #ifdef CONFIG_DETECT_SOFTLOCKUP 109 + #ifdef CONFIG_LOCKUP_DETECTOR 114 110 static int sixty = 60; 115 111 static int neg_one = -1; 116 112 #endif ··· 714 710 .mode = 0444, 715 711 .proc_handler = proc_dointvec, 716 712 }, 717 - #if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86) 713 + #if defined(CONFIG_LOCKUP_DETECTOR) 714 + { 715 + .procname = "watchdog", 716 + .data = &watchdog_enabled, 717 + .maxlen = sizeof (int), 718 + .mode = 0644, 719 + .proc_handler = proc_dowatchdog_enabled, 720 + }, 721 + { 722 + .procname = "watchdog_thresh", 723 + .data = &softlockup_thresh, 724 + .maxlen = sizeof(int), 725 + .mode = 0644, 726 + .proc_handler = proc_dowatchdog_thresh, 727 + .extra1 = &neg_one, 728 + .extra2 = &sixty, 729 + }, 730 + { 731 + .procname = "softlockup_panic", 732 + .data = &softlockup_panic, 733 + .maxlen = sizeof(int), 734 + .mode = 0644, 735 + .proc_handler = proc_dointvec_minmax, 736 + .extra1 = &zero, 737 + .extra2 = &one, 738 + }, 739 + #endif 740 + #if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86) && !defined(CONFIG_LOCKUP_DETECTOR) 718 741 { 719 742 .procname = "unknown_nmi_panic", 720 743 .data = &unknown_nmi_panic, ··· 842 811 .maxlen = sizeof (int), 843 812 .mode = 0644, 844 813 .proc_handler = proc_dointvec, 845 - }, 846 - #endif 847 - #ifdef CONFIG_DETECT_SOFTLOCKUP 848 - { 849 - .procname = "softlockup_panic", 850 - .data = &softlockup_panic, 851 - .maxlen = sizeof(int), 852 - .mode = 0644, 853 - .proc_handler = proc_dointvec_minmax, 854 - .extra1 = &zero, 855 - .extra2 = &one, 856 - }, 857 - { 858 - .procname = "softlockup_thresh", 859 - .data = &softlockup_thresh, 860 - .maxlen = sizeof(int), 861 - .mode = 0644, 862 - .proc_handler = proc_dosoftlockup_thresh, 863 - 
.extra1 = &neg_one, 864 - .extra2 = &sixty, 865 814 }, 866 815 #endif 867 816 #ifdef CONFIG_DETECT_HUNG_TASK
-1
kernel/timer.c
··· 1302 1302 { 1303 1303 hrtimer_run_queues(); 1304 1304 raise_softirq(TIMER_SOFTIRQ); 1305 - softlockup_tick(); 1306 1305 } 1307 1306 1308 1307 /*
-68
kernel/trace/Kconfig
··· 194 194 enabled. This option and the irqs-off timing option can be 195 195 used together or separately.) 196 196 197 - config SYSPROF_TRACER 198 - bool "Sysprof Tracer" 199 - depends on X86 200 - select GENERIC_TRACER 201 - select CONTEXT_SWITCH_TRACER 202 - help 203 - This tracer provides the trace needed by the 'Sysprof' userspace 204 - tool. 205 - 206 197 config SCHED_TRACER 207 198 bool "Scheduling Latency Tracer" 208 199 select GENERIC_TRACER ··· 219 228 select KALLSYMS 220 229 help 221 230 Basic tracer to catch the syscall entry and exit events. 222 - 223 - config BOOT_TRACER 224 - bool "Trace boot initcalls" 225 - select GENERIC_TRACER 226 - select CONTEXT_SWITCH_TRACER 227 - help 228 - This tracer helps developers to optimize boot times: it records 229 - the timings of the initcalls and traces key events and the identity 230 - of tasks that can cause boot delays, such as context-switches. 231 - 232 - Its aim is to be parsed by the scripts/bootgraph.pl tool to 233 - produce pretty graphics about boot inefficiencies, giving a visual 234 - representation of the delays during initcalls - but the raw 235 - /debug/tracing/trace text output is readable too. 236 - 237 - You must pass in initcall_debug and ftrace=initcall to the kernel 238 - command line to enable this on bootup. 239 231 240 232 config TRACE_BRANCH_PROFILING 241 233 bool ··· 299 325 300 326 Say N if unsure. 301 327 302 - config KSYM_TRACER 303 - bool "Trace read and write access on kernel memory locations" 304 - depends on HAVE_HW_BREAKPOINT 305 - select TRACING 306 - help 307 - This tracer helps find read and write operations on any given kernel 308 - symbol i.e. /proc/kallsyms. 309 - 310 - config PROFILE_KSYM_TRACER 311 - bool "Profile all kernel memory accesses on 'watched' variables" 312 - depends on KSYM_TRACER 313 - help 314 - This tracer profiles kernel accesses on variables watched through the 315 - ksym tracer ftrace plugin. 
Depending upon the hardware, all read 316 - and write operations on kernel variables can be monitored for 317 - accesses. 318 - 319 - The results will be displayed in: 320 - /debugfs/tracing/profile_ksym 321 - 322 - Say N if unsure. 323 - 324 328 config STACK_TRACER 325 329 bool "Trace max stack" 326 330 depends on HAVE_FUNCTION_TRACER ··· 322 370 sysctl kernel.stack_tracer_enabled 323 371 324 372 Say N if unsure. 325 - 326 - config KMEMTRACE 327 - bool "Trace SLAB allocations" 328 - select GENERIC_TRACER 329 - help 330 - kmemtrace provides tracing for slab allocator functions, such as 331 - kmalloc, kfree, kmem_cache_alloc, kmem_cache_free, etc. Collected 332 - data is then fed to the userspace application in order to analyse 333 - allocation hotspots, internal fragmentation and so on, making it 334 - possible to see how well an allocator performs, as well as debug 335 - and profile kernel code. 336 - 337 - This requires an userspace application to use. See 338 - Documentation/trace/kmemtrace.txt for more information. 339 - 340 - Saying Y will make the kernel somewhat larger and slower. However, 341 - if you disable kmemtrace at run-time or boot-time, the performance 342 - impact is minimal (depending on the arch the kernel is built for). 343 - 344 - If unsure, say N. 345 373 346 374 config WORKQUEUE_TRACER 347 375 bool "Trace workqueues"
-4
kernel/trace/Makefile
··· 30 30 obj-$(CONFIG_TRACING) += trace_stat.o 31 31 obj-$(CONFIG_TRACING) += trace_printk.o 32 32 obj-$(CONFIG_CONTEXT_SWITCH_TRACER) += trace_sched_switch.o 33 - obj-$(CONFIG_SYSPROF_TRACER) += trace_sysprof.o 34 33 obj-$(CONFIG_FUNCTION_TRACER) += trace_functions.o 35 34 obj-$(CONFIG_IRQSOFF_TRACER) += trace_irqsoff.o 36 35 obj-$(CONFIG_PREEMPT_TRACER) += trace_irqsoff.o ··· 37 38 obj-$(CONFIG_NOP_TRACER) += trace_nop.o 38 39 obj-$(CONFIG_STACK_TRACER) += trace_stack.o 39 40 obj-$(CONFIG_MMIOTRACE) += trace_mmiotrace.o 40 - obj-$(CONFIG_BOOT_TRACER) += trace_boot.o 41 41 obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += trace_functions_graph.o 42 42 obj-$(CONFIG_TRACE_BRANCH_PROFILING) += trace_branch.o 43 - obj-$(CONFIG_KMEMTRACE) += kmemtrace.o 44 43 obj-$(CONFIG_WORKQUEUE_TRACER) += trace_workqueue.o 45 44 obj-$(CONFIG_BLK_DEV_IO_TRACE) += blktrace.o 46 45 ifeq ($(CONFIG_BLOCK),y) ··· 52 55 endif 53 56 obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o 54 57 obj-$(CONFIG_KPROBE_EVENT) += trace_kprobe.o 55 - obj-$(CONFIG_KSYM_TRACER) += trace_ksym.o 56 58 obj-$(CONFIG_EVENT_TRACING) += power-traces.o 57 59 ifeq ($(CONFIG_TRACING),y) 58 60 obj-$(CONFIG_KGDB_KDB) += trace_kdb.o
+2 -3
kernel/trace/ftrace.c
··· 1883 1883 struct hlist_head *hhd; 1884 1884 struct hlist_node *n; 1885 1885 unsigned long key; 1886 - int resched; 1887 1886 1888 1887 key = hash_long(ip, FTRACE_HASH_BITS); 1889 1888 ··· 1896 1897 * period. This syncs the hash iteration and freeing of items 1897 1898 * on the hash. rcu_read_lock is too dangerous here. 1898 1899 */ 1899 - resched = ftrace_preempt_disable(); 1900 + preempt_disable_notrace(); 1900 1901 hlist_for_each_entry_rcu(entry, n, hhd, node) { 1901 1902 if (entry->ip == ip) 1902 1903 entry->ops->func(ip, parent_ip, &entry->data); 1903 1904 } 1904 - ftrace_preempt_enable(resched); 1905 + preempt_enable_notrace(); 1905 1906 } 1906 1907 1907 1908 static struct ftrace_ops trace_probe_ops __read_mostly =
-529
kernel/trace/kmemtrace.c
··· 1 - /* 2 - * Memory allocator tracing 3 - * 4 - * Copyright (C) 2008 Eduard - Gabriel Munteanu 5 - * Copyright (C) 2008 Pekka Enberg <penberg@cs.helsinki.fi> 6 - * Copyright (C) 2008 Frederic Weisbecker <fweisbec@gmail.com> 7 - */ 8 - 9 - #include <linux/tracepoint.h> 10 - #include <linux/seq_file.h> 11 - #include <linux/debugfs.h> 12 - #include <linux/dcache.h> 13 - #include <linux/fs.h> 14 - 15 - #include <linux/kmemtrace.h> 16 - 17 - #include "trace_output.h" 18 - #include "trace.h" 19 - 20 - /* Select an alternative, minimalistic output than the original one */ 21 - #define TRACE_KMEM_OPT_MINIMAL 0x1 22 - 23 - static struct tracer_opt kmem_opts[] = { 24 - /* Default disable the minimalistic output */ 25 - { TRACER_OPT(kmem_minimalistic, TRACE_KMEM_OPT_MINIMAL) }, 26 - { } 27 - }; 28 - 29 - static struct tracer_flags kmem_tracer_flags = { 30 - .val = 0, 31 - .opts = kmem_opts 32 - }; 33 - 34 - static struct trace_array *kmemtrace_array; 35 - 36 - /* Trace allocations */ 37 - static inline void kmemtrace_alloc(enum kmemtrace_type_id type_id, 38 - unsigned long call_site, 39 - const void *ptr, 40 - size_t bytes_req, 41 - size_t bytes_alloc, 42 - gfp_t gfp_flags, 43 - int node) 44 - { 45 - struct ftrace_event_call *call = &event_kmem_alloc; 46 - struct trace_array *tr = kmemtrace_array; 47 - struct kmemtrace_alloc_entry *entry; 48 - struct ring_buffer_event *event; 49 - 50 - event = ring_buffer_lock_reserve(tr->buffer, sizeof(*entry)); 51 - if (!event) 52 - return; 53 - 54 - entry = ring_buffer_event_data(event); 55 - tracing_generic_entry_update(&entry->ent, 0, 0); 56 - 57 - entry->ent.type = TRACE_KMEM_ALLOC; 58 - entry->type_id = type_id; 59 - entry->call_site = call_site; 60 - entry->ptr = ptr; 61 - entry->bytes_req = bytes_req; 62 - entry->bytes_alloc = bytes_alloc; 63 - entry->gfp_flags = gfp_flags; 64 - entry->node = node; 65 - 66 - if (!filter_check_discard(call, entry, tr->buffer, event)) 67 - ring_buffer_unlock_commit(tr->buffer, event); 68 - 69 - 
trace_wake_up(); 70 - } 71 - 72 - static inline void kmemtrace_free(enum kmemtrace_type_id type_id, 73 - unsigned long call_site, 74 - const void *ptr) 75 - { 76 - struct ftrace_event_call *call = &event_kmem_free; 77 - struct trace_array *tr = kmemtrace_array; 78 - struct kmemtrace_free_entry *entry; 79 - struct ring_buffer_event *event; 80 - 81 - event = ring_buffer_lock_reserve(tr->buffer, sizeof(*entry)); 82 - if (!event) 83 - return; 84 - entry = ring_buffer_event_data(event); 85 - tracing_generic_entry_update(&entry->ent, 0, 0); 86 - 87 - entry->ent.type = TRACE_KMEM_FREE; 88 - entry->type_id = type_id; 89 - entry->call_site = call_site; 90 - entry->ptr = ptr; 91 - 92 - if (!filter_check_discard(call, entry, tr->buffer, event)) 93 - ring_buffer_unlock_commit(tr->buffer, event); 94 - 95 - trace_wake_up(); 96 - } 97 - 98 - static void kmemtrace_kmalloc(void *ignore, 99 - unsigned long call_site, 100 - const void *ptr, 101 - size_t bytes_req, 102 - size_t bytes_alloc, 103 - gfp_t gfp_flags) 104 - { 105 - kmemtrace_alloc(KMEMTRACE_TYPE_KMALLOC, call_site, ptr, 106 - bytes_req, bytes_alloc, gfp_flags, -1); 107 - } 108 - 109 - static void kmemtrace_kmem_cache_alloc(void *ignore, 110 - unsigned long call_site, 111 - const void *ptr, 112 - size_t bytes_req, 113 - size_t bytes_alloc, 114 - gfp_t gfp_flags) 115 - { 116 - kmemtrace_alloc(KMEMTRACE_TYPE_CACHE, call_site, ptr, 117 - bytes_req, bytes_alloc, gfp_flags, -1); 118 - } 119 - 120 - static void kmemtrace_kmalloc_node(void *ignore, 121 - unsigned long call_site, 122 - const void *ptr, 123 - size_t bytes_req, 124 - size_t bytes_alloc, 125 - gfp_t gfp_flags, 126 - int node) 127 - { 128 - kmemtrace_alloc(KMEMTRACE_TYPE_KMALLOC, call_site, ptr, 129 - bytes_req, bytes_alloc, gfp_flags, node); 130 - } 131 - 132 - static void kmemtrace_kmem_cache_alloc_node(void *ignore, 133 - unsigned long call_site, 134 - const void *ptr, 135 - size_t bytes_req, 136 - size_t bytes_alloc, 137 - gfp_t gfp_flags, 138 - int node) 139 - { 
140 - kmemtrace_alloc(KMEMTRACE_TYPE_CACHE, call_site, ptr, 141 - bytes_req, bytes_alloc, gfp_flags, node); 142 - } 143 - 144 - static void 145 - kmemtrace_kfree(void *ignore, unsigned long call_site, const void *ptr) 146 - { 147 - kmemtrace_free(KMEMTRACE_TYPE_KMALLOC, call_site, ptr); 148 - } 149 - 150 - static void kmemtrace_kmem_cache_free(void *ignore, 151 - unsigned long call_site, const void *ptr) 152 - { 153 - kmemtrace_free(KMEMTRACE_TYPE_CACHE, call_site, ptr); 154 - } 155 - 156 - static int kmemtrace_start_probes(void) 157 - { 158 - int err; 159 - 160 - err = register_trace_kmalloc(kmemtrace_kmalloc, NULL); 161 - if (err) 162 - return err; 163 - err = register_trace_kmem_cache_alloc(kmemtrace_kmem_cache_alloc, NULL); 164 - if (err) 165 - return err; 166 - err = register_trace_kmalloc_node(kmemtrace_kmalloc_node, NULL); 167 - if (err) 168 - return err; 169 - err = register_trace_kmem_cache_alloc_node(kmemtrace_kmem_cache_alloc_node, NULL); 170 - if (err) 171 - return err; 172 - err = register_trace_kfree(kmemtrace_kfree, NULL); 173 - if (err) 174 - return err; 175 - err = register_trace_kmem_cache_free(kmemtrace_kmem_cache_free, NULL); 176 - 177 - return err; 178 - } 179 - 180 - static void kmemtrace_stop_probes(void) 181 - { 182 - unregister_trace_kmalloc(kmemtrace_kmalloc, NULL); 183 - unregister_trace_kmem_cache_alloc(kmemtrace_kmem_cache_alloc, NULL); 184 - unregister_trace_kmalloc_node(kmemtrace_kmalloc_node, NULL); 185 - unregister_trace_kmem_cache_alloc_node(kmemtrace_kmem_cache_alloc_node, NULL); 186 - unregister_trace_kfree(kmemtrace_kfree, NULL); 187 - unregister_trace_kmem_cache_free(kmemtrace_kmem_cache_free, NULL); 188 - } 189 - 190 - static int kmem_trace_init(struct trace_array *tr) 191 - { 192 - kmemtrace_array = tr; 193 - 194 - tracing_reset_online_cpus(tr); 195 - 196 - kmemtrace_start_probes(); 197 - 198 - return 0; 199 - } 200 - 201 - static void kmem_trace_reset(struct trace_array *tr) 202 - { 203 - kmemtrace_stop_probes(); 204 - } 205 
- 206 - static void kmemtrace_headers(struct seq_file *s) 207 - { 208 - /* Don't need headers for the original kmemtrace output */ 209 - if (!(kmem_tracer_flags.val & TRACE_KMEM_OPT_MINIMAL)) 210 - return; 211 - 212 - seq_printf(s, "#\n"); 213 - seq_printf(s, "# ALLOC TYPE REQ GIVEN FLAGS " 214 - " POINTER NODE CALLER\n"); 215 - seq_printf(s, "# FREE | | | | " 216 - " | | | |\n"); 217 - seq_printf(s, "# |\n\n"); 218 - } 219 - 220 - /* 221 - * The following functions give the original output from kmemtrace, 222 - * plus the origin CPU, since reordering occurs in-kernel now. 223 - */ 224 - 225 - #define KMEMTRACE_USER_ALLOC 0 226 - #define KMEMTRACE_USER_FREE 1 227 - 228 - struct kmemtrace_user_event { 229 - u8 event_id; 230 - u8 type_id; 231 - u16 event_size; 232 - u32 cpu; 233 - u64 timestamp; 234 - unsigned long call_site; 235 - unsigned long ptr; 236 - }; 237 - 238 - struct kmemtrace_user_event_alloc { 239 - size_t bytes_req; 240 - size_t bytes_alloc; 241 - unsigned gfp_flags; 242 - int node; 243 - }; 244 - 245 - static enum print_line_t 246 - kmemtrace_print_alloc(struct trace_iterator *iter, int flags, 247 - struct trace_event *event) 248 - { 249 - struct trace_seq *s = &iter->seq; 250 - struct kmemtrace_alloc_entry *entry; 251 - int ret; 252 - 253 - trace_assign_type(entry, iter->ent); 254 - 255 - ret = trace_seq_printf(s, "type_id %d call_site %pF ptr %lu " 256 - "bytes_req %lu bytes_alloc %lu gfp_flags %lu node %d\n", 257 - entry->type_id, (void *)entry->call_site, (unsigned long)entry->ptr, 258 - (unsigned long)entry->bytes_req, (unsigned long)entry->bytes_alloc, 259 - (unsigned long)entry->gfp_flags, entry->node); 260 - 261 - if (!ret) 262 - return TRACE_TYPE_PARTIAL_LINE; 263 - return TRACE_TYPE_HANDLED; 264 - } 265 - 266 - static enum print_line_t 267 - kmemtrace_print_free(struct trace_iterator *iter, int flags, 268 - struct trace_event *event) 269 - { 270 - struct trace_seq *s = &iter->seq; 271 - struct kmemtrace_free_entry *entry; 272 - int ret; 273 - 
274 - trace_assign_type(entry, iter->ent); 275 - 276 - ret = trace_seq_printf(s, "type_id %d call_site %pF ptr %lu\n", 277 - entry->type_id, (void *)entry->call_site, 278 - (unsigned long)entry->ptr); 279 - 280 - if (!ret) 281 - return TRACE_TYPE_PARTIAL_LINE; 282 - return TRACE_TYPE_HANDLED; 283 - } 284 - 285 - static enum print_line_t 286 - kmemtrace_print_alloc_user(struct trace_iterator *iter, int flags, 287 - struct trace_event *event) 288 - { 289 - struct trace_seq *s = &iter->seq; 290 - struct kmemtrace_alloc_entry *entry; 291 - struct kmemtrace_user_event *ev; 292 - struct kmemtrace_user_event_alloc *ev_alloc; 293 - 294 - trace_assign_type(entry, iter->ent); 295 - 296 - ev = trace_seq_reserve(s, sizeof(*ev)); 297 - if (!ev) 298 - return TRACE_TYPE_PARTIAL_LINE; 299 - 300 - ev->event_id = KMEMTRACE_USER_ALLOC; 301 - ev->type_id = entry->type_id; 302 - ev->event_size = sizeof(*ev) + sizeof(*ev_alloc); 303 - ev->cpu = iter->cpu; 304 - ev->timestamp = iter->ts; 305 - ev->call_site = entry->call_site; 306 - ev->ptr = (unsigned long)entry->ptr; 307 - 308 - ev_alloc = trace_seq_reserve(s, sizeof(*ev_alloc)); 309 - if (!ev_alloc) 310 - return TRACE_TYPE_PARTIAL_LINE; 311 - 312 - ev_alloc->bytes_req = entry->bytes_req; 313 - ev_alloc->bytes_alloc = entry->bytes_alloc; 314 - ev_alloc->gfp_flags = entry->gfp_flags; 315 - ev_alloc->node = entry->node; 316 - 317 - return TRACE_TYPE_HANDLED; 318 - } 319 - 320 - static enum print_line_t 321 - kmemtrace_print_free_user(struct trace_iterator *iter, int flags, 322 - struct trace_event *event) 323 - { 324 - struct trace_seq *s = &iter->seq; 325 - struct kmemtrace_free_entry *entry; 326 - struct kmemtrace_user_event *ev; 327 - 328 - trace_assign_type(entry, iter->ent); 329 - 330 - ev = trace_seq_reserve(s, sizeof(*ev)); 331 - if (!ev) 332 - return TRACE_TYPE_PARTIAL_LINE; 333 - 334 - ev->event_id = KMEMTRACE_USER_FREE; 335 - ev->type_id = entry->type_id; 336 - ev->event_size = sizeof(*ev); 337 - ev->cpu = iter->cpu; 338 - 
ev->timestamp = iter->ts; 339 - ev->call_site = entry->call_site; 340 - ev->ptr = (unsigned long)entry->ptr; 341 - 342 - return TRACE_TYPE_HANDLED; 343 - } 344 - 345 - /* The two other following provide a more minimalistic output */ 346 - static enum print_line_t 347 - kmemtrace_print_alloc_compress(struct trace_iterator *iter) 348 - { 349 - struct kmemtrace_alloc_entry *entry; 350 - struct trace_seq *s = &iter->seq; 351 - int ret; 352 - 353 - trace_assign_type(entry, iter->ent); 354 - 355 - /* Alloc entry */ 356 - ret = trace_seq_printf(s, " + "); 357 - if (!ret) 358 - return TRACE_TYPE_PARTIAL_LINE; 359 - 360 - /* Type */ 361 - switch (entry->type_id) { 362 - case KMEMTRACE_TYPE_KMALLOC: 363 - ret = trace_seq_printf(s, "K "); 364 - break; 365 - case KMEMTRACE_TYPE_CACHE: 366 - ret = trace_seq_printf(s, "C "); 367 - break; 368 - case KMEMTRACE_TYPE_PAGES: 369 - ret = trace_seq_printf(s, "P "); 370 - break; 371 - default: 372 - ret = trace_seq_printf(s, "? "); 373 - } 374 - 375 - if (!ret) 376 - return TRACE_TYPE_PARTIAL_LINE; 377 - 378 - /* Requested */ 379 - ret = trace_seq_printf(s, "%4zu ", entry->bytes_req); 380 - if (!ret) 381 - return TRACE_TYPE_PARTIAL_LINE; 382 - 383 - /* Allocated */ 384 - ret = trace_seq_printf(s, "%4zu ", entry->bytes_alloc); 385 - if (!ret) 386 - return TRACE_TYPE_PARTIAL_LINE; 387 - 388 - /* Flags 389 - * TODO: would be better to see the name of the GFP flag names 390 - */ 391 - ret = trace_seq_printf(s, "%08x ", entry->gfp_flags); 392 - if (!ret) 393 - return TRACE_TYPE_PARTIAL_LINE; 394 - 395 - /* Pointer to allocated */ 396 - ret = trace_seq_printf(s, "0x%tx ", (ptrdiff_t)entry->ptr); 397 - if (!ret) 398 - return TRACE_TYPE_PARTIAL_LINE; 399 - 400 - /* Node and call site*/ 401 - ret = trace_seq_printf(s, "%4d %pf\n", entry->node, 402 - (void *)entry->call_site); 403 - if (!ret) 404 - return TRACE_TYPE_PARTIAL_LINE; 405 - 406 - return TRACE_TYPE_HANDLED; 407 - } 408 - 409 - static enum print_line_t 410 - 
kmemtrace_print_free_compress(struct trace_iterator *iter) 411 - { 412 - struct kmemtrace_free_entry *entry; 413 - struct trace_seq *s = &iter->seq; 414 - int ret; 415 - 416 - trace_assign_type(entry, iter->ent); 417 - 418 - /* Free entry */ 419 - ret = trace_seq_printf(s, " - "); 420 - if (!ret) 421 - return TRACE_TYPE_PARTIAL_LINE; 422 - 423 - /* Type */ 424 - switch (entry->type_id) { 425 - case KMEMTRACE_TYPE_KMALLOC: 426 - ret = trace_seq_printf(s, "K "); 427 - break; 428 - case KMEMTRACE_TYPE_CACHE: 429 - ret = trace_seq_printf(s, "C "); 430 - break; 431 - case KMEMTRACE_TYPE_PAGES: 432 - ret = trace_seq_printf(s, "P "); 433 - break; 434 - default: 435 - ret = trace_seq_printf(s, "? "); 436 - } 437 - 438 - if (!ret) 439 - return TRACE_TYPE_PARTIAL_LINE; 440 - 441 - /* Skip requested/allocated/flags */ 442 - ret = trace_seq_printf(s, " "); 443 - if (!ret) 444 - return TRACE_TYPE_PARTIAL_LINE; 445 - 446 - /* Pointer to allocated */ 447 - ret = trace_seq_printf(s, "0x%tx ", (ptrdiff_t)entry->ptr); 448 - if (!ret) 449 - return TRACE_TYPE_PARTIAL_LINE; 450 - 451 - /* Skip node and print call site*/ 452 - ret = trace_seq_printf(s, " %pf\n", (void *)entry->call_site); 453 - if (!ret) 454 - return TRACE_TYPE_PARTIAL_LINE; 455 - 456 - return TRACE_TYPE_HANDLED; 457 - } 458 - 459 - static enum print_line_t kmemtrace_print_line(struct trace_iterator *iter) 460 - { 461 - struct trace_entry *entry = iter->ent; 462 - 463 - if (!(kmem_tracer_flags.val & TRACE_KMEM_OPT_MINIMAL)) 464 - return TRACE_TYPE_UNHANDLED; 465 - 466 - switch (entry->type) { 467 - case TRACE_KMEM_ALLOC: 468 - return kmemtrace_print_alloc_compress(iter); 469 - case TRACE_KMEM_FREE: 470 - return kmemtrace_print_free_compress(iter); 471 - default: 472 - return TRACE_TYPE_UNHANDLED; 473 - } 474 - } 475 - 476 - static struct trace_event_functions kmem_trace_alloc_funcs = { 477 - .trace = kmemtrace_print_alloc, 478 - .binary = kmemtrace_print_alloc_user, 479 - }; 480 - 481 - static struct trace_event 
kmem_trace_alloc = { 482 - .type = TRACE_KMEM_ALLOC, 483 - .funcs = &kmem_trace_alloc_funcs, 484 - }; 485 - 486 - static struct trace_event_functions kmem_trace_free_funcs = { 487 - .trace = kmemtrace_print_free, 488 - .binary = kmemtrace_print_free_user, 489 - }; 490 - 491 - static struct trace_event kmem_trace_free = { 492 - .type = TRACE_KMEM_FREE, 493 - .funcs = &kmem_trace_free_funcs, 494 - }; 495 - 496 - static struct tracer kmem_tracer __read_mostly = { 497 - .name = "kmemtrace", 498 - .init = kmem_trace_init, 499 - .reset = kmem_trace_reset, 500 - .print_line = kmemtrace_print_line, 501 - .print_header = kmemtrace_headers, 502 - .flags = &kmem_tracer_flags 503 - }; 504 - 505 - void kmemtrace_init(void) 506 - { 507 - /* earliest opportunity to start kmem tracing */ 508 - } 509 - 510 - static int __init init_kmem_tracer(void) 511 - { 512 - if (!register_ftrace_event(&kmem_trace_alloc)) { 513 - pr_warning("Warning: could not register kmem events\n"); 514 - return 1; 515 - } 516 - 517 - if (!register_ftrace_event(&kmem_trace_free)) { 518 - pr_warning("Warning: could not register kmem events\n"); 519 - return 1; 520 - } 521 - 522 - if (register_tracer(&kmem_tracer) != 0) { 523 - pr_warning("Warning: could not register the kmem tracer\n"); 524 - return 1; 525 - } 526 - 527 - return 0; 528 - } 529 - device_initcall(init_kmem_tracer);
+9 -31
kernel/trace/ring_buffer.c
··· 443 443 */ 444 444 struct ring_buffer_per_cpu { 445 445 int cpu; 446 + atomic_t record_disabled; 446 447 struct ring_buffer *buffer; 447 448 spinlock_t reader_lock; /* serialize readers */ 448 449 arch_spinlock_t lock; ··· 463 462 unsigned long read; 464 463 u64 write_stamp; 465 464 u64 read_stamp; 466 - atomic_t record_disabled; 467 465 }; 468 466 469 467 struct ring_buffer { ··· 2242 2242 2243 2243 #endif 2244 2244 2245 - static DEFINE_PER_CPU(int, rb_need_resched); 2246 - 2247 2245 /** 2248 2246 * ring_buffer_lock_reserve - reserve a part of the buffer 2249 2247 * @buffer: the ring buffer to reserve from ··· 2262 2264 { 2263 2265 struct ring_buffer_per_cpu *cpu_buffer; 2264 2266 struct ring_buffer_event *event; 2265 - int cpu, resched; 2267 + int cpu; 2266 2268 2267 2269 if (ring_buffer_flags != RB_BUFFERS_ON) 2268 2270 return NULL; 2269 2271 2270 2272 /* If we are tracing schedule, we don't want to recurse */ 2271 - resched = ftrace_preempt_disable(); 2273 + preempt_disable_notrace(); 2272 2274 2273 2275 if (atomic_read(&buffer->record_disabled)) 2274 2276 goto out_nocheck; ··· 2293 2295 if (!event) 2294 2296 goto out; 2295 2297 2296 - /* 2297 - * Need to store resched state on this cpu. 2298 - * Only the first needs to. 2299 - */ 2300 - 2301 - if (preempt_count() == 1) 2302 - per_cpu(rb_need_resched, cpu) = resched; 2303 - 2304 2298 return event; 2305 2299 2306 2300 out: 2307 2301 trace_recursive_unlock(); 2308 2302 2309 2303 out_nocheck: 2310 - ftrace_preempt_enable(resched); 2304 + preempt_enable_notrace(); 2311 2305 return NULL; 2312 2306 } 2313 2307 EXPORT_SYMBOL_GPL(ring_buffer_lock_reserve); ··· 2345 2355 2346 2356 trace_recursive_unlock(); 2347 2357 2348 - /* 2349 - * Only the last preempt count needs to restore preemption. 
2350 - */ 2351 - if (preempt_count() == 1) 2352 - ftrace_preempt_enable(per_cpu(rb_need_resched, cpu)); 2353 - else 2354 - preempt_enable_no_resched_notrace(); 2358 + preempt_enable_notrace(); 2355 2359 2356 2360 return 0; 2357 2361 } ··· 2453 2469 2454 2470 trace_recursive_unlock(); 2455 2471 2456 - /* 2457 - * Only the last preempt count needs to restore preemption. 2458 - */ 2459 - if (preempt_count() == 1) 2460 - ftrace_preempt_enable(per_cpu(rb_need_resched, cpu)); 2461 - else 2462 - preempt_enable_no_resched_notrace(); 2472 + preempt_enable_notrace(); 2463 2473 2464 2474 } 2465 2475 EXPORT_SYMBOL_GPL(ring_buffer_discard_commit); ··· 2479 2501 struct ring_buffer_event *event; 2480 2502 void *body; 2481 2503 int ret = -EBUSY; 2482 - int cpu, resched; 2504 + int cpu; 2483 2505 2484 2506 if (ring_buffer_flags != RB_BUFFERS_ON) 2485 2507 return -EBUSY; 2486 2508 2487 - resched = ftrace_preempt_disable(); 2509 + preempt_disable_notrace(); 2488 2510 2489 2511 if (atomic_read(&buffer->record_disabled)) 2490 2512 goto out; ··· 2514 2536 2515 2537 ret = 0; 2516 2538 out: 2517 - ftrace_preempt_enable(resched); 2539 + preempt_enable_notrace(); 2518 2540 2519 2541 return ret; 2520 2542 }
+55 -72
kernel/trace/trace.c
··· 341 341 /* trace_flags holds trace_options default values */ 342 342 unsigned long trace_flags = TRACE_ITER_PRINT_PARENT | TRACE_ITER_PRINTK | 343 343 TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | TRACE_ITER_SLEEP_TIME | 344 - TRACE_ITER_GRAPH_TIME; 344 + TRACE_ITER_GRAPH_TIME | TRACE_ITER_RECORD_CMD; 345 345 346 346 static int trace_stop_count; 347 347 static DEFINE_SPINLOCK(tracing_start_lock); ··· 425 425 "latency-format", 426 426 "sleep-time", 427 427 "graph-time", 428 + "record-cmd", 428 429 NULL 429 430 }; 430 431 ··· 657 656 return; 658 657 659 658 WARN_ON_ONCE(!irqs_disabled()); 659 + if (!current_trace->use_max_tr) { 660 + WARN_ON_ONCE(1); 661 + return; 662 + } 660 663 arch_spin_lock(&ftrace_max_lock); 661 664 662 665 tr->buffer = max_tr.buffer; ··· 687 682 return; 688 683 689 684 WARN_ON_ONCE(!irqs_disabled()); 685 + if (!current_trace->use_max_tr) { 686 + WARN_ON_ONCE(1); 687 + return; 688 + } 689 + 690 690 arch_spin_lock(&ftrace_max_lock); 691 691 692 692 ftrace_disable_cpu(); ··· 736 726 return -1; 737 727 } 738 728 739 - if (strlen(type->name) > MAX_TRACER_SIZE) { 729 + if (strlen(type->name) >= MAX_TRACER_SIZE) { 740 730 pr_info("Tracer has a name longer than %d\n", MAX_TRACER_SIZE); 741 731 return -1; 742 732 } ··· 1338 1328 1339 1329 #endif /* CONFIG_STACKTRACE */ 1340 1330 1341 - static void 1342 - ftrace_trace_special(void *__tr, 1343 - unsigned long arg1, unsigned long arg2, unsigned long arg3, 1344 - int pc) 1345 - { 1346 - struct ftrace_event_call *call = &event_special; 1347 - struct ring_buffer_event *event; 1348 - struct trace_array *tr = __tr; 1349 - struct ring_buffer *buffer = tr->buffer; 1350 - struct special_entry *entry; 1351 - 1352 - event = trace_buffer_lock_reserve(buffer, TRACE_SPECIAL, 1353 - sizeof(*entry), 0, pc); 1354 - if (!event) 1355 - return; 1356 - entry = ring_buffer_event_data(event); 1357 - entry->arg1 = arg1; 1358 - entry->arg2 = arg2; 1359 - entry->arg3 = arg3; 1360 - 1361 - if (!filter_check_discard(call, 
entry, buffer, event)) 1362 - trace_buffer_unlock_commit(buffer, event, 0, pc); 1363 - } 1364 - 1365 - void 1366 - __trace_special(void *__tr, void *__data, 1367 - unsigned long arg1, unsigned long arg2, unsigned long arg3) 1368 - { 1369 - ftrace_trace_special(__tr, arg1, arg2, arg3, preempt_count()); 1370 - } 1371 - 1372 - void 1373 - ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3) 1374 - { 1375 - struct trace_array *tr = &global_trace; 1376 - struct trace_array_cpu *data; 1377 - unsigned long flags; 1378 - int cpu; 1379 - int pc; 1380 - 1381 - if (tracing_disabled) 1382 - return; 1383 - 1384 - pc = preempt_count(); 1385 - local_irq_save(flags); 1386 - cpu = raw_smp_processor_id(); 1387 - data = tr->data[cpu]; 1388 - 1389 - if (likely(atomic_inc_return(&data->disabled) == 1)) 1390 - ftrace_trace_special(tr, arg1, arg2, arg3, pc); 1391 - 1392 - atomic_dec(&data->disabled); 1393 - local_irq_restore(flags); 1394 - } 1395 - 1396 1331 /** 1397 1332 * trace_vbprintk - write binary msg to tracing buffer 1398 1333 * ··· 1356 1401 struct bprint_entry *entry; 1357 1402 unsigned long flags; 1358 1403 int disable; 1359 - int resched; 1360 1404 int cpu, len = 0, size, pc; 1361 1405 1362 1406 if (unlikely(tracing_selftest_running || tracing_disabled)) ··· 1365 1411 pause_graph_tracing(); 1366 1412 1367 1413 pc = preempt_count(); 1368 - resched = ftrace_preempt_disable(); 1414 + preempt_disable_notrace(); 1369 1415 cpu = raw_smp_processor_id(); 1370 1416 data = tr->data[cpu]; 1371 1417 ··· 1403 1449 1404 1450 out: 1405 1451 atomic_dec_return(&data->disabled); 1406 - ftrace_preempt_enable(resched); 1452 + preempt_enable_notrace(); 1407 1453 unpause_graph_tracing(); 1408 1454 1409 1455 return len; ··· 2340 2386 .open = show_traces_open, 2341 2387 .read = seq_read, 2342 2388 .release = seq_release, 2389 + .llseek = seq_lseek, 2343 2390 }; 2344 2391 2345 2392 /* ··· 2434 2479 .open = tracing_open_generic, 2435 2480 .read = tracing_cpumask_read, 2436 2481 
.write = tracing_cpumask_write, 2482 + .llseek = generic_file_llseek, 2437 2483 }; 2438 2484 2439 2485 static int tracing_trace_options_show(struct seq_file *m, void *v) ··· 2510 2554 trace_flags |= mask; 2511 2555 else 2512 2556 trace_flags &= ~mask; 2557 + 2558 + if (mask == TRACE_ITER_RECORD_CMD) 2559 + trace_event_enable_cmd_record(enabled); 2513 2560 } 2514 2561 2515 2562 static ssize_t ··· 2604 2645 static const struct file_operations tracing_readme_fops = { 2605 2646 .open = tracing_open_generic, 2606 2647 .read = tracing_readme_read, 2648 + .llseek = generic_file_llseek, 2607 2649 }; 2608 2650 2609 2651 static ssize_t ··· 2655 2695 static const struct file_operations tracing_saved_cmdlines_fops = { 2656 2696 .open = tracing_open_generic, 2657 2697 .read = tracing_saved_cmdlines_read, 2698 + .llseek = generic_file_llseek, 2658 2699 }; 2659 2700 2660 2701 static ssize_t ··· 2751 2790 if (ret < 0) 2752 2791 return ret; 2753 2792 2793 + if (!current_trace->use_max_tr) 2794 + goto out; 2795 + 2754 2796 ret = ring_buffer_resize(max_tr.buffer, size); 2755 2797 if (ret < 0) { 2756 2798 int r; ··· 2781 2817 return ret; 2782 2818 } 2783 2819 2820 + max_tr.entries = size; 2821 + out: 2784 2822 global_trace.entries = size; 2785 2823 2786 2824 return ret; 2787 2825 } 2826 + 2788 2827 2789 2828 /** 2790 2829 * tracing_update_buffers - used by tracing facility to expand ring buffers ··· 2849 2882 trace_branch_disable(); 2850 2883 if (current_trace && current_trace->reset) 2851 2884 current_trace->reset(tr); 2852 - 2885 + if (current_trace && current_trace->use_max_tr) { 2886 + /* 2887 + * We don't free the ring buffer; instead, we resize it because 2888 + * the max_tr ring buffer has some state (e.g. ring->clock) and 2889 + * we want to preserve it. 
2890 + */ 2891 + ring_buffer_resize(max_tr.buffer, 1); 2892 + max_tr.entries = 1; 2893 + } 2853 2894 destroy_trace_option_files(topts); 2854 2895 2855 2896 current_trace = t; 2856 2897 2857 2898 topts = create_trace_option_files(current_trace); 2899 + if (current_trace->use_max_tr) { 2900 + ret = ring_buffer_resize(max_tr.buffer, global_trace.entries); 2901 + if (ret < 0) 2902 + goto out; 2903 + max_tr.entries = global_trace.entries; 2904 + } 2858 2905 2859 2906 if (t->init) { 2860 2907 ret = tracer_init(t, tr); ··· 3005 3024 if (iter->trace->pipe_open) 3006 3025 iter->trace->pipe_open(iter); 3007 3026 3027 + nonseekable_open(inode, filp); 3008 3028 out: 3009 3029 mutex_unlock(&trace_types_lock); 3010 3030 return ret; ··· 3451 3469 } 3452 3470 3453 3471 tracing_start(); 3454 - max_tr.entries = global_trace.entries; 3455 3472 mutex_unlock(&trace_types_lock); 3456 3473 3457 3474 return cnt; ··· 3563 3582 .open = tracing_open_generic, 3564 3583 .read = tracing_max_lat_read, 3565 3584 .write = tracing_max_lat_write, 3585 + .llseek = generic_file_llseek, 3566 3586 }; 3567 3587 3568 3588 static const struct file_operations tracing_ctrl_fops = { 3569 3589 .open = tracing_open_generic, 3570 3590 .read = tracing_ctrl_read, 3571 3591 .write = tracing_ctrl_write, 3592 + .llseek = generic_file_llseek, 3572 3593 }; 3573 3594 3574 3595 static const struct file_operations set_tracer_fops = { 3575 3596 .open = tracing_open_generic, 3576 3597 .read = tracing_set_trace_read, 3577 3598 .write = tracing_set_trace_write, 3599 + .llseek = generic_file_llseek, 3578 3600 }; 3579 3601 3580 3602 static const struct file_operations tracing_pipe_fops = { ··· 3586 3602 .read = tracing_read_pipe, 3587 3603 .splice_read = tracing_splice_read_pipe, 3588 3604 .release = tracing_release_pipe, 3605 + .llseek = no_llseek, 3589 3606 }; 3590 3607 3591 3608 static const struct file_operations tracing_entries_fops = { 3592 3609 .open = tracing_open_generic, 3593 3610 .read = tracing_entries_read, 3594 
3611 .write = tracing_entries_write, 3612 + .llseek = generic_file_llseek, 3595 3613 }; 3596 3614 3597 3615 static const struct file_operations tracing_mark_fops = { 3598 3616 .open = tracing_open_generic, 3599 3617 .write = tracing_mark_write, 3618 + .llseek = generic_file_llseek, 3600 3619 }; 3601 3620 3602 3621 static const struct file_operations trace_clock_fops = { ··· 3905 3918 static const struct file_operations tracing_stats_fops = { 3906 3919 .open = tracing_open_generic, 3907 3920 .read = tracing_stats_read, 3921 + .llseek = generic_file_llseek, 3908 3922 }; 3909 3923 3910 3924 #ifdef CONFIG_DYNAMIC_FTRACE ··· 3942 3954 static const struct file_operations tracing_dyn_info_fops = { 3943 3955 .open = tracing_open_generic, 3944 3956 .read = tracing_read_dyn_info, 3957 + .llseek = generic_file_llseek, 3945 3958 }; 3946 3959 #endif 3947 3960 ··· 4096 4107 .open = tracing_open_generic, 4097 4108 .read = trace_options_read, 4098 4109 .write = trace_options_write, 4110 + .llseek = generic_file_llseek, 4099 4111 }; 4100 4112 4101 4113 static ssize_t ··· 4148 4158 .open = tracing_open_generic, 4149 4159 .read = trace_options_core_read, 4150 4160 .write = trace_options_core_write, 4161 + .llseek = generic_file_llseek, 4151 4162 }; 4152 4163 4153 4164 struct dentry *trace_create_file(const char *name, ··· 4337 4346 #ifdef CONFIG_DYNAMIC_FTRACE 4338 4347 trace_create_file("dyn_ftrace_total_info", 0444, d_tracer, 4339 4348 &ftrace_update_tot_cnt, &tracing_dyn_info_fops); 4340 - #endif 4341 - #ifdef CONFIG_SYSPROF_TRACER 4342 - init_tracer_sysprof_debugfs(d_tracer); 4343 4349 #endif 4344 4350 4345 4351 create_trace_options_dir(); ··· 4564 4576 4565 4577 4566 4578 #ifdef CONFIG_TRACER_MAX_TRACE 4567 - max_tr.buffer = ring_buffer_alloc(ring_buf_size, 4568 - TRACE_BUFFER_FLAGS); 4579 + max_tr.buffer = ring_buffer_alloc(1, TRACE_BUFFER_FLAGS); 4569 4580 if (!max_tr.buffer) { 4570 4581 printk(KERN_ERR "tracer: failed to allocate max ring buffer!\n"); 4571 4582 WARN_ON(1); 
4572 4583 ring_buffer_free(global_trace.buffer); 4573 4584 goto out_free_cpumask; 4574 4585 } 4575 - max_tr.entries = ring_buffer_size(max_tr.buffer); 4576 - WARN_ON(max_tr.entries != global_trace.entries); 4586 + max_tr.entries = 1; 4577 4587 #endif 4578 4588 4579 4589 /* Allocate the first page for all buffers */ ··· 4584 4598 4585 4599 register_tracer(&nop_trace); 4586 4600 current_trace = &nop_trace; 4587 - #ifdef CONFIG_BOOT_TRACER 4588 - register_tracer(&boot_tracer); 4589 - #endif 4590 4601 /* All seems OK, enable tracing */ 4591 4602 tracing_disabled = 0; 4592 4603
+6 -84
kernel/trace/trace.h
··· 9 9 #include <linux/mmiotrace.h> 10 10 #include <linux/tracepoint.h> 11 11 #include <linux/ftrace.h> 12 - #include <trace/boot.h> 13 - #include <linux/kmemtrace.h> 14 12 #include <linux/hw_breakpoint.h> 15 - 16 13 #include <linux/trace_seq.h> 17 14 #include <linux/ftrace_event.h> 18 15 ··· 22 25 TRACE_STACK, 23 26 TRACE_PRINT, 24 27 TRACE_BPRINT, 25 - TRACE_SPECIAL, 26 28 TRACE_MMIO_RW, 27 29 TRACE_MMIO_MAP, 28 30 TRACE_BRANCH, 29 - TRACE_BOOT_CALL, 30 - TRACE_BOOT_RET, 31 31 TRACE_GRAPH_RET, 32 32 TRACE_GRAPH_ENT, 33 33 TRACE_USER_STACK, 34 - TRACE_KMEM_ALLOC, 35 - TRACE_KMEM_FREE, 36 34 TRACE_BLK, 37 - TRACE_KSYM, 38 35 39 36 __TRACE_LAST_TYPE, 40 37 }; 41 38 42 - enum kmemtrace_type_id { 43 - KMEMTRACE_TYPE_KMALLOC = 0, /* kmalloc() or kfree(). */ 44 - KMEMTRACE_TYPE_CACHE, /* kmem_cache_*(). */ 45 - KMEMTRACE_TYPE_PAGES, /* __get_free_pages() and friends. */ 46 - }; 47 - 48 - extern struct tracer boot_tracer; 49 39 50 40 #undef __field 51 41 #define __field(type, item) type item; ··· 188 204 IF_ASSIGN(var, ent, struct userstack_entry, TRACE_USER_STACK);\ 189 205 IF_ASSIGN(var, ent, struct print_entry, TRACE_PRINT); \ 190 206 IF_ASSIGN(var, ent, struct bprint_entry, TRACE_BPRINT); \ 191 - IF_ASSIGN(var, ent, struct special_entry, 0); \ 192 207 IF_ASSIGN(var, ent, struct trace_mmiotrace_rw, \ 193 208 TRACE_MMIO_RW); \ 194 209 IF_ASSIGN(var, ent, struct trace_mmiotrace_map, \ 195 210 TRACE_MMIO_MAP); \ 196 - IF_ASSIGN(var, ent, struct trace_boot_call, TRACE_BOOT_CALL);\ 197 - IF_ASSIGN(var, ent, struct trace_boot_ret, TRACE_BOOT_RET);\ 198 211 IF_ASSIGN(var, ent, struct trace_branch, TRACE_BRANCH); \ 199 212 IF_ASSIGN(var, ent, struct ftrace_graph_ent_entry, \ 200 213 TRACE_GRAPH_ENT); \ 201 214 IF_ASSIGN(var, ent, struct ftrace_graph_ret_entry, \ 202 215 TRACE_GRAPH_RET); \ 203 - IF_ASSIGN(var, ent, struct kmemtrace_alloc_entry, \ 204 - TRACE_KMEM_ALLOC); \ 205 - IF_ASSIGN(var, ent, struct kmemtrace_free_entry, \ 206 - TRACE_KMEM_FREE); \ 207 - IF_ASSIGN(var, 
ent, struct ksym_trace_entry, TRACE_KSYM);\ 208 216 __ftrace_bad_type(); \ 209 217 } while (0) 210 218 ··· 274 298 struct tracer *next; 275 299 int print_max; 276 300 struct tracer_flags *flags; 301 + int use_max_tr; 277 302 }; 278 303 279 304 ··· 295 318 const struct file_operations *fops); 296 319 297 320 struct dentry *tracing_init_dentry(void); 298 - void init_tracer_sysprof_debugfs(struct dentry *d_tracer); 299 321 300 322 struct ring_buffer_event; 301 323 ··· 339 363 struct task_struct *wakee, 340 364 struct task_struct *cur, 341 365 unsigned long flags, int pc); 342 - void trace_special(struct trace_array *tr, 343 - struct trace_array_cpu *data, 344 - unsigned long arg1, 345 - unsigned long arg2, 346 - unsigned long arg3, int pc); 347 366 void trace_function(struct trace_array *tr, 348 367 unsigned long ip, 349 368 unsigned long parent_ip, ··· 368 397 369 398 #define for_each_tracing_cpu(cpu) \ 370 399 for_each_cpu(cpu, tracing_buffer_mask) 371 - 372 - extern int process_new_ksym_entry(char *ksymname, int op, unsigned long addr); 373 400 374 401 extern unsigned long nsecs_to_usecs(unsigned long nsecs); 375 402 ··· 438 469 struct trace_array *tr); 439 470 extern int trace_selftest_startup_sched_switch(struct tracer *trace, 440 471 struct trace_array *tr); 441 - extern int trace_selftest_startup_sysprof(struct tracer *trace, 442 - struct trace_array *tr); 443 472 extern int trace_selftest_startup_branch(struct tracer *trace, 444 - struct trace_array *tr); 445 - extern int trace_selftest_startup_ksym(struct tracer *trace, 446 473 struct trace_array *tr); 447 474 #endif /* CONFIG_FTRACE_STARTUP_TEST */ 448 475 ··· 601 636 TRACE_ITER_LATENCY_FMT = 0x20000, 602 637 TRACE_ITER_SLEEP_TIME = 0x40000, 603 638 TRACE_ITER_GRAPH_TIME = 0x80000, 639 + TRACE_ITER_RECORD_CMD = 0x100000, 604 640 }; 605 641 606 642 /* ··· 612 646 (TRACE_ITER_PRINT_PARENT|TRACE_ITER_SYM_OFFSET|TRACE_ITER_SYM_ADDR) 613 647 614 648 extern struct tracer nop_trace; 615 - 616 - /** 617 - * 
ftrace_preempt_disable - disable preemption scheduler safe 618 - * 619 - * When tracing can happen inside the scheduler, there exists 620 - * cases that the tracing might happen before the need_resched 621 - * flag is checked. If this happens and the tracer calls 622 - * preempt_enable (after a disable), a schedule might take place 623 - * causing an infinite recursion. 624 - * 625 - * To prevent this, we read the need_resched flag before 626 - * disabling preemption. When we want to enable preemption we 627 - * check the flag, if it is set, then we call preempt_enable_no_resched. 628 - * Otherwise, we call preempt_enable. 629 - * 630 - * The rational for doing the above is that if need_resched is set 631 - * and we have yet to reschedule, we are either in an atomic location 632 - * (where we do not need to check for scheduling) or we are inside 633 - * the scheduler and do not want to resched. 634 - */ 635 - static inline int ftrace_preempt_disable(void) 636 - { 637 - int resched; 638 - 639 - resched = need_resched(); 640 - preempt_disable_notrace(); 641 - 642 - return resched; 643 - } 644 - 645 - /** 646 - * ftrace_preempt_enable - enable preemption scheduler safe 647 - * @resched: the return value from ftrace_preempt_disable 648 - * 649 - * This is a scheduler safe way to enable preemption and not miss 650 - * any preemption checks. The disabled saved the state of preemption. 651 - * If resched is set, then we are either inside an atomic or 652 - * are inside the scheduler (we would have already scheduled 653 - * otherwise). In this case, we do not want to call normal 654 - * preempt_enable, but preempt_enable_no_resched instead. 
655 - */ 656 - static inline void ftrace_preempt_enable(int resched) 657 - { 658 - if (resched) 659 - preempt_enable_no_resched_notrace(); 660 - else 661 - preempt_enable_notrace(); 662 - } 663 649 664 650 #ifdef CONFIG_BRANCH_TRACER 665 651 extern int enable_branch_tracing(struct trace_array *tr); ··· 703 785 int pop_n; 704 786 }; 705 787 788 + extern struct list_head ftrace_common_fields; 789 + 706 790 extern enum regex_type 707 791 filter_parse_regex(char *buff, int len, char **search, int *not); 708 792 extern void print_event_filter(struct ftrace_event_call *call, ··· 733 813 734 814 return 0; 735 815 } 816 + 817 + extern void trace_event_enable_cmd_record(bool enable); 736 818 737 819 extern struct mutex event_mutex; 738 820 extern struct list_head ftrace_events;
-185
kernel/trace/trace_boot.c
··· 1 - /* 2 - * ring buffer based initcalls tracer 3 - * 4 - * Copyright (C) 2008 Frederic Weisbecker <fweisbec@gmail.com> 5 - * 6 - */ 7 - 8 - #include <linux/init.h> 9 - #include <linux/debugfs.h> 10 - #include <linux/ftrace.h> 11 - #include <linux/kallsyms.h> 12 - #include <linux/time.h> 13 - 14 - #include "trace.h" 15 - #include "trace_output.h" 16 - 17 - static struct trace_array *boot_trace; 18 - static bool pre_initcalls_finished; 19 - 20 - /* Tells the boot tracer that the pre_smp_initcalls are finished. 21 - * So we are ready . 22 - * It doesn't enable sched events tracing however. 23 - * You have to call enable_boot_trace to do so. 24 - */ 25 - void start_boot_trace(void) 26 - { 27 - pre_initcalls_finished = true; 28 - } 29 - 30 - void enable_boot_trace(void) 31 - { 32 - if (boot_trace && pre_initcalls_finished) 33 - tracing_start_sched_switch_record(); 34 - } 35 - 36 - void disable_boot_trace(void) 37 - { 38 - if (boot_trace && pre_initcalls_finished) 39 - tracing_stop_sched_switch_record(); 40 - } 41 - 42 - static int boot_trace_init(struct trace_array *tr) 43 - { 44 - boot_trace = tr; 45 - 46 - if (!tr) 47 - return 0; 48 - 49 - tracing_reset_online_cpus(tr); 50 - 51 - tracing_sched_switch_assign_trace(tr); 52 - return 0; 53 - } 54 - 55 - static enum print_line_t 56 - initcall_call_print_line(struct trace_iterator *iter) 57 - { 58 - struct trace_entry *entry = iter->ent; 59 - struct trace_seq *s = &iter->seq; 60 - struct trace_boot_call *field; 61 - struct boot_trace_call *call; 62 - u64 ts; 63 - unsigned long nsec_rem; 64 - int ret; 65 - 66 - trace_assign_type(field, entry); 67 - call = &field->boot_call; 68 - ts = iter->ts; 69 - nsec_rem = do_div(ts, NSEC_PER_SEC); 70 - 71 - ret = trace_seq_printf(s, "[%5ld.%09ld] calling %s @ %i\n", 72 - (unsigned long)ts, nsec_rem, call->func, call->caller); 73 - 74 - if (!ret) 75 - return TRACE_TYPE_PARTIAL_LINE; 76 - else 77 - return TRACE_TYPE_HANDLED; 78 - } 79 - 80 - static enum print_line_t 81 - 
initcall_ret_print_line(struct trace_iterator *iter) 82 - { 83 - struct trace_entry *entry = iter->ent; 84 - struct trace_seq *s = &iter->seq; 85 - struct trace_boot_ret *field; 86 - struct boot_trace_ret *init_ret; 87 - u64 ts; 88 - unsigned long nsec_rem; 89 - int ret; 90 - 91 - trace_assign_type(field, entry); 92 - init_ret = &field->boot_ret; 93 - ts = iter->ts; 94 - nsec_rem = do_div(ts, NSEC_PER_SEC); 95 - 96 - ret = trace_seq_printf(s, "[%5ld.%09ld] initcall %s " 97 - "returned %d after %llu msecs\n", 98 - (unsigned long) ts, 99 - nsec_rem, 100 - init_ret->func, init_ret->result, init_ret->duration); 101 - 102 - if (!ret) 103 - return TRACE_TYPE_PARTIAL_LINE; 104 - else 105 - return TRACE_TYPE_HANDLED; 106 - } 107 - 108 - static enum print_line_t initcall_print_line(struct trace_iterator *iter) 109 - { 110 - struct trace_entry *entry = iter->ent; 111 - 112 - switch (entry->type) { 113 - case TRACE_BOOT_CALL: 114 - return initcall_call_print_line(iter); 115 - case TRACE_BOOT_RET: 116 - return initcall_ret_print_line(iter); 117 - default: 118 - return TRACE_TYPE_UNHANDLED; 119 - } 120 - } 121 - 122 - struct tracer boot_tracer __read_mostly = 123 - { 124 - .name = "initcall", 125 - .init = boot_trace_init, 126 - .reset = tracing_reset_online_cpus, 127 - .print_line = initcall_print_line, 128 - }; 129 - 130 - void trace_boot_call(struct boot_trace_call *bt, initcall_t fn) 131 - { 132 - struct ftrace_event_call *call = &event_boot_call; 133 - struct ring_buffer_event *event; 134 - struct ring_buffer *buffer; 135 - struct trace_boot_call *entry; 136 - struct trace_array *tr = boot_trace; 137 - 138 - if (!tr || !pre_initcalls_finished) 139 - return; 140 - 141 - /* Get its name now since this function could 142 - * disappear because it is in the .init section. 
143 - */ 144 - sprint_symbol(bt->func, (unsigned long)fn); 145 - preempt_disable(); 146 - 147 - buffer = tr->buffer; 148 - event = trace_buffer_lock_reserve(buffer, TRACE_BOOT_CALL, 149 - sizeof(*entry), 0, 0); 150 - if (!event) 151 - goto out; 152 - entry = ring_buffer_event_data(event); 153 - entry->boot_call = *bt; 154 - if (!filter_check_discard(call, entry, buffer, event)) 155 - trace_buffer_unlock_commit(buffer, event, 0, 0); 156 - out: 157 - preempt_enable(); 158 - } 159 - 160 - void trace_boot_ret(struct boot_trace_ret *bt, initcall_t fn) 161 - { 162 - struct ftrace_event_call *call = &event_boot_ret; 163 - struct ring_buffer_event *event; 164 - struct ring_buffer *buffer; 165 - struct trace_boot_ret *entry; 166 - struct trace_array *tr = boot_trace; 167 - 168 - if (!tr || !pre_initcalls_finished) 169 - return; 170 - 171 - sprint_symbol(bt->func, (unsigned long)fn); 172 - preempt_disable(); 173 - 174 - buffer = tr->buffer; 175 - event = trace_buffer_lock_reserve(buffer, TRACE_BOOT_RET, 176 - sizeof(*entry), 0, 0); 177 - if (!event) 178 - goto out; 179 - entry = ring_buffer_event_data(event); 180 - entry->boot_ret = *bt; 181 - if (!filter_check_discard(call, entry, buffer, event)) 182 - trace_buffer_unlock_commit(buffer, event, 0, 0); 183 - out: 184 - preempt_enable(); 185 - }
+2 -3
kernel/trace/trace_clock.c
··· 32 32 u64 notrace trace_clock_local(void) 33 33 { 34 34 u64 clock; 35 - int resched; 36 35 37 36 /* 38 37 * sched_clock() is an architecture implemented, fast, scalable, 39 38 * lockless clock. It is not guaranteed to be coherent across 40 39 * CPUs, nor across CPU idle events. 41 40 */ 42 - resched = ftrace_preempt_disable(); 41 + preempt_disable_notrace(); 43 42 clock = sched_clock(); 44 - ftrace_preempt_enable(resched); 43 + preempt_enable_notrace(); 45 44 46 45 return clock; 47 46 }
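The trace_clock.c hunk above replaces `ftrace_preempt_disable()`/`ftrace_preempt_enable()` with plain `preempt_disable_notrace()`/`preempt_enable_notrace()`; the removed helpers (whose docstring in trace.h explained the need_resched save/restore dance) are no longer required. A userspace sketch of the simplified pattern, with a plain integer standing in for the kernel's per-CPU preempt count:

```c
#include <assert.h>

/* Userspace model of the simplified pattern: the old
 * ftrace_preempt_disable() saved the need_resched state and chose
 * between preempt_enable() and preempt_enable_no_resched(); after this
 * series a plain notrace disable/enable pair suffices. preempt_count
 * and the *_model() names below are stand-ins, not kernel APIs. */
static int preempt_count_model;

static void preempt_disable_notrace_model(void) { preempt_count_model++; }
static void preempt_enable_notrace_model(void)  { preempt_count_model--; }

static unsigned long long read_clock_model(void)
{
	unsigned long long clock;

	preempt_disable_notrace_model();	/* no saved resched flag */
	clock = 42;				/* stands in for sched_clock() */
	preempt_enable_notrace_model();
	return clock;
}
```

The point of the pattern is unchanged: the clock read happens inside a balanced disable/enable pair so it cannot migrate CPUs mid-read; only the resched bookkeeping went away.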
-94
kernel/trace/trace_entries.h
··· 151 151 ); 152 152 153 153 /* 154 - * Special (free-form) trace entry: 155 - */ 156 - FTRACE_ENTRY(special, special_entry, 157 - 158 - TRACE_SPECIAL, 159 - 160 - F_STRUCT( 161 - __field( unsigned long, arg1 ) 162 - __field( unsigned long, arg2 ) 163 - __field( unsigned long, arg3 ) 164 - ), 165 - 166 - F_printk("(%08lx) (%08lx) (%08lx)", 167 - __entry->arg1, __entry->arg2, __entry->arg3) 168 - ); 169 - 170 - /* 171 154 * Stack-trace entry: 172 155 */ 173 156 ··· 254 271 __entry->map_id, __entry->opcode) 255 272 ); 256 273 257 - FTRACE_ENTRY(boot_call, trace_boot_call, 258 - 259 - TRACE_BOOT_CALL, 260 - 261 - F_STRUCT( 262 - __field_struct( struct boot_trace_call, boot_call ) 263 - __field_desc( pid_t, boot_call, caller ) 264 - __array_desc( char, boot_call, func, KSYM_SYMBOL_LEN) 265 - ), 266 - 267 - F_printk("%d %s", __entry->caller, __entry->func) 268 - ); 269 - 270 - FTRACE_ENTRY(boot_ret, trace_boot_ret, 271 - 272 - TRACE_BOOT_RET, 273 - 274 - F_STRUCT( 275 - __field_struct( struct boot_trace_ret, boot_ret ) 276 - __array_desc( char, boot_ret, func, KSYM_SYMBOL_LEN) 277 - __field_desc( int, boot_ret, result ) 278 - __field_desc( unsigned long, boot_ret, duration ) 279 - ), 280 - 281 - F_printk("%s %d %lx", 282 - __entry->func, __entry->result, __entry->duration) 283 - ); 284 274 285 275 #define TRACE_FUNC_SIZE 30 286 276 #define TRACE_FILE_SIZE 20 ··· 274 318 __entry->func, __entry->file, __entry->correct) 275 319 ); 276 320 277 - FTRACE_ENTRY(kmem_alloc, kmemtrace_alloc_entry, 278 - 279 - TRACE_KMEM_ALLOC, 280 - 281 - F_STRUCT( 282 - __field( enum kmemtrace_type_id, type_id ) 283 - __field( unsigned long, call_site ) 284 - __field( const void *, ptr ) 285 - __field( size_t, bytes_req ) 286 - __field( size_t, bytes_alloc ) 287 - __field( gfp_t, gfp_flags ) 288 - __field( int, node ) 289 - ), 290 - 291 - F_printk("type:%u call_site:%lx ptr:%p req:%zi alloc:%zi" 292 - " flags:%x node:%d", 293 - __entry->type_id, __entry->call_site, __entry->ptr, 294 - 
__entry->bytes_req, __entry->bytes_alloc, 295 - __entry->gfp_flags, __entry->node) 296 - ); 297 - 298 - FTRACE_ENTRY(kmem_free, kmemtrace_free_entry, 299 - 300 - TRACE_KMEM_FREE, 301 - 302 - F_STRUCT( 303 - __field( enum kmemtrace_type_id, type_id ) 304 - __field( unsigned long, call_site ) 305 - __field( const void *, ptr ) 306 - ), 307 - 308 - F_printk("type:%u call_site:%lx ptr:%p", 309 - __entry->type_id, __entry->call_site, __entry->ptr) 310 - ); 311 - 312 - FTRACE_ENTRY(ksym_trace, ksym_trace_entry, 313 - 314 - TRACE_KSYM, 315 - 316 - F_STRUCT( 317 - __field( unsigned long, ip ) 318 - __field( unsigned char, type ) 319 - __array( char , cmd, TASK_COMM_LEN ) 320 - __field( unsigned long, addr ) 321 - ), 322 - 323 - F_printk("ip: %pF type: %d ksym_name: %pS cmd: %s", 324 - (void *)__entry->ip, (unsigned int)__entry->type, 325 - (void *)__entry->addr, __entry->cmd) 326 - );
+6 -21
kernel/trace/trace_event_perf.c
··· 9 9 #include <linux/kprobes.h> 10 10 #include "trace.h" 11 11 12 - EXPORT_SYMBOL_GPL(perf_arch_fetch_caller_regs); 13 - 14 12 static char *perf_trace_buf[4]; 15 13 16 14 /* ··· 54 56 } 55 57 } 56 58 57 - if (tp_event->class->reg) 58 - ret = tp_event->class->reg(tp_event, TRACE_REG_PERF_REGISTER); 59 - else 60 - ret = tracepoint_probe_register(tp_event->name, 61 - tp_event->class->perf_probe, 62 - tp_event); 63 - 59 + ret = tp_event->class->reg(tp_event, TRACE_REG_PERF_REGISTER); 64 60 if (ret) 65 61 goto fail; 66 62 ··· 88 96 mutex_lock(&event_mutex); 89 97 list_for_each_entry(tp_event, &ftrace_events, list) { 90 98 if (tp_event->event.type == event_id && 91 - tp_event->class && 92 - (tp_event->class->perf_probe || 93 - tp_event->class->reg) && 99 + tp_event->class && tp_event->class->reg && 94 100 try_module_get(tp_event->mod)) { 95 101 ret = perf_trace_event_init(tp_event, p_event); 96 102 break; ··· 128 138 if (--tp_event->perf_refcount > 0) 129 139 goto out; 130 140 131 - if (tp_event->class->reg) 132 - tp_event->class->reg(tp_event, TRACE_REG_PERF_UNREGISTER); 133 - else 134 - tracepoint_probe_unregister(tp_event->name, 135 - tp_event->class->perf_probe, 136 - tp_event); 141 + tp_event->class->reg(tp_event, TRACE_REG_PERF_UNREGISTER); 137 142 138 143 /* 139 - * Ensure our callback won't be called anymore. See 140 - * tracepoint_probe_unregister() and __DO_TRACE(). 144 + * Ensure our callback won't be called anymore. The buffers 145 + * will be freed after that. 141 146 */ 142 - synchronize_sched(); 147 + tracepoint_synchronize_unregister(); 143 148 144 149 free_percpu(tp_event->perf_events); 145 150 tp_event->perf_events = NULL;
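The trace_event_perf.c hunk above switches the teardown path from `synchronize_sched()` to `tracepoint_synchronize_unregister()`: the probe is unregistered, in-flight callbacks are drained, and only then are the per-CPU buffers freed. A small state-machine sketch of that ordering constraint (the `*_model` names are illustrative):

```c
#include <assert.h>

/* Order-of-operations model for the teardown above: unregister the
 * probe, wait for in-flight callbacks to drain, and only then free
 * the buffers those callbacks were writing into. Each step asserts
 * the previous one has happened. */
enum { LIVE, UNREGISTERED, SYNCED, FREED };
static int teardown_state = LIVE;

static void unregister_probe_model(void)
{
	teardown_state = UNREGISTERED;	/* ->reg(..., TRACE_REG_PERF_UNREGISTER) */
}

static void synchronize_model(void)
{
	assert(teardown_state == UNREGISTERED);
	teardown_state = SYNCED;	/* tracepoint_synchronize_unregister() */
}

static void free_buffers_model(void)
{
	assert(teardown_state == SYNCED);
	teardown_state = FREED;		/* free_percpu(...) is safe only now */
}

static void perf_trace_destroy_model(void)
{
	unregister_probe_model();
	synchronize_model();
	free_buffers_model();
}
```

Freeing before the synchronize step would be a use-after-free against any callback still executing on another CPU, which is exactly what the real synchronization call guards against.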
+162 -139
kernel/trace/trace_events.c
··· 28 28 DEFINE_MUTEX(event_mutex); 29 29 30 30 LIST_HEAD(ftrace_events); 31 + LIST_HEAD(ftrace_common_fields); 31 32 32 33 struct list_head * 33 34 trace_get_fields(struct ftrace_event_call *event_call) ··· 38 37 return event_call->class->get_fields(event_call); 39 38 } 40 39 41 - int trace_define_field(struct ftrace_event_call *call, const char *type, 42 - const char *name, int offset, int size, int is_signed, 43 - int filter_type) 40 + static int __trace_define_field(struct list_head *head, const char *type, 41 + const char *name, int offset, int size, 42 + int is_signed, int filter_type) 44 43 { 45 44 struct ftrace_event_field *field; 46 - struct list_head *head; 47 - 48 - if (WARN_ON(!call->class)) 49 - return 0; 50 45 51 46 field = kzalloc(sizeof(*field), GFP_KERNEL); 52 47 if (!field) ··· 65 68 field->size = size; 66 69 field->is_signed = is_signed; 67 70 68 - head = trace_get_fields(call); 69 71 list_add(&field->link, head); 70 72 71 73 return 0; ··· 76 80 77 81 return -ENOMEM; 78 82 } 83 + 84 + int trace_define_field(struct ftrace_event_call *call, const char *type, 85 + const char *name, int offset, int size, int is_signed, 86 + int filter_type) 87 + { 88 + struct list_head *head; 89 + 90 + if (WARN_ON(!call->class)) 91 + return 0; 92 + 93 + head = trace_get_fields(call); 94 + return __trace_define_field(head, type, name, offset, size, 95 + is_signed, filter_type); 96 + } 79 97 EXPORT_SYMBOL_GPL(trace_define_field); 80 98 81 99 #define __common_field(type, item) \ 82 - ret = trace_define_field(call, #type, "common_" #item, \ 83 - offsetof(typeof(ent), item), \ 84 - sizeof(ent.item), \ 85 - is_signed_type(type), FILTER_OTHER); \ 100 + ret = __trace_define_field(&ftrace_common_fields, #type, \ 101 + "common_" #item, \ 102 + offsetof(typeof(ent), item), \ 103 + sizeof(ent.item), \ 104 + is_signed_type(type), FILTER_OTHER); \ 86 105 if (ret) \ 87 106 return ret; 88 107 89 - static int trace_define_common_fields(struct ftrace_event_call *call) 108 + static 
int trace_define_common_fields(void) 90 109 { 91 110 int ret; 92 111 struct trace_entry ent; ··· 141 130 } 142 131 EXPORT_SYMBOL_GPL(trace_event_raw_init); 143 132 133 + int ftrace_event_reg(struct ftrace_event_call *call, enum trace_reg type) 134 + { 135 + switch (type) { 136 + case TRACE_REG_REGISTER: 137 + return tracepoint_probe_register(call->name, 138 + call->class->probe, 139 + call); 140 + case TRACE_REG_UNREGISTER: 141 + tracepoint_probe_unregister(call->name, 142 + call->class->probe, 143 + call); 144 + return 0; 145 + 146 + #ifdef CONFIG_PERF_EVENTS 147 + case TRACE_REG_PERF_REGISTER: 148 + return tracepoint_probe_register(call->name, 149 + call->class->perf_probe, 150 + call); 151 + case TRACE_REG_PERF_UNREGISTER: 152 + tracepoint_probe_unregister(call->name, 153 + call->class->perf_probe, 154 + call); 155 + return 0; 156 + #endif 157 + } 158 + return 0; 159 + } 160 + EXPORT_SYMBOL_GPL(ftrace_event_reg); 161 + 162 + void trace_event_enable_cmd_record(bool enable) 163 + { 164 + struct ftrace_event_call *call; 165 + 166 + mutex_lock(&event_mutex); 167 + list_for_each_entry(call, &ftrace_events, list) { 168 + if (!(call->flags & TRACE_EVENT_FL_ENABLED)) 169 + continue; 170 + 171 + if (enable) { 172 + tracing_start_cmdline_record(); 173 + call->flags |= TRACE_EVENT_FL_RECORDED_CMD; 174 + } else { 175 + tracing_stop_cmdline_record(); 176 + call->flags &= ~TRACE_EVENT_FL_RECORDED_CMD; 177 + } 178 + } 179 + mutex_unlock(&event_mutex); 180 + } 181 + 144 182 static int ftrace_event_enable_disable(struct ftrace_event_call *call, 145 183 int enable) 146 184 { ··· 199 139 case 0: 200 140 if (call->flags & TRACE_EVENT_FL_ENABLED) { 201 141 call->flags &= ~TRACE_EVENT_FL_ENABLED; 202 - tracing_stop_cmdline_record(); 203 - if (call->class->reg) 204 - call->class->reg(call, TRACE_REG_UNREGISTER); 205 - else 206 - tracepoint_probe_unregister(call->name, 207 - call->class->probe, 208 - call); 142 + if (call->flags & TRACE_EVENT_FL_RECORDED_CMD) { 143 + 
tracing_stop_cmdline_record(); 144 + call->flags &= ~TRACE_EVENT_FL_RECORDED_CMD; 145 + } 146 + call->class->reg(call, TRACE_REG_UNREGISTER); 209 147 } 210 148 break; 211 149 case 1: 212 150 if (!(call->flags & TRACE_EVENT_FL_ENABLED)) { 213 - tracing_start_cmdline_record(); 214 - if (call->class->reg) 215 - ret = call->class->reg(call, TRACE_REG_REGISTER); 216 - else 217 - ret = tracepoint_probe_register(call->name, 218 - call->class->probe, 219 - call); 151 + if (trace_flags & TRACE_ITER_RECORD_CMD) { 152 + tracing_start_cmdline_record(); 153 + call->flags |= TRACE_EVENT_FL_RECORDED_CMD; 154 + } 155 + ret = call->class->reg(call, TRACE_REG_REGISTER); 220 156 if (ret) { 221 157 tracing_stop_cmdline_record(); 222 158 pr_info("event trace: Could not enable event " ··· 250 194 mutex_lock(&event_mutex); 251 195 list_for_each_entry(call, &ftrace_events, list) { 252 196 253 - if (!call->name || !call->class || 254 - (!call->class->probe && !call->class->reg)) 197 + if (!call->name || !call->class || !call->class->reg) 255 198 continue; 256 199 257 200 if (match && ··· 376 321 * The ftrace subsystem is for showing formats only. 377 322 * They can not be enabled or disabled via the event files. 
378 323 */ 379 - if (call->class && (call->class->probe || call->class->reg)) 324 + if (call->class && call->class->reg) 380 325 return call; 381 326 } 382 327 ··· 529 474 530 475 mutex_lock(&event_mutex); 531 476 list_for_each_entry(call, &ftrace_events, list) { 532 - if (!call->name || !call->class || 533 - (!call->class->probe && !call->class->reg)) 477 + if (!call->name || !call->class || !call->class->reg) 534 478 continue; 535 479 536 480 if (system && strcmp(call->class->system, system) != 0) ··· 598 544 return ret; 599 545 } 600 546 601 - static ssize_t 602 - event_format_read(struct file *filp, char __user *ubuf, size_t cnt, 603 - loff_t *ppos) 547 + static void print_event_fields(struct trace_seq *s, struct list_head *head) 604 548 { 605 - struct ftrace_event_call *call = filp->private_data; 606 549 struct ftrace_event_field *field; 607 - struct list_head *head; 608 - struct trace_seq *s; 609 - int common_field_count = 5; 610 - char *buf; 611 - int r = 0; 612 550 613 - if (*ppos) 614 - return 0; 615 - 616 - s = kmalloc(sizeof(*s), GFP_KERNEL); 617 - if (!s) 618 - return -ENOMEM; 619 - 620 - trace_seq_init(s); 621 - 622 - trace_seq_printf(s, "name: %s\n", call->name); 623 - trace_seq_printf(s, "ID: %d\n", call->event.type); 624 - trace_seq_printf(s, "format:\n"); 625 - 626 - head = trace_get_fields(call); 627 551 list_for_each_entry_reverse(field, head, link) { 628 552 /* 629 553 * Smartly shows the array type(except dynamic array). 
··· 616 584 array_descriptor = NULL; 617 585 618 586 if (!array_descriptor) { 619 - r = trace_seq_printf(s, "\tfield:%s %s;\toffset:%u;" 587 + trace_seq_printf(s, "\tfield:%s %s;\toffset:%u;" 620 588 "\tsize:%u;\tsigned:%d;\n", 621 589 field->type, field->name, field->offset, 622 590 field->size, !!field->is_signed); 623 591 } else { 624 - r = trace_seq_printf(s, "\tfield:%.*s %s%s;\toffset:%u;" 592 + trace_seq_printf(s, "\tfield:%.*s %s%s;\toffset:%u;" 625 593 "\tsize:%u;\tsigned:%d;\n", 626 594 (int)(array_descriptor - field->type), 627 595 field->type, field->name, 628 596 array_descriptor, field->offset, 629 597 field->size, !!field->is_signed); 630 598 } 631 - 632 - if (--common_field_count == 0) 633 - r = trace_seq_printf(s, "\n"); 634 - 635 - if (!r) 636 - break; 637 599 } 600 + } 638 601 639 - if (r) 640 - r = trace_seq_printf(s, "\nprint fmt: %s\n", 641 - call->print_fmt); 602 + static ssize_t 603 + event_format_read(struct file *filp, char __user *ubuf, size_t cnt, 604 + loff_t *ppos) 605 + { 606 + struct ftrace_event_call *call = filp->private_data; 607 + struct list_head *head; 608 + struct trace_seq *s; 609 + char *buf; 610 + int r; 611 + 612 + if (*ppos) 613 + return 0; 614 + 615 + s = kmalloc(sizeof(*s), GFP_KERNEL); 616 + if (!s) 617 + return -ENOMEM; 618 + 619 + trace_seq_init(s); 620 + 621 + trace_seq_printf(s, "name: %s\n", call->name); 622 + trace_seq_printf(s, "ID: %d\n", call->event.type); 623 + trace_seq_printf(s, "format:\n"); 624 + 625 + /* print common fields */ 626 + print_event_fields(s, &ftrace_common_fields); 627 + 628 + trace_seq_putc(s, '\n'); 629 + 630 + /* print event specific fields */ 631 + head = trace_get_fields(call); 632 + print_event_fields(s, head); 633 + 634 + r = trace_seq_printf(s, "\nprint fmt: %s\n", call->print_fmt); 642 635 643 636 if (!r) { 644 637 /* ··· 1020 963 return -1; 1021 964 } 1022 965 1023 - if (call->class->probe || call->class->reg) 966 + if (call->class->reg) 1024 967 trace_create_file("enable", 0644, 
call->dir, call, 1025 968 enable); 1026 969 1027 970 #ifdef CONFIG_PERF_EVENTS 1028 - if (call->event.type && (call->class->perf_probe || call->class->reg)) 971 + if (call->event.type && call->class->reg) 1029 972 trace_create_file("id", 0444, call->dir, call, 1030 973 id); 1031 974 #endif 1032 975 1033 - if (call->class->define_fields) { 1034 - /* 1035 - * Other events may have the same class. Only update 1036 - * the fields if they are not already defined. 1037 - */ 1038 - head = trace_get_fields(call); 1039 - if (list_empty(head)) { 1040 - ret = trace_define_common_fields(call); 1041 - if (!ret) 1042 - ret = call->class->define_fields(call); 1043 - if (ret < 0) { 1044 - pr_warning("Could not initialize trace point" 1045 - " events/%s\n", call->name); 1046 - return ret; 1047 - } 976 + /* 977 + * Other events may have the same class. Only update 978 + * the fields if they are not already defined. 979 + */ 980 + head = trace_get_fields(call); 981 + if (list_empty(head)) { 982 + ret = call->class->define_fields(call); 983 + if (ret < 0) { 984 + pr_warning("Could not initialize trace point" 985 + " events/%s\n", call->name); 986 + return ret; 1048 987 } 1049 - trace_create_file("filter", 0644, call->dir, call, 1050 - filter); 1051 988 } 989 + trace_create_file("filter", 0644, call->dir, call, 990 + filter); 1052 991 1053 992 trace_create_file("format", 0444, call->dir, call, 1054 993 format); ··· 1052 999 return 0; 1053 1000 } 1054 1001 1055 - static int __trace_add_event_call(struct ftrace_event_call *call) 1002 + static int 1003 + __trace_add_event_call(struct ftrace_event_call *call, struct module *mod, 1004 + const struct file_operations *id, 1005 + const struct file_operations *enable, 1006 + const struct file_operations *filter, 1007 + const struct file_operations *format) 1056 1008 { 1057 1009 struct dentry *d_events; 1058 1010 int ret; 1059 1011 1012 + /* The linker may leave blanks */ 1060 1013 if (!call->name) 1061 1014 return -EINVAL; 1062 1015 ··· 1070 
1011 ret = call->class->raw_init(call); 1071 1012 if (ret < 0) { 1072 1013 if (ret != -ENOSYS) 1073 - pr_warning("Could not initialize trace " 1074 - "events/%s\n", call->name); 1014 + pr_warning("Could not initialize trace events/%s\n", 1015 + call->name); 1075 1016 return ret; 1076 1017 } 1077 1018 } ··· 1080 1021 if (!d_events) 1081 1022 return -ENOENT; 1082 1023 1083 - ret = event_create_dir(call, d_events, &ftrace_event_id_fops, 1084 - &ftrace_enable_fops, &ftrace_event_filter_fops, 1085 - &ftrace_event_format_fops); 1024 + ret = event_create_dir(call, d_events, id, enable, filter, format); 1086 1025 if (!ret) 1087 1026 list_add(&call->list, &ftrace_events); 1027 + call->mod = mod; 1088 1028 1089 1029 return ret; 1090 1030 } ··· 1093 1035 { 1094 1036 int ret; 1095 1037 mutex_lock(&event_mutex); 1096 - ret = __trace_add_event_call(call); 1038 + ret = __trace_add_event_call(call, NULL, &ftrace_event_id_fops, 1039 + &ftrace_enable_fops, 1040 + &ftrace_event_filter_fops, 1041 + &ftrace_event_format_fops); 1097 1042 mutex_unlock(&event_mutex); 1098 1043 return ret; 1099 1044 } ··· 1213 1152 { 1214 1153 struct ftrace_module_file_ops *file_ops = NULL; 1215 1154 struct ftrace_event_call *call, *start, *end; 1216 - struct dentry *d_events; 1217 - int ret; 1218 1155 1219 1156 start = mod->trace_events; 1220 1157 end = mod->trace_events + mod->num_trace_events; ··· 1220 1161 if (start == end) 1221 1162 return; 1222 1163 1223 - d_events = event_trace_events_dir(); 1224 - if (!d_events) 1164 + file_ops = trace_create_file_ops(mod); 1165 + if (!file_ops) 1225 1166 return; 1226 1167 1227 1168 for_each_event(call, start, end) { 1228 - /* The linker may leave blanks */ 1229 - if (!call->name) 1230 - continue; 1231 - if (call->class->raw_init) { 1232 - ret = call->class->raw_init(call); 1233 - if (ret < 0) { 1234 - if (ret != -ENOSYS) 1235 - pr_warning("Could not initialize trace " 1236 - "point events/%s\n", call->name); 1237 - continue; 1238 - } 1239 - } 1240 - /* 1241 - * 
This module has events, create file ops for this module 1242 - * if not already done. 1243 - */ 1244 - if (!file_ops) { 1245 - file_ops = trace_create_file_ops(mod); 1246 - if (!file_ops) 1247 - return; 1248 - } 1249 - call->mod = mod; 1250 - ret = event_create_dir(call, d_events, 1169 + __trace_add_event_call(call, mod, 1251 1170 &file_ops->id, &file_ops->enable, 1252 1171 &file_ops->filter, &file_ops->format); 1253 - if (!ret) 1254 - list_add(&call->list, &ftrace_events); 1255 1172 } 1256 1173 } 1257 1174 ··· 1354 1319 trace_create_file("enable", 0644, d_events, 1355 1320 NULL, &ftrace_system_enable_fops); 1356 1321 1322 + if (trace_define_common_fields()) 1323 + pr_warning("tracing: Failed to allocate common fields"); 1324 + 1357 1325 for_each_event(call, __start_ftrace_events, __stop_ftrace_events) { 1358 - /* The linker may leave blanks */ 1359 - if (!call->name) 1360 - continue; 1361 - if (call->class->raw_init) { 1362 - ret = call->class->raw_init(call); 1363 - if (ret < 0) { 1364 - if (ret != -ENOSYS) 1365 - pr_warning("Could not initialize trace " 1366 - "point events/%s\n", call->name); 1367 - continue; 1368 - } 1369 - } 1370 - ret = event_create_dir(call, d_events, &ftrace_event_id_fops, 1326 + __trace_add_event_call(call, NULL, &ftrace_event_id_fops, 1371 1327 &ftrace_enable_fops, 1372 1328 &ftrace_event_filter_fops, 1373 1329 &ftrace_event_format_fops); 1374 - if (!ret) 1375 - list_add(&call->list, &ftrace_events); 1376 1330 } 1377 1331 1378 1332 while (true) { ··· 1548 1524 struct ftrace_entry *entry; 1549 1525 unsigned long flags; 1550 1526 long disabled; 1551 - int resched; 1552 1527 int cpu; 1553 1528 int pc; 1554 1529 1555 1530 pc = preempt_count(); 1556 - resched = ftrace_preempt_disable(); 1531 + preempt_disable_notrace(); 1557 1532 cpu = raw_smp_processor_id(); 1558 1533 disabled = atomic_inc_return(&per_cpu(ftrace_test_event_disable, cpu)); 1559 1534 ··· 1574 1551 1575 1552 out: 1576 1553 atomic_dec(&per_cpu(ftrace_test_event_disable, cpu)); 
1577 - ftrace_preempt_enable(resched); 1554 + preempt_enable_notrace(); 1578 1555 } 1579 1556 1580 1557 static struct ftrace_ops trace_ops __initdata =
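A large share of the trace_events.c churn above comes from the new `ftrace_event_reg()` default: instead of every call site open-coding "if the class has a `->reg` handler use it, else call `tracepoint_probe_register()` directly", events without a custom handler get one shared dispatcher, and callers may assume `->reg` is always set. A simplified userspace model of that dispatch (the enum values and bookkeeping are stand-ins):

```c
#include <assert.h>

/* Sketch of the unified ->reg() callback: one switch handles register
 * and unregister, so call sites shrink to a single indirect call.
 * The 'registered' flag stands in for a live tracepoint probe. */
enum trace_reg_model { REG_REGISTER, REG_UNREGISTER };

struct event_call_model {
	int registered;
};

static int event_reg_model(struct event_call_model *call,
			   enum trace_reg_model type)
{
	switch (type) {
	case REG_REGISTER:
		call->registered = 1;	/* tracepoint_probe_register(...) */
		return 0;
	case REG_UNREGISTER:
		call->registered = 0;	/* tracepoint_probe_unregister(...) */
		return 0;
	}
	return 0;
}
```

The same consolidation explains the repeated `(!call->class->probe && !call->class->reg)` checks collapsing to `!call->class->reg` throughout the file.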
+15 -12
kernel/trace/trace_events_filter.c
··· 497 497 } 498 498 499 499 static struct ftrace_event_field * 500 - find_event_field(struct ftrace_event_call *call, char *name) 500 + __find_event_field(struct list_head *head, char *name) 501 501 { 502 502 struct ftrace_event_field *field; 503 - struct list_head *head; 504 503 505 - head = trace_get_fields(call); 506 504 list_for_each_entry(field, head, link) { 507 505 if (!strcmp(field->name, name)) 508 506 return field; 509 507 } 510 508 511 509 return NULL; 510 + } 511 + 512 + static struct ftrace_event_field * 513 + find_event_field(struct ftrace_event_call *call, char *name) 514 + { 515 + struct ftrace_event_field *field; 516 + struct list_head *head; 517 + 518 + field = __find_event_field(&ftrace_common_fields, name); 519 + if (field) 520 + return field; 521 + 522 + head = trace_get_fields(call); 523 + return __find_event_field(head, name); 512 524 } 513 525 514 526 static void filter_free_pred(struct filter_pred *pred) ··· 639 627 int err; 640 628 641 629 list_for_each_entry(call, &ftrace_events, list) { 642 - if (!call->class || !call->class->define_fields) 643 - continue; 644 - 645 630 if (strcmp(call->class->system, system->name) != 0) 646 631 continue; 647 632 ··· 655 646 struct ftrace_event_call *call; 656 647 657 648 list_for_each_entry(call, &ftrace_events, list) { 658 - if (!call->class || !call->class->define_fields) 659 - continue; 660 - 661 649 if (strcmp(call->class->system, system->name) != 0) 662 650 continue; 663 651 ··· 1256 1250 1257 1251 list_for_each_entry(call, &ftrace_events, list) { 1258 1252 struct event_filter *filter = call->filter; 1259 - 1260 - if (!call->class || !call->class->define_fields) 1261 - continue; 1262 1253 1263 1254 if (strcmp(call->class->system, system->name) != 0) 1264 1255 continue;
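The trace_events_filter.c hunk above adapts field lookup to the new shared `ftrace_common_fields` list: `find_event_field()` now searches the common fields (defined once for all events) before falling back to the event's own list. A small model of that two-stage lookup, using arrays in place of the kernel's linked lists and illustrative field names:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Model of the two-stage lookup: common fields shared by every event
 * live on one list, event-specific fields on another, and the filter
 * code searches the common list first. */
static const char *common_fields[] = { "common_pid", "common_flags" };
static const char *event_fields[]  = { "ip", "parent_ip" };

static const char *find_field_model(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(common_fields) / sizeof(common_fields[0]); i++)
		if (!strcmp(common_fields[i], name))
			return common_fields[i];
	for (i = 0; i < sizeof(event_fields) / sizeof(event_fields[0]); i++)
		if (!strcmp(event_fields[i], name))
			return event_fields[i];
	return NULL;
}
```

Because the common fields no longer live on each event's list, the `!call->class->define_fields` guards elsewhere in the file become unnecessary and are dropped in the same hunk.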
+1 -7
kernel/trace/trace_export.c
··· 125 125 126 126 #include "trace_entries.h" 127 127 128 - static int ftrace_raw_init_event(struct ftrace_event_call *call) 129 - { 130 - INIT_LIST_HEAD(&call->class->fields); 131 - return 0; 132 - } 133 - 134 128 #undef __entry 135 129 #define __entry REC 136 130 ··· 152 158 struct ftrace_event_class event_class_ftrace_##call = { \ 153 159 .system = __stringify(TRACE_SYSTEM), \ 154 160 .define_fields = ftrace_define_fields_##call, \ 155 - .raw_init = ftrace_raw_init_event, \ 161 + .fields = LIST_HEAD_INIT(event_class_ftrace_##call.fields),\ 156 162 }; \ 157 163 \ 158 164 struct ftrace_event_call __used \
+3 -3
kernel/trace/trace_functions.c
··· 54 54 struct trace_array_cpu *data; 55 55 unsigned long flags; 56 56 long disabled; 57 - int cpu, resched; 57 + int cpu; 58 58 int pc; 59 59 60 60 if (unlikely(!ftrace_function_enabled)) 61 61 return; 62 62 63 63 pc = preempt_count(); 64 - resched = ftrace_preempt_disable(); 64 + preempt_disable_notrace(); 65 65 local_save_flags(flags); 66 66 cpu = raw_smp_processor_id(); 67 67 data = tr->data[cpu]; ··· 71 71 trace_function(tr, ip, parent_ip, flags, pc); 72 72 73 73 atomic_dec(&data->disabled); 74 - ftrace_preempt_enable(resched); 74 + preempt_enable_notrace(); 75 75 } 76 76 77 77 static void
+2 -1
kernel/trace/trace_functions_graph.c
··· 641 641 642 642 /* Print nsecs (we don't want to exceed 7 numbers) */ 643 643 if (len < 7) { 644 - snprintf(nsecs_str, 8 - len, "%03lu", nsecs_rem); 644 + snprintf(nsecs_str, min(sizeof(nsecs_str), 8UL - len), "%03lu", 645 + nsecs_rem); 645 646 ret = trace_seq_printf(s, ".%s", nsecs_str); 646 647 if (!ret) 647 648 return TRACE_TYPE_PARTIAL_LINE;
+3
kernel/trace/trace_irqsoff.c
··· 649 649 #endif 650 650 .open = irqsoff_trace_open, 651 651 .close = irqsoff_trace_close, 652 + .use_max_tr = 1, 652 653 }; 653 654 # define register_irqsoff(trace) register_tracer(&trace) 654 655 #else ··· 682 681 #endif 683 682 .open = irqsoff_trace_open, 684 683 .close = irqsoff_trace_close, 684 + .use_max_tr = 1, 685 685 }; 686 686 # define register_preemptoff(trace) register_tracer(&trace) 687 687 #else ··· 717 715 #endif 718 716 .open = irqsoff_trace_open, 719 717 .close = irqsoff_trace_close, 718 + .use_max_tr = 1, 720 719 }; 721 720 722 721 # define register_preemptirqsoff(trace) register_tracer(&trace)
+294 -85
kernel/trace/trace_kprobe.c
··· 30 30 #include <linux/ptrace.h> 31 31 #include <linux/perf_event.h> 32 32 #include <linux/stringify.h> 33 + #include <linux/limits.h> 34 + #include <linux/uaccess.h> 33 35 #include <asm/bitsperlong.h> 34 36 35 37 #include "trace.h" ··· 40 38 #define MAX_TRACE_ARGS 128 41 39 #define MAX_ARGSTR_LEN 63 42 40 #define MAX_EVENT_NAME_LEN 64 41 + #define MAX_STRING_SIZE PATH_MAX 43 42 #define KPROBE_EVENT_SYSTEM "kprobes" 44 43 45 44 /* Reserved field names */ ··· 61 58 }; 62 59 63 60 /* Printing function type */ 64 - typedef int (*print_type_func_t)(struct trace_seq *, const char *, void *); 61 + typedef int (*print_type_func_t)(struct trace_seq *, const char *, void *, 62 + void *); 65 63 #define PRINT_TYPE_FUNC_NAME(type) print_type_##type 66 64 #define PRINT_TYPE_FMT_NAME(type) print_type_format_##type 67 65 68 66 /* Printing in basic type function template */ 69 67 #define DEFINE_BASIC_PRINT_TYPE_FUNC(type, fmt, cast) \ 70 68 static __kprobes int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, \ 71 - const char *name, void *data)\ 69 + const char *name, \ 70 + void *data, void *ent)\ 72 71 { \ 73 72 return trace_seq_printf(s, " %s=" fmt, name, (cast)*(type *)data);\ 74 73 } \ ··· 84 79 DEFINE_BASIC_PRINT_TYPE_FUNC(s16, "%d", int) 85 80 DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%ld", long) 86 81 DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%lld", long long) 82 + 83 + /* data_rloc: data relative location, compatible with u32 */ 84 + #define make_data_rloc(len, roffs) \ 85 + (((u32)(len) << 16) | ((u32)(roffs) & 0xffff)) 86 + #define get_rloc_len(dl) ((u32)(dl) >> 16) 87 + #define get_rloc_offs(dl) ((u32)(dl) & 0xffff) 88 + 89 + static inline void *get_rloc_data(u32 *dl) 90 + { 91 + return (u8 *)dl + get_rloc_offs(*dl); 92 + } 93 + 94 + /* For data_loc conversion */ 95 + static inline void *get_loc_data(u32 *dl, void *ent) 96 + { 97 + return (u8 *)ent + get_rloc_offs(*dl); 98 + } 99 + 100 + /* 101 + * Convert data_rloc to data_loc: 102 + * data_rloc stores the offset from data_rloc 
itself, but data_loc 103 + * stores the offset from event entry. 104 + */ 105 + #define convert_rloc_to_loc(dl, offs) ((u32)(dl) + (offs)) 106 + 107 + /* For defining macros, define string/string_size types */ 108 + typedef u32 string; 109 + typedef u32 string_size; 110 + 111 + /* Print type function for string type */ 112 + static __kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s, 113 + const char *name, 114 + void *data, void *ent) 115 + { 116 + int len = *(u32 *)data >> 16; 117 + 118 + if (!len) 119 + return trace_seq_printf(s, " %s=(fault)", name); 120 + else 121 + return trace_seq_printf(s, " %s=\"%s\"", name, 122 + (const char *)get_loc_data(data, ent)); 123 + } 124 + static const char PRINT_TYPE_FMT_NAME(string)[] = "\\\"%s\\\""; 87 125 88 126 /* Data fetch function type */ 89 127 typedef void (*fetch_func_t)(struct pt_regs *, void *, void *); ··· 142 94 return fprm->fn(regs, fprm->data, dest); 143 95 } 144 96 145 - #define FETCH_FUNC_NAME(kind, type) fetch_##kind##_##type 97 + #define FETCH_FUNC_NAME(method, type) fetch_##method##_##type 146 98 /* 147 99 * Define macro for basic types - we don't need to define s* types, because 148 100 * we have to care only about bitwidth at recording time. 
149 101 */ 150 - #define DEFINE_BASIC_FETCH_FUNCS(kind) \ 151 - DEFINE_FETCH_##kind(u8) \ 152 - DEFINE_FETCH_##kind(u16) \ 153 - DEFINE_FETCH_##kind(u32) \ 154 - DEFINE_FETCH_##kind(u64) 102 + #define DEFINE_BASIC_FETCH_FUNCS(method) \ 103 + DEFINE_FETCH_##method(u8) \ 104 + DEFINE_FETCH_##method(u16) \ 105 + DEFINE_FETCH_##method(u32) \ 106 + DEFINE_FETCH_##method(u64) 155 107 156 - #define CHECK_BASIC_FETCH_FUNCS(kind, fn) \ 157 - ((FETCH_FUNC_NAME(kind, u8) == fn) || \ 158 - (FETCH_FUNC_NAME(kind, u16) == fn) || \ 159 - (FETCH_FUNC_NAME(kind, u32) == fn) || \ 160 - (FETCH_FUNC_NAME(kind, u64) == fn)) 108 + #define CHECK_FETCH_FUNCS(method, fn) \ 109 + (((FETCH_FUNC_NAME(method, u8) == fn) || \ 110 + (FETCH_FUNC_NAME(method, u16) == fn) || \ 111 + (FETCH_FUNC_NAME(method, u32) == fn) || \ 112 + (FETCH_FUNC_NAME(method, u64) == fn) || \ 113 + (FETCH_FUNC_NAME(method, string) == fn) || \ 114 + (FETCH_FUNC_NAME(method, string_size) == fn)) \ 115 + && (fn != NULL)) 161 116 162 117 /* Data fetch function templates */ 163 118 #define DEFINE_FETCH_reg(type) \ 164 119 static __kprobes void FETCH_FUNC_NAME(reg, type)(struct pt_regs *regs, \ 165 - void *offset, void *dest) \ 120 + void *offset, void *dest) \ 166 121 { \ 167 122 *(type *)dest = (type)regs_get_register(regs, \ 168 123 (unsigned int)((unsigned long)offset)); \ 169 124 } 170 125 DEFINE_BASIC_FETCH_FUNCS(reg) 126 + /* No string on the register */ 127 + #define fetch_reg_string NULL 128 + #define fetch_reg_string_size NULL 171 129 172 130 #define DEFINE_FETCH_stack(type) \ 173 131 static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\ ··· 183 129 (unsigned int)((unsigned long)offset)); \ 184 130 } 185 131 DEFINE_BASIC_FETCH_FUNCS(stack) 132 + /* No string on the stack entry */ 133 + #define fetch_stack_string NULL 134 + #define fetch_stack_string_size NULL 186 135 187 136 #define DEFINE_FETCH_retval(type) \ 188 137 static __kprobes void FETCH_FUNC_NAME(retval, type)(struct pt_regs *regs,\ ··· 
194 137 *(type *)dest = (type)regs_return_value(regs); \ 195 138 } 196 139 DEFINE_BASIC_FETCH_FUNCS(retval) 140 + /* No string on the retval */ 141 + #define fetch_retval_string NULL 142 + #define fetch_retval_string_size NULL 197 143 198 144 #define DEFINE_FETCH_memory(type) \ 199 145 static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\ ··· 209 149 *(type *)dest = retval; \ 210 150 } 211 151 DEFINE_BASIC_FETCH_FUNCS(memory) 152 + /* 153 + * Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max 154 + * length and relative data location. 155 + */ 156 + static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs, 157 + void *addr, void *dest) 158 + { 159 + long ret; 160 + int maxlen = get_rloc_len(*(u32 *)dest); 161 + u8 *dst = get_rloc_data(dest); 162 + u8 *src = addr; 163 + mm_segment_t old_fs = get_fs(); 164 + if (!maxlen) 165 + return; 166 + /* 167 + * Try to get string again, since the string can be changed while 168 + * probing. 169 + */ 170 + set_fs(KERNEL_DS); 171 + pagefault_disable(); 172 + do 173 + ret = __copy_from_user_inatomic(dst++, src++, 1); 174 + while (dst[-1] && ret == 0 && src - (u8 *)addr < maxlen); 175 + dst[-1] = '\0'; 176 + pagefault_enable(); 177 + set_fs(old_fs); 178 + 179 + if (ret < 0) { /* Failed to fetch string */ 180 + ((u8 *)get_rloc_data(dest))[0] = '\0'; 181 + *(u32 *)dest = make_data_rloc(0, get_rloc_offs(*(u32 *)dest)); 182 + } else 183 + *(u32 *)dest = make_data_rloc(src - (u8 *)addr, 184 + get_rloc_offs(*(u32 *)dest)); 185 + } 186 + /* Return the length of string -- including null terminal byte */ 187 + static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs, 188 + void *addr, void *dest) 189 + { 190 + int ret, len = 0; 191 + u8 c; 192 + mm_segment_t old_fs = get_fs(); 193 + 194 + set_fs(KERNEL_DS); 195 + pagefault_disable(); 196 + do { 197 + ret = __copy_from_user_inatomic(&c, (u8 *)addr + len, 1); 198 + len++; 199 + } while (c && ret == 0 && len < 
MAX_STRING_SIZE); 200 + pagefault_enable(); 201 + set_fs(old_fs); 202 + 203 + if (ret < 0) /* Failed to check the length */ 204 + *(u32 *)dest = 0; 205 + else 206 + *(u32 *)dest = len; 207 + } 212 208 213 209 /* Memory fetching by symbol */ 214 210 struct symbol_cache { ··· 319 203 *(type *)dest = 0; \ 320 204 } 321 205 DEFINE_BASIC_FETCH_FUNCS(symbol) 206 + DEFINE_FETCH_symbol(string) 207 + DEFINE_FETCH_symbol(string_size) 322 208 323 209 /* Dereference memory access function */ 324 210 struct deref_fetch_param { ··· 342 224 *(type *)dest = 0; \ 343 225 } 344 226 DEFINE_BASIC_FETCH_FUNCS(deref) 227 + DEFINE_FETCH_deref(string) 228 + DEFINE_FETCH_deref(string_size) 345 229 346 230 static __kprobes void free_deref_fetch_param(struct deref_fetch_param *data) 347 231 { 348 - if (CHECK_BASIC_FETCH_FUNCS(deref, data->orig.fn)) 232 + if (CHECK_FETCH_FUNCS(deref, data->orig.fn)) 349 233 free_deref_fetch_param(data->orig.data); 350 - else if (CHECK_BASIC_FETCH_FUNCS(symbol, data->orig.fn)) 234 + else if (CHECK_FETCH_FUNCS(symbol, data->orig.fn)) 351 235 free_symbol_cache(data->orig.data); 352 236 kfree(data); 353 237 } ··· 360 240 #define DEFAULT_FETCH_TYPE _DEFAULT_FETCH_TYPE(BITS_PER_LONG) 361 241 #define DEFAULT_FETCH_TYPE_STR __stringify(DEFAULT_FETCH_TYPE) 362 242 363 - #define ASSIGN_FETCH_FUNC(kind, type) \ 364 - .kind = FETCH_FUNC_NAME(kind, type) 243 + /* Fetch types */ 244 + enum { 245 + FETCH_MTD_reg = 0, 246 + FETCH_MTD_stack, 247 + FETCH_MTD_retval, 248 + FETCH_MTD_memory, 249 + FETCH_MTD_symbol, 250 + FETCH_MTD_deref, 251 + FETCH_MTD_END, 252 + }; 365 253 366 - #define ASSIGN_FETCH_TYPE(ptype, ftype, sign) \ 367 - {.name = #ptype, \ 368 - .size = sizeof(ftype), \ 369 - .is_signed = sign, \ 370 - .print = PRINT_TYPE_FUNC_NAME(ptype), \ 371 - .fmt = PRINT_TYPE_FMT_NAME(ptype), \ 372 - ASSIGN_FETCH_FUNC(reg, ftype), \ 373 - ASSIGN_FETCH_FUNC(stack, ftype), \ 374 - ASSIGN_FETCH_FUNC(retval, ftype), \ 375 - ASSIGN_FETCH_FUNC(memory, ftype), \ 376 - 
ASSIGN_FETCH_FUNC(symbol, ftype), \ 377 - ASSIGN_FETCH_FUNC(deref, ftype), \ 254 + #define ASSIGN_FETCH_FUNC(method, type) \ 255 + [FETCH_MTD_##method] = FETCH_FUNC_NAME(method, type) 256 + 257 + #define __ASSIGN_FETCH_TYPE(_name, ptype, ftype, _size, sign, _fmttype) \ 258 + {.name = _name, \ 259 + .size = _size, \ 260 + .is_signed = sign, \ 261 + .print = PRINT_TYPE_FUNC_NAME(ptype), \ 262 + .fmt = PRINT_TYPE_FMT_NAME(ptype), \ 263 + .fmttype = _fmttype, \ 264 + .fetch = { \ 265 + ASSIGN_FETCH_FUNC(reg, ftype), \ 266 + ASSIGN_FETCH_FUNC(stack, ftype), \ 267 + ASSIGN_FETCH_FUNC(retval, ftype), \ 268 + ASSIGN_FETCH_FUNC(memory, ftype), \ 269 + ASSIGN_FETCH_FUNC(symbol, ftype), \ 270 + ASSIGN_FETCH_FUNC(deref, ftype), \ 271 + } \ 378 272 } 273 + 274 + #define ASSIGN_FETCH_TYPE(ptype, ftype, sign) \ 275 + __ASSIGN_FETCH_TYPE(#ptype, ptype, ftype, sizeof(ftype), sign, #ptype) 276 + 277 + #define FETCH_TYPE_STRING 0 278 + #define FETCH_TYPE_STRSIZE 1 379 279 380 280 /* Fetch type information table */ 381 281 static const struct fetch_type { ··· 404 264 int is_signed; /* Signed flag */ 405 265 print_type_func_t print; /* Print functions */ 406 266 const char *fmt; /* Fromat string */ 267 + const char *fmttype; /* Name in format file */ 407 268 /* Fetch functions */ 408 - fetch_func_t reg; 409 - fetch_func_t stack; 410 - fetch_func_t retval; 411 - fetch_func_t memory; 412 - fetch_func_t symbol; 413 - fetch_func_t deref; 269 + fetch_func_t fetch[FETCH_MTD_END]; 414 270 } fetch_type_table[] = { 271 + /* Special types */ 272 + [FETCH_TYPE_STRING] = __ASSIGN_FETCH_TYPE("string", string, string, 273 + sizeof(u32), 1, "__data_loc char[]"), 274 + [FETCH_TYPE_STRSIZE] = __ASSIGN_FETCH_TYPE("string_size", u32, 275 + string_size, sizeof(u32), 0, "u32"), 276 + /* Basic types */ 415 277 ASSIGN_FETCH_TYPE(u8, u8, 0), 416 278 ASSIGN_FETCH_TYPE(u16, u16, 0), 417 279 ASSIGN_FETCH_TYPE(u32, u32, 0), ··· 444 302 *(unsigned long *)dest = kernel_stack_pointer(regs); 445 303 } 446 304 305 + 
static fetch_func_t get_fetch_size_function(const struct fetch_type *type, 306 + fetch_func_t orig_fn) 307 + { 308 + int i; 309 + 310 + if (type != &fetch_type_table[FETCH_TYPE_STRING]) 311 + return NULL; /* Only string type needs size function */ 312 + for (i = 0; i < FETCH_MTD_END; i++) 313 + if (type->fetch[i] == orig_fn) 314 + return fetch_type_table[FETCH_TYPE_STRSIZE].fetch[i]; 315 + 316 + WARN_ON(1); /* This should not happen */ 317 + return NULL; 318 + } 319 + 447 320 /** 448 321 * Kprobe event core functions 449 322 */ 450 323 451 324 struct probe_arg { 452 325 struct fetch_param fetch; 326 + struct fetch_param fetch_size; 453 327 unsigned int offset; /* Offset from argument entry */ 454 328 const char *name; /* Name of this argument */ 455 329 const char *comm; /* Command of this argument */ ··· 587 429 588 430 static void free_probe_arg(struct probe_arg *arg) 589 431 { 590 - if (CHECK_BASIC_FETCH_FUNCS(deref, arg->fetch.fn)) 432 + if (CHECK_FETCH_FUNCS(deref, arg->fetch.fn)) 591 433 free_deref_fetch_param(arg->fetch.data); 592 - else if (CHECK_BASIC_FETCH_FUNCS(symbol, arg->fetch.fn)) 434 + else if (CHECK_FETCH_FUNCS(symbol, arg->fetch.fn)) 593 435 free_symbol_cache(arg->fetch.data); 594 436 kfree(arg->name); 595 437 kfree(arg->comm); ··· 706 548 707 549 if (strcmp(arg, "retval") == 0) { 708 550 if (is_return) 709 - f->fn = t->retval; 551 + f->fn = t->fetch[FETCH_MTD_retval]; 710 552 else 711 553 ret = -EINVAL; 712 554 } else if (strncmp(arg, "stack", 5) == 0) { ··· 720 562 if (ret || param > PARAM_MAX_STACK) 721 563 ret = -EINVAL; 722 564 else { 723 - f->fn = t->stack; 565 + f->fn = t->fetch[FETCH_MTD_stack]; 724 566 f->data = (void *)param; 725 567 } 726 568 } else ··· 746 588 case '%': /* named register */ 747 589 ret = regs_query_register_offset(arg + 1); 748 590 if (ret >= 0) { 749 - f->fn = t->reg; 591 + f->fn = t->fetch[FETCH_MTD_reg]; 750 592 f->data = (void *)(unsigned long)ret; 751 593 ret = 0; 752 594 } ··· 756 598 ret = strict_strtoul(arg + 
1, 0, &param); 757 599 if (ret) 758 600 break; 759 - f->fn = t->memory; 601 + f->fn = t->fetch[FETCH_MTD_memory]; 760 602 f->data = (void *)param; 761 603 } else { 762 604 ret = split_symbol_offset(arg + 1, &offset); ··· 764 606 break; 765 607 f->data = alloc_symbol_cache(arg + 1, offset); 766 608 if (f->data) 767 - f->fn = t->symbol; 609 + f->fn = t->fetch[FETCH_MTD_symbol]; 768 610 } 769 611 break; 770 612 case '+': /* deref memory */ ··· 794 636 if (ret) 795 637 kfree(dprm); 796 638 else { 797 - f->fn = t->deref; 639 + f->fn = t->fetch[FETCH_MTD_deref]; 798 640 f->data = (void *)dprm; 799 641 } 800 642 } 801 643 break; 802 644 } 803 - if (!ret && !f->fn) 645 + if (!ret && !f->fn) { /* Parsed, but do not find fetch method */ 646 + pr_info("%s type has no corresponding fetch method.\n", 647 + t->name); 804 648 ret = -EINVAL; 649 + } 805 650 return ret; 806 651 } 807 652 ··· 813 652 struct probe_arg *parg, int is_return) 814 653 { 815 654 const char *t; 655 + int ret; 816 656 817 657 if (strlen(arg) > MAX_ARGSTR_LEN) { 818 658 pr_info("Argument is too long.: %s\n", arg); ··· 836 674 } 837 675 parg->offset = tp->size; 838 676 tp->size += parg->type->size; 839 - return __parse_probe_arg(arg, parg->type, &parg->fetch, is_return); 677 + ret = __parse_probe_arg(arg, parg->type, &parg->fetch, is_return); 678 + if (ret >= 0) { 679 + parg->fetch_size.fn = get_fetch_size_function(parg->type, 680 + parg->fetch.fn); 681 + parg->fetch_size.data = parg->fetch.data; 682 + } 683 + return ret; 840 684 } 841 685 842 686 /* Return 1 if name is reserved or already used by another argument */ ··· 925 757 pr_info("Delete command needs an event name.\n"); 926 758 return -EINVAL; 927 759 } 760 + mutex_lock(&probe_lock); 928 761 tp = find_probe_event(event, group); 929 762 if (!tp) { 763 + mutex_unlock(&probe_lock); 930 764 pr_info("Event %s/%s doesn't exist.\n", group, event); 931 765 return -ENOENT; 932 766 } 933 767 /* delete an event */ 934 768 unregister_trace_probe(tp); 935 769 
free_trace_probe(tp); 770 + mutex_unlock(&probe_lock); 936 771 return 0; 937 772 } 938 773 ··· 1214 1043 .release = seq_release, 1215 1044 }; 1216 1045 1046 + /* Sum up total data length for dynamic arraies (strings) */ 1047 + static __kprobes int __get_data_size(struct trace_probe *tp, 1048 + struct pt_regs *regs) 1049 + { 1050 + int i, ret = 0; 1051 + u32 len; 1052 + 1053 + for (i = 0; i < tp->nr_args; i++) 1054 + if (unlikely(tp->args[i].fetch_size.fn)) { 1055 + call_fetch(&tp->args[i].fetch_size, regs, &len); 1056 + ret += len; 1057 + } 1058 + 1059 + return ret; 1060 + } 1061 + 1062 + /* Store the value of each argument */ 1063 + static __kprobes void store_trace_args(int ent_size, struct trace_probe *tp, 1064 + struct pt_regs *regs, 1065 + u8 *data, int maxlen) 1066 + { 1067 + int i; 1068 + u32 end = tp->size; 1069 + u32 *dl; /* Data (relative) location */ 1070 + 1071 + for (i = 0; i < tp->nr_args; i++) { 1072 + if (unlikely(tp->args[i].fetch_size.fn)) { 1073 + /* 1074 + * First, we set the relative location and 1075 + * maximum data length to *dl 1076 + */ 1077 + dl = (u32 *)(data + tp->args[i].offset); 1078 + *dl = make_data_rloc(maxlen, end - tp->args[i].offset); 1079 + /* Then try to fetch string or dynamic array data */ 1080 + call_fetch(&tp->args[i].fetch, regs, dl); 1081 + /* Reduce maximum length */ 1082 + end += get_rloc_len(*dl); 1083 + maxlen -= get_rloc_len(*dl); 1084 + /* Trick here, convert data_rloc to data_loc */ 1085 + *dl = convert_rloc_to_loc(*dl, 1086 + ent_size + tp->args[i].offset); 1087 + } else 1088 + /* Just fetching data normally */ 1089 + call_fetch(&tp->args[i].fetch, regs, 1090 + data + tp->args[i].offset); 1091 + } 1092 + } 1093 + 1217 1094 /* Kprobe handler */ 1218 1095 static __kprobes void kprobe_trace_func(struct kprobe *kp, struct pt_regs *regs) 1219 1096 { ··· 1269 1050 struct kprobe_trace_entry_head *entry; 1270 1051 struct ring_buffer_event *event; 1271 1052 struct ring_buffer *buffer; 1272 - u8 *data; 1273 - int size, i, 
pc; 1053 + int size, dsize, pc; 1274 1054 unsigned long irq_flags; 1275 1055 struct ftrace_event_call *call = &tp->call; 1276 1056 ··· 1278 1060 local_save_flags(irq_flags); 1279 1061 pc = preempt_count(); 1280 1062 1281 - size = sizeof(*entry) + tp->size; 1063 + dsize = __get_data_size(tp, regs); 1064 + size = sizeof(*entry) + tp->size + dsize; 1282 1065 1283 1066 event = trace_current_buffer_lock_reserve(&buffer, call->event.type, 1284 1067 size, irq_flags, pc); ··· 1288 1069 1289 1070 entry = ring_buffer_event_data(event); 1290 1071 entry->ip = (unsigned long)kp->addr; 1291 - data = (u8 *)&entry[1]; 1292 - for (i = 0; i < tp->nr_args; i++) 1293 - call_fetch(&tp->args[i].fetch, regs, data + tp->args[i].offset); 1072 + store_trace_args(sizeof(*entry), tp, regs, (u8 *)&entry[1], dsize); 1294 1073 1295 1074 if (!filter_current_check_discard(buffer, call, entry, event)) 1296 1075 trace_nowake_buffer_unlock_commit(buffer, event, irq_flags, pc); ··· 1302 1085 struct kretprobe_trace_entry_head *entry; 1303 1086 struct ring_buffer_event *event; 1304 1087 struct ring_buffer *buffer; 1305 - u8 *data; 1306 - int size, i, pc; 1088 + int size, pc, dsize; 1307 1089 unsigned long irq_flags; 1308 1090 struct ftrace_event_call *call = &tp->call; 1309 1091 1310 1092 local_save_flags(irq_flags); 1311 1093 pc = preempt_count(); 1312 1094 1313 - size = sizeof(*entry) + tp->size; 1095 + dsize = __get_data_size(tp, regs); 1096 + size = sizeof(*entry) + tp->size + dsize; 1314 1097 1315 1098 event = trace_current_buffer_lock_reserve(&buffer, call->event.type, 1316 1099 size, irq_flags, pc); ··· 1320 1103 entry = ring_buffer_event_data(event); 1321 1104 entry->func = (unsigned long)tp->rp.kp.addr; 1322 1105 entry->ret_ip = (unsigned long)ri->ret_addr; 1323 - data = (u8 *)&entry[1]; 1324 - for (i = 0; i < tp->nr_args; i++) 1325 - call_fetch(&tp->args[i].fetch, regs, data + tp->args[i].offset); 1106 + store_trace_args(sizeof(*entry), tp, regs, (u8 *)&entry[1], dsize); 1326 1107 1327 1108 if 
(!filter_current_check_discard(buffer, call, entry, event)) 1328 1109 trace_nowake_buffer_unlock_commit(buffer, event, irq_flags, pc); ··· 1352 1137 data = (u8 *)&field[1]; 1353 1138 for (i = 0; i < tp->nr_args; i++) 1354 1139 if (!tp->args[i].type->print(s, tp->args[i].name, 1355 - data + tp->args[i].offset)) 1140 + data + tp->args[i].offset, field)) 1356 1141 goto partial; 1357 1142 1358 1143 if (!trace_seq_puts(s, "\n")) ··· 1394 1179 data = (u8 *)&field[1]; 1395 1180 for (i = 0; i < tp->nr_args; i++) 1396 1181 if (!tp->args[i].type->print(s, tp->args[i].name, 1397 - data + tp->args[i].offset)) 1182 + data + tp->args[i].offset, field)) 1398 1183 goto partial; 1399 1184 1400 1185 if (!trace_seq_puts(s, "\n")) ··· 1429 1214 } 1430 1215 } 1431 1216 1432 - static int probe_event_raw_init(struct ftrace_event_call *event_call) 1433 - { 1434 - return 0; 1435 - } 1436 - 1437 1217 #undef DEFINE_FIELD 1438 1218 #define DEFINE_FIELD(type, item, name, is_signed) \ 1439 1219 do { \ ··· 1449 1239 DEFINE_FIELD(unsigned long, ip, FIELD_STRING_IP, 0); 1450 1240 /* Set argument names as fields */ 1451 1241 for (i = 0; i < tp->nr_args; i++) { 1452 - ret = trace_define_field(event_call, tp->args[i].type->name, 1242 + ret = trace_define_field(event_call, tp->args[i].type->fmttype, 1453 1243 tp->args[i].name, 1454 1244 sizeof(field) + tp->args[i].offset, 1455 1245 tp->args[i].type->size, ··· 1471 1261 DEFINE_FIELD(unsigned long, ret_ip, FIELD_STRING_RETIP, 0); 1472 1262 /* Set argument names as fields */ 1473 1263 for (i = 0; i < tp->nr_args; i++) { 1474 - ret = trace_define_field(event_call, tp->args[i].type->name, 1264 + ret = trace_define_field(event_call, tp->args[i].type->fmttype, 1475 1265 tp->args[i].name, 1476 1266 sizeof(field) + tp->args[i].offset, 1477 1267 tp->args[i].type->size, ··· 1511 1301 pos += snprintf(buf + pos, LEN_OR_ZERO, "\", %s", arg); 1512 1302 1513 1303 for (i = 0; i < tp->nr_args; i++) { 1514 - pos += snprintf(buf + pos, LEN_OR_ZERO, ", REC->%s", 1515 - 
tp->args[i].name); 1304 + if (strcmp(tp->args[i].type->name, "string") == 0) 1305 + pos += snprintf(buf + pos, LEN_OR_ZERO, 1306 + ", __get_str(%s)", 1307 + tp->args[i].name); 1308 + else 1309 + pos += snprintf(buf + pos, LEN_OR_ZERO, ", REC->%s", 1310 + tp->args[i].name); 1516 1311 } 1517 1312 1518 1313 #undef LEN_OR_ZERO ··· 1554 1339 struct ftrace_event_call *call = &tp->call; 1555 1340 struct kprobe_trace_entry_head *entry; 1556 1341 struct hlist_head *head; 1557 - u8 *data; 1558 - int size, __size, i; 1342 + int size, __size, dsize; 1559 1343 int rctx; 1560 1344 1561 - __size = sizeof(*entry) + tp->size; 1345 + dsize = __get_data_size(tp, regs); 1346 + __size = sizeof(*entry) + tp->size + dsize; 1562 1347 size = ALIGN(__size + sizeof(u32), sizeof(u64)); 1563 1348 size -= sizeof(u32); 1564 1349 if (WARN_ONCE(size > PERF_MAX_TRACE_SIZE, ··· 1570 1355 return; 1571 1356 1572 1357 entry->ip = (unsigned long)kp->addr; 1573 - data = (u8 *)&entry[1]; 1574 - for (i = 0; i < tp->nr_args; i++) 1575 - call_fetch(&tp->args[i].fetch, regs, data + tp->args[i].offset); 1358 + memset(&entry[1], 0, dsize); 1359 + store_trace_args(sizeof(*entry), tp, regs, (u8 *)&entry[1], dsize); 1576 1360 1577 1361 head = this_cpu_ptr(call->perf_events); 1578 1362 perf_trace_buf_submit(entry, size, rctx, entry->ip, 1, regs, head); ··· 1585 1371 struct ftrace_event_call *call = &tp->call; 1586 1372 struct kretprobe_trace_entry_head *entry; 1587 1373 struct hlist_head *head; 1588 - u8 *data; 1589 - int size, __size, i; 1374 + int size, __size, dsize; 1590 1375 int rctx; 1591 1376 1592 - __size = sizeof(*entry) + tp->size; 1377 + dsize = __get_data_size(tp, regs); 1378 + __size = sizeof(*entry) + tp->size + dsize; 1593 1379 size = ALIGN(__size + sizeof(u32), sizeof(u64)); 1594 1380 size -= sizeof(u32); 1595 1381 if (WARN_ONCE(size > PERF_MAX_TRACE_SIZE, ··· 1602 1388 1603 1389 entry->func = (unsigned long)tp->rp.kp.addr; 1604 1390 entry->ret_ip = (unsigned long)ri->ret_addr; 1605 - data = (u8 
*)&entry[1]; 1606 - for (i = 0; i < tp->nr_args; i++) 1607 - call_fetch(&tp->args[i].fetch, regs, data + tp->args[i].offset); 1391 + store_trace_args(sizeof(*entry), tp, regs, (u8 *)&entry[1], dsize); 1608 1392 1609 1393 head = this_cpu_ptr(call->perf_events); 1610 1394 perf_trace_buf_submit(entry, size, rctx, entry->ret_ip, 1, regs, head); ··· 1698 1486 int ret; 1699 1487 1700 1488 /* Initialize ftrace_event_call */ 1489 + INIT_LIST_HEAD(&call->class->fields); 1701 1490 if (probe_is_return(tp)) { 1702 - INIT_LIST_HEAD(&call->class->fields); 1703 1491 call->event.funcs = &kretprobe_funcs; 1704 - call->class->raw_init = probe_event_raw_init; 1705 1492 call->class->define_fields = kretprobe_event_define_fields; 1706 1493 } else { 1707 - INIT_LIST_HEAD(&call->class->fields); 1708 1494 call->event.funcs = &kprobe_funcs; 1709 - call->class->raw_init = probe_event_raw_init; 1710 1495 call->class->define_fields = kprobe_event_define_fields; 1711 1496 } 1712 1497 if (set_print_fmt(tp) < 0)
-508
kernel/trace/trace_ksym.c
··· 1 - /* 2 - * trace_ksym.c - Kernel Symbol Tracer 3 - * 4 - * This program is free software; you can redistribute it and/or modify 5 - * it under the terms of the GNU General Public License as published by 6 - * the Free Software Foundation; either version 2 of the License, or 7 - * (at your option) any later version. 8 - * 9 - * This program is distributed in the hope that it will be useful, 10 - * but WITHOUT ANY WARRANTY; without even the implied warranty of 11 - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 12 - * GNU General Public License for more details. 13 - * 14 - * You should have received a copy of the GNU General Public License 15 - * along with this program; if not, write to the Free Software 16 - * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. 17 - * 18 - * Copyright (C) IBM Corporation, 2009 19 - */ 20 - 21 - #include <linux/kallsyms.h> 22 - #include <linux/uaccess.h> 23 - #include <linux/debugfs.h> 24 - #include <linux/ftrace.h> 25 - #include <linux/module.h> 26 - #include <linux/slab.h> 27 - #include <linux/fs.h> 28 - 29 - #include "trace_output.h" 30 - #include "trace.h" 31 - 32 - #include <linux/hw_breakpoint.h> 33 - #include <asm/hw_breakpoint.h> 34 - 35 - #include <asm/atomic.h> 36 - 37 - #define KSYM_TRACER_OP_LEN 3 /* rw- */ 38 - 39 - struct trace_ksym { 40 - struct perf_event **ksym_hbp; 41 - struct perf_event_attr attr; 42 - #ifdef CONFIG_PROFILE_KSYM_TRACER 43 - atomic64_t counter; 44 - #endif 45 - struct hlist_node ksym_hlist; 46 - }; 47 - 48 - static struct trace_array *ksym_trace_array; 49 - 50 - static unsigned int ksym_tracing_enabled; 51 - 52 - static HLIST_HEAD(ksym_filter_head); 53 - 54 - static DEFINE_MUTEX(ksym_tracer_mutex); 55 - 56 - #ifdef CONFIG_PROFILE_KSYM_TRACER 57 - 58 - #define MAX_UL_INT 0xffffffff 59 - 60 - void ksym_collect_stats(unsigned long hbp_hit_addr) 61 - { 62 - struct hlist_node *node; 63 - struct trace_ksym *entry; 64 - 65 - rcu_read_lock(); 66 - 
hlist_for_each_entry_rcu(entry, node, &ksym_filter_head, ksym_hlist) { 67 - if (entry->attr.bp_addr == hbp_hit_addr) { 68 - atomic64_inc(&entry->counter); 69 - break; 70 - } 71 - } 72 - rcu_read_unlock(); 73 - } 74 - #endif /* CONFIG_PROFILE_KSYM_TRACER */ 75 - 76 - void ksym_hbp_handler(struct perf_event *hbp, int nmi, 77 - struct perf_sample_data *data, 78 - struct pt_regs *regs) 79 - { 80 - struct ring_buffer_event *event; 81 - struct ksym_trace_entry *entry; 82 - struct ring_buffer *buffer; 83 - int pc; 84 - 85 - if (!ksym_tracing_enabled) 86 - return; 87 - 88 - buffer = ksym_trace_array->buffer; 89 - 90 - pc = preempt_count(); 91 - 92 - event = trace_buffer_lock_reserve(buffer, TRACE_KSYM, 93 - sizeof(*entry), 0, pc); 94 - if (!event) 95 - return; 96 - 97 - entry = ring_buffer_event_data(event); 98 - entry->ip = instruction_pointer(regs); 99 - entry->type = hw_breakpoint_type(hbp); 100 - entry->addr = hw_breakpoint_addr(hbp); 101 - strlcpy(entry->cmd, current->comm, TASK_COMM_LEN); 102 - 103 - #ifdef CONFIG_PROFILE_KSYM_TRACER 104 - ksym_collect_stats(hw_breakpoint_addr(hbp)); 105 - #endif /* CONFIG_PROFILE_KSYM_TRACER */ 106 - 107 - trace_buffer_unlock_commit(buffer, event, 0, pc); 108 - } 109 - 110 - /* Valid access types are represented as 111 - * 112 - * rw- : Set Read/Write Access Breakpoint 113 - * -w- : Set Write Access Breakpoint 114 - * --- : Clear Breakpoints 115 - * --x : Set Execution Break points (Not available yet) 116 - * 117 - */ 118 - static int ksym_trace_get_access_type(char *str) 119 - { 120 - int access = 0; 121 - 122 - if (str[0] == 'r') 123 - access |= HW_BREAKPOINT_R; 124 - 125 - if (str[1] == 'w') 126 - access |= HW_BREAKPOINT_W; 127 - 128 - if (str[2] == 'x') 129 - access |= HW_BREAKPOINT_X; 130 - 131 - switch (access) { 132 - case HW_BREAKPOINT_R: 133 - case HW_BREAKPOINT_W: 134 - case HW_BREAKPOINT_W | HW_BREAKPOINT_R: 135 - return access; 136 - default: 137 - return -EINVAL; 138 - } 139 - } 140 - 141 - /* 142 - * There can be 
several possible malformed requests and we attempt to capture 143 - * all of them. We enumerate some of the rules 144 - * 1. We will not allow kernel symbols with ':' since it is used as a delimiter. 145 - * i.e. multiple ':' symbols disallowed. Possible uses are of the form 146 - * <module>:<ksym_name>:<op>. 147 - * 2. No delimiter symbol ':' in the input string 148 - * 3. Spurious operator symbols or symbols not in their respective positions 149 - * 4. <ksym_name>:--- i.e. clear breakpoint request when ksym_name not in file 150 - * 5. Kernel symbol not a part of /proc/kallsyms 151 - * 6. Duplicate requests 152 - */ 153 - static int parse_ksym_trace_str(char *input_string, char **ksymname, 154 - unsigned long *addr) 155 - { 156 - int ret; 157 - 158 - *ksymname = strsep(&input_string, ":"); 159 - *addr = kallsyms_lookup_name(*ksymname); 160 - 161 - /* Check for malformed request: (2), (1) and (5) */ 162 - if ((!input_string) || 163 - (strlen(input_string) != KSYM_TRACER_OP_LEN) || 164 - (*addr == 0)) 165 - return -EINVAL;; 166 - 167 - ret = ksym_trace_get_access_type(input_string); 168 - 169 - return ret; 170 - } 171 - 172 - int process_new_ksym_entry(char *ksymname, int op, unsigned long addr) 173 - { 174 - struct trace_ksym *entry; 175 - int ret = -ENOMEM; 176 - 177 - entry = kzalloc(sizeof(struct trace_ksym), GFP_KERNEL); 178 - if (!entry) 179 - return -ENOMEM; 180 - 181 - hw_breakpoint_init(&entry->attr); 182 - 183 - entry->attr.bp_type = op; 184 - entry->attr.bp_addr = addr; 185 - entry->attr.bp_len = HW_BREAKPOINT_LEN_4; 186 - 187 - entry->ksym_hbp = register_wide_hw_breakpoint(&entry->attr, 188 - ksym_hbp_handler); 189 - 190 - if (IS_ERR(entry->ksym_hbp)) { 191 - ret = PTR_ERR(entry->ksym_hbp); 192 - if (ret == -ENOSPC) { 193 - printk(KERN_ERR "ksym_tracer: Maximum limit reached." 194 - " No new requests for tracing can be accepted now.\n"); 195 - } else { 196 - printk(KERN_INFO "ksym_tracer request failed. 
Try again" 197 - " later!!\n"); 198 - } 199 - goto err; 200 - } 201 - 202 - hlist_add_head_rcu(&(entry->ksym_hlist), &ksym_filter_head); 203 - 204 - return 0; 205 - 206 - err: 207 - kfree(entry); 208 - 209 - return ret; 210 - } 211 - 212 - static ssize_t ksym_trace_filter_read(struct file *filp, char __user *ubuf, 213 - size_t count, loff_t *ppos) 214 - { 215 - struct trace_ksym *entry; 216 - struct hlist_node *node; 217 - struct trace_seq *s; 218 - ssize_t cnt = 0; 219 - int ret; 220 - 221 - s = kmalloc(sizeof(*s), GFP_KERNEL); 222 - if (!s) 223 - return -ENOMEM; 224 - trace_seq_init(s); 225 - 226 - mutex_lock(&ksym_tracer_mutex); 227 - 228 - hlist_for_each_entry(entry, node, &ksym_filter_head, ksym_hlist) { 229 - ret = trace_seq_printf(s, "%pS:", 230 - (void *)(unsigned long)entry->attr.bp_addr); 231 - if (entry->attr.bp_type == HW_BREAKPOINT_R) 232 - ret = trace_seq_puts(s, "r--\n"); 233 - else if (entry->attr.bp_type == HW_BREAKPOINT_W) 234 - ret = trace_seq_puts(s, "-w-\n"); 235 - else if (entry->attr.bp_type == (HW_BREAKPOINT_W | HW_BREAKPOINT_R)) 236 - ret = trace_seq_puts(s, "rw-\n"); 237 - WARN_ON_ONCE(!ret); 238 - } 239 - 240 - cnt = simple_read_from_buffer(ubuf, count, ppos, s->buffer, s->len); 241 - 242 - mutex_unlock(&ksym_tracer_mutex); 243 - 244 - kfree(s); 245 - 246 - return cnt; 247 - } 248 - 249 - static void __ksym_trace_reset(void) 250 - { 251 - struct trace_ksym *entry; 252 - struct hlist_node *node, *node1; 253 - 254 - mutex_lock(&ksym_tracer_mutex); 255 - hlist_for_each_entry_safe(entry, node, node1, &ksym_filter_head, 256 - ksym_hlist) { 257 - unregister_wide_hw_breakpoint(entry->ksym_hbp); 258 - hlist_del_rcu(&(entry->ksym_hlist)); 259 - synchronize_rcu(); 260 - kfree(entry); 261 - } 262 - mutex_unlock(&ksym_tracer_mutex); 263 - } 264 - 265 - static ssize_t ksym_trace_filter_write(struct file *file, 266 - const char __user *buffer, 267 - size_t count, loff_t *ppos) 268 - { 269 - struct trace_ksym *entry; 270 - struct hlist_node *node; 271 - 
char *buf, *input_string, *ksymname = NULL; 272 - unsigned long ksym_addr = 0; 273 - int ret, op, changed = 0; 274 - 275 - buf = kzalloc(count + 1, GFP_KERNEL); 276 - if (!buf) 277 - return -ENOMEM; 278 - 279 - ret = -EFAULT; 280 - if (copy_from_user(buf, buffer, count)) 281 - goto out; 282 - 283 - buf[count] = '\0'; 284 - input_string = strstrip(buf); 285 - 286 - /* 287 - * Clear all breakpoints if: 288 - * 1: echo > ksym_trace_filter 289 - * 2: echo 0 > ksym_trace_filter 290 - * 3: echo "*:---" > ksym_trace_filter 291 - */ 292 - if (!input_string[0] || !strcmp(input_string, "0") || 293 - !strcmp(input_string, "*:---")) { 294 - __ksym_trace_reset(); 295 - ret = 0; 296 - goto out; 297 - } 298 - 299 - ret = op = parse_ksym_trace_str(input_string, &ksymname, &ksym_addr); 300 - if (ret < 0) 301 - goto out; 302 - 303 - mutex_lock(&ksym_tracer_mutex); 304 - 305 - ret = -EINVAL; 306 - hlist_for_each_entry(entry, node, &ksym_filter_head, ksym_hlist) { 307 - if (entry->attr.bp_addr == ksym_addr) { 308 - /* Check for malformed request: (6) */ 309 - if (entry->attr.bp_type != op) 310 - changed = 1; 311 - else 312 - goto out_unlock; 313 - break; 314 - } 315 - } 316 - if (changed) { 317 - unregister_wide_hw_breakpoint(entry->ksym_hbp); 318 - entry->attr.bp_type = op; 319 - ret = 0; 320 - if (op > 0) { 321 - entry->ksym_hbp = 322 - register_wide_hw_breakpoint(&entry->attr, 323 - ksym_hbp_handler); 324 - if (IS_ERR(entry->ksym_hbp)) 325 - ret = PTR_ERR(entry->ksym_hbp); 326 - else 327 - goto out_unlock; 328 - } 329 - /* Error or "symbol:---" case: drop it */ 330 - hlist_del_rcu(&(entry->ksym_hlist)); 331 - synchronize_rcu(); 332 - kfree(entry); 333 - goto out_unlock; 334 - } else { 335 - /* Check for malformed request: (4) */ 336 - if (op) 337 - ret = process_new_ksym_entry(ksymname, op, ksym_addr); 338 - } 339 - out_unlock: 340 - mutex_unlock(&ksym_tracer_mutex); 341 - out: 342 - kfree(buf); 343 - return !ret ? 
count : ret; 344 - } 345 - 346 - static const struct file_operations ksym_tracing_fops = { 347 - .open = tracing_open_generic, 348 - .read = ksym_trace_filter_read, 349 - .write = ksym_trace_filter_write, 350 - }; 351 - 352 - static void ksym_trace_reset(struct trace_array *tr) 353 - { 354 - ksym_tracing_enabled = 0; 355 - __ksym_trace_reset(); 356 - } 357 - 358 - static int ksym_trace_init(struct trace_array *tr) 359 - { 360 - int cpu, ret = 0; 361 - 362 - for_each_online_cpu(cpu) 363 - tracing_reset(tr, cpu); 364 - ksym_tracing_enabled = 1; 365 - ksym_trace_array = tr; 366 - 367 - return ret; 368 - } 369 - 370 - static void ksym_trace_print_header(struct seq_file *m) 371 - { 372 - seq_puts(m, 373 - "# TASK-PID CPU# Symbol " 374 - "Type Function\n"); 375 - seq_puts(m, 376 - "# | | | " 377 - " | |\n"); 378 - } 379 - 380 - static enum print_line_t ksym_trace_output(struct trace_iterator *iter) 381 - { 382 - struct trace_entry *entry = iter->ent; 383 - struct trace_seq *s = &iter->seq; 384 - struct ksym_trace_entry *field; 385 - char str[KSYM_SYMBOL_LEN]; 386 - int ret; 387 - 388 - if (entry->type != TRACE_KSYM) 389 - return TRACE_TYPE_UNHANDLED; 390 - 391 - trace_assign_type(field, entry); 392 - 393 - ret = trace_seq_printf(s, "%11s-%-5d [%03d] %pS", field->cmd, 394 - entry->pid, iter->cpu, (char *)field->addr); 395 - if (!ret) 396 - return TRACE_TYPE_PARTIAL_LINE; 397 - 398 - switch (field->type) { 399 - case HW_BREAKPOINT_R: 400 - ret = trace_seq_printf(s, " R "); 401 - break; 402 - case HW_BREAKPOINT_W: 403 - ret = trace_seq_printf(s, " W "); 404 - break; 405 - case HW_BREAKPOINT_R | HW_BREAKPOINT_W: 406 - ret = trace_seq_printf(s, " RW "); 407 - break; 408 - default: 409 - return TRACE_TYPE_PARTIAL_LINE; 410 - } 411 - 412 - if (!ret) 413 - return TRACE_TYPE_PARTIAL_LINE; 414 - 415 - sprint_symbol(str, field->ip); 416 - ret = trace_seq_printf(s, "%s\n", str); 417 - if (!ret) 418 - return TRACE_TYPE_PARTIAL_LINE; 419 - 420 - return TRACE_TYPE_HANDLED; 421 - } 422 
- 423 - struct tracer ksym_tracer __read_mostly = 424 - { 425 - .name = "ksym_tracer", 426 - .init = ksym_trace_init, 427 - .reset = ksym_trace_reset, 428 - #ifdef CONFIG_FTRACE_SELFTEST 429 - .selftest = trace_selftest_startup_ksym, 430 - #endif 431 - .print_header = ksym_trace_print_header, 432 - .print_line = ksym_trace_output 433 - }; 434 - 435 - #ifdef CONFIG_PROFILE_KSYM_TRACER 436 - static int ksym_profile_show(struct seq_file *m, void *v) 437 - { 438 - struct hlist_node *node; 439 - struct trace_ksym *entry; 440 - int access_type = 0; 441 - char fn_name[KSYM_NAME_LEN]; 442 - 443 - seq_puts(m, " Access Type "); 444 - seq_puts(m, " Symbol Counter\n"); 445 - seq_puts(m, " ----------- "); 446 - seq_puts(m, " ------ -------\n"); 447 - 448 - rcu_read_lock(); 449 - hlist_for_each_entry_rcu(entry, node, &ksym_filter_head, ksym_hlist) { 450 - 451 - access_type = entry->attr.bp_type; 452 - 453 - switch (access_type) { 454 - case HW_BREAKPOINT_R: 455 - seq_puts(m, " R "); 456 - break; 457 - case HW_BREAKPOINT_W: 458 - seq_puts(m, " W "); 459 - break; 460 - case HW_BREAKPOINT_R | HW_BREAKPOINT_W: 461 - seq_puts(m, " RW "); 462 - break; 463 - default: 464 - seq_puts(m, " NA "); 465 - } 466 - 467 - if (lookup_symbol_name(entry->attr.bp_addr, fn_name) >= 0) 468 - seq_printf(m, " %-36s", fn_name); 469 - else 470 - seq_printf(m, " %-36s", "<NA>"); 471 - seq_printf(m, " %15llu\n", 472 - (unsigned long long)atomic64_read(&entry->counter)); 473 - } 474 - rcu_read_unlock(); 475 - 476 - return 0; 477 - } 478 - 479 - static int ksym_profile_open(struct inode *node, struct file *file) 480 - { 481 - return single_open(file, ksym_profile_show, NULL); 482 - } 483 - 484 - static const struct file_operations ksym_profile_fops = { 485 - .open = ksym_profile_open, 486 - .read = seq_read, 487 - .llseek = seq_lseek, 488 - .release = single_release, 489 - }; 490 - #endif /* CONFIG_PROFILE_KSYM_TRACER */ 491 - 492 - __init static int init_ksym_trace(void) 493 - { 494 - struct dentry 
*d_tracer; 495 - 496 - d_tracer = tracing_init_dentry(); 497 - 498 - trace_create_file("ksym_trace_filter", 0644, d_tracer, 499 - NULL, &ksym_tracing_fops); 500 - 501 - #ifdef CONFIG_PROFILE_KSYM_TRACER 502 - trace_create_file("ksym_profile", 0444, d_tracer, 503 - NULL, &ksym_profile_fops); 504 - #endif 505 - 506 - return register_tracer(&ksym_tracer); 507 - } 508 - device_initcall(init_ksym_trace);
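
The ksym_trace_filter grammar removed above ("rw-", "-w-", "---") is compact enough to exercise outside the kernel. The following is a hypothetical userspace sketch of the access-type parsing only; BP_R/BP_W/BP_X and get_access_type() are local stand-in names, not the kernel's HW_BREAKPOINT_* constants or symbols:

```c
#include <string.h>

/* Local stand-ins for HW_BREAKPOINT_R/W/X; values are illustrative. */
#define BP_R 0x1
#define BP_W 0x2
#define BP_X 0x4

/* Map a three-character op string ("rw-", "-w-", ...) to access flags.
 * Mirroring the switch in ksym_trace_get_access_type(), anything
 * involving 'x' (execute) or yielding no flags is rejected with -1,
 * standing in for -EINVAL. */
static int get_access_type(const char *str)
{
	int access = 0;

	if (strlen(str) != 3)	/* KSYM_TRACER_OP_LEN analogue */
		return -1;
	if (str[0] == 'r')
		access |= BP_R;
	if (str[1] == 'w')
		access |= BP_W;
	if (str[2] == 'x')
		access |= BP_X;

	switch (access) {
	case BP_R:
	case BP_W:
	case BP_W | BP_R:
		return access;
	default:
		return -1;
	}
}
```

As in the tracer's switch, an execute request ("--x") and an all-clear string ("---") both fall through to the error case in this sketch; only read, write, and read/write requests survive.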
-69
kernel/trace/trace_output.c
··· 16 16 17 17 DECLARE_RWSEM(trace_event_mutex); 18 18 19 - DEFINE_PER_CPU(struct trace_seq, ftrace_event_seq); 20 - EXPORT_PER_CPU_SYMBOL(ftrace_event_seq); 21 - 22 19 static struct hlist_head event_hash[EVENT_HASHSIZE] __read_mostly; 23 20 24 21 static int next_event_type = __TRACE_LAST_TYPE + 1; ··· 1066 1069 .funcs = &trace_wake_funcs, 1067 1070 }; 1068 1071 1069 - /* TRACE_SPECIAL */ 1070 - static enum print_line_t trace_special_print(struct trace_iterator *iter, 1071 - int flags, struct trace_event *event) 1072 - { 1073 - struct special_entry *field; 1074 - 1075 - trace_assign_type(field, iter->ent); 1076 - 1077 - if (!trace_seq_printf(&iter->seq, "# %ld %ld %ld\n", 1078 - field->arg1, 1079 - field->arg2, 1080 - field->arg3)) 1081 - return TRACE_TYPE_PARTIAL_LINE; 1082 - 1083 - return TRACE_TYPE_HANDLED; 1084 - } 1085 - 1086 - static enum print_line_t trace_special_hex(struct trace_iterator *iter, 1087 - int flags, struct trace_event *event) 1088 - { 1089 - struct special_entry *field; 1090 - struct trace_seq *s = &iter->seq; 1091 - 1092 - trace_assign_type(field, iter->ent); 1093 - 1094 - SEQ_PUT_HEX_FIELD_RET(s, field->arg1); 1095 - SEQ_PUT_HEX_FIELD_RET(s, field->arg2); 1096 - SEQ_PUT_HEX_FIELD_RET(s, field->arg3); 1097 - 1098 - return TRACE_TYPE_HANDLED; 1099 - } 1100 - 1101 - static enum print_line_t trace_special_bin(struct trace_iterator *iter, 1102 - int flags, struct trace_event *event) 1103 - { 1104 - struct special_entry *field; 1105 - struct trace_seq *s = &iter->seq; 1106 - 1107 - trace_assign_type(field, iter->ent); 1108 - 1109 - SEQ_PUT_FIELD_RET(s, field->arg1); 1110 - SEQ_PUT_FIELD_RET(s, field->arg2); 1111 - SEQ_PUT_FIELD_RET(s, field->arg3); 1112 - 1113 - return TRACE_TYPE_HANDLED; 1114 - } 1115 - 1116 - static struct trace_event_functions trace_special_funcs = { 1117 - .trace = trace_special_print, 1118 - .raw = trace_special_print, 1119 - .hex = trace_special_hex, 1120 - .binary = trace_special_bin, 1121 - }; 1122 - 1123 - static struct 
trace_event trace_special_event = { 1124 - .type = TRACE_SPECIAL, 1125 - .funcs = &trace_special_funcs, 1126 - }; 1127 - 1128 1072 /* TRACE_STACK */ 1129 1073 1130 1074 static enum print_line_t trace_stack_print(struct trace_iterator *iter, ··· 1099 1161 1100 1162 static struct trace_event_functions trace_stack_funcs = { 1101 1163 .trace = trace_stack_print, 1102 - .raw = trace_special_print, 1103 - .hex = trace_special_hex, 1104 - .binary = trace_special_bin, 1105 1164 }; 1106 1165 1107 1166 static struct trace_event trace_stack_event = { ··· 1129 1194 1130 1195 static struct trace_event_functions trace_user_stack_funcs = { 1131 1196 .trace = trace_user_stack_print, 1132 - .raw = trace_special_print, 1133 - .hex = trace_special_hex, 1134 - .binary = trace_special_bin, 1135 1197 }; 1136 1198 1137 1199 static struct trace_event trace_user_stack_event = { ··· 1246 1314 &trace_fn_event, 1247 1315 &trace_ctx_event, 1248 1316 &trace_wake_event, 1249 - &trace_special_event, 1250 1317 &trace_stack_event, 1251 1318 &trace_user_stack_event, 1252 1319 &trace_bprint_event,
+4 -3
kernel/trace/trace_sched_wakeup.c
··· 46 46 struct trace_array_cpu *data; 47 47 unsigned long flags; 48 48 long disabled; 49 - int resched; 50 49 int cpu; 51 50 int pc; 52 51 ··· 53 54 return; 54 55 55 56 pc = preempt_count(); 56 - resched = ftrace_preempt_disable(); 57 + preempt_disable_notrace(); 57 58 58 59 cpu = raw_smp_processor_id(); 59 60 if (cpu != wakeup_current_cpu) ··· 73 74 out: 74 75 atomic_dec(&data->disabled); 75 76 out_enable: 76 - ftrace_preempt_enable(resched); 77 + preempt_enable_notrace(); 77 78 } 78 79 79 80 static struct ftrace_ops trace_ops __read_mostly = ··· 382 383 #ifdef CONFIG_FTRACE_SELFTEST 383 384 .selftest = trace_selftest_startup_wakeup, 384 385 #endif 386 + .use_max_tr = 1, 385 387 }; 386 388 387 389 static struct tracer wakeup_rt_tracer __read_mostly = ··· 397 397 #ifdef CONFIG_FTRACE_SELFTEST 398 398 .selftest = trace_selftest_startup_wakeup, 399 399 #endif 400 + .use_max_tr = 1, 400 401 }; 401 402 402 403 __init static int init_wakeup_tracer(void)
-87
kernel/trace/trace_selftest.c
··· 13 13 case TRACE_WAKE: 14 14 case TRACE_STACK: 15 15 case TRACE_PRINT: 16 - case TRACE_SPECIAL: 17 16 case TRACE_BRANCH: 18 17 case TRACE_GRAPH_ENT: 19 18 case TRACE_GRAPH_RET: 20 - case TRACE_KSYM: 21 19 return 1; 22 20 } 23 21 return 0; ··· 689 691 } 690 692 #endif /* CONFIG_CONTEXT_SWITCH_TRACER */ 691 693 692 - #ifdef CONFIG_SYSPROF_TRACER 693 - int 694 - trace_selftest_startup_sysprof(struct tracer *trace, struct trace_array *tr) 695 - { 696 - unsigned long count; 697 - int ret; 698 - 699 - /* start the tracing */ 700 - ret = tracer_init(trace, tr); 701 - if (ret) { 702 - warn_failed_init_tracer(trace, ret); 703 - return ret; 704 - } 705 - 706 - /* Sleep for a 1/10 of a second */ 707 - msleep(100); 708 - /* stop the tracing. */ 709 - tracing_stop(); 710 - /* check the trace buffer */ 711 - ret = trace_test_buffer(tr, &count); 712 - trace->reset(tr); 713 - tracing_start(); 714 - 715 - if (!ret && !count) { 716 - printk(KERN_CONT ".. no entries found .."); 717 - ret = -1; 718 - } 719 - 720 - return ret; 721 - } 722 - #endif /* CONFIG_SYSPROF_TRACER */ 723 - 724 694 #ifdef CONFIG_BRANCH_TRACER 725 695 int 726 696 trace_selftest_startup_branch(struct tracer *trace, struct trace_array *tr) ··· 720 754 return ret; 721 755 } 722 756 #endif /* CONFIG_BRANCH_TRACER */ 723 - 724 - #ifdef CONFIG_KSYM_TRACER 725 - static int ksym_selftest_dummy; 726 - 727 - int 728 - trace_selftest_startup_ksym(struct tracer *trace, struct trace_array *tr) 729 - { 730 - unsigned long count; 731 - int ret; 732 - 733 - /* start the tracing */ 734 - ret = tracer_init(trace, tr); 735 - if (ret) { 736 - warn_failed_init_tracer(trace, ret); 737 - return ret; 738 - } 739 - 740 - ksym_selftest_dummy = 0; 741 - /* Register the read-write tracing request */ 742 - 743 - ret = process_new_ksym_entry("ksym_selftest_dummy", 744 - HW_BREAKPOINT_R | HW_BREAKPOINT_W, 745 - (unsigned long)(&ksym_selftest_dummy)); 746 - 747 - if (ret < 0) { 748 - printk(KERN_CONT "ksym_trace read-write startup test 
failed\n"); 749 - goto ret_path; 750 - } 751 - /* Perform a read and a write operation over the dummy variable to 752 - * trigger the tracer 753 - */ 754 - if (ksym_selftest_dummy == 0) 755 - ksym_selftest_dummy++; 756 - 757 - /* stop the tracing. */ 758 - tracing_stop(); 759 - /* check the trace buffer */ 760 - ret = trace_test_buffer(tr, &count); 761 - trace->reset(tr); 762 - tracing_start(); 763 - 764 - /* read & write operations - one each is performed on the dummy variable 765 - * triggering two entries in the trace buffer 766 - */ 767 - if (!ret && count != 2) { 768 - printk(KERN_CONT "Ksym tracer startup test failed"); 769 - ret = -1; 770 - } 771 - 772 - ret_path: 773 - return ret; 774 - } 775 - #endif /* CONFIG_KSYM_TRACER */ 776 757
+3 -3
kernel/trace/trace_stack.c
··· 110 110 static void 111 111 stack_trace_call(unsigned long ip, unsigned long parent_ip) 112 112 { 113 - int cpu, resched; 113 + int cpu; 114 114 115 115 if (unlikely(!ftrace_enabled || stack_trace_disabled)) 116 116 return; 117 117 118 - resched = ftrace_preempt_disable(); 118 + preempt_disable_notrace(); 119 119 120 120 cpu = raw_smp_processor_id(); 121 121 /* no atomic needed, we only modify this variable by this cpu */ ··· 127 127 out: 128 128 per_cpu(trace_active, cpu)--; 129 129 /* prevent recursion in schedule */ 130 - ftrace_preempt_enable(resched); 130 + preempt_enable_notrace(); 131 131 } 132 132 133 133 static struct ftrace_ops trace_ops __read_mostly =
+4 -3
kernel/trace/trace_syscalls.c
··· 23 23 static int syscall_enter_define_fields(struct ftrace_event_call *call); 24 24 static int syscall_exit_define_fields(struct ftrace_event_call *call); 25 25 26 + /* All syscall exit events have the same fields */ 27 + static LIST_HEAD(syscall_exit_fields); 28 + 26 29 static struct list_head * 27 30 syscall_get_enter_fields(struct ftrace_event_call *call) 28 31 { ··· 37 34 static struct list_head * 38 35 syscall_get_exit_fields(struct ftrace_event_call *call) 39 36 { 40 - struct syscall_metadata *entry = call->data; 41 - 42 - return &entry->exit_fields; 37 + return &syscall_exit_fields; 43 38 } 44 39 45 40 struct trace_event_functions enter_syscall_print_funcs = {
-329
kernel/trace/trace_sysprof.c
··· 1 - /* 2 - * trace stack traces 3 - * 4 - * Copyright (C) 2004-2008, Soeren Sandmann 5 - * Copyright (C) 2007 Steven Rostedt <srostedt@redhat.com> 6 - * Copyright (C) 2008 Ingo Molnar <mingo@redhat.com> 7 - */ 8 - #include <linux/kallsyms.h> 9 - #include <linux/debugfs.h> 10 - #include <linux/hrtimer.h> 11 - #include <linux/uaccess.h> 12 - #include <linux/ftrace.h> 13 - #include <linux/module.h> 14 - #include <linux/irq.h> 15 - #include <linux/fs.h> 16 - 17 - #include <asm/stacktrace.h> 18 - 19 - #include "trace.h" 20 - 21 - static struct trace_array *sysprof_trace; 22 - static int __read_mostly tracer_enabled; 23 - 24 - /* 25 - * 1 msec sample interval by default: 26 - */ 27 - static unsigned long sample_period = 1000000; 28 - static const unsigned int sample_max_depth = 512; 29 - 30 - static DEFINE_MUTEX(sample_timer_lock); 31 - /* 32 - * Per CPU hrtimers that do the profiling: 33 - */ 34 - static DEFINE_PER_CPU(struct hrtimer, stack_trace_hrtimer); 35 - 36 - struct stack_frame { 37 - const void __user *next_fp; 38 - unsigned long return_address; 39 - }; 40 - 41 - static int copy_stack_frame(const void __user *fp, struct stack_frame *frame) 42 - { 43 - int ret; 44 - 45 - if (!access_ok(VERIFY_READ, fp, sizeof(*frame))) 46 - return 0; 47 - 48 - ret = 1; 49 - pagefault_disable(); 50 - if (__copy_from_user_inatomic(frame, fp, sizeof(*frame))) 51 - ret = 0; 52 - pagefault_enable(); 53 - 54 - return ret; 55 - } 56 - 57 - struct backtrace_info { 58 - struct trace_array_cpu *data; 59 - struct trace_array *tr; 60 - int pos; 61 - }; 62 - 63 - static void 64 - backtrace_warning_symbol(void *data, char *msg, unsigned long symbol) 65 - { 66 - /* Ignore warnings */ 67 - } 68 - 69 - static void backtrace_warning(void *data, char *msg) 70 - { 71 - /* Ignore warnings */ 72 - } 73 - 74 - static int backtrace_stack(void *data, char *name) 75 - { 76 - /* Don't bother with IRQ stacks for now */ 77 - return -1; 78 - } 79 - 80 - static void backtrace_address(void *data, unsigned 
long addr, int reliable) 81 - { 82 - struct backtrace_info *info = data; 83 - 84 - if (info->pos < sample_max_depth && reliable) { 85 - __trace_special(info->tr, info->data, 1, addr, 0); 86 - 87 - info->pos++; 88 - } 89 - } 90 - 91 - static const struct stacktrace_ops backtrace_ops = { 92 - .warning = backtrace_warning, 93 - .warning_symbol = backtrace_warning_symbol, 94 - .stack = backtrace_stack, 95 - .address = backtrace_address, 96 - .walk_stack = print_context_stack, 97 - }; 98 - 99 - static int 100 - trace_kernel(struct pt_regs *regs, struct trace_array *tr, 101 - struct trace_array_cpu *data) 102 - { 103 - struct backtrace_info info; 104 - unsigned long bp; 105 - char *stack; 106 - 107 - info.tr = tr; 108 - info.data = data; 109 - info.pos = 1; 110 - 111 - __trace_special(info.tr, info.data, 1, regs->ip, 0); 112 - 113 - stack = ((char *)regs + sizeof(struct pt_regs)); 114 - #ifdef CONFIG_FRAME_POINTER 115 - bp = regs->bp; 116 - #else 117 - bp = 0; 118 - #endif 119 - 120 - dump_trace(NULL, regs, (void *)stack, bp, &backtrace_ops, &info); 121 - 122 - return info.pos; 123 - } 124 - 125 - static void timer_notify(struct pt_regs *regs, int cpu) 126 - { 127 - struct trace_array_cpu *data; 128 - struct stack_frame frame; 129 - struct trace_array *tr; 130 - const void __user *fp; 131 - int is_user; 132 - int i; 133 - 134 - if (!regs) 135 - return; 136 - 137 - tr = sysprof_trace; 138 - data = tr->data[cpu]; 139 - is_user = user_mode(regs); 140 - 141 - if (!current || current->pid == 0) 142 - return; 143 - 144 - if (is_user && current->state != TASK_RUNNING) 145 - return; 146 - 147 - __trace_special(tr, data, 0, 0, current->pid); 148 - 149 - if (!is_user) 150 - i = trace_kernel(regs, tr, data); 151 - else 152 - i = 0; 153 - 154 - /* 155 - * Trace user stack if we are not a kernel thread 156 - */ 157 - if (current->mm && i < sample_max_depth) { 158 - regs = (struct pt_regs *)current->thread.sp0 - 1; 159 - 160 - fp = (void __user *)regs->bp; 161 - 162 - 
__trace_special(tr, data, 2, regs->ip, 0); 163 - 164 - while (i < sample_max_depth) { 165 - frame.next_fp = NULL; 166 - frame.return_address = 0; 167 - if (!copy_stack_frame(fp, &frame)) 168 - break; 169 - if ((unsigned long)fp < regs->sp) 170 - break; 171 - 172 - __trace_special(tr, data, 2, frame.return_address, 173 - (unsigned long)fp); 174 - fp = frame.next_fp; 175 - 176 - i++; 177 - } 178 - 179 - } 180 - 181 - /* 182 - * Special trace entry if we overflow the max depth: 183 - */ 184 - if (i == sample_max_depth) 185 - __trace_special(tr, data, -1, -1, -1); 186 - 187 - __trace_special(tr, data, 3, current->pid, i); 188 - } 189 - 190 - static enum hrtimer_restart stack_trace_timer_fn(struct hrtimer *hrtimer) 191 - { 192 - /* trace here */ 193 - timer_notify(get_irq_regs(), smp_processor_id()); 194 - 195 - hrtimer_forward_now(hrtimer, ns_to_ktime(sample_period)); 196 - 197 - return HRTIMER_RESTART; 198 - } 199 - 200 - static void start_stack_timer(void *unused) 201 - { 202 - struct hrtimer *hrtimer = &__get_cpu_var(stack_trace_hrtimer); 203 - 204 - hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); 205 - hrtimer->function = stack_trace_timer_fn; 206 - 207 - hrtimer_start(hrtimer, ns_to_ktime(sample_period), 208 - HRTIMER_MODE_REL_PINNED); 209 - } 210 - 211 - static void start_stack_timers(void) 212 - { 213 - on_each_cpu(start_stack_timer, NULL, 1); 214 - } 215 - 216 - static void stop_stack_timer(int cpu) 217 - { 218 - struct hrtimer *hrtimer = &per_cpu(stack_trace_hrtimer, cpu); 219 - 220 - hrtimer_cancel(hrtimer); 221 - } 222 - 223 - static void stop_stack_timers(void) 224 - { 225 - int cpu; 226 - 227 - for_each_online_cpu(cpu) 228 - stop_stack_timer(cpu); 229 - } 230 - 231 - static void stop_stack_trace(struct trace_array *tr) 232 - { 233 - mutex_lock(&sample_timer_lock); 234 - stop_stack_timers(); 235 - tracer_enabled = 0; 236 - mutex_unlock(&sample_timer_lock); 237 - } 238 - 239 - static int stack_trace_init(struct trace_array *tr) 240 - { 241 - 
sysprof_trace = tr; 242 - 243 - tracing_start_cmdline_record(); 244 - 245 - mutex_lock(&sample_timer_lock); 246 - start_stack_timers(); 247 - tracer_enabled = 1; 248 - mutex_unlock(&sample_timer_lock); 249 - return 0; 250 - } 251 - 252 - static void stack_trace_reset(struct trace_array *tr) 253 - { 254 - tracing_stop_cmdline_record(); 255 - stop_stack_trace(tr); 256 - } 257 - 258 - static struct tracer stack_trace __read_mostly = 259 - { 260 - .name = "sysprof", 261 - .init = stack_trace_init, 262 - .reset = stack_trace_reset, 263 - #ifdef CONFIG_FTRACE_SELFTEST 264 - .selftest = trace_selftest_startup_sysprof, 265 - #endif 266 - }; 267 - 268 - __init static int init_stack_trace(void) 269 - { 270 - return register_tracer(&stack_trace); 271 - } 272 - device_initcall(init_stack_trace); 273 - 274 - #define MAX_LONG_DIGITS 22 275 - 276 - static ssize_t 277 - sysprof_sample_read(struct file *filp, char __user *ubuf, 278 - size_t cnt, loff_t *ppos) 279 - { 280 - char buf[MAX_LONG_DIGITS]; 281 - int r; 282 - 283 - r = sprintf(buf, "%ld\n", nsecs_to_usecs(sample_period)); 284 - 285 - return simple_read_from_buffer(ubuf, cnt, ppos, buf, r); 286 - } 287 - 288 - static ssize_t 289 - sysprof_sample_write(struct file *filp, const char __user *ubuf, 290 - size_t cnt, loff_t *ppos) 291 - { 292 - char buf[MAX_LONG_DIGITS]; 293 - unsigned long val; 294 - 295 - if (cnt > MAX_LONG_DIGITS-1) 296 - cnt = MAX_LONG_DIGITS-1; 297 - 298 - if (copy_from_user(&buf, ubuf, cnt)) 299 - return -EFAULT; 300 - 301 - buf[cnt] = 0; 302 - 303 - val = simple_strtoul(buf, NULL, 10); 304 - /* 305 - * Enforce a minimum sample period of 100 usecs: 306 - */ 307 - if (val < 100) 308 - val = 100; 309 - 310 - mutex_lock(&sample_timer_lock); 311 - stop_stack_timers(); 312 - sample_period = val * 1000; 313 - start_stack_timers(); 314 - mutex_unlock(&sample_timer_lock); 315 - 316 - return cnt; 317 - } 318 - 319 - static const struct file_operations sysprof_sample_fops = { 320 - .read = sysprof_sample_read, 321 - 
.write = sysprof_sample_write, 322 - }; 323 - 324 - void init_tracer_sysprof_debugfs(struct dentry *d_tracer) 325 - { 326 - 327 - trace_create_file("sysprof_sample_period", 0644, 328 - d_tracer, NULL, &sysprof_sample_fops); 329 - }
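
Before its removal, sysprof's only tunable was the sample period, and the sysprof_sample_write() handler above is a small parse-clamp-convert policy: the value written is microseconds, is clamped to a 100 usec minimum, and is stored in nanoseconds for the per-CPU hrtimers. A minimal userspace restatement of that policy (sample_period_ns() is a made-up helper name, not a kernel symbol):

```c
#include <stdlib.h>

/* Sketch of the removed sysprof_sample_write() policy: parse a decimal
 * microsecond value, enforce the 100 usec floor, convert to ns. */
static unsigned long sample_period_ns(const char *buf)
{
	unsigned long val = strtoul(buf, NULL, 10);

	/* Enforce a minimum sample period of 100 usecs */
	if (val < 100)
		val = 100;

	return val * 1000;	/* usecs -> nsecs, as sample_period stores */
}
```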
+567
kernel/watchdog.c
··· 1 + /* 2 + * Detect hard and soft lockups on a system 3 + * 4 + * started by Don Zickus, Copyright (C) 2010 Red Hat, Inc. 5 + * 6 + * this code detects hard lockups: incidents in where on a CPU 7 + * the kernel does not respond to anything except NMI. 8 + * 9 + * Note: Most of this code is borrowed heavily from softlockup.c, 10 + * so thanks to Ingo for the initial implementation. 11 + * Some chunks also taken from arch/x86/kernel/apic/nmi.c, thanks 12 + * to those contributors as well. 13 + */ 14 + 15 + #include <linux/mm.h> 16 + #include <linux/cpu.h> 17 + #include <linux/nmi.h> 18 + #include <linux/init.h> 19 + #include <linux/delay.h> 20 + #include <linux/freezer.h> 21 + #include <linux/kthread.h> 22 + #include <linux/lockdep.h> 23 + #include <linux/notifier.h> 24 + #include <linux/module.h> 25 + #include <linux/sysctl.h> 26 + 27 + #include <asm/irq_regs.h> 28 + #include <linux/perf_event.h> 29 + 30 + int watchdog_enabled; 31 + int __read_mostly softlockup_thresh = 60; 32 + 33 + static DEFINE_PER_CPU(unsigned long, watchdog_touch_ts); 34 + static DEFINE_PER_CPU(struct task_struct *, softlockup_watchdog); 35 + static DEFINE_PER_CPU(struct hrtimer, watchdog_hrtimer); 36 + static DEFINE_PER_CPU(bool, softlockup_touch_sync); 37 + static DEFINE_PER_CPU(bool, soft_watchdog_warn); 38 + #ifdef CONFIG_HARDLOCKUP_DETECTOR 39 + static DEFINE_PER_CPU(bool, hard_watchdog_warn); 40 + static DEFINE_PER_CPU(bool, watchdog_nmi_touch); 41 + static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts); 42 + static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts_saved); 43 + static DEFINE_PER_CPU(struct perf_event *, watchdog_ev); 44 + #endif 45 + 46 + static int __read_mostly did_panic; 47 + static int __initdata no_watchdog; 48 + 49 + 50 + /* boot commands */ 51 + /* 52 + * Should we panic when a soft-lockup or hard-lockup occurs: 53 + */ 54 + #ifdef CONFIG_HARDLOCKUP_DETECTOR 55 + static int hardlockup_panic; 56 + 57 + static int __init hardlockup_panic_setup(char *str) 58 + { 
59 + if (!strncmp(str, "panic", 5)) 60 + hardlockup_panic = 1; 61 + return 1; 62 + } 63 + __setup("nmi_watchdog=", hardlockup_panic_setup); 64 + #endif 65 + 66 + unsigned int __read_mostly softlockup_panic = 67 + CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE; 68 + 69 + static int __init softlockup_panic_setup(char *str) 70 + { 71 + softlockup_panic = simple_strtoul(str, NULL, 0); 72 + 73 + return 1; 74 + } 75 + __setup("softlockup_panic=", softlockup_panic_setup); 76 + 77 + static int __init nowatchdog_setup(char *str) 78 + { 79 + no_watchdog = 1; 80 + return 1; 81 + } 82 + __setup("nowatchdog", nowatchdog_setup); 83 + 84 + /* deprecated */ 85 + static int __init nosoftlockup_setup(char *str) 86 + { 87 + no_watchdog = 1; 88 + return 1; 89 + } 90 + __setup("nosoftlockup", nosoftlockup_setup); 91 + /* */ 92 + 93 + 94 + /* 95 + * Returns seconds, approximately. We don't need nanosecond 96 + * resolution, and we don't need to waste time with a big divide when 97 + * 2^30ns == 1.074s. 98 + */ 99 + static unsigned long get_timestamp(int this_cpu) 100 + { 101 + return cpu_clock(this_cpu) >> 30LL; /* 2^30 ~= 10^9 */ 102 + } 103 + 104 + static unsigned long get_sample_period(void) 105 + { 106 + /* 107 + * convert softlockup_thresh from seconds to ns 108 + * the divide by 5 is to give hrtimer 5 chances to 109 + * increment before the hardlockup detector generates 110 + * a warning 111 + */ 112 + return softlockup_thresh / 5 * NSEC_PER_SEC; 113 + } 114 + 115 + /* Commands for resetting the watchdog */ 116 + static void __touch_watchdog(void) 117 + { 118 + int this_cpu = smp_processor_id(); 119 + 120 + __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu); 121 + } 122 + 123 + void touch_softlockup_watchdog(void) 124 + { 125 + __get_cpu_var(watchdog_touch_ts) = 0; 126 + } 127 + EXPORT_SYMBOL(touch_softlockup_watchdog); 128 + 129 + void touch_all_softlockup_watchdogs(void) 130 + { 131 + int cpu; 132 + 133 + /* 134 + * this is done lockless 135 + * do we care if a 0 races with a 
timestamp? 136 + * all it means is the softlock check starts one cycle later 137 + */ 138 + for_each_online_cpu(cpu) 139 + per_cpu(watchdog_touch_ts, cpu) = 0; 140 + } 141 + 142 + #ifdef CONFIG_HARDLOCKUP_DETECTOR 143 + void touch_nmi_watchdog(void) 144 + { 145 + __get_cpu_var(watchdog_nmi_touch) = true; 146 + touch_softlockup_watchdog(); 147 + } 148 + EXPORT_SYMBOL(touch_nmi_watchdog); 149 + 150 + #endif 151 + 152 + void touch_softlockup_watchdog_sync(void) 153 + { 154 + __raw_get_cpu_var(softlockup_touch_sync) = true; 155 + __raw_get_cpu_var(watchdog_touch_ts) = 0; 156 + } 157 + 158 + #ifdef CONFIG_HARDLOCKUP_DETECTOR 159 + /* watchdog detector functions */ 160 + static int is_hardlockup(void) 161 + { 162 + unsigned long hrint = __get_cpu_var(hrtimer_interrupts); 163 + 164 + if (__get_cpu_var(hrtimer_interrupts_saved) == hrint) 165 + return 1; 166 + 167 + __get_cpu_var(hrtimer_interrupts_saved) = hrint; 168 + return 0; 169 + } 170 + #endif 171 + 172 + static int is_softlockup(unsigned long touch_ts) 173 + { 174 + unsigned long now = get_timestamp(smp_processor_id()); 175 + 176 + /* Warn about unreasonable delays: */ 177 + if (time_after(now, touch_ts + softlockup_thresh)) 178 + return now - touch_ts; 179 + 180 + return 0; 181 + } 182 + 183 + static int 184 + watchdog_panic(struct notifier_block *this, unsigned long event, void *ptr) 185 + { 186 + did_panic = 1; 187 + 188 + return NOTIFY_DONE; 189 + } 190 + 191 + static struct notifier_block panic_block = { 192 + .notifier_call = watchdog_panic, 193 + }; 194 + 195 + #ifdef CONFIG_HARDLOCKUP_DETECTOR 196 + static struct perf_event_attr wd_hw_attr = { 197 + .type = PERF_TYPE_HARDWARE, 198 + .config = PERF_COUNT_HW_CPU_CYCLES, 199 + .size = sizeof(struct perf_event_attr), 200 + .pinned = 1, 201 + .disabled = 1, 202 + }; 203 + 204 + /* Callback function for perf event subsystem */ 205 + void watchdog_overflow_callback(struct perf_event *event, int nmi, 206 + struct perf_sample_data *data, 207 + struct pt_regs *regs) 
208 + { 209 + if (__get_cpu_var(watchdog_nmi_touch) == true) { 210 + __get_cpu_var(watchdog_nmi_touch) = false; 211 + return; 212 + } 213 + 214 + /* check for a hardlockup 215 + * This is done by making sure our timer interrupt 216 + * is incrementing. The timer interrupt should have 217 + * fired multiple times before we overflow'd. If it hasn't 218 + * then this is a good indication the cpu is stuck 219 + */ 220 + if (is_hardlockup()) { 221 + int this_cpu = smp_processor_id(); 222 + 223 + /* only print hardlockups once */ 224 + if (__get_cpu_var(hard_watchdog_warn) == true) 225 + return; 226 + 227 + if (hardlockup_panic) 228 + panic("Watchdog detected hard LOCKUP on cpu %d", this_cpu); 229 + else 230 + WARN(1, "Watchdog detected hard LOCKUP on cpu %d", this_cpu); 231 + 232 + __get_cpu_var(hard_watchdog_warn) = true; 233 + return; 234 + } 235 + 236 + __get_cpu_var(hard_watchdog_warn) = false; 237 + return; 238 + } 239 + static void watchdog_interrupt_count(void) 240 + { 241 + __get_cpu_var(hrtimer_interrupts)++; 242 + } 243 + #else 244 + static inline void watchdog_interrupt_count(void) { return; } 245 + #endif /* CONFIG_HARDLOCKUP_DETECTOR */ 246 + 247 + /* watchdog kicker functions */ 248 + static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer) 249 + { 250 + unsigned long touch_ts = __get_cpu_var(watchdog_touch_ts); 251 + struct pt_regs *regs = get_irq_regs(); 252 + int duration; 253 + 254 + /* kick the hardlockup detector */ 255 + watchdog_interrupt_count(); 256 + 257 + /* kick the softlockup detector */ 258 + wake_up_process(__get_cpu_var(softlockup_watchdog)); 259 + 260 + /* .. and repeat */ 261 + hrtimer_forward_now(hrtimer, ns_to_ktime(get_sample_period())); 262 + 263 + if (touch_ts == 0) { 264 + if (unlikely(__get_cpu_var(softlockup_touch_sync))) { 265 + /* 266 + * If the time stamp was touched atomically 267 + * make sure the scheduler tick is up to date. 
268 + */ 269 + __get_cpu_var(softlockup_touch_sync) = false; 270 + sched_clock_tick(); 271 + } 272 + __touch_watchdog(); 273 + return HRTIMER_RESTART; 274 + } 275 + 276 + /* check for a softlockup 277 + * This is done by making sure a high priority task is 278 + * being scheduled. The task touches the watchdog to 279 + * indicate it is getting cpu time. If it hasn't then 280 + * this is a good indication some task is hogging the cpu 281 + */ 282 + duration = is_softlockup(touch_ts); 283 + if (unlikely(duration)) { 284 + /* only warn once */ 285 + if (__get_cpu_var(soft_watchdog_warn) == true) 286 + return HRTIMER_RESTART; 287 + 288 + printk(KERN_ERR "BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n", 289 + smp_processor_id(), duration, 290 + current->comm, task_pid_nr(current)); 291 + print_modules(); 292 + print_irqtrace_events(current); 293 + if (regs) 294 + show_regs(regs); 295 + else 296 + dump_stack(); 297 + 298 + if (softlockup_panic) 299 + panic("softlockup: hung tasks"); 300 + __get_cpu_var(soft_watchdog_warn) = true; 301 + } else 302 + __get_cpu_var(soft_watchdog_warn) = false; 303 + 304 + return HRTIMER_RESTART; 305 + } 306 + 307 + 308 + /* 309 + * The watchdog thread - touches the timestamp. 310 + */ 311 + static int watchdog(void *unused) 312 + { 313 + struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 }; 314 + struct hrtimer *hrtimer = &__raw_get_cpu_var(watchdog_hrtimer); 315 + 316 + sched_setscheduler(current, SCHED_FIFO, &param); 317 + 318 + /* initialize timestamp */ 319 + __touch_watchdog(); 320 + 321 + /* kick off the timer for the hardlockup detector */ 322 + /* done here because hrtimer_start can only pin to smp_processor_id() */ 323 + hrtimer_start(hrtimer, ns_to_ktime(get_sample_period()), 324 + HRTIMER_MODE_REL_PINNED); 325 + 326 + set_current_state(TASK_INTERRUPTIBLE); 327 + /* 328 + * Run briefly once per second to reset the softlockup timestamp. 
329 + * If this gets delayed for more than 60 seconds then the 330 + * debug-printout triggers in watchdog_timer_fn(). 331 + */ 332 + while (!kthread_should_stop()) { 333 + __touch_watchdog(); 334 + schedule(); 335 + 336 + if (kthread_should_stop()) 337 + break; 338 + 339 + set_current_state(TASK_INTERRUPTIBLE); 340 + } 341 + __set_current_state(TASK_RUNNING); 342 + 343 + return 0; 344 + } 345 + 346 + 347 + #ifdef CONFIG_HARDLOCKUP_DETECTOR 348 + static int watchdog_nmi_enable(int cpu) 349 + { 350 + struct perf_event_attr *wd_attr; 351 + struct perf_event *event = per_cpu(watchdog_ev, cpu); 352 + 353 + /* is it already setup and enabled? */ 354 + if (event && event->state > PERF_EVENT_STATE_OFF) 355 + goto out; 356 + 357 + /* it is setup but not enabled */ 358 + if (event != NULL) 359 + goto out_enable; 360 + 361 + /* Try to register using hardware perf events */ 362 + wd_attr = &wd_hw_attr; 363 + wd_attr->sample_period = hw_nmi_get_sample_period(); 364 + event = perf_event_create_kernel_counter(wd_attr, cpu, -1, watchdog_overflow_callback); 365 + if (!IS_ERR(event)) { 366 + printk(KERN_INFO "NMI watchdog enabled, takes one hw-pmu counter.\n"); 367 + goto out_save; 368 + } 369 + 370 + printk(KERN_ERR "NMI watchdog failed to create perf event on cpu%i: %p\n", cpu, event); 371 + return -1; 372 + 373 + /* success path */ 374 + out_save: 375 + per_cpu(watchdog_ev, cpu) = event; 376 + out_enable: 377 + perf_event_enable(per_cpu(watchdog_ev, cpu)); 378 + out: 379 + return 0; 380 + } 381 + 382 + static void watchdog_nmi_disable(int cpu) 383 + { 384 + struct perf_event *event = per_cpu(watchdog_ev, cpu); 385 + 386 + if (event) { 387 + perf_event_disable(event); 388 + per_cpu(watchdog_ev, cpu) = NULL; 389 + 390 + /* should be in cleanup, but blocks oprofile */ 391 + perf_event_release_kernel(event); 392 + } 393 + return; 394 + } 395 + #else 396 + static int watchdog_nmi_enable(int cpu) { return 0; } 397 + static void watchdog_nmi_disable(int cpu) { return; } 398 + #endif /* 
CONFIG_HARDLOCKUP_DETECTOR */ 399 + 400 + /* prepare/enable/disable routines */ 401 + static int watchdog_prepare_cpu(int cpu) 402 + { 403 + struct hrtimer *hrtimer = &per_cpu(watchdog_hrtimer, cpu); 404 + 405 + WARN_ON(per_cpu(softlockup_watchdog, cpu)); 406 + hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); 407 + hrtimer->function = watchdog_timer_fn; 408 + 409 + return 0; 410 + } 411 + 412 + static int watchdog_enable(int cpu) 413 + { 414 + struct task_struct *p = per_cpu(softlockup_watchdog, cpu); 415 + 416 + /* enable the perf event */ 417 + if (watchdog_nmi_enable(cpu) != 0) 418 + return -1; 419 + 420 + /* create the watchdog thread */ 421 + if (!p) { 422 + p = kthread_create(watchdog, (void *)(unsigned long)cpu, "watchdog/%d", cpu); 423 + if (IS_ERR(p)) { 424 + printk(KERN_ERR "softlockup watchdog for %i failed\n", cpu); 425 + return -1; 426 + } 427 + kthread_bind(p, cpu); 428 + per_cpu(watchdog_touch_ts, cpu) = 0; 429 + per_cpu(softlockup_watchdog, cpu) = p; 430 + wake_up_process(p); 431 + } 432 + 433 + return 0; 434 + } 435 + 436 + static void watchdog_disable(int cpu) 437 + { 438 + struct task_struct *p = per_cpu(softlockup_watchdog, cpu); 439 + struct hrtimer *hrtimer = &per_cpu(watchdog_hrtimer, cpu); 440 + 441 + /* 442 + * cancel the timer first to stop incrementing the stats 443 + * and waking up the kthread 444 + */ 445 + hrtimer_cancel(hrtimer); 446 + 447 + /* disable the perf event */ 448 + watchdog_nmi_disable(cpu); 449 + 450 + /* stop the watchdog thread */ 451 + if (p) { 452 + per_cpu(softlockup_watchdog, cpu) = NULL; 453 + kthread_stop(p); 454 + } 455 + 456 + /* if any cpu succeeds, watchdog is considered enabled for the system */ 457 + watchdog_enabled = 1; 458 + } 459 + 460 + static void watchdog_enable_all_cpus(void) 461 + { 462 + int cpu; 463 + int result = 0; 464 + 465 + for_each_online_cpu(cpu) 466 + result += watchdog_enable(cpu); 467 + 468 + if (result) 469 + printk(KERN_ERR "watchdog: failed to be enabled on some cpus\n"); 470 
+ 471 + } 472 + 473 + static void watchdog_disable_all_cpus(void) 474 + { 475 + int cpu; 476 + 477 + for_each_online_cpu(cpu) 478 + watchdog_disable(cpu); 479 + 480 + /* if all watchdogs are disabled, then they are disabled for the system */ 481 + watchdog_enabled = 0; 482 + } 483 + 484 + 485 + /* sysctl functions */ 486 + #ifdef CONFIG_SYSCTL 487 + /* 488 + * proc handler for /proc/sys/kernel/nmi_watchdog 489 + */ 490 + 491 + int proc_dowatchdog_enabled(struct ctl_table *table, int write, 492 + void __user *buffer, size_t *length, loff_t *ppos) 493 + { 494 + proc_dointvec(table, write, buffer, length, ppos); 495 + 496 + if (watchdog_enabled) 497 + watchdog_enable_all_cpus(); 498 + else 499 + watchdog_disable_all_cpus(); 500 + return 0; 501 + } 502 + 503 + int proc_dowatchdog_thresh(struct ctl_table *table, int write, 504 + void __user *buffer, 505 + size_t *lenp, loff_t *ppos) 506 + { 507 + return proc_dointvec_minmax(table, write, buffer, lenp, ppos); 508 + } 509 + #endif /* CONFIG_SYSCTL */ 510 + 511 + 512 + /* 513 + * Create/destroy watchdog threads as CPUs come and go: 514 + */ 515 + static int __cpuinit 516 + cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu) 517 + { 518 + int hotcpu = (unsigned long)hcpu; 519 + 520 + switch (action) { 521 + case CPU_UP_PREPARE: 522 + case CPU_UP_PREPARE_FROZEN: 523 + if (watchdog_prepare_cpu(hotcpu)) 524 + return NOTIFY_BAD; 525 + break; 526 + case CPU_ONLINE: 527 + case CPU_ONLINE_FROZEN: 528 + if (watchdog_enable(hotcpu)) 529 + return NOTIFY_BAD; 530 + break; 531 + #ifdef CONFIG_HOTPLUG_CPU 532 + case CPU_UP_CANCELED: 533 + case CPU_UP_CANCELED_FROZEN: 534 + watchdog_disable(hotcpu); 535 + break; 536 + case CPU_DEAD: 537 + case CPU_DEAD_FROZEN: 538 + watchdog_disable(hotcpu); 539 + break; 540 + #endif /* CONFIG_HOTPLUG_CPU */ 541 + } 542 + return NOTIFY_OK; 543 + } 544 + 545 + static struct notifier_block __cpuinitdata cpu_nfb = { 546 + .notifier_call = cpu_callback 547 + }; 548 + 549 + static int 
__init spawn_watchdog_task(void) 550 + { 551 + void *cpu = (void *)(long)smp_processor_id(); 552 + int err; 553 + 554 + if (no_watchdog) 555 + return 0; 556 + 557 + err = cpu_callback(&cpu_nfb, CPU_UP_PREPARE, cpu); 558 + WARN_ON(err == NOTIFY_BAD); 559 + 560 + cpu_callback(&cpu_nfb, CPU_ONLINE, cpu); 561 + register_cpu_notifier(&cpu_nfb); 562 + 563 + atomic_notifier_chain_register(&panic_notifier_list, &panic_block); 564 + 565 + return 0; 566 + } 567 + early_initcall(spawn_watchdog_task);
+20 -15
lib/Kconfig.debug
··· 152 152 Drivers ought to be able to handle interrupts coming in at those 153 153 points; some don't and need to be caught. 154 154 155 - config DETECT_SOFTLOCKUP 156 - bool "Detect Soft Lockups" 155 + config LOCKUP_DETECTOR 156 + bool "Detect Hard and Soft Lockups" 157 157 depends on DEBUG_KERNEL && !S390 158 - default y 159 158 help 160 - Say Y here to enable the kernel to detect "soft lockups", 161 - which are bugs that cause the kernel to loop in kernel 159 + Say Y here to enable the kernel to act as a watchdog to detect 160 + hard and soft lockups. 161 + 162 + Softlockups are bugs that cause the kernel to loop in kernel 162 163 mode for more than 60 seconds, without giving other tasks a 163 - chance to run. 164 + chance to run. The current stack trace is displayed upon 165 + detection and the system will stay locked up. 164 166 165 - When a soft-lockup is detected, the kernel will print the 166 - current stack trace (which you should report), but the 167 - system will stay locked up. This feature has negligible 168 - overhead. 167 + Hardlockups are bugs that cause the CPU to loop in kernel mode 168 + for more than 60 seconds, without letting other interrupts have a 169 + chance to run. The current stack trace is displayed upon detection 170 + and the system will stay locked up. 169 171 170 - (Note that "hard lockups" are separate type of bugs that 171 - can be detected via the NMI-watchdog, on platforms that 172 - support it.) 172 + The overhead should be minimal. A periodic hrtimer runs to 173 + generate interrupts and kick the watchdog task every 10-12 seconds. 174 + An NMI is generated every 60 seconds or so to check for hardlockups. 
175 + 176 + config HARDLOCKUP_DETECTOR 177 + def_bool LOCKUP_DETECTOR && PERF_EVENTS && HAVE_PERF_EVENTS_NMI 173 178 174 179 config BOOTPARAM_SOFTLOCKUP_PANIC 175 180 bool "Panic (Reboot) On Soft Lockups" 176 - depends on DETECT_SOFTLOCKUP 181 + depends on LOCKUP_DETECTOR 177 182 help 178 183 Say Y here to enable the kernel to panic on "soft lockups", 179 184 which are bugs that cause the kernel to loop in kernel ··· 195 190 196 191 config BOOTPARAM_SOFTLOCKUP_PANIC_VALUE 197 192 int 198 - depends on DETECT_SOFTLOCKUP 193 + depends on LOCKUP_DETECTOR 199 194 range 0 1 200 195 default 0 if !BOOTPARAM_SOFTLOCKUP_PANIC 201 196 default 1 if BOOTPARAM_SOFTLOCKUP_PANIC
+5 -1
mm/mmap.c
··· 1734 1734 grow = (address - vma->vm_end) >> PAGE_SHIFT; 1735 1735 1736 1736 error = acct_stack_growth(vma, size, grow); 1737 - if (!error) 1737 + if (!error) { 1738 1738 vma->vm_end = address; 1739 + perf_event_mmap(vma); 1740 + } 1739 1741 } 1740 1742 anon_vma_unlock(vma); 1741 1743 return error; ··· 1783 1781 if (!error) { 1784 1782 vma->vm_start = address; 1785 1783 vma->vm_pgoff -= grow; 1784 + perf_event_mmap(vma); 1786 1785 } 1787 1786 } 1788 1787 anon_vma_unlock(vma); ··· 2211 2208 vma->vm_page_prot = vm_get_page_prot(flags); 2212 2209 vma_link(mm, vma, prev, rb_link, rb_parent); 2213 2210 out: 2211 + perf_event_mmap(vma); 2214 2212 mm->total_vm += len >> PAGE_SHIFT; 2215 2213 if (flags & VM_LOCKED) { 2216 2214 if (!mlock_vma_pages_range(vma, addr, addr + len))
-1
mm/slab.c
··· 102 102 #include <linux/cpu.h> 103 103 #include <linux/sysctl.h> 104 104 #include <linux/module.h> 105 - #include <linux/kmemtrace.h> 106 105 #include <linux/rcupdate.h> 107 106 #include <linux/string.h> 108 107 #include <linux/uaccess.h>
+3 -1
mm/slob.c
··· 66 66 #include <linux/module.h> 67 67 #include <linux/rcupdate.h> 68 68 #include <linux/list.h> 69 - #include <linux/kmemtrace.h> 70 69 #include <linux/kmemleak.h> 70 + 71 + #include <trace/events/kmem.h> 72 + 71 73 #include <asm/atomic.h> 72 74 73 75 /*
-1
mm/slub.c
··· 17 17 #include <linux/slab.h> 18 18 #include <linux/proc_fs.h> 19 19 #include <linux/seq_file.h> 20 - #include <linux/kmemtrace.h> 21 20 #include <linux/kmemcheck.h> 22 21 #include <linux/cpu.h> 23 22 #include <linux/cpuset.h>
+31 -6
scripts/package/Makefile
··· 111 111 clean-dirs += $(objtree)/tar-install/ 112 112 113 113 114 + # perf-pkg - generate a source tarball with perf source 115 + # --------------------------------------------------------------------------- 116 + 117 + perf-tar=perf-$(KERNELVERSION) 118 + 119 + quiet_cmd_perf_tar = TAR 120 + cmd_perf_tar = \ 121 + git archive --prefix=$(perf-tar)/ HEAD^{tree} \ 122 + $$(cat $(srctree)/tools/perf/MANIFEST) -o $(perf-tar).tar; \ 123 + mkdir -p $(perf-tar); \ 124 + git rev-parse HEAD > $(perf-tar)/HEAD; \ 125 + tar rf $(perf-tar).tar $(perf-tar)/HEAD; \ 126 + rm -r $(perf-tar); \ 127 + $(if $(findstring tar-src,$@),, \ 128 + $(if $(findstring bz2,$@),bzip2, \ 129 + $(if $(findstring gz,$@),gzip, \ 130 + $(error unknown target $@))) \ 131 + -f -9 $(perf-tar).tar) 132 + 133 + perf-%pkg: FORCE 134 + $(call cmd,perf_tar) 135 + 114 136 # Help text displayed when executing 'make help' 115 137 # --------------------------------------------------------------------------- 116 138 help: FORCE 117 - @echo ' rpm-pkg - Build both source and binary RPM kernel packages' 118 - @echo ' binrpm-pkg - Build only the binary kernel package' 119 - @echo ' deb-pkg - Build the kernel as a deb package' 120 - @echo ' tar-pkg - Build the kernel as an uncompressed tarball' 121 - @echo ' targz-pkg - Build the kernel as a gzip compressed tarball' 122 - @echo ' tarbz2-pkg - Build the kernel as a bzip2 compressed tarball' 139 + @echo ' rpm-pkg - Build both source and binary RPM kernel packages' 140 + @echo ' binrpm-pkg - Build only the binary kernel package' 141 + @echo ' deb-pkg - Build the kernel as a deb package' 142 + @echo ' tar-pkg - Build the kernel as an uncompressed tarball' 143 + @echo ' targz-pkg - Build the kernel as a gzip compressed tarball' 144 + @echo ' tarbz2-pkg - Build the kernel as a bzip2 compressed tarball' 145 + @echo ' perf-tar-src-pkg - Build $(perf-tar).tar source tarball' 146 + @echo ' perf-targz-src-pkg - Build $(perf-tar).tar.gz source tarball' 147 + @echo ' 
perf-tarbz2-src-pkg - Build $(perf-tar).tar.bz2 source tarball' 123 148
+1 -1
scripts/recordmcount.pl
··· 326 326 # 14: R_MIPS_NONE *ABS* 327 327 # 18: 00020021 nop 328 328 if ($is_module eq "0") { 329 - $mcount_regex = "^\\s*([0-9a-fA-F]+):.*\\s_mcount\$"; 329 + $mcount_regex = "^\\s*([0-9a-fA-F]+): R_MIPS_26\\s+_mcount\$"; 330 330 } else { 331 331 $mcount_regex = "^\\s*([0-9a-fA-F]+): R_MIPS_HI16\\s+_mcount\$"; 332 332 }
+2
tools/perf/.gitignore
··· 18 18 tags 19 19 TAGS 20 20 cscope* 21 + config.mak 22 + config.mak.autogen
+4 -4
tools/perf/Documentation/perf-buildid-cache.txt
··· 12 12 13 13 DESCRIPTION 14 14 ----------- 15 - This command manages the build-id cache. It can add and remove files to the 16 - cache. In the future it should as well purge older entries, set upper limits 17 - for the space used by the cache, etc. 15 + This command manages the build-id cache. It can add and remove files to/from 16 + the cache. In the future it should as well purge older entries, set upper 17 + limits for the space used by the cache, etc. 18 18 19 19 OPTIONS 20 20 ------- ··· 23 23 Add specified file to the cache. 24 24 -r:: 25 25 --remove=:: 26 - Remove specified file to the cache. 26 + Remove specified file from the cache. 27 27 -v:: 28 28 --verbose:: 29 29 Be more verbose.
+6 -2
tools/perf/Documentation/perf-probe.txt
··· 31 31 --vmlinux=PATH:: 32 32 Specify vmlinux path which has debuginfo (Dwarf binary). 33 33 34 + -s:: 35 + --source=PATH:: 36 + Specify path to kernel source. 37 + 34 38 -v:: 35 39 --verbose:: 36 40 Be more verbose (show parsed arguments, etc). ··· 94 90 95 91 [NAME=]LOCALVAR|$retval|%REG|@SYMBOL[:TYPE] 96 92 97 - 'NAME' specifies the name of this argument (optional). You can use the name of a local variable, local data structure member (e.g. var->field, var.field2), or kprobe-tracer argument format (e.g. $retval, %ax, etc). Note that the name of this argument will be set as the last member name if you specify a local data structure member (e.g. field2 for 'var->field1.field2'.) 98 - 'TYPE' casts the type of this argument (optional). If omitted, perf probe automatically sets the type based on debuginfo. 93 + 'NAME' specifies the name of this argument (optional). You can use the name of a local variable, local data structure member (e.g. var->field, var.field2), local array with fixed index (e.g. array[1], var->array[0], var->pointer[2]), or kprobe-tracer argument format (e.g. $retval, %ax, etc). Note that the name of this argument will be set as the last member name if you specify a local data structure member (e.g. field2 for 'var->field1.field2'.) 94 + 'TYPE' casts the type of this argument (optional). If omitted, perf probe automatically sets the type based on debuginfo. You can specify 'string' type only for a local variable or structure member which is an array of or a pointer to 'char' or 'unsigned char' type. 99 95 100 96 LINE SYNTAX 101 97 -----------
+13
tools/perf/Documentation/perf-record.txt
··· 103 103 --raw-samples:: 104 104 Collect raw sample records from all opened counters (default for tracepoint counters). 105 105 106 + -C:: 107 + --cpu:: 108 + Collect samples only on the list of cpus provided. Multiple CPUs can be provided as a 109 + comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2. 110 + In per-thread mode with inheritance mode on (default), samples are captured only when 111 + the thread executes on the designated CPUs. Default is to monitor all CPUs. 112 + 113 + -N:: 114 + --no-buildid-cache:: 115 + Do not update the build-id cache. This saves some overhead in situations 116 + where the information in the perf.data file (which includes build-ids) 117 + is sufficient. 118 + 106 119 SEE ALSO 107 120 -------- 108 121 linkperf:perf-stat[1], linkperf:perf-list[1]
+7
tools/perf/Documentation/perf-stat.txt
··· 46 46 -B:: 47 47 print large numbers with thousands' separators according to locale 48 48 49 + -C:: 50 + --cpu=:: 51 + Count only on the list of cpus provided. Multiple CPUs can be provided as a 52 + comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2. 53 + In per-thread mode, this option is ignored. The -a option is still necessary 54 + to activate system-wide monitoring. Default is to count on all CPUs. 55 + 49 56 EXAMPLES 50 57 -------- 51 58
+5 -3
tools/perf/Documentation/perf-top.txt
··· 25 25 --count=<count>:: 26 26 Event period to sample. 27 27 28 - -C <cpu>:: 29 - --CPU=<cpu>:: 30 - CPU to profile. 28 + -C <cpu-list>:: 29 + --cpu=<cpu-list>:: 30 + Monitor only on the list of cpus provided. Multiple CPUs can be provided as a 31 + comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2. 32 + Default is to monitor all CPUs. 31 33 32 34 -d <seconds>:: 33 35 --delay=<seconds>::
+12
tools/perf/MANIFEST
··· 1 + tools/perf 2 + include/linux/perf_event.h 3 + include/linux/rbtree.h 4 + include/linux/list.h 5 + include/linux/hash.h 6 + include/linux/stringify.h 7 + lib/rbtree.c 8 + include/linux/swab.h 9 + arch/*/include/asm/unistd*.h 10 + include/linux/poison.h 11 + include/linux/magic.h 12 + include/linux/hw_breakpoint.h
+59 -56
tools/perf/Makefile
··· 285 285 QUIET_STDERR = ">/dev/null 2>&1" 286 286 endif 287 287 288 - BITBUCKET = "/dev/null" 288 + -include feature-tests.mak 289 289 290 - ifneq ($(shell sh -c "(echo '\#include <stdio.h>'; echo 'int main(void) { return puts(\"hi\"); }') | $(CC) -x c - $(ALL_CFLAGS) -o $(BITBUCKET) "$(QUIET_STDERR)" && echo y"), y) 291 - BITBUCKET = .perf.dev.null 292 - endif 293 - 294 - ifeq ($(shell sh -c "echo 'int foo(void) {char X[2]; return 3;}' | $(CC) -x c -c -Werror -fstack-protector-all - -o $(BITBUCKET) "$(QUIET_STDERR)" && echo y"), y) 295 - CFLAGS := $(CFLAGS) -fstack-protector-all 290 + ifeq ($(call try-cc,$(SOURCE_HELLO),-Werror -fstack-protector-all),y) 291 + CFLAGS := $(CFLAGS) -fstack-protector-all 296 292 endif 297 293 298 294 ··· 504 508 -include config.mak 505 509 506 510 ifndef NO_DWARF 507 - ifneq ($(shell sh -c "(echo '\#include <dwarf.h>'; echo '\#include <libdw.h>'; echo '\#include <version.h>'; echo '\#ifndef _ELFUTILS_PREREQ'; echo '\#error'; echo '\#endif'; echo 'int main(void) { Dwarf *dbg; dbg = dwarf_begin(0, DWARF_C_READ); return (long)dbg; }') | $(CC) -x c - $(ALL_CFLAGS) -I/usr/include/elfutils -ldw -lelf -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) "$(QUIET_STDERR)" && echo y"), y) 511 + FLAGS_DWARF=$(ALL_CFLAGS) -I/usr/include/elfutils -ldw -lelf $(ALL_LDFLAGS) $(EXTLIBS) 512 + ifneq ($(call try-cc,$(SOURCE_DWARF),$(FLAGS_DWARF)),y) 508 513 msg := $(warning No libdw.h found or old libdw.h found or elfutils is older than 0.138, disables dwarf support. 
Please install new elfutils-devel/libdw-dev); 509 514 NO_DWARF := 1 510 515 endif # Dwarf support ··· 533 536 BASIC_CFLAGS += -I$(OUTPUT) 534 537 endif 535 538 536 - ifeq ($(shell sh -c "(echo '\#include <libelf.h>'; echo 'int main(void) { Elf * elf = elf_begin(0, ELF_C_READ, 0); return (long)elf; }') | $(CC) -x c - $(ALL_CFLAGS) -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) "$(QUIET_STDERR)" && echo y"), y) 537 - ifneq ($(shell sh -c "(echo '\#include <gnu/libc-version.h>'; echo 'int main(void) { const char * version = gnu_get_libc_version(); return (long)version; }') | $(CC) -x c - $(ALL_CFLAGS) -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) "$(QUIET_STDERR)" && echo y"), y) 538 - msg := $(error No gnu/libc-version.h found, please install glibc-dev[el]/glibc-static); 539 + FLAGS_LIBELF=$(ALL_CFLAGS) $(ALL_LDFLAGS) $(EXTLIBS) 540 + ifneq ($(call try-cc,$(SOURCE_LIBELF),$(FLAGS_LIBELF)),y) 541 + FLAGS_GLIBC=$(ALL_CFLAGS) $(ALL_LDFLAGS) 542 + ifneq ($(call try-cc,$(SOURCE_GLIBC),$(FLAGS_GLIBC)),y) 543 + msg := $(error No gnu/libc-version.h found, please install glibc-dev[el]/glibc-static); 544 + else 545 + msg := $(error No libelf.h/libelf found, please install libelf-dev/elfutils-libelf-devel); 546 + endif 539 547 endif 540 548 541 - ifneq ($(shell sh -c "(echo '\#include <libelf.h>'; echo 'int main(void) { Elf * elf = elf_begin(0, ELF_C_READ_MMAP, 0); return (long)elf; }') | $(CC) -x c - $(ALL_CFLAGS) -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) "$(QUIET_STDERR)" && echo y"), y) 542 - BASIC_CFLAGS += -DLIBELF_NO_MMAP 543 - endif 544 - else 545 - msg := $(error No libelf.h/libelf found, please install libelf-dev/elfutils-libelf-devel and glibc-dev[el]); 549 + ifneq ($(call try-cc,$(SOURCE_ELF_MMAP),$(FLAGS_COMMON)),y) 550 + BASIC_CFLAGS += -DLIBELF_NO_MMAP 546 551 endif 547 552 548 553 ifndef NO_DWARF ··· 560 561 ifdef NO_NEWT 561 562 BASIC_CFLAGS += -DNO_NEWT_SUPPORT 562 563 else 563 - ifneq ($(shell sh -c "(echo '\#include <newt.h>'; echo 'int main(void) { newtInit(); 
newtCls(); return newtFinished(); }') | $(CC) -x c - $(ALL_CFLAGS) -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -lnewt -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) "$(QUIET_STDERR)" && echo y"), y) 564 - msg := $(warning newt not found, disables TUI support. Please install newt-devel or libnewt-dev); 565 - BASIC_CFLAGS += -DNO_NEWT_SUPPORT 566 - else 567 - # Fedora has /usr/include/slang/slang.h, but ubuntu /usr/include/slang.h 568 - BASIC_CFLAGS += -I/usr/include/slang 569 - EXTLIBS += -lnewt -lslang 570 - LIB_OBJS += $(OUTPUT)util/newt.o 571 - endif 572 - endif # NO_NEWT 573 - 574 - ifndef NO_LIBPERL 575 - PERL_EMBED_LDOPTS = `perl -MExtUtils::Embed -e ldopts 2>/dev/null` 576 - PERL_EMBED_CCOPTS = `perl -MExtUtils::Embed -e ccopts 2>/dev/null` 564 + FLAGS_NEWT=$(ALL_CFLAGS) $(ALL_LDFLAGS) $(EXTLIBS) -lnewt 565 + ifneq ($(call try-cc,$(SOURCE_NEWT),$(FLAGS_NEWT)),y) 566 + msg := $(warning newt not found, disables TUI support. Please install newt-devel or libnewt-dev); 567 + BASIC_CFLAGS += -DNO_NEWT_SUPPORT 568 + else 569 + # Fedora has /usr/include/slang/slang.h, but ubuntu /usr/include/slang.h 570 + BASIC_CFLAGS += -I/usr/include/slang 571 + EXTLIBS += -lnewt -lslang 572 + LIB_OBJS += $(OUTPUT)util/newt.o 573 + endif 577 574 endif 578 575 579 - ifneq ($(shell sh -c "(echo '\#include <EXTERN.h>'; echo '\#include <perl.h>'; echo 'int main(void) { perl_alloc(); return 0; }') | $(CC) -x c - $(PERL_EMBED_CCOPTS) -o $(BITBUCKET) $(PERL_EMBED_LDOPTS) > /dev/null 2>&1 && echo y"), y) 576 + ifdef NO_LIBPERL 580 577 BASIC_CFLAGS += -DNO_LIBPERL 581 578 else 582 - ALL_LDFLAGS += $(PERL_EMBED_LDOPTS) 583 - LIB_OBJS += $(OUTPUT)util/scripting-engines/trace-event-perl.o 584 - LIB_OBJS += $(OUTPUT)scripts/perl/Perf-Trace-Util/Context.o 579 + PERL_EMBED_LDOPTS = `perl -MExtUtils::Embed -e ldopts 2>/dev/null` 580 + PERL_EMBED_CCOPTS = `perl -MExtUtils::Embed -e ccopts 2>/dev/null` 581 + FLAGS_PERL_EMBED=$(PERL_EMBED_CCOPTS) $(PERL_EMBED_LDOPTS) 582 + 583 + ifneq ($(call 
try-cc,$(SOURCE_PERL_EMBED),$(FLAGS_PERL_EMBED)),y) 584 + BASIC_CFLAGS += -DNO_LIBPERL 585 + else 586 + ALL_LDFLAGS += $(PERL_EMBED_LDOPTS) 587 + LIB_OBJS += $(OUTPUT)util/scripting-engines/trace-event-perl.o 588 + LIB_OBJS += $(OUTPUT)scripts/perl/Perf-Trace-Util/Context.o 589 + endif 585 590 endif 586 591 587 - ifndef NO_LIBPYTHON 588 - PYTHON_EMBED_LDOPTS = `python-config --ldflags 2>/dev/null` 589 - PYTHON_EMBED_CCOPTS = `python-config --cflags 2>/dev/null` 590 - endif 591 - 592 - ifneq ($(shell sh -c "(echo '\#include <Python.h>'; echo 'int main(void) { Py_Initialize(); return 0; }') | $(CC) -x c - $(PYTHON_EMBED_CCOPTS) -o $(BITBUCKET) $(PYTHON_EMBED_LDOPTS) > /dev/null 2>&1 && echo y"), y) 592 + ifdef NO_LIBPYTHON 593 593 BASIC_CFLAGS += -DNO_LIBPYTHON 594 594 else 595 - ALL_LDFLAGS += $(PYTHON_EMBED_LDOPTS) 596 - LIB_OBJS += $(OUTPUT)util/scripting-engines/trace-event-python.o 597 - LIB_OBJS += $(OUTPUT)scripts/python/Perf-Trace-Util/Context.o 595 + PYTHON_EMBED_LDOPTS = `python-config --ldflags 2>/dev/null` 596 + PYTHON_EMBED_CCOPTS = `python-config --cflags 2>/dev/null` 597 + FLAGS_PYTHON_EMBED=$(PYTHON_EMBED_CCOPTS) $(PYTHON_EMBED_LDOPTS) 598 + ifneq ($(call try-cc,$(SOURCE_PYTHON_EMBED),$(FLAGS_PYTHON_EMBED)),y) 599 + BASIC_CFLAGS += -DNO_LIBPYTHON 600 + else 601 + ALL_LDFLAGS += $(PYTHON_EMBED_LDOPTS) 602 + LIB_OBJS += $(OUTPUT)util/scripting-engines/trace-event-python.o 603 + LIB_OBJS += $(OUTPUT)scripts/python/Perf-Trace-Util/Context.o 604 + endif 598 605 endif 599 606 600 607 ifdef NO_DEMANGLE 601 608 BASIC_CFLAGS += -DNO_DEMANGLE 602 609 else 603 - ifdef HAVE_CPLUS_DEMANGLE 610 + ifdef HAVE_CPLUS_DEMANGLE 604 611 EXTLIBS += -liberty 605 612 BASIC_CFLAGS += -DHAVE_CPLUS_DEMANGLE 606 - else 607 - has_bfd := $(shell sh -c "(echo '\#include <bfd.h>'; echo 'int main(void) { bfd_demangle(0, 0, 0); return 0; }') | $(CC) -x c - $(ALL_CFLAGS) -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) -lbfd "$(QUIET_STDERR)" && echo y") 608 - 613 + else 614 + 
FLAGS_BFD=$(ALL_CFLAGS) $(ALL_LDFLAGS) $(EXTLIBS) -lbfd 615 + has_bfd := $(call try-cc,$(SOURCE_BFD),$(FLAGS_BFD)) 609 616 ifeq ($(has_bfd),y) 610 617 EXTLIBS += -lbfd 611 618 else 612 - has_bfd_iberty := $(shell sh -c "(echo '\#include <bfd.h>'; echo 'int main(void) { bfd_demangle(0, 0, 0); return 0; }') | $(CC) -x c - $(ALL_CFLAGS) -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) -lbfd -liberty "$(QUIET_STDERR)" && echo y") 619 + FLAGS_BFD_IBERTY=$(FLAGS_BFD) -liberty 620 + has_bfd_iberty := $(call try-cc,$(SOURCE_BFD),$(FLAGS_BFD_IBERTY)) 613 621 ifeq ($(has_bfd_iberty),y) 614 622 EXTLIBS += -lbfd -liberty 615 623 else 616 - has_bfd_iberty_z := $(shell sh -c "(echo '\#include <bfd.h>'; echo 'int main(void) { bfd_demangle(0, 0, 0); return 0; }') | $(CC) -x c - $(ALL_CFLAGS) -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) -lbfd -liberty -lz "$(QUIET_STDERR)" && echo y") 624 + FLAGS_BFD_IBERTY_Z=$(FLAGS_BFD_IBERTY) -lz 625 + has_bfd_iberty_z := $(call try-cc,$(SOURCE_BFD),$(FLAGS_BFD_IBERTY_Z)) 617 626 ifeq ($(has_bfd_iberty_z),y) 618 627 EXTLIBS += -lbfd -liberty -lz 619 628 else 620 - has_cplus_demangle := $(shell sh -c "(echo 'extern char *cplus_demangle(const char *, int);'; echo 'int main(void) { cplus_demangle(0, 0); return 0; }') | $(CC) -x c - $(ALL_CFLAGS) -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) -liberty "$(QUIET_STDERR)" && echo y") 629 + FLAGS_CPLUS_DEMANGLE=$(ALL_CFLAGS) $(ALL_LDFLAGS) $(EXTLIBS) -liberty 630 + has_cplus_demangle := $(call try-cc,$(SOURCE_CPLUS_DEMANGLE),$(FLAGS_CPLUS_DEMANGLE)) 621 631 ifeq ($(has_cplus_demangle),y) 622 632 EXTLIBS += -liberty 623 633 BASIC_CFLAGS += -DHAVE_CPLUS_DEMANGLE ··· 875 867 876 868 SHELL = $(SHELL_PATH) 877 869 878 - all:: .perf.dev.null shell_compatibility_test $(ALL_PROGRAMS) $(BUILT_INS) $(OTHER_PROGRAMS) $(OUTPUT)PERF-BUILD-OPTIONS 870 + all:: shell_compatibility_test $(ALL_PROGRAMS) $(BUILT_INS) $(OTHER_PROGRAMS) $(OUTPUT)PERF-BUILD-OPTIONS 879 871 ifneq (,$X) 880 872 $(foreach p,$(patsubst %$X,%,$(filter 
%$X,$(ALL_PROGRAMS) $(BUILT_INS) perf$X)), test '$p' -ef '$p$X' || $(RM) '$p';) 881 873 endif ··· 1204 1196 .PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell 1205 1197 .PHONY: .FORCE-PERF-VERSION-FILE TAGS tags cscope .FORCE-PERF-CFLAGS 1206 1198 .PHONY: .FORCE-PERF-BUILD-OPTIONS 1207 - 1208 - .perf.dev.null: 1209 - touch .perf.dev.null 1210 - 1211 - .INTERMEDIATE: .perf.dev.null 1212 1199 1213 1200 ### Make sure built-ins do not have dups and listed in perf.c 1214 1201 #
+4
tools/perf/arch/sh/Makefile
··· 1 + ifndef NO_DWARF 2 + PERF_HAVE_DWARF_REGS := 1 3 + LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o 4 + endif
+55
tools/perf/arch/sh/util/dwarf-regs.c
··· 1 + /* 2 + * Mapping of DWARF debug register numbers into register names. 3 + * 4 + * Copyright (C) 2010 Matt Fleming <matt@console-pimps.org> 5 + * 6 + * This program is free software; you can redistribute it and/or modify 7 + * it under the terms of the GNU General Public License as published by 8 + * the Free Software Foundation; either version 2 of the License, or 9 + * (at your option) any later version. 10 + * 11 + * This program is distributed in the hope that it will be useful, 12 + * but WITHOUT ANY WARRANTY; without even the implied warranty of 13 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 14 + * GNU General Public License for more details. 15 + * 16 + * You should have received a copy of the GNU General Public License 17 + * along with this program; if not, write to the Free Software 18 + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. 19 + * 20 + */ 21 + 22 + #include <libio.h> 23 + #include <dwarf-regs.h> 24 + 25 + /* 26 + * Generic dwarf analysis helpers 27 + */ 28 + 29 + #define SH_MAX_REGS 18 30 + const char *sh_regs_table[SH_MAX_REGS] = { 31 + "r0", 32 + "r1", 33 + "r2", 34 + "r3", 35 + "r4", 36 + "r5", 37 + "r6", 38 + "r7", 39 + "r8", 40 + "r9", 41 + "r10", 42 + "r11", 43 + "r12", 44 + "r13", 45 + "r14", 46 + "r15", 47 + "pc", 48 + "pr", 49 + }; 50 + 51 + /* Return architecture dependent register string (for kprobe-tracer) */ 52 + const char *get_arch_regstr(unsigned int n) 53 + { 54 + return (n < SH_MAX_REGS) ? sh_regs_table[n] : NULL; 55 + }
+2 -4
tools/perf/builtin-annotate.c
··· 61 61 static int process_sample_event(event_t *event, struct perf_session *session) 62 62 { 63 63 struct addr_location al; 64 + struct sample_data data; 64 65 65 - dump_printf("(IP, %d): %d: %#Lx\n", event->header.misc, 66 - event->ip.pid, event->ip.ip); 67 - 68 - if (event__preprocess_sample(event, session, &al, NULL) < 0) { 66 + if (event__preprocess_sample(event, session, &al, &data, NULL) < 0) { 69 67 pr_warning("problem processing %d event, skipping it.\n", 70 68 event->header.type); 71 69 return -1;
+1 -2
tools/perf/builtin-buildid-cache.c
··· 78 78 struct str_node *pos; 79 79 char debugdir[PATH_MAX]; 80 80 81 - snprintf(debugdir, sizeof(debugdir), "%s/%s", getenv("HOME"), 82 - DEBUG_CACHE_DIR); 81 + snprintf(debugdir, sizeof(debugdir), "%s", buildid_dir); 83 82 84 83 if (add_name_list_str) { 85 84 list = strlist__new(true, add_name_list_str);
+1 -3
tools/perf/builtin-buildid-list.c
··· 43 43 if (session == NULL) 44 44 return -1; 45 45 46 - if (with_hits) { 47 - symbol_conf.full_paths = true; 46 + if (with_hits) 48 47 perf_session__process_events(session, &build_id__mark_dso_hit_ops); 49 - } 50 48 51 49 perf_session__fprintf_dsos_buildid(session, stdout, with_hits); 52 50
+1 -8
tools/perf/builtin-diff.c
··· 35 35 struct addr_location al; 36 36 struct sample_data data = { .period = 1, }; 37 37 38 - dump_printf("(IP, %d): %d: %#Lx\n", event->header.misc, 39 - event->ip.pid, event->ip.ip); 40 - 41 - if (event__preprocess_sample(event, session, &al, NULL) < 0) { 38 + if (event__preprocess_sample(event, session, &al, &data, NULL) < 0) { 42 39 pr_warning("problem processing %d event, skipping it.\n", 43 40 event->header.type); 44 41 return -1; ··· 43 46 44 47 if (al.filtered || al.sym == NULL) 45 48 return 0; 46 - 47 - event__parse_sample(event, session->sample_type, &data); 48 49 49 50 if (hists__add_entry(&session->hists, &al, data.period)) { 50 51 pr_warning("problem incrementing symbol period, skipping event\n"); ··· 180 185 OPT_BOOLEAN('f', "force", &force, "don't complain, do it"), 181 186 OPT_BOOLEAN('m', "modules", &symbol_conf.use_modules, 182 187 "load module symbols - WARNING: use only with -k and LIVE kernel"), 183 - OPT_BOOLEAN('P', "full-paths", &symbol_conf.full_paths, 184 - "Don't shorten the pathnames taking into account the cwd"), 185 188 OPT_STRING('d', "dsos", &symbol_conf.dso_list_str, "dso[,dso...]", 186 189 "only consider symbols in these dsos"), 187 190 OPT_STRING('C', "comms", &symbol_conf.comm_list_str, "comm[,comm...]",
+2 -1
tools/perf/builtin-probe.c
··· 182 182 "Show source code lines.", opt_show_lines), 183 183 OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name, 184 184 "file", "vmlinux pathname"), 185 + OPT_STRING('s', "source", &symbol_conf.source_prefix, 186 + "directory", "path to kernel source"), 185 187 #endif 186 188 OPT__DRY_RUN(&probe_event_dry_run), 187 189 OPT_INTEGER('\0', "max-probes", &params.max_probe_points, ··· 267 265 } 268 266 return 0; 269 267 } 270 -
+55 -26
tools/perf/builtin-record.c
··· 49 49 static int realtime_prio = 0; 50 50 static bool raw_samples = false; 51 51 static bool system_wide = false; 52 - static int profile_cpu = -1; 53 52 static pid_t target_pid = -1; 54 53 static pid_t target_tid = -1; 55 54 static pid_t *all_tids = NULL; ··· 60 61 static bool inherit_stat = false; 61 62 static bool no_samples = false; 62 63 static bool sample_address = false; 64 + static bool no_buildid = false; 63 65 64 66 static long samples = 0; 65 67 static u64 bytes_written = 0; ··· 74 74 static off_t post_processing_offset; 75 75 76 76 static struct perf_session *session; 77 + static const char *cpu_list; 77 78 78 79 struct mmap_data { 79 80 int counter; ··· 269 268 if (inherit_stat) 270 269 attr->inherit_stat = 1; 271 270 272 - if (sample_address) 271 + if (sample_address) { 273 272 attr->sample_type |= PERF_SAMPLE_ADDR; 273 + attr->mmap_data = track; 274 + } 274 275 275 276 if (call_graph) 276 277 attr->sample_type |= PERF_SAMPLE_CALLCHAIN; 278 + 279 + if (system_wide) 280 + attr->sample_type |= PERF_SAMPLE_CPU; 277 281 278 282 if (raw_samples) { 279 283 attr->sample_type |= PERF_SAMPLE_TIME; ··· 306 300 die("Permission error - are you root?\n" 307 301 "\t Consider tweaking" 308 302 " /proc/sys/kernel/perf_event_paranoid.\n"); 309 - else if (err == ENODEV && profile_cpu != -1) { 303 + else if (err == ENODEV && cpu_list) { 310 304 die("No such device - did you specify" 311 305 " an out-of-range profile CPU?\n"); 312 306 } ··· 439 433 440 434 process_buildids(); 441 435 perf_header__write(&session->header, output, true); 436 + perf_session__delete(session); 437 + symbol__exit(); 442 438 } 443 439 } 444 440 445 441 static void event__synthesize_guest_os(struct machine *machine, void *data) 446 442 { 447 443 int err; 448 - char *guest_kallsyms; 449 - char path[PATH_MAX]; 450 444 struct perf_session *psession = data; 451 445 452 446 if (machine__is_host(machine)) ··· 465 459 if (err < 0) 466 460 pr_err("Couldn't record guest kernel [%d]'s reference" 467 
461 " relocation symbol.\n", machine->pid); 468 - 469 - if (machine__is_default_guest(machine)) 470 - guest_kallsyms = (char *) symbol_conf.default_guest_kallsyms; 471 - else { 472 - sprintf(path, "%s/proc/kallsyms", machine->root_dir); 473 - guest_kallsyms = path; 474 - } 475 462 476 463 /* 477 464 * We use _stext for guest kernel because guest kernel's /proc/kallsyms ··· 560 561 if (!file_new) { 561 562 err = perf_header__read(session, output); 562 563 if (err < 0) 563 - return err; 564 + goto out_delete_session; 564 565 } 565 566 566 567 if (have_tracepoints(attrs, nr_counters)) 567 568 perf_header__set_feat(&session->header, HEADER_TRACE_INFO); 568 569 570 + /* 571 + * perf_session__delete(session) will be called at atexit_header() 572 + */ 569 573 atexit(atexit_header); 570 574 571 575 if (forks) { ··· 624 622 close(child_ready_pipe[0]); 625 623 } 626 624 627 - if ((!system_wide && no_inherit) || profile_cpu != -1) { 628 - open_counters(profile_cpu); 625 + nr_cpus = read_cpu_map(cpu_list); 626 + if (nr_cpus < 1) { 627 + perror("failed to collect number of CPUs\n"); 628 + return -1; 629 + } 630 + 631 + if (!system_wide && no_inherit && !cpu_list) { 632 + open_counters(-1); 629 633 } else { 630 - nr_cpus = read_cpu_map(); 631 634 for (i = 0; i < nr_cpus; i++) 632 635 open_counters(cpumap[i]); 633 636 } ··· 711 704 if (perf_guest) 712 705 perf_session__process_machines(session, event__synthesize_guest_os); 713 706 714 - if (!system_wide && profile_cpu == -1) 707 + if (!system_wide) 715 708 event__synthesize_thread(target_tid, process_synthesized_event, 716 709 session); 717 710 else ··· 773 766 bytes_written / 24); 774 767 775 768 return 0; 769 + 770 + out_delete_session: 771 + perf_session__delete(session); 772 + return err; 776 773 } 777 774 778 775 static const char * const record_usage[] = { ··· 805 794 "system-wide collection from all CPUs"), 806 795 OPT_BOOLEAN('A', "append", &append_file, 807 796 "append to the output file to do incremental profiling"), 
808 - OPT_INTEGER('C', "profile_cpu", &profile_cpu, 809 - "CPU to profile on"), 797 + OPT_STRING('C', "cpu", &cpu_list, "cpu", 798 + "list of cpus to monitor"), 810 799 OPT_BOOLEAN('f', "force", &force, 811 800 "overwrite existing data file (deprecated)"), 812 801 OPT_U64('c', "count", &user_interval, "event period to sample"), ··· 826 815 "Sample addresses"), 827 816 OPT_BOOLEAN('n', "no-samples", &no_samples, 828 817 "don't sample"), 818 + OPT_BOOLEAN('N', "no-buildid-cache", &no_buildid, 819 + "do not update the buildid cache"), 829 820 OPT_END() 830 821 }; 831 822 832 823 int cmd_record(int argc, const char **argv, const char *prefix __used) 833 824 { 834 - int i,j; 825 + int i, j, err = -ENOMEM; 835 826 836 827 argc = parse_options(argc, argv, options, record_usage, 837 828 PARSE_OPT_STOP_AT_NON_OPTION); 838 829 if (!argc && target_pid == -1 && target_tid == -1 && 839 - !system_wide && profile_cpu == -1) 830 + !system_wide && !cpu_list) 840 831 usage_with_options(record_usage, options); 841 832 842 833 if (force && append_file) { ··· 852 839 } 853 840 854 841 symbol__init(); 842 + if (no_buildid) 843 + disable_buildid_cache(); 855 844 856 845 if (!nr_counters) { 857 846 nr_counters = 1; ··· 872 857 } else { 873 858 all_tids=malloc(sizeof(pid_t)); 874 859 if (!all_tids) 875 - return -ENOMEM; 860 + goto out_symbol_exit; 876 861 877 862 all_tids[0] = target_tid; 878 863 thread_num = 1; ··· 882 867 for (j = 0; j < MAX_COUNTERS; j++) { 883 868 fd[i][j] = malloc(sizeof(int)*thread_num); 884 869 if (!fd[i][j]) 885 - return -ENOMEM; 870 + goto out_free_fd; 886 871 } 887 872 } 888 873 event_array = malloc( 889 874 sizeof(struct pollfd)*MAX_NR_CPUS*MAX_COUNTERS*thread_num); 890 875 if (!event_array) 891 - return -ENOMEM; 876 + goto out_free_fd; 892 877 893 878 if (user_interval != ULLONG_MAX) 894 879 default_interval = user_interval; ··· 904 889 default_interval = freq; 905 890 } else { 906 891 fprintf(stderr, "frequency and count are zero, aborting\n"); 907 - 
exit(EXIT_FAILURE); 892 + err = -EINVAL; 893 + goto out_free_event_array; 908 894 } 909 895 910 - return __cmd_record(argc, argv); 896 + err = __cmd_record(argc, argv); 897 + 898 + out_free_event_array: 899 + free(event_array); 900 + out_free_fd: 901 + for (i = 0; i < MAX_NR_CPUS; i++) { 902 + for (j = 0; j < MAX_COUNTERS; j++) 903 + free(fd[i][j]); 904 + } 905 + free(all_tids); 906 + all_tids = NULL; 907 + out_symbol_exit: 908 + symbol__exit(); 909 + return err; 911 910 }
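builtin-record's old `-C/--profile_cpu <int>` becomes `-C/--cpu <list>`, with the string handed to `read_cpu_map()`. The list grammar itself isn't shown in this hunk; assuming the usual comma-separated values and dash ranges (e.g. "0-3,6"), the expansion amounts to:

```python
def parse_cpu_list(cpu_list):
    """Expand a cpu list string such as "0-3,6" into cpu numbers.
    A sketch of the semantics behind the new -C/--cpu option; the
    real parser is read_cpu_map() in tools/perf and may differ."""
    cpus = set()
    for part in cpu_list.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)
```

Note that `nr_cpus = read_cpu_map(cpu_list)` is now called unconditionally, so a bad list fails early with "failed to collect number of CPUs".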
+1 -26
tools/perf/builtin-report.c
··· 155 155 struct addr_location al; 156 156 struct perf_event_attr *attr; 157 157 158 - event__parse_sample(event, session->sample_type, &data); 159 - 160 - dump_printf("(IP, %d): %d/%d: %#Lx period: %Ld\n", event->header.misc, 161 - data.pid, data.tid, data.ip, data.period); 162 - 163 - if (session->sample_type & PERF_SAMPLE_CALLCHAIN) { 164 - unsigned int i; 165 - 166 - dump_printf("... chain: nr:%Lu\n", data.callchain->nr); 167 - 168 - if (!ip_callchain__valid(data.callchain, event)) { 169 - pr_debug("call-chain problem with event, " 170 - "skipping it.\n"); 171 - return 0; 172 - } 173 - 174 - if (dump_trace) { 175 - for (i = 0; i < data.callchain->nr; i++) 176 - dump_printf("..... %2d: %016Lx\n", 177 - i, data.callchain->ips[i]); 178 - } 179 - } 180 - 181 - if (event__preprocess_sample(event, session, &al, NULL) < 0) { 158 + if (event__preprocess_sample(event, session, &al, &data, NULL) < 0) { 182 159 fprintf(stderr, "problem processing %d event, skipping it.\n", 183 160 event->header.type); 184 161 return -1; ··· 441 464 "pretty printing style key: normal raw"), 442 465 OPT_STRING('s', "sort", &sort_order, "key[,key2...]", 443 466 "sort by key(s): pid, comm, dso, symbol, parent"), 444 - OPT_BOOLEAN('P', "full-paths", &symbol_conf.full_paths, 445 - "Don't shorten the pathnames taking into account the cwd"), 446 467 OPT_BOOLEAN(0, "showcpuutilization", &symbol_conf.show_cpu_utilization, 447 468 "Show sample percentage for different cpu modes"), 448 469 OPT_STRING('p', "parent", &parent_pattern, "regex",
+10 -4
tools/perf/builtin-stat.c
··· 69 69 }; 70 70 71 71 static bool system_wide = false; 72 - static unsigned int nr_cpus = 0; 72 + static int nr_cpus = 0; 73 73 static int run_idx = 0; 74 74 75 75 static int run_count = 1; ··· 82 82 static pid_t child_pid = -1; 83 83 static bool null_run = false; 84 84 static bool big_num = false; 85 + static const char *cpu_list; 85 86 86 87 87 88 static int *fd[MAX_NR_CPUS][MAX_COUNTERS]; ··· 159 158 PERF_FORMAT_TOTAL_TIME_RUNNING; 160 159 161 160 if (system_wide) { 162 - unsigned int cpu; 161 + int cpu; 163 162 164 163 for (cpu = 0; cpu < nr_cpus; cpu++) { 165 164 fd[cpu][counter][0] = sys_perf_event_open(attr, ··· 209 208 static void read_counter(int counter) 210 209 { 211 210 u64 count[3], single_count[3]; 212 - unsigned int cpu; 211 + int cpu; 213 212 size_t res, nv; 214 213 int scaled; 215 214 int i, thread; ··· 543 542 "null run - dont start any counters"), 544 543 OPT_BOOLEAN('B', "big-num", &big_num, 545 544 "print large numbers with thousands\' separators"), 545 + OPT_STRING('C', "cpu", &cpu_list, "cpu", 546 + "list of cpus to monitor in system-wide"), 546 547 OPT_END() 547 548 }; 548 549 ··· 569 566 } 570 567 571 568 if (system_wide) 572 - nr_cpus = read_cpu_map(); 569 + nr_cpus = read_cpu_map(cpu_list); 573 570 else 574 571 nr_cpus = 1; 572 + 573 + if (nr_cpus < 1) 574 + usage_with_options(stat_usage, options); 575 575 576 576 if (target_pid != -1) { 577 577 target_tid = target_pid;
+13 -27
tools/perf/builtin-top.c
··· 102 102 static int sym_pcnt_filter = 5; 103 103 static int sym_counter = 0; 104 104 static int display_weighted = -1; 105 + static const char *cpu_list; 105 106 106 107 /* 107 108 * Symbols ··· 983 982 u64 ip = self->ip.ip; 984 983 struct sym_entry *syme; 985 984 struct addr_location al; 985 + struct sample_data data; 986 986 struct machine *machine; 987 987 u8 origin = self->header.misc & PERF_RECORD_MISC_CPUMODE_MASK; 988 988 ··· 1026 1024 if (self->header.misc & PERF_RECORD_MISC_EXACT_IP) 1027 1025 exact_samples++; 1028 1026 1029 - if (event__preprocess_sample(self, session, &al, symbol_filter) < 0 || 1027 + if (event__preprocess_sample(self, session, &al, &data, 1028 + symbol_filter) < 0 || 1030 1029 al.filtered) 1031 1030 return; 1032 1031 ··· 1080 1077 __list_insert_active_sym(syme); 1081 1078 pthread_mutex_unlock(&active_symbols_lock); 1082 1079 } 1083 - } 1084 - 1085 - static int event__process(event_t *event, struct perf_session *session) 1086 - { 1087 - switch (event->header.type) { 1088 - case PERF_RECORD_COMM: 1089 - event__process_comm(event, session); 1090 - break; 1091 - case PERF_RECORD_MMAP: 1092 - event__process_mmap(event, session); 1093 - break; 1094 - case PERF_RECORD_FORK: 1095 - case PERF_RECORD_EXIT: 1096 - event__process_task(event, session); 1097 - break; 1098 - default: 1099 - break; 1100 - } 1101 - 1102 - return 0; 1103 1080 } 1104 1081 1105 1082 struct mmap_data { ··· 1334 1351 "profile events on existing thread id"), 1335 1352 OPT_BOOLEAN('a', "all-cpus", &system_wide, 1336 1353 "system-wide collection from all CPUs"), 1337 - OPT_INTEGER('C', "CPU", &profile_cpu, 1338 - "CPU to profile on"), 1354 + OPT_STRING('C', "cpu", &cpu_list, "cpu", 1355 + "list of cpus to monitor"), 1339 1356 OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name, 1340 1357 "file", "vmlinux pathname"), 1341 1358 OPT_BOOLEAN('K', "hide_kernel_symbols", &hide_kernel_symbols, ··· 1411 1428 return -ENOMEM; 1412 1429 1413 1430 /* CPU and PID are mutually exclusive 
*/ 1414 - if (target_tid > 0 && profile_cpu != -1) { 1431 + if (target_tid > 0 && cpu_list) { 1415 1432 printf("WARNING: PID switch overriding CPU\n"); 1416 1433 sleep(1); 1417 - profile_cpu = -1; 1434 + cpu_list = NULL; 1418 1435 } 1419 1436 1420 1437 if (!nr_counters) ··· 1452 1469 attrs[counter].sample_period = default_interval; 1453 1470 } 1454 1471 1455 - if (target_tid != -1 || profile_cpu != -1) 1472 + if (target_tid != -1) 1456 1473 nr_cpus = 1; 1457 1474 else 1458 - nr_cpus = read_cpu_map(); 1475 + nr_cpus = read_cpu_map(cpu_list); 1476 + 1477 + if (nr_cpus < 1) 1478 + usage_with_options(top_usage, options); 1459 1479 1460 1480 get_term_dimensions(&winsize); 1461 1481 if (print_entries == 0) {
+27 -5
tools/perf/builtin-trace.c
··· 11 11 12 12 static char const *script_name; 13 13 static char const *generate_script_lang; 14 - static bool debug_ordering; 14 + static bool debug_mode; 15 15 static u64 last_timestamp; 16 + static u64 nr_unordered; 16 17 17 18 static int default_start_script(const char *script __unused, 18 19 int argc __unused, ··· 92 91 } 93 92 94 93 if (session->sample_type & PERF_SAMPLE_RAW) { 95 - if (debug_ordering) { 94 + if (debug_mode) { 96 95 if (data.time < last_timestamp) { 97 96 pr_err("Samples misordered, previous: %llu " 98 97 "this: %llu\n", last_timestamp, 99 98 data.time); 99 + nr_unordered++; 100 100 } 101 101 last_timestamp = data.time; 102 + return 0; 102 103 } 103 104 /* 104 105 * FIXME: better resolve from pid from the struct trace_entry ··· 116 113 return 0; 117 114 } 118 115 116 + static u64 nr_lost; 117 + 118 + static int process_lost_event(event_t *event, struct perf_session *session __used) 119 + { 120 + nr_lost += event->lost.lost; 121 + 122 + return 0; 123 + } 124 + 119 125 static struct perf_event_ops event_ops = { 120 126 .sample = process_sample_event, 121 127 .comm = event__process_comm, ··· 132 120 .event_type = event__process_event_type, 133 121 .tracing_data = event__process_tracing_data, 134 122 .build_id = event__process_build_id, 123 + .lost = process_lost_event, 135 124 .ordered_samples = true, 136 125 }; 137 126 ··· 145 132 146 133 static int __cmd_trace(struct perf_session *session) 147 134 { 135 + int ret; 136 + 148 137 signal(SIGINT, sig_handler); 149 138 150 - return perf_session__process_events(session, &event_ops); 139 + ret = perf_session__process_events(session, &event_ops); 140 + 141 + if (debug_mode) { 142 + pr_err("Misordered timestamps: %llu\n", nr_unordered); 143 + pr_err("Lost events: %llu\n", nr_lost); 144 + } 145 + 146 + return ret; 151 147 } 152 148 153 149 struct script_spec { ··· 566 544 "generate perf-trace.xx script in specified language"), 567 545 OPT_STRING('i', "input", &input_name, "file", 568 546 "input file 
name"), 569 - OPT_BOOLEAN('d', "debug-ordering", &debug_ordering, 570 - "check that samples time ordering is monotonic"), 547 + OPT_BOOLEAN('d', "debug-mode", &debug_mode, 548 + "do various checks like samples ordering and lost events"), 571 549 572 550 OPT_END() 573 551 };
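The renamed `--debug-mode` option no longer just prints misordered samples; it counts them (`nr_unordered`) and reports the total, alongside lost events, when the session ends. The bookkeeping, as a Python sketch rather than the C code itself:

```python
def count_unordered(timestamps):
    """Count samples whose timestamp went backwards relative to the
    previous one, mirroring builtin-trace.c's nr_unordered counter
    (last_timestamp starts at 0, as the static u64 does)."""
    nr_unordered = 0
    last_timestamp = 0
    for t in timestamps:
        if t < last_timestamp:
            nr_unordered += 1
        last_timestamp = t
    return nr_unordered
```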
+119
tools/perf/feature-tests.mak
··· 1 + define SOURCE_HELLO 2 + #include <stdio.h> 3 + int main(void) 4 + { 5 + return puts(\"hi\"); 6 + } 7 + endef 8 + 9 + ifndef NO_DWARF 10 + define SOURCE_DWARF 11 + #include <dwarf.h> 12 + #include <libdw.h> 13 + #include <version.h> 14 + #ifndef _ELFUTILS_PREREQ 15 + #error 16 + #endif 17 + 18 + int main(void) 19 + { 20 + Dwarf *dbg = dwarf_begin(0, DWARF_C_READ); 21 + return (long)dbg; 22 + } 23 + endef 24 + endif 25 + 26 + define SOURCE_LIBELF 27 + #include <libelf.h> 28 + 29 + int main(void) 30 + { 31 + Elf *elf = elf_begin(0, ELF_C_READ, 0); 32 + return (long)elf; 33 + } 34 + endef 35 + 36 + define SOURCE_GLIBC 37 + #include <gnu/libc-version.h> 38 + 39 + int main(void) 40 + { 41 + const char *version = gnu_get_libc_version(); 42 + return (long)version; 43 + } 44 + endef 45 + 46 + define SOURCE_ELF_MMAP 47 + #include <libelf.h> 48 + int main(void) 49 + { 50 + Elf *elf = elf_begin(0, ELF_C_READ_MMAP, 0); 51 + return (long)elf; 52 + } 53 + endef 54 + 55 + ifndef NO_NEWT 56 + define SOURCE_NEWT 57 + #include <newt.h> 58 + 59 + int main(void) 60 + { 61 + newtInit(); 62 + newtCls(); 63 + return newtFinished(); 64 + } 65 + endef 66 + endif 67 + 68 + ifndef NO_LIBPERL 69 + define SOURCE_PERL_EMBED 70 + #include <EXTERN.h> 71 + #include <perl.h> 72 + 73 + int main(void) 74 + { 75 + perl_alloc(); 76 + return 0; 77 + } 78 + endef 79 + endif 80 + 81 + ifndef NO_LIBPYTHON 82 + define SOURCE_PYTHON_EMBED 83 + #include <Python.h> 84 + 85 + int main(void) 86 + { 87 + Py_Initialize(); 88 + return 0; 89 + } 90 + endef 91 + endif 92 + 93 + define SOURCE_BFD 94 + #include <bfd.h> 95 + 96 + int main(void) 97 + { 98 + bfd_demangle(0, 0, 0); 99 + return 0; 100 + } 101 + endef 102 + 103 + define SOURCE_CPLUS_DEMANGLE 104 + extern char *cplus_demangle(const char *, int); 105 + 106 + int main(void) 107 + { 108 + cplus_demangle(0, 0); 109 + return 0; 110 + } 111 + endef 112 + 113 + # try-cc 114 + # Usage: option = $(call try-cc, source-to-build, cc-options) 115 + try-cc = $(shell 
sh -c \ 116 + 'TMP="$(TMPOUT).$$$$"; \ 117 + echo "$(1)" | \ 118 + $(CC) -x c - $(2) -o "$$TMP" > /dev/null 2>&1 && echo y; \ 119 + rm -f "$$TMP"')
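The `try-cc` macro probes each feature by piping the test program into `$(CC)` and reporting `y` only if it compiles. Its logic transliterated into Python for illustration (the command layout mirrors the macro; `cc` is whatever compiler command you point it at):

```python
import os
import subprocess
import tempfile

def try_cc(source, options=(), cc="cc"):
    """Rough Python equivalent of the try-cc make macro: feed the
    probe source to the compiler on stdin, build to a throwaway
    output file, and clean it up just as the macro rm -f's $$TMP."""
    tmp = tempfile.NamedTemporaryFile(suffix=".trycc", delete=False).name
    try:
        proc = subprocess.Popen([cc, "-x", "c", "-", "-o", tmp] + list(options),
                                stdin=subprocess.PIPE,
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        proc.communicate(input=source.encode())
        return proc.returncode == 0
    finally:
        try:
            os.unlink(tmp)
        except OSError:
            pass
```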
+15 -5
tools/perf/perf-archive.sh
··· 7 7 PERF_DATA=$1 8 8 fi 9 9 10 - DEBUGDIR=~/.debug/ 10 + # 11 + # PERF_BUILDID_DIR environment variable set by perf 12 + # path to buildid directory, default to $HOME/.debug 13 + # 14 + if [ -z $PERF_BUILDID_DIR ]; then 15 + PERF_BUILDID_DIR=~/.debug/ 16 + else 17 + # append / to make substitutions work 18 + PERF_BUILDID_DIR=$PERF_BUILDID_DIR/ 19 + fi 20 + 11 21 BUILDIDS=$(mktemp /tmp/perf-archive-buildids.XXXXXX) 12 22 NOBUILDID=0000000000000000000000000000000000000000 13 23 ··· 32 22 33 23 cut -d ' ' -f 1 $BUILDIDS | \ 34 24 while read build_id ; do 35 - linkname=$DEBUGDIR.build-id/${build_id:0:2}/${build_id:2} 25 + linkname=$PERF_BUILDID_DIR.build-id/${build_id:0:2}/${build_id:2} 36 26 filename=$(readlink -f $linkname) 37 - echo ${linkname#$DEBUGDIR} >> $MANIFEST 38 - echo ${filename#$DEBUGDIR} >> $MANIFEST 27 + echo ${linkname#$PERF_BUILDID_DIR} >> $MANIFEST 28 + echo ${filename#$PERF_BUILDID_DIR} >> $MANIFEST 39 29 done 40 30 41 - tar cfj $PERF_DATA.tar.bz2 -C $DEBUGDIR -T $MANIFEST 31 + tar cfj $PERF_DATA.tar.bz2 -C $PERF_BUILDID_DIR -T $MANIFEST 42 32 rm -f $MANIFEST $BUILDIDS 43 33 echo -e "Now please run:\n" 44 34 echo -e "$ tar xvf $PERF_DATA.tar.bz2 -C ~/.debug\n"
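perf-archive resolves each build-id through the `<buildid-dir>.build-id/<first two hex chars>/<rest>` symlink layout, which is what the `${build_id:0:2}/${build_id:2}` substitutions compute. The same path computation in Python, for illustration (the directory must carry a trailing `/`, as the script ensures when it appends one to `PERF_BUILDID_DIR`):

```python
def buildid_linkname(buildid_dir, build_id):
    """Path of the .build-id symlink perf-archive reads; buildid_dir
    is expected to end in '/', matching the script's convention."""
    return "%s.build-id/%s/%s" % (buildid_dir, build_id[:2], build_id[2:])
```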
+2
tools/perf/perf.c
··· 458 458 handle_options(&argv, &argc, NULL); 459 459 commit_pager_choice(); 460 460 set_debugfs_path(); 461 + set_buildid_dir(); 462 + 461 463 if (argc > 0) { 462 464 if (!prefixcmp(argv[0], "--")) 463 465 argv[0] += 2;
+30
tools/perf/scripts/python/Perf-Trace-Util/lib/Perf/Trace/Core.py
··· 89 89 value &= ~idx 90 90 91 91 return string 92 + 93 + 94 + def taskState(state): 95 + states = { 96 + 0 : "R", 97 + 1 : "S", 98 + 2 : "D", 99 + 64: "DEAD" 100 + } 101 + 102 + if state not in states: 103 + return "Unknown" 104 + 105 + return states[state] 106 + 107 + 108 + class EventHeaders: 109 + def __init__(self, common_cpu, common_secs, common_nsecs, 110 + common_pid, common_comm): 111 + self.cpu = common_cpu 112 + self.secs = common_secs 113 + self.nsecs = common_nsecs 114 + self.pid = common_pid 115 + self.comm = common_comm 116 + 117 + def ts(self): 118 + return (self.secs * (10 ** 9)) + self.nsecs 119 + 120 + def ts_format(self): 121 + return "%d.%d" % (self.secs, int(self.nsecs / 1000))
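The new `taskState()` helper maps the scheduler's numeric task state to a letter, and `EventHeaders.ts()` folds the seconds/nanoseconds pair into one timestamp. Both fit in a few standalone lines (taskState condensed here with `dict.get` rather than the explicit membership test above):

```python
def taskState(state):
    # Same mapping as Core.py's helper: runnable, sleeping,
    # uninterruptible, dead; anything else reports "Unknown".
    states = {0: "R", 1: "S", 2: "D", 64: "DEAD"}
    return states.get(state, "Unknown")

def ts(secs, nsecs):
    # EventHeaders.ts(): a single nanosecond-resolution timestamp.
    return secs * (10 ** 9) + nsecs
```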
+184
tools/perf/scripts/python/Perf-Trace-Util/lib/Perf/Trace/SchedGui.py
··· 1 + # SchedGui.py - Python extension for perf trace, basic GUI code for 2 + # traces drawing and overview. 3 + # 4 + # Copyright (C) 2010 by Frederic Weisbecker <fweisbec@gmail.com> 5 + # 6 + # This software is distributed under the terms of the GNU General 7 + # Public License ("GPL") version 2 as published by the Free Software 8 + # Foundation. 9 + 10 + 11 + try: 12 + import wx 13 + except ImportError: 14 + raise ImportError, "You need to install the wxpython lib for this script" 15 + 16 + 17 + class RootFrame(wx.Frame): 18 + Y_OFFSET = 100 19 + RECT_HEIGHT = 100 20 + RECT_SPACE = 50 21 + EVENT_MARKING_WIDTH = 5 22 + 23 + def __init__(self, sched_tracer, title, parent = None, id = -1): 24 + wx.Frame.__init__(self, parent, id, title) 25 + 26 + (self.screen_width, self.screen_height) = wx.GetDisplaySize() 27 + self.screen_width -= 10 28 + self.screen_height -= 10 29 + self.zoom = 0.5 30 + self.scroll_scale = 20 31 + self.sched_tracer = sched_tracer 32 + self.sched_tracer.set_root_win(self) 33 + (self.ts_start, self.ts_end) = sched_tracer.interval() 34 + self.update_width_virtual() 35 + self.nr_rects = sched_tracer.nr_rectangles() + 1 36 + self.height_virtual = RootFrame.Y_OFFSET + (self.nr_rects * (RootFrame.RECT_HEIGHT + RootFrame.RECT_SPACE)) 37 + 38 + # whole window panel 39 + self.panel = wx.Panel(self, size=(self.screen_width, self.screen_height)) 40 + 41 + # scrollable container 42 + self.scroll = wx.ScrolledWindow(self.panel) 43 + self.scroll.SetScrollbars(self.scroll_scale, self.scroll_scale, self.width_virtual / self.scroll_scale, self.height_virtual / self.scroll_scale) 44 + self.scroll.EnableScrolling(True, True) 45 + self.scroll.SetFocus() 46 + 47 + # scrollable drawing area 48 + self.scroll_panel = wx.Panel(self.scroll, size=(self.screen_width - 15, self.screen_height / 2)) 49 + self.scroll_panel.Bind(wx.EVT_PAINT, self.on_paint) 50 + self.scroll_panel.Bind(wx.EVT_KEY_DOWN, self.on_key_press) 51 + self.scroll_panel.Bind(wx.EVT_LEFT_DOWN, 
self.on_mouse_down) 52 + self.scroll.Bind(wx.EVT_PAINT, self.on_paint) 53 + self.scroll.Bind(wx.EVT_KEY_DOWN, self.on_key_press) 54 + self.scroll.Bind(wx.EVT_LEFT_DOWN, self.on_mouse_down) 55 + 56 + self.scroll.Fit() 57 + self.Fit() 58 + 59 + self.scroll_panel.SetDimensions(-1, -1, self.width_virtual, self.height_virtual, wx.SIZE_USE_EXISTING) 60 + 61 + self.txt = None 62 + 63 + self.Show(True) 64 + 65 + def us_to_px(self, val): 66 + return val / (10 ** 3) * self.zoom 67 + 68 + def px_to_us(self, val): 69 + return (val / self.zoom) * (10 ** 3) 70 + 71 + def scroll_start(self): 72 + (x, y) = self.scroll.GetViewStart() 73 + return (x * self.scroll_scale, y * self.scroll_scale) 74 + 75 + def scroll_start_us(self): 76 + (x, y) = self.scroll_start() 77 + return self.px_to_us(x) 78 + 79 + def paint_rectangle_zone(self, nr, color, top_color, start, end): 80 + offset_px = self.us_to_px(start - self.ts_start) 81 + width_px = self.us_to_px(end - self.ts_start) 82 + 83 + offset_py = RootFrame.Y_OFFSET + (nr * (RootFrame.RECT_HEIGHT + RootFrame.RECT_SPACE)) 84 + width_py = RootFrame.RECT_HEIGHT 85 + 86 + dc = self.dc 87 + 88 + if top_color is not None: 89 + (r, g, b) = top_color 90 + top_color = wx.Colour(r, g, b) 91 + brush = wx.Brush(top_color, wx.SOLID) 92 + dc.SetBrush(brush) 93 + dc.DrawRectangle(offset_px, offset_py, width_px, RootFrame.EVENT_MARKING_WIDTH) 94 + width_py -= RootFrame.EVENT_MARKING_WIDTH 95 + offset_py += RootFrame.EVENT_MARKING_WIDTH 96 + 97 + (r ,g, b) = color 98 + color = wx.Colour(r, g, b) 99 + brush = wx.Brush(color, wx.SOLID) 100 + dc.SetBrush(brush) 101 + dc.DrawRectangle(offset_px, offset_py, width_px, width_py) 102 + 103 + def update_rectangles(self, dc, start, end): 104 + start += self.ts_start 105 + end += self.ts_start 106 + self.sched_tracer.fill_zone(start, end) 107 + 108 + def on_paint(self, event): 109 + dc = wx.PaintDC(self.scroll_panel) 110 + self.dc = dc 111 + 112 + width = min(self.width_virtual, self.screen_width) 113 + (x, y) = 
self.scroll_start() 114 + start = self.px_to_us(x) 115 + end = self.px_to_us(x + width) 116 + self.update_rectangles(dc, start, end) 117 + 118 + def rect_from_ypixel(self, y): 119 + y -= RootFrame.Y_OFFSET 120 + rect = y / (RootFrame.RECT_HEIGHT + RootFrame.RECT_SPACE) 121 + height = y % (RootFrame.RECT_HEIGHT + RootFrame.RECT_SPACE) 122 + 123 + if rect < 0 or rect > self.nr_rects - 1 or height > RootFrame.RECT_HEIGHT: 124 + return -1 125 + 126 + return rect 127 + 128 + def update_summary(self, txt): 129 + if self.txt: 130 + self.txt.Destroy() 131 + self.txt = wx.StaticText(self.panel, -1, txt, (0, (self.screen_height / 2) + 50)) 132 + 133 + 134 + def on_mouse_down(self, event): 135 + (x, y) = event.GetPositionTuple() 136 + rect = self.rect_from_ypixel(y) 137 + if rect == -1: 138 + return 139 + 140 + t = self.px_to_us(x) + self.ts_start 141 + 142 + self.sched_tracer.mouse_down(rect, t) 143 + 144 + 145 + def update_width_virtual(self): 146 + self.width_virtual = self.us_to_px(self.ts_end - self.ts_start) 147 + 148 + def __zoom(self, x): 149 + self.update_width_virtual() 150 + (xpos, ypos) = self.scroll.GetViewStart() 151 + xpos = self.us_to_px(x) / self.scroll_scale 152 + self.scroll.SetScrollbars(self.scroll_scale, self.scroll_scale, self.width_virtual / self.scroll_scale, self.height_virtual / self.scroll_scale, xpos, ypos) 153 + self.Refresh() 154 + 155 + def zoom_in(self): 156 + x = self.scroll_start_us() 157 + self.zoom *= 2 158 + self.__zoom(x) 159 + 160 + def zoom_out(self): 161 + x = self.scroll_start_us() 162 + self.zoom /= 2 163 + self.__zoom(x) 164 + 165 + 166 + def on_key_press(self, event): 167 + key = event.GetRawKeyCode() 168 + if key == ord("+"): 169 + self.zoom_in() 170 + return 171 + if key == ord("-"): 172 + self.zoom_out() 173 + return 174 + 175 + key = event.GetKeyCode() 176 + (x, y) = self.scroll.GetViewStart() 177 + if key == wx.WXK_RIGHT: 178 + self.scroll.Scroll(x + 1, y) 179 + elif key == wx.WXK_LEFT: 180 + self.scroll.Scroll(x - 1, y) 181 
+ elif key == wx.WXK_DOWN: 182 + self.scroll.Scroll(x, y + 1) 183 + elif key == wx.WXK_UP: 184 + self.scroll.Scroll(x, y - 1)
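RootFrame converts between trace time and pixel offsets via `us_to_px`/`px_to_us` above; the values fed in are the nanosecond timestamps built by `EventHeaders.ts()`, so at zoom 1 one pixel covers 1000 time units, and `zoom_in`/`zoom_out` double or halve the factor. The round trip, as a standalone sketch:

```python
class TimeScale(object):
    """Standalone copy of RootFrame's time<->pixel conversion:
    divide by 1000 and apply the zoom factor (0.5 by default)."""
    def __init__(self, zoom=0.5):
        self.zoom = zoom

    def us_to_px(self, val):
        return val / (10 ** 3) * self.zoom

    def px_to_us(self, val):
        return (val / self.zoom) * (10 ** 3)
```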
+2
tools/perf/scripts/python/bin/sched-migration-record
··· 1 + #!/bin/bash 2 + perf record -m 16384 -a -e sched:sched_wakeup -e sched:sched_wakeup_new -e sched:sched_switch -e sched:sched_migrate_task $@
+3
tools/perf/scripts/python/bin/sched-migration-report
··· 1 + #!/bin/bash 2 + # description: sched migration overview 3 + perf trace $@ -s ~/libexec/perf-core/scripts/python/sched-migration.py
+461
tools/perf/scripts/python/sched-migration.py
··· 1 + #!/usr/bin/python 2 + # 3 + # Cpu task migration overview toy 4 + # 5 + # Copyright (C) 2010 Frederic Weisbecker <fweisbec@gmail.com> 6 + # 7 + # perf trace event handlers have been generated by perf trace -g python 8 + # 9 + # This software is distributed under the terms of the GNU General 10 + # Public License ("GPL") version 2 as published by the Free Software 11 + # Foundation. 12 + 13 + 14 + import os 15 + import sys 16 + 17 + from collections import defaultdict 18 + from UserList import UserList 19 + 20 + sys.path.append(os.environ['PERF_EXEC_PATH'] + \ 21 + '/scripts/python/Perf-Trace-Util/lib/Perf/Trace') 22 + sys.path.append('scripts/python/Perf-Trace-Util/lib/Perf/Trace') 23 + 24 + from perf_trace_context import * 25 + from Core import * 26 + from SchedGui import * 27 + 28 + 29 + threads = { 0 : "idle"} 30 + 31 + def thread_name(pid): 32 + return "%s:%d" % (threads[pid], pid) 33 + 34 + class RunqueueEventUnknown: 35 + @staticmethod 36 + def color(): 37 + return None 38 + 39 + def __repr__(self): 40 + return "unknown" 41 + 42 + class RunqueueEventSleep: 43 + @staticmethod 44 + def color(): 45 + return (0, 0, 0xff) 46 + 47 + def __init__(self, sleeper): 48 + self.sleeper = sleeper 49 + 50 + def __repr__(self): 51 + return "%s gone to sleep" % thread_name(self.sleeper) 52 + 53 + class RunqueueEventWakeup: 54 + @staticmethod 55 + def color(): 56 + return (0xff, 0xff, 0) 57 + 58 + def __init__(self, wakee): 59 + self.wakee = wakee 60 + 61 + def __repr__(self): 62 + return "%s woke up" % thread_name(self.wakee) 63 + 64 + class RunqueueEventFork: 65 + @staticmethod 66 + def color(): 67 + return (0, 0xff, 0) 68 + 69 + def __init__(self, child): 70 + self.child = child 71 + 72 + def __repr__(self): 73 + return "new forked task %s" % thread_name(self.child) 74 + 75 + class RunqueueMigrateIn: 76 + @staticmethod 77 + def color(): 78 + return (0, 0xf0, 0xff) 79 + 80 + def __init__(self, new): 81 + self.new = new 82 + 83 + def __repr__(self): 84 + return "task 
migrated in %s" % thread_name(self.new) 85 + 86 + class RunqueueMigrateOut: 87 + @staticmethod 88 + def color(): 89 + return (0xff, 0, 0xff) 90 + 91 + def __init__(self, old): 92 + self.old = old 93 + 94 + def __repr__(self): 95 + return "task migrated out %s" % thread_name(self.old) 96 + 97 + class RunqueueSnapshot: 98 + def __init__(self, tasks = [0], event = RunqueueEventUnknown()): 99 + self.tasks = tuple(tasks) 100 + self.event = event 101 + 102 + def sched_switch(self, prev, prev_state, next): 103 + event = RunqueueEventUnknown() 104 + 105 + if taskState(prev_state) == "R" and next in self.tasks \ 106 + and prev in self.tasks: 107 + return self 108 + 109 + if taskState(prev_state) != "R": 110 + event = RunqueueEventSleep(prev) 111 + 112 + next_tasks = list(self.tasks[:]) 113 + if prev in self.tasks: 114 + if taskState(prev_state) != "R": 115 + next_tasks.remove(prev) 116 + elif taskState(prev_state) == "R": 117 + next_tasks.append(prev) 118 + 119 + if next not in next_tasks: 120 + next_tasks.append(next) 121 + 122 + return RunqueueSnapshot(next_tasks, event) 123 + 124 + def migrate_out(self, old): 125 + if old not in self.tasks: 126 + return self 127 + next_tasks = [task for task in self.tasks if task != old] 128 + 129 + return RunqueueSnapshot(next_tasks, RunqueueMigrateOut(old)) 130 + 131 + def __migrate_in(self, new, event): 132 + if new in self.tasks: 133 + self.event = event 134 + return self 135 + next_tasks = self.tasks[:] + tuple([new]) 136 + 137 + return RunqueueSnapshot(next_tasks, event) 138 + 139 + def migrate_in(self, new): 140 + return self.__migrate_in(new, RunqueueMigrateIn(new)) 141 + 142 + def wake_up(self, new): 143 + return self.__migrate_in(new, RunqueueEventWakeup(new)) 144 + 145 + def wake_up_new(self, new): 146 + return self.__migrate_in(new, RunqueueEventFork(new)) 147 + 148 + def load(self): 149 + """ Provide the number of tasks on the runqueue. 
150 + Don't count idle""" 151 + return len(self.tasks) - 1 152 + 153 + def __repr__(self): 154 + ret = self.tasks.__repr__() 155 + ret += self.origin_tostring() 156 + 157 + return ret 158 + 159 + class TimeSlice: 160 + def __init__(self, start, prev): 161 + self.start = start 162 + self.prev = prev 163 + self.end = start 164 + # cpus that triggered the event 165 + self.event_cpus = [] 166 + if prev is not None: 167 + self.total_load = prev.total_load 168 + self.rqs = prev.rqs.copy() 169 + else: 170 + self.rqs = defaultdict(RunqueueSnapshot) 171 + self.total_load = 0 172 + 173 + def __update_total_load(self, old_rq, new_rq): 174 + diff = new_rq.load() - old_rq.load() 175 + self.total_load += diff 176 + 177 + def sched_switch(self, ts_list, prev, prev_state, next, cpu): 178 + old_rq = self.prev.rqs[cpu] 179 + new_rq = old_rq.sched_switch(prev, prev_state, next) 180 + 181 + if old_rq is new_rq: 182 + return 183 + 184 + self.rqs[cpu] = new_rq 185 + self.__update_total_load(old_rq, new_rq) 186 + ts_list.append(self) 187 + self.event_cpus = [cpu] 188 + 189 + def migrate(self, ts_list, new, old_cpu, new_cpu): 190 + if old_cpu == new_cpu: 191 + return 192 + old_rq = self.prev.rqs[old_cpu] 193 + out_rq = old_rq.migrate_out(new) 194 + self.rqs[old_cpu] = out_rq 195 + self.__update_total_load(old_rq, out_rq) 196 + 197 + new_rq = self.prev.rqs[new_cpu] 198 + in_rq = new_rq.migrate_in(new) 199 + self.rqs[new_cpu] = in_rq 200 + self.__update_total_load(new_rq, in_rq) 201 + 202 + ts_list.append(self) 203 + 204 + if old_rq is not out_rq: 205 + self.event_cpus.append(old_cpu) 206 + self.event_cpus.append(new_cpu) 207 + 208 + def wake_up(self, ts_list, pid, cpu, fork): 209 + old_rq = self.prev.rqs[cpu] 210 + if fork: 211 + new_rq = old_rq.wake_up_new(pid) 212 + else: 213 + new_rq = old_rq.wake_up(pid) 214 + 215 + if new_rq is old_rq: 216 + return 217 + self.rqs[cpu] = new_rq 218 + self.__update_total_load(old_rq, new_rq) 219 + ts_list.append(self) 220 + self.event_cpus = [cpu] 221 + 
222 + def next(self, t): 223 + self.end = t 224 + return TimeSlice(t, self) 225 + 226 + class TimeSliceList(UserList): 227 + def __init__(self, arg = []): 228 + self.data = arg 229 + 230 + def get_time_slice(self, ts): 231 + if len(self.data) == 0: 232 + slice = TimeSlice(ts, TimeSlice(-1, None)) 233 + else: 234 + slice = self.data[-1].next(ts) 235 + return slice 236 + 237 + def find_time_slice(self, ts): 238 + start = 0 239 + end = len(self.data) 240 + found = -1 241 + searching = True 242 + while searching: 243 + if start == end or start == end - 1: 244 + searching = False 245 + 246 + i = (end + start) / 2 247 + if self.data[i].start <= ts and self.data[i].end >= ts: 248 + found = i 249 + end = i 250 + continue 251 + 252 + if self.data[i].end < ts: 253 + start = i 254 + 255 + elif self.data[i].start > ts: 256 + end = i 257 + 258 + return found 259 + 260 + def set_root_win(self, win): 261 + self.root_win = win 262 + 263 + def mouse_down(self, cpu, t): 264 + idx = self.find_time_slice(t) 265 + if idx == -1: 266 + return 267 + 268 + ts = self[idx] 269 + rq = ts.rqs[cpu] 270 + raw = "CPU: %d\n" % cpu 271 + raw += "Last event : %s\n" % rq.event.__repr__() 272 + raw += "Timestamp : %d.%06d\n" % (ts.start / (10 ** 9), (ts.start % (10 ** 9)) / 1000) 273 + raw += "Duration : %6d us\n" % ((ts.end - ts.start) / (10 ** 6)) 274 + raw += "Load = %d\n" % rq.load() 275 + for t in rq.tasks: 276 + raw += "%s \n" % thread_name(t) 277 + 278 + self.root_win.update_summary(raw) 279 + 280 + def update_rectangle_cpu(self, slice, cpu): 281 + rq = slice.rqs[cpu] 282 + 283 + if slice.total_load != 0: 284 + load_rate = rq.load() / float(slice.total_load) 285 + else: 286 + load_rate = 0 287 + 288 + red_power = int(0xff - (0xff * load_rate)) 289 + color = (0xff, red_power, red_power) 290 + 291 + top_color = None 292 + 293 + if cpu in slice.event_cpus: 294 + top_color = rq.event.color() 295 + 296 + self.root_win.paint_rectangle_zone(cpu, color, top_color, slice.start, slice.end) 297 + 298 + 
def fill_zone(self, start, end): 299 + i = self.find_time_slice(start) 300 + if i == -1: 301 + return 302 + 303 + for i in xrange(i, len(self.data)): 304 + timeslice = self.data[i] 305 + if timeslice.start > end: 306 + return 307 + 308 + for cpu in timeslice.rqs: 309 + self.update_rectangle_cpu(timeslice, cpu) 310 + 311 + def interval(self): 312 + if len(self.data) == 0: 313 + return (0, 0) 314 + 315 + return (self.data[0].start, self.data[-1].end) 316 + 317 + def nr_rectangles(self): 318 + last_ts = self.data[-1] 319 + max_cpu = 0 320 + for cpu in last_ts.rqs: 321 + if cpu > max_cpu: 322 + max_cpu = cpu 323 + return max_cpu 324 + 325 + 326 + class SchedEventProxy: 327 + def __init__(self): 328 + self.current_tsk = defaultdict(lambda : -1) 329 + self.timeslices = TimeSliceList() 330 + 331 + def sched_switch(self, headers, prev_comm, prev_pid, prev_prio, prev_state, 332 + next_comm, next_pid, next_prio): 333 + """ Ensure the task we sched out this cpu is really the one 334 + we logged. Otherwise we may have missed traces """ 335 + 336 + on_cpu_task = self.current_tsk[headers.cpu] 337 + 338 + if on_cpu_task != -1 and on_cpu_task != prev_pid: 339 + print "Sched switch event rejected ts: %s cpu: %d prev: %s(%d) next: %s(%d)" % \ 340 + (headers.ts_format(), headers.cpu, prev_comm, prev_pid, next_comm, next_pid) 341 + 342 + threads[prev_pid] = prev_comm 343 + threads[next_pid] = next_comm 344 + self.current_tsk[headers.cpu] = next_pid 345 + 346 + ts = self.timeslices.get_time_slice(headers.ts()) 347 + ts.sched_switch(self.timeslices, prev_pid, prev_state, next_pid, headers.cpu) 348 + 349 + def migrate(self, headers, pid, prio, orig_cpu, dest_cpu): 350 + ts = self.timeslices.get_time_slice(headers.ts()) 351 + ts.migrate(self.timeslices, pid, orig_cpu, dest_cpu) 352 + 353 + def wake_up(self, headers, comm, pid, success, target_cpu, fork): 354 + if success == 0: 355 + return 356 + ts = self.timeslices.get_time_slice(headers.ts()) 357 + ts.wake_up(self.timeslices, pid, 
target_cpu, fork) 358 + 359 + 360 + def trace_begin(): 361 + global parser 362 + parser = SchedEventProxy() 363 + 364 + def trace_end(): 365 + app = wx.App(False) 366 + timeslices = parser.timeslices 367 + frame = RootFrame(timeslices, "Migration") 368 + app.MainLoop() 369 + 370 + def sched__sched_stat_runtime(event_name, context, common_cpu, 371 + common_secs, common_nsecs, common_pid, common_comm, 372 + comm, pid, runtime, vruntime): 373 + pass 374 + 375 + def sched__sched_stat_iowait(event_name, context, common_cpu, 376 + common_secs, common_nsecs, common_pid, common_comm, 377 + comm, pid, delay): 378 + pass 379 + 380 + def sched__sched_stat_sleep(event_name, context, common_cpu, 381 + common_secs, common_nsecs, common_pid, common_comm, 382 + comm, pid, delay): 383 + pass 384 + 385 + def sched__sched_stat_wait(event_name, context, common_cpu, 386 + common_secs, common_nsecs, common_pid, common_comm, 387 + comm, pid, delay): 388 + pass 389 + 390 + def sched__sched_process_fork(event_name, context, common_cpu, 391 + common_secs, common_nsecs, common_pid, common_comm, 392 + parent_comm, parent_pid, child_comm, child_pid): 393 + pass 394 + 395 + def sched__sched_process_wait(event_name, context, common_cpu, 396 + common_secs, common_nsecs, common_pid, common_comm, 397 + comm, pid, prio): 398 + pass 399 + 400 + def sched__sched_process_exit(event_name, context, common_cpu, 401 + common_secs, common_nsecs, common_pid, common_comm, 402 + comm, pid, prio): 403 + pass 404 + 405 + def sched__sched_process_free(event_name, context, common_cpu, 406 + common_secs, common_nsecs, common_pid, common_comm, 407 + comm, pid, prio): 408 + pass 409 + 410 + def sched__sched_migrate_task(event_name, context, common_cpu, 411 + common_secs, common_nsecs, common_pid, common_comm, 412 + comm, pid, prio, orig_cpu, 413 + dest_cpu): 414 + headers = EventHeaders(common_cpu, common_secs, common_nsecs, 415 + common_pid, common_comm) 416 + parser.migrate(headers, pid, prio, orig_cpu, dest_cpu) 
417 + 418 + def sched__sched_switch(event_name, context, common_cpu, 419 + common_secs, common_nsecs, common_pid, common_comm, 420 + prev_comm, prev_pid, prev_prio, prev_state, 421 + next_comm, next_pid, next_prio): 422 + 423 + headers = EventHeaders(common_cpu, common_secs, common_nsecs, 424 + common_pid, common_comm) 425 + parser.sched_switch(headers, prev_comm, prev_pid, prev_prio, prev_state, 426 + next_comm, next_pid, next_prio) 427 + 428 + def sched__sched_wakeup_new(event_name, context, common_cpu, 429 + common_secs, common_nsecs, common_pid, common_comm, 430 + comm, pid, prio, success, 431 + target_cpu): 432 + headers = EventHeaders(common_cpu, common_secs, common_nsecs, 433 + common_pid, common_comm) 434 + parser.wake_up(headers, comm, pid, success, target_cpu, 1) 435 + 436 + def sched__sched_wakeup(event_name, context, common_cpu, 437 + common_secs, common_nsecs, common_pid, common_comm, 438 + comm, pid, prio, success, 439 + target_cpu): 440 + headers = EventHeaders(common_cpu, common_secs, common_nsecs, 441 + common_pid, common_comm) 442 + parser.wake_up(headers, comm, pid, success, target_cpu, 0) 443 + 444 + def sched__sched_wait_task(event_name, context, common_cpu, 445 + common_secs, common_nsecs, common_pid, common_comm, 446 + comm, pid, prio): 447 + pass 448 + 449 + def sched__sched_kthread_stop_ret(event_name, context, common_cpu, 450 + common_secs, common_nsecs, common_pid, common_comm, 451 + ret): 452 + pass 453 + 454 + def sched__sched_kthread_stop(event_name, context, common_cpu, 455 + common_secs, common_nsecs, common_pid, common_comm, 456 + comm, pid): 457 + pass 458 + 459 + def trace_unhandled(event_name, context, common_cpu, common_secs, common_nsecs, 460 + common_pid, common_comm): 461 + pass
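Annotation: the script's `TimeSliceList.find_time_slice()` above is a binary search over sorted, non-overlapping time slices, written against Python 2 (`(end + start) / 2` relies on integer division). A minimal standalone Python 3 sketch of the same idea — finding the slice whose `[start, end]` interval contains a timestamp — might look like this (the `Slice` tuple is an illustrative stand-in for the script's `TimeSlice` objects, not part of perf):

```python
from collections import namedtuple

# Stand-in for the script's TimeSlice: a closed [start, end] interval.
Slice = namedtuple("Slice", ["start", "end"])

def find_time_slice(slices, ts):
    """Binary-search a sorted, non-overlapping slice list.

    Returns the index of the slice containing ts, or -1 if none does.
    """
    lo, hi = 0, len(slices) - 1
    while lo <= hi:
        mid = (lo + hi) // 2        # // so this also works on Python 3
        s = slices[mid]
        if s.start <= ts <= s.end:
            return mid
        if s.end < ts:
            lo = mid + 1            # target lies to the right
        else:
            hi = mid - 1            # target lies to the left
    return -1
```

Unlike the original loop, this returns as soon as a containing slice is found instead of continuing to narrow the window.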
+22 -6
tools/perf/util/build-id.c
··· 12 12 #include "event.h" 13 13 #include "symbol.h" 14 14 #include <linux/kernel.h> 15 + #include "debug.h" 15 16 16 17 static int build_id__mark_dso_hit(event_t *event, struct perf_session *session) 17 18 { ··· 35 34 return 0; 36 35 } 37 36 37 + static int event__exit_del_thread(event_t *self, struct perf_session *session) 38 + { 39 + struct thread *thread = perf_session__findnew(session, self->fork.tid); 40 + 41 + dump_printf("(%d:%d):(%d:%d)\n", self->fork.pid, self->fork.tid, 42 + self->fork.ppid, self->fork.ptid); 43 + 44 + if (thread) { 45 + rb_erase(&thread->rb_node, &session->threads); 46 + session->last_match = NULL; 47 + thread__delete(thread); 48 + } 49 + 50 + return 0; 51 + } 52 + 38 53 struct perf_event_ops build_id__mark_dso_hit_ops = { 39 54 .sample = build_id__mark_dso_hit, 40 55 .mmap = event__process_mmap, 41 56 .fork = event__process_task, 57 + .exit = event__exit_del_thread, 42 58 }; 43 59 44 60 char *dso__build_id_filename(struct dso *self, char *bf, size_t size) 45 61 { 46 62 char build_id_hex[BUILD_ID_SIZE * 2 + 1]; 47 - const char *home; 48 63 49 64 if (!self->has_build_id) 50 65 return NULL; 51 66 52 67 build_id__sprintf(self->build_id, sizeof(self->build_id), build_id_hex); 53 - home = getenv("HOME"); 54 68 if (bf == NULL) { 55 - if (asprintf(&bf, "%s/%s/.build-id/%.2s/%s", home, 56 - DEBUG_CACHE_DIR, build_id_hex, build_id_hex + 2) < 0) 69 + if (asprintf(&bf, "%s/.build-id/%.2s/%s", buildid_dir, 70 + build_id_hex, build_id_hex + 2) < 0) 57 71 return NULL; 58 72 } else 59 - snprintf(bf, size, "%s/%s/.build-id/%.2s/%s", home, 60 - DEBUG_CACHE_DIR, build_id_hex, build_id_hex + 2); 73 + snprintf(bf, size, "%s/.build-id/%.2s/%s", buildid_dir, 74 + build_id_hex, build_id_hex + 2); 61 75 return bf; 62 76 }
+1
tools/perf/util/cache.h
··· 23 23 extern int perf_config_int(const char *, const char *); 24 24 extern int perf_config_bool(const char *, const char *); 25 25 extern int config_error_nonbool(const char *); 26 + extern const char *perf_config_dirname(const char *, const char *); 26 27 27 28 /* pager.c */ 28 29 extern void setup_pager(void);
+1 -1
tools/perf/util/callchain.c
··· 18 18 #include "util.h" 19 19 #include "callchain.h" 20 20 21 - bool ip_callchain__valid(struct ip_callchain *chain, event_t *event) 21 + bool ip_callchain__valid(struct ip_callchain *chain, const event_t *event) 22 22 { 23 23 unsigned int chain_size = event->header.size; 24 24 chain_size -= (unsigned long)&event->ip.__more_data - (unsigned long)event;
+1 -1
tools/perf/util/callchain.h
··· 63 63 int append_chain(struct callchain_node *root, struct ip_callchain *chain, 64 64 struct map_symbol *syms, u64 period); 65 65 66 - bool ip_callchain__valid(struct ip_callchain *chain, event_t *event); 66 + bool ip_callchain__valid(struct ip_callchain *chain, const event_t *event); 67 67 #endif /* __PERF_CALLCHAIN_H */
+63 -1
tools/perf/util/config.c
··· 11 11 12 12 #define MAXNAME (256) 13 13 14 + #define DEBUG_CACHE_DIR ".debug" 15 + 16 + 17 + char buildid_dir[MAXPATHLEN]; /* root dir for buildid, binary cache */ 18 + 14 19 static FILE *config_file; 15 20 static const char *config_file_name; 16 21 static int config_linenr; ··· 132 127 break; 133 128 if (!iskeychar(c)) 134 129 break; 135 - name[len++] = tolower(c); 130 + name[len++] = c; 136 131 if (len >= MAXNAME) 137 132 return -1; 138 133 } ··· 332 327 return !!perf_config_bool_or_int(name, value, &discard); 333 328 } 334 329 330 + const char *perf_config_dirname(const char *name, const char *value) 331 + { 332 + if (!name) 333 + return NULL; 334 + return value; 335 + } 336 + 335 337 static int perf_default_core_config(const char *var __used, const char *value __used) 336 338 { 337 339 /* Add other config variables here and to Documentation/config.txt. */ ··· 439 427 int config_error_nonbool(const char *var) 440 428 { 441 429 return error("Missing value for '%s'", var); 430 + } 431 + 432 + struct buildid_dir_config { 433 + char *dir; 434 + }; 435 + 436 + static int buildid_dir_command_config(const char *var, const char *value, 437 + void *data) 438 + { 439 + struct buildid_dir_config *c = data; 440 + const char *v; 441 + 442 + /* same dir for all commands */ 443 + if (!prefixcmp(var, "buildid.") && !strcmp(var + 8, "dir")) { 444 + v = perf_config_dirname(var, value); 445 + if (!v) 446 + return -1; 447 + strncpy(c->dir, v, MAXPATHLEN-1); 448 + c->dir[MAXPATHLEN-1] = '\0'; 449 + } 450 + return 0; 451 + } 452 + 453 + static void check_buildid_dir_config(void) 454 + { 455 + struct buildid_dir_config c; 456 + c.dir = buildid_dir; 457 + perf_config(buildid_dir_command_config, &c); 458 + } 459 + 460 + void set_buildid_dir(void) 461 + { 462 + buildid_dir[0] = '\0'; 463 + 464 + /* try config file */ 465 + check_buildid_dir_config(); 466 + 467 + /* default to $HOME/.debug */ 468 + if (buildid_dir[0] == '\0') { 469 + char *v = getenv("HOME"); 470 + if (v) { 471 + 
snprintf(buildid_dir, MAXPATHLEN-1, "%s/%s", 472 + v, DEBUG_CACHE_DIR); 473 + } else { 474 + strncpy(buildid_dir, DEBUG_CACHE_DIR, MAXPATHLEN-1); 475 + } 476 + buildid_dir[MAXPATHLEN-1] = '\0'; 477 + } 478 + /* for communicating with external commands */ 479 + setenv("PERF_BUILDID_DIR", buildid_dir, 1); 442 480 }
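Annotation: `set_buildid_dir()` above resolves the build-id cache root with a simple precedence — an explicit `buildid.dir` config value wins, otherwise `$HOME/.debug`, with a bare `.debug` as the last resort. A hedged Python sketch of that precedence (the function name and `env` parameter are illustrative, not part of perf):

```python
import os

DEBUG_CACHE_DIR = ".debug"

def resolve_buildid_dir(config_value=None, env=None):
    """Mirror perf's precedence: config 'buildid.dir', else $HOME/.debug,
    else a bare '.debug'. `env` defaults to os.environ."""
    if env is None:
        env = os.environ
    if config_value:                      # config file wins outright
        return config_value
    home = env.get("HOME")
    if home:
        return os.path.join(home, DEBUG_CACHE_DIR)
    return DEBUG_CACHE_DIR                # no $HOME: relative fallback
```

The C code additionally exports the result as `PERF_BUILDID_DIR` so child perf commands agree on the cache location.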
+56 -1
tools/perf/util/cpumap.c
··· 20 20 return nr_cpus; 21 21 } 22 22 23 - int read_cpu_map(void) 23 + static int read_all_cpu_map(void) 24 24 { 25 25 FILE *onlnf; 26 26 int nr_cpus = 0; ··· 56 56 return nr_cpus; 57 57 58 58 return default_cpu_map(); 59 + } 60 + 61 + int read_cpu_map(const char *cpu_list) 62 + { 63 + unsigned long start_cpu, end_cpu = 0; 64 + char *p = NULL; 65 + int i, nr_cpus = 0; 66 + 67 + if (!cpu_list) 68 + return read_all_cpu_map(); 69 + 70 + if (!isdigit(*cpu_list)) 71 + goto invalid; 72 + 73 + while (isdigit(*cpu_list)) { 74 + p = NULL; 75 + start_cpu = strtoul(cpu_list, &p, 0); 76 + if (start_cpu >= INT_MAX 77 + || (*p != '\0' && *p != ',' && *p != '-')) 78 + goto invalid; 79 + 80 + if (*p == '-') { 81 + cpu_list = ++p; 82 + p = NULL; 83 + end_cpu = strtoul(cpu_list, &p, 0); 84 + 85 + if (end_cpu >= INT_MAX || (*p != '\0' && *p != ',')) 86 + goto invalid; 87 + 88 + if (end_cpu < start_cpu) 89 + goto invalid; 90 + } else { 91 + end_cpu = start_cpu; 92 + } 93 + 94 + for (; start_cpu <= end_cpu; start_cpu++) { 95 + /* check for duplicates */ 96 + for (i = 0; i < nr_cpus; i++) 97 + if (cpumap[i] == (int)start_cpu) 98 + goto invalid; 99 + 100 + assert(nr_cpus < MAX_NR_CPUS); 101 + cpumap[nr_cpus++] = (int)start_cpu; 102 + } 103 + if (*p) 104 + ++p; 105 + 106 + cpu_list = p; 107 + } 108 + if (nr_cpus > 0) 109 + return nr_cpus; 110 + 111 + return default_cpu_map(); 112 + invalid: 113 + return -1; 59 114 }
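Annotation: the new `read_cpu_map()` above accepts a cpu list of the form `0-3,6,8-9`, rejecting malformed input, reversed ranges, and duplicate cpus. A rough Python equivalent of that grammar (returning `None` where the C code returns -1; the helper name is made up for illustration):

```python
def parse_cpu_list(cpu_list):
    """Parse '0-3,6'-style lists the way read_cpu_map() does:
    reject non-digits, reversed ranges, and duplicate cpus."""
    cpus = []
    for chunk in cpu_list.split(","):
        if not chunk or not chunk[0].isdigit():
            return None
        lo, sep, hi = chunk.partition("-")
        try:
            start = int(lo)
            end = int(hi) if sep else start   # bare 'N' means N-N
        except ValueError:
            return None
        if end < start:                       # reversed range is invalid
            return None
        for cpu in range(start, end + 1):
            if cpu in cpus:                   # duplicates are invalid
                return None
            cpus.append(cpu)
    return cpus
```

When the string is absent the C function instead falls back to reading the full online cpu map from sysfs.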
+1 -1
tools/perf/util/cpumap.h
··· 1 1 #ifndef __PERF_CPUMAP_H 2 2 #define __PERF_CPUMAP_H 3 3 4 - extern int read_cpu_map(void); 4 + extern int read_cpu_map(const char *cpu_list); 5 5 extern int cpumap[]; 6 6 7 7 #endif /* __PERF_CPUMAP_H */
+4 -6
tools/perf/util/debug.c
··· 86 86 dump_printf_color(" ", color); 87 87 for (j = 0; j < 15-(i & 15); j++) 88 88 dump_printf_color(" ", color); 89 - for (j = 0; j < (i & 15); j++) { 90 - if (isprint(raw_event[i-15+j])) 91 - dump_printf_color("%c", color, 92 - raw_event[i-15+j]); 93 - else 94 - dump_printf_color(".", color); 89 + for (j = i & ~15; j <= i; j++) { 90 + dump_printf_color("%c", color, 91 + isprint(raw_event[j]) ? 92 + raw_event[j] : '.'); 95 93 } 96 94 dump_printf_color("\n", color); 97 95 }
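Annotation: the debug.c hunk above simplifies the trailing ASCII column of the raw-event hex dump — instead of indexing backwards with `raw_event[i-15+j]`, it walks `j` from the row start (`i & ~15`) up to `i`. The same masking trick in a small Python sketch (function name is illustrative):

```python
def ascii_column(raw, i):
    """Render the printable column for the (possibly partial) 16-byte
    row ending at index i, as the reworked debug.c loop does."""
    row_start = i & ~15                        # first index of the 16-byte row
    return "".join(
        chr(b) if 32 <= b < 127 else "."       # isprint() stand-in
        for b in raw[row_start:i + 1]
    )
```

Because `i & ~15` clears the low four bits, the loop naturally handles both full 16-byte rows and a short final row.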
+74 -33
tools/perf/util/event.c
··· 151 151 continue; 152 152 pbf += n + 3; 153 153 if (*pbf == 'x') { /* vm_exec */ 154 - u64 vm_pgoff; 155 154 char *execname = strchr(bf, '/'); 156 155 157 156 /* Catch VDSO */ ··· 161 162 continue; 162 163 163 164 pbf += 3; 164 - n = hex2u64(pbf, &vm_pgoff); 165 - /* pgoff is in bytes, not pages */ 166 - if (n >= 0) 167 - ev.mmap.pgoff = vm_pgoff << getpagesize(); 168 - else 169 - ev.mmap.pgoff = 0; 165 + n = hex2u64(pbf, &ev.mmap.pgoff); 170 166 171 167 size = strlen(execname); 172 168 execname[size - 1] = '\0'; /* Remove \n */ ··· 334 340 return process(&ev, session); 335 341 } 336 342 337 - static void thread__comm_adjust(struct thread *self) 343 + static void thread__comm_adjust(struct thread *self, struct hists *hists) 338 344 { 339 345 char *comm = self->comm; 340 346 341 347 if (!symbol_conf.col_width_list_str && !symbol_conf.field_sep && 342 348 (!symbol_conf.comm_list || 343 349 strlist__has_entry(symbol_conf.comm_list, comm))) { 344 - unsigned int slen = strlen(comm); 350 + u16 slen = strlen(comm); 345 351 346 - if (slen > comms__col_width) { 347 - comms__col_width = slen; 348 - threads__col_width = slen + 6; 349 - } 352 + if (hists__new_col_len(hists, HISTC_COMM, slen)) 353 + hists__set_col_len(hists, HISTC_THREAD, slen + 6); 350 354 } 351 355 } 352 356 353 - static int thread__set_comm_adjust(struct thread *self, const char *comm) 357 + static int thread__set_comm_adjust(struct thread *self, const char *comm, 358 + struct hists *hists) 354 359 { 355 360 int ret = thread__set_comm(self, comm); 356 361 357 362 if (ret) 358 363 return ret; 359 364 360 - thread__comm_adjust(self); 365 + thread__comm_adjust(self, hists); 361 366 362 367 return 0; 363 368 } ··· 367 374 368 375 dump_printf(": %s:%d\n", self->comm.comm, self->comm.tid); 369 376 370 - if (thread == NULL || thread__set_comm_adjust(thread, self->comm.comm)) { 377 + if (thread == NULL || thread__set_comm_adjust(thread, self->comm.comm, 378 + &session->hists)) { 371 379 dump_printf("problem 
processing PERF_RECORD_COMM, skipping event.\n"); 372 380 return -1; 373 381 } ··· 450 456 goto out_problem; 451 457 452 458 map->dso->short_name = name; 459 + map->dso->sname_alloc = 1; 453 460 map->end = map->start + self->mmap.len; 454 461 } else if (is_kernel_mmap) { 455 462 const char *symbol_name = (self->mmap.filename + ··· 509 514 if (machine == NULL) 510 515 goto out_problem; 511 516 thread = perf_session__findnew(session, self->mmap.pid); 517 + if (thread == NULL) 518 + goto out_problem; 512 519 map = map__new(&machine->user_dsos, self->mmap.start, 513 520 self->mmap.len, self->mmap.pgoff, 514 521 self->mmap.pid, self->mmap.filename, 515 - MAP__FUNCTION, session->cwd, session->cwdlen); 516 - 517 - if (thread == NULL || map == NULL) 522 + MAP__FUNCTION); 523 + if (map == NULL) 518 524 goto out_problem; 519 525 520 526 thread__insert_map(thread, map); ··· 543 547 thread__fork(thread, parent) < 0) { 544 548 dump_printf("problem processing PERF_RECORD_FORK, skipping event.\n"); 545 549 return -1; 550 + } 551 + 552 + return 0; 553 + } 554 + 555 + int event__process(event_t *event, struct perf_session *session) 556 + { 557 + switch (event->header.type) { 558 + case PERF_RECORD_COMM: 559 + event__process_comm(event, session); 560 + break; 561 + case PERF_RECORD_MMAP: 562 + event__process_mmap(event, session); 563 + break; 564 + case PERF_RECORD_FORK: 565 + case PERF_RECORD_EXIT: 566 + event__process_task(event, session); 567 + break; 568 + default: 569 + break; 546 570 } 547 571 548 572 return 0; ··· 657 641 al->sym = NULL; 658 642 } 659 643 660 - static void dso__calc_col_width(struct dso *self) 644 + static void dso__calc_col_width(struct dso *self, struct hists *hists) 661 645 { 662 646 if (!symbol_conf.col_width_list_str && !symbol_conf.field_sep && 663 647 (!symbol_conf.dso_list || 664 648 strlist__has_entry(symbol_conf.dso_list, self->name))) { 665 - u16 slen = self->short_name_len; 666 - if (verbose) 667 - slen = self->long_name_len; 668 - if 
(dsos__col_width < slen) 669 - dsos__col_width = slen; 649 + u16 slen = dso__name_len(self); 650 + hists__new_col_len(hists, HISTC_DSO, slen); 670 651 } 671 652 672 653 self->slen_calculated = 1; 673 654 } 674 655 675 656 int event__preprocess_sample(const event_t *self, struct perf_session *session, 676 - struct addr_location *al, symbol_filter_t filter) 657 + struct addr_location *al, struct sample_data *data, 658 + symbol_filter_t filter) 677 659 { 678 660 u8 cpumode = self->header.misc & PERF_RECORD_MISC_CPUMODE_MASK; 679 - struct thread *thread = perf_session__findnew(session, self->ip.pid); 661 + struct thread *thread; 680 662 663 + event__parse_sample(self, session->sample_type, data); 664 + 665 + dump_printf("(IP, %d): %d/%d: %#Lx period: %Ld cpu:%d\n", 666 + self->header.misc, data->pid, data->tid, data->ip, 667 + data->period, data->cpu); 668 + 669 + if (session->sample_type & PERF_SAMPLE_CALLCHAIN) { 670 + unsigned int i; 671 + 672 + dump_printf("... chain: nr:%Lu\n", data->callchain->nr); 673 + 674 + if (!ip_callchain__valid(data->callchain, self)) { 675 + pr_debug("call-chain problem with event, " 676 + "skipping it.\n"); 677 + goto out_filtered; 678 + } 679 + 680 + if (dump_trace) { 681 + for (i = 0; i < data->callchain->nr; i++) 682 + dump_printf("..... %2d: %016Lx\n", 683 + i, data->callchain->ips[i]); 684 + } 685 + } 686 + thread = perf_session__findnew(session, self->ip.pid); 681 687 if (thread == NULL) 682 688 return -1; 683 689 ··· 725 687 al->map ? al->map->dso->long_name : 726 688 al->level == 'H' ? "[hypervisor]" : "<not found>"); 727 689 al->sym = NULL; 690 + al->cpu = data->cpu; 728 691 729 692 if (al->map) { 730 693 if (symbol_conf.dso_list && ··· 742 703 * sampled. 
743 704 */ 744 705 if (!sort_dso.elide && !al->map->dso->slen_calculated) 745 - dso__calc_col_width(al->map->dso); 706 + dso__calc_col_width(al->map->dso, &session->hists); 746 707 747 708 al->sym = map__find_symbol(al->map, al->addr, filter); 748 709 } else { 749 710 const unsigned int unresolved_col_width = BITS_PER_LONG / 4; 750 711 751 - if (dsos__col_width < unresolved_col_width && 712 + if (hists__col_len(&session->hists, HISTC_DSO) < unresolved_col_width && 752 713 !symbol_conf.col_width_list_str && !symbol_conf.field_sep && 753 714 !symbol_conf.dso_list) 754 - dsos__col_width = unresolved_col_width; 715 + hists__set_col_len(&session->hists, HISTC_DSO, 716 + unresolved_col_width); 755 717 } 756 718 757 719 if (symbol_conf.sym_list && al->sym && ··· 766 726 return 0; 767 727 } 768 728 769 - int event__parse_sample(event_t *event, u64 type, struct sample_data *data) 729 + int event__parse_sample(const event_t *event, u64 type, struct sample_data *data) 770 730 { 771 - u64 *array = event->sample.array; 731 + const u64 *array = event->sample.array; 772 732 773 733 if (type & PERF_SAMPLE_IP) { 774 734 data->ip = event->ip.ip; ··· 807 767 u32 *p = (u32 *)array; 808 768 data->cpu = *p; 809 769 array++; 810 - } 770 + } else 771 + data->cpu = -1; 811 772 812 773 if (type & PERF_SAMPLE_PERIOD) { 813 774 data->period = *array;
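Annotation: `event__parse_sample()` above consumes a flat u64 array in the fixed order dictated by the `PERF_SAMPLE_*` bits, and this series also makes the cpu field default to -1 when `PERF_SAMPLE_CPU` is absent. A simplified Python model of that walk (the flag values match the perf ABI; the field subset is abridged, and the pid/tid split assumes little-endian packing with pid in the low 32 bits):

```python
PERF_SAMPLE_IP     = 1 << 0
PERF_SAMPLE_TID    = 1 << 1
PERF_SAMPLE_TIME   = 1 << 2
PERF_SAMPLE_CPU    = 1 << 7
PERF_SAMPLE_PERIOD = 1 << 8

def parse_sample(sample_type, array):
    """Walk the u64 array in ABI order, like event__parse_sample().
    Fields not selected by sample_type keep a sentinel value."""
    data, i = {"cpu": -1}, 0                  # cpu defaults to -1, as in the fix
    if sample_type & PERF_SAMPLE_IP:
        data["ip"] = array[i]; i += 1
    if sample_type & PERF_SAMPLE_TID:
        # pid and tid are packed as two u32s in one u64 (pid in the low half
        # on little-endian hosts)
        data["pid"] = array[i] & 0xffffffff
        data["tid"] = array[i] >> 32
        i += 1
    if sample_type & PERF_SAMPLE_TIME:
        data["time"] = array[i]; i += 1
    if sample_type & PERF_SAMPLE_CPU:
        data["cpu"] = array[i] & 0xffffffff; i += 1
    if sample_type & PERF_SAMPLE_PERIOD:
        data["period"] = array[i]; i += 1
    return data
```

The ordering matters: producer and consumer must agree on which bits are set, since absent fields simply do not appear in the array.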
+4 -2
tools/perf/util/event.h
··· 154 154 int event__process_lost(event_t *self, struct perf_session *session); 155 155 int event__process_mmap(event_t *self, struct perf_session *session); 156 156 int event__process_task(event_t *self, struct perf_session *session); 157 + int event__process(event_t *event, struct perf_session *session); 157 158 158 159 struct addr_location; 159 160 int event__preprocess_sample(const event_t *self, struct perf_session *session, 160 - struct addr_location *al, symbol_filter_t filter); 161 - int event__parse_sample(event_t *event, u64 type, struct sample_data *data); 161 + struct addr_location *al, struct sample_data *data, 162 + symbol_filter_t filter); 163 + int event__parse_sample(const event_t *event, u64 type, struct sample_data *data); 162 164 163 165 extern const char *event__name[]; 164 166
+10 -3
tools/perf/util/header.c
··· 16 16 #include "symbol.h" 17 17 #include "debug.h" 18 18 19 + static bool no_buildid_cache = false; 20 + 19 21 /* 20 22 * Create new perf.data header attribute: 21 23 */ ··· 387 385 int ret; 388 386 char debugdir[PATH_MAX]; 389 387 390 - snprintf(debugdir, sizeof(debugdir), "%s/%s", getenv("HOME"), 391 - DEBUG_CACHE_DIR); 388 + snprintf(debugdir, sizeof(debugdir), "%s", buildid_dir); 392 389 393 390 if (mkdir(debugdir, 0755) != 0 && errno != EEXIST) 394 391 return -1; ··· 472 471 } 473 472 buildid_sec->size = lseek(fd, 0, SEEK_CUR) - 474 473 buildid_sec->offset; 475 - perf_session__cache_build_ids(session); 474 + if (!no_buildid_cache) 475 + perf_session__cache_build_ids(session); 476 476 } 477 477 478 478 lseek(fd, sec_start, SEEK_SET); ··· 1191 1189 self->build_id.filename, 1192 1190 session); 1193 1191 return 0; 1192 + } 1193 + 1194 + void disable_buildid_cache(void) 1195 + { 1196 + no_buildid_cache = true; 1194 1197 }
+145 -69
tools/perf/util/hist.c
··· 5 5 #include "sort.h" 6 6 #include <math.h> 7 7 8 + enum hist_filter { 9 + HIST_FILTER__DSO, 10 + HIST_FILTER__THREAD, 11 + HIST_FILTER__PARENT, 12 + }; 13 + 8 14 struct callchain_param callchain_param = { 9 15 .mode = CHAIN_GRAPH_REL, 10 16 .min_percent = 0.5 11 17 }; 18 + 19 + u16 hists__col_len(struct hists *self, enum hist_column col) 20 + { 21 + return self->col_len[col]; 22 + } 23 + 24 + void hists__set_col_len(struct hists *self, enum hist_column col, u16 len) 25 + { 26 + self->col_len[col] = len; 27 + } 28 + 29 + bool hists__new_col_len(struct hists *self, enum hist_column col, u16 len) 30 + { 31 + if (len > hists__col_len(self, col)) { 32 + hists__set_col_len(self, col, len); 33 + return true; 34 + } 35 + return false; 36 + } 37 + 38 + static void hists__reset_col_len(struct hists *self) 39 + { 40 + enum hist_column col; 41 + 42 + for (col = 0; col < HISTC_NR_COLS; ++col) 43 + hists__set_col_len(self, col, 0); 44 + } 45 + 46 + static void hists__calc_col_len(struct hists *self, struct hist_entry *h) 47 + { 48 + u16 len; 49 + 50 + if (h->ms.sym) 51 + hists__new_col_len(self, HISTC_SYMBOL, h->ms.sym->namelen); 52 + 53 + len = thread__comm_len(h->thread); 54 + if (hists__new_col_len(self, HISTC_COMM, len)) 55 + hists__set_col_len(self, HISTC_THREAD, len + 6); 56 + 57 + if (h->ms.map) { 58 + len = dso__name_len(h->ms.map->dso); 59 + hists__new_col_len(self, HISTC_DSO, len); 60 + } 61 + } 12 62 13 63 static void hist_entry__add_cpumode_period(struct hist_entry *self, 14 64 unsigned int cpumode, u64 period) ··· 93 43 if (self != NULL) { 94 44 *self = *template; 95 45 self->nr_events = 1; 46 + if (self->ms.map) 47 + self->ms.map->referenced = true; 96 48 if (symbol_conf.use_callchain) 97 49 callchain_init(self->callchain); 98 50 } ··· 102 50 return self; 103 51 } 104 52 105 - static void hists__inc_nr_entries(struct hists *self, struct hist_entry *entry) 53 + static void hists__inc_nr_entries(struct hists *self, struct hist_entry *h) 106 54 { 107 - if 
(entry->ms.sym && self->max_sym_namelen < entry->ms.sym->namelen) 108 - self->max_sym_namelen = entry->ms.sym->namelen; 109 - ++self->nr_entries; 55 + if (!h->filtered) { 56 + hists__calc_col_len(self, h); 57 + ++self->nr_entries; 58 + } 59 + } 60 + 61 + static u8 symbol__parent_filter(const struct symbol *parent) 62 + { 63 + if (symbol_conf.exclude_other && parent == NULL) 64 + return 1 << HIST_FILTER__PARENT; 65 + return 0; 110 66 } 111 67 112 68 struct hist_entry *__hists__add_entry(struct hists *self, ··· 130 70 .map = al->map, 131 71 .sym = al->sym, 132 72 }, 73 + .cpu = al->cpu, 133 74 .ip = al->addr, 134 75 .level = al->level, 135 76 .period = period, 136 77 .parent = sym_parent, 78 + .filtered = symbol__parent_filter(sym_parent), 137 79 }; 138 80 int cmp; 139 81 ··· 253 191 tmp = RB_ROOT; 254 192 next = rb_first(&self->entries); 255 193 self->nr_entries = 0; 256 - self->max_sym_namelen = 0; 194 + hists__reset_col_len(self); 257 195 258 196 while (next) { 259 197 n = rb_entry(next, struct hist_entry, rb_node); ··· 310 248 next = rb_first(&self->entries); 311 249 312 250 self->nr_entries = 0; 313 - self->max_sym_namelen = 0; 251 + hists__reset_col_len(self); 314 252 315 253 while (next) { 316 254 n = rb_entry(next, struct hist_entry, rb_node); ··· 577 515 } 578 516 579 517 int hist_entry__snprintf(struct hist_entry *self, char *s, size_t size, 580 - struct hists *pair_hists, bool show_displacement, 581 - long displacement, bool color, u64 session_total) 518 + struct hists *hists, struct hists *pair_hists, 519 + bool show_displacement, long displacement, 520 + bool color, u64 session_total) 582 521 { 583 522 struct sort_entry *se; 584 523 u64 period, total, period_sys, period_us, period_guest_sys, period_guest_us; ··· 683 620 684 621 ret += snprintf(s + ret, size - ret, "%s", sep ?: " "); 685 622 ret += se->se_snprintf(self, s + ret, size - ret, 686 - se->se_width ? 
*se->se_width : 0); 623 + hists__col_len(hists, se->se_width_idx)); 687 624 } 688 625 689 626 return ret; 690 627 } 691 628 692 - int hist_entry__fprintf(struct hist_entry *self, struct hists *pair_hists, 693 - bool show_displacement, long displacement, FILE *fp, 694 - u64 session_total) 629 + int hist_entry__fprintf(struct hist_entry *self, struct hists *hists, 630 + struct hists *pair_hists, bool show_displacement, 631 + long displacement, FILE *fp, u64 session_total) 695 632 { 696 633 char bf[512]; 697 - int ret; 698 - 699 - ret = hist_entry__snprintf(self, bf, sizeof(bf), pair_hists, 700 - show_displacement, displacement, 701 - true, session_total); 702 - if (!ret) 703 - return 0; 704 - 634 + hist_entry__snprintf(self, bf, sizeof(bf), hists, pair_hists, 635 + show_displacement, displacement, 636 + true, session_total); 705 637 return fprintf(fp, "%s\n", bf); 706 638 } 707 639 708 - static size_t hist_entry__fprintf_callchain(struct hist_entry *self, FILE *fp, 640 + static size_t hist_entry__fprintf_callchain(struct hist_entry *self, 641 + struct hists *hists, FILE *fp, 709 642 u64 session_total) 710 643 { 711 644 int left_margin = 0; ··· 709 650 if (sort__first_dimension == SORT_COMM) { 710 651 struct sort_entry *se = list_first_entry(&hist_entry__sort_list, 711 652 typeof(*se), list); 712 - left_margin = se->se_width ? 
*se->se_width : 0; 653 + left_margin = hists__col_len(hists, se->se_width_idx); 713 654 left_margin -= thread__comm_len(self->thread); 714 655 } 715 656 ··· 780 721 continue; 781 722 } 782 723 width = strlen(se->se_header); 783 - if (se->se_width) { 784 - if (symbol_conf.col_width_list_str) { 785 - if (col_width) { 786 - *se->se_width = atoi(col_width); 787 - col_width = strchr(col_width, ','); 788 - if (col_width) 789 - ++col_width; 790 - } 724 + if (symbol_conf.col_width_list_str) { 725 + if (col_width) { 726 + hists__set_col_len(self, se->se_width_idx, 727 + atoi(col_width)); 728 + col_width = strchr(col_width, ','); 729 + if (col_width) 730 + ++col_width; 791 731 } 792 - width = *se->se_width = max(*se->se_width, width); 793 732 } 733 + if (!hists__new_col_len(self, se->se_width_idx, width)) 734 + width = hists__col_len(self, se->se_width_idx); 794 735 fprintf(fp, " %*s", width, se->se_header); 795 736 } 796 737 fprintf(fp, "\n"); ··· 813 754 continue; 814 755 815 756 fprintf(fp, " "); 816 - if (se->se_width) 817 - width = *se->se_width; 818 - else 757 + width = hists__col_len(self, se->se_width_idx); 758 + if (width == 0) 819 759 width = strlen(se->se_header); 820 760 for (i = 0; i < width; i++) 821 761 fprintf(fp, "."); ··· 825 767 print_entries: 826 768 for (nd = rb_first(&self->entries); nd; nd = rb_next(nd)) { 827 769 struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node); 828 - int cnt; 829 770 830 771 if (show_displacement) { 831 772 if (h->pair != NULL) ··· 834 777 displacement = 0; 835 778 ++position; 836 779 } 837 - cnt = hist_entry__fprintf(h, pair, show_displacement, 838 - displacement, fp, self->stats.total_period); 839 - /* Ignore those that didn't match the parent filter */ 840 - if (!cnt) 841 - continue; 842 - 843 - ret += cnt; 780 + ret += hist_entry__fprintf(h, self, pair, show_displacement, 781 + displacement, fp, self->stats.total_period); 844 782 845 783 if (symbol_conf.use_callchain) 846 - ret += hist_entry__fprintf_callchain(h, 
fp, self->stats.total_period); 847 - 784 + ret += hist_entry__fprintf_callchain(h, self, fp, 785 + self->stats.total_period); 848 786 if (h->ms.map == NULL && verbose > 1) { 849 787 __map_groups__fprintf_maps(&h->thread->mg, 850 788 MAP__FUNCTION, verbose, fp); ··· 852 800 return ret; 853 801 } 854 802 855 - enum hist_filter { 856 - HIST_FILTER__DSO, 857 - HIST_FILTER__THREAD, 858 - }; 803 + /* 804 + * See hists__fprintf to match the column widths 805 + */ 806 + unsigned int hists__sort_list_width(struct hists *self) 807 + { 808 + struct sort_entry *se; 809 + int ret = 9; /* total % */ 810 + 811 + if (symbol_conf.show_cpu_utilization) { 812 + ret += 7; /* count_sys % */ 813 + ret += 6; /* count_us % */ 814 + if (perf_guest) { 815 + ret += 13; /* count_guest_sys % */ 816 + ret += 12; /* count_guest_us % */ 817 + } 818 + } 819 + 820 + if (symbol_conf.show_nr_samples) 821 + ret += 11; 822 + 823 + list_for_each_entry(se, &hist_entry__sort_list, list) 824 + if (!se->elide) 825 + ret += 2 + hists__col_len(self, se->se_width_idx); 826 + 827 + return ret; 828 + } 829 + 830 + static void hists__remove_entry_filter(struct hists *self, struct hist_entry *h, 831 + enum hist_filter filter) 832 + { 833 + h->filtered &= ~(1 << filter); 834 + if (h->filtered) 835 + return; 836 + 837 + ++self->nr_entries; 838 + if (h->ms.unfolded) 839 + self->nr_entries += h->nr_rows; 840 + h->row_offset = 0; 841 + self->stats.total_period += h->period; 842 + self->stats.nr_events[PERF_RECORD_SAMPLE] += h->nr_events; 843 + 844 + hists__calc_col_len(self, h); 845 + } 859 846 860 847 void hists__filter_by_dso(struct hists *self, const struct dso *dso) 861 848 { ··· 902 811 903 812 self->nr_entries = self->stats.total_period = 0; 904 813 self->stats.nr_events[PERF_RECORD_SAMPLE] = 0; 905 - self->max_sym_namelen = 0; 814 + hists__reset_col_len(self); 906 815 907 816 for (nd = rb_first(&self->entries); nd; nd = rb_next(nd)) { 908 817 struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node); ··· 
915 824 continue; 916 825 } 917 826 918 - h->filtered &= ~(1 << HIST_FILTER__DSO); 919 - if (!h->filtered) { 920 - ++self->nr_entries; 921 - self->stats.total_period += h->period; 922 - self->stats.nr_events[PERF_RECORD_SAMPLE] += h->nr_events; 923 - if (h->ms.sym && 924 - self->max_sym_namelen < h->ms.sym->namelen) 925 - self->max_sym_namelen = h->ms.sym->namelen; 926 - } 827 + hists__remove_entry_filter(self, h, HIST_FILTER__DSO); 927 828 } 928 829 } 929 830 ··· 925 842 926 843 self->nr_entries = self->stats.total_period = 0; 927 844 self->stats.nr_events[PERF_RECORD_SAMPLE] = 0; 928 - self->max_sym_namelen = 0; 845 + hists__reset_col_len(self); 929 846 930 847 for (nd = rb_first(&self->entries); nd; nd = rb_next(nd)) { 931 848 struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node); ··· 934 851 h->filtered |= (1 << HIST_FILTER__THREAD); 935 852 continue; 936 853 } 937 - h->filtered &= ~(1 << HIST_FILTER__THREAD); 938 - if (!h->filtered) { 939 - ++self->nr_entries; 940 - self->stats.total_period += h->period; 941 - self->stats.nr_events[PERF_RECORD_SAMPLE] += h->nr_events; 942 - if (h->ms.sym && 943 - self->max_sym_namelen < h->ms.sym->namelen) 944 - self->max_sym_namelen = h->ms.sym->namelen; 945 - } 854 + 855 + hists__remove_entry_filter(self, h, HIST_FILTER__THREAD); 946 856 } 947 857 } 948 858 ··· 1128 1052 dso, dso->long_name, sym, sym->name); 1129 1053 1130 1054 snprintf(command, sizeof(command), 1131 - "objdump --start-address=0x%016Lx --stop-address=0x%016Lx -dS %s|grep -v %s|expand", 1055 + "objdump --start-address=0x%016Lx --stop-address=0x%016Lx -dS -C %s|grep -v %s|expand", 1132 1056 map__rip_2objdump(map, sym->start), 1133 1057 map__rip_2objdump(map, sym->end), 1134 1058 filename, filename);
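The hist.c changes above replace per-sort-entry width pointers (`*se->se_width`) with a per-`hists` array indexed by column, where `hists__new_col_len` grows the stored width and reports whether it grew. A minimal sketch of that grow-or-report pattern, with simplified stand-in types (these are not perf's real structs, which live in tools/perf/util/hist.h):

```c
#include <stdbool.h>

/* Hypothetical miniature of the column-width bookkeeping in the diff. */
enum hist_column { HISTC_SYMBOL, HISTC_DSO, HISTC_THREAD, HISTC_NR_COLS };

struct hists_widths {
	unsigned short col_len[HISTC_NR_COLS];
};

unsigned short col_len_get(struct hists_widths *self, enum hist_column col)
{
	return self->col_len[col];
}

void col_len_set(struct hists_widths *self, enum hist_column col,
		 unsigned short len)
{
	self->col_len[col] = len;
}

/* Grow the stored width only if the candidate is wider; return true when
 * it grew, mirroring how hists__new_col_len is used when printing headers. */
bool col_len_grow(struct hists_widths *self, enum hist_column col,
		  unsigned short len)
{
	if (len > self->col_len[col]) {
		self->col_len[col] = len;
		return true;
	}
	return false;
}
```

Centralizing the widths on `struct hists` is what lets the filtering code recompute column lengths per-hists instead of tracking only `max_sym_namelen`.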
+24 -6
tools/perf/util/hist.h
··· 56 56 u32 nr_unknown_events; 57 57 }; 58 58 59 + enum hist_column { 60 + HISTC_SYMBOL, 61 + HISTC_DSO, 62 + HISTC_THREAD, 63 + HISTC_COMM, 64 + HISTC_PARENT, 65 + HISTC_CPU, 66 + HISTC_NR_COLS, /* Last entry */ 67 + }; 68 + 59 69 struct hists { 60 70 struct rb_node rb_node; 61 71 struct rb_root entries; ··· 74 64 u64 config; 75 65 u64 event_stream; 76 66 u32 type; 77 - u32 max_sym_namelen; 67 + u16 col_len[HISTC_NR_COLS]; 78 68 }; 79 69 80 70 struct hist_entry *__hists__add_entry(struct hists *self, ··· 82 72 struct symbol *parent, u64 period); 83 73 extern int64_t hist_entry__cmp(struct hist_entry *, struct hist_entry *); 84 74 extern int64_t hist_entry__collapse(struct hist_entry *, struct hist_entry *); 85 - int hist_entry__fprintf(struct hist_entry *self, struct hists *pair_hists, 86 - bool show_displacement, long displacement, FILE *fp, 87 - u64 total); 75 + int hist_entry__fprintf(struct hist_entry *self, struct hists *hists, 76 + struct hists *pair_hists, bool show_displacement, 77 + long displacement, FILE *fp, u64 total); 88 78 int hist_entry__snprintf(struct hist_entry *self, char *bf, size_t size, 89 - struct hists *pair_hists, bool show_displacement, 90 - long displacement, bool color, u64 total); 79 + struct hists *hists, struct hists *pair_hists, 80 + bool show_displacement, long displacement, 81 + bool color, u64 total); 91 82 void hist_entry__free(struct hist_entry *); 92 83 93 84 void hists__output_resort(struct hists *self); ··· 105 94 106 95 void hists__filter_by_dso(struct hists *self, const struct dso *dso); 107 96 void hists__filter_by_thread(struct hists *self, const struct thread *thread); 97 + 98 + u16 hists__col_len(struct hists *self, enum hist_column col); 99 + void hists__set_col_len(struct hists *self, enum hist_column col, u16 len); 100 + bool hists__new_col_len(struct hists *self, enum hist_column col, u16 len); 108 101 109 102 #ifdef NO_NEWT_SUPPORT 110 103 static inline int hists__browse(struct hists *self __used, ··· 141 126 
142 127 int hists__tui_browse_tree(struct rb_root *self, const char *help); 143 128 #endif 129 + 130 + unsigned int hists__sort_list_width(struct hists *self); 131 + 144 132 #endif /* __PERF_HIST_H */
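The filtering rework in the hist.c/hist.h diffs above hinges on one bitmask per entry: each filter kind (`HIST_FILTER__DSO`, `HIST_FILTER__THREAD`) owns a bit, and an entry is visible only when no bit is set, so `hists__remove_entry_filter` re-adds an entry to the totals only once every filter passes. A self-contained sketch of that idiom, with hypothetical names:

```c
#include <stdbool.h>

/* One bit per filter kind, as in enum hist_filter in the diff. */
enum filt { FILT_DSO, FILT_THREAD, FILT_NR };

struct entry {
	unsigned int filtered;	/* bitmask of enum filt bits */
};

/* Mark the entry as rejected by one filter. */
void entry_filter(struct entry *e, enum filt f)
{
	e->filtered |= 1u << f;
}

/* Clear one filter bit; return true if the entry became visible again,
 * i.e. the point where hists__remove_entry_filter would bump nr_entries
 * and total_period. */
bool entry_unfilter(struct entry *e, enum filt f)
{
	e->filtered &= ~(1u << f);
	return e->filtered == 0;
}
```

The early `if (h->filtered) return;` in the real helper is exactly this `filtered == 0` test: clearing the DSO bit must not resurrect an entry the thread filter still rejects.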
+85 -31
tools/perf/util/map.c
··· 17 17 return strcmp(filename, "//anon") == 0; 18 18 } 19 19 20 - static int strcommon(const char *pathname, char *cwd, int cwdlen) 21 - { 22 - int n = 0; 23 - 24 - while (n < cwdlen && pathname[n] == cwd[n]) 25 - ++n; 26 - 27 - return n; 28 - } 29 - 30 20 void map__init(struct map *self, enum map_type type, 31 21 u64 start, u64 end, u64 pgoff, struct dso *dso) 32 22 { ··· 29 39 self->unmap_ip = map__unmap_ip; 30 40 RB_CLEAR_NODE(&self->rb_node); 31 41 self->groups = NULL; 42 + self->referenced = false; 32 43 } 33 44 34 45 struct map *map__new(struct list_head *dsos__list, u64 start, u64 len, 35 46 u64 pgoff, u32 pid, char *filename, 36 - enum map_type type, char *cwd, int cwdlen) 47 + enum map_type type) 37 48 { 38 49 struct map *self = malloc(sizeof(*self)); 39 50 ··· 42 51 char newfilename[PATH_MAX]; 43 52 struct dso *dso; 44 53 int anon; 45 - 46 - if (cwd) { 47 - int n = strcommon(filename, cwd, cwdlen); 48 - 49 - if (n == cwdlen) { 50 - snprintf(newfilename, sizeof(newfilename), 51 - ".%s", filename + n); 52 - filename = newfilename; 53 - } 54 - } 55 54 56 55 anon = is_anon_memory(filename); 57 56 ··· 229 248 self->machine = NULL; 230 249 } 231 250 251 + static void maps__delete(struct rb_root *self) 252 + { 253 + struct rb_node *next = rb_first(self); 254 + 255 + while (next) { 256 + struct map *pos = rb_entry(next, struct map, rb_node); 257 + 258 + next = rb_next(&pos->rb_node); 259 + rb_erase(&pos->rb_node, self); 260 + map__delete(pos); 261 + } 262 + } 263 + 264 + static void maps__delete_removed(struct list_head *self) 265 + { 266 + struct map *pos, *n; 267 + 268 + list_for_each_entry_safe(pos, n, self, node) { 269 + list_del(&pos->node); 270 + map__delete(pos); 271 + } 272 + } 273 + 274 + void map_groups__exit(struct map_groups *self) 275 + { 276 + int i; 277 + 278 + for (i = 0; i < MAP__NR_TYPES; ++i) { 279 + maps__delete(&self->maps[i]); 280 + maps__delete_removed(&self->removed_maps[i]); 281 + } 282 + } 283 + 232 284 void map_groups__flush(struct 
map_groups *self) 233 285 { 234 286 int type; ··· 388 374 { 389 375 struct rb_root *root = &self->maps[map->type]; 390 376 struct rb_node *next = rb_first(root); 377 + int err = 0; 391 378 392 379 while (next) { 393 380 struct map *pos = rb_entry(next, struct map, rb_node); ··· 405 390 406 391 rb_erase(&pos->rb_node, root); 407 392 /* 408 - * We may have references to this map, for instance in some 409 - * hist_entry instances, so just move them to a separate 410 - * list. 411 - */ 412 - list_add_tail(&pos->node, &self->removed_maps[map->type]); 413 - /* 414 393 * Now check if we need to create new maps for areas not 415 394 * overlapped by the new map: 416 395 */ 417 396 if (map->start > pos->start) { 418 397 struct map *before = map__clone(pos); 419 398 420 - if (before == NULL) 421 - return -ENOMEM; 399 + if (before == NULL) { 400 + err = -ENOMEM; 401 + goto move_map; 402 + } 422 403 423 404 before->end = map->start - 1; 424 405 map_groups__insert(self, before); ··· 425 414 if (map->end < pos->end) { 426 415 struct map *after = map__clone(pos); 427 416 428 - if (after == NULL) 429 - return -ENOMEM; 417 + if (after == NULL) { 418 + err = -ENOMEM; 419 + goto move_map; 420 + } 430 421 431 422 after->start = map->end + 1; 432 423 map_groups__insert(self, after); 433 424 if (verbose >= 2) 434 425 map__fprintf(after, fp); 435 426 } 427 + move_map: 428 + /* 429 + * If we have references, just move them to a separate list. 
430 + */ 431 + if (pos->referenced) 432 + list_add_tail(&pos->node, &self->removed_maps[map->type]); 433 + else 434 + map__delete(pos); 435 + 436 + if (err) 437 + return err; 436 438 } 437 439 438 440 return 0; ··· 517 493 rb_insert_color(&map->rb_node, maps); 518 494 } 519 495 496 + void maps__remove(struct rb_root *self, struct map *map) 497 + { 498 + rb_erase(&map->rb_node, self); 499 + } 500 + 520 501 struct map *maps__find(struct rb_root *maps, u64 ip) 521 502 { 522 503 struct rb_node **p = &maps->rb_node; ··· 553 524 self->pid = pid; 554 525 self->root_dir = strdup(root_dir); 555 526 return self->root_dir == NULL ? -ENOMEM : 0; 527 + } 528 + 529 + static void dsos__delete(struct list_head *self) 530 + { 531 + struct dso *pos, *n; 532 + 533 + list_for_each_entry_safe(pos, n, self, node) { 534 + list_del(&pos->node); 535 + dso__delete(pos); 536 + } 537 + } 538 + 539 + void machine__exit(struct machine *self) 540 + { 541 + map_groups__exit(&self->kmaps); 542 + dsos__delete(&self->user_dsos); 543 + dsos__delete(&self->kernel_dsos); 544 + free(self->root_dir); 545 + self->root_dir = NULL; 546 + } 547 + 548 + void machine__delete(struct machine *self) 549 + { 550 + machine__exit(self); 551 + free(self); 556 552 } 557 553 558 554 struct machine *machines__add(struct rb_root *self, pid_t pid,
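In `map_groups__fixup_overlappings` above, a new mapping that partially covers an existing one leaves up to two surviving fragments, cloned as `before` and `after`. The interval arithmetic can be sketched with plain integers (hypothetical types; the real code clones `struct map` nodes, and note the inclusive ends implied by `before->end = map->start - 1` and `after->start = map->end + 1`):

```c
#include <stdint.h>

struct frag { uint64_t start, end; int valid; };
struct split { struct frag before, after; };

/* Given an existing inclusive range [old_start, old_end] overlapped by a
 * new range [new_start, new_end], return the fragments of the old range
 * that the new one does not cover. */
struct split split_overlap(uint64_t old_start, uint64_t old_end,
			   uint64_t new_start, uint64_t new_end)
{
	struct split s = { {0, 0, 0}, {0, 0, 0} };

	if (new_start > old_start) {		/* head survives */
		s.before.start = old_start;
		s.before.end = new_start - 1;
		s.before.valid = 1;
	}
	if (new_end < old_end) {		/* tail survives */
		s.after.start = new_end + 1;
		s.after.end = old_end;
		s.after.valid = 1;
	}
	return s;
}
```

The other half of the fixup is lifetime management: the displaced map is freed immediately unless its new `referenced` flag is set, in which case it is parked on `removed_maps` because hist entries may still point at it.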
+12 -2
tools/perf/util/map.h
··· 29 29 }; 30 30 u64 start; 31 31 u64 end; 32 - enum map_type type; 32 + u8 /* enum map_type */ type; 33 + bool referenced; 33 34 u32 priv; 34 35 u64 pgoff; 35 36 ··· 107 106 u64 start, u64 end, u64 pgoff, struct dso *dso); 108 107 struct map *map__new(struct list_head *dsos__list, u64 start, u64 len, 109 108 u64 pgoff, u32 pid, char *filename, 110 - enum map_type type, char *cwd, int cwdlen); 109 + enum map_type type); 111 110 void map__delete(struct map *self); 112 111 struct map *map__clone(struct map *self); 113 112 int map__overlap(struct map *l, struct map *r); ··· 126 125 size_t __map_groups__fprintf_maps(struct map_groups *self, 127 126 enum map_type type, int verbose, FILE *fp); 128 127 void maps__insert(struct rb_root *maps, struct map *map); 128 + void maps__remove(struct rb_root *self, struct map *map); 129 129 struct map *maps__find(struct rb_root *maps, u64 addr); 130 130 void map_groups__init(struct map_groups *self); 131 + void map_groups__exit(struct map_groups *self); 131 132 int map_groups__clone(struct map_groups *self, 132 133 struct map_groups *parent, enum map_type type); 133 134 size_t map_groups__fprintf(struct map_groups *self, int verbose, FILE *fp); ··· 145 142 struct machine *machines__findnew(struct rb_root *self, pid_t pid); 146 143 char *machine__mmap_name(struct machine *self, char *bf, size_t size); 147 144 int machine__init(struct machine *self, const char *root_dir, pid_t pid); 145 + void machine__exit(struct machine *self); 146 + void machine__delete(struct machine *self); 148 147 149 148 /* 150 149 * Default guest kernel is defined by parameter --guestkallsyms ··· 166 161 { 167 162 maps__insert(&self->maps[map->type], map); 168 163 map->groups = self; 164 + } 165 + 166 + static inline void map_groups__remove(struct map_groups *self, struct map *map) 167 + { 168 + maps__remove(&self->maps[map->type], map); 169 169 } 170 170 171 171 static inline struct map *map_groups__find(struct map_groups *self,
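`maps__find`, declared in the map.h diff above, walks an rbtree comparing an address against each map's `[start, end]`: go left when the address is below `start`, right when above `end`, otherwise it is a hit. The same comparison over a sorted array of disjoint ranges, as a sketch (hypothetical `struct range`; the real version descends `struct map` rb_nodes):

```c
#include <stddef.h>
#include <stdint.h>

struct range { uint64_t start, end; };	/* inclusive ends, non-overlapping */

/* Binary search using the maps__find comparison: below start -> left half,
 * above end -> right half, otherwise found. Returns NULL on a miss. */
const struct range *range_find(const struct range *r, size_t n, uint64_t ip)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ip < r[mid].start)
			hi = mid;
		else if (ip > r[mid].end)
			lo = mid + 1;
		else
			return &r[mid];
	}
	return NULL;
}
```

The narrowing of `type` to `u8` and the new `bool referenced` in `struct map` pack into what was padding next to the enum, so the reference tracking costs no extra per-map memory.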
+779 -389
tools/perf/util/newt.c
··· 11 11 #define HAVE_LONG_LONG __GLIBC_HAVE_LONG_LONG 12 12 #endif 13 13 #include <slang.h> 14 + #include <signal.h> 14 15 #include <stdlib.h> 15 16 #include <newt.h> 16 17 #include <sys/ttydefaults.h> ··· 279 278 void *first_visible_entry, *entries; 280 279 u16 top, left, width, height; 281 280 void *priv; 281 + unsigned int (*refresh_entries)(struct ui_browser *self); 282 + void (*seek)(struct ui_browser *self, 283 + off_t offset, int whence); 282 284 u32 nr_entries; 283 285 }; 286 + 287 + static void ui_browser__list_head_seek(struct ui_browser *self, 288 + off_t offset, int whence) 289 + { 290 + struct list_head *head = self->entries; 291 + struct list_head *pos; 292 + 293 + switch (whence) { 294 + case SEEK_SET: 295 + pos = head->next; 296 + break; 297 + case SEEK_CUR: 298 + pos = self->first_visible_entry; 299 + break; 300 + case SEEK_END: 301 + pos = head->prev; 302 + break; 303 + default: 304 + return; 305 + } 306 + 307 + if (offset > 0) { 308 + while (offset-- != 0) 309 + pos = pos->next; 310 + } else { 311 + while (offset++ != 0) 312 + pos = pos->prev; 313 + } 314 + 315 + self->first_visible_entry = pos; 316 + } 317 + 318 + static bool ui_browser__is_current_entry(struct ui_browser *self, unsigned row) 319 + { 320 + return (self->first_visible_entry_idx + row) == self->index; 321 + } 284 322 285 323 static void ui_browser__refresh_dimensions(struct ui_browser *self) 286 324 { ··· 337 297 338 298 static void ui_browser__reset_index(struct ui_browser *self) 339 299 { 340 - self->index = self->first_visible_entry_idx = 0; 341 - self->first_visible_entry = NULL; 300 + self->index = self->first_visible_entry_idx = 0; 301 + self->seek(self, 0, SEEK_SET); 302 + } 303 + 304 + static int ui_browser__show(struct ui_browser *self, const char *title) 305 + { 306 + if (self->form != NULL) { 307 + newtFormDestroy(self->form); 308 + newtPopWindow(); 309 + } 310 + ui_browser__refresh_dimensions(self); 311 + newtCenteredWindow(self->width, self->height, title); 312 + 
self->form = newt_form__new(); 313 + if (self->form == NULL) 314 + return -1; 315 + 316 + self->sb = newtVerticalScrollbar(self->width, 0, self->height, 317 + HE_COLORSET_NORMAL, 318 + HE_COLORSET_SELECTED); 319 + if (self->sb == NULL) 320 + return -1; 321 + 322 + newtFormAddHotKey(self->form, NEWT_KEY_UP); 323 + newtFormAddHotKey(self->form, NEWT_KEY_DOWN); 324 + newtFormAddHotKey(self->form, NEWT_KEY_PGUP); 325 + newtFormAddHotKey(self->form, NEWT_KEY_PGDN); 326 + newtFormAddHotKey(self->form, NEWT_KEY_HOME); 327 + newtFormAddHotKey(self->form, NEWT_KEY_END); 328 + newtFormAddComponent(self->form, self->sb); 329 + return 0; 342 330 } 343 331 344 332 static int objdump_line__show(struct objdump_line *self, struct list_head *head, ··· 420 352 421 353 static int ui_browser__refresh_entries(struct ui_browser *self) 422 354 { 423 - struct objdump_line *pos; 424 - struct list_head *head = self->entries; 425 - struct hist_entry *he = self->priv; 426 - int row = 0; 427 - int len = he->ms.sym->end - he->ms.sym->start; 355 + int row; 428 356 429 - if (self->first_visible_entry == NULL || self->first_visible_entry == self->entries) 430 - self->first_visible_entry = head->next; 431 - 432 - pos = list_entry(self->first_visible_entry, struct objdump_line, node); 433 - 434 - list_for_each_entry_from(pos, head, node) { 435 - bool current_entry = (self->first_visible_entry_idx + row) == self->index; 436 - SLsmg_gotorc(self->top + row, self->left); 437 - objdump_line__show(pos, head, self->width, 438 - he, len, current_entry); 439 - if (++row == self->height) 440 - break; 441 - } 442 - 357 + newtScrollbarSet(self->sb, self->index, self->nr_entries - 1); 358 + row = self->refresh_entries(self); 443 359 SLsmg_set_color(HE_COLORSET_NORMAL); 444 360 SLsmg_fill_region(self->top + row, self->left, 445 361 self->height - row, self->width, ' '); ··· 431 379 return 0; 432 380 } 433 381 434 - static int ui_browser__run(struct ui_browser *self, const char *title, 435 - struct newtExitStruct 
*es) 382 + static int ui_browser__run(struct ui_browser *self, struct newtExitStruct *es) 436 383 { 437 - if (self->form) { 438 - newtFormDestroy(self->form); 439 - newtPopWindow(); 440 - } 441 - 442 - ui_browser__refresh_dimensions(self); 443 - newtCenteredWindow(self->width + 2, self->height, title); 444 - self->form = newt_form__new(); 445 - if (self->form == NULL) 446 - return -1; 447 - 448 - self->sb = newtVerticalScrollbar(self->width + 1, 0, self->height, 449 - HE_COLORSET_NORMAL, 450 - HE_COLORSET_SELECTED); 451 - if (self->sb == NULL) 452 - return -1; 453 - 454 - newtFormAddHotKey(self->form, NEWT_KEY_UP); 455 - newtFormAddHotKey(self->form, NEWT_KEY_DOWN); 456 - newtFormAddHotKey(self->form, NEWT_KEY_PGUP); 457 - newtFormAddHotKey(self->form, NEWT_KEY_PGDN); 458 - newtFormAddHotKey(self->form, ' '); 459 - newtFormAddHotKey(self->form, NEWT_KEY_HOME); 460 - newtFormAddHotKey(self->form, NEWT_KEY_END); 461 - newtFormAddHotKey(self->form, NEWT_KEY_TAB); 462 - newtFormAddHotKey(self->form, NEWT_KEY_RIGHT); 463 - 464 384 if (ui_browser__refresh_entries(self) < 0) 465 385 return -1; 466 - newtFormAddComponent(self->form, self->sb); 467 386 468 387 while (1) { 469 - unsigned int offset; 388 + off_t offset; 470 389 471 390 newtFormRun(self->form, es); 472 391 ··· 451 428 break; 452 429 ++self->index; 453 430 if (self->index == self->first_visible_entry_idx + self->height) { 454 - struct list_head *pos = self->first_visible_entry; 455 431 ++self->first_visible_entry_idx; 456 - self->first_visible_entry = pos->next; 432 + self->seek(self, +1, SEEK_CUR); 457 433 } 458 434 break; 459 435 case NEWT_KEY_UP: ··· 460 438 break; 461 439 --self->index; 462 440 if (self->index < self->first_visible_entry_idx) { 463 - struct list_head *pos = self->first_visible_entry; 464 441 --self->first_visible_entry_idx; 465 - self->first_visible_entry = pos->prev; 442 + self->seek(self, -1, SEEK_CUR); 466 443 } 467 444 break; 468 445 case NEWT_KEY_PGDN: ··· 474 453 offset = 
self->nr_entries - 1 - self->index; 475 454 self->index += offset; 476 455 self->first_visible_entry_idx += offset; 477 - 478 - while (offset--) { 479 - struct list_head *pos = self->first_visible_entry; 480 - self->first_visible_entry = pos->next; 481 - } 482 - 456 + self->seek(self, +offset, SEEK_CUR); 483 457 break; 484 458 case NEWT_KEY_PGUP: 485 459 if (self->first_visible_entry_idx == 0) ··· 487 471 488 472 self->index -= offset; 489 473 self->first_visible_entry_idx -= offset; 490 - 491 - while (offset--) { 492 - struct list_head *pos = self->first_visible_entry; 493 - self->first_visible_entry = pos->prev; 494 - } 474 + self->seek(self, -offset, SEEK_CUR); 495 475 break; 496 476 case NEWT_KEY_HOME: 497 477 ui_browser__reset_index(self); 498 478 break; 499 - case NEWT_KEY_END: { 500 - struct list_head *head = self->entries; 479 + case NEWT_KEY_END: 501 480 offset = self->height - 1; 481 + if (offset >= self->nr_entries) 482 + offset = self->nr_entries - 1; 502 483 503 - if (offset > self->nr_entries) 504 - offset = self->nr_entries; 505 - 506 - self->index = self->first_visible_entry_idx = self->nr_entries - 1 - offset; 507 - self->first_visible_entry = head->prev; 508 - while (offset-- != 0) { 509 - struct list_head *pos = self->first_visible_entry; 510 - self->first_visible_entry = pos->prev; 511 - } 512 - } 484 + self->index = self->nr_entries - 1; 485 + self->first_visible_entry_idx = self->index - offset; 486 + self->seek(self, -offset, SEEK_END); 513 487 break; 514 - case NEWT_KEY_RIGHT: 515 - case NEWT_KEY_LEFT: 516 - case NEWT_KEY_TAB: 517 - return es->u.key; 518 488 default: 519 - continue; 489 + return es->u.key; 520 490 } 521 491 if (ui_browser__refresh_entries(self) < 0) 522 492 return -1; 523 493 } 524 494 return 0; 525 - } 526 - 527 - /* 528 - * When debugging newt problems it was useful to be able to "unroll" 529 - * the calls to newtCheckBoxTreeAdd{Array,Item}, so that we can generate 530 - * a source file with the sequence of calls to these 
methods, to then 531 - * tweak the arrays to get the intended results, so I'm keeping this code 532 - * here, may be useful again in the future. 533 - */ 534 - #undef NEWT_DEBUG 535 - 536 - static void newt_checkbox_tree__add(newtComponent tree, const char *str, 537 - void *priv, int *indexes) 538 - { 539 - #ifdef NEWT_DEBUG 540 - /* Print the newtCheckboxTreeAddArray to tinker with its index arrays */ 541 - int i = 0, len = 40 - strlen(str); 542 - 543 - fprintf(stderr, 544 - "\tnewtCheckboxTreeAddItem(tree, %*.*s\"%s\", (void *)%p, 0, ", 545 - len, len, " ", str, priv); 546 - while (indexes[i] != NEWT_ARG_LAST) { 547 - if (indexes[i] != NEWT_ARG_APPEND) 548 - fprintf(stderr, " %d,", indexes[i]); 549 - else 550 - fprintf(stderr, " %s,", "NEWT_ARG_APPEND"); 551 - ++i; 552 - } 553 - fprintf(stderr, " %s", " NEWT_ARG_LAST);\n"); 554 - fflush(stderr); 555 - #endif 556 - newtCheckboxTreeAddArray(tree, str, priv, 0, indexes); 557 495 } 558 496 559 497 static char *callchain_list__sym_name(struct callchain_list *self, ··· 520 550 return bf; 521 551 } 522 552 523 - static void __callchain__append_graph_browser(struct callchain_node *self, 524 - newtComponent tree, u64 total, 525 - int *indexes, int depth) 553 + static unsigned int hist_entry__annotate_browser_refresh(struct ui_browser *self) 526 554 { 527 - struct rb_node *node; 528 - u64 new_total, remaining; 529 - int idx = 0; 555 + struct objdump_line *pos; 556 + struct list_head *head = self->entries; 557 + struct hist_entry *he = self->priv; 558 + int row = 0; 559 + int len = he->ms.sym->end - he->ms.sym->start; 530 560 531 - if (callchain_param.mode == CHAIN_GRAPH_REL) 532 - new_total = self->children_hit; 533 - else 534 - new_total = total; 561 + if (self->first_visible_entry == NULL || self->first_visible_entry == self->entries) 562 + self->first_visible_entry = head->next; 535 563 536 - remaining = new_total; 537 - node = rb_first(&self->rb_root); 538 - while (node) { 539 - struct callchain_node *child = 
rb_entry(node, struct callchain_node, rb_node); 540 - struct rb_node *next = rb_next(node); 541 - u64 cumul = cumul_hits(child); 542 - struct callchain_list *chain; 543 - int first = true, printed = 0; 544 - int chain_idx = -1; 545 - remaining -= cumul; 564 + pos = list_entry(self->first_visible_entry, struct objdump_line, node); 546 565 547 - indexes[depth] = NEWT_ARG_APPEND; 548 - indexes[depth + 1] = NEWT_ARG_LAST; 549 - 550 - list_for_each_entry(chain, &child->val, list) { 551 - char ipstr[BITS_PER_LONG / 4 + 1], 552 - *alloc_str = NULL; 553 - const char *str = callchain_list__sym_name(chain, ipstr, sizeof(ipstr)); 554 - 555 - if (first) { 556 - double percent = cumul * 100.0 / new_total; 557 - 558 - first = false; 559 - if (asprintf(&alloc_str, "%2.2f%% %s", percent, str) < 0) 560 - str = "Not enough memory!"; 561 - else 562 - str = alloc_str; 563 - } else { 564 - indexes[depth] = idx; 565 - indexes[depth + 1] = NEWT_ARG_APPEND; 566 - indexes[depth + 2] = NEWT_ARG_LAST; 567 - ++chain_idx; 568 - } 569 - newt_checkbox_tree__add(tree, str, &chain->ms, indexes); 570 - free(alloc_str); 571 - ++printed; 572 - } 573 - 574 - indexes[depth] = idx; 575 - if (chain_idx != -1) 576 - indexes[depth + 1] = chain_idx; 577 - if (printed != 0) 578 - ++idx; 579 - __callchain__append_graph_browser(child, tree, new_total, indexes, 580 - depth + (chain_idx != -1 ? 
2 : 1)); 581 - node = next; 582 - } 583 - } 584 - 585 - static void callchain__append_graph_browser(struct callchain_node *self, 586 - newtComponent tree, u64 total, 587 - int *indexes, int parent_idx) 588 - { 589 - struct callchain_list *chain; 590 - int i = 0; 591 - 592 - indexes[1] = NEWT_ARG_APPEND; 593 - indexes[2] = NEWT_ARG_LAST; 594 - 595 - list_for_each_entry(chain, &self->val, list) { 596 - char ipstr[BITS_PER_LONG / 4 + 1], *str; 597 - 598 - if (chain->ip >= PERF_CONTEXT_MAX) 599 - continue; 600 - 601 - if (!i++ && sort__first_dimension == SORT_SYM) 602 - continue; 603 - 604 - str = callchain_list__sym_name(chain, ipstr, sizeof(ipstr)); 605 - newt_checkbox_tree__add(tree, str, &chain->ms, indexes); 566 + list_for_each_entry_from(pos, head, node) { 567 + bool current_entry = ui_browser__is_current_entry(self, row); 568 + SLsmg_gotorc(self->top + row, self->left); 569 + objdump_line__show(pos, head, self->width, 570 + he, len, current_entry); 571 + if (++row == self->height) 572 + break; 606 573 } 607 574 608 - indexes[1] = parent_idx; 609 - indexes[2] = NEWT_ARG_APPEND; 610 - indexes[3] = NEWT_ARG_LAST; 611 - __callchain__append_graph_browser(self, tree, total, indexes, 2); 612 - } 613 - 614 - static void hist_entry__append_callchain_browser(struct hist_entry *self, 615 - newtComponent tree, u64 total, int parent_idx) 616 - { 617 - struct rb_node *rb_node; 618 - int indexes[1024] = { [0] = parent_idx, }; 619 - int idx = 0; 620 - struct callchain_node *chain; 621 - 622 - rb_node = rb_first(&self->sorted_chain); 623 - while (rb_node) { 624 - chain = rb_entry(rb_node, struct callchain_node, rb_node); 625 - switch (callchain_param.mode) { 626 - case CHAIN_FLAT: 627 - break; 628 - case CHAIN_GRAPH_ABS: /* falldown */ 629 - case CHAIN_GRAPH_REL: 630 - callchain__append_graph_browser(chain, tree, total, indexes, idx++); 631 - break; 632 - case CHAIN_NONE: 633 - default: 634 - break; 635 - } 636 - rb_node = rb_next(rb_node); 637 - } 638 - } 639 - 640 - static 
size_t hist_entry__append_browser(struct hist_entry *self, 641 - newtComponent tree, u64 total) 642 - { 643 - char s[256]; 644 - size_t ret; 645 - 646 - if (symbol_conf.exclude_other && !self->parent) 647 - return 0; 648 - 649 - ret = hist_entry__snprintf(self, s, sizeof(s), NULL, 650 - false, 0, false, total); 651 - if (symbol_conf.use_callchain) { 652 - int indexes[2]; 653 - 654 - indexes[0] = NEWT_ARG_APPEND; 655 - indexes[1] = NEWT_ARG_LAST; 656 - newt_checkbox_tree__add(tree, s, &self->ms, indexes); 657 - } else 658 - newtListboxAppendEntry(tree, s, &self->ms); 659 - 660 - return ret; 575 + return row; 661 576 } 662 577 663 578 int hist_entry__tui_annotate(struct hist_entry *self) ··· 567 712 ui_helpline__push("Press <- or ESC to exit"); 568 713 569 714 memset(&browser, 0, sizeof(browser)); 570 - browser.entries = &head; 715 + browser.entries = &head; 716 + browser.refresh_entries = hist_entry__annotate_browser_refresh; 717 + browser.seek = ui_browser__list_head_seek; 571 718 browser.priv = self; 572 719 list_for_each_entry(pos, &head, node) { 573 720 size_t line_len = strlen(pos->line); ··· 579 722 } 580 723 581 724 browser.width += 18; /* Percentage */ 582 - ret = ui_browser__run(&browser, self->ms.sym->name, &es); 725 + ui_browser__show(&browser, self->ms.sym->name); 726 + newtFormAddHotKey(browser.form, ' '); 727 + ret = ui_browser__run(&browser, &es); 583 728 newtFormDestroy(browser.form); 584 729 newtPopWindow(); 585 730 list_for_each_entry_safe(pos, n, &head, node) { ··· 592 733 return ret; 593 734 } 594 735 595 - static const void *newt__symbol_tree_get_current(newtComponent self) 596 - { 597 - if (symbol_conf.use_callchain) 598 - return newtCheckboxTreeGetCurrent(self); 599 - return newtListboxGetCurrent(self); 600 - } 601 - 602 - static void hist_browser__selection(newtComponent self, void *data) 603 - { 604 - const struct map_symbol **symbol_ptr = data; 605 - *symbol_ptr = newt__symbol_tree_get_current(self); 606 - } 607 - 608 736 struct 
hist_browser { 609 - newtComponent form, tree; 610 - const struct map_symbol *selection; 737 + struct ui_browser b; 738 + struct hists *hists; 739 + struct hist_entry *he_selection; 740 + struct map_symbol *selection; 611 741 }; 612 742 613 - static struct hist_browser *hist_browser__new(void) 614 - { 615 - struct hist_browser *self = malloc(sizeof(*self)); 743 + static void hist_browser__reset(struct hist_browser *self); 744 + static int hist_browser__run(struct hist_browser *self, const char *title, 745 + struct newtExitStruct *es); 746 + static unsigned int hist_browser__refresh_entries(struct ui_browser *self); 747 + static void ui_browser__hists_seek(struct ui_browser *self, 748 + off_t offset, int whence); 616 749 617 - if (self != NULL) 618 - self->form = NULL; 750 + static struct hist_browser *hist_browser__new(struct hists *hists) 751 + { 752 + struct hist_browser *self = zalloc(sizeof(*self)); 753 + 754 + if (self) { 755 + self->hists = hists; 756 + self->b.refresh_entries = hist_browser__refresh_entries; 757 + self->b.seek = ui_browser__hists_seek; 758 + } 619 759 620 760 return self; 621 761 } 622 762 623 763 static void hist_browser__delete(struct hist_browser *self) 624 764 { 625 - newtFormDestroy(self->form); 765 + newtFormDestroy(self->b.form); 626 766 newtPopWindow(); 627 767 free(self); 628 768 } 629 769 630 - static int hist_browser__populate(struct hist_browser *self, struct hists *hists, 631 - const char *title) 632 - { 633 - int max_len = 0, idx, cols, rows; 634 - struct ui_progress *progress; 635 - struct rb_node *nd; 636 - u64 curr_hist = 0; 637 - char seq[] = ".", unit; 638 - char str[256]; 639 - unsigned long nr_events = hists->stats.nr_events[PERF_RECORD_SAMPLE]; 640 - 641 - if (self->form) { 642 - newtFormDestroy(self->form); 643 - newtPopWindow(); 644 - } 645 - 646 - nr_events = convert_unit(nr_events, &unit); 647 - snprintf(str, sizeof(str), "Events: %lu%c ", 648 - nr_events, unit); 649 - newtDrawRootText(0, 0, str); 650 - 651 - 
newtGetScreenSize(NULL, &rows); 652 - 653 - if (symbol_conf.use_callchain) 654 - self->tree = newtCheckboxTreeMulti(0, 0, rows - 5, seq, 655 - NEWT_FLAG_SCROLL); 656 - else 657 - self->tree = newtListbox(0, 0, rows - 5, 658 - (NEWT_FLAG_SCROLL | 659 - NEWT_FLAG_RETURNEXIT)); 660 - 661 - newtComponentAddCallback(self->tree, hist_browser__selection, 662 - &self->selection); 663 - 664 - progress = ui_progress__new("Adding entries to the browser...", 665 - hists->nr_entries); 666 - if (progress == NULL) 667 - return -1; 668 - 669 - idx = 0; 670 - for (nd = rb_first(&hists->entries); nd; nd = rb_next(nd)) { 671 - struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node); 672 - int len; 673 - 674 - if (h->filtered) 675 - continue; 676 - 677 - len = hist_entry__append_browser(h, self->tree, hists->stats.total_period); 678 - if (len > max_len) 679 - max_len = len; 680 - if (symbol_conf.use_callchain) 681 - hist_entry__append_callchain_browser(h, self->tree, 682 - hists->stats.total_period, idx++); 683 - ++curr_hist; 684 - if (curr_hist % 5) 685 - ui_progress__update(progress, curr_hist); 686 - } 687 - 688 - ui_progress__delete(progress); 689 - 690 - newtGetScreenSize(&cols, &rows); 691 - 692 - if (max_len > cols) 693 - max_len = cols - 3; 694 - 695 - if (!symbol_conf.use_callchain) 696 - newtListboxSetWidth(self->tree, max_len); 697 - 698 - newtCenteredWindow(max_len + (symbol_conf.use_callchain ? 
5 : 0), 699 - rows - 5, title); 700 - self->form = newt_form__new(); 701 - if (self->form == NULL) 702 - return -1; 703 - 704 - newtFormAddHotKey(self->form, 'A'); 705 - newtFormAddHotKey(self->form, 'a'); 706 - newtFormAddHotKey(self->form, 'D'); 707 - newtFormAddHotKey(self->form, 'd'); 708 - newtFormAddHotKey(self->form, 'T'); 709 - newtFormAddHotKey(self->form, 't'); 710 - newtFormAddHotKey(self->form, '?'); 711 - newtFormAddHotKey(self->form, 'H'); 712 - newtFormAddHotKey(self->form, 'h'); 713 - newtFormAddHotKey(self->form, NEWT_KEY_F1); 714 - newtFormAddHotKey(self->form, NEWT_KEY_RIGHT); 715 - newtFormAddHotKey(self->form, NEWT_KEY_TAB); 716 - newtFormAddHotKey(self->form, NEWT_KEY_UNTAB); 717 - newtFormAddComponents(self->form, self->tree, NULL); 718 - self->selection = newt__symbol_tree_get_current(self->tree); 719 - 720 - return 0; 721 - } 722 - 723 770 static struct hist_entry *hist_browser__selected_entry(struct hist_browser *self) 724 771 { 725 - int *indexes; 726 - 727 - if (!symbol_conf.use_callchain) 728 - goto out; 729 - 730 - indexes = newtCheckboxTreeFindItem(self->tree, (void *)self->selection); 731 - if (indexes) { 732 - bool is_hist_entry = indexes[1] == NEWT_ARG_LAST; 733 - free(indexes); 734 - if (is_hist_entry) 735 - goto out; 736 - } 737 - return NULL; 738 - out: 739 - return container_of(self->selection, struct hist_entry, ms); 772 + return self->he_selection; 740 773 } 741 774 742 775 static struct thread *hist_browser__selected_thread(struct hist_browser *self) 743 776 { 744 - struct hist_entry *he = hist_browser__selected_entry(self); 745 - return he ? 
he->thread : NULL; 777 + return self->he_selection->thread; 746 778 } 747 779 748 780 static int hist_browser__title(char *bf, size_t size, const char *ev_name, ··· 655 905 656 906 int hists__browse(struct hists *self, const char *helpline, const char *ev_name) 657 907 { 658 - struct hist_browser *browser = hist_browser__new(); 908 + struct hist_browser *browser = hist_browser__new(self); 659 909 struct pstack *fstack; 660 910 const struct thread *thread_filter = NULL; 661 911 const struct dso *dso_filter = NULL; ··· 674 924 675 925 hist_browser__title(msg, sizeof(msg), ev_name, 676 926 dso_filter, thread_filter); 677 - if (hist_browser__populate(browser, self, msg) < 0) 678 - goto out_free_stack; 679 927 680 928 while (1) { 681 929 const struct thread *thread; ··· 682 934 int nr_options = 0, choice = 0, i, 683 935 annotate = -2, zoom_dso = -2, zoom_thread = -2; 684 936 685 - newtFormRun(browser->form, &es); 937 + if (hist_browser__run(browser, msg, &es)) 938 + break; 686 939 687 940 thread = hist_browser__selected_thread(browser); 688 941 dso = browser->selection->map ? 
browser->selection->map->dso : NULL; ··· 818 1069 hists__filter_by_dso(self, dso_filter); 819 1070 hist_browser__title(msg, sizeof(msg), ev_name, 820 1071 dso_filter, thread_filter); 821 - if (hist_browser__populate(browser, self, msg) < 0) 822 - goto out; 1072 + hist_browser__reset(browser); 823 1073 } else if (choice == zoom_thread) { 824 1074 zoom_thread: 825 1075 if (thread_filter) { ··· 836 1088 hists__filter_by_thread(self, thread_filter); 837 1089 hist_browser__title(msg, sizeof(msg), ev_name, 838 1090 dso_filter, thread_filter); 839 - if (hist_browser__populate(browser, self, msg) < 0) 840 - goto out; 1091 + hist_browser__reset(browser); 841 1092 } 842 1093 } 843 1094 out_free_stack: ··· 892 1145 "blue", "lightgray", 893 1146 }; 894 1147 1148 + static void newt_suspend(void *d __used) 1149 + { 1150 + newtSuspend(); 1151 + raise(SIGTSTP); 1152 + newtResume(); 1153 + } 1154 + 895 1155 void setup_browser(void) 896 1156 { 897 1157 struct newtPercentTreeColors *c = &defaultPercentTreeColors; ··· 912 1158 use_browser = 1; 913 1159 newtInit(); 914 1160 newtCls(); 1161 + newtSetSuspendCallback(newt_suspend, NULL); 915 1162 ui_helpline__puts(" "); 916 1163 sltt_set_color(HE_COLORSET_TOP, NULL, c->topColorFg, c->topColorBg); 917 1164 sltt_set_color(HE_COLORSET_MEDIUM, NULL, c->mediumColorFg, c->mediumColorBg); ··· 930 1175 } 931 1176 newtFinished(); 932 1177 } 1178 + } 1179 + 1180 + static void hist_browser__refresh_dimensions(struct hist_browser *self) 1181 + { 1182 + /* 3 == +/- toggle symbol before actual hist_entry rendering */ 1183 + self->b.width = 3 + (hists__sort_list_width(self->hists) + 1184 + sizeof("[k]")); 1185 + } 1186 + 1187 + static void hist_browser__reset(struct hist_browser *self) 1188 + { 1189 + self->b.nr_entries = self->hists->nr_entries; 1190 + hist_browser__refresh_dimensions(self); 1191 + ui_browser__reset_index(&self->b); 1192 + } 1193 + 1194 + static char tree__folded_sign(bool unfolded) 1195 + { 1196 + return unfolded ? 
'-' : '+'; 1197 + } 1198 + 1199 + static char map_symbol__folded(const struct map_symbol *self) 1200 + { 1201 + return self->has_children ? tree__folded_sign(self->unfolded) : ' '; 1202 + } 1203 + 1204 + static char hist_entry__folded(const struct hist_entry *self) 1205 + { 1206 + return map_symbol__folded(&self->ms); 1207 + } 1208 + 1209 + static char callchain_list__folded(const struct callchain_list *self) 1210 + { 1211 + return map_symbol__folded(&self->ms); 1212 + } 1213 + 1214 + static bool map_symbol__toggle_fold(struct map_symbol *self) 1215 + { 1216 + if (!self->has_children) 1217 + return false; 1218 + 1219 + self->unfolded = !self->unfolded; 1220 + return true; 1221 + } 1222 + 1223 + #define LEVEL_OFFSET_STEP 3 1224 + 1225 + static int hist_browser__show_callchain_node_rb_tree(struct hist_browser *self, 1226 + struct callchain_node *chain_node, 1227 + u64 total, int level, 1228 + unsigned short row, 1229 + off_t *row_offset, 1230 + bool *is_current_entry) 1231 + { 1232 + struct rb_node *node; 1233 + int first_row = row, width, offset = level * LEVEL_OFFSET_STEP; 1234 + u64 new_total, remaining; 1235 + 1236 + if (callchain_param.mode == CHAIN_GRAPH_REL) 1237 + new_total = chain_node->children_hit; 1238 + else 1239 + new_total = total; 1240 + 1241 + remaining = new_total; 1242 + node = rb_first(&chain_node->rb_root); 1243 + while (node) { 1244 + struct callchain_node *child = rb_entry(node, struct callchain_node, rb_node); 1245 + struct rb_node *next = rb_next(node); 1246 + u64 cumul = cumul_hits(child); 1247 + struct callchain_list *chain; 1248 + char folded_sign = ' '; 1249 + int first = true; 1250 + int extra_offset = 0; 1251 + 1252 + remaining -= cumul; 1253 + 1254 + list_for_each_entry(chain, &child->val, list) { 1255 + char ipstr[BITS_PER_LONG / 4 + 1], *alloc_str; 1256 + const char *str; 1257 + int color; 1258 + bool was_first = first; 1259 + 1260 + if (first) { 1261 + first = false; 1262 + chain->ms.has_children = chain->list.next != &child->val || 
1263 + rb_first(&child->rb_root) != NULL; 1264 + } else { 1265 + extra_offset = LEVEL_OFFSET_STEP; 1266 + chain->ms.has_children = chain->list.next == &child->val && 1267 + rb_first(&child->rb_root) != NULL; 1268 + } 1269 + 1270 + folded_sign = callchain_list__folded(chain); 1271 + if (*row_offset != 0) { 1272 + --*row_offset; 1273 + goto do_next; 1274 + } 1275 + 1276 + alloc_str = NULL; 1277 + str = callchain_list__sym_name(chain, ipstr, sizeof(ipstr)); 1278 + if (was_first) { 1279 + double percent = cumul * 100.0 / new_total; 1280 + 1281 + if (asprintf(&alloc_str, "%2.2f%% %s", percent, str) < 0) 1282 + str = "Not enough memory!"; 1283 + else 1284 + str = alloc_str; 1285 + } 1286 + 1287 + color = HE_COLORSET_NORMAL; 1288 + width = self->b.width - (offset + extra_offset + 2); 1289 + if (ui_browser__is_current_entry(&self->b, row)) { 1290 + self->selection = &chain->ms; 1291 + color = HE_COLORSET_SELECTED; 1292 + *is_current_entry = true; 1293 + } 1294 + 1295 + SLsmg_set_color(color); 1296 + SLsmg_gotorc(self->b.top + row, self->b.left); 1297 + slsmg_write_nstring(" ", offset + extra_offset); 1298 + slsmg_printf("%c ", folded_sign); 1299 + slsmg_write_nstring(str, width); 1300 + free(alloc_str); 1301 + 1302 + if (++row == self->b.height) 1303 + goto out; 1304 + do_next: 1305 + if (folded_sign == '+') 1306 + break; 1307 + } 1308 + 1309 + if (folded_sign == '-') { 1310 + const int new_level = level + (extra_offset ? 
2 : 1); 1311 + row += hist_browser__show_callchain_node_rb_tree(self, child, new_total, 1312 + new_level, row, row_offset, 1313 + is_current_entry); 1314 + } 1315 + if (row == self->b.height) 1316 + goto out; 1317 + node = next; 1318 + } 1319 + out: 1320 + return row - first_row; 1321 + } 1322 + 1323 + static int hist_browser__show_callchain_node(struct hist_browser *self, 1324 + struct callchain_node *node, 1325 + int level, unsigned short row, 1326 + off_t *row_offset, 1327 + bool *is_current_entry) 1328 + { 1329 + struct callchain_list *chain; 1330 + int first_row = row, 1331 + offset = level * LEVEL_OFFSET_STEP, 1332 + width = self->b.width - offset; 1333 + char folded_sign = ' '; 1334 + 1335 + list_for_each_entry(chain, &node->val, list) { 1336 + char ipstr[BITS_PER_LONG / 4 + 1], *s; 1337 + int color; 1338 + /* 1339 + * FIXME: This should be moved to somewhere else, 1340 + * probably when the callchain is created, so as not to 1341 + * traverse it all over again 1342 + */ 1343 + chain->ms.has_children = rb_first(&node->rb_root) != NULL; 1344 + folded_sign = callchain_list__folded(chain); 1345 + 1346 + if (*row_offset != 0) { 1347 + --*row_offset; 1348 + continue; 1349 + } 1350 + 1351 + color = HE_COLORSET_NORMAL; 1352 + if (ui_browser__is_current_entry(&self->b, row)) { 1353 + self->selection = &chain->ms; 1354 + color = HE_COLORSET_SELECTED; 1355 + *is_current_entry = true; 1356 + } 1357 + 1358 + s = callchain_list__sym_name(chain, ipstr, sizeof(ipstr)); 1359 + SLsmg_gotorc(self->b.top + row, self->b.left); 1360 + SLsmg_set_color(color); 1361 + slsmg_write_nstring(" ", offset); 1362 + slsmg_printf("%c ", folded_sign); 1363 + slsmg_write_nstring(s, width - 2); 1364 + 1365 + if (++row == self->b.height) 1366 + goto out; 1367 + } 1368 + 1369 + if (folded_sign == '-') 1370 + row += hist_browser__show_callchain_node_rb_tree(self, node, 1371 + self->hists->stats.total_period, 1372 + level + 1, row, 1373 + row_offset, 1374 + is_current_entry); 1375 + out: 1376 + 
return row - first_row; 1377 + } 1378 + 1379 + static int hist_browser__show_callchain(struct hist_browser *self, 1380 + struct rb_root *chain, 1381 + int level, unsigned short row, 1382 + off_t *row_offset, 1383 + bool *is_current_entry) 1384 + { 1385 + struct rb_node *nd; 1386 + int first_row = row; 1387 + 1388 + for (nd = rb_first(chain); nd; nd = rb_next(nd)) { 1389 + struct callchain_node *node = rb_entry(nd, struct callchain_node, rb_node); 1390 + 1391 + row += hist_browser__show_callchain_node(self, node, level, 1392 + row, row_offset, 1393 + is_current_entry); 1394 + if (row == self->b.height) 1395 + break; 1396 + } 1397 + 1398 + return row - first_row; 1399 + } 1400 + 1401 + static int hist_browser__show_entry(struct hist_browser *self, 1402 + struct hist_entry *entry, 1403 + unsigned short row) 1404 + { 1405 + char s[256]; 1406 + double percent; 1407 + int printed = 0; 1408 + int color, width = self->b.width; 1409 + char folded_sign = ' '; 1410 + bool current_entry = ui_browser__is_current_entry(&self->b, row); 1411 + off_t row_offset = entry->row_offset; 1412 + 1413 + if (current_entry) { 1414 + self->he_selection = entry; 1415 + self->selection = &entry->ms; 1416 + } 1417 + 1418 + if (symbol_conf.use_callchain) { 1419 + entry->ms.has_children = !RB_EMPTY_ROOT(&entry->sorted_chain); 1420 + folded_sign = hist_entry__folded(entry); 1421 + } 1422 + 1423 + if (row_offset == 0) { 1424 + hist_entry__snprintf(entry, s, sizeof(s), self->hists, NULL, false, 1425 + 0, false, self->hists->stats.total_period); 1426 + percent = (entry->period * 100.0) / self->hists->stats.total_period; 1427 + 1428 + color = HE_COLORSET_SELECTED; 1429 + if (!current_entry) { 1430 + if (percent >= MIN_RED) 1431 + color = HE_COLORSET_TOP; 1432 + else if (percent >= MIN_GREEN) 1433 + color = HE_COLORSET_MEDIUM; 1434 + else 1435 + color = HE_COLORSET_NORMAL; 1436 + } 1437 + 1438 + SLsmg_set_color(color); 1439 + SLsmg_gotorc(self->b.top + row, self->b.left); 1440 + if 
(symbol_conf.use_callchain) { 1441 + slsmg_printf("%c ", folded_sign); 1442 + width -= 2; 1443 + } 1444 + slsmg_write_nstring(s, width); 1445 + ++row; 1446 + ++printed; 1447 + } else 1448 + --row_offset; 1449 + 1450 + if (folded_sign == '-' && row != self->b.height) { 1451 + printed += hist_browser__show_callchain(self, &entry->sorted_chain, 1452 + 1, row, &row_offset, 1453 + &current_entry); 1454 + if (current_entry) 1455 + self->he_selection = entry; 1456 + } 1457 + 1458 + return printed; 1459 + } 1460 + 1461 + static unsigned int hist_browser__refresh_entries(struct ui_browser *self) 1462 + { 1463 + unsigned row = 0; 1464 + struct rb_node *nd; 1465 + struct hist_browser *hb = container_of(self, struct hist_browser, b); 1466 + 1467 + if (self->first_visible_entry == NULL) 1468 + self->first_visible_entry = rb_first(&hb->hists->entries); 1469 + 1470 + for (nd = self->first_visible_entry; nd; nd = rb_next(nd)) { 1471 + struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node); 1472 + 1473 + if (h->filtered) 1474 + continue; 1475 + 1476 + row += hist_browser__show_entry(hb, h, row); 1477 + if (row == self->height) 1478 + break; 1479 + } 1480 + 1481 + return row; 1482 + } 1483 + 1484 + static void callchain_node__init_have_children_rb_tree(struct callchain_node *self) 1485 + { 1486 + struct rb_node *nd = rb_first(&self->rb_root); 1487 + 1488 + for (nd = rb_first(&self->rb_root); nd; nd = rb_next(nd)) { 1489 + struct callchain_node *child = rb_entry(nd, struct callchain_node, rb_node); 1490 + struct callchain_list *chain; 1491 + int first = true; 1492 + 1493 + list_for_each_entry(chain, &child->val, list) { 1494 + if (first) { 1495 + first = false; 1496 + chain->ms.has_children = chain->list.next != &child->val || 1497 + rb_first(&child->rb_root) != NULL; 1498 + } else 1499 + chain->ms.has_children = chain->list.next == &child->val && 1500 + rb_first(&child->rb_root) != NULL; 1501 + } 1502 + 1503 + callchain_node__init_have_children_rb_tree(child); 1504 + } 1505 
+ } 1506 + 1507 + static void callchain_node__init_have_children(struct callchain_node *self) 1508 + { 1509 + struct callchain_list *chain; 1510 + 1511 + list_for_each_entry(chain, &self->val, list) 1512 + chain->ms.has_children = rb_first(&self->rb_root) != NULL; 1513 + 1514 + callchain_node__init_have_children_rb_tree(self); 1515 + } 1516 + 1517 + static void callchain__init_have_children(struct rb_root *self) 1518 + { 1519 + struct rb_node *nd; 1520 + 1521 + for (nd = rb_first(self); nd; nd = rb_next(nd)) { 1522 + struct callchain_node *node = rb_entry(nd, struct callchain_node, rb_node); 1523 + callchain_node__init_have_children(node); 1524 + } 1525 + } 1526 + 1527 + static void hist_entry__init_have_children(struct hist_entry *self) 1528 + { 1529 + if (!self->init_have_children) { 1530 + callchain__init_have_children(&self->sorted_chain); 1531 + self->init_have_children = true; 1532 + } 1533 + } 1534 + 1535 + static struct rb_node *hists__filter_entries(struct rb_node *nd) 1536 + { 1537 + while (nd != NULL) { 1538 + struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node); 1539 + if (!h->filtered) 1540 + return nd; 1541 + 1542 + nd = rb_next(nd); 1543 + } 1544 + 1545 + return NULL; 1546 + } 1547 + 1548 + static struct rb_node *hists__filter_prev_entries(struct rb_node *nd) 1549 + { 1550 + while (nd != NULL) { 1551 + struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node); 1552 + if (!h->filtered) 1553 + return nd; 1554 + 1555 + nd = rb_prev(nd); 1556 + } 1557 + 1558 + return NULL; 1559 + } 1560 + 1561 + static void ui_browser__hists_seek(struct ui_browser *self, 1562 + off_t offset, int whence) 1563 + { 1564 + struct hist_entry *h; 1565 + struct rb_node *nd; 1566 + bool first = true; 1567 + 1568 + switch (whence) { 1569 + case SEEK_SET: 1570 + nd = hists__filter_entries(rb_first(self->entries)); 1571 + break; 1572 + case SEEK_CUR: 1573 + nd = self->first_visible_entry; 1574 + goto do_offset; 1575 + case SEEK_END: 1576 + nd = 
hists__filter_prev_entries(rb_last(self->entries)); 1577 + first = false; 1578 + break; 1579 + default: 1580 + return; 1581 + } 1582 + 1583 + /* 1584 + * Moves not relative to the first visible entry invalidates its 1585 + * row_offset: 1586 + */ 1587 + h = rb_entry(self->first_visible_entry, struct hist_entry, rb_node); 1588 + h->row_offset = 0; 1589 + 1590 + /* 1591 + * Here we have to check if nd is expanded (+), if it is we can't go 1592 + * the next top level hist_entry, instead we must compute an offset of 1593 + * what _not_ to show and not change the first visible entry. 1594 + * 1595 + * This offset increments when we are going from top to bottom and 1596 + * decreases when we're going from bottom to top. 1597 + * 1598 + * As we don't have backpointers to the top level in the callchains 1599 + * structure, we need to always print the whole hist_entry callchain, 1600 + * skipping the first ones that are before the first visible entry 1601 + * and stop when we printed enough lines to fill the screen. 
1602 + */ 1603 + do_offset: 1604 + if (offset > 0) { 1605 + do { 1606 + h = rb_entry(nd, struct hist_entry, rb_node); 1607 + if (h->ms.unfolded) { 1608 + u16 remaining = h->nr_rows - h->row_offset; 1609 + if (offset > remaining) { 1610 + offset -= remaining; 1611 + h->row_offset = 0; 1612 + } else { 1613 + h->row_offset += offset; 1614 + offset = 0; 1615 + self->first_visible_entry = nd; 1616 + break; 1617 + } 1618 + } 1619 + nd = hists__filter_entries(rb_next(nd)); 1620 + if (nd == NULL) 1621 + break; 1622 + --offset; 1623 + self->first_visible_entry = nd; 1624 + } while (offset != 0); 1625 + } else if (offset < 0) { 1626 + while (1) { 1627 + h = rb_entry(nd, struct hist_entry, rb_node); 1628 + if (h->ms.unfolded) { 1629 + if (first) { 1630 + if (-offset > h->row_offset) { 1631 + offset += h->row_offset; 1632 + h->row_offset = 0; 1633 + } else { 1634 + h->row_offset += offset; 1635 + offset = 0; 1636 + self->first_visible_entry = nd; 1637 + break; 1638 + } 1639 + } else { 1640 + if (-offset > h->nr_rows) { 1641 + offset += h->nr_rows; 1642 + h->row_offset = 0; 1643 + } else { 1644 + h->row_offset = h->nr_rows + offset; 1645 + offset = 0; 1646 + self->first_visible_entry = nd; 1647 + break; 1648 + } 1649 + } 1650 + } 1651 + 1652 + nd = hists__filter_prev_entries(rb_prev(nd)); 1653 + if (nd == NULL) 1654 + break; 1655 + ++offset; 1656 + self->first_visible_entry = nd; 1657 + if (offset == 0) { 1658 + /* 1659 + * Last unfiltered hist_entry, check if it is 1660 + * unfolded, if it is then we should have 1661 + * row_offset at its last entry. 
1662 + */ 1663 + h = rb_entry(nd, struct hist_entry, rb_node); 1664 + if (h->ms.unfolded) 1665 + h->row_offset = h->nr_rows; 1666 + break; 1667 + } 1668 + first = false; 1669 + } 1670 + } else { 1671 + self->first_visible_entry = nd; 1672 + h = rb_entry(nd, struct hist_entry, rb_node); 1673 + h->row_offset = 0; 1674 + } 1675 + } 1676 + 1677 + static int callchain_node__count_rows_rb_tree(struct callchain_node *self) 1678 + { 1679 + int n = 0; 1680 + struct rb_node *nd; 1681 + 1682 + for (nd = rb_first(&self->rb_root); nd; nd = rb_next(nd)) { 1683 + struct callchain_node *child = rb_entry(nd, struct callchain_node, rb_node); 1684 + struct callchain_list *chain; 1685 + char folded_sign = ' '; /* No children */ 1686 + 1687 + list_for_each_entry(chain, &child->val, list) { 1688 + ++n; 1689 + /* We need this because we may not have children */ 1690 + folded_sign = callchain_list__folded(chain); 1691 + if (folded_sign == '+') 1692 + break; 1693 + } 1694 + 1695 + if (folded_sign == '-') /* Have children and they're unfolded */ 1696 + n += callchain_node__count_rows_rb_tree(child); 1697 + } 1698 + 1699 + return n; 1700 + } 1701 + 1702 + static int callchain_node__count_rows(struct callchain_node *node) 1703 + { 1704 + struct callchain_list *chain; 1705 + bool unfolded = false; 1706 + int n = 0; 1707 + 1708 + list_for_each_entry(chain, &node->val, list) { 1709 + ++n; 1710 + unfolded = chain->ms.unfolded; 1711 + } 1712 + 1713 + if (unfolded) 1714 + n += callchain_node__count_rows_rb_tree(node); 1715 + 1716 + return n; 1717 + } 1718 + 1719 + static int callchain__count_rows(struct rb_root *chain) 1720 + { 1721 + struct rb_node *nd; 1722 + int n = 0; 1723 + 1724 + for (nd = rb_first(chain); nd; nd = rb_next(nd)) { 1725 + struct callchain_node *node = rb_entry(nd, struct callchain_node, rb_node); 1726 + n += callchain_node__count_rows(node); 1727 + } 1728 + 1729 + return n; 1730 + } 1731 + 1732 + static bool hist_browser__toggle_fold(struct hist_browser *self) 1733 + { 1734 + 
if (map_symbol__toggle_fold(self->selection)) { 1735 + struct hist_entry *he = self->he_selection; 1736 + 1737 + hist_entry__init_have_children(he); 1738 + self->hists->nr_entries -= he->nr_rows; 1739 + 1740 + if (he->ms.unfolded) 1741 + he->nr_rows = callchain__count_rows(&he->sorted_chain); 1742 + else 1743 + he->nr_rows = 0; 1744 + self->hists->nr_entries += he->nr_rows; 1745 + self->b.nr_entries = self->hists->nr_entries; 1746 + 1747 + return true; 1748 + } 1749 + 1750 + /* If it doesn't have children, no toggling performed */ 1751 + return false; 1752 + } 1753 + 1754 + static int hist_browser__run(struct hist_browser *self, const char *title, 1755 + struct newtExitStruct *es) 1756 + { 1757 + char str[256], unit; 1758 + unsigned long nr_events = self->hists->stats.nr_events[PERF_RECORD_SAMPLE]; 1759 + 1760 + self->b.entries = &self->hists->entries; 1761 + self->b.nr_entries = self->hists->nr_entries; 1762 + 1763 + hist_browser__refresh_dimensions(self); 1764 + 1765 + nr_events = convert_unit(nr_events, &unit); 1766 + snprintf(str, sizeof(str), "Events: %lu%c ", 1767 + nr_events, unit); 1768 + newtDrawRootText(0, 0, str); 1769 + 1770 + if (ui_browser__show(&self->b, title) < 0) 1771 + return -1; 1772 + 1773 + newtFormAddHotKey(self->b.form, 'A'); 1774 + newtFormAddHotKey(self->b.form, 'a'); 1775 + newtFormAddHotKey(self->b.form, '?'); 1776 + newtFormAddHotKey(self->b.form, 'h'); 1777 + newtFormAddHotKey(self->b.form, 'H'); 1778 + newtFormAddHotKey(self->b.form, 'd'); 1779 + 1780 + newtFormAddHotKey(self->b.form, NEWT_KEY_LEFT); 1781 + newtFormAddHotKey(self->b.form, NEWT_KEY_RIGHT); 1782 + newtFormAddHotKey(self->b.form, NEWT_KEY_ENTER); 1783 + 1784 + while (1) { 1785 + ui_browser__run(&self->b, es); 1786 + 1787 + if (es->reason != NEWT_EXIT_HOTKEY) 1788 + break; 1789 + switch (es->u.key) { 1790 + case 'd': { /* Debug */ 1791 + static int seq; 1792 + struct hist_entry *h = rb_entry(self->b.first_visible_entry, 1793 + struct hist_entry, rb_node); 1794 + 
ui_helpline__pop(); 1795 + ui_helpline__fpush("%d: nr_ent=(%d,%d), height=%d, idx=%d, fve: idx=%d, row_off=%d, nrows=%d", 1796 + seq++, self->b.nr_entries, 1797 + self->hists->nr_entries, 1798 + self->b.height, 1799 + self->b.index, 1800 + self->b.first_visible_entry_idx, 1801 + h->row_offset, h->nr_rows); 1802 + } 1803 + continue; 1804 + case NEWT_KEY_ENTER: 1805 + if (hist_browser__toggle_fold(self)) 1806 + break; 1807 + /* fall thru */ 1808 + default: 1809 + return 0; 1810 + } 1811 + } 1812 + return 0; 933 1813 }
+9 -2
tools/perf/util/parse-events.c
··· 602 602 return EVT_FAILED; 603 603 } 604 604 605 - /* We should find a nice way to override the access type */ 606 - attr->bp_len = HW_BREAKPOINT_LEN_4; 605 + /* 606 + * We should find a nice way to override the access length 607 + * Provide some defaults for now 608 + */ 609 + if (attr->bp_type == HW_BREAKPOINT_X) 610 + attr->bp_len = sizeof(long); 611 + else 612 + attr->bp_len = HW_BREAKPOINT_LEN_4; 613 + 607 614 attr->type = PERF_TYPE_BREAKPOINT; 608 615 609 616 return EVT_HANDLED;
+187 -84
tools/perf/util/probe-event.c
··· 1 1 /* 2 - * probe-event.c : perf-probe definition to kprobe_events format converter 2 + * probe-event.c : perf-probe definition to probe_events format converter 3 3 * 4 4 * Written by Masami Hiramatsu <mhiramat@redhat.com> 5 5 * ··· 120 120 return open(machine.vmlinux_maps[MAP__FUNCTION]->dso->long_name, O_RDONLY); 121 121 } 122 122 123 - /* Convert trace point to probe point with debuginfo */ 124 - static int convert_to_perf_probe_point(struct kprobe_trace_point *tp, 123 + /* 124 + * Convert trace point to probe point with debuginfo 125 + * Currently only handles kprobes. 126 + */ 127 + static int kprobe_convert_to_perf_probe(struct probe_trace_point *tp, 125 128 struct perf_probe_point *pp) 126 129 { 127 130 struct symbol *sym; ··· 154 151 } 155 152 156 153 /* Try to find perf_probe_event with debuginfo */ 157 - static int try_to_find_kprobe_trace_events(struct perf_probe_event *pev, 158 - struct kprobe_trace_event **tevs, 154 + static int try_to_find_probe_trace_events(struct perf_probe_event *pev, 155 + struct probe_trace_event **tevs, 159 156 int max_tevs) 160 157 { 161 158 bool need_dwarf = perf_probe_event_need_dwarf(pev); ··· 172 169 } 173 170 174 171 /* Searching trace events corresponding to probe event */ 175 - ntevs = find_kprobe_trace_events(fd, pev, tevs, max_tevs); 172 + ntevs = find_probe_trace_events(fd, pev, tevs, max_tevs); 176 173 close(fd); 177 174 178 175 if (ntevs > 0) { /* Succeeded to find trace events */ 179 - pr_debug("find %d kprobe_trace_events.\n", ntevs); 176 + pr_debug("find %d probe_trace_events.\n", ntevs); 180 177 return ntevs; 181 178 } 182 179 ··· 196 193 } 197 194 } 198 195 return ntevs; 196 + } 197 + 198 + /* 199 + * Find a src file from a DWARF tag path. Prepend optional source path prefix 200 + * and chop off leading directories that do not exist. Result is passed back as 201 + * a newly allocated path on success. 202 + * Return 0 if file was found and readable, -errno otherwise. 
203 + */ 204 + static int get_real_path(const char *raw_path, const char *comp_dir, 205 + char **new_path) 206 + { 207 + const char *prefix = symbol_conf.source_prefix; 208 + 209 + if (!prefix) { 210 + if (raw_path[0] != '/' && comp_dir) 211 + /* If not an absolute path, try to use comp_dir */ 212 + prefix = comp_dir; 213 + else { 214 + if (access(raw_path, R_OK) == 0) { 215 + *new_path = strdup(raw_path); 216 + return 0; 217 + } else 218 + return -errno; 219 + } 220 + } 221 + 222 + *new_path = malloc((strlen(prefix) + strlen(raw_path) + 2)); 223 + if (!*new_path) 224 + return -ENOMEM; 225 + 226 + for (;;) { 227 + sprintf(*new_path, "%s/%s", prefix, raw_path); 228 + 229 + if (access(*new_path, R_OK) == 0) 230 + return 0; 231 + 232 + if (!symbol_conf.source_prefix) 233 + /* In case of searching comp_dir, don't retry */ 234 + return -errno; 235 + 236 + switch (errno) { 237 + case ENAMETOOLONG: 238 + case ENOENT: 239 + case EROFS: 240 + case EFAULT: 241 + raw_path = strchr(++raw_path, '/'); 242 + if (!raw_path) { 243 + free(*new_path); 244 + *new_path = NULL; 245 + return -ENOENT; 246 + } 247 + continue; 248 + 249 + default: 250 + free(*new_path); 251 + *new_path = NULL; 252 + return -errno; 253 + } 254 + } 199 255 } 200 256 201 257 #define LINEBUF_SIZE 256 ··· 306 244 struct line_node *ln; 307 245 FILE *fp; 308 246 int fd, ret; 247 + char *tmp; 309 248 310 249 /* Search a line range */ 311 250 ret = init_vmlinux(); ··· 326 263 return -ENOENT; 327 264 } else if (ret < 0) { 328 265 pr_warning("Debuginfo analysis failed. (%d)\n", ret); 266 + return ret; 267 + } 268 + 269 + /* Convert source file path */ 270 + tmp = lr->path; 271 + ret = get_real_path(tmp, lr->comp_dir, &lr->path); 272 + free(tmp); /* Free old path */ 273 + if (ret < 0) { 274 + pr_warning("Failed to find source file. 
(%d)\n", ret); 329 275 return ret; 330 276 } 331 277 ··· 380 308 381 309 #else /* !DWARF_SUPPORT */ 382 310 383 - static int convert_to_perf_probe_point(struct kprobe_trace_point *tp, 384 - struct perf_probe_point *pp) 311 + static int kprobe_convert_to_perf_probe(struct probe_trace_point *tp, 312 + struct perf_probe_point *pp) 385 313 { 386 314 pp->function = strdup(tp->symbol); 387 315 if (pp->function == NULL) ··· 392 320 return 0; 393 321 } 394 322 395 - static int try_to_find_kprobe_trace_events(struct perf_probe_event *pev, 396 - struct kprobe_trace_event **tevs __unused, 323 + static int try_to_find_probe_trace_events(struct perf_probe_event *pev, 324 + struct probe_trace_event **tevs __unused, 397 325 int max_tevs __unused) 398 326 { 399 327 if (perf_probe_event_need_dwarf(pev)) { ··· 629 557 /* Parse perf-probe event argument */ 630 558 static int parse_perf_probe_arg(char *str, struct perf_probe_arg *arg) 631 559 { 632 - char *tmp; 560 + char *tmp, *goodname; 633 561 struct perf_probe_arg_field **fieldp; 634 562 635 563 pr_debug("parsing arg: %s into ", str); ··· 652 580 pr_debug("type:%s ", arg->type); 653 581 } 654 582 655 - tmp = strpbrk(str, "-."); 583 + tmp = strpbrk(str, "-.["); 656 584 if (!is_c_varname(str) || !tmp) { 657 585 /* A variable, register, symbol or special value */ 658 586 arg->var = strdup(str); ··· 662 590 return 0; 663 591 } 664 592 665 - /* Structure fields */ 593 + /* Structure fields or array element */ 666 594 arg->var = strndup(str, tmp - str); 667 595 if (arg->var == NULL) 668 596 return -ENOMEM; 597 + goodname = arg->var; 669 598 pr_debug("%s, ", arg->var); 670 599 fieldp = &arg->field; 671 600 ··· 674 601 *fieldp = zalloc(sizeof(struct perf_probe_arg_field)); 675 602 if (*fieldp == NULL) 676 603 return -ENOMEM; 677 - if (*tmp == '.') { 678 - str = tmp + 1; 679 - (*fieldp)->ref = false; 680 - } else if (tmp[1] == '>') { 681 - str = tmp + 2; 604 + if (*tmp == '[') { /* Array */ 605 + str = tmp; 606 + (*fieldp)->index = 
strtol(str + 1, &tmp, 0); 682 607 (*fieldp)->ref = true; 683 - } else { 684 - semantic_error("Argument parse error: %s\n", str); 685 - return -EINVAL; 608 + if (*tmp != ']' || tmp == str + 1) { 609 + semantic_error("Array index must be a" 610 + " number.\n"); 611 + return -EINVAL; 612 + } 613 + tmp++; 614 + if (*tmp == '\0') 615 + tmp = NULL; 616 + } else { /* Structure */ 617 + if (*tmp == '.') { 618 + str = tmp + 1; 619 + (*fieldp)->ref = false; 620 + } else if (tmp[1] == '>') { 621 + str = tmp + 2; 622 + (*fieldp)->ref = true; 623 + } else { 624 + semantic_error("Argument parse error: %s\n", 625 + str); 626 + return -EINVAL; 627 + } 628 + tmp = strpbrk(str, "-.["); 686 629 } 687 - 688 - tmp = strpbrk(str, "-."); 689 630 if (tmp) { 690 631 (*fieldp)->name = strndup(str, tmp - str); 691 632 if ((*fieldp)->name == NULL) 692 633 return -ENOMEM; 634 + if (*str != '[') 635 + goodname = (*fieldp)->name; 693 636 pr_debug("%s(%d), ", (*fieldp)->name, (*fieldp)->ref); 694 637 fieldp = &(*fieldp)->next; 695 638 } ··· 713 624 (*fieldp)->name = strdup(str); 714 625 if ((*fieldp)->name == NULL) 715 626 return -ENOMEM; 627 + if (*str != '[') 628 + goodname = (*fieldp)->name; 716 629 pr_debug("%s(%d)\n", (*fieldp)->name, (*fieldp)->ref); 717 630 718 - /* If no name is specified, set the last field name */ 631 + /* If no name is specified, set the last field name (not array index)*/ 719 632 if (!arg->name) { 720 - arg->name = strdup((*fieldp)->name); 633 + arg->name = strdup(goodname); 721 634 if (arg->name == NULL) 722 635 return -ENOMEM; 723 636 } ··· 784 693 return false; 785 694 } 786 695 787 - /* Parse kprobe_events event into struct probe_point */ 788 - int parse_kprobe_trace_command(const char *cmd, struct kprobe_trace_event *tev) 696 + /* Parse probe_events event into struct probe_point */ 697 + static int parse_probe_trace_command(const char *cmd, 698 + struct probe_trace_event *tev) 789 699 { 790 - struct kprobe_trace_point *tp = &tev->point; 700 + struct 
probe_trace_point *tp = &tev->point; 791 701 char pr; 792 702 char *p; 793 703 int ret, i, argc; 794 704 char **argv; 795 705 796 - pr_debug("Parsing kprobe_events: %s\n", cmd); 706 + pr_debug("Parsing probe_events: %s\n", cmd); 797 707 argv = argv_split(cmd, &argc); 798 708 if (!argv) { 799 709 pr_debug("Failed to split arguments.\n"); ··· 826 734 tp->offset = 0; 827 735 828 736 tev->nargs = argc - 2; 829 - tev->args = zalloc(sizeof(struct kprobe_trace_arg) * tev->nargs); 737 + tev->args = zalloc(sizeof(struct probe_trace_arg) * tev->nargs); 830 738 if (tev->args == NULL) { 831 739 ret = -ENOMEM; 832 740 goto out; ··· 868 776 len -= ret; 869 777 870 778 while (field) { 871 - ret = e_snprintf(tmp, len, "%s%s", field->ref ? "->" : ".", 872 - field->name); 779 + if (field->name[0] == '[') 780 + ret = e_snprintf(tmp, len, "%s", field->name); 781 + else 782 + ret = e_snprintf(tmp, len, "%s%s", 783 + field->ref ? "->" : ".", field->name); 873 784 if (ret <= 0) 874 785 goto error; 875 786 tmp += ret; ··· 972 877 } 973 878 #endif 974 879 975 - static int __synthesize_kprobe_trace_arg_ref(struct kprobe_trace_arg_ref *ref, 880 + static int __synthesize_probe_trace_arg_ref(struct probe_trace_arg_ref *ref, 976 881 char **buf, size_t *buflen, 977 882 int depth) 978 883 { 979 884 int ret; 980 885 if (ref->next) { 981 - depth = __synthesize_kprobe_trace_arg_ref(ref->next, buf, 886 + depth = __synthesize_probe_trace_arg_ref(ref->next, buf, 982 887 buflen, depth + 1); 983 888 if (depth < 0) 984 889 goto out; ··· 996 901 997 902 } 998 903 999 - static int synthesize_kprobe_trace_arg(struct kprobe_trace_arg *arg, 904 + static int synthesize_probe_trace_arg(struct probe_trace_arg *arg, 1000 905 char *buf, size_t buflen) 1001 906 { 907 + struct probe_trace_arg_ref *ref = arg->ref; 1002 908 int ret, depth = 0; 1003 909 char *tmp = buf; 1004 910 ··· 1013 917 buf += ret; 1014 918 buflen -= ret; 1015 919 920 + /* Special case: @XXX */ 921 + if (arg->value[0] == '@' && arg->ref) 922 + ref 
= ref->next; 923 + 1016 924 /* Dereferencing arguments */ 1017 - if (arg->ref) { 1018 - depth = __synthesize_kprobe_trace_arg_ref(arg->ref, &buf, 925 + if (ref) { 926 + depth = __synthesize_probe_trace_arg_ref(ref, &buf, 1019 927 &buflen, 1); 1020 928 if (depth < 0) 1021 929 return depth; 1022 930 } 1023 931 1024 932 /* Print argument value */ 1025 - ret = e_snprintf(buf, buflen, "%s", arg->value); 933 + if (arg->value[0] == '@' && arg->ref) 934 + ret = e_snprintf(buf, buflen, "%s%+ld", arg->value, 935 + arg->ref->offset); 936 + else 937 + ret = e_snprintf(buf, buflen, "%s", arg->value); 1026 938 if (ret < 0) 1027 939 return ret; 1028 940 buf += ret; ··· 1055 951 return buf - tmp; 1056 952 } 1057 953 1058 - char *synthesize_kprobe_trace_command(struct kprobe_trace_event *tev) 954 + char *synthesize_probe_trace_command(struct probe_trace_event *tev) 1059 955 { 1060 - struct kprobe_trace_point *tp = &tev->point; 956 + struct probe_trace_point *tp = &tev->point; 1061 957 char *buf; 1062 958 int i, len, ret; 1063 959 ··· 1073 969 goto error; 1074 970 1075 971 for (i = 0; i < tev->nargs; i++) { 1076 - ret = synthesize_kprobe_trace_arg(&tev->args[i], buf + len, 972 + ret = synthesize_probe_trace_arg(&tev->args[i], buf + len, 1077 973 MAX_CMDLEN - len); 1078 974 if (ret <= 0) 1079 975 goto error; ··· 1086 982 return NULL; 1087 983 } 1088 984 1089 - int convert_to_perf_probe_event(struct kprobe_trace_event *tev, 985 + static int convert_to_perf_probe_event(struct probe_trace_event *tev, 1090 986 struct perf_probe_event *pev) 1091 987 { 1092 988 char buf[64] = ""; ··· 1099 995 return -ENOMEM; 1100 996 1101 997 /* Convert trace_point to probe_point */ 1102 - ret = convert_to_perf_probe_point(&tev->point, &pev->point); 998 + ret = kprobe_convert_to_perf_probe(&tev->point, &pev->point); 1103 999 if (ret < 0) 1104 1000 return ret; 1105 1001 ··· 1112 1008 if (tev->args[i].name) 1113 1009 pev->args[i].name = strdup(tev->args[i].name); 1114 1010 else { 1115 - ret = 
synthesize_kprobe_trace_arg(&tev->args[i], 1011 + ret = synthesize_probe_trace_arg(&tev->args[i], 1116 1012 buf, 64); 1117 1013 pev->args[i].name = strdup(buf); 1118 1014 } ··· 1163 1059 memset(pev, 0, sizeof(*pev)); 1164 1060 } 1165 1061 1166 - void clear_kprobe_trace_event(struct kprobe_trace_event *tev) 1062 + static void clear_probe_trace_event(struct probe_trace_event *tev) 1167 1063 { 1168 - struct kprobe_trace_arg_ref *ref, *next; 1064 + struct probe_trace_arg_ref *ref, *next; 1169 1065 int i; 1170 1066 1171 1067 if (tev->event) ··· 1226 1122 } 1227 1123 1228 1124 /* Get raw string list of current kprobe_events */ 1229 - static struct strlist *get_kprobe_trace_command_rawlist(int fd) 1125 + static struct strlist *get_probe_trace_command_rawlist(int fd) 1230 1126 { 1231 1127 int ret, idx; 1232 1128 FILE *fp; ··· 1294 1190 int show_perf_probe_events(void) 1295 1191 { 1296 1192 int fd, ret; 1297 - struct kprobe_trace_event tev; 1193 + struct probe_trace_event tev; 1298 1194 struct perf_probe_event pev; 1299 1195 struct strlist *rawlist; 1300 1196 struct str_node *ent; ··· 1311 1207 if (fd < 0) 1312 1208 return fd; 1313 1209 1314 - rawlist = get_kprobe_trace_command_rawlist(fd); 1210 + rawlist = get_probe_trace_command_rawlist(fd); 1315 1211 close(fd); 1316 1212 if (!rawlist) 1317 1213 return -ENOENT; 1318 1214 1319 1215 strlist__for_each(ent, rawlist) { 1320 - ret = parse_kprobe_trace_command(ent->s, &tev); 1216 + ret = parse_probe_trace_command(ent->s, &tev); 1321 1217 if (ret >= 0) { 1322 1218 ret = convert_to_perf_probe_event(&tev, &pev); 1323 1219 if (ret >= 0) 1324 1220 ret = show_perf_probe_event(&pev); 1325 1221 } 1326 1222 clear_perf_probe_event(&pev); 1327 - clear_kprobe_trace_event(&tev); 1223 + clear_probe_trace_event(&tev); 1328 1224 if (ret < 0) 1329 1225 break; 1330 1226 } ··· 1334 1230 } 1335 1231 1336 1232 /* Get current perf-probe event names */ 1337 - static struct strlist *get_kprobe_trace_event_names(int fd, bool include_group) 1233 + static 
struct strlist *get_probe_trace_event_names(int fd, bool include_group) 1338 1234 { 1339 1235 char buf[128]; 1340 1236 struct strlist *sl, *rawlist; 1341 1237 struct str_node *ent; 1342 - struct kprobe_trace_event tev; 1238 + struct probe_trace_event tev; 1343 1239 int ret = 0; 1344 1240 1345 1241 memset(&tev, 0, sizeof(tev)); 1346 - 1347 - rawlist = get_kprobe_trace_command_rawlist(fd); 1242 + rawlist = get_probe_trace_command_rawlist(fd); 1348 1243 sl = strlist__new(true, NULL); 1349 1244 strlist__for_each(ent, rawlist) { 1350 - ret = parse_kprobe_trace_command(ent->s, &tev); 1245 + ret = parse_probe_trace_command(ent->s, &tev); 1351 1246 if (ret < 0) 1352 1247 break; 1353 1248 if (include_group) { ··· 1356 1253 ret = strlist__add(sl, buf); 1357 1254 } else 1358 1255 ret = strlist__add(sl, tev.event); 1359 - clear_kprobe_trace_event(&tev); 1256 + clear_probe_trace_event(&tev); 1360 1257 if (ret < 0) 1361 1258 break; 1362 1259 } ··· 1369 1266 return sl; 1370 1267 } 1371 1268 1372 - static int write_kprobe_trace_event(int fd, struct kprobe_trace_event *tev) 1269 + static int write_probe_trace_event(int fd, struct probe_trace_event *tev) 1373 1270 { 1374 1271 int ret = 0; 1375 - char *buf = synthesize_kprobe_trace_command(tev); 1272 + char *buf = synthesize_probe_trace_command(tev); 1376 1273 1377 1274 if (!buf) { 1378 - pr_debug("Failed to synthesize kprobe trace event.\n"); 1275 + pr_debug("Failed to synthesize probe trace event.\n"); 1379 1276 return -EINVAL; 1380 1277 } 1381 1278 ··· 1428 1325 return ret; 1429 1326 } 1430 1327 1431 - static int __add_kprobe_trace_events(struct perf_probe_event *pev, 1432 - struct kprobe_trace_event *tevs, 1328 + static int __add_probe_trace_events(struct perf_probe_event *pev, 1329 + struct probe_trace_event *tevs, 1433 1330 int ntevs, bool allow_suffix) 1434 1331 { 1435 1332 int i, fd, ret; 1436 - struct kprobe_trace_event *tev = NULL; 1333 + struct probe_trace_event *tev = NULL; 1437 1334 char buf[64]; 1438 1335 const char 
*event, *group; 1439 1336 struct strlist *namelist; ··· 1442 1339 if (fd < 0) 1443 1340 return fd; 1444 1341 /* Get current event names */ 1445 - namelist = get_kprobe_trace_event_names(fd, false); 1342 + namelist = get_probe_trace_event_names(fd, false); 1446 1343 if (!namelist) { 1447 1344 pr_debug("Failed to get current event list.\n"); 1448 1345 return -EIO; ··· 1477 1374 ret = -ENOMEM; 1478 1375 break; 1479 1376 } 1480 - ret = write_kprobe_trace_event(fd, tev); 1377 + ret = write_probe_trace_event(fd, tev); 1481 1378 if (ret < 0) 1482 1379 break; 1483 1380 /* Add added event name to namelist */ ··· 1514 1411 return ret; 1515 1412 } 1516 1413 1517 - static int convert_to_kprobe_trace_events(struct perf_probe_event *pev, 1518 - struct kprobe_trace_event **tevs, 1414 + static int convert_to_probe_trace_events(struct perf_probe_event *pev, 1415 + struct probe_trace_event **tevs, 1519 1416 int max_tevs) 1520 1417 { 1521 1418 struct symbol *sym; 1522 1419 int ret = 0, i; 1523 - struct kprobe_trace_event *tev; 1420 + struct probe_trace_event *tev; 1524 1421 1525 1422 /* Convert perf_probe_event with debuginfo */ 1526 - ret = try_to_find_kprobe_trace_events(pev, tevs, max_tevs); 1423 + ret = try_to_find_probe_trace_events(pev, tevs, max_tevs); 1527 1424 if (ret != 0) 1528 1425 return ret; 1529 1426 1530 1427 /* Allocate trace event buffer */ 1531 - tev = *tevs = zalloc(sizeof(struct kprobe_trace_event)); 1428 + tev = *tevs = zalloc(sizeof(struct probe_trace_event)); 1532 1429 if (tev == NULL) 1533 1430 return -ENOMEM; 1534 1431 ··· 1541 1438 tev->point.offset = pev->point.offset; 1542 1439 tev->nargs = pev->nargs; 1543 1440 if (tev->nargs) { 1544 - tev->args = zalloc(sizeof(struct kprobe_trace_arg) 1441 + tev->args = zalloc(sizeof(struct probe_trace_arg) 1545 1442 * tev->nargs); 1546 1443 if (tev->args == NULL) { 1547 1444 ret = -ENOMEM; ··· 1582 1479 1583 1480 return 1; 1584 1481 error: 1585 - clear_kprobe_trace_event(tev); 1482 + clear_probe_trace_event(tev); 1586 
1483 free(tev); 1587 1484 *tevs = NULL; 1588 1485 return ret; ··· 1590 1487 1591 1488 struct __event_package { 1592 1489 struct perf_probe_event *pev; 1593 - struct kprobe_trace_event *tevs; 1490 + struct probe_trace_event *tevs; 1594 1491 int ntevs; 1595 1492 }; 1596 1493 ··· 1613 1510 for (i = 0; i < npevs; i++) { 1614 1511 pkgs[i].pev = &pevs[i]; 1615 1512 /* Convert with or without debuginfo */ 1616 - ret = convert_to_kprobe_trace_events(pkgs[i].pev, 1513 + ret = convert_to_probe_trace_events(pkgs[i].pev, 1617 1514 &pkgs[i].tevs, max_tevs); 1618 1515 if (ret < 0) 1619 1516 goto end; ··· 1622 1519 1623 1520 /* Loop 2: add all events */ 1624 1521 for (i = 0; i < npevs && ret >= 0; i++) 1625 - ret = __add_kprobe_trace_events(pkgs[i].pev, pkgs[i].tevs, 1522 + ret = __add_probe_trace_events(pkgs[i].pev, pkgs[i].tevs, 1626 1523 pkgs[i].ntevs, force_add); 1627 1524 end: 1628 1525 /* Loop 3: cleanup trace events */ 1629 1526 for (i = 0; i < npevs; i++) 1630 1527 for (j = 0; j < pkgs[i].ntevs; j++) 1631 - clear_kprobe_trace_event(&pkgs[i].tevs[j]); 1528 + clear_probe_trace_event(&pkgs[i].tevs[j]); 1632 1529 1633 1530 return ret; 1634 1531 } 1635 1532 1636 - static int __del_trace_kprobe_event(int fd, struct str_node *ent) 1533 + static int __del_trace_probe_event(int fd, struct str_node *ent) 1637 1534 { 1638 1535 char *p; 1639 1536 char buf[128]; 1640 1537 int ret; 1641 1538 1642 - /* Convert from perf-probe event to trace-kprobe event */ 1539 + /* Convert from perf-probe event to trace-probe event */ 1643 1540 ret = e_snprintf(buf, 128, "-:%s", ent->s); 1644 1541 if (ret < 0) 1645 1542 goto error; ··· 1665 1562 return ret; 1666 1563 } 1667 1564 1668 - static int del_trace_kprobe_event(int fd, const char *group, 1565 + static int del_trace_probe_event(int fd, const char *group, 1669 1566 const char *event, struct strlist *namelist) 1670 1567 { 1671 1568 char buf[128]; ··· 1682 1579 strlist__for_each_safe(ent, n, namelist) 1683 1580 if (strglobmatch(ent->s, buf)) { 1684 
1581 found++; 1685 - ret = __del_trace_kprobe_event(fd, ent); 1582 + ret = __del_trace_probe_event(fd, ent); 1686 1583 if (ret < 0) 1687 1584 break; 1688 1585 strlist__remove(namelist, ent); ··· 1691 1588 ent = strlist__find(namelist, buf); 1692 1589 if (ent) { 1693 1590 found++; 1694 - ret = __del_trace_kprobe_event(fd, ent); 1591 + ret = __del_trace_probe_event(fd, ent); 1695 1592 if (ret >= 0) 1696 1593 strlist__remove(namelist, ent); 1697 1594 } ··· 1715 1612 return fd; 1716 1613 1717 1614 /* Get current event names */ 1718 - namelist = get_kprobe_trace_event_names(fd, true); 1615 + namelist = get_probe_trace_event_names(fd, true); 1719 1616 if (namelist == NULL) 1720 1617 return -EINVAL; 1721 1618 ··· 1736 1633 event = str; 1737 1634 } 1738 1635 pr_debug("Group: %s, Event: %s\n", group, event); 1739 - ret = del_trace_kprobe_event(fd, group, event, namelist); 1636 + ret = del_trace_probe_event(fd, group, event, namelist); 1740 1637 free(str); 1741 1638 if (ret < 0) 1742 1639 break;
+12 -17
tools/perf/util/probe-event.h
··· 7 7 extern bool probe_event_dry_run; 8 8 9 9 /* kprobe-tracer tracing point */ 10 - struct kprobe_trace_point { 10 + struct probe_trace_point { 11 11 char *symbol; /* Base symbol */ 12 12 unsigned long offset; /* Offset from symbol */ 13 13 bool retprobe; /* Return probe flag */ 14 14 }; 15 15 16 - /* kprobe-tracer tracing argument referencing offset */ 17 - struct kprobe_trace_arg_ref { 18 - struct kprobe_trace_arg_ref *next; /* Next reference */ 16 + /* probe-tracer tracing argument referencing offset */ 17 + struct probe_trace_arg_ref { 18 + struct probe_trace_arg_ref *next; /* Next reference */ 19 19 long offset; /* Offset value */ 20 20 }; 21 21 22 22 /* kprobe-tracer tracing argument */ 23 - struct kprobe_trace_arg { 23 + struct probe_trace_arg { 24 24 char *name; /* Argument name */ 25 25 char *value; /* Base value */ 26 26 char *type; /* Type name */ 27 - struct kprobe_trace_arg_ref *ref; /* Referencing offset */ 27 + struct probe_trace_arg_ref *ref; /* Referencing offset */ 28 28 }; 29 29 30 30 /* kprobe-tracer tracing event (point + arg) */ 31 - struct kprobe_trace_event { 31 + struct probe_trace_event { 32 32 char *event; /* Event name */ 33 33 char *group; /* Group name */ 34 - struct kprobe_trace_point point; /* Trace point */ 34 + struct probe_trace_point point; /* Trace point */ 35 35 int nargs; /* Number of args */ 36 - struct kprobe_trace_arg *args; /* Arguments */ 36 + struct probe_trace_arg *args; /* Arguments */ 37 37 }; 38 38 39 39 /* Perf probe probing point */ ··· 50 50 struct perf_probe_arg_field { 51 51 struct perf_probe_arg_field *next; /* Next field */ 52 52 char *name; /* Name of the field */ 53 + long index; /* Array index number */ 53 54 bool ref; /* Referencing flag */ 54 55 }; 55 56 ··· 86 85 int end; /* End line number */ 87 86 int offset; /* Start line offset */ 88 87 char *path; /* Real path name */ 88 + char *comp_dir; /* Compile directory */ 89 89 struct list_head line_list; /* Visible lines */ 90 90 }; 91 91 92 92 /* 
Command string to events */ 93 93 extern int parse_perf_probe_command(const char *cmd, 94 94 struct perf_probe_event *pev); 95 - extern int parse_kprobe_trace_command(const char *cmd, 96 - struct kprobe_trace_event *tev); 97 95 98 96 /* Events to command string */ 99 97 extern char *synthesize_perf_probe_command(struct perf_probe_event *pev); 100 - extern char *synthesize_kprobe_trace_command(struct kprobe_trace_event *tev); 98 + extern char *synthesize_probe_trace_command(struct probe_trace_event *tev); 101 99 extern int synthesize_perf_probe_arg(struct perf_probe_arg *pa, char *buf, 102 100 size_t len); 103 101 104 102 /* Check the perf_probe_event needs debuginfo */ 105 103 extern bool perf_probe_event_need_dwarf(struct perf_probe_event *pev); 106 104 107 - /* Convert from kprobe_trace_event to perf_probe_event */ 108 - extern int convert_to_perf_probe_event(struct kprobe_trace_event *tev, 109 - struct perf_probe_event *pev); 110 - 111 105 /* Release event contents */ 112 106 extern void clear_perf_probe_event(struct perf_probe_event *pev); 113 - extern void clear_kprobe_trace_event(struct kprobe_trace_event *tev); 114 107 115 108 /* Command string to line-range */ 116 109 extern int parse_line_range_desc(const char *cmd, struct line_range *lr);
+190 -58
tools/perf/util/probe-finder.c
··· 37 37 #include "event.h" 38 38 #include "debug.h" 39 39 #include "util.h" 40 + #include "symbol.h" 40 41 #include "probe-finder.h" 41 42 42 43 /* Kprobe tracer basic type is up to u64 */ ··· 144 143 return src; 145 144 } 146 145 146 + /* Get DW_AT_comp_dir (should be NULL with older gcc) */ 147 + static const char *cu_get_comp_dir(Dwarf_Die *cu_die) 148 + { 149 + Dwarf_Attribute attr; 150 + if (dwarf_attr(cu_die, DW_AT_comp_dir, &attr) == NULL) 151 + return NULL; 152 + return dwarf_formstring(&attr); 153 + } 154 + 147 155 /* Compare diename and tname */ 148 156 static bool die_compare_name(Dwarf_Die *dw_die, const char *tname) 149 157 { 150 158 const char *name; 151 159 name = dwarf_diename(dw_die); 152 - return name ? strcmp(tname, name) : -1; 160 + return name ? (strcmp(tname, name) == 0) : false; 153 161 } 154 162 155 163 /* Get type die, but skip qualifiers and typedef */ ··· 329 319 tag = dwarf_tag(die_mem); 330 320 if ((tag == DW_TAG_formal_parameter || 331 321 tag == DW_TAG_variable) && 332 - (die_compare_name(die_mem, name) == 0)) 322 + die_compare_name(die_mem, name)) 333 323 return DIE_FIND_CB_FOUND; 334 324 335 325 return DIE_FIND_CB_CONTINUE; ··· 348 338 const char *name = data; 349 339 350 340 if ((dwarf_tag(die_mem) == DW_TAG_member) && 351 - (die_compare_name(die_mem, name) == 0)) 341 + die_compare_name(die_mem, name)) 352 342 return DIE_FIND_CB_FOUND; 353 343 354 344 return DIE_FIND_CB_SIBLING; ··· 366 356 * Probe finder related functions 367 357 */ 368 358 369 - /* Show a location */ 370 - static int convert_location(Dwarf_Op *op, struct probe_finder *pf) 359 + static struct probe_trace_arg_ref *alloc_trace_arg_ref(long offs) 371 360 { 361 + struct probe_trace_arg_ref *ref; 362 + ref = zalloc(sizeof(struct probe_trace_arg_ref)); 363 + if (ref != NULL) 364 + ref->offset = offs; 365 + return ref; 366 + } 367 + 368 + /* Show a location */ 369 + static int convert_variable_location(Dwarf_Die *vr_die, struct probe_finder *pf) 370 + { 371 + 
Dwarf_Attribute attr; 372 + Dwarf_Op *op; 373 + size_t nops; 372 374 unsigned int regn; 373 375 Dwarf_Word offs = 0; 374 376 bool ref = false; 375 377 const char *regs; 376 - struct kprobe_trace_arg *tvar = pf->tvar; 378 + struct probe_trace_arg *tvar = pf->tvar; 379 + int ret; 380 + 381 + /* TODO: handle more than 1 exprs */ 382 + if (dwarf_attr(vr_die, DW_AT_location, &attr) == NULL || 383 + dwarf_getlocation_addr(&attr, pf->addr, &op, &nops, 1) <= 0 || 384 + nops == 0) { 385 + /* TODO: Support const_value */ 386 + pr_err("Failed to find the location of %s at this address.\n" 387 + " Perhaps, it has been optimized out.\n", pf->pvar->var); 388 + return -ENOENT; 389 + } 390 + 391 + if (op->atom == DW_OP_addr) { 392 + /* Static variables on memory (not stack), make @varname */ 393 + ret = strlen(dwarf_diename(vr_die)); 394 + tvar->value = zalloc(ret + 2); 395 + if (tvar->value == NULL) 396 + return -ENOMEM; 397 + snprintf(tvar->value, ret + 2, "@%s", dwarf_diename(vr_die)); 398 + tvar->ref = alloc_trace_arg_ref((long)offs); 399 + if (tvar->ref == NULL) 400 + return -ENOMEM; 401 + return 0; 402 + } 377 403 378 404 /* If this is based on frame buffer, set the offset */ 379 405 if (op->atom == DW_OP_fbreg) { ··· 451 405 return -ENOMEM; 452 406 453 407 if (ref) { 454 - tvar->ref = zalloc(sizeof(struct kprobe_trace_arg_ref)); 408 + tvar->ref = alloc_trace_arg_ref((long)offs); 455 409 if (tvar->ref == NULL) 456 410 return -ENOMEM; 457 - tvar->ref->offset = (long)offs; 458 411 } 459 412 return 0; 460 413 } 461 414 462 415 static int convert_variable_type(Dwarf_Die *vr_die, 463 - struct kprobe_trace_arg *targ) 416 + struct probe_trace_arg *tvar, 417 + const char *cast) 464 418 { 419 + struct probe_trace_arg_ref **ref_ptr = &tvar->ref; 465 420 Dwarf_Die type; 466 421 char buf[16]; 467 422 int ret; 423 + 424 + /* TODO: check all types */ 425 + if (cast && strcmp(cast, "string") != 0) { 426 + /* Non string type is OK */ 427 + tvar->type = strdup(cast); 428 + return (tvar->type 
== NULL) ? -ENOMEM : 0; 429 + } 468 430 469 431 if (die_get_real_type(vr_die, &type) == NULL) { 470 432 pr_warning("Failed to get a type information of %s.\n", 471 433 dwarf_diename(vr_die)); 472 434 return -ENOENT; 435 + } 436 + 437 + pr_debug("%s type is %s.\n", 438 + dwarf_diename(vr_die), dwarf_diename(&type)); 439 + 440 + if (cast && strcmp(cast, "string") == 0) { /* String type */ 441 + ret = dwarf_tag(&type); 442 + if (ret != DW_TAG_pointer_type && 443 + ret != DW_TAG_array_type) { 444 + pr_warning("Failed to cast into string: " 445 + "%s(%s) is not a pointer nor array.", 446 + dwarf_diename(vr_die), dwarf_diename(&type)); 447 + return -EINVAL; 448 + } 449 + if (ret == DW_TAG_pointer_type) { 450 + if (die_get_real_type(&type, &type) == NULL) { 451 + pr_warning("Failed to get a type information."); 452 + return -ENOENT; 453 + } 454 + while (*ref_ptr) 455 + ref_ptr = &(*ref_ptr)->next; 456 + /* Add new reference with offset +0 */ 457 + *ref_ptr = zalloc(sizeof(struct probe_trace_arg_ref)); 458 + if (*ref_ptr == NULL) { 459 + pr_warning("Out of memory error\n"); 460 + return -ENOMEM; 461 + } 462 + } 463 + if (!die_compare_name(&type, "char") && 464 + !die_compare_name(&type, "unsigned char")) { 465 + pr_warning("Failed to cast into string: " 466 + "%s is not (unsigned) char *.", 467 + dwarf_diename(vr_die)); 468 + return -EINVAL; 469 + } 470 + tvar->type = strdup(cast); 471 + return (tvar->type == NULL) ? 
-ENOMEM : 0; 473 472 } 474 473 475 474 ret = die_get_byte_size(&type) * 8; ··· 536 445 strerror(-ret)); 537 446 return ret; 538 447 } 539 - targ->type = strdup(buf); 540 - if (targ->type == NULL) 448 + tvar->type = strdup(buf); 449 + if (tvar->type == NULL) 541 450 return -ENOMEM; 542 451 } 543 452 return 0; ··· 545 454 546 455 static int convert_variable_fields(Dwarf_Die *vr_die, const char *varname, 547 456 struct perf_probe_arg_field *field, 548 - struct kprobe_trace_arg_ref **ref_ptr, 457 + struct probe_trace_arg_ref **ref_ptr, 549 458 Dwarf_Die *die_mem) 550 459 { 551 - struct kprobe_trace_arg_ref *ref = *ref_ptr; 460 + struct probe_trace_arg_ref *ref = *ref_ptr; 552 461 Dwarf_Die type; 553 462 Dwarf_Word offs; 554 - int ret; 463 + int ret, tag; 555 464 556 465 pr_debug("converting %s in %s\n", field->name, varname); 557 466 if (die_get_real_type(vr_die, &type) == NULL) { 558 467 pr_warning("Failed to get the type of %s.\n", varname); 559 468 return -ENOENT; 560 469 } 470 + pr_debug2("Var real type: (%x)\n", (unsigned)dwarf_dieoffset(&type)); 471 + tag = dwarf_tag(&type); 561 472 562 - /* Check the pointer and dereference */ 563 - if (dwarf_tag(&type) == DW_TAG_pointer_type) { 473 + if (field->name[0] == '[' && 474 + (tag == DW_TAG_array_type || tag == DW_TAG_pointer_type)) { 475 + if (field->next) 476 + /* Save original type for next field */ 477 + memcpy(die_mem, &type, sizeof(*die_mem)); 478 + /* Get the type of this array */ 479 + if (die_get_real_type(&type, &type) == NULL) { 480 + pr_warning("Failed to get the type of %s.\n", varname); 481 + return -ENOENT; 482 + } 483 + pr_debug2("Array real type: (%x)\n", 484 + (unsigned)dwarf_dieoffset(&type)); 485 + if (tag == DW_TAG_pointer_type) { 486 + ref = zalloc(sizeof(struct probe_trace_arg_ref)); 487 + if (ref == NULL) 488 + return -ENOMEM; 489 + if (*ref_ptr) 490 + (*ref_ptr)->next = ref; 491 + else 492 + *ref_ptr = ref; 493 + } 494 + ref->offset += die_get_byte_size(&type) * field->index; 495 + if 
(!field->next) 496 + /* Save vr_die for converting types */ 497 + memcpy(die_mem, vr_die, sizeof(*die_mem)); 498 + goto next; 499 + } else if (tag == DW_TAG_pointer_type) { 500 + /* Check the pointer and dereference */ 564 501 if (!field->ref) { 565 502 pr_err("Semantic error: %s must be referred by '->'\n", 566 503 field->name); ··· 605 486 return -EINVAL; 606 487 } 607 488 608 - ref = zalloc(sizeof(struct kprobe_trace_arg_ref)); 489 + ref = zalloc(sizeof(struct probe_trace_arg_ref)); 609 490 if (ref == NULL) 610 491 return -ENOMEM; 611 492 if (*ref_ptr) ··· 614 495 *ref_ptr = ref; 615 496 } else { 616 497 /* Verify it is a data structure */ 617 - if (dwarf_tag(&type) != DW_TAG_structure_type) { 498 + if (tag != DW_TAG_structure_type) { 618 499 pr_warning("%s is not a data structure.\n", varname); 500 + return -EINVAL; 501 + } 502 + if (field->name[0] == '[') { 503 + pr_err("Semantic error: %s is not a pointor nor array.", 504 + varname); 619 505 return -EINVAL; 620 506 } 621 507 if (field->ref) { ··· 649 525 } 650 526 ref->offset += (long)offs; 651 527 528 + next: 652 529 /* Converting next field */ 653 530 if (field->next) 654 531 return convert_variable_fields(die_mem, field->name, ··· 661 536 /* Show a variables in kprobe event format */ 662 537 static int convert_variable(Dwarf_Die *vr_die, struct probe_finder *pf) 663 538 { 664 - Dwarf_Attribute attr; 665 539 Dwarf_Die die_mem; 666 - Dwarf_Op *expr; 667 - size_t nexpr; 668 540 int ret; 669 541 670 - if (dwarf_attr(vr_die, DW_AT_location, &attr) == NULL) 671 - goto error; 672 - /* TODO: handle more than 1 exprs */ 673 - ret = dwarf_getlocation_addr(&attr, pf->addr, &expr, &nexpr, 1); 674 - if (ret <= 0 || nexpr == 0) 675 - goto error; 542 + pr_debug("Converting variable %s into trace event.\n", 543 + dwarf_diename(vr_die)); 676 544 677 - ret = convert_location(expr, pf); 545 + ret = convert_variable_location(vr_die, pf); 678 546 if (ret == 0 && pf->pvar->field) { 679 547 ret = convert_variable_fields(vr_die, 
pf->pvar->var, 680 548 pf->pvar->field, &pf->tvar->ref, 681 549 &die_mem); 682 550 vr_die = &die_mem; 683 551 } 684 - if (ret == 0) { 685 - if (pf->pvar->type) { 686 - pf->tvar->type = strdup(pf->pvar->type); 687 - if (pf->tvar->type == NULL) 688 - ret = -ENOMEM; 689 - } else 690 - ret = convert_variable_type(vr_die, pf->tvar); 691 - } 552 + if (ret == 0) 553 + ret = convert_variable_type(vr_die, pf->tvar, pf->pvar->type); 692 554 /* *expr will be cached in libdw. Don't free it. */ 693 555 return ret; 694 - error: 695 - /* TODO: Support const_value */ 696 - pr_err("Failed to find the location of %s at this address.\n" 697 - " Perhaps, it has been optimized out.\n", pf->pvar->var); 698 - return -ENOENT; 699 556 } 700 557 701 558 /* Find a variable in a subprogram die */ 702 559 static int find_variable(Dwarf_Die *sp_die, struct probe_finder *pf) 703 560 { 704 - Dwarf_Die vr_die; 561 + Dwarf_Die vr_die, *scopes; 705 562 char buf[32], *ptr; 706 - int ret; 563 + int ret, nscopes; 707 564 708 - /* TODO: Support arrays */ 709 565 if (pf->pvar->name) 710 566 pf->tvar->name = strdup(pf->pvar->name); 711 567 else { ··· 713 607 pr_debug("Searching '%s' variable in context.\n", 714 608 pf->pvar->var); 715 609 /* Search child die for local variables and parameters. 
*/ 716 - if (!die_find_variable(sp_die, pf->pvar->var, &vr_die)) { 610 + if (die_find_variable(sp_die, pf->pvar->var, &vr_die)) 611 + ret = convert_variable(&vr_die, pf); 612 + else { 613 + /* Search upper class */ 614 + nscopes = dwarf_getscopes_die(sp_die, &scopes); 615 + if (nscopes > 0) { 616 + ret = dwarf_getscopevar(scopes, nscopes, pf->pvar->var, 617 + 0, NULL, 0, 0, &vr_die); 618 + if (ret >= 0) 619 + ret = convert_variable(&vr_die, pf); 620 + else 621 + ret = -ENOENT; 622 + free(scopes); 623 + } else 624 + ret = -ENOENT; 625 + } 626 + if (ret < 0) 717 627 pr_warning("Failed to find '%s' in this function.\n", 718 628 pf->pvar->var); 719 - return -ENOENT; 720 - } 721 - return convert_variable(&vr_die, pf); 629 + return ret; 722 630 } 723 631 724 632 /* Show a probe point to output buffer */ 725 633 static int convert_probe_point(Dwarf_Die *sp_die, struct probe_finder *pf) 726 634 { 727 - struct kprobe_trace_event *tev; 635 + struct probe_trace_event *tev; 728 636 Dwarf_Addr eaddr; 729 637 Dwarf_Die die_mem; 730 638 const char *name; ··· 803 683 804 684 /* Find each argument */ 805 685 tev->nargs = pf->pev->nargs; 806 - tev->args = zalloc(sizeof(struct kprobe_trace_arg) * tev->nargs); 686 + tev->args = zalloc(sizeof(struct probe_trace_arg) * tev->nargs); 807 687 if (tev->args == NULL) 808 688 return -ENOMEM; 809 689 for (i = 0; i < pf->pev->nargs; i++) { ··· 1017 897 1018 898 /* Check tag and diename */ 1019 899 if (dwarf_tag(sp_die) != DW_TAG_subprogram || 1020 - die_compare_name(sp_die, pp->function) != 0) 900 + !die_compare_name(sp_die, pp->function)) 1021 901 return DWARF_CB_OK; 1022 902 1023 903 pf->fname = dwarf_decl_file(sp_die); ··· 1060 940 return _param.retval; 1061 941 } 1062 942 1063 - /* Find kprobe_trace_events specified by perf_probe_event from debuginfo */ 1064 - int find_kprobe_trace_events(int fd, struct perf_probe_event *pev, 1065 - struct kprobe_trace_event **tevs, int max_tevs) 943 + /* Find probe_trace_events specified by 
perf_probe_event from debuginfo */ 944 + int find_probe_trace_events(int fd, struct perf_probe_event *pev, 945 + struct probe_trace_event **tevs, int max_tevs) 1066 946 { 1067 947 struct probe_finder pf = {.pev = pev, .max_tevs = max_tevs}; 1068 948 struct perf_probe_point *pp = &pev->point; ··· 1072 952 Dwarf *dbg; 1073 953 int ret = 0; 1074 954 1075 - pf.tevs = zalloc(sizeof(struct kprobe_trace_event) * max_tevs); 955 + pf.tevs = zalloc(sizeof(struct probe_trace_event) * max_tevs); 1076 956 if (pf.tevs == NULL) 1077 957 return -ENOMEM; 1078 958 *tevs = pf.tevs; ··· 1216 1096 static int line_range_add_line(const char *src, unsigned int lineno, 1217 1097 struct line_range *lr) 1218 1098 { 1219 - /* Copy real path */ 1099 + /* Copy source path */ 1220 1100 if (!lr->path) { 1221 1101 lr->path = strdup(src); 1222 1102 if (lr->path == NULL) ··· 1340 1220 struct line_range *lr = lf->lr; 1341 1221 1342 1222 if (dwarf_tag(sp_die) == DW_TAG_subprogram && 1343 - die_compare_name(sp_die, lr->function) == 0) { 1223 + die_compare_name(sp_die, lr->function)) { 1344 1224 lf->fname = dwarf_decl_file(sp_die); 1345 1225 dwarf_decl_line(sp_die, &lr->offset); 1346 1226 pr_debug("fname: %s, lineno:%d\n", lf->fname, lr->offset); ··· 1383 1263 size_t cuhl; 1384 1264 Dwarf_Die *diep; 1385 1265 Dwarf *dbg; 1266 + const char *comp_dir; 1386 1267 1387 1268 dbg = dwarf_begin(fd, DWARF_C_READ); 1388 1269 if (!dbg) { ··· 1419 1298 } 1420 1299 off = noff; 1421 1300 } 1422 - pr_debug("path: %lx\n", (unsigned long)lr->path); 1301 + 1302 + /* Store comp_dir */ 1303 + if (lf.found) { 1304 + comp_dir = cu_get_comp_dir(&lf.cu_die); 1305 + if (comp_dir) { 1306 + lr->comp_dir = strdup(comp_dir); 1307 + if (!lr->comp_dir) 1308 + ret = -ENOMEM; 1309 + } 1310 + } 1311 + 1312 + pr_debug("path: %s\n", lr->path); 1423 1313 dwarf_end(dbg); 1424 1314 1425 1315 return (ret < 0) ? ret : lf.found;
+5 -5
tools/perf/util/probe-finder.h
··· 16 16 } 17 17 18 18 #ifdef DWARF_SUPPORT 19 - /* Find kprobe_trace_events specified by perf_probe_event from debuginfo */ 20 - extern int find_kprobe_trace_events(int fd, struct perf_probe_event *pev, 21 - struct kprobe_trace_event **tevs, 19 + /* Find probe_trace_events specified by perf_probe_event from debuginfo */ 20 + extern int find_probe_trace_events(int fd, struct perf_probe_event *pev, 21 + struct probe_trace_event **tevs, 22 22 int max_tevs); 23 23 24 24 /* Find a perf_probe_point from debuginfo */ ··· 33 33 34 34 struct probe_finder { 35 35 struct perf_probe_event *pev; /* Target probe event */ 36 - struct kprobe_trace_event *tevs; /* Result trace events */ 36 + struct probe_trace_event *tevs; /* Result trace events */ 37 37 int ntevs; /* Number of trace events */ 38 38 int max_tevs; /* Max number of trace events */ 39 39 ··· 50 50 #endif 51 51 Dwarf_Op *fb_ops; /* Frame base attribute */ 52 52 struct perf_probe_arg *pvar; /* Current target variable */ 53 - struct kprobe_trace_arg *tvar; /* Current result variable */ 53 + struct probe_trace_arg *tvar; /* Current result variable */ 54 54 }; 55 55 56 56 struct line_finder {
+39 -23
tools/perf/util/session.c
··· 27 27 28 28 self->fd = open(self->filename, O_RDONLY); 29 29 if (self->fd < 0) { 30 - pr_err("failed to open file: %s", self->filename); 31 - if (!strcmp(self->filename, "perf.data")) 30 + int err = errno; 31 + 32 + pr_err("failed to open %s: %s", self->filename, strerror(err)); 33 + if (err == ENOENT && !strcmp(self->filename, "perf.data")) 32 34 pr_err(" (try 'perf record' first)"); 33 35 pr_err("\n"); 34 36 return -errno; ··· 79 77 return ret; 80 78 } 81 79 80 + static void perf_session__destroy_kernel_maps(struct perf_session *self) 81 + { 82 + machine__destroy_kernel_maps(&self->host_machine); 83 + machines__destroy_guest_kernel_maps(&self->machines); 84 + } 85 + 82 86 struct perf_session *perf_session__new(const char *filename, int mode, bool force, bool repipe) 83 87 { 84 88 size_t len = filename ? strlen(filename) + 1 : 0; ··· 102 94 self->hists_tree = RB_ROOT; 103 95 self->last_match = NULL; 104 96 self->mmap_window = 32; 105 - self->cwd = NULL; 106 - self->cwdlen = 0; 107 97 self->machines = RB_ROOT; 108 98 self->repipe = repipe; 109 99 INIT_LIST_HEAD(&self->ordered_samples.samples_head); ··· 130 124 return NULL; 131 125 } 132 126 127 + static void perf_session__delete_dead_threads(struct perf_session *self) 128 + { 129 + struct thread *n, *t; 130 + 131 + list_for_each_entry_safe(t, n, &self->dead_threads, node) { 132 + list_del(&t->node); 133 + thread__delete(t); 134 + } 135 + } 136 + 137 + static void perf_session__delete_threads(struct perf_session *self) 138 + { 139 + struct rb_node *nd = rb_first(&self->threads); 140 + 141 + while (nd) { 142 + struct thread *t = rb_entry(nd, struct thread, rb_node); 143 + 144 + rb_erase(&t->rb_node, &self->threads); 145 + nd = rb_next(nd); 146 + thread__delete(t); 147 + } 148 + } 149 + 133 150 void perf_session__delete(struct perf_session *self) 134 151 { 135 152 perf_header__exit(&self->header); 153 + perf_session__destroy_kernel_maps(self); 154 + perf_session__delete_dead_threads(self); 155 + 
perf_session__delete_threads(self); 156 + machine__exit(&self->host_machine); 136 157 close(self->fd); 137 - free(self->cwd); 138 158 free(self); 139 159 } 140 160 141 161 void perf_session__remove_thread(struct perf_session *self, struct thread *th) 142 162 { 163 + self->last_match = NULL; 143 164 rb_erase(&th->rb_node, &self->threads); 144 165 /* 145 166 * We may have references to this thread, for instance in some hist_entry ··· 863 830 if (perf_session__register_idle_thread(self) == NULL) 864 831 return -ENOMEM; 865 832 866 - if (!symbol_conf.full_paths) { 867 - char bf[PATH_MAX]; 868 - 869 - if (getcwd(bf, sizeof(bf)) == NULL) { 870 - err = -errno; 871 - out_getcwd_err: 872 - pr_err("failed to get the current directory\n"); 873 - goto out_err; 874 - } 875 - self->cwd = strdup(bf); 876 - if (self->cwd == NULL) { 877 - err = -ENOMEM; 878 - goto out_getcwd_err; 879 - } 880 - self->cwdlen = strlen(self->cwd); 881 - } 882 - 883 833 if (!self->fd_pipe) 884 834 err = __perf_session__process_events(self, 885 835 self->header.data_offset, ··· 870 854 self->size, ops); 871 855 else 872 856 err = __perf_session__process_pipe_events(self, ops); 873 - out_err: 857 + 874 858 return err; 875 859 } 876 860
+32 -8
tools/perf/util/sort.c
···
 #include "sort.h"
+#include "hist.h"

 regex_t parent_regex;
 const char default_parent_pattern[] = "^sys_|^do_page_fault";
···
 enum sort_type	sort__first_dimension;

-unsigned int dsos__col_width;
-unsigned int comms__col_width;
-unsigned int threads__col_width;
-static unsigned int parent_symbol__col_width;
 char * field_sep;

 LIST_HEAD(hist_entry__sort_list);
···
				       size_t size, unsigned int width);
 static int hist_entry__parent_snprintf(struct hist_entry *self, char *bf,
				       size_t size, unsigned int width);
+static int hist_entry__cpu_snprintf(struct hist_entry *self, char *bf,
+				    size_t size, unsigned int width);

 struct sort_entry sort_thread = {
	.se_header	= "Command: Pid",
	.se_cmp		= sort__thread_cmp,
	.se_snprintf	= hist_entry__thread_snprintf,
-	.se_width	= &threads__col_width,
+	.se_width_idx	= HISTC_THREAD,
 };

 struct sort_entry sort_comm = {
···
	.se_cmp		= sort__comm_cmp,
	.se_collapse	= sort__comm_collapse,
	.se_snprintf	= hist_entry__comm_snprintf,
-	.se_width	= &comms__col_width,
+	.se_width_idx	= HISTC_COMM,
 };

 struct sort_entry sort_dso = {
	.se_header	= "Shared Object",
	.se_cmp		= sort__dso_cmp,
	.se_snprintf	= hist_entry__dso_snprintf,
-	.se_width	= &dsos__col_width,
+	.se_width_idx	= HISTC_DSO,
 };

 struct sort_entry sort_sym = {
	.se_header	= "Symbol",
	.se_cmp		= sort__sym_cmp,
	.se_snprintf	= hist_entry__sym_snprintf,
+	.se_width_idx	= HISTC_SYMBOL,
 };

 struct sort_entry sort_parent = {
	.se_header	= "Parent symbol",
	.se_cmp		= sort__parent_cmp,
	.se_snprintf	= hist_entry__parent_snprintf,
-	.se_width	= &parent_symbol__col_width,
+	.se_width_idx	= HISTC_PARENT,
+};
+
+struct sort_entry sort_cpu = {
+	.se_header	= "CPU",
+	.se_cmp		= sort__cpu_cmp,
+	.se_snprintf	= hist_entry__cpu_snprintf,
+	.se_width_idx	= HISTC_CPU,
 };

 struct sort_dimension {
···
	{ .name = "dso",	.entry = &sort_dso,	},
	{ .name = "symbol",	.entry = &sort_sym,	},
	{ .name = "parent",	.entry = &sort_parent,	},
+	{ .name = "cpu",	.entry = &sort_cpu,	},
 };

 int64_t cmp_null(void *l, void *r)
···
			      self->parent ? self->parent->name : "[other]");
 }

+/* --sort cpu */
+
+int64_t
+sort__cpu_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+	return right->cpu - left->cpu;
+}
+
+static int hist_entry__cpu_snprintf(struct hist_entry *self, char *bf,
+				    size_t size, unsigned int width)
+{
+	return repsep_snprintf(bf, size, "%-*d", width, self->cpu);
+}
+
 int sort_dimension__add(const char *tok)
 {
	unsigned int i;
···
			sort__first_dimension = SORT_SYM;
		else if (!strcmp(sd->name, "parent"))
			sort__first_dimension = SORT_PARENT;
+		else if (!strcmp(sd->name, "cpu"))
+			sort__first_dimension = SORT_CPU;
	}

	list_add_tail(&sd->entry->list, &hist_entry__sort_list);
+17 -5
tools/perf/util/sort.h
···
 extern struct sort_entry sort_dso;
 extern struct sort_entry sort_sym;
 extern struct sort_entry sort_parent;
-extern unsigned int dsos__col_width;
-extern unsigned int comms__col_width;
-extern unsigned int threads__col_width;
 extern enum sort_type sort__first_dimension;

+/**
+ * struct hist_entry - histogram entry
+ *
+ * @row_offset - offset from the first callchain expanded to appear on screen
+ * @nr_rows - rows expanded in callchain, recalculated on folding/unfolding
+ */
 struct hist_entry {
	struct rb_node		rb_node;
	u64			period;
···
	struct map_symbol	ms;
	struct thread		*thread;
	u64			ip;
+	s32			cpu;
	u32			nr_events;
+
+	/* XXX These two should move to some tree widget lib */
+	u16			row_offset;
+	u16			nr_rows;
+
+	bool			init_have_children;
	char			level;
	u8			filtered;
	struct symbol		*parent;
···
	SORT_COMM,
	SORT_DSO,
	SORT_SYM,
-	SORT_PARENT
+	SORT_PARENT,
+	SORT_CPU,
 };

 /*
···
	int64_t (*se_collapse)(struct hist_entry *, struct hist_entry *);
	int	(*se_snprintf)(struct hist_entry *self, char *bf, size_t size,
			       unsigned int width);
-	unsigned int *se_width;
+	u8	se_width_idx;
	bool	elide;
 };

···
 extern int64_t sort__dso_cmp(struct hist_entry *, struct hist_entry *);
 extern int64_t sort__sym_cmp(struct hist_entry *, struct hist_entry *);
 extern int64_t sort__parent_cmp(struct hist_entry *, struct hist_entry *);
+int64_t sort__cpu_cmp(struct hist_entry *left, struct hist_entry *right);
 extern size_t sort__parent_print(FILE *, struct hist_entry *, unsigned int);
 extern int sort_dimension__add(const char *);
 void sort_entry__setup_elide(struct sort_entry *self, struct strlist *list,
+205 -94
tools/perf/util/symbol.c
···
 #include <fcntl.h>
 #include <unistd.h>
 #include "build-id.h"
+#include "debug.h"
 #include "symbol.h"
 #include "strlist.h"
···
 #define NT_GNU_BUILD_ID 3
 #endif

+static bool dso__build_id_equal(const struct dso *self, u8 *build_id);
+static int elf_read_build_id(Elf *elf, void *bf, size_t size);
 static void dsos__add(struct list_head *head, struct dso *dso);
 static struct map *map__new2(u64 start, struct dso *dso, enum map_type type);
 static int dso__load_kernel_sym(struct dso *self, struct map *map,
···
	.use_modules	  = true,
	.try_vmlinux_path = true,
 };
+
+int dso__name_len(const struct dso *self)
+{
+	if (verbose)
+		return self->long_name_len;
+
+	return self->short_name_len;
+}

 bool dso__loaded(const struct dso *self, enum map_type type)
 {
···
	int i;
	for (i = 0; i < MAP__NR_TYPES; ++i)
		symbols__delete(&self->symbols[i]);
-	if (self->long_name != self->name)
+	if (self->sname_alloc)
+		free((char *)self->short_name);
+	if (self->lname_alloc)
		free(self->long_name);
	free(self);
 }
···
	}
 }

+static size_t elf_addr_to_index(Elf *elf, GElf_Addr addr)
+{
+	Elf_Scn *sec = NULL;
+	GElf_Shdr shdr;
+	size_t cnt = 1;
+
+	while ((sec = elf_nextscn(elf, sec)) != NULL) {
+		gelf_getshdr(sec, &shdr);
+
+		if ((addr >= shdr.sh_addr) &&
+		    (addr < (shdr.sh_addr + shdr.sh_size)))
+			return cnt;
+
+		++cnt;
+	}
+
+	return -1;
+}
+
 static int dso__load_sym(struct dso *self, struct map *map, const char *name,
-			 int fd, symbol_filter_t filter, int kmodule)
+			 int fd, symbol_filter_t filter, int kmodule,
+			 int want_symtab)
 {
	struct kmap *kmap = self->kernel ? map__kmap(map) : NULL;
	struct map *curr_map = map;
···
	int err = -1;
	uint32_t idx;
	GElf_Ehdr ehdr;
-	GElf_Shdr shdr;
-	Elf_Data *syms;
+	GElf_Shdr shdr, opdshdr;
+	Elf_Data *syms, *opddata = NULL;
	GElf_Sym sym;
-	Elf_Scn *sec, *sec_strndx;
+	Elf_Scn *sec, *sec_strndx, *opdsec;
	Elf *elf;
	int nr = 0;
+	size_t opdidx = 0;

	elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
	if (elf == NULL) {
-		pr_err("%s: cannot read %s ELF file.\n", __func__, name);
+		pr_debug("%s: cannot read %s ELF file.\n", __func__, name);
		goto out_close;
	}

	if (gelf_getehdr(elf, &ehdr) == NULL) {
-		pr_err("%s: cannot get elf header.\n", __func__);
+		pr_debug("%s: cannot get elf header.\n", __func__);
		goto out_elf_end;
+	}
+
+	/* Always reject images with a mismatched build-id: */
+	if (self->has_build_id) {
+		u8 build_id[BUILD_ID_SIZE];
+
+		if (elf_read_build_id(elf, build_id,
+				      BUILD_ID_SIZE) != BUILD_ID_SIZE)
+			goto out_elf_end;
+
+		if (!dso__build_id_equal(self, build_id))
+			goto out_elf_end;
	}

	sec = elf_section_by_name(elf, &ehdr, &shdr, ".symtab", NULL);
	if (sec == NULL) {
+		if (want_symtab)
+			goto out_elf_end;
+
		sec = elf_section_by_name(elf, &ehdr, &shdr, ".dynsym", NULL);
		if (sec == NULL)
			goto out_elf_end;
	}
+
+	opdsec = elf_section_by_name(elf, &ehdr, &opdshdr, ".opd", &opdidx);
+	if (opdsec)
+		opddata = elf_rawdata(opdsec, NULL);

	syms = elf_getdata(sec, NULL);
	if (syms == NULL)
···
		if (!is_label && !elf_sym__is_a(&sym, map->type))
			continue;
+
+		if (opdsec && sym.st_shndx == opdidx) {
+			u32 offset = sym.st_value - opdshdr.sh_addr;
+			u64 *opd = opddata->d_buf + offset;
+			sym.st_value = *opd;
+			sym.st_shndx = elf_addr_to_index(elf, sym.st_value);
+		}

		sec = elf_getscn(elf, sym.st_shndx);
		if (!sec)
···
  */
 #define NOTE_ALIGN(n) (((n) + 3) & -4U)

-int filename__read_build_id(const char *filename, void *bf, size_t size)
+static int elf_read_build_id(Elf *elf, void *bf, size_t size)
 {
-	int fd, err = -1;
+	int err = -1;
	GElf_Ehdr ehdr;
	GElf_Shdr shdr;
	Elf_Data *data;
	Elf_Scn *sec;
	Elf_Kind ek;
	void *ptr;
-	Elf *elf;

	if (size < BUILD_ID_SIZE)
		goto out;

-	fd = open(filename, O_RDONLY);
-	if (fd < 0)
-		goto out;
-
-	elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
-	if (elf == NULL) {
-		pr_debug2("%s: cannot read %s ELF file.\n", __func__, filename);
-		goto out_close;
-	}
-
	ek = elf_kind(elf);
	if (ek != ELF_K_ELF)
-		goto out_elf_end;
+		goto out;

	if (gelf_getehdr(elf, &ehdr) == NULL) {
		pr_err("%s: cannot get elf header.\n", __func__);
-		goto out_elf_end;
+		goto out;
	}

	sec = elf_section_by_name(elf, &ehdr, &shdr,
···
		sec = elf_section_by_name(elf, &ehdr, &shdr,
					  ".notes", NULL);
		if (sec == NULL)
-			goto out_elf_end;
+			goto out;
	}

	data = elf_getdata(sec, NULL);
	if (data == NULL)
-		goto out_elf_end;
+		goto out;

	ptr = data->d_buf;
	while (ptr < (data->d_buf + data->d_size)) {
···
		}
		ptr += descsz;
	}
-out_elf_end:
+
+out:
+	return err;
+}
+
+int filename__read_build_id(const char *filename, void *bf, size_t size)
+{
+	int fd, err = -1;
+	Elf *elf;
+
+	if (size < BUILD_ID_SIZE)
+		goto out;
+
+	fd = open(filename, O_RDONLY);
+	if (fd < 0)
+		goto out;
+
+	elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
+	if (elf == NULL) {
+		pr_debug2("%s: cannot read %s ELF file.\n", __func__, filename);
+		goto out_close;
+	}
+
+	err = elf_read_build_id(elf, bf, size);
+
	elf_end(elf);
 out_close:
	close(fd);
···
 {
	int size = PATH_MAX;
	char *name;
-	u8 build_id[BUILD_ID_SIZE];
	int ret = -1;
	int fd;
	struct machine *machine;
	const char *root_dir;
+	int want_symtab;

	dso__set_loaded(self, map->type);
···
		return ret;
	}

-	self->origin = DSO__ORIG_BUILD_ID_CACHE;
-	if (dso__build_id_filename(self, name, size) != NULL)
-		goto open_file;
-more:
-	do {
-		self->origin++;
+	/* Iterate over candidate debug images.
+	 * On the first pass, only load images if they have a full symtab.
+	 * Failing that, do a second pass where we accept .dynsym also
+	 */
+	for (self->origin = DSO__ORIG_BUILD_ID_CACHE, want_symtab = 1;
+	     self->origin != DSO__ORIG_NOT_FOUND;
+	     self->origin++) {
		switch (self->origin) {
+		case DSO__ORIG_BUILD_ID_CACHE:
+			if (dso__build_id_filename(self, name, size) == NULL)
+				continue;
+			break;
		case DSO__ORIG_FEDORA:
			snprintf(name, size, "/usr/lib/debug%s.debug",
				 self->long_name);
···
			snprintf(name, size, "/usr/lib/debug%s",
				 self->long_name);
			break;
-		case DSO__ORIG_BUILDID:
-			if (filename__read_build_id(self->long_name, build_id,
-						    sizeof(build_id))) {
-				char build_id_hex[BUILD_ID_SIZE * 2 + 1];
-				build_id__sprintf(build_id, sizeof(build_id),
-						  build_id_hex);
-				snprintf(name, size,
-					 "/usr/lib/debug/.build-id/%.2s/%s.debug",
-					 build_id_hex, build_id_hex + 2);
-				if (self->has_build_id)
-					goto compare_build_id;
-				break;
+		case DSO__ORIG_BUILDID: {
+			char build_id_hex[BUILD_ID_SIZE * 2 + 1];
+
+			if (!self->has_build_id)
+				continue;
+
+			build_id__sprintf(self->build_id,
+					  sizeof(self->build_id),
+					  build_id_hex);
+			snprintf(name, size,
+				 "/usr/lib/debug/.build-id/%.2s/%s.debug",
+				 build_id_hex, build_id_hex + 2);
			}
-			self->origin++;
-			/* Fall thru */
+			break;
		case DSO__ORIG_DSO:
			snprintf(name, size, "%s", self->long_name);
			break;
···
			break;

		default:
-			goto out;
+			/*
+			 * If we wanted a full symtab but no image had one,
+			 * relax our requirements and repeat the search.
+			 */
+			if (want_symtab) {
+				want_symtab = 0;
+				self->origin = DSO__ORIG_BUILD_ID_CACHE;
+			} else
+				continue;
		}

-		if (self->has_build_id) {
-			if (filename__read_build_id(name, build_id,
-						    sizeof(build_id)) < 0)
-				goto more;
-compare_build_id:
-			if (!dso__build_id_equal(self, build_id))
-				goto more;
-		}
-open_file:
+		/* Name is now the name of the next image to try */
		fd = open(name, O_RDONLY);
-	} while (fd < 0);
+		if (fd < 0)
+			continue;

-	ret = dso__load_sym(self, map, name, fd, filter, 0);
-	close(fd);
+		ret = dso__load_sym(self, map, name, fd, filter, 0,
+				    want_symtab);
+		close(fd);

-	/*
-	 * Some people seem to have debuginfo files _WITHOUT_ debug info!?!?
-	 */
-	if (!ret)
-		goto more;
+		/*
+		 * Some people seem to have debuginfo files _WITHOUT_ debug
+		 * info!?!?
+		 */
+		if (!ret)
+			continue;

-	if (ret > 0) {
-		int nr_plt = dso__synthesize_plt_symbols(self, map, filter);
-		if (nr_plt > 0)
-			ret += nr_plt;
+		if (ret > 0) {
+			int nr_plt = dso__synthesize_plt_symbols(self, map, filter);
+			if (nr_plt > 0)
+				ret += nr_plt;
+			break;
+		}
	}
-out:
+
	free(name);
	if (ret < 0 && strstr(self->name, " (deleted)") != NULL)
		return 0;
···
			goto out;
		}
		dso__set_long_name(map->dso, long_name);
+		map->dso->lname_alloc = 1;
		dso__kernel_module_get_build_id(map->dso, "");
	}
 }
···
 {
	int err = -1, fd;

-	if (self->has_build_id) {
-		u8 build_id[BUILD_ID_SIZE];
-
-		if (filename__read_build_id(vmlinux, build_id,
-					    sizeof(build_id)) < 0) {
-			pr_debug("No build_id in %s, ignoring it\n", vmlinux);
-			return -1;
-		}
-		if (!dso__build_id_equal(self, build_id)) {
-			char expected_build_id[BUILD_ID_SIZE * 2 + 1],
-			     vmlinux_build_id[BUILD_ID_SIZE * 2 + 1];
-
-			build_id__sprintf(self->build_id,
-					  sizeof(self->build_id),
-					  expected_build_id);
-			build_id__sprintf(build_id, sizeof(build_id),
-					  vmlinux_build_id);
-			pr_debug("build_id in %s is %s while expected is %s, "
-				 "ignoring it\n", vmlinux, vmlinux_build_id,
-				 expected_build_id);
-			return -1;
-		}
-	}
-
	fd = open(vmlinux, O_RDONLY);
	if (fd < 0)
		return -1;

	dso__set_loaded(self, map->type);
-	err = dso__load_sym(self, map, vmlinux, fd, filter, 0);
+	err = dso__load_sym(self, map, vmlinux, fd, filter, 0, 0);
	close(fd);

	if (err > 0)
···
	return 0;
 }

+void machine__destroy_kernel_maps(struct machine *self)
+{
+	enum map_type type;
+
+	for (type = 0; type < MAP__NR_TYPES; ++type) {
+		struct kmap *kmap;
+
+		if (self->vmlinux_maps[type] == NULL)
+			continue;
+
+		kmap = map__kmap(self->vmlinux_maps[type]);
+		map_groups__remove(&self->kmaps, self->vmlinux_maps[type]);
+		if (kmap->ref_reloc_sym) {
+			/*
+			 * ref_reloc_sym is shared among all maps, so free just
+			 * on one of them.
+			 */
+			if (type == MAP__FUNCTION) {
+				free((char *)kmap->ref_reloc_sym->name);
+				kmap->ref_reloc_sym->name = NULL;
+				free(kmap->ref_reloc_sym);
+			}
+			kmap->ref_reloc_sym = NULL;
+		}
+
+		map__delete(self->vmlinux_maps[type]);
+		self->vmlinux_maps[type] = NULL;
+	}
+}
+
 int machine__create_kernel_maps(struct machine *self)
 {
	struct dso *kernel = machine__create_kernel(self);
···
	return -1;
 }

+void symbol__exit(void)
+{
+	strlist__delete(symbol_conf.sym_list);
+	strlist__delete(symbol_conf.dso_list);
+	strlist__delete(symbol_conf.comm_list);
+	vmlinux_path__exit();
+	symbol_conf.sym_list = symbol_conf.dso_list = symbol_conf.comm_list = NULL;
+}
+
 int machines__create_kernel_maps(struct rb_root *self, pid_t pid)
 {
	struct machine *machine = machines__findnew(self, pid);
···
	}

	return ret;
+}
+
+void machines__destroy_guest_kernel_maps(struct rb_root *self)
+{
+	struct rb_node *next = rb_first(self);
+
+	while (next) {
+		struct machine *pos = rb_entry(next, struct machine, rb_node);
+
+		next = rb_next(&pos->rb_node);
+		rb_erase(&pos->rb_node, self);
+		machine__delete(pos);
+	}
 }

 int machine__load_kallsyms(struct machine *self, const char *filename,
+13 -5
tools/perf/util/symbol.h
···
 #include <linux/rbtree.h>
 #include <stdio.h>

-#define DEBUG_CACHE_DIR ".debug"
-
 #ifdef HAVE_CPLUS_DEMANGLE
 extern char *cplus_demangle(const char *, int);
···
			show_nr_samples,
			use_callchain,
			exclude_other,
-			full_paths,
			show_cpu_utilization;
	const char	*vmlinux_name,
+			*source_prefix,
			*field_sep;
	const char	*default_guest_vmlinux_name,
			*default_guest_kallsyms,
···
 struct map_symbol {
	struct map    *map;
	struct symbol *sym;
+	bool	      unfolded;
+	bool	      has_children;
 };

 struct addr_location {
···
	u64	      addr;
	char	      level;
	bool	      filtered;
-	unsigned int  cpumode;
+	u8	      cpumode;
+	s32	      cpu;
 };

 enum dso_kernel_type {
···
	struct list_head node;
	struct rb_root	 symbols[MAP__NR_TYPES];
	struct rb_root	 symbol_names[MAP__NR_TYPES];
+	enum dso_kernel_type	kernel;
	u8		 adjust_symbols:1;
	u8		 slen_calculated:1;
	u8		 has_build_id:1;
-	enum dso_kernel_type	kernel;
	u8		 hit:1;
	u8		 annotate_warned:1;
+	u8		 sname_alloc:1;
+	u8		 lname_alloc:1;
	unsigned char	 origin;
	u8		 sorted_by_name;
	u8		 loaded;
···
 struct dso *dso__new(const char *name);
 struct dso *dso__new_kernel(const char *name);
 void dso__delete(struct dso *self);
+
+int dso__name_len(const struct dso *self);

 bool dso__loaded(const struct dso *self, enum map_type type);
 bool dso__sorted_by_name(const struct dso *self, enum map_type type);
···
		       int (*process_symbol)(void *arg, const char *name,
					     char type, u64 start));

+void machine__destroy_kernel_maps(struct machine *self);
 int __machine__create_kernel_maps(struct machine *self, struct dso *kernel);
 int machine__create_kernel_maps(struct machine *self);

 int machines__create_kernel_maps(struct rb_root *self, pid_t pid);
 int machines__create_guest_kernel_maps(struct rb_root *self);
+void machines__destroy_guest_kernel_maps(struct rb_root *self);

 int symbol__init(void);
+void symbol__exit(void);
 bool symbol_type__is_a(char symbol_type, enum map_type map_type);

 size_t machine__fprintf_vmlinux_path(struct machine *self, FILE *fp);
+7
tools/perf/util/thread.c
···
	return self;
 }

+void thread__delete(struct thread *self)
+{
+	map_groups__exit(&self->mg);
+	free(self->comm);
+	free(self);
+}
+
 int thread__set_comm(struct thread *self, const char *comm)
 {
	int err;
+2
tools/perf/util/thread.h
···
 struct perf_session;

+void thread__delete(struct thread *self);
+
 int find_all_tid(int pid, pid_t ** all_tid);
 int thread__set_comm(struct thread *self, const char *comm);
 int thread__comm_len(struct thread *self);
+3
tools/perf/util/util.h
···
 extern const char *graph_line;
 extern const char *graph_dotted_line;
+extern char buildid_dir[];

 /* On most systems <limits.h> would have given us this, but
  * not on some systems (e.g. GNU/Hurd).
···
 extern void set_die_routine(void (*routine)(const char *err, va_list params) NORETURN);

 extern int prefixcmp(const char *str, const char *prefix);
+extern void set_buildid_dir(void);
+extern void disable_buildid_cache(void);

 static inline const char *skip_prefix(const char *str, const char *prefix)
 {