Merge tag 'trace-v4.15-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:

- Bring back context level recursive protection in ring buffer.

The simpler counter protection failed: when tracing with
trace_clock_global(), there is a path that is not reentrant and
that depended on the ring buffer's context-level recursive
protection to keep that reentrancy from happening. A small
standalone sketch of the restored bitmask scheme follows this
list.

- Prevent branch profiling when FORTIFY_SOURCE is enabled.

It causes 50-60 MB of warning messages. Branch profiling should
never be run on production systems, so there is no reason for it
to be enabled together with FORTIFY_SOURCE.

* tag 'trace-v4.15-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Prevent PROFILE_ALL_BRANCHES when FORTIFY_SOURCE=y
ring-buffer: Bring back context level recursive checks

 kernel/trace/Kconfig       | +1 -1
 kernel/trace/ring_buffer.c | +45 -17
 2 files changed: +46 -18

--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -355,7 +355,7 @@
           on if you need to profile the system's use of these macros.
 
 config PROFILE_ALL_BRANCHES
-        bool "Profile all if conditionals"
+        bool "Profile all if conditionals" if !FORTIFY_SOURCE
         select TRACE_BRANCH_PROFILING
         help
           This tracer profiles all branch conditions. Every if ()
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -2534,29 +2534,59 @@
  * The lock and unlock are done within a preempt disable section.
  * The current_context per_cpu variable can only be modified
  * by the current task between lock and unlock. But it can
- * be modified more than once via an interrupt. There are four
- * different contexts that we need to consider.
+ * be modified more than once via an interrupt. To pass this
+ * information from the lock to the unlock without having to
+ * access the 'in_interrupt()' functions again (which do show
+ * a bit of overhead in something as critical as function tracing,
+ * we use a bitmask trick.
  *
- *  Normal context.
- *  SoftIRQ context
- *  IRQ context
- *  NMI context
+ *  bit 0 = NMI context
+ *  bit 1 = IRQ context
+ *  bit 2 = SoftIRQ context
+ *  bit 3 = normal context.
  *
- * If for some reason the ring buffer starts to recurse, we
- * only allow that to happen at most 4 times (one for each
- * context). If it happens 5 times, then we consider this a
- * recusive loop and do not let it go further.
+ * This works because this is the order of contexts that can
+ * preempt other contexts. A SoftIRQ never preempts an IRQ
+ * context.
+ *
+ * When the context is determined, the corresponding bit is
+ * checked and set (if it was set, then a recursion of that context
+ * happened).
+ *
+ * On unlock, we need to clear this bit. To do so, just subtract
+ * 1 from the current_context and AND it to itself.
+ *
+ * (binary)
+ *  101 - 1 = 100
+ *  101 & 100 = 100 (clearing bit zero)
+ *
+ *  1010 - 1 = 1001
+ *  1010 & 1001 = 1000 (clearing bit 1)
+ *
+ * The least significant bit can be cleared this way, and it
+ * just so happens that it is the same bit corresponding to
+ * the current context.
  */
 
 static __always_inline int
 trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer)
 {
-        if (cpu_buffer->current_context >= 4)
+        unsigned int val = cpu_buffer->current_context;
+        unsigned long pc = preempt_count();
+        int bit;
+
+        if (!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))
+                bit = RB_CTX_NORMAL;
+        else
+                bit = pc & NMI_MASK ? RB_CTX_NMI :
+                        pc & HARDIRQ_MASK ? RB_CTX_IRQ :
+                        pc & SOFTIRQ_OFFSET ? 2 : RB_CTX_SOFTIRQ;
+
+        if (unlikely(val & (1 << bit)))
                 return 1;
 
-        cpu_buffer->current_context++;
-        /* Interrupts must see this update */
-        barrier();
+        val |= (1 << bit);
+        cpu_buffer->current_context = val;
 
         return 0;
 }
@@ -2564,9 +2594,7 @@
 static __always_inline void
 trace_recursive_unlock(struct ring_buffer_per_cpu *cpu_buffer)
 {
-        /* Don't let the dec leak out */
-        barrier();
-        cpu_buffer->current_context--;
+        cpu_buffer->current_context &= cpu_buffer->current_context - 1;
 }
 
 /**
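
The unlock relies on x & (x - 1) clearing the least significant set
bit, which, given the nesting order described in the comment above,
is always the bit of the innermost (current) context. A tiny
standalone check of the worked binary examples from the new comment
(not part of the patch):

    #include <assert.h>

    int main(void)
    {
        /* 101 - 1 = 100; 101 & 100 = 100 (bit 0 cleared) */
        assert((0x5u & (0x5u - 1)) == 0x4u);
        /* 1010 - 1 = 1001; 1010 & 1001 = 1000 (bit 1 cleared) */
        assert((0xAu & (0xAu - 1)) == 0x8u);
        /* In general, x & (x - 1) clears only the lowest set bit,
         * i.e. the bit of the most recently entered context. */
        return 0;
    }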
··· 2534 2534 * The lock and unlock are done within a preempt disable section. 2535 2535 * The current_context per_cpu variable can only be modified 2536 2536 * by the current task between lock and unlock. But it can 2537 - * be modified more than once via an interrupt. There are four 2538 - * different contexts that we need to consider. 2537 + * be modified more than once via an interrupt. To pass this 2538 + * information from the lock to the unlock without having to 2539 + * access the 'in_interrupt()' functions again (which do show 2540 + * a bit of overhead in something as critical as function tracing, 2541 + * we use a bitmask trick. 2539 2542 * 2540 - * Normal context. 2541 - * SoftIRQ context 2542 - * IRQ context 2543 - * NMI context 2543 + * bit 0 = NMI context 2544 + * bit 1 = IRQ context 2545 + * bit 2 = SoftIRQ context 2546 + * bit 3 = normal context. 2544 2547 * 2545 - * If for some reason the ring buffer starts to recurse, we 2546 - * only allow that to happen at most 4 times (one for each 2547 - * context). If it happens 5 times, then we consider this a 2548 - * recusive loop and do not let it go further. 2548 + * This works because this is the order of contexts that can 2549 + * preempt other contexts. A SoftIRQ never preempts an IRQ 2550 + * context. 2551 + * 2552 + * When the context is determined, the corresponding bit is 2553 + * checked and set (if it was set, then a recursion of that context 2554 + * happened). 2555 + * 2556 + * On unlock, we need to clear this bit. To do so, just subtract 2557 + * 1 from the current_context and AND it to itself. 2558 + * 2559 + * (binary) 2560 + * 101 - 1 = 100 2561 + * 101 & 100 = 100 (clearing bit zero) 2562 + * 2563 + * 1010 - 1 = 1001 2564 + * 1010 & 1001 = 1000 (clearing bit 1) 2565 + * 2566 + * The least significant bit can be cleared this way, and it 2567 + * just so happens that it is the same bit corresponding to 2568 + * the current context. 2549 2569 */ 2550 2570 2551 2571 static __always_inline int 2552 2572 trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer) 2553 2573 { 2554 - if (cpu_buffer->current_context >= 4) 2574 + unsigned int val = cpu_buffer->current_context; 2575 + unsigned long pc = preempt_count(); 2576 + int bit; 2577 + 2578 + if (!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET))) 2579 + bit = RB_CTX_NORMAL; 2580 + else 2581 + bit = pc & NMI_MASK ? RB_CTX_NMI : 2582 + pc & HARDIRQ_MASK ? RB_CTX_IRQ : 2583 + pc & SOFTIRQ_OFFSET ? 2 : RB_CTX_SOFTIRQ; 2584 + 2585 + if (unlikely(val & (1 << bit))) 2555 2586 return 1; 2556 2587 2557 - cpu_buffer->current_context++; 2558 - /* Interrupts must see this update */ 2559 - barrier(); 2588 + val |= (1 << bit); 2589 + cpu_buffer->current_context = val; 2560 2590 2561 2591 return 0; 2562 2592 } ··· 2594 2564 static __always_inline void 2595 2565 trace_recursive_unlock(struct ring_buffer_per_cpu *cpu_buffer) 2596 2566 { 2597 - /* Don't let the dec leak out */ 2598 - barrier(); 2599 - cpu_buffer->current_context--; 2567 + cpu_buffer->current_context &= cpu_buffer->current_context - 1; 2600 2568 } 2601 2569 2602 2570 /**