Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'perf/jump-labels' into perf/core

Merge reason: After much naming discussion, there seems to be consensus
now - queue it up for v3.4.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

+589 -207
+286
Documentation/static-keys.txt
Static Keys
-----------

By: Jason Baron <jbaron@redhat.com>

0) Abstract

Static keys allow the inclusion of seldom used features in
performance-sensitive fast-path kernel code, via a GCC feature and a code
patching technique. A quick example:

	struct static_key key = STATIC_KEY_INIT_FALSE;

	...

	if (static_key_false(&key))
		do unlikely code
	else
		do likely code

	...
	static_key_slow_inc(&key);
	...
	static_key_slow_dec(&key);
	...

The static_key_false() branch will be generated into the code with as little
impact to the likely code path as possible.


1) Motivation


Currently, tracepoints are implemented using a conditional branch. The
conditional check requires checking a global variable for each tracepoint.
Although the overhead of this check is small, it increases when the memory
cache comes under pressure (memory cache lines for these global variables may
be shared with other memory accesses). As we increase the number of tracepoints
in the kernel this overhead may become more of an issue. In addition,
tracepoints are often dormant (disabled) and provide no direct kernel
functionality. Thus, it is highly desirable to reduce their impact as much as
possible. Although tracepoints are the original motivation for this work, other
kernel code paths should be able to make use of the static keys facility.


2) Solution


gcc (v4.5) adds a new 'asm goto' statement that allows branching to a label:

http://gcc.gnu.org/ml/gcc-patches/2009-07/msg01556.html

Using the 'asm goto', we can create branches that are either taken or not taken
by default, without the need to check memory. Then, at run-time, we can patch
the branch site to change the branch direction.
For example, if we have a simple branch that is disabled by default:

	if (static_key_false(&key))
		printk("I am the true branch\n");

Thus, by default the 'printk' will not be emitted. And the code generated will
consist of a single atomic 'no-op' instruction (5 bytes on x86), in the
straight-line code path. When the branch is 'flipped', we will patch the
'no-op' in the straight-line codepath with a 'jump' instruction to the
out-of-line true branch. Thus, changing branch direction is expensive but
branch selection is basically 'free'. That is the basic tradeoff of this
optimization.

This low-level patching mechanism is called 'jump label patching', and it gives
the basis for the static keys facility.

3) Static key label API, usage and examples:


In order to make use of this optimization you must first define a key:

	struct static_key key;

Which is initialized as:

	struct static_key key = STATIC_KEY_INIT_TRUE;

or:

	struct static_key key = STATIC_KEY_INIT_FALSE;

If the key is not initialized, it defaults to false. The 'struct static_key'
must be a 'global'. That is, it can't be allocated on the stack or dynamically
allocated at run-time.

The key is then used in code as:

	if (static_key_false(&key))
		do unlikely code
	else
		do likely code

Or:

	if (static_key_true(&key))
		do likely code
	else
		do unlikely code

A key that is initialized via 'STATIC_KEY_INIT_FALSE' must be used in a
'static_key_false()' construct. Likewise, a key initialized via
'STATIC_KEY_INIT_TRUE' must be used in a 'static_key_true()' construct. A
single key can be used in many branches, but all the branches must match the
way that the key has been initialized.
The branch(es) can then be switched via:

	static_key_slow_inc(&key);
	...
	static_key_slow_dec(&key);

Thus, 'static_key_slow_inc()' means 'make the branch true', and
'static_key_slow_dec()' means 'make the branch false', with appropriate
reference counting. For example, if the key is initialized true, a
static_key_slow_dec() will switch the branch to false. And a subsequent
static_key_slow_inc() will change the branch back to true. Likewise, if the
key is initialized false, a 'static_key_slow_inc()' will change the branch to
true. And then a 'static_key_slow_dec()' will again make the branch false.

An example usage in the kernel is the implementation of tracepoints:

	static inline void trace_##name(proto)				\
	{								\
		if (static_key_false(&__tracepoint_##name.key))		\
			__DO_TRACE(&__tracepoint_##name,		\
				TP_PROTO(data_proto),			\
				TP_ARGS(data_args),			\
				TP_CONDITION(cond));			\
	}

Tracepoints are disabled by default, and can be placed in performance critical
pieces of the kernel. Thus, by using a static key, the tracepoints can have
absolutely minimal impact when not in use.


4) Architecture level code patching interface, 'jump labels'


There are a few functions and macros that architectures must implement in order
to take advantage of this optimization. If there is no architecture support, we
simply fall back to a traditional load, test, and jump sequence.
* select HAVE_ARCH_JUMP_LABEL, see: arch/x86/Kconfig

* #define JUMP_LABEL_NOP_SIZE, see: arch/x86/include/asm/jump_label.h

* __always_inline bool arch_static_branch(struct static_key *key), see:
  arch/x86/include/asm/jump_label.h

* void arch_jump_label_transform(struct jump_entry *entry, enum jump_label_type type),
  see: arch/x86/kernel/jump_label.c

* __init_or_module void arch_jump_label_transform_static(struct jump_entry *entry, enum jump_label_type type),
  see: arch/x86/kernel/jump_label.c

* struct jump_entry, see: arch/x86/include/asm/jump_label.h


5) Static keys / jump label analysis, results (x86_64):


As an example, let's add the following branch to 'getppid()', such that the
system call now looks like:

	SYSCALL_DEFINE0(getppid)
	{
		int pid;

	+	if (static_key_false(&key))
	+		printk("I am the true branch\n");

		rcu_read_lock();
		pid = task_tgid_vnr(rcu_dereference(current->real_parent));
		rcu_read_unlock();

		return pid;
	}

The resulting instructions with jump labels generated by GCC are:

	ffffffff81044290 <sys_getppid>:
	ffffffff81044290:	55			push   %rbp
	ffffffff81044291:	48 89 e5		mov    %rsp,%rbp
	ffffffff81044294:	e9 00 00 00 00		jmpq   ffffffff81044299 <sys_getppid+0x9>
	ffffffff81044299:	65 48 8b 04 25 c0 b6	mov    %gs:0xb6c0,%rax
	ffffffff810442a0:	00 00
	ffffffff810442a2:	48 8b 80 80 02 00 00	mov    0x280(%rax),%rax
	ffffffff810442a9:	48 8b 80 b0 02 00 00	mov    0x2b0(%rax),%rax
	ffffffff810442b0:	48 8b b8 e8 02 00 00	mov    0x2e8(%rax),%rdi
	ffffffff810442b7:	e8 f4 d9 00 00		callq  ffffffff81051cb0 <pid_vnr>
	ffffffff810442bc:	5d			pop    %rbp
	ffffffff810442bd:	48 98			cltq
	ffffffff810442bf:	c3			retq
	ffffffff810442c0:	48 c7 c7 e3 54 98 81	mov    $0xffffffff819854e3,%rdi
	ffffffff810442c7:	31 c0			xor    %eax,%eax
	ffffffff810442c9:	e8 71 13 6d 00		callq  ffffffff8171563f <printk>
	ffffffff810442ce:	eb c9			jmp    ffffffff81044299 <sys_getppid+0x9>

Without the jump label optimization it looks like:

	ffffffff810441f0 <sys_getppid>:
	ffffffff810441f0:	8b 05 8a 52 d8 00	mov    0xd8528a(%rip),%eax        # ffffffff81dc9480 <key>
	ffffffff810441f6:	55			push   %rbp
	ffffffff810441f7:	48 89 e5		mov    %rsp,%rbp
	ffffffff810441fa:	85 c0			test   %eax,%eax
	ffffffff810441fc:	75 27			jne    ffffffff81044225 <sys_getppid+0x35>
	ffffffff810441fe:	65 48 8b 04 25 c0 b6	mov    %gs:0xb6c0,%rax
	ffffffff81044205:	00 00
	ffffffff81044207:	48 8b 80 80 02 00 00	mov    0x280(%rax),%rax
	ffffffff8104420e:	48 8b 80 b0 02 00 00	mov    0x2b0(%rax),%rax
	ffffffff81044215:	48 8b b8 e8 02 00 00	mov    0x2e8(%rax),%rdi
	ffffffff8104421c:	e8 2f da 00 00		callq  ffffffff81051c50 <pid_vnr>
	ffffffff81044221:	5d			pop    %rbp
	ffffffff81044222:	48 98			cltq
	ffffffff81044224:	c3			retq
	ffffffff81044225:	48 c7 c7 13 53 98 81	mov    $0xffffffff81985313,%rdi
	ffffffff8104422c:	31 c0			xor    %eax,%eax
	ffffffff8104422e:	e8 60 0f 6d 00		callq  ffffffff81715193 <printk>
	ffffffff81044233:	eb c9			jmp    ffffffff810441fe <sys_getppid+0xe>
	ffffffff81044235:	66 66 2e 0f 1f 84 00	data32 nopw %cs:0x0(%rax,%rax,1)
	ffffffff8104423c:	00 00 00 00

Thus, the disabled jump label case adds a 'mov', 'test' and 'jne' instruction,
vs. the jump label case, which just has a 'no-op' or 'jmp 0'. (The 'jmp 0' is
patched to a 5-byte atomic no-op instruction at boot-time.) Thus, the disabled
jump label case adds:

	6 (mov) + 2 (test) + 2 (jne) = 10 - 5 (5 byte jmp 0) = 5 additional bytes.

If we then include the padding bytes, the jump label code saves 16 total bytes
of instruction memory for this small function. In this case the non-jump label
function is 80 bytes long.
Thus, we have saved 20% of the instruction
footprint. We can in fact improve this even further, since the 5-byte no-op
really can be a 2-byte no-op since we can reach the branch with a 2-byte jmp.
However, we have not yet implemented optimal no-op sizes (they are currently
hard-coded).

Since there are a number of static key API uses in the scheduler paths,
'pipe-test' (also known as 'perf bench sched pipe') can be used to show the
performance improvement. Testing done on 3.3.0-rc2:

jump label disabled:

 Performance counter stats for 'bash -c /tmp/pipe-test' (50 runs):

        855.700314 task-clock                #    0.534 CPUs utilized            ( +-  0.11% )
           200,003 context-switches          #    0.234 M/sec                    ( +-  0.00% )
                 0 CPU-migrations            #    0.000 M/sec                    ( +- 39.58% )
               487 page-faults               #    0.001 M/sec                    ( +-  0.02% )
     1,474,374,262 cycles                    #    1.723 GHz                      ( +-  0.17% )
   <not supported> stalled-cycles-frontend
   <not supported> stalled-cycles-backend
     1,178,049,567 instructions              #    0.80  insns per cycle          ( +-  0.06% )
       208,368,926 branches                  #  243.507 M/sec                    ( +-  0.06% )
         5,569,188 branch-misses             #    2.67% of all branches          ( +-  0.54% )

       1.601607384 seconds time elapsed                                          ( +-  0.07% )

jump label enabled:

 Performance counter stats for 'bash -c /tmp/pipe-test' (50 runs):

        841.043185 task-clock                #    0.533 CPUs utilized            ( +-  0.12% )
           200,004 context-switches          #    0.238 M/sec                    ( +-  0.00% )
                 0 CPU-migrations            #    0.000 M/sec                    ( +- 40.87% )
               487 page-faults               #    0.001 M/sec                    ( +-  0.05% )
     1,432,559,428 cycles                    #    1.703 GHz                      ( +-  0.18% )
   <not supported> stalled-cycles-frontend
   <not supported> stalled-cycles-backend
     1,175,363,994 instructions              #    0.82  insns per cycle          ( +-  0.04% )
       206,859,359 branches                  #  245.956 M/sec                    ( +-  0.04% )
         4,884,119 branch-misses             #    2.36% of all branches          ( +-  0.85% )

       1.579384366 seconds time elapsed

The percentage of saved branches is .7%, and we've saved 12% on
'branch-misses'. This is where we would expect to get the most savings, since
this optimization is about reducing the number of branches. In addition, we've
saved .2% on instructions, 2.8% on cycles, and 1.4% on elapsed time.
+20 -9
arch/Kconfig
···
 	  If in doubt, say "N".
 
 config JUMP_LABEL
-       bool "Optimize trace point call sites"
+       bool "Optimize very unlikely/likely branches"
 	depends on HAVE_ARCH_JUMP_LABEL
 	help
-	 If it is detected that the compiler has support for "asm goto",
-	 the kernel will compile trace point locations with just a
-	 nop instruction. When trace points are enabled, the nop will
-	 be converted to a jump to the trace function. This technique
-	 lowers overhead and stress on the branch prediction of the
-	 processor.
+	 This option enables a transparent branch optimization that
+	 makes certain almost-always-true or almost-always-false branch
+	 conditions even cheaper to execute within the kernel.
 
-	 On i386, options added to the compiler flags may increase
-	 the size of the kernel slightly.
+	 Certain performance-sensitive kernel code, such as trace points,
+	 scheduler functionality, networking code and KVM have such
+	 branches and include support for this optimization technique.
+
+	 If it is detected that the compiler has support for "asm goto",
+	 the kernel will compile such branches with just a nop
+	 instruction. When the condition flag is toggled to true, the
+	 nop will be converted to a jump instruction to execute the
+	 conditional block of instructions.
+
+	 This technique lowers overhead and stress on the branch prediction
+	 of the processor and generally makes the kernel faster. The update
+	 of the condition is slower, but those are always very rare.
+
+	 ( On 32-bit x86, the necessary options added to the compiler
+	   flags may increase the size of the kernel slightly. )
 
 config OPTPROBES
 	def_bool y
+3 -3
arch/ia64/include/asm/paravirt.h
···
 	pv_time_ops.init_missing_ticks_accounting(cpu);
 }
 
-struct jump_label_key;
-extern struct jump_label_key paravirt_steal_enabled;
-extern struct jump_label_key paravirt_steal_rq_enabled;
+struct static_key;
+extern struct static_key paravirt_steal_enabled;
+extern struct static_key paravirt_steal_rq_enabled;
 
 static inline int
 paravirt_do_steal_accounting(unsigned long *new_itm)
+2 -2
arch/ia64/kernel/paravirt.c
···
  * pv_time_ops
  * time operations
  */
-struct jump_label_key paravirt_steal_enabled;
-struct jump_label_key paravirt_steal_rq_enabled;
+struct static_key paravirt_steal_enabled;
+struct static_key paravirt_steal_rq_enabled;
 
 static int
 ia64_native_do_steal_accounting(unsigned long *new_itm)
+1 -1
arch/mips/include/asm/jump_label.h
···
 #define WORD_INSN ".word"
 #endif
 
-static __always_inline bool arch_static_branch(struct jump_label_key *key)
+static __always_inline bool arch_static_branch(struct static_key *key)
 {
 	asm goto("1:\tnop\n\t"
 		"nop\n\t"
+1 -1
arch/powerpc/include/asm/jump_label.h
···
 #define JUMP_ENTRY_TYPE		stringify_in_c(FTR_ENTRY_LONG)
 #define JUMP_LABEL_NOP_SIZE	4
 
-static __always_inline bool arch_static_branch(struct jump_label_key *key)
+static __always_inline bool arch_static_branch(struct static_key *key)
 {
 	asm goto("1:\n\t"
 		 "nop\n\t"
+1 -1
arch/s390/include/asm/jump_label.h
···
 #define ASM_ALIGN ".balign 4"
 #endif
 
-static __always_inline bool arch_static_branch(struct jump_label_key *key)
+static __always_inline bool arch_static_branch(struct static_key *key)
 {
 	asm goto("0:	brcl 0,0\n"
 		".pushsection __jump_table, \"aw\"\n"
+1 -1
arch/sparc/include/asm/jump_label.h
···
 
 #define JUMP_LABEL_NOP_SIZE 4
 
-static __always_inline bool arch_static_branch(struct jump_label_key *key)
+static __always_inline bool arch_static_branch(struct static_key *key)
 {
 	asm goto("1:\n\t"
 		 "nop\n\t"
+3 -3
arch/x86/include/asm/jump_label.h
···
 
 #define JUMP_LABEL_NOP_SIZE 5
 
-#define JUMP_LABEL_INITIAL_NOP ".byte 0xe9 \n\t .long 0\n\t"
+#define STATIC_KEY_INITIAL_NOP ".byte 0xe9 \n\t .long 0\n\t"
 
-static __always_inline bool arch_static_branch(struct jump_label_key *key)
+static __always_inline bool arch_static_branch(struct static_key *key)
 {
 	asm goto("1:"
-		JUMP_LABEL_INITIAL_NOP
+		STATIC_KEY_INITIAL_NOP
 		".pushsection __jump_table, \"aw\" \n\t"
 		_ASM_ALIGN "\n\t"
 		_ASM_PTR "1b, %l[l_yes], %c0 \n\t"
+3 -3
arch/x86/include/asm/paravirt.h
···
 	return PVOP_CALL0(unsigned long long, pv_time_ops.sched_clock);
 }
 
-struct jump_label_key;
-extern struct jump_label_key paravirt_steal_enabled;
-extern struct jump_label_key paravirt_steal_rq_enabled;
+struct static_key;
+extern struct static_key paravirt_steal_enabled;
+extern struct static_key paravirt_steal_rq_enabled;
 
 static inline u64 paravirt_steal_clock(int cpu)
 {
+2 -2
arch/x86/kernel/kvm.c
···
 static __init int activate_jump_labels(void)
 {
 	if (has_steal_clock) {
-		jump_label_inc(&paravirt_steal_enabled);
+		static_key_slow_inc(&paravirt_steal_enabled);
 		if (steal_acc)
-			jump_label_inc(&paravirt_steal_rq_enabled);
+			static_key_slow_inc(&paravirt_steal_rq_enabled);
 	}
 
 	return 0;
+2 -2
arch/x86/kernel/paravirt.c
···
 	__native_flush_tlb_single(addr);
 }
 
-struct jump_label_key paravirt_steal_enabled;
-struct jump_label_key paravirt_steal_rq_enabled;
+struct static_key paravirt_steal_enabled;
+struct static_key paravirt_steal_rq_enabled;
 
 static u64 native_steal_clock(int cpu)
 {
+4 -4
arch/x86/kvm/mmu_audit.c
···
 }
 
 static bool mmu_audit;
-static struct jump_label_key mmu_audit_key;
+static struct static_key mmu_audit_key;
 
 static void __kvm_mmu_audit(struct kvm_vcpu *vcpu, int point)
 {
···
 
 static inline void kvm_mmu_audit(struct kvm_vcpu *vcpu, int point)
 {
-	if (static_branch((&mmu_audit_key)))
+	if (static_key_false((&mmu_audit_key)))
 		__kvm_mmu_audit(vcpu, point);
 }
 
···
 	if (mmu_audit)
 		return;
 
-	jump_label_inc(&mmu_audit_key);
+	static_key_slow_inc(&mmu_audit_key);
 	mmu_audit = true;
 }
 
···
 	if (!mmu_audit)
 		return;
 
-	jump_label_dec(&mmu_audit_key);
+	static_key_slow_dec(&mmu_audit_key);
 	mmu_audit = false;
 }
 
+100 -39
include/linux/jump_label.h
···
  *
  * Jump labels provide an interface to generate dynamic branches using
  * self-modifying code. Assuming toolchain and architecture support the result
- * of a "if (static_branch(&key))" statement is a unconditional branch (which
+ * of a "if (static_key_false(&key))" statement is a unconditional branch (which
  * defaults to false - and the true block is placed out of line).
  *
- * However at runtime we can change the 'static' branch target using
- * jump_label_{inc,dec}(). These function as a 'reference' count on the key
+ * However at runtime we can change the branch target using
+ * static_key_slow_{inc,dec}(). These function as a 'reference' count on the key
  * object and for as long as there are references all branches referring to
  * that particular key will point to the (out of line) true block.
  *
- * Since this relies on modifying code the jump_label_{inc,dec}() functions
+ * Since this relies on modifying code the static_key_slow_{inc,dec}() functions
  * must be considered absolute slow paths (machine wide synchronization etc.).
  * OTOH, since the affected branches are unconditional their runtime overhead
  * will be absolutely minimal, esp. in the default (off) case where the total
···
  *
  * When the control is directly exposed to userspace it is prudent to delay the
  * decrement to avoid high frequency code modifications which can (and do)
- * cause significant performance degradation. Struct jump_label_key_deferred and
- * jump_label_dec_deferred() provide for this.
+ * cause significant performance degradation. Struct static_key_deferred and
+ * static_key_slow_dec_deferred() provide for this.
  *
  * Lacking toolchain and or architecture support, it falls back to a simple
  * conditional branch.
- */
+ *
+ * struct static_key my_key = STATIC_KEY_INIT_TRUE;
+ *
+ *   if (static_key_true(&my_key)) {
+ *   }
+ *
+ * will result in the true case being in-line and starts the key with a single
+ * reference. Mixing static_key_true() and static_key_false() on the same key is not
+ * allowed.
+ *
+ * Not initializing the key (static data is initialized to 0s anyway) is the
+ * same as using STATIC_KEY_INIT_FALSE and static_key_false() is
+ * equivalent with static_branch().
+ *
+ */
 
 #include <linux/types.h>
 #include <linux/compiler.h>
···
 
 #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
 
-struct jump_label_key {
+struct static_key {
 	atomic_t enabled;
+/* Set lsb bit to 1 if branch is default true, 0 otherwise */
 	struct jump_entry *entries;
 #ifdef CONFIG_MODULES
-	struct jump_label_mod *next;
+	struct static_key_mod *next;
 #endif
 };
 
-struct jump_label_key_deferred {
-	struct jump_label_key key;
+struct static_key_deferred {
+	struct static_key key;
 	unsigned long timeout;
 	struct delayed_work work;
 };
···
 
 #ifdef HAVE_JUMP_LABEL
 
-#ifdef CONFIG_MODULES
-#define JUMP_LABEL_INIT {ATOMIC_INIT(0), NULL, NULL}
-#else
-#define JUMP_LABEL_INIT {ATOMIC_INIT(0), NULL}
-#endif
+#define JUMP_LABEL_TRUE_BRANCH 1UL
 
-static __always_inline bool static_branch(struct jump_label_key *key)
+static
+inline struct jump_entry *jump_label_get_entries(struct static_key *key)
+{
+	return (struct jump_entry *)((unsigned long)key->entries
+						& ~JUMP_LABEL_TRUE_BRANCH);
+}
+
+static inline bool jump_label_get_branch_default(struct static_key *key)
+{
+	if ((unsigned long)key->entries & JUMP_LABEL_TRUE_BRANCH)
+		return true;
+	return false;
+}
+
+static __always_inline bool static_key_false(struct static_key *key)
+{
+	return arch_static_branch(key);
+}
+
+static __always_inline bool static_key_true(struct static_key *key)
+{
+	return !static_key_false(key);
+}
+
+/* Deprecated. Please use 'static_key_false()' instead. */
+static __always_inline bool static_branch(struct static_key *key)
 {
 	return arch_static_branch(key);
 }
···
 extern void arch_jump_label_transform_static(struct jump_entry *entry,
 					     enum jump_label_type type);
 extern int jump_label_text_reserved(void *start, void *end);
-extern void jump_label_inc(struct jump_label_key *key);
-extern void jump_label_dec(struct jump_label_key *key);
-extern void jump_label_dec_deferred(struct jump_label_key_deferred *key);
-extern bool jump_label_enabled(struct jump_label_key *key);
+extern void static_key_slow_inc(struct static_key *key);
+extern void static_key_slow_dec(struct static_key *key);
+extern void static_key_slow_dec_deferred(struct static_key_deferred *key);
+extern bool static_key_enabled(struct static_key *key);
 extern void jump_label_apply_nops(struct module *mod);
-extern void jump_label_rate_limit(struct jump_label_key_deferred *key,
-		unsigned long rl);
+extern void
+jump_label_rate_limit(struct static_key_deferred *key, unsigned long rl);
+
+#define STATIC_KEY_INIT_TRUE ((struct static_key) \
+	{ .enabled = ATOMIC_INIT(1), .entries = (void *)1 })
+#define STATIC_KEY_INIT_FALSE ((struct static_key) \
+	{ .enabled = ATOMIC_INIT(0), .entries = (void *)0 })
 
 #else  /* !HAVE_JUMP_LABEL */
 
 #include <linux/atomic.h>
 
-#define JUMP_LABEL_INIT {ATOMIC_INIT(0)}
-
-struct jump_label_key {
+struct static_key {
 	atomic_t enabled;
 };
···
 {
 }
 
-struct jump_label_key_deferred {
-	struct jump_label_key key;
+struct static_key_deferred {
+	struct static_key key;
 };
 
-static __always_inline bool static_branch(struct jump_label_key *key)
+static __always_inline bool static_key_false(struct static_key *key)
 {
-	if (unlikely(atomic_read(&key->enabled)))
+	if (unlikely(atomic_read(&key->enabled) > 0))
 		return true;
 	return false;
 }
 
-static inline void jump_label_inc(struct jump_label_key *key)
+static __always_inline bool static_key_true(struct static_key *key)
+{
+	if (likely(atomic_read(&key->enabled) > 0))
+		return true;
+	return false;
+}
+
+/* Deprecated. Please use 'static_key_false()' instead. */
+static __always_inline bool static_branch(struct static_key *key)
+{
+	if (unlikely(atomic_read(&key->enabled) > 0))
+		return true;
+	return false;
+}
+
+static inline void static_key_slow_inc(struct static_key *key)
 {
 	atomic_inc(&key->enabled);
 }
 
-static inline void jump_label_dec(struct jump_label_key *key)
+static inline void static_key_slow_dec(struct static_key *key)
 {
 	atomic_dec(&key->enabled);
 }
 
-static inline void jump_label_dec_deferred(struct jump_label_key_deferred *key)
+static inline void static_key_slow_dec_deferred(struct static_key_deferred *key)
 {
-	jump_label_dec(&key->key);
+	static_key_slow_dec(&key->key);
 }
 
 static inline int jump_label_text_reserved(void *start, void *end)
···
 static inline void jump_label_lock(void) {}
 static inline void jump_label_unlock(void) {}
 
-static inline bool jump_label_enabled(struct jump_label_key *key)
+static inline bool static_key_enabled(struct static_key *key)
 {
-	return !!atomic_read(&key->enabled);
+	return (atomic_read(&key->enabled) > 0);
 }
 
 static inline int jump_label_apply_nops(struct module *mod)
···
 	return 0;
 }
 
-static inline void jump_label_rate_limit(struct jump_label_key_deferred *key,
+static inline void
+jump_label_rate_limit(struct static_key_deferred *key,
 		unsigned long rl)
 {
 }
+
+#define STATIC_KEY_INIT_TRUE ((struct static_key) \
+	{ .enabled = ATOMIC_INIT(1) })
+#define STATIC_KEY_INIT_FALSE ((struct static_key) \
+	{ .enabled = ATOMIC_INIT(0) })
+
 #endif	/* HAVE_JUMP_LABEL */
 
-#define jump_label_key_enabled ((struct jump_label_key){ .enabled = ATOMIC_INIT(1), })
-#define jump_label_key_disabled ((struct jump_label_key){ .enabled = ATOMIC_INIT(0), })
+#define STATIC_KEY_INIT STATIC_KEY_INIT_FALSE
+#define jump_label_enabled static_key_enabled
 
 #endif	/* _LINUX_JUMP_LABEL_H */
+2 -2
include/linux/netdevice.h
···
 #include <linux/skbuff.h>
 
 #ifdef CONFIG_RPS
-#include <linux/jump_label.h>
-extern struct jump_label_key rps_needed;
+#include <linux/static_key.h>
+extern struct static_key rps_needed;
 #endif
 
 struct neighbour;
+3 -3
include/linux/netfilter.h
···
 extern struct list_head nf_hooks[NFPROTO_NUMPROTO][NF_MAX_HOOKS];
 
 #if defined(CONFIG_JUMP_LABEL)
-#include <linux/jump_label.h>
-extern struct jump_label_key nf_hooks_needed[NFPROTO_NUMPROTO][NF_MAX_HOOKS];
+#include <linux/static_key.h>
+extern struct static_key nf_hooks_needed[NFPROTO_NUMPROTO][NF_MAX_HOOKS];
 static inline bool nf_hooks_active(u_int8_t pf, unsigned int hook)
 {
 	if (__builtin_constant_p(pf) &&
 	    __builtin_constant_p(hook))
-		return static_branch(&nf_hooks_needed[pf][hook]);
+		return static_key_false(&nf_hooks_needed[pf][hook]);
 
 	return !list_empty(&nf_hooks[pf][hook]);
 }
+6 -6
include/linux/perf_event.h
···
 #include <linux/ftrace.h>
 #include <linux/cpu.h>
 #include <linux/irq_work.h>
-#include <linux/jump_label.h>
+#include <linux/static_key.h>
 #include <linux/atomic.h>
 #include <asm/local.h>
···
 	return event->pmu->task_ctx_nr == perf_sw_context;
 }
 
-extern struct jump_label_key perf_swevent_enabled[PERF_COUNT_SW_MAX];
+extern struct static_key perf_swevent_enabled[PERF_COUNT_SW_MAX];
 
 extern void __perf_sw_event(u32, u64, struct pt_regs *, u64);
···
 {
 	struct pt_regs hot_regs;
 
-	if (static_branch(&perf_swevent_enabled[event_id])) {
+	if (static_key_false(&perf_swevent_enabled[event_id])) {
 		if (!regs) {
 			perf_fetch_caller_regs(&hot_regs);
 			regs = &hot_regs;
···
 	}
 }
 
-extern struct jump_label_key_deferred perf_sched_events;
+extern struct static_key_deferred perf_sched_events;
 
 static inline void perf_event_task_sched_in(struct task_struct *prev,
 					    struct task_struct *task)
 {
-	if (static_branch(&perf_sched_events.key))
+	if (static_key_false(&perf_sched_events.key))
 		__perf_event_task_sched_in(prev, task);
 }
 
···
 {
 	perf_sw_event(PERF_COUNT_SW_CONTEXT_SWITCHES, 1, NULL, 0);
 
-	if (static_branch(&perf_sched_events.key))
+	if (static_key_false(&perf_sched_events.key))
 		__perf_event_task_sched_out(prev, next);
 }
 
+1
include/linux/static_key.h
+#include <linux/jump_label.h>
+4 -4
include/linux/tracepoint.h
···
 #include <linux/errno.h>
 #include <linux/types.h>
 #include <linux/rcupdate.h>
-#include <linux/jump_label.h>
+#include <linux/static_key.h>
 
 struct module;
 struct tracepoint;
···
 
 struct tracepoint {
 	const char *name;		/* Tracepoint name */
-	struct jump_label_key key;
+	struct static_key key;
 	void (*regfunc)(void);
 	void (*unregfunc)(void);
 	struct tracepoint_func __rcu *funcs;
···
 	extern struct tracepoint __tracepoint_##name;			\
 	static inline void trace_##name(proto)				\
 	{								\
-		if (static_branch(&__tracepoint_##name.key))		\
+		if (static_key_false(&__tracepoint_##name.key))		\
 			__DO_TRACE(&__tracepoint_##name,		\
 				TP_PROTO(data_proto),			\
 				TP_ARGS(data_args),			\
···
 	__attribute__((section("__tracepoints_strings"))) = #name;	 \
 	struct tracepoint __tracepoint_##name				 \
 	__attribute__((section("__tracepoints"))) =			 \
-	{ __tpstrtab_##name, JUMP_LABEL_INIT, reg, unreg, NULL };	 \
+	{ __tpstrtab_##name, STATIC_KEY_INIT_FALSE, reg, unreg, NULL };	 \
 	static struct tracepoint * const __tracepoint_ptr_##name __used	 \
 	__attribute__((section("__tracepoints_ptrs"))) =		 \
 	&__tracepoint_##name;
+3 -3
include/net/sock.h
···
 #include <linux/uaccess.h>
 #include <linux/memcontrol.h>
 #include <linux/res_counter.h>
-#include <linux/jump_label.h>
+#include <linux/static_key.h>
 
 #include <linux/filter.h>
 #include <linux/rculist_nulls.h>
···
 #endif /* SOCK_REFCNT_DEBUG */
 
 #if defined(CONFIG_CGROUP_MEM_RES_CTLR_KMEM) && defined(CONFIG_NET)
-extern struct jump_label_key memcg_socket_limit_enabled;
+extern struct static_key memcg_socket_limit_enabled;
 static inline struct cg_proto *parent_cg_proto(struct proto *proto,
 					       struct cg_proto *cg_proto)
 {
 	return proto->proto_cgroup(parent_mem_cgroup(cg_proto->memcg));
 }
-#define mem_cgroup_sockets_enabled static_branch(&memcg_socket_limit_enabled)
+#define mem_cgroup_sockets_enabled static_key_false(&memcg_socket_limit_enabled)
 #else
 #define mem_cgroup_sockets_enabled 0
 static inline struct cg_proto *parent_cg_proto(struct proto *proto,
+8 -8
kernel/events/core.c
··· 128 128 * perf_sched_events : >0 events exist 129 129 * perf_cgroup_events: >0 per-cpu cgroup events exist on this cpu 130 130 */ 131 - struct jump_label_key_deferred perf_sched_events __read_mostly; 131 + struct static_key_deferred perf_sched_events __read_mostly; 132 132 static DEFINE_PER_CPU(atomic_t, perf_cgroup_events); 133 133 134 134 static atomic_t nr_mmap_events __read_mostly; ··· 2769 2769 2770 2770 if (!event->parent) { 2771 2771 if (event->attach_state & PERF_ATTACH_TASK) 2772 - jump_label_dec_deferred(&perf_sched_events); 2772 + static_key_slow_dec_deferred(&perf_sched_events); 2773 2773 if (event->attr.mmap || event->attr.mmap_data) 2774 2774 atomic_dec(&nr_mmap_events); 2775 2775 if (event->attr.comm) ··· 2780 2780 put_callchain_buffers(); 2781 2781 if (is_cgroup_event(event)) { 2782 2782 atomic_dec(&per_cpu(perf_cgroup_events, event->cpu)); 2783 - jump_label_dec_deferred(&perf_sched_events); 2783 + static_key_slow_dec_deferred(&perf_sched_events); 2784 2784 } 2785 2785 } 2786 2786 ··· 4982 4982 return err; 4983 4983 } 4984 4984 4985 - struct jump_label_key perf_swevent_enabled[PERF_COUNT_SW_MAX]; 4985 + struct static_key perf_swevent_enabled[PERF_COUNT_SW_MAX]; 4986 4986 4987 4987 static void sw_perf_event_destroy(struct perf_event *event) 4988 4988 { ··· 4990 4990 4991 4991 WARN_ON(event->parent); 4992 4992 4993 - jump_label_dec(&perf_swevent_enabled[event_id]); 4993 + static_key_slow_dec(&perf_swevent_enabled[event_id]); 4994 4994 swevent_hlist_put(event); 4995 4995 } 4996 4996 ··· 5020 5020 if (err) 5021 5021 return err; 5022 5022 5023 - jump_label_inc(&perf_swevent_enabled[event_id]); 5023 + static_key_slow_inc(&perf_swevent_enabled[event_id]); 5024 5024 event->destroy = sw_perf_event_destroy; 5025 5025 } 5026 5026 ··· 5843 5843 5844 5844 if (!event->parent) { 5845 5845 if (event->attach_state & PERF_ATTACH_TASK) 5846 - jump_label_inc(&perf_sched_events.key); 5846 + static_key_slow_inc(&perf_sched_events.key); 5847 5847 if (event->attr.mmap 
|| event->attr.mmap_data) 5848 5848 atomic_inc(&nr_mmap_events); 5849 5849 if (event->attr.comm) ··· 6081 6081 * - that may need work on context switch 6082 6082 */ 6083 6083 atomic_inc(&per_cpu(perf_cgroup_events, event->cpu)); 6084 - jump_label_inc(&perf_sched_events.key); 6084 + static_key_slow_inc(&perf_sched_events.key); 6085 6085 } 6086 6086 6087 6087 /*
+79 -56
kernel/jump_label.c
··· 12 12 #include <linux/slab.h> 13 13 #include <linux/sort.h> 14 14 #include <linux/err.h> 15 - #include <linux/jump_label.h> 15 + #include <linux/static_key.h> 16 16 17 17 #ifdef HAVE_JUMP_LABEL 18 18 ··· 29 29 mutex_unlock(&jump_label_mutex); 30 30 } 31 31 32 - bool jump_label_enabled(struct jump_label_key *key) 32 + bool static_key_enabled(struct static_key *key) 33 33 { 34 - return !!atomic_read(&key->enabled); 34 + return (atomic_read(&key->enabled) > 0); 35 35 } 36 + EXPORT_SYMBOL_GPL(static_key_enabled); 36 37 37 38 static int jump_label_cmp(const void *a, const void *b) 38 39 { ··· 59 58 sort(start, size, sizeof(struct jump_entry), jump_label_cmp, NULL); 60 59 } 61 60 62 - static void jump_label_update(struct jump_label_key *key, int enable); 61 + static void jump_label_update(struct static_key *key, int enable); 63 62 64 - void jump_label_inc(struct jump_label_key *key) 63 + void static_key_slow_inc(struct static_key *key) 65 64 { 66 65 if (atomic_inc_not_zero(&key->enabled)) 67 66 return; 68 67 69 68 jump_label_lock(); 70 - if (atomic_read(&key->enabled) == 0) 71 - jump_label_update(key, JUMP_LABEL_ENABLE); 69 + if (atomic_read(&key->enabled) == 0) { 70 + if (!jump_label_get_branch_default(key)) 71 + jump_label_update(key, JUMP_LABEL_ENABLE); 72 + else 73 + jump_label_update(key, JUMP_LABEL_DISABLE); 74 + } 72 75 atomic_inc(&key->enabled); 73 76 jump_label_unlock(); 74 77 } 75 - EXPORT_SYMBOL_GPL(jump_label_inc); 78 + EXPORT_SYMBOL_GPL(static_key_slow_inc); 76 79 77 - static void __jump_label_dec(struct jump_label_key *key, 80 + static void __static_key_slow_dec(struct static_key *key, 78 81 unsigned long rate_limit, struct delayed_work *work) 79 82 { 80 - if (!atomic_dec_and_mutex_lock(&key->enabled, &jump_label_mutex)) 83 + if (!atomic_dec_and_mutex_lock(&key->enabled, &jump_label_mutex)) { 84 + WARN(atomic_read(&key->enabled) < 0, 85 + "jump label: negative count!\n"); 81 86 return; 87 + } 82 88 83 89 if (rate_limit) { 84 90 
atomic_inc(&key->enabled); 85 91 schedule_delayed_work(work, rate_limit); 86 - } else 87 - jump_label_update(key, JUMP_LABEL_DISABLE); 88 - 92 + } else { 93 + if (!jump_label_get_branch_default(key)) 94 + jump_label_update(key, JUMP_LABEL_DISABLE); 95 + else 96 + jump_label_update(key, JUMP_LABEL_ENABLE); 97 + } 89 98 jump_label_unlock(); 90 99 } 91 - EXPORT_SYMBOL_GPL(jump_label_dec); 92 100 93 101 static void jump_label_update_timeout(struct work_struct *work) 94 102 { 95 - struct jump_label_key_deferred *key = 96 - container_of(work, struct jump_label_key_deferred, work.work); 97 - __jump_label_dec(&key->key, 0, NULL); 103 + struct static_key_deferred *key = 104 + container_of(work, struct static_key_deferred, work.work); 105 + __static_key_slow_dec(&key->key, 0, NULL); 98 106 } 99 107 100 - void jump_label_dec(struct jump_label_key *key) 108 + void static_key_slow_dec(struct static_key *key) 101 109 { 102 - __jump_label_dec(key, 0, NULL); 110 + __static_key_slow_dec(key, 0, NULL); 103 111 } 112 + EXPORT_SYMBOL_GPL(static_key_slow_dec); 104 113 105 - void jump_label_dec_deferred(struct jump_label_key_deferred *key) 114 + void static_key_slow_dec_deferred(struct static_key_deferred *key) 106 115 { 107 - __jump_label_dec(&key->key, key->timeout, &key->work); 116 + __static_key_slow_dec(&key->key, key->timeout, &key->work); 108 117 } 118 + EXPORT_SYMBOL_GPL(static_key_slow_dec_deferred); 109 119 110 - 111 - void jump_label_rate_limit(struct jump_label_key_deferred *key, 120 + void jump_label_rate_limit(struct static_key_deferred *key, 112 121 unsigned long rl) 113 122 { 114 123 key->timeout = rl; ··· 161 150 arch_jump_label_transform(entry, type); 162 151 } 163 152 164 - static void __jump_label_update(struct jump_label_key *key, 153 + static void __jump_label_update(struct static_key *key, 165 154 struct jump_entry *entry, 166 155 struct jump_entry *stop, int enable) 167 156 { ··· 178 167 } 179 168 } 180 169 170 + static enum jump_label_type jump_label_type(struct 
static_key *key) 171 + { 172 + bool true_branch = jump_label_get_branch_default(key); 173 + bool state = static_key_enabled(key); 174 + 175 + if ((!true_branch && state) || (true_branch && !state)) 176 + return JUMP_LABEL_ENABLE; 177 + 178 + return JUMP_LABEL_DISABLE; 179 + } 180 + 181 181 void __init jump_label_init(void) 182 182 { 183 183 struct jump_entry *iter_start = __start___jump_table; 184 184 struct jump_entry *iter_stop = __stop___jump_table; 185 - struct jump_label_key *key = NULL; 185 + struct static_key *key = NULL; 186 186 struct jump_entry *iter; 187 187 188 188 jump_label_lock(); 189 189 jump_label_sort_entries(iter_start, iter_stop); 190 190 191 191 for (iter = iter_start; iter < iter_stop; iter++) { 192 - struct jump_label_key *iterk; 192 + struct static_key *iterk; 193 193 194 - iterk = (struct jump_label_key *)(unsigned long)iter->key; 195 - arch_jump_label_transform_static(iter, jump_label_enabled(iterk) ? 196 - JUMP_LABEL_ENABLE : JUMP_LABEL_DISABLE); 194 + iterk = (struct static_key *)(unsigned long)iter->key; 195 + arch_jump_label_transform_static(iter, jump_label_type(iterk)); 197 196 if (iterk == key) 198 197 continue; 199 198 200 199 key = iterk; 201 - key->entries = iter; 200 + /* 201 + * Set key->entries to iter, but preserve JUMP_LABEL_TRUE_BRANCH. 
202 + */ 203 + *((unsigned long *)&key->entries) += (unsigned long)iter; 202 204 #ifdef CONFIG_MODULES 203 205 key->next = NULL; 204 206 #endif ··· 221 197 222 198 #ifdef CONFIG_MODULES 223 199 224 - struct jump_label_mod { 225 - struct jump_label_mod *next; 200 + struct static_key_mod { 201 + struct static_key_mod *next; 226 202 struct jump_entry *entries; 227 203 struct module *mod; 228 204 }; ··· 242 218 start, end); 243 219 } 244 220 245 - static void __jump_label_mod_update(struct jump_label_key *key, int enable) 221 + static void __jump_label_mod_update(struct static_key *key, int enable) 246 222 { 247 - struct jump_label_mod *mod = key->next; 223 + struct static_key_mod *mod = key->next; 248 224 249 225 while (mod) { 250 226 struct module *m = mod->mod; ··· 275 251 return; 276 252 277 253 for (iter = iter_start; iter < iter_stop; iter++) { 278 - struct jump_label_key *iterk; 279 - 280 - iterk = (struct jump_label_key *)(unsigned long)iter->key; 281 - arch_jump_label_transform_static(iter, jump_label_enabled(iterk) ? 
282 - JUMP_LABEL_ENABLE : JUMP_LABEL_DISABLE); 254 + arch_jump_label_transform_static(iter, JUMP_LABEL_DISABLE); 283 255 } 284 256 } 285 257 ··· 284 264 struct jump_entry *iter_start = mod->jump_entries; 285 265 struct jump_entry *iter_stop = iter_start + mod->num_jump_entries; 286 266 struct jump_entry *iter; 287 - struct jump_label_key *key = NULL; 288 - struct jump_label_mod *jlm; 267 + struct static_key *key = NULL; 268 + struct static_key_mod *jlm; 289 269 290 270 /* if the module doesn't have jump label entries, just return */ 291 271 if (iter_start == iter_stop) ··· 294 274 jump_label_sort_entries(iter_start, iter_stop); 295 275 296 276 for (iter = iter_start; iter < iter_stop; iter++) { 297 - if (iter->key == (jump_label_t)(unsigned long)key) 277 + struct static_key *iterk; 278 + 279 + iterk = (struct static_key *)(unsigned long)iter->key; 280 + if (iterk == key) 298 281 continue; 299 282 300 - key = (struct jump_label_key *)(unsigned long)iter->key; 301 - 283 + key = iterk; 302 284 if (__module_address(iter->key) == mod) { 303 - atomic_set(&key->enabled, 0); 304 - key->entries = iter; 285 + /* 286 + * Set key->entries to iter, but preserve JUMP_LABEL_TRUE_BRANCH. 
287 + */ 288 + *((unsigned long *)&key->entries) += (unsigned long)iter; 305 289 key->next = NULL; 306 290 continue; 307 291 } 308 - 309 - jlm = kzalloc(sizeof(struct jump_label_mod), GFP_KERNEL); 292 + jlm = kzalloc(sizeof(struct static_key_mod), GFP_KERNEL); 310 293 if (!jlm) 311 294 return -ENOMEM; 312 - 313 295 jlm->mod = mod; 314 296 jlm->entries = iter; 315 297 jlm->next = key->next; 316 298 key->next = jlm; 317 299 318 - if (jump_label_enabled(key)) 300 + if (jump_label_type(key) == JUMP_LABEL_ENABLE) 319 301 __jump_label_update(key, iter, iter_stop, JUMP_LABEL_ENABLE); 320 302 } 321 303 ··· 329 307 struct jump_entry *iter_start = mod->jump_entries; 330 308 struct jump_entry *iter_stop = iter_start + mod->num_jump_entries; 331 309 struct jump_entry *iter; 332 - struct jump_label_key *key = NULL; 333 - struct jump_label_mod *jlm, **prev; 310 + struct static_key *key = NULL; 311 + struct static_key_mod *jlm, **prev; 334 312 335 313 for (iter = iter_start; iter < iter_stop; iter++) { 336 314 if (iter->key == (jump_label_t)(unsigned long)key) 337 315 continue; 338 316 339 - key = (struct jump_label_key *)(unsigned long)iter->key; 317 + key = (struct static_key *)(unsigned long)iter->key; 340 318 341 319 if (__module_address(iter->key) == mod) 342 320 continue; ··· 438 416 return ret; 439 417 } 440 418 441 - static void jump_label_update(struct jump_label_key *key, int enable) 419 + static void jump_label_update(struct static_key *key, int enable) 442 420 { 443 - struct jump_entry *entry = key->entries, *stop = __stop___jump_table; 421 + struct jump_entry *stop = __stop___jump_table; 422 + struct jump_entry *entry = jump_label_get_entries(key); 444 423 445 424 #ifdef CONFIG_MODULES 446 - struct module *mod = __module_address((jump_label_t)key); 425 + struct module *mod = __module_address((unsigned long)key); 447 426 448 427 __jump_label_mod_update(key, enable); 449 428
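The new `jump_label_type()` helper above decides which way to patch a branch site by comparing the key's compile-time default direction with its current enabled state. The decision table is small enough to model in plain userspace C; the names below (`branch_patch_type`, `JL_ENABLE`) are illustrative stand-ins, not the kernel's own identifiers:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace sketch of the patch-direction decision: a site is
 * patched to its ENABLE form only when the key's runtime state
 * disagrees with the direction that was compiled in by default.
 * Enum and helper names here are invented for illustration. */
enum jl_type { JL_DISABLE = 0, JL_ENABLE = 1 };

static enum jl_type branch_patch_type(bool default_true, bool enabled)
{
	/* default-false key that is enabled, or default-true key
	 * that is disabled: both need the site flipped. */
	if ((!default_true && enabled) || (default_true && !enabled))
		return JL_ENABLE;
	return JL_DISABLE;
}
```

In other words the patch type is effectively `default XOR state`, which is why a single `static_key_slow_inc()` can mean "enable" for a `STATIC_KEY_INIT_FALSE` key but "disable the out-of-line path" for a `STATIC_KEY_INIT_TRUE` one.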
+9 -9
kernel/sched/core.c
··· 162 162 163 163 #ifdef HAVE_JUMP_LABEL 164 164 165 - #define jump_label_key__true jump_label_key_enabled 166 - #define jump_label_key__false jump_label_key_disabled 165 + #define jump_label_key__true STATIC_KEY_INIT_TRUE 166 + #define jump_label_key__false STATIC_KEY_INIT_FALSE 167 167 168 168 #define SCHED_FEAT(name, enabled) \ 169 169 jump_label_key__##enabled , 170 170 171 - struct jump_label_key sched_feat_keys[__SCHED_FEAT_NR] = { 171 + struct static_key sched_feat_keys[__SCHED_FEAT_NR] = { 172 172 #include "features.h" 173 173 }; 174 174 ··· 176 176 177 177 static void sched_feat_disable(int i) 178 178 { 179 - if (jump_label_enabled(&sched_feat_keys[i])) 180 - jump_label_dec(&sched_feat_keys[i]); 179 + if (static_key_enabled(&sched_feat_keys[i])) 180 + static_key_slow_dec(&sched_feat_keys[i]); 181 181 } 182 182 183 183 static void sched_feat_enable(int i) 184 184 { 185 - if (!jump_label_enabled(&sched_feat_keys[i])) 186 - jump_label_inc(&sched_feat_keys[i]); 185 + if (!static_key_enabled(&sched_feat_keys[i])) 186 + static_key_slow_inc(&sched_feat_keys[i]); 187 187 } 188 188 #else 189 189 static void sched_feat_disable(int i) { }; ··· 894 894 delta -= irq_delta; 895 895 #endif 896 896 #ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING 897 - if (static_branch((&paravirt_steal_rq_enabled))) { 897 + if (static_key_false((&paravirt_steal_rq_enabled))) { 898 898 u64 st; 899 899 900 900 steal = paravirt_steal_clock(cpu_of(rq)); ··· 2756 2756 static __always_inline bool steal_account_process_tick(void) 2757 2757 { 2758 2758 #ifdef CONFIG_PARAVIRT 2759 - if (static_branch(&paravirt_steal_enabled)) { 2759 + if (static_key_false(&paravirt_steal_enabled)) { 2760 2760 u64 steal, st = 0; 2761 2761 2762 2762 steal = paravirt_steal_clock(smp_processor_id());
+4 -4
kernel/sched/fair.c
··· 1399 1399 #ifdef CONFIG_CFS_BANDWIDTH 1400 1400 1401 1401 #ifdef HAVE_JUMP_LABEL 1402 - static struct jump_label_key __cfs_bandwidth_used; 1402 + static struct static_key __cfs_bandwidth_used; 1403 1403 1404 1404 static inline bool cfs_bandwidth_used(void) 1405 1405 { 1406 - return static_branch(&__cfs_bandwidth_used); 1406 + return static_key_false(&__cfs_bandwidth_used); 1407 1407 } 1408 1408 1409 1409 void account_cfs_bandwidth_used(int enabled, int was_enabled) 1410 1410 { 1411 1411 /* only need to count groups transitioning between enabled/!enabled */ 1412 1412 if (enabled && !was_enabled) 1413 - jump_label_inc(&__cfs_bandwidth_used); 1413 + static_key_slow_inc(&__cfs_bandwidth_used); 1414 1414 else if (!enabled && was_enabled) 1415 - jump_label_dec(&__cfs_bandwidth_used); 1415 + static_key_slow_dec(&__cfs_bandwidth_used); 1416 1416 } 1417 1417 #else /* HAVE_JUMP_LABEL */ 1418 1418 static bool cfs_bandwidth_used(void)
+7 -7
kernel/sched/sched.h
··· 611 611 * Tunables that become constants when CONFIG_SCHED_DEBUG is off: 612 612 */ 613 613 #ifdef CONFIG_SCHED_DEBUG 614 - # include <linux/jump_label.h> 614 + # include <linux/static_key.h> 615 615 # define const_debug __read_mostly 616 616 #else 617 617 # define const_debug const ··· 630 630 #undef SCHED_FEAT 631 631 632 632 #if defined(CONFIG_SCHED_DEBUG) && defined(HAVE_JUMP_LABEL) 633 - static __always_inline bool static_branch__true(struct jump_label_key *key) 633 + static __always_inline bool static_branch__true(struct static_key *key) 634 634 { 635 - return likely(static_branch(key)); /* Not out of line branch. */ 635 + return static_key_true(key); /* Not out of line branch. */ 636 636 } 637 637 638 - static __always_inline bool static_branch__false(struct jump_label_key *key) 638 + static __always_inline bool static_branch__false(struct static_key *key) 639 639 { 640 - return unlikely(static_branch(key)); /* Out of line branch. */ 640 + return static_key_false(key); /* Out of line branch. */ 641 641 } 642 642 643 643 #define SCHED_FEAT(name, enabled) \ 644 - static __always_inline bool static_branch_##name(struct jump_label_key *key) \ 644 + static __always_inline bool static_branch_##name(struct static_key *key) \ 645 645 { \ 646 646 return static_branch__##enabled(key); \ 647 647 } ··· 650 650 651 651 #undef SCHED_FEAT 652 652 653 - extern struct jump_label_key sched_feat_keys[__SCHED_FEAT_NR]; 653 + extern struct static_key sched_feat_keys[__SCHED_FEAT_NR]; 654 654 #define sched_feat(x) (static_branch_##x(&sched_feat_keys[__SCHED_FEAT_##x])) 655 655 #else /* !(SCHED_DEBUG && HAVE_JUMP_LABEL) */ 656 656 #define sched_feat(x) (sysctl_sched_features & (1UL << __SCHED_FEAT_##x))
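The `SCHED_FEAT` changes above lean on an X-macro pattern: one feature list in `features.h` is expanded several times, once into key initializers and once into a per-feature `static_branch_##name()` accessor whose inline body is chosen by the feature's default. A minimal userspace sketch of the same preprocessor trick, with invented feature names and plain `bool` returns standing in for the `static_key` tests:

```c
#include <assert.h>
#include <stdbool.h>

/* One feature list, expanded twice: first into an enum of ids,
 * then into one test helper per feature. The feature names and
 * defaults below are made up for illustration. */
#define SCHED_FEATURES(F)	\
	F(GENTLE_FAIR, true)	\
	F(START_DEBIT, false)

#define SCHED_FEAT_ENUM(name, enabled) FEAT_##name,
enum { SCHED_FEATURES(SCHED_FEAT_ENUM) FEAT_NR };
#undef SCHED_FEAT_ENUM

/* In the kernel, "enabled" selects static_branch__true() or
 * static_branch__false() over a static_key; here it simply
 * returns the compile-time default. */
#define SCHED_FEAT_FUNC(name, enabled) \
	static bool feat_##name(void) { return enabled; }
SCHED_FEATURES(SCHED_FEAT_FUNC)
#undef SCHED_FEAT_FUNC
```

The payoff in `sched.h` is that `sched_feat(x)` compiles down to a `static_key_true()` or `static_key_false()` test whose default branch direction matches the feature's default, so the common case costs a no-op.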
+10 -10
kernel/tracepoint.c
··· 25 25 #include <linux/err.h> 26 26 #include <linux/slab.h> 27 27 #include <linux/sched.h> 28 - #include <linux/jump_label.h> 28 + #include <linux/static_key.h> 29 29 30 30 extern struct tracepoint * const __start___tracepoints_ptrs[]; 31 31 extern struct tracepoint * const __stop___tracepoints_ptrs[]; ··· 256 256 { 257 257 WARN_ON(strcmp((*entry)->name, elem->name) != 0); 258 258 259 - if (elem->regfunc && !jump_label_enabled(&elem->key) && active) 259 + if (elem->regfunc && !static_key_enabled(&elem->key) && active) 260 260 elem->regfunc(); 261 - else if (elem->unregfunc && jump_label_enabled(&elem->key) && !active) 261 + else if (elem->unregfunc && static_key_enabled(&elem->key) && !active) 262 262 elem->unregfunc(); 263 263 264 264 /* ··· 269 269 * is used. 270 270 */ 271 271 rcu_assign_pointer(elem->funcs, (*entry)->funcs); 272 - if (active && !jump_label_enabled(&elem->key)) 273 - jump_label_inc(&elem->key); 274 - else if (!active && jump_label_enabled(&elem->key)) 275 - jump_label_dec(&elem->key); 272 + if (active && !static_key_enabled(&elem->key)) 273 + static_key_slow_inc(&elem->key); 274 + else if (!active && static_key_enabled(&elem->key)) 275 + static_key_slow_dec(&elem->key); 276 276 } 277 277 278 278 /* ··· 283 283 */ 284 284 static void disable_tracepoint(struct tracepoint *elem) 285 285 { 286 - if (elem->unregfunc && jump_label_enabled(&elem->key)) 286 + if (elem->unregfunc && static_key_enabled(&elem->key)) 287 287 elem->unregfunc(); 288 288 289 - if (jump_label_enabled(&elem->key)) 290 - jump_label_dec(&elem->key); 289 + if (static_key_enabled(&elem->key)) 290 + static_key_slow_dec(&elem->key); 291 291 rcu_assign_pointer(elem->funcs, NULL); 292 292 } 293 293
+12 -12
net/core/dev.c
··· 134 134 #include <linux/inetdevice.h> 135 135 #include <linux/cpu_rmap.h> 136 136 #include <linux/net_tstamp.h> 137 - #include <linux/jump_label.h> 137 + #include <linux/static_key.h> 138 138 #include <net/flow_keys.h> 139 139 140 140 #include "net-sysfs.h" ··· 1441 1441 } 1442 1442 EXPORT_SYMBOL(call_netdevice_notifiers); 1443 1443 1444 - static struct jump_label_key netstamp_needed __read_mostly; 1444 + static struct static_key netstamp_needed __read_mostly; 1445 1445 #ifdef HAVE_JUMP_LABEL 1446 - /* We are not allowed to call jump_label_dec() from irq context 1446 + /* We are not allowed to call static_key_slow_dec() from irq context 1447 1447 * If net_disable_timestamp() is called from irq context, defer the 1448 - * jump_label_dec() calls. 1448 + * static_key_slow_dec() calls. 1449 1449 */ 1450 1450 static atomic_t netstamp_needed_deferred; 1451 1451 #endif ··· 1457 1457 1458 1458 if (deferred) { 1459 1459 while (--deferred) 1460 - jump_label_dec(&netstamp_needed); 1460 + static_key_slow_dec(&netstamp_needed); 1461 1461 return; 1462 1462 } 1463 1463 #endif 1464 1464 WARN_ON(in_interrupt()); 1465 - jump_label_inc(&netstamp_needed); 1465 + static_key_slow_inc(&netstamp_needed); 1466 1466 } 1467 1467 EXPORT_SYMBOL(net_enable_timestamp); 1468 1468 ··· 1474 1474 return; 1475 1475 } 1476 1476 #endif 1477 - jump_label_dec(&netstamp_needed); 1477 + static_key_slow_dec(&netstamp_needed); 1478 1478 } 1479 1479 EXPORT_SYMBOL(net_disable_timestamp); 1480 1480 1481 1481 static inline void net_timestamp_set(struct sk_buff *skb) 1482 1482 { 1483 1483 skb->tstamp.tv64 = 0; 1484 - if (static_branch(&netstamp_needed)) 1484 + if (static_key_false(&netstamp_needed)) 1485 1485 __net_timestamp(skb); 1486 1486 } 1487 1487 1488 1488 #define net_timestamp_check(COND, SKB) \ 1489 - if (static_branch(&netstamp_needed)) { \ 1489 + if (static_key_false(&netstamp_needed)) { \ 1490 1490 if ((COND) && !(SKB)->tstamp.tv64) \ 1491 1491 __net_timestamp(SKB); \ 1492 1492 } \ ··· 2660 2660 
struct rps_sock_flow_table __rcu *rps_sock_flow_table __read_mostly; 2661 2661 EXPORT_SYMBOL(rps_sock_flow_table); 2662 2662 2663 - struct jump_label_key rps_needed __read_mostly; 2663 + struct static_key rps_needed __read_mostly; 2664 2664 2665 2665 static struct rps_dev_flow * 2666 2666 set_rps_cpu(struct net_device *dev, struct sk_buff *skb, ··· 2945 2945 2946 2946 trace_netif_rx(skb); 2947 2947 #ifdef CONFIG_RPS 2948 - if (static_branch(&rps_needed)) { 2948 + if (static_key_false(&rps_needed)) { 2949 2949 struct rps_dev_flow voidflow, *rflow = &voidflow; 2950 2950 int cpu; 2951 2951 ··· 3309 3309 return NET_RX_SUCCESS; 3310 3310 3311 3311 #ifdef CONFIG_RPS 3312 - if (static_branch(&rps_needed)) { 3312 + if (static_key_false(&rps_needed)) { 3313 3313 struct rps_dev_flow voidflow, *rflow = &voidflow; 3314 3314 int cpu, ret; 3315 3315
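The `netstamp_needed_deferred` comment in the hunk above describes a deferral trick: `static_key_slow_dec()` cannot run in irq context, so a decrement requested there is parked in an atomic counter and drained by the next `net_enable_timestamp()` call. A simplified single-threaded C sketch of that bookkeeping, with a plain `int` standing in for `atomic_t` and an explicit `in_irq` flag replacing the kernel's context check:

```c
#include <assert.h>

static int key_count;	/* stands in for the static_key's count */
static int deferred;	/* decrements parked by irq context */

static void enable_timestamp(void)
{
	int d = deferred;	/* kernel uses atomic_xchg(..., 0) */

	deferred = 0;
	if (d) {
		/* One parked decrement cancels this enable;
		 * apply the remainder as real decrements. */
		while (--d)
			key_count--;
		return;
	}
	key_count++;
}

static void disable_timestamp(int in_irq)
{
	if (in_irq) {
		deferred++;	/* too late to patch code here */
		return;
	}
	key_count--;
}
```

The net effect is identical to calling inc/dec directly, but the expensive code-patching work only ever happens in a context where sleeping is allowed.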
+2 -2
net/core/net-sysfs.c
··· 608 608 spin_unlock(&rps_map_lock); 609 609 610 610 if (map) 611 - jump_label_inc(&rps_needed); 611 + static_key_slow_inc(&rps_needed); 612 612 if (old_map) { 613 613 kfree_rcu(old_map, rcu); 614 - jump_label_dec(&rps_needed); 614 + static_key_slow_dec(&rps_needed); 615 615 } 616 616 free_cpumask_var(mask); 617 617 return len;
+2 -2
net/core/sock.c
··· 111 111 #include <linux/init.h> 112 112 #include <linux/highmem.h> 113 113 #include <linux/user_namespace.h> 114 - #include <linux/jump_label.h> 114 + #include <linux/static_key.h> 115 115 #include <linux/memcontrol.h> 116 116 117 117 #include <asm/uaccess.h> ··· 184 184 static struct lock_class_key af_family_keys[AF_MAX]; 185 185 static struct lock_class_key af_family_slock_keys[AF_MAX]; 186 186 187 - struct jump_label_key memcg_socket_limit_enabled; 187 + struct static_key memcg_socket_limit_enabled; 188 188 EXPORT_SYMBOL(memcg_socket_limit_enabled); 189 189 190 190 /*
+2 -2
net/core/sysctl_net_core.c
··· 69 69 if (sock_table != orig_sock_table) { 70 70 rcu_assign_pointer(rps_sock_flow_table, sock_table); 71 71 if (sock_table) 72 - jump_label_inc(&rps_needed); 72 + static_key_slow_inc(&rps_needed); 73 73 if (orig_sock_table) { 74 - jump_label_dec(&rps_needed); 74 + static_key_slow_dec(&rps_needed); 75 75 synchronize_rcu(); 76 76 vfree(orig_sock_table); 77 77 }
+3 -3
net/ipv4/tcp_memcontrol.c
··· 111 111 val = res_counter_read_u64(&tcp->tcp_memory_allocated, RES_LIMIT); 112 112 113 113 if (val != RESOURCE_MAX) 114 - jump_label_dec(&memcg_socket_limit_enabled); 114 + static_key_slow_dec(&memcg_socket_limit_enabled); 115 115 } 116 116 EXPORT_SYMBOL(tcp_destroy_cgroup); 117 117 ··· 143 143 net->ipv4.sysctl_tcp_mem[i]); 144 144 145 145 if (val == RESOURCE_MAX && old_lim != RESOURCE_MAX) 146 - jump_label_dec(&memcg_socket_limit_enabled); 146 + static_key_slow_dec(&memcg_socket_limit_enabled); 147 147 else if (old_lim == RESOURCE_MAX && val != RESOURCE_MAX) 148 - jump_label_inc(&memcg_socket_limit_enabled); 148 + static_key_slow_inc(&memcg_socket_limit_enabled); 149 149 150 150 return 0; 151 151 }
+3 -3
net/netfilter/core.c
··· 56 56 EXPORT_SYMBOL(nf_hooks); 57 57 58 58 #if defined(CONFIG_JUMP_LABEL) 59 - struct jump_label_key nf_hooks_needed[NFPROTO_NUMPROTO][NF_MAX_HOOKS]; 59 + struct static_key nf_hooks_needed[NFPROTO_NUMPROTO][NF_MAX_HOOKS]; 60 60 EXPORT_SYMBOL(nf_hooks_needed); 61 61 #endif 62 62 ··· 77 77 list_add_rcu(&reg->list, elem->list.prev); 78 78 mutex_unlock(&nf_hook_mutex); 79 79 #if defined(CONFIG_JUMP_LABEL) 80 - jump_label_inc(&nf_hooks_needed[reg->pf][reg->hooknum]); 80 + static_key_slow_inc(&nf_hooks_needed[reg->pf][reg->hooknum]); 81 81 #endif 82 82 return 0; 83 83 } ··· 89 89 list_del_rcu(&reg->list); 90 90 mutex_unlock(&nf_hook_mutex); 91 91 #if defined(CONFIG_JUMP_LABEL) 92 - jump_label_dec(&nf_hooks_needed[reg->pf][reg->hooknum]); 92 + static_key_slow_dec(&nf_hooks_needed[reg->pf][reg->hooknum]); 93 93 #endif 94 94 synchronize_net(); 95 95 }