Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:
"The main changes in this cycle are:

- rwsem scalability improvements, phase #2, by Waiman Long, which are
rather impressive:

"On a 2-socket 40-core 80-thread Skylake system with 40 reader
and writer locking threads, the min/mean/max locking operations
done in a 5-second testing window before the patchset were:

40 readers, Iterations Min/Mean/Max = 1,807/1,808/1,810
40 writers, Iterations Min/Mean/Max = 1,807/50,344/151,255

After the patchset, they became:

40 readers, Iterations Min/Mean/Max = 30,057/31,359/32,741
40 writers, Iterations Min/Mean/Max = 94,466/95,845/97,098"

There are a lot of changes to the locking implementation that make
it similar to qrwlock, including owner handoff for fairer
locking.

Another microbenchmark shows the improvements across the
spectrum:

"With a locking microbenchmark running on 5.1 based kernel, the
total locking rates (in kops/s) on a 2-socket Skylake system
with equal numbers of readers and writers (mixed) before and
after this patchset were:

     # of Threads   Before Patch   After Patch
     ------------   ------------   -----------
          2             2,618         4,193
          4             1,202         3,726
          8               802         3,622
         16               729         3,359
         32               319         2,826
         64               102         2,744"

The changes are extensive and the patch-set has been through
several iterations addressing various locking workloads. There
might be more regressions, but unless they are pathological I
believe we want to use this new implementation as the baseline
going forward.

- jump-label optimizations by Daniel Bristot de Oliveira: the primary
motivation was to remove IPI disturbance of isolated RT-workload
CPUs, which resulted in the implementation of batched jump-label
updates. Beyond improving the kernel's real-time
characteristics, in one test this patchset reduced static key
update overhead from 57 msecs to just 1.4 msecs - which is a nice
speedup as well.

- atomic64_t cross-arch type cleanups by Mark Rutland: over the last
~10 years of atomic64_t existence the various types used by the
APIs only had to be self-consistent within each architecture -
which means they became wildly inconsistent across architectures.
Mark puts an end to this by reworking all the atomic64
implementations to use 's64' as the base type for atomic64_t, and
to ensure that this type is consistently used for parameters and
return values in the API, avoiding further problems in this area.

- A large set of small improvements to lockdep by Yuyang Du: type
cleanups, output cleanups, function return type and other cleanups
all around the place.

- A set of percpu ops cleanups and fixes by Peter Zijlstra.

- Misc other changes - please see the Git log for more details"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits)
locking/lockdep: increase size of counters for lockdep statistics
locking/atomics: Use sed(1) instead of non-standard head(1) option
locking/lockdep: Move mark_lock() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING
x86/jump_label: Make tp_vec_nr static
x86/percpu: Optimize raw_cpu_xchg()
x86/percpu, sched/fair: Avoid local_clock()
x86/percpu, x86/irq: Relax {set,get}_irq_regs()
x86/percpu: Relax smp_processor_id()
x86/percpu: Differentiate this_cpu_{}() and __this_cpu_{}()
locking/rwsem: Guard against making count negative
locking/rwsem: Adaptive disabling of reader optimistic spinning
locking/rwsem: Enable time-based spinning on reader-owned rwsem
locking/rwsem: Make rwsem->owner an atomic_long_t
locking/rwsem: Enable readers spinning on writer
locking/rwsem: Clarify usage of owner's nonspinnable bit
locking/rwsem: Wake up almost all readers in wait queue
locking/rwsem: More optimal RT task handling of null owner
locking/rwsem: Always release wait_lock before waking up tasks
locking/rwsem: Implement lock handoff to prevent lock starvation
locking/rwsem: Make rwsem_spin_on_owner() return owner state
...

+2785 -2017
+7 -2
Documentation/atomic_t.txt
···
 The non-RMW ops are (typically) regular LOADs and STOREs and are canonically
 implemented using READ_ONCE(), WRITE_ONCE(), smp_load_acquire() and
-smp_store_release() respectively.
+smp_store_release() respectively. Therefore, if you find yourself only using
+the Non-RMW operations of atomic_t, you do not in fact need atomic_t at all
+and are doing it wrong.
 
-The one detail to this is that atomic_set{}() should be observable to the RMW
+A subtle detail of atomic_set{}() is that it should be observable to the RMW
 ops. That is:
 
   C atomic-set
···
 These helper barriers exist because architectures have varying implicit
 ordering on their SMP atomic primitives. For example our TSO architectures
 provide full ordered atomics and these barriers are no-ops.
+
+NOTE: when the atomic RmW ops are fully ordered, they should also imply a
+compiler barrier.
 
 Thus:
+83 -27
Documentation/locking/lockdep-design.txt
···
 struct is one class, while each inode has its own instantiation of that
 lock class.
 
-The validator tracks the 'state' of lock-classes, and it tracks
-dependencies between different lock-classes. The validator maintains a
-rolling proof that the state and the dependencies are correct.
+The validator tracks the 'usage state' of lock-classes, and it tracks
+the dependencies between different lock-classes. Lock usage indicates
+how a lock is used with regard to its IRQ contexts, while lock
+dependency can be understood as lock order, where L1 -> L2 suggests that
+a task is attempting to acquire L2 while holding L1. From lockdep's
+perspective, the two locks (L1 and L2) are not necessarily related; that
+dependency just means the order ever happened. The validator maintains a
+continuing effort to prove lock usages and dependencies are correct or
+the validator will shoot a splat if incorrect.
 
-Unlike an lock instantiation, the lock-class itself never goes away: when
-a lock-class is used for the first time after bootup it gets registered,
-and all subsequent uses of that lock-class will be attached to this
-lock-class.
+A lock-class's behavior is constructed by its instances collectively:
+when the first instance of a lock-class is used after bootup the class
+gets registered, then all (subsequent) instances will be mapped to the
+class and hence their usages and dependecies will contribute to those of
+the class. A lock-class does not go away when a lock instance does, but
+it can be removed if the memory space of the lock class (static or
+dynamic) is reclaimed, this happens for example when a module is
+unloaded or a workqueue is destroyed.
 
 State
 -----
 
-The validator tracks lock-class usage history into 4 * nSTATEs + 1 separate
-state bits:
+The validator tracks lock-class usage history and divides the usage into
+(4 usages * n STATEs + 1) categories:
 
+where the 4 usages can be:
 - 'ever held in STATE context'
 - 'ever held as readlock in STATE context'
 - 'ever held with STATE enabled'
 - 'ever held as readlock with STATE enabled'
 
-Where STATE can be either one of (kernel/locking/lockdep_states.h)
- - hardirq
- - softirq
+where the n STATEs are coded in kernel/locking/lockdep_states.h and as of
+now they include:
+- hardirq
+- softirq
 
+where the last 1 category is:
 - 'ever used'                                       [ == !unused        ]
 
-When locking rules are violated, these state bits are presented in the
-locking error messages, inside curlies. A contrived example:
+When locking rules are violated, these usage bits are presented in the
+locking error messages, inside curlies, with a total of 2 * n STATEs bits.
+A contrived example:
 
    modprobe/2287 is trying to acquire lock:
     (&sio_locks[i].lock){-.-.}, at: [<c02867fd>] mutex_lock+0x21/0x24
···
     (&sio_locks[i].lock){-.-.}, at: [<c02867fd>] mutex_lock+0x21/0x24
 
 
-The bit position indicates STATE, STATE-read, for each of the states listed
-above, and the character displayed in each indicates:
+For a given lock, the bit positions from left to right indicate the usage
+of the lock and readlock (if exists), for each of the n STATEs listed
+above respectively, and the character displayed at each bit position
+indicates:
 
    '.'  acquired while irqs disabled and not in irq context
    '-'  acquired in irq context
    '+'  acquired with irqs enabled
    '?'  acquired in irq context with irqs enabled.
 
-Unused mutexes cannot be part of the cause of an error.
+The bits are illustrated with an example:
+
+    (&sio_locks[i].lock){-.-.}, at: [<c02867fd>] mutex_lock+0x21/0x24
+                         ||||
+                         ||| \-> softirq disabled and not in softirq context
+                         || \--> acquired in softirq context
+                         | \---> hardirq disabled and not in hardirq context
+                          \----> acquired in hardirq context
+
+
+For a given STATE, whether the lock is ever acquired in that STATE
+context and whether that STATE is enabled yields four possible cases as
+shown in the table below. The bit character is able to indicate which
+exact case is for the lock as of the reporting time.
+
+   -------------------------------------------
+  |              | irq enabled | irq disabled |
+  |-------------------------------------------|
+  | ever in irq  |      ?      |       -      |
+  |-------------------------------------------|
+  | never in irq |      +      |       .      |
+   -------------------------------------------
+
+The character '-' suggests irq is disabled because if otherwise the
+charactor '?' would have been shown instead. Similar deduction can be
+applied for '+' too.
+
+Unused locks (e.g., mutexes) cannot be part of the cause of an error.
 
 
 Single-lock state rules:
 ------------------------
 
+A lock is irq-safe means it was ever used in an irq context, while a lock
+is irq-unsafe means it was ever acquired with irq enabled.
+
 A softirq-unsafe lock-class is automatically hardirq-unsafe as well. The
-following states are exclusive, and only one of them is allowed to be
-set for any lock-class:
+following states must be exclusive: only one of them is allowed to be set
+for any lock-class based on its usage:
 
- <hardirq-safe> and <hardirq-unsafe>
- <softirq-safe> and <softirq-unsafe>
+ <hardirq-safe> or <hardirq-unsafe>
+ <softirq-safe> or <softirq-unsafe>
 
-The validator detects and reports lock usage that violate these
+This is because if a lock can be used in irq context (irq-safe) then it
+cannot be ever acquired with irq enabled (irq-unsafe). Otherwise, a
+deadlock may happen. For example, in the scenario that after this lock
+was acquired but before released, if the context is interrupted this
+lock will be attempted to acquire twice, which creates a deadlock,
+referred to as lock recursion deadlock.
+
+The validator detects and reports lock usage that violates these
 single-lock state rules.
 
 Multi-lock dependency rules:
···
 The same lock-class must not be acquired twice, because this could lead
 to lock recursion deadlocks.
 
-Furthermore, two locks may not be taken in different order:
+Furthermore, two locks can not be taken in inverse order:
 
  <L1> -> <L2>
  <L2> -> <L1>
 
-because this could lead to lock inversion deadlocks. (The validator
-finds such dependencies in arbitrary complexity, i.e. there can be any
-other locking sequence between the acquire-lock operations, the
-validator will still track all dependencies between locks.)
+because this could lead to a deadlock - referred to as lock inversion
+deadlock - as attempts to acquire the two locks form a circle which
+could lead to the two contexts waiting for each other permanently. The
+validator will find such dependency circle in arbitrary complexity,
+i.e., there can be any other locking sequence between the acquire-lock
+operations; the validator will still find whether these locks can be
+acquired in a circular fashion.
 
 Furthermore, the following usage based lock dependencies are not allowed
 between any two lock-classes:
+10 -10
arch/alpha/include/asm/atomic.h
···
 }
 
 #define ATOMIC64_OP(op, asm_op)					\
-static __inline__ void atomic64_##op(long i, atomic64_t * v)		\
+static __inline__ void atomic64_##op(s64 i, atomic64_t * v)		\
 {									\
-	unsigned long temp;						\
+	s64 temp;							\
 	__asm__ __volatile__(						\
 	"1:	ldq_l %0,%1\n"						\
 	"	" #asm_op " %0,%2,%0\n"					\
···
 }									\
 
 #define ATOMIC64_OP_RETURN(op, asm_op)					\
-static __inline__ long atomic64_##op##_return_relaxed(long i, atomic64_t * v)	\
+static __inline__ s64 atomic64_##op##_return_relaxed(s64 i, atomic64_t * v)	\
 {									\
-	long temp, result;						\
+	s64 temp, result;						\
 	__asm__ __volatile__(						\
 	"1:	ldq_l %0,%1\n"						\
 	"	" #asm_op " %0,%3,%2\n"					\
···
 }
 
 #define ATOMIC64_FETCH_OP(op, asm_op)					\
-static __inline__ long atomic64_fetch_##op##_relaxed(long i, atomic64_t * v)	\
+static __inline__ s64 atomic64_fetch_##op##_relaxed(s64 i, atomic64_t * v)	\
 {									\
-	long temp, result;						\
+	s64 temp, result;						\
 	__asm__ __volatile__(						\
 	"1:	ldq_l %2,%1\n"						\
 	"	" #asm_op " %2,%3,%0\n"					\
···
  * Atomically adds @a to @v, so long as it was not @u.
  * Returns the old value of @v.
  */
-static __inline__ long atomic64_fetch_add_unless(atomic64_t *v, long a, long u)
+static __inline__ s64 atomic64_fetch_add_unless(atomic64_t *v, s64 a, s64 u)
 {
-	long c, new, old;
+	s64 c, new, old;
 	smp_mb();
 	__asm__ __volatile__(
 	"1:	ldq_l	%[old],%[mem]\n"
···
  * The function returns the old value of *v minus 1, even if
  * the atomic variable, v, was not decremented.
  */
-static inline long atomic64_dec_if_positive(atomic64_t *v)
+static inline s64 atomic64_dec_if_positive(atomic64_t *v)
 {
-	long old, tmp;
+	s64 old, tmp;
 	smp_mb();
 	__asm__ __volatile__(
 	"1:	ldq_l	%[old],%[mem]\n"
+20 -21
arch/arc/include/asm/atomic.h
··· 321 321 */ 322 322 323 323 typedef struct { 324 - aligned_u64 counter; 324 + s64 __aligned(8) counter; 325 325 } atomic64_t; 326 326 327 327 #define ATOMIC64_INIT(a) { (a) } 328 328 329 - static inline long long atomic64_read(const atomic64_t *v) 329 + static inline s64 atomic64_read(const atomic64_t *v) 330 330 { 331 - unsigned long long val; 331 + s64 val; 332 332 333 333 __asm__ __volatile__( 334 334 " ldd %0, [%1] \n" ··· 338 338 return val; 339 339 } 340 340 341 - static inline void atomic64_set(atomic64_t *v, long long a) 341 + static inline void atomic64_set(atomic64_t *v, s64 a) 342 342 { 343 343 /* 344 344 * This could have been a simple assignment in "C" but would need ··· 359 359 } 360 360 361 361 #define ATOMIC64_OP(op, op1, op2) \ 362 - static inline void atomic64_##op(long long a, atomic64_t *v) \ 362 + static inline void atomic64_##op(s64 a, atomic64_t *v) \ 363 363 { \ 364 - unsigned long long val; \ 364 + s64 val; \ 365 365 \ 366 366 __asm__ __volatile__( \ 367 367 "1: \n" \ ··· 372 372 " bnz 1b \n" \ 373 373 : "=&r"(val) \ 374 374 : "r"(&v->counter), "ir"(a) \ 375 - : "cc"); \ 375 + : "cc"); \ 376 376 } \ 377 377 378 378 #define ATOMIC64_OP_RETURN(op, op1, op2) \ 379 - static inline long long atomic64_##op##_return(long long a, atomic64_t *v) \ 379 + static inline s64 atomic64_##op##_return(s64 a, atomic64_t *v) \ 380 380 { \ 381 - unsigned long long val; \ 381 + s64 val; \ 382 382 \ 383 383 smp_mb(); \ 384 384 \ ··· 399 399 } 400 400 401 401 #define ATOMIC64_FETCH_OP(op, op1, op2) \ 402 - static inline long long atomic64_fetch_##op(long long a, atomic64_t *v) \ 402 + static inline s64 atomic64_fetch_##op(s64 a, atomic64_t *v) \ 403 403 { \ 404 - unsigned long long val, orig; \ 404 + s64 val, orig; \ 405 405 \ 406 406 smp_mb(); \ 407 407 \ ··· 441 441 #undef ATOMIC64_OP_RETURN 442 442 #undef ATOMIC64_OP 443 443 444 - static inline long long 445 - atomic64_cmpxchg(atomic64_t *ptr, long long expected, long long new) 444 + static inline s64 445 + 
atomic64_cmpxchg(atomic64_t *ptr, s64 expected, s64 new) 446 446 { 447 - long long prev; 447 + s64 prev; 448 448 449 449 smp_mb(); 450 450 ··· 464 464 return prev; 465 465 } 466 466 467 - static inline long long atomic64_xchg(atomic64_t *ptr, long long new) 467 + static inline s64 atomic64_xchg(atomic64_t *ptr, s64 new) 468 468 { 469 - long long prev; 469 + s64 prev; 470 470 471 471 smp_mb(); 472 472 ··· 492 492 * the atomic variable, v, was not decremented. 493 493 */ 494 494 495 - static inline long long atomic64_dec_if_positive(atomic64_t *v) 495 + static inline s64 atomic64_dec_if_positive(atomic64_t *v) 496 496 { 497 - long long val; 497 + s64 val; 498 498 499 499 smp_mb(); 500 500 ··· 525 525 * Atomically adds @a to @v, if it was not @u. 526 526 * Returns the old value of @v 527 527 */ 528 - static inline long long atomic64_fetch_add_unless(atomic64_t *v, long long a, 529 - long long u) 528 + static inline s64 atomic64_fetch_add_unless(atomic64_t *v, s64 a, s64 u) 530 529 { 531 - long long old, temp; 530 + s64 old, temp; 532 531 533 532 smp_mb(); 534 533
+24 -26
arch/arm/include/asm/atomic.h
··· 246 246 247 247 #ifndef CONFIG_GENERIC_ATOMIC64 248 248 typedef struct { 249 - long long counter; 249 + s64 counter; 250 250 } atomic64_t; 251 251 252 252 #define ATOMIC64_INIT(i) { (i) } 253 253 254 254 #ifdef CONFIG_ARM_LPAE 255 - static inline long long atomic64_read(const atomic64_t *v) 255 + static inline s64 atomic64_read(const atomic64_t *v) 256 256 { 257 - long long result; 257 + s64 result; 258 258 259 259 __asm__ __volatile__("@ atomic64_read\n" 260 260 " ldrd %0, %H0, [%1]" ··· 265 265 return result; 266 266 } 267 267 268 - static inline void atomic64_set(atomic64_t *v, long long i) 268 + static inline void atomic64_set(atomic64_t *v, s64 i) 269 269 { 270 270 __asm__ __volatile__("@ atomic64_set\n" 271 271 " strd %2, %H2, [%1]" ··· 274 274 ); 275 275 } 276 276 #else 277 - static inline long long atomic64_read(const atomic64_t *v) 277 + static inline s64 atomic64_read(const atomic64_t *v) 278 278 { 279 - long long result; 279 + s64 result; 280 280 281 281 __asm__ __volatile__("@ atomic64_read\n" 282 282 " ldrexd %0, %H0, [%1]" ··· 287 287 return result; 288 288 } 289 289 290 - static inline void atomic64_set(atomic64_t *v, long long i) 290 + static inline void atomic64_set(atomic64_t *v, s64 i) 291 291 { 292 - long long tmp; 292 + s64 tmp; 293 293 294 294 prefetchw(&v->counter); 295 295 __asm__ __volatile__("@ atomic64_set\n" ··· 304 304 #endif 305 305 306 306 #define ATOMIC64_OP(op, op1, op2) \ 307 - static inline void atomic64_##op(long long i, atomic64_t *v) \ 307 + static inline void atomic64_##op(s64 i, atomic64_t *v) \ 308 308 { \ 309 - long long result; \ 309 + s64 result; \ 310 310 unsigned long tmp; \ 311 311 \ 312 312 prefetchw(&v->counter); \ ··· 323 323 } \ 324 324 325 325 #define ATOMIC64_OP_RETURN(op, op1, op2) \ 326 - static inline long long \ 327 - atomic64_##op##_return_relaxed(long long i, atomic64_t *v) \ 326 + static inline s64 \ 327 + atomic64_##op##_return_relaxed(s64 i, atomic64_t *v) \ 328 328 { \ 329 - long long result; \ 329 
+ s64 result; \ 330 330 unsigned long tmp; \ 331 331 \ 332 332 prefetchw(&v->counter); \ ··· 346 346 } 347 347 348 348 #define ATOMIC64_FETCH_OP(op, op1, op2) \ 349 - static inline long long \ 350 - atomic64_fetch_##op##_relaxed(long long i, atomic64_t *v) \ 349 + static inline s64 \ 350 + atomic64_fetch_##op##_relaxed(s64 i, atomic64_t *v) \ 351 351 { \ 352 - long long result, val; \ 352 + s64 result, val; \ 353 353 unsigned long tmp; \ 354 354 \ 355 355 prefetchw(&v->counter); \ ··· 403 403 #undef ATOMIC64_OP_RETURN 404 404 #undef ATOMIC64_OP 405 405 406 - static inline long long 407 - atomic64_cmpxchg_relaxed(atomic64_t *ptr, long long old, long long new) 406 + static inline s64 atomic64_cmpxchg_relaxed(atomic64_t *ptr, s64 old, s64 new) 408 407 { 409 - long long oldval; 408 + s64 oldval; 410 409 unsigned long res; 411 410 412 411 prefetchw(&ptr->counter); ··· 426 427 } 427 428 #define atomic64_cmpxchg_relaxed atomic64_cmpxchg_relaxed 428 429 429 - static inline long long atomic64_xchg_relaxed(atomic64_t *ptr, long long new) 430 + static inline s64 atomic64_xchg_relaxed(atomic64_t *ptr, s64 new) 430 431 { 431 - long long result; 432 + s64 result; 432 433 unsigned long tmp; 433 434 434 435 prefetchw(&ptr->counter); ··· 446 447 } 447 448 #define atomic64_xchg_relaxed atomic64_xchg_relaxed 448 449 449 - static inline long long atomic64_dec_if_positive(atomic64_t *v) 450 + static inline s64 atomic64_dec_if_positive(atomic64_t *v) 450 451 { 451 - long long result; 452 + s64 result; 452 453 unsigned long tmp; 453 454 454 455 smp_mb(); ··· 474 475 } 475 476 #define atomic64_dec_if_positive atomic64_dec_if_positive 476 477 477 - static inline long long atomic64_fetch_add_unless(atomic64_t *v, long long a, 478 - long long u) 478 + static inline s64 atomic64_fetch_add_unless(atomic64_t *v, s64 a, s64 u) 479 479 { 480 - long long oldval, newval; 480 + s64 oldval, newval; 481 481 unsigned long tmp; 482 482 483 483 smp_mb();
+10 -10
arch/arm64/include/asm/atomic_ll_sc.h
··· 122 122 123 123 #define ATOMIC64_OP(op, asm_op) \ 124 124 __LL_SC_INLINE void \ 125 - __LL_SC_PREFIX(arch_atomic64_##op(long i, atomic64_t *v)) \ 125 + __LL_SC_PREFIX(arch_atomic64_##op(s64 i, atomic64_t *v)) \ 126 126 { \ 127 - long result; \ 127 + s64 result; \ 128 128 unsigned long tmp; \ 129 129 \ 130 130 asm volatile("// atomic64_" #op "\n" \ ··· 139 139 __LL_SC_EXPORT(arch_atomic64_##op); 140 140 141 141 #define ATOMIC64_OP_RETURN(name, mb, acq, rel, cl, op, asm_op) \ 142 - __LL_SC_INLINE long \ 143 - __LL_SC_PREFIX(arch_atomic64_##op##_return##name(long i, atomic64_t *v))\ 142 + __LL_SC_INLINE s64 \ 143 + __LL_SC_PREFIX(arch_atomic64_##op##_return##name(s64 i, atomic64_t *v))\ 144 144 { \ 145 - long result; \ 145 + s64 result; \ 146 146 unsigned long tmp; \ 147 147 \ 148 148 asm volatile("// atomic64_" #op "_return" #name "\n" \ ··· 161 161 __LL_SC_EXPORT(arch_atomic64_##op##_return##name); 162 162 163 163 #define ATOMIC64_FETCH_OP(name, mb, acq, rel, cl, op, asm_op) \ 164 - __LL_SC_INLINE long \ 165 - __LL_SC_PREFIX(arch_atomic64_fetch_##op##name(long i, atomic64_t *v)) \ 164 + __LL_SC_INLINE s64 \ 165 + __LL_SC_PREFIX(arch_atomic64_fetch_##op##name(s64 i, atomic64_t *v)) \ 166 166 { \ 167 - long result, val; \ 167 + s64 result, val; \ 168 168 unsigned long tmp; \ 169 169 \ 170 170 asm volatile("// atomic64_fetch_" #op #name "\n" \ ··· 214 214 #undef ATOMIC64_OP_RETURN 215 215 #undef ATOMIC64_OP 216 216 217 - __LL_SC_INLINE long 217 + __LL_SC_INLINE s64 218 218 __LL_SC_PREFIX(arch_atomic64_dec_if_positive(atomic64_t *v)) 219 219 { 220 - long result; 220 + s64 result; 221 221 unsigned long tmp; 222 222 223 223 asm volatile("// atomic64_dec_if_positive\n"
+17 -17
arch/arm64/include/asm/atomic_lse.h
··· 213 213 214 214 #define __LL_SC_ATOMIC64(op) __LL_SC_CALL(arch_atomic64_##op) 215 215 #define ATOMIC64_OP(op, asm_op) \ 216 - static inline void arch_atomic64_##op(long i, atomic64_t *v) \ 216 + static inline void arch_atomic64_##op(s64 i, atomic64_t *v) \ 217 217 { \ 218 - register long x0 asm ("x0") = i; \ 218 + register s64 x0 asm ("x0") = i; \ 219 219 register atomic64_t *x1 asm ("x1") = v; \ 220 220 \ 221 221 asm volatile(ARM64_LSE_ATOMIC_INSN(__LL_SC_ATOMIC64(op), \ ··· 233 233 #undef ATOMIC64_OP 234 234 235 235 #define ATOMIC64_FETCH_OP(name, mb, op, asm_op, cl...) \ 236 - static inline long arch_atomic64_fetch_##op##name(long i, atomic64_t *v)\ 236 + static inline s64 arch_atomic64_fetch_##op##name(s64 i, atomic64_t *v) \ 237 237 { \ 238 - register long x0 asm ("x0") = i; \ 238 + register s64 x0 asm ("x0") = i; \ 239 239 register atomic64_t *x1 asm ("x1") = v; \ 240 240 \ 241 241 asm volatile(ARM64_LSE_ATOMIC_INSN( \ ··· 265 265 #undef ATOMIC64_FETCH_OPS 266 266 267 267 #define ATOMIC64_OP_ADD_RETURN(name, mb, cl...) \ 268 - static inline long arch_atomic64_add_return##name(long i, atomic64_t *v)\ 268 + static inline s64 arch_atomic64_add_return##name(s64 i, atomic64_t *v) \ 269 269 { \ 270 - register long x0 asm ("x0") = i; \ 270 + register s64 x0 asm ("x0") = i; \ 271 271 register atomic64_t *x1 asm ("x1") = v; \ 272 272 \ 273 273 asm volatile(ARM64_LSE_ATOMIC_INSN( \ ··· 291 291 292 292 #undef ATOMIC64_OP_ADD_RETURN 293 293 294 - static inline void arch_atomic64_and(long i, atomic64_t *v) 294 + static inline void arch_atomic64_and(s64 i, atomic64_t *v) 295 295 { 296 - register long x0 asm ("x0") = i; 296 + register s64 x0 asm ("x0") = i; 297 297 register atomic64_t *x1 asm ("x1") = v; 298 298 299 299 asm volatile(ARM64_LSE_ATOMIC_INSN( ··· 309 309 } 310 310 311 311 #define ATOMIC64_FETCH_OP_AND(name, mb, cl...) 
\ 312 - static inline long arch_atomic64_fetch_and##name(long i, atomic64_t *v) \ 312 + static inline s64 arch_atomic64_fetch_and##name(s64 i, atomic64_t *v) \ 313 313 { \ 314 - register long x0 asm ("x0") = i; \ 314 + register s64 x0 asm ("x0") = i; \ 315 315 register atomic64_t *x1 asm ("x1") = v; \ 316 316 \ 317 317 asm volatile(ARM64_LSE_ATOMIC_INSN( \ ··· 335 335 336 336 #undef ATOMIC64_FETCH_OP_AND 337 337 338 - static inline void arch_atomic64_sub(long i, atomic64_t *v) 338 + static inline void arch_atomic64_sub(s64 i, atomic64_t *v) 339 339 { 340 - register long x0 asm ("x0") = i; 340 + register s64 x0 asm ("x0") = i; 341 341 register atomic64_t *x1 asm ("x1") = v; 342 342 343 343 asm volatile(ARM64_LSE_ATOMIC_INSN( ··· 353 353 } 354 354 355 355 #define ATOMIC64_OP_SUB_RETURN(name, mb, cl...) \ 356 - static inline long arch_atomic64_sub_return##name(long i, atomic64_t *v)\ 356 + static inline s64 arch_atomic64_sub_return##name(s64 i, atomic64_t *v) \ 357 357 { \ 358 - register long x0 asm ("x0") = i; \ 358 + register s64 x0 asm ("x0") = i; \ 359 359 register atomic64_t *x1 asm ("x1") = v; \ 360 360 \ 361 361 asm volatile(ARM64_LSE_ATOMIC_INSN( \ ··· 381 381 #undef ATOMIC64_OP_SUB_RETURN 382 382 383 383 #define ATOMIC64_FETCH_OP_SUB(name, mb, cl...) \ 384 - static inline long arch_atomic64_fetch_sub##name(long i, atomic64_t *v) \ 384 + static inline s64 arch_atomic64_fetch_sub##name(s64 i, atomic64_t *v) \ 385 385 { \ 386 - register long x0 asm ("x0") = i; \ 386 + register s64 x0 asm ("x0") = i; \ 387 387 register atomic64_t *x1 asm ("x1") = v; \ 388 388 \ 389 389 asm volatile(ARM64_LSE_ATOMIC_INSN( \ ··· 407 407 408 408 #undef ATOMIC64_FETCH_OP_SUB 409 409 410 - static inline long arch_atomic64_dec_if_positive(atomic64_t *v) 410 + static inline s64 arch_atomic64_dec_if_positive(atomic64_t *v) 411 411 { 412 412 register long x0 asm ("x0") = (long)v; 413 413
+10 -10
arch/ia64/include/asm/atomic.h
··· 124 124 #undef ATOMIC_OP 125 125 126 126 #define ATOMIC64_OP(op, c_op) \ 127 - static __inline__ long \ 128 - ia64_atomic64_##op (__s64 i, atomic64_t *v) \ 127 + static __inline__ s64 \ 128 + ia64_atomic64_##op (s64 i, atomic64_t *v) \ 129 129 { \ 130 - __s64 old, new; \ 130 + s64 old, new; \ 131 131 CMPXCHG_BUGCHECK_DECL \ 132 132 \ 133 133 do { \ ··· 139 139 } 140 140 141 141 #define ATOMIC64_FETCH_OP(op, c_op) \ 142 - static __inline__ long \ 143 - ia64_atomic64_fetch_##op (__s64 i, atomic64_t *v) \ 142 + static __inline__ s64 \ 143 + ia64_atomic64_fetch_##op (s64 i, atomic64_t *v) \ 144 144 { \ 145 - __s64 old, new; \ 145 + s64 old, new; \ 146 146 CMPXCHG_BUGCHECK_DECL \ 147 147 \ 148 148 do { \ ··· 162 162 163 163 #define atomic64_add_return(i,v) \ 164 164 ({ \ 165 - long __ia64_aar_i = (i); \ 165 + s64 __ia64_aar_i = (i); \ 166 166 __ia64_atomic_const(i) \ 167 167 ? ia64_fetch_and_add(__ia64_aar_i, &(v)->counter) \ 168 168 : ia64_atomic64_add(__ia64_aar_i, v); \ ··· 170 170 171 171 #define atomic64_sub_return(i,v) \ 172 172 ({ \ 173 - long __ia64_asr_i = (i); \ 173 + s64 __ia64_asr_i = (i); \ 174 174 __ia64_atomic_const(i) \ 175 175 ? ia64_fetch_and_add(-__ia64_asr_i, &(v)->counter) \ 176 176 : ia64_atomic64_sub(__ia64_asr_i, v); \ ··· 178 178 179 179 #define atomic64_fetch_add(i,v) \ 180 180 ({ \ 181 - long __ia64_aar_i = (i); \ 181 + s64 __ia64_aar_i = (i); \ 182 182 __ia64_atomic_const(i) \ 183 183 ? ia64_fetchadd(__ia64_aar_i, &(v)->counter, acq) \ 184 184 : ia64_atomic64_fetch_add(__ia64_aar_i, v); \ ··· 186 186 187 187 #define atomic64_fetch_sub(i,v) \ 188 188 ({ \ 189 - long __ia64_asr_i = (i); \ 189 + s64 __ia64_asr_i = (i); \ 190 190 __ia64_atomic_const(i) \ 191 191 ? ia64_fetchadd(-__ia64_asr_i, &(v)->counter, acq) \ 192 192 : ia64_atomic64_fetch_sub(__ia64_asr_i, v); \
+11 -11
arch/mips/include/asm/atomic.h
··· 254 254 #define atomic64_set(v, i) WRITE_ONCE((v)->counter, (i)) 255 255 256 256 #define ATOMIC64_OP(op, c_op, asm_op) \ 257 - static __inline__ void atomic64_##op(long i, atomic64_t * v) \ 257 + static __inline__ void atomic64_##op(s64 i, atomic64_t * v) \ 258 258 { \ 259 259 if (kernel_uses_llsc) { \ 260 - long temp; \ 260 + s64 temp; \ 261 261 \ 262 262 loongson_llsc_mb(); \ 263 263 __asm__ __volatile__( \ ··· 280 280 } 281 281 282 282 #define ATOMIC64_OP_RETURN(op, c_op, asm_op) \ 283 - static __inline__ long atomic64_##op##_return_relaxed(long i, atomic64_t * v) \ 283 + static __inline__ s64 atomic64_##op##_return_relaxed(s64 i, atomic64_t * v) \ 284 284 { \ 285 - long result; \ 285 + s64 result; \ 286 286 \ 287 287 if (kernel_uses_llsc) { \ 288 - long temp; \ 288 + s64 temp; \ 289 289 \ 290 290 loongson_llsc_mb(); \ 291 291 __asm__ __volatile__( \ ··· 314 314 } 315 315 316 316 #define ATOMIC64_FETCH_OP(op, c_op, asm_op) \ 317 - static __inline__ long atomic64_fetch_##op##_relaxed(long i, atomic64_t * v) \ 317 + static __inline__ s64 atomic64_fetch_##op##_relaxed(s64 i, atomic64_t * v) \ 318 318 { \ 319 - long result; \ 319 + s64 result; \ 320 320 \ 321 321 if (kernel_uses_llsc) { \ 322 - long temp; \ 322 + s64 temp; \ 323 323 \ 324 324 loongson_llsc_mb(); \ 325 325 __asm__ __volatile__( \ ··· 386 386 * Atomically test @v and subtract @i if @v is greater or equal than @i. 387 387 * The function returns the old value of @v minus @i. 388 388 */ 389 - static __inline__ long atomic64_sub_if_positive(long i, atomic64_t * v) 389 + static __inline__ s64 atomic64_sub_if_positive(s64 i, atomic64_t * v) 390 390 { 391 - long result; 391 + s64 result; 392 392 393 393 smp_mb__before_llsc(); 394 394 395 395 if (kernel_uses_llsc) { 396 - long temp; 396 + s64 temp; 397 397 398 398 __asm__ __volatile__( 399 399 " .set push \n"
+22 -22
arch/powerpc/include/asm/atomic.h
··· 297 297 298 298 #define ATOMIC64_INIT(i) { (i) } 299 299 300 - static __inline__ long atomic64_read(const atomic64_t *v) 300 + static __inline__ s64 atomic64_read(const atomic64_t *v) 301 301 { 302 - long t; 302 + s64 t; 303 303 304 304 __asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : "m"(v->counter)); 305 305 306 306 return t; 307 307 } 308 308 309 - static __inline__ void atomic64_set(atomic64_t *v, long i) 309 + static __inline__ void atomic64_set(atomic64_t *v, s64 i) 310 310 { 311 311 __asm__ __volatile__("std%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i)); 312 312 } 313 313 314 314 #define ATOMIC64_OP(op, asm_op) \ 315 - static __inline__ void atomic64_##op(long a, atomic64_t *v) \ 315 + static __inline__ void atomic64_##op(s64 a, atomic64_t *v) \ 316 316 { \ 317 - long t; \ 317 + s64 t; \ 318 318 \ 319 319 __asm__ __volatile__( \ 320 320 "1: ldarx %0,0,%3 # atomic64_" #op "\n" \ ··· 327 327 } 328 328 329 329 #define ATOMIC64_OP_RETURN_RELAXED(op, asm_op) \ 330 - static inline long \ 331 - atomic64_##op##_return_relaxed(long a, atomic64_t *v) \ 330 + static inline s64 \ 331 + atomic64_##op##_return_relaxed(s64 a, atomic64_t *v) \ 332 332 { \ 333 - long t; \ 333 + s64 t; \ 334 334 \ 335 335 __asm__ __volatile__( \ 336 336 "1: ldarx %0,0,%3 # atomic64_" #op "_return_relaxed\n" \ ··· 345 345 } 346 346 347 347 #define ATOMIC64_FETCH_OP_RELAXED(op, asm_op) \ 348 - static inline long \ 349 - atomic64_fetch_##op##_relaxed(long a, atomic64_t *v) \ 348 + static inline s64 \ 349 + atomic64_fetch_##op##_relaxed(s64 a, atomic64_t *v) \ 350 350 { \ 351 - long res, t; \ 351 + s64 res, t; \ 352 352 \ 353 353 __asm__ __volatile__( \ 354 354 "1: ldarx %0,0,%4 # atomic64_fetch_" #op "_relaxed\n" \ ··· 396 396 397 397 static __inline__ void atomic64_inc(atomic64_t *v) 398 398 { 399 - long t; 399 + s64 t; 400 400 401 401 __asm__ __volatile__( 402 402 "1: ldarx %0,0,%2 # atomic64_inc\n\ ··· 409 409 } 410 410 #define atomic64_inc atomic64_inc 411 411 412 - static __inline__ long 
atomic64_inc_return_relaxed(atomic64_t *v) 412 + static __inline__ s64 atomic64_inc_return_relaxed(atomic64_t *v) 413 413 { 414 - long t; 414 + s64 t; 415 415 416 416 __asm__ __volatile__( 417 417 "1: ldarx %0,0,%2 # atomic64_inc_return_relaxed\n" ··· 427 427 428 428 static __inline__ void atomic64_dec(atomic64_t *v) 429 429 { 430 - long t; 430 + s64 t; 431 431 432 432 __asm__ __volatile__( 433 433 "1: ldarx %0,0,%2 # atomic64_dec\n\ ··· 440 440 } 441 441 #define atomic64_dec atomic64_dec 442 442 443 - static __inline__ long atomic64_dec_return_relaxed(atomic64_t *v) 443 + static __inline__ s64 atomic64_dec_return_relaxed(atomic64_t *v) 444 444 { 445 - long t; 445 + s64 t; 446 446 447 447 __asm__ __volatile__( 448 448 "1: ldarx %0,0,%2 # atomic64_dec_return_relaxed\n" ··· 463 463 * Atomically test *v and decrement if it is greater than 0. 464 464 * The function returns the old value of *v minus 1. 465 465 */ 466 - static __inline__ long atomic64_dec_if_positive(atomic64_t *v) 466 + static __inline__ s64 atomic64_dec_if_positive(atomic64_t *v) 467 467 { 468 - long t; 468 + s64 t; 469 469 470 470 __asm__ __volatile__( 471 471 PPC_ATOMIC_ENTRY_BARRIER ··· 502 502 * Atomically adds @a to @v, so long as it was not @u. 503 503 * Returns the old value of @v. 504 504 */ 505 - static __inline__ long atomic64_fetch_add_unless(atomic64_t *v, long a, long u) 505 + static __inline__ s64 atomic64_fetch_add_unless(atomic64_t *v, s64 a, s64 u) 506 506 { 507 - long t; 507 + s64 t; 508 508 509 509 __asm__ __volatile__ ( 510 510 PPC_ATOMIC_ENTRY_BARRIER ··· 534 534 */ 535 535 static __inline__ int atomic64_inc_not_zero(atomic64_t *v) 536 536 { 537 - long t1, t2; 537 + s64 t1, t2; 538 538 539 539 __asm__ __volatile__ ( 540 540 PPC_ATOMIC_ENTRY_BARRIER
+23 -21
arch/riscv/include/asm/atomic.h
··· 38 38 39 39 #ifndef CONFIG_GENERIC_ATOMIC64 40 40 #define ATOMIC64_INIT(i) { (i) } 41 - static __always_inline long atomic64_read(const atomic64_t *v) 41 + static __always_inline s64 atomic64_read(const atomic64_t *v) 42 42 { 43 43 return READ_ONCE(v->counter); 44 44 } 45 - static __always_inline void atomic64_set(atomic64_t *v, long i) 45 + static __always_inline void atomic64_set(atomic64_t *v, s64 i) 46 46 { 47 47 WRITE_ONCE(v->counter, i); 48 48 } ··· 66 66 67 67 #ifdef CONFIG_GENERIC_ATOMIC64 68 68 #define ATOMIC_OPS(op, asm_op, I) \ 69 - ATOMIC_OP (op, asm_op, I, w, int, ) 69 + ATOMIC_OP (op, asm_op, I, w, int, ) 70 70 #else 71 71 #define ATOMIC_OPS(op, asm_op, I) \ 72 - ATOMIC_OP (op, asm_op, I, w, int, ) \ 73 - ATOMIC_OP (op, asm_op, I, d, long, 64) 72 + ATOMIC_OP (op, asm_op, I, w, int, ) \ 73 + ATOMIC_OP (op, asm_op, I, d, s64, 64) 74 74 #endif 75 75 76 76 ATOMIC_OPS(add, add, i) ··· 127 127 128 128 #ifdef CONFIG_GENERIC_ATOMIC64 129 129 #define ATOMIC_OPS(op, asm_op, c_op, I) \ 130 - ATOMIC_FETCH_OP( op, asm_op, I, w, int, ) \ 131 - ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int, ) 130 + ATOMIC_FETCH_OP( op, asm_op, I, w, int, ) \ 131 + ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int, ) 132 132 #else 133 133 #define ATOMIC_OPS(op, asm_op, c_op, I) \ 134 - ATOMIC_FETCH_OP( op, asm_op, I, w, int, ) \ 135 - ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int, ) \ 136 - ATOMIC_FETCH_OP( op, asm_op, I, d, long, 64) \ 137 - ATOMIC_OP_RETURN(op, asm_op, c_op, I, d, long, 64) 134 + ATOMIC_FETCH_OP( op, asm_op, I, w, int, ) \ 135 + ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int, ) \ 136 + ATOMIC_FETCH_OP( op, asm_op, I, d, s64, 64) \ 137 + ATOMIC_OP_RETURN(op, asm_op, c_op, I, d, s64, 64) 138 138 #endif 139 139 140 140 ATOMIC_OPS(add, add, +, i) ··· 166 166 167 167 #ifdef CONFIG_GENERIC_ATOMIC64 168 168 #define ATOMIC_OPS(op, asm_op, I) \ 169 - ATOMIC_FETCH_OP(op, asm_op, I, w, int, ) 169 + ATOMIC_FETCH_OP(op, asm_op, I, w, int, ) 170 170 #else 171 171 #define 
ATOMIC_OPS(op, asm_op, I) \ 172 - ATOMIC_FETCH_OP(op, asm_op, I, w, int, ) \ 173 - ATOMIC_FETCH_OP(op, asm_op, I, d, long, 64) 172 + ATOMIC_FETCH_OP(op, asm_op, I, w, int, ) \ 173 + ATOMIC_FETCH_OP(op, asm_op, I, d, s64, 64) 174 174 #endif 175 175 176 176 ATOMIC_OPS(and, and, i) ··· 219 219 #define atomic_fetch_add_unless atomic_fetch_add_unless 220 220 221 221 #ifndef CONFIG_GENERIC_ATOMIC64 222 - static __always_inline long atomic64_fetch_add_unless(atomic64_t *v, long a, long u) 222 + static __always_inline s64 atomic64_fetch_add_unless(atomic64_t *v, s64 a, s64 u) 223 223 { 224 - long prev, rc; 224 + s64 prev; 225 + long rc; 225 226 226 227 __asm__ __volatile__ ( 227 228 "0: lr.d %[p], %[c]\n" ··· 291 290 292 291 #ifdef CONFIG_GENERIC_ATOMIC64 293 292 #define ATOMIC_OPS() \ 294 - ATOMIC_OP( int, , 4) 293 + ATOMIC_OP(int, , 4) 295 294 #else 296 295 #define ATOMIC_OPS() \ 297 - ATOMIC_OP( int, , 4) \ 298 - ATOMIC_OP(long, 64, 8) 296 + ATOMIC_OP(int, , 4) \ 297 + ATOMIC_OP(s64, 64, 8) 299 298 #endif 300 299 301 300 ATOMIC_OPS() ··· 333 332 #define atomic_dec_if_positive(v) atomic_sub_if_positive(v, 1) 334 333 335 334 #ifndef CONFIG_GENERIC_ATOMIC64 336 - static __always_inline long atomic64_sub_if_positive(atomic64_t *v, int offset) 335 + static __always_inline s64 atomic64_sub_if_positive(atomic64_t *v, s64 offset) 337 336 { 338 - long prev, rc; 337 + s64 prev; 338 + long rc; 339 339 340 340 __asm__ __volatile__ ( 341 341 "0: lr.d %[p], %[c]\n"
+19 -19
arch/s390/include/asm/atomic.h
··· 84 84 85 85 #define ATOMIC64_INIT(i) { (i) } 86 86 87 - static inline long atomic64_read(const atomic64_t *v) 87 + static inline s64 atomic64_read(const atomic64_t *v) 88 88 { 89 - long c; 89 + s64 c; 90 90 91 91 asm volatile( 92 92 " lg %0,%1\n" ··· 94 94 return c; 95 95 } 96 96 97 - static inline void atomic64_set(atomic64_t *v, long i) 97 + static inline void atomic64_set(atomic64_t *v, s64 i) 98 98 { 99 99 asm volatile( 100 100 " stg %1,%0\n" 101 101 : "=Q" (v->counter) : "d" (i)); 102 102 } 103 103 104 - static inline long atomic64_add_return(long i, atomic64_t *v) 104 + static inline s64 atomic64_add_return(s64 i, atomic64_t *v) 105 105 { 106 - return __atomic64_add_barrier(i, &v->counter) + i; 106 + return __atomic64_add_barrier(i, (long *)&v->counter) + i; 107 107 } 108 108 109 - static inline long atomic64_fetch_add(long i, atomic64_t *v) 109 + static inline s64 atomic64_fetch_add(s64 i, atomic64_t *v) 110 110 { 111 - return __atomic64_add_barrier(i, &v->counter); 111 + return __atomic64_add_barrier(i, (long *)&v->counter); 112 112 } 113 113 114 - static inline void atomic64_add(long i, atomic64_t *v) 114 + static inline void atomic64_add(s64 i, atomic64_t *v) 115 115 { 116 116 #ifdef CONFIG_HAVE_MARCH_Z196_FEATURES 117 117 if (__builtin_constant_p(i) && (i > -129) && (i < 128)) { 118 - __atomic64_add_const(i, &v->counter); 118 + __atomic64_add_const(i, (long *)&v->counter); 119 119 return; 120 120 } 121 121 #endif 122 - __atomic64_add(i, &v->counter); 122 + __atomic64_add(i, (long *)&v->counter); 123 123 } 124 124 125 125 #define atomic64_xchg(v, new) (xchg(&((v)->counter), new)) 126 126 127 - static inline long atomic64_cmpxchg(atomic64_t *v, long old, long new) 127 + static inline s64 atomic64_cmpxchg(atomic64_t *v, s64 old, s64 new) 128 128 { 129 - return __atomic64_cmpxchg(&v->counter, old, new); 129 + return __atomic64_cmpxchg((long *)&v->counter, old, new); 130 130 } 131 131 132 132 #define ATOMIC64_OPS(op) \ 133 - static inline void 
atomic64_##op(long i, atomic64_t *v) \ 133 + static inline void atomic64_##op(s64 i, atomic64_t *v) \ 134 134 { \ 135 - __atomic64_##op(i, &v->counter); \ 135 + __atomic64_##op(i, (long *)&v->counter); \ 136 136 } \ 137 - static inline long atomic64_fetch_##op(long i, atomic64_t *v) \ 137 + static inline s64 atomic64_fetch_##op(s64 i, atomic64_t *v) \ 138 138 { \ 139 - return __atomic64_##op##_barrier(i, &v->counter); \ 139 + return __atomic64_##op##_barrier(i, (long *)&v->counter); \ 140 140 } 141 141 142 142 ATOMIC64_OPS(and) ··· 145 145 146 146 #undef ATOMIC64_OPS 147 147 148 - #define atomic64_sub_return(_i, _v) atomic64_add_return(-(long)(_i), _v) 149 - #define atomic64_fetch_sub(_i, _v) atomic64_fetch_add(-(long)(_i), _v) 150 - #define atomic64_sub(_i, _v) atomic64_add(-(long)(_i), _v) 148 + #define atomic64_sub_return(_i, _v) atomic64_add_return(-(s64)(_i), _v) 149 + #define atomic64_fetch_sub(_i, _v) atomic64_fetch_add(-(s64)(_i), _v) 150 + #define atomic64_sub(_i, _v) atomic64_add(-(s64)(_i), _v) 151 151 152 152 #endif /* __ARCH_S390_ATOMIC__ */
+1 -1
arch/s390/pci/pci_debug.c
··· 74 74 int i; 75 75 76 76 for (i = 0; i < ARRAY_SIZE(pci_sw_names); i++, counter++) 77 - seq_printf(m, "%26s:\t%lu\n", pci_sw_names[i], 77 + seq_printf(m, "%26s:\t%llu\n", pci_sw_names[i], 78 78 atomic64_read(counter)); 79 79 } 80 80
+4 -4
arch/sparc/include/asm/atomic_64.h
··· 23 23 24 24 #define ATOMIC_OP(op) \ 25 25 void atomic_##op(int, atomic_t *); \ 26 - void atomic64_##op(long, atomic64_t *); 26 + void atomic64_##op(s64, atomic64_t *); 27 27 28 28 #define ATOMIC_OP_RETURN(op) \ 29 29 int atomic_##op##_return(int, atomic_t *); \ 30 - long atomic64_##op##_return(long, atomic64_t *); 30 + s64 atomic64_##op##_return(s64, atomic64_t *); 31 31 32 32 #define ATOMIC_FETCH_OP(op) \ 33 33 int atomic_fetch_##op(int, atomic_t *); \ 34 - long atomic64_fetch_##op(long, atomic64_t *); 34 + s64 atomic64_fetch_##op(s64, atomic64_t *); 35 35 36 36 #define ATOMIC_OPS(op) ATOMIC_OP(op) ATOMIC_OP_RETURN(op) ATOMIC_FETCH_OP(op) 37 37 ··· 61 61 ((__typeof__((v)->counter))cmpxchg(&((v)->counter), (o), (n))) 62 62 #define atomic64_xchg(v, new) (xchg(&((v)->counter), new)) 63 63 64 - long atomic64_dec_if_positive(atomic64_t *v); 64 + s64 atomic64_dec_if_positive(atomic64_t *v); 65 65 #define atomic64_dec_if_positive atomic64_dec_if_positive 66 66 67 67 #endif /* !(__ARCH_SPARC64_ATOMIC__) */
+1 -1
arch/x86/events/core.c
··· 2179 2179 * For now, this can't happen because all callers hold mmap_sem 2180 2180 * for write. If this changes, we'll need a different solution. 2181 2181 */ 2182 - lockdep_assert_held_exclusive(&mm->mmap_sem); 2182 + lockdep_assert_held_write(&mm->mmap_sem); 2183 2183 2184 2184 if (atomic_inc_return(&mm->context.perf_rdpmc_allowed) == 1) 2185 2185 on_each_cpu_mask(mm_cpumask(mm), refresh_pce, NULL, 1);
+4 -4
arch/x86/include/asm/atomic.h
··· 54 54 { 55 55 asm volatile(LOCK_PREFIX "addl %1,%0" 56 56 : "+m" (v->counter) 57 - : "ir" (i)); 57 + : "ir" (i) : "memory"); 58 58 } 59 59 60 60 /** ··· 68 68 { 69 69 asm volatile(LOCK_PREFIX "subl %1,%0" 70 70 : "+m" (v->counter) 71 - : "ir" (i)); 71 + : "ir" (i) : "memory"); 72 72 } 73 73 74 74 /** ··· 95 95 static __always_inline void arch_atomic_inc(atomic_t *v) 96 96 { 97 97 asm volatile(LOCK_PREFIX "incl %0" 98 - : "+m" (v->counter)); 98 + : "+m" (v->counter) :: "memory"); 99 99 } 100 100 #define arch_atomic_inc arch_atomic_inc 101 101 ··· 108 108 static __always_inline void arch_atomic_dec(atomic_t *v) 109 109 { 110 110 asm volatile(LOCK_PREFIX "decl %0" 111 - : "+m" (v->counter)); 111 + : "+m" (v->counter) :: "memory"); 112 112 } 113 113 #define arch_atomic_dec arch_atomic_dec 114 114
+32 -34
arch/x86/include/asm/atomic64_32.h
··· 9 9 /* An 64bit atomic type */ 10 10 11 11 typedef struct { 12 - u64 __aligned(8) counter; 12 + s64 __aligned(8) counter; 13 13 } atomic64_t; 14 14 15 15 #define ATOMIC64_INIT(val) { (val) } ··· 71 71 * the old value. 72 72 */ 73 73 74 - static inline long long arch_atomic64_cmpxchg(atomic64_t *v, long long o, 75 - long long n) 74 + static inline s64 arch_atomic64_cmpxchg(atomic64_t *v, s64 o, s64 n) 76 75 { 77 76 return arch_cmpxchg64(&v->counter, o, n); 78 77 } ··· 84 85 * Atomically xchgs the value of @v to @n and returns 85 86 * the old value. 86 87 */ 87 - static inline long long arch_atomic64_xchg(atomic64_t *v, long long n) 88 + static inline s64 arch_atomic64_xchg(atomic64_t *v, s64 n) 88 89 { 89 - long long o; 90 + s64 o; 90 91 unsigned high = (unsigned)(n >> 32); 91 92 unsigned low = (unsigned)n; 92 93 alternative_atomic64(xchg, "=&A" (o), ··· 102 103 * 103 104 * Atomically sets the value of @v to @n. 104 105 */ 105 - static inline void arch_atomic64_set(atomic64_t *v, long long i) 106 + static inline void arch_atomic64_set(atomic64_t *v, s64 i) 106 107 { 107 108 unsigned high = (unsigned)(i >> 32); 108 109 unsigned low = (unsigned)i; ··· 117 118 * 118 119 * Atomically reads the value of @v and returns it. 
119 120 */ 120 - static inline long long arch_atomic64_read(const atomic64_t *v) 121 + static inline s64 arch_atomic64_read(const atomic64_t *v) 121 122 { 122 - long long r; 123 + s64 r; 123 124 alternative_atomic64(read, "=&A" (r), "c" (v) : "memory"); 124 125 return r; 125 126 } ··· 131 132 * 132 133 * Atomically adds @i to @v and returns @i + *@v 133 134 */ 134 - static inline long long arch_atomic64_add_return(long long i, atomic64_t *v) 135 + static inline s64 arch_atomic64_add_return(s64 i, atomic64_t *v) 135 136 { 136 137 alternative_atomic64(add_return, 137 138 ASM_OUTPUT2("+A" (i), "+c" (v)), ··· 142 143 /* 143 144 * Other variants with different arithmetic operators: 144 145 */ 145 - static inline long long arch_atomic64_sub_return(long long i, atomic64_t *v) 146 + static inline s64 arch_atomic64_sub_return(s64 i, atomic64_t *v) 146 147 { 147 148 alternative_atomic64(sub_return, 148 149 ASM_OUTPUT2("+A" (i), "+c" (v)), ··· 150 151 return i; 151 152 } 152 153 153 - static inline long long arch_atomic64_inc_return(atomic64_t *v) 154 + static inline s64 arch_atomic64_inc_return(atomic64_t *v) 154 155 { 155 - long long a; 156 + s64 a; 156 157 alternative_atomic64(inc_return, "=&A" (a), 157 158 "S" (v) : "memory", "ecx"); 158 159 return a; 159 160 } 160 161 #define arch_atomic64_inc_return arch_atomic64_inc_return 161 162 162 - static inline long long arch_atomic64_dec_return(atomic64_t *v) 163 + static inline s64 arch_atomic64_dec_return(atomic64_t *v) 163 164 { 164 - long long a; 165 + s64 a; 165 166 alternative_atomic64(dec_return, "=&A" (a), 166 167 "S" (v) : "memory", "ecx"); 167 168 return a; ··· 175 176 * 176 177 * Atomically adds @i to @v. 177 178 */ 178 - static inline long long arch_atomic64_add(long long i, atomic64_t *v) 179 + static inline s64 arch_atomic64_add(s64 i, atomic64_t *v) 179 180 { 180 181 __alternative_atomic64(add, add_return, 181 182 ASM_OUTPUT2("+A" (i), "+c" (v)), ··· 190 191 * 191 192 * Atomically subtracts @i from @v. 
192 193 */ 193 - static inline long long arch_atomic64_sub(long long i, atomic64_t *v) 194 + static inline s64 arch_atomic64_sub(s64 i, atomic64_t *v) 194 195 { 195 196 __alternative_atomic64(sub, sub_return, 196 197 ASM_OUTPUT2("+A" (i), "+c" (v)), ··· 233 234 * Atomically adds @a to @v, so long as it was not @u. 234 235 * Returns non-zero if the add was done, zero otherwise. 235 236 */ 236 - static inline int arch_atomic64_add_unless(atomic64_t *v, long long a, 237 - long long u) 237 + static inline int arch_atomic64_add_unless(atomic64_t *v, s64 a, s64 u) 238 238 { 239 239 unsigned low = (unsigned)u; 240 240 unsigned high = (unsigned)(u >> 32); ··· 252 254 } 253 255 #define arch_atomic64_inc_not_zero arch_atomic64_inc_not_zero 254 256 255 - static inline long long arch_atomic64_dec_if_positive(atomic64_t *v) 257 + static inline s64 arch_atomic64_dec_if_positive(atomic64_t *v) 256 258 { 257 - long long r; 259 + s64 r; 258 260 alternative_atomic64(dec_if_positive, "=&A" (r), 259 261 "S" (v) : "ecx", "memory"); 260 262 return r; ··· 264 266 #undef alternative_atomic64 265 267 #undef __alternative_atomic64 266 268 267 - static inline void arch_atomic64_and(long long i, atomic64_t *v) 269 + static inline void arch_atomic64_and(s64 i, atomic64_t *v) 268 270 { 269 - long long old, c = 0; 271 + s64 old, c = 0; 270 272 271 273 while ((old = arch_atomic64_cmpxchg(v, c, c & i)) != c) 272 274 c = old; 273 275 } 274 276 275 - static inline long long arch_atomic64_fetch_and(long long i, atomic64_t *v) 277 + static inline s64 arch_atomic64_fetch_and(s64 i, atomic64_t *v) 276 278 { 277 - long long old, c = 0; 279 + s64 old, c = 0; 278 280 279 281 while ((old = arch_atomic64_cmpxchg(v, c, c & i)) != c) 280 282 c = old; ··· 282 284 return old; 283 285 } 284 286 285 - static inline void arch_atomic64_or(long long i, atomic64_t *v) 287 + static inline void arch_atomic64_or(s64 i, atomic64_t *v) 286 288 { 287 - long long old, c = 0; 289 + s64 old, c = 0; 288 290 289 291 while ((old 
= arch_atomic64_cmpxchg(v, c, c | i)) != c) 290 292 c = old; 291 293 } 292 294 293 - static inline long long arch_atomic64_fetch_or(long long i, atomic64_t *v) 295 + static inline s64 arch_atomic64_fetch_or(s64 i, atomic64_t *v) 294 296 { 295 - long long old, c = 0; 297 + s64 old, c = 0; 296 298 297 299 while ((old = arch_atomic64_cmpxchg(v, c, c | i)) != c) 298 300 c = old; ··· 300 302 return old; 301 303 } 302 304 303 - static inline void arch_atomic64_xor(long long i, atomic64_t *v) 305 + static inline void arch_atomic64_xor(s64 i, atomic64_t *v) 304 306 { 305 - long long old, c = 0; 307 + s64 old, c = 0; 306 308 307 309 while ((old = arch_atomic64_cmpxchg(v, c, c ^ i)) != c) 308 310 c = old; 309 311 } 310 312 311 - static inline long long arch_atomic64_fetch_xor(long long i, atomic64_t *v) 313 + static inline s64 arch_atomic64_fetch_xor(s64 i, atomic64_t *v) 312 314 { 313 - long long old, c = 0; 315 + s64 old, c = 0; 314 316 315 317 while ((old = arch_atomic64_cmpxchg(v, c, c ^ i)) != c) 316 318 c = old; ··· 318 320 return old; 319 321 } 320 322 321 - static inline long long arch_atomic64_fetch_add(long long i, atomic64_t *v) 323 + static inline s64 arch_atomic64_fetch_add(s64 i, atomic64_t *v) 322 324 { 323 - long long old, c = 0; 325 + s64 old, c = 0; 324 326 325 327 while ((old = arch_atomic64_cmpxchg(v, c, c + i)) != c) 326 328 c = old;
+23 -23
arch/x86/include/asm/atomic64_64.h
··· 17 17 * Atomically reads the value of @v. 18 18 * Doesn't imply a read memory barrier. 19 19 */ 20 - static inline long arch_atomic64_read(const atomic64_t *v) 20 + static inline s64 arch_atomic64_read(const atomic64_t *v) 21 21 { 22 22 return READ_ONCE((v)->counter); 23 23 } ··· 29 29 * 30 30 * Atomically sets the value of @v to @i. 31 31 */ 32 - static inline void arch_atomic64_set(atomic64_t *v, long i) 32 + static inline void arch_atomic64_set(atomic64_t *v, s64 i) 33 33 { 34 34 WRITE_ONCE(v->counter, i); 35 35 } ··· 41 41 * 42 42 * Atomically adds @i to @v. 43 43 */ 44 - static __always_inline void arch_atomic64_add(long i, atomic64_t *v) 44 + static __always_inline void arch_atomic64_add(s64 i, atomic64_t *v) 45 45 { 46 46 asm volatile(LOCK_PREFIX "addq %1,%0" 47 47 : "=m" (v->counter) 48 - : "er" (i), "m" (v->counter)); 48 + : "er" (i), "m" (v->counter) : "memory"); 49 49 } 50 50 51 51 /** ··· 55 55 * 56 56 * Atomically subtracts @i from @v. 57 57 */ 58 - static inline void arch_atomic64_sub(long i, atomic64_t *v) 58 + static inline void arch_atomic64_sub(s64 i, atomic64_t *v) 59 59 { 60 60 asm volatile(LOCK_PREFIX "subq %1,%0" 61 61 : "=m" (v->counter) 62 - : "er" (i), "m" (v->counter)); 62 + : "er" (i), "m" (v->counter) : "memory"); 63 63 } 64 64 65 65 /** ··· 71 71 * true if the result is zero, or false for all 72 72 * other cases. 
73 73 */ 74 - static inline bool arch_atomic64_sub_and_test(long i, atomic64_t *v) 74 + static inline bool arch_atomic64_sub_and_test(s64 i, atomic64_t *v) 75 75 { 76 76 return GEN_BINARY_RMWcc(LOCK_PREFIX "subq", v->counter, e, "er", i); 77 77 } ··· 87 87 { 88 88 asm volatile(LOCK_PREFIX "incq %0" 89 89 : "=m" (v->counter) 90 - : "m" (v->counter)); 90 + : "m" (v->counter) : "memory"); 91 91 } 92 92 #define arch_atomic64_inc arch_atomic64_inc 93 93 ··· 101 101 { 102 102 asm volatile(LOCK_PREFIX "decq %0" 103 103 : "=m" (v->counter) 104 - : "m" (v->counter)); 104 + : "m" (v->counter) : "memory"); 105 105 } 106 106 #define arch_atomic64_dec arch_atomic64_dec 107 107 ··· 142 142 * if the result is negative, or false when 143 143 * result is greater than or equal to zero. 144 144 */ 145 - static inline bool arch_atomic64_add_negative(long i, atomic64_t *v) 145 + static inline bool arch_atomic64_add_negative(s64 i, atomic64_t *v) 146 146 { 147 147 return GEN_BINARY_RMWcc(LOCK_PREFIX "addq", v->counter, s, "er", i); 148 148 } ··· 155 155 * 156 156 * Atomically adds @i to @v and returns @i + @v 157 157 */ 158 - static __always_inline long arch_atomic64_add_return(long i, atomic64_t *v) 158 + static __always_inline s64 arch_atomic64_add_return(s64 i, atomic64_t *v) 159 159 { 160 160 return i + xadd(&v->counter, i); 161 161 } 162 162 163 - static inline long arch_atomic64_sub_return(long i, atomic64_t *v) 163 + static inline s64 arch_atomic64_sub_return(s64 i, atomic64_t *v) 164 164 { 165 165 return arch_atomic64_add_return(-i, v); 166 166 } 167 167 168 - static inline long arch_atomic64_fetch_add(long i, atomic64_t *v) 168 + static inline s64 arch_atomic64_fetch_add(s64 i, atomic64_t *v) 169 169 { 170 170 return xadd(&v->counter, i); 171 171 } 172 172 173 - static inline long arch_atomic64_fetch_sub(long i, atomic64_t *v) 173 + static inline s64 arch_atomic64_fetch_sub(s64 i, atomic64_t *v) 174 174 { 175 175 return xadd(&v->counter, -i); 176 176 } 177 177 178 - static 
inline long arch_atomic64_cmpxchg(atomic64_t *v, long old, long new) 178 + static inline s64 arch_atomic64_cmpxchg(atomic64_t *v, s64 old, s64 new) 179 179 { 180 180 return arch_cmpxchg(&v->counter, old, new); 181 181 } 182 182 183 183 #define arch_atomic64_try_cmpxchg arch_atomic64_try_cmpxchg 184 - static __always_inline bool arch_atomic64_try_cmpxchg(atomic64_t *v, s64 *old, long new) 184 + static __always_inline bool arch_atomic64_try_cmpxchg(atomic64_t *v, s64 *old, s64 new) 185 185 { 186 186 return try_cmpxchg(&v->counter, old, new); 187 187 } 188 188 189 - static inline long arch_atomic64_xchg(atomic64_t *v, long new) 189 + static inline s64 arch_atomic64_xchg(atomic64_t *v, s64 new) 190 190 { 191 191 return arch_xchg(&v->counter, new); 192 192 } 193 193 194 - static inline void arch_atomic64_and(long i, atomic64_t *v) 194 + static inline void arch_atomic64_and(s64 i, atomic64_t *v) 195 195 { 196 196 asm volatile(LOCK_PREFIX "andq %1,%0" 197 197 : "+m" (v->counter) ··· 199 199 : "memory"); 200 200 } 201 201 202 - static inline long arch_atomic64_fetch_and(long i, atomic64_t *v) 202 + static inline s64 arch_atomic64_fetch_and(s64 i, atomic64_t *v) 203 203 { 204 204 s64 val = arch_atomic64_read(v); 205 205 ··· 208 208 return val; 209 209 } 210 210 211 - static inline void arch_atomic64_or(long i, atomic64_t *v) 211 + static inline void arch_atomic64_or(s64 i, atomic64_t *v) 212 212 { 213 213 asm volatile(LOCK_PREFIX "orq %1,%0" 214 214 : "+m" (v->counter) ··· 216 216 : "memory"); 217 217 } 218 218 219 - static inline long arch_atomic64_fetch_or(long i, atomic64_t *v) 219 + static inline s64 arch_atomic64_fetch_or(s64 i, atomic64_t *v) 220 220 { 221 221 s64 val = arch_atomic64_read(v); 222 222 ··· 225 225 return val; 226 226 } 227 227 228 - static inline void arch_atomic64_xor(long i, atomic64_t *v) 228 + static inline void arch_atomic64_xor(s64 i, atomic64_t *v) 229 229 { 230 230 asm volatile(LOCK_PREFIX "xorq %1,%0" 231 231 : "+m" (v->counter) ··· 233 233 : 
"memory"); 234 234 } 235 235 236 - static inline long arch_atomic64_fetch_xor(long i, atomic64_t *v) 236 + static inline s64 arch_atomic64_fetch_xor(s64 i, atomic64_t *v) 237 237 { 238 238 s64 val = arch_atomic64_read(v); 239 239
+2 -2
arch/x86/include/asm/barrier.h
··· 80 80 }) 81 81 82 82 /* Atomic operations are already serializing on x86 */ 83 - #define __smp_mb__before_atomic() barrier() 84 - #define __smp_mb__after_atomic() barrier() 83 + #define __smp_mb__before_atomic() do { } while (0) 84 + #define __smp_mb__after_atomic() do { } while (0) 85 85 86 86 #include <asm-generic/barrier.h> 87 87
+2 -2
arch/x86/include/asm/irq_regs.h
··· 16 16 17 17 static inline struct pt_regs *get_irq_regs(void) 18 18 { 19 - return this_cpu_read(irq_regs); 19 + return __this_cpu_read(irq_regs); 20 20 } 21 21 22 22 static inline struct pt_regs *set_irq_regs(struct pt_regs *new_regs) ··· 24 24 struct pt_regs *old_regs; 25 25 26 26 old_regs = get_irq_regs(); 27 - this_cpu_write(irq_regs, new_regs); 27 + __this_cpu_write(irq_regs, new_regs); 28 28 29 29 return old_regs; 30 30 }
+2
arch/x86/include/asm/jump_label.h
··· 2 2 #ifndef _ASM_X86_JUMP_LABEL_H 3 3 #define _ASM_X86_JUMP_LABEL_H 4 4 5 + #define HAVE_JUMP_LABEL_BATCH 6 + 5 7 #define JUMP_LABEL_NOP_SIZE 5 6 8 7 9 #ifdef CONFIG_X86_64
+119 -107
arch/x86/include/asm/percpu.h
··· 87 87 * don't give an lvalue though). */ 88 88 extern void __bad_percpu_size(void); 89 89 90 - #define percpu_to_op(op, var, val) \ 90 + #define percpu_to_op(qual, op, var, val) \ 91 91 do { \ 92 92 typedef typeof(var) pto_T__; \ 93 93 if (0) { \ ··· 97 97 } \ 98 98 switch (sizeof(var)) { \ 99 99 case 1: \ 100 - asm(op "b %1,"__percpu_arg(0) \ 100 + asm qual (op "b %1,"__percpu_arg(0) \ 101 101 : "+m" (var) \ 102 102 : "qi" ((pto_T__)(val))); \ 103 103 break; \ 104 104 case 2: \ 105 - asm(op "w %1,"__percpu_arg(0) \ 105 + asm qual (op "w %1,"__percpu_arg(0) \ 106 106 : "+m" (var) \ 107 107 : "ri" ((pto_T__)(val))); \ 108 108 break; \ 109 109 case 4: \ 110 - asm(op "l %1,"__percpu_arg(0) \ 110 + asm qual (op "l %1,"__percpu_arg(0) \ 111 111 : "+m" (var) \ 112 112 : "ri" ((pto_T__)(val))); \ 113 113 break; \ 114 114 case 8: \ 115 - asm(op "q %1,"__percpu_arg(0) \ 115 + asm qual (op "q %1,"__percpu_arg(0) \ 116 116 : "+m" (var) \ 117 117 : "re" ((pto_T__)(val))); \ 118 118 break; \ ··· 124 124 * Generate a percpu add to memory instruction and optimize code 125 125 * if one is added or subtracted. 
126 126 */ 127 - #define percpu_add_op(var, val) \ 127 + #define percpu_add_op(qual, var, val) \ 128 128 do { \ 129 129 typedef typeof(var) pao_T__; \ 130 130 const int pao_ID__ = (__builtin_constant_p(val) && \ ··· 138 138 switch (sizeof(var)) { \ 139 139 case 1: \ 140 140 if (pao_ID__ == 1) \ 141 - asm("incb "__percpu_arg(0) : "+m" (var)); \ 141 + asm qual ("incb "__percpu_arg(0) : "+m" (var)); \ 142 142 else if (pao_ID__ == -1) \ 143 - asm("decb "__percpu_arg(0) : "+m" (var)); \ 143 + asm qual ("decb "__percpu_arg(0) : "+m" (var)); \ 144 144 else \ 145 - asm("addb %1, "__percpu_arg(0) \ 145 + asm qual ("addb %1, "__percpu_arg(0) \ 146 146 : "+m" (var) \ 147 147 : "qi" ((pao_T__)(val))); \ 148 148 break; \ 149 149 case 2: \ 150 150 if (pao_ID__ == 1) \ 151 - asm("incw "__percpu_arg(0) : "+m" (var)); \ 151 + asm qual ("incw "__percpu_arg(0) : "+m" (var)); \ 152 152 else if (pao_ID__ == -1) \ 153 - asm("decw "__percpu_arg(0) : "+m" (var)); \ 153 + asm qual ("decw "__percpu_arg(0) : "+m" (var)); \ 154 154 else \ 155 - asm("addw %1, "__percpu_arg(0) \ 155 + asm qual ("addw %1, "__percpu_arg(0) \ 156 156 : "+m" (var) \ 157 157 : "ri" ((pao_T__)(val))); \ 158 158 break; \ 159 159 case 4: \ 160 160 if (pao_ID__ == 1) \ 161 - asm("incl "__percpu_arg(0) : "+m" (var)); \ 161 + asm qual ("incl "__percpu_arg(0) : "+m" (var)); \ 162 162 else if (pao_ID__ == -1) \ 163 - asm("decl "__percpu_arg(0) : "+m" (var)); \ 163 + asm qual ("decl "__percpu_arg(0) : "+m" (var)); \ 164 164 else \ 165 - asm("addl %1, "__percpu_arg(0) \ 165 + asm qual ("addl %1, "__percpu_arg(0) \ 166 166 : "+m" (var) \ 167 167 : "ri" ((pao_T__)(val))); \ 168 168 break; \ 169 169 case 8: \ 170 170 if (pao_ID__ == 1) \ 171 - asm("incq "__percpu_arg(0) : "+m" (var)); \ 171 + asm qual ("incq "__percpu_arg(0) : "+m" (var)); \ 172 172 else if (pao_ID__ == -1) \ 173 - asm("decq "__percpu_arg(0) : "+m" (var)); \ 173 + asm qual ("decq "__percpu_arg(0) : "+m" (var)); \ 174 174 else \ 175 - asm("addq %1, 
"__percpu_arg(0) \ 175 + asm qual ("addq %1, "__percpu_arg(0) \ 176 176 : "+m" (var) \ 177 177 : "re" ((pao_T__)(val))); \ 178 178 break; \ ··· 180 180 } \ 181 181 } while (0) 182 182 183 - #define percpu_from_op(op, var) \ 183 + #define percpu_from_op(qual, op, var) \ 184 184 ({ \ 185 185 typeof(var) pfo_ret__; \ 186 186 switch (sizeof(var)) { \ 187 187 case 1: \ 188 - asm volatile(op "b "__percpu_arg(1)",%0"\ 188 + asm qual (op "b "__percpu_arg(1)",%0" \ 189 189 : "=q" (pfo_ret__) \ 190 190 : "m" (var)); \ 191 191 break; \ 192 192 case 2: \ 193 - asm volatile(op "w "__percpu_arg(1)",%0"\ 193 + asm qual (op "w "__percpu_arg(1)",%0" \ 194 194 : "=r" (pfo_ret__) \ 195 195 : "m" (var)); \ 196 196 break; \ 197 197 case 4: \ 198 - asm volatile(op "l "__percpu_arg(1)",%0"\ 198 + asm qual (op "l "__percpu_arg(1)",%0" \ 199 199 : "=r" (pfo_ret__) \ 200 200 : "m" (var)); \ 201 201 break; \ 202 202 case 8: \ 203 - asm volatile(op "q "__percpu_arg(1)",%0"\ 203 + asm qual (op "q "__percpu_arg(1)",%0" \ 204 204 : "=r" (pfo_ret__) \ 205 205 : "m" (var)); \ 206 206 break; \ ··· 238 238 pfo_ret__; \ 239 239 }) 240 240 241 - #define percpu_unary_op(op, var) \ 241 + #define percpu_unary_op(qual, op, var) \ 242 242 ({ \ 243 243 switch (sizeof(var)) { \ 244 244 case 1: \ 245 - asm(op "b "__percpu_arg(0) \ 245 + asm qual (op "b "__percpu_arg(0) \ 246 246 : "+m" (var)); \ 247 247 break; \ 248 248 case 2: \ 249 - asm(op "w "__percpu_arg(0) \ 249 + asm qual (op "w "__percpu_arg(0) \ 250 250 : "+m" (var)); \ 251 251 break; \ 252 252 case 4: \ 253 - asm(op "l "__percpu_arg(0) \ 253 + asm qual (op "l "__percpu_arg(0) \ 254 254 : "+m" (var)); \ 255 255 break; \ 256 256 case 8: \ 257 - asm(op "q "__percpu_arg(0) \ 257 + asm qual (op "q "__percpu_arg(0) \ 258 258 : "+m" (var)); \ 259 259 break; \ 260 260 default: __bad_percpu_size(); \ ··· 264 264 /* 265 265 * Add return operation 266 266 */ 267 - #define percpu_add_return_op(var, val) \ 267 + #define percpu_add_return_op(qual, var, val) \ 268 
 ({ \
     typeof(var) paro_ret__ = val; \
     switch (sizeof(var)) { \
     case 1: \
-        asm("xaddb %0, "__percpu_arg(1) \
+        asm qual ("xaddb %0, "__percpu_arg(1) \
             : "+q" (paro_ret__), "+m" (var) \
             : : "memory"); \
         break; \
     case 2: \
-        asm("xaddw %0, "__percpu_arg(1) \
+        asm qual ("xaddw %0, "__percpu_arg(1) \
             : "+r" (paro_ret__), "+m" (var) \
             : : "memory"); \
         break; \
     case 4: \
-        asm("xaddl %0, "__percpu_arg(1) \
+        asm qual ("xaddl %0, "__percpu_arg(1) \
             : "+r" (paro_ret__), "+m" (var) \
             : : "memory"); \
         break; \
     case 8: \
-        asm("xaddq %0, "__percpu_arg(1) \
+        asm qual ("xaddq %0, "__percpu_arg(1) \
             : "+re" (paro_ret__), "+m" (var) \
             : : "memory"); \
         break; \
···
  * expensive due to the implied lock prefix. The processor cannot prefetch
  * cachelines if xchg is used.
  */
-#define percpu_xchg_op(var, nval) \
+#define percpu_xchg_op(qual, var, nval) \
 ({ \
     typeof(var) pxo_ret__; \
     typeof(var) pxo_new__ = (nval); \
     switch (sizeof(var)) { \
     case 1: \
-        asm("\n\tmov "__percpu_arg(1)",%%al" \
+        asm qual ("\n\tmov "__percpu_arg(1)",%%al" \
             "\n1:\tcmpxchgb %2, "__percpu_arg(1) \
             "\n\tjnz 1b" \
             : "=&a" (pxo_ret__), "+m" (var) \
···
             : "memory"); \
         break; \
     case 2: \
-        asm("\n\tmov "__percpu_arg(1)",%%ax" \
+        asm qual ("\n\tmov "__percpu_arg(1)",%%ax" \
             "\n1:\tcmpxchgw %2, "__percpu_arg(1) \
             "\n\tjnz 1b" \
             : "=&a" (pxo_ret__), "+m" (var) \
···
             : "memory"); \
         break; \
     case 4: \
-        asm("\n\tmov "__percpu_arg(1)",%%eax" \
+        asm qual ("\n\tmov "__percpu_arg(1)",%%eax" \
             "\n1:\tcmpxchgl %2, "__percpu_arg(1) \
             "\n\tjnz 1b" \
             : "=&a" (pxo_ret__), "+m" (var) \
···
             : "memory"); \
         break; \
     case 8: \
-        asm("\n\tmov "__percpu_arg(1)",%%rax" \
+        asm qual ("\n\tmov "__percpu_arg(1)",%%rax" \
             "\n1:\tcmpxchgq %2, "__percpu_arg(1) \
             "\n\tjnz 1b" \
             : "=&a" (pxo_ret__), "+m" (var) \
···
  * cmpxchg has no such implied lock semantics as a result it is much
  * more efficient for cpu local operations.
  */
-#define percpu_cmpxchg_op(var, oval, nval) \
+#define percpu_cmpxchg_op(qual, var, oval, nval) \
 ({ \
     typeof(var) pco_ret__; \
     typeof(var) pco_old__ = (oval); \
     typeof(var) pco_new__ = (nval); \
     switch (sizeof(var)) { \
     case 1: \
-        asm("cmpxchgb %2, "__percpu_arg(1) \
+        asm qual ("cmpxchgb %2, "__percpu_arg(1) \
             : "=a" (pco_ret__), "+m" (var) \
             : "q" (pco_new__), "0" (pco_old__) \
             : "memory"); \
         break; \
     case 2: \
-        asm("cmpxchgw %2, "__percpu_arg(1) \
+        asm qual ("cmpxchgw %2, "__percpu_arg(1) \
             : "=a" (pco_ret__), "+m" (var) \
             : "r" (pco_new__), "0" (pco_old__) \
             : "memory"); \
         break; \
     case 4: \
-        asm("cmpxchgl %2, "__percpu_arg(1) \
+        asm qual ("cmpxchgl %2, "__percpu_arg(1) \
             : "=a" (pco_ret__), "+m" (var) \
             : "r" (pco_new__), "0" (pco_old__) \
             : "memory"); \
         break; \
     case 8: \
-        asm("cmpxchgq %2, "__percpu_arg(1) \
+        asm qual ("cmpxchgq %2, "__percpu_arg(1) \
             : "=a" (pco_ret__), "+m" (var) \
             : "r" (pco_new__), "0" (pco_old__) \
             : "memory"); \
···
  */
 #define this_cpu_read_stable(var)	percpu_stable_op("mov", var)
 
-#define raw_cpu_read_1(pcp)		percpu_from_op("mov", pcp)
-#define raw_cpu_read_2(pcp)		percpu_from_op("mov", pcp)
-#define raw_cpu_read_4(pcp)		percpu_from_op("mov", pcp)
+#define raw_cpu_read_1(pcp)		percpu_from_op(, "mov", pcp)
+#define raw_cpu_read_2(pcp)		percpu_from_op(, "mov", pcp)
+#define raw_cpu_read_4(pcp)		percpu_from_op(, "mov", pcp)
 
-#define raw_cpu_write_1(pcp, val)	percpu_to_op("mov", (pcp), val)
-#define raw_cpu_write_2(pcp, val)	percpu_to_op("mov", (pcp), val)
-#define raw_cpu_write_4(pcp, val)	percpu_to_op("mov", (pcp), val)
-#define raw_cpu_add_1(pcp, val)		percpu_add_op((pcp), val)
-#define raw_cpu_add_2(pcp, val)		percpu_add_op((pcp), val)
-#define raw_cpu_add_4(pcp, val)		percpu_add_op((pcp), val)
-#define raw_cpu_and_1(pcp, val)		percpu_to_op("and", (pcp), val)
-#define raw_cpu_and_2(pcp, val)		percpu_to_op("and", (pcp), val)
-#define raw_cpu_and_4(pcp, val)		percpu_to_op("and", (pcp), val)
-#define raw_cpu_or_1(pcp, val)		percpu_to_op("or", (pcp), val)
-#define raw_cpu_or_2(pcp, val)		percpu_to_op("or", (pcp), val)
-#define raw_cpu_or_4(pcp, val)		percpu_to_op("or", (pcp), val)
-#define raw_cpu_xchg_1(pcp, val)	percpu_xchg_op(pcp, val)
-#define raw_cpu_xchg_2(pcp, val)	percpu_xchg_op(pcp, val)
-#define raw_cpu_xchg_4(pcp, val)	percpu_xchg_op(pcp, val)
+#define raw_cpu_write_1(pcp, val)	percpu_to_op(, "mov", (pcp), val)
+#define raw_cpu_write_2(pcp, val)	percpu_to_op(, "mov", (pcp), val)
+#define raw_cpu_write_4(pcp, val)	percpu_to_op(, "mov", (pcp), val)
+#define raw_cpu_add_1(pcp, val)		percpu_add_op(, (pcp), val)
+#define raw_cpu_add_2(pcp, val)		percpu_add_op(, (pcp), val)
+#define raw_cpu_add_4(pcp, val)		percpu_add_op(, (pcp), val)
+#define raw_cpu_and_1(pcp, val)		percpu_to_op(, "and", (pcp), val)
+#define raw_cpu_and_2(pcp, val)		percpu_to_op(, "and", (pcp), val)
+#define raw_cpu_and_4(pcp, val)		percpu_to_op(, "and", (pcp), val)
+#define raw_cpu_or_1(pcp, val)		percpu_to_op(, "or", (pcp), val)
+#define raw_cpu_or_2(pcp, val)		percpu_to_op(, "or", (pcp), val)
+#define raw_cpu_or_4(pcp, val)		percpu_to_op(, "or", (pcp), val)
 
-#define this_cpu_read_1(pcp)		percpu_from_op("mov", pcp)
-#define this_cpu_read_2(pcp)		percpu_from_op("mov", pcp)
-#define this_cpu_read_4(pcp)		percpu_from_op("mov", pcp)
-#define this_cpu_write_1(pcp, val)	percpu_to_op("mov", (pcp), val)
-#define this_cpu_write_2(pcp, val)	percpu_to_op("mov", (pcp), val)
-#define this_cpu_write_4(pcp, val)	percpu_to_op("mov", (pcp), val)
-#define this_cpu_add_1(pcp, val)	percpu_add_op((pcp), val)
-#define this_cpu_add_2(pcp, val)	percpu_add_op((pcp), val)
-#define this_cpu_add_4(pcp, val)	percpu_add_op((pcp), val)
-#define this_cpu_and_1(pcp, val)	percpu_to_op("and", (pcp), val)
-#define this_cpu_and_2(pcp, val)	percpu_to_op("and", (pcp), val)
-#define this_cpu_and_4(pcp, val)	percpu_to_op("and", (pcp), val)
-#define this_cpu_or_1(pcp, val)		percpu_to_op("or", (pcp), val)
-#define this_cpu_or_2(pcp, val)		percpu_to_op("or", (pcp), val)
-#define this_cpu_or_4(pcp, val)		percpu_to_op("or", (pcp), val)
-#define this_cpu_xchg_1(pcp, nval)	percpu_xchg_op(pcp, nval)
-#define this_cpu_xchg_2(pcp, nval)	percpu_xchg_op(pcp, nval)
-#define this_cpu_xchg_4(pcp, nval)	percpu_xchg_op(pcp, nval)
+/*
+ * raw_cpu_xchg() can use a load-store since it is not required to be
+ * IRQ-safe.
+ */
+#define raw_percpu_xchg_op(var, nval) \
+({ \
+	typeof(var) pxo_ret__ = raw_cpu_read(var); \
+	raw_cpu_write(var, (nval)); \
+	pxo_ret__; \
+})
 
-#define raw_cpu_add_return_1(pcp, val)		percpu_add_return_op(pcp, val)
-#define raw_cpu_add_return_2(pcp, val)		percpu_add_return_op(pcp, val)
-#define raw_cpu_add_return_4(pcp, val)		percpu_add_return_op(pcp, val)
-#define raw_cpu_cmpxchg_1(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
-#define raw_cpu_cmpxchg_2(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
-#define raw_cpu_cmpxchg_4(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
+#define raw_cpu_xchg_1(pcp, val)	raw_percpu_xchg_op(pcp, val)
+#define raw_cpu_xchg_2(pcp, val)	raw_percpu_xchg_op(pcp, val)
+#define raw_cpu_xchg_4(pcp, val)	raw_percpu_xchg_op(pcp, val)
 
-#define this_cpu_add_return_1(pcp, val)		percpu_add_return_op(pcp, val)
-#define this_cpu_add_return_2(pcp, val)		percpu_add_return_op(pcp, val)
-#define this_cpu_add_return_4(pcp, val)		percpu_add_return_op(pcp, val)
-#define this_cpu_cmpxchg_1(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
-#define this_cpu_cmpxchg_2(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
-#define this_cpu_cmpxchg_4(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
+#define this_cpu_read_1(pcp)		percpu_from_op(volatile, "mov", pcp)
+#define this_cpu_read_2(pcp)		percpu_from_op(volatile, "mov", pcp)
+#define this_cpu_read_4(pcp)		percpu_from_op(volatile, "mov", pcp)
+#define this_cpu_write_1(pcp, val)	percpu_to_op(volatile, "mov", (pcp), val)
+#define this_cpu_write_2(pcp, val)	percpu_to_op(volatile, "mov", (pcp), val)
+#define this_cpu_write_4(pcp, val)	percpu_to_op(volatile, "mov", (pcp), val)
+#define this_cpu_add_1(pcp, val)	percpu_add_op(volatile, (pcp), val)
+#define this_cpu_add_2(pcp, val)	percpu_add_op(volatile, (pcp), val)
+#define this_cpu_add_4(pcp, val)	percpu_add_op(volatile, (pcp), val)
+#define this_cpu_and_1(pcp, val)	percpu_to_op(volatile, "and", (pcp), val)
+#define this_cpu_and_2(pcp, val)	percpu_to_op(volatile, "and", (pcp), val)
+#define this_cpu_and_4(pcp, val)	percpu_to_op(volatile, "and", (pcp), val)
+#define this_cpu_or_1(pcp, val)		percpu_to_op(volatile, "or", (pcp), val)
+#define this_cpu_or_2(pcp, val)		percpu_to_op(volatile, "or", (pcp), val)
+#define this_cpu_or_4(pcp, val)		percpu_to_op(volatile, "or", (pcp), val)
+#define this_cpu_xchg_1(pcp, nval)	percpu_xchg_op(volatile, pcp, nval)
+#define this_cpu_xchg_2(pcp, nval)	percpu_xchg_op(volatile, pcp, nval)
+#define this_cpu_xchg_4(pcp, nval)	percpu_xchg_op(volatile, pcp, nval)
+
+#define raw_cpu_add_return_1(pcp, val)		percpu_add_return_op(, pcp, val)
+#define raw_cpu_add_return_2(pcp, val)		percpu_add_return_op(, pcp, val)
+#define raw_cpu_add_return_4(pcp, val)		percpu_add_return_op(, pcp, val)
+#define raw_cpu_cmpxchg_1(pcp, oval, nval)	percpu_cmpxchg_op(, pcp, oval, nval)
+#define raw_cpu_cmpxchg_2(pcp, oval, nval)	percpu_cmpxchg_op(, pcp, oval, nval)
+#define raw_cpu_cmpxchg_4(pcp, oval, nval)	percpu_cmpxchg_op(, pcp, oval, nval)
+
+#define this_cpu_add_return_1(pcp, val)		percpu_add_return_op(volatile, pcp, val)
+#define this_cpu_add_return_2(pcp, val)		percpu_add_return_op(volatile, pcp, val)
+#define this_cpu_add_return_4(pcp, val)		percpu_add_return_op(volatile, pcp, val)
+#define this_cpu_cmpxchg_1(pcp, oval, nval)	percpu_cmpxchg_op(volatile, pcp, oval, nval)
+#define this_cpu_cmpxchg_2(pcp, oval, nval)	percpu_cmpxchg_op(volatile, pcp, oval, nval)
+#define this_cpu_cmpxchg_4(pcp, oval, nval)	percpu_cmpxchg_op(volatile, pcp, oval, nval)
 
 #ifdef CONFIG_X86_CMPXCHG64
 #define percpu_cmpxchg8b_double(pcp1, pcp2, o1, o2, n1, n2) \
···
  * 32 bit must fall back to generic operations.
  */
 #ifdef CONFIG_X86_64
-#define raw_cpu_read_8(pcp)			percpu_from_op("mov", pcp)
-#define raw_cpu_write_8(pcp, val)		percpu_to_op("mov", (pcp), val)
-#define raw_cpu_add_8(pcp, val)			percpu_add_op((pcp), val)
-#define raw_cpu_and_8(pcp, val)			percpu_to_op("and", (pcp), val)
-#define raw_cpu_or_8(pcp, val)			percpu_to_op("or", (pcp), val)
-#define raw_cpu_add_return_8(pcp, val)		percpu_add_return_op(pcp, val)
-#define raw_cpu_xchg_8(pcp, nval)		percpu_xchg_op(pcp, nval)
-#define raw_cpu_cmpxchg_8(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
+#define raw_cpu_read_8(pcp)			percpu_from_op(, "mov", pcp)
+#define raw_cpu_write_8(pcp, val)		percpu_to_op(, "mov", (pcp), val)
+#define raw_cpu_add_8(pcp, val)			percpu_add_op(, (pcp), val)
+#define raw_cpu_and_8(pcp, val)			percpu_to_op(, "and", (pcp), val)
+#define raw_cpu_or_8(pcp, val)			percpu_to_op(, "or", (pcp), val)
+#define raw_cpu_add_return_8(pcp, val)		percpu_add_return_op(, pcp, val)
+#define raw_cpu_xchg_8(pcp, nval)		raw_percpu_xchg_op(pcp, nval)
+#define raw_cpu_cmpxchg_8(pcp, oval, nval)	percpu_cmpxchg_op(, pcp, oval, nval)
 
-#define this_cpu_read_8(pcp)			percpu_from_op("mov", pcp)
-#define this_cpu_write_8(pcp, val)		percpu_to_op("mov", (pcp), val)
-#define this_cpu_add_8(pcp, val)		percpu_add_op((pcp), val)
-#define this_cpu_and_8(pcp, val)		percpu_to_op("and", (pcp), val)
-#define this_cpu_or_8(pcp, val)			percpu_to_op("or", (pcp), val)
-#define this_cpu_add_return_8(pcp, val)		percpu_add_return_op(pcp, val)
-#define this_cpu_xchg_8(pcp, nval)		percpu_xchg_op(pcp, nval)
-#define this_cpu_cmpxchg_8(pcp, oval, nval)	percpu_cmpxchg_op(pcp, oval, nval)
+#define this_cpu_read_8(pcp)			percpu_from_op(volatile, "mov", pcp)
+#define this_cpu_write_8(pcp, val)		percpu_to_op(volatile, "mov", (pcp), val)
+#define this_cpu_add_8(pcp, val)		percpu_add_op(volatile, (pcp), val)
+#define this_cpu_and_8(pcp, val)		percpu_to_op(volatile, "and", (pcp), val)
+#define this_cpu_or_8(pcp, val)			percpu_to_op(volatile, "or", (pcp), val)
+#define this_cpu_add_return_8(pcp, val)		percpu_add_return_op(volatile, pcp, val)
+#define this_cpu_xchg_8(pcp, nval)		percpu_xchg_op(volatile, pcp, nval)
+#define this_cpu_cmpxchg_8(pcp, oval, nval)	percpu_cmpxchg_op(volatile, pcp, oval, nval)
 
 /*
  * Pretty complex macro to generate cmpxchg16 instruction. The instruction
+2 -1
arch/x86/include/asm/smp.h
···
  * from the initial startup. We map APIC_BASE very early in page_setup(),
  * so this is correct in the x86 case.
  */
-#define raw_smp_processor_id() (this_cpu_read(cpu_number))
+#define raw_smp_processor_id()  this_cpu_read(cpu_number)
+#define __smp_processor_id() __this_cpu_read(cpu_number)
 
 #ifdef CONFIG_X86_32
 extern int safe_smp_processor_id(void);
+15
arch/x86/include/asm/text-patching.h
···
 #define __parainstructions_end	NULL
 #endif
 
+/*
+ * Currently, the max observed size in the kernel code is
+ * JUMP_LABEL_NOP_SIZE/RELATIVEJUMP_SIZE, which are 5.
+ * Raise it if needed.
+ */
+#define POKE_MAX_OPCODE_SIZE	5
+
+struct text_poke_loc {
+	void *detour;
+	void *addr;
+	size_t len;
+	const char opcode[POKE_MAX_OPCODE_SIZE];
+};
+
 extern void text_poke_early(void *addr, const void *opcode, size_t len);
 
 /*
···
 extern void *text_poke_kgdb(void *addr, const void *opcode, size_t len);
 extern int poke_int3_handler(struct pt_regs *regs);
 extern void text_poke_bp(void *addr, const void *opcode, size_t len, void *handler);
+extern void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries);
 extern int after_bootmem;
 extern __ro_after_init struct mm_struct *poking_mm;
 extern __ro_after_init unsigned long poking_addr;
+120 -34
arch/x86/kernel/alternative.c
···
 #include <linux/kdebug.h>
 #include <linux/kprobes.h>
 #include <linux/mmu_context.h>
+#include <linux/bsearch.h>
 #include <asm/text-patching.h>
 #include <asm/alternative.h>
 #include <asm/sections.h>
···
 	sync_core();
 }
 
-static bool bp_patching_in_progress;
-static void *bp_int3_handler, *bp_int3_addr;
+static struct bp_patching_desc {
+	struct text_poke_loc *vec;
+	int nr_entries;
+} bp_patching;
+
+static int patch_cmp(const void *key, const void *elt)
+{
+	struct text_poke_loc *tp = (struct text_poke_loc *) elt;
+
+	if (key < tp->addr)
+		return -1;
+	if (key > tp->addr)
+		return 1;
+	return 0;
+}
+NOKPROBE_SYMBOL(patch_cmp);
 
 int poke_int3_handler(struct pt_regs *regs)
 {
+	struct text_poke_loc *tp;
+	unsigned char int3 = 0xcc;
+	void *ip;
+
 	/*
 	 * Having observed our INT3 instruction, we now must observe
-	 * bp_patching_in_progress.
+	 * bp_patching.nr_entries.
 	 *
-	 *	in_progress = TRUE		INT3
+	 *	nr_entries != 0			INT3
 	 *	WMB				RMB
-	 *	write INT3			if (in_progress)
+	 *	write INT3			if (nr_entries)
 	 *
-	 * Idem for bp_int3_handler.
+	 * Idem for other elements in bp_patching.
 	 */
 	smp_rmb();
 
-	if (likely(!bp_patching_in_progress))
+	if (likely(!bp_patching.nr_entries))
 		return 0;
 
-	if (user_mode(regs) || regs->ip != (unsigned long)bp_int3_addr)
+	if (user_mode(regs))
 		return 0;
 
-	/* set up the specified breakpoint handler */
-	regs->ip = (unsigned long) bp_int3_handler;
+	/*
+	 * Discount the sizeof(int3). See text_poke_bp_batch().
+	 */
+	ip = (void *) regs->ip - sizeof(int3);
+
+	/*
+	 * Skip the binary search if there is a single member in the vector.
+	 */
+	if (unlikely(bp_patching.nr_entries > 1)) {
+		tp = bsearch(ip, bp_patching.vec, bp_patching.nr_entries,
+			     sizeof(struct text_poke_loc),
+			     patch_cmp);
+		if (!tp)
+			return 0;
+	} else {
+		tp = bp_patching.vec;
+		if (tp->addr != ip)
+			return 0;
+	}
+
+	/* set up the specified breakpoint detour */
+	regs->ip = (unsigned long) tp->detour;
 
 	return 1;
 }
 NOKPROBE_SYMBOL(poke_int3_handler);
 
 /**
- * text_poke_bp() -- update instructions on live kernel on SMP
- * @addr:	address to patch
- * @opcode:	opcode of new instruction
- * @len:	length to copy
- * @handler:	address to jump to when the temporary breakpoint is hit
+ * text_poke_bp_batch() -- update instructions on live kernel on SMP
+ * @tp:		vector of instructions to patch
+ * @nr_entries:	number of entries in the vector
  *
  * Modify multi-byte instruction by using int3 breakpoint on SMP.
  * We completely avoid stop_machine() here, and achieve the
  * synchronization using int3 breakpoint.
  *
  * The way it is done:
- *	- add a int3 trap to the address that will be patched
+ *	- For each entry in the vector:
+ *		- add a int3 trap to the address that will be patched
  *	- sync cores
- *	- update all but the first byte of the patched range
+ *	- For each entry in the vector:
+ *		- update all but the first byte of the patched range
  *	- sync cores
- *	- replace the first byte (int3) by the first byte of
- *	  replacing opcode
+ *	- For each entry in the vector:
+ *		- replace the first byte (int3) by the first byte of
+ *		  replacing opcode
  *	- sync cores
  */
-void text_poke_bp(void *addr, const void *opcode, size_t len, void *handler)
+void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries)
 {
+	int patched_all_but_first = 0;
 	unsigned char int3 = 0xcc;
-
-	bp_int3_handler = handler;
-	bp_int3_addr = (u8 *)addr + sizeof(int3);
-	bp_patching_in_progress = true;
+	unsigned int i;
 
 	lockdep_assert_held(&text_mutex);
 
+	bp_patching.vec = tp;
+	bp_patching.nr_entries = nr_entries;
+
 	/*
 	 * Corresponding read barrier in int3 notifier for making sure the
-	 * in_progress and handler are correctly ordered wrt. patching.
+	 * nr_entries and handler are correctly ordered wrt. patching.
 	 */
 	smp_wmb();
 
-	text_poke(addr, &int3, sizeof(int3));
+	/*
+	 * First step: add a int3 trap to the address that will be patched.
+	 */
+	for (i = 0; i < nr_entries; i++)
+		text_poke(tp[i].addr, &int3, sizeof(int3));
 
 	on_each_cpu(do_sync_core, NULL, 1);
 
-	if (len - sizeof(int3) > 0) {
-		/* patch all but the first byte */
-		text_poke((char *)addr + sizeof(int3),
-			  (const char *) opcode + sizeof(int3),
-			  len - sizeof(int3));
+	/*
+	 * Second step: update all but the first byte of the patched range.
+	 */
+	for (i = 0; i < nr_entries; i++) {
+		if (tp[i].len - sizeof(int3) > 0) {
+			text_poke((char *)tp[i].addr + sizeof(int3),
+				  (const char *)tp[i].opcode + sizeof(int3),
+				  tp[i].len - sizeof(int3));
+			patched_all_but_first++;
+		}
+	}
+
+	if (patched_all_but_first) {
 		/*
 		 * According to Intel, this core syncing is very likely
 		 * not necessary and we'd be safe even without it. But
···
 		on_each_cpu(do_sync_core, NULL, 1);
 	}
 
-	/* patch the first byte */
-	text_poke(addr, opcode, sizeof(int3));
+	/*
+	 * Third step: replace the first byte (int3) by the first byte of
+	 * replacing opcode.
+	 */
+	for (i = 0; i < nr_entries; i++)
+		text_poke(tp[i].addr, tp[i].opcode, sizeof(int3));
 
 	on_each_cpu(do_sync_core, NULL, 1);
 	/*
 	 * sync_core() implies an smp_mb() and orders this store against
 	 * the writing of the new instruction.
 	 */
-	bp_patching_in_progress = false;
+	bp_patching.vec = NULL;
+	bp_patching.nr_entries = 0;
 }
 
+/**
+ * text_poke_bp() -- update instructions on live kernel on SMP
+ * @addr:	address to patch
+ * @opcode:	opcode of new instruction
+ * @len:	length to copy
+ * @handler:	address to jump to when the temporary breakpoint is hit
+ *
+ * Update a single instruction with the vector in the stack, avoiding
+ * dynamically allocated memory. This function should be used when it is
+ * not possible to allocate memory.
+ */
+void text_poke_bp(void *addr, const void *opcode, size_t len, void *handler)
+{
+	struct text_poke_loc tp = {
+		.detour = handler,
+		.addr = addr,
+		.len = len,
+	};
+
+	if (len > POKE_MAX_OPCODE_SIZE) {
+		WARN_ONCE(1, "len is larger than %d\n", POKE_MAX_OPCODE_SIZE);
+		return;
+	}
+
+	memcpy((void *)tp.opcode, opcode, len);
+
+	text_poke_bp_batch(&tp, 1);
+}
+96 -25
arch/x86/kernel/jump_label.c
···
 	BUG();
 }
 
-static void __ref __jump_label_transform(struct jump_entry *entry,
-					 enum jump_label_type type,
-					 int init)
+static void __jump_label_set_jump_code(struct jump_entry *entry,
+				       enum jump_label_type type,
+				       union jump_code_union *code,
+				       int init)
 {
-	union jump_code_union jmp;
 	const unsigned char default_nop[] = { STATIC_KEY_INIT_NOP };
 	const unsigned char *ideal_nop = ideal_nops[NOP_ATOMIC5];
-	const void *expect, *code;
+	const void *expect;
 	int line;
 
-	jmp.jump = 0xe9;
-	jmp.offset = jump_entry_target(entry) -
-		     (jump_entry_code(entry) + JUMP_LABEL_NOP_SIZE);
+	code->jump = 0xe9;
+	code->offset = jump_entry_target(entry) -
+		       (jump_entry_code(entry) + JUMP_LABEL_NOP_SIZE);
 
-	if (type == JUMP_LABEL_JMP) {
-		if (init) {
-			expect = default_nop; line = __LINE__;
-		} else {
-			expect = ideal_nop; line = __LINE__;
-		}
-
-		code = &jmp.code;
+	if (init) {
+		expect = default_nop; line = __LINE__;
+	} else if (type == JUMP_LABEL_JMP) {
+		expect = ideal_nop; line = __LINE__;
 	} else {
-		if (init) {
-			expect = default_nop; line = __LINE__;
-		} else {
-			expect = &jmp.code; line = __LINE__;
-		}
-
-		code = ideal_nop;
+		expect = code->code; line = __LINE__;
 	}
 
 	if (memcmp((void *)jump_entry_code(entry), expect, JUMP_LABEL_NOP_SIZE))
 		bug_at((void *)jump_entry_code(entry), line);
+
+	if (type == JUMP_LABEL_NOP)
+		memcpy(code, ideal_nop, JUMP_LABEL_NOP_SIZE);
+}
+
+static void __ref __jump_label_transform(struct jump_entry *entry,
+					 enum jump_label_type type,
+					 int init)
+{
+	union jump_code_union code;
+
+	__jump_label_set_jump_code(entry, type, &code, init);
 
 	/*
 	 * As long as only a single processor is running and the code is still
···
 	 * always nop being the 'currently valid' instruction
 	 */
 	if (init || system_state == SYSTEM_BOOTING) {
-		text_poke_early((void *)jump_entry_code(entry), code,
+		text_poke_early((void *)jump_entry_code(entry), &code,
 				JUMP_LABEL_NOP_SIZE);
 		return;
 	}
 
-	text_poke_bp((void *)jump_entry_code(entry), code, JUMP_LABEL_NOP_SIZE,
+	text_poke_bp((void *)jump_entry_code(entry), &code, JUMP_LABEL_NOP_SIZE,
 		     (void *)jump_entry_code(entry) + JUMP_LABEL_NOP_SIZE);
 }
 
···
 	mutex_lock(&text_mutex);
 	__jump_label_transform(entry, type, 0);
 	mutex_unlock(&text_mutex);
+}
+
+#define TP_VEC_MAX (PAGE_SIZE / sizeof(struct text_poke_loc))
+static struct text_poke_loc tp_vec[TP_VEC_MAX];
+static int tp_vec_nr;
+
+bool arch_jump_label_transform_queue(struct jump_entry *entry,
+				     enum jump_label_type type)
+{
+	struct text_poke_loc *tp;
+	void *entry_code;
+
+	if (system_state == SYSTEM_BOOTING) {
+		/*
+		 * Fallback to the non-batching mode.
+		 */
+		arch_jump_label_transform(entry, type);
+		return true;
+	}
+
+	/*
+	 * No more space in the vector, tell upper layer to apply
+	 * the queue before continuing.
+	 */
+	if (tp_vec_nr == TP_VEC_MAX)
+		return false;
+
+	tp = &tp_vec[tp_vec_nr];
+
+	entry_code = (void *)jump_entry_code(entry);
+
+	/*
+	 * The INT3 handler will do a bsearch in the queue, so we need entries
+	 * to be sorted. We can survive an unsorted list by rejecting the entry,
+	 * forcing the generic jump_label code to apply the queue. Warning once,
+	 * to raise the attention to the case of an unsorted entry that is
+	 * better not happen, because, in the worst case we will perform in the
+	 * same way as we do without batching - with some more overhead.
+	 */
+	if (tp_vec_nr > 0) {
+		int prev = tp_vec_nr - 1;
+		struct text_poke_loc *prev_tp = &tp_vec[prev];
+
+		if (WARN_ON_ONCE(prev_tp->addr > entry_code))
+			return false;
+	}
+
+	__jump_label_set_jump_code(entry, type,
+				   (union jump_code_union *) &tp->opcode, 0);
+
+	tp->addr = entry_code;
+	tp->detour = entry_code + JUMP_LABEL_NOP_SIZE;
+	tp->len = JUMP_LABEL_NOP_SIZE;
+
+	tp_vec_nr++;
+
+	return true;
+}
+
+void arch_jump_label_transform_apply(void)
+{
+	if (!tp_vec_nr)
+		return;
+
+	mutex_lock(&text_mutex);
+	text_poke_bp_batch(tp_vec, tp_vec_nr);
+	mutex_unlock(&text_mutex);
+
+	tp_vec_nr = 0;
 }
 
 static enum {
+3 -3
drivers/crypto/nx/nx-842-pseries.c
···
 	rcu_read_lock();						\
 	local_devdata = rcu_dereference(devdata);			\
 	if (local_devdata)						\
-		p = snprintf(buf, PAGE_SIZE, "%ld\n",			\
+		p = snprintf(buf, PAGE_SIZE, "%lld\n",			\
 			     atomic64_read(&local_devdata->counters->_name)); \
 	rcu_read_unlock();						\
 	return p;							\
···
 	}
 
 	for (i = 0; i < (NX842_HIST_SLOTS - 2); i++) {
-		bytes = snprintf(p, bytes_remain, "%u-%uus:\t%ld\n",
+		bytes = snprintf(p, bytes_remain, "%u-%uus:\t%lld\n",
 				 i ? (2<<(i-1)) : 0, (2<<i)-1,
 				 atomic64_read(&times[i]));
 		bytes_remain -= bytes;
···
 	}
 	/* The last bucket holds everything over
 	 * 2<<(NX842_HIST_SLOTS - 2) us */
-	bytes = snprintf(p, bytes_remain, "%uus - :\t%ld\n",
+	bytes = snprintf(p, bytes_remain, "%uus - :\t%lld\n",
 			 2<<(NX842_HIST_SLOTS - 2),
 			 atomic64_read(&times[(NX842_HIST_SLOTS - 1)]));
 	p += bytes;
+1 -1
drivers/infiniband/core/device.c
···
 	int rc;
 	int i;
 
-	lockdep_assert_held_exclusive(&devices_rwsem);
+	lockdep_assert_held_write(&devices_rwsem);
 	ida_init(&inuse);
 	xa_for_each (&devices, index, device) {
 		char buf[IB_DEVICE_NAME_MAX];
+4 -4
drivers/tty/tty_ldisc.c
···
 
 static void tty_ldisc_close(struct tty_struct *tty, struct tty_ldisc *ld)
 {
-	lockdep_assert_held_exclusive(&tty->ldisc_sem);
+	lockdep_assert_held_write(&tty->ldisc_sem);
 	WARN_ON(!test_bit(TTY_LDISC_OPEN, &tty->flags));
 	clear_bit(TTY_LDISC_OPEN, &tty->flags);
 	if (ld->ops->close)
···
 	struct tty_ldisc *disc = tty_ldisc_get(tty, ld);
 	int r;
 
-	lockdep_assert_held_exclusive(&tty->ldisc_sem);
+	lockdep_assert_held_write(&tty->ldisc_sem);
 	if (IS_ERR(disc))
 		return PTR_ERR(disc);
 	tty->ldisc = disc;
···
  */
 static void tty_ldisc_kill(struct tty_struct *tty)
 {
-	lockdep_assert_held_exclusive(&tty->ldisc_sem);
+	lockdep_assert_held_write(&tty->ldisc_sem);
 	if (!tty->ldisc)
 		return;
 	/*
···
 	struct tty_ldisc *ld;
 	int retval;
 
-	lockdep_assert_held_exclusive(&tty->ldisc_sem);
+	lockdep_assert_held_write(&tty->ldisc_sem);
 	ld = tty_ldisc_get(tty, disc);
 	if (IS_ERR(ld)) {
 		BUG_ON(disc == N_TTY);
+1 -1
fs/dax.c
···
 	unsigned flags = 0;
 
 	if (iov_iter_rw(iter) == WRITE) {
-		lockdep_assert_held_exclusive(&inode->i_rwsem);
+		lockdep_assert_held_write(&inode->i_rwsem);
 		flags |= IOMAP_WRITE;
 	} else {
 		lockdep_assert_held(&inode->i_rwsem);
+10 -10
include/asm-generic/atomic64.h
···
 #include <linux/types.h>
 
 typedef struct {
-	long long counter;
+	s64 counter;
 } atomic64_t;
 
 #define ATOMIC64_INIT(i)	{ (i) }
 
-extern long long atomic64_read(const atomic64_t *v);
-extern void	 atomic64_set(atomic64_t *v, long long i);
+extern s64 atomic64_read(const atomic64_t *v);
+extern void atomic64_set(atomic64_t *v, s64 i);
 
 #define atomic64_set_release(v, i)	atomic64_set((v), (i))
 
 #define ATOMIC64_OP(op)							\
-extern void	 atomic64_##op(long long a, atomic64_t *v);
+extern void atomic64_##op(s64 a, atomic64_t *v);
 
 #define ATOMIC64_OP_RETURN(op)						\
-extern long long atomic64_##op##_return(long long a, atomic64_t *v);
+extern s64 atomic64_##op##_return(s64 a, atomic64_t *v);
 
 #define ATOMIC64_FETCH_OP(op)						\
-extern long long atomic64_fetch_##op(long long a, atomic64_t *v);
+extern s64 atomic64_fetch_##op(s64 a, atomic64_t *v);
 
 #define ATOMIC64_OPS(op)	ATOMIC64_OP(op) ATOMIC64_OP_RETURN(op) ATOMIC64_FETCH_OP(op)
 
···
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
 
-extern long long atomic64_dec_if_positive(atomic64_t *v);
+extern s64 atomic64_dec_if_positive(atomic64_t *v);
 #define atomic64_dec_if_positive atomic64_dec_if_positive
-extern long long atomic64_cmpxchg(atomic64_t *v, long long o, long long n);
-extern long long atomic64_xchg(atomic64_t *v, long long new);
-extern long long atomic64_fetch_add_unless(atomic64_t *v, long long a, long long u);
+extern s64 atomic64_cmpxchg(atomic64_t *v, s64 o, s64 n);
+extern s64 atomic64_xchg(atomic64_t *v, s64 new);
+extern s64 atomic64_fetch_add_unless(atomic64_t *v, s64 a, s64 u);
 #define atomic64_fetch_add_unless atomic64_fetch_add_unless
 
 #endif  /* _ASM_GENERIC_ATOMIC64_H */
+3
include/linux/jump_label.h
···
 					    enum jump_label_type type);
 extern void arch_jump_label_transform_static(struct jump_entry *entry,
 					     enum jump_label_type type);
+extern bool arch_jump_label_transform_queue(struct jump_entry *entry,
+					    enum jump_label_type type);
+extern void arch_jump_label_transform_apply(void);
 extern int jump_label_text_reserved(void *start, void *end);
 extern void static_key_slow_inc(struct static_key *key);
 extern void static_key_slow_dec(struct static_key *key);
+24 -12
include/linux/lockdep.h
···
 	struct lock_list		*parent;
 };
 
-/*
- * We record lock dependency chains, so that we can cache them:
+/**
+ * struct lock_chain - lock dependency chain record
+ *
+ * @irq_context: the same as irq_context in held_lock below
+ * @depth:       the number of held locks in this chain
+ * @base:        the index in chain_hlocks for this chain
+ * @entry:       the collided lock chains in lock_chain hash list
+ * @chain_key:   the hash key of this lock_chain
  */
 struct lock_chain {
-	/* see BUILD_BUG_ON()s in lookup_chain_cache() */
+	/* see BUILD_BUG_ON()s in add_chain_cache() */
 	unsigned int			irq_context :  2,
 					depth       :  6,
 					base        : 24;
···
 };
 
 #define MAX_LOCKDEP_KEYS_BITS		13
-/*
- * Subtract one because we offset hlock->class_idx by 1 in order
- * to make 0 mean no class. This avoids overflowing the class_idx
- * bitfield and hitting the BUG in hlock_class().
- */
-#define MAX_LOCKDEP_KEYS		((1UL << MAX_LOCKDEP_KEYS_BITS) - 1)
+#define MAX_LOCKDEP_KEYS		(1UL << MAX_LOCKDEP_KEYS_BITS)
+#define INITIAL_CHAIN_KEY		-1
 
 struct held_lock {
 	/*
···
 	u64				waittime_stamp;
 	u64				holdtime_stamp;
 #endif
+	/*
+	 * class_idx is zero-indexed; it points to the element in
+	 * lock_classes this held lock instance belongs to. class_idx is in
+	 * the range from 0 to (MAX_LOCKDEP_KEYS-1) inclusive.
+	 */
 	unsigned int			class_idx:MAX_LOCKDEP_KEYS_BITS;
 	/*
 	 * The lock-stack is unified in that the lock chains of interrupt
···
 extern void lockdep_free_key_range(void *start, unsigned long size);
 extern asmlinkage void lockdep_sys_exit(void);
 extern void lockdep_set_selftest_task(struct task_struct *task);
+
+extern void lockdep_init_task(struct task_struct *task);
 
 extern void lockdep_off(void);
 extern void lockdep_on(void);
···
 	WARN_ON(debug_locks && !lockdep_is_held(l));		\
 } while (0)
 
-#define lockdep_assert_held_exclusive(l)	do {			\
+#define lockdep_assert_held_write(l)	do {			\
 	WARN_ON(debug_locks && !lockdep_is_held_type(l, 0));	\
 } while (0)
 
···
 #define lockdep_unpin_lock(l,c)	lock_unpin_lock(&(l)->dep_map, (c))
 
 #else /* !CONFIG_LOCKDEP */
+
+static inline void lockdep_init_task(struct task_struct *task)
+{
+}
 
 static inline void lockdep_off(void)
 {
···
 #define lockdep_is_held_type(l, r)		(1)
 
 #define lockdep_assert_held(l)			do { (void)(l); } while (0)
-#define lockdep_assert_held_exclusive(l)	do { (void)(l); } while (0)
+#define lockdep_assert_held_write(l)		do { (void)(l); } while (0)
 #define lockdep_assert_held_read(l)		do { (void)(l); } while (0)
 #define lockdep_assert_held_once(l)		do { (void)(l); } while (0)
 
···
 	{ .name = (_name), .key = (void *)(_key), }
 
 static inline void lockdep_invariant_state(bool force) {}
-static inline void lockdep_init_task(struct task_struct *task) {}
 static inline void lockdep_free_task(struct task_struct *task) {}
 
 #ifdef CONFIG_LOCK_STAT
+2 -2
include/linux/percpu-rwsem.h
···
 	lock_release(&sem->rw_sem.dep_map, 1, ip);
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
 	if (!read)
-		sem->rw_sem.owner = RWSEM_OWNER_UNKNOWN;
+		atomic_long_set(&sem->rw_sem.owner, RWSEM_OWNER_UNKNOWN);
 #endif
 }
 
···
 	lock_acquire(&sem->rw_sem.dep_map, 0, 1, read, 1, NULL, ip);
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
 	if (!read)
-		sem->rw_sem.owner = current;
+		atomic_long_set(&sem->rw_sem.owner, (long)current);
 #endif
 }
 
+9 -7
include/linux/rwsem.h
···
  */
 struct rw_semaphore {
 	atomic_long_t count;
-#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
 	/*
-	 * Write owner. Used as a speculative check to see
-	 * if the owner is running on the cpu.
+	 * Write owner or one of the read owners as well flags regarding
+	 * the current state of the rwsem. Can be used as a speculative
+	 * check to see if the write owner is running on the cpu.
 	 */
-	struct task_struct *owner;
+	atomic_long_t owner;
+#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
 	struct optimistic_spin_queue osq; /* spinner MCS lock */
 #endif
 	raw_spinlock_t wait_lock;
···
 };
 
 /*
- * Setting bit 1 of the owner field but not bit 0 will indicate
+ * Setting all bits of the owner field except bit 0 will indicate
  * that the rwsem is writer-owned with an unknown owner.
  */
-#define RWSEM_OWNER_UNKNOWN	((struct task_struct *)-2L)
+#define RWSEM_OWNER_UNKNOWN	(-2L)
 
 /* In all implementations count != 0 means locked */
 static inline int rwsem_is_locked(struct rw_semaphore *sem)
···
 #endif
 
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
-#define __RWSEM_OPT_INIT(lockname) , .osq = OSQ_LOCK_UNLOCKED, .owner = NULL
+#define __RWSEM_OPT_INIT(lockname) , .osq = OSQ_LOCK_UNLOCKED
 #else
 #define __RWSEM_OPT_INIT(lockname)
 #endif
 
 #define __RWSEM_INITIALIZER(name)				\
 	{ __RWSEM_INIT_COUNT(name),				\
+	  .owner = ATOMIC_LONG_INIT(0),				\
 	  .wait_list = LIST_HEAD_INIT((name).wait_list),	\
 	  .wait_lock = __RAW_SPIN_LOCK_UNLOCKED(name.wait_lock)	\
 	  __RWSEM_OPT_INIT(name)				\
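Turning `owner` into an `atomic_long_t` is what lets the new rwsem code fold state flags into the same word as the owner task pointer: task structures are word-aligned, so their low bits are always zero and free to reuse. A userspace sketch of that pointer-plus-flags packing (the flag names and mask below are illustrative only, not the kernel's actual bit layout):

```c
#include <assert.h>

/* Hypothetical flag bits kept in the low bits of the owner word;
 * the real rwsem implementation defines its own layout. */
#define OWNER_READER	0x1UL
#define OWNER_FLAG_MASK	0x3UL

struct task { long pad; };	/* word-aligned stand-in for task_struct */

/* Pack a task pointer and flag bits into one long-sized word. */
static unsigned long owner_pack(struct task *t, unsigned long flags)
{
	return (unsigned long)t | flags;
}

/* Mask the flag bits back out to recover the pointer. */
static struct task *owner_task(unsigned long owner)
{
	return (struct task *)(owner & ~OWNER_FLAG_MASK);
}

static int owner_is_reader(unsigned long owner)
{
	return (owner & OWNER_READER) != 0;
}

/* Round-trip check: pack a pointer with a flag, recover both. */
static int pack_roundtrip_ok(void)
{
	static struct task t;
	unsigned long owner = owner_pack(&t, OWNER_READER);

	return owner_task(owner) == &t && owner_is_reader(owner) &&
	       !owner_is_reader(owner_pack(&t, 0));
}
```

Reading one word atomically thus yields both "who owns it" and "how it is owned", which is what the optimistic-spinning paths need.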
+5
include/linux/sched/wake_q.h
···
 	head->lastp = &head->first;
 }
 
+static inline bool wake_q_empty(struct wake_q_head *head)
+{
+	return head->first == WAKE_Q_TAIL;
+}
+
 extern void wake_q_add(struct wake_q_head *head, struct task_struct *task);
 extern void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task);
 extern void wake_up_q(struct wake_q_head *head);
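wake_q lists are terminated by a sentinel value rather than NULL, which is why the new `wake_q_empty()` is a single pointer comparison against that sentinel. A minimal userspace sketch of the same head/tail-pointer structure (the types and the sentinel value are simplified stand-ins for the kernel's `WAKE_Q_TAIL` machinery):

```c
#include <assert.h>
#include <stddef.h>

struct node { struct node *next; };

/* Sentinel playing the role of WAKE_Q_TAIL: it terminates the list,
 * so first == TAIL means nothing was ever queued. */
#define TAIL ((struct node *)0x01)

struct wq_head {
	struct node *first;
	struct node **lastp;	/* points at the slot to append into */
};

static void wq_init(struct wq_head *h)
{
	h->first = TAIL;
	h->lastp = &h->first;
}

static int wq_empty(struct wq_head *h)
{
	return h->first == TAIL;
}

static void wq_add(struct wq_head *h, struct node *n)
{
	n->next = TAIL;
	*h->lastp = n;		/* append in O(1) via the tail slot */
	h->lastp = &n->next;
}

static int wq_selftest(void)
{
	static struct node a, b;
	struct wq_head h;

	wq_init(&h);
	if (!wq_empty(&h))
		return 0;
	wq_add(&h, &a);
	wq_add(&h, &b);
	return !wq_empty(&h) && h.first == &a && a.next == &b &&
	       b.next == TAIL;
}
```

The `lastp` double-pointer is what makes appends O(1) without a separate tail field that would need its own empty-case handling.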
+32 -15
include/linux/smp.h
···
 
 #endif /* !SMP */
 
-/*
- * smp_processor_id(): get the current CPU ID.
+/**
+ * raw_processor_id() - get the current (unstable) CPU id
  *
- * if DEBUG_PREEMPT is enabled then we check whether it is
- * used in a preemption-safe way. (smp_processor_id() is safe
- * if it's used in a preemption-off critical section, or in
- * a thread that is bound to the current CPU.)
- *
- * NOTE: raw_smp_processor_id() is for internal use only
- * (smp_processor_id() is the preferred variant), but in rare
- * instances it might also be used to turn off false positives
- * (i.e. smp_processor_id() use that the debugging code reports but
- * which use for some reason is legal). Don't use this to hack around
- * the warning message, as your code might not work under PREEMPT.
+ * For then you know what you are doing and need an unstable
+ * CPU id.
  */
+
+/**
+ * smp_processor_id() - get the current (stable) CPU id
+ *
+ * This is the normal accessor to the CPU id and should be used
+ * whenever possible.
+ *
+ * The CPU id is stable when:
+ *
+ *  - IRQs are disabled;
+ *  - preemption is disabled;
+ *  - the task is CPU affine.
+ *
+ * When CONFIG_DEBUG_PREEMPT; we verify these assumption and WARN
+ * when smp_processor_id() is used when the CPU id is not stable.
+ */
+
+/*
+ * Allow the architecture to differentiate between a stable and unstable read.
+ * For example, x86 uses an IRQ-safe asm-volatile read for the unstable but a
+ * regular asm read for the stable.
+ */
+#ifndef __smp_processor_id
+#define __smp_processor_id(x) raw_smp_processor_id(x)
+#endif
+
 #ifdef CONFIG_DEBUG_PREEMPT
   extern unsigned int debug_smp_processor_id(void);
 # define smp_processor_id() debug_smp_processor_id()
 #else
-# define smp_processor_id() raw_smp_processor_id()
+# define smp_processor_id() __smp_processor_id()
 #endif
 
-#define get_cpu()		({ preempt_disable(); smp_processor_id(); })
+#define get_cpu()		({ preempt_disable(); __smp_processor_id(); })
 #define put_cpu()		preempt_enable()
 
 /*
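A loose userspace analogy for the stable/unstable distinction documented above (this uses glibc's `sched_getcpu()` and `sched_setaffinity()`, which are unrelated to the kernel-internal API): an unpinned thread can be migrated at any moment, so the CPU id it reads is only a hint, while pinning the thread plays the role that disabling preemption plays in the kernel.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Unstable read: a valid CPU number, but possibly stale the moment it
 * is returned, since the scheduler may migrate this thread anywhere. */
static int unstable_cpu_id(void)
{
	return sched_getcpu();
}

/* Stable read: pin the thread to whatever CPU it currently runs on;
 * afterwards, repeated reads cannot change until the affinity mask is
 * widened again. */
static int stable_cpu_id(void)
{
	cpu_set_t set;
	int cpu = sched_getcpu();

	if (cpu < 0)
		return cpu;
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	if (sched_setaffinity(0, sizeof(set), &set) != 0)
		return -1;	/* pinning refused; id stays unstable */
	return cpu;
}
```

This is only an analogy: the kernel's `__smp_processor_id()` split is about how cheaply the per-CPU read can be emitted, not about affinity syscalls.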
+1 -1
include/linux/types.h
···
 
 #ifdef CONFIG_64BIT
 typedef struct {
-	long counter;
+	s64 counter;
 } atomic64_t;
 #endif
 
+2
init/init_task.c
···
 	.softirqs_enabled = 1,
 #endif
 #ifdef CONFIG_LOCKDEP
+	.lockdep_depth = 0, /* no locks held yet */
+	.curr_chain_key = INITIAL_CHAIN_KEY,
 	.lockdep_recursion = 0,
 #endif
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
-3
kernel/fork.c
···
 	p->pagefault_disabled = 0;
 
 #ifdef CONFIG_LOCKDEP
-	p->lockdep_depth = 0; /* no locks held yet */
-	p->curr_chain_key = 0;
-	p->lockdep_recursion = 0;
 	lockdep_init_task(p);
 #endif
 
+39 -30
kernel/futex.c
··· 471 471 }; 472 472 473 473 /** 474 + * futex_setup_timer - set up the sleeping hrtimer. 475 + * @time: ptr to the given timeout value 476 + * @timeout: the hrtimer_sleeper structure to be set up 477 + * @flags: futex flags 478 + * @range_ns: optional range in ns 479 + * 480 + * Return: Initialized hrtimer_sleeper structure or NULL if no timeout 481 + * value given 482 + */ 483 + static inline struct hrtimer_sleeper * 484 + futex_setup_timer(ktime_t *time, struct hrtimer_sleeper *timeout, 485 + int flags, u64 range_ns) 486 + { 487 + if (!time) 488 + return NULL; 489 + 490 + hrtimer_init_on_stack(&timeout->timer, (flags & FLAGS_CLOCKRT) ? 491 + CLOCK_REALTIME : CLOCK_MONOTONIC, 492 + HRTIMER_MODE_ABS); 493 + hrtimer_init_sleeper(timeout, current); 494 + 495 + /* 496 + * If range_ns is 0, calling hrtimer_set_expires_range_ns() is 497 + * effectively the same as calling hrtimer_set_expires(). 498 + */ 499 + hrtimer_set_expires_range_ns(&timeout->timer, *time, range_ns); 500 + 501 + return timeout; 502 + } 503 + 504 + /** 474 505 * get_futex_key() - Get parameters which are the keys for a futex 475 506 * @uaddr: virtual address of the futex 476 507 * @fshared: 0 for a PROCESS_PRIVATE futex, 1 for PROCESS_SHARED ··· 2710 2679 static int futex_wait(u32 __user *uaddr, unsigned int flags, u32 val, 2711 2680 ktime_t *abs_time, u32 bitset) 2712 2681 { 2713 - struct hrtimer_sleeper timeout, *to = NULL; 2682 + struct hrtimer_sleeper timeout, *to; 2714 2683 struct restart_block *restart; 2715 2684 struct futex_hash_bucket *hb; 2716 2685 struct futex_q q = futex_q_init; ··· 2720 2689 return -EINVAL; 2721 2690 q.bitset = bitset; 2722 2691 2723 - if (abs_time) { 2724 - to = &timeout; 2725 - 2726 - hrtimer_init_on_stack(&to->timer, (flags & FLAGS_CLOCKRT) ? 
2727 - CLOCK_REALTIME : CLOCK_MONOTONIC, 2728 - HRTIMER_MODE_ABS); 2729 - hrtimer_init_sleeper(to, current); 2730 - hrtimer_set_expires_range_ns(&to->timer, *abs_time, 2731 - current->timer_slack_ns); 2732 - } 2733 - 2692 + to = futex_setup_timer(abs_time, &timeout, flags, 2693 + current->timer_slack_ns); 2734 2694 retry: 2735 2695 /* 2736 2696 * Prepare to wait on uaddr. On success, holds hb lock and increments ··· 2801 2779 static int futex_lock_pi(u32 __user *uaddr, unsigned int flags, 2802 2780 ktime_t *time, int trylock) 2803 2781 { 2804 - struct hrtimer_sleeper timeout, *to = NULL; 2782 + struct hrtimer_sleeper timeout, *to; 2805 2783 struct futex_pi_state *pi_state = NULL; 2806 2784 struct rt_mutex_waiter rt_waiter; 2807 2785 struct futex_hash_bucket *hb; ··· 2814 2792 if (refill_pi_state_cache()) 2815 2793 return -ENOMEM; 2816 2794 2817 - if (time) { 2818 - to = &timeout; 2819 - hrtimer_init_on_stack(&to->timer, CLOCK_REALTIME, 2820 - HRTIMER_MODE_ABS); 2821 - hrtimer_init_sleeper(to, current); 2822 - hrtimer_set_expires(&to->timer, *time); 2823 - } 2795 + to = futex_setup_timer(time, &timeout, FLAGS_CLOCKRT, 0); 2824 2796 2825 2797 retry: 2826 2798 ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &q.key, FUTEX_WRITE); ··· 3211 3195 u32 val, ktime_t *abs_time, u32 bitset, 3212 3196 u32 __user *uaddr2) 3213 3197 { 3214 - struct hrtimer_sleeper timeout, *to = NULL; 3198 + struct hrtimer_sleeper timeout, *to; 3215 3199 struct futex_pi_state *pi_state = NULL; 3216 3200 struct rt_mutex_waiter rt_waiter; 3217 3201 struct futex_hash_bucket *hb; ··· 3228 3212 if (!bitset) 3229 3213 return -EINVAL; 3230 3214 3231 - if (abs_time) { 3232 - to = &timeout; 3233 - hrtimer_init_on_stack(&to->timer, (flags & FLAGS_CLOCKRT) ? 
3234 - CLOCK_REALTIME : CLOCK_MONOTONIC, 3235 - HRTIMER_MODE_ABS); 3236 - hrtimer_init_sleeper(to, current); 3237 - hrtimer_set_expires_range_ns(&to->timer, *abs_time, 3238 - current->timer_slack_ns); 3239 - } 3215 + to = futex_setup_timer(abs_time, &timeout, flags, 3216 + current->timer_slack_ns); 3240 3217 3241 3218 /* 3242 3219 * The waiter is allocated on our stack, manipulated by the requeue
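The new `futex_setup_timer()` helper collapses three nearly identical open-coded blocks into one, using the "return NULL when no timeout was requested" convention so callers keep their existing `if (to)` cleanup paths unchanged. A simplified userspace sketch of that shape (the types and the sample value are stand-ins, not the kernel's hrtimer API):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct hrtimer_sleeper. */
struct sleeper {
	long long expires;
	long long slack;
};

/* Mirrors the futex_setup_timer() pattern: the caller always passes an
 * on-stack 'timeout', but it is only initialized (and returned) when a
 * timeout value was actually supplied. */
static struct sleeper *setup_timer(const long long *time,
				   struct sleeper *timeout,
				   long long slack_ns)
{
	if (!time)
		return NULL;

	timeout->expires = *time;
	timeout->slack = slack_ns;
	return timeout;
}

static long long sample_time = 1000;	/* example absolute timeout */

static long long wait_with_optional_timeout(const long long *abs_time)
{
	struct sleeper timeout;
	struct sleeper *to = setup_timer(abs_time, &timeout, 50);

	/* ... blocking logic would go here ... */

	return to ? to->expires : -1;	/* -1: waited with no timeout */
}
```

Keeping the sleeper on the caller's stack while letting the helper decide whether it is live is what makes the refactoring a pure code-motion cleanup rather than a behavior change.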
+55 -11
kernel/jump_label.c
···
 	const struct jump_entry *jea = a;
 	const struct jump_entry *jeb = b;
 
+	/*
+	 * Entrires are sorted by key.
+	 */
 	if (jump_entry_key(jea) < jump_entry_key(jeb))
 		return -1;
 
 	if (jump_entry_key(jea) > jump_entry_key(jeb))
+		return 1;
+
+	/*
+	 * In the batching mode, entries should also be sorted by the code
+	 * inside the already sorted list of entries, enabling a bsearch in
+	 * the vector.
+	 */
+	if (jump_entry_code(jea) < jump_entry_code(jeb))
+		return -1;
+
+	if (jump_entry_code(jea) > jump_entry_code(jeb))
 		return 1;
 
 	return 0;
···
 	return enabled ^ branch;
 }
 
+static bool jump_label_can_update(struct jump_entry *entry, bool init)
+{
+	/*
+	 * Cannot update code that was in an init text area.
+	 */
+	if (!init && jump_entry_is_init(entry))
+		return false;
+
+	if (!kernel_text_address(jump_entry_code(entry))) {
+		WARN_ONCE(1, "can't patch jump_label at %pS", (void *)jump_entry_code(entry));
+		return false;
+	}
+
+	return true;
+}
+
+#ifndef HAVE_JUMP_LABEL_BATCH
 static void __jump_label_update(struct static_key *key,
 				struct jump_entry *entry,
 				struct jump_entry *stop,
 				bool init)
 {
 	for (; (entry < stop) && (jump_entry_key(entry) == key); entry++) {
-		/*
-		 * An entry->code of 0 indicates an entry which has been
-		 * disabled because it was in an init text area.
-		 */
-		if (init || !jump_entry_is_init(entry)) {
-			if (kernel_text_address(jump_entry_code(entry)))
-				arch_jump_label_transform(entry, jump_label_type(entry));
-			else
-				WARN_ONCE(1, "can't patch jump_label at %pS",
-					  (void *)jump_entry_code(entry));
-		}
+		if (jump_label_can_update(entry, init))
+			arch_jump_label_transform(entry, jump_label_type(entry));
 	}
 }
+#else
+static void __jump_label_update(struct static_key *key,
+				struct jump_entry *entry,
+				struct jump_entry *stop,
+				bool init)
+{
+	for (; (entry < stop) && (jump_entry_key(entry) == key); entry++) {
+
+		if (!jump_label_can_update(entry, init))
+			continue;
+
+		if (!arch_jump_label_transform_queue(entry, jump_label_type(entry))) {
+			/*
+			 * Queue is full: Apply the current queue and try again.
+			 */
+			arch_jump_label_transform_apply();
+			BUG_ON(!arch_jump_label_transform_queue(entry, jump_label_type(entry)));
+		}
+	}
+	arch_jump_label_transform_apply();
+}
+#endif
 
 void __init jump_label_init(void)
 {
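The batched variant relies on two arch hooks: a queue operation that may refuse an entry when its buffer is full, and an apply operation that patches the whole batch with one expensive synchronization step (the IPI round the changelog is about). A self-contained sketch of that queue-then-flush control flow (the queue size and counters are invented for the sketch):

```c
#include <assert.h>

#define QUEUE_MAX 4

static int queue_len;
static int applied_entries;	/* total entries patched so far */
static int apply_calls;		/* how many expensive flushes happened */

/* Stand-in for arch_jump_label_transform_queue(): refuses when full. */
static int transform_queue(int entry)
{
	(void)entry;
	if (queue_len == QUEUE_MAX)
		return 0;
	queue_len++;
	return 1;
}

/* Stand-in for arch_jump_label_transform_apply(): one flush patches
 * the whole batch; this is the step batching amortizes. */
static void transform_apply(void)
{
	applied_entries += queue_len;
	queue_len = 0;
	apply_calls++;
}

/* Control flow of the batched __jump_label_update(): queue each
 * entry, flushing once when the queue fills, then flush the rest. */
static void update_batched(int nentries)
{
	for (int i = 0; i < nentries; i++) {
		if (!transform_queue(i)) {
			transform_apply();	/* full: flush and retry */
			transform_queue(i);	/* must succeed now */
		}
	}
	transform_apply();
}
```

With 10 entries and a queue of 4, only 3 flushes happen instead of 10 per-entry synchronizations, which is the whole point of the patchset for isolated RT CPUs.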
+1 -1
kernel/locking/Makefile
···
 # and is generally not a function of system call inputs.
 KCOV_INSTRUMENT := n
 
-obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o rwsem-xadd.o
+obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o
 
 ifdef CONFIG_FUNCTION_TRACER
 CFLAGS_REMOVE_lockdep.o = $(CC_FLAGS_FTRACE)
+4 -41
kernel/locking/lock_events.h
···
 DECLARE_PER_CPU(unsigned long, lockevents[lockevent_num]);
 
 /*
- * The purpose of the lock event counting subsystem is to provide a low
- * overhead way to record the number of specific locking events by using
- * percpu counters. It is the percpu sum that matters, not specifically
- * how many of them happens in each cpu.
- *
- * It is possible that the same percpu counter may be modified in both
- * the process and interrupt contexts. For architectures that perform
- * percpu operation with multiple instructions, it is possible to lose
- * count if a process context percpu update is interrupted in the middle
- * and the same counter is updated in the interrupt context. Therefore,
- * the generated percpu sum may not be precise. The error, if any, should
- * be small and insignificant.
- *
- * For those architectures that do multi-instruction percpu operation,
- * preemption in the middle and moving the task to another cpu may cause
- * a larger error in the count. Again, this will be few and far between.
- * Given the imprecise nature of the count and the possibility of resetting
- * the count and doing the measurement again, this is not really a big
- * problem.
- *
- * To get a better picture of what is happening under the hood, it is
- * suggested that a few measurements should be taken with the counts
- * reset in between to stamp out outliner because of these possible
- * error conditions.
- *
- * To minimize overhead, we use __this_cpu_*() in all cases except when
- * CONFIG_DEBUG_PREEMPT is defined. In this particular case, this_cpu_*()
- * will be used to avoid the appearance of unwanted BUG messages.
- */
-#ifdef CONFIG_DEBUG_PREEMPT
-#define lockevent_percpu_inc(x)		this_cpu_inc(x)
-#define lockevent_percpu_add(x, v)	this_cpu_add(x, v)
-#else
-#define lockevent_percpu_inc(x)		__this_cpu_inc(x)
-#define lockevent_percpu_add(x, v)	__this_cpu_add(x, v)
-#endif
-
-/*
- * Increment the PV qspinlock statistical counters
+ * Increment the statistical counters. use raw_cpu_inc() because of lower
+ * overhead and we don't care if we loose the occasional update.
  */
 static inline void __lockevent_inc(enum lock_events event, bool cond)
 {
 	if (cond)
-		lockevent_percpu_inc(lockevents[event]);
+		raw_cpu_inc(lockevents[event]);
 }
 
 #define lockevent_inc(ev)	  __lockevent_inc(LOCKEVENT_ ##ev, true)
···
 
 static inline void __lockevent_add(enum lock_events event, int inc)
 {
-	lockevent_percpu_add(lockevents[event], inc);
+	raw_cpu_add(lockevents[event], inc);
 }
 
 #define lockevent_add(ev, c)	__lockevent_add(LOCKEVENT_ ##ev, c)
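The switch to `raw_cpu_inc()` leans into the fact that these are statistics: a plain per-CPU increment is cheaper than a preemption-safe one, and an occasionally lost update is harmless once all per-CPU slots are summed. A userspace sketch of the summed per-CPU counter idea (the fixed CPU count is an assumption for the sketch):

```c
#include <assert.h>

#define NR_CPUS 4

/* One slot per CPU; only the sum across slots is meaningful. */
static unsigned long events[NR_CPUS];

/* Plain, non-atomic increment in the spirit of raw_cpu_inc(): in a
 * real kernel an interrupt could occasionally clobber one update,
 * which is accepted in exchange for the cheapest possible fast path. */
static void event_inc(int cpu)
{
	events[cpu]++;
}

static unsigned long event_sum(void)
{
	unsigned long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += events[cpu];
	return sum;
}
```

Because each CPU normally touches only its own slot, there is also no cache-line ping-pong on the hot path; contention only appears in the rare summing reader.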
+8 -4
kernel/locking/lock_events_list.h
···
 LOCK_EVENT(rwsem_sleep_writer)	/* # of writer sleeps		*/
 LOCK_EVENT(rwsem_wake_reader)	/* # of reader wakeups		*/
 LOCK_EVENT(rwsem_wake_writer)	/* # of writer wakeups		*/
-LOCK_EVENT(rwsem_opt_wlock)	/* # of write locks opt-spin acquired	*/
-LOCK_EVENT(rwsem_opt_fail)	/* # of failed opt-spinnings		*/
+LOCK_EVENT(rwsem_opt_rlock)	/* # of opt-acquired read locks		*/
+LOCK_EVENT(rwsem_opt_wlock)	/* # of opt-acquired write locks	*/
+LOCK_EVENT(rwsem_opt_fail)	/* # of failed optspins			*/
+LOCK_EVENT(rwsem_opt_nospin)	/* # of disabled optspins		*/
+LOCK_EVENT(rwsem_opt_norspin)	/* # of disabled reader-only optspins	*/
+LOCK_EVENT(rwsem_opt_rlock2)	/* # of opt-acquired 2ndary read locks	*/
 LOCK_EVENT(rwsem_rlock)		/* # of read locks acquired		*/
 LOCK_EVENT(rwsem_rlock_fast)	/* # of fast read locks acquired	*/
 LOCK_EVENT(rwsem_rlock_fail)	/* # of failed read lock acquisitions	*/
-LOCK_EVENT(rwsem_rtrylock)	/* # of read trylock calls		*/
+LOCK_EVENT(rwsem_rlock_handoff)	/* # of read lock handoffs		*/
 LOCK_EVENT(rwsem_wlock)		/* # of write locks acquired		*/
 LOCK_EVENT(rwsem_wlock_fail)	/* # of failed write lock acquisitions	*/
-LOCK_EVENT(rwsem_wtrylock)	/* # of write trylock calls		*/
+LOCK_EVENT(rwsem_wlock_handoff)	/* # of write lock handoffs		*/
+426 -318
kernel/locking/lockdep.c
··· 151 151 static 152 152 #endif 153 153 struct lock_class lock_classes[MAX_LOCKDEP_KEYS]; 154 + static DECLARE_BITMAP(lock_classes_in_use, MAX_LOCKDEP_KEYS); 154 155 155 156 static inline struct lock_class *hlock_class(struct held_lock *hlock) 156 157 { 157 - if (!hlock->class_idx) { 158 + unsigned int class_idx = hlock->class_idx; 159 + 160 + /* Don't re-read hlock->class_idx, can't use READ_ONCE() on bitfield */ 161 + barrier(); 162 + 163 + if (!test_bit(class_idx, lock_classes_in_use)) { 158 164 /* 159 165 * Someone passed in garbage, we give up. 160 166 */ 161 167 DEBUG_LOCKS_WARN_ON(1); 162 168 return NULL; 163 169 } 164 - return lock_classes + hlock->class_idx - 1; 170 + 171 + /* 172 + * At this point, if the passed hlock->class_idx is still garbage, 173 + * we just have to live with it 174 + */ 175 + return lock_classes + class_idx; 165 176 } 166 177 167 178 #ifdef CONFIG_LOCK_STAT ··· 370 359 return k0 | (u64)k1 << 32; 371 360 } 372 361 362 + void lockdep_init_task(struct task_struct *task) 363 + { 364 + task->lockdep_depth = 0; /* no locks held yet */ 365 + task->curr_chain_key = INITIAL_CHAIN_KEY; 366 + task->lockdep_recursion = 0; 367 + } 368 + 373 369 void lockdep_off(void) 374 370 { 375 371 current->lockdep_recursion++; ··· 437 419 return 0; 438 420 } 439 421 440 - /* 441 - * Stack-trace: tightly packed array of stack backtrace 442 - * addresses. Protected by the graph_lock. 443 - */ 444 - unsigned long nr_stack_trace_entries; 445 - static unsigned long stack_trace[MAX_STACK_TRACE_ENTRIES]; 446 - 447 422 static void print_lockdep_off(const char *bug_msg) 448 423 { 449 424 printk(KERN_DEBUG "%s\n", bug_msg); ··· 445 434 printk(KERN_DEBUG "Please attach the output of /proc/lock_stat to the bug report\n"); 446 435 #endif 447 436 } 437 + 438 + unsigned long nr_stack_trace_entries; 439 + 440 + #if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING) 441 + /* 442 + * Stack-trace: tightly packed array of stack backtrace 443 + * addresses. 
Protected by the graph_lock. 444 + */ 445 + static unsigned long stack_trace[MAX_STACK_TRACE_ENTRIES]; 448 446 449 447 static int save_trace(struct lock_trace *trace) 450 448 { ··· 477 457 478 458 return 1; 479 459 } 460 + #endif 480 461 481 462 unsigned int nr_hardirq_chains; 482 463 unsigned int nr_softirq_chains; ··· 491 470 DEFINE_PER_CPU(struct lockdep_stats, lockdep_stats); 492 471 #endif 493 472 473 + #if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING) 494 474 /* 495 475 * Locking printouts: 496 476 */ ··· 509 487 #undef LOCKDEP_STATE 510 488 [LOCK_USED] = "INITIAL USE", 511 489 }; 490 + #endif 512 491 513 492 const char * __get_key_name(struct lockdep_subclass_key *key, char *str) 514 493 { ··· 523 500 524 501 static char get_usage_char(struct lock_class *class, enum lock_usage_bit bit) 525 502 { 503 + /* 504 + * The usage character defaults to '.' (i.e., irqs disabled and not in 505 + * irq context), which is the safest usage category. 506 + */ 526 507 char c = '.'; 527 508 528 - if (class->usage_mask & lock_flag(bit + LOCK_USAGE_DIR_MASK)) 509 + /* 510 + * The order of the following usage checks matters, which will 511 + * result in the outcome character as follows: 512 + * 513 + * - '+': irq is enabled and not in irq context 514 + * - '-': in irq context and irq is disabled 515 + * - '?': in irq context and irq is enabled 516 + */ 517 + if (class->usage_mask & lock_flag(bit + LOCK_USAGE_DIR_MASK)) { 529 518 c = '+'; 530 - if (class->usage_mask & lock_flag(bit)) { 531 - c = '-'; 532 - if (class->usage_mask & lock_flag(bit + LOCK_USAGE_DIR_MASK)) 519 + if (class->usage_mask & lock_flag(bit)) 533 520 c = '?'; 534 - } 521 + } else if (class->usage_mask & lock_flag(bit)) 522 + c = '-'; 535 523 536 524 return c; 537 525 } ··· 606 572 /* 607 573 * We can be called locklessly through debug_show_all_locks() so be 608 574 * extra careful, the hlock might have been released and cleared. 
575 + * 576 + * If this indeed happens, lets pretend it does not hurt to continue 577 + * to print the lock unless the hlock class_idx does not point to a 578 + * registered class. The rationale here is: since we don't attempt 579 + * to distinguish whether we are in this situation, if it just 580 + * happened we can't count on class_idx to tell either. 609 581 */ 610 - unsigned int class_idx = hlock->class_idx; 582 + struct lock_class *lock = hlock_class(hlock); 611 583 612 - /* Don't re-read hlock->class_idx, can't use READ_ONCE() on bitfields: */ 613 - barrier(); 614 - 615 - if (!class_idx || (class_idx - 1) >= MAX_LOCKDEP_KEYS) { 584 + if (!lock) { 616 585 printk(KERN_CONT "<RELEASED>\n"); 617 586 return; 618 587 } 619 588 620 589 printk(KERN_CONT "%p", hlock->instance); 621 - print_lock_name(lock_classes + class_idx - 1); 590 + print_lock_name(lock); 622 591 printk(KERN_CONT ", at: %pS\n", (void *)hlock->acquire_ip); 623 592 } 624 593 ··· 769 732 * Huh! same key, different name? Did someone trample 770 733 * on some memory? We're most confused. 771 734 */ 772 - WARN_ON_ONCE(class->name != lock->name); 735 + WARN_ON_ONCE(class->name != lock->name && 736 + lock->key != &__lockdep_no_validate__); 773 737 return class; 774 738 } 775 739 } ··· 876 838 static bool check_lock_chain_key(struct lock_chain *chain) 877 839 { 878 840 #ifdef CONFIG_PROVE_LOCKING 879 - u64 chain_key = 0; 841 + u64 chain_key = INITIAL_CHAIN_KEY; 880 842 int i; 881 843 882 844 for (i = chain->base; i < chain->base + chain->depth; i++) 883 - chain_key = iterate_chain_key(chain_key, chain_hlocks[i] + 1); 845 + chain_key = iterate_chain_key(chain_key, chain_hlocks[i]); 884 846 /* 885 847 * The 'unsigned long long' casts avoid that a compiler warning 886 848 * is reported when building tools/lib/lockdep. 
··· 1155 1117 return NULL; 1156 1118 } 1157 1119 nr_lock_classes++; 1120 + __set_bit(class - lock_classes, lock_classes_in_use); 1158 1121 debug_atomic_inc(nr_unused_locks); 1159 1122 class->key = key; 1160 1123 class->name = lock->name; ··· 1267 1228 #define CQ_MASK (MAX_CIRCULAR_QUEUE_SIZE-1) 1268 1229 1269 1230 /* 1270 - * The circular_queue and helpers is used to implement the 1271 - * breadth-first search(BFS)algorithem, by which we can build 1272 - * the shortest path from the next lock to be acquired to the 1273 - * previous held lock if there is a circular between them. 1231 + * The circular_queue and helpers are used to implement graph 1232 + * breadth-first search (BFS) algorithm, by which we can determine 1233 + * whether there is a path from a lock to another. In deadlock checks, 1234 + * a path from the next lock to be acquired to a previous held lock 1235 + * indicates that adding the <prev> -> <next> lock dependency will 1236 + * produce a circle in the graph. Breadth-first search instead of 1237 + * depth-first search is used in order to find the shortest (circular) 1238 + * path. 1274 1239 */ 1275 1240 struct circular_queue { 1276 - unsigned long element[MAX_CIRCULAR_QUEUE_SIZE]; 1241 + struct lock_list *element[MAX_CIRCULAR_QUEUE_SIZE]; 1277 1242 unsigned int front, rear; 1278 1243 }; 1279 1244 ··· 1303 1260 return ((cq->rear + 1) & CQ_MASK) == cq->front; 1304 1261 } 1305 1262 1306 - static inline int __cq_enqueue(struct circular_queue *cq, unsigned long elem) 1263 + static inline int __cq_enqueue(struct circular_queue *cq, struct lock_list *elem) 1307 1264 { 1308 1265 if (__cq_full(cq)) 1309 1266 return -1; ··· 1313 1270 return 0; 1314 1271 } 1315 1272 1316 - static inline int __cq_dequeue(struct circular_queue *cq, unsigned long *elem) 1273 + /* 1274 + * Dequeue an element from the circular_queue, return a lock_list if 1275 + * the queue is not empty, or NULL if otherwise. 
1276 + */ 1277 + static inline struct lock_list * __cq_dequeue(struct circular_queue *cq) 1317 1278 { 1318 - if (__cq_empty(cq)) 1319 - return -1; 1279 + struct lock_list * lock; 1320 1280 1321 - *elem = cq->element[cq->front]; 1281 + if (__cq_empty(cq)) 1282 + return NULL; 1283 + 1284 + lock = cq->element[cq->front]; 1322 1285 cq->front = (cq->front + 1) & CQ_MASK; 1323 - return 0; 1286 + 1287 + return lock; 1324 1288 } 1325 1289 1326 1290 static inline unsigned int __cq_get_elem_count(struct circular_queue *cq) ··· 1372 1322 return depth; 1373 1323 } 1374 1324 1325 + /* 1326 + * Return the forward or backward dependency list. 1327 + * 1328 + * @lock: the lock_list to get its class's dependency list 1329 + * @offset: the offset to struct lock_class to determine whether it is 1330 + * locks_after or locks_before 1331 + */ 1332 + static inline struct list_head *get_dep_list(struct lock_list *lock, int offset) 1333 + { 1334 + void *lock_class = lock->class; 1335 + 1336 + return lock_class + offset; 1337 + } 1338 + 1339 + /* 1340 + * Forward- or backward-dependency search, used for both circular dependency 1341 + * checking and hardirq-unsafe/softirq-unsafe checking. 
1342 + */ 1375 1343 static int __bfs(struct lock_list *source_entry, 1376 1344 void *data, 1377 1345 int (*match)(struct lock_list *entry, void *data), 1378 1346 struct lock_list **target_entry, 1379 - int forward) 1347 + int offset) 1380 1348 { 1381 1349 struct lock_list *entry; 1350 + struct lock_list *lock; 1382 1351 struct list_head *head; 1383 1352 struct circular_queue *cq = &lock_cq; 1384 1353 int ret = 1; ··· 1408 1339 goto exit; 1409 1340 } 1410 1341 1411 - if (forward) 1412 - head = &source_entry->class->locks_after; 1413 - else 1414 - head = &source_entry->class->locks_before; 1415 - 1342 + head = get_dep_list(source_entry, offset); 1416 1343 if (list_empty(head)) 1417 1344 goto exit; 1418 1345 1419 1346 __cq_init(cq); 1420 - __cq_enqueue(cq, (unsigned long)source_entry); 1347 + __cq_enqueue(cq, source_entry); 1421 1348 1422 - while (!__cq_empty(cq)) { 1423 - struct lock_list *lock; 1424 - 1425 - __cq_dequeue(cq, (unsigned long *)&lock); 1349 + while ((lock = __cq_dequeue(cq))) { 1426 1350 1427 1351 if (!lock->class) { 1428 1352 ret = -2; 1429 1353 goto exit; 1430 1354 } 1431 1355 1432 - if (forward) 1433 - head = &lock->class->locks_after; 1434 - else 1435 - head = &lock->class->locks_before; 1356 + head = get_dep_list(lock, offset); 1436 1357 1437 1358 DEBUG_LOCKS_WARN_ON(!irqs_disabled()); 1438 1359 ··· 1436 1377 goto exit; 1437 1378 } 1438 1379 1439 - if (__cq_enqueue(cq, (unsigned long)entry)) { 1380 + if (__cq_enqueue(cq, entry)) { 1440 1381 ret = -1; 1441 1382 goto exit; 1442 1383 } ··· 1455 1396 int (*match)(struct lock_list *entry, void *data), 1456 1397 struct lock_list **target_entry) 1457 1398 { 1458 - return __bfs(src_entry, data, match, target_entry, 1); 1399 + return __bfs(src_entry, data, match, target_entry, 1400 + offsetof(struct lock_class, locks_after)); 1459 1401 1460 1402 } 1461 1403 ··· 1465 1405 int (*match)(struct lock_list *entry, void *data), 1466 1406 struct lock_list **target_entry) 1467 1407 { 1468 - return __bfs(src_entry, 
data, match, target_entry, 0); 1408 + return __bfs(src_entry, data, match, target_entry, 1409 + offsetof(struct lock_class, locks_before)); 1469 1410 1470 1411 } 1471 - 1472 - /* 1473 - * Recursive, forwards-direction lock-dependency checking, used for 1474 - * both noncyclic checking and for hardirq-unsafe/softirq-unsafe 1475 - * checking. 1476 - */ 1477 1412 1478 1413 static void print_lock_trace(struct lock_trace *trace, unsigned int spaces) 1479 1414 { ··· 1481 1426 * Print a dependency chain entry (this is only done when a deadlock 1482 1427 * has been detected): 1483 1428 */ 1484 - static noinline int 1429 + static noinline void 1485 1430 print_circular_bug_entry(struct lock_list *target, int depth) 1486 1431 { 1487 1432 if (debug_locks_silent) 1488 - return 0; 1433 + return; 1489 1434 printk("\n-> #%u", depth); 1490 1435 print_lock_name(target->class); 1491 1436 printk(KERN_CONT ":\n"); 1492 1437 print_lock_trace(&target->trace, 6); 1493 - return 0; 1494 1438 } 1495 1439 1496 1440 static void ··· 1546 1492 * When a circular dependency is detected, print the 1547 1493 * header first: 1548 1494 */ 1549 - static noinline int 1495 + static noinline void 1550 1496 print_circular_bug_header(struct lock_list *entry, unsigned int depth, 1551 1497 struct held_lock *check_src, 1552 1498 struct held_lock *check_tgt) ··· 1554 1500 struct task_struct *curr = current; 1555 1501 1556 1502 if (debug_locks_silent) 1557 - return 0; 1503 + return; 1558 1504 1559 1505 pr_warn("\n"); 1560 1506 pr_warn("======================================================\n"); ··· 1572 1518 pr_warn("\nthe existing dependency chain (in reverse order) is:\n"); 1573 1519 1574 1520 print_circular_bug_entry(entry, depth); 1575 - 1576 - return 0; 1577 1521 } 1578 1522 1579 1523 static inline int class_equal(struct lock_list *entry, void *data) ··· 1579 1527 return entry->class == data; 1580 1528 } 1581 1529 1582 - static noinline int print_circular_bug(struct lock_list *this, 1583 - struct lock_list 
*target, 1584 - struct held_lock *check_src, 1585 - struct held_lock *check_tgt) 1530 + static noinline void print_circular_bug(struct lock_list *this, 1531 + struct lock_list *target, 1532 + struct held_lock *check_src, 1533 + struct held_lock *check_tgt) 1586 1534 { 1587 1535 struct task_struct *curr = current; 1588 1536 struct lock_list *parent; ··· 1590 1538 int depth; 1591 1539 1592 1540 if (!debug_locks_off_graph_unlock() || debug_locks_silent) 1593 - return 0; 1541 + return; 1594 1542 1595 1543 if (!save_trace(&this->trace)) 1596 - return 0; 1544 + return; 1597 1545 1598 1546 depth = get_lock_depth(target); 1599 1547 ··· 1615 1563 1616 1564 printk("\nstack backtrace:\n"); 1617 1565 dump_stack(); 1618 - 1619 - return 0; 1620 1566 } 1621 1567 1622 - static noinline int print_bfs_bug(int ret) 1568 + static noinline void print_bfs_bug(int ret) 1623 1569 { 1624 1570 if (!debug_locks_off_graph_unlock()) 1625 - return 0; 1571 + return; 1626 1572 1627 1573 /* 1628 1574 * Breadth-first-search failed, graph got corrupted? 1629 1575 */ 1630 1576 WARN(1, "lockdep bfs error:%d\n", ret); 1631 - 1632 - return 0; 1633 1577 } 1634 1578 1635 1579 static int noop_count(struct lock_list *entry, void *data) ··· 1688 1640 } 1689 1641 1690 1642 /* 1691 - * Prove that the dependency graph starting at <entry> can not 1692 - * lead to <target>. Print an error and return 0 if it does. 1643 + * Check that the dependency graph starting at <src> can lead to 1644 + * <target> or not. Print an error and return 0 if it does. 
1693 1645 */ 1694 1646 static noinline int 1695 - check_noncircular(struct lock_list *root, struct lock_class *target, 1696 - struct lock_list **target_entry) 1647 + check_path(struct lock_class *target, struct lock_list *src_entry, 1648 + struct lock_list **target_entry) 1697 1649 { 1698 - int result; 1650 + int ret; 1651 + 1652 + ret = __bfs_forwards(src_entry, (void *)target, class_equal, 1653 + target_entry); 1654 + 1655 + if (unlikely(ret < 0)) 1656 + print_bfs_bug(ret); 1657 + 1658 + return ret; 1659 + } 1660 + 1661 + /* 1662 + * Prove that the dependency graph starting at <src> can not 1663 + * lead to <target>. If it can, there is a circle when adding 1664 + * <target> -> <src> dependency. 1665 + * 1666 + * Print an error and return 0 if it does. 1667 + */ 1668 + static noinline int 1669 + check_noncircular(struct held_lock *src, struct held_lock *target, 1670 + struct lock_trace *trace) 1671 + { 1672 + int ret; 1673 + struct lock_list *uninitialized_var(target_entry); 1674 + struct lock_list src_entry = { 1675 + .class = hlock_class(src), 1676 + .parent = NULL, 1677 + }; 1699 1678 1700 1679 debug_atomic_inc(nr_cyclic_checks); 1701 1680 1702 - result = __bfs_forwards(root, target, class_equal, target_entry); 1681 + ret = check_path(hlock_class(target), &src_entry, &target_entry); 1703 1682 1704 - return result; 1683 + if (unlikely(!ret)) { 1684 + if (!trace->nr_entries) { 1685 + /* 1686 + * If save_trace fails here, the printing might 1687 + * trigger a WARN but because of the !nr_entries it 1688 + * should not do bad things. 1689 + */ 1690 + save_trace(trace); 1691 + } 1692 + 1693 + print_circular_bug(&src_entry, target_entry, src, target); 1694 + } 1695 + 1696 + return ret; 1705 1697 } 1706 1698 1699 + #ifdef CONFIG_LOCKDEP_SMALL 1700 + /* 1701 + * Check that the dependency graph starting at <src> can lead to 1702 + * <target> or not. If it can, <src> -> <target> dependency is already 1703 + * in the graph. 
+ *
+ * Print an error and return 2 if it does or 1 if it does not.
+ */
 static noinline int
-check_redundant(struct lock_list *root, struct lock_class *target,
-		struct lock_list **target_entry)
+check_redundant(struct held_lock *src, struct held_lock *target)
 {
-	int result;
+	int ret;
+	struct lock_list *uninitialized_var(target_entry);
+	struct lock_list src_entry = {
+		.class = hlock_class(src),
+		.parent = NULL,
+	};
 
 	debug_atomic_inc(nr_redundant_checks);
 
-	result = __bfs_forwards(root, target, class_equal, target_entry);
+	ret = check_path(hlock_class(target), &src_entry, &target_entry);
 
-	return result;
+	if (!ret) {
+		debug_atomic_inc(nr_redundant);
+		ret = 2;
+	} else if (ret < 0)
+		ret = 0;
+
+	return ret;
 }
+#endif
 
-#if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING)
+#ifdef CONFIG_TRACE_IRQFLAGS
 
 static inline int usage_accumulate(struct lock_list *entry, void *mask)
 {
···
  */
 static void __used
 print_shortest_lock_dependencies(struct lock_list *leaf,
-						struct lock_list *root)
+				 struct lock_list *root)
 {
 	struct lock_list *entry = leaf;
 	int depth;
···
 		entry = get_lock_parent(entry);
 		depth--;
 	} while (entry && (depth >= 0));
-
-	return;
 }
 
 static void
···
 	printk("\n *** DEADLOCK ***\n\n");
 }
 
-static int
+static void
 print_bad_irq_dependency(struct task_struct *curr,
			 struct lock_list *prev_root,
			 struct lock_list *next_root,
···
			 const char *irqclass)
 {
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
-		return 0;
+		return;
 
 	pr_warn("\n");
 	pr_warn("=====================================================\n");
···
 	pr_warn("\nthe dependencies between %s-irq-safe lock and the holding lock:\n", irqclass);
 	if (!save_trace(&prev_root->trace))
-		return 0;
+		return;
 	print_shortest_lock_dependencies(backwards_entry, prev_root);
 
 	pr_warn("\nthe dependencies between the lock to be acquired");
 	pr_warn(" and %s-irq-unsafe lock:\n", irqclass);
 	if (!save_trace(&next_root->trace))
-		return 0;
+		return;
 	print_shortest_lock_dependencies(forwards_entry, next_root);
 
 	pr_warn("\nstack backtrace:\n");
 	dump_stack();
-
-	return 0;
 }
 
 static const char *state_names[] = {
···
 	this.class = hlock_class(prev);
 
 	ret = __bfs_backwards(&this, &usage_mask, usage_accumulate, NULL);
-	if (ret < 0)
-		return print_bfs_bug(ret);
+	if (ret < 0) {
+		print_bfs_bug(ret);
+		return 0;
+	}
 
 	usage_mask &= LOCKF_USED_IN_IRQ_ALL;
 	if (!usage_mask)
···
 	that.class = hlock_class(next);
 
 	ret = find_usage_forwards(&that, forward_mask, &target_entry1);
-	if (ret < 0)
-		return print_bfs_bug(ret);
+	if (ret < 0) {
+		print_bfs_bug(ret);
+		return 0;
+	}
 	if (ret == 1)
 		return ret;
···
 	backward_mask = original_mask(target_entry1->class->usage_mask);
 
 	ret = find_usage_backwards(&this, backward_mask, &target_entry);
-	if (ret < 0)
-		return print_bfs_bug(ret);
+	if (ret < 0) {
+		print_bfs_bug(ret);
+		return 0;
+	}
 	if (DEBUG_LOCKS_WARN_ON(ret == 1))
 		return 1;
···
 	if (DEBUG_LOCKS_WARN_ON(ret == -1))
 		return 1;
 
-	return print_bad_irq_dependency(curr, &this, &that,
-					target_entry, target_entry1,
-					prev, next,
-					backward_bit, forward_bit,
-					state_name(backward_bit));
+	print_bad_irq_dependency(curr, &this, &that,
+				 target_entry, target_entry1,
+				 prev, next,
+				 backward_bit, forward_bit,
+				 state_name(backward_bit));
+
+	return 0;
 }
 
 static void inc_chains(void)
···
 	nr_process_chains++;
 }
 
-#endif
+#endif /* CONFIG_TRACE_IRQFLAGS */
 
 static void
-print_deadlock_scenario(struct held_lock *nxt,
-			struct held_lock *prv)
+print_deadlock_scenario(struct held_lock *nxt, struct held_lock *prv)
 {
 	struct lock_class *next = hlock_class(nxt);
 	struct lock_class *prev = hlock_class(prv);
···
 	printk(" May be due to missing lock nesting notation\n\n");
 }
 
-static int
+static void
 print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,
		   struct held_lock *next)
 {
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
-		return 0;
+		return;
 
 	pr_warn("\n");
 	pr_warn("============================================\n");
···
 
 	pr_warn("\nstack backtrace:\n");
 	dump_stack();
-
-	return 0;
 }
 
 /*
···
  * Returns: 0 on deadlock detected, 1 on OK, 2 on recursive read
  */
 static int
-check_deadlock(struct task_struct *curr, struct held_lock *next,
-	       struct lockdep_map *next_instance, int read)
+check_deadlock(struct task_struct *curr, struct held_lock *next)
 {
 	struct held_lock *prev;
 	struct held_lock *nest = NULL;
···
		 * Allow read-after-read recursion of the same
		 * lock class (i.e.
 read_lock(lock)+read_lock(lock)):
		 */
-		if ((read == 2) && prev->read)
+		if ((next->read == 2) && prev->read)
			return 2;
 
		/*
···
		if (nest)
			return 2;
 
-		return print_deadlock_bug(curr, prev, next);
+		print_deadlock_bug(curr, prev, next);
+		return 0;
 	}
 	return 1;
 }
 
 /*
  * There was a chain-cache miss, and we are about to add a new dependency
- * to a previous lock. We recursively validate the following rules:
+ * to a previous lock. We validate the following rules:
  *
  * - would the adding of the <prev> -> <next> dependency create a
  *   circular dependency in the graph? [== circular deadlock]
···
 check_prev_add(struct task_struct *curr, struct held_lock *prev,
	       struct held_lock *next, int distance, struct lock_trace *trace)
 {
-	struct lock_list *uninitialized_var(target_entry);
 	struct lock_list *entry;
-	struct lock_list this;
 	int ret;
 
 	if (!hlock_class(prev)->key || !hlock_class(next)->key) {
···
	/*
	 * Prove that the new <prev> -> <next> dependency would not
	 * create a circular dependency in the graph. (We do this by
-	 * forward-recursing into the graph starting at <next>, and
-	 * checking whether we can reach <prev>.)
+	 * a breadth-first search into the graph starting at <next>,
+	 * and check whether we can reach <prev>.)
	 *
-	 * We are using global variables to control the recursion, to
-	 * keep the stackframe size of the recursive functions low:
+	 * The search is limited by the size of the circular queue (i.e.,
+	 * MAX_CIRCULAR_QUEUE_SIZE) which keeps track of a breadth of nodes
+	 * in the graph whose neighbours are to be checked.
	 */
-	this.class = hlock_class(next);
-	this.parent = NULL;
-	ret = check_noncircular(&this, hlock_class(prev), &target_entry);
-	if (unlikely(!ret)) {
-		if (!trace->nr_entries) {
-			/*
-			 * If save_trace fails here, the printing might
-			 * trigger a WARN but because of the !nr_entries it
-			 * should not do bad things.
-			 */
-			save_trace(trace);
-		}
-		return print_circular_bug(&this, target_entry, next, prev);
-	}
-	else if (unlikely(ret < 0))
-		return print_bfs_bug(ret);
+	ret = check_noncircular(next, prev, trace);
+	if (unlikely(ret <= 0))
+		return 0;
 
 	if (!check_irq_usage(curr, prev, next))
 		return 0;
···
 		}
 	}
 
+#ifdef CONFIG_LOCKDEP_SMALL
	/*
	 * Is the <prev> -> <next> link redundant?
	 */
-	this.class = hlock_class(prev);
-	this.parent = NULL;
-	ret = check_redundant(&this, hlock_class(next), &target_entry);
-	if (!ret) {
-		debug_atomic_inc(nr_redundant);
-		return 2;
-	}
-	if (ret < 0)
-		return print_bfs_bug(ret);
-
+	ret = check_redundant(prev, next);
+	if (ret != 1)
+		return ret;
+#endif
 
 	if (!trace->nr_entries && !save_trace(trace))
 		return 0;
···
 print_chain_keys_held_locks(struct task_struct *curr, struct held_lock *hlock_next)
 {
 	struct held_lock *hlock;
-	u64 chain_key = 0;
+	u64 chain_key = INITIAL_CHAIN_KEY;
 	int depth = curr->lockdep_depth;
-	int i;
+	int i = get_first_held_lock(curr, hlock_next);
 
-	printk("depth: %u\n", depth + 1);
-	for (i = get_first_held_lock(curr, hlock_next); i < depth; i++) {
+	printk("depth: %u (irq_context %u)\n", depth - i + 1,
+		hlock_next->irq_context);
+	for (; i < depth; i++) {
 		hlock = curr->held_locks + i;
 		chain_key =
 print_chain_key_iteration(hlock->class_idx, chain_key);
···
 static void print_chain_keys_chain(struct lock_chain *chain)
 {
 	int i;
-	u64 chain_key = 0;
+	u64 chain_key = INITIAL_CHAIN_KEY;
 	int class_id;
 
 	printk("depth: %u\n", chain->depth);
 	for (i = 0; i < chain->depth; i++) {
 		class_id = chain_hlocks[chain->base + i];
-		chain_key = print_chain_key_iteration(class_id + 1, chain_key);
+		chain_key = print_chain_key_iteration(class_id, chain_key);
 
 		print_lock_name(lock_classes + class_id);
 		printk("\n");
···
 }
 
 	for (j = 0; j < chain->depth - 1; j++, i++) {
-		id = curr->held_locks[i].class_idx - 1;
+		id = curr->held_locks[i].class_idx;
 
 		if (DEBUG_LOCKS_WARN_ON(chain_hlocks[chain->base + j] != id)) {
 			print_collision(curr, hlock, chain);
···
 	if (likely(nr_chain_hlocks + chain->depth <= MAX_LOCKDEP_CHAIN_HLOCKS)) {
 		chain->base = nr_chain_hlocks;
 		for (j = 0; j < chain->depth - 1; j++, i++) {
-			int lock_id = curr->held_locks[i].class_idx - 1;
+			int lock_id = curr->held_locks[i].class_idx;
 			chain_hlocks[chain->base + j] = lock_id;
 		}
 		chain_hlocks[chain->base + j] = class - lock_classes;
···
 	return 1;
 }
 
-static int validate_chain(struct task_struct *curr, struct lockdep_map *lock,
-			  struct held_lock *hlock, int chain_head, u64 chain_key)
+static int validate_chain(struct task_struct *curr,
+			  struct held_lock *hlock,
+			  int chain_head, u64 chain_key)
 {
 	/*
	 * Trylock needs to maintain the stack of held locks, but it
···
	 * - is softirq-safe, if this lock is hardirq-unsafe
	 *
	 * And check whether the new lock's dependency graph
-	 * could lead back to the previous lock.
+	 * could lead back to the previous lock:
	 *
-	 * any of these scenarios could lead to a deadlock. If
-	 * All validations
+	 * - within the current held-lock stack
+	 * - across our accumulated lock dependency records
+	 *
+	 * any of these scenarios could lead to a deadlock.
	 */
-	int ret = check_deadlock(curr, hlock, lock, hlock->read);
+	/*
+	 * The simple case: does the current hold the same lock
+	 * already?
+	 */
+	int ret = check_deadlock(curr, hlock);
 
 	if (!ret)
 		return 0;
···
 }
 #else
 static inline int validate_chain(struct task_struct *curr,
-				 struct lockdep_map *lock, struct held_lock *hlock,
-				 int chain_head, u64 chain_key)
+				 struct held_lock *hlock,
+				 int chain_head, u64 chain_key)
 {
 	return 1;
 }
-
-static void print_lock_trace(struct lock_trace *trace, unsigned int spaces)
-{
-}
-#endif
+#endif /* CONFIG_PROVE_LOCKING */
 
 /*
  * We are building curr_chain_key incrementally, so double-check
···
 #ifdef CONFIG_DEBUG_LOCKDEP
 	struct held_lock *hlock, *prev_hlock = NULL;
 	unsigned int i;
-	u64 chain_key = 0;
+	u64 chain_key = INITIAL_CHAIN_KEY;
 
 	for (i = 0; i < curr->lockdep_depth; i++) {
 		hlock = curr->held_locks + i;
···
 				(unsigned long long)hlock->prev_chain_key);
 			return;
 		}
+
		/*
-		 * Whoops ran out of static storage again?
+		 * hlock->class_idx can't go beyond MAX_LOCKDEP_KEYS, but is
+		 * it registered lock class index?
2946 2855 */ 2947 - if (DEBUG_LOCKS_WARN_ON(hlock->class_idx > MAX_LOCKDEP_KEYS)) 2856 + if (DEBUG_LOCKS_WARN_ON(!test_bit(hlock->class_idx, lock_classes_in_use))) 2948 2857 return; 2949 2858 2950 2859 if (prev_hlock && (prev_hlock->irq_context != 2951 2860 hlock->irq_context)) 2952 - chain_key = 0; 2861 + chain_key = INITIAL_CHAIN_KEY; 2953 2862 chain_key = iterate_chain_key(chain_key, hlock->class_idx); 2954 2863 prev_hlock = hlock; 2955 2864 } ··· 2969 2874 #endif 2970 2875 } 2971 2876 2877 + #if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING) 2972 2878 static int mark_lock(struct task_struct *curr, struct held_lock *this, 2973 2879 enum lock_usage_bit new_bit); 2974 2880 2975 - #if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING) 2976 - 2977 - 2978 - static void 2979 - print_usage_bug_scenario(struct held_lock *lock) 2881 + static void print_usage_bug_scenario(struct held_lock *lock) 2980 2882 { 2981 2883 struct lock_class *class = hlock_class(lock); 2982 2884 ··· 2990 2898 printk("\n *** DEADLOCK ***\n\n"); 2991 2899 } 2992 2900 2993 - static int 2901 + static void 2994 2902 print_usage_bug(struct task_struct *curr, struct held_lock *this, 2995 2903 enum lock_usage_bit prev_bit, enum lock_usage_bit new_bit) 2996 2904 { 2997 2905 if (!debug_locks_off_graph_unlock() || debug_locks_silent) 2998 - return 0; 2906 + return; 2999 2907 3000 2908 pr_warn("\n"); 3001 2909 pr_warn("================================\n"); ··· 3025 2933 3026 2934 pr_warn("\nstack backtrace:\n"); 3027 2935 dump_stack(); 3028 - 3029 - return 0; 3030 2936 } 3031 2937 3032 2938 /* ··· 3034 2944 valid_state(struct task_struct *curr, struct held_lock *this, 3035 2945 enum lock_usage_bit new_bit, enum lock_usage_bit bad_bit) 3036 2946 { 3037 - if (unlikely(hlock_class(this)->usage_mask & (1 << bad_bit))) 3038 - return print_usage_bug(curr, this, bad_bit, new_bit); 2947 + if (unlikely(hlock_class(this)->usage_mask & (1 << bad_bit))) { 2948 + print_usage_bug(curr, this, 
 bad_bit, new_bit);
+		return 0;
+	}
 	return 1;
 }
 
 /*
  * print irq inversion bug:
  */
-static int
+static void
 print_irq_inversion_bug(struct task_struct *curr,
			struct lock_list *root, struct lock_list *other,
			struct held_lock *this, int forwards,
···
 	int depth;
 
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
-		return 0;
+		return;
 
 	pr_warn("\n");
 	pr_warn("========================================================\n");
···
 
 	pr_warn("\nthe shortest dependencies between 2nd lock and 1st lock:\n");
 	if (!save_trace(&root->trace))
-		return 0;
+		return;
 	print_shortest_lock_dependencies(other, root);
 
 	pr_warn("\nstack backtrace:\n");
 	dump_stack();
-
-	return 0;
 }
 
 /*
···
 	root.parent = NULL;
 	root.class = hlock_class(this);
 	ret = find_usage_forwards(&root, lock_flag(bit), &target_entry);
-	if (ret < 0)
-		return print_bfs_bug(ret);
+	if (ret < 0) {
+		print_bfs_bug(ret);
+		return 0;
+	}
 	if (ret == 1)
 		return ret;
 
-	return print_irq_inversion_bug(curr, &root, target_entry,
-					this, 1, irqclass);
+	print_irq_inversion_bug(curr, &root, target_entry,
+				this, 1, irqclass);
+	return 0;
 }
 
 /*
···
 	root.parent = NULL;
 	root.class = hlock_class(this);
 	ret = find_usage_backwards(&root, lock_flag(bit), &target_entry);
-	if (ret < 0)
-		return print_bfs_bug(ret);
+	if (ret < 0) {
+		print_bfs_bug(ret);
+		return 0;
+	}
 	if (ret == 1)
 		return ret;
 
-	return print_irq_inversion_bug(curr, &root, target_entry,
-					this, 0, irqclass);
+	print_irq_inversion_bug(curr, &root, target_entry,
+				this, 0, irqclass);
+	return 0;
 }
 
 void print_irqtrace_events(struct task_struct *curr)
···
	 * Validate that the lock dependencies don't have conflicting usage
	 * states.
	 */
-	if ((!read || !dir || STRICT_READ_CHECKS) &&
+	if ((!read || STRICT_READ_CHECKS) &&
	    !usage(curr, this, excl_bit, state_name(new_bit & ~LOCK_USAGE_READ_MASK)))
		return 0;
···
 		debug_atomic_inc(redundant_softirqs_off);
 }
 
-static int mark_irqflags(struct task_struct *curr, struct held_lock *hlock)
+static int
+mark_usage(struct task_struct *curr, struct held_lock *hlock, int check)
 {
+	if (!check)
+		goto lock_used;
+
 	/*
	 * If non-trylock use in a hardirq or softirq context, then
	 * mark the lock as used in these contexts:
···
 		}
 	}
 
+lock_used:
+	/* mark it as used: */
+	if (!mark_lock(curr, hlock, LOCK_USED))
+		return 0;
+
 	return 1;
 }
 
···
 	return 0;
 }
 
-#else /* defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING) */
-
-static inline
-int mark_lock_irq(struct task_struct *curr, struct held_lock *this,
-		  enum lock_usage_bit new_bit)
-{
-	WARN_ON(1); /* Impossible innit?
 when we don't have TRACE_IRQFLAG */
-	return 1;
-}
-
-static inline int mark_irqflags(struct task_struct *curr,
-				struct held_lock *hlock)
-{
-	return 1;
-}
-
-static inline unsigned int task_irq_context(struct task_struct *task)
-{
-	return 0;
-}
-
-static inline int separate_irq_context(struct task_struct *curr,
-				       struct held_lock *hlock)
-{
-	return 0;
-}
-
-#endif /* defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING) */
-
 /*
  * Mark a lock with a usage bit, and validate the state transition:
  */
···
	     enum lock_usage_bit new_bit)
 {
 	unsigned int new_mask = 1 << new_bit, ret = 1;
+
+	if (new_bit >= LOCK_USAGE_STATES) {
+		DEBUG_LOCKS_WARN_ON(1);
+		return 0;
+	}
 
 	/*
	 * If already set then do not dirty the cacheline,
···
 		return 0;
 
 	switch (new_bit) {
-#define LOCKDEP_STATE(__STATE) \
-	case LOCK_USED_IN_##__STATE:	\
-	case LOCK_USED_IN_##__STATE##_READ: \
-	case LOCK_ENABLED_##__STATE: \
-	case LOCK_ENABLED_##__STATE##_READ:
-#include "lockdep_states.h"
-#undef LOCKDEP_STATE
-		ret = mark_lock_irq(curr, this, new_bit);
-		if (!ret)
-			return 0;
-		break;
 	case LOCK_USED:
 		debug_atomic_dec(nr_unused_locks);
 		break;
 	default:
-		if (!debug_locks_off_graph_unlock())
+		ret = mark_lock_irq(curr, this, new_bit);
+		if (!ret)
 			return 0;
-		WARN_ON(1);
-		return 0;
 	}
 
 	graph_unlock();
···
 
 	return ret;
 }
+
+#else /* defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING) */
+
+static inline int
+mark_usage(struct task_struct *curr, struct held_lock *hlock, int check)
+{
+	return 1;
+}
+
+static inline unsigned int task_irq_context(struct task_struct *task)
+{
+	return 0;
+}
+
+static inline int separate_irq_context(struct task_struct *curr,
+				       struct held_lock *hlock)
+{
+	return 0;
+}
+
+#endif /* defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING) */
 
 /*
  * Initialize a lock instance's lock-class mapping info:
···
 struct lock_class_key __lockdep_no_validate__;
 EXPORT_SYMBOL_GPL(__lockdep_no_validate__);
 
-static int
+static void
 print_lock_nested_lock_not_held(struct task_struct *curr,
				struct held_lock *hlock,
				unsigned long ip)
 {
 	if (!debug_locks_off())
-		return 0;
+		return;
 	if (debug_locks_silent)
-		return 0;
+		return;
 
 	pr_warn("\n");
 	pr_warn("==================================\n");
···
 
 	pr_warn("\nstack backtrace:\n");
 	dump_stack();
-
-	return 0;
 }
 
 static int __lock_is_held(const struct lockdep_map *lock, int read);
···
 	if (DEBUG_LOCKS_WARN_ON(depth >= MAX_LOCK_DEPTH))
 		return 0;
 
-	class_idx = class - lock_classes + 1;
+	class_idx = class - lock_classes;
 
 	if (depth) {
 		hlock = curr->held_locks + depth - 1;
 		if (hlock->class_idx == class_idx && nest_lock) {
-			if (hlock->references) {
-				/*
-				 * Check: unsigned int references:12, overflow.
3797 - */ 3798 - if (DEBUG_LOCKS_WARN_ON(hlock->references == (1 << 12)-1)) 3799 - return 0; 3706 + if (!references) 3707 + references++; 3800 3708 3709 + if (!hlock->references) 3801 3710 hlock->references++; 3802 - } else { 3803 - hlock->references = 2; 3804 - } 3805 3711 3806 - return 1; 3712 + hlock->references += references; 3713 + 3714 + /* Overflow */ 3715 + if (DEBUG_LOCKS_WARN_ON(hlock->references < references)) 3716 + return 0; 3717 + 3718 + return 2; 3807 3719 } 3808 3720 } 3809 3721 ··· 3830 3742 #endif 3831 3743 hlock->pin_count = pin_count; 3832 3744 3833 - if (check && !mark_irqflags(curr, hlock)) 3834 - return 0; 3835 - 3836 - /* mark it as used: */ 3837 - if (!mark_lock(curr, hlock, LOCK_USED)) 3745 + /* Initialize the lock usage bit */ 3746 + if (!mark_usage(curr, hlock, check)) 3838 3747 return 0; 3839 3748 3840 3749 /* ··· 3845 3760 * the hash, not class->key. 3846 3761 */ 3847 3762 /* 3848 - * Whoops, we did it again.. ran straight out of our static allocation. 3763 + * Whoops, we did it again.. class_idx is invalid. 3849 3764 */ 3850 - if (DEBUG_LOCKS_WARN_ON(class_idx > MAX_LOCKDEP_KEYS)) 3765 + if (DEBUG_LOCKS_WARN_ON(!test_bit(class_idx, lock_classes_in_use))) 3851 3766 return 0; 3852 3767 3853 3768 chain_key = curr->curr_chain_key; ··· 3855 3770 /* 3856 3771 * How can we have a chain hash when we ain't got no keys?! 
3857 3772 */ 3858 - if (DEBUG_LOCKS_WARN_ON(chain_key != 0)) 3773 + if (DEBUG_LOCKS_WARN_ON(chain_key != INITIAL_CHAIN_KEY)) 3859 3774 return 0; 3860 3775 chain_head = 1; 3861 3776 } 3862 3777 3863 3778 hlock->prev_chain_key = chain_key; 3864 3779 if (separate_irq_context(curr, hlock)) { 3865 - chain_key = 0; 3780 + chain_key = INITIAL_CHAIN_KEY; 3866 3781 chain_head = 1; 3867 3782 } 3868 3783 chain_key = iterate_chain_key(chain_key, class_idx); 3869 3784 3870 - if (nest_lock && !__lock_is_held(nest_lock, -1)) 3871 - return print_lock_nested_lock_not_held(curr, hlock, ip); 3785 + if (nest_lock && !__lock_is_held(nest_lock, -1)) { 3786 + print_lock_nested_lock_not_held(curr, hlock, ip); 3787 + return 0; 3788 + } 3872 3789 3873 3790 if (!debug_locks_silent) { 3874 3791 WARN_ON_ONCE(depth && !hlock_class(hlock - 1)->key); 3875 3792 WARN_ON_ONCE(!hlock_class(hlock)->key); 3876 3793 } 3877 3794 3878 - if (!validate_chain(curr, lock, hlock, chain_head, chain_key)) 3795 + if (!validate_chain(curr, hlock, chain_head, chain_key)) 3879 3796 return 0; 3880 3797 3881 3798 curr->curr_chain_key = chain_key; ··· 3906 3819 return 1; 3907 3820 } 3908 3821 3909 - static int 3910 - print_unlock_imbalance_bug(struct task_struct *curr, struct lockdep_map *lock, 3911 - unsigned long ip) 3822 + static void print_unlock_imbalance_bug(struct task_struct *curr, 3823 + struct lockdep_map *lock, 3824 + unsigned long ip) 3912 3825 { 3913 3826 if (!debug_locks_off()) 3914 - return 0; 3827 + return; 3915 3828 if (debug_locks_silent) 3916 - return 0; 3829 + return; 3917 3830 3918 3831 pr_warn("\n"); 3919 3832 pr_warn("=====================================\n"); ··· 3931 3844 3932 3845 pr_warn("\nstack backtrace:\n"); 3933 3846 dump_stack(); 3934 - 3935 - return 0; 3936 3847 } 3937 3848 3938 3849 static int match_held_lock(const struct held_lock *hlock, ··· 3962 3877 if (DEBUG_LOCKS_WARN_ON(!hlock->nest_lock)) 3963 3878 return 0; 3964 3879 3965 - if (hlock->class_idx == class - lock_classes + 1) 
+	if (hlock->class_idx == class - lock_classes)
 		return 1;
 	}
···
 }
 
 static int reacquire_held_locks(struct task_struct *curr, unsigned int depth,
-				int idx)
+				int idx, unsigned int *merged)
 {
 	struct held_lock *hlock;
+	int first_idx = idx;
 
 	if (DEBUG_LOCKS_WARN_ON(!irqs_disabled()))
 		return 0;
 
 	for (hlock = curr->held_locks + idx; idx < depth; idx++, hlock++) {
-		if (!__lock_acquire(hlock->instance,
+		switch (__lock_acquire(hlock->instance,
				    hlock_class(hlock)->subclass,
				    hlock->trylock,
				    hlock->read, hlock->check,
				    hlock->hardirqs_off,
				    hlock->nest_lock, hlock->acquire_ip,
-				    hlock->references, hlock->pin_count))
+				    hlock->references, hlock->pin_count)) {
+		case 0:
 			return 1;
+		case 1:
+			break;
+		case 2:
+			*merged += (idx == first_idx);
+			break;
+		default:
+			WARN_ON(1);
+			return 0;
+		}
 	}
 	return 0;
 }
···
		 unsigned long ip)
 {
 	struct task_struct *curr = current;
+	unsigned int depth, merged = 0;
 	struct held_lock *hlock;
 	struct lock_class *class;
-	unsigned int depth;
 	int i;
 
 	if (unlikely(!debug_locks))
···
 		return 0;
 
 	hlock = find_held_lock(curr, lock, depth, &i);
-	if (!hlock)
-		return print_unlock_imbalance_bug(curr, lock, ip);
+	if (!hlock) {
+		print_unlock_imbalance_bug(curr, lock, ip);
+		return 0;
+	}
 
 	lockdep_init_map(lock, name, key, 0);
 	class = register_lock_class(lock, subclass, 0);
-	hlock->class_idx = class - lock_classes + 1;
+	hlock->class_idx = class - lock_classes;
 
 	curr->lockdep_depth = i;
 	curr->curr_chain_key = hlock->prev_chain_key;
 
-	if (reacquire_held_locks(curr, depth, i))
+	if (reacquire_held_locks(curr, depth, i, &merged))
 		return 0;
 
 	/*
	 * I took it apart and put it back together again, except now I have
	 * these 'spare' parts.. where shall I put them.
	 */
-	if (DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth))
+	if (DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth - merged))
 		return 0;
 	return 1;
 }
···
 static int __lock_downgrade(struct lockdep_map *lock, unsigned long ip)
 {
 	struct task_struct *curr = current;
+	unsigned int depth, merged = 0;
 	struct held_lock *hlock;
-	unsigned int depth;
 	int i;
 
 	if (unlikely(!debug_locks))
···
 		return 0;
 
 	hlock = find_held_lock(curr, lock, depth, &i);
-	if (!hlock)
-		return print_unlock_imbalance_bug(curr, lock, ip);
+	if (!hlock) {
+		print_unlock_imbalance_bug(curr, lock, ip);
+		return 0;
+	}
 
 	curr->lockdep_depth = i;
 	curr->curr_chain_key = hlock->prev_chain_key;
···
 	hlock->read = 1;
 	hlock->acquire_ip = ip;
 
-	if (reacquire_held_locks(curr, depth, i))
+	if (reacquire_held_locks(curr, depth, i, &merged))
+		return 0;
+
+	/* Merging can't happen with unchanged classes.. */
+	if (DEBUG_LOCKS_WARN_ON(merged))
 		return 0;
 
 	/*
···
	 */
 	if (DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth))
 		return 0;
+
 	return 1;
 }
···
  * @nested is an hysterical artifact, needs a tree wide cleanup.
  */
 static int
-__lock_release(struct lockdep_map *lock, int nested, unsigned long ip)
+__lock_release(struct lockdep_map *lock, unsigned long ip)
 {
 	struct task_struct *curr = current;
+	unsigned int depth, merged = 1;
 	struct held_lock *hlock;
-	unsigned int depth;
 	int i;
 
 	if (unlikely(!debug_locks))
···
	/*
	 * So we're all set to release this lock.. wait what lock? We don't
	 * own any locks, you've been drinking again?
	 */
-	if (DEBUG_LOCKS_WARN_ON(depth <= 0))
-		return print_unlock_imbalance_bug(curr, lock, ip);
+	if (depth <= 0) {
+		print_unlock_imbalance_bug(curr, lock, ip);
+		return 0;
+	}
 
	/*
	 * Check whether the lock exists in the current stack
	 * of held locks:
	 */
 	hlock = find_held_lock(curr, lock, depth, &i);
-	if (!hlock)
-		return print_unlock_imbalance_bug(curr, lock, ip);
+	if (!hlock) {
+		print_unlock_imbalance_bug(curr, lock, ip);
+		return 0;
+	}
 
 	if (hlock->instance == lock)
 		lock_release_holdtime(hlock);
···
 	if (i == depth-1)
 		return 1;
 
-	if (reacquire_held_locks(curr, depth, i + 1))
+	if (reacquire_held_locks(curr, depth, i + 1, &merged))
 		return 0;
 
 	/*
	 * We had N bottles of beer on the wall, we drank one, but now
	 * there's not N-1 bottles of beer left on the wall...
+	 * Pouring two of the bottles together is acceptable.
4212 4104 */ 4213 - DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth-1); 4105 + DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth - merged); 4214 4106 4215 4107 /* 4216 4108 * Since reacquire_held_locks() would have called check_chain_key() ··· 4429 4319 check_flags(flags); 4430 4320 current->lockdep_recursion = 1; 4431 4321 trace_lock_release(lock, ip); 4432 - if (__lock_release(lock, nested, ip)) 4322 + if (__lock_release(lock, ip)) 4433 4323 check_chain_key(current); 4434 4324 current->lockdep_recursion = 0; 4435 4325 raw_local_irq_restore(flags); ··· 4512 4402 EXPORT_SYMBOL_GPL(lock_unpin_lock); 4513 4403 4514 4404 #ifdef CONFIG_LOCK_STAT 4515 - static int 4516 - print_lock_contention_bug(struct task_struct *curr, struct lockdep_map *lock, 4517 - unsigned long ip) 4405 + static void print_lock_contention_bug(struct task_struct *curr, 4406 + struct lockdep_map *lock, 4407 + unsigned long ip) 4518 4408 { 4519 4409 if (!debug_locks_off()) 4520 - return 0; 4410 + return; 4521 4411 if (debug_locks_silent) 4522 - return 0; 4412 + return; 4523 4413 4524 4414 pr_warn("\n"); 4525 4415 pr_warn("=================================\n"); ··· 4537 4427 4538 4428 pr_warn("\nstack backtrace:\n"); 4539 4429 dump_stack(); 4540 - 4541 - return 0; 4542 4430 } 4543 4431 4544 4432 static void ··· 4681 4573 int i; 4682 4574 4683 4575 raw_local_irq_save(flags); 4684 - current->curr_chain_key = 0; 4685 - current->lockdep_depth = 0; 4686 - current->lockdep_recursion = 0; 4576 + lockdep_init_task(current); 4687 4577 memset(current->held_locks, 0, MAX_LOCK_DEPTH*sizeof(struct held_lock)); 4688 4578 nr_hardirq_chains = 0; 4689 4579 nr_softirq_chains = 0; ··· 4721 4615 return; 4722 4616 4723 4617 recalc: 4724 - chain_key = 0; 4618 + chain_key = INITIAL_CHAIN_KEY; 4725 4619 for (i = chain->base; i < chain->base + chain->depth; i++) 4726 - chain_key = iterate_chain_key(chain_key, chain_hlocks[i] + 1); 4620 + chain_key = iterate_chain_key(chain_key, chain_hlocks[i]); 4727 4621 if (chain->depth && 
chain->chain_key == chain_key) 4728 4622 return; 4729 4623 /* Overwrite the chain key for concurrent RCU readers. */ ··· 4797 4691 WRITE_ONCE(class->key, NULL); 4798 4692 WRITE_ONCE(class->name, NULL); 4799 4693 nr_lock_classes--; 4694 + __clear_bit(class - lock_classes, lock_classes_in_use); 4800 4695 } else { 4801 4696 WARN_ONCE(true, "%s() failed for class %s\n", __func__, 4802 4697 class->name); ··· 5143 5036 5144 5037 printk(" memory used by lock dependency info: %zu kB\n", 5145 5038 (sizeof(lock_classes) + 5039 + sizeof(lock_classes_in_use) + 5146 5040 sizeof(classhash_table) + 5147 5041 sizeof(list_entries) + 5148 5042 sizeof(list_entries_in_use) +
kernel/locking/lockdep_internals.h (+16 -20)
···
 extern unsigned int nr_softirq_chains;
 extern unsigned int nr_process_chains;
 extern unsigned int max_lockdep_depth;
-extern unsigned int max_recursion_depth;

 extern unsigned int max_bfs_queue_depth;
···
  * and we want to avoid too much cache bouncing.
  */
 struct lockdep_stats {
-	int	chain_lookup_hits;
-	int	chain_lookup_misses;
-	int	hardirqs_on_events;
-	int	hardirqs_off_events;
-	int	redundant_hardirqs_on;
-	int	redundant_hardirqs_off;
-	int	softirqs_on_events;
-	int	softirqs_off_events;
-	int	redundant_softirqs_on;
-	int	redundant_softirqs_off;
-	int	nr_unused_locks;
-	int	nr_redundant_checks;
-	int	nr_redundant;
-	int	nr_cyclic_checks;
-	int	nr_cyclic_check_recursions;
-	int	nr_find_usage_forwards_checks;
-	int	nr_find_usage_forwards_recursions;
-	int	nr_find_usage_backwards_checks;
-	int	nr_find_usage_backwards_recursions;
+	unsigned long  chain_lookup_hits;
+	unsigned int   chain_lookup_misses;
+	unsigned long  hardirqs_on_events;
+	unsigned long  hardirqs_off_events;
+	unsigned long  redundant_hardirqs_on;
+	unsigned long  redundant_hardirqs_off;
+	unsigned long  softirqs_on_events;
+	unsigned long  softirqs_off_events;
+	unsigned long  redundant_softirqs_on;
+	unsigned long  redundant_softirqs_off;
+	int            nr_unused_locks;
+	unsigned int   nr_redundant_checks;
+	unsigned int   nr_redundant;
+	unsigned int   nr_cyclic_checks;
+	unsigned int   nr_find_usage_forwards_checks;
+	unsigned int   nr_find_usage_backwards_checks;

 	/*
 	 * Per lock class locking operation stat counts
kernel/locking/rwsem-xadd.c (-745)
···
-// SPDX-License-Identifier: GPL-2.0
-/* rwsem.c: R/W semaphores: contention handling functions
- *
- * Written by David Howells (dhowells@redhat.com).
- * Derived from arch/i386/kernel/semaphore.c
- *
- * Writer lock-stealing by Alex Shi <alex.shi@intel.com>
- * and Michel Lespinasse <walken@google.com>
- *
- * Optimistic spinning by Tim Chen <tim.c.chen@intel.com>
- * and Davidlohr Bueso <davidlohr@hp.com>. Based on mutexes.
- */
-#include <linux/rwsem.h>
-#include <linux/init.h>
-#include <linux/export.h>
-#include <linux/sched/signal.h>
-#include <linux/sched/rt.h>
-#include <linux/sched/wake_q.h>
-#include <linux/sched/debug.h>
-#include <linux/osq_lock.h>
-
-#include "rwsem.h"
-
-/*
- * Guide to the rw_semaphore's count field for common values.
- * (32-bit case illustrated, similar for 64-bit)
- *
- * 0x0000000X	(1) X readers active or attempting lock, no writer waiting
- *		    X = #active_readers + #readers attempting to lock
- *		    (X*ACTIVE_BIAS)
- *
- * 0x00000000	rwsem is unlocked, and no one is waiting for the lock or
- *		attempting to read lock or write lock.
- *
- * 0xffff000X	(1) X readers active or attempting lock, with waiters for lock
- *		    X = #active readers + # readers attempting lock
- *		    (X*ACTIVE_BIAS + WAITING_BIAS)
- *		(2) 1 writer attempting lock, no waiters for lock
- *		    X-1 = #active readers + #readers attempting lock
- *		    ((X-1)*ACTIVE_BIAS + ACTIVE_WRITE_BIAS)
- *		(3) 1 writer active, no waiters for lock
- *		    X-1 = #active readers + #readers attempting lock
- *		    ((X-1)*ACTIVE_BIAS + ACTIVE_WRITE_BIAS)
- *
- * 0xffff0001	(1) 1 reader active or attempting lock, waiters for lock
- *		    (WAITING_BIAS + ACTIVE_BIAS)
- *		(2) 1 writer active or attempting lock, no waiters for lock
- *		    (ACTIVE_WRITE_BIAS)
- *
- * 0xffff0000	(1) There are writers or readers queued but none active
- *		    or in the process of attempting lock.
- *		    (WAITING_BIAS)
- *		Note: writer can attempt to steal lock for this count by adding
- *		ACTIVE_WRITE_BIAS in cmpxchg and checking the old count
- *
- * 0xfffe0001	(1) 1 writer active, or attempting lock. Waiters on queue.
- *		    (ACTIVE_WRITE_BIAS + WAITING_BIAS)
- *
- * Note: Readers attempt to lock by adding ACTIVE_BIAS in down_read and checking
- * the count becomes more than 0 for successful lock acquisition,
- * i.e. the case where there are only readers or nobody has lock.
- * (1st and 2nd case above).
- *
- * Writers attempt to lock by adding ACTIVE_WRITE_BIAS in down_write and
- * checking the count becomes ACTIVE_WRITE_BIAS for successful lock
- * acquisition (i.e. nobody else has lock or attempts lock). If
- * unsuccessful, in rwsem_down_write_failed, we'll check to see if there
- * are only waiters but none active (5th case above), and attempt to
- * steal the lock.
70 - * 71 - */ 72 - 73 - /* 74 - * Initialize an rwsem: 75 - */ 76 - void __init_rwsem(struct rw_semaphore *sem, const char *name, 77 - struct lock_class_key *key) 78 - { 79 - #ifdef CONFIG_DEBUG_LOCK_ALLOC 80 - /* 81 - * Make sure we are not reinitializing a held semaphore: 82 - */ 83 - debug_check_no_locks_freed((void *)sem, sizeof(*sem)); 84 - lockdep_init_map(&sem->dep_map, name, key, 0); 85 - #endif 86 - atomic_long_set(&sem->count, RWSEM_UNLOCKED_VALUE); 87 - raw_spin_lock_init(&sem->wait_lock); 88 - INIT_LIST_HEAD(&sem->wait_list); 89 - #ifdef CONFIG_RWSEM_SPIN_ON_OWNER 90 - sem->owner = NULL; 91 - osq_lock_init(&sem->osq); 92 - #endif 93 - } 94 - 95 - EXPORT_SYMBOL(__init_rwsem); 96 - 97 - enum rwsem_waiter_type { 98 - RWSEM_WAITING_FOR_WRITE, 99 - RWSEM_WAITING_FOR_READ 100 - }; 101 - 102 - struct rwsem_waiter { 103 - struct list_head list; 104 - struct task_struct *task; 105 - enum rwsem_waiter_type type; 106 - }; 107 - 108 - enum rwsem_wake_type { 109 - RWSEM_WAKE_ANY, /* Wake whatever's at head of wait list */ 110 - RWSEM_WAKE_READERS, /* Wake readers only */ 111 - RWSEM_WAKE_READ_OWNED /* Waker thread holds the read lock */ 112 - }; 113 - 114 - /* 115 - * handle the lock release when processes blocked on it that can now run 116 - * - if we come here from up_xxxx(), then: 117 - * - the 'active part' of count (&0x0000ffff) reached 0 (but may have changed) 118 - * - the 'waiting part' of count (&0xffff0000) is -ve (and will still be so) 119 - * - there must be someone on the queue 120 - * - the wait_lock must be held by the caller 121 - * - tasks are marked for wakeup, the caller must later invoke wake_up_q() 122 - * to actually wakeup the blocked task(s) and drop the reference count, 123 - * preferably when the wait_lock is released 124 - * - woken process blocks are discarded from the list after having task zeroed 125 - * - writers are only marked woken if downgrading is false 126 - */ 127 - static void __rwsem_mark_wake(struct rw_semaphore *sem, 128 - 
enum rwsem_wake_type wake_type, 129 - struct wake_q_head *wake_q) 130 - { 131 - struct rwsem_waiter *waiter, *tmp; 132 - long oldcount, woken = 0, adjustment = 0; 133 - struct list_head wlist; 134 - 135 - /* 136 - * Take a peek at the queue head waiter such that we can determine 137 - * the wakeup(s) to perform. 138 - */ 139 - waiter = list_first_entry(&sem->wait_list, struct rwsem_waiter, list); 140 - 141 - if (waiter->type == RWSEM_WAITING_FOR_WRITE) { 142 - if (wake_type == RWSEM_WAKE_ANY) { 143 - /* 144 - * Mark writer at the front of the queue for wakeup. 145 - * Until the task is actually later awoken later by 146 - * the caller, other writers are able to steal it. 147 - * Readers, on the other hand, will block as they 148 - * will notice the queued writer. 149 - */ 150 - wake_q_add(wake_q, waiter->task); 151 - lockevent_inc(rwsem_wake_writer); 152 - } 153 - 154 - return; 155 - } 156 - 157 - /* 158 - * Writers might steal the lock before we grant it to the next reader. 159 - * We prefer to do the first reader grant before counting readers 160 - * so we can bail out early if a writer stole the lock. 161 - */ 162 - if (wake_type != RWSEM_WAKE_READ_OWNED) { 163 - adjustment = RWSEM_ACTIVE_READ_BIAS; 164 - try_reader_grant: 165 - oldcount = atomic_long_fetch_add(adjustment, &sem->count); 166 - if (unlikely(oldcount < RWSEM_WAITING_BIAS)) { 167 - /* 168 - * If the count is still less than RWSEM_WAITING_BIAS 169 - * after removing the adjustment, it is assumed that 170 - * a writer has stolen the lock. We have to undo our 171 - * reader grant. 172 - */ 173 - if (atomic_long_add_return(-adjustment, &sem->count) < 174 - RWSEM_WAITING_BIAS) 175 - return; 176 - 177 - /* Last active locker left. Retry waking readers. */ 178 - goto try_reader_grant; 179 - } 180 - /* 181 - * Set it to reader-owned to give spinners an early 182 - * indication that readers now have the lock. 
183 - */ 184 - __rwsem_set_reader_owned(sem, waiter->task); 185 - } 186 - 187 - /* 188 - * Grant an infinite number of read locks to the readers at the front 189 - * of the queue. We know that woken will be at least 1 as we accounted 190 - * for above. Note we increment the 'active part' of the count by the 191 - * number of readers before waking any processes up. 192 - * 193 - * We have to do wakeup in 2 passes to prevent the possibility that 194 - * the reader count may be decremented before it is incremented. It 195 - * is because the to-be-woken waiter may not have slept yet. So it 196 - * may see waiter->task got cleared, finish its critical section and 197 - * do an unlock before the reader count increment. 198 - * 199 - * 1) Collect the read-waiters in a separate list, count them and 200 - * fully increment the reader count in rwsem. 201 - * 2) For each waiters in the new list, clear waiter->task and 202 - * put them into wake_q to be woken up later. 203 - */ 204 - list_for_each_entry(waiter, &sem->wait_list, list) { 205 - if (waiter->type == RWSEM_WAITING_FOR_WRITE) 206 - break; 207 - 208 - woken++; 209 - } 210 - list_cut_before(&wlist, &sem->wait_list, &waiter->list); 211 - 212 - adjustment = woken * RWSEM_ACTIVE_READ_BIAS - adjustment; 213 - lockevent_cond_inc(rwsem_wake_reader, woken); 214 - if (list_empty(&sem->wait_list)) { 215 - /* hit end of list above */ 216 - adjustment -= RWSEM_WAITING_BIAS; 217 - } 218 - 219 - if (adjustment) 220 - atomic_long_add(adjustment, &sem->count); 221 - 222 - /* 2nd pass */ 223 - list_for_each_entry_safe(waiter, tmp, &wlist, list) { 224 - struct task_struct *tsk; 225 - 226 - tsk = waiter->task; 227 - get_task_struct(tsk); 228 - 229 - /* 230 - * Ensure calling get_task_struct() before setting the reader 231 - * waiter to nil such that rwsem_down_read_failed() cannot 232 - * race with do_exit() by always holding a reference count 233 - * to the task to wakeup. 
234 - */ 235 - smp_store_release(&waiter->task, NULL); 236 - /* 237 - * Ensure issuing the wakeup (either by us or someone else) 238 - * after setting the reader waiter to nil. 239 - */ 240 - wake_q_add_safe(wake_q, tsk); 241 - } 242 - } 243 - 244 - /* 245 - * This function must be called with the sem->wait_lock held to prevent 246 - * race conditions between checking the rwsem wait list and setting the 247 - * sem->count accordingly. 248 - */ 249 - static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem) 250 - { 251 - /* 252 - * Avoid trying to acquire write lock if count isn't RWSEM_WAITING_BIAS. 253 - */ 254 - if (count != RWSEM_WAITING_BIAS) 255 - return false; 256 - 257 - /* 258 - * Acquire the lock by trying to set it to ACTIVE_WRITE_BIAS. If there 259 - * are other tasks on the wait list, we need to add on WAITING_BIAS. 260 - */ 261 - count = list_is_singular(&sem->wait_list) ? 262 - RWSEM_ACTIVE_WRITE_BIAS : 263 - RWSEM_ACTIVE_WRITE_BIAS + RWSEM_WAITING_BIAS; 264 - 265 - if (atomic_long_cmpxchg_acquire(&sem->count, RWSEM_WAITING_BIAS, count) 266 - == RWSEM_WAITING_BIAS) { 267 - rwsem_set_owner(sem); 268 - return true; 269 - } 270 - 271 - return false; 272 - } 273 - 274 - #ifdef CONFIG_RWSEM_SPIN_ON_OWNER 275 - /* 276 - * Try to acquire write lock before the writer has been put on wait queue. 
277 - */ 278 - static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem) 279 - { 280 - long count = atomic_long_read(&sem->count); 281 - 282 - while (!count || count == RWSEM_WAITING_BIAS) { 283 - if (atomic_long_try_cmpxchg_acquire(&sem->count, &count, 284 - count + RWSEM_ACTIVE_WRITE_BIAS)) { 285 - rwsem_set_owner(sem); 286 - lockevent_inc(rwsem_opt_wlock); 287 - return true; 288 - } 289 - } 290 - return false; 291 - } 292 - 293 - static inline bool owner_on_cpu(struct task_struct *owner) 294 - { 295 - /* 296 - * As lock holder preemption issue, we both skip spinning if 297 - * task is not on cpu or its cpu is preempted 298 - */ 299 - return owner->on_cpu && !vcpu_is_preempted(task_cpu(owner)); 300 - } 301 - 302 - static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem) 303 - { 304 - struct task_struct *owner; 305 - bool ret = true; 306 - 307 - BUILD_BUG_ON(!rwsem_has_anonymous_owner(RWSEM_OWNER_UNKNOWN)); 308 - 309 - if (need_resched()) 310 - return false; 311 - 312 - rcu_read_lock(); 313 - owner = READ_ONCE(sem->owner); 314 - if (owner) { 315 - ret = is_rwsem_owner_spinnable(owner) && 316 - owner_on_cpu(owner); 317 - } 318 - rcu_read_unlock(); 319 - return ret; 320 - } 321 - 322 - /* 323 - * Return true only if we can still spin on the owner field of the rwsem. 324 - */ 325 - static noinline bool rwsem_spin_on_owner(struct rw_semaphore *sem) 326 - { 327 - struct task_struct *owner = READ_ONCE(sem->owner); 328 - 329 - if (!is_rwsem_owner_spinnable(owner)) 330 - return false; 331 - 332 - rcu_read_lock(); 333 - while (owner && (READ_ONCE(sem->owner) == owner)) { 334 - /* 335 - * Ensure we emit the owner->on_cpu, dereference _after_ 336 - * checking sem->owner still matches owner, if that fails, 337 - * owner might point to free()d memory, if it still matches, 338 - * the rcu_read_lock() ensures the memory stays valid. 
339 - */ 340 - barrier(); 341 - 342 - /* 343 - * abort spinning when need_resched or owner is not running or 344 - * owner's cpu is preempted. 345 - */ 346 - if (need_resched() || !owner_on_cpu(owner)) { 347 - rcu_read_unlock(); 348 - return false; 349 - } 350 - 351 - cpu_relax(); 352 - } 353 - rcu_read_unlock(); 354 - 355 - /* 356 - * If there is a new owner or the owner is not set, we continue 357 - * spinning. 358 - */ 359 - return is_rwsem_owner_spinnable(READ_ONCE(sem->owner)); 360 - } 361 - 362 - static bool rwsem_optimistic_spin(struct rw_semaphore *sem) 363 - { 364 - bool taken = false; 365 - 366 - preempt_disable(); 367 - 368 - /* sem->wait_lock should not be held when doing optimistic spinning */ 369 - if (!rwsem_can_spin_on_owner(sem)) 370 - goto done; 371 - 372 - if (!osq_lock(&sem->osq)) 373 - goto done; 374 - 375 - /* 376 - * Optimistically spin on the owner field and attempt to acquire the 377 - * lock whenever the owner changes. Spinning will be stopped when: 378 - * 1) the owning writer isn't running; or 379 - * 2) readers own the lock as we can't determine if they are 380 - * actively running or not. 381 - */ 382 - while (rwsem_spin_on_owner(sem)) { 383 - /* 384 - * Try to acquire the lock 385 - */ 386 - if (rwsem_try_write_lock_unqueued(sem)) { 387 - taken = true; 388 - break; 389 - } 390 - 391 - /* 392 - * When there's no owner, we might have preempted between the 393 - * owner acquiring the lock and setting the owner field. If 394 - * we're an RT task that will live-lock because we won't let 395 - * the owner complete. 396 - */ 397 - if (!sem->owner && (need_resched() || rt_task(current))) 398 - break; 399 - 400 - /* 401 - * The cpu_relax() call is a compiler barrier which forces 402 - * everything in this loop to be re-loaded. We don't need 403 - * memory barriers as we'll eventually observe the right 404 - * values at the cost of a few extra spins. 
405 - */ 406 - cpu_relax(); 407 - } 408 - osq_unlock(&sem->osq); 409 - done: 410 - preempt_enable(); 411 - lockevent_cond_inc(rwsem_opt_fail, !taken); 412 - return taken; 413 - } 414 - 415 - /* 416 - * Return true if the rwsem has active spinner 417 - */ 418 - static inline bool rwsem_has_spinner(struct rw_semaphore *sem) 419 - { 420 - return osq_is_locked(&sem->osq); 421 - } 422 - 423 - #else 424 - static bool rwsem_optimistic_spin(struct rw_semaphore *sem) 425 - { 426 - return false; 427 - } 428 - 429 - static inline bool rwsem_has_spinner(struct rw_semaphore *sem) 430 - { 431 - return false; 432 - } 433 - #endif 434 - 435 - /* 436 - * Wait for the read lock to be granted 437 - */ 438 - static inline struct rw_semaphore __sched * 439 - __rwsem_down_read_failed_common(struct rw_semaphore *sem, int state) 440 - { 441 - long count, adjustment = -RWSEM_ACTIVE_READ_BIAS; 442 - struct rwsem_waiter waiter; 443 - DEFINE_WAKE_Q(wake_q); 444 - 445 - waiter.task = current; 446 - waiter.type = RWSEM_WAITING_FOR_READ; 447 - 448 - raw_spin_lock_irq(&sem->wait_lock); 449 - if (list_empty(&sem->wait_list)) { 450 - /* 451 - * In case the wait queue is empty and the lock isn't owned 452 - * by a writer, this reader can exit the slowpath and return 453 - * immediately as its RWSEM_ACTIVE_READ_BIAS has already 454 - * been set in the count. 455 - */ 456 - if (atomic_long_read(&sem->count) >= 0) { 457 - raw_spin_unlock_irq(&sem->wait_lock); 458 - rwsem_set_reader_owned(sem); 459 - lockevent_inc(rwsem_rlock_fast); 460 - return sem; 461 - } 462 - adjustment += RWSEM_WAITING_BIAS; 463 - } 464 - list_add_tail(&waiter.list, &sem->wait_list); 465 - 466 - /* we're now waiting on the lock, but no longer actively locking */ 467 - count = atomic_long_add_return(adjustment, &sem->count); 468 - 469 - /* 470 - * If there are no active locks, wake the front queued process(es). 
471 - * 472 - * If there are no writers and we are first in the queue, 473 - * wake our own waiter to join the existing active readers ! 474 - */ 475 - if (count == RWSEM_WAITING_BIAS || 476 - (count > RWSEM_WAITING_BIAS && 477 - adjustment != -RWSEM_ACTIVE_READ_BIAS)) 478 - __rwsem_mark_wake(sem, RWSEM_WAKE_ANY, &wake_q); 479 - 480 - raw_spin_unlock_irq(&sem->wait_lock); 481 - wake_up_q(&wake_q); 482 - 483 - /* wait to be given the lock */ 484 - while (true) { 485 - set_current_state(state); 486 - if (!waiter.task) 487 - break; 488 - if (signal_pending_state(state, current)) { 489 - raw_spin_lock_irq(&sem->wait_lock); 490 - if (waiter.task) 491 - goto out_nolock; 492 - raw_spin_unlock_irq(&sem->wait_lock); 493 - break; 494 - } 495 - schedule(); 496 - lockevent_inc(rwsem_sleep_reader); 497 - } 498 - 499 - __set_current_state(TASK_RUNNING); 500 - lockevent_inc(rwsem_rlock); 501 - return sem; 502 - out_nolock: 503 - list_del(&waiter.list); 504 - if (list_empty(&sem->wait_list)) 505 - atomic_long_add(-RWSEM_WAITING_BIAS, &sem->count); 506 - raw_spin_unlock_irq(&sem->wait_lock); 507 - __set_current_state(TASK_RUNNING); 508 - lockevent_inc(rwsem_rlock_fail); 509 - return ERR_PTR(-EINTR); 510 - } 511 - 512 - __visible struct rw_semaphore * __sched 513 - rwsem_down_read_failed(struct rw_semaphore *sem) 514 - { 515 - return __rwsem_down_read_failed_common(sem, TASK_UNINTERRUPTIBLE); 516 - } 517 - EXPORT_SYMBOL(rwsem_down_read_failed); 518 - 519 - __visible struct rw_semaphore * __sched 520 - rwsem_down_read_failed_killable(struct rw_semaphore *sem) 521 - { 522 - return __rwsem_down_read_failed_common(sem, TASK_KILLABLE); 523 - } 524 - EXPORT_SYMBOL(rwsem_down_read_failed_killable); 525 - 526 - /* 527 - * Wait until we successfully acquire the write lock 528 - */ 529 - static inline struct rw_semaphore * 530 - __rwsem_down_write_failed_common(struct rw_semaphore *sem, int state) 531 - { 532 - long count; 533 - bool waiting = true; /* any queued threads before us */ 534 - 
struct rwsem_waiter waiter; 535 - struct rw_semaphore *ret = sem; 536 - DEFINE_WAKE_Q(wake_q); 537 - 538 - /* undo write bias from down_write operation, stop active locking */ 539 - count = atomic_long_sub_return(RWSEM_ACTIVE_WRITE_BIAS, &sem->count); 540 - 541 - /* do optimistic spinning and steal lock if possible */ 542 - if (rwsem_optimistic_spin(sem)) 543 - return sem; 544 - 545 - /* 546 - * Optimistic spinning failed, proceed to the slowpath 547 - * and block until we can acquire the sem. 548 - */ 549 - waiter.task = current; 550 - waiter.type = RWSEM_WAITING_FOR_WRITE; 551 - 552 - raw_spin_lock_irq(&sem->wait_lock); 553 - 554 - /* account for this before adding a new element to the list */ 555 - if (list_empty(&sem->wait_list)) 556 - waiting = false; 557 - 558 - list_add_tail(&waiter.list, &sem->wait_list); 559 - 560 - /* we're now waiting on the lock, but no longer actively locking */ 561 - if (waiting) { 562 - count = atomic_long_read(&sem->count); 563 - 564 - /* 565 - * If there were already threads queued before us and there are 566 - * no active writers, the lock must be read owned; so we try to 567 - * wake any read locks that were queued ahead of us. 568 - */ 569 - if (count > RWSEM_WAITING_BIAS) { 570 - __rwsem_mark_wake(sem, RWSEM_WAKE_READERS, &wake_q); 571 - /* 572 - * The wakeup is normally called _after_ the wait_lock 573 - * is released, but given that we are proactively waking 574 - * readers we can deal with the wake_q overhead as it is 575 - * similar to releasing and taking the wait_lock again 576 - * for attempting rwsem_try_write_lock(). 577 - */ 578 - wake_up_q(&wake_q); 579 - 580 - /* 581 - * Reinitialize wake_q after use. 
582 - */ 583 - wake_q_init(&wake_q); 584 - } 585 - 586 - } else 587 - count = atomic_long_add_return(RWSEM_WAITING_BIAS, &sem->count); 588 - 589 - /* wait until we successfully acquire the lock */ 590 - set_current_state(state); 591 - while (true) { 592 - if (rwsem_try_write_lock(count, sem)) 593 - break; 594 - raw_spin_unlock_irq(&sem->wait_lock); 595 - 596 - /* Block until there are no active lockers. */ 597 - do { 598 - if (signal_pending_state(state, current)) 599 - goto out_nolock; 600 - 601 - schedule(); 602 - lockevent_inc(rwsem_sleep_writer); 603 - set_current_state(state); 604 - } while ((count = atomic_long_read(&sem->count)) & RWSEM_ACTIVE_MASK); 605 - 606 - raw_spin_lock_irq(&sem->wait_lock); 607 - } 608 - __set_current_state(TASK_RUNNING); 609 - list_del(&waiter.list); 610 - raw_spin_unlock_irq(&sem->wait_lock); 611 - lockevent_inc(rwsem_wlock); 612 - 613 - return ret; 614 - 615 - out_nolock: 616 - __set_current_state(TASK_RUNNING); 617 - raw_spin_lock_irq(&sem->wait_lock); 618 - list_del(&waiter.list); 619 - if (list_empty(&sem->wait_list)) 620 - atomic_long_add(-RWSEM_WAITING_BIAS, &sem->count); 621 - else 622 - __rwsem_mark_wake(sem, RWSEM_WAKE_ANY, &wake_q); 623 - raw_spin_unlock_irq(&sem->wait_lock); 624 - wake_up_q(&wake_q); 625 - lockevent_inc(rwsem_wlock_fail); 626 - 627 - return ERR_PTR(-EINTR); 628 - } 629 - 630 - __visible struct rw_semaphore * __sched 631 - rwsem_down_write_failed(struct rw_semaphore *sem) 632 - { 633 - return __rwsem_down_write_failed_common(sem, TASK_UNINTERRUPTIBLE); 634 - } 635 - EXPORT_SYMBOL(rwsem_down_write_failed); 636 - 637 - __visible struct rw_semaphore * __sched 638 - rwsem_down_write_failed_killable(struct rw_semaphore *sem) 639 - { 640 - return __rwsem_down_write_failed_common(sem, TASK_KILLABLE); 641 - } 642 - EXPORT_SYMBOL(rwsem_down_write_failed_killable); 643 - 644 - /* 645 - * handle waking up a waiter on the semaphore 646 - * - up_read/up_write has decremented the active part of count if we come here 647 
- */ 648 - __visible 649 - struct rw_semaphore *rwsem_wake(struct rw_semaphore *sem) 650 - { 651 - unsigned long flags; 652 - DEFINE_WAKE_Q(wake_q); 653 - 654 - /* 655 - * __rwsem_down_write_failed_common(sem) 656 - * rwsem_optimistic_spin(sem) 657 - * osq_unlock(sem->osq) 658 - * ... 659 - * atomic_long_add_return(&sem->count) 660 - * 661 - * - VS - 662 - * 663 - * __up_write() 664 - * if (atomic_long_sub_return_release(&sem->count) < 0) 665 - * rwsem_wake(sem) 666 - * osq_is_locked(&sem->osq) 667 - * 668 - * And __up_write() must observe !osq_is_locked() when it observes the 669 - * atomic_long_add_return() in order to not miss a wakeup. 670 - * 671 - * This boils down to: 672 - * 673 - * [S.rel] X = 1 [RmW] r0 = (Y += 0) 674 - * MB RMB 675 - * [RmW] Y += 1 [L] r1 = X 676 - * 677 - * exists (r0=1 /\ r1=0) 678 - */ 679 - smp_rmb(); 680 - 681 - /* 682 - * If a spinner is present, it is not necessary to do the wakeup. 683 - * Try to do wakeup only if the trylock succeeds to minimize 684 - * spinlock contention which may introduce too much delay in the 685 - * unlock operation. 686 - * 687 - * spinning writer up_write/up_read caller 688 - * --------------- ----------------------- 689 - * [S] osq_unlock() [L] osq 690 - * MB RMB 691 - * [RmW] rwsem_try_write_lock() [RmW] spin_trylock(wait_lock) 692 - * 693 - * Here, it is important to make sure that there won't be a missed 694 - * wakeup while the rwsem is free and the only spinning writer goes 695 - * to sleep without taking the rwsem. Even when the spinning writer 696 - * is just going to break out of the waiting loop, it will still do 697 - * a trylock in rwsem_down_write_failed() before sleeping. IOW, if 698 - * rwsem_has_spinner() is true, it will guarantee at least one 699 - * trylock attempt on the rwsem later on. 700 - */ 701 - if (rwsem_has_spinner(sem)) { 702 - /* 703 - * The smp_rmb() here is to make sure that the spinner 704 - * state is consulted before reading the wait_lock. 
705 - */ 706 - smp_rmb(); 707 - if (!raw_spin_trylock_irqsave(&sem->wait_lock, flags)) 708 - return sem; 709 - goto locked; 710 - } 711 - raw_spin_lock_irqsave(&sem->wait_lock, flags); 712 - locked: 713 - 714 - if (!list_empty(&sem->wait_list)) 715 - __rwsem_mark_wake(sem, RWSEM_WAKE_ANY, &wake_q); 716 - 717 - raw_spin_unlock_irqrestore(&sem->wait_lock, flags); 718 - wake_up_q(&wake_q); 719 - 720 - return sem; 721 - } 722 - EXPORT_SYMBOL(rwsem_wake); 723 - 724 - /* 725 - * downgrade a write lock into a read lock 726 - * - caller incremented waiting part of count and discovered it still negative 727 - * - just wake up any readers at the front of the queue 728 - */ 729 - __visible 730 - struct rw_semaphore *rwsem_downgrade_wake(struct rw_semaphore *sem) 731 - { 732 - unsigned long flags; 733 - DEFINE_WAKE_Q(wake_q); 734 - 735 - raw_spin_lock_irqsave(&sem->wait_lock, flags); 736 - 737 - if (!list_empty(&sem->wait_list)) 738 - __rwsem_mark_wake(sem, RWSEM_WAKE_READ_OWNED, &wake_q); 739 - 740 - raw_spin_unlock_irqrestore(&sem->wait_lock, flags); 741 - wake_up_q(&wake_q); 742 - 743 - return sem; 744 - } 745 - EXPORT_SYMBOL(rwsem_downgrade_wake);
kernel/locking/rwsem.c (+1426 -27)
··· 3 3 * 4 4 * Written by David Howells (dhowells@redhat.com). 5 5 * Derived from asm-i386/semaphore.h 6 + * 7 + * Writer lock-stealing by Alex Shi <alex.shi@intel.com> 8 + * and Michel Lespinasse <walken@google.com> 9 + * 10 + * Optimistic spinning by Tim Chen <tim.c.chen@intel.com> 11 + * and Davidlohr Bueso <davidlohr@hp.com>. Based on mutexes. 12 + * 13 + * Rwsem count bit fields re-definition and rwsem rearchitecture by 14 + * Waiman Long <longman@redhat.com> and 15 + * Peter Zijlstra <peterz@infradead.org>. 6 16 */ 7 17 8 18 #include <linux/types.h> 9 19 #include <linux/kernel.h> 10 20 #include <linux/sched.h> 21 + #include <linux/sched/rt.h> 22 + #include <linux/sched/task.h> 11 23 #include <linux/sched/debug.h> 24 + #include <linux/sched/wake_q.h> 25 + #include <linux/sched/signal.h> 26 + #include <linux/sched/clock.h> 12 27 #include <linux/export.h> 13 28 #include <linux/rwsem.h> 14 29 #include <linux/atomic.h> 15 30 16 31 #include "rwsem.h" 32 + #include "lock_events.h" 33 + 34 + /* 35 + * The least significant 3 bits of the owner value has the following 36 + * meanings when set. 37 + * - Bit 0: RWSEM_READER_OWNED - The rwsem is owned by readers 38 + * - Bit 1: RWSEM_RD_NONSPINNABLE - Readers cannot spin on this lock. 39 + * - Bit 2: RWSEM_WR_NONSPINNABLE - Writers cannot spin on this lock. 40 + * 41 + * When the rwsem is either owned by an anonymous writer, or it is 42 + * reader-owned, but a spinning writer has timed out, both nonspinnable 43 + * bits will be set to disable optimistic spinning by readers and writers. 44 + * In the later case, the last unlocking reader should then check the 45 + * writer nonspinnable bit and clear it only to give writers preference 46 + * to acquire the lock via optimistic spinning, but not readers. Similar 47 + * action is also done in the reader slowpath. 48 + 49 + * When a writer acquires a rwsem, it puts its task_struct pointer 50 + * into the owner field. It is cleared after an unlock. 
51 + * 52 + * When a reader acquires a rwsem, it will also puts its task_struct 53 + * pointer into the owner field with the RWSEM_READER_OWNED bit set. 54 + * On unlock, the owner field will largely be left untouched. So 55 + * for a free or reader-owned rwsem, the owner value may contain 56 + * information about the last reader that acquires the rwsem. 57 + * 58 + * That information may be helpful in debugging cases where the system 59 + * seems to hang on a reader owned rwsem especially if only one reader 60 + * is involved. Ideally we would like to track all the readers that own 61 + * a rwsem, but the overhead is simply too big. 62 + * 63 + * Reader optimistic spinning is helpful when the reader critical section 64 + * is short and there aren't that many readers around. It makes readers 65 + * relatively more preferred than writers. When a writer times out spinning 66 + * on a reader-owned lock and set the nospinnable bits, there are two main 67 + * reasons for that. 68 + * 69 + * 1) The reader critical section is long, perhaps the task sleeps after 70 + * acquiring the read lock. 71 + * 2) There are just too many readers contending the lock causing it to 72 + * take a while to service all of them. 73 + * 74 + * In the former case, long reader critical section will impede the progress 75 + * of writers which is usually more important for system performance. In 76 + * the later case, reader optimistic spinning tends to make the reader 77 + * groups that contain readers that acquire the lock together smaller 78 + * leading to more of them. That may hurt performance in some cases. In 79 + * other words, the setting of nonspinnable bits indicates that reader 80 + * optimistic spinning may not be helpful for those workloads that cause 81 + * it. 
+ *
+ * Therefore, any writers that had observed the setting of the writer
+ * nonspinnable bit for a given rwsem after they fail to acquire the lock
+ * via optimistic spinning will set the reader nonspinnable bit once they
+ * acquire the write lock. Similarly, readers that observe the setting
+ * of the reader nonspinnable bit at slowpath entry will set the reader
+ * nonspinnable bits when they acquire the read lock via the wakeup path.
+ *
+ * Once the reader nonspinnable bit is on, it will only be reset when
+ * a writer is able to acquire the rwsem in the fast path or somehow a
+ * reader or writer in the slowpath doesn't observe the nonspinnable bit.
+ *
+ * This is to discourage reader optimistic spinning on that particular
+ * rwsem and make writers more preferred. This adaptive disabling of reader
+ * optimistic spinning will alleviate the negative side effect of this
+ * feature.
+ */
+ #define RWSEM_READER_OWNED	(1UL << 0)
+ #define RWSEM_RD_NONSPINNABLE	(1UL << 1)
+ #define RWSEM_WR_NONSPINNABLE	(1UL << 2)
+ #define RWSEM_NONSPINNABLE	(RWSEM_RD_NONSPINNABLE | RWSEM_WR_NONSPINNABLE)
+ #define RWSEM_OWNER_FLAGS_MASK	(RWSEM_READER_OWNED | RWSEM_NONSPINNABLE)
+
+ #ifdef CONFIG_DEBUG_RWSEMS
+ # define DEBUG_RWSEMS_WARN_ON(c, sem)	do {				\
+	if (!debug_locks_silent &&					\
+	    WARN_ONCE(c, "DEBUG_RWSEMS_WARN_ON(%s): count = 0x%lx, owner = 0x%lx, curr 0x%lx, list %sempty\n",\
+		#c, atomic_long_read(&(sem)->count),			\
+		atomic_long_read(&(sem)->owner), (long)current,		\
+		list_empty(&(sem)->wait_list) ? "" : "not "))		\
+			debug_locks_off();				\
+	} while (0)
+ #else
+ # define DEBUG_RWSEMS_WARN_ON(c, sem)
+ #endif
+
+ /*
+ * On 64-bit architectures, the bit definitions of the count are:
+ *
+ * Bit  0    - writer locked bit
+ * Bit  1    - waiters present bit
+ * Bit  2    - lock handoff bit
+ * Bits 3-7  - reserved
+ * Bits 8-62 - 55-bit reader count
+ * Bit  63   - read fail bit
+ *
+ * On 32-bit architectures, the bit definitions of the count are:
+ *
+ * Bit  0    - writer locked bit
+ * Bit  1    - waiters present bit
+ * Bit  2    - lock handoff bit
+ * Bits 3-7  - reserved
+ * Bits 8-30 - 23-bit reader count
+ * Bit  31   - read fail bit
+ *
+ * It is not likely that the most significant bit (read fail bit) will ever
+ * be set. This guard bit is still checked anyway in the down_read()
+ * fastpath just in case we need to use up more of the reader bits for
+ * other purposes in the future.
+ *
+ * atomic_long_fetch_add() is used to obtain the reader lock, whereas
+ * atomic_long_cmpxchg() will be used to obtain the writer lock.
+ *
+ * There are three places where the lock handoff bit may be set or cleared.
+ * 1) rwsem_mark_wake() for readers.
+ * 2) rwsem_try_write_lock() for writers.
+ * 3) Error path of rwsem_down_write_slowpath().
+ *
+ * For all the above cases, wait_lock will be held. A writer must also
+ * be the first one in the wait_list to be eligible for setting the handoff
+ * bit. So concurrent setting/clearing of the handoff bit is not possible.
+ */
+ #define RWSEM_WRITER_LOCKED	(1UL << 0)
+ #define RWSEM_FLAG_WAITERS	(1UL << 1)
+ #define RWSEM_FLAG_HANDOFF	(1UL << 2)
+ #define RWSEM_FLAG_READFAIL	(1UL << (BITS_PER_LONG - 1))
+
+ #define RWSEM_READER_SHIFT	8
+ #define RWSEM_READER_BIAS	(1UL << RWSEM_READER_SHIFT)
+ #define RWSEM_READER_MASK	(~(RWSEM_READER_BIAS - 1))
+ #define RWSEM_WRITER_MASK	RWSEM_WRITER_LOCKED
+ #define RWSEM_LOCK_MASK	(RWSEM_WRITER_MASK|RWSEM_READER_MASK)
+ #define RWSEM_READ_FAILED_MASK	(RWSEM_WRITER_MASK|RWSEM_FLAG_WAITERS|\
+				 RWSEM_FLAG_HANDOFF|RWSEM_FLAG_READFAIL)
+
+ /*
+ * All writes to owner are protected by WRITE_ONCE() to make sure that
+ * store tearing can't happen as optimistic spinners may read and use
+ * the owner value concurrently without taking the lock. Reads from owner,
+ * however, may not need READ_ONCE() as long as the pointer value is only
+ * used for comparison and isn't being dereferenced.
+ */
+ static inline void rwsem_set_owner(struct rw_semaphore *sem)
+ {
+	atomic_long_set(&sem->owner, (long)current);
+ }
+
+ static inline void rwsem_clear_owner(struct rw_semaphore *sem)
+ {
+	atomic_long_set(&sem->owner, 0);
+ }
+
+ /*
+ * Test the flags in the owner field.
+ */
+ static inline bool rwsem_test_oflags(struct rw_semaphore *sem, long flags)
+ {
+	return atomic_long_read(&sem->owner) & flags;
+ }
+
+ /*
+ * The task_struct pointer of the last owning reader will be left in
+ * the owner field.
+ *
+ * Note that the owner value just indicates the task has owned the rwsem
+ * previously; it may not be the real owner or one of the real owners
+ * anymore when that field is examined, so take it with a grain of salt.
+ *
+ * The reader non-spinnable bit is preserved.
+ */
+ static inline void __rwsem_set_reader_owned(struct rw_semaphore *sem,
+					     struct task_struct *owner)
+ {
+	unsigned long val = (unsigned long)owner | RWSEM_READER_OWNED |
+		(atomic_long_read(&sem->owner) & RWSEM_RD_NONSPINNABLE);
+
+	atomic_long_set(&sem->owner, val);
+ }
+
+ static inline void rwsem_set_reader_owned(struct rw_semaphore *sem)
+ {
+	__rwsem_set_reader_owned(sem, current);
+ }
+
+ /*
+ * Return true if the rwsem is owned by a reader.
+ */
+ static inline bool is_rwsem_reader_owned(struct rw_semaphore *sem)
+ {
+ #ifdef CONFIG_DEBUG_RWSEMS
+	/*
+	 * Check the count to see if it is write-locked.
+	 */
+	long count = atomic_long_read(&sem->count);
+
+	if (count & RWSEM_WRITER_MASK)
+		return false;
+ #endif
+	return rwsem_test_oflags(sem, RWSEM_READER_OWNED);
+ }
+
+ #ifdef CONFIG_DEBUG_RWSEMS
+ /*
+ * With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there
+ * is a task pointer in the owner field of a reader-owned rwsem, it will
+ * be the real owner or one of the real owners. The only exception is
+ * when the unlock is done by up_read_non_owner().
+ */
+ static inline void rwsem_clear_reader_owned(struct rw_semaphore *sem)
+ {
+	unsigned long val = atomic_long_read(&sem->owner);
+
+	while ((val & ~RWSEM_OWNER_FLAGS_MASK) == (unsigned long)current) {
+		if (atomic_long_try_cmpxchg(&sem->owner, &val,
+					    val & RWSEM_OWNER_FLAGS_MASK))
+			return;
+	}
+ }
+ #else
+ static inline void rwsem_clear_reader_owned(struct rw_semaphore *sem)
+ {
+ }
+ #endif
+
+ /*
+ * Set the RWSEM_NONSPINNABLE bits if the RWSEM_READER_OWNED flag
+ * remains set. Otherwise, the operation will be aborted.
+ */
+ static inline void rwsem_set_nonspinnable(struct rw_semaphore *sem)
+ {
+	unsigned long owner = atomic_long_read(&sem->owner);
+
+	do {
+		if (!(owner & RWSEM_READER_OWNED))
+			break;
+		if (owner & RWSEM_NONSPINNABLE)
+			break;
+	} while (!atomic_long_try_cmpxchg(&sem->owner, &owner,
+					  owner | RWSEM_NONSPINNABLE));
+ }
+
+ static inline bool rwsem_read_trylock(struct rw_semaphore *sem)
+ {
+	long cnt = atomic_long_add_return_acquire(RWSEM_READER_BIAS, &sem->count);
+
+	if (WARN_ON_ONCE(cnt < 0))
+		rwsem_set_nonspinnable(sem);
+	return !(cnt & RWSEM_READ_FAILED_MASK);
+ }
+
+ /*
+ * Return just the real task structure pointer of the owner.
+ */
+ static inline struct task_struct *rwsem_owner(struct rw_semaphore *sem)
+ {
+	return (struct task_struct *)
+		(atomic_long_read(&sem->owner) & ~RWSEM_OWNER_FLAGS_MASK);
+ }
+
+ /*
+ * Return the real task structure pointer of the owner and the embedded
+ * flags in the owner field. pflags must be non-NULL.
+ */
+ static inline struct task_struct *
+ rwsem_owner_flags(struct rw_semaphore *sem, unsigned long *pflags)
+ {
+	unsigned long owner = atomic_long_read(&sem->owner);
+
+	*pflags = owner & RWSEM_OWNER_FLAGS_MASK;
+	return (struct task_struct *)(owner & ~RWSEM_OWNER_FLAGS_MASK);
+ }
+
+ /*
+ * Guide to the rw_semaphore's count field.
+ *
+ * When the RWSEM_WRITER_LOCKED bit in count is set, the lock is owned
+ * by a writer.
+ *
+ * The lock is owned by readers when
+ * (1) the RWSEM_WRITER_LOCKED bit isn't set in count,
+ * (2) some of the reader bits are set in count, and
+ * (3) the owner field has the RWSEM_READER_OWNED bit set.
+ *
+ * Having some reader bits set is not enough to guarantee a reader-owned
+ * lock, as the readers may be in the process of backing out from the
+ * count and a writer has just released the lock. So another writer may
+ * steal the lock immediately after that.
+ */
+
+ /*
+ * Initialize an rwsem:
+ */
+ void __init_rwsem(struct rw_semaphore *sem, const char *name,
+		   struct lock_class_key *key)
+ {
+ #ifdef CONFIG_DEBUG_LOCK_ALLOC
+	/*
+	 * Make sure we are not reinitializing a held semaphore:
+	 */
+	debug_check_no_locks_freed((void *)sem, sizeof(*sem));
+	lockdep_init_map(&sem->dep_map, name, key, 0);
+ #endif
+	atomic_long_set(&sem->count, RWSEM_UNLOCKED_VALUE);
+	raw_spin_lock_init(&sem->wait_lock);
+	INIT_LIST_HEAD(&sem->wait_list);
+	atomic_long_set(&sem->owner, 0L);
+ #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
+	osq_lock_init(&sem->osq);
+ #endif
+ }
+ EXPORT_SYMBOL(__init_rwsem);
+
+ enum rwsem_waiter_type {
+	RWSEM_WAITING_FOR_WRITE,
+	RWSEM_WAITING_FOR_READ
+ };
+
+ struct rwsem_waiter {
+	struct list_head list;
+	struct task_struct *task;
+	enum rwsem_waiter_type type;
+	unsigned long timeout;
+	unsigned long last_rowner;
+ };
+ #define rwsem_first_waiter(sem) \
+	list_first_entry(&sem->wait_list, struct rwsem_waiter, list)
+
+ enum rwsem_wake_type {
+	RWSEM_WAKE_ANY,		/* Wake whatever's at head of wait list */
+	RWSEM_WAKE_READERS,	/* Wake readers only */
+	RWSEM_WAKE_READ_OWNED	/* Waker thread holds the read lock */
+ };
+
+ enum writer_wait_state {
+	WRITER_NOT_FIRST,	/* Writer is not first in wait list */
+	WRITER_FIRST,		/* Writer is first in wait list */
+	WRITER_HANDOFF		/* Writer is first & handoff needed */
+ };
+
+ /*
+ * The typical HZ value is either 250 or 1000. So set the minimum waiting
+ * time to at least 4ms or 1 jiffy (if it is higher than 4ms) in the wait
+ * queue before initiating the handoff protocol.
+ */
+ #define RWSEM_WAIT_TIMEOUT	DIV_ROUND_UP(HZ, 250)
+
+ /*
+ * Magic number to batch-wakeup waiting readers, even when writers are
+ * also present in the queue. This both limits the amount of work the
+ * waking thread must do and also prevents any potential counter overflow,
+ * however unlikely.
+ */
+ #define MAX_READERS_WAKEUP	0x100
+
+ /*
+ * Handle the lock release when processes blocked on it can now run.
+ * - If we come here from up_xxxx(), then the RWSEM_FLAG_WAITERS bit must
+ *   have been set.
+ * - There must be someone on the queue.
+ * - The wait_lock must be held by the caller.
+ * - Tasks are marked for wakeup; the caller must later invoke wake_up_q()
+ *   to actually wake up the blocked task(s) and drop the reference count,
+ *   preferably when the wait_lock is released.
+ * - Woken process blocks are discarded from the list after having task
+ *   zeroed.
+ * - Writers are only marked woken if downgrading is false.
+ */
+ static void rwsem_mark_wake(struct rw_semaphore *sem,
+			     enum rwsem_wake_type wake_type,
+			     struct wake_q_head *wake_q)
+ {
+	struct rwsem_waiter *waiter, *tmp;
+	long oldcount, woken = 0, adjustment = 0;
+	struct list_head wlist;
+
+	lockdep_assert_held(&sem->wait_lock);
+
+	/*
+	 * Take a peek at the queue head waiter such that we can determine
+	 * the wakeup(s) to perform.
+	 */
+	waiter = rwsem_first_waiter(sem);
+
+	if (waiter->type == RWSEM_WAITING_FOR_WRITE) {
+		if (wake_type == RWSEM_WAKE_ANY) {
+			/*
+			 * Mark writer at the front of the queue for wakeup.
+			 * Until the task is actually awoken later by
+			 * the caller, other writers are able to steal it.
+			 * Readers, on the other hand, will block as they
+			 * will notice the queued writer.
+			 */
+			wake_q_add(wake_q, waiter->task);
+			lockevent_inc(rwsem_wake_writer);
+		}
+
+		return;
+	}
+
+	/*
+	 * No reader wakeup if there are too many of them already.
+	 */
+	if (unlikely(atomic_long_read(&sem->count) < 0))
+		return;
+
+	/*
+	 * Writers might steal the lock before we grant it to the next reader.
+	 * We prefer to do the first reader grant before counting readers
+	 * so we can bail out early if a writer stole the lock.
+	 */
+	if (wake_type != RWSEM_WAKE_READ_OWNED) {
+		struct task_struct *owner;
+
+		adjustment = RWSEM_READER_BIAS;
+		oldcount = atomic_long_fetch_add(adjustment, &sem->count);
+		if (unlikely(oldcount & RWSEM_WRITER_MASK)) {
+			/*
+			 * When we've been waiting "too" long (for writers
+			 * to give up the lock), request a HANDOFF to
+			 * force the issue.
+			 */
+			if (!(oldcount & RWSEM_FLAG_HANDOFF) &&
+			    time_after(jiffies, waiter->timeout)) {
+				adjustment -= RWSEM_FLAG_HANDOFF;
+				lockevent_inc(rwsem_rlock_handoff);
+			}
+
+			atomic_long_add(-adjustment, &sem->count);
+			return;
+		}
+		/*
+		 * Set it to reader-owned to give spinners an early
+		 * indication that readers now have the lock.
+		 * The reader nonspinnable bit seen at slowpath entry of
+		 * the reader is copied over.
+		 */
+		owner = waiter->task;
+		if (waiter->last_rowner & RWSEM_RD_NONSPINNABLE) {
+			owner = (void *)((unsigned long)owner |
+					 RWSEM_RD_NONSPINNABLE);
+			lockevent_inc(rwsem_opt_norspin);
+		}
+		__rwsem_set_reader_owned(sem, owner);
+	}
+
+	/*
+	 * Grant up to MAX_READERS_WAKEUP read locks to all the readers in
+	 * the queue. We know that the woken count will be at least 1, as we
+	 * accounted for that above. Note we increment the 'active part' of
+	 * the count by the number of readers before waking any processes up.
+	 *
+	 * This is an adaptation of the phase-fair R/W locks where at the
+	 * reader phase (first waiter is a reader), all readers are eligible
+	 * to acquire the lock at the same time irrespective of their order
+	 * in the queue. The writers acquire the lock according to their
+	 * order in the queue.
+	 *
+	 * We have to do the wakeup in 2 passes to prevent the possibility
+	 * that the reader count may be decremented before it is incremented.
+	 * That is because the to-be-woken waiter may not have slept yet. So
+	 * it may see waiter->task get cleared, finish its critical section
+	 * and do an unlock before the reader count increment.
+	 *
+	 * 1) Collect the read-waiters in a separate list, count them and
+	 *    fully increment the reader count in the rwsem.
+	 * 2) For each waiter in the new list, clear waiter->task and
+	 *    put it into wake_q to be woken up later.
+	 */
+	INIT_LIST_HEAD(&wlist);
+	list_for_each_entry_safe(waiter, tmp, &sem->wait_list, list) {
+		if (waiter->type == RWSEM_WAITING_FOR_WRITE)
+			continue;
+
+		woken++;
+		list_move_tail(&waiter->list, &wlist);
+
+		/*
+		 * Limit # of readers that can be woken up per wakeup call.
+		 */
+		if (woken >= MAX_READERS_WAKEUP)
+			break;
+	}
+
+	adjustment = woken * RWSEM_READER_BIAS - adjustment;
+	lockevent_cond_inc(rwsem_wake_reader, woken);
+	if (list_empty(&sem->wait_list)) {
+		/* hit end of list above */
+		adjustment -= RWSEM_FLAG_WAITERS;
+	}
+
+	/*
+	 * When we've woken a reader, we no longer need to force writers
+	 * to give up the lock and we can clear HANDOFF.
+	 */
+	if (woken && (atomic_long_read(&sem->count) & RWSEM_FLAG_HANDOFF))
+		adjustment -= RWSEM_FLAG_HANDOFF;
+
+	if (adjustment)
+		atomic_long_add(adjustment, &sem->count);
+
+	/* 2nd pass */
+	list_for_each_entry_safe(waiter, tmp, &wlist, list) {
+		struct task_struct *tsk;
+
+		tsk = waiter->task;
+		get_task_struct(tsk);
+
+		/*
+		 * Ensure calling get_task_struct() before setting the reader
+		 * waiter to nil such that rwsem_down_read_slowpath() cannot
+		 * race with do_exit() by always holding a reference count
+		 * to the task to wake up.
+		 */
+		smp_store_release(&waiter->task, NULL);
+		/*
+		 * Ensure issuing the wakeup (either by us or someone else)
+		 * after setting the reader waiter to nil.
+		 */
+		wake_q_add_safe(wake_q, tsk);
+	}
+ }
+
+ /*
+ * This function must be called with the sem->wait_lock held to prevent
+ * race conditions between checking the rwsem wait list and setting the
+ * sem->count accordingly.
+ *
+ * If wstate is WRITER_HANDOFF, it will make sure that either the handoff
+ * bit is set or the lock is acquired with the handoff bit cleared.
+ */
+ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
+					 enum writer_wait_state wstate)
+ {
+	long count, new;
+
+	lockdep_assert_held(&sem->wait_lock);
+
+	count = atomic_long_read(&sem->count);
+	do {
+		bool has_handoff = !!(count & RWSEM_FLAG_HANDOFF);
+
+		if (has_handoff && wstate == WRITER_NOT_FIRST)
+			return false;
+
+		new = count;
+
+		if (count & RWSEM_LOCK_MASK) {
+			if (has_handoff || (wstate != WRITER_HANDOFF))
+				return false;
+
+			new |= RWSEM_FLAG_HANDOFF;
+		} else {
+			new |= RWSEM_WRITER_LOCKED;
+			new &= ~RWSEM_FLAG_HANDOFF;
+
+			if (list_is_singular(&sem->wait_list))
+				new &= ~RWSEM_FLAG_WAITERS;
+		}
+	} while (!atomic_long_try_cmpxchg_acquire(&sem->count, &count, new));
+
+	/*
+	 * We have either acquired the lock with handoff bit cleared or
+	 * set the handoff bit.
+	 */
+	if (new & RWSEM_FLAG_HANDOFF)
+		return false;
+
+	rwsem_set_owner(sem);
+	return true;
+ }
+
+ #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
+ /*
+ * Try to acquire read lock before the reader is put on wait queue.
+ * Lock acquisition isn't allowed if the rwsem is locked or a writer handoff
+ * is ongoing.
+ */
+ static inline bool rwsem_try_read_lock_unqueued(struct rw_semaphore *sem)
+ {
+	long count = atomic_long_read(&sem->count);
+
+	if (count & (RWSEM_WRITER_MASK | RWSEM_FLAG_HANDOFF))
+		return false;
+
+	count = atomic_long_fetch_add_acquire(RWSEM_READER_BIAS, &sem->count);
+	if (!(count & (RWSEM_WRITER_MASK | RWSEM_FLAG_HANDOFF))) {
+		rwsem_set_reader_owned(sem);
+		lockevent_inc(rwsem_opt_rlock);
+		return true;
+	}
+
+	/* Back out the change */
+	atomic_long_add(-RWSEM_READER_BIAS, &sem->count);
+	return false;
+ }
+
+ /*
+ * Try to acquire write lock before the writer has been put on wait queue.
+ */
+ static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
+ {
+	long count = atomic_long_read(&sem->count);
+
+	while (!(count & (RWSEM_LOCK_MASK|RWSEM_FLAG_HANDOFF))) {
+		if (atomic_long_try_cmpxchg_acquire(&sem->count, &count,
+					count | RWSEM_WRITER_LOCKED)) {
+			rwsem_set_owner(sem);
+			lockevent_inc(rwsem_opt_wlock);
+			return true;
+		}
+	}
+	return false;
+ }
+
+ static inline bool owner_on_cpu(struct task_struct *owner)
+ {
+	/*
+	 * Due to lock holder preemption, we skip spinning if the
+	 * task is not on a CPU or its CPU is preempted.
+	 */
+	return owner->on_cpu && !vcpu_is_preempted(task_cpu(owner));
+ }
+
+ static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem,
+					    unsigned long nonspinnable)
+ {
+	struct task_struct *owner;
+	unsigned long flags;
+	bool ret = true;
+
+	BUILD_BUG_ON(!(RWSEM_OWNER_UNKNOWN & RWSEM_NONSPINNABLE));
+
+	if (need_resched()) {
+		lockevent_inc(rwsem_opt_fail);
+		return false;
+	}
+
+	preempt_disable();
+	rcu_read_lock();
+	owner = rwsem_owner_flags(sem, &flags);
+	if ((flags & nonspinnable) || (owner && !owner_on_cpu(owner)))
+		ret = false;
+	rcu_read_unlock();
+	preempt_enable();
+
+	lockevent_cond_inc(rwsem_opt_fail, !ret);
+	return ret;
+ }
+
+ /*
+ * The rwsem_spin_on_owner() function returns the following 4 values
+ * depending on the lock owner state.
+ *   OWNER_NULL  : owner is currently NULL
+ *   OWNER_WRITER: when owner changes and is a writer
+ *   OWNER_READER: when owner changes and the new owner may be a reader
+ *   OWNER_NONSPINNABLE:
+ *		   when optimistic spinning has to stop because either the
+ *		   owner stops running, is unknown, or its timeslice has
+ *		   been used up.
+ */
+ enum owner_state {
+	OWNER_NULL		= 1 << 0,
+	OWNER_WRITER		= 1 << 1,
+	OWNER_READER		= 1 << 2,
+	OWNER_NONSPINNABLE	= 1 << 3,
+ };
+ #define OWNER_SPINNABLE		(OWNER_NULL | OWNER_WRITER | OWNER_READER)
+
+ static inline enum owner_state
+ rwsem_owner_state(struct task_struct *owner, unsigned long flags,
+		   unsigned long nonspinnable)
+ {
+	if (flags & nonspinnable)
+		return OWNER_NONSPINNABLE;
+
+	if (flags & RWSEM_READER_OWNED)
+		return OWNER_READER;
+
+	return owner ? OWNER_WRITER : OWNER_NULL;
+ }
+
+ static noinline enum owner_state
+ rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
+ {
+	struct task_struct *new, *owner;
+	unsigned long flags, new_flags;
+	enum owner_state state;
+
+	owner = rwsem_owner_flags(sem, &flags);
+	state = rwsem_owner_state(owner, flags, nonspinnable);
+	if (state != OWNER_WRITER)
+		return state;
+
+	rcu_read_lock();
+	for (;;) {
+		if (atomic_long_read(&sem->count) & RWSEM_FLAG_HANDOFF) {
+			state = OWNER_NONSPINNABLE;
+			break;
+		}
+
+		new = rwsem_owner_flags(sem, &new_flags);
+		if ((new != owner) || (new_flags != flags)) {
+			state = rwsem_owner_state(new, new_flags, nonspinnable);
+			break;
+		}
+
+		/*
+		 * Ensure we emit the owner->on_cpu dereference _after_
+		 * checking sem->owner still matches owner. If that fails,
+		 * owner might point to free()d memory; if it still matches,
+		 * the rcu_read_lock() ensures the memory stays valid.
+		 */
+		barrier();
+
+		if (need_resched() || !owner_on_cpu(owner)) {
+			state = OWNER_NONSPINNABLE;
+			break;
+		}
+
+		cpu_relax();
+	}
+	rcu_read_unlock();
+
+	return state;
+ }
+
+ /*
+ * Calculate the reader-owned rwsem spinning threshold for a writer.
+ *
+ * The more readers own the rwsem, the longer it will take for them to
+ * wind down and free the rwsem. So the empirical formula used to
+ * determine the actual spinning time limit here is:
+ *
+ *   Spinning threshold = (10 + nr_readers/2)us
+ *
+ * The limit is capped to a maximum of 25us (30 readers). This is just
+ * a heuristic and is subject to change in the future.
+ */
+ static inline u64 rwsem_rspin_threshold(struct rw_semaphore *sem)
+ {
+	long count = atomic_long_read(&sem->count);
+	int readers = count >> RWSEM_READER_SHIFT;
+	u64 delta;
+
+	if (readers > 30)
+		readers = 30;
+	delta = (20 + readers) * NSEC_PER_USEC / 2;
+
+	return sched_clock() + delta;
+ }
+
+ static bool rwsem_optimistic_spin(struct rw_semaphore *sem, bool wlock)
+ {
+	bool taken = false;
+	int prev_owner_state = OWNER_NULL;
+	int loop = 0;
+	u64 rspin_threshold = 0;
+	unsigned long nonspinnable = wlock ? RWSEM_WR_NONSPINNABLE
+					   : RWSEM_RD_NONSPINNABLE;
+
+	preempt_disable();
+
+	/* sem->wait_lock should not be held when doing optimistic spinning */
+	if (!osq_lock(&sem->osq))
+		goto done;
+
+	/*
+	 * Optimistically spin on the owner field and attempt to acquire the
+	 * lock whenever the owner changes. Spinning will be stopped when:
+	 *  1) the owning writer isn't running; or
+	 *  2) readers own the lock and the spinning time has exceeded the
+	 *     limit.
+	 */
+	for (;;) {
+		enum owner_state owner_state;
+
+		owner_state = rwsem_spin_on_owner(sem, nonspinnable);
+		if (!(owner_state & OWNER_SPINNABLE))
+			break;
+
+		/*
+		 * Try to acquire the lock
+		 */
+		taken = wlock ? rwsem_try_write_lock_unqueued(sem)
+			      : rwsem_try_read_lock_unqueued(sem);
+
+		if (taken)
+			break;
+
+		/*
+		 * Time-based reader-owned rwsem optimistic spinning
+		 */
+		if (wlock && (owner_state == OWNER_READER)) {
+			/*
+			 * Re-initialize rspin_threshold every time the
+			 * owner state changes from non-reader to reader.
+			 * This allows a writer to steal the lock in between
+			 * 2 reader phases and have the threshold reset at
+			 * the beginning of the 2nd reader phase.
+			 */
+			if (prev_owner_state != OWNER_READER) {
+				if (rwsem_test_oflags(sem, nonspinnable))
+					break;
+				rspin_threshold = rwsem_rspin_threshold(sem);
+				loop = 0;
+			}
+
+			/*
+			 * Check the time threshold once every 16 iterations
+			 * to avoid calling sched_clock() too frequently so
+			 * as to reduce the average latency between the times
+			 * when the lock becomes free and when the spinner
+			 * is ready to do a trylock.
+			 */
+			else if (!(++loop & 0xf) &&
+				 (sched_clock() > rspin_threshold)) {
+				rwsem_set_nonspinnable(sem);
+				lockevent_inc(rwsem_opt_nospin);
+				break;
+			}
+		}
+
+		/*
+		 * An RT task cannot do optimistic spinning if it cannot
+		 * be sure the lock holder is running, or live-lock may
+		 * happen if the current task and the lock holder happen
+		 * to run on the same CPU. However, aborting optimistic
+		 * spinning while a NULL owner is detected may miss some
+		 * opportunity where spinning can continue without causing
+		 * problems.
+		 *
+		 * There are 2 possible cases where an RT task may be able
+		 * to continue spinning.
+		 *
+		 * 1) The lock owner is in the process of releasing the
+		 *    lock, sem->owner is cleared but the lock has not
+		 *    been released yet.
+		 * 2) The lock was free and owner cleared, but another
+		 *    task just came in and acquired the lock before
+		 *    we try to get it. The new owner may be a spinnable
+		 *    writer.
+		 *
+		 * To take advantage of the two scenarios listed above, the
+		 * RT task is made to retry one more time to see if it can
+		 * acquire the lock or continue spinning on the new owning
+		 * writer. Of course, if the time lag is long enough or the
+		 * new owner is not a writer or spinnable, the RT task will
+		 * quit spinning.
+		 *
+		 * If the owner is a writer, the need_resched() check is
+		 * done inside rwsem_spin_on_owner(). If the owner is not
+		 * a writer, the need_resched() check needs to be done here.
+		 */
+		if (owner_state != OWNER_WRITER) {
+			if (need_resched())
+				break;
+			if (rt_task(current) &&
+			    (prev_owner_state != OWNER_WRITER))
+				break;
+		}
+		prev_owner_state = owner_state;
+
+		/*
+		 * The cpu_relax() call is a compiler barrier which forces
+		 * everything in this loop to be re-loaded. We don't need
+		 * memory barriers as we'll eventually observe the right
+		 * values at the cost of a few extra spins.
+		 */
+		cpu_relax();
+	}
+	osq_unlock(&sem->osq);
+ done:
+	preempt_enable();
+	lockevent_cond_inc(rwsem_opt_fail, !taken);
+	return taken;
+ }
+
+ /*
+ * Clear the owner's RWSEM_WR_NONSPINNABLE bit if it is set. This should
+ * only be called when the reader count reaches 0.
+ *
+ * This gives writers a better chance to acquire the rwsem first before
+ * readers when the rwsem was being held by readers for a relatively long
+ * period of time. A race can happen where an optimistic spinner may have
+ * just stolen the rwsem and set the owner, but just clearing the
+ * RWSEM_WR_NONSPINNABLE bit will do no harm anyway.
+ */
+ static inline void clear_wr_nonspinnable(struct rw_semaphore *sem)
+ {
+	if (rwsem_test_oflags(sem, RWSEM_WR_NONSPINNABLE))
+		atomic_long_andnot(RWSEM_WR_NONSPINNABLE, &sem->owner);
+ }
+
+ /*
+ * This function is called when the reader fails to acquire the lock via
+ * optimistic spinning. In this case we will still attempt to do a trylock
+ * when comparing the rwsem state right now with the state when entering
+ * the slowpath indicates that the reader is still in a valid reader phase.
+ * This happens when the following conditions are true:
+ *
+ * 1) The lock is currently reader-owned, and
+ * 2) The lock was previously not reader-owned or the last read owner has
+ *    changed.
+ *
+ * In the former case, we have transitioned from a writer phase to a
+ * reader phase while spinning. In the latter case, it means the reader
+ * phase hasn't ended when we entered the optimistic spinning loop. In
+ * both cases, the reader is eligible to acquire the lock. This is the
+ * secondary path where a read lock is acquired optimistically.
+ *
+ * The reader non-spinnable bit wasn't set at the time of entry, or we
+ * would not be here at all.
+ */
+ static inline bool rwsem_reader_phase_trylock(struct rw_semaphore *sem,
+					       unsigned long last_rowner)
+ {
+	unsigned long owner = atomic_long_read(&sem->owner);
+
+	if (!(owner & RWSEM_READER_OWNED))
+		return false;
+
+	if (((owner ^ last_rowner) & ~RWSEM_OWNER_FLAGS_MASK) &&
+	    rwsem_try_read_lock_unqueued(sem)) {
+		lockevent_inc(rwsem_opt_rlock2);
+		lockevent_add(rwsem_opt_fail, -1);
+		return true;
+	}
+	return false;
+ }
+ #else
+ static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem,
+					    unsigned long nonspinnable)
+ {
+	return false;
+ }
+
+ static inline bool rwsem_optimistic_spin(struct rw_semaphore *sem, bool wlock)
+ {
+	return false;
+ }
+
+ static inline void clear_wr_nonspinnable(struct rw_semaphore *sem) { }
+
+ static inline bool rwsem_reader_phase_trylock(struct rw_semaphore *sem,
+					       unsigned long last_rowner)
+ {
+	return false;
+ }
+ #endif
+
+ /*
+ * Wait for the read lock to be granted
+ */
+ static struct rw_semaphore __sched *
+ rwsem_down_read_slowpath(struct rw_semaphore *sem, int state)
+ {
+	long count, adjustment = -RWSEM_READER_BIAS;
+	struct rwsem_waiter waiter;
+	DEFINE_WAKE_Q(wake_q);
+	bool wake = false;
+
+	/*
+	 * Save the current read-owner of rwsem, if available, and the
+	 * reader nonspinnable bit.
+	 */
+	waiter.last_rowner = atomic_long_read(&sem->owner);
+	if (!(waiter.last_rowner & RWSEM_READER_OWNED))
+		waiter.last_rowner &= RWSEM_RD_NONSPINNABLE;
+
+	if (!rwsem_can_spin_on_owner(sem, RWSEM_RD_NONSPINNABLE))
+		goto queue;
+
+	/*
+	 * Undo read bias from down_read() and do optimistic spinning.
+	 */
+	atomic_long_add(-RWSEM_READER_BIAS, &sem->count);
+	adjustment = 0;
+	if (rwsem_optimistic_spin(sem, false)) {
+		/*
+		 * Wake up other readers in the wait list if the front
+		 * waiter is a reader.
+		 */
+		if ((atomic_long_read(&sem->count) & RWSEM_FLAG_WAITERS)) {
+			raw_spin_lock_irq(&sem->wait_lock);
+			if (!list_empty(&sem->wait_list))
+				rwsem_mark_wake(sem, RWSEM_WAKE_READ_OWNED,
+						&wake_q);
+			raw_spin_unlock_irq(&sem->wait_lock);
+			wake_up_q(&wake_q);
+		}
+		return sem;
+	} else if (rwsem_reader_phase_trylock(sem, waiter.last_rowner)) {
+		return sem;
+	}
+
+ queue:
+	waiter.task = current;
+	waiter.type = RWSEM_WAITING_FOR_READ;
+	waiter.timeout = jiffies + RWSEM_WAIT_TIMEOUT;
+
+	raw_spin_lock_irq(&sem->wait_lock);
+	if (list_empty(&sem->wait_list)) {
+		/*
+		 * In case the wait queue is empty and the lock isn't owned
+		 * by a writer or has the handoff bit set, this reader can
+		 * exit the slowpath and return immediately as its
+		 * RWSEM_READER_BIAS has already been set in the count.
+		 */
+		if (adjustment && !(atomic_long_read(&sem->count) &
+		     (RWSEM_WRITER_MASK | RWSEM_FLAG_HANDOFF))) {
+			raw_spin_unlock_irq(&sem->wait_lock);
+			rwsem_set_reader_owned(sem);
+			lockevent_inc(rwsem_rlock_fast);
+			return sem;
+		}
+		adjustment += RWSEM_FLAG_WAITERS;
+	}
+	list_add_tail(&waiter.list, &sem->wait_list);
+
+	/* we're now waiting on the lock, but no longer actively locking */
+	if (adjustment)
+		count = atomic_long_add_return(adjustment, &sem->count);
+	else
+		count = atomic_long_read(&sem->count);
+
+	/*
+	 * If there are no active locks, wake the front queued process(es).
+	 *
+	 * If there are no writers and we are first in the queue,
+	 * wake our own waiter to join the existing active readers!
1055 + */ 1056 + if (!(count & RWSEM_LOCK_MASK)) { 1057 + clear_wr_nonspinnable(sem); 1058 + wake = true; 1059 + } 1060 + if (wake || (!(count & RWSEM_WRITER_MASK) && 1061 + (adjustment & RWSEM_FLAG_WAITERS))) 1062 + rwsem_mark_wake(sem, RWSEM_WAKE_ANY, &wake_q); 1063 + 1064 + raw_spin_unlock_irq(&sem->wait_lock); 1065 + wake_up_q(&wake_q); 1066 + 1067 + /* wait to be given the lock */ 1068 + while (true) { 1069 + set_current_state(state); 1070 + if (!waiter.task) 1071 + break; 1072 + if (signal_pending_state(state, current)) { 1073 + raw_spin_lock_irq(&sem->wait_lock); 1074 + if (waiter.task) 1075 + goto out_nolock; 1076 + raw_spin_unlock_irq(&sem->wait_lock); 1077 + break; 1078 + } 1079 + schedule(); 1080 + lockevent_inc(rwsem_sleep_reader); 1081 + } 1082 + 1083 + __set_current_state(TASK_RUNNING); 1084 + lockevent_inc(rwsem_rlock); 1085 + return sem; 1086 + out_nolock: 1087 + list_del(&waiter.list); 1088 + if (list_empty(&sem->wait_list)) { 1089 + atomic_long_andnot(RWSEM_FLAG_WAITERS|RWSEM_FLAG_HANDOFF, 1090 + &sem->count); 1091 + } 1092 + raw_spin_unlock_irq(&sem->wait_lock); 1093 + __set_current_state(TASK_RUNNING); 1094 + lockevent_inc(rwsem_rlock_fail); 1095 + return ERR_PTR(-EINTR); 1096 + } 1097 + 1098 + /* 1099 + * This function is called by the a write lock owner. So the owner value 1100 + * won't get changed by others. 
1101 + */ 1102 + static inline void rwsem_disable_reader_optspin(struct rw_semaphore *sem, 1103 + bool disable) 1104 + { 1105 + if (unlikely(disable)) { 1106 + atomic_long_or(RWSEM_RD_NONSPINNABLE, &sem->owner); 1107 + lockevent_inc(rwsem_opt_norspin); 1108 + } 1109 + } 1110 + 1111 + /* 1112 + * Wait until we successfully acquire the write lock 1113 + */ 1114 + static struct rw_semaphore * 1115 + rwsem_down_write_slowpath(struct rw_semaphore *sem, int state) 1116 + { 1117 + long count; 1118 + bool disable_rspin; 1119 + enum writer_wait_state wstate; 1120 + struct rwsem_waiter waiter; 1121 + struct rw_semaphore *ret = sem; 1122 + DEFINE_WAKE_Q(wake_q); 1123 + 1124 + /* do optimistic spinning and steal lock if possible */ 1125 + if (rwsem_can_spin_on_owner(sem, RWSEM_WR_NONSPINNABLE) && 1126 + rwsem_optimistic_spin(sem, true)) 1127 + return sem; 1128 + 1129 + /* 1130 + * Disable reader optimistic spinning for this rwsem after 1131 + * acquiring the write lock when the setting of the nonspinnable 1132 + * bits are observed. 1133 + */ 1134 + disable_rspin = atomic_long_read(&sem->owner) & RWSEM_NONSPINNABLE; 1135 + 1136 + /* 1137 + * Optimistic spinning failed, proceed to the slowpath 1138 + * and block until we can acquire the sem. 1139 + */ 1140 + waiter.task = current; 1141 + waiter.type = RWSEM_WAITING_FOR_WRITE; 1142 + waiter.timeout = jiffies + RWSEM_WAIT_TIMEOUT; 1143 + 1144 + raw_spin_lock_irq(&sem->wait_lock); 1145 + 1146 + /* account for this before adding a new element to the list */ 1147 + wstate = list_empty(&sem->wait_list) ? WRITER_FIRST : WRITER_NOT_FIRST; 1148 + 1149 + list_add_tail(&waiter.list, &sem->wait_list); 1150 + 1151 + /* we're now waiting on the lock */ 1152 + if (wstate == WRITER_NOT_FIRST) { 1153 + count = atomic_long_read(&sem->count); 1154 + 1155 + /* 1156 + * If there were already threads queued before us and: 1157 + * 1) there are no no active locks, wake the front 1158 + * queued process(es) as the handoff bit might be set. 
1159 + * 2) there are no active writers and some readers, the lock 1160 + * must be read owned; so we try to wake any read lock 1161 + * waiters that were queued ahead of us. 1162 + */ 1163 + if (count & RWSEM_WRITER_MASK) 1164 + goto wait; 1165 + 1166 + rwsem_mark_wake(sem, (count & RWSEM_READER_MASK) 1167 + ? RWSEM_WAKE_READERS 1168 + : RWSEM_WAKE_ANY, &wake_q); 1169 + 1170 + if (!wake_q_empty(&wake_q)) { 1171 + /* 1172 + * We want to minimize wait_lock hold time especially 1173 + * when a large number of readers are to be woken up. 1174 + */ 1175 + raw_spin_unlock_irq(&sem->wait_lock); 1176 + wake_up_q(&wake_q); 1177 + wake_q_init(&wake_q); /* Used again, reinit */ 1178 + raw_spin_lock_irq(&sem->wait_lock); 1179 + } 1180 + } else { 1181 + atomic_long_or(RWSEM_FLAG_WAITERS, &sem->count); 1182 + } 1183 + 1184 + wait: 1185 + /* wait until we successfully acquire the lock */ 1186 + set_current_state(state); 1187 + while (true) { 1188 + if (rwsem_try_write_lock(sem, wstate)) 1189 + break; 1190 + 1191 + raw_spin_unlock_irq(&sem->wait_lock); 1192 + 1193 + /* Block until there are no active lockers. */ 1194 + for (;;) { 1195 + if (signal_pending_state(state, current)) 1196 + goto out_nolock; 1197 + 1198 + schedule(); 1199 + lockevent_inc(rwsem_sleep_writer); 1200 + set_current_state(state); 1201 + /* 1202 + * If HANDOFF bit is set, unconditionally do 1203 + * a trylock. 1204 + */ 1205 + if (wstate == WRITER_HANDOFF) 1206 + break; 1207 + 1208 + if ((wstate == WRITER_NOT_FIRST) && 1209 + (rwsem_first_waiter(sem) == &waiter)) 1210 + wstate = WRITER_FIRST; 1211 + 1212 + count = atomic_long_read(&sem->count); 1213 + if (!(count & RWSEM_LOCK_MASK)) 1214 + break; 1215 + 1216 + /* 1217 + * The setting of the handoff bit is deferred 1218 + * until rwsem_try_write_lock() is called. 
1219 + */ 1220 + if ((wstate == WRITER_FIRST) && (rt_task(current) || 1221 + time_after(jiffies, waiter.timeout))) { 1222 + wstate = WRITER_HANDOFF; 1223 + lockevent_inc(rwsem_wlock_handoff); 1224 + break; 1225 + } 1226 + } 1227 + 1228 + raw_spin_lock_irq(&sem->wait_lock); 1229 + } 1230 + __set_current_state(TASK_RUNNING); 1231 + list_del(&waiter.list); 1232 + rwsem_disable_reader_optspin(sem, disable_rspin); 1233 + raw_spin_unlock_irq(&sem->wait_lock); 1234 + lockevent_inc(rwsem_wlock); 1235 + 1236 + return ret; 1237 + 1238 + out_nolock: 1239 + __set_current_state(TASK_RUNNING); 1240 + raw_spin_lock_irq(&sem->wait_lock); 1241 + list_del(&waiter.list); 1242 + 1243 + if (unlikely(wstate == WRITER_HANDOFF)) 1244 + atomic_long_add(-RWSEM_FLAG_HANDOFF, &sem->count); 1245 + 1246 + if (list_empty(&sem->wait_list)) 1247 + atomic_long_andnot(RWSEM_FLAG_WAITERS, &sem->count); 1248 + else 1249 + rwsem_mark_wake(sem, RWSEM_WAKE_ANY, &wake_q); 1250 + raw_spin_unlock_irq(&sem->wait_lock); 1251 + wake_up_q(&wake_q); 1252 + lockevent_inc(rwsem_wlock_fail); 1253 + 1254 + return ERR_PTR(-EINTR); 1255 + } 1256 + 1257 + /* 1258 + * handle waking up a waiter on the semaphore 1259 + * - up_read/up_write has decremented the active part of count if we come here 1260 + */ 1261 + static struct rw_semaphore *rwsem_wake(struct rw_semaphore *sem, long count) 1262 + { 1263 + unsigned long flags; 1264 + DEFINE_WAKE_Q(wake_q); 1265 + 1266 + raw_spin_lock_irqsave(&sem->wait_lock, flags); 1267 + 1268 + if (!list_empty(&sem->wait_list)) 1269 + rwsem_mark_wake(sem, RWSEM_WAKE_ANY, &wake_q); 1270 + 1271 + raw_spin_unlock_irqrestore(&sem->wait_lock, flags); 1272 + wake_up_q(&wake_q); 1273 + 1274 + return sem; 1275 + } 1276 + 1277 + /* 1278 + * downgrade a write lock into a read lock 1279 + * - caller incremented waiting part of count and discovered it still negative 1280 + * - just wake up any readers at the front of the queue 1281 + */ 1282 + static struct rw_semaphore *rwsem_downgrade_wake(struct 
rw_semaphore *sem) 1283 + { 1284 + unsigned long flags; 1285 + DEFINE_WAKE_Q(wake_q); 1286 + 1287 + raw_spin_lock_irqsave(&sem->wait_lock, flags); 1288 + 1289 + if (!list_empty(&sem->wait_list)) 1290 + rwsem_mark_wake(sem, RWSEM_WAKE_READ_OWNED, &wake_q); 1291 + 1292 + raw_spin_unlock_irqrestore(&sem->wait_lock, flags); 1293 + wake_up_q(&wake_q); 1294 + 1295 + return sem; 1296 + } 1297 + 1298 + /* 1299 + * lock for reading 1300 + */ 1301 + inline void __down_read(struct rw_semaphore *sem) 1302 + { 1303 + if (!rwsem_read_trylock(sem)) { 1304 + rwsem_down_read_slowpath(sem, TASK_UNINTERRUPTIBLE); 1305 + DEBUG_RWSEMS_WARN_ON(!is_rwsem_reader_owned(sem), sem); 1306 + } else { 1307 + rwsem_set_reader_owned(sem); 1308 + } 1309 + } 1310 + 1311 + static inline int __down_read_killable(struct rw_semaphore *sem) 1312 + { 1313 + if (!rwsem_read_trylock(sem)) { 1314 + if (IS_ERR(rwsem_down_read_slowpath(sem, TASK_KILLABLE))) 1315 + return -EINTR; 1316 + DEBUG_RWSEMS_WARN_ON(!is_rwsem_reader_owned(sem), sem); 1317 + } else { 1318 + rwsem_set_reader_owned(sem); 1319 + } 1320 + return 0; 1321 + } 1322 + 1323 + static inline int __down_read_trylock(struct rw_semaphore *sem) 1324 + { 1325 + /* 1326 + * Optimize for the case when the rwsem is not locked at all. 
1327 + */ 1328 + long tmp = RWSEM_UNLOCKED_VALUE; 1329 + 1330 + do { 1331 + if (atomic_long_try_cmpxchg_acquire(&sem->count, &tmp, 1332 + tmp + RWSEM_READER_BIAS)) { 1333 + rwsem_set_reader_owned(sem); 1334 + return 1; 1335 + } 1336 + } while (!(tmp & RWSEM_READ_FAILED_MASK)); 1337 + return 0; 1338 + } 1339 + 1340 + /* 1341 + * lock for writing 1342 + */ 1343 + static inline void __down_write(struct rw_semaphore *sem) 1344 + { 1345 + long tmp = RWSEM_UNLOCKED_VALUE; 1346 + 1347 + if (unlikely(!atomic_long_try_cmpxchg_acquire(&sem->count, &tmp, 1348 + RWSEM_WRITER_LOCKED))) 1349 + rwsem_down_write_slowpath(sem, TASK_UNINTERRUPTIBLE); 1350 + else 1351 + rwsem_set_owner(sem); 1352 + } 1353 + 1354 + static inline int __down_write_killable(struct rw_semaphore *sem) 1355 + { 1356 + long tmp = RWSEM_UNLOCKED_VALUE; 1357 + 1358 + if (unlikely(!atomic_long_try_cmpxchg_acquire(&sem->count, &tmp, 1359 + RWSEM_WRITER_LOCKED))) { 1360 + if (IS_ERR(rwsem_down_write_slowpath(sem, TASK_KILLABLE))) 1361 + return -EINTR; 1362 + } else { 1363 + rwsem_set_owner(sem); 1364 + } 1365 + return 0; 1366 + } 1367 + 1368 + static inline int __down_write_trylock(struct rw_semaphore *sem) 1369 + { 1370 + long tmp = RWSEM_UNLOCKED_VALUE; 1371 + 1372 + if (atomic_long_try_cmpxchg_acquire(&sem->count, &tmp, 1373 + RWSEM_WRITER_LOCKED)) { 1374 + rwsem_set_owner(sem); 1375 + return true; 1376 + } 1377 + return false; 1378 + } 1379 + 1380 + /* 1381 + * unlock after reading 1382 + */ 1383 + inline void __up_read(struct rw_semaphore *sem) 1384 + { 1385 + long tmp; 1386 + 1387 + DEBUG_RWSEMS_WARN_ON(!is_rwsem_reader_owned(sem), sem); 1388 + rwsem_clear_reader_owned(sem); 1389 + tmp = atomic_long_add_return_release(-RWSEM_READER_BIAS, &sem->count); 1390 + DEBUG_RWSEMS_WARN_ON(tmp < 0, sem); 1391 + if (unlikely((tmp & (RWSEM_LOCK_MASK|RWSEM_FLAG_WAITERS)) == 1392 + RWSEM_FLAG_WAITERS)) { 1393 + clear_wr_nonspinnable(sem); 1394 + rwsem_wake(sem, tmp); 1395 + } 1396 + } 1397 + 1398 + /* 1399 + * unlock 
after writing 1400 + */ 1401 + static inline void __up_write(struct rw_semaphore *sem) 1402 + { 1403 + long tmp; 1404 + 1405 + /* 1406 + * sem->owner may differ from current if the ownership is transferred 1407 + * to an anonymous writer by setting the RWSEM_NONSPINNABLE bits. 1408 + */ 1409 + DEBUG_RWSEMS_WARN_ON((rwsem_owner(sem) != current) && 1410 + !rwsem_test_oflags(sem, RWSEM_NONSPINNABLE), sem); 1411 + rwsem_clear_owner(sem); 1412 + tmp = atomic_long_fetch_add_release(-RWSEM_WRITER_LOCKED, &sem->count); 1413 + if (unlikely(tmp & RWSEM_FLAG_WAITERS)) 1414 + rwsem_wake(sem, tmp); 1415 + } 1416 + 1417 + /* 1418 + * downgrade write lock to read lock 1419 + */ 1420 + static inline void __downgrade_write(struct rw_semaphore *sem) 1421 + { 1422 + long tmp; 1423 + 1424 + /* 1425 + * When downgrading from exclusive to shared ownership, 1426 + * anything inside the write-locked region cannot leak 1427 + * into the read side. In contrast, anything in the 1428 + * read-locked region is ok to be re-ordered into the 1429 + * write side. As such, rely on RELEASE semantics. 
1430 + */ 1431 + DEBUG_RWSEMS_WARN_ON(rwsem_owner(sem) != current, sem); 1432 + tmp = atomic_long_fetch_add_release( 1433 + -RWSEM_WRITER_LOCKED+RWSEM_READER_BIAS, &sem->count); 1434 + rwsem_set_reader_owned(sem); 1435 + if (tmp & RWSEM_FLAG_WAITERS) 1436 + rwsem_downgrade_wake(sem); 1437 + } 17 1438 18 1439 /* 19 1440 * lock for reading ··· 1446 25 1447 26 LOCK_CONTENDED(sem, __down_read_trylock, __down_read); 1448 27 } 1449 - 1450 28 EXPORT_SYMBOL(down_read); 1451 29 1452 30 int __sched down_read_killable(struct rw_semaphore *sem) ··· 1460 40 1461 41 return 0; 1462 42 } 1463 - 1464 43 EXPORT_SYMBOL(down_read_killable); 1465 44 1466 45 /* ··· 1473 54 rwsem_acquire_read(&sem->dep_map, 0, 1, _RET_IP_); 1474 55 return ret; 1475 56 } 1476 - 1477 57 EXPORT_SYMBOL(down_read_trylock); 1478 58 1479 59 /* ··· 1482 64 { 1483 65 might_sleep(); 1484 66 rwsem_acquire(&sem->dep_map, 0, 0, _RET_IP_); 1485 - 1486 67 LOCK_CONTENDED(sem, __down_write_trylock, __down_write); 1487 68 } 1488 - 1489 69 EXPORT_SYMBOL(down_write); 1490 70 1491 71 /* ··· 1494 78 might_sleep(); 1495 79 rwsem_acquire(&sem->dep_map, 0, 0, _RET_IP_); 1496 80 1497 - if (LOCK_CONTENDED_RETURN(sem, __down_write_trylock, __down_write_killable)) { 81 + if (LOCK_CONTENDED_RETURN(sem, __down_write_trylock, 82 + __down_write_killable)) { 1498 83 rwsem_release(&sem->dep_map, 1, _RET_IP_); 1499 84 return -EINTR; 1500 85 } 1501 86 1502 87 return 0; 1503 88 } 1504 - 1505 89 EXPORT_SYMBOL(down_write_killable); 1506 90 1507 91 /* ··· 1516 100 1517 101 return ret; 1518 102 } 1519 - 1520 103 EXPORT_SYMBOL(down_write_trylock); 1521 104 1522 105 /* ··· 1524 109 void up_read(struct rw_semaphore *sem) 1525 110 { 1526 111 rwsem_release(&sem->dep_map, 1, _RET_IP_); 1527 - 1528 112 __up_read(sem); 1529 113 } 1530 - 1531 114 EXPORT_SYMBOL(up_read); 1532 115 1533 116 /* ··· 1534 121 void up_write(struct rw_semaphore *sem) 1535 122 { 1536 123 rwsem_release(&sem->dep_map, 1, _RET_IP_); 1537 - 1538 124 __up_write(sem); 1539 125 } 1540 - 
1541 126 EXPORT_SYMBOL(up_write); 1542 127 1543 128 /* ··· 1544 133 void downgrade_write(struct rw_semaphore *sem) 1545 134 { 1546 135 lock_downgrade(&sem->dep_map, _RET_IP_); 1547 - 1548 136 __downgrade_write(sem); 1549 137 } 1550 - 1551 138 EXPORT_SYMBOL(downgrade_write); 1552 139 1553 140 #ifdef CONFIG_DEBUG_LOCK_ALLOC ··· 1554 145 { 1555 146 might_sleep(); 1556 147 rwsem_acquire_read(&sem->dep_map, subclass, 0, _RET_IP_); 1557 - 1558 148 LOCK_CONTENDED(sem, __down_read_trylock, __down_read); 1559 149 } 1560 - 1561 150 EXPORT_SYMBOL(down_read_nested); 1562 151 1563 152 void _down_write_nest_lock(struct rw_semaphore *sem, struct lockdep_map *nest) 1564 153 { 1565 154 might_sleep(); 1566 155 rwsem_acquire_nest(&sem->dep_map, 0, 0, nest, _RET_IP_); 1567 - 1568 156 LOCK_CONTENDED(sem, __down_write_trylock, __down_write); 1569 157 } 1570 - 1571 158 EXPORT_SYMBOL(_down_write_nest_lock); 1572 159 1573 160 void down_read_non_owner(struct rw_semaphore *sem) 1574 161 { 1575 162 might_sleep(); 1576 - 1577 163 __down_read(sem); 1578 164 __rwsem_set_reader_owned(sem, NULL); 1579 165 } 1580 - 1581 166 EXPORT_SYMBOL(down_read_non_owner); 1582 167 1583 168 void down_write_nested(struct rw_semaphore *sem, int subclass) 1584 169 { 1585 170 might_sleep(); 1586 171 rwsem_acquire(&sem->dep_map, subclass, 0, _RET_IP_); 1587 - 1588 172 LOCK_CONTENDED(sem, __down_write_trylock, __down_write); 1589 173 } 1590 - 1591 174 EXPORT_SYMBOL(down_write_nested); 1592 175 1593 176 int __sched down_write_killable_nested(struct rw_semaphore *sem, int subclass) ··· 1587 186 might_sleep(); 1588 187 rwsem_acquire(&sem->dep_map, subclass, 0, _RET_IP_); 1589 188 1590 - if (LOCK_CONTENDED_RETURN(sem, __down_write_trylock, __down_write_killable)) { 189 + if (LOCK_CONTENDED_RETURN(sem, __down_write_trylock, 190 + __down_write_killable)) { 1591 191 rwsem_release(&sem->dep_map, 1, _RET_IP_); 1592 192 return -EINTR; 1593 193 } 1594 194 1595 195 return 0; 1596 196 } 1597 - 1598 197 
EXPORT_SYMBOL(down_write_killable_nested); 1599 198 1600 199 void up_read_non_owner(struct rw_semaphore *sem) 1601 200 { 1602 - DEBUG_RWSEMS_WARN_ON(!((unsigned long)sem->owner & RWSEM_READER_OWNED), 1603 - sem); 201 + DEBUG_RWSEMS_WARN_ON(!is_rwsem_reader_owned(sem), sem); 1604 202 __up_read(sem); 1605 203 } 1606 - 1607 204 EXPORT_SYMBOL(up_read_non_owner); 1608 205 1609 206 #endif
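For orientation when reading the fast paths above: the rewritten rwsem packs writer ownership, waiter/handoff flags, and a reader count into the single `count` word, with readers adding `RWSEM_READER_BIAS` and a writer setting `RWSEM_WRITER_LOCKED`. The sketch below models that arithmetic in plain C; the exact bit positions are my reading of the patchset and should be treated as illustrative, not authoritative.

```c
#include <assert.h>

/* Illustrative bit layout for the new rwsem count word (positions are
 * assumptions based on the patchset, not copied from the tree). */
#define RWSEM_WRITER_LOCKED	(1UL << 0)
#define RWSEM_FLAG_WAITERS	(1UL << 1)
#define RWSEM_FLAG_HANDOFF	(1UL << 2)
#define RWSEM_READER_SHIFT	8
#define RWSEM_READER_BIAS	(1UL << RWSEM_READER_SHIFT)
#define RWSEM_READER_MASK	(~(RWSEM_READER_BIAS - 1))
#define RWSEM_WRITER_MASK	RWSEM_WRITER_LOCKED
#define RWSEM_LOCK_MASK		(RWSEM_WRITER_MASK | RWSEM_READER_MASK)

/* A reader "locks" by adding RWSEM_READER_BIAS to the count word. */
unsigned long reader_lock(unsigned long count)
{
	return count + RWSEM_READER_BIAS;
}

/* The lock is free for a new reader if no writer/handoff bits are set. */
int reader_can_fast_path(unsigned long count)
{
	return !(count & (RWSEM_WRITER_MASK | RWSEM_FLAG_HANDOFF));
}
```

Note how the flag bits sit below `RWSEM_READER_SHIFT`, so they never collide with the reader count and are excluded from `RWSEM_LOCK_MASK`.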
+6 -300
kernel/locking/rwsem.h
···
 /* SPDX-License-Identifier: GPL-2.0 */
-/*
- * The least significant 2 bits of the owner value has the following
- * meanings when set.
- *  - RWSEM_READER_OWNED (bit 0): The rwsem is owned by readers
- *  - RWSEM_ANONYMOUSLY_OWNED (bit 1): The rwsem is anonymously owned,
- *    i.e. the owner(s) cannot be readily determined. It can be reader
- *    owned or the owning writer is indeterminate.
- *
- * When a writer acquires a rwsem, it puts its task_struct pointer
- * into the owner field. It is cleared after an unlock.
- *
- * When a reader acquires a rwsem, it will also puts its task_struct
- * pointer into the owner field with both the RWSEM_READER_OWNED and
- * RWSEM_ANONYMOUSLY_OWNED bits set. On unlock, the owner field will
- * largely be left untouched. So for a free or reader-owned rwsem,
- * the owner value may contain information about the last reader that
- * acquires the rwsem. The anonymous bit is set because that particular
- * reader may or may not still own the lock.
- *
- * That information may be helpful in debugging cases where the system
- * seems to hang on a reader owned rwsem especially if only one reader
- * is involved. Ideally we would like to track all the readers that own
- * a rwsem, but the overhead is simply too big.
- */
-#include "lock_events.h"
 
-#define RWSEM_READER_OWNED	(1UL << 0)
-#define RWSEM_ANONYMOUSLY_OWNED	(1UL << 1)
+#ifndef __INTERNAL_RWSEM_H
+#define __INTERNAL_RWSEM_H
+#include <linux/rwsem.h>
 
-#ifdef CONFIG_DEBUG_RWSEMS
-# define DEBUG_RWSEMS_WARN_ON(c, sem)	do {			\
-	if (!debug_locks_silent &&				\
-	    WARN_ONCE(c, "DEBUG_RWSEMS_WARN_ON(%s): count = 0x%lx, owner = 0x%lx, curr 0x%lx, list %sempty\n",\
-		#c, atomic_long_read(&(sem)->count),		\
-		(long)((sem)->owner), (long)current,		\
-		list_empty(&(sem)->wait_list) ? "" : "not "))	\
-			debug_locks_off();			\
-	} while (0)
-#else
-# define DEBUG_RWSEMS_WARN_ON(c, sem)
-#endif
+extern void __down_read(struct rw_semaphore *sem);
+extern void __up_read(struct rw_semaphore *sem);
 
-/*
- * R/W semaphores originally for PPC using the stuff in lib/rwsem.c.
- * Adapted largely from include/asm-i386/rwsem.h
- * by Paul Mackerras <paulus@samba.org>.
- */
-
-/*
- * the semaphore definition
- */
-#ifdef CONFIG_64BIT
-# define RWSEM_ACTIVE_MASK		0xffffffffL
-#else
-# define RWSEM_ACTIVE_MASK		0x0000ffffL
-#endif
-
-#define RWSEM_ACTIVE_BIAS		0x00000001L
-#define RWSEM_WAITING_BIAS		(-RWSEM_ACTIVE_MASK-1)
-#define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
-#define RWSEM_ACTIVE_WRITE_BIAS	(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
-
-#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
-/*
- * All writes to owner are protected by WRITE_ONCE() to make sure that
- * store tearing can't happen as optimistic spinners may read and use
- * the owner value concurrently without lock. Read from owner, however,
- * may not need READ_ONCE() as long as the pointer value is only used
- * for comparison and isn't being dereferenced.
- */
-static inline void rwsem_set_owner(struct rw_semaphore *sem)
-{
-	WRITE_ONCE(sem->owner, current);
-}
-
-static inline void rwsem_clear_owner(struct rw_semaphore *sem)
-{
-	WRITE_ONCE(sem->owner, NULL);
-}
-
-/*
- * The task_struct pointer of the last owning reader will be left in
- * the owner field.
- *
- * Note that the owner value just indicates the task has owned the rwsem
- * previously, it may not be the real owner or one of the real owners
- * anymore when that field is examined, so take it with a grain of salt.
- */
-static inline void __rwsem_set_reader_owned(struct rw_semaphore *sem,
-					    struct task_struct *owner)
-{
-	unsigned long val = (unsigned long)owner | RWSEM_READER_OWNED
-						 | RWSEM_ANONYMOUSLY_OWNED;
-
-	WRITE_ONCE(sem->owner, (struct task_struct *)val);
-}
-
-static inline void rwsem_set_reader_owned(struct rw_semaphore *sem)
-{
-	__rwsem_set_reader_owned(sem, current);
-}
-
-/*
- * Return true if the a rwsem waiter can spin on the rwsem's owner
- * and steal the lock, i.e. the lock is not anonymously owned.
- * N.B. !owner is considered spinnable.
- */
-static inline bool is_rwsem_owner_spinnable(struct task_struct *owner)
-{
-	return !((unsigned long)owner & RWSEM_ANONYMOUSLY_OWNED);
-}
-
-/*
- * Return true if rwsem is owned by an anonymous writer or readers.
- */
-static inline bool rwsem_has_anonymous_owner(struct task_struct *owner)
-{
-	return (unsigned long)owner & RWSEM_ANONYMOUSLY_OWNED;
-}
-
-#ifdef CONFIG_DEBUG_RWSEMS
-/*
- * With CONFIG_DEBUG_RWSEMS configured, it will make sure that if there
- * is a task pointer in owner of a reader-owned rwsem, it will be the
- * real owner or one of the real owners. The only exception is when the
- * unlock is done by up_read_non_owner().
- */
-#define rwsem_clear_reader_owned rwsem_clear_reader_owned
-static inline void rwsem_clear_reader_owned(struct rw_semaphore *sem)
-{
-	unsigned long val = (unsigned long)current | RWSEM_READER_OWNED
-						   | RWSEM_ANONYMOUSLY_OWNED;
-	if (READ_ONCE(sem->owner) == (struct task_struct *)val)
-		cmpxchg_relaxed((unsigned long *)&sem->owner, val,
-				RWSEM_READER_OWNED | RWSEM_ANONYMOUSLY_OWNED);
-}
-#endif
-
-#else
-static inline void rwsem_set_owner(struct rw_semaphore *sem)
-{
-}
-
-static inline void rwsem_clear_owner(struct rw_semaphore *sem)
-{
-}
-
-static inline void __rwsem_set_reader_owned(struct rw_semaphore *sem,
-					    struct task_struct *owner)
-{
-}
-
-static inline void rwsem_set_reader_owned(struct rw_semaphore *sem)
-{
-}
-#endif
-
-#ifndef rwsem_clear_reader_owned
-static inline void rwsem_clear_reader_owned(struct rw_semaphore *sem)
-{
-}
-#endif
-
-extern struct rw_semaphore *rwsem_down_read_failed(struct rw_semaphore *sem);
-extern struct rw_semaphore *rwsem_down_read_failed_killable(struct rw_semaphore *sem);
-extern struct rw_semaphore *rwsem_down_write_failed(struct rw_semaphore *sem);
-extern struct rw_semaphore *rwsem_down_write_failed_killable(struct rw_semaphore *sem);
-extern struct rw_semaphore *rwsem_wake(struct rw_semaphore *sem);
-extern struct rw_semaphore *rwsem_downgrade_wake(struct rw_semaphore *sem);
-
-/*
- * lock for reading
- */
-static inline void __down_read(struct rw_semaphore *sem)
-{
-	if (unlikely(atomic_long_inc_return_acquire(&sem->count) <= 0)) {
-		rwsem_down_read_failed(sem);
-		DEBUG_RWSEMS_WARN_ON(!((unsigned long)sem->owner &
-					RWSEM_READER_OWNED), sem);
-	} else {
-		rwsem_set_reader_owned(sem);
-	}
-}
-
-static inline int __down_read_killable(struct rw_semaphore *sem)
-{
-	if (unlikely(atomic_long_inc_return_acquire(&sem->count) <= 0)) {
-		if (IS_ERR(rwsem_down_read_failed_killable(sem)))
-			return -EINTR;
-		DEBUG_RWSEMS_WARN_ON(!((unsigned long)sem->owner &
-					RWSEM_READER_OWNED), sem);
-	} else {
-		rwsem_set_reader_owned(sem);
-	}
-	return 0;
-}
-
-static inline int __down_read_trylock(struct rw_semaphore *sem)
-{
-	/*
-	 * Optimize for the case when the rwsem is not locked at all.
-	 */
-	long tmp = RWSEM_UNLOCKED_VALUE;
-
-	lockevent_inc(rwsem_rtrylock);
-	do {
-		if (atomic_long_try_cmpxchg_acquire(&sem->count, &tmp,
-					tmp + RWSEM_ACTIVE_READ_BIAS)) {
-			rwsem_set_reader_owned(sem);
-			return 1;
-		}
-	} while (tmp >= 0);
-	return 0;
-}
-
-/*
- * lock for writing
- */
-static inline void __down_write(struct rw_semaphore *sem)
-{
-	long tmp;
-
-	tmp = atomic_long_add_return_acquire(RWSEM_ACTIVE_WRITE_BIAS,
-					     &sem->count);
-	if (unlikely(tmp != RWSEM_ACTIVE_WRITE_BIAS))
-		rwsem_down_write_failed(sem);
-	rwsem_set_owner(sem);
-}
-
-static inline int __down_write_killable(struct rw_semaphore *sem)
-{
-	long tmp;
-
-	tmp = atomic_long_add_return_acquire(RWSEM_ACTIVE_WRITE_BIAS,
-					     &sem->count);
-	if (unlikely(tmp != RWSEM_ACTIVE_WRITE_BIAS))
-		if (IS_ERR(rwsem_down_write_failed_killable(sem)))
-			return -EINTR;
-	rwsem_set_owner(sem);
-	return 0;
-}
-
-static inline int __down_write_trylock(struct rw_semaphore *sem)
-{
-	long tmp;
-
-	lockevent_inc(rwsem_wtrylock);
-	tmp = atomic_long_cmpxchg_acquire(&sem->count, RWSEM_UNLOCKED_VALUE,
-					  RWSEM_ACTIVE_WRITE_BIAS);
-	if (tmp == RWSEM_UNLOCKED_VALUE) {
-		rwsem_set_owner(sem);
-		return true;
-	}
-	return false;
-}
-
-/*
- * unlock after reading
- */
-static inline void __up_read(struct rw_semaphore *sem)
-{
-	long tmp;
-
-	DEBUG_RWSEMS_WARN_ON(!((unsigned long)sem->owner & RWSEM_READER_OWNED),
-				sem);
-	rwsem_clear_reader_owned(sem);
-	tmp = atomic_long_dec_return_release(&sem->count);
-	if (unlikely(tmp < -1 && (tmp & RWSEM_ACTIVE_MASK) == 0))
-		rwsem_wake(sem);
-}
-
-/*
- * unlock after writing
- */
-static inline void __up_write(struct rw_semaphore *sem)
-{
-	DEBUG_RWSEMS_WARN_ON(sem->owner != current, sem);
-	rwsem_clear_owner(sem);
-	if (unlikely(atomic_long_sub_return_release(RWSEM_ACTIVE_WRITE_BIAS,
-						    &sem->count) < 0))
-		rwsem_wake(sem);
-}
-
-/*
- * downgrade write lock to read lock
- */
-static inline void __downgrade_write(struct rw_semaphore *sem)
-{
-	long tmp;
-
-	/*
-	 * When downgrading from exclusive to shared ownership,
-	 * anything inside the write-locked region cannot leak
-	 * into the read side. In contrast, anything in the
-	 * read-locked region is ok to be re-ordered into the
-	 * write side. As such, rely on RELEASE semantics.
-	 */
-	DEBUG_RWSEMS_WARN_ON(sem->owner != current, sem);
-	tmp = atomic_long_add_return_release(-RWSEM_WAITING_BIAS, &sem->count);
-	rwsem_set_reader_owned(sem);
-	if (tmp < 0)
-		rwsem_downgrade_wake(sem);
-}
+#endif /* __INTERNAL_RWSEM_H */
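The header deleted above implemented owner tracking by tagging the low bits of the `owner` task pointer with `RWSEM_READER_OWNED` and `RWSEM_ANONYMOUSLY_OWNED`. A minimal model of that pointer-tagging trick, using the same bit names as the removed code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low-bit tags from the removed rwsem.h; task_struct pointers are
 * at least 4-byte aligned, so the bottom two bits are free. */
#define RWSEM_READER_OWNED	(1UL << 0)
#define RWSEM_ANONYMOUSLY_OWNED	(1UL << 1)

/* A reader stores its (aligned) task pointer with both tag bits set. */
uintptr_t tag_reader_owner(uintptr_t task)
{
	return task | RWSEM_READER_OWNED | RWSEM_ANONYMOUSLY_OWNED;
}

/* Spinning on the owner is only safe when the owner is not anonymous;
 * a NULL owner (0) counts as spinnable, mirroring the removed helper. */
bool is_owner_spinnable(uintptr_t owner)
{
	return !(owner & RWSEM_ANONYMOUSLY_OWNED);
}
```

Masking off the two tag bits recovers the original pointer, which is why the debug build could still report the last reader that touched the lock.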
+3 -2
kernel/sched/fair.c
···
 	u64 time, cost;
 	s64 delta;
 	int cpu, nr = INT_MAX;
+	int this = smp_processor_id();
 
 	this_sd = rcu_dereference(*this_cpu_ptr(&sd_llc));
 	if (!this_sd)
···
 		nr = 4;
 	}
 
-	time = local_clock();
+	time = cpu_clock(this);
 
 	for_each_cpu_wrap(cpu, sched_domain_span(sd), target) {
 		if (!--nr)
···
 			break;
 	}
 
-	time = local_clock() - time;
+	time = cpu_clock(this) - time;
 	cost = this_sd->avg_scan_cost;
 	delta = (s64)(time - cost) / 8;
 	this_sd->avg_scan_cost += delta;
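The hunk above only changes the clock source; the scan-cost bookkeeping it feeds is an exponential moving average with weight 1/8. A standalone model of that update rule (the function name is mine, the arithmetic matches the diff):

```c
#include <assert.h>

/* EWMA update as in the select_idle_cpu() hunk above:
 * delta = (sample - avg) / 8; avg += delta.
 * One eighth of the error is folded in per sample, so the average
 * tracks the scan cost without reacting too sharply to outliers. */
long long update_avg_scan_cost(long long avg, long long sample)
{
	long long delta = (sample - avg) / 8;

	return avg + delta;
}
```

With avg = 800 and a new sample of 1600, the error is 800 and the average moves by 100, to 900; a sample equal to the average leaves it unchanged.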
+4 -4
lib/Kconfig.debug
···
 	select DEBUG_SPINLOCK
 	select DEBUG_MUTEXES
 	select DEBUG_RT_MUTEXES if RT_MUTEXES
-	select DEBUG_RWSEMS if RWSEM_SPIN_ON_OWNER
+	select DEBUG_RWSEMS
 	select DEBUG_WW_MUTEX_SLOWPATH
 	select DEBUG_LOCK_ALLOC
 	select TRACE_IRQFLAGS
···
 
 config DEBUG_RWSEMS
 	bool "RW Semaphore debugging: basic checks"
-	depends on DEBUG_KERNEL && RWSEM_SPIN_ON_OWNER
+	depends on DEBUG_KERNEL
 	help
-	  This debugging feature allows mismatched rw semaphore locks and unlocks
-	  to be detected and reported.
+	  This debugging feature allows mismatched rw semaphore locks
+	  and unlocks to be detected and reported.
 
 config DEBUG_LOCK_ALLOC
 	bool "Lock debugging: detect incorrect freeing of live locks"
+16 -16
lib/atomic64.c
···
 	return &atomic64_lock[addr & (NR_LOCKS - 1)].lock;
 }
 
-long long atomic64_read(const atomic64_t *v)
+s64 atomic64_read(const atomic64_t *v)
 {
 	unsigned long flags;
 	raw_spinlock_t *lock = lock_addr(v);
-	long long val;
+	s64 val;
 
 	raw_spin_lock_irqsave(lock, flags);
 	val = v->counter;
···
 }
 EXPORT_SYMBOL(atomic64_read);
 
-void atomic64_set(atomic64_t *v, long long i)
+void atomic64_set(atomic64_t *v, s64 i)
 {
 	unsigned long flags;
 	raw_spinlock_t *lock = lock_addr(v);
···
 EXPORT_SYMBOL(atomic64_set);
 
 #define ATOMIC64_OP(op, c_op)						\
-void atomic64_##op(long long a, atomic64_t *v)				\
+void atomic64_##op(s64 a, atomic64_t *v)				\
 {									\
 	unsigned long flags;						\
 	raw_spinlock_t *lock = lock_addr(v);				\
···
 EXPORT_SYMBOL(atomic64_##op);
 
 #define ATOMIC64_OP_RETURN(op, c_op)					\
-long long atomic64_##op##_return(long long a, atomic64_t *v)		\
+s64 atomic64_##op##_return(s64 a, atomic64_t *v)			\
 {									\
 	unsigned long flags;						\
 	raw_spinlock_t *lock = lock_addr(v);				\
-	long long val;							\
+	s64 val;							\
 									\
 	raw_spin_lock_irqsave(lock, flags);				\
 	val = (v->counter c_op a);					\
···
 EXPORT_SYMBOL(atomic64_##op##_return);
 
 #define ATOMIC64_FETCH_OP(op, c_op)					\
-long long atomic64_fetch_##op(long long a, atomic64_t *v)		\
+s64 atomic64_fetch_##op(s64 a, atomic64_t *v)				\
 {									\
 	unsigned long flags;						\
 	raw_spinlock_t *lock = lock_addr(v);				\
-	long long val;							\
+	s64 val;							\
 									\
 	raw_spin_lock_irqsave(lock, flags);				\
 	val = v->counter;						\
···
 #undef ATOMIC64_OP_RETURN
 #undef ATOMIC64_OP
 
-long long atomic64_dec_if_positive(atomic64_t *v)
+s64 atomic64_dec_if_positive(atomic64_t *v)
 {
 	unsigned long flags;
 	raw_spinlock_t *lock = lock_addr(v);
-	long long val;
+	s64 val;
 
 	raw_spin_lock_irqsave(lock, flags);
 	val = v->counter - 1;
···
 }
 EXPORT_SYMBOL(atomic64_dec_if_positive);
 
-long long atomic64_cmpxchg(atomic64_t *v, long long o, long long n)
+s64 atomic64_cmpxchg(atomic64_t *v, s64 o, s64 n)
 {
 	unsigned long flags;
 	raw_spinlock_t *lock = lock_addr(v);
-	long long val;
+	s64 val;
 
 	raw_spin_lock_irqsave(lock, flags);
 	val = v->counter;
···
 }
 EXPORT_SYMBOL(atomic64_cmpxchg);
 
-long long atomic64_xchg(atomic64_t *v, long long new)
+s64 atomic64_xchg(atomic64_t *v, s64 new)
 {
 	unsigned long flags;
 	raw_spinlock_t *lock = lock_addr(v);
-	long long val;
+	s64 val;
 
 	raw_spin_lock_irqsave(lock, flags);
 	val = v->counter;
···
 }
 EXPORT_SYMBOL(atomic64_xchg);
 
-long long atomic64_fetch_add_unless(atomic64_t *v, long long a, long long u)
+s64 atomic64_fetch_add_unless(atomic64_t *v, s64 a, s64 u)
 {
 	unsigned long flags;
 	raw_spinlock_t *lock = lock_addr(v);
-	long long val;
+	s64 val;
 
 	raw_spin_lock_irqsave(lock, flags);
 	val = v->counter;
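As context for the `lock_addr()` call visible at the top of this hunk: on machines without native 64-bit atomics, lib/atomic64.c protects each `atomic64_t` with one of a small, fixed pool of hashed spinlocks, picking the bucket from the variable's address. A user-space sketch of that bucket selection (the pool size and cache-line shift here are illustrative constants, not necessarily the kernel's):

```c
#include <assert.h>

/* Illustrative hashed-spinlock bucket selection in the style of
 * lib/atomic64.c's lock_addr(); NR_LOCKS and L1_CACHE_SHIFT are
 * assumed values for the sketch. */
#define NR_LOCKS	16
#define L1_CACHE_SHIFT	6	/* 64-byte cache lines */

unsigned int lock_index(const void *v)
{
	unsigned long addr = (unsigned long)v;

	/* Variables sharing a cache line map to the same lock, so the
	 * lock's cache line bounces no more than the data would. */
	addr >>= L1_CACHE_SHIFT;
	return addr & (NR_LOCKS - 1);
}
```

Hashing keeps the memory cost constant (one small lock array) while spreading contention across unrelated `atomic64_t` variables.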
+1 -1
scripts/atomic/check-atomics.sh
···
 	OLDSUM="$(tail -n 1 ${LINUXDIR}/include/${header})"
 	OLDSUM="${OLDSUM#// }"
 
-	NEWSUM="$(head -n -1 ${LINUXDIR}/include/${header} | sha1sum)"
+	NEWSUM="$(sed '$d' ${LINUXDIR}/include/${header} | sha1sum)"
 	NEWSUM="${NEWSUM%% *}"
 
 	if [ "${OLDSUM}" != "${NEWSUM}" ]; then
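The check-atomics.sh change swaps `head -n -1` for `sed '$d'`: printing all but the last line with a negative count is a GNU coreutils extension, while deleting the last line with sed is specified by POSIX and works on BSD/macOS too. A quick demonstration:

```shell
#!/bin/sh
# Portable "all but the last line": sed '$d' deletes only the final
# line, matching what GNU `head -n -1` produces.
printf 'line1\nline2\nline3\n' | sed '$d'
```

Running this prints `line1` and `line2`; the same pipeline with `head -n -1` fails on BSD head, which rejects negative line counts.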
+4 -4
security/apparmor/label.c
···
 
 	AA_BUG(!orig);
 	AA_BUG(!new);
-	lockdep_assert_held_exclusive(&labels_set(orig)->lock);
+	lockdep_assert_held_write(&labels_set(orig)->lock);
 
 	tmp = rcu_dereference_protected(orig->proxy->label,
 					&labels_ns(orig)->lock);
···
 
 	AA_BUG(!ls);
 	AA_BUG(!label);
-	lockdep_assert_held_exclusive(&ls->lock);
+	lockdep_assert_held_write(&ls->lock);
 
 	if (new)
 		__aa_proxy_redirect(label, new);
···
 	AA_BUG(!ls);
 	AA_BUG(!old);
 	AA_BUG(!new);
-	lockdep_assert_held_exclusive(&ls->lock);
+	lockdep_assert_held_write(&ls->lock);
 	AA_BUG(new->flags & FLAG_IN_TREE);
 
 	if (!label_is_stale(old))
···
 	AA_BUG(!ls);
 	AA_BUG(!label);
 	AA_BUG(labels_set(label) != ls);
-	lockdep_assert_held_exclusive(&ls->lock);
+	lockdep_assert_held_write(&ls->lock);
 	AA_BUG(label->flags & FLAG_IN_TREE);
 
 	/* Figure out where to put new node */