Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core locking updates from Ingo Molnar:
"The main updates in this cycle were:

- mutex MCS refactoring finishing touches: improve comments, refactor
and clean up code, reduce debug data structure footprint, etc.

- qrwlock finishing touches: remove old code, self-test updates.

- small rwsem optimization

- various smaller fixes/cleanups"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/lockdep: Revert qrwlock recusive stuff
locking/rwsem: Avoid double checking before try acquiring write lock
locking/rwsem: Move EXPORT_SYMBOL() lines to follow function definition
locking/rwlock, x86: Delete unused asm/rwlock.h and rwlock.S
locking/rwlock, x86: Clean up asm/spinlock*.h to remove old rwlock code
locking/semaphore: Resolve some shadow warnings
locking/selftest: Support queued rwlock
locking/lockdep: Restrict the use of recursive read_lock() with qrwlock
locking/spinlocks: Always evaluate the second argument of spin_lock_nested()
locking/Documentation: Update locking/mutex-design.txt disadvantages
locking/Documentation: Move locking related docs into Documentation/locking/
locking/mutexes: Use MUTEX_SPIN_ON_OWNER when appropriate
locking/mutexes: Refactor optimistic spinning code
locking/mcs: Remove obsolete comment
locking/mutexes: Document quick lock release when unlocking
locking/mutexes: Standardize arguments in lock/unlock slowpaths
locking: Remove deprecated smp_mb__() barriers

+282 -483
+2
Documentation/00-INDEX
··· 287 287 - semantics and behavior of local atomic operations. 288 288 lockdep-design.txt 289 289 - documentation on the runtime locking correctness validator. 290 + locking/ 291 + - directory with info about kernel locking primitives 290 292 lockstat.txt 291 293 - info on collecting statistics on locks (and contention). 292 294 lockup-watchdogs.txt
+1 -1
Documentation/DocBook/kernel-locking.tmpl
··· 1972 1972 <itemizedlist> 1973 1973 <listitem> 1974 1974 <para> 1975 - <filename>Documentation/spinlocks.txt</filename>: 1975 + <filename>Documentation/locking/spinlocks.txt</filename>: 1976 1976 Linus Torvalds' spinlocking tutorial in the kernel sources. 1977 1977 </para> 1978 1978 </listitem>
Documentation/lockdep-design.txt Documentation/locking/lockdep-design.txt
+1 -1
Documentation/lockstat.txt Documentation/locking/lockstat.txt
··· 12 12 - HOW 13 13 14 14 Lockdep already has hooks in the lock functions and maps lock instances to 15 - lock classes. We build on that (see Documentation/lockdep-design.txt). 15 + lock classes. We build on that (see Documentation/locking/lockdep-design.txt). 16 16 The graph below shows the relation between the lock functions and the various 17 17 hooks therein. 18 18
+3 -3
Documentation/mutex-design.txt Documentation/locking/mutex-design.txt
··· 145 145 146 146 Unlike its original design and purpose, 'struct mutex' is larger than 147 147 most locks in the kernel. E.g: on x86-64 it is 40 bytes, almost twice 148 - as large as 'struct semaphore' (24 bytes) and 8 bytes shy of the 149 - 'struct rw_semaphore' variant. Larger structure sizes mean more CPU 150 - cache and memory footprint. 148 + as large as 'struct semaphore' (24 bytes) and tied, along with rwsems, 149 + for the largest lock in the kernel. Larger structure sizes mean more 150 + CPU cache and memory footprint. 151 151 152 152 When to use mutexes 153 153 -------------------
Documentation/rt-mutex-design.txt Documentation/locking/rt-mutex-design.txt
Documentation/rt-mutex.txt Documentation/locking/rt-mutex.txt
+7 -7
Documentation/spinlocks.txt Documentation/locking/spinlocks.txt
··· 105 105 spin_unlock(&lock); 106 106 107 107 (and the equivalent read-write versions too, of course). The spinlock will 108 - guarantee the same kind of exclusive access, and it will be much faster. 108 + guarantee the same kind of exclusive access, and it will be much faster. 109 109 This is useful if you know that the data in question is only ever 110 - manipulated from a "process context", ie no interrupts involved. 110 + manipulated from a "process context", ie no interrupts involved. 111 111 112 112 The reasons you mustn't use these versions if you have interrupts that 113 113 play with the spinlock is that you can get deadlocks: ··· 122 122 interrupt happens on the same CPU that already holds the lock, because the 123 123 lock will obviously never be released (because the interrupt is waiting 124 124 for the lock, and the lock-holder is interrupted by the interrupt and will 125 - not continue until the interrupt has been processed). 125 + not continue until the interrupt has been processed). 126 126 127 127 (This is also the reason why the irq-versions of the spinlocks only need 128 128 to disable the _local_ interrupts - it's ok to use spinlocks in interrupts 129 129 on other CPU's, because an interrupt on another CPU doesn't interrupt the 130 130 CPU that holds the lock, so the lock-holder can continue and eventually 131 - releases the lock). 131 + releases the lock). 132 132 133 133 Note that you can be clever with read-write locks and interrupts. For 134 134 example, if you know that the interrupt only ever gets a read-lock, then 135 135 you can use a non-irq version of read locks everywhere - because they 136 - don't block on each other (and thus there is no dead-lock wrt interrupts. 137 - But when you do the write-lock, you have to use the irq-safe version. 136 + don't block on each other (and thus there is no dead-lock wrt interrupts. 137 + But when you do the write-lock, you have to use the irq-safe version. 
138 138 139 - For an example of being clever with rw-locks, see the "waitqueue_lock" 139 + For an example of being clever with rw-locks, see the "waitqueue_lock" 140 140 handling in kernel/sched/core.c - nothing ever _changes_ a wait-queue from 141 141 within an interrupt, they only read the queue in order to know whom to 142 142 wake up. So read-locks are safe (which is good: they are very common
Documentation/ww-mutex-design.txt Documentation/locking/ww-mutex-design.txt
+2 -2
MAINTAINERS
··· 5680 5680 L: linux-kernel@vger.kernel.org 5681 5681 T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core/locking 5682 5682 S: Maintained 5683 - F: Documentation/lockdep*.txt 5684 - F: Documentation/lockstat.txt 5683 + F: Documentation/locking/lockdep*.txt 5684 + F: Documentation/locking/lockstat.txt 5685 5685 F: include/linux/lockdep.h 5686 5686 F: kernel/locking/ 5687 5687
-49
arch/x86/include/asm/rwlock.h
··· 1 - #ifndef _ASM_X86_RWLOCK_H 2 - #define _ASM_X86_RWLOCK_H 3 - 4 - #include <asm/asm.h> 5 - 6 - #if CONFIG_NR_CPUS <= 2048 7 - 8 - #ifndef __ASSEMBLY__ 9 - typedef union { 10 - s32 lock; 11 - s32 write; 12 - } arch_rwlock_t; 13 - #endif 14 - 15 - #define RW_LOCK_BIAS 0x00100000 16 - #define READ_LOCK_SIZE(insn) __ASM_FORM(insn##l) 17 - #define READ_LOCK_ATOMIC(n) atomic_##n 18 - #define WRITE_LOCK_ADD(n) __ASM_FORM_COMMA(addl n) 19 - #define WRITE_LOCK_SUB(n) __ASM_FORM_COMMA(subl n) 20 - #define WRITE_LOCK_CMP RW_LOCK_BIAS 21 - 22 - #else /* CONFIG_NR_CPUS > 2048 */ 23 - 24 - #include <linux/const.h> 25 - 26 - #ifndef __ASSEMBLY__ 27 - typedef union { 28 - s64 lock; 29 - struct { 30 - u32 read; 31 - s32 write; 32 - }; 33 - } arch_rwlock_t; 34 - #endif 35 - 36 - #define RW_LOCK_BIAS (_AC(1,L) << 32) 37 - #define READ_LOCK_SIZE(insn) __ASM_FORM(insn##q) 38 - #define READ_LOCK_ATOMIC(n) atomic64_##n 39 - #define WRITE_LOCK_ADD(n) __ASM_FORM(incl) 40 - #define WRITE_LOCK_SUB(n) __ASM_FORM(decl) 41 - #define WRITE_LOCK_CMP 1 42 - 43 - #endif /* CONFIG_NR_CPUS */ 44 - 45 - #define __ARCH_RW_LOCK_UNLOCKED { RW_LOCK_BIAS } 46 - 47 - /* Actual code is in asm/spinlock.h or in arch/x86/lib/rwlock.S */ 48 - 49 - #endif /* _ASM_X86_RWLOCK_H */
+2 -79
arch/x86/include/asm/spinlock.h
··· 187 187 cpu_relax(); 188 188 } 189 189 190 - #ifndef CONFIG_QUEUE_RWLOCK 191 190 /* 192 191 * Read-write spinlocks, allowing multiple readers 193 192 * but only one writer. ··· 197 198 * irq-safe write-lock, but readers can get non-irqsafe 198 199 * read-locks. 199 200 * 200 - * On x86, we implement read-write locks as a 32-bit counter 201 - * with the high bit (sign) being the "contended" bit. 201 + * On x86, we implement read-write locks using the generic qrwlock with 202 + * x86 specific optimization. 202 203 */ 203 204 204 - /** 205 - * read_can_lock - would read_trylock() succeed? 206 - * @lock: the rwlock in question. 207 - */ 208 - static inline int arch_read_can_lock(arch_rwlock_t *lock) 209 - { 210 - return lock->lock > 0; 211 - } 212 - 213 - /** 214 - * write_can_lock - would write_trylock() succeed? 215 - * @lock: the rwlock in question. 216 - */ 217 - static inline int arch_write_can_lock(arch_rwlock_t *lock) 218 - { 219 - return lock->write == WRITE_LOCK_CMP; 220 - } 221 - 222 - static inline void arch_read_lock(arch_rwlock_t *rw) 223 - { 224 - asm volatile(LOCK_PREFIX READ_LOCK_SIZE(dec) " (%0)\n\t" 225 - "jns 1f\n" 226 - "call __read_lock_failed\n\t" 227 - "1:\n" 228 - ::LOCK_PTR_REG (rw) : "memory"); 229 - } 230 - 231 - static inline void arch_write_lock(arch_rwlock_t *rw) 232 - { 233 - asm volatile(LOCK_PREFIX WRITE_LOCK_SUB(%1) "(%0)\n\t" 234 - "jz 1f\n" 235 - "call __write_lock_failed\n\t" 236 - "1:\n" 237 - ::LOCK_PTR_REG (&rw->write), "i" (RW_LOCK_BIAS) 238 - : "memory"); 239 - } 240 - 241 - static inline int arch_read_trylock(arch_rwlock_t *lock) 242 - { 243 - READ_LOCK_ATOMIC(t) *count = (READ_LOCK_ATOMIC(t) *)lock; 244 - 245 - if (READ_LOCK_ATOMIC(dec_return)(count) >= 0) 246 - return 1; 247 - READ_LOCK_ATOMIC(inc)(count); 248 - return 0; 249 - } 250 - 251 - static inline int arch_write_trylock(arch_rwlock_t *lock) 252 - { 253 - atomic_t *count = (atomic_t *)&lock->write; 254 - 255 - if (atomic_sub_and_test(WRITE_LOCK_CMP, count)) 256 - 
return 1; 257 - atomic_add(WRITE_LOCK_CMP, count); 258 - return 0; 259 - } 260 - 261 - static inline void arch_read_unlock(arch_rwlock_t *rw) 262 - { 263 - asm volatile(LOCK_PREFIX READ_LOCK_SIZE(inc) " %0" 264 - :"+m" (rw->lock) : : "memory"); 265 - } 266 - 267 - static inline void arch_write_unlock(arch_rwlock_t *rw) 268 - { 269 - asm volatile(LOCK_PREFIX WRITE_LOCK_ADD(%1) "%0" 270 - : "+m" (rw->write) : "i" (RW_LOCK_BIAS) : "memory"); 271 - } 272 - #else 273 205 #include <asm/qrwlock.h> 274 - #endif /* CONFIG_QUEUE_RWLOCK */ 275 206 276 207 #define arch_read_lock_flags(lock, flags) arch_read_lock(lock) 277 208 #define arch_write_lock_flags(lock, flags) arch_write_lock(lock) 278 - 279 - #undef READ_LOCK_SIZE 280 - #undef READ_LOCK_ATOMIC 281 - #undef WRITE_LOCK_ADD 282 - #undef WRITE_LOCK_SUB 283 - #undef WRITE_LOCK_CMP 284 209 285 210 #define arch_spin_relax(lock) cpu_relax() 286 211 #define arch_read_relax(lock) cpu_relax()
-4
arch/x86/include/asm/spinlock_types.h
··· 34 34 35 35 #define __ARCH_SPIN_LOCK_UNLOCKED { { 0 } } 36 36 37 - #ifdef CONFIG_QUEUE_RWLOCK 38 37 #include <asm-generic/qrwlock_types.h> 39 - #else 40 - #include <asm/rwlock.h> 41 - #endif 42 38 43 39 #endif /* _ASM_X86_SPINLOCK_TYPES_H */
-1
arch/x86/lib/Makefile
··· 20 20 lib-y += thunk_$(BITS).o 21 21 lib-y += usercopy_$(BITS).o usercopy.o getuser.o putuser.o 22 22 lib-y += memcpy_$(BITS).o 23 - lib-$(CONFIG_SMP) += rwlock.o 24 23 lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o 25 24 lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o 26 25
-44
arch/x86/lib/rwlock.S
··· 1 - /* Slow paths of read/write spinlocks. */ 2 - 3 - #include <linux/linkage.h> 4 - #include <asm/alternative-asm.h> 5 - #include <asm/frame.h> 6 - #include <asm/rwlock.h> 7 - 8 - #ifdef CONFIG_X86_32 9 - # define __lock_ptr eax 10 - #else 11 - # define __lock_ptr rdi 12 - #endif 13 - 14 - ENTRY(__write_lock_failed) 15 - CFI_STARTPROC 16 - FRAME 17 - 0: LOCK_PREFIX 18 - WRITE_LOCK_ADD($RW_LOCK_BIAS) (%__lock_ptr) 19 - 1: rep; nop 20 - cmpl $WRITE_LOCK_CMP, (%__lock_ptr) 21 - jne 1b 22 - LOCK_PREFIX 23 - WRITE_LOCK_SUB($RW_LOCK_BIAS) (%__lock_ptr) 24 - jnz 0b 25 - ENDFRAME 26 - ret 27 - CFI_ENDPROC 28 - END(__write_lock_failed) 29 - 30 - ENTRY(__read_lock_failed) 31 - CFI_STARTPROC 32 - FRAME 33 - 0: LOCK_PREFIX 34 - READ_LOCK_SIZE(inc) (%__lock_ptr) 35 - 1: rep; nop 36 - READ_LOCK_SIZE(cmp) $1, (%__lock_ptr) 37 - js 1b 38 - LOCK_PREFIX 39 - READ_LOCK_SIZE(dec) (%__lock_ptr) 40 - js 0b 41 - ENDFRAME 42 - ret 43 - CFI_ENDPROC 44 - END(__read_lock_failed)
+1 -1
drivers/gpu/drm/drm_modeset_lock.c
··· 35 35 * of extra utility/tracking out of our acquire-ctx. This is provided 36 36 * by drm_modeset_lock / drm_modeset_acquire_ctx. 37 37 * 38 - * For basic principles of ww_mutex, see: Documentation/ww-mutex-design.txt 38 + * For basic principles of ww_mutex, see: Documentation/locking/ww-mutex-design.txt 39 39 * 40 40 * The basic usage pattern is to: 41 41 *
-36
include/linux/atomic.h
··· 3 3 #define _LINUX_ATOMIC_H 4 4 #include <asm/atomic.h> 5 5 6 - /* 7 - * Provide __deprecated wrappers for the new interface, avoid flag day changes. 8 - * We need the ugly external functions to break header recursion hell. 9 - */ 10 - #ifndef smp_mb__before_atomic_inc 11 - static inline void __deprecated smp_mb__before_atomic_inc(void) 12 - { 13 - extern void __smp_mb__before_atomic(void); 14 - __smp_mb__before_atomic(); 15 - } 16 - #endif 17 - 18 - #ifndef smp_mb__after_atomic_inc 19 - static inline void __deprecated smp_mb__after_atomic_inc(void) 20 - { 21 - extern void __smp_mb__after_atomic(void); 22 - __smp_mb__after_atomic(); 23 - } 24 - #endif 25 - 26 - #ifndef smp_mb__before_atomic_dec 27 - static inline void __deprecated smp_mb__before_atomic_dec(void) 28 - { 29 - extern void __smp_mb__before_atomic(void); 30 - __smp_mb__before_atomic(); 31 - } 32 - #endif 33 - 34 - #ifndef smp_mb__after_atomic_dec 35 - static inline void __deprecated smp_mb__after_atomic_dec(void) 36 - { 37 - extern void __smp_mb__after_atomic(void); 38 - __smp_mb__after_atomic(); 39 - } 40 - #endif 41 - 42 6 /** 43 7 * atomic_add_unless - add unless the number is already a given value 44 8 * @v: pointer of type atomic_t
-20
include/linux/bitops.h
··· 32 32 */ 33 33 #include <asm/bitops.h> 34 34 35 - /* 36 - * Provide __deprecated wrappers for the new interface, avoid flag day changes. 37 - * We need the ugly external functions to break header recursion hell. 38 - */ 39 - #ifndef smp_mb__before_clear_bit 40 - static inline void __deprecated smp_mb__before_clear_bit(void) 41 - { 42 - extern void __smp_mb__before_atomic(void); 43 - __smp_mb__before_atomic(); 44 - } 45 - #endif 46 - 47 - #ifndef smp_mb__after_clear_bit 48 - static inline void __deprecated smp_mb__after_clear_bit(void) 49 - { 50 - extern void __smp_mb__after_atomic(void); 51 - __smp_mb__after_atomic(); 52 - } 53 - #endif 54 - 55 35 #define for_each_set_bit(bit, addr, size) \ 56 36 for ((bit) = find_first_bit((addr), (size)); \ 57 37 (bit) < (size); \
+1 -1
include/linux/lockdep.h
··· 4 4 * Copyright (C) 2006,2007 Red Hat, Inc., Ingo Molnar <mingo@redhat.com> 5 5 * Copyright (C) 2007 Red Hat, Inc., Peter Zijlstra <pzijlstr@redhat.com> 6 6 * 7 - * see Documentation/lockdep-design.txt for more details. 7 + * see Documentation/locking/lockdep-design.txt for more details. 8 8 */ 9 9 #ifndef __LINUX_LOCKDEP_H 10 10 #define __LINUX_LOCKDEP_H
+2 -2
include/linux/mutex.h
··· 52 52 atomic_t count; 53 53 spinlock_t wait_lock; 54 54 struct list_head wait_list; 55 - #if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_SMP) 55 + #if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_MUTEX_SPIN_ON_OWNER) 56 56 struct task_struct *owner; 57 57 #endif 58 58 #ifdef CONFIG_MUTEX_SPIN_ON_OWNER ··· 133 133 134 134 /* 135 135 * See kernel/locking/mutex.c for detailed documentation of these APIs. 136 - * Also see Documentation/mutex-design.txt. 136 + * Also see Documentation/locking/mutex-design.txt. 137 137 */ 138 138 #ifdef CONFIG_DEBUG_LOCK_ALLOC 139 139 extern void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
+1 -1
include/linux/rwsem.h
··· 149 149 * static then another method for expressing nested locking is 150 150 * the explicit definition of lock class keys and the use of 151 151 * lockdep_set_class() at lock initialization time. 152 - * See Documentation/lockdep-design.txt for more details.) 152 + * See Documentation/locking/lockdep-design.txt for more details.) 153 153 */ 154 154 extern void down_read_nested(struct rw_semaphore *sem, int subclass); 155 155 extern void down_write_nested(struct rw_semaphore *sem, int subclass);
+7 -1
include/linux/spinlock.h
··· 197 197 _raw_spin_lock_nest_lock(lock, &(nest_lock)->dep_map); \ 198 198 } while (0) 199 199 #else 200 - # define raw_spin_lock_nested(lock, subclass) _raw_spin_lock(lock) 200 + /* 201 + * Always evaluate the 'subclass' argument to avoid that the compiler 202 + * warns about set-but-not-used variables when building with 203 + * CONFIG_DEBUG_LOCK_ALLOC=n and with W=1. 204 + */ 205 + # define raw_spin_lock_nested(lock, subclass) \ 206 + _raw_spin_lock(((void)(subclass), (lock))) 201 207 # define raw_spin_lock_nest_lock(lock, nest_lock) _raw_spin_lock(lock) 202 208 #endif 203 209
-3
kernel/locking/mcs_spinlock.h
··· 56 56 * If the lock has already been acquired, then this will proceed to spin 57 57 * on this node->locked until the previous lock holder sets the node->locked 58 58 * in mcs_spin_unlock(). 59 - * 60 - * We don't inline mcs_spin_lock() so that perf can correctly account for the 61 - * time spent in this lock function. 62 59 */ 63 60 static inline 64 61 void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
+228 -188
kernel/locking/mutex.c
··· 15 15 * by Steven Rostedt, based on work by Gregory Haskins, Peter Morreale 16 16 * and Sven Dietrich. 17 17 * 18 - * Also see Documentation/mutex-design.txt. 18 + * Also see Documentation/locking/mutex-design.txt. 19 19 */ 20 20 #include <linux/mutex.h> 21 21 #include <linux/ww_mutex.h> ··· 106 106 EXPORT_SYMBOL(mutex_lock); 107 107 #endif 108 108 109 + static __always_inline void ww_mutex_lock_acquired(struct ww_mutex *ww, 110 + struct ww_acquire_ctx *ww_ctx) 111 + { 112 + #ifdef CONFIG_DEBUG_MUTEXES 113 + /* 114 + * If this WARN_ON triggers, you used ww_mutex_lock to acquire, 115 + * but released with a normal mutex_unlock in this call. 116 + * 117 + * This should never happen, always use ww_mutex_unlock. 118 + */ 119 + DEBUG_LOCKS_WARN_ON(ww->ctx); 120 + 121 + /* 122 + * Not quite done after calling ww_acquire_done() ? 123 + */ 124 + DEBUG_LOCKS_WARN_ON(ww_ctx->done_acquire); 125 + 126 + if (ww_ctx->contending_lock) { 127 + /* 128 + * After -EDEADLK you tried to 129 + * acquire a different ww_mutex? Bad! 130 + */ 131 + DEBUG_LOCKS_WARN_ON(ww_ctx->contending_lock != ww); 132 + 133 + /* 134 + * You called ww_mutex_lock after receiving -EDEADLK, 135 + * but 'forgot' to unlock everything else first? 136 + */ 137 + DEBUG_LOCKS_WARN_ON(ww_ctx->acquired > 0); 138 + ww_ctx->contending_lock = NULL; 139 + } 140 + 141 + /* 142 + * Naughty, using a different class will lead to undefined behavior! 143 + */ 144 + DEBUG_LOCKS_WARN_ON(ww_ctx->ww_class != ww->ww_class); 145 + #endif 146 + ww_ctx->acquired++; 147 + } 148 + 149 + /* 150 + * after acquiring lock with fastpath or when we lost out in contested 151 + * slowpath, set ctx and wake up any waiters so they can recheck. 152 + * 153 + * This function is never called when CONFIG_DEBUG_LOCK_ALLOC is set, 154 + * as the fastpath and opportunistic spinning are disabled in that case. 
155 + */ 156 + static __always_inline void 157 + ww_mutex_set_context_fastpath(struct ww_mutex *lock, 158 + struct ww_acquire_ctx *ctx) 159 + { 160 + unsigned long flags; 161 + struct mutex_waiter *cur; 162 + 163 + ww_mutex_lock_acquired(lock, ctx); 164 + 165 + lock->ctx = ctx; 166 + 167 + /* 168 + * The lock->ctx update should be visible on all cores before 169 + * the atomic read is done, otherwise contended waiters might be 170 + * missed. The contended waiters will either see ww_ctx == NULL 171 + * and keep spinning, or it will acquire wait_lock, add itself 172 + * to waiter list and sleep. 173 + */ 174 + smp_mb(); /* ^^^ */ 175 + 176 + /* 177 + * Check if lock is contended, if not there is nobody to wake up 178 + */ 179 + if (likely(atomic_read(&lock->base.count) == 0)) 180 + return; 181 + 182 + /* 183 + * Uh oh, we raced in fastpath, wake up everyone in this case, 184 + * so they can see the new lock->ctx. 185 + */ 186 + spin_lock_mutex(&lock->base.wait_lock, flags); 187 + list_for_each_entry(cur, &lock->base.wait_list, list) { 188 + debug_mutex_wake_waiter(&lock->base, cur); 189 + wake_up_process(cur->task); 190 + } 191 + spin_unlock_mutex(&lock->base.wait_lock, flags); 192 + } 193 + 194 + 109 195 #ifdef CONFIG_MUTEX_SPIN_ON_OWNER 110 196 /* 111 197 * In order to avoid a stampede of mutex spinners from acquiring the mutex ··· 265 179 * it and not set the owner yet or the mutex has been released. 266 180 */ 267 181 return retval; 182 + } 183 + 184 + /* 185 + * Atomically try to take the lock when it is available 186 + */ 187 + static inline bool mutex_try_to_acquire(struct mutex *lock) 188 + { 189 + return !mutex_is_locked(lock) && 190 + (atomic_cmpxchg(&lock->count, 1, 0) == 1); 191 + } 192 + 193 + /* 194 + * Optimistic spinning. 195 + * 196 + * We try to spin for acquisition when we find that the lock owner 197 + * is currently running on a (different) CPU and while we don't 198 + * need to reschedule. 
The rationale is that if the lock owner is 199 + * running, it is likely to release the lock soon. 200 + * 201 + * Since this needs the lock owner, and this mutex implementation 202 + * doesn't track the owner atomically in the lock field, we need to 203 + * track it non-atomically. 204 + * 205 + * We can't do this for DEBUG_MUTEXES because that relies on wait_lock 206 + * to serialize everything. 207 + * 208 + * The mutex spinners are queued up using MCS lock so that only one 209 + * spinner can compete for the mutex. However, if mutex spinning isn't 210 + * going to happen, there is no point in going through the lock/unlock 211 + * overhead. 212 + * 213 + * Returns true when the lock was taken, otherwise false, indicating 214 + * that we need to jump to the slowpath and sleep. 215 + */ 216 + static bool mutex_optimistic_spin(struct mutex *lock, 217 + struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx) 218 + { 219 + struct task_struct *task = current; 220 + 221 + if (!mutex_can_spin_on_owner(lock)) 222 + goto done; 223 + 224 + if (!osq_lock(&lock->osq)) 225 + goto done; 226 + 227 + while (true) { 228 + struct task_struct *owner; 229 + 230 + if (use_ww_ctx && ww_ctx->acquired > 0) { 231 + struct ww_mutex *ww; 232 + 233 + ww = container_of(lock, struct ww_mutex, base); 234 + /* 235 + * If ww->ctx is set the contents are undefined, only 236 + * by acquiring wait_lock there is a guarantee that 237 + * they are not invalid when reading. 238 + * 239 + * As such, when deadlock detection needs to be 240 + * performed the optimistic spinning cannot be done. 241 + */ 242 + if (ACCESS_ONCE(ww->ctx)) 243 + break; 244 + } 245 + 246 + /* 247 + * If there's an owner, wait for it to either 248 + * release the lock or go to sleep. 249 + */ 250 + owner = ACCESS_ONCE(lock->owner); 251 + if (owner && !mutex_spin_on_owner(lock, owner)) 252 + break; 253 + 254 + /* Try to acquire the mutex if it is unlocked. 
*/ 255 + if (mutex_try_to_acquire(lock)) { 256 + lock_acquired(&lock->dep_map, ip); 257 + 258 + if (use_ww_ctx) { 259 + struct ww_mutex *ww; 260 + ww = container_of(lock, struct ww_mutex, base); 261 + 262 + ww_mutex_set_context_fastpath(ww, ww_ctx); 263 + } 264 + 265 + mutex_set_owner(lock); 266 + osq_unlock(&lock->osq); 267 + return true; 268 + } 269 + 270 + /* 271 + * When there's no owner, we might have preempted between the 272 + * owner acquiring the lock and setting the owner field. If 273 + * we're an RT task that will live-lock because we won't let 274 + * the owner complete. 275 + */ 276 + if (!owner && (need_resched() || rt_task(task))) 277 + break; 278 + 279 + /* 280 + * The cpu_relax() call is a compiler barrier which forces 281 + * everything in this loop to be re-loaded. We don't need 282 + * memory barriers as we'll eventually observe the right 283 + * values at the cost of a few extra spins. 284 + */ 285 + cpu_relax_lowlatency(); 286 + } 287 + 288 + osq_unlock(&lock->osq); 289 + done: 290 + /* 291 + * If we fell out of the spin path because of need_resched(), 292 + * reschedule now, before we try-lock the mutex. This avoids getting 293 + * scheduled out right after we obtained the mutex. 294 + */ 295 + if (need_resched()) 296 + schedule_preempt_disabled(); 297 + 298 + return false; 299 + } 300 + #else 301 + static bool mutex_optimistic_spin(struct mutex *lock, 302 + struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx) 303 + { 304 + return false; 268 305 } 269 306 #endif 270 307 ··· 486 277 return 0; 487 278 } 488 279 489 - static __always_inline void ww_mutex_lock_acquired(struct ww_mutex *ww, 490 - struct ww_acquire_ctx *ww_ctx) 491 - { 492 - #ifdef CONFIG_DEBUG_MUTEXES 493 - /* 494 - * If this WARN_ON triggers, you used ww_mutex_lock to acquire, 495 - * but released with a normal mutex_unlock in this call. 496 - * 497 - * This should never happen, always use ww_mutex_unlock. 
498 - */ 499 - DEBUG_LOCKS_WARN_ON(ww->ctx); 500 - 501 - /* 502 - * Not quite done after calling ww_acquire_done() ? 503 - */ 504 - DEBUG_LOCKS_WARN_ON(ww_ctx->done_acquire); 505 - 506 - if (ww_ctx->contending_lock) { 507 - /* 508 - * After -EDEADLK you tried to 509 - * acquire a different ww_mutex? Bad! 510 - */ 511 - DEBUG_LOCKS_WARN_ON(ww_ctx->contending_lock != ww); 512 - 513 - /* 514 - * You called ww_mutex_lock after receiving -EDEADLK, 515 - * but 'forgot' to unlock everything else first? 516 - */ 517 - DEBUG_LOCKS_WARN_ON(ww_ctx->acquired > 0); 518 - ww_ctx->contending_lock = NULL; 519 - } 520 - 521 - /* 522 - * Naughty, using a different class will lead to undefined behavior! 523 - */ 524 - DEBUG_LOCKS_WARN_ON(ww_ctx->ww_class != ww->ww_class); 525 - #endif 526 - ww_ctx->acquired++; 527 - } 528 - 529 - /* 530 - * after acquiring lock with fastpath or when we lost out in contested 531 - * slowpath, set ctx and wake up any waiters so they can recheck. 532 - * 533 - * This function is never called when CONFIG_DEBUG_LOCK_ALLOC is set, 534 - * as the fastpath and opportunistic spinning are disabled in that case. 535 - */ 536 - static __always_inline void 537 - ww_mutex_set_context_fastpath(struct ww_mutex *lock, 538 - struct ww_acquire_ctx *ctx) 539 - { 540 - unsigned long flags; 541 - struct mutex_waiter *cur; 542 - 543 - ww_mutex_lock_acquired(lock, ctx); 544 - 545 - lock->ctx = ctx; 546 - 547 - /* 548 - * The lock->ctx update should be visible on all cores before 549 - * the atomic read is done, otherwise contended waiters might be 550 - * missed. The contended waiters will either see ww_ctx == NULL 551 - * and keep spinning, or it will acquire wait_lock, add itself 552 - * to waiter list and sleep. 
553 - */ 554 - smp_mb(); /* ^^^ */ 555 - 556 - /* 557 - * Check if lock is contended, if not there is nobody to wake up 558 - */ 559 - if (likely(atomic_read(&lock->base.count) == 0)) 560 - return; 561 - 562 - /* 563 - * Uh oh, we raced in fastpath, wake up everyone in this case, 564 - * so they can see the new lock->ctx. 565 - */ 566 - spin_lock_mutex(&lock->base.wait_lock, flags); 567 - list_for_each_entry(cur, &lock->base.wait_list, list) { 568 - debug_mutex_wake_waiter(&lock->base, cur); 569 - wake_up_process(cur->task); 570 - } 571 - spin_unlock_mutex(&lock->base.wait_lock, flags); 572 - } 573 - 574 280 /* 575 281 * Lock a mutex (possibly interruptible), slowpath: 576 282 */ ··· 502 378 preempt_disable(); 503 379 mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip); 504 380 505 - #ifdef CONFIG_MUTEX_SPIN_ON_OWNER 506 - /* 507 - * Optimistic spinning. 508 - * 509 - * We try to spin for acquisition when we find that the lock owner 510 - * is currently running on a (different) CPU and while we don't 511 - * need to reschedule. The rationale is that if the lock owner is 512 - * running, it is likely to release the lock soon. 513 - * 514 - * Since this needs the lock owner, and this mutex implementation 515 - * doesn't track the owner atomically in the lock field, we need to 516 - * track it non-atomically. 517 - * 518 - * We can't do this for DEBUG_MUTEXES because that relies on wait_lock 519 - * to serialize everything. 520 - * 521 - * The mutex spinners are queued up using MCS lock so that only one 522 - * spinner can compete for the mutex. However, if mutex spinning isn't 523 - * going to happen, there is no point in going through the lock/unlock 524 - * overhead. 
525 -	 */
526 -	if (!mutex_can_spin_on_owner(lock))
527 -		goto slowpath;
528 -
529 -	if (!osq_lock(&lock->osq))
530 -		goto slowpath;
531 -
532 -	for (;;) {
533 -		struct task_struct *owner;
534 -
535 -		if (use_ww_ctx && ww_ctx->acquired > 0) {
536 -			struct ww_mutex *ww;
537 -
538 -			ww = container_of(lock, struct ww_mutex, base);
539 -			/*
540 -			 * If ww->ctx is set the contents are undefined, only
541 -			 * by acquiring wait_lock there is a guarantee that
542 -			 * they are not invalid when reading.
543 -			 *
544 -			 * As such, when deadlock detection needs to be
545 -			 * performed the optimistic spinning cannot be done.
546 -			 */
547 -			if (ACCESS_ONCE(ww->ctx))
548 -				break;
549 -		}
550 -
551 -		/*
552 -		 * If there's an owner, wait for it to either
553 -		 * release the lock or go to sleep.
554 -		 */
555 -		owner = ACCESS_ONCE(lock->owner);
556 -		if (owner && !mutex_spin_on_owner(lock, owner))
557 -			break;
558 -
559 -		/* Try to acquire the mutex if it is unlocked. */
560 -		if (!mutex_is_locked(lock) &&
561 -		    (atomic_cmpxchg(&lock->count, 1, 0) == 1)) {
562 -			lock_acquired(&lock->dep_map, ip);
563 -			if (use_ww_ctx) {
564 -				struct ww_mutex *ww;
565 -				ww = container_of(lock, struct ww_mutex, base);
566 -
567 -				ww_mutex_set_context_fastpath(ww, ww_ctx);
568 -			}
569 -
570 -			mutex_set_owner(lock);
571 -			osq_unlock(&lock->osq);
572 -			preempt_enable();
573 -			return 0;
574 -		}
575 -
576 -		/*
577 -		 * When there's no owner, we might have preempted between the
578 -		 * owner acquiring the lock and setting the owner field. If
579 -		 * we're an RT task that will live-lock because we won't let
580 -		 * the owner complete.
581 -		 */
582 -		if (!owner && (need_resched() || rt_task(task)))
583 -			break;
584 -
585 -		/*
586 -		 * The cpu_relax() call is a compiler barrier which forces
587 -		 * everything in this loop to be re-loaded. We don't need
588 -		 * memory barriers as we'll eventually observe the right
589 -		 * values at the cost of a few extra spins.
590 -		 */
591 -		cpu_relax_lowlatency();
381 +	if (mutex_optimistic_spin(lock, ww_ctx, use_ww_ctx)) {
382 +		/* got the lock, yay! */
383 +		preempt_enable();
384 +		return 0;
592 385 	}
593 -	osq_unlock(&lock->osq);
594 - slowpath:
595 -	/*
596 -	 * If we fell out of the spin path because of need_resched(),
597 -	 * reschedule now, before we try-lock the mutex. This avoids getting
598 -	 * scheduled out right after we obtained the mutex.
599 -	 */
600 -	if (need_resched())
601 -		schedule_preempt_disabled();
602 - #endif
386 +
603 387 	spin_lock_mutex(&lock->wait_lock, flags);
604 388
605 389 	/*
···
711 679  * Release the lock, slowpath:
712 680  */
713 681 static inline void
714 - __mutex_unlock_common_slowpath(atomic_t *lock_count, int nested)
682 + __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
715 683 {
716 -	struct mutex *lock = container_of(lock_count, struct mutex, count);
717 684 	unsigned long flags;
718 685
719 686 	/*
720 -	 * some architectures leave the lock unlocked in the fastpath failure
687 +	 * As a performance measurement, release the lock before doing other
688 +	 * wakeup related duties to follow. This allows other tasks to acquire
689 +	 * the lock sooner, while still handling cleanups in past unlock calls.
690 +	 * This can be done as we do not enforce strict equivalence between the
691 +	 * mutex counter and wait_list.
692 +	 *
693 +	 *
694 +	 * Some architectures leave the lock unlocked in the fastpath failure
721 695 	 * case, others need to leave it locked. In the later case we have to
722 -	 * unlock it here
696 +	 * unlock it here - as the lock counter is currently 0 or negative.
723 697 	 */
724 698 	if (__mutex_slowpath_needs_to_unlock())
725 699 		atomic_set(&lock->count, 1);
···
754 716 __visible void
755 717 __mutex_unlock_slowpath(atomic_t *lock_count)
756 718 {
757 -	__mutex_unlock_common_slowpath(lock_count, 1);
719 +	struct mutex *lock = container_of(lock_count, struct mutex, count);
720 +
721 +	__mutex_unlock_common_slowpath(lock, 1);
758 722 }
759 723
760 724 #ifndef CONFIG_DEBUG_LOCK_ALLOC
+1 -1
kernel/locking/mutex.h
···
16 16 #define mutex_remove_waiter(lock, waiter, ti) \
17 17 		__list_del((waiter)->list.prev, (waiter)->list.next)
18 18
19 - #ifdef CONFIG_SMP
19 + #ifdef CONFIG_MUTEX_SPIN_ON_OWNER
20 20 static inline void mutex_set_owner(struct mutex *lock)
21 21 {
22 22 	lock->owner = current;
+1 -1
kernel/locking/rtmutex.c
···
8 8  * Copyright (C) 2005 Kihon Technologies Inc., Steven Rostedt
9 9  * Copyright (C) 2006 Esben Nielsen
10 10  *
11 -  * See Documentation/rt-mutex-design.txt for details.
11 +  * See Documentation/locking/rt-mutex-design.txt for details.
12 12  */
13 13 #include <linux/spinlock.h>
14 14 #include <linux/export.h>
+14 -13
kernel/locking/rwsem-xadd.c
···
246 246
247 247 	return sem;
248 248 }
249 + EXPORT_SYMBOL(rwsem_down_read_failed);
249 250
250 251 static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem)
251 252 {
252 -	if (!(count & RWSEM_ACTIVE_MASK)) {
253 -		/* try acquiring the write lock */
254 -		if (sem->count == RWSEM_WAITING_BIAS &&
255 -		    cmpxchg(&sem->count, RWSEM_WAITING_BIAS,
256 -			    RWSEM_ACTIVE_WRITE_BIAS) == RWSEM_WAITING_BIAS) {
257 -			if (!list_is_singular(&sem->wait_list))
258 -				rwsem_atomic_update(RWSEM_WAITING_BIAS, sem);
259 -			return true;
260 -		}
253 +	/*
254 +	 * Try acquiring the write lock. Check count first in order
255 +	 * to reduce unnecessary expensive cmpxchg() operations.
256 +	 */
257 +	if (count == RWSEM_WAITING_BIAS &&
258 +	    cmpxchg(&sem->count, RWSEM_WAITING_BIAS,
259 +		    RWSEM_ACTIVE_WRITE_BIAS) == RWSEM_WAITING_BIAS) {
260 +		if (!list_is_singular(&sem->wait_list))
261 +			rwsem_atomic_update(RWSEM_WAITING_BIAS, sem);
262 +		return true;
261 263 	}
264 +
262 265 	return false;
263 266 }
···
468 465
469 466 	return sem;
470 467 }
468 + EXPORT_SYMBOL(rwsem_down_write_failed);
471 469
472 470 /*
473 471  * handle waking up a waiter on the semaphore
···
489 485
490 486 	return sem;
491 487 }
488 + EXPORT_SYMBOL(rwsem_wake);
492 489
493 490 /*
494 491  * downgrade a write lock into a read lock
···
511 506
512 507 	return sem;
513 508 }
514 -
515 - EXPORT_SYMBOL(rwsem_down_read_failed);
516 - EXPORT_SYMBOL(rwsem_down_write_failed);
517 - EXPORT_SYMBOL(rwsem_wake);
518 509 EXPORT_SYMBOL(rwsem_downgrade_wake);
+6 -6
kernel/locking/semaphore.c
···
36 36 static noinline void __down(struct semaphore *sem);
37 37 static noinline int __down_interruptible(struct semaphore *sem);
38 38 static noinline int __down_killable(struct semaphore *sem);
39 - static noinline int __down_timeout(struct semaphore *sem, long jiffies);
39 + static noinline int __down_timeout(struct semaphore *sem, long timeout);
40 40 static noinline void __up(struct semaphore *sem);
41 41
42 42 /**
···
145 145 /**
146 146  * down_timeout - acquire the semaphore within a specified time
147 147  * @sem: the semaphore to be acquired
148 -  * @jiffies: how long to wait before failing
148 +  * @timeout: how long to wait before failing
149 149  *
150 150  * Attempts to acquire the semaphore. If no more tasks are allowed to
151 151  * acquire the semaphore, calling this function will put the task to sleep.
152 152  * If the semaphore is not released within the specified number of jiffies,
153 153  * this function returns -ETIME. It returns 0 if the semaphore was acquired.
154 154  */
155 - int down_timeout(struct semaphore *sem, long jiffies)
155 + int down_timeout(struct semaphore *sem, long timeout)
156 156 {
157 157 	unsigned long flags;
158 158 	int result = 0;
···
161 161 	if (likely(sem->count > 0))
162 162 		sem->count--;
163 163 	else
164 -		result = __down_timeout(sem, jiffies);
164 +		result = __down_timeout(sem, timeout);
165 165 	raw_spin_unlock_irqrestore(&sem->lock, flags);
166 166
167 167 	return result;
···
248 248 	return __down_common(sem, TASK_KILLABLE, MAX_SCHEDULE_TIMEOUT);
249 249 }
250 250
251 - static noinline int __sched __down_timeout(struct semaphore *sem, long jiffies)
251 + static noinline int __sched __down_timeout(struct semaphore *sem, long timeout)
252 252 {
253 -	return __down_common(sem, TASK_UNINTERRUPTIBLE, jiffies);
253 +	return __down_common(sem, TASK_UNINTERRUPTIBLE, timeout);
254 254 }
255 255
256 256 static noinline void __sched __up(struct semaphore *sem)
-16
kernel/sched/core.c
···
90 90 #define CREATE_TRACE_POINTS
91 91 #include <trace/events/sched.h>
92 92
93 - #ifdef smp_mb__before_atomic
94 - void __smp_mb__before_atomic(void)
95 - {
96 -	smp_mb__before_atomic();
97 - }
98 - EXPORT_SYMBOL(__smp_mb__before_atomic);
99 - #endif
100 -
101 - #ifdef smp_mb__after_atomic
102 - void __smp_mb__after_atomic(void)
103 - {
104 -	smp_mb__after_atomic();
105 - }
106 - EXPORT_SYMBOL(__smp_mb__after_atomic);
107 - #endif
108 -
109 93 void start_bandwidth_timer(struct hrtimer *period_timer, ktime_t period)
110 94 {
111 95 	unsigned long delta;
+2 -2
lib/Kconfig.debug
···
952 952 	  the proof of observed correctness is also maintained for an
953 953 	  arbitrary combination of these separate locking variants.
954 954
955 -	  For more details, see Documentation/lockdep-design.txt.
955 +	  For more details, see Documentation/locking/lockdep-design.txt.
956 956
957 957 config LOCKDEP
958 958 	bool
···
973 973 	help
974 974 	  This feature enables tracking lock contention points
975 975
976 -	  For more details, see Documentation/lockstat.txt
976 +	  For more details, see Documentation/locking/lockstat.txt
977 977
978 978 	  This also enables lock events required by "perf lock",
979 979 	  subcommand of perf.