Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

locking/spinlocks: Remove an instruction from spin and write locks

Both spin locks and write locks currently do:

f0 0f b1 17 lock cmpxchg %edx,(%rdi)
85 c0 test %eax,%eax
75 05 jne [slowpath]

This 'test' insn is superfluous; the cmpxchg insn sets the Z flag
appropriately. Peter pointed out that using atomic_try_cmpxchg_acquire()
lets the compiler know this is true. Comparing before/after
disassemblies shows that the only effect is the removal of this insn.

Take this opportunity to make the spin & write lock code resemble each
other more closely and have similar likely() hints.
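For readers outside the kernel tree, the try-cmpxchg pattern can be sketched
in user-space C11 atomics. This is an illustrative stand-in, not the kernel
code: 'toy_spinlock', 'toy_spin_trylock' and 'Q_LOCKED_VAL' are hypothetical
names, and atomic_compare_exchange_strong_explicit() plays the role of
atomic_try_cmpxchg_acquire() -- it returns a boolean and, on failure, writes
the observed value back into 'expected', so the compiler can branch directly
on the flags left by the lock cmpxchg insn with no separate 'test'.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define Q_LOCKED_VAL 1u	/* stand-in for the kernel's _Q_LOCKED_VAL */

struct toy_spinlock {
	atomic_uint val;
};

static inline bool toy_spin_trylock(struct toy_spinlock *lock)
{
	unsigned int val = atomic_load_explicit(&lock->val, memory_order_relaxed);

	/* Lock already held: fail without touching the cache line atomically. */
	if (val)
		return false;

	/* Succeeds iff the 0 -> Q_LOCKED_VAL transition happened; acquire on
	 * success, relaxed on failure (we only report failure). */
	return atomic_compare_exchange_strong_explicit(&lock->val, &val,
						       Q_LOCKED_VAL,
						       memory_order_acquire,
						       memory_order_relaxed);
}
```

Compare this with the equivalent written as `cmpxchg(...) == old`: there the
boolean is recomputed from the returned value, which is what produced the
redundant 'test %eax,%eax' in the first place.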

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Link: http://lkml.kernel.org/r/20180820162639.GC25153@bombadil.infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Authored by Matthew Wilcox, committed by Ingo Molnar (27df8968, cb92173d).

 include/asm-generic/qrwlock.h   | +4 -3
 include/asm-generic/qspinlock.h | +9 -7
 2 files changed, 13 insertions(+), 10 deletions(-)

--- a/include/asm-generic/qrwlock.h
+++ b/include/asm-generic/qrwlock.h
@@ -71,8 +71,8 @@
 	if (unlikely(cnts))
 		return 0;
 
-	return likely(atomic_cmpxchg_acquire(&lock->cnts,
-				cnts, cnts | _QW_LOCKED) == cnts);
+	return likely(atomic_try_cmpxchg_acquire(&lock->cnts, &cnts,
+				_QW_LOCKED));
 }
 /**
  * queued_read_lock - acquire read lock of a queue rwlock
@@ -96,8 +96,9 @@
  */
 static inline void queued_write_lock(struct qrwlock *lock)
 {
+	u32 cnts = 0;
 	/* Optimize for the unfair lock case where the fair flag is 0. */
-	if (atomic_cmpxchg_acquire(&lock->cnts, 0, _QW_LOCKED) == 0)
+	if (likely(atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED)))
 		return;
 
 	queued_write_lock_slowpath(lock);

--- a/include/asm-generic/qspinlock.h
+++ b/include/asm-generic/qspinlock.h
@@ -66,10 +66,12 @@
  */
 static __always_inline int queued_spin_trylock(struct qspinlock *lock)
 {
-	if (!atomic_read(&lock->val) &&
-	   (atomic_cmpxchg_acquire(&lock->val, 0, _Q_LOCKED_VAL) == 0))
-		return 1;
-	return 0;
+	u32 val = atomic_read(&lock->val);
+
+	if (unlikely(val))
+		return 0;
+
+	return likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL));
 }
 
 extern void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);
@@ -80,10 +82,10 @@
  */
 static __always_inline void queued_spin_lock(struct qspinlock *lock)
 {
-	u32 val;
+	u32 val = 0;
 
-	val = atomic_cmpxchg_acquire(&lock->val, 0, _Q_LOCKED_VAL);
-	if (likely(val == 0))
+	if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL)))
 		return;
+
 	queued_spin_lock_slowpath(lock, val);
 }