Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mutex: speed up generic mutex implementations

- Atomic operations which both modify the variable and return a value imply
full SMP memory barriers before and after the memory operations involved
(a failing atomic_cmpxchg, atomic_add_unless, etc. does not imply a barrier,
because it does not modify the target). See Documentation/atomic_ops.txt.
So remove the extra barriers and branches.

- All architectures support atomic_cmpxchg; this has no relation to
__HAVE_ARCH_CMPXCHG. We can just take the atomic_cmpxchg path unconditionally.

This reduces a simple single-threaded fastpath lock+unlock test from 590 cycles
to 203 cycles on a ppc970 system.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Nick Piggin and committed by Linus Torvalds
a8ddac7e 5a439c56

+3 -32

include/asm-generic/mutex-dec.h (+2 -24)
@@ -22,8 +22,6 @@
 {
 	if (unlikely(atomic_dec_return(count) < 0))
 		fail_fn(count);
-	else
-		smp_mb();
 }
 
 /**
@@ -39,10 +41,7 @@
 {
 	if (unlikely(atomic_dec_return(count) < 0))
 		return fail_fn(count);
-	else {
-		smp_mb();
-		return 0;
-	}
+	return 0;
 }
 
 /**
@@ -58,7 +63,6 @@
 static inline void
 __mutex_fastpath_unlock(atomic_t *count, void (*fail_fn)(atomic_t *))
 {
-	smp_mb();
 	if (unlikely(atomic_inc_return(count) <= 0))
 		fail_fn(count);
 }
@@ -82,25 +88,9 @@
 static inline int
 __mutex_fastpath_trylock(atomic_t *count, int (*fail_fn)(atomic_t *))
 {
-	/*
-	 * We have two variants here. The cmpxchg based one is the best one
-	 * because it never induce a false contention state. It is included
-	 * here because architectures using the inc/dec algorithms over the
-	 * xchg ones are much more likely to support cmpxchg natively.
-	 *
-	 * If not we fall back to the spinlock based variant - that is
-	 * just as efficient (and simpler) as a 'destructive' probing of
-	 * the mutex state would be.
-	 */
-#ifdef __HAVE_ARCH_CMPXCHG
-	if (likely(atomic_cmpxchg(count, 1, 0) == 1)) {
-		smp_mb();
+	if (likely(atomic_cmpxchg(count, 1, 0) == 1))
 		return 1;
-	}
 	return 0;
-#else
-	return fail_fn(count);
-#endif
 }
 
 #endif
include/asm-generic/mutex-xchg.h (+1 -8)
@@ -27,8 +27,6 @@
 {
 	if (unlikely(atomic_xchg(count, 0) != 1))
 		fail_fn(count);
-	else
-		smp_mb();
 }
 
 /**
@@ -44,10 +46,7 @@
 {
 	if (unlikely(atomic_xchg(count, 0) != 1))
 		return fail_fn(count);
-	else {
-		smp_mb();
-		return 0;
-	}
+	return 0;
 }
 
 /**
@@ -62,7 +67,6 @@
 static inline void
 __mutex_fastpath_unlock(atomic_t *count, void (*fail_fn)(atomic_t *))
 {
-	smp_mb();
 	if (unlikely(atomic_xchg(count, 1) != 0))
 		fail_fn(count);
 }
@@ -104,7 +110,6 @@
 		if (prev < 0)
 			prev = 0;
 	}
-	smp_mb();
 
 	return prev;
 }