Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

locking/rwsem: Reduce spinlock contention in wakeup after up_read()/up_write()

In up_write()/up_read(), rwsem_wake() is called whenever it detects
that some writers/readers are waiting. rwsem_wake() takes the
wait_lock and calls __rwsem_do_wake() to do the real wakeup. For a
heavily contended rwsem, doing a spin_lock() on wait_lock causes
further contention on the already contended rwsem cacheline, delaying
completion of the up_read()/up_write() operations.

This patch makes taking the wait_lock and calling __rwsem_do_wake()
optional when at least one spinning writer is present. The spinning
writer will take the rwsem itself and call rwsem_wake() later, when
it calls up_write(). With a spinning writer present, rwsem_wake()
now tries to acquire the wait_lock with a trylock; if that fails, it
simply quits.
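The decision logic can be sketched in userspace C. This is a simplified analogue under stated assumptions, not kernel code: the struct, the `fake_` names, and the use of a pthread mutex in place of the wait_lock spinlock are all hypothetical stand-ins, and the acquire fence models smp_rmb().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel rwsem fields used here. */
struct fake_rwsem {
	atomic_int osq_tail;       /* non-zero while a writer spins (OSQ tail) */
	pthread_mutex_t wait_lock; /* protects the waiter list */
	int wakeups;               /* counts wakeups actually performed */
};

static bool fake_rwsem_has_spinner(struct fake_rwsem *sem)
{
	return atomic_load(&sem->osq_tail) != 0;
}

/*
 * Models the patched rwsem_wake(): with a spinner present, only a
 * successful trylock leads to a wakeup; on failure we rely on the
 * spinner to take the rwsem and issue the wakeup from up_write().
 */
static void fake_rwsem_wake(struct fake_rwsem *sem)
{
	if (fake_rwsem_has_spinner(sem)) {
		/* stands in for smp_rmb(): order spinner check before lock */
		atomic_thread_fence(memory_order_acquire);
		if (pthread_mutex_trylock(&sem->wait_lock) != 0)
			return; /* contended: skip, spinner will wake later */
	} else {
		pthread_mutex_lock(&sem->wait_lock);
	}
	sem->wakeups++; /* stands in for __rwsem_do_wake() */
	pthread_mutex_unlock(&sem->wait_lock);
}
```

The point of the sketch is the asymmetry: without a spinner the waker must block on the lock and wake someone, but with a spinner the wakeup duty can be safely handed off whenever the lock is contended.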

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Jason Low <jason.low2@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Douglas Hatch <doug.hatch@hp.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1430428337-16802-2-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

authored by Waiman Long, committed by Ingo Molnar
commit 59aabfc7 (parent 3e0283a5)

include/linux/osq_lock.h (+5 lines; 49 insertions total)
--- a/include/linux/osq_lock.h
+++ b/include/linux/osq_lock.h
@@ -32,4 +32,9 @@
 extern bool osq_lock(struct optimistic_spin_queue *lock);
 extern void osq_unlock(struct optimistic_spin_queue *lock);
 
+static inline bool osq_is_locked(struct optimistic_spin_queue *lock)
+{
+	return atomic_read(&lock->tail) != OSQ_UNLOCKED_VAL;
+}
+
 #endif
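The helper above relies on the OSQ invariant that tail equals OSQ_UNLOCKED_VAL (0 in the kernel) exactly when no CPU holds or is queued on the optimistic-spin queue. A minimal userspace analogue of that check, with the kernel's atomic_read() replaced by a C11 atomic_load() (the struct here is a hypothetical stand-in):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define OSQ_UNLOCKED_VAL 0 /* matches the kernel's definition */

/* Userspace analogue of struct optimistic_spin_queue. */
struct osq {
	atomic_int tail; /* encoded CPU number of the queue tail, 0 = empty */
};

/* Mirrors the new osq_is_locked(): any non-zero tail means a spinner. */
static bool osq_is_locked(struct osq *lock)
{
	return atomic_load(&lock->tail) != OSQ_UNLOCKED_VAL;
}
```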
kernel/locking/rwsem-xadd.c (+44 lines)
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -409,8 +409,21 @@
 	return taken;
 }
 
+/*
+ * Return true if the rwsem has active spinner
+ */
+static inline bool rwsem_has_spinner(struct rw_semaphore *sem)
+{
+	return osq_is_locked(&sem->osq);
+}
+
 #else
 static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
+{
+	return false;
+}
+
+static inline bool rwsem_has_spinner(struct rw_semaphore *sem)
 {
 	return false;
 }
@@ -509,7 +496,38 @@
 {
 	unsigned long flags;
 
+	/*
+	 * If a spinner is present, it is not necessary to do the wakeup.
+	 * Try to do wakeup only if the trylock succeeds to minimize
+	 * spinlock contention which may introduce too much delay in the
+	 * unlock operation.
+	 *
+	 *    spinning writer           up_write/up_read caller
+	 *    ---------------           -----------------------
+	 * [S]   osq_unlock()           [L]   osq
+	 *       MB                           RMB
+	 * [RmW] rwsem_try_write_lock() [RmW] spin_trylock(wait_lock)
+	 *
+	 * Here, it is important to make sure that there won't be a missed
+	 * wakeup while the rwsem is free and the only spinning writer goes
+	 * to sleep without taking the rwsem. Even when the spinning writer
+	 * is just going to break out of the waiting loop, it will still do
+	 * a trylock in rwsem_down_write_failed() before sleeping. IOW, if
+	 * rwsem_has_spinner() is true, it will guarantee at least one
+	 * trylock attempt on the rwsem later on.
+	 */
+	if (rwsem_has_spinner(sem)) {
+		/*
+		 * The smp_rmb() here is to make sure that the spinner
+		 * state is consulted before reading the wait_lock.
+		 */
+		smp_rmb();
+		if (!raw_spin_trylock_irqsave(&sem->wait_lock, flags))
+			return sem;
+		goto locked;
+	}
 	raw_spin_lock_irqsave(&sem->wait_lock, flags);
+locked:
 
 	/* do nothing if list empty */
 	if (!list_empty(&sem->wait_list))
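The store/load pairing in the comment above can be modeled with C11 atomics: the spinner publishes its OSQ exit with release semantics (the MB paired with osq_unlock()), and the waker orders its spinner check before the wait_lock trylock with an acquire fence (the smp_rmb() analogue). A sketch of the two decision paths under these assumptions, with hypothetical names and plain atomics standing in for the kernel spinlock:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int osq_tail;  /* non-zero while a writer spins */
static atomic_int wait_lock; /* 0 = free, 1 = held */

/* Spinner side: leave the OSQ with release semantics, then attempt
 * the lock again, as rwsem_down_write_failed() guarantees it will. */
static bool spinner_exit_then_trylock(void)
{
	atomic_store_explicit(&osq_tail, 0, memory_order_release);
	int expected = 0;
	return atomic_compare_exchange_strong(&wait_lock, &expected, 1);
}

/* Waker side: spinner check first, acquire fence (smp_rmb() analogue),
 * then the wait_lock trylock; skip the wakeup only when it fails. */
static bool waker_skips_wakeup(void)
{
	if (atomic_load_explicit(&osq_tail, memory_order_relaxed) != 0) {
		atomic_thread_fence(memory_order_acquire);
		int expected = 0;
		if (!atomic_compare_exchange_strong(&wait_lock, &expected, 1))
			return true; /* spinner will do a trylock later */
		atomic_store(&wait_lock, 0); /* got the lock: wake as usual */
	}
	return false;
}
```

The ordering matters because the skipped wakeup is only safe while the spinner check and the lock attempt cannot be reordered past each other on either side; otherwise the rwsem could sit free with a sleeping writer and no one left to wake it.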