Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

rcu/nocb: Invoke rcu_core() at the start of deoffloading

On PREEMPT_RT, if rcu_core() is preempted by the de-offloading process,
some work, such as callback acceleration and invocation, may be left
unattended due to the volatile checks on the offloaded state.

In the worst case, this work is postponed until the next rcu_pending()
check, which can take up to a jiffy to arrive, and that delay can be a
problem in case of callback flooding.

Solve this by invoking rcu_core() early in the de-offloading process.
This way, any work dismissed by an ongoing rcu_core() call that was
fooled by a preempting de-offloading process will be caught up by a
nearby future call to rcu_core(), this time fully aware of the
de-offloading state.

Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

Authored by Frederic Weisbecker, committed by Paul E. McKenney
fbb94cbd 213d56bf

4 files changed, 42 insertions(+), 4 deletions(-)
include/linux/rcu_segcblist.h (+14)

@@ -136,6 +136,20 @@
  * |--------------------------------------------------------------------------|
  * |  SEGCBLIST_RCU_CORE   |                                                  |
  * |  SEGCBLIST_LOCKING    |                                                  |
+ * |  SEGCBLIST_OFFLOADED  |                                                  |
+ * |  SEGCBLIST_KTHREAD_CB |                                                  |
+ * |  SEGCBLIST_KTHREAD_GP                                                    |
+ * |                                                                          |
+ * |  CB/GP kthreads handle callbacks holding nocb_lock, local rcu_core()     |
+ * |  handles callbacks concurrently. Bypass enqueue is enabled.              |
+ * |  Invoke RCU core so we make sure not to preempt it in the middle with    |
+ * |  leaving some urgent work unattended within a jiffy.                     |
+ * ----------------------------------------------------------------------------
+ * |
+ * v
+ * |--------------------------------------------------------------------------|
+ * |  SEGCBLIST_RCU_CORE   |                                                  |
+ * |  SEGCBLIST_LOCKING    |                                                  |
  * |  SEGCBLIST_KTHREAD_CB |                                                  |
  * |  SEGCBLIST_KTHREAD_GP                                                    |
  * |                                                                          |
kernel/rcu/rcu_segcblist.c (+2 -4)

@@ -265,12 +265,10 @@
  */
 void rcu_segcblist_offload(struct rcu_segcblist *rsclp, bool offload)
 {
-	if (offload) {
+	if (offload)
 		rcu_segcblist_set_flags(rsclp, SEGCBLIST_LOCKING | SEGCBLIST_OFFLOADED);
-	} else {
-		rcu_segcblist_set_flags(rsclp, SEGCBLIST_RCU_CORE);
+	else
 		rcu_segcblist_clear_flags(rsclp, SEGCBLIST_OFFLOADED);
-	}
 }
 
 /*
kernel/rcu/tree.c (+17)

@@ -2707,6 +2707,23 @@
 	unsigned long flags;
 	struct rcu_data *rdp = raw_cpu_ptr(&rcu_data);
 	struct rcu_node *rnp = rdp->mynode;
+	/*
+	 * On RT rcu_core() can be preempted when IRQs aren't disabled.
+	 * Therefore this function can race with concurrent NOCB (de-)offloading
+	 * on this CPU and the below condition must be considered volatile.
+	 * However if we race with:
+	 *
+	 * _ Offloading:   In the worst case we accelerate or process callbacks
+	 *                 concurrently with NOCB kthreads. We are guaranteed to
+	 *                 call rcu_nocb_lock() if that happens.
+	 *
+	 * _ Deoffloading: In the worst case we miss callbacks acceleration or
+	 *                 processing. This is fine because the early stage
+	 *                 of deoffloading invokes rcu_core() after setting
+	 *                 SEGCBLIST_RCU_CORE. So we guarantee that we'll process
+	 *                 what could have been dismissed without the need to wait
+	 *                 for the next rcu_pending() check in the next jiffy.
+	 */
 	const bool do_batch = !rcu_segcblist_completely_offloaded(&rdp->cblist);
 
 	if (cpu_is_offline(smp_processor_id()))
kernel/rcu/tree_nocb.h (+9)

@@ -990,6 +990,15 @@
 	 * will refuse to put anything into the bypass.
 	 */
 	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
+	/*
+	 * Start with invoking rcu_core() early. This way if the current thread
+	 * happens to preempt an ongoing call to rcu_core() in the middle,
+	 * leaving some work dismissed because rcu_core() still thinks the rdp is
+	 * completely offloaded, we are guaranteed a nearby future instance of
+	 * rcu_core() to catch up.
+	 */
+	rcu_segcblist_set_flags(cblist, SEGCBLIST_RCU_CORE);
+	invoke_rcu_core();
 	ret = rdp_offload_toggle(rdp, false, flags);
 	swait_event_exclusive(rdp->nocb_state_wq,
 			      !rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB |