Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

rcu: Optionally run grace-period kthreads at real-time priority

Recent testing has shown that under heavy load, running RCU's grace-period
kthreads at real-time priority can improve performance (according to the 0day
test robot) and reduce the incidence of RCU CPU stall warnings. However,
most systems do just fine with the default non-realtime priorities for
these kthreads, and it does not make sense to expose the entire user
base to any risk stemming from this change, given that this change is
of use only to a few users running extremely heavy workloads.

Therefore, this commit allows users to specify realtime priorities
for the grace-period kthreads, but leaves them running SCHED_OTHER
by default. The realtime priority may be specified at build time
via the RCU_KTHREAD_PRIO Kconfig parameter, or at boot time via the
rcutree.kthread_prio parameter. Either way, a value of 0 retains the
default SCHED_OTHER behavior, and values from 1 to 99 select SCHED_FIFO
at that priority. Note that a value of 0 is not permitted when the
RCU_BOOST Kconfig parameter is enabled.
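
The range rules above can be sketched as a small userspace function. This
is only an illustration, not the kernel code: clamp_kthread_prio() and its
boost_enabled argument are hypothetical names, with boost_enabled standing
in for IS_ENABLED(CONFIG_RCU_BOOST).

```c
#include <stdbool.h>

/*
 * Userspace sketch of the priority range check described above.
 * clamp_kthread_prio() and boost_enabled are hypothetical names used
 * for illustration; boost_enabled stands in for the kernel's
 * IS_ENABLED(CONFIG_RCU_BOOST) test.
 */
static int clamp_kthread_prio(int prio, bool boost_enabled)
{
	if (boost_enabled && prio < 1)
		return 1;	/* RCU_BOOST needs a real-time priority */
	if (prio < 0)
		return 0;	/* 0 keeps the default SCHED_OTHER policy */
	if (prio > 99)
		return 99;	/* cap at the top of the 1-99 range */
	return prio;		/* already in range: used as given */
}
```

Under these assumptions, a requested value of 0 stays 0 without boosting
but becomes 1 with boosting, and any value above 99 is limited to 99.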

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

+27 -8
+4 -3
init/Kconfig
···

 config RCU_KTHREAD_PRIO
 	int "Real-time priority to use for RCU worker threads"
-	range 1 99
-	depends on RCU_BOOST
-	default 1
+	range 1 99 if RCU_BOOST
+	range 0 99 if !RCU_BOOST
+	default 1 if RCU_BOOST
+	default 0 if !RCU_BOOST
 	help
 	  This option specifies the SCHED_FIFO priority value that will be
 	  assigned to the rcuc/n and rcub/n threads and is also the value
+23 -1
kernel/rcu/tree.c
···
 static void invoke_rcu_core(void);
 static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp);

+/* rcuc/rcub kthread realtime priority */
+static int kthread_prio = CONFIG_RCU_KTHREAD_PRIO;
+module_param(kthread_prio, int, 0644);
+
 /*
  * Track the rcutorture test sequence number and the update version
  * number within a given test. The rcutorture_testseq is incremented
···
 static int __init rcu_spawn_gp_kthread(void)
 {
 	unsigned long flags;
+	int kthread_prio_in = kthread_prio;
 	struct rcu_node *rnp;
 	struct rcu_state *rsp;
+	struct sched_param sp;
 	struct task_struct *t;
+
+	/* Force priority into range. */
+	if (IS_ENABLED(CONFIG_RCU_BOOST) && kthread_prio < 1)
+		kthread_prio = 1;
+	else if (kthread_prio < 0)
+		kthread_prio = 0;
+	else if (kthread_prio > 99)
+		kthread_prio = 99;
+	if (kthread_prio != kthread_prio_in)
+		pr_alert("rcu_spawn_gp_kthread(): Limited prio to %d from %d\n",
+			 kthread_prio, kthread_prio_in);

 	rcu_scheduler_fully_active = 1;
 	for_each_rcu_flavor(rsp) {
-		t = kthread_run(rcu_gp_kthread, rsp, "%s", rsp->name);
+		t = kthread_create(rcu_gp_kthread, rsp, "%s", rsp->name);
 		BUG_ON(IS_ERR(t));
 		rnp = rcu_get_root(rsp);
 		raw_spin_lock_irqsave(&rnp->lock, flags);
 		rsp->gp_kthread = t;
+		if (kthread_prio) {
+			sp.sched_priority = kthread_prio;
+			sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
+		}
+		wake_up_process(t);
 		raw_spin_unlock_irqrestore(&rnp->lock, flags);
 	}
 	rcu_spawn_nocb_kthreads();
-4
kernel/rcu/tree_plugin.h
···

 #include "../locking/rtmutex_common.h"

-/* rcuc/rcub kthread realtime priority */
-static int kthread_prio = CONFIG_RCU_KTHREAD_PRIO;
-module_param(kthread_prio, int, 0644);
-
 /*
  * Control variables for per-CPU and per-rcu_node kthreads. These
  * handle all flavors of RCU.