Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branches 'context_tracking.15.08.24a', 'csd.lock.15.08.24a', 'nocb.09.09.24a', 'rcutorture.14.08.24a', 'rcustall.09.09.24a', 'srcu.12.08.24a', 'rcu.tasks.14.08.24a', 'rcu_scaling_tests.15.08.24a', 'fixes.12.08.24a' and 'misc.11.08.24a' into next.09.09.24a

+831 -544
+1 -2
Documentation/RCU/Design/Requirements/Requirements.rst
···
2649 2649   be removed from the kernel.
2650 2650
2651 2651   The tasks-rude-RCU API is also reader-marking-free and thus quite compact,
2652      -  consisting of call_rcu_tasks_rude(), synchronize_rcu_tasks_rude(),
2653      -  and rcu_barrier_tasks_rude().
     2652 +  consisting solely of synchronize_rcu_tasks_rude().
2654 2653
2655 2654   Tasks Trace RCU
2656 2655   ~~~~~~~~~~~~~~~
+28 -33
Documentation/RCU/checklist.rst
···
194 194 		when publicizing a pointer to a structure that can
195 195 		be traversed by an RCU read-side critical section.
196 196
197     - 5.	If any of call_rcu(), call_srcu(), call_rcu_tasks(),
198     - 	call_rcu_tasks_rude(), or call_rcu_tasks_trace() is used,
199     - 	the callback function may be invoked from softirq context,
200     - 	and in any case with bottom halves disabled.  In particular,
201     - 	this callback function cannot block.  If you need the callback
202     - 	to block, run that code in a workqueue handler scheduled from
203     - 	the callback.  The queue_rcu_work() function does this for you
204     - 	in the case of call_rcu().
    197 + 5.	If any of call_rcu(), call_srcu(), call_rcu_tasks(), or
    198 + 	call_rcu_tasks_trace() is used, the callback function may be
    199 + 	invoked from softirq context, and in any case with bottom halves
    200 + 	disabled.  In particular, this callback function cannot block.
    201 + 	If you need the callback to block, run that code in a workqueue
    202 + 	handler scheduled from the callback.  The queue_rcu_work()
    203 + 	function does this for you in the case of call_rcu().
205 204
206 205 6.	Since synchronize_rcu() can block, it cannot be called
207 206 	from any sort of irq context.  The same rule applies
···
253 254 		corresponding readers must use rcu_read_lock_trace()
254 255 		and rcu_read_unlock_trace().
255 256
256     - 	c.	If an updater uses call_rcu_tasks_rude() or
257     - 		synchronize_rcu_tasks_rude(), then the corresponding
258     - 		readers must use anything that disables preemption,
259     - 		for example, preempt_disable() and preempt_enable().
    257 + 	c.	If an updater uses synchronize_rcu_tasks_rude(),
    258 + 		then the corresponding readers must use anything that
    259 + 		disables preemption, for example, preempt_disable()
    260 + 		and preempt_enable().
260 261
261 262 	Mixing things up will result in confusion and broken kernels, and
262 263 	has even resulted in an exploitable security issue.  Therefore,
···
325 326 	d.	Periodically invoke rcu_barrier(), permitting a limited
326 327 		number of updates per grace period.
327 328
328     - 	The same cautions apply to call_srcu(), call_rcu_tasks(),
329     - 	call_rcu_tasks_rude(), and call_rcu_tasks_trace().  This is
330     - 	why there is an srcu_barrier(), rcu_barrier_tasks(),
331     - 	rcu_barrier_tasks_rude(), and rcu_barrier_tasks_rude(),
332     - 	respectively.
    329 + 	The same cautions apply to call_srcu(), call_rcu_tasks(), and
    330 + 	call_rcu_tasks_trace().  This is why there is an srcu_barrier(),
    331 + 	rcu_barrier_tasks(), and rcu_barrier_tasks_trace(), respectively.
333 332
334 333 	Note that although these primitives do take action to avoid
335 334 	memory exhaustion when any given CPU has too many callbacks,
···
380 383 	must use whatever locking or other synchronization is required
381 384 	to safely access and/or modify that data structure.
382 385
383     - 	Do not assume that RCU callbacks will be executed on
384     - 	the same CPU that executed the corresponding call_rcu(),
385     - 	call_srcu(), call_rcu_tasks(), call_rcu_tasks_rude(), or
386     - 	call_rcu_tasks_trace().  For example, if a given CPU goes offline
387     - 	while having an RCU callback pending, then that RCU callback
388     - 	will execute on some surviving CPU.  (If this was not the case,
389     - 	a self-spawning RCU callback would prevent the victim CPU from
390     - 	ever going offline.)  Furthermore, CPUs designated by rcu_nocbs=
391     - 	might well *always* have their RCU callbacks executed on some
392     - 	other CPUs, in fact, for some real-time workloads, this is the
393     - 	whole point of using the rcu_nocbs= kernel boot parameter.
    386 + 	Do not assume that RCU callbacks will be executed on the same
    387 + 	CPU that executed the corresponding call_rcu(), call_srcu(),
    388 + 	call_rcu_tasks(), or call_rcu_tasks_trace().  For example, if
    389 + 	a given CPU goes offline while having an RCU callback pending,
    390 + 	then that RCU callback will execute on some surviving CPU.
    391 + 	(If this was not the case, a self-spawning RCU callback would
    392 + 	prevent the victim CPU from ever going offline.)  Furthermore,
    393 + 	CPUs designated by rcu_nocbs= might well *always* have their
    394 + 	RCU callbacks executed on some other CPUs, in fact, for some
    395 + 	real-time workloads, this is the whole point of using the
    396 + 	rcu_nocbs= kernel boot parameter.
394 397
395 398 	In addition, do not assume that callbacks queued in a given order
396 399 	will be invoked in that order, even if they all are queued on the
···
504 507 	These debugging aids can help you find problems that are
505 508 	otherwise extremely difficult to spot.
506 509
507     - 17.	If you pass a callback function defined within a module to one of
508     - 	call_rcu(), call_srcu(), call_rcu_tasks(), call_rcu_tasks_rude(),
509     - 	or call_rcu_tasks_trace(), then it is necessary to wait for all
    510 + 17.	If you pass a callback function defined within a module
    511 + 	to one of call_rcu(), call_srcu(), call_rcu_tasks(), or
    512 + 	call_rcu_tasks_trace(), then it is necessary to wait for all
510 513 	pending callbacks to be invoked before unloading that module.
511 514 	Note that it is absolutely *not* sufficient to wait for a grace
512 515 	period!  For example, synchronize_rcu() implementation is *not*
···
519 522 	- call_rcu() -> rcu_barrier()
520 523 	- call_srcu() -> srcu_barrier()
521 524 	- call_rcu_tasks() -> rcu_barrier_tasks()
522     - 	- call_rcu_tasks_rude() -> rcu_barrier_tasks_rude()
523 525 	- call_rcu_tasks_trace() -> rcu_barrier_tasks_trace()
524 526
525 527 	However, these barrier functions are absolutely *not* guaranteed
···
535 539 	- Either synchronize_srcu() or synchronize_srcu_expedited(),
536 540 	  together with and srcu_barrier()
537 541 	- synchronize_rcu_tasks() and rcu_barrier_tasks()
538     - 	- synchronize_tasks_rude() and rcu_barrier_tasks_rude()
539 542 	- synchronize_tasks_trace() and rcu_barrier_tasks_trace()
540 543
541 544 	If necessary, you can use something like workqueues to execute
+1 -1
Documentation/RCU/whatisRCU.rst
···
1103 1103
1104 1104    Critical sections	Grace period		Barrier
1105 1105
1106      -    N/A			call_rcu_tasks_rude	rcu_barrier_tasks_rude
     1106 +    N/A			N/A
1107 1107 				synchronize_rcu_tasks_rude
1108 1108
1109 1109
+11 -9
Documentation/admin-guide/kernel-parameters.txt
···
4937 4937 			Set maximum number of finished RCU callbacks to
4938 4938 			process in one batch.
4939 4939
     4940 + 	rcutree.csd_lock_suppress_rcu_stall= [KNL]
     4941 + 			Do only a one-line RCU CPU stall warning when
     4942 + 			there is an ongoing too-long CSD-lock wait.
     4943 +
4940 4944 	rcutree.do_rcu_barrier= [KNL]
4941 4945 			Request a call to rcu_barrier().  This is
4942 4946 			throttled so that userspace tests can safely
···
5388 5384 			Time to wait (s) after boot before inducing stall.
5389 5385
5390 5386 	rcutorture.stall_cpu_irqsoff= [KNL]
5391      - 			Disable interrupts while stalling if set.
     5387 + 			Disable interrupts while stalling if set, but only
     5388 + 			on the first stall in the set.
     5389 +
     5390 + 	rcutorture.stall_cpu_repeat= [KNL]
     5391 + 			Number of times to repeat the stall sequence,
     5392 + 			so that rcutorture.stall_cpu_repeat=3 will result
     5393 + 			in four stall sequences.
5392 5394
5393 5395 	rcutorture.stall_gp_kthread= [KNL]
5394 5396 			Duration (s) of forced sleep within RCU
···
5581 5571 			A negative value will take the default.  A value
5582 5572 			of zero will disable batching.  Batching is
5583 5573 			always disabled for synchronize_rcu_tasks().
5584      -
5585      - 	rcupdate.rcu_tasks_rude_lazy_ms= [KNL]
5586      - 			Set timeout in milliseconds RCU Tasks
5587      - 			Rude asynchronous callback batching for
5588      - 			call_rcu_tasks_rude().  A negative value
5589      - 			will take the default.  A value of zero will
5590      - 			disable batching.  Batching is always disabled
5591      - 			for synchronize_rcu_tasks_rude().
5592 5574
5593 5575 	rcupdate.rcu_tasks_trace_lazy_ms= [KNL]
5594 5576 			Set timeout in milliseconds RCU Tasks
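The two new rcutorture stall parameters above compose: stall_cpu_repeat counts *additional* stall sequences beyond the first, and stall_cpu_irqsoff now applies only to the first stall in the set. A hypothetical boot line illustrating the documented semantics (parameter values chosen for illustration only):

```
rcutorture.stall_cpu=10 rcutorture.stall_cpu_repeat=3 rcutorture.stall_cpu_irqsoff=1
```

Per the help text, this would produce four 10-second stall sequences, with interrupts disabled only during the first of them.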
+1 -5
include/linux/rcu_segcblist.h
···
185 185  * ----------------------------------------------------------------------------
186 186  */
187 187 #define SEGCBLIST_ENABLED	BIT(0)
188     - #define SEGCBLIST_RCU_CORE	BIT(1)
189     - #define SEGCBLIST_LOCKING	BIT(2)
190     - #define SEGCBLIST_KTHREAD_CB	BIT(3)
191     - #define SEGCBLIST_KTHREAD_GP	BIT(4)
192     - #define SEGCBLIST_OFFLOADED	BIT(5)
    188 + #define SEGCBLIST_OFFLOADED	BIT(1)
193 189
194 190 struct rcu_segcblist {
195 191 	struct rcu_head *head;
+7 -2
include/linux/rculist.h
···
191 191  * @old : the element to be replaced
192 192  * @new : the new element to insert
193 193  *
194     -  * The @old entry will be replaced with the @new entry atomically.
    194 +  * The @old entry will be replaced with the @new entry atomically from
    195 +  * the perspective of concurrent readers.  It is the caller's responsibility
    196 +  * to synchronize with concurrent updaters, if any.
    197 +  *
195 198  * Note: @old should not be empty.
196 199  */
197 200 static inline void list_replace_rcu(struct list_head *old,
···
522 519  * @old : the element to be replaced
523 520  * @new : the new element to insert
524 521  *
525     -  * The @old entry will be replaced with the @new entry atomically.
    522 +  * The @old entry will be replaced with the @new entry atomically from
    523 +  * the perspective of concurrent readers.  It is the caller's responsibility
    524 +  * to synchronize with concurrent updaters, if any.
526 525  */
527 526 static inline void hlist_replace_rcu(struct hlist_node *old,
528 527 	struct hlist_node *new)
+13 -2
include/linux/rcupdate.h
···
 34  34 #define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
 35  35 #define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))
 36  36
     37 + #define RCU_SEQ_CTR_SHIFT	2
     38 + #define RCU_SEQ_STATE_MASK	((1 << RCU_SEQ_CTR_SHIFT) - 1)
     39 +
 37  40 /* Exported common interfaces */
 38  41 void call_rcu(struct rcu_head *head, rcu_callback_t func);
 39  42 void rcu_barrier_tasks(void);
 40     - void rcu_barrier_tasks_rude(void);
 41  43 void synchronize_rcu(void);
 42  44
 43  45 struct rcu_gp_oldstate;
···
146 144 int rcu_nocb_cpu_offload(int cpu);
147 145 int rcu_nocb_cpu_deoffload(int cpu);
148 146 void rcu_nocb_flush_deferred_wakeup(void);
    147 +
    148 + #define RCU_NOCB_LOCKDEP_WARN(c, s)	RCU_LOCKDEP_WARN(c, s)
    149 +
149 150 #else /* #ifdef CONFIG_RCU_NOCB_CPU */
    151 +
150 152 static inline void rcu_init_nohz(void) { }
151 153 static inline int rcu_nocb_cpu_offload(int cpu) { return -EINVAL; }
152 154 static inline int rcu_nocb_cpu_deoffload(int cpu) { return 0; }
153 155 static inline void rcu_nocb_flush_deferred_wakeup(void) { }
    156 +
    157 + #define RCU_NOCB_LOCKDEP_WARN(c, s)
    158 +
154 159 #endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
155 160
156 161 /*
···
174 165 	} while (0)
175 166 void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func);
176 167 void synchronize_rcu_tasks(void);
    168 + void rcu_tasks_torture_stats_print(char *tt, char *tf);
177 169 # else
178 170 # define rcu_tasks_classic_qs(t, preempt) do { } while (0)
179 171 # define call_rcu_tasks call_rcu
···
201 191 			rcu_tasks_trace_qs_blkd(t);				\
202 192 		}								\
203 193 	} while (0)
    194 + void rcu_tasks_trace_torture_stats_print(char *tt, char *tf);
204 195 # else
205 196 # define rcu_tasks_trace_qs(t) do { } while (0)
206 197 # endif
···
213 202 	} while (0)
214 203
215 204 # ifdef CONFIG_TASKS_RUDE_RCU
216     - void call_rcu_tasks_rude(struct rcu_head *head, rcu_callback_t func);
217 205 void synchronize_rcu_tasks_rude(void);
    206 + void rcu_tasks_rude_torture_stats_print(char *tt, char *tf);
218 207 # endif
219 208
220 209 #define rcu_note_voluntary_context_switch(t) rcu_tasks_qs(t, false)
+6
include/linux/smp.h
···
294 294 int smpcfd_dead_cpu(unsigned int cpu);
295 295 int smpcfd_dying_cpu(unsigned int cpu);
296 296
    297 + #ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
    298 + bool csd_lock_is_stuck(void);
    299 + #else
    300 + static inline bool csd_lock_is_stuck(void) { return false; }
    301 + #endif
    302 +
297 303 #endif /* __LINUX_SMP_H */
+14 -1
include/linux/srcutree.h
···
129 129 #define SRCU_STATE_SCAN1	1
130 130 #define SRCU_STATE_SCAN2	2
131 131
    132 + /*
    133 +  * Values for initializing gp sequence fields. Higher values allow wrap arounds to
    134 +  * occur earlier.
    135 +  * The second value with state is useful in the case of static initialization of
    136 +  * srcu_usage where srcu_gp_seq_needed is expected to have some state value in its
    137 +  * lower bits (or else it will appear to be already initialized within
    138 +  * the call check_init_srcu_struct()).
    139 +  */
    140 + #define SRCU_GP_SEQ_INITIAL_VAL ((0UL - 100UL) << RCU_SEQ_CTR_SHIFT)
    141 + #define SRCU_GP_SEQ_INITIAL_VAL_WITH_STATE (SRCU_GP_SEQ_INITIAL_VAL - 1)
    142 +
132 143 #define __SRCU_USAGE_INIT(name)							\
133 144 {										\
134 145 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),				\
135     - 	.srcu_gp_seq_needed = -1UL,						\
    146 + 	.srcu_gp_seq = SRCU_GP_SEQ_INITIAL_VAL,					\
    147 + 	.srcu_gp_seq_needed = SRCU_GP_SEQ_INITIAL_VAL_WITH_STATE,		\
    148 + 	.srcu_gp_seq_needed_exp = SRCU_GP_SEQ_INITIAL_VAL,			\
136 149 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),			\
137 150 }
+5 -3
kernel/rcu/rcu.h
···
 54  54  * grace-period sequence number.
 55  55  */
 56  56
 57     - #define RCU_SEQ_CTR_SHIFT	2
 58     - #define RCU_SEQ_STATE_MASK	((1 << RCU_SEQ_CTR_SHIFT) - 1)
 59     -
 60  57 /* Low-order bit definition for polled grace-period APIs. */
 61  58 #define RCU_GET_STATE_COMPLETED	0x1
 62  59
···
250 253 {
251 254 	if (unlikely(!rhp->func))
252 255 		kmem_dump_obj(rhp);
    256 + }
    257 +
    258 + static inline bool rcu_barrier_cb_is_done(struct rcu_head *rhp)
    259 + {
    260 + 	return rhp->next == rhp;
253 261 }
254 262
255 263 extern int rcu_cpu_stall_suppress_at_boot;
-11
kernel/rcu/rcu_segcblist.c
···
261 261 }
262 262
263 263 /*
264     - * Mark the specified rcu_segcblist structure as offloaded (or not)
265     - */
266     - void rcu_segcblist_offload(struct rcu_segcblist *rsclp, bool offload)
267     - {
268     - 	if (offload)
269     - 		rcu_segcblist_set_flags(rsclp, SEGCBLIST_LOCKING | SEGCBLIST_OFFLOADED);
270     - 	else
271     - 		rcu_segcblist_clear_flags(rsclp, SEGCBLIST_OFFLOADED);
272     - }
273     -
274     - /*
275 264  * Does the specified rcu_segcblist structure contain callbacks that
276 265  * are ready to be invoked?
277 266  */
+1 -10
kernel/rcu/rcu_segcblist.h
···
 89  89 static inline bool rcu_segcblist_is_offloaded(struct rcu_segcblist *rsclp)
 90  90 {
 91  91 	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
 92     - 	    rcu_segcblist_test_flags(rsclp, SEGCBLIST_LOCKING))
 93     - 		return true;
 94     -
 95     - 	return false;
 96     - }
 97     -
 98     - static inline bool rcu_segcblist_completely_offloaded(struct rcu_segcblist *rsclp)
 99     - {
100     - 	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
101     - 	    !rcu_segcblist_test_flags(rsclp, SEGCBLIST_RCU_CORE))
     92 + 	    rcu_segcblist_test_flags(rsclp, SEGCBLIST_OFFLOADED))
102  93 		return true;
103  94
104  95 	return false;
+187 -27
kernel/rcu/rcuscale.c
···
 39  39 #include <linux/torture.h>
 40  40 #include <linux/vmalloc.h>
 41  41 #include <linux/rcupdate_trace.h>
     42 + #include <linux/sched/debug.h>
 42  43
 43  44 #include "rcu.h"
 44  45
···
105 104 module_param(scale_type, charp, 0444);
106 105 MODULE_PARM_DESC(scale_type, "Type of RCU to scalability-test (rcu, srcu, ...)");
107 106
    107 + // Structure definitions for custom fixed-per-task allocator.
    108 + struct writer_mblock {
    109 + 	struct rcu_head wmb_rh;
    110 + 	struct llist_node wmb_node;
    111 + 	struct writer_freelist *wmb_wfl;
    112 + };
    113 +
    114 + struct writer_freelist {
    115 + 	struct llist_head ws_lhg;
    116 + 	atomic_t ws_inflight;
    117 + 	struct llist_head ____cacheline_internodealigned_in_smp ws_lhp;
    118 + 	struct writer_mblock *ws_mblocks;
    119 + };
    120 +
108 121 static int nrealreaders;
109 122 static int nrealwriters;
110 123 static struct task_struct **writer_tasks;
···
126 111 static struct task_struct *shutdown_task;
127 112
128 113 static u64 **writer_durations;
    114 + static bool *writer_done;
    115 + static struct writer_freelist *writer_freelists;
129 116 static int *writer_n_durations;
130 117 static atomic_t n_rcu_scale_reader_started;
131 118 static atomic_t n_rcu_scale_writer_started;
···
137 120 static u64 t_rcu_scale_writer_finished;
138 121 static unsigned long b_rcu_gp_test_started;
139 122 static unsigned long b_rcu_gp_test_finished;
140     - static DEFINE_PER_CPU(atomic_t, n_async_inflight);
141 123
142 124 #define MAX_MEAS 10000
143 125 #define MIN_MEAS 100
···
159 143 	void (*sync)(void);
160 144 	void (*exp_sync)(void);
161 145 	struct task_struct *(*rso_gp_kthread)(void);
    146 + 	void (*stats)(void);
162 147 	const char *name;
163 148 };
164 149
···
241 224 	synchronize_srcu(srcu_ctlp);
242 225 }
243 226
    227 + static void srcu_scale_stats(void)
    228 + {
    229 + 	srcu_torture_stats_print(srcu_ctlp, scale_type, SCALE_FLAG);
    230 + }
    231 +
244 232 static void srcu_scale_synchronize_expedited(void)
245 233 {
246 234 	synchronize_srcu_expedited(srcu_ctlp);
···
263 241 	.gp_barrier = srcu_rcu_barrier,
264 242 	.sync = srcu_scale_synchronize,
265 243 	.exp_sync = srcu_scale_synchronize_expedited,
    244 + 	.stats = srcu_scale_stats,
266 245 	.name = "srcu"
267 246 };
268 247
···
293 270 	.gp_barrier = srcu_rcu_barrier,
294 271 	.sync = srcu_scale_synchronize,
295 272 	.exp_sync = srcu_scale_synchronize_expedited,
    273 + 	.stats = srcu_scale_stats,
296 274 	.name = "srcud"
297 275 };
298 276
···
312 288 {
313 289 }
314 290
    291 + static void rcu_tasks_scale_stats(void)
    292 + {
    293 + 	rcu_tasks_torture_stats_print(scale_type, SCALE_FLAG);
    294 + }
    295 +
315 296 static struct rcu_scale_ops tasks_ops = {
316 297 	.ptype = RCU_TASKS_FLAVOR,
317 298 	.init = rcu_sync_scale_init,
···
329 300 	.sync = synchronize_rcu_tasks,
330 301 	.exp_sync = synchronize_rcu_tasks,
331 302 	.rso_gp_kthread = get_rcu_tasks_gp_kthread,
    303 + 	.stats = IS_ENABLED(CONFIG_TINY_RCU) ? NULL : rcu_tasks_scale_stats,
332 304 	.name = "tasks"
333 305 };
334 306
···
356 326 {
357 327 }
358 328
    329 + static void rcu_tasks_rude_scale_stats(void)
    330 + {
    331 + 	rcu_tasks_rude_torture_stats_print(scale_type, SCALE_FLAG);
    332 + }
    333 +
359 334 static struct rcu_scale_ops tasks_rude_ops = {
360 335 	.ptype = RCU_TASKS_RUDE_FLAVOR,
361 336 	.init = rcu_sync_scale_init,
···
368 333 	.readunlock = tasks_rude_scale_read_unlock,
369 334 	.get_gp_seq = rcu_no_completed,
370 335 	.gp_diff = rcu_seq_diff,
371     - 	.async = call_rcu_tasks_rude,
372     - 	.gp_barrier = rcu_barrier_tasks_rude,
373 336 	.sync = synchronize_rcu_tasks_rude,
374 337 	.exp_sync = synchronize_rcu_tasks_rude,
375 338 	.rso_gp_kthread = get_rcu_tasks_rude_gp_kthread,
    339 + 	.stats = IS_ENABLED(CONFIG_TINY_RCU) ? NULL : rcu_tasks_rude_scale_stats,
376 340 	.name = "tasks-rude"
377 341 };
378 342
···
400 366 	rcu_read_unlock_trace();
401 367 }
402 368
    369 + static void rcu_tasks_trace_scale_stats(void)
    370 + {
    371 + 	rcu_tasks_trace_torture_stats_print(scale_type, SCALE_FLAG);
    372 + }
    373 +
403 374 static struct rcu_scale_ops tasks_tracing_ops = {
404 375 	.ptype = RCU_TASKS_FLAVOR,
405 376 	.init = rcu_sync_scale_init,
···
417 378 	.sync = synchronize_rcu_tasks_trace,
418 379 	.exp_sync = synchronize_rcu_tasks_trace,
419 380 	.rso_gp_kthread = get_rcu_tasks_trace_gp_kthread,
    381 + 	.stats = IS_ENABLED(CONFIG_TINY_RCU) ? NULL : rcu_tasks_trace_scale_stats,
420 382 	.name = "tasks-tracing"
421 383 };
422 384
···
478 438 }
479 439
480 440 /*
    441 +  * Allocate a writer_mblock structure for the specified rcu_scale_writer
    442 +  * task.
    443 +  */
    444 + static struct writer_mblock *rcu_scale_alloc(long me)
    445 + {
    446 + 	struct llist_node *llnp;
    447 + 	struct writer_freelist *wflp;
    448 + 	struct writer_mblock *wmbp;
    449 +
    450 + 	if (WARN_ON_ONCE(!writer_freelists))
    451 + 		return NULL;
    452 + 	wflp = &writer_freelists[me];
    453 + 	if (llist_empty(&wflp->ws_lhp)) {
    454 + 		// ->ws_lhp is private to its rcu_scale_writer task.
    455 + 		wmbp = container_of(llist_del_all(&wflp->ws_lhg), struct writer_mblock, wmb_node);
    456 + 		wflp->ws_lhp.first = &wmbp->wmb_node;
    457 + 	}
    458 + 	llnp = llist_del_first(&wflp->ws_lhp);
    459 + 	if (!llnp)
    460 + 		return NULL;
    461 + 	return container_of(llnp, struct writer_mblock, wmb_node);
    462 + }
    463 +
    464 + /*
    465 +  * Free a writer_mblock structure to its rcu_scale_writer task.
    466 +  */
    467 + static void rcu_scale_free(struct writer_mblock *wmbp)
    468 + {
    469 + 	struct writer_freelist *wflp;
    470 +
    471 + 	if (!wmbp)
    472 + 		return;
    473 + 	wflp = wmbp->wmb_wfl;
    474 + 	llist_add(&wmbp->wmb_node, &wflp->ws_lhg);
    475 + }
    476 +
    477 + /*
481 478  * Callback function for asynchronous grace periods from rcu_scale_writer().
482 479  */
483 480 static void rcu_scale_async_cb(struct rcu_head *rhp)
484 481 {
485     - 	atomic_dec(this_cpu_ptr(&n_async_inflight));
486     - 	kfree(rhp);
    482 + 	struct writer_mblock *wmbp = container_of(rhp, struct writer_mblock, wmb_rh);
    483 + 	struct writer_freelist *wflp = wmbp->wmb_wfl;
    484 +
    485 + 	atomic_dec(&wflp->ws_inflight);
    486 + 	rcu_scale_free(wmbp);
487 487 }
488 488
489 489 /*
···
536 456 	int i_max;
537 457 	unsigned long jdone;
538 458 	long me = (long)arg;
539     - 	struct rcu_head *rhp = NULL;
    459 + 	bool selfreport = false;
540 460 	bool started = false, done = false, alldone = false;
541 461 	u64 t;
542 462 	DEFINE_TORTURE_RANDOM(tr);
543 463 	u64 *wdp;
544 464 	u64 *wdpp = writer_durations[me];
    465 + 	struct writer_freelist *wflp = &writer_freelists[me];
    466 + 	struct writer_mblock *wmbp = NULL;
545 467
546 468 	VERBOSE_SCALEOUT_STRING("rcu_scale_writer task started");
547 469 	WARN_ON(!wdpp);
···
575 493
576 494 	jdone = jiffies + minruntime * HZ;
577 495 	do {
    496 + 		bool gp_succeeded = false;
    497 +
578 498 		if (writer_holdoff)
579 499 			udelay(writer_holdoff);
580 500 		if (writer_holdoff_jiffies)
581 501 			schedule_timeout_idle(torture_random(&tr) % writer_holdoff_jiffies + 1);
582 502 		wdp = &wdpp[i];
583 503 		*wdp = ktime_get_mono_fast_ns();
584     - 		if (gp_async) {
585     - retry:
586     - 			if (!rhp)
587     - 				rhp = kmalloc(sizeof(*rhp), GFP_KERNEL);
588     - 			if (rhp && atomic_read(this_cpu_ptr(&n_async_inflight)) < gp_async_max) {
589     - 				atomic_inc(this_cpu_ptr(&n_async_inflight));
590     - 				cur_ops->async(rhp, rcu_scale_async_cb);
591     - 				rhp = NULL;
    504 + 		if (gp_async && !WARN_ON_ONCE(!cur_ops->async)) {
    505 + 			if (!wmbp)
    506 + 				wmbp = rcu_scale_alloc(me);
    507 + 			if (wmbp && atomic_read(&wflp->ws_inflight) < gp_async_max) {
    508 + 				atomic_inc(&wflp->ws_inflight);
    509 + 				cur_ops->async(&wmbp->wmb_rh, rcu_scale_async_cb);
    510 + 				wmbp = NULL;
    511 + 				gp_succeeded = true;
592 512 			} else if (!kthread_should_stop()) {
593 513 				cur_ops->gp_barrier();
594     - 				goto retry;
595 514 			} else {
596     - 				kfree(rhp); /* Because we are stopping. */
    515 + 				rcu_scale_free(wmbp); /* Because we are stopping. */
    516 + 				wmbp = NULL;
597 517 			}
598 518 		} else if (gp_exp) {
599 519 			cur_ops->exp_sync();
    520 + 			gp_succeeded = true;
600 521 		} else {
601 522 			cur_ops->sync();
    523 + 			gp_succeeded = true;
602 524 		}
603 525 		t = ktime_get_mono_fast_ns();
604 526 		*wdp = t - *wdp;
···
612 526 		started = true;
613 527 		if (!done && i >= MIN_MEAS && time_after(jiffies, jdone)) {
614 528 			done = true;
    529 + 			WRITE_ONCE(writer_done[me], true);
615 530 			sched_set_normal(current, 0);
616 531 			pr_alert("%s%s rcu_scale_writer %ld has %d measurements\n",
617 532 				 scale_type, SCALE_FLAG, me, MIN_MEAS);
···
638 551 		if (done && !alldone &&
639 552 		    atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters)
640 553 			alldone = true;
641     - 		if (started && !alldone && i < MAX_MEAS - 1)
    554 + 		if (done && !alldone && time_after(jiffies, jdone + HZ * 60)) {
    555 + 			static atomic_t dumped;
    556 + 			int i;
    557 +
    558 + 			if (!atomic_xchg(&dumped, 1)) {
    559 + 				for (i = 0; i < nrealwriters; i++) {
    560 + 					if (writer_done[i])
    561 + 						continue;
    562 + 					pr_info("%s: Task %ld flags writer %d:\n", __func__, me, i);
    563 + 					sched_show_task(writer_tasks[i]);
    564 + 				}
    565 + 				if (cur_ops->stats)
    566 + 					cur_ops->stats();
    567 + 			}
    568 + 		}
    569 + 		if (!selfreport && time_after(jiffies, jdone + HZ * (70 + me))) {
    570 + 			pr_info("%s: Writer %ld self-report: started %d done %d/%d->%d i %d jdone %lu.\n",
    571 + 				__func__, me, started, done, writer_done[me], atomic_read(&n_rcu_scale_writer_finished), i, jiffies - jdone);
    572 + 			selfreport = true;
    573 + 		}
    574 + 		if (gp_succeeded && started && !alldone && i < MAX_MEAS - 1)
642 575 			i++;
643 576 		rcu_scale_wait_shutdown();
644 577 	} while (!torture_must_stop());
645     - 	if (gp_async) {
    578 + 	if (gp_async && cur_ops->async) {
    579 + 		rcu_scale_free(wmbp);
646 580 		cur_ops->gp_barrier();
647 581 	}
648 582 	writer_n_durations[me] = i_max + 1;
···
821 713 		torture_stop_kthread(kfree_scale_thread,
822 714 				     kfree_reader_tasks[i]);
823 715 		kfree(kfree_reader_tasks);
    716 + 		kfree_reader_tasks = NULL;
824 717 	}
825 718
826 719 	torture_cleanup_end();
···
990 881 		torture_stop_kthread(rcu_scale_reader,
991 882 				     reader_tasks[i]);
992 883 		kfree(reader_tasks);
    884 + 		reader_tasks = NULL;
993 885 	}
994 886
995 887 	if (writer_tasks) {
···
1029  919 			schedule_timeout_uninterruptible(1);
1030  920 		}
1031  921 		kfree(writer_durations[i]);
      922 + 		if (writer_freelists) {
      923 + 			int ctr = 0;
      924 + 			struct llist_node *llnp;
      925 + 			struct writer_freelist *wflp = &writer_freelists[i];
      926 +
      927 + 			if (wflp->ws_mblocks) {
      928 + 				llist_for_each(llnp, wflp->ws_lhg.first)
      929 + 					ctr++;
      930 + 				llist_for_each(llnp, wflp->ws_lhp.first)
      931 + 					ctr++;
      932 + 				WARN_ONCE(ctr != gp_async_max,
      933 + 					  "%s: ctr = %d gp_async_max = %d\n",
      934 + 					  __func__, ctr, gp_async_max);
      935 + 				kfree(wflp->ws_mblocks);
      936 + 			}
      937 + 		}
1032  938 	}
1033  939 	kfree(writer_tasks);
      940 + 	writer_tasks = NULL;
1034  941 	kfree(writer_durations);
      942 + 	writer_durations = NULL;
1035  943 	kfree(writer_n_durations);
      944 + 	writer_n_durations = NULL;
      945 + 	kfree(writer_done);
      946 + 	writer_done = NULL;
      947 + 	kfree(writer_freelists);
      948 + 	writer_freelists = NULL;
1036  949 }
1037  950
1038  951 /* Do torture-type-specific cleanup operations. */
···
1082  949 static int __init
1083  950 rcu_scale_init(void)
1084  951 {
1085      - 	long i;
1086  952 	int firsterr = 0;
      953 + 	long i;
      954 + 	long j;
1087  955 	static struct rcu_scale_ops *scale_ops[] = {
1088  956 		&rcu_ops, &srcu_ops, &srcud_ops, TASKS_OPS TASKS_RUDE_OPS TASKS_TRACING_OPS
1089  957 	};
···
1151 1017 	}
1152 1018 	while (atomic_read(&n_rcu_scale_reader_started) < nrealreaders)
1153 1019 		schedule_timeout_uninterruptible(1);
1154      - 	writer_tasks = kcalloc(nrealwriters, sizeof(reader_tasks[0]),
1155      - 			       GFP_KERNEL);
1156      - 	writer_durations = kcalloc(nrealwriters, sizeof(*writer_durations),
1157      - 				   GFP_KERNEL);
1158      - 	writer_n_durations =
1159      - 		kcalloc(nrealwriters, sizeof(*writer_n_durations),
1160      - 			GFP_KERNEL);
1161      - 	if (!writer_tasks || !writer_durations || !writer_n_durations) {
     1020 + 	writer_tasks = kcalloc(nrealwriters, sizeof(writer_tasks[0]), GFP_KERNEL);
     1021 + 	writer_durations = kcalloc(nrealwriters, sizeof(*writer_durations), GFP_KERNEL);
     1022 + 	writer_n_durations = kcalloc(nrealwriters, sizeof(*writer_n_durations), GFP_KERNEL);
     1023 + 	writer_done = kcalloc(nrealwriters, sizeof(writer_done[0]), GFP_KERNEL);
     1024 + 	if (gp_async) {
     1025 + 		if (gp_async_max <= 0) {
     1026 + 			pr_warn("%s: gp_async_max = %d must be greater than zero.\n",
     1027 + 				__func__, gp_async_max);
     1028 + 			WARN_ON_ONCE(IS_BUILTIN(CONFIG_RCU_TORTURE_TEST));
     1029 + 			firsterr = -EINVAL;
     1030 + 			goto unwind;
     1031 + 		}
     1032 + 		writer_freelists = kcalloc(nrealwriters, sizeof(writer_freelists[0]), GFP_KERNEL);
     1033 + 	}
     1034 + 	if (!writer_tasks || !writer_durations || !writer_n_durations || !writer_done ||
     1035 + 	    (gp_async && !writer_freelists)) {
1162 1036 		SCALEOUT_ERRSTRING("out of memory");
1163 1037 		firsterr = -ENOMEM;
1164 1038 		goto unwind;
···
1178 1036 	if (!writer_durations[i]) {
1179 1037 		firsterr = -ENOMEM;
1180 1038 		goto unwind;
     1039 + 	}
     1040 + 	if (writer_freelists) {
     1041 + 		struct writer_freelist *wflp = &writer_freelists[i];
     1042 +
     1043 + 		init_llist_head(&wflp->ws_lhg);
     1044 + 		init_llist_head(&wflp->ws_lhp);
     1045 + 		wflp->ws_mblocks = kcalloc(gp_async_max, sizeof(wflp->ws_mblocks[0]),
     1046 + 					   GFP_KERNEL);
     1047 + 		if (!wflp->ws_mblocks) {
     1048 + 			firsterr = -ENOMEM;
     1049 + 			goto unwind;
     1050 + 		}
     1051 + 		for (j = 0; j < gp_async_max; j++) {
     1052 + 			struct writer_mblock *wmbp = &wflp->ws_mblocks[j];
     1053 +
     1054 + 			wmbp->wmb_wfl = wflp;
     1055 + 			llist_add(&wmbp->wmb_node, &wflp->ws_lhp);
     1056 + 		}
1181 1057 	}
1182 1058 	firsterr = torture_create_kthread(rcu_scale_writer, (void *)i,
1183 1059 					  writer_tasks[i]);
+77 -40
kernel/rcu/rcutorture.c
···
 115  115 torture_param(bool, stall_no_softlockup, false, "Avoid softlockup warning during cpu stall.");
 116  116 torture_param(int, stall_cpu_irqsoff, 0, "Disable interrupts while stalling.");
 117  117 torture_param(int, stall_cpu_block, 0, "Sleep while stalling.");
      118 + torture_param(int, stall_cpu_repeat, 0, "Number of additional stalls after the first one.");
 118  119 torture_param(int, stall_gp_kthread, 0, "Grace-period kthread stall duration (s).");
 119  120 torture_param(int, stat_interval, 60, "Number of seconds between stats printk()s");
 120  121 torture_param(int, stutter, 5, "Number of seconds to run/halt test");
···
 367  366 	bool (*same_gp_state_full)(struct rcu_gp_oldstate *rgosp1, struct rcu_gp_oldstate *rgosp2);
 368  367 	unsigned long (*get_gp_state)(void);
 369  368 	void (*get_gp_state_full)(struct rcu_gp_oldstate *rgosp);
 370      - 	unsigned long (*get_gp_completed)(void);
 371      - 	void (*get_gp_completed_full)(struct rcu_gp_oldstate *rgosp);
 372  369 	unsigned long (*start_gp_poll)(void);
 373  370 	void (*start_gp_poll_full)(struct rcu_gp_oldstate *rgosp);
 374  371 	bool (*poll_gp_state)(unsigned long oldstate);
···
 374  375 	bool (*poll_need_2gp)(bool poll, bool poll_full);
 375  376 	void (*cond_sync)(unsigned long oldstate);
 376  377 	void (*cond_sync_full)(struct rcu_gp_oldstate *rgosp);
      378 + 	int poll_active;
      379 + 	int poll_active_full;
 377  380 	call_rcu_func_t call;
 378  381 	void (*cb_barrier)(void);
 379  382 	void (*fqs)(void);
···
 554  553 	.get_comp_state_full = get_completed_synchronize_rcu_full,
 555  554 	.get_gp_state = get_state_synchronize_rcu,
 556  555 	.get_gp_state_full = get_state_synchronize_rcu_full,
 557      - 	.get_gp_completed = get_completed_synchronize_rcu,
 558      - 	.get_gp_completed_full = get_completed_synchronize_rcu_full,
 559  556 	.start_gp_poll = start_poll_synchronize_rcu,
 560  557 	.start_gp_poll_full = start_poll_synchronize_rcu_full,
 561  558 	.poll_gp_state = poll_state_synchronize_rcu,
···
 561  562 	.poll_need_2gp = rcu_poll_need_2gp,
 562  563 	.cond_sync = cond_synchronize_rcu,
 563  564 	.cond_sync_full = cond_synchronize_rcu_full,
      565 + 	.poll_active = NUM_ACTIVE_RCU_POLL_OLDSTATE,
      566 + 	.poll_active_full = NUM_ACTIVE_RCU_POLL_FULL_OLDSTATE,
 564  567 	.get_gp_state_exp = get_state_synchronize_rcu,
 565  568 	.start_gp_poll_exp = start_poll_synchronize_rcu_expedited,
 566  569 	.start_gp_poll_exp_full = start_poll_synchronize_rcu_expedited_full,
···
 741  740 	.deferred_free = srcu_torture_deferred_free,
 742  741 	.sync = srcu_torture_synchronize,
 743  742 	.exp_sync = srcu_torture_synchronize_expedited,
      743 + 	.same_gp_state = same_state_synchronize_srcu,
      744 + 	.get_comp_state = get_completed_synchronize_srcu,
 744  745 	.get_gp_state = srcu_torture_get_gp_state,
 745  746 	.start_gp_poll = srcu_torture_start_gp_poll,
 746  747 	.poll_gp_state = srcu_torture_poll_gp_state,
      748 + 	.poll_active = NUM_ACTIVE_SRCU_POLL_OLDSTATE,
 747  749 	.call = srcu_torture_call,
 748  750 	.cb_barrier = srcu_torture_barrier,
 749  751 	.stats = srcu_torture_stats,
···
 784  780 	.deferred_free = srcu_torture_deferred_free,
 785  781 	.sync = srcu_torture_synchronize,
 786  782 	.exp_sync = srcu_torture_synchronize_expedited,
      783 + 	.same_gp_state = same_state_synchronize_srcu,
      784 + 	.get_comp_state = get_completed_synchronize_srcu,
 787  785 	.get_gp_state = srcu_torture_get_gp_state,
 788  786 	.start_gp_poll = srcu_torture_start_gp_poll,
 789  787 	.poll_gp_state = srcu_torture_poll_gp_state,
      788 + 	.poll_active = NUM_ACTIVE_SRCU_POLL_OLDSTATE,
 790  789 	.call = srcu_torture_call,
 791  790 	.cb_barrier = srcu_torture_barrier,
 792  791 	.stats = srcu_torture_stats,
···
 922  915  * Definitions for rude RCU-tasks torture testing.
 923  916  */
 924  917
 925      - static void rcu_tasks_rude_torture_deferred_free(struct rcu_torture *p)
 926      - {
 927      - 	call_rcu_tasks_rude(&p->rtort_rcu, rcu_torture_cb);
 928      - }
 929      -
 930  918 static struct rcu_torture_ops tasks_rude_ops = {
 931  919 	.ttype = RCU_TASKS_RUDE_FLAVOR,
 932  920 	.init = rcu_sync_torture_init,
···
 929  927 	.read_delay = rcu_read_delay,  /* just reuse rcu's version. */
 930  928 	.readunlock = rcu_torture_read_unlock_trivial,
 931  929 	.get_gp_seq = rcu_no_completed,
 932      - 	.deferred_free = rcu_tasks_rude_torture_deferred_free,
 933  930 	.sync = synchronize_rcu_tasks_rude,
 934  931 	.exp_sync = synchronize_rcu_tasks_rude,
 935      - 	.call = call_rcu_tasks_rude,
 936      - 	.cb_barrier = rcu_barrier_tasks_rude,
 937  932 	.gp_kthread_dbg = show_rcu_tasks_rude_gp_kthread,
 938  933 	.get_gp_data = rcu_tasks_rude_get_gp_data,
 939  934 	.cbflood_max = 50000,
···
1317 1318 	} else if (gp_sync && !cur_ops->sync) {
1318 1319 		pr_alert("%s: gp_sync without primitives.\n", __func__);
1319 1320 	}
     1321 + 	pr_alert("%s: Testing %d update types.\n", __func__, nsynctypes);
1320 1322 }
1321 1323
1322 1324 /*
···
1374 1374 	int i;
1375 1375 	int idx;
1376 1376 	int oldnice = task_nice(current);
1377      - 	struct rcu_gp_oldstate rgo[NUM_ACTIVE_RCU_POLL_FULL_OLDSTATE];
     1377 + 	struct rcu_gp_oldstate *rgo = NULL;
     1378 + 	int rgo_size = 0;
1378 1379 	struct rcu_torture *rp;
1379 1380 	struct rcu_torture *old_rp;
1380 1381 	static DEFINE_TORTURE_RANDOM(rand);
1381 1382 	unsigned long stallsdone = jiffies;
1382 1383 	bool stutter_waited;
1383      - 	unsigned long ulo[NUM_ACTIVE_RCU_POLL_OLDSTATE];
     1384 + 	unsigned long *ulo = NULL;
     1385 + 	int ulo_size = 0;
1384 1386
1385 1387 	// If a new stall test is added, this must be adjusted.
1386 1388 	if (stall_cpu_holdoff + stall_gp_kthread + stall_cpu)
1387      - 		stallsdone += (stall_cpu_holdoff + stall_gp_kthread + stall_cpu + 60) * HZ;
     1389 + 		stallsdone += (stall_cpu_holdoff + stall_gp_kthread + stall_cpu + 60) *
     1390 + 			      HZ * (stall_cpu_repeat + 1);
1388 1391 	VERBOSE_TOROUT_STRING("rcu_torture_writer task started");
1389 1392 	if (!can_expedite)
1390 1393 		pr_alert("%s" TORTURE_FLAG
···
1403 1400 		rcu_torture_writer_state = RTWS_STOPPING;
1404 1401 		torture_kthread_stopping("rcu_torture_writer");
1405 1402 		return 0;
     1403 + 	}
     1404 + 	if (cur_ops->poll_active > 0) {
     1405 + 		ulo = kzalloc(cur_ops->poll_active * sizeof(ulo[0]), GFP_KERNEL);
     1406 + 		if (!WARN_ON(!ulo))
     1407 + 			ulo_size = cur_ops->poll_active;
     1408 + 	}
     1409 + 	if (cur_ops->poll_active_full > 0) {
     1410 + 		rgo = kzalloc(cur_ops->poll_active_full * sizeof(rgo[0]), GFP_KERNEL);
     1411 + 		if (!WARN_ON(!rgo))
     1412 + 			rgo_size = cur_ops->poll_active_full;
1406 1413 	}
1407 1414
1408 1415 	do {
···
1450 1437 				  rcu_torture_writer_state_getname(),
1451 1438 				  rcu_torture_writer_state,
1452 1439 				  cookie, cur_ops->get_gp_state());
1453      - 			if (cur_ops->get_gp_completed) {
1454      - 				cookie = cur_ops->get_gp_completed();
     1440 + 			if (cur_ops->get_comp_state) {
     1441 + 				cookie = cur_ops->get_comp_state();
1455 1442 				WARN_ON_ONCE(!cur_ops->poll_gp_state(cookie));
1456 1443 			}
1457 1444 			cur_ops->readunlock(idx);
···
1465 1452 				  rcu_torture_writer_state_getname(),
1466 1453 				  rcu_torture_writer_state,
1467 1454 				  cpumask_pr_args(cpu_online_mask));
1468      - 			if (cur_ops->get_gp_completed_full) {
1469      - 				cur_ops->get_gp_completed_full(&cookie_full);
     1455 + 			if (cur_ops->get_comp_state_full) {
     1456 + 				cur_ops->get_comp_state_full(&cookie_full);
1470 1457 				WARN_ON_ONCE(!cur_ops->poll_gp_state_full(&cookie_full));
1471 1458 			}
1472 1459 			cur_ops->readunlock(idx);
···
1515 1502 			break;
1516 1503 		case RTWS_POLL_GET:
1517 1504 			rcu_torture_writer_state = RTWS_POLL_GET;
1518      - 			for (i = 0; i < ARRAY_SIZE(ulo); i++)
     1505 + 			for (i = 0; i < ulo_size; i++)
ulo[i] = cur_ops->get_comp_state(); 1520 1507 gp_snap = cur_ops->start_gp_poll(); 1521 1508 rcu_torture_writer_state = RTWS_POLL_WAIT; 1522 1509 while (!cur_ops->poll_gp_state(gp_snap)) { 1523 1510 gp_snap1 = cur_ops->get_gp_state(); 1524 - for (i = 0; i < ARRAY_SIZE(ulo); i++) 1511 + for (i = 0; i < ulo_size; i++) 1525 1512 if (cur_ops->poll_gp_state(ulo[i]) || 1526 1513 cur_ops->same_gp_state(ulo[i], gp_snap1)) { 1527 1514 ulo[i] = gp_snap1; 1528 1515 break; 1529 1516 } 1530 - WARN_ON_ONCE(i >= ARRAY_SIZE(ulo)); 1517 + WARN_ON_ONCE(ulo_size > 0 && i >= ulo_size); 1531 1518 torture_hrtimeout_jiffies(torture_random(&rand) % 16, 1532 1519 &rand); 1533 1520 } ··· 1535 1522 break; 1536 1523 case RTWS_POLL_GET_FULL: 1537 1524 rcu_torture_writer_state = RTWS_POLL_GET_FULL; 1538 - for (i = 0; i < ARRAY_SIZE(rgo); i++) 1525 + for (i = 0; i < rgo_size; i++) 1539 1526 cur_ops->get_comp_state_full(&rgo[i]); 1540 1527 cur_ops->start_gp_poll_full(&gp_snap_full); 1541 1528 rcu_torture_writer_state = RTWS_POLL_WAIT_FULL; 1542 1529 while (!cur_ops->poll_gp_state_full(&gp_snap_full)) { 1543 1530 cur_ops->get_gp_state_full(&gp_snap1_full); 1544 - for (i = 0; i < ARRAY_SIZE(rgo); i++) 1531 + for (i = 0; i < rgo_size; i++) 1545 1532 if (cur_ops->poll_gp_state_full(&rgo[i]) || 1546 1533 cur_ops->same_gp_state_full(&rgo[i], 1547 1534 &gp_snap1_full)) { 1548 1535 rgo[i] = gp_snap1_full; 1549 1536 break; 1550 1537 } 1551 - WARN_ON_ONCE(i >= ARRAY_SIZE(rgo)); 1538 + WARN_ON_ONCE(rgo_size > 0 && i >= rgo_size); 1552 1539 torture_hrtimeout_jiffies(torture_random(&rand) % 16, 1553 1540 &rand); 1554 1541 } ··· 1630 1617 pr_alert("%s" TORTURE_FLAG 1631 1618 " Dynamic grace-period expediting was disabled.\n", 1632 1619 torture_type); 1620 + kfree(ulo); 1621 + kfree(rgo); 1633 1622 rcu_torture_writer_state = RTWS_STOPPING; 1634 1623 torture_kthread_stopping("rcu_torture_writer"); 1635 1624 return 0; ··· 2385 2370 "test_boost=%d/%d test_boost_interval=%d " 2386 2371 "test_boost_duration=%d 
shutdown_secs=%d " 2387 2372 "stall_cpu=%d stall_cpu_holdoff=%d stall_cpu_irqsoff=%d " 2388 - "stall_cpu_block=%d " 2373 + "stall_cpu_block=%d stall_cpu_repeat=%d " 2389 2374 "n_barrier_cbs=%d " 2390 2375 "onoff_interval=%d onoff_holdoff=%d " 2391 2376 "read_exit_delay=%d read_exit_burst=%d " ··· 2397 2382 test_boost, cur_ops->can_boost, 2398 2383 test_boost_interval, test_boost_duration, shutdown_secs, 2399 2384 stall_cpu, stall_cpu_holdoff, stall_cpu_irqsoff, 2400 - stall_cpu_block, 2385 + stall_cpu_block, stall_cpu_repeat, 2401 2386 n_barrier_cbs, 2402 2387 onoff_interval, onoff_holdoff, 2403 2388 read_exit_delay, read_exit_burst, ··· 2475 2460 * induces a CPU stall for the time specified by stall_cpu. If a new 2476 2461 * stall test is added, stallsdone in rcu_torture_writer() must be adjusted. 2477 2462 */ 2478 - static int rcu_torture_stall(void *args) 2463 + static void rcu_torture_stall_one(int rep, int irqsoff) 2479 2464 { 2480 2465 int idx; 2481 - int ret; 2482 2466 unsigned long stop_at; 2483 2467 2484 - VERBOSE_TOROUT_STRING("rcu_torture_stall task started"); 2485 - if (rcu_cpu_stall_notifiers) { 2486 - ret = rcu_stall_chain_notifier_register(&rcu_torture_stall_block); 2487 - if (ret) 2488 - pr_info("%s: rcu_stall_chain_notifier_register() returned %d, %sexpected.\n", 2489 - __func__, ret, !IS_ENABLED(CONFIG_RCU_STALL_COMMON) ? "un" : ""); 2490 - } 2491 2468 if (stall_cpu_holdoff > 0) { 2492 2469 VERBOSE_TOROUT_STRING("rcu_torture_stall begin holdoff"); 2493 2470 schedule_timeout_interruptible(stall_cpu_holdoff * HZ); ··· 2499 2492 stop_at = ktime_get_seconds() + stall_cpu; 2500 2493 /* RCU CPU stall is expected behavior in following code. 
*/ 2501 2494 idx = cur_ops->readlock(); 2502 - if (stall_cpu_irqsoff) 2495 + if (irqsoff) 2503 2496 local_irq_disable(); 2504 2497 else if (!stall_cpu_block) 2505 2498 preempt_disable(); 2506 - pr_alert("%s start on CPU %d.\n", 2507 - __func__, raw_smp_processor_id()); 2499 + pr_alert("%s start stall episode %d on CPU %d.\n", 2500 + __func__, rep + 1, raw_smp_processor_id()); 2508 2501 while (ULONG_CMP_LT((unsigned long)ktime_get_seconds(), stop_at) && 2509 2502 !kthread_should_stop()) 2510 2503 if (stall_cpu_block) { ··· 2516 2509 } else if (stall_no_softlockup) { 2517 2510 touch_softlockup_watchdog(); 2518 2511 } 2519 - if (stall_cpu_irqsoff) 2512 + if (irqsoff) 2520 2513 local_irq_enable(); 2521 2514 else if (!stall_cpu_block) 2522 2515 preempt_enable(); 2523 2516 cur_ops->readunlock(idx); 2517 + } 2518 + } 2519 + 2520 + /* 2521 + * CPU-stall kthread. Invokes rcu_torture_stall_one() once, and then as many 2522 + * additional times as specified by the stall_cpu_repeat module parameter. 2523 + * Note that stall_cpu_irqsoff is ignored on the second and subsequent 2524 + * stall. 2525 + */ 2526 + static int rcu_torture_stall(void *args) 2527 + { 2528 + int i; 2529 + int repeat = stall_cpu_repeat; 2530 + int ret; 2531 + 2532 + VERBOSE_TOROUT_STRING("rcu_torture_stall task started"); 2533 + if (repeat < 0) { 2534 + repeat = 0; 2535 + WARN_ON_ONCE(IS_BUILTIN(CONFIG_RCU_TORTURE_TEST)); 2536 + } 2537 + if (rcu_cpu_stall_notifiers) { 2538 + ret = rcu_stall_chain_notifier_register(&rcu_torture_stall_block); 2539 + if (ret) 2540 + pr_info("%s: rcu_stall_chain_notifier_register() returned %d, %sexpected.\n", 2541 + __func__, ret, !IS_ENABLED(CONFIG_RCU_STALL_COMMON) ? "un" : ""); 2542 + } 2543 + for (i = 0; i <= repeat; i++) { 2544 + if (kthread_should_stop()) 2545 + break; 2546 + rcu_torture_stall_one(i, i == 0 ? stall_cpu_irqsoff : 0); 2524 2547 } 2525 2548 pr_alert("%s end.\n", __func__); 2526 2549 if (rcu_cpu_stall_notifiers && !ret) {
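The rcutorture.c change above factors the single-stall logic out into rcu_torture_stall_one() and drives it from a loop honoring the new stall_cpu_repeat parameter, applying stall_cpu_irqsoff only to the first episode and clamping negative repeat counts. A minimal userspace sketch of that control flow (function and counter names are illustrative, not the kernel's):

```c
#include <assert.h>

/* Hypothetical stand-ins for the module parameters in the diff. */
static int stall_cpu_repeat = 2;
static int stall_cpu_irqsoff = 1;

static int episodes_run;
static int irqsoff_episodes;

/* One stall episode; irqsoff arrives as an argument, mirroring how
 * rcu_torture_stall_one() takes the flag instead of reading the global. */
static void stall_one(int rep, int irqsoff)
{
	episodes_run++;
	if (irqsoff)
		irqsoff_episodes++;
}

/* Run the first stall plus stall_cpu_repeat additional ones; interrupts
 * are (conceptually) disabled only for episode 0, as in the diff. */
static void run_stalls(void)
{
	int repeat = stall_cpu_repeat;
	int i;

	if (repeat < 0)
		repeat = 0;	/* Clamp bogus values, as the kernel code does. */
	for (i = 0; i <= repeat; i++)
		stall_one(i, i == 0 ? stall_cpu_irqsoff : 0);
}
```

With stall_cpu_repeat=2 this yields three episodes total, only the first with irqs off, matching the parameter's description as "Number of additional stalls after the first one."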
+35 -32
kernel/rcu/refscale.c
··· 28 28 #include <linux/rcupdate_trace.h> 29 29 #include <linux/reboot.h> 30 30 #include <linux/sched.h> 31 + #include <linux/seq_buf.h> 31 32 #include <linux/spinlock.h> 32 33 #include <linux/smp.h> 33 34 #include <linux/stat.h> ··· 135 134 const char *name; 136 135 }; 137 136 138 - static struct ref_scale_ops *cur_ops; 137 + static const struct ref_scale_ops *cur_ops; 139 138 140 139 static void un_delay(const int udl, const int ndl) 141 140 { ··· 171 170 return true; 172 171 } 173 172 174 - static struct ref_scale_ops rcu_ops = { 173 + static const struct ref_scale_ops rcu_ops = { 175 174 .init = rcu_sync_scale_init, 176 175 .readsection = ref_rcu_read_section, 177 176 .delaysection = ref_rcu_delay_section, ··· 205 204 } 206 205 } 207 206 208 - static struct ref_scale_ops srcu_ops = { 207 + static const struct ref_scale_ops srcu_ops = { 209 208 .init = rcu_sync_scale_init, 210 209 .readsection = srcu_ref_scale_read_section, 211 210 .delaysection = srcu_ref_scale_delay_section, ··· 232 231 un_delay(udl, ndl); 233 232 } 234 233 235 - static struct ref_scale_ops rcu_tasks_ops = { 234 + static const struct ref_scale_ops rcu_tasks_ops = { 236 235 .init = rcu_sync_scale_init, 237 236 .readsection = rcu_tasks_ref_scale_read_section, 238 237 .delaysection = rcu_tasks_ref_scale_delay_section, ··· 271 270 } 272 271 } 273 272 274 - static struct ref_scale_ops rcu_trace_ops = { 273 + static const struct ref_scale_ops rcu_trace_ops = { 275 274 .init = rcu_sync_scale_init, 276 275 .readsection = rcu_trace_ref_scale_read_section, 277 276 .delaysection = rcu_trace_ref_scale_delay_section, ··· 310 309 } 311 310 } 312 311 313 - static struct ref_scale_ops refcnt_ops = { 312 + static const struct ref_scale_ops refcnt_ops = { 314 313 .init = rcu_sync_scale_init, 315 314 .readsection = ref_refcnt_section, 316 315 .delaysection = ref_refcnt_delay_section, ··· 347 346 } 348 347 } 349 348 350 - static struct ref_scale_ops rwlock_ops = { 349 + static const struct ref_scale_ops 
rwlock_ops = { 351 350 .init = ref_rwlock_init, 352 351 .readsection = ref_rwlock_section, 353 352 .delaysection = ref_rwlock_delay_section, ··· 384 383 } 385 384 } 386 385 387 - static struct ref_scale_ops rwsem_ops = { 386 + static const struct ref_scale_ops rwsem_ops = { 388 387 .init = ref_rwsem_init, 389 388 .readsection = ref_rwsem_section, 390 389 .delaysection = ref_rwsem_delay_section, ··· 419 418 preempt_enable(); 420 419 } 421 420 422 - static struct ref_scale_ops lock_ops = { 421 + static const struct ref_scale_ops lock_ops = { 423 422 .readsection = ref_lock_section, 424 423 .delaysection = ref_lock_delay_section, 425 424 .name = "lock" ··· 454 453 preempt_enable(); 455 454 } 456 455 457 - static struct ref_scale_ops lock_irq_ops = { 456 + static const struct ref_scale_ops lock_irq_ops = { 458 457 .readsection = ref_lock_irq_section, 459 458 .delaysection = ref_lock_irq_delay_section, 460 459 .name = "lock-irq" ··· 490 489 preempt_enable(); 491 490 } 492 491 493 - static struct ref_scale_ops acqrel_ops = { 492 + static const struct ref_scale_ops acqrel_ops = { 494 493 .readsection = ref_acqrel_section, 495 494 .delaysection = ref_acqrel_delay_section, 496 495 .name = "acqrel" ··· 524 523 stopopts = x; 525 524 } 526 525 527 - static struct ref_scale_ops clock_ops = { 526 + static const struct ref_scale_ops clock_ops = { 528 527 .readsection = ref_clock_section, 529 528 .delaysection = ref_clock_delay_section, 530 529 .name = "clock" ··· 556 555 stopopts = x; 557 556 } 558 557 559 - static struct ref_scale_ops jiffies_ops = { 558 + static const struct ref_scale_ops jiffies_ops = { 560 559 .readsection = ref_jiffies_section, 561 560 .delaysection = ref_jiffies_delay_section, 562 561 .name = "jiffies" ··· 706 705 preempt_enable(); 707 706 } 708 707 709 - static struct ref_scale_ops typesafe_ref_ops; 710 - static struct ref_scale_ops typesafe_lock_ops; 711 - static struct ref_scale_ops typesafe_seqlock_ops; 708 + static const struct ref_scale_ops 
typesafe_ref_ops; 709 + static const struct ref_scale_ops typesafe_lock_ops; 710 + static const struct ref_scale_ops typesafe_seqlock_ops; 712 711 713 712 // Initialize for a typesafe test. 714 713 static bool typesafe_init(void) ··· 769 768 } 770 769 771 770 // The typesafe_init() function distinguishes these structures by address. 772 - static struct ref_scale_ops typesafe_ref_ops = { 771 + static const struct ref_scale_ops typesafe_ref_ops = { 773 772 .init = typesafe_init, 774 773 .cleanup = typesafe_cleanup, 775 774 .readsection = typesafe_read_section, ··· 777 776 .name = "typesafe_ref" 778 777 }; 779 778 780 - static struct ref_scale_ops typesafe_lock_ops = { 779 + static const struct ref_scale_ops typesafe_lock_ops = { 781 780 .init = typesafe_init, 782 781 .cleanup = typesafe_cleanup, 783 782 .readsection = typesafe_read_section, ··· 785 784 .name = "typesafe_lock" 786 785 }; 787 786 788 - static struct ref_scale_ops typesafe_seqlock_ops = { 787 + static const struct ref_scale_ops typesafe_seqlock_ops = { 789 788 .init = typesafe_init, 790 789 .cleanup = typesafe_cleanup, 791 790 .readsection = typesafe_read_section, ··· 892 891 { 893 892 int i; 894 893 struct reader_task *rt; 895 - char buf1[64]; 894 + struct seq_buf s; 896 895 char *buf; 897 896 u64 sum = 0; 898 897 899 898 buf = kmalloc(800 + 64, GFP_KERNEL); 900 899 if (!buf) 901 900 return 0; 902 - buf[0] = 0; 903 - sprintf(buf, "Experiment #%d (Format: <THREAD-NUM>:<Total loop time in ns>)", 904 - exp_idx); 901 + seq_buf_init(&s, buf, 800 + 64); 902 + 903 + seq_buf_printf(&s, "Experiment #%d (Format: <THREAD-NUM>:<Total loop time in ns>)", 904 + exp_idx); 905 905 906 906 for (i = 0; i < n && !torture_must_stop(); i++) { 907 907 rt = &(reader_tasks[i]); 908 - sprintf(buf1, "%d: %llu\t", i, rt->last_duration_ns); 909 908 910 909 if (i % 5 == 0) 911 - strcat(buf, "\n"); 912 - if (strlen(buf) >= 800) { 913 - pr_alert("%s", buf); 914 - buf[0] = 0; 910 + seq_buf_putc(&s, '\n'); 911 + 912 + if 
(seq_buf_used(&s) >= 800) { 913 + pr_alert("%s", seq_buf_str(&s)); 914 + seq_buf_clear(&s); 915 915 } 916 - strcat(buf, buf1); 916 + 917 + seq_buf_printf(&s, "%d: %llu\t", i, rt->last_duration_ns); 917 918 918 919 sum += rt->last_duration_ns; 919 920 } 920 - pr_alert("%s\n", buf); 921 + pr_alert("%s\n", seq_buf_str(&s)); 921 922 922 923 kfree(buf); 923 924 return sum; ··· 1026 1023 } 1027 1024 1028 1025 static void 1029 - ref_scale_print_module_parms(struct ref_scale_ops *cur_ops, const char *tag) 1026 + ref_scale_print_module_parms(const struct ref_scale_ops *cur_ops, const char *tag) 1030 1027 { 1031 1028 pr_alert("%s" SCALE_FLAG 1032 1029 "--- %s: verbose=%d verbose_batched=%d shutdown=%d holdoff=%d lookup_instances=%ld loops=%ld nreaders=%d nruns=%d readdelay=%d\n", scale_type, tag, ··· 1081 1078 { 1082 1079 long i; 1083 1080 int firsterr = 0; 1084 - static struct ref_scale_ops *scale_ops[] = { 1081 + static const struct ref_scale_ops *scale_ops[] = { 1085 1082 &rcu_ops, &srcu_ops, RCU_TRACE_OPS RCU_TASKS_OPS &refcnt_ops, &rwlock_ops, 1086 1083 &rwsem_ops, &lock_ops, &lock_irq_ops, &acqrel_ops, &clock_ops, &jiffies_ops, 1087 1084 &typesafe_ref_ops, &typesafe_lock_ops, &typesafe_seqlock_ops,
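The refscale.c change above replaces hand-rolled sprintf()/strcat()/strlen() bookkeeping with the kernel's seq_buf API, which tracks the write cursor internally and flushes when the buffer nears capacity. A rough userspace approximation of the pattern, using a hypothetical sketch_buf in place of the kernel's struct seq_buf:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Minimal userspace stand-in for the kernel's seq_buf: a buffer plus a
 * length cursor, so callers never repeat strlen()/strcat() bookkeeping. */
struct sketch_buf {
	char *buf;
	size_t size;
	size_t len;
};

static void sb_init(struct sketch_buf *s, char *buf, size_t size)
{
	s->buf = buf;
	s->size = size;
	s->len = 0;
	buf[0] = '\0';
}

/* Append formatted text at the cursor, truncating if needed. */
static void sb_printf(struct sketch_buf *s, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(s->buf + s->len, s->size - s->len, fmt, ap);
	va_end(ap);
	if (n > 0)
		s->len += (size_t)n < s->size - s->len ?
			  (size_t)n : s->size - s->len - 1;
}

/* Counterparts of seq_buf_used() and seq_buf_clear(). */
static size_t sb_used(const struct sketch_buf *s)
{
	return s->len;
}

static void sb_clear(struct sketch_buf *s)
{
	s->len = 0;
	s->buf[0] = '\0';
}
```

The diff's flush-when-full logic then becomes a simple `if (sb_used(&s) >= threshold) { print; sb_clear(&s); }`, with no risk of the off-by-one overflow that manual strcat() accounting invites.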
+8 -3
kernel/rcu/srcutree.c
··· 137 137 sdp->srcu_cblist_invoking = false; 138 138 sdp->srcu_gp_seq_needed = ssp->srcu_sup->srcu_gp_seq; 139 139 sdp->srcu_gp_seq_needed_exp = ssp->srcu_sup->srcu_gp_seq; 140 + sdp->srcu_barrier_head.next = &sdp->srcu_barrier_head; 140 141 sdp->mynode = NULL; 141 142 sdp->cpu = cpu; 142 143 INIT_WORK(&sdp->work, srcu_invoke_callbacks); ··· 248 247 mutex_init(&ssp->srcu_sup->srcu_cb_mutex); 249 248 mutex_init(&ssp->srcu_sup->srcu_gp_mutex); 250 249 ssp->srcu_idx = 0; 251 - ssp->srcu_sup->srcu_gp_seq = 0; 250 + ssp->srcu_sup->srcu_gp_seq = SRCU_GP_SEQ_INITIAL_VAL; 252 251 ssp->srcu_sup->srcu_barrier_seq = 0; 253 252 mutex_init(&ssp->srcu_sup->srcu_barrier_mutex); 254 253 atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 0); ··· 259 258 if (!ssp->sda) 260 259 goto err_free_sup; 261 260 init_srcu_struct_data(ssp); 262 - ssp->srcu_sup->srcu_gp_seq_needed_exp = 0; 261 + ssp->srcu_sup->srcu_gp_seq_needed_exp = SRCU_GP_SEQ_INITIAL_VAL; 263 262 ssp->srcu_sup->srcu_last_gp_end = ktime_get_mono_fast_ns(); 264 263 if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) { 265 264 if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) ··· 267 266 WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG); 268 267 } 269 268 ssp->srcu_sup->srcu_ssp = ssp; 270 - smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, 0); /* Init done. */ 269 + smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, 270 + SRCU_GP_SEQ_INITIAL_VAL); /* Init done. 
*/ 271 271 return 0; 272 272 273 273 err_free_sda: ··· 630 628 if (time_after(j, gpstart)) 631 629 jbase += j - gpstart; 632 630 if (!jbase) { 631 + ASSERT_EXCLUSIVE_WRITER(sup->srcu_n_exp_nodelay); 633 632 WRITE_ONCE(sup->srcu_n_exp_nodelay, READ_ONCE(sup->srcu_n_exp_nodelay) + 1); 634 633 if (READ_ONCE(sup->srcu_n_exp_nodelay) > srcu_max_nodelay_phase) 635 634 jbase = 1; ··· 1563 1560 struct srcu_data *sdp; 1564 1561 struct srcu_struct *ssp; 1565 1562 1563 + rhp->next = rhp; // Mark the callback as having been invoked. 1566 1564 sdp = container_of(rhp, struct srcu_data, srcu_barrier_head); 1567 1565 ssp = sdp->ssp; 1568 1566 if (atomic_dec_and_test(&ssp->srcu_sup->srcu_barrier_cpu_cnt)) ··· 1822 1818 } else { 1823 1819 j = jiffies; 1824 1820 if (READ_ONCE(sup->reschedule_jiffies) == j) { 1821 + ASSERT_EXCLUSIVE_WRITER(sup->reschedule_count); 1825 1822 WRITE_ONCE(sup->reschedule_count, READ_ONCE(sup->reschedule_count) + 1); 1826 1823 if (READ_ONCE(sup->reschedule_count) > srcu_max_nodelay) 1827 1824 curdelay = 1;
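The srcutree.c hunks above initialize srcu_barrier_head.next to point at itself and have srcu_barrier_cb() re-establish that self-pointer on invocation, so a head whose ->next points at itself is known to be "done" (never queued, or already invoked) without any extra state. A userspace sketch of that sentinel convention (all names here are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stddef.h>

/* Callback head whose ->next self-points when idle or already invoked. */
struct cb_head {
	struct cb_head *next;
};

/* Initialization marks the head as done, matching the diff's
 * sdp->srcu_barrier_head.next = &sdp->srcu_barrier_head. */
static void cb_init(struct cb_head *rhp)
{
	rhp->next = rhp;
}

/* Cheap completion test, usable by stats/diagnostic code. */
static int cb_is_done(const struct cb_head *rhp)
{
	return rhp->next == rhp;
}

/* Queuing links the head into a list, so ->next no longer self-points. */
static void cb_enqueue(struct cb_head *rhp, struct cb_head **list)
{
	rhp->next = *list;
	*list = rhp;
}

/* The callback re-marks the head as done on invocation, matching the
 * "rhp->next = rhp" line added to srcu_barrier_cb() in the diff. */
static void cb_invoke(struct cb_head *rhp)
{
	rhp->next = rhp;
}
```

This is what lets the new tasks.h stats code below test rcu_barrier_cb_is_done() on each per-CPU barrier head to report which CPUs are still holding out.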
+142 -70
kernel/rcu/tasks.h
··· 34 34 * @rtp_blkd_tasks: List of tasks blocked as readers. 35 35 * @rtp_exit_list: List of tasks in the latter portion of do_exit(). 36 36 * @cpu: CPU number corresponding to this entry. 37 + * @index: Index of this CPU in rtpcp_array of the rcu_tasks structure. 37 38 * @rtpp: Pointer to the rcu_tasks structure. 38 39 */ 39 40 struct rcu_tasks_percpu { ··· 50 49 struct list_head rtp_blkd_tasks; 51 50 struct list_head rtp_exit_list; 52 51 int cpu; 52 + int index; 53 53 struct rcu_tasks *rtpp; 54 54 }; 55 55 ··· 65 63 * @init_fract: Initial backoff sleep interval. 66 64 * @gp_jiffies: Time of last @gp_state transition. 67 65 * @gp_start: Most recent grace-period start in jiffies. 68 - * @tasks_gp_seq: Number of grace periods completed since boot. 66 + * @tasks_gp_seq: Number of grace periods completed since boot in upper bits. 69 67 * @n_ipis: Number of IPIs sent to encourage grace periods to end. 70 68 * @n_ipis_fails: Number of IPI-send failures. 71 69 * @kthread_ptr: This flavor's grace-period/callback-invocation kthread. ··· 78 76 * @call_func: This flavor's call_rcu()-equivalent function. 79 77 * @wait_state: Task state for synchronous grace-period waits (default TASK_UNINTERRUPTIBLE). 80 78 * @rtpcpu: This flavor's rcu_tasks_percpu structure. 79 + * @rtpcp_array: Array of pointers to rcu_tasks_percpu structure of CPUs in cpu_possible_mask. 81 80 * @percpu_enqueue_shift: Shift down CPU ID this much when enqueuing callbacks. 82 81 * @percpu_enqueue_lim: Number of per-CPU callback queues in use for enqueuing. 83 82 * @percpu_dequeue_lim: Number of per-CPU callback queues in use for dequeuing. ··· 87 84 * @barrier_q_count: Number of queues being waited on. 88 85 * @barrier_q_completion: Barrier wait/wakeup mechanism. 89 86 * @barrier_q_seq: Sequence number for barrier operations. 87 + * @barrier_q_start: Most recent barrier start in jiffies. 90 88 * @name: This flavor's textual name. 91 89 * @kname: This flavor's kthread name. 
92 90 */ ··· 114 110 call_rcu_func_t call_func; 115 111 unsigned int wait_state; 116 112 struct rcu_tasks_percpu __percpu *rtpcpu; 113 + struct rcu_tasks_percpu **rtpcp_array; 117 114 int percpu_enqueue_shift; 118 115 int percpu_enqueue_lim; 119 116 int percpu_dequeue_lim; ··· 123 118 atomic_t barrier_q_count; 124 119 struct completion barrier_q_completion; 125 120 unsigned long barrier_q_seq; 121 + unsigned long barrier_q_start; 126 122 char *name; 127 123 char *kname; 128 124 }; ··· 188 182 static int rcu_task_lazy_lim __read_mostly = 32; 189 183 module_param(rcu_task_lazy_lim, int, 0444); 190 184 185 + static int rcu_task_cpu_ids; 186 + 191 187 /* RCU tasks grace-period state for debugging. */ 192 188 #define RTGS_INIT 0 193 189 #define RTGS_WAIT_WAIT_CBS 1 ··· 253 245 int cpu; 254 246 int lim; 255 247 int shift; 248 + int maxcpu; 249 + int index = 0; 256 250 257 251 if (rcu_task_enqueue_lim < 0) { 258 252 rcu_task_enqueue_lim = 1; ··· 264 254 } 265 255 lim = rcu_task_enqueue_lim; 266 256 267 - if (lim > nr_cpu_ids) 268 - lim = nr_cpu_ids; 269 - shift = ilog2(nr_cpu_ids / lim); 270 - if (((nr_cpu_ids - 1) >> shift) >= lim) 271 - shift++; 272 - WRITE_ONCE(rtp->percpu_enqueue_shift, shift); 273 - WRITE_ONCE(rtp->percpu_dequeue_lim, lim); 274 - smp_store_release(&rtp->percpu_enqueue_lim, lim); 257 + rtp->rtpcp_array = kcalloc(num_possible_cpus(), sizeof(struct rcu_tasks_percpu *), GFP_KERNEL); 258 + BUG_ON(!rtp->rtpcp_array); 259 + 275 260 for_each_possible_cpu(cpu) { 276 261 struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu); 277 262 ··· 278 273 INIT_WORK(&rtpcp->rtp_work, rcu_tasks_invoke_cbs_wq); 279 274 rtpcp->cpu = cpu; 280 275 rtpcp->rtpp = rtp; 276 + rtpcp->index = index; 277 + rtp->rtpcp_array[index] = rtpcp; 278 + index++; 281 279 if (!rtpcp->rtp_blkd_tasks.next) 282 280 INIT_LIST_HEAD(&rtpcp->rtp_blkd_tasks); 283 281 if (!rtpcp->rtp_exit_list.next) 284 282 INIT_LIST_HEAD(&rtpcp->rtp_exit_list); 283 + rtpcp->barrier_q_head.next = 
&rtpcp->barrier_q_head; 284 + maxcpu = cpu; 285 285 } 286 286 287 - pr_info("%s: Setting shift to %d and lim to %d rcu_task_cb_adjust=%d.\n", rtp->name, 288 - data_race(rtp->percpu_enqueue_shift), data_race(rtp->percpu_enqueue_lim), rcu_task_cb_adjust); 287 + rcu_task_cpu_ids = maxcpu + 1; 288 + if (lim > rcu_task_cpu_ids) 289 + lim = rcu_task_cpu_ids; 290 + shift = ilog2(rcu_task_cpu_ids / lim); 291 + if (((rcu_task_cpu_ids - 1) >> shift) >= lim) 292 + shift++; 293 + WRITE_ONCE(rtp->percpu_enqueue_shift, shift); 294 + WRITE_ONCE(rtp->percpu_dequeue_lim, lim); 295 + smp_store_release(&rtp->percpu_enqueue_lim, lim); 296 + 297 + pr_info("%s: Setting shift to %d and lim to %d rcu_task_cb_adjust=%d rcu_task_cpu_ids=%d.\n", 298 + rtp->name, data_race(rtp->percpu_enqueue_shift), data_race(rtp->percpu_enqueue_lim), 299 + rcu_task_cb_adjust, rcu_task_cpu_ids); 289 300 } 290 301 291 302 // Compute wakeup time for lazy callback timer. ··· 360 339 rcu_read_lock(); 361 340 ideal_cpu = smp_processor_id() >> READ_ONCE(rtp->percpu_enqueue_shift); 362 341 chosen_cpu = cpumask_next(ideal_cpu - 1, cpu_possible_mask); 342 + WARN_ON_ONCE(chosen_cpu >= rcu_task_cpu_ids); 363 343 rtpcp = per_cpu_ptr(rtp->rtpcpu, chosen_cpu); 364 344 if (!raw_spin_trylock_rcu_node(rtpcp)) { // irqs already disabled. 365 345 raw_spin_lock_rcu_node(rtpcp); // irqs already disabled. ··· 370 348 rtpcp->rtp_n_lock_retries = 0; 371 349 } 372 350 if (rcu_task_cb_adjust && ++rtpcp->rtp_n_lock_retries > rcu_task_contend_lim && 373 - READ_ONCE(rtp->percpu_enqueue_lim) != nr_cpu_ids) 351 + READ_ONCE(rtp->percpu_enqueue_lim) != rcu_task_cpu_ids) 374 352 needadjust = true; // Defer adjustment to avoid deadlock. 375 353 } 376 354 // Queuing callbacks before initialization not yet supported. 
··· 390 368 raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags); 391 369 if (unlikely(needadjust)) { 392 370 raw_spin_lock_irqsave(&rtp->cbs_gbl_lock, flags); 393 - if (rtp->percpu_enqueue_lim != nr_cpu_ids) { 371 + if (rtp->percpu_enqueue_lim != rcu_task_cpu_ids) { 394 372 WRITE_ONCE(rtp->percpu_enqueue_shift, 0); 395 - WRITE_ONCE(rtp->percpu_dequeue_lim, nr_cpu_ids); 396 - smp_store_release(&rtp->percpu_enqueue_lim, nr_cpu_ids); 373 + WRITE_ONCE(rtp->percpu_dequeue_lim, rcu_task_cpu_ids); 374 + smp_store_release(&rtp->percpu_enqueue_lim, rcu_task_cpu_ids); 397 375 pr_info("Switching %s to per-CPU callback queuing.\n", rtp->name); 398 376 } 399 377 raw_spin_unlock_irqrestore(&rtp->cbs_gbl_lock, flags); ··· 410 388 struct rcu_tasks *rtp; 411 389 struct rcu_tasks_percpu *rtpcp; 412 390 391 + rhp->next = rhp; // Mark the callback as having been invoked. 413 392 rtpcp = container_of(rhp, struct rcu_tasks_percpu, barrier_q_head); 414 393 rtp = rtpcp->rtpp; 415 394 if (atomic_dec_and_test(&rtp->barrier_q_count)) ··· 419 396 420 397 // Wait for all in-flight callbacks for the specified RCU Tasks flavor. 421 398 // Operates in a manner similar to rcu_barrier(). 422 - static void rcu_barrier_tasks_generic(struct rcu_tasks *rtp) 399 + static void __maybe_unused rcu_barrier_tasks_generic(struct rcu_tasks *rtp) 423 400 { 424 401 int cpu; 425 402 unsigned long flags; ··· 432 409 mutex_unlock(&rtp->barrier_q_mutex); 433 410 return; 434 411 } 412 + rtp->barrier_q_start = jiffies; 435 413 rcu_seq_start(&rtp->barrier_q_seq); 436 414 init_completion(&rtp->barrier_q_completion); 437 415 atomic_set(&rtp->barrier_q_count, 2); ··· 468 444 469 445 dequeue_limit = smp_load_acquire(&rtp->percpu_dequeue_lim); 470 446 for (cpu = 0; cpu < dequeue_limit; cpu++) { 447 + if (!cpu_possible(cpu)) 448 + continue; 471 449 struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu); 472 450 473 451 /* Advance and accelerate any new callbacks. 
*/ ··· 507 481 if (rcu_task_cb_adjust && ncbs <= rcu_task_collapse_lim) { 508 482 raw_spin_lock_irqsave(&rtp->cbs_gbl_lock, flags); 509 483 if (rtp->percpu_enqueue_lim > 1) { 510 - WRITE_ONCE(rtp->percpu_enqueue_shift, order_base_2(nr_cpu_ids)); 484 + WRITE_ONCE(rtp->percpu_enqueue_shift, order_base_2(rcu_task_cpu_ids)); 511 485 smp_store_release(&rtp->percpu_enqueue_lim, 1); 512 486 rtp->percpu_dequeue_gpseq = get_state_synchronize_rcu(); 513 487 gpdone = false; ··· 522 496 pr_info("Completing switch %s to CPU-0 callback queuing.\n", rtp->name); 523 497 } 524 498 if (rtp->percpu_dequeue_lim == 1) { 525 - for (cpu = rtp->percpu_dequeue_lim; cpu < nr_cpu_ids; cpu++) { 499 + for (cpu = rtp->percpu_dequeue_lim; cpu < rcu_task_cpu_ids; cpu++) { 500 + if (!cpu_possible(cpu)) 501 + continue; 526 502 struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu); 527 503 528 504 WARN_ON_ONCE(rcu_segcblist_n_cbs(&rtpcp->cblist)); ··· 539 511 // Advance callbacks and invoke any that are ready. 540 512 static void rcu_tasks_invoke_cbs(struct rcu_tasks *rtp, struct rcu_tasks_percpu *rtpcp) 541 513 { 542 - int cpu; 543 - int cpunext; 544 514 int cpuwq; 545 515 unsigned long flags; 546 516 int len; 517 + int index; 547 518 struct rcu_head *rhp; 548 519 struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl); 549 520 struct rcu_tasks_percpu *rtpcp_next; 550 521 551 - cpu = rtpcp->cpu; 552 - cpunext = cpu * 2 + 1; 553 - if (cpunext < smp_load_acquire(&rtp->percpu_dequeue_lim)) { 554 - rtpcp_next = per_cpu_ptr(rtp->rtpcpu, cpunext); 555 - cpuwq = rcu_cpu_beenfullyonline(cpunext) ? cpunext : WORK_CPU_UNBOUND; 556 - queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work); 557 - cpunext++; 558 - if (cpunext < smp_load_acquire(&rtp->percpu_dequeue_lim)) { 559 - rtpcp_next = per_cpu_ptr(rtp->rtpcpu, cpunext); 560 - cpuwq = rcu_cpu_beenfullyonline(cpunext) ? 
cpunext : WORK_CPU_UNBOUND; 522 + index = rtpcp->index * 2 + 1; 523 + if (index < num_possible_cpus()) { 524 + rtpcp_next = rtp->rtpcp_array[index]; 525 + if (rtpcp_next->cpu < smp_load_acquire(&rtp->percpu_dequeue_lim)) { 526 + cpuwq = rcu_cpu_beenfullyonline(rtpcp_next->cpu) ? rtpcp_next->cpu : WORK_CPU_UNBOUND; 561 527 queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work); 528 + index++; 529 + if (index < num_possible_cpus()) { 530 + rtpcp_next = rtp->rtpcp_array[index]; 531 + if (rtpcp_next->cpu < smp_load_acquire(&rtp->percpu_dequeue_lim)) { 532 + cpuwq = rcu_cpu_beenfullyonline(rtpcp_next->cpu) ? rtpcp_next->cpu : WORK_CPU_UNBOUND; 533 + queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work); 534 + } 535 + } 562 536 } 563 537 } 564 538 565 - if (rcu_segcblist_empty(&rtpcp->cblist) || !cpu_possible(cpu)) 539 + if (rcu_segcblist_empty(&rtpcp->cblist)) 566 540 return; 567 541 raw_spin_lock_irqsave_rcu_node(rtpcp, flags); 568 542 rcu_segcblist_advance(&rtpcp->cblist, rcu_seq_current(&rtp->tasks_gp_seq)); ··· 717 687 #endif /* #ifdef CONFIG_TASKS_TRACE_RCU */ 718 688 } 719 689 720 - #endif /* #ifndef CONFIG_TINY_RCU */ 721 690 722 - #ifndef CONFIG_TINY_RCU 723 691 /* Dump out rcutorture-relevant state common to all RCU-tasks flavors. */ 724 692 static void show_rcu_tasks_generic_gp_kthread(struct rcu_tasks *rtp, char *s) 725 693 { ··· 751 723 rtp->lazy_jiffies, 752 724 s); 753 725 } 726 + 727 + /* Dump out more rcutorture-relevant state common to all RCU-tasks flavors. 
*/ 728 + static void rcu_tasks_torture_stats_print_generic(struct rcu_tasks *rtp, char *tt, 729 + char *tf, char *tst) 730 + { 731 + cpumask_var_t cm; 732 + int cpu; 733 + bool gotcb = false; 734 + unsigned long j = jiffies; 735 + 736 + pr_alert("%s%s Tasks%s RCU g%ld gp_start %lu gp_jiffies %lu gp_state %d (%s).\n", 737 + tt, tf, tst, data_race(rtp->tasks_gp_seq), 738 + j - data_race(rtp->gp_start), j - data_race(rtp->gp_jiffies), 739 + data_race(rtp->gp_state), tasks_gp_state_getname(rtp)); 740 + pr_alert("\tEnqueue shift %d limit %d Dequeue limit %d gpseq %lu.\n", 741 + data_race(rtp->percpu_enqueue_shift), 742 + data_race(rtp->percpu_enqueue_lim), 743 + data_race(rtp->percpu_dequeue_lim), 744 + data_race(rtp->percpu_dequeue_gpseq)); 745 + (void)zalloc_cpumask_var(&cm, GFP_KERNEL); 746 + pr_alert("\tCallback counts:"); 747 + for_each_possible_cpu(cpu) { 748 + long n; 749 + struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu); 750 + 751 + if (cpumask_available(cm) && !rcu_barrier_cb_is_done(&rtpcp->barrier_q_head)) 752 + cpumask_set_cpu(cpu, cm); 753 + n = rcu_segcblist_n_cbs(&rtpcp->cblist); 754 + if (!n) 755 + continue; 756 + pr_cont(" %d:%ld", cpu, n); 757 + gotcb = true; 758 + } 759 + if (gotcb) 760 + pr_cont(".\n"); 761 + else 762 + pr_cont(" (none).\n"); 763 + pr_alert("\tBarrier seq %lu start %lu count %d holdout CPUs ", 764 + data_race(rtp->barrier_q_seq), j - data_race(rtp->barrier_q_start), 765 + atomic_read(&rtp->barrier_q_count)); 766 + if (cpumask_available(cm) && !cpumask_empty(cm)) 767 + pr_cont(" %*pbl.\n", cpumask_pr_args(cm)); 768 + else 769 + pr_cont("(none).\n"); 770 + free_cpumask_var(cm); 771 + } 772 + 754 773 #endif // #ifndef CONFIG_TINY_RCU 755 774 756 775 static void exit_tasks_rcu_finish_trace(struct task_struct *t); ··· 1249 1174 show_rcu_tasks_generic_gp_kthread(&rcu_tasks, ""); 1250 1175 } 1251 1176 EXPORT_SYMBOL_GPL(show_rcu_tasks_classic_gp_kthread); 1177 + 1178 + void rcu_tasks_torture_stats_print(char *tt, char *tf) 
1179 + { 1180 + rcu_tasks_torture_stats_print_generic(&rcu_tasks, tt, tf, ""); 1181 + } 1182 + EXPORT_SYMBOL_GPL(rcu_tasks_torture_stats_print); 1252 1183 #endif // !defined(CONFIG_TINY_RCU) 1253 1184 1254 1185 struct task_struct *get_rcu_tasks_gp_kthread(void) ··· 1325 1244 1326 1245 //////////////////////////////////////////////////////////////////////// 1327 1246 // 1328 - // "Rude" variant of Tasks RCU, inspired by Steve Rostedt's trick of 1329 - // passing an empty function to schedule_on_each_cpu(). This approach 1330 - // provides an asynchronous call_rcu_tasks_rude() API and batching of 1331 - // concurrent calls to the synchronous synchronize_rcu_tasks_rude() API. 1332 - // This invokes schedule_on_each_cpu() in order to send IPIs far and wide 1333 - // and induces otherwise unnecessary context switches on all online CPUs, 1334 - // whether idle or not. 1247 + // "Rude" variant of Tasks RCU, inspired by Steve Rostedt's 1248 + // trick of passing an empty function to schedule_on_each_cpu(). 1249 + // This approach provides batching of concurrent calls to the synchronous 1250 + // synchronize_rcu_tasks_rude() API. This invokes schedule_on_each_cpu() 1251 + // in order to send IPIs far and wide and induces otherwise unnecessary 1252 + // context switches on all online CPUs, whether idle or not. 1335 1253 // 1336 1254 // Callback handling is provided by the rcu_tasks_kthread() function. 1337 1255 // ··· 1348 1268 schedule_on_each_cpu(rcu_tasks_be_rude); 1349 1269 } 1350 1270 1351 - void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func); 1271 + static void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func); 1352 1272 DEFINE_RCU_TASKS(rcu_tasks_rude, rcu_tasks_rude_wait_gp, call_rcu_tasks_rude, 1353 1273 "RCU Tasks Rude"); 1354 1274 1355 - /** 1275 + /* 1356 1276 * call_rcu_tasks_rude() - Queue a callback rude task-based grace period 1357 1277 * @rhp: structure to be used for queueing the RCU updates. 
1358 1278 * @func: actual callback function to be invoked after the grace period ··· 1369 1289 * 1370 1290 * See the description of call_rcu() for more detailed information on 1371 1291 * memory ordering guarantees. 1292 + * 1293 + * This is no longer exported, and is instead reserved for use by 1294 + * synchronize_rcu_tasks_rude(). 1372 1295 */ 1373 - void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func) 1296 + static void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func) 1374 1297 { 1375 1298 call_rcu_tasks_generic(rhp, func, &rcu_tasks_rude); 1376 1299 } 1377 - EXPORT_SYMBOL_GPL(call_rcu_tasks_rude); 1378 1300 1379 1301 /** 1380 1302 * synchronize_rcu_tasks_rude - wait for a rude rcu-tasks grace period ··· 1402 1320 } 1403 1321 EXPORT_SYMBOL_GPL(synchronize_rcu_tasks_rude); 1404 1322 1405 - /** 1406 - * rcu_barrier_tasks_rude - Wait for in-flight call_rcu_tasks_rude() callbacks. 1407 - * 1408 - * Although the current implementation is guaranteed to wait, it is not 1409 - * obligated to, for example, if there are no pending callbacks. 
1410 - */ 1411 - void rcu_barrier_tasks_rude(void) 1412 - { 1413 - rcu_barrier_tasks_generic(&rcu_tasks_rude); 1414 - } 1415 - EXPORT_SYMBOL_GPL(rcu_barrier_tasks_rude); 1416 - 1417 - int rcu_tasks_rude_lazy_ms = -1; 1418 - module_param(rcu_tasks_rude_lazy_ms, int, 0444); 1419 - 1420 1323 static int __init rcu_spawn_tasks_rude_kthread(void) 1421 1324 { 1422 1325 rcu_tasks_rude.gp_sleep = HZ / 10; 1423 - if (rcu_tasks_rude_lazy_ms >= 0) 1424 - rcu_tasks_rude.lazy_jiffies = msecs_to_jiffies(rcu_tasks_rude_lazy_ms); 1425 1326 rcu_spawn_tasks_kthread_generic(&rcu_tasks_rude); 1426 1327 return 0; 1427 1328 } ··· 1415 1350 show_rcu_tasks_generic_gp_kthread(&rcu_tasks_rude, ""); 1416 1351 } 1417 1352 EXPORT_SYMBOL_GPL(show_rcu_tasks_rude_gp_kthread); 1353 + 1354 + void rcu_tasks_rude_torture_stats_print(char *tt, char *tf) 1355 + { 1356 + rcu_tasks_torture_stats_print_generic(&rcu_tasks_rude, tt, tf, ""); 1357 + } 1358 + EXPORT_SYMBOL_GPL(rcu_tasks_rude_torture_stats_print); 1418 1359 #endif // !defined(CONFIG_TINY_RCU) 1419 1360 1420 1361 struct task_struct *get_rcu_tasks_rude_gp_kthread(void) ··· 2098 2027 show_rcu_tasks_generic_gp_kthread(&rcu_tasks_trace, buf); 2099 2028 } 2100 2029 EXPORT_SYMBOL_GPL(show_rcu_tasks_trace_gp_kthread); 2030 + 2031 + void rcu_tasks_trace_torture_stats_print(char *tt, char *tf) 2032 + { 2033 + rcu_tasks_torture_stats_print_generic(&rcu_tasks_trace, tt, tf, ""); 2034 + } 2035 + EXPORT_SYMBOL_GPL(rcu_tasks_trace_torture_stats_print); 2101 2036 #endif // !defined(CONFIG_TINY_RCU) 2102 2037 2103 2038 struct task_struct *get_rcu_tasks_trace_gp_kthread(void) ··· 2147 2070 .notrun = IS_ENABLED(CONFIG_TASKS_RCU), 2148 2071 }, 2149 2072 { 2150 - .name = "call_rcu_tasks_rude()", 2151 - /* If not defined, the test is skipped. */ 2152 - .notrun = IS_ENABLED(CONFIG_TASKS_RUDE_RCU), 2153 - }, 2154 - { 2155 2073 .name = "call_rcu_tasks_trace()", 2156 2074 /* If not defined, the test is skipped. 
*/ 2157 2075 .notrun = IS_ENABLED(CONFIG_TASKS_TRACE_RCU) 2158 2076 } 2159 2077 }; 2160 2078 2079 + #if defined(CONFIG_TASKS_RCU) || defined(CONFIG_TASKS_TRACE_RCU) 2161 2080 static void test_rcu_tasks_callback(struct rcu_head *rhp) 2162 2081 { 2163 2082 struct rcu_tasks_test_desc *rttd = ··· 2163 2090 2164 2091 rttd->notrun = false; 2165 2092 } 2093 + #endif // #if defined(CONFIG_TASKS_RCU) || defined(CONFIG_TASKS_TRACE_RCU) 2166 2094 2167 2095 static void rcu_tasks_initiate_self_tests(void) 2168 2096 { ··· 2176 2102 2177 2103 #ifdef CONFIG_TASKS_RUDE_RCU 2178 2104 pr_info("Running RCU Tasks Rude wait API self tests\n"); 2179 - tests[1].runstart = jiffies; 2180 2105 synchronize_rcu_tasks_rude(); 2181 - call_rcu_tasks_rude(&tests[1].rh, test_rcu_tasks_callback); 2182 2106 #endif 2183 2107 2184 2108 #ifdef CONFIG_TASKS_TRACE_RCU 2185 2109 pr_info("Running RCU Tasks Trace wait API self tests\n"); 2186 - tests[2].runstart = jiffies; 2110 + tests[1].runstart = jiffies; 2187 2111 synchronize_rcu_tasks_trace(); 2188 - call_rcu_tasks_trace(&tests[2].rh, test_rcu_tasks_callback); 2112 + call_rcu_tasks_trace(&tests[1].rh, test_rcu_tasks_callback); 2189 2113 #endif 2190 2114 } 2191 2115
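The tasks.h hunk above changes rcu_tasks_invoke_cbs() so that the work item for per-CPU structure `index` queues the work items for indices `index * 2 + 1` and `index * 2 + 2`, bounded by num_possible_cpus(). Here is a minimal userspace sketch of that binary-tree fan-out indexing; the function name and the plain-array interface are illustrative, not the kernel's:

```c
#include <assert.h>

/*
 * Sketch of the binary-tree fan-out used above: the work item for
 * per-CPU structure 'index' kicks the work items for indices
 * 2*index + 1 and 2*index + 2, each bounded by the number of
 * possible CPUs.  Starting from index 0, every index below ncpus is
 * reached exactly once, so callback invocation spreads across CPUs
 * in O(log n) waves rather than one long chain.
 *
 * Returns the number of in-bounds children written to child[].
 */
static int fanout_children(int index, int ncpus, int child[2])
{
	int n = 0;
	int c = index * 2 + 1;

	if (c < ncpus)
		child[n++] = c;
	if (c + 1 < ncpus)
		child[n++] = c + 1;
	return n;
}
```

With 8 possible CPUs, index 0 kicks 1 and 2, index 3 kicks only 7, and indices 4 and up are leaves, matching the bounds checks in the hunk.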
+18 -45
kernel/rcu/tree.c
··· 79 79 80 80 static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = { 81 81 .gpwrap = true, 82 - #ifdef CONFIG_RCU_NOCB_CPU 83 - .cblist.flags = SEGCBLIST_RCU_CORE, 84 - #endif 85 82 }; 86 83 static struct rcu_state rcu_state = { 87 84 .level = { &rcu_state.node[0] }, ··· 94 97 .srs_cleanup_work = __WORK_INITIALIZER(rcu_state.srs_cleanup_work, 95 98 rcu_sr_normal_gp_cleanup_work), 96 99 .srs_cleanups_pending = ATOMIC_INIT(0), 100 + #ifdef CONFIG_RCU_NOCB_CPU 101 + .nocb_mutex = __MUTEX_INITIALIZER(rcu_state.nocb_mutex), 102 + #endif 97 103 }; 98 104 99 105 /* Dump rcu_node combining tree at boot to verify correct setup. */ ··· 1660 1660 * the done tail list manipulations are protected here. 1661 1661 */ 1662 1662 done = smp_load_acquire(&rcu_state.srs_done_tail); 1663 - if (!done) 1663 + if (WARN_ON_ONCE(!done)) 1664 1664 return; 1665 1665 1666 1666 WARN_ON_ONCE(!rcu_sr_is_wait_head(done)); ··· 2394 2394 { 2395 2395 unsigned long flags; 2396 2396 unsigned long mask; 2397 - bool needacc = false; 2398 2397 struct rcu_node *rnp; 2399 2398 2400 2399 WARN_ON_ONCE(rdp->cpu != smp_processor_id()); ··· 2430 2431 * to return true. So complain, but don't awaken. 2431 2432 */ 2432 2433 WARN_ON_ONCE(rcu_accelerate_cbs(rnp, rdp)); 2433 - } else if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) { 2434 - /* 2435 - * ...but NOCB kthreads may miss or delay callbacks acceleration 2436 - * if in the middle of a (de-)offloading process. 
2437 - */ 2438 - needacc = true; 2439 2434 } 2440 2435 2441 2436 rcu_disable_urgency_upon_qs(rdp); 2442 2437 rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); 2443 2438 /* ^^^ Released rnp->lock */ 2444 - 2445 - if (needacc) { 2446 - rcu_nocb_lock_irqsave(rdp, flags); 2447 - rcu_accelerate_cbs_unlocked(rnp, rdp); 2448 - rcu_nocb_unlock_irqrestore(rdp, flags); 2449 - } 2450 2439 } 2451 2440 } 2452 2441 ··· 2789 2802 unsigned long flags; 2790 2803 struct rcu_data *rdp = raw_cpu_ptr(&rcu_data); 2791 2804 struct rcu_node *rnp = rdp->mynode; 2792 - /* 2793 - * On RT rcu_core() can be preempted when IRQs aren't disabled. 2794 - * Therefore this function can race with concurrent NOCB (de-)offloading 2795 - * on this CPU and the below condition must be considered volatile. 2796 - * However if we race with: 2797 - * 2798 - * _ Offloading: In the worst case we accelerate or process callbacks 2799 - * concurrently with NOCB kthreads. We are guaranteed to 2800 - * call rcu_nocb_lock() if that happens. 2801 - * 2802 - * _ Deoffloading: In the worst case we miss callbacks acceleration or 2803 - * processing. This is fine because the early stage 2804 - * of deoffloading invokes rcu_core() after setting 2805 - * SEGCBLIST_RCU_CORE. So we guarantee that we'll process 2806 - * what could have been dismissed without the need to wait 2807 - * for the next rcu_pending() check in the next jiffy. 2808 - */ 2809 - const bool do_batch = !rcu_segcblist_completely_offloaded(&rdp->cblist); 2810 2805 2811 2806 if (cpu_is_offline(smp_processor_id())) 2812 2807 return; ··· 2808 2839 2809 2840 /* No grace period and unregistered callbacks? 
*/ 2810 2841 if (!rcu_gp_in_progress() && 2811 - rcu_segcblist_is_enabled(&rdp->cblist) && do_batch) { 2812 - rcu_nocb_lock_irqsave(rdp, flags); 2842 + rcu_segcblist_is_enabled(&rdp->cblist) && !rcu_rdp_is_offloaded(rdp)) { 2843 + local_irq_save(flags); 2813 2844 if (!rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL)) 2814 2845 rcu_accelerate_cbs_unlocked(rnp, rdp); 2815 - rcu_nocb_unlock_irqrestore(rdp, flags); 2846 + local_irq_restore(flags); 2816 2847 } 2817 2848 2818 2849 rcu_check_gp_start_stall(rnp, rdp, rcu_jiffies_till_stall_check()); 2819 2850 2820 2851 /* If there are callbacks ready, invoke them. */ 2821 - if (do_batch && rcu_segcblist_ready_cbs(&rdp->cblist) && 2852 + if (!rcu_rdp_is_offloaded(rdp) && rcu_segcblist_ready_cbs(&rdp->cblist) && 2822 2853 likely(READ_ONCE(rcu_scheduler_fully_active))) { 2823 2854 rcu_do_batch(rdp); 2824 2855 /* Re-invoke RCU core processing if there are callbacks remaining. */ ··· 3207 3238 struct list_head list; 3208 3239 struct rcu_gp_oldstate gp_snap; 3209 3240 unsigned long nr_records; 3210 - void *records[]; 3241 + void *records[] __counted_by(nr_records); 3211 3242 }; 3212 3243 3213 3244 /* ··· 3519 3550 if (delayed_work_pending(&krcp->monitor_work)) { 3520 3551 delay_left = krcp->monitor_work.timer.expires - jiffies; 3521 3552 if (delay < delay_left) 3522 - mod_delayed_work(system_wq, &krcp->monitor_work, delay); 3553 + mod_delayed_work(system_unbound_wq, &krcp->monitor_work, delay); 3523 3554 return; 3524 3555 } 3525 - queue_delayed_work(system_wq, &krcp->monitor_work, delay); 3556 + queue_delayed_work(system_unbound_wq, &krcp->monitor_work, delay); 3526 3557 } 3527 3558 3528 3559 static void ··· 3614 3645 // be that the work is in the pending state when 3615 3646 // channels have been detached following by each 3616 3647 // other. 
3617 - queue_rcu_work(system_wq, &krwp->rcu_work); 3648 + queue_rcu_work(system_unbound_wq, &krwp->rcu_work); 3618 3649 } 3619 3650 } 3620 3651 ··· 3684 3715 if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING && 3685 3716 !atomic_xchg(&krcp->work_in_progress, 1)) { 3686 3717 if (atomic_read(&krcp->backoff_page_cache_fill)) { 3687 - queue_delayed_work(system_wq, 3718 + queue_delayed_work(system_unbound_wq, 3688 3719 &krcp->page_cache_work, 3689 3720 msecs_to_jiffies(rcu_delay_page_cache_fill_msec)); 3690 3721 } else { ··· 3747 3778 } 3748 3779 3749 3780 // Finally insert and update the GP for this page. 3750 - bnode->records[bnode->nr_records++] = ptr; 3781 + bnode->nr_records++; 3782 + bnode->records[bnode->nr_records - 1] = ptr; 3751 3783 get_state_synchronize_rcu_full(&bnode->gp_snap); 3752 3784 atomic_inc(&(*krcp)->bulk_count[idx]); 3753 3785 ··· 4384 4414 { 4385 4415 unsigned long __maybe_unused s = rcu_state.barrier_sequence; 4386 4416 4417 + rhp->next = rhp; // Mark the callback as having been invoked. 4387 4418 if (atomic_dec_and_test(&rcu_state.barrier_cpu_count)) { 4388 4419 rcu_barrier_trace(TPS("LastCB"), -1, s); 4389 4420 complete(&rcu_state.barrier_completion); ··· 5406 5435 while (i > rnp->grphi) 5407 5436 rnp++; 5408 5437 per_cpu_ptr(&rcu_data, i)->mynode = rnp; 5438 + per_cpu_ptr(&rcu_data, i)->barrier_head.next = 5439 + &per_cpu_ptr(&rcu_data, i)->barrier_head; 5409 5440 rcu_boot_init_percpu_data(i); 5410 5441 } 5411 5442 }
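The tree.c hunk above adds `rhp->next = rhp;` in rcu_barrier_callback() ("Mark the callback as having been invoked") and initializes each CPU's barrier_head to point at itself at boot, so the torture stats printer's rcu_barrier_cb_is_done() check (visible in the tasks.h hunk) can tell in-flight barrier callbacks from completed or never-posted ones. A minimal sketch of that self-pointer marker, using a stand-in struct rather than the kernel's rcu_head, and assuming the done-check simply compares the pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's struct rcu_head. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *);
};

/*
 * A callback that has run points its ->next field at itself; one
 * that is still queued points ->next elsewhere (or at NULL before it
 * is ever posted, which is why boot-time code in the hunk above
 * pre-points barrier_head at itself).  No extra per-callback state
 * is needed to answer "has this barrier callback completed?".
 */
static void cb_mark_done(struct cb_head *rhp)
{
	rhp->next = rhp;	/* Mark the callback as having been invoked. */
}

static int cb_is_done(const struct cb_head *rhp)
{
	return rhp->next == rhp;
}
```

This is the same trick the hunk applies: a sentinel value that the list machinery can never produce for a queued callback doubles as a completion flag.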
+5 -1
kernel/rcu/tree.h
··· 411 411 arch_spinlock_t ofl_lock ____cacheline_internodealigned_in_smp; 412 412 /* Synchronize offline with */ 413 413 /* GP pre-initialization. */ 414 - int nocb_is_setup; /* nocb is setup from boot */ 415 414 416 415 /* synchronize_rcu() part. */ 417 416 struct llist_head srs_next; /* request a GP users. */ ··· 419 420 struct sr_wait_node srs_wait_nodes[SR_NORMAL_GP_WAIT_HEAD_MAX]; 420 421 struct work_struct srs_cleanup_work; 421 422 atomic_t srs_cleanups_pending; /* srs inflight worker cleanups. */ 423 + 424 + #ifdef CONFIG_RCU_NOCB_CPU 425 + struct mutex nocb_mutex; /* Guards (de-)offloading */ 426 + int nocb_is_setup; /* nocb is setup from boot */ 427 + #endif 422 428 }; 423 429 424 430 /* Values for rcu_state structure's gp_flags field. */
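The tree.h hunk above introduces a dedicated `nocb_mutex` ("Guards (de-)offloading") instead of reusing barrier_mutex, and the tree_nocb.h changes below flip the policy so only an offline CPU may be toggled. A rough userspace sketch of that pattern, with illustrative names and pthread primitives standing in for the kernel's mutex API:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/*
 * Toy model of the new serialization: all (de-)offload requests take
 * one dedicated mutex, and a request against an online CPU is
 * rejected with -EINVAL, mirroring the "Cannot CB-deoffload online
 * CPU" path in tree_nocb.h.  Struct and function names here are
 * made up for the sketch.
 */
struct toy_rcu_state {
	pthread_mutex_t nocb_mutex;	/* Guards (de-)offloading. */
	bool offloaded;
};

static int toy_nocb_set_offloaded(struct toy_rcu_state *rs,
				  bool cpu_online, bool offload)
{
	int ret = 0;

	pthread_mutex_lock(&rs->nocb_mutex);
	if (cpu_online)
		ret = -EINVAL;	/* Only offline CPUs may be toggled. */
	else
		rs->offloaded = offload;
	pthread_mutex_unlock(&rs->nocb_mutex);
	return ret;
}
```

A dedicated lock keeps offload toggling from serializing against rcu_barrier() callers, which only needed barrier_mutex for their own bookkeeping.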
+62 -51
kernel/rcu/tree_exp.h
··· 543 543 } 544 544 545 545 /* 546 + * Print out an expedited RCU CPU stall warning message. 547 + */ 548 + static void synchronize_rcu_expedited_stall(unsigned long jiffies_start, unsigned long j) 549 + { 550 + int cpu; 551 + unsigned long mask; 552 + int ndetected; 553 + struct rcu_node *rnp; 554 + struct rcu_node *rnp_root = rcu_get_root(); 555 + 556 + if (READ_ONCE(csd_lock_suppress_rcu_stall) && csd_lock_is_stuck()) { 557 + pr_err("INFO: %s detected expedited stalls, but suppressed full report due to a stuck CSD-lock.\n", rcu_state.name); 558 + return; 559 + } 560 + pr_err("INFO: %s detected expedited stalls on CPUs/tasks: {", rcu_state.name); 561 + ndetected = 0; 562 + rcu_for_each_leaf_node(rnp) { 563 + ndetected += rcu_print_task_exp_stall(rnp); 564 + for_each_leaf_node_possible_cpu(rnp, cpu) { 565 + struct rcu_data *rdp; 566 + 567 + mask = leaf_node_cpu_bit(rnp, cpu); 568 + if (!(READ_ONCE(rnp->expmask) & mask)) 569 + continue; 570 + ndetected++; 571 + rdp = per_cpu_ptr(&rcu_data, cpu); 572 + pr_cont(" %d-%c%c%c%c", cpu, 573 + "O."[!!cpu_online(cpu)], 574 + "o."[!!(rdp->grpmask & rnp->expmaskinit)], 575 + "N."[!!(rdp->grpmask & rnp->expmaskinitnext)], 576 + "D."[!!data_race(rdp->cpu_no_qs.b.exp)]); 577 + } 578 + } 579 + pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n", 580 + j - jiffies_start, rcu_state.expedited_sequence, data_race(rnp_root->expmask), 581 + ".T"[!!data_race(rnp_root->exp_tasks)]); 582 + if (ndetected) { 583 + pr_err("blocking rcu_node structures (internal RCU debug):"); 584 + rcu_for_each_node_breadth_first(rnp) { 585 + if (rnp == rnp_root) 586 + continue; /* printed unconditionally */ 587 + if (sync_rcu_exp_done_unlocked(rnp)) 588 + continue; 589 + pr_cont(" l=%u:%d-%d:%#lx/%c", 590 + rnp->level, rnp->grplo, rnp->grphi, data_race(rnp->expmask), 591 + ".T"[!!data_race(rnp->exp_tasks)]); 592 + } 593 + pr_cont("\n"); 594 + } 595 + rcu_for_each_leaf_node(rnp) { 596 + for_each_leaf_node_possible_cpu(rnp, cpu) { 597 + mask = 
leaf_node_cpu_bit(rnp, cpu); 598 + if (!(READ_ONCE(rnp->expmask) & mask)) 599 + continue; 600 + dump_cpu_task(cpu); 601 + } 602 + rcu_exp_print_detail_task_stall_rnp(rnp); 603 + } 604 + } 605 + 606 + /* 546 607 * Wait for the expedited grace period to elapse, issuing any needed 547 608 * RCU CPU stall warnings along the way. 548 609 */ ··· 614 553 unsigned long jiffies_stall; 615 554 unsigned long jiffies_start; 616 555 unsigned long mask; 617 - int ndetected; 618 556 struct rcu_data *rdp; 619 557 struct rcu_node *rnp; 620 - struct rcu_node *rnp_root = rcu_get_root(); 621 558 unsigned long flags; 622 559 623 560 trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("startwait")); ··· 652 593 j = jiffies; 653 594 rcu_stall_notifier_call_chain(RCU_STALL_NOTIFY_EXP, (void *)(j - jiffies_start)); 654 595 trace_rcu_stall_warning(rcu_state.name, TPS("ExpeditedStall")); 655 - pr_err("INFO: %s detected expedited stalls on CPUs/tasks: {", 656 - rcu_state.name); 657 - ndetected = 0; 658 - rcu_for_each_leaf_node(rnp) { 659 - ndetected += rcu_print_task_exp_stall(rnp); 660 - for_each_leaf_node_possible_cpu(rnp, cpu) { 661 - struct rcu_data *rdp; 662 - 663 - mask = leaf_node_cpu_bit(rnp, cpu); 664 - if (!(READ_ONCE(rnp->expmask) & mask)) 665 - continue; 666 - ndetected++; 667 - rdp = per_cpu_ptr(&rcu_data, cpu); 668 - pr_cont(" %d-%c%c%c%c", cpu, 669 - "O."[!!cpu_online(cpu)], 670 - "o."[!!(rdp->grpmask & rnp->expmaskinit)], 671 - "N."[!!(rdp->grpmask & rnp->expmaskinitnext)], 672 - "D."[!!data_race(rdp->cpu_no_qs.b.exp)]); 673 - } 674 - } 675 - pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n", 676 - j - jiffies_start, rcu_state.expedited_sequence, 677 - data_race(rnp_root->expmask), 678 - ".T"[!!data_race(rnp_root->exp_tasks)]); 679 - if (ndetected) { 680 - pr_err("blocking rcu_node structures (internal RCU debug):"); 681 - rcu_for_each_node_breadth_first(rnp) { 682 - if (rnp == rnp_root) 683 - continue; /* printed unconditionally */ 684 - if 
(sync_rcu_exp_done_unlocked(rnp)) 685 - continue; 686 - pr_cont(" l=%u:%d-%d:%#lx/%c", 687 - rnp->level, rnp->grplo, rnp->grphi, 688 - data_race(rnp->expmask), 689 - ".T"[!!data_race(rnp->exp_tasks)]); 690 - } 691 - pr_cont("\n"); 692 - } 693 - rcu_for_each_leaf_node(rnp) { 694 - for_each_leaf_node_possible_cpu(rnp, cpu) { 695 - mask = leaf_node_cpu_bit(rnp, cpu); 696 - if (!(READ_ONCE(rnp->expmask) & mask)) 697 - continue; 698 - preempt_disable(); // For smp_processor_id() in dump_cpu_task(). 699 - dump_cpu_task(cpu); 700 - preempt_enable(); 701 - } 702 - rcu_exp_print_detail_task_stall_rnp(rnp); 703 - } 596 + synchronize_rcu_expedited_stall(jiffies_start, j); 704 597 jiffies_stall = 3 * rcu_exp_jiffies_till_stall_check() + 3; 705 598 panic_on_rcu_stall(); 706 599 }
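The tree_exp.h hunk above pulls the large stall report out of the wait loop into synchronize_rcu_expedited_stall(), which first checks whether a stuck CSD lock should suppress the full dump. A schematic sketch of that shape — a polling wait loop that delegates periodic reporting to a separate, suppressible function; the names, tick-based timing, and counters are all invented for illustration:

```c
#include <assert.h>
#include <stdbool.h>

static int full_reports, short_reports;

/*
 * Factored-out report, as in synchronize_rcu_expedited_stall():
 * when suppression is requested and the CSD lock is stuck, emit only
 * a one-line notice instead of the full per-CPU dump.
 */
static void report_stall(bool csd_suppress, bool csd_stuck)
{
	if (csd_suppress && csd_stuck) {
		short_reports++;	/* "suppressed full report" line */
		return;
	}
	full_reports++;			/* full per-CPU/per-node dump */
}

/*
 * Poll until *done, invoking the report every stall_ticks polls.
 * Returns the tick at which *done was seen, or -1 on timeout.
 */
static int wait_for_gp(const bool *done, int max_ticks, int stall_ticks,
		       bool csd_suppress, bool csd_stuck)
{
	for (int t = 0; t < max_ticks; t++) {
		if (*done)
			return t;
		if (t && t % stall_ticks == 0)
			report_stall(csd_suppress, csd_stuck);
	}
	return -1;
}
```

Keeping the report in its own function leaves the wait loop short and lets the cold printing path grow (CSD-lock checks, per-node dumps) without obscuring the control flow.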
+107 -172
kernel/rcu/tree_nocb.h
··· 16 16 #ifdef CONFIG_RCU_NOCB_CPU 17 17 static cpumask_var_t rcu_nocb_mask; /* CPUs to have callbacks offloaded. */ 18 18 static bool __read_mostly rcu_nocb_poll; /* Offload kthread are to poll. */ 19 - static inline int rcu_lockdep_is_held_nocb(struct rcu_data *rdp) 20 - { 21 - return lockdep_is_held(&rdp->nocb_lock); 22 - } 23 19 24 20 static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp) 25 21 { ··· 216 220 raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags); 217 221 if (needwake) { 218 222 trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DoWake")); 219 - wake_up_process(rdp_gp->nocb_gp_kthread); 223 + swake_up_one_online(&rdp_gp->nocb_gp_wq); 220 224 } 221 225 222 226 return needwake; ··· 409 413 return false; 410 414 } 411 415 412 - // In the process of (de-)offloading: no bypassing, but 413 - // locking. 414 - if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) { 415 - rcu_nocb_lock(rdp); 416 - *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist); 417 - return false; /* Not offloaded, no bypassing. */ 418 - } 419 - 420 416 // Don't use ->nocb_bypass during early boot. 421 417 if (rcu_scheduler_active != RCU_SCHEDULER_RUNNING) { 422 418 rcu_nocb_lock(rdp); ··· 493 505 trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstBQ")); 494 506 } 495 507 rcu_nocb_bypass_unlock(rdp); 496 - smp_mb(); /* Order enqueue before wake. */ 508 + 497 509 // A wake up of the grace period kthread or timer adjustment 498 510 // needs to be done only if: 499 511 // 1. 
Bypass list was fully empty before (this is the first ··· 604 616 } 605 617 } 606 618 607 - static int nocb_gp_toggle_rdp(struct rcu_data *rdp) 619 + static void nocb_gp_toggle_rdp(struct rcu_data *rdp_gp, struct rcu_data *rdp) 608 620 { 609 621 struct rcu_segcblist *cblist = &rdp->cblist; 610 622 unsigned long flags; 611 - int ret; 612 623 613 - rcu_nocb_lock_irqsave(rdp, flags); 614 - if (rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED) && 615 - !rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP)) { 624 + /* 625 + * Locking orders future de-offloaded callbacks enqueue against previous 626 + * handling of this rdp. Ie: Make sure rcuog is done with this rdp before 627 + * deoffloaded callbacks can be enqueued. 628 + */ 629 + raw_spin_lock_irqsave(&rdp->nocb_lock, flags); 630 + if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED)) { 616 631 /* 617 632 * Offloading. Set our flag and notify the offload worker. 618 633 * We will handle this rdp until it ever gets de-offloaded. 619 634 */ 620 - rcu_segcblist_set_flags(cblist, SEGCBLIST_KTHREAD_GP); 621 - ret = 1; 622 - } else if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED) && 623 - rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP)) { 635 + list_add_tail(&rdp->nocb_entry_rdp, &rdp_gp->nocb_head_rdp); 636 + rcu_segcblist_set_flags(cblist, SEGCBLIST_OFFLOADED); 637 + } else { 624 638 /* 625 639 * De-offloading. Clear our flag and notify the de-offload worker. 626 640 * We will ignore this rdp until it ever gets re-offloaded. 
627 641 */ 628 - rcu_segcblist_clear_flags(cblist, SEGCBLIST_KTHREAD_GP); 629 - ret = 0; 630 - } else { 631 - WARN_ON_ONCE(1); 632 - ret = -1; 642 + list_del(&rdp->nocb_entry_rdp); 643 + rcu_segcblist_clear_flags(cblist, SEGCBLIST_OFFLOADED); 633 644 } 634 - 635 - rcu_nocb_unlock_irqrestore(rdp, flags); 636 - 637 - return ret; 645 + raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); 638 646 } 639 647 640 648 static void nocb_gp_sleep(struct rcu_data *my_rdp, int cpu) ··· 837 853 } 838 854 839 855 if (rdp_toggling) { 840 - int ret; 841 - 842 - ret = nocb_gp_toggle_rdp(rdp_toggling); 843 - if (ret == 1) 844 - list_add_tail(&rdp_toggling->nocb_entry_rdp, &my_rdp->nocb_head_rdp); 845 - else if (ret == 0) 846 - list_del(&rdp_toggling->nocb_entry_rdp); 847 - 856 + nocb_gp_toggle_rdp(my_rdp, rdp_toggling); 848 857 swake_up_one(&rdp_toggling->nocb_state_wq); 849 858 } 850 859 ··· 1007 1030 } 1008 1031 EXPORT_SYMBOL_GPL(rcu_nocb_flush_deferred_wakeup); 1009 1032 1010 - static int rdp_offload_toggle(struct rcu_data *rdp, 1011 - bool offload, unsigned long flags) 1012 - __releases(rdp->nocb_lock) 1033 + static int rcu_nocb_queue_toggle_rdp(struct rcu_data *rdp) 1013 1034 { 1014 - struct rcu_segcblist *cblist = &rdp->cblist; 1015 1035 struct rcu_data *rdp_gp = rdp->nocb_gp_rdp; 1016 1036 bool wake_gp = false; 1017 - 1018 - rcu_segcblist_offload(cblist, offload); 1019 - rcu_nocb_unlock_irqrestore(rdp, flags); 1037 + unsigned long flags; 1020 1038 1021 1039 raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags); 1022 1040 // Queue this rdp for add/del to/from the list to iterate on rcuog ··· 1025 1053 return wake_gp; 1026 1054 } 1027 1055 1028 - static long rcu_nocb_rdp_deoffload(void *arg) 1056 + static bool rcu_nocb_rdp_deoffload_wait_cond(struct rcu_data *rdp) 1029 1057 { 1030 - struct rcu_data *rdp = arg; 1031 - struct rcu_segcblist *cblist = &rdp->cblist; 1058 + unsigned long flags; 1059 + bool ret; 1060 + 1061 + /* 1062 + * Locking makes sure rcuog is done handling this rdp 
before deoffloaded 1063 + * enqueue can happen. Also it keeps the SEGCBLIST_OFFLOADED flag stable 1064 + * while the ->nocb_lock is held. 1065 + */ 1066 + raw_spin_lock_irqsave(&rdp->nocb_lock, flags); 1067 + ret = !rcu_segcblist_test_flags(&rdp->cblist, SEGCBLIST_OFFLOADED); 1068 + raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); 1069 + 1070 + return ret; 1071 + } 1072 + 1073 + static int rcu_nocb_rdp_deoffload(struct rcu_data *rdp) 1074 + { 1032 1075 unsigned long flags; 1033 1076 int wake_gp; 1034 1077 struct rcu_data *rdp_gp = rdp->nocb_gp_rdp; 1035 1078 1036 - /* 1037 - * rcu_nocb_rdp_deoffload() may be called directly if 1038 - * rcuog/o[p] spawn failed, because at this time the rdp->cpu 1039 - * is not online yet. 1040 - */ 1041 - WARN_ON_ONCE((rdp->cpu != raw_smp_processor_id()) && cpu_online(rdp->cpu)); 1079 + /* CPU must be offline, unless it's early boot */ 1080 + WARN_ON_ONCE(cpu_online(rdp->cpu) && rdp->cpu != raw_smp_processor_id()); 1042 1081 1043 1082 pr_info("De-offloading %d\n", rdp->cpu); 1044 1083 1084 + /* Flush all callbacks from segcblist and bypass */ 1085 + rcu_barrier(); 1086 + 1087 + /* 1088 + * Make sure the rcuoc kthread isn't in the middle of a nocb locked 1089 + * sequence while offloading is deactivated, along with nocb locking. 1090 + */ 1091 + if (rdp->nocb_cb_kthread) 1092 + kthread_park(rdp->nocb_cb_kthread); 1093 + 1045 1094 rcu_nocb_lock_irqsave(rdp, flags); 1046 - /* 1047 - * Flush once and for all now. This suffices because we are 1048 - * running on the target CPU holding ->nocb_lock (thus having 1049 - * interrupts disabled), and because rdp_offload_toggle() 1050 - * invokes rcu_segcblist_offload(), which clears SEGCBLIST_OFFLOADED. 1051 - * Thus future calls to rcu_segcblist_completely_offloaded() will 1052 - * return false, which means that future calls to rcu_nocb_try_bypass() 1053 - * will refuse to put anything into the bypass. 
1054 - */ 1055 - WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false)); 1056 - /* 1057 - * Start with invoking rcu_core() early. This way if the current thread 1058 - * happens to preempt an ongoing call to rcu_core() in the middle, 1059 - * leaving some work dismissed because rcu_core() still thinks the rdp is 1060 - * completely offloaded, we are guaranteed a nearby future instance of 1061 - * rcu_core() to catch up. 1062 - */ 1063 - rcu_segcblist_set_flags(cblist, SEGCBLIST_RCU_CORE); 1064 - invoke_rcu_core(); 1065 - wake_gp = rdp_offload_toggle(rdp, false, flags); 1095 + WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass)); 1096 + WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist)); 1097 + rcu_nocb_unlock_irqrestore(rdp, flags); 1098 + 1099 + wake_gp = rcu_nocb_queue_toggle_rdp(rdp); 1066 1100 1067 1101 mutex_lock(&rdp_gp->nocb_gp_kthread_mutex); 1102 + 1068 1103 if (rdp_gp->nocb_gp_kthread) { 1069 1104 if (wake_gp) 1070 1105 wake_up_process(rdp_gp->nocb_gp_kthread); 1071 1106 1072 1107 swait_event_exclusive(rdp->nocb_state_wq, 1073 - !rcu_segcblist_test_flags(cblist, 1074 - SEGCBLIST_KTHREAD_GP)); 1075 - if (rdp->nocb_cb_kthread) 1076 - kthread_park(rdp->nocb_cb_kthread); 1108 + rcu_nocb_rdp_deoffload_wait_cond(rdp)); 1077 1109 } else { 1078 1110 /* 1079 1111 * No kthread to clear the flags for us or remove the rdp from the nocb list 1080 1112 * to iterate. Do it here instead. Locking doesn't look stricly necessary 1081 1113 * but we stick to paranoia in this rare path. 
1082 1114 */ 1083 - rcu_nocb_lock_irqsave(rdp, flags); 1084 - rcu_segcblist_clear_flags(&rdp->cblist, SEGCBLIST_KTHREAD_GP); 1085 - rcu_nocb_unlock_irqrestore(rdp, flags); 1115 + raw_spin_lock_irqsave(&rdp->nocb_lock, flags); 1116 + rcu_segcblist_clear_flags(&rdp->cblist, SEGCBLIST_OFFLOADED); 1117 + raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); 1086 1118 1087 1119 list_del(&rdp->nocb_entry_rdp); 1088 1120 } 1121 + 1089 1122 mutex_unlock(&rdp_gp->nocb_gp_kthread_mutex); 1090 - 1091 - /* 1092 - * Lock one last time to acquire latest callback updates from kthreads 1093 - * so we can later handle callbacks locally without locking. 1094 - */ 1095 - rcu_nocb_lock_irqsave(rdp, flags); 1096 - /* 1097 - * Theoretically we could clear SEGCBLIST_LOCKING after the nocb 1098 - * lock is released but how about being paranoid for once? 1099 - */ 1100 - rcu_segcblist_clear_flags(cblist, SEGCBLIST_LOCKING); 1101 - /* 1102 - * Without SEGCBLIST_LOCKING, we can't use 1103 - * rcu_nocb_unlock_irqrestore() anymore. 
1104 - */ 1105 - raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); 1106 - 1107 - /* Sanity check */ 1108 - WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass)); 1109 - 1110 1123 1111 1124 return 0; 1112 1125 } ··· 1102 1145 int ret = 0; 1103 1146 1104 1147 cpus_read_lock(); 1105 - mutex_lock(&rcu_state.barrier_mutex); 1148 + mutex_lock(&rcu_state.nocb_mutex); 1106 1149 if (rcu_rdp_is_offloaded(rdp)) { 1107 - if (cpu_online(cpu)) { 1108 - ret = work_on_cpu(cpu, rcu_nocb_rdp_deoffload, rdp); 1150 + if (!cpu_online(cpu)) { 1151 + ret = rcu_nocb_rdp_deoffload(rdp); 1109 1152 if (!ret) 1110 1153 cpumask_clear_cpu(cpu, rcu_nocb_mask); 1111 1154 } else { 1112 - pr_info("NOCB: Cannot CB-deoffload offline CPU %d\n", rdp->cpu); 1155 + pr_info("NOCB: Cannot CB-deoffload online CPU %d\n", rdp->cpu); 1113 1156 ret = -EINVAL; 1114 1157 } 1115 1158 } 1116 - mutex_unlock(&rcu_state.barrier_mutex); 1159 + mutex_unlock(&rcu_state.nocb_mutex); 1117 1160 cpus_read_unlock(); 1118 1161 1119 1162 return ret; 1120 1163 } 1121 1164 EXPORT_SYMBOL_GPL(rcu_nocb_cpu_deoffload); 1122 1165 1123 - static long rcu_nocb_rdp_offload(void *arg) 1166 + static bool rcu_nocb_rdp_offload_wait_cond(struct rcu_data *rdp) 1124 1167 { 1125 - struct rcu_data *rdp = arg; 1126 - struct rcu_segcblist *cblist = &rdp->cblist; 1127 1168 unsigned long flags; 1169 + bool ret; 1170 + 1171 + raw_spin_lock_irqsave(&rdp->nocb_lock, flags); 1172 + ret = rcu_segcblist_test_flags(&rdp->cblist, SEGCBLIST_OFFLOADED); 1173 + raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); 1174 + 1175 + return ret; 1176 + } 1177 + 1178 + static int rcu_nocb_rdp_offload(struct rcu_data *rdp) 1179 + { 1128 1180 int wake_gp; 1129 1181 struct rcu_data *rdp_gp = rdp->nocb_gp_rdp; 1130 1182 1131 - WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id()); 1183 + WARN_ON_ONCE(cpu_online(rdp->cpu)); 1132 1184 /* 1133 1185 * For now we only support re-offload, ie: the rdp must have been 1134 1186 * offloaded on boot first. 
··· 1150 1184 1151 1185 pr_info("Offloading %d\n", rdp->cpu); 1152 1186 1153 - /* 1154 - * Can't use rcu_nocb_lock_irqsave() before SEGCBLIST_LOCKING 1155 - * is set. 1156 - */ 1157 - raw_spin_lock_irqsave(&rdp->nocb_lock, flags); 1187 + WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass)); 1188 + WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist)); 1158 1189 1159 - /* 1160 - * We didn't take the nocb lock while working on the 1161 - * rdp->cblist with SEGCBLIST_LOCKING cleared (pure softirq/rcuc mode). 1162 - * Every modifications that have been done previously on 1163 - * rdp->cblist must be visible remotely by the nocb kthreads 1164 - * upon wake up after reading the cblist flags. 1165 - * 1166 - * The layout against nocb_lock enforces that ordering: 1167 - * 1168 - * __rcu_nocb_rdp_offload() nocb_cb_wait()/nocb_gp_wait() 1169 - * ------------------------- ---------------------------- 1170 - * WRITE callbacks rcu_nocb_lock() 1171 - * rcu_nocb_lock() READ flags 1172 - * WRITE flags READ callbacks 1173 - * rcu_nocb_unlock() rcu_nocb_unlock() 1174 - */ 1175 - wake_gp = rdp_offload_toggle(rdp, true, flags); 1190 + wake_gp = rcu_nocb_queue_toggle_rdp(rdp); 1176 1191 if (wake_gp) 1177 1192 wake_up_process(rdp_gp->nocb_gp_kthread); 1178 1193 1179 - kthread_unpark(rdp->nocb_cb_kthread); 1180 - 1181 1194 swait_event_exclusive(rdp->nocb_state_wq, 1182 - rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP)); 1195 + rcu_nocb_rdp_offload_wait_cond(rdp)); 1183 1196 1184 - /* 1185 - * All kthreads are ready to work, we can finally relieve rcu_core() and 1186 - * enable nocb bypass. 
1187 - */ 1188 - rcu_nocb_lock_irqsave(rdp, flags); 1189 - rcu_segcblist_clear_flags(cblist, SEGCBLIST_RCU_CORE); 1190 - rcu_nocb_unlock_irqrestore(rdp, flags); 1197 + kthread_unpark(rdp->nocb_cb_kthread); 1191 1198 1192 1199 return 0; 1193 1200 } ··· 1171 1232 int ret = 0; 1172 1233 1173 1234 cpus_read_lock(); 1174 - mutex_lock(&rcu_state.barrier_mutex); 1235 + mutex_lock(&rcu_state.nocb_mutex); 1175 1236 if (!rcu_rdp_is_offloaded(rdp)) { 1176 - if (cpu_online(cpu)) { 1177 - ret = work_on_cpu(cpu, rcu_nocb_rdp_offload, rdp); 1237 + if (!cpu_online(cpu)) { 1238 + ret = rcu_nocb_rdp_offload(rdp); 1178 1239 if (!ret) 1179 1240 cpumask_set_cpu(cpu, rcu_nocb_mask); 1180 1241 } else { 1181 - pr_info("NOCB: Cannot CB-offload offline CPU %d\n", rdp->cpu); 1242 + pr_info("NOCB: Cannot CB-offload online CPU %d\n", rdp->cpu); 1182 1243 ret = -EINVAL; 1183 1244 } 1184 1245 } 1185 - mutex_unlock(&rcu_state.barrier_mutex); 1246 + mutex_unlock(&rcu_state.nocb_mutex); 1186 1247 cpus_read_unlock(); 1187 1248 1188 1249 return ret; ··· 1200 1261 return 0; 1201 1262 1202 1263 /* Protect rcu_nocb_mask against concurrent (de-)offloading. */ 1203 - if (!mutex_trylock(&rcu_state.barrier_mutex)) 1264 + if (!mutex_trylock(&rcu_state.nocb_mutex)) 1204 1265 return 0; 1205 1266 1206 1267 /* Snapshot count of all CPUs */ ··· 1210 1271 count += READ_ONCE(rdp->lazy_len); 1211 1272 } 1212 1273 1213 - mutex_unlock(&rcu_state.barrier_mutex); 1274 + mutex_unlock(&rcu_state.nocb_mutex); 1214 1275 1215 1276 return count ? count : SHRINK_EMPTY; 1216 1277 } ··· 1228 1289 * Protect against concurrent (de-)offloading. Otherwise nocb locking 1229 1290 * may be ignored or imbalanced. 
1230 1291 */ 1231 - if (!mutex_trylock(&rcu_state.barrier_mutex)) { 1292 + if (!mutex_trylock(&rcu_state.nocb_mutex)) { 1232 1293 /* 1233 - * But really don't insist if barrier_mutex is contended since we 1294 + * But really don't insist if nocb_mutex is contended since we 1234 1295 * can't guarantee that it will never engage in a dependency 1235 1296 * chain involving memory allocation. The lock is seldom contended 1236 1297 * anyway. ··· 1269 1330 break; 1270 1331 } 1271 1332 1272 - mutex_unlock(&rcu_state.barrier_mutex); 1333 + mutex_unlock(&rcu_state.nocb_mutex); 1273 1334 1274 1335 return count ? count : SHRINK_STOP; 1275 1336 } ··· 1335 1396 rdp = per_cpu_ptr(&rcu_data, cpu); 1336 1397 if (rcu_segcblist_empty(&rdp->cblist)) 1337 1398 rcu_segcblist_init(&rdp->cblist); 1338 - rcu_segcblist_offload(&rdp->cblist, true); 1339 - rcu_segcblist_set_flags(&rdp->cblist, SEGCBLIST_KTHREAD_GP); 1340 - rcu_segcblist_clear_flags(&rdp->cblist, SEGCBLIST_RCU_CORE); 1399 + rcu_segcblist_set_flags(&rdp->cblist, SEGCBLIST_OFFLOADED); 1341 1400 } 1342 1401 rcu_organize_nocb_kthreads(); 1343 1402 } ··· 1383 1446 "rcuog/%d", rdp_gp->cpu); 1384 1447 if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo GP kthread, OOM is now expected behavior\n", __func__)) { 1385 1448 mutex_unlock(&rdp_gp->nocb_gp_kthread_mutex); 1386 - goto end; 1449 + goto err; 1387 1450 } 1388 1451 WRITE_ONCE(rdp_gp->nocb_gp_kthread, t); 1389 1452 if (kthread_prio) ··· 1395 1458 t = kthread_create(rcu_nocb_cb_kthread, rdp, 1396 1459 "rcuo%c/%d", rcu_state.abbr, cpu); 1397 1460 if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__)) 1398 - goto end; 1461 + goto err; 1399 1462 1400 1463 if (rcu_rdp_is_offloaded(rdp)) 1401 1464 wake_up_process(t); ··· 1408 1471 WRITE_ONCE(rdp->nocb_cb_kthread, t); 1409 1472 WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread); 1410 1473 return; 1411 - end: 1412 - mutex_lock(&rcu_state.barrier_mutex); 1474 + 1475 + err: 1476 + /* 
1477 + * No need to protect against concurrent rcu_barrier() 1478 + * because the number of callbacks should be 0 for a non-boot CPU, 1479 + * therefore rcu_barrier() shouldn't even try to grab the nocb_lock. 1480 + * But hold nocb_mutex to avoid nocb_lock imbalance from shrinker. 1481 + */ 1482 + WARN_ON_ONCE(system_state > SYSTEM_BOOTING && rcu_segcblist_n_cbs(&rdp->cblist)); 1483 + mutex_lock(&rcu_state.nocb_mutex); 1413 1484 if (rcu_rdp_is_offloaded(rdp)) { 1414 1485 rcu_nocb_rdp_deoffload(rdp); 1415 1486 cpumask_clear_cpu(cpu, rcu_nocb_mask); 1416 1487 } 1417 - mutex_unlock(&rcu_state.barrier_mutex); 1488 + mutex_unlock(&rcu_state.nocb_mutex); 1418 1489 } 1419 1490 1420 1491 /* How many CB CPU IDs per GP kthread? Default of -1 for sqrt(nr_cpu_ids). */ ··· 1597 1652 } 1598 1653 1599 1654 #else /* #ifdef CONFIG_RCU_NOCB_CPU */ 1600 - 1601 - static inline int rcu_lockdep_is_held_nocb(struct rcu_data *rdp) 1602 - { 1603 - return 0; 1604 - } 1605 - 1606 - static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp) 1607 - { 1608 - return false; 1609 - } 1610 1655 1611 1656 /* No ->nocb_lock to acquire. */ 1612 1657 static void rcu_nocb_lock(struct rcu_data *rdp)
+3 -2
kernel/rcu/tree_plugin.h
··· 24 24 * timers have their own means of synchronization against the 25 25 * offloaded state updaters. 26 26 */ 27 - RCU_LOCKDEP_WARN( 27 + RCU_NOCB_LOCKDEP_WARN( 28 28 !(lockdep_is_held(&rcu_state.barrier_mutex) || 29 29 (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_held()) || 30 - rcu_lockdep_is_held_nocb(rdp) || 30 + lockdep_is_held(&rdp->nocb_lock) || 31 + lockdep_is_held(&rcu_state.nocb_mutex) || 31 32 (!(IS_ENABLED(CONFIG_PREEMPT_COUNT) && preemptible()) && 32 33 rdp == this_cpu_ptr(&rcu_data)) || 33 34 rcu_current_is_nocb_kthread(rdp)),
+9 -1
kernel/rcu/tree_stall.h
··· 9 9 10 10 #include <linux/kvm_para.h> 11 11 #include <linux/rcu_notifier.h> 12 + #include <linux/smp.h> 12 13 13 14 ////////////////////////////////////////////////////////////////////////////// 14 15 // ··· 371 370 struct rcu_node *rnp; 372 371 373 372 rcu_for_each_leaf_node(rnp) { 373 + printk_deferred_enter(); 374 374 raw_spin_lock_irqsave_rcu_node(rnp, flags); 375 375 for_each_leaf_node_possible_cpu(rnp, cpu) 376 376 if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) { ··· 381 379 dump_cpu_task(cpu); 382 380 } 383 381 raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 382 + printk_deferred_exit(); 384 383 } 385 384 } 386 385 ··· 722 719 set_preempt_need_resched(); 723 720 } 724 721 722 + static bool csd_lock_suppress_rcu_stall; 723 + module_param(csd_lock_suppress_rcu_stall, bool, 0644); 724 + 725 725 static void check_cpu_stall(struct rcu_data *rdp) 726 726 { 727 727 bool self_detected; ··· 797 791 return; 798 792 799 793 rcu_stall_notifier_call_chain(RCU_STALL_NOTIFY_NORM, (void *)j - gps); 800 - if (self_detected) { 794 + if (READ_ONCE(csd_lock_suppress_rcu_stall) && csd_lock_is_stuck()) { 795 + pr_err("INFO: %s detected stall, but suppressed full report due to a stuck CSD-lock.\n", rcu_state.name); 796 + } else if (self_detected) { 801 797 /* We haven't checked in, so go dump stack. */ 802 798 print_cpu_stall(gps); 803 799 } else {
+1 -1
kernel/sched/core.c
··· 9726 9726 9727 9727 void dump_cpu_task(int cpu) 9728 9728 { 9729 - if (cpu == smp_processor_id() && in_hardirq()) { 9729 + if (in_hardirq() && cpu == smp_processor_id()) { 9730 9730 struct pt_regs *regs; 9731 9731 9732 9732 regs = get_irq_regs();
+33 -5
kernel/smp.c
··· 208 208 return -1; 209 209 } 210 210 211 + static atomic_t n_csd_lock_stuck; 212 + 213 + /** 214 + * csd_lock_is_stuck - Has a CSD-lock acquisition been stuck too long? 215 + * 216 + * Returns @true if a CSD-lock acquisition is stuck and has been stuck 217 + * long enough for a "non-responsive CSD lock" message to be printed. 218 + */ 219 + bool csd_lock_is_stuck(void) 220 + { 221 + return !!atomic_read(&n_csd_lock_stuck); 222 + } 223 + 211 224 /* 212 225 * Complain if too much time spent waiting. Note that only 213 226 * the CSD_TYPE_SYNC/ASYNC types provide the destination CPU, 214 227 * so waiting on other types gets much less information. 215 228 */ 216 - static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id) 229 + static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id, unsigned long *nmessages) 217 230 { 218 231 int cpu = -1; 219 232 int cpux; ··· 242 229 cpu = csd_lock_wait_getcpu(csd); 243 230 pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n", 244 231 *bug_id, raw_smp_processor_id(), cpu); 232 + atomic_dec(&n_csd_lock_stuck); 245 233 return true; 246 234 } 247 235 248 236 ts2 = sched_clock(); 249 237 /* How long since we last checked for a stuck CSD lock.*/ 250 238 ts_delta = ts2 - *ts1; 251 - if (likely(ts_delta <= csd_lock_timeout_ns || csd_lock_timeout_ns == 0)) 239 + if (likely(ts_delta <= csd_lock_timeout_ns * (*nmessages + 1) * 240 + (!*nmessages ? 1 : (ilog2(num_online_cpus()) / 2 + 1)) || 241 + csd_lock_timeout_ns == 0)) 252 242 return false; 243 + 244 + if (ts0 > ts2) { 245 + /* Our own sched_clock went backward; don't blame another CPU. 
*/ 246 + ts_delta = ts0 - ts2; 247 + pr_alert("sched_clock on CPU %d went backward by %llu ns\n", raw_smp_processor_id(), ts_delta); 248 + *ts1 = ts2; 249 + return false; 250 + } 253 251 254 252 firsttime = !*bug_id; 255 253 if (firsttime) ··· 273 249 cpu_cur_csd = smp_load_acquire(&per_cpu(cur_csd, cpux)); /* Before func and info. */ 274 250 /* How long since this CSD lock was stuck. */ 275 251 ts_delta = ts2 - ts0; 276 - pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %llu ns for CPU#%02d %pS(%ps).\n", 277 - firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), ts_delta, 252 + pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %lld ns for CPU#%02d %pS(%ps).\n", 253 + firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), (s64)ts_delta, 278 254 cpu, csd->func, csd->info); 255 + (*nmessages)++; 256 + if (firsttime) 257 + atomic_inc(&n_csd_lock_stuck); 279 258 /* 280 259 * If the CSD lock is still stuck after 5 minutes, it is unlikely 281 260 * to become unstuck. Use a signed comparison to avoid triggering ··· 317 290 */ 318 291 static void __csd_lock_wait(call_single_data_t *csd) 319 292 { 293 + unsigned long nmessages = 0; 320 294 int bug_id = 0; 321 295 u64 ts0, ts1; 322 296 323 297 ts1 = ts0 = sched_clock(); 324 298 for (;;) { 325 - if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id)) 299 + if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id, &nmessages)) 326 300 break; 327 301 cpu_relax(); 328 302 }
+1
lib/Kconfig.debug
··· 1614 1614 config CSD_LOCK_WAIT_DEBUG 1615 1615 bool "Debugging for csd_lock_wait(), called from smp_call_function*()" 1616 1616 depends on DEBUG_KERNEL 1617 + depends on SMP 1617 1618 depends on 64BIT 1618 1619 default n 1619 1620 help
-2
tools/rcu/rcu-updaters.sh
··· 21 21 bpftrace -e 'kprobe:kvfree_call_rcu, 22 22 kprobe:call_rcu, 23 23 kprobe:call_rcu_tasks, 24 - kprobe:call_rcu_tasks_rude, 25 24 kprobe:call_rcu_tasks_trace, 26 25 kprobe:call_srcu, 27 26 kprobe:rcu_barrier, 28 27 kprobe:rcu_barrier_tasks, 29 - kprobe:rcu_barrier_tasks_rude, 30 28 kprobe:rcu_barrier_tasks_trace, 31 29 kprobe:srcu_barrier, 32 30 kprobe:synchronize_rcu,
+2
tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
··· 68 68 config_override_param "--kasan options" KcList "$TORTURE_KCONFIG_KASAN_ARG" 69 69 config_override_param "--kcsan options" KcList "$TORTURE_KCONFIG_KCSAN_ARG" 70 70 config_override_param "--kconfig argument" KcList "$TORTURE_KCONFIG_ARG" 71 + config_override_param "$config_dir/CFcommon.$(uname -m)" KcList \ 72 + "`cat $config_dir/CFcommon.$(uname -m) 2> /dev/null`" 71 73 cp $T/KcList $resdir/ConfigFragment 72 74 73 75 base_resdir=`echo $resdir | sed -e 's/\.[0-9]\+$//'`
+27 -11
tools/testing/selftests/rcutorture/bin/torture.sh
··· 19 19 20 20 TORTURE_ALLOTED_CPUS="`identify_qemu_vcpus`" 21 21 MAKE_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS*2)) 22 - HALF_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS/2)) 23 - if test "$HALF_ALLOTED_CPUS" -lt 1 22 + SCALE_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS/2)) 23 + if test "$SCALE_ALLOTED_CPUS" -lt 1 24 24 then 25 - HALF_ALLOTED_CPUS=1 25 + SCALE_ALLOTED_CPUS=1 26 26 fi 27 27 VERBOSE_BATCH_CPUS=$((TORTURE_ALLOTED_CPUS/16)) 28 28 if test "$VERBOSE_BATCH_CPUS" -lt 2 ··· 90 90 echo " --do-scftorture / --do-no-scftorture / --no-scftorture" 91 91 echo " --do-srcu-lockdep / --do-no-srcu-lockdep / --no-srcu-lockdep" 92 92 echo " --duration [ <minutes> | <hours>h | <days>d ]" 93 + echo " --guest-cpu-limit N" 93 94 echo " --kcsan-kmake-arg kernel-make-arguments" 94 95 exit 1 95 96 } ··· 202 201 fi 203 202 ts=`echo $2 | sed -e 's/[smhd]$//'` 204 203 duration_base=$(($ts*mult)) 204 + shift 205 + ;; 206 + --guest-cpu-limit|--guest-cpu-lim) 207 + checkarg --guest-cpu-limit "(number)" "$#" "$2" '^[0-9]*$' '^--' 208 + if (("$2" <= "$TORTURE_ALLOTED_CPUS" / 2)) 209 + then 210 + SCALE_ALLOTED_CPUS="$2" 211 + VERBOSE_BATCH_CPUS="$((SCALE_ALLOTED_CPUS/8))" 212 + if (("$VERBOSE_BATCH_CPUS" < 2)) 213 + then 214 + VERBOSE_BATCH_CPUS=0 215 + fi 216 + else 217 + echo "Ignoring value of $2 for --guest-cpu-limit which is greater than (("$TORTURE_ALLOTED_CPUS" / 2))." 218 + fi 205 219 shift 206 220 ;; 207 221 --kcsan-kmake-arg|--kcsan-kmake-args) ··· 441 425 if test "$do_scftorture" = "yes" 442 426 then 443 427 # Scale memory based on the number of CPUs. 
444 - scfmem=$((3+HALF_ALLOTED_CPUS/16)) 445 - torture_bootargs="scftorture.nthreads=$HALF_ALLOTED_CPUS torture.disable_onoff_at_boot csdlock_debug=1" 446 - torture_set "scftorture" tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration "$duration_scftorture" --configs "$configs_scftorture" --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory ${scfmem}G --trust-make 428 + scfmem=$((3+SCALE_ALLOTED_CPUS/16)) 429 + torture_bootargs="scftorture.nthreads=$SCALE_ALLOTED_CPUS torture.disable_onoff_at_boot csdlock_debug=1" 430 + torture_set "scftorture" tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration "$duration_scftorture" --configs "$configs_scftorture" --kconfig "CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --memory ${scfmem}G --trust-make 447 431 fi 448 432 449 433 if test "$do_rt" = "yes" ··· 487 471 do 488 472 if test -n "$firsttime" 489 473 then 490 - torture_bootargs="refscale.scale_type="$prim" refscale.nreaders=$HALF_ALLOTED_CPUS refscale.loops=10000 refscale.holdoff=20 torture.disable_onoff_at_boot" 491 - torture_set "refscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --bootargs "refscale.verbose_batched=$VERBOSE_BATCH_CPUS torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=$VERBOSE_BATCH_CPUS" --trust-make 474 + torture_bootargs="refscale.scale_type="$prim" refscale.nreaders=$SCALE_ALLOTED_CPUS refscale.loops=10000 refscale.holdoff=20 torture.disable_onoff_at_boot" 475 + torture_set "refscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --bootargs "refscale.verbose_batched=$VERBOSE_BATCH_CPUS torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=$VERBOSE_BATCH_CPUS" --trust-make 492 476 mv $T/last-resdir-nodebug $T/first-resdir-nodebug || : 493 
477 if test -f "$T/last-resdir-kasan" 494 478 then ··· 536 520 do 537 521 if test -n "$firsttime" 538 522 then 539 - torture_bootargs="rcuscale.scale_type="$prim" rcuscale.nwriters=$HALF_ALLOTED_CPUS rcuscale.holdoff=20 torture.disable_onoff_at_boot" 540 - torture_set "rcuscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --trust-make 523 + torture_bootargs="rcuscale.scale_type="$prim" rcuscale.nwriters=$SCALE_ALLOTED_CPUS rcuscale.holdoff=20 torture.disable_onoff_at_boot" 524 + torture_set "rcuscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --trust-make 541 525 mv $T/last-resdir-nodebug $T/first-resdir-nodebug || : 542 526 if test -f "$T/last-resdir-kasan" 543 527 then ··· 575 559 if test "$do_kvfree" = "yes" 576 560 then 577 561 torture_bootargs="rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot" 578 - torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration $duration_rcutorture --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory 2G --trust-make 562 + torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration $duration_rcutorture --kconfig "CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --memory 2G --trust-make 579 563 fi 580 564 581 565 if test "$do_clocksourcewd" = "yes"
-2
tools/testing/selftests/rcutorture/configs/rcu/CFcommon
··· 1 1 CONFIG_RCU_TORTURE_TEST=y 2 2 CONFIG_PRINTK_TIME=y 3 - CONFIG_HYPERVISOR_GUEST=y 4 3 CONFIG_PARAVIRT=y 5 - CONFIG_KVM_GUEST=y 6 4 CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n 7 5 CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n
+2
tools/testing/selftests/rcutorture/configs/rcu/CFcommon.i686
··· 1 + CONFIG_HYPERVISOR_GUEST=y 2 + CONFIG_KVM_GUEST=y
+1
tools/testing/selftests/rcutorture/configs/rcu/CFcommon.ppc64le
··· 1 + CONFIG_KVM_GUEST=y
+2
tools/testing/selftests/rcutorture/configs/rcu/CFcommon.x86_64
··· 1 + CONFIG_HYPERVISOR_GUEST=y 2 + CONFIG_KVM_GUEST=y
+1
tools/testing/selftests/rcutorture/configs/rcu/TREE07.boot
··· 2 2 rcutorture.stall_cpu=14 3 3 rcutorture.stall_cpu_holdoff=90 4 4 rcutorture.fwd_progress=0 5 + rcutree.nohz_full_patience_delay=1000
+20
tools/testing/selftests/rcutorture/configs/refscale/TINY
··· 1 + CONFIG_SMP=n 2 + CONFIG_PREEMPT_NONE=y 3 + CONFIG_PREEMPT_VOLUNTARY=n 4 + CONFIG_PREEMPT=n 5 + CONFIG_PREEMPT_DYNAMIC=n 6 + #CHECK#CONFIG_PREEMPT_RCU=n 7 + CONFIG_HZ_PERIODIC=n 8 + CONFIG_NO_HZ_IDLE=y 9 + CONFIG_NO_HZ_FULL=n 10 + CONFIG_HOTPLUG_CPU=n 11 + CONFIG_SUSPEND=n 12 + CONFIG_HIBERNATION=n 13 + CONFIG_RCU_NOCB_CPU=n 14 + CONFIG_DEBUG_LOCK_ALLOC=n 15 + CONFIG_PROVE_LOCKING=n 16 + CONFIG_RCU_BOOST=n 17 + CONFIG_DEBUG_OBJECTS_RCU_HEAD=n 18 + CONFIG_RCU_EXPERT=y 19 + CONFIG_KPROBES=n 20 + CONFIG_FTRACE=n