Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

Merge branches 'doc.2018.08.30a', 'dynticks.2018.08.30b', 'srcu.2018.08.30b' and 'torture.2018.08.29a' into HEAD

doc.2018.08.30a: Documentation updates
dynticks.2018.08.30b: RCU flavor consolidation updates and cleanups
srcu.2018.08.30b: SRCU updates
torture.2018.08.29a: Torture-test updates

+2233 -2613
+24 -26
Documentation/RCU/Design/Requirements/Requirements.html
··· 2398 2398 <p> 2399 2399 RCU depends on the scheduler, and the scheduler uses RCU to 2400 2400 protect some of its data structures. 2401 - This means the scheduler is forbidden from acquiring 2402 - the runqueue locks and the priority-inheritance locks 2403 - in the middle of an outermost RCU read-side critical section unless either 2404 - (1)&nbsp;it releases them before exiting that same 2405 - RCU read-side critical section, or 2406 - (2)&nbsp;interrupts are disabled across 2407 - that entire RCU read-side critical section. 2408 - This same prohibition also applies (recursively!) to any lock that is acquired 2409 - while holding any lock to which this prohibition applies. 2410 - Adhering to this rule prevents preemptible RCU from invoking 2411 - <tt>rcu_read_unlock_special()</tt> while either runqueue or 2412 - priority-inheritance locks are held, thus avoiding deadlock. 2413 - 2414 - <p> 2415 - Prior to v4.4, it was only necessary to disable preemption across 2416 - RCU read-side critical sections that acquired scheduler locks. 2417 - In v4.4, expedited grace periods started using IPIs, and these 2418 - IPIs could force a <tt>rcu_read_unlock()</tt> to take the slowpath. 2419 - Therefore, this expedited-grace-period change required disabling of 2420 - interrupts, not just preemption. 2421 - 2422 - <p> 2423 - For RCU's part, the preemptible-RCU <tt>rcu_read_unlock()</tt> 2424 - implementation must be written carefully to avoid similar deadlocks. 2401 + The preemptible-RCU <tt>rcu_read_unlock()</tt> 2402 + implementation must therefore be written carefully to avoid deadlocks 2403 + involving the scheduler's runqueue and priority-inheritance locks. 2425 2404 In particular, <tt>rcu_read_unlock()</tt> must tolerate an 2426 2405 interrupt where the interrupt handler invokes both 2427 2406 <tt>rcu_read_lock()</tt> and <tt>rcu_read_unlock()</tt>. ··· 2409 2430 interrupt handler's use of RCU. 
2410 2431 2411 2432 <p> 2412 - This pair of mutual scheduler-RCU requirements came as a 2433 + This scheduler-RCU requirement came as a 2413 2434 <a href="https://lwn.net/Articles/453002/">complete surprise</a>. 2414 2435 2415 2436 <p> ··· 2420 2441 <tt>CONFIG_NO_HZ_FULL=y</tt> 2421 2442 <a href="http://www.rdrop.com/users/paulmck/scalability/paper/BareMetal.2015.01.15b.pdf">did come as a surprise [PDF]</a>. 2422 2443 RCU has made good progress towards meeting this requirement, even 2423 - for context-switch-have <tt>CONFIG_NO_HZ_FULL=y</tt> workloads, 2444 + for context-switch-heavy <tt>CONFIG_NO_HZ_FULL=y</tt> workloads, 2424 2445 but there is room for further improvement. 2446 + 2447 + <p> 2448 + In the past, it was forbidden to disable interrupts across an 2449 + <tt>rcu_read_unlock()</tt> unless that interrupt-disabled region 2450 + of code also included the matching <tt>rcu_read_lock()</tt>. 2451 + Violating this restriction could result in deadlocks involving the 2452 + scheduler's runqueue and priority-inheritance spinlocks. 2453 + This restriction was lifted when interrupt-disabled calls to 2454 + <tt>rcu_read_unlock()</tt> started deferring the reporting of 2455 + the resulting RCU-preempt quiescent state until the end of that 2456 + interrupts-disabled region. 2457 + This deferred reporting means that the scheduler's runqueue and 2458 + priority-inheritance locks cannot be held while reporting an RCU-preempt 2459 + quiescent state, which lifts the earlier restriction, at least from 2460 + a deadlock perspective. 2461 + Unfortunately, real-time systems using RCU priority boosting may 2462 + need this restriction to remain in effect because deferred 2463 + quiescent-state reporting also defers deboosting, which in turn 2464 + degrades real-time latencies. 2425 2465 2426 2466 <h3><a name="Tracing and RCU">Tracing and RCU</a></h3> 2427 2467
+8 -7
Documentation/admin-guide/kernel-parameters.txt
··· 3595 3595 Set required age in jiffies for a 3596 3596 given grace period before RCU starts 3597 3597 soliciting quiescent-state help from 3598 - rcu_note_context_switch(). 3598 + rcu_note_context_switch(). If not specified, the 3599 + kernel will calculate a value based on the most 3600 + recent settings of rcutree.jiffies_till_first_fqs 3601 + and rcutree.jiffies_till_next_fqs. 3602 + This calculated value may be viewed in 3603 + rcutree.jiffies_to_sched_qs. Any attempt to 3604 + set rcutree.jiffies_to_sched_qs will be 3605 + cheerfully overwritten. 3599 3606 3600 3607 rcutree.jiffies_till_first_fqs= [KNL] 3601 3608 Set delay from grace-period initialization to ··· 3869 3862 3870 3863 rcupdate.rcu_self_test= [KNL] 3871 3864 Run the RCU early boot self tests 3872 - 3873 - rcupdate.rcu_self_test_bh= [KNL] 3874 - Run the RCU bh early boot self tests 3875 - 3876 - rcupdate.rcu_self_test_sched= [KNL] 3877 - Run the RCU sched early boot self tests 3878 3865 3879 3866 rdinit= [KNL] 3880 3867 Format: <full_path>
+15 -17
include/linux/rculist.h
··· 182 182 * @list: the RCU-protected list to splice 183 183 * @prev: points to the last element of the existing list 184 184 * @next: points to the first element of the existing list 185 - * @sync: function to sync: synchronize_rcu(), synchronize_sched(), ... 185 + * @sync: synchronize_rcu, synchronize_rcu_expedited, ... 186 186 * 187 187 * The list pointed to by @prev and @next can be RCU-read traversed 188 188 * concurrently with this function. ··· 240 240 * designed for stacks. 241 241 * @list: the RCU-protected list to splice 242 242 * @head: the place in the existing list to splice the first list into 243 - * @sync: function to sync: synchronize_rcu(), synchronize_sched(), ... 243 + * @sync: synchronize_rcu, synchronize_rcu_expedited, ... 244 244 */ 245 245 static inline void list_splice_init_rcu(struct list_head *list, 246 246 struct list_head *head, ··· 255 255 * list, designed for queues. 256 256 * @list: the RCU-protected list to splice 257 257 * @head: the place in the existing list to splice the first list into 258 - * @sync: function to sync: synchronize_rcu(), synchronize_sched(), ... 258 + * @sync: synchronize_rcu, synchronize_rcu_expedited, ... 259 259 */ 260 260 static inline void list_splice_tail_init_rcu(struct list_head *list, 261 261 struct list_head *head, ··· 359 359 * @type: the type of the struct this is embedded in. 360 360 * @member: the name of the list_head within the struct. 361 361 * 362 - * This primitive may safely run concurrently with the _rcu list-mutation 363 - * primitives such as list_add_rcu(), but requires some implicit RCU 364 - * read-side guarding. One example is running within a special 365 - * exception-time environment where preemption is disabled and where 366 - * lockdep cannot be invoked (in which case updaters must use RCU-sched, 367 - * as in synchronize_sched(), call_rcu_sched(), and friends). Another 368 - * example is when items are added to the list, but never deleted. 
362 + * This primitive may safely run concurrently with the _rcu 363 + * list-mutation primitives such as list_add_rcu(), but requires some 364 + * implicit RCU read-side guarding. One example is running within a special 365 + * exception-time environment where preemption is disabled and where lockdep 366 + * cannot be invoked. Another example is when items are added to the list, 367 + * but never deleted. 369 368 */ 370 369 #define list_entry_lockless(ptr, type, member) \ 371 370 container_of((typeof(ptr))READ_ONCE(ptr), type, member) ··· 375 376 * @head: the head for your list. 376 377 * @member: the name of the list_struct within the struct. 377 378 * 378 - * This primitive may safely run concurrently with the _rcu list-mutation 379 - * primitives such as list_add_rcu(), but requires some implicit RCU 380 - * read-side guarding. One example is running within a special 381 - * exception-time environment where preemption is disabled and where 382 - * lockdep cannot be invoked (in which case updaters must use RCU-sched, 383 - * as in synchronize_sched(), call_rcu_sched(), and friends). Another 384 - * example is when items are added to the list, but never deleted. 379 + * This primitive may safely run concurrently with the _rcu 380 + * list-mutation primitives such as list_add_rcu(), but requires some 381 + * implicit RCU read-side guarding. One example is running within a special 382 + * exception-time environment where preemption is disabled and where lockdep 383 + * cannot be invoked. Another example is when items are added to the list, 384 + * but never deleted. 385 385 */ 386 386 #define list_for_each_entry_lockless(pos, head, member) \ 387 387 for (pos = list_entry_lockless((head)->next, typeof(*pos), member); \
+109 -45
include/linux/rcupdate.h
··· 48 48 #define ulong2long(a) (*(long *)(&(a))) 49 49 50 50 /* Exported common interfaces */ 51 - 52 - #ifdef CONFIG_PREEMPT_RCU 53 51 void call_rcu(struct rcu_head *head, rcu_callback_t func); 54 - #else /* #ifdef CONFIG_PREEMPT_RCU */ 55 - #define call_rcu call_rcu_sched 56 - #endif /* #else #ifdef CONFIG_PREEMPT_RCU */ 57 - 58 - void call_rcu_bh(struct rcu_head *head, rcu_callback_t func); 59 - void call_rcu_sched(struct rcu_head *head, rcu_callback_t func); 60 - void synchronize_sched(void); 61 52 void rcu_barrier_tasks(void); 53 + void synchronize_rcu(void); 62 54 63 55 #ifdef CONFIG_PREEMPT_RCU 64 56 65 57 void __rcu_read_lock(void); 66 58 void __rcu_read_unlock(void); 67 - void synchronize_rcu(void); 68 59 69 60 /* 70 61 * Defined as a macro as it is a very low level header included from ··· 79 88 preempt_enable(); 80 89 } 81 90 82 - static inline void synchronize_rcu(void) 83 - { 84 - synchronize_sched(); 85 - } 86 - 87 91 static inline int rcu_preempt_depth(void) 88 92 { 89 93 return 0; ··· 89 103 /* Internal to kernel */ 90 104 void rcu_init(void); 91 105 extern int rcu_scheduler_active __read_mostly; 92 - void rcu_sched_qs(void); 93 - void rcu_bh_qs(void); 94 106 void rcu_check_callbacks(int user); 95 107 void rcu_report_dead(unsigned int cpu); 96 108 void rcutree_migrate_callbacks(int cpu); ··· 119 135 * RCU_NONIDLE - Indicate idle-loop code that needs RCU readers 120 136 * @a: Code that RCU needs to pay attention to. 121 137 * 122 - * RCU, RCU-bh, and RCU-sched read-side critical sections are forbidden 123 - * in the inner idle loop, that is, between the rcu_idle_enter() and 124 - * the rcu_idle_exit() -- RCU will happily ignore any such read-side 125 - * critical sections. However, things like powertop need tracepoints 126 - * in the inner idle loop. 
138 + * RCU read-side critical sections are forbidden in the inner idle loop, 139 + * that is, between the rcu_idle_enter() and the rcu_idle_exit() -- RCU 140 + * will happily ignore any such read-side critical sections. However, 141 + * things like powertop need tracepoints in the inner idle loop. 127 142 * 128 143 * This macro provides the way out: RCU_NONIDLE(do_something_with_RCU()) 129 144 * will tell RCU that it needs to pay attention, invoke its argument ··· 150 167 if (READ_ONCE((t)->rcu_tasks_holdout)) \ 151 168 WRITE_ONCE((t)->rcu_tasks_holdout, false); \ 152 169 } while (0) 153 - #define rcu_note_voluntary_context_switch(t) \ 154 - do { \ 155 - rcu_all_qs(); \ 156 - rcu_tasks_qs(t); \ 157 - } while (0) 170 + #define rcu_note_voluntary_context_switch(t) rcu_tasks_qs(t) 158 171 void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func); 159 172 void synchronize_rcu_tasks(void); 160 173 void exit_tasks_rcu_start(void); 161 174 void exit_tasks_rcu_finish(void); 162 175 #else /* #ifdef CONFIG_TASKS_RCU */ 163 176 #define rcu_tasks_qs(t) do { } while (0) 164 - #define rcu_note_voluntary_context_switch(t) rcu_all_qs() 165 - #define call_rcu_tasks call_rcu_sched 166 - #define synchronize_rcu_tasks synchronize_sched 177 + #define rcu_note_voluntary_context_switch(t) do { } while (0) 178 + #define call_rcu_tasks call_rcu 179 + #define synchronize_rcu_tasks synchronize_rcu 167 180 static inline void exit_tasks_rcu_start(void) { } 168 181 static inline void exit_tasks_rcu_finish(void) { } 169 182 #endif /* #else #ifdef CONFIG_TASKS_RCU */ ··· 304 325 * Helper functions for rcu_dereference_check(), rcu_dereference_protected() 305 326 * and rcu_assign_pointer(). Some of these could be folded into their 306 327 * callers, but they are left separate in order to ease introduction of 307 - * multiple flavors of pointers to match the multiple flavors of RCU 308 - * (e.g., __rcu_bh, * __rcu_sched, and __srcu), should this make sense in 309 - * the future. 
328 + * multiple pointers markings to match different RCU implementations 329 + * (e.g., __srcu), should this make sense in the future. 310 330 */ 311 331 312 332 #ifdef __CHECKER__ ··· 664 686 /** 665 687 * rcu_read_lock_bh() - mark the beginning of an RCU-bh critical section 666 688 * 667 - * This is equivalent of rcu_read_lock(), but to be used when updates 668 - * are being done using call_rcu_bh() or synchronize_rcu_bh(). Since 669 - * both call_rcu_bh() and synchronize_rcu_bh() consider completion of a 670 - * softirq handler to be a quiescent state, a process in RCU read-side 671 - * critical section must be protected by disabling softirqs. Read-side 672 - * critical sections in interrupt context can use just rcu_read_lock(), 673 - * though this should at least be commented to avoid confusing people 674 - * reading the code. 689 + * This is equivalent of rcu_read_lock(), but also disables softirqs. 690 + * Note that anything else that disables softirqs can also serve as 691 + * an RCU read-side critical section. 675 692 * 676 693 * Note that rcu_read_lock_bh() and the matching rcu_read_unlock_bh() 677 694 * must occur in the same context, for example, it is illegal to invoke ··· 699 726 /** 700 727 * rcu_read_lock_sched() - mark the beginning of a RCU-sched critical section 701 728 * 702 - * This is equivalent of rcu_read_lock(), but to be used when updates 703 - * are being done using call_rcu_sched() or synchronize_rcu_sched(). 704 - * Read-side critical sections can also be introduced by anything that 705 - * disables preemption, including local_irq_disable() and friends. 729 + * This is equivalent of rcu_read_lock(), but disables preemption. 730 + * Read-side critical sections can also be introduced by anything else 731 + * that disables preemption, including local_irq_disable() and friends. 
706 732 * 707 733 * Note that rcu_read_lock_sched() and the matching rcu_read_unlock_sched() 708 734 * must occur in the same context, for example, it is illegal to invoke ··· 856 884 #define smp_mb__after_unlock_lock() do { } while (0) 857 885 #endif /* #else #ifdef CONFIG_ARCH_WEAK_RELEASE_ACQUIRE */ 858 886 887 + 888 + /* Has the specified rcu_head structure been handed to call_rcu()? */ 889 + 890 + /* 891 + * rcu_head_init - Initialize rcu_head for rcu_head_after_call_rcu() 892 + * @rhp: The rcu_head structure to initialize. 893 + * 894 + * If you intend to invoke rcu_head_after_call_rcu() to test whether a 895 + * given rcu_head structure has already been passed to call_rcu(), then 896 + * you must also invoke this rcu_head_init() function on it just after 897 + * allocating that structure. Calls to this function must not race with 898 + * calls to call_rcu(), rcu_head_after_call_rcu(), or callback invocation. 899 + */ 900 + static inline void rcu_head_init(struct rcu_head *rhp) 901 + { 902 + rhp->func = (rcu_callback_t)~0L; 903 + } 904 + 905 + /* 906 + * rcu_head_after_call_rcu - Has this rcu_head been passed to call_rcu()? 907 + * @rhp: The rcu_head structure to test. 908 + * @func: The function passed to call_rcu() along with @rhp. 909 + * 910 + * Returns @true if the @rhp has been passed to call_rcu() with @func, 911 + * and @false otherwise. Emits a warning in any other case, including 912 + * the case where @rhp has already been invoked after a grace period. 913 + * Calls to this function must not race with callback invocation. One way 914 + * to avoid such races is to enclose the call to rcu_head_after_call_rcu() 915 + * in an RCU read-side critical section that includes a read-side fetch 916 + * of the pointer to the structure containing @rhp. 
917 + */ 918 + static inline bool 919 + rcu_head_after_call_rcu(struct rcu_head *rhp, rcu_callback_t f) 920 + { 921 + if (READ_ONCE(rhp->func) == f) 922 + return true; 923 + WARN_ON_ONCE(READ_ONCE(rhp->func) != (rcu_callback_t)~0L); 924 + return false; 925 + } 926 + 927 + 928 + /* Transitional pre-consolidation compatibility definitions. */ 929 + 930 + static inline void synchronize_rcu_bh(void) 931 + { 932 + synchronize_rcu(); 933 + } 934 + 935 + static inline void synchronize_rcu_bh_expedited(void) 936 + { 937 + synchronize_rcu_expedited(); 938 + } 939 + 940 + static inline void call_rcu_bh(struct rcu_head *head, rcu_callback_t func) 941 + { 942 + call_rcu(head, func); 943 + } 944 + 945 + static inline void rcu_barrier_bh(void) 946 + { 947 + rcu_barrier(); 948 + } 949 + 950 + static inline void synchronize_sched(void) 951 + { 952 + synchronize_rcu(); 953 + } 954 + 955 + static inline void synchronize_sched_expedited(void) 956 + { 957 + synchronize_rcu_expedited(); 958 + } 959 + 960 + static inline void call_rcu_sched(struct rcu_head *head, rcu_callback_t func) 961 + { 962 + call_rcu(head, func); 963 + } 964 + 965 + static inline void rcu_barrier_sched(void) 966 + { 967 + rcu_barrier(); 968 + } 969 + 970 + static inline unsigned long get_state_synchronize_sched(void) 971 + { 972 + return get_state_synchronize_rcu(); 973 + } 974 + 975 + static inline void cond_synchronize_sched(unsigned long oldstate) 976 + { 977 + cond_synchronize_rcu(oldstate); 978 + } 859 979 860 980 #endif /* __LINUX_RCUPDATE_H */
+7 -7
include/linux/rcupdate_wait.h
··· 33 33 34 34 /** 35 35 * synchronize_rcu_mult - Wait concurrently for multiple grace periods 36 - * @...: List of call_rcu() functions for the flavors to wait on. 36 + * @...: List of call_rcu() functions for different grace periods to wait on 37 37 * 38 - * This macro waits concurrently for multiple flavors of RCU grace periods. 39 - * For example, synchronize_rcu_mult(call_rcu, call_rcu_bh) would wait 40 - * on concurrent RCU and RCU-bh grace periods. Waiting on a give SRCU 38 + * This macro waits concurrently for multiple types of RCU grace periods. 39 + * For example, synchronize_rcu_mult(call_rcu, call_rcu_tasks) would wait 40 + * on concurrent RCU and RCU-tasks grace periods. Waiting on a give SRCU 41 41 * domain requires you to write a wrapper function for that SRCU domain's 42 42 * call_srcu() function, supplying the corresponding srcu_struct. 43 43 * 44 - * If Tiny RCU, tell _wait_rcu_gp() not to bother waiting for RCU 45 - * or RCU-bh, given that anywhere synchronize_rcu_mult() can be called 46 - * is automatically a grace period. 44 + * If Tiny RCU, tell _wait_rcu_gp() does not bother waiting for RCU, 45 + * given that anywhere synchronize_rcu_mult() can be called is automatically 46 + * a grace period. 47 47 */ 48 48 #define synchronize_rcu_mult(...) \ 49 49 _wait_rcu_gp(IS_ENABLED(CONFIG_TINY_RCU), __VA_ARGS__)
+17 -42
include/linux/rcutiny.h
··· 27 27 28 28 #include <linux/ktime.h> 29 29 30 - struct rcu_dynticks; 31 - static inline int rcu_dynticks_snap(struct rcu_dynticks *rdtp) 32 - { 33 - return 0; 34 - } 35 - 36 30 /* Never flag non-existent other CPUs! */ 37 31 static inline bool rcu_eqs_special_set(int cpu) { return false; } 38 32 ··· 40 46 might_sleep(); 41 47 } 42 48 43 - static inline unsigned long get_state_synchronize_sched(void) 44 - { 45 - return 0; 46 - } 47 - 48 - static inline void cond_synchronize_sched(unsigned long oldstate) 49 - { 50 - might_sleep(); 51 - } 52 - 53 - extern void rcu_barrier_bh(void); 54 - extern void rcu_barrier_sched(void); 49 + extern void rcu_barrier(void); 55 50 56 51 static inline void synchronize_rcu_expedited(void) 57 52 { 58 - synchronize_sched(); /* Only one CPU, so pretty fast anyway!!! */ 53 + synchronize_rcu(); 59 54 } 60 55 61 - static inline void rcu_barrier(void) 62 - { 63 - rcu_barrier_sched(); /* Only one CPU, so only one list of callbacks! */ 64 - } 65 - 66 - static inline void synchronize_rcu_bh(void) 67 - { 68 - synchronize_sched(); 69 - } 70 - 71 - static inline void synchronize_rcu_bh_expedited(void) 72 - { 73 - synchronize_sched(); 74 - } 75 - 76 - static inline void synchronize_sched_expedited(void) 77 - { 78 - synchronize_sched(); 79 - } 80 - 81 - static inline void kfree_call_rcu(struct rcu_head *head, 82 - rcu_callback_t func) 56 + static inline void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func) 83 57 { 84 58 call_rcu(head, func); 85 59 } 86 60 61 + void rcu_qs(void); 62 + 63 + static inline void rcu_softirq_qs(void) 64 + { 65 + rcu_qs(); 66 + } 67 + 87 68 #define rcu_note_context_switch(preempt) \ 88 69 do { \ 89 - rcu_sched_qs(); \ 70 + rcu_qs(); \ 90 71 rcu_tasks_qs(current); \ 91 72 } while (0) 92 73 ··· 77 108 */ 78 109 static inline void rcu_virt_note_context_switch(int cpu) { } 79 110 static inline void rcu_cpu_stall_reset(void) { } 111 + static inline int rcu_jiffies_till_stall_check(void) { return 21 * HZ; } 80 112 
static inline void rcu_idle_enter(void) { } 81 113 static inline void rcu_idle_exit(void) { } 82 114 static inline void rcu_irq_enter(void) { } ··· 85 115 static inline void rcu_irq_enter_irqson(void) { } 86 116 static inline void rcu_irq_exit(void) { } 87 117 static inline void exit_rcu(void) { } 118 + static inline bool rcu_preempt_need_deferred_qs(struct task_struct *t) 119 + { 120 + return false; 121 + } 122 + static inline void rcu_preempt_deferred_qs(struct task_struct *t) { } 88 123 #ifdef CONFIG_SRCU 89 124 void rcu_scheduler_starting(void); 90 125 #else /* #ifndef CONFIG_SRCU */
+3 -28
include/linux/rcutree.h
··· 30 30 #ifndef __LINUX_RCUTREE_H 31 31 #define __LINUX_RCUTREE_H 32 32 33 + void rcu_softirq_qs(void); 33 34 void rcu_note_context_switch(bool preempt); 34 35 int rcu_needs_cpu(u64 basem, u64 *nextevt); 35 36 void rcu_cpu_stall_reset(void); ··· 45 44 rcu_note_context_switch(false); 46 45 } 47 46 48 - void synchronize_rcu_bh(void); 49 - void synchronize_sched_expedited(void); 50 47 void synchronize_rcu_expedited(void); 51 - 52 48 void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func); 53 49 54 - /** 55 - * synchronize_rcu_bh_expedited - Brute-force RCU-bh grace period 56 - * 57 - * Wait for an RCU-bh grace period to elapse, but use a "big hammer" 58 - * approach to force the grace period to end quickly. This consumes 59 - * significant time on all CPUs and is unfriendly to real-time workloads, 60 - * so is thus not recommended for any sort of common-case code. In fact, 61 - * if you are using synchronize_rcu_bh_expedited() in a loop, please 62 - * restructure your code to batch your updates, and then use a single 63 - * synchronize_rcu_bh() instead. 64 - * 65 - * Note that it is illegal to call this function while holding any lock 66 - * that is acquired by a CPU-hotplug notifier. And yes, it is also illegal 67 - * to call this function from a CPU-hotplug notifier. Failing to observe 68 - * these restriction will result in deadlock. 
69 - */ 70 - static inline void synchronize_rcu_bh_expedited(void) 71 - { 72 - synchronize_sched_expedited(); 73 - } 74 - 75 50 void rcu_barrier(void); 76 - void rcu_barrier_bh(void); 77 - void rcu_barrier_sched(void); 78 51 bool rcu_eqs_special_set(int cpu); 79 52 unsigned long get_state_synchronize_rcu(void); 80 53 void cond_synchronize_rcu(unsigned long oldstate); 81 - unsigned long get_state_synchronize_sched(void); 82 - void cond_synchronize_sched(unsigned long oldstate); 83 54 84 55 void rcu_idle_enter(void); 85 56 void rcu_idle_exit(void); ··· 66 93 extern int rcu_scheduler_active __read_mostly; 67 94 void rcu_end_inkernel_boot(void); 68 95 bool rcu_is_watching(void); 96 + #ifndef CONFIG_PREEMPT 69 97 void rcu_all_qs(void); 98 + #endif 70 99 71 100 /* RCUtree hotplug events */ 72 101 int rcutree_prepare_cpu(unsigned int cpu);
+1 -5
include/linux/sched.h
··· 571 571 struct { 572 572 u8 blocked; 573 573 u8 need_qs; 574 - u8 exp_need_qs; 575 - 576 - /* Otherwise the compiler can store garbage here: */ 577 - u8 pad; 578 574 } b; /* Bits. */ 579 - u32 s; /* Set of bits. */ 575 + u16 s; /* Set of bits. */ 580 576 }; 581 577 582 578 enum perf_event_task_context {
+7 -6
include/linux/srcutree.h
··· 105 105 #define SRCU_STATE_SCAN2 2 106 106 107 107 #define __SRCU_STRUCT_INIT(name, pcpu_name) \ 108 - { \ 109 - .sda = &pcpu_name, \ 110 - .lock = __SPIN_LOCK_UNLOCKED(name.lock), \ 111 - .srcu_gp_seq_needed = 0 - 1, \ 112 - __SRCU_DEP_MAP_INIT(name) \ 113 - } 108 + { \ 109 + .sda = &pcpu_name, \ 110 + .lock = __SPIN_LOCK_UNLOCKED(name.lock), \ 111 + .srcu_gp_seq_needed = -1UL, \ 112 + .work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0), \ 113 + __SRCU_DEP_MAP_INIT(name) \ 114 + } 114 115 115 116 /* 116 117 * Define and initialize a srcu struct at build time.
+1 -1
include/linux/torture.h
··· 77 77 int torture_shutdown_init(int ssecs, void (*cleanup)(void)); 78 78 79 79 /* Task stuttering, which forces load/no-load transitions. */ 80 - void stutter_wait(const char *title); 80 + bool stutter_wait(const char *title); 81 81 int torture_stutter_init(int s); 82 82 83 83 /* Initialization and cleanup. */
+12 -13
include/trace/events/rcu.h
··· 393 393 * Tracepoint for quiescent states detected by force_quiescent_state(). 394 394 * These trace events include the type of RCU, the grace-period number 395 395 * that was blocked by the CPU, the CPU itself, and the type of quiescent 396 - * state, which can be "dti" for dyntick-idle mode, "kick" when kicking 397 - * a CPU that has been in dyntick-idle mode for too long, or "rqc" if the 398 - * CPU got a quiescent state via its rcu_qs_ctr. 396 + * state, which can be "dti" for dyntick-idle mode or "kick" when kicking 397 + * a CPU that has been in dyntick-idle mode for too long. 399 398 */ 400 399 TRACE_EVENT(rcu_fqs, 401 400 ··· 704 705 ); 705 706 706 707 /* 707 - * Tracepoint for _rcu_barrier() execution. The string "s" describes 708 - * the _rcu_barrier phase: 709 - * "Begin": _rcu_barrier() started. 710 - * "EarlyExit": _rcu_barrier() piggybacked, thus early exit. 711 - * "Inc1": _rcu_barrier() piggyback check counter incremented. 712 - * "OfflineNoCB": _rcu_barrier() found callback on never-online CPU 713 - * "OnlineNoCB": _rcu_barrier() found online no-CBs CPU. 714 - * "OnlineQ": _rcu_barrier() found online CPU with callbacks. 715 - * "OnlineNQ": _rcu_barrier() found online CPU, no callbacks. 708 + * Tracepoint for rcu_barrier() execution. The string "s" describes 709 + * the rcu_barrier phase: 710 + * "Begin": rcu_barrier() started. 711 + * "EarlyExit": rcu_barrier() piggybacked, thus early exit. 712 + * "Inc1": rcu_barrier() piggyback check counter incremented. 713 + * "OfflineNoCB": rcu_barrier() found callback on never-online CPU 714 + * "OnlineNoCB": rcu_barrier() found online no-CBs CPU. 715 + * "OnlineQ": rcu_barrier() found online CPU with callbacks. 716 + * "OnlineNQ": rcu_barrier() found online CPU, no callbacks. 716 717 * "IRQ": An rcu_barrier_callback() callback posted on remote CPU. 717 718 * "IRQNQ": An rcu_barrier_callback() callback found no callbacks. 718 719 * "CB": An rcu_barrier_callback() invoked a callback, not the last. 
719 720 * "LastCB": An rcu_barrier_callback() invoked the last callback. 720 - * "Inc2": _rcu_barrier() piggyback check counter incremented. 721 + * "Inc2": rcu_barrier() piggyback check counter incremented. 721 722 * The "cpu" argument is the CPU or -1 if meaningless, the "cnt" argument 722 723 * is the count of remaining callbacks, and "done" is the piggybacking count. 723 724 */
+7 -7
kernel/rcu/Kconfig
··· 196 196 This option boosts the priority of preempted RCU readers that 197 197 block the current preemptible RCU grace period for too long. 198 198 This option also prevents heavy loads from blocking RCU 199 - callback invocation for all flavors of RCU. 199 + callback invocation. 200 200 201 201 Say Y here if you are working with real-time apps or heavy loads 202 202 Say N here if you are unsure. ··· 225 225 callback invocation to energy-efficient CPUs in battery-powered 226 226 asymmetric multiprocessors. 227 227 228 - This option offloads callback invocation from the set of 229 - CPUs specified at boot time by the rcu_nocbs parameter. 230 - For each such CPU, a kthread ("rcuox/N") will be created to 231 - invoke callbacks, where the "N" is the CPU being offloaded, 232 - and where the "x" is "b" for RCU-bh, "p" for RCU-preempt, and 233 - "s" for RCU-sched. Nothing prevents this kthread from running 228 + This option offloads callback invocation from the set of CPUs 229 + specified at boot time by the rcu_nocbs parameter. For each 230 + such CPU, a kthread ("rcuox/N") will be created to invoke 231 + callbacks, where the "N" is the CPU being offloaded, and where 232 + the "p" for RCU-preempt (PREEMPT kernels) and "s" for RCU-sched 233 + (!PREEMPT kernels). Nothing prevents this kthread from running 234 234 on the specified CPUs, but (1) the kthreads may be preempted 235 235 between each callback, and (2) affinity or cgroups can be used 236 236 to force the kthreads to run on whatever set of CPUs is desired.
+31 -36
kernel/rcu/rcu.h
··· 176 176 177 177 /* 178 178 * debug_rcu_head_queue()/debug_rcu_head_unqueue() are used internally 179 - * by call_rcu() and rcu callback execution, and are therefore not part of the 180 - * RCU API. Leaving in rcupdate.h because they are used by all RCU flavors. 179 + * by call_rcu() and rcu callback execution, and are therefore not part 180 + * of the RCU API. These are in rcupdate.h because they are used by all 181 + * RCU implementations. 181 182 */ 182 183 183 184 #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD ··· 224 223 */ 225 224 static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head) 226 225 { 226 + rcu_callback_t f; 227 227 unsigned long offset = (unsigned long)head->func; 228 228 229 229 rcu_lock_acquire(&rcu_callback_map); ··· 235 233 return true; 236 234 } else { 237 235 RCU_TRACE(trace_rcu_invoke_callback(rn, head);) 238 - head->func(head); 236 + f = head->func; 237 + WRITE_ONCE(head->func, (rcu_callback_t)0L); 238 + f(head); 239 239 rcu_lock_release(&rcu_callback_map); 240 240 return false; 241 241 } ··· 332 328 } 333 329 } 334 330 335 - /* Returns first leaf rcu_node of the specified RCU flavor. */ 336 - #define rcu_first_leaf_node(rsp) ((rsp)->level[rcu_num_lvls - 1]) 331 + /* Returns a pointer to the first leaf rcu_node structure. */ 332 + #define rcu_first_leaf_node() (rcu_state.level[rcu_num_lvls - 1]) 337 333 338 334 /* Is this rcu_node a leaf? */ 339 335 #define rcu_is_leaf_node(rnp) ((rnp)->level == rcu_num_lvls - 1) 340 336 341 337 /* Is this rcu_node the last leaf? */ 342 - #define rcu_is_last_leaf_node(rsp, rnp) ((rnp) == &(rsp)->node[rcu_num_nodes - 1]) 338 + #define rcu_is_last_leaf_node(rnp) ((rnp) == &rcu_state.node[rcu_num_nodes - 1]) 343 339 344 340 /* 345 - * Do a full breadth-first scan of the rcu_node structures for the 346 - * specified rcu_state structure. 
341 + * Do a full breadth-first scan of the {s,}rcu_node structures for the 342 + * specified state structure (for SRCU) or the only rcu_state structure 343 + * (for RCU). 347 344 */ 348 - #define rcu_for_each_node_breadth_first(rsp, rnp) \ 349 - for ((rnp) = &(rsp)->node[0]; \ 350 - (rnp) < &(rsp)->node[rcu_num_nodes]; (rnp)++) 345 + #define srcu_for_each_node_breadth_first(sp, rnp) \ 346 + for ((rnp) = &(sp)->node[0]; \ 347 + (rnp) < &(sp)->node[rcu_num_nodes]; (rnp)++) 348 + #define rcu_for_each_node_breadth_first(rnp) \ 349 + srcu_for_each_node_breadth_first(&rcu_state, rnp) 351 350 352 351 /* 353 - * Do a breadth-first scan of the non-leaf rcu_node structures for the 354 - * specified rcu_state structure. Note that if there is a singleton 355 - * rcu_node tree with but one rcu_node structure, this loop is a no-op. 352 + * Scan the leaves of the rcu_node hierarchy for the rcu_state structure. 353 + * Note that if there is a singleton rcu_node tree with but one rcu_node 354 + * structure, this loop -will- visit the rcu_node structure. It is still 355 + * a leaf node, even if it is also the root node. 356 356 */ 357 - #define rcu_for_each_nonleaf_node_breadth_first(rsp, rnp) \ 358 - for ((rnp) = &(rsp)->node[0]; !rcu_is_leaf_node(rsp, rnp); (rnp)++) 359 - 360 - /* 361 - * Scan the leaves of the rcu_node hierarchy for the specified rcu_state 362 - * structure. Note that if there is a singleton rcu_node tree with but 363 - * one rcu_node structure, this loop -will- visit the rcu_node structure. 364 - * It is still a leaf node, even if it is also the root node. 365 - */ 366 - #define rcu_for_each_leaf_node(rsp, rnp) \ 367 - for ((rnp) = rcu_first_leaf_node(rsp); \ 368 - (rnp) < &(rsp)->node[rcu_num_nodes]; (rnp)++) 357 + #define rcu_for_each_leaf_node(rnp) \ 358 + for ((rnp) = rcu_first_leaf_node(); \ 359 + (rnp) < &rcu_state.node[rcu_num_nodes]; (rnp)++) 369 360 370 361 /* 371 362 * Iterate over all possible CPUs in a leaf RCU node. 
··· 433 434 lockdep_assert_held(&ACCESS_PRIVATE(p, lock)) 434 435 435 436 #endif /* #if defined(SRCU) || !defined(TINY_RCU) */ 437 + 438 + #ifdef CONFIG_SRCU 439 + void srcu_init(void); 440 + #else /* #ifdef CONFIG_SRCU */ 441 + static inline void srcu_init(void) { } 442 + #endif /* #else #ifdef CONFIG_SRCU */ 436 443 437 444 #ifdef CONFIG_TINY_RCU 438 445 /* Tiny RCU doesn't expedite, as its purpose in life is instead to be tiny. */ ··· 520 515 521 516 #ifdef CONFIG_TINY_RCU 522 517 static inline unsigned long rcu_get_gp_seq(void) { return 0; } 523 - static inline unsigned long rcu_bh_get_gp_seq(void) { return 0; } 524 - static inline unsigned long rcu_sched_get_gp_seq(void) { return 0; } 525 518 static inline unsigned long rcu_exp_batches_completed(void) { return 0; } 526 - static inline unsigned long rcu_exp_batches_completed_sched(void) { return 0; } 527 519 static inline unsigned long 528 520 srcu_batches_completed(struct srcu_struct *sp) { return 0; } 529 521 static inline void rcu_force_quiescent_state(void) { } 530 - static inline void rcu_bh_force_quiescent_state(void) { } 531 - static inline void rcu_sched_force_quiescent_state(void) { } 532 522 static inline void show_rcu_gp_kthreads(void) { } 533 523 static inline int rcu_get_gp_kthreads_prio(void) { return 0; } 534 524 #else /* #ifdef CONFIG_TINY_RCU */ 535 525 unsigned long rcu_get_gp_seq(void); 536 - unsigned long rcu_bh_get_gp_seq(void); 537 - unsigned long rcu_sched_get_gp_seq(void); 538 526 unsigned long rcu_exp_batches_completed(void); 539 - unsigned long rcu_exp_batches_completed_sched(void); 540 527 unsigned long srcu_batches_completed(struct srcu_struct *sp); 541 528 void show_rcu_gp_kthreads(void); 542 529 int rcu_get_gp_kthreads_prio(void); 543 530 void rcu_force_quiescent_state(void); 544 - void rcu_bh_force_quiescent_state(void); 545 - void rcu_sched_force_quiescent_state(void); 546 531 extern struct workqueue_struct *rcu_gp_wq; 547 532 extern struct workqueue_struct *rcu_par_gp_wq; 548 
533 #endif /* #else #ifdef CONFIG_TINY_RCU */
+3 -63
kernel/rcu/rcuperf.c
··· 190 190 }; 191 191 192 192 /* 193 - * Definitions for rcu_bh perf testing. 194 - */ 195 - 196 - static int rcu_bh_perf_read_lock(void) __acquires(RCU_BH) 197 - { 198 - rcu_read_lock_bh(); 199 - return 0; 200 - } 201 - 202 - static void rcu_bh_perf_read_unlock(int idx) __releases(RCU_BH) 203 - { 204 - rcu_read_unlock_bh(); 205 - } 206 - 207 - static struct rcu_perf_ops rcu_bh_ops = { 208 - .ptype = RCU_BH_FLAVOR, 209 - .init = rcu_sync_perf_init, 210 - .readlock = rcu_bh_perf_read_lock, 211 - .readunlock = rcu_bh_perf_read_unlock, 212 - .get_gp_seq = rcu_bh_get_gp_seq, 213 - .gp_diff = rcu_seq_diff, 214 - .exp_completed = rcu_exp_batches_completed_sched, 215 - .async = call_rcu_bh, 216 - .gp_barrier = rcu_barrier_bh, 217 - .sync = synchronize_rcu_bh, 218 - .exp_sync = synchronize_rcu_bh_expedited, 219 - .name = "rcu_bh" 220 - }; 221 - 222 - /* 223 193 * Definitions for srcu perf testing. 224 194 */ 225 195 ··· 273 303 .sync = srcu_perf_synchronize, 274 304 .exp_sync = srcu_perf_synchronize_expedited, 275 305 .name = "srcud" 276 - }; 277 - 278 - /* 279 - * Definitions for sched perf testing. 280 - */ 281 - 282 - static int sched_perf_read_lock(void) 283 - { 284 - preempt_disable(); 285 - return 0; 286 - } 287 - 288 - static void sched_perf_read_unlock(int idx) 289 - { 290 - preempt_enable(); 291 - } 292 - 293 - static struct rcu_perf_ops sched_ops = { 294 - .ptype = RCU_SCHED_FLAVOR, 295 - .init = rcu_sync_perf_init, 296 - .readlock = sched_perf_read_lock, 297 - .readunlock = sched_perf_read_unlock, 298 - .get_gp_seq = rcu_sched_get_gp_seq, 299 - .gp_diff = rcu_seq_diff, 300 - .exp_completed = rcu_exp_batches_completed_sched, 301 - .async = call_rcu_sched, 302 - .gp_barrier = rcu_barrier_sched, 303 - .sync = synchronize_sched, 304 - .exp_sync = synchronize_sched_expedited, 305 - .name = "sched" 306 306 }; 307 307 308 308 /* ··· 551 611 kfree(writer_n_durations); 552 612 } 553 613 554 - /* Do flavor-specific cleanup operations. 
*/ 614 + /* Do torture-type-specific cleanup operations. */ 555 615 if (cur_ops->cleanup != NULL) 556 616 cur_ops->cleanup(); 557 617 ··· 601 661 long i; 602 662 int firsterr = 0; 603 663 static struct rcu_perf_ops *perf_ops[] = { 604 - &rcu_ops, &rcu_bh_ops, &srcu_ops, &srcud_ops, &sched_ops, 605 - &tasks_ops, 664 + &rcu_ops, &srcu_ops, &srcud_ops, &tasks_ops, 606 665 }; 607 666 608 667 if (!torture_init_begin(perf_type, verbose)) ··· 619 680 for (i = 0; i < ARRAY_SIZE(perf_ops); i++) 620 681 pr_cont(" %s", perf_ops[i]->name); 621 682 pr_cont("\n"); 683 + WARN_ON(!IS_MODULE(CONFIG_RCU_PERF_TEST)); 622 684 firsterr = -EINVAL; 623 685 goto unwind; 624 686 }
+272 -125
kernel/rcu/rcutorture.c
··· 66 66 /* Bits for ->extendables field, extendables param, and related definitions. */ 67 67 #define RCUTORTURE_RDR_SHIFT 8 /* Put SRCU index in upper bits. */ 68 68 #define RCUTORTURE_RDR_MASK ((1 << RCUTORTURE_RDR_SHIFT) - 1) 69 - #define RCUTORTURE_RDR_BH 0x1 /* Extend readers by disabling bh. */ 70 - #define RCUTORTURE_RDR_IRQ 0x2 /* ... disabling interrupts. */ 71 - #define RCUTORTURE_RDR_PREEMPT 0x4 /* ... disabling preemption. */ 72 - #define RCUTORTURE_RDR_RCU 0x8 /* ... entering another RCU reader. */ 73 - #define RCUTORTURE_RDR_NBITS 4 /* Number of bits defined above. */ 74 - #define RCUTORTURE_MAX_EXTEND (RCUTORTURE_RDR_BH | RCUTORTURE_RDR_IRQ | \ 75 - RCUTORTURE_RDR_PREEMPT) 69 + #define RCUTORTURE_RDR_BH 0x01 /* Extend readers by disabling bh. */ 70 + #define RCUTORTURE_RDR_IRQ 0x02 /* ... disabling interrupts. */ 71 + #define RCUTORTURE_RDR_PREEMPT 0x04 /* ... disabling preemption. */ 72 + #define RCUTORTURE_RDR_RBH 0x08 /* ... rcu_read_lock_bh(). */ 73 + #define RCUTORTURE_RDR_SCHED 0x10 /* ... rcu_read_lock_sched(). */ 74 + #define RCUTORTURE_RDR_RCU 0x20 /* ... entering another RCU reader. */ 75 + #define RCUTORTURE_RDR_NBITS 6 /* Number of bits defined above. */ 76 + #define RCUTORTURE_MAX_EXTEND \ 77 + (RCUTORTURE_RDR_BH | RCUTORTURE_RDR_IRQ | RCUTORTURE_RDR_PREEMPT | \ 78 + RCUTORTURE_RDR_RBH | RCUTORTURE_RDR_SCHED) 76 79 #define RCUTORTURE_RDR_MAX_LOOPS 0x7 /* Maximum reader extensions. */ 77 80 /* Must be power of two minus one. 
*/ 81 + #define RCUTORTURE_RDR_MAX_SEGS (RCUTORTURE_RDR_MAX_LOOPS + 3) 78 82 79 83 torture_param(int, cbflood_inter_holdoff, HZ, 80 84 "Holdoff between floods (jiffies)"); ··· 93 89 "Duration of fqs bursts (us), 0 to disable"); 94 90 torture_param(int, fqs_holdoff, 0, "Holdoff time within fqs bursts (us)"); 95 91 torture_param(int, fqs_stutter, 3, "Wait time between fqs bursts (s)"); 92 + torture_param(bool, fwd_progress, 1, "Test grace-period forward progress"); 93 + torture_param(int, fwd_progress_div, 4, "Fraction of CPU stall to wait"); 94 + torture_param(int, fwd_progress_holdoff, 60, 95 + "Time between forward-progress tests (s)"); 96 + torture_param(bool, fwd_progress_need_resched, 1, 97 + "Hide cond_resched() behind need_resched()"); 96 98 torture_param(bool, gp_cond, false, "Use conditional/async GP wait primitives"); 97 99 torture_param(bool, gp_exp, false, "Use expedited GP wait primitives"); 98 100 torture_param(bool, gp_normal, false, ··· 135 125 136 126 static char *torture_type = "rcu"; 137 127 module_param(torture_type, charp, 0444); 138 - MODULE_PARM_DESC(torture_type, "Type of RCU to torture (rcu, rcu_bh, ...)"); 128 + MODULE_PARM_DESC(torture_type, "Type of RCU to torture (rcu, srcu, ...)"); 139 129 140 130 static int nrealreaders; 141 131 static int ncbflooders; ··· 147 137 static struct task_struct *fqs_task; 148 138 static struct task_struct *boost_tasks[NR_CPUS]; 149 139 static struct task_struct *stall_task; 140 + static struct task_struct *fwd_prog_task; 150 141 static struct task_struct **barrier_cbs_tasks; 151 142 static struct task_struct *barrier_task; 152 143 ··· 207 196 "RTWS_STUTTER", 208 197 "RTWS_STOPPING", 209 198 }; 199 + 200 + /* Record reader segment types and duration for first failing read. 
*/ 201 + struct rt_read_seg { 202 + int rt_readstate; 203 + unsigned long rt_delay_jiffies; 204 + unsigned long rt_delay_ms; 205 + unsigned long rt_delay_us; 206 + bool rt_preempted; 207 + }; 208 + static int err_segs_recorded; 209 + static struct rt_read_seg err_segs[RCUTORTURE_RDR_MAX_SEGS]; 210 + static int rt_read_nsegs; 210 211 211 212 static const char *rcu_torture_writer_state_getname(void) 212 213 { ··· 301 278 void (*init)(void); 302 279 void (*cleanup)(void); 303 280 int (*readlock)(void); 304 - void (*read_delay)(struct torture_random_state *rrsp); 281 + void (*read_delay)(struct torture_random_state *rrsp, 282 + struct rt_read_seg *rtrsp); 305 283 void (*readunlock)(int idx); 306 284 unsigned long (*get_gp_seq)(void); 307 285 unsigned long (*gp_diff)(unsigned long new, unsigned long old); ··· 315 291 void (*cb_barrier)(void); 316 292 void (*fqs)(void); 317 293 void (*stats)(void); 294 + int (*stall_dur)(void); 318 295 int irq_capable; 319 296 int can_boost; 320 297 int extendables; ··· 335 310 return 0; 336 311 } 337 312 338 - static void rcu_read_delay(struct torture_random_state *rrsp) 313 + static void 314 + rcu_read_delay(struct torture_random_state *rrsp, struct rt_read_seg *rtrsp) 339 315 { 340 316 unsigned long started; 341 317 unsigned long completed; 342 318 const unsigned long shortdelay_us = 200; 343 - const unsigned long longdelay_ms = 50; 319 + unsigned long longdelay_ms = 300; 344 320 unsigned long long ts; 345 321 346 322 /* We want a short delay sometimes to make a reader delay the grace ··· 351 325 if (!(torture_random(rrsp) % (nrealreaders * 2000 * longdelay_ms))) { 352 326 started = cur_ops->get_gp_seq(); 353 327 ts = rcu_trace_clock_local(); 328 + if (preempt_count() & (SOFTIRQ_MASK | HARDIRQ_MASK)) 329 + longdelay_ms = 5; /* Avoid triggering BH limits. 
*/ 354 330 mdelay(longdelay_ms); 331 + rtrsp->rt_delay_ms = longdelay_ms; 355 332 completed = cur_ops->get_gp_seq(); 356 333 do_trace_rcu_torture_read(cur_ops->name, NULL, ts, 357 334 started, completed); 358 335 } 359 - if (!(torture_random(rrsp) % (nrealreaders * 2 * shortdelay_us))) 336 + if (!(torture_random(rrsp) % (nrealreaders * 2 * shortdelay_us))) { 360 337 udelay(shortdelay_us); 338 + rtrsp->rt_delay_us = shortdelay_us; 339 + } 361 340 if (!preempt_count() && 362 - !(torture_random(rrsp) % (nrealreaders * 500))) 341 + !(torture_random(rrsp) % (nrealreaders * 500))) { 363 342 torture_preempt_schedule(); /* QS only if preemptible. */ 343 + rtrsp->rt_preempted = true; 344 + } 364 345 } 365 346 366 347 static void rcu_torture_read_unlock(int idx) __releases(RCU) ··· 462 429 .cb_barrier = rcu_barrier, 463 430 .fqs = rcu_force_quiescent_state, 464 431 .stats = NULL, 432 + .stall_dur = rcu_jiffies_till_stall_check, 465 433 .irq_capable = 1, 466 434 .can_boost = rcu_can_boost(), 435 + .extendables = RCUTORTURE_MAX_EXTEND, 467 436 .name = "rcu" 468 - }; 469 - 470 - /* 471 - * Definitions for rcu_bh torture testing. 472 - */ 473 - 474 - static int rcu_bh_torture_read_lock(void) __acquires(RCU_BH) 475 - { 476 - rcu_read_lock_bh(); 477 - return 0; 478 - } 479 - 480 - static void rcu_bh_torture_read_unlock(int idx) __releases(RCU_BH) 481 - { 482 - rcu_read_unlock_bh(); 483 - } 484 - 485 - static void rcu_bh_torture_deferred_free(struct rcu_torture *p) 486 - { 487 - call_rcu_bh(&p->rtort_rcu, rcu_torture_cb); 488 - } 489 - 490 - static struct rcu_torture_ops rcu_bh_ops = { 491 - .ttype = RCU_BH_FLAVOR, 492 - .init = rcu_sync_torture_init, 493 - .readlock = rcu_bh_torture_read_lock, 494 - .read_delay = rcu_read_delay, /* just reuse rcu's version. 
*/ 495 - .readunlock = rcu_bh_torture_read_unlock, 496 - .get_gp_seq = rcu_bh_get_gp_seq, 497 - .gp_diff = rcu_seq_diff, 498 - .deferred_free = rcu_bh_torture_deferred_free, 499 - .sync = synchronize_rcu_bh, 500 - .exp_sync = synchronize_rcu_bh_expedited, 501 - .call = call_rcu_bh, 502 - .cb_barrier = rcu_barrier_bh, 503 - .fqs = rcu_bh_force_quiescent_state, 504 - .stats = NULL, 505 - .irq_capable = 1, 506 - .extendables = (RCUTORTURE_RDR_BH | RCUTORTURE_RDR_IRQ), 507 - .ext_irq_conflict = RCUTORTURE_RDR_RCU, 508 - .name = "rcu_bh" 509 437 }; 510 438 511 439 /* ··· 525 531 return srcu_read_lock(srcu_ctlp); 526 532 } 527 533 528 - static void srcu_read_delay(struct torture_random_state *rrsp) 534 + static void 535 + srcu_read_delay(struct torture_random_state *rrsp, struct rt_read_seg *rtrsp) 529 536 { 530 537 long delay; 531 538 const long uspertick = 1000000 / HZ; ··· 536 541 537 542 delay = torture_random(rrsp) % 538 543 (nrealreaders * 2 * longdelay * uspertick); 539 - if (!delay && in_task()) 544 + if (!delay && in_task()) { 540 545 schedule_timeout_interruptible(longdelay); 541 - else 542 - rcu_read_delay(rrsp); 546 + rtrsp->rt_delay_jiffies = longdelay; 547 + } else { 548 + rcu_read_delay(rrsp, rtrsp); 549 + } 543 550 } 544 551 545 552 static void srcu_torture_read_unlock(int idx) __releases(srcu_ctlp) ··· 657 660 .irq_capable = 1, 658 661 .extendables = RCUTORTURE_MAX_EXTEND, 659 662 .name = "busted_srcud" 660 - }; 661 - 662 - /* 663 - * Definitions for sched torture testing. 
664 - */ 665 - 666 - static int sched_torture_read_lock(void) 667 - { 668 - preempt_disable(); 669 - return 0; 670 - } 671 - 672 - static void sched_torture_read_unlock(int idx) 673 - { 674 - preempt_enable(); 675 - } 676 - 677 - static void rcu_sched_torture_deferred_free(struct rcu_torture *p) 678 - { 679 - call_rcu_sched(&p->rtort_rcu, rcu_torture_cb); 680 - } 681 - 682 - static struct rcu_torture_ops sched_ops = { 683 - .ttype = RCU_SCHED_FLAVOR, 684 - .init = rcu_sync_torture_init, 685 - .readlock = sched_torture_read_lock, 686 - .read_delay = rcu_read_delay, /* just reuse rcu's version. */ 687 - .readunlock = sched_torture_read_unlock, 688 - .get_gp_seq = rcu_sched_get_gp_seq, 689 - .gp_diff = rcu_seq_diff, 690 - .deferred_free = rcu_sched_torture_deferred_free, 691 - .sync = synchronize_sched, 692 - .exp_sync = synchronize_sched_expedited, 693 - .get_state = get_state_synchronize_sched, 694 - .cond_sync = cond_synchronize_sched, 695 - .call = call_rcu_sched, 696 - .cb_barrier = rcu_barrier_sched, 697 - .fqs = rcu_sched_force_quiescent_state, 698 - .stats = NULL, 699 - .irq_capable = 1, 700 - .extendables = RCUTORTURE_MAX_EXTEND, 701 - .name = "sched" 702 663 }; 703 664 704 665 /* ··· 1071 1116 break; 1072 1117 } 1073 1118 } 1074 - rcu_torture_current_version++; 1119 + WRITE_ONCE(rcu_torture_current_version, 1120 + rcu_torture_current_version + 1); 1075 1121 /* Cycle through nesting levels of rcu_expedite_gp() calls. */ 1076 1122 if (can_expedite && 1077 1123 !(torture_random(&rand) & 0xff & (!!expediting - 1))) { ··· 1088 1132 !rcu_gp_is_normal(); 1089 1133 } 1090 1134 rcu_torture_writer_state = RTWS_STUTTER; 1091 - stutter_wait("rcu_torture_writer"); 1135 + if (stutter_wait("rcu_torture_writer")) 1136 + for (i = 0; i < ARRAY_SIZE(rcu_tortures); i++) 1137 + if (list_empty(&rcu_tortures[i].rtort_free)) 1138 + WARN_ON_ONCE(1); 1092 1139 } while (!torture_must_stop()); 1093 1140 /* Reset expediting back to unexpedited. 
*/ 1094 1141 if (expediting > 0) ··· 1158 1199 * change, do a ->read_delay(). 1159 1200 */ 1160 1201 static void rcutorture_one_extend(int *readstate, int newstate, 1161 - struct torture_random_state *trsp) 1202 + struct torture_random_state *trsp, 1203 + struct rt_read_seg *rtrsp) 1162 1204 { 1163 1205 int idxnew = -1; 1164 1206 int idxold = *readstate; ··· 1168 1208 1169 1209 WARN_ON_ONCE(idxold < 0); 1170 1210 WARN_ON_ONCE((idxold >> RCUTORTURE_RDR_SHIFT) > 1); 1211 + rtrsp->rt_readstate = newstate; 1171 1212 1172 1213 /* First, put new protection in place to avoid critical-section gap. */ 1173 1214 if (statesnew & RCUTORTURE_RDR_BH) ··· 1177 1216 local_irq_disable(); 1178 1217 if (statesnew & RCUTORTURE_RDR_PREEMPT) 1179 1218 preempt_disable(); 1219 + if (statesnew & RCUTORTURE_RDR_RBH) 1220 + rcu_read_lock_bh(); 1221 + if (statesnew & RCUTORTURE_RDR_SCHED) 1222 + rcu_read_lock_sched(); 1180 1223 if (statesnew & RCUTORTURE_RDR_RCU) 1181 1224 idxnew = cur_ops->readlock() << RCUTORTURE_RDR_SHIFT; 1182 1225 ··· 1191 1226 local_bh_enable(); 1192 1227 if (statesold & RCUTORTURE_RDR_PREEMPT) 1193 1228 preempt_enable(); 1229 + if (statesold & RCUTORTURE_RDR_RBH) 1230 + rcu_read_unlock_bh(); 1231 + if (statesold & RCUTORTURE_RDR_SCHED) 1232 + rcu_read_unlock_sched(); 1194 1233 if (statesold & RCUTORTURE_RDR_RCU) 1195 1234 cur_ops->readunlock(idxold >> RCUTORTURE_RDR_SHIFT); 1196 1235 1197 1236 /* Delay if neither beginning nor end and there was a change. */ 1198 1237 if ((statesnew || statesold) && *readstate && newstate) 1199 - cur_ops->read_delay(trsp); 1238 + cur_ops->read_delay(trsp, rtrsp); 1200 1239 1201 1240 /* Update the reader state. 
*/ 1202 1241 if (idxnew == -1) ··· 1229 1260 { 1230 1261 int mask = rcutorture_extend_mask_max(); 1231 1262 unsigned long randmask1 = torture_random(trsp) >> 8; 1232 - unsigned long randmask2 = randmask1 >> 1; 1263 + unsigned long randmask2 = randmask1 >> 3; 1233 1264 1234 1265 WARN_ON_ONCE(mask >> RCUTORTURE_RDR_SHIFT); 1235 - /* Half the time lots of bits, half the time only one bit. */ 1236 - if (randmask1 & 0x1) 1266 + /* Most of the time lots of bits, half the time only one bit. */ 1267 + if (!(randmask1 & 0x7)) 1237 1268 mask = mask & randmask2; 1238 1269 else 1239 1270 mask = mask & (1 << (randmask2 % RCUTORTURE_RDR_NBITS)); 1271 + /* Can't enable bh w/irq disabled. */ 1240 1272 if ((mask & RCUTORTURE_RDR_IRQ) && 1241 - !(mask & RCUTORTURE_RDR_BH) && 1242 - (oldmask & RCUTORTURE_RDR_BH)) 1243 - mask |= RCUTORTURE_RDR_BH; /* Can't enable bh w/irq disabled. */ 1273 + ((!(mask & RCUTORTURE_RDR_BH) && (oldmask & RCUTORTURE_RDR_BH)) || 1274 + (!(mask & RCUTORTURE_RDR_RBH) && (oldmask & RCUTORTURE_RDR_RBH)))) 1275 + mask |= RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH; 1244 1276 if ((mask & RCUTORTURE_RDR_IRQ) && 1245 1277 !(mask & cur_ops->ext_irq_conflict) && 1246 1278 (oldmask & cur_ops->ext_irq_conflict)) ··· 1253 1283 * Do a randomly selected number of extensions of an existing RCU read-side 1254 1284 * critical section. 1255 1285 */ 1256 - static void rcutorture_loop_extend(int *readstate, 1257 - struct torture_random_state *trsp) 1286 + static struct rt_read_seg * 1287 + rcutorture_loop_extend(int *readstate, struct torture_random_state *trsp, 1288 + struct rt_read_seg *rtrsp) 1258 1289 { 1259 1290 int i; 1291 + int j; 1260 1292 int mask = rcutorture_extend_mask_max(); 1261 1293 1262 1294 WARN_ON_ONCE(!*readstate); /* -Existing- RCU read-side critsect! */ 1263 1295 if (!((mask - 1) & mask)) 1264 - return; /* Current RCU flavor not extendable. 
*/ 1265 - i = (torture_random(trsp) >> 3) & RCUTORTURE_RDR_MAX_LOOPS; 1266 - while (i--) { 1296 + return rtrsp; /* Current RCU reader not extendable. */ 1297 + /* Bias towards larger numbers of loops. */ 1298 + i = (torture_random(trsp) >> 3); 1299 + i = ((i | (i >> 3)) & RCUTORTURE_RDR_MAX_LOOPS) + 1; 1300 + for (j = 0; j < i; j++) { 1267 1301 mask = rcutorture_extend_mask(*readstate, trsp); 1268 - rcutorture_one_extend(readstate, mask, trsp); 1302 + rcutorture_one_extend(readstate, mask, trsp, &rtrsp[j]); 1269 1303 } 1304 + return &rtrsp[j]; 1270 1305 } 1271 1306 1272 1307 /* ··· 1281 1306 */ 1282 1307 static bool rcu_torture_one_read(struct torture_random_state *trsp) 1283 1308 { 1309 + int i; 1284 1310 unsigned long started; 1285 1311 unsigned long completed; 1286 1312 int newstate; 1287 1313 struct rcu_torture *p; 1288 1314 int pipe_count; 1289 1315 int readstate = 0; 1316 + struct rt_read_seg rtseg[RCUTORTURE_RDR_MAX_SEGS] = { { 0 } }; 1317 + struct rt_read_seg *rtrsp = &rtseg[0]; 1318 + struct rt_read_seg *rtrsp1; 1290 1319 unsigned long long ts; 1291 1320 1292 1321 newstate = rcutorture_extend_mask(readstate, trsp); 1293 - rcutorture_one_extend(&readstate, newstate, trsp); 1322 + rcutorture_one_extend(&readstate, newstate, trsp, rtrsp++); 1294 1323 started = cur_ops->get_gp_seq(); 1295 1324 ts = rcu_trace_clock_local(); 1296 1325 p = rcu_dereference_check(rcu_torture_current, ··· 1304 1325 torturing_tasks()); 1305 1326 if (p == NULL) { 1306 1327 /* Wait for rcu_torture_writer to get underway */ 1307 - rcutorture_one_extend(&readstate, 0, trsp); 1328 + rcutorture_one_extend(&readstate, 0, trsp, rtrsp); 1308 1329 return false; 1309 1330 } 1310 1331 if (p->rtort_mbtest == 0) 1311 1332 atomic_inc(&n_rcu_torture_mberror); 1312 - rcutorture_loop_extend(&readstate, trsp); 1333 + rtrsp = rcutorture_loop_extend(&readstate, trsp, rtrsp); 1313 1334 preempt_disable(); 1314 1335 pipe_count = p->rtort_pipe_count; 1315 1336 if (pipe_count > RCU_TORTURE_PIPE_LEN) { ··· 
1330 1351 } 1331 1352 __this_cpu_inc(rcu_torture_batch[completed]); 1332 1353 preempt_enable(); 1333 - rcutorture_one_extend(&readstate, 0, trsp); 1354 + rcutorture_one_extend(&readstate, 0, trsp, rtrsp); 1334 1355 WARN_ON_ONCE(readstate & RCUTORTURE_RDR_MASK); 1356 + 1357 + /* If error or close call, record the sequence of reader protections. */ 1358 + if ((pipe_count > 1 || completed > 1) && !xchg(&err_segs_recorded, 1)) { 1359 + i = 0; 1360 + for (rtrsp1 = &rtseg[0]; rtrsp1 < rtrsp; rtrsp1++) 1361 + err_segs[i++] = *rtrsp1; 1362 + rt_read_nsegs = i; 1363 + } 1364 + 1335 1365 return true; 1336 1366 } 1337 1367 ··· 1375 1387 static int 1376 1388 rcu_torture_reader(void *arg) 1377 1389 { 1390 + unsigned long lastsleep = jiffies; 1391 + long myid = (long)arg; 1392 + int mynumonline = myid; 1378 1393 DEFINE_TORTURE_RANDOM(rand); 1379 1394 struct timer_list t; 1380 1395 ··· 1393 1402 } 1394 1403 if (!rcu_torture_one_read(&rand)) 1395 1404 schedule_timeout_interruptible(HZ); 1405 + if (time_after(jiffies, lastsleep)) { 1406 + schedule_timeout_interruptible(1); 1407 + lastsleep = jiffies + 10; 1408 + } 1409 + while (num_online_cpus() < mynumonline && !torture_must_stop()) 1410 + schedule_timeout_interruptible(HZ / 5); 1396 1411 stutter_wait("rcu_torture_reader"); 1397 1412 } while (!torture_must_stop()); 1398 1413 if (irqreader && cur_ops->irq_capable) { ··· 1652 1655 return torture_create_kthread(rcu_torture_stall, NULL, stall_task); 1653 1656 } 1654 1657 1658 + /* State structure for forward-progress self-propagating RCU callback. */ 1659 + struct fwd_cb_state { 1660 + struct rcu_head rh; 1661 + int stop; 1662 + }; 1663 + 1664 + /* 1665 + * Forward-progress self-propagating RCU callback function. Because 1666 + * callbacks run from softirq, this function is an implicit RCU read-side 1667 + * critical section. 
1668 + */ 1669 + static void rcu_torture_fwd_prog_cb(struct rcu_head *rhp) 1670 + { 1671 + struct fwd_cb_state *fcsp = container_of(rhp, struct fwd_cb_state, rh); 1672 + 1673 + if (READ_ONCE(fcsp->stop)) { 1674 + WRITE_ONCE(fcsp->stop, 2); 1675 + return; 1676 + } 1677 + cur_ops->call(&fcsp->rh, rcu_torture_fwd_prog_cb); 1678 + } 1679 + 1680 + /* Carry out grace-period forward-progress testing. */ 1681 + static int rcu_torture_fwd_prog(void *args) 1682 + { 1683 + unsigned long cver; 1684 + unsigned long dur; 1685 + struct fwd_cb_state fcs; 1686 + unsigned long gps; 1687 + int idx; 1688 + int sd; 1689 + int sd4; 1690 + bool selfpropcb = false; 1691 + unsigned long stopat; 1692 + int tested = 0; 1693 + int tested_tries = 0; 1694 + static DEFINE_TORTURE_RANDOM(trs); 1695 + 1696 + VERBOSE_TOROUT_STRING("rcu_torture_fwd_progress task started"); 1697 + if (!IS_ENABLED(CONFIG_SMP) || !IS_ENABLED(CONFIG_RCU_BOOST)) 1698 + set_user_nice(current, MAX_NICE); 1699 + if (cur_ops->call && cur_ops->sync && cur_ops->cb_barrier) { 1700 + init_rcu_head_on_stack(&fcs.rh); 1701 + selfpropcb = true; 1702 + } 1703 + do { 1704 + schedule_timeout_interruptible(fwd_progress_holdoff * HZ); 1705 + if (selfpropcb) { 1706 + WRITE_ONCE(fcs.stop, 0); 1707 + cur_ops->call(&fcs.rh, rcu_torture_fwd_prog_cb); 1708 + } 1709 + cver = READ_ONCE(rcu_torture_current_version); 1710 + gps = cur_ops->get_gp_seq(); 1711 + sd = cur_ops->stall_dur() + 1; 1712 + sd4 = (sd + fwd_progress_div - 1) / fwd_progress_div; 1713 + dur = sd4 + torture_random(&trs) % (sd - sd4); 1714 + stopat = jiffies + dur; 1715 + while (time_before(jiffies, stopat) && !torture_must_stop()) { 1716 + idx = cur_ops->readlock(); 1717 + udelay(10); 1718 + cur_ops->readunlock(idx); 1719 + if (!fwd_progress_need_resched || need_resched()) 1720 + cond_resched(); 1721 + } 1722 + tested_tries++; 1723 + if (!time_before(jiffies, stopat) && !torture_must_stop()) { 1724 + tested++; 1725 + cver = READ_ONCE(rcu_torture_current_version) - cver; 1726 + 
gps = rcutorture_seq_diff(cur_ops->get_gp_seq(), gps); 1727 + WARN_ON(!cver && gps < 2); 1728 + pr_alert("%s: Duration %ld cver %ld gps %ld\n", __func__, dur, cver, gps); 1729 + } 1730 + if (selfpropcb) { 1731 + WRITE_ONCE(fcs.stop, 1); 1732 + cur_ops->sync(); /* Wait for running CB to complete. */ 1733 + cur_ops->cb_barrier(); /* Wait for queued callbacks. */ 1734 + } 1735 + /* Avoid slow periods, better to test when busy. */ 1736 + stutter_wait("rcu_torture_fwd_prog"); 1737 + } while (!torture_must_stop()); 1738 + if (selfpropcb) { 1739 + WARN_ON(READ_ONCE(fcs.stop) != 2); 1740 + destroy_rcu_head_on_stack(&fcs.rh); 1741 + } 1742 + /* Short runs might not contain a valid forward-progress attempt. */ 1743 + WARN_ON(!tested && tested_tries >= 5); 1744 + pr_alert("%s: tested %d tested_tries %d\n", __func__, tested, tested_tries); 1745 + torture_kthread_stopping("rcu_torture_fwd_prog"); 1746 + return 0; 1747 + } 1748 + 1749 + /* If forward-progress checking is requested and feasible, spawn the thread. */ 1750 + static int __init rcu_torture_fwd_prog_init(void) 1751 + { 1752 + if (!fwd_progress) 1753 + return 0; /* Not requested, so don't do it. */ 1754 + if (!cur_ops->stall_dur || cur_ops->stall_dur() <= 0) { 1755 + VERBOSE_TOROUT_STRING("rcu_torture_fwd_prog_init: Disabled, unsupported by RCU flavor under test"); 1756 + return 0; 1757 + } 1758 + if (stall_cpu > 0) { 1759 + VERBOSE_TOROUT_STRING("rcu_torture_fwd_prog_init: Disabled, conflicts with CPU-stall testing"); 1760 + if (IS_MODULE(CONFIG_RCU_TORTURE_TEST)) 1761 + return -EINVAL; /* In module, can fail back to user. */ 1762 + WARN_ON(1); /* Make sure rcutorture notices conflict. */ 1763 + return 0; 1764 + } 1765 + if (fwd_progress_holdoff <= 0) 1766 + fwd_progress_holdoff = 1; 1767 + if (fwd_progress_div <= 0) 1768 + fwd_progress_div = 4; 1769 + return torture_create_kthread(rcu_torture_fwd_prog, 1770 + NULL, fwd_prog_task); 1771 + } 1772 + 1655 1773 /* Callback function for RCU barrier testing. 
*/ 1656 1774 static void rcu_torture_barrier_cbf(struct rcu_head *rcu) 1657 1775 { ··· 1929 1817 static void 1930 1818 rcu_torture_cleanup(void) 1931 1819 { 1820 + int firsttime; 1932 1821 int flags = 0; 1933 1822 unsigned long gp_seq = 0; 1934 1823 int i; ··· 1941 1828 } 1942 1829 1943 1830 rcu_torture_barrier_cleanup(); 1831 + torture_stop_kthread(rcu_torture_fwd_prog, fwd_prog_task); 1944 1832 torture_stop_kthread(rcu_torture_stall, stall_task); 1945 1833 torture_stop_kthread(rcu_torture_writer, writer_task); 1946 1834 ··· 1974 1860 cpuhp_remove_state(rcutor_hp); 1975 1861 1976 1862 /* 1977 - * Wait for all RCU callbacks to fire, then do flavor-specific 1863 + * Wait for all RCU callbacks to fire, then do torture-type-specific 1978 1864 * cleanup operations. 1979 1865 */ 1980 1866 if (cur_ops->cb_barrier != NULL) ··· 1984 1870 1985 1871 rcu_torture_stats_print(); /* -After- the stats thread is stopped! */ 1986 1872 1873 + if (err_segs_recorded) { 1874 + pr_alert("Failure/close-call rcutorture reader segments:\n"); 1875 + if (rt_read_nsegs == 0) 1876 + pr_alert("\t: No segments recorded!!!\n"); 1877 + firsttime = 1; 1878 + for (i = 0; i < rt_read_nsegs; i++) { 1879 + pr_alert("\t%d: %#x ", i, err_segs[i].rt_readstate); 1880 + if (err_segs[i].rt_delay_jiffies != 0) { 1881 + pr_cont("%s%ldjiffies", firsttime ? "" : "+", 1882 + err_segs[i].rt_delay_jiffies); 1883 + firsttime = 0; 1884 + } 1885 + if (err_segs[i].rt_delay_ms != 0) { 1886 + pr_cont("%s%ldms", firsttime ? "" : "+", 1887 + err_segs[i].rt_delay_ms); 1888 + firsttime = 0; 1889 + } 1890 + if (err_segs[i].rt_delay_us != 0) { 1891 + pr_cont("%s%ldus", firsttime ? "" : "+", 1892 + err_segs[i].rt_delay_us); 1893 + firsttime = 0; 1894 + } 1895 + pr_cont("%s\n", 1896 + err_segs[i].rt_preempted ? 
"preempted" : ""); 1897 + 1898 + } 1899 + } 1987 1900 if (atomic_read(&n_rcu_torture_error) || n_rcu_torture_barrier_error) 1988 1901 rcu_torture_print_module_parms(cur_ops, "End of test: FAILURE"); 1989 1902 else if (torture_onoff_failures()) ··· 2080 1939 static int __init 2081 1940 rcu_torture_init(void) 2082 1941 { 2083 - int i; 1942 + long i; 2084 1943 int cpu; 2085 1944 int firsterr = 0; 2086 1945 static struct rcu_torture_ops *torture_ops[] = { 2087 - &rcu_ops, &rcu_bh_ops, &rcu_busted_ops, &srcu_ops, &srcud_ops, 2088 - &busted_srcud_ops, &sched_ops, &tasks_ops, 1946 + &rcu_ops, &rcu_busted_ops, &srcu_ops, &srcud_ops, 1947 + &busted_srcud_ops, &tasks_ops, 2089 1948 }; 2090 1949 2091 1950 if (!torture_init_begin(torture_type, verbose)) ··· 2104 1963 for (i = 0; i < ARRAY_SIZE(torture_ops); i++) 2105 1964 pr_cont(" %s", torture_ops[i]->name); 2106 1965 pr_cont("\n"); 1966 + WARN_ON(!IS_MODULE(CONFIG_RCU_TORTURE_TEST)); 2107 1967 firsterr = -EINVAL; 2108 1968 goto unwind; 2109 1969 } ··· 2155 2013 per_cpu(rcu_torture_batch, cpu)[i] = 0; 2156 2014 } 2157 2015 } 2016 + err_segs_recorded = 0; 2017 + rt_read_nsegs = 0; 2158 2018 2159 2019 /* Start up the kthreads. */ 2160 2020 ··· 2188 2044 goto unwind; 2189 2045 } 2190 2046 for (i = 0; i < nrealreaders; i++) { 2191 - firsterr = torture_create_kthread(rcu_torture_reader, NULL, 2047 + firsterr = torture_create_kthread(rcu_torture_reader, (void *)i, 2192 2048 reader_tasks[i]); 2193 2049 if (firsterr) 2194 2050 goto unwind; ··· 2242 2098 if (firsterr) 2243 2099 goto unwind; 2244 2100 firsterr = rcu_torture_stall_init(); 2101 + if (firsterr) 2102 + goto unwind; 2103 + firsterr = rcu_torture_fwd_prog_init(); 2245 2104 if (firsterr) 2246 2105 goto unwind; 2247 2106 firsterr = rcu_torture_barrier_init();
+27 -2
kernel/rcu/srcutiny.c
··· 34 34 #include "rcu.h" 35 35 36 36 int rcu_scheduler_active __read_mostly; 37 + static LIST_HEAD(srcu_boot_list); 38 + static bool srcu_init_done; 37 39 38 40 static int init_srcu_struct_fields(struct srcu_struct *sp) 39 41 { ··· 48 46 sp->srcu_gp_waiting = false; 49 47 sp->srcu_idx = 0; 50 48 INIT_WORK(&sp->srcu_work, srcu_drive_gp); 49 + INIT_LIST_HEAD(&sp->srcu_work.entry); 51 50 return 0; 52 51 } 53 52 ··· 182 179 *sp->srcu_cb_tail = rhp; 183 180 sp->srcu_cb_tail = &rhp->next; 184 181 local_irq_restore(flags); 185 - if (!READ_ONCE(sp->srcu_gp_running)) 186 - schedule_work(&sp->srcu_work); 182 + if (!READ_ONCE(sp->srcu_gp_running)) { 183 + if (likely(srcu_init_done)) 184 + schedule_work(&sp->srcu_work); 185 + else if (list_empty(&sp->srcu_work.entry)) 186 + list_add(&sp->srcu_work.entry, &srcu_boot_list); 187 + } 187 188 } 188 189 EXPORT_SYMBOL_GPL(call_srcu); 189 190 ··· 210 203 void __init rcu_scheduler_starting(void) 211 204 { 212 205 rcu_scheduler_active = RCU_SCHEDULER_RUNNING; 206 + } 207 + 208 + /* 209 + * Queue work for srcu_struct structures with early boot callbacks. 210 + * The work won't actually execute until the workqueue initialization 211 + * phase that takes place after the scheduler starts. 212 + */ 213 + void __init srcu_init(void) 214 + { 215 + struct srcu_struct *sp; 216 + 217 + srcu_init_done = true; 218 + while (!list_empty(&srcu_boot_list)) { 219 + sp = list_first_entry(&srcu_boot_list, 220 + struct srcu_struct, srcu_work.entry); 221 + list_del_init(&sp->srcu_work.entry); 222 + schedule_work(&sp->srcu_work); 223 + } 213 224 }
+26 -5
kernel/rcu/srcutree.c
··· 51 51 static ulong counter_wrap_check = (ULONG_MAX >> 2); 52 52 module_param(counter_wrap_check, ulong, 0444); 53 53 54 + /* Early-boot callback-management, so early that no lock is required! */ 55 + static LIST_HEAD(srcu_boot_list); 56 + static bool __read_mostly srcu_init_done; 57 + 54 58 static void srcu_invoke_callbacks(struct work_struct *work); 55 59 static void srcu_reschedule(struct srcu_struct *sp, unsigned long delay); 56 60 static void process_srcu(struct work_struct *work); ··· 109 105 rcu_init_levelspread(levelspread, num_rcu_lvl); 110 106 111 107 /* Each pass through this loop initializes one srcu_node structure. */ 112 - rcu_for_each_node_breadth_first(sp, snp) { 108 + srcu_for_each_node_breadth_first(sp, snp) { 113 109 spin_lock_init(&ACCESS_PRIVATE(snp, lock)); 114 110 WARN_ON_ONCE(ARRAY_SIZE(snp->srcu_have_cbs) != 115 111 ARRAY_SIZE(snp->srcu_data_have_cbs)); ··· 239 235 { 240 236 unsigned long flags; 241 237 242 - WARN_ON_ONCE(rcu_scheduler_active == RCU_SCHEDULER_INIT); 243 238 /* The smp_load_acquire() pairs with the smp_store_release(). */ 244 239 if (!rcu_seq_state(smp_load_acquire(&sp->srcu_gp_seq_needed))) /*^^^*/ 245 240 return; /* Already initialized. */ ··· 564 561 565 562 /* Initiate callback invocation as needed. 
*/ 566 563 idx = rcu_seq_ctr(gpseq) % ARRAY_SIZE(snp->srcu_have_cbs); 567 - rcu_for_each_node_breadth_first(sp, snp) { 564 + srcu_for_each_node_breadth_first(sp, snp) { 568 565 spin_lock_irq_rcu_node(snp); 569 566 cbs = false; 570 567 last_lvl = snp >= sp->level[rcu_num_lvls - 1]; ··· 704 701 rcu_seq_state(sp->srcu_gp_seq) == SRCU_STATE_IDLE) { 705 702 WARN_ON_ONCE(ULONG_CMP_GE(sp->srcu_gp_seq, sp->srcu_gp_seq_needed)); 706 703 srcu_gp_start(sp); 707 - queue_delayed_work(rcu_gp_wq, &sp->work, srcu_get_delay(sp)); 704 + if (likely(srcu_init_done)) 705 + queue_delayed_work(rcu_gp_wq, &sp->work, 706 + srcu_get_delay(sp)); 707 + else if (list_empty(&sp->work.work.entry)) 708 + list_add(&sp->work.work.entry, &srcu_boot_list); 708 709 } 709 710 spin_unlock_irqrestore_rcu_node(sp, flags); 710 711 } ··· 987 980 * There are memory-ordering constraints implied by synchronize_srcu(). 988 981 * On systems with more than one CPU, when synchronize_srcu() returns, 989 982 * each CPU is guaranteed to have executed a full memory barrier since 990 - * the end of its last corresponding SRCU-sched read-side critical section 983 + * the end of its last corresponding SRCU read-side critical section 991 984 * whose beginning preceded the call to synchronize_srcu(). In addition, 992 985 * each CPU having an SRCU read-side critical section that extends beyond 993 986 * the return from synchronize_srcu() is guaranteed to have executed a ··· 1315 1308 return 0; 1316 1309 } 1317 1310 early_initcall(srcu_bootup_announce); 1311 + 1312 + void __init srcu_init(void) 1313 + { 1314 + struct srcu_struct *sp; 1315 + 1316 + srcu_init_done = true; 1317 + while (!list_empty(&srcu_boot_list)) { 1318 + sp = list_first_entry(&srcu_boot_list, struct srcu_struct, 1319 + work.work.entry); 1320 + check_init_srcu_struct(sp); 1321 + list_del_init(&sp->work.work.entry); 1322 + queue_work(rcu_gp_wq, &sp->work.work); 1323 + } 1324 + }
+45 -113
kernel/rcu/tiny.c
··· 46 46 }; 47 47 48 48 /* Definition for rcupdate control block. */ 49 - static struct rcu_ctrlblk rcu_sched_ctrlblk = { 50 - .donetail = &rcu_sched_ctrlblk.rcucblist, 51 - .curtail = &rcu_sched_ctrlblk.rcucblist, 49 + static struct rcu_ctrlblk rcu_ctrlblk = { 50 + .donetail = &rcu_ctrlblk.rcucblist, 51 + .curtail = &rcu_ctrlblk.rcucblist, 52 52 }; 53 53 54 - static struct rcu_ctrlblk rcu_bh_ctrlblk = { 55 - .donetail = &rcu_bh_ctrlblk.rcucblist, 56 - .curtail = &rcu_bh_ctrlblk.rcucblist, 57 - }; 58 - 59 - void rcu_barrier_bh(void) 54 + void rcu_barrier(void) 60 55 { 61 - wait_rcu_gp(call_rcu_bh); 56 + wait_rcu_gp(call_rcu); 62 57 } 63 - EXPORT_SYMBOL(rcu_barrier_bh); 58 + EXPORT_SYMBOL(rcu_barrier); 64 59 65 - void rcu_barrier_sched(void) 60 + /* Record an rcu quiescent state. */ 61 + void rcu_qs(void) 66 62 { 67 - wait_rcu_gp(call_rcu_sched); 68 - } 69 - EXPORT_SYMBOL(rcu_barrier_sched); 63 + unsigned long flags; 70 64 71 - /* 72 - * Helper function for rcu_sched_qs() and rcu_bh_qs(). 73 - * Also irqs are disabled to avoid confusion due to interrupt handlers 74 - * invoking call_rcu(). 75 - */ 76 - static int rcu_qsctr_help(struct rcu_ctrlblk *rcp) 77 - { 78 - if (rcp->donetail != rcp->curtail) { 79 - rcp->donetail = rcp->curtail; 80 - return 1; 65 + local_irq_save(flags); 66 + if (rcu_ctrlblk.donetail != rcu_ctrlblk.curtail) { 67 + rcu_ctrlblk.donetail = rcu_ctrlblk.curtail; 68 + raise_softirq(RCU_SOFTIRQ); 81 69 } 82 - 83 - return 0; 84 - } 85 - 86 - /* 87 - * Record an rcu quiescent state. And an rcu_bh quiescent state while we 88 - * are at it, given that any rcu quiescent state is also an rcu_bh 89 - * quiescent state. Use "+" instead of "||" to defeat short circuiting. 
90 - */ 91 - void rcu_sched_qs(void) 92 - { 93 - unsigned long flags; 94 - 95 - local_irq_save(flags); 96 - if (rcu_qsctr_help(&rcu_sched_ctrlblk) + 97 - rcu_qsctr_help(&rcu_bh_ctrlblk)) 98 - raise_softirq(RCU_SOFTIRQ); 99 - local_irq_restore(flags); 100 - } 101 - 102 - /* 103 - * Record an rcu_bh quiescent state. 104 - */ 105 - void rcu_bh_qs(void) 106 - { 107 - unsigned long flags; 108 - 109 - local_irq_save(flags); 110 - if (rcu_qsctr_help(&rcu_bh_ctrlblk)) 111 - raise_softirq(RCU_SOFTIRQ); 112 70 local_irq_restore(flags); 113 71 } 114 72 ··· 78 120 */ 79 121 void rcu_check_callbacks(int user) 80 122 { 81 - if (user) 82 - rcu_sched_qs(); 83 - if (user || !in_softirq()) 84 - rcu_bh_qs(); 123 + if (user) { 124 + rcu_qs(); 125 + } else if (rcu_ctrlblk.donetail != rcu_ctrlblk.curtail) { 126 + set_tsk_need_resched(current); 127 + set_preempt_need_resched(); 128 + } 85 129 } 86 130 87 - /* 88 - * Invoke the RCU callbacks on the specified rcu_ctrlkblk structure 89 - * whose grace period has elapsed. 90 - */ 91 - static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp) 131 + /* Invoke the RCU callbacks whose grace period has elapsed. */ 132 + static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused) 92 133 { 93 134 struct rcu_head *next, *list; 94 135 unsigned long flags; 95 136 96 137 /* Move the ready-to-invoke callbacks to a local list. */ 97 138 local_irq_save(flags); 98 - if (rcp->donetail == &rcp->rcucblist) { 139 + if (rcu_ctrlblk.donetail == &rcu_ctrlblk.rcucblist) { 99 140 /* No callbacks ready, so just leave. 
*/ 100 141 local_irq_restore(flags); 101 142 return; 102 143 } 103 - list = rcp->rcucblist; 104 - rcp->rcucblist = *rcp->donetail; 105 - *rcp->donetail = NULL; 106 - if (rcp->curtail == rcp->donetail) 107 - rcp->curtail = &rcp->rcucblist; 108 - rcp->donetail = &rcp->rcucblist; 144 + list = rcu_ctrlblk.rcucblist; 145 + rcu_ctrlblk.rcucblist = *rcu_ctrlblk.donetail; 146 + *rcu_ctrlblk.donetail = NULL; 147 + if (rcu_ctrlblk.curtail == rcu_ctrlblk.donetail) 148 + rcu_ctrlblk.curtail = &rcu_ctrlblk.rcucblist; 149 + rcu_ctrlblk.donetail = &rcu_ctrlblk.rcucblist; 109 150 local_irq_restore(flags); 110 151 111 152 /* Invoke the callbacks on the local list. */ ··· 119 162 } 120 163 } 121 164 122 - static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused) 123 - { 124 - __rcu_process_callbacks(&rcu_sched_ctrlblk); 125 - __rcu_process_callbacks(&rcu_bh_ctrlblk); 126 - } 127 - 128 165 /* 129 166 * Wait for a grace period to elapse. But it is illegal to invoke 130 - * synchronize_sched() from within an RCU read-side critical section. 131 - * Therefore, any legal call to synchronize_sched() is a quiescent 132 - * state, and so on a UP system, synchronize_sched() need do nothing. 133 - * Ditto for synchronize_rcu_bh(). (But Lai Jiangshan points out the 134 - * benefits of doing might_sleep() to reduce latency.) 167 + * synchronize_rcu() from within an RCU read-side critical section. 168 + * Therefore, any legal call to synchronize_rcu() is a quiescent 169 + * state, and so on a UP system, synchronize_rcu() need do nothing. 170 + * (But Lai Jiangshan points out the benefits of doing might_sleep() 171 + * to reduce latency.) 135 172 * 136 173 * Cool, huh? (Due to Josh Triplett.) 
137 174 */ 138 - void synchronize_sched(void) 175 + void synchronize_rcu(void) 139 176 { 140 177 RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || 141 178 lock_is_held(&rcu_lock_map) || 142 179 lock_is_held(&rcu_sched_lock_map), 143 - "Illegal synchronize_sched() in RCU read-side critical section"); 180 + "Illegal synchronize_rcu() in RCU read-side critical section"); 144 181 } 145 - EXPORT_SYMBOL_GPL(synchronize_sched); 182 + EXPORT_SYMBOL_GPL(synchronize_rcu); 146 183 147 184 /* 148 - * Helper function for call_rcu() and call_rcu_bh(). 185 + * Post an RCU callback to be invoked after the end of an RCU grace 186 + * period. But since we have but one CPU, that would be after any 187 + * quiescent state. 149 188 */ 150 - static void __call_rcu(struct rcu_head *head, 151 - rcu_callback_t func, 152 - struct rcu_ctrlblk *rcp) 189 + void call_rcu(struct rcu_head *head, rcu_callback_t func) 153 190 { 154 191 unsigned long flags; 155 192 ··· 152 201 head->next = NULL; 153 202 154 203 local_irq_save(flags); 155 - *rcp->curtail = head; 156 - rcp->curtail = &head->next; 204 + *rcu_ctrlblk.curtail = head; 205 + rcu_ctrlblk.curtail = &head->next; 157 206 local_irq_restore(flags); 158 207 159 208 if (unlikely(is_idle_task(current))) { 160 - /* force scheduling for rcu_sched_qs() */ 209 + /* force scheduling for rcu_qs() */ 161 210 resched_cpu(0); 162 211 } 163 212 } 164 - 165 - /* 166 - * Post an RCU callback to be invoked after the end of an RCU-sched grace 167 - * period. But since we have but one CPU, that would be after any 168 - * quiescent state. 169 - */ 170 - void call_rcu_sched(struct rcu_head *head, rcu_callback_t func) 171 - { 172 - __call_rcu(head, func, &rcu_sched_ctrlblk); 173 - } 174 - EXPORT_SYMBOL_GPL(call_rcu_sched); 175 - 176 - /* 177 - * Post an RCU bottom-half callback to be invoked after any subsequent 178 - * quiescent state. 
179 - */ 180 - void call_rcu_bh(struct rcu_head *head, rcu_callback_t func) 181 - { 182 - __call_rcu(head, func, &rcu_bh_ctrlblk); 183 - } 184 - EXPORT_SYMBOL_GPL(call_rcu_bh); 213 + EXPORT_SYMBOL_GPL(call_rcu); 185 214 186 215 void __init rcu_init(void) 187 216 { 188 217 open_softirq(RCU_SOFTIRQ, rcu_process_callbacks); 189 218 rcu_early_boot_tests(); 219 + srcu_init(); 190 220 }
+908 -1333
kernel/rcu/tree.c
··· 61 61 #include <linux/trace_events.h> 62 62 #include <linux/suspend.h> 63 63 #include <linux/ftrace.h> 64 + #include <linux/tick.h> 64 65 65 66 #include "tree.h" 66 67 #include "rcu.h" ··· 74 73 /* Data structures. */ 75 74 76 75 /* 77 - * In order to export the rcu_state name to the tracing tools, it 78 - * needs to be added in the __tracepoint_string section. 79 - * This requires defining a separate variable tp_<sname>_varname 80 - * that points to the string being used, and this will allow 81 - * the tracing userspace tools to be able to decipher the string 82 - * address to the matching string. 76 + * Steal a bit from the bottom of ->dynticks for idle entry/exit 77 + * control. Initially this is for TLB flushing. 83 78 */ 84 - #ifdef CONFIG_TRACING 85 - # define DEFINE_RCU_TPS(sname) \ 86 - static char sname##_varname[] = #sname; \ 87 - static const char *tp_##sname##_varname __used __tracepoint_string = sname##_varname; 88 - # define RCU_STATE_NAME(sname) sname##_varname 89 - #else 90 - # define DEFINE_RCU_TPS(sname) 91 - # define RCU_STATE_NAME(sname) __stringify(sname) 79 + #define RCU_DYNTICK_CTRL_MASK 0x1 80 + #define RCU_DYNTICK_CTRL_CTR (RCU_DYNTICK_CTRL_MASK + 1) 81 + #ifndef rcu_eqs_special_exit 82 + #define rcu_eqs_special_exit() do { } while (0) 92 83 #endif 93 84 94 - #define RCU_STATE_INITIALIZER(sname, sabbr, cr) \ 95 - DEFINE_RCU_TPS(sname) \ 96 - static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, sname##_data); \ 97 - struct rcu_state sname##_state = { \ 98 - .level = { &sname##_state.node[0] }, \ 99 - .rda = &sname##_data, \ 100 - .call = cr, \ 101 - .gp_state = RCU_GP_IDLE, \ 102 - .gp_seq = (0UL - 300UL) << RCU_SEQ_CTR_SHIFT, \ 103 - .barrier_mutex = __MUTEX_INITIALIZER(sname##_state.barrier_mutex), \ 104 - .name = RCU_STATE_NAME(sname), \ 105 - .abbr = sabbr, \ 106 - .exp_mutex = __MUTEX_INITIALIZER(sname##_state.exp_mutex), \ 107 - .exp_wake_mutex = __MUTEX_INITIALIZER(sname##_state.exp_wake_mutex), \ 108 - .ofl_lock = 
__SPIN_LOCK_UNLOCKED(sname##_state.ofl_lock), \ 109 - } 110 - 111 - RCU_STATE_INITIALIZER(rcu_sched, 's', call_rcu_sched); 112 - RCU_STATE_INITIALIZER(rcu_bh, 'b', call_rcu_bh); 113 - 114 - static struct rcu_state *const rcu_state_p; 115 - LIST_HEAD(rcu_struct_flavors); 85 + static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = { 86 + .dynticks_nesting = 1, 87 + .dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE, 88 + .dynticks = ATOMIC_INIT(RCU_DYNTICK_CTRL_CTR), 89 + }; 90 + struct rcu_state rcu_state = { 91 + .level = { &rcu_state.node[0] }, 92 + .gp_state = RCU_GP_IDLE, 93 + .gp_seq = (0UL - 300UL) << RCU_SEQ_CTR_SHIFT, 94 + .barrier_mutex = __MUTEX_INITIALIZER(rcu_state.barrier_mutex), 95 + .name = RCU_NAME, 96 + .abbr = RCU_ABBR, 97 + .exp_mutex = __MUTEX_INITIALIZER(rcu_state.exp_mutex), 98 + .exp_wake_mutex = __MUTEX_INITIALIZER(rcu_state.exp_wake_mutex), 99 + .ofl_lock = __RAW_SPIN_LOCK_UNLOCKED(rcu_state.ofl_lock), 100 + }; 116 101 117 102 /* Dump rcu_node combining tree at boot to verify correct setup. 
*/ 118 103 static bool dump_tree; ··· 145 158 */ 146 159 static int rcu_scheduler_fully_active __read_mostly; 147 160 148 - static void 149 - rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp, 150 - struct rcu_node *rnp, unsigned long gps, unsigned long flags); 161 + static void rcu_report_qs_rnp(unsigned long mask, struct rcu_node *rnp, 162 + unsigned long gps, unsigned long flags); 151 163 static void rcu_init_new_rnp(struct rcu_node *rnp_leaf); 152 164 static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf); 153 165 static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu); 154 166 static void invoke_rcu_core(void); 155 - static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp); 156 - static void rcu_report_exp_rdp(struct rcu_state *rsp, 157 - struct rcu_data *rdp, bool wake); 167 + static void invoke_rcu_callbacks(struct rcu_data *rdp); 168 + static void rcu_report_exp_rdp(struct rcu_data *rdp); 158 169 static void sync_sched_exp_online_cleanup(int cpu); 159 170 160 171 /* rcuc/rcub kthread realtime priority */ ··· 168 183 static int gp_cleanup_delay; 169 184 module_param(gp_cleanup_delay, int, 0444); 170 185 171 - /* Retreive RCU kthreads priority for rcutorture */ 186 + /* Retrieve RCU kthreads priority for rcutorture */ 172 187 int rcu_get_gp_kthreads_prio(void) 173 188 { 174 189 return kthread_prio; ··· 202 217 * permit this function to be invoked without holding the root rcu_node 203 218 * structure's ->lock, but of course results can be subject to change. 204 219 */ 205 - static int rcu_gp_in_progress(struct rcu_state *rsp) 220 + static int rcu_gp_in_progress(void) 206 221 { 207 - return rcu_seq_state(rcu_seq_current(&rsp->gp_seq)); 222 + return rcu_seq_state(rcu_seq_current(&rcu_state.gp_seq)); 208 223 } 209 224 210 - /* 211 - * Note a quiescent state. 
Because we do not need to know 212 - * how many quiescent states passed, just if there was at least 213 - * one since the start of the grace period, this just sets a flag. 214 - * The caller must have disabled preemption. 215 - */ 216 - void rcu_sched_qs(void) 225 + void rcu_softirq_qs(void) 217 226 { 218 - RCU_LOCKDEP_WARN(preemptible(), "rcu_sched_qs() invoked with preemption enabled!!!"); 219 - if (!__this_cpu_read(rcu_sched_data.cpu_no_qs.s)) 220 - return; 221 - trace_rcu_grace_period(TPS("rcu_sched"), 222 - __this_cpu_read(rcu_sched_data.gp_seq), 223 - TPS("cpuqs")); 224 - __this_cpu_write(rcu_sched_data.cpu_no_qs.b.norm, false); 225 - if (!__this_cpu_read(rcu_sched_data.cpu_no_qs.b.exp)) 226 - return; 227 - __this_cpu_write(rcu_sched_data.cpu_no_qs.b.exp, false); 228 - rcu_report_exp_rdp(&rcu_sched_state, 229 - this_cpu_ptr(&rcu_sched_data), true); 227 + rcu_qs(); 228 + rcu_preempt_deferred_qs(current); 230 229 } 231 - 232 - void rcu_bh_qs(void) 233 - { 234 - RCU_LOCKDEP_WARN(preemptible(), "rcu_bh_qs() invoked with preemption enabled!!!"); 235 - if (__this_cpu_read(rcu_bh_data.cpu_no_qs.s)) { 236 - trace_rcu_grace_period(TPS("rcu_bh"), 237 - __this_cpu_read(rcu_bh_data.gp_seq), 238 - TPS("cpuqs")); 239 - __this_cpu_write(rcu_bh_data.cpu_no_qs.b.norm, false); 240 - } 241 - } 242 - 243 - /* 244 - * Steal a bit from the bottom of ->dynticks for idle entry/exit 245 - * control. Initially this is for TLB flushing. 246 - */ 247 - #define RCU_DYNTICK_CTRL_MASK 0x1 248 - #define RCU_DYNTICK_CTRL_CTR (RCU_DYNTICK_CTRL_MASK + 1) 249 - #ifndef rcu_eqs_special_exit 250 - #define rcu_eqs_special_exit() do { } while (0) 251 - #endif 252 - 253 - static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = { 254 - .dynticks_nesting = 1, 255 - .dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE, 256 - .dynticks = ATOMIC_INIT(RCU_DYNTICK_CTRL_CTR), 257 - }; 258 230 259 231 /* 260 232 * Record entry into an extended quiescent state. 
This is only to be ··· 219 277 */ 220 278 static void rcu_dynticks_eqs_enter(void) 221 279 { 222 - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); 280 + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); 223 281 int seq; 224 282 225 283 /* ··· 227 285 * critical sections, and we also must force ordering with the 228 286 * next idle sojourn. 229 287 */ 230 - seq = atomic_add_return(RCU_DYNTICK_CTRL_CTR, &rdtp->dynticks); 288 + seq = atomic_add_return(RCU_DYNTICK_CTRL_CTR, &rdp->dynticks); 231 289 /* Better be in an extended quiescent state! */ 232 290 WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && 233 291 (seq & RCU_DYNTICK_CTRL_CTR)); ··· 242 300 */ 243 301 static void rcu_dynticks_eqs_exit(void) 244 302 { 245 - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); 303 + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); 246 304 int seq; 247 305 248 306 /* ··· 250 308 * and we also must force ordering with the next RCU read-side 251 309 * critical section. 252 310 */ 253 - seq = atomic_add_return(RCU_DYNTICK_CTRL_CTR, &rdtp->dynticks); 311 + seq = atomic_add_return(RCU_DYNTICK_CTRL_CTR, &rdp->dynticks); 254 312 WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && 255 313 !(seq & RCU_DYNTICK_CTRL_CTR)); 256 314 if (seq & RCU_DYNTICK_CTRL_MASK) { 257 - atomic_andnot(RCU_DYNTICK_CTRL_MASK, &rdtp->dynticks); 315 + atomic_andnot(RCU_DYNTICK_CTRL_MASK, &rdp->dynticks); 258 316 smp_mb__after_atomic(); /* _exit after clearing mask. */ 259 317 /* Prefer duplicate flushes to losing a flush. 
*/ 260 318 rcu_eqs_special_exit(); ··· 273 331 */ 274 332 static void rcu_dynticks_eqs_online(void) 275 333 { 276 - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); 334 + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); 277 335 278 - if (atomic_read(&rdtp->dynticks) & RCU_DYNTICK_CTRL_CTR) 336 + if (atomic_read(&rdp->dynticks) & RCU_DYNTICK_CTRL_CTR) 279 337 return; 280 - atomic_add(RCU_DYNTICK_CTRL_CTR, &rdtp->dynticks); 338 + atomic_add(RCU_DYNTICK_CTRL_CTR, &rdp->dynticks); 281 339 } 282 340 283 341 /* ··· 287 345 */ 288 346 bool rcu_dynticks_curr_cpu_in_eqs(void) 289 347 { 290 - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); 348 + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); 291 349 292 - return !(atomic_read(&rdtp->dynticks) & RCU_DYNTICK_CTRL_CTR); 350 + return !(atomic_read(&rdp->dynticks) & RCU_DYNTICK_CTRL_CTR); 293 351 } 294 352 295 353 /* 296 354 * Snapshot the ->dynticks counter with full ordering so as to allow 297 355 * stable comparison of this counter with past and future snapshots. 298 356 */ 299 - int rcu_dynticks_snap(struct rcu_dynticks *rdtp) 357 + int rcu_dynticks_snap(struct rcu_data *rdp) 300 358 { 301 - int snap = atomic_add_return(0, &rdtp->dynticks); 359 + int snap = atomic_add_return(0, &rdp->dynticks); 302 360 303 361 return snap & ~RCU_DYNTICK_CTRL_MASK; 304 362 } ··· 313 371 } 314 372 315 373 /* 316 - * Return true if the CPU corresponding to the specified rcu_dynticks 374 + * Return true if the CPU corresponding to the specified rcu_data 317 375 * structure has spent some time in an extended quiescent state since 318 376 * rcu_dynticks_snap() returned the specified snapshot. 
319 377 */ 320 - static bool rcu_dynticks_in_eqs_since(struct rcu_dynticks *rdtp, int snap) 378 + static bool rcu_dynticks_in_eqs_since(struct rcu_data *rdp, int snap) 321 379 { 322 - return snap != rcu_dynticks_snap(rdtp); 380 + return snap != rcu_dynticks_snap(rdp); 323 381 } 324 382 325 383 /* ··· 333 391 { 334 392 int old; 335 393 int new; 336 - struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu); 394 + struct rcu_data *rdp = &per_cpu(rcu_data, cpu); 337 395 338 396 do { 339 - old = atomic_read(&rdtp->dynticks); 397 + old = atomic_read(&rdp->dynticks); 340 398 if (old & RCU_DYNTICK_CTRL_CTR) 341 399 return false; 342 400 new = old | RCU_DYNTICK_CTRL_MASK; 343 - } while (atomic_cmpxchg(&rdtp->dynticks, old, new) != old); 401 + } while (atomic_cmpxchg(&rdp->dynticks, old, new) != old); 344 402 return true; 345 403 } 346 404 ··· 355 413 * 356 414 * The caller must have disabled interrupts and must not be idle. 357 415 */ 358 - static void rcu_momentary_dyntick_idle(void) 416 + static void __maybe_unused rcu_momentary_dyntick_idle(void) 359 417 { 360 - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); 361 418 int special; 362 419 363 - raw_cpu_write(rcu_dynticks.rcu_need_heavy_qs, false); 364 - special = atomic_add_return(2 * RCU_DYNTICK_CTRL_CTR, &rdtp->dynticks); 420 + raw_cpu_write(rcu_data.rcu_need_heavy_qs, false); 421 + special = atomic_add_return(2 * RCU_DYNTICK_CTRL_CTR, 422 + &this_cpu_ptr(&rcu_data)->dynticks); 365 423 /* It is illegal to call this from idle state. */ 366 424 WARN_ON_ONCE(!(special & RCU_DYNTICK_CTRL_CTR)); 425 + rcu_preempt_deferred_qs(current); 367 426 } 368 427 369 - /* 370 - * Note a context switch. This is a quiescent state for RCU-sched, 371 - * and requires special handling for preemptible RCU. 372 - * The caller must have disabled interrupts. 373 - */ 374 - void rcu_note_context_switch(bool preempt) 375 - { 376 - barrier(); /* Avoid RCU read-side critical sections leaking down. 
*/ 377 - trace_rcu_utilization(TPS("Start context switch")); 378 - rcu_sched_qs(); 379 - rcu_preempt_note_context_switch(preempt); 380 - /* Load rcu_urgent_qs before other flags. */ 381 - if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs))) 382 - goto out; 383 - this_cpu_write(rcu_dynticks.rcu_urgent_qs, false); 384 - if (unlikely(raw_cpu_read(rcu_dynticks.rcu_need_heavy_qs))) 385 - rcu_momentary_dyntick_idle(); 386 - this_cpu_inc(rcu_dynticks.rcu_qs_ctr); 387 - if (!preempt) 388 - rcu_tasks_qs(current); 389 - out: 390 - trace_rcu_utilization(TPS("End context switch")); 391 - barrier(); /* Avoid RCU read-side critical sections leaking up. */ 392 - } 393 - EXPORT_SYMBOL_GPL(rcu_note_context_switch); 394 - 395 - /* 396 - * Register a quiescent state for all RCU flavors. If there is an 397 - * emergency, invoke rcu_momentary_dyntick_idle() to do a heavy-weight 398 - * dyntick-idle quiescent state visible to other CPUs (but only for those 399 - * RCU flavors in desperate need of a quiescent state, which will normally 400 - * be none of them). Either way, do a lightweight quiescent state for 401 - * all RCU flavors. 428 + /** 429 + * rcu_is_cpu_rrupt_from_idle - see if idle or immediately interrupted from idle 402 430 * 403 - * The barrier() calls are redundant in the common case when this is 404 - * called externally, but just in case this is called from within this 405 - * file. 406 - * 431 + * If the current CPU is idle or running at a first-level (not nested) 432 + * interrupt from idle, return true. The caller must have at least 433 + * disabled preemption. 407 434 */ 408 - void rcu_all_qs(void) 435 + static int rcu_is_cpu_rrupt_from_idle(void) 409 436 { 410 - unsigned long flags; 411 - 412 - if (!raw_cpu_read(rcu_dynticks.rcu_urgent_qs)) 413 - return; 414 - preempt_disable(); 415 - /* Load rcu_urgent_qs before other flags. 
*/ 416 - if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs))) { 417 - preempt_enable(); 418 - return; 419 - } 420 - this_cpu_write(rcu_dynticks.rcu_urgent_qs, false); 421 - barrier(); /* Avoid RCU read-side critical sections leaking down. */ 422 - if (unlikely(raw_cpu_read(rcu_dynticks.rcu_need_heavy_qs))) { 423 - local_irq_save(flags); 424 - rcu_momentary_dyntick_idle(); 425 - local_irq_restore(flags); 426 - } 427 - if (unlikely(raw_cpu_read(rcu_sched_data.cpu_no_qs.b.exp))) 428 - rcu_sched_qs(); 429 - this_cpu_inc(rcu_dynticks.rcu_qs_ctr); 430 - barrier(); /* Avoid RCU read-side critical sections leaking up. */ 431 - preempt_enable(); 437 + return __this_cpu_read(rcu_data.dynticks_nesting) <= 0 && 438 + __this_cpu_read(rcu_data.dynticks_nmi_nesting) <= 1; 432 439 } 433 - EXPORT_SYMBOL_GPL(rcu_all_qs); 434 440 435 441 #define DEFAULT_RCU_BLIMIT 10 /* Maximum callbacks per rcu_do_batch. */ 436 442 static long blimit = DEFAULT_RCU_BLIMIT; ··· 395 505 static ulong jiffies_till_next_fqs = ULONG_MAX; 396 506 static bool rcu_kick_kthreads; 397 507 508 + /* 509 + * How long the grace period must be before we start recruiting 510 + * quiescent-state help from rcu_note_context_switch(). 511 + */ 512 + static ulong jiffies_till_sched_qs = ULONG_MAX; 513 + module_param(jiffies_till_sched_qs, ulong, 0444); 514 + static ulong jiffies_to_sched_qs; /* Adjusted version of above if not default */ 515 + module_param(jiffies_to_sched_qs, ulong, 0444); /* Display only! */ 516 + 517 + /* 518 + * Make sure that we give the grace-period kthread time to detect any 519 + * idle CPUs before taking active measures to force quiescent states. 520 + * However, don't go below 100 milliseconds, adjusted upwards for really 521 + * large systems. 522 + */ 523 + static void adjust_jiffies_till_sched_qs(void) 524 + { 525 + unsigned long j; 526 + 527 + /* If jiffies_till_sched_qs was specified, respect the request. 
*/ 528 + if (jiffies_till_sched_qs != ULONG_MAX) { 529 + WRITE_ONCE(jiffies_to_sched_qs, jiffies_till_sched_qs); 530 + return; 531 + } 532 + j = READ_ONCE(jiffies_till_first_fqs) + 533 + 2 * READ_ONCE(jiffies_till_next_fqs); 534 + if (j < HZ / 10 + nr_cpu_ids / RCU_JIFFIES_FQS_DIV) 535 + j = HZ / 10 + nr_cpu_ids / RCU_JIFFIES_FQS_DIV; 536 + pr_info("RCU calculated value of scheduler-enlistment delay is %ld jiffies.\n", j); 537 + WRITE_ONCE(jiffies_to_sched_qs, j); 538 + } 539 + 398 540 static int param_set_first_fqs_jiffies(const char *val, const struct kernel_param *kp) 399 541 { 400 542 ulong j; 401 543 int ret = kstrtoul(val, 0, &j); 402 544 403 - if (!ret) 545 + if (!ret) { 404 546 WRITE_ONCE(*(ulong *)kp->arg, (j > HZ) ? HZ : j); 547 + adjust_jiffies_till_sched_qs(); 548 + } 405 549 return ret; 406 550 } 407 551 ··· 444 520 ulong j; 445 521 int ret = kstrtoul(val, 0, &j); 446 522 447 - if (!ret) 523 + if (!ret) { 448 524 WRITE_ONCE(*(ulong *)kp->arg, (j > HZ) ? HZ : (j ?: 1)); 525 + adjust_jiffies_till_sched_qs(); 526 + } 449 527 return ret; 450 528 } 451 529 ··· 465 539 module_param_cb(jiffies_till_next_fqs, &next_fqs_jiffies_ops, &jiffies_till_next_fqs, 0644); 466 540 module_param(rcu_kick_kthreads, bool, 0644); 467 541 468 - /* 469 - * How long the grace period must be before we start recruiting 470 - * quiescent-state help from rcu_note_context_switch(). 
471 - */ 472 - static ulong jiffies_till_sched_qs = HZ / 10; 473 - module_param(jiffies_till_sched_qs, ulong, 0444); 474 - 475 - static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *rsp)); 476 - static void force_quiescent_state(struct rcu_state *rsp); 542 + static void force_qs_rnp(int (*f)(struct rcu_data *rdp)); 543 + static void force_quiescent_state(void); 477 544 static int rcu_pending(void); 478 545 479 546 /* ··· 474 555 */ 475 556 unsigned long rcu_get_gp_seq(void) 476 557 { 477 - return READ_ONCE(rcu_state_p->gp_seq); 558 + return READ_ONCE(rcu_state.gp_seq); 478 559 } 479 560 EXPORT_SYMBOL_GPL(rcu_get_gp_seq); 480 - 481 - /* 482 - * Return the number of RCU-sched GPs completed thus far for debug & stats. 483 - */ 484 - unsigned long rcu_sched_get_gp_seq(void) 485 - { 486 - return READ_ONCE(rcu_sched_state.gp_seq); 487 - } 488 - EXPORT_SYMBOL_GPL(rcu_sched_get_gp_seq); 489 - 490 - /* 491 - * Return the number of RCU-bh GPs completed thus far for debug & stats. 492 - */ 493 - unsigned long rcu_bh_get_gp_seq(void) 494 - { 495 - return READ_ONCE(rcu_bh_state.gp_seq); 496 - } 497 - EXPORT_SYMBOL_GPL(rcu_bh_get_gp_seq); 498 561 499 562 /* 500 563 * Return the number of RCU expedited batches completed thus far for ··· 486 585 */ 487 586 unsigned long rcu_exp_batches_completed(void) 488 587 { 489 - return rcu_state_p->expedited_sequence; 588 + return rcu_state.expedited_sequence; 490 589 } 491 590 EXPORT_SYMBOL_GPL(rcu_exp_batches_completed); 492 - 493 - /* 494 - * Return the number of RCU-sched expedited batches completed thus far 495 - * for debug & stats. Similar to rcu_exp_batches_completed(). 496 - */ 497 - unsigned long rcu_exp_batches_completed_sched(void) 498 - { 499 - return rcu_sched_state.expedited_sequence; 500 - } 501 - EXPORT_SYMBOL_GPL(rcu_exp_batches_completed_sched); 502 591 503 592 /* 504 593 * Force a quiescent state. 
505 594 */ 506 595 void rcu_force_quiescent_state(void) 507 596 { 508 - force_quiescent_state(rcu_state_p); 597 + force_quiescent_state(); 509 598 } 510 599 EXPORT_SYMBOL_GPL(rcu_force_quiescent_state); 511 - 512 - /* 513 - * Force a quiescent state for RCU BH. 514 - */ 515 - void rcu_bh_force_quiescent_state(void) 516 - { 517 - force_quiescent_state(&rcu_bh_state); 518 - } 519 - EXPORT_SYMBOL_GPL(rcu_bh_force_quiescent_state); 520 - 521 - /* 522 - * Force a quiescent state for RCU-sched. 523 - */ 524 - void rcu_sched_force_quiescent_state(void) 525 - { 526 - force_quiescent_state(&rcu_sched_state); 527 - } 528 - EXPORT_SYMBOL_GPL(rcu_sched_force_quiescent_state); 529 600 530 601 /* 531 602 * Show the state of the grace-period kthreads. ··· 507 634 int cpu; 508 635 struct rcu_data *rdp; 509 636 struct rcu_node *rnp; 510 - struct rcu_state *rsp; 511 637 512 - for_each_rcu_flavor(rsp) { 513 - pr_info("%s: wait state: %d ->state: %#lx\n", 514 - rsp->name, rsp->gp_state, rsp->gp_kthread->state); 515 - rcu_for_each_node_breadth_first(rsp, rnp) { 516 - if (ULONG_CMP_GE(rsp->gp_seq, rnp->gp_seq_needed)) 638 + pr_info("%s: wait state: %d ->state: %#lx\n", rcu_state.name, 639 + rcu_state.gp_state, rcu_state.gp_kthread->state); 640 + rcu_for_each_node_breadth_first(rnp) { 641 + if (ULONG_CMP_GE(rcu_state.gp_seq, rnp->gp_seq_needed)) 642 + continue; 643 + pr_info("\trcu_node %d:%d ->gp_seq %lu ->gp_seq_needed %lu\n", 644 + rnp->grplo, rnp->grphi, rnp->gp_seq, 645 + rnp->gp_seq_needed); 646 + if (!rcu_is_leaf_node(rnp)) 647 + continue; 648 + for_each_leaf_node_possible_cpu(rnp, cpu) { 649 + rdp = per_cpu_ptr(&rcu_data, cpu); 650 + if (rdp->gpwrap || 651 + ULONG_CMP_GE(rcu_state.gp_seq, 652 + rdp->gp_seq_needed)) 517 653 continue; 518 - pr_info("\trcu_node %d:%d ->gp_seq %lu ->gp_seq_needed %lu\n", 519 - rnp->grplo, rnp->grphi, rnp->gp_seq, 520 - rnp->gp_seq_needed); 521 - if (!rcu_is_leaf_node(rnp)) 522 - continue; 523 - for_each_leaf_node_possible_cpu(rnp, cpu) { 524 - rdp = 
per_cpu_ptr(rsp->rda, cpu); 525 - if (rdp->gpwrap || 526 - ULONG_CMP_GE(rsp->gp_seq, 527 - rdp->gp_seq_needed)) 528 - continue; 529 - pr_info("\tcpu %d ->gp_seq_needed %lu\n", 530 - cpu, rdp->gp_seq_needed); 531 - } 654 + pr_info("\tcpu %d ->gp_seq_needed %lu\n", 655 + cpu, rdp->gp_seq_needed); 532 656 } 533 - /* sched_show_task(rsp->gp_kthread); */ 534 657 } 658 + /* sched_show_task(rcu_state.gp_kthread); */ 535 659 } 536 660 EXPORT_SYMBOL_GPL(show_rcu_gp_kthreads); 537 661 ··· 538 668 void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags, 539 669 unsigned long *gp_seq) 540 670 { 541 - struct rcu_state *rsp = NULL; 542 - 543 671 switch (test_type) { 544 672 case RCU_FLAVOR: 545 - rsp = rcu_state_p; 546 - break; 547 673 case RCU_BH_FLAVOR: 548 - rsp = &rcu_bh_state; 549 - break; 550 674 case RCU_SCHED_FLAVOR: 551 - rsp = &rcu_sched_state; 675 + *flags = READ_ONCE(rcu_state.gp_flags); 676 + *gp_seq = rcu_seq_current(&rcu_state.gp_seq); 552 677 break; 553 678 default: 554 679 break; 555 680 } 556 - if (rsp == NULL) 557 - return; 558 - *flags = READ_ONCE(rsp->gp_flags); 559 - *gp_seq = rcu_seq_current(&rsp->gp_seq); 560 681 } 561 682 EXPORT_SYMBOL_GPL(rcutorture_get_gp_data); 562 683 563 684 /* 564 - * Return the root node of the specified rcu_state structure. 685 + * Return the root node of the rcu_state structure. 
565 686 */ 566 - static struct rcu_node *rcu_get_root(struct rcu_state *rsp) 687 + static struct rcu_node *rcu_get_root(void) 567 688 { 568 - return &rsp->node[0]; 689 + return &rcu_state.node[0]; 569 690 } 570 691 571 692 /* ··· 569 708 */ 570 709 static void rcu_eqs_enter(bool user) 571 710 { 572 - struct rcu_state *rsp; 573 - struct rcu_data *rdp; 574 - struct rcu_dynticks *rdtp; 711 + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); 575 712 576 - rdtp = this_cpu_ptr(&rcu_dynticks); 577 - WRITE_ONCE(rdtp->dynticks_nmi_nesting, 0); 713 + WARN_ON_ONCE(rdp->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE); 714 + WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); 578 715 WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && 579 - rdtp->dynticks_nesting == 0); 580 - if (rdtp->dynticks_nesting != 1) { 581 - rdtp->dynticks_nesting--; 716 + rdp->dynticks_nesting == 0); 717 + if (rdp->dynticks_nesting != 1) { 718 + rdp->dynticks_nesting--; 582 719 return; 583 720 } 584 721 585 722 lockdep_assert_irqs_disabled(); 586 - trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0, rdtp->dynticks); 723 + trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, rdp->dynticks); 587 724 WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current)); 588 - for_each_rcu_flavor(rsp) { 589 - rdp = this_cpu_ptr(rsp->rda); 590 - do_nocb_deferred_wakeup(rdp); 591 - } 725 + rdp = this_cpu_ptr(&rcu_data); 726 + do_nocb_deferred_wakeup(rdp); 592 727 rcu_prepare_for_idle(); 593 - WRITE_ONCE(rdtp->dynticks_nesting, 0); /* Avoid irq-access tearing. */ 728 + rcu_preempt_deferred_qs(current); 729 + WRITE_ONCE(rdp->dynticks_nesting, 0); /* Avoid irq-access tearing. 
*/ 594 730 rcu_dynticks_eqs_enter(); 595 731 rcu_dynticks_task_enter(); 596 732 } ··· 628 770 } 629 771 #endif /* CONFIG_NO_HZ_FULL */ 630 772 631 - /** 632 - * rcu_nmi_exit - inform RCU of exit from NMI context 633 - * 773 + /* 634 774 * If we are returning from the outermost NMI handler that interrupted an 635 - * RCU-idle period, update rdtp->dynticks and rdtp->dynticks_nmi_nesting 775 + * RCU-idle period, update rdp->dynticks and rdp->dynticks_nmi_nesting 636 776 * to let the RCU grace-period handling know that the CPU is back to 637 777 * being RCU-idle. 638 778 * 639 - * If you add or remove a call to rcu_nmi_exit(), be sure to test 779 + * If you add or remove a call to rcu_nmi_exit_common(), be sure to test 640 780 * with CONFIG_RCU_EQS_DEBUG=y. 641 781 */ 642 - void rcu_nmi_exit(void) 782 + static __always_inline void rcu_nmi_exit_common(bool irq) 643 783 { 644 - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); 784 + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); 645 785 646 786 /* 647 787 * Check for ->dynticks_nmi_nesting underflow and bad ->dynticks. 648 788 * (We are exiting an NMI handler, so RCU better be paying attention 649 789 * to us!) 650 790 */ 651 - WARN_ON_ONCE(rdtp->dynticks_nmi_nesting <= 0); 791 + WARN_ON_ONCE(rdp->dynticks_nmi_nesting <= 0); 652 792 WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs()); 653 793 654 794 /* 655 795 * If the nesting level is not 1, the CPU wasn't RCU-idle, so 656 796 * leave it in non-RCU-idle state. 657 797 */ 658 - if (rdtp->dynticks_nmi_nesting != 1) { 659 - trace_rcu_dyntick(TPS("--="), rdtp->dynticks_nmi_nesting, rdtp->dynticks_nmi_nesting - 2, rdtp->dynticks); 660 - WRITE_ONCE(rdtp->dynticks_nmi_nesting, /* No store tearing. */ 661 - rdtp->dynticks_nmi_nesting - 2); 798 + if (rdp->dynticks_nmi_nesting != 1) { 799 + trace_rcu_dyntick(TPS("--="), rdp->dynticks_nmi_nesting, rdp->dynticks_nmi_nesting - 2, rdp->dynticks); 800 + WRITE_ONCE(rdp->dynticks_nmi_nesting, /* No store tearing. 
*/ 801 + rdp->dynticks_nmi_nesting - 2); 662 802 return; 663 803 } 664 804 665 805 /* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */ 666 - trace_rcu_dyntick(TPS("Startirq"), rdtp->dynticks_nmi_nesting, 0, rdtp->dynticks); 667 - WRITE_ONCE(rdtp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */ 806 + trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, rdp->dynticks); 807 + WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */ 808 + 809 + if (irq) 810 + rcu_prepare_for_idle(); 811 + 668 812 rcu_dynticks_eqs_enter(); 813 + 814 + if (irq) 815 + rcu_dynticks_task_enter(); 816 + } 817 + 818 + /** 819 + * rcu_nmi_exit - inform RCU of exit from NMI context 820 + * @irq: Is this call from rcu_irq_exit? 821 + * 822 + * If you add or remove a call to rcu_nmi_exit(), be sure to test 823 + * with CONFIG_RCU_EQS_DEBUG=y. 824 + */ 825 + void rcu_nmi_exit(void) 826 + { 827 + rcu_nmi_exit_common(false); 669 828 } 670 829 671 830 /** ··· 706 831 */ 707 832 void rcu_irq_exit(void) 708 833 { 709 - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); 710 - 711 834 lockdep_assert_irqs_disabled(); 712 - if (rdtp->dynticks_nmi_nesting == 1) 713 - rcu_prepare_for_idle(); 714 - rcu_nmi_exit(); 715 - if (rdtp->dynticks_nmi_nesting == 0) 716 - rcu_dynticks_task_enter(); 835 + rcu_nmi_exit_common(true); 717 836 } 718 837 719 838 /* ··· 735 866 */ 736 867 static void rcu_eqs_exit(bool user) 737 868 { 738 - struct rcu_dynticks *rdtp; 869 + struct rcu_data *rdp; 739 870 long oldval; 740 871 741 872 lockdep_assert_irqs_disabled(); 742 - rdtp = this_cpu_ptr(&rcu_dynticks); 743 - oldval = rdtp->dynticks_nesting; 873 + rdp = this_cpu_ptr(&rcu_data); 874 + oldval = rdp->dynticks_nesting; 744 875 WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0); 745 876 if (oldval) { 746 - rdtp->dynticks_nesting++; 877 + rdp->dynticks_nesting++; 747 878 return; 748 879 } 749 880 rcu_dynticks_task_exit(); 750 881 rcu_dynticks_eqs_exit(); 751 882 
rcu_cleanup_after_idle(); 752 - trace_rcu_dyntick(TPS("End"), rdtp->dynticks_nesting, 1, rdtp->dynticks); 883 + trace_rcu_dyntick(TPS("End"), rdp->dynticks_nesting, 1, rdp->dynticks); 753 884 WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current)); 754 - WRITE_ONCE(rdtp->dynticks_nesting, 1); 755 - WRITE_ONCE(rdtp->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE); 885 + WRITE_ONCE(rdp->dynticks_nesting, 1); 886 + WARN_ON_ONCE(rdp->dynticks_nmi_nesting); 887 + WRITE_ONCE(rdp->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE); 756 888 } 757 889 758 890 /** ··· 791 921 #endif /* CONFIG_NO_HZ_FULL */ 792 922 793 923 /** 794 - * rcu_nmi_enter - inform RCU of entry to NMI context 924 + * rcu_nmi_enter_common - inform RCU of entry to NMI context 925 + * @irq: Is this call from rcu_irq_enter? 795 926 * 796 - * If the CPU was idle from RCU's viewpoint, update rdtp->dynticks and 797 - * rdtp->dynticks_nmi_nesting to let the RCU grace-period handling know 927 + * If the CPU was idle from RCU's viewpoint, update rdp->dynticks and 928 + * rdp->dynticks_nmi_nesting to let the RCU grace-period handling know 798 929 * that the CPU is active. This implementation permits nested NMIs, as 799 930 * long as the nesting level does not overflow an int. (You will probably 800 931 * run out of stack space first.) 801 932 * 802 - * If you add or remove a call to rcu_nmi_enter(), be sure to test 933 + * If you add or remove a call to rcu_nmi_enter_common(), be sure to test 803 934 * with CONFIG_RCU_EQS_DEBUG=y. 804 935 */ 805 - void rcu_nmi_enter(void) 936 + static __always_inline void rcu_nmi_enter_common(bool irq) 806 937 { 807 - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); 938 + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); 808 939 long incby = 2; 809 940 810 941 /* Complain about underflow. 
*/ 811 - WARN_ON_ONCE(rdtp->dynticks_nmi_nesting < 0); 942 + WARN_ON_ONCE(rdp->dynticks_nmi_nesting < 0); 812 943 813 944 /* 814 945 * If idle from RCU viewpoint, atomically increment ->dynticks ··· 820 949 * period (observation due to Andy Lutomirski). 821 950 */ 822 951 if (rcu_dynticks_curr_cpu_in_eqs()) { 952 + 953 + if (irq) 954 + rcu_dynticks_task_exit(); 955 + 823 956 rcu_dynticks_eqs_exit(); 957 + 958 + if (irq) 959 + rcu_cleanup_after_idle(); 960 + 824 961 incby = 1; 825 962 } 826 963 trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="), 827 - rdtp->dynticks_nmi_nesting, 828 - rdtp->dynticks_nmi_nesting + incby, rdtp->dynticks); 829 - WRITE_ONCE(rdtp->dynticks_nmi_nesting, /* Prevent store tearing. */ 830 - rdtp->dynticks_nmi_nesting + incby); 964 + rdp->dynticks_nmi_nesting, 965 + rdp->dynticks_nmi_nesting + incby, rdp->dynticks); 966 + WRITE_ONCE(rdp->dynticks_nmi_nesting, /* Prevent store tearing. */ 967 + rdp->dynticks_nmi_nesting + incby); 831 968 barrier(); 969 + } 970 + 971 + /** 972 + * rcu_nmi_enter - inform RCU of entry to NMI context 973 + */ 974 + void rcu_nmi_enter(void) 975 + { 976 + rcu_nmi_enter_common(false); 832 977 } 833 978 834 979 /** ··· 871 984 */ 872 985 void rcu_irq_enter(void) 873 986 { 874 - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); 875 - 876 987 lockdep_assert_irqs_disabled(); 877 - if (rdtp->dynticks_nmi_nesting == 0) 878 - rcu_dynticks_task_exit(); 879 - rcu_nmi_enter(); 880 - if (rdtp->dynticks_nmi_nesting == 1) 881 - rcu_cleanup_after_idle(); 988 + rcu_nmi_enter_common(true); 882 989 } 883 990 884 991 /* ··· 924 1043 cpu = task_cpu(t); 925 1044 if (!task_curr(t)) 926 1045 return; /* This task is not running on that CPU. 
*/ 927 - smp_store_release(per_cpu_ptr(&rcu_dynticks.rcu_urgent_qs, cpu), true); 1046 + smp_store_release(per_cpu_ptr(&rcu_data.rcu_urgent_qs, cpu), true); 928 1047 } 929 1048 930 1049 #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU) ··· 935 1054 * Disable preemption to avoid false positives that could otherwise 936 1055 * happen due to the current CPU number being sampled, this task being 937 1056 * preempted, its old CPU being taken offline, resuming on some other CPU, 938 - * then determining that its old CPU is now offline. Because there are 939 - * multiple flavors of RCU, and because this function can be called in the 940 - * midst of updating the flavors while a given CPU coming online or going 941 - * offline, it is necessary to check all flavors. If any of the flavors 942 - * believe that given CPU is online, it is considered to be online. 1057 + * then determining that its old CPU is now offline. 943 1058 * 944 1059 * Disable checking if in an NMI handler because we cannot safely 945 1060 * report errors from NMI handlers anyway. 
In addition, it is OK to use ··· 946 1069 { 947 1070 struct rcu_data *rdp; 948 1071 struct rcu_node *rnp; 949 - struct rcu_state *rsp; 1072 + bool ret = false; 950 1073 951 1074 if (in_nmi() || !rcu_scheduler_fully_active) 952 1075 return true; 953 1076 preempt_disable(); 954 - for_each_rcu_flavor(rsp) { 955 - rdp = this_cpu_ptr(rsp->rda); 956 - rnp = rdp->mynode; 957 - if (rdp->grpmask & rcu_rnp_online_cpus(rnp)) { 958 - preempt_enable(); 959 - return true; 960 - } 961 - } 1077 + rdp = this_cpu_ptr(&rcu_data); 1078 + rnp = rdp->mynode; 1079 + if (rdp->grpmask & rcu_rnp_online_cpus(rnp)) 1080 + ret = true; 962 1081 preempt_enable(); 963 - return false; 1082 + return ret; 964 1083 } 965 1084 EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online); 966 1085 967 1086 #endif /* #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU) */ 968 - 969 - /** 970 - * rcu_is_cpu_rrupt_from_idle - see if idle or immediately interrupted from idle 971 - * 972 - * If the current CPU is idle or running at a first-level (not nested) 973 - * interrupt from idle, return true. The caller must have at least 974 - * disabled preemption. 
975 - */ 976 - static int rcu_is_cpu_rrupt_from_idle(void) 977 - { 978 - return __this_cpu_read(rcu_dynticks.dynticks_nesting) <= 0 && 979 - __this_cpu_read(rcu_dynticks.dynticks_nmi_nesting) <= 1; 980 - } 981 1087 982 1088 /* 983 1089 * We are reporting a quiescent state on behalf of some other CPU, so ··· 986 1126 */ 987 1127 static int dyntick_save_progress_counter(struct rcu_data *rdp) 988 1128 { 989 - rdp->dynticks_snap = rcu_dynticks_snap(rdp->dynticks); 1129 + rdp->dynticks_snap = rcu_dynticks_snap(rdp); 990 1130 if (rcu_dynticks_in_eqs(rdp->dynticks_snap)) { 991 - trace_rcu_fqs(rdp->rsp->name, rdp->gp_seq, rdp->cpu, TPS("dti")); 1131 + trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("dti")); 992 1132 rcu_gpnum_ovf(rdp->mynode, rdp); 993 1133 return 1; 994 1134 } ··· 1037 1177 * read-side critical section that started before the beginning 1038 1178 * of the current RCU grace period. 1039 1179 */ 1040 - if (rcu_dynticks_in_eqs_since(rdp->dynticks, rdp->dynticks_snap)) { 1041 - trace_rcu_fqs(rdp->rsp->name, rdp->gp_seq, rdp->cpu, TPS("dti")); 1042 - rdp->dynticks_fqs++; 1180 + if (rcu_dynticks_in_eqs_since(rdp, rdp->dynticks_snap)) { 1181 + trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("dti")); 1043 1182 rcu_gpnum_ovf(rnp, rdp); 1044 1183 return 1; 1045 - } 1046 - 1047 - /* 1048 - * Has this CPU encountered a cond_resched() since the beginning 1049 - * of the grace period? For this to be the case, the CPU has to 1050 - * have noticed the current grace period. This might not be the 1051 - * case for nohz_full CPUs looping in the kernel. 
1052 - */ 1053 - jtsq = jiffies_till_sched_qs; 1054 - ruqp = per_cpu_ptr(&rcu_dynticks.rcu_urgent_qs, rdp->cpu); 1055 - if (time_after(jiffies, rdp->rsp->gp_start + jtsq) && 1056 - READ_ONCE(rdp->rcu_qs_ctr_snap) != per_cpu(rcu_dynticks.rcu_qs_ctr, rdp->cpu) && 1057 - rcu_seq_current(&rdp->gp_seq) == rnp->gp_seq && !rdp->gpwrap) { 1058 - trace_rcu_fqs(rdp->rsp->name, rdp->gp_seq, rdp->cpu, TPS("rqc")); 1059 - rcu_gpnum_ovf(rnp, rdp); 1060 - return 1; 1061 - } else if (time_after(jiffies, rdp->rsp->gp_start + jtsq)) { 1062 - /* Load rcu_qs_ctr before store to rcu_urgent_qs. */ 1063 - smp_store_release(ruqp, true); 1064 1184 } 1065 1185 1066 1186 /* If waiting too long on an offline CPU, complain. */ 1067 1187 if (!(rdp->grpmask & rcu_rnp_online_cpus(rnp)) && 1068 - time_after(jiffies, rdp->rsp->gp_start + HZ)) { 1188 + time_after(jiffies, rcu_state.gp_start + HZ)) { 1069 1189 bool onl; 1070 1190 struct rcu_node *rnp1; 1071 1191 ··· 1066 1226 1067 1227 /* 1068 1228 * A CPU running for an extended time within the kernel can 1069 - * delay RCU grace periods. When the CPU is in NO_HZ_FULL mode, 1070 - * even context-switching back and forth between a pair of 1071 - * in-kernel CPU-bound tasks cannot advance grace periods. 1072 - * So if the grace period is old enough, make the CPU pay attention. 1073 - * Note that the unsynchronized assignments to the per-CPU 1074 - * rcu_need_heavy_qs variable are safe. Yes, setting of 1075 - * bits can be lost, but they will be set again on the next 1076 - * force-quiescent-state pass. So lost bit sets do not result 1077 - * in incorrect behavior, merely in a grace period lasting 1078 - * a few jiffies longer than it might otherwise. Because 1079 - * there are at most four threads involved, and because the 1080 - * updates are only once every few jiffies, the probability of 1081 - * lossage (and thus of slight grace-period extension) is 1082 - * quite low. 
1229 + * delay RCU grace periods: (1) At age jiffies_to_sched_qs, 1230 + * set .rcu_urgent_qs, (2) At age 2*jiffies_to_sched_qs, set 1231 + * both .rcu_need_heavy_qs and .rcu_urgent_qs. Note that the 1232 + * unsynchronized assignments to the per-CPU rcu_need_heavy_qs 1233 + * variable are safe because the assignments are repeated if this 1234 + * CPU failed to pass through a quiescent state. This code 1235 + * also checks .jiffies_resched in case jiffies_to_sched_qs 1236 + * is set way high. 1083 1237 */ 1084 - rnhqp = &per_cpu(rcu_dynticks.rcu_need_heavy_qs, rdp->cpu); 1238 + jtsq = READ_ONCE(jiffies_to_sched_qs); 1239 + ruqp = per_cpu_ptr(&rcu_data.rcu_urgent_qs, rdp->cpu); 1240 + rnhqp = &per_cpu(rcu_data.rcu_need_heavy_qs, rdp->cpu); 1085 1241 if (!READ_ONCE(*rnhqp) && 1086 - (time_after(jiffies, rdp->rsp->gp_start + jtsq) || 1087 - time_after(jiffies, rdp->rsp->jiffies_resched))) { 1242 + (time_after(jiffies, rcu_state.gp_start + jtsq * 2) || 1243 + time_after(jiffies, rcu_state.jiffies_resched))) { 1088 1244 WRITE_ONCE(*rnhqp, true); 1089 1245 /* Store rcu_need_heavy_qs before rcu_urgent_qs. */ 1090 1246 smp_store_release(ruqp, true); 1091 - rdp->rsp->jiffies_resched += jtsq; /* Re-enable beating. */ 1247 + } else if (time_after(jiffies, rcu_state.gp_start + jtsq)) { 1248 + WRITE_ONCE(*ruqp, true); 1092 1249 } 1093 1250 1094 1251 /* 1095 - * If more than halfway to RCU CPU stall-warning time, do a 1096 - * resched_cpu() to try to loosen things up a bit. Also check to 1097 - * see if the CPU is getting hammered with interrupts, but only 1098 - * once per grace period, just to keep the IPIs down to a dull roar. 1252 + * NO_HZ_FULL CPUs can run in-kernel without rcu_check_callbacks! 1253 + * The above code handles this, but only for straight cond_resched(). 
1254 + * And some in-kernel loops check need_resched() before calling 1255 + * cond_resched(), which defeats the above code for CPUs that are 1256 + * running in-kernel with scheduling-clock interrupts disabled. 1257 + * So hit them over the head with the resched_cpu() hammer! 1099 1258 */ 1100 - if (jiffies - rdp->rsp->gp_start > rcu_jiffies_till_stall_check() / 2) { 1259 + if (tick_nohz_full_cpu(rdp->cpu) && 1260 + time_after(jiffies, 1261 + READ_ONCE(rdp->last_fqs_resched) + jtsq * 3)) { 1101 1262 resched_cpu(rdp->cpu); 1263 + WRITE_ONCE(rdp->last_fqs_resched, jiffies); 1264 + } 1265 + 1266 + /* 1267 + * If more than halfway to RCU CPU stall-warning time, invoke 1268 + * resched_cpu() more frequently to try to loosen things up a bit. 1269 + * Also check to see if the CPU is getting hammered with interrupts, 1270 + * but only once per grace period, just to keep the IPIs down to 1271 + * a dull roar. 1272 + */ 1273 + if (time_after(jiffies, rcu_state.jiffies_resched)) { 1274 + if (time_after(jiffies, 1275 + READ_ONCE(rdp->last_fqs_resched) + jtsq)) { 1276 + resched_cpu(rdp->cpu); 1277 + WRITE_ONCE(rdp->last_fqs_resched, jiffies); 1278 + } 1102 1279 if (IS_ENABLED(CONFIG_IRQ_WORK) && 1103 1280 !rdp->rcu_iw_pending && rdp->rcu_iw_gp_seq != rnp->gp_seq && 1104 1281 (rnp->ffmask & rdp->grpmask)) { ··· 1129 1272 return 0; 1130 1273 } 1131 1274 1132 - static void record_gp_stall_check_time(struct rcu_state *rsp) 1275 + static void record_gp_stall_check_time(void) 1133 1276 { 1134 1277 unsigned long j = jiffies; 1135 1278 unsigned long j1; 1136 1279 1137 - rsp->gp_start = j; 1280 + rcu_state.gp_start = j; 1138 1281 j1 = rcu_jiffies_till_stall_check(); 1139 1282 /* Record ->gp_start before ->jiffies_stall. 
*/ 1140 - smp_store_release(&rsp->jiffies_stall, j + j1); /* ^^^ */ 1141 - rsp->jiffies_resched = j + j1 / 2; 1142 - rsp->n_force_qs_gpstart = READ_ONCE(rsp->n_force_qs); 1283 + smp_store_release(&rcu_state.jiffies_stall, j + j1); /* ^^^ */ 1284 + rcu_state.jiffies_resched = j + j1 / 2; 1285 + rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs); 1143 1286 } 1144 1287 1145 1288 /* ··· 1155 1298 /* 1156 1299 * Complain about starvation of grace-period kthread. 1157 1300 */ 1158 - static void rcu_check_gp_kthread_starvation(struct rcu_state *rsp) 1301 + static void rcu_check_gp_kthread_starvation(void) 1159 1302 { 1160 - unsigned long gpa; 1303 + struct task_struct *gpk = rcu_state.gp_kthread; 1161 1304 unsigned long j; 1162 1305 1163 - j = jiffies; 1164 - gpa = READ_ONCE(rsp->gp_activity); 1165 - if (j - gpa > 2 * HZ) { 1306 + j = jiffies - READ_ONCE(rcu_state.gp_activity); 1307 + if (j > 2 * HZ) { 1166 1308 pr_err("%s kthread starved for %ld jiffies! g%ld f%#x %s(%d) ->state=%#lx ->cpu=%d\n", 1167 - rsp->name, j - gpa, 1168 - (long)rcu_seq_current(&rsp->gp_seq), 1169 - rsp->gp_flags, 1170 - gp_state_getname(rsp->gp_state), rsp->gp_state, 1171 - rsp->gp_kthread ? rsp->gp_kthread->state : ~0, 1172 - rsp->gp_kthread ? task_cpu(rsp->gp_kthread) : -1); 1173 - if (rsp->gp_kthread) { 1309 + rcu_state.name, j, 1310 + (long)rcu_seq_current(&rcu_state.gp_seq), 1311 + rcu_state.gp_flags, 1312 + gp_state_getname(rcu_state.gp_state), rcu_state.gp_state, 1313 + gpk ? gpk->state : ~0, gpk ? task_cpu(gpk) : -1); 1314 + if (gpk) { 1174 1315 pr_err("RCU grace-period kthread stack dump:\n"); 1175 - sched_show_task(rsp->gp_kthread); 1176 - wake_up_process(rsp->gp_kthread); 1316 + sched_show_task(gpk); 1317 + wake_up_process(gpk); 1177 1318 } 1178 1319 } 1179 1320 } ··· 1182 1327 * that don't support NMI-based stack dumps. The NMI-triggered stack 1183 1328 * traces are more accurate because they are printed by the target CPU. 
1184 1329 */ 1185 - static void rcu_dump_cpu_stacks(struct rcu_state *rsp) 1330 + static void rcu_dump_cpu_stacks(void) 1186 1331 { 1187 1332 int cpu; 1188 1333 unsigned long flags; 1189 1334 struct rcu_node *rnp; 1190 1335 1191 - rcu_for_each_leaf_node(rsp, rnp) { 1336 + rcu_for_each_leaf_node(rnp) { 1192 1337 raw_spin_lock_irqsave_rcu_node(rnp, flags); 1193 1338 for_each_leaf_node_possible_cpu(rnp, cpu) 1194 1339 if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) ··· 1202 1347 * If too much time has passed in the current grace period, and if 1203 1348 * so configured, go kick the relevant kthreads. 1204 1349 */ 1205 - static void rcu_stall_kick_kthreads(struct rcu_state *rsp) 1350 + static void rcu_stall_kick_kthreads(void) 1206 1351 { 1207 1352 unsigned long j; 1208 1353 1209 1354 if (!rcu_kick_kthreads) 1210 1355 return; 1211 - j = READ_ONCE(rsp->jiffies_kick_kthreads); 1212 - if (time_after(jiffies, j) && rsp->gp_kthread && 1213 - (rcu_gp_in_progress(rsp) || READ_ONCE(rsp->gp_flags))) { 1214 - WARN_ONCE(1, "Kicking %s grace-period kthread\n", rsp->name); 1356 + j = READ_ONCE(rcu_state.jiffies_kick_kthreads); 1357 + if (time_after(jiffies, j) && rcu_state.gp_kthread && 1358 + (rcu_gp_in_progress() || READ_ONCE(rcu_state.gp_flags))) { 1359 + WARN_ONCE(1, "Kicking %s grace-period kthread\n", 1360 + rcu_state.name); 1215 1361 rcu_ftrace_dump(DUMP_ALL); 1216 - wake_up_process(rsp->gp_kthread); 1217 - WRITE_ONCE(rsp->jiffies_kick_kthreads, j + HZ); 1362 + wake_up_process(rcu_state.gp_kthread); 1363 + WRITE_ONCE(rcu_state.jiffies_kick_kthreads, j + HZ); 1218 1364 } 1219 1365 } 1220 1366 ··· 1225 1369 panic("RCU Stall\n"); 1226 1370 } 1227 1371 1228 - static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gp_seq) 1372 + static void print_other_cpu_stall(unsigned long gp_seq) 1229 1373 { 1230 1374 int cpu; 1231 1375 unsigned long flags; 1232 1376 unsigned long gpa; 1233 1377 unsigned long j; 1234 1378 int ndetected = 0; 1235 - struct rcu_node *rnp = 
rcu_get_root(rsp); 1379 + struct rcu_node *rnp = rcu_get_root(); 1236 1380 long totqlen = 0; 1237 1381 1238 1382 /* Kick and suppress, if so configured. */ 1239 - rcu_stall_kick_kthreads(rsp); 1383 + rcu_stall_kick_kthreads(); 1240 1384 if (rcu_cpu_stall_suppress) 1241 1385 return; 1242 1386 ··· 1245 1389 * See Documentation/RCU/stallwarn.txt for info on how to debug 1246 1390 * RCU CPU stall warnings. 1247 1391 */ 1248 - pr_err("INFO: %s detected stalls on CPUs/tasks:", rsp->name); 1392 + pr_err("INFO: %s detected stalls on CPUs/tasks:", rcu_state.name); 1249 1393 print_cpu_stall_info_begin(); 1250 - rcu_for_each_leaf_node(rsp, rnp) { 1394 + rcu_for_each_leaf_node(rnp) { 1251 1395 raw_spin_lock_irqsave_rcu_node(rnp, flags); 1252 1396 ndetected += rcu_print_task_stall(rnp); 1253 1397 if (rnp->qsmask != 0) { 1254 1398 for_each_leaf_node_possible_cpu(rnp, cpu) 1255 1399 if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) { 1256 - print_cpu_stall_info(rsp, cpu); 1400 + print_cpu_stall_info(cpu); 1257 1401 ndetected++; 1258 1402 } 1259 1403 } ··· 1262 1406 1263 1407 print_cpu_stall_info_end(); 1264 1408 for_each_possible_cpu(cpu) 1265 - totqlen += rcu_segcblist_n_cbs(&per_cpu_ptr(rsp->rda, 1409 + totqlen += rcu_segcblist_n_cbs(&per_cpu_ptr(&rcu_data, 1266 1410 cpu)->cblist); 1267 1411 pr_cont("(detected by %d, t=%ld jiffies, g=%ld, q=%lu)\n", 1268 - smp_processor_id(), (long)(jiffies - rsp->gp_start), 1269 - (long)rcu_seq_current(&rsp->gp_seq), totqlen); 1412 + smp_processor_id(), (long)(jiffies - rcu_state.gp_start), 1413 + (long)rcu_seq_current(&rcu_state.gp_seq), totqlen); 1270 1414 if (ndetected) { 1271 - rcu_dump_cpu_stacks(rsp); 1415 + rcu_dump_cpu_stacks(); 1272 1416 1273 1417 /* Complain about tasks blocking the grace period. 
*/ 1274 - rcu_print_detail_task_stall(rsp); 1418 + rcu_print_detail_task_stall(); 1275 1419 } else { 1276 - if (rcu_seq_current(&rsp->gp_seq) != gp_seq) { 1420 + if (rcu_seq_current(&rcu_state.gp_seq) != gp_seq) { 1277 1421 pr_err("INFO: Stall ended before state dump start\n"); 1278 1422 } else { 1279 1423 j = jiffies; 1280 - gpa = READ_ONCE(rsp->gp_activity); 1424 + gpa = READ_ONCE(rcu_state.gp_activity); 1281 1425 pr_err("All QSes seen, last %s kthread activity %ld (%ld-%ld), jiffies_till_next_fqs=%ld, root ->qsmask %#lx\n", 1282 - rsp->name, j - gpa, j, gpa, 1283 - jiffies_till_next_fqs, 1284 - rcu_get_root(rsp)->qsmask); 1426 + rcu_state.name, j - gpa, j, gpa, 1427 + READ_ONCE(jiffies_till_next_fqs), 1428 + rcu_get_root()->qsmask); 1285 1429 /* In this case, the current CPU might be at fault. */ 1286 1430 sched_show_task(current); 1287 1431 } 1288 1432 } 1289 1433 /* Rewrite if needed in case of slow consoles. */ 1290 - if (ULONG_CMP_GE(jiffies, READ_ONCE(rsp->jiffies_stall))) 1291 - WRITE_ONCE(rsp->jiffies_stall, 1434 + if (ULONG_CMP_GE(jiffies, READ_ONCE(rcu_state.jiffies_stall))) 1435 + WRITE_ONCE(rcu_state.jiffies_stall, 1292 1436 jiffies + 3 * rcu_jiffies_till_stall_check() + 3); 1293 1437 1294 - rcu_check_gp_kthread_starvation(rsp); 1438 + rcu_check_gp_kthread_starvation(); 1295 1439 1296 1440 panic_on_rcu_stall(); 1297 1441 1298 - force_quiescent_state(rsp); /* Kick them all. */ 1442 + force_quiescent_state(); /* Kick them all. */ 1299 1443 } 1300 1444 1301 - static void print_cpu_stall(struct rcu_state *rsp) 1445 + static void print_cpu_stall(void) 1302 1446 { 1303 1447 int cpu; 1304 1448 unsigned long flags; 1305 - struct rcu_data *rdp = this_cpu_ptr(rsp->rda); 1306 - struct rcu_node *rnp = rcu_get_root(rsp); 1449 + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); 1450 + struct rcu_node *rnp = rcu_get_root(); 1307 1451 long totqlen = 0; 1308 1452 1309 1453 /* Kick and suppress, if so configured. 
*/ 1310 - rcu_stall_kick_kthreads(rsp); 1454 + rcu_stall_kick_kthreads(); 1311 1455 if (rcu_cpu_stall_suppress) 1312 1456 return; 1313 1457 ··· 1316 1460 * See Documentation/RCU/stallwarn.txt for info on how to debug 1317 1461 * RCU CPU stall warnings. 1318 1462 */ 1319 - pr_err("INFO: %s self-detected stall on CPU", rsp->name); 1463 + pr_err("INFO: %s self-detected stall on CPU", rcu_state.name); 1320 1464 print_cpu_stall_info_begin(); 1321 1465 raw_spin_lock_irqsave_rcu_node(rdp->mynode, flags); 1322 - print_cpu_stall_info(rsp, smp_processor_id()); 1466 + print_cpu_stall_info(smp_processor_id()); 1323 1467 raw_spin_unlock_irqrestore_rcu_node(rdp->mynode, flags); 1324 1468 print_cpu_stall_info_end(); 1325 1469 for_each_possible_cpu(cpu) 1326 - totqlen += rcu_segcblist_n_cbs(&per_cpu_ptr(rsp->rda, 1470 + totqlen += rcu_segcblist_n_cbs(&per_cpu_ptr(&rcu_data, 1327 1471 cpu)->cblist); 1328 1472 pr_cont(" (t=%lu jiffies g=%ld q=%lu)\n", 1329 - jiffies - rsp->gp_start, 1330 - (long)rcu_seq_current(&rsp->gp_seq), totqlen); 1473 + jiffies - rcu_state.gp_start, 1474 + (long)rcu_seq_current(&rcu_state.gp_seq), totqlen); 1331 1475 1332 - rcu_check_gp_kthread_starvation(rsp); 1476 + rcu_check_gp_kthread_starvation(); 1333 1477 1334 - rcu_dump_cpu_stacks(rsp); 1478 + rcu_dump_cpu_stacks(); 1335 1479 1336 1480 raw_spin_lock_irqsave_rcu_node(rnp, flags); 1337 1481 /* Rewrite if needed in case of slow consoles. */ 1338 - if (ULONG_CMP_GE(jiffies, READ_ONCE(rsp->jiffies_stall))) 1339 - WRITE_ONCE(rsp->jiffies_stall, 1482 + if (ULONG_CMP_GE(jiffies, READ_ONCE(rcu_state.jiffies_stall))) 1483 + WRITE_ONCE(rcu_state.jiffies_stall, 1340 1484 jiffies + 3 * rcu_jiffies_till_stall_check() + 3); 1341 1485 raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 1342 1486 ··· 1349 1493 * progress and it could be we're stuck in kernel space without context 1350 1494 * switches for an entirely unreasonable amount of time. 
1351 1495 */ 1352 - resched_cpu(smp_processor_id()); 1496 + set_tsk_need_resched(current); 1497 + set_preempt_need_resched(); 1353 1498 } 1354 1499 1355 - static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) 1500 + static void check_cpu_stall(struct rcu_data *rdp) 1356 1501 { 1357 1502 unsigned long gs1; 1358 1503 unsigned long gs2; ··· 1364 1507 struct rcu_node *rnp; 1365 1508 1366 1509 if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) || 1367 - !rcu_gp_in_progress(rsp)) 1510 + !rcu_gp_in_progress()) 1368 1511 return; 1369 - rcu_stall_kick_kthreads(rsp); 1512 + rcu_stall_kick_kthreads(); 1370 1513 j = jiffies; 1371 1514 1372 1515 /* 1373 1516 * Lots of memory barriers to reject false positives. 1374 1517 * 1375 - * The idea is to pick up rsp->gp_seq, then rsp->jiffies_stall, 1376 - * then rsp->gp_start, and finally another copy of rsp->gp_seq. 1377 - * These values are updated in the opposite order with memory 1378 - * barriers (or equivalent) during grace-period initialization 1379 - * and cleanup. Now, a false positive can occur if we get an new 1380 - * value of rsp->gp_start and a old value of rsp->jiffies_stall. 1381 - * But given the memory barriers, the only way that this can happen 1382 - * is if one grace period ends and another starts between these 1383 - * two fetches. This is detected by comparing the second fetch 1384 - * of rsp->gp_seq with the previous fetch from rsp->gp_seq. 1518 + * The idea is to pick up rcu_state.gp_seq, then 1519 + * rcu_state.jiffies_stall, then rcu_state.gp_start, and finally 1520 + * another copy of rcu_state.gp_seq. These values are updated in 1521 + * the opposite order with memory barriers (or equivalent) during 1522 + * grace-period initialization and cleanup. Now, a false positive 1523 + * can occur if we get an new value of rcu_state.gp_start and a old 1524 + * value of rcu_state.jiffies_stall. 
But given the memory barriers, 1525 + * the only way that this can happen is if one grace period ends 1526 + * and another starts between these two fetches. This is detected 1527 + * by comparing the second fetch of rcu_state.gp_seq with the 1528 + * previous fetch from rcu_state.gp_seq. 1385 1529 * 1386 - * Given this check, comparisons of jiffies, rsp->jiffies_stall, 1387 - * and rsp->gp_start suffice to forestall false positives. 1530 + * Given this check, comparisons of jiffies, rcu_state.jiffies_stall, 1531 + * and rcu_state.gp_start suffice to forestall false positives. 1388 1532 */ 1389 - gs1 = READ_ONCE(rsp->gp_seq); 1533 + gs1 = READ_ONCE(rcu_state.gp_seq); 1390 1534 smp_rmb(); /* Pick up ->gp_seq first... */ 1391 - js = READ_ONCE(rsp->jiffies_stall); 1535 + js = READ_ONCE(rcu_state.jiffies_stall); 1392 1536 smp_rmb(); /* ...then ->jiffies_stall before the rest... */ 1393 - gps = READ_ONCE(rsp->gp_start); 1537 + gps = READ_ONCE(rcu_state.gp_start); 1394 1538 smp_rmb(); /* ...and finally ->gp_start before ->gp_seq again. */ 1395 - gs2 = READ_ONCE(rsp->gp_seq); 1539 + gs2 = READ_ONCE(rcu_state.gp_seq); 1396 1540 if (gs1 != gs2 || 1397 1541 ULONG_CMP_LT(j, js) || 1398 1542 ULONG_CMP_GE(gps, js)) 1399 1543 return; /* No stall or GP completed since entering function. */ 1400 1544 rnp = rdp->mynode; 1401 1545 jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3; 1402 - if (rcu_gp_in_progress(rsp) && 1546 + if (rcu_gp_in_progress() && 1403 1547 (READ_ONCE(rnp->qsmask) & rdp->grpmask) && 1404 - cmpxchg(&rsp->jiffies_stall, js, jn) == js) { 1548 + cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) { 1405 1549 1406 1550 /* We haven't checked in, so go dump stack. 
*/ 1407 - print_cpu_stall(rsp); 1551 + print_cpu_stall(); 1408 1552 1409 - } else if (rcu_gp_in_progress(rsp) && 1553 + } else if (rcu_gp_in_progress() && 1410 1554 ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) && 1411 - cmpxchg(&rsp->jiffies_stall, js, jn) == js) { 1555 + cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) { 1412 1556 1413 1557 /* They had a few time units to dump stack, so complain. */ 1414 - print_other_cpu_stall(rsp, gs2); 1558 + print_other_cpu_stall(gs2); 1415 1559 } 1416 1560 } 1417 1561 ··· 1427 1569 */ 1428 1570 void rcu_cpu_stall_reset(void) 1429 1571 { 1430 - struct rcu_state *rsp; 1431 - 1432 - for_each_rcu_flavor(rsp) 1433 - WRITE_ONCE(rsp->jiffies_stall, jiffies + ULONG_MAX / 2); 1572 + WRITE_ONCE(rcu_state.jiffies_stall, jiffies + ULONG_MAX / 2); 1434 1573 } 1435 1574 1436 1575 /* Trace-event wrapper function for trace_rcu_future_grace_period. */ 1437 1576 static void trace_rcu_this_gp(struct rcu_node *rnp, struct rcu_data *rdp, 1438 1577 unsigned long gp_seq_req, const char *s) 1439 1578 { 1440 - trace_rcu_future_grace_period(rdp->rsp->name, rnp->gp_seq, gp_seq_req, 1579 + trace_rcu_future_grace_period(rcu_state.name, rnp->gp_seq, gp_seq_req, 1441 1580 rnp->level, rnp->grplo, rnp->grphi, s); 1442 1581 } 1443 1582 ··· 1458 1603 unsigned long gp_seq_req) 1459 1604 { 1460 1605 bool ret = false; 1461 - struct rcu_state *rsp = rdp->rsp; 1462 1606 struct rcu_node *rnp; 1463 1607 1464 1608 /* ··· 1501 1647 } 1502 1648 1503 1649 /* If GP already in progress, just leave, otherwise start one. 
*/ 1504 - if (rcu_gp_in_progress(rsp)) { 1650 + if (rcu_gp_in_progress()) { 1505 1651 trace_rcu_this_gp(rnp, rdp, gp_seq_req, TPS("Startedleafroot")); 1506 1652 goto unlock_out; 1507 1653 } 1508 1654 trace_rcu_this_gp(rnp, rdp, gp_seq_req, TPS("Startedroot")); 1509 - WRITE_ONCE(rsp->gp_flags, rsp->gp_flags | RCU_GP_FLAG_INIT); 1510 - rsp->gp_req_activity = jiffies; 1511 - if (!rsp->gp_kthread) { 1655 + WRITE_ONCE(rcu_state.gp_flags, rcu_state.gp_flags | RCU_GP_FLAG_INIT); 1656 + rcu_state.gp_req_activity = jiffies; 1657 + if (!rcu_state.gp_kthread) { 1512 1658 trace_rcu_this_gp(rnp, rdp, gp_seq_req, TPS("NoGPkthread")); 1513 1659 goto unlock_out; 1514 1660 } 1515 - trace_rcu_grace_period(rsp->name, READ_ONCE(rsp->gp_seq), TPS("newreq")); 1661 + trace_rcu_grace_period(rcu_state.name, READ_ONCE(rcu_state.gp_seq), TPS("newreq")); 1516 1662 ret = true; /* Caller must wake GP kthread. */ 1517 1663 unlock_out: 1518 1664 /* Push furthest requested GP to leaf node and rcu_data structure. */ ··· 1529 1675 * Clean up any old requests for the just-ended grace period. Also return 1530 1676 * whether any additional grace periods have been requested. 1531 1677 */ 1532 - static bool rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp) 1678 + static bool rcu_future_gp_cleanup(struct rcu_node *rnp) 1533 1679 { 1534 1680 bool needmore; 1535 - struct rcu_data *rdp = this_cpu_ptr(rsp->rda); 1681 + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); 1536 1682 1537 1683 needmore = ULONG_CMP_LT(rnp->gp_seq, rnp->gp_seq_needed); 1538 1684 if (!needmore) ··· 1543 1689 } 1544 1690 1545 1691 /* 1546 - * Awaken the grace-period kthread for the specified flavor of RCU. 1547 - * Don't do a self-awaken, and don't bother awakening when there is 1548 - * nothing for the grace-period kthread to do (as in several CPUs 1549 - * raced to awaken, and we lost), and finally don't try to awaken 1550 - * a kthread that has not yet been created. 1692 + * Awaken the grace-period kthread. 
Don't do a self-awaken, and don't 1693 + * bother awakening when there is nothing for the grace-period kthread 1694 + * to do (as in several CPUs raced to awaken, and we lost), and finally 1695 + * don't try to awaken a kthread that has not yet been created. 1551 1696 */ 1552 - static void rcu_gp_kthread_wake(struct rcu_state *rsp) 1697 + static void rcu_gp_kthread_wake(void) 1553 1698 { 1554 - if (current == rsp->gp_kthread || 1555 - !READ_ONCE(rsp->gp_flags) || 1556 - !rsp->gp_kthread) 1699 + if (current == rcu_state.gp_kthread || 1700 + !READ_ONCE(rcu_state.gp_flags) || 1701 + !rcu_state.gp_kthread) 1557 1702 return; 1558 - swake_up_one(&rsp->gp_wq); 1703 + swake_up_one(&rcu_state.gp_wq); 1559 1704 } 1560 1705 1561 1706 /* ··· 1569 1716 * 1570 1717 * The caller must hold rnp->lock with interrupts disabled. 1571 1718 */ 1572 - static bool rcu_accelerate_cbs(struct rcu_state *rsp, struct rcu_node *rnp, 1573 - struct rcu_data *rdp) 1719 + static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp) 1574 1720 { 1575 1721 unsigned long gp_seq_req; 1576 1722 bool ret = false; ··· 1590 1738 * accelerating callback invocation to an earlier grace-period 1591 1739 * number. 1592 1740 */ 1593 - gp_seq_req = rcu_seq_snap(&rsp->gp_seq); 1741 + gp_seq_req = rcu_seq_snap(&rcu_state.gp_seq); 1594 1742 if (rcu_segcblist_accelerate(&rdp->cblist, gp_seq_req)) 1595 1743 ret = rcu_start_this_gp(rnp, rdp, gp_seq_req); 1596 1744 1597 1745 /* Trace depending on how much we were able to accelerate. 
*/ 1598 1746 if (rcu_segcblist_restempty(&rdp->cblist, RCU_WAIT_TAIL)) 1599 - trace_rcu_grace_period(rsp->name, rdp->gp_seq, TPS("AccWaitCB")); 1747 + trace_rcu_grace_period(rcu_state.name, rdp->gp_seq, TPS("AccWaitCB")); 1600 1748 else 1601 - trace_rcu_grace_period(rsp->name, rdp->gp_seq, TPS("AccReadyCB")); 1749 + trace_rcu_grace_period(rcu_state.name, rdp->gp_seq, TPS("AccReadyCB")); 1602 1750 return ret; 1603 1751 } 1604 1752 ··· 1609 1757 * that a new grace-period request be made, invokes rcu_accelerate_cbs() 1610 1758 * while holding the leaf rcu_node structure's ->lock. 1611 1759 */ 1612 - static void rcu_accelerate_cbs_unlocked(struct rcu_state *rsp, 1613 - struct rcu_node *rnp, 1760 + static void rcu_accelerate_cbs_unlocked(struct rcu_node *rnp, 1614 1761 struct rcu_data *rdp) 1615 1762 { 1616 1763 unsigned long c; 1617 1764 bool needwake; 1618 1765 1619 1766 lockdep_assert_irqs_disabled(); 1620 - c = rcu_seq_snap(&rsp->gp_seq); 1767 + c = rcu_seq_snap(&rcu_state.gp_seq); 1621 1768 if (!rdp->gpwrap && ULONG_CMP_GE(rdp->gp_seq_needed, c)) { 1622 1769 /* Old request still live, so mark recent callbacks. */ 1623 1770 (void)rcu_segcblist_accelerate(&rdp->cblist, c); 1624 1771 return; 1625 1772 } 1626 1773 raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */ 1627 - needwake = rcu_accelerate_cbs(rsp, rnp, rdp); 1774 + needwake = rcu_accelerate_cbs(rnp, rdp); 1628 1775 raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */ 1629 1776 if (needwake) 1630 - rcu_gp_kthread_wake(rsp); 1777 + rcu_gp_kthread_wake(); 1631 1778 } 1632 1779 1633 1780 /* ··· 1639 1788 * 1640 1789 * The caller must hold rnp->lock with interrupts disabled. 
1641 1790 */ 1642 - static bool rcu_advance_cbs(struct rcu_state *rsp, struct rcu_node *rnp, 1643 - struct rcu_data *rdp) 1791 + static bool rcu_advance_cbs(struct rcu_node *rnp, struct rcu_data *rdp) 1644 1792 { 1645 1793 raw_lockdep_assert_held_rcu_node(rnp); 1646 1794 ··· 1654 1804 rcu_segcblist_advance(&rdp->cblist, rnp->gp_seq); 1655 1805 1656 1806 /* Classify any remaining callbacks. */ 1657 - return rcu_accelerate_cbs(rsp, rnp, rdp); 1807 + return rcu_accelerate_cbs(rnp, rdp); 1658 1808 } 1659 1809 1660 1810 /* ··· 1663 1813 * structure corresponding to the current CPU, and must have irqs disabled. 1664 1814 * Returns true if the grace-period kthread needs to be awakened. 1665 1815 */ 1666 - static bool __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp, 1667 - struct rcu_data *rdp) 1816 + static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp) 1668 1817 { 1669 1818 bool ret; 1670 1819 bool need_gp; ··· 1676 1827 /* Handle the ends of any preceding grace periods first. */ 1677 1828 if (rcu_seq_completed_gp(rdp->gp_seq, rnp->gp_seq) || 1678 1829 unlikely(READ_ONCE(rdp->gpwrap))) { 1679 - ret = rcu_advance_cbs(rsp, rnp, rdp); /* Advance callbacks. */ 1680 - trace_rcu_grace_period(rsp->name, rdp->gp_seq, TPS("cpuend")); 1830 + ret = rcu_advance_cbs(rnp, rdp); /* Advance callbacks. */ 1831 + trace_rcu_grace_period(rcu_state.name, rdp->gp_seq, TPS("cpuend")); 1681 1832 } else { 1682 - ret = rcu_accelerate_cbs(rsp, rnp, rdp); /* Recent callbacks. */ 1833 + ret = rcu_accelerate_cbs(rnp, rdp); /* Recent callbacks. */ 1683 1834 } 1684 1835 1685 1836 /* Now handle the beginnings of any new-to-this-CPU grace periods. */ ··· 1690 1841 * set up to detect a quiescent state, otherwise don't 1691 1842 * go looking for one. 
1692 1843 */ 1693 - trace_rcu_grace_period(rsp->name, rnp->gp_seq, TPS("cpustart")); 1844 + trace_rcu_grace_period(rcu_state.name, rnp->gp_seq, TPS("cpustart")); 1694 1845 need_gp = !!(rnp->qsmask & rdp->grpmask); 1695 1846 rdp->cpu_no_qs.b.norm = need_gp; 1696 - rdp->rcu_qs_ctr_snap = __this_cpu_read(rcu_dynticks.rcu_qs_ctr); 1697 1847 rdp->core_needs_qs = need_gp; 1698 1848 zero_cpu_stall_ticks(rdp); 1699 1849 } ··· 1704 1856 return ret; 1705 1857 } 1706 1858 1707 - static void note_gp_changes(struct rcu_state *rsp, struct rcu_data *rdp) 1859 + static void note_gp_changes(struct rcu_data *rdp) 1708 1860 { 1709 1861 unsigned long flags; 1710 1862 bool needwake; ··· 1718 1870 local_irq_restore(flags); 1719 1871 return; 1720 1872 } 1721 - needwake = __note_gp_changes(rsp, rnp, rdp); 1873 + needwake = __note_gp_changes(rnp, rdp); 1722 1874 raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 1723 1875 if (needwake) 1724 - rcu_gp_kthread_wake(rsp); 1876 + rcu_gp_kthread_wake(); 1725 1877 } 1726 1878 1727 - static void rcu_gp_slow(struct rcu_state *rsp, int delay) 1879 + static void rcu_gp_slow(int delay) 1728 1880 { 1729 1881 if (delay > 0 && 1730 - !(rcu_seq_ctr(rsp->gp_seq) % 1882 + !(rcu_seq_ctr(rcu_state.gp_seq) % 1731 1883 (rcu_num_nodes * PER_RCU_NODE_PERIOD * delay))) 1732 1884 schedule_timeout_uninterruptible(delay); 1733 1885 } ··· 1735 1887 /* 1736 1888 * Initialize a new grace period. Return false if no grace period required. 
1737 1889 */ 1738 - static bool rcu_gp_init(struct rcu_state *rsp) 1890 + static bool rcu_gp_init(void) 1739 1891 { 1740 1892 unsigned long flags; 1741 1893 unsigned long oldmask; 1742 1894 unsigned long mask; 1743 1895 struct rcu_data *rdp; 1744 - struct rcu_node *rnp = rcu_get_root(rsp); 1896 + struct rcu_node *rnp = rcu_get_root(); 1745 1897 1746 - WRITE_ONCE(rsp->gp_activity, jiffies); 1898 + WRITE_ONCE(rcu_state.gp_activity, jiffies); 1747 1899 raw_spin_lock_irq_rcu_node(rnp); 1748 - if (!READ_ONCE(rsp->gp_flags)) { 1900 + if (!READ_ONCE(rcu_state.gp_flags)) { 1749 1901 /* Spurious wakeup, tell caller to go back to sleep. */ 1750 1902 raw_spin_unlock_irq_rcu_node(rnp); 1751 1903 return false; 1752 1904 } 1753 - WRITE_ONCE(rsp->gp_flags, 0); /* Clear all flags: New grace period. */ 1905 + WRITE_ONCE(rcu_state.gp_flags, 0); /* Clear all flags: New GP. */ 1754 1906 1755 - if (WARN_ON_ONCE(rcu_gp_in_progress(rsp))) { 1907 + if (WARN_ON_ONCE(rcu_gp_in_progress())) { 1756 1908 /* 1757 1909 * Grace period already in progress, don't start another. 1758 1910 * Not supposed to be able to happen. ··· 1762 1914 } 1763 1915 1764 1916 /* Advance to a new grace period and initialize state. */ 1765 - record_gp_stall_check_time(rsp); 1917 + record_gp_stall_check_time(); 1766 1918 /* Record GP times before starting GP, hence rcu_seq_start(). */ 1767 - rcu_seq_start(&rsp->gp_seq); 1768 - trace_rcu_grace_period(rsp->name, rsp->gp_seq, TPS("start")); 1919 + rcu_seq_start(&rcu_state.gp_seq); 1920 + trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, TPS("start")); 1769 1921 raw_spin_unlock_irq_rcu_node(rnp); 1770 1922 1771 1923 /* ··· 1774 1926 * for subsequent online CPUs, and that quiescent-state forcing 1775 1927 * will handle subsequent offline CPUs. 
1776 1928 */ 1777 - rsp->gp_state = RCU_GP_ONOFF; 1778 - rcu_for_each_leaf_node(rsp, rnp) { 1779 - spin_lock(&rsp->ofl_lock); 1929 + rcu_state.gp_state = RCU_GP_ONOFF; 1930 + rcu_for_each_leaf_node(rnp) { 1931 + raw_spin_lock(&rcu_state.ofl_lock); 1780 1932 raw_spin_lock_irq_rcu_node(rnp); 1781 1933 if (rnp->qsmaskinit == rnp->qsmaskinitnext && 1782 1934 !rnp->wait_blkd_tasks) { 1783 1935 /* Nothing to do on this leaf rcu_node structure. */ 1784 1936 raw_spin_unlock_irq_rcu_node(rnp); 1785 - spin_unlock(&rsp->ofl_lock); 1937 + raw_spin_unlock(&rcu_state.ofl_lock); 1786 1938 continue; 1787 1939 } 1788 1940 ··· 1818 1970 } 1819 1971 1820 1972 raw_spin_unlock_irq_rcu_node(rnp); 1821 - spin_unlock(&rsp->ofl_lock); 1973 + raw_spin_unlock(&rcu_state.ofl_lock); 1822 1974 } 1823 - rcu_gp_slow(rsp, gp_preinit_delay); /* Races with CPU hotplug. */ 1975 + rcu_gp_slow(gp_preinit_delay); /* Races with CPU hotplug. */ 1824 1976 1825 1977 /* 1826 1978 * Set the quiescent-state-needed bits in all the rcu_node 1827 - * structures for all currently online CPUs in breadth-first order, 1828 - * starting from the root rcu_node structure, relying on the layout 1829 - * of the tree within the rsp->node[] array. Note that other CPUs 1830 - * will access only the leaves of the hierarchy, thus seeing that no 1831 - * grace period is in progress, at least until the corresponding 1832 - * leaf node has been initialized. 1979 + * structures for all currently online CPUs in breadth-first 1980 + * order, starting from the root rcu_node structure, relying on the 1981 + * layout of the tree within the rcu_state.node[] array. Note that 1982 + * other CPUs will access only the leaves of the hierarchy, thus 1983 + * seeing that no grace period is in progress, at least until the 1984 + * corresponding leaf node has been initialized. 1833 1985 * 1834 1986 * The grace period cannot complete until the initialization 1835 1987 * process finishes, because this kthread handles both. 
1836 1988 */ 1837 - rsp->gp_state = RCU_GP_INIT; 1838 - rcu_for_each_node_breadth_first(rsp, rnp) { 1839 - rcu_gp_slow(rsp, gp_init_delay); 1989 + rcu_state.gp_state = RCU_GP_INIT; 1990 + rcu_for_each_node_breadth_first(rnp) { 1991 + rcu_gp_slow(gp_init_delay); 1840 1992 raw_spin_lock_irqsave_rcu_node(rnp, flags); 1841 - rdp = this_cpu_ptr(rsp->rda); 1842 - rcu_preempt_check_blocked_tasks(rsp, rnp); 1993 + rdp = this_cpu_ptr(&rcu_data); 1994 + rcu_preempt_check_blocked_tasks(rnp); 1843 1995 rnp->qsmask = rnp->qsmaskinit; 1844 - WRITE_ONCE(rnp->gp_seq, rsp->gp_seq); 1996 + WRITE_ONCE(rnp->gp_seq, rcu_state.gp_seq); 1845 1997 if (rnp == rdp->mynode) 1846 - (void)__note_gp_changes(rsp, rnp, rdp); 1998 + (void)__note_gp_changes(rnp, rdp); 1847 1999 rcu_preempt_boost_start_gp(rnp); 1848 - trace_rcu_grace_period_init(rsp->name, rnp->gp_seq, 2000 + trace_rcu_grace_period_init(rcu_state.name, rnp->gp_seq, 1849 2001 rnp->level, rnp->grplo, 1850 2002 rnp->grphi, rnp->qsmask); 1851 2003 /* Quiescent states for tasks on any now-offline CPUs. */ 1852 2004 mask = rnp->qsmask & ~rnp->qsmaskinitnext; 1853 2005 rnp->rcu_gp_init_mask = mask; 1854 2006 if ((mask || rnp->wait_blkd_tasks) && rcu_is_leaf_node(rnp)) 1855 - rcu_report_qs_rnp(mask, rsp, rnp, rnp->gp_seq, flags); 2007 + rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); 1856 2008 else 1857 2009 raw_spin_unlock_irq_rcu_node(rnp); 1858 2010 cond_resched_tasks_rcu_qs(); 1859 - WRITE_ONCE(rsp->gp_activity, jiffies); 2011 + WRITE_ONCE(rcu_state.gp_activity, jiffies); 1860 2012 } 1861 2013 1862 2014 return true; ··· 1866 2018 * Helper function for swait_event_idle_exclusive() wakeup at force-quiescent-state 1867 2019 * time. 
1868 2020 */ 1869 - static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp) 2021 + static bool rcu_gp_fqs_check_wake(int *gfp) 1870 2022 { 1871 - struct rcu_node *rnp = rcu_get_root(rsp); 2023 + struct rcu_node *rnp = rcu_get_root(); 1872 2024 1873 2025 /* Someone like call_rcu() requested a force-quiescent-state scan. */ 1874 - *gfp = READ_ONCE(rsp->gp_flags); 2026 + *gfp = READ_ONCE(rcu_state.gp_flags); 1875 2027 if (*gfp & RCU_GP_FLAG_FQS) 1876 2028 return true; 1877 2029 ··· 1885 2037 /* 1886 2038 * Do one round of quiescent-state forcing. 1887 2039 */ 1888 - static void rcu_gp_fqs(struct rcu_state *rsp, bool first_time) 2040 + static void rcu_gp_fqs(bool first_time) 1889 2041 { 1890 - struct rcu_node *rnp = rcu_get_root(rsp); 2042 + struct rcu_node *rnp = rcu_get_root(); 1891 2043 1892 - WRITE_ONCE(rsp->gp_activity, jiffies); 1893 - rsp->n_force_qs++; 2044 + WRITE_ONCE(rcu_state.gp_activity, jiffies); 2045 + rcu_state.n_force_qs++; 1894 2046 if (first_time) { 1895 2047 /* Collect dyntick-idle snapshots. */ 1896 - force_qs_rnp(rsp, dyntick_save_progress_counter); 2048 + force_qs_rnp(dyntick_save_progress_counter); 1897 2049 } else { 1898 2050 /* Handle dyntick-idle and offline CPUs. */ 1899 - force_qs_rnp(rsp, rcu_implicit_dynticks_qs); 2051 + force_qs_rnp(rcu_implicit_dynticks_qs); 1900 2052 } 1901 2053 /* Clear flag to prevent immediate re-entry. */ 1902 - if (READ_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) { 2054 + if (READ_ONCE(rcu_state.gp_flags) & RCU_GP_FLAG_FQS) { 1903 2055 raw_spin_lock_irq_rcu_node(rnp); 1904 - WRITE_ONCE(rsp->gp_flags, 1905 - READ_ONCE(rsp->gp_flags) & ~RCU_GP_FLAG_FQS); 2056 + WRITE_ONCE(rcu_state.gp_flags, 2057 + READ_ONCE(rcu_state.gp_flags) & ~RCU_GP_FLAG_FQS); 1906 2058 raw_spin_unlock_irq_rcu_node(rnp); 2059 + } 2060 + } 2061 + 2062 + /* 2063 + * Loop doing repeated quiescent-state forcing until the grace period ends. 
2064 + */ 2065 + static void rcu_gp_fqs_loop(void) 2066 + { 2067 + bool first_gp_fqs; 2068 + int gf; 2069 + unsigned long j; 2070 + int ret; 2071 + struct rcu_node *rnp = rcu_get_root(); 2072 + 2073 + first_gp_fqs = true; 2074 + j = READ_ONCE(jiffies_till_first_fqs); 2075 + ret = 0; 2076 + for (;;) { 2077 + if (!ret) { 2078 + rcu_state.jiffies_force_qs = jiffies + j; 2079 + WRITE_ONCE(rcu_state.jiffies_kick_kthreads, 2080 + jiffies + 3 * j); 2081 + } 2082 + trace_rcu_grace_period(rcu_state.name, 2083 + READ_ONCE(rcu_state.gp_seq), 2084 + TPS("fqswait")); 2085 + rcu_state.gp_state = RCU_GP_WAIT_FQS; 2086 + ret = swait_event_idle_timeout_exclusive( 2087 + rcu_state.gp_wq, rcu_gp_fqs_check_wake(&gf), j); 2088 + rcu_state.gp_state = RCU_GP_DOING_FQS; 2089 + /* Locking provides needed memory barriers. */ 2090 + /* If grace period done, leave loop. */ 2091 + if (!READ_ONCE(rnp->qsmask) && 2092 + !rcu_preempt_blocked_readers_cgp(rnp)) 2093 + break; 2094 + /* If time for quiescent-state forcing, do it. */ 2095 + if (ULONG_CMP_GE(jiffies, rcu_state.jiffies_force_qs) || 2096 + (gf & RCU_GP_FLAG_FQS)) { 2097 + trace_rcu_grace_period(rcu_state.name, 2098 + READ_ONCE(rcu_state.gp_seq), 2099 + TPS("fqsstart")); 2100 + rcu_gp_fqs(first_gp_fqs); 2101 + first_gp_fqs = false; 2102 + trace_rcu_grace_period(rcu_state.name, 2103 + READ_ONCE(rcu_state.gp_seq), 2104 + TPS("fqsend")); 2105 + cond_resched_tasks_rcu_qs(); 2106 + WRITE_ONCE(rcu_state.gp_activity, jiffies); 2107 + ret = 0; /* Force full wait till next FQS. */ 2108 + j = READ_ONCE(jiffies_till_next_fqs); 2109 + } else { 2110 + /* Deal with stray signal. */ 2111 + cond_resched_tasks_rcu_qs(); 2112 + WRITE_ONCE(rcu_state.gp_activity, jiffies); 2113 + WARN_ON(signal_pending(current)); 2114 + trace_rcu_grace_period(rcu_state.name, 2115 + READ_ONCE(rcu_state.gp_seq), 2116 + TPS("fqswaitsig")); 2117 + ret = 1; /* Keep old FQS timing. 
*/ 2118 + j = jiffies; 2119 + if (time_after(jiffies, rcu_state.jiffies_force_qs)) 2120 + j = 1; 2121 + else 2122 + j = rcu_state.jiffies_force_qs - j; 2123 + } 1907 2124 } 1908 2125 } 1909 2126 1910 2127 /* 1911 2128 * Clean up after the old grace period. 1912 2129 */ 1913 - static void rcu_gp_cleanup(struct rcu_state *rsp) 2130 + static void rcu_gp_cleanup(void) 1914 2131 { 1915 2132 unsigned long gp_duration; 1916 2133 bool needgp = false; 1917 2134 unsigned long new_gp_seq; 1918 2135 struct rcu_data *rdp; 1919 - struct rcu_node *rnp = rcu_get_root(rsp); 2136 + struct rcu_node *rnp = rcu_get_root(); 1920 2137 struct swait_queue_head *sq; 1921 2138 1922 - WRITE_ONCE(rsp->gp_activity, jiffies); 2139 + WRITE_ONCE(rcu_state.gp_activity, jiffies); 1923 2140 raw_spin_lock_irq_rcu_node(rnp); 1924 - gp_duration = jiffies - rsp->gp_start; 1925 - if (gp_duration > rsp->gp_max) 1926 - rsp->gp_max = gp_duration; 2141 + gp_duration = jiffies - rcu_state.gp_start; 2142 + if (gp_duration > rcu_state.gp_max) 2143 + rcu_state.gp_max = gp_duration; 1927 2144 1928 2145 /* 1929 2146 * We know the grace period is complete, but to everyone else ··· 2009 2096 * the rcu_node structures before the beginning of the next grace 2010 2097 * period is recorded in any of the rcu_node structures. 
2011 2098 */ 2012 - new_gp_seq = rsp->gp_seq; 2099 + new_gp_seq = rcu_state.gp_seq; 2013 2100 rcu_seq_end(&new_gp_seq); 2014 - rcu_for_each_node_breadth_first(rsp, rnp) { 2101 + rcu_for_each_node_breadth_first(rnp) { 2015 2102 raw_spin_lock_irq_rcu_node(rnp); 2016 2103 if (WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp))) 2017 - dump_blkd_tasks(rsp, rnp, 10); 2104 + dump_blkd_tasks(rnp, 10); 2018 2105 WARN_ON_ONCE(rnp->qsmask); 2019 2106 WRITE_ONCE(rnp->gp_seq, new_gp_seq); 2020 - rdp = this_cpu_ptr(rsp->rda); 2107 + rdp = this_cpu_ptr(&rcu_data); 2021 2108 if (rnp == rdp->mynode) 2022 - needgp = __note_gp_changes(rsp, rnp, rdp) || needgp; 2109 + needgp = __note_gp_changes(rnp, rdp) || needgp; 2023 2110 /* smp_mb() provided by prior unlock-lock pair. */ 2024 - needgp = rcu_future_gp_cleanup(rsp, rnp) || needgp; 2111 + needgp = rcu_future_gp_cleanup(rnp) || needgp; 2025 2112 sq = rcu_nocb_gp_get(rnp); 2026 2113 raw_spin_unlock_irq_rcu_node(rnp); 2027 2114 rcu_nocb_gp_cleanup(sq); 2028 2115 cond_resched_tasks_rcu_qs(); 2029 - WRITE_ONCE(rsp->gp_activity, jiffies); 2030 - rcu_gp_slow(rsp, gp_cleanup_delay); 2116 + WRITE_ONCE(rcu_state.gp_activity, jiffies); 2117 + rcu_gp_slow(gp_cleanup_delay); 2031 2118 } 2032 - rnp = rcu_get_root(rsp); 2033 - raw_spin_lock_irq_rcu_node(rnp); /* GP before rsp->gp_seq update. */ 2119 + rnp = rcu_get_root(); 2120 + raw_spin_lock_irq_rcu_node(rnp); /* GP before ->gp_seq update. */ 2034 2121 2035 2122 /* Declare grace period done. */ 2036 - rcu_seq_end(&rsp->gp_seq); 2037 - trace_rcu_grace_period(rsp->name, rsp->gp_seq, TPS("end")); 2038 - rsp->gp_state = RCU_GP_IDLE; 2123 + rcu_seq_end(&rcu_state.gp_seq); 2124 + trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, TPS("end")); 2125 + rcu_state.gp_state = RCU_GP_IDLE; 2039 2126 /* Check for GP requests since above loop. 
*/ 2040 - rdp = this_cpu_ptr(rsp->rda); 2127 + rdp = this_cpu_ptr(&rcu_data); 2041 2128 if (!needgp && ULONG_CMP_LT(rnp->gp_seq, rnp->gp_seq_needed)) { 2042 2129 trace_rcu_this_gp(rnp, rdp, rnp->gp_seq_needed, 2043 2130 TPS("CleanupMore")); 2044 2131 needgp = true; 2045 2132 } 2046 2133 /* Advance CBs to reduce false positives below. */ 2047 - if (!rcu_accelerate_cbs(rsp, rnp, rdp) && needgp) { 2048 - WRITE_ONCE(rsp->gp_flags, RCU_GP_FLAG_INIT); 2049 - rsp->gp_req_activity = jiffies; 2050 - trace_rcu_grace_period(rsp->name, READ_ONCE(rsp->gp_seq), 2134 + if (!rcu_accelerate_cbs(rnp, rdp) && needgp) { 2135 + WRITE_ONCE(rcu_state.gp_flags, RCU_GP_FLAG_INIT); 2136 + rcu_state.gp_req_activity = jiffies; 2137 + trace_rcu_grace_period(rcu_state.name, 2138 + READ_ONCE(rcu_state.gp_seq), 2051 2139 TPS("newreq")); 2052 2140 } else { 2053 - WRITE_ONCE(rsp->gp_flags, rsp->gp_flags & RCU_GP_FLAG_INIT); 2141 + WRITE_ONCE(rcu_state.gp_flags, 2142 + rcu_state.gp_flags & RCU_GP_FLAG_INIT); 2054 2143 } 2055 2144 raw_spin_unlock_irq_rcu_node(rnp); 2056 2145 } ··· 2060 2145 /* 2061 2146 * Body of kthread that handles grace periods. 2062 2147 */ 2063 - static int __noreturn rcu_gp_kthread(void *arg) 2148 + static int __noreturn rcu_gp_kthread(void *unused) 2064 2149 { 2065 - bool first_gp_fqs; 2066 - int gf; 2067 - unsigned long j; 2068 - int ret; 2069 - struct rcu_state *rsp = arg; 2070 - struct rcu_node *rnp = rcu_get_root(rsp); 2071 - 2072 2150 rcu_bind_gp_kthread(); 2073 2151 for (;;) { 2074 2152 2075 2153 /* Handle grace-period start. 
*/ 2076 2154 for (;;) { 2077 - trace_rcu_grace_period(rsp->name, 2078 - READ_ONCE(rsp->gp_seq), 2155 + trace_rcu_grace_period(rcu_state.name, 2156 + READ_ONCE(rcu_state.gp_seq), 2079 2157 TPS("reqwait")); 2080 - rsp->gp_state = RCU_GP_WAIT_GPS; 2081 - swait_event_idle_exclusive(rsp->gp_wq, READ_ONCE(rsp->gp_flags) & 2082 - RCU_GP_FLAG_INIT); 2083 - rsp->gp_state = RCU_GP_DONE_GPS; 2158 + rcu_state.gp_state = RCU_GP_WAIT_GPS; 2159 + swait_event_idle_exclusive(rcu_state.gp_wq, 2160 + READ_ONCE(rcu_state.gp_flags) & 2161 + RCU_GP_FLAG_INIT); 2162 + rcu_state.gp_state = RCU_GP_DONE_GPS; 2084 2163 /* Locking provides needed memory barrier. */ 2085 - if (rcu_gp_init(rsp)) 2164 + if (rcu_gp_init()) 2086 2165 break; 2087 2166 cond_resched_tasks_rcu_qs(); 2088 - WRITE_ONCE(rsp->gp_activity, jiffies); 2167 + WRITE_ONCE(rcu_state.gp_activity, jiffies); 2089 2168 WARN_ON(signal_pending(current)); 2090 - trace_rcu_grace_period(rsp->name, 2091 - READ_ONCE(rsp->gp_seq), 2169 + trace_rcu_grace_period(rcu_state.name, 2170 + READ_ONCE(rcu_state.gp_seq), 2092 2171 TPS("reqwaitsig")); 2093 2172 } 2094 2173 2095 2174 /* Handle quiescent-state forcing. */ 2096 - first_gp_fqs = true; 2097 - j = jiffies_till_first_fqs; 2098 - ret = 0; 2099 - for (;;) { 2100 - if (!ret) { 2101 - rsp->jiffies_force_qs = jiffies + j; 2102 - WRITE_ONCE(rsp->jiffies_kick_kthreads, 2103 - jiffies + 3 * j); 2104 - } 2105 - trace_rcu_grace_period(rsp->name, 2106 - READ_ONCE(rsp->gp_seq), 2107 - TPS("fqswait")); 2108 - rsp->gp_state = RCU_GP_WAIT_FQS; 2109 - ret = swait_event_idle_timeout_exclusive(rsp->gp_wq, 2110 - rcu_gp_fqs_check_wake(rsp, &gf), j); 2111 - rsp->gp_state = RCU_GP_DOING_FQS; 2112 - /* Locking provides needed memory barriers. */ 2113 - /* If grace period done, leave loop. */ 2114 - if (!READ_ONCE(rnp->qsmask) && 2115 - !rcu_preempt_blocked_readers_cgp(rnp)) 2116 - break; 2117 - /* If time for quiescent-state forcing, do it. 
*/ 2118 - if (ULONG_CMP_GE(jiffies, rsp->jiffies_force_qs) || 2119 - (gf & RCU_GP_FLAG_FQS)) { 2120 - trace_rcu_grace_period(rsp->name, 2121 - READ_ONCE(rsp->gp_seq), 2122 - TPS("fqsstart")); 2123 - rcu_gp_fqs(rsp, first_gp_fqs); 2124 - first_gp_fqs = false; 2125 - trace_rcu_grace_period(rsp->name, 2126 - READ_ONCE(rsp->gp_seq), 2127 - TPS("fqsend")); 2128 - cond_resched_tasks_rcu_qs(); 2129 - WRITE_ONCE(rsp->gp_activity, jiffies); 2130 - ret = 0; /* Force full wait till next FQS. */ 2131 - j = jiffies_till_next_fqs; 2132 - } else { 2133 - /* Deal with stray signal. */ 2134 - cond_resched_tasks_rcu_qs(); 2135 - WRITE_ONCE(rsp->gp_activity, jiffies); 2136 - WARN_ON(signal_pending(current)); 2137 - trace_rcu_grace_period(rsp->name, 2138 - READ_ONCE(rsp->gp_seq), 2139 - TPS("fqswaitsig")); 2140 - ret = 1; /* Keep old FQS timing. */ 2141 - j = jiffies; 2142 - if (time_after(jiffies, rsp->jiffies_force_qs)) 2143 - j = 1; 2144 - else 2145 - j = rsp->jiffies_force_qs - j; 2146 - } 2147 - } 2175 + rcu_gp_fqs_loop(); 2148 2176 2149 2177 /* Handle grace-period end. */ 2150 - rsp->gp_state = RCU_GP_CLEANUP; 2151 - rcu_gp_cleanup(rsp); 2152 - rsp->gp_state = RCU_GP_CLEANED; 2178 + rcu_state.gp_state = RCU_GP_CLEANUP; 2179 + rcu_gp_cleanup(); 2180 + rcu_state.gp_state = RCU_GP_CLEANED; 2153 2181 } 2154 2182 } 2155 2183 2156 2184 /* 2157 - * Report a full set of quiescent states to the specified rcu_state data 2158 - * structure. Invoke rcu_gp_kthread_wake() to awaken the grace-period 2159 - * kthread if another grace period is required. Whether we wake 2160 - * the grace-period kthread or it awakens itself for the next round 2161 - * of quiescent-state forcing, that kthread will clean up after the 2162 - * just-completed grace period. Note that the caller must hold rnp->lock, 2163 - * which is released before return. 2185 + * Report a full set of quiescent states to the rcu_state data structure. 
2186 + * Invoke rcu_gp_kthread_wake() to awaken the grace-period kthread if 2187 + * another grace period is required. Whether we wake the grace-period 2188 + * kthread or it awakens itself for the next round of quiescent-state 2189 + * forcing, that kthread will clean up after the just-completed grace 2190 + * period. Note that the caller must hold rnp->lock, which is released 2191 + * before return. 2164 2192 */ 2165 - static void rcu_report_qs_rsp(struct rcu_state *rsp, unsigned long flags) 2166 - __releases(rcu_get_root(rsp)->lock) 2193 + static void rcu_report_qs_rsp(unsigned long flags) 2194 + __releases(rcu_get_root()->lock) 2167 2195 { 2168 - raw_lockdep_assert_held_rcu_node(rcu_get_root(rsp)); 2169 - WARN_ON_ONCE(!rcu_gp_in_progress(rsp)); 2170 - WRITE_ONCE(rsp->gp_flags, READ_ONCE(rsp->gp_flags) | RCU_GP_FLAG_FQS); 2171 - raw_spin_unlock_irqrestore_rcu_node(rcu_get_root(rsp), flags); 2172 - rcu_gp_kthread_wake(rsp); 2196 + raw_lockdep_assert_held_rcu_node(rcu_get_root()); 2197 + WARN_ON_ONCE(!rcu_gp_in_progress()); 2198 + WRITE_ONCE(rcu_state.gp_flags, 2199 + READ_ONCE(rcu_state.gp_flags) | RCU_GP_FLAG_FQS); 2200 + raw_spin_unlock_irqrestore_rcu_node(rcu_get_root(), flags); 2201 + rcu_gp_kthread_wake(); 2173 2202 } 2174 2203 2175 2204 /* ··· 2130 2271 * disabled. This allows propagating quiescent state due to resumed tasks 2131 2272 * during grace-period initialization. 
2132 2273 */ 2133 - static void 2134 - rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp, 2135 - struct rcu_node *rnp, unsigned long gps, unsigned long flags) 2274 + static void rcu_report_qs_rnp(unsigned long mask, struct rcu_node *rnp, 2275 + unsigned long gps, unsigned long flags) 2136 2276 __releases(rnp->lock) 2137 2277 { 2138 2278 unsigned long oldmask = 0; ··· 2154 2296 WARN_ON_ONCE(!rcu_is_leaf_node(rnp) && 2155 2297 rcu_preempt_blocked_readers_cgp(rnp)); 2156 2298 rnp->qsmask &= ~mask; 2157 - trace_rcu_quiescent_state_report(rsp->name, rnp->gp_seq, 2299 + trace_rcu_quiescent_state_report(rcu_state.name, rnp->gp_seq, 2158 2300 mask, rnp->qsmask, rnp->level, 2159 2301 rnp->grplo, rnp->grphi, 2160 2302 !!rnp->gp_tasks); ··· 2184 2326 * state for this grace period. Invoke rcu_report_qs_rsp() 2185 2327 * to clean up and start the next grace period if one is needed. 2186 2328 */ 2187 - rcu_report_qs_rsp(rsp, flags); /* releases rnp->lock. */ 2329 + rcu_report_qs_rsp(flags); /* releases rnp->lock. */ 2188 2330 } 2189 2331 2190 2332 /* 2191 2333 * Record a quiescent state for all tasks that were previously queued 2192 2334 * on the specified rcu_node structure and that were blocking the current 2193 - * RCU grace period. The caller must hold the specified rnp->lock with 2335 + * RCU grace period. The caller must hold the corresponding rnp->lock with 2194 2336 * irqs disabled, and this lock is released upon return, but irqs remain 2195 2337 * disabled. 
2196 2338 */ 2197 2339 static void __maybe_unused 2198 - rcu_report_unblock_qs_rnp(struct rcu_state *rsp, 2199 - struct rcu_node *rnp, unsigned long flags) 2340 + rcu_report_unblock_qs_rnp(struct rcu_node *rnp, unsigned long flags) 2200 2341 __releases(rnp->lock) 2201 2342 { 2202 2343 unsigned long gps; ··· 2203 2346 struct rcu_node *rnp_p; 2204 2347 2205 2348 raw_lockdep_assert_held_rcu_node(rnp); 2206 - if (WARN_ON_ONCE(rcu_state_p == &rcu_sched_state) || 2207 - WARN_ON_ONCE(rsp != rcu_state_p) || 2349 + if (WARN_ON_ONCE(!IS_ENABLED(CONFIG_PREEMPT)) || 2208 2350 WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp)) || 2209 2351 rnp->qsmask != 0) { 2210 2352 raw_spin_unlock_irqrestore_rcu_node(rnp, flags); ··· 2217 2361 * Only one rcu_node structure in the tree, so don't 2218 2362 * try to report up to its nonexistent parent! 2219 2363 */ 2220 - rcu_report_qs_rsp(rsp, flags); 2364 + rcu_report_qs_rsp(flags); 2221 2365 return; 2222 2366 } 2223 2367 ··· 2226 2370 mask = rnp->grpmask; 2227 2371 raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */ 2228 2372 raw_spin_lock_rcu_node(rnp_p); /* irqs already disabled. */ 2229 - rcu_report_qs_rnp(mask, rsp, rnp_p, gps, flags); 2373 + rcu_report_qs_rnp(mask, rnp_p, gps, flags); 2230 2374 } 2231 2375 2232 2376 /* ··· 2234 2378 * structure. This must be called from the specified CPU. 2235 2379 */ 2236 2380 static void 2237 - rcu_report_qs_rdp(int cpu, struct rcu_state *rsp, struct rcu_data *rdp) 2381 + rcu_report_qs_rdp(int cpu, struct rcu_data *rdp) 2238 2382 { 2239 2383 unsigned long flags; 2240 2384 unsigned long mask; ··· 2253 2397 * within the current grace period. 2254 2398 */ 2255 2399 rdp->cpu_no_qs.b.norm = true; /* need qs for new gp. 
*/ 2256 - rdp->rcu_qs_ctr_snap = __this_cpu_read(rcu_dynticks.rcu_qs_ctr); 2257 2400 raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 2258 2401 return; 2259 2402 } ··· 2266 2411 * This GP can't end until cpu checks in, so all of our 2267 2412 * callbacks can be processed during the next GP. 2268 2413 */ 2269 - needwake = rcu_accelerate_cbs(rsp, rnp, rdp); 2414 + needwake = rcu_accelerate_cbs(rnp, rdp); 2270 2415 2271 - rcu_report_qs_rnp(mask, rsp, rnp, rnp->gp_seq, flags); 2416 + rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); 2272 2417 /* ^^^ Released rnp->lock */ 2273 2418 if (needwake) 2274 - rcu_gp_kthread_wake(rsp); 2419 + rcu_gp_kthread_wake(); 2275 2420 } 2276 2421 } 2277 2422 ··· 2282 2427 * quiescent state for this grace period, and record that fact if so. 2283 2428 */ 2284 2429 static void 2285 - rcu_check_quiescent_state(struct rcu_state *rsp, struct rcu_data *rdp) 2430 + rcu_check_quiescent_state(struct rcu_data *rdp) 2286 2431 { 2287 2432 /* Check for grace-period ends and beginnings. */ 2288 - note_gp_changes(rsp, rdp); 2433 + note_gp_changes(rdp); 2289 2434 2290 2435 /* 2291 2436 * Does this CPU still need to do its part for current grace period? ··· 2305 2450 * Tell RCU we are done (but rcu_report_qs_rdp() will be the 2306 2451 * judge of that). 2307 2452 */ 2308 - rcu_report_qs_rdp(rdp->cpu, rsp, rdp); 2453 + rcu_report_qs_rdp(rdp->cpu, rdp); 2309 2454 } 2310 2455 2311 2456 /* 2312 - * Trace the fact that this CPU is going offline. 2457 + * Near the end of the offline process. Trace the fact that this CPU 2458 + * is going offline. 
*/ 2313 2459 */ 2314 - static void rcu_cleanup_dying_cpu(struct rcu_state *rsp) 2460 + int rcutree_dying_cpu(unsigned int cpu) 2315 2461 { 2316 2462 RCU_TRACE(bool blkd;) 2317 - RCU_TRACE(struct rcu_data *rdp = this_cpu_ptr(rsp->rda);) 2463 + RCU_TRACE(struct rcu_data *rdp = this_cpu_ptr(&rcu_data);) 2318 2464 RCU_TRACE(struct rcu_node *rnp = rdp->mynode;) 2319 2465 2320 2466 if (!IS_ENABLED(CONFIG_HOTPLUG_CPU)) 2321 - return; 2467 + return 0; 2322 2468 2323 2469 RCU_TRACE(blkd = !!(rnp->qsmask & rdp->grpmask);) 2324 - trace_rcu_grace_period(rsp->name, rnp->gp_seq, 2470 + trace_rcu_grace_period(rcu_state.name, rnp->gp_seq, 2325 2471 blkd ? TPS("cpuofl") : TPS("cpuofl-bgp")); 2472 + return 0; 2326 2473 } 2327 2474 2328 2475 /* ··· 2378 2521 * There can only be one CPU hotplug operation at a time, so no need for 2379 2522 * explicit locking. 2380 2523 */ 2381 - static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp) 2524 + int rcutree_dead_cpu(unsigned int cpu) 2382 2525 { 2383 - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); 2526 + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); 2384 2527 struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */ 2385 2528 2386 2529 if (!IS_ENABLED(CONFIG_HOTPLUG_CPU)) 2387 - return; 2530 + return 0; 2388 2531 2389 2532 /* Adjust any no-longer-needed kthreads. */ 2390 2533 rcu_boost_kthread_setaffinity(rnp, -1); 2534 + /* Do any needed no-CB deferred wakeups from this CPU. */ 2535 + do_nocb_deferred_wakeup(per_cpu_ptr(&rcu_data, cpu)); 2536 + return 0; 2391 2537 } 2392 2538 2393 2539 /* 2394 2540 * Invoke any RCU callbacks that have made it to the end of their grace 2395 2541 * period. Throttle as specified by rdp->blimit. 2396 2542 */ 2397 - static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp) 2543 + static void rcu_do_batch(struct rcu_data *rdp) 2398 2544 { 2399 2545 unsigned long flags; 2400 2546 struct rcu_head *rhp; ··· 2406 2546 2407 2547 /* If no callbacks are ready, just return. 
*/ 2408 2548 if (!rcu_segcblist_ready_cbs(&rdp->cblist)) { 2409 - trace_rcu_batch_start(rsp->name, 2549 + trace_rcu_batch_start(rcu_state.name, 2410 2550 rcu_segcblist_n_lazy_cbs(&rdp->cblist), 2411 2551 rcu_segcblist_n_cbs(&rdp->cblist), 0); 2412 - trace_rcu_batch_end(rsp->name, 0, 2552 + trace_rcu_batch_end(rcu_state.name, 0, 2413 2553 !rcu_segcblist_empty(&rdp->cblist), 2414 2554 need_resched(), is_idle_task(current), 2415 2555 rcu_is_callbacks_kthread()); ··· 2424 2564 local_irq_save(flags); 2425 2565 WARN_ON_ONCE(cpu_is_offline(smp_processor_id())); 2426 2566 bl = rdp->blimit; 2427 - trace_rcu_batch_start(rsp->name, rcu_segcblist_n_lazy_cbs(&rdp->cblist), 2567 + trace_rcu_batch_start(rcu_state.name, 2568 + rcu_segcblist_n_lazy_cbs(&rdp->cblist), 2428 2569 rcu_segcblist_n_cbs(&rdp->cblist), bl); 2429 2570 rcu_segcblist_extract_done_cbs(&rdp->cblist, &rcl); 2430 2571 local_irq_restore(flags); ··· 2434 2573 rhp = rcu_cblist_dequeue(&rcl); 2435 2574 for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) { 2436 2575 debug_rcu_head_unqueue(rhp); 2437 - if (__rcu_reclaim(rsp->name, rhp)) 2576 + if (__rcu_reclaim(rcu_state.name, rhp)) 2438 2577 rcu_cblist_dequeued_lazy(&rcl); 2439 2578 /* 2440 2579 * Stop only if limit reached and CPU has something to do. ··· 2448 2587 2449 2588 local_irq_save(flags); 2450 2589 count = -rcl.len; 2451 - trace_rcu_batch_end(rsp->name, count, !!rcl.head, need_resched(), 2590 + trace_rcu_batch_end(rcu_state.name, count, !!rcl.head, need_resched(), 2452 2591 is_idle_task(current), rcu_is_callbacks_kthread()); 2453 2592 2454 2593 /* Update counts and requeue any remaining callbacks. */ ··· 2464 2603 /* Reset ->qlen_last_fqs_check trigger if enough CBs have drained. 
*/ 2465 2604 if (count == 0 && rdp->qlen_last_fqs_check != 0) { 2466 2605 rdp->qlen_last_fqs_check = 0; 2467 - rdp->n_force_qs_snap = rsp->n_force_qs; 2606 + rdp->n_force_qs_snap = rcu_state.n_force_qs; 2468 2607 } else if (count < rdp->qlen_last_fqs_check - qhimark) 2469 2608 rdp->qlen_last_fqs_check = count; 2470 2609 ··· 2492 2631 void rcu_check_callbacks(int user) 2493 2632 { 2494 2633 trace_rcu_utilization(TPS("Start scheduler-tick")); 2495 - increment_cpu_stall_ticks(); 2496 - if (user || rcu_is_cpu_rrupt_from_idle()) { 2497 - 2498 - /* 2499 - * Get here if this CPU took its interrupt from user 2500 - * mode or from the idle loop, and if this is not a 2501 - * nested interrupt. In this case, the CPU is in 2502 - * a quiescent state, so note it. 2503 - * 2504 - * No memory barrier is required here because both 2505 - * rcu_sched_qs() and rcu_bh_qs() reference only CPU-local 2506 - * variables that other CPUs neither access nor modify, 2507 - * at least not while the corresponding CPU is online. 2508 - */ 2509 - 2510 - rcu_sched_qs(); 2511 - rcu_bh_qs(); 2512 - rcu_note_voluntary_context_switch(current); 2513 - 2514 - } else if (!in_softirq()) { 2515 - 2516 - /* 2517 - * Get here if this CPU did not take its interrupt from 2518 - * softirq, in other words, if it is not interrupting 2519 - * a rcu_bh read-side critical section. This is an _bh 2520 - * critical section, so note it. 2521 - */ 2522 - 2523 - rcu_bh_qs(); 2634 + raw_cpu_inc(rcu_data.ticks_this_gp); 2635 + /* The load-acquire pairs with the store-release setting to true. */ 2636 + if (smp_load_acquire(this_cpu_ptr(&rcu_data.rcu_urgent_qs))) { 2637 + /* Idle and userspace execution already are quiescent states. 
*/ 2638 + if (!rcu_is_cpu_rrupt_from_idle() && !user) { 2639 + set_tsk_need_resched(current); 2640 + set_preempt_need_resched(); 2641 + } 2642 + __this_cpu_write(rcu_data.rcu_urgent_qs, false); 2524 2643 } 2525 - rcu_preempt_check_callbacks(); 2644 + rcu_flavor_check_callbacks(user); 2526 2645 if (rcu_pending()) 2527 2646 invoke_rcu_core(); 2528 2647 ··· 2516 2675 * 2517 2676 * The caller must have suppressed start of new grace periods. 2518 2677 */ 2519 - static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *rsp)) 2678 + static void force_qs_rnp(int (*f)(struct rcu_data *rdp)) 2520 2679 { 2521 2680 int cpu; 2522 2681 unsigned long flags; 2523 2682 unsigned long mask; 2524 2683 struct rcu_node *rnp; 2525 2684 2526 - rcu_for_each_leaf_node(rsp, rnp) { 2685 + rcu_for_each_leaf_node(rnp) { 2527 2686 cond_resched_tasks_rcu_qs(); 2528 2687 mask = 0; 2529 2688 raw_spin_lock_irqsave_rcu_node(rnp, flags); 2530 2689 if (rnp->qsmask == 0) { 2531 - if (rcu_state_p == &rcu_sched_state || 2532 - rsp != rcu_state_p || 2690 + if (!IS_ENABLED(CONFIG_PREEMPT) || 2533 2691 rcu_preempt_blocked_readers_cgp(rnp)) { 2534 2692 /* 2535 2693 * No point in scanning bits because they ··· 2545 2705 for_each_leaf_node_possible_cpu(rnp, cpu) { 2546 2706 unsigned long bit = leaf_node_cpu_bit(rnp, cpu); 2547 2707 if ((rnp->qsmask & bit) != 0) { 2548 - if (f(per_cpu_ptr(rsp->rda, cpu))) 2708 + if (f(per_cpu_ptr(&rcu_data, cpu))) 2549 2709 mask |= bit; 2550 2710 } 2551 2711 } 2552 2712 if (mask != 0) { 2553 2713 /* Idle/offline CPUs, report (releases rnp->lock). */ 2554 - rcu_report_qs_rnp(mask, rsp, rnp, rnp->gp_seq, flags); 2714 + rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); 2555 2715 } else { 2556 2716 /* Nothing to do here, so just drop the lock. */ 2557 2717 raw_spin_unlock_irqrestore_rcu_node(rnp, flags); ··· 2563 2723 * Force quiescent states on reluctant CPUs, and also detect which 2564 2724 * CPUs are in dyntick-idle mode. 
2565 2725 */ 2566 - static void force_quiescent_state(struct rcu_state *rsp) 2726 + static void force_quiescent_state(void) 2567 2727 { 2568 2728 unsigned long flags; 2569 2729 bool ret; ··· 2571 2731 struct rcu_node *rnp_old = NULL; 2572 2732 2573 2733 /* Funnel through hierarchy to reduce memory contention. */ 2574 - rnp = __this_cpu_read(rsp->rda->mynode); 2734 + rnp = __this_cpu_read(rcu_data.mynode); 2575 2735 for (; rnp != NULL; rnp = rnp->parent) { 2576 - ret = (READ_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) || 2736 + ret = (READ_ONCE(rcu_state.gp_flags) & RCU_GP_FLAG_FQS) || 2577 2737 !raw_spin_trylock(&rnp->fqslock); 2578 2738 if (rnp_old != NULL) 2579 2739 raw_spin_unlock(&rnp_old->fqslock); ··· 2581 2741 return; 2582 2742 rnp_old = rnp; 2583 2743 } 2584 - /* rnp_old == rcu_get_root(rsp), rnp == NULL. */ 2744 + /* rnp_old == rcu_get_root(), rnp == NULL. */ 2585 2745 2586 2746 /* Reached the root of the rcu_node tree, acquire lock. */ 2587 2747 raw_spin_lock_irqsave_rcu_node(rnp_old, flags); 2588 2748 raw_spin_unlock(&rnp_old->fqslock); 2589 - if (READ_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) { 2749 + if (READ_ONCE(rcu_state.gp_flags) & RCU_GP_FLAG_FQS) { 2590 2750 raw_spin_unlock_irqrestore_rcu_node(rnp_old, flags); 2591 2751 return; /* Someone beat us to it. */ 2592 2752 } 2593 - WRITE_ONCE(rsp->gp_flags, READ_ONCE(rsp->gp_flags) | RCU_GP_FLAG_FQS); 2753 + WRITE_ONCE(rcu_state.gp_flags, 2754 + READ_ONCE(rcu_state.gp_flags) | RCU_GP_FLAG_FQS); 2594 2755 raw_spin_unlock_irqrestore_rcu_node(rnp_old, flags); 2595 - rcu_gp_kthread_wake(rsp); 2756 + rcu_gp_kthread_wake(); 2596 2757 } 2597 2758 2598 2759 /* ··· 2601 2760 * RCU to come out of its idle mode. 
2602 2761 */ 2603 2762 static void 2604 - rcu_check_gp_start_stall(struct rcu_state *rsp, struct rcu_node *rnp, 2605 - struct rcu_data *rdp) 2763 + rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp) 2606 2764 { 2607 2765 const unsigned long gpssdelay = rcu_jiffies_till_stall_check() * HZ; 2608 2766 unsigned long flags; 2609 2767 unsigned long j; 2610 - struct rcu_node *rnp_root = rcu_get_root(rsp); 2768 + struct rcu_node *rnp_root = rcu_get_root(); 2611 2769 static atomic_t warned = ATOMIC_INIT(0); 2612 2770 2613 - if (!IS_ENABLED(CONFIG_PROVE_RCU) || rcu_gp_in_progress(rsp) || 2771 + if (!IS_ENABLED(CONFIG_PROVE_RCU) || rcu_gp_in_progress() || 2614 2772 ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed)) 2615 2773 return; 2616 2774 j = jiffies; /* Expensive access, and in common case don't get here. */ 2617 - if (time_before(j, READ_ONCE(rsp->gp_req_activity) + gpssdelay) || 2618 - time_before(j, READ_ONCE(rsp->gp_activity) + gpssdelay) || 2775 + if (time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) || 2776 + time_before(j, READ_ONCE(rcu_state.gp_activity) + gpssdelay) || 2619 2777 atomic_read(&warned)) 2620 2778 return; 2621 2779 2622 2780 raw_spin_lock_irqsave_rcu_node(rnp, flags); 2623 2781 j = jiffies; 2624 - if (rcu_gp_in_progress(rsp) || 2782 + if (rcu_gp_in_progress() || 2625 2783 ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) || 2626 - time_before(j, READ_ONCE(rsp->gp_req_activity) + gpssdelay) || 2627 - time_before(j, READ_ONCE(rsp->gp_activity) + gpssdelay) || 2784 + time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) || 2785 + time_before(j, READ_ONCE(rcu_state.gp_activity) + gpssdelay) || 2628 2786 atomic_read(&warned)) { 2629 2787 raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 2630 2788 return; ··· 2633 2793 if (rnp_root != rnp) 2634 2794 raw_spin_lock_rcu_node(rnp_root); /* irqs already disabled. 
*/ 2635 2795 j = jiffies; 2636 - if (rcu_gp_in_progress(rsp) || 2796 + if (rcu_gp_in_progress() || 2637 2797 ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) || 2638 - time_before(j, rsp->gp_req_activity + gpssdelay) || 2639 - time_before(j, rsp->gp_activity + gpssdelay) || 2798 + time_before(j, rcu_state.gp_req_activity + gpssdelay) || 2799 + time_before(j, rcu_state.gp_activity + gpssdelay) || 2640 2800 atomic_xchg(&warned, 1)) { 2641 2801 raw_spin_unlock_rcu_node(rnp_root); /* irqs remain disabled. */ 2642 2802 raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 2643 2803 return; 2644 2804 } 2645 2805 pr_alert("%s: g%ld->%ld gar:%lu ga:%lu f%#x gs:%d %s->state:%#lx\n", 2646 - __func__, (long)READ_ONCE(rsp->gp_seq), 2806 + __func__, (long)READ_ONCE(rcu_state.gp_seq), 2647 2807 (long)READ_ONCE(rnp_root->gp_seq_needed), 2648 - j - rsp->gp_req_activity, j - rsp->gp_activity, 2649 - rsp->gp_flags, rsp->gp_state, rsp->name, 2650 - rsp->gp_kthread ? rsp->gp_kthread->state : 0x1ffffL); 2808 + j - rcu_state.gp_req_activity, j - rcu_state.gp_activity, 2809 + rcu_state.gp_flags, rcu_state.gp_state, rcu_state.name, 2810 + rcu_state.gp_kthread ? rcu_state.gp_kthread->state : 0x1ffffL); 2651 2811 WARN_ON(1); 2652 2812 if (rnp_root != rnp) 2653 2813 raw_spin_unlock_rcu_node(rnp_root); ··· 2655 2815 } 2656 2816 2657 2817 /* 2658 - * This does the RCU core processing work for the specified rcu_state 2659 - * and rcu_data structures. This may be called only from the CPU to 2660 - * whom the rdp belongs. 2661 - */ 2662 - static void 2663 - __rcu_process_callbacks(struct rcu_state *rsp) 2664 - { 2665 - unsigned long flags; 2666 - struct rcu_data *rdp = raw_cpu_ptr(rsp->rda); 2667 - struct rcu_node *rnp = rdp->mynode; 2668 - 2669 - WARN_ON_ONCE(!rdp->beenonline); 2670 - 2671 - /* Update RCU state based on any recent quiescent states. */ 2672 - rcu_check_quiescent_state(rsp, rdp); 2673 - 2674 - /* No grace period and unregistered callbacks? 
*/ 2675 - if (!rcu_gp_in_progress(rsp) && 2676 - rcu_segcblist_is_enabled(&rdp->cblist)) { 2677 - local_irq_save(flags); 2678 - if (!rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL)) 2679 - rcu_accelerate_cbs_unlocked(rsp, rnp, rdp); 2680 - local_irq_restore(flags); 2681 - } 2682 - 2683 - rcu_check_gp_start_stall(rsp, rnp, rdp); 2684 - 2685 - /* If there are callbacks ready, invoke them. */ 2686 - if (rcu_segcblist_ready_cbs(&rdp->cblist)) 2687 - invoke_rcu_callbacks(rsp, rdp); 2688 - 2689 - /* Do any needed deferred wakeups of rcuo kthreads. */ 2690 - do_nocb_deferred_wakeup(rdp); 2691 - } 2692 - 2693 - /* 2694 - * Do RCU core processing for the current CPU. 2818 + * This does the RCU core processing work for the specified rcu_data 2819 + * structures. This may be called only from the CPU to whom the rdp 2820 + * belongs. 2695 2821 */ 2696 2822 static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused) 2697 2823 { 2698 - struct rcu_state *rsp; 2824 + unsigned long flags; 2825 + struct rcu_data *rdp = raw_cpu_ptr(&rcu_data); 2826 + struct rcu_node *rnp = rdp->mynode; 2699 2827 2700 2828 if (cpu_is_offline(smp_processor_id())) 2701 2829 return; 2702 2830 trace_rcu_utilization(TPS("Start RCU core")); 2703 - for_each_rcu_flavor(rsp) 2704 - __rcu_process_callbacks(rsp); 2831 + WARN_ON_ONCE(!rdp->beenonline); 2832 + 2833 + /* Report any deferred quiescent states if preemption enabled. */ 2834 + if (!(preempt_count() & PREEMPT_MASK)) { 2835 + rcu_preempt_deferred_qs(current); 2836 + } else if (rcu_preempt_need_deferred_qs(current)) { 2837 + set_tsk_need_resched(current); 2838 + set_preempt_need_resched(); 2839 + } 2840 + 2841 + /* Update RCU state based on any recent quiescent states. */ 2842 + rcu_check_quiescent_state(rdp); 2843 + 2844 + /* No grace period and unregistered callbacks? 
*/ 2845 + if (!rcu_gp_in_progress() && 2846 + rcu_segcblist_is_enabled(&rdp->cblist)) { 2847 + local_irq_save(flags); 2848 + if (!rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL)) 2849 + rcu_accelerate_cbs_unlocked(rnp, rdp); 2850 + local_irq_restore(flags); 2851 + } 2852 + 2853 + rcu_check_gp_start_stall(rnp, rdp); 2854 + 2855 + /* If there are callbacks ready, invoke them. */ 2856 + if (rcu_segcblist_ready_cbs(&rdp->cblist)) 2857 + invoke_rcu_callbacks(rdp); 2858 + 2859 + /* Do any needed deferred wakeups of rcuo kthreads. */ 2860 + do_nocb_deferred_wakeup(rdp); 2705 2861 trace_rcu_utilization(TPS("End RCU core")); 2706 2862 } 2707 2863 2708 2864 /* 2709 - * Schedule RCU callback invocation. If the specified type of RCU 2710 - * does not support RCU priority boosting, just do a direct call, 2711 - * otherwise wake up the per-CPU kernel kthread. Note that because we 2712 - * are running on the current CPU with softirqs disabled, the 2713 - * rcu_cpu_kthread_task cannot disappear out from under us. 2865 + * Schedule RCU callback invocation. If the running implementation of RCU 2866 + * does not support RCU priority boosting, just do a direct call, otherwise 2867 + * wake up the per-CPU kernel kthread. Note that because we are running 2868 + * on the current CPU with softirqs disabled, the rcu_cpu_kthread_task 2869 + * cannot disappear out from under us. 2714 2870 */ 2715 - static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp) 2871 + static void invoke_rcu_callbacks(struct rcu_data *rdp) 2716 2872 { 2717 2873 if (unlikely(!READ_ONCE(rcu_scheduler_fully_active))) 2718 2874 return; 2719 - if (likely(!rsp->boost)) { 2720 - rcu_do_batch(rsp, rdp); 2875 + if (likely(!rcu_state.boost)) { 2876 + rcu_do_batch(rdp); 2721 2877 return; 2722 2878 } 2723 2879 invoke_rcu_callbacks_kthread(); ··· 2728 2892 /* 2729 2893 * Handle any core-RCU processing required by a call_rcu() invocation. 
2730 2894 */ 2731 - static void __call_rcu_core(struct rcu_state *rsp, struct rcu_data *rdp, 2732 - struct rcu_head *head, unsigned long flags) 2895 + static void __call_rcu_core(struct rcu_data *rdp, struct rcu_head *head, 2896 + unsigned long flags) 2733 2897 { 2734 2898 /* 2735 2899 * If called from an extended quiescent state, invoke the RCU ··· 2753 2917 rdp->qlen_last_fqs_check + qhimark)) { 2754 2918 2755 2919 /* Are we ignoring a completed grace period? */ 2756 - note_gp_changes(rsp, rdp); 2920 + note_gp_changes(rdp); 2757 2921 2758 2922 /* Start a new grace period if one not already started. */ 2759 - if (!rcu_gp_in_progress(rsp)) { 2760 - rcu_accelerate_cbs_unlocked(rsp, rdp->mynode, rdp); 2923 + if (!rcu_gp_in_progress()) { 2924 + rcu_accelerate_cbs_unlocked(rdp->mynode, rdp); 2761 2925 } else { 2762 2926 /* Give the grace period a kick. */ 2763 2927 rdp->blimit = LONG_MAX; 2764 - if (rsp->n_force_qs == rdp->n_force_qs_snap && 2928 + if (rcu_state.n_force_qs == rdp->n_force_qs_snap && 2765 2929 rcu_segcblist_first_pend_cb(&rdp->cblist) != head) 2766 - force_quiescent_state(rsp); 2767 - rdp->n_force_qs_snap = rsp->n_force_qs; 2930 + force_quiescent_state(); 2931 + rdp->n_force_qs_snap = rcu_state.n_force_qs; 2768 2932 rdp->qlen_last_fqs_check = rcu_segcblist_n_cbs(&rdp->cblist); 2769 2933 } 2770 2934 } ··· 2780 2944 /* 2781 2945 * Helper function for call_rcu() and friends. The cpu argument will 2782 2946 * normally be -1, indicating "currently running CPU". It may specify 2783 - * a CPU only if that CPU is a no-CBs CPU. Currently, only _rcu_barrier() 2947 + * a CPU only if that CPU is a no-CBs CPU. Currently, only rcu_barrier() 2784 2948 * is expected to specify a CPU. 
2785 2949 */ 2786 2950 static void 2787 - __call_rcu(struct rcu_head *head, rcu_callback_t func, 2788 - struct rcu_state *rsp, int cpu, bool lazy) 2951 + __call_rcu(struct rcu_head *head, rcu_callback_t func, int cpu, bool lazy) 2789 2952 { 2790 2953 unsigned long flags; 2791 2954 struct rcu_data *rdp; ··· 2806 2971 head->func = func; 2807 2972 head->next = NULL; 2808 2973 local_irq_save(flags); 2809 - rdp = this_cpu_ptr(rsp->rda); 2974 + rdp = this_cpu_ptr(&rcu_data); 2810 2975 2811 2976 /* Add the callback to our list. */ 2812 2977 if (unlikely(!rcu_segcblist_is_enabled(&rdp->cblist)) || cpu != -1) { 2813 2978 int offline; 2814 2979 2815 2980 if (cpu != -1) 2816 - rdp = per_cpu_ptr(rsp->rda, cpu); 2981 + rdp = per_cpu_ptr(&rcu_data, cpu); 2817 2982 if (likely(rdp->mynode)) { 2818 2983 /* Post-boot, so this should be for a no-CBs CPU. */ 2819 2984 offline = !__call_rcu_nocb(rdp, head, lazy, flags); ··· 2836 3001 rcu_idle_count_callbacks_posted(); 2837 3002 2838 3003 if (__is_kfree_rcu_offset((unsigned long)func)) 2839 - trace_rcu_kfree_callback(rsp->name, head, (unsigned long)func, 3004 + trace_rcu_kfree_callback(rcu_state.name, head, 3005 + (unsigned long)func, 2840 3006 rcu_segcblist_n_lazy_cbs(&rdp->cblist), 2841 3007 rcu_segcblist_n_cbs(&rdp->cblist)); 2842 3008 else 2843 - trace_rcu_callback(rsp->name, head, 3009 + trace_rcu_callback(rcu_state.name, head, 2844 3010 rcu_segcblist_n_lazy_cbs(&rdp->cblist), 2845 3011 rcu_segcblist_n_cbs(&rdp->cblist)); 2846 3012 2847 3013 /* Go handle any RCU core processing required. */ 2848 - __call_rcu_core(rsp, rdp, head, flags); 3014 + __call_rcu_core(rdp, head, flags); 2849 3015 local_irq_restore(flags); 2850 3016 } 2851 3017 2852 3018 /** 2853 - * call_rcu_sched() - Queue an RCU for invocation after sched grace period. 3019 + * call_rcu() - Queue an RCU callback for invocation after a grace period. 2854 3020 * @head: structure to be used for queueing the RCU updates. 
* @func: actual callback function to be invoked after the grace period 2856 3022 * 2857 3023 * The callback function will be invoked some time after a full grace 2858 - * period elapses, in other words after all currently executing RCU 2859 - * read-side critical sections have completed. call_rcu_sched() assumes 2860 - * that the read-side critical sections end on enabling of preemption 2861 - * or on voluntary preemption. 2862 - * RCU read-side critical sections are delimited by: 3024 + * period elapses, in other words after all pre-existing RCU read-side 3025 + * critical sections have completed. However, the callback function 3026 + * might well execute concurrently with RCU read-side critical sections 3027 + * that started after call_rcu() was invoked. RCU read-side critical 3028 + * sections are delimited by rcu_read_lock() and rcu_read_unlock(), and 3029 + * may be nested. In addition, regions of code across which interrupts, 3030 + * preemption, or softirqs have been disabled also serve as RCU read-side 3031 + * critical sections. This includes hardware interrupt handlers, softirq 3032 + * handlers, and NMI handlers. 2863 3033 * 2864 - * - rcu_read_lock_sched() and rcu_read_unlock_sched(), OR 2865 - * - anything that disables preemption. 3034 + * Note that all CPUs must agree that the grace period extended beyond 3035 + * all pre-existing RCU read-side critical sections. On systems with more 3036 + * than one CPU, this means that when "func()" is invoked, each CPU is 3037 + * guaranteed to have executed a full memory barrier since the end of its 3038 + * last RCU read-side critical section whose beginning preceded the call 3039 + * to call_rcu(). It also means that each CPU executing an RCU read-side 3040 + * critical section that continues beyond the start of "func()" must have 3041 + * executed a memory barrier after the call_rcu() but before the beginning 3042 + * of that RCU read-side critical section.
Note that these guarantees 3043 + * include CPUs that are offline, idle, or executing in user mode, as 3044 + * well as CPUs that are executing in the kernel. 2866 3045 * 2867 - * These may be nested. 2868 - * 2869 - * See the description of call_rcu() for more detailed information on 2870 - * memory ordering guarantees. 3046 + * Furthermore, if CPU A invoked call_rcu() and CPU B invoked the 3047 + * resulting RCU callback function "func()", then both CPU A and CPU B are 3048 + * guaranteed to execute a full memory barrier during the time interval 3049 + * between the call to call_rcu() and the invocation of "func()" -- even 3050 + * if CPU A and CPU B are the same CPU (but again only if the system has 3051 + * more than one CPU). 2871 3052 */ 2872 - void call_rcu_sched(struct rcu_head *head, rcu_callback_t func) 3053 + void call_rcu(struct rcu_head *head, rcu_callback_t func) 2873 3054 { 2874 - __call_rcu(head, func, &rcu_sched_state, -1, 0); 3055 + __call_rcu(head, func, -1, 0); 2875 3056 } 2876 - EXPORT_SYMBOL_GPL(call_rcu_sched); 2877 - 2878 - /** 2879 - * call_rcu_bh() - Queue an RCU for invocation after a quicker grace period. 2880 - * @head: structure to be used for queueing the RCU updates. 2881 - * @func: actual callback function to be invoked after the grace period 2882 - * 2883 - * The callback function will be invoked some time after a full grace 2884 - * period elapses, in other words after all currently executing RCU 2885 - * read-side critical sections have completed. call_rcu_bh() assumes 2886 - * that the read-side critical sections end on completion of a softirq 2887 - * handler. This means that read-side critical sections in process 2888 - * context must not be interrupted by softirqs. This interface is to be 2889 - * used when most of the read-side critical sections are in softirq context. 
2890 - * RCU read-side critical sections are delimited by: 2891 - * 2892 - * - rcu_read_lock() and rcu_read_unlock(), if in interrupt context, OR 2893 - * - rcu_read_lock_bh() and rcu_read_unlock_bh(), if in process context. 2894 - * 2895 - * These may be nested. 2896 - * 2897 - * See the description of call_rcu() for more detailed information on 2898 - * memory ordering guarantees. 2899 - */ 2900 - void call_rcu_bh(struct rcu_head *head, rcu_callback_t func) 2901 - { 2902 - __call_rcu(head, func, &rcu_bh_state, -1, 0); 2903 - } 2904 - EXPORT_SYMBOL_GPL(call_rcu_bh); 3057 + EXPORT_SYMBOL_GPL(call_rcu); 2905 3058 2906 3059 /* 2907 3060 * Queue an RCU callback for lazy invocation after a grace period. ··· 2898 3075 * callbacks in the list of pending callbacks. Until then, this 2899 3076 * function may only be called from __kfree_rcu(). 2900 3077 */ 2901 - void kfree_call_rcu(struct rcu_head *head, 2902 - rcu_callback_t func) 3078 + void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func) 2903 3079 { 2904 - __call_rcu(head, func, rcu_state_p, -1, 1); 3080 + __call_rcu(head, func, -1, 1); 2905 3081 } 2906 3082 EXPORT_SYMBOL_GPL(kfree_call_rcu); 2907 - 2908 - /* 2909 - * Because a context switch is a grace period for RCU-sched and RCU-bh, 2910 - * any blocking grace-period wait automatically implies a grace period 2911 - * if there is only one CPU online at any point time during execution 2912 - * of either synchronize_sched() or synchronize_rcu_bh(). It is OK to 2913 - * occasionally incorrectly indicate that there are multiple CPUs online 2914 - * when there was in fact only one the whole time, as this just adds 2915 - * some overhead: RCU still operates correctly. 2916 - */ 2917 - static int rcu_blocking_is_gp(void) 2918 - { 2919 - int ret; 2920 - 2921 - might_sleep(); /* Check for RCU read-side critical section. 
*/ 2922 - preempt_disable(); 2923 - ret = num_online_cpus() <= 1; 2924 - preempt_enable(); 2925 - return ret; 2926 - } 2927 - 2928 - /** 2929 - * synchronize_sched - wait until an rcu-sched grace period has elapsed. 2930 - * 2931 - * Control will return to the caller some time after a full rcu-sched 2932 - * grace period has elapsed, in other words after all currently executing 2933 - * rcu-sched read-side critical sections have completed. These read-side 2934 - * critical sections are delimited by rcu_read_lock_sched() and 2935 - * rcu_read_unlock_sched(), and may be nested. Note that preempt_disable(), 2936 - * local_irq_disable(), and so on may be used in place of 2937 - * rcu_read_lock_sched(). 2938 - * 2939 - * This means that all preempt_disable code sequences, including NMI and 2940 - * non-threaded hardware-interrupt handlers, in progress on entry will 2941 - * have completed before this primitive returns. However, this does not 2942 - * guarantee that softirq handlers will have completed, since in some 2943 - * kernels, these handlers can run in process context, and can block. 2944 - * 2945 - * Note that this guarantee implies further memory-ordering guarantees. 2946 - * On systems with more than one CPU, when synchronize_sched() returns, 2947 - * each CPU is guaranteed to have executed a full memory barrier since the 2948 - * end of its last RCU-sched read-side critical section whose beginning 2949 - * preceded the call to synchronize_sched(). In addition, each CPU having 2950 - * an RCU read-side critical section that extends beyond the return from 2951 - * synchronize_sched() is guaranteed to have executed a full memory barrier 2952 - * after the beginning of synchronize_sched() and before the beginning of 2953 - * that RCU read-side critical section. Note that these guarantees include 2954 - * CPUs that are offline, idle, or executing in user mode, as well as CPUs 2955 - * that are executing in the kernel. 
2956 - * 2957 - * Furthermore, if CPU A invoked synchronize_sched(), which returned 2958 - * to its caller on CPU B, then both CPU A and CPU B are guaranteed 2959 - * to have executed a full memory barrier during the execution of 2960 - * synchronize_sched() -- even if CPU A and CPU B are the same CPU (but 2961 - * again only if the system has more than one CPU). 2962 - */ 2963 - void synchronize_sched(void) 2964 - { 2965 - RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || 2966 - lock_is_held(&rcu_lock_map) || 2967 - lock_is_held(&rcu_sched_lock_map), 2968 - "Illegal synchronize_sched() in RCU-sched read-side critical section"); 2969 - if (rcu_blocking_is_gp()) 2970 - return; 2971 - if (rcu_gp_is_expedited()) 2972 - synchronize_sched_expedited(); 2973 - else 2974 - wait_rcu_gp(call_rcu_sched); 2975 - } 2976 - EXPORT_SYMBOL_GPL(synchronize_sched); 2977 - 2978 - /** 2979 - * synchronize_rcu_bh - wait until an rcu_bh grace period has elapsed. 2980 - * 2981 - * Control will return to the caller some time after a full rcu_bh grace 2982 - * period has elapsed, in other words after all currently executing rcu_bh 2983 - * read-side critical sections have completed. RCU read-side critical 2984 - * sections are delimited by rcu_read_lock_bh() and rcu_read_unlock_bh(), 2985 - * and may be nested. 2986 - * 2987 - * See the description of synchronize_sched() for more detailed information 2988 - * on memory ordering guarantees. 
2989 - */ 2990 - void synchronize_rcu_bh(void) 2991 - { 2992 - RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || 2993 - lock_is_held(&rcu_lock_map) || 2994 - lock_is_held(&rcu_sched_lock_map), 2995 - "Illegal synchronize_rcu_bh() in RCU-bh read-side critical section"); 2996 - if (rcu_blocking_is_gp()) 2997 - return; 2998 - if (rcu_gp_is_expedited()) 2999 - synchronize_rcu_bh_expedited(); 3000 - else 3001 - wait_rcu_gp(call_rcu_bh); 3002 - } 3003 - EXPORT_SYMBOL_GPL(synchronize_rcu_bh); 3004 3083 3005 3084 /** 3006 3085 * get_state_synchronize_rcu - Snapshot current RCU state ··· 2918 3193 * before the load from ->gp_seq. 2919 3194 */ 2920 3195 smp_mb(); /* ^^^ */ 2921 - return rcu_seq_snap(&rcu_state_p->gp_seq); 3196 + return rcu_seq_snap(&rcu_state.gp_seq); 2922 3197 } 2923 3198 EXPORT_SYMBOL_GPL(get_state_synchronize_rcu); 2924 3199 ··· 2938 3213 */ 2939 3214 void cond_synchronize_rcu(unsigned long oldstate) 2940 3215 { 2941 - if (!rcu_seq_done(&rcu_state_p->gp_seq, oldstate)) 3216 + if (!rcu_seq_done(&rcu_state.gp_seq, oldstate)) 2942 3217 synchronize_rcu(); 2943 3218 else 2944 3219 smp_mb(); /* Ensure GP ends before subsequent accesses. */ 2945 3220 } 2946 3221 EXPORT_SYMBOL_GPL(cond_synchronize_rcu); 2947 3222 2948 - /** 2949 - * get_state_synchronize_sched - Snapshot current RCU-sched state 2950 - * 2951 - * Returns a cookie that is used by a later call to cond_synchronize_sched() 2952 - * to determine whether or not a full grace period has elapsed in the 2953 - * meantime. 2954 - */ 2955 - unsigned long get_state_synchronize_sched(void) 2956 - { 2957 - /* 2958 - * Any prior manipulation of RCU-protected data must happen 2959 - * before the load from ->gp_seq. 
2960 - */ 2961 - smp_mb(); /* ^^^ */ 2962 - return rcu_seq_snap(&rcu_sched_state.gp_seq); 2963 - } 2964 - EXPORT_SYMBOL_GPL(get_state_synchronize_sched); 2965 - 2966 - /** 2967 - * cond_synchronize_sched - Conditionally wait for an RCU-sched grace period 2968 - * 2969 - * @oldstate: return value from earlier call to get_state_synchronize_sched() 2970 - * 2971 - * If a full RCU-sched grace period has elapsed since the earlier call to 2972 - * get_state_synchronize_sched(), just return. Otherwise, invoke 2973 - * synchronize_sched() to wait for a full grace period. 2974 - * 2975 - * Yes, this function does not take counter wrap into account. But 2976 - * counter wrap is harmless. If the counter wraps, we have waited for 2977 - * more than 2 billion grace periods (and way more on a 64-bit system!), 2978 - * so waiting for one additional grace period should be just fine. 2979 - */ 2980 - void cond_synchronize_sched(unsigned long oldstate) 2981 - { 2982 - if (!rcu_seq_done(&rcu_sched_state.gp_seq, oldstate)) 2983 - synchronize_sched(); 2984 - else 2985 - smp_mb(); /* Ensure GP ends before subsequent accesses. */ 2986 - } 2987 - EXPORT_SYMBOL_GPL(cond_synchronize_sched); 2988 - 2989 3223 /* 2990 - * Check to see if there is any immediate RCU-related work to be done 2991 - * by the current CPU, for the specified type of RCU, returning 1 if so. 2992 - * The checks are in order of increasing expense: checks that can be 2993 - * carried out against CPU-local state are performed first. However, 2994 - * we must check for CPU stalls first, else we might not get a chance. 3224 + * Check to see if there is any immediate RCU-related work to be done by 3225 + * the current CPU, returning 1 if so and zero otherwise. The checks are 3226 + * in order of increasing expense: checks that can be carried out against 3227 + * CPU-local state are performed first. However, we must check for CPU 3228 + * stalls first, else we might not get a chance. 
2995 3229 */ 2996 - static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp) 3230 + static int rcu_pending(void) 2997 3231 { 3232 + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); 2998 3233 struct rcu_node *rnp = rdp->mynode; 2999 3234 3000 3235 /* Check for CPU stalls, if enabled. */ 3001 - check_cpu_stall(rsp, rdp); 3236 + check_cpu_stall(rdp); 3002 3237 3003 3238 /* Is this CPU a NO_HZ_FULL CPU that should ignore RCU? */ 3004 - if (rcu_nohz_full_cpu(rsp)) 3239 + if (rcu_nohz_full_cpu()) 3005 3240 return 0; 3006 3241 3007 3242 /* Is the RCU core waiting for a quiescent state from this CPU? */ ··· 2973 3288 return 1; 2974 3289 2975 3290 /* Has RCU gone idle with this CPU needing another grace period? */ 2976 - if (!rcu_gp_in_progress(rsp) && 3291 + if (!rcu_gp_in_progress() && 2977 3292 rcu_segcblist_is_enabled(&rdp->cblist) && 2978 3293 !rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL)) 2979 3294 return 1; ··· 2992 3307 } 2993 3308 2994 3309 /* 2995 - * Check to see if there is any immediate RCU-related work to be done 2996 - * by the current CPU, returning 1 if so. This function is part of the 2997 - * RCU implementation; it is -not- an exported member of the RCU API. 2998 - */ 2999 - static int rcu_pending(void) 3000 - { 3001 - struct rcu_state *rsp; 3002 - 3003 - for_each_rcu_flavor(rsp) 3004 - if (__rcu_pending(rsp, this_cpu_ptr(rsp->rda))) 3005 - return 1; 3006 - return 0; 3007 - } 3008 - 3009 - /* 3010 3310 * Return true if the specified CPU has any callback. If all_lazy is 3011 3311 * non-NULL, store an indication of whether all callbacks are lazy. 3012 3312 * (If there are no callbacks, all of them are deemed to be lazy.) 
··· 3001 3331 bool al = true; 3002 3332 bool hc = false; 3003 3333 struct rcu_data *rdp; 3004 - struct rcu_state *rsp; 3005 3334 3006 - for_each_rcu_flavor(rsp) { 3007 - rdp = this_cpu_ptr(rsp->rda); 3008 - if (rcu_segcblist_empty(&rdp->cblist)) 3009 - continue; 3335 + rdp = this_cpu_ptr(&rcu_data); 3336 + if (!rcu_segcblist_empty(&rdp->cblist)) { 3010 3337 hc = true; 3011 - if (rcu_segcblist_n_nonlazy_cbs(&rdp->cblist) || !all_lazy) { 3338 + if (rcu_segcblist_n_nonlazy_cbs(&rdp->cblist)) 3012 3339 al = false; 3013 - break; 3014 - } 3015 3340 } 3016 3341 if (all_lazy) 3017 3342 *all_lazy = al; ··· 3014 3349 } 3015 3350 3016 3351 /* 3017 - * Helper function for _rcu_barrier() tracing. If tracing is disabled, 3352 + * Helper function for rcu_barrier() tracing. If tracing is disabled, 3018 3353 * the compiler is expected to optimize this away. 3019 3354 */ 3020 - static void _rcu_barrier_trace(struct rcu_state *rsp, const char *s, 3021 - int cpu, unsigned long done) 3355 + static void rcu_barrier_trace(const char *s, int cpu, unsigned long done) 3022 3356 { 3023 - trace_rcu_barrier(rsp->name, s, cpu, 3024 - atomic_read(&rsp->barrier_cpu_count), done); 3357 + trace_rcu_barrier(rcu_state.name, s, cpu, 3358 + atomic_read(&rcu_state.barrier_cpu_count), done); 3025 3359 } 3026 3360 3027 3361 /* 3028 - * RCU callback function for _rcu_barrier(). If we are last, wake 3029 - * up the task executing _rcu_barrier(). 3362 + * RCU callback function for rcu_barrier(). If we are last, wake 3363 + * up the task executing rcu_barrier(). 
3030 3364 */ 3031 3365 static void rcu_barrier_callback(struct rcu_head *rhp) 3032 3366 { 3033 - struct rcu_data *rdp = container_of(rhp, struct rcu_data, barrier_head); 3034 - struct rcu_state *rsp = rdp->rsp; 3035 - 3036 - if (atomic_dec_and_test(&rsp->barrier_cpu_count)) { 3037 - _rcu_barrier_trace(rsp, TPS("LastCB"), -1, 3038 - rsp->barrier_sequence); 3039 - complete(&rsp->barrier_completion); 3367 + if (atomic_dec_and_test(&rcu_state.barrier_cpu_count)) { 3368 + rcu_barrier_trace(TPS("LastCB"), -1, 3369 + rcu_state.barrier_sequence); 3370 + complete(&rcu_state.barrier_completion); 3040 3371 } else { 3041 - _rcu_barrier_trace(rsp, TPS("CB"), -1, rsp->barrier_sequence); 3372 + rcu_barrier_trace(TPS("CB"), -1, rcu_state.barrier_sequence); 3042 3373 } 3043 3374 } 3044 3375 3045 3376 /* 3046 3377 * Called with preemption disabled, and from cross-cpu IRQ context. 3047 3378 */ 3048 - static void rcu_barrier_func(void *type) 3379 + static void rcu_barrier_func(void *unused) 3049 3380 { 3050 - struct rcu_state *rsp = type; 3051 - struct rcu_data *rdp = raw_cpu_ptr(rsp->rda); 3381 + struct rcu_data *rdp = raw_cpu_ptr(&rcu_data); 3052 3382 3053 - _rcu_barrier_trace(rsp, TPS("IRQ"), -1, rsp->barrier_sequence); 3383 + rcu_barrier_trace(TPS("IRQ"), -1, rcu_state.barrier_sequence); 3054 3384 rdp->barrier_head.func = rcu_barrier_callback; 3055 3385 debug_rcu_head_queue(&rdp->barrier_head); 3056 3386 if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head, 0)) { 3057 - atomic_inc(&rsp->barrier_cpu_count); 3387 + atomic_inc(&rcu_state.barrier_cpu_count); 3058 3388 } else { 3059 3389 debug_rcu_head_unqueue(&rdp->barrier_head); 3060 - _rcu_barrier_trace(rsp, TPS("IRQNQ"), -1, 3061 - rsp->barrier_sequence); 3390 + rcu_barrier_trace(TPS("IRQNQ"), -1, 3391 + rcu_state.barrier_sequence); 3062 3392 } 3063 3393 } 3064 3394 3065 - /* 3066 - * Orchestrate the specified type of RCU barrier, waiting for all 3067 - * RCU callbacks of the specified type to complete. 
3395 + /** 3396 + * rcu_barrier - Wait until all in-flight call_rcu() callbacks complete. 3397 + * 3398 + * Note that this primitive does not necessarily wait for an RCU grace period 3399 + * to complete. For example, if there are no RCU callbacks queued anywhere 3400 + * in the system, then rcu_barrier() is within its rights to return 3401 + * immediately, without waiting for anything, much less an RCU grace period. 3068 3402 */ 3069 - static void _rcu_barrier(struct rcu_state *rsp) 3403 + void rcu_barrier(void) 3070 3404 { 3071 3405 int cpu; 3072 3406 struct rcu_data *rdp; 3073 - unsigned long s = rcu_seq_snap(&rsp->barrier_sequence); 3407 + unsigned long s = rcu_seq_snap(&rcu_state.barrier_sequence); 3074 3408 3075 - _rcu_barrier_trace(rsp, TPS("Begin"), -1, s); 3409 + rcu_barrier_trace(TPS("Begin"), -1, s); 3076 3410 3077 3411 /* Take mutex to serialize concurrent rcu_barrier() requests. */ 3078 - mutex_lock(&rsp->barrier_mutex); 3412 + mutex_lock(&rcu_state.barrier_mutex); 3079 3413 3080 3414 /* Did someone else do our work for us? */ 3081 - if (rcu_seq_done(&rsp->barrier_sequence, s)) { 3082 - _rcu_barrier_trace(rsp, TPS("EarlyExit"), -1, 3083 - rsp->barrier_sequence); 3415 + if (rcu_seq_done(&rcu_state.barrier_sequence, s)) { 3416 + rcu_barrier_trace(TPS("EarlyExit"), -1, 3417 + rcu_state.barrier_sequence); 3084 3418 smp_mb(); /* caller's subsequent code after above check. */ 3085 - mutex_unlock(&rsp->barrier_mutex); 3419 + mutex_unlock(&rcu_state.barrier_mutex); 3086 3420 return; 3087 3421 } 3088 3422 3089 3423 /* Mark the start of the barrier operation. */ 3090 - rcu_seq_start(&rsp->barrier_sequence); 3091 - _rcu_barrier_trace(rsp, TPS("Inc1"), -1, rsp->barrier_sequence); 3424 + rcu_seq_start(&rcu_state.barrier_sequence); 3425 + rcu_barrier_trace(TPS("Inc1"), -1, rcu_state.barrier_sequence); 3092 3426 3093 3427 /* 3094 3428 * Initialize the count to one rather than to zero in order to ··· 3095 3431 * (or preemption of this task). 
Exclude CPU-hotplug operations 3096 3432 * to ensure that no offline CPU has callbacks queued. 3097 3433 */ 3098 - init_completion(&rsp->barrier_completion); 3099 - atomic_set(&rsp->barrier_cpu_count, 1); 3434 + init_completion(&rcu_state.barrier_completion); 3435 + atomic_set(&rcu_state.barrier_cpu_count, 1); 3100 3436 get_online_cpus(); 3101 3437 3102 3438 /* ··· 3107 3443 for_each_possible_cpu(cpu) { 3108 3444 if (!cpu_online(cpu) && !rcu_is_nocb_cpu(cpu)) 3109 3445 continue; 3110 - rdp = per_cpu_ptr(rsp->rda, cpu); 3446 + rdp = per_cpu_ptr(&rcu_data, cpu); 3111 3447 if (rcu_is_nocb_cpu(cpu)) { 3112 - if (!rcu_nocb_cpu_needs_barrier(rsp, cpu)) { 3113 - _rcu_barrier_trace(rsp, TPS("OfflineNoCB"), cpu, 3114 - rsp->barrier_sequence); 3448 + if (!rcu_nocb_cpu_needs_barrier(cpu)) { 3449 + rcu_barrier_trace(TPS("OfflineNoCB"), cpu, 3450 + rcu_state.barrier_sequence); 3115 3451 } else { 3116 - _rcu_barrier_trace(rsp, TPS("OnlineNoCB"), cpu, 3117 - rsp->barrier_sequence); 3452 + rcu_barrier_trace(TPS("OnlineNoCB"), cpu, 3453 + rcu_state.barrier_sequence); 3118 3454 smp_mb__before_atomic(); 3119 - atomic_inc(&rsp->barrier_cpu_count); 3455 + atomic_inc(&rcu_state.barrier_cpu_count); 3120 3456 __call_rcu(&rdp->barrier_head, 3121 - rcu_barrier_callback, rsp, cpu, 0); 3457 + rcu_barrier_callback, cpu, 0); 3122 3458 } 3123 3459 } else if (rcu_segcblist_n_cbs(&rdp->cblist)) { 3124 - _rcu_barrier_trace(rsp, TPS("OnlineQ"), cpu, 3125 - rsp->barrier_sequence); 3126 - smp_call_function_single(cpu, rcu_barrier_func, rsp, 1); 3460 + rcu_barrier_trace(TPS("OnlineQ"), cpu, 3461 + rcu_state.barrier_sequence); 3462 + smp_call_function_single(cpu, rcu_barrier_func, NULL, 1); 3127 3463 } else { 3128 - _rcu_barrier_trace(rsp, TPS("OnlineNQ"), cpu, 3129 - rsp->barrier_sequence); 3464 + rcu_barrier_trace(TPS("OnlineNQ"), cpu, 3465 + rcu_state.barrier_sequence); 3130 3466 } 3131 3467 } 3132 3468 put_online_cpus(); ··· 3135 3471 * Now that we have an rcu_barrier_callback() callback on each 
3136 3472 * CPU, and thus each counted, remove the initial count. 3137 3473 */ 3138 - if (atomic_dec_and_test(&rsp->barrier_cpu_count)) 3139 - complete(&rsp->barrier_completion); 3474 + if (atomic_dec_and_test(&rcu_state.barrier_cpu_count)) 3475 + complete(&rcu_state.barrier_completion); 3140 3476 3141 3477 /* Wait for all rcu_barrier_callback() callbacks to be invoked. */ 3142 - wait_for_completion(&rsp->barrier_completion); 3478 + wait_for_completion(&rcu_state.barrier_completion); 3143 3479 3144 3480 /* Mark the end of the barrier operation. */ 3145 - _rcu_barrier_trace(rsp, TPS("Inc2"), -1, rsp->barrier_sequence); 3146 - rcu_seq_end(&rsp->barrier_sequence); 3481 + rcu_barrier_trace(TPS("Inc2"), -1, rcu_state.barrier_sequence); 3482 + rcu_seq_end(&rcu_state.barrier_sequence); 3147 3483 3148 3484 /* Other rcu_barrier() invocations can now safely proceed. */ 3149 - mutex_unlock(&rsp->barrier_mutex); 3485 + mutex_unlock(&rcu_state.barrier_mutex); 3150 3486 } 3151 - 3152 - /** 3153 - * rcu_barrier_bh - Wait until all in-flight call_rcu_bh() callbacks complete. 3154 - */ 3155 - void rcu_barrier_bh(void) 3156 - { 3157 - _rcu_barrier(&rcu_bh_state); 3158 - } 3159 - EXPORT_SYMBOL_GPL(rcu_barrier_bh); 3160 - 3161 - /** 3162 - * rcu_barrier_sched - Wait for in-flight call_rcu_sched() callbacks. 3163 - */ 3164 - void rcu_barrier_sched(void) 3165 - { 3166 - _rcu_barrier(&rcu_sched_state); 3167 - } 3168 - EXPORT_SYMBOL_GPL(rcu_barrier_sched); 3487 + EXPORT_SYMBOL_GPL(rcu_barrier); 3169 3488 3170 3489 /* 3171 3490 * Propagate ->qsinitmask bits up the rcu_node tree to account for the ··· 3182 3535 * Do boot-time initialization of a CPU's per-CPU RCU data. 
3183 3536 */ 3184 3537 static void __init 3185 - rcu_boot_init_percpu_data(int cpu, struct rcu_state *rsp) 3538 + rcu_boot_init_percpu_data(int cpu) 3186 3539 { 3187 - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); 3540 + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); 3188 3541 3189 3542 /* Set up local state, ensuring consistent view of global state. */ 3190 3543 rdp->grpmask = leaf_node_cpu_bit(rdp->mynode, cpu); 3191 - rdp->dynticks = &per_cpu(rcu_dynticks, cpu); 3192 - WARN_ON_ONCE(rdp->dynticks->dynticks_nesting != 1); 3193 - WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(rdp->dynticks))); 3194 - rdp->rcu_ofl_gp_seq = rsp->gp_seq; 3544 + WARN_ON_ONCE(rdp->dynticks_nesting != 1); 3545 + WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(rdp))); 3546 + rdp->rcu_ofl_gp_seq = rcu_state.gp_seq; 3195 3547 rdp->rcu_ofl_gp_flags = RCU_GP_CLEANED; 3196 - rdp->rcu_onl_gp_seq = rsp->gp_seq; 3548 + rdp->rcu_onl_gp_seq = rcu_state.gp_seq; 3197 3549 rdp->rcu_onl_gp_flags = RCU_GP_CLEANED; 3198 3550 rdp->cpu = cpu; 3199 - rdp->rsp = rsp; 3200 3551 rcu_boot_init_nocb_percpu_data(rdp); 3201 3552 } 3202 3553 3203 3554 /* 3204 - * Initialize a CPU's per-CPU RCU data. Note that only one online or 3555 + * Invoked early in the CPU-online process, when pretty much all services 3556 + * are available. The incoming CPU is not present. 3557 + * 3558 + * Initializes a CPU's per-CPU RCU data. Note that only one online or 3205 3559 * offline event can be happening at a given time. Note also that we can 3206 3560 * accept some slop in the rsp->gp_seq access due to the fact that this 3207 3561 * CPU cannot possibly have any RCU callbacks in flight yet. 
3208 3562 */ 3209 - static void 3210 - rcu_init_percpu_data(int cpu, struct rcu_state *rsp) 3563 + int rcutree_prepare_cpu(unsigned int cpu) 3211 3564 { 3212 3565 unsigned long flags; 3213 - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); 3214 - struct rcu_node *rnp = rcu_get_root(rsp); 3566 + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); 3567 + struct rcu_node *rnp = rcu_get_root(); 3215 3568 3216 3569 /* Set up local state, ensuring consistent view of global state. */ 3217 3570 raw_spin_lock_irqsave_rcu_node(rnp, flags); 3218 3571 rdp->qlen_last_fqs_check = 0; 3219 - rdp->n_force_qs_snap = rsp->n_force_qs; 3572 + rdp->n_force_qs_snap = rcu_state.n_force_qs; 3220 3573 rdp->blimit = blimit; 3221 3574 if (rcu_segcblist_empty(&rdp->cblist) && /* No early-boot CBs? */ 3222 3575 !init_nocb_callback_list(rdp)) 3223 3576 rcu_segcblist_init(&rdp->cblist); /* Re-enable callbacks. */ 3224 - rdp->dynticks->dynticks_nesting = 1; /* CPU not up, no tearing. */ 3577 + rdp->dynticks_nesting = 1; /* CPU not up, no tearing. */ 3225 3578 rcu_dynticks_eqs_online(); 3226 3579 raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */ 3227 3580 ··· 3236 3589 rdp->gp_seq = rnp->gp_seq; 3237 3590 rdp->gp_seq_needed = rnp->gp_seq; 3238 3591 rdp->cpu_no_qs.b.norm = true; 3239 - rdp->rcu_qs_ctr_snap = per_cpu(rcu_dynticks.rcu_qs_ctr, cpu); 3240 3592 rdp->core_needs_qs = false; 3241 3593 rdp->rcu_iw_pending = false; 3242 3594 rdp->rcu_iw_gp_seq = rnp->gp_seq - 1; 3243 - trace_rcu_grace_period(rsp->name, rdp->gp_seq, TPS("cpuonl")); 3595 + trace_rcu_grace_period(rcu_state.name, rdp->gp_seq, TPS("cpuonl")); 3244 3596 raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 3245 - } 3246 - 3247 - /* 3248 - * Invoked early in the CPU-online process, when pretty much all 3249 - * services are available. The incoming CPU is not present. 
3250 - */ 3251 - int rcutree_prepare_cpu(unsigned int cpu) 3252 - { 3253 - struct rcu_state *rsp; 3254 - 3255 - for_each_rcu_flavor(rsp) 3256 - rcu_init_percpu_data(cpu, rsp); 3257 - 3258 3597 rcu_prepare_kthreads(cpu); 3259 3598 rcu_spawn_all_nocb_kthreads(cpu); 3260 3599 ··· 3252 3619 */ 3253 3620 static void rcutree_affinity_setting(unsigned int cpu, int outgoing) 3254 3621 { 3255 - struct rcu_data *rdp = per_cpu_ptr(rcu_state_p->rda, cpu); 3622 + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); 3256 3623 3257 3624 rcu_boost_kthread_setaffinity(rdp->mynode, outgoing); 3258 3625 } ··· 3266 3633 unsigned long flags; 3267 3634 struct rcu_data *rdp; 3268 3635 struct rcu_node *rnp; 3269 - struct rcu_state *rsp; 3270 3636 3271 - for_each_rcu_flavor(rsp) { 3272 - rdp = per_cpu_ptr(rsp->rda, cpu); 3273 - rnp = rdp->mynode; 3274 - raw_spin_lock_irqsave_rcu_node(rnp, flags); 3275 - rnp->ffmask |= rdp->grpmask; 3276 - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 3277 - } 3637 + rdp = per_cpu_ptr(&rcu_data, cpu); 3638 + rnp = rdp->mynode; 3639 + raw_spin_lock_irqsave_rcu_node(rnp, flags); 3640 + rnp->ffmask |= rdp->grpmask; 3641 + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 3278 3642 if (IS_ENABLED(CONFIG_TREE_SRCU)) 3279 3643 srcu_online_cpu(cpu); 3280 3644 if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE) ··· 3290 3660 unsigned long flags; 3291 3661 struct rcu_data *rdp; 3292 3662 struct rcu_node *rnp; 3293 - struct rcu_state *rsp; 3294 3663 3295 - for_each_rcu_flavor(rsp) { 3296 - rdp = per_cpu_ptr(rsp->rda, cpu); 3297 - rnp = rdp->mynode; 3298 - raw_spin_lock_irqsave_rcu_node(rnp, flags); 3299 - rnp->ffmask &= ~rdp->grpmask; 3300 - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 3301 - } 3664 + rdp = per_cpu_ptr(&rcu_data, cpu); 3665 + rnp = rdp->mynode; 3666 + raw_spin_lock_irqsave_rcu_node(rnp, flags); 3667 + rnp->ffmask &= ~rdp->grpmask; 3668 + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 3302 3669 3303 3670 rcutree_affinity_setting(cpu, cpu); 3304 
3671 if (IS_ENABLED(CONFIG_TREE_SRCU)) 3305 3672 srcu_offline_cpu(cpu); 3306 - return 0; 3307 - } 3308 - 3309 - /* 3310 - * Near the end of the offline process. We do only tracing here. 3311 - */ 3312 - int rcutree_dying_cpu(unsigned int cpu) 3313 - { 3314 - struct rcu_state *rsp; 3315 - 3316 - for_each_rcu_flavor(rsp) 3317 - rcu_cleanup_dying_cpu(rsp); 3318 - return 0; 3319 - } 3320 - 3321 - /* 3322 - * The outgoing CPU is gone and we are running elsewhere. 3323 - */ 3324 - int rcutree_dead_cpu(unsigned int cpu) 3325 - { 3326 - struct rcu_state *rsp; 3327 - 3328 - for_each_rcu_flavor(rsp) { 3329 - rcu_cleanup_dead_cpu(cpu, rsp); 3330 - do_nocb_deferred_wakeup(per_cpu_ptr(rsp->rda, cpu)); 3331 - } 3332 3673 return 0; 3333 3674 } 3334 3675 ··· 3324 3723 unsigned long oldmask; 3325 3724 struct rcu_data *rdp; 3326 3725 struct rcu_node *rnp; 3327 - struct rcu_state *rsp; 3328 3726 3329 3727 if (per_cpu(rcu_cpu_started, cpu)) 3330 3728 return; 3331 3729 3332 3730 per_cpu(rcu_cpu_started, cpu) = 1; 3333 3731 3334 - for_each_rcu_flavor(rsp) { 3335 - rdp = per_cpu_ptr(rsp->rda, cpu); 3336 - rnp = rdp->mynode; 3337 - mask = rdp->grpmask; 3338 - raw_spin_lock_irqsave_rcu_node(rnp, flags); 3339 - rnp->qsmaskinitnext |= mask; 3340 - oldmask = rnp->expmaskinitnext; 3341 - rnp->expmaskinitnext |= mask; 3342 - oldmask ^= rnp->expmaskinitnext; 3343 - nbits = bitmap_weight(&oldmask, BITS_PER_LONG); 3344 - /* Allow lockless access for expedited grace periods. */ 3345 - smp_store_release(&rsp->ncpus, rsp->ncpus + nbits); /* ^^^ */ 3346 - rcu_gpnum_ovf(rnp, rdp); /* Offline-induced counter wrap? */ 3347 - rdp->rcu_onl_gp_seq = READ_ONCE(rsp->gp_seq); 3348 - rdp->rcu_onl_gp_flags = READ_ONCE(rsp->gp_flags); 3349 - if (rnp->qsmask & mask) { /* RCU waiting on incoming CPU? */ 3350 - /* Report QS -after- changing ->qsmaskinitnext! 
*/ 3351 - rcu_report_qs_rnp(mask, rsp, rnp, rnp->gp_seq, flags); 3352 - } else { 3353 - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 3354 - } 3732 + rdp = per_cpu_ptr(&rcu_data, cpu); 3733 + rnp = rdp->mynode; 3734 + mask = rdp->grpmask; 3735 + raw_spin_lock_irqsave_rcu_node(rnp, flags); 3736 + rnp->qsmaskinitnext |= mask; 3737 + oldmask = rnp->expmaskinitnext; 3738 + rnp->expmaskinitnext |= mask; 3739 + oldmask ^= rnp->expmaskinitnext; 3740 + nbits = bitmap_weight(&oldmask, BITS_PER_LONG); 3741 + /* Allow lockless access for expedited grace periods. */ 3742 + smp_store_release(&rcu_state.ncpus, rcu_state.ncpus + nbits); /* ^^^ */ 3743 + rcu_gpnum_ovf(rnp, rdp); /* Offline-induced counter wrap? */ 3744 + rdp->rcu_onl_gp_seq = READ_ONCE(rcu_state.gp_seq); 3745 + rdp->rcu_onl_gp_flags = READ_ONCE(rcu_state.gp_flags); 3746 + if (rnp->qsmask & mask) { /* RCU waiting on incoming CPU? */ 3747 + /* Report QS -after- changing ->qsmaskinitnext! */ 3748 + rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); 3749 + } else { 3750 + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 3355 3751 } 3356 3752 smp_mb(); /* Ensure RCU read-side usage follows above initialization. */ 3357 3753 } 3358 3754 3359 3755 #ifdef CONFIG_HOTPLUG_CPU 3360 3756 /* 3361 - * The CPU is exiting the idle loop into the arch_cpu_idle_dead() 3362 - * function. We now remove it from the rcu_node tree's ->qsmaskinitnext 3363 - * bit masks. 3364 - */ 3365 - static void rcu_cleanup_dying_idle_cpu(int cpu, struct rcu_state *rsp) 3366 - { 3367 - unsigned long flags; 3368 - unsigned long mask; 3369 - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); 3370 - struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */ 3371 - 3372 - /* Remove outgoing CPU from mask in the leaf rcu_node structure. */ 3373 - mask = rdp->grpmask; 3374 - spin_lock(&rsp->ofl_lock); 3375 - raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Enforce GP memory-order guarantee. 
*/ 3376 - rdp->rcu_ofl_gp_seq = READ_ONCE(rsp->gp_seq); 3377 - rdp->rcu_ofl_gp_flags = READ_ONCE(rsp->gp_flags); 3378 - if (rnp->qsmask & mask) { /* RCU waiting on outgoing CPU? */ 3379 - /* Report quiescent state -before- changing ->qsmaskinitnext! */ 3380 - rcu_report_qs_rnp(mask, rsp, rnp, rnp->gp_seq, flags); 3381 - raw_spin_lock_irqsave_rcu_node(rnp, flags); 3382 - } 3383 - rnp->qsmaskinitnext &= ~mask; 3384 - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 3385 - spin_unlock(&rsp->ofl_lock); 3386 - } 3387 - 3388 - /* 3389 3757 * The outgoing function has no further need of RCU, so remove it from 3390 - * the list of CPUs that RCU must track. 3758 + * the rcu_node tree's ->qsmaskinitnext bit masks. 3391 3759 * 3392 3760 * Note that this function is special in that it is invoked directly 3393 3761 * from the outgoing CPU rather than from the cpuhp_step mechanism. ··· 3364 3794 */ 3365 3795 void rcu_report_dead(unsigned int cpu) 3366 3796 { 3367 - struct rcu_state *rsp; 3797 + unsigned long flags; 3798 + unsigned long mask; 3799 + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); 3800 + struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */ 3368 3801 3369 - /* QS for any half-done expedited RCU-sched GP. */ 3802 + /* QS for any half-done expedited grace period. */ 3370 3803 preempt_disable(); 3371 - rcu_report_exp_rdp(&rcu_sched_state, 3372 - this_cpu_ptr(rcu_sched_state.rda), true); 3804 + rcu_report_exp_rdp(this_cpu_ptr(&rcu_data)); 3373 3805 preempt_enable(); 3374 - for_each_rcu_flavor(rsp) 3375 - rcu_cleanup_dying_idle_cpu(cpu, rsp); 3806 + rcu_preempt_deferred_qs(current); 3807 + 3808 + /* Remove outgoing CPU from mask in the leaf rcu_node structure. */ 3809 + mask = rdp->grpmask; 3810 + raw_spin_lock(&rcu_state.ofl_lock); 3811 + raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Enforce GP memory-order guarantee. 
*/ 3812 + rdp->rcu_ofl_gp_seq = READ_ONCE(rcu_state.gp_seq); 3813 + rdp->rcu_ofl_gp_flags = READ_ONCE(rcu_state.gp_flags); 3814 + if (rnp->qsmask & mask) { /* RCU waiting on outgoing CPU? */ 3815 + /* Report quiescent state -before- changing ->qsmaskinitnext! */ 3816 + rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags); 3817 + raw_spin_lock_irqsave_rcu_node(rnp, flags); 3818 + } 3819 + rnp->qsmaskinitnext &= ~mask; 3820 + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 3821 + raw_spin_unlock(&rcu_state.ofl_lock); 3376 3822 3377 3823 per_cpu(rcu_cpu_started, cpu) = 0; 3378 3824 } 3379 3825 3380 - /* Migrate the dead CPU's callbacks to the current CPU. */ 3381 - static void rcu_migrate_callbacks(int cpu, struct rcu_state *rsp) 3826 + /* 3827 + * The outgoing CPU has just passed through the dying-idle state, and we 3828 + * are being invoked from the CPU that was IPIed to continue the offline 3829 + * operation. Migrate the outgoing CPU's callbacks to the current CPU. 3830 + */ 3831 + void rcutree_migrate_callbacks(int cpu) 3382 3832 { 3383 3833 unsigned long flags; 3384 3834 struct rcu_data *my_rdp; 3385 - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); 3386 - struct rcu_node *rnp_root = rcu_get_root(rdp->rsp); 3835 + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); 3836 + struct rcu_node *rnp_root = rcu_get_root(); 3387 3837 bool needwake; 3388 3838 3389 3839 if (rcu_is_nocb_cpu(cpu) || rcu_segcblist_empty(&rdp->cblist)) 3390 3840 return; /* No callbacks to migrate. */ 3391 3841 3392 3842 local_irq_save(flags); 3393 - my_rdp = this_cpu_ptr(rsp->rda); 3843 + my_rdp = this_cpu_ptr(&rcu_data); 3394 3844 if (rcu_nocb_adopt_orphan_cbs(my_rdp, rdp, flags)) { 3395 3845 local_irq_restore(flags); 3396 3846 return; 3397 3847 } 3398 3848 raw_spin_lock_rcu_node(rnp_root); /* irqs already disabled. */ 3399 3849 /* Leverage recent GPs and set GP for new callbacks. 
*/ 3400 - needwake = rcu_advance_cbs(rsp, rnp_root, rdp) || 3401 - rcu_advance_cbs(rsp, rnp_root, my_rdp); 3850 + needwake = rcu_advance_cbs(rnp_root, rdp) || 3851 + rcu_advance_cbs(rnp_root, my_rdp); 3402 3852 rcu_segcblist_merge(&my_rdp->cblist, &rdp->cblist); 3403 3853 WARN_ON_ONCE(rcu_segcblist_empty(&my_rdp->cblist) != 3404 3854 !rcu_segcblist_n_cbs(&my_rdp->cblist)); 3405 3855 raw_spin_unlock_irqrestore_rcu_node(rnp_root, flags); 3406 3856 if (needwake) 3407 - rcu_gp_kthread_wake(rsp); 3857 + rcu_gp_kthread_wake(); 3408 3858 WARN_ONCE(rcu_segcblist_n_cbs(&rdp->cblist) != 0 || 3409 3859 !rcu_segcblist_empty(&rdp->cblist), 3410 3860 "rcu_cleanup_dead_cpu: Callbacks on offline CPU %d: qlen=%lu, 1stCB=%p\n", 3411 3861 cpu, rcu_segcblist_n_cbs(&rdp->cblist), 3412 3862 rcu_segcblist_first_cb(&rdp->cblist)); 3413 - } 3414 - 3415 - /* 3416 - * The outgoing CPU has just passed through the dying-idle state, 3417 - * and we are being invoked from the CPU that was IPIed to continue the 3418 - * offline operation. We need to migrate the outgoing CPU's callbacks. 3419 - */ 3420 - void rcutree_migrate_callbacks(int cpu) 3421 - { 3422 - struct rcu_state *rsp; 3423 - 3424 - for_each_rcu_flavor(rsp) 3425 - rcu_migrate_callbacks(cpu, rsp); 3426 3863 } 3427 3864 #endif 3428 3865 ··· 3458 3881 } 3459 3882 3460 3883 /* 3461 - * Spawn the kthreads that handle each RCU flavor's grace periods. 3884 + * Spawn the kthreads that handle RCU's grace periods. 
3462 3885 */ 3463 3886 static int __init rcu_spawn_gp_kthread(void) 3464 3887 { 3465 3888 unsigned long flags; 3466 3889 int kthread_prio_in = kthread_prio; 3467 3890 struct rcu_node *rnp; 3468 - struct rcu_state *rsp; 3469 3891 struct sched_param sp; 3470 3892 struct task_struct *t; 3471 3893 ··· 3484 3908 kthread_prio, kthread_prio_in); 3485 3909 3486 3910 rcu_scheduler_fully_active = 1; 3487 - for_each_rcu_flavor(rsp) { 3488 - t = kthread_create(rcu_gp_kthread, rsp, "%s", rsp->name); 3489 - BUG_ON(IS_ERR(t)); 3490 - rnp = rcu_get_root(rsp); 3491 - raw_spin_lock_irqsave_rcu_node(rnp, flags); 3492 - rsp->gp_kthread = t; 3493 - if (kthread_prio) { 3494 - sp.sched_priority = kthread_prio; 3495 - sched_setscheduler_nocheck(t, SCHED_FIFO, &sp); 3496 - } 3497 - raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 3498 - wake_up_process(t); 3911 + t = kthread_create(rcu_gp_kthread, NULL, "%s", rcu_state.name); 3912 + BUG_ON(IS_ERR(t)); 3913 + rnp = rcu_get_root(); 3914 + raw_spin_lock_irqsave_rcu_node(rnp, flags); 3915 + rcu_state.gp_kthread = t; 3916 + if (kthread_prio) { 3917 + sp.sched_priority = kthread_prio; 3918 + sched_setscheduler_nocheck(t, SCHED_FIFO, &sp); 3499 3919 } 3920 + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 3921 + wake_up_process(t); 3500 3922 rcu_spawn_nocb_kthreads(); 3501 3923 rcu_spawn_boost_kthreads(); 3502 3924 return 0; ··· 3521 3947 } 3522 3948 3523 3949 /* 3524 - * Helper function for rcu_init() that initializes one rcu_state structure. 3950 + * Helper function for rcu_init() that initializes the rcu_state structure. 3525 3951 */ 3526 - static void __init rcu_init_one(struct rcu_state *rsp) 3952 + static void __init rcu_init_one(void) 3527 3953 { 3528 3954 static const char * const buf[] = RCU_NODE_NAME_INIT; 3529 3955 static const char * const fqs[] = RCU_FQS_NAME_INIT; ··· 3545 3971 /* Initialize the level-tracking arrays. 
*/ 3546 3972 3547 3973 for (i = 1; i < rcu_num_lvls; i++) 3548 - rsp->level[i] = rsp->level[i - 1] + num_rcu_lvl[i - 1]; 3974 + rcu_state.level[i] = 3975 + rcu_state.level[i - 1] + num_rcu_lvl[i - 1]; 3549 3976 rcu_init_levelspread(levelspread, num_rcu_lvl); 3550 3977 3551 3978 /* Initialize the elements themselves, starting from the leaves. */ 3552 3979 3553 3980 for (i = rcu_num_lvls - 1; i >= 0; i--) { 3554 3981 cpustride *= levelspread[i]; 3555 - rnp = rsp->level[i]; 3982 + rnp = rcu_state.level[i]; 3556 3983 for (j = 0; j < num_rcu_lvl[i]; j++, rnp++) { 3557 3984 raw_spin_lock_init(&ACCESS_PRIVATE(rnp, lock)); 3558 3985 lockdep_set_class_and_name(&ACCESS_PRIVATE(rnp, lock), ··· 3561 3986 raw_spin_lock_init(&rnp->fqslock); 3562 3987 lockdep_set_class_and_name(&rnp->fqslock, 3563 3988 &rcu_fqs_class[i], fqs[i]); 3564 - rnp->gp_seq = rsp->gp_seq; 3565 - rnp->gp_seq_needed = rsp->gp_seq; 3566 - rnp->completedqs = rsp->gp_seq; 3989 + rnp->gp_seq = rcu_state.gp_seq; 3990 + rnp->gp_seq_needed = rcu_state.gp_seq; 3991 + rnp->completedqs = rcu_state.gp_seq; 3567 3992 rnp->qsmask = 0; 3568 3993 rnp->qsmaskinit = 0; 3569 3994 rnp->grplo = j * cpustride; ··· 3576 4001 rnp->parent = NULL; 3577 4002 } else { 3578 4003 rnp->grpnum = j % levelspread[i - 1]; 3579 - rnp->grpmask = 1UL << rnp->grpnum; 3580 - rnp->parent = rsp->level[i - 1] + 4004 + rnp->grpmask = BIT(rnp->grpnum); 4005 + rnp->parent = rcu_state.level[i - 1] + 3581 4006 j / levelspread[i - 1]; 3582 4007 } 3583 4008 rnp->level = i; ··· 3591 4016 } 3592 4017 } 3593 4018 3594 - init_swait_queue_head(&rsp->gp_wq); 3595 - init_swait_queue_head(&rsp->expedited_wq); 3596 - rnp = rcu_first_leaf_node(rsp); 4019 + init_swait_queue_head(&rcu_state.gp_wq); 4020 + init_swait_queue_head(&rcu_state.expedited_wq); 4021 + rnp = rcu_first_leaf_node(); 3597 4022 for_each_possible_cpu(i) { 3598 4023 while (i > rnp->grphi) 3599 4024 rnp++; 3600 - per_cpu_ptr(rsp->rda, i)->mynode = rnp; 3601 - rcu_boot_init_percpu_data(i, rsp); 4025 + 
per_cpu_ptr(&rcu_data, i)->mynode = rnp; 4026 + rcu_boot_init_percpu_data(i); 3602 4027 } 3603 - list_add(&rsp->flavors, &rcu_struct_flavors); 3604 4028 } 3605 4029 3606 4030 /* ··· 3625 4051 jiffies_till_first_fqs = d; 3626 4052 if (jiffies_till_next_fqs == ULONG_MAX) 3627 4053 jiffies_till_next_fqs = d; 4054 + if (jiffies_till_sched_qs == ULONG_MAX) 4055 + adjust_jiffies_till_sched_qs(); 3628 4056 3629 4057 /* If the compile-time values are accurate, just leave. */ 3630 4058 if (rcu_fanout_leaf == RCU_FANOUT_LEAF && ··· 3685 4109 3686 4110 /* 3687 4111 * Dump out the structure of the rcu_node combining tree associated 3688 - * with the rcu_state structure referenced by rsp. 4112 + * with the rcu_state structure. 3689 4113 */ 3690 - static void __init rcu_dump_rcu_node_tree(struct rcu_state *rsp) 4114 + static void __init rcu_dump_rcu_node_tree(void) 3691 4115 { 3692 4116 int level = 0; 3693 4117 struct rcu_node *rnp; 3694 4118 3695 4119 pr_info("rcu_node tree layout dump\n"); 3696 4120 pr_info(" "); 3697 - rcu_for_each_node_breadth_first(rsp, rnp) { 4121 + rcu_for_each_node_breadth_first(rnp) { 3698 4122 if (rnp->level != level) { 3699 4123 pr_cont("\n"); 3700 4124 pr_info(" "); ··· 3716 4140 3717 4141 rcu_bootup_announce(); 3718 4142 rcu_init_geometry(); 3719 - rcu_init_one(&rcu_bh_state); 3720 - rcu_init_one(&rcu_sched_state); 4143 + rcu_init_one(); 3721 4144 if (dump_tree) 3722 - rcu_dump_rcu_node_tree(&rcu_sched_state); 3723 - __rcu_init_preempt(); 4145 + rcu_dump_rcu_node_tree(); 3724 4146 open_softirq(RCU_SOFTIRQ, rcu_process_callbacks); 3725 4147 3726 4148 /* ··· 3738 4164 WARN_ON(!rcu_gp_wq); 3739 4165 rcu_par_gp_wq = alloc_workqueue("rcu_par_gp", WQ_MEM_RECLAIM, 0); 3740 4166 WARN_ON(!rcu_par_gp_wq); 4167 + srcu_init(); 3741 4168 } 3742 4169 3743 4170 #include "tree_exp.h"
+56 -76
kernel/rcu/tree.h
··· 34 34 35 35 #include "rcu_segcblist.h" 36 36 37 - /* 38 - * Dynticks per-CPU state. 39 - */ 40 - struct rcu_dynticks { 41 - long dynticks_nesting; /* Track process nesting level. */ 42 - long dynticks_nmi_nesting; /* Track irq/NMI nesting level. */ 43 - atomic_t dynticks; /* Even value for idle, else odd. */ 44 - bool rcu_need_heavy_qs; /* GP old, need heavy quiescent state. */ 45 - unsigned long rcu_qs_ctr; /* Light universal quiescent state ctr. */ 46 - bool rcu_urgent_qs; /* GP old need light quiescent state. */ 47 - #ifdef CONFIG_RCU_FAST_NO_HZ 48 - bool all_lazy; /* Are all CPU's CBs lazy? */ 49 - unsigned long nonlazy_posted; 50 - /* # times non-lazy CBs posted to CPU. */ 51 - unsigned long nonlazy_posted_snap; 52 - /* idle-period nonlazy_posted snapshot. */ 53 - unsigned long last_accelerate; 54 - /* Last jiffy CBs were accelerated. */ 55 - unsigned long last_advance_all; 56 - /* Last jiffy CBs were all advanced. */ 57 - int tick_nohz_enabled_snap; /* Previously seen value from sysfs. */ 58 - #endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */ 59 - }; 60 - 61 37 /* Communicate arguments to a workqueue handler. */ 62 38 struct rcu_exp_work { 63 39 smp_call_func_t rew_func; 64 - struct rcu_state *rew_rsp; 65 40 unsigned long rew_s; 66 41 struct work_struct rew_work; 67 42 }; ··· 145 170 * are indexed relative to this interval rather than the global CPU ID space. 146 171 * This generates the bit for a CPU in node-local masks. 147 172 */ 148 - #define leaf_node_cpu_bit(rnp, cpu) (1UL << ((cpu) - (rnp)->grplo)) 173 + #define leaf_node_cpu_bit(rnp, cpu) (BIT((cpu) - (rnp)->grplo)) 149 174 150 175 /* 151 176 * Union to allow "aggregate OR" operation on the need for a quiescent ··· 164 189 /* 1) quiescent-state and grace-period handling : */ 165 190 unsigned long gp_seq; /* Track rsp->rcu_gp_seq counter. */ 166 191 unsigned long gp_seq_needed; /* Track rsp->rcu_gp_seq_needed ctr. 
*/ 167 - unsigned long rcu_qs_ctr_snap;/* Snapshot of rcu_qs_ctr to check */ 168 - /* for rcu_all_qs() invocations. */ 169 192 union rcu_noqs cpu_no_qs; /* No QSes yet for this CPU. */ 170 193 bool core_needs_qs; /* Core waits for quiesc state. */ 171 194 bool beenonline; /* CPU online at least once. */ 172 195 bool gpwrap; /* Possible ->gp_seq wrap. */ 196 + bool deferred_qs; /* This CPU awaiting a deferred QS? */ 173 197 struct rcu_node *mynode; /* This CPU's leaf of hierarchy */ 174 198 unsigned long grpmask; /* Mask to apply to leaf qsmask. */ 175 199 unsigned long ticks_this_gp; /* The number of scheduling-clock */ ··· 187 213 long blimit; /* Upper limit on a processed batch */ 188 214 189 215 /* 3) dynticks interface. */ 190 - struct rcu_dynticks *dynticks; /* Shared per-CPU dynticks state. */ 191 216 int dynticks_snap; /* Per-GP tracking for dynticks. */ 192 - 193 - /* 4) reasons this CPU needed to be kicked by force_quiescent_state */ 194 - unsigned long dynticks_fqs; /* Kicked due to dynticks idle. */ 195 - unsigned long cond_resched_completed; 196 - /* Grace period that needs help */ 197 - /* from cond_resched(). */ 198 - 199 - /* 5) _rcu_barrier(), OOM callbacks, and expediting. */ 200 - struct rcu_head barrier_head; 217 + long dynticks_nesting; /* Track process nesting level. */ 218 + long dynticks_nmi_nesting; /* Track irq/NMI nesting level. */ 219 + atomic_t dynticks; /* Even value for idle, else odd. */ 220 + bool rcu_need_heavy_qs; /* GP old, so heavy quiescent state! */ 221 + bool rcu_urgent_qs; /* GP old need light quiescent state. */ 201 222 #ifdef CONFIG_RCU_FAST_NO_HZ 202 - struct rcu_head oom_head; 223 + bool all_lazy; /* Are all CPU's CBs lazy? */ 224 + unsigned long nonlazy_posted; /* # times non-lazy CB posted to CPU. */ 225 + unsigned long nonlazy_posted_snap; 226 + /* Nonlazy_posted snapshot. */ 227 + unsigned long last_accelerate; /* Last jiffy CBs were accelerated. 
*/ 228 + unsigned long last_advance_all; /* Last jiffy CBs were all advanced. */ 229 + int tick_nohz_enabled_snap; /* Previously seen value from sysfs. */ 203 230 #endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */ 231 + 232 + /* 4) rcu_barrier(), OOM callbacks, and expediting. */ 233 + struct rcu_head barrier_head; 204 234 int exp_dynticks_snap; /* Double-check need for IPI. */ 205 235 206 - /* 6) Callback offloading. */ 236 + /* 5) Callback offloading. */ 207 237 #ifdef CONFIG_RCU_NOCB_CPU 208 238 struct rcu_head *nocb_head; /* CBs waiting for kthread. */ 209 239 struct rcu_head **nocb_tail; ··· 234 256 /* Leader CPU takes GP-end wakeups. */ 235 257 #endif /* #ifdef CONFIG_RCU_NOCB_CPU */ 236 258 237 - /* 7) Diagnostic data, including RCU CPU stall warnings. */ 259 + /* 6) Diagnostic data, including RCU CPU stall warnings. */ 238 260 unsigned int softirq_snap; /* Snapshot of softirq activity. */ 239 261 /* ->rcu_iw* fields protected by leaf rcu_node ->lock. */ 240 262 struct irq_work rcu_iw; /* Check for non-irq activity. */ ··· 244 266 short rcu_ofl_gp_flags; /* ->gp_flags at last offline. */ 245 267 unsigned long rcu_onl_gp_seq; /* ->gp_seq at last online. */ 246 268 short rcu_onl_gp_flags; /* ->gp_flags at last online. */ 269 + unsigned long last_fqs_resched; /* Time of last rcu_resched(). */ 247 270 248 271 int cpu; 249 - struct rcu_state *rsp; 250 272 }; 251 273 252 274 /* Values for nocb_defer_wakeup field in struct rcu_data. */ ··· 292 314 struct rcu_node *level[RCU_NUM_LVLS + 1]; 293 315 /* Hierarchy levels (+1 to */ 294 316 /* shut bogus gcc warning) */ 295 - struct rcu_data __percpu *rda; /* pointer of percu rcu_data. */ 296 - call_rcu_func_t call; /* call_rcu() flavor. */ 297 317 int ncpus; /* # CPUs seen so far. */ 298 318 299 319 /* The following fields are guarded by the root rcu_node's lock. */ ··· 310 334 atomic_t barrier_cpu_count; /* # CPUs waiting on. */ 311 335 struct completion barrier_completion; /* Wake at barrier end. 
*/ 312 336 unsigned long barrier_sequence; /* ++ at start and end of */ 313 - /* _rcu_barrier(). */ 337 + /* rcu_barrier(). */ 314 338 /* End of fields guarded by barrier_mutex. */ 315 339 316 340 struct mutex exp_mutex; /* Serialize expedited GP. */ ··· 342 366 /* jiffies. */ 343 367 const char *name; /* Name of structure. */ 344 368 char abbr; /* Abbreviated name. */ 345 - struct list_head flavors; /* List of RCU flavors. */ 346 369 347 - spinlock_t ofl_lock ____cacheline_internodealigned_in_smp; 370 + raw_spinlock_t ofl_lock ____cacheline_internodealigned_in_smp; 348 371 /* Synchronize offline with */ 349 372 /* GP pre-initialization. */ 350 373 }; ··· 363 388 #define RCU_GP_CLEANUP 7 /* Grace-period cleanup started. */ 364 389 #define RCU_GP_CLEANED 8 /* Grace-period cleanup complete. */ 365 390 366 - #ifndef RCU_TREE_NONCORE 367 391 static const char * const gp_state_names[] = { 368 392 "RCU_GP_IDLE", 369 393 "RCU_GP_WAIT_GPS", ··· 374 400 "RCU_GP_CLEANUP", 375 401 "RCU_GP_CLEANED", 376 402 }; 377 - #endif /* #ifndef RCU_TREE_NONCORE */ 378 403 379 - extern struct list_head rcu_struct_flavors; 380 - 381 - /* Sequence through rcu_state structures for each RCU flavor. */ 382 - #define for_each_rcu_flavor(rsp) \ 383 - list_for_each_entry((rsp), &rcu_struct_flavors, flavors) 404 + /* 405 + * In order to export the rcu_state name to the tracing tools, it 406 + * needs to be added in the __tracepoint_string section. 407 + * This requires defining a separate variable tp_<sname>_varname 408 + * that points to the string being used, and this will allow 409 + * the tracing userspace tools to be able to decipher the string 410 + * address to the matching string. 
411 + */ 412 + #ifdef CONFIG_PREEMPT_RCU 413 + #define RCU_ABBR 'p' 414 + #define RCU_NAME_RAW "rcu_preempt" 415 + #else /* #ifdef CONFIG_PREEMPT_RCU */ 416 + #define RCU_ABBR 's' 417 + #define RCU_NAME_RAW "rcu_sched" 418 + #endif /* #else #ifdef CONFIG_PREEMPT_RCU */ 419 + #ifndef CONFIG_TRACING 420 + #define RCU_NAME RCU_NAME_RAW 421 + #else /* #ifdef CONFIG_TRACING */ 422 + static char rcu_name[] = RCU_NAME_RAW; 423 + static const char *tp_rcu_varname __used __tracepoint_string = rcu_name; 424 + #define RCU_NAME rcu_name 425 + #endif /* #else #ifdef CONFIG_TRACING */ 384 426 385 427 /* 386 428 * RCU implementation internal declarations: ··· 409 419 extern struct rcu_state rcu_preempt_state; 410 420 #endif /* #ifdef CONFIG_PREEMPT_RCU */ 411 421 412 - int rcu_dynticks_snap(struct rcu_dynticks *rdtp); 422 + int rcu_dynticks_snap(struct rcu_data *rdp); 413 423 414 424 #ifdef CONFIG_RCU_BOOST 415 425 DECLARE_PER_CPU(unsigned int, rcu_cpu_kthread_status); ··· 418 428 DECLARE_PER_CPU(char, rcu_cpu_has_work); 419 429 #endif /* #ifdef CONFIG_RCU_BOOST */ 420 430 421 - #ifndef RCU_TREE_NONCORE 422 - 423 431 /* Forward declarations for rcutree_plugin.h */ 424 432 static void rcu_bootup_announce(void); 425 - static void rcu_preempt_note_context_switch(bool preempt); 433 + static void rcu_qs(void); 426 434 static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp); 427 435 #ifdef CONFIG_HOTPLUG_CPU 428 436 static bool rcu_preempt_has_tasks(struct rcu_node *rnp); 429 437 #endif /* #ifdef CONFIG_HOTPLUG_CPU */ 430 - static void rcu_print_detail_task_stall(struct rcu_state *rsp); 438 + static void rcu_print_detail_task_stall(void); 431 439 static int rcu_print_task_stall(struct rcu_node *rnp); 432 440 static int rcu_print_task_exp_stall(struct rcu_node *rnp); 433 - static void rcu_preempt_check_blocked_tasks(struct rcu_state *rsp, 434 - struct rcu_node *rnp); 435 - static void rcu_preempt_check_callbacks(void); 441 + static void rcu_preempt_check_blocked_tasks(struct 
rcu_node *rnp); 442 + static void rcu_flavor_check_callbacks(int user); 436 443 void call_rcu(struct rcu_head *head, rcu_callback_t func); 437 - static void __init __rcu_init_preempt(void); 438 - static void dump_blkd_tasks(struct rcu_state *rsp, struct rcu_node *rnp, 439 - int ncheck); 444 + static void dump_blkd_tasks(struct rcu_node *rnp, int ncheck); 440 445 static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags); 441 446 static void rcu_preempt_boost_start_gp(struct rcu_node *rnp); 442 447 static void invoke_rcu_callbacks_kthread(void); 443 448 static bool rcu_is_callbacks_kthread(void); 444 - #ifdef CONFIG_RCU_BOOST 445 - static int rcu_spawn_one_boost_kthread(struct rcu_state *rsp, 446 - struct rcu_node *rnp); 447 - #endif /* #ifdef CONFIG_RCU_BOOST */ 448 449 static void __init rcu_spawn_boost_kthreads(void); 449 450 static void rcu_prepare_kthreads(int cpu); 450 451 static void rcu_cleanup_after_idle(void); 451 452 static void rcu_prepare_for_idle(void); 452 453 static void rcu_idle_count_callbacks_posted(void); 453 454 static bool rcu_preempt_has_tasks(struct rcu_node *rnp); 455 + static bool rcu_preempt_need_deferred_qs(struct task_struct *t); 456 + static void rcu_preempt_deferred_qs(struct task_struct *t); 454 457 static void print_cpu_stall_info_begin(void); 455 - static void print_cpu_stall_info(struct rcu_state *rsp, int cpu); 458 + static void print_cpu_stall_info(int cpu); 456 459 static void print_cpu_stall_info_end(void); 457 460 static void zero_cpu_stall_ticks(struct rcu_data *rdp); 458 - static void increment_cpu_stall_ticks(void); 459 - static bool rcu_nocb_cpu_needs_barrier(struct rcu_state *rsp, int cpu); 461 + static bool rcu_nocb_cpu_needs_barrier(int cpu); 460 462 static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp); 461 463 static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq); 462 464 static void rcu_init_one_nocb(struct rcu_node *rnp); ··· 463 481 static void rcu_spawn_all_nocb_kthreads(int 
cpu); 464 482 static void __init rcu_spawn_nocb_kthreads(void); 465 483 #ifdef CONFIG_RCU_NOCB_CPU 466 - static void __init rcu_organize_nocb_kthreads(struct rcu_state *rsp); 484 + static void __init rcu_organize_nocb_kthreads(void); 467 485 #endif /* #ifdef CONFIG_RCU_NOCB_CPU */ 468 486 static bool init_nocb_callback_list(struct rcu_data *rdp); 469 487 static void rcu_bind_gp_kthread(void); 470 - static bool rcu_nohz_full_cpu(struct rcu_state *rsp); 488 + static bool rcu_nohz_full_cpu(void); 471 489 static void rcu_dynticks_task_enter(void); 472 490 static void rcu_dynticks_task_exit(void); 473 491 ··· 478 496 void srcu_online_cpu(unsigned int cpu) { } 479 497 void srcu_offline_cpu(unsigned int cpu) { } 480 498 #endif /* #else #ifdef CONFIG_SRCU */ 481 - 482 - #endif /* #ifndef RCU_TREE_NONCORE */
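The tree.h changes above fold the old per-CPU rcu_dynticks state into rcu_data, but the underlying protocol is unchanged: the atomic dynticks counter holds an even value while the CPU is in an extended quiescent state (idle) and an odd value otherwise, so a remote CPU can snapshot it and later decide whether the target passed through idle without ever sending it an IPI. Here is a minimal userspace sketch of that even/odd protocol using C11 atomics; the helper names loosely mirror rcu_dynticks_snap() and rcu_dynticks_in_eqs_since(), but the code is illustrative only and omits the kernel's memory-ordering guarantees:

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Illustrative stand-in for the kernel's per-CPU ->dynticks counter:
 * an even value means the CPU is in an extended quiescent state
 * (idle), an odd value means it is active.  The counter is bumped on
 * every idle entry and exit, so it never moves backward.
 */
static atomic_ulong dynticks = 1;	/* CPU starts out active (odd) */

static void eqs_enter(void) { atomic_fetch_add(&dynticks, 1); }	/* odd -> even */
static void eqs_exit(void)  { atomic_fetch_add(&dynticks, 1); }	/* even -> odd */

/* Snapshot taken by some other CPU that needs a quiescent state. */
static unsigned long dynticks_snap(void) { return atomic_load(&dynticks); }

/* Was the CPU idle at the time of the snapshot? */
static bool in_eqs(unsigned long snap) { return !(snap & 1); }

/* Has the CPU passed through at least one idle transition since the snapshot? */
static bool in_eqs_since(unsigned long snap)
{
	return atomic_load(&dynticks) != snap;
}
```

In the kernel, the snapshot side also needs explicit ordering against the grace-period machinery (hence the barriers in the real helpers); those are deliberately left out of this sketch.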
+226 -200
kernel/rcu/tree_exp.h
··· 25 25 /* 26 26 * Record the start of an expedited grace period. 27 27 */ 28 - static void rcu_exp_gp_seq_start(struct rcu_state *rsp) 28 + static void rcu_exp_gp_seq_start(void) 29 29 { 30 - rcu_seq_start(&rsp->expedited_sequence); 30 + rcu_seq_start(&rcu_state.expedited_sequence); 31 31 } 32 32 33 33 /* 34 34 * Return the value that the expedited-grace-period counter will have 35 35 * at the end of the current grace period. 36 36 */ 37 - static __maybe_unused unsigned long rcu_exp_gp_seq_endval(struct rcu_state *rsp) 37 + static __maybe_unused unsigned long rcu_exp_gp_seq_endval(void) 38 38 { 39 - return rcu_seq_endval(&rsp->expedited_sequence); 39 + return rcu_seq_endval(&rcu_state.expedited_sequence); 40 40 } 41 41 42 42 /* 43 43 * Record the end of an expedited grace period. 44 44 */ 45 - static void rcu_exp_gp_seq_end(struct rcu_state *rsp) 45 + static void rcu_exp_gp_seq_end(void) 46 46 { 47 - rcu_seq_end(&rsp->expedited_sequence); 47 + rcu_seq_end(&rcu_state.expedited_sequence); 48 48 smp_mb(); /* Ensure that consecutive grace periods serialize. */ 49 49 } 50 50 51 51 /* 52 52 * Take a snapshot of the expedited-grace-period counter. 53 53 */ 54 - static unsigned long rcu_exp_gp_seq_snap(struct rcu_state *rsp) 54 + static unsigned long rcu_exp_gp_seq_snap(void) 55 55 { 56 56 unsigned long s; 57 57 58 58 smp_mb(); /* Caller's modifications seen first by other CPUs. */ 59 - s = rcu_seq_snap(&rsp->expedited_sequence); 60 - trace_rcu_exp_grace_period(rsp->name, s, TPS("snap")); 59 + s = rcu_seq_snap(&rcu_state.expedited_sequence); 60 + trace_rcu_exp_grace_period(rcu_state.name, s, TPS("snap")); 61 61 return s; 62 62 } 63 63 ··· 66 66 * if a full expedited grace period has elapsed since that snapshot 67 67 * was taken.
68 68 */ 69 - static bool rcu_exp_gp_seq_done(struct rcu_state *rsp, unsigned long s) 69 + static bool rcu_exp_gp_seq_done(unsigned long s) 70 70 { 71 - return rcu_seq_done(&rsp->expedited_sequence, s); 71 + return rcu_seq_done(&rcu_state.expedited_sequence, s); 72 72 } 73 73 74 74 /* ··· 78 78 * ever been online. This means that this function normally takes its 79 79 * no-work-to-do fastpath. 80 80 */ 81 - static void sync_exp_reset_tree_hotplug(struct rcu_state *rsp) 81 + static void sync_exp_reset_tree_hotplug(void) 82 82 { 83 83 bool done; 84 84 unsigned long flags; 85 85 unsigned long mask; 86 86 unsigned long oldmask; 87 - int ncpus = smp_load_acquire(&rsp->ncpus); /* Order against locking. */ 87 + int ncpus = smp_load_acquire(&rcu_state.ncpus); /* Order vs. locking. */ 88 88 struct rcu_node *rnp; 89 89 struct rcu_node *rnp_up; 90 90 91 91 /* If no new CPUs onlined since last time, nothing to do. */ 92 - if (likely(ncpus == rsp->ncpus_snap)) 92 + if (likely(ncpus == rcu_state.ncpus_snap)) 93 93 return; 94 - rsp->ncpus_snap = ncpus; 94 + rcu_state.ncpus_snap = ncpus; 95 95 96 96 /* 97 97 * Each pass through the following loop propagates newly onlined 98 98 * CPUs for the current rcu_node structure up the rcu_node tree. 99 99 */ 100 - rcu_for_each_leaf_node(rsp, rnp) { 100 + rcu_for_each_leaf_node(rnp) { 101 101 raw_spin_lock_irqsave_rcu_node(rnp, flags); 102 102 if (rnp->expmaskinit == rnp->expmaskinitnext) { 103 103 raw_spin_unlock_irqrestore_rcu_node(rnp, flags); ··· 135 135 * Reset the ->expmask values in the rcu_node tree in preparation for 136 136 * a new expedited grace period. 
137 137 */ 138 - static void __maybe_unused sync_exp_reset_tree(struct rcu_state *rsp) 138 + static void __maybe_unused sync_exp_reset_tree(void) 139 139 { 140 140 unsigned long flags; 141 141 struct rcu_node *rnp; 142 142 143 - sync_exp_reset_tree_hotplug(rsp); 144 - rcu_for_each_node_breadth_first(rsp, rnp) { 143 + sync_exp_reset_tree_hotplug(); 144 + rcu_for_each_node_breadth_first(rnp) { 145 145 raw_spin_lock_irqsave_rcu_node(rnp, flags); 146 146 WARN_ON_ONCE(rnp->expmask); 147 147 rnp->expmask = rnp->expmaskinit; ··· 194 194 * 195 195 * Caller must hold the specified rcu_node structure's ->lock. 196 196 */ 197 - static void __rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp, 197 + static void __rcu_report_exp_rnp(struct rcu_node *rnp, 198 198 bool wake, unsigned long flags) 199 199 __releases(rnp->lock) 200 200 { ··· 212 212 raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 213 213 if (wake) { 214 214 smp_mb(); /* EGP done before wake_up(). */ 215 - swake_up_one(&rsp->expedited_wq); 215 + swake_up_one(&rcu_state.expedited_wq); 216 216 } 217 217 break; 218 218 } ··· 229 229 * Report expedited quiescent state for specified node. This is a 230 230 * lock-acquisition wrapper function for __rcu_report_exp_rnp(). 231 231 */ 232 - static void __maybe_unused rcu_report_exp_rnp(struct rcu_state *rsp, 233 - struct rcu_node *rnp, bool wake) 232 + static void __maybe_unused rcu_report_exp_rnp(struct rcu_node *rnp, bool wake) 234 233 { 235 234 unsigned long flags; 236 235 237 236 raw_spin_lock_irqsave_rcu_node(rnp, flags); 238 - __rcu_report_exp_rnp(rsp, rnp, wake, flags); 237 + __rcu_report_exp_rnp(rnp, wake, flags); 239 238 } 240 239 241 240 /* 242 241 * Report expedited quiescent state for multiple CPUs, all covered by the 243 242 * specified leaf rcu_node structure. 
244 243 */ 245 - static void rcu_report_exp_cpu_mult(struct rcu_state *rsp, struct rcu_node *rnp, 244 + static void rcu_report_exp_cpu_mult(struct rcu_node *rnp, 246 245 unsigned long mask, bool wake) 247 246 { 248 247 unsigned long flags; ··· 252 253 return; 253 254 } 254 255 rnp->expmask &= ~mask; 255 - __rcu_report_exp_rnp(rsp, rnp, wake, flags); /* Releases rnp->lock. */ 256 + __rcu_report_exp_rnp(rnp, wake, flags); /* Releases rnp->lock. */ 256 257 } 257 258 258 259 /* 259 260 * Report expedited quiescent state for specified rcu_data (CPU). 260 261 */ 261 - static void rcu_report_exp_rdp(struct rcu_state *rsp, struct rcu_data *rdp, 262 - bool wake) 262 + static void rcu_report_exp_rdp(struct rcu_data *rdp) 263 263 { 264 - rcu_report_exp_cpu_mult(rsp, rdp->mynode, rdp->grpmask, wake); 264 + WRITE_ONCE(rdp->deferred_qs, false); 265 + rcu_report_exp_cpu_mult(rdp->mynode, rdp->grpmask, true); 265 266 } 266 267 267 - /* Common code for synchronize_{rcu,sched}_expedited() work-done checking. */ 268 - static bool sync_exp_work_done(struct rcu_state *rsp, unsigned long s) 268 + /* Common code for work-done checking. */ 269 + static bool sync_exp_work_done(unsigned long s) 269 270 { 270 - if (rcu_exp_gp_seq_done(rsp, s)) { 271 - trace_rcu_exp_grace_period(rsp->name, s, TPS("done")); 271 + if (rcu_exp_gp_seq_done(s)) { 272 + trace_rcu_exp_grace_period(rcu_state.name, s, TPS("done")); 272 273 /* Ensure test happens before caller kfree(). */ 273 274 smp_mb__before_atomic(); /* ^^^ */ 274 275 return true; ··· 283 284 * with the mutex held, indicating that the caller must actually do the 284 285 * expedited grace period. 
285 286 */ 286 - static bool exp_funnel_lock(struct rcu_state *rsp, unsigned long s) 287 + static bool exp_funnel_lock(unsigned long s) 287 288 { 288 - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, raw_smp_processor_id()); 289 + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, raw_smp_processor_id()); 289 290 struct rcu_node *rnp = rdp->mynode; 290 - struct rcu_node *rnp_root = rcu_get_root(rsp); 291 + struct rcu_node *rnp_root = rcu_get_root(); 291 292 292 293 /* Low-contention fastpath. */ 293 294 if (ULONG_CMP_LT(READ_ONCE(rnp->exp_seq_rq), s) && 294 295 (rnp == rnp_root || 295 296 ULONG_CMP_LT(READ_ONCE(rnp_root->exp_seq_rq), s)) && 296 - mutex_trylock(&rsp->exp_mutex)) 297 + mutex_trylock(&rcu_state.exp_mutex)) 297 298 goto fastpath; 298 299 299 300 /* 300 301 * Each pass through the following loop works its way up 301 302 * the rcu_node tree, returning if others have done the work or 302 - * otherwise falls through to acquire rsp->exp_mutex. The mapping 303 + * otherwise falls through to acquire ->exp_mutex. The mapping 303 304 * from CPU to rcu_node structure can be inexact, as it is just 304 305 * promoting locality and is not strictly needed for correctness. 305 306 */ 306 307 for (; rnp != NULL; rnp = rnp->parent) { 307 - if (sync_exp_work_done(rsp, s)) 308 + if (sync_exp_work_done(s)) 308 309 return true; 309 310 310 311 /* Work not done, either wait here or go up. */ ··· 313 314 314 315 /* Someone else doing GP, so wait for them. */ 315 316 spin_unlock(&rnp->exp_lock); 316 - trace_rcu_exp_funnel_lock(rsp->name, rnp->level, 317 + trace_rcu_exp_funnel_lock(rcu_state.name, rnp->level, 317 318 rnp->grplo, rnp->grphi, 318 319 TPS("wait")); 319 320 wait_event(rnp->exp_wq[rcu_seq_ctr(s) & 0x3], 320 - sync_exp_work_done(rsp, s)); 321 + sync_exp_work_done(s)); 321 322 return true; 322 323 } 323 324 rnp->exp_seq_rq = s; /* Followers can wait on us. 
*/ 324 325 spin_unlock(&rnp->exp_lock); 325 - trace_rcu_exp_funnel_lock(rsp->name, rnp->level, rnp->grplo, 326 - rnp->grphi, TPS("nxtlvl")); 326 + trace_rcu_exp_funnel_lock(rcu_state.name, rnp->level, 327 + rnp->grplo, rnp->grphi, TPS("nxtlvl")); 327 328 } 328 - mutex_lock(&rsp->exp_mutex); 329 + mutex_lock(&rcu_state.exp_mutex); 329 330 fastpath: 330 - if (sync_exp_work_done(rsp, s)) { 331 - mutex_unlock(&rsp->exp_mutex); 331 + if (sync_exp_work_done(s)) { 332 + mutex_unlock(&rcu_state.exp_mutex); 332 333 return true; 333 334 } 334 - rcu_exp_gp_seq_start(rsp); 335 - trace_rcu_exp_grace_period(rsp->name, s, TPS("start")); 335 + rcu_exp_gp_seq_start(); 336 + trace_rcu_exp_grace_period(rcu_state.name, s, TPS("start")); 336 337 return false; 337 - } 338 - 339 - /* Invoked on each online non-idle CPU for expedited quiescent state. */ 340 - static void sync_sched_exp_handler(void *data) 341 - { 342 - struct rcu_data *rdp; 343 - struct rcu_node *rnp; 344 - struct rcu_state *rsp = data; 345 - 346 - rdp = this_cpu_ptr(rsp->rda); 347 - rnp = rdp->mynode; 348 - if (!(READ_ONCE(rnp->expmask) & rdp->grpmask) || 349 - __this_cpu_read(rcu_sched_data.cpu_no_qs.b.exp)) 350 - return; 351 - if (rcu_is_cpu_rrupt_from_idle()) { 352 - rcu_report_exp_rdp(&rcu_sched_state, 353 - this_cpu_ptr(&rcu_sched_data), true); 354 - return; 355 - } 356 - __this_cpu_write(rcu_sched_data.cpu_no_qs.b.exp, true); 357 - /* Store .exp before .rcu_urgent_qs. */ 358 - smp_store_release(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs), true); 359 - resched_cpu(smp_processor_id()); 360 - } 361 - 362 - /* Send IPI for expedited cleanup if needed at end of CPU-hotplug operation. 
*/ 363 - static void sync_sched_exp_online_cleanup(int cpu) 364 - { 365 - struct rcu_data *rdp; 366 - int ret; 367 - struct rcu_node *rnp; 368 - struct rcu_state *rsp = &rcu_sched_state; 369 - 370 - rdp = per_cpu_ptr(rsp->rda, cpu); 371 - rnp = rdp->mynode; 372 - if (!(READ_ONCE(rnp->expmask) & rdp->grpmask)) 373 - return; 374 - ret = smp_call_function_single(cpu, sync_sched_exp_handler, rsp, 0); 375 - WARN_ON_ONCE(ret); 376 338 } 377 339 378 340 /* ··· 351 391 struct rcu_exp_work *rewp = 352 392 container_of(wp, struct rcu_exp_work, rew_work); 353 393 struct rcu_node *rnp = container_of(rewp, struct rcu_node, rew); 354 - struct rcu_state *rsp = rewp->rew_rsp; 355 394 356 395 func = rewp->rew_func; 357 396 raw_spin_lock_irqsave_rcu_node(rnp, flags); ··· 359 400 mask_ofl_test = 0; 360 401 for_each_leaf_node_cpu_mask(rnp, cpu, rnp->expmask) { 361 402 unsigned long mask = leaf_node_cpu_bit(rnp, cpu); 362 - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); 363 - struct rcu_dynticks *rdtp = per_cpu_ptr(&rcu_dynticks, cpu); 403 + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); 364 404 int snap; 365 405 366 406 if (raw_smp_processor_id() == cpu || 367 407 !(rnp->qsmaskinitnext & mask)) { 368 408 mask_ofl_test |= mask; 369 409 } else { 370 - snap = rcu_dynticks_snap(rdtp); 410 + snap = rcu_dynticks_snap(rdp); 371 411 if (rcu_dynticks_in_eqs(snap)) 372 412 mask_ofl_test |= mask; 373 413 else ··· 387 429 /* IPI the remaining CPUs for expedited quiescent state. 
*/ 388 430 for_each_leaf_node_cpu_mask(rnp, cpu, rnp->expmask) { 389 431 unsigned long mask = leaf_node_cpu_bit(rnp, cpu); 390 - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); 432 + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); 391 433 392 434 if (!(mask_ofl_ipi & mask)) 393 435 continue; 394 436 retry_ipi: 395 - if (rcu_dynticks_in_eqs_since(rdp->dynticks, 396 - rdp->exp_dynticks_snap)) { 437 + if (rcu_dynticks_in_eqs_since(rdp, rdp->exp_dynticks_snap)) { 397 438 mask_ofl_test |= mask; 398 439 continue; 399 440 } 400 - ret = smp_call_function_single(cpu, func, rsp, 0); 441 + ret = smp_call_function_single(cpu, func, NULL, 0); 401 442 if (!ret) { 402 443 mask_ofl_ipi &= ~mask; 403 444 continue; ··· 407 450 (rnp->expmask & mask)) { 408 451 /* Online, so delay for a bit and try again. */ 409 452 raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 410 - trace_rcu_exp_grace_period(rsp->name, rcu_exp_gp_seq_endval(rsp), TPS("selectofl")); 453 + trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("selectofl")); 411 454 schedule_timeout_uninterruptible(1); 412 455 goto retry_ipi; 413 456 } ··· 419 462 /* Report quiescent states for those that went offline. */ 420 463 mask_ofl_test |= mask_ofl_ipi; 421 464 if (mask_ofl_test) 422 - rcu_report_exp_cpu_mult(rsp, rnp, mask_ofl_test, false); 465 + rcu_report_exp_cpu_mult(rnp, mask_ofl_test, false); 423 466 } 424 467 425 468 /* 426 469 * Select the nodes that the upcoming expedited grace period needs 427 470 * to wait for. 
428 471 */ 429 - static void sync_rcu_exp_select_cpus(struct rcu_state *rsp, 430 - smp_call_func_t func) 472 + static void sync_rcu_exp_select_cpus(smp_call_func_t func) 431 473 { 432 474 int cpu; 433 475 struct rcu_node *rnp; 434 476 435 - trace_rcu_exp_grace_period(rsp->name, rcu_exp_gp_seq_endval(rsp), TPS("reset")); 436 - sync_exp_reset_tree(rsp); 437 - trace_rcu_exp_grace_period(rsp->name, rcu_exp_gp_seq_endval(rsp), TPS("select")); 477 + trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("reset")); 478 + sync_exp_reset_tree(); 479 + trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("select")); 438 480 439 481 /* Schedule work for each leaf rcu_node structure. */ 440 - rcu_for_each_leaf_node(rsp, rnp) { 482 + rcu_for_each_leaf_node(rnp) { 441 483 rnp->exp_need_flush = false; 442 484 if (!READ_ONCE(rnp->expmask)) 443 485 continue; /* Avoid early boot non-existent wq. */ 444 486 rnp->rew.rew_func = func; 445 - rnp->rew.rew_rsp = rsp; 446 487 if (!READ_ONCE(rcu_par_gp_wq) || 447 488 rcu_scheduler_active != RCU_SCHEDULER_RUNNING || 448 - rcu_is_last_leaf_node(rsp, rnp)) { 489 + rcu_is_last_leaf_node(rnp)) { 449 490 /* No workqueues yet or last leaf, do direct call. */ 450 491 sync_rcu_exp_select_node_cpus(&rnp->rew.rew_work); 451 492 continue; ··· 460 505 } 461 506 462 507 /* Wait for workqueue jobs (if any) to complete. 
*/ 463 - rcu_for_each_leaf_node(rsp, rnp) 508 + rcu_for_each_leaf_node(rnp) 464 509 if (rnp->exp_need_flush) 465 510 flush_work(&rnp->rew.rew_work); 466 511 } 467 512 468 - static void synchronize_sched_expedited_wait(struct rcu_state *rsp) 513 + static void synchronize_sched_expedited_wait(void) 469 514 { 470 515 int cpu; 471 516 unsigned long jiffies_stall; ··· 473 518 unsigned long mask; 474 519 int ndetected; 475 520 struct rcu_node *rnp; 476 - struct rcu_node *rnp_root = rcu_get_root(rsp); 521 + struct rcu_node *rnp_root = rcu_get_root(); 477 522 int ret; 478 523 479 - trace_rcu_exp_grace_period(rsp->name, rcu_exp_gp_seq_endval(rsp), TPS("startwait")); 524 + trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("startwait")); 480 525 jiffies_stall = rcu_jiffies_till_stall_check(); 481 526 jiffies_start = jiffies; 482 527 483 528 for (;;) { 484 529 ret = swait_event_timeout_exclusive( 485 - rsp->expedited_wq, 530 + rcu_state.expedited_wq, 486 531 sync_rcu_preempt_exp_done_unlocked(rnp_root), 487 532 jiffies_stall); 488 533 if (ret > 0 || sync_rcu_preempt_exp_done_unlocked(rnp_root)) ··· 492 537 continue; 493 538 panic_on_rcu_stall(); 494 539 pr_err("INFO: %s detected expedited stalls on CPUs/tasks: {", 495 - rsp->name); 540 + rcu_state.name); 496 541 ndetected = 0; 497 - rcu_for_each_leaf_node(rsp, rnp) { 542 + rcu_for_each_leaf_node(rnp) { 498 543 ndetected += rcu_print_task_exp_stall(rnp); 499 544 for_each_leaf_node_possible_cpu(rnp, cpu) { 500 545 struct rcu_data *rdp; ··· 503 548 if (!(rnp->expmask & mask)) 504 549 continue; 505 550 ndetected++; 506 - rdp = per_cpu_ptr(rsp->rda, cpu); 551 + rdp = per_cpu_ptr(&rcu_data, cpu); 507 552 pr_cont(" %d-%c%c%c", cpu, 508 553 "O."[!!cpu_online(cpu)], 509 554 "o."[!!(rdp->grpmask & rnp->expmaskinit)], ··· 511 556 } 512 557 } 513 558 pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n", 514 - jiffies - jiffies_start, rsp->expedited_sequence, 559 + jiffies - jiffies_start, rcu_state.expedited_sequence, 515 
560 rnp_root->expmask, ".T"[!!rnp_root->exp_tasks]); 516 561 if (ndetected) { 517 562 pr_err("blocking rcu_node structures:"); 518 - rcu_for_each_node_breadth_first(rsp, rnp) { 563 + rcu_for_each_node_breadth_first(rnp) { 519 564 if (rnp == rnp_root) 520 565 continue; /* printed unconditionally */ 521 566 if (sync_rcu_preempt_exp_done_unlocked(rnp)) ··· 527 572 } 528 573 pr_cont("\n"); 529 574 } 530 - rcu_for_each_leaf_node(rsp, rnp) { 575 + rcu_for_each_leaf_node(rnp) { 531 576 for_each_leaf_node_possible_cpu(rnp, cpu) { 532 577 mask = leaf_node_cpu_bit(rnp, cpu); 533 578 if (!(rnp->expmask & mask)) ··· 545 590 * grace period. Also update all the ->exp_seq_rq counters as needed 546 591 * in order to avoid counter-wrap problems. 547 592 */ 548 - static void rcu_exp_wait_wake(struct rcu_state *rsp, unsigned long s) 593 + static void rcu_exp_wait_wake(unsigned long s) 549 594 { 550 595 struct rcu_node *rnp; 551 596 552 - synchronize_sched_expedited_wait(rsp); 553 - rcu_exp_gp_seq_end(rsp); 554 - trace_rcu_exp_grace_period(rsp->name, s, TPS("end")); 597 + synchronize_sched_expedited_wait(); 598 + rcu_exp_gp_seq_end(); 599 + trace_rcu_exp_grace_period(rcu_state.name, s, TPS("end")); 555 600 556 601 /* 557 602 * Switch over to wakeup mode, allowing the next GP, but -only- the 558 603 * next GP, to proceed. 559 604 */ 560 - mutex_lock(&rsp->exp_wake_mutex); 605 + mutex_lock(&rcu_state.exp_wake_mutex); 561 606 562 - rcu_for_each_node_breadth_first(rsp, rnp) { 607 + rcu_for_each_node_breadth_first(rnp) { 563 608 if (ULONG_CMP_LT(READ_ONCE(rnp->exp_seq_rq), s)) { 564 609 spin_lock(&rnp->exp_lock); 565 610 /* Recheck, avoid hang in case someone just arrived. */ ··· 568 613 spin_unlock(&rnp->exp_lock); 569 614 } 570 615 smp_mb(); /* All above changes before wakeup. 
*/ 571 - wake_up_all(&rnp->exp_wq[rcu_seq_ctr(rsp->expedited_sequence) & 0x3]); 616 + wake_up_all(&rnp->exp_wq[rcu_seq_ctr(rcu_state.expedited_sequence) & 0x3]); 572 617 } 573 - trace_rcu_exp_grace_period(rsp->name, s, TPS("endwake")); 574 - mutex_unlock(&rsp->exp_wake_mutex); 618 + trace_rcu_exp_grace_period(rcu_state.name, s, TPS("endwake")); 619 + mutex_unlock(&rcu_state.exp_wake_mutex); 575 620 } 576 621 577 622 /* 578 623 * Common code to drive an expedited grace period forward, used by 579 624 * workqueues and mid-boot-time tasks. 580 625 */ 581 - static void rcu_exp_sel_wait_wake(struct rcu_state *rsp, 582 - smp_call_func_t func, unsigned long s) 626 + static void rcu_exp_sel_wait_wake(smp_call_func_t func, unsigned long s) 583 627 { 584 628 /* Initialize the rcu_node tree in preparation for the wait. */ 585 - sync_rcu_exp_select_cpus(rsp, func); 629 + sync_rcu_exp_select_cpus(func); 586 630 587 631 /* Wait and clean up, including waking everyone. */ 588 - rcu_exp_wait_wake(rsp, s); 632 + rcu_exp_wait_wake(s); 589 633 } 590 634 591 635 /* ··· 595 641 struct rcu_exp_work *rewp; 596 642 597 643 rewp = container_of(wp, struct rcu_exp_work, rew_work); 598 - rcu_exp_sel_wait_wake(rewp->rew_rsp, rewp->rew_func, rewp->rew_s); 644 + rcu_exp_sel_wait_wake(rewp->rew_func, rewp->rew_s); 599 645 } 600 646 601 647 /* 602 - * Given an rcu_state pointer and a smp_call_function() handler, kick 603 - * off the specified flavor of expedited grace period. 648 + * Given a smp_call_function() handler, kick off the specified 649 + * implementation of expedited grace period. 604 650 */ 605 - static void _synchronize_rcu_expedited(struct rcu_state *rsp, 606 - smp_call_func_t func) 651 + static void _synchronize_rcu_expedited(smp_call_func_t func) 607 652 { 608 653 struct rcu_data *rdp; 609 654 struct rcu_exp_work rew; ··· 611 658 612 659 /* If expedited grace periods are prohibited, fall back to normal. 
*/ 613 660 if (rcu_gp_is_normal()) { 614 - wait_rcu_gp(rsp->call); 661 + wait_rcu_gp(call_rcu); 615 662 return; 616 663 } 617 664 618 665 /* Take a snapshot of the sequence number. */ 619 - s = rcu_exp_gp_seq_snap(rsp); 620 - if (exp_funnel_lock(rsp, s)) 666 + s = rcu_exp_gp_seq_snap(); 667 + if (exp_funnel_lock(s)) 621 668 return; /* Someone else did our work for us. */ 622 669 623 670 /* Ensure that load happens before action based on it. */ 624 671 if (unlikely(rcu_scheduler_active == RCU_SCHEDULER_INIT)) { 625 672 /* Direct call during scheduler init and early_initcalls(). */ 626 - rcu_exp_sel_wait_wake(rsp, func, s); 673 + rcu_exp_sel_wait_wake(func, s); 627 674 } else { 628 675 /* Marshall arguments & schedule the expedited grace period. */ 629 676 rew.rew_func = func; 630 - rew.rew_rsp = rsp; 631 677 rew.rew_s = s; 632 678 INIT_WORK_ONSTACK(&rew.rew_work, wait_rcu_exp_gp); 633 679 queue_work(rcu_gp_wq, &rew.rew_work); 634 680 } 635 681 636 682 /* Wait for expedited grace period to complete. */ 637 - rdp = per_cpu_ptr(rsp->rda, raw_smp_processor_id()); 638 - rnp = rcu_get_root(rsp); 683 + rdp = per_cpu_ptr(&rcu_data, raw_smp_processor_id()); 684 + rnp = rcu_get_root(); 639 685 wait_event(rnp->exp_wq[rcu_seq_ctr(s) & 0x3], 640 - sync_exp_work_done(rsp, s)); 686 + sync_exp_work_done(s)); 641 687 smp_mb(); /* Workqueue actions happen before return. */ 642 688 643 689 /* Let the next expedited grace period start. */ 644 - mutex_unlock(&rsp->exp_mutex); 690 + mutex_unlock(&rcu_state.exp_mutex); 645 691 } 646 - 647 - /** 648 - * synchronize_sched_expedited - Brute-force RCU-sched grace period 649 - * 650 - * Wait for an RCU-sched grace period to elapse, but use a "big hammer" 651 - * approach to force the grace period to end quickly. This consumes 652 - * significant time on all CPUs and is unfriendly to real-time workloads, 653 - * so is thus not recommended for any sort of common-case code. 
In fact, 654 - * if you are using synchronize_sched_expedited() in a loop, please 655 - * restructure your code to batch your updates, and then use a single 656 - * synchronize_sched() instead. 657 - * 658 - * This implementation can be thought of as an application of sequence 659 - * locking to expedited grace periods, but using the sequence counter to 660 - * determine when someone else has already done the work instead of for 661 - * retrying readers. 662 - */ 663 - void synchronize_sched_expedited(void) 664 - { 665 - struct rcu_state *rsp = &rcu_sched_state; 666 - 667 - RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || 668 - lock_is_held(&rcu_lock_map) || 669 - lock_is_held(&rcu_sched_lock_map), 670 - "Illegal synchronize_sched_expedited() in RCU read-side critical section"); 671 - 672 - /* If only one CPU, this is automatically a grace period. */ 673 - if (rcu_blocking_is_gp()) 674 - return; 675 - 676 - _synchronize_rcu_expedited(rsp, sync_sched_exp_handler); 677 - } 678 - EXPORT_SYMBOL_GPL(synchronize_sched_expedited); 679 692 680 693 #ifdef CONFIG_PREEMPT_RCU 681 694 ··· 652 733 * ->expmask fields in the rcu_node tree. Otherwise, immediately 653 734 * report the quiescent state. 654 735 */ 655 - static void sync_rcu_exp_handler(void *info) 736 + static void sync_rcu_exp_handler(void *unused) 656 737 { 657 - struct rcu_data *rdp; 658 - struct rcu_state *rsp = info; 738 + unsigned long flags; 739 + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); 740 + struct rcu_node *rnp = rdp->mynode; 659 741 struct task_struct *t = current; 660 742 661 743 /* 662 - * Within an RCU read-side critical section, request that the next 663 - * rcu_read_unlock() report. Unless this RCU read-side critical 664 - * section has already blocked, in which case it is already set 665 - * up for the expedited grace period to wait on it. 744 + * First, the common case of not being in an RCU read-side 745 + * critical section. 
If also enabled or idle, immediately 746 + * report the quiescent state, otherwise defer. 666 747 */ 667 - if (t->rcu_read_lock_nesting > 0 && 668 - !t->rcu_read_unlock_special.b.blocked) { 669 - t->rcu_read_unlock_special.b.exp_need_qs = true; 748 + if (!t->rcu_read_lock_nesting) { 749 + if (!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)) || 750 + rcu_dynticks_curr_cpu_in_eqs()) { 751 + rcu_report_exp_rdp(rdp); 752 + } else { 753 + rdp->deferred_qs = true; 754 + set_tsk_need_resched(t); 755 + set_preempt_need_resched(); 756 + } 670 757 return; 671 758 } 672 759 673 760 /* 674 - * We are either exiting an RCU read-side critical section (negative 675 - * values of t->rcu_read_lock_nesting) or are not in one at all 676 - * (zero value of t->rcu_read_lock_nesting). Or we are in an RCU 677 - * read-side critical section that blocked before this expedited 678 - * grace period started. Either way, we can immediately report 679 - * the quiescent state. 761 + * Second, the less-common case of being in an RCU read-side 762 + * critical section. In this case we can count on a future 763 + * rcu_read_unlock(). However, this rcu_read_unlock() might 764 + * execute on some other CPU, but in that case there will be 765 + * a future context switch. Either way, if the expedited 766 + * grace period is still waiting on this CPU, set ->deferred_qs 767 + * so that the eventual quiescent state will be reported. 768 + * Note that there is a large group of race conditions that 769 + * can have caused this quiescent state to already have been 770 + * reported, so we really do need to check ->expmask. 
 680 771 */ 681 - rdp = this_cpu_ptr(rsp->rda); 682 - rcu_report_exp_rdp(rsp, rdp, true); 772 + if (t->rcu_read_lock_nesting > 0) { 773 + raw_spin_lock_irqsave_rcu_node(rnp, flags); 774 + if (rnp->expmask & rdp->grpmask) 775 + rdp->deferred_qs = true; 776 + raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 777 + } 778 + 779 + /* 780 + * The final and least likely case is where the interrupted 781 + * code was just about to or just finished exiting the RCU-preempt 782 + * read-side critical section, and no, we can't tell which. 783 + * So either way, set ->deferred_qs to flag later code that 784 + * a quiescent state is required. 785 + * 786 + * If the CPU is fully enabled (or if some buggy RCU-preempt 787 + * read-side critical section is being used from idle), just 788 + * invoke rcu_preempt_deferred_qs() to immediately report the 789 + * quiescent state. We cannot use rcu_read_unlock_special() 790 + * because we are in an interrupt handler, which will cause that 791 + * function to take an early exit without doing anything. 792 + * 793 + * Otherwise, force a context switch after the CPU enables everything. 794 + */ 795 + rdp->deferred_qs = true; 796 + if (!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)) || 797 + WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs())) { 798 + rcu_preempt_deferred_qs(t); 799 + } else { 800 + set_tsk_need_resched(t); 801 + set_preempt_need_resched(); 802 + } 803 + } 804 + 805 + /* PREEMPT=y, so no PREEMPT=n expedited grace period to clean up after. */ 806 + static void sync_sched_exp_online_cleanup(int cpu) 807 + { 683 808 } 684 809 685 810 /** ··· 743 780 * you are using synchronize_rcu_expedited() in a loop, please restructure 744 781 * your code to batch your updates, and then use a single synchronize_rcu() 745 782 * instead. 783 + * 784 + * This has the same semantics as (but is more brutal than) synchronize_rcu().
746 785 */ 747 786 void synchronize_rcu_expedited(void) 748 787 { 749 - struct rcu_state *rsp = rcu_state_p; 750 - 751 788 RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || 752 789 lock_is_held(&rcu_lock_map) || 753 790 lock_is_held(&rcu_sched_lock_map), ··· 755 792 756 793 if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE) 757 794 return; 758 - _synchronize_rcu_expedited(rsp, sync_rcu_exp_handler); 795 + _synchronize_rcu_expedited(sync_rcu_exp_handler); 759 796 } 760 797 EXPORT_SYMBOL_GPL(synchronize_rcu_expedited); 761 798 762 799 #else /* #ifdef CONFIG_PREEMPT_RCU */ 763 800 801 + /* Invoked on each online non-idle CPU for expedited quiescent state. */ 802 + static void sync_sched_exp_handler(void *unused) 803 + { 804 + struct rcu_data *rdp; 805 + struct rcu_node *rnp; 806 + 807 + rdp = this_cpu_ptr(&rcu_data); 808 + rnp = rdp->mynode; 809 + if (!(READ_ONCE(rnp->expmask) & rdp->grpmask) || 810 + __this_cpu_read(rcu_data.cpu_no_qs.b.exp)) 811 + return; 812 + if (rcu_is_cpu_rrupt_from_idle()) { 813 + rcu_report_exp_rdp(this_cpu_ptr(&rcu_data)); 814 + return; 815 + } 816 + __this_cpu_write(rcu_data.cpu_no_qs.b.exp, true); 817 + /* Store .exp before .rcu_urgent_qs. */ 818 + smp_store_release(this_cpu_ptr(&rcu_data.rcu_urgent_qs), true); 819 + set_tsk_need_resched(current); 820 + set_preempt_need_resched(); 821 + } 822 + 823 + /* Send IPI for expedited cleanup if needed at end of CPU-hotplug operation. */ 824 + static void sync_sched_exp_online_cleanup(int cpu) 825 + { 826 + struct rcu_data *rdp; 827 + int ret; 828 + struct rcu_node *rnp; 829 + 830 + rdp = per_cpu_ptr(&rcu_data, cpu); 831 + rnp = rdp->mynode; 832 + if (!(READ_ONCE(rnp->expmask) & rdp->grpmask)) 833 + return; 834 + ret = smp_call_function_single(cpu, sync_sched_exp_handler, NULL, 0); 835 + WARN_ON_ONCE(ret); 836 + } 837 + 764 838 /* 765 - * Wait for an rcu-preempt grace period, but make it happen quickly. 766 - * But because preemptible RCU does not exist, map to rcu-sched. 
839 + * Because a context switch is a grace period for !PREEMPT, any 840 + * blocking grace-period wait automatically implies a grace period if 841 + * there is only one CPU online at any point in time during execution of 842 + * either synchronize_rcu() or synchronize_rcu_expedited(). It is OK to 843 + * occasionally incorrectly indicate that there are multiple CPUs online 844 + * when there was in fact only one the whole time, as this just adds some 845 + * overhead: RCU still operates correctly. 767 846 */ 847 + static int rcu_blocking_is_gp(void) 848 + { 849 + int ret; 850 + 851 + might_sleep(); /* Check for RCU read-side critical section. */ 852 + preempt_disable(); 853 + ret = num_online_cpus() <= 1; 854 + preempt_enable(); 855 + return ret; 856 + } 857 + 858 + /* PREEMPT=n implementation of synchronize_rcu_expedited(). */ 768 859 void synchronize_rcu_expedited(void) 769 860 { 770 - synchronize_sched_expedited(); 861 + RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || 862 + lock_is_held(&rcu_lock_map) || 863 + lock_is_held(&rcu_sched_lock_map), 864 + "Illegal synchronize_rcu_expedited() in RCU read-side critical section"); 865 + 866 + /* If only one CPU, this is automatically a grace period. */ 867 + if (rcu_blocking_is_gp()) 868 + return; 869 + 870 + _synchronize_rcu_expedited(sync_sched_exp_handler); 771 871 } 772 872 EXPORT_SYMBOL_GPL(synchronize_rcu_expedited); 773 873
+397 -393
kernel/rcu/tree_plugin.h
··· 38 38 #include "../locking/rtmutex_common.h" 39 39 40 40 /* 41 - * Control variables for per-CPU and per-rcu_node kthreads. These 42 - * handle all flavors of RCU. 41 + * Control variables for per-CPU and per-rcu_node kthreads. 43 42 */ 44 43 static DEFINE_PER_CPU(struct task_struct *, rcu_cpu_kthread_task); 45 44 DEFINE_PER_CPU(unsigned int, rcu_cpu_kthread_status); ··· 105 106 pr_info("\tBoot-time adjustment of first FQS scan delay to %ld jiffies.\n", jiffies_till_first_fqs); 106 107 if (jiffies_till_next_fqs != ULONG_MAX) 107 108 pr_info("\tBoot-time adjustment of subsequent FQS scan delay to %ld jiffies.\n", jiffies_till_next_fqs); 109 + if (jiffies_till_sched_qs != ULONG_MAX) 110 + pr_info("\tBoot-time adjustment of scheduler-enlistment delay to %ld jiffies.\n", jiffies_till_sched_qs); 108 111 if (rcu_kick_kthreads) 109 112 pr_info("\tKick kthreads if too-long grace period.\n"); 110 113 if (IS_ENABLED(CONFIG_DEBUG_OBJECTS_RCU_HEAD)) ··· 124 123 125 124 #ifdef CONFIG_PREEMPT_RCU 126 125 127 - RCU_STATE_INITIALIZER(rcu_preempt, 'p', call_rcu); 128 - static struct rcu_state *const rcu_state_p = &rcu_preempt_state; 129 - static struct rcu_data __percpu *const rcu_data_p = &rcu_preempt_data; 130 - 131 - static void rcu_report_exp_rnp(struct rcu_state *rsp, struct rcu_node *rnp, 132 - bool wake); 126 + static void rcu_report_exp_rnp(struct rcu_node *rnp, bool wake); 133 127 static void rcu_read_unlock_special(struct task_struct *t); 134 128 135 129 /* ··· 280 284 * no need to check for a subsequent expedited GP. (Though we are 281 285 * still in a quiescent state in any case.) 
282 286 */ 283 - if (blkd_state & RCU_EXP_BLKD && 284 - t->rcu_read_unlock_special.b.exp_need_qs) { 285 - t->rcu_read_unlock_special.b.exp_need_qs = false; 286 - rcu_report_exp_rdp(rdp->rsp, rdp, true); 287 - } else { 288 - WARN_ON_ONCE(t->rcu_read_unlock_special.b.exp_need_qs); 289 - } 287 + if (blkd_state & RCU_EXP_BLKD && rdp->deferred_qs) 288 + rcu_report_exp_rdp(rdp); 289 + else 290 + WARN_ON_ONCE(rdp->deferred_qs); 290 291 } 291 292 292 293 /* ··· 299 306 * 300 307 * Callers to this function must disable preemption. 301 308 */ 302 - static void rcu_preempt_qs(void) 309 + static void rcu_qs(void) 303 310 { 304 - RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_qs() invoked with preemption enabled!!!\n"); 305 - if (__this_cpu_read(rcu_data_p->cpu_no_qs.s)) { 311 + RCU_LOCKDEP_WARN(preemptible(), "rcu_qs() invoked with preemption enabled!!!\n"); 312 + if (__this_cpu_read(rcu_data.cpu_no_qs.s)) { 306 313 trace_rcu_grace_period(TPS("rcu_preempt"), 307 - __this_cpu_read(rcu_data_p->gp_seq), 314 + __this_cpu_read(rcu_data.gp_seq), 308 315 TPS("cpuqs")); 309 - __this_cpu_write(rcu_data_p->cpu_no_qs.b.norm, false); 310 - barrier(); /* Coordinate with rcu_preempt_check_callbacks(). */ 316 + __this_cpu_write(rcu_data.cpu_no_qs.b.norm, false); 317 + barrier(); /* Coordinate with rcu_flavor_check_callbacks(). */ 311 318 current->rcu_read_unlock_special.b.need_qs = false; 312 319 } 313 320 } ··· 325 332 * 326 333 * Caller must disable interrupts. 327 334 */ 328 - static void rcu_preempt_note_context_switch(bool preempt) 335 + void rcu_note_context_switch(bool preempt) 329 336 { 330 337 struct task_struct *t = current; 331 - struct rcu_data *rdp; 338 + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); 332 339 struct rcu_node *rnp; 333 340 341 + barrier(); /* Avoid RCU read-side critical sections leaking down. 
*/ 342 + trace_rcu_utilization(TPS("Start context switch")); 334 343 lockdep_assert_irqs_disabled(); 335 344 WARN_ON_ONCE(!preempt && t->rcu_read_lock_nesting > 0); 336 345 if (t->rcu_read_lock_nesting > 0 && 337 346 !t->rcu_read_unlock_special.b.blocked) { 338 347 339 348 /* Possibly blocking in an RCU read-side critical section. */ 340 - rdp = this_cpu_ptr(rcu_state_p->rda); 341 349 rnp = rdp->mynode; 342 350 raw_spin_lock_rcu_node(rnp); 343 351 t->rcu_read_unlock_special.b.blocked = true; ··· 351 357 */ 352 358 WARN_ON_ONCE((rdp->grpmask & rcu_rnp_online_cpus(rnp)) == 0); 353 359 WARN_ON_ONCE(!list_empty(&t->rcu_node_entry)); 354 - trace_rcu_preempt_task(rdp->rsp->name, 360 + trace_rcu_preempt_task(rcu_state.name, 355 361 t->pid, 356 362 (rnp->qsmask & rdp->grpmask) 357 363 ? rnp->gp_seq ··· 365 371 * behalf of preempted instance of __rcu_read_unlock(). 366 372 */ 367 373 rcu_read_unlock_special(t); 374 + rcu_preempt_deferred_qs(t); 375 + } else { 376 + rcu_preempt_deferred_qs(t); 368 377 } 369 378 370 379 /* ··· 379 382 * grace period, then the fact that the task has been enqueued 380 383 * means that we continue to block the current grace period. 381 384 */ 382 - rcu_preempt_qs(); 385 + rcu_qs(); 386 + if (rdp->deferred_qs) 387 + rcu_report_exp_rdp(rdp); 388 + trace_rcu_utilization(TPS("End context switch")); 389 + barrier(); /* Avoid RCU read-side critical sections leaking up. */ 383 390 } 391 + EXPORT_SYMBOL_GPL(rcu_note_context_switch); 384 392 385 393 /* 386 394 * Check for preempted RCU readers blocking the current grace period ··· 466 464 } 467 465 468 466 /* 469 - * Handle special cases during rcu_read_unlock(), such as needing to 470 - * notify RCU core processing or task having blocked during the RCU 471 - * read-side critical section. 467 + * Report deferred quiescent states. The deferral time can 468 + * be quite short, for example, in the case of the call from 469 + * rcu_read_unlock_special(). 
472 470 */ 473 - static void rcu_read_unlock_special(struct task_struct *t) 471 + static void 472 + rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) 474 473 { 475 474 bool empty_exp; 476 475 bool empty_norm; 477 476 bool empty_exp_now; 478 - unsigned long flags; 479 477 struct list_head *np; 480 478 bool drop_boost_mutex = false; 481 479 struct rcu_data *rdp; 482 480 struct rcu_node *rnp; 483 481 union rcu_special special; 484 - 485 - /* NMI handlers cannot block and cannot safely manipulate state. */ 486 - if (in_nmi()) 487 - return; 488 - 489 - local_irq_save(flags); 490 482 491 483 /* 492 484 * If RCU core is waiting for this CPU to exit its critical section, ··· 488 492 * t->rcu_read_unlock_special cannot change. 489 493 */ 490 494 special = t->rcu_read_unlock_special; 495 + rdp = this_cpu_ptr(&rcu_data); 496 + if (!special.s && !rdp->deferred_qs) { 497 + local_irq_restore(flags); 498 + return; 499 + } 491 500 if (special.b.need_qs) { 492 - rcu_preempt_qs(); 501 + rcu_qs(); 493 502 t->rcu_read_unlock_special.b.need_qs = false; 494 - if (!t->rcu_read_unlock_special.s) { 503 + if (!t->rcu_read_unlock_special.s && !rdp->deferred_qs) { 495 504 local_irq_restore(flags); 496 505 return; 497 506 } 498 507 } 499 508 500 509 /* 501 - * Respond to a request for an expedited grace period, but only if 502 - * we were not preempted, meaning that we were running on the same 503 - * CPU throughout. If we were preempted, the exp_need_qs flag 504 - * would have been cleared at the time of the first preemption, 505 - * and the quiescent state would be reported when we were dequeued. 510 + * Respond to a request by an expedited grace period for a 511 + * quiescent state from this CPU. Note that requests from 512 + * tasks are handled when removing the task from the 513 + * blocked-tasks list below. 
506 514 */ 507 - if (special.b.exp_need_qs) { 508 - WARN_ON_ONCE(special.b.blocked); 509 - t->rcu_read_unlock_special.b.exp_need_qs = false; 510 - rdp = this_cpu_ptr(rcu_state_p->rda); 511 - rcu_report_exp_rdp(rcu_state_p, rdp, true); 515 + if (rdp->deferred_qs) { 516 + rcu_report_exp_rdp(rdp); 512 517 if (!t->rcu_read_unlock_special.s) { 513 518 local_irq_restore(flags); 514 519 return; 515 520 } 516 - } 517 - 518 - /* Hardware IRQ handlers cannot block, complain if they get here. */ 519 - if (in_irq() || in_serving_softirq()) { 520 - lockdep_rcu_suspicious(__FILE__, __LINE__, 521 - "rcu_read_unlock() from irq or softirq with blocking in critical section!!!\n"); 522 - pr_alert("->rcu_read_unlock_special: %#x (b: %d, enq: %d nq: %d)\n", 523 - t->rcu_read_unlock_special.s, 524 - t->rcu_read_unlock_special.b.blocked, 525 - t->rcu_read_unlock_special.b.exp_need_qs, 526 - t->rcu_read_unlock_special.b.need_qs); 527 - local_irq_restore(flags); 528 - return; 529 521 } 530 522 531 523 /* Clean up if blocked during RCU read-side critical section. */ ··· 566 582 rnp->grplo, 567 583 rnp->grphi, 568 584 !!rnp->gp_tasks); 569 - rcu_report_unblock_qs_rnp(rcu_state_p, rnp, flags); 585 + rcu_report_unblock_qs_rnp(rnp, flags); 570 586 } else { 571 587 raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 572 588 } ··· 580 596 * then we need to report up the rcu_node hierarchy. 581 597 */ 582 598 if (!empty_exp && empty_exp_now) 583 - rcu_report_exp_rnp(rcu_state_p, rnp, true); 599 + rcu_report_exp_rnp(rnp, true); 584 600 } else { 585 601 local_irq_restore(flags); 586 602 } 603 + } 604 + 605 + /* 606 + * Is a deferred quiescent-state pending, and are we also not in 607 + * an RCU read-side critical section? It is the caller's responsibility 608 + * to ensure it is otherwise safe to report any deferred quiescent 609 + * states. The reason for this is that it is safe to report a 610 + * quiescent state during context switch even though preemption 611 + * is disabled. 
This function cannot be expected to understand these 612 + * nuances, so the caller must handle them. 613 + */ 614 + static bool rcu_preempt_need_deferred_qs(struct task_struct *t) 615 + { 616 + return (this_cpu_ptr(&rcu_data)->deferred_qs || 617 + READ_ONCE(t->rcu_read_unlock_special.s)) && 618 + t->rcu_read_lock_nesting <= 0; 619 + } 620 + 621 + /* 622 + * Report a deferred quiescent state if needed and safe to do so. 623 + * As with rcu_preempt_need_deferred_qs(), "safe" involves only 624 + * not being in an RCU read-side critical section. The caller must 625 + * evaluate safety in terms of interrupt, softirq, and preemption 626 + * disabling. 627 + */ 628 + static void rcu_preempt_deferred_qs(struct task_struct *t) 629 + { 630 + unsigned long flags; 631 + bool couldrecurse = t->rcu_read_lock_nesting >= 0; 632 + 633 + if (!rcu_preempt_need_deferred_qs(t)) 634 + return; 635 + if (couldrecurse) 636 + t->rcu_read_lock_nesting -= INT_MIN; 637 + local_irq_save(flags); 638 + rcu_preempt_deferred_qs_irqrestore(t, flags); 639 + if (couldrecurse) 640 + t->rcu_read_lock_nesting += INT_MIN; 641 + } 642 + 643 + /* 644 + * Handle special cases during rcu_read_unlock(), such as needing to 645 + * notify RCU core processing or task having blocked during the RCU 646 + * read-side critical section. 647 + */ 648 + static void rcu_read_unlock_special(struct task_struct *t) 649 + { 650 + unsigned long flags; 651 + bool preempt_bh_were_disabled = 652 + !!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)); 653 + bool irqs_were_disabled; 654 + 655 + /* NMI handlers cannot block and cannot safely manipulate state. */ 656 + if (in_nmi()) 657 + return; 658 + 659 + local_irq_save(flags); 660 + irqs_were_disabled = irqs_disabled_flags(flags); 661 + if ((preempt_bh_were_disabled || irqs_were_disabled) && 662 + t->rcu_read_unlock_special.b.blocked) { 663 + /* Need to defer quiescent state until everything is enabled. 
*/ 664 + raise_softirq_irqoff(RCU_SOFTIRQ); 665 + local_irq_restore(flags); 666 + return; 667 + } 668 + rcu_preempt_deferred_qs_irqrestore(t, flags); 587 669 } 588 670 589 671 /* ··· 683 633 * Dump detailed information for all tasks blocking the current RCU 684 634 * grace period. 685 635 */ 686 - static void rcu_print_detail_task_stall(struct rcu_state *rsp) 636 + static void rcu_print_detail_task_stall(void) 687 637 { 688 - struct rcu_node *rnp = rcu_get_root(rsp); 638 + struct rcu_node *rnp = rcu_get_root(); 689 639 690 640 rcu_print_detail_task_stall_rnp(rnp); 691 - rcu_for_each_leaf_node(rsp, rnp) 641 + rcu_for_each_leaf_node(rnp) 692 642 rcu_print_detail_task_stall_rnp(rnp); 693 643 } 694 644 ··· 756 706 * Also, if there are blocked tasks on the list, they automatically 757 707 * block the newly created grace period, so set up ->gp_tasks accordingly. 758 708 */ 759 - static void 760 - rcu_preempt_check_blocked_tasks(struct rcu_state *rsp, struct rcu_node *rnp) 709 + static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp) 761 710 { 762 711 struct task_struct *t; 763 712 764 713 RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_check_blocked_tasks() invoked with preemption enabled!!!\n"); 765 714 if (WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp))) 766 - dump_blkd_tasks(rsp, rnp, 10); 715 + dump_blkd_tasks(rnp, 10); 767 716 if (rcu_preempt_has_tasks(rnp) && 768 717 (rnp->qsmaskinit || rnp->wait_blkd_tasks)) { 769 718 rnp->gp_tasks = rnp->blkd_tasks.next; ··· 781 732 * 782 733 * Caller must disable hard irqs. 
783 734 */ 784 - static void rcu_preempt_check_callbacks(void) 735 + static void rcu_flavor_check_callbacks(int user) 785 736 { 786 - struct rcu_state *rsp = &rcu_preempt_state; 787 737 struct task_struct *t = current; 788 738 789 - if (t->rcu_read_lock_nesting == 0) { 790 - rcu_preempt_qs(); 739 + if (user || rcu_is_cpu_rrupt_from_idle()) { 740 + rcu_note_voluntary_context_switch(current); 741 + } 742 + if (t->rcu_read_lock_nesting > 0 || 743 + (preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK))) { 744 + /* No QS, force context switch if deferred. */ 745 + if (rcu_preempt_need_deferred_qs(t)) { 746 + set_tsk_need_resched(t); 747 + set_preempt_need_resched(); 748 + } 749 + } else if (rcu_preempt_need_deferred_qs(t)) { 750 + rcu_preempt_deferred_qs(t); /* Report deferred QS. */ 751 + return; 752 + } else if (!t->rcu_read_lock_nesting) { 753 + rcu_qs(); /* Report immediate QS. */ 791 754 return; 792 755 } 756 + 757 + /* If GP is oldish, ask for help from rcu_read_unlock_special(). */ 793 758 if (t->rcu_read_lock_nesting > 0 && 794 - __this_cpu_read(rcu_data_p->core_needs_qs) && 795 - __this_cpu_read(rcu_data_p->cpu_no_qs.b.norm) && 759 + __this_cpu_read(rcu_data.core_needs_qs) && 760 + __this_cpu_read(rcu_data.cpu_no_qs.b.norm) && 796 761 !t->rcu_read_unlock_special.b.need_qs && 797 - time_after(jiffies, rsp->gp_start + HZ)) 762 + time_after(jiffies, rcu_state.gp_start + HZ)) 798 763 t->rcu_read_unlock_special.b.need_qs = true; 799 764 } 800 - 801 - /** 802 - * call_rcu() - Queue an RCU callback for invocation after a grace period. 803 - * @head: structure to be used for queueing the RCU updates. 804 - * @func: actual callback function to be invoked after the grace period 805 - * 806 - * The callback function will be invoked some time after a full grace 807 - * period elapses, in other words after all pre-existing RCU read-side 808 - * critical sections have completed. 
However, the callback function 809 - * might well execute concurrently with RCU read-side critical sections 810 - * that started after call_rcu() was invoked. RCU read-side critical 811 - * sections are delimited by rcu_read_lock() and rcu_read_unlock(), 812 - * and may be nested. 813 - * 814 - * Note that all CPUs must agree that the grace period extended beyond 815 - * all pre-existing RCU read-side critical section. On systems with more 816 - * than one CPU, this means that when "func()" is invoked, each CPU is 817 - * guaranteed to have executed a full memory barrier since the end of its 818 - * last RCU read-side critical section whose beginning preceded the call 819 - * to call_rcu(). It also means that each CPU executing an RCU read-side 820 - * critical section that continues beyond the start of "func()" must have 821 - * executed a memory barrier after the call_rcu() but before the beginning 822 - * of that RCU read-side critical section. Note that these guarantees 823 - * include CPUs that are offline, idle, or executing in user mode, as 824 - * well as CPUs that are executing in the kernel. 825 - * 826 - * Furthermore, if CPU A invoked call_rcu() and CPU B invoked the 827 - * resulting RCU callback function "func()", then both CPU A and CPU B are 828 - * guaranteed to execute a full memory barrier during the time interval 829 - * between the call to call_rcu() and the invocation of "func()" -- even 830 - * if CPU A and CPU B are the same CPU (but again only if the system has 831 - * more than one CPU). 832 - */ 833 - void call_rcu(struct rcu_head *head, rcu_callback_t func) 834 - { 835 - __call_rcu(head, func, rcu_state_p, -1, 0); 836 - } 837 - EXPORT_SYMBOL_GPL(call_rcu); 838 765 839 766 /** 840 767 * synchronize_rcu - wait until a grace period has elapsed. ··· 822 797 * concurrently with new RCU read-side critical sections that began while 823 798 * synchronize_rcu() was waiting. 
RCU read-side critical sections are 824 799 * delimited by rcu_read_lock() and rcu_read_unlock(), and may be nested. 800 + * In addition, regions of code across which interrupts, preemption, or 801 + * softirqs have been disabled also serve as RCU read-side critical 802 + * sections. This includes hardware interrupt handlers, softirq handlers, 803 + * and NMI handlers. 825 804 * 826 - * See the description of synchronize_sched() for more detailed 827 - * information on memory-ordering guarantees. However, please note 828 - * that -only- the memory-ordering guarantees apply. For example, 829 - * synchronize_rcu() is -not- guaranteed to wait on things like code 830 - * protected by preempt_disable(), instead, synchronize_rcu() is -only- 831 - * guaranteed to wait on RCU read-side critical sections, that is, sections 832 - * of code protected by rcu_read_lock(). 805 + * Note that this guarantee implies further memory-ordering guarantees. 806 + * On systems with more than one CPU, when synchronize_rcu() returns, 807 + * each CPU is guaranteed to have executed a full memory barrier since 808 + * the end of its last RCU read-side critical section whose beginning 809 + * preceded the call to synchronize_rcu(). In addition, each CPU having 810 + * an RCU read-side critical section that extends beyond the return from 811 + * synchronize_rcu() is guaranteed to have executed a full memory barrier 812 + * after the beginning of synchronize_rcu() and before the beginning of 813 + * that RCU read-side critical section. Note that these guarantees include 814 + * CPUs that are offline, idle, or executing in user mode, as well as CPUs 815 + * that are executing in the kernel. 
816 + * 817 + * Furthermore, if CPU A invoked synchronize_rcu(), which returned 818 + * to its caller on CPU B, then both CPU A and CPU B are guaranteed 819 + * to have executed a full memory barrier during the execution of 820 + * synchronize_rcu() -- even if CPU A and CPU B are the same CPU (but 821 + * again only if the system has more than one CPU). 833 822 */ 834 823 void synchronize_rcu(void) 835 824 { ··· 859 820 wait_rcu_gp(call_rcu); 860 821 } 861 822 EXPORT_SYMBOL_GPL(synchronize_rcu); 862 - 863 - /** 864 - * rcu_barrier - Wait until all in-flight call_rcu() callbacks complete. 865 - * 866 - * Note that this primitive does not necessarily wait for an RCU grace period 867 - * to complete. For example, if there are no RCU callbacks queued anywhere 868 - * in the system, then rcu_barrier() is within its rights to return 869 - * immediately, without waiting for anything, much less an RCU grace period. 870 - */ 871 - void rcu_barrier(void) 872 - { 873 - _rcu_barrier(rcu_state_p); 874 - } 875 - EXPORT_SYMBOL_GPL(rcu_barrier); 876 - 877 - /* 878 - * Initialize preemptible RCU's state structures. 879 - */ 880 - static void __init __rcu_init_preempt(void) 881 - { 882 - rcu_init_one(rcu_state_p); 883 - } 884 823 885 824 /* 886 825 * Check for a task exiting while in a preemptible-RCU read-side ··· 876 859 barrier(); 877 860 t->rcu_read_unlock_special.b.blocked = true; 878 861 __rcu_read_unlock(); 862 + rcu_preempt_deferred_qs(current); 879 863 } 880 864 881 865 /* ··· 884 866 * specified number of elements. 
885 867 */ 886 868 static void 887 - dump_blkd_tasks(struct rcu_state *rsp, struct rcu_node *rnp, int ncheck) 869 + dump_blkd_tasks(struct rcu_node *rnp, int ncheck) 888 870 { 889 871 int cpu; 890 872 int i; ··· 911 893 } 912 894 pr_cont("\n"); 913 895 for (cpu = rnp->grplo; cpu <= rnp->grphi; cpu++) { 914 - rdp = per_cpu_ptr(rsp->rda, cpu); 896 + rdp = per_cpu_ptr(&rcu_data, cpu); 915 897 onl = !!(rdp->grpmask & rcu_rnp_online_cpus(rnp)); 916 898 pr_info("\t%d: %c online: %ld(%d) offline: %ld(%d)\n", 917 899 cpu, ".o"[onl], ··· 921 903 } 922 904 923 905 #else /* #ifdef CONFIG_PREEMPT_RCU */ 924 - 925 - static struct rcu_state *const rcu_state_p = &rcu_sched_state; 926 906 927 907 /* 928 908 * Tell them what RCU they are running. ··· 932 916 } 933 917 934 918 /* 935 - * Because preemptible RCU does not exist, we never have to check for 936 - * CPUs being in quiescent states. 919 + * Note a quiescent state for PREEMPT=n. Because we do not need to know 920 + * how many quiescent states passed, just if there was at least one since 921 + * the start of the grace period, this just sets a flag. The caller must 922 + * have disabled preemption. 937 923 */ 938 - static void rcu_preempt_note_context_switch(bool preempt) 924 + static void rcu_qs(void) 939 925 { 926 + RCU_LOCKDEP_WARN(preemptible(), "rcu_qs() invoked with preemption enabled!!!"); 927 + if (!__this_cpu_read(rcu_data.cpu_no_qs.s)) 928 + return; 929 + trace_rcu_grace_period(TPS("rcu_sched"), 930 + __this_cpu_read(rcu_data.gp_seq), TPS("cpuqs")); 931 + __this_cpu_write(rcu_data.cpu_no_qs.b.norm, false); 932 + if (!__this_cpu_read(rcu_data.cpu_no_qs.b.exp)) 933 + return; 934 + __this_cpu_write(rcu_data.cpu_no_qs.b.exp, false); 935 + rcu_report_exp_rdp(this_cpu_ptr(&rcu_data)); 940 936 } 937 + 938 + /* 939 + * Register an urgently needed quiescent state. 
If there is an 940 + * emergency, invoke rcu_momentary_dyntick_idle() to do a heavy-weight 941 + * dyntick-idle quiescent state visible to other CPUs, which will in 942 + * some cases serve for expedited as well as normal grace periods. 943 + * Either way, register a lightweight quiescent state. 944 + * 945 + * The barrier() calls are redundant in the common case when this is 946 + * called externally, but just in case this is called from within this 947 + * file. 948 + * 949 + */ 950 + void rcu_all_qs(void) 951 + { 952 + unsigned long flags; 953 + 954 + if (!raw_cpu_read(rcu_data.rcu_urgent_qs)) 955 + return; 956 + preempt_disable(); 957 + /* Load rcu_urgent_qs before other flags. */ 958 + if (!smp_load_acquire(this_cpu_ptr(&rcu_data.rcu_urgent_qs))) { 959 + preempt_enable(); 960 + return; 961 + } 962 + this_cpu_write(rcu_data.rcu_urgent_qs, false); 963 + barrier(); /* Avoid RCU read-side critical sections leaking down. */ 964 + if (unlikely(raw_cpu_read(rcu_data.rcu_need_heavy_qs))) { 965 + local_irq_save(flags); 966 + rcu_momentary_dyntick_idle(); 967 + local_irq_restore(flags); 968 + } 969 + rcu_qs(); 970 + barrier(); /* Avoid RCU read-side critical sections leaking up. */ 971 + preempt_enable(); 972 + } 973 + EXPORT_SYMBOL_GPL(rcu_all_qs); 974 + 975 + /* 976 + * Note a PREEMPT=n context switch. The caller must have disabled interrupts. 977 + */ 978 + void rcu_note_context_switch(bool preempt) 979 + { 980 + barrier(); /* Avoid RCU read-side critical sections leaking down. */ 981 + trace_rcu_utilization(TPS("Start context switch")); 982 + rcu_qs(); 983 + /* Load rcu_urgent_qs before other flags. 
*/ 984 + if (!smp_load_acquire(this_cpu_ptr(&rcu_data.rcu_urgent_qs))) 985 + goto out; 986 + this_cpu_write(rcu_data.rcu_urgent_qs, false); 987 + if (unlikely(raw_cpu_read(rcu_data.rcu_need_heavy_qs))) 988 + rcu_momentary_dyntick_idle(); 989 + if (!preempt) 990 + rcu_tasks_qs(current); 991 + out: 992 + trace_rcu_utilization(TPS("End context switch")); 993 + barrier(); /* Avoid RCU read-side critical sections leaking up. */ 994 + } 995 + EXPORT_SYMBOL_GPL(rcu_note_context_switch); 941 996 942 997 /* 943 998 * Because preemptible RCU does not exist, there are never any preempted ··· 1028 941 } 1029 942 1030 943 /* 944 + * Because there is no preemptible RCU, there can be no deferred quiescent 945 + * states. 946 + */ 947 + static bool rcu_preempt_need_deferred_qs(struct task_struct *t) 948 + { 949 + return false; 950 + } 951 + static void rcu_preempt_deferred_qs(struct task_struct *t) { } 952 + 953 + /* 1031 954 * Because preemptible RCU does not exist, we never have to check for 1032 955 * tasks blocked within RCU read-side critical sections. 1033 956 */ 1034 - static void rcu_print_detail_task_stall(struct rcu_state *rsp) 957 + static void rcu_print_detail_task_stall(void) 1035 958 { 1036 959 } 1037 960 ··· 1069 972 * so there is no need to check for blocked tasks. So check only for 1070 973 * bogus qsmask values. 1071 974 */ 1072 - static void 1073 - rcu_preempt_check_blocked_tasks(struct rcu_state *rsp, struct rcu_node *rnp) 975 + static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp) 1074 976 { 1075 977 WARN_ON_ONCE(rnp->qsmask); 1076 978 } 1077 979 1078 980 /* 1079 - * Because preemptible RCU does not exist, it never has any callbacks 1080 - * to check. 981 + * Check to see if this CPU is in a non-context-switch quiescent state 982 + * (user mode or idle loop for rcu, non-softirq execution for rcu_bh). 983 + * Also schedule RCU core processing. 984 + * 985 + * This function must be called from hardirq context. 
It is normally 986 + * invoked from the scheduling-clock interrupt. 1081 987 */ 1082 - static void rcu_preempt_check_callbacks(void) 988 + static void rcu_flavor_check_callbacks(int user) 1083 989 { 990 + if (user || rcu_is_cpu_rrupt_from_idle()) { 991 + 992 + /* 993 + * Get here if this CPU took its interrupt from user 994 + * mode or from the idle loop, and if this is not a 995 + * nested interrupt. In this case, the CPU is in 996 + * a quiescent state, so note it. 997 + * 998 + * No memory barrier is required here because rcu_qs() 999 + * references only CPU-local variables that other CPUs 1000 + * neither access nor modify, at least not while the 1001 + * corresponding CPU is online. 1002 + */ 1003 + 1004 + rcu_qs(); 1005 + } 1084 1006 } 1085 1007 1086 - /* 1087 - * Because preemptible RCU does not exist, rcu_barrier() is just 1088 - * another name for rcu_barrier_sched(). 1089 - */ 1090 - void rcu_barrier(void) 1008 + /* PREEMPT=n implementation of synchronize_rcu(). */ 1009 + void synchronize_rcu(void) 1091 1010 { 1092 - rcu_barrier_sched(); 1011 + RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || 1012 + lock_is_held(&rcu_lock_map) || 1013 + lock_is_held(&rcu_sched_lock_map), 1014 + "Illegal synchronize_rcu() in RCU read-side critical section"); 1015 + if (rcu_blocking_is_gp()) 1016 + return; 1017 + if (rcu_gp_is_expedited()) 1018 + synchronize_rcu_expedited(); 1019 + else 1020 + wait_rcu_gp(call_rcu); 1093 1021 } 1094 - EXPORT_SYMBOL_GPL(rcu_barrier); 1095 - 1096 - /* 1097 - * Because preemptible RCU does not exist, it need not be initialized. 1098 - */ 1099 - static void __init __rcu_init_preempt(void) 1100 - { 1101 - } 1022 + EXPORT_SYMBOL_GPL(synchronize_rcu); 1102 1023 1103 1024 /* 1104 1025 * Because preemptible RCU does not exist, tasks cannot possibly exit ··· 1130 1015 * Dump the guaranteed-empty blocked-tasks state. Trust but verify. 
1131 1016 */ 1132 1017 static void 1133 - dump_blkd_tasks(struct rcu_state *rsp, struct rcu_node *rnp, int ncheck) 1018 + dump_blkd_tasks(struct rcu_node *rnp, int ncheck) 1134 1019 { 1135 1020 WARN_ON_ONCE(!list_empty(&rnp->blkd_tasks)); 1136 1021 } ··· 1327 1212 * already exist. We only create this kthread for preemptible RCU. 1328 1213 * Returns zero if all is well, a negated errno otherwise. 1329 1214 */ 1330 - static int rcu_spawn_one_boost_kthread(struct rcu_state *rsp, 1331 - struct rcu_node *rnp) 1215 + static int rcu_spawn_one_boost_kthread(struct rcu_node *rnp) 1332 1216 { 1333 - int rnp_index = rnp - &rsp->node[0]; 1217 + int rnp_index = rnp - rcu_get_root(); 1334 1218 unsigned long flags; 1335 1219 struct sched_param sp; 1336 1220 struct task_struct *t; 1337 1221 1338 - if (rcu_state_p != rsp) 1222 + if (!IS_ENABLED(CONFIG_PREEMPT_RCU)) 1339 1223 return 0; 1340 1224 1341 1225 if (!rcu_scheduler_fully_active || rcu_rnp_online_cpus(rnp) == 0) 1342 1226 return 0; 1343 1227 1344 - rsp->boost = 1; 1228 + rcu_state.boost = 1; 1345 1229 if (rnp->boost_kthread_task != NULL) 1346 1230 return 0; 1347 1231 t = kthread_create(rcu_boost_kthread, (void *)rnp, ··· 1358 1244 1359 1245 static void rcu_kthread_do_work(void) 1360 1246 { 1361 - rcu_do_batch(&rcu_sched_state, this_cpu_ptr(&rcu_sched_data)); 1362 - rcu_do_batch(&rcu_bh_state, this_cpu_ptr(&rcu_bh_data)); 1363 - rcu_do_batch(&rcu_preempt_state, this_cpu_ptr(&rcu_preempt_data)); 1247 + rcu_do_batch(this_cpu_ptr(&rcu_data)); 1364 1248 } 1365 1249 1366 1250 static void rcu_cpu_kthread_setup(unsigned int cpu) ··· 1380 1268 } 1381 1269 1382 1270 /* 1383 - * Per-CPU kernel thread that invokes RCU callbacks. This replaces the 1384 - * RCU softirq used in flavors and configurations of RCU that do not 1385 - * support RCU priority boosting. 1271 + * Per-CPU kernel thread that invokes RCU callbacks. This replaces 1272 + * the RCU softirq used in configurations of RCU that do not support RCU 1273 + * priority boosting. 
1386 1274 */ 1387 1275 static void rcu_cpu_kthread(unsigned int cpu) 1388 1276 { ··· 1465 1353 for_each_possible_cpu(cpu) 1466 1354 per_cpu(rcu_cpu_has_work, cpu) = 0; 1467 1355 BUG_ON(smpboot_register_percpu_thread(&rcu_cpu_thread_spec)); 1468 - rcu_for_each_leaf_node(rcu_state_p, rnp) 1469 - (void)rcu_spawn_one_boost_kthread(rcu_state_p, rnp); 1356 + rcu_for_each_leaf_node(rnp) 1357 + (void)rcu_spawn_one_boost_kthread(rnp); 1470 1358 } 1471 1359 1472 1360 static void rcu_prepare_kthreads(int cpu) 1473 1361 { 1474 - struct rcu_data *rdp = per_cpu_ptr(rcu_state_p->rda, cpu); 1362 + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); 1475 1363 struct rcu_node *rnp = rdp->mynode; 1476 1364 1477 1365 /* Fire up the incoming CPU's kthread and leaf rcu_node kthread. */ 1478 1366 if (rcu_scheduler_fully_active) 1479 - (void)rcu_spawn_one_boost_kthread(rcu_state_p, rnp); 1367 + (void)rcu_spawn_one_boost_kthread(rnp); 1480 1368 } 1481 1369 1482 1370 #else /* #ifdef CONFIG_RCU_BOOST */ ··· 1523 1411 * 1 if so. This function is part of the RCU implementation; it is -not- 1524 1412 * an exported member of the RCU API. 1525 1413 * 1526 - * Because we not have RCU_FAST_NO_HZ, just check whether this CPU needs 1527 - * any flavor of RCU. 1414 + * Because we do not have RCU_FAST_NO_HZ, just check whether or not this 1415 + * CPU has RCU callbacks queued. 1528 1416 */ 1529 1417 int rcu_needs_cpu(u64 basemono, u64 *nextevt) 1530 1418 { ··· 1590 1478 module_param(rcu_idle_lazy_gp_delay, int, 0644); 1591 1479 1592 1480 /* 1593 - * Try to advance callbacks for all flavors of RCU on the current CPU, but 1594 - * only if it has been awhile since the last time we did so. Afterwards, 1595 - * if there are any callbacks ready for immediate invocation, return true. 1481 + * Try to advance callbacks on the current CPU, but only if it has been 1482 + * a while since the last time we did so. Afterwards, if there are any 1483 + * callbacks ready for immediate invocation, return true.
1596 1484 */ 1597 1485 static bool __maybe_unused rcu_try_advance_all_cbs(void) 1598 1486 { 1599 1487 bool cbs_ready = false; 1600 - struct rcu_data *rdp; 1601 - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); 1488 + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); 1602 1489 struct rcu_node *rnp; 1603 - struct rcu_state *rsp; 1604 1490 1605 1491 /* Exit early if we advanced recently. */ 1606 - if (jiffies == rdtp->last_advance_all) 1492 + if (jiffies == rdp->last_advance_all) 1607 1493 return false; 1608 - rdtp->last_advance_all = jiffies; 1494 + rdp->last_advance_all = jiffies; 1609 1495 1610 - for_each_rcu_flavor(rsp) { 1611 - rdp = this_cpu_ptr(rsp->rda); 1612 - rnp = rdp->mynode; 1496 + rnp = rdp->mynode; 1613 1497 1614 - /* 1615 - * Don't bother checking unless a grace period has 1616 - * completed since we last checked and there are 1617 - * callbacks not yet ready to invoke. 1618 - */ 1619 - if ((rcu_seq_completed_gp(rdp->gp_seq, 1620 - rcu_seq_current(&rnp->gp_seq)) || 1621 - unlikely(READ_ONCE(rdp->gpwrap))) && 1622 - rcu_segcblist_pend_cbs(&rdp->cblist)) 1623 - note_gp_changes(rsp, rdp); 1498 + /* 1499 + * Don't bother checking unless a grace period has 1500 + * completed since we last checked and there are 1501 + * callbacks not yet ready to invoke. 
1502 + */ 1503 + if ((rcu_seq_completed_gp(rdp->gp_seq, 1504 + rcu_seq_current(&rnp->gp_seq)) || 1505 + unlikely(READ_ONCE(rdp->gpwrap))) && 1506 + rcu_segcblist_pend_cbs(&rdp->cblist)) 1507 + note_gp_changes(rdp); 1624 1508 1625 - if (rcu_segcblist_ready_cbs(&rdp->cblist)) 1626 - cbs_ready = true; 1627 - } 1509 + if (rcu_segcblist_ready_cbs(&rdp->cblist)) 1510 + cbs_ready = true; 1628 1511 return cbs_ready; 1629 1512 } 1630 1513 ··· 1633 1526 */ 1634 1527 int rcu_needs_cpu(u64 basemono, u64 *nextevt) 1635 1528 { 1636 - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); 1529 + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); 1637 1530 unsigned long dj; 1638 1531 1639 1532 lockdep_assert_irqs_disabled(); 1640 1533 1641 1534 /* Snapshot to detect later posting of non-lazy callback. */ 1642 - rdtp->nonlazy_posted_snap = rdtp->nonlazy_posted; 1535 + rdp->nonlazy_posted_snap = rdp->nonlazy_posted; 1643 1536 1644 1537 /* If no callbacks, RCU doesn't need the CPU. */ 1645 - if (!rcu_cpu_has_callbacks(&rdtp->all_lazy)) { 1538 + if (!rcu_cpu_has_callbacks(&rdp->all_lazy)) { 1646 1539 *nextevt = KTIME_MAX; 1647 1540 return 0; 1648 1541 } ··· 1653 1546 invoke_rcu_core(); 1654 1547 return 1; 1655 1548 } 1656 - rdtp->last_accelerate = jiffies; 1549 + rdp->last_accelerate = jiffies; 1657 1550 1658 1551 /* Request timer delay depending on laziness, and round. */ 1659 - if (!rdtp->all_lazy) { 1552 + if (!rdp->all_lazy) { 1660 1553 dj = round_up(rcu_idle_gp_delay + jiffies, 1661 1554 rcu_idle_gp_delay) - jiffies; 1662 1555 } else { ··· 1679 1572 static void rcu_prepare_for_idle(void) 1680 1573 { 1681 1574 bool needwake; 1682 - struct rcu_data *rdp; 1683 - struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks); 1575 + struct rcu_data *rdp = this_cpu_ptr(&rcu_data); 1684 1576 struct rcu_node *rnp; 1685 - struct rcu_state *rsp; 1686 1577 int tne; 1687 1578 1688 1579 lockdep_assert_irqs_disabled(); ··· 1689 1584 1690 1585 /* Handle nohz enablement switches conservatively. 
*/ 1691 1586 tne = READ_ONCE(tick_nohz_active); 1692 - if (tne != rdtp->tick_nohz_enabled_snap) { 1587 + if (tne != rdp->tick_nohz_enabled_snap) { 1693 1588 if (rcu_cpu_has_callbacks(NULL)) 1694 1589 invoke_rcu_core(); /* force nohz to see update. */ 1695 - rdtp->tick_nohz_enabled_snap = tne; 1590 + rdp->tick_nohz_enabled_snap = tne; 1696 1591 return; 1697 1592 } 1698 1593 if (!tne) ··· 1703 1598 * callbacks, invoke RCU core for the side-effect of recalculating 1704 1599 * idle duration on re-entry to idle. 1705 1600 */ 1706 - if (rdtp->all_lazy && 1707 - rdtp->nonlazy_posted != rdtp->nonlazy_posted_snap) { 1708 - rdtp->all_lazy = false; 1709 - rdtp->nonlazy_posted_snap = rdtp->nonlazy_posted; 1601 + if (rdp->all_lazy && 1602 + rdp->nonlazy_posted != rdp->nonlazy_posted_snap) { 1603 + rdp->all_lazy = false; 1604 + rdp->nonlazy_posted_snap = rdp->nonlazy_posted; 1710 1605 invoke_rcu_core(); 1711 1606 return; 1712 1607 } ··· 1715 1610 * If we have not yet accelerated this jiffy, accelerate all 1716 1611 * callbacks on this CPU. 1717 1612 */ 1718 - if (rdtp->last_accelerate == jiffies) 1613 + if (rdp->last_accelerate == jiffies) 1719 1614 return; 1720 - rdtp->last_accelerate = jiffies; 1721 - for_each_rcu_flavor(rsp) { 1722 - rdp = this_cpu_ptr(rsp->rda); 1723 - if (!rcu_segcblist_pend_cbs(&rdp->cblist)) 1724 - continue; 1615 + rdp->last_accelerate = jiffies; 1616 + if (rcu_segcblist_pend_cbs(&rdp->cblist)) { 1725 1617 rnp = rdp->mynode; 1726 1618 raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */ 1727 - needwake = rcu_accelerate_cbs(rsp, rnp, rdp); 1619 + needwake = rcu_accelerate_cbs(rnp, rdp); 1728 1620 raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. 
*/ 1729 1621 if (needwake) 1730 - rcu_gp_kthread_wake(rsp); 1622 + rcu_gp_kthread_wake(); 1731 1623 } 1732 1624 } 1733 1625 ··· 1752 1650 */ 1753 1651 static void rcu_idle_count_callbacks_posted(void) 1754 1652 { 1755 - __this_cpu_add(rcu_dynticks.nonlazy_posted, 1); 1653 + __this_cpu_add(rcu_data.nonlazy_posted, 1); 1756 1654 } 1757 - 1758 - /* 1759 - * Data for flushing lazy RCU callbacks at OOM time. 1760 - */ 1761 - static atomic_t oom_callback_count; 1762 - static DECLARE_WAIT_QUEUE_HEAD(oom_callback_wq); 1763 - 1764 - /* 1765 - * RCU OOM callback -- decrement the outstanding count and deliver the 1766 - * wake-up if we are the last one. 1767 - */ 1768 - static void rcu_oom_callback(struct rcu_head *rhp) 1769 - { 1770 - if (atomic_dec_and_test(&oom_callback_count)) 1771 - wake_up(&oom_callback_wq); 1772 - } 1773 - 1774 - /* 1775 - * Post an rcu_oom_notify callback on the current CPU if it has at 1776 - * least one lazy callback. This will unnecessarily post callbacks 1777 - * to CPUs that already have a non-lazy callback at the end of their 1778 - * callback list, but this is an infrequent operation, so accept some 1779 - * extra overhead to keep things simple. 1780 - */ 1781 - static void rcu_oom_notify_cpu(void *unused) 1782 - { 1783 - struct rcu_state *rsp; 1784 - struct rcu_data *rdp; 1785 - 1786 - for_each_rcu_flavor(rsp) { 1787 - rdp = raw_cpu_ptr(rsp->rda); 1788 - if (rcu_segcblist_n_lazy_cbs(&rdp->cblist)) { 1789 - atomic_inc(&oom_callback_count); 1790 - rsp->call(&rdp->oom_head, rcu_oom_callback); 1791 - } 1792 - } 1793 - } 1794 - 1795 - /* 1796 - * If low on memory, ensure that each CPU has a non-lazy callback. 1797 - * This will wake up CPUs that have only lazy callbacks, in turn 1798 - * ensuring that they free up the corresponding memory in a timely manner. 1799 - * Because an uncertain amount of memory will be freed in some uncertain 1800 - * timeframe, we do not claim to have freed anything. 
1801 - */ 1802 - static int rcu_oom_notify(struct notifier_block *self, 1803 - unsigned long notused, void *nfreed) 1804 - { 1805 - int cpu; 1806 - 1807 - /* Wait for callbacks from earlier instance to complete. */ 1808 - wait_event(oom_callback_wq, atomic_read(&oom_callback_count) == 0); 1809 - smp_mb(); /* Ensure callback reuse happens after callback invocation. */ 1810 - 1811 - /* 1812 - * Prevent premature wakeup: ensure that all increments happen 1813 - * before there is a chance of the counter reaching zero. 1814 - */ 1815 - atomic_set(&oom_callback_count, 1); 1816 - 1817 - for_each_online_cpu(cpu) { 1818 - smp_call_function_single(cpu, rcu_oom_notify_cpu, NULL, 1); 1819 - cond_resched_tasks_rcu_qs(); 1820 - } 1821 - 1822 - /* Unconditionally decrement: no need to wake ourselves up. */ 1823 - atomic_dec(&oom_callback_count); 1824 - 1825 - return NOTIFY_OK; 1826 - } 1827 - 1828 - static struct notifier_block rcu_oom_nb = { 1829 - .notifier_call = rcu_oom_notify 1830 - }; 1831 - 1832 - static int __init rcu_register_oom_notifier(void) 1833 - { 1834 - register_oom_notifier(&rcu_oom_nb); 1835 - return 0; 1836 - } 1837 - early_initcall(rcu_register_oom_notifier); 1838 1655 1839 1656 #endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */ 1840 1657 ··· 1761 1740 1762 1741 static void print_cpu_stall_fast_no_hz(char *cp, int cpu) 1763 1742 { 1764 - struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu); 1765 - unsigned long nlpd = rdtp->nonlazy_posted - rdtp->nonlazy_posted_snap; 1743 + struct rcu_data *rdp = &per_cpu(rcu_data, cpu); 1744 + unsigned long nlpd = rdp->nonlazy_posted - rdp->nonlazy_posted_snap; 1766 1745 1767 1746 sprintf(cp, "last_accelerate: %04lx/%04lx, nonlazy_posted: %ld, %c%c", 1768 - rdtp->last_accelerate & 0xffff, jiffies & 0xffff, 1747 + rdp->last_accelerate & 0xffff, jiffies & 0xffff, 1769 1748 ulong2long(nlpd), 1770 - rdtp->all_lazy ? 'L' : '.', 1771 - rdtp->tick_nohz_enabled_snap ? '.' : 'D'); 1749 + rdp->all_lazy ? 
'L' : '.', 1750 + rdp->tick_nohz_enabled_snap ? '.' : 'D'); 1772 1751 } 1773 1752 1774 1753 #else /* #ifdef CONFIG_RCU_FAST_NO_HZ */ ··· 1789 1768 /* 1790 1769 * Print out diagnostic information for the specified stalled CPU. 1791 1770 * 1792 - * If the specified CPU is aware of the current RCU grace period 1793 - * (flavor specified by rsp), then print the number of scheduling 1794 - * clock interrupts the CPU has taken during the time that it has 1795 - * been aware. Otherwise, print the number of RCU grace periods 1796 - * that this CPU is ignorant of, for example, "1" if the CPU was 1797 - * aware of the previous grace period. 1771 + * If the specified CPU is aware of the current RCU grace period, then 1772 + * print the number of scheduling clock interrupts the CPU has taken 1773 + * during the time that it has been aware. Otherwise, print the number 1774 + * of RCU grace periods that this CPU is ignorant of, for example, "1" 1775 + * if the CPU was aware of the previous grace period. 1798 1776 * 1799 1777 * Also print out idle and (if CONFIG_RCU_FAST_NO_HZ) idle-entry info. 1800 1778 */ 1801 - static void print_cpu_stall_info(struct rcu_state *rsp, int cpu) 1779 + static void print_cpu_stall_info(int cpu) 1802 1780 { 1803 1781 unsigned long delta; 1804 1782 char fast_no_hz[72]; 1805 - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); 1806 - struct rcu_dynticks *rdtp = rdp->dynticks; 1783 + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); 1807 1784 char *ticks_title; 1808 1785 unsigned long ticks_value; 1809 1786 ··· 1811 1792 */ 1812 1793 touch_nmi_watchdog(); 1813 1794 1814 - ticks_value = rcu_seq_ctr(rsp->gp_seq - rdp->gp_seq); 1795 + ticks_value = rcu_seq_ctr(rcu_state.gp_seq - rdp->gp_seq); 1815 1796 if (ticks_value) { 1816 1797 ticks_title = "GPs behind"; 1817 1798 } else { ··· 1829 1810 rdp->rcu_iw_pending ? 
(int)min(delta, 9UL) + '0' : 1830 1811 "!."[!delta], 1831 1812 ticks_value, ticks_title, 1832 - rcu_dynticks_snap(rdtp) & 0xfff, 1833 - rdtp->dynticks_nesting, rdtp->dynticks_nmi_nesting, 1813 + rcu_dynticks_snap(rdp) & 0xfff, 1814 + rdp->dynticks_nesting, rdp->dynticks_nmi_nesting, 1834 1815 rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu), 1835 - READ_ONCE(rsp->n_force_qs) - rsp->n_force_qs_gpstart, 1816 + READ_ONCE(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart, 1836 1817 fast_no_hz); 1837 1818 } 1838 1819 ··· 1842 1823 pr_err("\t"); 1843 1824 } 1844 1825 1845 - /* Zero ->ticks_this_gp for all flavors of RCU. */ 1826 + /* Zero ->ticks_this_gp and snapshot the number of RCU softirq handlers. */ 1846 1827 static void zero_cpu_stall_ticks(struct rcu_data *rdp) 1847 1828 { 1848 1829 rdp->ticks_this_gp = 0; 1849 1830 rdp->softirq_snap = kstat_softirqs_cpu(RCU_SOFTIRQ, smp_processor_id()); 1850 - } 1851 - 1852 - /* Increment ->ticks_this_gp for all flavors of RCU. */ 1853 - static void increment_cpu_stall_ticks(void) 1854 - { 1855 - struct rcu_state *rsp; 1856 - 1857 - for_each_rcu_flavor(rsp) 1858 - raw_cpu_inc(rsp->rda->ticks_this_gp); 1831 + WRITE_ONCE(rdp->last_fqs_resched, jiffies); 1859 1832 } 1860 1833 1861 1834 #ifdef CONFIG_RCU_NOCB_CPU ··· 1969 1958 if (rdp->nocb_defer_wakeup == RCU_NOCB_WAKE_NOT) 1970 1959 mod_timer(&rdp->nocb_timer, jiffies + 1); 1971 1960 WRITE_ONCE(rdp->nocb_defer_wakeup, waketype); 1972 - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, reason); 1961 + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, reason); 1973 1962 raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); 1974 1963 } 1975 1964 1976 1965 /* 1977 - * Does the specified CPU need an RCU callback for the specified flavor 1966 + * Does the specified CPU need an RCU callback for this invocation 1978 1967 * of rcu_barrier()? 
1979 1968 */ 1980 - static bool rcu_nocb_cpu_needs_barrier(struct rcu_state *rsp, int cpu) 1969 + static bool rcu_nocb_cpu_needs_barrier(int cpu) 1981 1970 { 1982 - struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu); 1971 + struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); 1983 1972 unsigned long ret; 1984 1973 #ifdef CONFIG_PROVE_RCU 1985 1974 struct rcu_head *rhp; ··· 1990 1979 * There needs to be a barrier before this function is called, 1991 1980 * but associated with a prior determination that no more 1992 1981 * callbacks would be posted. In the worst case, the first 1993 - * barrier in _rcu_barrier() suffices (but the caller cannot 1982 + * barrier in rcu_barrier() suffices (but the caller cannot 1994 1983 * necessarily rely on this, not a substitute for the caller 1995 1984 * getting the concurrency design right!). There must also be 1996 1985 * a barrier between the following load an posting of a callback ··· 2048 2037 /* If we are not being polled and there is a kthread, awaken it ... */ 2049 2038 t = READ_ONCE(rdp->nocb_kthread); 2050 2039 if (rcu_nocb_poll || !t) { 2051 - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, 2040 + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, 2052 2041 TPS("WakeNotPoll")); 2053 2042 return; 2054 2043 } ··· 2057 2046 if (!irqs_disabled_flags(flags)) { 2058 2047 /* ... if queue was empty ... */ 2059 2048 wake_nocb_leader(rdp, false); 2060 - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, 2049 + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, 2061 2050 TPS("WakeEmpty")); 2062 2051 } else { 2063 2052 wake_nocb_leader_defer(rdp, RCU_NOCB_WAKE, ··· 2068 2057 /* ... or if many callbacks queued. 
*/ 2069 2058 if (!irqs_disabled_flags(flags)) { 2070 2059 wake_nocb_leader(rdp, true); 2071 - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, 2060 + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, 2072 2061 TPS("WakeOvf")); 2073 2062 } else { 2074 2063 wake_nocb_leader_defer(rdp, RCU_NOCB_WAKE_FORCE, ··· 2076 2065 } 2077 2066 rdp->qlen_last_fqs_check = LONG_MAX / 2; 2078 2067 } else { 2079 - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("WakeNot")); 2068 + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeNot")); 2080 2069 } 2081 2070 return; 2082 2071 } ··· 2098 2087 return false; 2099 2088 __call_rcu_nocb_enqueue(rdp, rhp, &rhp->next, 1, lazy, flags); 2100 2089 if (__is_kfree_rcu_offset((unsigned long)rhp->func)) 2101 - trace_rcu_kfree_callback(rdp->rsp->name, rhp, 2090 + trace_rcu_kfree_callback(rcu_state.name, rhp, 2102 2091 (unsigned long)rhp->func, 2103 2092 -atomic_long_read(&rdp->nocb_q_count_lazy), 2104 2093 -atomic_long_read(&rdp->nocb_q_count)); 2105 2094 else 2106 - trace_rcu_callback(rdp->rsp->name, rhp, 2095 + trace_rcu_callback(rcu_state.name, rhp, 2107 2096 -atomic_long_read(&rdp->nocb_q_count_lazy), 2108 2097 -atomic_long_read(&rdp->nocb_q_count)); 2109 2098 ··· 2153 2142 struct rcu_node *rnp = rdp->mynode; 2154 2143 2155 2144 local_irq_save(flags); 2156 - c = rcu_seq_snap(&rdp->rsp->gp_seq); 2145 + c = rcu_seq_snap(&rcu_state.gp_seq); 2157 2146 if (!rdp->gpwrap && ULONG_CMP_GE(rdp->gp_seq_needed, c)) { 2158 2147 local_irq_restore(flags); 2159 2148 } else { ··· 2161 2150 needwake = rcu_start_this_gp(rnp, rdp, c); 2162 2151 raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 2163 2152 if (needwake) 2164 - rcu_gp_kthread_wake(rdp->rsp); 2153 + rcu_gp_kthread_wake(); 2165 2154 } 2166 2155 2167 2156 /* ··· 2198 2187 2199 2188 /* Wait for callbacks to appear. 
*/ 2200 2189 if (!rcu_nocb_poll) { 2201 - trace_rcu_nocb_wake(my_rdp->rsp->name, my_rdp->cpu, TPS("Sleep")); 2190 + trace_rcu_nocb_wake(rcu_state.name, my_rdp->cpu, TPS("Sleep")); 2202 2191 swait_event_interruptible_exclusive(my_rdp->nocb_wq, 2203 2192 !READ_ONCE(my_rdp->nocb_leader_sleep)); 2204 2193 raw_spin_lock_irqsave(&my_rdp->nocb_lock, flags); ··· 2208 2197 raw_spin_unlock_irqrestore(&my_rdp->nocb_lock, flags); 2209 2198 } else if (firsttime) { 2210 2199 firsttime = false; /* Don't drown trace log with "Poll"! */ 2211 - trace_rcu_nocb_wake(my_rdp->rsp->name, my_rdp->cpu, TPS("Poll")); 2200 + trace_rcu_nocb_wake(rcu_state.name, my_rdp->cpu, TPS("Poll")); 2212 2201 } 2213 2202 2214 2203 /* ··· 2235 2224 if (rcu_nocb_poll) { 2236 2225 schedule_timeout_interruptible(1); 2237 2226 } else { 2238 - trace_rcu_nocb_wake(my_rdp->rsp->name, my_rdp->cpu, 2227 + trace_rcu_nocb_wake(rcu_state.name, my_rdp->cpu, 2239 2228 TPS("WokeEmpty")); 2240 2229 } 2241 2230 goto wait_again; ··· 2280 2269 static void nocb_follower_wait(struct rcu_data *rdp) 2281 2270 { 2282 2271 for (;;) { 2283 - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("FollowerSleep")); 2272 + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FollowerSleep")); 2284 2273 swait_event_interruptible_exclusive(rdp->nocb_wq, 2285 2274 READ_ONCE(rdp->nocb_follower_head)); 2286 2275 if (smp_load_acquire(&rdp->nocb_follower_head)) { ··· 2288 2277 return; 2289 2278 } 2290 2279 WARN_ON(signal_pending(current)); 2291 - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("WokeEmpty")); 2280 + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WokeEmpty")); 2292 2281 } 2293 2282 } 2294 2283 ··· 2323 2312 rdp->nocb_follower_tail = &rdp->nocb_follower_head; 2324 2313 raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); 2325 2314 BUG_ON(!list); 2326 - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("WokeNonEmpty")); 2315 + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WokeNonEmpty")); 2327 2316 2328 2317 /* Each pass 
through the following loop invokes a callback. */ 2329 - trace_rcu_batch_start(rdp->rsp->name, 2318 + trace_rcu_batch_start(rcu_state.name, 2330 2319 atomic_long_read(&rdp->nocb_q_count_lazy), 2331 2320 atomic_long_read(&rdp->nocb_q_count), -1); 2332 2321 c = cl = 0; ··· 2334 2323 next = list->next; 2335 2324 /* Wait for enqueuing to complete, if needed. */ 2336 2325 while (next == NULL && &list->next != tail) { 2337 - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, 2326 + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, 2338 2327 TPS("WaitQueue")); 2339 2328 schedule_timeout_interruptible(1); 2340 - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, 2329 + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, 2341 2330 TPS("WokeQueue")); 2342 2331 next = list->next; 2343 2332 } 2344 2333 debug_rcu_head_unqueue(list); 2345 2334 local_bh_disable(); 2346 - if (__rcu_reclaim(rdp->rsp->name, list)) 2335 + if (__rcu_reclaim(rcu_state.name, list)) 2347 2336 cl++; 2348 2337 c++; 2349 2338 local_bh_enable(); 2350 2339 cond_resched_tasks_rcu_qs(); 2351 2340 list = next; 2352 2341 } 2353 - trace_rcu_batch_end(rdp->rsp->name, c, !!list, 0, 0, 1); 2342 + trace_rcu_batch_end(rcu_state.name, c, !!list, 0, 0, 1); 2354 2343 smp_mb__before_atomic(); /* _add after CB invocation. */ 2355 2344 atomic_long_add(-c, &rdp->nocb_q_count); 2356 2345 atomic_long_add(-cl, &rdp->nocb_q_count_lazy); ··· 2378 2367 ndw = READ_ONCE(rdp->nocb_defer_wakeup); 2379 2368 WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT); 2380 2369 __wake_nocb_leader(rdp, ndw == RCU_NOCB_WAKE_FORCE, flags); 2381 - trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("DeferredWake")); 2370 + trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DeferredWake")); 2382 2371 } 2383 2372 2384 2373 /* Do a deferred wakeup of rcu_nocb_kthread() from a timer handler. 
*/ ··· 2404 2393 { 2405 2394 int cpu; 2406 2395 bool need_rcu_nocb_mask = false; 2407 - struct rcu_state *rsp; 2408 2396 2409 2397 #if defined(CONFIG_NO_HZ_FULL) 2410 2398 if (tick_nohz_full_running && cpumask_weight(tick_nohz_full_mask)) ··· 2437 2427 if (rcu_nocb_poll) 2438 2428 pr_info("\tPoll for callbacks from no-CBs CPUs.\n"); 2439 2429 2440 - for_each_rcu_flavor(rsp) { 2441 - for_each_cpu(cpu, rcu_nocb_mask) 2442 - init_nocb_callback_list(per_cpu_ptr(rsp->rda, cpu)); 2443 - rcu_organize_nocb_kthreads(rsp); 2444 - } 2430 + for_each_cpu(cpu, rcu_nocb_mask) 2431 + init_nocb_callback_list(per_cpu_ptr(&rcu_data, cpu)); 2432 + rcu_organize_nocb_kthreads(); 2445 2433 } 2446 2434 2447 2435 /* Initialize per-rcu_data variables for no-CBs CPUs. */ ··· 2454 2446 2455 2447 /* 2456 2448 * If the specified CPU is a no-CBs CPU that does not already have its 2457 - * rcuo kthread for the specified RCU flavor, spawn it. If the CPUs are 2458 - * brought online out of order, this can require re-organizing the 2459 - * leader-follower relationships. 2449 + * rcuo kthread, spawn it. If the CPUs are brought online out of order, 2450 + * this can require re-organizing the leader-follower relationships. 2460 2451 */ 2461 - static void rcu_spawn_one_nocb_kthread(struct rcu_state *rsp, int cpu) 2452 + static void rcu_spawn_one_nocb_kthread(int cpu) 2462 2453 { 2463 2454 struct rcu_data *rdp; 2464 2455 struct rcu_data *rdp_last; 2465 2456 struct rcu_data *rdp_old_leader; 2466 - struct rcu_data *rdp_spawn = per_cpu_ptr(rsp->rda, cpu); 2457 + struct rcu_data *rdp_spawn = per_cpu_ptr(&rcu_data, cpu); 2467 2458 struct task_struct *t; 2468 2459 2469 2460 /* ··· 2492 2485 rdp_spawn->nocb_next_follower = rdp_old_leader; 2493 2486 } 2494 2487 2495 - /* Spawn the kthread for this CPU and RCU flavor. */ 2488 + /* Spawn the kthread for this CPU. 
*/ 2496 2489 t = kthread_run(rcu_nocb_kthread, rdp_spawn, 2497 - "rcuo%c/%d", rsp->abbr, cpu); 2490 + "rcuo%c/%d", rcu_state.abbr, cpu); 2498 2491 BUG_ON(IS_ERR(t)); 2499 2492 WRITE_ONCE(rdp_spawn->nocb_kthread, t); 2500 2493 } ··· 2505 2498 */ 2506 2499 static void rcu_spawn_all_nocb_kthreads(int cpu) 2507 2500 { 2508 - struct rcu_state *rsp; 2509 - 2510 2501 if (rcu_scheduler_fully_active) 2511 - for_each_rcu_flavor(rsp) 2512 - rcu_spawn_one_nocb_kthread(rsp, cpu); 2502 + rcu_spawn_one_nocb_kthread(cpu); 2513 2503 } 2514 2504 2515 2505 /* ··· 2530 2526 /* 2531 2527 * Initialize leader-follower relationships for all no-CBs CPU. 2532 2528 */ 2533 - static void __init rcu_organize_nocb_kthreads(struct rcu_state *rsp) 2529 + static void __init rcu_organize_nocb_kthreads(void) 2534 2530 { 2535 2531 int cpu; 2536 2532 int ls = rcu_nocb_leader_stride; ··· 2552 2548 * we will spawn the needed set of rcu_nocb_kthread() kthreads. 2553 2549 */ 2554 2550 for_each_cpu(cpu, rcu_nocb_mask) { 2555 - rdp = per_cpu_ptr(rsp->rda, cpu); 2551 + rdp = per_cpu_ptr(&rcu_data, cpu); 2556 2552 if (rdp->cpu >= nl) { 2557 2553 /* New leader, set up for followers & next leader. */ 2558 2554 nl = DIV_ROUND_UP(rdp->cpu + 1, ls) * ls; ··· 2589 2585 2590 2586 #else /* #ifdef CONFIG_RCU_NOCB_CPU */ 2591 2587 2592 - static bool rcu_nocb_cpu_needs_barrier(struct rcu_state *rsp, int cpu) 2588 + static bool rcu_nocb_cpu_needs_barrier(int cpu) 2593 2589 { 2594 2590 WARN_ON_ONCE(1); /* Should be dead code. */ 2595 2591 return false; ··· 2658 2654 * This code relies on the fact that all NO_HZ_FULL CPUs are also 2659 2655 * CONFIG_RCU_NOCB_CPU CPUs. 
2660 2656 */ 2661 - static bool rcu_nohz_full_cpu(struct rcu_state *rsp) 2657 + static bool rcu_nohz_full_cpu(void) 2662 2658 { 2663 2659 #ifdef CONFIG_NO_HZ_FULL 2664 2660 if (tick_nohz_full_cpu(smp_processor_id()) && 2665 - (!rcu_gp_in_progress(rsp) || 2666 - ULONG_CMP_LT(jiffies, READ_ONCE(rsp->gp_start) + HZ))) 2661 + (!rcu_gp_in_progress() || 2662 + ULONG_CMP_LT(jiffies, READ_ONCE(rcu_state.gp_start) + HZ))) 2667 2663 return true; 2668 2664 #endif /* #ifdef CONFIG_NO_HZ_FULL */ 2669 2665 return false;
+22 -48
kernel/rcu/update.c
··· 203 203 if (!IS_ENABLED(CONFIG_PROVE_RCU)) 204 204 return; 205 205 synchronize_rcu(); 206 - synchronize_rcu_bh(); 207 - synchronize_sched(); 208 206 synchronize_rcu_expedited(); 209 - synchronize_rcu_bh_expedited(); 210 - synchronize_sched_expedited(); 211 207 } 212 208 213 209 #if !defined(CONFIG_TINY_RCU) || defined(CONFIG_SRCU) ··· 294 298 * 295 299 * Check debug_lockdep_rcu_enabled() to prevent false positives during boot. 296 300 * 297 - * Note that rcu_read_lock() is disallowed if the CPU is either idle or 301 + * Note that rcu_read_lock_bh() is disallowed if the CPU is either idle or 298 302 * offline from an RCU perspective, so check for those as well. 299 303 */ 300 304 int rcu_read_lock_bh_held(void) ··· 332 336 int i; 333 337 int j; 334 338 335 - /* Initialize and register callbacks for each flavor specified. */ 339 + /* Initialize and register callbacks for each crcu_array element. */ 336 340 for (i = 0; i < n; i++) { 337 341 if (checktiny && 338 342 (crcu_array[i] == call_rcu || ··· 468 472 } 469 473 return till_stall_check * HZ + RCU_STALL_DELAY_DELTA; 470 474 } 475 + EXPORT_SYMBOL_GPL(rcu_jiffies_till_stall_check); 471 476 472 477 void rcu_sysrq_start(void) 473 478 { ··· 698 701 699 702 /* 700 703 * Wait for all pre-existing t->on_rq and t->nvcsw 701 - * transitions to complete. Invoking synchronize_sched() 704 + * transitions to complete. Invoking synchronize_rcu() 702 705 * suffices because all these transitions occur with 703 - * interrupts disabled. Without this synchronize_sched(), 706 + * interrupts disabled. Without this synchronize_rcu(), 704 707 * a read-side critical section that started before the 705 708 * grace period might be incorrectly seen as having started 706 709 * after the grace period. 
707 710 * 708 - * This synchronize_sched() also dispenses with the 711 + * This synchronize_rcu() also dispenses with the 709 712 * need for a memory barrier on the first store to 710 713 * ->rcu_tasks_holdout, as it forces the store to happen 711 714 * after the beginning of the grace period. 712 715 */ 713 - synchronize_sched(); 716 + synchronize_rcu(); 714 717 715 718 /* 716 719 * There were callbacks, so we need to wait for an ··· 737 740 * This does only part of the job, ensuring that all 738 741 * tasks that were previously exiting reach the point 739 742 * where they have disabled preemption, allowing the 740 - * later synchronize_sched() to finish the job. 743 + * later synchronize_rcu() to finish the job. 741 744 */ 742 745 synchronize_srcu(&tasks_rcu_exit_srcu); 743 746 ··· 787 790 * cause their RCU-tasks read-side critical sections to 788 791 * extend past the end of the grace period. However, 789 792 * because these ->nvcsw updates are carried out with 790 - * interrupts disabled, we can use synchronize_sched() 793 + * interrupts disabled, we can use synchronize_rcu() 791 794 * to force the needed ordering on all such CPUs. 792 795 * 793 - * This synchronize_sched() also confines all 796 + * This synchronize_rcu() also confines all 794 797 * ->rcu_tasks_holdout accesses to be within the grace 795 798 * period, avoiding the need for memory barriers for 796 799 * ->rcu_tasks_holdout accesses. 797 800 * 798 - * In addition, this synchronize_sched() waits for exiting 801 + * In addition, this synchronize_rcu() waits for exiting 799 802 * tasks to complete their final preempt_disable() region 800 803 * of execution, cleaning up after the synchronize_srcu() 801 804 * above. 802 805 */ 803 - synchronize_sched(); 806 + synchronize_rcu(); 804 807 805 808 /* Invoke the callbacks. */ 806 809 while (list) { ··· 867 870 #ifdef CONFIG_PROVE_RCU 868 871 869 872 /* 870 - * Early boot self test parameters, one for each flavor 873 + * Early boot self test parameters. 
871 874 */ 872 875 static bool rcu_self_test; 873 - static bool rcu_self_test_bh; 874 - static bool rcu_self_test_sched; 875 - 876 876 module_param(rcu_self_test, bool, 0444); 877 - module_param(rcu_self_test_bh, bool, 0444); 878 - module_param(rcu_self_test_sched, bool, 0444); 879 877 880 878 static int rcu_self_test_counter; 881 879 ··· 880 888 pr_info("RCU test callback executed %d\n", rcu_self_test_counter); 881 889 } 882 890 891 + DEFINE_STATIC_SRCU(early_srcu); 892 + 883 893 static void early_boot_test_call_rcu(void) 884 894 { 885 895 static struct rcu_head head; 896 + static struct rcu_head shead; 886 897 887 898 call_rcu(&head, test_callback); 888 - } 889 - 890 - static void early_boot_test_call_rcu_bh(void) 891 - { 892 - static struct rcu_head head; 893 - 894 - call_rcu_bh(&head, test_callback); 895 - } 896 - 897 - static void early_boot_test_call_rcu_sched(void) 898 - { 899 - static struct rcu_head head; 900 - 901 - call_rcu_sched(&head, test_callback); 899 + if (IS_ENABLED(CONFIG_SRCU)) 900 + call_srcu(&early_srcu, &shead, test_callback); 902 901 } 903 902 904 903 void rcu_early_boot_tests(void) ··· 898 915 899 916 if (rcu_self_test) 900 917 early_boot_test_call_rcu(); 901 - if (rcu_self_test_bh) 902 - early_boot_test_call_rcu_bh(); 903 - if (rcu_self_test_sched) 904 - early_boot_test_call_rcu_sched(); 905 918 rcu_test_sync_prims(); 906 919 } 907 920 ··· 909 930 if (rcu_self_test) { 910 931 early_boot_test_counter++; 911 932 rcu_barrier(); 933 + if (IS_ENABLED(CONFIG_SRCU)) { 934 + early_boot_test_counter++; 935 + srcu_barrier(&early_srcu); 936 + } 912 937 } 913 - if (rcu_self_test_bh) { 914 - early_boot_test_counter++; 915 - rcu_barrier_bh(); 916 - } 917 - if (rcu_self_test_sched) { 918 - early_boot_test_counter++; 919 - rcu_barrier_sched(); 920 - } 921 - 922 938 if (rcu_self_test_counter != early_boot_test_counter) { 923 939 WARN_ON(1); 924 940 ret = -1;
+2 -1
kernel/softirq.c
··· 301 301 pending >>= softirq_bit; 302 302 } 303 303 304 - rcu_bh_qs(); 304 + if (__this_cpu_read(ksoftirqd) == current) 305 + rcu_softirq_qs(); 305 306 local_irq_disable(); 306 307 307 308 pending = local_softirq_pending();
+2 -1
kernel/torture.c
··· 573 573 * Block until the stutter interval ends. This must be called periodically 574 574 * by all running kthreads that need to be subject to stuttering. 575 575 */ 576 - void stutter_wait(const char *title) 576 + bool stutter_wait(const char *title) 577 577 { 578 578 int spt; 579 579 ··· 590 590 } 591 591 torture_shutdown_absorb(title); 592 592 } 593 + return !!spt; 593 594 } 594 595 EXPORT_SYMBOL_GPL(stutter_wait); 595 596
-1
tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
··· 120 120 parse-build.sh $resdir/Make.out $title 121 121 else 122 122 # Build failed. 123 - cp $builddir/Make*.out $resdir 124 123 cp $builddir/.config $resdir || : 125 124 echo Build failed, not running KVM, see $resdir. 126 125 if test -f $builddir.wait
-2
tools/testing/selftests/rcutorture/configs/rcu/CFLIST
··· 3 3 TREE03 4 4 TREE04 5 5 TREE05 6 - TREE06 7 6 TREE07 8 - TREE08 9 7 TREE09 10 8 SRCU-N 11 9 SRCU-P
+1
tools/testing/selftests/rcutorture/configs/rcu/SRCU-P.boot
··· 1 1 rcutorture.torture_type=srcud 2 + rcupdate.rcu_self_test=1
+1
tools/testing/selftests/rcutorture/configs/rcu/SRCU-u.boot
··· 1 1 rcutorture.torture_type=srcud 2 + rcupdate.rcu_self_test=1
-2
tools/testing/selftests/rcutorture/configs/rcu/TINY02.boot
··· 1 1 rcupdate.rcu_self_test=1 2 - rcupdate.rcu_self_test_bh=1 3 - rcutorture.torture_type=rcu_bh
+1 -1
tools/testing/selftests/rcutorture/configs/rcu/TREE01.boot
··· 1 - rcutorture.torture_type=rcu_bh maxcpus=8 nr_cpus=43 1 + maxcpus=8 nr_cpus=43 2 2 rcutree.gp_preinit_delay=3 3 3 rcutree.gp_init_delay=3 4 4 rcutree.gp_cleanup_delay=3
+1 -1
tools/testing/selftests/rcutorture/configs/rcu/TREE04.boot
··· 1 - rcutorture.torture_type=rcu_bh rcutree.rcu_fanout_leaf=4 nohz_full=1-7 1 + rcutree.rcu_fanout_leaf=4 nohz_full=1-7
+1 -2
tools/testing/selftests/rcutorture/configs/rcu/TREE05.boot
··· 1 - rcutorture.torture_type=sched 2 - rcupdate.rcu_self_test_sched=1 3 1 rcutree.gp_preinit_delay=3 4 2 rcutree.gp_init_delay=3 5 3 rcutree.gp_cleanup_delay=3 4 + rcupdate.rcu_self_test=1
-2
tools/testing/selftests/rcutorture/configs/rcu/TREE06.boot
··· 1 1 rcupdate.rcu_self_test=1 2 - rcupdate.rcu_self_test_bh=1 3 - rcupdate.rcu_self_test_sched=1 4 2 rcutree.rcu_fanout_exact=1 5 3 rcutree.gp_preinit_delay=3 6 4 rcutree.gp_init_delay=3
-2
tools/testing/selftests/rcutorture/configs/rcu/TREE08.boot
··· 1 - rcutorture.torture_type=sched 2 1 rcupdate.rcu_self_test=1 3 - rcupdate.rcu_self_test_sched=1 4 2 rcutree.rcu_fanout_exact=1 5 3 rcu_nocbs=0-7