Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next

Pull RCU changes from Ingo Molnar:
"The main RCU changes in this cycle were:

- RCU torture-test changes.

- variable-name renaming cleanup.

- RCU documentation updates.

- miscellaneous fixes.

- patch to suppress RCU stall warnings while sysrq requests are being
processed"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (68 commits)
rcu: Provide API to suppress stall warnings while sysrq runs
rcu: Variable name changed in tree_plugin.h and used in tree.c
torture: Remove unused definition
torture: Remove __init from torture_init_begin/end
torture: Check for multiple concurrent torture tests
locktorture: Remove reference to nonexistent Kconfig parameter
rcutorture: Run rcu_torture_writer at normal priority
rcutorture: Note diffs from git commits
rcutorture: Add missing destroy_timer_on_stack()
rcutorture: Explicitly test synchronous grace-period primitives
rcutorture: Add tests for get_state_synchronize_rcu()
rcutorture: Test RCU-sched primitives in TREE_PREEMPT_RCU kernels
torture: Use elapsed time to detect hangs
rcutorture: Check for rcu_torture_fqs creation errors
torture: Better summary diagnostics for build failures
torture: Notice if an all-zero cpumask is passed inside a critical section
rcutorture: Make rcu_torture_reader() use cond_resched()
sched,rcu: Make cond_resched() report RCU quiescent states
percpu: Fix raw_cpu_inc_return()
rcutorture: Export RCU grace-period kthread wait state to rcutorture
...

+1211 -413
+2
Documentation/RCU/00-INDEX
··· 12 12 - RCU Lockdep splats explained. 13 13 NMI-RCU.txt 14 14 - Using RCU to Protect Dynamic NMI Handlers 15 + rcu_dereference.txt 16 + - Proper care and feeding of return values from rcu_dereference() 15 17 rcubarrier.txt 16 18 - RCU and Unloadable Modules 17 19 rculist_nulls.txt
+8 -4
Documentation/RCU/checklist.txt
··· 114 114 http://www.openvms.compaq.com/wizard/wiz_2637.html 115 115 116 116 The rcu_dereference() primitive is also an excellent 117 - documentation aid, letting the person reading the code 118 - know exactly which pointers are protected by RCU. 117 + documentation aid, letting the person reading the 118 + code know exactly which pointers are protected by RCU. 119 119 Please note that compilers can also reorder code, and 120 120 they are becoming increasingly aggressive about doing 121 - just that. The rcu_dereference() primitive therefore 122 - also prevents destructive compiler optimizations. 121 + just that. The rcu_dereference() primitive therefore also 122 + prevents destructive compiler optimizations. However, 123 + with a bit of devious creativity, it is possible to 124 + mishandle the return value from rcu_dereference(). 125 + Please see rcu_dereference.txt in this directory for 126 + more information. 123 127 124 128 The rcu_dereference() primitive is used by the 125 129 various "_rcu()" list-traversal primitives, such
+371
Documentation/RCU/rcu_dereference.txt
··· 1 + PROPER CARE AND FEEDING OF RETURN VALUES FROM rcu_dereference() 2 + 3 + Most of the time, you can use values from rcu_dereference() or one of 4 + the similar primitives without worries. Dereferencing (prefix "*"), 5 + field selection ("->"), assignment ("="), address-of ("&"), addition and 6 + subtraction of constants, and casts all work quite naturally and safely. 7 + 8 + It is nevertheless possible to get into trouble with other operations. 9 + Follow these rules to keep your RCU code working properly: 10 + 11 + o You must use one of the rcu_dereference() family of primitives 12 + to load an RCU-protected pointer, otherwise CONFIG_PROVE_RCU 13 + will complain. Worse yet, your code can see random memory-corruption 14 + bugs due to games that compilers and DEC Alpha can play. 15 + Without one of the rcu_dereference() primitives, compilers 16 + can reload the value, and won't your code have fun with two 17 + different values for a single pointer! Without rcu_dereference(), 18 + DEC Alpha can load a pointer, dereference that pointer, and 19 + return data preceding initialization that preceded the store of 20 + the pointer. 21 + 22 + In addition, the volatile cast in rcu_dereference() prevents the 23 + compiler from deducing the resulting pointer value. Please see 24 + the section entitled "EXAMPLE WHERE THE COMPILER KNOWS TOO MUCH" 25 + for an example where the compiler can in fact deduce the exact 26 + value of the pointer, and thus cause misordering. 27 + 28 + o Do not use single-element RCU-protected arrays. The compiler 29 + is within its rights to assume that the value of an index into 30 + such an array must necessarily evaluate to zero. The compiler 31 + could then substitute the constant zero for the computation, so 32 + that the array index no longer depended on the value returned 33 + by rcu_dereference(). 
If the array index no longer depends 34 + on rcu_dereference(), then both the compiler and the CPU 35 + are within their rights to order the array access before the 36 + rcu_dereference(), which can cause the array access to return 37 + garbage. 38 + 39 + o Avoid cancellation when using the "+" and "-" infix arithmetic 40 + operators. For example, for a given variable "x", avoid 41 + "(x-x)". There are similar arithmetic pitfalls from other 42 + arithmetic operators, such as "(x*0)", "(x/(x+1))" or "(x%1)". 43 + The compiler is within its rights to substitute zero for all of 44 + these expressions, so that subsequent accesses no longer depend 45 + on the rcu_dereference(), again possibly resulting in bugs due 46 + to misordering. 47 + 48 + Of course, if "p" is a pointer from rcu_dereference(), and "a" 49 + and "b" are integers that happen to be equal, the expression 50 + "p+a-b" is safe because its value still necessarily depends on 51 + the rcu_dereference(), thus maintaining proper ordering. 52 + 53 + o Avoid all-zero operands to the bitwise "&" operator, and 54 + similarly avoid all-ones operands to the bitwise "|" operator. 55 + If the compiler is able to deduce the value of such operands, 56 + it is within its rights to substitute the corresponding constant 57 + for the bitwise operation. Once again, this causes subsequent 58 + accesses to no longer depend on the rcu_dereference(), causing 59 + bugs due to misordering. 60 + 61 + Please note that single-bit operands to bitwise "&" can also 62 + be dangerous. At this point, the compiler knows that the 63 + resulting value can only take on one of two possible values. 64 + Therefore, a very small amount of additional information will 65 + allow the compiler to deduce the exact value, which again can 66 + result in misordering. 
67 + 68 + o If you are using RCU to protect JITed functions, so that the 69 + "()" function-invocation operator is applied to a value obtained 70 + (directly or indirectly) from rcu_dereference(), you may need to 71 + interact directly with the hardware to flush instruction caches. 72 + This issue arises on some systems when a newly JITed function is 73 + using the same memory that was used by an earlier JITed function. 74 + 75 + o Do not use the results from the boolean "&&" and "||" when 76 + dereferencing. For example, the following (rather improbable) 77 + code is buggy: 78 + 79 + int a[2]; 80 + int index; 81 + int force_zero_index = 1; 82 + 83 + ... 84 + 85 + r1 = rcu_dereference(i1); 86 + r2 = a[r1 && force_zero_index]; /* BUGGY!!! */ 87 + 88 + The reason this is buggy is that "&&" and "||" are often compiled 89 + using branches. While weak-memory machines such as ARM or PowerPC 90 + do order stores after such branches, they can speculate loads, 91 + which can result in misordering bugs. 92 + 93 + o Do not use the results from relational operators ("==", "!=", 94 + ">", ">=", "<", or "<=") when dereferencing. For example, 95 + the following (quite strange) code is buggy: 96 + 97 + int a[2]; 98 + int index; 99 + int flip_index = 0; 100 + 101 + ... 102 + 103 + r1 = rcu_dereference(i1); 104 + r2 = a[r1 != flip_index]; /* BUGGY!!! */ 105 + 106 + As before, the reason this is buggy is that relational operators 107 + are often compiled using branches. And as before, although 108 + weak-memory machines such as ARM or PowerPC do order stores 109 + after such branches, they can speculate loads, which can again 110 + result in misordering bugs. 111 + 112 + o Be very careful about comparing pointers obtained from 113 + rcu_dereference() against non-NULL values. As Linus Torvalds 114 + explained, if the two pointers are equal, the compiler could 115 + substitute the pointer you are comparing against for the pointer 116 + obtained from rcu_dereference(). 
For example: 117 + 118 + p = rcu_dereference(gp); 119 + if (p == &default_struct) 120 + do_default(p->a); 121 + 122 + Because the compiler now knows that the value of "p" is exactly 123 + the address of the variable "default_struct", it is free to 124 + transform this code into the following: 125 + 126 + p = rcu_dereference(gp); 127 + if (p == &default_struct) 128 + do_default(default_struct.a); 129 + 130 + On ARM and Power hardware, the load from "default_struct.a" 131 + can now be speculated, such that it might happen before the 132 + rcu_dereference(). This could result in bugs due to misordering. 133 + 134 + However, comparisons are OK in the following cases: 135 + 136 + o The comparison was against the NULL pointer. If the 137 + compiler knows that the pointer is NULL, you had better 138 + not be dereferencing it anyway. If the comparison is 139 + non-equal, the compiler is none the wiser. Therefore, 140 + it is safe to compare pointers from rcu_dereference() 141 + against NULL pointers. 142 + 143 + o The pointer is never dereferenced after being compared. 144 + Since there are no subsequent dereferences, the compiler 145 + cannot use anything it learned from the comparison 146 + to reorder the non-existent subsequent dereferences. 147 + This sort of comparison occurs frequently when scanning 148 + RCU-protected circular linked lists. 149 + 150 + o The comparison is against a pointer that references memory 151 + that was initialized "a long time ago." The reason 152 + this is safe is that even if misordering occurs, the 153 + misordering will not affect the accesses that follow 154 + the comparison. So exactly how long ago is "a long 155 + time ago"? Here are some possibilities: 156 + 157 + o Compile time. 158 + 159 + o Boot time. 160 + 161 + o Module-init time for module code. 162 + 163 + o Prior to kthread creation for kthread code. 164 + 165 + o During some prior acquisition of the lock that 166 + we now hold. 
167 + 168 + o Before mod_timer() time for a timer handler. 169 + 170 + There are many other possibilities involving the Linux 171 + kernel's wide array of primitives that cause code to 172 + be invoked at a later time. 173 + 174 + o The pointer being compared against also came from 175 + rcu_dereference(). In this case, both pointers depend 176 + on one rcu_dereference() or another, so you get proper 177 + ordering either way. 178 + 179 + That said, this situation can make certain RCU usage 180 + bugs more likely to happen. Which can be a good thing, 181 + at least if they happen during testing. An example 182 + of such an RCU usage bug is shown in the section titled 183 + "EXAMPLE OF AMPLIFIED RCU-USAGE BUG". 184 + 185 + o All of the accesses following the comparison are stores, 186 + so that a control dependency preserves the needed ordering. 187 + That said, it is easy to get control dependencies wrong. 188 + Please see the "CONTROL DEPENDENCIES" section of 189 + Documentation/memory-barriers.txt for more details. 190 + 191 + o The pointers are not equal -and- the compiler does 192 + not have enough information to deduce the value of the 193 + pointer. Note that the volatile cast in rcu_dereference() 194 + will normally prevent the compiler from knowing too much. 195 + 196 + o Disable any value-speculation optimizations that your compiler 197 + might provide, especially if you are making use of feedback-based 198 + optimizations that take data collected from prior runs. Such 199 + value-speculation optimizations reorder operations by design. 200 + 201 + There is one exception to this rule: Value-speculation 202 + optimizations that leverage the branch-prediction hardware are 203 + safe on strongly ordered systems (such as x86), but not on weakly 204 + ordered systems (such as ARM or Power). Choose your compiler 205 + command-line options wisely! 
206 + 207 + 208 + EXAMPLE OF AMPLIFIED RCU-USAGE BUG 209 + 210 + Because updaters can run concurrently with RCU readers, RCU readers can 211 + see stale and/or inconsistent values. If RCU readers need fresh or 212 + consistent values, which they sometimes do, they need to take proper 213 + precautions. To see this, consider the following code fragment: 214 + 215 + struct foo { 216 + int a; 217 + int b; 218 + int c; 219 + }; 220 + struct foo *gp1; 221 + struct foo *gp2; 222 + 223 + void updater(void) 224 + { 225 + struct foo *p; 226 + 227 + p = kmalloc(...); 228 + if (p == NULL) 229 + deal_with_it(); 230 + p->a = 42; /* Each field in its own cache line. */ 231 + p->b = 43; 232 + p->c = 44; 233 + rcu_assign_pointer(gp1, p); 234 + p->b = 143; 235 + p->c = 144; 236 + rcu_assign_pointer(gp2, p); 237 + } 238 + 239 + void reader(void) 240 + { 241 + struct foo *p; 242 + struct foo *q; 243 + int r1, r2; 244 + 245 + p = rcu_dereference(gp2); 246 + if (p == NULL) 247 + return; 248 + r1 = p->b; /* Guaranteed to get 143. */ 249 + q = rcu_dereference(gp1); /* Guaranteed non-NULL. */ 250 + if (p == q) { 251 + /* The compiler decides that q->c is same as p->c. */ 252 + r2 = p->c; /* Could get 44 on weakly ordered system. */ 253 + } 254 + do_something_with(r1, r2); 255 + } 256 + 257 + You might be surprised that the outcome (r1 == 143 && r2 == 44) is possible, 258 + but you should not be. After all, the updater might have been invoked 259 + a second time between the time reader() loaded into "r1" and the time 260 + that it loaded into "r2". The fact that this same result can occur due 261 + to some reordering from the compiler and CPUs is beside the point. 262 + 263 + But suppose that the reader needs a consistent view? 
264 + 265 + Then one approach is to use locking, for example, as follows: 266 + 267 + struct foo { 268 + int a; 269 + int b; 270 + int c; 271 + spinlock_t lock; 272 + }; 273 + struct foo *gp1; 274 + struct foo *gp2; 275 + 276 + void updater(void) 277 + { 278 + struct foo *p; 279 + 280 + p = kmalloc(...); 281 + if (p == NULL) 282 + deal_with_it(); 283 + spin_lock(&p->lock); 284 + p->a = 42; /* Each field in its own cache line. */ 285 + p->b = 43; 286 + p->c = 44; 287 + spin_unlock(&p->lock); 288 + rcu_assign_pointer(gp1, p); 289 + spin_lock(&p->lock); 290 + p->b = 143; 291 + p->c = 144; 292 + spin_unlock(&p->lock); 293 + rcu_assign_pointer(gp2, p); 294 + } 295 + 296 + void reader(void) 297 + { 298 + struct foo *p; 299 + struct foo *q; 300 + int r1, r2; 301 + 302 + p = rcu_dereference(gp2); 303 + if (p == NULL) 304 + return; 305 + spin_lock(&p->lock); 306 + r1 = p->b; /* Guaranteed to get 143. */ 307 + q = rcu_dereference(gp1); /* Guaranteed non-NULL. */ 308 + if (p == q) { 309 + /* The compiler decides that q->c is same as p->c. */ 310 + r2 = p->c; /* Locking guarantees r2 == 144. */ 311 + } 312 + spin_unlock(&p->lock); 313 + do_something_with(r1, r2); 314 + } 315 + 316 + As always, use the right tool for the job! 317 + 318 + 319 + EXAMPLE WHERE THE COMPILER KNOWS TOO MUCH 320 + 321 + If a pointer obtained from rcu_dereference() compares not-equal to some 322 + other pointer, the compiler normally has no clue what the value of the 323 + first pointer might be. This lack of knowledge prevents the compiler 324 + from carrying out optimizations that otherwise might destroy the ordering 325 + guarantees that RCU depends on. And the volatile cast in rcu_dereference() 326 + should prevent the compiler from guessing the value. 327 + 328 + But without rcu_dereference(), the compiler knows more than you might 329 + expect. 
Consider the following code fragment: 330 + 331 + struct foo { 332 + int a; 333 + int b; 334 + }; 335 + static struct foo variable1; 336 + static struct foo variable2; 337 + static struct foo *gp = &variable1; 338 + 339 + void updater(void) 340 + { 341 + initialize_foo(&variable2); 342 + rcu_assign_pointer(gp, &variable2); 343 + /* 344 + * The above is the only store to gp in this translation unit, 345 + * and the address of gp is not exported in any way. 346 + */ 347 + } 348 + 349 + int reader(void) 350 + { 351 + struct foo *p; 352 + 353 + p = gp; 354 + barrier(); 355 + if (p == &variable1) 356 + return p->a; /* Must be variable1.a. */ 357 + else 358 + return p->b; /* Must be variable2.b. */ 359 + } 360 + 361 + Because the compiler can see all stores to "gp", it knows that the only 362 + possible values of "gp" are "variable1" on the one hand and "variable2" 363 + on the other. The comparison in reader() therefore tells the compiler 364 + the exact value of "p" even in the not-equals case. This allows the 365 + compiler to make the return values independent of the load from "gp", 366 + in turn destroying the ordering between this load and the loads of the 367 + return values. This can result in "p->b" returning pre-initialization 368 + garbage values. 369 + 370 + In short, rcu_dereference() is -not- optional when you are going to 371 + dereference the resulting pointer.
+1 -1
Documentation/RCU/stallwarn.txt
··· 24 24 timing of the next warning for the current stall. 25 25 26 26 Stall-warning messages may be enabled and disabled completely via 27 - /sys/module/rcutree/parameters/rcu_cpu_stall_suppress. 27 + /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress. 28 28 29 29 CONFIG_RCU_CPU_STALL_VERBOSE 30 30
+45 -12
Documentation/RCU/whatisRCU.txt
··· 326 326 a. synchronize_rcu() rcu_read_lock() / rcu_read_unlock() 327 327 call_rcu() rcu_dereference() 328 328 329 - b. call_rcu_bh() rcu_read_lock_bh() / rcu_read_unlock_bh() 330 - rcu_dereference_bh() 329 + b. synchronize_rcu_bh() rcu_read_lock_bh() / rcu_read_unlock_bh() 330 + call_rcu_bh() rcu_dereference_bh() 331 331 332 332 c. synchronize_sched() rcu_read_lock_sched() / rcu_read_unlock_sched() 333 - preempt_disable() / preempt_enable() 333 + call_rcu_sched() preempt_disable() / preempt_enable() 334 334 local_irq_save() / local_irq_restore() 335 335 hardirq enter / hardirq exit 336 336 NMI enter / NMI exit ··· 794 794 795 795 RCU list traversal: 796 796 797 + list_entry_rcu 798 + list_first_entry_rcu 799 + list_next_rcu 797 800 list_for_each_entry_rcu 798 - hlist_for_each_entry_rcu 799 - hlist_nulls_for_each_entry_rcu 800 801 list_for_each_entry_continue_rcu 802 + hlist_first_rcu 803 + hlist_next_rcu 804 + hlist_pprev_rcu 805 + hlist_for_each_entry_rcu 806 + hlist_for_each_entry_rcu_bh 807 + hlist_for_each_entry_continue_rcu 808 + hlist_for_each_entry_continue_rcu_bh 809 + hlist_nulls_first_rcu 810 + hlist_nulls_for_each_entry_rcu 811 + hlist_bl_first_rcu 812 + hlist_bl_for_each_entry_rcu 801 813 802 814 RCU pointer/list update: 803 815 ··· 818 806 list_add_tail_rcu 819 807 list_del_rcu 820 808 list_replace_rcu 821 - hlist_del_rcu 822 809 hlist_add_after_rcu 823 810 hlist_add_before_rcu 824 811 hlist_add_head_rcu 812 + hlist_del_rcu 813 + hlist_del_init_rcu 825 814 hlist_replace_rcu 826 815 list_splice_init_rcu() 816 + hlist_nulls_del_init_rcu 817 + hlist_nulls_del_rcu 818 + hlist_nulls_add_head_rcu 819 + hlist_bl_add_head_rcu 820 + hlist_bl_del_init_rcu 821 + hlist_bl_del_rcu 822 + hlist_bl_set_first_rcu 827 823 828 824 RCU: Critical sections Grace period Barrier 829 825 830 826 rcu_read_lock synchronize_net rcu_barrier 831 827 rcu_read_unlock synchronize_rcu 832 828 rcu_dereference synchronize_rcu_expedited 833 - call_rcu 834 - kfree_rcu 835 - 829 + 
rcu_read_lock_held call_rcu 830 + rcu_dereference_check kfree_rcu 831 + rcu_dereference_protected 836 832 837 833 bh: Critical sections Grace period Barrier 838 834 839 835 rcu_read_lock_bh call_rcu_bh rcu_barrier_bh 840 836 rcu_read_unlock_bh synchronize_rcu_bh 841 837 rcu_dereference_bh synchronize_rcu_bh_expedited 842 - 838 + rcu_dereference_bh_check 839 + rcu_dereference_bh_protected 840 + rcu_read_lock_bh_held 843 841 844 842 sched: Critical sections Grace period Barrier 845 843 ··· 857 835 rcu_read_unlock_sched call_rcu_sched 858 836 [preempt_disable] synchronize_sched_expedited 859 837 [and friends] 838 + rcu_read_lock_sched_notrace 839 + rcu_read_unlock_sched_notrace 860 840 rcu_dereference_sched 841 + rcu_dereference_sched_check 842 + rcu_dereference_sched_protected 843 + rcu_read_lock_sched_held 861 844 862 845 863 846 SRCU: Critical sections Grace period Barrier ··· 870 843 srcu_read_lock synchronize_srcu srcu_barrier 871 844 srcu_read_unlock call_srcu 872 845 srcu_dereference synchronize_srcu_expedited 846 + srcu_dereference_check 847 + srcu_read_lock_held 873 848 874 849 SRCU: Initialization/cleanup 875 850 init_srcu_struct ··· 879 850 880 851 All: lockdep-checked RCU-protected pointer access 881 852 882 - rcu_dereference_check 883 - rcu_dereference_protected 853 + rcu_access_index 884 854 rcu_access_pointer 855 + rcu_dereference_index_check 856 + rcu_dereference_raw 857 + rcu_lockdep_assert 858 + rcu_sleep_check 859 + RCU_NONIDLE 885 860 886 861 See the comment headers in the source code (or the docbook generated 887 862 from them) for more information.
+1 -1
include/linux/percpu.h
··· 639 639 # define raw_cpu_add_return_8(pcp, val) raw_cpu_generic_add_return(pcp, val) 640 640 # endif 641 641 # define raw_cpu_add_return(pcp, val) \ 642 - __pcpu_size_call_return2(raw_add_return_, pcp, val) 642 + __pcpu_size_call_return2(raw_cpu_add_return_, pcp, val) 643 643 #endif 644 644 645 645 #define raw_cpu_sub_return(pcp, val) raw_cpu_add_return(pcp, -(typeof(pcp))(val))
+71 -1
include/linux/rcupdate.h
··· 44 44 #include <linux/debugobjects.h> 45 45 #include <linux/bug.h> 46 46 #include <linux/compiler.h> 47 + #include <linux/percpu.h> 47 48 #include <asm/barrier.h> 48 49 49 50 extern int rcu_expedited; /* for sysctl */ ··· 52 51 extern int rcutorture_runnable; /* for sysctl */ 53 52 #endif /* #ifdef CONFIG_RCU_TORTURE_TEST */ 54 53 54 + enum rcutorture_type { 55 + RCU_FLAVOR, 56 + RCU_BH_FLAVOR, 57 + RCU_SCHED_FLAVOR, 58 + SRCU_FLAVOR, 59 + INVALID_RCU_FLAVOR 60 + }; 61 + 55 62 #if defined(CONFIG_TREE_RCU) || defined(CONFIG_TREE_PREEMPT_RCU) 63 + void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags, 64 + unsigned long *gpnum, unsigned long *completed); 56 65 void rcutorture_record_test_transition(void); 57 66 void rcutorture_record_progress(unsigned long vernum); 58 67 void do_trace_rcu_torture_read(const char *rcutorturename, ··· 71 60 unsigned long c_old, 72 61 unsigned long c); 73 62 #else 63 + static inline void rcutorture_get_gp_data(enum rcutorture_type test_type, 64 + int *flags, 65 + unsigned long *gpnum, 66 + unsigned long *completed) 67 + { 68 + *flags = 0; 69 + *gpnum = 0; 70 + *completed = 0; 71 + } 74 72 static inline void rcutorture_record_test_transition(void) 75 73 { 76 74 } ··· 248 228 void rcu_irq_enter(void); 249 229 void rcu_irq_exit(void); 250 230 231 + #ifdef CONFIG_RCU_STALL_COMMON 232 + void rcu_sysrq_start(void); 233 + void rcu_sysrq_end(void); 234 + #else /* #ifdef CONFIG_RCU_STALL_COMMON */ 235 + static inline void rcu_sysrq_start(void) 236 + { 237 + } 238 + static inline void rcu_sysrq_end(void) 239 + { 240 + } 241 + #endif /* #else #ifdef CONFIG_RCU_STALL_COMMON */ 242 + 251 243 #ifdef CONFIG_RCU_USER_QS 252 244 void rcu_user_enter(void); 253 245 void rcu_user_exit(void); ··· 298 266 #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) || defined(CONFIG_SMP) 299 267 bool __rcu_is_watching(void); 300 268 #endif /* #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) || defined(CONFIG_SMP) */ 
269 + 270 + /* 271 + * Hooks for cond_resched() and friends to avoid RCU CPU stall warnings. 272 + */ 273 + 274 + #define RCU_COND_RESCHED_LIM 256 /* ms vs. 100s of ms. */ 275 + DECLARE_PER_CPU(int, rcu_cond_resched_count); 276 + void rcu_resched(void); 277 + 278 + /* 279 + * Is it time to report RCU quiescent states? 280 + * 281 + * Note unsynchronized access to rcu_cond_resched_count. Yes, we might 282 + * increment some random CPU's count, and possibly also load the result from 283 + * yet another CPU's count. We might even clobber some other CPU's attempt 284 + * to zero its counter. This is all OK because the goal is not precision, 285 + * but rather reasonable amortization of rcu_note_context_switch() overhead 286 + * and extremely high probability of avoiding RCU CPU stall warnings. 287 + * Note that this function has to be preempted in just the wrong place, 288 + * many thousands of times in a row, for anything bad to happen. 289 + */ 290 + static inline bool rcu_should_resched(void) 291 + { 292 + return raw_cpu_inc_return(rcu_cond_resched_count) >= 293 + RCU_COND_RESCHED_LIM; 294 + } 295 + 296 + /* 297 + * Report quiescent states to RCU if it is time to do so. 298 + */ 299 + static inline void rcu_cond_resched(void) 300 + { 301 + if (unlikely(rcu_should_resched())) 302 + rcu_resched(); 303 + } 301 304 302 305 /* 303 306 * Infrastructure to implement the synchronize_() primitives in ··· 395 328 extern struct lockdep_map rcu_bh_lock_map; 396 329 extern struct lockdep_map rcu_sched_lock_map; 397 330 extern struct lockdep_map rcu_callback_map; 398 - extern int debug_lockdep_rcu_enabled(void); 331 + int debug_lockdep_rcu_enabled(void); 399 332 400 333 /** 401 334 * rcu_read_lock_held() - might we be in RCU read-side critical section? ··· 1016 949 * pointers, but you must use rcu_assign_pointer() to initialize the 1017 950 * external-to-structure pointer -after- you have completely initialized 1018 951 * the reader-accessible portions of the linked structure. 
952 + * 953 + * Note that unlike rcu_assign_pointer(), RCU_INIT_POINTER() provides no 954 + * ordering guarantees for either the CPU or the compiler. 1019 955 */ 1020 956 #define RCU_INIT_POINTER(p, v) \ 1021 957 do { \
+4
include/linux/rcutiny.h
··· 119 119 { 120 120 } 121 121 122 + static inline void show_rcu_gp_kthreads(void) 123 + { 124 + } 125 + 122 126 static inline void rcu_cpu_stall_reset(void) 123 127 { 124 128 }
+1
include/linux/rcutree.h
··· 84 84 long rcu_batches_completed(void); 85 85 long rcu_batches_completed_bh(void); 86 86 long rcu_batches_completed_sched(void); 87 + void show_rcu_gp_kthreads(void); 87 88 88 89 void rcu_force_quiescent_state(void); 89 90 void rcu_bh_force_quiescent_state(void);
+1 -7
include/linux/torture.h
··· 49 49 #define VERBOSE_TOROUT_ERRSTRING(s) \ 50 50 do { if (verbose) pr_alert("%s" TORTURE_FLAG "!!! %s\n", torture_type, s); } while (0) 51 51 52 - /* Definitions for a non-string torture-test module parameter. */ 53 - #define torture_parm(type, name, init, msg) \ 54 - static type name = init; \ 55 - module_param(name, type, 0444); \ 56 - MODULE_PARM_DESC(name, msg); 57 - 58 52 /* Definitions for online/offline exerciser. */ 59 53 int torture_onoff_init(long ooholdoff, long oointerval); 60 54 char *torture_onoff_stats(char *page); ··· 75 81 int torture_stutter_init(int s); 76 82 77 83 /* Initialization and cleanup. */ 78 - void torture_init_begin(char *ttype, bool v, int *runnable); 84 + bool torture_init_begin(char *ttype, bool v, int *runnable); 79 85 void torture_init_end(void); 80 86 bool torture_cleanup(void); 81 87 bool torture_must_stop(void);
+6 -4
kernel/locking/locktorture.c
··· 82 82 }; 83 83 static struct lock_writer_stress_stats *lwsa; 84 84 85 - #if defined(MODULE) || defined(CONFIG_LOCK_TORTURE_TEST_RUNNABLE) 85 + #if defined(MODULE) 86 86 #define LOCKTORTURE_RUNNABLE_INIT 1 87 87 #else 88 88 #define LOCKTORTURE_RUNNABLE_INIT 0 89 89 #endif 90 90 int locktorture_runnable = LOCKTORTURE_RUNNABLE_INIT; 91 91 module_param(locktorture_runnable, int, 0444); 92 - MODULE_PARM_DESC(locktorture_runnable, "Start locktorture at boot"); 92 + MODULE_PARM_DESC(locktorture_runnable, "Start locktorture at module init"); 93 93 94 94 /* Forward reference. */ 95 95 static void lock_torture_cleanup(void); ··· 219 219 set_user_nice(current, 19); 220 220 221 221 do { 222 - schedule_timeout_uninterruptible(1); 222 + if ((torture_random(&rand) & 0xfffff) == 0) 223 + schedule_timeout_uninterruptible(1); 223 224 cur_ops->writelock(); 224 225 if (WARN_ON_ONCE(lock_is_write_held)) 225 226 lwsp->n_write_lock_fail++; ··· 355 354 &lock_busted_ops, &spin_lock_ops, &spin_lock_irq_ops, 356 355 }; 357 356 358 - torture_init_begin(torture_type, verbose, &locktorture_runnable); 357 + if (!torture_init_begin(torture_type, verbose, &locktorture_runnable)) 358 + return -EBUSY; 359 359 360 360 /* Process args and tell the world that the torturer is on the job. */ 361 361 for (i = 0; i < ARRAY_SIZE(torture_ops); i++) {
+171 -46
kernel/rcu/rcutorture.c
··· 58 58 "Duration of fqs bursts (us), 0 to disable"); 59 59 torture_param(int, fqs_holdoff, 0, "Holdoff time within fqs bursts (us)"); 60 60 torture_param(int, fqs_stutter, 3, "Wait time between fqs bursts (s)"); 61 + torture_param(bool, gp_cond, false, "Use conditional/async GP wait primitives"); 61 62 torture_param(bool, gp_exp, false, "Use expedited GP wait primitives"); 62 63 torture_param(bool, gp_normal, false, 63 64 "Use normal (non-expedited) GP wait primitives"); 65 + torture_param(bool, gp_sync, false, "Use synchronous GP wait primitives"); 64 66 torture_param(int, irqreader, 1, "Allow RCU readers from irq handlers"); 65 67 torture_param(int, n_barrier_cbs, 0, 66 68 "# of callbacks/kthreads for barrier testing"); ··· 140 138 static long n_barrier_successes; 141 139 static struct list_head rcu_torture_removed; 142 140 141 + static int rcu_torture_writer_state; 142 + #define RTWS_FIXED_DELAY 0 143 + #define RTWS_DELAY 1 144 + #define RTWS_REPLACE 2 145 + #define RTWS_DEF_FREE 3 146 + #define RTWS_EXP_SYNC 4 147 + #define RTWS_COND_GET 5 148 + #define RTWS_COND_SYNC 6 149 + #define RTWS_SYNC 7 150 + #define RTWS_STUTTER 8 151 + #define RTWS_STOPPING 9 152 + 143 153 #if defined(MODULE) || defined(CONFIG_RCU_TORTURE_TEST_RUNNABLE) 144 154 #define RCUTORTURE_RUNNABLE_INIT 1 145 155 #else ··· 228 214 */ 229 215 230 216 struct rcu_torture_ops { 217 + int ttype; 231 218 void (*init)(void); 232 219 int (*readlock)(void); 233 220 void (*read_delay)(struct torture_random_state *rrsp); ··· 237 222 void (*deferred_free)(struct rcu_torture *p); 238 223 void (*sync)(void); 239 224 void (*exp_sync)(void); 225 + unsigned long (*get_state)(void); 226 + void (*cond_sync)(unsigned long oldstate); 240 227 void (*call)(struct rcu_head *head, void (*func)(struct rcu_head *rcu)); 241 228 void (*cb_barrier)(void); 242 229 void (*fqs)(void); ··· 290 273 return rcu_batches_completed(); 291 274 } 292 275 276 + /* 277 + * Update callback in the pipe. 
This should be invoked after a grace period. 278 + */ 279 + static bool 280 + rcu_torture_pipe_update_one(struct rcu_torture *rp) 281 + { 282 + int i; 283 + 284 + i = rp->rtort_pipe_count; 285 + if (i > RCU_TORTURE_PIPE_LEN) 286 + i = RCU_TORTURE_PIPE_LEN; 287 + atomic_inc(&rcu_torture_wcount[i]); 288 + if (++rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) { 289 + rp->rtort_mbtest = 0; 290 + return true; 291 + } 292 + return false; 293 + } 294 + 295 + /* 296 + * Update all callbacks in the pipe. Suitable for synchronous grace-period 297 + * primitives. 298 + */ 299 + static void 300 + rcu_torture_pipe_update(struct rcu_torture *old_rp) 301 + { 302 + struct rcu_torture *rp; 303 + struct rcu_torture *rp1; 304 + 305 + if (old_rp) 306 + list_add(&old_rp->rtort_free, &rcu_torture_removed); 307 + list_for_each_entry_safe(rp, rp1, &rcu_torture_removed, rtort_free) { 308 + if (rcu_torture_pipe_update_one(rp)) { 309 + list_del(&rp->rtort_free); 310 + rcu_torture_free(rp); 311 + } 312 + } 313 + } 314 + 293 315 static void 294 316 rcu_torture_cb(struct rcu_head *p) 295 317 { 296 - int i; 297 318 struct rcu_torture *rp = container_of(p, struct rcu_torture, rtort_rcu); 298 319 299 320 if (torture_must_stop_irq()) { ··· 339 284 /* The next initialization will pick up the pieces. 
*/ 340 285 return; 341 286 } 342 - i = rp->rtort_pipe_count; 343 - if (i > RCU_TORTURE_PIPE_LEN) 344 - i = RCU_TORTURE_PIPE_LEN; 345 - atomic_inc(&rcu_torture_wcount[i]); 346 - if (++rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) { 347 - rp->rtort_mbtest = 0; 287 + if (rcu_torture_pipe_update_one(rp)) 348 288 rcu_torture_free(rp); 349 - } else { 289 + else 350 290 cur_ops->deferred_free(rp); 351 - } 352 291 } 353 292 354 293 static int rcu_no_completed(void) ··· 361 312 } 362 313 363 314 static struct rcu_torture_ops rcu_ops = { 315 + .ttype = RCU_FLAVOR, 364 316 .init = rcu_sync_torture_init, 365 317 .readlock = rcu_torture_read_lock, 366 318 .read_delay = rcu_read_delay, ··· 370 320 .deferred_free = rcu_torture_deferred_free, 371 321 .sync = synchronize_rcu, 372 322 .exp_sync = synchronize_rcu_expedited, 323 + .get_state = get_state_synchronize_rcu, 324 + .cond_sync = cond_synchronize_rcu, 373 325 .call = call_rcu, 374 326 .cb_barrier = rcu_barrier, 375 327 .fqs = rcu_force_quiescent_state, ··· 407 355 } 408 356 409 357 static struct rcu_torture_ops rcu_bh_ops = { 358 + .ttype = RCU_BH_FLAVOR, 410 359 .init = rcu_sync_torture_init, 411 360 .readlock = rcu_bh_torture_read_lock, 412 361 .read_delay = rcu_read_delay, /* just reuse rcu's version. */ ··· 450 397 } 451 398 452 399 static struct rcu_torture_ops rcu_busted_ops = { 400 + .ttype = INVALID_RCU_FLAVOR, 453 401 .init = rcu_sync_torture_init, 454 402 .readlock = rcu_torture_read_lock, 455 403 .read_delay = rcu_read_delay, /* just reuse rcu's version. 
*/ ··· 533 479 page += sprintf(page, "%s%s per-CPU(idx=%d):", 534 480 torture_type, TORTURE_FLAG, idx); 535 481 for_each_possible_cpu(cpu) { 536 - page += sprintf(page, " %d(%lu,%lu)", cpu, 537 - per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[!idx], 538 - per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[idx]); 482 + long c0, c1; 483 + 484 + c0 = (long)per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[!idx]; 485 + c1 = (long)per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[idx]; 486 + page += sprintf(page, " %d(%ld,%ld)", cpu, c0, c1); 539 487 } 540 488 sprintf(page, "\n"); 541 489 } ··· 548 492 } 549 493 550 494 static struct rcu_torture_ops srcu_ops = { 495 + .ttype = SRCU_FLAVOR, 551 496 .init = rcu_sync_torture_init, 552 497 .readlock = srcu_torture_read_lock, 553 498 .read_delay = srcu_read_delay, ··· 584 527 } 585 528 586 529 static struct rcu_torture_ops sched_ops = { 530 + .ttype = RCU_SCHED_FLAVOR, 587 531 .init = rcu_sync_torture_init, 588 532 .readlock = sched_torture_read_lock, 589 533 .read_delay = rcu_read_delay, /* just reuse rcu's version. */ ··· 746 688 static int 747 689 rcu_torture_writer(void *arg) 748 690 { 749 - bool exp; 691 + unsigned long gp_snap; 692 + bool gp_cond1 = gp_cond, gp_exp1 = gp_exp, gp_normal1 = gp_normal; 693 + bool gp_sync1 = gp_sync; 750 694 int i; 751 695 struct rcu_torture *rp; 752 - struct rcu_torture *rp1; 753 696 struct rcu_torture *old_rp; 754 697 static DEFINE_TORTURE_RANDOM(rand); 698 + int synctype[] = { RTWS_DEF_FREE, RTWS_EXP_SYNC, 699 + RTWS_COND_GET, RTWS_SYNC }; 700 + int nsynctypes = 0; 755 701 756 702 VERBOSE_TOROUT_STRING("rcu_torture_writer task started"); 757 - set_user_nice(current, MAX_NICE); 703 + 704 + /* Initialize synctype[] array. If none set, take default. 
*/ 705 + if (!gp_cond1 && !gp_exp1 && !gp_normal1 && !gp_sync) 706 + gp_cond1 = gp_exp1 = gp_normal1 = gp_sync1 = true; 707 + if (gp_cond1 && cur_ops->get_state && cur_ops->cond_sync) 708 + synctype[nsynctypes++] = RTWS_COND_GET; 709 + else if (gp_cond && (!cur_ops->get_state || !cur_ops->cond_sync)) 710 + pr_alert("rcu_torture_writer: gp_cond without primitives.\n"); 711 + if (gp_exp1 && cur_ops->exp_sync) 712 + synctype[nsynctypes++] = RTWS_EXP_SYNC; 713 + else if (gp_exp && !cur_ops->exp_sync) 714 + pr_alert("rcu_torture_writer: gp_exp without primitives.\n"); 715 + if (gp_normal1 && cur_ops->deferred_free) 716 + synctype[nsynctypes++] = RTWS_DEF_FREE; 717 + else if (gp_normal && !cur_ops->deferred_free) 718 + pr_alert("rcu_torture_writer: gp_normal without primitives.\n"); 719 + if (gp_sync1 && cur_ops->sync) 720 + synctype[nsynctypes++] = RTWS_SYNC; 721 + else if (gp_sync && !cur_ops->sync) 722 + pr_alert("rcu_torture_writer: gp_sync without primitives.\n"); 723 + if (WARN_ONCE(nsynctypes == 0, 724 + "rcu_torture_writer: No update-side primitives.\n")) { 725 + /* 726 + * No updates primitives, so don't try updating. 727 + * The resulting test won't be testing much, hence the 728 + * above WARN_ONCE(). 
729 + */ 730 + rcu_torture_writer_state = RTWS_STOPPING; 731 + torture_kthread_stopping("rcu_torture_writer"); 732 + } 758 733 759 734 do { 735 + rcu_torture_writer_state = RTWS_FIXED_DELAY; 760 736 schedule_timeout_uninterruptible(1); 761 737 rp = rcu_torture_alloc(); 762 738 if (rp == NULL) 763 739 continue; 764 740 rp->rtort_pipe_count = 0; 741 + rcu_torture_writer_state = RTWS_DELAY; 765 742 udelay(torture_random(&rand) & 0x3ff); 743 + rcu_torture_writer_state = RTWS_REPLACE; 766 744 old_rp = rcu_dereference_check(rcu_torture_current, 767 745 current == writer_task); 768 746 rp->rtort_mbtest = 1; ··· 810 716 i = RCU_TORTURE_PIPE_LEN; 811 717 atomic_inc(&rcu_torture_wcount[i]); 812 718 old_rp->rtort_pipe_count++; 813 - if (gp_normal == gp_exp) 814 - exp = !!(torture_random(&rand) & 0x80); 815 - else 816 - exp = gp_exp; 817 - if (!exp) { 719 + switch (synctype[torture_random(&rand) % nsynctypes]) { 720 + case RTWS_DEF_FREE: 721 + rcu_torture_writer_state = RTWS_DEF_FREE; 818 722 cur_ops->deferred_free(old_rp); 819 - } else { 723 + break; 724 + case RTWS_EXP_SYNC: 725 + rcu_torture_writer_state = RTWS_EXP_SYNC; 820 726 cur_ops->exp_sync(); 821 - list_add(&old_rp->rtort_free, 822 - &rcu_torture_removed); 823 - list_for_each_entry_safe(rp, rp1, 824 - &rcu_torture_removed, 825 - rtort_free) { 826 - i = rp->rtort_pipe_count; 827 - if (i > RCU_TORTURE_PIPE_LEN) 828 - i = RCU_TORTURE_PIPE_LEN; 829 - atomic_inc(&rcu_torture_wcount[i]); 830 - if (++rp->rtort_pipe_count >= 831 - RCU_TORTURE_PIPE_LEN) { 832 - rp->rtort_mbtest = 0; 833 - list_del(&rp->rtort_free); 834 - rcu_torture_free(rp); 835 - } 836 - } 727 + rcu_torture_pipe_update(old_rp); 728 + break; 729 + case RTWS_COND_GET: 730 + rcu_torture_writer_state = RTWS_COND_GET; 731 + gp_snap = cur_ops->get_state(); 732 + i = torture_random(&rand) % 16; 733 + if (i != 0) 734 + schedule_timeout_interruptible(i); 735 + udelay(torture_random(&rand) % 1000); 736 + rcu_torture_writer_state = RTWS_COND_SYNC; 737 + 
cur_ops->cond_sync(gp_snap); 738 + rcu_torture_pipe_update(old_rp); 739 + break; 740 + case RTWS_SYNC: 741 + rcu_torture_writer_state = RTWS_SYNC; 742 + cur_ops->sync(); 743 + rcu_torture_pipe_update(old_rp); 744 + break; 745 + default: 746 + WARN_ON_ONCE(1); 747 + break; 837 748 } 838 749 } 839 750 rcutorture_record_progress(++rcu_torture_current_version); 751 + rcu_torture_writer_state = RTWS_STUTTER; 840 752 stutter_wait("rcu_torture_writer"); 841 753 } while (!torture_must_stop()); 754 + rcu_torture_writer_state = RTWS_STOPPING; 842 755 torture_kthread_stopping("rcu_torture_writer"); 843 756 return 0; 844 757 } ··· 885 784 return 0; 886 785 } 887 786 888 - void rcutorture_trace_dump(void) 787 + static void rcutorture_trace_dump(void) 889 788 { 890 789 static atomic_t beenhere = ATOMIC_INIT(0); 891 790 ··· 1019 918 __this_cpu_inc(rcu_torture_batch[completed]); 1020 919 preempt_enable(); 1021 920 cur_ops->readunlock(idx); 1022 - schedule(); 921 + cond_resched(); 1023 922 stutter_wait("rcu_torture_reader"); 1024 923 } while (!torture_must_stop()); 1025 - if (irqreader && cur_ops->irq_capable) 924 + if (irqreader && cur_ops->irq_capable) { 1026 925 del_timer_sync(&t); 926 + destroy_timer_on_stack(&t); 927 + } 1027 928 torture_kthread_stopping("rcu_torture_reader"); 1028 929 return 0; 1029 930 } ··· 1040 937 int i; 1041 938 long pipesummary[RCU_TORTURE_PIPE_LEN + 1] = { 0 }; 1042 939 long batchsummary[RCU_TORTURE_PIPE_LEN + 1] = { 0 }; 940 + static unsigned long rtcv_snap = ULONG_MAX; 1043 941 1044 942 for_each_possible_cpu(cpu) { 1045 943 for (i = 0; i < RCU_TORTURE_PIPE_LEN + 1; i++) { ··· 1101 997 page += sprintf(page, "\n"); 1102 998 if (cur_ops->stats) 1103 999 cur_ops->stats(page); 1000 + if (rtcv_snap == rcu_torture_current_version && 1001 + rcu_torture_current != NULL) { 1002 + int __maybe_unused flags; 1003 + unsigned long __maybe_unused gpnum; 1004 + unsigned long __maybe_unused completed; 1005 + 1006 + rcutorture_get_gp_data(cur_ops->ttype, 1007 + &flags, 
&gpnum, &completed); 1008 + page += sprintf(page, 1009 + "??? Writer stall state %d g%lu c%lu f%#x\n", 1010 + rcu_torture_writer_state, 1011 + gpnum, completed, flags); 1012 + show_rcu_gp_kthreads(); 1013 + rcutorture_trace_dump(); 1014 + } 1015 + rtcv_snap = rcu_torture_current_version; 1104 1016 } 1105 1017 1106 1018 /* ··· 1266 1146 } 1267 1147 1268 1148 /* Callback function for RCU barrier testing. */ 1269 - void rcu_torture_barrier_cbf(struct rcu_head *rcu) 1149 + static void rcu_torture_barrier_cbf(struct rcu_head *rcu) 1270 1150 { 1271 1151 atomic_inc(&barrier_cbs_invoked); 1272 1152 } ··· 1536 1416 &rcu_ops, &rcu_bh_ops, &rcu_busted_ops, &srcu_ops, &sched_ops, 1537 1417 }; 1538 1418 1539 - torture_init_begin(torture_type, verbose, &rcutorture_runnable); 1419 + if (!torture_init_begin(torture_type, verbose, &rcutorture_runnable)) 1420 + return -EBUSY; 1540 1421 1541 1422 /* Process args and tell the world that the torturer is on the job. */ 1542 1423 for (i = 0; i < ARRAY_SIZE(torture_ops); i++) { ··· 1562 1441 if (cur_ops->init) 1563 1442 cur_ops->init(); /* no "goto unwind" prior to this point!!! */ 1564 1443 1565 - if (nreaders >= 0) 1444 + if (nreaders >= 0) { 1566 1445 nrealreaders = nreaders; 1567 - else 1568 - nrealreaders = 2 * num_online_cpus(); 1446 + } else { 1447 + nrealreaders = num_online_cpus() - 1; 1448 + if (nrealreaders <= 0) 1449 + nrealreaders = 1; 1450 + } 1569 1451 rcu_torture_print_module_parms(cur_ops, "Start of test"); 1570 1452 1571 1453 /* Set up the freelist. */ ··· 1657 1533 fqs_duration = 0; 1658 1534 if (fqs_duration) { 1659 1535 /* Create the fqs thread */ 1660 - torture_create_kthread(rcu_torture_fqs, NULL, fqs_task); 1536 + firsterr = torture_create_kthread(rcu_torture_fqs, NULL, 1537 + fqs_task); 1661 1538 if (firsterr) 1662 1539 goto unwind; 1663 1540 }
+4 -4
kernel/rcu/tiny_plugin.h
··· 144 144 return; 145 145 rcp->ticks_this_gp++; 146 146 j = jiffies; 147 - js = rcp->jiffies_stall; 147 + js = ACCESS_ONCE(rcp->jiffies_stall); 148 148 if (*rcp->curtail && ULONG_CMP_GE(j, js)) { 149 149 pr_err("INFO: %s stall on CPU (%lu ticks this GP) idle=%llx (t=%lu jiffies q=%ld)\n", 150 150 rcp->name, rcp->ticks_this_gp, rcu_dynticks_nesting, ··· 152 152 dump_stack(); 153 153 } 154 154 if (*rcp->curtail && ULONG_CMP_GE(j, js)) 155 - rcp->jiffies_stall = jiffies + 155 + ACCESS_ONCE(rcp->jiffies_stall) = jiffies + 156 156 3 * rcu_jiffies_till_stall_check() + 3; 157 157 else if (ULONG_CMP_GE(j, js)) 158 - rcp->jiffies_stall = jiffies + rcu_jiffies_till_stall_check(); 158 + ACCESS_ONCE(rcp->jiffies_stall) = jiffies + rcu_jiffies_till_stall_check(); 159 159 } 160 160 161 161 static void reset_cpu_stall_ticks(struct rcu_ctrlblk *rcp) 162 162 { 163 163 rcp->ticks_this_gp = 0; 164 164 rcp->gp_start = jiffies; 165 - rcp->jiffies_stall = jiffies + rcu_jiffies_till_stall_check(); 165 + ACCESS_ONCE(rcp->jiffies_stall) = jiffies + rcu_jiffies_till_stall_check(); 166 166 } 167 167 168 168 static void check_cpu_stalls(void)
+217 -92
kernel/rcu/tree.c
··· 101 101 RCU_STATE_INITIALIZER(rcu_sched, 's', call_rcu_sched); 102 102 RCU_STATE_INITIALIZER(rcu_bh, 'b', call_rcu_bh); 103 103 104 - static struct rcu_state *rcu_state; 104 + static struct rcu_state *rcu_state_p; 105 105 LIST_HEAD(rcu_struct_flavors); 106 106 107 107 /* Increase (but not decrease) the CONFIG_RCU_FANOUT_LEAF at boot time. */ ··· 243 243 module_param(jiffies_till_first_fqs, ulong, 0644); 244 244 module_param(jiffies_till_next_fqs, ulong, 0644); 245 245 246 - static void rcu_start_gp_advanced(struct rcu_state *rsp, struct rcu_node *rnp, 246 + static bool rcu_start_gp_advanced(struct rcu_state *rsp, struct rcu_node *rnp, 247 247 struct rcu_data *rdp); 248 248 static void force_qs_rnp(struct rcu_state *rsp, 249 249 int (*f)(struct rcu_data *rsp, bool *isidle, ··· 271 271 EXPORT_SYMBOL_GPL(rcu_batches_completed_bh); 272 272 273 273 /* 274 + * Force a quiescent state. 275 + */ 276 + void rcu_force_quiescent_state(void) 277 + { 278 + force_quiescent_state(rcu_state_p); 279 + } 280 + EXPORT_SYMBOL_GPL(rcu_force_quiescent_state); 281 + 282 + /* 274 283 * Force a quiescent state for RCU BH. 275 284 */ 276 285 void rcu_bh_force_quiescent_state(void) ··· 287 278 force_quiescent_state(&rcu_bh_state); 288 279 } 289 280 EXPORT_SYMBOL_GPL(rcu_bh_force_quiescent_state); 281 + 282 + /* 283 + * Show the state of the grace-period kthreads. 284 + */ 285 + void show_rcu_gp_kthreads(void) 286 + { 287 + struct rcu_state *rsp; 288 + 289 + for_each_rcu_flavor(rsp) { 290 + pr_info("%s: wait state: %d ->state: %#lx\n", 291 + rsp->name, rsp->gp_state, rsp->gp_kthread->state); 292 + /* sched_show_task(rsp->gp_kthread); */ 293 + } 294 + } 295 + EXPORT_SYMBOL_GPL(show_rcu_gp_kthreads); 290 296 291 297 /* 292 298 * Record the number of times rcutorture tests have been initiated and ··· 316 292 rcutorture_vernum = 0; 317 293 } 318 294 EXPORT_SYMBOL_GPL(rcutorture_record_test_transition); 295 + 296 + /* 297 + * Send along grace-period-related data for rcutorture diagnostics. 
298 + */ 299 + void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags, 300 + unsigned long *gpnum, unsigned long *completed) 301 + { 302 + struct rcu_state *rsp = NULL; 303 + 304 + switch (test_type) { 305 + case RCU_FLAVOR: 306 + rsp = rcu_state_p; 307 + break; 308 + case RCU_BH_FLAVOR: 309 + rsp = &rcu_bh_state; 310 + break; 311 + case RCU_SCHED_FLAVOR: 312 + rsp = &rcu_sched_state; 313 + break; 314 + default: 315 + break; 316 + } 317 + if (rsp != NULL) { 318 + *flags = ACCESS_ONCE(rsp->gp_flags); 319 + *gpnum = ACCESS_ONCE(rsp->gpnum); 320 + *completed = ACCESS_ONCE(rsp->completed); 321 + return; 322 + } 323 + *flags = 0; 324 + *gpnum = 0; 325 + *completed = 0; 326 + } 327 + EXPORT_SYMBOL_GPL(rcutorture_get_gp_data); 319 328 320 329 /* 321 330 * Record the number of writer passes through the current rcutorture test. ··· 381 324 } 382 325 383 326 /* 327 + * Return the root node of the specified rcu_state structure. 328 + */ 329 + static struct rcu_node *rcu_get_root(struct rcu_state *rsp) 330 + { 331 + return &rsp->node[0]; 332 + } 333 + 334 + /* 335 + * Is there any need for future grace periods? 336 + * Interrupts must be disabled. If the caller does not hold the root 337 + * rnp_node structure's ->lock, the results are advisory only. 338 + */ 339 + static int rcu_future_needs_gp(struct rcu_state *rsp) 340 + { 341 + struct rcu_node *rnp = rcu_get_root(rsp); 342 + int idx = (ACCESS_ONCE(rnp->completed) + 1) & 0x1; 343 + int *fp = &rnp->need_future_gp[idx]; 344 + 345 + return ACCESS_ONCE(*fp); 346 + } 347 + 348 + /* 384 349 * Does the current CPU require a not-yet-started grace period? 385 350 * The caller must have disabled interrupts to prevent races with 386 351 * normal callback registry. ··· 414 335 415 336 if (rcu_gp_in_progress(rsp)) 416 337 return 0; /* No, a grace period is already in progress. */ 417 - if (rcu_nocb_needs_gp(rsp)) 338 + if (rcu_future_needs_gp(rsp)) 418 339 return 1; /* Yes, a no-CBs CPU needs one. 
*/ 419 340 if (!rdp->nxttail[RCU_NEXT_TAIL]) 420 341 return 0; /* No, this is a no-CBs (or offline) CPU. */ ··· 426 347 rdp->nxtcompleted[i])) 427 348 return 1; /* Yes, CBs for future grace period. */ 428 349 return 0; /* No grace period needed. */ 429 - } 430 - 431 - /* 432 - * Return the root node of the specified rcu_state structure. 433 - */ 434 - static struct rcu_node *rcu_get_root(struct rcu_state *rsp) 435 - { 436 - return &rsp->node[0]; 437 350 } 438 351 439 352 /* ··· 829 758 { 830 759 rdp->dynticks_snap = atomic_add_return(0, &rdp->dynticks->dynticks); 831 760 rcu_sysidle_check_cpu(rdp, isidle, maxj); 832 - return (rdp->dynticks_snap & 0x1) == 0; 761 + if ((rdp->dynticks_snap & 0x1) == 0) { 762 + trace_rcu_fqs(rdp->rsp->name, rdp->gpnum, rdp->cpu, TPS("dti")); 763 + return 1; 764 + } else { 765 + return 0; 766 + } 833 767 } 834 768 835 769 /* ··· 910 834 * we will beat on the first one until it gets unstuck, then move 911 835 * to the next. Only do this for the primary flavor of RCU. 912 836 */ 913 - if (rdp->rsp == rcu_state && 837 + if (rdp->rsp == rcu_state_p && 914 838 ULONG_CMP_GE(jiffies, rdp->rsp->jiffies_resched)) { 915 839 rdp->rsp->jiffies_resched += 5; 916 840 resched_cpu(rdp->cpu); ··· 927 851 rsp->gp_start = j; 928 852 smp_wmb(); /* Record start time before stall time. */ 929 853 j1 = rcu_jiffies_till_stall_check(); 930 - rsp->jiffies_stall = j + j1; 854 + ACCESS_ONCE(rsp->jiffies_stall) = j + j1; 931 855 rsp->jiffies_resched = j + j1 / 2; 932 856 } 933 857 ··· 966 890 /* Only let one CPU complain about others per time interval. 
*/ 967 891 968 892 raw_spin_lock_irqsave(&rnp->lock, flags); 969 - delta = jiffies - rsp->jiffies_stall; 893 + delta = jiffies - ACCESS_ONCE(rsp->jiffies_stall); 970 894 if (delta < RCU_STALL_RAT_DELAY || !rcu_gp_in_progress(rsp)) { 971 895 raw_spin_unlock_irqrestore(&rnp->lock, flags); 972 896 return; 973 897 } 974 - rsp->jiffies_stall = jiffies + 3 * rcu_jiffies_till_stall_check() + 3; 898 + ACCESS_ONCE(rsp->jiffies_stall) = jiffies + 3 * rcu_jiffies_till_stall_check() + 3; 975 899 raw_spin_unlock_irqrestore(&rnp->lock, flags); 976 900 977 901 /* ··· 1008 932 print_cpu_stall_info_end(); 1009 933 for_each_possible_cpu(cpu) 1010 934 totqlen += per_cpu_ptr(rsp->rda, cpu)->qlen; 1011 - pr_cont("(detected by %d, t=%ld jiffies, g=%lu, c=%lu, q=%lu)\n", 935 + pr_cont("(detected by %d, t=%ld jiffies, g=%ld, c=%ld, q=%lu)\n", 1012 936 smp_processor_id(), (long)(jiffies - rsp->gp_start), 1013 - rsp->gpnum, rsp->completed, totqlen); 937 + (long)rsp->gpnum, (long)rsp->completed, totqlen); 1014 938 if (ndetected == 0) 1015 939 pr_err("INFO: Stall ended before state dump start\n"); 1016 940 else if (!trigger_all_cpu_backtrace()) ··· 1022 946 1023 947 force_quiescent_state(rsp); /* Kick them all. */ 1024 948 } 1025 - 1026 - /* 1027 - * This function really isn't for public consumption, but RCU is special in 1028 - * that context switches can allow the state machine to make progress. 
1029 - */ 1030 - extern void resched_cpu(int cpu); 1031 949 1032 950 static void print_cpu_stall(struct rcu_state *rsp) 1033 951 { ··· 1041 971 print_cpu_stall_info_end(); 1042 972 for_each_possible_cpu(cpu) 1043 973 totqlen += per_cpu_ptr(rsp->rda, cpu)->qlen; 1044 - pr_cont(" (t=%lu jiffies g=%lu c=%lu q=%lu)\n", 1045 - jiffies - rsp->gp_start, rsp->gpnum, rsp->completed, totqlen); 974 + pr_cont(" (t=%lu jiffies g=%ld c=%ld q=%lu)\n", 975 + jiffies - rsp->gp_start, 976 + (long)rsp->gpnum, (long)rsp->completed, totqlen); 1046 977 if (!trigger_all_cpu_backtrace()) 1047 978 dump_stack(); 1048 979 1049 980 raw_spin_lock_irqsave(&rnp->lock, flags); 1050 - if (ULONG_CMP_GE(jiffies, rsp->jiffies_stall)) 1051 - rsp->jiffies_stall = jiffies + 981 + if (ULONG_CMP_GE(jiffies, ACCESS_ONCE(rsp->jiffies_stall))) 982 + ACCESS_ONCE(rsp->jiffies_stall) = jiffies + 1052 983 3 * rcu_jiffies_till_stall_check() + 3; 1053 984 raw_spin_unlock_irqrestore(&rnp->lock, flags); 1054 985 ··· 1133 1062 struct rcu_state *rsp; 1134 1063 1135 1064 for_each_rcu_flavor(rsp) 1136 - rsp->jiffies_stall = jiffies + ULONG_MAX / 2; 1065 + ACCESS_ONCE(rsp->jiffies_stall) = jiffies + ULONG_MAX / 2; 1137 1066 } 1138 1067 1139 1068 /* ··· 1194 1123 /* 1195 1124 * Start some future grace period, as needed to handle newly arrived 1196 1125 * callbacks. The required future grace periods are recorded in each 1197 - * rcu_node structure's ->need_future_gp field. 1126 + * rcu_node structure's ->need_future_gp field. Returns true if there 1127 + * is reason to awaken the grace-period kthread. 1198 1128 * 1199 1129 * The caller must hold the specified rcu_node structure's ->lock. 
1200 1130 */ 1201 - static unsigned long __maybe_unused 1202 - rcu_start_future_gp(struct rcu_node *rnp, struct rcu_data *rdp) 1131 + static bool __maybe_unused 1132 + rcu_start_future_gp(struct rcu_node *rnp, struct rcu_data *rdp, 1133 + unsigned long *c_out) 1203 1134 { 1204 1135 unsigned long c; 1205 1136 int i; 1137 + bool ret = false; 1206 1138 struct rcu_node *rnp_root = rcu_get_root(rdp->rsp); 1207 1139 1208 1140 /* ··· 1216 1142 trace_rcu_future_gp(rnp, rdp, c, TPS("Startleaf")); 1217 1143 if (rnp->need_future_gp[c & 0x1]) { 1218 1144 trace_rcu_future_gp(rnp, rdp, c, TPS("Prestartleaf")); 1219 - return c; 1145 + goto out; 1220 1146 } 1221 1147 1222 1148 /* ··· 1230 1156 ACCESS_ONCE(rnp->gpnum) != ACCESS_ONCE(rnp->completed)) { 1231 1157 rnp->need_future_gp[c & 0x1]++; 1232 1158 trace_rcu_future_gp(rnp, rdp, c, TPS("Startedleaf")); 1233 - return c; 1159 + goto out; 1234 1160 } 1235 1161 1236 1162 /* ··· 1271 1197 trace_rcu_future_gp(rnp, rdp, c, TPS("Startedleafroot")); 1272 1198 } else { 1273 1199 trace_rcu_future_gp(rnp, rdp, c, TPS("Startedroot")); 1274 - rcu_start_gp_advanced(rdp->rsp, rnp_root, rdp); 1200 + ret = rcu_start_gp_advanced(rdp->rsp, rnp_root, rdp); 1275 1201 } 1276 1202 unlock_out: 1277 1203 if (rnp != rnp_root) 1278 1204 raw_spin_unlock(&rnp_root->lock); 1279 - return c; 1205 + out: 1206 + if (c_out != NULL) 1207 + *c_out = c; 1208 + return ret; 1280 1209 } 1281 1210 1282 1211 /* ··· 1303 1226 } 1304 1227 1305 1228 /* 1229 + * Awaken the grace-period kthread for the specified flavor of RCU. 1230 + * Don't do a self-awaken, and don't bother awakening when there is 1231 + * nothing for the grace-period kthread to do (as in several CPUs 1232 + * raced to awaken, and we lost), and finally don't try to awaken 1233 + * a kthread that has not yet been created. 
1234 + */ 1235 + static void rcu_gp_kthread_wake(struct rcu_state *rsp) 1236 + { 1237 + if (current == rsp->gp_kthread || 1238 + !ACCESS_ONCE(rsp->gp_flags) || 1239 + !rsp->gp_kthread) 1240 + return; 1241 + wake_up(&rsp->gp_wq); 1242 + } 1243 + 1244 + /* 1306 1245 * If there is room, assign a ->completed number to any callbacks on 1307 1246 * this CPU that have not already been assigned. Also accelerate any 1308 1247 * callbacks that were previously assigned a ->completed number that has 1309 1248 * since proven to be too conservative, which can happen if callbacks get 1310 1249 * assigned a ->completed number while RCU is idle, but with reference to 1311 1250 * a non-root rcu_node structure. This function is idempotent, so it does 1312 - * not hurt to call it repeatedly. 1251 + * not hurt to call it repeatedly. Returns an flag saying that we should 1252 + * awaken the RCU grace-period kthread. 1313 1253 * 1314 1254 * The caller must hold rnp->lock with interrupts disabled. 1315 1255 */ 1316 - static void rcu_accelerate_cbs(struct rcu_state *rsp, struct rcu_node *rnp, 1256 + static bool rcu_accelerate_cbs(struct rcu_state *rsp, struct rcu_node *rnp, 1317 1257 struct rcu_data *rdp) 1318 1258 { 1319 1259 unsigned long c; 1320 1260 int i; 1261 + bool ret; 1321 1262 1322 1263 /* If the CPU has no callbacks, nothing to do. */ 1323 1264 if (!rdp->nxttail[RCU_NEXT_TAIL] || !*rdp->nxttail[RCU_DONE_TAIL]) 1324 - return; 1265 + return false; 1325 1266 1326 1267 /* 1327 1268 * Starting from the sublist containing the callbacks most ··· 1368 1273 * be grouped into. 1369 1274 */ 1370 1275 if (++i >= RCU_NEXT_TAIL) 1371 - return; 1276 + return false; 1372 1277 1373 1278 /* 1374 1279 * Assign all subsequent callbacks' ->completed number to the next ··· 1380 1285 rdp->nxtcompleted[i] = c; 1381 1286 } 1382 1287 /* Record any needed additional grace periods. 
*/ 1383 - rcu_start_future_gp(rnp, rdp); 1288 + ret = rcu_start_future_gp(rnp, rdp, NULL); 1384 1289 1385 1290 /* Trace depending on how much we were able to accelerate. */ 1386 1291 if (!*rdp->nxttail[RCU_WAIT_TAIL]) 1387 1292 trace_rcu_grace_period(rsp->name, rdp->gpnum, TPS("AccWaitCB")); 1388 1293 else 1389 1294 trace_rcu_grace_period(rsp->name, rdp->gpnum, TPS("AccReadyCB")); 1295 + return ret; 1390 1296 } 1391 1297 1392 1298 /* ··· 1396 1300 * assign ->completed numbers to any callbacks in the RCU_NEXT_TAIL 1397 1301 * sublist. This function is idempotent, so it does not hurt to 1398 1302 * invoke it repeatedly. As long as it is not invoked -too- often... 1303 + * Returns true if the RCU grace-period kthread needs to be awakened. 1399 1304 * 1400 1305 * The caller must hold rnp->lock with interrupts disabled. 1401 1306 */ 1402 - static void rcu_advance_cbs(struct rcu_state *rsp, struct rcu_node *rnp, 1307 + static bool rcu_advance_cbs(struct rcu_state *rsp, struct rcu_node *rnp, 1403 1308 struct rcu_data *rdp) 1404 1309 { 1405 1310 int i, j; 1406 1311 1407 1312 /* If the CPU has no callbacks, nothing to do. */ 1408 1313 if (!rdp->nxttail[RCU_NEXT_TAIL] || !*rdp->nxttail[RCU_DONE_TAIL]) 1409 - return; 1314 + return false; 1410 1315 1411 1316 /* 1412 1317 * Find all callbacks whose ->completed numbers indicate that they ··· 1431 1334 } 1432 1335 1433 1336 /* Classify any remaining callbacks. */ 1434 - rcu_accelerate_cbs(rsp, rnp, rdp); 1337 + return rcu_accelerate_cbs(rsp, rnp, rdp); 1435 1338 } 1436 1339 1437 1340 /* 1438 1341 * Update CPU-local rcu_data state to record the beginnings and ends of 1439 1342 * grace periods. The caller must hold the ->lock of the leaf rcu_node 1440 1343 * structure corresponding to the current CPU, and must have irqs disabled. 1344 + * Returns true if the grace-period kthread needs to be awakened. 
1441 1345 */ 1442 - static void __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_data *rdp) 1346 + static bool __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp, 1347 + struct rcu_data *rdp) 1443 1348 { 1349 + bool ret; 1350 + 1444 1351 /* Handle the ends of any preceding grace periods first. */ 1445 1352 if (rdp->completed == rnp->completed) { 1446 1353 1447 1354 /* No grace period end, so just accelerate recent callbacks. */ 1448 - rcu_accelerate_cbs(rsp, rnp, rdp); 1355 + ret = rcu_accelerate_cbs(rsp, rnp, rdp); 1449 1356 1450 1357 } else { 1451 1358 1452 1359 /* Advance callbacks. */ 1453 - rcu_advance_cbs(rsp, rnp, rdp); 1360 + ret = rcu_advance_cbs(rsp, rnp, rdp); 1454 1361 1455 1362 /* Remember that we saw this grace-period completion. */ 1456 1363 rdp->completed = rnp->completed; ··· 1473 1372 rdp->qs_pending = !!(rnp->qsmask & rdp->grpmask); 1474 1373 zero_cpu_stall_ticks(rdp); 1475 1374 } 1375 + return ret; 1476 1376 } 1477 1377 1478 1378 static void note_gp_changes(struct rcu_state *rsp, struct rcu_data *rdp) 1479 1379 { 1480 1380 unsigned long flags; 1381 + bool needwake; 1481 1382 struct rcu_node *rnp; 1482 1383 1483 1384 local_irq_save(flags); ··· 1491 1388 return; 1492 1389 } 1493 1390 smp_mb__after_unlock_lock(); 1494 - __note_gp_changes(rsp, rnp, rdp); 1391 + needwake = __note_gp_changes(rsp, rnp, rdp); 1495 1392 raw_spin_unlock_irqrestore(&rnp->lock, flags); 1393 + if (needwake) 1394 + rcu_gp_kthread_wake(rsp); 1496 1395 } 1497 1396 1498 1397 /* ··· 1508 1403 rcu_bind_gp_kthread(); 1509 1404 raw_spin_lock_irq(&rnp->lock); 1510 1405 smp_mb__after_unlock_lock(); 1511 - if (rsp->gp_flags == 0) { 1406 + if (!ACCESS_ONCE(rsp->gp_flags)) { 1512 1407 /* Spurious wakeup, tell caller to go back to sleep. */ 1513 1408 raw_spin_unlock_irq(&rnp->lock); 1514 1409 return 0; 1515 1410 } 1516 - rsp->gp_flags = 0; /* Clear all flags: New grace period. 
*/ 1411 + ACCESS_ONCE(rsp->gp_flags) = 0; /* Clear all flags: New grace period. */ 1517 1412 1518 1413 if (WARN_ON_ONCE(rcu_gp_in_progress(rsp))) { 1519 1414 /* ··· 1558 1453 WARN_ON_ONCE(rnp->completed != rsp->completed); 1559 1454 ACCESS_ONCE(rnp->completed) = rsp->completed; 1560 1455 if (rnp == rdp->mynode) 1561 - __note_gp_changes(rsp, rnp, rdp); 1456 + (void)__note_gp_changes(rsp, rnp, rdp); 1562 1457 rcu_preempt_boost_start_gp(rnp); 1563 1458 trace_rcu_grace_period_init(rsp->name, rnp->gpnum, 1564 1459 rnp->level, rnp->grplo, ··· 1606 1501 if (ACCESS_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) { 1607 1502 raw_spin_lock_irq(&rnp->lock); 1608 1503 smp_mb__after_unlock_lock(); 1609 - rsp->gp_flags &= ~RCU_GP_FLAG_FQS; 1504 + ACCESS_ONCE(rsp->gp_flags) &= ~RCU_GP_FLAG_FQS; 1610 1505 raw_spin_unlock_irq(&rnp->lock); 1611 1506 } 1612 1507 return fqs_state; ··· 1618 1513 static void rcu_gp_cleanup(struct rcu_state *rsp) 1619 1514 { 1620 1515 unsigned long gp_duration; 1516 + bool needgp = false; 1621 1517 int nocb = 0; 1622 1518 struct rcu_data *rdp; 1623 1519 struct rcu_node *rnp = rcu_get_root(rsp); ··· 1654 1548 ACCESS_ONCE(rnp->completed) = rsp->gpnum; 1655 1549 rdp = this_cpu_ptr(rsp->rda); 1656 1550 if (rnp == rdp->mynode) 1657 - __note_gp_changes(rsp, rnp, rdp); 1551 + needgp = __note_gp_changes(rsp, rnp, rdp) || needgp; 1658 1552 /* smp_mb() provided by prior unlock-lock pair. */ 1659 1553 nocb += rcu_future_gp_cleanup(rsp, rnp); 1660 1554 raw_spin_unlock_irq(&rnp->lock); ··· 1670 1564 trace_rcu_grace_period(rsp->name, rsp->completed, TPS("end")); 1671 1565 rsp->fqs_state = RCU_GP_IDLE; 1672 1566 rdp = this_cpu_ptr(rsp->rda); 1673 - rcu_advance_cbs(rsp, rnp, rdp); /* Reduce false positives below. */ 1674 - if (cpu_needs_another_gp(rsp, rdp)) { 1675 - rsp->gp_flags = RCU_GP_FLAG_INIT; 1567 + /* Advance CBs to reduce false positives below. 
*/ 1568 + needgp = rcu_advance_cbs(rsp, rnp, rdp) || needgp; 1569 + if (needgp || cpu_needs_another_gp(rsp, rdp)) { 1570 + ACCESS_ONCE(rsp->gp_flags) = RCU_GP_FLAG_INIT; 1676 1571 trace_rcu_grace_period(rsp->name, 1677 1572 ACCESS_ONCE(rsp->gpnum), 1678 1573 TPS("newreq")); ··· 1700 1593 trace_rcu_grace_period(rsp->name, 1701 1594 ACCESS_ONCE(rsp->gpnum), 1702 1595 TPS("reqwait")); 1596 + rsp->gp_state = RCU_GP_WAIT_GPS; 1703 1597 wait_event_interruptible(rsp->gp_wq, 1704 1598 ACCESS_ONCE(rsp->gp_flags) & 1705 1599 RCU_GP_FLAG_INIT); ··· 1728 1620 trace_rcu_grace_period(rsp->name, 1729 1621 ACCESS_ONCE(rsp->gpnum), 1730 1622 TPS("fqswait")); 1623 + rsp->gp_state = RCU_GP_WAIT_FQS; 1731 1624 ret = wait_event_interruptible_timeout(rsp->gp_wq, 1732 1625 ((gf = ACCESS_ONCE(rsp->gp_flags)) & 1733 1626 RCU_GP_FLAG_FQS) || ··· 1774 1665 } 1775 1666 } 1776 1667 1777 - static void rsp_wakeup(struct irq_work *work) 1778 - { 1779 - struct rcu_state *rsp = container_of(work, struct rcu_state, wakeup_work); 1780 - 1781 - /* Wake up rcu_gp_kthread() to start the grace period. */ 1782 - wake_up(&rsp->gp_wq); 1783 - } 1784 - 1785 1668 /* 1786 1669 * Start a new RCU grace period if warranted, re-initializing the hierarchy 1787 1670 * in preparation for detecting the next grace period. The caller must hold ··· 1782 1681 * Note that it is legal for a dying CPU (which is marked as offline) to 1783 1682 * invoke this function. This can happen when the dying CPU reports its 1784 1683 * quiescent state. 1684 + * 1685 + * Returns true if the grace-period kthread must be awakened. 1785 1686 */ 1786 - static void 1687 + static bool 1787 1688 rcu_start_gp_advanced(struct rcu_state *rsp, struct rcu_node *rnp, 1788 1689 struct rcu_data *rdp) 1789 1690 { ··· 1796 1693 * or a grace period is already in progress. 1797 1694 * Either way, don't start a new grace period. 
1798 1695 */ 1799 - return; 1696 + return false; 1800 1697 } 1801 - rsp->gp_flags = RCU_GP_FLAG_INIT; 1698 + ACCESS_ONCE(rsp->gp_flags) = RCU_GP_FLAG_INIT; 1802 1699 trace_rcu_grace_period(rsp->name, ACCESS_ONCE(rsp->gpnum), 1803 1700 TPS("newreq")); 1804 1701 1805 1702 /* 1806 1703 * We can't do wakeups while holding the rnp->lock, as that 1807 1704 * could cause possible deadlocks with the rq->lock. Defer 1808 - * the wakeup to interrupt context. And don't bother waking 1809 - * up the running kthread. 1705 + * the wakeup to our caller. 1810 1706 */ 1811 - if (current != rsp->gp_kthread) 1812 - irq_work_queue(&rsp->wakeup_work); 1707 + return true; 1813 1708 } 1814 1709 1815 1710 /* ··· 1816 1715 * is invoked indirectly from rcu_advance_cbs(), which would result in 1817 1716 * endless recursion -- or would do so if it wasn't for the self-deadlock 1818 1717 * that is encountered beforehand. 1718 + * 1719 + * Returns true if the grace-period kthread needs to be awakened. 1819 1720 */ 1820 - static void 1821 - rcu_start_gp(struct rcu_state *rsp) 1721 + static bool rcu_start_gp(struct rcu_state *rsp) 1822 1722 { 1823 1723 struct rcu_data *rdp = this_cpu_ptr(rsp->rda); 1824 1724 struct rcu_node *rnp = rcu_get_root(rsp); 1725 + bool ret = false; 1825 1726 1826 1727 /* 1827 1728 * If there is no grace period in progress right now, any ··· 1833 1730 * resulting in pointless grace periods. So, advance callbacks 1834 1731 * then start the grace period! 
1835 1732 */ 1836 - rcu_advance_cbs(rsp, rnp, rdp); 1837 - rcu_start_gp_advanced(rsp, rnp, rdp); 1733 + ret = rcu_advance_cbs(rsp, rnp, rdp) || ret; 1734 + ret = rcu_start_gp_advanced(rsp, rnp, rdp) || ret; 1735 + return ret; 1838 1736 } 1839 1737 1840 1738 /* ··· 1924 1820 { 1925 1821 unsigned long flags; 1926 1822 unsigned long mask; 1823 + bool needwake; 1927 1824 struct rcu_node *rnp; 1928 1825 1929 1826 rnp = rdp->mynode; ··· 1953 1848 * This GP can't end until cpu checks in, so all of our 1954 1849 * callbacks can be processed during the next GP. 1955 1850 */ 1956 - rcu_accelerate_cbs(rsp, rnp, rdp); 1851 + needwake = rcu_accelerate_cbs(rsp, rnp, rdp); 1957 1852 1958 1853 rcu_report_qs_rnp(mask, rsp, rnp, flags); /* rlses rnp->lock */ 1854 + if (needwake) 1855 + rcu_gp_kthread_wake(rsp); 1959 1856 } 1960 1857 } 1961 1858 ··· 2058 1951 static void rcu_adopt_orphan_cbs(struct rcu_state *rsp, unsigned long flags) 2059 1952 { 2060 1953 int i; 2061 - struct rcu_data *rdp = __this_cpu_ptr(rsp->rda); 1954 + struct rcu_data *rdp = raw_cpu_ptr(rsp->rda); 2062 1955 2063 1956 /* No-CBs CPUs are handled specially. */ 2064 1957 if (rcu_nocb_adopt_orphan_cbs(rsp, rdp, flags)) ··· 2427 2320 raw_spin_unlock_irqrestore(&rnp_old->lock, flags); 2428 2321 return; /* Someone beat us to it. */ 2429 2322 } 2430 - rsp->gp_flags |= RCU_GP_FLAG_FQS; 2323 + ACCESS_ONCE(rsp->gp_flags) |= RCU_GP_FLAG_FQS; 2431 2324 raw_spin_unlock_irqrestore(&rnp_old->lock, flags); 2432 2325 wake_up(&rsp->gp_wq); /* Memory barrier implied by wake_up() path. 
*/ 2433 2326 } ··· 2441 2334 __rcu_process_callbacks(struct rcu_state *rsp) 2442 2335 { 2443 2336 unsigned long flags; 2444 - struct rcu_data *rdp = __this_cpu_ptr(rsp->rda); 2337 + bool needwake; 2338 + struct rcu_data *rdp = raw_cpu_ptr(rsp->rda); 2445 2339 2446 2340 WARN_ON_ONCE(rdp->beenonline == 0); 2447 2341 ··· 2453 2345 local_irq_save(flags); 2454 2346 if (cpu_needs_another_gp(rsp, rdp)) { 2455 2347 raw_spin_lock(&rcu_get_root(rsp)->lock); /* irqs disabled. */ 2456 - rcu_start_gp(rsp); 2348 + needwake = rcu_start_gp(rsp); 2457 2349 raw_spin_unlock_irqrestore(&rcu_get_root(rsp)->lock, flags); 2350 + if (needwake) 2351 + rcu_gp_kthread_wake(rsp); 2458 2352 } else { 2459 2353 local_irq_restore(flags); 2460 2354 } ··· 2514 2404 static void __call_rcu_core(struct rcu_state *rsp, struct rcu_data *rdp, 2515 2405 struct rcu_head *head, unsigned long flags) 2516 2406 { 2407 + bool needwake; 2408 + 2517 2409 /* 2518 2410 * If called from an extended quiescent state, invoke the RCU 2519 2411 * core in order to force a re-evaluation of RCU's idleness. ··· 2545 2433 2546 2434 raw_spin_lock(&rnp_root->lock); 2547 2435 smp_mb__after_unlock_lock(); 2548 - rcu_start_gp(rsp); 2436 + needwake = rcu_start_gp(rsp); 2549 2437 raw_spin_unlock(&rnp_root->lock); 2438 + if (needwake) 2439 + rcu_gp_kthread_wake(rsp); 2550 2440 } else { 2551 2441 /* Give the grace period a kick. */ 2552 2442 rdp->blimit = LONG_MAX; ··· 2649 2535 __call_rcu(head, func, &rcu_bh_state, -1, 0); 2650 2536 } 2651 2537 EXPORT_SYMBOL_GPL(call_rcu_bh); 2538 + 2539 + /* 2540 + * Queue an RCU callback for lazy invocation after a grace period. 2541 + * This will likely be later named something like "call_rcu_lazy()", 2542 + * but this change will require some way of tagging the lazy RCU 2543 + * callbacks in the list of pending callbacks. Until then, this 2544 + * function may only be called from __kfree_rcu(). 
2545 + */ 2546 + void kfree_call_rcu(struct rcu_head *head, 2547 + void (*func)(struct rcu_head *rcu)) 2548 + { 2549 + __call_rcu(head, func, rcu_state_p, -1, 1); 2550 + } 2551 + EXPORT_SYMBOL_GPL(kfree_call_rcu); 2652 2552 2653 2553 /* 2654 2554 * Because a context switch is a grace period for RCU-sched and RCU-bh, ··· 2787 2659 * time-consuming work between get_state_synchronize_rcu() 2788 2660 * and cond_synchronize_rcu(). 2789 2661 */ 2790 - return smp_load_acquire(&rcu_state->gpnum); 2662 + return smp_load_acquire(&rcu_state_p->gpnum); 2791 2663 } 2792 2664 EXPORT_SYMBOL_GPL(get_state_synchronize_rcu); 2793 2665 ··· 2813 2685 * Ensure that this load happens before any RCU-destructive 2814 2686 * actions the caller might carry out after we return. 2815 2687 */ 2816 - newstate = smp_load_acquire(&rcu_state->completed); 2688 + newstate = smp_load_acquire(&rcu_state_p->completed); 2817 2689 if (ULONG_CMP_GE(oldstate, newstate)) 2818 2690 synchronize_rcu(); 2819 2691 } ··· 3116 2988 static void rcu_barrier_func(void *type) 3117 2989 { 3118 2990 struct rcu_state *rsp = type; 3119 - struct rcu_data *rdp = __this_cpu_ptr(rsp->rda); 2991 + struct rcu_data *rdp = raw_cpu_ptr(rsp->rda); 3120 2992 3121 2993 _rcu_barrier_trace(rsp, "IRQ", -1, rsp->n_barrier_done); 3122 2994 atomic_inc(&rsp->barrier_cpu_count); ··· 3288 3160 * that this CPU cannot possibly have any RCU callbacks in flight yet. 3289 3161 */ 3290 3162 static void 3291 - rcu_init_percpu_data(int cpu, struct rcu_state *rsp, int preemptible) 3163 + rcu_init_percpu_data(int cpu, struct rcu_state *rsp) 3292 3164 { 3293 3165 unsigned long flags; 3294 3166 unsigned long mask; ··· 3301 3173 /* Set up local state, ensuring consistent view of global state. */ 3302 3174 raw_spin_lock_irqsave(&rnp->lock, flags); 3303 3175 rdp->beenonline = 1; /* We have now been online. 
*/ 3304 - rdp->preemptible = preemptible; 3305 3176 rdp->qlen_last_fqs_check = 0; 3306 3177 rdp->n_force_qs_snap = rsp->n_force_qs; 3307 3178 rdp->blimit = blimit; ··· 3344 3217 struct rcu_state *rsp; 3345 3218 3346 3219 for_each_rcu_flavor(rsp) 3347 - rcu_init_percpu_data(cpu, rsp, 3348 - strcmp(rsp->name, "rcu_preempt") == 0); 3220 + rcu_init_percpu_data(cpu, rsp); 3349 3221 } 3350 3222 3351 3223 /* ··· 3354 3228 unsigned long action, void *hcpu) 3355 3229 { 3356 3230 long cpu = (long)hcpu; 3357 - struct rcu_data *rdp = per_cpu_ptr(rcu_state->rda, cpu); 3231 + struct rcu_data *rdp = per_cpu_ptr(rcu_state_p->rda, cpu); 3358 3232 struct rcu_node *rnp = rdp->mynode; 3359 3233 struct rcu_state *rsp; 3360 3234 ··· 3528 3402 rnp->qsmaskinit = 0; 3529 3403 rnp->grplo = j * cpustride; 3530 3404 rnp->grphi = (j + 1) * cpustride - 1; 3531 - if (rnp->grphi >= NR_CPUS) 3532 - rnp->grphi = NR_CPUS - 1; 3405 + if (rnp->grphi >= nr_cpu_ids) 3406 + rnp->grphi = nr_cpu_ids - 1; 3533 3407 if (i == 0) { 3534 3408 rnp->grpnum = 0; 3535 3409 rnp->grpmask = 0; ··· 3548 3422 3549 3423 rsp->rda = rda; 3550 3424 init_waitqueue_head(&rsp->gp_wq); 3551 - init_irq_work(&rsp->wakeup_work, rsp_wakeup); 3552 3425 rnp = rsp->level[rcu_num_lvls - 1]; 3553 3426 for_each_possible_cpu(i) { 3554 3427 while (i > rnp->grphi)
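A recurring pattern in the tree.c hunks above is worth calling out: rcu_start_gp(), rcu_start_gp_advanced(), rcu_advance_cbs(), and rcu_accelerate_cbs() now return a bool instead of waking the grace-period kthread themselves, and each caller invokes rcu_gp_kthread_wake() only after dropping rnp->lock. This replaces the irq_work-based rsp_wakeup() deferral and ensures no wakeup (and hence no rq->lock acquisition) ever happens while rnp->lock is held. Below is a minimal userspace model of that pattern; the names gp_state_model, start_gp_locked, and request_gp are hypothetical stand-ins for the kernel structures, not kernel APIs:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model of the "deferred wakeup" pattern adopted in tree.c:
 * work done under rnp->lock only *reports* that the grace-period
 * kthread needs waking; the caller issues the wakeup after dropping
 * the lock. */

struct gp_state_model {
    bool lock_held;     /* stands in for rnp->lock */
    bool gp_requested;  /* stands in for RCU_GP_FLAG_INIT being set */
    int wakeups;        /* counts rcu_gp_kthread_wake() calls */
};

/* Runs with the lock held; must not wake anything directly. */
static bool start_gp_locked(struct gp_state_model *s)
{
    assert(s->lock_held);
    if (s->gp_requested)
        return false;   /* GP already requested: no wakeup needed */
    s->gp_requested = true;
    return true;        /* caller must wake the GP kthread */
}

/* Caller pattern: lock, decide, unlock, then (maybe) wake. */
static void request_gp(struct gp_state_model *s)
{
    bool needwake;

    s->lock_held = true;          /* raw_spin_lock(&rnp->lock)   */
    needwake = start_gp_locked(s);
    s->lock_held = false;         /* raw_spin_unlock(&rnp->lock) */
    if (needwake)
        s->wakeups++;             /* rcu_gp_kthread_wake(rsp)    */
}
```

A second request finds the grace period already pending and reports no wakeup needed, mirroring how the kernel paths short-circuit when a grace period is already in progress.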
+7 -4
kernel/rcu/tree.h
··· 252 252 bool passed_quiesce; /* User-mode/idle loop etc. */ 253 253 bool qs_pending; /* Core waits for quiesc state. */ 254 254 bool beenonline; /* CPU online at least once. */ 255 - bool preemptible; /* Preemptible RCU? */ 256 255 struct rcu_node *mynode; /* This CPU's leaf of hierarchy */ 257 256 unsigned long grpmask; /* Mask to apply to leaf qsmask. */ 258 257 #ifdef CONFIG_RCU_CPU_STALL_INFO ··· 405 406 unsigned long completed; /* # of last completed gp. */ 406 407 struct task_struct *gp_kthread; /* Task for grace periods. */ 407 408 wait_queue_head_t gp_wq; /* Where GP task waits. */ 408 - int gp_flags; /* Commands for GP task. */ 409 + short gp_flags; /* Commands for GP task. */ 410 + short gp_state; /* GP kthread sleep state. */ 409 411 410 412 /* End of fields guarded by root rcu_node's lock. */ 411 413 ··· 462 462 const char *name; /* Name of structure. */ 463 463 char abbr; /* Abbreviated name. */ 464 464 struct list_head flavors; /* List of RCU flavors. */ 465 - struct irq_work wakeup_work; /* Postponed wakeups */ 466 465 }; 467 466 468 467 /* Values for rcu_state structure's gp_flags field. */ 469 468 #define RCU_GP_FLAG_INIT 0x1 /* Need grace-period initialization. */ 470 469 #define RCU_GP_FLAG_FQS 0x2 /* Need grace-period quiescent-state forcing. */ 470 + 471 + /* Values for rcu_state structure's gp_flags field. */ 472 + #define RCU_GP_WAIT_INIT 0 /* Initial state. */ 473 + #define RCU_GP_WAIT_GPS 1 /* Wait for grace-period start. */ 474 + #define RCU_GP_WAIT_FQS 2 /* Wait for force-quiescent-state time. 
*/ 471 475 472 476 extern struct list_head rcu_struct_flavors; 473 477 ··· 551 547 static void print_cpu_stall_info_end(void); 552 548 static void zero_cpu_stall_ticks(struct rcu_data *rdp); 553 549 static void increment_cpu_stall_ticks(void); 554 - static int rcu_nocb_needs_gp(struct rcu_state *rsp); 555 550 static void rcu_nocb_gp_set(struct rcu_node *rnp, int nrq); 556 551 static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp); 557 552 static void rcu_init_one_nocb(struct rcu_node *rnp);
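The new gp_state field above records which event the grace-period kthread is about to sleep on: tree.c sets it to RCU_GP_WAIT_GPS before the "reqwait" sleep and RCU_GP_WAIT_FQS before the "fqswait" sleep. That lets rcutorture distinguish a kthread stuck waiting for a grace-period request from one waiting for force-quiescent-state time. A tiny model using the same state values; the gp_state_name() diagnostic is a hypothetical illustration, not a kernel function:

```c
#include <assert.h>
#include <string.h>

/* State values mirror tree.h; gp_state is written just before each
 * wait_event_*() in the grace-period kthread's main loop. */
enum { RCU_GP_WAIT_INIT = 0, RCU_GP_WAIT_GPS = 1, RCU_GP_WAIT_FQS = 2 };

static int gp_state = RCU_GP_WAIT_INIT;

/* Hypothetical diagnostic: name the event a stalled kthread waits on. */
static const char *gp_state_name(int state)
{
    switch (state) {
    case RCU_GP_WAIT_GPS: return "wait for grace-period start";
    case RCU_GP_WAIT_FQS: return "wait for force-quiescent-state time";
    default:              return "initial state";
    }
}
```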
+37 -99
kernel/rcu/tree_plugin.h
··· 116 116 #ifdef CONFIG_TREE_PREEMPT_RCU 117 117 118 118 RCU_STATE_INITIALIZER(rcu_preempt, 'p', call_rcu); 119 - static struct rcu_state *rcu_state = &rcu_preempt_state; 119 + static struct rcu_state *rcu_state_p = &rcu_preempt_state; 120 120 121 121 static int rcu_preempted_readers_exp(struct rcu_node *rnp); 122 122 ··· 147 147 return rcu_batches_completed_preempt(); 148 148 } 149 149 EXPORT_SYMBOL_GPL(rcu_batches_completed); 150 - 151 - /* 152 - * Force a quiescent state for preemptible RCU. 153 - */ 154 - void rcu_force_quiescent_state(void) 155 - { 156 - force_quiescent_state(&rcu_preempt_state); 157 - } 158 - EXPORT_SYMBOL_GPL(rcu_force_quiescent_state); 159 150 160 151 /* 161 152 * Record a preemptible-RCU quiescent state for the specified CPU. Note ··· 679 688 } 680 689 EXPORT_SYMBOL_GPL(call_rcu); 681 690 682 - /* 683 - * Queue an RCU callback for lazy invocation after a grace period. 684 - * This will likely be later named something like "call_rcu_lazy()", 685 - * but this change will require some way of tagging the lazy RCU 686 - * callbacks in the list of pending callbacks. Until then, this 687 - * function may only be called from __kfree_rcu(). 688 - */ 689 - void kfree_call_rcu(struct rcu_head *head, 690 - void (*func)(struct rcu_head *rcu)) 691 - { 692 - __call_rcu(head, func, &rcu_preempt_state, -1, 1); 693 - } 694 - EXPORT_SYMBOL_GPL(kfree_call_rcu); 695 - 696 691 /** 697 692 * synchronize_rcu - wait until a grace period has elapsed. 698 693 * ··· 947 970 948 971 #else /* #ifdef CONFIG_TREE_PREEMPT_RCU */ 949 972 950 - static struct rcu_state *rcu_state = &rcu_sched_state; 973 + static struct rcu_state *rcu_state_p = &rcu_sched_state; 951 974 952 975 /* 953 976 * Tell them what RCU they are running. ··· 966 989 return rcu_batches_completed_sched(); 967 990 } 968 991 EXPORT_SYMBOL_GPL(rcu_batches_completed); 969 - 970 - /* 971 - * Force a quiescent state for RCU, which, because there is no preemptible 972 - * RCU, becomes the same as rcu-sched. 
973 - */ 974 - void rcu_force_quiescent_state(void) 975 - { 976 - rcu_sched_force_quiescent_state(); 977 - } 978 - EXPORT_SYMBOL_GPL(rcu_force_quiescent_state); 979 992 980 993 /* 981 994 * Because preemptible RCU does not exist, we never have to check for ··· 1045 1078 static void rcu_preempt_check_callbacks(int cpu) 1046 1079 { 1047 1080 } 1048 - 1049 - /* 1050 - * Queue an RCU callback for lazy invocation after a grace period. 1051 - * This will likely be later named something like "call_rcu_lazy()", 1052 - * but this change will require some way of tagging the lazy RCU 1053 - * callbacks in the list of pending callbacks. Until then, this 1054 - * function may only be called from __kfree_rcu(). 1055 - * 1056 - * Because there is no preemptible RCU, we use RCU-sched instead. 1057 - */ 1058 - void kfree_call_rcu(struct rcu_head *head, 1059 - void (*func)(struct rcu_head *rcu)) 1060 - { 1061 - __call_rcu(head, func, &rcu_sched_state, -1, 1); 1062 - } 1063 - EXPORT_SYMBOL_GPL(kfree_call_rcu); 1064 1081 1065 1082 /* 1066 1083 * Wait for an rcu-preempt grace period, but make it happen quickly. 
··· 1468 1517 for_each_possible_cpu(cpu) 1469 1518 per_cpu(rcu_cpu_has_work, cpu) = 0; 1470 1519 BUG_ON(smpboot_register_percpu_thread(&rcu_cpu_thread_spec)); 1471 - rnp = rcu_get_root(rcu_state); 1472 - (void)rcu_spawn_one_boost_kthread(rcu_state, rnp); 1520 + rnp = rcu_get_root(rcu_state_p); 1521 + (void)rcu_spawn_one_boost_kthread(rcu_state_p, rnp); 1473 1522 if (NUM_RCU_NODES > 1) { 1474 - rcu_for_each_leaf_node(rcu_state, rnp) 1475 - (void)rcu_spawn_one_boost_kthread(rcu_state, rnp); 1523 + rcu_for_each_leaf_node(rcu_state_p, rnp) 1524 + (void)rcu_spawn_one_boost_kthread(rcu_state_p, rnp); 1476 1525 } 1477 1526 return 0; 1478 1527 } ··· 1480 1529 1481 1530 static void rcu_prepare_kthreads(int cpu) 1482 1531 { 1483 - struct rcu_data *rdp = per_cpu_ptr(rcu_state->rda, cpu); 1532 + struct rcu_data *rdp = per_cpu_ptr(rcu_state_p->rda, cpu); 1484 1533 struct rcu_node *rnp = rdp->mynode; 1485 1534 1486 1535 /* Fire up the incoming CPU's kthread and leaf rcu_node kthread. */ 1487 1536 if (rcu_scheduler_fully_active) 1488 - (void)rcu_spawn_one_boost_kthread(rcu_state, rnp); 1537 + (void)rcu_spawn_one_boost_kthread(rcu_state_p, rnp); 1489 1538 } 1490 1539 1491 1540 #else /* #ifdef CONFIG_RCU_BOOST */ ··· 1695 1744 static void rcu_prepare_for_idle(int cpu) 1696 1745 { 1697 1746 #ifndef CONFIG_RCU_NOCB_CPU_ALL 1747 + bool needwake; 1698 1748 struct rcu_data *rdp; 1699 1749 struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu); 1700 1750 struct rcu_node *rnp; ··· 1744 1792 rnp = rdp->mynode; 1745 1793 raw_spin_lock(&rnp->lock); /* irqs already disabled. */ 1746 1794 smp_mb__after_unlock_lock(); 1747 - rcu_accelerate_cbs(rsp, rnp, rdp); 1795 + needwake = rcu_accelerate_cbs(rsp, rnp, rdp); 1748 1796 raw_spin_unlock(&rnp->lock); /* irqs remain disabled. 
*/ 1797 + if (needwake) 1798 + rcu_gp_kthread_wake(rsp); 1749 1799 } 1750 1800 #endif /* #ifndef CONFIG_RCU_NOCB_CPU_ALL */ 1751 1801 } ··· 1809 1855 struct rcu_data *rdp; 1810 1856 1811 1857 for_each_rcu_flavor(rsp) { 1812 - rdp = __this_cpu_ptr(rsp->rda); 1858 + rdp = raw_cpu_ptr(rsp->rda); 1813 1859 if (rdp->qlen_lazy != 0) { 1814 1860 atomic_inc(&oom_callback_count); 1815 1861 rsp->call(&rdp->oom_head, rcu_oom_callback); ··· 1951 1997 struct rcu_state *rsp; 1952 1998 1953 1999 for_each_rcu_flavor(rsp) 1954 - __this_cpu_ptr(rsp->rda)->ticks_this_gp++; 2000 + raw_cpu_inc(rsp->rda->ticks_this_gp); 1955 2001 } 1956 2002 1957 2003 #else /* #ifdef CONFIG_RCU_CPU_STALL_INFO */ ··· 2022 2068 early_param("rcu_nocb_poll", parse_rcu_nocb_poll); 2023 2069 2024 2070 /* 2025 - * Do any no-CBs CPUs need another grace period? 2026 - * 2027 - * Interrupts must be disabled. If the caller does not hold the root 2028 - * rnp_node structure's ->lock, the results are advisory only. 2029 - */ 2030 - static int rcu_nocb_needs_gp(struct rcu_state *rsp) 2031 - { 2032 - struct rcu_node *rnp = rcu_get_root(rsp); 2033 - 2034 - return rnp->need_future_gp[(ACCESS_ONCE(rnp->completed) + 1) & 0x1]; 2035 - } 2036 - 2037 - /* 2038 2071 * Wake up any no-CBs CPUs' kthreads that were waiting on the just-ended 2039 2072 * grace period. 2040 2073 */ ··· 2050 2109 } 2051 2110 2052 2111 #ifndef CONFIG_RCU_NOCB_CPU_ALL 2053 - /* Is the specified CPU a no-CPUs CPU? */ 2112 + /* Is the specified CPU a no-CBs CPU? 
*/ 2054 2113 bool rcu_is_nocb_cpu(int cpu) 2055 2114 { 2056 2115 if (have_rcu_nocb_mask) ··· 2184 2243 unsigned long c; 2185 2244 bool d; 2186 2245 unsigned long flags; 2246 + bool needwake; 2187 2247 struct rcu_node *rnp = rdp->mynode; 2188 2248 2189 2249 raw_spin_lock_irqsave(&rnp->lock, flags); 2190 2250 smp_mb__after_unlock_lock(); 2191 - c = rcu_start_future_gp(rnp, rdp); 2251 + needwake = rcu_start_future_gp(rnp, rdp, &c); 2192 2252 raw_spin_unlock_irqrestore(&rnp->lock, flags); 2253 + if (needwake) 2254 + rcu_gp_kthread_wake(rdp->rsp); 2193 2255 2194 2256 /* 2195 2257 * Wait for the grace period. Do so interruptibly to avoid messing ··· 2345 2401 } 2346 2402 2347 2403 #else /* #ifdef CONFIG_RCU_NOCB_CPU */ 2348 - 2349 - static int rcu_nocb_needs_gp(struct rcu_state *rsp) 2350 - { 2351 - return 0; 2352 - } 2353 2404 2354 2405 static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp) 2355 2406 { ··· 2596 2657 } 2597 2658 2598 2659 /* 2599 - * Bind the grace-period kthread for the sysidle flavor of RCU to the 2600 - * timekeeping CPU. 2601 - */ 2602 - static void rcu_bind_gp_kthread(void) 2603 - { 2604 - int cpu = ACCESS_ONCE(tick_do_timer_cpu); 2605 - 2606 - if (cpu < 0 || cpu >= nr_cpu_ids) 2607 - return; 2608 - if (raw_smp_processor_id() != cpu) 2609 - set_cpus_allowed_ptr(current, cpumask_of(cpu)); 2610 - } 2611 - 2612 - /* 2613 2660 * Return a delay in jiffies based on the number of CPUs, rcu_node 2614 2661 * leaf fanout, and jiffies tick rate. 
The idea is to allow larger 2615 2662 * systems more time to transition to full-idle state in order to ··· 2659 2734 static void rcu_sysidle_cancel(void) 2660 2735 { 2661 2736 smp_mb(); 2662 - ACCESS_ONCE(full_sysidle_state) = RCU_SYSIDLE_NOT; 2737 + if (full_sysidle_state > RCU_SYSIDLE_SHORT) 2738 + ACCESS_ONCE(full_sysidle_state) = RCU_SYSIDLE_NOT; 2663 2739 } 2664 2740 2665 2741 /* ··· 2806 2880 return false; 2807 2881 } 2808 2882 2809 - static void rcu_bind_gp_kthread(void) 2810 - { 2811 - } 2812 - 2813 2883 static void rcu_sysidle_report_gp(struct rcu_state *rsp, int isidle, 2814 2884 unsigned long maxj) 2815 2885 { ··· 2835 2913 return 1; 2836 2914 #endif /* #ifdef CONFIG_NO_HZ_FULL */ 2837 2915 return 0; 2916 + } 2917 + 2918 + /* 2919 + * Bind the grace-period kthread for the sysidle flavor of RCU to the 2920 + * timekeeping CPU. 2921 + */ 2922 + static void rcu_bind_gp_kthread(void) 2923 + { 2924 + #ifdef CONFIG_NO_HZ_FULL 2925 + int cpu = ACCESS_ONCE(tick_do_timer_cpu); 2926 + 2927 + if (cpu < 0 || cpu >= nr_cpu_ids) 2928 + return; 2929 + if (raw_smp_processor_id() != cpu) 2930 + set_cpus_allowed_ptr(current, cpumask_of(cpu)); 2931 + #endif /* #ifdef CONFIG_NO_HZ_FULL */ 2838 2932 }
+30
kernel/rcu/update.c
··· 320 320 return till_stall_check * HZ + RCU_STALL_DELAY_DELTA; 321 321 } 322 322 323 + void rcu_sysrq_start(void) 324 + { 325 + if (!rcu_cpu_stall_suppress) 326 + rcu_cpu_stall_suppress = 2; 327 + } 328 + 329 + void rcu_sysrq_end(void) 330 + { 331 + if (rcu_cpu_stall_suppress == 2) 332 + rcu_cpu_stall_suppress = 0; 333 + } 334 + 323 335 static int rcu_panic(struct notifier_block *this, unsigned long ev, void *ptr) 324 336 { 325 337 rcu_cpu_stall_suppress = 1; ··· 350 338 early_initcall(check_cpu_stall_init); 351 339 352 340 #endif /* #ifdef CONFIG_RCU_STALL_COMMON */ 341 + 342 + /* 343 + * Hooks for cond_resched() and friends to avoid RCU CPU stall warnings. 344 + */ 345 + 346 + DEFINE_PER_CPU(int, rcu_cond_resched_count); 347 + 348 + /* 349 + * Report a set of RCU quiescent states, for use by cond_resched() 350 + * and friends. Out of line due to being called infrequently. 351 + */ 352 + void rcu_resched(void) 353 + { 354 + preempt_disable(); 355 + __this_cpu_write(rcu_cond_resched_count, 0); 356 + rcu_note_context_switch(smp_processor_id()); 357 + preempt_enable(); 358 + }
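The rcu_sysrq_start()/rcu_sysrq_end() pair added above uses a tri-state rcu_cpu_stall_suppress so that sysrq processing can suppress stall warnings without clobbering a suppression the administrator already set (value 1): sysrq claims the variable only when it is 0, using the distinct value 2, and releases only its own claim. The same logic, extracted into a standalone model for clarity:

```c
#include <assert.h>

/* Model of the sysrq stall-warning suppression added to update.c.
 * 0 = warnings enabled, 1 = suppressed by the administrator,
 * 2 = suppressed on behalf of an in-flight sysrq request. */
static int rcu_cpu_stall_suppress;

static void rcu_sysrq_start(void)
{
    if (!rcu_cpu_stall_suppress)
        rcu_cpu_stall_suppress = 2;   /* claim only if not already set */
}

static void rcu_sysrq_end(void)
{
    if (rcu_cpu_stall_suppress == 2)
        rcu_cpu_stall_suppress = 0;   /* release only our own claim */
}
```

If the administrator already set the flag to 1, a sysrq start/end cycle leaves it at 1 throughout.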
+6 -1
kernel/sched/core.c
··· 4084 4084 4085 4085 int __sched _cond_resched(void) 4086 4086 { 4087 + rcu_cond_resched(); 4087 4088 if (should_resched()) { 4088 4089 __cond_resched(); 4089 4090 return 1; ··· 4103 4102 */ 4104 4103 int __cond_resched_lock(spinlock_t *lock) 4105 4104 { 4105 + bool need_rcu_resched = rcu_should_resched(); 4106 4106 int resched = should_resched(); 4107 4107 int ret = 0; 4108 4108 4109 4109 lockdep_assert_held(lock); 4110 4110 4111 - if (spin_needbreak(lock) || resched) { 4111 + if (spin_needbreak(lock) || resched || need_rcu_resched) { 4112 4112 spin_unlock(lock); 4113 4113 if (resched) 4114 4114 __cond_resched(); 4115 + else if (unlikely(need_rcu_resched)) 4116 + rcu_resched(); 4115 4117 else 4116 4118 cpu_relax(); 4117 4119 ret = 1; ··· 4128 4124 { 4129 4125 BUG_ON(!in_softirq()); 4130 4126 4127 + rcu_cond_resched(); /* BH disabled OK, just recording QSes. */ 4131 4128 if (should_resched()) { 4132 4129 local_bh_enable(); 4133 4130 __cond_resched();
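The _cond_resched() hunk above calls rcu_cond_resched(), whose definition lives outside this diff; per the "Make cond_resched() report RCU quiescent states" commit, the idea is that each cond_resched() bumps the per-CPU rcu_cond_resched_count shown in update.c, and once it crosses a threshold the out-of-line slow path rcu_resched() zeroes the counter and reports a quiescent state. A single-CPU sketch of that counting scheme; the threshold name and value here are illustrative, not the kernel's:

```c
#include <assert.h>

#define RCU_COND_RESCHED_LIM 256  /* illustrative threshold, not the kernel's */

static int rcu_cond_resched_count;        /* per-CPU in the kernel */
static int quiescent_states_reported;

/* Out-of-line slow path: reset the counter and report a QS
 * (rcu_note_context_switch() in the kernel). */
static void rcu_resched(void)
{
    rcu_cond_resched_count = 0;
    quiescent_states_reported++;
}

/* Fast path invoked from cond_resched() and friends. */
static void rcu_cond_resched(void)
{
    if (++rcu_cond_resched_count >= RCU_COND_RESCHED_LIM)
        rcu_resched();
}
```

This keeps the common case to a single counter increment while still guaranteeing that a loop spinning on cond_resched() eventually reports a quiescent state and avoids an RCU CPU stall warning.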
+1 -3
kernel/softirq.c
··· 232 232 bool in_hardirq; 233 233 __u32 pending; 234 234 int softirq_bit; 235 - int cpu; 236 235 237 236 /* 238 237 * Mask out PF_MEMALLOC s current task context is borrowed for the ··· 246 247 __local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET); 247 248 in_hardirq = lockdep_softirq_start(); 248 249 249 - cpu = smp_processor_id(); 250 250 restart: 251 251 /* Reset the pending bitmask before enabling irqs */ 252 252 set_softirq_pending(0); ··· 274 276 prev_count, preempt_count()); 275 277 preempt_count_set(prev_count); 276 278 } 277 - rcu_bh_qs(cpu); 278 279 h++; 279 280 pending >>= softirq_bit; 280 281 } 281 282 283 + rcu_bh_qs(smp_processor_id()); 282 284 local_irq_disable(); 283 285 284 286 pending = local_softirq_pending();
+27 -13
kernel/torture.c
··· 335 335 shuffle_idle_cpu = cpumask_next(shuffle_idle_cpu, shuffle_tmp_mask); 336 336 if (shuffle_idle_cpu >= nr_cpu_ids) 337 337 shuffle_idle_cpu = -1; 338 - if (shuffle_idle_cpu != -1) { 338 + else 339 339 cpumask_clear_cpu(shuffle_idle_cpu, shuffle_tmp_mask); 340 - if (cpumask_empty(shuffle_tmp_mask)) { 341 - put_online_cpus(); 342 - return; 343 - } 344 - } 345 340 346 341 mutex_lock(&shuffle_task_mutex); 347 342 list_for_each_entry(stp, &shuffle_task_list, st_l) ··· 528 533 while (ACCESS_ONCE(stutter_pause_test) || 529 534 (torture_runnable && !ACCESS_ONCE(*torture_runnable))) { 530 535 if (stutter_pause_test) 531 - schedule_timeout_interruptible(1); 536 + if (ACCESS_ONCE(stutter_pause_test) == 1) 537 + schedule_timeout_interruptible(1); 538 + else 539 + while (ACCESS_ONCE(stutter_pause_test)) 540 + cond_resched(); 532 541 else 533 542 schedule_timeout_interruptible(round_jiffies_relative(HZ)); 534 543 torture_shutdown_absorb(title); ··· 549 550 VERBOSE_TOROUT_STRING("torture_stutter task started"); 550 551 do { 551 552 if (!torture_must_stop()) { 552 - schedule_timeout_interruptible(stutter); 553 + if (stutter > 1) { 554 + schedule_timeout_interruptible(stutter - 1); 555 + ACCESS_ONCE(stutter_pause_test) = 2; 556 + } 557 + schedule_timeout_interruptible(1); 553 558 ACCESS_ONCE(stutter_pause_test) = 1; 554 559 } 555 560 if (!torture_must_stop()) ··· 599 596 * The runnable parameter points to a flag that controls whether or not 600 597 * the test is currently runnable. If there is no such flag, pass in NULL. 
601 598 */ 602 - void __init torture_init_begin(char *ttype, bool v, int *runnable) 599 + bool torture_init_begin(char *ttype, bool v, int *runnable) 603 600 { 604 601 mutex_lock(&fullstop_mutex); 602 + if (torture_type != NULL) { 603 + pr_alert("torture_init_begin: refusing %s init: %s running", 604 + ttype, torture_type); 605 + mutex_unlock(&fullstop_mutex); 606 + return false; 607 + } 605 608 torture_type = ttype; 606 609 verbose = v; 607 610 torture_runnable = runnable; 608 611 fullstop = FULLSTOP_DONTSTOP; 609 - 612 + return true; 610 613 } 611 614 EXPORT_SYMBOL_GPL(torture_init_begin); 612 615 613 616 /* 614 617 * Tell the torture module that initialization is complete. 615 618 */ 616 - void __init torture_init_end(void) 619 + void torture_init_end(void) 617 620 { 618 621 mutex_unlock(&fullstop_mutex); 619 622 register_reboot_notifier(&torture_shutdown_nb); ··· 651 642 torture_shuffle_cleanup(); 652 643 torture_stutter_cleanup(); 653 644 torture_onoff_cleanup(); 645 + mutex_lock(&fullstop_mutex); 646 + torture_type = NULL; 647 + mutex_unlock(&fullstop_mutex); 654 648 return false; 655 649 } 656 650 EXPORT_SYMBOL_GPL(torture_cleanup); ··· 686 674 */ 687 675 void torture_kthread_stopping(char *title) 688 676 { 689 - if (verbose) 690 - VERBOSE_TOROUT_STRING(title); 677 + char buf[128]; 678 + 679 + snprintf(buf, sizeof(buf), "Stopping %s", title); 680 + VERBOSE_TOROUT_STRING(buf); 691 681 while (!kthread_should_stop()) { 692 682 torture_shutdown_absorb(title); 693 683 schedule_timeout_uninterruptible(1);
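The reworked stutter logic above is two-phase: torture_stutter() sets stutter_pause_test to 2 for all but the last jiffy of the pause and to 1 for the final jiffy, and stutter_wait() sleeps one tick while the value is 1 but busy-waits with cond_resched() while it is 2, so all torture kthreads resume nearly in lockstep when the pause ends. The waiter's branch reduced to a pure decision function; the enum and function names are hypothetical:

```c
#include <assert.h>

enum stutter_action { STUTTER_RUN, STUTTER_SLEEP_TICK, STUTTER_SPIN };

/* Mirrors the stutter_wait() branch added above: 0 means run freely,
 * 1 means schedule_timeout_interruptible(1) (sleep one jiffy), and
 * anything else (the new value 2) means spin calling cond_resched()
 * until the flag changes. */
static enum stutter_action stutter_wait_action(int stutter_pause_test)
{
    if (!stutter_pause_test)
        return STUTTER_RUN;
    if (stutter_pause_test == 1)
        return STUTTER_SLEEP_TICK;
    return STUTTER_SPIN;
}
```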
+1 -1
tools/testing/selftests/rcutorture/bin/configinit.sh
··· 62 62 echo "cat - $c" >> $T/upd.sh 63 63 make mrproper 64 64 make $buildloc distclean > $builddir/Make.distclean 2>&1 65 - make $buildloc defconfig > $builddir/Make.defconfig.out 2>&1 65 + make $buildloc $TORTURE_DEFCONFIG > $builddir/Make.defconfig.out 2>&1 66 66 mv $builddir/.config $builddir/.config.sav 67 67 sh $T/upd.sh < $builddir/.config.sav > $builddir/.config 68 68 cp $builddir/.config $builddir/.config.new
+36 -12
tools/testing/selftests/rcutorture/bin/functions.sh
··· 76 76 grep -q '^CONFIG_HOTPLUG_CPU=y$' "$1" 77 77 } 78 78 79 + # identify_boot_image qemu-cmd 80 + # 81 + # Returns the relative path to the kernel build image. This will be 82 + # arch/<arch>/boot/bzImage unless overridden with the TORTURE_BOOT_IMAGE 83 + # environment variable. 84 + identify_boot_image () { 85 + if test -n "$TORTURE_BOOT_IMAGE" 86 + then 87 + echo $TORTURE_BOOT_IMAGE 88 + else 89 + case "$1" in 90 + qemu-system-x86_64|qemu-system-i386) 91 + echo arch/x86/boot/bzImage 92 + ;; 93 + qemu-system-ppc64) 94 + echo arch/powerpc/boot/bzImage 95 + ;; 96 + *) 97 + echo "" 98 + ;; 99 + esac 100 + fi 101 + } 102 + 79 103 # identify_qemu builddir 80 104 # 81 105 # Returns our best guess as to which qemu command is appropriate for 82 - # the kernel at hand. Override with the RCU_QEMU_CMD environment variable. 106 + # the kernel at hand. Override with the TORTURE_QEMU_CMD environment variable. 83 107 identify_qemu () { 84 108 local u="`file "$1"`" 85 - if test -n "$RCU_QEMU_CMD" 109 + if test -n "$TORTURE_QEMU_CMD" 86 110 then 87 - echo $RCU_QEMU_CMD 111 + echo $TORTURE_QEMU_CMD 88 112 elif echo $u | grep -q x86-64 89 113 then 90 114 echo qemu-system-x86_64 ··· 122 98 echo Cannot figure out what qemu command to use! 1>&2 123 99 echo file $1 output: $u 124 100 # Usually this will be one of /usr/bin/qemu-system-* 125 - # Use RCU_QEMU_CMD environment variable or appropriate 101 + # Use TORTURE_QEMU_CMD environment variable or appropriate 126 102 # argument to top-level script. 127 103 exit 1 128 104 fi ··· 131 107 # identify_qemu_append qemu-cmd 132 108 # 133 109 # Output arguments for the qemu "-append" string based on CPU type 134 - # and the RCU_QEMU_INTERACTIVE environment variable. 110 + # and the TORTURE_QEMU_INTERACTIVE environment variable. 
135 111 identify_qemu_append () { 136 112 case "$1" in 137 113 qemu-system-x86_64|qemu-system-i386) 138 114 echo noapic selinux=0 initcall_debug debug 139 115 ;; 140 116 esac 141 - if test -n "$RCU_QEMU_INTERACTIVE" 117 + if test -n "$TORTURE_QEMU_INTERACTIVE" 142 118 then 143 119 echo root=/dev/sda 144 120 else ··· 148 124 149 125 # identify_qemu_args qemu-cmd serial-file 150 126 # 151 - # Output arguments for qemu arguments based on the RCU_QEMU_MAC 152 - # and RCU_QEMU_INTERACTIVE environment variables. 127 + # Output arguments for qemu arguments based on the TORTURE_QEMU_MAC 128 + # and TORTURE_QEMU_INTERACTIVE environment variables. 153 129 identify_qemu_args () { 154 130 case "$1" in 155 131 qemu-system-x86_64|qemu-system-i386) ··· 157 133 qemu-system-ppc64) 158 134 echo -enable-kvm -M pseries -cpu POWER7 -nodefaults 159 135 echo -device spapr-vscsi 160 - if test -n "$RCU_QEMU_INTERACTIVE" -a -n "$RCU_QEMU_MAC" 136 + if test -n "$TORTURE_QEMU_INTERACTIVE" -a -n "$TORTURE_QEMU_MAC" 161 137 then 162 - echo -device spapr-vlan,netdev=net0,mac=$RCU_QEMU_MAC 138 + echo -device spapr-vlan,netdev=net0,mac=$TORTURE_QEMU_MAC 163 139 echo -netdev bridge,br=br0,id=net0 164 - elif test -n "$RCU_QEMU_INTERACTIVE" 140 + elif test -n "$TORTURE_QEMU_INTERACTIVE" 165 141 then 166 142 echo -net nic -net user 167 143 fi 168 144 ;; 169 145 esac 170 - if test -n "$RCU_QEMU_INTERACTIVE" 146 + if test -n "$TORTURE_QEMU_INTERACTIVE" 171 147 then 172 148 echo -monitor stdio -serial pty -S 173 149 else
+3 -3
tools/testing/selftests/rcutorture/bin/kvm-build.sh
··· 45 45 trap 'rm -rf $T' 0 46 46 mkdir $T 47 47 48 - cat ${config_template} | grep -v CONFIG_RCU_TORTURE_TEST > $T/config 48 + grep -v 'CONFIG_[A-Z]*_TORTURE_TEST' < ${config_template} > $T/config 49 49 cat << ___EOF___ >> $T/config 50 - CONFIG_INITRAMFS_SOURCE="$RCU_INITRD" 50 + CONFIG_INITRAMFS_SOURCE="$TORTURE_INITRD" 51 51 CONFIG_VIRTIO_PCI=y 52 52 CONFIG_VIRTIO_CONSOLE=y 53 53 ___EOF___ ··· 60 60 exit 2 61 61 fi 62 62 ncpus=`cpus2use.sh` 63 - make O=$builddir -j$ncpus $RCU_KMAKE_ARG > $builddir/Make.out 2>&1 63 + make O=$builddir -j$ncpus $TORTURE_KMAKE_ARG > $builddir/Make.out 2>&1 64 64 retval=$? 65 65 if test $retval -ne 0 || grep "rcu[^/]*": < $builddir/Make.out | egrep -q "Stop|Error|error:|warning:" || egrep -q "Stop|Error|error:" < $builddir/Make.out 66 66 then
+1 -1
tools/testing/selftests/rcutorture/bin/kvm-recheck-lock.sh
··· 35 35 ncs=`grep "Writes: Total:" $i/console.log 2> /dev/null | tail -1 | sed -e 's/^.* Total: //' -e 's/ .*$//'` 36 36 if test -z "$ncs" 37 37 then 38 - echo $configfile 38 + echo "$configfile -------" 39 39 else 40 40 title="$configfile ------- $ncs acquisitions/releases" 41 41 dur=`sed -e 's/^.* locktorture.shutdown_secs=//' -e 's/ .*$//' < $i/qemu-cmd 2> /dev/null`
+1 -1
tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh
··· 35 35 ngps=`grep ver: $i/console.log 2> /dev/null | tail -1 | sed -e 's/^.* ver: //' -e 's/ .*$//'` 36 36 if test -z "$ngps" 37 37 then 38 - echo $configfile 38 + echo "$configfile -------" 39 39 else 40 40 title="$configfile ------- $ngps grace periods" 41 41 dur=`sed -e 's/^.* rcutorture.shutdown_secs=//' -e 's/ .*$//' < $i/qemu-cmd 2> /dev/null`
+18 -6
tools/testing/selftests/rcutorture/bin/kvm-recheck.sh
··· 25 25 # Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com> 26 26 27 27 PATH=`pwd`/tools/testing/selftests/rcutorture/bin:$PATH; export PATH 28 + . tools/testing/selftests/rcutorture/bin/functions.sh 28 29 for rd in "$@" 29 30 do 30 31 firsttime=1 ··· 40 39 fi 41 40 TORTURE_SUITE="`cat $i/../TORTURE_SUITE`" 42 41 kvm-recheck-${TORTURE_SUITE}.sh $i 43 - configcheck.sh $i/.config $i/ConfigFragment 44 - parse-build.sh $i/Make.out $configfile 45 - parse-rcutorture.sh $i/console.log $configfile 46 - parse-console.sh $i/console.log $configfile 47 - if test -r $i/Warnings 42 + if test -f "$i/console.log" 48 43 then 49 - cat $i/Warnings 44 + configcheck.sh $i/.config $i/ConfigFragment 45 + parse-build.sh $i/Make.out $configfile 46 + parse-torture.sh $i/console.log $configfile 47 + parse-console.sh $i/console.log $configfile 48 + if test -r $i/Warnings 49 + then 50 + cat $i/Warnings 51 + fi 52 + else 53 + if test -f "$i/qemu-cmd" 54 + then 55 + print_bug qemu failed 56 + else 57 + print_bug Build failed 58 + fi 59 + echo " $i" 50 60 fi 51 61 done 52 62 done
+33 -14
tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
··· 94 94 # CONFIG_YENTA=n 95 95 if kvm-build.sh $config_template $builddir $T 96 96 then 97 + QEMU="`identify_qemu $builddir/vmlinux`" 98 + BOOT_IMAGE="`identify_boot_image $QEMU`" 97 99 cp $builddir/Make*.out $resdir 98 100 cp $builddir/.config $resdir 99 - cp $builddir/arch/x86/boot/bzImage $resdir 101 + if test -n "$BOOT_IMAGE" 102 + then 103 + cp $builddir/$BOOT_IMAGE $resdir 104 + else 105 + echo No identifiable boot image, not running KVM, see $resdir. 106 + echo Do the torture scripts know about your architecture? 107 + fi 100 108 parse-build.sh $resdir/Make.out $title 101 109 if test -f $builddir.wait 102 110 then ··· 112 104 fi 113 105 else 114 106 cp $builddir/Make*.out $resdir 107 + cp $builddir/.config $resdir || : 115 108 echo Build failed, not running KVM, see $resdir. 116 109 if test -f $builddir.wait 117 110 then ··· 132 123 cd $KVM 133 124 kstarttime=`awk 'BEGIN { print systime() }' < /dev/null` 134 125 echo ' ---' `date`: Starting kernel 135 - 136 - # Determine the appropriate flavor of qemu command. 137 - QEMU="`identify_qemu $builddir/vmlinux`" 138 126 139 127 # Generate -smp qemu argument. 140 128 qemu_args="-nographic $qemu_args" ··· 157 151 # Generate kernel-version-specific boot parameters 158 152 boot_args="`per_version_boot_params "$boot_args" $builddir/.config $seconds`" 159 153 160 - echo $QEMU $qemu_args -m 512 -kernel $builddir/arch/x86/boot/bzImage -append \"$qemu_append $boot_args\" > $resdir/qemu-cmd 161 - if test -n "$RCU_BUILDONLY" 154 + echo $QEMU $qemu_args -m 512 -kernel $builddir/$BOOT_IMAGE -append \"$qemu_append $boot_args\" > $resdir/qemu-cmd 155 + if test -n "$TORTURE_BUILDONLY" 162 156 then 163 157 echo Build-only run specified, boot/test omitted. 164 158 exit 0 165 159 fi 166 - $QEMU $qemu_args -m 512 -kernel $builddir/arch/x86/boot/bzImage -append "$qemu_append $boot_args" & 160 + ( $QEMU $qemu_args -m 512 -kernel $builddir/$BOOT_IMAGE -append "$qemu_append $boot_args"; echo $? 
> $resdir/qemu-retval ) & 167 161 qemu_pid=$! 168 162 commandcompleted=0 169 163 echo Monitoring qemu job at pid $qemu_pid 170 - for ((i=0;i<$seconds;i++)) 164 + while : 171 165 do 166 + kruntime=`awk 'BEGIN { print systime() - '"$kstarttime"' }' < /dev/null` 172 167 if kill -0 $qemu_pid > /dev/null 2>&1 173 168 then 169 + if test $kruntime -ge $seconds 170 + then 171 + break; 172 + fi 174 173 sleep 1 175 174 else 176 175 commandcompleted=1 177 - kruntime=`awk 'BEGIN { print systime() - '"$kstarttime"' }' < /dev/null` 178 176 if test $kruntime -lt $seconds 179 177 then 180 178 echo Completed in $kruntime vs. $seconds >> $resdir/Warnings 2>&1 179 + grep "^(qemu) qemu:" $resdir/kvm-test-1-run.sh.out >> $resdir/Warnings 2>&1 180 + killpid="`sed -n "s/^(qemu) qemu: terminating on signal [0-9]* from pid \([0-9]*\).*$/\1/p" $resdir/Warnings`" 181 + if test -n "$killpid" 182 + then 183 + echo "ps -fp $killpid" >> $resdir/Warnings 2>&1 184 + ps -fp $killpid >> $resdir/Warnings 2>&1 185 + fi 181 186 else 182 187 echo ' ---' `date`: Kernel done 183 188 fi ··· 198 181 if test $commandcompleted -eq 0 199 182 then 200 183 echo Grace period for qemu job at pid $qemu_pid 201 - for ((i=0;i<=$grace;i++)) 184 + while : 202 185 do 186 + kruntime=`awk 'BEGIN { print systime() - '"$kstarttime"' }' < /dev/null` 203 187 if kill -0 $qemu_pid > /dev/null 2>&1 204 188 then 205 - sleep 1 189 + : 206 190 else 207 191 break 208 192 fi 209 - if test $i -eq $grace 193 + if test $kruntime -ge $((seconds + grace)) 210 194 then 211 - kruntime=`awk 'BEGIN { print systime() - '"$kstarttime"' }'` 212 195 echo "!!! Hang at $kruntime vs. $seconds seconds" >> $resdir/Warnings 2>&1 213 196 kill -KILL $qemu_pid 197 + break 214 198 fi 199 + sleep 1 215 200 done 216 201 fi 217 202 218 203 cp $builddir/console.log $resdir 219 - parse-${TORTURE_SUITE}torture.sh $resdir/console.log $title 204 + parse-torture.sh $resdir/console.log $title 220 205 parse-console.sh $resdir/console.log $title
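The loops above replace fixed-iteration `for ((i=0;i<$seconds;i++))` counting with wall-clock elapsed time, so slow `sleep`s or scheduling delays can no longer stretch the effective timeout. A minimal standalone sketch of that pattern, assuming a short `sleep` as a stand-in for the qemu invocation and a made-up `seconds` budget:

```shell
#!/bin/sh
# Sketch: watch a background job using wall-clock elapsed time rather
# than loop iterations. "seconds" and the backgrounded command are
# stand-ins for the real qemu run.
seconds=3
( sleep 1 ) &			# stand-in for the qemu invocation
job_pid=$!
kstarttime=`awk 'BEGIN { print systime() }' < /dev/null`
commandcompleted=0
while :
do
	kruntime=`awk 'BEGIN { print systime() - '"$kstarttime"' }' < /dev/null`
	if kill -0 $job_pid > /dev/null 2>&1
	then
		# Still running: stop waiting once the allotted time is up.
		if test $kruntime -ge $seconds
		then
			break
		fi
		sleep 1
	else
		commandcompleted=1	# job exited on its own
		break
	fi
done
echo commandcompleted=$commandcompleted kruntime=$kruntime
```

As in the script, `awk`'s `systime()` supplies the current time because plain `sh` has no portable `$SECONDS`.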
+70 -72
tools/testing/selftests/rcutorture/bin/kvm.sh
··· 38 38 dryrun="" 39 39 KVM="`pwd`/tools/testing/selftests/rcutorture"; export KVM 40 40 PATH=${KVM}/bin:$PATH; export PATH 41 - builddir="${KVM}/b1" 42 - RCU_INITRD="$KVM/initrd"; export RCU_INITRD 43 - RCU_KMAKE_ARG=""; export RCU_KMAKE_ARG 41 + TORTURE_DEFCONFIG=defconfig 42 + TORTURE_BOOT_IMAGE="" 43 + TORTURE_INITRD="$KVM/initrd"; export TORTURE_INITRD 44 + TORTURE_KMAKE_ARG="" 44 45 TORTURE_SUITE=rcu 45 46 resdir="" 46 47 configs="" ··· 54 53 usage () { 55 54 echo "Usage: $scriptname optional arguments:" 56 55 echo " --bootargs kernel-boot-arguments" 57 - echo " --builddir absolute-pathname" 56 + echo " --bootimage relative-path-to-kernel-boot-image" 58 57 echo " --buildonly" 59 58 echo " --configs \"config-file list\"" 60 59 echo " --cpus N" 61 60 echo " --datestamp string" 61 + echo " --defconfig string" 62 62 echo " --dryrun sched|script" 63 63 echo " --duration minutes" 64 64 echo " --interactive" ··· 69 67 echo " --no-initrd" 70 68 echo " --qemu-args qemu-system-..." 71 69 echo " --qemu-cmd qemu-system-..." 
72 - echo " --relbuilddir relative-pathname" 73 70 echo " --results absolute-pathname" 74 71 echo " --torture rcu" 75 72 exit 1 ··· 79 78 case "$1" in 80 79 --bootargs) 81 80 checkarg --bootargs "(list of kernel boot arguments)" "$#" "$2" '.*' '^--' 82 - RCU_BOOTARGS="$2" 81 + TORTURE_BOOTARGS="$2" 83 82 shift 84 83 ;; 85 - --builddir) 86 - checkarg --builddir "(absolute pathname)" "$#" "$2" '^/' '^error' 87 - builddir=$2 88 - gotbuilddir=1 84 + --bootimage) 85 + checkarg --bootimage "(relative path to kernel boot image)" "$#" "$2" '[a-zA-Z0-9][a-zA-Z0-9_]*' '^--' 86 + TORTURE_BOOT_IMAGE="$2" 89 87 shift 90 88 ;; 91 89 --buildonly) 92 - RCU_BUILDONLY=1; export RCU_BUILDONLY 90 + TORTURE_BUILDONLY=1 93 91 ;; 94 92 --configs) 95 93 checkarg --configs "(list of config files)" "$#" "$2" '^[^/]*$' '^--' ··· 105 105 ds=$2 106 106 shift 107 107 ;; 108 + --defconfig) 109 + checkarg --defconfig "defconfigtype" "$#" "$2" '^[^/][^/]*$' '^--' 110 + TORTURE_DEFCONFIG=$2 111 + shift 112 + ;; 108 113 --dryrun) 109 114 checkarg --dryrun "sched|script" $# "$2" 'sched\|script' '^--' 110 115 dryrun=$2 ··· 121 116 shift 122 117 ;; 123 118 --interactive) 124 - RCU_QEMU_INTERACTIVE=1; export RCU_QEMU_INTERACTIVE 119 + TORTURE_QEMU_INTERACTIVE=1; export TORTURE_QEMU_INTERACTIVE 125 120 ;; 126 121 --kmake-arg) 127 122 checkarg --kmake-arg "(kernel make arguments)" $# "$2" '.*' '^error$' 128 - RCU_KMAKE_ARG="$2"; export RCU_KMAKE_ARG 123 + TORTURE_KMAKE_ARG="$2" 129 124 shift 130 125 ;; 131 126 --kversion) ··· 135 130 ;; 136 131 --mac) 137 132 checkarg --mac "(MAC address)" $# "$2" '^\([0-9a-fA-F]\{2\}:\)\{5\}[0-9a-fA-F]\{2\}$' error 138 - RCU_QEMU_MAC=$2; export RCU_QEMU_MAC 133 + TORTURE_QEMU_MAC=$2 139 134 shift 140 135 ;; 141 136 --no-initrd) 142 - RCU_INITRD=""; export RCU_INITRD 137 + TORTURE_INITRD=""; export TORTURE_INITRD 143 138 ;; 144 139 --qemu-args) 145 140 checkarg --qemu-args "-qemu args" $# "$2" '^-' '^error' 146 - RCU_QEMU_ARG="$2" 141 + TORTURE_QEMU_ARG="$2" 147 142 shift 
148 143 ;; 149 144 --qemu-cmd) 150 145 checkarg --qemu-cmd "(qemu-system-...)" $# "$2" 'qemu-system-' '^--' 151 - RCU_QEMU_CMD="$2"; export RCU_QEMU_CMD 152 - shift 153 - ;; 154 - --relbuilddir) 155 - checkarg --relbuilddir "(relative pathname)" "$#" "$2" '^[^/]*$' '^--' 156 - relbuilddir=$2 157 - gotrelbuilddir=1 158 - builddir=${KVM}/${relbuilddir} 146 + TORTURE_QEMU_CMD="$2" 159 147 shift 160 148 ;; 161 149 --results) ··· 180 182 if test -z "$resdir" 181 183 then 182 184 resdir=$KVM/res 183 - fi 184 - 185 - if test "$dryrun" = "" 186 - then 187 - if ! test -e $resdir 188 - then 189 - mkdir -p "$resdir" || : 190 - fi 191 - mkdir $resdir/$ds 192 - 193 - # Be noisy only if running the script. 194 - echo Results directory: $resdir/$ds 195 - echo $scriptname $args 196 - 197 - touch $resdir/$ds/log 198 - echo $scriptname $args >> $resdir/$ds/log 199 - echo ${TORTURE_SUITE} > $resdir/$ds/TORTURE_SUITE 200 - 201 - pwd > $resdir/$ds/testid.txt 202 - if test -d .git 203 - then 204 - git status >> $resdir/$ds/testid.txt 205 - git rev-parse HEAD >> $resdir/$ds/testid.txt 206 - fi 207 185 fi 208 186 209 187 # Create a file of test-name/#cpus pairs, sorted by decreasing #cpus. ··· 248 274 249 275 # Generate a script to execute the tests in appropriate batches. 
250 276 cat << ___EOF___ > $T/script 277 + CONFIGFRAG="$CONFIGFRAG"; export CONFIGFRAG 278 + KVM="$KVM"; export KVM 279 + KVPATH="$KVPATH"; export KVPATH 280 + PATH="$PATH"; export PATH 281 + TORTURE_BOOT_IMAGE="$TORTURE_BOOT_IMAGE"; export TORTURE_BOOT_IMAGE 282 + TORTURE_BUILDONLY="$TORTURE_BUILDONLY"; export TORTURE_BUILDONLY 283 + TORTURE_DEFCONFIG="$TORTURE_DEFCONFIG"; export TORTURE_DEFCONFIG 284 + TORTURE_INITRD="$TORTURE_INITRD"; export TORTURE_INITRD 285 + TORTURE_KMAKE_ARG="$TORTURE_KMAKE_ARG"; export TORTURE_KMAKE_ARG 286 + TORTURE_QEMU_CMD="$TORTURE_QEMU_CMD"; export TORTURE_QEMU_CMD 287 + TORTURE_QEMU_INTERACTIVE="$TORTURE_QEMU_INTERACTIVE"; export TORTURE_QEMU_INTERACTIVE 288 + TORTURE_QEMU_MAC="$TORTURE_QEMU_MAC"; export TORTURE_QEMU_MAC 251 289 TORTURE_SUITE="$TORTURE_SUITE"; export TORTURE_SUITE 290 + if ! test -e $resdir 291 + then 292 + mkdir -p "$resdir" || : 293 + fi 294 + mkdir $resdir/$ds 295 + echo Results directory: $resdir/$ds 296 + echo $scriptname $args 297 + touch $resdir/$ds/log 298 + echo $scriptname $args >> $resdir/$ds/log 299 + echo ${TORTURE_SUITE} > $resdir/$ds/TORTURE_SUITE 300 + pwd > $resdir/$ds/testid.txt 301 + if test -d .git 302 + then 303 + git status >> $resdir/$ds/testid.txt 304 + git rev-parse HEAD >> $resdir/$ds/testid.txt 305 + if ! 
git diff HEAD > $T/git-diff 2>&1 306 + then 307 + cp $T/git-diff $resdir/$ds 308 + fi 309 + fi 252 310 ___EOF___ 253 311 awk < $T/cfgcpu.pack \ 254 312 -v CONFIGDIR="$CONFIGFRAG/$kversion/" \ ··· 288 282 -v ncpus=$cpus \ 289 283 -v rd=$resdir/$ds/ \ 290 284 -v dur=$dur \ 291 - -v RCU_QEMU_ARG=$RCU_QEMU_ARG \ 292 - -v RCU_BOOTARGS=$RCU_BOOTARGS \ 285 + -v TORTURE_QEMU_ARG="$TORTURE_QEMU_ARG" \ 286 + -v TORTURE_BOOTARGS="$TORTURE_BOOTARGS" \ 293 287 'BEGIN { 294 288 i = 0; 295 289 } ··· 326 320 print "touch " builddir ".wait"; 327 321 print "mkdir " builddir " > /dev/null 2>&1 || :"; 328 322 print "mkdir " rd cfr[jn] " || :"; 329 - print "kvm-test-1-run.sh " CONFIGDIR cf[j], builddir, rd cfr[jn], dur " \"" RCU_QEMU_ARG "\" \"" RCU_BOOTARGS "\" > " rd cfr[jn] "/kvm-test-1-run.sh.out 2>&1 &" 323 + print "kvm-test-1-run.sh " CONFIGDIR cf[j], builddir, rd cfr[jn], dur " \"" TORTURE_QEMU_ARG "\" \"" TORTURE_BOOTARGS "\" > " rd cfr[jn] "/kvm-test-1-run.sh.out 2>&1 &" 330 324 print "echo ", cfr[jn], cpusr[jn] ovf ": Waiting for build to complete. `date`"; 331 325 print "echo ", cfr[jn], cpusr[jn] ovf ": Waiting for build to complete. `date` >> " rd "/log"; 332 326 print "while test -f " builddir ".wait" ··· 380 374 dump(first, i); 381 375 }' >> $T/script 382 376 377 + cat << ___EOF___ >> $T/script 378 + echo 379 + echo 380 + echo " --- `date` Test summary:" 381 + echo Results directory: $resdir/$ds 382 + if test -z "$TORTURE_BUILDONLY" 383 + then 384 + kvm-recheck.sh $resdir/$ds 385 + fi 386 + ___EOF___ 387 + 383 388 if test "$dryrun" = script 384 389 then 385 - # Dump out the script, but define the environment variables that 386 - # it needs to run standalone. 
387 - echo CONFIGFRAG="$CONFIGFRAG; export CONFIGFRAG" 388 - echo KVM="$KVM; export KVM" 389 - echo KVPATH="$KVPATH; export KVPATH" 390 - echo PATH="$PATH; export PATH" 391 - echo RCU_BUILDONLY="$RCU_BUILDONLY; export RCU_BUILDONLY" 392 - echo RCU_INITRD="$RCU_INITRD; export RCU_INITRD" 393 - echo RCU_KMAKE_ARG="$RCU_KMAKE_ARG; export RCU_KMAKE_ARG" 394 - echo RCU_QEMU_CMD="$RCU_QEMU_CMD; export RCU_QEMU_CMD" 395 - echo RCU_QEMU_INTERACTIVE="$RCU_QEMU_INTERACTIVE; export RCU_QEMU_INTERACTIVE" 396 - echo RCU_QEMU_MAC="$RCU_QEMU_MAC; export RCU_QEMU_MAC" 397 - echo "mkdir -p "$resdir" || :" 398 - echo "mkdir $resdir/$ds" 399 390 cat $T/script 400 391 exit 0 401 392 elif test "$dryrun" = sched 402 393 then 403 394 # Extract the test run schedule from the script. 404 - egrep 'start batch|Starting build\.' $T/script | 395 + egrep 'Start batch|Starting build\.' $T/script | 396 + grep -v ">>" | 405 397 sed -e 's/:.*$//' -e 's/^echo //' 406 398 exit 0 407 399 else ··· 408 404 fi 409 405 410 406 # Tracing: trace_event=rcu:rcu_grace_period,rcu:rcu_future_grace_period,rcu:rcu_grace_period_init,rcu:rcu_nocb_wake,rcu:rcu_preempt_task,rcu:rcu_unlock_preempted_task,rcu:rcu_quiescent_state_report,rcu:rcu_fqs,rcu:rcu_callback,rcu:rcu_kfree_callback,rcu:rcu_batch_start,rcu:rcu_invoke_callback,rcu:rcu_invoke_kfree_callback,rcu:rcu_batch_end,rcu:rcu_torture_read,rcu:rcu_barrier 411 - 412 - echo 413 - echo 414 - echo " --- `date` Test summary:" 415 - echo Results directory: $resdir/$ds 416 - kvm-recheck.sh $resdir/$ds
+11 -11
tools/testing/selftests/rcutorture/bin/parse-rcutorture.sh → tools/testing/selftests/rcutorture/bin/parse-torture.sh
··· 1 1 #!/bin/sh 2 2 # 3 - # Check the console output from an rcutorture run for goodness. 3 + # Check the console output from a torture run for goodness. 4 4 # The "file" is a pathname on the local system, and "title" is 5 5 # a text string for error-message purposes. 6 6 # 7 - # The file must contain rcutorture output, but can be interspersed 8 - # with other dmesg text. 7 + # The file must contain torture output, but can be interspersed 8 + # with other dmesg text, as in console-log output. 9 9 # 10 10 # Usage: 11 - # sh parse-rcutorture.sh file title 11 + # sh parse-torture.sh file title 12 12 # 13 13 # This program is free software; you can redistribute it and/or modify 14 14 # it under the terms of the GNU General Public License as published by ··· 28 28 # 29 29 # Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com> 30 30 31 - T=/tmp/parse-rcutorture.sh.$$ 31 + T=/tmp/parse-torture.sh.$$ 32 32 file="$1" 33 33 title="$2" 34 34 ··· 36 36 37 37 . functions.sh 38 38 39 - # check for presence of rcutorture.txt file 39 + # check for presence of torture output file. 
40 40 41 41 if test -f "$file" -a -r "$file" 42 42 then 43 43 : 44 44 else 45 - echo $title unreadable rcutorture.txt file: $file 45 + echo $title unreadable torture output file: $file 46 46 exit 1 47 47 fi 48 48 ··· 76 76 END { 77 77 if (badseq) { 78 78 if (badseqno1 == badseqno2 && badseqno2 == ver) 79 - print "RCU GP HANG at " ver " rcutorture stat " badseqnr; 79 + print "GP HANG at " ver " torture stat " badseqnr; 80 80 else 81 - print "BAD SEQ " badseqno1 ":" badseqno2 " last:" ver " RCU version " badseqnr; 81 + print "BAD SEQ " badseqno1 ":" badseqno2 " last:" ver " version " badseqnr; 82 82 } 83 83 }' > $T.seq 84 84 ··· 91 91 exit 2 92 92 fi 93 93 else 94 - if grep -q RCU_HOTPLUG $file 94 + if grep -q "_HOTPLUG:" $file 95 95 then 96 96 print_warning HOTPLUG FAILURES $title `cat $T.seq` 97 97 echo " " $file 98 98 exit 3 99 99 fi 100 - echo $title no success message, `grep --binary-files=text 'ver:' $file | wc -l` successful RCU version messages 100 + echo $title no success message, `grep --binary-files=text 'ver:' $file | wc -l` successful version messages 101 101 if test -s $T.seq 102 102 then 103 103 print_warning $title `cat $T.seq`
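The renamed parse-torture.sh keys on suite-neutral markers ("ver:" progress lines, "_HOTPLUG:" failures) rather than rcutorture-specific ones, so locktorture logs can be scanned too. A rough sketch of that style of console scan against a fabricated log (the log lines and the SUCCESS marker here are illustrative stand-ins, not verbatim kernel output):

```shell
#!/bin/sh
# Sketch: a suite-neutral console-log scan in the style of
# parse-torture.sh. The log content below is fabricated.
T=${TMPDIR-/tmp}/parse-demo.$$
cat > $T << 'EOF'
[   12.0] torture: ver: 1 ...
[   24.0] torture: ver: 2 ...
[   36.0] torture: End of test: SUCCESS
EOF

# Count progress lines, then classify the run by marker strings.
nver=`grep -c 'ver:' $T`
if grep -q '_HOTPLUG:' $T
then
	summary="HOTPLUG FAILURES"
elif grep -q 'End of test: SUCCESS' $T
then
	summary="success, $nver version messages"
else
	summary="no success message, $nver version messages"
fi
echo "$summary"
rm -f $T
```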
+25
tools/testing/selftests/rcutorture/configs/rcu/TREE02-T
··· 1 + CONFIG_SMP=y 2 + CONFIG_NR_CPUS=8 3 + CONFIG_PREEMPT_NONE=n 4 + CONFIG_PREEMPT_VOLUNTARY=n 5 + CONFIG_PREEMPT=y 6 + #CHECK#CONFIG_TREE_PREEMPT_RCU=y 7 + CONFIG_HZ_PERIODIC=n 8 + CONFIG_NO_HZ_IDLE=y 9 + CONFIG_NO_HZ_FULL=n 10 + CONFIG_RCU_FAST_NO_HZ=n 11 + CONFIG_RCU_TRACE=y 12 + CONFIG_HOTPLUG_CPU=n 13 + CONFIG_SUSPEND=n 14 + CONFIG_HIBERNATION=n 15 + CONFIG_RCU_FANOUT=3 16 + CONFIG_RCU_FANOUT_LEAF=3 17 + CONFIG_RCU_FANOUT_EXACT=n 18 + CONFIG_RCU_NOCB_CPU=n 19 + CONFIG_DEBUG_LOCK_ALLOC=y 20 + CONFIG_PROVE_LOCKING=n 21 + CONFIG_PROVE_RCU_DELAY=n 22 + CONFIG_RCU_CPU_STALL_INFO=n 23 + CONFIG_RCU_CPU_STALL_VERBOSE=y 24 + CONFIG_RCU_BOOST=n 25 + CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
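In these Kconfig fragments, a `#CHECK#` prefix (as on CONFIG_TREE_PREEMPT_RCU above) marks an option the fragment cannot set directly — it is selected by other options — but that must still hold in the final .config. A rough sketch of such a checker, using a made-up two-line fragment and .config:

```shell
#!/bin/sh
# Sketch: verify a built kernel config against a torture config
# fragment. Plain lines must appear verbatim in the config; "#CHECK#"
# lines are verify-only. The fragment and config below are stand-ins.
T=${TMPDIR-/tmp}/configcheck-demo.$$
mkdir -p $T

cat > $T/fragment << 'EOF'
CONFIG_PREEMPT=y
#CHECK#CONFIG_TREE_PREEMPT_RCU=y
EOF

cat > $T/dot-config << 'EOF'
CONFIG_PREEMPT=y
CONFIG_TREE_PREEMPT_RCU=y
EOF

# Strip the #CHECK# prefix so both kinds of line are verified alike.
missing=$(sed -e 's/^#CHECK#//' $T/fragment | while read line
do
	grep -q "^$line\$" $T/dot-config || echo "MISSING: $line"
done)
echo "${missing:-all options verified}"
rm -rf $T
```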
+1
tools/testing/selftests/rcutorture/configs/rcu/TREE08.boot
··· 1 + rcutorture.torture_type=sched