Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

tools/memory-model: Document locking corner cases

Most Linux-kernel uses of locking are straightforward, but there are
corner-case uses that rely on less well-known aspects of the lock and
unlock primitives. This commit therefore adds a locking.txt and litmus
tests in Documentation/litmus-tests/locking to explain these corner-case
uses.

[ paulmck: Apply Andrea Parri feedback for klitmus7. ]
[ paulmck: Apply Akira Yokosawa example-consistency feedback. ]

Reviewed-by: Akira Yokosawa <akiyks@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

 5 files changed, 489 insertions(+)

Documentation/litmus-tests/locking/DCL-broken.litmus (+54)

C DCL-broken

(*
 * Result: Sometimes
 *
 * This litmus test demonstrates that more than just locking is required
 * to correctly implement double-checked locking.
 *)

{
	int flag;
	int data;
}

P0(int *flag, int *data, spinlock_t *lck)
{
	int r0;
	int r1;
	int r2;

	r0 = READ_ONCE(*flag);
	if (r0 == 0) {
		spin_lock(lck);
		r1 = READ_ONCE(*flag);
		if (r1 == 0) {
			WRITE_ONCE(*data, 1);
			WRITE_ONCE(*flag, 1);
		}
		spin_unlock(lck);
	}
	r2 = READ_ONCE(*data);
}

P1(int *flag, int *data, spinlock_t *lck)
{
	int r0;
	int r1;
	int r2;

	r0 = READ_ONCE(*flag);
	if (r0 == 0) {
		spin_lock(lck);
		r1 = READ_ONCE(*flag);
		if (r1 == 0) {
			WRITE_ONCE(*data, 1);
			WRITE_ONCE(*flag, 1);
		}
		spin_unlock(lck);
	}
	r2 = READ_ONCE(*data);
}

locations [flag;data;0:r0;0:r1;1:r0;1:r1]
exists (0:r2=0 \/ 1:r2=0)
Documentation/litmus-tests/locking/DCL-fixed.litmus (+55)

C DCL-fixed

(*
 * Result: Never
 *
 * This litmus test demonstrates that double-checked locking can be
 * reliable given proper use of smp_load_acquire() and smp_store_release()
 * in addition to the locking.
 *)

{
	int flag;
	int data;
}

P0(int *flag, int *data, spinlock_t *lck)
{
	int r0;
	int r1;
	int r2;

	r0 = smp_load_acquire(flag);
	if (r0 == 0) {
		spin_lock(lck);
		r1 = READ_ONCE(*flag);
		if (r1 == 0) {
			WRITE_ONCE(*data, 1);
			smp_store_release(flag, 1);
		}
		spin_unlock(lck);
	}
	r2 = READ_ONCE(*data);
}

P1(int *flag, int *data, spinlock_t *lck)
{
	int r0;
	int r1;
	int r2;

	r0 = smp_load_acquire(flag);
	if (r0 == 0) {
		spin_lock(lck);
		r1 = READ_ONCE(*flag);
		if (r1 == 0) {
			WRITE_ONCE(*data, 1);
			smp_store_release(flag, 1);
		}
		spin_unlock(lck);
	}
	r2 = READ_ONCE(*data);
}

locations [flag;data;0:r0;0:r1;1:r0;1:r1]
exists (0:r2=0 \/ 1:r2=0)
Documentation/litmus-tests/locking/RM-broken.litmus (+41)

C RM-broken

(*
 * Result: DEADLOCK
 *
 * This litmus test demonstrates that the old "roach motel" approach
 * to locking, where code can be freely moved into critical sections,
 * cannot be used in the Linux kernel.
 *)

{
	int x;
	atomic_t y;
}

P0(int *x, atomic_t *y, spinlock_t *lck)
{
	int r2;

	spin_lock(lck);
	r2 = atomic_inc_return(y);
	WRITE_ONCE(*x, 1);
	spin_unlock(lck);
}

P1(int *x, atomic_t *y, spinlock_t *lck)
{
	int r0;
	int r1;
	int r2;

	spin_lock(lck);
	r0 = READ_ONCE(*x);
	r1 = READ_ONCE(*x);
	r2 = atomic_inc_return(y);
	spin_unlock(lck);
}

locations [x;0:r2;1:r0;1:r1;1:r2]
filter (1:r0=0 /\ 1:r1=1)
exists (1:r2=1)
Documentation/litmus-tests/locking/RM-fixed.litmus (+41)

C RM-fixed

(*
 * Result: Never
 *
 * This litmus test demonstrates that the old "roach motel" approach
 * to locking, where code can be freely moved into critical sections,
 * cannot be used in the Linux kernel.
 *)

{
	int x;
	atomic_t y;
}

P0(int *x, atomic_t *y, spinlock_t *lck)
{
	int r2;

	spin_lock(lck);
	r2 = atomic_inc_return(y);
	WRITE_ONCE(*x, 1);
	spin_unlock(lck);
}

P1(int *x, atomic_t *y, spinlock_t *lck)
{
	int r0;
	int r1;
	int r2;

	r0 = READ_ONCE(*x);
	r1 = READ_ONCE(*x);
	spin_lock(lck);
	r2 = atomic_inc_return(y);
	spin_unlock(lck);
}

locations [x;0:r2;1:r0;1:r1;1:r2]
filter (1:r0=0 /\ 1:r1=1)
exists (1:r2=1)
tools/memory-model/Documentation/locking.txt (+298)

Locking
=======

Locking is well-known and the common use cases are straightforward: Any
CPU holding a given lock sees any changes previously seen or made by any
CPU before it previously released that same lock. This last sentence
is the only part of this document that most developers will need to read.

However, developers who would like to also access lock-protected shared
variables outside of their corresponding locks should continue reading.


Locking and Prior Accesses
--------------------------

The basic rule of locking is worth repeating:

	Any CPU holding a given lock sees any changes previously seen
	or made by any CPU before it previously released that same lock.

Note that this statement is a bit stronger than "Any CPU holding a
given lock sees all changes made by any CPU during the time that CPU was
previously holding this same lock". For example, consider the following
pair of code fragments:

	/* See MP+polocks.litmus. */
	void CPU0(void)
	{
		WRITE_ONCE(x, 1);
		spin_lock(&mylock);
		WRITE_ONCE(y, 1);
		spin_unlock(&mylock);
	}

	void CPU1(void)
	{
		spin_lock(&mylock);
		r0 = READ_ONCE(y);
		spin_unlock(&mylock);
		r1 = READ_ONCE(x);
	}

The basic rule guarantees that if CPU0() acquires mylock before CPU1(),
then both r0 and r1 must be set to the value 1. This also has the
consequence that if the final value of r0 is equal to 1, then the final
value of r1 must also be equal to 1. In contrast, the weaker rule would
say nothing about the final value of r1.


Locking and Subsequent Accesses
-------------------------------

The converse to the basic rule also holds: Any CPU holding a given
lock will not see any changes that will be made by any CPU after it
subsequently acquires this same lock. This converse statement is
This converse statement is 56 + illustrated by the following litmus test: 57 + 58 + /* See MP+porevlocks.litmus. */ 59 + void CPU0(void) 60 + { 61 + r0 = READ_ONCE(y); 62 + spin_lock(&mylock); 63 + r1 = READ_ONCE(x); 64 + spin_unlock(&mylock); 65 + } 66 + 67 + void CPU1(void) 68 + { 69 + spin_lock(&mylock); 70 + WRITE_ONCE(x, 1); 71 + spin_unlock(&mylock); 72 + WRITE_ONCE(y, 1); 73 + } 74 + 75 + This converse to the basic rule guarantees that if CPU0() acquires 76 + mylock before CPU1(), then both r0 and r1 must be set to the value 0. 77 + This also has the consequence that if the final value of r1 is equal 78 + to 0, then the final value of r0 must also be equal to 0. In contrast, 79 + the weaker rule would say nothing about the final value of r0. 80 + 81 + These examples show only a single pair of CPUs, but the effects of the 82 + locking basic rule extend across multiple acquisitions of a given lock 83 + across multiple CPUs. 84 + 85 + 86 + Double-Checked Locking 87 + ---------------------- 88 + 89 + It is well known that more than just a lock is required to make 90 + double-checked locking work correctly, This litmus test illustrates 91 + one incorrect approach: 92 + 93 + /* See Documentation/litmus-tests/locking/DCL-broken.litmus. */ 94 + void CPU0(void) 95 + { 96 + r0 = READ_ONCE(flag); 97 + if (r0 == 0) { 98 + spin_lock(&lck); 99 + r1 = READ_ONCE(flag); 100 + if (r1 == 0) { 101 + WRITE_ONCE(data, 1); 102 + WRITE_ONCE(flag, 1); 103 + } 104 + spin_unlock(&lck); 105 + } 106 + r2 = READ_ONCE(data); 107 + } 108 + /* CPU1() is the exactly the same as CPU0(). */ 109 + 110 + There are two problems. First, there is no ordering between the first 111 + READ_ONCE() of "flag" and the READ_ONCE() of "data". Second, there is 112 + no ordering between the two WRITE_ONCE() calls. It should therefore be 113 + no surprise that "r2" can be zero, and a quick herd7 run confirms this. 

One way to fix this is to use smp_load_acquire() and smp_store_release()
as shown in this corrected version:

	/* See Documentation/litmus-tests/locking/DCL-fixed.litmus. */
	void CPU0(void)
	{
		r0 = smp_load_acquire(&flag);
		if (r0 == 0) {
			spin_lock(&lck);
			r1 = READ_ONCE(flag);
			if (r1 == 0) {
				WRITE_ONCE(data, 1);
				smp_store_release(&flag, 1);
			}
			spin_unlock(&lck);
		}
		r2 = READ_ONCE(data);
	}
	/* CPU1() is exactly the same as CPU0(). */

The smp_load_acquire() guarantees that its load from "flag" will
be ordered before the READ_ONCE() from "data", thus solving the first
problem. The smp_store_release() guarantees that its store will be
ordered after the WRITE_ONCE() to "data", solving the second problem.
The smp_store_release() pairs with the smp_load_acquire(), thus ensuring
that the ordering provided by each actually takes effect. Again, a
quick herd7 run confirms this.

In short, if you access a lock-protected variable without holding the
corresponding lock, you will need to provide additional ordering, in
this case, via the smp_load_acquire() and the smp_store_release().


Ordering Provided by a Lock to CPUs Not Holding That Lock
---------------------------------------------------------

It is not necessarily the case that accesses ordered by locking will be
seen as ordered by CPUs not holding that lock. Consider this example:

	/* See Z6.0+pooncelock+pooncelock+pombonce.litmus.
	 */
	void CPU0(void)
	{
		spin_lock(&mylock);
		WRITE_ONCE(x, 1);
		WRITE_ONCE(y, 1);
		spin_unlock(&mylock);
	}

	void CPU1(void)
	{
		spin_lock(&mylock);
		r0 = READ_ONCE(y);
		WRITE_ONCE(z, 1);
		spin_unlock(&mylock);
	}

	void CPU2(void)
	{
		WRITE_ONCE(z, 2);
		smp_mb();
		r1 = READ_ONCE(x);
	}

Counter-intuitive though it might be, it is quite possible to have
the final value of r0 be 1, the final value of z be 2, and the final
value of r1 be 0. The reason for this surprising outcome is that CPU2()
never acquired the lock, and thus did not fully benefit from the lock's
ordering properties.

Ordering can be extended to CPUs not holding the lock by careful use
of smp_mb__after_spinlock():

	/* See Z6.0+pooncelock+poonceLock+pombonce.litmus. */
	void CPU0(void)
	{
		spin_lock(&mylock);
		WRITE_ONCE(x, 1);
		WRITE_ONCE(y, 1);
		spin_unlock(&mylock);
	}

	void CPU1(void)
	{
		spin_lock(&mylock);
		smp_mb__after_spinlock();
		r0 = READ_ONCE(y);
		WRITE_ONCE(z, 1);
		spin_unlock(&mylock);
	}

	void CPU2(void)
	{
		WRITE_ONCE(z, 2);
		smp_mb();
		r1 = READ_ONCE(x);
	}

This addition of smp_mb__after_spinlock() strengthens the lock
acquisition sufficiently to rule out the counter-intuitive outcome.
In other words, the addition of the smp_mb__after_spinlock() prohibits
the counter-intuitive result where the final value of r0 is 1, the final
value of z is 2, and the final value of r1 is 0.


No Roach-Motel Locking!
-----------------------

This example requires familiarity with the herd7 "filter" clause, so
please read up on that topic in litmus-tests.txt.

It is tempting to allow memory-reference instructions to be pulled
into a critical section, but this cannot be allowed in the general case.
For example, consider a spin loop preceding a lock-based critical section.
Now, herd7 does not model spin loops, but we can emulate one with two
loads, with a "filter" clause to constrain the first to return the
initial value and the second to return the updated value, as shown below:

	/* See Documentation/litmus-tests/locking/RM-fixed.litmus. */
	void CPU0(void)
	{
		spin_lock(&lck);
		r2 = atomic_inc_return(&y);
		WRITE_ONCE(x, 1);
		spin_unlock(&lck);
	}

	void CPU1(void)
	{
		r0 = READ_ONCE(x);
		r1 = READ_ONCE(x);
		spin_lock(&lck);
		r2 = atomic_inc_return(&y);
		spin_unlock(&lck);
	}

	filter (1:r0=0 /\ 1:r1=1)
	exists (1:r2=1)

The variable "x" is the control variable for the emulated spin loop.
CPU0() sets it to "1" while holding the lock, and CPU1() emulates the
spin loop by reading it twice, first into "1:r0" (which should get the
initial value "0") and then into "1:r1" (which should get the updated
value "1").

The "filter" clause takes this into account, constraining "1:r0" to
equal "0" and "1:r1" to equal "1".

Then the "exists" clause checks to see if CPU1() acquired its lock first,
which should not happen given the filter clause because CPU0() updates
"x" while holding the lock. And herd7 confirms this.

But suppose that the compiler was permitted to reorder the spin loop
into CPU1()'s critical section, like this:

	/* See Documentation/litmus-tests/locking/RM-broken.litmus.
	 */
	void CPU0(void)
	{
		int r2;

		spin_lock(&lck);
		r2 = atomic_inc_return(&y);
		WRITE_ONCE(x, 1);
		spin_unlock(&lck);
	}

	void CPU1(void)
	{
		spin_lock(&lck);
		r0 = READ_ONCE(x);
		r1 = READ_ONCE(x);
		r2 = atomic_inc_return(&y);
		spin_unlock(&lck);
	}

	filter (1:r0=0 /\ 1:r1=1)
	exists (1:r2=1)

If "1:r0" is equal to "0", "1:r1" can never equal "1" because CPU0()
cannot update "x" while CPU1() holds the lock. And herd7 confirms this,
showing zero executions matching the "filter" criteria.

And this is why Linux-kernel lock and unlock primitives must prevent
code from entering critical sections. It is not sufficient to only
prevent code from leaving them.