=======================================================
Semantics and Behavior of Atomic and Bitmask Operations
=======================================================

:Author: David S. Miller

This document is intended to serve as a guide to Linux port
maintainers on how to implement atomic counter, bitops, and spinlock
interfaces properly.

Atomic Type And Operations
==========================

The atomic_t type should be defined as a signed integer and
the atomic_long_t type as a signed long integer. Also, they should
be made opaque such that any kind of cast to a normal C integer type
will fail. Something like the following should suffice::

	typedef struct { int counter; } atomic_t;
	typedef struct { long counter; } atomic_long_t;

Historically, counter has been declared volatile. This is now discouraged.
See :ref:`Documentation/process/volatile-considered-harmful.rst
<volatile_considered_harmful>` for the complete rationale.

local_t is very similar to atomic_t. If the counter is per CPU and only
updated by one CPU, local_t is probably more appropriate. Please see
:ref:`Documentation/core-api/local_ops.rst <local_ops>` for the semantics of
local_t.

The first operations to implement for atomic_t's are the initializers and
plain reads.
::

	#define ATOMIC_INIT(i)		{ (i) }
	#define atomic_set(v, i)	((v)->counter = (i))

The first macro is used in definitions, such as::

	static atomic_t my_counter = ATOMIC_INIT(1);

The initializer is atomic in that the return values of the atomic operations
are guaranteed to be correct reflecting the initialized value if the
initializer is used before runtime.  If the initializer is used at runtime, a
proper implicit or explicit read memory barrier is needed before reading the
value with atomic_read from another thread.

As with all of the ``atomic_`` interfaces, replace the leading ``atomic_``
with ``atomic_long_`` to operate on atomic_long_t.

The second interface can be used at runtime, as in::

	struct foo { atomic_t counter; };
	...

	struct foo *k;

	k = kmalloc(sizeof(*k), GFP_KERNEL);
	if (!k)
		return -ENOMEM;
	atomic_set(&k->counter, 0);

The setting is atomic in that the return values of the atomic operations by
all threads are guaranteed to be correct reflecting either the value that has
been set with this operation or some other value if it has been set with some
other operation.  A proper implicit or explicit memory barrier is needed
before the value set with the operation is guaranteed to be readable with
atomic_read from another thread.

Next, we have::

	#define atomic_read(v)	((v)->counter)

which simply reads the counter value currently visible to the calling thread.
The read is atomic in that the return value is guaranteed to be one of the
values initialized or modified with the interface operations if a proper
implicit or explicit memory barrier is used after possible runtime
initialization by any other thread and the value is modified only with the
interface operations.  atomic_read does not guarantee that the runtime
initialization by any other thread is visible yet, so the user of the
interface must take care of that with a proper implicit or explicit memory
barrier.

.. warning::
	``atomic_read()`` and ``atomic_set()`` DO NOT IMPLY BARRIERS!

	Some architectures may choose to use the volatile keyword, barriers, or
	inline assembly to guarantee some degree of immediacy for atomic_read()
	and atomic_set().  This is not uniformly guaranteed, and may change in
	the future, so all users of atomic_t should treat atomic_read() and
	atomic_set() as simple C statements that may be reordered or optimized
	away entirely by the compiler or processor, and explicitly invoke the
	appropriate compiler and/or memory barrier for each use case.  Failure
	to do so will result in code that may suddenly break when used with
	different architectures or compiler optimizations, or even with changes
	in unrelated code that alter how the compiler optimizes the section
	accessing atomic_t variables.

Properly aligned pointers, longs, ints, and chars (and unsigned
equivalents) may be atomically loaded from and stored to in the same
way.  The READ_ONCE() and WRITE_ONCE() macros should be used to prevent
the compiler from using optimizations that might otherwise optimize
accesses out of existence on the one hand, or that might create
unsolicited accesses on the other.

For example consider the following code::

	while (a > 0)
		do_something();

If the compiler can prove that do_something() does not store to the
variable a, then the compiler is within its rights transforming this to
the following::

	tmp = a;
	if (tmp > 0)
		for (;;)
			do_something();

If you don't want the compiler to do this (and you probably don't), then
you
should use something like the following::

	while (READ_ONCE(a) > 0)
		do_something();

Alternatively, you could place a barrier() call in the loop.

For another example, consider the following code::

	tmp_a = a;
	do_something_with(tmp_a);
	do_something_else_with(tmp_a);

If the compiler can prove that do_something_with() does not store to the
variable a, then the compiler is within its rights to manufacture an
additional load as follows::

	tmp_a = a;
	do_something_with(tmp_a);
	tmp_a = a;
	do_something_else_with(tmp_a);

This could fatally confuse your code if it expected the same value
to be passed to do_something_with() and do_something_else_with().

The compiler would be likely to manufacture this additional load if
do_something_with() was an inline function that made very heavy use
of registers: reloading from variable a could save a flush to the
stack and later reload.  To prevent the compiler from attacking your
code in this manner, write the following::

	tmp_a = READ_ONCE(a);
	do_something_with(tmp_a);
	do_something_else_with(tmp_a);

For a final example, consider the following code, assuming that the
variable a is set at boot time before the second CPU is brought online
and never changed later, so that memory barriers are not needed::

	if (a)
		b = 9;
	else
		b = 42;

The compiler is within its rights to manufacture an additional store
by transforming the above code into the following::

	b = 42;
	if (a)
		b = 9;

This could come as a fatal surprise to other code running concurrently
that expected b to never have the value 42 if a was zero.
To prevent
the compiler from doing this, write something like::

	if (a)
		WRITE_ONCE(b, 9);
	else
		WRITE_ONCE(b, 42);

Don't even -think- about doing this without proper use of memory barriers,
locks, or atomic operations if variable a can change at runtime!

.. warning::

	``READ_ONCE()`` OR ``WRITE_ONCE()`` DO NOT IMPLY A BARRIER!

Now, we move onto the atomic operation interfaces typically implemented with
the help of assembly code. ::

	void atomic_add(int i, atomic_t *v);
	void atomic_sub(int i, atomic_t *v);
	void atomic_inc(atomic_t *v);
	void atomic_dec(atomic_t *v);

These four routines add and subtract integral values to/from the given
atomic_t value.  The first two routines pass explicit integers by
which to make the adjustment, whereas the latter two use an implicit
adjustment value of "1".

One very important aspect of these routines is that they DO NOT
require any explicit memory barriers.  They need only perform the
atomic_t counter update in an SMP safe manner.

Next, we have::

	int atomic_inc_return(atomic_t *v);
	int atomic_dec_return(atomic_t *v);

These routines add 1 and subtract 1, respectively, from the given
atomic_t and return the new counter value after the operation is
performed.

Unlike the above routines, it is required that these primitives
include explicit memory barriers that are performed before and after
the operation.  It must be done such that all memory operations before
and after the atomic operation calls are strongly ordered with respect
to the atomic operation itself.

For example, it should behave as if a smp_mb() call existed both
before and after the atomic operation.

If the atomic instructions used in an implementation provide explicit
memory barrier semantics which satisfy the above requirements, that is
fine as well.

Let's move on::

	int atomic_add_return(int i, atomic_t *v);
	int atomic_sub_return(int i, atomic_t *v);

These behave just like atomic_{inc,dec}_return() except that an
explicit counter adjustment is given instead of the implicit "1".
This means that like atomic_{inc,dec}_return(), the memory barrier
semantics are required.

Next::

	int atomic_inc_and_test(atomic_t *v);
	int atomic_dec_and_test(atomic_t *v);

These two routines increment and decrement by 1, respectively, the
given atomic counter.  They return a boolean indicating whether the
resulting counter value was zero or not.

Again, these primitives provide explicit memory barrier semantics around
the atomic operation::

	int atomic_sub_and_test(int i, atomic_t *v);

This is identical to atomic_dec_and_test() except that an explicit
decrement is given instead of the implicit "1".
This primitive must
provide explicit memory barrier semantics around the operation::

	int atomic_add_negative(int i, atomic_t *v);

The given increment is added to the given atomic counter value.  A
boolean is returned which indicates whether the resulting counter
value is negative.  This primitive must provide explicit memory
barrier semantics around the operation.

Then::

	int atomic_xchg(atomic_t *v, int new);

This performs an atomic exchange operation on the atomic variable v, setting
the given new value.  It returns the old value that the atomic variable v had
just before the operation.

atomic_xchg must provide explicit memory barriers around the operation. ::

	int atomic_cmpxchg(atomic_t *v, int old, int new);

This performs an atomic compare exchange operation on the atomic value v,
with the given old and new values. Like all atomic_xxx operations,
atomic_cmpxchg will only satisfy its atomicity semantics as long as all
other accesses of \*v are performed through atomic_xxx operations.

atomic_cmpxchg must provide explicit memory barriers around the operation,
although if the comparison fails then no memory ordering guarantees are
required.

The semantics for atomic_cmpxchg are the same as those defined for 'cas'
below.

Finally::

	int atomic_add_unless(atomic_t *v, int a, int u);

If the atomic value v is not equal to u, this function adds a to v, and
returns non-zero. If v is equal to u then it returns zero. This does some
additional work to provide explicit memory barrier semantics around the
operation unless v was equal to u.

atomic_inc_not_zero, equivalent to atomic_add_unless(v, 1, 0)

If a caller requires memory barrier semantics around an atomic_t
operation which does not return a value, a set of interfaces are
defined which accomplish this::

	void smp_mb__before_atomic(void);
	void smp_mb__after_atomic(void);

For example, smp_mb__before_atomic() can be used like so::

	obj->dead = 1;
	smp_mb__before_atomic();
	atomic_dec(&obj->ref_count);

It makes sure that all memory operations preceding the atomic_dec()
call are strongly ordered with respect to the atomic counter
operation.  In the above example, it guarantees that the assignment of
"1" to obj->dead will be globally visible to other cpus before the
atomic counter decrement.

Without the explicit smp_mb__before_atomic() call, the
implementation could legally allow the atomic counter update to become
visible to other cpus before the "obj->dead = 1;" assignment.

A missing memory barrier in the cases where they are required by the
atomic_t implementation above can have disastrous results.  Here is
an example, which follows a pattern occurring frequently in the Linux
kernel.  It is the use of atomic counters to implement reference
counting, and it works such that once the counter falls to zero it can
be guaranteed that no other entity can be accessing the object::

	static void obj_list_add(struct obj *obj, struct list_head *head)
	{
		obj->active = 1;
		list_add(&obj->list, head);
	}

	static void obj_list_del(struct obj *obj)
	{
		list_del(&obj->list);
		obj->active = 0;
	}

	static void obj_destroy(struct obj *obj)
	{
		BUG_ON(obj->active);
		kfree(obj);
	}

	struct obj *obj_list_peek(struct list_head *head)
	{
		if (!list_empty(head)) {
			struct obj *obj;

			obj = list_entry(head->next, struct obj, list);
			atomic_inc(&obj->refcnt);
			return obj;
		}
		return NULL;
	}

	void obj_poke(void)
	{
		struct obj *obj;

		spin_lock(&global_list_lock);
		obj = obj_list_peek(&global_list);
		spin_unlock(&global_list_lock);

		if (obj) {
			obj->ops->poke(obj);
			if (atomic_dec_and_test(&obj->refcnt))
				obj_destroy(obj);
		}
	}
	void obj_timeout(struct obj *obj)
	{
		spin_lock(&global_list_lock);
		obj_list_del(obj);
		spin_unlock(&global_list_lock);

		if (atomic_dec_and_test(&obj->refcnt))
			obj_destroy(obj);
	}

.. note::

	This is a simplification of the ARP queue management in the generic
	neighbour discovery code of the networking stack. Olaf Kirch found a
	bug with respect to memory barriers in kfree_skb() that exposed the
	atomic_t memory barrier requirements quite clearly.

Given the above scheme, it must be the case that the obj->active
update done by the obj list deletion be visible to other processors
before the atomic counter decrement is performed.

Otherwise, the counter could fall to zero, yet obj->active would still
be set, thus triggering the assertion in obj_destroy().  The error
sequence looks like this::

	cpu 0				cpu 1
	obj_poke()			obj_timeout()
	obj = obj_list_peek();
	... gains ref to obj, refcnt=2
					obj_list_del(obj);
					obj->active = 0 ...
					... visibility delayed ...
					atomic_dec_and_test()
					... refcnt drops to 1 ...
	atomic_dec_and_test()
	... refcount drops to 0 ...
	obj_destroy()
	BUG() triggers since obj->active
	still seen as one
					obj->active update visibility occurs

With the memory barrier semantics required of the atomic_t operations
which return values, the above sequence of memory visibility can never
happen.  Specifically, in the above case the atomic_dec_and_test()
counter decrement would not become globally visible until the
obj->active update does.

As a historical note, 32-bit Sparc used to only allow usage of
24-bits of its atomic_t type.  This was because it used 8 bits
as a spinlock for SMP safety.  Sparc32 lacked a "compare and swap"
type instruction.  However, 32-bit Sparc has since been moved over
to a "hash table of spinlocks" scheme, which allows the full 32-bit
counter to be realized.  Essentially, an array of spinlocks is
indexed into based upon the address of the atomic_t being operated
on, and that lock protects the atomic operation.  Parisc uses the
same scheme.

Another note is that the atomic_t operations returning values are
extremely slow on an old 386.


Atomic Bitmask
==============

We will now cover the atomic bitmask operations.
You will find that
their SMP and memory barrier semantics are similar in shape and scope
to the atomic_t ops above.

Native atomic bit operations are defined to operate on objects aligned
to the size of an "unsigned long" C data type, and are at least of that
size.  The endianness of the bits within each "unsigned long" is the
native endianness of the cpu. ::

	void set_bit(unsigned long nr, volatile unsigned long *addr);
	void clear_bit(unsigned long nr, volatile unsigned long *addr);
	void change_bit(unsigned long nr, volatile unsigned long *addr);

These routines set, clear, and change, respectively, the bit number
indicated by "nr" on the bit mask pointed to by "addr".

They must execute atomically, yet there are no implicit memory barrier
semantics required of these interfaces. ::

	int test_and_set_bit(unsigned long nr, volatile unsigned long *addr);
	int test_and_clear_bit(unsigned long nr, volatile unsigned long *addr);
	int test_and_change_bit(unsigned long nr, volatile unsigned long *addr);

Like the above, except that these routines return a boolean which
indicates whether the changed bit was set _BEFORE_ the atomic bit
operation.

WARNING! It is incredibly important that the value be a boolean,
ie. "0" or "1".  Do not try to be fancy and save a few instructions by
declaring the above to return "long" and just returning something like
"old_val & mask" because that will not work.

These routines, like the atomic_t counter operations returning values,
must provide explicit memory barrier semantics around their execution.
All memory operations before the atomic bit operation call must be
made visible globally before the atomic bit operation is made visible.
Likewise, the atomic bit operation must be visible globally before any
subsequent memory operation is made visible.
For example::

	obj->dead = 1;
	if (test_and_set_bit(0, &obj->flags))
		/* ... */;
	obj->killed = 1;

The implementation of test_and_set_bit() must guarantee that
"obj->dead = 1;" is visible to cpus before the atomic memory operation
done by test_and_set_bit() becomes visible.  Likewise, the atomic
memory operation done by test_and_set_bit() must become visible before
"obj->killed = 1;" is visible.

Finally there is the basic operation::

	int test_bit(unsigned long nr, __const__ volatile unsigned long *addr);

Which returns a boolean indicating if bit "nr" is set in the bitmask
pointed to by "addr".

If explicit memory barriers are required around {set,clear}_bit() (which do
not return a value, and thus do not need to provide memory barrier
semantics), two interfaces are provided::

	void smp_mb__before_atomic(void);
	void smp_mb__after_atomic(void);

They are used as follows, and are akin to their atomic_t operation
brothers::

	/* All memory operations before this call will
	 * be globally visible before the clear_bit().
	 */
	smp_mb__before_atomic();
	clear_bit( ... );

	/* The clear_bit() will be visible before all
	 * subsequent memory operations.
	 */
	smp_mb__after_atomic();

There are two special bitops with lock barrier semantics (acquire/release,
same as spinlocks). These operate in the same way as their non-_lock/unlock
postfixed variants, except that they are to provide acquire/release semantics,
respectively. This means they can be used for bit_spin_trylock and
bit_spin_unlock type operations without specifying any more barriers. ::

	int test_and_set_bit_lock(unsigned long nr, unsigned long *addr);
	void clear_bit_unlock(unsigned long nr, unsigned long *addr);
	void __clear_bit_unlock(unsigned long nr, unsigned long *addr);

The __clear_bit_unlock version is non-atomic, however it still implements
unlock barrier semantics. This can be useful if the lock itself is protecting
the other bits in the word.

Finally, there are non-atomic versions of the bitmask operations
provided.  They are used in contexts where some other higher-level SMP
locking scheme is being used to protect the bitmask, and thus less
expensive non-atomic operations may be used in the implementation.
They have names similar to the above bitmask operation interfaces,
except that two underscores are prefixed to the interface name.
::

	void __set_bit(unsigned long nr, volatile unsigned long *addr);
	void __clear_bit(unsigned long nr, volatile unsigned long *addr);
	void __change_bit(unsigned long nr, volatile unsigned long *addr);
	int __test_and_set_bit(unsigned long nr, volatile unsigned long *addr);
	int __test_and_clear_bit(unsigned long nr, volatile unsigned long *addr);
	int __test_and_change_bit(unsigned long nr, volatile unsigned long *addr);

These non-atomic variants also do not require any special memory
barrier semantics.

The routines xchg() and cmpxchg() must provide the same exact
memory-barrier semantics as the atomic and bit operations returning
values.

.. note::

	If someone wants to use xchg(), cmpxchg() and their variants,
	linux/atomic.h should be included rather than asm/cmpxchg.h, unless the
	code is in arch/* and can take care of itself.

Spinlocks and rwlocks have memory barrier expectations as well.
The rule to follow is simple:

1) When acquiring a lock, the implementation must make it
   globally visible before any subsequent memory operation.

2) When releasing a lock, the implementation must make it such that
   all previous memory operations are globally visible before the
   lock release.

Which finally brings us to _atomic_dec_and_lock().  There is an
architecture-neutral version implemented in lib/dec_and_lock.c,
but most platforms will wish to optimize this in assembler.
::

	int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock);

Atomically decrement the given counter, and if it will drop to zero
atomically acquire the given spinlock and perform the decrement
of the counter to zero.  If it does not drop to zero, do nothing
with the spinlock.

It is actually slightly more complex than that: per the lock
acquisition rule above, the spinlock acquisition must also be made
globally visible before any subsequent memory operation.

We can demonstrate this operation more clearly if we define
an abstract atomic operation::

	long cas(long *mem, long old, long new);

"cas" stands for "compare and swap".  It operates as follows:

1) If the current value at "mem" is equal to "old", then "new" is
   stored at "mem".

2) The comparison and the (possible) store are performed as a single
   atomic unit; no other access to "mem" can intervene between them.

3) Regardless, the current value at "mem" is returned.

As an example usage, here is what an atomic counter update
might look like::

	void example_atomic_inc(long *counter)
	{
		long old, new, ret;

		while (1) {
			old = *counter;
			new = old + 1;

			ret = cas(counter, old, new);
			if (ret == old)
				break;
		}
	}

Let's use cas() in order to build a pseudo-C atomic_dec_and_lock()::

	int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock)
	{
		long old, new, ret;
		int went_to_zero;

		went_to_zero = 0;
		while (1) {
			old = atomic_read(atomic);
			new = old - 1;
			if (new == 0) {
				went_to_zero = 1;
				spin_lock(lock);
			}
			ret = cas(atomic, old, new);
			if (ret == old)
				break;
			if (went_to_zero) {
				spin_unlock(lock);
				went_to_zero = 0;
			}
		}

		return went_to_zero;
	}

Now, as far as memory barriers go, as long as spin_lock()
strictly orders all subsequent memory operations (including
the cas()) with respect to itself, things will be fine.

Said another way, _atomic_dec_and_lock() must guarantee that
a counter dropping to zero is never made visible before the
spinlock being acquired.

.. note::

	Note that this also means that for the case where the counter is not
	dropping to zero, there are no memory ordering requirements.