Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

Documentation/assoc_array.txt: convert to ReST markup

... and move to Documentation/core-api folder.

Signed-off-by: Silvio Fricke <silvio.fricke@gmail.com>
Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

authored by

Silvio Fricke and committed by
Jonathan Corbet
c3cbf1a7 388f9b20

+552 -574
-574
Documentation/assoc_array.txt
··· 1 - ======================================== 2 - GENERIC ASSOCIATIVE ARRAY IMPLEMENTATION 3 - ======================================== 4 - 5 - Contents: 6 - 7 - - Overview. 8 - 9 - - The public API. 10 - - Edit script. 11 - - Operations table. 12 - - Manipulation functions. 13 - - Access functions. 14 - - Index key form. 15 - 16 - - Internal workings. 17 - - Basic internal tree layout. 18 - - Shortcuts. 19 - - Splitting and collapsing nodes. 20 - - Non-recursive iteration. 21 - - Simultaneous alteration and iteration. 22 - 23 - 24 - ======== 25 - OVERVIEW 26 - ======== 27 - 28 - This associative array implementation is an object container with the following 29 - properties: 30 - 31 - (1) Objects are opaque pointers. The implementation does not care where they 32 - point (if anywhere) or what they point to (if anything). 33 - 34 - [!] NOTE: Pointers to objects _must_ be zero in the least significant bit. 35 - 36 - (2) Objects do not need to contain linkage blocks for use by the array. This 37 - permits an object to be located in multiple arrays simultaneously. 38 - Rather, the array is made up of metadata blocks that point to objects. 39 - 40 - (3) Objects require index keys to locate them within the array. 41 - 42 - (4) Index keys must be unique. Inserting an object with the same key as one 43 - already in the array will replace the old object. 44 - 45 - (5) Index keys can be of any length and can be of different lengths. 46 - 47 - (6) Index keys should encode the length early on, before any variation due to 48 - length is seen. 49 - 50 - (7) Index keys can include a hash to scatter objects throughout the array. 51 - 52 - (8) The array can iterated over. The objects will not necessarily come out in 53 - key order. 54 - 55 - (9) The array can be iterated over whilst it is being modified, provided the 56 - RCU readlock is being held by the iterator. Note, however, under these 57 - circumstances, some objects may be seen more than once. If this is a 58 - problem, the iterator should lock against modification. Objects will not 59 - be missed, however, unless deleted. 60 - 61 - (10) Objects in the array can be looked up by means of their index key. 62 - 63 - (11) Objects can be looked up whilst the array is being modified, provided the 64 - RCU readlock is being held by the thread doing the look up. 65 - 66 - The implementation uses a tree of 16-pointer nodes internally that are indexed 67 - on each level by nibbles from the index key in the same manner as in a radix 68 - tree. To improve memory efficiency, shortcuts can be emplaced to skip over 69 - what would otherwise be a series of single-occupancy nodes. Further, nodes 70 - pack leaf object pointers into spare space in the node rather than making an 71 - extra branch until as such time an object needs to be added to a full node. 72 - 73 - 74 - ============== 75 - THE PUBLIC API 76 - ============== 77 - 78 - The public API can be found in <linux/assoc_array.h>. The associative array is 79 - rooted on the following structure: 80 - 81 - struct assoc_array { 82 - ... 83 - }; 84 - 85 - The code is selected by enabling CONFIG_ASSOCIATIVE_ARRAY. 86 - 87 - 88 - EDIT SCRIPT 89 - ----------- 90 - 91 - The insertion and deletion functions produce an 'edit script' that can later be 92 - applied to effect the changes without risking ENOMEM. This retains the 93 - preallocated metadata blocks that will be installed in the internal tree and 94 - keeps track of the metadata blocks that will be removed from the tree when the 95 - script is applied. 96 - 97 - This is also used to keep track of dead blocks and dead objects after the 98 - script has been applied so that they can be freed later. The freeing is done 99 - after an RCU grace period has passed - thus allowing access functions to 100 - proceed under the RCU read lock. 101 - 102 - The script appears as outside of the API as a pointer of the type: 103 - 104 - struct assoc_array_edit; 105 - 106 - There are two functions for dealing with the script: 107 - 108 - (1) Apply an edit script. 109 - 110 - void assoc_array_apply_edit(struct assoc_array_edit *edit); 111 - 112 - This will perform the edit functions, interpolating various write barriers 113 - to permit accesses under the RCU read lock to continue. The edit script 114 - will then be passed to call_rcu() to free it and any dead stuff it points 115 - to. 116 - 117 - (2) Cancel an edit script. 118 - 119 - void assoc_array_cancel_edit(struct assoc_array_edit *edit); 120 - 121 - This frees the edit script and all preallocated memory immediately. If 122 - this was for insertion, the new object is _not_ released by this function, 123 - but must rather be released by the caller. 124 - 125 - These functions are guaranteed not to fail. 126 - 127 - 128 - OPERATIONS TABLE 129 - ---------------- 130 - 131 - Various functions take a table of operations: 132 - 133 - struct assoc_array_ops { 134 - ... 135 - }; 136 - 137 - This points to a number of methods, all of which need to be provided: 138 - 139 - (1) Get a chunk of index key from caller data: 140 - 141 - unsigned long (*get_key_chunk)(const void *index_key, int level); 142 - 143 - This should return a chunk of caller-supplied index key starting at the 144 - *bit* position given by the level argument. The level argument will be a 145 - multiple of ASSOC_ARRAY_KEY_CHUNK_SIZE and the function should return 146 - ASSOC_ARRAY_KEY_CHUNK_SIZE bits. No error is possible. 147 - 148 - 149 - (2) Get a chunk of an object's index key. 150 - 151 - unsigned long (*get_object_key_chunk)(const void *object, int level); 152 - 153 - As the previous function, but gets its data from an object in the array 154 - rather than from a caller-supplied index key. 155 - 156 - 157 - (3) See if this is the object we're looking for. 158 - 159 - bool (*compare_object)(const void *object, const void *index_key); 160 - 161 - Compare the object against an index key and return true if it matches and 162 - false if it doesn't. 163 - 164 - 165 - (4) Diff the index keys of two objects. 166 - 167 - int (*diff_objects)(const void *object, const void *index_key); 168 - 169 - Return the bit position at which the index key of the specified object 170 - differs from the given index key or -1 if they are the same. 171 - 172 - 173 - (5) Free an object. 174 - 175 - void (*free_object)(void *object); 176 - 177 - Free the specified object. Note that this may be called an RCU grace 178 - period after assoc_array_apply_edit() was called, so synchronize_rcu() may 179 - be necessary on module unloading. 180 - 181 - 182 - MANIPULATION FUNCTIONS 183 - ---------------------- 184 - 185 - There are a number of functions for manipulating an associative array: 186 - 187 - (1) Initialise an associative array. 188 - 189 - void assoc_array_init(struct assoc_array *array); 190 - 191 - This initialises the base structure for an associative array. It can't 192 - fail. 193 - 194 - 195 - (2) Insert/replace an object in an associative array. 196 - 197 - struct assoc_array_edit * 198 - assoc_array_insert(struct assoc_array *array, 199 - const struct assoc_array_ops *ops, 200 - const void *index_key, 201 - void *object); 202 - 203 - This inserts the given object into the array. Note that the least 204 - significant bit of the pointer must be zero as it's used to type-mark 205 - pointers internally. 206 - 207 - If an object already exists for that key then it will be replaced with the 208 - new object and the old one will be freed automatically. 209 - 210 - The index_key argument should hold index key information and is 211 - passed to the methods in the ops table when they are called. 212 - 213 - This function makes no alteration to the array itself, but rather returns 214 - an edit script that must be applied. -ENOMEM is returned in the case of 215 - an out-of-memory error. 216 - 217 - The caller should lock exclusively against other modifiers of the array. 218 - 219 - 220 - (3) Delete an object from an associative array. 221 - 222 - struct assoc_array_edit * 223 - assoc_array_delete(struct assoc_array *array, 224 - const struct assoc_array_ops *ops, 225 - const void *index_key); 226 - 227 - This deletes an object that matches the specified data from the array. 228 - 229 - The index_key argument should hold index key information and is 230 - passed to the methods in the ops table when they are called. 231 - 232 - This function makes no alteration to the array itself, but rather returns 233 - an edit script that must be applied. -ENOMEM is returned in the case of 234 - an out-of-memory error. NULL will be returned if the specified object is 235 - not found within the array. 236 - 237 - The caller should lock exclusively against other modifiers of the array. 238 - 239 - 240 - (4) Delete all objects from an associative array. 241 - 242 - struct assoc_array_edit * 243 - assoc_array_clear(struct assoc_array *array, 244 - const struct assoc_array_ops *ops); 245 - 246 - This deletes all the objects from an associative array and leaves it 247 - completely empty. 248 - 249 - This function makes no alteration to the array itself, but rather returns 250 - an edit script that must be applied. -ENOMEM is returned in the case of 251 - an out-of-memory error. 252 - 253 - The caller should lock exclusively against other modifiers of the array. 254 - 255 - 256 - (5) Destroy an associative array, deleting all objects. 257 - 258 - void assoc_array_destroy(struct assoc_array *array, 259 - const struct assoc_array_ops *ops); 260 - 261 - This destroys the contents of the associative array and leaves it 262 - completely empty. It is not permitted for another thread to be traversing 263 - the array under the RCU read lock at the same time as this function is 264 - destroying it as no RCU deferral is performed on memory release - 265 - something that would require memory to be allocated. 266 - 267 - The caller should lock exclusively against other modifiers and accessors 268 - of the array. 269 - 270 - 271 - (6) Garbage collect an associative array. 272 - 273 - int assoc_array_gc(struct assoc_array *array, 274 - const struct assoc_array_ops *ops, 275 - bool (*iterator)(void *object, void *iterator_data), 276 - void *iterator_data); 277 - 278 - This iterates over the objects in an associative array and passes each one 279 - to iterator(). If iterator() returns true, the object is kept. If it 280 - returns false, the object will be freed. If the iterator() function 281 - returns true, it must perform any appropriate refcount incrementing on the 282 - object before returning. 283 - 284 - The internal tree will be packed down if possible as part of the iteration 285 - to reduce the number of nodes in it. 286 - 287 - The iterator_data is passed directly to iterator() and is otherwise 288 - ignored by the function. 289 - 290 - The function will return 0 if successful and -ENOMEM if there wasn't 291 - enough memory. 292 - 293 - It is possible for other threads to iterate over or search the array under 294 - the RCU read lock whilst this function is in progress. The caller should 295 - lock exclusively against other modifiers of the array. 296 - 297 - 298 - ACCESS FUNCTIONS 299 - ---------------- 300 - 301 - There are two functions for accessing an associative array: 302 - 303 - (1) Iterate over all the objects in an associative array. 304 - 305 - int assoc_array_iterate(const struct assoc_array *array, 306 - int (*iterator)(const void *object, 307 - void *iterator_data), 308 - void *iterator_data); 309 - 310 - This passes each object in the array to the iterator callback function. 311 - iterator_data is private data for that function. 312 - 313 - This may be used on an array at the same time as the array is being 314 - modified, provided the RCU read lock is held. Under such circumstances, 315 - it is possible for the iteration function to see some objects twice. If 316 - this is a problem, then modification should be locked against. The 317 - iteration algorithm should not, however, miss any objects. 318 - 319 - The function will return 0 if no objects were in the array or else it will 320 - return the result of the last iterator function called. Iteration stops 321 - immediately if any call to the iteration function results in a non-zero 322 - return. 323 - 324 - 325 - (2) Find an object in an associative array. 326 - 327 - void *assoc_array_find(const struct assoc_array *array, 328 - const struct assoc_array_ops *ops, 329 - const void *index_key); 330 - 331 - This walks through the array's internal tree directly to the object 332 - specified by the index key.. 333 - 334 - This may be used on an array at the same time as the array is being 335 - modified, provided the RCU read lock is held. 336 - 337 - The function will return the object if found (and set *_type to the object 338 - type) or will return NULL if the object was not found. 339 - 340 - 341 - INDEX KEY FORM 342 - -------------- 343 - 344 - The index key can be of any form, but since the algorithms aren't told how long 345 - the key is, it is strongly recommended that the index key includes its length 346 - very early on before any variation due to the length would have an effect on 347 - comparisons. 348 - 349 - This will cause leaves with different length keys to scatter away from each 350 - other - and those with the same length keys to cluster together. 351 - 352 - It is also recommended that the index key begin with a hash of the rest of the 353 - key to maximise scattering throughout keyspace. 354 - 355 - The better the scattering, the wider and lower the internal tree will be. 356 - 357 - Poor scattering isn't too much of a problem as there are shortcuts and nodes 358 - can contain mixtures of leaves and metadata pointers. 359 - 360 - The index key is read in chunks of machine word. Each chunk is subdivided into 361 - one nibble (4 bits) per level, so on a 32-bit CPU this is good for 8 levels and 362 - on a 64-bit CPU, 16 levels. Unless the scattering is really poor, it is 363 - unlikely that more than one word of any particular index key will have to be 364 - used. 365 - 366 - 367 - ================= 368 - INTERNAL WORKINGS 369 - ================= 370 - 371 - The associative array data structure has an internal tree. This tree is 372 - constructed of two types of metadata blocks: nodes and shortcuts. 373 - 374 - A node is an array of slots. Each slot can contain one of four things: 375 - 376 - (*) A NULL pointer, indicating that the slot is empty. 377 - 378 - (*) A pointer to an object (a leaf). 379 - 380 - (*) A pointer to a node at the next level. 381 - 382 - (*) A pointer to a shortcut. 383 - 384 - 385 - BASIC INTERNAL TREE LAYOUT 386 - -------------------------- 387 - 388 - Ignoring shortcuts for the moment, the nodes form a multilevel tree. The index 389 - key space is strictly subdivided by the nodes in the tree and nodes occur on 390 - fixed levels. For example: 391 - 392 - Level: 0 1 2 3 393 - =============== =============== =============== =============== 394 - NODE D 395 - NODE B NODE C +------>+---+ 396 - +------>+---+ +------>+---+ | | 0 | 397 - NODE A | | 0 | | | 0 | | +---+ 398 - +---+ | +---+ | +---+ | : : 399 - | 0 | | : : | : : | +---+ 400 - +---+ | +---+ | +---+ | | f | 401 - | 1 |---+ | 3 |---+ | 7 |---+ +---+ 402 - +---+ +---+ +---+ 403 - : : : : | 8 |---+ 404 - +---+ +---+ +---+ | NODE E 405 - | e |---+ | f | : : +------>+---+ 406 - +---+ | +---+ +---+ | 0 | 407 - | f | | | f | +---+ 408 - +---+ | +---+ : : 409 - | NODE F +---+ 410 - +------>+---+ | f | 411 - | 0 | NODE G +---+ 412 - +---+ +------>+---+ 413 - : : | | 0 | 414 - +---+ | +---+ 415 - | 6 |---+ : : 416 - +---+ +---+ 417 - : : | f | 418 - +---+ +---+ 419 - | f | 420 - +---+ 421 - 422 - In the above example, there are 7 nodes (A-G), each with 16 slots (0-f). 423 - Assuming no other meta data nodes in the tree, the key space is divided thusly: 424 - 425 - KEY PREFIX NODE 426 - ========== ==== 427 - 137* D 428 - 138* E 429 - 13[0-69-f]* C 430 - 1[0-24-f]* B 431 - e6* G 432 - e[0-57-f]* F 433 - [02-df]* A 434 - 435 - So, for instance, keys with the following example index keys will be found in 436 - the appropriate nodes: 437 - 438 - INDEX KEY PREFIX NODE 439 - =============== ======= ==== 440 - 13694892892489 13 C 441 - 13795289025897 137 D 442 - 13889dde88793 138 E 443 - 138bbb89003093 138 E 444 - 1394879524789 12 C 445 - 1458952489 1 B 446 - 9431809de993ba - A 447 - b4542910809cd - A 448 - e5284310def98 e F 449 - e68428974237 e6 G 450 - e7fffcbd443 e F 451 - f3842239082 - A 452 - 453 - To save memory, if a node can hold all the leaves in its portion of keyspace, 454 - then the node will have all those leaves in it and will not have any metadata 455 - pointers - even if some of those leaves would like to be in the same slot. 456 - 457 - A node can contain a heterogeneous mix of leaves and metadata pointers. 458 - Metadata pointers must be in the slots that match their subdivisions of key 459 - space. The leaves can be in any slot not occupied by a metadata pointer. It 460 - is guaranteed that none of the leaves in a node will match a slot occupied by a 461 - metadata pointer. If the metadata pointer is there, any leaf whose key matches 462 - the metadata key prefix must be in the subtree that the metadata pointer points 463 - to. 464 - 465 - In the above example list of index keys, node A will contain: 466 - 467 - SLOT CONTENT INDEX KEY (PREFIX) 468 - ==== =============== ================== 469 - 1 PTR TO NODE B 1* 470 - any LEAF 9431809de993ba 471 - any LEAF b4542910809cd 472 - e PTR TO NODE F e* 473 - any LEAF f3842239082 474 - 475 - and node B: 476 - 477 - 3 PTR TO NODE C 13* 478 - any LEAF 1458952489 479 - 480 - 481 - SHORTCUTS 482 - --------- 483 - 484 - Shortcuts are metadata records that jump over a piece of keyspace. A shortcut 485 - is a replacement for a series of single-occupancy nodes ascending through the 486 - levels. Shortcuts exist to save memory and to speed up traversal. 487 - 488 - It is possible for the root of the tree to be a shortcut - say, for example, 489 - the tree contains at least 17 nodes all with key prefix '1111'. The insertion 490 - algorithm will insert a shortcut to skip over the '1111' keyspace in a single 491 - bound and get to the fourth level where these actually become different. 492 - 493 - 494 - SPLITTING AND COLLAPSING NODES 495 - ------------------------------ 496 - 497 - Each node has a maximum capacity of 16 leaves and metadata pointers. If the 498 - insertion algorithm finds that it is trying to insert a 17th object into a 499 - node, that node will be split such that at least two leaves that have a common 500 - key segment at that level end up in a separate node rooted on that slot for 501 - that common key segment. 502 - 503 - If the leaves in a full node and the leaf that is being inserted are 504 - sufficiently similar, then a shortcut will be inserted into the tree. 505 - 506 - When the number of objects in the subtree rooted at a node falls to 16 or 507 - fewer, then the subtree will be collapsed down to a single node - and this will 508 - ripple towards the root if possible. 509 - 510 - 511 - NON-RECURSIVE ITERATION 512 - ----------------------- 513 - 514 - Each node and shortcut contains a back pointer to its parent and the number of 515 - slot in that parent that points to it. None-recursive iteration uses these to 516 - proceed rootwards through the tree, going to the parent node, slot N + 1 to 517 - make sure progress is made without the need for a stack. 518 - 519 - The backpointers, however, make simultaneous alteration and iteration tricky. 520 - 521 - 522 - SIMULTANEOUS ALTERATION AND ITERATION 523 - ------------------------------------- 524 - 525 - There are a number of cases to consider: 526 - 527 - (1) Simple insert/replace. This involves simply replacing a NULL or old 528 - matching leaf pointer with the pointer to the new leaf after a barrier. 529 - The metadata blocks don't change otherwise. An old leaf won't be freed 530 - until after the RCU grace period. 531 - 532 - (2) Simple delete. This involves just clearing an old matching leaf. The 533 - metadata blocks don't change otherwise. The old leaf won't be freed until 534 - after the RCU grace period. 535 - 536 - (3) Insertion replacing part of a subtree that we haven't yet entered. This 537 - may involve replacement of part of that subtree - but that won't affect 538 - the iteration as we won't have reached the pointer to it yet and the 539 - ancestry blocks are not replaced (the layout of those does not change). 540 - 541 - (4) Insertion replacing nodes that we're actively processing. This isn't a 542 - problem as we've passed the anchoring pointer and won't switch onto the 543 - new layout until we follow the back pointers - at which point we've 544 - already examined the leaves in the replaced node (we iterate over all the 545 - leaves in a node before following any of its metadata pointers). 546 - 547 - We might, however, re-see some leaves that have been split out into a new 548 - branch that's in a slot further along than we were at. 549 - 550 - (5) Insertion replacing nodes that we're processing a dependent branch of. 551 - This won't affect us until we follow the back pointers. Similar to (4). 552 - 553 - (6) Deletion collapsing a branch under us. This doesn't affect us because the 554 - back pointers will get us back to the parent of the new node before we 555 - could see the new node. The entire collapsed subtree is thrown away 556 - unchanged - and will still be rooted on the same slot, so we shouldn't 557 - process it a second time as we'll go back to slot + 1. 558 - 559 - Note: 560 - 561 - (*) Under some circumstances, we need to simultaneously change the parent 562 - pointer and the parent slot pointer on a node (say, for example, we 563 - inserted another node before it and moved it up a level). We cannot do 564 - this without locking against a read - so we have to replace that node too. 565 - 566 - However, when we're changing a shortcut into a node this isn't a problem 567 - as shortcuts only have one slot and so the parent slot number isn't used 568 - when traversing backwards over one. This means that it's okay to change 569 - the slot number first - provided suitable barriers are used to make sure 570 - the parent slot number is read after the back pointer. 571 - 572 - Obsolete blocks and leaves are freed up after an RCU grace period has passed, 573 - so as long as anyone doing walking or iteration holds the RCU read lock, the 574 - old superstructure should not go away on them.
+551
Documentation/core-api/assoc_array.rst
··· 1 + ======================================== 2 + Generic Associative Array Implementation 3 + ======================================== 4 + 5 + Overview 6 + ======== 7 + 8 + This associative array implementation is an object container with the following 9 + properties: 10 + 11 + 1. Objects are opaque pointers. The implementation does not care where they 12 + point (if anywhere) or what they point to (if anything). 13 + .. note:: Pointers to objects _must_ be zero in the least significant bit.** 14 + 15 + 2. Objects do not need to contain linkage blocks for use by the array. This 16 + permits an object to be located in multiple arrays simultaneously. 17 + Rather, the array is made up of metadata blocks that point to objects. 18 + 19 + 3. Objects require index keys to locate them within the array. 20 + 21 + 4. Index keys must be unique. Inserting an object with the same key as one 22 + already in the array will replace the old object. 23 + 24 + 5. Index keys can be of any length and can be of different lengths. 25 + 26 + 6. Index keys should encode the length early on, before any variation due to 27 + length is seen. 28 + 29 + 7. Index keys can include a hash to scatter objects throughout the array. 30 + 31 + 8. The array can iterated over. The objects will not necessarily come out in 32 + key order. 33 + 34 + 9. The array can be iterated over whilst it is being modified, provided the 35 + RCU readlock is being held by the iterator. Note, however, under these 36 + circumstances, some objects may be seen more than once. If this is a 37 + problem, the iterator should lock against modification. Objects will not 38 + be missed, however, unless deleted. 39 + 40 + 10. Objects in the array can be looked up by means of their index key. 41 + 42 + 11. Objects can be looked up whilst the array is being modified, provided the 43 + RCU readlock is being held by the thread doing the look up. 44 + 45 + The implementation uses a tree of 16-pointer nodes internally that are indexed 46 + on each level by nibbles from the index key in the same manner as in a radix 47 + tree. To improve memory efficiency, shortcuts can be emplaced to skip over 48 + what would otherwise be a series of single-occupancy nodes. Further, nodes 49 + pack leaf object pointers into spare space in the node rather than making an 50 + extra branch until as such time an object needs to be added to a full node. 51 + 52 + 53 + The Public API 54 + ============== 55 + 56 + The public API can be found in ``<linux/assoc_array.h>``. The associative 57 + array is rooted on the following structure:: 58 + 59 + struct assoc_array { 60 + ... 61 + }; 62 + 63 + The code is selected by enabling ``CONFIG_ASSOCIATIVE_ARRAY`` with:: 64 + 65 + ./script/config -e ASSOCIATIVE_ARRAY 66 + 67 + 68 + Edit Script 69 + ----------- 70 + 71 + The insertion and deletion functions produce an 'edit script' that can later be 72 + applied to effect the changes without risking ``ENOMEM``. This retains the 73 + preallocated metadata blocks that will be installed in the internal tree and 74 + keeps track of the metadata blocks that will be removed from the tree when the 75 + script is applied. 76 + 77 + This is also used to keep track of dead blocks and dead objects after the 78 + script has been applied so that they can be freed later. The freeing is done 79 + after an RCU grace period has passed - thus allowing access functions to 80 + proceed under the RCU read lock. 81 + 82 + The script appears as outside of the API as a pointer of the type:: 83 + 84 + struct assoc_array_edit; 85 + 86 + There are two functions for dealing with the script: 87 + 88 + 1. Apply an edit script:: 89 + 90 + void assoc_array_apply_edit(struct assoc_array_edit *edit); 91 + 92 + This will perform the edit functions, interpolating various write barriers 93 + to permit accesses under the RCU read lock to continue. The edit script 94 + will then be passed to ``call_rcu()`` to free it and any dead stuff it points 95 + to. 96 + 97 + 2. Cancel an edit script:: 98 + 99 + void assoc_array_cancel_edit(struct assoc_array_edit *edit); 100 + 101 + This frees the edit script and all preallocated memory immediately. If 102 + this was for insertion, the new object is _not_ released by this function, 103 + but must rather be released by the caller. 104 + 105 + These functions are guaranteed not to fail. 106 + 107 + 108 + Operations Table 109 + ---------------- 110 + 111 + Various functions take a table of operations:: 112 + 113 + struct assoc_array_ops { 114 + ... 115 + }; 116 + 117 + This points to a number of methods, all of which need to be provided: 118 + 119 + 1. Get a chunk of index key from caller data:: 120 + 121 + unsigned long (*get_key_chunk)(const void *index_key, int level); 122 + 123 + This should return a chunk of caller-supplied index key starting at the 124 + *bit* position given by the level argument. The level argument will be a 125 + multiple of ``ASSOC_ARRAY_KEY_CHUNK_SIZE`` and the function should return 126 + ``ASSOC_ARRAY_KEY_CHUNK_SIZE bits``. No error is possible. 127 + 128 + 129 + 2. Get a chunk of an object's index key:: 130 + 131 + unsigned long (*get_object_key_chunk)(const void *object, int level); 132 + 133 + As the previous function, but gets its data from an object in the array 134 + rather than from a caller-supplied index key. 135 + 136 + 137 + 3. See if this is the object we're looking for:: 138 + 139 + bool (*compare_object)(const void *object, const void *index_key); 140 + 141 + Compare the object against an index key and return ``true`` if it matches and 142 + ``false`` if it doesn't. 143 + 144 + 145 + 4. Diff the index keys of two objects:: 146 + 147 + int (*diff_objects)(const void *object, const void *index_key); 148 + 149 + Return the bit position at which the index key of the specified object 150 + differs from the given index key or -1 if they are the same. 151 + 152 + 153 + 5. Free an object:: 154 + 155 + void (*free_object)(void *object); 156 + 157 + Free the specified object. Note that this may be called an RCU grace period 158 + after ``assoc_array_apply_edit()`` was called, so ``synchronize_rcu()`` may be 159 + necessary on module unloading. 160 + 161 + 162 + Manipulation Functions 163 + ---------------------- 164 + 165 + There are a number of functions for manipulating an associative array: 166 + 167 + 1. Initialise an associative array:: 168 + 169 + void assoc_array_init(struct assoc_array *array); 170 + 171 + This initialises the base structure for an associative array. It can't fail. 172 + 173 + 174 + 2. Insert/replace an object in an associative array:: 175 + 176 + struct assoc_array_edit * 177 + assoc_array_insert(struct assoc_array *array, 178 + const struct assoc_array_ops *ops, 179 + const void *index_key, 180 + void *object); 181 + 182 + This inserts the given object into the array. Note that the least 183 + significant bit of the pointer must be zero as it's used to type-mark 184 + pointers internally. 185 + 186 + If an object already exists for that key then it will be replaced with the 187 + new object and the old one will be freed automatically. 188 + 189 + The ``index_key`` argument should hold index key information and is 190 + passed to the methods in the ops table when they are called. 191 + 192 + This function makes no alteration to the array itself, but rather returns 193 + an edit script that must be applied. ``-ENOMEM`` is returned in the case of 194 + an out-of-memory error. 195 + 196 + The caller should lock exclusively against other modifiers of the array. 197 + 198 + 199 + 3. Delete an object from an associative array:: 200 + 201 + struct assoc_array_edit * 202 + assoc_array_delete(struct assoc_array *array, 203 + const struct assoc_array_ops *ops, 204 + const void *index_key); 205 + 206 + This deletes an object that matches the specified data from the array. 207 + 208 + The ``index_key`` argument should hold index key information and is 209 + passed to the methods in the ops table when they are called. 210 + 211 + This function makes no alteration to the array itself, but rather returns 212 + an edit script that must be applied. ``-ENOMEM`` is returned in the case of 213 + an out-of-memory error. ``NULL`` will be returned if the specified object is 214 + not found within the array. 215 + 216 + The caller should lock exclusively against other modifiers of the array. 217 + 218 + 219 + 4. Delete all objects from an associative array:: 220 + 221 + struct assoc_array_edit * 222 + assoc_array_clear(struct assoc_array *array, 223 + const struct assoc_array_ops *ops); 224 + 225 + This deletes all the objects from an associative array and leaves it 226 + completely empty. 227 + 228 + This function makes no alteration to the array itself, but rather returns 229 + an edit script that must be applied. ``-ENOMEM`` is returned in the case of 230 + an out-of-memory error. 231 + 232 + The caller should lock exclusively against other modifiers of the array. 233 + 234 + 235 + 5. Destroy an associative array, deleting all objects:: 236 + 237 + void assoc_array_destroy(struct assoc_array *array, 238 + const struct assoc_array_ops *ops); 239 + 240 + This destroys the contents of the associative array and leaves it 241 + completely empty. It is not permitted for another thread to be traversing 242 + the array under the RCU read lock at the same time as this function is 243 + destroying it as no RCU deferral is performed on memory release - 244 + something that would require memory to be allocated. 245 + 246 + The caller should lock exclusively against other modifiers and accessors 247 + of the array. 248 + 249 + 250 + 6. Garbage collect an associative array:: 251 + 252 + int assoc_array_gc(struct assoc_array *array, 253 + const struct assoc_array_ops *ops, 254 + bool (*iterator)(void *object, void *iterator_data), 255 + void *iterator_data); 256 + 257 + This iterates over the objects in an associative array and passes each one to 258 + ``iterator()``. If ``iterator()`` returns ``true``, the object is kept. If it 259 + returns ``false``, the object will be freed. If the ``iterator()`` function 260 + returns ``true``, it must perform any appropriate refcount incrementing on the 261 + object before returning. 262 + 263 + The internal tree will be packed down if possible as part of the iteration 264 + to reduce the number of nodes in it. 265 + 266 + The ``iterator_data`` is passed directly to ``iterator()`` and is otherwise 267 + ignored by the function. 268 + 269 + The function will return ``0`` if successful and ``-ENOMEM`` if there wasn't 270 + enough memory. 271 + 272 + It is possible for other threads to iterate over or search the array under 273 + the RCU read lock whilst this function is in progress. The caller should 274 + lock exclusively against other modifiers of the array. 275 + 276 + 277 + Access Functions 278 + ---------------- 279 + 280 + There are two functions for accessing an associative array: 281 + 282 + 1. Iterate over all the objects in an associative array:: 283 + 284 + int assoc_array_iterate(const struct assoc_array *array, 285 + int (*iterator)(const void *object, 286 + void *iterator_data), 287 + void *iterator_data); 288 + 289 + This passes each object in the array to the iterator callback function. 290 + ``iterator_data`` is private data for that function. 291 + 292 + This may be used on an array at the same time as the array is being 293 + modified, provided the RCU read lock is held. Under such circumstances, 294 + it is possible for the iteration function to see some objects twice. If 295 + this is a problem, then modification should be locked against. The 296 + iteration algorithm should not, however, miss any objects. 297 + 298 + The function will return ``0`` if no objects were in the array or else it will 299 + return the result of the last iterator function called. Iteration stops 300 + immediately if any call to the iteration function results in a non-zero 301 + return. 302 + 303 + 304 + 2. Find an object in an associative array:: 305 + 306 + void *assoc_array_find(const struct assoc_array *array, 307 + const struct assoc_array_ops *ops, 308 + const void *index_key); 309 + 310 + This walks through the array's internal tree directly to the object 311 + specified by the index key.. 312 + 313 + This may be used on an array at the same time as the array is being 314 + modified, provided the RCU read lock is held. 315 + 316 + The function will return the object if found (and set ``*_type`` to the object 317 + type) or will return ``NULL`` if the object was not found. 318 + 319 + 320 + Index Key Form 321 + -------------- 322 + 323 + The index key can be of any form, but since the algorithms aren't told how long 324 + the key is, it is strongly recommended that the index key includes its length 325 + very early on before any variation due to the length would have an effect on 326 + comparisons. 327 + 328 + This will cause leaves with different length keys to scatter away from each 329 + other - and those with the same length keys to cluster together. 330 + 331 + It is also recommended that the index key begin with a hash of the rest of the 332 + key to maximise scattering throughout keyspace. 333 + 334 + The better the scattering, the wider and lower the internal tree will be. 335 + 336 + Poor scattering isn't too much of a problem as there are shortcuts and nodes 337 + can contain mixtures of leaves and metadata pointers. 338 + 339 + The index key is read in chunks of machine word. Each chunk is subdivided into 340 + one nibble (4 bits) per level, so on a 32-bit CPU this is good for 8 levels and 341 + on a 64-bit CPU, 16 levels. Unless the scattering is really poor, it is 342 + unlikely that more than one word of any particular index key will have to be 343 + used. 344 + 345 + 346 + Internal Workings 347 + ================= 348 + 349 + The associative array data structure has an internal tree. This tree is 350 + constructed of two types of metadata blocks: nodes and shortcuts. 351 + 352 + A node is an array of slots. Each slot can contain one of four things: 353 + 354 + * A NULL pointer, indicating that the slot is empty. 355 + * A pointer to an object (a leaf). 356 + * A pointer to a node at the next level. 357 + * A pointer to a shortcut. 358 + 359 + 360 + Basic Internal Tree Layout 361 + -------------------------- 362 + 363 + Ignoring shortcuts for the moment, the nodes form a multilevel tree. The index 364 + key space is strictly subdivided by the nodes in the tree and nodes occur on 365 + fixed levels. For example:: 366 + 367 + Level: 0 1 2 3 368 + =============== =============== =============== =============== 369 + NODE D 370 + NODE B NODE C +------>+---+ 371 + +------>+---+ +------>+---+ | | 0 | 372 + NODE A | | 0 | | | 0 | | +---+ 373 + +---+ | +---+ | +---+ | : : 374 + | 0 | | : : | : : | +---+ 375 + +---+ | +---+ | +---+ | | f | 376 + | 1 |---+ | 3 |---+ | 7 |---+ +---+ 377 + +---+ +---+ +---+ 378 + : : : : | 8 |---+ 379 + +---+ +---+ +---+ | NODE E 380 + | e |---+ | f | : : +------>+---+ 381 + +---+ | +---+ +---+ | 0 | 382 + | f | | | f | +---+ 383 + +---+ | +---+ : : 384 + | NODE F +---+ 385 + +------>+---+ | f | 386 + | 0 | NODE G +---+ 387 + +---+ +------>+---+ 388 + : : | | 0 | 389 + +---+ | +---+ 390 + | 6 |---+ : : 391 + +---+ +---+ 392 + : : | f | 393 + +---+ +---+ 394 + | f | 395 + +---+ 396 + 397 + In the above example, there are 7 nodes (A-G), each with 16 slots (0-f). 398 + Assuming no other meta data nodes in the tree, the key space is divided 399 + thusly:: 400 + 401 + KEY PREFIX NODE 402 + ========== ==== 403 + 137* D 404 + 138* E 405 + 13[0-69-f]* C 406 + 1[0-24-f]* B 407 + e6* G 408 + e[0-57-f]* F 409 + [02-df]* A 410 + 411 + So, for instance, keys with the following example index keys will be found in 412 + the appropriate nodes:: 413 + 414 + INDEX KEY PREFIX NODE 415 + =============== ======= ==== 416 + 13694892892489 13 C 417 + 13795289025897 137 D 418 + 13889dde88793 138 E 419 + 138bbb89003093 138 E 420 + 1394879524789 12 C 421 + 1458952489 1 B 422 + 9431809de993ba - A 423 + b4542910809cd - A 424 + e5284310def98 e F 425 + e68428974237 e6 G 426 + e7fffcbd443 e F 427 + f3842239082 - A 428 + 429 + To save memory, if a node can hold all the leaves in its portion of keyspace, 430 + then the node will have all those leaves in it and will not have any metadata 431 + pointers - even if some of those leaves would like to be in the same slot. 432 + 433 + A node can contain a heterogeneous mix of leaves and metadata pointers. 434 + Metadata pointers must be in the slots that match their subdivisions of key 435 + space. The leaves can be in any slot not occupied by a metadata pointer. It 436 + is guaranteed that none of the leaves in a node will match a slot occupied by a 437 + metadata pointer. If the metadata pointer is there, any leaf whose key matches 438 + the metadata key prefix must be in the subtree that the metadata pointer points 439 + to. 440 + 441 + In the above example list of index keys, node A will contain:: 442 + 443 + SLOT CONTENT INDEX KEY (PREFIX) 444 + ==== =============== ================== 445 + 1 PTR TO NODE B 1* 446 + any LEAF 9431809de993ba 447 + any LEAF b4542910809cd 448 + e PTR TO NODE F e* 449 + any LEAF f3842239082 450 + 451 + and node B:: 452 + 453 + 3 PTR TO NODE C 13* 454 + any LEAF 1458952489 455 + 456 + 457 + Shortcuts 458 + --------- 459 + 460 + Shortcuts are metadata records that jump over a piece of keyspace. A shortcut 461 + is a replacement for a series of single-occupancy nodes ascending through the 462 + levels. Shortcuts exist to save memory and to speed up traversal. 463 + 464 + It is possible for the root of the tree to be a shortcut - say, for example, 465 + the tree contains at least 17 nodes all with key prefix ``1111``. The 466 + insertion algorithm will insert a shortcut to skip over the ``1111`` keyspace 467 + in a single bound and get to the fourth level where these actually become 468 + different. 469 + 470 + 471 + Splitting And Collapsing Nodes 472 + ------------------------------ 473 + 474 + Each node has a maximum capacity of 16 leaves and metadata pointers. If the 475 + insertion algorithm finds that it is trying to insert a 17th object into a 476 + node, that node will be split such that at least two leaves that have a common 477 + key segment at that level end up in a separate node rooted on that slot for 478 + that common key segment. 479 + 480 + If the leaves in a full node and the leaf that is being inserted are 481 + sufficiently similar, then a shortcut will be inserted into the tree. 482 + 483 + When the number of objects in the subtree rooted at a node falls to 16 or 484 + fewer, then the subtree will be collapsed down to a single node - and this will 485 + ripple towards the root if possible. 486 + 487 + 488 + Non-Recursive Iteration 489 + ----------------------- 490 + 491 + Each node and shortcut contains a back pointer to its parent and the number of 492 + slot in that parent that points to it. None-recursive iteration uses these to 493 + proceed rootwards through the tree, going to the parent node, slot N + 1 to 494 + make sure progress is made without the need for a stack. 495 + 496 + The backpointers, however, make simultaneous alteration and iteration tricky. 497 + 498 + 499 + Simultaneous Alteration And Iteration 500 + ------------------------------------- 501 + 502 + There are a number of cases to consider: 503 + 504 + 1. Simple insert/replace. This involves simply replacing a NULL or old 505 + matching leaf pointer with the pointer to the new leaf after a barrier. 506 + The metadata blocks don't change otherwise. An old leaf won't be freed 507 + until after the RCU grace period. 508 + 509 + 2. Simple delete. This involves just clearing an old matching leaf. The 510 + metadata blocks don't change otherwise. The old leaf won't be freed until 511 + after the RCU grace period. 512 + 513 + 3. Insertion replacing part of a subtree that we haven't yet entered. This 514 + may involve replacement of part of that subtree - but that won't affect 515 + the iteration as we won't have reached the pointer to it yet and the 516 + ancestry blocks are not replaced (the layout of those does not change). 517 + 518 + 4. Insertion replacing nodes that we're actively processing. This isn't a 519 + problem as we've passed the anchoring pointer and won't switch onto the 520 + new layout until we follow the back pointers - at which point we've 521 + already examined the leaves in the replaced node (we iterate over all the 522 + leaves in a node before following any of its metadata pointers). 523 + 524 + We might, however, re-see some leaves that have been split out into a new 525 + branch that's in a slot further along than we were at. 526 + 527 + 5. Insertion replacing nodes that we're processing a dependent branch of. 528 + This won't affect us until we follow the back pointers. Similar to (4). 529 + 530 + 6. Deletion collapsing a branch under us. This doesn't affect us because the 531 + back pointers will get us back to the parent of the new node before we 532 + could see the new node. The entire collapsed subtree is thrown away 533 + unchanged - and will still be rooted on the same slot, so we shouldn't 534 + process it a second time as we'll go back to slot + 1. 535 + 536 + .. note:: 537 + 538 + Under some circumstances, we need to simultaneously change the parent 539 + pointer and the parent slot pointer on a node (say, for example, we 540 + inserted another node before it and moved it up a level). We cannot do 541 + this without locking against a read - so we have to replace that node too. 542 + 543 + However, when we're changing a shortcut into a node this isn't a problem 544 + as shortcuts only have one slot and so the parent slot number isn't used 545 + when traversing backwards over one. This means that it's okay to change 546 + the slot number first - provided suitable barriers are used to make sure 547 + the parent slot number is read after the back pointer. 548 + 549 + Obsolete blocks and leaves are freed up after an RCU grace period has passed, 550 + so as long as anyone doing walking or iteration holds the RCU read lock, the 551 + old superstructure should not go away on them.
+1
Documentation/core-api/index.rst
··· 7 7 .. toctree:: 8 8 :maxdepth: 1 9 9 10 + assoc_array 10 11 workqueue 11 12 12 13 .. only:: subproject