Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

locking/lockdep: Remove the cross-release locking checks

This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y),
while it found a number of old bugs initially, was also causing too many
false positives that caused people to disable lockdep - which is arguably
a worse overall outcome.

If we disable cross-release by default but keep the code upstream then
in practice the most likely outcome is that we'll allow the situation
to degrade gradually, by allowing entropy to introduce more and more
false positives, until it overwhelms maintenance capacity.

Another bad side effect was that people were trying to work around
the false positives by uglifying/complicating unrelated code. There's
a marked difference between annotating locking operations and
uglifying good code just due to bad lock debugging code ...

This gradual decrease in quality happened to a number of debugging
facilities in the kernel, and lockdep is pretty complex already,
so we cannot risk this outcome.

Either cross-release checking can be done right with no false positives,
or it should not be included in the upstream kernel.

( Note that it might make sense to maintain it out of tree and go through
the false positives every now and then and see whether new bugs were
introduced. )

Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

+38 -1708
-874
Documentation/locking/crossrelease.txt
Crossrelease
============

Started by Byungchul Park <byungchul.park@lge.com>

Contents:

 (*) Background

     - What causes deadlock
     - How lockdep works

 (*) Limitation

     - Limit lockdep
     - Pros from the limitation
     - Cons from the limitation
     - Relax the limitation

 (*) Crossrelease

     - Introduce crossrelease
     - Introduce commit

 (*) Implementation

     - Data structures
     - How crossrelease works

 (*) Optimizations

     - Avoid duplication
     - Lockless for hot paths

 (*) APPENDIX A: What lockdep does to work aggressively

 (*) APPENDIX B: How to avoid adding false dependencies


==========
Background
==========

What causes deadlock
--------------------

A deadlock occurs when a context is waiting for an event to happen,
which is impossible because another (or the same) context that can
trigger the event is also waiting for another (or the same) event to
happen, which is likewise impossible for the same reason.

For example:

   A context going to trigger event C is waiting for event A to happen.
   A context going to trigger event A is waiting for event B to happen.
   A context going to trigger event B is waiting for event C to happen.

A deadlock occurs when these three wait operations run at the same time,
because event C cannot be triggered if event A does not happen, which in
turn cannot be triggered if event B does not happen, which in turn
cannot be triggered if event C does not happen. After all, no event can
be triggered since none of them ever meets its condition to wake up.

A dependency might exist between two waiters and a deadlock might happen
due to an incorrect relationship between dependencies. Thus, we must
define what a dependency is first. A dependency exists between them if:

   1. There are two waiters waiting for each event at a given time.
   2. The only way to wake up each waiter is to trigger its event.
   3. Whether one can be woken up depends on whether the other can.

Each wait in the example creates its dependency like:

   Event C depends on event A.
   Event A depends on event B.
   Event B depends on event C.

   NOTE: Precisely speaking, a dependency is one between whether a
   waiter for an event can be woken up and whether another waiter for
   another event can be woken up. However from now on, we will describe
   a dependency as if it's one between an event and another event for
   simplicity.

And they form circular dependencies like:

    -> C -> A -> B -
   /                \
   \                /
    ----------------

   where 'A -> B' means that event A depends on event B.

Such circular dependencies lead to a deadlock since no waiter can meet
its condition to wake up as described.

CONCLUSION

Circular dependencies cause a deadlock.


How lockdep works
-----------------

Lockdep tries to detect a deadlock by checking dependencies created by
lock operations, acquire and release. Waiting for a lock corresponds to
waiting for an event, and releasing a lock corresponds to triggering an
event, in the sense of the previous section.

In short, lockdep does:

   1. Detect a new dependency.
   2. Add the dependency into a global graph.
   3. Check if that makes dependencies circular.
   4. Report a deadlock or its possibility if so.

For example, consider a graph built by lockdep that looks like:

   A -> B -
           \
            -> E
           /
   C -> D -

   where A, B,..., E are different lock classes.

Lockdep will add a dependency into the graph on detection of a new
dependency.
For example, it will add a dependency 'E -> C' when a new
dependency between lock E and lock C is detected. Then the graph will be:

   A -> B -
           \
            -> E -
           /      \
    -> C -> D      \
   /                \
   \                /
    ----------------

   where A, B,..., E are different lock classes.

This graph contains a subgraph which demonstrates circular dependencies:

            -> E -
           /      \
    -> C -> D      \
   /                \
   \                /
    ----------------

   where C, D and E are different lock classes.

This is the condition under which a deadlock might occur. Lockdep
reports it on detection after adding a new dependency. This is how
lockdep works.

CONCLUSION

Lockdep detects a deadlock or its possibility by checking if circular
dependencies were created after adding each new dependency.


==========
Limitation
==========

Limit lockdep
-------------

Limiting lockdep to work only on typical locks, e.g. spin locks and
mutexes, which are released within their acquire context, the
implementation becomes simple but its capacity for detection becomes
limited. Let's check the pros and cons in the next sections.


Pros from the limitation
------------------------

Given the limitation, when acquiring a lock, the locks already in
held_locks cannot be released while the context has to wait for the new
lock, which means all waiters for the locks in held_locks are stuck.
This is exactly the situation that creates dependencies between each
lock in held_locks and the lock to acquire.

For example:

   CONTEXT X
   ---------
   acquire A
   acquire B /* Add a dependency 'A -> B' */
   release B
   release A

   where A and B are different lock classes.
When acquiring lock A, the held_locks of CONTEXT X is empty, so no
dependency is added. But when acquiring lock B, lockdep detects and adds
a new dependency 'A -> B' between lock A in held_locks and lock B.
Dependencies can simply be added whenever a lock is acquired.

And the data required by lockdep lives in a local structure, the
held_locks embedded in task_struct. Since access to the data is forced
to happen within the owning context, lockdep can avoid races on this
local data without explicit locks.

Lastly, lockdep only needs to keep the locks currently being held to
build a dependency graph. Were the limitation relaxed, it would need to
keep even locks already released, because the decision whether they
created dependencies might be long-deferred.

To sum up, we can expect several advantages from the limitation:

   1. Lockdep can easily identify a dependency when acquiring a lock.
   2. Races are avoidable while accessing the local held_locks.
   3. Lockdep only needs to keep locks currently being held.

CONCLUSION

Given the limitation, the implementation becomes simple and efficient.


Cons from the limitation
------------------------

Given the limitation, lockdep is applicable only to typical locks. For
example, page locks for page access or completions for synchronization
cannot work with lockdep.

Can we detect the deadlocks below, under the limitation?

Example 1:

   CONTEXT X	   CONTEXT Y	   CONTEXT Z
   ---------	   ---------	   ---------
		   mutex_lock A
   lock_page B
		   lock_page B
				   mutex_lock A /* DEADLOCK */
				   unlock_page B held by X
		   unlock_page B
		   mutex_unlock A
				   mutex_unlock A

   where A and B are different lock classes.

No, we cannot.
Example 2:

   CONTEXT X	   CONTEXT Y
   ---------	   ---------
		   mutex_lock A
   mutex_lock A
		   wait_for_complete B /* DEADLOCK */
   complete B
		   mutex_unlock A
   mutex_unlock A

   where A is a lock class and B is a completion variable.

No, we cannot.

CONCLUSION

Given the limitation, lockdep cannot detect a deadlock or its
possibility caused by page locks or completions.


Relax the limitation
--------------------

Under the limitation, the things that create dependencies are limited
to typical locks. However, synchronization primitives like page locks
and completions, which are allowed to be released in any context, also
create dependencies and can cause a deadlock. So lockdep should track
these locks to do a better job; we have to relax the limitation for
these locks to work with lockdep.

Detecting dependencies is very important for lockdep to work, because
adding a dependency means adding an opportunity to check whether it
causes a deadlock. The more dependencies lockdep adds, the more
thoroughly it works. Thus lockdep has to do its best to detect and add
as many true dependencies into the graph as possible.

For example, considering only typical locks, lockdep builds a graph like:

   A -> B -
           \
            -> E
           /
   C -> D -

   where A, B,..., E are different lock classes.

On the other hand, under the relaxation, additional dependencies might
be created and added. Assuming additional 'FX -> C' and 'E -> GX' are
added thanks to the relaxation, the graph will be:

   A -> B -
           \
            -> E -> GX
           /
  FX -> C -> D -

   where A, B,..., E, FX and GX are different lock classes, and a suffix
   'X' is added on non-typical locks.
The latter graph gives us more chances to check circular dependencies
than the former. However, it might suffer performance degradation, since
relaxing the limitation, with which the design and implementation of
lockdep can be efficient, inevitably introduces some inefficiency. So
lockdep should provide two options: strong detection and efficient
detection.

Choosing efficient detection:

   Lockdep works only with locks restricted to be released within their
   acquire context. However, lockdep works efficiently.

Choosing strong detection:

   Lockdep works with all synchronization primitives. However, lockdep
   suffers performance degradation.

CONCLUSION

Relaxing the limitation, lockdep can add additional dependencies, giving
additional opportunities to check circular dependencies.


============
Crossrelease
============

Introduce crossrelease
----------------------

In order to allow lockdep to handle additional dependencies created by
what might be released in any context, namely a 'crosslock', we have to
be able to identify the dependencies created by crosslocks. The proposed
'crossrelease' feature provides a way to do that.

The crossrelease feature has to:

   1. Identify dependencies created by crosslocks.
   2. Add the dependencies into the dependency graph.

That's all. Once a meaningful dependency is added into the graph,
lockdep works with the graph as it did before. The most important thing
the crossrelease feature has to do is to correctly identify and add true
dependencies into the global graph.

A dependency, e.g. 'A -> B', can be identified only in A's release
context, because the decision required to identify the dependency can be
made only in the release context.
That is, the decision whether A can be
released so that a waiter for A can be woken up cannot be made anywhere
other than A's release context.

This is no problem for typical locks, because each acquire context is
the same as its release context, so lockdep can decide whether a lock
can be released in the acquire context. However for crosslocks, lockdep
cannot make the decision in the acquire context but has to wait until
the release context is identified.

Therefore, deadlocks by crosslocks cannot be detected at the moment they
happen, because they cannot be identified until the crosslocks are
released. However, deadlock possibilities can be detected, and that is
very worthwhile. See the 'APPENDIX A' section to check why.

CONCLUSION

Using the crossrelease feature, lockdep can work with what might be
released in any context, namely a crosslock.


Introduce commit
----------------

Since crossrelease defers the work of adding true dependencies of
crosslocks until they are actually released, crossrelease has to queue
all acquisitions which might create dependencies with the crosslocks.
Then it identifies dependencies using the queued data in batches at a
proper time. We call that 'commit'.

There are four types of dependencies:

1. TT type: 'typical lock A -> typical lock B'

   Just when acquiring B, lockdep can see it's in A's release context.
   So the dependency between A and B can be identified immediately.
   Commit is unnecessary.

2. TC type: 'typical lock A -> crosslock BX'

   Just when acquiring BX, lockdep can see it's in A's release context.
   So the dependency between A and BX can be identified immediately.
   Commit is unnecessary, too.

3. CT type: 'crosslock AX -> typical lock B'

   When acquiring B, lockdep cannot identify the dependency because
   there's no way to know if it's in AX's release context. It has to
   wait until the decision can be made. Commit is necessary.

4. CC type: 'crosslock AX -> crosslock BX'

   When acquiring BX, lockdep cannot identify the dependency because
   there's no way to know if it's in AX's release context. It has to
   wait until the decision can be made. Commit is necessary. But
   handling the CC type is not implemented yet; it's future work.

Lockdep can work without commit for typical locks, but the commit step
is necessary once crosslocks are involved. Introducing commit, lockdep
performs three steps. What lockdep does in each step is:

1. Acquisition: For typical locks, lockdep does what it originally did
   and queues the lock so that CT type dependencies can be checked using
   it at the commit step. For crosslocks, it saves data which will be
   used at the commit step and increases a reference count for it.

2. Commit: No action is required for typical locks. For crosslocks,
   lockdep adds CT type dependencies using the data saved at the
   acquisition step.

3. Release: No changes are required for typical locks. When a crosslock
   is released, it decreases a reference count for it.

CONCLUSION

Crossrelease introduces a commit step to handle dependencies of
crosslocks in batches at a proper time.


==============
Implementation
==============

Data structures
---------------

Crossrelease introduces two main data structures.

1. hist_lock

   This is an array embedded in task_struct, for keeping lock history so
   that dependencies can be added using it at the commit step.
   Since
   it's local data, it can be accessed locklessly in the owner context.
   The array is filled at the acquisition step and consumed at the
   commit step. And it's managed in a circular manner.

2. cross_lock

   One exists per lockdep_map. This is for keeping the data of
   crosslocks, and is used at the commit step.


How crossrelease works
----------------------

The key to how crossrelease works is to defer the necessary work to an
appropriate point in time and perform it all at once at the commit step.
Let's take a look with examples, step by step, starting from how lockdep
works without crossrelease for typical locks.

   acquire A /* Push A onto held_locks */
   acquire B /* Push B onto held_locks and add 'A -> B' */
   acquire C /* Push C onto held_locks and add 'B -> C' */
   release C /* Pop C from held_locks */
   release B /* Pop B from held_locks */
   release A /* Pop A from held_locks */

   where A, B and C are different lock classes.

   NOTE: This document assumes that readers already understand how
   lockdep works without crossrelease, and thus omits details. But
   there's one thing to note. Lockdep pretends to pop a lock from
   held_locks when releasing it. But it's subtly different from the
   original pop operation because lockdep allows other than the top to
   be popped.

In this case, lockdep adds a 'the top of held_locks -> the lock to
acquire' dependency every time a lock is acquired.

After adding 'A -> B', the dependency graph will be:

   A -> B

   where A and B are different lock classes.

And after adding 'B -> C', the graph will be:

   A -> B -> C

   where A, B and C are different lock classes.

Let's perform the commit step even for typical locks to add
dependencies.
Of course, the commit step is not necessary for them; however, it works
just as well because this is a more general way.

   acquire A
   /*
    * Queue A into hist_locks
    *
    * In hist_locks: A
    * In graph: Empty
    */

   acquire B
   /*
    * Queue B into hist_locks
    *
    * In hist_locks: A, B
    * In graph: Empty
    */

   acquire C
   /*
    * Queue C into hist_locks
    *
    * In hist_locks: A, B, C
    * In graph: Empty
    */

   commit C
   /*
    * Add 'C -> ?'
    * Answer the following to decide '?':
    * What has been queued since acquire C: Nothing
    *
    * In hist_locks: A, B, C
    * In graph: Empty
    */

   release C

   commit B
   /*
    * Add 'B -> ?'
    * Answer the following to decide '?':
    * What has been queued since acquire B: C
    *
    * In hist_locks: A, B, C
    * In graph: 'B -> C'
    */

   release B

   commit A
   /*
    * Add 'A -> ?'
    * Answer the following to decide '?':
    * What has been queued since acquire A: B, C
    *
    * In hist_locks: A, B, C
    * In graph: 'B -> C', 'A -> B', 'A -> C'
    */

   release A

   where A, B and C are different lock classes.

In this case, dependencies are added at the commit step as described.

After the commits for A, B and C, the graph will be:

   A -> B -> C

   where A, B and C are different lock classes.

   NOTE: A dependency 'A -> C' is optimized out.

We can see the former graph built without the commit step is the same as
the latter graph built using commit steps. Of course the former way
finishes building the graph earlier, which means we can detect a
deadlock or its possibility sooner. So the former way would be preferred
when possible. But we cannot avoid using the latter way for crosslocks.
Let's look at how commit steps work for crosslocks. In this case, the
commit step is performed only on the crosslock BX. And it is assumed
that the BX release context is different from the BX acquire context.

   BX RELEASE CONTEXT		   BX ACQUIRE CONTEXT
   ------------------		   ------------------
				   acquire A
				   /*
				    * Push A onto held_locks
				    * Queue A into hist_locks
				    *
				    * In held_locks: A
				    * In hist_locks: A
				    * In graph: Empty
				    */

				   acquire BX
				   /*
				    * Add 'the top of held_locks -> BX'
				    *
				    * In held_locks: A
				    * In hist_locks: A
				    * In graph: 'A -> BX'
				    */

   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   It must be guaranteed that the following operations are seen after
   acquiring BX globally. It can be done by things like barrier.
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

   acquire C
   /*
    * Push C onto held_locks
    * Queue C into hist_locks
    *
    * In held_locks: C
    * In hist_locks: C
    * In graph: 'A -> BX'
    */

   release C
   /*
    * Pop C from held_locks
    *
    * In held_locks: Empty
    * In hist_locks: C
    * In graph: 'A -> BX'
    */
				   acquire D
				   /*
				    * Push D onto held_locks
				    * Queue D into hist_locks
				    * Add 'the top of held_locks -> D'
				    *
				    * In held_locks: A, D
				    * In hist_locks: A, D
				    * In graph: 'A -> BX', 'A -> D'
				    */
   acquire E
   /*
    * Push E onto held_locks
    * Queue E into hist_locks
    *
    * In held_locks: E
    * In hist_locks: C, E
    * In graph: 'A -> BX', 'A -> D'
    */

   release E
   /*
    * Pop E from held_locks
    *
    * In held_locks: Empty
    * In hist_locks: C, E
    * In graph: 'A -> BX', 'A -> D'
    */
				   release D
				   /*
				    * Pop D from held_locks
				    *
				    * In held_locks: A
				    * In hist_locks: A, D
				    * In graph: 'A -> BX', 'A -> D'
				    */
   commit BX
   /*
    * Add 'BX -> ?'
    * What has been queued since acquire BX: C, E
    *
    * In held_locks: Empty
    * In hist_locks: C, E
    * In graph: 'A -> BX', 'A -> D',
    *           'BX -> C', 'BX -> E'
    */

   release BX
   /*
    * In held_locks: Empty
    * In hist_locks: C, E
    * In graph: 'A -> BX', 'A -> D',
    *           'BX -> C', 'BX -> E'
    */
				   release A
				   /*
				    * Pop A from held_locks
				    *
				    * In held_locks: Empty
				    * In hist_locks: A, D
				    * In graph: 'A -> BX', 'A -> D',
				    *           'BX -> C', 'BX -> E'
				    */

   where A, BX, C,..., E are different lock classes, and a suffix 'X' is
   added on crosslocks.

Crossrelease considers all acquisitions after acquiring BX as candidates
which might create dependencies with BX. True dependencies will be
determined when the release context of BX is identified. Meanwhile, all
typical locks are queued so that they can be used at the commit step.
And then the two dependencies 'BX -> C' and 'BX -> E' are added at the
commit step, when the release context is identified.

The final graph will be, with crossrelease:

            -> C
           /
     -> BX -
    /       \
   A         -> E
    \
     -> D

   where A, BX, C,..., E are different lock classes, and a suffix 'X' is
   added on crosslocks.

However, the final graph will be, without crossrelease:

   A -> D

   where A and D are different lock classes.

The former graph has three more dependencies, 'A -> BX', 'BX -> C' and
'BX -> E', giving additional opportunities to check if they cause
deadlocks. This way lockdep can detect a deadlock or its possibility
caused by crosslocks.

CONCLUSION

We checked how crossrelease works with several examples.
=============
Optimizations
=============

Avoid duplication
-----------------

The crossrelease feature uses a cache like the one lockdep already uses
for dependency chains, but this time for caching CT type dependencies.
Once a dependency is cached, the same one will never be added again.


Lockless for hot paths
----------------------

To keep all locks for later use at the commit step, crossrelease adopts
a local array embedded in task_struct, which makes access to the data
lockless by forcing it to happen only within the owner context. It's
like how lockdep handles held_locks. A lockless implementation is
important, since typical locks are very frequently acquired and
released.


==================================================
APPENDIX A: What lockdep does to work aggressively
==================================================

A deadlock actually occurs only when all the wait operations creating
circular dependencies run at the same time. Even when they don't,
though, a potential deadlock exists if the problematic dependencies
exist. Thus it's meaningful to detect not only an actual deadlock but
also its possibility, and the latter is the more valuable. When a
deadlock actually occurs, we can identify what happens in the system by
some means or other even without lockdep. However, there is no way to
detect a mere possibility without lockdep, short of auditing all the
code by hand. Lockdep does both; crossrelease focuses only on the
latter.

Whether or not a deadlock actually occurs depends on several factors.
For example, what order contexts are switched in is a factor.
Assuming
circular dependencies exist, a deadlock would occur when contexts are
switched so that all the wait operations creating the dependencies run
simultaneously. Thus, to detect a deadlock possibility even in the case
that it has not occurred yet, lockdep should consider all possible
combinations of dependencies, trying to:

1. Use a global dependency graph.

   Lockdep combines all dependencies into one global graph and uses
   them, regardless of which context generates them or what order
   contexts are switched in. Only aggregated dependencies are
   considered, so they are prone to be circular if a problem exists.

2. Check dependencies between classes instead of instances.

   What actually causes a deadlock are instances of locks. However,
   lockdep checks dependencies between classes instead of instances.
   This way lockdep can detect a deadlock which has not happened yet but
   might happen in the future with other instances of the same class.

3. Assume all acquisitions lead to waiting.

   Although locks might be acquired without waiting, which is essential
   to create dependencies, lockdep assumes all acquisitions lead to
   waiting, since it might be true some time or another.

CONCLUSION

Lockdep detects not only an actual deadlock but also its possibility,
and the latter is the more valuable.


==================================================
APPENDIX B: How to avoid adding false dependencies
==================================================

Recall what a dependency is. A dependency exists if:

   1. There are two waiters waiting for each event at a given time.
   2. The only way to wake up each waiter is to trigger its event.
   3. Whether one can be woken up depends on whether the other can.
For example:

   acquire A
   acquire B /* A dependency 'A -> B' exists */
   release B
   release A

   where A and B are different lock classes.

A dependency 'A -> B' exists since:

   1. A waiter for A and a waiter for B might exist when acquiring B.
   2. The only way to wake up each is to release what it waits for.
   3. Whether the waiter for A can be woken up depends on whether the
      other can. IOW, TASK X cannot release A if it fails to acquire B.

For another example:

   TASK X			TASK Y
   ------			------
				acquire AX
   acquire B /* A dependency 'AX -> B' exists */
   release B
   release AX held by Y

   where AX and B are different lock classes, and a suffix 'X' is added
   on crosslocks.

Even in this case involving crosslocks, the same rule can be applied. A
dependency 'AX -> B' exists since:

   1. A waiter for AX and a waiter for B might exist when acquiring B.
   2. The only way to wake up each is to release what it waits for.
   3. Whether the waiter for AX can be woken up depends on whether the
      other can. IOW, TASK X cannot release AX if it fails to acquire B.

Let's take a look at a more complicated example:

   TASK X			TASK Y
   ------			------
   acquire B
   release B
   fork Y
				acquire AX
   acquire C /* A dependency 'AX -> C' exists */
   release C
   release AX held by Y

   where AX, B and C are different lock classes, and a suffix 'X' is
   added on crosslocks.

Does a dependency 'AX -> B' exist? Nope.

Two waiters are essential to create a dependency. However, the waiters
for AX and B that could create 'AX -> B' cannot exist at the same time
in this example. Thus the dependency 'AX -> B' cannot be created.

It would be ideal if the full set of true dependencies could be
considered. But we can be sure of nothing but what actually happened.
By relying on what actually happens at runtime, we can at least add only
true dependencies, though they might be a subset of all the true ones.
It's similar to how lockdep works for typical locks: there might be more
true dependencies than what lockdep has detected at runtime, but lockdep
has no choice but to rely on what actually happens. Crossrelease relies
on it too.

CONCLUSION

Relying on what actually happens, lockdep can avoid adding false
dependencies.
-45
include/linux/completion.h
···
  */

 #include <linux/wait.h>
-#ifdef CONFIG_LOCKDEP_COMPLETIONS
-#include <linux/lockdep.h>
-#endif

 /*
  * struct completion - structure used to maintain state for a "completion"
···
 struct completion {
 	unsigned int done;
 	wait_queue_head_t wait;
-#ifdef CONFIG_LOCKDEP_COMPLETIONS
-	struct lockdep_map_cross map;
-#endif
 };

-#ifdef CONFIG_LOCKDEP_COMPLETIONS
-static inline void complete_acquire(struct completion *x)
-{
-	lock_acquire_exclusive((struct lockdep_map *)&x->map, 0, 0, NULL, _RET_IP_);
-}
-
-static inline void complete_release(struct completion *x)
-{
-	lock_release((struct lockdep_map *)&x->map, 0, _RET_IP_);
-}
-
-static inline void complete_release_commit(struct completion *x)
-{
-	lock_commit_crosslock((struct lockdep_map *)&x->map);
-}
-
-#define init_completion_map(x, m)					\
-do {									\
-	lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map,	\
-			(m)->name, (m)->key, 0);			\
-	__init_completion(x);						\
-} while (0)
-
-#define init_completion(x)						\
-do {									\
-	static struct lock_class_key __key;				\
-	lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map,	\
-			"(completion)" #x,				\
-			&__key, 0);					\
-	__init_completion(x);						\
-} while (0)
-#else
 #define init_completion_map(x, m) __init_completion(x)
 #define init_completion(x) __init_completion(x)
 static inline void complete_acquire(struct completion *x) {}
 static inline void complete_release(struct completion *x) {}
 static inline void complete_release_commit(struct completion *x) {}
-#endif

-#ifdef CONFIG_LOCKDEP_COMPLETIONS
-#define COMPLETION_INITIALIZER(work) \
-	{ 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait), \
-	STATIC_CROSS_LOCKDEP_MAP_INIT("(completion)" #work, &(work)) }
-#else
 #define COMPLETION_INITIALIZER(work) \
 	{ 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait) }
-#endif

 #define COMPLETION_INITIALIZER_ONSTACK_MAP(work, map) \
 	(*({ init_completion_map(&(work), &(map)); &(work); }))
-125
include/linux/lockdep.h
··· 158 158 int cpu; 159 159 unsigned long ip; 160 160 #endif 161 - #ifdef CONFIG_LOCKDEP_CROSSRELEASE 162 - /* 163 - * Whether it's a crosslock. 164 - */ 165 - int cross; 166 - #endif 167 161 }; 168 162 169 163 static inline void lockdep_copy_map(struct lockdep_map *to, ··· 261 267 unsigned int hardirqs_off:1; 262 268 unsigned int references:12; /* 32 bits */ 263 269 unsigned int pin_count; 264 - #ifdef CONFIG_LOCKDEP_CROSSRELEASE 265 - /* 266 - * Generation id. 267 - * 268 - * A value of cross_gen_id will be stored when holding this, 269 - * which is globally increased whenever each crosslock is held. 270 - */ 271 - unsigned int gen_id; 272 - #endif 273 270 }; 274 - 275 - #ifdef CONFIG_LOCKDEP_CROSSRELEASE 276 - #define MAX_XHLOCK_TRACE_ENTRIES 5 277 - 278 - /* 279 - * This is for keeping locks waiting for commit so that true dependencies 280 - * can be added at commit step. 281 - */ 282 - struct hist_lock { 283 - /* 284 - * Id for each entry in the ring buffer. This is used to 285 - * decide whether the ring buffer was overwritten or not. 286 - * 287 - * For example, 288 - * 289 - * |<----------- hist_lock ring buffer size ------->| 290 - * pppppppppppppppppppppiiiiiiiiiiiiiiiiiiiiiiiiiiiii 291 - * wrapped > iiiiiiiiiiiiiiiiiiiiiiiiiii....................... 292 - * 293 - * where 'p' represents an acquisition in process 294 - * context, 'i' represents an acquisition in irq 295 - * context. 296 - * 297 - * In this example, the ring buffer was overwritten by 298 - * acquisitions in irq context, that should be detected on 299 - * rollback or commit. 300 - */ 301 - unsigned int hist_id; 302 - 303 - /* 304 - * Seperate stack_trace data. This will be used at commit step. 305 - */ 306 - struct stack_trace trace; 307 - unsigned long trace_entries[MAX_XHLOCK_TRACE_ENTRIES]; 308 - 309 - /* 310 - * Seperate hlock instance. This will be used at commit step. 311 - * 312 - * TODO: Use a smaller data structure containing only necessary 313 - * data. 
However, we should make lockdep code able to handle the 314 - * smaller one first. 315 - */ 316 - struct held_lock hlock; 317 - }; 318 - 319 - /* 320 - * To initialize a lock as crosslock, lockdep_init_map_crosslock() should 321 - * be called instead of lockdep_init_map(). 322 - */ 323 - struct cross_lock { 324 - /* 325 - * When more than one acquisition of crosslocks are overlapped, 326 - * we have to perform commit for them based on cross_gen_id of 327 - * the first acquisition, which allows us to add more true 328 - * dependencies. 329 - * 330 - * Moreover, when no acquisition of a crosslock is in progress, 331 - * we should not perform commit because the lock might not exist 332 - * any more, which might cause incorrect memory access. So we 333 - * have to track the number of acquisitions of a crosslock. 334 - */ 335 - int nr_acquire; 336 - 337 - /* 338 - * Seperate hlock instance. This will be used at commit step. 339 - * 340 - * TODO: Use a smaller data structure containing only necessary 341 - * data. However, we should make lockdep code able to handle the 342 - * smaller one first. 343 - */ 344 - struct held_lock hlock; 345 - }; 346 - 347 - struct lockdep_map_cross { 348 - struct lockdep_map map; 349 - struct cross_lock xlock; 350 - }; 351 - #endif 352 271 353 272 /* 354 273 * Initialization, self-test and debugging-output methods: ··· 467 560 XHLOCK_CTX_NR, 468 561 }; 469 562 470 - #ifdef CONFIG_LOCKDEP_CROSSRELEASE 471 - extern void lockdep_init_map_crosslock(struct lockdep_map *lock, 472 - const char *name, 473 - struct lock_class_key *key, 474 - int subclass); 475 - extern void lock_commit_crosslock(struct lockdep_map *lock); 476 - 477 - /* 478 - * What we essencially have to initialize is 'nr_acquire'. Other members 479 - * will be initialized in add_xlock(). 
480 - */ 481 - #define STATIC_CROSS_LOCK_INIT() \ 482 - { .nr_acquire = 0,} 483 - 484 - #define STATIC_CROSS_LOCKDEP_MAP_INIT(_name, _key) \ 485 - { .map.name = (_name), .map.key = (void *)(_key), \ 486 - .map.cross = 1, .xlock = STATIC_CROSS_LOCK_INIT(), } 487 - 488 - /* 489 - * To initialize a lockdep_map statically use this macro. 490 - * Note that _name must not be NULL. 491 - */ 492 - #define STATIC_LOCKDEP_MAP_INIT(_name, _key) \ 493 - { .name = (_name), .key = (void *)(_key), .cross = 0, } 494 - 495 - extern void crossrelease_hist_start(enum xhlock_context_t c); 496 - extern void crossrelease_hist_end(enum xhlock_context_t c); 497 - extern void lockdep_invariant_state(bool force); 498 - extern void lockdep_init_task(struct task_struct *task); 499 - extern void lockdep_free_task(struct task_struct *task); 500 - #else /* !CROSSRELEASE */ 501 563 #define lockdep_init_map_crosslock(m, n, k, s) do {} while (0) 502 564 /* 503 565 * To initialize a lockdep_map statically use this macro. ··· 480 604 static inline void lockdep_invariant_state(bool force) {} 481 605 static inline void lockdep_init_task(struct task_struct *task) {} 482 606 static inline void lockdep_free_task(struct task_struct *task) {} 483 - #endif /* CROSSRELEASE */ 484 607 485 608 #ifdef CONFIG_LOCK_STAT 486 609
-11
include/linux/sched.h
··· 849 849 struct held_lock held_locks[MAX_LOCK_DEPTH]; 850 850 #endif 851 851 852 - #ifdef CONFIG_LOCKDEP_CROSSRELEASE 853 - #define MAX_XHLOCKS_NR 64UL 854 - struct hist_lock *xhlocks; /* Crossrelease history locks */ 855 - unsigned int xhlock_idx; 856 - /* For restoring at history boundaries */ 857 - unsigned int xhlock_idx_hist[XHLOCK_CTX_NR]; 858 - unsigned int hist_id; 859 - /* For overwrite check at each context exit */ 860 - unsigned int hist_id_save[XHLOCK_CTX_NR]; 861 - #endif 862 - 863 852 #ifdef CONFIG_UBSAN 864 853 unsigned int in_ubsan; 865 854 #endif
+38 -620
kernel/locking/lockdep.c
··· 57 57 #define CREATE_TRACE_POINTS 58 58 #include <trace/events/lock.h> 59 59 60 - #ifdef CONFIG_LOCKDEP_CROSSRELEASE 61 - #include <linux/slab.h> 62 - #endif 63 - 64 60 #ifdef CONFIG_PROVE_LOCKING 65 61 int prove_locking = 1; 66 62 module_param(prove_locking, int, 0644); ··· 70 74 #else 71 75 #define lock_stat 0 72 76 #endif 73 - 74 - #ifdef CONFIG_BOOTPARAM_LOCKDEP_CROSSRELEASE_FULLSTACK 75 - static int crossrelease_fullstack = 1; 76 - #else 77 - static int crossrelease_fullstack; 78 - #endif 79 - static int __init allow_crossrelease_fullstack(char *str) 80 - { 81 - crossrelease_fullstack = 1; 82 - return 0; 83 - } 84 - 85 - early_param("crossrelease_fullstack", allow_crossrelease_fullstack); 86 77 87 78 /* 88 79 * lockdep_lock: protects the lockdep graph, the hashes and the ··· 723 740 return is_static || static_obj(lock->key) ? NULL : ERR_PTR(-EINVAL); 724 741 } 725 742 726 - #ifdef CONFIG_LOCKDEP_CROSSRELEASE 727 - static void cross_init(struct lockdep_map *lock, int cross); 728 - static int cross_lock(struct lockdep_map *lock); 729 - static int lock_acquire_crosslock(struct held_lock *hlock); 730 - static int lock_release_crosslock(struct lockdep_map *lock); 731 - #else 732 - static inline void cross_init(struct lockdep_map *lock, int cross) {} 733 - static inline int cross_lock(struct lockdep_map *lock) { return 0; } 734 - static inline int lock_acquire_crosslock(struct held_lock *hlock) { return 2; } 735 - static inline int lock_release_crosslock(struct lockdep_map *lock) { return 2; } 736 - #endif 737 - 738 743 /* 739 744 * Register a lock's class in the hash-table, if the class is not present 740 745 * yet. Otherwise we look it up. 
We cache the result in the lock object ··· 1122 1151 printk(KERN_CONT "\n\n"); 1123 1152 } 1124 1153 1125 - if (cross_lock(tgt->instance)) { 1126 - printk(" Possible unsafe locking scenario by crosslock:\n\n"); 1127 - printk(" CPU0 CPU1\n"); 1128 - printk(" ---- ----\n"); 1129 - printk(" lock("); 1130 - __print_lock_name(parent); 1131 - printk(KERN_CONT ");\n"); 1132 - printk(" lock("); 1133 - __print_lock_name(target); 1134 - printk(KERN_CONT ");\n"); 1135 - printk(" lock("); 1136 - __print_lock_name(source); 1137 - printk(KERN_CONT ");\n"); 1138 - printk(" unlock("); 1139 - __print_lock_name(target); 1140 - printk(KERN_CONT ");\n"); 1141 - printk("\n *** DEADLOCK ***\n\n"); 1142 - } else { 1143 - printk(" Possible unsafe locking scenario:\n\n"); 1144 - printk(" CPU0 CPU1\n"); 1145 - printk(" ---- ----\n"); 1146 - printk(" lock("); 1147 - __print_lock_name(target); 1148 - printk(KERN_CONT ");\n"); 1149 - printk(" lock("); 1150 - __print_lock_name(parent); 1151 - printk(KERN_CONT ");\n"); 1152 - printk(" lock("); 1153 - __print_lock_name(target); 1154 - printk(KERN_CONT ");\n"); 1155 - printk(" lock("); 1156 - __print_lock_name(source); 1157 - printk(KERN_CONT ");\n"); 1158 - printk("\n *** DEADLOCK ***\n\n"); 1159 - } 1154 + printk(" Possible unsafe locking scenario:\n\n"); 1155 + printk(" CPU0 CPU1\n"); 1156 + printk(" ---- ----\n"); 1157 + printk(" lock("); 1158 + __print_lock_name(target); 1159 + printk(KERN_CONT ");\n"); 1160 + printk(" lock("); 1161 + __print_lock_name(parent); 1162 + printk(KERN_CONT ");\n"); 1163 + printk(" lock("); 1164 + __print_lock_name(target); 1165 + printk(KERN_CONT ");\n"); 1166 + printk(" lock("); 1167 + __print_lock_name(source); 1168 + printk(KERN_CONT ");\n"); 1169 + printk("\n *** DEADLOCK ***\n\n"); 1160 1170 } 1161 1171 1162 1172 /* ··· 1163 1211 curr->comm, task_pid_nr(curr)); 1164 1212 print_lock(check_src); 1165 1213 1166 - if (cross_lock(check_tgt->instance)) 1167 - pr_warn("\nbut now in release context of a crosslock 
acquired at the following:\n"); 1168 - else 1169 - pr_warn("\nbut task is already holding lock:\n"); 1214 + pr_warn("\nbut task is already holding lock:\n"); 1170 1215 1171 1216 print_lock(check_tgt); 1172 1217 pr_warn("\nwhich lock already depends on the new lock.\n\n"); ··· 1193 1244 if (!debug_locks_off_graph_unlock() || debug_locks_silent) 1194 1245 return 0; 1195 1246 1196 - if (cross_lock(check_tgt->instance)) 1197 - this->trace = *trace; 1198 - else if (!save_trace(&this->trace)) 1247 + if (!save_trace(&this->trace)) 1199 1248 return 0; 1200 1249 1201 1250 depth = get_lock_depth(target); ··· 1797 1850 if (nest) 1798 1851 return 2; 1799 1852 1800 - if (cross_lock(prev->instance)) 1801 - continue; 1802 - 1803 1853 return print_deadlock_bug(curr, prev, next); 1804 1854 } 1805 1855 return 1; ··· 1962 2018 for (;;) { 1963 2019 int distance = curr->lockdep_depth - depth + 1; 1964 2020 hlock = curr->held_locks + depth - 1; 1965 - /* 1966 - * Only non-crosslock entries get new dependencies added. 
1967 - * Crosslock entries will be added by commit later: 1968 - */ 1969 - if (!cross_lock(hlock->instance)) { 1970 - /* 1971 - * Only non-recursive-read entries get new dependencies 1972 - * added: 1973 - */ 1974 - if (hlock->read != 2 && hlock->check) { 1975 - int ret = check_prev_add(curr, hlock, next, 1976 - distance, &trace, save_trace); 1977 - if (!ret) 1978 - return 0; 1979 2021 1980 - /* 1981 - * Stop after the first non-trylock entry, 1982 - * as non-trylock entries have added their 1983 - * own direct dependencies already, so this 1984 - * lock is connected to them indirectly: 1985 - */ 1986 - if (!hlock->trylock) 1987 - break; 1988 - } 2022 + /* 2023 + * Only non-recursive-read entries get new dependencies 2024 + * added: 2025 + */ 2026 + if (hlock->read != 2 && hlock->check) { 2027 + int ret = check_prev_add(curr, hlock, next, distance, &trace, save_trace); 2028 + if (!ret) 2029 + return 0; 2030 + 2031 + /* 2032 + * Stop after the first non-trylock entry, 2033 + * as non-trylock entries have added their 2034 + * own direct dependencies already, so this 2035 + * lock is connected to them indirectly: 2036 + */ 2037 + if (!hlock->trylock) 2038 + break; 1989 2039 } 2040 + 1990 2041 depth--; 1991 2042 /* 1992 2043 * End of lock-stack? 
··· 3231 3292 void lockdep_init_map(struct lockdep_map *lock, const char *name, 3232 3293 struct lock_class_key *key, int subclass) 3233 3294 { 3234 - cross_init(lock, 0); 3235 3295 __lockdep_init_map(lock, name, key, subclass); 3236 3296 } 3237 3297 EXPORT_SYMBOL_GPL(lockdep_init_map); 3238 - 3239 - #ifdef CONFIG_LOCKDEP_CROSSRELEASE 3240 - void lockdep_init_map_crosslock(struct lockdep_map *lock, const char *name, 3241 - struct lock_class_key *key, int subclass) 3242 - { 3243 - cross_init(lock, 1); 3244 - __lockdep_init_map(lock, name, key, subclass); 3245 - } 3246 - EXPORT_SYMBOL_GPL(lockdep_init_map_crosslock); 3247 - #endif 3248 3298 3249 3299 struct lock_class_key __lockdep_no_validate__; 3250 3300 EXPORT_SYMBOL_GPL(__lockdep_no_validate__); ··· 3290 3362 int chain_head = 0; 3291 3363 int class_idx; 3292 3364 u64 chain_key; 3293 - int ret; 3294 3365 3295 3366 if (unlikely(!debug_locks)) 3296 3367 return 0; ··· 3338 3411 3339 3412 class_idx = class - lock_classes + 1; 3340 3413 3341 - /* TODO: nest_lock is not implemented for crosslock yet. */ 3342 - if (depth && !cross_lock(lock)) { 3414 + if (depth) { 3343 3415 hlock = curr->held_locks + depth - 1; 3344 3416 if (hlock->class_idx == class_idx && nest_lock) { 3345 3417 if (hlock->references) { ··· 3425 3499 3426 3500 if (!validate_chain(curr, lock, hlock, chain_head, chain_key)) 3427 3501 return 0; 3428 - 3429 - ret = lock_acquire_crosslock(hlock); 3430 - /* 3431 - * 2 means normal acquire operations are needed. Otherwise, it's 3432 - * ok just to return with '0:fail, 1:success'. 
3433 - */ 3434 - if (ret != 2) 3435 - return ret; 3436 3502 3437 3503 curr->curr_chain_key = chain_key; 3438 3504 curr->lockdep_depth++; ··· 3663 3745 struct task_struct *curr = current; 3664 3746 struct held_lock *hlock; 3665 3747 unsigned int depth; 3666 - int ret, i; 3748 + int i; 3667 3749 3668 3750 if (unlikely(!debug_locks)) 3669 3751 return 0; 3670 - 3671 - ret = lock_release_crosslock(lock); 3672 - /* 3673 - * 2 means normal release operations are needed. Otherwise, it's 3674 - * ok just to return with '0:fail, 1:success'. 3675 - */ 3676 - if (ret != 2) 3677 - return ret; 3678 3752 3679 3753 depth = curr->lockdep_depth; 3680 3754 /* ··· 4585 4675 dump_stack(); 4586 4676 } 4587 4677 EXPORT_SYMBOL_GPL(lockdep_rcu_suspicious); 4588 - 4589 - #ifdef CONFIG_LOCKDEP_CROSSRELEASE 4590 - 4591 - /* 4592 - * Crossrelease works by recording a lock history for each thread and 4593 - * connecting those historic locks that were taken after the 4594 - * wait_for_completion() in the complete() context. 4595 - * 4596 - * Task-A Task-B 4597 - * 4598 - * mutex_lock(&A); 4599 - * mutex_unlock(&A); 4600 - * 4601 - * wait_for_completion(&C); 4602 - * lock_acquire_crosslock(); 4603 - * atomic_inc_return(&cross_gen_id); 4604 - * | 4605 - * | mutex_lock(&B); 4606 - * | mutex_unlock(&B); 4607 - * | 4608 - * | complete(&C); 4609 - * `-- lock_commit_crosslock(); 4610 - * 4611 - * Which will then add a dependency between B and C. 4612 - */ 4613 - 4614 - #define xhlock(i) (current->xhlocks[(i) % MAX_XHLOCKS_NR]) 4615 - 4616 - /* 4617 - * Whenever a crosslock is held, cross_gen_id will be increased. 4618 - */ 4619 - static atomic_t cross_gen_id; /* Can be wrapped */ 4620 - 4621 - /* 4622 - * Make an entry of the ring buffer invalid. 4623 - */ 4624 - static inline void invalidate_xhlock(struct hist_lock *xhlock) 4625 - { 4626 - /* 4627 - * Normally, xhlock->hlock.instance must be !NULL. 
4628 - */ 4629 - xhlock->hlock.instance = NULL; 4630 - } 4631 - 4632 - /* 4633 - * Lock history stacks; we have 2 nested lock history stacks: 4634 - * 4635 - * HARD(IRQ) 4636 - * SOFT(IRQ) 4637 - * 4638 - * The thing is that once we complete a HARD/SOFT IRQ the future task locks 4639 - * should not depend on any of the locks observed while running the IRQ. So 4640 - * what we do is rewind the history buffer and erase all our knowledge of that 4641 - * temporal event. 4642 - */ 4643 - 4644 - void crossrelease_hist_start(enum xhlock_context_t c) 4645 - { 4646 - struct task_struct *cur = current; 4647 - 4648 - if (!cur->xhlocks) 4649 - return; 4650 - 4651 - cur->xhlock_idx_hist[c] = cur->xhlock_idx; 4652 - cur->hist_id_save[c] = cur->hist_id; 4653 - } 4654 - 4655 - void crossrelease_hist_end(enum xhlock_context_t c) 4656 - { 4657 - struct task_struct *cur = current; 4658 - 4659 - if (cur->xhlocks) { 4660 - unsigned int idx = cur->xhlock_idx_hist[c]; 4661 - struct hist_lock *h = &xhlock(idx); 4662 - 4663 - cur->xhlock_idx = idx; 4664 - 4665 - /* Check if the ring was overwritten. */ 4666 - if (h->hist_id != cur->hist_id_save[c]) 4667 - invalidate_xhlock(h); 4668 - } 4669 - } 4670 - 4671 - /* 4672 - * lockdep_invariant_state() is used to annotate independence inside a task, to 4673 - * make one task look like multiple independent 'tasks'. 4674 - * 4675 - * Take for instance workqueues; each work is independent of the last. The 4676 - * completion of a future work does not depend on the completion of a past work 4677 - * (in general). Therefore we must not carry that (lock) dependency across 4678 - * works. 4679 - * 4680 - * This is true for many things; pretty much all kthreads fall into this 4681 - * pattern, where they have an invariant state and future completions do not 4682 - * depend on past completions. Its just that since they all have the 'same' 4683 - * form -- the kthread does the same over and over -- it doesn't typically 4684 - * matter. 
4685 - * 4686 - * The same is true for system-calls, once a system call is completed (we've 4687 - * returned to userspace) the next system call does not depend on the lock 4688 - * history of the previous system call. 4689 - * 4690 - * They key property for independence, this invariant state, is that it must be 4691 - * a point where we hold no locks and have no history. Because if we were to 4692 - * hold locks, the restore at _end() would not necessarily recover it's history 4693 - * entry. Similarly, independence per-definition means it does not depend on 4694 - * prior state. 4695 - */ 4696 - void lockdep_invariant_state(bool force) 4697 - { 4698 - /* 4699 - * We call this at an invariant point, no current state, no history. 4700 - * Verify the former, enforce the latter. 4701 - */ 4702 - WARN_ON_ONCE(!force && current->lockdep_depth); 4703 - if (current->xhlocks) 4704 - invalidate_xhlock(&xhlock(current->xhlock_idx)); 4705 - } 4706 - 4707 - static int cross_lock(struct lockdep_map *lock) 4708 - { 4709 - return lock ? lock->cross : 0; 4710 - } 4711 - 4712 - /* 4713 - * This is needed to decide the relationship between wrapable variables. 4714 - */ 4715 - static inline int before(unsigned int a, unsigned int b) 4716 - { 4717 - return (int)(a - b) < 0; 4718 - } 4719 - 4720 - static inline struct lock_class *xhlock_class(struct hist_lock *xhlock) 4721 - { 4722 - return hlock_class(&xhlock->hlock); 4723 - } 4724 - 4725 - static inline struct lock_class *xlock_class(struct cross_lock *xlock) 4726 - { 4727 - return hlock_class(&xlock->hlock); 4728 - } 4729 - 4730 - /* 4731 - * Should we check a dependency with previous one? 4732 - */ 4733 - static inline int depend_before(struct held_lock *hlock) 4734 - { 4735 - return hlock->read != 2 && hlock->check && !hlock->trylock; 4736 - } 4737 - 4738 - /* 4739 - * Should we check a dependency with next one? 
4740 - */ 4741 - static inline int depend_after(struct held_lock *hlock) 4742 - { 4743 - return hlock->read != 2 && hlock->check; 4744 - } 4745 - 4746 - /* 4747 - * Check if the xhlock is valid, which would be false if, 4748 - * 4749 - * 1. Has not used after initializaion yet. 4750 - * 2. Got invalidated. 4751 - * 4752 - * Remind hist_lock is implemented as a ring buffer. 4753 - */ 4754 - static inline int xhlock_valid(struct hist_lock *xhlock) 4755 - { 4756 - /* 4757 - * xhlock->hlock.instance must be !NULL. 4758 - */ 4759 - return !!xhlock->hlock.instance; 4760 - } 4761 - 4762 - /* 4763 - * Record a hist_lock entry. 4764 - * 4765 - * Irq disable is only required. 4766 - */ 4767 - static void add_xhlock(struct held_lock *hlock) 4768 - { 4769 - unsigned int idx = ++current->xhlock_idx; 4770 - struct hist_lock *xhlock = &xhlock(idx); 4771 - 4772 - #ifdef CONFIG_DEBUG_LOCKDEP 4773 - /* 4774 - * This can be done locklessly because they are all task-local 4775 - * state, we must however ensure IRQs are disabled. 4776 - */ 4777 - WARN_ON_ONCE(!irqs_disabled()); 4778 - #endif 4779 - 4780 - /* Initialize hist_lock's members */ 4781 - xhlock->hlock = *hlock; 4782 - xhlock->hist_id = ++current->hist_id; 4783 - 4784 - xhlock->trace.nr_entries = 0; 4785 - xhlock->trace.max_entries = MAX_XHLOCK_TRACE_ENTRIES; 4786 - xhlock->trace.entries = xhlock->trace_entries; 4787 - 4788 - if (crossrelease_fullstack) { 4789 - xhlock->trace.skip = 3; 4790 - save_stack_trace(&xhlock->trace); 4791 - } else { 4792 - xhlock->trace.nr_entries = 1; 4793 - xhlock->trace.entries[0] = hlock->acquire_ip; 4794 - } 4795 - } 4796 - 4797 - static inline int same_context_xhlock(struct hist_lock *xhlock) 4798 - { 4799 - return xhlock->hlock.irq_context == task_irq_context(current); 4800 - } 4801 - 4802 - /* 4803 - * This should be lockless as far as possible because this would be 4804 - * called very frequently. 
4805 - */ 4806 - static void check_add_xhlock(struct held_lock *hlock) 4807 - { 4808 - /* 4809 - * Record a hist_lock, only in case that acquisitions ahead 4810 - * could depend on the held_lock. For example, if the held_lock 4811 - * is trylock then acquisitions ahead never depends on that. 4812 - * In that case, we don't need to record it. Just return. 4813 - */ 4814 - if (!current->xhlocks || !depend_before(hlock)) 4815 - return; 4816 - 4817 - add_xhlock(hlock); 4818 - } 4819 - 4820 - /* 4821 - * For crosslock. 4822 - */ 4823 - static int add_xlock(struct held_lock *hlock) 4824 - { 4825 - struct cross_lock *xlock; 4826 - unsigned int gen_id; 4827 - 4828 - if (!graph_lock()) 4829 - return 0; 4830 - 4831 - xlock = &((struct lockdep_map_cross *)hlock->instance)->xlock; 4832 - 4833 - /* 4834 - * When acquisitions for a crosslock are overlapped, we use 4835 - * nr_acquire to perform commit for them, based on cross_gen_id 4836 - * of the first acquisition, which allows to add additional 4837 - * dependencies. 4838 - * 4839 - * Moreover, when no acquisition of a crosslock is in progress, 4840 - * we should not perform commit because the lock might not exist 4841 - * any more, which might cause incorrect memory access. So we 4842 - * have to track the number of acquisitions of a crosslock. 4843 - * 4844 - * depend_after() is necessary to initialize only the first 4845 - * valid xlock so that the xlock can be used on its commit. 4846 - */ 4847 - if (xlock->nr_acquire++ && depend_after(&xlock->hlock)) 4848 - goto unlock; 4849 - 4850 - gen_id = (unsigned int)atomic_inc_return(&cross_gen_id); 4851 - xlock->hlock = *hlock; 4852 - xlock->hlock.gen_id = gen_id; 4853 - unlock: 4854 - graph_unlock(); 4855 - return 1; 4856 - } 4857 - 4858 - /* 4859 - * Called for both normal and crosslock acquires. Normal locks will be 4860 - * pushed on the hist_lock queue. 
Cross locks will record state and 4861 - * stop regular lock_acquire() to avoid being placed on the held_lock 4862 - * stack. 4863 - * 4864 - * Return: 0 - failure; 4865 - * 1 - crosslock, done; 4866 - * 2 - normal lock, continue to held_lock[] ops. 4867 - */ 4868 - static int lock_acquire_crosslock(struct held_lock *hlock) 4869 - { 4870 - /* 4871 - * CONTEXT 1 CONTEXT 2 4872 - * --------- --------- 4873 - * lock A (cross) 4874 - * X = atomic_inc_return(&cross_gen_id) 4875 - * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 4876 - * Y = atomic_read_acquire(&cross_gen_id) 4877 - * lock B 4878 - * 4879 - * atomic_read_acquire() is for ordering between A and B, 4880 - * IOW, A happens before B, when CONTEXT 2 see Y >= X. 4881 - * 4882 - * Pairs with atomic_inc_return() in add_xlock(). 4883 - */ 4884 - hlock->gen_id = (unsigned int)atomic_read_acquire(&cross_gen_id); 4885 - 4886 - if (cross_lock(hlock->instance)) 4887 - return add_xlock(hlock); 4888 - 4889 - check_add_xhlock(hlock); 4890 - return 2; 4891 - } 4892 - 4893 - static int copy_trace(struct stack_trace *trace) 4894 - { 4895 - unsigned long *buf = stack_trace + nr_stack_trace_entries; 4896 - unsigned int max_nr = MAX_STACK_TRACE_ENTRIES - nr_stack_trace_entries; 4897 - unsigned int nr = min(max_nr, trace->nr_entries); 4898 - 4899 - trace->nr_entries = nr; 4900 - memcpy(buf, trace->entries, nr * sizeof(trace->entries[0])); 4901 - trace->entries = buf; 4902 - nr_stack_trace_entries += nr; 4903 - 4904 - if (nr_stack_trace_entries >= MAX_STACK_TRACE_ENTRIES-1) { 4905 - if (!debug_locks_off_graph_unlock()) 4906 - return 0; 4907 - 4908 - print_lockdep_off("BUG: MAX_STACK_TRACE_ENTRIES too low!"); 4909 - dump_stack(); 4910 - 4911 - return 0; 4912 - } 4913 - 4914 - return 1; 4915 - } 4916 - 4917 - static int commit_xhlock(struct cross_lock *xlock, struct hist_lock *xhlock) 4918 - { 4919 - unsigned int xid, pid; 4920 - u64 chain_key; 4921 - 4922 - xid = xlock_class(xlock) - lock_classes; 4923 - chain_key = iterate_chain_key((u64)0, 
xid); 4924 - pid = xhlock_class(xhlock) - lock_classes; 4925 - chain_key = iterate_chain_key(chain_key, pid); 4926 - 4927 - if (lookup_chain_cache(chain_key)) 4928 - return 1; 4929 - 4930 - if (!add_chain_cache_classes(xid, pid, xhlock->hlock.irq_context, 4931 - chain_key)) 4932 - return 0; 4933 - 4934 - if (!check_prev_add(current, &xlock->hlock, &xhlock->hlock, 1, 4935 - &xhlock->trace, copy_trace)) 4936 - return 0; 4937 - 4938 - return 1; 4939 - } 4940 - 4941 - static void commit_xhlocks(struct cross_lock *xlock) 4942 - { 4943 - unsigned int cur = current->xhlock_idx; 4944 - unsigned int prev_hist_id = xhlock(cur).hist_id; 4945 - unsigned int i; 4946 - 4947 - if (!graph_lock()) 4948 - return; 4949 - 4950 - if (xlock->nr_acquire) { 4951 - for (i = 0; i < MAX_XHLOCKS_NR; i++) { 4952 - struct hist_lock *xhlock = &xhlock(cur - i); 4953 - 4954 - if (!xhlock_valid(xhlock)) 4955 - break; 4956 - 4957 - if (before(xhlock->hlock.gen_id, xlock->hlock.gen_id)) 4958 - break; 4959 - 4960 - if (!same_context_xhlock(xhlock)) 4961 - break; 4962 - 4963 - /* 4964 - * Filter out the cases where the ring buffer was 4965 - * overwritten and the current entry has a bigger 4966 - * hist_id than the previous one, which is impossible 4967 - * otherwise: 4968 - */ 4969 - if (unlikely(before(prev_hist_id, xhlock->hist_id))) 4970 - break; 4971 - 4972 - prev_hist_id = xhlock->hist_id; 4973 - 4974 - /* 4975 - * commit_xhlock() returns 0 with graph_lock already 4976 - * released if fail. 
4977 - */ 4978 - if (!commit_xhlock(xlock, xhlock)) 4979 - return; 4980 - } 4981 - } 4982 - 4983 - graph_unlock(); 4984 - } 4985 - 4986 - void lock_commit_crosslock(struct lockdep_map *lock) 4987 - { 4988 - struct cross_lock *xlock; 4989 - unsigned long flags; 4990 - 4991 - if (unlikely(!debug_locks || current->lockdep_recursion)) 4992 - return; 4993 - 4994 - if (!current->xhlocks) 4995 - return; 4996 - 4997 - /* 4998 - * Do commit hist_locks with the cross_lock, only in case that 4999 - * the cross_lock could depend on acquisitions after that. 5000 - * 5001 - * For example, if the cross_lock does not have the 'check' flag 5002 - * then we don't need to check dependencies and commit for that. 5003 - * Just skip it. In that case, of course, the cross_lock does 5004 - * not depend on acquisitions ahead, either. 5005 - * 5006 - * WARNING: Don't do that in add_xlock() in advance. When an 5007 - * acquisition context is different from the commit context, 5008 - * invalid(skipped) cross_lock might be accessed. 5009 - */ 5010 - if (!depend_after(&((struct lockdep_map_cross *)lock)->xlock.hlock)) 5011 - return; 5012 - 5013 - raw_local_irq_save(flags); 5014 - check_flags(flags); 5015 - current->lockdep_recursion = 1; 5016 - xlock = &((struct lockdep_map_cross *)lock)->xlock; 5017 - commit_xhlocks(xlock); 5018 - current->lockdep_recursion = 0; 5019 - raw_local_irq_restore(flags); 5020 - } 5021 - EXPORT_SYMBOL_GPL(lock_commit_crosslock); 5022 - 5023 - /* 5024 - * Return: 0 - failure; 5025 - * 1 - crosslock, done; 5026 - * 2 - normal lock, continue to held_lock[] ops. 
5027 - */ 5028 - static int lock_release_crosslock(struct lockdep_map *lock) 5029 - { 5030 - if (cross_lock(lock)) { 5031 - if (!graph_lock()) 5032 - return 0; 5033 - ((struct lockdep_map_cross *)lock)->xlock.nr_acquire--; 5034 - graph_unlock(); 5035 - return 1; 5036 - } 5037 - return 2; 5038 - } 5039 - 5040 - static void cross_init(struct lockdep_map *lock, int cross) 5041 - { 5042 - if (cross) 5043 - ((struct lockdep_map_cross *)lock)->xlock.nr_acquire = 0; 5044 - 5045 - lock->cross = cross; 5046 - 5047 - /* 5048 - * Crossrelease assumes that the ring buffer size of xhlocks 5049 - * is aligned with power of 2. So force it on build. 5050 - */ 5051 - BUILD_BUG_ON(MAX_XHLOCKS_NR & (MAX_XHLOCKS_NR - 1)); 5052 - } 5053 - 5054 - void lockdep_init_task(struct task_struct *task) 5055 - { 5056 - int i; 5057 - 5058 - task->xhlock_idx = UINT_MAX; 5059 - task->hist_id = 0; 5060 - 5061 - for (i = 0; i < XHLOCK_CTX_NR; i++) { 5062 - task->xhlock_idx_hist[i] = UINT_MAX; 5063 - task->hist_id_save[i] = 0; 5064 - } 5065 - 5066 - task->xhlocks = kzalloc(sizeof(struct hist_lock) * MAX_XHLOCKS_NR, 5067 - GFP_KERNEL); 5068 - } 5069 - 5070 - void lockdep_free_task(struct task_struct *task) 5071 - { 5072 - if (task->xhlocks) { 5073 - void *tmp = task->xhlocks; 5074 - /* Diable crossrelease for current */ 5075 - task->xhlocks = NULL; 5076 - kfree(tmp); 5077 - } 5078 - } 5079 - #endif
-33
lib/Kconfig.debug
··· 1099 1099 select DEBUG_MUTEXES 1100 1100 select DEBUG_RT_MUTEXES if RT_MUTEXES 1101 1101 select DEBUG_LOCK_ALLOC 1102 - select LOCKDEP_CROSSRELEASE 1103 - select LOCKDEP_COMPLETIONS 1104 1102 select TRACE_IRQFLAGS 1105 1103 default n 1106 1104 help ··· 1167 1169 1168 1170 CONFIG_LOCK_STAT defines "contended" and "acquired" lock events. 1169 1171 (CONFIG_LOCKDEP defines "acquire" and "release" events.) 1170 - 1171 - config LOCKDEP_CROSSRELEASE 1172 - bool 1173 - help 1174 - This makes lockdep work for crosslock which is a lock allowed to 1175 - be released in a different context from the acquisition context. 1176 - Normally a lock must be released in the context acquiring the lock. 1177 - However, relexing this constraint helps synchronization primitives 1178 - such as page locks or completions can use the lock correctness 1179 - detector, lockdep. 1180 - 1181 - config LOCKDEP_COMPLETIONS 1182 - bool 1183 - help 1184 - A deadlock caused by wait_for_completion() and complete() can be 1185 - detected by lockdep using crossrelease feature. 1186 - 1187 - config BOOTPARAM_LOCKDEP_CROSSRELEASE_FULLSTACK 1188 - bool "Enable the boot parameter, crossrelease_fullstack" 1189 - depends on LOCKDEP_CROSSRELEASE 1190 - default n 1191 - help 1192 - The lockdep "cross-release" feature needs to record stack traces 1193 - (of calling functions) for all acquisitions, for eventual later 1194 - use during analysis. By default only a single caller is recorded, 1195 - because the unwind operation can be very expensive with deeper 1196 - stack chains. 1197 - 1198 - However a boot parameter, crossrelease_fullstack, was 1199 - introduced since sometimes deeper traces are required for full 1200 - analysis. This option turns on the boot parameter. 1201 1172 1202 1173 config DEBUG_LOCKDEP 1203 1174 bool "Lock dependency engine debugging"