Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

rcu: Document interpretation of RCU-lockdep splats

There has been quite a bit of confusion about what RCU-lockdep splats
mean, so this commit adds some documentation describing how to
interpret them.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

+110
+110
Documentation/RCU/lockdep-splat.txt
··· 1 + Lockdep-RCU was added to the Linux kernel in early 2010 2 + (http://lwn.net/Articles/371986/). This facility checks for some common 3 + misuses of the RCU API, most notably using one of the rcu_dereference() 4 + family to access an RCU-protected pointer without the proper protection. 5 + When such misuse is detected, an lockdep-RCU splat is emitted. 6 + 7 + The usual cause of a lockdep-RCU slat is someone accessing an 8 + RCU-protected data structure without either (1) being in the right kind of 9 + RCU read-side critical section or (2) holding the right update-side lock. 10 + This problem can therefore be serious: it might result in random memory 11 + overwriting or worse. There can of course be false positives, this 12 + being the real world and all that. 13 + 14 + So let's look at an example RCU lockdep splat from 3.0-rc5, one that 15 + has long since been fixed: 16 + 17 + =============================== 18 + [ INFO: suspicious RCU usage. ] 19 + ------------------------------- 20 + block/cfq-iosched.c:2776 suspicious rcu_dereference_protected() usage! 21 + 22 + other info that might help us debug this: 23 + 24 + 25 + rcu_scheduler_active = 1, debug_locks = 0 26 + 3 locks held by scsi_scan_6/1552: 27 + #0: (&shost->scan_mutex){+.+.+.}, at: [<ffffffff8145efca>] 28 + scsi_scan_host_selected+0x5a/0x150 29 + #1: (&eq->sysfs_lock){+.+...}, at: [<ffffffff812a5032>] 30 + elevator_exit+0x22/0x60 31 + #2: (&(&q->__queue_lock)->rlock){-.-...}, at: [<ffffffff812b6233>] 32 + cfq_exit_queue+0x43/0x190 33 + 34 + stack backtrace: 35 + Pid: 1552, comm: scsi_scan_6 Not tainted 3.0.0-rc5 #17 36 + Call Trace: 37 + [<ffffffff810abb9b>] lockdep_rcu_dereference+0xbb/0xc0 38 + [<ffffffff812b6139>] __cfq_exit_single_io_context+0xe9/0x120 39 + [<ffffffff812b626c>] cfq_exit_queue+0x7c/0x190 40 + [<ffffffff812a5046>] elevator_exit+0x36/0x60 41 + [<ffffffff812a802a>] blk_cleanup_queue+0x4a/0x60 42 + [<ffffffff8145cc09>] scsi_free_queue+0x9/0x10 43 + [<ffffffff81460944>] __scsi_remove_device+0x84/0xd0 44 + [<ffffffff8145dca3>] scsi_probe_and_add_lun+0x353/0xb10 45 + [<ffffffff817da069>] ? error_exit+0x29/0xb0 46 + [<ffffffff817d98ed>] ? _raw_spin_unlock_irqrestore+0x3d/0x80 47 + [<ffffffff8145e722>] __scsi_scan_target+0x112/0x680 48 + [<ffffffff812c690d>] ? trace_hardirqs_off_thunk+0x3a/0x3c 49 + [<ffffffff817da069>] ? error_exit+0x29/0xb0 50 + [<ffffffff812bcc60>] ? kobject_del+0x40/0x40 51 + [<ffffffff8145ed16>] scsi_scan_channel+0x86/0xb0 52 + [<ffffffff8145f0b0>] scsi_scan_host_selected+0x140/0x150 53 + [<ffffffff8145f149>] do_scsi_scan_host+0x89/0x90 54 + [<ffffffff8145f170>] do_scan_async+0x20/0x160 55 + [<ffffffff8145f150>] ? do_scsi_scan_host+0x90/0x90 56 + [<ffffffff810975b6>] kthread+0xa6/0xb0 57 + [<ffffffff817db154>] kernel_thread_helper+0x4/0x10 58 + [<ffffffff81066430>] ? finish_task_switch+0x80/0x110 59 + [<ffffffff817d9c04>] ? retint_restore_args+0xe/0xe 60 + [<ffffffff81097510>] ? __init_kthread_worker+0x70/0x70 61 + [<ffffffff817db150>] ? gs_change+0xb/0xb 62 + 63 + Line 2776 of block/cfq-iosched.c in v3.0-rc5 is as follows: 64 + 65 + if (rcu_dereference(ioc->ioc_data) == cic) { 66 + 67 + This form says that it must be in a plain vanilla RCU read-side critical 68 + section, but the "other info" list above shows that this is not the 69 + case. Instead, we hold three locks, one of which might be RCU related. 70 + And maybe that lock really does protect this reference. If so, the fix 71 + is to inform RCU, perhaps by changing __cfq_exit_single_io_context() to 72 + take the struct request_queue "q" from cfq_exit_queue() as an argument, 73 + which would permit us to invoke rcu_dereference_protected as follows: 74 + 75 + if (rcu_dereference_protected(ioc->ioc_data, 76 + lockdep_is_held(&q->queue_lock)) == cic) { 77 + 78 + With this change, there would be no lockdep-RCU splat emitted if this 79 + code was invoked either from within an RCU read-side critical section 80 + or with the ->queue_lock held. In particular, this would have suppressed 81 + the above lockdep-RCU splat because ->queue_lock is held (see #2 in the 82 + list above). 83 + 84 + On the other hand, perhaps we really do need an RCU read-side critical 85 + section. In this case, the critical section must span the use of the 86 + return value from rcu_dereference(), or at least until there is some 87 + reference count incremented or some such. One way to handle this is to 88 + add rcu_read_lock() and rcu_read_unlock() as follows: 89 + 90 + rcu_read_lock(); 91 + if (rcu_dereference(ioc->ioc_data) == cic) { 92 + spin_lock(&ioc->lock); 93 + rcu_assign_pointer(ioc->ioc_data, NULL); 94 + spin_unlock(&ioc->lock); 95 + } 96 + rcu_read_unlock(); 97 + 98 + With this change, the rcu_dereference() is always within an RCU 99 + read-side critical section, which again would have suppressed the 100 + above lockdep-RCU splat. 101 + 102 + But in this particular case, we don't actually deference the pointer 103 + returned from rcu_dereference(). Instead, that pointer is just compared 104 + to the cic pointer, which means that the rcu_dereference() can be replaced 105 + by rcu_access_pointer() as follows: 106 + 107 + if (rcu_access_pointer(ioc->ioc_data) == cic) { 108 + 109 + Because it is legal to invoke rcu_access_pointer() without protection, 110 + this change would also suppress the above lockdep-RCU splat.