Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

timers: handle HRTIMER_CB_IRQSAFE_UNLOCKED correctly from softirq context

Impact: fix incorrect locking triggered during hotplug-intense stress-tests

While migrating the CB_IRQSAFE_UNLOCKED timers during a cpu-offline,
we queue them on the cb_pending list, so that they won't go
stale.

As a result, the callbacks of these timers run from softirq context,
where they can deadlock, since these callbacks assume that they are
running with irqs disabled. This also annoys lockdep!

Fix this by emulating hardirq context while running these callbacks from
the hrtimer softirq.

=================================
[ INFO: inconsistent lock state ]
2.6.27 #2
--------------------------------
inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
ksoftirqd/0/4 [HC0[0]:SC1[1]:HE1:SE0] takes:
(&rq->lock){++..}, at: [<c011db84>] sched_rt_period_timer+0x9e/0x1fc
{in-hardirq-W} state was registered at:
[<c014103c>] __lock_acquire+0x549/0x121e
[<c0107890>] native_sched_clock+0x88/0x99
[<c013aa12>] clocksource_get_next+0x39/0x3f
[<c0139abc>] update_wall_time+0x616/0x7df
[<c0141d6b>] lock_acquire+0x5a/0x74
[<c0121724>] scheduler_tick+0x3a/0x18d
[<c047ed45>] _spin_lock+0x1c/0x45
[<c0121724>] scheduler_tick+0x3a/0x18d
[<c0121724>] scheduler_tick+0x3a/0x18d
[<c012c436>] update_process_times+0x3a/0x44
[<c013c044>] tick_periodic+0x63/0x6d
[<c013c062>] tick_handle_periodic+0x14/0x5e
[<c010568c>] timer_interrupt+0x44/0x4a
[<c0150c9f>] handle_IRQ_event+0x13/0x3d
[<c0151c14>] handle_level_irq+0x79/0xbd
[<c0105634>] do_IRQ+0x69/0x7d
[<c01041e4>] common_interrupt+0x28/0x30
[<c047007b>] aac_probe_one+0x1a3/0x3f3
[<c047ec2d>] _spin_unlock_irqrestore+0x36/0x39
[<c01512b4>] setup_irq+0x1be/0x1f9
[<c065d70b>] start_kernel+0x259/0x2c5
[<ffffffff>] 0xffffffff
irq event stamp: 50102
hardirqs last enabled at (50102): [<c047ebf4>] _spin_unlock_irq+0x20/0x23
hardirqs last disabled at (50101): [<c047edc2>] _spin_lock_irq+0xa/0x4b
softirqs last enabled at (50088): [<c0128ba6>] do_softirq+0x37/0x4d
softirqs last disabled at (50099): [<c0128ba6>] do_softirq+0x37/0x4d

other info that might help us debug this:
no locks held by ksoftirqd/0/4.

stack backtrace:
Pid: 4, comm: ksoftirqd/0 Not tainted 2.6.27 #2
[<c013f6cb>] print_usage_bug+0x13e/0x147
[<c013fef5>] mark_lock+0x493/0x797
[<c01410b1>] __lock_acquire+0x5be/0x121e
[<c0141d6b>] lock_acquire+0x5a/0x74
[<c011db84>] sched_rt_period_timer+0x9e/0x1fc
[<c047ed45>] _spin_lock+0x1c/0x45
[<c011db84>] sched_rt_period_timer+0x9e/0x1fc
[<c011db84>] sched_rt_period_timer+0x9e/0x1fc
[<c01210fd>] finish_task_switch+0x41/0xbd
[<c0107890>] native_sched_clock+0x88/0x99
[<c011dae6>] sched_rt_period_timer+0x0/0x1fc
[<c0136dda>] run_hrtimer_pending+0x54/0xe5
[<c011dae6>] sched_rt_period_timer+0x0/0x1fc
[<c0128afb>] __do_softirq+0x7b/0xef
[<c0128ba6>] do_softirq+0x37/0x4d
[<c0128c12>] ksoftirqd+0x56/0xc5
[<c0128bbc>] ksoftirqd+0x0/0xc5
[<c0134649>] kthread+0x38/0x5d
[<c0134611>] kthread+0x0/0x5d
[<c0104477>] kernel_thread_helper+0x7/0x10
=======================

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

authored by Gautham R Shenoy and committed by Thomas Gleixner
5d5254f0 ae99286b

+16 -1
kernel/hrtimer.c
···
 		enum hrtimer_restart (*fn)(struct hrtimer *);
 		struct hrtimer *timer;
 		int restart;
+		int emulate_hardirq_ctx = 0;
 
 		timer = list_entry(cpu_base->cb_pending.next,
 				   struct hrtimer, cb_entry);
···
 		timer_stats_account_hrtimer(timer);
 
 		fn = timer->function;
+		/*
+		 * A timer might have been added to the cb_pending list
+		 * when it was migrated during a cpu-offline operation.
+		 * Emulate hardirq context for such timers.
+		 */
+		if (timer->cb_mode == HRTIMER_CB_IRQSAFE_PERCPU ||
+		    timer->cb_mode == HRTIMER_CB_IRQSAFE_UNLOCKED)
+			emulate_hardirq_ctx = 1;
+
 		__remove_hrtimer(timer, timer->base, HRTIMER_STATE_CALLBACK, 0);
 		spin_unlock_irq(&cpu_base->lock);
 
-		restart = fn(timer);
+		if (unlikely(emulate_hardirq_ctx)) {
+			local_irq_disable();
+			restart = fn(timer);
+			local_irq_enable();
+		} else
+			restart = fn(timer);
 
 		spin_lock_irq(&cpu_base->lock);
 
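
The dispatch decision in the patch can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: `model_irqs_enabled`, `struct model_timer`, `model_callback` and `model_run_pending` are hypothetical names invented for illustration, and a plain boolean stands in for the CPU interrupt flag that `local_irq_disable()` and `local_irq_enable()` toggle in the real code. Only the two IRQSAFE mode names mirror the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the callback modes; only the two IRQSAFE
 * names mirror the constants tested in the patch. */
enum model_cb_mode {
	MODEL_CB_SOFTIRQ,
	MODEL_CB_IRQSAFE_PERCPU,
	MODEL_CB_IRQSAFE_UNLOCKED,
};

/* Stands in for the CPU interrupt flag (assumption of this model). */
static bool model_irqs_enabled = true;

struct model_timer {
	enum model_cb_mode cb_mode;
	int (*fn)(struct model_timer *);
	bool saw_irqs_disabled;	/* what the callback observed */
};

/* Example callback: records whether it ran with "irqs" disabled. */
static int model_callback(struct model_timer *t)
{
	t->saw_irqs_disabled = !model_irqs_enabled;
	return 0;	/* models "do not restart" */
}

/* Models the patched call site: IRQSAFE callbacks run with irqs off. */
static int model_run_pending(struct model_timer *t)
{
	int restart;
	bool emulate_hardirq_ctx =
		t->cb_mode == MODEL_CB_IRQSAFE_PERCPU ||
		t->cb_mode == MODEL_CB_IRQSAFE_UNLOCKED;

	/* The softirq path has irqs enabled after spin_unlock_irq(). */
	assert(model_irqs_enabled);

	if (emulate_hardirq_ctx) {
		model_irqs_enabled = false;	/* local_irq_disable() */
		restart = t->fn(t);
		model_irqs_enabled = true;	/* local_irq_enable() */
	} else {
		restart = t->fn(t);
	}
	return restart;
}
```

In this model, a timer in one of the IRQSAFE modes sees the flag cleared for the duration of its callback, which is the condition lockdep expects when `rq->lock` is taken in `sched_rt_period_timer`. Other timers run with the flag unchanged.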