Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

MIPS: perf: fix deadlock

mipsxx_pmu_handle_shared_irq() calls irq_work_run() while holding the
pmuint_rwlock for read. irq_work_run() can, via perf_pending_event(),
call try_to_wake_up() which can try to take rq->lock.

However, perf can also call perf_pmu_enable() (and thus take the
pmuint_rwlock for write) while holding the rq->lock, from
finish_task_switch() via perf_event_context_sched_in().

This leads to an ABBA deadlock:

PID: 3855 TASK: 8f7ce288 CPU: 2 COMMAND: "process"
#0 [89c39ac8] __delay at 803b5be4
#1 [89c39ac8] do_raw_spin_lock at 8008fdcc
#2 [89c39af8] try_to_wake_up at 8006e47c
#3 [89c39b38] pollwake at 8018eab0
#4 [89c39b68] __wake_up_common at 800879f4
#5 [89c39b98] __wake_up at 800880e4
#6 [89c39bc8] perf_event_wakeup at 8012109c
#7 [89c39be8] perf_pending_event at 80121184
#8 [89c39c08] irq_work_run_list at 801151f0
#9 [89c39c38] irq_work_run at 80115274
#10 [89c39c50] mipsxx_pmu_handle_shared_irq at 8002cc7c

PID: 1481 TASK: 8eaac6a8 CPU: 3 COMMAND: "process"
#0 [8de7f900] do_raw_write_lock at 800900e0
#1 [8de7f918] perf_event_context_sched_in at 80122310
#2 [8de7f938] __perf_event_task_sched_in at 80122608
#3 [8de7f958] finish_task_switch at 8006b8a4
#4 [8de7f998] __schedule at 805e4dc4
#5 [8de7f9f8] schedule at 805e5558
#6 [8de7fa10] schedule_hrtimeout_range_clock at 805e9984
#7 [8de7fa70] poll_schedule_timeout at 8018e8f8
#8 [8de7fa88] do_select at 8018f338
#9 [8de7fd88] core_sys_select at 8018f5cc
#10 [8de7fee0] sys_select at 8018f854
#11 [8de7ff28] syscall_common at 80028fc8

The lock appears to be there only to protect the hardware counters, so there
is no need to hold it across irq_work_run(). Releasing it (and resuming the
local counters) before running the pending irq work breaks the cycle.

Signed-off-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

authored by Rabin Vincent, committed by Ralf Baechle
commit f2b42866, parent 9eec1c01

+5 -4
arch/mips/kernel/perf_event_mipsxx.c
--- a/arch/mips/kernel/perf_event_mipsxx.c
+++ b/arch/mips/kernel/perf_event_mipsxx.c
@@ -1446,6 +1446,11 @@
 	HANDLE_COUNTER(0)
 	}
 
+#ifdef CONFIG_MIPS_PERF_SHARED_TC_COUNTERS
+	read_unlock(&pmuint_rwlock);
+#endif
+	resume_local_counters();
+
 	/*
 	 * Do all the work for the pending perf events. We can do this
 	 * in here because the performance counter interrupt is a regular
@@ -1459,10 +1454,6 @@
 	if (handled == IRQ_HANDLED)
 		irq_work_run();
 
-#ifdef CONFIG_MIPS_PERF_SHARED_TC_COUNTERS
-	read_unlock(&pmuint_rwlock);
-#endif
-	resume_local_counters();
 	return handled;
 }