genirq: Keep chip buslock across irq_request/release_resources()

Moving the irq_request/release_resources() callbacks out of the spinlocked,
irq disabled and bus locked region, unearthed an interesting abuse of the
irq_bus_lock/irq_bus_sync_unlock() callbacks.

The OMAP GPIO driver does merily power management inside of them. The
irq_request_resources() callback of this GPIO irqchip calls a function
which reads a GPIO register. That read aborts now because the clock of the
GPIO block is not magically enabled via the irq_bus_lock() callback.

Move the callbacks under the bus lock again to prevent this. In the
free_irq() path this requires to drop the bus_lock before calling
synchronize_irq() and reaquiring it before calling the
irq_release_resources() callback.

The bus lock can't be held because:

1) The data which has been changed between bus_lock/un_lock is cached in
the irq chip driver private data and needs to go out to the irq chip
via the slow bus (usually SPI or I2C) before calling
synchronize_irq().

That's the reason why this bus_lock/unlock magic exists in the first
place, as you cannot do SPI/I2C transactions while holding desc->lock
with interrupts disabled.

2) synchronize_irq() will actually deadlock, if there is a handler on
flight. These chips use threaded handlers for obvious reasons, as
they allow to do SPI/I2C communication. When the threaded handler
returns then bus_lock needs to be taken in irq_finalize_oneshot() as
we need to talk to the actual irq chip once more. After that the
threaded handler is marked done, which makes synchronize_irq() return.

So if we hold bus_lock accross the synchronize_irq() call, the
handler cannot mark itself done because it blocks on the bus
lock. That in turn makes synchronize_irq() wait forever on the
threaded handler to complete....

Add the missing unlock of desc->request_mutex in the error path of
__free_irq() and add a bunch of comments to explain the locking and
protection rules.

Fixes: 46e48e257360 ("genirq: Move irq resource handling out of spinlocked region")
Reported-and-tested-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Reported-and-tested-by: Tony Lindgren <tony@atomide.com>
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Not-longer-ranted-at-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>

+53 -10
+53 -10
kernel/irq/manage.c
··· 1090 1090 /* 1091 1091 * Internal function to register an irqaction - typically used to 1092 1092 * allocate special interrupts that are part of the architecture. 1093 + * 1094 + * Locking rules: 1095 + * 1096 + * desc->request_mutex Provides serialization against a concurrent free_irq() 1097 + * chip_bus_lock Provides serialization for slow bus operations 1098 + * desc->lock Provides serialization against hard interrupts 1099 + * 1100 + * chip_bus_lock and desc->lock are sufficient for all other management and 1101 + * interrupt related functions. desc->request_mutex solely serializes 1102 + * request/free_irq(). 1093 1103 */ 1094 1104 static int 1095 1105 __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) ··· 1177 1167 if (desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE) 1178 1168 new->flags &= ~IRQF_ONESHOT; 1179 1169 1170 + /* 1171 + * Protects against a concurrent __free_irq() call which might wait 1172 + * for synchronize_irq() to complete without holding the optional 1173 + * chip bus lock and desc->lock. 1174 + */ 1180 1175 mutex_lock(&desc->request_mutex); 1176 + 1177 + /* 1178 + * Acquire bus lock as the irq_request_resources() callback below 1179 + * might rely on the serialization or the magic power management 1180 + * functions which are abusing the irq_bus_lock() callback, 1181 + */ 1182 + chip_bus_lock(desc); 1183 + 1184 + /* First installed action requests resources. */ 1181 1185 if (!desc->action) { 1182 1186 ret = irq_request_resources(desc); 1183 1187 if (ret) { 1184 1188 pr_err("Failed to request resources for %s (irq %d) on irqchip %s\n", 1185 1189 new->name, irq, desc->irq_data.chip->name); 1186 - goto out_mutex; 1190 + goto out_bus_unlock; 1187 1191 } 1188 1192 } 1189 1193 1190 - chip_bus_lock(desc); 1191 - 1192 1194 /* 1193 1195 * The following block of code has to be executed atomically 1196 + * protected against a concurrent interrupt and any of the other 1197 + * management calls which are not serialized via 1198 + * desc->request_mutex or the optional bus lock. 1194 1199 */ 1195 1200 raw_spin_lock_irqsave(&desc->lock, flags); 1196 1201 old_ptr = &desc->action; ··· 1311 1286 ret = __irq_set_trigger(desc, 1312 1287 new->flags & IRQF_TRIGGER_MASK); 1313 1288 1314 - if (ret) { 1315 - irq_release_resources(desc); 1289 + if (ret) 1316 1290 goto out_unlock; 1317 - } 1318 1291 } 1319 1292 1320 1293 desc->istate &= ~(IRQS_AUTODETECT | IRQS_SPURIOUS_DISABLED | \ ··· 1408 1385 out_unlock: 1409 1386 raw_spin_unlock_irqrestore(&desc->lock, flags); 1410 1387 1411 - chip_bus_sync_unlock(desc); 1412 - 1413 1388 if (!desc->action) 1414 1389 irq_release_resources(desc); 1415 - 1416 - out_mutex: 1390 + out_bus_unlock: 1391 + chip_bus_sync_unlock(desc); 1417 1392 mutex_unlock(&desc->request_mutex); 1418 1393 1419 1394 out_thread: ··· 1493 1472 WARN(1, "Trying to free already-free IRQ %d\n", irq); 1494 1473 raw_spin_unlock_irqrestore(&desc->lock, flags); 1495 1474 chip_bus_sync_unlock(desc); 1475 + mutex_unlock(&desc->request_mutex); 1496 1476 return NULL; 1497 1477 } 1498 1478 ··· 1520 1498 #endif 1521 1499 1522 1500 raw_spin_unlock_irqrestore(&desc->lock, flags); 1501 + /* 1502 + * Drop bus_lock here so the changes which were done in the chip 1503 + * callbacks above are synced out to the irq chips which hang 1504 + * behind a slow bus (I2C, SPI) before calling synchronize_irq(). 1505 + * 1506 + * Aside of that the bus_lock can also be taken from the threaded 1507 + * handler in irq_finalize_oneshot() which results in a deadlock 1508 + * because synchronize_irq() would wait forever for the thread to 1509 + * complete, which is blocked on the bus lock. 1510 + * 1511 + * The still held desc->request_mutex() protects against a 1512 + * concurrent request_irq() of this irq so the release of resources 1513 + * and timing data is properly serialized. 1514 + */ 1523 1515 chip_bus_sync_unlock(desc); 1524 1516 1525 1517 unregister_handler_proc(irq, action); ··· 1566 1530 } 1567 1531 } 1568 1532 1533 + /* Last action releases resources */ 1569 1534 if (!desc->action) { 1535 + /* 1536 + * Reaquire bus lock as irq_release_resources() might 1537 + * require it to deallocate resources over the slow bus. 1538 + */ 1539 + chip_bus_lock(desc); 1570 1540 irq_release_resources(desc); 1541 + chip_bus_sync_unlock(desc); 1571 1542 irq_remove_timings(desc); 1572 1543 } 1573 1544