Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

cxgb4: fix BUG() on interrupt deallocating path of ULD

Since the introduction of ULDs (Upper-Layer Drivers), the MSI-X
deallocation path in cxgb4 has changed: the driver frees a ULD's
interrupts when the ULD is unregistered or in the PCI shutdown handler.

The problem is that if an MSI-X interrupt is not freed before it is
deallocated in the PCI layer, a BUG() is triggered because an attempt
is made to quiesce an interrupt that is still "alive".
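
The ordering constraint described above can be illustrated with a small
userspace model. This is only a sketch, not kernel code: every name
prefixed with model_ is hypothetical, and the failing return value merely
stands in for the BUG() seen in free_msi_irqs(), on the assumption that
the check fires when a vector still has a handler attached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of the model_* names are kernel APIs. */
#define MODEL_NVEC 4

struct model_msix {
	bool enabled;             /* MSI-X still enabled on the device */
	bool in_use[MODEL_NVEC];  /* vector still has a handler attached */
};

/* Attach a handler to a vector (stands in for request_irq()). */
void model_request_irq(struct model_msix *m, size_t vec)
{
	assert(vec < MODEL_NVEC);
	m->in_use[vec] = true;
}

/* Detach the handler from a vector (stands in for free_irq()). */
void model_free_irq(struct model_msix *m, size_t vec)
{
	assert(vec < MODEL_NVEC);
	m->in_use[vec] = false;
}

/*
 * Stands in for pci_disable_msix(). Returns false in the situation
 * where the real code would hit BUG(): a vector is being deallocated
 * while a handler is still attached to it.
 */
bool model_disable_msix(struct model_msix *m)
{
	for (size_t i = 0; i < MODEL_NVEC; i++)
		if (m->in_use[i])
			return false;
	m->enabled = false;
	return true;
}
```

In this model, calling model_disable_msix() succeeds only after
model_free_irq() has been called for every vector that was requested,
which mirrors the order the driver has to follow on unbind.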

The trace below was observed when doing a simple unbind of a Chelsio
adapter's PCI function, for example:
"echo 001e:80:00.4 > /sys/bus/pci/drivers/cxgb4/unbind"

Trace:

kernel BUG at drivers/pci/msi.c:352!
Oops: Exception in kernel mode, sig: 5 [#1]
...
NIP [c0000000005a5e60] free_msi_irqs+0xa0/0x250
LR [c0000000005a5e50] free_msi_irqs+0x90/0x250
Call Trace:
[c0000000005a5e50] free_msi_irqs+0x90/0x250 (unreliable)
[c0000000005a72c4] pci_disable_msix+0x124/0x180
[d000000011e06708] disable_msi+0x88/0xb0 [cxgb4]
[d000000011e06948] free_some_resources+0xa8/0x160 [cxgb4]
[d000000011e06d60] remove_one+0x170/0x3c0 [cxgb4]
[c00000000058a910] pci_device_remove+0x70/0x110
[c00000000064ef04] device_release_driver_internal+0x1f4/0x2c0
...

This patch fixes the issue by refactoring the ULD shutdown path in the
cxgb4 driver so that interrupts are properly freed and disabled in the
PCI remove handler as well.
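
The shape of the refactoring can be sketched in userspace as one shared
teardown helper that does work only while a ULD's handle is set, so it is
safe to reach from both the unregister path and the remove/shutdown
cleanup loop. This is a simplified model under stated assumptions: the
model_* names and the irq_frees counter are hypothetical, and the
uld_mutex locking that the real helper relies on is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: the model_* names are not driver symbols. */
#define MODEL_ULD_MAX 3

struct model_uld {
	void *handle;    /* non-NULL while the ULD is attached */
	bool irqs_held;  /* ULD interrupts are still requested */
};

struct model_adapter {
	struct model_uld uld[MODEL_ULD_MAX];
	int irq_frees;   /* number of times interrupts were really freed */
};

/*
 * Shared teardown helper. Clearing the handle first makes a second
 * call for the same ULD a no-op, so interrupts are freed exactly once.
 */
void model_shutdown_uld(struct model_adapter *a, size_t type)
{
	assert(type < MODEL_ULD_MAX);
	if (!a->uld[type].handle)
		return;
	a->uld[type].handle = NULL;
	a->uld[type].irqs_held = false;
	a->irq_frees++;
}

/* Models the cleanup loop run from the remove and shutdown handlers. */
void model_clean_up(struct model_adapter *a)
{
	for (size_t i = 0; i < MODEL_ULD_MAX; i++)
		model_shutdown_uld(a, i);
}
```

If a ULD was already shut down through the unregister path, a later
model_clean_up() skips it, and a ULD that is still attached at remove
time has its interrupts released before MSI-X is disabled.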

Fixes: 0fbc81b3ad51 ("Allocate resources dynamically for all cxgb4 ULD's")
Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Authored by Guilherme G. Piccoli and committed by David S. Miller
6a146f3a 91d1ae47

+36 -22
+11 -5
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
···
 
 	mutex_lock(&uld_mutex);
 	list_del(&adap->list_node);
+
 	for (i = 0; i < CXGB4_ULD_MAX; i++)
-		if (adap->uld && adap->uld[i].handle) {
+		if (adap->uld && adap->uld[i].handle)
 			adap->uld[i].state_change(adap->uld[i].handle,
 						  CXGB4_STATE_DETACH);
-			adap->uld[i].handle = NULL;
-		}
+
 	if (netevent_registered && list_empty(&adapter_list)) {
 		unregister_netevent_notifier(&cxgb4_netevent_nb);
 		netevent_registered = false;
···
 		 */
 		destroy_workqueue(adapter->workq);
 
-		if (is_uld(adapter))
+		if (is_uld(adapter)) {
 			detach_ulds(adapter);
+			t4_uld_clean_up(adapter);
+		}
 
 		disable_interrupts(adapter);
 
···
 			if (adapter->port[i]->reg_state == NETREG_REGISTERED)
 				cxgb_close(adapter->port[i]);
 
-		t4_uld_clean_up(adapter);
+		if (is_uld(adapter)) {
+			detach_ulds(adapter);
+			t4_uld_clean_up(adapter);
+		}
+
 		disable_interrupts(adapter);
 		disable_msi(adapter);
 
+25 -17
drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c
···
 	kfree(adap->uld);
 }
 
+/* This function should be called with uld_mutex taken. */
+static void cxgb4_shutdown_uld_adapter(struct adapter *adap, enum cxgb4_uld type)
+{
+	if (adap->uld[type].handle) {
+		adap->uld[type].handle = NULL;
+		adap->uld[type].add = NULL;
+		release_sge_txq_uld(adap, type);
+
+		if (adap->flags & FULL_INIT_DONE)
+			quiesce_rx_uld(adap, type);
+
+		if (adap->flags & USING_MSIX)
+			free_msix_queue_irqs_uld(adap, type);
+
+		free_sge_queues_uld(adap, type);
+		free_queues_uld(adap, type);
+	}
+}
+
 void t4_uld_clean_up(struct adapter *adap)
 {
 	unsigned int i;
 
-	if (!adap->uld)
-		return;
+	mutex_lock(&uld_mutex);
 	for (i = 0; i < CXGB4_ULD_MAX; i++) {
 		if (!adap->uld[i].handle)
 			continue;
-		if (adap->flags & FULL_INIT_DONE)
-			quiesce_rx_uld(adap, i);
-		if (adap->flags & USING_MSIX)
-			free_msix_queue_irqs_uld(adap, i);
-		free_sge_queues_uld(adap, i);
-		free_queues_uld(adap, i);
+
+		cxgb4_shutdown_uld_adapter(adap, i);
 	}
+	mutex_unlock(&uld_mutex);
 }
 
 static void uld_init(struct adapter *adap, struct cxgb4_lld_info *lld)
···
 			continue;
 		if (type == CXGB4_ULD_ISCSIT && is_t4(adap->params.chip))
 			continue;
-		adap->uld[type].handle = NULL;
-		adap->uld[type].add = NULL;
-		release_sge_txq_uld(adap, type);
-		if (adap->flags & FULL_INIT_DONE)
-			quiesce_rx_uld(adap, type);
-		if (adap->flags & USING_MSIX)
-			free_msix_queue_irqs_uld(adap, type);
-		free_sge_queues_uld(adap, type);
-		free_queues_uld(adap, type);
+
+		cxgb4_shutdown_uld_adapter(adap, type);
 	}
 	mutex_unlock(&uld_mutex);
 