Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

tg3: prevent ifup/ifdown during PCI error recovery

The patch fixes race conditions between PCI error recovery callbacks and
potential ifup/ifdown.

First, if ifup (tg3_open) is called between tg3_io_error_detected() and
tg3_io_resume(), then tp->timer is armed twice before expiry: once during
tg3_open() and again during tg3_io_resume(). This results in a BUG
at kernel/time/timer.c:945.

Second, if ifdown (tg3_close) is called between tg3_io_error_detected()
and tg3_io_resume(), then tg3_napi_disable() is called twice without
a tg3_napi_enable() in between: once during tg3_io_error_detected() and
again during tg3_close(). tg3_io_resume() then hangs on rtnl_lock().
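
The guard described above can be modelled in user space. The following is only a minimal sketch, not driver code: struct sketch_dev and every sketch_* function are hypothetical stand-ins for struct tg3 and the tg3 callbacks, the assert() stands in for the BUG reported above, and the rtnl_lock() serialization that the real driver relies on is assumed rather than modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct tg3; only what the sketch needs. */
struct sketch_dev {
	bool pcierr_recovery;	/* true while error recovery is in progress */
	bool running;		/* models netif_running() */
	bool timer_pending;	/* models tp->timer being armed */
};

/* Arming an already pending timer is the bug; assert() stands in for BUG. */
static void sketch_timer_arm(struct sketch_dev *d)
{
	assert(!d->timer_pending);
	d->timer_pending = true;
}

/* Models tg3_io_error_detected(): mark recovery, quiesce a running device. */
void sketch_error_detected(struct sketch_dev *d)
{
	d->pcierr_recovery = true;
	if (d->running)
		d->timer_pending = false;
}

/* Models tg3_open(): refuse to bring the device up during recovery. */
int sketch_open(struct sketch_dev *d)
{
	if (d->pcierr_recovery)
		return -EAGAIN;
	sketch_timer_arm(d);
	d->running = true;
	return 0;
}

/* Models tg3_close(): refuse to tear the device down during recovery. */
int sketch_close(struct sketch_dev *d)
{
	if (d->pcierr_recovery)
		return -EAGAIN;
	d->timer_pending = false;
	d->running = false;
	return 0;
}

/* Models tg3_io_resume(): restart a running device, then clear the flag. */
void sketch_resume(struct sketch_dev *d)
{
	if (d->running)
		sketch_timer_arm(d);
	d->pcierr_recovery = false;
}
```

In this model, removing the pcierr_recovery check from sketch_open() and calling it between sketch_error_detected() and sketch_resume() trips the assert in sketch_timer_arm(), which mirrors the first race.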

v2: Added logging messages per Prashant's request

Cc: Prashant Sreedharan <prashant@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Authored by Ivan Vecera and committed by David S. Miller
0486a063 88e41947

+17
+16
drivers/net/ethernet/broadcom/tg3.c
···
 	struct tg3 *tp = netdev_priv(dev);
 	int err;
 
+	if (tp->pcierr_recovery) {
+		netdev_err(dev, "Failed to open device. PCI error recovery "
+			   "in progress\n");
+		return -EAGAIN;
+	}
+
 	if (tp->fw_needed) {
 		err = tg3_request_firmware(tp);
 		if (tg3_asic_rev(tp) == ASIC_REV_57766) {
···
 static int tg3_close(struct net_device *dev)
 {
 	struct tg3 *tp = netdev_priv(dev);
+
+	if (tp->pcierr_recovery) {
+		netdev_err(dev, "Failed to close device. PCI error recovery "
+			   "in progress\n");
+		return -EAGAIN;
+	}
 
 	tg3_ptp_fini(tp);
 
···
 	tp->rx_mode = TG3_DEF_RX_MODE;
 	tp->tx_mode = TG3_DEF_TX_MODE;
 	tp->irq_sync = 1;
+	tp->pcierr_recovery = false;
 
 	if (tg3_debug > 0)
 		tp->msg_enable = tg3_debug;
···
 
 	rtnl_lock();
 
+	tp->pcierr_recovery = true;
+
 	/* We probably don't have netdev yet */
 	if (!netdev || !netif_running(netdev))
 		goto done;
···
 	tg3_phy_start(tp);
 
 done:
+	tp->pcierr_recovery = false;
 	rtnl_unlock();
 }
 
+1
drivers/net/ethernet/broadcom/tg3.h
···
 
 	struct device *hwmon_dev;
 	bool link_up;
+	bool pcierr_recovery;
 };
 
 /* Accessor macros for chip and asic attributes