[PATCH] corruption during e100 MDI register access

We have identified two related bugs in the e100 driver.

Both bugs are related to manipulation of the MDI control register.

The first problem is that the Ready bit is being ignored when writing to
the Control register; we noticed this because the Linux bonding driver
would occasionally come to the spurious conclusion that the link was down
when querying Link State. It turned out that by failing to wait for a
previous command to complete it was selecting what was essentially a random
register in the MDI register set. When we added code that waits for the
Ready bit (as shown in the patch file below) all such problems ceased.

The second problem is that, although access to the MDI registers involves
multiple steps which must not be intermixed, nothing was defending against
two or more threads attempting simultaneous access. The most obvious
situation where such interference could occur involves the watchdog versus
ioctl paths, but there are probably others, so we recommend the locking
shown in our patch file.

Signed-off-by: Michael O'Donnell <Michael.ODonnell@stratus.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: John Ronciak <john.ronciak@intel.com>
Cc: Ganesh Venkatesan <ganesh.venkatesan@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>

authored by ODonnell, Michael and committed by Jeff Garzik ac7c6669 bf785ee0

+29 -3
+29 -3
drivers/net/e100.c
··· 132 132 * TODO: 133 133 * o several entry points race with dev->close 134 134 * o check for tx-no-resources/stop Q races with tx clean/wake Q 135 + * 136 + * FIXES: 137 + * 2005/12/02 - Michael O'Donnell <Michael.ODonnell at stratus dot com> 138 + * - Stratus87247: protect MDI control register manipulations 135 139 */ 136 140 137 141 #include <linux/config.h> ··· 582 578 u16 leds; 583 579 u16 eeprom_wc; 584 580 u16 eeprom[256]; 581 + spinlock_t mdio_lock; 585 582 }; 586 583 587 584 static inline void e100_write_flush(struct nic *nic) ··· 881 876 { 882 877 u32 data_out = 0; 883 878 unsigned int i; 879 + unsigned long flags; 884 880 881 + 882 + /* 883 + * Stratus87247: we shouldn't be writing the MDI control 884 + * register until the Ready bit shows True. Also, since 885 + * manipulation of the MDI control registers is a multi-step 886 + * procedure it should be done under lock. 887 + */ 888 + spin_lock_irqsave(&nic->mdio_lock, flags); 889 + for (i = 100; i; --i) { 890 + if (readl(&nic->csr->mdi_ctrl) & mdi_ready) 891 + break; 892 + udelay(20); 893 + } 894 + if (unlikely(!i)) { 895 + printk("e100.mdio_ctrl(%s) won't go Ready\n", 896 + nic->netdev->name ); 897 + spin_unlock_irqrestore(&nic->mdio_lock, flags); 898 + return 0; /* No way to indicate timeout error */ 899 + } 885 900 writel((reg << 16) | (addr << 21) | dir | data, &nic->csr->mdi_ctrl); 886 901 887 - for(i = 0; i < 100; i++) { 902 + for (i = 0; i < 100; i++) { 888 903 udelay(20); 889 - if((data_out = readl(&nic->csr->mdi_ctrl)) & mdi_ready) 904 + if ((data_out = readl(&nic->csr->mdi_ctrl)) & mdi_ready) 890 905 break; 891 906 } 892 - 907 + spin_unlock_irqrestore(&nic->mdio_lock, flags); 893 908 DPRINTK(HW, DEBUG, 894 909 "%s:addr=%d, reg=%d, data_in=0x%04X, data_out=0x%04X\n", 895 910 dir == mdi_read ? "READ" : "WRITE", addr, reg, data, data_out); ··· 2587 2562 /* locks must be initialized before calling hw_reset */ 2588 2563 spin_lock_init(&nic->cb_lock); 2589 2564 spin_lock_init(&nic->cmd_lock); 2565 + spin_lock_init(&nic->mdio_lock); 2590 2566 2591 2567 /* Reset the device before pci_set_master() in case device is in some 2592 2568 * funky state and has an interrupt pending - hint: we don't have the