Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

net: ena: fix race condition between device reset and link up setup

In rare cases, the ena driver resets and restarts the device,
for example when a misbehaving application causes a
transmit timeout.

The first step in the reset procedure is to stop the Tx traffic by
calling ena_carrier_off().

Right after the driver has started the device reset procedure, the device
may send an asynchronous notification (via AENQ) to the driver
that there was a link change (to link-up state).
This link change is mapped to a call to netif_carrier_on(), which
re-activates the Tx queues. That violates the assumption of no Tx traffic
until the device reset is completed: the reset task might still be
initializing the queues, so the re-enabled Tx path can access
uninitialized memory.
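
The gating that the diff introduces can be illustrated with a small userspace model. This is only a sketch, not driver code: the `model_*` names, the struct, and the non-atomic bit helpers are hypothetical stand-ins for the kernel's `set_bit()`/`clear_bit()`/`test_bit()` and for the netdev carrier state, and the whole re-initialization is collapsed into a single step, which the real driver does not do.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; names only mirror the flags in the patch. */
enum model_flags {
	MODEL_FLAG_LINK_UP,
	MODEL_FLAG_ONGOING_RESET
};

struct model_adapter {
	unsigned long flags;
	bool carrier_on;	/* stands in for the netdev carrier state */
	bool queues_ready;	/* true once re-initialization has finished */
};

/* Simplified, non-atomic stand-ins for the kernel bitops. */
static void model_set_bit(int nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
}

static void model_clear_bit(int nr, unsigned long *addr)
{
	*addr &= ~(1UL << nr);
}

static bool model_test_bit(int nr, const unsigned long *addr)
{
	return (*addr >> nr) & 1UL;
}

/* Link-change notification: always record the link state, but defer
 * turning the carrier on while a reset is in flight. */
static void model_link_change(struct model_adapter *a, bool link_up)
{
	if (link_up) {
		model_set_bit(MODEL_FLAG_LINK_UP, &a->flags);
		if (!model_test_bit(MODEL_FLAG_ONGOING_RESET, &a->flags))
			a->carrier_on = true;
	} else {
		model_clear_bit(MODEL_FLAG_LINK_UP, &a->flags);
		a->carrier_on = false;
	}
}

/* Start of reset: stop Tx and mark the reset as ongoing. */
static void model_reset_begin(struct model_adapter *a)
{
	a->carrier_on = false;
	a->queues_ready = false;
	model_set_bit(MODEL_FLAG_ONGOING_RESET, &a->flags);
}

/* End of reset: clear the marker, then replay a link-up event that
 * arrived while the reset was ongoing. */
static void model_reset_end(struct model_adapter *a)
{
	a->queues_ready = true;
	model_clear_bit(MODEL_FLAG_ONGOING_RESET, &a->flags);
	if (model_test_bit(MODEL_FLAG_LINK_UP, &a->flags))
		a->carrier_on = true;
}
```

Calling `model_reset_begin()`, then `model_link_change(&a, true)`, leaves `carrier_on` false while still recording the link state; `model_reset_end()` then turns the carrier on. A link-up event that arrives mid-reset is therefore deferred rather than lost.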

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Authored by Netanel Belgazal and committed by David S. Miller
d18e4f68 b399a394

+11 -3
+9 -2
drivers/net/ethernet/amazon/ena/ena_netdev.c
···
 	bool wd_state;
 	int rc;
 
+	set_bit(ENA_FLAG_ONGOING_RESET, &adapter->flags);
 	rc = ena_device_init(ena_dev, adapter->pdev, &get_feat_ctx, &wd_state);
 	if (rc) {
 		dev_err(&pdev->dev, "Can not initialize device\n");
···
 		dev_err(&pdev->dev, "Validation of device parameters failed\n");
 		goto err_device_destroy;
 	}
+
+	clear_bit(ENA_FLAG_ONGOING_RESET, &adapter->flags);
+	/* Make sure we don't have a race with AENQ Links state handler */
+	if (test_bit(ENA_FLAG_LINK_UP, &adapter->flags))
+		netif_carrier_on(adapter->netdev);
 
 	rc = ena_enable_msix_and_set_admin_interrupts(adapter,
 						      adapter->num_queues);
···
 	ena_com_admin_destroy(ena_dev);
 err:
 	clear_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags);
-
+	clear_bit(ENA_FLAG_ONGOING_RESET, &adapter->flags);
 	dev_err(&pdev->dev,
 		"Reset attempt failed. Can not reset the device\n");
 
···
 	if (status) {
 		netdev_dbg(adapter->netdev, "%s\n", __func__);
 		set_bit(ENA_FLAG_LINK_UP, &adapter->flags);
-		netif_carrier_on(adapter->netdev);
+		if (!test_bit(ENA_FLAG_ONGOING_RESET, &adapter->flags))
+			netif_carrier_on(adapter->netdev);
 	} else {
 		clear_bit(ENA_FLAG_LINK_UP, &adapter->flags);
 		netif_carrier_off(adapter->netdev);
+2 -1
drivers/net/ethernet/amazon/ena/ena_netdev.h
···
 	ENA_FLAG_DEV_UP,
 	ENA_FLAG_LINK_UP,
 	ENA_FLAG_MSIX_ENABLED,
-	ENA_FLAG_TRIGGER_RESET
+	ENA_FLAG_TRIGGER_RESET,
+	ENA_FLAG_ONGOING_RESET
 };
 
 /* adapter specific private data structure */