Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

IB/hfi1: Correct -EBUSY handling in tx code

The current code mishandles -EBUSY in two ways:
- The flow change doesn't test the return from the flush and runs on to
process the current packet racing with the wakeup processing
- The -EBUSY handling for a single packet inserts the tx into the txlist
after the submit call, racing with the same wakeup processing

Fix the first by dropping the skb and returning NETDEV_TX_OK.

Fix the second by insuring the the list entry within the txreq is inited
when allocated. This enables the sleep routine to detect that the txreq
has used the non-list api and queue the packet to the txlist.

Both flaws can lead to having the flushing thread executing in causing two
threads to manipulate the txlist.

Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Link: https://lore.kernel.org/r/20200623204321.108092.83898.stgit@awfm-01.aw.intel.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

authored by

Mike Marciniszyn and committed by
Jason Gunthorpe
82172b76 822fbd37

+17 -16
+17 -16
drivers/infiniband/hw/hfi1/ipoib_tx.c
··· 369 369 tx->priv = priv; 370 370 tx->txq = txp->txq; 371 371 tx->skb = skb; 372 + INIT_LIST_HEAD(&tx->txreq.list); 372 373 373 374 hfi1_ipoib_build_ib_tx_headers(tx, txp); 374 375 ··· 470 469 471 470 ret = hfi1_ipoib_submit_tx(txq, tx); 472 471 if (likely(!ret)) { 472 + tx_ok: 473 473 trace_sdma_output_ibhdr(tx->priv->dd, 474 474 &tx->sdma_hdr.hdr, 475 475 ib_is_sc5(txp->flow.sc5)); ··· 480 478 481 479 txq->pkts_sent = false; 482 480 483 - if (ret == -EBUSY) { 484 - list_add_tail(&tx->txreq.list, &txq->tx_list); 485 - 486 - trace_sdma_output_ibhdr(tx->priv->dd, 487 - &tx->sdma_hdr.hdr, 488 - ib_is_sc5(txp->flow.sc5)); 489 - hfi1_ipoib_check_queue_depth(txq); 490 - return NETDEV_TX_OK; 491 - } 492 - 493 - if (ret == -ECOMM) { 494 - hfi1_ipoib_check_queue_depth(txq); 495 - return NETDEV_TX_OK; 496 - } 481 + if (ret == -EBUSY || ret == -ECOMM) 482 + goto tx_ok; 497 483 498 484 sdma_txclean(priv->dd, &tx->txreq); 499 485 dev_kfree_skb_any(skb); ··· 499 509 struct ipoib_txreq *tx; 500 510 501 511 /* Has the flow change ? */ 502 - if (txq->flow.as_int != txp->flow.as_int) 503 - (void)hfi1_ipoib_flush_tx_list(dev, txq); 512 + if (txq->flow.as_int != txp->flow.as_int) { 513 + int ret; 504 514 515 + ret = hfi1_ipoib_flush_tx_list(dev, txq); 516 + if (unlikely(ret)) { 517 + if (ret == -EBUSY) 518 + ++dev->stats.tx_dropped; 519 + dev_kfree_skb_any(skb); 520 + return NETDEV_TX_OK; 521 + } 522 + } 505 523 tx = hfi1_ipoib_send_dma_common(dev, skb, txp); 506 524 if (IS_ERR(tx)) { 507 525 int ret = PTR_ERR(tx); ··· 610 612 611 613 netif_stop_subqueue(txq->priv->netdev, txq->q_idx); 612 614 615 + if (list_empty(&txreq->list)) 616 + /* came from non-list submit */ 617 + list_add_tail(&txreq->list, &txq->tx_list); 613 618 if (list_empty(&txq->wait.list)) 614 619 iowait_queue(pkts_sent, wait->iow, &sde->dmawait); 615 620