Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

IB/mthca: Fix handling of send CQE with error for QPs connected to SRQ

mthca_free_err_wqe() currently treats both send and receive CQEs
identically if a QP is using an SRQ. But for Tavor hardware, send
CQEs with error can be chained together even if the RQ is part of SRQ,
so we may miss some CQEs.

Fix by following the WQE chain for all send CQEs, even for QPs attached to an SRQ.

This fixes crashes in IPoIB CM:
<https://bugs.openfabrics.org//show_bug.cgi?id=604>

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

Authored by Michael S. Tsirkin and committed by Roland Dreier
8b7e1577 6e98ee75

+3 -3
drivers/infiniband/hw/mthca/mthca_qp.c
@@ -2284,10 +2284,10 @@
 	struct mthca_next_seg *next;

 	/*
-	 * For SRQs, all WQEs generate a CQE, so we're always at the
-	 * end of the doorbell chain.
+	 * For SRQs, all receive WQEs generate a CQE, so we're always
+	 * at the end of the doorbell chain.
 	 */
-	if (qp->ibqp.srq) {
+	if (qp->ibqp.srq && !is_send) {
 		*new_wqe = 0;
 		return;
 	}
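
The changed condition can be read as a small truth table over two inputs: whether the QP is attached to an SRQ, and whether the failed CQE is for a send. The sketch below models only that decision, outside the driver. The names `sketch_qp` and `sketch_at_chain_end` are hypothetical stand-ins for illustration and are not part of mthca; the real function also walks the WQE chain, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's QP state; not struct mthca_qp. */
struct sketch_qp {
	bool has_srq;	/* models qp->ibqp.srq being non-NULL */
};

/*
 * Returns true when the error path may treat the WQE as the end of the
 * doorbell chain and stop early, false when the chain must be followed.
 * Mirrors the patched test: qp->ibqp.srq && !is_send.
 */
static inline bool sketch_at_chain_end(const struct sketch_qp *qp, bool is_send)
{
	assert(qp != NULL);
	return qp->has_srq && !is_send;
}
```

Under this model, only a receive CQE on an SRQ-attached QP takes the early return. A send CQE falls through to the chain-following code whether or not the QP uses an SRQ, which is the behaviour the commit message describes for Tavor hardware.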