nvme: always punt polled uring_cmd end_io work to task_work

Currently, polled NVMe uring_cmd completions are completed locally.
This is done because those completions are always invoked from task
context. While that is true, there's no guarantee that the completion
runs under the right ring context, or even the right task. If someone
does NVMe passthrough via multiple threads with a limited number of
poll queues, then ringA may find completions from ringB. In that case,
completing the request directly may not be sound.

Always just punt the passthrough completions via task_work, which will
redirect the completion, if needed.
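
The hazard can be modelled in userspace. The sketch below is only an
illustration under stated assumptions: struct ring, struct request and
every function in it are hypothetical stand-ins, not kernel or io_uring
APIs. It models two rings sharing one poll queue, with ringB doing the
polling, and counts how many requests end up completed under a ring that
does not own them, first with inline completion and then with the
completion punted to the owner.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REQS 8

/* Hypothetical userspace model; none of these types are kernel APIs. */
struct ring {
	int id;
	int completed_here;	/* requests completed in this ring's context */
	int foreign;		/* of those, how many another ring owned */
	int pending[MAX_REQS];	/* stand-in for the owner's task_work list */
	size_t nr_pending;
};

struct request {
	struct ring *owner;	/* ring that submitted the request */
	int tag;
};

/* Old behaviour: whoever polls the queue completes what it finds. */
static void poll_complete_inline(struct ring *poller,
				 const struct request *cq, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		poller->completed_here++;
		if (cq[i].owner != poller)
			poller->foreign++;
	}
}

/* New behaviour: the poller only hands each request to its owner. */
static void poll_punt(const struct request *cq, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct ring *owner = cq[i].owner;

		assert(owner->nr_pending < MAX_REQS);
		owner->pending[owner->nr_pending++] = cq[i].tag;
	}
}

/* The owner later drains its deferred work in its own context. */
static void run_task_work(struct ring *self)
{
	self->completed_here += (int)self->nr_pending;
	self->nr_pending = 0;
}

/* Returns how many requests were completed under the wrong ring. */
int demo_foreign_completions(int punt)
{
	struct ring a = { .id = 1 }, b = { .id = 2 };
	/* One shared poll queue holding completions for both rings. */
	struct request cq[] = { { &a, 1 }, { &b, 2 }, { &a, 3 } };
	size_t n = sizeof(cq) / sizeof(cq[0]);

	if (punt) {
		poll_punt(cq, n);		/* ringB polls, but only routes */
		run_task_work(&a);
		run_task_work(&b);
	} else {
		poll_complete_inline(&b, cq, n);	/* ringB completes all */
	}
	return a.foreign + b.foreign;
}
```

In this model, demo_foreign_completions(0) reports the two ringA requests
that ringB completed inline, while demo_foreign_completions(1) reports
none, because each request is completed by its owner regardless of which
ring happened to poll the queue.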

Cc: stable@vger.kernel.org
Fixes: 585079b6e425 ("nvme: wire up async polling for io passthrough commands")
Signed-off-by: Jens Axboe <axboe@kernel.dk>

 drivers/nvme/host/ioctl.c | 21 +++++++--------------
 1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -429,21 +429,14 @@
 	pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
 
 	/*
-	 * For iopoll, complete it directly. Note that using the uring_cmd
-	 * helper for this is safe only because we check blk_rq_is_poll().
-	 * As that returns false if we're NOT on a polled queue, then it's
-	 * safe to use the polled completion helper.
-	 *
-	 * Otherwise, move the completion to task work.
+	 * IOPOLL could potentially complete this request directly, but
+	 * if multiple rings are polling on the same queue, then it's possible
+	 * for one ring to find completions for another ring. Punting the
+	 * completion via task_work will always direct it to the right
+	 * location, rather than potentially complete requests for ringA
+	 * under iopoll invocations from ringB.
 	 */
-	if (blk_rq_is_poll(req)) {
-		if (pdu->bio)
-			blk_rq_unmap_user(pdu->bio);
-		io_uring_cmd_iopoll_done(ioucmd, pdu->result, pdu->status);
-	} else {
-		io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
-	}
-
+	io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
 	return RQ_END_IO_FREE;
 }
 