Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

nvme: always punt polled uring_cmd end_io work to task_work

Currently, NVMe uring_cmd completions complete locally if they are polled. This
is done because those completions are always invoked from task context. While
that is true, there's no guarantee that they're invoked under the right ring
context, or even the right task. If someone does NVMe passthrough via multiple
threads with a limited number of poll queues, then ringA may find completions
belonging to ringB. In that case, completing the request directly may not be
sound.

Always punt the passthrough completions via task_work, which will redirect the
completion to the right context if needed.

Cc: stable@vger.kernel.org
Fixes: 585079b6e425 ("nvme: wire up async polling for io passthrough commands")
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Jens Axboe 9ce6c987 db3dfae1

+7 -14
drivers/nvme/host/ioctl.c
@@ -429,21 +429,14 @@
 	pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
 
 	/*
-	 * For iopoll, complete it directly. Note that using the uring_cmd
-	 * helper for this is safe only because we check blk_rq_is_poll().
-	 * As that returns false if we're NOT on a polled queue, then it's
-	 * safe to use the polled completion helper.
-	 *
-	 * Otherwise, move the completion to task work.
+	 * IOPOLL could potentially complete this request directly, but
+	 * if multiple rings are polling on the same queue, then it's possible
+	 * for one ring to find completions for another ring. Punting the
+	 * completion via task_work will always direct it to the right
+	 * location, rather than potentially complete requests for ringA
+	 * under iopoll invocations from ringB.
 	 */
-	if (blk_rq_is_poll(req)) {
-		if (pdu->bio)
-			blk_rq_unmap_user(pdu->bio);
-		io_uring_cmd_iopoll_done(ioucmd, pdu->result, pdu->status);
-	} else {
-		io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
-	}
-
+	io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
 	return RQ_END_IO_FREE;
 }