Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux
1
fork

Configure Feed

Select the types of activity you want to include in your feed.

nvme: fix possible hang when ns scanning fails during error recovery

When the controller is reconnecting, the host fails I/O and admin
commands as the host cannot reach the controller. ns scanning may
revalidate namespaces during that period and it is wrong to remove
namespaces due to these failures as we may hang (see 205da2434301).

One command that may fail is nvme_identify_ns_descs. Since we return
success due to having ns identify descriptor list optional, we continue
to compare ns identifiers in nvme_revalidate_disk, obviously fail and
return -ENODEV to nvme_validate_ns, which will remove the namespace.

Exactly what we don't want to happen.

Fixes: 22802bf742c2 ("nvme: Namepace identification descriptor list is optional")
Tested-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

authored by

Sagi Grimberg and committed by
Jens Axboe
59c7c3ca a8de6639

+1 -1
+1 -1
drivers/nvme/host/core.c
··· 1110 1110 * Don't treat an error as fatal, as we potentially already 1111 1111 * have a NGUID or EUI-64. 1112 1112 */ 1113 - if (status > 0) 1113 + if (status > 0 && !(status & NVME_SC_DNR)) 1114 1114 status = 0; 1115 1115 goto free_data; 1116 1116 }