Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

nbd: Fix hung on disconnect request if socket is closed before

When userspace closes the socket before sending a disconnect
request, subsequent I/O requests block in wait_for_reconnect()
until the dead connection timeout expires. The disconnect request
that follows then hangs in blk_mq_quiesce_queue() as well. As a
result, an nbd device cannot be disconnected while I/O requests
are waiting for a reconnect; it stays stuck until the timeout
expires. This is not the expected behavior. Fix it by waking up
the threads waiting for a reconnect as soon as a disconnect
request is sent.

Reported-by: Xu Jianhai <zero.xu@bytedance.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20220322080639.142-1-xieyongji@bytedance.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Authored by Xie Yongji and committed by Jens Axboe
491bf8f2 c23d47ab

+9 -4
drivers/block/nbd.c
@@ -946,11 +946,15 @@
 	struct nbd_config *config = nbd->config;
 	if (!config->dead_conn_timeout)
 		return 0;
-	if (test_bit(NBD_RT_DISCONNECTED, &config->runtime_flags))
+
+	if (!wait_event_timeout(config->conn_wait,
+				test_bit(NBD_RT_DISCONNECTED,
+					 &config->runtime_flags) ||
+				atomic_read(&config->live_connections) > 0,
+				config->dead_conn_timeout))
 		return 0;
-	return wait_event_timeout(config->conn_wait,
-				  atomic_read(&config->live_connections) > 0,
-				  config->dead_conn_timeout) > 0;
+
+	return !test_bit(NBD_RT_DISCONNECTED, &config->runtime_flags);
 }
 
 static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
@@ -2080,6 +2076,7 @@
 	mutex_lock(&nbd->config_lock);
 	nbd_disconnect(nbd);
 	sock_shutdown(nbd);
+	wake_up(&nbd->config->conn_wait);
 	/*
 	 * Make sure recv thread has finished, we can safely call nbd_clear_que()
 	 * to cancel the inflight I/Os.