Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

vfio-ccw: Prevent quiesce function going into an infinite loop

The quiesce function calls cio_cancel_halt_clear() and if we
get an -EBUSY we go into a loop where we:
- wait for any interrupts
- flush all I/O in the workqueue
- retry cio_cancel_halt_clear()

During the period where we are waiting for interrupts or
flushing all I/O, the channel subsystem could have completed
a halt/clear action and turned off the corresponding activity
control bits in the subchannel status word. This means the next
time we call cio_cancel_halt_clear(), we will again start by
calling cancel subchannel and so we can be stuck between calling
cancel and halt forever.

Rather than calling cio_cancel_halt_clear() immediately after
waiting, let's try to disable the subchannel. If we succeed in
disabling the subchannel then we know nothing else can happen
with the device.

Suggested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Message-Id: <4d5a4b98ab1b41ac6131b5c36de18b76c5d66898.1555449329.git.alifm@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>

authored by Farhan Ali and committed by Cornelia Huck
d1ffa760 b49bdc86

+18 -14
drivers/s390/cio/vfio_ccw_drv.c
@@ -43,26 +43,30 @@
 	if (ret != -EBUSY)
 		goto out_unlock;
 
+	iretry = 255;
 	do {
-		iretry = 255;
 
 		ret = cio_cancel_halt_clear(sch, &iretry);
-		while (ret == -EBUSY) {
-			/*
-			 * Flush all I/O and wait for
-			 * cancel/halt/clear completion.
-			 */
-			private->completion = &completion;
-			spin_unlock_irq(sch->lock);
 
+		if (ret == -EIO) {
+			pr_err("vfio_ccw: could not quiesce subchannel 0.%x.%04x!\n",
+			       sch->schid.ssid, sch->schid.sch_no);
+			break;
+		}
+
+		/*
+		 * Flush all I/O and wait for
+		 * cancel/halt/clear completion.
+		 */
+		private->completion = &completion;
+		spin_unlock_irq(sch->lock);
+
+		if (ret == -EBUSY)
 			wait_for_completion_timeout(&completion, 3*HZ);
 
-			private->completion = NULL;
-			flush_workqueue(vfio_ccw_work_q);
-			spin_lock_irq(sch->lock);
-			ret = cio_cancel_halt_clear(sch, &iretry);
-		};
-
+		private->completion = NULL;
+		flush_workqueue(vfio_ccw_work_q);
+		spin_lock_irq(sch->lock);
 		ret = cio_disable_subchannel(sch);
 	} while (ret == -EBUSY);
 out_unlock:
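The control flow after this change can be sketched in user space. The following is a minimal model, not the driver code: `struct mock_sch`, `mock_cancel_halt_clear()`, `mock_disable_subchannel()` and `mock_quiesce()` are hypothetical stand-ins, and the stub for cancel/halt/clear only decrements the retry counter, whereas the real cio_cancel_halt_clear() has staged cancel, halt and clear logic that is not modelled here. Locking, the completion wait and the workqueue flush are reduced to a comment. What the sketch does show is the shape of the new loop: the retry counter is initialised once before the loop, -EIO breaks out, and every pass ends with an attempt to disable the subchannel.

```c
#include <errno.h>

/* Hypothetical stand-in for a subchannel; not the kernel's struct subchannel. */
struct mock_sch {
	int busy_disables_left;	/* how many disable attempts still fail */
	int chc_calls;		/* calls to mock_cancel_halt_clear() */
	int disable_calls;	/* calls to mock_disable_subchannel() */
};

/*
 * Assumed, simplified behaviour: report -EBUSY while retries remain
 * and -EIO once the counter is exhausted.
 */
static int mock_cancel_halt_clear(struct mock_sch *sch, int *iretry)
{
	sch->chc_calls++;
	if (*iretry == 0)
		return -EIO;
	(*iretry)--;
	return -EBUSY;
}

/* Assumed behaviour: fail with -EBUSY a fixed number of times, then succeed. */
static int mock_disable_subchannel(struct mock_sch *sch)
{
	sch->disable_calls++;
	if (sch->busy_disables_left > 0) {
		sch->busy_disables_left--;
		return -EBUSY;
	}
	return 0;
}

/* Mirrors the loop structure of the patched code, with mocks only. */
static int mock_quiesce(struct mock_sch *sch)
{
	int iretry = 255;	/* set once, outside the loop */
	int ret;

	do {
		ret = mock_cancel_halt_clear(sch, &iretry);
		if (ret == -EIO)
			break;	/* give up instead of retrying forever */

		/* The driver waits for the interrupt and flushes queued I/O here. */

		ret = mock_disable_subchannel(sch);
	} while (ret == -EBUSY);

	return ret;
}
```

In this model, a subchannel that becomes idle after a few passes is disabled and `mock_quiesce()` returns 0, while one that never becomes idle makes the function return -EIO once the retry counter runs out.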