Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

USB: core: Fix bug caused by duplicate interface PM usage counter

The syzkaller fuzzer reported a bug in the USB hub driver which turned
out to be caused by a negative runtime-PM usage counter. This allowed
a hub to be runtime suspended at a time when the driver did not expect
it. The symptom is a WARNING issued because the hub's status URB is
submitted while it is already active:

URB 0000000031fb463e submitted while active
WARNING: CPU: 0 PID: 2917 at drivers/usb/core/urb.c:363

The negative runtime-PM usage count was caused by an unfortunate
design decision made when runtime PM was first implemented for USB.
At that time, USB class drivers were allowed to unbind from their
interfaces without balancing the usage counter (i.e., leaving it with
a positive count). The core code would take care of setting the
counter back to 0 before allowing another driver to bind to the
interface.

Later on when runtime PM was implemented for the entire kernel, the
opposite decision was made: Drivers were required to balance their
runtime-PM get and put calls. In order to maintain backward
compatibility, however, the USB subsystem adapted to the new
implementation by keeping an independent usage counter for each
interface and using it to automatically adjust the normal usage
counter back to 0 whenever a driver was unbound.

This approach involves duplicating information, but what is worse, it
doesn't work properly in cases where a USB class driver delays
decrementing the usage counter until after the driver's disconnect()
routine has returned and the counter has been adjusted back to 0.
Doing so would cause the usage counter to become negative. There's
even a warning about this in the USB power management documentation!

As it happens, this is exactly what the hub driver does. The
kick_hub_wq() routine increments the runtime-PM usage counter, and the
corresponding decrement is carried out by hub_event() in the context
of the hub_wq work-queue thread. This work routine may sometimes run
after the driver has been unbound from its interface, and when it does
it causes the usage counter to go negative.
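
The sequence described above can be reproduced in a small user-space model. This is only a sketch, not kernel code: the model_* names and struct model_intf are hypothetical stand-ins, with usage_count playing the role of intf->dev.power.usage_count and pm_usage_cnt the role of the duplicate per-interface counter.

```c
#include <assert.h>

/* Toy user-space model; every model_* name here is hypothetical. */
struct model_intf {
	int usage_count;   /* stands in for intf->dev.power.usage_count */
	int pm_usage_cnt;  /* stands in for the duplicate intf->pm_usage_cnt */
};

/* A "get" bumps both counters, as the old usb_autopm_get_* helpers did. */
static void model_get(struct model_intf *intf)
{
	intf->pm_usage_cnt++;
	intf->usage_count++;
}

/* A "put" drops both counters. */
static void model_put(struct model_intf *intf)
{
	intf->pm_usage_cnt--;
	intf->usage_count--;
}

/* Old unbind path: undo residual "get"s, then zero the duplicate counter. */
static void model_unbind(struct model_intf *intf)
{
	int r;

	for (r = intf->pm_usage_cnt; r > 0; --r)
		model_put(intf);
	intf->pm_usage_cnt = 0;
}

/* Get, unbind, then a late put; returns the final usage_count. */
static int model_late_put(void)
{
	struct model_intf intf = { 0, 0 };

	model_get(&intf);     /* like kick_hub_wq() taking a reference */
	model_unbind(&intf);  /* driver unbound; count forced back to 0 */
	assert(intf.usage_count == 0);
	model_put(&intf);     /* like hub_event() running after the unbind */
	return intf.usage_count;
}
```

In this model, model_late_put() returns -1: the core has already undone the "get" on the driver's behalf, so the driver's own matching "put" is one too many.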

It is not possible for hub_disconnect() to wait for a pending
hub_event() call to finish, because hub_disconnect() is called with
the device lock held and hub_event() acquires that lock. The only
feasible fix is to reverse the original design decision: remove the
duplicate interface-specific usage counter and require USB drivers to
balance their runtime PM gets and puts. As far as I know, all
existing drivers currently do this.
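
For comparison, here is the same kind of user-space sketch with the duplicate counter removed. Again the model_* names are hypothetical and this is an illustration of the rule, not the kernel implementation: unbinding leaves the usage counter alone, so a balanced driver ends at 0 even when its last "put" comes late, while an unbalanced driver leaves a positive count behind.

```c
#include <assert.h>

/* Toy user-space model of the post-fix rule; all names are hypothetical. */
struct model_intf {
	int usage_count;  /* stands in for intf->dev.power.usage_count */
};

static void model_get(struct model_intf *intf)
{
	intf->usage_count++;
}

static void model_put(struct model_intf *intf)
{
	intf->usage_count--;
}

/* After the fix, unbinding no longer adjusts the usage counter. */
static void model_unbind(struct model_intf *intf)
{
	(void)intf;
}

/* Balanced driver: the late put brings the count back to 0. */
static int model_balanced(void)
{
	struct model_intf intf = { 0 };

	model_get(&intf);     /* like kick_hub_wq() */
	model_unbind(&intf);  /* the reference is still held */
	assert(intf.usage_count == 1);
	model_put(&intf);     /* like hub_event() running after the unbind */
	return intf.usage_count;
}

/* Unbalanced driver: the leftover "get" survives the unbind. */
static int model_unbalanced(void)
{
	struct model_intf intf = { 0 };

	model_get(&intf);
	model_unbind(&intf);
	return intf.usage_count;
}
```

Here model_balanced() returns 0 and model_unbalanced() returns 1. The second case corresponds to the updated documentation's warning that unbalanced "get"s remain in effect and keep the device out of runtime suspend.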

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: syzbot+7634edaea4d0b341c625@syzkaller.appspotmail.com
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Authored by Alan Stern and committed by Greg Kroah-Hartman
c2b71462 fc834e60

+14 -28
+9 -5
Documentation/driver-api/usb/power-management.rst
···
 then the interface is considered to be idle, and the kernel may
 autosuspend the device.

-Drivers need not be concerned about balancing changes to the usage
-counter; the USB core will undo any remaining "get"s when a driver
-is unbound from its interface. As a corollary, drivers must not call
-any of the ``usb_autopm_*`` functions after their ``disconnect``
-routine has returned.
+Drivers must be careful to balance their overall changes to the usage
+counter. Unbalanced "get"s will remain in effect when a driver is
+unbound from its interface, preventing the device from going into
+runtime suspend should the interface be bound to a driver again. On
+the other hand, drivers are allowed to achieve this balance by calling
+the ``usb_autopm_*`` functions even after their ``disconnect`` routine
+has returned -- say from within a work-queue routine -- provided they
+retain an active reference to the interface (via ``usb_get_intf`` and
+``usb_put_intf``).

 Drivers using the async routines are responsible for their own
 synchronization and mutual exclusion.
-13
drivers/usb/core/driver.c
···
 		pm_runtime_disable(dev);
 	pm_runtime_set_suspended(dev);

-	/* Undo any residual pm_autopm_get_interface_* calls */
-	for (r = atomic_read(&intf->pm_usage_cnt); r > 0; --r)
-		usb_autopm_put_interface_no_suspend(intf);
-	atomic_set(&intf->pm_usage_cnt, 0);
-
 	if (!error)
 		usb_autosuspend_device(udev);

···
 	int status;

 	usb_mark_last_busy(udev);
-	atomic_dec(&intf->pm_usage_cnt);
 	status = pm_runtime_put_sync(&intf->dev);
 	dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n",
 			__func__, atomic_read(&intf->dev.power.usage_count),
···
 	int status;

 	usb_mark_last_busy(udev);
-	atomic_dec(&intf->pm_usage_cnt);
 	status = pm_runtime_put(&intf->dev);
 	dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n",
 			__func__, atomic_read(&intf->dev.power.usage_count),
···
 	struct usb_device *udev = interface_to_usbdev(intf);

 	usb_mark_last_busy(udev);
-	atomic_dec(&intf->pm_usage_cnt);
 	pm_runtime_put_noidle(&intf->dev);
 }
 EXPORT_SYMBOL_GPL(usb_autopm_put_interface_no_suspend);
···
 	status = pm_runtime_get_sync(&intf->dev);
 	if (status < 0)
 		pm_runtime_put_sync(&intf->dev);
-	else
-		atomic_inc(&intf->pm_usage_cnt);
 	dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n",
 			__func__, atomic_read(&intf->dev.power.usage_count),
 			status);
···
 	status = pm_runtime_get(&intf->dev);
 	if (status < 0 && status != -EINPROGRESS)
 		pm_runtime_put_noidle(&intf->dev);
-	else
-		atomic_inc(&intf->pm_usage_cnt);
 	dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n",
 			__func__, atomic_read(&intf->dev.power.usage_count),
 			status);
···
 	struct usb_device *udev = interface_to_usbdev(intf);

 	usb_mark_last_busy(udev);
-	atomic_inc(&intf->pm_usage_cnt);
 	pm_runtime_get_noresume(&intf->dev);
 }
 EXPORT_SYMBOL_GPL(usb_autopm_get_interface_no_resume);
+5 -8
drivers/usb/storage/realtek_cr.c
···
 		break;
 	case RTS51X_STAT_IDLE:
 	case RTS51X_STAT_SS:
-		usb_stor_dbg(us, "RTS51X_STAT_SS, intf->pm_usage_cnt:%d, power.usage:%d\n",
-			     atomic_read(&us->pusb_intf->pm_usage_cnt),
+		usb_stor_dbg(us, "RTS51X_STAT_SS, power.usage:%d\n",
 			     atomic_read(&us->pusb_intf->dev.power.usage_count));

-		if (atomic_read(&us->pusb_intf->pm_usage_cnt) > 0) {
+		if (atomic_read(&us->pusb_intf->dev.power.usage_count) > 0) {
 			usb_stor_dbg(us, "Ready to enter SS state\n");
 			rts51x_set_stat(chip, RTS51X_STAT_SS);
 			/* ignore mass storage interface's children */
 			pm_suspend_ignore_children(&us->pusb_intf->dev, true);
 			usb_autopm_put_interface_async(us->pusb_intf);
-			usb_stor_dbg(us, "RTS51X_STAT_SS 01, intf->pm_usage_cnt:%d, power.usage:%d\n",
-				     atomic_read(&us->pusb_intf->pm_usage_cnt),
+			usb_stor_dbg(us, "RTS51X_STAT_SS 01, power.usage:%d\n",
 				     atomic_read(&us->pusb_intf->dev.power.usage_count));
 		}
 		break;
···
 	int ret;

 	if (working_scsi(srb)) {
-		usb_stor_dbg(us, "working scsi, intf->pm_usage_cnt:%d, power.usage:%d\n",
-			     atomic_read(&us->pusb_intf->pm_usage_cnt),
+		usb_stor_dbg(us, "working scsi, power.usage:%d\n",
 			     atomic_read(&us->pusb_intf->dev.power.usage_count));

-		if (atomic_read(&us->pusb_intf->pm_usage_cnt) <= 0) {
+		if (atomic_read(&us->pusb_intf->dev.power.usage_count) <= 0) {
 			ret = usb_autopm_get_interface(us->pusb_intf);
 			usb_stor_dbg(us, "working scsi, ret=%d\n", ret);
 		}
-2
include/linux/usb.h
···
  * @dev: driver model's view of this device
  * @usb_dev: if an interface is bound to the USB major, this will point
  *	to the sysfs representation for that device.
- * @pm_usage_cnt: PM usage counter for this interface
  * @reset_ws: Used for scheduling resets from atomic context.
  * @resetting_device: USB core reset the device, so use alt setting 0 as
  *	current; needs bandwidth alloc after reset.
···

 	struct device dev;		/* interface specific device info */
 	struct device *usb_dev;
-	atomic_t pm_usage_cnt;		/* usage counter for autosuspend */
 	struct work_struct reset_ws;	/* for resets in atomic context */
 };
 #define to_usb_interface(d) container_of(d, struct usb_interface, dev)