Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags

The motivation for this change is to provide a way to work around
a problem with the direct-complete mechanism used for avoiding
system suspend/resume handling for devices in runtime suspend.

The problem is that some middle layer code (the PCI bus type and
the ACPI PM domain in particular) returns positive values from its
system suspend ->prepare callbacks regardless of whether the driver's
->prepare returns a positive value or 0, which effectively prevents
drivers from being able to control the direct-complete feature.
Some drivers need that control, however, and the PCI bus type has
grown its own flag to deal with this issue, but since it is not
limited to PCI, it is better to address it by adding driver flags at
the core level.

To that end, add a driver_flags field to struct dev_pm_info for flags
that can be set by device drivers at probe time to inform the PM
core and/or bus types, PM domains and so on about the capabilities
and/or preferences of device drivers.  Also add two static inline
helpers for setting that field and testing it against a given set of
flags, and make the driver core clear it automatically on driver
remove and probe failures.
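
As a rough illustration of the helpers described above, the following user-space sketch models the new field and the two accessors. The struct definitions are simplified stand-ins (not the kernel's real struct device), and example_probe() and example_core_cleanup() are hypothetical names used only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirror the BIT(0)/BIT(1) definitions this patch adds to pm.h. */
#define DPM_FLAG_NEVER_SKIP	(1u << 0)
#define DPM_FLAG_SMART_PREPARE	(1u << 1)

/* Simplified stand-ins for the kernel structures (assumption). */
struct dev_pm_info {
	uint32_t driver_flags;
};

struct device {
	struct dev_pm_info power;
};

/* Overwrites the whole field; it does not OR new bits into it. */
static inline void dev_pm_set_driver_flags(struct device *dev, uint32_t flags)
{
	dev->power.driver_flags = flags;
}

/* True if at least one of the given flags is set. */
static inline bool dev_pm_test_driver_flags(struct device *dev, uint32_t flags)
{
	return !!(dev->power.driver_flags & flags);
}

/* Hypothetical driver probe: sets its flags once, at probe time. */
static int example_probe(struct device *dev)
{
	dev_pm_set_driver_flags(dev, DPM_FLAG_SMART_PREPARE);
	return 0;
}

/* Hypothetical model of what the driver core does on remove/probe failure. */
static void example_core_cleanup(struct device *dev)
{
	dev_pm_set_driver_flags(dev, 0);
}
```

Note that the test helper reports true when any of the requested flags is set, so passing a mask of several flags checks for "at least one", not "all".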

Define and document two PM driver flags related to the
direct-complete feature, NEVER_SKIP and SMART_PREPARE.  NEVER_SKIP
tells the PM core that the direct-complete mechanism should never
be used for the device.  SMART_PREPARE tells the middle layer code
(bus types, PM domains etc) that it can only request the PM core to
use the direct-complete mechanism for the device (by returning a
positive value from its ->prepare callback) if the driver has
requested it too.
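
The SMART_PREPARE rule can be sketched as a small user-space model of a middle-layer ->prepare callback, loosely following the shape of the pci_pm_prepare() change in this patch. The function model_bus_prepare() and its parameters are hypothetical: bus_wants_skip stands in for whatever the middle layer itself decides (for PCI, the result of pci_dev_keep_suspended()).

```c
#include <stdbool.h>
#include <stdint.h>

/* Value mirrors the BIT(1) definition this patch adds to pm.h. */
#define DPM_FLAG_SMART_PREPARE	(1u << 1)

/*
 * Hypothetical model of a middle-layer ->prepare callback.
 *
 * driver_flags   - the device's power.driver_flags value
 * has_prepare    - whether the driver provides a ->prepare callback
 * driver_ret     - what the driver's ->prepare returned (ignored if absent)
 * bus_wants_skip - the middle layer's own wish to use direct-complete
 */
static int model_bus_prepare(uint32_t driver_flags, bool has_prepare,
			     int driver_ret, bool bus_wants_skip)
{
	if (has_prepare) {
		/* Errors from the driver are always propagated. */
		if (driver_ret < 0)
			return driver_ret;

		/* With SMART_PREPARE, a 0 from the driver vetoes the skip. */
		if (!driver_ret && (driver_flags & DPM_FLAG_SMART_PREPARE))
			return 0;
	}

	/* Otherwise the middle layer decides on its own. */
	return bus_wants_skip ? 1 : 0;
}
```

Without the flag, a driver returning 0 is overridden by the middle layer, which is the behavior the commit message describes as the problem.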

While at it, make the core check pm_runtime_suspended() when
setting power.direct_complete so that it doesn't need to be
checked by ->prepare callbacks.
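
The resulting core-side condition can be written out as a standalone predicate. This is only a user-space sketch of the expression assigned to power.direct_complete in the main.c hunk; model_direct_complete() and its boolean parameters are hypothetical stand-ins for state.event == PM_EVENT_SUSPEND and pm_runtime_suspended(dev).

```c
#include <stdbool.h>
#include <stdint.h>

/* Value mirrors the BIT(0) definition this patch adds to pm.h. */
#define DPM_FLAG_NEVER_SKIP	(1u << 0)

/*
 * Hypothetical model of the core's decision after ->prepare has run.
 *
 * is_suspend_event  - stands in for state.event == PM_EVENT_SUSPEND
 * runtime_suspended - stands in for pm_runtime_suspended(dev)
 * prepare_ret       - return value of the ->prepare callback that was run
 * driver_flags      - the device's power.driver_flags value
 */
static bool model_direct_complete(bool is_suspend_event, bool runtime_suspended,
				  int prepare_ret, uint32_t driver_flags)
{
	return is_suspend_event && runtime_suspended && prepare_ret > 0 &&
	       !(driver_flags & DPM_FLAG_NEVER_SKIP);
}
```

All four conditions must hold, so NEVER_SKIP wins even when ->prepare returned a positive value for a runtime-suspended device.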

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

+81 -6
+14
Documentation/driver-api/pm/devices.rst
···
 	is because all such devices are initially set to runtime-suspended with
 	runtime PM disabled.

+	This feature also can be controlled by device drivers by using the
+	``DPM_FLAG_NEVER_SKIP`` and ``DPM_FLAG_SMART_PREPARE`` driver power
+	management flags.  [Typically, they are set at the time the driver is
+	probed against the device in question by passing them to the
+	:c:func:`dev_pm_set_driver_flags` helper function.]  If the first of
+	these flags is set, the PM core will not apply the direct-complete
+	procedure described above to the given device and, consequently, to any
+	of its ancestors.  The second flag, when set, informs the middle layer
+	code (bus types, device types, PM domains, classes) that it should take
+	the return value of the ``->prepare`` callback provided by the driver
+	into account and it may only return a positive value from its own
+	``->prepare`` callback if the driver's one also has returned a positive
+	value.
+
     2.	The ``->suspend`` methods should quiesce the device to stop it from
 	performing I/O.  They also may save the device registers and put it into
 	the appropriate low-power state, depending on the bus type the device is
+19
Documentation/power/pci.txt
···
 .suspend(), .freeze(), and .poweroff() members and one resume routine is to
 be pointed to by the .resume(), .thaw(), and .restore() members.

+3.1.19. Driver Flags for Power Management
+
+The PM core allows device drivers to set flags that influence the handling of
+power management for the devices by the core itself and by middle layer code
+including the PCI bus type.  The flags should be set once at the driver probe
+time with the help of the dev_pm_set_driver_flags() function and they should not
+be updated directly afterwards.
+
+The DPM_FLAG_NEVER_SKIP flag prevents the PM core from using the direct-complete
+mechanism allowing device suspend/resume callbacks to be skipped if the device
+is in runtime suspend when the system suspend starts.  That also affects all of
+the ancestors of the device, so this flag should only be used if absolutely
+necessary.
+
+The DPM_FLAG_SMART_PREPARE flag instructs the PCI bus type to only return a
+positive value from pci_pm_prepare() if the ->prepare callback provided by the
+driver of the device returns a positive value.  That allows the driver to opt
+out from using the direct-complete mechanism dynamically.
+
 3.2. Device Runtime Power Management
 ------------------------------------
 In addition to providing device power management callbacks PCI device drivers
+9 -4
drivers/acpi/device_pm.c
···
 int acpi_subsys_prepare(struct device *dev)
 {
 	struct acpi_device *adev = ACPI_COMPANION(dev);
-	int ret;

-	ret = pm_generic_prepare(dev);
-	if (ret < 0)
-		return ret;
+	if (dev->driver && dev->driver->pm && dev->driver->pm->prepare) {
+		int ret = dev->driver->pm->prepare(dev);
+
+		if (ret < 0)
+			return ret;
+
+		if (!ret && dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_PREPARE))
+			return 0;
+	}

 	if (!adev || !pm_runtime_suspended(dev))
 		return 0;
+2
drivers/base/dd.c
···
 	if (dev->pm_domain && dev->pm_domain->dismiss)
 		dev->pm_domain->dismiss(dev);
 	pm_runtime_reinit(dev);
+	dev_pm_set_driver_flags(dev, 0);

 	switch (ret) {
 	case -EPROBE_DEFER:
···
 		if (dev->pm_domain && dev->pm_domain->dismiss)
 			dev->pm_domain->dismiss(dev);
 		pm_runtime_reinit(dev);
+		dev_pm_set_driver_flags(dev, 0);

 		klist_remove(&dev->p->knode_driver);
 		device_pm_check_callbacks(dev);
+3 -1
drivers/base/power/main.c
···
 	 * applies to suspend transitions, however.
 	 */
 	spin_lock_irq(&dev->power.lock);
-	dev->power.direct_complete = ret > 0 && state.event == PM_EVENT_SUSPEND;
+	dev->power.direct_complete = state.event == PM_EVENT_SUSPEND &&
+		pm_runtime_suspended(dev) && ret > 0 &&
+		!dev_pm_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP);
 	spin_unlock_irq(&dev->power.lock);
 	return 0;
 }
+4 -1
drivers/pci/pci-driver.c
···

 	if (drv && drv->pm && drv->pm->prepare) {
 		int error = drv->pm->prepare(dev);
-		if (error)
+		if (error < 0)
 			return error;
+
+		if (!error && dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_PREPARE))
+			return 0;
 	}
 	return pci_dev_keep_suspended(to_pci_dev(dev));
 }
+10
include/linux/device.h
···
 #endif
 }

+static inline void dev_pm_set_driver_flags(struct device *dev, u32 flags)
+{
+	dev->power.driver_flags = flags;
+}
+
+static inline bool dev_pm_test_driver_flags(struct device *dev, u32 flags)
+{
+	return !!(dev->power.driver_flags & flags);
+}
+
 static inline void device_lock(struct device *dev)
 {
 	mutex_lock(&dev->mutex);
+20
include/linux/pm.h
···
 #endif
 };

+/*
+ * Driver flags to control system suspend/resume behavior.
+ *
+ * These flags can be set by device drivers at the probe time.  They need not be
+ * cleared by the drivers as the driver core will take care of that.
+ *
+ * NEVER_SKIP: Do not skip system suspend/resume callbacks for the device.
+ * SMART_PREPARE: Check the return value of the driver's ->prepare callback.
+ *
+ * Setting SMART_PREPARE instructs bus types and PM domains which may want
+ * system suspend/resume callbacks to be skipped for the device to return 0 from
+ * their ->prepare callbacks if the driver's ->prepare callback returns 0 (in
+ * other words, the system suspend/resume callbacks can only be skipped for the
+ * device if its driver doesn't object against that).  This flag has no effect
+ * if NEVER_SKIP is set.
+ */
+#define DPM_FLAG_NEVER_SKIP	BIT(0)
+#define DPM_FLAG_SMART_PREPARE	BIT(1)
+
 struct dev_pm_info {
 	pm_message_t		power_state;
 	unsigned int		can_wakeup:1;
···
 	bool			is_late_suspended:1;
 	bool			early_init:1;	/* Owned by the PM core */
 	bool			direct_complete:1;	/* Owned by the PM core */
+	u32			driver_flags;
 	spinlock_t		lock;
 #ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;