Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'pm-4.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management updates from Rafael Wysocki:
"These revert a recent PM core change that introduced a regression, fix
the build when the recently added Kryo cpufreq driver is selected, add
support for devices attached to multiple power domains to the generic
power domains (genpd) framework, add support for iowait boosting on
systems with hardware-managed P-states (HWP) enabled to the
intel_pstate driver, modify the behavior of the wakeup_count device
attribute in sysfs, fix a few issues and clean up some ugliness,
mostly in cpufreq (core and drivers) and in the cpupower utility.

Specifics:

- Revert a recent PM core change that attempted to fix an issue
related to device links, but introduced a regression (Rafael
Wysocki)

- Fix build when the recently added cpufreq driver for Kryo
processors is selected by making it possible to build that driver
as a module (Arnd Bergmann)

- Fix the long idle detection mechanism in the common code of the
ondemand and conservative cpufreq governors (Chen Yu)

- Add support for devices in multiple power domains to the generic
power domains (genpd) framework (Ulf Hansson)

- Add support for iowait boosting on systems with hardware-managed
P-states (HWP) enabled to the intel_pstate driver and make it use
that feature on systems with Skylake Xeon processors as it is
reported to improve performance significantly on those systems
(Srinivas Pandruvada)

- Fix and update the acpi_cpufreq, ti-cpufreq and imx6q cpufreq
drivers (Colin Ian King, Suman Anna, Sébastien Szymanski)

- Change the behavior of the wakeup_count device attribute in sysfs
to expose the number of wakeup events that might have aborted an
in-progress system suspend (Ravi Chandra Sadineni)

- Fix two minor issues in the cpupower utility (Abhishek Goel, Colin
Ian King)"

* tag 'pm-4.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "PM / runtime: Fixup reference counting of device link suppliers at probe"
cpufreq: imx6q: check speed grades for i.MX6ULL
cpufreq: governors: Fix long idle detection logic in load calculation
cpufreq: intel_pstate: enable boost for Skylake Xeon
PM / wakeup: Export wakeup_count instead of event_count via sysfs
PM / Domains: Add dev_pm_domain_attach_by_id() to manage multi PM domains
PM / Domains: Add support for multi PM domains per device to genpd
PM / Domains: Split genpd_dev_pm_attach()
PM / Domains: Don't attach devices in genpd with multi PM domains
PM / Domains: dt: Allow power-domain property to be a list of specifiers
cpufreq: intel_pstate: New sysfs entry to control HWP boost
cpufreq: intel_pstate: HWP boost performance on IO wakeup
cpufreq: intel_pstate: Add HWP boost utility and sched util hooks
cpufreq: ti-cpufreq: Use devres managed API in probe()
cpufreq: ti-cpufreq: Fix an incorrect error return value
cpufreq: ACPI: make function acpi_cpufreq_fast_switch() static
cpufreq: kryo: allow building as a loadable module
cpupower : Fix header name to read idle state name
cpupower: fix spelling mistake: "logilename" -> "logfilename"

+468 -75
+14 -5
Documentation/devicetree/bindings/power/power_domain.txt
···
 ==PM domain consumers==
 
 Required properties:
- - power-domains : A phandle and PM domain specifier as defined by bindings of
-   the power controller specified by phandle.
+ - power-domains : A list of PM domain specifiers, as defined by bindings of
+   the power controller that is the PM domain provider.
 
 Example:
 
···
 		power-domains = <&power 0>;
 	};
 
-The node above defines a typical PM domain consumer device, which is located
-inside a PM domain with index 0 of a power controller represented by a node
-with the label "power".
+	leaky-device@12351000 {
+		compatible = "foo,i-leak-current";
+		reg = <0x12351000 0x1000>;
+		power-domains = <&power 0>, <&power 1> ;
+	};
+
+The first example above defines a typical PM domain consumer device, which is
+located inside a PM domain with index 0 of a power controller represented by a
+node with the label "power".
+In the second example the consumer device are partitioned across two PM domains,
+the first with index 0 and the second with index 1, of a power controller that
+is represented by a node with the label "power.
 
 Optional properties:
 - required-opps: This contains phandle to an OPP node in another device's OPP
+2 -1
drivers/base/dd.c
···
 	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
 		 drv->bus->name, __func__, dev_name(dev), drv->name);
 
-	pm_runtime_resume_suppliers(dev);
+	pm_runtime_get_suppliers(dev);
 	if (dev->parent)
 		pm_runtime_get_sync(dev->parent);
 
···
 	if (dev->parent)
 		pm_runtime_put(dev->parent);
 
+	pm_runtime_put_suppliers(dev);
 	return ret;
 }
 
+40 -3
drivers/base/power/common.c
···
 EXPORT_SYMBOL_GPL(dev_pm_domain_attach);
 
 /**
+ * dev_pm_domain_attach_by_id - Associate a device with one of its PM domains.
+ * @dev: The device used to lookup the PM domain.
+ * @index: The index of the PM domain.
+ *
+ * As @dev may only be attached to a single PM domain, the backend PM domain
+ * provider creates a virtual device to attach instead. If attachment succeeds,
+ * the ->detach() callback in the struct dev_pm_domain are assigned by the
+ * corresponding backend attach function, as to deal with detaching of the
+ * created virtual device.
+ *
+ * This function should typically be invoked by a driver during the probe phase,
+ * in case its device requires power management through multiple PM domains. The
+ * driver may benefit from using the received device, to configure device-links
+ * towards its original device. Depending on the use-case and if needed, the
+ * links may be dynamically changed by the driver, which allows it to control
+ * the power to the PM domains independently from each other.
+ *
+ * Callers must ensure proper synchronization of this function with power
+ * management callbacks.
+ *
+ * Returns the virtual created device when successfully attached to its PM
+ * domain, NULL in case @dev don't need a PM domain, else an ERR_PTR().
+ * Note that, to detach the returned virtual device, the driver shall call
+ * dev_pm_domain_detach() on it, typically during the remove phase.
+ */
+struct device *dev_pm_domain_attach_by_id(struct device *dev,
+					  unsigned int index)
+{
+	if (dev->pm_domain)
+		return ERR_PTR(-EEXIST);
+
+	return genpd_dev_pm_attach_by_id(dev, index);
+}
+EXPORT_SYMBOL_GPL(dev_pm_domain_attach_by_id);
+
+/**
  * dev_pm_domain_detach - Detach a device from its PM domain.
  * @dev: Device to detach.
  * @power_off: Used to indicate whether we should power off the device.
  *
- * This functions will reverse the actions from dev_pm_domain_attach() and thus
- * try to detach the @dev from its PM domain. Typically it should be invoked
- * from subsystem level code during the remove phase.
+ * This functions will reverse the actions from dev_pm_domain_attach() and
+ * dev_pm_domain_attach_by_id(), thus it detaches @dev from its PM domain.
+ * Typically it should be invoked during the remove phase, either from
+ * subsystem level code or from drivers.
  *
  * Callers must ensure proper synchronization of this function with power
  * management callbacks.
+114 -20
drivers/base/power/domain.c
···
 }
 EXPORT_SYMBOL_GPL(of_genpd_remove_last);
 
+static void genpd_release_dev(struct device *dev)
+{
+	kfree(dev);
+}
+
+static struct bus_type genpd_bus_type = {
+	.name		= "genpd",
+};
+
 /**
  * genpd_dev_pm_detach - Detach a device from its PM domain.
  * @dev: Device to detach.
···
 
 	/* Check if PM domain can be powered off after removing this device. */
 	genpd_queue_power_off_work(pd);
+
+	/* Unregister the device if it was created by genpd. */
+	if (dev->bus == &genpd_bus_type)
+		device_unregister(dev);
 }
 
 static void genpd_dev_pm_sync(struct device *dev)
···
 	genpd_queue_power_off_work(pd);
 }
 
-/**
- * genpd_dev_pm_attach - Attach a device to its PM domain using DT.
- * @dev: Device to attach.
- *
- * Parse device's OF node to find a PM domain specifier. If such is found,
- * attaches the device to retrieved pm_domain ops.
- *
- * Returns 1 on successfully attached PM domain, 0 when the device don't need a
- * PM domain or a negative error code in case of failures. Note that if a
- * power-domain exists for the device, but it cannot be found or turned on,
- * then return -EPROBE_DEFER to ensure that the device is not probed and to
- * re-try again later.
- */
-int genpd_dev_pm_attach(struct device *dev)
+static int __genpd_dev_pm_attach(struct device *dev, struct device_node *np,
+				 unsigned int index)
 {
 	struct of_phandle_args pd_args;
 	struct generic_pm_domain *pd;
 	int ret;
 
-	if (!dev->of_node)
-		return 0;
-
-	ret = of_parse_phandle_with_args(dev->of_node, "power-domains",
-					 "#power-domain-cells", 0, &pd_args);
+	ret = of_parse_phandle_with_args(np, "power-domains",
+					 "#power-domain-cells", index, &pd_args);
 	if (ret < 0)
-		return 0;
+		return ret;
 
 	mutex_lock(&gpd_list_lock);
 	pd = genpd_get_from_provider(&pd_args);
···
 
 	return ret ? -EPROBE_DEFER : 1;
 }
+
+/**
+ * genpd_dev_pm_attach - Attach a device to its PM domain using DT.
+ * @dev: Device to attach.
+ *
+ * Parse device's OF node to find a PM domain specifier. If such is found,
+ * attaches the device to retrieved pm_domain ops.
+ *
+ * Returns 1 on successfully attached PM domain, 0 when the device don't need a
+ * PM domain or when multiple power-domains exists for it, else a negative error
+ * code. Note that if a power-domain exists for the device, but it cannot be
+ * found or turned on, then return -EPROBE_DEFER to ensure that the device is
+ * not probed and to re-try again later.
+ */
+int genpd_dev_pm_attach(struct device *dev)
+{
+	if (!dev->of_node)
+		return 0;
+
+	/*
+	 * Devices with multiple PM domains must be attached separately, as we
+	 * can only attach one PM domain per device.
+	 */
+	if (of_count_phandle_with_args(dev->of_node, "power-domains",
+				       "#power-domain-cells") != 1)
+		return 0;
+
+	return __genpd_dev_pm_attach(dev, dev->of_node, 0);
+}
 EXPORT_SYMBOL_GPL(genpd_dev_pm_attach);
+
+/**
+ * genpd_dev_pm_attach_by_id - Associate a device with one of its PM domains.
+ * @dev: The device used to lookup the PM domain.
+ * @index: The index of the PM domain.
+ *
+ * Parse device's OF node to find a PM domain specifier at the provided @index.
+ * If such is found, creates a virtual device and attaches it to the retrieved
+ * pm_domain ops. To deal with detaching of the virtual device, the ->detach()
+ * callback in the struct dev_pm_domain are assigned to genpd_dev_pm_detach().
+ *
+ * Returns the created virtual device if successfully attached PM domain, NULL
+ * when the device don't need a PM domain, else an ERR_PTR() in case of
+ * failures. If a power-domain exists for the device, but cannot be found or
+ * turned on, then ERR_PTR(-EPROBE_DEFER) is returned to ensure that the device
+ * is not probed and to re-try again later.
+ */
+struct device *genpd_dev_pm_attach_by_id(struct device *dev,
+					 unsigned int index)
+{
+	struct device *genpd_dev;
+	int num_domains;
+	int ret;
+
+	if (!dev->of_node)
+		return NULL;
+
+	/* Deal only with devices using multiple PM domains. */
+	num_domains = of_count_phandle_with_args(dev->of_node, "power-domains",
+						 "#power-domain-cells");
+	if (num_domains < 2 || index >= num_domains)
+		return NULL;
+
+	/* Allocate and register device on the genpd bus. */
+	genpd_dev = kzalloc(sizeof(*genpd_dev), GFP_KERNEL);
+	if (!genpd_dev)
+		return ERR_PTR(-ENOMEM);
+
+	dev_set_name(genpd_dev, "genpd:%u:%s", index, dev_name(dev));
+	genpd_dev->bus = &genpd_bus_type;
+	genpd_dev->release = genpd_release_dev;
+
+	ret = device_register(genpd_dev);
+	if (ret) {
+		kfree(genpd_dev);
+		return ERR_PTR(ret);
+	}
+
+	/* Try to attach the device to the PM domain at the specified index. */
+	ret = __genpd_dev_pm_attach(genpd_dev, dev->of_node, index);
+	if (ret < 1) {
+		device_unregister(genpd_dev);
+		return ret ? ERR_PTR(ret) : NULL;
+	}
+
+	pm_runtime_set_active(genpd_dev);
+	pm_runtime_enable(genpd_dev);
+
+	return genpd_dev;
+}
+EXPORT_SYMBOL_GPL(genpd_dev_pm_attach_by_id);
 
 static const struct of_device_id idle_state_match[] = {
 	{ .compatible = "domain-idle-state", },
···
 	return state;
 }
 EXPORT_SYMBOL_GPL(of_genpd_opp_to_performance_state);
+
+static int __init genpd_bus_init(void)
+{
+	return bus_register(&genpd_bus_type);
+}
+core_initcall(genpd_bus_init);
 
 #endif /* CONFIG_PM_GENERIC_DOMAINS_OF */
 
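For driver authors, the kerneldoc above suggests the following usage pattern. This is an illustrative, pseudocode-level sketch only, not code from this series: the driver name, the two-domain layout, and the choice of device-link flags are hypothetical, and error handling is reduced to the essentials.

```
/* Hypothetical driver whose device lists two entries in its DT
 * power-domains property (index 0 and index 1). */
static int foo_probe(struct platform_device *pdev)
{
	struct device *pd0, *pd1;

	/* Each call returns a virtual device attached to one PM domain. */
	pd0 = dev_pm_domain_attach_by_id(&pdev->dev, 0);
	if (IS_ERR(pd0))
		return PTR_ERR(pd0);

	pd1 = dev_pm_domain_attach_by_id(&pdev->dev, 1);
	if (IS_ERR(pd1)) {
		dev_pm_domain_detach(pd0, true);
		return PTR_ERR(pd1);
	}

	/* Optionally tie the domains to this device's runtime PM via
	 * device links to the virtual devices, as the kerneldoc hints. */
	device_link_add(&pdev->dev, pd0,
			DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
	device_link_add(&pdev->dev, pd1,
			DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);

	return 0;
}
```

On remove, the driver would call dev_pm_domain_detach() on each returned virtual device, per the dev_pm_domain_attach_by_id() documentation.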
+24 -3
drivers/base/power/runtime.c
···
 }
 
 /**
- * pm_runtime_resume_suppliers - Resume supplier devices.
+ * pm_runtime_get_suppliers - Resume and reference-count supplier devices.
  * @dev: Consumer device.
  */
-void pm_runtime_resume_suppliers(struct device *dev)
+void pm_runtime_get_suppliers(struct device *dev)
 {
+	struct device_link *link;
 	int idx;
 
 	idx = device_links_read_lock();
 
-	rpm_get_suppliers(dev);
+	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+		if (link->flags & DL_FLAG_PM_RUNTIME)
+			pm_runtime_get_sync(link->supplier);
+
+	device_links_read_unlock(idx);
+}
+
+/**
+ * pm_runtime_put_suppliers - Drop references to supplier devices.
+ * @dev: Consumer device.
+ */
+void pm_runtime_put_suppliers(struct device *dev)
+{
+	struct device_link *link;
+	int idx;
+
+	idx = device_links_read_lock();
+
+	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+		if (link->flags & DL_FLAG_PM_RUNTIME)
+			pm_runtime_put(link->supplier);
 
 	device_links_read_unlock(idx);
 }
+1 -1
drivers/base/power/sysfs.c
···
 
 	spin_lock_irq(&dev->power.lock);
 	if (dev->power.wakeup) {
-		count = dev->power.wakeup->event_count;
+		count = dev->power.wakeup->wakeup_count;
 		enabled = true;
 	}
 	spin_unlock_irq(&dev->power.lock);
+1 -1
drivers/cpufreq/Kconfig.arm
···
 	default ARCH_OMAP2PLUS
 
 config ARM_QCOM_CPUFREQ_KRYO
-	bool "Qualcomm Kryo based CPUFreq"
+	tristate "Qualcomm Kryo based CPUFreq"
 	depends on ARM64
 	depends on QCOM_QFPROM
 	depends on QCOM_SMEM
+2 -2
drivers/cpufreq/acpi-cpufreq.c
···
 	return result;
 }
 
-unsigned int acpi_cpufreq_fast_switch(struct cpufreq_policy *policy,
-				      unsigned int target_freq)
+static unsigned int acpi_cpufreq_fast_switch(struct cpufreq_policy *policy,
+					     unsigned int target_freq)
 {
 	struct acpi_cpufreq_data *data = policy->driver_data;
 	struct acpi_processor_performance *perf;
+5 -7
drivers/cpufreq/cpufreq_governor.c
···
 		 * calls, so the previous load value can be used then.
 		 */
 		load = j_cdbs->prev_load;
-	} else if (unlikely(time_elapsed > 2 * sampling_rate &&
+	} else if (unlikely((int)idle_time > 2 * sampling_rate &&
 			    j_cdbs->prev_load)) {
 		/*
 		 * If the CPU had gone completely idle and a task has
···
 		 * clear prev_load to guarantee that the load will be
 		 * computed again next time.
 		 *
-		 * Detecting this situation is easy: the governor's
-		 * utilization update handler would not have run during
-		 * CPU-idle periods. Hence, an unusually large
-		 * 'time_elapsed' (as compared to the sampling rate)
+		 * Detecting this situation is easy: an unusually large
+		 * 'idle_time' (as compared to the sampling rate)
 		 * indicates this scenario.
 		 */
 		load = j_cdbs->prev_load;
···
 			j_cdbs->prev_load = load;
 	}
 
-	if (time_elapsed > 2 * sampling_rate) {
-		unsigned int periods = time_elapsed / sampling_rate;
+	if (unlikely((int)idle_time > 2 * sampling_rate)) {
+		unsigned int periods = idle_time / sampling_rate;
 
 		if (periods < idle_periods)
 			idle_periods = periods;
+23 -6
drivers/cpufreq/imx6q-cpufreq.c
···
 }
 
 #define OCOTP_CFG3_6UL_SPEED_696MHZ	0x2
+#define OCOTP_CFG3_6ULL_SPEED_792MHZ	0x2
+#define OCOTP_CFG3_6ULL_SPEED_900MHZ	0x3
 
 static void imx6ul_opp_check_speed_grading(struct device *dev)
 {
···
 	 * Speed GRADING[1:0] defines the max speed of ARM:
 	 * 2b'00: Reserved;
 	 * 2b'01: 528000000Hz;
-	 * 2b'10: 696000000Hz;
-	 * 2b'11: Reserved;
+	 * 2b'10: 696000000Hz on i.MX6UL, 792000000Hz on i.MX6ULL;
+	 * 2b'11: 900000000Hz on i.MX6ULL only;
 	 * We need to set the max speed of ARM according to fuse map.
 	 */
 	val = readl_relaxed(base + OCOTP_CFG3);
 	val >>= OCOTP_CFG3_SPEED_SHIFT;
 	val &= 0x3;
-	if (val != OCOTP_CFG3_6UL_SPEED_696MHZ)
-		if (dev_pm_opp_disable(dev, 696000000))
-			dev_warn(dev, "failed to disable 696MHz OPP\n");
+
+	if (of_machine_is_compatible("fsl,imx6ul")) {
+		if (val != OCOTP_CFG3_6UL_SPEED_696MHZ)
+			if (dev_pm_opp_disable(dev, 696000000))
+				dev_warn(dev, "failed to disable 696MHz OPP\n");
+	}
+
+	if (of_machine_is_compatible("fsl,imx6ull")) {
+		if (val != OCOTP_CFG3_6ULL_SPEED_792MHZ)
+			if (dev_pm_opp_disable(dev, 792000000))
+				dev_warn(dev, "failed to disable 792MHz OPP\n");
+
+		if (val != OCOTP_CFG3_6ULL_SPEED_900MHZ)
+			if (dev_pm_opp_disable(dev, 900000000))
+				dev_warn(dev, "failed to disable 900MHz OPP\n");
+	}
+
 	iounmap(base);
 put_node:
 	of_node_put(np);
···
 		goto put_reg;
 	}
 
-	if (of_machine_is_compatible("fsl,imx6ul"))
+	if (of_machine_is_compatible("fsl,imx6ul") ||
+	    of_machine_is_compatible("fsl,imx6ull"))
 		imx6ul_opp_check_speed_grading(cpu_dev);
 	else
 		imx6q_opp_check_speed_grading(cpu_dev);
+176 -3
drivers/cpufreq/intel_pstate.c
···
  *			preference/bias
  * @epp_saved:		Saved EPP/EPB during system suspend or CPU offline
  *			operation
+ * @hwp_req_cached:	Cached value of the last HWP Request MSR
+ * @hwp_cap_cached:	Cached value of the last HWP Capabilities MSR
+ * @last_io_update:	Last time when IO wake flag was set
+ * @sched_flags:	Store scheduler flags for possible cross CPU update
+ * @hwp_boost_min:	Last HWP boosted min performance
  *
  * This structure stores per CPU instance data for all CPUs.
  */
···
 	s16 epp_policy;
 	s16 epp_default;
 	s16 epp_saved;
+	u64 hwp_req_cached;
+	u64 hwp_cap_cached;
+	u64 last_io_update;
+	unsigned int sched_flags;
+	u32 hwp_boost_min;
 };
 
 static struct cpudata **all_cpu_data;
···
 
 static int hwp_active __read_mostly;
 static bool per_cpu_limits __read_mostly;
+static bool hwp_boost __read_mostly;
 
 static struct cpufreq_driver *intel_pstate_driver __read_mostly;
 
···
 	u64 cap;
 
 	rdmsrl_on_cpu(cpu, MSR_HWP_CAPABILITIES, &cap);
+	WRITE_ONCE(all_cpu_data[cpu]->hwp_cap_cached, cap);
 	if (global.no_turbo)
 		*current_max = HWP_GUARANTEED_PERF(cap);
 	else
···
 		intel_pstate_set_epb(cpu, epp);
 	}
 skip_epp:
+	WRITE_ONCE(cpu_data->hwp_req_cached, value);
 	wrmsrl_on_cpu(cpu, MSR_HWP_REQUEST, value);
 }
···
 	return count;
 }
 
+static ssize_t show_hwp_dynamic_boost(struct kobject *kobj,
+				      struct attribute *attr, char *buf)
+{
+	return sprintf(buf, "%u\n", hwp_boost);
+}
+
+static ssize_t store_hwp_dynamic_boost(struct kobject *a, struct attribute *b,
+				       const char *buf, size_t count)
+{
+	unsigned int input;
+	int ret;
+
+	ret = kstrtouint(buf, 10, &input);
+	if (ret)
+		return ret;
+
+	mutex_lock(&intel_pstate_driver_lock);
+	hwp_boost = !!input;
+	intel_pstate_update_policies();
+	mutex_unlock(&intel_pstate_driver_lock);
+
+	return count;
+}
+
 show_one(max_perf_pct, max_perf_pct);
 show_one(min_perf_pct, min_perf_pct);
···
 define_one_global_rw(min_perf_pct);
 define_one_global_ro(turbo_pct);
 define_one_global_ro(num_pstates);
+define_one_global_rw(hwp_dynamic_boost);
 
 static struct attribute *intel_pstate_attributes[] = {
 	&status.attr,
···
 	rc = sysfs_create_file(intel_pstate_kobject, &min_perf_pct.attr);
 	WARN_ON(rc);
 
+	if (hwp_active) {
+		rc = sysfs_create_file(intel_pstate_kobject,
+				       &hwp_dynamic_boost.attr);
+		WARN_ON(rc);
+	}
 }
 /************************** sysfs end ************************/
···
 		intel_pstate_set_min_pstate(cpu);
 }
 
+/*
+ * Long hold time will keep high perf limits for long time,
+ * which negatively impacts perf/watt for some workloads,
+ * like specpower. 3ms is based on experiements on some
+ * workoads.
+ */
+static int hwp_boost_hold_time_ns = 3 * NSEC_PER_MSEC;
+
+static inline void intel_pstate_hwp_boost_up(struct cpudata *cpu)
+{
+	u64 hwp_req = READ_ONCE(cpu->hwp_req_cached);
+	u32 max_limit = (hwp_req & 0xff00) >> 8;
+	u32 min_limit = (hwp_req & 0xff);
+	u32 boost_level1;
+
+	/*
+	 * Cases to consider (User changes via sysfs or boot time):
+	 * If, P0 (Turbo max) = P1 (Guaranteed max) = min:
+	 *	No boost, return.
+	 * If, P0 (Turbo max) > P1 (Guaranteed max) = min:
+	 *     Should result in one level boost only for P0.
+	 * If, P0 (Turbo max) = P1 (Guaranteed max) > min:
+	 *     Should result in two level boost:
+	 *         (min + p1)/2 and P1.
+	 * If, P0 (Turbo max) > P1 (Guaranteed max) > min:
+	 *     Should result in three level boost:
+	 *         (min + p1)/2, P1 and P0.
+	 */
+
+	/* If max and min are equal or already at max, nothing to boost */
+	if (max_limit == min_limit || cpu->hwp_boost_min >= max_limit)
+		return;
+
+	if (!cpu->hwp_boost_min)
+		cpu->hwp_boost_min = min_limit;
+
+	/* level at half way mark between min and guranteed */
+	boost_level1 = (HWP_GUARANTEED_PERF(cpu->hwp_cap_cached) + min_limit) >> 1;
+
+	if (cpu->hwp_boost_min < boost_level1)
+		cpu->hwp_boost_min = boost_level1;
+	else if (cpu->hwp_boost_min < HWP_GUARANTEED_PERF(cpu->hwp_cap_cached))
+		cpu->hwp_boost_min = HWP_GUARANTEED_PERF(cpu->hwp_cap_cached);
+	else if (cpu->hwp_boost_min == HWP_GUARANTEED_PERF(cpu->hwp_cap_cached) &&
+		 max_limit != HWP_GUARANTEED_PERF(cpu->hwp_cap_cached))
+		cpu->hwp_boost_min = max_limit;
+	else
+		return;
+
+	hwp_req = (hwp_req & ~GENMASK_ULL(7, 0)) | cpu->hwp_boost_min;
+	wrmsrl(MSR_HWP_REQUEST, hwp_req);
+	cpu->last_update = cpu->sample.time;
+}
+
+static inline void intel_pstate_hwp_boost_down(struct cpudata *cpu)
+{
+	if (cpu->hwp_boost_min) {
+		bool expired;
+
+		/* Check if we are idle for hold time to boost down */
+		expired = time_after64(cpu->sample.time, cpu->last_update +
+				       hwp_boost_hold_time_ns);
+		if (expired) {
+			wrmsrl(MSR_HWP_REQUEST, cpu->hwp_req_cached);
+			cpu->hwp_boost_min = 0;
+		}
+	}
+	cpu->last_update = cpu->sample.time;
+}
+
+static inline void intel_pstate_update_util_hwp_local(struct cpudata *cpu,
+						      u64 time)
+{
+	cpu->sample.time = time;
+
+	if (cpu->sched_flags & SCHED_CPUFREQ_IOWAIT) {
+		bool do_io = false;
+
+		cpu->sched_flags = 0;
+		/*
+		 * Set iowait_boost flag and update time. Since IO WAIT flag
+		 * is set all the time, we can't just conclude that there is
+		 * some IO bound activity is scheduled on this CPU with just
+		 * one occurrence. If we receive at least two in two
+		 * consecutive ticks, then we treat as boost candidate.
+		 */
+		if (time_before64(time, cpu->last_io_update + 2 * TICK_NSEC))
+			do_io = true;
+
+		cpu->last_io_update = time;
+
+		if (do_io)
+			intel_pstate_hwp_boost_up(cpu);
+
+	} else {
+		intel_pstate_hwp_boost_down(cpu);
+	}
+}
+
+static inline void intel_pstate_update_util_hwp(struct update_util_data *data,
+						u64 time, unsigned int flags)
+{
+	struct cpudata *cpu = container_of(data, struct cpudata, update_util);
+
+	cpu->sched_flags |= flags;
+
+	if (smp_processor_id() == cpu->cpu)
+		intel_pstate_update_util_hwp_local(cpu, time);
+}
+
 static inline void intel_pstate_calc_avg_perf(struct cpudata *cpu)
 {
 	struct sample *sample = &cpu->sample;
···
 	{}
 };
 
+static const struct x86_cpu_id intel_pstate_hwp_boost_ids[] = {
+	ICPU(INTEL_FAM6_SKYLAKE_X, core_funcs),
+	ICPU(INTEL_FAM6_SKYLAKE_DESKTOP, core_funcs),
+	{}
+};
+
 static int intel_pstate_init_cpu(unsigned int cpunum)
 {
 	struct cpudata *cpu;
···
 			intel_pstate_disable_ee(cpunum);
 
 		intel_pstate_hwp_enable(cpu);
+
+		id = x86_match_cpu(intel_pstate_hwp_boost_ids);
+		if (id)
+			hwp_boost = true;
 	}
 
 	intel_pstate_get_cpu_pstates(cpu);
···
 {
 	struct cpudata *cpu = all_cpu_data[cpu_num];
 
-	if (hwp_active)
+	if (hwp_active && !hwp_boost)
 		return;
 
 	if (cpu->update_util_set)
···
 	/* Prevent intel_pstate_update_util() from using stale data. */
 	cpu->sample.time = 0;
 	cpufreq_add_update_util_hook(cpu_num, &cpu->update_util,
-				     intel_pstate_update_util);
+				     (hwp_active ?
+				      intel_pstate_update_util_hwp :
+				      intel_pstate_update_util));
 	cpu->update_util_set = true;
 }
···
 		intel_pstate_set_update_util_hook(policy->cpu);
 	}
 
-	if (hwp_active)
+	if (hwp_active) {
+		/*
+		 * When hwp_boost was active before and dynamically it
+		 * was turned off, in that case we need to clear the
+		 * update util hook.
+		 */
+		if (!hwp_boost)
+			intel_pstate_clear_update_util_hook(policy->cpu);
 		intel_pstate_hwp_set(policy->cpu);
+	}
 
 	mutex_unlock(&intel_pstate_limits_lock);
 
+2 -5
drivers/cpufreq/ti-cpufreq.c
···
 	if (!match)
 		return -ENODEV;
 
-	opp_data = kzalloc(sizeof(*opp_data), GFP_KERNEL);
+	opp_data = devm_kzalloc(&pdev->dev, sizeof(*opp_data), GFP_KERNEL);
 	if (!opp_data)
 		return -ENOMEM;
 
···
 	opp_data->cpu_dev = get_cpu_device(0);
 	if (!opp_data->cpu_dev) {
 		pr_err("%s: Failed to get device for CPU0\n", __func__);
-		ret = ENODEV;
-		goto free_opp_data;
+		return -ENODEV;
 	}
 
 	opp_data->opp_node = dev_pm_opp_of_get_opp_desc_node(opp_data->cpu_dev);
···
 
 fail_put_node:
 	of_node_put(opp_data->opp_node);
-free_opp_data:
-	kfree(opp_data);
 
 	return ret;
 }
+15
include/linux/pm_domain.h
···
 					       struct device_node *opp_node);
 
 int genpd_dev_pm_attach(struct device *dev);
+struct device *genpd_dev_pm_attach_by_id(struct device *dev,
+					 unsigned int index);
 #else /* !CONFIG_PM_GENERIC_DOMAINS_OF */
 static inline int of_genpd_add_provider_simple(struct device_node *np,
 					       struct generic_pm_domain *genpd)
···
 	return 0;
 }
 
+static inline struct device *genpd_dev_pm_attach_by_id(struct device *dev,
+						       unsigned int index)
+{
+	return NULL;
+}
+
 static inline
 struct generic_pm_domain *of_genpd_remove_last(struct device_node *np)
 {
···
 
 #ifdef CONFIG_PM
 int dev_pm_domain_attach(struct device *dev, bool power_on);
+struct device *dev_pm_domain_attach_by_id(struct device *dev,
+					  unsigned int index);
 void dev_pm_domain_detach(struct device *dev, bool power_off);
 void dev_pm_domain_set(struct device *dev, struct dev_pm_domain *pd);
 #else
 static inline int dev_pm_domain_attach(struct device *dev, bool power_on)
 {
 	return 0;
+}
+static inline struct device *dev_pm_domain_attach_by_id(struct device *dev,
+							unsigned int index)
+{
+	return NULL;
 }
 static inline void dev_pm_domain_detach(struct device *dev, bool power_off) {}
 static inline void dev_pm_domain_set(struct device *dev,
+4 -2
include/linux/pm_runtime.h
···
 						s64 delta_ns);
 extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);
 extern void pm_runtime_clean_up_links(struct device *dev);
-extern void pm_runtime_resume_suppliers(struct device *dev);
+extern void pm_runtime_get_suppliers(struct device *dev);
+extern void pm_runtime_put_suppliers(struct device *dev);
 extern void pm_runtime_new_link(struct device *dev);
 extern void pm_runtime_drop_link(struct device *dev);
 
···
 static inline void pm_runtime_set_memalloc_noio(struct device *dev,
 						bool enable){}
 static inline void pm_runtime_clean_up_links(struct device *dev) {}
-static inline void pm_runtime_resume_suppliers(struct device *dev) {}
+static inline void pm_runtime_get_suppliers(struct device *dev) {}
+static inline void pm_runtime_put_suppliers(struct device *dev) {}
 static inline void pm_runtime_new_link(struct device *dev) {}
 static inline void pm_runtime_drop_link(struct device *dev) {}
 
+1 -1
tools/power/cpupower/bench/parse.c
···
 			 dirname, time(NULL));
 	}
 
-	dprintf("logilename: %s\n", filename);
+	dprintf("logfilename: %s\n", filename);
 
 	output = fopen(filename, "w+");
 	if (output == NULL) {
+15
tools/power/cpupower/utils/idle_monitor/cpuidle_sysfs.c
···
 	}
 }
 
+#ifdef __powerpc__
+void map_power_idle_state_name(char *tmp)
+{
+	if (!strncmp(tmp, "stop0_lite", CSTATE_NAME_LEN))
+		strcpy(tmp, "stop0L");
+	else if (!strncmp(tmp, "stop1_lite", CSTATE_NAME_LEN))
+		strcpy(tmp, "stop1L");
+	else if (!strncmp(tmp, "stop2_lite", CSTATE_NAME_LEN))
+		strcpy(tmp, "stop2L");
+}
+#else
+void map_power_idle_state_name(char *tmp) { }
+#endif
+
 static struct cpuidle_monitor *cpuidle_register(void)
 {
 	int num;
···
 		if (tmp == NULL)
 			continue;
 
+		map_power_idle_state_name(tmp);
 		fix_up_intel_idle_driver_name(tmp, num);
 		strncpy(cpuidle_cstates[num].name, tmp, CSTATE_NAME_LEN - 1);
 		free(tmp);
+20 -15
tools/power/cpupower/utils/idle_monitor/cpupower-monitor.c
···
 		printf(" ");
 }
 
-/* size of s must be at least n + 1 */
+/*s is filled with left and right spaces
+ *to make its length atleast n+1
+ */
 int fill_string_with_spaces(char *s, int n)
 {
+	char *temp;
 	int len = strlen(s);
-	if (len > n)
+
+	if (len >= n)
 		return -1;
+
+	temp = malloc(sizeof(char) * (n+1));
 	for (; len < n; len++)
 		s[len] = ' ';
 	s[len] = '\0';
+	snprintf(temp, n+1, " %s", s);
+	strcpy(s, temp);
+	free(temp);
 	return 0;
 }
 
+#define MAX_COL_WIDTH 6
 void print_header(int topology_depth)
 {
 	int unsigned mon;
 	int state, need_len;
 	cstate_t s;
 	char buf[128] = "";
-	int percent_width = 4;
 
 	fill_string_with_spaces(buf, topology_depth * 5 - 1);
 	printf("%s|", buf);
 
 	for (mon = 0; mon < avail_monitors; mon++) {
-		need_len = monitors[mon]->hw_states_num * (percent_width + 3)
+		need_len = monitors[mon]->hw_states_num * (MAX_COL_WIDTH + 1)
 			- 1;
-		if (mon != 0) {
-			printf("|| ");
-			need_len--;
-		}
+		if (mon != 0)
+			printf("||");
 		sprintf(buf, "%s", monitors[mon]->name);
 		fill_string_with_spaces(buf, need_len);
 		printf("%s", buf);
···
 	printf("\n");
 
 	if (topology_depth > 2)
-		printf("PKG |");
+		printf(" PKG|");
 	if (topology_depth > 1)
 		printf("CORE|");
 	if (topology_depth > 0)
-		printf("CPU |");
+		printf(" CPU|");
 
 	for (mon = 0; mon < avail_monitors; mon++) {
 		if (mon != 0)
-			printf("|| ");
-		else
-			printf(" ");
+			printf("||");
 		for (state = 0; state < monitors[mon]->hw_states_num; state++) {
 			if (state != 0)
-				printf(" | ");
+				printf("|");
 			s = monitors[mon]->hw_states[state];
 			sprintf(buf, "%s", s.name);
-			fill_string_with_spaces(buf, MAX_COL_WIDTH);
+			fill_string_with_spaces(buf, MAX_COL_WIDTH);
 			printf("%s", buf);
 		}
 		printf(" ");
+9
tools/power/cpupower/utils/idle_monitor/cpupower-monitor.h
···
 
 #define MONITORS_MAX 20
 #define MONITOR_NAME_LEN 20
+
+/* CSTATE_NAME_LEN is limited by header field width defined
+ * in cpupower-monitor.c. Header field width is defined to be
+ * sum of percent width and two spaces for padding.
+ */
+#ifdef __powerpc__
+#define CSTATE_NAME_LEN 7
+#else
 #define CSTATE_NAME_LEN 5
+#endif
 #define CSTATE_DESC_LEN 60
 
 int cpu_count;