Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'pm-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
"From the functional perspective, the most significant change here is
the addition of support for Energy Models that can be updated
dynamically at run time.

There is also the addition of LZ4 compression support for hibernation,
the new preferred core support in amd-pstate, support for new platforms
in the Intel RAPL driver, new model-specific EPP handling in
intel_pstate and more.

Apart from that, the cpufreq default transition delay is reduced from
10 ms to 2 ms (along with some related adjustments), the system
suspend statistics code undergoes a significant rework, and there is
the usual batch of fixes and code cleanups all over.

Specifics:

- Allow the Energy Model to be updated dynamically (Lukasz Luba)

- Add support for LZ4 compression algorithm to the hibernation image
creation and loading code (Nikhil V)

- Fix and clean up system suspend statistics collection (Rafael
Wysocki)

- Simplify device suspend and resume handling in the power management
core code (Rafael Wysocki)

- Fix PCI hibernation support description (Yiwei Lin)

- Make hibernation take set_memory_ro() return values into account as
appropriate (Christophe Leroy)

- Set mem_sleep_current during kernel command line setup to avoid an
ordering issue with handling it (Maulik Shah)

- Fix wake IRQs handling when pm_runtime_force_suspend() is used as a
driver's system suspend callback (Qingliang Li)

- Simplify pm_runtime_get_if_active() usage and add a replacement for
pm_runtime_put_autosuspend() (Sakari Ailus)

- Add a tracepoint for runtime_status changes tracking (Vilas Bhat)

- Fix section title markdown in the runtime PM documentation (Yiwei
Lin)

- Enable preferred core support in the amd-pstate cpufreq driver
(Meng Li)

- Fix min_perf assignment in amd_pstate_adjust_perf() and make the
min/max limit perf values in amd-pstate always stay within the
(highest perf, lowest perf) range (Tor Vic, Meng Li)

- Allow intel_pstate to assign model-specific values to strings used
in the EPP sysfs interface and make it do so on Meteor Lake
(Srinivas Pandruvada)

- Drop long-unused cpudata::prev_cummulative_iowait from the
intel_pstate cpufreq driver (Jiri Slaby)

- Prevent scaling_cur_freq from exceeding scaling_max_freq when the
latter is an inefficient frequency (Shivnandan Kumar)

- Change default transition delay in cpufreq to 2ms (Qais Yousef)

- Remove references to 10ms minimum sampling rate from comments in
the cpufreq code (Pierre Gondois)

- Honour transition_latency over transition_delay_us in cpufreq (Qais
Yousef)

- Stop unregistering cpufreq cooling on CPU hot-remove (Viresh Kumar)

- General enhancements / cleanups to ARM cpufreq drivers (tianyu2,
Nícolas F. R. A. Prado, Erick Archer, Arnd Bergmann, Anastasia
Belova)

- Update cpufreq-dt-platdev to block/approve devices (Richard Acayan)

- Make the SCMI cpufreq driver get a transition delay value from
firmware (Pierre Gondois)

- Prevent the haltpoll cpuidle governor from shrinking guest
poll_limit_ns below grow_start (Parshuram Sangle)

- Avoid potential overflow in integer multiplication when computing
cpuidle state parameters (C Cheng)

- Adjust MWAIT hint target C-state computation in the ACPI cpuidle
driver and in intel_idle to return a correct value for C0 (He
Rongguang)

- Address multiple issues in the TPMI RAPL driver and add support for
new platforms (Lunar Lake-M, Arrow Lake) to Intel RAPL (Zhang Rui)

- Fix freq_qos_add_request() return value check in dtpm_cpu (Daniel
Lezcano)

- Fix kernel-doc for dtpm_create_hierarchy() (Yang Li)

- Fix file leak in get_pkg_num() in x86_energy_perf_policy (Samasth
Norway Ananda)

- Fix cpupower-frequency-info.1 man page typo (Jan Kratochvil)

- Fix a couple of warnings in the OPP core code related to W=1 builds
(Viresh Kumar)

- Move dev_pm_opp_{init|free}_cpufreq_table() to pm_opp.h (Viresh
Kumar)

- Extend dev_pm_opp_data with turbo support (Sibi Sankar)

- dt-bindings: drop maxItems from inner items (David Heidelberg)"

* tag 'pm-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (95 commits)
dt-bindings: opp: drop maxItems from inner items
OPP: debugfs: Fix warning around icc_get_name()
OPP: debugfs: Fix warning with W=1 builds
cpufreq: Move dev_pm_opp_{init|free}_cpufreq_table() to pm_opp.h
OPP: Extend dev_pm_opp_data with turbo support
Fix cpupower-frequency-info.1 man page typo
cpufreq: scmi: Set transition_delay_us
firmware: arm_scmi: Populate fast channel rate_limit
firmware: arm_scmi: Populate perf commands rate_limit
cpuidle: ACPI/intel: fix MWAIT hint target C-state computation
PM: sleep: wakeirq: fix wake irq warning in system suspend
powercap: dtpm: Fix kernel-doc for dtpm_create_hierarchy() function
cpufreq: Don't unregister cpufreq cooling on CPU hotplug
PM: suspend: Set mem_sleep_current during kernel command line setup
cpufreq: Honour transition_latency over transition_delay_us
cpufreq: Limit resolving a frequency to policy min/max
Documentation: PM: Fix runtime_pm.rst markdown syntax
cpufreq: amd-pstate: adjust min/max limit perf
cpufreq: Remove references to 10ms min sampling rate
cpufreq: intel_pstate: Update default EPPs for Meteor Lake
...

+2126 -736
+16
Documentation/admin-guide/kernel-parameters.txt
··· 374 374 selects a performance level in this range and appropriate 375 375 to the current workload. 376 376 377 + amd_prefcore= 378 + [X86] 379 + disable 380 + Disable amd-pstate preferred core. 381 + 377 382 amijoy.map= [HW,JOY] Amiga joystick support 378 383 Map of devices attached to JOY0DAT and JOY1DAT 379 384 Format: <a>,<b> ··· 1764 1759 protect_image Turn on image protection during restoration 1765 1760 (that will set all pages holding image data 1766 1761 during restoration read-only). 1762 + 1763 + hibernate.compressor= [HIBERNATION] Compression algorithm to be 1764 + used with hibernation. 1765 + Format: { lzo | lz4 } 1766 + Default: lzo 1767 + 1768 + lzo: Select LZO compression algorithm to 1769 + compress/decompress hibernation image. 1770 + 1771 + lz4: Select LZ4 compression algorithm to 1772 + compress/decompress hibernation image. 1767 1773 1768 1774 highmem=nn[KMG] [KNL,BOOT,EARLY] forces the highmem zone to have an exact 1769 1775 size of <nn>. This works even on boxes that have no
+57 -2
Documentation/admin-guide/pm/amd-pstate.rst
··· 300 300 efficiency frequency management method on AMD processors. 301 301 302 302 303 - AMD Pstate Driver Operation Modes 304 - ================================= 303 + ``amd-pstate`` Driver Operation Modes 304 + ====================================== 305 305 306 306 ``amd_pstate`` CPPC has 3 operation modes: autonomous (active) mode, 307 307 non-autonomous (passive) mode and guided autonomous (guided) mode. ··· 353 353 level and the platform autonomously selects a performance level in this range 354 354 and appropriate to the current workload. 355 355 356 + ``amd-pstate`` Preferred Core 357 + ================================= 358 + 359 + The core frequency is subjected to the process variation in semiconductors. 360 + Not all cores are able to reach the maximum frequency respecting the 361 + infrastructure limits. Consequently, AMD has redefined the concept of 362 + maximum frequency of a part. This means that a fraction of cores can reach 363 + maximum frequency. To find the best process scheduling policy for a given 364 + scenario, OS needs to know the core ordering informed by the platform through 365 + highest performance capability register of the CPPC interface. 366 + 367 + ``amd-pstate`` preferred core enables the scheduler to prefer scheduling on 368 + cores that can achieve a higher frequency with lower voltage. The preferred 369 + core rankings can dynamically change based on the workload, platform conditions, 370 + thermals and ageing. 371 + 372 + The priority metric will be initialized by the ``amd-pstate`` driver. The ``amd-pstate`` 373 + driver will also determine whether or not ``amd-pstate`` preferred core is 374 + supported by the platform. 375 + 376 + ``amd-pstate`` driver will provide an initial core ordering when the system boots. 
377 + The platform uses the CPPC interfaces to communicate the core ranking to the 378 + operating system and scheduler so that the OS schedules processes on the 379 + highest-performance cores first. When the ``amd-pstate`` 380 + driver receives a message with the highest performance change, it will 381 + update the core ranking and set the cpu's priority. 382 + 383 + ``amd-pstate`` Preferred Core Switch 384 + ===================================== 385 + Kernel Parameters 386 + ----------------- 387 + 388 + ``amd-pstate`` preferred core has two states: enable and disable. 389 + Enable/disable states can be chosen by different kernel parameters. 390 + ``amd-pstate`` preferred core is enabled by default. 391 + 392 + ``amd_prefcore=disable`` 393 + 394 + For systems that support ``amd-pstate`` preferred core, the core rankings will 395 + always be advertised by the platform. But OS can choose to ignore that via the 396 + kernel parameter ``amd_prefcore=disable``. 397 + 356 398 User Space Interface in ``sysfs`` - General 357 399 =========================================== 358 400 ··· 426 384 these values to the sysfs file will cause the driver to switch over 427 385 to the operation mode represented by that string - or to be 428 386 unregistered in the "disable" case. 387 + 388 + ``prefcore`` 389 + Preferred core state of the driver: "enabled" or "disabled". 390 + 391 + "enabled" 392 + Enable the ``amd-pstate`` preferred core. 393 + 394 + "disabled" 395 + Disable the ``amd-pstate`` preferred core. 396 + 397 + 398 + This attribute is read-only to check the state of preferred core set 399 + by the kernel parameter. 429 400 430 401 ``cpupower`` tool support for ``amd-pstate`` 431 402 ===============================================
-2
Documentation/devicetree/bindings/opp/opp-v2-base.yaml
··· 57 57 specific binding. 58 58 minItems: 1 59 59 maxItems: 32 60 - items: 61 - maxItems: 1 62 60 63 61 opp-microvolt: 64 62 description: |
+179 -4
Documentation/power/energy-model.rst
··· 71 71 required to have the same micro-architecture. CPUs in different performance 72 72 domains can have different micro-architectures. 73 73 74 + To better reflect power variation due to static power (leakage) the EM 75 + supports runtime modifications of the power values. The mechanism relies on 76 + RCU to free the modifiable EM perf_state table memory. Its user, the task 77 + scheduler, also uses RCU to access this memory. The EM framework provides 78 + an API for allocating/freeing the new memory for the modifiable EM table. 79 + The old memory is freed automatically using the RCU callback mechanism when there 80 + are no owners anymore for the given EM runtime table instance. This is tracked 81 + using the kref mechanism. The device driver which provided the new EM at runtime 82 + should call the EM API to free it safely when it's no longer needed. The EM 83 + framework will handle the clean-up when it's possible. 84 + 85 + The kernel code which wants to modify the EM values is protected from concurrent 86 + access using a mutex. Therefore, the device driver code must run in sleeping 87 + context when it tries to modify the EM. 88 + 89 + With the runtime modifiable EM we switch from a 'single and during the entire 90 + runtime static EM' (system property) design to a 'single EM which can be 91 + changed during runtime according e.g. to the workload' (system and workload 92 + property) design. 93 + 94 + It is possible also to modify the CPU performance values for each EM's 95 + performance state. Thus, the full power and performance profile (which 96 + is an exponential curve) can be changed according e.g. to the workload 97 + or system property. 98 + 74 99 75 100 2. Core APIs 76 101 ------------ ··· 200 175 not provided for other type of devices. 
201 176 202 177 More details about the above APIs can be found in ``<linux/energy_model.h>`` 203 - or in Section 2.4 178 + or in Section 2.5 204 179 205 180 206 - 2.4 Description details of this API 181 + 2.4 Runtime modifications 182 + ^^^^^^^^^^^^^^^^^^^^^^^^^ 183 + 184 + Drivers willing to update the EM at runtime should use the following dedicated 185 + function to allocate a new instance of the modified EM. The API is listed 186 + below:: 187 + 188 + struct em_perf_table __rcu *em_table_alloc(struct em_perf_domain *pd); 189 + 190 + This allows to allocate a structure which contains the new EM table with 191 + also RCU and kref needed by the EM framework. The 'struct em_perf_table' 192 + contains array 'struct em_perf_state state[]' which is a list of performance 193 + states in ascending order. That list must be populated by the device driver 194 + which wants to update the EM. The list of frequencies can be taken from 195 + existing EM (created during boot). The content in the 'struct em_perf_state' 196 + must be populated by the driver as well. 197 + 198 + This is the API which does the EM update, using RCU pointers swap:: 199 + 200 + int em_dev_update_perf_domain(struct device *dev, 201 + struct em_perf_table __rcu *new_table); 202 + 203 + Drivers must provide a pointer to the allocated and initialized new EM 204 + 'struct em_perf_table'. That new EM will be safely used inside the EM framework 205 + and will be visible to other sub-systems in the kernel (thermal, powercap). 206 + The main design goal for this API is to be fast and avoid extra calculations 207 + or memory allocations at runtime. When pre-computed EMs are available in the 208 + device driver, than it should be possible to simply re-use them with low 209 + performance overhead. 210 + 211 + In order to free the EM, provided earlier by the driver (e.g. 
when the module 212 + is unloaded), there is a need to call the API:: 213 + 214 + void em_table_free(struct em_perf_table __rcu *table); 215 + 216 + It will allow the EM framework to safely remove the memory, when there is 217 + no other sub-system using it, e.g. EAS. 218 + 219 + To use the power values in other sub-systems (like thermal, powercap) there is 220 + a need to call an API which protects the reader and provides consistency of the EM 221 + table data:: 222 + 223 + struct em_perf_state *em_perf_state_from_pd(struct em_perf_domain *pd); 224 + 225 + It returns the 'struct em_perf_state' pointer which is an array of performance 226 + states in ascending order. 227 + This function must be called in the RCU read lock section (after the 228 + rcu_read_lock()). When the EM table is not needed anymore there is a need to 229 + call rcu_read_unlock(). In this way the EM safely uses the RCU read section 230 + and protects the users. It also allows the EM framework to manage the memory 231 + and free it. More details on how to use it can be found in Section 3.2 in the 232 + example driver. 233 + 234 + There is dedicated API for device drivers to calculate em_perf_state::cost 235 + values:: 236 + 237 + int em_dev_compute_costs(struct device *dev, struct em_perf_state *table, 238 + int nr_states); 239 + 240 + These 'cost' values from EM are used in EAS. The new EM table should be passed 241 + together with the number of entries and device pointer. When the computation 242 + of the cost values is done properly the return value from the function is 0. 243 + The function takes care of the right setting of inefficiency for each performance 244 + state as well. It updates em_perf_state::flags accordingly. 245 + Then such prepared new EM can be passed to the em_dev_update_perf_domain() 246 + function, which will allow it to be used. 
247 + 248 + More details about the above APIs can be found in ``<linux/energy_model.h>`` 249 + or in Section 3.2 with an example code showing simple implementation of the 250 + updating mechanism in a device driver. 251 + 252 + 253 + 2.5 Description details of this API 207 254 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 208 255 .. kernel-doc:: include/linux/energy_model.h 209 256 :internal: ··· 284 187 :export: 285 188 286 189 287 - 3. Example driver 288 - ----------------- 190 + 3. Examples 191 + ----------- 192 + 193 + 3.1 Example driver with EM registration 194 + ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 289 195 290 196 The CPUFreq framework supports dedicated callback for registering 291 197 the EM for a given CPU(s) 'policy' object: cpufreq_driver::register_em(). ··· 342 242 39 static struct cpufreq_driver foo_cpufreq_driver = { 343 243 40 .register_em = foo_cpufreq_register_em, 344 244 41 }; 245 + 246 + 247 + 3.2 Example driver with EM modification 248 + ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 249 + 250 + This section provides a simple example of a thermal driver modifying the EM. 251 + The driver implements a foo_thermal_em_update() function. 
The driver is woken 252 + up periodically to check the temperature and modify the EM data:: 253 + 254 + -> drivers/soc/example/example_em_mod.c 255 + 256 + 01 static void foo_get_new_em(struct foo_context *ctx) 257 + 02 { 258 + 03 struct em_perf_table __rcu *em_table; 259 + 04 struct em_perf_state *table, *new_table; 260 + 05 struct device *dev = ctx->dev; 261 + 06 struct em_perf_domain *pd; 262 + 07 unsigned long freq; 263 + 08 int i, ret; 264 + 09 265 + 10 pd = em_pd_get(dev); 266 + 11 if (!pd) 267 + 12 return; 268 + 13 269 + 14 em_table = em_table_alloc(pd); 270 + 15 if (!em_table) 271 + 16 return; 272 + 17 273 + 18 new_table = em_table->state; 274 + 19 275 + 20 rcu_read_lock(); 276 + 21 table = em_perf_state_from_pd(pd); 277 + 22 for (i = 0; i < pd->nr_perf_states; i++) { 278 + 23 freq = table[i].frequency; 279 + 24 foo_get_power_perf_values(dev, freq, &new_table[i]); 280 + 25 } 281 + 26 rcu_read_unlock(); 282 + 27 283 + 28 /* Calculate 'cost' values for EAS */ 284 + 29 ret = em_dev_compute_costs(dev, new_table, pd->nr_perf_states); 285 + 30 if (ret) { 286 + 31 dev_warn(dev, "EM: compute costs failed %d\n", ret); 287 + 32 em_table_free(em_table); 288 + 33 return; 289 + 34 } 290 + 35 291 + 36 ret = em_dev_update_perf_domain(dev, em_table); 292 + 37 if (ret) { 293 + 38 dev_warn(dev, "EM: update failed %d\n", ret); 294 + 39 em_table_free(em_table); 295 + 40 return; 296 + 41 } 297 + 42 298 + 43 /* 299 + 44 * Since it's a one-time update, drop the usage counter. 300 + 45 * The EM framework will later free the table when needed. 
301 + 46 */ 302 + 47 em_table_free(em_table); 303 + 48 } 304 + 49 305 + 50 /* 306 + 51 * Function called periodically to check the temperature and 307 + 52 * update the EM if needed 308 + 53 */ 309 + 54 static void foo_thermal_em_update(struct foo_context *ctx) 310 + 55 { 311 + 56 struct device *dev = ctx->dev; 312 + 57 int cpu; 313 + 58 314 + 59 ctx->temperature = foo_get_temp(dev, ctx); 315 + 60 if (ctx->temperature < FOO_EM_UPDATE_TEMP_THRESHOLD) 316 + 61 return; 317 + 62 318 + 63 foo_get_new_em(ctx); 319 + 64 }
+1 -1
Documentation/power/opp.rst
··· 305 305 { 306 306 /* Do things */ 307 307 num_available = dev_pm_opp_get_opp_count(dev); 308 - speeds = kzalloc(sizeof(u32) * num_available, GFP_KERNEL); 308 + speeds = kcalloc(num_available, sizeof(u32), GFP_KERNEL); 309 309 /* populate the table in increasing order */ 310 310 freq = 0; 311 311 while (!IS_ERR(opp = dev_pm_opp_find_freq_ceil(dev, &freq))) {
+1 -1
Documentation/power/pci.rst
··· 625 625 pci_pm_poweroff() 626 626 pci_pm_poweroff_noirq() 627 627 628 - work in analogy with pci_pm_suspend() and pci_pm_poweroff_noirq(), respectively, 628 + work in analogy with pci_pm_suspend() and pci_pm_suspend_noirq(), respectively, 629 629 although they don't attempt to save the device's standard configuration 630 630 registers. 631 631
+14 -9
Documentation/power/runtime_pm.rst
··· 154 154 device in that case. If there is no idle callback, or if the callback returns 155 155 0, then the PM core will attempt to carry out a runtime suspend of the device, 156 156 also respecting devices configured for autosuspend. In essence this means a 157 - call to pm_runtime_autosuspend() (do note that drivers needs to update the 157 + call to __pm_runtime_autosuspend() (do note that drivers needs to update the 158 158 device last busy mark, pm_runtime_mark_last_busy(), to control the delay under 159 159 this circumstance). To prevent this (for example, if the callback routine has 160 160 started a delayed suspend), the routine must return a non-zero value. Negative ··· 396 396 nonzero, increment the counter and return 1; otherwise return 0 without 397 397 changing the counter 398 398 399 - `int pm_runtime_get_if_active(struct device *dev, bool ign_usage_count);` 399 + `int pm_runtime_get_if_active(struct device *dev);` 400 400 - return -EINVAL if 'power.disable_depth' is nonzero; otherwise, if the 401 - runtime PM status is RPM_ACTIVE, and either ign_usage_count is true 402 - or the device's usage_count is non-zero, increment the counter and 401 + runtime PM status is RPM_ACTIVE, increment the counter and 403 402 return 1; otherwise return 0 without changing the counter 404 403 405 404 `void pm_runtime_put_noidle(struct device *dev);` ··· 409 410 pm_request_idle(dev) and return its result 410 411 411 412 `int pm_runtime_put_autosuspend(struct device *dev);` 413 + - does the same as __pm_runtime_put_autosuspend() for now, but in the 414 + future, will also call pm_runtime_mark_last_busy() as well, DO NOT USE! 
415 + 416 + `int __pm_runtime_put_autosuspend(struct device *dev);` 412 417 - decrement the device's usage counter; if the result is 0 then run 413 418 pm_request_autosuspend(dev) and return its result 414 419 ··· 543 540 - pm_runtime_put_noidle() 544 541 - pm_runtime_put() 545 542 - pm_runtime_put_autosuspend() 543 + - __pm_runtime_put_autosuspend() 546 544 - pm_runtime_enable() 547 545 - pm_suspend_ignore_children() 548 546 - pm_runtime_set_active() ··· 734 730 for it, respectively. 735 731 736 732 7. Generic subsystem callbacks 733 + ============================== 737 734 738 735 Subsystems may wish to conserve code space by using the set of generic power 739 736 management callbacks provided by the PM core, defined in ··· 870 865 871 866 Inactivity is determined based on the power.last_busy field. Drivers should 872 867 call pm_runtime_mark_last_busy() to update this field after carrying out I/O, 873 - typically just before calling pm_runtime_put_autosuspend(). The desired length 874 - of the inactivity period is a matter of policy. Subsystems can set this length 875 - initially by calling pm_runtime_set_autosuspend_delay(), but after device 868 + typically just before calling __pm_runtime_put_autosuspend(). The desired 869 + length of the inactivity period is a matter of policy. Subsystems can set this 870 + length initially by calling pm_runtime_set_autosuspend_delay(), but after device 876 871 registration the length should be controlled by user space, using the 877 872 /sys/devices/.../power/autosuspend_delay_ms attribute. 878 873 ··· 883 878 884 879 Instead of: pm_runtime_suspend use: pm_runtime_autosuspend; 885 880 Instead of: pm_schedule_suspend use: pm_request_autosuspend; 886 - Instead of: pm_runtime_put use: pm_runtime_put_autosuspend; 881 + Instead of: pm_runtime_put use: __pm_runtime_put_autosuspend; 887 882 Instead of: pm_runtime_put_sync use: pm_runtime_put_sync_autosuspend. 
888 883 889 884 Drivers may also continue to use the non-autosuspend helper functions; they ··· 922 917 lock(&foo->private_lock); 923 918 if (--foo->num_pending_requests == 0) { 924 919 pm_runtime_mark_last_busy(&foo->dev); 925 - pm_runtime_put_autosuspend(&foo->dev); 920 + __pm_runtime_put_autosuspend(&foo->dev); 926 921 } else { 927 922 foo_process_next_request(foo); 928 923 }
+1 -1
Documentation/translations/zh_CN/power/opp.rst
··· 274 274 { 275 275 /* 做一些事情 */ 276 276 num_available = dev_pm_opp_get_opp_count(dev); 277 - speeds = kzalloc(sizeof(u32) * num_available, GFP_KERNEL); 277 + speeds = kcalloc(num_available, sizeof(u32), GFP_KERNEL); 278 278 /* 按升序填充表 */ 279 279 freq = 0; 280 280 while (!IS_ERR(opp = dev_pm_opp_find_freq_ceil(dev, &freq))) {
+3 -2
arch/x86/Kconfig
··· 1065 1065 1066 1066 config SCHED_MC_PRIO 1067 1067 bool "CPU core priorities scheduler support" 1068 - depends on SCHED_MC && CPU_SUP_INTEL 1069 - select X86_INTEL_PSTATE 1068 + depends on SCHED_MC 1069 + select X86_INTEL_PSTATE if CPU_SUP_INTEL 1070 + select X86_AMD_PSTATE if CPU_SUP_AMD && ACPI 1070 1071 select CPU_FREQ 1071 1072 default y 1072 1073 help
+2 -2
arch/x86/kernel/acpi/cstate.c
··· 131 131 cpuid(CPUID_MWAIT_LEAF, &eax, &ebx, &ecx, &edx); 132 132 133 133 /* Check whether this particular cx_type (in CST) is supported or not */ 134 - cstate_type = ((cx->address >> MWAIT_SUBSTATE_SIZE) & 135 - MWAIT_CSTATE_MASK) + 1; 134 + cstate_type = (((cx->address >> MWAIT_SUBSTATE_SIZE) & 135 + MWAIT_CSTATE_MASK) + 1) & MWAIT_CSTATE_MASK; 136 136 edx_part = edx >> (cstate_type * MWAIT_SUBSTATE_SIZE); 137 137 num_cstate_subtype = edx_part & MWAIT_SUBSTATE_MASK; 138 138
+1 -1
drivers/accel/ivpu/ivpu_pm.c
··· 309 309 { 310 310 int ret; 311 311 312 - ret = pm_runtime_get_if_active(vdev->drm.dev, false); 312 + ret = pm_runtime_get_if_in_use(vdev->drm.dev); 313 313 drm_WARN_ON(&vdev->drm, ret < 0); 314 314 315 315 return ret;
+13
drivers/acpi/cppc_acpi.c
··· 1158 1158 } 1159 1159 1160 1160 /** 1161 + * cppc_get_highest_perf - Get the highest performance register value. 1162 + * @cpunum: CPU from which to get highest performance. 1163 + * @highest_perf: Return address. 1164 + * 1165 + * Return: 0 for success, -EIO otherwise. 1166 + */ 1167 + int cppc_get_highest_perf(int cpunum, u64 *highest_perf) 1168 + { 1169 + return cppc_get_perf(cpunum, HIGHEST_PERF, highest_perf); 1170 + } 1171 + EXPORT_SYMBOL_GPL(cppc_get_highest_perf); 1172 + 1173 + /** 1161 1174 * cppc_get_epp_perf - Get the epp register value. 1162 1175 * @cpunum: CPU from which to get epp preference value. 1163 1176 * @epp_perf: Return address.
+6
drivers/acpi/processor_driver.c
··· 27 27 #define ACPI_PROCESSOR_NOTIFY_PERFORMANCE 0x80 28 28 #define ACPI_PROCESSOR_NOTIFY_POWER 0x81 29 29 #define ACPI_PROCESSOR_NOTIFY_THROTTLING 0x82 30 + #define ACPI_PROCESSOR_NOTIFY_HIGEST_PERF_CHANGED 0x85 30 31 31 32 MODULE_AUTHOR("Paul Diefenbaugh"); 32 33 MODULE_DESCRIPTION("ACPI Processor Driver"); ··· 81 80 break; 82 81 case ACPI_PROCESSOR_NOTIFY_THROTTLING: 83 82 acpi_processor_tstate_has_changed(pr); 83 + acpi_bus_generate_netlink_event(device->pnp.device_class, 84 + dev_name(&device->dev), event, 0); 85 + break; 86 + case ACPI_PROCESSOR_NOTIFY_HIGEST_PERF_CHANGED: 87 + cpufreq_update_limits(pr->id); 84 88 acpi_bus_generate_netlink_event(device->pnp.device_class, 85 89 dev_name(&device->dev), event, 0); 86 90 break;
+117 -152
drivers/base/power/main.c
··· 60 60 static LIST_HEAD(dpm_late_early_list); 61 61 static LIST_HEAD(dpm_noirq_list); 62 62 63 - struct suspend_stats suspend_stats; 64 63 static DEFINE_MUTEX(dpm_list_mtx); 65 64 static pm_message_t pm_transition; 66 65 ··· 577 578 return !dev->power.must_resume; 578 579 } 579 580 581 + static bool is_async(struct device *dev) 582 + { 583 + return dev->power.async_suspend && pm_async_enabled 584 + && !pm_trace_is_enabled(); 585 + } 586 + 587 + static bool dpm_async_fn(struct device *dev, async_func_t func) 588 + { 589 + reinit_completion(&dev->power.completion); 590 + 591 + if (is_async(dev)) { 592 + dev->power.async_in_progress = true; 593 + 594 + get_device(dev); 595 + 596 + if (async_schedule_dev_nocall(func, dev)) 597 + return true; 598 + 599 + put_device(dev); 600 + } 601 + /* 602 + * Because async_schedule_dev_nocall() above has returned false or it 603 + * has not been called at all, func() is not running and it is safe to 604 + * update the async_in_progress flag without extra synchronization. 605 + */ 606 + dev->power.async_in_progress = false; 607 + return false; 608 + } 609 + 580 610 /** 581 611 * device_resume_noirq - Execute a "noirq resume" callback for given device. 582 612 * @dev: Device to handle. ··· 685 657 TRACE_RESUME(error); 686 658 687 659 if (error) { 688 - suspend_stats.failed_resume_noirq++; 689 - dpm_save_failed_step(SUSPEND_RESUME_NOIRQ); 660 + async_error = error; 690 661 dpm_save_failed_dev(dev_name(dev)); 691 662 pm_dev_err(dev, state, async ? 
" async noirq" : " noirq", error); 692 663 } 693 - } 694 - 695 - static bool is_async(struct device *dev) 696 - { 697 - return dev->power.async_suspend && pm_async_enabled 698 - && !pm_trace_is_enabled(); 699 - } 700 - 701 - static bool dpm_async_fn(struct device *dev, async_func_t func) 702 - { 703 - reinit_completion(&dev->power.completion); 704 - 705 - if (is_async(dev)) { 706 - dev->power.async_in_progress = true; 707 - 708 - get_device(dev); 709 - 710 - if (async_schedule_dev_nocall(func, dev)) 711 - return true; 712 - 713 - put_device(dev); 714 - } 715 - /* 716 - * Because async_schedule_dev_nocall() above has returned false or it 717 - * has not been called at all, func() is not running and it is safe to 718 - * update the async_in_progress flag without extra synchronization. 719 - */ 720 - dev->power.async_in_progress = false; 721 - return false; 722 664 } 723 665 724 666 static void async_resume_noirq(void *data, async_cookie_t cookie) ··· 705 707 ktime_t starttime = ktime_get(); 706 708 707 709 trace_suspend_resume(TPS("dpm_resume_noirq"), state.event, true); 708 - mutex_lock(&dpm_list_mtx); 710 + 711 + async_error = 0; 709 712 pm_transition = state; 713 + 714 + mutex_lock(&dpm_list_mtx); 710 715 711 716 /* 712 717 * Trigger the resume of "async" devices upfront so they don't have to ··· 737 736 mutex_unlock(&dpm_list_mtx); 738 737 async_synchronize_full(); 739 738 dpm_show_time(starttime, state, 0, "noirq"); 739 + if (async_error) 740 + dpm_save_failed_step(SUSPEND_RESUME_NOIRQ); 741 + 740 742 trace_suspend_resume(TPS("dpm_resume_noirq"), state.event, false); 741 743 } 742 744 ··· 821 817 complete_all(&dev->power.completion); 822 818 823 819 if (error) { 824 - suspend_stats.failed_resume_early++; 825 - dpm_save_failed_step(SUSPEND_RESUME_EARLY); 820 + async_error = error; 826 821 dpm_save_failed_dev(dev_name(dev)); 827 822 pm_dev_err(dev, state, async ? 
" async early" : " early", error); 828 823 } ··· 845 842 ktime_t starttime = ktime_get(); 846 843 847 844 trace_suspend_resume(TPS("dpm_resume_early"), state.event, true); 848 - mutex_lock(&dpm_list_mtx); 845 + 846 + async_error = 0; 849 847 pm_transition = state; 848 + 849 + mutex_lock(&dpm_list_mtx); 850 850 851 851 /* 852 852 * Trigger the resume of "async" devices upfront so they don't have to ··· 877 871 mutex_unlock(&dpm_list_mtx); 878 872 async_synchronize_full(); 879 873 dpm_show_time(starttime, state, 0, "early"); 874 + if (async_error) 875 + dpm_save_failed_step(SUSPEND_RESUME_EARLY); 876 + 880 877 trace_suspend_resume(TPS("dpm_resume_early"), state.event, false); 881 878 } 882 879 ··· 983 974 TRACE_RESUME(error); 984 975 985 976 if (error) { 986 - suspend_stats.failed_resume++; 987 - dpm_save_failed_step(SUSPEND_RESUME); 977 + async_error = error; 988 978 dpm_save_failed_dev(dev_name(dev)); 989 979 pm_dev_err(dev, state, async ? " async" : "", error); 990 980 } ··· 1012 1004 trace_suspend_resume(TPS("dpm_resume"), state.event, true); 1013 1005 might_sleep(); 1014 1006 1015 - mutex_lock(&dpm_list_mtx); 1016 1007 pm_transition = state; 1017 1008 async_error = 0; 1009 + 1010 + mutex_lock(&dpm_list_mtx); 1018 1011 1019 1012 /* 1020 1013 * Trigger the resume of "async" devices upfront so they don't have to ··· 1026 1017 1027 1018 while (!list_empty(&dpm_suspended_list)) { 1028 1019 dev = to_device(dpm_suspended_list.next); 1029 - 1030 - get_device(dev); 1020 + list_move_tail(&dev->power.entry, &dpm_prepared_list); 1031 1021 1032 1022 if (!dev->power.async_in_progress) { 1023 + get_device(dev); 1024 + 1033 1025 mutex_unlock(&dpm_list_mtx); 1034 1026 1035 1027 device_resume(dev, state, false); 1036 1028 1029 + put_device(dev); 1030 + 1037 1031 mutex_lock(&dpm_list_mtx); 1038 1032 } 1039 - 1040 - if (!list_empty(&dev->power.entry)) 1041 - list_move_tail(&dev->power.entry, &dpm_prepared_list); 1042 - 1043 - mutex_unlock(&dpm_list_mtx); 1044 - 1045 - 
put_device(dev); 1046 - 1047 - mutex_lock(&dpm_list_mtx); 1048 1033 } 1049 1034 mutex_unlock(&dpm_list_mtx); 1050 1035 async_synchronize_full(); 1051 1036 dpm_show_time(starttime, state, 0, NULL); 1037 + if (async_error) 1038 + dpm_save_failed_step(SUSPEND_RESUME); 1052 1039 1053 1040 cpufreq_resume(); 1054 1041 devfreq_resume(); ··· 1192 1187 } 1193 1188 1194 1189 /** 1195 - * __device_suspend_noirq - Execute a "noirq suspend" callback for given device. 1190 + * device_suspend_noirq - Execute a "noirq suspend" callback for given device. 1196 1191 * @dev: Device to handle. 1197 1192 * @state: PM transition of the system being carried out. 1198 1193 * @async: If true, the device is being suspended asynchronously. ··· 1200 1195 * The driver of @dev will not receive interrupts while this function is being 1201 1196 * executed. 1202 1197 */ 1203 - static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool async) 1198 + static int device_suspend_noirq(struct device *dev, pm_message_t state, bool async) 1204 1199 { 1205 1200 pm_callback_t callback = NULL; 1206 1201 const char *info = NULL; ··· 1245 1240 error = dpm_run_callback(callback, dev, state, info); 1246 1241 if (error) { 1247 1242 async_error = error; 1243 + dpm_save_failed_dev(dev_name(dev)); 1244 + pm_dev_err(dev, state, async ? 
" async noirq" : " noirq", error); 1248 1245 goto Complete; 1249 1246 } 1250 1247 ··· 1276 1269 static void async_suspend_noirq(void *data, async_cookie_t cookie) 1277 1270 { 1278 1271 struct device *dev = data; 1279 - int error; 1280 1272 1281 - error = __device_suspend_noirq(dev, pm_transition, true); 1282 - if (error) { 1283 - dpm_save_failed_dev(dev_name(dev)); 1284 - pm_dev_err(dev, pm_transition, " async", error); 1285 - } 1286 - 1273 + device_suspend_noirq(dev, pm_transition, true); 1287 1274 put_device(dev); 1288 - } 1289 - 1290 - static int device_suspend_noirq(struct device *dev) 1291 - { 1292 - if (dpm_async_fn(dev, async_suspend_noirq)) 1293 - return 0; 1294 - 1295 - return __device_suspend_noirq(dev, pm_transition, false); 1296 1275 } 1297 1276 1298 1277 static int dpm_noirq_suspend_devices(pm_message_t state) ··· 1287 1294 int error = 0; 1288 1295 1289 1296 trace_suspend_resume(TPS("dpm_suspend_noirq"), state.event, true); 1290 - mutex_lock(&dpm_list_mtx); 1297 + 1291 1298 pm_transition = state; 1292 1299 async_error = 0; 1300 + 1301 + mutex_lock(&dpm_list_mtx); 1293 1302 1294 1303 while (!list_empty(&dpm_late_early_list)) { 1295 1304 struct device *dev = to_device(dpm_late_early_list.prev); 1296 1305 1306 + list_move(&dev->power.entry, &dpm_noirq_list); 1307 + 1308 + if (dpm_async_fn(dev, async_suspend_noirq)) 1309 + continue; 1310 + 1297 1311 get_device(dev); 1298 - mutex_unlock(&dpm_list_mtx); 1299 - 1300 - error = device_suspend_noirq(dev); 1301 - 1302 - mutex_lock(&dpm_list_mtx); 1303 - 1304 - if (error) { 1305 - pm_dev_err(dev, state, " noirq", error); 1306 - dpm_save_failed_dev(dev_name(dev)); 1307 - } else if (!list_empty(&dev->power.entry)) { 1308 - list_move(&dev->power.entry, &dpm_noirq_list); 1309 - } 1310 1312 1311 1313 mutex_unlock(&dpm_list_mtx); 1314 + 1315 + error = device_suspend_noirq(dev, state, false); 1312 1316 1313 1317 put_device(dev); 1314 1318 ··· 1314 1324 if (error || async_error) 1315 1325 break; 1316 1326 } 1327 + 1317 
1328 mutex_unlock(&dpm_list_mtx); 1329 + 1318 1330 async_synchronize_full(); 1319 1331 if (!error) 1320 1332 error = async_error; 1321 1333 1322 - if (error) { 1323 - suspend_stats.failed_suspend_noirq++; 1334 + if (error) 1324 1335 dpm_save_failed_step(SUSPEND_SUSPEND_NOIRQ); 1325 - } 1336 + 1326 1337 dpm_show_time(starttime, state, error, "noirq"); 1327 1338 trace_suspend_resume(TPS("dpm_suspend_noirq"), state.event, false); 1328 1339 return error; ··· 1366 1375 } 1367 1376 1368 1377 /** 1369 - * __device_suspend_late - Execute a "late suspend" callback for given device. 1378 + * device_suspend_late - Execute a "late suspend" callback for given device. 1370 1379 * @dev: Device to handle. 1371 1380 * @state: PM transition of the system being carried out. 1372 1381 * @async: If true, the device is being suspended asynchronously. 1373 1382 * 1374 1383 * Runtime PM is disabled for @dev while this function is being executed. 1375 1384 */ 1376 - static int __device_suspend_late(struct device *dev, pm_message_t state, bool async) 1385 + static int device_suspend_late(struct device *dev, pm_message_t state, bool async) 1377 1386 { 1378 1387 pm_callback_t callback = NULL; 1379 1388 const char *info = NULL; ··· 1425 1434 error = dpm_run_callback(callback, dev, state, info); 1426 1435 if (error) { 1427 1436 async_error = error; 1437 + dpm_save_failed_dev(dev_name(dev)); 1438 + pm_dev_err(dev, state, async ? 
" async late" : " late", error); 1428 1439 goto Complete; 1429 1440 } 1430 1441 dpm_propagate_wakeup_to_parent(dev); ··· 1443 1450 static void async_suspend_late(void *data, async_cookie_t cookie) 1444 1451 { 1445 1452 struct device *dev = data; 1446 - int error; 1447 1453 1448 - error = __device_suspend_late(dev, pm_transition, true); 1449 - if (error) { 1450 - dpm_save_failed_dev(dev_name(dev)); 1451 - pm_dev_err(dev, pm_transition, " async", error); 1452 - } 1454 + device_suspend_late(dev, pm_transition, true); 1453 1455 put_device(dev); 1454 - } 1455 - 1456 - static int device_suspend_late(struct device *dev) 1457 - { 1458 - if (dpm_async_fn(dev, async_suspend_late)) 1459 - return 0; 1460 - 1461 - return __device_suspend_late(dev, pm_transition, false); 1462 1456 } 1463 1457 1464 1458 /** ··· 1458 1478 int error = 0; 1459 1479 1460 1480 trace_suspend_resume(TPS("dpm_suspend_late"), state.event, true); 1461 - wake_up_all_idle_cpus(); 1462 - mutex_lock(&dpm_list_mtx); 1481 + 1463 1482 pm_transition = state; 1464 1483 async_error = 0; 1465 1484 1485 + wake_up_all_idle_cpus(); 1486 + 1487 + mutex_lock(&dpm_list_mtx); 1488 + 1466 1489 while (!list_empty(&dpm_suspended_list)) { 1467 1490 struct device *dev = to_device(dpm_suspended_list.prev); 1491 + 1492 + list_move(&dev->power.entry, &dpm_late_early_list); 1493 + 1494 + if (dpm_async_fn(dev, async_suspend_late)) 1495 + continue; 1468 1496 1469 1497 get_device(dev); 1470 1498 1471 1499 mutex_unlock(&dpm_list_mtx); 1472 1500 1473 - error = device_suspend_late(dev); 1474 - 1475 - mutex_lock(&dpm_list_mtx); 1476 - 1477 - if (!list_empty(&dev->power.entry)) 1478 - list_move(&dev->power.entry, &dpm_late_early_list); 1479 - 1480 - if (error) { 1481 - pm_dev_err(dev, state, " late", error); 1482 - dpm_save_failed_dev(dev_name(dev)); 1483 - } 1484 - 1485 - mutex_unlock(&dpm_list_mtx); 1501 + error = device_suspend_late(dev, state, false); 1486 1502 1487 1503 put_device(dev); 1488 1504 ··· 1487 1511 if (error || async_error) 
1488 1512 break; 1489 1513 } 1514 + 1490 1515 mutex_unlock(&dpm_list_mtx); 1516 + 1491 1517 async_synchronize_full(); 1492 1518 if (!error) 1493 1519 error = async_error; 1520 + 1494 1521 if (error) { 1495 - suspend_stats.failed_suspend_late++; 1496 1522 dpm_save_failed_step(SUSPEND_SUSPEND_LATE); 1497 1523 dpm_resume_early(resume_event(state)); 1498 1524 } ··· 1575 1597 } 1576 1598 1577 1599 /** 1578 - * __device_suspend - Execute "suspend" callbacks for given device. 1600 + * device_suspend - Execute "suspend" callbacks for given device. 1579 1601 * @dev: Device to handle. 1580 1602 * @state: PM transition of the system being carried out. 1581 1603 * @async: If true, the device is being suspended asynchronously. 1582 1604 */ 1583 - static int __device_suspend(struct device *dev, pm_message_t state, bool async) 1605 + static int device_suspend(struct device *dev, pm_message_t state, bool async) 1584 1606 { 1585 1607 pm_callback_t callback = NULL; 1586 1608 const char *info = NULL; ··· 1694 1716 dpm_watchdog_clear(&wd); 1695 1717 1696 1718 Complete: 1697 - if (error) 1719 + if (error) { 1698 1720 async_error = error; 1721 + dpm_save_failed_dev(dev_name(dev)); 1722 + pm_dev_err(dev, state, async ? 
" async" : "", error); 1723 + } 1699 1724 1700 1725 complete_all(&dev->power.completion); 1701 1726 TRACE_SUSPEND(error); ··· 1708 1727 static void async_suspend(void *data, async_cookie_t cookie) 1709 1728 { 1710 1729 struct device *dev = data; 1711 - int error; 1712 1730 1713 - error = __device_suspend(dev, pm_transition, true); 1714 - if (error) { 1715 - dpm_save_failed_dev(dev_name(dev)); 1716 - pm_dev_err(dev, pm_transition, " async", error); 1717 - } 1718 - 1731 + device_suspend(dev, pm_transition, true); 1719 1732 put_device(dev); 1720 - } 1721 - 1722 - static int device_suspend(struct device *dev) 1723 - { 1724 - if (dpm_async_fn(dev, async_suspend)) 1725 - return 0; 1726 - 1727 - return __device_suspend(dev, pm_transition, false); 1728 1733 } 1729 1734 1730 1735 /** ··· 1728 1761 devfreq_suspend(); 1729 1762 cpufreq_suspend(); 1730 1763 1731 - mutex_lock(&dpm_list_mtx); 1732 1764 pm_transition = state; 1733 1765 async_error = 0; 1766 + 1767 + mutex_lock(&dpm_list_mtx); 1768 + 1734 1769 while (!list_empty(&dpm_prepared_list)) { 1735 1770 struct device *dev = to_device(dpm_prepared_list.prev); 1771 + 1772 + list_move(&dev->power.entry, &dpm_suspended_list); 1773 + 1774 + if (dpm_async_fn(dev, async_suspend)) 1775 + continue; 1736 1776 1737 1777 get_device(dev); 1738 1778 1739 1779 mutex_unlock(&dpm_list_mtx); 1740 1780 1741 - error = device_suspend(dev); 1742 - 1743 - mutex_lock(&dpm_list_mtx); 1744 - 1745 - if (error) { 1746 - pm_dev_err(dev, state, "", error); 1747 - dpm_save_failed_dev(dev_name(dev)); 1748 - } else if (!list_empty(&dev->power.entry)) { 1749 - list_move(&dev->power.entry, &dpm_suspended_list); 1750 - } 1751 - 1752 - mutex_unlock(&dpm_list_mtx); 1781 + error = device_suspend(dev, state, false); 1753 1782 1754 1783 put_device(dev); 1755 1784 ··· 1754 1791 if (error || async_error) 1755 1792 break; 1756 1793 } 1794 + 1757 1795 mutex_unlock(&dpm_list_mtx); 1796 + 1758 1797 async_synchronize_full(); 1759 1798 if (!error) 1760 1799 error = 
async_error; 1761 - if (error) { 1762 - suspend_stats.failed_suspend++; 1800 + 1801 + if (error) 1763 1802 dpm_save_failed_step(SUSPEND_SUSPEND); 1764 - } 1803 + 1765 1804 dpm_show_time(starttime, state, error, NULL); 1766 1805 trace_suspend_resume(TPS("dpm_suspend"), state.event, false); 1767 1806 return error; ··· 1914 1949 int error; 1915 1950 1916 1951 error = dpm_prepare(state); 1917 - if (error) { 1918 - suspend_stats.failed_prepare++; 1952 + if (error) 1919 1953 dpm_save_failed_step(SUSPEND_PREPARE); 1920 - } else 1954 + else 1921 1955 error = dpm_suspend(state); 1956 + 1922 1957 dpm_show_time(starttime, state, error, "start"); 1923 1958 return error; 1924 1959 }
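The restructured dpm_suspend() loop above changes the bookkeeping order: each device is first moved onto the target list, then either handed to the async machinery via dpm_async_fn() or suspended synchronously in the caller's context. A minimal userspace sketch of that control flow, using hypothetical stand-in types (`fake_dev`, `try_async_suspend`) in place of the kernel's `struct device` and `async_schedule_dev_nocall()`:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device; not a kernel type. */
struct fake_dev {
	bool async_capable;	/* models dev->power.async_suspend */
	bool suspended;
	bool via_async;
};

/* Models dpm_async_fn(): returns true only if async dispatch succeeded. */
static bool try_async_suspend(struct fake_dev *d)
{
	if (!d->async_capable)
		return false;
	d->suspended = true;	/* pretend the async worker already ran */
	d->via_async = true;
	return true;
}

/* Mirrors the new loop shape: move the device first, then dispatch
 * asynchronously, falling back to a synchronous call. */
static void suspend_all(struct fake_dev *devs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		/* list_move(&dev->power.entry, ...) happens here, up front */
		if (try_async_suspend(&devs[i]))
			continue;		/* async path owns the device */
		devs[i].suspended = true;	/* synchronous fallback */
	}
}
```

The point of moving the list manipulation ahead of the dispatch is that the device's list position no longer depends on whether its callback ran synchronously or asynchronously.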
+34 -2
drivers/base/power/runtime.c
···
94 94 static void __update_runtime_status(struct device *dev, enum rpm_status status)
95 95 {
96 96 	update_pm_runtime_accounting(dev);
97    +	trace_rpm_status(dev, status);
97 98 	dev->power.runtime_status = status;
98 99 }
99 100 
···
1177 1176 EXPORT_SYMBOL_GPL(__pm_runtime_resume);
1178 1177 
1179 1178 /**
1180      - * pm_runtime_get_if_active - Conditionally bump up device usage counter.
1179      + * pm_runtime_get_conditional - Conditionally bump up device usage counter.
1181 1180  * @dev: Device to handle.
1182 1181  * @ign_usage_count: Whether or not to look at the current usage counter value.
1183 1182  *
···
1198 1197  * The caller is responsible for decrementing the runtime PM usage counter of
1199 1198  * @dev after this function has returned a positive value for it.
1200 1199  */
1201      -int pm_runtime_get_if_active(struct device *dev, bool ign_usage_count)
1200      +static int pm_runtime_get_conditional(struct device *dev, bool ign_usage_count)
1202 1201 {
1203 1202 	unsigned long flags;
1204 1203 	int retval;
···
1219 1218 
1220 1219 	return retval;
1221 1220 }
1221      +
1222      +/**
1223      + * pm_runtime_get_if_active - Bump up runtime PM usage counter if the device is
1224      + * in active state
1225      + * @dev: Target device.
1226      + *
1227      + * Increment the runtime PM usage counter of @dev if its runtime PM status is
1228      + * %RPM_ACTIVE, in which case it returns 1. If the device is in a different
1229      + * state, 0 is returned. -EINVAL is returned if runtime PM is disabled for the
1230      + * device, in which case also the usage_count will remain unmodified.
1231      + */
1232      +int pm_runtime_get_if_active(struct device *dev)
1233      +{
1234      +	return pm_runtime_get_conditional(dev, true);
1235      +}
1222 1236 EXPORT_SYMBOL_GPL(pm_runtime_get_if_active);
1237      +
1238      +/**
1239      + * pm_runtime_get_if_in_use - Conditionally bump up runtime PM usage counter.
1240      + * @dev: Target device.
1241      + *
1242      + * Increment the runtime PM usage counter of @dev if its runtime PM status is
1243      + * %RPM_ACTIVE and its runtime PM usage counter is greater than 0, in which case
1244      + * it returns 1. If the device is in a different state or its usage_count is 0,
1245      + * 0 is returned. -EINVAL is returned if runtime PM is disabled for the device,
1246      + * in which case also the usage_count will remain unmodified.
1247      + */
1248      +int pm_runtime_get_if_in_use(struct device *dev)
1249      +{
1250      +	return pm_runtime_get_conditional(dev, false);
1251      +}
1252      +EXPORT_SYMBOL_GPL(pm_runtime_get_if_in_use);
1223 1253 
1224 1254 /**
1225 1255  * __pm_runtime_set_status - Set runtime PM status of a device.
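The runtime.c change splits the old two-argument pm_runtime_get_if_active() into two single-argument wrappers over a static helper, with the return contract described in the kernel-doc above. A userspace sketch of that contract, using a hypothetical miniature `fake_pm` in place of `struct dev_pm_info`:

```c
#include <stdbool.h>

enum rpm_status { RPM_ACTIVE, RPM_SUSPENDED };

/* Hypothetical miniature of struct dev_pm_info, for illustration only. */
struct fake_pm {
	bool disabled;
	enum rpm_status status;
	int usage_count;
};

/* Models the static pm_runtime_get_conditional() helper:
 * -EINVAL when runtime PM is disabled, 0 when the condition fails,
 * 1 (with the usage counter bumped) when it holds. */
static int get_conditional(struct fake_pm *pm, bool ign_usage_count)
{
	if (pm->disabled)
		return -22;	/* -EINVAL */
	if (pm->status != RPM_ACTIVE)
		return 0;
	if (!ign_usage_count && pm->usage_count == 0)
		return 0;
	pm->usage_count++;	/* caller must drop this reference later */
	return 1;
}

/* The two exported wrappers after the split: */
static int get_if_active(struct fake_pm *pm)
{
	return get_conditional(pm, true);
}

static int get_if_in_use(struct fake_pm *pm)
{
	return get_conditional(pm, false);
}
```

The design point is that callers no longer pass a bare boolean whose meaning is easy to invert at call sites; each wrapper's name states its condition.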
+3 -1
drivers/base/power/wakeirq.c
···
313 313 		return;
314 314 
315 315 	if (wirq->status & WAKE_IRQ_DEDICATED_MANAGED &&
316     -	    wirq->status & WAKE_IRQ_DEDICATED_REVERSE)
316     +	    wirq->status & WAKE_IRQ_DEDICATED_REVERSE) {
317 317 		enable_irq(wirq->irq);
318     +		wirq->status |= WAKE_IRQ_DEDICATED_ENABLED;
319     +	}
318 320 }
319 321 
320 322 /**
+1
drivers/cpufreq/Kconfig.arm
···
173 173 config ARM_QCOM_CPUFREQ_HW
174 174 	tristate "QCOM CPUFreq HW driver"
175 175 	depends on ARCH_QCOM || COMPILE_TEST
176     +	depends on COMMON_CLK
176 177 	help
177 178 	  Support for the CPUFreq HW driver.
178 179 	  Some QCOM chipsets have a HW engine to offload the steps
+190 -10
drivers/cpufreq/amd-pstate.c
··· 37 37 #include <linux/uaccess.h> 38 38 #include <linux/static_call.h> 39 39 #include <linux/amd-pstate.h> 40 + #include <linux/topology.h> 40 41 41 42 #include <acpi/processor.h> 42 43 #include <acpi/cppc_acpi.h> ··· 50 49 51 50 #define AMD_PSTATE_TRANSITION_LATENCY 20000 52 51 #define AMD_PSTATE_TRANSITION_DELAY 1000 52 + #define AMD_PSTATE_PREFCORE_THRESHOLD 166 53 53 54 54 /* 55 55 * TODO: We need more time to fine tune processors with shared memory solution ··· 66 64 static struct cpufreq_driver amd_pstate_epp_driver; 67 65 static int cppc_state = AMD_PSTATE_UNDEFINED; 68 66 static bool cppc_enabled; 67 + static bool amd_pstate_prefcore = true; 69 68 70 69 /* 71 70 * AMD Energy Preference Performance (EPP) ··· 300 297 if (ret) 301 298 return ret; 302 299 303 - /* 304 - * TODO: Introduce AMD specific power feature. 305 - * 306 - * CPPC entry doesn't indicate the highest performance in some ASICs. 300 + /* For platforms that do not support the preferred core feature, the 301 + * highest_pef may be configured with 166 or 255, to avoid max frequency 302 + * calculated wrongly. we take the AMD_CPPC_HIGHEST_PERF(cap1) value as 303 + * the default max perf. 
307 304 */ 308 - highest_perf = amd_get_highest_perf(); 309 - if (highest_perf > AMD_CPPC_HIGHEST_PERF(cap1)) 305 + if (cpudata->hw_prefcore) 306 + highest_perf = AMD_PSTATE_PREFCORE_THRESHOLD; 307 + else 310 308 highest_perf = AMD_CPPC_HIGHEST_PERF(cap1); 311 309 312 310 WRITE_ONCE(cpudata->highest_perf, highest_perf); ··· 315 311 WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1)); 316 312 WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1)); 317 313 WRITE_ONCE(cpudata->lowest_perf, AMD_CPPC_LOWEST_PERF(cap1)); 314 + WRITE_ONCE(cpudata->prefcore_ranking, AMD_CPPC_HIGHEST_PERF(cap1)); 318 315 WRITE_ONCE(cpudata->min_limit_perf, AMD_CPPC_LOWEST_PERF(cap1)); 319 316 return 0; 320 317 } ··· 329 324 if (ret) 330 325 return ret; 331 326 332 - highest_perf = amd_get_highest_perf(); 333 - if (highest_perf > cppc_perf.highest_perf) 327 + if (cpudata->hw_prefcore) 328 + highest_perf = AMD_PSTATE_PREFCORE_THRESHOLD; 329 + else 334 330 highest_perf = cppc_perf.highest_perf; 335 331 336 332 WRITE_ONCE(cpudata->highest_perf, highest_perf); ··· 340 334 WRITE_ONCE(cpudata->lowest_nonlinear_perf, 341 335 cppc_perf.lowest_nonlinear_perf); 342 336 WRITE_ONCE(cpudata->lowest_perf, cppc_perf.lowest_perf); 337 + WRITE_ONCE(cpudata->prefcore_ranking, cppc_perf.highest_perf); 343 338 WRITE_ONCE(cpudata->min_limit_perf, cppc_perf.lowest_perf); 344 339 345 340 if (cppc_state == AMD_PSTATE_ACTIVE) ··· 484 477 485 478 static int amd_pstate_update_min_max_limit(struct cpufreq_policy *policy) 486 479 { 487 - u32 max_limit_perf, min_limit_perf; 480 + u32 max_limit_perf, min_limit_perf, lowest_perf; 488 481 struct amd_cpudata *cpudata = policy->driver_data; 489 482 490 483 max_limit_perf = div_u64(policy->max * cpudata->highest_perf, cpudata->max_freq); 491 484 min_limit_perf = div_u64(policy->min * cpudata->highest_perf, cpudata->max_freq); 485 + 486 + lowest_perf = READ_ONCE(cpudata->lowest_perf); 487 + if (min_limit_perf < lowest_perf) 488 + min_limit_perf = 
lowest_perf; 489 + 490 + if (max_limit_perf < min_limit_perf) 491 + max_limit_perf = min_limit_perf; 492 492 493 493 WRITE_ONCE(cpudata->max_limit_perf, max_limit_perf); 494 494 WRITE_ONCE(cpudata->min_limit_perf, min_limit_perf); ··· 584 570 if (target_perf < capacity) 585 571 des_perf = DIV_ROUND_UP(cap_perf * target_perf, capacity); 586 572 587 - min_perf = READ_ONCE(cpudata->highest_perf); 573 + min_perf = READ_ONCE(cpudata->lowest_perf); 588 574 if (_min_perf < capacity) 589 575 min_perf = DIV_ROUND_UP(cap_perf * _min_perf, capacity); 590 576 ··· 720 706 wrmsrl_on_cpu(cpu, MSR_AMD_PERF_CTL, 0); 721 707 } 722 708 709 + /* 710 + * Set amd-pstate preferred core enable can't be done directly from cpufreq callbacks 711 + * due to locking, so queue the work for later. 712 + */ 713 + static void amd_pstste_sched_prefcore_workfn(struct work_struct *work) 714 + { 715 + sched_set_itmt_support(); 716 + } 717 + static DECLARE_WORK(sched_prefcore_work, amd_pstste_sched_prefcore_workfn); 718 + 719 + /* 720 + * Get the highest performance register value. 721 + * @cpu: CPU from which to get highest performance. 722 + * @highest_perf: Return address. 723 + * 724 + * Return: 0 for success, -EIO otherwise. 
725 + */ 726 + static int amd_pstate_get_highest_perf(int cpu, u32 *highest_perf) 727 + { 728 + int ret; 729 + 730 + if (boot_cpu_has(X86_FEATURE_CPPC)) { 731 + u64 cap1; 732 + 733 + ret = rdmsrl_safe_on_cpu(cpu, MSR_AMD_CPPC_CAP1, &cap1); 734 + if (ret) 735 + return ret; 736 + WRITE_ONCE(*highest_perf, AMD_CPPC_HIGHEST_PERF(cap1)); 737 + } else { 738 + u64 cppc_highest_perf; 739 + 740 + ret = cppc_get_highest_perf(cpu, &cppc_highest_perf); 741 + if (ret) 742 + return ret; 743 + WRITE_ONCE(*highest_perf, cppc_highest_perf); 744 + } 745 + 746 + return (ret); 747 + } 748 + 749 + #define CPPC_MAX_PERF U8_MAX 750 + 751 + static void amd_pstate_init_prefcore(struct amd_cpudata *cpudata) 752 + { 753 + int ret, prio; 754 + u32 highest_perf; 755 + 756 + ret = amd_pstate_get_highest_perf(cpudata->cpu, &highest_perf); 757 + if (ret) 758 + return; 759 + 760 + cpudata->hw_prefcore = true; 761 + /* check if CPPC preferred core feature is enabled*/ 762 + if (highest_perf < CPPC_MAX_PERF) 763 + prio = (int)highest_perf; 764 + else { 765 + pr_debug("AMD CPPC preferred core is unsupported!\n"); 766 + cpudata->hw_prefcore = false; 767 + return; 768 + } 769 + 770 + if (!amd_pstate_prefcore) 771 + return; 772 + 773 + /* 774 + * The priorities can be set regardless of whether or not 775 + * sched_set_itmt_support(true) has been called and it is valid to 776 + * update them at any time after it has been called. 
777 + */ 778 + sched_set_itmt_core_prio(prio, cpudata->cpu); 779 + 780 + schedule_work(&sched_prefcore_work); 781 + } 782 + 783 + static void amd_pstate_update_limits(unsigned int cpu) 784 + { 785 + struct cpufreq_policy *policy = cpufreq_cpu_get(cpu); 786 + struct amd_cpudata *cpudata = policy->driver_data; 787 + u32 prev_high = 0, cur_high = 0; 788 + int ret; 789 + bool highest_perf_changed = false; 790 + 791 + mutex_lock(&amd_pstate_driver_lock); 792 + if ((!amd_pstate_prefcore) || (!cpudata->hw_prefcore)) 793 + goto free_cpufreq_put; 794 + 795 + ret = amd_pstate_get_highest_perf(cpu, &cur_high); 796 + if (ret) 797 + goto free_cpufreq_put; 798 + 799 + prev_high = READ_ONCE(cpudata->prefcore_ranking); 800 + if (prev_high != cur_high) { 801 + highest_perf_changed = true; 802 + WRITE_ONCE(cpudata->prefcore_ranking, cur_high); 803 + 804 + if (cur_high < CPPC_MAX_PERF) 805 + sched_set_itmt_core_prio((int)cur_high, cpu); 806 + } 807 + 808 + free_cpufreq_put: 809 + cpufreq_cpu_put(policy); 810 + 811 + if (!highest_perf_changed) 812 + cpufreq_update_policy(cpu); 813 + 814 + mutex_unlock(&amd_pstate_driver_lock); 815 + } 816 + 723 817 static int amd_pstate_cpu_init(struct cpufreq_policy *policy) 724 818 { 725 819 int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret; ··· 848 726 return -ENOMEM; 849 727 850 728 cpudata->cpu = policy->cpu; 729 + 730 + amd_pstate_init_prefcore(cpudata); 851 731 852 732 ret = amd_pstate_init_perf(cpudata); 853 733 if (ret) ··· 999 875 perf = READ_ONCE(cpudata->highest_perf); 1000 876 1001 877 return sysfs_emit(buf, "%u\n", perf); 878 + } 879 + 880 + static ssize_t show_amd_pstate_prefcore_ranking(struct cpufreq_policy *policy, 881 + char *buf) 882 + { 883 + u32 perf; 884 + struct amd_cpudata *cpudata = policy->driver_data; 885 + 886 + perf = READ_ONCE(cpudata->prefcore_ranking); 887 + 888 + return sysfs_emit(buf, "%u\n", perf); 889 + } 890 + 891 + static ssize_t show_amd_pstate_hw_prefcore(struct cpufreq_policy *policy, 892 + char 
*buf) 893 + { 894 + bool hw_prefcore; 895 + struct amd_cpudata *cpudata = policy->driver_data; 896 + 897 + hw_prefcore = READ_ONCE(cpudata->hw_prefcore); 898 + 899 + return sysfs_emit(buf, "%s\n", str_enabled_disabled(hw_prefcore)); 1002 900 } 1003 901 1004 902 static ssize_t show_energy_performance_available_preferences( ··· 1220 1074 return ret < 0 ? ret : count; 1221 1075 } 1222 1076 1077 + static ssize_t prefcore_show(struct device *dev, 1078 + struct device_attribute *attr, char *buf) 1079 + { 1080 + return sysfs_emit(buf, "%s\n", str_enabled_disabled(amd_pstate_prefcore)); 1081 + } 1082 + 1223 1083 cpufreq_freq_attr_ro(amd_pstate_max_freq); 1224 1084 cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq); 1225 1085 1226 1086 cpufreq_freq_attr_ro(amd_pstate_highest_perf); 1087 + cpufreq_freq_attr_ro(amd_pstate_prefcore_ranking); 1088 + cpufreq_freq_attr_ro(amd_pstate_hw_prefcore); 1227 1089 cpufreq_freq_attr_rw(energy_performance_preference); 1228 1090 cpufreq_freq_attr_ro(energy_performance_available_preferences); 1229 1091 static DEVICE_ATTR_RW(status); 1092 + static DEVICE_ATTR_RO(prefcore); 1230 1093 1231 1094 static struct freq_attr *amd_pstate_attr[] = { 1232 1095 &amd_pstate_max_freq, 1233 1096 &amd_pstate_lowest_nonlinear_freq, 1234 1097 &amd_pstate_highest_perf, 1098 + &amd_pstate_prefcore_ranking, 1099 + &amd_pstate_hw_prefcore, 1235 1100 NULL, 1236 1101 }; 1237 1102 ··· 1250 1093 &amd_pstate_max_freq, 1251 1094 &amd_pstate_lowest_nonlinear_freq, 1252 1095 &amd_pstate_highest_perf, 1096 + &amd_pstate_prefcore_ranking, 1097 + &amd_pstate_hw_prefcore, 1253 1098 &energy_performance_preference, 1254 1099 &energy_performance_available_preferences, 1255 1100 NULL, ··· 1259 1100 1260 1101 static struct attribute *pstate_global_attributes[] = { 1261 1102 &dev_attr_status.attr, 1103 + &dev_attr_prefcore.attr, 1262 1104 NULL 1263 1105 }; 1264 1106 ··· 1310 1150 1311 1151 cpudata->cpu = policy->cpu; 1312 1152 cpudata->epp_policy = 0; 1153 + 1154 + 
amd_pstate_init_prefcore(cpudata); 1313 1155 1314 1156 ret = amd_pstate_init_perf(cpudata); 1315 1157 if (ret) ··· 1393 1231 min_perf = READ_ONCE(cpudata->lowest_perf); 1394 1232 max_limit_perf = div_u64(policy->max * cpudata->highest_perf, cpudata->max_freq); 1395 1233 min_limit_perf = div_u64(policy->min * cpudata->highest_perf, cpudata->max_freq); 1234 + 1235 + if (min_limit_perf < min_perf) 1236 + min_limit_perf = min_perf; 1237 + 1238 + if (max_limit_perf < min_limit_perf) 1239 + max_limit_perf = min_limit_perf; 1396 1240 1397 1241 WRITE_ONCE(cpudata->max_limit_perf, max_limit_perf); 1398 1242 WRITE_ONCE(cpudata->min_limit_perf, min_limit_perf); ··· 1600 1432 .suspend = amd_pstate_cpu_suspend, 1601 1433 .resume = amd_pstate_cpu_resume, 1602 1434 .set_boost = amd_pstate_set_boost, 1435 + .update_limits = amd_pstate_update_limits, 1603 1436 .name = "amd-pstate", 1604 1437 .attr = amd_pstate_attr, 1605 1438 }; ··· 1615 1446 .online = amd_pstate_epp_cpu_online, 1616 1447 .suspend = amd_pstate_epp_suspend, 1617 1448 .resume = amd_pstate_epp_resume, 1449 + .update_limits = amd_pstate_update_limits, 1618 1450 .name = "amd-pstate-epp", 1619 1451 .attr = amd_pstate_epp_attr, 1620 1452 }; ··· 1737 1567 1738 1568 return amd_pstate_set_driver(mode_idx); 1739 1569 } 1570 + 1571 + static int __init amd_prefcore_param(char *str) 1572 + { 1573 + if (!strcmp(str, "disable")) 1574 + amd_pstate_prefcore = false; 1575 + 1576 + return 0; 1577 + } 1578 + 1740 1579 early_param("amd_pstate", amd_pstate_param); 1580 + early_param("amd_prefcore", amd_prefcore_param); 1741 1581 1742 1582 MODULE_AUTHOR("Huang Rui <ray.huang@amd.com>"); 1743 1583 MODULE_DESCRIPTION("AMD Processor P-state Frequency Driver");
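Two of the amd-pstate changes above are pure value logic: the perf-limit clamping added to amd_pstate_update_min_max_limit(), and the choice of a fixed AMD_PSTATE_PREFCORE_THRESHOLD as the scaling reference when the preferred-core feature is active (since then the CPPC highest_perf field encodes a core ranking, not a real ceiling). A userspace sketch of both, with illustrative function names:

```c
#include <stdint.h>

/* Matches AMD_PSTATE_PREFCORE_THRESHOLD in the patch above. */
#define PREFCORE_THRESHOLD 166U

/* Mirrors the clamping added to amd_pstate_update_min_max_limit():
 * the derived min limit is floored at lowest_perf, and the max limit
 * can never drop below the min limit. */
static void clamp_limits(uint32_t lowest_perf, uint32_t *min_limit,
			 uint32_t *max_limit)
{
	if (*min_limit < lowest_perf)
		*min_limit = lowest_perf;
	if (*max_limit < *min_limit)
		*max_limit = *min_limit;
}

/* Mirrors the highest_perf selection: on preferred-core systems a fixed
 * threshold is used so that max frequency is not derived from a ranking
 * value such as 255. */
static uint32_t pick_highest_perf(int hw_prefcore, uint32_t cap_highest)
{
	return hw_prefcore ? PREFCORE_THRESHOLD : cap_highest;
}
```

Without the clamp, a policy->min below the hardware's lowest perf level could previously produce a min_limit_perf under lowest_perf, and an inverted min/max pair.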
+2
drivers/cpufreq/brcmstb-avs-cpufreq.c
···
481 481 static unsigned int brcm_avs_cpufreq_get(unsigned int cpu)
482 482 {
483 483 	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
484     +	if (!policy)
485     +		return 0;
484 486 	struct private_data *priv = policy->driver_data;
485 487 
486 488 	cpufreq_cpu_put(policy);
+1
drivers/cpufreq/cpufreq-dt-platdev.c
···
156 156 	{ .compatible = "qcom,sc7280", },
157 157 	{ .compatible = "qcom,sc8180x", },
158 158 	{ .compatible = "qcom,sc8280xp", },
159     +	{ .compatible = "qcom,sdm670", },
159 160 	{ .compatible = "qcom,sdm845", },
160 161 	{ .compatible = "qcom,sdx75", },
161 162 	{ .compatible = "qcom,sm6115", },
+23 -9
drivers/cpufreq/cpufreq.c
···
576 576 
577 577 	latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
578 578 	if (latency) {
579     +		unsigned int max_delay_us = 2 * MSEC_PER_SEC;
580     +
579 581 		/*
580     -		 * For platforms that can change the frequency very fast (< 10
582     +		 * If the platform already has high transition_latency, use it
583     +		 * as-is.
584     +		 */
585     +		if (latency > max_delay_us)
586     +			return latency;
587     +
588     +		/*
589     +		 * For platforms that can change the frequency very fast (< 2
581 590 		 * us), the above formula gives a decent transition delay. But
582 591 		 * for platforms where transition_latency is in milliseconds, it
583 592 		 * ends up giving unrealistic values.
584 593 		 *
585     -		 * Cap the default transition delay to 10 ms, which seems to be
594     +		 * Cap the default transition delay to 2 ms, which seems to be
586 595 		 * a reasonable amount of time after which we should reevaluate
587 596 		 * the frequency.
588 597 		 */
589     -		return min(latency * LATENCY_MULTIPLIER, (unsigned int)10000);
598     +		return min(latency * LATENCY_MULTIPLIER, max_delay_us);
590 599 	}
591 600 
592 601 	return LATENCY_MULTIPLIER;
···
1580 1571 	if (cpufreq_driver->ready)
1581 1572 		cpufreq_driver->ready(policy);
1582 1573 
1583      -	if (cpufreq_thermal_control_enabled(cpufreq_driver))
1574      +	/* Register cpufreq cooling only for a new policy */
1575      +	if (new_policy && cpufreq_thermal_control_enabled(cpufreq_driver))
1584 1576 		policy->cdev = of_cpufreq_cooling_register(policy);
1585 1577 
1586 1578 	pr_debug("initialization complete\n");
···
1665 1655 	else
1666 1656 		policy->last_policy = policy->policy;
1667 1657 
1668      -	if (cpufreq_thermal_control_enabled(cpufreq_driver)) {
1669      -		cpufreq_cooling_unregister(policy->cdev);
1670      -		policy->cdev = NULL;
1671      -	}
1672      -
1673 1658 	if (has_target())
1674 1659 		cpufreq_exit_governor(policy);
1675 1660 
···
1723 1718 	if (!cpumask_empty(policy->real_cpus)) {
1724 1719 		up_write(&policy->rwsem);
1725 1720 		return;
1721      +	}
1722      +
1723      +	/*
1724      +	 * Unregister cpufreq cooling once all the CPUs of the policy are
1725      +	 * removed.
1726      +	 */
1727      +	if (cpufreq_thermal_control_enabled(cpufreq_driver)) {
1728      +		cpufreq_cooling_unregister(policy->cdev);
1729      +		policy->cdev = NULL;
1726 1730 	}
1727 1731 
1728 1732 	/* We did light-weight exit earlier, do full tear down now */
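The cpufreq.c hunk above both lowers the default transition-delay cap from 10 ms to 2 ms and lets a platform whose stated transition latency already exceeds the cap keep its own value. The resulting arithmetic can be sketched in userspace as follows (LATENCY_MULTIPLIER is 1000 in include/linux/cpufreq.h, to my knowledge; all times are in microseconds):

```c
/* LATENCY_MULTIPLIER as defined in include/linux/cpufreq.h. */
#define LATENCY_MULTIPLIER 1000U

/* Mirrors the reworked default-transition-delay logic shown above. */
static unsigned int default_transition_delay_us(unsigned int latency_us)
{
	const unsigned int max_delay_us = 2 * 1000;	/* 2 ms cap */
	unsigned int scaled;

	if (!latency_us)
		return LATENCY_MULTIPLIER;

	/* A platform with high transition latency keeps it as-is. */
	if (latency_us > max_delay_us)
		return latency_us;

	/* Otherwise scale the latency, capped at 2 ms (was 10 ms). */
	scaled = latency_us * LATENCY_MULTIPLIER;
	return scaled < max_delay_us ? scaled : max_delay_us;
}
```

So a platform with 1 us latency still gets a 1 ms delay, one with 5 us latency is now capped at 2 ms instead of 5 ms, and one reporting 3 ms latency is left at 3 ms.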
-1
drivers/cpufreq/cpufreq_ondemand.c
···
22 22 #define DEF_SAMPLING_DOWN_FACTOR		(1)
23 23 #define MAX_SAMPLING_DOWN_FACTOR		(100000)
24 24 #define MICRO_FREQUENCY_UP_THRESHOLD		(95)
25    -#define MICRO_FREQUENCY_MIN_SAMPLE_RATE		(10000)
26 25 #define MIN_FREQUENCY_UP_THRESHOLD		(1)
27 26 #define MAX_FREQUENCY_UP_THRESHOLD		(100)
28 27 
+14 -29
drivers/cpufreq/imx6q-cpufreq.c
··· 14 14 #include <linux/pm_opp.h> 15 15 #include <linux/platform_device.h> 16 16 #include <linux/regulator/consumer.h> 17 + #include <linux/mfd/syscon.h> 18 + #include <linux/regmap.h> 17 19 18 20 #define PU_SOC_VOLTAGE_NORMAL 1250000 19 21 #define PU_SOC_VOLTAGE_HIGH 1275000 ··· 227 225 228 226 static int imx6q_opp_check_speed_grading(struct device *dev) 229 227 { 230 - struct device_node *np; 231 - void __iomem *base; 232 228 u32 val; 233 229 int ret; 234 230 ··· 235 235 if (ret) 236 236 return ret; 237 237 } else { 238 - np = of_find_compatible_node(NULL, NULL, "fsl,imx6q-ocotp"); 239 - if (!np) 240 - return -ENOENT; 238 + struct regmap *ocotp; 241 239 242 - base = of_iomap(np, 0); 243 - of_node_put(np); 244 - if (!base) { 245 - dev_err(dev, "failed to map ocotp\n"); 246 - return -EFAULT; 247 - } 240 + ocotp = syscon_regmap_lookup_by_compatible("fsl,imx6q-ocotp"); 241 + if (IS_ERR(ocotp)) 242 + return -ENOENT; 248 243 249 244 /* 250 245 * SPEED_GRADING[1:0] defines the max speed of ARM: ··· 249 254 * 2b'00: 792000000Hz; 250 255 * We need to set the max speed of ARM according to fuse map. 
251 256 */ 252 - val = readl_relaxed(base + OCOTP_CFG3); 253 - iounmap(base); 257 + regmap_read(ocotp, OCOTP_CFG3, &val); 254 258 } 255 259 256 260 val >>= OCOTP_CFG3_SPEED_SHIFT; ··· 284 290 if (ret) 285 291 return ret; 286 292 } else { 287 - struct device_node *np; 288 - void __iomem *base; 293 + struct regmap *ocotp; 289 294 290 - np = of_find_compatible_node(NULL, NULL, "fsl,imx6ul-ocotp"); 291 - if (!np) 292 - np = of_find_compatible_node(NULL, NULL, 293 - "fsl,imx6ull-ocotp"); 294 - if (!np) 295 + ocotp = syscon_regmap_lookup_by_compatible("fsl,imx6ul-ocotp"); 296 + if (IS_ERR(ocotp)) 297 + ocotp = syscon_regmap_lookup_by_compatible("fsl,imx6ull-ocotp"); 298 + 299 + if (IS_ERR(ocotp)) 295 300 return -ENOENT; 296 301 297 - base = of_iomap(np, 0); 298 - of_node_put(np); 299 - if (!base) { 300 - dev_err(dev, "failed to map ocotp\n"); 301 - return -EFAULT; 302 - } 303 - 304 - val = readl_relaxed(base + OCOTP_CFG3); 305 - iounmap(base); 302 + regmap_read(ocotp, OCOTP_CFG3, &val); 306 303 } 307 304 308 305 /*
+37 -9
drivers/cpufreq/intel_pstate.c
··· 25 25 #include <linux/acpi.h> 26 26 #include <linux/vmalloc.h> 27 27 #include <linux/pm_qos.h> 28 + #include <linux/bitfield.h> 28 29 #include <trace/events/power.h> 29 30 30 31 #include <asm/cpu.h> ··· 202 201 * @prev_aperf: Last APERF value read from APERF MSR 203 202 * @prev_mperf: Last MPERF value read from MPERF MSR 204 203 * @prev_tsc: Last timestamp counter (TSC) value 205 - * @prev_cummulative_iowait: IO Wait time difference from last and 206 - * current sample 207 204 * @sample: Storage for storing last Sample data 208 205 * @min_perf_ratio: Minimum capacity in terms of PERF or HWP ratios 209 206 * @max_perf_ratio: Maximum capacity in terms of PERF or HWP ratios ··· 240 241 u64 prev_aperf; 241 242 u64 prev_mperf; 242 243 u64 prev_tsc; 243 - u64 prev_cummulative_iowait; 244 244 struct sample sample; 245 245 int32_t min_perf_ratio; 246 246 int32_t max_perf_ratio; ··· 3405 3407 return !!(value & 0x1); 3406 3408 } 3407 3409 3408 - static const struct x86_cpu_id intel_epp_balance_perf[] = { 3410 + #define POWERSAVE_MASK GENMASK(7, 0) 3411 + #define BALANCE_POWER_MASK GENMASK(15, 8) 3412 + #define BALANCE_PERFORMANCE_MASK GENMASK(23, 16) 3413 + #define PERFORMANCE_MASK GENMASK(31, 24) 3414 + 3415 + #define HWP_SET_EPP_VALUES(powersave, balance_power, balance_perf, performance) \ 3416 + (FIELD_PREP_CONST(POWERSAVE_MASK, powersave) |\ 3417 + FIELD_PREP_CONST(BALANCE_POWER_MASK, balance_power) |\ 3418 + FIELD_PREP_CONST(BALANCE_PERFORMANCE_MASK, balance_perf) |\ 3419 + FIELD_PREP_CONST(PERFORMANCE_MASK, performance)) 3420 + 3421 + #define HWP_SET_DEF_BALANCE_PERF_EPP(balance_perf) \ 3422 + (HWP_SET_EPP_VALUES(HWP_EPP_POWERSAVE, HWP_EPP_BALANCE_POWERSAVE,\ 3423 + balance_perf, HWP_EPP_PERFORMANCE)) 3424 + 3425 + static const struct x86_cpu_id intel_epp_default[] = { 3409 3426 /* 3410 3427 * Set EPP value as 102, this is the max suggested EPP 3411 3428 * which can result in one core turbo frequency for 3412 3429 * AlderLake Mobile CPUs. 
3413 3430 */ 3414 - X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, 102), 3415 - X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, 32), 3431 + X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, HWP_SET_DEF_BALANCE_PERF_EPP(102)), 3432 + X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, HWP_SET_DEF_BALANCE_PERF_EPP(32)), 3433 + X86_MATCH_INTEL_FAM6_MODEL(METEORLAKE_L, HWP_SET_EPP_VALUES(HWP_EPP_POWERSAVE, 3434 + HWP_EPP_BALANCE_POWERSAVE, 115, 16)), 3416 3435 {} 3417 3436 }; 3418 3437 ··· 3527 3512 intel_pstate_sysfs_expose_params(); 3528 3513 3529 3514 if (hwp_active) { 3530 - const struct x86_cpu_id *id = x86_match_cpu(intel_epp_balance_perf); 3515 + const struct x86_cpu_id *id = x86_match_cpu(intel_epp_default); 3531 3516 const struct x86_cpu_id *hybrid_id = x86_match_cpu(intel_hybrid_scaling_factor); 3532 3517 3533 - if (id) 3534 - epp_values[EPP_INDEX_BALANCE_PERFORMANCE] = id->driver_data; 3518 + if (id) { 3519 + epp_values[EPP_INDEX_POWERSAVE] = 3520 + FIELD_GET(POWERSAVE_MASK, id->driver_data); 3521 + epp_values[EPP_INDEX_BALANCE_POWERSAVE] = 3522 + FIELD_GET(BALANCE_POWER_MASK, id->driver_data); 3523 + epp_values[EPP_INDEX_BALANCE_PERFORMANCE] = 3524 + FIELD_GET(BALANCE_PERFORMANCE_MASK, id->driver_data); 3525 + epp_values[EPP_INDEX_PERFORMANCE] = 3526 + FIELD_GET(PERFORMANCE_MASK, id->driver_data); 3527 + pr_debug("Updated EPPs powersave:%x balanced power:%x balanced perf:%x performance:%x\n", 3528 + epp_values[EPP_INDEX_POWERSAVE], 3529 + epp_values[EPP_INDEX_BALANCE_POWERSAVE], 3530 + epp_values[EPP_INDEX_BALANCE_PERFORMANCE], 3531 + epp_values[EPP_INDEX_PERFORMANCE]); 3532 + } 3535 3533 3536 3534 if (hybrid_id) { 3537 3535 hybrid_scaling_factor = hybrid_id->driver_data;
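The intel_pstate hunk above packs four 8-bit EPP defaults into the 32-bit `driver_data` of a `struct x86_cpu_id` and later unpacks them with `FIELD_GET()`. A minimal userspace sketch of the same packing layout, with the kernel's `GENMASK()`/`FIELD_PREP()`/`FIELD_GET()` helpers re-implemented as simplified stand-ins (the helper definitions here are illustrative, not the `linux/bitfield.h` originals):

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel's bitfield helpers. */
#define GENMASK(h, l)    ((~0u << (l)) & (~0u >> (31 - (h))))
#define FIELD_PREP(m, v) (((uint32_t)(v) << __builtin_ctz(m)) & (m))
#define FIELD_GET(m, r)  (((r) & (m)) >> __builtin_ctz(m))

#define POWERSAVE_MASK           GENMASK(7, 0)
#define BALANCE_POWER_MASK       GENMASK(15, 8)
#define BALANCE_PERFORMANCE_MASK GENMASK(23, 16)
#define PERFORMANCE_MASK         GENMASK(31, 24)

/* Pack four 8-bit EPP values into one u32, matching the driver_data
 * layout used by HWP_SET_EPP_VALUES() in the hunk above. */
static inline uint32_t epp_pack(uint8_t powersave, uint8_t balance_power,
                                uint8_t balance_perf, uint8_t performance)
{
	return FIELD_PREP(POWERSAVE_MASK, powersave) |
	       FIELD_PREP(BALANCE_POWER_MASK, balance_power) |
	       FIELD_PREP(BALANCE_PERFORMANCE_MASK, balance_perf) |
	       FIELD_PREP(PERFORMANCE_MASK, performance);
}

/* Unpack one field, as intel_pstate_init() does for epp_values[]. */
static inline uint8_t epp_balance_perf(uint32_t packed)
{
	return FIELD_GET(BALANCE_PERFORMANCE_MASK, packed);
}
```

With the Meteor Lake numbers from the table (0xff, 0xc0, 115, 16), the four bytes land in ascending mask order, so each preference level can be overridden independently per CPU model.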
+18 -1
drivers/cpufreq/mediatek-cpufreq-hw.c
··· 13 13 #include <linux/of.h> 14 14 #include <linux/of_platform.h> 15 15 #include <linux/platform_device.h> 16 + #include <linux/regulator/consumer.h> 16 17 #include <linux/slab.h> 17 18 18 19 #define LUT_MAX_ENTRIES 32U ··· 301 300 static int mtk_cpufreq_hw_driver_probe(struct platform_device *pdev) 302 301 { 303 302 const void *data; 304 - int ret; 303 + int ret, cpu; 304 + struct device *cpu_dev; 305 + struct regulator *cpu_reg; 306 + 307 + /* Make sure that all CPU supplies are available before proceeding. */ 308 + for_each_possible_cpu(cpu) { 309 + cpu_dev = get_cpu_device(cpu); 310 + if (!cpu_dev) 311 + return dev_err_probe(&pdev->dev, -EPROBE_DEFER, 312 + "Failed to get cpu%d device\n", cpu); 313 + 314 + cpu_reg = devm_regulator_get(cpu_dev, "cpu"); 315 + if (IS_ERR(cpu_reg)) 316 + return dev_err_probe(&pdev->dev, PTR_ERR(cpu_reg), 317 + "CPU%d regulator get failed\n", cpu); 318 + } 319 + 305 320 306 321 data = of_device_get_match_data(&pdev->dev); 307 322 if (!data)
+26
drivers/cpufreq/scmi-cpufreq.c
··· 144 144 return 0; 145 145 } 146 146 147 + static int 148 + scmi_get_rate_limit(u32 domain, bool has_fast_switch) 149 + { 150 + int ret, rate_limit; 151 + 152 + if (has_fast_switch) { 153 + /* 154 + * Fast channels are used whenever available, 155 + * so use their rate_limit value if populated. 156 + */ 157 + ret = perf_ops->fast_switch_rate_limit(ph, domain, 158 + &rate_limit); 159 + if (!ret && rate_limit) 160 + return rate_limit; 161 + } 162 + 163 + ret = perf_ops->rate_limit_get(ph, domain, &rate_limit); 164 + if (ret) 165 + return 0; 166 + 167 + return rate_limit; 168 + } 169 + 147 170 static int scmi_cpufreq_init(struct cpufreq_policy *policy) 148 171 { 149 172 int ret, nr_opp, domain; ··· 272 249 273 250 policy->fast_switch_possible = 274 251 perf_ops->fast_switch_possible(ph, domain); 252 + 253 + policy->transition_delay_us = 254 + scmi_get_rate_limit(domain, policy->fast_switch_possible); 275 255 276 256 return 0; 277 257
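The `scmi_get_rate_limit()` helper above encodes a three-step preference: a populated fast-channel rate limit wins, otherwise the domain-wide value is used, and a failed query degrades to 0 (no limit advertised). The selection order in isolation, with the two SCMI `perf_ops` queries modeled as a plain struct of hypothetical stand-in fields:

```c
#include <stdint.h>

/* Stand-in for the results of the two SCMI perf_ops queries used by
 * scmi_get_rate_limit(); the field names here are illustrative. */
struct rate_limits {
	int fc_ret;       /* return code of the fast-channel query */
	uint32_t fc_us;   /* fast-channel rate limit, 0 if unpopulated */
	int dom_ret;      /* return code of the domain-wide query */
	uint32_t dom_us;  /* domain-wide rate limit */
};

static uint32_t pick_rate_limit(const struct rate_limits *l, int has_fast_switch)
{
	if (has_fast_switch && !l->fc_ret && l->fc_us)
		return l->fc_us;   /* fast channel wins when populated */
	if (!l->dom_ret)
		return l->dom_us;  /* otherwise the domain-wide value */
	return 0;                  /* query failed: advertise no limit */
}
```

The result feeds `policy->transition_delay_us`, so a firmware-advertised limit replaces cpufreq's generic default transition delay.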
+2 -1
drivers/cpuidle/driver.c
··· 16 16 #include <linux/cpumask.h> 17 17 #include <linux/tick.h> 18 18 #include <linux/cpu.h> 19 + #include <linux/math64.h> 19 20 20 21 #include "cpuidle.h" 21 22 ··· 188 187 s->target_residency = div_u64(s->target_residency_ns, NSEC_PER_USEC); 189 188 190 189 if (s->exit_latency > 0) 191 - s->exit_latency_ns = s->exit_latency * NSEC_PER_USEC; 190 + s->exit_latency_ns = mul_u32_u32(s->exit_latency, NSEC_PER_USEC); 192 191 else if (s->exit_latency_ns < 0) 193 192 s->exit_latency_ns = 0; 194 193 else
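The cpuidle change above replaces a plain `u32 * u32` product with `mul_u32_u32()`, because `exit_latency * NSEC_PER_USEC` is evaluated in 32-bit arithmetic before being widened, and wraps once the latency exceeds roughly 4.29 seconds in microseconds. A small sketch demonstrating the wrap, with `mul_u32_u32()` re-implemented as the widening multiply the generic kernel definition performs:

```c
#include <stdint.h>

#define NSEC_PER_USEC 1000u

/* Widening multiply: both operands become 64-bit before multiplying,
 * matching the generic fallback definition of mul_u32_u32(). */
static inline uint64_t mul_u32_u32(uint32_t a, uint32_t b)
{
	return (uint64_t)a * b;
}

/* The buggy form: the multiply happens in 32 bits and only the already
 * truncated result is widened to 64 bits. */
static inline uint64_t naive_us_to_ns(uint32_t us)
{
	return us * NSEC_PER_USEC;   /* wraps for us > ~4.29e6 */
}
```

For an exit latency of 5,000,000 µs the naive form loses the high bits (5e9 mod 2^32), while the widened form keeps the full nanosecond value.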
+7 -2
drivers/cpuidle/governors/haltpoll.c
··· 98 98 unsigned int shrink = guest_halt_poll_shrink; 99 99 100 100 val = dev->poll_limit_ns; 101 - if (shrink == 0) 101 + if (shrink == 0) { 102 102 val = 0; 103 - else 103 + } else { 104 104 val /= shrink; 105 + /* Reset value to 0 if shrunk below grow_start */ 106 + if (val < guest_halt_poll_grow_start) 107 + val = 0; 108 + } 109 + 105 110 trace_guest_halt_poll_ns_shrink(val, dev->poll_limit_ns); 106 111 dev->poll_limit_ns = val; 107 112 }
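The haltpoll fix above resets the poll limit to 0 once shrinking would take it below `guest_halt_poll_grow_start`, so the next grow cycle restarts cleanly from the grow-start value instead of creeping up from a tiny remainder. The shrink rule in isolation, with the module parameters modeled as plain arguments (function name is illustrative):

```c
#include <stdint.h>

/* One shrink step of the haltpoll governor's poll limit:
 * shrink == 0 disables polling outright; otherwise divide, and
 * reset to 0 if the result fell below the grow-start threshold. */
static uint64_t shrink_poll_ns(uint64_t limit_ns, unsigned int shrink,
			       uint64_t grow_start_ns)
{
	uint64_t val = limit_ns;

	if (shrink == 0) {
		val = 0;
	} else {
		val /= shrink;
		if (val < grow_start_ns)
			val = 0;
	}
	return val;
}
```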
+4 -1
drivers/firmware/arm_scmi/driver.c
··· 1624 1624 scmi_common_fastchannel_init(const struct scmi_protocol_handle *ph, 1625 1625 u8 describe_id, u32 message_id, u32 valid_size, 1626 1626 u32 domain, void __iomem **p_addr, 1627 - struct scmi_fc_db_info **p_db) 1627 + struct scmi_fc_db_info **p_db, u32 *rate_limit) 1628 1628 { 1629 1629 int ret; 1630 1630 u32 flags; ··· 1667 1667 ret = -EINVAL; 1668 1668 goto err_xfer; 1669 1669 } 1670 + 1671 + if (rate_limit) 1672 + *rate_limit = le32_to_cpu(resp->rate_limit) & GENMASK(19, 0); 1670 1673 1671 1674 phys_addr = le32_to_cpu(resp->chan_addr_low); 1672 1675 phys_addr |= (u64)le32_to_cpu(resp->chan_addr_high) << 32;
+49 -4
drivers/firmware/arm_scmi/perf.c
··· 153 153 bool perf_fastchannels; 154 154 bool level_indexing_mode; 155 155 u32 opp_count; 156 + u32 rate_limit_us; 156 157 u32 sustained_freq_khz; 157 158 u32 sustained_perf_level; 158 159 unsigned long mult_factor; ··· 283 282 if (PROTOCOL_REV_MAJOR(version) >= 0x4) 284 283 dom_info->level_indexing_mode = 285 284 SUPPORTS_LEVEL_INDEXING(flags); 285 + dom_info->rate_limit_us = le32_to_cpu(attr->rate_limit_us) & 286 + GENMASK(19, 0); 286 287 dom_info->sustained_freq_khz = 287 288 le32_to_cpu(attr->sustained_freq_khz); 288 289 dom_info->sustained_perf_level = ··· 828 825 829 826 ph->hops->fastchannel_init(ph, PERF_DESCRIBE_FASTCHANNEL, 830 827 PERF_LEVEL_GET, 4, dom->id, 831 - &fc[PERF_FC_LEVEL].get_addr, NULL); 828 + &fc[PERF_FC_LEVEL].get_addr, NULL, 829 + &fc[PERF_FC_LEVEL].rate_limit); 832 830 833 831 ph->hops->fastchannel_init(ph, PERF_DESCRIBE_FASTCHANNEL, 834 832 PERF_LIMITS_GET, 8, dom->id, 835 - &fc[PERF_FC_LIMIT].get_addr, NULL); 833 + &fc[PERF_FC_LIMIT].get_addr, NULL, 834 + &fc[PERF_FC_LIMIT].rate_limit); 836 835 837 836 if (dom->info.set_perf) 838 837 ph->hops->fastchannel_init(ph, PERF_DESCRIBE_FASTCHANNEL, 839 838 PERF_LEVEL_SET, 4, dom->id, 840 839 &fc[PERF_FC_LEVEL].set_addr, 841 - &fc[PERF_FC_LEVEL].set_db); 840 + &fc[PERF_FC_LEVEL].set_db, 841 + &fc[PERF_FC_LEVEL].rate_limit); 842 842 843 843 if (dom->set_limits) 844 844 ph->hops->fastchannel_init(ph, PERF_DESCRIBE_FASTCHANNEL, 845 845 PERF_LIMITS_SET, 8, dom->id, 846 846 &fc[PERF_FC_LIMIT].set_addr, 847 - &fc[PERF_FC_LIMIT].set_db); 847 + &fc[PERF_FC_LIMIT].set_db, 848 + &fc[PERF_FC_LIMIT].rate_limit); 848 849 849 850 dom->fc_info = fc; 850 851 } ··· 899 892 900 893 /* uS to nS */ 901 894 return dom->opp[dom->opp_count - 1].trans_latency_us * 1000; 895 + } 896 + 897 + static int 898 + scmi_dvfs_rate_limit_get(const struct scmi_protocol_handle *ph, 899 + u32 domain, u32 *rate_limit) 900 + { 901 + struct perf_dom_info *dom; 902 + 903 + if (!rate_limit) 904 + return -EINVAL; 905 + 906 + dom = 
scmi_perf_domain_lookup(ph, domain); 907 + if (IS_ERR(dom)) 908 + return PTR_ERR(dom); 909 + 910 + *rate_limit = dom->rate_limit_us; 911 + return 0; 902 912 } 903 913 904 914 static int scmi_dvfs_freq_set(const struct scmi_protocol_handle *ph, u32 domain, ··· 1017 993 return dom->fc_info && dom->fc_info[PERF_FC_LEVEL].set_addr; 1018 994 } 1019 995 996 + static int scmi_fast_switch_rate_limit(const struct scmi_protocol_handle *ph, 997 + u32 domain, u32 *rate_limit) 998 + { 999 + struct perf_dom_info *dom; 1000 + 1001 + if (!rate_limit) 1002 + return -EINVAL; 1003 + 1004 + dom = scmi_perf_domain_lookup(ph, domain); 1005 + if (IS_ERR(dom)) 1006 + return PTR_ERR(dom); 1007 + 1008 + if (!dom->fc_info) 1009 + return -EINVAL; 1010 + 1011 + *rate_limit = dom->fc_info[PERF_FC_LEVEL].rate_limit; 1012 + return 0; 1013 + } 1014 + 1020 1015 static enum scmi_power_scale 1021 1016 scmi_power_scale_get(const struct scmi_protocol_handle *ph) 1022 1017 { ··· 1052 1009 .level_set = scmi_perf_level_set, 1053 1010 .level_get = scmi_perf_level_get, 1054 1011 .transition_latency_get = scmi_dvfs_transition_latency_get, 1012 + .rate_limit_get = scmi_dvfs_rate_limit_get, 1055 1013 .device_opps_add = scmi_dvfs_device_opps_add, 1056 1014 .freq_set = scmi_dvfs_freq_set, 1057 1015 .freq_get = scmi_dvfs_freq_get, 1058 1016 .est_power_get = scmi_dvfs_est_power_get, 1059 1017 .fast_switch_possible = scmi_fast_switch_possible, 1018 + .fast_switch_rate_limit = scmi_fast_switch_rate_limit, 1060 1019 .power_scale_get = scmi_power_scale_get, 1061 1020 }; 1062 1021
+8 -4
drivers/firmware/arm_scmi/powercap.c
··· 719 719 ph->hops->fastchannel_init(ph, POWERCAP_DESCRIBE_FASTCHANNEL, 720 720 POWERCAP_CAP_SET, 4, domain, 721 721 &fc[POWERCAP_FC_CAP].set_addr, 722 - &fc[POWERCAP_FC_CAP].set_db); 722 + &fc[POWERCAP_FC_CAP].set_db, 723 + &fc[POWERCAP_FC_CAP].rate_limit); 723 724 724 725 ph->hops->fastchannel_init(ph, POWERCAP_DESCRIBE_FASTCHANNEL, 725 726 POWERCAP_CAP_GET, 4, domain, 726 - &fc[POWERCAP_FC_CAP].get_addr, NULL); 727 + &fc[POWERCAP_FC_CAP].get_addr, NULL, 728 + &fc[POWERCAP_FC_CAP].rate_limit); 727 729 728 730 ph->hops->fastchannel_init(ph, POWERCAP_DESCRIBE_FASTCHANNEL, 729 731 POWERCAP_PAI_SET, 4, domain, 730 732 &fc[POWERCAP_FC_PAI].set_addr, 731 - &fc[POWERCAP_FC_PAI].set_db); 733 + &fc[POWERCAP_FC_PAI].set_db, 734 + &fc[POWERCAP_FC_PAI].rate_limit); 732 735 733 736 ph->hops->fastchannel_init(ph, POWERCAP_DESCRIBE_FASTCHANNEL, 734 737 POWERCAP_PAI_GET, 4, domain, 735 738 &fc[POWERCAP_FC_PAI].get_addr, NULL, 739 + &fc[POWERCAP_FC_PAI].rate_limit); 736 740 737 741 *p_fc = fc; 738 742 }
+3 -1
drivers/firmware/arm_scmi/protocols.h
··· 235 235 void __iomem *set_addr; 236 236 void __iomem *get_addr; 237 237 struct scmi_fc_db_info *set_db; 238 + u32 rate_limit; 238 239 }; 239 240 240 241 /** ··· 274 273 u8 describe_id, u32 message_id, 275 274 u32 valid_size, u32 domain, 276 275 void __iomem **p_addr, 277 - struct scmi_fc_db_info **p_db); 276 + struct scmi_fc_db_info **p_db, 277 + u32 *rate_limit); 278 278 void (*fastchannel_db_ring)(struct scmi_fc_db_info *db); 279 279 }; 280 280
+4 -1
drivers/gpu/drm/i915/intel_runtime_pm.c
··· 246 246 * function, since the power state is undefined. This applies 247 247 * atm to the late/early system suspend/resume handlers. 248 248 */ 249 - if (pm_runtime_get_if_active(rpm->kdev, ignore_usecount) <= 0) 249 + if ((ignore_usecount && 250 + pm_runtime_get_if_active(rpm->kdev) <= 0) || 251 + (!ignore_usecount && 252 + pm_runtime_get_if_in_use(rpm->kdev) <= 0)) 250 253 return 0; 251 254 } 252 255
+1 -1
drivers/gpu/drm/xe/xe_pm.c
··· 330 330 331 331 int xe_pm_runtime_get_if_active(struct xe_device *xe) 332 332 { 333 - return pm_runtime_get_if_active(xe->drm.dev, true); 333 + return pm_runtime_get_if_active(xe->drm.dev); 334 334 } 335 335 336 336 void xe_pm_assert_unbounded_bridge(struct xe_device *xe)
+2 -1
drivers/idle/intel_idle.c
··· 1934 1934 1935 1935 static bool __init intel_idle_verify_cstate(unsigned int mwait_hint) 1936 1936 { 1937 - unsigned int mwait_cstate = MWAIT_HINT2CSTATE(mwait_hint) + 1; 1937 + unsigned int mwait_cstate = (MWAIT_HINT2CSTATE(mwait_hint) + 1) & 1938 + MWAIT_CSTATE_MASK; 1938 1939 unsigned int num_substates = (mwait_substates >> mwait_cstate * 4) & 1939 1940 MWAIT_SUBSTATE_MASK; 1940 1941
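The intel_idle fix above masks the computed C-state index: MWAIT hints store `target_cstate - 1`, so a hint whose C-state field is 0xf decodes to 16 without the mask, and the subsequent `mwait_substates >> mwait_cstate * 4` becomes a 64-bit shift by 64 (undefined behaviour). A sketch of the corrected decode, with the relevant constants spelled out as in `asm/mwait.h` (values reproduced from the kernel headers; treat them as assumptions of this sketch):

```c
#define MWAIT_SUBSTATE_SIZE 4
#define MWAIT_CSTATE_MASK   0xf
#define MWAIT_HINT2CSTATE(hint) (((hint) >> MWAIT_SUBSTATE_SIZE) & MWAIT_CSTATE_MASK)

/* Recover the target C-state from a hint and wrap it back into 0..15;
 * without the final mask, hint 0xf0 would yield 16 and the caller
 * would shift the substate word by 64 bits. */
static unsigned int mwait_hint_to_cstate(unsigned int mwait_hint)
{
	return (MWAIT_HINT2CSTATE(mwait_hint) + 1) & MWAIT_CSTATE_MASK;
}
```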
+1 -1
drivers/media/i2c/ccs/ccs-core.c
··· 674 674 break; 675 675 } 676 676 677 - pm_status = pm_runtime_get_if_active(&client->dev, true); 677 + pm_status = pm_runtime_get_if_active(&client->dev); 678 678 if (!pm_status) 679 679 return 0; 680 680
+1 -1
drivers/media/i2c/ov64a40.c
··· 3287 3287 exp_max, 1, exp_val); 3288 3288 } 3289 3289 3290 - pm_status = pm_runtime_get_if_active(ov64a40->dev, true); 3290 + pm_status = pm_runtime_get_if_active(ov64a40->dev); 3291 3291 if (!pm_status) 3292 3292 return 0; 3293 3293
+1 -1
drivers/media/i2c/thp7312.c
··· 1052 1052 if (ctrl->flags & V4L2_CTRL_FLAG_INACTIVE) 1053 1053 return -EINVAL; 1054 1054 1055 - if (!pm_runtime_get_if_active(thp7312->dev, true)) 1055 + if (!pm_runtime_get_if_active(thp7312->dev)) 1056 1056 return 0; 1057 1057 1058 1058 switch (ctrl->id) {
+1 -1
drivers/net/ipa/ipa_smp2p.c
··· 90 90 if (smp2p->notified) 91 91 return; 92 92 93 - smp2p->power_on = pm_runtime_get_if_active(smp2p->ipa->dev, true) > 0; 93 + smp2p->power_on = pm_runtime_get_if_active(smp2p->ipa->dev) > 0; 94 94 95 95 /* Signal whether the IPA power is enabled */ 96 96 mask = BIT(smp2p->enabled_bit);
+1
drivers/opp/core.c
··· 2065 2065 /* populate the opp table */ 2066 2066 new_opp->rates[0] = data->freq; 2067 2067 new_opp->level = data->level; 2068 + new_opp->turbo = data->turbo; 2068 2069 tol = u_volt * opp_table->voltage_tolerance_v1 / 100; 2069 2070 new_opp->supplies[0].u_volt = u_volt; 2070 2071 new_opp->supplies[0].u_volt_min = u_volt - tol;
+8 -6
drivers/opp/debugfs.c
··· 37 37 size_t count, loff_t *ppos) 38 38 { 39 39 struct icc_path *path = fp->private_data; 40 + const char *name = icc_get_name(path); 40 41 char buf[64]; 41 - int i; 42 + int i = 0; 42 43 43 - i = scnprintf(buf, sizeof(buf), "%.62s\n", icc_get_name(path)); 44 + if (name) 45 + i = scnprintf(buf, sizeof(buf), "%.62s\n", name); 44 46 45 47 return simple_read_from_buffer(userbuf, count, ppos, buf, i); 46 48 } ··· 58 56 struct dentry *pdentry) 59 57 { 60 58 struct dentry *d; 61 - char name[20]; 59 + char name[] = "icc-path-XXXXXXXXXXX"; /* Integers can take 11 chars max */ 62 60 int i; 63 61 64 62 for (i = 0; i < opp_table->path_count; i++) { 65 - snprintf(name, sizeof(name), "icc-path-%.1d", i); 63 + snprintf(name, sizeof(name), "icc-path-%d", i); 66 64 67 65 /* Create per-path directory */ 68 66 d = debugfs_create_dir(name, pdentry); ··· 80 78 struct opp_table *opp_table, 81 79 struct dentry *pdentry) 82 80 { 83 - char name[12]; 81 + char name[] = "rate_hz_XXXXXXXXXXX"; /* Integers can take 11 chars max */ 84 82 int i; 85 83 86 84 if (opp_table->clk_count == 1) { ··· 102 100 int i; 103 101 104 102 for (i = 0; i < opp_table->regulator_count; i++) { 105 - char name[15]; 103 + char name[] = "supply-XXXXXXXXXXX"; /* Integers can take 11 chars max */ 106 104 107 105 snprintf(name, sizeof(name), "supply-%d", i); 108 106
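The debugfs change above sizes the name buffers with an `XXXXXXXXXXX` placeholder because a decimal `int` needs at most 11 characters (`-2147483648`), so `sizeof(name)` allocates exactly prefix + 11 + NUL. A quick self-contained check of that sizing rule (the helper function is illustrative):

```c
#include <limits.h>
#include <stdio.h>

/* Worst-case length of "icc-path-%d": 9-char prefix + 11 digit/sign
 * chars + NUL.  The placeholder string makes sizeof(name) allocate
 * exactly that many bytes. */
static int worst_case_fits(void)
{
	char name[] = "icc-path-XXXXXXXXXXX";
	int n = snprintf(name, sizeof(name), "icc-path-%d", INT_MIN);

	/* snprintf returns the length it wanted to write (excluding the
	 * NUL); it fits iff that is strictly below the buffer size. */
	return n >= 0 && (size_t)n < sizeof(name);
}
```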
+1 -1
drivers/pci/pci.c
··· 2532 2532 * course of the call. 2533 2533 */ 2534 2534 if (bdev) { 2535 - bref = pm_runtime_get_if_active(bdev, true); 2535 + bref = pm_runtime_get_if_active(bdev); 2536 2536 if (!bref) 2537 2537 continue; 2538 2538
+1 -1
drivers/powercap/dtpm.c
··· 522 522 523 523 /** 524 524 * dtpm_create_hierarchy - Create the dtpm hierarchy 525 - * @hierarchy: An array of struct dtpm_node describing the hierarchy 525 + * @dtpm_match_table: Pointer to the array of device ID structures 526 526 * 527 527 * The function is called by the platform specific code with the 528 528 * description of the different node in the hierarchy. It creates the
+31 -12
drivers/powercap/dtpm_cpu.c
··· 42 42 { 43 43 struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm); 44 44 struct em_perf_domain *pd = em_cpu_get(dtpm_cpu->cpu); 45 + struct em_perf_state *table; 45 46 struct cpumask cpus; 46 47 unsigned long freq; 47 48 u64 power; ··· 51 50 cpumask_and(&cpus, cpu_online_mask, to_cpumask(pd->cpus)); 52 51 nr_cpus = cpumask_weight(&cpus); 53 52 53 + rcu_read_lock(); 54 + table = em_perf_state_from_pd(pd); 54 55 for (i = 0; i < pd->nr_perf_states; i++) { 55 56 56 - power = pd->table[i].power * nr_cpus; 57 + power = table[i].power * nr_cpus; 57 58 58 59 if (power > power_limit) 59 60 break; 60 61 } 61 62 62 - freq = pd->table[i - 1].frequency; 63 + freq = table[i - 1].frequency; 64 + power_limit = table[i - 1].power * nr_cpus; 65 + rcu_read_unlock(); 63 66 64 67 freq_qos_update_request(&dtpm_cpu->qos_req, freq); 65 - 66 - power_limit = pd->table[i - 1].power * nr_cpus; 67 68 68 69 return power_limit; 69 70 } ··· 90 87 static u64 get_pd_power_uw(struct dtpm *dtpm) 91 88 { 92 89 struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm); 90 + struct em_perf_state *table; 93 91 struct em_perf_domain *pd; 94 92 struct cpumask *pd_mask; 95 93 unsigned long freq; 94 + u64 power = 0; 96 95 int i; 97 96 98 97 pd = em_cpu_get(dtpm_cpu->cpu); ··· 103 98 104 99 freq = cpufreq_quick_get(dtpm_cpu->cpu); 105 100 101 + rcu_read_lock(); 102 + table = em_perf_state_from_pd(pd); 106 103 for (i = 0; i < pd->nr_perf_states; i++) { 107 104 108 - if (pd->table[i].frequency < freq) 105 + if (table[i].frequency < freq) 109 106 continue; 110 107 111 - return scale_pd_power_uw(pd_mask, pd->table[i].power); 108 + power = scale_pd_power_uw(pd_mask, table[i].power); 109 + break; 112 110 } 111 + rcu_read_unlock(); 113 112 114 - return 0; 113 + return power; 115 114 } 116 115 117 116 static int update_pd_power_uw(struct dtpm *dtpm) 118 117 { 119 118 struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm); 120 119 struct em_perf_domain *em = em_cpu_get(dtpm_cpu->cpu); 120 + struct em_perf_state *table; 121 121 struct 
cpumask cpus; 122 122 int nr_cpus; 123 123 124 124 cpumask_and(&cpus, cpu_online_mask, to_cpumask(em->cpus)); 125 125 nr_cpus = cpumask_weight(&cpus); 126 126 127 - dtpm->power_min = em->table[0].power; 127 + rcu_read_lock(); 128 + table = em_perf_state_from_pd(em); 129 + 130 + dtpm->power_min = table[0].power; 128 131 dtpm->power_min *= nr_cpus; 129 132 130 - dtpm->power_max = em->table[em->nr_perf_states - 1].power; 133 + dtpm->power_max = table[em->nr_perf_states - 1].power; 131 134 dtpm->power_max *= nr_cpus; 135 + 136 + rcu_read_unlock(); 132 137 133 138 return 0; 134 139 } ··· 158 143 159 144 cpufreq_cpu_put(policy); 160 145 } 161 - 146 + 162 147 kfree(dtpm_cpu); 163 148 } 164 149 ··· 195 180 { 196 181 struct dtpm_cpu *dtpm_cpu; 197 182 struct cpufreq_policy *policy; 183 + struct em_perf_state *table; 198 184 struct em_perf_domain *pd; 199 185 char name[CPUFREQ_NAME_LEN]; 200 186 int ret = -ENOMEM; ··· 232 216 if (ret) 233 217 goto out_kfree_dtpm_cpu; 234 218 219 + rcu_read_lock(); 220 + table = em_perf_state_from_pd(pd); 235 221 ret = freq_qos_add_request(&policy->constraints, 236 222 &dtpm_cpu->qos_req, FREQ_QOS_MAX, 237 - pd->table[pd->nr_perf_states - 1].frequency); 238 - if (ret) 223 + table[pd->nr_perf_states - 1].frequency); 224 + rcu_read_unlock(); 225 + if (ret < 0) 239 226 goto out_dtpm_unregister; 240 227 241 228 cpufreq_cpu_put(policy);
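The dtpm changes above all follow one discipline: take `rcu_read_lock()`, fetch the now runtime-allocated EM table via `em_perf_state_from_pd()`, copy out the scalar values needed, and drop the lock before using them; letting a table pointer escape the read-side section would race with a concurrent Energy Model update. A userspace sketch of that copy-out pattern, with empty stubs standing in for the RCU read-side markers and an illustrative table (values and names are hypothetical):

```c
#include <stdint.h>

struct perf_state { uint64_t frequency, power; };

/* Stand-ins for rcu_read_lock()/rcu_read_unlock(): in the kernel they
 * delimit the region in which the fetched table pointer stays valid. */
static void em_read_lock(void)   { }
static void em_read_unlock(void) { }

static const struct perf_state em_table[3] = {
	{  500, 100 },	/* kHz, uW: illustrative numbers only */
	{ 1000, 300 },
	{ 1500, 700 },
};

/* Pick the highest frequency whose power fits the limit.  Every value
 * used after unlock is copied out inside the read-side section. */
static uint64_t freq_for_power_limit(uint64_t power_limit)
{
	uint64_t freq;
	int i;

	em_read_lock();
	for (i = 1; i < 3; i++)
		if (em_table[i].power > power_limit)
			break;
	freq = em_table[i - 1].frequency;	/* snapshot, then unlock */
	em_read_unlock();

	return freq;
}
```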
+23 -11
drivers/powercap/dtpm_devfreq.c
··· 37 37 struct devfreq *devfreq = dtpm_devfreq->devfreq; 38 38 struct device *dev = devfreq->dev.parent; 39 39 struct em_perf_domain *pd = em_pd_get(dev); 40 + struct em_perf_state *table; 40 41 41 - dtpm->power_min = pd->table[0].power; 42 + rcu_read_lock(); 43 + table = em_perf_state_from_pd(pd); 42 44 43 - dtpm->power_max = pd->table[pd->nr_perf_states - 1].power; 45 + dtpm->power_min = table[0].power; 44 46 47 + dtpm->power_max = table[pd->nr_perf_states - 1].power; 48 + 49 + rcu_read_unlock(); 45 50 return 0; 46 51 } 47 52 ··· 56 51 struct devfreq *devfreq = dtpm_devfreq->devfreq; 57 52 struct device *dev = devfreq->dev.parent; 58 53 struct em_perf_domain *pd = em_pd_get(dev); 54 + struct em_perf_state *table; 59 55 unsigned long freq; 60 56 int i; 61 57 58 + rcu_read_lock(); 59 + table = em_perf_state_from_pd(pd); 62 60 for (i = 0; i < pd->nr_perf_states; i++) { 63 - if (pd->table[i].power > power_limit) 61 + if (table[i].power > power_limit) 64 62 break; 65 63 } 66 64 67 - freq = pd->table[i - 1].frequency; 65 + freq = table[i - 1].frequency; 66 + power_limit = table[i - 1].power; 67 + rcu_read_unlock(); 68 68 69 69 dev_pm_qos_update_request(&dtpm_devfreq->qos_req, freq); 70 - 71 - power_limit = pd->table[i - 1].power; 72 70 73 71 return power_limit; 74 72 } ··· 97 89 struct device *dev = devfreq->dev.parent; 98 90 struct em_perf_domain *pd = em_pd_get(dev); 99 91 struct devfreq_dev_status status; 92 + struct em_perf_state *table; 100 93 unsigned long freq; 101 - u64 power; 94 + u64 power = 0; 102 95 int i; 103 96 104 97 mutex_lock(&devfreq->lock); ··· 109 100 freq = DIV_ROUND_UP(status.current_frequency, HZ_PER_KHZ); 110 101 _normalize_load(&status); 111 102 103 + rcu_read_lock(); 104 + table = em_perf_state_from_pd(pd); 112 105 for (i = 0; i < pd->nr_perf_states; i++) { 113 106 114 - if (pd->table[i].frequency < freq) 107 + if (table[i].frequency < freq) 115 108 continue; 116 109 117 - power = pd->table[i].power; 110 + power = table[i].power; 118 111 
power *= status.busy_time; 119 112 power >>= 10; 120 113 121 - return power; 114 + break; 122 115 } 116 + rcu_read_unlock(); 123 117 124 - return 0; 118 + return power; 125 119 } 126 120 127 121 static void pd_release(struct dtpm *dtpm)
+33 -3
drivers/powercap/intel_rapl_common.c
··· 5 5 */ 6 6 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt 7 7 8 + #include <linux/cleanup.h> 8 9 #include <linux/kernel.h> 9 10 #include <linux/module.h> 10 11 #include <linux/list.h> ··· 760 759 default: 761 760 return -EINVAL; 762 761 } 762 + 763 + /* defaults_msr can be NULL on unsupported platforms */ 764 + if (!rp->priv->defaults || !rp->priv->rpi) 765 + return -ENODEV; 766 + 763 767 return 0; 764 768 } 765 769 ··· 1262 1256 X86_MATCH_INTEL_FAM6_MODEL(METEORLAKE_L, &rapl_defaults_core), 1263 1257 X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, &rapl_defaults_spr_server), 1264 1258 X86_MATCH_INTEL_FAM6_MODEL(EMERALDRAPIDS_X, &rapl_defaults_spr_server), 1259 + X86_MATCH_INTEL_FAM6_MODEL(LUNARLAKE_M, &rapl_defaults_core), 1260 + X86_MATCH_INTEL_FAM6_MODEL(ARROWLAKE, &rapl_defaults_core), 1265 1261 X86_MATCH_INTEL_FAM6_MODEL(LAKEFIELD, &rapl_defaults_core), 1266 1262 1267 1263 X86_MATCH_INTEL_FAM6_MODEL(ATOM_SILVERMONT, &rapl_defaults_byt), ··· 1507 1499 } 1508 1500 1509 1501 /* called from CPU hotplug notifier, hotplug lock held */ 1510 - void rapl_remove_package(struct rapl_package *rp) 1502 + void rapl_remove_package_cpuslocked(struct rapl_package *rp) 1511 1503 { 1512 1504 struct rapl_domain *rd, *rd_package = NULL; 1513 1505 ··· 1536 1528 list_del(&rp->plist); 1537 1529 kfree(rp); 1538 1530 } 1531 + EXPORT_SYMBOL_GPL(rapl_remove_package_cpuslocked); 1532 + 1533 + void rapl_remove_package(struct rapl_package *rp) 1534 + { 1535 + guard(cpus_read_lock)(); 1536 + rapl_remove_package_cpuslocked(rp); 1537 + } 1539 1538 EXPORT_SYMBOL_GPL(rapl_remove_package); 1540 1539 1541 1540 /* caller to ensure CPU hotplug lock is held */ 1542 - struct rapl_package *rapl_find_package_domain(int id, struct rapl_if_priv *priv, bool id_is_cpu) 1541 + struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_priv *priv, 1542 + bool id_is_cpu) 1543 1543 { 1544 1544 struct rapl_package *rp; 1545 1545 int uid; ··· 1565 1549 1566 1550 return NULL; 1567 1551 } 1552 + 
EXPORT_SYMBOL_GPL(rapl_find_package_domain_cpuslocked); 1553 + 1554 + struct rapl_package *rapl_find_package_domain(int id, struct rapl_if_priv *priv, bool id_is_cpu) 1555 + { 1556 + guard(cpus_read_lock)(); 1557 + return rapl_find_package_domain_cpuslocked(id, priv, id_is_cpu); 1558 + } 1568 1559 EXPORT_SYMBOL_GPL(rapl_find_package_domain); 1569 1560 1570 1561 /* called from CPU hotplug notifier, hotplug lock held */ 1571 - struct rapl_package *rapl_add_package(int id, struct rapl_if_priv *priv, bool id_is_cpu) 1562 + struct rapl_package *rapl_add_package_cpuslocked(int id, struct rapl_if_priv *priv, bool id_is_cpu) 1572 1563 { 1573 1564 struct rapl_package *rp; 1574 1565 int ret; ··· 1620 1597 kfree(rp->domains); 1621 1598 kfree(rp); 1622 1599 return ERR_PTR(ret); 1600 + } 1601 + EXPORT_SYMBOL_GPL(rapl_add_package_cpuslocked); 1602 + 1603 + struct rapl_package *rapl_add_package(int id, struct rapl_if_priv *priv, bool id_is_cpu) 1604 + { 1605 + guard(cpus_read_lock)(); 1606 + return rapl_add_package_cpuslocked(id, priv, id_is_cpu); 1623 1607 } 1624 1608 EXPORT_SYMBOL_GPL(rapl_add_package); 1625 1609
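The new `rapl_remove_package()`/`rapl_add_package()` wrappers above use `guard(cpus_read_lock)()` from `linux/cleanup.h`, which releases the lock automatically at end of scope via the compiler's `cleanup` variable attribute. A stand-alone sketch of that mechanism (GCC/Clang only; the names below are illustrative stand-ins, not the kernel macros):

```c
/* Scope-based lock guard built on __attribute__((cleanup)), the same
 * mechanism behind the kernel's guard(cpus_read_lock)() helper. */
static int lock_depth;

static void fake_lock(void)   { lock_depth++; }
static void fake_unlock(void) { lock_depth--; }

static void guard_cleanup(int *unused)
{
	(void)unused;
	fake_unlock();
}

/* Takes the lock now; the cleanup handler releases it when the guard
 * variable goes out of scope -- on every return path. */
#define scoped_guard_lock() \
	__attribute__((cleanup(guard_cleanup), unused)) int _guard = (fake_lock(), 0)

static int do_locked_work(int fail)
{
	scoped_guard_lock();
	if (fail)
		return -1;	/* unlock still runs on this path */
	return lock_depth;	/* == 1 while the guard is live */
}
```

The cleanup handler runs after the return expression is evaluated, which is why `do_locked_work(0)` can still observe the held lock in its return value.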
+4 -4
drivers/powercap/intel_rapl_msr.c
··· 73 73 { 74 74 struct rapl_package *rp; 75 75 76 - rp = rapl_find_package_domain(cpu, rapl_msr_priv, true); 76 + rp = rapl_find_package_domain_cpuslocked(cpu, rapl_msr_priv, true); 77 77 if (!rp) { 78 - rp = rapl_add_package(cpu, rapl_msr_priv, true); 78 + rp = rapl_add_package_cpuslocked(cpu, rapl_msr_priv, true); 79 79 if (IS_ERR(rp)) 80 80 return PTR_ERR(rp); 81 81 } ··· 88 88 struct rapl_package *rp; 89 89 int lead_cpu; 90 90 91 - rp = rapl_find_package_domain(cpu, rapl_msr_priv, true); 91 + rp = rapl_find_package_domain_cpuslocked(cpu, rapl_msr_priv, true); 92 92 if (!rp) 93 93 return 0; 94 94 95 95 cpumask_clear_cpu(cpu, &rp->cpumask); 96 96 lead_cpu = cpumask_first(&rp->cpumask); 97 97 if (lead_cpu >= nr_cpu_ids) 98 - rapl_remove_package(rp); 98 + rapl_remove_package_cpuslocked(rp); 99 99 else if (rp->lead_cpu == cpu) 100 100 rp->lead_cpu = lead_cpu; 101 101 return 0;
+15
drivers/powercap/intel_rapl_tpmi.c
··· 40 40 TPMI_RAPL_REG_ENERGY_STATUS, 41 41 TPMI_RAPL_REG_PERF_STATUS, 42 42 TPMI_RAPL_REG_POWER_INFO, 43 + TPMI_RAPL_REG_DOMAIN_INFO, 43 44 TPMI_RAPL_REG_INTERRUPT, 44 45 TPMI_RAPL_REG_MAX = 15, 45 46 }; ··· 131 130 mutex_unlock(&tpmi_rapl_lock); 132 131 } 133 132 133 + /* 134 + * Bit 0 of TPMI_RAPL_REG_DOMAIN_INFO indicates if the current package is a domain 135 + * root or not. Only domain root packages can enumerate System (Psys) Domain. 136 + */ 137 + #define TPMI_RAPL_DOMAIN_ROOT BIT(0) 138 + 134 139 static int parse_one_domain(struct tpmi_rapl_package *trp, u32 offset) 135 140 { 136 141 u8 tpmi_domain_version; ··· 146 139 enum rapl_domain_reg_id reg_id; 147 140 int tpmi_domain_size, tpmi_domain_flags; 148 141 u64 tpmi_domain_header = readq(trp->base + offset); 142 + u64 tpmi_domain_info; 149 143 150 144 /* Domain Parent bits are ignored for now */ 151 145 tpmi_domain_version = tpmi_domain_header & 0xff; ··· 177 169 domain_type = RAPL_DOMAIN_PACKAGE; 178 170 break; 179 171 case TPMI_RAPL_DOMAIN_SYSTEM: 172 + if (!(tpmi_domain_flags & BIT(TPMI_RAPL_REG_DOMAIN_INFO))) { 173 + pr_warn(FW_BUG "System domain must support Domain Info register\n"); 174 + return -ENODEV; 175 + } 176 + tpmi_domain_info = readq(trp->base + offset + TPMI_RAPL_REG_DOMAIN_INFO); 177 + if (!(tpmi_domain_info & TPMI_RAPL_DOMAIN_ROOT)) 178 + return 0; 180 179 domain_type = RAPL_DOMAIN_PLATFORM; 181 180 break; 182 181 case TPMI_RAPL_DOMAIN_MEMORY:
+37 -8
drivers/thermal/cpufreq_cooling.c
··· 91 91 static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev, 92 92 unsigned int freq) 93 93 { 94 + struct em_perf_state *table; 94 95 int i; 95 96 97 + rcu_read_lock(); 98 + table = em_perf_state_from_pd(cpufreq_cdev->em); 96 99 for (i = cpufreq_cdev->max_level - 1; i >= 0; i--) { 97 - if (freq > cpufreq_cdev->em->table[i].frequency) 100 + if (freq > table[i].frequency) 98 101 break; 99 102 } 103 + rcu_read_unlock(); 100 104 101 105 return cpufreq_cdev->max_level - i - 1; 102 106 } ··· 108 104 static u32 cpu_freq_to_power(struct cpufreq_cooling_device *cpufreq_cdev, 109 105 u32 freq) 110 106 { 107 + struct em_perf_state *table; 111 108 unsigned long power_mw; 112 109 int i; 113 110 111 + rcu_read_lock(); 112 + table = em_perf_state_from_pd(cpufreq_cdev->em); 114 113 for (i = cpufreq_cdev->max_level - 1; i >= 0; i--) { 115 - if (freq > cpufreq_cdev->em->table[i].frequency) 114 + if (freq > table[i].frequency) 116 115 break; 117 116 } 118 117 119 - power_mw = cpufreq_cdev->em->table[i + 1].power; 118 + power_mw = table[i + 1].power; 120 119 power_mw /= MICROWATT_PER_MILLIWATT; 120 + rcu_read_unlock(); 121 121 122 122 return power_mw; 123 123 } ··· 129 121 static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_cdev, 130 122 u32 power) 131 123 { 124 + struct em_perf_state *table; 132 125 unsigned long em_power_mw; 126 + u32 freq; 133 127 int i; 134 128 129 + rcu_read_lock(); 130 + table = em_perf_state_from_pd(cpufreq_cdev->em); 135 131 for (i = cpufreq_cdev->max_level; i > 0; i--) { 136 132 /* Convert EM power to milli-Watts to make safe comparison */ 137 - em_power_mw = cpufreq_cdev->em->table[i].power; 133 + em_power_mw = table[i].power; 138 134 em_power_mw /= MICROWATT_PER_MILLIWATT; 139 135 if (power >= em_power_mw) 140 136 break; 141 137 } 138 + freq = table[i].frequency; 139 + rcu_read_unlock(); 142 140 143 - return cpufreq_cdev->em->table[i].frequency; 141 + return freq; 144 142 } 145 143 146 144 /** ··· 276 262 static int 
cpufreq_state2power(struct thermal_cooling_device *cdev, 277 263 unsigned long state, u32 *power) 278 264 { 279 - unsigned int freq, num_cpus, idx; 280 265 struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata; 266 + unsigned int freq, num_cpus, idx; 267 + struct em_perf_state *table; 281 268 282 269 /* Request state should be less than max_level */ 283 270 if (state > cpufreq_cdev->max_level) ··· 287 272 num_cpus = cpumask_weight(cpufreq_cdev->policy->cpus); 288 273 289 274 idx = cpufreq_cdev->max_level - state; 290 - freq = cpufreq_cdev->em->table[idx].frequency; 275 + 276 + rcu_read_lock(); 277 + table = em_perf_state_from_pd(cpufreq_cdev->em); 278 + freq = table[idx].frequency; 279 + rcu_read_unlock(); 280 + 291 281 *power = cpu_freq_to_power(cpufreq_cdev, freq) * num_cpus; 292 282 293 283 return 0; ··· 398 378 #ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR 399 379 /* Use the Energy Model table if available */ 400 380 if (cpufreq_cdev->em) { 381 + struct em_perf_state *table; 382 + unsigned int freq; 383 + 401 384 idx = cpufreq_cdev->max_level - state; 402 - return cpufreq_cdev->em->table[idx].frequency; 385 + 386 + rcu_read_lock(); 387 + table = em_perf_state_from_pd(cpufreq_cdev->em); 388 + freq = table[idx].frequency; 389 + rcu_read_unlock(); 390 + 391 + return freq; 403 392 } 404 393 #endif 405 394
+41 -10
drivers/thermal/devfreq_cooling.c
··· 87 87 struct devfreq_cooling_device *dfc = cdev->devdata; 88 88 struct devfreq *df = dfc->devfreq; 89 89 struct device *dev = df->dev.parent; 90 + struct em_perf_state *table; 90 91 unsigned long freq; 91 92 int perf_idx; 92 93 ··· 101 100 102 101 if (dfc->em_pd) { 103 102 perf_idx = dfc->max_state - state; 104 - freq = dfc->em_pd->table[perf_idx].frequency * 1000; 103 + 104 + rcu_read_lock(); 105 + table = em_perf_state_from_pd(dfc->em_pd); 106 + freq = table[perf_idx].frequency * 1000; 107 + rcu_read_unlock(); 105 108 } else { 106 109 freq = dfc->freq_table[state]; 107 110 } ··· 128 123 */ 129 124 static int get_perf_idx(struct em_perf_domain *em_pd, unsigned long freq) 130 125 { 131 - int i; 126 + struct em_perf_state *table; 127 + int i, idx = -EINVAL; 132 128 129 + rcu_read_lock(); 130 + table = em_perf_state_from_pd(em_pd); 133 131 for (i = 0; i < em_pd->nr_perf_states; i++) { 134 - if (em_pd->table[i].frequency == freq) 135 - return i; 136 - } 132 + if (table[i].frequency != freq) 133 + continue; 137 134 138 - return -EINVAL; 135 + idx = i; 136 + break; 137 + } 138 + rcu_read_unlock(); 139 + 140 + return idx; 139 141 } 140 142 141 143 static unsigned long get_voltage(struct devfreq *df, unsigned long freq) ··· 193 181 struct devfreq_cooling_device *dfc = cdev->devdata; 194 182 struct devfreq *df = dfc->devfreq; 195 183 struct devfreq_dev_status status; 184 + struct em_perf_state *table; 196 185 unsigned long state; 197 186 unsigned long freq; 198 187 unsigned long voltage; ··· 217 204 state = dfc->capped_state; 218 205 219 206 /* Convert EM power into milli-Watts first */ 220 - dfc->res_util = dfc->em_pd->table[state].power; 207 + rcu_read_lock(); 208 + table = em_perf_state_from_pd(dfc->em_pd); 209 + dfc->res_util = table[state].power; 210 + rcu_read_unlock(); 211 + 221 212 dfc->res_util /= MICROWATT_PER_MILLIWATT; 222 213 223 214 dfc->res_util *= SCALE_ERROR_MITIGATION; ··· 242 225 _normalize_load(&status); 243 226 244 227 /* Convert EM power into 
milli-Watts first */ 245 - *power = dfc->em_pd->table[perf_idx].power; 228 + rcu_read_lock(); 229 + table = em_perf_state_from_pd(dfc->em_pd); 230 + *power = table[perf_idx].power; 231 + rcu_read_unlock(); 232 + 246 233 *power /= MICROWATT_PER_MILLIWATT; 247 234 /* Scale power for utilization */ 248 235 *power *= status.busy_time; ··· 266 245 unsigned long state, u32 *power) 267 246 { 268 247 struct devfreq_cooling_device *dfc = cdev->devdata; 248 + struct em_perf_state *table; 269 249 int perf_idx; 270 250 271 251 if (state > dfc->max_state) 272 252 return -EINVAL; 273 253 274 254 perf_idx = dfc->max_state - state; 275 - *power = dfc->em_pd->table[perf_idx].power; 255 + 256 + rcu_read_lock(); 257 + table = em_perf_state_from_pd(dfc->em_pd); 258 + *power = table[perf_idx].power; 259 + rcu_read_unlock(); 260 + 276 261 *power /= MICROWATT_PER_MILLIWATT; 277 262 278 263 return 0; ··· 291 264 struct devfreq *df = dfc->devfreq; 292 265 struct devfreq_dev_status status; 293 266 unsigned long freq, em_power_mw; 267 + struct em_perf_state *table; 294 268 s32 est_power; 295 269 int i; 296 270 ··· 316 288 * Find the first cooling state that is within the power 317 289 * budget. The EM power table is sorted ascending. 318 290 */ 291 + rcu_read_lock(); 292 + table = em_perf_state_from_pd(dfc->em_pd); 319 293 for (i = dfc->max_state; i > 0; i--) { 320 294 /* Convert EM power to milli-Watts to make safe comparison */ 321 - em_power_mw = dfc->em_pd->table[i].power; 295 + em_power_mw = table[i].power; 322 296 em_power_mw /= MICROWATT_PER_MILLIWATT; 323 297 if (est_power >= em_power_mw) 324 298 break; 325 299 } 300 + rcu_read_unlock(); 326 301 327 302 *state = dfc->max_state - i; 328 303 dfc->capped_state = *state;
+4 -4
drivers/thermal/intel/int340x_thermal/processor_thermal_rapl.c
··· 27 27 if (topology_physical_package_id(cpu)) 28 28 return 0; 29 29 30 - rp = rapl_find_package_domain(cpu, &rapl_mmio_priv, true); 30 + rp = rapl_find_package_domain_cpuslocked(cpu, &rapl_mmio_priv, true); 31 31 if (!rp) { 32 - rp = rapl_add_package(cpu, &rapl_mmio_priv, true); 32 + rp = rapl_add_package_cpuslocked(cpu, &rapl_mmio_priv, true); 33 33 if (IS_ERR(rp)) 34 34 return PTR_ERR(rp); 35 35 } ··· 42 42 struct rapl_package *rp; 43 43 int lead_cpu; 44 44 45 - rp = rapl_find_package_domain(cpu, &rapl_mmio_priv, true); 45 + rp = rapl_find_package_domain_cpuslocked(cpu, &rapl_mmio_priv, true); 46 46 if (!rp) 47 47 return 0; 48 48 49 49 cpumask_clear_cpu(cpu, &rp->cpumask); 50 50 lead_cpu = cpumask_first(&rp->cpumask); 51 51 if (lead_cpu >= nr_cpu_ids) 52 - rapl_remove_package(rp); 52 + rapl_remove_package_cpuslocked(rp); 53 53 else if (rp->lead_cpu == cpu) 54 54 rp->lead_cpu = lead_cpu; 55 55 return 0;
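The `_cpuslocked` variants exist because these CPU hotplug callbacks already run with `cpus_read_lock()` held. The offline path itself (drop the dying CPU, then either retire the package or migrate the lead CPU) can be modelled in plain C, with a bitmask standing in for the kernel cpumask; the struct below is a hypothetical reduction of `struct rapl_package`:

```c
#include <assert.h>

/* Hypothetical reduction of struct rapl_package for illustration. */
struct pkg {
	unsigned long cpumask;	/* one bit per CPU in the package */
	int lead_cpu;
	int removed;
};

/* Mirrors the hunk above: clear the CPU, remove the package when it was
 * the last one, otherwise hand leadership to the first remaining CPU. */
static void pkg_cpu_offline(struct pkg *rp, int cpu)
{
	rp->cpumask &= ~(1UL << cpu);

	if (!rp->cpumask)
		rp->removed = 1;	/* rapl_remove_package_cpuslocked() */
	else if (rp->lead_cpu == cpu)
		rp->lead_cpu = __builtin_ctzl(rp->cpumask);
}

/* Offline CPUs 0, 2, 1, 3 in turn and report the first unexpected state. */
static int pkg_offline_demo(void)
{
	struct pkg p = { 0xF, 0, 0 };

	pkg_cpu_offline(&p, 0);
	if (p.lead_cpu != 1)
		return 1;
	pkg_cpu_offline(&p, 2);
	if (p.lead_cpu != 1)
		return 2;
	pkg_cpu_offline(&p, 1);
	if (p.lead_cpu != 3)
		return 3;
	pkg_cpu_offline(&p, 3);
	return p.removed ? 0 : 4;
}
```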
+5
include/acpi/cppc_acpi.h
··· 139 139 #ifdef CONFIG_ACPI_CPPC_LIB 140 140 extern int cppc_get_desired_perf(int cpunum, u64 *desired_perf); 141 141 extern int cppc_get_nominal_perf(int cpunum, u64 *nominal_perf); 142 + extern int cppc_get_highest_perf(int cpunum, u64 *highest_perf); 142 143 extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs); 143 144 extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls); 144 145 extern int cppc_set_enable(int cpu, bool enable); ··· 165 164 return -ENOTSUPP; 166 165 } 167 166 static inline int cppc_get_nominal_perf(int cpunum, u64 *nominal_perf) 167 + { 168 + return -ENOTSUPP; 169 + } 170 + static inline int cppc_get_highest_perf(int cpunum, u64 *highest_perf) 168 171 { 169 172 return -ENOTSUPP; 170 173 }
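The new accessor follows the header's existing pattern: a real declaration when `CONFIG_ACPI_CPPC_LIB` is set, and a `static inline` stub returning `-ENOTSUPP` otherwise. A compilable toy version of that shape (the config macro is local to this sketch, and 524 is the kernel-internal `ENOTSUPP` value, defined here only for illustration):

```c
#include <assert.h>

#define ENOTSUPP 524	/* kernel-internal errno, defined locally here */

/* Flip this on to take the "library built in" branch of the #ifdef. */
/* #define CONFIG_ACPI_CPPC_LIB */

#ifdef CONFIG_ACPI_CPPC_LIB
extern int cppc_get_highest_perf(int cpunum, unsigned long long *highest_perf);
#else
static inline int cppc_get_highest_perf(int cpunum,
					unsigned long long *highest_perf)
{
	(void)cpunum;
	(void)highest_perf;
	return -ENOTSUPP;
}
#endif

/* With the config macro commented out, the stub branch is compiled in. */
static int stub_demo(void)
{
	unsigned long long hp = 0;

	return cppc_get_highest_perf(0, &hp);
}
```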
+10
include/linux/amd-pstate.h
··· 39 39 * @cppc_req_cached: cached performance request hints 40 40 * @highest_perf: the maximum performance an individual processor may reach, 41 41 * assuming ideal conditions 42 + * For platforms that do not support the preferred core feature, 43 + * highest_perf may be configured to 166 or 255 to avoid the max frequency 44 + * being calculated wrongly. We take that fixed value as the highest_perf. 42 45 * @nominal_perf: the maximum sustained performance level of the processor, 43 46 * assuming ideal operating conditions 44 47 * @lowest_nonlinear_perf: the lowest performance level at which nonlinear power 45 48 * savings are achieved 46 49 * @lowest_perf: the absolute lowest performance level of the processor 50 + * @prefcore_ranking: the preferred core ranking; a higher value indicates a higher 51 + * priority. 47 52 * @max_freq: the frequency that mapped to highest_perf 48 53 * @min_freq: the frequency that mapped to lowest_perf 49 54 * @nominal_freq: the frequency that mapped to nominal_perf ··· 57 52 * @prev: Last Aperf/Mperf/tsc count value read from register 58 53 * @freq: current cpu frequency value 59 54 * @boost_supported: check whether the Processor or SBIOS supports boost mode 55 + * @hw_prefcore: check whether HW supports the preferred core feature. 56 + * Only when hw_prefcore and the early prefcore param are true does 57 + * the AMD P-State driver support the preferred core feature. 60 58 * @epp_policy: Last saved policy used to set energy-performance preference 61 59 * @epp_cached: Cached CPPC energy-performance preference value 62 60 * @policy: Cpufreq policy value ··· 78 70 u32 nominal_perf; 79 71 u32 lowest_nonlinear_perf; 80 72 u32 lowest_perf; 73 + u32 prefcore_ranking; 81 74 u32 min_limit_perf; 82 75 u32 max_limit_perf; 83 76 u32 min_limit_freq; ··· 94 85 95 86 u64 freq; 96 87 bool boost_supported; 88 + bool hw_prefcore; 97 89 98 90 /* EPP feature related attributes*/ 99 91 s16 epp_policy;
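How a consumer might use `@prefcore_ranking`: a higher value means a higher priority, so a preferred-core-aware picker simply selects the CPU with the largest ranking. This is an illustrative userspace sketch, not the driver's actual selection code (amd-pstate hands the rankings to the scheduler rather than picking CPUs itself):

```c
#include <assert.h>

/* Illustrative per-CPU view: just the ranking field from the patch. */
struct cpu_rank {
	int cpu;
	unsigned int prefcore_ranking;
};

/* Pick the CPU with the highest preferred-core ranking. */
static int pick_preferred_cpu(const struct cpu_rank *cpus, int n)
{
	int i, best = 0;

	for (i = 1; i < n; i++)
		if (cpus[i].prefcore_ranking > cpus[best].prefcore_ranking)
			best = i;

	return cpus[best].cpu;
}

/* Sample rankings; 255 marks the most preferred core in this example. */
static const struct cpu_rank demo_cpus[] = {
	{ 0, 166 }, { 1, 255 }, { 2, 166 }, { 3, 128 },
};
```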
+16 -24
include/linux/cpufreq.h
··· 263 263 return false; 264 264 } 265 265 static inline void disable_cpufreq(void) { } 266 + static inline void cpufreq_update_limits(unsigned int cpu) { } 266 267 #endif 267 268 268 269 #ifdef CONFIG_CPU_FREQ_STAT ··· 569 568 570 569 /* 571 570 * The polling frequency depends on the capability of the processor. Default 572 - * polling frequency is 1000 times the transition latency of the processor. The 573 - * ondemand governor will work on any processor with transition latency <= 10ms, 574 - * using appropriate sampling rate. 571 + * polling frequency is 1000 times the transition latency of the processor. 575 572 */ 576 573 #define LATENCY_MULTIPLIER (1000) 577 574 ··· 692 693 unsigned int frequency; /* kHz - doesn't need to be in ascending 693 694 * order */ 694 695 }; 695 - 696 - #if defined(CONFIG_CPU_FREQ) && defined(CONFIG_PM_OPP) 697 - int dev_pm_opp_init_cpufreq_table(struct device *dev, 698 - struct cpufreq_frequency_table **table); 699 - void dev_pm_opp_free_cpufreq_table(struct device *dev, 700 - struct cpufreq_frequency_table **table); 701 - #else 702 - static inline int dev_pm_opp_init_cpufreq_table(struct device *dev, 703 - struct cpufreq_frequency_table 704 - **table) 705 - { 706 - return -EINVAL; 707 - } 708 - 709 - static inline void dev_pm_opp_free_cpufreq_table(struct device *dev, 710 - struct cpufreq_frequency_table 711 - **table) 712 - { 713 - } 714 - #endif 715 696 716 697 /* 717 698 * cpufreq_for_each_entry - iterate over a cpufreq_frequency_table ··· 1000 1021 efficiencies); 1001 1022 } 1002 1023 1024 + static inline bool cpufreq_is_in_limits(struct cpufreq_policy *policy, int idx) 1025 + { 1026 + unsigned int freq; 1027 + 1028 + if (idx < 0) 1029 + return false; 1030 + 1031 + freq = policy->freq_table[idx].frequency; 1032 + 1033 + return freq == clamp_val(freq, policy->min, policy->max); 1034 + } 1035 + 1003 1036 static inline int cpufreq_frequency_table_target(struct cpufreq_policy *policy, 1004 1037 unsigned int target_freq, 1005 1038 
unsigned int relation) ··· 1045 1054 return 0; 1046 1055 } 1047 1056 1048 - if (idx < 0 && efficiencies) { 1057 + /* Limit frequency index to honor policy->min/max */ 1058 + if (!cpufreq_is_in_limits(policy, idx) && efficiencies) { 1049 1059 efficiencies = false; 1050 1060 goto retry; 1051 1061 }
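The retry logic above first looks for an efficient frequency; if the index found falls outside `policy->min`/`policy->max`, it retries with inefficiencies allowed. The clamp test itself is small enough to model standalone (the struct names below are reduced stand-ins for the cpufreq types):

```c
#include <assert.h>

/* Reduced stand-ins for struct cpufreq_frequency_table / cpufreq_policy. */
struct freq_entry {
	unsigned int frequency;	/* kHz */
};

struct policy {
	unsigned int min, max;	/* kHz limits */
	const struct freq_entry *freq_table;
};

#define clamp_val(v, lo, hi) \
	((v) < (lo) ? (lo) : ((v) > (hi) ? (hi) : (v)))

/* Mirror of cpufreq_is_in_limits(): a negative index never qualifies, and
 * a frequency qualifies only when clamping it to [min, max] is a no-op. */
static int is_in_limits(const struct policy *p, int idx)
{
	unsigned int freq;

	if (idx < 0)
		return 0;

	freq = p->freq_table[idx].frequency;

	return freq == clamp_val(freq, p->min, p->max);
}

static const struct freq_entry demo_freqs[] = {
	{ 400000 }, { 800000 }, { 1600000 },
};

/* Policy limits chosen so only the middle entry is inside them. */
static const struct policy demo_policy = { 500000, 1000000, demo_freqs };
```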
+105 -61
include/linux/energy_model.h
··· 5 5 #include <linux/device.h> 6 6 #include <linux/jump_label.h> 7 7 #include <linux/kobject.h> 8 + #include <linux/kref.h> 8 9 #include <linux/rcupdate.h> 9 10 #include <linux/sched/cpufreq.h> 10 11 #include <linux/sched/topology.h> ··· 13 12 14 13 /** 15 14 * struct em_perf_state - Performance state of a performance domain 15 + * @performance: CPU performance (capacity) at a given frequency 16 16 * @frequency: The frequency in KHz, for consistency with CPUFreq 17 17 * @power: The power consumed at this level (by 1 CPU or by a registered 18 18 * device). It can be a total power: static and dynamic. ··· 22 20 * @flags: see "em_perf_state flags" description below. 23 21 */ 24 22 struct em_perf_state { 23 + unsigned long performance; 25 24 unsigned long frequency; 26 25 unsigned long power; 27 26 unsigned long cost; ··· 40 37 #define EM_PERF_STATE_INEFFICIENT BIT(0) 41 38 42 39 /** 40 + * struct em_perf_table - Performance states table 41 + * @rcu: RCU used for safe access and destruction 42 + * @kref: Reference counter to track the users 43 + * @state: List of performance states, in ascending order 44 + */ 45 + struct em_perf_table { 46 + struct rcu_head rcu; 47 + struct kref kref; 48 + struct em_perf_state state[]; 49 + }; 50 + 51 + /** 43 52 * struct em_perf_domain - Performance domain 44 - * @table: List of performance states, in ascending order 53 + * @em_table: Pointer to the runtime modifiable em_perf_table 45 54 * @nr_perf_states: Number of performance states 46 55 * @flags: See "em_perf_domain flags" 47 56 * @cpus: Cpumask covering the CPUs of the domain. It's here ··· 68 53 * field is unused. 
69 54 */ 70 55 struct em_perf_domain { 71 - struct em_perf_state *table; 56 + struct em_perf_table __rcu *em_table; 72 57 int nr_perf_states; 73 58 unsigned long flags; 74 59 unsigned long cpus[]; ··· 111 96 #define EM_MAX_NUM_CPUS 4096 112 97 #else 113 98 #define EM_MAX_NUM_CPUS 16 114 - #endif 115 - 116 - /* 117 - * To avoid an overflow on 32bit machines while calculating the energy 118 - * use a different order in the operation. First divide by the 'cpu_scale' 119 - * which would reduce big value stored in the 'cost' field, then multiply by 120 - * the 'sum_util'. This would allow to handle existing platforms, which have 121 - * e.g. power ~1.3 Watt at max freq, so the 'cost' value > 1mln micro-Watts. 122 - * In such scenario, where there are 4 CPUs in the Perf. Domain the 'sum_util' 123 - * could be 4096, then multiplication: 'cost' * 'sum_util' would overflow. 124 - * This reordering of operations has some limitations, we lose small 125 - * precision in the estimation (comparing to 64bit platform w/o reordering). 126 - * 127 - * We are safe on 64bit machine. 
128 - */ 129 - #ifdef CONFIG_64BIT 130 - #define em_estimate_energy(cost, sum_util, scale_cpu) \ 131 - (((cost) * (sum_util)) / (scale_cpu)) 132 - #else 133 - #define em_estimate_energy(cost, sum_util, scale_cpu) \ 134 - (((cost) / (scale_cpu)) * (sum_util)) 135 99 #endif 136 100 137 101 struct em_data_callback { ··· 162 168 163 169 struct em_perf_domain *em_cpu_get(int cpu); 164 170 struct em_perf_domain *em_pd_get(struct device *dev); 171 + int em_dev_update_perf_domain(struct device *dev, 172 + struct em_perf_table __rcu *new_table); 165 173 int em_dev_register_perf_domain(struct device *dev, unsigned int nr_states, 166 174 struct em_data_callback *cb, cpumask_t *span, 167 175 bool microwatts); 168 176 void em_dev_unregister_perf_domain(struct device *dev); 177 + struct em_perf_table __rcu *em_table_alloc(struct em_perf_domain *pd); 178 + void em_table_free(struct em_perf_table __rcu *table); 179 + int em_dev_compute_costs(struct device *dev, struct em_perf_state *table, 180 + int nr_states); 169 181 170 182 /** 171 183 * em_pd_get_efficient_state() - Get an efficient performance state from the EM 172 - * @pd : Performance domain for which we want an efficient frequency 173 - * @freq : Frequency to map with the EM 184 + * @table: List of performance states, in ascending order 185 + * @nr_perf_states: Number of performance states 186 + * @max_util: Max utilization to map with the EM 187 + * @pd_flags: Performance Domain flags 174 188 * 175 189 * It is called from the scheduler code quite frequently and as a consequence 176 190 * doesn't implement any check. 177 191 * 178 - * Return: An efficient performance state, high enough to meet @freq 192 + * Return: An efficient performance state id, high enough to meet @max_util 179 193 * requirement. 
180 194 */ 181 - static inline 182 - struct em_perf_state *em_pd_get_efficient_state(struct em_perf_domain *pd, 183 - unsigned long freq) 195 + static inline int 196 + em_pd_get_efficient_state(struct em_perf_state *table, int nr_perf_states, 197 + unsigned long max_util, unsigned long pd_flags) 184 198 { 185 199 struct em_perf_state *ps; 186 200 int i; 187 201 188 - for (i = 0; i < pd->nr_perf_states; i++) { 189 - ps = &pd->table[i]; 190 - if (ps->frequency >= freq) { 191 - if (pd->flags & EM_PERF_DOMAIN_SKIP_INEFFICIENCIES && 202 + for (i = 0; i < nr_perf_states; i++) { 203 + ps = &table[i]; 204 + if (ps->performance >= max_util) { 205 + if (pd_flags & EM_PERF_DOMAIN_SKIP_INEFFICIENCIES && 192 206 ps->flags & EM_PERF_STATE_INEFFICIENT) 193 207 continue; 194 - break; 208 + return i; 195 209 } 196 210 } 197 211 198 - return ps; 212 + return nr_perf_states - 1; 199 213 } 200 214 201 215 /** ··· 226 224 unsigned long max_util, unsigned long sum_util, 227 225 unsigned long allowed_cpu_cap) 228 226 { 229 - unsigned long freq, ref_freq, scale_cpu; 227 + struct em_perf_table *em_table; 230 228 struct em_perf_state *ps; 231 - int cpu; 229 + int i; 230 + 231 + #ifdef CONFIG_SCHED_DEBUG 232 + WARN_ONCE(!rcu_read_lock_held(), "EM: rcu read lock needed\n"); 233 + #endif 232 234 233 235 if (!sum_util) 234 236 return 0; ··· 240 234 /* 241 235 * In order to predict the performance state, map the utilization of 242 236 * the most utilized CPU of the performance domain to a requested 243 - * frequency, like schedutil. Take also into account that the real 244 - * frequency might be set lower (due to thermal capping). Thus, clamp 237 + * performance, like schedutil. Take also into account that the real 238 + * performance might be set lower (due to thermal capping). Thus, clamp 245 239 * max utilization to the allowed CPU capacity before calculating 246 - * effective frequency. 240 + * effective performance. 
247 241 */ 248 - cpu = cpumask_first(to_cpumask(pd->cpus)); 249 - scale_cpu = arch_scale_cpu_capacity(cpu); 250 - ref_freq = arch_scale_freq_ref(cpu); 251 - 242 + max_util = map_util_perf(max_util); 252 243 max_util = min(max_util, allowed_cpu_cap); 253 - freq = map_util_freq(max_util, ref_freq, scale_cpu); 254 244 255 245 /* 256 246 * Find the lowest performance state of the Energy Model above the 257 - * requested frequency. 247 + * requested performance. 258 248 */ 259 - ps = em_pd_get_efficient_state(pd, freq); 249 + em_table = rcu_dereference(pd->em_table); 250 + i = em_pd_get_efficient_state(em_table->state, pd->nr_perf_states, 251 + max_util, pd->flags); 252 + ps = &em_table->state[i]; 260 253 261 254 /* 262 - * The capacity of a CPU in the domain at the performance state (ps) 263 - * can be computed as: 255 + * The performance (capacity) of a CPU in the domain at the performance 256 + * state (ps) can be computed as: 264 257 * 265 - * ps->freq * scale_cpu 266 - * ps->cap = -------------------- (1) 267 - * cpu_max_freq 258 + * ps->freq * scale_cpu 259 + * ps->performance = -------------------- (1) 260 + * cpu_max_freq 268 261 * 269 262 * So, ignoring the costs of idle states (which are not available in 270 263 * the EM), the energy consumed by this CPU at that performance state ··· 271 266 * 272 267 * ps->power * cpu_util 273 268 * cpu_nrg = -------------------- (2) 274 - * ps->cap 269 + * ps->performance 275 270 * 276 - * since 'cpu_util / ps->cap' represents its percentage of busy time. 271 + * since 'cpu_util / ps->performance' represents its percentage of busy 272 + * time. 
277 273 * 278 274 * NOTE: Although the result of this computation actually is in 279 275 * units of power, it can be manipulated as an energy value ··· 284 278 * By injecting (1) in (2), 'cpu_nrg' can be re-expressed as a product 285 279 * of two terms: 286 280 * 287 - * ps->power * cpu_max_freq cpu_util 288 - * cpu_nrg = ------------------------ * --------- (3) 289 - * ps->freq scale_cpu 281 + * ps->power * cpu_max_freq 282 + * cpu_nrg = ------------------------ * cpu_util (3) 283 + * ps->freq * scale_cpu 290 284 * 291 285 * The first term is static, and is stored in the em_perf_state struct 292 286 * as 'ps->cost'. ··· 296 290 * total energy of the domain (which is the simple sum of the energy of 297 291 * all of its CPUs) can be factorized as: 298 292 * 299 - * ps->cost * \Sum cpu_util 300 - * pd_nrg = ------------------------ (4) 301 - * scale_cpu 293 + * pd_nrg = ps->cost * \Sum cpu_util (4) 302 294 */ 303 - return em_estimate_energy(ps->cost, sum_util, scale_cpu); 295 + return ps->cost * sum_util; 304 296 } 305 297 306 298 /** ··· 311 307 static inline int em_pd_nr_perf_states(struct em_perf_domain *pd) 312 308 { 313 309 return pd->nr_perf_states; 310 + } 311 + 312 + /** 313 + * em_perf_state_from_pd() - Get the performance states table of perf. 314 + * domain 315 + * @pd : performance domain for which this must be done 316 + * 317 + * To use this function the rcu_read_lock() should be hold. After the usage 318 + * of the performance states table is finished, the rcu_read_unlock() should 319 + * be called. 
320 + * 321 + * Return: the pointer to performance states table of the performance domain 322 + */ 323 + static inline 324 + struct em_perf_state *em_perf_state_from_pd(struct em_perf_domain *pd) 325 + { 326 + return rcu_dereference(pd->em_table)->state; 314 327 } 315 328 316 329 #else ··· 363 342 static inline int em_pd_nr_perf_states(struct em_perf_domain *pd) 364 343 { 365 344 return 0; 345 + } 346 + static inline 347 + struct em_perf_table __rcu *em_table_alloc(struct em_perf_domain *pd) 348 + { 349 + return NULL; 350 + } 351 + static inline void em_table_free(struct em_perf_table __rcu *table) {} 352 + static inline 353 + int em_dev_update_perf_domain(struct device *dev, 354 + struct em_perf_table __rcu *new_table) 355 + { 356 + return -EINVAL; 357 + } 358 + static inline 359 + struct em_perf_state *em_perf_state_from_pd(struct em_perf_domain *pd) 360 + { 361 + return NULL; 362 + } 363 + static inline 364 + int em_dev_compute_costs(struct device *dev, struct em_perf_state *table, 365 + int nr_states) 366 + { 367 + return -EINVAL; 366 368 } 367 369 #endif 368 370
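`em_pd_get_efficient_state()` now operates on a plain state array and returns an index rather than a pointer: the first state whose `performance` covers `max_util` wins, inefficient states are skipped when the domain asks for it, and the highest state is the fallback. A self-contained version of that walk (the flag bit values are local to this sketch, not the kernel's):

```c
#include <assert.h>

/* Flag bits local to this sketch; the kernel defines its own values. */
#define PD_SKIP_INEFFICIENCIES	(1UL << 0)
#define PS_INEFFICIENT		(1UL << 0)

struct state {
	unsigned long performance;
	unsigned long flags;
};

/* Same walk as the patched em_pd_get_efficient_state(). */
static int efficient_state(const struct state *table, int nr_perf_states,
			   unsigned long max_util, unsigned long pd_flags)
{
	int i;

	for (i = 0; i < nr_perf_states; i++) {
		if (table[i].performance >= max_util) {
			if ((pd_flags & PD_SKIP_INEFFICIENCIES) &&
			    (table[i].flags & PS_INEFFICIENT))
				continue;
			return i;
		}
	}

	return nr_perf_states - 1;	/* nothing high enough: top state */
}

/* Ascending performance; state 1 is marked inefficient. */
static const struct state demo_states[] = {
	{ 256, 0 }, { 512, PS_INEFFICIENT }, { 768, 0 }, { 1024, 0 },
};
```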
+6
include/linux/intel_rapl.h
··· 178 178 struct rapl_if_priv *priv; 179 179 }; 180 180 181 + struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_priv *priv, 182 + bool id_is_cpu); 183 + struct rapl_package *rapl_add_package_cpuslocked(int id, struct rapl_if_priv *priv, 184 + bool id_is_cpu); 185 + void rapl_remove_package_cpuslocked(struct rapl_package *rp); 186 + 181 187 struct rapl_package *rapl_find_package_domain(int id, struct rapl_if_priv *priv, bool id_is_cpu); 182 188 struct rapl_package *rapl_add_package(int id, struct rapl_if_priv *priv, bool id_is_cpu); 183 189 void rapl_remove_package(struct rapl_package *rp);
+15 -15
include/linux/pm.h
··· 662 662 663 663 struct dev_pm_info { 664 664 pm_message_t power_state; 665 - unsigned int can_wakeup:1; 666 - unsigned int async_suspend:1; 665 + bool can_wakeup:1; 666 + bool async_suspend:1; 667 667 bool in_dpm_list:1; /* Owned by the PM core */ 668 668 bool is_prepared:1; /* Owned by the PM core */ 669 669 bool is_suspended:1; /* Ditto */ ··· 682 682 bool syscore:1; 683 683 bool no_pm_callbacks:1; /* Owned by the PM core */ 684 684 bool async_in_progress:1; /* Owned by the PM core */ 685 - unsigned int must_resume:1; /* Owned by the PM core */ 686 - unsigned int may_skip_resume:1; /* Set by subsystems */ 685 + bool must_resume:1; /* Owned by the PM core */ 686 + bool may_skip_resume:1; /* Set by subsystems */ 687 687 #else 688 - unsigned int should_wakeup:1; 688 + bool should_wakeup:1; 689 689 #endif 690 690 #ifdef CONFIG_PM 691 691 struct hrtimer suspend_timer; ··· 696 696 atomic_t usage_count; 697 697 atomic_t child_count; 698 698 unsigned int disable_depth:3; 699 - unsigned int idle_notification:1; 700 - unsigned int request_pending:1; 701 - unsigned int deferred_resume:1; 702 - unsigned int needs_force_resume:1; 703 - unsigned int runtime_auto:1; 699 + bool idle_notification:1; 700 + bool request_pending:1; 701 + bool deferred_resume:1; 702 + bool needs_force_resume:1; 703 + bool runtime_auto:1; 704 704 bool ignore_children:1; 705 - unsigned int no_callbacks:1; 706 - unsigned int irq_safe:1; 707 - unsigned int use_autosuspend:1; 708 - unsigned int timer_autosuspends:1; 709 - unsigned int memalloc_noio:1; 705 + bool no_callbacks:1; 706 + bool irq_safe:1; 707 + bool use_autosuspend:1; 708 + bool timer_autosuspends:1; 709 + bool memalloc_noio:1; 710 710 unsigned int links_count; 711 711 enum rpm_request request; 712 712 enum rpm_status runtime_status;
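The `unsigned int:1` to `bool:1` conversion is not purely cosmetic: assigning any non-zero value to a `bool:1` bit-field yields true, while `unsigned int:1` keeps only the low bit, so storing 2 would silently read back as 0. A small demonstration:

```c
#include <assert.h>
#include <stdbool.h>

struct flags_uint {
	unsigned int must_resume:1;	/* old style */
};

struct flags_bool {
	bool must_resume:1;		/* new style, as in dev_pm_info */
};

static unsigned int store_uint(unsigned int v)
{
	struct flags_uint f = { 0 };

	f.must_resume = v;	/* value reduced modulo 2: keeps the low bit */
	return f.must_resume;
}

static unsigned int store_bool(unsigned int v)
{
	struct flags_bool f = { 0 };

	f.must_resume = v;	/* converted through _Bool: any non-zero -> 1 */
	return f.must_resume;
}
```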
+18
include/linux/pm_opp.h
··· 16 16 #include <linux/notifier.h> 17 17 18 18 struct clk; 19 + struct cpufreq_frequency_table; 19 20 struct regulator; 20 21 struct dev_pm_opp; 21 22 struct device; ··· 88 87 89 88 /** 90 89 * struct dev_pm_opp_data - The data to use to initialize an OPP. 90 + * @turbo: Flag to indicate whether the OPP is to be marked turbo or not. 91 91 * @level: The performance level for the OPP. Set level to OPP_LEVEL_UNSET if 92 92 * level field isn't used. 93 93 * @freq: The clock rate in Hz for the OPP. 94 94 * @u_volt: The voltage in uV for the OPP. 95 95 */ 96 96 struct dev_pm_opp_data { 97 + bool turbo; 97 98 unsigned int level; 98 99 unsigned long freq; 99 100 unsigned long u_volt; ··· 446 443 } 447 444 448 445 #endif /* CONFIG_PM_OPP */ 446 + 447 + #if defined(CONFIG_CPU_FREQ) && defined(CONFIG_PM_OPP) 448 + int dev_pm_opp_init_cpufreq_table(struct device *dev, struct cpufreq_frequency_table **table); 449 + void dev_pm_opp_free_cpufreq_table(struct device *dev, struct cpufreq_frequency_table **table); 450 + #else 451 + static inline int dev_pm_opp_init_cpufreq_table(struct device *dev, struct cpufreq_frequency_table **table) 452 + { 453 + return -EINVAL; 454 + } 455 + 456 + static inline void dev_pm_opp_free_cpufreq_table(struct device *dev, struct cpufreq_frequency_table **table) 457 + { 458 + } 459 + #endif 460 + 449 461 450 462 #if defined(CONFIG_PM_OPP) && defined(CONFIG_OF) 451 463 int dev_pm_opp_of_add_table(struct device *dev);
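Since `struct dev_pm_opp_data` is typically filled with designated initializers, the new `turbo` member defaults to false for existing callers. An illustrative initialization under that assumption (the field values below are made up, and the struct is copied locally so the sketch is self-contained):

```c
#include <assert.h>
#include <stdbool.h>

/* Local copy of the struct layout shown in the hunk above. */
struct dev_pm_opp_data {
	bool turbo;
	unsigned int level;
	unsigned long freq;
	unsigned long u_volt;
};

/* A hypothetical boost OPP: only the relevant fields are named; the rest
 * (including 'level') stay zero-initialized, so 'turbo' likewise defaults
 * to false in initializers that never mention it. */
static const struct dev_pm_opp_data boost_opp = {
	.turbo = true,
	.freq = 2800000000UL,	/* 2.8 GHz */
	.u_volt = 1150000,	/* 1.15 V */
};
```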
+15 -15
include/linux/pm_runtime.h
··· 72 72 extern int __pm_runtime_idle(struct device *dev, int rpmflags); 73 73 extern int __pm_runtime_suspend(struct device *dev, int rpmflags); 74 74 extern int __pm_runtime_resume(struct device *dev, int rpmflags); 75 - extern int pm_runtime_get_if_active(struct device *dev, bool ign_usage_count); 75 + extern int pm_runtime_get_if_active(struct device *dev); 76 + extern int pm_runtime_get_if_in_use(struct device *dev); 76 77 extern int pm_schedule_suspend(struct device *dev, unsigned int delay); 77 78 extern int __pm_runtime_set_status(struct device *dev, unsigned int status); 78 79 extern int pm_runtime_barrier(struct device *dev); ··· 94 93 extern void pm_runtime_release_supplier(struct device_link *link); 95 94 96 95 extern int devm_pm_runtime_enable(struct device *dev); 97 - 98 - /** 99 - * pm_runtime_get_if_in_use - Conditionally bump up runtime PM usage counter. 100 - * @dev: Target device. 101 - * 102 - * Increment the runtime PM usage counter of @dev if its runtime PM status is 103 - * %RPM_ACTIVE and its runtime PM usage counter is greater than 0. 104 - */ 105 - static inline int pm_runtime_get_if_in_use(struct device *dev) 106 - { 107 - return pm_runtime_get_if_active(dev, false); 108 - } 109 96 110 97 /** 111 98 * pm_suspend_ignore_children - Set runtime PM behavior regarding children. ··· 264 275 { 265 276 return -EINVAL; 266 277 } 267 - static inline int pm_runtime_get_if_active(struct device *dev, 268 - bool ign_usage_count) 278 + static inline int pm_runtime_get_if_active(struct device *dev) 269 279 { 270 280 return -EINVAL; 271 281 } ··· 446 458 static inline int pm_runtime_put(struct device *dev) 447 459 { 448 460 return __pm_runtime_idle(dev, RPM_GET_PUT | RPM_ASYNC); 461 + } 462 + 463 + /** 464 + * __pm_runtime_put_autosuspend - Drop device usage counter and queue autosuspend if 0. 465 + * @dev: Target device. 
466 + * 467 + * Decrement the runtime PM usage counter of @dev and if it turns out to be 468 + * equal to 0, queue up a work item for @dev like in pm_request_autosuspend(). 469 + */ 470 + static inline int __pm_runtime_put_autosuspend(struct device *dev) 471 + { 472 + return __pm_runtime_suspend(dev, RPM_GET_PUT | RPM_ASYNC | RPM_AUTO); 449 473 } 450 474 451 475 /**
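The behavioural difference between the two conditional helpers is easiest to see in a toy model: both take a reference only when the device is `RPM_ACTIVE`, and `pm_runtime_get_if_in_use()` additionally requires the usage count to already be non-zero. The struct and return conventions below are simplified stand-ins, not the real API:

```c
#include <assert.h>

enum rpm_status { RPM_ACTIVE, RPM_SUSPENDED };	/* reduced set */

struct dev_model {
	enum rpm_status status;
	int usage_count;
};

/* Model of pm_runtime_get_if_active(): the usage count is ignored. */
static int get_if_active(struct dev_model *d)
{
	if (d->status != RPM_ACTIVE)
		return 0;	/* no reference taken */
	d->usage_count++;
	return 1;
}

/* Model of pm_runtime_get_if_in_use(): count must already be non-zero. */
static int get_if_in_use(struct dev_model *d)
{
	if (d->status != RPM_ACTIVE || d->usage_count == 0)
		return 0;
	d->usage_count++;
	return 1;
}

/* Active device, zero users: only the _if_active variant succeeds first. */
static int rpm_get_demo(void)
{
	struct dev_model d = { RPM_ACTIVE, 0 };

	if (get_if_in_use(&d))
		return 1;
	if (!get_if_active(&d))
		return 2;
	if (!get_if_in_use(&d))
		return 3;
	return d.usage_count == 2 ? 0 : 4;
}
```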
+8
include/linux/scmi_protocol.h
··· 140 140 * @level_set: sets the performance level of a domain 141 141 * @level_get: gets the performance level of a domain 142 142 * @transition_latency_get: gets the DVFS transition latency for a given device 143 + * @rate_limit_get: gets the minimum time (us) required between successive 144 + * requests 143 145 * @device_opps_add: adds all the OPPs for a given device 144 146 * @freq_set: sets the frequency for a given device using sustained frequency 145 147 * to sustained performance level mapping ··· 151 149 * at a given frequency 152 150 * @fast_switch_possible: indicates if fast DVFS switching is possible or not 153 151 * for a given device 152 + * @fast_switch_rate_limit: gets the minimum time (us) required between 153 + * successive fast_switching requests 154 154 * @power_scale_mw_get: indicates if the power values provided are in milliWatts 155 155 * or in some other (abstract) scale 156 156 */ ··· 170 166 u32 *level, bool poll); 171 167 int (*transition_latency_get)(const struct scmi_protocol_handle *ph, 172 168 u32 domain); 169 + int (*rate_limit_get)(const struct scmi_protocol_handle *ph, 170 + u32 domain, u32 *rate_limit); 173 171 int (*device_opps_add)(const struct scmi_protocol_handle *ph, 174 172 struct device *dev, u32 domain); 175 173 int (*freq_set)(const struct scmi_protocol_handle *ph, u32 domain, ··· 182 176 unsigned long *rate, unsigned long *power); 183 177 bool (*fast_switch_possible)(const struct scmi_protocol_handle *ph, 184 178 u32 domain); 179 + int (*fast_switch_rate_limit)(const struct scmi_protocol_handle *ph, 180 + u32 domain, u32 *rate_limit); 185 181 enum scmi_power_scale (*power_scale_get)(const struct scmi_protocol_handle *ph); 186 182 }; 187 183
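A consumer of `rate_limit_get()`/`fast_switch_rate_limit()` would typically drop or defer requests that arrive sooner than the reported interval. A hypothetical throttle with explicit timestamps (the SCMI ops only report the interval; the policy built around them here is an assumption of this sketch):

```c
#include <assert.h>
#include <stdint.h>

struct dvfs_limiter {
	uint64_t last_us;	/* timestamp of last accepted request */
	uint32_t rate_limit_us;	/* minimum spacing, e.g. from rate_limit_get() */
};

/* Accept the request only if enough time has passed since the last one. */
static int request_allowed(struct dvfs_limiter *l, uint64_t now_us)
{
	if (now_us - l->last_us < l->rate_limit_us)
		return 0;
	l->last_us = now_us;
	return 1;
}

static int rate_limit_demo(void)
{
	struct dvfs_limiter l = { 0, 500 };

	if (!request_allowed(&l, 500))	/* exactly at the limit: allowed */
		return 1;
	if (request_allowed(&l, 700))	/* only 200us later: rejected */
		return 2;
	if (!request_allowed(&l, 1000))	/* 500us after last accept: allowed */
		return 3;
	return 0;
}
```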
+15 -59
include/linux/suspend.h
··· 40 40 #define PM_SUSPEND_MIN PM_SUSPEND_TO_IDLE 41 41 #define PM_SUSPEND_MAX ((__force suspend_state_t) 4) 42 42 43 - enum suspend_stat_step { 44 - SUSPEND_FREEZE = 1, 45 - SUSPEND_PREPARE, 46 - SUSPEND_SUSPEND, 47 - SUSPEND_SUSPEND_LATE, 48 - SUSPEND_SUSPEND_NOIRQ, 49 - SUSPEND_RESUME_NOIRQ, 50 - SUSPEND_RESUME_EARLY, 51 - SUSPEND_RESUME 52 - }; 53 - 54 - struct suspend_stats { 55 - int success; 56 - int fail; 57 - int failed_freeze; 58 - int failed_prepare; 59 - int failed_suspend; 60 - int failed_suspend_late; 61 - int failed_suspend_noirq; 62 - int failed_resume; 63 - int failed_resume_early; 64 - int failed_resume_noirq; 65 - #define REC_FAILED_NUM 2 66 - int last_failed_dev; 67 - char failed_devs[REC_FAILED_NUM][40]; 68 - int last_failed_errno; 69 - int errno[REC_FAILED_NUM]; 70 - int last_failed_step; 71 - u64 last_hw_sleep; 72 - u64 total_hw_sleep; 73 - u64 max_hw_sleep; 74 - enum suspend_stat_step failed_steps[REC_FAILED_NUM]; 75 - }; 76 - 77 - extern struct suspend_stats suspend_stats; 78 - 79 - static inline void dpm_save_failed_dev(const char *name) 80 - { 81 - strscpy(suspend_stats.failed_devs[suspend_stats.last_failed_dev], 82 - name, 83 - sizeof(suspend_stats.failed_devs[0])); 84 - suspend_stats.last_failed_dev++; 85 - suspend_stats.last_failed_dev %= REC_FAILED_NUM; 86 - } 87 - 88 - static inline void dpm_save_failed_errno(int err) 89 - { 90 - suspend_stats.errno[suspend_stats.last_failed_errno] = err; 91 - suspend_stats.last_failed_errno++; 92 - suspend_stats.last_failed_errno %= REC_FAILED_NUM; 93 - } 94 - 95 - static inline void dpm_save_failed_step(enum suspend_stat_step step) 96 - { 97 - suspend_stats.failed_steps[suspend_stats.last_failed_step] = step; 98 - suspend_stats.last_failed_step++; 99 - suspend_stats.last_failed_step %= REC_FAILED_NUM; 100 - } 101 - 102 43 /** 103 44 * struct platform_suspend_ops - Callbacks for managing platform dependent 104 45 * system sleep states. 
··· 566 625 static inline void queue_up_suspend_work(void) {} 567 626 568 627 #endif /* !CONFIG_PM_AUTOSLEEP */ 628 + 629 + enum suspend_stat_step { 630 + SUSPEND_WORKING = 0, 631 + SUSPEND_FREEZE, 632 + SUSPEND_PREPARE, 633 + SUSPEND_SUSPEND, 634 + SUSPEND_SUSPEND_LATE, 635 + SUSPEND_SUSPEND_NOIRQ, 636 + SUSPEND_RESUME_NOIRQ, 637 + SUSPEND_RESUME_EARLY, 638 + SUSPEND_RESUME 639 + }; 640 + 641 + void dpm_save_failed_dev(const char *name); 642 + void dpm_save_failed_step(enum suspend_stat_step step); 569 643 570 644 #endif /* _LINUX_SUSPEND_H */
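`dpm_save_failed_dev()` and `dpm_save_failed_step()` are now out-of-line, but the recording scheme is the same one the old inline helpers used: keep the last `REC_FAILED_NUM` entries, indexed modulo the depth. A userspace model of the device-name ring (`strncpy` stands in for the kernel's `strscpy`):

```c
#include <assert.h>
#include <string.h>

#define REC_FAILED_NUM	2	/* depth used by the old suspend_stats */

static char failed_devs[REC_FAILED_NUM][40];
static int last_failed_dev;

/* Model of dpm_save_failed_dev(): record the name, then advance the ring
 * index modulo the depth so the oldest entry is overwritten next. */
static void save_failed_dev(const char *name)
{
	strncpy(failed_devs[last_failed_dev], name, sizeof(failed_devs[0]) - 1);
	failed_devs[last_failed_dev][sizeof(failed_devs[0]) - 1] = '\0';
	last_failed_dev = (last_failed_dev + 1) % REC_FAILED_NUM;
}

/* Three failures in a row: the third overwrites the first. */
static int ring_demo(void)
{
	save_failed_dev("i2c-0");
	save_failed_dev("mmc1");
	save_failed_dev("usb2");
	return !strcmp(failed_devs[0], "usb2") && !strcmp(failed_devs[1], "mmc1");
}
```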
+42
include/trace/events/rpm.h
··· 101 101 __entry->ret) 102 102 ); 103 103 104 + #define RPM_STATUS_STRINGS \ 105 + EM(RPM_INVALID, "RPM_INVALID") \ 106 + EM(RPM_ACTIVE, "RPM_ACTIVE") \ 107 + EM(RPM_RESUMING, "RPM_RESUMING") \ 108 + EM(RPM_SUSPENDED, "RPM_SUSPENDED") \ 109 + EMe(RPM_SUSPENDING, "RPM_SUSPENDING") 110 + 111 + /* Enums require being exported to userspace, for user tool parsing. */ 112 + #undef EM 113 + #undef EMe 114 + #define EM(a, b) TRACE_DEFINE_ENUM(a); 115 + #define EMe(a, b) TRACE_DEFINE_ENUM(a); 116 + 117 + RPM_STATUS_STRINGS 118 + 119 + /* 120 + * Now redefine the EM() and EMe() macros to map the enums to the strings that 121 + * will be printed in the output. 122 + */ 123 + #undef EM 124 + #undef EMe 125 + #define EM(a, b) { a, b }, 126 + #define EMe(a, b) { a, b } 127 + 128 + TRACE_EVENT(rpm_status, 129 + TP_PROTO(struct device *dev, enum rpm_status status), 130 + TP_ARGS(dev, status), 131 + 132 + TP_STRUCT__entry( 133 + __string(name, dev_name(dev)) 134 + __field(int, status) 135 + ), 136 + 137 + TP_fast_assign( 138 + __assign_str(name, dev_name(dev)); 139 + __entry->status = status; 140 + ), 141 + 142 + TP_printk("%s status=%s", __get_str(name), 143 + __print_symbolic(__entry->status, RPM_STATUS_STRINGS)) 144 + ); 145 + 104 146 #endif /* _TRACE_RUNTIME_POWER_H */ 105 147 106 148 /* This part must be outside protection */
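The `EM()`/`EMe()` dance above first exports the enum values for userspace trace tooling and then builds the symbol table `__print_symbolic()` consumes. Outside of tracing, the equivalent mapping is a designated-initializer string table; the sketch below uses a local 0-based enum, whereas the kernel defines the real `enum rpm_status` values in its PM headers:

```c
#include <assert.h>
#include <string.h>

/* Local 0-based enum for the sketch; the kernel defines the real values. */
enum rpm_status {
	RPM_INVALID,
	RPM_ACTIVE,
	RPM_RESUMING,
	RPM_SUSPENDED,
	RPM_SUSPENDING,
};

static const char *const rpm_status_names[] = {
	[RPM_INVALID]	 = "RPM_INVALID",
	[RPM_ACTIVE]	 = "RPM_ACTIVE",
	[RPM_RESUMING]	 = "RPM_RESUMING",
	[RPM_SUSPENDED]	 = "RPM_SUSPENDED",
	[RPM_SUSPENDING] = "RPM_SUSPENDING",
};

/* What __print_symbolic() does for the trace output, in plain C. */
static const char *rpm_status_name(enum rpm_status status)
{
	return rpm_status_names[status];
}
```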
+24 -2
kernel/power/Kconfig
··· 39 39 bool "Hibernation (aka 'suspend to disk')" 40 40 depends on SWAP && ARCH_HIBERNATION_POSSIBLE 41 41 select HIBERNATE_CALLBACKS 42 - select LZO_COMPRESS 43 - select LZO_DECOMPRESS 44 42 select CRC32 43 + select CRYPTO 44 + select CRYPTO_LZO 45 45 help 46 46 Enable the suspend to disk (STD) functionality, which is usually 47 47 called "hibernation" in user interfaces. STD checkpoints the ··· 91 91 reduces the attack surface of the kernel. 92 92 93 93 If in doubt, say Y. 94 + 95 + choice 96 + prompt "Default compressor" 97 + default HIBERNATION_COMP_LZO 98 + depends on HIBERNATION 99 + 100 + config HIBERNATION_COMP_LZO 101 + bool "lzo" 102 + depends on CRYPTO_LZO 103 + 104 + config HIBERNATION_COMP_LZ4 105 + bool "lz4" 106 + depends on CRYPTO_LZ4 107 + 108 + endchoice 109 + 110 + config HIBERNATION_DEF_COMP 111 + string 112 + default "lzo" if HIBERNATION_COMP_LZO 113 + default "lz4" if HIBERNATION_COMP_LZ4 114 + help 115 + Default compressor to be used for hibernation. 94 116 95 117 config PM_STD_PARTITION 96 118 string "Default resume partition"
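Put together, selecting LZ4 from the new choice would produce a `.config` along these lines (an illustrative fragment, not taken from a real build):

```
CONFIG_HIBERNATION=y
CONFIG_CRYPTO=y
CONFIG_CRYPTO_LZ4=y
# CONFIG_HIBERNATION_COMP_LZO is not set
CONFIG_HIBERNATION_COMP_LZ4=y
CONFIG_HIBERNATION_DEF_COMP="lz4"
```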
+417 -76
kernel/power/energy_model.c
··· 23 23 */ 24 24 static DEFINE_MUTEX(em_pd_mutex); 25 25 26 + static void em_cpufreq_update_efficiencies(struct device *dev, 27 + struct em_perf_state *table); 28 + static void em_check_capacity_update(void); 29 + static void em_update_workfn(struct work_struct *work); 30 + static DECLARE_DELAYED_WORK(em_update_work, em_update_workfn); 31 + 26 32 static bool _is_cpu_device(struct device *dev) 27 33 { 28 34 return (dev->bus == &cpu_subsys); ··· 37 31 #ifdef CONFIG_DEBUG_FS 38 32 static struct dentry *rootdir; 39 33 40 - static void em_debug_create_ps(struct em_perf_state *ps, struct dentry *pd) 34 + struct em_dbg_info { 35 + struct em_perf_domain *pd; 36 + int ps_id; 37 + }; 38 + 39 + #define DEFINE_EM_DBG_SHOW(name, fname) \ 40 + static int em_debug_##fname##_show(struct seq_file *s, void *unused) \ 41 + { \ 42 + struct em_dbg_info *em_dbg = s->private; \ 43 + struct em_perf_state *table; \ 44 + unsigned long val; \ 45 + \ 46 + rcu_read_lock(); \ 47 + table = em_perf_state_from_pd(em_dbg->pd); \ 48 + val = table[em_dbg->ps_id].name; \ 49 + rcu_read_unlock(); \ 50 + \ 51 + seq_printf(s, "%lu\n", val); \ 52 + return 0; \ 53 + } \ 54 + DEFINE_SHOW_ATTRIBUTE(em_debug_##fname) 55 + 56 + DEFINE_EM_DBG_SHOW(frequency, frequency); 57 + DEFINE_EM_DBG_SHOW(power, power); 58 + DEFINE_EM_DBG_SHOW(cost, cost); 59 + DEFINE_EM_DBG_SHOW(performance, performance); 60 + DEFINE_EM_DBG_SHOW(flags, inefficiency); 61 + 62 + static void em_debug_create_ps(struct em_perf_domain *em_pd, 63 + struct em_dbg_info *em_dbg, int i, 64 + struct dentry *pd) 41 65 { 66 + struct em_perf_state *table; 67 + unsigned long freq; 42 68 struct dentry *d; 43 69 char name[24]; 44 70 45 - snprintf(name, sizeof(name), "ps:%lu", ps->frequency); 71 + em_dbg[i].pd = em_pd; 72 + em_dbg[i].ps_id = i; 73 + 74 + rcu_read_lock(); 75 + table = em_perf_state_from_pd(em_pd); 76 + freq = table[i].frequency; 77 + rcu_read_unlock(); 78 + 79 + snprintf(name, sizeof(name), "ps:%lu", freq); 46 80 47 81 /* Create per-ps 
directory */ 48 82 d = debugfs_create_dir(name, pd); 49 - debugfs_create_ulong("frequency", 0444, d, &ps->frequency); 50 - debugfs_create_ulong("power", 0444, d, &ps->power); 51 - debugfs_create_ulong("cost", 0444, d, &ps->cost); 52 - debugfs_create_ulong("inefficient", 0444, d, &ps->flags); 83 + debugfs_create_file("frequency", 0444, d, &em_dbg[i], 84 + &em_debug_frequency_fops); 85 + debugfs_create_file("power", 0444, d, &em_dbg[i], 86 + &em_debug_power_fops); 87 + debugfs_create_file("cost", 0444, d, &em_dbg[i], 88 + &em_debug_cost_fops); 89 + debugfs_create_file("performance", 0444, d, &em_dbg[i], 90 + &em_debug_performance_fops); 91 + debugfs_create_file("inefficient", 0444, d, &em_dbg[i], 92 + &em_debug_inefficiency_fops); 53 93 } 54 94 55 95 static int em_debug_cpus_show(struct seq_file *s, void *unused) ··· 118 66 119 67 static void em_debug_create_pd(struct device *dev) 120 68 { 69 + struct em_dbg_info *em_dbg; 121 70 struct dentry *d; 122 71 int i; 123 72 ··· 132 79 debugfs_create_file("flags", 0444, d, dev->em_pd, 133 80 &em_debug_flags_fops); 134 81 82 + em_dbg = devm_kcalloc(dev, dev->em_pd->nr_perf_states, 83 + sizeof(*em_dbg), GFP_KERNEL); 84 + if (!em_dbg) 85 + return; 86 + 135 87 /* Create a sub-directory for each performance state */ 136 88 for (i = 0; i < dev->em_pd->nr_perf_states; i++) 137 - em_debug_create_ps(&dev->em_pd->table[i], d); 89 + em_debug_create_ps(dev->em_pd, em_dbg, i, d); 138 90 139 91 } 140 92 ··· 161 103 static void em_debug_remove_pd(struct device *dev) {} 162 104 #endif 163 105 164 - static int em_create_perf_table(struct device *dev, struct em_perf_domain *pd, 165 - int nr_states, struct em_data_callback *cb, 166 - unsigned long flags) 106 + static void em_destroy_table_rcu(struct rcu_head *rp) 167 107 { 168 - unsigned long power, freq, prev_freq = 0, prev_cost = ULONG_MAX; 169 - struct em_perf_state *table; 170 - int i, ret; 171 - u64 fmax; 108 + struct em_perf_table __rcu *table; 172 109 173 - table = kcalloc(nr_states, 
sizeof(*table), GFP_KERNEL); 110 + table = container_of(rp, struct em_perf_table, rcu); 111 + kfree(table); 112 + } 113 + 114 + static void em_release_table_kref(struct kref *kref) 115 + { 116 + struct em_perf_table __rcu *table; 117 + 118 + /* It was the last owner of this table so we can free */ 119 + table = container_of(kref, struct em_perf_table, kref); 120 + 121 + call_rcu(&table->rcu, em_destroy_table_rcu); 122 + } 123 + 124 + /** 125 + * em_table_free() - Handles safe free of the EM table when needed 126 + * @table : EM table which is going to be freed 127 + * 128 + * No return values. 129 + */ 130 + void em_table_free(struct em_perf_table __rcu *table) 131 + { 132 + kref_put(&table->kref, em_release_table_kref); 133 + } 134 + 135 + /** 136 + * em_table_alloc() - Allocate a new EM table 137 + * @pd : EM performance domain for which this must be done 138 + * 139 + * Allocate a new EM table and initialize its kref to indicate that it 140 + * has a user. 141 + * Returns allocated table or NULL. 142 + */ 143 + struct em_perf_table __rcu *em_table_alloc(struct em_perf_domain *pd) 144 + { 145 + struct em_perf_table __rcu *table; 146 + int table_size; 147 + 148 + table_size = sizeof(struct em_perf_state) * pd->nr_perf_states; 149 + 150 + table = kzalloc(sizeof(*table) + table_size, GFP_KERNEL); 174 151 if (!table) 175 - return -ENOMEM; 152 + return NULL; 176 153 177 - /* Build the list of performance states for this performance domain */ 178 - for (i = 0, freq = 0; i < nr_states; i++, freq++) { 179 - /* 180 - * active_power() is a driver callback which ceils 'freq' to 181 - * lowest performance state of 'dev' above 'freq' and updates 182 - * 'power' and 'freq' accordingly. 183 - */ 184 - ret = cb->active_power(dev, &power, &freq); 185 - if (ret) { 186 - dev_err(dev, "EM: invalid perf. 
state: %d\n", 187 - ret); 188 - goto free_ps_table; 189 - } 154 + kref_init(&table->kref); 190 155 191 - /* 192 - * We expect the driver callback to increase the frequency for 193 - * higher performance states. 194 - */ 195 - if (freq <= prev_freq) { 196 - dev_err(dev, "EM: non-increasing freq: %lu\n", 197 - freq); 198 - goto free_ps_table; 199 - } 156 + return table; 157 + } 200 158 201 - /* 202 - * The power returned by active_state() is expected to be 203 - * positive and be in range. 204 - */ 205 - if (!power || power > EM_MAX_POWER) { 206 - dev_err(dev, "EM: invalid power: %lu\n", 207 - power); 208 - goto free_ps_table; 209 - } 159 + static void em_init_performance(struct device *dev, struct em_perf_domain *pd, 160 + struct em_perf_state *table, int nr_states) 161 + { 162 + u64 fmax, max_cap; 163 + int i, cpu; 210 164 211 - table[i].power = power; 212 - table[i].frequency = prev_freq = freq; 213 - } 165 + /* This is needed only for CPUs and EAS skip other devices */ 166 + if (!_is_cpu_device(dev)) 167 + return; 168 + 169 + cpu = cpumask_first(em_span_cpus(pd)); 170 + 171 + /* 172 + * Calculate the performance value for each frequency with 173 + * linear relationship. The final CPU capacity might not be ready at 174 + * boot time, but the EM will be updated a bit later with correct one. 175 + */ 176 + fmax = (u64) table[nr_states - 1].frequency; 177 + max_cap = (u64) arch_scale_cpu_capacity(cpu); 178 + for (i = 0; i < nr_states; i++) 179 + table[i].performance = div64_u64(max_cap * table[i].frequency, 180 + fmax); 181 + } 182 + 183 + static int em_compute_costs(struct device *dev, struct em_perf_state *table, 184 + struct em_data_callback *cb, int nr_states, 185 + unsigned long flags) 186 + { 187 + unsigned long prev_cost = ULONG_MAX; 188 + int i, ret; 214 189 215 190 /* Compute the cost of each performance state. 
*/ 216 - fmax = (u64) table[nr_states - 1].frequency; 217 191 for (i = nr_states - 1; i >= 0; i--) { 218 192 unsigned long power_res, cost; 219 193 220 - if (flags & EM_PERF_DOMAIN_ARTIFICIAL) { 194 + if ((flags & EM_PERF_DOMAIN_ARTIFICIAL) && cb->get_cost) { 221 195 ret = cb->get_cost(dev, table[i].frequency, &cost); 222 196 if (ret || !cost || cost > EM_MAX_POWER) { 223 197 dev_err(dev, "EM: invalid cost %lu %d\n", 224 198 cost, ret); 225 - goto free_ps_table; 199 + return -EINVAL; 226 200 } 227 201 } else { 228 - power_res = table[i].power; 229 - cost = div64_u64(fmax * power_res, table[i].frequency); 202 + /* increase resolution of 'cost' precision */ 203 + power_res = table[i].power * 10; 204 + cost = power_res / table[i].performance; 230 205 } 231 206 232 207 table[i].cost = cost; ··· 273 182 } 274 183 } 275 184 276 - pd->table = table; 277 - pd->nr_perf_states = nr_states; 185 + return 0; 186 + } 187 + 188 + /** 189 + * em_dev_compute_costs() - Calculate cost values for new runtime EM table 190 + * @dev : Device for which the EM table is to be updated 191 + * @table : The new EM table that is going to get the costs calculated 192 + * @nr_states : Number of performance states 193 + * 194 + * Calculate the em_perf_state::cost values for new runtime EM table. The 195 + * values are used for EAS during task placement. It also calculates and sets 196 + * the efficiency flag for each performance state. When the function finish 197 + * successfully the EM table is ready to be updated and used by EAS. 198 + * 199 + * Return 0 on success or a proper error in case of failure. 
200 + */ 201 + int em_dev_compute_costs(struct device *dev, struct em_perf_state *table, 202 + int nr_states) 203 + { 204 + return em_compute_costs(dev, table, NULL, nr_states, 0); 205 + } 206 + 207 + /** 208 + * em_dev_update_perf_domain() - Update runtime EM table for a device 209 + * @dev : Device for which the EM is to be updated 210 + * @new_table : The new EM table that is going to be used from now 211 + * 212 + * Update EM runtime modifiable table for the @dev using the provided @table. 213 + * 214 + * This function uses a mutex to serialize writers, so it must not be called 215 + * from a non-sleeping context. 216 + * 217 + * Return 0 on success or an error code on failure. 218 + */ 219 + int em_dev_update_perf_domain(struct device *dev, 220 + struct em_perf_table __rcu *new_table) 221 + { 222 + struct em_perf_table __rcu *old_table; 223 + struct em_perf_domain *pd; 224 + 225 + if (!dev) 226 + return -EINVAL; 227 + 228 + /* Serialize update/unregister or concurrent updates */ 229 + mutex_lock(&em_pd_mutex); 230 + 231 + if (!dev->em_pd) { 232 + mutex_unlock(&em_pd_mutex); 233 + return -EINVAL; 234 + } 235 + pd = dev->em_pd; 236 + 237 + kref_get(&new_table->kref); 238 + 239 + old_table = pd->em_table; 240 + rcu_assign_pointer(pd->em_table, new_table); 241 + 242 + em_cpufreq_update_efficiencies(dev, new_table->state); 243 + 244 + em_table_free(old_table); 245 + 246 + mutex_unlock(&em_pd_mutex); 247 + return 0; 248 + } 249 + EXPORT_SYMBOL_GPL(em_dev_update_perf_domain); 250 + 251 + static int em_create_perf_table(struct device *dev, struct em_perf_domain *pd, 252 + struct em_perf_state *table, 253 + struct em_data_callback *cb, 254 + unsigned long flags) 255 + { 256 + unsigned long power, freq, prev_freq = 0; 257 + int nr_states = pd->nr_perf_states; 258 + int i, ret; 259 + 260 + /* Build the list of performance states for this performance domain */ 261 + for (i = 0, freq = 0; i < nr_states; i++, freq++) { 262 + /* 263 + * active_power() is a driver callback 
which ceils 'freq' to 264 + * lowest performance state of 'dev' above 'freq' and updates 265 + * 'power' and 'freq' accordingly. 266 + */ 267 + ret = cb->active_power(dev, &power, &freq); 268 + if (ret) { 269 + dev_err(dev, "EM: invalid perf. state: %d\n", 270 + ret); 271 + return -EINVAL; 272 + } 273 + 274 + /* 275 + * We expect the driver callback to increase the frequency for 276 + * higher performance states. 277 + */ 278 + if (freq <= prev_freq) { 279 + dev_err(dev, "EM: non-increasing freq: %lu\n", 280 + freq); 281 + return -EINVAL; 282 + } 283 + 284 + /* 285 + * The power returned by active_state() is expected to be 286 + * positive and be in range. 287 + */ 288 + if (!power || power > EM_MAX_POWER) { 289 + dev_err(dev, "EM: invalid power: %lu\n", 290 + power); 291 + return -EINVAL; 292 + } 293 + 294 + table[i].power = power; 295 + table[i].frequency = prev_freq = freq; 296 + } 297 + 298 + em_init_performance(dev, pd, table, nr_states); 299 + 300 + ret = em_compute_costs(dev, table, cb, nr_states, flags); 301 + if (ret) 302 + return -EINVAL; 278 303 279 304 return 0; 280 - 281 - free_ps_table: 282 - kfree(table); 283 - return -EINVAL; 284 305 } 285 306 286 307 static int em_create_pd(struct device *dev, int nr_states, 287 308 struct em_data_callback *cb, cpumask_t *cpus, 288 309 unsigned long flags) 289 310 { 311 + struct em_perf_table __rcu *em_table; 290 312 struct em_perf_domain *pd; 291 313 struct device *cpu_dev; 292 314 int cpu, ret, num_cpus; ··· 424 220 return -ENOMEM; 425 221 } 426 222 427 - ret = em_create_perf_table(dev, pd, nr_states, cb, flags); 428 - if (ret) { 429 - kfree(pd); 430 - return ret; 431 - } 223 + pd->nr_perf_states = nr_states; 224 + 225 + em_table = em_table_alloc(pd); 226 + if (!em_table) 227 + goto free_pd; 228 + 229 + ret = em_create_perf_table(dev, pd, em_table->state, cb, flags); 230 + if (ret) 231 + goto free_pd_table; 232 + 233 + rcu_assign_pointer(pd->em_table, em_table); 432 234 433 235 if (_is_cpu_device(dev)) 434 236 
for_each_cpu(cpu, cpus) { ··· 445 235 dev->em_pd = pd; 446 236 447 237 return 0; 238 + 239 + free_pd_table: 240 + kfree(em_table); 241 + free_pd: 242 + kfree(pd); 243 + return -EINVAL; 448 244 } 449 245 450 - static void em_cpufreq_update_efficiencies(struct device *dev) 246 + static void 247 + em_cpufreq_update_efficiencies(struct device *dev, struct em_perf_state *table) 451 248 { 452 249 struct em_perf_domain *pd = dev->em_pd; 453 - struct em_perf_state *table; 454 250 struct cpufreq_policy *policy; 455 251 int found = 0; 456 - int i; 252 + int i, cpu; 457 253 458 - if (!_is_cpu_device(dev) || !pd) 254 + if (!_is_cpu_device(dev)) 459 255 return; 460 256 461 - policy = cpufreq_cpu_get(cpumask_first(em_span_cpus(pd))); 462 - if (!policy) { 463 - dev_warn(dev, "EM: Access to CPUFreq policy failed"); 257 + /* Try to get a CPU which is active and in this PD */ 258 + cpu = cpumask_first_and(em_span_cpus(pd), cpu_active_mask); 259 + if (cpu >= nr_cpu_ids) { 260 + dev_warn(dev, "EM: No online CPU for CPUFreq policy\n"); 464 261 return; 465 262 } 466 263 467 - table = pd->table; 264 + policy = cpufreq_cpu_get(cpu); 265 + if (!policy) { 266 + dev_warn(dev, "EM: Access to CPUFreq policy failed\n"); 267 + return; 268 + } 468 269 469 270 for (i = 0; i < pd->nr_perf_states; i++) { 470 271 if (!(table[i].flags & EM_PERF_STATE_INEFFICIENT)) ··· 618 397 619 398 dev->em_pd->flags |= flags; 620 399 621 - em_cpufreq_update_efficiencies(dev); 400 + em_cpufreq_update_efficiencies(dev, dev->em_pd->em_table->state); 622 401 623 402 em_debug_create_pd(dev); 624 403 dev_info(dev, "EM: created perf domain\n"); 625 404 626 405 unlock: 627 406 mutex_unlock(&em_pd_mutex); 407 + 408 + if (_is_cpu_device(dev)) 409 + em_check_capacity_update(); 410 + 628 411 return ret; 629 412 } 630 413 EXPORT_SYMBOL_GPL(em_dev_register_perf_domain); ··· 655 430 mutex_lock(&em_pd_mutex); 656 431 em_debug_remove_pd(dev); 657 432 658 - kfree(dev->em_pd->table); 433 + em_table_free(dev->em_pd->em_table); 434 + 
659 435 kfree(dev->em_pd); 660 436 dev->em_pd = NULL; 661 437 mutex_unlock(&em_pd_mutex); 662 438 } 663 439 EXPORT_SYMBOL_GPL(em_dev_unregister_perf_domain); 440 + 441 + /* 442 + * Adjustment of CPU performance values after boot, when all CPUs capacites 443 + * are correctly calculated. 444 + */ 445 + static void em_adjust_new_capacity(struct device *dev, 446 + struct em_perf_domain *pd, 447 + u64 max_cap) 448 + { 449 + struct em_perf_table __rcu *em_table; 450 + struct em_perf_state *ps, *new_ps; 451 + int ret, ps_size; 452 + 453 + em_table = em_table_alloc(pd); 454 + if (!em_table) { 455 + dev_warn(dev, "EM: allocation failed\n"); 456 + return; 457 + } 458 + 459 + new_ps = em_table->state; 460 + 461 + rcu_read_lock(); 462 + ps = em_perf_state_from_pd(pd); 463 + /* Initialize data based on old table */ 464 + ps_size = sizeof(struct em_perf_state) * pd->nr_perf_states; 465 + memcpy(new_ps, ps, ps_size); 466 + 467 + rcu_read_unlock(); 468 + 469 + em_init_performance(dev, pd, new_ps, pd->nr_perf_states); 470 + ret = em_compute_costs(dev, new_ps, NULL, pd->nr_perf_states, 471 + pd->flags); 472 + if (ret) { 473 + dev_warn(dev, "EM: compute costs failed\n"); 474 + return; 475 + } 476 + 477 + ret = em_dev_update_perf_domain(dev, em_table); 478 + if (ret) 479 + dev_warn(dev, "EM: update failed %d\n", ret); 480 + 481 + /* 482 + * This is one-time-update, so give up the ownership in this updater. 483 + * The EM framework has incremented the usage counter and from now 484 + * will keep the reference (then free the memory when needed). 
485 + */ 486 + em_table_free(em_table); 487 + } 488 + 489 + static void em_check_capacity_update(void) 490 + { 491 + cpumask_var_t cpu_done_mask; 492 + struct em_perf_state *table; 493 + struct em_perf_domain *pd; 494 + unsigned long cpu_capacity; 495 + int cpu; 496 + 497 + if (!zalloc_cpumask_var(&cpu_done_mask, GFP_KERNEL)) { 498 + pr_warn("no free memory\n"); 499 + return; 500 + } 501 + 502 + /* Check if CPUs capacity has changed than update EM */ 503 + for_each_possible_cpu(cpu) { 504 + struct cpufreq_policy *policy; 505 + unsigned long em_max_perf; 506 + struct device *dev; 507 + 508 + if (cpumask_test_cpu(cpu, cpu_done_mask)) 509 + continue; 510 + 511 + policy = cpufreq_cpu_get(cpu); 512 + if (!policy) { 513 + pr_debug("Accessing cpu%d policy failed\n", cpu); 514 + schedule_delayed_work(&em_update_work, 515 + msecs_to_jiffies(1000)); 516 + break; 517 + } 518 + cpufreq_cpu_put(policy); 519 + 520 + pd = em_cpu_get(cpu); 521 + if (!pd || em_is_artificial(pd)) 522 + continue; 523 + 524 + cpumask_or(cpu_done_mask, cpu_done_mask, 525 + em_span_cpus(pd)); 526 + 527 + cpu_capacity = arch_scale_cpu_capacity(cpu); 528 + 529 + rcu_read_lock(); 530 + table = em_perf_state_from_pd(pd); 531 + em_max_perf = table[pd->nr_perf_states - 1].performance; 532 + rcu_read_unlock(); 533 + 534 + /* 535 + * Check if the CPU capacity has been adjusted during boot 536 + * and trigger the update for new performance values. 537 + */ 538 + if (em_max_perf == cpu_capacity) 539 + continue; 540 + 541 + pr_debug("updating cpu%d cpu_cap=%lu old capacity=%lu\n", 542 + cpu, cpu_capacity, em_max_perf); 543 + 544 + dev = get_cpu_device(cpu); 545 + em_adjust_new_capacity(dev, pd, cpu_capacity); 546 + } 547 + 548 + free_cpumask_var(cpu_done_mask); 549 + } 550 + 551 + static void em_update_workfn(struct work_struct *work) 552 + { 553 + em_check_capacity_update(); 554 + }
+105 -2
kernel/power/hibernate.c
··· 47 47 sector_t swsusp_resume_block; 48 48 __visible int in_suspend __nosavedata; 49 49 50 + static char hibernate_compressor[CRYPTO_MAX_ALG_NAME] = CONFIG_HIBERNATION_DEF_COMP; 51 + 52 + /* 53 + * Compression/decompression algorithm to be used while saving/loading 54 + * image to/from disk. This would later be used in 'kernel/power/swap.c' 55 + * to allocate comp streams. 56 + */ 57 + char hib_comp_algo[CRYPTO_MAX_ALG_NAME]; 58 + 50 59 enum { 51 60 HIBERNATION_INVALID, 52 61 HIBERNATION_PLATFORM, ··· 727 718 return error; 728 719 } 729 720 721 + #define COMPRESSION_ALGO_LZO "lzo" 722 + #define COMPRESSION_ALGO_LZ4 "lz4" 723 + 730 724 /** 731 725 * hibernate - Carry out system hibernation, including saving the image. 732 726 */ ··· 742 730 if (!hibernation_available()) { 743 731 pm_pr_dbg("Hibernation not available.\n"); 744 732 return -EPERM; 733 + } 734 + 735 + /* 736 + * Query for the compression algorithm support if compression is enabled. 737 + */ 738 + if (!nocompress) { 739 + strscpy(hib_comp_algo, hibernate_compressor, sizeof(hib_comp_algo)); 740 + if (crypto_has_comp(hib_comp_algo, 0, 0) != 1) { 741 + pr_err("%s compression is not available\n", hib_comp_algo); 742 + return -EOPNOTSUPP; 743 + } 745 744 } 746 745 747 746 sleep_flags = lock_system_sleep(); ··· 789 766 790 767 if (hibernation_mode == HIBERNATION_PLATFORM) 791 768 flags |= SF_PLATFORM_MODE; 792 - if (nocompress) 769 + if (nocompress) { 793 770 flags |= SF_NOCOMPRESS_MODE; 794 - else 771 + } else { 795 772 flags |= SF_CRC32_MODE; 773 + 774 + /* 775 + * By default, LZO compression is enabled. Use SF_COMPRESSION_ALG_LZ4 776 + * to override this behaviour and use LZ4. 
777 + * 778 + * Refer kernel/power/power.h for more details 779 + */ 780 + 781 + if (!strcmp(hib_comp_algo, COMPRESSION_ALGO_LZ4)) 782 + flags |= SF_COMPRESSION_ALG_LZ4; 783 + else 784 + flags |= SF_COMPRESSION_ALG_LZO; 785 + } 796 786 797 787 pm_pr_dbg("Writing hibernation image.\n"); 798 788 error = swsusp_write(flags); ··· 990 954 error = swsusp_check(true); 991 955 if (error) 992 956 goto Unlock; 957 + 958 + /* 959 + * Check if the hibernation image is compressed. If so, query for 960 + * the algorithm support. 961 + */ 962 + if (!(swsusp_header_flags & SF_NOCOMPRESS_MODE)) { 963 + if (swsusp_header_flags & SF_COMPRESSION_ALG_LZ4) 964 + strscpy(hib_comp_algo, COMPRESSION_ALGO_LZ4, sizeof(hib_comp_algo)); 965 + else 966 + strscpy(hib_comp_algo, COMPRESSION_ALGO_LZO, sizeof(hib_comp_algo)); 967 + if (crypto_has_comp(hib_comp_algo, 0, 0) != 1) { 968 + pr_err("%s compression is not available\n", hib_comp_algo); 969 + error = -EOPNOTSUPP; 970 + goto Unlock; 971 + } 972 + } 993 973 994 974 /* The snapshot device should not be opened while we're running */ 995 975 if (!hibernate_acquire()) { ··· 1421 1369 nohibernate = 1; 1422 1370 return 1; 1423 1371 } 1372 + 1373 + static const char * const comp_alg_enabled[] = { 1374 + #if IS_ENABLED(CONFIG_CRYPTO_LZO) 1375 + COMPRESSION_ALGO_LZO, 1376 + #endif 1377 + #if IS_ENABLED(CONFIG_CRYPTO_LZ4) 1378 + COMPRESSION_ALGO_LZ4, 1379 + #endif 1380 + }; 1381 + 1382 + static int hibernate_compressor_param_set(const char *compressor, 1383 + const struct kernel_param *kp) 1384 + { 1385 + unsigned int sleep_flags; 1386 + int index, ret; 1387 + 1388 + sleep_flags = lock_system_sleep(); 1389 + 1390 + index = sysfs_match_string(comp_alg_enabled, compressor); 1391 + if (index >= 0) { 1392 + ret = param_set_copystring(comp_alg_enabled[index], kp); 1393 + if (!ret) 1394 + strscpy(hib_comp_algo, comp_alg_enabled[index], 1395 + sizeof(hib_comp_algo)); 1396 + } else { 1397 + ret = index; 1398 + } 1399 + 1400 + unlock_system_sleep(sleep_flags); 
1401 + 1402 + if (ret) 1403 + pr_debug("Cannot set specified compressor %s\n", 1404 + compressor); 1405 + 1406 + return ret; 1407 + } 1408 + 1409 + static const struct kernel_param_ops hibernate_compressor_param_ops = { 1410 + .set = hibernate_compressor_param_set, 1411 + .get = param_get_string, 1412 + }; 1413 + 1414 + static struct kparam_string hibernate_compressor_param_string = { 1415 + .maxlen = sizeof(hibernate_compressor), 1416 + .string = hibernate_compressor, 1417 + }; 1418 + 1419 + module_param_cb(compressor, &hibernate_compressor_param_ops, 1420 + &hibernate_compressor_param_string, 0644); 1421 + MODULE_PARM_DESC(compressor, 1422 + "Compression algorithm to be used with hibernation"); 1424 1423 1425 1424 __setup("noresume", noresume_setup); 1426 1425 __setup("resume_offset=", resume_offset_setup);
+113 -69
kernel/power/main.c
··· 95 95 } 96 96 EXPORT_SYMBOL_GPL(unregister_pm_notifier); 97 97 98 - void pm_report_hw_sleep_time(u64 t) 99 - { 100 - suspend_stats.last_hw_sleep = t; 101 - suspend_stats.total_hw_sleep += t; 102 - } 103 - EXPORT_SYMBOL_GPL(pm_report_hw_sleep_time); 104 - 105 - void pm_report_max_hw_sleep(u64 t) 106 - { 107 - suspend_stats.max_hw_sleep = t; 108 - } 109 - EXPORT_SYMBOL_GPL(pm_report_max_hw_sleep); 110 - 111 98 int pm_notifier_call_chain_robust(unsigned long val_up, unsigned long val_down) 112 99 { 113 100 int ret; ··· 306 319 power_attr(pm_test); 307 320 #endif /* CONFIG_PM_SLEEP_DEBUG */ 308 321 309 - static char *suspend_step_name(enum suspend_stat_step step) 322 + #define SUSPEND_NR_STEPS SUSPEND_RESUME 323 + #define REC_FAILED_NUM 2 324 + 325 + struct suspend_stats { 326 + unsigned int step_failures[SUSPEND_NR_STEPS]; 327 + unsigned int success; 328 + unsigned int fail; 329 + int last_failed_dev; 330 + char failed_devs[REC_FAILED_NUM][40]; 331 + int last_failed_errno; 332 + int errno[REC_FAILED_NUM]; 333 + int last_failed_step; 334 + u64 last_hw_sleep; 335 + u64 total_hw_sleep; 336 + u64 max_hw_sleep; 337 + enum suspend_stat_step failed_steps[REC_FAILED_NUM]; 338 + }; 339 + 340 + static struct suspend_stats suspend_stats; 341 + static DEFINE_MUTEX(suspend_stats_lock); 342 + 343 + void dpm_save_failed_dev(const char *name) 310 344 { 311 - switch (step) { 312 - case SUSPEND_FREEZE: 313 - return "freeze"; 314 - case SUSPEND_PREPARE: 315 - return "prepare"; 316 - case SUSPEND_SUSPEND: 317 - return "suspend"; 318 - case SUSPEND_SUSPEND_NOIRQ: 319 - return "suspend_noirq"; 320 - case SUSPEND_RESUME_NOIRQ: 321 - return "resume_noirq"; 322 - case SUSPEND_RESUME: 323 - return "resume"; 324 - default: 325 - return ""; 326 - } 345 + mutex_lock(&suspend_stats_lock); 346 + 347 + strscpy(suspend_stats.failed_devs[suspend_stats.last_failed_dev], 348 + name, sizeof(suspend_stats.failed_devs[0])); 349 + suspend_stats.last_failed_dev++; 350 + suspend_stats.last_failed_dev %= 
REC_FAILED_NUM; 351 + 352 + mutex_unlock(&suspend_stats_lock); 327 353 } 354 + 355 + void dpm_save_failed_step(enum suspend_stat_step step) 356 + { 357 + suspend_stats.step_failures[step-1]++; 358 + suspend_stats.failed_steps[suspend_stats.last_failed_step] = step; 359 + suspend_stats.last_failed_step++; 360 + suspend_stats.last_failed_step %= REC_FAILED_NUM; 361 + } 362 + 363 + void dpm_save_errno(int err) 364 + { 365 + if (!err) { 366 + suspend_stats.success++; 367 + return; 368 + } 369 + 370 + suspend_stats.fail++; 371 + 372 + suspend_stats.errno[suspend_stats.last_failed_errno] = err; 373 + suspend_stats.last_failed_errno++; 374 + suspend_stats.last_failed_errno %= REC_FAILED_NUM; 375 + } 376 + 377 + void pm_report_hw_sleep_time(u64 t) 378 + { 379 + suspend_stats.last_hw_sleep = t; 380 + suspend_stats.total_hw_sleep += t; 381 + } 382 + EXPORT_SYMBOL_GPL(pm_report_hw_sleep_time); 383 + 384 + void pm_report_max_hw_sleep(u64 t) 385 + { 386 + suspend_stats.max_hw_sleep = t; 387 + } 388 + EXPORT_SYMBOL_GPL(pm_report_max_hw_sleep); 389 + 390 + static const char * const suspend_step_names[] = { 391 + [SUSPEND_WORKING] = "", 392 + [SUSPEND_FREEZE] = "freeze", 393 + [SUSPEND_PREPARE] = "prepare", 394 + [SUSPEND_SUSPEND] = "suspend", 395 + [SUSPEND_SUSPEND_LATE] = "suspend_late", 396 + [SUSPEND_SUSPEND_NOIRQ] = "suspend_noirq", 397 + [SUSPEND_RESUME_NOIRQ] = "resume_noirq", 398 + [SUSPEND_RESUME_EARLY] = "resume_early", 399 + [SUSPEND_RESUME] = "resume", 400 + }; 328 401 329 402 #define suspend_attr(_name, format_str) \ 330 403 static ssize_t _name##_show(struct kobject *kobj, \ ··· 394 347 } \ 395 348 static struct kobj_attribute _name = __ATTR_RO(_name) 396 349 397 - suspend_attr(success, "%d\n"); 398 - suspend_attr(fail, "%d\n"); 399 - suspend_attr(failed_freeze, "%d\n"); 400 - suspend_attr(failed_prepare, "%d\n"); 401 - suspend_attr(failed_suspend, "%d\n"); 402 - suspend_attr(failed_suspend_late, "%d\n"); 403 - suspend_attr(failed_suspend_noirq, "%d\n"); 404 - 
suspend_attr(failed_resume, "%d\n"); 405 - suspend_attr(failed_resume_early, "%d\n"); 406 - suspend_attr(failed_resume_noirq, "%d\n"); 350 + suspend_attr(success, "%u\n"); 351 + suspend_attr(fail, "%u\n"); 407 352 suspend_attr(last_hw_sleep, "%llu\n"); 408 353 suspend_attr(total_hw_sleep, "%llu\n"); 409 354 suspend_attr(max_hw_sleep, "%llu\n"); 355 + 356 + #define suspend_step_attr(_name, step) \ 357 + static ssize_t _name##_show(struct kobject *kobj, \ 358 + struct kobj_attribute *attr, char *buf) \ 359 + { \ 360 + return sprintf(buf, "%u\n", \ 361 + suspend_stats.step_failures[step-1]); \ 362 + } \ 363 + static struct kobj_attribute _name = __ATTR_RO(_name) 364 + 365 + suspend_step_attr(failed_freeze, SUSPEND_FREEZE); 366 + suspend_step_attr(failed_prepare, SUSPEND_PREPARE); 367 + suspend_step_attr(failed_suspend, SUSPEND_SUSPEND); 368 + suspend_step_attr(failed_suspend_late, SUSPEND_SUSPEND_LATE); 369 + suspend_step_attr(failed_suspend_noirq, SUSPEND_SUSPEND_NOIRQ); 370 + suspend_step_attr(failed_resume, SUSPEND_RESUME); 371 + suspend_step_attr(failed_resume_early, SUSPEND_RESUME_EARLY); 372 + suspend_step_attr(failed_resume_noirq, SUSPEND_RESUME_NOIRQ); 410 373 411 374 static ssize_t last_failed_dev_show(struct kobject *kobj, 412 375 struct kobj_attribute *attr, char *buf) ··· 449 392 static ssize_t last_failed_step_show(struct kobject *kobj, 450 393 struct kobj_attribute *attr, char *buf) 451 394 { 452 - int index; 453 395 enum suspend_stat_step step; 454 - char *last_failed_step = NULL; 396 + int index; 455 397 456 398 index = suspend_stats.last_failed_step + REC_FAILED_NUM - 1; 457 399 index %= REC_FAILED_NUM; 458 400 step = suspend_stats.failed_steps[index]; 459 - last_failed_step = suspend_step_name(step); 460 401 461 - return sprintf(buf, "%s\n", last_failed_step); 402 + return sprintf(buf, "%s\n", suspend_step_names[step]); 462 403 } 463 404 static struct kobj_attribute last_failed_step = __ATTR_RO(last_failed_step); 464 405 ··· 504 449 static int 
suspend_stats_show(struct seq_file *s, void *unused) 505 450 { 506 451 int i, index, last_dev, last_errno, last_step; 452 + enum suspend_stat_step step; 507 453 508 454 last_dev = suspend_stats.last_failed_dev + REC_FAILED_NUM - 1; 509 455 last_dev %= REC_FAILED_NUM; ··· 512 456 last_errno %= REC_FAILED_NUM; 513 457 last_step = suspend_stats.last_failed_step + REC_FAILED_NUM - 1; 514 458 last_step %= REC_FAILED_NUM; 515 - seq_printf(s, "%s: %d\n%s: %d\n%s: %d\n%s: %d\n%s: %d\n" 516 - "%s: %d\n%s: %d\n%s: %d\n%s: %d\n%s: %d\n", 517 - "success", suspend_stats.success, 518 - "fail", suspend_stats.fail, 519 - "failed_freeze", suspend_stats.failed_freeze, 520 - "failed_prepare", suspend_stats.failed_prepare, 521 - "failed_suspend", suspend_stats.failed_suspend, 522 - "failed_suspend_late", 523 - suspend_stats.failed_suspend_late, 524 - "failed_suspend_noirq", 525 - suspend_stats.failed_suspend_noirq, 526 - "failed_resume", suspend_stats.failed_resume, 527 - "failed_resume_early", 528 - suspend_stats.failed_resume_early, 529 - "failed_resume_noirq", 530 - suspend_stats.failed_resume_noirq); 459 + 460 + seq_printf(s, "success: %u\nfail: %u\n", 461 + suspend_stats.success, suspend_stats.fail); 462 + 463 + for (step = SUSPEND_FREEZE; step <= SUSPEND_NR_STEPS; step++) 464 + seq_printf(s, "failed_%s: %u\n", suspend_step_names[step], 465 + suspend_stats.step_failures[step-1]); 466 + 531 467 seq_printf(s, "failures:\n last_failed_dev:\t%-s\n", 532 - suspend_stats.failed_devs[last_dev]); 468 + suspend_stats.failed_devs[last_dev]); 533 469 for (i = 1; i < REC_FAILED_NUM; i++) { 534 470 index = last_dev + REC_FAILED_NUM - i; 535 471 index %= REC_FAILED_NUM; 536 - seq_printf(s, "\t\t\t%-s\n", 537 - suspend_stats.failed_devs[index]); 472 + seq_printf(s, "\t\t\t%-s\n", suspend_stats.failed_devs[index]); 538 473 } 539 474 seq_printf(s, " last_failed_errno:\t%-d\n", 540 475 suspend_stats.errno[last_errno]); 541 476 for (i = 1; i < REC_FAILED_NUM; i++) { 542 477 index = last_errno + 
REC_FAILED_NUM - i; 543 478 index %= REC_FAILED_NUM; 544 - seq_printf(s, "\t\t\t%-d\n", 545 - suspend_stats.errno[index]); 479 + seq_printf(s, "\t\t\t%-d\n", suspend_stats.errno[index]); 546 480 } 547 481 seq_printf(s, " last_failed_step:\t%-s\n", 548 - suspend_step_name( 549 - suspend_stats.failed_steps[last_step])); 482 + suspend_step_names[suspend_stats.failed_steps[last_step]]); 550 483 for (i = 1; i < REC_FAILED_NUM; i++) { 551 484 index = last_step + REC_FAILED_NUM - i; 552 485 index %= REC_FAILED_NUM; 553 486 seq_printf(s, "\t\t\t%-s\n", 554 - suspend_step_name( 555 - suspend_stats.failed_steps[index])); 487 + suspend_step_names[suspend_stats.failed_steps[index]]); 556 488 } 557 489 558 490 return 0;
+22 -1
kernel/power/power.h
··· 6 6 #include <linux/compiler.h> 7 7 #include <linux/cpu.h> 8 8 #include <linux/cpuidle.h> 9 + #include <linux/crypto.h> 9 10 10 11 struct swsusp_info { 11 12 struct new_utsname uts; ··· 55 54 56 55 /* kernel/power/hibernate.c */ 57 56 extern bool freezer_test_done; 57 + extern char hib_comp_algo[CRYPTO_MAX_ALG_NAME]; 58 + 59 + /* kernel/power/swap.c */ 60 + extern unsigned int swsusp_header_flags; 58 61 59 62 extern int hibernation_snapshot(int platform_mode); 60 63 extern int hibernation_restore(int platform_mode); ··· 153 148 extern unsigned long snapshot_get_image_size(void); 154 149 extern int snapshot_read_next(struct snapshot_handle *handle); 155 150 extern int snapshot_write_next(struct snapshot_handle *handle); 156 - extern void snapshot_write_finalize(struct snapshot_handle *handle); 151 + int snapshot_write_finalize(struct snapshot_handle *handle); 157 152 extern int snapshot_image_loaded(struct snapshot_handle *handle); 158 153 159 154 extern bool hibernate_acquire(void); ··· 167 162 * Flags that can be passed from the hibernating kernel to the "boot" kernel in 168 163 * the image header. 169 164 */ 165 + #define SF_COMPRESSION_ALG_LZO 0 /* dummy, details given below */ 170 166 #define SF_PLATFORM_MODE 1 171 167 #define SF_NOCOMPRESS_MODE 2 172 168 #define SF_CRC32_MODE 4 173 169 #define SF_HW_SIG 8 170 + 171 + /* 172 + * Bit to indicate the compression algorithm to be used (for LZ4). The same 173 + * could be checked while saving/loading image to/from disk to use the 174 + * corresponding algorithms. 175 + * 176 + * By default, LZO compression is enabled if SF_CRC32_MODE is set. Use 177 + * SF_COMPRESSION_ALG_LZ4 to override this behaviour and use LZ4. 
178 + * 179 + * SF_CRC32_MODE, SF_COMPRESSION_ALG_LZO(dummy) -> Compression, LZO 180 + * SF_CRC32_MODE, SF_COMPRESSION_ALG_LZ4 -> Compression, LZ4 181 + */ 182 + #define SF_COMPRESSION_ALG_LZ4 16 174 183 175 184 /* kernel/power/hibernate.c */ 176 185 int swsusp_check(bool exclusive); ··· 346 327 suspend_enable_secondary_cpus(); 347 328 cpuidle_resume(); 348 329 } 330 + 331 + void dpm_save_errno(int err);
+16 -9
kernel/power/snapshot.c
··· 58 58 hibernate_restore_protection_active = false; 59 59 } 60 60 61 - static inline void hibernate_restore_protect_page(void *page_address) 61 + static inline int __must_check hibernate_restore_protect_page(void *page_address) 62 62 { 63 63 if (hibernate_restore_protection_active) 64 - set_memory_ro((unsigned long)page_address, 1); 64 + return set_memory_ro((unsigned long)page_address, 1); 65 + return 0; 65 66 } 66 67 67 - static inline void hibernate_restore_unprotect_page(void *page_address) 68 + static inline int hibernate_restore_unprotect_page(void *page_address) 68 69 { 69 70 if (hibernate_restore_protection_active) 70 - set_memory_rw((unsigned long)page_address, 1); 71 + return set_memory_rw((unsigned long)page_address, 1); 72 + return 0; 71 73 } 72 74 #else 73 75 static inline void hibernate_restore_protection_begin(void) {} 74 76 static inline void hibernate_restore_protection_end(void) {} 75 - static inline void hibernate_restore_protect_page(void *page_address) {} 76 - static inline void hibernate_restore_unprotect_page(void *page_address) {} 77 + static inline int __must_check hibernate_restore_protect_page(void *page_address) {return 0; } 78 + static inline int hibernate_restore_unprotect_page(void *page_address) {return 0; } 77 79 #endif /* CONFIG_STRICT_KERNEL_RWX && CONFIG_ARCH_HAS_SET_MEMORY */ 78 80 79 81 ··· 2834 2832 } 2835 2833 } else { 2836 2834 copy_last_highmem_page(); 2837 - hibernate_restore_protect_page(handle->buffer); 2835 + error = hibernate_restore_protect_page(handle->buffer); 2836 + if (error) 2837 + return error; 2838 2838 handle->buffer = get_buffer(&orig_bm, &ca); 2839 2839 if (IS_ERR(handle->buffer)) 2840 2840 return PTR_ERR(handle->buffer); ··· 2862 2858 * stored in highmem. Additionally, it recycles bitmap memory that's not 2863 2859 * necessary any more. 
2864 2860 */ 2865 - void snapshot_write_finalize(struct snapshot_handle *handle) 2861 + int snapshot_write_finalize(struct snapshot_handle *handle) 2866 2862 { 2863 + int error; 2864 + 2867 2865 copy_last_highmem_page(); 2868 - hibernate_restore_protect_page(handle->buffer); 2866 + error = hibernate_restore_protect_page(handle->buffer); 2869 2867 /* Do that only if we have loaded the image entirely */ 2870 2868 if (handle->cur > 1 && handle->cur > nr_meta_pages + nr_copy_pages + nr_zero_pages) { 2871 2869 memory_bm_recycle(&orig_bm); 2872 2870 free_highmem_data(); 2873 2871 } 2872 + return error; 2874 2873 } 2875 2874 2876 2875 int snapshot_image_loaded(struct snapshot_handle *handle)
+2 -7
kernel/power/suspend.c
··· 192 192 		if (mem_sleep_labels[state] &&
193 193 		    !strcmp(str, mem_sleep_labels[state])) {
194 194 			mem_sleep_default = state;
195    + 			mem_sleep_current = state;
195 196 			break;
196 197 		}
197 198 
··· 368 367 	if (!error)
369 368 		return 0;
370 369 
371    - 	suspend_stats.failed_freeze++;
372 370 	dpm_save_failed_step(SUSPEND_FREEZE);
373 371 	pm_notifier_call_chain(PM_POST_SUSPEND);
374 372  Restore:
··· 617 617 
618 618 	pr_info("suspend entry (%s)\n", mem_sleep_labels[state]);
619 619 	error = enter_state(state);
620    - 	if (error) {
621    - 		suspend_stats.fail++;
622    - 		dpm_save_failed_errno(error);
623    - 	} else {
624    - 		suspend_stats.success++;
625    - 	}
620    + 	dpm_save_errno(error);
626 621 	pr_info("suspend exit\n");
627 622 	return error;
628 623 }
+123 -74
kernel/power/swap.c
···  23  23 #include <linux/swapops.h>
  24  24 #include <linux/pm.h>
  25  25 #include <linux/slab.h>
  26   - #include <linux/lzo.h>
  27  26 #include <linux/vmalloc.h>
  28  27 #include <linux/cpumask.h>
  29  28 #include <linux/atomic.h>
··· 338 339 	return error;
339 340 }
340 341 
342    + /*
343    +  * Hold the swsusp_header flag. This is used in software_resume() in
344    +  * 'kernel/power/hibernate' to check if the image is compressed and query
345    +  * for the compression algorithm support(if so).
346    +  */
347    + unsigned int swsusp_header_flags;
348    + 
341 349 /**
342 350  * swsusp_swap_check - check if the resume device is a swap device
343 351  * and get its index (if so)
··· 520 514 	return error;
521 515 }
522 516 
517    + /*
518    +  * Bytes we need for compressed data in worst case. We assume(limitation)
519    +  * this is the worst of all the compression algorithms.
520    +  */
521    + #define bytes_worst_compress(x) ((x) + ((x) / 16) + 64 + 3 + 2)
522    + 
523 523 /* We need to remember how much compressed data we need to read. */
524    - #define LZO_HEADER	sizeof(size_t)
524    + #define CMP_HEADER	sizeof(size_t)
525 525 
526 526 /* Number of pages/bytes we'll compress at one time. */
527    - #define LZO_UNC_PAGES	32
528    - #define LZO_UNC_SIZE	(LZO_UNC_PAGES * PAGE_SIZE)
527    + #define UNC_PAGES	32
528    + #define UNC_SIZE	(UNC_PAGES * PAGE_SIZE)
529 529 
530    - /* Number of pages/bytes we need for compressed data (worst case). */
531    - #define LZO_CMP_PAGES	DIV_ROUND_UP(lzo1x_worst_compress(LZO_UNC_SIZE) + \
532    - 				     LZO_HEADER, PAGE_SIZE)
533    - #define LZO_CMP_SIZE	(LZO_CMP_PAGES * PAGE_SIZE)
530    + /* Number of pages we need for compressed data (worst case). */
531    + #define CMP_PAGES	DIV_ROUND_UP(bytes_worst_compress(UNC_SIZE) + \
532    + 				     CMP_HEADER, PAGE_SIZE)
533    + #define CMP_SIZE	(CMP_PAGES * PAGE_SIZE)
534 534 
535 535 /* Maximum number of threads for compression/decompression. */
536    - #define LZO_THREADS	3
536    + #define CMP_THREADS	3
537 537 
538 538 /* Minimum/maximum number of pages for read buffering. */
539    - #define LZO_MIN_RD_PAGES	1024
540    - #define LZO_MAX_RD_PAGES	8192
541    - 
539    + #define CMP_MIN_RD_PAGES	1024
540    + #define CMP_MAX_RD_PAGES	8192
542 541 
543 542 /**
544 543  * save_image - save the suspend image data
··· 604 593 	wait_queue_head_t go;    /* start crc update */
605 594 	wait_queue_head_t done;  /* crc update done */
606 595 	u32 *crc32;              /* points to handle's crc32 */
607    - 	size_t *unc_len[LZO_THREADS];    /* uncompressed lengths */
608    - 	unsigned char *unc[LZO_THREADS]; /* uncompressed data */
596    + 	size_t *unc_len[CMP_THREADS];    /* uncompressed lengths */
597    + 	unsigned char *unc[CMP_THREADS]; /* uncompressed data */
609 598 };
610 599 
611 600 /*
··· 636 625 	return 0;
637 626 }
638 627 /*
639    -  * Structure used for LZO data compression.
628    +  * Structure used for data compression.
640 629  */
641 630 struct cmp_data {
642 631 	struct task_struct *thr;                  /* thread */
    632    + 	struct crypto_comp *cc;                   /* crypto compressor stream */
643 633 	atomic_t ready;                           /* ready to start flag */
644 634 	atomic_t stop;                            /* ready to stop flag */
645 635 	int ret;                                  /* return code */
··· 648 636 	wait_queue_head_t done;                   /* compression done */
649 637 	size_t unc_len;                           /* uncompressed length */
650 638 	size_t cmp_len;                           /* compressed length */
651    - 	unsigned char unc[LZO_UNC_SIZE];          /* uncompressed buffer */
652    - 	unsigned char cmp[LZO_CMP_SIZE];          /* compressed buffer */
653    - 	unsigned char wrk[LZO1X_1_MEM_COMPRESS];  /* compression workspace */
639    + 	unsigned char unc[UNC_SIZE];              /* uncompressed buffer */
640    + 	unsigned char cmp[CMP_SIZE];              /* compressed buffer */
654 641 };
642    + 
643    + /* Indicates the image size after compression */
644    + static atomic_t compressed_size = ATOMIC_INIT(0);
655 645 
656 646 /*
657 647  * Compression function that runs in its own thread.
658 648  */
659    - static int lzo_compress_threadfn(void *data)
649    + static int compress_threadfn(void *data)
660 650 {
661 651 	struct cmp_data *d = data;
    652    + 	unsigned int cmp_len = 0;
662 653 
663 654 	while (1) {
664 655 		wait_event(d->go, atomic_read_acquire(&d->ready) ||
··· 675 660 		}
676 661 		atomic_set(&d->ready, 0);
677 662 
678    - 		d->ret = lzo1x_1_compress(d->unc, d->unc_len,
679    - 					  d->cmp + LZO_HEADER, &d->cmp_len,
680    - 					  d->wrk);
663    + 		cmp_len = CMP_SIZE - CMP_HEADER;
664    + 		d->ret = crypto_comp_compress(d->cc, d->unc, d->unc_len,
665    + 					      d->cmp + CMP_HEADER,
666    + 					      &cmp_len);
667    + 		d->cmp_len = cmp_len;
668    + 
669    + 		atomic_set(&compressed_size, atomic_read(&compressed_size) + d->cmp_len);
681 670 		atomic_set_release(&d->stop, 1);
682 671 		wake_up(&d->done);
683 672 	}
··· 689 670 }
690 671 
691 672 /**
692    -  * save_image_lzo - Save the suspend image data compressed with LZO.
673    +  * save_compressed_image - Save the suspend image data after compression.
693 674  * @handle: Swap map handle to use for saving the image.
694 675  * @snapshot: Image to read data from.
695 676  * @nr_to_write: Number of pages to save.
696 677  */
697    - static int save_image_lzo(struct swap_map_handle *handle,
698    - 			  struct snapshot_handle *snapshot,
699    - 			  unsigned int nr_to_write)
678    + static int save_compressed_image(struct swap_map_handle *handle,
679    + 				 struct snapshot_handle *snapshot,
680    + 				 unsigned int nr_to_write)
700 681 {
701 682 	unsigned int m;
702 683 	int ret = 0;
··· 713 694 
714 695 	hib_init_batch(&hb);
715 696 
    697    + 	atomic_set(&compressed_size, 0);
    698    + 
716 699 	/*
717 700 	 * We'll limit the number of threads for compression to limit memory
718 701 	 * footprint.
719 702 	 */
720 703 	nr_threads = num_online_cpus() - 1;
721    - 	nr_threads = clamp_val(nr_threads, 1, LZO_THREADS);
704    + 	nr_threads = clamp_val(nr_threads, 1, CMP_THREADS);
722 705 
723 706 	page = (void *)__get_free_page(GFP_NOIO | __GFP_HIGH);
724 707 	if (!page) {
725    - 		pr_err("Failed to allocate LZO page\n");
708    + 		pr_err("Failed to allocate %s page\n", hib_comp_algo);
726 709 		ret = -ENOMEM;
727 710 		goto out_clean;
728 711 	}
729 712 
730 713 	data = vzalloc(array_size(nr_threads, sizeof(*data)));
731 714 	if (!data) {
732    - 		pr_err("Failed to allocate LZO data\n");
715    + 		pr_err("Failed to allocate %s data\n", hib_comp_algo);
733 716 		ret = -ENOMEM;
734 717 		goto out_clean;
735 718 	}
··· 750 729 		init_waitqueue_head(&data[thr].go);
751 730 		init_waitqueue_head(&data[thr].done);
752 731 
753    - 		data[thr].thr = kthread_run(lzo_compress_threadfn,
732    + 		data[thr].cc = crypto_alloc_comp(hib_comp_algo, 0, 0);
733    + 		if (IS_ERR_OR_NULL(data[thr].cc)) {
734    + 			pr_err("Could not allocate comp stream %ld\n", PTR_ERR(data[thr].cc));
735    + 			ret = -EFAULT;
736    + 			goto out_clean;
737    + 		}
738    + 
739    + 		data[thr].thr = kthread_run(compress_threadfn,
754 740 					    &data[thr],
755 741 					    "image_compress/%u", thr);
756 742 		if (IS_ERR(data[thr].thr)) {
··· 795 767 	 */
796 768 	handle->reqd_free_pages = reqd_free_pages();
797 769 
798    - 	pr_info("Using %u thread(s) for compression\n", nr_threads);
770    + 	pr_info("Using %u thread(s) for %s compression\n", nr_threads, hib_comp_algo);
799 771 	pr_info("Compressing and saving image data (%u pages)...\n",
800 772 		nr_to_write);
801 773 	m = nr_to_write / 10;
··· 805 777 	start = ktime_get();
806 778 	for (;;) {
807 779 		for (thr = 0; thr < nr_threads; thr++) {
808    - 			for (off = 0; off < LZO_UNC_SIZE; off += PAGE_SIZE) {
780    + 			for (off = 0; off < UNC_SIZE; off += PAGE_SIZE) {
809 781 				ret = snapshot_read_next(snapshot);
810 782 				if (ret < 0)
811 783 					goto out_finish;
··· 845 817 			ret = data[thr].ret;
846 818 
847 819 			if (ret < 0) {
848    - 				pr_err("LZO compression failed\n");
820    + 				pr_err("%s compression failed\n", hib_comp_algo);
849 821 				goto out_finish;
850 822 			}
851 823 
852 824 			if (unlikely(!data[thr].cmp_len ||
853 825 				     data[thr].cmp_len >
854    - 				     lzo1x_worst_compress(data[thr].unc_len))) {
855    - 				pr_err("Invalid LZO compressed length\n");
826    + 				     bytes_worst_compress(data[thr].unc_len))) {
827    + 				pr_err("Invalid %s compressed length\n", hib_comp_algo);
856 828 				ret = -1;
857 829 				goto out_finish;
858 830 			}
··· 868 840 			 * read it.
869 841 			 */
870 842 			for (off = 0;
871    - 			     off < LZO_HEADER + data[thr].cmp_len;
843    + 			     off < CMP_HEADER + data[thr].cmp_len;
872 844 			     off += PAGE_SIZE) {
873 845 				memcpy(page, data[thr].cmp + off, PAGE_SIZE);
874 846 
··· 890 862 	if (!ret)
891 863 		pr_info("Image saving done\n");
892 864 	swsusp_show_speed(start, stop, nr_to_write, "Wrote");
    865    + 	pr_info("Image size after compression: %d kbytes\n",
    866    + 		(atomic_read(&compressed_size) / 1024));
    867    + 
893 868 out_clean:
894 869 	hib_finish_batch(&hb);
895 870 	if (crc) {
··· 901 870 		kfree(crc);
902 871 	}
903 872 	if (data) {
904    - 		for (thr = 0; thr < nr_threads; thr++)
873    + 		for (thr = 0; thr < nr_threads; thr++) {
905 874 			if (data[thr].thr)
906 875 				kthread_stop(data[thr].thr);
    876    + 			if (data[thr].cc)
    877    + 				crypto_free_comp(data[thr].cc);
    878    + 		}
907 879 		vfree(data);
908 880 	}
909 881 	if (page) free_page((unsigned long)page);
··· 976 942 	if (!error) {
977 943 		error = (flags & SF_NOCOMPRESS_MODE) ?
978 944 			save_image(&handle, &snapshot, pages - 1) :
979    - 			save_image_lzo(&handle, &snapshot, pages - 1);
945    + 			save_compressed_image(&handle, &snapshot, pages - 1);
980 946 	}
981 947 out_finish:
982 948 	error = swap_writer_finish(&handle, flags, error);
··· 1134 1100 		ret = err2;
1135 1101 	if (!ret) {
1136 1102 		pr_info("Image loading done\n");
1137     - 		snapshot_write_finalize(snapshot);
1138     - 		if (!snapshot_image_loaded(snapshot))
1103     + 		ret = snapshot_write_finalize(snapshot);
1104     + 		if (!ret && !snapshot_image_loaded(snapshot))
1139 1105 			ret = -ENODATA;
1140 1106 	}
1141 1107 	swsusp_show_speed(start, stop, nr_to_read, "Read");
··· 1143 1109 }
1144 1110 
1145 1111 /*
1146     -  * Structure used for LZO data decompression.
1112     +  * Structure used for data decompression.
1147 1113  */
1148 1114 struct dec_data {
1149 1115 	struct task_struct *thr;                  /* thread */
     1116     + 	struct crypto_comp *cc;                   /* crypto compressor stream */
1150 1117 	atomic_t ready;                           /* ready to start flag */
1151 1118 	atomic_t stop;                            /* ready to stop flag */
1152 1119 	int ret;                                  /* return code */
··· 1155 1120 	wait_queue_head_t done;                   /* decompression done */
1156 1121 	size_t unc_len;                           /* uncompressed length */
1157 1122 	size_t cmp_len;                           /* compressed length */
1158     - 	unsigned char unc[LZO_UNC_SIZE];          /* uncompressed buffer */
1159     - 	unsigned char cmp[LZO_CMP_SIZE];          /* compressed buffer */
1123     + 	unsigned char unc[UNC_SIZE];              /* uncompressed buffer */
1124     + 	unsigned char cmp[CMP_SIZE];              /* compressed buffer */
1160 1125 };
1161 1126 
1162 1127 /*
1163 1128  * Decompression function that runs in its own thread.
1164 1129  */
1165     - static int lzo_decompress_threadfn(void *data)
1130     + static int decompress_threadfn(void *data)
1166 1131 {
1167 1132 	struct dec_data *d = data;
     1133     + 	unsigned int unc_len = 0;
1168 1134 
1169 1135 	while (1) {
1170 1136 		wait_event(d->go, atomic_read_acquire(&d->ready) ||
··· 1179 1143 		}
1180 1144 		atomic_set(&d->ready, 0);
1181 1145 
1182     - 		d->unc_len = LZO_UNC_SIZE;
1183     - 		d->ret = lzo1x_decompress_safe(d->cmp + LZO_HEADER, d->cmp_len,
1184     - 					       d->unc, &d->unc_len);
1146     + 		unc_len = UNC_SIZE;
1147     + 		d->ret = crypto_comp_decompress(d->cc, d->cmp + CMP_HEADER, d->cmp_len,
1148     + 						d->unc, &unc_len);
1149     + 		d->unc_len = unc_len;
1150     + 
1185 1151 		if (clean_pages_on_decompress)
1186 1152 			flush_icache_range((unsigned long)d->unc,
1187 1153 					   (unsigned long)d->unc + d->unc_len);
··· 1195 1157 }
1196 1158 
1197 1159 /**
1198     -  * load_image_lzo - Load compressed image data and decompress them with LZO.
1160     +  * load_compressed_image - Load compressed image data and decompress it.
1199 1161  * @handle: Swap map handle to use for loading data.
1200 1162  * @snapshot: Image to copy uncompressed data into.
1201 1163  * @nr_to_read: Number of pages to load.
1202 1164  */
1203     - static int load_image_lzo(struct swap_map_handle *handle,
1204     - 			  struct snapshot_handle *snapshot,
1205     - 			  unsigned int nr_to_read)
1165     + static int load_compressed_image(struct swap_map_handle *handle,
1166     + 				 struct snapshot_handle *snapshot,
1167     + 				 unsigned int nr_to_read)
1206 1168 {
1207 1169 	unsigned int m;
1208 1170 	int ret = 0;
··· 1227 1189 	 * footprint.
1228 1190 	 */
1229 1191 	nr_threads = num_online_cpus() - 1;
1230     - 	nr_threads = clamp_val(nr_threads, 1, LZO_THREADS);
1192     + 	nr_threads = clamp_val(nr_threads, 1, CMP_THREADS);
1231 1193 
1232     - 	page = vmalloc(array_size(LZO_MAX_RD_PAGES, sizeof(*page)));
1194     + 	page = vmalloc(array_size(CMP_MAX_RD_PAGES, sizeof(*page)));
1233 1195 	if (!page) {
1234     - 		pr_err("Failed to allocate LZO page\n");
1196     + 		pr_err("Failed to allocate %s page\n", hib_comp_algo);
1235 1197 		ret = -ENOMEM;
1236 1198 		goto out_clean;
1237 1199 	}
1238 1200 
1239 1201 	data = vzalloc(array_size(nr_threads, sizeof(*data)));
1240 1202 	if (!data) {
1241     - 		pr_err("Failed to allocate LZO data\n");
1203     + 		pr_err("Failed to allocate %s data\n", hib_comp_algo);
1242 1204 		ret = -ENOMEM;
1243 1205 		goto out_clean;
1244 1206 	}
··· 1259 1221 		init_waitqueue_head(&data[thr].go);
1260 1222 		init_waitqueue_head(&data[thr].done);
1261 1223 
1262     - 		data[thr].thr = kthread_run(lzo_decompress_threadfn,
1224     + 		data[thr].cc = crypto_alloc_comp(hib_comp_algo, 0, 0);
1225     + 		if (IS_ERR_OR_NULL(data[thr].cc)) {
1226     + 			pr_err("Could not allocate comp stream %ld\n", PTR_ERR(data[thr].cc));
1227     + 			ret = -EFAULT;
1228     + 			goto out_clean;
1229     + 		}
1230     + 
1231     + 		data[thr].thr = kthread_run(decompress_threadfn,
1263 1232 					    &data[thr],
1264 1233 					    "image_decompress/%u", thr);
1265 1234 		if (IS_ERR(data[thr].thr)) {
··· 1307 1262 	 */
1308 1263 	if (low_free_pages() > snapshot_get_image_size())
1309 1264 		read_pages = (low_free_pages() - snapshot_get_image_size()) / 2;
1310     - 	read_pages = clamp_val(read_pages, LZO_MIN_RD_PAGES, LZO_MAX_RD_PAGES);
1265     + 	read_pages = clamp_val(read_pages, CMP_MIN_RD_PAGES, CMP_MAX_RD_PAGES);
1311 1266 
1312 1267 	for (i = 0; i < read_pages; i++) {
1313     - 		page[i] = (void *)__get_free_page(i < LZO_CMP_PAGES ?
1268     + 		page[i] = (void *)__get_free_page(i < CMP_PAGES ?
1314 1269 						  GFP_NOIO | __GFP_HIGH :
1315 1270 						  GFP_NOIO | __GFP_NOWARN |
1316 1271 						  __GFP_NORETRY);
1317 1272 
1318 1273 		if (!page[i]) {
1319     - 			if (i < LZO_CMP_PAGES) {
1274     + 			if (i < CMP_PAGES) {
1320 1275 				ring_size = i;
1321     - 				pr_err("Failed to allocate LZO pages\n");
1276     + 				pr_err("Failed to allocate %s pages\n", hib_comp_algo);
1322 1277 				ret = -ENOMEM;
1323 1278 				goto out_clean;
1324 1279 			} else {
··· 1328 1283 	}
1329 1284 	want = ring_size = i;
1330 1285 
1331     - 	pr_info("Using %u thread(s) for decompression\n", nr_threads);
1286     + 	pr_info("Using %u thread(s) for %s decompression\n", nr_threads, hib_comp_algo);
1332 1287 	pr_info("Loading and decompressing image data (%u pages)...\n",
1333 1288 		nr_to_read);
1334 1289 	m = nr_to_read / 10;
··· 1389 1344 			data[thr].cmp_len = *(size_t *)page[pg];
1390 1345 			if (unlikely(!data[thr].cmp_len ||
1391 1346 				     data[thr].cmp_len >
1392     - 				     lzo1x_worst_compress(LZO_UNC_SIZE))) {
1393     - 				pr_err("Invalid LZO compressed length\n");
1347     + 				     bytes_worst_compress(UNC_SIZE))) {
1348     + 				pr_err("Invalid %s compressed length\n", hib_comp_algo);
1394 1349 				ret = -1;
1395 1350 				goto out_finish;
1396 1351 			}
1397 1352 
1398     - 			need = DIV_ROUND_UP(data[thr].cmp_len + LZO_HEADER,
1353     + 			need = DIV_ROUND_UP(data[thr].cmp_len + CMP_HEADER,
1399 1354 					    PAGE_SIZE);
1400 1355 			if (need > have) {
1401 1356 				if (eof > 1) {
··· 1406 1361 			}
1407 1362 
1408 1363 			for (off = 0;
1409     - 			     off < LZO_HEADER + data[thr].cmp_len;
1364     + 			     off < CMP_HEADER + data[thr].cmp_len;
1410 1365 			     off += PAGE_SIZE) {
1411 1366 				memcpy(data[thr].cmp + off,
1412 1367 				       page[pg], PAGE_SIZE);
··· 1423 1378 		/*
1424 1379 		 * Wait for more data while we are decompressing.
1425 1380 		 */
1426     - 		if (have < LZO_CMP_PAGES && asked) {
1381     + 		if (have < CMP_PAGES && asked) {
1427 1382 			ret = hib_wait_io(&hb);
1428 1383 			if (ret)
1429 1384 				goto out_finish;
··· 1441 1396 			ret = data[thr].ret;
1442 1397 
1443 1398 			if (ret < 0) {
1444     - 				pr_err("LZO decompression failed\n");
1399     + 				pr_err("%s decompression failed\n", hib_comp_algo);
1445 1400 				goto out_finish;
1446 1401 			}
1447 1402 
1448 1403 			if (unlikely(!data[thr].unc_len ||
1449     - 				     data[thr].unc_len > LZO_UNC_SIZE ||
1450     - 				     data[thr].unc_len & (PAGE_SIZE - 1))) {
1451     - 				pr_err("Invalid LZO uncompressed length\n");
1404     + 				     data[thr].unc_len > UNC_SIZE ||
1405     + 				     data[thr].unc_len & (PAGE_SIZE - 1))) {
1406     + 				pr_err("Invalid %s uncompressed length\n", hib_comp_algo);
1452 1407 				ret = -1;
1453 1408 				goto out_finish;
1454 1409 			}
··· 1486 1441 	stop = ktime_get();
1487 1442 	if (!ret) {
1488 1443 		pr_info("Image loading done\n");
1489     - 		snapshot_write_finalize(snapshot);
1490     - 		if (!snapshot_image_loaded(snapshot))
1444     + 		ret = snapshot_write_finalize(snapshot);
1445     + 		if (!ret && !snapshot_image_loaded(snapshot))
1491 1446 			ret = -ENODATA;
1492 1447 		if (!ret) {
1493 1448 			if (swsusp_header->flags & SF_CRC32_MODE) {
··· 1509 1464 		kfree(crc);
1510 1465 	}
1511 1466 	if (data) {
1512     - 		for (thr = 0; thr < nr_threads; thr++)
1467     + 		for (thr = 0; thr < nr_threads; thr++) {
1513 1468 			if (data[thr].thr)
1514 1469 				kthread_stop(data[thr].thr);
     1470     + 			if (data[thr].cc)
     1471     + 				crypto_free_comp(data[thr].cc);
     1472     + 		}
1515 1473 		vfree(data);
1516 1474 	}
1517 1475 	vfree(page);
··· 1548 1500 	if (!error) {
1549 1501 		error = (*flags_p & SF_NOCOMPRESS_MODE) ?
1550 1502 			load_image(&handle, &snapshot, header->pages - 1) :
1551     - 			load_image_lzo(&handle, &snapshot, header->pages - 1);
1503     + 			load_compressed_image(&handle, &snapshot, header->pages - 1);
1552 1504 	}
1553 1505 	swap_reader_finish(&handle);
1554 1506 end:
··· 1583 1535 
1584 1536 	if (!memcmp(HIBERNATE_SIG, swsusp_header->sig, 10)) {
1585 1537 		memcpy(swsusp_header->sig, swsusp_header->orig_sig, 10);
     1538     + 		swsusp_header_flags = swsusp_header->flags;
1586 1539 		/* Reset swap signature now */
1587 1540 		error = hib_submit_io(REQ_OP_WRITE | REQ_SYNC,
1588 1541 				      swsusp_resume_block,
+3 -1
kernel/power/user.c
··· 317 317 		break;
318 318 
319 319 	case SNAPSHOT_ATOMIC_RESTORE:
320    - 		snapshot_write_finalize(&data->handle);
320    + 		error = snapshot_write_finalize(&data->handle);
321    + 		if (error)
322    + 			break;
321 323 		if (data->mode != O_WRONLY || !data->frozen ||
322 324 		    !snapshot_image_loaded(&data->handle)) {
323 325 			error = -EPERM;
+1 -1
sound/hda/hdac_device.c
··· 612 612 int snd_hdac_keep_power_up(struct hdac_device *codec)
613 613 {
614 614 	if (!atomic_inc_not_zero(&codec->in_pm)) {
615    - 		int ret = pm_runtime_get_if_active(&codec->dev, true);
615    + 		int ret = pm_runtime_get_if_active(&codec->dev);
616 616 		if (!ret)
617 617 			return -1;
618 618 		if (ret < 0)
+1 -1
tools/power/cpupower/man/cpupower-frequency-info.1
···  32  32 \fB\-g\fR \fB\-\-governors\fR
  33  33 Determines available cpufreq governors.
  34  34 .TP
  35   - \fB\-a\fR \fB\-\-related\-cpus\fR
  35   + \fB\-r\fR \fB\-\-related\-cpus\fR
  36  36 Determines which CPUs run at the same hardware frequency.
  37  37 .TP
  38  38 \fB\-a\fR \fB\-\-affected\-cpus\fR
+1
tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c
··· 1241 1241 	retval = fscanf(fp, "%d\n", &pkg);
1242 1242 	if (retval != 1)
1243 1243 		errx(1, "%s: failed to parse", pathname);
     1244     + 	fclose(fp);
1244 1245 	return pkg;
1245 1246 }
1246 1247 