Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

powercap: idle_inject: Add update callback

The powercap/idle_inject core uses play_idle_precise() to inject idle
time. But play_idle_precise() can't ensure that the CPU is fully idle
for the specified duration because of wakeups due to interrupts. To
compensate for the idle time lost to these wakeups, the caller can
adjust the requested idle time for the next cycle.

The goal of idle injection is to keep the system at some idle percentage
on average, so it is fine to overshoot or undershoot instantaneous idle
times.

The idle inject core provides an interface, idle_inject_set_duration(),
to set the idle and run durations.

Some architectures provide an interface to get the actual idle time
observed by the hardware, so the effective idle percentage can be
adjusted using hardware feedback. For example, Intel CPUs provide
package idle counters, which are currently used by the Intel powerclamp
driver to readjust the run duration.

When the caller's desired idle time over a period is less or greater
than the actual CPU idle time observed by the hardware, the caller can
readjust the idle and run durations for the next cycle.
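
The readjustment described above can be illustrated with a small
user-space sketch. It is not kernel code and not what any driver actually
does: the function names and the simple "carry the error into the next
request" policy are assumptions made for illustration only.

```c
#include <assert.h>

/*
 * Illustrative user-space model, not kernel code. All names are
 * hypothetical; only the idea (correct the next cycle's request using
 * hardware feedback) comes from the commit message.
 */

/* Idle time, in microseconds, needed to reach target_pct of one period. */
static unsigned int desired_idle_us(unsigned int period_us,
				    unsigned int target_pct)
{
	assert(target_pct <= 100);
	return (unsigned int)((unsigned long long)period_us * target_pct / 100);
}

/*
 * Choose the idle duration to request for the next cycle.
 *
 * requested_us: idle time requested in the cycle that just ended
 * observed_us:  idle time the hardware reported for that cycle
 * desired_us:   idle time wanted per cycle on average
 *
 * The shortfall (or excess) of the last cycle is added to the previous
 * request, and the result is clamped to the range [0, period_us].
 */
static unsigned int next_idle_us(unsigned int requested_us,
				 unsigned int observed_us,
				 unsigned int desired_us,
				 unsigned int period_us)
{
	long long next = (long long)requested_us +
			 ((long long)desired_us - (long long)observed_us);

	if (next < 0)
		next = 0;
	if (next > (long long)period_us)
		next = period_us;

	return (unsigned int)next;
}
```

In a real driver, the computed value would be handed to
idle_inject_set_duration() together with the matching run duration.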

Currently, the only way to do this is to monitor the hardware idle time
from a separate software thread and readjust the idle and run durations
using idle_inject_set_duration().

This can be avoided by letting callers register a callback and readjust
the durations from within that callback.

Add a capability to register an optional update() callback, which is
called from the idle inject core before waking up CPUs for idle injection.
This callback can be registered via a new interface:
idle_inject_register_full().

While the idle and run durations are being constantly adjusted, the
actual idle time can sometimes exceed the desired idle time. In that case,
idle injection can be skipped for a cycle: if the update() callback
returns false, the idle inject core skips waking up CPUs for idle
injection.
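
The skip decision can be modelled in user space as follows. This is a
minimal sketch under stated assumptions: struct ii_model, ii_model_tick()
and the callback with its two file-scope variables are hypothetical
stand-ins. Only the condition and the order of operations (consult
update() first, then read the durations) follow the patched timer function
in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-in for the struct idle_inject_device fields used here. */
struct ii_model {
	unsigned int idle_duration_us;
	unsigned int run_duration_us;
	bool (*update)(void);	/* optional, may be NULL */
	unsigned int wakeups;	/* counts simulated idle_inject_wakeup() calls */
};

/*
 * Model of one timer expiry: wake the CPUs unless a registered update()
 * callback returns false, then return the length of the next period in
 * microseconds (run duration plus idle duration).
 */
static unsigned int ii_model_tick(struct ii_model *dev)
{
	assert(dev != NULL);

	if (!dev->update || dev->update())
		dev->wakeups++;

	return dev->run_duration_us + dev->idle_duration_us;
}

/*
 * Hypothetical feedback state. update() takes no arguments, so a caller
 * has to keep whatever the callback needs at file scope.
 */
static unsigned int observed_idle_pct;
static unsigned int target_idle_pct = 25;

/* Example callback: skip injection when the hardware already idles enough. */
static bool skip_when_over_target(void)
{
	return observed_idle_pct < target_idle_pct;
}
```

With no callback registered, every tick wakes the CPUs, which matches what
idle_inject_register() keeps doing by passing NULL to
idle_inject_register_full().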

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Authored by Srinivas Pandruvada and committed by Rafael J. Wysocki
acbc6610 bbfc3349

+49 -6
+46 -6
drivers/powercap/idle_inject.c
···
  * @idle_duration_us: duration of CPU idle time to inject
  * @run_duration_us: duration of CPU run time to allow
  * @latency_us: max allowed latency
+ * @update: Optional callback deciding whether or not to skip idle
+ *		injection in the given cycle.
  * @cpumask: mask of CPUs affected by idle injection
+ *
+ * This structure is used to define per instance idle inject device data. Each
+ * instance has an idle duration, a run duration and mask of CPUs to inject
+ * idle.
+ *
+ * Actual CPU idle time is injected by calling kernel scheduler interface
+ * play_idle_precise(). There is one optional callback that can be registered
+ * by calling idle_inject_register_full():
+ *
+ * update() - This callback is invoked just before waking up CPUs to inject
+ * idle. If it returns false, CPUs are not woken up to inject idle in the given
+ * cycle. It also allows the caller to readjust the idle and run duration by
+ * calling idle_inject_set_duration() for the next cycle.
  */
 struct idle_inject_device {
 	struct hrtimer timer;
 	unsigned int idle_duration_us;
 	unsigned int run_duration_us;
 	unsigned int latency_us;
+	bool (*update)(void);
 	unsigned long cpumask[];
 };

···
 	struct idle_inject_device *ii_dev =
 		container_of(timer, struct idle_inject_device, timer);

+	if (!ii_dev->update || (ii_dev->update && ii_dev->update()))
+		idle_inject_wakeup(ii_dev);
+
 	duration_us = READ_ONCE(ii_dev->run_duration_us);
 	duration_us += READ_ONCE(ii_dev->idle_duration_us);
-
-	idle_inject_wakeup(ii_dev);

 	hrtimer_forward_now(timer, ns_to_ktime(duration_us * NSEC_PER_USEC));

···
 }

 /**
- * idle_inject_register - initialize idle injection on a set of CPUs
+ * idle_inject_register_full - initialize idle injection on a set of CPUs
  * @cpumask: CPUs to be affected by idle injection
+ * @update: This callback is called just before waking up CPUs to inject
+ *	    idle
  *
  * This function creates an idle injection control device structure for the
- * given set of CPUs and initializes the timer associated with it. It does not
- * start any injection cycles.
+ * given set of CPUs and initializes the timer associated with it. This
+ * function also allows to register update()callback.
+ * It does not start any injection cycles.
  *
  * Return: NULL if memory allocation fails, idle injection control device
  * pointer on success.
  */
-struct idle_inject_device *idle_inject_register(struct cpumask *cpumask)
+
+struct idle_inject_device *idle_inject_register_full(struct cpumask *cpumask,
+						     bool (*update)(void))
 {
 	struct idle_inject_device *ii_dev;
 	int cpu, cpu_rb;
···
 	hrtimer_init(&ii_dev->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 	ii_dev->timer.function = idle_inject_timer_fn;
 	ii_dev->latency_us = UINT_MAX;
+	ii_dev->update = update;

 	for_each_cpu(cpu, to_cpumask(ii_dev->cpumask)) {

···
 	kfree(ii_dev);

 	return NULL;
+}
+EXPORT_SYMBOL_NS_GPL(idle_inject_register_full, IDLE_INJECT);
+
+/**
+ * idle_inject_register - initialize idle injection on a set of CPUs
+ * @cpumask: CPUs to be affected by idle injection
+ *
+ * This function creates an idle injection control device structure for the
+ * given set of CPUs and initializes the timer associated with it. It does not
+ * start any injection cycles.
+ *
+ * Return: NULL if memory allocation fails, idle injection control device
+ * pointer on success.
+ */
+struct idle_inject_device *idle_inject_register(struct cpumask *cpumask)
+{
+	return idle_inject_register_full(cpumask, NULL);
 }
 EXPORT_SYMBOL_NS_GPL(idle_inject_register, IDLE_INJECT);

+3
include/linux/idle_inject.h
···

 struct idle_inject_device *idle_inject_register(struct cpumask *cpumask);

+struct idle_inject_device *idle_inject_register_full(struct cpumask *cpumask,
+						     bool (*update)(void));
+
 void idle_inject_unregister(struct idle_inject_device *ii_dev);

 int idle_inject_start(struct idle_inject_device *ii_dev);