Linux kernel mirror (for testing): git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

sched: Ensure cpu_power periodic update

With a lot of small tasks, the scheduler softirq is almost never raised
when no_hz is enabled. In this case load_balance() is mainly called in
newly-idle mode, which does not update the cpu_power.

Add a next_update field which ensures a maximum update period even when
there are only short bursts of activity.

Stale cpu_power information can skew load-balancing decisions; the
guaranteed periodic update cures this.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1323717668-2143-1-git-send-email-vincent.guittot@linaro.org

Authored by Vincent Guittot, committed by Ingo Molnar
4ec4412e 39be3501

2 files changed, 17 insertions(+), 8 deletions(-)

include/linux/sched.h (+1)
@@ -905,6 +905,7 @@
 	 * single CPU.
 	 */
 	unsigned int power, power_orig;
+	unsigned long next_update;
 	/*
 	 * Number of busy cpus in this group.
 	 */
kernel/sched/fair.c (+16 -8)
@@ -215,6 +215,8 @@
 
 const struct sched_class fair_sched_class;
 
+static unsigned long __read_mostly max_load_balance_interval = HZ/10;
+
 /**************************************************************
  * CFS operations on generic schedulable entities:
  */
@@ -3778,6 +3776,11 @@
 	struct sched_domain *child = sd->child;
 	struct sched_group *group, *sdg = sd->groups;
 	unsigned long power;
+	unsigned long interval;
+
+	interval = msecs_to_jiffies(sd->balance_interval);
+	interval = clamp(interval, 1UL, max_load_balance_interval);
+	sdg->sgp->next_update = jiffies + interval;
 
 	if (!child) {
 		update_cpu_power(sd, cpu);
@@ -3890,12 +3883,15 @@
 	 * domains. In the newly idle case, we will allow all the cpu's
 	 * to do the newly idle load balance.
	 */
-	if (idle != CPU_NEWLY_IDLE && local_group) {
-		if (balance_cpu != this_cpu) {
-			*balance = 0;
-			return;
-		}
-		update_group_power(sd, this_cpu);
+	if (local_group) {
+		if (idle != CPU_NEWLY_IDLE) {
+			if (balance_cpu != this_cpu) {
+				*balance = 0;
+				return;
+			}
+			update_group_power(sd, this_cpu);
+		} else if (time_after_eq(jiffies, group->sgp->next_update))
+			update_group_power(sd, this_cpu);
 	}
 
 	/* Adjust by relative CPU power of the group */
@@ -4954,8 +4944,6 @@
 #endif
 
 static DEFINE_SPINLOCK(balancing);
-
-static unsigned long __read_mostly max_load_balance_interval = HZ/10;
 
 /*
  * Scale the max load_balance interval with the number of CPUs in the system.