Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

nohz: Fix update_ts_time_stat idle accounting

update_ts_time_stat currently updates the idle time even if we are in
the iowait loop at the moment. The only real users of the idle counter
(via get_cpu_idle_time_us) are the CPU governors, and they expect to get
the cumulative time for both idle and iowait.
The value (idle_sleeptime) is also printed to userspace by print_cpu,
but print_cpu prints both the idle and the iowait counters, so an idle
value that already includes iowait is misleading.

Let's clean this up and fix update_ts_time_stat to account both counters
properly and update consumers of idle to consider iowait time as well.
If we do this we might use get_cpu_{idle,iowait}_time_us from other
contexts as well and we will get expected values.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Dave Jones <davej@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Link: http://lkml.kernel.org/r/e9c909c221a8da402c4da07e4cd968c3218f8eb1.1314172057.git.mhocko@suse.cz
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Authored by Michal Hocko and committed by Thomas Gleixner
6beea0cd ef0e0f5e

+10 -6
+3 -1
drivers/cpufreq/cpufreq_conservative.c
···

 static inline cputime64_t get_cpu_idle_time(unsigned int cpu, cputime64_t *wall)
 {
-	u64 idle_time = get_cpu_idle_time_us(cpu, wall);
+	u64 idle_time = get_cpu_idle_time_us(cpu, NULL);

 	if (idle_time == -1ULL)
 		return get_cpu_idle_time_jiffy(cpu, wall);
+	else
+		idle_time += get_cpu_iowait_time_us(cpu, wall);

 	return idle_time;
 }
+3 -1
drivers/cpufreq/cpufreq_ondemand.c
···

 static inline cputime64_t get_cpu_idle_time(unsigned int cpu, cputime64_t *wall)
 {
-	u64 idle_time = get_cpu_idle_time_us(cpu, wall);
+	u64 idle_time = get_cpu_idle_time_us(cpu, NULL);

 	if (idle_time == -1ULL)
 		return get_cpu_idle_time_jiffy(cpu, wall);
+	else
+		idle_time += get_cpu_iowait_time_us(cpu, wall);

 	return idle_time;
 }
+4 -4
kernel/time/tick-sched.c
···

 	if (ts->idle_active) {
 		delta = ktime_sub(now, ts->idle_entrytime);
-		ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
 		if (nr_iowait_cpu(cpu) > 0)
 			ts->iowait_sleeptime = ktime_add(ts->iowait_sleeptime, delta);
+		else
+			ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
 		ts->idle_entrytime = now;
 	}

···
  * @last_update_time: variable to store update time in
  *
  * Return the cummulative idle time (since boot) for a given
- * CPU, in microseconds. The idle time returned includes
- * the iowait time (unlike what "top" and co report).
+ * CPU, in microseconds.
  *
  * This time is measured via accounting rather than sampling,
  * and is as accurate as ktime_get() is.
···
 }
 EXPORT_SYMBOL_GPL(get_cpu_idle_time_us);

-/*
+/**
  * get_cpu_iowait_time_us - get the total iowait time of a cpu
  * @cpu: CPU number to query
  * @last_update_time: variable to store update time in
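
The effect of the tick-sched.c hunk together with the governor hunks can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct ts_model`, `model_update_time_stat` and `model_governor_idle_time` are hypothetical names, plain microsecond integers stand in for `ktime_t`, and the `nr_iowait` argument stands in for the result of `nr_iowait_cpu(cpu)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant fields of struct tick_sched. */
struct ts_model {
	bool idle_active;
	uint64_t idle_entrytime_us;
	uint64_t idle_sleeptime_us;	/* pure idle only, as after the patch */
	uint64_t iowait_sleeptime_us;	/* idle while tasks wait on I/O */
};

/*
 * Models the patched branch in update_ts_time_stat: the elapsed delta
 * is credited to exactly one counter, never to both.
 */
static void model_update_time_stat(struct ts_model *ts, uint64_t now_us,
				   unsigned int nr_iowait)
{
	uint64_t delta;

	if (!ts->idle_active)
		return;

	assert(now_us >= ts->idle_entrytime_us);
	delta = now_us - ts->idle_entrytime_us;

	if (nr_iowait > 0)
		ts->iowait_sleeptime_us += delta;
	else
		ts->idle_sleeptime_us += delta;

	ts->idle_entrytime_us = now_us;
}

/*
 * Models what the cpufreq governors do after the patch: they add the
 * iowait counter back to obtain the combined value they expect.
 */
static uint64_t model_governor_idle_time(const struct ts_model *ts)
{
	return ts->idle_sleeptime_us + ts->iowait_sleeptime_us;
}
```

With this model, an idle period without pending I/O followed by a period with pending I/O leaves the two counters disjoint, while the governor-side sum still covers the whole interval, which is the behaviour the commit message describes.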