Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

cpufreq: Fix governor start/stop race condition

Cpufreq governors' stop and start operations should be carried out
in sequence. Otherwise, there will be unexpected behavior, like in
the example below.

Suppose there are 4 CPUs and policy->cpu=CPU0, CPU1/2/3 are linked
to CPU0. The normal sequence is:

1) Current governor is userspace. An application tries to set the
governor to ondemand. It will call __cpufreq_set_policy() in
which it will stop the userspace governor and then start the
ondemand governor.

2) Current governor is userspace. The online of CPU3 runs on CPU0.
It will call cpufreq_add_policy_cpu() in which it will first
stop the userspace governor, and then start it again.

If the sequence of the above two cases interleaves, it becomes:

1) Application stops userspace governor
2) Hotplug stops userspace governor

which is a problem, because the governor shouldn't be stopped twice
in a row. What happens next is:

3) Application starts ondemand governor
4) Hotplug starts a governor

In step 4, the hotplug is supposed to start the userspace governor,
but now the governor has been changed by the application to ondemand,
so the ondemand governor is started once again, which is incorrect.

The solution is to prevent policy governors from being stopped
multiple times in a row. A governor should only be stopped once for
one policy. After it has been stopped, no more governor stop
operations should be executed.

Also add a mutex to serialize governor operations.

[rjw: Changelog. And you owe me a beverage of my choice.]
Signed-off-by: Xiaoguang Chen <chenxg@marvell.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

authored by

Xiaoguang Chen and committed by
Rafael J. Wysocki
95731ebb d1922f02

+25
+24
drivers/cpufreq/cpufreq.c
··· 49 49 static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor); 50 50 #endif 51 51 static DEFINE_RWLOCK(cpufreq_driver_lock); 52 + static DEFINE_MUTEX(cpufreq_governor_lock); 52 53 53 54 /* 54 55 * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure ··· 1636 1635 1637 1636 pr_debug("__cpufreq_governor for CPU %u, event %u\n", 1638 1637 policy->cpu, event); 1638 + 1639 + mutex_lock(&cpufreq_governor_lock); 1640 + if ((!policy->governor_enabled && (event == CPUFREQ_GOV_STOP)) || 1641 + (policy->governor_enabled && (event == CPUFREQ_GOV_START))) { 1642 + mutex_unlock(&cpufreq_governor_lock); 1643 + return -EBUSY; 1644 + } 1645 + 1646 + if (event == CPUFREQ_GOV_STOP) 1647 + policy->governor_enabled = false; 1648 + else if (event == CPUFREQ_GOV_START) 1649 + policy->governor_enabled = true; 1650 + 1651 + mutex_unlock(&cpufreq_governor_lock); 1652 + 1639 1653 ret = policy->governor->governor(policy, event); 1640 1654 1641 1655 if (!ret) { ··· 1658 1642 policy->governor->initialized++; 1659 1643 else if (event == CPUFREQ_GOV_POLICY_EXIT) 1660 1644 policy->governor->initialized--; 1645 + } else { 1646 + /* Restore original values */ 1647 + mutex_lock(&cpufreq_governor_lock); 1648 + if (event == CPUFREQ_GOV_STOP) 1649 + policy->governor_enabled = true; 1650 + else if (event == CPUFREQ_GOV_START) 1651 + policy->governor_enabled = false; 1652 + mutex_unlock(&cpufreq_governor_lock); 1661 1653 } 1662 1654 1663 1655 /* we keep one module reference alive for
+1
include/linux/cpufreq.h
··· 111 111 unsigned int policy; /* see above */ 112 112 struct cpufreq_governor *governor; /* see below */ 113 113 void *governor_data; 114 + bool governor_enabled; /* governor start/stop flag */ 114 115 115 116 struct work_struct update; /* if update_policy() needs to be 116 117 * called, but you're in IRQ context */