
Merge tag 'pm+acpi-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
"Fixes for some issues discovered after recent changes and for some
that were only found recently and are unrelated to those changes
(intel_pstate, intel_idle, PM core, mailbox/pcc, turbostat), plus
support for some new CPU models (intel_idle, Intel RAPL driver,
turbostat) and documentation updates (intel_pstate, PM core).

Specifics:

- intel_pstate fixes for two issues exposed by the recent switch-over
from using timers, and for one issue introduced during the 4.4 cycle,
plus new comments describing the data structures used by the driver
(Rafael Wysocki, Srinivas Pandruvada).

- intel_idle fixes related to CPU offline/online (Richard Cochran).

- intel_idle support (new CPU IDs and state definitions mostly) for
Skylake-X and Kabylake processors (Len Brown).

- PCC mailbox driver fix for an out-of-bounds memory access that may
cause the kernel to panic() (Shanker Donthineni).

- New (missing) CPU ID for one apparently overlooked Haswell model in
the Intel RAPL power capping driver (Srinivas Pandruvada).

- Fix for the PM core's wakeup IRQs framework to make it work after
wakeup settings reconfiguration from sysfs (Grygorii Strashko).

- Runtime PM documentation update to make it describe what needs to
be done during device removal more precisely (Krzysztof Kozlowski).

- Stale comment removal cleanup in the cpufreq-dt driver (Viresh
Kumar).

- turbostat utility fixes and support for Broxton, Skylake-X and
Kabylake processors (Len Brown)"

* tag 'pm+acpi-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (28 commits)
PM / wakeirq: fix wakeirq setting after wakup re-configuration from sysfs
tools/power turbostat: work around RC6 counter wrap
tools/power turbostat: initial KBL support
tools/power turbostat: initial SKX support
tools/power turbostat: decode BXT TSC frequency via CPUID
tools/power turbostat: initial BXT support
tools/power turbostat: print IRTL MSRs
tools/power turbostat: SGX state should print only if --debug
intel_idle: Add KBL support
intel_idle: Add SKX support
intel_idle: Clean up all registered devices on exit.
intel_idle: Propagate hot plug errors.
intel_idle: Don't overreact to a cpuidle registration failure.
intel_idle: Setup the timer broadcast only on successful driver load.
intel_idle: Avoid a double free of the per-CPU data.
intel_idle: Fix dangling registration on error path.
intel_idle: Fix deallocation order on the driver exit path.
intel_idle: Remove redundant initialization calls.
intel_idle: Fix a helper function's return value.
intel_idle: remove useless return from void function.
...

+380 -62
+4
Documentation/power/runtime_pm.txt
···
 but also it allows of more flexibility in the handling of devices during the
 removal of their drivers.
 
+Drivers in ->remove() callback should undo the runtime PM changes done
+in ->probe(). Usually this means calling pm_runtime_disable(),
+pm_runtime_dont_use_autosuspend() etc.
+
 The user space can effectively disallow the driver of the device to power manage
 it at run time by changing the value of its /sys/devices/.../power/control
 attribute to "on", which causes pm_runtime_forbid() to be called. In principle,
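The runtime_pm.txt paragraph above is a how-to: whatever runtime-PM setup ->probe() did must be undone in ->remove(). As a standalone illustration of that pairing, here is a minimal userspace sketch; the counters and the `hypothetical_probe`/`hypothetical_remove` names are inventions for this example, and the real kernel helpers take a `struct device *` argument.

```c
#include <assert.h>

/* Stub counters standing in for runtime-PM state; the real helpers in
 * <linux/pm_runtime.h> operate on a struct device *. */
static int rpm_enabled;
static int autosuspend_users;

static void pm_runtime_enable(void)               { rpm_enabled++; }
static void pm_runtime_disable(void)              { rpm_enabled--; }
static void pm_runtime_use_autosuspend(void)      { autosuspend_users++; }
static void pm_runtime_dont_use_autosuspend(void) { autosuspend_users--; }

/* ->probe() sets up runtime PM for the device ... */
static int hypothetical_probe(void)
{
	pm_runtime_use_autosuspend();
	pm_runtime_enable();
	return 0;
}

/* ... and, per the documentation hunk above, ->remove() undoes exactly
 * those steps, in reverse order. */
static void hypothetical_remove(void)
{
	pm_runtime_disable();
	pm_runtime_dont_use_autosuspend();
}
```

After a probe/remove cycle every setup call is balanced by its teardown counterpart, which is the invariant the new paragraph asks driver authors to maintain.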
+8
arch/x86/include/asm/msr-index.h
···
 #define MSR_PKG_C9_RESIDENCY		0x00000631
 #define MSR_PKG_C10_RESIDENCY		0x00000632
 
+/* Interrupt Response Limit */
+#define MSR_PKGC3_IRTL			0x0000060a
+#define MSR_PKGC6_IRTL			0x0000060b
+#define MSR_PKGC7_IRTL			0x0000060c
+#define MSR_PKGC8_IRTL			0x00000633
+#define MSR_PKGC9_IRTL			0x00000634
+#define MSR_PKGC10_IRTL			0x00000635
+
 /* Run Time Average Power Limiting (RAPL) Interface */
 
 #define MSR_RAPL_POWER_UNIT		0x00000606
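The six IRTL MSRs defined above share one layout, which turbostat's new print_irtl() (added later in this series) decodes: bits 9:0 hold the time value, a unit field starting at bit 10 selects a multiplier, and bit 15 is a valid flag. A standalone sketch of that decoding, mirroring turbostat's unit table and masks (the helper names are invented):

```c
#include <assert.h>
#include <stdint.h>

/* Time-unit multipliers and field masks as used by turbostat's
 * print_irtl(): bits 9:0 = value, bits at 10 select the unit
 * (turbostat masks two bits), bit 15 = valid. */
static const uint64_t irtl_time_units[] = {
	1, 32, 1024, 32768, 1048576, 33554432, 0, 0
};

static int irtl_valid(uint64_t msr)
{
	return !!(msr & (1ULL << 15));
}

/* interrupt response time limit in nanoseconds */
static uint64_t irtl_ns(uint64_t msr)
{
	return (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3];
}
```

For example, a raw value with bit 15 set, unit field 2 and value 100 decodes to 100 * 1024 = 102400 ns.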
+2
drivers/base/power/wakeup.c
···
 		return -EEXIST;
 	}
 	dev->power.wakeup = ws;
+	if (dev->power.wakeirq)
+		device_wakeup_attach_irq(dev, dev->power.wakeirq);
 	spin_unlock_irq(&dev->power.lock);
 	return 0;
 }
-3
drivers/cpufreq/cpufreq-dt.c
···
  * Copyright (C) 2014 Linaro.
  * Viresh Kumar <viresh.kumar@linaro.org>
  *
- * The OPP code in function set_target() is reused from
- * drivers/cpufreq/omap-cpufreq.c
- *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
+189 -17
drivers/cpufreq/intel_pstate.c
···
 	return ret;
 }
 
+/**
+ * struct sample -	Store performance sample
+ * @core_pct_busy:	Ratio of APERF/MPERF in percent, which is actual
+ *			performance during last sample period
+ * @busy_scaled:	Scaled busy value which is used to calculate next
+ *			P state. This can be different than core_pct_busy
+ *			to account for cpu idle period
+ * @aperf:		Difference of actual performance frequency clock count
+ *			read from APERF MSR between last and current sample
+ * @mperf:		Difference of maximum performance frequency clock count
+ *			read from MPERF MSR between last and current sample
+ * @tsc:		Difference of time stamp counter between last and
+ *			current sample
+ * @freq:		Effective frequency calculated from APERF/MPERF
+ * @time:		Current time from scheduler
+ *
+ * This structure is used in the cpudata structure to store performance sample
+ * data for choosing next P State.
+ */
 struct sample {
 	int32_t core_pct_busy;
 	int32_t busy_scaled;
···
 	u64 time;
 };
 
+/**
+ * struct pstate_data - Store P state data
+ * @current_pstate:	Current requested P state
+ * @min_pstate:		Min P state possible for this platform
+ * @max_pstate:		Max P state possible for this platform
+ * @max_pstate_physical:This is physical Max P state for a processor
+ *			This can be higher than the max_pstate which can
+ *			be limited by platform thermal design power limits
+ * @scaling:		Scaling factor to convert frequency to cpufreq
+ *			frequency units
+ * @turbo_pstate:	Max Turbo P state possible for this platform
+ *
+ * Stores the per cpu model P state limits and current P state.
+ */
 struct pstate_data {
 	int current_pstate;
 	int min_pstate;
···
 	int turbo_pstate;
 };
 
+/**
+ * struct vid_data -	Stores voltage information data
+ * @min:		VID data for this platform corresponding to
+ *			the lowest P state
+ * @max:		VID data corresponding to the highest P State.
+ * @turbo:		VID data for turbo P state
+ * @ratio:		Ratio of (vid max - vid min) /
+ *			(max P state - Min P State)
+ *
+ * Stores the voltage data for DVFS (Dynamic Voltage and Frequency Scaling)
+ * This data is used in Atom platforms, where in addition to target P state,
+ * the voltage data needs to be specified to select next P State.
+ */
 struct vid_data {
 	int min;
 	int max;
···
 	int32_t ratio;
 };
 
+/**
+ * struct _pid -	Stores PID data
+ * @setpoint:		Target set point for busyness or performance
+ * @integral:		Storage for accumulated error values
+ * @p_gain:		PID proportional gain
+ * @i_gain:		PID integral gain
+ * @d_gain:		PID derivative gain
+ * @deadband:		PID deadband
+ * @last_err:		Last error storage for integral part of PID calculation
+ *
+ * Stores PID coefficients and last error for PID controller.
+ */
 struct _pid {
 	int setpoint;
 	int32_t integral;
···
 	int32_t last_err;
 };
 
+/**
+ * struct cpudata -	Per CPU instance data storage
+ * @cpu:		CPU number for this instance data
+ * @update_util:	CPUFreq utility callback information
+ * @pstate:		Stores P state limits for this CPU
+ * @vid:		Stores VID limits for this CPU
+ * @pid:		Stores PID parameters for this CPU
+ * @last_sample_time:	Last Sample time
+ * @prev_aperf:		Last APERF value read from APERF MSR
+ * @prev_mperf:		Last MPERF value read from MPERF MSR
+ * @prev_tsc:		Last timestamp counter (TSC) value
+ * @prev_cummulative_iowait: IO Wait time difference from last and
+ *			current sample
+ * @sample:		Storage for storing last Sample data
+ *
+ * This structure stores per CPU instance data for all CPUs.
+ */
 struct cpudata {
 	int cpu;
 
···
 };
 
 static struct cpudata **all_cpu_data;
+
+/**
+ * struct pid_adjust_policy - Stores static PID configuration data
+ * @sample_rate_ms:	PID calculation sample rate in ms
+ * @sample_rate_ns:	Sample rate calculation in ns
+ * @deadband:		PID deadband
+ * @setpoint:		PID Setpoint
+ * @p_gain_pct:		PID proportional gain
+ * @i_gain_pct:		PID integral gain
+ * @d_gain_pct:		PID derivative gain
+ *
+ * Stores per CPU model static PID configuration data.
+ */
 struct pstate_adjust_policy {
 	int sample_rate_ms;
 	s64 sample_rate_ns;
···
 	int i_gain_pct;
 };
 
+/**
+ * struct pstate_funcs - Per CPU model specific callbacks
+ * @get_max:		Callback to get maximum non turbo effective P state
+ * @get_max_physical:	Callback to get maximum non turbo physical P state
+ * @get_min:		Callback to get minimum P state
+ * @get_turbo:		Callback to get turbo P state
+ * @get_scaling:	Callback to get frequency scaling factor
+ * @get_val:		Callback to convert P state to actual MSR write value
+ * @get_vid:		Callback to get VID data for Atom platforms
+ * @get_target_pstate:	Callback to a function to calculate next P state to use
+ *
+ * Core and Atom CPU models have different way to get P State limits. This
+ * structure is used to store those callbacks.
+ */
 struct pstate_funcs {
 	int (*get_max)(void);
 	int (*get_max_physical)(void);
···
 	int32_t (*get_target_pstate)(struct cpudata *);
 };
 
+/**
+ * struct cpu_defaults- Per CPU model default config data
+ * @pid_policy:	PID config data
+ * @funcs:		Callback function data
+ */
 struct cpu_defaults {
 	struct pstate_adjust_policy pid_policy;
 	struct pstate_funcs funcs;
···
 static struct pstate_funcs pstate_funcs;
 static int hwp_active;
 
+
+/**
+ * struct perf_limits - Store user and policy limits
+ * @no_turbo:		User requested turbo state from intel_pstate sysfs
+ * @turbo_disabled:	Platform turbo status either from msr
+ *			MSR_IA32_MISC_ENABLE or when maximum available pstate
+ *			matches the maximum turbo pstate
+ * @max_perf_pct:	Effective maximum performance limit in percentage, this
+ *			is minimum of either limits enforced by cpufreq policy
+ *			or limits from user set limits via intel_pstate sysfs
+ * @min_perf_pct:	Effective minimum performance limit in percentage, this
+ *			is maximum of either limits enforced by cpufreq policy
+ *			or limits from user set limits via intel_pstate sysfs
+ * @max_perf:		This is a scaled value between 0 to 255 for max_perf_pct
+ *			This value is used to limit max pstate
+ * @min_perf:		This is a scaled value between 0 to 255 for min_perf_pct
+ *			This value is used to limit min pstate
+ * @max_policy_pct:	The maximum performance in percentage enforced by
+ *			cpufreq setpolicy interface
+ * @max_sysfs_pct:	The maximum performance in percentage enforced by
+ *			intel pstate sysfs interface
+ * @min_policy_pct:	The minimum performance in percentage enforced by
+ *			cpufreq setpolicy interface
+ * @min_sysfs_pct:	The minimum performance in percentage enforced by
+ *			intel pstate sysfs interface
+ *
+ * Storage for user and policy defined limits.
+ */
 struct perf_limits {
 	int no_turbo;
 	int turbo_disabled;
···
 	cpu->prev_aperf = aperf;
 	cpu->prev_mperf = mperf;
 	cpu->prev_tsc = tsc;
-	return true;
+	/*
+	 * First time this function is invoked in a given cycle, all of the
+	 * previous sample data fields are equal to zero or stale and they must
+	 * be populated with meaningful numbers for things to work, so assume
+	 * that sample.time will always be reset before setting the utilization
+	 * update hook and make the caller skip the sample then.
+	 */
+	return !!cpu->last_sample_time;
 }
 
 static inline int32_t get_avg_frequency(struct cpudata *cpu)
···
 	 * enough period of time to adjust our busyness.
 	 */
 	duration_ns = cpu->sample.time - cpu->last_sample_time;
-	if ((s64)duration_ns > pid_params.sample_rate_ns * 3
-	    && cpu->last_sample_time > 0) {
+	if ((s64)duration_ns > pid_params.sample_rate_ns * 3) {
 		sample_ratio = div_fp(int_tofp(pid_params.sample_rate_ns),
 				      int_tofp(duration_ns));
 		core_busy = mul_fp(core_busy, sample_ratio);
···
 	intel_pstate_get_cpu_pstates(cpu);
 
 	intel_pstate_busy_pid_reset(cpu);
-	intel_pstate_sample(cpu, 0);
 
 	cpu->update_util.func = intel_pstate_update_util;
-	cpufreq_set_update_util_data(cpunum, &cpu->update_util);
 
 	pr_debug("intel_pstate: controlling: cpu %d\n", cpunum);
 
···
 	return get_avg_frequency(cpu);
 }
 
+static void intel_pstate_set_update_util_hook(unsigned int cpu_num)
+{
+	struct cpudata *cpu = all_cpu_data[cpu_num];
+
+	/* Prevent intel_pstate_update_util() from using stale data. */
+	cpu->sample.time = 0;
+	cpufreq_set_update_util_data(cpu_num, &cpu->update_util);
+}
+
+static void intel_pstate_clear_update_util_hook(unsigned int cpu)
+{
+	cpufreq_set_update_util_data(cpu, NULL);
+	synchronize_sched();
+}
+
+static void intel_pstate_set_performance_limits(struct perf_limits *limits)
+{
+	limits->no_turbo = 0;
+	limits->turbo_disabled = 0;
+	limits->max_perf_pct = 100;
+	limits->max_perf = int_tofp(1);
+	limits->min_perf_pct = 100;
+	limits->min_perf = int_tofp(1);
+	limits->max_policy_pct = 100;
+	limits->max_sysfs_pct = 100;
+	limits->min_policy_pct = 0;
+	limits->min_sysfs_pct = 0;
+}
+
 static int intel_pstate_set_policy(struct cpufreq_policy *policy)
 {
 	if (!policy->cpuinfo.max_freq)
 		return -ENODEV;
 
-	if (policy->policy == CPUFREQ_POLICY_PERFORMANCE &&
-	    policy->max >= policy->cpuinfo.max_freq) {
-		pr_debug("intel_pstate: set performance\n");
+	intel_pstate_clear_update_util_hook(policy->cpu);
+
+	if (policy->policy == CPUFREQ_POLICY_PERFORMANCE) {
 		limits = &performance_limits;
-		if (hwp_active)
-			intel_pstate_hwp_set(policy->cpus);
-		return 0;
+		if (policy->max >= policy->cpuinfo.max_freq) {
+			pr_debug("intel_pstate: set performance\n");
+			intel_pstate_set_performance_limits(limits);
+			goto out;
+		}
+	} else {
+		pr_debug("intel_pstate: set powersave\n");
+		limits = &powersave_limits;
 	}
 
-	pr_debug("intel_pstate: set powersave\n");
-	limits = &powersave_limits;
 	limits->min_policy_pct = (policy->min * 100) / policy->cpuinfo.max_freq;
 	limits->min_policy_pct = clamp_t(int, limits->min_policy_pct, 0 , 100);
 	limits->max_policy_pct = DIV_ROUND_UP(policy->max * 100,
···
 	limits->max_perf = div_fp(int_tofp(limits->max_perf_pct),
 				  int_tofp(100));
 
+ out:
+	intel_pstate_set_update_util_hook(policy->cpu);
+
 	if (hwp_active)
 		intel_pstate_hwp_set(policy->cpus);
 
···
 
 	pr_debug("intel_pstate: CPU %d exiting\n", cpu_num);
 
-	cpufreq_set_update_util_data(cpu_num, NULL);
-	synchronize_sched();
+	intel_pstate_clear_update_util_hook(cpu_num);
 
 	if (hwp_active)
 		return;
···
 	get_online_cpus();
 	for_each_online_cpu(cpu) {
 		if (all_cpu_data[cpu]) {
-			cpufreq_set_update_util_data(cpu, NULL);
-			synchronize_sched();
+			intel_pstate_clear_update_util_hook(cpu);
 			kfree(all_cpu_data[cpu]);
 		}
 	}
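One subtle piece of the intel_pstate changes above is the first-sample guard: intel_pstate_set_update_util_hook() zeroes sample.time, and intel_pstate_sample() now returns `!!cpu->last_sample_time` so the first sample after the hook is (re)set is discarded instead of being computed against zero or stale counters. A toy model of just that guard (all names here are invented for the example; the real driver updates last_sample_time elsewhere):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the first-sample guard: the first sample taken after the
 * hook is (re)set finds last_sample_time == 0 and is discarded, so no
 * deltas are ever computed against zero/stale counters. */
struct toy_cpu {
	uint64_t last_sample_time;
	uint64_t sample_time;
};

static bool toy_sample(struct toy_cpu *cpu, uint64_t now)
{
	cpu->last_sample_time = cpu->sample_time;
	cpu->sample_time = now;
	/* mirrors "return !!cpu->last_sample_time;" */
	return cpu->last_sample_time != 0;
}

static void toy_set_hook(struct toy_cpu *cpu)
{
	/* mirrors intel_pstate_set_update_util_hook(): reset the time so
	 * the next sample is skipped */
	cpu->sample_time = 0;
}
```

Every time the hook is re-armed (e.g. from set_policy), the next sample reports itself invalid and only subsequent samples feed the controller.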
+67 -30
drivers/idle/intel_idle.c
···
 		.enter = NULL }
 };
 
+static struct cpuidle_state skx_cstates[] = {
+	{
+		.name = "C1-SKX",
+		.desc = "MWAIT 0x00",
+		.flags = MWAIT2flg(0x00),
+		.exit_latency = 2,
+		.target_residency = 2,
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
+	{
+		.name = "C1E-SKX",
+		.desc = "MWAIT 0x01",
+		.flags = MWAIT2flg(0x01),
+		.exit_latency = 10,
+		.target_residency = 20,
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
+	{
+		.name = "C6-SKX",
+		.desc = "MWAIT 0x20",
+		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+		.exit_latency = 133,
+		.target_residency = 600,
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
+	{
+		.enter = NULL }
+};
+
 static struct cpuidle_state atom_cstates[] = {
 	{
 		.name = "C1E-ATM",
···
 	 * driver in this case
 	 */
 	dev = per_cpu_ptr(intel_idle_cpuidle_devices, hotcpu);
-	if (!dev->registered)
-		intel_idle_cpu_init(hotcpu);
+	if (dev->registered)
+		break;
+
+	if (intel_idle_cpu_init(hotcpu))
+		return NOTIFY_BAD;
 
 	break;
 }
···
 	.disable_promotion_to_c1e = true,
 };
 
+static const struct idle_cpu idle_cpu_skx = {
+	.state_table = skx_cstates,
+	.disable_promotion_to_c1e = true,
+};
 
 static const struct idle_cpu idle_cpu_avn = {
 	.state_table = avn_cstates,
···
 	ICPU(0x56, idle_cpu_bdw),
 	ICPU(0x4e, idle_cpu_skl),
 	ICPU(0x5e, idle_cpu_skl),
+	ICPU(0x8e, idle_cpu_skl),
+	ICPU(0x9e, idle_cpu_skl),
+	ICPU(0x55, idle_cpu_skx),
 	ICPU(0x57, idle_cpu_knl),
 	{}
 };
···
 	icpu = (const struct idle_cpu *)id->driver_data;
 	cpuidle_state_table = icpu->state_table;
 
-	if (boot_cpu_has(X86_FEATURE_ARAT))	/* Always Reliable APIC Timer */
-		lapic_timer_reliable_states = LAPIC_TIMER_ALWAYS_RELIABLE;
-	else
-		on_each_cpu(__setup_broadcast_timer, (void *)true, 1);
-
 	pr_debug(PREFIX "v" INTEL_IDLE_VERSION
 		" model 0x%X\n", boot_cpu_data.x86_model);
 
-	pr_debug(PREFIX "lapic_timer_reliable_states 0x%x\n",
-		lapic_timer_reliable_states);
 	return 0;
 }
 
 /*
  * intel_idle_cpuidle_devices_uninit()
- * unregister, free cpuidle_devices
+ * Unregisters the cpuidle devices.
  */
 static void intel_idle_cpuidle_devices_uninit(void)
 {
···
 		dev = per_cpu_ptr(intel_idle_cpuidle_devices, i);
 		cpuidle_unregister_device(dev);
 	}
-
-	free_percpu(intel_idle_cpuidle_devices);
-	return;
 }
 
 /*
···
  * intel_idle_cpuidle_driver_init()
  * allocate, initialize cpuidle_states
  */
-static int __init intel_idle_cpuidle_driver_init(void)
+static void __init intel_idle_cpuidle_driver_init(void)
 {
 	int cstate;
 	struct cpuidle_driver *drv = &intel_idle_driver;
···
 		drv->state_count += 1;
 	}
 
-	if (icpu->auto_demotion_disable_flags)
-		on_each_cpu(auto_demotion_disable, NULL, 1);
-
 	if (icpu->byt_auto_demotion_disable_flag) {
 		wrmsrl(MSR_CC6_DEMOTION_POLICY_CONFIG, 0);
 		wrmsrl(MSR_MC6_DEMOTION_POLICY_CONFIG, 0);
 	}
-
-	if (icpu->disable_promotion_to_c1e)	/* each-cpu is redundant */
-		on_each_cpu(c1e_promotion_disable, NULL, 1);
-
-	return 0;
 }
 
···
 
 	if (cpuidle_register_device(dev)) {
 		pr_debug(PREFIX "cpuidle_register_device %d failed!\n", cpu);
-		intel_idle_cpuidle_devices_uninit();
 		return -EIO;
 	}
 
···
 	if (retval)
 		return retval;
 
+	intel_idle_cpuidle_devices = alloc_percpu(struct cpuidle_device);
+	if (intel_idle_cpuidle_devices == NULL)
+		return -ENOMEM;
+
 	intel_idle_cpuidle_driver_init();
 	retval = cpuidle_register_driver(&intel_idle_driver);
 	if (retval) {
 		struct cpuidle_driver *drv = cpuidle_get_driver();
 		printk(KERN_DEBUG PREFIX "intel_idle yielding to %s",
 			drv ? drv->name : "none");
+		free_percpu(intel_idle_cpuidle_devices);
 		return retval;
 	}
-
-	intel_idle_cpuidle_devices = alloc_percpu(struct cpuidle_device);
-	if (intel_idle_cpuidle_devices == NULL)
-		return -ENOMEM;
 
 	cpu_notifier_register_begin();
 
 	for_each_online_cpu(i) {
 		retval = intel_idle_cpu_init(i);
 		if (retval) {
+			intel_idle_cpuidle_devices_uninit();
 			cpu_notifier_register_done();
 			cpuidle_unregister_driver(&intel_idle_driver);
+			free_percpu(intel_idle_cpuidle_devices);
 			return retval;
 		}
 	}
 	__register_cpu_notifier(&cpu_hotplug_notifier);
 
+	if (boot_cpu_has(X86_FEATURE_ARAT))	/* Always Reliable APIC Timer */
+		lapic_timer_reliable_states = LAPIC_TIMER_ALWAYS_RELIABLE;
+	else
+		on_each_cpu(__setup_broadcast_timer, (void *)true, 1);
+
 	cpu_notifier_register_done();
+
+	pr_debug(PREFIX "lapic_timer_reliable_states 0x%x\n",
+		lapic_timer_reliable_states);
 
 	return 0;
 }
 
 static void __exit intel_idle_exit(void)
 {
-	intel_idle_cpuidle_devices_uninit();
-	cpuidle_unregister_driver(&intel_idle_driver);
+	struct cpuidle_device *dev;
+	int i;
 
 	cpu_notifier_register_begin();
 
···
 		on_each_cpu(__setup_broadcast_timer, (void *)false, 1);
 	__unregister_cpu_notifier(&cpu_hotplug_notifier);
 
+	for_each_possible_cpu(i) {
+		dev = per_cpu_ptr(intel_idle_cpuidle_devices, i);
+		cpuidle_unregister_device(dev);
+	}
+
 	cpu_notifier_register_done();
 
-	return;
+	cpuidle_unregister_driver(&intel_idle_driver);
+	free_percpu(intel_idle_cpuidle_devices);
 }
 
 module_init(intel_idle_init);
+2 -2
drivers/mailbox/pcc.c
···
 		struct acpi_generic_address *db_reg;
 		struct acpi_pcct_hw_reduced *pcct_ss;
 		pcc_mbox_channels[i].con_priv = pcct_entry;
-		pcct_entry = (struct acpi_subtable_header *)
-			((unsigned long) pcct_entry + pcct_entry->length);
 
 		/* If doorbell is in system memory cache the virt address */
 		pcct_ss = (struct acpi_pcct_hw_reduced *)pcct_entry;
···
 		if (db_reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
 			pcc_doorbell_vaddr[i] = acpi_os_ioremap(db_reg->address,
 							db_reg->bit_width/8);
+		pcct_entry = (struct acpi_subtable_header *)
+			((unsigned long) pcct_entry + pcct_entry->length);
 	}
 
 	pcc_mbox_ctrl.num_chans = count;
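The pcc.c fix above is an iteration-order bug: the cursor over the variable-length PCCT subtables was advanced before the current entry's fields were read, so iteration i actually consumed subtable i+1 and could walk past the end of the table. A minimal model of the corrected walk (the 2-byte header here is a stand-in invented for the example, not the real ACPI structs):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy subtable header: each entry carries its own total length, as the
 * ACPI PCCT subtables do. */
struct subtable_hdr {
	uint8_t type;
	uint8_t length;	/* total length of this subtable, header included */
};

static int count_subtables(const uint8_t *buf, size_t total_len)
{
	const uint8_t *p = buf, *end = buf + total_len;
	int n = 0;

	while (p + sizeof(struct subtable_hdr) <= end) {
		const struct subtable_hdr *st =
			(const struct subtable_hdr *)p;

		if (st->length < sizeof(struct subtable_hdr))
			break;		/* malformed entry, stop */
		n++;			/* consume the current entry first... */
		p += st->length;	/* ...then step over it, as the fix does */
	}
	return n;
}
```

Advancing the pointer only after the entry has been used is exactly the reordering the two moved lines in the hunk perform.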
+1
drivers/powercap/intel_rapl.c
···
 	RAPL_CPU(0x3f, rapl_defaults_hsw_server),/* Haswell servers */
 	RAPL_CPU(0x4f, rapl_defaults_hsw_server),/* Broadwell servers */
 	RAPL_CPU(0x45, rapl_defaults_core),/* Haswell ULT */
+	RAPL_CPU(0x46, rapl_defaults_core),/* Haswell */
 	RAPL_CPU(0x47, rapl_defaults_core),/* Broadwell-H */
 	RAPL_CPU(0x4E, rapl_defaults_core),/* Skylake */
 	RAPL_CPU(0x4C, rapl_defaults_cht),/* Braswell/Cherryview */
+107 -10
tools/power/x86/turbostat/turbostat.c
···
 unsigned int use_c1_residency_msr;
 unsigned int has_aperf;
 unsigned int has_epb;
+unsigned int do_irtl_snb;
+unsigned int do_irtl_hsw;
 unsigned int units = 1000000;	/* MHz etc */
 unsigned int genuine_intel;
 unsigned int has_invariant_tsc;
···
 	unsigned long long pkg_any_core_c0;
 	unsigned long long pkg_any_gfxe_c0;
 	unsigned long long pkg_both_core_gfxe_c0;
-	unsigned long long gfx_rc6_ms;
+	long long gfx_rc6_ms;
 	unsigned int gfx_mhz;
 	unsigned int package_id;
 	unsigned int energy_pkg;	/* MSR_PKG_ENERGY_STATUS */
···
 		outp += sprintf(outp, "%8d", p->pkg_temp_c);
 
 	/* GFXrc6 */
-	if (do_gfx_rc6_ms)
-		outp += sprintf(outp, "%8.2f", 100.0 * p->gfx_rc6_ms / 1000.0 / interval_float);
+	if (do_gfx_rc6_ms) {
+		if (p->gfx_rc6_ms == -1) {	/* detect counter reset */
+			outp += sprintf(outp, "  ***.**");
+		} else {
+			outp += sprintf(outp, "%8.2f",
+				p->gfx_rc6_ms / 10.0 / interval_float);
+		}
+	}
 
 	/* GFXMHz */
 	if (do_gfx_mhz)
···
 	old->pc10 = new->pc10 - old->pc10;
 	old->pkg_temp_c = new->pkg_temp_c;
 
-	old->gfx_rc6_ms = new->gfx_rc6_ms - old->gfx_rc6_ms;
+	/* flag an error when rc6 counter resets/wraps */
+	if (old->gfx_rc6_ms > new->gfx_rc6_ms)
+		old->gfx_rc6_ms = -1;
+	else
+		old->gfx_rc6_ms = new->gfx_rc6_ms - old->gfx_rc6_ms;
+
 	old->gfx_mhz = new->gfx_mhz;
 
 	DELTA_WRAP32(new->energy_pkg, old->energy_pkg);
···
 int slv_pkg_cstate_limits[16] = {PCL__0, PCL__1, PCLRSV, PCLRSV, PCL__4, PCLRSV, PCL__6, PCL__7, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV};
 int amt_pkg_cstate_limits[16] = {PCL__0, PCL__1, PCL__2, PCLRSV, PCLRSV, PCLRSV, PCL__6, PCL__7, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV};
 int phi_pkg_cstate_limits[16] = {PCL__0, PCL__2, PCL_6N, PCL_6R, PCLRSV, PCLRSV, PCLRSV, PCLUNL, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV};
+int bxt_pkg_cstate_limits[16] = {PCL__0, PCL__2, PCLUNL, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV};
 
 
 static void
···
 	fprintf(outf, "MAX_NON_TURBO_RATIO=%d", (unsigned int)(msr) & 0xFF);
 	fprintf(outf, " lock=%d", (unsigned int)(msr >> 31) & 1);
 	fprintf(outf, ")\n");
+}
+
+unsigned int irtl_time_units[] = {1, 32, 1024, 32768, 1048576, 33554432, 0, 0 };
+
+void print_irtl(void)
+{
+	unsigned long long msr;
+
+	get_msr(base_cpu, MSR_PKGC3_IRTL, &msr);
+	fprintf(outf, "cpu%d: MSR_PKGC3_IRTL: 0x%08llx (", base_cpu, msr);
+	fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
+		(msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+
+	get_msr(base_cpu, MSR_PKGC6_IRTL, &msr);
+	fprintf(outf, "cpu%d: MSR_PKGC6_IRTL: 0x%08llx (", base_cpu, msr);
+	fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
+		(msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+
+	get_msr(base_cpu, MSR_PKGC7_IRTL, &msr);
+	fprintf(outf, "cpu%d: MSR_PKGC7_IRTL: 0x%08llx (", base_cpu, msr);
+	fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
+		(msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+
+	if (!do_irtl_hsw)
+		return;
+
+	get_msr(base_cpu, MSR_PKGC8_IRTL, &msr);
+	fprintf(outf, "cpu%d: MSR_PKGC8_IRTL: 0x%08llx (", base_cpu, msr);
+	fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
+		(msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+
+	get_msr(base_cpu, MSR_PKGC9_IRTL, &msr);
+	fprintf(outf, "cpu%d: MSR_PKGC9_IRTL: 0x%08llx (", base_cpu, msr);
+	fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
+		(msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+
+	get_msr(base_cpu, MSR_PKGC10_IRTL, &msr);
+	fprintf(outf, "cpu%d: MSR_PKGC10_IRTL: 0x%08llx (", base_cpu, msr);
+	fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
+		(msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+
 }
 void free_fd_percpu(void)
 {
···
 	case 0x56:	/* BDX-DE */
 	case 0x4E:	/* SKL */
 	case 0x5E:	/* SKL */
+	case 0x8E:	/* KBL */
+	case 0x9E:	/* KBL */
+	case 0x55:	/* SKX */
 		pkg_cstate_limits = hsw_pkg_cstate_limits;
 		break;
 	case 0x37:	/* BYT */
···
 		break;
 	case 0x57:	/* PHI */
 		pkg_cstate_limits = phi_pkg_cstate_limits;
+		break;
+	case 0x5C:	/* BXT */
+		pkg_cstate_limits = bxt_pkg_cstate_limits;
 		break;
 	default:
 		return 0;
···
 	case 0x56:	/* BDX-DE */
 	case 0x4E:	/* SKL */
 	case 0x5E:	/* SKL */
+	case 0x8E:	/* KBL */
+	case 0x9E:	/* KBL */
+	case 0x55:	/* SKX */
 
 	case 0x57:	/* Knights Landing */
 		return 1;
···
 	case 0x47:	/* BDW */
 		do_rapl = RAPL_PKG | RAPL_CORES | RAPL_CORE_POLICY | RAPL_GFX | RAPL_PKG_POWER_INFO;
 		break;
+	case 0x5C:	/* BXT */
+		do_rapl = RAPL_PKG | RAPL_PKG_POWER_INFO;
+		break;
 	case 0x4E:	/* SKL */
 	case 0x5E:	/* SKL */
+	case 0x8E:	/* KBL */
+	case 0x9E:	/* KBL */
 		do_rapl = RAPL_PKG | RAPL_DRAM | RAPL_DRAM_PERF_STATUS | RAPL_PKG_PERF_STATUS | RAPL_PKG_POWER_INFO;
 		break;
 	case 0x3F:	/* HSX */
 	case 0x4F:	/* BDX */
 	case 0x56:	/* BDX-DE */
+	case 0x55:	/* SKX */
 	case 0x57:	/* KNL */
 		do_rapl = RAPL_PKG | RAPL_DRAM | RAPL_DRAM_POWER_INFO | RAPL_DRAM_PERF_STATUS | RAPL_PKG_PERF_STATUS | RAPL_PKG_POWER_INFO;
 		break;
···
 	case 0x56:	/* BDX-DE */
 	case 0x4E:	/* SKL */
 	case 0x5E:	/* SKL */
+	case 0x8E:	/* KBL */
+	case 0x9E:	/* KBL */
+	case 0x55:	/* SKX */
+	case 0x5C:	/* BXT */
 		return 1;
 	}
 	return 0;
···
 /*
  * HSW adds support for additional MSRs:
  *
- * MSR_PKG_C8_RESIDENCY		0x00000630
- * MSR_PKG_C9_RESIDENCY		0x00000631
- * MSR_PKG_C10_RESIDENCY	0x00000632
+ * MSR_PKG_C8_RESIDENCY		0x00000630
+ * MSR_PKG_C9_RESIDENCY		0x00000631
+ * MSR_PKG_C10_RESIDENCY	0x00000632
+ *
+ * MSR_PKGC8_IRTL		0x00000633
+ * MSR_PKGC9_IRTL		0x00000634
+ * MSR_PKGC10_IRTL		0x00000635
+ *
  */
 int has_hsw_msrs(unsigned int family, unsigned int model)
 {
···
 	case 0x3D:	/* BDW */
 	case 0x4E:	/* SKL */
 	case 0x5E:	/* SKL */
+	case 0x8E:	/* KBL */
+	case 0x9E:	/* KBL */
+	case 0x5C:	/* BXT */
 		return 1;
 	}
 	return 0;
···
 	switch (model) {
 	case 0x4E:	/* SKL */
 	case 0x5E:	/* SKL */
+	case 0x8E:	/* KBL */
+	case 0x9E:	/* KBL */
 		return 1;
 	}
 	return 0;
···
 	if (debug)
 		decode_misc_enable_msr();
 
-	if (max_level >= 0x7) {
+	if (max_level >= 0x7 && debug) {
 		int has_sgx;
 
 		ecx = 0;
···
 	switch(model) {
 	case 0x4E:	/* SKL */
 	case 0x5E:	/* SKL */
-		crystal_hz = 24000000;	/* 24 MHz */
+	case 0x8E:	/* KBL */
+	case 0x9E:	/* KBL */
+		crystal_hz = 24000000;	/* 24.0 MHz */
+		break;
+	case 0x55:	/* SKX */
+		crystal_hz = 25000000;	/* 25.0 MHz */
+		break;
+	case 0x5C:	/* BXT */
+		crystal_hz = 19200000;	/* 19.2 MHz */
 		break;
 	default:
 		crystal_hz = 0;
···
 	do_nhm_platform_info = do_nhm_cstates = do_smi = probe_nhm_msrs(family, model);
 	do_snb_cstates = has_snb_msrs(family, model);
+	do_irtl_snb = has_snb_msrs(family, model);
 	do_pc2 = do_snb_cstates && (pkg_cstate_limit >= PCL__2);
 	do_pc3 = (pkg_cstate_limit >= PCL__3);
 	do_pc6 = (pkg_cstate_limit >= PCL__6);
 	do_pc7 = do_snb_cstates && (pkg_cstate_limit >= PCL__7);
 	do_c8_c9_c10 = has_hsw_msrs(family, model);
+	do_irtl_hsw = has_hsw_msrs(family, model);
 	do_skl_residency = has_skl_msrs(family, model);
 	do_slm_cstates = is_slm(family, model);
 	do_knl_cstates = is_knl(family, model);
···
 
 	if (debug)
 		for_all_cpus(print_thermal, ODD_COUNTERS);
+
+	if (debug && do_irtl_snb)
+		print_irtl();
 }
 
 int fork_it(char **argv)
···
 }
 
 void print_version() {
-	fprintf(outf, "turbostat version 4.11 27 Feb 2016"
+	fprintf(outf, "turbostat version 4.12 5 Apr 2016"
 		" - Len Brown <lenb@kernel.org>\n");
 }
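The RC6 counter-wrap workaround in the turbostat hunks above has two parts: the delta code flags a backwards-moving counter with -1, and the display code prints "***.**" for that sentinel instead of a bogus percentage (computed as delta_ms / 10.0 / interval_s, since the counter is in milliseconds). A standalone sketch of that accounting, with names invented for the example:

```c
#include <assert.h>

/* Standalone sketch of turbostat's RC6 accounting: a counter that moved
 * backwards (reset or wrap between samples) is flagged with -1, which the
 * display layer then prints as "***.**" rather than a bogus percentage. */
static long long rc6_delta_ms(long long old_ms, long long new_ms)
{
	if (old_ms > new_ms)
		return -1;	/* counter reset/wrapped between samples */
	return new_ms - old_ms;
}

/* Busy % over the interval: rc6 is in ms and the interval in seconds,
 * so percent = delta_ms / 10.0 / interval_s. */
static double rc6_percent(long long delta_ms, double interval_s)
{
	return delta_ms / 10.0 / interval_s;
}
```

For a 1-second interval, 500 ms of residency reports 50% busy; a counter that went from 350 back to 100 yields the -1 sentinel and is never fed into the percentage math.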