Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'thermal-intel'

Merge changes in Intel thermal control drivers for 6.7-rc1:

- Add power floor notifications support to the int340x thermal control
driver (Srinivas Pandruvada).

- Rework updating trip points in the int340x thermal driver so that it
does not access thermal zone internals directly (Rafael Wysocki).

- Use param_get_byte() instead of param_get_int() as the max_idle module
parameter .get() callback in the Intel powerclamp thermal driver to
avoid possible out-of-bounds access (David Arcari).

- Add workload hints support to the int340x thermal driver (Srinivas
Pandruvada).

* thermal-intel:
selftests/thermal/intel: Add test to read power floor status
thermal: int340x: processor_thermal: Enable power floor support
thermal: int340x: processor_thermal: Handle power floor interrupts
thermal: int340x: processor_thermal: Support power floor notifications
thermal: int340x: processor_thermal: Set feature mask before proc_thermal_add
thermal: int340x: processor_thermal: Common function to clear SOC interrupt
thermal: int340x: processor_thermal: Move interrupt status MMIO offset to common header
thermal: intel: powerclamp: fix mismatch in get function for max_idle
thermal: int340x: Use thermal_zone_for_each_trip()
thermal: int340x: processor_thermal: Ack all PCI interrupts
thermal: int340x: Add ArrowLake-S PCI ID
selftests/thermal/intel: Add test to read workload hint
thermal: int340x: Handle workload hint interrupts
thermal: int340x: processor_thermal: Add workload type hint interface
thermal: int340x: Remove PROC_THERMAL_FEATURE_WLT_REQ for Meteor Lake
thermal: int340x: processor_thermal: Use non MSI interrupts by default
thermal: int340x: processor_thermal: Add interrupt configuration function
thermal: int340x: processor_thermal: Move mailbox code to common module

+1179 -215
+64
Documentation/driver-api/thermal/intel_dptf.rst
··· 164 164 ``power_limit_1_tmax_us`` (RO) 165 165 Maximum powercap sysfs constraint_1_time_window_us for Intel RAPL 166 166 167 + ``power_floor_status`` (RO) 168 + When set to 1, the power floor of the system in the current 169 + configuration has been reached. It needs to be reconfigured to allow 170 + power to be reduced any further. 171 + 172 + ``power_floor_enable`` (RW) 173 + When set to 1, enable reading and notification of the power floor 174 + status. Notifications are triggered for the power_floor_status 175 + attribute value changes. 176 + 167 177 :file:`/sys/bus/pci/devices/0000\:00\:04.0/` 168 178 169 179 ``tcc_offset_degree_celsius`` (RW) ··· 325 315 ---------------------------------------- 326 316 327 317 Refer to Documentation/admin-guide/acpi/fan_performance_states.rst 318 + 319 + Workload Type Hints 320 + ---------------------------------------- 321 + 322 + The firmware in Meteor Lake processor generation is capable of identifying 323 + workload type and passing hints regarding it to the OS. A special sysfs 324 + interface is provided to allow user space to obtain workload type hints from 325 + the firmware and control the rate at which they are provided. 326 + 327 + User space can poll attribute "workload_type_index" for the current hint or 328 + can receive a notification whenever the value of this attribute is updated. 329 + 330 + file:`/sys/bus/pci/devices/0000:00:04.0/workload_hint/` 331 + Segment 0, bus 0, device 4, function 0 is reserved for the processor thermal 332 + device on all Intel client processors. So, the above path doesn't change 333 + based on the processor generation. 334 + 335 + ``workload_hint_enable`` (RW) 336 + Enable firmware to send workload type hints to user space. 337 + 338 + ``notification_delay_ms`` (RW) 339 + Minimum delay in milliseconds before firmware will notify OS. This is 340 + for the rate control of notifications. 
This delay is between changing 341 + the workload type prediction in the firmware and notifying the OS about 342 + the change. The default delay is 1024 ms. The delay of 0 is invalid. 343 + The delay is rounded up to the nearest power of 2 to simplify firmware 344 + programming of the delay value. The read of notification_delay_ms 345 + attribute shows the effective value used. 346 + 347 + ``workload_type_index`` (RO) 348 + Predicted workload type index. User space can get notification of 349 + change via existing sysfs attribute change notification mechanism. 350 + 351 + The supported index values and their meaning for the Meteor Lake 352 + processor generation are as follows: 353 + 354 + 0 - Idle: System performs no tasks, power and idle residency are 355 + consistently low for long periods of time. 356 + 357 + 1 – Battery Life: Power is relatively low, but the processor may 358 + still be actively performing a task, such as video playback for 359 + a long period of time. 360 + 361 + 2 – Sustained: Power level that is relatively high for a long period 362 + of time, with very few to no periods of idleness, which will 363 + eventually exhaust RAPL Power Limit 1 and 2. 364 + 365 + 3 – Bursty: Consumes a relatively constant average amount of power, but 366 + periods of relative idleness are interrupted by bursts of 367 + activity. The bursts are relatively short and the periods of 368 + relative idleness between them typically prevent RAPL Power 369 + Limit 1 from being exhausted. 370 + 371 + 4 – Unknown: Can't classify.
+3
drivers/thermal/intel/int340x_thermal/Makefile
··· 10 10 obj-$(CONFIG_PROC_THERMAL_MMIO_RAPL) += processor_thermal_rapl.o 11 11 obj-$(CONFIG_INT340X_THERMAL) += processor_thermal_rfim.o 12 12 obj-$(CONFIG_INT340X_THERMAL) += processor_thermal_mbox.o 13 + obj-$(CONFIG_INT340X_THERMAL) += processor_thermal_wt_req.o 14 + obj-$(CONFIG_INT340X_THERMAL) += processor_thermal_wt_hint.o 15 + obj-$(CONFIG_INT340X_THERMAL) += processor_thermal_power_floor.o 13 16 obj-$(CONFIG_INT3406_THERMAL) += int3406_thermal.o 14 17 obj-$(CONFIG_ACPI_THERMAL_REL) += acpi_thermal_rel.o
+43 -37
drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
··· 67 67 .critical = int340x_thermal_critical, 68 68 }; 69 69 70 + static inline void *int_to_trip_priv(int i) 71 + { 72 + return (void *)(long)i; 73 + } 74 + 75 + static inline int trip_priv_to_int(const struct thermal_trip *trip) 76 + { 77 + return (long)trip->priv; 78 + } 79 + 70 80 static int int340x_thermal_read_trips(struct acpi_device *zone_adev, 71 81 struct thermal_trip *zone_trips, 72 82 int trip_cnt) ··· 111 101 break; 112 102 113 103 zone_trips[trip_cnt].type = THERMAL_TRIP_ACTIVE; 104 + zone_trips[trip_cnt].priv = int_to_trip_priv(i); 114 105 trip_cnt++; 115 106 } 116 107 ··· 223 212 } 224 213 EXPORT_SYMBOL_GPL(int340x_thermal_zone_remove); 225 214 215 + static int int340x_update_one_trip(struct thermal_trip *trip, void *arg) 216 + { 217 + struct acpi_device *zone_adev = arg; 218 + int temp, err; 219 + 220 + switch (trip->type) { 221 + case THERMAL_TRIP_CRITICAL: 222 + err = thermal_acpi_critical_trip_temp(zone_adev, &temp); 223 + break; 224 + case THERMAL_TRIP_HOT: 225 + err = thermal_acpi_hot_trip_temp(zone_adev, &temp); 226 + break; 227 + case THERMAL_TRIP_PASSIVE: 228 + err = thermal_acpi_passive_trip_temp(zone_adev, &temp); 229 + break; 230 + case THERMAL_TRIP_ACTIVE: 231 + err = thermal_acpi_active_trip_temp(zone_adev, 232 + trip_priv_to_int(trip), 233 + &temp); 234 + break; 235 + default: 236 + err = -ENODEV; 237 + } 238 + if (err) 239 + temp = THERMAL_TEMP_INVALID; 240 + 241 + trip->temperature = temp; 242 + return 0; 243 + } 244 + 226 245 void int340x_thermal_update_trips(struct int34x_thermal_zone *int34x_zone) 227 246 { 228 - struct acpi_device *zone_adev = int34x_zone->adev; 229 - struct thermal_trip *zone_trips = int34x_zone->trips; 230 - int trip_cnt = int34x_zone->zone->num_trips; 231 - int act_trip_nr = 0; 232 - int i; 233 - 234 - mutex_lock(&int34x_zone->zone->lock); 235 - 236 - for (i = int34x_zone->aux_trip_nr; i < trip_cnt; i++) { 237 - int temp, err; 238 - 239 - switch (zone_trips[i].type) { 240 - case THERMAL_TRIP_CRITICAL: 241 - 
err = thermal_acpi_critical_trip_temp(zone_adev, &temp); 242 - break; 243 - case THERMAL_TRIP_HOT: 244 - err = thermal_acpi_hot_trip_temp(zone_adev, &temp); 245 - break; 246 - case THERMAL_TRIP_PASSIVE: 247 - err = thermal_acpi_passive_trip_temp(zone_adev, &temp); 248 - break; 249 - case THERMAL_TRIP_ACTIVE: 250 - err = thermal_acpi_active_trip_temp(zone_adev, act_trip_nr++, 251 - &temp); 252 - break; 253 - default: 254 - err = -ENODEV; 255 - } 256 - if (err) { 257 - zone_trips[i].temperature = THERMAL_TEMP_INVALID; 258 - continue; 259 - } 260 - 261 - zone_trips[i].temperature = temp; 262 - } 263 - 264 - mutex_unlock(&int34x_zone->zone->lock); 247 + thermal_zone_for_each_trip(int34x_zone->zone, int340x_update_one_trip, 248 + int34x_zone->adev); 265 249 } 266 250 EXPORT_SYMBOL_GPL(int340x_thermal_update_trips); 267 251
+80 -5
drivers/thermal/intel/int340x_thermal/processor_thermal_device.c
··· 26 26 (unsigned long)proc_dev->power_limits[index].suffix * 1000); \ 27 27 } 28 28 29 + static ssize_t power_floor_status_show(struct device *dev, 30 + struct device_attribute *attr, 31 + char *buf) 32 + { 33 + struct proc_thermal_device *proc_dev = dev_get_drvdata(dev); 34 + int ret; 35 + 36 + ret = proc_thermal_read_power_floor_status(proc_dev); 37 + 38 + return sysfs_emit(buf, "%d\n", ret); 39 + } 40 + 41 + static ssize_t power_floor_enable_show(struct device *dev, 42 + struct device_attribute *attr, 43 + char *buf) 44 + { 45 + struct proc_thermal_device *proc_dev = dev_get_drvdata(dev); 46 + bool ret; 47 + 48 + ret = proc_thermal_power_floor_get_state(proc_dev); 49 + 50 + return sysfs_emit(buf, "%d\n", ret); 51 + } 52 + 53 + static ssize_t power_floor_enable_store(struct device *dev, 54 + struct device_attribute *attr, 55 + const char *buf, size_t count) 56 + { 57 + struct proc_thermal_device *proc_dev = dev_get_drvdata(dev); 58 + u8 state; 59 + int ret; 60 + 61 + if (kstrtou8(buf, 0, &state)) 62 + return -EINVAL; 63 + 64 + ret = proc_thermal_power_floor_set_state(proc_dev, !!state); 65 + if (ret) 66 + return ret; 67 + 68 + return count; 69 + } 70 + 29 71 POWER_LIMIT_SHOW(0, min_uw) 30 72 POWER_LIMIT_SHOW(0, max_uw) 31 73 POWER_LIMIT_SHOW(0, step_uw) ··· 92 50 static DEVICE_ATTR_RO(power_limit_1_tmin_us); 93 51 static DEVICE_ATTR_RO(power_limit_1_tmax_us); 94 52 53 + static DEVICE_ATTR_RO(power_floor_status); 54 + static DEVICE_ATTR_RW(power_floor_enable); 55 + 95 56 static struct attribute *power_limit_attrs[] = { 96 57 &dev_attr_power_limit_0_min_uw.attr, 97 58 &dev_attr_power_limit_1_min_uw.attr, ··· 106 61 &dev_attr_power_limit_1_tmin_us.attr, 107 62 &dev_attr_power_limit_0_tmax_us.attr, 108 63 &dev_attr_power_limit_1_tmax_us.attr, 64 + &dev_attr_power_floor_status.attr, 65 + &dev_attr_power_floor_enable.attr, 109 66 NULL 110 67 }; 111 68 69 + static umode_t power_limit_attr_visible(struct kobject *kobj, struct attribute *attr, int unused) 70 + { 71 + 
struct device *dev = kobj_to_dev(kobj); 72 + struct proc_thermal_device *proc_dev; 73 + 74 + if (attr != &dev_attr_power_floor_status.attr && attr != &dev_attr_power_floor_enable.attr) 75 + return attr->mode; 76 + 77 + proc_dev = dev_get_drvdata(dev); 78 + if (!proc_dev || !(proc_dev->mmio_feature_mask & PROC_THERMAL_FEATURE_POWER_FLOOR)) 79 + return 0; 80 + 81 + return attr->mode; 82 + } 83 + 112 84 static const struct attribute_group power_limit_attribute_group = { 113 85 .attrs = power_limit_attrs, 114 - .name = "power_limits" 86 + .name = "power_limits", 87 + .is_visible = power_limit_attr_visible, 115 88 }; 116 89 117 90 static ssize_t tcc_offset_degree_celsius_show(struct device *dev, ··· 409 346 } 410 347 } 411 348 412 - if (feature_mask & PROC_THERMAL_FEATURE_MBOX) { 413 - ret = proc_thermal_mbox_add(pdev, proc_priv); 349 + if (feature_mask & PROC_THERMAL_FEATURE_WT_REQ) { 350 + ret = proc_thermal_wt_req_add(pdev, proc_priv); 414 351 if (ret) { 415 352 dev_err(&pdev->dev, "failed to add MBOX interface\n"); 353 + goto err_rem_rfim; 354 + } 355 + } else if (feature_mask & PROC_THERMAL_FEATURE_WT_HINT) { 356 + ret = proc_thermal_wt_hint_add(pdev, proc_priv); 357 + if (ret) { 358 + dev_err(&pdev->dev, "failed to add WT Hint\n"); 416 359 goto err_rem_rfim; 417 360 } 418 361 } ··· 443 374 proc_priv->mmio_feature_mask & PROC_THERMAL_FEATURE_DVFS) 444 375 proc_thermal_rfim_remove(pdev); 445 376 446 - if (proc_priv->mmio_feature_mask & PROC_THERMAL_FEATURE_MBOX) 447 - proc_thermal_mbox_remove(pdev); 377 + if (proc_priv->mmio_feature_mask & PROC_THERMAL_FEATURE_POWER_FLOOR) 378 + proc_thermal_power_floor_set_state(proc_priv, false); 379 + 380 + if (proc_priv->mmio_feature_mask & PROC_THERMAL_FEATURE_WT_REQ) 381 + proc_thermal_wt_req_remove(pdev); 382 + else if (proc_priv->mmio_feature_mask & PROC_THERMAL_FEATURE_WT_HINT) 383 + proc_thermal_wt_hint_remove(pdev); 448 384 } 449 385 EXPORT_SYMBOL_GPL(proc_thermal_mmio_remove); 450 386 451 387 MODULE_IMPORT_NS(INTEL_TCC); 
388 + MODULE_IMPORT_NS(INT340X_THERMAL); 452 389 MODULE_AUTHOR("Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>"); 453 390 MODULE_DESCRIPTION("Processor Thermal Reporting Device Driver"); 454 391 MODULE_LICENSE("GPL v2");
+30 -3
drivers/thermal/intel/int340x_thermal/processor_thermal_device.h
··· 10 10 #include <linux/intel_rapl.h> 11 11 12 12 #define PCI_DEVICE_ID_INTEL_ADL_THERMAL 0x461d 13 + #define PCI_DEVICE_ID_INTEL_ARL_S_THERMAL 0xAD03 13 14 #define PCI_DEVICE_ID_INTEL_BDW_THERMAL 0x1603 14 15 #define PCI_DEVICE_ID_INTEL_BSW_THERMAL 0x22DC 15 16 ··· 60 59 #define PROC_THERMAL_FEATURE_RAPL 0x01 61 60 #define PROC_THERMAL_FEATURE_FIVR 0x02 62 61 #define PROC_THERMAL_FEATURE_DVFS 0x04 63 - #define PROC_THERMAL_FEATURE_MBOX 0x08 62 + #define PROC_THERMAL_FEATURE_WT_REQ 0x08 64 63 #define PROC_THERMAL_FEATURE_DLVR 0x10 64 + #define PROC_THERMAL_FEATURE_WT_HINT 0x20 65 + #define PROC_THERMAL_FEATURE_POWER_FLOOR 0x40 65 66 66 67 #if IS_ENABLED(CONFIG_PROC_THERMAL_MMIO_RAPL) 67 68 int proc_thermal_rapl_add(struct pci_dev *pdev, struct proc_thermal_device *proc_priv); ··· 83 80 int proc_thermal_rfim_add(struct pci_dev *pdev, struct proc_thermal_device *proc_priv); 84 81 void proc_thermal_rfim_remove(struct pci_dev *pdev); 85 82 86 - int proc_thermal_mbox_add(struct pci_dev *pdev, struct proc_thermal_device *proc_priv); 87 - void proc_thermal_mbox_remove(struct pci_dev *pdev); 83 + int proc_thermal_wt_req_add(struct pci_dev *pdev, struct proc_thermal_device *proc_priv); 84 + void proc_thermal_wt_req_remove(struct pci_dev *pdev); 85 + 86 + #define MBOX_CMD_WORKLOAD_TYPE_READ 0x0E 87 + #define MBOX_CMD_WORKLOAD_TYPE_WRITE 0x0F 88 + 89 + #define MBOX_DATA_BIT_AC_DC 30 90 + #define MBOX_DATA_BIT_VALID 31 91 + 92 + #define SOC_WT_RES_INT_STATUS_OFFSET 0x5B18 93 + #define SOC_WT_RES_INT_STATUS_MASK GENMASK_ULL(3, 2) 94 + 95 + int proc_thermal_read_power_floor_status(struct proc_thermal_device *proc_priv); 96 + int proc_thermal_power_floor_set_state(struct proc_thermal_device *proc_priv, bool enable); 97 + bool proc_thermal_power_floor_get_state(struct proc_thermal_device *proc_priv); 98 + void proc_thermal_power_floor_intr_callback(struct pci_dev *pdev, 99 + struct proc_thermal_device *proc_priv); 100 + bool proc_thermal_check_power_floor_intr(struct 
proc_thermal_device *proc_priv); 88 101 89 102 int processor_thermal_send_mbox_read_cmd(struct pci_dev *pdev, u16 id, u64 *resp); 90 103 int processor_thermal_send_mbox_write_cmd(struct pci_dev *pdev, u16 id, u32 data); 104 + int processor_thermal_mbox_interrupt_config(struct pci_dev *pdev, bool enable, int enable_bit, 105 + int time_window); 91 106 int proc_thermal_add(struct device *dev, struct proc_thermal_device *priv); 92 107 void proc_thermal_remove(struct proc_thermal_device *proc_priv); 108 + 109 + int proc_thermal_wt_hint_add(struct pci_dev *pdev, struct proc_thermal_device *proc_priv); 110 + void proc_thermal_wt_hint_remove(struct pci_dev *pdev); 111 + void proc_thermal_wt_intr_callback(struct pci_dev *pdev, struct proc_thermal_device *proc_priv); 112 + bool proc_thermal_check_wt_intr(struct proc_thermal_device *proc_priv); 113 + 93 114 int proc_thermal_suspend(struct device *dev); 94 115 int proc_thermal_resume(struct device *dev); 95 116 int proc_thermal_mmio_add(struct pci_dev *pdev,
+93 -30
drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c
··· 15 15 16 16 #define DRV_NAME "proc_thermal_pci" 17 17 18 + static bool use_msi; 19 + module_param(use_msi, bool, 0644); 20 + MODULE_PARM_DESC(use_msi, 21 + "Use PCI MSI based interrupts for processor thermal device."); 22 + 18 23 struct proc_thermal_pci { 19 24 struct pci_dev *pdev; 20 25 struct proc_thermal_device *proc_priv; ··· 122 117 schedule_delayed_work(work, ms); 123 118 } 124 119 120 + static void proc_thermal_clear_soc_int_status(struct proc_thermal_device *proc_priv) 121 + { 122 + u64 status; 123 + 124 + if (!(proc_priv->mmio_feature_mask & 125 + (PROC_THERMAL_FEATURE_WT_HINT | PROC_THERMAL_FEATURE_POWER_FLOOR))) 126 + return; 127 + 128 + status = readq(proc_priv->mmio_base + SOC_WT_RES_INT_STATUS_OFFSET); 129 + writeq(status & ~SOC_WT_RES_INT_STATUS_MASK, 130 + proc_priv->mmio_base + SOC_WT_RES_INT_STATUS_OFFSET); 131 + } 132 + 133 + static irqreturn_t proc_thermal_irq_thread_handler(int irq, void *devid) 134 + { 135 + struct proc_thermal_pci *pci_info = devid; 136 + 137 + proc_thermal_wt_intr_callback(pci_info->pdev, pci_info->proc_priv); 138 + proc_thermal_power_floor_intr_callback(pci_info->pdev, pci_info->proc_priv); 139 + proc_thermal_clear_soc_int_status(pci_info->proc_priv); 140 + 141 + return IRQ_HANDLED; 142 + } 143 + 125 144 static irqreturn_t proc_thermal_irq_handler(int irq, void *devid) 126 145 { 127 146 struct proc_thermal_pci *pci_info = devid; 147 + struct proc_thermal_device *proc_priv; 148 + int ret = IRQ_HANDLED; 128 149 u32 status; 129 150 130 - proc_thermal_mmio_read(pci_info, PROC_THERMAL_MMIO_INT_STATUS_0, &status); 151 + proc_priv = pci_info->proc_priv; 131 152 132 - /* Disable enable interrupt flag */ 133 - proc_thermal_mmio_write(pci_info, PROC_THERMAL_MMIO_INT_ENABLE_0, 0); 153 + if (proc_priv->mmio_feature_mask & PROC_THERMAL_FEATURE_WT_HINT) { 154 + if (proc_thermal_check_wt_intr(pci_info->proc_priv)) 155 + ret = IRQ_WAKE_THREAD; 156 + } 157 + 158 + if (proc_priv->mmio_feature_mask & PROC_THERMAL_FEATURE_POWER_FLOOR) { 
159 + if (proc_thermal_check_power_floor_intr(pci_info->proc_priv)) 160 + ret = IRQ_WAKE_THREAD; 161 + } 162 + 163 + /* 164 + * Since now there are two sources of interrupts: one from thermal threshold 165 + * and another from workload hint, add a check if there was really a threshold 166 + * interrupt before scheduling work function for thermal threshold. 167 + */ 168 + proc_thermal_mmio_read(pci_info, PROC_THERMAL_MMIO_INT_STATUS_0, &status); 169 + if (status) { 170 + /* Disable enable interrupt flag */ 171 + proc_thermal_mmio_write(pci_info, PROC_THERMAL_MMIO_INT_ENABLE_0, 0); 172 + pkg_thermal_schedule_work(&pci_info->work); 173 + } 174 + 134 175 pci_write_config_byte(pci_info->pdev, 0xdc, 0x01); 135 176 136 - pkg_thermal_schedule_work(&pci_info->work); 137 - 138 - return IRQ_HANDLED; 177 + return ret; 139 178 } 140 179 141 180 static int sys_get_curr_temp(struct thermal_zone_device *tzd, int *temp) ··· 252 203 struct proc_thermal_device *proc_priv; 253 204 struct proc_thermal_pci *pci_info; 254 205 int irq_flag = 0, irq, ret; 206 + bool msi_irq = false; 255 207 256 208 proc_priv = devm_kzalloc(&pdev->dev, sizeof(*proc_priv), GFP_KERNEL); 257 209 if (!proc_priv) ··· 273 223 274 224 INIT_DELAYED_WORK(&pci_info->work, proc_thermal_threshold_work_fn); 275 225 276 - ret = proc_thermal_add(&pdev->dev, proc_priv); 277 - if (ret) { 278 - dev_err(&pdev->dev, "error: proc_thermal_add, will continue\n"); 279 - pci_info->no_legacy = 1; 280 - } 281 - 282 226 proc_priv->priv_data = pci_info; 283 227 pci_info->proc_priv = proc_priv; 284 228 pci_set_drvdata(pdev, proc_priv); 285 229 286 230 ret = proc_thermal_mmio_add(pdev, proc_priv, id->driver_data); 287 231 if (ret) 288 - goto err_ret_thermal; 232 + return ret; 233 + 234 + ret = proc_thermal_add(&pdev->dev, proc_priv); 235 + if (ret) { 236 + dev_err(&pdev->dev, "error: proc_thermal_add, will continue\n"); 237 + pci_info->no_legacy = 1; 238 + } 289 239 290 240 psv_trip.temperature = get_trip_temp(pci_info); 291 241 ··· 295 
245 &tzone_params, 0, 0); 296 246 if (IS_ERR(pci_info->tzone)) { 297 247 ret = PTR_ERR(pci_info->tzone); 298 - goto err_ret_mmio; 248 + goto err_del_legacy; 299 249 } 300 250 301 - /* request and enable interrupt */ 302 - ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES); 303 - if (ret < 0) { 304 - dev_err(&pdev->dev, "Failed to allocate vectors!\n"); 305 - goto err_ret_tzone; 306 - } 307 - if (!pdev->msi_enabled && !pdev->msix_enabled) 251 + if (use_msi && (pdev->msi_enabled || pdev->msix_enabled)) { 252 + /* request and enable interrupt */ 253 + ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES); 254 + if (ret < 0) { 255 + dev_err(&pdev->dev, "Failed to allocate vectors!\n"); 256 + goto err_ret_tzone; 257 + } 258 + 259 + irq = pci_irq_vector(pdev, 0); 260 + msi_irq = true; 261 + } else { 308 262 irq_flag = IRQF_SHARED; 263 + irq = pdev->irq; 264 + } 309 265 310 - irq = pci_irq_vector(pdev, 0); 311 266 ret = devm_request_threaded_irq(&pdev->dev, irq, 312 - proc_thermal_irq_handler, NULL, 267 + proc_thermal_irq_handler, proc_thermal_irq_thread_handler, 313 268 irq_flag, KBUILD_MODNAME, pci_info); 314 269 if (ret) { 315 270 dev_err(&pdev->dev, "Request IRQ %d failed\n", pdev->irq); ··· 328 273 return 0; 329 274 330 275 err_free_vectors: 331 - pci_free_irq_vectors(pdev); 276 + if (msi_irq) 277 + pci_free_irq_vectors(pdev); 332 278 err_ret_tzone: 333 279 thermal_zone_device_unregister(pci_info->tzone); 334 - err_ret_mmio: 335 - proc_thermal_mmio_remove(pdev, proc_priv); 336 - err_ret_thermal: 280 + err_del_legacy: 337 281 if (!pci_info->no_legacy) 338 282 proc_thermal_remove(proc_priv); 283 + proc_thermal_mmio_remove(pdev, proc_priv); 339 284 pci_disable_device(pdev); 340 285 341 286 return ret; ··· 405 350 proc_thermal_pci_resume); 406 351 407 352 static const struct pci_device_id proc_thermal_pci_ids[] = { 408 - { PCI_DEVICE_DATA(INTEL, ADL_THERMAL, PROC_THERMAL_FEATURE_RAPL | PROC_THERMAL_FEATURE_FIVR | PROC_THERMAL_FEATURE_DVFS | 
PROC_THERMAL_FEATURE_MBOX) }, 409 - { PCI_DEVICE_DATA(INTEL, MTLP_THERMAL, PROC_THERMAL_FEATURE_RAPL | PROC_THERMAL_FEATURE_FIVR | PROC_THERMAL_FEATURE_DVFS | PROC_THERMAL_FEATURE_MBOX | PROC_THERMAL_FEATURE_DLVR) }, 410 - { PCI_DEVICE_DATA(INTEL, RPL_THERMAL, PROC_THERMAL_FEATURE_RAPL | PROC_THERMAL_FEATURE_FIVR | PROC_THERMAL_FEATURE_DVFS | PROC_THERMAL_FEATURE_MBOX) }, 353 + { PCI_DEVICE_DATA(INTEL, ADL_THERMAL, PROC_THERMAL_FEATURE_RAPL | 354 + PROC_THERMAL_FEATURE_FIVR | PROC_THERMAL_FEATURE_DVFS | PROC_THERMAL_FEATURE_WT_REQ) }, 355 + { PCI_DEVICE_DATA(INTEL, MTLP_THERMAL, PROC_THERMAL_FEATURE_RAPL | 356 + PROC_THERMAL_FEATURE_FIVR | PROC_THERMAL_FEATURE_DVFS | PROC_THERMAL_FEATURE_DLVR | 357 + PROC_THERMAL_FEATURE_WT_HINT | PROC_THERMAL_FEATURE_POWER_FLOOR) }, 358 + { PCI_DEVICE_DATA(INTEL, ARL_S_THERMAL, PROC_THERMAL_FEATURE_RAPL | 359 + PROC_THERMAL_FEATURE_DVFS | PROC_THERMAL_FEATURE_DLVR | PROC_THERMAL_FEATURE_WT_HINT) }, 360 + { PCI_DEVICE_DATA(INTEL, RPL_THERMAL, PROC_THERMAL_FEATURE_RAPL | 361 + PROC_THERMAL_FEATURE_FIVR | PROC_THERMAL_FEATURE_DVFS | PROC_THERMAL_FEATURE_WT_REQ) }, 411 362 { }, 412 363 }; 413 364 ··· 428 367 }; 429 368 430 369 module_pci_driver(proc_thermal_pci_driver); 370 + 371 + MODULE_IMPORT_NS(INT340X_THERMAL); 431 372 432 373 MODULE_AUTHOR("Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>"); 433 374 MODULE_DESCRIPTION("Processor Thermal Reporting Device Driver");
+2 -1
drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci_legacy.c
··· 137 137 { PCI_DEVICE_DATA(INTEL, ICL_THERMAL, PROC_THERMAL_FEATURE_RAPL) }, 138 138 { PCI_DEVICE_DATA(INTEL, JSL_THERMAL, 0) }, 139 139 { PCI_DEVICE_DATA(INTEL, SKL_THERMAL, PROC_THERMAL_FEATURE_RAPL) }, 140 - { PCI_DEVICE_DATA(INTEL, TGL_THERMAL, PROC_THERMAL_FEATURE_RAPL | PROC_THERMAL_FEATURE_FIVR | PROC_THERMAL_FEATURE_MBOX) }, 140 + { PCI_DEVICE_DATA(INTEL, TGL_THERMAL, PROC_THERMAL_FEATURE_RAPL | 141 + PROC_THERMAL_FEATURE_FIVR | PROC_THERMAL_FEATURE_WT_REQ) }, 141 142 { }, 142 143 }; 143 144
+55 -138
drivers/thermal/intel/int340x_thermal/processor_thermal_mbox.c
··· 10 10 #include <linux/io-64-nonatomic-lo-hi.h> 11 11 #include "processor_thermal_device.h" 12 12 13 - #define MBOX_CMD_WORKLOAD_TYPE_READ 0x0E 14 - #define MBOX_CMD_WORKLOAD_TYPE_WRITE 0x0F 15 - 16 13 #define MBOX_OFFSET_DATA 0x5810 17 14 #define MBOX_OFFSET_INTERFACE 0x5818 18 15 19 16 #define MBOX_BUSY_BIT 31 20 17 #define MBOX_RETRY_COUNT 100 21 - 22 - #define MBOX_DATA_BIT_VALID 31 23 - #define MBOX_DATA_BIT_AC_DC 30 24 18 25 19 static DEFINE_MUTEX(mbox_lock); 26 20 ··· 45 51 int ret; 46 52 47 53 proc_priv = pci_get_drvdata(pdev); 48 - 49 - mutex_lock(&mbox_lock); 50 - 51 54 ret = wait_for_mbox_ready(proc_priv); 52 55 if (ret) 53 - goto unlock_mbox; 56 + return ret; 54 57 55 58 writel(data, (proc_priv->mmio_base + MBOX_OFFSET_DATA)); 56 59 /* Write command register */ 57 60 reg_data = BIT_ULL(MBOX_BUSY_BIT) | id; 58 61 writel(reg_data, (proc_priv->mmio_base + MBOX_OFFSET_INTERFACE)); 59 62 60 - ret = wait_for_mbox_ready(proc_priv); 61 - 62 - unlock_mbox: 63 - mutex_unlock(&mbox_lock); 64 - return ret; 63 + return wait_for_mbox_ready(proc_priv); 65 64 } 66 65 67 66 static int send_mbox_read_cmd(struct pci_dev *pdev, u16 id, u64 *resp) ··· 64 77 int ret; 65 78 66 79 proc_priv = pci_get_drvdata(pdev); 67 - 68 - mutex_lock(&mbox_lock); 69 - 70 80 ret = wait_for_mbox_ready(proc_priv); 71 81 if (ret) 72 - goto unlock_mbox; 82 + return ret; 73 83 74 84 /* Write command register */ 75 85 reg_data = BIT_ULL(MBOX_BUSY_BIT) | id; ··· 74 90 75 91 ret = wait_for_mbox_ready(proc_priv); 76 92 if (ret) 77 - goto unlock_mbox; 93 + return ret; 78 94 79 95 if (id == MBOX_CMD_WORKLOAD_TYPE_READ) 80 96 *resp = readl(proc_priv->mmio_base + MBOX_OFFSET_DATA); 81 97 else 82 98 *resp = readq(proc_priv->mmio_base + MBOX_OFFSET_DATA); 83 99 84 - unlock_mbox: 85 - mutex_unlock(&mbox_lock); 86 - return ret; 100 + return 0; 87 101 } 88 102 89 103 int processor_thermal_send_mbox_read_cmd(struct pci_dev *pdev, u16 id, u64 *resp) 90 104 { 91 - return send_mbox_read_cmd(pdev, id, resp); 105 
+ int ret; 106 + 107 + mutex_lock(&mbox_lock); 108 + ret = send_mbox_read_cmd(pdev, id, resp); 109 + mutex_unlock(&mbox_lock); 110 + 111 + return ret; 92 112 } 93 113 EXPORT_SYMBOL_NS_GPL(processor_thermal_send_mbox_read_cmd, INT340X_THERMAL); 94 114 95 115 int processor_thermal_send_mbox_write_cmd(struct pci_dev *pdev, u16 id, u32 data) 96 116 { 97 - return send_mbox_write_cmd(pdev, id, data); 98 - } 99 - EXPORT_SYMBOL_NS_GPL(processor_thermal_send_mbox_write_cmd, INT340X_THERMAL); 117 + int ret; 100 118 101 - /* List of workload types */ 102 - static const char * const workload_types[] = { 103 - "none", 104 - "idle", 105 - "semi_active", 106 - "bursty", 107 - "sustained", 108 - "battery_life", 109 - NULL 110 - }; 111 - 112 - static ssize_t workload_available_types_show(struct device *dev, 113 - struct device_attribute *attr, 114 - char *buf) 115 - { 116 - int i = 0; 117 - int ret = 0; 118 - 119 - while (workload_types[i] != NULL) 120 - ret += sprintf(&buf[ret], "%s ", workload_types[i++]); 121 - 122 - ret += sprintf(&buf[ret], "\n"); 119 + mutex_lock(&mbox_lock); 120 + ret = send_mbox_write_cmd(pdev, id, data); 121 + mutex_unlock(&mbox_lock); 123 122 124 123 return ret; 125 124 } 125 + EXPORT_SYMBOL_NS_GPL(processor_thermal_send_mbox_write_cmd, INT340X_THERMAL); 126 126 127 - static DEVICE_ATTR_RO(workload_available_types); 127 + #define MBOX_CAMARILLO_RD_INTR_CONFIG 0x1E 128 + #define MBOX_CAMARILLO_WR_INTR_CONFIG 0x1F 129 + #define WLT_TW_MASK GENMASK_ULL(30, 24) 130 + #define SOC_PREDICTION_TW_SHIFT 24 128 131 129 - static ssize_t workload_type_store(struct device *dev, 130 - struct device_attribute *attr, 131 - const char *buf, size_t count) 132 + int processor_thermal_mbox_interrupt_config(struct pci_dev *pdev, bool enable, 133 + int enable_bit, int time_window) 132 134 { 133 - struct pci_dev *pdev = to_pci_dev(dev); 134 - char str_preference[15]; 135 - u32 data = 0; 136 - ssize_t ret; 137 - 138 - ret = sscanf(buf, "%14s", str_preference); 139 - if (ret != 
1) 140 - return -EINVAL; 141 - 142 - ret = match_string(workload_types, -1, str_preference); 143 - if (ret < 0) 144 - return ret; 145 - 146 - ret &= 0xff; 147 - 148 - if (ret) 149 - data = BIT(MBOX_DATA_BIT_VALID) | BIT(MBOX_DATA_BIT_AC_DC); 150 - 151 - data |= ret; 152 - 153 - ret = send_mbox_write_cmd(pdev, MBOX_CMD_WORKLOAD_TYPE_WRITE, data); 154 - if (ret) 155 - return false; 156 - 157 - return count; 158 - } 159 - 160 - static ssize_t workload_type_show(struct device *dev, 161 - struct device_attribute *attr, 162 - char *buf) 163 - { 164 - struct pci_dev *pdev = to_pci_dev(dev); 165 - u64 cmd_resp; 135 + u64 data; 166 136 int ret; 167 137 168 - ret = send_mbox_read_cmd(pdev, MBOX_CMD_WORKLOAD_TYPE_READ, &cmd_resp); 138 + if (!pdev) 139 + return -ENODEV; 140 + 141 + mutex_lock(&mbox_lock); 142 + 143 + /* Do read modify write for MBOX_CAMARILLO_RD_INTR_CONFIG */ 144 + 145 + ret = send_mbox_read_cmd(pdev, MBOX_CAMARILLO_RD_INTR_CONFIG, &data); 146 + if (ret) { 147 + dev_err(&pdev->dev, "MBOX_CAMARILLO_RD_INTR_CONFIG failed\n"); 148 + goto unlock; 149 + } 150 + 151 + if (time_window >= 0) { 152 + data &= ~WLT_TW_MASK; 153 + 154 + /* Program notification delay */ 155 + data |= ((u64)time_window << SOC_PREDICTION_TW_SHIFT) & WLT_TW_MASK; 156 + } 157 + 158 + if (enable) 159 + data |= BIT(enable_bit); 160 + else 161 + data &= ~BIT(enable_bit); 162 + 163 + ret = send_mbox_write_cmd(pdev, MBOX_CAMARILLO_WR_INTR_CONFIG, data); 169 164 if (ret) 170 - return false; 165 + dev_err(&pdev->dev, "MBOX_CAMARILLO_WR_INTR_CONFIG failed\n"); 171 166 172 - cmd_resp &= 0xff; 167 + unlock: 168 + mutex_unlock(&mbox_lock); 173 169 174 - if (cmd_resp > ARRAY_SIZE(workload_types) - 1) 175 - return -EINVAL; 176 - 177 - return sprintf(buf, "%s\n", workload_types[cmd_resp]); 170 + return ret; 178 171 } 179 - 180 - static DEVICE_ATTR_RW(workload_type); 181 - 182 - static struct attribute *workload_req_attrs[] = { 183 - &dev_attr_workload_available_types.attr, 184 - 
&dev_attr_workload_type.attr, 185 - NULL 186 - }; 187 - 188 - static const struct attribute_group workload_req_attribute_group = { 189 - .attrs = workload_req_attrs, 190 - .name = "workload_request" 191 - }; 192 - 193 - static bool workload_req_created; 194 - 195 - int proc_thermal_mbox_add(struct pci_dev *pdev, struct proc_thermal_device *proc_priv) 196 - { 197 - u64 cmd_resp; 198 - int ret; 199 - 200 - /* Check if there is a mailbox support, if fails return success */ 201 - ret = send_mbox_read_cmd(pdev, MBOX_CMD_WORKLOAD_TYPE_READ, &cmd_resp); 202 - if (ret) 203 - return 0; 204 - 205 - ret = sysfs_create_group(&pdev->dev.kobj, &workload_req_attribute_group); 206 - if (ret) 207 - return ret; 208 - 209 - workload_req_created = true; 210 - 211 - return 0; 212 - } 213 - EXPORT_SYMBOL_GPL(proc_thermal_mbox_add); 214 - 215 - void proc_thermal_mbox_remove(struct pci_dev *pdev) 216 - { 217 - if (workload_req_created) 218 - sysfs_remove_group(&pdev->dev.kobj, &workload_req_attribute_group); 219 - 220 - workload_req_created = false; 221 - 222 - } 223 - EXPORT_SYMBOL_GPL(proc_thermal_mbox_remove); 172 + EXPORT_SYMBOL_NS_GPL(processor_thermal_mbox_interrupt_config, INT340X_THERMAL); 224 173 225 174 MODULE_LICENSE("GPL v2");
+126
drivers/thermal/intel/int340x_thermal/processor_thermal_power_floor.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + /* 3 + * Processor thermal device module for registering and processing 4 + * power floor. When the hardware reduces the power to the minimum 5 + * possible, the power floor is notified via an interrupt. 6 + * 7 + * Operation: 8 + * When user space enables power floor reporting: 9 + * - Use mailbox to: 10 + * Enable processor thermal device interrupt 11 + * 12 + * - Current status of power floor is read from offset 0x5B18 13 + * bit 39. 14 + * 15 + * Two interface functions are provided to call when there is a 16 + * thermal device interrupt: 17 + * - proc_thermal_power_floor_intr(): 18 + * Check if the interrupt is for change in power floor. 19 + * Called from interrupt context. 20 + * 21 + * - proc_thermal_power_floor_intr_callback(): 22 + * Callback for interrupt processing in thread context. This involves 23 + * sending notification to user space that there is a change in the 24 + * power floor status. 25 + * 26 + * Copyright (c) 2023, Intel Corporation. 
27 + */ 28 + 29 + #include <linux/pci.h> 30 + #include "processor_thermal_device.h" 31 + 32 + #define SOC_POWER_FLOOR_STATUS BIT(39) 33 + #define SOC_POWER_FLOOR_SHIFT 39 34 + 35 + #define SOC_POWER_FLOOR_INT_ENABLE_BIT 31 36 + #define SOC_POWER_FLOOR_INT_ACTIVE BIT(3) 37 + 38 + int proc_thermal_read_power_floor_status(struct proc_thermal_device *proc_priv) 39 + { 40 + u64 status = 0; 41 + 42 + status = readq(proc_priv->mmio_base + SOC_WT_RES_INT_STATUS_OFFSET); 43 + return (status & SOC_POWER_FLOOR_STATUS) >> SOC_POWER_FLOOR_SHIFT; 44 + } 45 + EXPORT_SYMBOL_NS_GPL(proc_thermal_read_power_floor_status, INT340X_THERMAL); 46 + 47 + static bool enable_state; 48 + static DEFINE_MUTEX(pf_lock); 49 + 50 + int proc_thermal_power_floor_set_state(struct proc_thermal_device *proc_priv, bool enable) 51 + { 52 + int ret = 0; 53 + 54 + mutex_lock(&pf_lock); 55 + if (enable_state == enable) 56 + goto pf_unlock; 57 + 58 + /* 59 + * Time window parameter is not applicable to power floor interrupt configuration. 60 + * Hence use -1 for time window. 61 + */ 62 + ret = processor_thermal_mbox_interrupt_config(to_pci_dev(proc_priv->dev), enable, 63 + SOC_POWER_FLOOR_INT_ENABLE_BIT, -1); 64 + if (!ret) 65 + enable_state = enable; 66 + 67 + pf_unlock: 68 + mutex_unlock(&pf_lock); 69 + 70 + return ret; 71 + } 72 + EXPORT_SYMBOL_NS_GPL(proc_thermal_power_floor_set_state, INT340X_THERMAL); 73 + 74 + bool proc_thermal_power_floor_get_state(struct proc_thermal_device *proc_priv) 75 + { 76 + return enable_state; 77 + } 78 + EXPORT_SYMBOL_NS_GPL(proc_thermal_power_floor_get_state, INT340X_THERMAL); 79 + 80 + /** 81 + * proc_thermal_check_power_floor_intr() - Check power floor interrupt. 82 + * @proc_priv: Processor thermal device instance. 83 + * 84 + * Callback to check if the interrupt for power floor is active. 85 + * 86 + * Context: Called from interrupt context. 87 + * 88 + * Return: true if power floor is active, false when not active. 
89 + */ 90 + bool proc_thermal_check_power_floor_intr(struct proc_thermal_device *proc_priv) 91 + { 92 + u64 int_status; 93 + 94 + int_status = readq(proc_priv->mmio_base + SOC_WT_RES_INT_STATUS_OFFSET); 95 + return !!(int_status & SOC_POWER_FLOOR_INT_ACTIVE); 96 + } 97 + EXPORT_SYMBOL_NS_GPL(proc_thermal_check_power_floor_intr, INT340X_THERMAL); 98 + 99 + /** 100 + * proc_thermal_power_floor_intr_callback() - Process power floor notification 101 + * @pdev: PCI device instance 102 + * @proc_priv: Processor thermal device instance. 103 + * 104 + * Check if the power floor interrupt is active, if active send notification to 105 + * user space for the attribute "power_limits", so that user can read the attribute 106 + * and take action. 107 + * 108 + * Context: Called from interrupt thread context. 109 + * 110 + * Return: None. 111 + */ 112 + void proc_thermal_power_floor_intr_callback(struct pci_dev *pdev, 113 + struct proc_thermal_device *proc_priv) 114 + { 115 + u64 status; 116 + 117 + status = readq(proc_priv->mmio_base + SOC_WT_RES_INT_STATUS_OFFSET); 118 + if (!(status & SOC_POWER_FLOOR_INT_ACTIVE)) 119 + return; 120 + 121 + sysfs_notify(&pdev->dev.kobj, "power_limits", "power_floor_status"); 122 + } 123 + EXPORT_SYMBOL_NS_GPL(proc_thermal_power_floor_intr_callback, INT340X_THERMAL); 124 + 125 + MODULE_IMPORT_NS(INT340X_THERMAL); 126 + MODULE_LICENSE("GPL");
+255
drivers/thermal/intel/int340x_thermal/processor_thermal_wt_hint.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + /* 3 + * processor thermal device interface for reading workload type hints 4 + * from the user space. The hints are provided by the firmware. 5 + * 6 + * Operation: 7 + * When user space enables workload type prediction: 8 + * - Use mailbox to: 9 + * Configure notification delay 10 + * Enable processor thermal device interrupt 11 + * 12 + * - The predicted workload type can be read from MMIO: 13 + * Offset 0x5B18 shows if there was an interrupt 14 + * active for change in workload type and also 15 + * predicted workload type. 16 + * 17 + * Two interface functions are provided to call when there is a 18 + * thermal device interrupt: 19 + * - proc_thermal_check_wt_intr(): 20 + * Check if the interrupt is for change in workload type. Called from 21 + * interrupt context. 22 + * 23 + * - proc_thermal_wt_intr_callback(): 24 + * Callback for interrupt processing in thread context. This involves 25 + * sending notification to user space that there is a change in the 26 + * workload type. 27 + * 28 + * Copyright (c) 2023, Intel Corporation. 29 + */ 30 + 31 + #include <linux/bitfield.h> 32 + #include <linux/pci.h> 33 + #include "processor_thermal_device.h" 34 + 35 + #define SOC_WT GENMASK_ULL(47, 40) 36 + 37 + #define SOC_WT_PREDICTION_INT_ENABLE_BIT 23 38 + 39 + #define SOC_WT_PREDICTION_INT_ACTIVE BIT(2) 40 + 41 + /* 42 + * Closest possible to 1 Second is 1024 ms with programmed time delay 43 + * of 0x0A. 
44 + */ 45 + static u8 notify_delay = 0x0A; 46 + static u16 notify_delay_ms = 1024; 47 + 48 + static DEFINE_MUTEX(wt_lock); 49 + static u8 wt_enable; 50 + 51 + /* Show current predicted workload type index */ 52 + static ssize_t workload_type_index_show(struct device *dev, 53 + struct device_attribute *attr, 54 + char *buf) 55 + { 56 + struct proc_thermal_device *proc_priv; 57 + struct pci_dev *pdev = to_pci_dev(dev); 58 + u64 status = 0; 59 + int wt; 60 + 61 + mutex_lock(&wt_lock); 62 + if (!wt_enable) { 63 + mutex_unlock(&wt_lock); 64 + return -ENODATA; 65 + } 66 + 67 + proc_priv = pci_get_drvdata(pdev); 68 + 69 + status = readq(proc_priv->mmio_base + SOC_WT_RES_INT_STATUS_OFFSET); 70 + 71 + mutex_unlock(&wt_lock); 72 + 73 + wt = FIELD_GET(SOC_WT, status); 74 + 75 + return sysfs_emit(buf, "%d\n", wt); 76 + } 77 + 78 + static DEVICE_ATTR_RO(workload_type_index); 79 + 80 + static ssize_t workload_hint_enable_show(struct device *dev, 81 + struct device_attribute *attr, 82 + char *buf) 83 + { 84 + return sysfs_emit(buf, "%d\n", wt_enable); 85 + } 86 + 87 + static ssize_t workload_hint_enable_store(struct device *dev, 88 + struct device_attribute *attr, 89 + const char *buf, size_t size) 90 + { 91 + struct pci_dev *pdev = to_pci_dev(dev); 92 + u8 mode; 93 + int ret; 94 + 95 + if (kstrtou8(buf, 10, &mode) || mode > 1) 96 + return -EINVAL; 97 + 98 + mutex_lock(&wt_lock); 99 + 100 + if (mode) 101 + ret = processor_thermal_mbox_interrupt_config(pdev, true, 102 + SOC_WT_PREDICTION_INT_ENABLE_BIT, 103 + notify_delay); 104 + else 105 + ret = processor_thermal_mbox_interrupt_config(pdev, false, 106 + SOC_WT_PREDICTION_INT_ENABLE_BIT, 0); 107 + 108 + if (ret) 109 + goto ret_enable_store; 110 + 111 + ret = size; 112 + wt_enable = mode; 113 + 114 + ret_enable_store: 115 + mutex_unlock(&wt_lock); 116 + 117 + return ret; 118 + } 119 + 120 + static DEVICE_ATTR_RW(workload_hint_enable); 121 + 122 + static ssize_t notification_delay_ms_show(struct device *dev, 123 + struct 
device_attribute *attr, 124 + char *buf) 125 + { 126 + return sysfs_emit(buf, "%u\n", notify_delay_ms); 127 + } 128 + 129 + static ssize_t notification_delay_ms_store(struct device *dev, 130 + struct device_attribute *attr, 131 + const char *buf, size_t size) 132 + { 133 + struct pci_dev *pdev = to_pci_dev(dev); 134 + u16 new_tw; 135 + int ret; 136 + u8 tm; 137 + 138 + /* 139 + * Time window register value: 140 + * Formula: (1 + x/4) * power(2,y) 141 + * x = 2 msbs, that is [30:29] y = 5 [28:24] 142 + * in INTR_CONFIG register. 143 + * The result will be in milli seconds. 144 + * Here, just keep x = 0, and just change y. 145 + * First round up the user value to power of 2 and 146 + * then take log2, to get "y" value to program. 147 + */ 148 + ret = kstrtou16(buf, 10, &new_tw); 149 + if (ret) 150 + return ret; 151 + 152 + if (!new_tw) 153 + return -EINVAL; 154 + 155 + new_tw = roundup_pow_of_two(new_tw); 156 + tm = ilog2(new_tw); 157 + if (tm > 31) 158 + return -EINVAL; 159 + 160 + mutex_lock(&wt_lock); 161 + 162 + /* If the workload hint was already enabled, then update with the new delay */ 163 + if (wt_enable) 164 + ret = processor_thermal_mbox_interrupt_config(pdev, true, 165 + SOC_WT_PREDICTION_INT_ENABLE_BIT, 166 + tm); 167 + 168 + if (!ret) { 169 + ret = size; 170 + notify_delay = tm; 171 + notify_delay_ms = new_tw; 172 + } 173 + 174 + mutex_unlock(&wt_lock); 175 + 176 + return ret; 177 + } 178 + 179 + static DEVICE_ATTR_RW(notification_delay_ms); 180 + 181 + static struct attribute *workload_hint_attrs[] = { 182 + &dev_attr_workload_type_index.attr, 183 + &dev_attr_workload_hint_enable.attr, 184 + &dev_attr_notification_delay_ms.attr, 185 + NULL 186 + }; 187 + 188 + static const struct attribute_group workload_hint_attribute_group = { 189 + .attrs = workload_hint_attrs, 190 + .name = "workload_hint" 191 + }; 192 + 193 + /* 194 + * Callback to check if the interrupt for prediction is active. 195 + * Caution: Called from the interrupt context. 
196 + */ 197 + bool proc_thermal_check_wt_intr(struct proc_thermal_device *proc_priv) 198 + { 199 + u64 int_status; 200 + 201 + int_status = readq(proc_priv->mmio_base + SOC_WT_RES_INT_STATUS_OFFSET); 202 + if (int_status & SOC_WT_PREDICTION_INT_ACTIVE) 203 + return true; 204 + 205 + return false; 206 + } 207 + EXPORT_SYMBOL_NS_GPL(proc_thermal_check_wt_intr, INT340X_THERMAL); 208 + 209 + /* Callback to notify user space */ 210 + void proc_thermal_wt_intr_callback(struct pci_dev *pdev, struct proc_thermal_device *proc_priv) 211 + { 212 + u64 status; 213 + 214 + status = readq(proc_priv->mmio_base + SOC_WT_RES_INT_STATUS_OFFSET); 215 + if (!(status & SOC_WT_PREDICTION_INT_ACTIVE)) 216 + return; 217 + 218 + sysfs_notify(&pdev->dev.kobj, "workload_hint", "workload_type_index"); 219 + } 220 + EXPORT_SYMBOL_NS_GPL(proc_thermal_wt_intr_callback, INT340X_THERMAL); 221 + 222 + static bool workload_hint_created; 223 + 224 + int proc_thermal_wt_hint_add(struct pci_dev *pdev, struct proc_thermal_device *proc_priv) 225 + { 226 + int ret; 227 + 228 + ret = sysfs_create_group(&pdev->dev.kobj, &workload_hint_attribute_group); 229 + if (ret) 230 + return ret; 231 + 232 + workload_hint_created = true; 233 + 234 + return 0; 235 + } 236 + EXPORT_SYMBOL_NS_GPL(proc_thermal_wt_hint_add, INT340X_THERMAL); 237 + 238 + void proc_thermal_wt_hint_remove(struct pci_dev *pdev) 239 + { 240 + mutex_lock(&wt_lock); 241 + if (wt_enable) 242 + processor_thermal_mbox_interrupt_config(pdev, false, 243 + SOC_WT_PREDICTION_INT_ENABLE_BIT, 244 + 0); 245 + mutex_unlock(&wt_lock); 246 + 247 + if (workload_hint_created) 248 + sysfs_remove_group(&pdev->dev.kobj, &workload_hint_attribute_group); 249 + 250 + workload_hint_created = false; 251 + } 252 + EXPORT_SYMBOL_NS_GPL(proc_thermal_wt_hint_remove, INT340X_THERMAL); 253 + 254 + MODULE_IMPORT_NS(INT340X_THERMAL); 255 + MODULE_LICENSE("GPL");
+136
drivers/thermal/intel/int340x_thermal/processor_thermal_wt_req.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + /* 3 + * processor thermal device for Workload type hints 4 + * update from user space 5 + * 6 + * Copyright (c) 2020-2023, Intel Corporation. 7 + */ 8 + 9 + #include <linux/pci.h> 10 + #include "processor_thermal_device.h" 11 + 12 + /* List of workload types */ 13 + static const char * const workload_types[] = { 14 + "none", 15 + "idle", 16 + "semi_active", 17 + "bursty", 18 + "sustained", 19 + "battery_life", 20 + NULL 21 + }; 22 + 23 + static ssize_t workload_available_types_show(struct device *dev, 24 + struct device_attribute *attr, 25 + char *buf) 26 + { 27 + int i = 0; 28 + int ret = 0; 29 + 30 + while (workload_types[i] != NULL) 31 + ret += sprintf(&buf[ret], "%s ", workload_types[i++]); 32 + 33 + ret += sprintf(&buf[ret], "\n"); 34 + 35 + return ret; 36 + } 37 + 38 + static DEVICE_ATTR_RO(workload_available_types); 39 + 40 + static ssize_t workload_type_store(struct device *dev, 41 + struct device_attribute *attr, 42 + const char *buf, size_t count) 43 + { 44 + struct pci_dev *pdev = to_pci_dev(dev); 45 + char str_preference[15]; 46 + u32 data = 0; 47 + ssize_t ret; 48 + 49 + ret = sscanf(buf, "%14s", str_preference); 50 + if (ret != 1) 51 + return -EINVAL; 52 + 53 + ret = match_string(workload_types, -1, str_preference); 54 + if (ret < 0) 55 + return ret; 56 + 57 + ret &= 0xff; 58 + 59 + if (ret) 60 + data = BIT(MBOX_DATA_BIT_VALID) | BIT(MBOX_DATA_BIT_AC_DC); 61 + 62 + data |= ret; 63 + 64 + ret = processor_thermal_send_mbox_write_cmd(pdev, MBOX_CMD_WORKLOAD_TYPE_WRITE, data); 65 + if (ret) 66 + return ret; 67 + 68 + return count; 69 + } 70 + 71 + static ssize_t workload_type_show(struct device *dev, 72 + struct device_attribute *attr, 73 + char *buf) 74 + { 75 + struct pci_dev *pdev = to_pci_dev(dev); 76 + u64 cmd_resp; 77 + int ret; 78 + 79 + ret = processor_thermal_send_mbox_read_cmd(pdev, MBOX_CMD_WORKLOAD_TYPE_READ, &cmd_resp); 80 + if (ret) 81 + return ret; 82 + 83 + cmd_resp &= 0xff; 84 + 85 
+ if (cmd_resp > ARRAY_SIZE(workload_types) - 1) 86 + return -EINVAL; 87 + 88 + return sprintf(buf, "%s\n", workload_types[cmd_resp]); 89 + } 90 + 91 + static DEVICE_ATTR_RW(workload_type); 92 + 93 + static struct attribute *workload_req_attrs[] = { 94 + &dev_attr_workload_available_types.attr, 95 + &dev_attr_workload_type.attr, 96 + NULL 97 + }; 98 + 99 + static const struct attribute_group workload_req_attribute_group = { 100 + .attrs = workload_req_attrs, 101 + .name = "workload_request" 102 + }; 103 + 104 + static bool workload_req_created; 105 + 106 + int proc_thermal_wt_req_add(struct pci_dev *pdev, struct proc_thermal_device *proc_priv) 107 + { 108 + u64 cmd_resp; 109 + int ret; 110 + 111 + /* Check if there is a mailbox support, if fails return success */ 112 + ret = processor_thermal_send_mbox_read_cmd(pdev, MBOX_CMD_WORKLOAD_TYPE_READ, &cmd_resp); 113 + if (ret) 114 + return 0; 115 + 116 + ret = sysfs_create_group(&pdev->dev.kobj, &workload_req_attribute_group); 117 + if (ret) 118 + return ret; 119 + 120 + workload_req_created = true; 121 + 122 + return 0; 123 + } 124 + EXPORT_SYMBOL_GPL(proc_thermal_wt_req_add); 125 + 126 + void proc_thermal_wt_req_remove(struct pci_dev *pdev) 127 + { 128 + if (workload_req_created) 129 + sysfs_remove_group(&pdev->dev.kobj, &workload_req_attribute_group); 130 + 131 + workload_req_created = false; 132 + } 133 + EXPORT_SYMBOL_GPL(proc_thermal_wt_req_remove); 134 + 135 + MODULE_IMPORT_NS(INT340X_THERMAL); 136 + MODULE_LICENSE("GPL");
+1 -1
drivers/thermal/intel/intel_powerclamp.c
··· 256 256 257 257 static const struct kernel_param_ops max_idle_ops = { 258 258 .set = max_idle_set, 259 - .get = param_get_int, 259 + .get = param_get_byte, 260 260 }; 261 261 262 262 module_param_cb(max_idle, &max_idle_ops, &max_idle, 0644);
+2
tools/testing/selftests/Makefile
··· 85 85 TARGETS += sysctl 86 86 TARGETS += tc-testing 87 87 TARGETS += tdx 88 + TARGETS += thermal/intel/power_floor 89 + TARGETS += thermal/intel/workload_hint 88 90 TARGETS += timens 89 91 ifneq (1, $(quicktest)) 90 92 TARGETS += timers
+12
tools/testing/selftests/thermal/intel/power_floor/Makefile
··· 1 + # SPDX-License-Identifier: GPL-2.0 2 + ifndef CROSS_COMPILE 3 + uname_M := $(shell uname -m 2>/dev/null || echo not) 4 + ARCH ?= $(shell echo $(uname_M) | sed -e s/i.86/x86/ -e s/x86_64/x86/) 5 + 6 + ifeq ($(ARCH),x86) 7 + TEST_GEN_PROGS := power_floor_test 8 + 9 + include ../../../lib.mk 10 + 11 + endif 12 + endif
+108
tools/testing/selftests/thermal/intel/power_floor/power_floor_test.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #define _GNU_SOURCE 4 + 5 + #include <stdio.h> 6 + #include <string.h> 7 + #include <stdlib.h> 8 + #include <unistd.h> 9 + #include <fcntl.h> 10 + #include <poll.h> 11 + #include <signal.h> 12 + 13 + #define POWER_FLOOR_ENABLE_ATTRIBUTE "/sys/bus/pci/devices/0000:00:04.0/power_limits/power_floor_enable" 14 + #define POWER_FLOOR_STATUS_ATTRIBUTE "/sys/bus/pci/devices/0000:00:04.0/power_limits/power_floor_status" 15 + 16 + void power_floor_exit(int signum) 17 + { 18 + int fd; 19 + 20 + /* Disable feature via sysfs knob */ 21 + 22 + fd = open(POWER_FLOOR_ENABLE_ATTRIBUTE, O_RDWR); 23 + if (fd < 0) { 24 + perror("Unable to open power floor enable file\n"); 25 + exit(1); 26 + } 27 + 28 + if (write(fd, "0\n", 2) < 0) { 29 + perror("Can't disable power floor notifications\n"); 30 + exit(1); 31 + } 32 + 33 + printf("Disabled power floor notifications\n"); 34 + 35 + close(fd); 36 + } 37 + 38 + int main(int argc, char **argv) 39 + { 40 + struct pollfd ufd; 41 + char status_str[3]; 42 + int fd, ret; 43 + 44 + if (signal(SIGINT, power_floor_exit) == SIG_IGN) 45 + signal(SIGINT, SIG_IGN); 46 + if (signal(SIGHUP, power_floor_exit) == SIG_IGN) 47 + signal(SIGHUP, SIG_IGN); 48 + if (signal(SIGTERM, power_floor_exit) == SIG_IGN) 49 + signal(SIGTERM, SIG_IGN); 50 + 51 + /* Enable feature via sysfs knob */ 52 + fd = open(POWER_FLOOR_ENABLE_ATTRIBUTE, O_RDWR); 53 + if (fd < 0) { 54 + perror("Unable to open power floor enable file\n"); 55 + exit(1); 56 + } 57 + 58 + if (write(fd, "1\n", 2) < 0) { 59 + perror("Can't enable power floor notifications\n"); 60 + exit(1); 61 + } 62 + 63 + close(fd); 64 + 65 + printf("Enabled power floor notifications\n"); 66 + 67 + while (1) { 68 + fd = open(POWER_FLOOR_STATUS_ATTRIBUTE, O_RDONLY); 69 + if (fd < 0) { 70 + perror("Unable to open power floor status file\n"); 71 + exit(1); 72 + } 73 + 74 + if ((lseek(fd, 0L, SEEK_SET)) < 0) { 75 + fprintf(stderr, "Failed to set pointer to beginning\n"); 76 + exit(1); 
77 + } 78 + 79 + if (read(fd, status_str, sizeof(status_str)) < 0) { 80 + fprintf(stderr, "Failed to read from:%s\n", 81 + POWER_FLOOR_STATUS_ATTRIBUTE); 82 + exit(1); 83 + } 84 + 85 + ufd.fd = fd; 86 + ufd.events = POLLPRI; 87 + 88 + ret = poll(&ufd, 1, -1); 89 + if (ret < 0) { 90 + perror("poll error"); 91 + exit(1); 92 + } else if (ret == 0) { 93 + printf("Poll Timeout\n"); 94 + } else { 95 + if ((lseek(fd, 0L, SEEK_SET)) < 0) { 96 + fprintf(stderr, "Failed to set pointer to beginning\n"); 97 + exit(1); 98 + } 99 + 100 + if (read(fd, status_str, sizeof(status_str)) < 0) 101 + exit(0); 102 + 103 + printf("power floor status: %s\n", status_str); 104 + } 105 + 106 + close(fd); 107 + } 108 + }
+12
tools/testing/selftests/thermal/intel/workload_hint/Makefile
··· 1 + # SPDX-License-Identifier: GPL-2.0 2 + ifndef CROSS_COMPILE 3 + uname_M := $(shell uname -m 2>/dev/null || echo not) 4 + ARCH ?= $(shell echo $(uname_M) | sed -e s/i.86/x86/ -e s/x86_64/x86/) 5 + 6 + ifeq ($(ARCH),x86) 7 + TEST_GEN_PROGS := workload_hint_test 8 + 9 + include ../../../lib.mk 10 + 11 + endif 12 + endif
+157
tools/testing/selftests/thermal/intel/workload_hint/workload_hint_test.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #define _GNU_SOURCE 4 + 5 + #include <stdio.h> 6 + #include <string.h> 7 + #include <stdlib.h> 8 + #include <unistd.h> 9 + #include <fcntl.h> 10 + #include <poll.h> 11 + #include <signal.h> 12 + 13 + #define WORKLOAD_NOTIFICATION_DELAY_ATTRIBUTE "/sys/bus/pci/devices/0000:00:04.0/workload_hint/notification_delay_ms" 14 + #define WORKLOAD_ENABLE_ATTRIBUTE "/sys/bus/pci/devices/0000:00:04.0/workload_hint/workload_hint_enable" 15 + #define WORKLOAD_TYPE_INDEX_ATTRIBUTE "/sys/bus/pci/devices/0000:00:04.0/workload_hint/workload_type_index" 16 + 17 + static const char * const workload_types[] = { 18 + "idle", 19 + "battery_life", 20 + "sustained", 21 + "bursty", 22 + NULL 23 + }; 24 + 25 + #define WORKLOAD_TYPE_MAX_INDEX 3 26 + 27 + void workload_hint_exit(int signum) 28 + { 29 + int fd; 30 + 31 + /* Disable feature via sysfs knob */ 32 + 33 + fd = open(WORKLOAD_ENABLE_ATTRIBUTE, O_RDWR); 34 + if (fd < 0) { 35 + perror("Unable to open workload type feature enable file\n"); 36 + exit(1); 37 + } 38 + 39 + if (write(fd, "0\n", 2) < 0) { 40 + perror("Can't disable workload hints\n"); 41 + exit(1); 42 + } 43 + 44 + printf("Disabled workload type prediction\n"); 45 + 46 + close(fd); 47 + } 48 + 49 + int main(int argc, char **argv) 50 + { 51 + struct pollfd ufd; 52 + char index_str[4]; 53 + int fd, ret, index; 54 + char delay_str[64]; 55 + int delay = 0; 56 + 57 + printf("Usage: workload_hint_test [notification delay in milliseconds]\n"); 58 + 59 + if (argc > 1) { 60 + ret = sscanf(argv[1], "%d", &delay); 61 + if (ret != 1) { 62 + printf("Invalid delay\n"); 63 + exit(1); 64 + } 65 + 66 + printf("Setting notification delay to %d ms\n", delay); 67 + if (delay < 0) 68 + exit(1); 69 + 70 + sprintf(delay_str, "%s\n", argv[1]); 71 + 72 + 73 + fd = open(WORKLOAD_NOTIFICATION_DELAY_ATTRIBUTE, O_RDWR); 74 + if (fd < 0) { 75 + perror("Unable to open workload notification delay file\n"); 76 + exit(1); 77 + 
} 78 + 79 + if (write(fd, delay_str, strlen(delay_str)) < 0) { 80 + perror("Can't set delay\n"); 81 + exit(1); 82 + } 83 + 84 + close(fd); 85 + } 86 + 87 + if (signal(SIGINT, workload_hint_exit) == SIG_IGN) 88 + signal(SIGINT, SIG_IGN); 89 + if (signal(SIGHUP, workload_hint_exit) == SIG_IGN) 90 + signal(SIGHUP, SIG_IGN); 91 + if (signal(SIGTERM, workload_hint_exit) == SIG_IGN) 92 + signal(SIGTERM, SIG_IGN); 93 + 94 + /* Enable feature via sysfs knob */ 95 + fd = open(WORKLOAD_ENABLE_ATTRIBUTE, O_RDWR); 96 + if (fd < 0) { 97 + perror("Unable to open workload type feature enable file\n"); 98 + exit(1); 99 + } 100 + 101 + if (write(fd, "1\n", 2) < 0) { 102 + perror("Can't enable workload hints\n"); 103 + exit(1); 104 + } 105 + 106 + close(fd); 107 + 108 + printf("Enabled workload type prediction\n"); 109 + 110 + while (1) { 111 + fd = open(WORKLOAD_TYPE_INDEX_ATTRIBUTE, O_RDONLY); 112 + if (fd < 0) { 113 + perror("Unable to open workload type file\n"); 114 + exit(1); 115 + } 116 + 117 + if ((lseek(fd, 0L, SEEK_SET)) < 0) { 118 + fprintf(stderr, "Failed to set pointer to beginning\n"); 119 + exit(1); 120 + } 121 + 122 + if (read(fd, index_str, sizeof(index_str)) < 0) { 123 + fprintf(stderr, "Failed to read from:%s\n", 124 + WORKLOAD_TYPE_INDEX_ATTRIBUTE); 125 + exit(1); 126 + } 127 + 128 + ufd.fd = fd; 129 + ufd.events = POLLPRI; 130 + 131 + ret = poll(&ufd, 1, -1); 132 + if (ret < 0) { 133 + perror("poll error"); 134 + exit(1); 135 + } else if (ret == 0) { 136 + printf("Poll Timeout\n"); 137 + } else { 138 + if ((lseek(fd, 0L, SEEK_SET)) < 0) { 139 + fprintf(stderr, "Failed to set pointer to beginning\n"); 140 + exit(1); 141 + } 142 + 143 + if (read(fd, index_str, sizeof(index_str)) < 0) 144 + exit(0); 145 + 146 + ret = sscanf(index_str, "%d", &index); 147 + if (ret != 1) 148 + break; 149 + if (index < 0 || index > WORKLOAD_TYPE_MAX_INDEX) 150 + printf("Invalid workload type index\n"); 151 + else 152 + printf("workload type:%s\n", workload_types[index]); 153 + } 154 + 155 + 
close(fd); 156 + } 157 + }