Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/xe/uapi: Introduce API for EU stall sampling

A new hardware feature, first introduced in PVC, provides the capability to
periodically sample EU stall state and record counts for different stall
reasons on a per-IP basis, aggregate them across all EUs in a subslice, and
record the samples in a buffer in each subslice. Eventually, the aggregated
data is written out to a buffer in memory. This feature is also supported
in XE2 and later architecture GPUs.

Use the existing IOCTL DRM_IOCTL_XE_OBSERVATION as the interface from user
space into the driver to do the initial setup and obtain a file descriptor
for the EU stall data stream. The input parameter to the IOCTL is a struct
drm_xe_observation_param in which observation_type should be set to
DRM_XE_OBSERVATION_TYPE_EU_STALL, observation_op should be
DRM_XE_OBSERVATION_OP_STREAM_OPEN, and param should point to a chain of
drm_xe_ext_set_property structures, each of which holds a property and
value pair. The EU stall sampling input properties are defined in the
drm_xe_eu_stall_property_id enum.
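
The property chain described above can be sketched in user space as follows.
This is only an illustration: the ex_* types and names are hypothetical
stand-ins whose layout is assumed to mirror struct drm_xe_user_extension and
struct drm_xe_ext_set_property, and real code should include the uapi header
xe_drm.h instead. The property ids are the values of the enum added by this
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values taken from the enum this patch adds to xe_drm.h. */
#define EX_EXTENSION_SET_PROPERTY	0
#define EX_PROP_GT_ID			1
#define EX_PROP_SAMPLE_RATE		2
#define EX_PROP_WAIT_NUM_REPORTS	3
#define EX_NUM_PROPS			3

/* Hypothetical stand-in; layout assumed to match drm_xe_user_extension. */
struct ex_user_extension {
	uint64_t next_extension;
	uint32_t name;
	uint32_t pad;
};

/* Hypothetical stand-in; layout assumed to match drm_xe_ext_set_property. */
struct ex_ext_set_property {
	struct ex_user_extension base;
	uint32_t property;
	uint32_t pad;
	uint64_t value;
	uint64_t reserved[2];
};

/*
 * Fill chain[] with one entry per EU stall property and link the entries
 * through base.next_extension. Returns the address of the first entry as
 * a u64, which is what the param field of the IOCTL argument would carry.
 */
uint64_t ex_build_chain(struct ex_ext_set_property chain[EX_NUM_PROPS],
			uint64_t gt_id, uint64_t sample_rate_cycles,
			uint64_t wait_num_reports)
{
	const uint32_t ids[EX_NUM_PROPS] = {
		EX_PROP_GT_ID, EX_PROP_SAMPLE_RATE, EX_PROP_WAIT_NUM_REPORTS,
	};
	const uint64_t vals[EX_NUM_PROPS] = {
		gt_id, sample_rate_cycles, wait_num_reports,
	};
	size_t i;

	assert(chain != NULL);

	/* The driver rejects non-zero pad fields, so start from zeroes. */
	memset(chain, 0, sizeof(chain[0]) * EX_NUM_PROPS);

	for (i = 0; i < EX_NUM_PROPS; i++) {
		chain[i].base.name = EX_EXTENSION_SET_PROPERTY;
		chain[i].property = ids[i];
		chain[i].value = vals[i];
		/* The last entry keeps next_extension == 0 to end the chain. */
		if (i + 1 < EX_NUM_PROPS)
			chain[i].base.next_extension =
				(uint64_t)(uintptr_t)&chain[i + 1];
	}

	return (uint64_t)(uintptr_t)&chain[0];
}
```

The returned value would be stored in param, with observation_type and
observation_op set as described above, before calling
DRM_IOCTL_XE_OBSERVATION on the DRM device fd.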

With the file descriptor obtained from DRM_IOCTL_XE_OBSERVATION, user space
can enable and disable EU stall sampling with the IOCTLs:
DRM_XE_OBSERVATION_IOCTL_ENABLE and DRM_XE_OBSERVATION_IOCTL_DISABLE.
User space can also call poll() to check for availability of data in the
buffer. The data can be read with read(). Finally, the file descriptor
can be closed with close().
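
The poll() and read() steps above can be sketched as a small helper. This is
a minimal sketch, not code from the patch: ex_read_when_ready() is a
hypothetical name, it works on any readable file descriptor, and it assumes
the caller has already enabled the stream with
DRM_XE_OBSERVATION_IOCTL_ENABLE.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for stream_fd to become readable, then read once.
 * Returns the number of bytes read, 0 on timeout or when poll() reported
 * no readable data, and -1 with errno set on error.
 */
ssize_t ex_read_when_ready(int stream_fd, void *buf, size_t len,
			   int timeout_ms)
{
	struct pollfd pfd = { .fd = stream_fd, .events = POLLIN };
	int ready;

	assert(buf != NULL);

	/* Restart poll() if a signal interrupts the wait. */
	do {
		ready = poll(&pfd, 1, timeout_ms);
	} while (ready < 0 && errno == EINTR);

	if (ready < 0)
		return -1;
	if (ready == 0 || !(pfd.revents & POLLIN))
		return 0;

	return read(stream_fd, buf, len);
}
```

A caller would typically loop over this helper until it decides to stop
sampling, then issue DRM_XE_OBSERVATION_IOCTL_DISABLE and close() the fd.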

v11: Changed a couple of variables in struct eu_stall_open_properties
     from unsigned int to int.
v10: Use extension number while parsing chain of extensions.
     Remove function description for static functions.
     Move code around as per review feedback.
v9: Changed some u32 to unsigned int.
    Moved some code around as per review feedback from v8.
v8: Used div_u64 instead of / to fix 32-bit build issue.
    Changed copyright year in xe_eu_stall.c/h to 2025.
v7: Renamed input property DRM_XE_EU_STALL_PROP_EVENT_REPORT_COUNT
    to DRM_XE_EU_STALL_PROP_WAIT_NUM_REPORTS to be consistent with
    OA. Renamed the corresponding internal variables.
    Fixed some commit messages based on review feedback.
v6: Change the input sampling rate to GPU cycles instead of
    GPU cycles multiplier.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/bb707a27975c33e4a912b9839b023acb7a1f9c90.1740533885.git.harish.chegondi@intel.com

Authored by Harish Chegondi and committed by Ashutosh Dixit
1537ec85 a2d6f86b

+285
+1
drivers/gpu/drm/xe/Makefile
···
 	xe_device_sysfs.o \
 	xe_dma_buf.o \
 	xe_drm_client.o \
+	xe_eu_stall.o \
 	xe_exec.o \
 	xe_exec_queue.o \
 	xe_execlist.o \
+218
drivers/gpu/drm/xe/xe_eu_stall.c
···
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include <linux/anon_inodes.h>
+#include <linux/fs.h>
+#include <linux/poll.h>
+#include <linux/types.h>
+
+#include <uapi/drm/xe_drm.h>
+
+#include "xe_device.h"
+#include "xe_eu_stall.h"
+#include "xe_gt_printk.h"
+#include "xe_gt_topology.h"
+#include "xe_macros.h"
+#include "xe_observation.h"
+
+/**
+ * struct eu_stall_open_properties - EU stall sampling properties received
+ *				      from user space at open.
+ * @sampling_rate_mult: EU stall sampling rate multiplier.
+ *			HW will sample every (sampling_rate_mult x 251) cycles.
+ * @wait_num_reports: Minimum number of EU stall data reports to unblock poll().
+ * @gt: GT on which EU stall data will be captured.
+ */
+struct eu_stall_open_properties {
+	int sampling_rate_mult;
+	int wait_num_reports;
+	struct xe_gt *gt;
+};
+
+static int set_prop_eu_stall_sampling_rate(struct xe_device *xe, u64 value,
+					   struct eu_stall_open_properties *props)
+{
+	value = div_u64(value, 251);
+	if (value == 0 || value > 7) {
+		drm_dbg(&xe->drm, "Invalid EU stall sampling rate %llu\n", value);
+		return -EINVAL;
+	}
+	props->sampling_rate_mult = value;
+	return 0;
+}
+
+static int set_prop_eu_stall_wait_num_reports(struct xe_device *xe, u64 value,
+					      struct eu_stall_open_properties *props)
+{
+	props->wait_num_reports = value;
+
+	return 0;
+}
+
+static int set_prop_eu_stall_gt_id(struct xe_device *xe, u64 value,
+				   struct eu_stall_open_properties *props)
+{
+	if (value >= xe->info.gt_count) {
+		drm_dbg(&xe->drm, "Invalid GT ID %llu for EU stall sampling\n", value);
+		return -EINVAL;
+	}
+	props->gt = xe_device_get_gt(xe, value);
+	return 0;
+}
+
+typedef int (*set_eu_stall_property_fn)(struct xe_device *xe, u64 value,
+					struct eu_stall_open_properties *props);
+
+static const set_eu_stall_property_fn xe_set_eu_stall_property_funcs[] = {
+	[DRM_XE_EU_STALL_PROP_SAMPLE_RATE] = set_prop_eu_stall_sampling_rate,
+	[DRM_XE_EU_STALL_PROP_WAIT_NUM_REPORTS] = set_prop_eu_stall_wait_num_reports,
+	[DRM_XE_EU_STALL_PROP_GT_ID] = set_prop_eu_stall_gt_id,
+};
+
+static int xe_eu_stall_user_ext_set_property(struct xe_device *xe, u64 extension,
+					     struct eu_stall_open_properties *props)
+{
+	u64 __user *address = u64_to_user_ptr(extension);
+	struct drm_xe_ext_set_property ext;
+	int err;
+	u32 idx;
+
+	err = __copy_from_user(&ext, address, sizeof(ext));
+	if (XE_IOCTL_DBG(xe, err))
+		return -EFAULT;
+
+	if (XE_IOCTL_DBG(xe, ext.property >= ARRAY_SIZE(xe_set_eu_stall_property_funcs)) ||
+	    XE_IOCTL_DBG(xe, ext.pad))
+		return -EINVAL;
+
+	idx = array_index_nospec(ext.property, ARRAY_SIZE(xe_set_eu_stall_property_funcs));
+	return xe_set_eu_stall_property_funcs[idx](xe, ext.value, props);
+}
+
+typedef int (*xe_eu_stall_user_extension_fn)(struct xe_device *xe, u64 extension,
+					     struct eu_stall_open_properties *props);
+static const xe_eu_stall_user_extension_fn xe_eu_stall_user_extension_funcs[] = {
+	[DRM_XE_EU_STALL_EXTENSION_SET_PROPERTY] = xe_eu_stall_user_ext_set_property,
+};
+
+#define MAX_USER_EXTENSIONS	5
+static int xe_eu_stall_user_extensions(struct xe_device *xe, u64 extension,
+				       int ext_number, struct eu_stall_open_properties *props)
+{
+	u64 __user *address = u64_to_user_ptr(extension);
+	struct drm_xe_user_extension ext;
+	int err;
+	u32 idx;
+
+	if (XE_IOCTL_DBG(xe, ext_number >= MAX_USER_EXTENSIONS))
+		return -E2BIG;
+
+	err = __copy_from_user(&ext, address, sizeof(ext));
+	if (XE_IOCTL_DBG(xe, err))
+		return -EFAULT;
+
+	if (XE_IOCTL_DBG(xe, ext.pad) ||
+	    XE_IOCTL_DBG(xe, ext.name >= ARRAY_SIZE(xe_eu_stall_user_extension_funcs)))
+		return -EINVAL;
+
+	idx = array_index_nospec(ext.name, ARRAY_SIZE(xe_eu_stall_user_extension_funcs));
+	err = xe_eu_stall_user_extension_funcs[idx](xe, extension, props);
+	if (XE_IOCTL_DBG(xe, err))
+		return err;
+
+	if (ext.next_extension)
+		return xe_eu_stall_user_extensions(xe, ext.next_extension, ++ext_number, props);
+
+	return 0;
+}
+
+/*
+ * Userspace must enable the EU stall stream with DRM_XE_OBSERVATION_IOCTL_ENABLE
+ * before calling read().
+ */
+static ssize_t xe_eu_stall_stream_read(struct file *file, char __user *buf,
+				       size_t count, loff_t *ppos)
+{
+	ssize_t ret = 0;
+
+	return ret;
+}
+
+static __poll_t xe_eu_stall_stream_poll(struct file *file, poll_table *wait)
+{
+	__poll_t ret = 0;
+
+	return ret;
+}
+
+static long xe_eu_stall_stream_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	return 0;
+}
+
+static int xe_eu_stall_stream_close(struct inode *inode, struct file *file)
+{
+	return 0;
+}
+
+static const struct file_operations fops_eu_stall = {
+	.owner		= THIS_MODULE,
+	.llseek		= noop_llseek,
+	.release	= xe_eu_stall_stream_close,
+	.poll		= xe_eu_stall_stream_poll,
+	.read		= xe_eu_stall_stream_read,
+	.unlocked_ioctl	= xe_eu_stall_stream_ioctl,
+	.compat_ioctl	= xe_eu_stall_stream_ioctl,
+};
+
+static inline bool has_eu_stall_sampling_support(struct xe_device *xe)
+{
+	return false;
+}
+
+/**
+ * xe_eu_stall_stream_open - Open a xe EU stall data stream fd
+ *
+ * @dev: DRM device pointer
+ * @data: pointer to first struct @drm_xe_ext_set_property in
+ *	  the chain of input properties from the user space.
+ * @file: DRM file pointer
+ *
+ * This function opens a EU stall data stream with input properties from
+ * the user space.
+ *
+ * Returns: EU stall data stream fd on success or a negative error code.
+ */
+int xe_eu_stall_stream_open(struct drm_device *dev, u64 data, struct drm_file *file)
+{
+	struct xe_device *xe = to_xe_device(dev);
+	struct eu_stall_open_properties props = {};
+	int ret, stream_fd;
+
+	if (!has_eu_stall_sampling_support(xe)) {
+		drm_dbg(&xe->drm, "EU stall monitoring is not supported on this platform\n");
+		return -ENODEV;
+	}
+
+	if (xe_observation_paranoid && !perfmon_capable()) {
+		drm_dbg(&xe->drm, "Insufficient privileges for EU stall monitoring\n");
+		return -EACCES;
+	}
+
+	ret = xe_eu_stall_user_extensions(xe, data, 0, &props);
+	if (ret)
+		return ret;
+
+	if (!props.gt) {
+		drm_dbg(&xe->drm, "GT ID not provided for EU stall sampling\n");
+		return -EINVAL;
+	}
+
+	stream_fd = anon_inode_getfd("[xe_eu_stall]", &fops_eu_stall, NULL, 0);
+	if (stream_fd < 0)
+		xe_gt_dbg(props.gt, "EU stall inode get fd failed : %d\n", stream_fd);
+
+	return stream_fd;
+}
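
The check in set_prop_eu_stall_sampling_rate() divides the requested cycle
count by 251 and accepts only multipliers 1 through 7, so the accepted inputs
run from 251 to 2007 GPU cycles. A user space pre-check mirroring that
arithmetic might look like the sketch below. The ex_* names are hypothetical
and not part of the driver or the uapi.

```c
#include <assert.h>
#include <stdint.h>

#define EX_SAMPLE_RATE_UNIT	251	/* cycles per multiplier step */
#define EX_SAMPLE_RATE_MAX_MULT	7	/* largest multiplier accepted */

/*
 * Map a sampling rate in GPU cycles to the multiplier the driver would
 * store, using the same integer division as the kernel code above.
 * Returns the multiplier (1..7), or -1 where the driver would return
 * -EINVAL.
 */
int ex_sample_rate_to_mult(uint64_t cycles)
{
	uint64_t mult = cycles / EX_SAMPLE_RATE_UNIT;

	if (mult == 0 || mult > EX_SAMPLE_RATE_MAX_MULT)
		return -1;

	assert(mult >= 1 && mult <= EX_SAMPLE_RATE_MAX_MULT);
	return (int)mult;
}
```

Because the division truncates, any request that is not a multiple of 251 is
rounded down: 500 cycles maps to multiplier 1, which samples every 251
cycles.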
+14
drivers/gpu/drm/xe/xe_eu_stall.h
···
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef __XE_EU_STALL_H__
+#define __XE_EU_STALL_H__
+
+#include "xe_gt_types.h"
+
+int xe_eu_stall_stream_open(struct drm_device *dev,
+			    u64 data,
+			    struct drm_file *file);
+#endif
+14
drivers/gpu/drm/xe/xe_observation.c
···
 
 #include <uapi/drm/xe_drm.h>
 
+#include "xe_eu_stall.h"
 #include "xe_oa.h"
 #include "xe_observation.h"
 
···
 		return xe_oa_add_config_ioctl(dev, arg->param, file);
 	case DRM_XE_OBSERVATION_OP_REMOVE_CONFIG:
 		return xe_oa_remove_config_ioctl(dev, arg->param, file);
+	default:
+		return -EINVAL;
+	}
+}
+
+static int xe_eu_stall_ioctl(struct drm_device *dev, struct drm_xe_observation_param *arg,
+			     struct drm_file *file)
+{
+	switch (arg->observation_op) {
+	case DRM_XE_OBSERVATION_OP_STREAM_OPEN:
+		return xe_eu_stall_stream_open(dev, arg->param, file);
 	default:
 		return -EINVAL;
 	}
···
 	switch (arg->observation_type) {
 	case DRM_XE_OBSERVATION_TYPE_OA:
 		return xe_oa_ioctl(dev, arg, file);
+	case DRM_XE_OBSERVATION_TYPE_EU_STALL:
+		return xe_eu_stall_ioctl(dev, arg, file);
 	default:
 		return -EINVAL;
 	}
+38
include/uapi/drm/xe_drm.h
···
 enum drm_xe_observation_type {
 	/** @DRM_XE_OBSERVATION_TYPE_OA: OA observation stream type */
 	DRM_XE_OBSERVATION_TYPE_OA,
+	/** @DRM_XE_OBSERVATION_TYPE_EU_STALL: EU stall sampling observation stream type */
+	DRM_XE_OBSERVATION_TYPE_EU_STALL,
 };
 
 /**
···
 
 /* ID of the protected content session managed by Xe when PXP is active */
 #define DRM_XE_PXP_HWDRM_DEFAULT_SESSION 0xf
+
+/**
+ * enum drm_xe_eu_stall_property_id - EU stall sampling input property ids.
+ *
+ * These properties are passed to the driver at open as a chain of
+ * @drm_xe_ext_set_property structures with @property set to these
+ * properties' enums and @value set to the corresponding values of these
+ * properties. @drm_xe_user_extension base.name should be set to
+ * @DRM_XE_EU_STALL_EXTENSION_SET_PROPERTY.
+ *
+ * With the file descriptor obtained from open, user space must enable
+ * the EU stall stream fd with @DRM_XE_OBSERVATION_IOCTL_ENABLE before
+ * calling read(). EIO errno from read() indicates HW dropped data
+ * due to full buffer.
+ */
+enum drm_xe_eu_stall_property_id {
+#define DRM_XE_EU_STALL_EXTENSION_SET_PROPERTY		0
+	/**
+	 * @DRM_XE_EU_STALL_PROP_GT_ID: @gt_id of the GT on which
+	 * EU stall data will be captured.
+	 */
+	DRM_XE_EU_STALL_PROP_GT_ID = 1,
+
+	/**
+	 * @DRM_XE_EU_STALL_PROP_SAMPLE_RATE: Sampling rate
+	 * in GPU cycles.
+	 */
+	DRM_XE_EU_STALL_PROP_SAMPLE_RATE,
+
+	/**
+	 * @DRM_XE_EU_STALL_PROP_WAIT_NUM_REPORTS: Minimum number of
+	 * EU stall data reports to be present in the kernel buffer
+	 * before unblocking a blocked poll or read.
+	 */
+	DRM_XE_EU_STALL_PROP_WAIT_NUM_REPORTS,
+};
 
 #if defined(__cplusplus)
 }
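
The uapi comment above states that an EIO errno from read() indicates that
the HW dropped data because the buffer was full. One way a reader loop could
act on that is sketched below. The ex_* names are hypothetical, and treating
EINTR and EAGAIN as retryable is an assumption of this sketch rather than
something the patch specifies. Note that read() is still a stub in this
patch.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/* Hypothetical outcomes a reader loop might distinguish. */
enum ex_read_outcome {
	EX_READ_DATA,		/* nread bytes of stall data are valid */
	EX_READ_DROPPED,	/* EIO: HW dropped data due to full buffer */
	EX_READ_RETRY,		/* assumed transient, call read() again */
	EX_READ_FATAL,		/* any other error, stop reading */
};

/*
 * Classify the result of a read() on the EU stall stream fd.
 * nread is the return value of read(), err is errno captured right after.
 */
enum ex_read_outcome ex_classify_read(ssize_t nread, int err)
{
	if (nread >= 0)
		return EX_READ_DATA;

	switch (err) {
	case EIO:
		return EX_READ_DROPPED;
	case EINTR:
	case EAGAIN:
		return EX_READ_RETRY;
	default:
		return EX_READ_FATAL;
	}
}
```

On EX_READ_DROPPED a tool would usually record that samples were lost and
keep reading, since later reports remain usable.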