Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'drm-xe-next-2024-10-10' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

Cross-subsystem Changes:
- Add drm_line_printer (Michal)

Driver Changes:
- Fix a UAF (Matt Auld)
- Sanity check compression and coherency mode (Matt Auld)
- Some PCI-ID work (Jani)
- Use IS_ENABLED() instead of defined() on config options
- gt powergating work (Riana)
- Suppress missing outer rpm protection warning (Rodrigo)
- Fix a vm leak (Dafna)
- Clean up and update 'has_flat_ccs' handling (Lucas)
- Fix arg to pci_iomap (Lucas)
- Mark reserved engines in snapshot (Lucas)
- Don't keep stale pointer (Michal)
- Fix build warning with CONFIG_PM=n (Arnd)
- Add a xe_bo subtest for shrinking / swapping (Thomas)
- Add a workaround (Tejas)
- Some display PM work (Maarten)
- Enable Xe2 + PES disaggregation (Ashutosh)
- Large xe_mmio rework / cleanup (Matt Roper)
- A couple of fixes / cleanups in the xe client code (Matt Auld)
- Fix page-fault handling on closed VMs (Matt Brost)
- Fix overflow in OA batch buffer (José)
- Style fixes (Lucas, Jiapeng, Nitin)
- Fixes and new development around SRIOV (Michal)
- Use devm_add_action_or_reset() in gt code (He)
- Fix CCS offset calculation (Matt Auld)
- Remove i915_drv.h include (Rodrigo)
- Restore PCI state on resume (Rodrigo)
- Fix DSB buffer coherency / Revert DSB disabling (Maarten / Animesh)
- Convert USM lock to rwsem (Matt Brost)
- Defer gt-mmio initialization (Matt Roper)
- memirq changes (Ilia)
- Move some PVC related code out of xe-for-CI and to the driver (Rodrigo / Jani)
- Use a helper for ASID->VM lookup (Matt Brost)
- Add new PCI id for ARL (Dnyaneshwar)
- Use Xe2_LPM steering tables for Xe2_HPM (Gustavo)
- Performance tuning work for media GT and L3 cache flushing (Gustavo)
- Clean up VM- and exec queue file lock usage (Matt Brost)
- GuC locking fix (Matt Auld)
- Fix UAF around queue destruction (Matt Auld)
- Move IRQ-related registers to dedicated header (Matt Roper)
- Resume TDR after GT reset (Matt Brost)
- Move xa_alloc to prevent UAF (Matt Auld)
- Fix OA stream close (José)
- Remove unused i915_gpu_error.h (Jani)
- Prevent null pointer access in xe_migrate_copy (Zhanjun)
- Fix memory leak when aborting binds (Matt Brost)
- Prevent UAF in send_recv() (Matt Auld)
- Fix xa_store() error checking (Matt Auld)
- Drop irq disabling around xa_erase in guc code (Matt Auld)
- Use fault injection infrastructure to find issues at probe time (Francois)
- Fix a workaround implementation (Vinay)
- Mark wedged_mode debugfs writable (Matt Roper)
- Fix for previous memirq work (Michal)
- More SRIOV work (Michal)
- Devcoredump work (John)
- GuC logging + devcoredump support (John)
- Don't report L3 bank availability on PTL (Shekhar)
- Replicate Xe2 PAT settings on Xe3 (Matt Roper)
- Define Xe3 feature flags (Haridhar)
- Reuse Xe2 MOCS table on PTL (Haridhar)
- Add PTL platform definition (Haridhar)
- Add MCR steering for Xe3 (Matt)
- More work around GuC capture for devcoredump (Zhanjun)
- Improve cache flushing behaviour on bmg (Matt Auld)
- Fix shrinker test compiler warnings on 32-bit (Thomas)
- Initial set of workarounds for Xe3 (Gustavo)
- Extend workaround for xe2lpg (Aradhya)
- Fix unbalanced rpm put x 2 (Matt Auld)

Signed-off-by: Dave Airlie <airlied@redhat.com>

# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQRskUM7w1oG5rx2IZO4FpNVCsYGvwUCZwekBwAKCRC4FpNVCsYG
# v32oAQDnIKVwjZecI1V3oUsy2ZE3TKWx8HH4FweT6S5L6tqZwQD/b0vkeA3UaojO
# 5FIkPEqyHFbrj+Sw7bLonLb3LHv4WAE=
# =FtY6
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 10 Oct 2024 19:53:11 AEST
# gpg: using EDDSA key 6C91433BC35A06E6BC762193B81693550AC606BF
# gpg: Can't check signature: No public key

# Conflicts:
# drivers/gpu/drm/xe/xe_gt_mcr.c
# drivers/gpu/drm/xe/xe_tuning.c
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zwekwrak12c5SSgo@fedora

Diffstat: +6254 -1263
drivers/gpu/drm/drm_print.c (+14)
···
 }
 EXPORT_SYMBOL(__drm_printfn_err);
 
+void __drm_printfn_line(struct drm_printer *p, struct va_format *vaf)
+{
+	unsigned int counter = ++p->line.counter;
+	const char *prefix = p->prefix ?: "";
+	const char *pad = p->prefix ? " " : "";
+
+	if (p->line.series)
+		drm_printf(p->arg, "%s%s%u.%u: %pV",
+			   prefix, pad, p->line.series, counter, vaf);
+	else
+		drm_printf(p->arg, "%s%s%u: %pV", prefix, pad, counter, vaf);
+}
+EXPORT_SYMBOL(__drm_printfn_line);
+
 /**
  * drm_puts - print a const string to a &drm_printer stream
  * @p: the &drm printer
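The new drm_line_printer callback numbers each emitted line as "prefix N:" or, when a series number is set, "prefix S.N:". A minimal userspace sketch of that numbering scheme (not the kernel API; `struct line_printer` and `line_printf` are hypothetical stand-ins for the drm_printer plumbing):

```c
#include <stdio.h>
#include <string.h>

struct line_printer {
	const char *prefix;	/* optional, may be NULL */
	unsigned int series;	/* 0 = no series component */
	unsigned int counter;	/* incremented per line */
};

/* Format one line the way __drm_printfn_line numbers its output */
static void line_printf(struct line_printer *p, char *buf, size_t len,
			const char *msg)
{
	unsigned int counter = ++p->counter;
	const char *prefix = p->prefix ? p->prefix : "";
	const char *pad = p->prefix ? " " : "";

	if (p->series)
		snprintf(buf, len, "%s%s%u.%u: %s",
			 prefix, pad, p->series, counter, msg);
	else
		snprintf(buf, len, "%s%s%u: %s", prefix, pad, counter, msg);
}
```

With a prefix of "GuC" and no series, consecutive calls yield "GuC 1: ...", "GuC 2: ...", which is the property snapshot dumps rely on to keep multi-line output ordered.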
drivers/gpu/drm/i915/display/intel_dsb.c (-4)
···
 	if (!i915->display.params.enable_dsb)
 		return NULL;
 
-	/* TODO: DSB is broken in Xe KMD, so disabling it until fixed */
-	if (!IS_ENABLED(I915))
-		return NULL;
-
 	dsb = kzalloc(sizeof(*dsb), GFP_KERNEL);
 	if (!dsb)
 		goto out;
drivers/gpu/drm/xe/Kconfig.debug (+12)
···
 
	  If in doubt, say "N".
 
+config DRM_XE_DEBUG_MEMIRQ
+	bool "Enable extra memirq debugging"
+	default n
+	help
+	  Choose this option to enable additional debugging info for
+	  memory based interrupts.
+
+	  Recommended for driver developers only.
+
+	  If in doubt, say "N".
+
 config DRM_XE_DEBUG_SRIOV
 	bool "Enable extra SR-IOV debugging"
 	default n
+	select DRM_XE_DEBUG_MEMIRQ
 	help
 	  Enable extra SR-IOV debugging info.
 
drivers/gpu/drm/xe/Makefile (+2)
···
 	xe_gt_topology.o \
 	xe_guc.o \
 	xe_guc_ads.o \
+	xe_guc_capture.o \
 	xe_guc_ct.o \
 	xe_guc_db_mgr.o \
 	xe_guc_hwconfig.o \
···
 	xe_gt_sriov_pf.o \
 	xe_gt_sriov_pf_config.o \
 	xe_gt_sriov_pf_control.o \
+	xe_gt_sriov_pf_migration.o \
 	xe_gt_sriov_pf_monitor.o \
 	xe_gt_sriov_pf_policy.o \
 	xe_gt_sriov_pf_service.o \
drivers/gpu/drm/xe/abi/guc_actions_abi.h (+8)
···
 #define GUC_LOG_CONTROL_VERBOSITY_MASK	(0xF << GUC_LOG_CONTROL_VERBOSITY_SHIFT)
 #define GUC_LOG_CONTROL_DEFAULT_LOGGING	(1 << 8)
 
+enum xe_guc_state_capture_event_status {
+	XE_GUC_STATE_CAPTURE_EVENT_STATUS_SUCCESS = 0x0,
+	XE_GUC_STATE_CAPTURE_EVENT_STATUS_NOSPACE = 0x1,
+};
+
+#define XE_GUC_STATE_CAPTURE_EVENT_STATUS_MASK		0x000000FF
+#define XE_GUC_ACTION_STATE_CAPTURE_NOTIFICATION_DATA_LEN	1
+
 #define XE_GUC_TLB_INVAL_TYPE_SHIFT 0
 #define XE_GUC_TLB_INVAL_MODE_SHIFT 8
 /* Flush PPC or SMRO caches along with TLB invalidation request */
drivers/gpu/drm/xe/abi/guc_actions_sriov_abi.h (+61)
···
 #define VF2GUC_QUERY_SINGLE_KLV_RESPONSE_MSG_2_VALUE64	GUC_HXG_REQUEST_MSG_n_DATAn
 #define VF2GUC_QUERY_SINGLE_KLV_RESPONSE_MSG_3_VALUE96	GUC_HXG_REQUEST_MSG_n_DATAn
 
+/**
+ * DOC: PF2GUC_SAVE_RESTORE_VF
+ *
+ * This message is used by the PF to migrate VF info state maintained by the GuC.
+ *
+ * This message must be sent as `CTB HXG Message`_.
+ *
+ * Available since GuC version 70.25.0
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * |   | Bits  | Description                                                  |
+ * +===+=======+==============================================================+
+ * | 0 |    31 | ORIGIN = GUC_HXG_ORIGIN_HOST_                                |
+ * |   +-------+--------------------------------------------------------------+
+ * |   | 30:28 | TYPE = GUC_HXG_TYPE_REQUEST_                                 |
+ * |   +-------+--------------------------------------------------------------+
+ * |   | 27:16 | DATA0 = **OPCODE** - operation to take:                      |
+ * |   |       |                                                              |
+ * |   |       |   - _`GUC_PF_OPCODE_VF_SAVE` = 0                             |
+ * |   |       |   - _`GUC_PF_OPCODE_VF_RESTORE` = 1                          |
+ * |   +-------+--------------------------------------------------------------+
+ * |   |  15:0 | ACTION = _`GUC_ACTION_PF2GUC_SAVE_RESTORE_VF` = 0x550B       |
+ * +---+-------+--------------------------------------------------------------+
+ * | 1 |  31:0 | **VFID** - VF identifier                                     |
+ * +---+-------+--------------------------------------------------------------+
+ * | 2 |  31:0 | **ADDR_LO** - lower 32-bits of GGTT offset to the buffer     |
+ * |   |       | where the VF info will be save to or restored from.          |
+ * +---+-------+--------------------------------------------------------------+
+ * | 3 |  31:0 | **ADDR_HI** - upper 32-bits of GGTT offset to the buffer     |
+ * |   |       | where the VF info will be save to or restored from.          |
+ * +---+-------+--------------------------------------------------------------+
+ * | 4 |  27:0 | **SIZE** - size of the buffer (in dwords)                    |
+ * |   +-------+--------------------------------------------------------------+
+ * |   | 31:28 | MBZ                                                          |
+ * +---+-------+--------------------------------------------------------------+
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * |   | Bits  | Description                                                  |
+ * +===+=======+==============================================================+
+ * | 0 |    31 | ORIGIN = GUC_HXG_ORIGIN_GUC_                                 |
+ * |   +-------+--------------------------------------------------------------+
+ * |   | 30:28 | TYPE = GUC_HXG_TYPE_RESPONSE_SUCCESS_                        |
+ * |   +-------+--------------------------------------------------------------+
+ * |   |  27:0 | DATA0 = **USED** - size of used buffer space (in dwords)     |
+ * +---+-------+--------------------------------------------------------------+
+ */
+#define GUC_ACTION_PF2GUC_SAVE_RESTORE_VF		0x550Bu
+
+#define PF2GUC_SAVE_RESTORE_VF_REQUEST_MSG_LEN		(GUC_HXG_EVENT_MSG_MIN_LEN + 4u)
+#define PF2GUC_SAVE_RESTORE_VF_REQUEST_MSG_0_OPCODE	GUC_HXG_EVENT_MSG_0_DATA0
+#define   GUC_PF_OPCODE_VF_SAVE				0u
+#define   GUC_PF_OPCODE_VF_RESTORE			1u
+#define PF2GUC_SAVE_RESTORE_VF_REQUEST_MSG_1_VFID	GUC_HXG_EVENT_MSG_n_DATAn
+#define PF2GUC_SAVE_RESTORE_VF_REQUEST_MSG_2_ADDR_LO	GUC_HXG_EVENT_MSG_n_DATAn
+#define PF2GUC_SAVE_RESTORE_VF_REQUEST_MSG_3_ADDR_HI	GUC_HXG_EVENT_MSG_n_DATAn
+#define PF2GUC_SAVE_RESTORE_VF_REQUEST_MSG_4_SIZE	(0xfffffffu << 0)
+#define PF2GUC_SAVE_RESTORE_VF_REQUEST_MSG_4_MBZ	(0xfu << 28)
+
+#define PF2GUC_SAVE_RESTORE_VF_RESPONSE_MSG_LEN		GUC_HXG_RESPONSE_MSG_MIN_LEN
+#define PF2GUC_SAVE_RESTORE_VF_RESPONSE_MSG_0_USED	GUC_HXG_RESPONSE_MSG_0_DATA0
+
 #endif
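The message layout above packs the opcode into bits 27:16 of dword 0, with the 0x550B action in bits 15:0. A minimal userspace sketch of that dword-0 packing (hedged: `pack_msg0` is a hypothetical helper, and the ORIGIN/TYPE encodings come from the HXG header definitions, which are passed in as raw values here):

```c
#include <stdint.h>

#define GUC_ACTION_PF2GUC_SAVE_RESTORE_VF	0x550Bu
#define GUC_PF_OPCODE_VF_SAVE			0u
#define GUC_PF_OPCODE_VF_RESTORE		1u

/* Pack dword 0 of the request per the table above:
 * bit 31 origin, bits 30:28 type, bits 27:16 opcode, bits 15:0 action. */
static uint32_t pack_msg0(uint32_t origin, uint32_t type, uint32_t opcode)
{
	return (origin & 0x1u) << 31 |
	       (type & 0x7u) << 28 |
	       (opcode & 0xfffu) << 16 |
	       GUC_ACTION_PF2GUC_SAVE_RESTORE_VF;
}
```

The action code always lands in the low 16 bits, so the GuC can dispatch on it before decoding the save/restore opcode.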
drivers/gpu/drm/xe/abi/guc_capture_abi.h (new file, +186)
/* SPDX-License-Identifier: MIT */
/*
 * Copyright © 2024 Intel Corporation
 */

#ifndef _ABI_GUC_CAPTURE_ABI_H
#define _ABI_GUC_CAPTURE_ABI_H

#include <linux/types.h>

/* Capture List Index */
enum guc_capture_list_index_type {
	GUC_CAPTURE_LIST_INDEX_PF = 0,
	GUC_CAPTURE_LIST_INDEX_VF = 1,
};

#define GUC_CAPTURE_LIST_INDEX_MAX	(GUC_CAPTURE_LIST_INDEX_VF + 1)

/* Register-types of GuC capture register lists */
enum guc_state_capture_type {
	GUC_STATE_CAPTURE_TYPE_GLOBAL = 0,
	GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS,
	GUC_STATE_CAPTURE_TYPE_ENGINE_INSTANCE
};

#define GUC_STATE_CAPTURE_TYPE_MAX	(GUC_STATE_CAPTURE_TYPE_ENGINE_INSTANCE + 1)

/* Class indecies for capture_class and capture_instance arrays */
enum guc_capture_list_class_type {
	GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE = 0,
	GUC_CAPTURE_LIST_CLASS_VIDEO = 1,
	GUC_CAPTURE_LIST_CLASS_VIDEOENHANCE = 2,
	GUC_CAPTURE_LIST_CLASS_BLITTER = 3,
	GUC_CAPTURE_LIST_CLASS_GSC_OTHER = 4,
};

#define GUC_CAPTURE_LIST_CLASS_MAX	(GUC_CAPTURE_LIST_CLASS_GSC_OTHER + 1)

/**
 * struct guc_mmio_reg - GuC MMIO reg state struct
 *
 * GuC MMIO reg state struct
 */
struct guc_mmio_reg {
	/** @offset: MMIO Offset - filled in by Host */
	u32 offset;
	/** @value: MMIO Value - Used by Firmware to store value */
	u32 value;
	/** @flags: Flags for accessing the MMIO */
	u32 flags;
	/** @mask: Value of a mask to apply if mask with value is set */
	u32 mask;
#define GUC_REGSET_MASKED		BIT(0)
#define GUC_REGSET_STEERING_NEEDED	BIT(1)
#define GUC_REGSET_MASKED_WITH_VALUE	BIT(2)
#define GUC_REGSET_RESTORE_ONLY		BIT(3)
#define GUC_REGSET_STEERING_GROUP	GENMASK(16, 12)
#define GUC_REGSET_STEERING_INSTANCE	GENMASK(23, 20)
} __packed;

/**
 * struct guc_mmio_reg_set - GuC register sets
 *
 * GuC register sets
 */
struct guc_mmio_reg_set {
	/** @address: register address */
	u32 address;
	/** @count: register count */
	u16 count;
	/** @reserved: reserved */
	u16 reserved;
} __packed;

/**
 * struct guc_debug_capture_list_header - Debug capture list header.
 *
 * Debug capture list header.
 */
struct guc_debug_capture_list_header {
	/** @info: contains number of MMIO descriptors in the capture list. */
	u32 info;
#define GUC_CAPTURELISTHDR_NUMDESCR	GENMASK(15, 0)
} __packed;

/**
 * struct guc_debug_capture_list - Debug capture list
 *
 * As part of ADS registration, these header structures (followed by
 * an array of 'struct guc_mmio_reg' entries) are used to register with
 * GuC microkernel the list of registers we want it to dump out prior
 * to a engine reset.
 */
struct guc_debug_capture_list {
	/** @header: Debug capture list header. */
	struct guc_debug_capture_list_header header;
	/** @regs: MMIO descriptors in the capture list. */
	struct guc_mmio_reg regs[];
} __packed;

/**
 * struct guc_state_capture_header_t - State capture header.
 *
 * Prior to resetting engines that have hung or faulted, GuC microkernel
 * reports the engine error-state (register values that was read) by
 * logging them into the shared GuC log buffer using these hierarchy
 * of structures.
 */
struct guc_state_capture_header_t {
	/**
	 * @owner: VFID
	 * BR[ 7: 0] MBZ when SRIOV is disabled. When SRIOV is enabled
	 * VFID is an integer in range [0, 63] where 0 means the state capture
	 * is corresponding to the PF and an integer N in range [1, 63] means
	 * the state capture is for VF N.
	 */
	u32 owner;
#define GUC_STATE_CAPTURE_HEADER_VFID	GENMASK(7, 0)
	/** @info: Engine class/instance and capture type info */
	u32 info;
#define GUC_STATE_CAPTURE_HEADER_CAPTURE_TYPE	GENMASK(3, 0) /* see guc_state_capture_type */
#define GUC_STATE_CAPTURE_HEADER_ENGINE_CLASS	GENMASK(7, 4) /* see guc_capture_list_class_type */
#define GUC_STATE_CAPTURE_HEADER_ENGINE_INSTANCE	GENMASK(11, 8)
	/**
	 * @lrca: logical ring context address.
	 * if type-instance, LRCA (address) that hung, else set to ~0
	 */
	u32 lrca;
	/**
	 * @guc_id: context_index.
	 * if type-instance, context index of hung context, else set to ~0
	 */
	u32 guc_id;
	/** @num_mmio_entries: Number of captured MMIO entries. */
	u32 num_mmio_entries;
#define GUC_STATE_CAPTURE_HEADER_NUM_MMIO_ENTRIES	GENMASK(9, 0)
} __packed;

/**
 * struct guc_state_capture_t - State capture.
 *
 * State capture
 */
struct guc_state_capture_t {
	/** @header: State capture header. */
	struct guc_state_capture_header_t header;
	/** @mmio_entries: Array of captured guc_mmio_reg entries. */
	struct guc_mmio_reg mmio_entries[];
} __packed;

/* State Capture Group Type */
enum guc_state_capture_group_type {
	GUC_STATE_CAPTURE_GROUP_TYPE_FULL = 0,
	GUC_STATE_CAPTURE_GROUP_TYPE_PARTIAL
};

#define GUC_STATE_CAPTURE_GROUP_TYPE_MAX	(GUC_STATE_CAPTURE_GROUP_TYPE_PARTIAL + 1)

/**
 * struct guc_state_capture_group_header_t - State capture group header
 *
 * State capture group header.
 */
struct guc_state_capture_group_header_t {
	/** @owner: VFID */
	u32 owner;
#define GUC_STATE_CAPTURE_GROUP_HEADER_VFID	GENMASK(7, 0)
	/** @info: Engine class/instance and capture type info */
	u32 info;
#define GUC_STATE_CAPTURE_GROUP_HEADER_NUM_CAPTURES	GENMASK(7, 0)
#define GUC_STATE_CAPTURE_GROUP_HEADER_CAPTURE_GROUP_TYPE	GENMASK(15, 8)
} __packed;

/**
 * struct guc_state_capture_group_t - State capture group.
 *
 * this is the top level structure where an error-capture dump starts
 */
struct guc_state_capture_group_t {
	/** @grp_header: State capture group header. */
	struct guc_state_capture_group_header_t grp_header;
	/** @capture_entries: Array of state captures */
	struct guc_state_capture_t capture_entries[];
} __packed;

#endif
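The capture header packs capture type, engine class, and engine instance into one `info` dword (bits 3:0, 7:4, and 11:8 respectively, per the GENMASK definitions above). A minimal userspace sketch of decoding that dword (the helper names here are hypothetical, not the driver's):

```c
#include <stdint.h>

/* Decode the 'info' dword of guc_state_capture_header_t:
 * bits 3:0 capture type, bits 7:4 engine class, bits 11:8 instance. */
static uint32_t capture_type(uint32_t info)	{ return info & 0xfu; }
static uint32_t engine_class(uint32_t info)	{ return (info >> 4) & 0xfu; }
static uint32_t engine_instance(uint32_t info)	{ return (info >> 8) & 0xfu; }
```

So an `info` of 0x212 decodes as capture type 2 (engine-instance), class 1 (video), instance 2, which is how the devcoredump code attributes a register dump to a specific engine.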
drivers/gpu/drm/xe/abi/guc_communication_ctb_abi.h (+1)
···
 #define GUC_CTB_STATUS_OVERFLOW		(1 << 0)
 #define GUC_CTB_STATUS_UNDERFLOW	(1 << 1)
 #define GUC_CTB_STATUS_MISMATCH		(1 << 2)
+#define GUC_CTB_STATUS_DISABLED		(1 << 3)
 	u32 reserved[13];
 } __packed;
 static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
drivers/gpu/drm/xe/abi/guc_log_abi.h (new file, +75)
/* SPDX-License-Identifier: MIT */
/*
 * Copyright © 2024 Intel Corporation
 */

#ifndef _ABI_GUC_LOG_ABI_H
#define _ABI_GUC_LOG_ABI_H

#include <linux/types.h>

/* GuC logging buffer types */
enum guc_log_buffer_type {
	GUC_LOG_BUFFER_CRASH_DUMP,
	GUC_LOG_BUFFER_DEBUG,
	GUC_LOG_BUFFER_CAPTURE,
};

#define GUC_LOG_BUFFER_TYPE_MAX	3

/**
 * struct guc_log_buffer_state - GuC log buffer state
 *
 * Below state structure is used for coordination of retrieval of GuC firmware
 * logs. Separate state is maintained for each log buffer type.
 * read_ptr points to the location where Xe read last in log buffer and
 * is read only for GuC firmware. write_ptr is incremented by GuC with number
 * of bytes written for each log entry and is read only for Xe.
 * When any type of log buffer becomes half full, GuC sends a flush interrupt.
 * GuC firmware expects that while it is writing to 2nd half of the buffer,
 * first half would get consumed by Host and then get a flush completed
 * acknowledgment from Host, so that it does not end up doing any overwrite
 * causing loss of logs. So when buffer gets half filled & Xe has requested
 * for interrupt, GuC will set flush_to_file field, set the sampled_write_ptr
 * to the value of write_ptr and raise the interrupt.
 * On receiving the interrupt Xe should read the buffer, clear flush_to_file
 * field and also update read_ptr with the value of sample_write_ptr, before
 * sending an acknowledgment to GuC. marker & version fields are for internal
 * usage of GuC and opaque to Xe. buffer_full_cnt field is incremented every
 * time GuC detects the log buffer overflow.
 */
struct guc_log_buffer_state {
	/** @marker: buffer state start marker */
	u32 marker[2];
	/** @read_ptr: the last byte offset that was read by KMD previously */
	u32 read_ptr;
	/**
	 * @write_ptr: the next byte offset location that will be written by
	 * GuC
	 */
	u32 write_ptr;
	/** @size: Log buffer size */
	u32 size;
	/**
	 * @sampled_write_ptr: Log buffer write pointer
	 * This is written by GuC to the byte offset of the next free entry in
	 * the buffer on log buffer half full or state capture notification
	 */
	u32 sampled_write_ptr;
	/**
	 * @wrap_offset: wraparound offset
	 * This is the byte offset of location 1 byte after last valid guc log
	 * event entry written by Guc firmware before there was a wraparound.
	 * This field is updated by guc firmware and should be used by Host
	 * when copying buffer contents to file.
	 */
	u32 wrap_offset;
	/** @flags: Flush to file flag and buffer full count */
	u32 flags;
#define GUC_LOG_BUFFER_STATE_FLUSH_TO_FILE	GENMASK(0, 0)
#define GUC_LOG_BUFFER_STATE_BUFFER_FULL_CNT	GENMASK(4, 1)
	/** @version: The Guc-Log-Entry format version */
	u32 version;
} __packed;

#endif
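The doc comment above describes a simple producer/consumer handshake: on a flush interrupt the host consumes up to `sampled_write_ptr`, advances `read_ptr`, and clears the flush flag before acking. A minimal host-side sketch of that step (hedged: `struct log_state` and `consume_log` are illustrative stand-ins, not the driver's types, and the flag is hard-coded to bit 0 per GUC_LOG_BUFFER_STATE_FLUSH_TO_FILE):

```c
#include <stdint.h>

struct log_state {
	uint32_t read_ptr;		/* last offset host consumed */
	uint32_t sampled_write_ptr;	/* snapshot taken by firmware */
	uint32_t flags;			/* bit 0: flush_to_file */
};

/* Consume everything the firmware flagged as ready, then update the
 * shared state so the firmware may reuse that half of the buffer. */
static uint32_t consume_log(struct log_state *s)
{
	uint32_t avail = s->sampled_write_ptr - s->read_ptr;

	s->read_ptr = s->sampled_write_ptr;
	s->flags &= ~1u;	/* clear FLUSH_TO_FILE before acking */
	return avail;		/* bytes the host just copied out */
}
```

Clearing the flag only after updating `read_ptr` is what lets the firmware keep writing the other half of the buffer without overwriting unread logs.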
drivers/gpu/drm/xe/compat-i915-headers/i915_gpu_error.h (deleted, -17)
-/* SPDX-License-Identifier: MIT */
-/*
- * Copyright © 2023 Intel Corporation
- */
-
-#ifndef _I915_GPU_ERROR_H_
-#define _I915_GPU_ERROR_H_
-
-struct drm_i915_error_state_buf;
-
-__printf(2, 3)
-static inline void
-i915_error_printf(struct drm_i915_error_state_buf *e, const char *f, ...)
-{
-}
-
-#endif
drivers/gpu/drm/xe/compat-i915-headers/intel_uncore.h (+18 -18)
···
 #include "xe_device_types.h"
 #include "xe_mmio.h"
 
-static inline struct xe_gt *__compat_uncore_to_gt(struct intel_uncore *uncore)
+static inline struct xe_mmio *__compat_uncore_to_mmio(struct intel_uncore *uncore)
 {
 	struct xe_device *xe = container_of(uncore, struct xe_device, uncore);
 
-	return xe_root_mmio_gt(xe);
+	return xe_root_tile_mmio(xe);
 }
 
 static inline struct xe_tile *__compat_uncore_to_tile(struct intel_uncore *uncore)
···
 {
 	struct xe_reg reg = XE_REG(i915_mmio_reg_offset(i915_reg));
 
-	return xe_mmio_read32(__compat_uncore_to_gt(uncore), reg);
+	return xe_mmio_read32(__compat_uncore_to_mmio(uncore), reg);
 }
 
 static inline u8 intel_uncore_read8(struct intel_uncore *uncore,
···
 {
 	struct xe_reg reg = XE_REG(i915_mmio_reg_offset(i915_reg));
 
-	return xe_mmio_read8(__compat_uncore_to_gt(uncore), reg);
+	return xe_mmio_read8(__compat_uncore_to_mmio(uncore), reg);
 }
 
 static inline u16 intel_uncore_read16(struct intel_uncore *uncore,
···
 {
 	struct xe_reg reg = XE_REG(i915_mmio_reg_offset(i915_reg));
 
-	return xe_mmio_read16(__compat_uncore_to_gt(uncore), reg);
+	return xe_mmio_read16(__compat_uncore_to_mmio(uncore), reg);
 }
 
 static inline u64
···
 	u32 upper, lower, old_upper;
 	int loop = 0;
 
-	upper = xe_mmio_read32(__compat_uncore_to_gt(uncore), upper_reg);
+	upper = xe_mmio_read32(__compat_uncore_to_mmio(uncore), upper_reg);
 	do {
 		old_upper = upper;
-		lower = xe_mmio_read32(__compat_uncore_to_gt(uncore), lower_reg);
-		upper = xe_mmio_read32(__compat_uncore_to_gt(uncore), upper_reg);
+		lower = xe_mmio_read32(__compat_uncore_to_mmio(uncore), lower_reg);
+		upper = xe_mmio_read32(__compat_uncore_to_mmio(uncore), upper_reg);
 	} while (upper != old_upper && loop++ < 2);
 
 	return (u64)upper << 32 | lower;
···
 	struct xe_reg reg = XE_REG(i915_mmio_reg_offset(i915_reg));
 
-	xe_mmio_read32(__compat_uncore_to_gt(uncore), reg);
+	xe_mmio_read32(__compat_uncore_to_mmio(uncore), reg);
 }
 
 static inline void intel_uncore_write(struct intel_uncore *uncore,
···
 {
 	struct xe_reg reg = XE_REG(i915_mmio_reg_offset(i915_reg));
 
-	xe_mmio_write32(__compat_uncore_to_gt(uncore), reg, val);
+	xe_mmio_write32(__compat_uncore_to_mmio(uncore), reg, val);
 }
 
 static inline u32 intel_uncore_rmw(struct intel_uncore *uncore,
···
 {
 	struct xe_reg reg = XE_REG(i915_mmio_reg_offset(i915_reg));
 
-	return xe_mmio_rmw32(__compat_uncore_to_gt(uncore), reg, clear, set);
+	return xe_mmio_rmw32(__compat_uncore_to_mmio(uncore), reg, clear, set);
 }
 
 static inline int intel_wait_for_register(struct intel_uncore *uncore,
···
 {
 	struct xe_reg reg = XE_REG(i915_mmio_reg_offset(i915_reg));
 
-	return xe_mmio_wait32(__compat_uncore_to_gt(uncore), reg, mask, value,
+	return xe_mmio_wait32(__compat_uncore_to_mmio(uncore), reg, mask, value,
 			      timeout * USEC_PER_MSEC, NULL, false);
 }
 
···
 {
 	struct xe_reg reg = XE_REG(i915_mmio_reg_offset(i915_reg));
 
-	return xe_mmio_wait32(__compat_uncore_to_gt(uncore), reg, mask, value,
+	return xe_mmio_wait32(__compat_uncore_to_mmio(uncore), reg, mask, value,
 			      timeout * USEC_PER_MSEC, NULL, false);
 }
 
···
 {
 	struct xe_reg reg = XE_REG(i915_mmio_reg_offset(i915_reg));
 
-	return xe_mmio_wait32(__compat_uncore_to_gt(uncore), reg, mask, value,
+	return xe_mmio_wait32(__compat_uncore_to_mmio(uncore), reg, mask, value,
 			      fast_timeout_us + 1000 * slow_timeout_ms,
 			      out_value, false);
 }
···
 {
 	struct xe_reg reg = XE_REG(i915_mmio_reg_offset(i915_reg));
 
-	return xe_mmio_read32(__compat_uncore_to_gt(uncore), reg);
+	return xe_mmio_read32(__compat_uncore_to_mmio(uncore), reg);
 }
 
 static inline void intel_uncore_write_fw(struct intel_uncore *uncore,
···
 {
 	struct xe_reg reg = XE_REG(i915_mmio_reg_offset(i915_reg));
 
-	xe_mmio_write32(__compat_uncore_to_gt(uncore), reg, val);
+	xe_mmio_write32(__compat_uncore_to_mmio(uncore), reg, val);
 }
 
 static inline u32 intel_uncore_read_notrace(struct intel_uncore *uncore,
···
 {
 	struct xe_reg reg = XE_REG(i915_mmio_reg_offset(i915_reg));
 
-	return xe_mmio_read32(__compat_uncore_to_gt(uncore), reg);
+	return xe_mmio_read32(__compat_uncore_to_mmio(uncore), reg);
 }
 
 static inline void intel_uncore_write_notrace(struct intel_uncore *uncore,
···
 {
 	struct xe_reg reg = XE_REG(i915_mmio_reg_offset(i915_reg));
 
-	xe_mmio_write32(__compat_uncore_to_gt(uncore), reg, val);
+	xe_mmio_write32(__compat_uncore_to_mmio(uncore), reg, val);
 }
 
 static inline void __iomem *intel_uncore_regs(struct intel_uncore *uncore)
drivers/gpu/drm/xe/display/xe_display.c (+74 -21)
···
  */
 
 #include "xe_display.h"
-#include "regs/xe_regs.h"
+#include "regs/xe_irq_regs.h"
 
 #include <linux/fb.h>
 
···
 #include <uapi/drm/xe_drm.h>
 
 #include "soc/intel_dram.h"
-#include "i915_drv.h" /* FIXME: HAS_DISPLAY() depends on this */
 #include "intel_acpi.h"
 #include "intel_audio.h"
 #include "intel_bw.h"
···
 
 static bool has_display(struct xe_device *xe)
 {
-	return HAS_DISPLAY(xe);
+	return HAS_DISPLAY(&xe->display);
 }
 
 /**
···
 }
 
 /* TODO: System and runtime suspend/resume sequences will be sanitized as a follow-up. */
-void xe_display_pm_runtime_suspend(struct xe_device *xe)
-{
-	if (!xe->info.probe_display)
-		return;
-
-	if (xe->d3cold.allowed)
-		xe_display_pm_suspend(xe, true);
-
-	intel_hpd_poll_enable(xe);
-}
-
-void xe_display_pm_suspend(struct xe_device *xe, bool runtime)
+static void __xe_display_pm_suspend(struct xe_device *xe, bool runtime)
 {
 	struct intel_display *display = &xe->display;
 	bool s2idle = suspend_to_idle();
···
 	intel_dmc_suspend(xe);
 }
 
+void xe_display_pm_suspend(struct xe_device *xe)
+{
+	__xe_display_pm_suspend(xe, false);
+}
+
+void xe_display_pm_shutdown(struct xe_device *xe)
+{
+	struct intel_display *display = &xe->display;
+
+	if (!xe->info.probe_display)
+		return;
+
+	intel_power_domains_disable(xe);
+	intel_fbdev_set_suspend(&xe->drm, FBINFO_STATE_SUSPENDED, true);
+	if (has_display(xe)) {
+		drm_kms_helper_poll_disable(&xe->drm);
+		intel_display_driver_disable_user_access(xe);
+		intel_display_driver_suspend(xe);
+	}
+
+	xe_display_flush_cleanup_work(xe);
+	intel_dp_mst_suspend(xe);
+	intel_hpd_cancel_work(xe);
+
+	if (has_display(xe))
+		intel_display_driver_suspend_access(xe);
+
+	intel_encoder_suspend_all(display);
+	intel_encoder_shutdown_all(display);
+
+	intel_opregion_suspend(display, PCI_D3cold);
+
+	intel_dmc_suspend(xe);
+}
+
+void xe_display_pm_runtime_suspend(struct xe_device *xe)
+{
+	if (!xe->info.probe_display)
+		return;
+
+	if (xe->d3cold.allowed)
+		__xe_display_pm_suspend(xe, true);
+
+	intel_hpd_poll_enable(xe);
+}
+
 void xe_display_pm_suspend_late(struct xe_device *xe)
 {
 	bool s2idle = suspend_to_idle();
···
 	intel_display_power_suspend_late(xe);
 }
 
-void xe_display_pm_runtime_resume(struct xe_device *xe)
+void xe_display_pm_shutdown_late(struct xe_device *xe)
 {
 	if (!xe->info.probe_display)
 		return;
 
-	intel_hpd_poll_disable(xe);
-
-	if (xe->d3cold.allowed)
-		xe_display_pm_resume(xe, true);
+	/*
+	 * The only requirement is to reboot with display DC states disabled,
+	 * for now leaving all display power wells in the INIT power domain
+	 * enabled.
+	 */
+	intel_power_domains_driver_remove(xe);
 }
 
 void xe_display_pm_resume_early(struct xe_device *xe)
···
 	intel_power_domains_resume(xe);
 }
 
-void xe_display_pm_resume(struct xe_device *xe, bool runtime)
+static void __xe_display_pm_resume(struct xe_device *xe, bool runtime)
 {
 	struct intel_display *display = &xe->display;
 
···
 
 	intel_power_domains_enable(xe);
 }
+
+void xe_display_pm_resume(struct xe_device *xe)
+{
+	__xe_display_pm_resume(xe, false);
+}
+
+void xe_display_pm_runtime_resume(struct xe_device *xe)
+{
+	if (!xe->info.probe_display)
+		return;
+
+	intel_hpd_poll_disable(xe);
+
+	if (xe->d3cold.allowed)
+		__xe_display_pm_resume(xe, true);
+}
+
 
 static void display_device_remove(struct drm_device *dev, void *arg)
 {
drivers/gpu/drm/xe/display/xe_display.h (+8 -4)
···
 void xe_display_irq_reset(struct xe_device *xe);
 void xe_display_irq_postinstall(struct xe_device *xe, struct xe_gt *gt);
 
-void xe_display_pm_suspend(struct xe_device *xe, bool runtime);
+void xe_display_pm_suspend(struct xe_device *xe);
+void xe_display_pm_shutdown(struct xe_device *xe);
 void xe_display_pm_suspend_late(struct xe_device *xe);
+void xe_display_pm_shutdown_late(struct xe_device *xe);
 void xe_display_pm_resume_early(struct xe_device *xe);
-void xe_display_pm_resume(struct xe_device *xe, bool runtime);
+void xe_display_pm_resume(struct xe_device *xe);
 void xe_display_pm_runtime_suspend(struct xe_device *xe);
 void xe_display_pm_runtime_resume(struct xe_device *xe);
 
···
 static inline void xe_display_irq_reset(struct xe_device *xe) {}
 static inline void xe_display_irq_postinstall(struct xe_device *xe, struct xe_gt *gt) {}
 
-static inline void xe_display_pm_suspend(struct xe_device *xe, bool runtime) {}
+static inline void xe_display_pm_suspend(struct xe_device *xe) {}
+static inline void xe_display_pm_shutdown(struct xe_device *xe) {}
 static inline void xe_display_pm_suspend_late(struct xe_device *xe) {}
+static inline void xe_display_pm_shutdown_late(struct xe_device *xe) {}
 static inline void xe_display_pm_resume_early(struct xe_device *xe) {}
-static inline void xe_display_pm_resume(struct xe_device *xe, bool runtime) {}
+static inline void xe_display_pm_resume(struct xe_device *xe) {}
 static inline void xe_display_pm_runtime_suspend(struct xe_device *xe) {}
 static inline void xe_display_pm_runtime_resume(struct xe_device *xe) {}
 
drivers/gpu/drm/xe/display/xe_dsb_buffer.c (+7 -2)
···
 	if (!vma)
 		return false;
 
+	/* Set scanout flag for WC mapping */
 	obj = xe_bo_create_pin_map(xe, xe_device_get_root_tile(xe),
 				   NULL, PAGE_ALIGN(size),
 				   ttm_bo_type_kernel,
 				   XE_BO_FLAG_VRAM_IF_DGFX(xe_device_get_root_tile(xe)) |
-				   XE_BO_FLAG_GGTT);
+				   XE_BO_FLAG_SCANOUT | XE_BO_FLAG_GGTT);
 	if (IS_ERR(obj)) {
 		kfree(vma);
 		return false;
···
 
 void intel_dsb_buffer_flush_map(struct intel_dsb_buffer *dsb_buf)
 {
-	/* TODO: add xe specific flush_map() for dsb buffer object. */
+	/*
+	 * The memory barrier here is to ensure coherency of DSB vs MMIO,
+	 * both for weak ordering archs and discrete cards.
+	 */
+	xe_device_wmb(dsb_buf->vma->bo->tile->xe);
 }
+1
drivers/gpu/drm/xe/regs/xe_engine_regs.h
··· 186 186 187 187 #define VDBOX_CGCTL3F10(base) XE_REG((base) + 0x3f10) 188 188 #define IECPUNIT_CLKGATE_DIS REG_BIT(22) 189 + #define RAMDFTUNIT_CLKGATE_DIS REG_BIT(9) 189 190 190 191 #define VDBOX_CGCTL3F18(base) XE_REG((base) + 0x3f18) 191 192 #define ALNUNIT_CLKGATE_DIS REG_BIT(13)
+17 -59
drivers/gpu/drm/xe/regs/xe_gt_regs.h
··· 286 286 #define GAMTLBVEBOX0_CLKGATE_DIS REG_BIT(16) 287 287 #define LTCDD_CLKGATE_DIS REG_BIT(10) 288 288 289 + #define UNSLCGCTL9454 XE_REG(0x9454) 290 + #define LSCFE_CLKGATE_DIS REG_BIT(4) 291 + 289 292 #define XEHP_SLICE_UNIT_LEVEL_CLKGATE XE_REG_MCR(0x94d4) 290 293 #define L3_CR2X_CLKGATE_DIS REG_BIT(17) 291 294 #define L3_CLKGATE_DIS REG_BIT(16) ··· 347 344 #define CTC_SOURCE_DIVIDE_LOGIC REG_BIT(0) 348 345 349 346 #define FORCEWAKE_RENDER XE_REG(0xa278) 347 + 348 + #define POWERGATE_DOMAIN_STATUS XE_REG(0xa2a0) 349 + #define MEDIA_SLICE3_AWAKE_STATUS REG_BIT(4) 350 + #define MEDIA_SLICE2_AWAKE_STATUS REG_BIT(3) 351 + #define MEDIA_SLICE1_AWAKE_STATUS REG_BIT(2) 352 + #define RENDER_AWAKE_STATUS REG_BIT(1) 353 + #define MEDIA_SLICE0_AWAKE_STATUS REG_BIT(0) 354 + 350 355 #define FORCEWAKE_MEDIA_VDBOX(n) XE_REG(0xa540 + (n) * 4) 351 356 #define FORCEWAKE_MEDIA_VEBOX(n) XE_REG(0xa560 + (n) * 4) 352 357 #define FORCEWAKE_GSC XE_REG(0xa618) ··· 404 393 405 394 #define XE2_GLOBAL_INVAL XE_REG(0xb404) 406 395 407 - #define SCRATCH1LPFC XE_REG(0xb474) 408 - #define EN_L3_RW_CCS_CACHE_FLUSH REG_BIT(0) 396 + #define XE2LPM_L3SQCREG2 XE_REG_MCR(0xb604) 397 + 398 + #define XE2LPM_L3SQCREG3 XE_REG_MCR(0xb608) 399 + 400 + #define XE2LPM_SCRATCH3_LBCF XE_REG_MCR(0xb654) 409 401 410 402 #define XE2LPM_L3SQCREG2 XE_REG_MCR(0xb604) 411 403 ··· 573 559 #define GT_PERF_STATUS XE_REG(0x1381b4) 574 560 #define VOLTAGE_MASK REG_GENMASK(10, 0) 575 561 576 - /* 577 - * Note: Interrupt registers 1900xx are VF accessible only until version 12.50. 578 - * On newer platforms, VFs are using memory-based interrupts instead. 579 - * However, for simplicity we keep this XE_REG_OPTION_VF tag intact. 
580 - */ 581 - 582 - #define GT_INTR_DW(x) XE_REG(0x190018 + ((x) * 4), XE_REG_OPTION_VF) 583 - #define INTR_GSC REG_BIT(31) 584 - #define INTR_GUC REG_BIT(25) 585 - #define INTR_MGUC REG_BIT(24) 586 - #define INTR_BCS8 REG_BIT(23) 587 - #define INTR_BCS(x) REG_BIT(15 - (x)) 588 - #define INTR_CCS(x) REG_BIT(4 + (x)) 589 - #define INTR_RCS0 REG_BIT(0) 590 - #define INTR_VECS(x) REG_BIT(31 - (x)) 591 - #define INTR_VCS(x) REG_BIT(x) 592 - 593 - #define RENDER_COPY_INTR_ENABLE XE_REG(0x190030, XE_REG_OPTION_VF) 594 - #define VCS_VECS_INTR_ENABLE XE_REG(0x190034, XE_REG_OPTION_VF) 595 - #define GUC_SG_INTR_ENABLE XE_REG(0x190038, XE_REG_OPTION_VF) 596 - #define ENGINE1_MASK REG_GENMASK(31, 16) 597 - #define ENGINE0_MASK REG_GENMASK(15, 0) 598 - #define GPM_WGBOXPERF_INTR_ENABLE XE_REG(0x19003c, XE_REG_OPTION_VF) 599 - #define GUNIT_GSC_INTR_ENABLE XE_REG(0x190044, XE_REG_OPTION_VF) 600 - #define CCS_RSVD_INTR_ENABLE XE_REG(0x190048, XE_REG_OPTION_VF) 601 - 602 - #define INTR_IDENTITY_REG(x) XE_REG(0x190060 + ((x) * 4), XE_REG_OPTION_VF) 603 - #define INTR_DATA_VALID REG_BIT(31) 604 - #define INTR_ENGINE_INSTANCE(x) REG_FIELD_GET(GENMASK(25, 20), x) 605 - #define INTR_ENGINE_CLASS(x) REG_FIELD_GET(GENMASK(18, 16), x) 606 - #define INTR_ENGINE_INTR(x) REG_FIELD_GET(GENMASK(15, 0), x) 607 - #define OTHER_GUC_INSTANCE 0 608 - #define OTHER_GSC_HECI2_INSTANCE 3 609 - #define OTHER_GSC_INSTANCE 6 610 - 611 - #define IIR_REG_SELECTOR(x) XE_REG(0x190070 + ((x) * 4), XE_REG_OPTION_VF) 612 - #define RCS0_RSVD_INTR_MASK XE_REG(0x190090, XE_REG_OPTION_VF) 613 - #define BCS_RSVD_INTR_MASK XE_REG(0x1900a0, XE_REG_OPTION_VF) 614 - #define VCS0_VCS1_INTR_MASK XE_REG(0x1900a8, XE_REG_OPTION_VF) 615 - #define VCS2_VCS3_INTR_MASK XE_REG(0x1900ac, XE_REG_OPTION_VF) 616 - #define VECS0_VECS1_INTR_MASK XE_REG(0x1900d0, XE_REG_OPTION_VF) 617 - #define HECI2_RSVD_INTR_MASK XE_REG(0x1900e4) 618 - #define GUC_SG_INTR_MASK XE_REG(0x1900e8, XE_REG_OPTION_VF) 619 - #define GPM_WGBOXPERF_INTR_MASK 
XE_REG(0x1900ec, XE_REG_OPTION_VF) 620 - #define GUNIT_GSC_INTR_MASK XE_REG(0x1900f4, XE_REG_OPTION_VF) 621 - #define CCS0_CCS1_INTR_MASK XE_REG(0x190100) 622 - #define CCS2_CCS3_INTR_MASK XE_REG(0x190104) 623 - #define XEHPC_BCS1_BCS2_INTR_MASK XE_REG(0x190110) 624 - #define XEHPC_BCS3_BCS4_INTR_MASK XE_REG(0x190114) 625 - #define XEHPC_BCS5_BCS6_INTR_MASK XE_REG(0x190118) 626 - #define XEHPC_BCS7_BCS8_INTR_MASK XE_REG(0x19011c) 627 - #define GT_WAIT_SEMAPHORE_INTERRUPT REG_BIT(11) 628 - #define GT_CONTEXT_SWITCH_INTERRUPT REG_BIT(8) 629 - #define GSC_ER_COMPLETE REG_BIT(5) 630 - #define GT_RENDER_PIPECTL_NOTIFY_INTERRUPT REG_BIT(4) 631 - #define GT_CS_MASTER_ERROR_INTERRUPT REG_BIT(3) 632 - #define GT_RENDER_USER_INTERRUPT REG_BIT(0) 562 + #define SFC_DONE(n) XE_REG(0x1cc000 + (n) * 0x1000) 633 563 634 564 #endif
+1
drivers/gpu/drm/xe/regs/xe_guc_regs.h
··· 84 84 #define HUC_LOADING_AGENT_GUC REG_BIT(1) 85 85 #define GUC_WOPCM_OFFSET_VALID REG_BIT(0) 86 86 #define GUC_MAX_IDLE_COUNT XE_REG(0xc3e4) 87 + #define GUC_PMTIMESTAMP XE_REG(0xc3e8) 87 88 88 89 #define GUC_SEND_INTERRUPT XE_REG(0xc4c8) 89 90 #define GUC_SEND_TRIGGER REG_BIT(0)
+82
drivers/gpu/drm/xe/regs/xe_irq_regs.h
··· 1 + /* SPDX-License-Identifier: MIT */ 2 + /* 3 + * Copyright © 2024 Intel Corporation 4 + */ 5 + #ifndef _XE_IRQ_REGS_H_ 6 + #define _XE_IRQ_REGS_H_ 7 + 8 + #include "regs/xe_reg_defs.h" 9 + 10 + #define PCU_IRQ_OFFSET 0x444e0 11 + #define GU_MISC_IRQ_OFFSET 0x444f0 12 + #define GU_MISC_GSE REG_BIT(27) 13 + 14 + #define DG1_MSTR_TILE_INTR XE_REG(0x190008) 15 + #define DG1_MSTR_IRQ REG_BIT(31) 16 + #define DG1_MSTR_TILE(t) REG_BIT(t) 17 + 18 + #define GFX_MSTR_IRQ XE_REG(0x190010, XE_REG_OPTION_VF) 19 + #define MASTER_IRQ REG_BIT(31) 20 + #define GU_MISC_IRQ REG_BIT(29) 21 + #define DISPLAY_IRQ REG_BIT(16) 22 + #define GT_DW_IRQ(x) REG_BIT(x) 23 + 24 + /* 25 + * Note: Interrupt registers 1900xx are VF accessible only until version 12.50. 26 + * On newer platforms, VFs are using memory-based interrupts instead. 27 + * However, for simplicity we keep this XE_REG_OPTION_VF tag intact. 28 + */ 29 + 30 + #define GT_INTR_DW(x) XE_REG(0x190018 + ((x) * 4), XE_REG_OPTION_VF) 31 + #define INTR_GSC REG_BIT(31) 32 + #define INTR_GUC REG_BIT(25) 33 + #define INTR_MGUC REG_BIT(24) 34 + #define INTR_BCS8 REG_BIT(23) 35 + #define INTR_BCS(x) REG_BIT(15 - (x)) 36 + #define INTR_CCS(x) REG_BIT(4 + (x)) 37 + #define INTR_RCS0 REG_BIT(0) 38 + #define INTR_VECS(x) REG_BIT(31 - (x)) 39 + #define INTR_VCS(x) REG_BIT(x) 40 + 41 + #define RENDER_COPY_INTR_ENABLE XE_REG(0x190030, XE_REG_OPTION_VF) 42 + #define VCS_VECS_INTR_ENABLE XE_REG(0x190034, XE_REG_OPTION_VF) 43 + #define GUC_SG_INTR_ENABLE XE_REG(0x190038, XE_REG_OPTION_VF) 44 + #define ENGINE1_MASK REG_GENMASK(31, 16) 45 + #define ENGINE0_MASK REG_GENMASK(15, 0) 46 + #define GPM_WGBOXPERF_INTR_ENABLE XE_REG(0x19003c, XE_REG_OPTION_VF) 47 + #define GUNIT_GSC_INTR_ENABLE XE_REG(0x190044, XE_REG_OPTION_VF) 48 + #define CCS_RSVD_INTR_ENABLE XE_REG(0x190048, XE_REG_OPTION_VF) 49 + 50 + #define INTR_IDENTITY_REG(x) XE_REG(0x190060 + ((x) * 4), XE_REG_OPTION_VF) 51 + #define INTR_DATA_VALID REG_BIT(31) 52 + #define 
INTR_ENGINE_INSTANCE(x) REG_FIELD_GET(GENMASK(25, 20), x) 53 + #define INTR_ENGINE_CLASS(x) REG_FIELD_GET(GENMASK(18, 16), x) 54 + #define INTR_ENGINE_INTR(x) REG_FIELD_GET(GENMASK(15, 0), x) 55 + #define OTHER_GUC_INSTANCE 0 56 + #define OTHER_GSC_HECI2_INSTANCE 3 57 + #define OTHER_GSC_INSTANCE 6 58 + 59 + #define IIR_REG_SELECTOR(x) XE_REG(0x190070 + ((x) * 4), XE_REG_OPTION_VF) 60 + #define RCS0_RSVD_INTR_MASK XE_REG(0x190090, XE_REG_OPTION_VF) 61 + #define BCS_RSVD_INTR_MASK XE_REG(0x1900a0, XE_REG_OPTION_VF) 62 + #define VCS0_VCS1_INTR_MASK XE_REG(0x1900a8, XE_REG_OPTION_VF) 63 + #define VCS2_VCS3_INTR_MASK XE_REG(0x1900ac, XE_REG_OPTION_VF) 64 + #define VECS0_VECS1_INTR_MASK XE_REG(0x1900d0, XE_REG_OPTION_VF) 65 + #define HECI2_RSVD_INTR_MASK XE_REG(0x1900e4) 66 + #define GUC_SG_INTR_MASK XE_REG(0x1900e8, XE_REG_OPTION_VF) 67 + #define GPM_WGBOXPERF_INTR_MASK XE_REG(0x1900ec, XE_REG_OPTION_VF) 68 + #define GUNIT_GSC_INTR_MASK XE_REG(0x1900f4, XE_REG_OPTION_VF) 69 + #define CCS0_CCS1_INTR_MASK XE_REG(0x190100) 70 + #define CCS2_CCS3_INTR_MASK XE_REG(0x190104) 71 + #define XEHPC_BCS1_BCS2_INTR_MASK XE_REG(0x190110) 72 + #define XEHPC_BCS3_BCS4_INTR_MASK XE_REG(0x190114) 73 + #define XEHPC_BCS5_BCS6_INTR_MASK XE_REG(0x190118) 74 + #define XEHPC_BCS7_BCS8_INTR_MASK XE_REG(0x19011c) 75 + #define GT_WAIT_SEMAPHORE_INTERRUPT REG_BIT(11) 76 + #define GT_CONTEXT_SWITCH_INTERRUPT REG_BIT(8) 77 + #define GSC_ER_COMPLETE REG_BIT(5) 78 + #define GT_RENDER_PIPECTL_NOTIFY_INTERRUPT REG_BIT(4) 79 + #define GT_CS_MASTER_ERROR_INTERRUPT REG_BIT(3) 80 + #define GT_RENDER_USER_INTERRUPT REG_BIT(0) 81 + 82 + #endif
+1 -1
drivers/gpu/drm/xe/regs/xe_reg_defs.h
··· 128 128 * options. 129 129 */ 130 130 #define XE_REG_MCR(r_, ...) ((const struct xe_reg_mcr){ \ 131 - .__reg = XE_REG_INITIALIZER(r_, ##__VA_ARGS__, .mcr = 1) \ 131 + .__reg = XE_REG_INITIALIZER(r_, ##__VA_ARGS__, .mcr = 1) \ 132 132 }) 133 133 134 134 static inline bool xe_reg_is_valid(struct xe_reg r)
-14
drivers/gpu/drm/xe/regs/xe_regs.h
··· 11 11 #define TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK REG_GENMASK(15, 12) 12 12 #define TIMESTAMP_OVERRIDE_US_COUNTER_DIVIDER_MASK REG_GENMASK(9, 0) 13 13 14 - #define PCU_IRQ_OFFSET 0x444e0 15 - #define GU_MISC_IRQ_OFFSET 0x444f0 16 - #define GU_MISC_GSE REG_BIT(27) 17 - 18 14 #define GU_CNTL_PROTECTED XE_REG(0x10100C) 19 15 #define DRIVERINT_FLR_DIS REG_BIT(31) 20 16 ··· 52 56 53 57 #define MTL_MPE_FREQUENCY XE_REG(0x13802c) 54 58 #define MTL_RPE_MASK REG_GENMASK(8, 0) 55 - 56 - #define DG1_MSTR_TILE_INTR XE_REG(0x190008) 57 - #define DG1_MSTR_IRQ REG_BIT(31) 58 - #define DG1_MSTR_TILE(t) REG_BIT(t) 59 - 60 - #define GFX_MSTR_IRQ XE_REG(0x190010, XE_REG_OPTION_VF) 61 - #define MASTER_IRQ REG_BIT(31) 62 - #define GU_MISC_IRQ REG_BIT(29) 63 - #define DISPLAY_IRQ REG_BIT(16) 64 - #define GT_DW_IRQ(x) REG_BIT(x) 65 59 66 60 #define VF_CAP_REG XE_REG(0x1901f8, XE_REG_OPTION_VF) 67 61 #define VF_CAP REG_BIT(0)
+240
drivers/gpu/drm/xe/tests/xe_bo.c
··· 6 6 #include <kunit/test.h> 7 7 #include <kunit/visibility.h> 8 8 9 + #include <linux/iosys-map.h> 10 + #include <linux/math64.h> 11 + #include <linux/random.h> 12 + #include <linux/swap.h> 13 + 14 + #include <uapi/linux/sysinfo.h> 15 + 9 16 #include "tests/xe_kunit_helpers.h" 10 17 #include "tests/xe_pci_test.h" 11 18 #include "tests/xe_test.h" ··· 365 358 evict_test_run_device(xe); 366 359 } 367 360 361 + struct xe_bo_link { 362 + struct list_head link; 363 + struct xe_bo *bo; 364 + u32 val; 365 + }; 366 + 367 + #define XE_BO_SHRINK_SIZE ((unsigned long)SZ_64M) 368 + 369 + static int shrink_test_fill_random(struct xe_bo *bo, struct rnd_state *state, 370 + struct xe_bo_link *link) 371 + { 372 + struct iosys_map map; 373 + int ret = ttm_bo_vmap(&bo->ttm, &map); 374 + size_t __maybe_unused i; 375 + 376 + if (ret) 377 + return ret; 378 + 379 + for (i = 0; i < bo->ttm.base.size; i += sizeof(u32)) { 380 + u32 val = prandom_u32_state(state); 381 + 382 + iosys_map_wr(&map, i, u32, val); 383 + if (i == 0) 384 + link->val = val; 385 + } 386 + 387 + ttm_bo_vunmap(&bo->ttm, &map); 388 + return 0; 389 + } 390 + 391 + static bool shrink_test_verify(struct kunit *test, struct xe_bo *bo, 392 + unsigned int bo_nr, struct rnd_state *state, 393 + struct xe_bo_link *link) 394 + { 395 + struct iosys_map map; 396 + int ret = ttm_bo_vmap(&bo->ttm, &map); 397 + size_t i; 398 + bool failed = false; 399 + 400 + if (ret) { 401 + KUNIT_FAIL(test, "Error mapping bo %u for content check.\n", bo_nr); 402 + return true; 403 + } 404 + 405 + for (i = 0; i < bo->ttm.base.size; i += sizeof(u32)) { 406 + u32 val = prandom_u32_state(state); 407 + 408 + if (iosys_map_rd(&map, i, u32) != val) { 409 + KUNIT_FAIL(test, "Content not preserved, bo %u offset 0x%016llx", 410 + bo_nr, (unsigned long long)i); 411 + kunit_info(test, "Failed value is 0x%08x, recorded 0x%08x\n", 412 + (unsigned int)iosys_map_rd(&map, i, u32), val); 413 + if (i == 0 && val != link->val) 414 + kunit_info(test, "Looks like PRNG 
is out of sync.\n"); 415 + failed = true; 416 + break; 417 + } 418 + } 419 + 420 + ttm_bo_vunmap(&bo->ttm, &map); 421 + 422 + return failed; 423 + } 424 + 425 + /* 426 + * Try to create system bos corresponding to twice the amount 427 + * of available system memory to test shrinker functionality. 428 + * If no swap space is available to accommodate the 429 + * memory overcommit, mark bos purgeable. 430 + */ 431 + static int shrink_test_run_device(struct xe_device *xe) 432 + { 433 + struct kunit *test = kunit_get_current_test(); 434 + LIST_HEAD(bos); 435 + struct xe_bo_link *link, *next; 436 + struct sysinfo si; 437 + u64 ram, ram_and_swap, purgeable = 0, alloced, to_alloc, limit; 438 + unsigned int interrupted = 0, successful = 0, count = 0; 439 + struct rnd_state prng; 440 + u64 rand_seed; 441 + bool failed = false; 442 + 443 + rand_seed = get_random_u64(); 444 + prandom_seed_state(&prng, rand_seed); 445 + kunit_info(test, "Random seed is 0x%016llx.\n", 446 + (unsigned long long)rand_seed); 447 + 448 + /* Skip if execution time is expected to be too long. */ 449 + 450 + limit = SZ_32G; 451 + /* IGFX with flat CCS needs to copy when swapping / shrinking */ 452 + if (!IS_DGFX(xe) && xe_device_has_flat_ccs(xe)) 453 + limit = SZ_16G; 454 + 455 + si_meminfo(&si); 456 + ram = (size_t)si.freeram * si.mem_unit; 457 + if (ram > limit) { 458 + kunit_skip(test, "Too long expected execution time.\n"); 459 + return 0; 460 + } 461 + to_alloc = ram * 2; 462 + 463 + ram_and_swap = ram + get_nr_swap_pages() * PAGE_SIZE; 464 + if (to_alloc > ram_and_swap) 465 + purgeable = to_alloc - ram_and_swap; 466 + purgeable += div64_u64(purgeable, 5); 467 + 468 + kunit_info(test, "Free ram is %lu bytes. 
Will allocate twice of that.\n", 469 + (unsigned long)ram); 470 + for (alloced = 0; alloced < to_alloc; alloced += XE_BO_SHRINK_SIZE) { 471 + struct xe_bo *bo; 472 + unsigned int mem_type; 473 + struct xe_ttm_tt *xe_tt; 474 + 475 + link = kzalloc(sizeof(*link), GFP_KERNEL); 476 + if (!link) { 477 + KUNIT_FAIL(test, "Unexpected link allocation failure\n"); 478 + failed = true; 479 + break; 480 + } 481 + 482 + INIT_LIST_HEAD(&link->link); 483 + 484 + /* We can create bos using WC caching here. But it is slower. */ 485 + bo = xe_bo_create_user(xe, NULL, NULL, XE_BO_SHRINK_SIZE, 486 + DRM_XE_GEM_CPU_CACHING_WB, 487 + XE_BO_FLAG_SYSTEM); 488 + if (IS_ERR(bo)) { 489 + if (bo != ERR_PTR(-ENOMEM) && bo != ERR_PTR(-ENOSPC) && 490 + bo != ERR_PTR(-EINTR) && bo != ERR_PTR(-ERESTARTSYS)) 491 + KUNIT_FAIL(test, "Error creating bo: %pe\n", bo); 492 + kfree(link); 493 + failed = true; 494 + break; 495 + } 496 + xe_bo_lock(bo, false); 497 + xe_tt = container_of(bo->ttm.ttm, typeof(*xe_tt), ttm); 498 + 499 + /* 500 + * Allocate purgeable bos first, because if we do it the 501 + * other way around, they may not be subject to swapping... 502 + */ 503 + if (alloced < purgeable) { 504 + xe_tt->purgeable = true; 505 + bo->ttm.priority = 0; 506 + } else { 507 + int ret = shrink_test_fill_random(bo, &prng, link); 508 + 509 + if (ret) { 510 + xe_bo_unlock(bo); 511 + xe_bo_put(bo); 512 + KUNIT_FAIL(test, "Error filling bo with random data: %pe\n", 513 + ERR_PTR(ret)); 514 + kfree(link); 515 + failed = true; 516 + break; 517 + } 518 + } 519 + 520 + mem_type = bo->ttm.resource->mem_type; 521 + xe_bo_unlock(bo); 522 + link->bo = bo; 523 + list_add_tail(&link->link, &bos); 524 + 525 + if (mem_type != XE_PL_TT) { 526 + KUNIT_FAIL(test, "Bo in incorrect memory type: %u\n", 527 + bo->ttm.resource->mem_type); 528 + failed = true; 529 + } 530 + cond_resched(); 531 + if (signal_pending(current)) 532 + break; 533 + } 534 + 535 + /* 536 + * Read back and destroy bos. 
Reset the pseudo-random seed to get an 537 + * identical pseudo-random number sequence for readback. 538 + */ 539 + prandom_seed_state(&prng, rand_seed); 540 + list_for_each_entry_safe(link, next, &bos, link) { 541 + static struct ttm_operation_ctx ctx = {.interruptible = true}; 542 + struct xe_bo *bo = link->bo; 543 + struct xe_ttm_tt *xe_tt; 544 + int ret; 545 + 546 + count++; 547 + if (!signal_pending(current) && !failed) { 548 + bool purgeable, intr = false; 549 + 550 + xe_bo_lock(bo, NULL); 551 + 552 + /* xe_tt->purgeable is cleared on validate. */ 553 + xe_tt = container_of(bo->ttm.ttm, typeof(*xe_tt), ttm); 554 + purgeable = xe_tt->purgeable; 555 + do { 556 + ret = ttm_bo_validate(&bo->ttm, &tt_placement, &ctx); 557 + if (ret == -EINTR) 558 + intr = true; 559 + } while (ret == -EINTR && !signal_pending(current)); 560 + 561 + if (!ret && !purgeable) 562 + failed = shrink_test_verify(test, bo, count, &prng, link); 563 + 564 + xe_bo_unlock(bo); 565 + if (ret) { 566 + KUNIT_FAIL(test, "Validation failed: %pe\n", 567 + ERR_PTR(ret)); 568 + failed = true; 569 + } else if (intr) { 570 + interrupted++; 571 + } else { 572 + successful++; 573 + } 574 + } 575 + xe_bo_put(link->bo); 576 + list_del(&link->link); 577 + kfree(link); 578 + } 579 + kunit_info(test, "Readbacks interrupted: %u successful: %u\n", 580 + interrupted, successful); 581 + 582 + return 0; 583 + } 584 + 585 + static void xe_bo_shrink_kunit(struct kunit *test) 586 + { 587 + struct xe_device *xe = test->priv; 588 + 589 + shrink_test_run_device(xe); 590 + } 591 + 368 592 static struct kunit_case xe_bo_tests[] = { 369 593 KUNIT_CASE_PARAM(xe_ccs_migrate_kunit, xe_pci_live_device_gen_param), 370 594 KUNIT_CASE_PARAM(xe_bo_evict_kunit, xe_pci_live_device_gen_param), 595 + KUNIT_CASE_PARAM_ATTR(xe_bo_shrink_kunit, xe_pci_live_device_gen_param, 596 + {.speed = KUNIT_SPEED_SLOW}), 371 597 {} 372 598 }; 373 599
+2 -2
drivers/gpu/drm/xe/tests/xe_mocs.c
··· 55 55 if (regs_are_mcr(gt)) 56 56 reg_val = xe_gt_mcr_unicast_read_any(gt, XEHP_LNCFCMOCS(i >> 1)); 57 57 else 58 - reg_val = xe_mmio_read32(gt, XELP_LNCFCMOCS(i >> 1)); 58 + reg_val = xe_mmio_read32(&gt->mmio, XELP_LNCFCMOCS(i >> 1)); 59 59 60 60 mocs_dbg(gt, "reg_val=0x%x\n", reg_val); 61 61 } else { ··· 94 94 if (regs_are_mcr(gt)) 95 95 reg_val = xe_gt_mcr_unicast_read_any(gt, XEHP_GLOBAL_MOCS(i)); 96 96 else 97 - reg_val = xe_mmio_read32(gt, XELP_GLOBAL_MOCS(i)); 97 + reg_val = xe_mmio_read32(&gt->mmio, XELP_GLOBAL_MOCS(i)); 98 98 99 99 mocs_expected = get_entry_control(info, i); 100 100 mocs = reg_val;
+1 -1
drivers/gpu/drm/xe/xe_assert.h
··· 10 10 11 11 #include <drm/drm_print.h> 12 12 13 - #include "xe_device_types.h" 13 + #include "xe_gt_types.h" 14 14 #include "xe_step.h" 15 15 16 16 /**
+32 -2
drivers/gpu/drm/xe/xe_bo.c
··· 283 283 struct device *dev; 284 284 struct sg_table sgt; 285 285 struct sg_table *sg; 286 + /** @purgeable: Whether the content of the pages of @ttm is purgeable. */ 287 + bool purgeable; 286 288 }; 287 289 288 290 static int xe_tt_map_sg(struct ttm_tt *tt) ··· 470 468 mem->bus.offset += vram->io_start; 471 469 mem->bus.is_iomem = true; 472 470 473 - #if !defined(CONFIG_X86) 471 + #if !IS_ENABLED(CONFIG_X86) 474 472 mem->bus.caching = ttm_write_combined; 475 473 #endif 476 474 return 0; ··· 763 761 if (xe_rpm_reclaim_safe(xe)) { 764 762 /* 765 763 * We might be called through swapout in the validation path of 766 - * another TTM device, so unconditionally acquire rpm here. 764 + * another TTM device, so acquire rpm here. 767 765 */ 768 766 xe_pm_runtime_get(xe); 769 767 } else { ··· 1084 1082 } 1085 1083 } 1086 1084 1085 + static void xe_ttm_bo_purge(struct ttm_buffer_object *ttm_bo, struct ttm_operation_ctx *ctx) 1086 + { 1087 + struct xe_device *xe = ttm_to_xe_device(ttm_bo->bdev); 1088 + 1089 + if (ttm_bo->ttm) { 1090 + struct ttm_placement place = {}; 1091 + int ret = ttm_bo_validate(ttm_bo, &place, ctx); 1092 + 1093 + drm_WARN_ON(&xe->drm, ret); 1094 + } 1095 + } 1096 + 1097 + static void xe_ttm_bo_swap_notify(struct ttm_buffer_object *ttm_bo) 1098 + { 1099 + struct ttm_operation_ctx ctx = { 1100 + .interruptible = false 1101 + }; 1102 + 1103 + if (ttm_bo->ttm) { 1104 + struct xe_ttm_tt *xe_tt = 1105 + container_of(ttm_bo->ttm, struct xe_ttm_tt, ttm); 1106 + 1107 + if (xe_tt->purgeable) 1108 + xe_ttm_bo_purge(ttm_bo, &ctx); 1109 + } 1110 + } 1111 + 1087 1112 const struct ttm_device_funcs xe_ttm_funcs = { 1088 1113 .ttm_tt_create = xe_ttm_tt_create, 1089 1114 .ttm_tt_populate = xe_ttm_tt_populate, ··· 1123 1094 .release_notify = xe_ttm_bo_release_notify, 1124 1095 .eviction_valuable = ttm_bo_eviction_valuable, 1125 1096 .delete_mem_notify = xe_ttm_bo_delete_mem_notify, 1097 + .swap_notify = xe_ttm_bo_swap_notify, 1126 1098 }; 1127 1099 1128 1100 static void 
xe_ttm_bo_destroy(struct ttm_buffer_object *ttm_bo)
+1 -1
drivers/gpu/drm/xe/xe_debugfs.c
··· 187 187 debugfs_create_file("forcewake_all", 0400, root, xe, 188 188 &forcewake_all_fops); 189 189 190 - debugfs_create_file("wedged_mode", 0400, root, xe, 190 + debugfs_create_file("wedged_mode", 0600, root, xe, 191 191 &wedged_mode_fops); 192 192 193 193 for (mem_type = XE_PL_VRAM0; mem_type <= XE_PL_VRAM1; ++mem_type) {
+127 -30
drivers/gpu/drm/xe/xe_devcoredump.c
··· 6 6 #include "xe_devcoredump.h" 7 7 #include "xe_devcoredump_types.h" 8 8 9 + #include <linux/ascii85.h> 9 10 #include <linux/devcoredump.h> 10 11 #include <generated/utsrelease.h> 11 12 ··· 17 16 #include "xe_force_wake.h" 18 17 #include "xe_gt.h" 19 18 #include "xe_gt_printk.h" 19 + #include "xe_guc_capture.h" 20 20 #include "xe_guc_ct.h" 21 + #include "xe_guc_log.h" 21 22 #include "xe_guc_submit.h" 22 23 #include "xe_hw_engine.h" 24 + #include "xe_module.h" 23 25 #include "xe_sched_job.h" 24 26 #include "xe_vm.h" 25 27 ··· 89 85 90 86 p = drm_coredump_printer(&iter); 91 87 92 - drm_printf(&p, "**** Xe Device Coredump ****\n"); 93 - drm_printf(&p, "kernel: " UTS_RELEASE "\n"); 94 - drm_printf(&p, "module: " KBUILD_MODNAME "\n"); 88 + drm_puts(&p, "**** Xe Device Coredump ****\n"); 89 + drm_puts(&p, "kernel: " UTS_RELEASE "\n"); 90 + drm_puts(&p, "module: " KBUILD_MODNAME "\n"); 95 91 96 92 ts = ktime_to_timespec64(ss->snapshot_time); 97 93 drm_printf(&p, "Snapshot time: %lld.%09ld\n", ts.tv_sec, ts.tv_nsec); ··· 100 96 drm_printf(&p, "Process: %s\n", ss->process_name); 101 97 xe_device_snapshot_print(xe, &p); 102 98 103 - drm_printf(&p, "\n**** GuC CT ****\n"); 104 - xe_guc_ct_snapshot_print(coredump->snapshot.ct, &p); 105 - xe_guc_exec_queue_snapshot_print(coredump->snapshot.ge, &p); 99 + drm_printf(&p, "\n**** GT #%d ****\n", ss->gt->info.id); 100 + drm_printf(&p, "\tTile: %d\n", ss->gt->tile->id); 106 101 107 - drm_printf(&p, "\n**** Job ****\n"); 108 - xe_sched_job_snapshot_print(coredump->snapshot.job, &p); 102 + drm_puts(&p, "\n**** GuC Log ****\n"); 103 + xe_guc_log_snapshot_print(ss->guc.log, &p); 104 + drm_puts(&p, "\n**** GuC CT ****\n"); 105 + xe_guc_ct_snapshot_print(ss->guc.ct, &p); 109 106 110 - drm_printf(&p, "\n**** HW Engines ****\n"); 107 + drm_puts(&p, "\n**** Contexts ****\n"); 108 + xe_guc_exec_queue_snapshot_print(ss->ge, &p); 109 + 110 + drm_puts(&p, "\n**** Job ****\n"); 111 + xe_sched_job_snapshot_print(ss->job, &p); 112 + 113 + 
drm_puts(&p, "\n**** HW Engines ****\n"); 111 114 for (i = 0; i < XE_NUM_HW_ENGINES; i++) 112 - if (coredump->snapshot.hwe[i]) 113 - xe_hw_engine_snapshot_print(coredump->snapshot.hwe[i], 114 - &p); 115 - drm_printf(&p, "\n**** VM state ****\n"); 116 - xe_vm_snapshot_print(coredump->snapshot.vm, &p); 115 + if (ss->hwe[i]) 116 + xe_engine_snapshot_print(ss->hwe[i], &p); 117 + 118 + drm_puts(&p, "\n**** VM state ****\n"); 119 + xe_vm_snapshot_print(ss->vm, &p); 117 120 118 121 return count - iter.remain; 119 122 } ··· 129 118 { 130 119 int i; 131 120 132 - xe_guc_ct_snapshot_free(ss->ct); 133 - ss->ct = NULL; 121 + xe_guc_log_snapshot_free(ss->guc.log); 122 + ss->guc.log = NULL; 123 + 124 + xe_guc_ct_snapshot_free(ss->guc.ct); 125 + ss->guc.ct = NULL; 126 + 127 + xe_guc_capture_put_matched_nodes(&ss->gt->uc.guc); 128 + ss->matched_node = NULL; 134 129 135 130 xe_guc_exec_queue_snapshot_free(ss->ge); 136 131 ss->ge = NULL; ··· 221 204 /* To prevent stale data on next snapshot, clear everything */ 222 205 memset(&coredump->snapshot, 0, sizeof(coredump->snapshot)); 223 206 coredump->captured = false; 207 + coredump->job = NULL; 224 208 drm_info(&coredump_to_xe(coredump)->drm, 225 209 "Xe device coredump has been deleted.\n"); 226 210 } ··· 232 214 struct xe_devcoredump_snapshot *ss = &coredump->snapshot; 233 215 struct xe_exec_queue *q = job->q; 234 216 struct xe_guc *guc = exec_queue_to_guc(q); 235 - struct xe_hw_engine *hwe; 236 - enum xe_hw_engine_id id; 237 217 u32 adj_logical_mask = q->logical_mask; 238 218 u32 width_mask = (0x1 << q->width) - 1; 239 219 const char *process_name = "no process"; ··· 247 231 strscpy(ss->process_name, process_name); 248 232 249 233 ss->gt = q->gt; 234 + coredump->job = job; 250 235 INIT_WORK(&ss->work, xe_devcoredump_deferred_snap_work); 251 236 252 237 cookie = dma_fence_begin_signalling(); ··· 264 247 if (xe_force_wake_get(gt_to_fw(q->gt), XE_FORCEWAKE_ALL)) 265 248 xe_gt_info(ss->gt, "failed to get forcewake for coredump 
capture\n"); 266 249 267 - coredump->snapshot.ct = xe_guc_ct_snapshot_capture(&guc->ct, true); 268 - coredump->snapshot.ge = xe_guc_exec_queue_snapshot_capture(q); 269 - coredump->snapshot.job = xe_sched_job_snapshot_capture(job); 270 - coredump->snapshot.vm = xe_vm_snapshot_capture(q->vm); 250 + ss->guc.log = xe_guc_log_snapshot_capture(&guc->log, true); 251 + ss->guc.ct = xe_guc_ct_snapshot_capture(&guc->ct, true); 252 + ss->ge = xe_guc_exec_queue_snapshot_capture(q); 253 + ss->job = xe_sched_job_snapshot_capture(job); 254 + ss->vm = xe_vm_snapshot_capture(q->vm); 271 255 272 - for_each_hw_engine(hwe, q->gt, id) { 273 - if (hwe->class != q->hwe->class || 274 - !(BIT(hwe->logical_instance) & adj_logical_mask)) { 275 - coredump->snapshot.hwe[id] = NULL; 276 - continue; 277 - } 278 - coredump->snapshot.hwe[id] = xe_hw_engine_snapshot_capture(hwe); 279 - } 256 + xe_engine_snapshot_capture_for_job(job); 280 257 281 258 queue_work(system_unbound_wq, &ss->work); 282 259 ··· 321 310 } 322 311 323 312 #endif 313 + 314 + /** 315 + * xe_print_blob_ascii85 - print a BLOB to some useful location in ASCII85 316 + * 317 + * The output is split to multiple lines because some print targets, e.g. dmesg 318 + * cannot handle arbitrarily long lines. Note also that printing to dmesg in 319 + * piece-meal fashion is not possible, each separate call to drm_puts() has a 320 + * line-feed automatically added! Therefore, the entire output line must be 321 + * constructed in a local buffer first, then printed in one atomic output call. 322 + * 323 + * There is also a scheduler yield call to prevent the 'task has been stuck for 324 + * 120s' kernel hang check feature from firing when printing to a slow target 325 + * such as dmesg over a serial port. 326 + * 327 + * TODO: Add compression prior to the ASCII85 encoding to shrink huge buffers down. 
328 + * 329 + * @p: the printer object to output to 330 + * @prefix: optional prefix to add to output string 331 + * @blob: the Binary Large OBject to dump out 332 + * @offset: offset in bytes to skip from the front of the BLOB, must be a multiple of sizeof(u32) 333 + * @size: the size in bytes of the BLOB, must be a multiple of sizeof(u32) 334 + */ 335 + void xe_print_blob_ascii85(struct drm_printer *p, const char *prefix, 336 + const void *blob, size_t offset, size_t size) 337 + { 338 + const u32 *blob32 = (const u32 *)blob; 339 + char buff[ASCII85_BUFSZ], *line_buff; 340 + size_t line_pos = 0; 341 + 342 + #define DMESG_MAX_LINE_LEN 800 343 + #define MIN_SPACE (ASCII85_BUFSZ + 2) /* 85 + "\n\0" */ 344 + 345 + if (size & 3) 346 + drm_printf(p, "Size not word aligned: %zu", size); 347 + if (offset & 3) 348 + drm_printf(p, "Offset not word aligned: %zu", size); 349 + 350 + line_buff = kzalloc(DMESG_MAX_LINE_LEN, GFP_KERNEL); 351 + if (IS_ERR_OR_NULL(line_buff)) { 352 + drm_printf(p, "Failed to allocate line buffer: %pe", line_buff); 353 + return; 354 + } 355 + 356 + blob32 += offset / sizeof(*blob32); 357 + size /= sizeof(*blob32); 358 + 359 + if (prefix) { 360 + strscpy(line_buff, prefix, DMESG_MAX_LINE_LEN - MIN_SPACE - 2); 361 + line_pos = strlen(line_buff); 362 + 363 + line_buff[line_pos++] = ':'; 364 + line_buff[line_pos++] = ' '; 365 + } 366 + 367 + while (size--) { 368 + u32 val = *(blob32++); 369 + 370 + strscpy(line_buff + line_pos, ascii85_encode(val, buff), 371 + DMESG_MAX_LINE_LEN - line_pos); 372 + line_pos += strlen(line_buff + line_pos); 373 + 374 + if ((line_pos + MIN_SPACE) >= DMESG_MAX_LINE_LEN) { 375 + line_buff[line_pos++] = '\n'; 376 + line_buff[line_pos++] = 0; 377 + 378 + drm_puts(p, line_buff); 379 + 380 + line_pos = 0; 381 + 382 + /* Prevent 'stuck thread' time out errors */ 383 + cond_resched(); 384 + } 385 + } 386 + 387 + if (line_pos) { 388 + line_buff[line_pos++] = '\n'; 389 + line_buff[line_pos++] = 0; 390 + 391 + drm_puts(p, 
line_buff); 392 + } 393 + 394 + kfree(line_buff); 395 + 396 + #undef MIN_SPACE 397 + #undef DMESG_MAX_LINE_LEN 398 + }
+6
drivers/gpu/drm/xe/xe_devcoredump.h
··· 6 6 #ifndef _XE_DEVCOREDUMP_H_ 7 7 #define _XE_DEVCOREDUMP_H_ 8 8 9 + #include <linux/types.h> 10 + 11 + struct drm_printer; 9 12 struct xe_device; 10 13 struct xe_sched_job; 11 14 ··· 25 22 return 0; 26 23 } 27 24 #endif 25 + 26 + void xe_print_blob_ascii85(struct drm_printer *p, const char *prefix, 27 + const void *blob, size_t offset, size_t size); 28 28 29 29 #endif
+17 -4
drivers/gpu/drm/xe/xe_devcoredump_types.h
··· 34 34 /** @work: Workqueue for deferred capture outside of signaling context */ 35 35 struct work_struct work; 36 36 37 - /* GuC snapshots */ 38 - /** @ct: GuC CT snapshot */ 39 - struct xe_guc_ct_snapshot *ct; 40 - /** @ge: Guc Engine snapshot */ 37 + /** @guc: GuC snapshots */ 38 + struct { 39 + /** @guc.ct: GuC CT snapshot */ 40 + struct xe_guc_ct_snapshot *ct; 41 + /** @guc.log: GuC log snapshot */ 42 + struct xe_guc_log_snapshot *log; 43 + } guc; 44 + 45 + /** @ge: GuC Submission Engine snapshot */ 41 46 struct xe_guc_submit_exec_queue_snapshot *ge; 42 47 43 48 /** @hwe: HW Engine snapshot array */ 44 49 struct xe_hw_engine_snapshot *hwe[XE_NUM_HW_ENGINES]; 45 50 /** @job: Snapshot of job state */ 46 51 struct xe_sched_job_snapshot *job; 52 + /** 53 + * @matched_node: The matched capture node for timedout job 54 + * this single-node tracker works because devcoredump will always only 55 + * produce one hw-engine capture per devcoredump event 56 + */ 57 + struct __guc_capture_parsed_output *matched_node; 47 58 /** @vm: Snapshot of VM state */ 48 59 struct xe_vm_snapshot *vm; 49 60 ··· 80 69 bool captured; 81 70 /** @snapshot: Snapshot is captured at time of the first crash */ 82 71 struct xe_devcoredump_snapshot snapshot; 72 + /** @job: Point to the faulting job */ 73 + struct xe_sched_job *job; 83 74 }; 84 75 85 76 #endif
+71 -35
drivers/gpu/drm/xe/xe_device.c
··· 6 6 #include "xe_device.h" 7 7 8 8 #include <linux/delay.h> 9 + #include <linux/fault-inject.h> 9 10 #include <linux/units.h> 10 11 11 12 #include <drm/drm_aperture.h> ··· 384 383 err: 385 384 return ERR_PTR(err); 386 385 } 386 + ALLOW_ERROR_INJECTION(xe_device_create, ERRNO); /* See xe_pci_probe() */ 387 + 388 + static bool xe_driver_flr_disabled(struct xe_device *xe) 389 + { 390 + return xe_mmio_read32(xe_root_tile_mmio(xe), GU_CNTL_PROTECTED) & DRIVERINT_FLR_DIS; 391 + } 387 392 388 393 /* 389 394 * The driver-initiated FLR is the highest level of reset that we can trigger ··· 404 397 * if/when a new instance of i915 is bound to the device it will do a full 405 398 * re-init anyway. 406 399 */ 407 - static void xe_driver_flr(struct xe_device *xe) 400 + static void __xe_driver_flr(struct xe_device *xe) 408 401 { 409 402 const unsigned int flr_timeout = 3 * MICRO; /* specs recommend a 3s wait */ 410 - struct xe_gt *gt = xe_root_mmio_gt(xe); 403 + struct xe_mmio *mmio = xe_root_tile_mmio(xe); 411 404 int ret; 412 - 413 - if (xe_mmio_read32(gt, GU_CNTL_PROTECTED) & DRIVERINT_FLR_DIS) { 414 - drm_info_once(&xe->drm, "BIOS Disabled Driver-FLR\n"); 415 - return; 416 - } 417 405 418 406 drm_dbg(&xe->drm, "Triggering Driver-FLR\n"); 419 407 ··· 421 419 * is still pending (unless the HW is totally dead), but better to be 422 420 * safe in case something unexpected happens 423 421 */ 424 - ret = xe_mmio_wait32(gt, GU_CNTL, DRIVERFLR, 0, flr_timeout, NULL, false); 422 + ret = xe_mmio_wait32(mmio, GU_CNTL, DRIVERFLR, 0, flr_timeout, NULL, false); 425 423 if (ret) { 426 424 drm_err(&xe->drm, "Driver-FLR-prepare wait for ready failed! 
%d\n", ret); 427 425 return; 428 426 } 429 - xe_mmio_write32(gt, GU_DEBUG, DRIVERFLR_STATUS); 427 + xe_mmio_write32(mmio, GU_DEBUG, DRIVERFLR_STATUS); 430 428 431 429 /* Trigger the actual Driver-FLR */ 432 - xe_mmio_rmw32(gt, GU_CNTL, 0, DRIVERFLR); 430 + xe_mmio_rmw32(mmio, GU_CNTL, 0, DRIVERFLR); 433 431 434 432 /* Wait for hardware teardown to complete */ 435 - ret = xe_mmio_wait32(gt, GU_CNTL, DRIVERFLR, 0, flr_timeout, NULL, false); 433 + ret = xe_mmio_wait32(mmio, GU_CNTL, DRIVERFLR, 0, flr_timeout, NULL, false); 436 434 if (ret) { 437 435 drm_err(&xe->drm, "Driver-FLR-teardown wait completion failed! %d\n", ret); 438 436 return; 439 437 } 440 438 441 439 /* Wait for hardware/firmware re-init to complete */ 442 - ret = xe_mmio_wait32(gt, GU_DEBUG, DRIVERFLR_STATUS, DRIVERFLR_STATUS, 440 + ret = xe_mmio_wait32(mmio, GU_DEBUG, DRIVERFLR_STATUS, DRIVERFLR_STATUS, 443 441 flr_timeout, NULL, false); 444 442 if (ret) { 445 443 drm_err(&xe->drm, "Driver-FLR-reinit wait completion failed! 
%d\n", ret); ··· 447 445 } 448 446 449 447 /* Clear sticky completion status */ 450 - xe_mmio_write32(gt, GU_DEBUG, DRIVERFLR_STATUS); 448 + xe_mmio_write32(mmio, GU_DEBUG, DRIVERFLR_STATUS); 449 + } 450 + 451 + static void xe_driver_flr(struct xe_device *xe) 452 + { 453 + if (xe_driver_flr_disabled(xe)) { 454 + drm_info_once(&xe->drm, "BIOS Disabled Driver-FLR\n"); 455 + return; 456 + } 457 + 458 + __xe_driver_flr(xe); 451 459 } 452 460 453 461 static void xe_driver_flr_fini(void *arg) ··· 500 488 return err; 501 489 } 502 490 503 - static bool verify_lmem_ready(struct xe_gt *gt) 491 + static bool verify_lmem_ready(struct xe_device *xe) 504 492 { 505 - u32 val = xe_mmio_read32(gt, GU_CNTL) & LMEM_INIT; 493 + u32 val = xe_mmio_read32(xe_root_tile_mmio(xe), GU_CNTL) & LMEM_INIT; 506 494 507 495 return !!val; 508 496 } 509 497 510 498 static int wait_for_lmem_ready(struct xe_device *xe) 511 499 { 512 - struct xe_gt *gt = xe_root_mmio_gt(xe); 513 500 unsigned long timeout, start; 514 501 515 502 if (!IS_DGFX(xe)) ··· 517 506 if (IS_SRIOV_VF(xe)) 518 507 return 0; 519 508 520 - if (verify_lmem_ready(gt)) 509 + if (verify_lmem_ready(xe)) 521 510 return 0; 522 511 523 512 drm_dbg(&xe->drm, "Waiting for lmem initialization\n"); ··· 546 535 547 536 msleep(20); 548 537 549 - } while (!verify_lmem_ready(gt)); 538 + } while (!verify_lmem_ready(xe)); 550 539 551 540 drm_dbg(&xe->drm, "lmem ready after %ums", 552 541 jiffies_to_msecs(jiffies - start)); 553 542 554 543 return 0; 555 544 } 545 + ALLOW_ERROR_INJECTION(wait_for_lmem_ready, ERRNO); /* See xe_pci_probe() */ 556 546 557 547 static void update_device_info(struct xe_device *xe) 558 548 { ··· 601 589 return 0; 602 590 } 603 591 604 - static int xe_device_set_has_flat_ccs(struct xe_device *xe) 592 + static int probe_has_flat_ccs(struct xe_device *xe) 605 593 { 594 + struct xe_gt *gt; 606 595 u32 reg; 607 596 int err; 608 597 598 + /* Always enabled/disabled, no runtime check to do */ 609 599 if (GRAPHICS_VER(xe) < 20 || 
!xe->info.has_flat_ccs) 610 600 return 0; 611 601 612 - struct xe_gt *gt = xe_root_mmio_gt(xe); 602 + gt = xe_root_mmio_gt(xe); 613 603 614 604 err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT); 615 605 if (err) ··· 660 646 err = xe_gt_init_early(gt); 661 647 if (err) 662 648 return err; 649 + 650 + /* 651 + * Only after this point can GT-specific MMIO operations 652 + * (including things like communication with the GuC) 653 + * be performed. 654 + */ 655 + xe_gt_mmio_init(gt); 663 656 } 664 657 665 658 for_each_tile(tile, xe, id) { ··· 682 661 err = xe_ggtt_init_early(tile->mem.ggtt); 683 662 if (err) 684 663 return err; 685 - if (IS_SRIOV_VF(xe)) { 686 - err = xe_memirq_init(&tile->sriov.vf.memirq); 687 - if (err) 688 - return err; 689 - } 664 + err = xe_memirq_init(&tile->memirq); 665 + if (err) 666 + return err; 690 667 } 691 668 692 669 for_each_gt(gt, xe, id) { ··· 708 689 if (err) 709 690 goto err; 710 691 711 - err = xe_device_set_has_flat_ccs(xe); 692 + err = probe_has_flat_ccs(xe); 712 693 if (err) 713 694 goto err; 714 695 ··· 818 799 819 800 void xe_device_shutdown(struct xe_device *xe) 820 801 { 802 + struct xe_gt *gt; 803 + u8 id; 804 + 805 + drm_dbg(&xe->drm, "Shutting down device\n"); 806 + 807 + if (xe_driver_flr_disabled(xe)) { 808 + xe_display_pm_shutdown(xe); 809 + 810 + xe_irq_suspend(xe); 811 + 812 + for_each_gt(gt, xe, id) 813 + xe_gt_shutdown(gt); 814 + 815 + xe_display_pm_shutdown_late(xe); 816 + } else { 817 + /* BOOM! 
*/ 818 + __xe_driver_flr(xe); 819 + } 821 820 } 822 821 823 822 /** ··· 849 812 */ 850 813 void xe_device_wmb(struct xe_device *xe) 851 814 { 852 - struct xe_gt *gt = xe_root_mmio_gt(xe); 853 - 854 815 wmb(); 855 816 if (IS_DGFX(xe)) 856 - xe_mmio_write32(gt, VF_CAP_REG, 0); 817 + xe_mmio_write32(xe_root_tile_mmio(xe), VF_CAP_REG, 0); 857 818 } 858 819 859 820 /** ··· 892 857 if (xe_force_wake_get(gt_to_fw(gt), XE_FW_GT)) 893 858 return; 894 859 895 - xe_mmio_write32(gt, XE2_TDF_CTRL, TRANSIENT_FLUSH_REQUEST); 860 + xe_mmio_write32(&gt->mmio, XE2_TDF_CTRL, TRANSIENT_FLUSH_REQUEST); 896 861 /* 897 862 * FIXME: We can likely do better here with our choice of 898 863 * timeout. Currently we just assume the worst case, i.e. 150us, ··· 900 865 * scenario on current platforms if all cache entries are 901 866 * transient and need to be flushed.. 902 867 */ 903 - if (xe_mmio_wait32(gt, XE2_TDF_CTRL, TRANSIENT_FLUSH_REQUEST, 0, 868 + if (xe_mmio_wait32(&gt->mmio, XE2_TDF_CTRL, TRANSIENT_FLUSH_REQUEST, 0, 904 869 150, NULL, false)) 905 870 xe_gt_err_once(gt, "TD flush timeout\n"); 906 871 ··· 923 888 return; 924 889 925 890 spin_lock(&gt->global_invl_lock); 926 - xe_mmio_write32(gt, XE2_GLOBAL_INVAL, 0x1); 891 + xe_mmio_write32(&gt->mmio, XE2_GLOBAL_INVAL, 0x1); 927 892 928 - if (xe_mmio_wait32(gt, XE2_GLOBAL_INVAL, 0x1, 0x0, 150, NULL, true)) 893 + if (xe_mmio_wait32(&gt->mmio, XE2_GLOBAL_INVAL, 0x1, 0x0, 150, NULL, true)) 929 894 xe_gt_err_once(gt, "Global invalidation timeout\n"); 930 895 spin_unlock(&gt->global_invl_lock); 931 896 ··· 964 929 965 930 for_each_gt(gt, xe, id) { 966 931 drm_printf(p, "GT id: %u\n", id); 932 + drm_printf(p, "\tTile: %u\n", gt->tile->id); 967 933 drm_printf(p, "\tType: %s\n", 968 934 gt->info.type == XE_GT_TYPE_MAIN ? 
"main" : "media"); 969 935 drm_printf(p, "\tIP ver: %u.%u.%u\n", ··· 1016 980 return; 1017 981 } 1018 982 983 + xe_pm_runtime_get_noresume(xe); 984 + 1019 985 if (drmm_add_action_or_reset(&xe->drm, xe_device_wedged_fini, xe)) { 1020 986 drm_err(&xe->drm, "Failed to register xe_device_wedged_fini clean-up. Although device is wedged.\n"); 1021 987 return; 1022 988 } 1023 - 1024 - xe_pm_runtime_get_noresume(xe); 1025 989 1026 990 if (!atomic_xchg(&xe->wedged.flag, 1)) { 1027 991 xe->needs_flr_on_fini = true;
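The reworked xe_device_shutdown() above branches on whether BIOS fused off the driver-initiated FLR: if so it falls back to a graceful teardown (display PM, IRQ suspend, per-GT reset), otherwise it fires __xe_driver_flr() directly. A minimal sketch of that decision follows; the DRIVERINT_FLR_DIS bit value is a stand-in, as the real definition lives in the xe register headers, not in this diff:

```c
#include <assert.h>

/* Stand-in for the DRIVERINT_FLR_DIS bit in GU_CNTL_PROTECTED; the real
 * definition lives in the xe register headers, not in this diff. */
#define DRIVERINT_FLR_DIS (1u << 31)

enum shutdown_path { GRACEFUL_TEARDOWN, DRIVER_FLR };

/* Mirrors the control flow of xe_device_shutdown(): BIOS-disabled FLR
 * means suspend IRQs and reset each GT; otherwise a full Driver-FLR. */
static enum shutdown_path pick_shutdown_path(unsigned int gu_cntl_protected)
{
	if (gu_cntl_protected & DRIVERINT_FLR_DIS)
		return GRACEFUL_TEARDOWN;
	return DRIVER_FLR;
}
```

The same xe_driver_flr_disabled() predicate gates both probe-time and shutdown-time behaviour, which is why the diff factors it out of xe_driver_flr().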
+14 -1
drivers/gpu/drm/xe/xe_device.h
··· 9 9 #include <drm/drm_util.h> 10 10 11 11 #include "xe_device_types.h" 12 + #include "xe_gt_types.h" 13 + #include "xe_sriov.h" 12 14 13 15 static inline struct xe_device *to_xe_device(const struct drm_device *dev) 14 16 { ··· 140 138 141 139 static inline struct xe_force_wake *gt_to_fw(struct xe_gt *gt) 142 140 { 143 - return &gt->mmio.fw; 141 + return &gt->pm.fw; 144 142 } 145 143 146 144 void xe_device_assert_mem_access(struct xe_device *xe); ··· 155 153 return xe->info.has_sriov; 156 154 } 157 155 156 + static inline bool xe_device_has_msix(struct xe_device *xe) 157 + { 158 + /* TODO: change this when MSI-X support is fully integrated */ 159 + return false; 160 + } 161 + 158 162 static inline bool xe_device_has_memirq(struct xe_device *xe) 159 163 { 160 164 return GRAPHICS_VERx100(xe) >= 1250; 165 + } 166 + 167 + static inline bool xe_device_uses_memirq(struct xe_device *xe) 168 + { 169 + return xe_device_has_memirq(xe) && (IS_SRIOV_VF(xe) || xe_device_has_msix(xe)); 161 170 } 162 171 163 172 u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size);
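The two new inline helpers in xe_device.h combine into xe_device_uses_memirq(): memory-based interrupts exist from graphics IP 12.50 onward, but are only actually used by SR-IOV VFs or (once support lands) MSI-X devices. A freestanding sketch of the same predicate logic, with the device struct replaced by plain parameters:

```c
#include <assert.h>
#include <stdbool.h>

/* Parameterized stand-ins for the xe_device_has_memirq() /
 * xe_device_uses_memirq() inline helpers from the diff. */
static bool has_memirq(unsigned int graphics_verx100)
{
	return graphics_verx100 >= 1250;
}

static bool uses_memirq(unsigned int graphics_verx100, bool is_sriov_vf,
			bool has_msix)
{
	/* Capability AND an actual user of it, as in the real helper. */
	return has_memirq(graphics_verx100) && (is_sriov_vf || has_msix);
}
```

Note that xe_device_has_msix() currently hard-codes false, so on bare-metal the predicate stays false until MSI-X support is integrated.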
+44 -18
drivers/gpu/drm/xe/xe_device_types.h
··· 14 14 15 15 #include "xe_devcoredump_types.h" 16 16 #include "xe_heci_gsc.h" 17 - #include "xe_gt_types.h" 18 17 #include "xe_lmtt_types.h" 19 18 #include "xe_memirq_types.h" 20 19 #include "xe_oa.h" ··· 107 108 }; 108 109 109 110 /** 111 + * struct xe_mmio - register mmio structure 112 + * 113 + * Represents an MMIO region that the CPU may use to access registers. A 114 + * region may share its IO map with other regions (e.g., all GTs within a 115 + * tile share the same map with their parent tile, but represent different 116 + * subregions of the overall IO space). 117 + */ 118 + struct xe_mmio { 119 + /** @tile: Backpointer to tile, used for tracing */ 120 + struct xe_tile *tile; 121 + 122 + /** @regs: Map used to access registers. */ 123 + void __iomem *regs; 124 + 125 + /** 126 + * @sriov_vf_gt: Backpointer to GT. 127 + * 128 + * This pointer is only set for GT MMIO regions and only when running 129 + * as an SRIOV VF structure 130 + */ 131 + struct xe_gt *sriov_vf_gt; 132 + 133 + /** 134 + * @regs_size: Length of the register region within the map. 135 + * 136 + * The size of the iomap set in *regs is generally larger than the 137 + * register mmio space since it includes unused regions and/or 138 + * non-register regions such as the GGTT PTEs. 
139 + */ 140 + size_t regs_size; 141 + 142 + /** @adj_limit: adjust MMIO address if address is below this value */ 143 + u32 adj_limit; 144 + 145 + /** @adj_offset: offset to add to MMIO address when adjusting */ 146 + u32 adj_offset; 147 + }; 148 + 149 + /** 110 150 * struct xe_tile - hardware tile structure 111 151 * 112 152 * From a driver perspective, a "tile" is effectively a complete GPU, containing ··· 186 148 * * 4MB-8MB: reserved 187 149 * * 8MB-16MB: global GTT 188 150 */ 189 - struct { 190 - /** @mmio.size: size of tile's MMIO space */ 191 - size_t size; 192 - 193 - /** @mmio.regs: pointer to tile's MMIO space (starting with registers) */ 194 - void __iomem *regs; 195 - } mmio; 151 + struct xe_mmio mmio; 196 152 197 153 /** 198 154 * @mmio_ext: MMIO-extension info for a tile. 199 155 * 200 156 * Each tile has its own additional 256MB (28-bit) MMIO-extension space. 201 157 */ 202 - struct { 203 - /** @mmio_ext.size: size of tile's additional MMIO-extension space */ 204 - size_t size; 205 - 206 - /** @mmio_ext.regs: pointer to tile's additional MMIO-extension space */ 207 - void __iomem *regs; 208 - } mmio_ext; 158 + struct xe_mmio mmio_ext; 209 159 210 160 /** @mem: memory management info for tile */ 211 161 struct { ··· 226 200 struct xe_lmtt lmtt; 227 201 } pf; 228 202 struct { 229 - /** @sriov.vf.memirq: Memory Based Interrupts. */ 230 - struct xe_memirq memirq; 231 - 232 203 /** @sriov.vf.ggtt_balloon: GGTT regions excluded from use. */ 233 204 struct xe_ggtt_node *ggtt_balloon[2]; 234 205 } vf; 235 206 } sriov; 207 + 208 + /** @memirq: Memory Based Interrupts. */ 209 + struct xe_memirq memirq; 236 210 237 211 /** @pcode: tile's PCODE */ 238 212 struct {
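The @adj_limit/@adj_offset pair in the new struct xe_mmio encodes the media GT's GSI relocation: register offsets below the limit are shifted by the offset within the tile's shared iomap. The accessor that applies this lives in xe_mmio.c (not part of this hunk); below is a simplified model of the adjustment, with 0x380000/0x40000 used as assumed example values for the GSI offset and length:

```c
#include <assert.h>
#include <stdint.h>

/* Model of the offset adjustment implied by @adj_limit/@adj_offset:
 * low register offsets on a media GT are relocated into the GSI window
 * of the tile iomap; everything at or above the limit passes through. */
static uint32_t mmio_adjust_addr(uint32_t adj_limit, uint32_t adj_offset,
				 uint32_t addr)
{
	if (addr < adj_limit)
		addr += adj_offset;
	return addr;
}
```

A primary GT leaves both fields zero, so the adjustment is a no-op there.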
+10 -9
drivers/gpu/drm/xe/xe_execlist.c
··· 44 44 u32 ctx_id) 45 45 { 46 46 struct xe_gt *gt = hwe->gt; 47 + struct xe_mmio *mmio = &gt->mmio; 47 48 struct xe_device *xe = gt_to_xe(gt); 48 49 u64 lrc_desc; 49 50 ··· 59 58 } 60 59 61 60 if (hwe->class == XE_ENGINE_CLASS_COMPUTE) 62 - xe_mmio_write32(hwe->gt, RCU_MODE, 61 + xe_mmio_write32(mmio, RCU_MODE, 63 62 _MASKED_BIT_ENABLE(RCU_MODE_CCS_ENABLE)); 64 63 65 64 xe_lrc_write_ctx_reg(lrc, CTX_RING_TAIL, lrc->ring.tail); ··· 77 76 */ 78 77 wmb(); 79 78 80 - xe_mmio_write32(gt, RING_HWS_PGA(hwe->mmio_base), 79 + xe_mmio_write32(mmio, RING_HWS_PGA(hwe->mmio_base), 81 80 xe_bo_ggtt_addr(hwe->hwsp)); 82 - xe_mmio_read32(gt, RING_HWS_PGA(hwe->mmio_base)); 83 - xe_mmio_write32(gt, RING_MODE(hwe->mmio_base), 81 + xe_mmio_read32(mmio, RING_HWS_PGA(hwe->mmio_base)); 82 + xe_mmio_write32(mmio, RING_MODE(hwe->mmio_base), 84 83 _MASKED_BIT_ENABLE(GFX_DISABLE_LEGACY_MODE)); 85 84 86 - xe_mmio_write32(gt, RING_EXECLIST_SQ_CONTENTS_LO(hwe->mmio_base), 85 + xe_mmio_write32(mmio, RING_EXECLIST_SQ_CONTENTS_LO(hwe->mmio_base), 87 86 lower_32_bits(lrc_desc)); 88 - xe_mmio_write32(gt, RING_EXECLIST_SQ_CONTENTS_HI(hwe->mmio_base), 87 + xe_mmio_write32(mmio, RING_EXECLIST_SQ_CONTENTS_HI(hwe->mmio_base), 89 88 upper_32_bits(lrc_desc)); 90 - xe_mmio_write32(gt, RING_EXECLIST_CONTROL(hwe->mmio_base), 89 + xe_mmio_write32(mmio, RING_EXECLIST_CONTROL(hwe->mmio_base), 91 90 EL_CTRL_LOAD); 92 91 } 93 92 ··· 169 168 struct xe_gt *gt = hwe->gt; 170 169 u32 hi, lo; 171 170 172 - lo = xe_mmio_read32(gt, RING_EXECLIST_STATUS_LO(hwe->mmio_base)); 173 - hi = xe_mmio_read32(gt, RING_EXECLIST_STATUS_HI(hwe->mmio_base)); 171 + lo = xe_mmio_read32(&gt->mmio, RING_EXECLIST_STATUS_LO(hwe->mmio_base)); 172 + hi = xe_mmio_read32(&gt->mmio, RING_EXECLIST_STATUS_HI(hwe->mmio_base)); 174 173 175 174 return lo | (u64)hi << 32; 176 175 }
+2 -2
drivers/gpu/drm/xe/xe_force_wake.c
··· 100 100 if (IS_SRIOV_VF(gt_to_xe(gt))) 101 101 return; 102 102 103 - xe_mmio_write32(gt, domain->reg_ctl, domain->mask | (wake ? domain->val : 0)); 103 + xe_mmio_write32(&gt->mmio, domain->reg_ctl, domain->mask | (wake ? domain->val : 0)); 104 104 } 105 105 106 106 static int __domain_wait(struct xe_gt *gt, struct xe_force_wake_domain *domain, bool wake) ··· 111 111 if (IS_SRIOV_VF(gt_to_xe(gt))) 112 112 return 0; 113 113 114 - ret = xe_mmio_wait32(gt, domain->reg_ack, domain->val, wake ? domain->val : 0, 114 + ret = xe_mmio_wait32(&gt->mmio, domain->reg_ack, domain->val, wake ? domain->val : 0, 115 115 XE_FORCE_WAKE_ACK_TIMEOUT_MS * USEC_PER_MSEC, 116 116 &value, true); 117 117 if (ret)
+7 -3
drivers/gpu/drm/xe/xe_ggtt.c
··· 5 5 6 6 #include "xe_ggtt.h" 7 7 8 + #include <linux/fault-inject.h> 8 9 #include <linux/io-64-nonatomic-lo-hi.h> 9 10 #include <linux/sizes.h> 10 11 ··· 108 107 109 108 static void ggtt_update_access_counter(struct xe_ggtt *ggtt) 110 109 { 111 - struct xe_gt *gt = XE_WA(ggtt->tile->primary_gt, 22019338487) ? ggtt->tile->primary_gt : 112 - ggtt->tile->media_gt; 110 + struct xe_tile *tile = ggtt->tile; 111 + struct xe_gt *affected_gt = XE_WA(tile->primary_gt, 22019338487) ? 112 + tile->primary_gt : tile->media_gt; 113 + struct xe_mmio *mmio = &affected_gt->mmio; 113 114 u32 max_gtt_writes = XE_WA(ggtt->tile->primary_gt, 22019338487) ? 1100 : 63; 114 115 /* 115 116 * Wa_22019338487: GMD_ID is a RO register, a dummy write forces gunit ··· 121 118 lockdep_assert_held(&ggtt->lock); 122 119 123 120 if ((++ggtt->access_count % max_gtt_writes) == 0) { 124 - xe_mmio_write32(gt, GMD_ID, 0x0); 121 + xe_mmio_write32(mmio, GMD_ID, 0x0); 125 122 ggtt->access_count = 0; 126 123 } 127 124 } ··· 265 262 266 263 return 0; 267 264 } 265 + ALLOW_ERROR_INJECTION(xe_ggtt_init_early, ERRNO); /* See xe_pci_probe() */ 268 266 269 267 static void xe_ggtt_invalidate(struct xe_ggtt *ggtt); 270 268
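The reworked ggtt_update_access_counter() above issues a dummy GMD_ID write once every max_gtt_writes GGTT updates (1100 or 63, depending on which GT carries Wa_22019338487). The counting pattern is easy to model in isolation; the MMIO write is replaced here by a boolean return:

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the access counter: returns true when a flush-forcing dummy
 * write would be issued, resetting the counter as the driver does. */
static bool ggtt_access_tick(unsigned int *access_count,
			     unsigned int max_gtt_writes)
{
	if ((++*access_count % max_gtt_writes) == 0) {
		*access_count = 0;
		return true;
	}
	return false;
}
```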
+13 -11
drivers/gpu/drm/xe/xe_gsc.c
··· 34 34 #include "instructions/xe_gsc_commands.h" 35 35 #include "regs/xe_gsc_regs.h" 36 36 #include "regs/xe_gt_regs.h" 37 + #include "regs/xe_irq_regs.h" 37 38 38 39 static struct xe_gt * 39 40 gsc_to_gt(struct xe_gsc *gsc) ··· 180 179 181 180 static int gsc_fw_is_loaded(struct xe_gt *gt) 182 181 { 183 - return xe_mmio_read32(gt, HECI_FWSTS1(MTL_GSC_HECI1_BASE)) & 182 + return xe_mmio_read32(&gt->mmio, HECI_FWSTS1(MTL_GSC_HECI1_BASE)) & 184 183 HECI1_FWSTS1_INIT_COMPLETE; 185 184 } 186 185 ··· 191 190 * executed by the GSCCS. To account for possible submission delays or 192 191 * other issues, we use a 500ms timeout in the wait here. 193 192 */ 194 - return xe_mmio_wait32(gt, HECI_FWSTS1(MTL_GSC_HECI1_BASE), 193 + return xe_mmio_wait32(&gt->mmio, HECI_FWSTS1(MTL_GSC_HECI1_BASE), 195 194 HECI1_FWSTS1_INIT_COMPLETE, 196 195 HECI1_FWSTS1_INIT_COMPLETE, 197 196 500 * USEC_PER_MSEC, NULL, false); ··· 331 330 * so in that scenario we're always guaranteed to find the correct 332 331 * value. 
333 332 */ 334 - er_status = xe_mmio_read32(gt, GSCI_TIMER_STATUS) & GSCI_TIMER_STATUS_VALUE; 333 + er_status = xe_mmio_read32(&gt->mmio, GSCI_TIMER_STATUS) & GSCI_TIMER_STATUS_VALUE; 335 334 336 335 if (er_status == GSCI_TIMER_STATUS_TIMER_EXPIRED) { 337 336 /* ··· 582 581 if (!XE_WA(gt, 14015076503) || !gsc_fw_is_loaded(gt)) 583 582 return; 584 583 585 - xe_mmio_rmw32(gt, HECI_H_GS1(MTL_GSC_HECI2_BASE), gs1_clr, gs1_set); 584 + xe_mmio_rmw32(&gt->mmio, HECI_H_GS1(MTL_GSC_HECI2_BASE), gs1_clr, gs1_set); 586 585 587 586 if (prep) { 588 587 /* make sure the reset bit is clear when writing the CSR reg */ 589 - xe_mmio_rmw32(gt, HECI_H_CSR(MTL_GSC_HECI2_BASE), 588 + xe_mmio_rmw32(&gt->mmio, HECI_H_CSR(MTL_GSC_HECI2_BASE), 590 589 HECI_H_CSR_RST, HECI_H_CSR_IG); 591 590 msleep(200); 592 591 } ··· 600 599 void xe_gsc_print_info(struct xe_gsc *gsc, struct drm_printer *p) 601 600 { 602 601 struct xe_gt *gt = gsc_to_gt(gsc); 602 + struct xe_mmio *mmio = &gt->mmio; 603 603 int err; 604 604 605 605 xe_uc_fw_print(&gsc->fw, p); ··· 615 613 return; 616 614 617 615 drm_printf(p, "\nHECI1 FWSTS: 0x%08x 0x%08x 0x%08x 0x%08x 0x%08x 0x%08x\n", 618 - xe_mmio_read32(gt, HECI_FWSTS1(MTL_GSC_HECI1_BASE)), 619 - xe_mmio_read32(gt, HECI_FWSTS2(MTL_GSC_HECI1_BASE)), 620 - xe_mmio_read32(gt, HECI_FWSTS3(MTL_GSC_HECI1_BASE)), 621 - xe_mmio_read32(gt, HECI_FWSTS4(MTL_GSC_HECI1_BASE)), 622 - xe_mmio_read32(gt, HECI_FWSTS5(MTL_GSC_HECI1_BASE)), 623 - xe_mmio_read32(gt, HECI_FWSTS6(MTL_GSC_HECI1_BASE))); 616 + xe_mmio_read32(mmio, HECI_FWSTS1(MTL_GSC_HECI1_BASE)), 617 + xe_mmio_read32(mmio, HECI_FWSTS2(MTL_GSC_HECI1_BASE)), 618 + xe_mmio_read32(mmio, HECI_FWSTS3(MTL_GSC_HECI1_BASE)), 619 + xe_mmio_read32(mmio, HECI_FWSTS4(MTL_GSC_HECI1_BASE)), 620 + xe_mmio_read32(mmio, HECI_FWSTS5(MTL_GSC_HECI1_BASE)), 621 + xe_mmio_read32(mmio, HECI_FWSTS6(MTL_GSC_HECI1_BASE))); 624 622 625 623 xe_force_wake_put(gt_to_fw(gt), XE_FW_GSC); 626 624 }
+2 -2
drivers/gpu/drm/xe/xe_gsc_proxy.c
··· 65 65 bool xe_gsc_proxy_init_done(struct xe_gsc *gsc) 66 66 { 67 67 struct xe_gt *gt = gsc_to_gt(gsc); 68 - u32 fwsts1 = xe_mmio_read32(gt, HECI_FWSTS1(MTL_GSC_HECI1_BASE)); 68 + u32 fwsts1 = xe_mmio_read32(&gt->mmio, HECI_FWSTS1(MTL_GSC_HECI1_BASE)); 69 69 70 70 return REG_FIELD_GET(HECI1_FWSTS1_CURRENT_STATE, fwsts1) == 71 71 HECI1_FWSTS1_PROXY_STATE_NORMAL; ··· 78 78 /* make sure we never accidentally write the RST bit */ 79 79 clr |= HECI_H_CSR_RST; 80 80 81 - xe_mmio_rmw32(gt, HECI_H_CSR(MTL_GSC_HECI2_BASE), clr, set); 81 + xe_mmio_rmw32(&gt->mmio, HECI_H_CSR(MTL_GSC_HECI2_BASE), clr, set); 82 82 } 83 83 84 84 static void gsc_proxy_irq_clear(struct xe_gsc *gsc)
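The hunks above read masked status fields (e.g. HECI1_FWSTS1_CURRENT_STATE) via REG_FIELD_GET, which shifts the masked bits down to bit 0. A minimal runtime equivalent for contiguous masks is sketched below; the kernel version is a macro with compile-time mask checks, and the `__builtin_ctz` used here assumes a GCC/Clang toolchain:

```c
#include <assert.h>
#include <stdint.h>

/* Runtime equivalent of REG_FIELD_GET for a contiguous, non-zero mask:
 * extract the field and right-justify it.  __builtin_ctz counts the
 * trailing zero bits, i.e. the field's bit offset. */
static uint32_t reg_field_get(uint32_t mask, uint32_t val)
{
	return (val & mask) >> __builtin_ctz(mask);
}
```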
+38 -6
drivers/gpu/drm/xe/xe_gt.c
··· 108 108 return; 109 109 110 110 if (!xe_gt_is_media_type(gt)) { 111 - xe_mmio_write32(gt, SCRATCH1LPFC, EN_L3_RW_CCS_CACHE_FLUSH); 112 111 reg = xe_gt_mcr_unicast_read_any(gt, XE2_GAMREQSTRM_CTRL); 113 112 reg |= CG_DIS_CNTLBUS; 114 113 xe_gt_mcr_multicast_write(gt, XE2_GAMREQSTRM_CTRL, reg); ··· 244 245 else if (entry->clr_bits + 1) 245 246 val = (reg.mcr ? 246 247 xe_gt_mcr_unicast_read_any(gt, reg_mcr) : 247 - xe_mmio_read32(gt, reg)) & (~entry->clr_bits); 248 + xe_mmio_read32(&gt->mmio, reg)) & (~entry->clr_bits); 248 249 else 249 250 val = 0; 250 251 ··· 439 440 * Stash hardware-reported version. Since this register does not exist 440 441 * on pre-MTL platforms, reading it there will (correctly) return 0. 441 442 */ 442 - gt->info.gmdid = xe_mmio_read32(gt, GMD_ID); 443 + gt->info.gmdid = xe_mmio_read32(&gt->mmio, GMD_ID); 443 444 444 445 err = xe_force_wake_put(gt_to_fw(gt), XE_FW_GT); 445 446 XE_WARN_ON(err); ··· 622 623 return 0; 623 624 } 624 625 626 + /** 627 + * xe_gt_mmio_init() - Initialize GT's MMIO access 628 + * @gt: the GT object 629 + * 630 + * Initialize GT's MMIO accessor, which will be used to access registers inside 631 + * this GT. 
632 + */ 633 + void xe_gt_mmio_init(struct xe_gt *gt) 634 + { 635 + struct xe_tile *tile = gt_to_tile(gt); 636 + 637 + gt->mmio.regs = tile->mmio.regs; 638 + gt->mmio.regs_size = tile->mmio.regs_size; 639 + gt->mmio.tile = tile; 640 + 641 + if (gt->info.type == XE_GT_TYPE_MEDIA) { 642 + gt->mmio.adj_offset = MEDIA_GT_GSI_OFFSET; 643 + gt->mmio.adj_limit = MEDIA_GT_GSI_LENGTH; 644 + } 645 + 646 + if (IS_SRIOV_VF(gt_to_xe(gt))) 647 + gt->mmio.sriov_vf_gt = gt; 648 + } 649 + 625 650 void xe_gt_record_user_engines(struct xe_gt *gt) 626 651 { 627 652 struct xe_hw_engine *hwe; ··· 673 650 674 651 xe_gsc_wa_14015076503(gt, true); 675 652 676 - xe_mmio_write32(gt, GDRST, GRDOM_FULL); 677 - err = xe_mmio_wait32(gt, GDRST, GRDOM_FULL, 0, 5000, NULL, false); 653 + xe_mmio_write32(&gt->mmio, GDRST, GRDOM_FULL); 654 + err = xe_mmio_wait32(&gt->mmio, GDRST, GRDOM_FULL, 0, 5000, NULL, false); 678 655 if (err) 679 656 xe_gt_err(gt, "failed to clear GRDOM_FULL (%pe)\n", 680 657 ERR_PTR(err)); ··· 885 862 return err; 886 863 } 887 864 865 + void xe_gt_shutdown(struct xe_gt *gt) 866 + { 867 + xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL); 868 + do_gt_reset(gt); 869 + xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL); 870 + } 871 + 888 872 /** 889 873 * xe_gt_sanitize_freq() - Restore saved frequencies if necessary. 890 874 * @gt: the GT object ··· 904 874 int ret = 0; 905 875 906 876 if ((!xe_uc_fw_is_available(&gt->uc.gsc.fw) || 907 - xe_uc_fw_is_loaded(&gt->uc.gsc.fw)) && XE_WA(gt, 22019338487)) 877 + xe_uc_fw_is_loaded(&gt->uc.gsc.fw) || 878 + xe_uc_fw_is_in_error_state(&gt->uc.gsc.fw)) && 879 + XE_WA(gt, 22019338487)) 908 880 ret = xe_guc_pc_restore_stashed_freq(&gt->uc.guc.pc); 909 881 910 882 return ret;
+2
drivers/gpu/drm/xe/xe_gt.h
··· 31 31 int xe_gt_init_hwconfig(struct xe_gt *gt); 32 32 int xe_gt_init_early(struct xe_gt *gt); 33 33 int xe_gt_init(struct xe_gt *gt); 34 + void xe_gt_mmio_init(struct xe_gt *gt); 34 35 void xe_gt_declare_wedged(struct xe_gt *gt); 35 36 int xe_gt_record_default_lrcs(struct xe_gt *gt); 36 37 ··· 49 48 50 49 void xe_gt_suspend_prepare(struct xe_gt *gt); 51 50 int xe_gt_suspend(struct xe_gt *gt); 51 + void xe_gt_shutdown(struct xe_gt *gt); 52 52 int xe_gt_resume(struct xe_gt *gt); 53 53 void xe_gt_reset_async(struct xe_gt *gt); 54 54 void xe_gt_sanitize(struct xe_gt *gt);
+1 -1
drivers/gpu/drm/xe/xe_gt_ccs_mode.c
··· 68 68 } 69 69 } 70 70 71 - xe_mmio_write32(gt, CCS_MODE, mode); 71 + xe_mmio_write32(&gt->mmio, CCS_MODE, mode); 72 72 73 73 xe_gt_dbg(gt, "CCS_MODE=%x config:%08x, num_engines:%d, num_slices:%d\n", 74 74 mode, config, num_engines, num_slices);
+3 -3
drivers/gpu/drm/xe/xe_gt_clock.c
··· 17 17 18 18 static u32 read_reference_ts_freq(struct xe_gt *gt) 19 19 { 20 - u32 ts_override = xe_mmio_read32(gt, TIMESTAMP_OVERRIDE); 20 + u32 ts_override = xe_mmio_read32(&gt->mmio, TIMESTAMP_OVERRIDE); 21 21 u32 base_freq, frac_freq; 22 22 23 23 base_freq = REG_FIELD_GET(TIMESTAMP_OVERRIDE_US_COUNTER_DIVIDER_MASK, ··· 57 57 58 58 int xe_gt_clock_init(struct xe_gt *gt) 59 59 { 60 - u32 ctc_reg = xe_mmio_read32(gt, CTC_MODE); 60 + u32 ctc_reg = xe_mmio_read32(&gt->mmio, CTC_MODE); 61 61 u32 freq = 0; 62 62 63 63 /* Assuming gen11+ so assert this assumption is correct */ ··· 66 66 if (ctc_reg & CTC_SOURCE_DIVIDE_LOGIC) { 67 67 freq = read_reference_ts_freq(gt); 68 68 } else { 69 - u32 c0 = xe_mmio_read32(gt, RPM_CONFIG0); 69 + u32 c0 = xe_mmio_read32(&gt->mmio, RPM_CONFIG0); 70 70 71 71 freq = get_crystal_clock_freq(c0); 72 72
+13
drivers/gpu/drm/xe/xe_gt_debugfs.c
··· 15 15 #include "xe_ggtt.h" 16 16 #include "xe_gt.h" 17 17 #include "xe_gt_mcr.h" 18 + #include "xe_gt_idle.h" 18 19 #include "xe_gt_sriov_pf_debugfs.h" 19 20 #include "xe_gt_sriov_vf_debugfs.h" 20 21 #include "xe_gt_stats.h" ··· 108 107 return err; 109 108 110 109 return 0; 110 + } 111 + 112 + static int powergate_info(struct xe_gt *gt, struct drm_printer *p) 113 + { 114 + int ret; 115 + 116 + xe_pm_runtime_get(gt_to_xe(gt)); 117 + ret = xe_gt_idle_pg_print(gt, p); 118 + xe_pm_runtime_put(gt_to_xe(gt)); 119 + 120 + return ret; 111 121 } 112 122 113 123 static int force_reset(struct xe_gt *gt, struct drm_printer *p) ··· 300 288 {"topology", .show = xe_gt_debugfs_simple_show, .data = topology}, 301 289 {"steering", .show = xe_gt_debugfs_simple_show, .data = steering}, 302 290 {"ggtt", .show = xe_gt_debugfs_simple_show, .data = ggtt}, 291 + {"powergate_info", .show = xe_gt_debugfs_simple_show, .data = powergate_info}, 303 292 {"register-save-restore", .show = xe_gt_debugfs_simple_show, .data = register_save_restore}, 304 293 {"workarounds", .show = xe_gt_debugfs_simple_show, .data = workarounds}, 305 294 {"pat", .show = xe_gt_debugfs_simple_show, .data = pat},
+1 -1
drivers/gpu/drm/xe/xe_gt_freq.c
··· 11 11 #include <drm/drm_managed.h> 12 12 #include <drm/drm_print.h> 13 13 14 - #include "xe_device_types.h" 15 14 #include "xe_gt_sysfs.h" 16 15 #include "xe_gt_throttle.h" 16 + #include "xe_gt_types.h" 17 17 #include "xe_guc_pc.h" 18 18 #include "xe_pm.h" 19 19
+111 -14
drivers/gpu/drm/xe/xe_gt_idle.c
··· 98 98 void xe_gt_idle_enable_pg(struct xe_gt *gt) 99 99 { 100 100 struct xe_device *xe = gt_to_xe(gt); 101 - u32 pg_enable; 101 + struct xe_gt_idle *gtidle = &gt->gtidle; 102 + struct xe_mmio *mmio = &gt->mmio; 103 + u32 vcs_mask, vecs_mask; 102 104 int i, j; 103 105 104 106 if (IS_SRIOV_VF(xe)) ··· 112 110 113 111 xe_device_assert_mem_access(gt_to_xe(gt)); 114 112 115 - pg_enable = RENDER_POWERGATE_ENABLE | MEDIA_POWERGATE_ENABLE; 113 + vcs_mask = xe_hw_engine_mask_per_class(gt, XE_ENGINE_CLASS_VIDEO_DECODE); 114 + vecs_mask = xe_hw_engine_mask_per_class(gt, XE_ENGINE_CLASS_VIDEO_ENHANCE); 115 + 116 + if (vcs_mask || vecs_mask) 117 + gtidle->powergate_enable = MEDIA_POWERGATE_ENABLE; 118 + 119 + if (!xe_gt_is_media_type(gt)) 120 + gtidle->powergate_enable |= RENDER_POWERGATE_ENABLE; 116 121 117 122 for (i = XE_HW_ENGINE_VCS0, j = 0; i <= XE_HW_ENGINE_VCS7; ++i, ++j) { 118 123 if ((gt->info.engine_mask & BIT(i))) 119 - pg_enable |= (VDN_HCP_POWERGATE_ENABLE(j) | 120 - VDN_MFXVDENC_POWERGATE_ENABLE(j)); 124 + gtidle->powergate_enable |= (VDN_HCP_POWERGATE_ENABLE(j) | 125 + VDN_MFXVDENC_POWERGATE_ENABLE(j)); 121 126 } 122 127 123 128 XE_WARN_ON(xe_force_wake_get(gt_to_fw(gt), XE_FW_GT)); ··· 133 124 * GuC sets the hysteresis value when GuC PC is enabled 134 125 * else set it to 25 (25 * 1.28us) 135 126 */ 136 - xe_mmio_write32(gt, MEDIA_POWERGATE_IDLE_HYSTERESIS, 25); 137 - xe_mmio_write32(gt, RENDER_POWERGATE_IDLE_HYSTERESIS, 25); 127 + xe_mmio_write32(mmio, MEDIA_POWERGATE_IDLE_HYSTERESIS, 25); 128 + xe_mmio_write32(mmio, RENDER_POWERGATE_IDLE_HYSTERESIS, 25); 138 129 } 139 130 140 - xe_mmio_write32(gt, POWERGATE_ENABLE, pg_enable); 131 + xe_mmio_write32(mmio, POWERGATE_ENABLE, gtidle->powergate_enable); 141 132 XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FW_GT)); 142 133 } 143 134 144 135 void xe_gt_idle_disable_pg(struct xe_gt *gt) 145 136 { 137 + struct xe_gt_idle *gtidle = &gt->gtidle; 138 + 146 139 if (IS_SRIOV_VF(gt_to_xe(gt))) 147 140 return; 148 141 
149 142 xe_device_assert_mem_access(gt_to_xe(gt)); 143 + gtidle->powergate_enable = 0; 144 + 150 145 XE_WARN_ON(xe_force_wake_get(gt_to_fw(gt), XE_FW_GT)); 151 - 152 - xe_mmio_write32(gt, POWERGATE_ENABLE, 0); 153 - 146 + xe_mmio_write32(&gt->mmio, POWERGATE_ENABLE, gtidle->powergate_enable); 154 147 XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FW_GT)); 148 + } 149 + 150 + /** 151 + * xe_gt_idle_pg_print - Xe powergating info 152 + * @gt: GT object 153 + * @p: drm_printer. 154 + * 155 + * This function prints the powergating information 156 + * 157 + * Return: 0 on success, negative error code otherwise 158 + */ 159 + int xe_gt_idle_pg_print(struct xe_gt *gt, struct drm_printer *p) 160 + { 161 + struct xe_gt_idle *gtidle = &gt->gtidle; 162 + struct xe_device *xe = gt_to_xe(gt); 163 + enum xe_gt_idle_state state; 164 + u32 pg_enabled, pg_status = 0; 165 + u32 vcs_mask, vecs_mask; 166 + int err, n; 167 + /* 168 + * Media Slices 169 + * 170 + * Slice 0: VCS0, VCS1, VECS0 171 + * Slice 1: VCS2, VCS3, VECS1 172 + * Slice 2: VCS4, VCS5, VECS2 173 + * Slice 3: VCS6, VCS7, VECS3 174 + */ 175 + static const struct { 176 + u64 engines; 177 + u32 status_bit; 178 + } media_slices[] = { 179 + {(BIT(XE_HW_ENGINE_VCS0) | BIT(XE_HW_ENGINE_VCS1) | 180 + BIT(XE_HW_ENGINE_VECS0)), MEDIA_SLICE0_AWAKE_STATUS}, 181 + 182 + {(BIT(XE_HW_ENGINE_VCS2) | BIT(XE_HW_ENGINE_VCS3) | 183 + BIT(XE_HW_ENGINE_VECS1)), MEDIA_SLICE1_AWAKE_STATUS}, 184 + 185 + {(BIT(XE_HW_ENGINE_VCS4) | BIT(XE_HW_ENGINE_VCS5) | 186 + BIT(XE_HW_ENGINE_VECS2)), MEDIA_SLICE2_AWAKE_STATUS}, 187 + 188 + {(BIT(XE_HW_ENGINE_VCS6) | BIT(XE_HW_ENGINE_VCS7) | 189 + BIT(XE_HW_ENGINE_VECS3)), MEDIA_SLICE3_AWAKE_STATUS}, 190 + }; 191 + 192 + if (xe->info.platform == XE_PVC) { 193 + drm_printf(p, "Power Gating not supported\n"); 194 + return 0; 195 + } 196 + 197 + state = gtidle->idle_status(gtidle_to_pc(gtidle)); 198 + pg_enabled = gtidle->powergate_enable; 199 + 200 + /* Do not wake the GT to read powergating status */ 201 + if 
(state != GT_IDLE_C6) { 202 + err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT); 203 + if (err) 204 + return err; 205 + 206 + pg_enabled = xe_mmio_read32(&gt->mmio, POWERGATE_ENABLE); 207 + pg_status = xe_mmio_read32(&gt->mmio, POWERGATE_DOMAIN_STATUS); 208 + 209 + XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FW_GT)); 210 + } 211 + 212 + if (gt->info.engine_mask & XE_HW_ENGINE_RCS_MASK) { 213 + drm_printf(p, "Render Power Gating Enabled: %s\n", 214 + str_yes_no(pg_enabled & RENDER_POWERGATE_ENABLE)); 215 + 216 + drm_printf(p, "Render Power Gate Status: %s\n", 217 + str_up_down(pg_status & RENDER_AWAKE_STATUS)); 218 + } 219 + 220 + vcs_mask = xe_hw_engine_mask_per_class(gt, XE_ENGINE_CLASS_VIDEO_DECODE); 221 + vecs_mask = xe_hw_engine_mask_per_class(gt, XE_ENGINE_CLASS_VIDEO_ENHANCE); 222 + 223 + /* Print media CPG status only if media is present */ 224 + if (vcs_mask || vecs_mask) { 225 + drm_printf(p, "Media Power Gating Enabled: %s\n", 226 + str_yes_no(pg_enabled & MEDIA_POWERGATE_ENABLE)); 227 + 228 + for (n = 0; n < ARRAY_SIZE(media_slices); n++) 229 + if (gt->info.engine_mask & media_slices[n].engines) 230 + drm_printf(p, "Media Slice%d Power Gate Status: %s\n", n, 231 + str_up_down(pg_status & media_slices[n].status_bit)); 232 + } 233 + return 0; 155 234 } 156 235 157 236 static ssize_t name_show(struct device *dev, ··· 357 260 return; 358 261 359 262 /* Units of 1280 ns for a total of 5s */ 360 - xe_mmio_write32(gt, RC_IDLE_HYSTERSIS, 0x3B9ACA); 263 + xe_mmio_write32(&gt->mmio, RC_IDLE_HYSTERSIS, 0x3B9ACA); 361 264 /* Enable RC6 */ 362 - xe_mmio_write32(gt, RC_CONTROL, 265 + xe_mmio_write32(&gt->mmio, RC_CONTROL, 363 266 RC_CTL_HW_ENABLE | RC_CTL_TO_MODE | RC_CTL_RC6_ENABLE); 364 267 } 365 268 ··· 371 274 if (IS_SRIOV_VF(gt_to_xe(gt))) 372 275 return; 373 276 374 - xe_mmio_write32(gt, RC_CONTROL, 0); 375 - xe_mmio_write32(gt, RC_STATE, 0); 277 + xe_mmio_write32(&gt->mmio, RC_CONTROL, 0); 278 + xe_mmio_write32(&gt->mmio, RC_STATE, 0); 376 279 }
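xe_gt_idle_enable_pg() now derives the enable mask from which engines actually exist and caches it in gtidle->powergate_enable, so xe_gt_idle_pg_print() can report it without waking the GT. The core mask assembly reduces to a small pure function; the bit values below are placeholders rather than the real register layout, and the per-VCS VDN bits from the diff are omitted for brevity:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bits; the real values come from the xe register headers. */
#define MEDIA_POWERGATE_ENABLE	(1u << 0)
#define RENDER_POWERGATE_ENABLE	(1u << 1)

/* Mirrors the mask assembly in xe_gt_idle_enable_pg(): media powergating
 * needs at least one VCS/VECS engine, render powergating a non-media GT. */
static uint32_t pg_enable_mask(bool has_vcs_or_vecs, bool is_media_gt)
{
	uint32_t mask = 0;

	if (has_vcs_or_vecs)
		mask |= MEDIA_POWERGATE_ENABLE;
	if (!is_media_gt)
		mask |= RENDER_POWERGATE_ENABLE;
	return mask;
}
```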
+2
drivers/gpu/drm/xe/xe_gt_idle.h
··· 8 8 9 9 #include "xe_gt_idle_types.h" 10 10 11 + struct drm_printer; 11 12 struct xe_gt; 12 13 13 14 int xe_gt_idle_init(struct xe_gt_idle *gtidle); ··· 16 15 void xe_gt_idle_disable_c6(struct xe_gt *gt); 17 16 void xe_gt_idle_enable_pg(struct xe_gt *gt); 18 17 void xe_gt_idle_disable_pg(struct xe_gt *gt); 18 + int xe_gt_idle_pg_print(struct xe_gt *gt, struct drm_printer *p); 19 19 20 20 #endif /* _XE_GT_IDLE_H_ */
+2
drivers/gpu/drm/xe/xe_gt_idle_types.h
··· 23 23 struct xe_gt_idle { 24 24 /** @name: name */ 25 25 char name[16]; 26 + /** @powergate_enable: copy of powergate enable bits */ 27 + u32 powergate_enable; 26 28 /** @residency_multiplier: residency multiplier in ns */ 27 29 u32 residency_multiplier; 28 30 /** @cur_residency: raw driver copy of idle residency */
+49 -19
drivers/gpu/drm/xe/xe_gt_mcr.c
··· 237 237 {}, 238 238 }; 239 239 240 + static const struct xe_mmio_range xe3lpm_instance0_steering_table[] = { 241 + { 0x384000, 0x3847DF }, /* GAM, rsvd, GAM */ 242 + { 0x384900, 0x384AFF }, /* GAM */ 243 + { 0x389560, 0x3895FF }, /* MEDIAINF */ 244 + { 0x38B600, 0x38B8FF }, /* L3BANK */ 245 + { 0x38C800, 0x38D07F }, /* GAM, MEDIAINF */ 246 + { 0x38D0D0, 0x38F0FF }, /* MEDIAINF, GAM */ 247 + { 0x393C00, 0x393C7F }, /* MEDIAINF */ 248 + {}, 249 + }; 250 + 240 251 static void init_steering_l3bank(struct xe_gt *gt) 241 252 { 253 + struct xe_mmio *mmio = &gt->mmio; 254 + 242 255 if (GRAPHICS_VERx100(gt_to_xe(gt)) >= 1270) { 243 256 u32 mslice_mask = REG_FIELD_GET(MEML3_EN_MASK, 244 - xe_mmio_read32(gt, MIRROR_FUSE3)); 257 + xe_mmio_read32(mmio, MIRROR_FUSE3)); 245 258 u32 bank_mask = REG_FIELD_GET(GT_L3_EXC_MASK, 246 - xe_mmio_read32(gt, XEHP_FUSE4)); 259 + xe_mmio_read32(mmio, XEHP_FUSE4)); 247 260 248 261 /* 249 262 * Group selects mslice, instance selects bank within mslice. ··· 267 254 bank_mask & BIT(0) ? 
0 : 2; 268 255 } else if (gt_to_xe(gt)->info.platform == XE_DG2) { 269 256 u32 mslice_mask = REG_FIELD_GET(MEML3_EN_MASK, 270 - xe_mmio_read32(gt, MIRROR_FUSE3)); 257 + xe_mmio_read32(mmio, MIRROR_FUSE3)); 271 258 u32 bank = __ffs(mslice_mask) * 8; 272 259 273 260 /* ··· 279 266 gt->steering[L3BANK].instance_target = bank & 0x3; 280 267 } else { 281 268 u32 fuse = REG_FIELD_GET(L3BANK_MASK, 282 - ~xe_mmio_read32(gt, MIRROR_FUSE3)); 269 + ~xe_mmio_read32(mmio, MIRROR_FUSE3)); 283 270 284 271 gt->steering[L3BANK].group_target = 0; /* unused */ 285 272 gt->steering[L3BANK].instance_target = __ffs(fuse); ··· 289 276 static void init_steering_mslice(struct xe_gt *gt) 290 277 { 291 278 u32 mask = REG_FIELD_GET(MEML3_EN_MASK, 292 - xe_mmio_read32(gt, MIRROR_FUSE3)); 279 + xe_mmio_read32(&gt->mmio, MIRROR_FUSE3)); 293 280 294 281 /* 295 282 * mslice registers are valid (not terminated) if either the meml3 ··· 365 352 *instance = dss % gt->steering_dss_per_grp; 366 353 } 367 354 355 + /** 356 + * xe_gt_mcr_steering_info_to_dss_id - Get DSS ID from group/instance steering 357 + * @gt: GT structure 358 + * @group: steering group ID 359 + * @instance: steering instance ID 360 + * 361 + * Return: the converted DSS id. 
362 + */ 363 + u32 xe_gt_mcr_steering_info_to_dss_id(struct xe_gt *gt, u16 group, u16 instance) 364 + { 365 + return group * dss_per_group(gt) + instance; 366 + } 367 + 368 368 static void init_steering_dss(struct xe_gt *gt) 369 369 { 370 370 gt->steering_dss_per_grp = dss_per_group(gt); ··· 406 380 static void init_steering_sqidi_psmi(struct xe_gt *gt) 407 381 { 408 382 u32 mask = REG_FIELD_GET(XE2_NODE_ENABLE_MASK, 409 - xe_mmio_read32(gt, MIRROR_FUSE3)); 383 + xe_mmio_read32(&gt->mmio, MIRROR_FUSE3)); 410 384 u32 select = __ffs(mask); 411 385 412 386 gt->steering[SQIDI_PSMI].group_target = select >> 1; ··· 465 439 if (gt->info.type == XE_GT_TYPE_MEDIA) { 466 440 drm_WARN_ON(&xe->drm, MEDIA_VER(xe) < 13); 467 441 468 - if (MEDIA_VERx100(xe) >= 1301) { 442 + if (MEDIA_VER(xe) >= 30) { 443 + gt->steering[OADDRM].ranges = xe2lpm_gpmxmt_steering_table; 444 + gt->steering[INSTANCE0].ranges = xe3lpm_instance0_steering_table; 445 + } else if (MEDIA_VERx100(xe) >= 1301) { 469 446 gt->steering[OADDRM].ranges = xe2lpm_gpmxmt_steering_table; 470 447 gt->steering[INSTANCE0].ranges = xe2lpm_instance0_steering_table; 471 448 } else { ··· 523 494 u32 steer_val = REG_FIELD_PREP(MCR_SLICE_MASK, 0) | 524 495 REG_FIELD_PREP(MCR_SUBSLICE_MASK, 2); 525 496 526 - xe_mmio_write32(gt, MCFG_MCR_SELECTOR, steer_val); 527 - xe_mmio_write32(gt, SF_MCR_SELECTOR, steer_val); 497 + xe_mmio_write32(&gt->mmio, MCFG_MCR_SELECTOR, steer_val); 498 + xe_mmio_write32(&gt->mmio, SF_MCR_SELECTOR, steer_val); 528 499 /* 529 500 * For GAM registers, all reads should be directed to instance 1 530 501 * (unicast reads against other instances are not allowed), ··· 562 533 continue; 563 534 564 535 for (int i = 0; gt->steering[type].ranges[i].end > 0; i++) { 565 - if (xe_mmio_in_range(gt, &gt->steering[type].ranges[i], reg)) { 536 + if (xe_mmio_in_range(&gt->mmio, &gt->steering[type].ranges[i], reg)) { 566 537 *group = gt->steering[type].group_target; 567 538 *instance = gt->steering[type].instance_target; 
568 539 return true; ··· 573 544 implicit_ranges = gt->steering[IMPLICIT_STEERING].ranges; 574 545 if (implicit_ranges) 575 546 for (int i = 0; implicit_ranges[i].end > 0; i++) 576 - if (xe_mmio_in_range(gt, &implicit_ranges[i], reg)) 547 + if (xe_mmio_in_range(&gt->mmio, &implicit_ranges[i], reg)) 577 548 return false; 578 549 579 550 /* ··· 608 579 * when a read to the relevant register returns 1. 609 580 */ 610 581 if (GRAPHICS_VERx100(xe) >= 1270) 611 - ret = xe_mmio_wait32(gt, STEER_SEMAPHORE, 0x1, 0x1, 10, NULL, 582 + ret = xe_mmio_wait32(&gt->mmio, STEER_SEMAPHORE, 0x1, 0x1, 10, NULL, 612 583 true); 613 584 614 585 drm_WARN_ON_ONCE(&xe->drm, ret == -ETIMEDOUT); ··· 618 589 { 619 590 /* Release hardware semaphore - this is done by writing 1 to the register */ 620 591 if (GRAPHICS_VERx100(gt_to_xe(gt)) >= 1270) 621 - xe_mmio_write32(gt, STEER_SEMAPHORE, 0x1); 592 + xe_mmio_write32(&gt->mmio, STEER_SEMAPHORE, 0x1); 622 593 623 594 spin_unlock(&gt->mcr_lock); 624 595 } ··· 632 603 u8 rw_flag, int group, int instance, u32 value) 633 604 { 634 605 const struct xe_reg reg = to_xe_reg(reg_mcr); 606 + struct xe_mmio *mmio = &gt->mmio; 635 607 struct xe_reg steer_reg; 636 608 u32 steer_val, val = 0; 637 609 ··· 665 635 if (rw_flag == MCR_OP_READ) 666 636 steer_val |= MCR_MULTICAST; 667 637 668 - xe_mmio_write32(gt, steer_reg, steer_val); 638 + xe_mmio_write32(mmio, steer_reg, steer_val); 669 639 670 640 if (rw_flag == MCR_OP_READ) 671 - val = xe_mmio_read32(gt, reg); 641 + val = xe_mmio_read32(mmio, reg); 672 642 else 673 - xe_mmio_write32(gt, reg, value); 643 + xe_mmio_write32(mmio, reg, value); 674 644 675 645 /* 676 646 * If we turned off the multicast bit (during a write) we're required ··· 679 649 * operation. 
680 650 */ 681 651 if (rw_flag == MCR_OP_WRITE) 682 - xe_mmio_write32(gt, steer_reg, MCR_MULTICAST); 652 + xe_mmio_write32(mmio, steer_reg, MCR_MULTICAST); 683 653 684 654 return val; 685 655 } ··· 714 684 group, instance, 0); 715 685 mcr_unlock(gt); 716 686 } else { 717 - val = xe_mmio_read32(gt, reg); 687 + val = xe_mmio_read32(&gt->mmio, reg); 718 688 } 719 689 720 690 return val; ··· 787 757 * to touch the steering register. 788 758 */ 789 759 mcr_lock(gt); 790 - xe_mmio_write32(gt, reg, value); 760 + xe_mmio_write32(&gt->mmio, reg, value); 791 761 mcr_unlock(gt); 792 762 } 793 763
+1
drivers/gpu/drm/xe/xe_gt_mcr.h
··· 28 28 29 29 void xe_gt_mcr_steering_dump(struct xe_gt *gt, struct drm_printer *p); 30 30 void xe_gt_mcr_get_dss_steering(struct xe_gt *gt, unsigned int dss, u16 *group, u16 *instance); 31 + u32 xe_gt_mcr_steering_info_to_dss_id(struct xe_gt *gt, u16 group, u16 instance); 31 32 32 33 /* 33 34 * Loop over each DSS and determine the group and instance IDs that
+1 -1
drivers/gpu/drm/xe/xe_gt_printk.h
··· 8 8 9 9 #include <drm/drm_print.h> 10 10 11 - #include "xe_device_types.h" 11 + #include "xe_gt_types.h" 12 12 13 13 #define xe_gt_printk(_gt, _level, _fmt, ...) \ 14 14 drm_##_level(&gt_to_xe(_gt)->drm, "GT%u: " _fmt, (_gt)->info.id, ##__VA_ARGS__)
+55 -1
drivers/gpu/drm/xe/xe_gt_sriov_pf.c
··· 5 5 6 6 #include <drm/drm_managed.h> 7 7 8 + #include "regs/xe_guc_regs.h" 8 9 #include "regs/xe_regs.h" 9 10 11 + #include "xe_gt.h" 10 12 #include "xe_gt_sriov_pf.h" 11 13 #include "xe_gt_sriov_pf_config.h" 12 14 #include "xe_gt_sriov_pf_control.h" 13 15 #include "xe_gt_sriov_pf_helpers.h" 16 + #include "xe_gt_sriov_pf_migration.h" 14 17 #include "xe_gt_sriov_pf_service.h" 15 18 #include "xe_mmio.h" 16 19 ··· 75 72 76 73 static void pf_enable_ggtt_guest_update(struct xe_gt *gt) 77 74 { 78 - xe_mmio_write32(gt, VIRTUAL_CTRL_REG, GUEST_GTT_UPDATE_EN); 75 + xe_mmio_write32(&gt->mmio, VIRTUAL_CTRL_REG, GUEST_GTT_UPDATE_EN); 79 76 } 80 77 81 78 /** ··· 90 87 pf_enable_ggtt_guest_update(gt); 91 88 92 89 xe_gt_sriov_pf_service_update(gt); 90 + xe_gt_sriov_pf_migration_init(gt); 91 + } 92 + 93 + static u32 pf_get_vf_regs_stride(struct xe_device *xe) 94 + { 95 + return GRAPHICS_VERx100(xe) > 1200 ? 0x400 : 0x1000; 96 + } 97 + 98 + static struct xe_reg xe_reg_vf_to_pf(struct xe_reg vf_reg, unsigned int vfid, u32 stride) 99 + { 100 + struct xe_reg pf_reg = vf_reg; 101 + 102 + pf_reg.vf = 0; 103 + pf_reg.addr += stride * vfid; 104 + 105 + return pf_reg; 106 + } 107 + 108 + static void pf_clear_vf_scratch_regs(struct xe_gt *gt, unsigned int vfid) 109 + { 110 + u32 stride = pf_get_vf_regs_stride(gt_to_xe(gt)); 111 + struct xe_reg scratch; 112 + int n, count; 113 + 114 + if (xe_gt_is_media_type(gt)) { 115 + count = MED_VF_SW_FLAG_COUNT; 116 + for (n = 0; n < count; n++) { 117 + scratch = xe_reg_vf_to_pf(MED_VF_SW_FLAG(n), vfid, stride); 118 + xe_mmio_write32(&gt->mmio, scratch, 0); 119 + } 120 + } else { 121 + count = VF_SW_FLAG_COUNT; 122 + for (n = 0; n < count; n++) { 123 + scratch = xe_reg_vf_to_pf(VF_SW_FLAG(n), vfid, stride); 124 + xe_mmio_write32(&gt->mmio, scratch, 0); 125 + } 126 + } 127 + } 128 + 129 + /** 130 + * xe_gt_sriov_pf_sanitize_hw() - Reset hardware state related to a VF. 
131 + * @gt: the &xe_gt 132 + * @vfid: the VF identifier 133 + * 134 + * This function can only be called on PF. 135 + */ 136 + void xe_gt_sriov_pf_sanitize_hw(struct xe_gt *gt, unsigned int vfid) 137 + { 138 + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt))); 139 + 140 + pf_clear_vf_scratch_regs(gt, vfid); 93 141 } 94 142 95 143 /**
+1
drivers/gpu/drm/xe/xe_gt_sriov_pf.h
··· 11 11 #ifdef CONFIG_PCI_IOV 12 12 int xe_gt_sriov_pf_init_early(struct xe_gt *gt); 13 13 void xe_gt_sriov_pf_init_hw(struct xe_gt *gt); 14 + void xe_gt_sriov_pf_sanitize_hw(struct xe_gt *gt, unsigned int vfid); 14 15 void xe_gt_sriov_pf_restart(struct xe_gt *gt); 15 16 #else 16 17 static inline int xe_gt_sriov_pf_init_early(struct xe_gt *gt)
+190 -14
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
··· 34 34 #include "xe_ttm_vram_mgr.h" 35 35 #include "xe_wopcm.h" 36 36 37 + #define make_u64_from_u32(hi, lo) ((u64)((u64)(u32)(hi) << 32 | (u32)(lo))) 38 + 37 39 /* 38 40 * Return: number of KLVs that were successfully parsed and saved, 39 41 * negative error code on failure. ··· 231 229 } 232 230 233 231 /* Return: number of configuration dwords written */ 234 - static u32 encode_config_ggtt(u32 *cfg, const struct xe_gt_sriov_config *config) 232 + static u32 encode_config_ggtt(u32 *cfg, const struct xe_gt_sriov_config *config, bool details) 235 233 { 236 234 u32 n = 0; 237 235 238 236 if (xe_ggtt_node_allocated(config->ggtt_region)) { 239 - cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_GGTT_START); 240 - cfg[n++] = lower_32_bits(config->ggtt_region->base.start); 241 - cfg[n++] = upper_32_bits(config->ggtt_region->base.start); 237 + if (details) { 238 + cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_GGTT_START); 239 + cfg[n++] = lower_32_bits(config->ggtt_region->base.start); 240 + cfg[n++] = upper_32_bits(config->ggtt_region->base.start); 241 + } 242 242 243 243 cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_GGTT_SIZE); 244 244 cfg[n++] = lower_32_bits(config->ggtt_region->base.size); ··· 251 247 } 252 248 253 249 /* Return: number of configuration dwords written */ 254 - static u32 encode_config(u32 *cfg, const struct xe_gt_sriov_config *config) 250 + static u32 encode_config(u32 *cfg, const struct xe_gt_sriov_config *config, bool details) 255 251 { 256 252 u32 n = 0; 257 253 258 - n += encode_config_ggtt(cfg, config); 254 + n += encode_config_ggtt(cfg, config, details); 259 255 260 - cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_BEGIN_CONTEXT_ID); 261 - cfg[n++] = config->begin_ctx; 256 + if (details) { 257 + cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_BEGIN_CONTEXT_ID); 258 + cfg[n++] = config->begin_ctx; 259 + } 262 260 263 261 cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_NUM_CONTEXTS); 264 262 cfg[n++] = config->num_ctxs; 265 263 266 - cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_BEGIN_DOORBELL_ID); 267 - cfg[n++] = 
config->begin_db; 264 + cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_BEGIN_DOORBELL_ID); 265 + cfg[n++] = config->begin_db; 266 + } 267 + 268 268 cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_NUM_DOORBELLS); 269 269 cfg[n++] = config->num_dbs; ··· 309 301 if (!cfg) 310 302 return -ENOMEM; 311 303 312 - num_dwords = encode_config(cfg, config); 304 + num_dwords = encode_config(cfg, config, true); 313 305 xe_gt_assert(gt, num_dwords <= max_cfg_dwords); 314 306 315 307 if (xe_gt_is_media_type(gt)) { ··· 317 309 struct xe_gt_sriov_config *other = pf_pick_vf_config(primary, vfid); 318 310 319 311 /* media-GT will never include a GGTT config */ 320 - xe_gt_assert(gt, !encode_config_ggtt(cfg + num_dwords, config)); 312 + xe_gt_assert(gt, !encode_config_ggtt(cfg + num_dwords, config, true)); 321 313 322 314 /* the GGTT config must be taken from the primary-GT instead */ 323 - num_dwords += encode_config_ggtt(cfg + num_dwords, other); 315 + num_dwords += encode_config_ggtt(cfg + num_dwords, other, true); 324 316 325 317 xe_gt_assert(gt, num_dwords <= max_cfg_dwords); 326 318 ··· 2050 2042 valid_all = valid_all && valid_lmem; 2051 2043 } 2052 2044 2053 - return valid_all ? 1 : valid_any ? -ENOKEY : -ENODATA; 2045 + return valid_all ? 0 : valid_any ? -ENOKEY : -ENODATA; 2054 2046 } 2055 2047 2056 2048 /** ··· 2074 2066 mutex_unlock(xe_gt_sriov_pf_master_mutex(gt)); 2075 2067 2076 2068 return empty; 2069 + } 2070 + 2071 + /** 2072 + * xe_gt_sriov_pf_config_save - Save a VF provisioning config as binary blob. 2073 + * @gt: the &xe_gt 2074 + * @vfid: the VF identifier (can't be PF) 2075 + * @buf: the buffer to save a config to (or NULL to query the buf size) 2076 + * @size: the size of the buffer (or 0 to query the buf size) 2077 + * 2078 + * This function can only be called on PF. 2079 + * 2080 + * Return: minimum size of the buffer or the number of bytes saved, 2081 + * or a negative error code on failure. 
2082 + */ 2083 + ssize_t xe_gt_sriov_pf_config_save(struct xe_gt *gt, unsigned int vfid, void *buf, size_t size) 2084 + { 2085 + struct xe_gt_sriov_config *config; 2086 + ssize_t ret; 2087 + 2088 + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt))); 2089 + xe_gt_assert(gt, vfid); 2090 + xe_gt_assert(gt, !(!buf ^ !size)); 2091 + 2092 + mutex_lock(xe_gt_sriov_pf_master_mutex(gt)); 2093 + ret = pf_validate_vf_config(gt, vfid); 2094 + if (!size) { 2095 + ret = ret ? 0 : SZ_4K; 2096 + } else if (!ret) { 2097 + if (size < SZ_4K) { 2098 + ret = -ENOBUFS; 2099 + } else { 2100 + config = pf_pick_vf_config(gt, vfid); 2101 + ret = encode_config(buf, config, false) * sizeof(u32); 2102 + } 2103 + } 2104 + mutex_unlock(xe_gt_sriov_pf_master_mutex(gt)); 2105 + 2106 + return ret; 2107 + } 2108 + 2109 + static int pf_restore_vf_config_klv(struct xe_gt *gt, unsigned int vfid, 2110 + u32 key, u32 len, const u32 *value) 2111 + { 2112 + switch (key) { 2113 + case GUC_KLV_VF_CFG_NUM_CONTEXTS_KEY: 2114 + if (len != GUC_KLV_VF_CFG_NUM_CONTEXTS_LEN) 2115 + return -EBADMSG; 2116 + return pf_provision_vf_ctxs(gt, vfid, value[0]); 2117 + 2118 + case GUC_KLV_VF_CFG_NUM_DOORBELLS_KEY: 2119 + if (len != GUC_KLV_VF_CFG_NUM_DOORBELLS_LEN) 2120 + return -EBADMSG; 2121 + return pf_provision_vf_dbs(gt, vfid, value[0]); 2122 + 2123 + case GUC_KLV_VF_CFG_EXEC_QUANTUM_KEY: 2124 + if (len != GUC_KLV_VF_CFG_EXEC_QUANTUM_LEN) 2125 + return -EBADMSG; 2126 + return pf_provision_exec_quantum(gt, vfid, value[0]); 2127 + 2128 + case GUC_KLV_VF_CFG_PREEMPT_TIMEOUT_KEY: 2129 + if (len != GUC_KLV_VF_CFG_PREEMPT_TIMEOUT_LEN) 2130 + return -EBADMSG; 2131 + return pf_provision_preempt_timeout(gt, vfid, value[0]); 2132 + 2133 + /* auto-generate case statements */ 2134 + #define define_threshold_key_to_provision_case(TAG, ...) 
\ 2135 + case MAKE_GUC_KLV_VF_CFG_THRESHOLD_KEY(TAG): \ 2136 + BUILD_BUG_ON(MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG) != 1u); \ 2137 + if (len != MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG)) \ 2138 + return -EBADMSG; \ 2139 + return pf_provision_threshold(gt, vfid, \ 2140 + MAKE_XE_GUC_KLV_THRESHOLD_INDEX(TAG), \ 2141 + value[0]); 2142 + 2143 + MAKE_XE_GUC_KLV_THRESHOLDS_SET(define_threshold_key_to_provision_case) 2144 + #undef define_threshold_key_to_provision_case 2145 + } 2146 + 2147 + if (xe_gt_is_media_type(gt)) 2148 + return -EKEYREJECTED; 2149 + 2150 + switch (key) { 2151 + case GUC_KLV_VF_CFG_GGTT_SIZE_KEY: 2152 + if (len != GUC_KLV_VF_CFG_GGTT_SIZE_LEN) 2153 + return -EBADMSG; 2154 + return pf_provision_vf_ggtt(gt, vfid, make_u64_from_u32(value[1], value[0])); 2155 + 2156 + case GUC_KLV_VF_CFG_LMEM_SIZE_KEY: 2157 + if (!IS_DGFX(gt_to_xe(gt))) 2158 + return -EKEYREJECTED; 2159 + if (len != GUC_KLV_VF_CFG_LMEM_SIZE_LEN) 2160 + return -EBADMSG; 2161 + return pf_provision_vf_lmem(gt, vfid, make_u64_from_u32(value[1], value[0])); 2162 + } 2163 + 2164 + return -EKEYREJECTED; 2165 + } 2166 + 2167 + static int pf_restore_vf_config(struct xe_gt *gt, unsigned int vfid, 2168 + const u32 *klvs, size_t num_dwords) 2169 + { 2170 + int err; 2171 + 2172 + while (num_dwords >= GUC_KLV_LEN_MIN) { 2173 + u32 key = FIELD_GET(GUC_KLV_0_KEY, klvs[0]); 2174 + u32 len = FIELD_GET(GUC_KLV_0_LEN, klvs[0]); 2175 + 2176 + klvs += GUC_KLV_LEN_MIN; 2177 + num_dwords -= GUC_KLV_LEN_MIN; 2178 + 2179 + if (num_dwords < len) 2180 + err = -EBADMSG; 2181 + else 2182 + err = pf_restore_vf_config_klv(gt, vfid, key, len, klvs); 2183 + 2184 + if (err) { 2185 + xe_gt_sriov_dbg(gt, "restore failed on key %#x (%pe)\n", key, ERR_PTR(err)); 2186 + return err; 2187 + } 2188 + 2189 + klvs += len; 2190 + num_dwords -= len; 2191 + } 2192 + 2193 + return pf_validate_vf_config(gt, vfid); 2194 + } 2195 + 2196 + /** 2197 + * xe_gt_sriov_pf_config_restore - Restore a VF provisioning config from binary blob. 
2198 + * @gt: the &xe_gt 2199 + * @vfid: the VF identifier (can't be PF) 2200 + * @buf: the buffer with config data 2201 + * @size: the size of the config data 2202 + * 2203 + * This function can only be called on PF. 2204 + * 2205 + * Return: 0 on success or a negative error code on failure. 2206 + */ 2207 + int xe_gt_sriov_pf_config_restore(struct xe_gt *gt, unsigned int vfid, 2208 + const void *buf, size_t size) 2209 + { 2210 + int err; 2211 + 2212 + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt))); 2213 + xe_gt_assert(gt, vfid); 2214 + 2215 + if (!size) 2216 + return -ENODATA; 2217 + 2218 + if (size % sizeof(u32)) 2219 + return -EINVAL; 2220 + 2221 + if (IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV)) { 2222 + struct drm_printer p = xe_gt_info_printer(gt); 2223 + 2224 + drm_printf(&p, "restoring VF%u config:\n", vfid); 2225 + xe_guc_klv_print(buf, size / sizeof(u32), &p); 2226 + } 2227 + 2228 + mutex_lock(xe_gt_sriov_pf_master_mutex(gt)); 2229 + err = pf_send_vf_cfg_reset(gt, vfid); 2230 + if (!err) { 2231 + pf_release_vf_config(gt, vfid); 2232 + err = pf_restore_vf_config(gt, vfid, buf, size / sizeof(u32)); 2233 + } 2234 + mutex_unlock(xe_gt_sriov_pf_master_mutex(gt)); 2235 + 2236 + return err; 2077 2237 } 2078 2238 2079 2239 /**
+4
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
··· 54 54 int xe_gt_sriov_pf_config_release(struct xe_gt *gt, unsigned int vfid, bool force); 55 55 int xe_gt_sriov_pf_config_push(struct xe_gt *gt, unsigned int vfid, bool refresh); 56 56 57 + ssize_t xe_gt_sriov_pf_config_save(struct xe_gt *gt, unsigned int vfid, void *buf, size_t size); 58 + int xe_gt_sriov_pf_config_restore(struct xe_gt *gt, unsigned int vfid, 59 + const void *buf, size_t size); 60 + 57 61 bool xe_gt_sriov_pf_config_is_empty(struct xe_gt *gt, unsigned int vfid); 58 62 59 63 void xe_gt_sriov_pf_config_restart(struct xe_gt *gt);
+42 -2
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
··· 9 9 10 10 #include "xe_device.h" 11 11 #include "xe_gt.h" 12 + #include "xe_gt_sriov_pf.h" 12 13 #include "xe_gt_sriov_pf_config.h" 13 14 #include "xe_gt_sriov_pf_control.h" 14 15 #include "xe_gt_sriov_pf_helpers.h" 16 + #include "xe_gt_sriov_pf_migration.h" 15 17 #include "xe_gt_sriov_pf_monitor.h" 16 18 #include "xe_gt_sriov_pf_service.h" 17 19 #include "xe_gt_sriov_printk.h" ··· 178 176 CASE2STR(PAUSE_SEND_PAUSE); 179 177 CASE2STR(PAUSE_WAIT_GUC); 180 178 CASE2STR(PAUSE_GUC_DONE); 179 + CASE2STR(PAUSE_SAVE_GUC); 181 180 CASE2STR(PAUSE_FAILED); 182 181 CASE2STR(PAUSED); 183 182 CASE2STR(RESUME_WIP); ··· 418 415 * : | : / 419 416 * : v : / 420 417 * : PAUSE_GUC_DONE o-----restart 418 + * : | : 419 + * : | o---<--busy : 420 + * : v / / : 421 + * : PAUSE_SAVE_GUC : 421 422 * : / : 422 423 * : / : 423 424 * :....o..............o...............o...........: ··· 441 434 pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_SEND_PAUSE); 442 435 pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_WAIT_GUC); 443 436 pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_GUC_DONE); 437 + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC); 444 438 } 445 439 } 446 440 ··· 472 464 pf_enter_vf_pause_failed(gt, vfid); 473 465 } 474 466 467 + static void pf_enter_vf_pause_save_guc(struct xe_gt *gt, unsigned int vfid) 468 + { 469 + if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC)) 470 + pf_enter_vf_state_machine_bug(gt, vfid); 471 + } 472 + 473 + static bool pf_exit_vf_pause_save_guc(struct xe_gt *gt, unsigned int vfid) 474 + { 475 + int err; 476 + 477 + if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC)) 478 + return false; 479 + 480 + err = xe_gt_sriov_pf_migration_save_guc_state(gt, vfid); 481 + if (err) { 482 + /* retry if busy */ 483 + if (err == -EBUSY) { 484 + pf_enter_vf_pause_save_guc(gt, vfid); 485 + return true; 486 + } 487 + /* give up on error */ 488 + if (err == -EIO) 489 + pf_enter_vf_mismatch(gt, vfid); 490 + } 491 + 
492 + pf_enter_vf_pause_completed(gt, vfid); 493 + return true; 494 + } 495 + 475 496 static bool pf_exit_vf_pause_guc_done(struct xe_gt *gt, unsigned int vfid) 476 497 { 477 498 if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_GUC_DONE)) 478 499 return false; 479 500 480 - pf_enter_vf_pause_completed(gt, vfid); 501 + pf_enter_vf_pause_save_guc(gt, vfid); 481 502 return true; 482 503 } 483 504 ··· 1045 1008 if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_FLR_RESET_MMIO)) 1046 1009 return false; 1047 1010 1048 - /* XXX: placeholder */ 1011 + xe_gt_sriov_pf_sanitize_hw(gt, vfid); 1049 1012 1050 1013 pf_enter_vf_flr_send_finish(gt, vfid); 1051 1014 return true; ··· 1373 1336 } 1374 1337 1375 1338 if (pf_exit_vf_pause_guc_done(gt, vfid)) 1339 + return true; 1340 + 1341 + if (pf_exit_vf_pause_save_guc(gt, vfid)) 1376 1342 return true; 1377 1343 1378 1344 if (pf_exit_vf_resume_send_resume(gt, vfid))
+2
drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
··· 27 27 * @XE_GT_SRIOV_STATE_PAUSE_SEND_PAUSE: indicates that the PF is about to send a PAUSE command. 28 28 * @XE_GT_SRIOV_STATE_PAUSE_WAIT_GUC: indicates that the PF awaits for a response from the GuC. 29 29 * @XE_GT_SRIOV_STATE_PAUSE_GUC_DONE: indicates that the PF has received a response from the GuC. 30 + * @XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC: indicates that the PF needs to save the VF GuC state. 30 31 * @XE_GT_SRIOV_STATE_PAUSE_FAILED: indicates that a VF pause operation has failed. 31 32 * @XE_GT_SRIOV_STATE_PAUSED: indicates that the VF is paused. 32 33 * @XE_GT_SRIOV_STATE_RESUME_WIP: indicates the a VF resume operation is in progress. ··· 57 56 XE_GT_SRIOV_STATE_PAUSE_SEND_PAUSE, 58 57 XE_GT_SRIOV_STATE_PAUSE_WAIT_GUC, 59 58 XE_GT_SRIOV_STATE_PAUSE_GUC_DONE, 59 + XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC, 60 60 XE_GT_SRIOV_STATE_PAUSE_FAILED, 61 61 XE_GT_SRIOV_STATE_PAUSED, 62 62
+127
drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
··· 17 17 #include "xe_gt_sriov_pf_control.h" 18 18 #include "xe_gt_sriov_pf_debugfs.h" 19 19 #include "xe_gt_sriov_pf_helpers.h" 20 + #include "xe_gt_sriov_pf_migration.h" 20 21 #include "xe_gt_sriov_pf_monitor.h" 21 22 #include "xe_gt_sriov_pf_policy.h" 22 23 #include "xe_gt_sriov_pf_service.h" ··· 313 312 { "stop", xe_gt_sriov_pf_control_stop_vf }, 314 313 { "pause", xe_gt_sriov_pf_control_pause_vf }, 315 314 { "resume", xe_gt_sriov_pf_control_resume_vf }, 315 + #ifdef CONFIG_DRM_XE_DEBUG_SRIOV 316 + { "restore!", xe_gt_sriov_pf_migration_restore_guc_state }, 317 + #endif 316 318 }; 317 319 318 320 static ssize_t control_write(struct file *file, const char __user *buf, size_t count, loff_t *pos) ··· 379 375 .llseek = default_llseek, 380 376 }; 381 377 378 + /* 379 + * /sys/kernel/debug/dri/0/ 380 + * ├── gt0 381 + * │   ├── vf1 382 + * │   │   ├── guc_state 383 + */ 384 + static ssize_t guc_state_read(struct file *file, char __user *buf, 385 + size_t count, loff_t *pos) 386 + { 387 + struct dentry *dent = file_dentry(file); 388 + struct dentry *parent = dent->d_parent; 389 + struct xe_gt *gt = extract_gt(parent); 390 + unsigned int vfid = extract_vfid(parent); 391 + 392 + return xe_gt_sriov_pf_migration_read_guc_state(gt, vfid, buf, count, pos); 393 + } 394 + 395 + static ssize_t guc_state_write(struct file *file, const char __user *buf, 396 + size_t count, loff_t *pos) 397 + { 398 + struct dentry *dent = file_dentry(file); 399 + struct dentry *parent = dent->d_parent; 400 + struct xe_gt *gt = extract_gt(parent); 401 + unsigned int vfid = extract_vfid(parent); 402 + 403 + if (*pos) 404 + return -EINVAL; 405 + 406 + return xe_gt_sriov_pf_migration_write_guc_state(gt, vfid, buf, count); 407 + } 408 + 409 + static const struct file_operations guc_state_ops = { 410 + .owner = THIS_MODULE, 411 + .read = guc_state_read, 412 + .write = guc_state_write, 413 + .llseek = default_llseek, 414 + }; 415 + 416 + /* 417 + * /sys/kernel/debug/dri/0/ 418 + * ├── gt0 419 + * │   
├── vf1 420 + * │   │   ├── config_blob 421 + */ 422 + static ssize_t config_blob_read(struct file *file, char __user *buf, 423 + size_t count, loff_t *pos) 424 + { 425 + struct dentry *dent = file_dentry(file); 426 + struct dentry *parent = dent->d_parent; 427 + struct xe_gt *gt = extract_gt(parent); 428 + unsigned int vfid = extract_vfid(parent); 429 + ssize_t ret; 430 + void *tmp; 431 + 432 + ret = xe_gt_sriov_pf_config_save(gt, vfid, NULL, 0); 433 + if (!ret) 434 + return -ENODATA; 435 + if (ret < 0) 436 + return ret; 437 + 438 + tmp = kzalloc(ret, GFP_KERNEL); 439 + if (!tmp) 440 + return -ENOMEM; 441 + 442 + ret = xe_gt_sriov_pf_config_save(gt, vfid, tmp, ret); 443 + if (ret > 0) 444 + ret = simple_read_from_buffer(buf, count, pos, tmp, ret); 445 + 446 + kfree(tmp); 447 + return ret; 448 + } 449 + 450 + static ssize_t config_blob_write(struct file *file, const char __user *buf, 451 + size_t count, loff_t *pos) 452 + { 453 + struct dentry *dent = file_dentry(file); 454 + struct dentry *parent = dent->d_parent; 455 + struct xe_gt *gt = extract_gt(parent); 456 + unsigned int vfid = extract_vfid(parent); 457 + ssize_t ret; 458 + void *tmp; 459 + 460 + if (*pos) 461 + return -EINVAL; 462 + 463 + if (!count) 464 + return -ENODATA; 465 + 466 + if (count > SZ_4K) 467 + return -EINVAL; 468 + 469 + tmp = kzalloc(count, GFP_KERNEL); 470 + if (!tmp) 471 + return -ENOMEM; 472 + 473 + if (copy_from_user(tmp, buf, count)) { 474 + ret = -EFAULT; 475 + } else { 476 + ret = xe_gt_sriov_pf_config_restore(gt, vfid, tmp, count); 477 + if (!ret) 478 + ret = count; 479 + } 480 + kfree(tmp); 481 + return ret; 482 + } 483 + 484 + static const struct file_operations config_blob_ops = { 485 + .owner = THIS_MODULE, 486 + .read = config_blob_read, 487 + .write = config_blob_write, 488 + .llseek = default_llseek, 489 + }; 490 + 382 491 /** 383 492 * xe_gt_sriov_pf_debugfs_register - Register SR-IOV PF specific entries in GT debugfs. 
384 493 * @gt: the &xe_gt to register ··· 540 423 541 424 pf_add_config_attrs(gt, vfdentry, VFID(n)); 542 425 debugfs_create_file("control", 0600, vfdentry, NULL, &control_ops); 426 + 427 + /* for testing/debugging purposes only! */ 428 + if (IS_ENABLED(CONFIG_DRM_XE_DEBUG)) { 429 + debugfs_create_file("guc_state", 430 + IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV) ? 0600 : 0400, 431 + vfdentry, NULL, &guc_state_ops); 432 + debugfs_create_file("config_blob", 433 + IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV) ? 0600 : 0400, 434 + vfdentry, NULL, &config_blob_ops); 435 + } 543 436 } 544 437 }
+419
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
··· 1 + // SPDX-License-Identifier: MIT 2 + /* 3 + * Copyright © 2024 Intel Corporation 4 + */ 5 + 6 + #include <drm/drm_managed.h> 7 + 8 + #include "abi/guc_actions_sriov_abi.h" 9 + #include "xe_bo.h" 10 + #include "xe_gt_sriov_pf_helpers.h" 11 + #include "xe_gt_sriov_pf_migration.h" 12 + #include "xe_gt_sriov_printk.h" 13 + #include "xe_guc.h" 14 + #include "xe_guc_ct.h" 15 + #include "xe_sriov.h" 16 + 17 + /* Return: number of dwords saved/restored/required or a negative error code on failure */ 18 + static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode, 19 + u64 addr, u32 ndwords) 20 + { 21 + u32 request[PF2GUC_SAVE_RESTORE_VF_REQUEST_MSG_LEN] = { 22 + FIELD_PREP(GUC_HXG_MSG_0_ORIGIN, GUC_HXG_ORIGIN_HOST) | 23 + FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) | 24 + FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION, GUC_ACTION_PF2GUC_SAVE_RESTORE_VF) | 25 + FIELD_PREP(PF2GUC_SAVE_RESTORE_VF_REQUEST_MSG_0_OPCODE, opcode), 26 + FIELD_PREP(PF2GUC_SAVE_RESTORE_VF_REQUEST_MSG_1_VFID, vfid), 27 + FIELD_PREP(PF2GUC_SAVE_RESTORE_VF_REQUEST_MSG_2_ADDR_LO, lower_32_bits(addr)), 28 + FIELD_PREP(PF2GUC_SAVE_RESTORE_VF_REQUEST_MSG_3_ADDR_HI, upper_32_bits(addr)), 29 + FIELD_PREP(PF2GUC_SAVE_RESTORE_VF_REQUEST_MSG_4_SIZE, ndwords), 30 + }; 31 + 32 + return xe_guc_ct_send_block(&guc->ct, request, ARRAY_SIZE(request)); 33 + } 34 + 35 + /* Return: size of the state in dwords or a negative error code on failure */ 36 + static int pf_send_guc_query_vf_state_size(struct xe_gt *gt, unsigned int vfid) 37 + { 38 + int ret; 39 + 40 + ret = guc_action_vf_save_restore(&gt->uc.guc, vfid, GUC_PF_OPCODE_VF_SAVE, 0, 0); 41 + return ret ?: -ENODATA; 42 + } 43 + 44 + /* Return: number of state dwords saved or a negative error code on failure */ 45 + static int pf_send_guc_save_vf_state(struct xe_gt *gt, unsigned int vfid, 46 + void *buff, size_t size) 47 + { 48 + const int ndwords = size / sizeof(u32); 49 + struct xe_tile *tile = gt_to_tile(gt); 50 + struct xe_device *xe = 
tile_to_xe(tile); 51 + struct xe_guc *guc = &gt->uc.guc; 52 + struct xe_bo *bo; 53 + int ret; 54 + 55 + xe_gt_assert(gt, size % sizeof(u32) == 0); 56 + xe_gt_assert(gt, size == ndwords * sizeof(u32)); 57 + 58 + bo = xe_bo_create_pin_map(xe, tile, NULL, 59 + ALIGN(size, PAGE_SIZE), 60 + ttm_bo_type_kernel, 61 + XE_BO_FLAG_SYSTEM | 62 + XE_BO_FLAG_GGTT | 63 + XE_BO_FLAG_GGTT_INVALIDATE); 64 + if (IS_ERR(bo)) 65 + return PTR_ERR(bo); 66 + 67 + ret = guc_action_vf_save_restore(guc, vfid, GUC_PF_OPCODE_VF_SAVE, 68 + xe_bo_ggtt_addr(bo), ndwords); 69 + if (!ret) 70 + ret = -ENODATA; 71 + else if (ret > ndwords) 72 + ret = -EPROTO; 73 + else if (ret > 0) 74 + xe_map_memcpy_from(xe, buff, &bo->vmap, 0, ret * sizeof(u32)); 75 + 76 + xe_bo_unpin_map_no_vm(bo); 77 + return ret; 78 + } 79 + 80 + /* Return: number of state dwords restored or a negative error code on failure */ 81 + static int pf_send_guc_restore_vf_state(struct xe_gt *gt, unsigned int vfid, 82 + const void *buff, size_t size) 83 + { 84 + const int ndwords = size / sizeof(u32); 85 + struct xe_tile *tile = gt_to_tile(gt); 86 + struct xe_device *xe = tile_to_xe(tile); 87 + struct xe_guc *guc = &gt->uc.guc; 88 + struct xe_bo *bo; 89 + int ret; 90 + 91 + xe_gt_assert(gt, size % sizeof(u32) == 0); 92 + xe_gt_assert(gt, size == ndwords * sizeof(u32)); 93 + 94 + bo = xe_bo_create_pin_map(xe, tile, NULL, 95 + ALIGN(size, PAGE_SIZE), 96 + ttm_bo_type_kernel, 97 + XE_BO_FLAG_SYSTEM | 98 + XE_BO_FLAG_GGTT | 99 + XE_BO_FLAG_GGTT_INVALIDATE); 100 + if (IS_ERR(bo)) 101 + return PTR_ERR(bo); 102 + 103 + xe_map_memcpy_to(xe, &bo->vmap, 0, buff, size); 104 + 105 + ret = guc_action_vf_save_restore(guc, vfid, GUC_PF_OPCODE_VF_RESTORE, 106 + xe_bo_ggtt_addr(bo), ndwords); 107 + if (!ret) 108 + ret = -ENODATA; 109 + else if (ret > ndwords) 110 + ret = -EPROTO; 111 + 112 + xe_bo_unpin_map_no_vm(bo); 113 + return ret; 114 + } 115 + 116 + static bool pf_migration_supported(struct xe_gt *gt) 117 + { 118 + xe_gt_assert(gt, 
IS_SRIOV_PF(gt_to_xe(gt))); 119 + return gt->sriov.pf.migration.supported; 120 + } 121 + 122 + static struct mutex *pf_migration_mutex(struct xe_gt *gt) 123 + { 124 + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt))); 125 + return &gt->sriov.pf.migration.snapshot_lock; 126 + } 127 + 128 + static struct xe_gt_sriov_state_snapshot *pf_pick_vf_snapshot(struct xe_gt *gt, 129 + unsigned int vfid) 130 + { 131 + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt))); 132 + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt))); 133 + lockdep_assert_held(pf_migration_mutex(gt)); 134 + 135 + return &gt->sriov.pf.vfs[vfid].snapshot; 136 + } 137 + 138 + static unsigned int pf_snapshot_index(struct xe_gt *gt, struct xe_gt_sriov_state_snapshot *snapshot) 139 + { 140 + return container_of(snapshot, struct xe_gt_sriov_metadata, snapshot) - gt->sriov.pf.vfs; 141 + } 142 + 143 + static void pf_free_guc_state(struct xe_gt *gt, struct xe_gt_sriov_state_snapshot *snapshot) 144 + { 145 + struct xe_device *xe = gt_to_xe(gt); 146 + 147 + drmm_kfree(&xe->drm, snapshot->guc.buff); 148 + snapshot->guc.buff = NULL; 149 + snapshot->guc.size = 0; 150 + } 151 + 152 + static int pf_alloc_guc_state(struct xe_gt *gt, 153 + struct xe_gt_sriov_state_snapshot *snapshot, 154 + size_t size) 155 + { 156 + struct xe_device *xe = gt_to_xe(gt); 157 + void *p; 158 + 159 + pf_free_guc_state(gt, snapshot); 160 + 161 + if (!size) 162 + return -ENODATA; 163 + 164 + if (size % sizeof(u32)) 165 + return -EINVAL; 166 + 167 + if (size > SZ_2M) 168 + return -EFBIG; 169 + 170 + p = drmm_kzalloc(&xe->drm, size, GFP_KERNEL); 171 + if (!p) 172 + return -ENOMEM; 173 + 174 + snapshot->guc.buff = p; 175 + snapshot->guc.size = size; 176 + return 0; 177 + } 178 + 179 + static void pf_dump_guc_state(struct xe_gt *gt, struct xe_gt_sriov_state_snapshot *snapshot) 180 + { 181 + if (IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV)) { 182 + unsigned int vfid __maybe_unused = pf_snapshot_index(gt, snapshot); 183 + 184 + xe_gt_sriov_dbg_verbose(gt, 
"VF%u GuC state is %zu dwords:\n", 185 + vfid, snapshot->guc.size / sizeof(u32)); 186 + print_hex_dump_bytes("state: ", DUMP_PREFIX_OFFSET, 187 + snapshot->guc.buff, min(SZ_64, snapshot->guc.size)); 188 + } 189 + } 190 + 191 + static int pf_save_vf_guc_state(struct xe_gt *gt, unsigned int vfid) 192 + { 193 + struct xe_gt_sriov_state_snapshot *snapshot = pf_pick_vf_snapshot(gt, vfid); 194 + size_t size; 195 + int ret; 196 + 197 + ret = pf_send_guc_query_vf_state_size(gt, vfid); 198 + if (ret < 0) 199 + goto fail; 200 + size = ret * sizeof(u32); 201 + xe_gt_sriov_dbg_verbose(gt, "VF%u state size is %d dwords (%zu bytes)\n", vfid, ret, size); 202 + 203 + ret = pf_alloc_guc_state(gt, snapshot, size); 204 + if (ret < 0) 205 + goto fail; 206 + 207 + ret = pf_send_guc_save_vf_state(gt, vfid, snapshot->guc.buff, size); 208 + if (ret < 0) 209 + goto fail; 210 + size = ret * sizeof(u32); 211 + xe_gt_assert(gt, size); 212 + xe_gt_assert(gt, size <= snapshot->guc.size); 213 + snapshot->guc.size = size; 214 + 215 + pf_dump_guc_state(gt, snapshot); 216 + return 0; 217 + 218 + fail: 219 + xe_gt_sriov_dbg(gt, "Unable to save VF%u state (%pe)\n", vfid, ERR_PTR(ret)); 220 + pf_free_guc_state(gt, snapshot); 221 + return ret; 222 + } 223 + 224 + /** 225 + * xe_gt_sriov_pf_migration_save_guc_state() - Take a GuC VF state snapshot. 226 + * @gt: the &xe_gt 227 + * @vfid: the VF identifier 228 + * 229 + * This function is for PF only. 230 + * 231 + * Return: 0 on success or a negative error code on failure. 
232 + */ 233 + int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid) 234 + { 235 + int err; 236 + 237 + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt))); 238 + xe_gt_assert(gt, vfid != PFID); 239 + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt))); 240 + 241 + if (!pf_migration_supported(gt)) 242 + return -ENOPKG; 243 + 244 + mutex_lock(pf_migration_mutex(gt)); 245 + err = pf_save_vf_guc_state(gt, vfid); 246 + mutex_unlock(pf_migration_mutex(gt)); 247 + 248 + return err; 249 + } 250 + 251 + static int pf_restore_vf_guc_state(struct xe_gt *gt, unsigned int vfid) 252 + { 253 + struct xe_gt_sriov_state_snapshot *snapshot = pf_pick_vf_snapshot(gt, vfid); 254 + int ret; 255 + 256 + if (!snapshot->guc.size) 257 + return -ENODATA; 258 + 259 + xe_gt_sriov_dbg_verbose(gt, "restoring %zu dwords of VF%u GuC state\n", 260 + snapshot->guc.size / sizeof(u32), vfid); 261 + ret = pf_send_guc_restore_vf_state(gt, vfid, snapshot->guc.buff, snapshot->guc.size); 262 + if (ret < 0) 263 + goto fail; 264 + 265 + xe_gt_sriov_dbg_verbose(gt, "restored %d dwords of VF%u GuC state\n", ret, vfid); 266 + return 0; 267 + 268 + fail: 269 + xe_gt_sriov_dbg(gt, "Failed to restore VF%u GuC state (%pe)\n", vfid, ERR_PTR(ret)); 270 + return ret; 271 + } 272 + 273 + /** 274 + * xe_gt_sriov_pf_migration_restore_guc_state() - Restore a GuC VF state. 275 + * @gt: the &xe_gt 276 + * @vfid: the VF identifier 277 + * 278 + * This function is for PF only. 279 + * 280 + * Return: 0 on success or a negative error code on failure. 
281 + */ 282 + int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid) 283 + { 284 + int ret; 285 + 286 + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt))); 287 + xe_gt_assert(gt, vfid != PFID); 288 + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt))); 289 + 290 + if (!pf_migration_supported(gt)) 291 + return -ENOPKG; 292 + 293 + mutex_lock(pf_migration_mutex(gt)); 294 + ret = pf_restore_vf_guc_state(gt, vfid); 295 + mutex_unlock(pf_migration_mutex(gt)); 296 + 297 + return ret; 298 + } 299 + 300 + #ifdef CONFIG_DEBUG_FS 301 + /** 302 + * xe_gt_sriov_pf_migration_read_guc_state() - Read a GuC VF state. 303 + * @gt: the &xe_gt 304 + * @vfid: the VF identifier 305 + * @buf: the user space buffer to read to 306 + * @count: the maximum number of bytes to read 307 + * @pos: the current position in the buffer 308 + * 309 + * This function is for PF only. 310 + * 311 + * This function reads up to @count bytes from the saved VF GuC state buffer 312 + * at offset @pos into the user space address starting at @buf. 313 + * 314 + * Return: the number of bytes read or a negative error code on failure. 
315 + */ 316 + ssize_t xe_gt_sriov_pf_migration_read_guc_state(struct xe_gt *gt, unsigned int vfid, 317 + char __user *buf, size_t count, loff_t *pos) 318 + { 319 + struct xe_gt_sriov_state_snapshot *snapshot; 320 + ssize_t ret; 321 + 322 + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt))); 323 + xe_gt_assert(gt, vfid != PFID); 324 + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt))); 325 + 326 + if (!pf_migration_supported(gt)) 327 + return -ENOPKG; 328 + 329 + mutex_lock(pf_migration_mutex(gt)); 330 + snapshot = pf_pick_vf_snapshot(gt, vfid); 331 + if (snapshot->guc.size) 332 + ret = simple_read_from_buffer(buf, count, pos, snapshot->guc.buff, 333 + snapshot->guc.size); 334 + else 335 + ret = -ENODATA; 336 + mutex_unlock(pf_migration_mutex(gt)); 337 + 338 + return ret; 339 + } 340 + 341 + /** 342 + * xe_gt_sriov_pf_migration_write_guc_state() - Write a GuC VF state. 343 + * @gt: the &xe_gt 344 + * @vfid: the VF identifier 345 + * @buf: the user space buffer with GuC VF state 346 + * @size: the size of GuC VF state (in bytes) 347 + * 348 + * This function is for PF only. 349 + * 350 + * This function reads @size bytes of the VF GuC state stored at user space 351 + * address @buf and writes it into an internal VF state buffer. 352 + * 353 + * Return: the number of bytes used or a negative error code on failure. 
354 + */ 355 + ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int vfid, 356 + const char __user *buf, size_t size) 357 + { 358 + struct xe_gt_sriov_state_snapshot *snapshot; 359 + loff_t pos = 0; 360 + ssize_t ret; 361 + 362 + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt))); 363 + xe_gt_assert(gt, vfid != PFID); 364 + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt))); 365 + 366 + if (!pf_migration_supported(gt)) 367 + return -ENOPKG; 368 + 369 + mutex_lock(pf_migration_mutex(gt)); 370 + snapshot = pf_pick_vf_snapshot(gt, vfid); 371 + ret = pf_alloc_guc_state(gt, snapshot, size); 372 + if (!ret) { 373 + ret = simple_write_to_buffer(snapshot->guc.buff, size, &pos, buf, size); 374 + if (ret < 0) 375 + pf_free_guc_state(gt, snapshot); 376 + else 377 + pf_dump_guc_state(gt, snapshot); 378 + } 379 + mutex_unlock(pf_migration_mutex(gt)); 380 + 381 + return ret; 382 + } 383 + #endif /* CONFIG_DEBUG_FS */ 384 + 385 + static bool pf_check_migration_support(struct xe_gt *gt) 386 + { 387 + /* GuC 70.25 with save/restore v2 is required */ 388 + xe_gt_assert(gt, GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER(70, 25, 0)); 389 + 390 + /* XXX: for now this is for feature enabling only */ 391 + return IS_ENABLED(CONFIG_DRM_XE_DEBUG); 392 + } 393 + 394 + /** 395 + * xe_gt_sriov_pf_migration_init() - Initialize support for VF migration. 396 + * @gt: the &xe_gt 397 + * 398 + * This function is for PF only. 399 + * 400 + * Return: 0 on success or a negative error code on failure. 401 + */ 402 + int xe_gt_sriov_pf_migration_init(struct xe_gt *gt) 403 + { 404 + struct xe_device *xe = gt_to_xe(gt); 405 + int err; 406 + 407 + xe_gt_assert(gt, IS_SRIOV_PF(xe)); 408 + 409 + gt->sriov.pf.migration.supported = pf_check_migration_support(gt); 410 + 411 + if (!pf_migration_supported(gt)) 412 + return 0; 413 + 414 + err = drmm_mutex_init(&xe->drm, &gt->sriov.pf.migration.snapshot_lock); 415 + if (err) 416 + return err; 417 + 418 + return 0; 419 + }
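The new pf_alloc_guc_state() helper above rejects a snapshot size before allocating: empty state, a size that is not dword-granular, and anything above 2 MiB each map to a distinct errno. A minimal user-space sketch of just that validation, assuming only the error semantics visible in the patch (the function name here is illustrative, not the kernel symbol):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_2M (2u * 1024 * 1024)

/* Mirror of the size checks pf_alloc_guc_state() performs before
 * allocating the snapshot buffer; returns 0 or a negative errno. */
static int validate_guc_state_size(size_t size)
{
	if (!size)
		return -ENODATA;	/* nothing to save/restore */
	if (size % sizeof(uint32_t))
		return -EINVAL;		/* GuC state is dword-granular */
	if (size > SZ_2M)
		return -EFBIG;		/* sanity cap on snapshot size */
	return 0;
}
```

The same bound also guards the debugfs write path, since xe_gt_sriov_pf_migration_write_guc_state() funnels the user-supplied size through pf_alloc_guc_state() before copying.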
+24
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
··· 1 + /* SPDX-License-Identifier: MIT */ 2 + /* 3 + * Copyright © 2024 Intel Corporation 4 + */ 5 + 6 + #ifndef _XE_GT_SRIOV_PF_MIGRATION_H_ 7 + #define _XE_GT_SRIOV_PF_MIGRATION_H_ 8 + 9 + #include <linux/types.h> 10 + 11 + struct xe_gt; 12 + 13 + int xe_gt_sriov_pf_migration_init(struct xe_gt *gt); 14 + int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid); 15 + int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid); 16 + 17 + #ifdef CONFIG_DEBUG_FS 18 + ssize_t xe_gt_sriov_pf_migration_read_guc_state(struct xe_gt *gt, unsigned int vfid, 19 + char __user *buf, size_t count, loff_t *pos); 20 + ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int vfid, 21 + const char __user *buf, size_t count); 22 + #endif 23 + 24 + #endif
+40
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
··· 1 + /* SPDX-License-Identifier: MIT */ 2 + /* 3 + * Copyright © 2024 Intel Corporation 4 + */ 5 + 6 + #ifndef _XE_GT_SRIOV_PF_MIGRATION_TYPES_H_ 7 + #define _XE_GT_SRIOV_PF_MIGRATION_TYPES_H_ 8 + 9 + #include <linux/mutex.h> 10 + #include <linux/types.h> 11 + 12 + /** 13 + * struct xe_gt_sriov_state_snapshot - GT-level per-VF state snapshot data. 14 + * 15 + * Used by the PF driver to maintain per-VF migration data. 16 + */ 17 + struct xe_gt_sriov_state_snapshot { 18 + /** @guc: GuC VF state snapshot */ 19 + struct { 20 + /** @guc.buff: buffer with the VF state */ 21 + u32 *buff; 22 + /** @guc.size: size of the buffer (must be dwords aligned) */ 23 + u32 size; 24 + } guc; 25 + }; 26 + 27 + /** 28 + * struct xe_gt_sriov_pf_migration - GT-level data. 29 + * 30 + * Used by the PF driver to maintain non-VF specific per-GT data. 31 + */ 32 + struct xe_gt_sriov_pf_migration { 33 + /** @supported: indicates whether the feature is supported */ 34 + bool supported; 35 + 36 + /** @snapshot_lock: protects all VFs snapshots */ 37 + struct mutex snapshot_lock; 38 + }; 39 + 40 + #endif
+3 -3
drivers/gpu/drm/xe/xe_gt_sriov_pf_service.c
··· 237 237 const struct xe_reg *regs, u32 *values) 238 238 { 239 239 while (count--) 240 - *values++ = xe_mmio_read32(gt, *regs++); 240 + *values++ = xe_mmio_read32(&gt->mmio, *regs++); 241 241 } 242 242 243 243 static void pf_prepare_runtime_info(struct xe_gt *gt) ··· 402 402 403 403 for (i = 0; i < count; ++i, ++data) { 404 404 addr = runtime->regs[start + i].addr; 405 - data->offset = xe_mmio_adjusted_addr(gt, addr); 405 + data->offset = xe_mmio_adjusted_addr(&gt->mmio, addr); 406 406 data->value = runtime->values[start + i]; 407 407 } 408 408 ··· 513 513 514 514 for (; size--; regs++, values++) { 515 515 drm_printf(p, "reg[%#x] = %#x\n", 516 - xe_mmio_adjusted_addr(gt, regs->addr), *values); 516 + xe_mmio_adjusted_addr(&gt->mmio, regs->addr), *values); 517 517 } 518 518 519 519 return 0;
+6
drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
··· 10 10 11 11 #include "xe_gt_sriov_pf_config_types.h" 12 12 #include "xe_gt_sriov_pf_control_types.h" 13 + #include "xe_gt_sriov_pf_migration_types.h" 13 14 #include "xe_gt_sriov_pf_monitor_types.h" 14 15 #include "xe_gt_sriov_pf_policy_types.h" 15 16 #include "xe_gt_sriov_pf_service_types.h" ··· 30 29 31 30 /** @version: negotiated VF/PF ABI version */ 32 31 struct xe_gt_sriov_pf_service_version version; 32 + 33 + /** @snapshot: snapshot of the VF state data */ 34 + struct xe_gt_sriov_state_snapshot snapshot; 33 35 }; 34 36 35 37 /** ··· 40 36 * @service: service data. 41 37 * @control: control data. 42 38 * @policy: policy data. 39 + * @migration: migration data. 43 40 * @spare: PF-only provisioning configuration. 44 41 * @vfs: metadata for all VFs. 45 42 */ ··· 48 43 struct xe_gt_sriov_pf_service service; 49 44 struct xe_gt_sriov_pf_control control; 50 45 struct xe_gt_sriov_pf_policy policy; 46 + struct xe_gt_sriov_pf_migration migration; 51 47 struct xe_gt_sriov_spare_config spare; 52 48 struct xe_gt_sriov_metadata *vfs; 53 49 };
+2 -2
drivers/gpu/drm/xe/xe_gt_sriov_vf.c
··· 881 881 */ 882 882 u32 xe_gt_sriov_vf_read32(struct xe_gt *gt, struct xe_reg reg) 883 883 { 884 - u32 addr = xe_mmio_adjusted_addr(gt, reg.addr); 884 + u32 addr = xe_mmio_adjusted_addr(&gt->mmio, reg.addr); 885 885 struct vf_runtime_reg *rr; 886 886 887 887 xe_gt_assert(gt, IS_SRIOV_VF(gt_to_xe(gt))); ··· 917 917 */ 918 918 void xe_gt_sriov_vf_write32(struct xe_gt *gt, struct xe_reg reg, u32 val) 919 919 { 920 - u32 addr = xe_mmio_adjusted_addr(gt, reg.addr); 920 + u32 addr = xe_mmio_adjusted_addr(&gt->mmio, reg.addr); 921 921 922 922 xe_gt_assert(gt, IS_SRIOV_VF(gt_to_xe(gt))); 923 923 xe_gt_assert(gt, !reg.vf);
+1 -1
drivers/gpu/drm/xe/xe_gt_sriov_vf_debugfs.c
··· 33 33 .show = xe_gt_debugfs_simple_show, 34 34 .data = xe_gt_sriov_vf_print_version, 35 35 }, 36 - #if defined(CONFIG_DRM_XE_DEBUG) || defined(CONFIG_DRM_XE_DEBUG_SRIOV) 36 + #if IS_ENABLED(CONFIG_DRM_XE_DEBUG) || IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV) 37 37 { 38 38 "runtime_regs", 39 39 .show = xe_gt_debugfs_simple_show,
+2 -2
drivers/gpu/drm/xe/xe_gt_throttle.c
··· 41 41 42 42 xe_pm_runtime_get(gt_to_xe(gt)); 43 43 if (xe_gt_is_media_type(gt)) 44 - reg = xe_mmio_read32(gt, MTL_MEDIA_PERF_LIMIT_REASONS); 44 + reg = xe_mmio_read32(&gt->mmio, MTL_MEDIA_PERF_LIMIT_REASONS); 45 45 else 46 - reg = xe_mmio_read32(gt, GT0_PERF_LIMIT_REASONS); 46 + reg = xe_mmio_read32(&gt->mmio, GT0_PERF_LIMIT_REASONS); 47 47 xe_pm_runtime_put(gt_to_xe(gt)); 48 48 49 49 return reg;
+18 -19
drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
··· 37 37 return hw_tlb_timeout + 2 * delay; 38 38 } 39 39 40 + static void xe_gt_tlb_invalidation_fence_fini(struct xe_gt_tlb_invalidation_fence *fence) 41 + { 42 + if (WARN_ON_ONCE(!fence->gt)) 43 + return; 44 + 45 + xe_pm_runtime_put(gt_to_xe(fence->gt)); 46 + fence->gt = NULL; /* fini() should be called once */ 47 + } 48 + 40 49 static void 41 50 __invalidation_fence_signal(struct xe_device *xe, struct xe_gt_tlb_invalidation_fence *fence) 42 51 { ··· 213 204 tlb_timeout_jiffies(gt)); 214 205 } 215 206 spin_unlock_irq(&gt->tlb_invalidation.pending_lock); 216 - } else if (ret < 0) { 207 + } else { 217 208 __invalidation_fence_signal(xe, fence); 218 209 } 219 210 if (!ret) { ··· 276 267 277 268 xe_gt_tlb_invalidation_fence_init(gt, &fence, true); 278 269 ret = xe_gt_tlb_invalidation_guc(gt, &fence); 279 - if (ret < 0) { 280 - xe_gt_tlb_invalidation_fence_fini(&fence); 270 + if (ret) 281 271 return ret; 282 - } 283 272 284 273 xe_gt_tlb_invalidation_fence_wait(&fence); 285 274 } else if (xe_device_uc_enabled(xe) && !xe_device_wedged(xe)) { 275 + struct xe_mmio *mmio = &gt->mmio; 276 + 286 277 if (IS_SRIOV_VF(xe)) 287 278 return 0; 288 279 289 280 xe_gt_WARN_ON(gt, xe_force_wake_get(gt_to_fw(gt), XE_FW_GT)); 290 281 if (xe->info.platform == XE_PVC || GRAPHICS_VER(xe) >= 20) { 291 - xe_mmio_write32(gt, PVC_GUC_TLB_INV_DESC1, 282 + xe_mmio_write32(mmio, PVC_GUC_TLB_INV_DESC1, 292 283 PVC_GUC_TLB_INV_DESC1_INVALIDATE); 293 - xe_mmio_write32(gt, PVC_GUC_TLB_INV_DESC0, 284 + xe_mmio_write32(mmio, PVC_GUC_TLB_INV_DESC0, 294 285 PVC_GUC_TLB_INV_DESC0_VALID); 295 286 } else { 296 - xe_mmio_write32(gt, GUC_TLB_INV_CR, 287 + xe_mmio_write32(mmio, GUC_TLB_INV_CR, 297 288 GUC_TLB_INV_CR_INVALIDATE); 298 289 } 299 290 xe_force_wake_put(gt_to_fw(gt), XE_FW_GT); ··· 505 496 * @stack: fence is stack variable 506 497 * 507 498 * Initialize TLB invalidation fence for use. xe_gt_tlb_invalidation_fence_fini 508 - * must be called if fence is not signaled. 
499 + * will be automatically called when fence is signalled (all fences must signal), 500 + * even on error. 509 501 */ 510 502 void xe_gt_tlb_invalidation_fence_init(struct xe_gt *gt, 511 503 struct xe_gt_tlb_invalidation_fence *fence, ··· 525 515 else 526 516 dma_fence_get(&fence->base); 527 517 fence->gt = gt; 528 - } 529 - 530 - /** 531 - * xe_gt_tlb_invalidation_fence_fini - Finalize TLB invalidation fence 532 - * @fence: TLB invalidation fence to finalize 533 - * 534 - * Drop PM ref which fence took durinig init. 535 - */ 536 - void xe_gt_tlb_invalidation_fence_fini(struct xe_gt_tlb_invalidation_fence *fence) 537 - { 538 - xe_pm_runtime_put(gt_to_xe(fence->gt)); 539 518 }
-1
drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h
··· 28 28 void xe_gt_tlb_invalidation_fence_init(struct xe_gt *gt, 29 29 struct xe_gt_tlb_invalidation_fence *fence, 30 30 bool stack); 31 - void xe_gt_tlb_invalidation_fence_fini(struct xe_gt_tlb_invalidation_fence *fence); 32 31 33 32 static inline void 34 33 xe_gt_tlb_invalidation_fence_wait(struct xe_gt_tlb_invalidation_fence *fence)
+18 -4
drivers/gpu/drm/xe/xe_gt_topology.c
··· 5 5 6 6 #include "xe_gt_topology.h" 7 7 8 + #include <generated/xe_wa_oob.h> 8 9 #include <linux/bitmap.h> 9 10 #include <linux/compiler.h> 10 11 ··· 13 12 #include "xe_assert.h" 14 13 #include "xe_gt.h" 15 14 #include "xe_mmio.h" 15 + #include "xe_wa.h" 16 16 17 17 static void 18 18 load_dss_mask(struct xe_gt *gt, xe_dss_mask_t mask, int numregs, ...) ··· 27 25 28 26 va_start(argp, numregs); 29 27 for (i = 0; i < numregs; i++) 30 - fuse_val[i] = xe_mmio_read32(gt, va_arg(argp, struct xe_reg)); 28 + fuse_val[i] = xe_mmio_read32(&gt->mmio, va_arg(argp, struct xe_reg)); 31 29 va_end(argp); 32 30 33 31 bitmap_from_arr32(mask, fuse_val, numregs * 32); ··· 37 35 load_eu_mask(struct xe_gt *gt, xe_eu_mask_t mask, enum xe_gt_eu_type *eu_type) 38 36 { 39 37 struct xe_device *xe = gt_to_xe(gt); 40 - u32 reg_val = xe_mmio_read32(gt, XELP_EU_ENABLE); 38 + u32 reg_val = xe_mmio_read32(&gt->mmio, XELP_EU_ENABLE); 41 39 u32 val = 0; 42 40 int i; 43 41 ··· 129 127 load_l3_bank_mask(struct xe_gt *gt, xe_l3_bank_mask_t l3_bank_mask) 130 128 { 131 129 struct xe_device *xe = gt_to_xe(gt); 132 - u32 fuse3 = xe_mmio_read32(gt, MIRROR_FUSE3); 130 + u32 fuse3 = xe_mmio_read32(&gt->mmio, MIRROR_FUSE3); 131 + 132 + /* 133 + * PTL platforms with media version 30.00 do not provide proper values 134 + * for the media GT's L3 bank registers. Skip the readout since we 135 + * don't have any way to obtain real values. 136 + * 137 + * This may get re-described as an official workaround in the future, 138 + * but there's no tracking number assigned yet so we use a custom 139 + * OOB workaround descriptor. 
140 + */ 141 + if (XE_WA(gt, no_media_l3)) 142 + return; 133 143 134 144 if (GRAPHICS_VER(xe) >= 20) { 135 145 xe_l3_bank_mask_t per_node = {}; ··· 155 141 xe_l3_bank_mask_t per_node = {}; 156 142 xe_l3_bank_mask_t per_mask_bit = {}; 157 143 u32 meml3_en = REG_FIELD_GET(MEML3_EN_MASK, fuse3); 158 - u32 fuse4 = xe_mmio_read32(gt, XEHP_FUSE4); 144 + u32 fuse4 = xe_mmio_read32(&gt->mmio, XEHP_FUSE4); 159 145 u32 bank_val = REG_FIELD_GET(GT_L3_EXC_MASK, fuse4); 160 146 161 147 bitmap_set_value8(per_mask_bit, 0x3, 0);
+12 -10
drivers/gpu/drm/xe/xe_gt_types.h
··· 6 6 #ifndef _XE_GT_TYPES_H_ 7 7 #define _XE_GT_TYPES_H_ 8 8 9 + #include "xe_device_types.h" 9 10 #include "xe_force_wake_types.h" 10 11 #include "xe_gt_idle_types.h" 11 12 #include "xe_gt_sriov_pf_types.h" ··· 146 145 /** 147 146 * @mmio: mmio info for GT. All GTs within a tile share the same 148 147 * register space, but have their own copy of GSI registers at a 149 - * specific offset, as well as their own forcewake handling. 148 + * specific offset. 149 + */ 150 + struct xe_mmio mmio; 151 + 152 + /** 153 + * @pm: power management info for GT. The driver uses the GT's 154 + * "force wake" interface to wake up specific parts of the GT hardware 155 + * from C6 sleep states and ensure the hardware remains awake while it 156 + * is being actively used. 150 157 */ 151 158 struct { 152 - /** @mmio.fw: force wake for GT */ 159 + /** @pm.fw: force wake for GT */ 153 160 struct xe_force_wake fw; 154 - /** 155 - * @mmio.adj_limit: adjust MMIO address if address is below this 156 - * value 157 - */ 158 - u32 adj_limit; 159 - /** @mmio.adj_offset: offect to add to MMIO address when adjusting */ 160 - u32 adj_offset; 161 - } mmio; 161 + } pm; 162 162 163 163 /** @sriov: virtualization data related to GT */ 164 164 union {
+41 -31
drivers/gpu/drm/xe/xe_guc.c
··· 14 14 #include "regs/xe_gt_regs.h" 15 15 #include "regs/xe_gtt_defs.h" 16 16 #include "regs/xe_guc_regs.h" 17 + #include "regs/xe_irq_regs.h" 17 18 #include "xe_bo.h" 18 19 #include "xe_device.h" 19 20 #include "xe_force_wake.h" ··· 23 22 #include "xe_gt_sriov_vf.h" 24 23 #include "xe_gt_throttle.h" 25 24 #include "xe_guc_ads.h" 25 + #include "xe_guc_capture.h" 26 26 #include "xe_guc_ct.h" 27 27 #include "xe_guc_db_mgr.h" 28 28 #include "xe_guc_hwconfig.h" ··· 238 236 239 237 xe_force_wake_assert_held(gt_to_fw(gt), XE_FW_GT); 240 238 241 - xe_mmio_write32(gt, SOFT_SCRATCH(0), 0); 239 + xe_mmio_write32(&gt->mmio, SOFT_SCRATCH(0), 0); 242 240 243 241 for (i = 0; i < GUC_CTL_MAX_DWORDS; i++) 244 - xe_mmio_write32(gt, SOFT_SCRATCH(1 + i), guc->params[i]); 242 + xe_mmio_write32(&gt->mmio, SOFT_SCRATCH(1 + i), guc->params[i]); 245 243 } 246 244 247 245 static void guc_fini_hw(void *arg) ··· 340 338 if (ret) 341 339 goto out; 342 340 341 + ret = xe_guc_capture_init(guc); 342 + if (ret) 343 + goto out; 344 + 343 345 ret = xe_guc_ads_init(&guc->ads); 344 346 if (ret) 345 347 goto out; ··· 431 425 int xe_guc_reset(struct xe_guc *guc) 432 426 { 433 427 struct xe_gt *gt = guc_to_gt(guc); 428 + struct xe_mmio *mmio = &gt->mmio; 434 429 u32 guc_status, gdrst; 435 430 int ret; 436 431 ··· 440 433 if (IS_SRIOV_VF(gt_to_xe(gt))) 441 434 return xe_gt_sriov_vf_bootstrap(gt); 442 435 443 - xe_mmio_write32(gt, GDRST, GRDOM_GUC); 436 + xe_mmio_write32(mmio, GDRST, GRDOM_GUC); 444 437 445 - ret = xe_mmio_wait32(gt, GDRST, GRDOM_GUC, 0, 5000, &gdrst, false); 438 + ret = xe_mmio_wait32(mmio, GDRST, GRDOM_GUC, 0, 5000, &gdrst, false); 446 439 if (ret) { 447 440 xe_gt_err(gt, "GuC reset timed out, GDRST=%#x\n", gdrst); 448 441 goto err_out; 449 442 } 450 443 451 - guc_status = xe_mmio_read32(gt, GUC_STATUS); 444 + guc_status = xe_mmio_read32(mmio, GUC_STATUS); 452 445 if (!(guc_status & GS_MIA_IN_RESET)) { 453 446 xe_gt_err(gt, "GuC status: %#x, MIA core expected to be in reset\n", 454 
447 guc_status); ··· 466 459 static void guc_prepare_xfer(struct xe_guc *guc) 467 460 { 468 461 struct xe_gt *gt = guc_to_gt(guc); 462 + struct xe_mmio *mmio = &gt->mmio; 469 463 struct xe_device *xe = guc_to_xe(guc); 470 464 u32 shim_flags = GUC_ENABLE_READ_CACHE_LOGIC | 471 465 GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA | ··· 481 473 shim_flags |= REG_FIELD_PREP(GUC_MOCS_INDEX_MASK, gt->mocs.uc_index); 482 474 483 475 /* Must program this register before loading the ucode with DMA */ 484 - xe_mmio_write32(gt, GUC_SHIM_CONTROL, shim_flags); 476 + xe_mmio_write32(mmio, GUC_SHIM_CONTROL, shim_flags); 485 477 486 - xe_mmio_write32(gt, GT_PM_CONFIG, GT_DOORBELL_ENABLE); 478 + xe_mmio_write32(mmio, GT_PM_CONFIG, GT_DOORBELL_ENABLE); 487 479 488 480 /* Make sure GuC receives ARAT interrupts */ 489 - xe_mmio_rmw32(gt, PMINTRMSK, ARAT_EXPIRED_INTRMSK, 0); 481 + xe_mmio_rmw32(mmio, PMINTRMSK, ARAT_EXPIRED_INTRMSK, 0); 490 482 } 491 483 492 484 /* ··· 502 494 if (guc->fw.rsa_size > 256) { 503 495 u32 rsa_ggtt_addr = xe_bo_ggtt_addr(guc->fw.bo) + 504 496 xe_uc_fw_rsa_offset(&guc->fw); 505 - xe_mmio_write32(gt, UOS_RSA_SCRATCH(0), rsa_ggtt_addr); 497 + xe_mmio_write32(&gt->mmio, UOS_RSA_SCRATCH(0), rsa_ggtt_addr); 506 498 return 0; 507 499 } 508 500 ··· 511 503 return -ENOMEM; 512 504 513 505 for (i = 0; i < UOS_RSA_SCRATCH_COUNT; i++) 514 - xe_mmio_write32(gt, UOS_RSA_SCRATCH(i), rsa[i]); 506 + xe_mmio_write32(&gt->mmio, UOS_RSA_SCRATCH(i), rsa[i]); 515 507 516 508 return 0; 517 509 } ··· 591 583 * extreme thermal throttling. And a system that is that hot during boot is probably 592 584 * dead anyway! 
593 585 */ 594 - #if defined(CONFIG_DRM_XE_DEBUG) 586 + #if IS_ENABLED(CONFIG_DRM_XE_DEBUG) 595 587 #define GUC_LOAD_RETRY_LIMIT 20 596 588 #else 597 589 #define GUC_LOAD_RETRY_LIMIT 3 ··· 601 593 static void guc_wait_ucode(struct xe_guc *guc) 602 594 { 603 595 struct xe_gt *gt = guc_to_gt(guc); 596 + struct xe_mmio *mmio = &gt->mmio; 604 597 struct xe_guc_pc *guc_pc = &gt->uc.guc.pc; 605 598 ktime_t before, after, delta; 606 599 int load_done; ··· 628 619 * timeouts rather than allowing a huge timeout each time. So basically, need 629 620 * to treat a timeout no different to a value change. 630 621 */ 631 - ret = xe_mmio_wait32_not(gt, GUC_STATUS, GS_UKERNEL_MASK | GS_BOOTROM_MASK, 622 + ret = xe_mmio_wait32_not(mmio, GUC_STATUS, GS_UKERNEL_MASK | GS_BOOTROM_MASK, 632 623 last_status, 1000 * 1000, &status, false); 633 624 if (ret < 0) 634 625 count++; ··· 666 657 switch (bootrom) { 667 658 case XE_BOOTROM_STATUS_NO_KEY_FOUND: 668 659 xe_gt_err(gt, "invalid key requested, header = 0x%08X\n", 669 - xe_mmio_read32(gt, GUC_HEADER_INFO)); 660 + xe_mmio_read32(mmio, GUC_HEADER_INFO)); 670 661 break; 671 662 672 663 case XE_BOOTROM_STATUS_RSA_FAILED: ··· 681 672 switch (ukernel) { 682 673 case XE_GUC_LOAD_STATUS_EXCEPTION: 683 674 xe_gt_err(gt, "firmware exception. 
EIP: %#x\n", 684 - xe_mmio_read32(gt, SOFT_SCRATCH(13))); 675 + xe_mmio_read32(mmio, SOFT_SCRATCH(13))); 685 676 break; 686 677 687 678 case XE_GUC_LOAD_STATUS_INIT_MMIO_SAVE_RESTORE_INVALID: ··· 833 824 834 825 xe_force_wake_assert_held(gt_to_fw(gt), XE_FW_GT); 835 826 836 - msg = xe_mmio_read32(gt, SOFT_SCRATCH(15)); 827 + msg = xe_mmio_read32(&gt->mmio, SOFT_SCRATCH(15)); 837 828 msg &= XE_GUC_RECV_MSG_EXCEPTION | 838 829 XE_GUC_RECV_MSG_CRASH_DUMP_POSTED; 839 - xe_mmio_write32(gt, SOFT_SCRATCH(15), 0); 830 + xe_mmio_write32(&gt->mmio, SOFT_SCRATCH(15), 0); 840 831 841 832 if (msg & XE_GUC_RECV_MSG_CRASH_DUMP_POSTED) 842 833 xe_gt_err(gt, "Received early GuC crash dump notification!\n"); ··· 853 844 REG_FIELD_PREP(ENGINE1_MASK, GUC_INTR_GUC2HOST); 854 845 855 846 /* Primary GuC and media GuC share a single enable bit */ 856 - xe_mmio_write32(gt, GUC_SG_INTR_ENABLE, 847 + xe_mmio_write32(&gt->mmio, GUC_SG_INTR_ENABLE, 857 848 REG_FIELD_PREP(ENGINE1_MASK, GUC_INTR_GUC2HOST)); 858 849 859 850 /* 860 851 * There are separate mask bits for primary and media GuCs, so use 861 852 * a RMW operation to avoid clobbering the other GuC's setting. 862 853 */ 863 - xe_mmio_rmw32(gt, GUC_SG_INTR_MASK, events, 0); 854 + xe_mmio_rmw32(&gt->mmio, GUC_SG_INTR_MASK, events, 0); 864 855 } 865 856 866 857 int xe_guc_enable_communication(struct xe_guc *guc) ··· 872 863 struct xe_gt *gt = guc_to_gt(guc); 873 864 struct xe_tile *tile = gt_to_tile(gt); 874 865 875 - err = xe_memirq_init_guc(&tile->sriov.vf.memirq, guc); 866 + err = xe_memirq_init_guc(&tile->memirq, guc); 876 867 if (err) 877 868 return err; 878 869 } else { ··· 916 907 * additional payload data to the GuC but this capability is not 917 908 * used by the firmware yet. Use default value in the meantime. 
918 909 */ 919 - xe_mmio_write32(gt, guc->notify_reg, default_notify_data); 910 + xe_mmio_write32(&gt->mmio, guc->notify_reg, default_notify_data); 920 911 } 921 912 922 913 int xe_guc_auth_huc(struct xe_guc *guc, u32 rsa_addr) ··· 934 925 { 935 926 struct xe_device *xe = guc_to_xe(guc); 936 927 struct xe_gt *gt = guc_to_gt(guc); 928 + struct xe_mmio *mmio = &gt->mmio; 937 929 u32 header, reply; 938 930 struct xe_reg reply_reg = xe_gt_is_media_type(gt) ? 939 931 MED_VF_SW_FLAG(0) : VF_SW_FLAG(0); ··· 957 947 /* Not in critical data-path, just do if else for GT type */ 958 948 if (xe_gt_is_media_type(gt)) { 959 949 for (i = 0; i < len; ++i) 960 - xe_mmio_write32(gt, MED_VF_SW_FLAG(i), 950 + xe_mmio_write32(mmio, MED_VF_SW_FLAG(i), 961 951 request[i]); 962 - xe_mmio_read32(gt, MED_VF_SW_FLAG(LAST_INDEX)); 952 + xe_mmio_read32(mmio, MED_VF_SW_FLAG(LAST_INDEX)); 963 953 } else { 964 954 for (i = 0; i < len; ++i) 965 - xe_mmio_write32(gt, VF_SW_FLAG(i), 955 + xe_mmio_write32(mmio, VF_SW_FLAG(i), 966 956 request[i]); 967 - xe_mmio_read32(gt, VF_SW_FLAG(LAST_INDEX)); 957 + xe_mmio_read32(mmio, VF_SW_FLAG(LAST_INDEX)); 968 958 } 969 959 970 960 xe_guc_notify(guc); 971 961 972 - ret = xe_mmio_wait32(gt, reply_reg, GUC_HXG_MSG_0_ORIGIN, 962 + ret = xe_mmio_wait32(mmio, reply_reg, GUC_HXG_MSG_0_ORIGIN, 973 963 FIELD_PREP(GUC_HXG_MSG_0_ORIGIN, GUC_HXG_ORIGIN_GUC), 974 964 50000, &reply, false); 975 965 if (ret) { ··· 979 969 return ret; 980 970 } 981 971 982 - header = xe_mmio_read32(gt, reply_reg); 972 + header = xe_mmio_read32(mmio, reply_reg); 983 973 if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) == 984 974 GUC_HXG_TYPE_NO_RESPONSE_BUSY) { 985 975 /* ··· 995 985 BUILD_BUG_ON(FIELD_MAX(GUC_HXG_MSG_0_TYPE) != GUC_HXG_TYPE_RESPONSE_SUCCESS); 996 986 BUILD_BUG_ON((GUC_HXG_TYPE_RESPONSE_SUCCESS ^ GUC_HXG_TYPE_RESPONSE_FAILURE) != 1); 997 987 998 - ret = xe_mmio_wait32(gt, reply_reg, resp_mask, resp_mask, 988 + ret = xe_mmio_wait32(mmio, reply_reg, resp_mask, resp_mask, 999 989 
1000000, &header, false); 1000 990 1001 991 if (unlikely(FIELD_GET(GUC_HXG_MSG_0_ORIGIN, header) != ··· 1042 1032 1043 1033 for (i = 1; i < VF_SW_FLAG_COUNT; i++) { 1044 1034 reply_reg.addr += sizeof(u32); 1045 - response_buf[i] = xe_mmio_read32(gt, reply_reg); 1035 + response_buf[i] = xe_mmio_read32(mmio, reply_reg); 1046 1036 } 1047 1037 } 1048 1038 ··· 1165 1155 if (err) 1166 1156 return; 1167 1157 1168 - status = xe_mmio_read32(gt, GUC_STATUS); 1158 + status = xe_mmio_read32(&gt->mmio, GUC_STATUS); 1169 1159 1170 1160 drm_printf(p, "\nGuC status 0x%08x:\n", status); 1171 1161 drm_printf(p, "\tBootrom status = 0x%x\n", ··· 1180 1170 drm_puts(p, "\nScratch registers:\n"); 1181 1171 for (i = 0; i < SOFT_SCRATCH_COUNT; i++) { 1182 1172 drm_printf(p, "\t%2d: \t0x%x\n", 1183 - i, xe_mmio_read32(gt, SOFT_SCRATCH(i))); 1173 + i, xe_mmio_read32(&gt->mmio, SOFT_SCRATCH(i))); 1184 1174 } 1185 1175 1186 1176 xe_force_wake_put(gt_to_fw(gt), XE_FW_GT); 1187 1177 1188 - xe_guc_ct_print(&guc->ct, p, false); 1178 + xe_guc_ct_print(&guc->ct, p); 1189 1179 xe_guc_submit_print(guc, p); 1190 1180 } 1191 1181
+5
drivers/gpu/drm/xe/xe_guc.h
··· 82 82 return gt_to_xe(guc_to_gt(guc)); 83 83 } 84 84 85 + static inline struct drm_device *guc_to_drm(struct xe_guc *guc) 86 + { 87 + return &guc_to_xe(guc)->drm; 88 + } 89 + 85 90 #endif
+149 -15
drivers/gpu/drm/xe/xe_guc_ads.c
··· 5 5 6 6 #include "xe_guc_ads.h" 7 7 8 + #include <linux/fault-inject.h> 9 + 8 10 #include <drm/drm_managed.h> 9 11 10 12 #include <generated/xe_wa_oob.h> ··· 20 18 #include "xe_gt_ccs_mode.h" 21 19 #include "xe_gt_printk.h" 22 20 #include "xe_guc.h" 21 + #include "xe_guc_capture.h" 23 22 #include "xe_guc_ct.h" 24 23 #include "xe_hw_engine.h" 25 24 #include "xe_lrc.h" ··· 152 149 153 150 static size_t guc_ads_capture_size(struct xe_guc_ads *ads) 154 151 { 155 - /* FIXME: Allocate a proper capture list */ 156 - return PAGE_ALIGN(PAGE_SIZE); 152 + return PAGE_ALIGN(ads->capture_size); 157 153 } 158 154 159 155 static size_t guc_ads_um_queues_size(struct xe_guc_ads *ads) ··· 406 404 struct xe_bo *bo; 407 405 408 406 ads->golden_lrc_size = calculate_golden_lrc_size(ads); 407 + ads->capture_size = xe_guc_capture_ads_input_worst_size(ads_to_guc(ads)); 409 408 ads->regset_size = calculate_regset_size(gt); 410 409 ads->ads_waklv_size = calculate_waklv_size(ads); 411 410 ··· 421 418 422 419 return 0; 423 420 } 421 + ALLOW_ERROR_INJECTION(xe_guc_ads_init, ERRNO); /* See xe_pci_probe() */ 424 422 425 423 /** 426 424 * xe_guc_ads_init_post_hwconfig - initialize ADS post hwconfig load 427 425 * @ads: Additional data structures object 428 426 * 429 - * Recalcuate golden_lrc_size & regset_size as the number hardware engines may 430 - * have changed after the hwconfig was loaded. Also verify the new sizes fit in 431 - * the already allocated ADS buffer object. 427 + * Recalculate golden_lrc_size, capture_size and regset_size as the number 428 + * hardware engines may have changed after the hwconfig was loaded. Also verify 429 + * the new sizes fit in the already allocated ADS buffer object. 432 430 * 433 431 * Return: 0 on success, negative error code on error. 
434 432 */ ··· 441 437 xe_gt_assert(gt, ads->bo); 442 438 443 439 ads->golden_lrc_size = calculate_golden_lrc_size(ads); 440 + /* Calculate Capture size with worst size */ 441 + ads->capture_size = xe_guc_capture_ads_input_worst_size(ads_to_guc(ads)); 444 442 ads->regset_size = calculate_regset_size(gt); 445 443 446 444 xe_gt_assert(gt, ads->golden_lrc_size + ··· 542 536 } 543 537 } 544 538 545 - static void guc_capture_list_init(struct xe_guc_ads *ads) 539 + static u32 guc_get_capture_engine_mask(struct xe_gt *gt, struct iosys_map *info_map, 540 + enum guc_capture_list_class_type capture_class) 546 541 { 547 - int i, j; 548 - u32 addr = xe_bo_ggtt_addr(ads->bo) + guc_ads_capture_offset(ads); 542 + struct xe_device *xe = gt_to_xe(gt); 543 + u32 mask; 549 544 550 - /* FIXME: Populate a proper capture list */ 545 + switch (capture_class) { 546 + case GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE: 547 + mask = info_map_read(xe, info_map, engine_enabled_masks[GUC_RENDER_CLASS]); 548 + mask |= info_map_read(xe, info_map, engine_enabled_masks[GUC_COMPUTE_CLASS]); 549 + break; 550 + case GUC_CAPTURE_LIST_CLASS_VIDEO: 551 + mask = info_map_read(xe, info_map, engine_enabled_masks[GUC_VIDEO_CLASS]); 552 + break; 553 + case GUC_CAPTURE_LIST_CLASS_VIDEOENHANCE: 554 + mask = info_map_read(xe, info_map, engine_enabled_masks[GUC_VIDEOENHANCE_CLASS]); 555 + break; 556 + case GUC_CAPTURE_LIST_CLASS_BLITTER: 557 + mask = info_map_read(xe, info_map, engine_enabled_masks[GUC_BLITTER_CLASS]); 558 + break; 559 + case GUC_CAPTURE_LIST_CLASS_GSC_OTHER: 560 + mask = info_map_read(xe, info_map, engine_enabled_masks[GUC_GSC_OTHER_CLASS]); 561 + break; 562 + default: 563 + mask = 0; 564 + } 565 + 566 + return mask; 567 + } 568 + 569 + static inline bool get_capture_list(struct xe_guc_ads *ads, struct xe_guc *guc, struct xe_gt *gt, 570 + int owner, int type, int class, u32 *total_size, size_t *size, 571 + void **pptr) 572 + { 573 + *size = 0; 574 + 575 + if (!xe_guc_capture_getlistsize(guc, owner, 
type, class, size)) { 576 + if (*total_size + *size > ads->capture_size) 577 + xe_gt_dbg(gt, "Capture size overflow: %zu vs %d\n", 578 + *total_size + *size, ads->capture_size); 579 + else if (!xe_guc_capture_getlist(guc, owner, type, class, pptr)) 580 + return false; 581 + } 582 + 583 + return true; 584 + } 585 + 586 + static int guc_capture_prep_lists(struct xe_guc_ads *ads) 587 + { 588 + struct xe_guc *guc = ads_to_guc(ads); 589 + struct xe_gt *gt = ads_to_gt(ads); 590 + u32 ads_ggtt, capture_offset, null_ggtt, total_size = 0; 591 + struct iosys_map info_map; 592 + size_t size = 0; 593 + void *ptr; 594 + int i, j; 595 + 596 + /* 597 + * GuC Capture's steered reg-list needs to be allocated and initialized 598 + * after the GuC-hwconfig is available, which is guaranteed from here. 599 + */ 600 + xe_guc_capture_steered_list_init(ads_to_guc(ads)); 601 + 602 + capture_offset = guc_ads_capture_offset(ads); 603 + ads_ggtt = xe_bo_ggtt_addr(ads->bo); 604 + info_map = IOSYS_MAP_INIT_OFFSET(ads_to_map(ads), 605 + offsetof(struct __guc_ads_blob, system_info)); 606 + 607 + /* first, set aside the first page for a capture_list with zero descriptors */ 608 + total_size = PAGE_SIZE; 609 + if (!xe_guc_capture_getnullheader(guc, &ptr, &size)) 610 + xe_map_memcpy_to(ads_to_xe(ads), ads_to_map(ads), capture_offset, ptr, size); 611 + 612 + null_ggtt = ads_ggtt + capture_offset; 613 + capture_offset += PAGE_SIZE; 614 + 615 + /* 616 + * Populate capture list: at this point ads is already allocated and 617 + * mapped to worst case size 618 + */ 551 619 for (i = 0; i < GUC_CAPTURE_LIST_INDEX_MAX; i++) { 552 - for (j = 0; j < GUC_MAX_ENGINE_CLASSES; j++) { 553 - ads_blob_write(ads, ads.capture_instance[i][j], addr); 554 - ads_blob_write(ads, ads.capture_class[i][j], addr); 620 + bool write_empty_list; 621 + 622 + for (j = 0; j < GUC_CAPTURE_LIST_CLASS_MAX; j++) { 623 + u32 engine_mask = guc_get_capture_engine_mask(gt, &info_map, j); 624 + /* null list if we don't have said engine or list 
*/ 625 + if (!engine_mask) { 626 + ads_blob_write(ads, ads.capture_class[i][j], null_ggtt); 627 + ads_blob_write(ads, ads.capture_instance[i][j], null_ggtt); 628 + continue; 629 + } 630 + 631 + /* engine exists: start with engine-class registers */ 632 + write_empty_list = get_capture_list(ads, guc, gt, i, 633 + GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS, 634 + j, &total_size, &size, &ptr); 635 + if (!write_empty_list) { 636 + ads_blob_write(ads, ads.capture_class[i][j], 637 + ads_ggtt + capture_offset); 638 + xe_map_memcpy_to(ads_to_xe(ads), ads_to_map(ads), capture_offset, 639 + ptr, size); 640 + total_size += size; 641 + capture_offset += size; 642 + } else { 643 + ads_blob_write(ads, ads.capture_class[i][j], null_ggtt); 644 + } 645 + 646 + /* engine exists: next, engine-instance registers */ 647 + write_empty_list = get_capture_list(ads, guc, gt, i, 648 + GUC_STATE_CAPTURE_TYPE_ENGINE_INSTANCE, 649 + j, &total_size, &size, &ptr); 650 + if (!write_empty_list) { 651 + ads_blob_write(ads, ads.capture_instance[i][j], 652 + ads_ggtt + capture_offset); 653 + xe_map_memcpy_to(ads_to_xe(ads), ads_to_map(ads), capture_offset, 654 + ptr, size); 655 + total_size += size; 656 + capture_offset += size; 657 + } else { 658 + ads_blob_write(ads, ads.capture_instance[i][j], null_ggtt); 659 + } 555 660 } 556 - 557 - ads_blob_write(ads, ads.capture_global[i], addr); 662 + /* global registers are last in our PF/VF loops */ 663 + write_empty_list = get_capture_list(ads, guc, gt, i, 664 + GUC_STATE_CAPTURE_TYPE_GLOBAL, 665 + 0, &total_size, &size, &ptr); 666 + if (!write_empty_list) { 667 + ads_blob_write(ads, ads.capture_global[i], ads_ggtt + capture_offset); 668 + xe_map_memcpy_to(ads_to_xe(ads), ads_to_map(ads), capture_offset, ptr, 669 + size); 670 + total_size += size; 671 + capture_offset += size; 672 + } else { 673 + ads_blob_write(ads, ads.capture_global[i], null_ggtt); 674 + } 558 675 } 676 + 677 + if (ads->capture_size != PAGE_ALIGN(total_size)) 678 + xe_gt_dbg(gt, "ADS capture 
alloc size changed from %d to %d\n", 679 + ads->capture_size, PAGE_ALIGN(total_size)); 680 + return PAGE_ALIGN(total_size); 559 681 } 560 682 561 683 static void guc_mmio_regset_write_one(struct xe_guc_ads *ads, ··· 818 684 819 685 if (GRAPHICS_VER(xe) >= 12 && !IS_DGFX(xe)) { 820 686 u32 distdbreg = 821 - xe_mmio_read32(gt, DIST_DBS_POPULATED); 687 + xe_mmio_read32(&gt->mmio, DIST_DBS_POPULATED); 822 688 823 689 ads_blob_write(ads, 824 690 system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_DOORBELL_COUNT_PER_SQIDI], ··· 872 738 guc_mmio_reg_state_init(ads); 873 739 guc_prep_golden_lrc_null(ads); 874 740 guc_mapping_table_init(gt, &info_map); 875 - guc_capture_list_init(ads); 741 + guc_capture_prep_lists(ads); 876 742 guc_doorbell_init(ads); 877 743 guc_waklv_init(ads); 878 744
+2
drivers/gpu/drm/xe/xe_guc_ads_types.h
··· 22 22 u32 regset_size; 23 23 /** @ads_waklv_size: total waklv size supported by platform */ 24 24 u32 ads_waklv_size; 25 + /** @capture_size: size of register set passed to GuC for capture */ 26 + u32 capture_size; 25 27 }; 26 28 27 29 #endif
+1972
drivers/gpu/drm/xe/xe_guc_capture.c
··· 1 + // SPDX-License-Identifier: MIT 2 + /* 3 + * Copyright © 2021-2024 Intel Corporation 4 + */ 5 + 6 + #include <linux/types.h> 7 + 8 + #include <drm/drm_managed.h> 9 + #include <drm/drm_print.h> 10 + 11 + #include "abi/guc_actions_abi.h" 12 + #include "abi/guc_capture_abi.h" 13 + #include "abi/guc_log_abi.h" 14 + #include "regs/xe_engine_regs.h" 15 + #include "regs/xe_gt_regs.h" 16 + #include "regs/xe_guc_regs.h" 17 + #include "regs/xe_regs.h" 18 + 19 + #include "xe_bo.h" 20 + #include "xe_device.h" 21 + #include "xe_exec_queue_types.h" 22 + #include "xe_gt.h" 23 + #include "xe_gt_mcr.h" 24 + #include "xe_gt_printk.h" 25 + #include "xe_guc.h" 26 + #include "xe_guc_ads.h" 27 + #include "xe_guc_capture.h" 28 + #include "xe_guc_capture_types.h" 29 + #include "xe_guc_ct.h" 30 + #include "xe_guc_exec_queue_types.h" 31 + #include "xe_guc_log.h" 32 + #include "xe_guc_submit_types.h" 33 + #include "xe_guc_submit.h" 34 + #include "xe_hw_engine_types.h" 35 + #include "xe_hw_engine.h" 36 + #include "xe_lrc.h" 37 + #include "xe_macros.h" 38 + #include "xe_map.h" 39 + #include "xe_mmio.h" 40 + #include "xe_sched_job.h" 41 + 42 + /* 43 + * struct __guc_capture_bufstate 44 + * 45 + * Book-keeping structure used to track read and write pointers 46 + * as we extract error capture data from the GuC-log-buffer's 47 + * error-capture region as a stream of dwords. 48 + */ 49 + struct __guc_capture_bufstate { 50 + u32 size; 51 + u32 data_offset; 52 + u32 rd; 53 + u32 wr; 54 + }; 55 + 56 + /* 57 + * struct __guc_capture_parsed_output - extracted error capture node 58 + * 59 + * A single unit of extracted error-capture output data grouped together 60 + * at an engine-instance level. We keep these nodes in a linked list. 61 + * See cachelist and outlist below. 62 + */ 63 + struct __guc_capture_parsed_output { 64 + /* 65 + * A single set of 3 capture lists: a global-list 66 + * an engine-class-list and an engine-instance list. 
67 + * outlist in __guc_capture_parsed_output will keep 68 + * a linked list of these nodes that will eventually 69 + * be detached from outlist and attached into 70 + * xe_codedump in response to a context reset 71 + */ 72 + struct list_head link; 73 + bool is_partial; 74 + u32 eng_class; 75 + u32 eng_inst; 76 + u32 guc_id; 77 + u32 lrca; 78 + u32 type; 79 + bool locked; 80 + enum xe_hw_engine_snapshot_source_id source; 81 + struct gcap_reg_list_info { 82 + u32 vfid; 83 + u32 num_regs; 84 + struct guc_mmio_reg *regs; 85 + } reginfo[GUC_STATE_CAPTURE_TYPE_MAX]; 86 + #define GCAP_PARSED_REGLIST_INDEX_GLOBAL BIT(GUC_STATE_CAPTURE_TYPE_GLOBAL) 87 + #define GCAP_PARSED_REGLIST_INDEX_ENGCLASS BIT(GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS) 88 + }; 89 + 90 + /* 91 + * Define all device tables of GuC error capture register lists 92 + * NOTE: 93 + * For engine-registers, GuC only needs the register offsets 94 + * from the engine-mmio-base 95 + * 96 + * 64 bit registers need 2 entries for low 32 bit register and high 32 bit 97 + * register, for example: 98 + * Register data_type flags mask Register name 99 + * { XXX_REG_LO(0), REG_64BIT_LOW_DW, 0, 0, NULL}, 100 + * { XXX_REG_HI(0), REG_64BIT_HI_DW, 0, 0, "XXX_REG"}, 101 + * 1. data_type: Indicates hi/low 32 bit for a 64 bit register 102 + * A 64 bit register define requires 2 consecutive entries, 103 + * with low dword first and hi dword the second. 104 + * 2. 
Register name: null for incomplete define 105 + */ 106 + #define COMMON_XELP_BASE_GLOBAL \ 107 + { FORCEWAKE_GT, REG_32BIT, 0, 0, "FORCEWAKE_GT"} 108 + 109 + #define COMMON_BASE_ENGINE_INSTANCE \ 110 + { RING_HWSTAM(0), REG_32BIT, 0, 0, "HWSTAM"}, \ 111 + { RING_HWS_PGA(0), REG_32BIT, 0, 0, "RING_HWS_PGA"}, \ 112 + { RING_HEAD(0), REG_32BIT, 0, 0, "RING_HEAD"}, \ 113 + { RING_TAIL(0), REG_32BIT, 0, 0, "RING_TAIL"}, \ 114 + { RING_CTL(0), REG_32BIT, 0, 0, "RING_CTL"}, \ 115 + { RING_MI_MODE(0), REG_32BIT, 0, 0, "RING_MI_MODE"}, \ 116 + { RING_MODE(0), REG_32BIT, 0, 0, "RING_MODE"}, \ 117 + { RING_ESR(0), REG_32BIT, 0, 0, "RING_ESR"}, \ 118 + { RING_EMR(0), REG_32BIT, 0, 0, "RING_EMR"}, \ 119 + { RING_EIR(0), REG_32BIT, 0, 0, "RING_EIR"}, \ 120 + { RING_IMR(0), REG_32BIT, 0, 0, "RING_IMR"}, \ 121 + { RING_IPEHR(0), REG_32BIT, 0, 0, "IPEHR"}, \ 122 + { RING_INSTDONE(0), REG_32BIT, 0, 0, "RING_INSTDONE"}, \ 123 + { INDIRECT_RING_STATE(0), REG_32BIT, 0, 0, "INDIRECT_RING_STATE"}, \ 124 + { RING_ACTHD(0), REG_64BIT_LOW_DW, 0, 0, NULL}, \ 125 + { RING_ACTHD_UDW(0), REG_64BIT_HI_DW, 0, 0, "ACTHD"}, \ 126 + { RING_BBADDR(0), REG_64BIT_LOW_DW, 0, 0, NULL}, \ 127 + { RING_BBADDR_UDW(0), REG_64BIT_HI_DW, 0, 0, "RING_BBADDR"}, \ 128 + { RING_START(0), REG_64BIT_LOW_DW, 0, 0, NULL}, \ 129 + { RING_START_UDW(0), REG_64BIT_HI_DW, 0, 0, "RING_START"}, \ 130 + { RING_DMA_FADD(0), REG_64BIT_LOW_DW, 0, 0, NULL}, \ 131 + { RING_DMA_FADD_UDW(0), REG_64BIT_HI_DW, 0, 0, "RING_DMA_FADD"}, \ 132 + { RING_EXECLIST_STATUS_LO(0), REG_64BIT_LOW_DW, 0, 0, NULL}, \ 133 + { RING_EXECLIST_STATUS_HI(0), REG_64BIT_HI_DW, 0, 0, "RING_EXECLIST_STATUS"}, \ 134 + { RING_EXECLIST_SQ_CONTENTS_LO(0), REG_64BIT_LOW_DW, 0, 0, NULL}, \ 135 + { RING_EXECLIST_SQ_CONTENTS_HI(0), REG_64BIT_HI_DW, 0, 0, "RING_EXECLIST_SQ_CONTENTS"} 136 + 137 + #define COMMON_XELP_RC_CLASS \ 138 + { RCU_MODE, REG_32BIT, 0, 0, "RCU_MODE"} 139 + 140 + #define COMMON_XELP_RC_CLASS_INSTDONE \ 141 + { SC_INSTDONE, REG_32BIT, 0, 0, 
"SC_INSTDONE"}, \ 142 + { SC_INSTDONE_EXTRA, REG_32BIT, 0, 0, "SC_INSTDONE_EXTRA"}, \ 143 + { SC_INSTDONE_EXTRA2, REG_32BIT, 0, 0, "SC_INSTDONE_EXTRA2"} 144 + 145 + #define XELP_VEC_CLASS_REGS \ 146 + { SFC_DONE(0), 0, 0, 0, "SFC_DONE[0]"}, \ 147 + { SFC_DONE(1), 0, 0, 0, "SFC_DONE[1]"}, \ 148 + { SFC_DONE(2), 0, 0, 0, "SFC_DONE[2]"}, \ 149 + { SFC_DONE(3), 0, 0, 0, "SFC_DONE[3]"} 150 + 151 + /* XE_LP Global */ 152 + static const struct __guc_mmio_reg_descr xe_lp_global_regs[] = { 153 + COMMON_XELP_BASE_GLOBAL, 154 + }; 155 + 156 + /* Render / Compute Per-Engine-Instance */ 157 + static const struct __guc_mmio_reg_descr xe_rc_inst_regs[] = { 158 + COMMON_BASE_ENGINE_INSTANCE, 159 + }; 160 + 161 + /* Render / Compute Engine-Class */ 162 + static const struct __guc_mmio_reg_descr xe_rc_class_regs[] = { 163 + COMMON_XELP_RC_CLASS, 164 + COMMON_XELP_RC_CLASS_INSTDONE, 165 + }; 166 + 167 + /* Render / Compute Engine-Class for xehpg */ 168 + static const struct __guc_mmio_reg_descr xe_hpg_rc_class_regs[] = { 169 + COMMON_XELP_RC_CLASS, 170 + }; 171 + 172 + /* Media Decode/Encode Per-Engine-Instance */ 173 + static const struct __guc_mmio_reg_descr xe_vd_inst_regs[] = { 174 + COMMON_BASE_ENGINE_INSTANCE, 175 + }; 176 + 177 + /* Video Enhancement Engine-Class */ 178 + static const struct __guc_mmio_reg_descr xe_vec_class_regs[] = { 179 + XELP_VEC_CLASS_REGS, 180 + }; 181 + 182 + /* Video Enhancement Per-Engine-Instance */ 183 + static const struct __guc_mmio_reg_descr xe_vec_inst_regs[] = { 184 + COMMON_BASE_ENGINE_INSTANCE, 185 + }; 186 + 187 + /* Blitter Per-Engine-Instance */ 188 + static const struct __guc_mmio_reg_descr xe_blt_inst_regs[] = { 189 + COMMON_BASE_ENGINE_INSTANCE, 190 + }; 191 + 192 + /* XE_LP - GSC Per-Engine-Instance */ 193 + static const struct __guc_mmio_reg_descr xe_lp_gsc_inst_regs[] = { 194 + COMMON_BASE_ENGINE_INSTANCE, 195 + }; 196 + 197 + /* 198 + * Empty list to prevent warnings about unknown class/instance types 199 + * as not all 
class/instance types have entries on all platforms. 200 + */ 201 + static const struct __guc_mmio_reg_descr empty_regs_list[] = { 202 + }; 203 + 204 + #define TO_GCAP_DEF_OWNER(x) (GUC_CAPTURE_LIST_INDEX_##x) 205 + #define TO_GCAP_DEF_TYPE(x) (GUC_STATE_CAPTURE_TYPE_##x) 206 + #define MAKE_REGLIST(regslist, regsowner, regstype, class) \ 207 + { \ 208 + regslist, \ 209 + ARRAY_SIZE(regslist), \ 210 + TO_GCAP_DEF_OWNER(regsowner), \ 211 + TO_GCAP_DEF_TYPE(regstype), \ 212 + class \ 213 + } 214 + 215 + /* List of lists for legacy graphic product version < 1255 */ 216 + static const struct __guc_mmio_reg_descr_group xe_lp_lists[] = { 217 + MAKE_REGLIST(xe_lp_global_regs, PF, GLOBAL, 0), 218 + MAKE_REGLIST(xe_rc_class_regs, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE), 219 + MAKE_REGLIST(xe_rc_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE), 220 + MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_VIDEO), 221 + MAKE_REGLIST(xe_vd_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_VIDEO), 222 + MAKE_REGLIST(xe_vec_class_regs, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_VIDEOENHANCE), 223 + MAKE_REGLIST(xe_vec_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_VIDEOENHANCE), 224 + MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_BLITTER), 225 + MAKE_REGLIST(xe_blt_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_BLITTER), 226 + MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_GSC_OTHER), 227 + MAKE_REGLIST(xe_lp_gsc_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_GSC_OTHER), 228 + {} 229 + }; 230 + 231 + /* List of lists for graphic product version >= 1255 */ 232 + static const struct __guc_mmio_reg_descr_group xe_hpg_lists[] = { 233 + MAKE_REGLIST(xe_lp_global_regs, PF, GLOBAL, 0), 234 + MAKE_REGLIST(xe_hpg_rc_class_regs, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE), 235 + MAKE_REGLIST(xe_rc_inst_regs, PF, ENGINE_INSTANCE, 
GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE), 236 + MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_VIDEO), 237 + MAKE_REGLIST(xe_vd_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_VIDEO), 238 + MAKE_REGLIST(xe_vec_class_regs, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_VIDEOENHANCE), 239 + MAKE_REGLIST(xe_vec_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_VIDEOENHANCE), 240 + MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_BLITTER), 241 + MAKE_REGLIST(xe_blt_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_BLITTER), 242 + MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_GSC_OTHER), 243 + MAKE_REGLIST(xe_lp_gsc_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_GSC_OTHER), 244 + {} 245 + }; 246 + 247 + static const char * const capture_list_type_names[] = { 248 + "Global", 249 + "Class", 250 + "Instance", 251 + }; 252 + 253 + static const char * const capture_engine_class_names[] = { 254 + "Render/Compute", 255 + "Video", 256 + "VideoEnhance", 257 + "Blitter", 258 + "GSC-Other", 259 + }; 260 + 261 + struct __guc_capture_ads_cache { 262 + bool is_valid; 263 + void *ptr; 264 + size_t size; 265 + int status; 266 + }; 267 + 268 + struct xe_guc_state_capture { 269 + const struct __guc_mmio_reg_descr_group *reglists; 270 + /** 271 + * NOTE: steered registers have multiple instances depending on the HW configuration 272 + * (slices or dual-sub-slices) and thus depends on HW fuses discovered 273 + */ 274 + struct __guc_mmio_reg_descr_group *extlists; 275 + struct __guc_capture_ads_cache ads_cache[GUC_CAPTURE_LIST_INDEX_MAX] 276 + [GUC_STATE_CAPTURE_TYPE_MAX] 277 + [GUC_CAPTURE_LIST_CLASS_MAX]; 278 + void *ads_null_cache; 279 + struct list_head cachelist; 280 + #define PREALLOC_NODES_MAX_COUNT (3 * GUC_MAX_ENGINE_CLASSES * GUC_MAX_INSTANCES_PER_CLASS) 281 + #define PREALLOC_NODES_DEFAULT_NUMREGS 64 282 + 283 + int max_mmio_per_node; 284 + struct list_head outlist; 285 + }; 286 + 287 + static 
void 288 + guc_capture_remove_stale_matches_from_list(struct xe_guc_state_capture *gc, 289 + struct __guc_capture_parsed_output *node); 290 + 291 + static const struct __guc_mmio_reg_descr_group * 292 + guc_capture_get_device_reglist(struct xe_device *xe) 293 + { 294 + if (GRAPHICS_VERx100(xe) >= 1255) 295 + return xe_hpg_lists; 296 + else 297 + return xe_lp_lists; 298 + } 299 + 300 + static const struct __guc_mmio_reg_descr_group * 301 + guc_capture_get_one_list(const struct __guc_mmio_reg_descr_group *reglists, 302 + u32 owner, u32 type, enum guc_capture_list_class_type capture_class) 303 + { 304 + int i; 305 + 306 + if (!reglists) 307 + return NULL; 308 + 309 + for (i = 0; reglists[i].list; ++i) { 310 + if (reglists[i].owner == owner && reglists[i].type == type && 311 + (reglists[i].engine == capture_class || 312 + reglists[i].type == GUC_STATE_CAPTURE_TYPE_GLOBAL)) 313 + return &reglists[i]; 314 + } 315 + 316 + return NULL; 317 + } 318 + 319 + const struct __guc_mmio_reg_descr_group * 320 + xe_guc_capture_get_reg_desc_list(struct xe_gt *gt, u32 owner, u32 type, 321 + enum guc_capture_list_class_type capture_class, bool is_ext) 322 + { 323 + const struct __guc_mmio_reg_descr_group *reglists; 324 + 325 + if (is_ext) { 326 + struct xe_guc *guc = &gt->uc.guc; 327 + 328 + reglists = guc->capture->extlists; 329 + } else { 330 + reglists = guc_capture_get_device_reglist(gt_to_xe(gt)); 331 + } 332 + return guc_capture_get_one_list(reglists, owner, type, capture_class); 333 + } 334 + 335 + struct __ext_steer_reg { 336 + const char *name; 337 + struct xe_reg_mcr reg; 338 + }; 339 + 340 + static const struct __ext_steer_reg xe_extregs[] = { 341 + {"SAMPLER_INSTDONE", SAMPLER_INSTDONE}, 342 + {"ROW_INSTDONE", ROW_INSTDONE} 343 + }; 344 + 345 + static const struct __ext_steer_reg xehpg_extregs[] = { 346 + {"SC_INSTDONE", XEHPG_SC_INSTDONE}, 347 + {"SC_INSTDONE_EXTRA", XEHPG_SC_INSTDONE_EXTRA}, 348 + {"SC_INSTDONE_EXTRA2", XEHPG_SC_INSTDONE_EXTRA2}, 349 + 
{"INSTDONE_GEOM_SVGUNIT", XEHPG_INSTDONE_GEOM_SVGUNIT} 350 + }; 351 + 352 + static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext, 353 + const struct __ext_steer_reg *extlist, 354 + int slice_id, int subslice_id) 355 + { 356 + if (!ext || !extlist) 357 + return; 358 + 359 + ext->reg = XE_REG(extlist->reg.__reg.addr); 360 + ext->flags = FIELD_PREP(GUC_REGSET_STEERING_NEEDED, 1); 361 + ext->flags |= FIELD_PREP(GUC_REGSET_STEERING_GROUP, slice_id); 362 + ext->flags |= FIELD_PREP(GUC_REGSET_STEERING_INSTANCE, subslice_id); 363 + ext->regname = extlist->name; 364 + } 365 + 366 + static int 367 + __alloc_ext_regs(struct drm_device *drm, struct __guc_mmio_reg_descr_group *newlist, 368 + const struct __guc_mmio_reg_descr_group *rootlist, int num_regs) 369 + { 370 + struct __guc_mmio_reg_descr *list; 371 + 372 + list = drmm_kzalloc(drm, num_regs * sizeof(struct __guc_mmio_reg_descr), GFP_KERNEL); 373 + if (!list) 374 + return -ENOMEM; 375 + 376 + newlist->list = list; 377 + newlist->num_regs = num_regs; 378 + newlist->owner = rootlist->owner; 379 + newlist->engine = rootlist->engine; 380 + newlist->type = rootlist->type; 381 + 382 + return 0; 383 + } 384 + 385 + static int guc_capture_get_steer_reg_num(struct xe_device *xe) 386 + { 387 + int num = ARRAY_SIZE(xe_extregs); 388 + 389 + if (GRAPHICS_VERx100(xe) >= 1255) 390 + num += ARRAY_SIZE(xehpg_extregs); 391 + 392 + return num; 393 + } 394 + 395 + static void guc_capture_alloc_steered_lists(struct xe_guc *guc) 396 + { 397 + struct xe_gt *gt = guc_to_gt(guc); 398 + u16 slice, subslice; 399 + int iter, i, total = 0; 400 + const struct __guc_mmio_reg_descr_group *lists = guc->capture->reglists; 401 + const struct __guc_mmio_reg_descr_group *list; 402 + struct __guc_mmio_reg_descr_group *extlists; 403 + struct __guc_mmio_reg_descr *extarray; 404 + bool has_xehpg_extregs = GRAPHICS_VERx100(gt_to_xe(gt)) >= 1255; 405 + struct drm_device *drm = &gt_to_xe(gt)->drm; 406 + bool has_rcs_ccs = false; 407 + struct xe_hw_engine 
*hwe; 408 + enum xe_hw_engine_id id; 409 + 410 + /* 411 + * If GT has no rcs/ccs, no need to alloc steered list. 412 + * Currently, only rcs/ccs have steering registers; if in the future 413 + * other engine types get steering registers, this condition check needs 414 + * to be extended 415 + */ 416 + for_each_hw_engine(hwe, gt, id) { 417 + if (xe_engine_class_to_guc_capture_class(hwe->class) == 418 + GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE) { 419 + has_rcs_ccs = true; 420 + break; 421 + } 422 + } 423 + 424 + if (!has_rcs_ccs) 425 + return; 426 + 427 + /* steered registers currently only exist for the render-class */ 428 + list = guc_capture_get_one_list(lists, GUC_CAPTURE_LIST_INDEX_PF, 429 + GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS, 430 + GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE); 431 + /* 432 + * Skip if this platform has no engine class registers or if extlists 433 + * was previously allocated 434 + */ 435 + if (!list || guc->capture->extlists) 436 + return; 437 + 438 + total = bitmap_weight(gt->fuse_topo.g_dss_mask, sizeof(gt->fuse_topo.g_dss_mask) * 8) * 439 + guc_capture_get_steer_reg_num(guc_to_xe(guc)); 440 + 441 + if (!total) 442 + return; 443 + 444 + /* allocate an extra for an end marker */ 445 + extlists = drmm_kzalloc(drm, 2 * sizeof(struct __guc_mmio_reg_descr_group), GFP_KERNEL); 446 + if (!extlists) 447 + return; 448 + 449 + if (__alloc_ext_regs(drm, &extlists[0], list, total)) { 450 + drmm_kfree(drm, extlists); 451 + return; 452 + } 453 + 454 + /* For steering registers, the list is generated at run-time */ 455 + extarray = (struct __guc_mmio_reg_descr *)extlists[0].list; 456 + for_each_dss_steering(iter, gt, slice, subslice) { 457 + for (i = 0; i < ARRAY_SIZE(xe_extregs); ++i) { 458 + __fill_ext_reg(extarray, &xe_extregs[i], slice, subslice); 459 + ++extarray; 460 + } 461 + 462 + if (has_xehpg_extregs) 463 + for (i = 0; i < ARRAY_SIZE(xehpg_extregs); ++i) { 464 + __fill_ext_reg(extarray, &xehpg_extregs[i], slice, subslice); 465 + ++extarray; 466 + } 467 + } 
468 + 469 + extlists[0].num_regs = total; 470 + 471 + xe_gt_dbg(guc_to_gt(guc), "capture found %d ext-regs.\n", total); 472 + guc->capture->extlists = extlists; 473 + } 474 + 475 + static int 476 + guc_capture_list_init(struct xe_guc *guc, u32 owner, u32 type, 477 + enum guc_capture_list_class_type capture_class, struct guc_mmio_reg *ptr, 478 + u16 num_entries) 479 + { 480 + u32 ptr_idx = 0, list_idx = 0; 481 + const struct __guc_mmio_reg_descr_group *reglists = guc->capture->reglists; 482 + struct __guc_mmio_reg_descr_group *extlists = guc->capture->extlists; 483 + const struct __guc_mmio_reg_descr_group *match; 484 + u32 list_num; 485 + 486 + if (!reglists) 487 + return -ENODEV; 488 + 489 + match = guc_capture_get_one_list(reglists, owner, type, capture_class); 490 + if (!match) 491 + return -ENODATA; 492 + 493 + list_num = match->num_regs; 494 + for (list_idx = 0; ptr_idx < num_entries && list_idx < list_num; ++list_idx, ++ptr_idx) { 495 + ptr[ptr_idx].offset = match->list[list_idx].reg.addr; 496 + ptr[ptr_idx].value = 0xDEADF00D; 497 + ptr[ptr_idx].flags = match->list[list_idx].flags; 498 + ptr[ptr_idx].mask = match->list[list_idx].mask; 499 + } 500 + 501 + match = guc_capture_get_one_list(extlists, owner, type, capture_class); 502 + if (match) 503 + for (ptr_idx = list_num, list_idx = 0; 504 + ptr_idx < num_entries && list_idx < match->num_regs; 505 + ++ptr_idx, ++list_idx) { 506 + ptr[ptr_idx].offset = match->list[list_idx].reg.addr; 507 + ptr[ptr_idx].value = 0xDEADF00D; 508 + ptr[ptr_idx].flags = match->list[list_idx].flags; 509 + ptr[ptr_idx].mask = match->list[list_idx].mask; 510 + } 511 + 512 + if (ptr_idx < num_entries) 513 + xe_gt_dbg(guc_to_gt(guc), "Got short capture reglist init: %d out-of %d.\n", 514 + ptr_idx, num_entries); 515 + 516 + return 0; 517 + } 518 + 519 + static int 520 + guc_cap_list_num_regs(struct xe_guc *guc, u32 owner, u32 type, 521 + enum guc_capture_list_class_type capture_class) 522 + { 523 + const struct 
__guc_mmio_reg_descr_group *match; 524 + int num_regs = 0; 525 + 526 + match = guc_capture_get_one_list(guc->capture->reglists, owner, type, capture_class); 527 + if (match) 528 + num_regs = match->num_regs; 529 + 530 + match = guc_capture_get_one_list(guc->capture->extlists, owner, type, capture_class); 531 + if (match) 532 + num_regs += match->num_regs; 533 + else 534 + /* 535 + * If a caller wants the full register dump size but we have 536 + * not yet got the hw-config, which is before max_mmio_per_node 537 + * is initialized, then provide a worst-case number for 538 + * extlists based on max dss fuse bits, but only ever for 539 + * render/compute 540 + */ 541 + if (owner == GUC_CAPTURE_LIST_INDEX_PF && 542 + type == GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS && 543 + capture_class == GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE && 544 + !guc->capture->max_mmio_per_node) 545 + num_regs += guc_capture_get_steer_reg_num(guc_to_xe(guc)) * 546 + XE_MAX_DSS_FUSE_BITS; 547 + 548 + return num_regs; 549 + } 550 + 551 + static int 552 + guc_capture_getlistsize(struct xe_guc *guc, u32 owner, u32 type, 553 + enum guc_capture_list_class_type capture_class, 554 + size_t *size, bool is_purpose_est) 555 + { 556 + struct xe_guc_state_capture *gc = guc->capture; 557 + struct xe_gt *gt = guc_to_gt(guc); 558 + struct __guc_capture_ads_cache *cache; 559 + int num_regs; 560 + 561 + xe_gt_assert(gt, type < GUC_STATE_CAPTURE_TYPE_MAX); 562 + xe_gt_assert(gt, capture_class < GUC_CAPTURE_LIST_CLASS_MAX); 563 + 564 + cache = &gc->ads_cache[owner][type][capture_class]; 565 + if (!gc->reglists) { 566 + xe_gt_warn(gt, "No capture reglist for this device\n"); 567 + return -ENODEV; 568 + } 569 + 570 + if (cache->is_valid) { 571 + *size = cache->size; 572 + return cache->status; 573 + } 574 + 575 + if (!is_purpose_est && owner == GUC_CAPTURE_LIST_INDEX_PF && 576 + !guc_capture_get_one_list(gc->reglists, owner, type, capture_class)) { 577 + if (type == GUC_STATE_CAPTURE_TYPE_GLOBAL) 578 + xe_gt_warn(gt, 
"Missing capture reglist: global!\n"); 579 + else 580 + xe_gt_warn(gt, "Missing capture reglist: %s(%u):%s(%u)!\n", 581 + capture_list_type_names[type], type, 582 + capture_engine_class_names[capture_class], capture_class); 583 + return -ENODEV; 584 + } 585 + 586 + num_regs = guc_cap_list_num_regs(guc, owner, type, capture_class); 587 + /* intentional empty lists can exist depending on hw config */ 588 + if (!num_regs) 589 + return -ENODATA; 590 + 591 + if (size) 592 + *size = PAGE_ALIGN((sizeof(struct guc_debug_capture_list)) + 593 + (num_regs * sizeof(struct guc_mmio_reg))); 594 + 595 + return 0; 596 + } 597 + 598 + /** 599 + * xe_guc_capture_getlistsize - Get list size for owner/type/class combination 600 + * @guc: The GuC object 601 + * @owner: PF/VF owner 602 + * @type: GuC capture register type 603 + * @capture_class: GuC capture engine class id 604 + * @size: Point to the size 605 + * 606 + * This function will get the list for the owner/type/class combination, and 607 + * return the page aligned list size. 608 + * 609 + * Returns: 0 on success or a negative error code on failure. 610 + */ 611 + int 612 + xe_guc_capture_getlistsize(struct xe_guc *guc, u32 owner, u32 type, 613 + enum guc_capture_list_class_type capture_class, size_t *size) 614 + { 615 + return guc_capture_getlistsize(guc, owner, type, capture_class, size, false); 616 + } 617 + 618 + /** 619 + * xe_guc_capture_getlist - Get register capture list for owner/type/class 620 + * combination 621 + * @guc: The GuC object 622 + * @owner: PF/VF owner 623 + * @type: GuC capture register type 624 + * @capture_class: GuC capture engine class id 625 + * @outptr: Point to cached register capture list 626 + * 627 + * This function will get the register capture list for the owner/type/class 628 + * combination. 629 + * 630 + * Returns: 0 on success or a negative error code on failure. 
631 + */ 632 + int 633 + xe_guc_capture_getlist(struct xe_guc *guc, u32 owner, u32 type, 634 + enum guc_capture_list_class_type capture_class, void **outptr) 635 + { 636 + struct xe_guc_state_capture *gc = guc->capture; 637 + struct __guc_capture_ads_cache *cache = &gc->ads_cache[owner][type][capture_class]; 638 + struct guc_debug_capture_list *listnode; 639 + int ret, num_regs; 640 + u8 *caplist, *tmp; 641 + size_t size = 0; 642 + 643 + if (!gc->reglists) 644 + return -ENODEV; 645 + 646 + if (cache->is_valid) { 647 + *outptr = cache->ptr; 648 + return cache->status; 649 + } 650 + 651 + ret = xe_guc_capture_getlistsize(guc, owner, type, capture_class, &size); 652 + if (ret) { 653 + cache->is_valid = true; 654 + cache->ptr = NULL; 655 + cache->size = 0; 656 + cache->status = ret; 657 + return ret; 658 + } 659 + 660 + caplist = drmm_kzalloc(guc_to_drm(guc), size, GFP_KERNEL); 661 + if (!caplist) 662 + return -ENOMEM; 663 + 664 + /* populate capture list header */ 665 + tmp = caplist; 666 + num_regs = guc_cap_list_num_regs(guc, owner, type, capture_class); 667 + listnode = (struct guc_debug_capture_list *)tmp; 668 + listnode->header.info = FIELD_PREP(GUC_CAPTURELISTHDR_NUMDESCR, (u32)num_regs); 669 + 670 + /* populate list of register descriptor */ 671 + tmp += sizeof(struct guc_debug_capture_list); 672 + guc_capture_list_init(guc, owner, type, capture_class, 673 + (struct guc_mmio_reg *)tmp, num_regs); 674 + 675 + /* cache this list */ 676 + cache->is_valid = true; 677 + cache->ptr = caplist; 678 + cache->size = size; 679 + cache->status = 0; 680 + 681 + *outptr = caplist; 682 + 683 + return 0; 684 + } 685 + 686 + /** 687 + * xe_guc_capture_getnullheader - Get a null list for register capture 688 + * @guc: The GuC object 689 + * @outptr: Point to cached register capture list 690 + * @size: Point to the size 691 + * 692 + * This function will alloc for a null list for register capture. 693 + * 694 + * Returns: 0 on success or a negative error code on failure. 
695 + */ 696 + int 697 + xe_guc_capture_getnullheader(struct xe_guc *guc, void **outptr, size_t *size) 698 + { 699 + struct xe_guc_state_capture *gc = guc->capture; 700 + int tmp = sizeof(u32) * 4; 701 + void *null_header; 702 + 703 + if (gc->ads_null_cache) { 704 + *outptr = gc->ads_null_cache; 705 + *size = tmp; 706 + return 0; 707 + } 708 + 709 + null_header = drmm_kzalloc(guc_to_drm(guc), tmp, GFP_KERNEL); 710 + if (!null_header) 711 + return -ENOMEM; 712 + 713 + gc->ads_null_cache = null_header; 714 + *outptr = null_header; 715 + *size = tmp; 716 + 717 + return 0; 718 + } 719 + 720 + /** 721 + * xe_guc_capture_ads_input_worst_size - Calculate the worst size for GuC register capture 722 + * @guc: point to xe_guc structure 723 + * 724 + * Calculate the worst size for GuC register capture by including all possible engines classes. 725 + * 726 + * Returns: Calculated size 727 + */ 728 + size_t xe_guc_capture_ads_input_worst_size(struct xe_guc *guc) 729 + { 730 + size_t total_size, class_size, instance_size, global_size; 731 + int i, j; 732 + 733 + /* 734 + * This function calculates the worst case register lists size by 735 + * including all possible engines classes. It is called during the 736 + * first of a two-phase GuC (and ADS-population) initialization 737 + * sequence, that is, during the pre-hwconfig phase before we have 738 + * the exact engine fusing info. 
739 + */ 740 + total_size = PAGE_SIZE; /* Pad a page in front for empty lists */ 741 + for (i = 0; i < GUC_CAPTURE_LIST_INDEX_MAX; i++) { 742 + for (j = 0; j < GUC_CAPTURE_LIST_CLASS_MAX; j++) { 743 + if (xe_guc_capture_getlistsize(guc, i, 744 + GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS, 745 + j, &class_size) < 0) 746 + class_size = 0; 747 + if (xe_guc_capture_getlistsize(guc, i, 748 + GUC_STATE_CAPTURE_TYPE_ENGINE_INSTANCE, 749 + j, &instance_size) < 0) 750 + instance_size = 0; 751 + total_size += class_size + instance_size; 752 + } 753 + if (xe_guc_capture_getlistsize(guc, i, 754 + GUC_STATE_CAPTURE_TYPE_GLOBAL, 755 + 0, &global_size) < 0) 756 + global_size = 0; 757 + total_size += global_size; 758 + } 759 + 760 + return PAGE_ALIGN(total_size); 761 + } 762 + 763 + static int guc_capture_output_size_est(struct xe_guc *guc) 764 + { 765 + struct xe_gt *gt = guc_to_gt(guc); 766 + struct xe_hw_engine *hwe; 767 + enum xe_hw_engine_id id; 768 + 769 + int capture_size = 0; 770 + size_t tmp = 0; 771 + 772 + if (!guc->capture) 773 + return -ENODEV; 774 + 775 + /* 776 + * If every single engine-instance suffered a failure in quick succession but 777 + * were all unrelated, then a burst of multiple error-capture events would dump 778 + * registers for every one engine instance, one at a time. In this case, GuC 779 + * would even dump the global-registers repeatedly. 780 + * 781 + * For each engine instance, there would be 1 x guc_state_capture_group_t output 782 + * followed by 3 x guc_state_capture_t lists. The latter is how the register 783 + * dumps are split across different register types (where the '3' are global vs class 784 + * vs instance). 
+	 */
+	for_each_hw_engine(hwe, gt, id) {
+		enum guc_capture_list_class_type capture_class;
+
+		capture_class = xe_engine_class_to_guc_capture_class(hwe->class);
+		capture_size += sizeof(struct guc_state_capture_group_header_t) +
+				(3 * sizeof(struct guc_state_capture_header_t));
+
+		if (!guc_capture_getlistsize(guc, 0, GUC_STATE_CAPTURE_TYPE_GLOBAL,
+					     0, &tmp, true))
+			capture_size += tmp;
+		if (!guc_capture_getlistsize(guc, 0, GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS,
+					     capture_class, &tmp, true))
+			capture_size += tmp;
+		if (!guc_capture_getlistsize(guc, 0, GUC_STATE_CAPTURE_TYPE_ENGINE_INSTANCE,
+					     capture_class, &tmp, true))
+			capture_size += tmp;
+	}
+
+	return capture_size;
+}
+
+/*
+ * Add on a 3x multiplier to allow for multiple back-to-back captures occurring
+ * before the Xe driver can read the data out and process it
+ */
+#define GUC_CAPTURE_OVERBUFFER_MULTIPLIER 3
+
+static void check_guc_capture_size(struct xe_guc *guc)
+{
+	int capture_size = guc_capture_output_size_est(guc);
+	int spare_size = capture_size * GUC_CAPTURE_OVERBUFFER_MULTIPLIER;
+	u32 buffer_size = xe_guc_log_section_size_capture(&guc->log);
+
+	/*
+	 * NOTE: capture_size is much smaller than the capture region
+	 * allocation (DG2: <80K vs 1MB).
+	 * Additionally, it's based on the space needed to fit all engines getting
+	 * reset at once within the same G2H handler task slot. This is very
+	 * unlikely. However, if GuC really does run out of space for whatever
+	 * reason, we will see a separate warning message when processing the
+	 * G2H event capture-notification, search for:
+	 * xe_guc_STATE_CAPTURE_EVENT_STATUS_NOSPACE.
+	 */
+	if (capture_size < 0)
+		xe_gt_dbg(guc_to_gt(guc),
+			  "Failed to calculate error state capture buffer minimum size: %d!\n",
+			  capture_size);
+	if (capture_size > buffer_size)
+		xe_gt_dbg(guc_to_gt(guc), "Error state capture buffer may be too small: %d < %d\n",
+			  buffer_size, capture_size);
+	else if (spare_size > buffer_size)
+		xe_gt_dbg(guc_to_gt(guc),
+			  "Error state capture buffer lacks spare size: %d < %d (min = %d)\n",
+			  buffer_size, spare_size, capture_size);
+}
+
+static void
+guc_capture_add_node_to_list(struct __guc_capture_parsed_output *node,
+			     struct list_head *list)
+{
+	list_add(&node->link, list);
+}
+
+static void
+guc_capture_add_node_to_outlist(struct xe_guc_state_capture *gc,
+				struct __guc_capture_parsed_output *node)
+{
+	guc_capture_remove_stale_matches_from_list(gc, node);
+	guc_capture_add_node_to_list(node, &gc->outlist);
+}
+
+static void
+guc_capture_add_node_to_cachelist(struct xe_guc_state_capture *gc,
+				  struct __guc_capture_parsed_output *node)
+{
+	guc_capture_add_node_to_list(node, &gc->cachelist);
+}
+
+static void
+guc_capture_free_outlist_node(struct xe_guc_state_capture *gc,
+			      struct __guc_capture_parsed_output *n)
+{
+	if (n) {
+		n->locked = 0;
+		list_del(&n->link);
+		/* put node back to cache list */
+		guc_capture_add_node_to_cachelist(gc, n);
+	}
+}
+
+static void
+guc_capture_remove_stale_matches_from_list(struct xe_guc_state_capture *gc,
+					   struct __guc_capture_parsed_output *node)
+{
+	struct __guc_capture_parsed_output *n, *ntmp;
+	int guc_id = node->guc_id;
+
+	list_for_each_entry_safe(n, ntmp, &gc->outlist, link) {
+		if (n != node && !n->locked && n->guc_id == guc_id)
+			guc_capture_free_outlist_node(gc, n);
+	}
+}
+
+static void
+guc_capture_init_node(struct xe_guc *guc, struct __guc_capture_parsed_output *node)
+{
+	struct guc_mmio_reg *tmp[GUC_STATE_CAPTURE_TYPE_MAX];
+	int i;
+
+	for (i = 0; i < GUC_STATE_CAPTURE_TYPE_MAX; ++i) {
+		tmp[i] = node->reginfo[i].regs;
+		memset(tmp[i], 0, sizeof(struct guc_mmio_reg) *
+		       guc->capture->max_mmio_per_node);
+	}
+	memset(node, 0, sizeof(*node));
+	for (i = 0; i < GUC_STATE_CAPTURE_TYPE_MAX; ++i)
+		node->reginfo[i].regs = tmp[i];
+
+	INIT_LIST_HEAD(&node->link);
+}
+
+/**
+ * DOC: Init, G2H-event and reporting flows for GuC-error-capture
+ *
+ * KMD Init time flows:
+ * --------------------
+ *     --> alloc A: GuC input capture regs lists (registered to GuC via ADS).
+ *                  xe_guc_ads acquires the register lists by calling
+ *                  xe_guc_capture_getlistsize and xe_guc_capture_getlist 'n' times,
+ *                  where n = 1 for global-reg-list +
+ *                            num_engine_classes for class-reg-list +
+ *                            num_engine_classes for instance-reg-list
+ *                               (since all instances of the same engine-class type
+ *                                have an identical engine-instance register-list).
+ *                  ADS module also calls separately for PF vs VF.
+ *
+ *     --> alloc B: GuC output capture buf (registered via guc_init_params(log_param))
+ *                  Size = #define CAPTURE_BUFFER_SIZE (warns if too small)
+ *                  Note: 'x 3' to hold multiple capture groups
+ *
+ * GUC Runtime notify capture:
+ * ---------------------------
+ *     --> G2H STATE_CAPTURE_NOTIFICATION
+ *         L--> xe_guc_capture_process
+ *              L--> Loop through B (head..tail) and for each engine instance's
+ *                   err-state-captured register-list we find, we alloc 'C':
+ *     --> alloc C: A capture-output-node structure that includes misc capture info along
+ *                  with 3 register list dumps (global, engine-class and engine-instance).
+ *                  This node is created from a pre-allocated list of blank nodes in
+ *                  guc->capture->cachelist and populated with the error-capture
+ *                  data from GuC, and then it's added into the guc->capture->outlist linked
+ *                  list. This list is used for matchup and printout by xe_devcoredump_read
+ *                  and xe_engine_snapshot_print (when the user invokes the devcoredump sysfs).
+ *
+ * GUC --> notify context reset:
+ * -----------------------------
+ *     --> guc_exec_queue_timedout_job
+ *         L--> xe_devcoredump
+ *              L--> devcoredump_snapshot
+ *                   --> xe_hw_engine_snapshot_capture
+ *                   --> xe_engine_manual_capture (for manual capture)
+ *
+ * User Sysfs / Debugfs
+ * --------------------
+ *     --> xe_devcoredump_read->
+ *         L--> xxx_snapshot_print
+ *              L--> xe_engine_snapshot_print
+ *                   Print register list values saved at
+ *                   guc->capture->outlist
+ *
+ */
+
+static int guc_capture_buf_cnt(struct __guc_capture_bufstate *buf)
+{
+	if (buf->wr >= buf->rd)
+		return (buf->wr - buf->rd);
+	return (buf->size - buf->rd) + buf->wr;
+}
+
+static int guc_capture_buf_cnt_to_end(struct __guc_capture_bufstate *buf)
+{
+	if (buf->rd > buf->wr)
+		return (buf->size - buf->rd);
+	return (buf->wr - buf->rd);
+}
+
+/*
+ * GuC's error-capture output is a ring buffer populated in a byte-stream fashion:
+ *
+ * The GuC Log buffer region for error-capture is managed like a ring buffer.
+ * The GuC firmware dumps error capture logs into this ring in a byte-stream flow.
+ * Additionally, currently and for the foreseeable future, all packed error-
+ * capture output structures are dword aligned.
+ *
+ * That said, if the GuC firmware is in the midst of writing a structure that is larger
+ * than one dword but the tail end of the err-capture buffer-region has less space left,
+ * we would need to extract that structure one dword at a time, straddled across the end,
+ * onto the start of the ring.
+ *
+ * The function below, guc_capture_log_remove_bytes(), is a helper for that. Callers of
+ * this function would typically do a straight-up memcpy from the ring contents and will
+ * only call this helper if their structure-extraction is straddling across the end of the
+ * ring.
+ * GuC firmware does not add any padding. Padding is omitted to keep the format
+ * scalable: output data types can be extended in the future without requiring
+ * a redesign of the flow controls.
+ */
+static int
+guc_capture_log_remove_bytes(struct xe_guc *guc, struct __guc_capture_bufstate *buf,
+			     void *out, int bytes_needed)
+{
+#define GUC_CAPTURE_LOG_BUF_COPY_RETRY_MAX	3
+
+	int fill_size = 0, tries = GUC_CAPTURE_LOG_BUF_COPY_RETRY_MAX;
+	int copy_size, avail;
+
+	xe_assert(guc_to_xe(guc), bytes_needed % sizeof(u32) == 0);
+
+	if (bytes_needed > guc_capture_buf_cnt(buf))
+		return -1;
+
+	while (bytes_needed > 0 && tries--) {
+		int misaligned;
+
+		avail = guc_capture_buf_cnt_to_end(buf);
+		misaligned = avail % sizeof(u32);
+		/* wrap if at end */
+		if (!avail) {
+			/* output stream clipped */
+			if (!buf->rd)
+				return fill_size;
+			buf->rd = 0;
+			continue;
+		}
+
+		/* Only copy to u32 aligned data */
+		copy_size = avail < bytes_needed ?
+			    avail - misaligned : bytes_needed;
+		xe_map_memcpy_from(guc_to_xe(guc), out + fill_size, &guc->log.bo->vmap,
+				   buf->data_offset + buf->rd, copy_size);
+		buf->rd += copy_size;
+		fill_size += copy_size;
+		bytes_needed -= copy_size;
+
+		if (misaligned)
+			xe_gt_warn(guc_to_gt(guc),
+				   "Bytes extraction not dword aligned, clipping.\n");
+	}
+
+	return fill_size;
+}
+
+static int
+guc_capture_log_get_group_hdr(struct xe_guc *guc, struct __guc_capture_bufstate *buf,
+			      struct guc_state_capture_group_header_t *ghdr)
+{
+	int fullsize = sizeof(struct guc_state_capture_group_header_t);
+
+	if (guc_capture_log_remove_bytes(guc, buf, ghdr, fullsize) != fullsize)
+		return -1;
+	return 0;
+}
+
+static int
+guc_capture_log_get_data_hdr(struct xe_guc *guc, struct __guc_capture_bufstate *buf,
+			     struct guc_state_capture_header_t *hdr)
+{
+	int fullsize = sizeof(struct guc_state_capture_header_t);
+
+	if (guc_capture_log_remove_bytes(guc, buf, hdr, fullsize) != fullsize)
+		return -1;
+	return 0;
+}
+
+static int
+guc_capture_log_get_register(struct xe_guc *guc, struct __guc_capture_bufstate *buf,
+			     struct guc_mmio_reg *reg)
+{
+	int fullsize = sizeof(struct guc_mmio_reg);
+
+	if (guc_capture_log_remove_bytes(guc, buf, reg, fullsize) != fullsize)
+		return -1;
+	return 0;
+}
+
+static struct __guc_capture_parsed_output *
+guc_capture_get_prealloc_node(struct xe_guc *guc)
+{
+	struct __guc_capture_parsed_output *found = NULL;
+
+	if (!list_empty(&guc->capture->cachelist)) {
+		struct __guc_capture_parsed_output *n, *ntmp;
+
+		/* get first avail node from the cache list */
+		list_for_each_entry_safe(n, ntmp, &guc->capture->cachelist, link) {
+			found = n;
+			break;
+		}
+	} else
+	{
+		struct __guc_capture_parsed_output *n, *ntmp;
+
+		/*
+		 * traverse reversed and steal back the oldest node already
+		 * allocated
+		 */
+		list_for_each_entry_safe_reverse(n, ntmp, &guc->capture->outlist, link) {
+			if (!n->locked)
+				found = n;
+		}
+	}
+	if (found) {
+		list_del(&found->link);
+		guc_capture_init_node(guc, found);
+	}
+
+	return found;
+}
+
+static struct __guc_capture_parsed_output *
+guc_capture_clone_node(struct xe_guc *guc, struct __guc_capture_parsed_output *original,
+		       u32 keep_reglist_mask)
+{
+	struct __guc_capture_parsed_output *new;
+	int i;
+
+	new = guc_capture_get_prealloc_node(guc);
+	if (!new)
+		return NULL;
+	if (!original)
+		return new;
+
+	new->is_partial = original->is_partial;
+
+	/* copy reg-lists that we want to clone */
+	for (i = 0; i < GUC_STATE_CAPTURE_TYPE_MAX; ++i) {
+		if (keep_reglist_mask & BIT(i)) {
+			XE_WARN_ON(original->reginfo[i].num_regs >
+				   guc->capture->max_mmio_per_node);
+
+			memcpy(new->reginfo[i].regs, original->reginfo[i].regs,
+			       original->reginfo[i].num_regs * sizeof(struct guc_mmio_reg));
+
+			new->reginfo[i].num_regs = original->reginfo[i].num_regs;
+			new->reginfo[i].vfid = original->reginfo[i].vfid;
+
+			if (i == GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS) {
+				new->eng_class = original->eng_class;
+			} else if (i == GUC_STATE_CAPTURE_TYPE_ENGINE_INSTANCE) {
+				new->eng_inst = original->eng_inst;
+				new->guc_id = original->guc_id;
+				new->lrca = original->lrca;
+			}
+		}
+	}
+
+	return new;
+}
+
+static int
+guc_capture_extract_reglists(struct xe_guc *guc, struct __guc_capture_bufstate *buf)
+{
+	struct xe_gt *gt = guc_to_gt(guc);
+	struct guc_state_capture_group_header_t ghdr = {0};
+	struct
+	    guc_state_capture_header_t hdr = {0};
+	struct __guc_capture_parsed_output *node = NULL;
+	struct guc_mmio_reg *regs = NULL;
+	int i, numlists, numregs, ret = 0;
+	enum guc_state_capture_type datatype;
+	struct guc_mmio_reg tmp;
+	bool is_partial = false;
+
+	i = guc_capture_buf_cnt(buf);
+	if (!i)
+		return -ENODATA;
+
+	if (i % sizeof(u32)) {
+		xe_gt_warn(gt, "Got mis-aligned register capture entries\n");
+		ret = -EIO;
+		goto bailout;
+	}
+
+	/* first get the capture group header */
+	if (guc_capture_log_get_group_hdr(guc, buf, &ghdr)) {
+		ret = -EIO;
+		goto bailout;
+	}
+	/*
+	 * We would typically expect a layout as below, where the number of
+	 * captures is at least 3, and more than 3 when multiple dependent
+	 * engine instances are being reset together:
+	 * ____________________________________________
+	 * | Capture Group                            |
+	 * | ________________________________________ |
+	 * | | Capture Group Header:                | |
+	 * | |  - num_captures = 5                  | |
+	 * | |______________________________________| |
+	 * | ________________________________________ |
+	 * | | Capture1:                            | |
+	 * | |  Hdr: GLOBAL, numregs=a              | |
+	 * | | ____________________________________ | |
+	 * | | | Reglist                          | | |
+	 * | | | - reg1, reg2, ... rega           | | |
+	 * | | |__________________________________| | |
+	 * | |______________________________________| |
+	 * | ________________________________________ |
+	 * | | Capture2:                            | |
+	 * | |  Hdr: CLASS=RENDER/COMPUTE, numregs=b| |
+	 * | | ____________________________________ | |
+	 * | | | Reglist                          | | |
+	 * | | | - reg1, reg2, ... regb           | | |
+	 * | | |__________________________________| | |
+	 * | |______________________________________| |
+	 * | ________________________________________ |
+	 * | | Capture3:                            | |
+	 * | |  Hdr: INSTANCE=RCS, numregs=c        | |
+	 * | | ____________________________________ | |
+	 * | | | Reglist                          | | |
+	 * | | | - reg1, reg2, ... regc           | | |
+	 * | | |__________________________________| | |
+	 * | |______________________________________| |
+	 * | ________________________________________ |
+	 * | | Capture4:                            | |
+	 * | |  Hdr: CLASS=RENDER/COMPUTE, numregs=d| |
+	 * | | ____________________________________ | |
+	 * | | | Reglist                          | | |
+	 * | | | - reg1, reg2, ... regd           | | |
+	 * | | |__________________________________| | |
+	 * | |______________________________________| |
+	 * | ________________________________________ |
+	 * | | Capture5:                            | |
+	 * | |  Hdr: INSTANCE=CCS0, numregs=e       | |
+	 * | | ____________________________________ | |
+	 * | | | Reglist                          | | |
+	 * | | | - reg1, reg2, ... rege           | | |
+	 * | | |__________________________________| | |
+	 * | |______________________________________| |
+	 * |__________________________________________|
+	 */
+	is_partial = FIELD_GET(GUC_STATE_CAPTURE_GROUP_HEADER_CAPTURE_GROUP_TYPE, ghdr.info);
+	numlists = FIELD_GET(GUC_STATE_CAPTURE_GROUP_HEADER_NUM_CAPTURES, ghdr.info);
+
+	while (numlists--) {
+		if (guc_capture_log_get_data_hdr(guc, buf, &hdr)) {
+			ret = -EIO;
+			break;
+		}
+
+		datatype = FIELD_GET(GUC_STATE_CAPTURE_HEADER_CAPTURE_TYPE, hdr.info);
+		if (datatype > GUC_STATE_CAPTURE_TYPE_ENGINE_INSTANCE) {
+			/* unknown capture type - skip over to next capture set */
+			numregs = FIELD_GET(GUC_STATE_CAPTURE_HEADER_NUM_MMIO_ENTRIES,
+					    hdr.num_mmio_entries);
+			while (numregs--) {
+				if (guc_capture_log_get_register(guc, buf, &tmp)) {
+					ret = -EIO;
+					break;
+				}
+			}
+			continue;
+		} else if (node) {
+			/*
+			 * Based on the current capture type and what we have so far,
+			 * decide if we should add the current node into the internal
+			 * linked list for match-up when xe_devcoredump calls later
+			 * (and alloc a blank node for the next set of reglists),
+			 * or continue with the same node, or clone the current node
+			 * but only retain the global or class registers (such as the
+			 * case of dependent engine resets).
+			 */
+			if (datatype == GUC_STATE_CAPTURE_TYPE_GLOBAL) {
+				guc_capture_add_node_to_outlist(guc->capture, node);
+				node = NULL;
+			} else if (datatype == GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS &&
+				   node->reginfo[GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS].num_regs) {
+				/* Add to list, clone node and duplicate global list */
+				guc_capture_add_node_to_outlist(guc->capture, node);
+				node = guc_capture_clone_node(guc, node,
+							      GCAP_PARSED_REGLIST_INDEX_GLOBAL);
+			} else if (datatype == GUC_STATE_CAPTURE_TYPE_ENGINE_INSTANCE &&
+				   node->reginfo[GUC_STATE_CAPTURE_TYPE_ENGINE_INSTANCE].num_regs) {
+				/* Add to list, clone node and duplicate global + class lists */
+				guc_capture_add_node_to_outlist(guc->capture, node);
+				node = guc_capture_clone_node(guc, node,
+							      (GCAP_PARSED_REGLIST_INDEX_GLOBAL |
+							       GCAP_PARSED_REGLIST_INDEX_ENGCLASS));
+			}
+		}
+
+		if (!node) {
+			node = guc_capture_get_prealloc_node(guc);
+			if (!node) {
+				ret = -ENOMEM;
+				break;
+			}
+			if (datatype != GUC_STATE_CAPTURE_TYPE_GLOBAL)
+				xe_gt_dbg(gt, "Register capture missing global dump: %08x!\n",
+					  datatype);
+		}
+		node->is_partial = is_partial;
+		node->reginfo[datatype].vfid = FIELD_GET(GUC_STATE_CAPTURE_HEADER_VFID, hdr.owner);
+		node->source = XE_ENGINE_CAPTURE_SOURCE_GUC;
+		node->type = datatype;
+
+		switch (datatype) {
+		case GUC_STATE_CAPTURE_TYPE_ENGINE_INSTANCE:
+			node->eng_class = FIELD_GET(GUC_STATE_CAPTURE_HEADER_ENGINE_CLASS,
+						    hdr.info);
+			node->eng_inst = FIELD_GET(GUC_STATE_CAPTURE_HEADER_ENGINE_INSTANCE,
+						   hdr.info);
+			node->lrca = hdr.lrca;
+			node->guc_id = hdr.guc_id;
+			break;
+		case GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS:
+			node->eng_class = FIELD_GET(GUC_STATE_CAPTURE_HEADER_ENGINE_CLASS,
+						    hdr.info);
+			break;
+		default:
+			break;
+		}
+
+		numregs
+		    = FIELD_GET(GUC_STATE_CAPTURE_HEADER_NUM_MMIO_ENTRIES,
+				hdr.num_mmio_entries);
+		if (numregs > guc->capture->max_mmio_per_node) {
+			xe_gt_dbg(gt, "Register capture list extraction clipped by prealloc!\n");
+			numregs = guc->capture->max_mmio_per_node;
+		}
+		node->reginfo[datatype].num_regs = numregs;
+		regs = node->reginfo[datatype].regs;
+		i = 0;
+		while (numregs--) {
+			if (guc_capture_log_get_register(guc, buf, &regs[i++])) {
+				ret = -EIO;
+				break;
+			}
+		}
+	}
+
+bailout:
+	if (node) {
+		/* If we have data, add to linked list for match-up when xe_devcoredump calls */
+		for (i = GUC_STATE_CAPTURE_TYPE_GLOBAL; i < GUC_STATE_CAPTURE_TYPE_MAX; ++i) {
+			if (node->reginfo[i].regs) {
+				guc_capture_add_node_to_outlist(guc->capture, node);
+				node = NULL;
+				break;
+			}
+		}
+		if (node) /* else return it back to cache list */
+			guc_capture_add_node_to_cachelist(guc->capture, node);
+	}
+	return ret;
+}
+
+static int __guc_capture_flushlog_complete(struct xe_guc *guc)
+{
+	u32 action[] = {
+		XE_GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE,
+		GUC_LOG_BUFFER_CAPTURE
+	};
+
+	return xe_guc_ct_send_g2h_handler(&guc->ct, action, ARRAY_SIZE(action));
+}
+
+static void __guc_capture_process_output(struct xe_guc *guc)
+{
+	unsigned int buffer_size, read_offset, write_offset, full_count;
+	struct xe_uc *uc = container_of(guc, typeof(*uc), guc);
+	struct guc_log_buffer_state log_buf_state_local;
+	struct __guc_capture_bufstate buf;
+	bool new_overflow;
+	int ret, tmp;
+	u32 log_buf_state_offset;
+	u32 src_data_offset;
+
+	log_buf_state_offset = sizeof(struct guc_log_buffer_state) * GUC_LOG_BUFFER_CAPTURE;
+	src_data_offset = xe_guc_get_log_buffer_offset(&guc->log, GUC_LOG_BUFFER_CAPTURE);
+
+	/*
+	 * Make a copy of the state structure, inside the GuC log buffer
+	 * (which is mapped uncached), on the stack to avoid reading
+	 * from it multiple times.
+	 */
+	xe_map_memcpy_from(guc_to_xe(guc), &log_buf_state_local, &guc->log.bo->vmap,
+			   log_buf_state_offset, sizeof(struct guc_log_buffer_state));
+
+	buffer_size = xe_guc_get_log_buffer_size(&guc->log, GUC_LOG_BUFFER_CAPTURE);
+	read_offset = log_buf_state_local.read_ptr;
+	write_offset = log_buf_state_local.sampled_write_ptr;
+	full_count = FIELD_GET(GUC_LOG_BUFFER_STATE_BUFFER_FULL_CNT, log_buf_state_local.flags);
+
+	/* Bookkeeping stuff */
+	tmp = FIELD_GET(GUC_LOG_BUFFER_STATE_FLUSH_TO_FILE, log_buf_state_local.flags);
+	guc->log.stats[GUC_LOG_BUFFER_CAPTURE].flush += tmp;
+	new_overflow = xe_guc_check_log_buf_overflow(&guc->log, GUC_LOG_BUFFER_CAPTURE,
+						     full_count);
+
+	/* Now copy the actual logs. */
+	if (unlikely(new_overflow)) {
+		/* copy the whole buffer in case of overflow */
+		read_offset = 0;
+		write_offset = buffer_size;
+	} else if (unlikely((read_offset > buffer_size) ||
+			    (write_offset > buffer_size))) {
+		xe_gt_err(guc_to_gt(guc),
+			  "Register capture buffer in invalid state: read = 0x%X, size = 0x%X!\n",
+			  read_offset, buffer_size);
+		/* copy whole buffer as offsets are unreliable */
+		read_offset = 0;
+		write_offset = buffer_size;
+	}
+
+	buf.size = buffer_size;
+	buf.rd = read_offset;
+	buf.wr = write_offset;
+	buf.data_offset = src_data_offset;
+
+	if (!xe_guc_read_stopped(guc)) {
+		do {
+			ret = guc_capture_extract_reglists(guc, &buf);
+			if (ret && ret != -ENODATA)
+				xe_gt_dbg(guc_to_gt(guc), "Capture extraction failed:%d\n", ret);
+		} while (ret >= 0);
+	}
+
+	/* Update the state of the log buffer err-cap state */
+	xe_map_wr(guc_to_xe(guc), &guc->log.bo->vmap,
+		  log_buf_state_offset + offsetof(struct guc_log_buffer_state, read_ptr), u32,
+		  write_offset);
+
+	/*
+	 * Clear flush_to_file in the local copy first (the local copy was loaded
+	 * by the xe_map_memcpy_from() above), then write the updated local copy
+	 * back out through xe_map_wr()
+	 */
+	log_buf_state_local.flags &= ~GUC_LOG_BUFFER_STATE_FLUSH_TO_FILE;
+	xe_map_wr(guc_to_xe(guc), &guc->log.bo->vmap,
+		  log_buf_state_offset + offsetof(struct guc_log_buffer_state, flags), u32,
+		  log_buf_state_local.flags);
+	__guc_capture_flushlog_complete(guc);
+}
+
+/**
+ * xe_guc_capture_process - Process GuC register captured data
+ * @guc: The GuC object
+ *
+ * When GuC captured data is ready, GuC will send message
+ * XE_GUC_ACTION_STATE_CAPTURE_NOTIFICATION to the host; this function is then
+ * called to process the data that comes with the message.
+ *
+ * Returns: None
+ */
+void xe_guc_capture_process(struct xe_guc *guc)
+{
+	if (guc->capture)
+		__guc_capture_process_output(guc);
+}
+
+static struct __guc_capture_parsed_output *
+guc_capture_alloc_one_node(struct xe_guc *guc)
+{
+	struct drm_device *drm = guc_to_drm(guc);
+	struct __guc_capture_parsed_output *new;
+	int i;
+
+	new = drmm_kzalloc(drm, sizeof(*new), GFP_KERNEL);
+	if (!new)
+		return NULL;
+
+	for (i = 0; i < GUC_STATE_CAPTURE_TYPE_MAX; ++i) {
+		new->reginfo[i].regs = drmm_kzalloc(drm, guc->capture->max_mmio_per_node *
+						    sizeof(struct guc_mmio_reg), GFP_KERNEL);
+		if (!new->reginfo[i].regs) {
+			while (i)
+				drmm_kfree(drm, new->reginfo[--i].regs);
+			drmm_kfree(drm, new);
+			return NULL;
+		}
+	}
+	guc_capture_init_node(guc, new);
+
+	return new;
+}
+
+static void
+__guc_capture_create_prealloc_nodes(struct xe_guc *guc)
+{
+	struct
+	    __guc_capture_parsed_output *node = NULL;
+	int i;
+
+	for (i = 0; i < PREALLOC_NODES_MAX_COUNT; ++i) {
+		node = guc_capture_alloc_one_node(guc);
+		if (!node) {
+			xe_gt_warn(guc_to_gt(guc), "Register capture pre-alloc-cache failure\n");
+			/* don't free the priors, use what we got and clean up at shutdown */
+			return;
+		}
+		guc_capture_add_node_to_cachelist(guc->capture, node);
+	}
+}
+
+static int
+guc_get_max_reglist_count(struct xe_guc *guc)
+{
+	int i, j, k, tmp, maxregcount = 0;
+
+	for (i = 0; i < GUC_CAPTURE_LIST_INDEX_MAX; ++i) {
+		for (j = 0; j < GUC_STATE_CAPTURE_TYPE_MAX; ++j) {
+			for (k = 0; k < GUC_CAPTURE_LIST_CLASS_MAX; ++k) {
+				const struct __guc_mmio_reg_descr_group *match;
+
+				if (j == GUC_STATE_CAPTURE_TYPE_GLOBAL && k > 0)
+					continue;
+
+				tmp = 0;
+				match = guc_capture_get_one_list(guc->capture->reglists, i, j, k);
+				if (match)
+					tmp = match->num_regs;
+
+				match = guc_capture_get_one_list(guc->capture->extlists, i, j, k);
+				if (match)
+					tmp += match->num_regs;
+
+				if (tmp > maxregcount)
+					maxregcount = tmp;
+			}
+		}
+	}
+	if (!maxregcount)
+		maxregcount = PREALLOC_NODES_DEFAULT_NUMREGS;
+
+	return maxregcount;
+}
+
+static void
+guc_capture_create_prealloc_nodes(struct xe_guc *guc)
+{
+	/* skip if we've already done the pre-alloc */
+	if (guc->capture->max_mmio_per_node)
+		return;
+
+	guc->capture->max_mmio_per_node = guc_get_max_reglist_count(guc);
+	__guc_capture_create_prealloc_nodes(guc);
+}
+
+static void
+read_reg_to_node(struct xe_hw_engine *hwe, const struct __guc_mmio_reg_descr_group *list,
+		 struct guc_mmio_reg *regs)
+{
+	int i;
+
+	if (!list || list->num_regs == 0)
+		return;
+
+	if (!regs)
+		return;
+
+	for (i = 0; i < list->num_regs; i++) {
+		struct __guc_mmio_reg_descr desc = list->list[i];
+		u32 value;
+
+		if (!list->list)
+			return;
+
+		if (list->type == GUC_STATE_CAPTURE_TYPE_ENGINE_INSTANCE) {
+			value = xe_hw_engine_mmio_read32(hwe, desc.reg);
+		} else {
+			if (list->type == GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS &&
+			    FIELD_GET(GUC_REGSET_STEERING_NEEDED, desc.flags)) {
+				int group, instance;
+
+				group = FIELD_GET(GUC_REGSET_STEERING_GROUP, desc.flags);
+				instance = FIELD_GET(GUC_REGSET_STEERING_INSTANCE, desc.flags);
+				value = xe_gt_mcr_unicast_read(hwe->gt, XE_REG_MCR(desc.reg.addr),
+							       group, instance);
+			} else {
+				value = xe_mmio_read32(&hwe->gt->mmio, desc.reg);
+			}
+		}
+
+		regs[i].value = value;
+		regs[i].offset = desc.reg.addr;
+		regs[i].flags = desc.flags;
+		regs[i].mask = desc.mask;
+	}
+}
+
+/**
+ * xe_engine_manual_capture - Take a manual engine snapshot from the engine.
+ * @hwe: Xe HW Engine.
+ * @snapshot: The engine snapshot
+ *
+ * Take an engine snapshot by reading the engine registers directly.
+ *
+ * Returns: None
+ */
+void
+xe_engine_manual_capture(struct xe_hw_engine *hwe, struct xe_hw_engine_snapshot *snapshot)
+{
+	struct xe_gt *gt = hwe->gt;
+	struct xe_device *xe = gt_to_xe(gt);
+	struct xe_guc *guc = &gt->uc.guc;
+	struct xe_devcoredump *devcoredump = &xe->devcoredump;
+	enum guc_capture_list_class_type capture_class;
+	const struct __guc_mmio_reg_descr_group *list;
+	struct __guc_capture_parsed_output *new;
+	enum guc_state_capture_type type;
+	u16 guc_id = 0;
+	u32 lrca = 0;
+
+	new = guc_capture_get_prealloc_node(guc);
+	if (!new)
+		return;
+
+	capture_class = xe_engine_class_to_guc_capture_class(hwe->class);
+	for (type = GUC_STATE_CAPTURE_TYPE_GLOBAL; type < GUC_STATE_CAPTURE_TYPE_MAX; type++) {
+		struct gcap_reg_list_info *reginfo = &new->reginfo[type];
+		/*
+		 * reginfo->regs is allocated based on guc->capture->max_mmio_per_node,
+		 * which is based on the descriptor list driving the population, so
+		 * it should not overflow
+		 */
+
+		/* Get register list for the type/class */
+		list = xe_guc_capture_get_reg_desc_list(gt, GUC_CAPTURE_LIST_INDEX_PF, type,
+							capture_class, false);
+		if (!list) {
+			xe_gt_dbg(gt, "Empty GuC capture register descriptor for %s",
+				  hwe->name);
+			continue;
+		}
+
+		read_reg_to_node(hwe, list, reginfo->regs);
+		reginfo->num_regs = list->num_regs;
+
+		/* Capture steering registers for rcs/ccs */
+		if (capture_class == GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE) {
+			list = xe_guc_capture_get_reg_desc_list(gt, GUC_CAPTURE_LIST_INDEX_PF,
+								type, capture_class, true);
+			if (list) {
+				read_reg_to_node(hwe, list, &reginfo->regs[reginfo->num_regs]);
+				reginfo->num_regs += list->num_regs;
+			}
+		}
+	}
+
+	if (devcoredump && devcoredump->captured) {
+		struct
+		    xe_guc_submit_exec_queue_snapshot *ge = devcoredump->snapshot.ge;
+
+		if (ge) {
+			guc_id = ge->guc.id;
+			if (ge->lrc[0])
+				lrca = ge->lrc[0]->context_desc;
+		}
+	}
+
+	new->eng_class = xe_engine_class_to_guc_class(hwe->class);
+	new->eng_inst = hwe->instance;
+	new->guc_id = guc_id;
+	new->lrca = lrca;
+	new->is_partial = 0;
+	new->locked = 1;
+	new->source = XE_ENGINE_CAPTURE_SOURCE_MANUAL;
+
+	guc_capture_add_node_to_outlist(guc->capture, new);
+	devcoredump->snapshot.matched_node = new;
+}
+
+static struct guc_mmio_reg *
+guc_capture_find_reg(struct gcap_reg_list_info *reginfo, u32 addr, u32 flags)
+{
+	int i;
+
+	if (reginfo && reginfo->num_regs > 0) {
+		struct guc_mmio_reg *regs = reginfo->regs;
+
+		if (regs)
+			for (i = 0; i < reginfo->num_regs; i++)
+				if (regs[i].offset == addr && regs[i].flags == flags)
+					return &regs[i];
+	}
+
+	return NULL;
+}
+
+static void
+snapshot_print_by_list_order(struct xe_hw_engine_snapshot *snapshot, struct drm_printer *p,
+			     u32 type, const struct __guc_mmio_reg_descr_group *list)
+{
+	struct xe_gt *gt = snapshot->hwe->gt;
+	struct xe_device *xe = gt_to_xe(gt);
+	struct xe_guc *guc = &gt->uc.guc;
+	struct xe_devcoredump *devcoredump = &xe->devcoredump;
+	struct xe_devcoredump_snapshot *devcore_snapshot = &devcoredump->snapshot;
+	struct gcap_reg_list_info *reginfo = NULL;
+	u32 last_value, i;
+	bool is_ext;
+
+	if (!list || list->num_regs == 0)
+		return;
+	XE_WARN_ON(!devcore_snapshot->matched_node);
+
+	is_ext = list == guc->capture->extlists;
+	reginfo = &devcore_snapshot->matched_node->reginfo[type];
+
+	/*
+	 * Loop through the descriptor first and find the register in the node;
+	 * this is more scalable for developer maintenance
as it will ensure 1691 + * the printout matches the ordering of the static descriptor 1692 + * table-of-lists 1693 + */ 1694 + for (i = 0; i < list->num_regs; i++) { 1695 + const struct __guc_mmio_reg_descr *reg_desc = &list->list[i]; 1696 + struct guc_mmio_reg *reg; 1697 + u32 value; 1698 + 1699 + reg = guc_capture_find_reg(reginfo, reg_desc->reg.addr, reg_desc->flags); 1700 + if (!reg) 1701 + continue; 1702 + 1703 + value = reg->value; 1704 + if (reg_desc->data_type == REG_64BIT_LOW_DW) { 1705 + last_value = value; 1706 + /* Low 32 bit dword saved, continue for high 32 bit */ 1707 + continue; 1708 + } else if (reg_desc->data_type == REG_64BIT_HI_DW) { 1709 + u64 value_qw = ((u64)value << 32) | last_value; 1710 + 1711 + drm_printf(p, "\t%s: 0x%016llx\n", reg_desc->regname, value_qw); 1712 + continue; 1713 + } 1714 + 1715 + if (is_ext) { 1716 + int dss, group, instance; 1717 + 1718 + group = FIELD_GET(GUC_REGSET_STEERING_GROUP, reg_desc->flags); 1719 + instance = FIELD_GET(GUC_REGSET_STEERING_INSTANCE, reg_desc->flags); 1720 + dss = xe_gt_mcr_steering_info_to_dss_id(gt, group, instance); 1721 + 1722 + drm_printf(p, "\t%s[%u]: 0x%08x\n", reg_desc->regname, dss, value); 1723 + } else { 1724 + drm_printf(p, "\t%s: 0x%08x\n", reg_desc->regname, value); 1725 + } 1726 + } 1727 + } 1728 + 1729 + /** 1730 + * xe_engine_snapshot_print - Print out a given Xe HW Engine snapshot. 1731 + * @snapshot: Xe HW Engine snapshot object. 1732 + * @p: drm_printer where it will be printed out. 1733 + * 1734 + * This function prints out a given Xe HW Engine snapshot object. 
1735 + */ 1736 + void xe_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm_printer *p) 1737 + { 1738 + const char *grptype[GUC_STATE_CAPTURE_GROUP_TYPE_MAX] = { 1739 + "full-capture", 1740 + "partial-capture" 1741 + }; 1742 + int type; 1743 + const struct __guc_mmio_reg_descr_group *list; 1744 + enum guc_capture_list_class_type capture_class; 1745 + 1746 + struct xe_gt *gt; 1747 + struct xe_device *xe; 1748 + struct xe_devcoredump *devcoredump; 1749 + struct xe_devcoredump_snapshot *devcore_snapshot; 1750 + 1751 + if (!snapshot) 1752 + return; 1753 + 1754 + gt = snapshot->hwe->gt; 1755 + xe = gt_to_xe(gt); 1756 + devcoredump = &xe->devcoredump; 1757 + devcore_snapshot = &devcoredump->snapshot; 1758 + 1759 + if (!devcore_snapshot->matched_node) 1760 + return; 1761 + 1762 + xe_gt_assert(gt, snapshot->source <= XE_ENGINE_CAPTURE_SOURCE_GUC); 1763 + xe_gt_assert(gt, snapshot->hwe); 1764 + 1765 + capture_class = xe_engine_class_to_guc_capture_class(snapshot->hwe->class); 1766 + 1767 + drm_printf(p, "%s (physical), logical instance=%d\n", 1768 + snapshot->name ? snapshot->name : "", 1769 + snapshot->logical_instance); 1770 + drm_printf(p, "\tCapture_source: %s\n", 1771 + snapshot->source == XE_ENGINE_CAPTURE_SOURCE_GUC ? 
"GuC" : "Manual"); 1772 + drm_printf(p, "\tCoverage: %s\n", grptype[devcore_snapshot->matched_node->is_partial]); 1773 + drm_printf(p, "\tForcewake: domain 0x%x, ref %d\n", 1774 + snapshot->forcewake.domain, snapshot->forcewake.ref); 1775 + drm_printf(p, "\tReserved: %s\n", 1776 + str_yes_no(snapshot->kernel_reserved)); 1777 + 1778 + for (type = GUC_STATE_CAPTURE_TYPE_GLOBAL; type < GUC_STATE_CAPTURE_TYPE_MAX; type++) { 1779 + list = xe_guc_capture_get_reg_desc_list(gt, GUC_CAPTURE_LIST_INDEX_PF, type, 1780 + capture_class, false); 1781 + snapshot_print_by_list_order(snapshot, p, type, list); 1782 + } 1783 + 1784 + if (capture_class == GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE) { 1785 + list = xe_guc_capture_get_reg_desc_list(gt, GUC_CAPTURE_LIST_INDEX_PF, 1786 + GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS, 1787 + capture_class, true); 1788 + snapshot_print_by_list_order(snapshot, p, GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS, 1789 + list); 1790 + } 1791 + 1792 + drm_puts(p, "\n"); 1793 + } 1794 + 1795 + /** 1796 + * xe_guc_capture_get_matching_and_lock - Match and lock GuC capture for the job. 1797 + * @job: The job object. 1798 + * 1799 + * Search within the capture outlist for the job; can be used to check whether 1800 + * GuC capture is ready for the job. 1801 + * If found, the node's locked flag will be set. 
1802 + * 1803 + * Returns: found guc-capture node ptr else NULL 1804 + */ 1805 + struct __guc_capture_parsed_output * 1806 + xe_guc_capture_get_matching_and_lock(struct xe_sched_job *job) 1807 + { 1808 + struct xe_hw_engine *hwe; 1809 + enum xe_hw_engine_id id; 1810 + struct xe_exec_queue *q; 1811 + struct xe_device *xe; 1812 + u16 guc_class = GUC_LAST_ENGINE_CLASS + 1; 1813 + struct xe_devcoredump_snapshot *ss; 1814 + 1815 + if (!job) 1816 + return NULL; 1817 + 1818 + q = job->q; 1819 + if (!q || !q->gt) 1820 + return NULL; 1821 + 1822 + xe = gt_to_xe(q->gt); 1823 + if (xe->wedged.mode >= 2 || !xe_device_uc_enabled(xe)) 1824 + return NULL; 1825 + 1826 + ss = &xe->devcoredump.snapshot; 1827 + if (ss->matched_node && ss->matched_node->source == XE_ENGINE_CAPTURE_SOURCE_GUC) 1828 + return ss->matched_node; 1829 + 1830 + /* Find hwe for the job */ 1831 + for_each_hw_engine(hwe, q->gt, id) { 1832 + if (hwe != q->hwe) 1833 + continue; 1834 + guc_class = xe_engine_class_to_guc_class(hwe->class); 1835 + break; 1836 + } 1837 + 1838 + if (guc_class <= GUC_LAST_ENGINE_CLASS) { 1839 + struct __guc_capture_parsed_output *n, *ntmp; 1840 + struct xe_guc *guc = &q->gt->uc.guc; 1841 + u16 guc_id = q->guc->id; 1842 + u32 lrca = xe_lrc_ggtt_addr(q->lrc[0]); 1843 + 1844 + /* 1845 + * Look for a matching GuC reported error capture node from 1846 + * the internal output link-list based on engine, guc id and 1847 + * lrca info. 1848 + */ 1849 + list_for_each_entry_safe(n, ntmp, &guc->capture->outlist, link) { 1850 + if (n->eng_class == guc_class && n->eng_inst == hwe->instance && 1851 + n->guc_id == guc_id && n->lrca == lrca && 1852 + n->source == XE_ENGINE_CAPTURE_SOURCE_GUC) { 1853 + n->locked = 1; 1854 + return n; 1855 + } 1856 + } 1857 + } 1858 + return NULL; 1859 + } 1860 + 1861 + /** 1862 + * xe_engine_snapshot_capture_for_job - Take snapshot of associated engine 1863 + * @job: The job object 1864 + * 1865 + * Take snapshot of associated HW Engine 1866 + * 1867 + * Returns: None. 
1868 + */ 1869 + void 1870 + xe_engine_snapshot_capture_for_job(struct xe_sched_job *job) 1871 + { 1872 + struct xe_exec_queue *q = job->q; 1873 + struct xe_device *xe = gt_to_xe(q->gt); 1874 + struct xe_devcoredump *coredump = &xe->devcoredump; 1875 + struct xe_hw_engine *hwe; 1876 + enum xe_hw_engine_id id; 1877 + u32 adj_logical_mask = q->logical_mask; 1878 + 1879 + for_each_hw_engine(hwe, q->gt, id) { 1880 + if (hwe->class != q->hwe->class || 1881 + !(BIT(hwe->logical_instance) & adj_logical_mask)) { 1882 + coredump->snapshot.hwe[id] = NULL; 1883 + continue; 1884 + } 1885 + 1886 + if (!coredump->snapshot.hwe[id]) { 1887 + coredump->snapshot.hwe[id] = xe_hw_engine_snapshot_capture(hwe, job); 1888 + } else { 1889 + struct __guc_capture_parsed_output *new; 1890 + 1891 + new = xe_guc_capture_get_matching_and_lock(job); 1892 + if (new) { 1893 + struct xe_guc *guc = &q->gt->uc.guc; 1894 + 1895 + /* 1896 + * If we are in here, it means we found a fresh 1897 + * GuC-err-capture node for this engine after 1898 + * previously failing to find a match in the 1899 + * early part of guc_exec_queue_timedout_job. 
1900 + * Thus we must free the manually captured node 1901 + */ 1902 + guc_capture_free_outlist_node(guc->capture, 1903 + coredump->snapshot.matched_node); 1904 + coredump->snapshot.matched_node = new; 1905 + } 1906 + } 1907 + 1908 + break; 1909 + } 1910 + } 1911 + 1912 + /* 1913 + * xe_guc_capture_put_matched_nodes - Cleanup matched nodes 1914 + * @guc: The GuC object 1915 + * 1916 + * Free matched node and all nodes with the same guc_id from 1917 + * GuC captured outlist 1918 + */ 1919 + void xe_guc_capture_put_matched_nodes(struct xe_guc *guc) 1920 + { 1921 + struct xe_device *xe = guc_to_xe(guc); 1922 + struct xe_devcoredump *devcoredump = &xe->devcoredump; 1923 + struct __guc_capture_parsed_output *n = devcoredump->snapshot.matched_node; 1924 + 1925 + if (n) { 1926 + guc_capture_remove_stale_matches_from_list(guc->capture, n); 1927 + guc_capture_free_outlist_node(guc->capture, n); 1928 + devcoredump->snapshot.matched_node = NULL; 1929 + } 1930 + } 1931 + 1932 + /* 1933 + * xe_guc_capture_steered_list_init - Init steering register list 1934 + * @guc: The GuC object 1935 + * 1936 + * Init steering register list for GuC register capture, create pre-alloc node 1937 + */ 1938 + void xe_guc_capture_steered_list_init(struct xe_guc *guc) 1939 + { 1940 + /* 1941 + * For certain engine classes, there are slice and subslice 1942 + * level registers requiring steering. We allocate and populate 1943 + * these based on hw config and add it as an extension list at 1944 + * the end of the pre-populated render list. 1945 + */ 1946 + guc_capture_alloc_steered_lists(guc); 1947 + check_guc_capture_size(guc); 1948 + guc_capture_create_prealloc_nodes(guc); 1949 + } 1950 + 1951 + /* 1952 + * xe_guc_capture_init - Init for GuC register capture 1953 + * @guc: The GuC object 1954 + * 1955 + * Init for GuC register capture, alloc memory for capture data structure. 1956 + * 1957 + * Returns: 0 if success. 
1958 + * -ENOMEM if out of memory 1959 + */ 1960 + int xe_guc_capture_init(struct xe_guc *guc) 1961 + { 1962 + guc->capture = drmm_kzalloc(guc_to_drm(guc), sizeof(*guc->capture), GFP_KERNEL); 1963 + if (!guc->capture) 1964 + return -ENOMEM; 1965 + 1966 + guc->capture->reglists = guc_capture_get_device_reglist(guc_to_xe(guc)); 1967 + 1968 + INIT_LIST_HEAD(&guc->capture->outlist); 1969 + INIT_LIST_HEAD(&guc->capture->cachelist); 1970 + 1971 + return 0; 1972 + }
+61
drivers/gpu/drm/xe/xe_guc_capture.h
··· 1 + /* SPDX-License-Identifier: MIT */ 2 + /* 3 + * Copyright © 2021-2024 Intel Corporation 4 + */ 5 + 6 + #ifndef _XE_GUC_CAPTURE_H 7 + #define _XE_GUC_CAPTURE_H 8 + 9 + #include <linux/types.h> 10 + #include "abi/guc_capture_abi.h" 11 + #include "xe_guc.h" 12 + #include "xe_guc_fwif.h" 13 + 14 + struct xe_guc; 15 + struct xe_hw_engine; 16 + struct xe_hw_engine_snapshot; 17 + struct xe_sched_job; 18 + 19 + static inline enum guc_capture_list_class_type xe_guc_class_to_capture_class(u16 class) 20 + { 21 + switch (class) { 22 + case GUC_RENDER_CLASS: 23 + case GUC_COMPUTE_CLASS: 24 + return GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE; 25 + case GUC_GSC_OTHER_CLASS: 26 + return GUC_CAPTURE_LIST_CLASS_GSC_OTHER; 27 + case GUC_VIDEO_CLASS: 28 + case GUC_VIDEOENHANCE_CLASS: 29 + case GUC_BLITTER_CLASS: 30 + return class; 31 + default: 32 + XE_WARN_ON(class); 33 + return GUC_CAPTURE_LIST_CLASS_MAX; 34 + } 35 + } 36 + 37 + static inline enum guc_capture_list_class_type 38 + xe_engine_class_to_guc_capture_class(enum xe_engine_class class) 39 + { 40 + return xe_guc_class_to_capture_class(xe_engine_class_to_guc_class(class)); 41 + } 42 + 43 + void xe_guc_capture_process(struct xe_guc *guc); 44 + int xe_guc_capture_getlist(struct xe_guc *guc, u32 owner, u32 type, 45 + enum guc_capture_list_class_type capture_class, void **outptr); 46 + int xe_guc_capture_getlistsize(struct xe_guc *guc, u32 owner, u32 type, 47 + enum guc_capture_list_class_type capture_class, size_t *size); 48 + int xe_guc_capture_getnullheader(struct xe_guc *guc, void **outptr, size_t *size); 49 + size_t xe_guc_capture_ads_input_worst_size(struct xe_guc *guc); 50 + const struct __guc_mmio_reg_descr_group * 51 + xe_guc_capture_get_reg_desc_list(struct xe_gt *gt, u32 owner, u32 type, 52 + enum guc_capture_list_class_type capture_class, bool is_ext); 53 + struct __guc_capture_parsed_output *xe_guc_capture_get_matching_and_lock(struct xe_sched_job *job); 54 + void xe_engine_manual_capture(struct xe_hw_engine *hwe, 
struct xe_hw_engine_snapshot *snapshot); 55 + void xe_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, struct drm_printer *p); 56 + void xe_engine_snapshot_capture_for_job(struct xe_sched_job *job); 57 + void xe_guc_capture_steered_list_init(struct xe_guc *guc); 58 + void xe_guc_capture_put_matched_nodes(struct xe_guc *guc); 59 + int xe_guc_capture_init(struct xe_guc *guc); 60 + 61 + #endif
+68
drivers/gpu/drm/xe/xe_guc_capture_types.h
··· 1 + /* SPDX-License-Identifier: MIT */ 2 + /* 3 + * Copyright © 2021-2024 Intel Corporation 4 + */ 5 + 6 + #ifndef _XE_GUC_CAPTURE_TYPES_H 7 + #define _XE_GUC_CAPTURE_TYPES_H 8 + 9 + #include <linux/types.h> 10 + #include "regs/xe_reg_defs.h" 11 + 12 + struct xe_guc; 13 + 14 + /* data type of the register in register list */ 15 + enum capture_register_data_type { 16 + REG_32BIT = 0, 17 + REG_64BIT_LOW_DW, 18 + REG_64BIT_HI_DW, 19 + }; 20 + 21 + /** 22 + * struct __guc_mmio_reg_descr - GuC mmio register descriptor 23 + * 24 + * xe_guc_capture module uses these structures to define a register 25 + * (offsets, names, flags,...) that are used at the ADS registration 26 + * time as well as during runtime processing and reporting of error- 27 + * capture states generated by GuC just prior to engine reset events. 28 + */ 29 + struct __guc_mmio_reg_descr { 30 + /** @reg: the register */ 31 + struct xe_reg reg; 32 + /** 33 + * @data_type: data type of the register 34 + * Could be 32 bit, low or hi dword of a 64 bit, see enum 35 + * capture_register_data_type 36 + */ 37 + enum capture_register_data_type data_type; 38 + /** @flags: Flags for the register */ 39 + u32 flags; 40 + /** @mask: The mask to apply */ 41 + u32 mask; 42 + /** @regname: Name of the register */ 43 + const char *regname; 44 + }; 45 + 46 + /** 47 + * struct __guc_mmio_reg_descr_group - A group of register descriptors 48 + * 49 + * xe_guc_capture module uses these structures to maintain static 50 + * tables (per unique platform) that consist of lists of registers 51 + * (offsets, names, flags,...) that are used at the ADS registration 52 + * time as well as during runtime processing and reporting of error- 53 + * capture states generated by GuC just prior to engine reset events. 
54 + */ 55 + struct __guc_mmio_reg_descr_group { 56 + /** @list: The register list */ 57 + const struct __guc_mmio_reg_descr *list; 58 + /** @num_regs: Count of registers in the list */ 59 + u32 num_regs; 60 + /** @owner: PF/VF owner, see enum guc_capture_list_index_type */ 61 + u32 owner; 62 + /** @type: Capture register type, see enum guc_state_capture_type */ 63 + u32 type; 64 + /** @engine: The engine class, see enum guc_capture_list_class_type */ 65 + u32 engine; 66 + }; 67 + 68 + #endif
+376 -110
drivers/gpu/drm/xe/xe_guc_ct.c
··· 8 8 #include <linux/bitfield.h> 9 9 #include <linux/circ_buf.h> 10 10 #include <linux/delay.h> 11 + #include <linux/fault-inject.h> 11 12 12 13 #include <kunit/static_stub.h> 13 14 ··· 18 17 #include "abi/guc_actions_sriov_abi.h" 19 18 #include "abi/guc_klvs_abi.h" 20 19 #include "xe_bo.h" 20 + #include "xe_devcoredump.h" 21 21 #include "xe_device.h" 22 22 #include "xe_gt.h" 23 23 #include "xe_gt_pagefault.h" ··· 27 25 #include "xe_gt_sriov_pf_monitor.h" 28 26 #include "xe_gt_tlb_invalidation.h" 29 27 #include "xe_guc.h" 28 + #include "xe_guc_log.h" 30 29 #include "xe_guc_relay.h" 31 30 #include "xe_guc_submit.h" 32 31 #include "xe_map.h" 33 32 #include "xe_pm.h" 34 33 #include "xe_trace_guc.h" 34 + 35 + #if IS_ENABLED(CONFIG_DRM_XE_DEBUG) 36 + enum { 37 + /* Internal states, not error conditions */ 38 + CT_DEAD_STATE_REARM, /* 0x0001 */ 39 + CT_DEAD_STATE_CAPTURE, /* 0x0002 */ 40 + 41 + /* Error conditions */ 42 + CT_DEAD_SETUP, /* 0x0004 */ 43 + CT_DEAD_H2G_WRITE, /* 0x0008 */ 44 + CT_DEAD_H2G_HAS_ROOM, /* 0x0010 */ 45 + CT_DEAD_G2H_READ, /* 0x0020 */ 46 + CT_DEAD_G2H_RECV, /* 0x0040 */ 47 + CT_DEAD_G2H_RELEASE, /* 0x0080 */ 48 + CT_DEAD_DEADLOCK, /* 0x0100 */ 49 + CT_DEAD_PROCESS_FAILED, /* 0x0200 */ 50 + CT_DEAD_FAST_G2H, /* 0x0400 */ 51 + CT_DEAD_PARSE_G2H_RESPONSE, /* 0x0800 */ 52 + CT_DEAD_PARSE_G2H_UNKNOWN, /* 0x1000 */ 53 + CT_DEAD_PARSE_G2H_ORIGIN, /* 0x2000 */ 54 + CT_DEAD_PARSE_G2H_TYPE, /* 0x4000 */ 55 + }; 56 + 57 + static void ct_dead_worker_func(struct work_struct *w); 58 + static void ct_dead_capture(struct xe_guc_ct *ct, struct guc_ctb *ctb, u32 reason_code); 59 + 60 + #define CT_DEAD(ct, ctb, reason_code) ct_dead_capture((ct), (ctb), CT_DEAD_##reason_code) 61 + #else 62 + #define CT_DEAD(ct, ctb, reason) \ 63 + do { \ 64 + struct guc_ctb *_ctb = (ctb); \ 65 + if (_ctb) \ 66 + _ctb->info.broken = true; \ 67 + } while (0) 68 + #endif 35 69 36 70 /* Used when a CT send wants to block and / or receive data */ 37 71 struct g2h_fence { ··· 220 182 
spin_lock_init(&ct->fast_lock); 221 183 xa_init(&ct->fence_lookup); 222 184 INIT_WORK(&ct->g2h_worker, g2h_worker_func); 223 - INIT_DELAYED_WORK(&ct->safe_mode_worker, safe_mode_worker_func); 185 + INIT_DELAYED_WORK(&ct->safe_mode_worker, safe_mode_worker_func); 186 + #if IS_ENABLED(CONFIG_DRM_XE_DEBUG) 187 + spin_lock_init(&ct->dead.lock); 188 + INIT_WORK(&ct->dead.worker, ct_dead_worker_func); 189 + #endif 224 190 init_waitqueue_head(&ct->wq); 225 191 init_waitqueue_head(&ct->g2h_fence_wq); 226 192 ··· 251 209 ct->state = XE_GUC_CT_STATE_DISABLED; 252 210 return 0; 253 211 } 212 + ALLOW_ERROR_INJECTION(xe_guc_ct_init, ERRNO); /* See xe_pci_probe() */ 254 213 255 214 #define desc_read(xe_, guc_ctb__, field_) \ 256 215 xe_map_rd_field(xe_, &guc_ctb__->desc, 0, \ ··· 438 395 439 396 xe_gt_assert(gt, !xe_guc_ct_enabled(ct)); 440 397 398 + xe_map_memset(xe, &ct->bo->vmap, 0, 0, ct->bo->size); 441 399 guc_ct_ctb_h2g_init(xe, &ct->ctbs.h2g, &ct->bo->vmap); 442 400 guc_ct_ctb_g2h_init(xe, &ct->ctbs.g2h, &ct->bo->vmap); 443 401 ··· 463 419 if (ct_needs_safe_mode(ct)) 464 420 ct_enter_safe_mode(ct); 465 421 422 + #if IS_ENABLED(CONFIG_DRM_XE_DEBUG) 423 + /* 424 + * The CT has now been reset so the dumper can be re-armed 425 + * after any existing dead state has been dumped. 
426 + */ 427 + spin_lock_irq(&ct->dead.lock); 428 + if (ct->dead.reason) 429 + ct->dead.reason |= (1 << CT_DEAD_STATE_REARM); 430 + spin_unlock_irq(&ct->dead.lock); 431 + #endif 432 + 466 433 return 0; 467 434 468 435 err_out: 469 436 xe_gt_err(gt, "Failed to enable GuC CT (%pe)\n", ERR_PTR(err)); 437 + CT_DEAD(ct, NULL, SETUP); 470 438 471 439 return err; 472 440 } ··· 522 466 523 467 if (cmd_len > h2g->info.space) { 524 468 h2g->info.head = desc_read(ct_to_xe(ct), h2g, head); 469 + 470 + if (h2g->info.head > h2g->info.size) { 471 + struct xe_device *xe = ct_to_xe(ct); 472 + u32 desc_status = desc_read(xe, h2g, status); 473 + 474 + desc_write(xe, h2g, status, desc_status | GUC_CTB_STATUS_OVERFLOW); 475 + 476 + xe_gt_err(ct_to_gt(ct), "CT: invalid head offset %u >= %u)\n", 477 + h2g->info.head, h2g->info.size); 478 + CT_DEAD(ct, h2g, H2G_HAS_ROOM); 479 + return false; 480 + } 481 + 525 482 h2g->info.space = CIRC_SPACE(h2g->info.tail, h2g->info.head, 526 483 h2g->info.size) - 527 484 h2g->info.resv_space; ··· 590 521 591 522 static void __g2h_release_space(struct xe_guc_ct *ct, u32 g2h_len) 592 523 { 524 + bool bad = false; 525 + 593 526 lockdep_assert_held(&ct->fast_lock); 594 - xe_gt_assert(ct_to_gt(ct), ct->ctbs.g2h.info.space + g2h_len <= 595 - ct->ctbs.g2h.info.size - ct->ctbs.g2h.info.resv_space); 596 - xe_gt_assert(ct_to_gt(ct), ct->g2h_outstanding); 527 + 528 + bad = ct->ctbs.g2h.info.space + g2h_len > 529 + ct->ctbs.g2h.info.size - ct->ctbs.g2h.info.resv_space; 530 + bad |= !ct->g2h_outstanding; 531 + 532 + if (bad) { 533 + xe_gt_err(ct_to_gt(ct), "Invalid G2H release: %d + %d vs %d - %d -> %d vs %d, outstanding = %d!\n", 534 + ct->ctbs.g2h.info.space, g2h_len, 535 + ct->ctbs.g2h.info.size, ct->ctbs.g2h.info.resv_space, 536 + ct->ctbs.g2h.info.space + g2h_len, 537 + ct->ctbs.g2h.info.size - ct->ctbs.g2h.info.resv_space, 538 + ct->g2h_outstanding); 539 + CT_DEAD(ct, &ct->ctbs.g2h, G2H_RELEASE); 540 + return; 541 + } 597 542 598 543 ct->ctbs.g2h.info.space += 
g2h_len; 599 544 if (!--ct->g2h_outstanding) ··· 634 551 u32 full_len; 635 552 struct iosys_map map = IOSYS_MAP_INIT_OFFSET(&h2g->cmds, 636 553 tail * sizeof(u32)); 554 + u32 desc_status; 637 555 638 556 full_len = len + GUC_CTB_HDR_LEN; 639 557 640 558 lockdep_assert_held(&ct->lock); 641 559 xe_gt_assert(gt, full_len <= GUC_CTB_MSG_MAX_LEN); 642 - xe_gt_assert(gt, tail <= h2g->info.size); 560 + 561 + desc_status = desc_read(xe, h2g, status); 562 + if (desc_status) { 563 + xe_gt_err(gt, "CT write: non-zero status: %u\n", desc_status); 564 + goto corrupted; 565 + } 566 + 567 + if (IS_ENABLED(CONFIG_DRM_XE_DEBUG)) { 568 + u32 desc_tail = desc_read(xe, h2g, tail); 569 + u32 desc_head = desc_read(xe, h2g, head); 570 + 571 + if (tail != desc_tail) { 572 + desc_write(xe, h2g, status, desc_status | GUC_CTB_STATUS_MISMATCH); 573 + xe_gt_err(gt, "CT write: tail was modified %u != %u\n", desc_tail, tail); 574 + goto corrupted; 575 + } 576 + 577 + if (tail > h2g->info.size) { 578 + desc_write(xe, h2g, status, desc_status | GUC_CTB_STATUS_OVERFLOW); 579 + xe_gt_err(gt, "CT write: tail out of range: %u vs %u\n", 580 + tail, h2g->info.size); 581 + goto corrupted; 582 + } 583 + 584 + if (desc_head >= h2g->info.size) { 585 + desc_write(xe, h2g, status, desc_status | GUC_CTB_STATUS_OVERFLOW); 586 + xe_gt_err(gt, "CT write: invalid head offset %u >= %u)\n", 587 + desc_head, h2g->info.size); 588 + goto corrupted; 589 + } 590 + } 643 591 644 592 /* Command will wrap, zero fill (NOPs), return and check credits again */ 645 593 if (tail + full_len > h2g->info.size) { ··· 723 609 desc_read(xe, h2g, head), h2g->info.tail); 724 610 725 611 return 0; 612 + 613 + corrupted: 614 + CT_DEAD(ct, &ct->ctbs.h2g, H2G_WRITE); 615 + return -EPIPE; 726 616 } 727 617 728 618 /* ··· 785 667 num_g2h = 1; 786 668 787 669 if (g2h_fence_needs_alloc(g2h_fence)) { 788 - void *ptr; 789 - 790 670 g2h_fence->seqno = next_ct_seqno(ct, true); 791 - ptr = xa_store(&ct->fence_lookup, 792 - g2h_fence->seqno, 793 - 
g2h_fence, GFP_ATOMIC); 794 - if (IS_ERR(ptr)) { 795 - ret = PTR_ERR(ptr); 671 + ret = xa_err(xa_store(&ct->fence_lookup, 672 + g2h_fence->seqno, g2h_fence, 673 + GFP_ATOMIC)); 674 + if (ret) 796 675 goto out; 797 - } 798 676 } 799 677 800 678 seqno = g2h_fence->seqno; ··· 834 720 { 835 721 struct xe_device *xe = ct_to_xe(ct); 836 722 struct xe_gt *gt = ct_to_gt(ct); 837 - struct drm_printer p = xe_gt_info_printer(gt); 838 723 unsigned int sleep_period_ms = 1; 839 724 int ret; 840 725 ··· 886 773 goto broken; 887 774 #undef g2h_avail 888 775 889 - if (dequeue_one_g2h(ct) < 0) 776 + ret = dequeue_one_g2h(ct); 777 + if (ret < 0) { 778 + if (ret != -ECANCELED) 779 + xe_gt_err(ct_to_gt(ct), "CTB receive failed (%pe)", 780 + ERR_PTR(ret)); 890 781 goto broken; 782 + } 891 783 892 784 goto try_again; 893 785 } ··· 901 783 902 784 broken: 903 785 xe_gt_err(gt, "No forward process on H2G, reset required\n"); 904 - xe_guc_ct_print(ct, &p, true); 905 - ct->ctbs.h2g.info.broken = true; 786 + CT_DEAD(ct, &ct->ctbs.h2g, DEADLOCK); 906 787 907 788 return -EDEADLK; 908 789 } ··· 969 852 #define ct_alive(ct) \ 970 853 (xe_guc_ct_enabled(ct) && !ct->ctbs.h2g.info.broken && \ 971 854 !ct->ctbs.g2h.info.broken) 972 - if (!wait_event_interruptible_timeout(ct->wq, ct_alive(ct), HZ * 5)) 855 + if (!wait_event_interruptible_timeout(ct->wq, ct_alive(ct), HZ * 5)) 973 856 return false; 974 857 #undef ct_alive 975 858 ··· 996 879 retry_same_fence: 997 880 ret = guc_ct_send(ct, action, len, 0, 0, &g2h_fence); 998 881 if (unlikely(ret == -ENOMEM)) { 999 - void *ptr; 1000 - 1001 882 /* Retry allocation /w GFP_KERNEL */ 1002 - ptr = xa_store(&ct->fence_lookup, 1003 - g2h_fence.seqno, 1004 - &g2h_fence, GFP_KERNEL); 1005 - if (IS_ERR(ptr)) 1006 - return PTR_ERR(ptr); 883 + ret = xa_err(xa_store(&ct->fence_lookup, g2h_fence.seqno, 884 + &g2h_fence, GFP_KERNEL)); 885 + if (ret) 886 + return ret; 1007 887 1008 888 goto retry_same_fence; 1009 889 } else if (unlikely(ret)) { ··· 1011 897 goto 
retry_same_fence; 1012 898 1013 899 if (!g2h_fence_needs_alloc(&g2h_fence)) 1014 - xa_erase_irq(&ct->fence_lookup, g2h_fence.seqno); 900 + xa_erase(&ct->fence_lookup, g2h_fence.seqno); 1015 901 1016 902 return ret; 1017 903 } 1018 904 1019 905 ret = wait_event_timeout(ct->g2h_fence_wq, g2h_fence.done, HZ); 906 + 907 + /* 908 + * Ensure we serialize with completion side to prevent UAF with fence going out of scope on 909 + * the stack, since we have no clue if it will fire after the timeout before we can erase 910 + * from the xa. Also we have some dependent loads and stores below for which we need the 911 + * correct ordering, and we lack the needed barriers. 912 + */ 913 + mutex_lock(&ct->lock); 1020 914 if (!ret) { 1021 - xe_gt_err(gt, "Timed out wait for G2H, fence %u, action %04x", 1022 - g2h_fence.seqno, action[0]); 1023 - xa_erase_irq(&ct->fence_lookup, g2h_fence.seqno); 915 + xe_gt_err(gt, "Timed out wait for G2H, fence %u, action %04x, done %s", 916 + g2h_fence.seqno, action[0], str_yes_no(g2h_fence.done)); 917 + xa_erase(&ct->fence_lookup, g2h_fence.seqno); 918 + mutex_unlock(&ct->lock); 1024 919 return -ETIME; 1025 920 } 1026 921 1027 922 if (g2h_fence.retry) { 1028 923 xe_gt_dbg(gt, "H2G action %#x retrying: reason %#x\n", 1029 924 action[0], g2h_fence.reason); 925 + mutex_unlock(&ct->lock); 1030 926 goto retry; 1031 927 } 1032 928 if (g2h_fence.fail) { ··· 1045 921 ret = -EIO; 1046 922 } 1047 923 1048 - return ret > 0 ? response_buffer ? g2h_fence.response_len : g2h_fence.response_data : ret; 924 + if (ret > 0) 925 + ret = response_buffer ? 
g2h_fence.response_len : g2h_fence.response_data; 926 + 927 + mutex_unlock(&ct->lock); 928 + 929 + return ret; 1049 930 } 1050 931 1051 932 /** ··· 1140 1011 else 1141 1012 xe_gt_err(gt, "unexpected response %u for FAST_REQ H2G fence 0x%x!\n", 1142 1013 type, fence); 1014 + CT_DEAD(ct, NULL, PARSE_G2H_RESPONSE); 1143 1015 1144 1016 return -EPROTO; 1145 1017 } ··· 1148 1018 g2h_fence = xa_erase(&ct->fence_lookup, fence); 1149 1019 if (unlikely(!g2h_fence)) { 1150 1020 /* Don't tear down channel, as send could've timed out */ 1021 + /* CT_DEAD(ct, NULL, PARSE_G2H_UNKNOWN); */ 1151 1022 xe_gt_warn(gt, "G2H fence (%u) not found!\n", fence); 1152 1023 g2h_release_space(ct, GUC_CTB_HXG_MSG_MAX_LEN); 1153 1024 return 0; ··· 1193 1062 if (unlikely(origin != GUC_HXG_ORIGIN_GUC)) { 1194 1063 xe_gt_err(gt, "G2H channel broken on read, origin=%u, reset required\n", 1195 1064 origin); 1196 - ct->ctbs.g2h.info.broken = true; 1065 + CT_DEAD(ct, &ct->ctbs.g2h, PARSE_G2H_ORIGIN); 1197 1066 1198 1067 return -EPROTO; 1199 1068 } ··· 1211 1080 default: 1212 1081 xe_gt_err(gt, "G2H channel broken on read, type=%u, reset required\n", 1213 1082 type); 1214 - ct->ctbs.g2h.info.broken = true; 1083 + CT_DEAD(ct, &ct->ctbs.g2h, PARSE_G2H_TYPE); 1215 1084 1216 1085 ret = -EOPNOTSUPP; 1217 1086 } ··· 1254 1123 /* Selftest only at the moment */ 1255 1124 break; 1256 1125 case XE_GUC_ACTION_STATE_CAPTURE_NOTIFICATION: 1126 + ret = xe_guc_error_capture_handler(guc, payload, adj_len); 1127 + break; 1257 1128 case XE_GUC_ACTION_NOTIFY_FLUSH_LOG_BUFFER_TO_FILE: 1258 1129 /* FIXME: Handle this */ 1259 1130 break; ··· 1290 1157 xe_gt_err(gt, "unexpected G2H action 0x%04x\n", action); 1291 1158 } 1292 1159 1293 - if (ret) 1160 + if (ret) { 1294 1161 xe_gt_err(gt, "G2H action 0x%04x failed (%pe)\n", 1295 1162 action, ERR_PTR(ret)); 1163 + CT_DEAD(ct, NULL, PROCESS_FAILED); 1164 + } 1296 1165 1297 1166 return 0; 1298 1167 } ··· 1304 1169 struct xe_device *xe = ct_to_xe(ct); 1305 1170 struct xe_gt *gt = 
ct_to_gt(ct); 1306 1171 struct guc_ctb *g2h = &ct->ctbs.g2h; 1307 - u32 tail, head, len; 1172 + u32 tail, head, len, desc_status; 1308 1173 s32 avail; 1309 1174 u32 action; 1310 1175 u32 *hxg; ··· 1323 1188 1324 1189 xe_gt_assert(gt, xe_guc_ct_enabled(ct)); 1325 1190 1191 + desc_status = desc_read(xe, g2h, status); 1192 + if (desc_status) { 1193 + if (desc_status & GUC_CTB_STATUS_DISABLED) { 1194 + /* 1195 + * Potentially valid if a CLIENT_RESET request resulted in 1196 + * contexts/engines being reset. But should never happen as 1197 + * no contexts should be active when CLIENT_RESET is sent. 1198 + */ 1199 + xe_gt_err(gt, "CT read: unexpected G2H after GuC has stopped!\n"); 1200 + desc_status &= ~GUC_CTB_STATUS_DISABLED; 1201 + } 1202 + 1203 + if (desc_status) { 1204 + xe_gt_err(gt, "CT read: non-zero status: %u\n", desc_status); 1205 + goto corrupted; 1206 + } 1207 + } 1208 + 1209 + if (IS_ENABLED(CONFIG_DRM_XE_DEBUG)) { 1210 + u32 desc_tail = desc_read(xe, g2h, tail); 1211 + /* 1212 + u32 desc_head = desc_read(xe, g2h, head); 1213 + 1214 + * info.head and desc_head are updated back-to-back at the end of 1215 + * this function and nowhere else. Hence, they cannot be different 1216 + * unless two g2h_read calls are running concurrently. Which is not 1217 + * possible because it is guarded by ct->fast_lock. And yet, some 1218 + * discrete platforms are regularly hitting this error :(. 1219 + * 1220 + * desc_head rolling backwards shouldn't cause any noticeable 1221 + * problems - just a delay in GuC being allowed to proceed past that 1222 + * point in the queue. So for now, just disable the error until it 1223 + * can be root caused. 
1224 + * 1225 + if (g2h->info.head != desc_head) { 1226 + desc_write(xe, g2h, status, desc_status | GUC_CTB_STATUS_MISMATCH); 1227 + xe_gt_err(gt, "CT read: head was modified %u != %u\n", 1228 + desc_head, g2h->info.head); 1229 + goto corrupted; 1230 + } 1231 + */ 1232 + 1233 + if (g2h->info.head > g2h->info.size) { 1234 + desc_write(xe, g2h, status, desc_status | GUC_CTB_STATUS_OVERFLOW); 1235 + xe_gt_err(gt, "CT read: head out of range: %u vs %u\n", 1236 + g2h->info.head, g2h->info.size); 1237 + goto corrupted; 1238 + } 1239 + 1240 + if (desc_tail >= g2h->info.size) { 1241 + desc_write(xe, g2h, status, desc_status | GUC_CTB_STATUS_OVERFLOW); 1242 + xe_gt_err(gt, "CT read: invalid tail offset %u >= %u)\n", 1243 + desc_tail, g2h->info.size); 1244 + goto corrupted; 1245 + } 1246 + } 1247 + 1326 1248 /* Calculate DW available to read */ 1327 1249 tail = desc_read(xe, g2h, tail); 1328 1250 avail = tail - g2h->info.head; ··· 1396 1204 if (len > avail) { 1397 1205 xe_gt_err(gt, "G2H channel broken on read, avail=%d, len=%d, reset required\n", 1398 1206 avail, len); 1399 - g2h->info.broken = true; 1400 - 1401 - return -EPROTO; 1207 + goto corrupted; 1402 1208 } 1403 1209 1404 1210 head = (g2h->info.head + 1) % g2h->info.size; ··· 1442 1252 action, len, g2h->info.head, tail); 1443 1253 1444 1254 return len; 1255 + 1256 + corrupted: 1257 + CT_DEAD(ct, &ct->ctbs.g2h, G2H_READ); 1258 + return -EPROTO; 1445 1259 } 1446 1260 1447 1261 static void g2h_fast_path(struct xe_guc_ct *ct, u32 *msg, u32 len) ··· 1472 1278 xe_gt_warn(gt, "NOT_POSSIBLE"); 1473 1279 } 1474 1280 1475 - if (ret) 1281 + if (ret) { 1476 1282 xe_gt_err(gt, "G2H action 0x%04x failed (%pe)\n", 1477 1283 action, ERR_PTR(ret)); 1284 + CT_DEAD(ct, NULL, FAST_G2H); 1285 + } 1478 1286 } 1479 1287 1480 1288 /** ··· 1536 1340 1537 1341 static void receive_g2h(struct xe_guc_ct *ct) 1538 1342 { 1539 - struct xe_gt *gt = ct_to_gt(ct); 1540 1343 bool ongoing; 1541 1344 int ret; 1542 1345 ··· 1572 1377 
mutex_unlock(&ct->lock); 1573 1378 1574 1379 if (unlikely(ret == -EPROTO || ret == -EOPNOTSUPP)) { 1575 - struct drm_printer p = xe_gt_info_printer(gt); 1576 - 1577 - xe_guc_ct_print(ct, &p, false); 1380 + xe_gt_err(ct_to_gt(ct), "CT dequeue failed: %d", ret); 1381 + CT_DEAD(ct, NULL, G2H_RECV); 1578 1382 kick_reset(ct); 1579 1383 } 1580 1384 } while (ret == 1); ··· 1589 1395 receive_g2h(ct); 1590 1396 } 1591 1397 1592 - static void guc_ctb_snapshot_capture(struct xe_device *xe, struct guc_ctb *ctb, 1593 - struct guc_ctb_snapshot *snapshot, 1594 - bool atomic) 1398 + struct xe_guc_ct_snapshot *xe_guc_ct_snapshot_alloc(struct xe_guc_ct *ct, bool atomic) 1595 1399 { 1596 - u32 head, tail; 1400 + struct xe_guc_ct_snapshot *snapshot; 1597 1401 1402 + snapshot = kzalloc(sizeof(*snapshot), atomic ? GFP_ATOMIC : GFP_KERNEL); 1403 + if (!snapshot) 1404 + return NULL; 1405 + 1406 + if (ct->bo) { 1407 + snapshot->ctb_size = ct->bo->size; 1408 + snapshot->ctb = kmalloc(snapshot->ctb_size, atomic ? GFP_ATOMIC : GFP_KERNEL); 1409 + } 1410 + 1411 + return snapshot; 1412 + } 1413 + 1414 + static void guc_ctb_snapshot_capture(struct xe_device *xe, struct guc_ctb *ctb, 1415 + struct guc_ctb_snapshot *snapshot) 1416 + { 1598 1417 xe_map_memcpy_from(xe, &snapshot->desc, &ctb->desc, 0, 1599 1418 sizeof(struct guc_ct_buffer_desc)); 1600 1419 memcpy(&snapshot->info, &ctb->info, sizeof(struct guc_ctb_info)); 1601 - 1602 - snapshot->cmds = kmalloc_array(ctb->info.size, sizeof(u32), 1603 - atomic ? GFP_ATOMIC : GFP_KERNEL); 1604 - 1605 - if (!snapshot->cmds) { 1606 - drm_err(&xe->drm, "Skipping CTB commands snapshot. 
Only CTB info will be available.\n"); 1607 - return; 1608 - } 1609 - 1610 - head = snapshot->desc.head; 1611 - tail = snapshot->desc.tail; 1612 - 1613 - if (head != tail) { 1614 - struct iosys_map map = 1615 - IOSYS_MAP_INIT_OFFSET(&ctb->cmds, head * sizeof(u32)); 1616 - 1617 - while (head != tail) { 1618 - snapshot->cmds[head] = xe_map_rd(xe, &map, 0, u32); 1619 - ++head; 1620 - if (head == ctb->info.size) { 1621 - head = 0; 1622 - map = ctb->cmds; 1623 - } else { 1624 - iosys_map_incr(&map, sizeof(u32)); 1625 - } 1626 - } 1627 - } 1628 1420 } 1629 1421 1630 1422 static void guc_ctb_snapshot_print(struct guc_ctb_snapshot *snapshot, 1631 1423 struct drm_printer *p) 1632 1424 { 1633 - u32 head, tail; 1634 - 1635 1425 drm_printf(p, "\tsize: %d\n", snapshot->info.size); 1636 1426 drm_printf(p, "\tresv_space: %d\n", snapshot->info.resv_space); 1637 1427 drm_printf(p, "\thead: %d\n", snapshot->info.head); ··· 1625 1447 drm_printf(p, "\thead (memory): %d\n", snapshot->desc.head); 1626 1448 drm_printf(p, "\ttail (memory): %d\n", snapshot->desc.tail); 1627 1449 drm_printf(p, "\tstatus (memory): 0x%x\n", snapshot->desc.status); 1628 - 1629 - if (!snapshot->cmds) 1630 - return; 1631 - 1632 - head = snapshot->desc.head; 1633 - tail = snapshot->desc.tail; 1634 - 1635 - while (head != tail) { 1636 - drm_printf(p, "\tcmd[%d]: 0x%08x\n", head, 1637 - snapshot->cmds[head]); 1638 - ++head; 1639 - if (head == snapshot->info.size) 1640 - head = 0; 1641 - } 1642 - } 1643 - 1644 - static void guc_ctb_snapshot_free(struct guc_ctb_snapshot *snapshot) 1645 - { 1646 - kfree(snapshot->cmds); 1647 1450 } 1648 1451 1649 1452 /** ··· 1645 1486 struct xe_device *xe = ct_to_xe(ct); 1646 1487 struct xe_guc_ct_snapshot *snapshot; 1647 1488 1648 - snapshot = kzalloc(sizeof(*snapshot), 1649 - atomic ? 
GFP_ATOMIC : GFP_KERNEL); 1650 - 1489 + snapshot = xe_guc_ct_snapshot_alloc(ct, atomic); 1651 1490 if (!snapshot) { 1652 - drm_err(&xe->drm, "Skipping CTB snapshot entirely.\n"); 1491 + xe_gt_err(ct_to_gt(ct), "Skipping CTB snapshot entirely.\n"); 1653 1492 return NULL; 1654 1493 } 1655 1494 1656 1495 if (xe_guc_ct_enabled(ct) || ct->state == XE_GUC_CT_STATE_STOPPED) { 1657 1496 snapshot->ct_enabled = true; 1658 1497 snapshot->g2h_outstanding = READ_ONCE(ct->g2h_outstanding); 1659 - guc_ctb_snapshot_capture(xe, &ct->ctbs.h2g, 1660 - &snapshot->h2g, atomic); 1661 - guc_ctb_snapshot_capture(xe, &ct->ctbs.g2h, 1662 - &snapshot->g2h, atomic); 1498 + guc_ctb_snapshot_capture(xe, &ct->ctbs.h2g, &snapshot->h2g); 1499 + guc_ctb_snapshot_capture(xe, &ct->ctbs.g2h, &snapshot->g2h); 1663 1500 } 1501 + 1502 + if (ct->bo && snapshot->ctb) 1503 + xe_map_memcpy_from(xe, snapshot->ctb, &ct->bo->vmap, 0, snapshot->ctb_size); 1664 1504 1665 1505 return snapshot; 1666 1506 } ··· 1681 1523 drm_puts(p, "H2G CTB (all sizes in DW):\n"); 1682 1524 guc_ctb_snapshot_print(&snapshot->h2g, p); 1683 1525 1684 - drm_puts(p, "\nG2H CTB (all sizes in DW):\n"); 1526 + drm_puts(p, "G2H CTB (all sizes in DW):\n"); 1685 1527 guc_ctb_snapshot_print(&snapshot->g2h, p); 1686 - 1687 1528 drm_printf(p, "\tg2h outstanding: %d\n", 1688 1529 snapshot->g2h_outstanding); 1530 + 1531 + if (snapshot->ctb) { 1532 + xe_print_blob_ascii85(p, "CTB data", snapshot->ctb, 0, snapshot->ctb_size); 1533 + } else { 1534 + drm_printf(p, "CTB snapshot missing!\n"); 1535 + return; 1536 + } 1689 1537 } else { 1690 1538 drm_puts(p, "CT disabled\n"); 1691 1539 } ··· 1709 1545 if (!snapshot) 1710 1546 return; 1711 1547 1712 - guc_ctb_snapshot_free(&snapshot->h2g); 1713 - guc_ctb_snapshot_free(&snapshot->g2h); 1548 + kfree(snapshot->ctb); 1714 1549 kfree(snapshot); 1715 1550 } 1716 1551 ··· 1717 1554 * xe_guc_ct_print - GuC CT Print. 1718 1555 * @ct: GuC CT. 1719 1556 * @p: drm_printer where it will be printed out. 
1720 - * @atomic: Boolean to indicate if this is called from atomic context like 1721 - * reset or CTB handler or from some regular path like debugfs. 1722 1557 * 1723 1558 * This function quickly capture a snapshot and immediately print it out. 1724 1559 */ 1725 - void xe_guc_ct_print(struct xe_guc_ct *ct, struct drm_printer *p, bool atomic) 1560 + void xe_guc_ct_print(struct xe_guc_ct *ct, struct drm_printer *p) 1726 1561 { 1727 1562 struct xe_guc_ct_snapshot *snapshot; 1728 1563 1729 - snapshot = xe_guc_ct_snapshot_capture(ct, atomic); 1564 + snapshot = xe_guc_ct_snapshot_capture(ct, false); 1730 1565 xe_guc_ct_snapshot_print(snapshot, p); 1731 1566 xe_guc_ct_snapshot_free(snapshot); 1732 1567 } 1568 + 1569 + #if IS_ENABLED(CONFIG_DRM_XE_DEBUG) 1570 + static void ct_dead_capture(struct xe_guc_ct *ct, struct guc_ctb *ctb, u32 reason_code) 1571 + { 1572 + struct xe_guc_log_snapshot *snapshot_log; 1573 + struct xe_guc_ct_snapshot *snapshot_ct; 1574 + struct xe_guc *guc = ct_to_guc(ct); 1575 + unsigned long flags; 1576 + bool have_capture; 1577 + 1578 + if (ctb) 1579 + ctb->info.broken = true; 1580 + 1581 + /* Ignore further errors after the first dump until a reset */ 1582 + if (ct->dead.reported) 1583 + return; 1584 + 1585 + spin_lock_irqsave(&ct->dead.lock, flags); 1586 + 1587 + /* And only capture one dump at a time */ 1588 + have_capture = ct->dead.reason & (1 << CT_DEAD_STATE_CAPTURE); 1589 + ct->dead.reason |= (1 << reason_code) | 1590 + (1 << CT_DEAD_STATE_CAPTURE); 1591 + 1592 + spin_unlock_irqrestore(&ct->dead.lock, flags); 1593 + 1594 + if (have_capture) 1595 + return; 1596 + 1597 + snapshot_log = xe_guc_log_snapshot_capture(&guc->log, true); 1598 + snapshot_ct = xe_guc_ct_snapshot_capture((ct), true); 1599 + 1600 + spin_lock_irqsave(&ct->dead.lock, flags); 1601 + 1602 + if (ct->dead.snapshot_log || ct->dead.snapshot_ct) { 1603 + xe_gt_err(ct_to_gt(ct), "Got unexpected dead CT capture!\n"); 1604 + xe_guc_log_snapshot_free(snapshot_log); 1605 + 
xe_guc_ct_snapshot_free(snapshot_ct); 1606 + } else { 1607 + ct->dead.snapshot_log = snapshot_log; 1608 + ct->dead.snapshot_ct = snapshot_ct; 1609 + } 1610 + 1611 + spin_unlock_irqrestore(&ct->dead.lock, flags); 1612 + 1613 + queue_work(system_unbound_wq, &(ct)->dead.worker); 1614 + } 1615 + 1616 + static void ct_dead_print(struct xe_dead_ct *dead) 1617 + { 1618 + struct xe_guc_ct *ct = container_of(dead, struct xe_guc_ct, dead); 1619 + struct xe_device *xe = ct_to_xe(ct); 1620 + struct xe_gt *gt = ct_to_gt(ct); 1621 + static int g_count; 1622 + struct drm_printer ip = xe_gt_info_printer(gt); 1623 + struct drm_printer lp = drm_line_printer(&ip, "Capture", ++g_count); 1624 + 1625 + if (!dead->reason) { 1626 + xe_gt_err(gt, "CTB is dead for no reason!?\n"); 1627 + return; 1628 + } 1629 + 1630 + drm_printf(&lp, "CTB is dead - reason=0x%X\n", dead->reason); 1631 + 1632 + /* Can't generate a genuine core dump at this point, so just do the good bits */ 1633 + drm_puts(&lp, "**** Xe Device Coredump ****\n"); 1634 + xe_device_snapshot_print(xe, &lp); 1635 + 1636 + drm_printf(&lp, "**** GT #%d ****\n", gt->info.id); 1637 + drm_printf(&lp, "\tTile: %d\n", gt->tile->id); 1638 + 1639 + drm_puts(&lp, "**** GuC Log ****\n"); 1640 + xe_guc_log_snapshot_print(dead->snapshot_log, &lp); 1641 + 1642 + drm_puts(&lp, "**** GuC CT ****\n"); 1643 + xe_guc_ct_snapshot_print(dead->snapshot_ct, &lp); 1644 + 1645 + drm_puts(&lp, "Done.\n"); 1646 + } 1647 + 1648 + static void ct_dead_worker_func(struct work_struct *w) 1649 + { 1650 + struct xe_guc_ct *ct = container_of(w, struct xe_guc_ct, dead.worker); 1651 + 1652 + if (!ct->dead.reported) { 1653 + ct->dead.reported = true; 1654 + ct_dead_print(&ct->dead); 1655 + } 1656 + 1657 + spin_lock_irq(&ct->dead.lock); 1658 + 1659 + xe_guc_log_snapshot_free(ct->dead.snapshot_log); 1660 + ct->dead.snapshot_log = NULL; 1661 + xe_guc_ct_snapshot_free(ct->dead.snapshot_ct); 1662 + ct->dead.snapshot_ct = NULL; 1663 + 1664 + if (ct->dead.reason & (1 << 
CT_DEAD_STATE_REARM)) { 1665 + /* A reset has occurred so re-arm the error reporting */ 1666 + ct->dead.reason = 0; 1667 + ct->dead.reported = false; 1668 + } 1669 + 1670 + spin_unlock_irq(&ct->dead.lock); 1671 + } 1672 + #endif
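The CT_DEAD plumbing added above is a capture-once latch: every caller ORs in its own reason bit together with a shared CAPTURE bit under the lock, and only the caller that first sets the CAPTURE bit goes on to take the expensive log/CT snapshots; further errors are ignored until a reset re-arms reporting. A minimal userspace sketch of that latch (enum values and names here are illustrative, not the driver's, and the kernel's spinlock is omitted):

```c
#include <stdbool.h>

/* Illustrative reason codes; the driver's CT_DEAD_* enum differs. */
enum ct_dead_reason {
	DEAD_STATE_CAPTURE = 0,	/* shared "a capture is in flight" bit */
	DEAD_G2H_READ = 1,
	DEAD_FAST_G2H = 2,
};

struct dead_state {
	unsigned int reason;	/* bit mask of reason codes */
	bool reported;		/* suppress further dumps until a reset */
};

/*
 * Record a failure reason. Returns true only for the first caller since
 * the last reset, which is then responsible for capturing the dump.
 */
bool dead_record(struct dead_state *d, enum ct_dead_reason code)
{
	bool have_capture;

	if (d->reported)
		return false;

	have_capture = d->reason & (1u << DEAD_STATE_CAPTURE);
	d->reason |= (1u << code) | (1u << DEAD_STATE_CAPTURE);

	return !have_capture;
}
```

The reason mask is still accumulated for later printing even when the capture itself is skipped, which is why ct_dead_print reports a combined `reason=0x%X`.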
+5 -5
drivers/gpu/drm/xe/xe_guc_ct.h
··· 9 9 #include "xe_guc_ct_types.h" 10 10 11 11 struct drm_printer; 12 + struct xe_device; 12 13 13 14 int xe_guc_ct_init(struct xe_guc_ct *ct); 14 15 int xe_guc_ct_enable(struct xe_guc_ct *ct); ··· 17 16 void xe_guc_ct_stop(struct xe_guc_ct *ct); 18 17 void xe_guc_ct_fast_path(struct xe_guc_ct *ct); 19 18 20 - struct xe_guc_ct_snapshot * 21 - xe_guc_ct_snapshot_capture(struct xe_guc_ct *ct, bool atomic); 22 - void xe_guc_ct_snapshot_print(struct xe_guc_ct_snapshot *snapshot, 23 - struct drm_printer *p); 19 + struct xe_guc_ct_snapshot *xe_guc_ct_snapshot_alloc(struct xe_guc_ct *ct, bool atomic); 20 + struct xe_guc_ct_snapshot *xe_guc_ct_snapshot_capture(struct xe_guc_ct *ct, bool atomic); 21 + void xe_guc_ct_snapshot_print(struct xe_guc_ct_snapshot *snapshot, struct drm_printer *p); 24 22 void xe_guc_ct_snapshot_free(struct xe_guc_ct_snapshot *snapshot); 25 - void xe_guc_ct_print(struct xe_guc_ct *ct, struct drm_printer *p, bool atomic); 23 + void xe_guc_ct_print(struct xe_guc_ct *ct, struct drm_printer *p); 26 24 27 25 static inline bool xe_guc_ct_enabled(struct xe_guc_ct *ct) 28 26 {
+27 -2
drivers/gpu/drm/xe/xe_guc_ct_types.h
··· 52 52 struct guc_ctb_snapshot { 53 53 /** @desc: snapshot of the CTB descriptor */ 54 54 struct guc_ct_buffer_desc desc; 55 - /** @cmds: snapshot of the CTB commands */ 56 - u32 *cmds; 57 55 /** @info: snapshot of the CTB info */ 58 56 struct guc_ctb_info info; 59 57 }; ··· 68 70 struct guc_ctb_snapshot g2h; 69 71 /** @h2g: H2G CTB snapshot */ 70 72 struct guc_ctb_snapshot h2g; 73 + /** @ctb_size: size of the snapshot of the CTB */ 74 + size_t ctb_size; 75 + /** @ctb: snapshot of the entire CTB */ 76 + u32 *ctb; 71 77 }; 72 78 73 79 /** ··· 87 85 XE_GUC_CT_STATE_STOPPED, 88 86 XE_GUC_CT_STATE_ENABLED, 89 87 }; 88 + 89 + #if IS_ENABLED(CONFIG_DRM_XE_DEBUG) 90 + /** struct xe_dead_ct - Information for debugging a dead CT */ 91 + struct xe_dead_ct { 92 + /** @lock: protects memory allocation/free operations, and @reason updates */ 93 + spinlock_t lock; 94 + /** @reason: bit mask of CT_DEAD_* reason codes */ 95 + unsigned int reason; 96 + /** @reported: for preventing multiple dumps per error sequence */ 97 + bool reported; 98 + /** @worker: worker thread to get out of interrupt context before dumping */ 99 + struct work_struct worker; 100 + /** snapshot_ct: copy of CT state and CTB content at point of error */ 101 + struct xe_guc_ct_snapshot *snapshot_ct; 102 + /** snapshot_log: copy of GuC log at point of error */ 103 + struct xe_guc_log_snapshot *snapshot_log; 104 + }; 105 + #endif 90 106 91 107 /** 92 108 * struct xe_guc_ct - GuC command transport (CT) layer ··· 148 128 u32 msg[GUC_CTB_MSG_MAX_LEN]; 149 129 /** @fast_msg: Message buffer */ 150 130 u32 fast_msg[GUC_CTB_MSG_MAX_LEN]; 131 + 132 + #if IS_ENABLED(CONFIG_DRM_XE_DEBUG) 133 + /** @dead: information for debugging dead CTs */ 134 + struct xe_dead_ct dead; 135 + #endif 151 136 }; 152 137 153 138 #endif
+2 -24
drivers/gpu/drm/xe/xe_guc_fwif.h
··· 8 8 9 9 #include <linux/bits.h> 10 10 11 + #include "abi/guc_capture_abi.h" 11 12 #include "abi/guc_klvs_abi.h" 13 + #include "xe_hw_engine_types.h" 12 14 13 15 #define G2H_LEN_DW_SCHED_CONTEXT_MODE_SET 4 14 16 #define G2H_LEN_DW_DEREGISTER_CONTEXT 3 ··· 159 157 u32 reserved[4]; 160 158 } __packed; 161 159 162 - /* GuC MMIO reg state struct */ 163 - struct guc_mmio_reg { 164 - u32 offset; 165 - u32 value; 166 - u32 flags; 167 - u32 mask; 168 - #define GUC_REGSET_MASKED BIT(0) 169 - #define GUC_REGSET_MASKED_WITH_VALUE BIT(2) 170 - #define GUC_REGSET_RESTORE_ONLY BIT(3) 171 - } __packed; 172 - 173 - /* GuC register sets */ 174 - struct guc_mmio_reg_set { 175 - u32 address; 176 - u16 count; 177 - u16 reserved; 178 - } __packed; 179 - 180 160 /* Generic GT SysInfo data types */ 181 161 #define GUC_GENERIC_GT_SYSINFO_SLICE_ENABLED 0 182 162 #define GUC_GENERIC_GT_SYSINFO_VDBOX_SFC_SUPPORT_MASK 1 ··· 171 187 u32 engine_enabled_masks[GUC_MAX_ENGINE_CLASSES]; 172 188 u32 generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_MAX]; 173 189 } __packed; 174 - 175 - enum { 176 - GUC_CAPTURE_LIST_INDEX_PF = 0, 177 - GUC_CAPTURE_LIST_INDEX_VF = 1, 178 - GUC_CAPTURE_LIST_INDEX_MAX = 2, 179 - }; 180 190 181 191 /* GuC Additional Data Struct */ 182 192 struct guc_ads {
+7
drivers/gpu/drm/xe/xe_guc_klv_thresholds_set.h
··· 18 18 MAKE_GUC_KLV_KEY(CONCATENATE(VF_CFG_THRESHOLD_, TAG)) 19 19 20 20 /** 21 + * MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN - Prepare the name of the KLV length constant. 22 + * @TAG: unique tag of the GuC threshold KLV key. 23 + */ 24 + #define MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG) \ 25 + MAKE_GUC_KLV_LEN(CONCATENATE(VF_CFG_THRESHOLD_, TAG)) 26 + 27 + /** 21 28 * xe_guc_klv_threshold_key_to_index - Find index of the tracked GuC threshold. 22 29 * @key: GuC threshold KLV key. 23 30 *
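The new MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN macro leans on the kernel's two-level CONCATENATE, which macro-expands its arguments before `##` pasting so that nested CONCATENATE calls resolve. A hedged standalone sketch of the pattern (the CAT_ERR tag and the key/length values are made up for illustration; the real constants live in abi/guc_klvs_abi.h):

```c
/* Two-level paste: arguments are expanded before ## is applied,
 * mirroring the kernel's CONCATENATE helper. */
#define PASTE_(a, b) a##b
#define CONCATENATE(a, b) PASTE_(a, b)

/* Hypothetical KLV constants, for illustration only. */
#define GUC_KLV_VF_CFG_THRESHOLD_CAT_ERR_KEY 0x8001u
#define GUC_KLV_VF_CFG_THRESHOLD_CAT_ERR_LEN 1u

#define MAKE_GUC_KLV_KEY(reg) CONCATENATE(GUC_KLV_, CONCATENATE(reg, _KEY))
#define MAKE_GUC_KLV_LEN(reg) CONCATENATE(GUC_KLV_, CONCATENATE(reg, _LEN))

/* The two threshold macros from the patch, reproduced over the sketch. */
#define MAKE_GUC_KLV_VF_CFG_THRESHOLD_KEY(TAG) \
	MAKE_GUC_KLV_KEY(CONCATENATE(VF_CFG_THRESHOLD_, TAG))
#define MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG) \
	MAKE_GUC_KLV_LEN(CONCATENATE(VF_CFG_THRESHOLD_, TAG))
```

With a single-level `a##b` macro the nested CONCATENATE would be pasted literally and fail to compile; the indirection through PASTE_ is what lets `MAKE_GUC_KLV_VF_CFG_THRESHOLD_KEY(CAT_ERR)` resolve to the fully spelled-out constant name.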
+298 -22
drivers/gpu/drm/xe/xe_guc_log.c
··· 5 5 6 6 #include "xe_guc_log.h" 7 7 8 + #include <linux/fault-inject.h> 9 + 8 10 #include <drm/drm_managed.h> 9 11 12 + #include "regs/xe_guc_regs.h" 10 13 #include "xe_bo.h" 14 + #include "xe_devcoredump.h" 15 + #include "xe_force_wake.h" 11 16 #include "xe_gt.h" 17 + #include "xe_gt_printk.h" 12 18 #include "xe_map.h" 19 + #include "xe_mmio.h" 13 20 #include "xe_module.h" 21 + 22 + static struct xe_guc * 23 + log_to_guc(struct xe_guc_log *log) 24 + { 25 + return container_of(log, struct xe_guc, log); 26 + } 14 27 15 28 static struct xe_gt * 16 29 log_to_gt(struct xe_guc_log *log) ··· 62 49 CAPTURE_BUFFER_SIZE; 63 50 } 64 51 52 + #define GUC_LOG_CHUNK_SIZE SZ_2M 53 + 54 + static struct xe_guc_log_snapshot *xe_guc_log_snapshot_alloc(struct xe_guc_log *log, bool atomic) 55 + { 56 + struct xe_guc_log_snapshot *snapshot; 57 + size_t remain; 58 + int i; 59 + 60 + snapshot = kzalloc(sizeof(*snapshot), atomic ? GFP_ATOMIC : GFP_KERNEL); 61 + if (!snapshot) 62 + return NULL; 63 + 64 + /* 65 + * NB: kmalloc has a hard limit well below the maximum GuC log buffer size. 66 + * Also, can't use vmalloc as might be called from atomic context. So need 67 + * to break the buffer up into smaller chunks that can be allocated. 68 + */ 69 + snapshot->size = log->bo->size; 70 + snapshot->num_chunks = DIV_ROUND_UP(snapshot->size, GUC_LOG_CHUNK_SIZE); 71 + 72 + snapshot->copy = kcalloc(snapshot->num_chunks, sizeof(*snapshot->copy), 73 + atomic ? GFP_ATOMIC : GFP_KERNEL); 74 + if (!snapshot->copy) 75 + goto fail_snap; 76 + 77 + remain = snapshot->size; 78 + for (i = 0; i < snapshot->num_chunks; i++) { 79 + size_t size = min(GUC_LOG_CHUNK_SIZE, remain); 80 + 81 + snapshot->copy[i] = kmalloc(size, atomic ? 
GFP_ATOMIC : GFP_KERNEL); 82 + if (!snapshot->copy[i]) 83 + goto fail_copy; 84 + remain -= size; 85 + } 86 + 87 + return snapshot; 88 + 89 + fail_copy: 90 + for (i = 0; i < snapshot->num_chunks; i++) 91 + kfree(snapshot->copy[i]); 92 + kfree(snapshot->copy); 93 + fail_snap: 94 + kfree(snapshot); 95 + return NULL; 96 + } 97 + 98 + /** 99 + * xe_guc_log_snapshot_free - free a previously captured GuC log snapshot 100 + * @snapshot: GuC log snapshot structure 101 + * 102 + * Free the memory allocated by a previous snapshot capture, including the 103 + * chunked copy of the log buffer. A NULL @snapshot is silently ignored. 104 + */ 105 + void xe_guc_log_snapshot_free(struct xe_guc_log_snapshot *snapshot) 106 + { 107 + int i; 108 + 109 + if (!snapshot) 110 + return; 111 + 112 + if (snapshot->copy) { 113 + for (i = 0; i < snapshot->num_chunks; i++) 114 + kfree(snapshot->copy[i]); 115 + kfree(snapshot->copy); 116 + } 117 + 118 + kfree(snapshot); 119 + } 120 + 121 + /** 122 + * xe_guc_log_snapshot_capture - create a new snapshot copy of the GuC log for later dumping 123 + * @log: GuC log structure 124 + * @atomic: is the call inside an atomic section of some kind? 125 + * 126 + * Return: pointer to a newly allocated snapshot object or null if out of memory. Caller is 127 + * responsible for calling xe_guc_log_snapshot_free when done with the snapshot.
128 + */ 129 + struct xe_guc_log_snapshot *xe_guc_log_snapshot_capture(struct xe_guc_log *log, bool atomic) 130 + { 131 + struct xe_guc_log_snapshot *snapshot; 132 + struct xe_device *xe = log_to_xe(log); 133 + struct xe_guc *guc = log_to_guc(log); 134 + struct xe_gt *gt = log_to_gt(log); 135 + size_t remain; 136 + int i, err; 137 + 138 + if (!log->bo) { 139 + xe_gt_err(gt, "GuC log buffer not allocated\n"); 140 + return NULL; 141 + } 142 + 143 + snapshot = xe_guc_log_snapshot_alloc(log, atomic); 144 + if (!snapshot) { 145 + xe_gt_err(gt, "GuC log snapshot not allocated\n"); 146 + return NULL; 147 + } 148 + 149 + remain = snapshot->size; 150 + for (i = 0; i < snapshot->num_chunks; i++) { 151 + size_t size = min(GUC_LOG_CHUNK_SIZE, remain); 152 + 153 + xe_map_memcpy_from(xe, snapshot->copy[i], &log->bo->vmap, 154 + i * GUC_LOG_CHUNK_SIZE, size); 155 + remain -= size; 156 + } 157 + 158 + err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT); 159 + if (err) { 160 + snapshot->stamp = ~0; 161 + } else { 162 + snapshot->stamp = xe_mmio_read32(&gt->mmio, GUC_PMTIMESTAMP); 163 + xe_force_wake_put(gt_to_fw(gt), XE_FW_GT); 164 + } 165 + snapshot->ktime = ktime_get_boottime_ns(); 166 + snapshot->level = log->level; 167 + snapshot->ver_found = guc->fw.versions.found[XE_UC_FW_VER_RELEASE]; 168 + snapshot->ver_want = guc->fw.versions.wanted; 169 + snapshot->path = guc->fw.path; 170 + 171 + return snapshot; 172 + } 173 + 174 + /** 175 + * xe_guc_log_snapshot_print - dump a previously saved copy of the GuC log to some useful location 176 + * @snapshot: a snapshot of the GuC log 177 + * @p: the printer object to output to 178 + */ 179 + void xe_guc_log_snapshot_print(struct xe_guc_log_snapshot *snapshot, struct drm_printer *p) 180 + { 181 + size_t remain; 182 + int i; 183 + 184 + if (!snapshot) { 185 + drm_printf(p, "GuC log snapshot not allocated!\n"); 186 + return; 187 + } 188 + 189 + drm_printf(p, "GuC firmware: %s\n", snapshot->path); 190 + drm_printf(p, "GuC version: %u.%u.%u 
(wanted %u.%u.%u)\n", 191 + snapshot->ver_found.major, snapshot->ver_found.minor, snapshot->ver_found.patch, 192 + snapshot->ver_want.major, snapshot->ver_want.minor, snapshot->ver_want.patch); 193 + drm_printf(p, "Kernel timestamp: 0x%08llX [%llu]\n", snapshot->ktime, snapshot->ktime); 194 + drm_printf(p, "GuC timestamp: 0x%08X [%u]\n", snapshot->stamp, snapshot->stamp); 195 + drm_printf(p, "Log level: %u\n", snapshot->level); 196 + 197 + remain = snapshot->size; 198 + for (i = 0; i < snapshot->num_chunks; i++) { 199 + size_t size = min(GUC_LOG_CHUNK_SIZE, remain); 200 + 201 + xe_print_blob_ascii85(p, i ? NULL : "Log data", snapshot->copy[i], 0, size); 202 + remain -= size; 203 + } 204 + } 205 + 206 + /** 207 + * xe_guc_log_print_dmesg - dump a copy of the GuC log to dmesg 208 + * @log: GuC log structure 209 + */ 210 + void xe_guc_log_print_dmesg(struct xe_guc_log *log) 211 + { 212 + struct xe_gt *gt = log_to_gt(log); 213 + static int g_count; 214 + struct drm_printer ip = xe_gt_info_printer(gt); 215 + struct drm_printer lp = drm_line_printer(&ip, "Capture", ++g_count); 216 + 217 + drm_printf(&lp, "Dumping GuC log for %ps...\n", __builtin_return_address(0)); 218 + 219 + xe_guc_log_print(log, &lp); 220 + 221 + drm_printf(&lp, "Done.\n"); 222 + } 223 + 224 + /** 225 + * xe_guc_log_print - dump a copy of the GuC log to some useful location 226 + * @log: GuC log structure 227 + * @p: the printer object to output to 228 + */ 65 229 void xe_guc_log_print(struct xe_guc_log *log, struct drm_printer *p) 66 230 { 67 - struct xe_device *xe = log_to_xe(log); 68 - size_t size; 69 - int i, j; 231 + struct xe_guc_log_snapshot *snapshot; 70 232 71 - xe_assert(xe, log->bo); 233 + drm_printf(p, "**** GuC Log ****\n"); 72 234 73 - size = log->bo->size; 74 - 75 - #define DW_PER_READ 128 76 - xe_assert(xe, !(size % (DW_PER_READ * sizeof(u32)))); 77 - for (i = 0; i < size / sizeof(u32); i += DW_PER_READ) { 78 - u32 read[DW_PER_READ]; 79 - 80 - xe_map_memcpy_from(xe, read, 
&log->bo->vmap, i * sizeof(u32), 81 - DW_PER_READ * sizeof(u32)); 82 - #define DW_PER_PRINT 4 83 - for (j = 0; j < DW_PER_READ / DW_PER_PRINT; ++j) { 84 - u32 *print = read + j * DW_PER_PRINT; 85 - 86 - drm_printf(p, "0x%08x 0x%08x 0x%08x 0x%08x\n", 87 - *(print + 0), *(print + 1), 88 - *(print + 2), *(print + 3)); 89 - } 90 - } 235 + snapshot = xe_guc_log_snapshot_capture(log, false); 236 + drm_printf(p, "CS reference clock: %u\n", log_to_gt(log)->info.reference_clock); 237 + xe_guc_log_snapshot_print(snapshot, p); 238 + xe_guc_log_snapshot_free(snapshot); 91 239 } 92 240 93 241 int xe_guc_log_init(struct xe_guc_log *log) ··· 269 95 log->level = xe_modparam.guc_log_level; 270 96 271 97 return 0; 98 + } 99 + 100 + ALLOW_ERROR_INJECTION(xe_guc_log_init, ERRNO); /* See xe_pci_probe() */ 101 + 102 + static u32 xe_guc_log_section_size_crash(struct xe_guc_log *log) 103 + { 104 + return CRASH_BUFFER_SIZE; 105 + } 106 + 107 + static u32 xe_guc_log_section_size_debug(struct xe_guc_log *log) 108 + { 109 + return DEBUG_BUFFER_SIZE; 110 + } 111 + 112 + /** 113 + * xe_guc_log_section_size_capture - Get capture buffer size within log sections. 114 + * @log: The log object. 115 + * 116 + * This function will return the capture buffer size within log sections. 117 + * 118 + * Return: capture buffer size. 119 + */ 120 + u32 xe_guc_log_section_size_capture(struct xe_guc_log *log) 121 + { 122 + return CAPTURE_BUFFER_SIZE; 123 + } 124 + 125 + /** 126 + * xe_guc_get_log_buffer_size - Get log buffer size for a type. 127 + * @log: The log object. 128 + * @type: The log buffer type 129 + * 130 + * Return: buffer size. 
131 + */ 132 + u32 xe_guc_get_log_buffer_size(struct xe_guc_log *log, enum guc_log_buffer_type type) 133 + { 134 + switch (type) { 135 + case GUC_LOG_BUFFER_CRASH_DUMP: 136 + return xe_guc_log_section_size_crash(log); 137 + case GUC_LOG_BUFFER_DEBUG: 138 + return xe_guc_log_section_size_debug(log); 139 + case GUC_LOG_BUFFER_CAPTURE: 140 + return xe_guc_log_section_size_capture(log); 141 + } 142 + return 0; 143 + } 144 + 145 + /** 146 + * xe_guc_get_log_buffer_offset - Get offset in log buffer for a type. 147 + * @log: The log object. 148 + * @type: The log buffer type 149 + * 150 + * This function will return the offset in the log buffer for a type. 151 + * Return: buffer offset. 152 + */ 153 + u32 xe_guc_get_log_buffer_offset(struct xe_guc_log *log, enum guc_log_buffer_type type) 154 + { 155 + enum guc_log_buffer_type i; 156 + u32 offset = PAGE_SIZE; /* for the log_buffer_states */ 157 + 158 + for (i = GUC_LOG_BUFFER_CRASH_DUMP; i < GUC_LOG_BUFFER_TYPE_MAX; ++i) { 159 + if (i == type) 160 + break; 161 + offset += xe_guc_get_log_buffer_size(log, i); 162 + } 163 + 164 + return offset; 165 + } 166 + 167 + /** 168 + * xe_guc_check_log_buf_overflow - Check if log buffer overflowed 169 + * @log: The log object. 170 + * @type: The log buffer type 171 + * @full_cnt: The current buffer-full count 172 + * 173 + * This function checks the current buffer-full count against the previous 174 + * sample; a mismatch indicates that an overflow occurred. 175 + * It also updates the sampled_overflow counter; since buffer_full_cnt is a 176 + * 4-bit counter, 16 is added back to correct the value when it wraps. 177 + * 178 + * Return: True if overflowed.
179 + */ 180 + bool xe_guc_check_log_buf_overflow(struct xe_guc_log *log, enum guc_log_buffer_type type, 181 + unsigned int full_cnt) 182 + { 183 + unsigned int prev_full_cnt = log->stats[type].sampled_overflow; 184 + bool overflow = false; 185 + 186 + if (full_cnt != prev_full_cnt) { 187 + overflow = true; 188 + 189 + log->stats[type].overflow = full_cnt; 190 + log->stats[type].sampled_overflow += full_cnt - prev_full_cnt; 191 + 192 + if (full_cnt < prev_full_cnt) { 193 + /* buffer_full_cnt is a 4 bit counter */ 194 + log->stats[type].sampled_overflow += 16; 195 + } 196 + xe_gt_notice(log_to_gt(log), "log buffer overflow\n"); 197 + } 198 + 199 + return overflow; 272 200 }
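xe_guc_check_log_buf_overflow above compensates for buffer_full_cnt being only a 4-bit counter in the GuC shared state: when the newly read count is below the previous sample the counter must have wrapped, so 16 is added back. The wrap arithmetic in isolation, as a sketch with the previous raw count tracked explicitly (state layout is simplified versus the driver's per-type stats array):

```c
#include <stdbool.h>

/*
 * Accumulate a running overflow total from a 4-bit hardware full-count.
 * Returns true if new overflows occurred since the last sample. Unsigned
 * wraparound makes "full_cnt - *prev + 16" come out right when the 4-bit
 * counter wrapped (e.g. 14 -> 2 means 4 new overflows: 15, 0, 1, 2).
 */
bool sample_overflow(unsigned int *sampled, unsigned int *prev,
		     unsigned int full_cnt)
{
	bool overflow = false;

	if (full_cnt != *prev) {
		overflow = true;
		*sampled += full_cnt - *prev;
		if (full_cnt < *prev)
			*sampled += 16;	/* buffer_full_cnt is a 4-bit counter */
		*prev = full_cnt;
	}
	return overflow;
}
```

The unsigned subtraction plus the +16 correction keeps the running total exact as long as the hardware counter wraps at most once between two samples.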
+14 -1
drivers/gpu/drm/xe/xe_guc_log.h
··· 7 7 #define _XE_GUC_LOG_H_ 8 8 9 9 #include "xe_guc_log_types.h" 10 + #include "abi/guc_log_abi.h" 10 11 11 12 struct drm_printer; 13 + struct xe_device; 12 14 13 15 #if IS_ENABLED(CONFIG_DRM_XE_LARGE_GUC_BUFFER) 14 16 #define CRASH_BUFFER_SIZE SZ_1M ··· 19 17 #else 20 18 #define CRASH_BUFFER_SIZE SZ_8K 21 19 #define DEBUG_BUFFER_SIZE SZ_64K 22 - #define CAPTURE_BUFFER_SIZE SZ_16K 20 + #define CAPTURE_BUFFER_SIZE SZ_1M 23 21 #endif 24 22 /* 25 23 * While we're using plain log level in i915, GuC controls are much more... ··· 40 38 41 39 int xe_guc_log_init(struct xe_guc_log *log); 42 40 void xe_guc_log_print(struct xe_guc_log *log, struct drm_printer *p); 41 + void xe_guc_log_print_dmesg(struct xe_guc_log *log); 42 + struct xe_guc_log_snapshot *xe_guc_log_snapshot_capture(struct xe_guc_log *log, bool atomic); 43 + void xe_guc_log_snapshot_print(struct xe_guc_log_snapshot *snapshot, struct drm_printer *p); 44 + void xe_guc_log_snapshot_free(struct xe_guc_log_snapshot *snapshot); 43 45 44 46 static inline u32 45 47 xe_guc_log_get_level(struct xe_guc_log *log) 46 48 { 47 49 return log->level; 48 50 } 51 + 52 + u32 xe_guc_log_section_size_capture(struct xe_guc_log *log); 53 + u32 xe_guc_get_log_buffer_size(struct xe_guc_log *log, enum guc_log_buffer_type type); 54 + u32 xe_guc_get_log_buffer_offset(struct xe_guc_log *log, enum guc_log_buffer_type type); 55 + bool xe_guc_check_log_buf_overflow(struct xe_guc_log *log, 56 + enum guc_log_buffer_type type, 57 + unsigned int full_cnt); 49 58 50 59 #endif
+34
drivers/gpu/drm/xe/xe_guc_log_types.h
··· 7 7 #define _XE_GUC_LOG_TYPES_H_ 8 8 9 9 #include <linux/types.h> 10 + #include "abi/guc_log_abi.h" 11 + 12 + #include "xe_uc_fw_types.h" 10 13 11 14 struct xe_bo; 15 + 16 + /** 17 + * struct xe_guc_log_snapshot: 18 + * Capture of the GuC log plus various state useful for decoding the log 19 + */ 20 + struct xe_guc_log_snapshot { 21 + /** @size: Size in bytes of the @copy allocation */ 22 + size_t size; 23 + /** @copy: Host memory copy of the log buffer for later dumping, split into chunks */ 24 + void **copy; 25 + /** @num_chunks: Number of chunks within @copy */ 26 + int num_chunks; 27 + /** @ktime: Kernel time the snapshot was taken */ 28 + u64 ktime; 29 + /** @stamp: GuC timestamp at which the snapshot was taken */ 30 + u32 stamp; 31 + /** @level: GuC log verbosity level */ 32 + u32 level; 33 + /** @ver_found: GuC firmware version */ 34 + struct xe_uc_fw_version ver_found; 35 + /** @ver_want: GuC firmware version that driver expected */ 36 + struct xe_uc_fw_version ver_want; 37 + /** @path: Path of GuC firmware blob */ 38 + const char *path; 39 + }; 12 40 13 41 /** 14 42 * struct xe_guc_log - GuC log ··· 46 18 u32 level; 47 19 /** @bo: XE BO for GuC log */ 48 20 struct xe_bo *bo; 21 + /** @stats: logging related stats */ 22 + struct { 23 + u32 sampled_overflow; 24 + u32 overflow; 25 + u32 flush; 26 + } stats[GUC_LOG_BUFFER_TYPE_MAX]; 49 27 }; 50 28 51 29 #endif
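The @copy/@num_chunks pair in xe_guc_log_snapshot exists because kmalloc caps a single allocation well below the largest GuC log, and vmalloc cannot be used from the atomic capture path, so the snapshot is stored as an array of fixed-size chunks. The sizing arithmetic, sketched standalone (2 MiB mirrors GUC_LOG_CHUNK_SIZE in xe_guc_log.c; the helper names are made up):

```c
#include <stddef.h>

#define CHUNK_SIZE ((size_t)2 << 20)	/* 2 MiB, as GUC_LOG_CHUNK_SIZE */

/* Chunks needed to cover size bytes (the kernel's DIV_ROUND_UP). */
size_t chunk_count(size_t size)
{
	return (size + CHUNK_SIZE - 1) / CHUNK_SIZE;
}

/* Bytes held by chunk i: full chunks first, then the remainder. */
size_t chunk_bytes(size_t size, size_t i)
{
	size_t off = i * CHUNK_SIZE;

	if (off >= size)
		return 0;
	return (size - off < CHUNK_SIZE) ? size - off : CHUNK_SIZE;
}
```

Both the capture and print loops in the patch walk the chunks with the same `min(GUC_LOG_CHUNK_SIZE, remain)` pattern, so only the final chunk is short.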
+17 -17
drivers/gpu/drm/xe/xe_guc_pc.c
··· 262 262 u32 state = enable ? RPSWCTL_ENABLE : RPSWCTL_DISABLE; 263 263 264 264 /* Allow/Disallow punit to process software freq requests */ 265 - xe_mmio_write32(gt, RP_CONTROL, state); 265 + xe_mmio_write32(&gt->mmio, RP_CONTROL, state); 266 266 } 267 267 268 268 static void pc_set_cur_freq(struct xe_guc_pc *pc, u32 freq) ··· 274 274 275 275 /* Req freq is in units of 16.66 Mhz */ 276 276 rpnswreq = REG_FIELD_PREP(REQ_RATIO_MASK, encode_freq(freq)); 277 - xe_mmio_write32(gt, RPNSWREQ, rpnswreq); 277 + xe_mmio_write32(&gt->mmio, RPNSWREQ, rpnswreq); 278 278 279 279 /* Sleep for a small time to allow pcode to respond */ 280 280 usleep_range(100, 300); ··· 334 334 u32 reg; 335 335 336 336 if (xe_gt_is_media_type(gt)) 337 - reg = xe_mmio_read32(gt, MTL_MPE_FREQUENCY); 337 + reg = xe_mmio_read32(&gt->mmio, MTL_MPE_FREQUENCY); 338 338 else 339 - reg = xe_mmio_read32(gt, MTL_GT_RPE_FREQUENCY); 339 + reg = xe_mmio_read32(&gt->mmio, MTL_GT_RPE_FREQUENCY); 340 340 341 341 pc->rpe_freq = decode_freq(REG_FIELD_GET(MTL_RPE_MASK, reg)); 342 342 } ··· 353 353 * PCODE at a different register 354 354 */ 355 355 if (xe->info.platform == XE_PVC) 356 - reg = xe_mmio_read32(gt, PVC_RP_STATE_CAP); 356 + reg = xe_mmio_read32(&gt->mmio, PVC_RP_STATE_CAP); 357 357 else 358 - reg = xe_mmio_read32(gt, FREQ_INFO_REC); 358 + reg = xe_mmio_read32(&gt->mmio, FREQ_INFO_REC); 359 359 360 360 pc->rpe_freq = REG_FIELD_GET(RPE_MASK, reg) * GT_FREQUENCY_MULTIPLIER; 361 361 } ··· 392 392 393 393 /* When in RC6, actual frequency reported will be 0. 
*/ 394 394 if (GRAPHICS_VERx100(xe) >= 1270) { 395 - freq = xe_mmio_read32(gt, MTL_MIRROR_TARGET_WP1); 395 + freq = xe_mmio_read32(&gt->mmio, MTL_MIRROR_TARGET_WP1); 396 396 freq = REG_FIELD_GET(MTL_CAGF_MASK, freq); 397 397 } else { 398 - freq = xe_mmio_read32(gt, GT_PERF_STATUS); 398 + freq = xe_mmio_read32(&gt->mmio, GT_PERF_STATUS); 399 399 freq = REG_FIELD_GET(CAGF_MASK, freq); 400 400 } 401 401 ··· 425 425 if (ret) 426 426 return ret; 427 427 428 - *freq = xe_mmio_read32(gt, RPNSWREQ); 428 + *freq = xe_mmio_read32(&gt->mmio, RPNSWREQ); 429 429 430 430 *freq = REG_FIELD_GET(REQ_RATIO_MASK, *freq); 431 431 *freq = decode_freq(*freq); ··· 612 612 u32 reg, gt_c_state; 613 613 614 614 if (GRAPHICS_VERx100(gt_to_xe(gt)) >= 1270) { 615 - reg = xe_mmio_read32(gt, MTL_MIRROR_TARGET_WP1); 615 + reg = xe_mmio_read32(&gt->mmio, MTL_MIRROR_TARGET_WP1); 616 616 gt_c_state = REG_FIELD_GET(MTL_CC_MASK, reg); 617 617 } else { 618 - reg = xe_mmio_read32(gt, GT_CORE_STATUS); 618 + reg = xe_mmio_read32(&gt->mmio, GT_CORE_STATUS); 619 619 gt_c_state = REG_FIELD_GET(RCN_MASK, reg); 620 620 } 621 621 ··· 638 638 struct xe_gt *gt = pc_to_gt(pc); 639 639 u32 reg; 640 640 641 - reg = xe_mmio_read32(gt, GT_GFX_RC6); 641 + reg = xe_mmio_read32(&gt->mmio, GT_GFX_RC6); 642 642 643 643 return reg; 644 644 } ··· 652 652 struct xe_gt *gt = pc_to_gt(pc); 653 653 u64 reg; 654 654 655 - reg = xe_mmio_read32(gt, MTL_MEDIA_MC6); 655 + reg = xe_mmio_read32(&gt->mmio, MTL_MEDIA_MC6); 656 656 657 657 return reg; 658 658 } ··· 665 665 xe_device_assert_mem_access(pc_to_xe(pc)); 666 666 667 667 if (xe_gt_is_media_type(gt)) 668 - reg = xe_mmio_read32(gt, MTL_MEDIAP_STATE_CAP); 668 + reg = xe_mmio_read32(&gt->mmio, MTL_MEDIAP_STATE_CAP); 669 669 else 670 - reg = xe_mmio_read32(gt, MTL_RP_STATE_CAP); 670 + reg = xe_mmio_read32(&gt->mmio, MTL_RP_STATE_CAP); 671 671 672 672 pc->rp0_freq = decode_freq(REG_FIELD_GET(MTL_RP0_CAP_MASK, reg)); 673 673 ··· 683 683 xe_device_assert_mem_access(pc_to_xe(pc)); 684 
684 685 685 if (xe->info.platform == XE_PVC) 686 - reg = xe_mmio_read32(gt, PVC_RP_STATE_CAP); 686 + reg = xe_mmio_read32(&gt->mmio, PVC_RP_STATE_CAP); 687 687 else 688 - reg = xe_mmio_read32(gt, RP_STATE_CAP); 688 + reg = xe_mmio_read32(&gt->mmio, RP_STATE_CAP); 689 689 pc->rp0_freq = REG_FIELD_GET(RP0_MASK, reg) * GT_FREQUENCY_MULTIPLIER; 690 690 pc->rpn_freq = REG_FIELD_GET(RPN_MASK, reg) * GT_FREQUENCY_MULTIPLIER; 691 691 }
+2
drivers/gpu/drm/xe/xe_guc_relay.c
··· 5 5 6 6 #include <linux/bitfield.h> 7 7 #include <linux/delay.h> 8 + #include <linux/fault-inject.h> 8 9 9 10 #include <drm/drm_managed.h> 10 11 ··· 356 355 357 356 return drmm_add_action_or_reset(&xe->drm, __fini_relay, relay); 358 357 } 358 + ALLOW_ERROR_INJECTION(xe_guc_relay_init, ERRNO); /* See xe_pci_probe() */ 359 359 360 360 static u32 to_relay_error(int err) 361 361 {
+68 -26
drivers/gpu/drm/xe/xe_guc_submit.c
··· 27 27 #include "xe_gt_clock.h" 28 28 #include "xe_gt_printk.h" 29 29 #include "xe_guc.h" 30 + #include "xe_guc_capture.h" 30 31 #include "xe_guc_ct.h" 31 32 #include "xe_guc_exec_queue_types.h" 32 33 #include "xe_guc_id_mgr.h" ··· 394 393 static int alloc_guc_id(struct xe_guc *guc, struct xe_exec_queue *q) 395 394 { 396 395 int ret; 397 - void *ptr; 398 396 int i; 399 397 400 398 /* ··· 413 413 q->guc->id = ret; 414 414 415 415 for (i = 0; i < q->width; ++i) { 416 - ptr = xa_store(&guc->submission_state.exec_queue_lookup, 417 - q->guc->id + i, q, GFP_NOWAIT); 418 - if (IS_ERR(ptr)) { 419 - ret = PTR_ERR(ptr); 416 + ret = xa_err(xa_store(&guc->submission_state.exec_queue_lookup, 417 + q->guc->id + i, q, GFP_NOWAIT)); 418 + if (ret) 420 419 goto err_release; 421 - } 422 420 } 423 421 424 422 return 0; ··· 825 827 xe_sched_job_put(job); 826 828 } 827 829 828 - static int guc_read_stopped(struct xe_guc *guc) 830 + int xe_guc_read_stopped(struct xe_guc *guc) 829 831 { 830 832 return atomic_read(&guc->submission_state.stopped); 831 833 } ··· 847 849 set_min_preemption_timeout(guc, q); 848 850 smp_rmb(); 849 851 ret = wait_event_timeout(guc->ct.wq, !exec_queue_pending_enable(q) || 850 - guc_read_stopped(guc), HZ * 5); 852 + xe_guc_read_stopped(guc), HZ * 5); 851 853 if (!ret) { 852 854 struct xe_gpu_scheduler *sched = &q->guc->sched; 853 855 ··· 973 975 */ 974 976 ret = wait_event_timeout(guc->ct.wq, 975 977 !exec_queue_pending_disable(q) || 976 - guc_read_stopped(guc), HZ * 5); 978 + xe_guc_read_stopped(guc), HZ * 5); 977 979 if (!ret) { 978 980 drm_warn(&xe->drm, "Schedule disable failed to respond"); 979 981 xe_sched_submission_start(sched); ··· 1041 1043 1042 1044 ret = wait_event_timeout(guc->ct.wq, 1043 1045 !exec_queue_pending_enable(q) || 1044 - guc_read_stopped(guc), HZ * 5); 1045 - if (!ret || guc_read_stopped(guc)) { 1046 + xe_guc_read_stopped(guc), HZ * 5); 1047 + if (!ret || xe_guc_read_stopped(guc)) { 1046 1048 xe_gt_warn(guc_to_gt(guc), "Schedule enable 
failed to respond"); 1047 1049 set_exec_queue_banned(q); 1048 1050 xe_gt_reset_async(q->gt); ··· 1097 1099 struct xe_gpu_scheduler *sched = &q->guc->sched; 1098 1100 struct xe_guc *guc = exec_queue_to_guc(q); 1099 1101 const char *process_name = "no process"; 1102 + struct xe_device *xe = guc_to_xe(guc); 1100 1103 int err = -ETIME; 1101 1104 pid_t pid = -1; 1102 1105 int i = 0; ··· 1126 1127 goto rearm; 1127 1128 1128 1129 /* 1130 + * If devcoredump not captured and GuC capture for the job is not ready 1131 + * do manual capture first and decide later if we need to use it 1132 + */ 1133 + if (!exec_queue_killed(q) && !xe->devcoredump.captured && 1134 + !xe_guc_capture_get_matching_and_lock(job)) { 1135 + /* take force wake before engine register manual capture */ 1136 + if (xe_force_wake_get(gt_to_fw(q->gt), XE_FORCEWAKE_ALL)) 1137 + xe_gt_info(q->gt, "failed to get forcewake for coredump capture\n"); 1138 + 1139 + xe_engine_snapshot_capture_for_job(job); 1140 + 1141 + xe_force_wake_put(gt_to_fw(q->gt), XE_FORCEWAKE_ALL); 1142 + } 1143 + 1144 + /* 1129 1145 * XXX: Sampling timeout doesn't work in wedged mode as we have to 1130 1146 * modify scheduling state to read timestamp. 
We could read the 1131 1147 * timestamp from a register to accumulate current running time but this ··· 1163 1149 */ 1164 1150 ret = wait_event_timeout(guc->ct.wq, 1165 1151 !exec_queue_pending_enable(q) || 1166 - guc_read_stopped(guc), HZ * 5); 1167 - if (!ret || guc_read_stopped(guc)) 1152 + xe_guc_read_stopped(guc), HZ * 5); 1153 + if (!ret || xe_guc_read_stopped(guc)) 1168 1154 goto trigger_reset; 1169 1155 1170 1156 /* ··· 1188 1174 smp_rmb(); 1189 1175 ret = wait_event_timeout(guc->ct.wq, 1190 1176 !exec_queue_pending_disable(q) || 1191 - guc_read_stopped(guc), HZ * 5); 1192 - if (!ret || guc_read_stopped(guc)) { 1177 + xe_guc_read_stopped(guc), HZ * 5); 1178 + if (!ret || xe_guc_read_stopped(guc)) { 1193 1179 trigger_reset: 1194 1180 if (!ret) 1195 1181 xe_gt_warn(guc_to_gt(guc), "Schedule disable failed to respond"); ··· 1378 1364 struct xe_device *xe = guc_to_xe(guc); 1379 1365 1380 1366 xe_assert(xe, exec_queue_suspended(q) || exec_queue_killed(q) || 1381 - guc_read_stopped(guc)); 1367 + xe_guc_read_stopped(guc)); 1382 1368 xe_assert(xe, q->guc->suspend_pending); 1383 1369 1384 1370 __suspend_fence_signal(q); ··· 1392 1378 if (guc_exec_queue_allowed_to_change_state(q) && !exec_queue_suspended(q) && 1393 1379 exec_queue_enabled(q)) { 1394 1380 wait_event(guc->ct.wq, q->guc->resume_time != RESUME_PENDING || 1395 - guc_read_stopped(guc)); 1381 + xe_guc_read_stopped(guc)); 1396 1382 1397 - if (!guc_read_stopped(guc)) { 1383 + if (!xe_guc_read_stopped(guc)) { 1398 1384 s64 since_resume_ms = 1399 1385 ktime_ms_delta(ktime_get(), 1400 1386 q->guc->resume_time); ··· 1519 1505 1520 1506 q->entity = &ge->entity; 1521 1507 1522 - if (guc_read_stopped(guc)) 1508 + if (xe_guc_read_stopped(guc)) 1523 1509 xe_sched_stop(sched); 1524 1510 1525 1511 mutex_unlock(&guc->submission_state.lock); ··· 1675 1661 ret = wait_event_interruptible_timeout(q->guc->suspend_wait, 1676 1662 !READ_ONCE(q->guc->suspend_pending) || 1677 1663 exec_queue_killed(q) || 1678 - 
guc_read_stopped(guc), 1664 + xe_guc_read_stopped(guc), 1679 1665 HZ * 5); 1680 1666 1681 1667 if (!ret) { ··· 1801 1787 void xe_guc_submit_reset_wait(struct xe_guc *guc) 1802 1788 { 1803 1789 wait_event(guc->ct.wq, xe_device_wedged(guc_to_xe(guc)) || 1804 - !guc_read_stopped(guc)); 1790 + !xe_guc_read_stopped(guc)); 1805 1791 } 1806 1792 1807 1793 void xe_guc_submit_stop(struct xe_guc *guc) ··· 1810 1796 unsigned long index; 1811 1797 struct xe_device *xe = guc_to_xe(guc); 1812 1798 1813 - xe_assert(xe, guc_read_stopped(guc) == 1); 1799 + xe_assert(xe, xe_guc_read_stopped(guc) == 1); 1814 1800 1815 1801 mutex_lock(&guc->submission_state.lock); ··· 1849 1835 unsigned long index; 1850 1836 struct xe_device *xe = guc_to_xe(guc); 1851 1837 1852 - xe_assert(xe, guc_read_stopped(guc) == 1); 1838 + xe_assert(xe, xe_guc_read_stopped(guc) == 1); 1853 1839 1854 1840 mutex_lock(&guc->submission_state.lock); 1855 1841 atomic_dec(&guc->submission_state.stopped); ··· 2023 2009 xe_gt_info(gt, "Engine reset: engine_class=%s, logical_mask: 0x%x, guc_id=%d", 2024 2010 xe_hw_engine_class_to_str(q->class), q->logical_mask, guc_id); 2025 2011 2026 - /* FIXME: Do error capture, most likely async */ 2027 - 2028 2012 trace_xe_exec_queue_reset(q); 2029 2013 2030 2014 /* ··· 2034 2022 set_exec_queue_reset(q); 2035 2023 if (!exec_queue_banned(q) && !exec_queue_check_timeout(q)) 2036 2024 xe_guc_exec_queue_trigger_cleanup(q); 2025 + 2026 + return 0; 2027 + } 2028 + 2029 + /* 2030 + * xe_guc_error_capture_handler - Handler of GuC captured message 2031 + * @guc: The GuC object 2032 + * @msg: Pointer to the message 2033 + * @len: The message length 2034 + * 2035 + * When GuC captured data is ready, GuC will send message 2036 + * XE_GUC_ACTION_STATE_CAPTURE_NOTIFICATION to host, this function will be 2037 + * called first to check the status before processing the data that comes with the message. 2038 + * 2039 + * Returns: error code. 0 if success
2040 + */ 2041 + int xe_guc_error_capture_handler(struct xe_guc *guc, u32 *msg, u32 len) 2042 + { 2043 + u32 status; 2044 + 2045 + if (unlikely(len != XE_GUC_ACTION_STATE_CAPTURE_NOTIFICATION_DATA_LEN)) { 2046 + xe_gt_dbg(guc_to_gt(guc), "Invalid length %u", len); 2047 + return -EPROTO; 2048 + } 2049 + 2050 + status = msg[0] & XE_GUC_STATE_CAPTURE_EVENT_STATUS_MASK; 2051 + if (status == XE_GUC_STATE_CAPTURE_EVENT_STATUS_NOSPACE) 2052 + xe_gt_warn(guc_to_gt(guc), "G2H-Error capture no space"); 2053 + 2054 + xe_guc_capture_process(guc); 2037 2055 2038 2056 return 0; 2039 2057 } ··· 2282 2240 if (!snapshot) 2283 2241 return; 2284 2242 2285 - drm_printf(p, "\nGuC ID: %d\n", snapshot->guc.id); 2243 + drm_printf(p, "GuC ID: %d\n", snapshot->guc.id); 2286 2244 drm_printf(p, "\tName: %s\n", snapshot->name); 2287 2245 drm_printf(p, "\tClass: %d\n", snapshot->class); 2288 2246 drm_printf(p, "\tLogical mask: 0x%x\n", snapshot->logical_mask);
+2
drivers/gpu/drm/xe/xe_guc_submit.h
··· 20 20 int xe_guc_submit_start(struct xe_guc *guc); 21 21 void xe_guc_submit_wedge(struct xe_guc *guc); 22 22 23 + int xe_guc_read_stopped(struct xe_guc *guc); 23 24 int xe_guc_sched_done_handler(struct xe_guc *guc, u32 *msg, u32 len); 24 25 int xe_guc_deregister_done_handler(struct xe_guc *guc, u32 *msg, u32 len); 25 26 int xe_guc_exec_queue_reset_handler(struct xe_guc *guc, u32 *msg, u32 len); 26 27 int xe_guc_exec_queue_memory_cat_error_handler(struct xe_guc *guc, u32 *msg, 27 28 u32 len); 28 29 int xe_guc_exec_queue_reset_failure_handler(struct xe_guc *guc, u32 *msg, u32 len); 30 + int xe_guc_error_capture_handler(struct xe_guc *guc, u32 *msg, u32 len); 29 31 30 32 struct xe_guc_submit_exec_queue_snapshot * 31 33 xe_guc_exec_queue_snapshot_capture(struct xe_exec_queue *q);
+2
drivers/gpu/drm/xe/xe_guc_types.h
··· 58 58 struct xe_guc_ads ads; 59 59 /** @ct: GuC ct */ 60 60 struct xe_guc_ct ct; 61 + /** @capture: the error-state-capture module's data and objects */ 62 + struct xe_guc_state_capture *capture; 61 63 /** @pc: GuC Power Conservation */ 62 64 struct xe_guc_pc pc; 63 65 /** @dbm: GuC Doorbell Manager */
+3 -3
drivers/gpu/drm/xe/xe_huc.c
··· 229 229 { 230 230 struct xe_gt *gt = huc_to_gt(huc); 231 231 232 - return xe_mmio_read32(gt, huc_auth_modes[type].reg) & huc_auth_modes[type].val; 232 + return xe_mmio_read32(&gt->mmio, huc_auth_modes[type].reg) & huc_auth_modes[type].val; 233 233 } 234 234 235 235 int xe_huc_auth(struct xe_huc *huc, enum xe_huc_auth_types type) ··· 268 268 goto fail; 269 269 } 270 270 271 - ret = xe_mmio_wait32(gt, huc_auth_modes[type].reg, huc_auth_modes[type].val, 271 + ret = xe_mmio_wait32(&gt->mmio, huc_auth_modes[type].reg, huc_auth_modes[type].val, 272 272 huc_auth_modes[type].val, 100000, NULL, false); 273 273 if (ret) { 274 274 xe_gt_err(gt, "HuC: firmware not verified: %pe\n", ERR_PTR(ret)); ··· 308 308 return; 309 309 310 310 drm_printf(p, "\nHuC status: 0x%08x\n", 311 - xe_mmio_read32(gt, HUC_KERNEL_LOAD_INFO)); 311 + xe_mmio_read32(&gt->mmio, HUC_KERNEL_LOAD_INFO)); 312 312 313 313 xe_force_wake_put(gt_to_fw(gt), XE_FW_GT); 314 314 }
+69 -236
drivers/gpu/drm/xe/xe_hw_engine.c
··· 12 12 13 13 #include "regs/xe_engine_regs.h" 14 14 #include "regs/xe_gt_regs.h" 15 + #include "regs/xe_irq_regs.h" 15 16 #include "xe_assert.h" 16 17 #include "xe_bo.h" 17 18 #include "xe_device.h" ··· 24 23 #include "xe_gt_printk.h" 25 24 #include "xe_gt_mcr.h" 26 25 #include "xe_gt_topology.h" 26 + #include "xe_guc_capture.h" 27 27 #include "xe_hw_engine_group.h" 28 28 #include "xe_hw_fence.h" 29 29 #include "xe_irq.h" ··· 297 295 298 296 reg.addr += hwe->mmio_base; 299 297 300 - xe_mmio_write32(hwe->gt, reg, val); 298 + xe_mmio_write32(&hwe->gt->mmio, reg, val); 301 299 } 302 300 303 301 /** ··· 317 315 318 316 reg.addr += hwe->mmio_base; 319 317 320 - return xe_mmio_read32(hwe->gt, reg); 318 + return xe_mmio_read32(&hwe->gt->mmio, reg); 321 319 } 322 320 323 321 void xe_hw_engine_enable_ring(struct xe_hw_engine *hwe) ··· 326 324 xe_hw_engine_mask_per_class(hwe->gt, XE_ENGINE_CLASS_COMPUTE); 327 325 328 326 if (hwe->class == XE_ENGINE_CLASS_COMPUTE && ccs_mask) 329 - xe_mmio_write32(hwe->gt, RCU_MODE, 327 + xe_mmio_write32(&hwe->gt->mmio, RCU_MODE, 330 328 _MASKED_BIT_ENABLE(RCU_MODE_CCS_ENABLE)); 331 329 332 330 xe_hw_engine_mmio_write32(hwe, RING_HWSTAM(0), ~0x0); ··· 356 354 hwe->class != XE_ENGINE_CLASS_RENDER) 357 355 return false; 358 356 359 - return xe_mmio_read32(hwe->gt, XEHP_FUSE4) & CFEG_WMTP_DISABLE; 357 + return xe_mmio_read32(&hwe->gt->mmio, XEHP_FUSE4) & CFEG_WMTP_DISABLE; 360 358 } 361 359 362 360 void ··· 462 460 xe_rtp_process_to_sr(&ctx, engine_entries, &hwe->reg_sr); 463 461 } 464 462 463 + static const struct engine_info *find_engine_info(enum xe_engine_class class, int instance) 464 + { 465 + const struct engine_info *info; 466 + enum xe_hw_engine_id id; 467 + 468 + for (id = 0; id < XE_NUM_HW_ENGINES; ++id) { 469 + info = &engine_infos[id]; 470 + if (info->class == class && info->instance == instance) 471 + return info; 472 + } 473 + 474 + return NULL; 475 + } 476 + 477 + static u16 get_msix_irq_offset(struct xe_gt *gt, enum 
xe_engine_class class) 478 + { 479 + /* For MSI-X, hw engines report to offset of engine instance zero */ 480 + const struct engine_info *info = find_engine_info(class, 0); 481 + 482 + xe_gt_assert(gt, info); 483 + 484 + return info ? info->irq_offset : 0; 485 + } 486 + 465 487 static void hw_engine_init_early(struct xe_gt *gt, struct xe_hw_engine *hwe, 466 488 enum xe_hw_engine_id id) 467 489 { ··· 505 479 hwe->class = info->class; 506 480 hwe->instance = info->instance; 507 481 hwe->mmio_base = info->mmio_base; 508 - hwe->irq_offset = info->irq_offset; 482 + hwe->irq_offset = xe_device_has_msix(gt_to_xe(gt)) ? 483 + get_msix_irq_offset(gt, info->class) : 484 + info->irq_offset; 509 485 hwe->domain = info->domain; 510 486 hwe->name = info->name; 511 487 hwe->fence_irq = &gt->fence_irq[info->class]; ··· 640 612 641 613 xe_force_wake_assert_held(gt_to_fw(gt), XE_FW_GT); 642 614 643 - media_fuse = xe_mmio_read32(gt, GT_VEBOX_VDBOX_DISABLE); 615 + media_fuse = xe_mmio_read32(&gt->mmio, GT_VEBOX_VDBOX_DISABLE); 644 616 645 617 /* 646 618 * Pre-Xe_HP platforms had register bits representing absent engines, ··· 685 657 686 658 xe_force_wake_assert_held(gt_to_fw(gt), XE_FW_GT); 687 659 688 - bcs_mask = xe_mmio_read32(gt, MIRROR_FUSE3); 660 + bcs_mask = xe_mmio_read32(&gt->mmio, MIRROR_FUSE3); 689 661 bcs_mask = REG_FIELD_GET(MEML3_EN_MASK, bcs_mask); 690 662 691 663 /* BCS0 is always present; only BCS1-BCS8 may be fused off */ ··· 732 704 struct xe_device *xe = gt_to_xe(gt); 733 705 u32 ccs_mask; 734 706 735 - ccs_mask = xe_mmio_read32(gt, XEHP_FUSE4); 707 + ccs_mask = xe_mmio_read32(&gt->mmio, XEHP_FUSE4); 736 708 ccs_mask = REG_FIELD_GET(CCS_EN_MASK, ccs_mask); 737 709 738 710 for (int i = XE_HW_ENGINE_CCS0, j = 0; i <= XE_HW_ENGINE_CCS3; ++i, ++j) { ··· 770 742 gt->info.engine_mask &= ~BIT(XE_HW_ENGINE_GSCCS0); 771 743 772 744 /* interrupts where previously enabled, so turn them off */ 773 - xe_mmio_write32(gt, GUNIT_GSC_INTR_ENABLE, 0); 774 - xe_mmio_write32(gt, 
GUNIT_GSC_INTR_MASK, ~0); 745 + xe_mmio_write32(&gt->mmio, GUNIT_GSC_INTR_ENABLE, 0); 746 + xe_mmio_write32(&gt->mmio, GUNIT_GSC_INTR_MASK, ~0); 775 747 776 748 drm_info(&xe->drm, "gsccs disabled due to lack of FW\n"); 777 749 } ··· 826 798 xe_hw_fence_irq_run(hwe->fence_irq); 827 799 } 828 800 829 - static bool 830 - is_slice_common_per_gslice(struct xe_device *xe) 831 - { 832 - return GRAPHICS_VERx100(xe) >= 1255; 833 - } 834 - 835 - static void 836 - xe_hw_engine_snapshot_instdone_capture(struct xe_hw_engine *hwe, 837 - struct xe_hw_engine_snapshot *snapshot) 838 - { 839 - struct xe_gt *gt = hwe->gt; 840 - struct xe_device *xe = gt_to_xe(gt); 841 - unsigned int dss; 842 - u16 group, instance; 843 - 844 - snapshot->reg.instdone.ring = xe_hw_engine_mmio_read32(hwe, RING_INSTDONE(0)); 845 - 846 - if (snapshot->hwe->class != XE_ENGINE_CLASS_RENDER) 847 - return; 848 - 849 - if (is_slice_common_per_gslice(xe) == false) { 850 - snapshot->reg.instdone.slice_common[0] = 851 - xe_mmio_read32(gt, SC_INSTDONE); 852 - snapshot->reg.instdone.slice_common_extra[0] = 853 - xe_mmio_read32(gt, SC_INSTDONE_EXTRA); 854 - snapshot->reg.instdone.slice_common_extra2[0] = 855 - xe_mmio_read32(gt, SC_INSTDONE_EXTRA2); 856 - } else { 857 - for_each_geometry_dss(dss, gt, group, instance) { 858 - snapshot->reg.instdone.slice_common[dss] = 859 - xe_gt_mcr_unicast_read(gt, XEHPG_SC_INSTDONE, group, instance); 860 - snapshot->reg.instdone.slice_common_extra[dss] = 861 - xe_gt_mcr_unicast_read(gt, XEHPG_SC_INSTDONE_EXTRA, group, instance); 862 - snapshot->reg.instdone.slice_common_extra2[dss] = 863 - xe_gt_mcr_unicast_read(gt, XEHPG_SC_INSTDONE_EXTRA2, group, instance); 864 - } 865 - } 866 - 867 - for_each_geometry_dss(dss, gt, group, instance) { 868 - snapshot->reg.instdone.sampler[dss] = 869 - xe_gt_mcr_unicast_read(gt, SAMPLER_INSTDONE, group, instance); 870 - snapshot->reg.instdone.row[dss] = 871 - xe_gt_mcr_unicast_read(gt, ROW_INSTDONE, group, instance); 872 - 873 - if 
(GRAPHICS_VERx100(xe) >= 1255) 874 - snapshot->reg.instdone.geom_svg[dss] = 875 - xe_gt_mcr_unicast_read(gt, XEHPG_INSTDONE_GEOM_SVGUNIT, 876 - group, instance); 877 - } 878 - } 879 - 880 801 /** 881 802 * xe_hw_engine_snapshot_capture - Take a quick snapshot of the HW Engine. 882 803 * @hwe: Xe HW Engine. 804 + * @job: The job object. 883 805 * 884 806 * This can be printed out in a later stage like during dev_coredump 885 807 * analysis. ··· 838 860 * caller, using `xe_hw_engine_snapshot_free`. 839 861 */ 840 862 struct xe_hw_engine_snapshot * 841 - xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe) 863 + xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe, struct xe_sched_job *job) 842 864 { 843 865 struct xe_hw_engine_snapshot *snapshot; 844 - size_t len; 845 - u64 val; 866 + struct __guc_capture_parsed_output *node; 846 867 847 868 if (!xe_hw_engine_is_valid(hwe)) 848 869 return NULL; ··· 851 874 if (!snapshot) 852 875 return NULL; 853 876 854 - /* Because XE_MAX_DSS_FUSE_BITS is defined in xe_gt_types.h and it 855 - * includes xe_hw_engine_types.h the length of this 3 registers can't be 856 - * set in struct xe_hw_engine_snapshot, so here doing additional 857 - * allocations. 
858 - */ 859 - len = (XE_MAX_DSS_FUSE_BITS * sizeof(u32)); 860 - snapshot->reg.instdone.slice_common = kzalloc(len, GFP_ATOMIC); 861 - snapshot->reg.instdone.slice_common_extra = kzalloc(len, GFP_ATOMIC); 862 - snapshot->reg.instdone.slice_common_extra2 = kzalloc(len, GFP_ATOMIC); 863 - snapshot->reg.instdone.sampler = kzalloc(len, GFP_ATOMIC); 864 - snapshot->reg.instdone.row = kzalloc(len, GFP_ATOMIC); 865 - snapshot->reg.instdone.geom_svg = kzalloc(len, GFP_ATOMIC); 866 - if (!snapshot->reg.instdone.slice_common || 867 - !snapshot->reg.instdone.slice_common_extra || 868 - !snapshot->reg.instdone.slice_common_extra2 || 869 - !snapshot->reg.instdone.sampler || 870 - !snapshot->reg.instdone.row || 871 - !snapshot->reg.instdone.geom_svg) { 872 - xe_hw_engine_snapshot_free(snapshot); 873 - return NULL; 874 - } 875 - 876 877 snapshot->name = kstrdup(hwe->name, GFP_ATOMIC); 877 878 snapshot->hwe = hwe; 878 879 snapshot->logical_instance = hwe->logical_instance; ··· 858 903 snapshot->forcewake.ref = xe_force_wake_ref(gt_to_fw(hwe->gt), 859 904 hwe->domain); 860 905 snapshot->mmio_base = hwe->mmio_base; 906 + snapshot->kernel_reserved = xe_hw_engine_is_reserved(hwe); 861 907 862 908 /* no more VF accessible data below this point */ 863 909 if (IS_SRIOV_VF(gt_to_xe(hwe->gt))) 864 910 return snapshot; 865 911 866 - snapshot->reg.ring_execlist_status = 867 - xe_hw_engine_mmio_read32(hwe, RING_EXECLIST_STATUS_LO(0)); 868 - val = xe_hw_engine_mmio_read32(hwe, RING_EXECLIST_STATUS_HI(0)); 869 - snapshot->reg.ring_execlist_status |= val << 32; 912 + if (job) { 913 + /* If got guc capture, set source to GuC */ 914 + node = xe_guc_capture_get_matching_and_lock(job); 915 + if (node) { 916 + struct xe_device *xe = gt_to_xe(hwe->gt); 917 + struct xe_devcoredump *coredump = &xe->devcoredump; 870 918 871 - snapshot->reg.ring_execlist_sq_contents = 872 - xe_hw_engine_mmio_read32(hwe, RING_EXECLIST_SQ_CONTENTS_LO(0)); 873 - val = xe_hw_engine_mmio_read32(hwe, 
RING_EXECLIST_SQ_CONTENTS_HI(0)); 874 - snapshot->reg.ring_execlist_sq_contents |= val << 32; 875 - 876 - snapshot->reg.ring_acthd = xe_hw_engine_mmio_read32(hwe, RING_ACTHD(0)); 877 - val = xe_hw_engine_mmio_read32(hwe, RING_ACTHD_UDW(0)); 878 - snapshot->reg.ring_acthd |= val << 32; 879 - 880 - snapshot->reg.ring_bbaddr = xe_hw_engine_mmio_read32(hwe, RING_BBADDR(0)); 881 - val = xe_hw_engine_mmio_read32(hwe, RING_BBADDR_UDW(0)); 882 - snapshot->reg.ring_bbaddr |= val << 32; 883 - 884 - snapshot->reg.ring_dma_fadd = 885 - xe_hw_engine_mmio_read32(hwe, RING_DMA_FADD(0)); 886 - val = xe_hw_engine_mmio_read32(hwe, RING_DMA_FADD_UDW(0)); 887 - snapshot->reg.ring_dma_fadd |= val << 32; 888 - 889 - snapshot->reg.ring_hwstam = xe_hw_engine_mmio_read32(hwe, RING_HWSTAM(0)); 890 - snapshot->reg.ring_hws_pga = xe_hw_engine_mmio_read32(hwe, RING_HWS_PGA(0)); 891 - snapshot->reg.ring_start = xe_hw_engine_mmio_read32(hwe, RING_START(0)); 892 - if (GRAPHICS_VERx100(hwe->gt->tile->xe) >= 2000) { 893 - val = xe_hw_engine_mmio_read32(hwe, RING_START_UDW(0)); 894 - snapshot->reg.ring_start |= val << 32; 895 - } 896 - if (xe_gt_has_indirect_ring_state(hwe->gt)) { 897 - snapshot->reg.indirect_ring_state = 898 - xe_hw_engine_mmio_read32(hwe, INDIRECT_RING_STATE(0)); 899 - } 900 - 901 - snapshot->reg.ring_head = 902 - xe_hw_engine_mmio_read32(hwe, RING_HEAD(0)) & HEAD_ADDR; 903 - snapshot->reg.ring_tail = 904 - xe_hw_engine_mmio_read32(hwe, RING_TAIL(0)) & TAIL_ADDR; 905 - snapshot->reg.ring_ctl = xe_hw_engine_mmio_read32(hwe, RING_CTL(0)); 906 - snapshot->reg.ring_mi_mode = 907 - xe_hw_engine_mmio_read32(hwe, RING_MI_MODE(0)); 908 - snapshot->reg.ring_mode = xe_hw_engine_mmio_read32(hwe, RING_MODE(0)); 909 - snapshot->reg.ring_imr = xe_hw_engine_mmio_read32(hwe, RING_IMR(0)); 910 - snapshot->reg.ring_esr = xe_hw_engine_mmio_read32(hwe, RING_ESR(0)); 911 - snapshot->reg.ring_emr = xe_hw_engine_mmio_read32(hwe, RING_EMR(0)); 912 - snapshot->reg.ring_eir = xe_hw_engine_mmio_read32(hwe, 
RING_EIR(0)); 913 - snapshot->reg.ipehr = xe_hw_engine_mmio_read32(hwe, RING_IPEHR(0)); 914 - xe_hw_engine_snapshot_instdone_capture(hwe, snapshot); 915 - 916 - if (snapshot->hwe->class == XE_ENGINE_CLASS_COMPUTE) 917 - snapshot->reg.rcu_mode = xe_mmio_read32(hwe->gt, RCU_MODE); 918 - 919 - return snapshot; 920 - } 921 - 922 - static void 923 - xe_hw_engine_snapshot_instdone_print(struct xe_hw_engine_snapshot *snapshot, struct drm_printer *p) 924 - { 925 - struct xe_gt *gt = snapshot->hwe->gt; 926 - struct xe_device *xe = gt_to_xe(gt); 927 - u16 group, instance; 928 - unsigned int dss; 929 - 930 - drm_printf(p, "\tRING_INSTDONE: 0x%08x\n", snapshot->reg.instdone.ring); 931 - 932 - if (snapshot->hwe->class != XE_ENGINE_CLASS_RENDER) 933 - return; 934 - 935 - if (is_slice_common_per_gslice(xe) == false) { 936 - drm_printf(p, "\tSC_INSTDONE[0]: 0x%08x\n", 937 - snapshot->reg.instdone.slice_common[0]); 938 - drm_printf(p, "\tSC_INSTDONE_EXTRA[0]: 0x%08x\n", 939 - snapshot->reg.instdone.slice_common_extra[0]); 940 - drm_printf(p, "\tSC_INSTDONE_EXTRA2[0]: 0x%08x\n", 941 - snapshot->reg.instdone.slice_common_extra2[0]); 942 - } else { 943 - for_each_geometry_dss(dss, gt, group, instance) { 944 - drm_printf(p, "\tSC_INSTDONE[%u]: 0x%08x\n", dss, 945 - snapshot->reg.instdone.slice_common[dss]); 946 - drm_printf(p, "\tSC_INSTDONE_EXTRA[%u]: 0x%08x\n", dss, 947 - snapshot->reg.instdone.slice_common_extra[dss]); 948 - drm_printf(p, "\tSC_INSTDONE_EXTRA2[%u]: 0x%08x\n", dss, 949 - snapshot->reg.instdone.slice_common_extra2[dss]); 919 + coredump->snapshot.matched_node = node; 920 + snapshot->source = XE_ENGINE_CAPTURE_SOURCE_GUC; 921 + xe_gt_dbg(hwe->gt, "Found and locked GuC-err-capture node"); 922 + return snapshot; 950 923 } 951 924 } 952 925 953 - for_each_geometry_dss(dss, gt, group, instance) { 954 - drm_printf(p, "\tSAMPLER_INSTDONE[%u]: 0x%08x\n", dss, 955 - snapshot->reg.instdone.sampler[dss]); 956 - drm_printf(p, "\tROW_INSTDONE[%u]: 0x%08x\n", dss, 957 - 
snapshot->reg.instdone.row[dss]); 926 + /* otherwise, do manual capture */ 927 + xe_engine_manual_capture(hwe, snapshot); 928 + snapshot->source = XE_ENGINE_CAPTURE_SOURCE_MANUAL; 929 + xe_gt_dbg(hwe->gt, "Proceeding with manual engine snapshot"); 958 930 959 - if (GRAPHICS_VERx100(xe) >= 1255) 960 - drm_printf(p, "\tINSTDONE_GEOM_SVGUNIT[%u]: 0x%08x\n", 961 - dss, snapshot->reg.instdone.geom_svg[dss]); 962 - } 963 - } 964 - 965 - /** 966 - * xe_hw_engine_snapshot_print - Print out a given Xe HW Engine snapshot. 967 - * @snapshot: Xe HW Engine snapshot object. 968 - * @p: drm_printer where it will be printed out. 969 - * 970 - * This function prints out a given Xe HW Engine snapshot object. 971 - */ 972 - void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, 973 - struct drm_printer *p) 974 - { 975 - if (!snapshot) 976 - return; 977 - 978 - drm_printf(p, "%s (physical), logical instance=%d\n", 979 - snapshot->name ? snapshot->name : "", 980 - snapshot->logical_instance); 981 - drm_printf(p, "\tForcewake: domain 0x%x, ref %d\n", 982 - snapshot->forcewake.domain, snapshot->forcewake.ref); 983 - drm_printf(p, "\tHWSTAM: 0x%08x\n", snapshot->reg.ring_hwstam); 984 - drm_printf(p, "\tRING_HWS_PGA: 0x%08x\n", snapshot->reg.ring_hws_pga); 985 - drm_printf(p, "\tRING_EXECLIST_STATUS: 0x%016llx\n", 986 - snapshot->reg.ring_execlist_status); 987 - drm_printf(p, "\tRING_EXECLIST_SQ_CONTENTS: 0x%016llx\n", 988 - snapshot->reg.ring_execlist_sq_contents); 989 - drm_printf(p, "\tRING_START: 0x%016llx\n", snapshot->reg.ring_start); 990 - drm_printf(p, "\tRING_HEAD: 0x%08x\n", snapshot->reg.ring_head); 991 - drm_printf(p, "\tRING_TAIL: 0x%08x\n", snapshot->reg.ring_tail); 992 - drm_printf(p, "\tRING_CTL: 0x%08x\n", snapshot->reg.ring_ctl); 993 - drm_printf(p, "\tRING_MI_MODE: 0x%08x\n", snapshot->reg.ring_mi_mode); 994 - drm_printf(p, "\tRING_MODE: 0x%08x\n", 995 - snapshot->reg.ring_mode); 996 - drm_printf(p, "\tRING_IMR: 0x%08x\n", snapshot->reg.ring_imr); 997 - 
drm_printf(p, "\tRING_ESR: 0x%08x\n", snapshot->reg.ring_esr); 998 - drm_printf(p, "\tRING_EMR: 0x%08x\n", snapshot->reg.ring_emr); 999 - drm_printf(p, "\tRING_EIR: 0x%08x\n", snapshot->reg.ring_eir); 1000 - drm_printf(p, "\tACTHD: 0x%016llx\n", snapshot->reg.ring_acthd); 1001 - drm_printf(p, "\tBBADDR: 0x%016llx\n", snapshot->reg.ring_bbaddr); 1002 - drm_printf(p, "\tDMA_FADDR: 0x%016llx\n", snapshot->reg.ring_dma_fadd); 1003 - drm_printf(p, "\tINDIRECT_RING_STATE: 0x%08x\n", 1004 - snapshot->reg.indirect_ring_state); 1005 - drm_printf(p, "\tIPEHR: 0x%08x\n", snapshot->reg.ipehr); 1006 - xe_hw_engine_snapshot_instdone_print(snapshot, p); 1007 - 1008 - if (snapshot->hwe->class == XE_ENGINE_CLASS_COMPUTE) 1009 - drm_printf(p, "\tRCU_MODE: 0x%08x\n", 1010 - snapshot->reg.rcu_mode); 1011 - drm_puts(p, "\n"); 931 + return snapshot; 1012 932 } 1013 933 1014 934 /** ··· 895 1065 */ 896 1066 void xe_hw_engine_snapshot_free(struct xe_hw_engine_snapshot *snapshot) 897 1067 { 1068 + struct xe_gt *gt; 898 1069 if (!snapshot) 899 1070 return; 900 1071 901 - kfree(snapshot->reg.instdone.slice_common); 902 - kfree(snapshot->reg.instdone.slice_common_extra); 903 - kfree(snapshot->reg.instdone.slice_common_extra2); 904 - kfree(snapshot->reg.instdone.sampler); 905 - kfree(snapshot->reg.instdone.row); 906 - kfree(snapshot->reg.instdone.geom_svg); 1072 + gt = snapshot->hwe->gt; 1073 + /* 1074 + * xe_guc_capture_put_matched_nodes is called here and from 1075 + * xe_devcoredump_snapshot_free, to cover the 2 calling paths 1076 + * of hw_engines - debugfs and devcoredump free. 
1077 + */ 1078 + xe_guc_capture_put_matched_nodes(&gt->uc.guc); 1079 + 907 1080 kfree(snapshot->name); 908 1081 kfree(snapshot); 909 1082 } ··· 922 1089 { 923 1090 struct xe_hw_engine_snapshot *snapshot; 924 1091 925 - snapshot = xe_hw_engine_snapshot_capture(hwe); 926 - xe_hw_engine_snapshot_print(snapshot, p); 1092 + snapshot = xe_hw_engine_snapshot_capture(hwe, NULL); 1093 + xe_engine_snapshot_print(snapshot, p); 927 1094 xe_hw_engine_snapshot_free(snapshot); 928 1095 } 929 1096 ··· 983 1150 984 1151 u64 xe_hw_engine_read_timestamp(struct xe_hw_engine *hwe) 985 1152 { 986 - return xe_mmio_read64_2x32(hwe->gt, RING_TIMESTAMP(hwe->mmio_base)); 1153 + return xe_mmio_read64_2x32(&hwe->gt->mmio, RING_TIMESTAMP(hwe->mmio_base)); 987 1154 } 988 1155 989 1156 enum xe_force_wake_domains xe_hw_engine_to_fw_domain(struct xe_hw_engine *hwe)
+2 -4
drivers/gpu/drm/xe/xe_hw_engine.h
··· 11 11 struct drm_printer; 12 12 struct drm_xe_engine_class_instance; 13 13 struct xe_device; 14 + struct xe_sched_job; 14 15 15 16 #ifdef CONFIG_DRM_XE_JOB_TIMEOUT_MIN 16 17 #define XE_HW_ENGINE_JOB_TIMEOUT_MIN CONFIG_DRM_XE_JOB_TIMEOUT_MIN ··· 55 54 void xe_hw_engine_enable_ring(struct xe_hw_engine *hwe); 56 55 u32 xe_hw_engine_mask_per_class(struct xe_gt *gt, 57 56 enum xe_engine_class engine_class); 58 - 59 57 struct xe_hw_engine_snapshot * 60 - xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe); 58 + xe_hw_engine_snapshot_capture(struct xe_hw_engine *hwe, struct xe_sched_job *job); 61 59 void xe_hw_engine_snapshot_free(struct xe_hw_engine_snapshot *snapshot); 62 - void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot, 63 - struct drm_printer *p); 64 60 void xe_hw_engine_print(struct xe_hw_engine *hwe, struct drm_printer *p); 65 61 void xe_hw_engine_setup_default_lrc_state(struct xe_hw_engine *hwe); 66 62
+9 -59
drivers/gpu/drm/xe/xe_hw_engine_types.h
··· 152 152 struct xe_hw_engine_group *hw_engine_group; 153 153 }; 154 154 155 + enum xe_hw_engine_snapshot_source_id { 156 + XE_ENGINE_CAPTURE_SOURCE_MANUAL, 157 + XE_ENGINE_CAPTURE_SOURCE_GUC 158 + }; 159 + 155 160 /** 156 161 * struct xe_hw_engine_snapshot - Hardware engine snapshot 157 162 * ··· 165 160 struct xe_hw_engine_snapshot { 166 161 /** @name: name of the hw engine */ 167 162 char *name; 163 + /** @source: Data source, either manual or GuC */ 164 + enum xe_hw_engine_snapshot_source_id source; 168 165 /** @hwe: hw engine */ 169 166 struct xe_hw_engine *hwe; 170 167 /** @logical_instance: logical instance of this hw engine */ ··· 180 173 } forcewake; 181 174 /** @mmio_base: MMIO base address of this hw engine*/ 182 175 u32 mmio_base; 183 - /** @reg: Useful MMIO register snapshot */ 184 - struct { 185 - /** @reg.ring_execlist_status: RING_EXECLIST_STATUS */ 186 - u64 ring_execlist_status; 187 - /** @reg.ring_execlist_sq_contents: RING_EXECLIST_SQ_CONTENTS */ 188 - u64 ring_execlist_sq_contents; 189 - /** @reg.ring_acthd: RING_ACTHD */ 190 - u64 ring_acthd; 191 - /** @reg.ring_bbaddr: RING_BBADDR */ 192 - u64 ring_bbaddr; 193 - /** @reg.ring_dma_fadd: RING_DMA_FADD */ 194 - u64 ring_dma_fadd; 195 - /** @reg.ring_hwstam: RING_HWSTAM */ 196 - u32 ring_hwstam; 197 - /** @reg.ring_hws_pga: RING_HWS_PGA */ 198 - u32 ring_hws_pga; 199 - /** @reg.ring_start: RING_START */ 200 - u64 ring_start; 201 - /** @reg.ring_head: RING_HEAD */ 202 - u32 ring_head; 203 - /** @reg.ring_tail: RING_TAIL */ 204 - u32 ring_tail; 205 - /** @reg.ring_ctl: RING_CTL */ 206 - u32 ring_ctl; 207 - /** @reg.ring_mi_mode: RING_MI_MODE */ 208 - u32 ring_mi_mode; 209 - /** @reg.ring_mode: RING_MODE */ 210 - u32 ring_mode; 211 - /** @reg.ring_imr: RING_IMR */ 212 - u32 ring_imr; 213 - /** @reg.ring_esr: RING_ESR */ 214 - u32 ring_esr; 215 - /** @reg.ring_emr: RING_EMR */ 216 - u32 ring_emr; 217 - /** @reg.ring_eir: RING_EIR */ 218 - u32 ring_eir; 219 - /** @reg.indirect_ring_state: 
INDIRECT_RING_STATE */ 220 - u32 indirect_ring_state; 221 - /** @reg.ipehr: IPEHR */ 222 - u32 ipehr; 223 - /** @reg.rcu_mode: RCU_MODE */ 224 - u32 rcu_mode; 225 - struct { 226 - /** @reg.instdone.ring: RING_INSTDONE */ 227 - u32 ring; 228 - /** @reg.instdone.slice_common: SC_INSTDONE */ 229 - u32 *slice_common; 230 - /** @reg.instdone.slice_common_extra: SC_INSTDONE_EXTRA */ 231 - u32 *slice_common_extra; 232 - /** @reg.instdone.slice_common_extra2: SC_INSTDONE_EXTRA2 */ 233 - u32 *slice_common_extra2; 234 - /** @reg.instdone.sampler: SAMPLER_INSTDONE */ 235 - u32 *sampler; 236 - /** @reg.instdone.row: ROW_INSTDONE */ 237 - u32 *row; 238 - /** @reg.instdone.geom_svg: INSTDONE_GEOM_SVGUNIT */ 239 - u32 *geom_svg; 240 - } instdone; 241 - } reg; 176 + /** @kernel_reserved: Engine reserved, can't be used by userspace */ 177 + bool kernel_reserved; 242 178 }; 243 179 244 180 #endif
+8 -8
drivers/gpu/drm/xe/xe_hwmon.c
··· 149 149 u64 reg_val, min, max; 150 150 struct xe_device *xe = hwmon->xe; 151 151 struct xe_reg rapl_limit, pkg_power_sku; 152 - struct xe_gt *mmio = xe_root_mmio_gt(xe); 152 + struct xe_mmio *mmio = xe_root_tile_mmio(xe); 153 153 154 154 rapl_limit = xe_hwmon_get_reg(hwmon, REG_PKG_RAPL_LIMIT, channel); 155 155 pkg_power_sku = xe_hwmon_get_reg(hwmon, REG_PKG_POWER_SKU, channel); ··· 190 190 191 191 static int xe_hwmon_power_max_write(struct xe_hwmon *hwmon, int channel, long value) 192 192 { 193 - struct xe_gt *mmio = xe_root_mmio_gt(hwmon->xe); 193 + struct xe_mmio *mmio = xe_root_tile_mmio(hwmon->xe); 194 194 int ret = 0; 195 195 u64 reg_val; 196 196 struct xe_reg rapl_limit; ··· 222 222 223 223 static void xe_hwmon_power_rated_max_read(struct xe_hwmon *hwmon, int channel, long *value) 224 224 { 225 - struct xe_gt *mmio = xe_root_mmio_gt(hwmon->xe); 225 + struct xe_mmio *mmio = xe_root_tile_mmio(hwmon->xe); 226 226 struct xe_reg reg = xe_hwmon_get_reg(hwmon, REG_PKG_POWER_SKU, channel); 227 227 u64 reg_val; 228 228 ··· 259 259 static void 260 260 xe_hwmon_energy_get(struct xe_hwmon *hwmon, int channel, long *energy) 261 261 { 262 - struct xe_gt *mmio = xe_root_mmio_gt(hwmon->xe); 262 + struct xe_mmio *mmio = xe_root_tile_mmio(hwmon->xe); 263 263 struct xe_hwmon_energy_info *ei = &hwmon->ei[channel]; 264 264 u64 reg_val; 265 265 ··· 282 282 char *buf) 283 283 { 284 284 struct xe_hwmon *hwmon = dev_get_drvdata(dev); 285 - struct xe_gt *mmio = xe_root_mmio_gt(hwmon->xe); 285 + struct xe_mmio *mmio = xe_root_tile_mmio(hwmon->xe); 286 286 u32 x, y, x_w = 2; /* 2 bits */ 287 287 u64 r, tau4, out; 288 288 int sensor_index = to_sensor_dev_attr(attr)->index; ··· 323 323 const char *buf, size_t count) 324 324 { 325 325 struct xe_hwmon *hwmon = dev_get_drvdata(dev); 326 - struct xe_gt *mmio = xe_root_mmio_gt(hwmon->xe); 326 + struct xe_mmio *mmio = xe_root_tile_mmio(hwmon->xe); 327 327 u32 x, y, rxy, x_w = 2; /* 2 bits */ 328 328 u64 tau4, r, max_win; 329 329 unsigned 
long val; ··· 498 498 499 499 static void xe_hwmon_get_voltage(struct xe_hwmon *hwmon, int channel, long *value) 500 500 { 501 - struct xe_gt *mmio = xe_root_mmio_gt(hwmon->xe); 501 + struct xe_mmio *mmio = xe_root_tile_mmio(hwmon->xe); 502 502 u64 reg_val; 503 503 504 504 reg_val = xe_mmio_read32(mmio, xe_hwmon_get_reg(hwmon, REG_GT_PERF_STATUS, channel)); ··· 781 781 static void 782 782 xe_hwmon_get_preregistration_info(struct xe_device *xe) 783 783 { 784 - struct xe_gt *mmio = xe_root_mmio_gt(xe); 784 + struct xe_mmio *mmio = xe_root_tile_mmio(xe); 785 785 struct xe_hwmon *hwmon = xe->hwmon; 786 786 long energy; 787 787 u64 val_sku_unit = 0;
+39 -39
drivers/gpu/drm/xe/xe_irq.c
···
10 10 
11 11 #include <drm/drm_managed.h>
12 12 
13 - #include "regs/xe_gt_regs.h"
14 - #include "regs/xe_regs.h"
13 + #include "regs/xe_irq_regs.h"
15 14 #include "xe_device.h"
16 15 #include "xe_drv.h"
17 16 #include "xe_gsc_proxy.h"
···
29 30 #define IIR(offset)	XE_REG(offset + 0x8)
30 31 #define IER(offset)	XE_REG(offset + 0xc)
31 32 
32 - static void assert_iir_is_zero(struct xe_gt *mmio, struct xe_reg reg)
33 + static void assert_iir_is_zero(struct xe_mmio *mmio, struct xe_reg reg)
33 34 {
34 35 	u32 val = xe_mmio_read32(mmio, reg);
35 36 
36 37 	if (val == 0)
37 38 		return;
38 39 
39 - 	drm_WARN(&gt_to_xe(mmio)->drm, 1,
40 + 	drm_WARN(&mmio->tile->xe->drm, 1,
40 41 		 "Interrupt register 0x%x is not zero: 0x%08x\n",
41 42 		 reg.addr, val);
42 43 	xe_mmio_write32(mmio, reg, 0xffffffff);
···
51 52  */
52 53 static void unmask_and_enable(struct xe_tile *tile, u32 irqregs, u32 bits)
53 54 {
54 - 	struct xe_gt *mmio = tile->primary_gt;
55 + 	struct xe_mmio *mmio = &tile->mmio;
55 56 
56 57 	/*
57 58 	 * If we're just enabling an interrupt now, it shouldn't already
···
69 70 /* Mask and disable all interrupts. */
70 71 static void mask_and_disable(struct xe_tile *tile, u32 irqregs)
71 72 {
72 - 	struct xe_gt *mmio = tile->primary_gt;
73 + 	struct xe_mmio *mmio = &tile->mmio;
73 74 
74 75 	xe_mmio_write32(mmio, IMR(irqregs), ~0);
75 76 	/* Posting read */
···
86 87 
87 88 static u32 xelp_intr_disable(struct xe_device *xe)
88 89 {
89 - 	struct xe_gt *mmio = xe_root_mmio_gt(xe);
90 + 	struct xe_mmio *mmio = xe_root_tile_mmio(xe);
90 91 
91 92 	xe_mmio_write32(mmio, GFX_MSTR_IRQ, 0);
92 93 
···
102 103 static u32
103 104 gu_misc_irq_ack(struct xe_device *xe, const u32 master_ctl)
104 105 {
105 - 	struct xe_gt *mmio = xe_root_mmio_gt(xe);
106 + 	struct xe_mmio *mmio = xe_root_tile_mmio(xe);
106 107 	u32 iir;
107 108 
108 109 	if (!(master_ctl & GU_MISC_IRQ))
···
117 118 
118 119 static inline void xelp_intr_enable(struct xe_device *xe, bool stall)
119 120 {
120 - 	struct xe_gt *mmio = xe_root_mmio_gt(xe);
121 + 	struct xe_mmio *mmio = xe_root_tile_mmio(xe);
121 122 
122 123 	xe_mmio_write32(mmio, GFX_MSTR_IRQ, MASTER_IRQ);
123 124 	if (stall)
···
128 129 void xe_irq_enable_hwe(struct xe_gt *gt)
129 130 {
130 131 	struct xe_device *xe = gt_to_xe(gt);
132 + 	struct xe_mmio *mmio = &gt->mmio;
131 133 	u32 ccs_mask, bcs_mask;
132 134 	u32 irqs, dmask, smask;
133 135 	u32 gsc_mask = 0;
134 136 	u32 heci_mask = 0;
135 137 
136 - 	if (IS_SRIOV_VF(xe) && xe_device_has_memirq(xe))
138 + 	if (xe_device_uses_memirq(xe))
137 139 		return;
138 140 
139 141 	if (xe_device_uc_enabled(xe)) {
···
155 155 
156 156 	if (!xe_gt_is_media_type(gt)) {
157 157 		/* Enable interrupts for each engine class */
158 - 		xe_mmio_write32(gt, RENDER_COPY_INTR_ENABLE, dmask);
158 + 		xe_mmio_write32(mmio, RENDER_COPY_INTR_ENABLE, dmask);
159 159 		if (ccs_mask)
160 - 			xe_mmio_write32(gt, CCS_RSVD_INTR_ENABLE, smask);
160 + 			xe_mmio_write32(mmio, CCS_RSVD_INTR_ENABLE, smask);
161 161 
162 162 		/* Unmask interrupts for each engine instance */
163 - 		xe_mmio_write32(gt, RCS0_RSVD_INTR_MASK, ~smask);
164 - 		xe_mmio_write32(gt, BCS_RSVD_INTR_MASK, ~smask);
163 + 		xe_mmio_write32(mmio, RCS0_RSVD_INTR_MASK, ~smask);
164 + 		xe_mmio_write32(mmio, BCS_RSVD_INTR_MASK, ~smask);
165 165 		if (bcs_mask & (BIT(1)|BIT(2)))
166 - 			xe_mmio_write32(gt, XEHPC_BCS1_BCS2_INTR_MASK, ~dmask);
166 + 			xe_mmio_write32(mmio, XEHPC_BCS1_BCS2_INTR_MASK, ~dmask);
167 167 		if (bcs_mask & (BIT(3)|BIT(4)))
168 - 			xe_mmio_write32(gt, XEHPC_BCS3_BCS4_INTR_MASK, ~dmask);
168 + 			xe_mmio_write32(mmio, XEHPC_BCS3_BCS4_INTR_MASK, ~dmask);
169 169 		if (bcs_mask & (BIT(5)|BIT(6)))
170 - 			xe_mmio_write32(gt, XEHPC_BCS5_BCS6_INTR_MASK, ~dmask);
170 + 			xe_mmio_write32(mmio, XEHPC_BCS5_BCS6_INTR_MASK, ~dmask);
171 171 		if (bcs_mask & (BIT(7)|BIT(8)))
172 - 			xe_mmio_write32(gt, XEHPC_BCS7_BCS8_INTR_MASK, ~dmask);
172 + 			xe_mmio_write32(mmio, XEHPC_BCS7_BCS8_INTR_MASK, ~dmask);
173 173 		if (ccs_mask & (BIT(0)|BIT(1)))
174 - 			xe_mmio_write32(gt, CCS0_CCS1_INTR_MASK, ~dmask);
174 + 			xe_mmio_write32(mmio, CCS0_CCS1_INTR_MASK, ~dmask);
175 175 		if (ccs_mask & (BIT(2)|BIT(3)))
176 - 			xe_mmio_write32(gt, CCS2_CCS3_INTR_MASK, ~dmask);
176 + 			xe_mmio_write32(mmio, CCS2_CCS3_INTR_MASK, ~dmask);
177 177 	}
178 178 
179 179 	if (xe_gt_is_media_type(gt) || MEDIA_VER(xe) < 13) {
180 180 		/* Enable interrupts for each engine class */
181 - 		xe_mmio_write32(gt, VCS_VECS_INTR_ENABLE, dmask);
181 + 		xe_mmio_write32(mmio, VCS_VECS_INTR_ENABLE, dmask);
182 182 
183 183 		/* Unmask interrupts for each engine instance */
184 - 		xe_mmio_write32(gt, VCS0_VCS1_INTR_MASK, ~dmask);
185 - 		xe_mmio_write32(gt, VCS2_VCS3_INTR_MASK, ~dmask);
186 - 		xe_mmio_write32(gt, VECS0_VECS1_INTR_MASK, ~dmask);
184 + 		xe_mmio_write32(mmio, VCS0_VCS1_INTR_MASK, ~dmask);
185 + 		xe_mmio_write32(mmio, VCS2_VCS3_INTR_MASK, ~dmask);
186 + 		xe_mmio_write32(mmio, VECS0_VECS1_INTR_MASK, ~dmask);
187 187 
188 188 		/*
189 189 		 * the heci2 interrupt is enabled via the same register as the
···
197 197 	}
198 198 
199 199 	if (gsc_mask) {
200 - 		xe_mmio_write32(gt, GUNIT_GSC_INTR_ENABLE, gsc_mask | heci_mask);
201 - 		xe_mmio_write32(gt, GUNIT_GSC_INTR_MASK, ~gsc_mask);
200 + 		xe_mmio_write32(mmio, GUNIT_GSC_INTR_ENABLE, gsc_mask | heci_mask);
201 + 		xe_mmio_write32(mmio, GUNIT_GSC_INTR_MASK, ~gsc_mask);
202 202 	}
203 203 	if (heci_mask)
204 - 		xe_mmio_write32(gt, HECI2_RSVD_INTR_MASK, ~(heci_mask << 16));
204 + 		xe_mmio_write32(mmio, HECI2_RSVD_INTR_MASK, ~(heci_mask << 16));
205 205 }
206 206 }
207 207 
208 208 static u32
209 209 gt_engine_identity(struct xe_device *xe,
210 - 		   struct xe_gt *mmio,
210 + 		   struct xe_mmio *mmio,
211 211 		   const unsigned int bank,
212 212 		   const unsigned int bit)
213 213 {
···
279 279 		return tile->media_gt;
280 280 	default:
281 281 		break;
282 - 	};
282 + 	}
283 283 	fallthrough;
284 284 	default:
285 285 		return tile->primary_gt;
···
291 291 		       u32 *identity)
292 292 {
293 293 	struct xe_device *xe = tile_to_xe(tile);
294 - 	struct xe_gt *mmio = tile->primary_gt;
294 + 	struct xe_mmio *mmio = &tile->mmio;
295 295 	unsigned int bank, bit;
296 296 	u16 instance, intr_vec;
297 297 	enum xe_engine_class class;
···
376 376 
377 377 static u32 dg1_intr_disable(struct xe_device *xe)
378 378 {
379 - 	struct xe_gt *mmio = xe_root_mmio_gt(xe);
379 + 	struct xe_mmio *mmio = xe_root_tile_mmio(xe);
380 380 	u32 val;
381 381 
382 382 	/* First disable interrupts */
···
394 394 
395 395 static void dg1_intr_enable(struct xe_device *xe, bool stall)
396 396 {
397 - 	struct xe_gt *mmio = xe_root_mmio_gt(xe);
397 + 	struct xe_mmio *mmio = xe_root_tile_mmio(xe);
398 398 
399 399 	xe_mmio_write32(mmio, DG1_MSTR_TILE_INTR, DG1_MSTR_IRQ);
400 400 	if (stall)
···
431 431 	}
432 432 
433 433 	for_each_tile(tile, xe, id) {
434 - 		struct xe_gt *mmio = tile->primary_gt;
434 + 		struct xe_mmio *mmio = &tile->mmio;
435 435 
436 436 		if ((master_tile_ctl & DG1_MSTR_TILE(tile->id)) == 0)
437 437 			continue;
···
474 474 
475 475 static void gt_irq_reset(struct xe_tile *tile)
476 476 {
477 - 	struct xe_gt *mmio = tile->primary_gt;
477 + 	struct xe_mmio *mmio = &tile->mmio;
478 478 
479 479 	u32 ccs_mask = xe_hw_engine_mask_per_class(tile->primary_gt,
480 480 						   XE_ENGINE_CLASS_COMPUTE);
···
504 504 	if (ccs_mask & (BIT(0)|BIT(1)))
505 505 		xe_mmio_write32(mmio, CCS0_CCS1_INTR_MASK, ~0);
506 506 	if (ccs_mask & (BIT(2)|BIT(3)))
507 - 		xe_mmio_write32(mmio, CCS2_CCS3_INTR_MASK, ~0);
507 + 		xe_mmio_write32(mmio, CCS2_CCS3_INTR_MASK, ~0);
508 508 
509 509 	if ((tile->media_gt &&
510 510 	     xe_hw_engine_mask_per_class(tile->media_gt, XE_ENGINE_CLASS_OTHER)) ||
···
547 547 
548 548 static void dg1_irq_reset_mstr(struct xe_tile *tile)
549 549 {
550 - 	struct xe_gt *mmio = tile->primary_gt;
550 + 	struct xe_mmio *mmio = &tile->mmio;
551 551 
552 552 	xe_mmio_write32(mmio, GFX_MSTR_IRQ, ~0);
553 553 }
···
566 566 
567 567 	for_each_tile(tile, xe, id) {
568 568 		if (xe_device_has_memirq(xe))
569 - 			xe_memirq_reset(&tile->sriov.vf.memirq);
569 + 			xe_memirq_reset(&tile->memirq);
570 570 		else
571 571 			gt_irq_reset(tile);
572 572 	}
···
609 609 
610 610 	for_each_tile(tile, xe, id)
611 611 		if (xe_device_has_memirq(xe))
612 - 			xe_memirq_postinstall(&tile->sriov.vf.memirq);
612 + 			xe_memirq_postinstall(&tile->memirq);
613 613 
614 614 	if (GRAPHICS_VERx100(xe) < 1210)
615 615 		xelp_intr_enable(xe, true);
···
652 652 	spin_unlock(&xe->irq.lock);
653 653 
654 654 	for_each_tile(tile, xe, id)
655 - 		xe_memirq_handler(&tile->sriov.vf.memirq);
655 + 		xe_memirq_handler(&tile->memirq);
656 656 
657 657 	return IRQ_HANDLED;
658 658 }
+1 -1
drivers/gpu/drm/xe/xe_lmtt.c
···
193 193 	lmtt_assert(lmtt, xe_bo_is_vram(lmtt->pd->bo));
194 194 	lmtt_assert(lmtt, IS_ALIGNED(offset, SZ_64K));
195 195 
196 - 	xe_mmio_write32(tile->primary_gt,
196 + 	xe_mmio_write32(&tile->mmio,
197 197 			GRAPHICS_VER(xe) >= 20 ? XE2_LMEM_CFG : LMEM_CFG,
198 198 			LMEM_EN | REG_FIELD_PREP(LMTT_DIR_PTR, offset / SZ_64K));
199 199 }
+4 -22
drivers/gpu/drm/xe/xe_lrc.c
···
38 38 
39 39 #define LRC_INDIRECT_RING_STATE_SIZE	SZ_4K
40 40 
41 - struct xe_lrc_snapshot {
42 - 	struct xe_bo *lrc_bo;
43 - 	void *lrc_snapshot;
44 - 	unsigned long lrc_size, lrc_offset;
45 - 
46 - 	u32 context_desc;
47 - 	u32 indirect_context_desc;
48 - 	u32 head;
49 - 	struct {
50 - 		u32 internal;
51 - 		u32 memory;
52 - 	} tail;
53 - 	u32 start_seqno;
54 - 	u32 seqno;
55 - 	u32 ctx_timestamp;
56 - 	u32 ctx_job_timestamp;
57 - };
58 - 
59 41 static struct xe_device *
60 42 lrc_to_xe(struct xe_lrc *lrc)
61 43 {
···
581 599 
582 600 static void set_memory_based_intr(u32 *regs, struct xe_hw_engine *hwe)
583 601 {
584 - 	struct xe_memirq *memirq = &gt_to_tile(hwe->gt)->sriov.vf.memirq;
602 + 	struct xe_memirq *memirq = &gt_to_tile(hwe->gt)->memirq;
585 603 	struct xe_device *xe = gt_to_xe(hwe->gt);
586 604 
587 - 	if (!IS_SRIOV_VF(xe) || !xe_device_has_memirq(xe))
605 + 	if (!xe_device_uses_memirq(xe))
588 606 		return;
589 607 
590 608 	regs[CTX_LRM_INT_MASK_ENABLE] = MI_LOAD_REGISTER_MEM |
···
595 613 	regs[CTX_LRI_INT_REPORT_PTR] = MI_LOAD_REGISTER_IMM | MI_LRI_NUM_REGS(2) |
596 614 				       MI_LRI_LRM_CS_MMIO | MI_LRI_FORCE_POSTED;
597 615 	regs[CTX_INT_STATUS_REPORT_REG] = RING_INT_STATUS_RPT_PTR(0).addr;
598 - 	regs[CTX_INT_STATUS_REPORT_PTR] = xe_memirq_status_ptr(memirq);
616 + 	regs[CTX_INT_STATUS_REPORT_PTR] = xe_memirq_status_ptr(memirq, hwe);
599 617 	regs[CTX_INT_SRC_REPORT_REG] = RING_INT_SRC_RPT_PTR(0).addr;
600 - 	regs[CTX_INT_SRC_REPORT_PTR] = xe_memirq_source_ptr(memirq);
618 + 	regs[CTX_INT_SRC_REPORT_PTR] = xe_memirq_source_ptr(memirq, hwe);
601 619 }
602 620 
603 621 static int lrc_ring_mi_mode(struct xe_hw_engine *hwe)
+18 -1
drivers/gpu/drm/xe/xe_lrc.h
···
17 17 struct xe_gt;
18 18 struct xe_hw_engine;
19 19 struct xe_lrc;
20 - struct xe_lrc_snapshot;
21 20 struct xe_vm;
21 + 
22 + struct xe_lrc_snapshot {
23 + 	struct xe_bo *lrc_bo;
24 + 	void *lrc_snapshot;
25 + 	unsigned long lrc_size, lrc_offset;
26 + 
27 + 	u32 context_desc;
28 + 	u32 indirect_context_desc;
29 + 	u32 head;
30 + 	struct {
31 + 		u32 internal;
32 + 		u32 memory;
33 + 	} tail;
34 + 	u32 start_seqno;
35 + 	u32 seqno;
36 + 	u32 ctx_timestamp;
37 + 	u32 ctx_job_timestamp;
38 + };
22 39 
23 40 #define LRC_PPHWSP_SCRATCH_ADDR	(0x34 * 4)
24 41 
+143 -60
drivers/gpu/drm/xe/xe_memirq.c
···
5 5 
6 6 #include <drm/drm_managed.h>
7 7 
8 - #include "regs/xe_gt_regs.h"
9 8 #include "regs/xe_guc_regs.h"
9 + #include "regs/xe_irq_regs.h"
10 10 #include "regs/xe_regs.h"
11 11 
12 12 #include "xe_assert.h"
···
19 19 #include "xe_hw_engine.h"
20 20 #include "xe_map.h"
21 21 #include "xe_memirq.h"
22 - #include "xe_sriov.h"
23 - #include "xe_sriov_printk.h"
24 22 
25 23 #define memirq_assert(m, condition)	xe_tile_assert(memirq_to_tile(m), condition)
26 - #define memirq_debug(m, msg...)	xe_sriov_dbg_verbose(memirq_to_xe(m), "MEMIRQ: " msg)
24 + #define memirq_printk(m, _level, _fmt, ...)			\
25 + 	drm_##_level(&memirq_to_xe(m)->drm, "MEMIRQ%u: " _fmt,	\
26 + 		     memirq_to_tile(m)->id, ##__VA_ARGS__)
27 + 
28 + #ifdef CONFIG_DRM_XE_DEBUG_MEMIRQ
29 + #define memirq_debug(m, _fmt, ...)	memirq_printk(m, dbg, _fmt, ##__VA_ARGS__)
30 + #else
31 + #define memirq_debug(...)
32 + #endif
33 + 
34 + #define memirq_err(m, _fmt, ...)	memirq_printk(m, err, _fmt, ##__VA_ARGS__)
35 + #define memirq_err_ratelimited(m, _fmt, ...)	\
36 + 	memirq_printk(m, err_ratelimited, _fmt, ##__VA_ARGS__)
27 37 
28 38 static struct xe_tile *memirq_to_tile(struct xe_memirq *memirq)
29 39 {
30 - 	return container_of(memirq, struct xe_tile, sriov.vf.memirq);
40 + 	return container_of(memirq, struct xe_tile, memirq);
31 41 }
32 42 
33 43 static struct xe_device *memirq_to_xe(struct xe_memirq *memirq)
···
115 105  *   |           |
116 106  *   |           |
117 107  *   +-----------+
108 +  *
109 +  *
110 +  * MSI-X use case
111 +  *
112 +  * When using MSI-X, hw engines report interrupt status and source to engine
113 +  * instance 0. For this scenario, in order to differentiate between the
114 +  * engines, we need to pass different status/source pointers in the LRC.
115 +  *
116 +  * The requirements on those pointers are:
117 +  * - Interrupt status should be 4KiB aligned
118 +  * - Interrupt source should be 64 bytes aligned
119 +  *
120 +  * To accommodate this, we duplicate the memirq page layout above -
121 +  * allocating a page for each engine instance and pass this page in the LRC.
122 +  * Note that the same page can be reused for different engine types.
123 +  * For example, an LRC executing on CCS #x will have pointers to page #x,
124 +  * and an LRC executing on BCS #x will have the same pointers.
125 +  *
126 +  * ::
127 +  *
128 +  *   0x0000   +==============================+ <== page for instance 0 (BCS0, CCS0, etc.)
129 +  *            | Interrupt Status Report Page |
130 +  *   0x0400   +==============================+
131 +  *            | Interrupt Source Report Page |
132 +  *   0x0440   +==============================+
133 +  *            | Interrupt Enable Mask        |
134 +  *            +==============================+
135 +  *            | Not used                     |
136 +  *   0x1000   +==============================+ <== page for instance 1 (BCS1, CCS1, etc.)
137 +  *            | Interrupt Status Report Page |
138 +  *   0x1400   +==============================+
139 +  *            | Interrupt Source Report Page |
140 +  *   0x1440   +==============================+
141 +  *            | Not used                     |
142 +  *   0x2000   +==============================+ <== page for instance 2 (BCS2, CCS2, etc.)
143 +  *            | ...                          |
144 +  *            +==============================+
145 +  *
118 146  */
119 147 
120 148 static void __release_xe_bo(struct drm_device *drm, void *arg)
···
162 114 	xe_bo_unpin_map_no_vm(bo);
163 115 }
164 116 
117 + static inline bool hw_reports_to_instance_zero(struct xe_memirq *memirq)
118 + {
119 + 	/*
120 + 	 * When the HW engines are configured to use MSI-X,
121 + 	 * they report interrupt status and source to the offset of
122 + 	 * engine instance 0.
123 + 	 */
124 + 	return xe_device_has_msix(memirq_to_xe(memirq));
125 + }
126 + 
165 127 static int memirq_alloc_pages(struct xe_memirq *memirq)
166 128 {
167 129 	struct xe_device *xe = memirq_to_xe(memirq);
168 130 	struct xe_tile *tile = memirq_to_tile(memirq);
131 + 	size_t bo_size = hw_reports_to_instance_zero(memirq) ?
132 + 		XE_HW_ENGINE_MAX_INSTANCE * SZ_4K : SZ_4K;
169 133 	struct xe_bo *bo;
170 134 	int err;
171 135 
172 - 	BUILD_BUG_ON(!IS_ALIGNED(XE_MEMIRQ_SOURCE_OFFSET, SZ_64));
173 - 	BUILD_BUG_ON(!IS_ALIGNED(XE_MEMIRQ_STATUS_OFFSET, SZ_4K));
136 + 	BUILD_BUG_ON(!IS_ALIGNED(XE_MEMIRQ_SOURCE_OFFSET(0), SZ_64));
137 + 	BUILD_BUG_ON(!IS_ALIGNED(XE_MEMIRQ_STATUS_OFFSET(0), SZ_4K));
174 138 
175 139 	/* XXX: convert to managed bo */
176 - 	bo = xe_bo_create_pin_map(xe, tile, NULL, SZ_4K,
140 + 	bo = xe_bo_create_pin_map(xe, tile, NULL, bo_size,
177 141 				  ttm_bo_type_kernel,
178 142 				  XE_BO_FLAG_SYSTEM |
179 143 				  XE_BO_FLAG_GGTT |
···
200 140 	memirq_assert(memirq, !xe_bo_is_vram(bo));
201 141 	memirq_assert(memirq, !memirq->bo);
202 142 
203 - 	iosys_map_memset(&bo->vmap, 0, 0, SZ_4K);
143 + 	iosys_map_memset(&bo->vmap, 0, 0, bo_size);
204 144 
205 145 	memirq->bo = bo;
206 - 	memirq->source = IOSYS_MAP_INIT_OFFSET(&bo->vmap, XE_MEMIRQ_SOURCE_OFFSET);
207 - 	memirq->status = IOSYS_MAP_INIT_OFFSET(&bo->vmap, XE_MEMIRQ_STATUS_OFFSET);
146 + 	memirq->source = IOSYS_MAP_INIT_OFFSET(&bo->vmap, XE_MEMIRQ_SOURCE_OFFSET(0));
147 + 	memirq->status = IOSYS_MAP_INIT_OFFSET(&bo->vmap, XE_MEMIRQ_STATUS_OFFSET(0));
208 148 	memirq->mask = IOSYS_MAP_INIT_OFFSET(&bo->vmap, XE_MEMIRQ_ENABLE_OFFSET);
209 149 
210 150 	memirq_assert(memirq, !memirq->source.is_iomem);
211 151 	memirq_assert(memirq, !memirq->status.is_iomem);
212 152 	memirq_assert(memirq, !memirq->mask.is_iomem);
213 153 
214 - 	memirq_debug(memirq, "page offsets: source %#x status %#x\n",
215 - 		     xe_memirq_source_ptr(memirq), xe_memirq_status_ptr(memirq));
154 + 	memirq_debug(memirq, "page offsets: bo %#x bo_size %zu source %#x status %#x\n",
155 + 		     xe_bo_ggtt_addr(bo), bo_size, XE_MEMIRQ_SOURCE_OFFSET(0),
156 + 		     XE_MEMIRQ_STATUS_OFFSET(0));
216 157 
217 158 	return drmm_add_action_or_reset(&xe->drm, __release_xe_bo, memirq->bo);
218 159 
219 160 out:
220 - 	xe_sriov_err(memirq_to_xe(memirq),
221 - 		     "Failed to allocate memirq page (%pe)\n", ERR_PTR(err));
161 + 	memirq_err(memirq, "Failed to allocate memirq page (%pe)\n", ERR_PTR(err));
222 162 	return err;
223 163 }
···
238 178  *
239 179  * These allocations are managed and will be implicitly released on unload.
240 180  *
241 - * Note: This function shall be called only by the VF driver.
242 - *
243 - * If this function fails then VF driver won't be able to operate correctly.
181 + * If this function fails then the driver won't be able to operate correctly.
244 182  * If `Memory Based Interrupts`_ are not used this function will return 0.
245 183  *
246 184  * Return: 0 on success or a negative error code on failure.
···
248 190 	struct xe_device *xe = memirq_to_xe(memirq);
249 191 	int err;
250 192 
251 - 	memirq_assert(memirq, IS_SRIOV_VF(xe));
252 - 
253 - 	if (!xe_device_has_memirq(xe))
193 + 	if (!xe_device_uses_memirq(xe))
254 194 		return 0;
255 195 
256 196 	err = memirq_alloc_pages(memirq);
···
261 205 	return 0;
262 206 }
263 207 
208 + static u32 __memirq_source_page(struct xe_memirq *memirq, u16 instance)
209 + {
210 + 	memirq_assert(memirq, instance <= XE_HW_ENGINE_MAX_INSTANCE);
211 + 	memirq_assert(memirq, memirq->bo);
212 + 
213 + 	instance = hw_reports_to_instance_zero(memirq) ? instance : 0;
214 + 	return xe_bo_ggtt_addr(memirq->bo) + XE_MEMIRQ_SOURCE_OFFSET(instance);
215 + }
216 + 
264 217 /**
265 218  * xe_memirq_source_ptr - Get GGTT's offset of the `Interrupt Source Report Page`_.
266 219  * @memirq: the &xe_memirq to query
220 +  * @hwe: the hw engine for which we want the report page
267 221  *
268 - * Shall be called only on VF driver when `Memory Based Interrupts`_ are used
222 +  * Shall be called when `Memory Based Interrupts`_ are used
269 223  * and xe_memirq_init() didn't fail.
270 224  *
271 225  * Return: GGTT's offset of the `Interrupt Source Report Page`_.
272 226  */
273 - u32 xe_memirq_source_ptr(struct xe_memirq *memirq)
227 + u32 xe_memirq_source_ptr(struct xe_memirq *memirq, struct xe_hw_engine *hwe)
274 228 {
275 - 	memirq_assert(memirq, IS_SRIOV_VF(memirq_to_xe(memirq)));
276 - 	memirq_assert(memirq, xe_device_has_memirq(memirq_to_xe(memirq)));
229 + 	memirq_assert(memirq, xe_device_uses_memirq(memirq_to_xe(memirq)));
230 + 
231 + 	return __memirq_source_page(memirq, hwe->instance);
232 + }
233 + 
234 + static u32 __memirq_status_page(struct xe_memirq *memirq, u16 instance)
235 + {
236 + 	memirq_assert(memirq, instance <= XE_HW_ENGINE_MAX_INSTANCE);
277 237 	memirq_assert(memirq, memirq->bo);
278 238 
279 - 	return xe_bo_ggtt_addr(memirq->bo) + XE_MEMIRQ_SOURCE_OFFSET;
239 + 	instance = hw_reports_to_instance_zero(memirq) ? instance : 0;
240 + 	return xe_bo_ggtt_addr(memirq->bo) + XE_MEMIRQ_STATUS_OFFSET(instance);
280 241 }
281 242 
282 243 /**
283 244  * xe_memirq_status_ptr - Get GGTT's offset of the `Interrupt Status Report Page`_.
284 245  * @memirq: the &xe_memirq to query
246 +  * @hwe: the hw engine for which we want the report page
285 247  *
286 - * Shall be called only on VF driver when `Memory Based Interrupts`_ are used
248 +  * Shall be called when `Memory Based Interrupts`_ are used
287 249  * and xe_memirq_init() didn't fail.
288 250  *
289 251  * Return: GGTT's offset of the `Interrupt Status Report Page`_.
290 252  */
291 - u32 xe_memirq_status_ptr(struct xe_memirq *memirq)
253 + u32 xe_memirq_status_ptr(struct xe_memirq *memirq, struct xe_hw_engine *hwe)
292 254 {
293 - 	memirq_assert(memirq, IS_SRIOV_VF(memirq_to_xe(memirq)));
294 - 	memirq_assert(memirq, xe_device_has_memirq(memirq_to_xe(memirq)));
295 - 	memirq_assert(memirq, memirq->bo);
255 + 	memirq_assert(memirq, xe_device_uses_memirq(memirq_to_xe(memirq)));
296 256 
297 - 	return xe_bo_ggtt_addr(memirq->bo) + XE_MEMIRQ_STATUS_OFFSET;
257 + 	return __memirq_status_page(memirq, hwe->instance);
298 258 }
299 259 
300 260 /**
301 261  * xe_memirq_enable_ptr - Get GGTT's offset of the Interrupt Enable Mask.
302 262  * @memirq: the &xe_memirq to query
303 263  *
304 - * Shall be called only on VF driver when `Memory Based Interrupts`_ are used
264 +  * Shall be called when `Memory Based Interrupts`_ are used
305 265  * and xe_memirq_init() didn't fail.
306 266  *
307 267  * Return: GGTT's offset of the Interrupt Enable Mask.
308 268  */
309 269 u32 xe_memirq_enable_ptr(struct xe_memirq *memirq)
310 270 {
311 - 	memirq_assert(memirq, IS_SRIOV_VF(memirq_to_xe(memirq)));
312 - 	memirq_assert(memirq, xe_device_has_memirq(memirq_to_xe(memirq)));
271 + 	memirq_assert(memirq, xe_device_uses_memirq(memirq_to_xe(memirq)));
313 272 	memirq_assert(memirq, memirq->bo);
314 273 
315 274 	return xe_bo_ggtt_addr(memirq->bo) + XE_MEMIRQ_ENABLE_OFFSET;
···
338 267  * Register `Interrupt Source Report Page`_ and `Interrupt Status Report Page`_
339 268  * to be used by the GuC when `Memory Based Interrupts`_ are required.
340 269  *
341 - * Shall be called only on VF driver when `Memory Based Interrupts`_ are used
270 +  * Shall be called when `Memory Based Interrupts`_ are used
342 271  * and xe_memirq_init() didn't fail.
343 272  *
344 273  * Return: 0 on success or a negative error code on failure.
···
350 279 	u32 source, status;
351 280 	int err;
352 281 
353 - 	memirq_assert(memirq, IS_SRIOV_VF(memirq_to_xe(memirq)));
354 - 	memirq_assert(memirq, xe_device_has_memirq(memirq_to_xe(memirq)));
355 - 	memirq_assert(memirq, memirq->bo);
282 + 	memirq_assert(memirq, xe_device_uses_memirq(memirq_to_xe(memirq)));
356 283 
357 - 	source = xe_memirq_source_ptr(memirq) + offset;
358 - 	status = xe_memirq_status_ptr(memirq) + offset * SZ_16;
284 + 	source = __memirq_source_page(memirq, 0) + offset;
285 + 	status = __memirq_status_page(memirq, 0) + offset * SZ_16;
359 286 
360 287 	err = xe_guc_self_cfg64(guc, GUC_KLV_SELF_CFG_MEMIRQ_SOURCE_ADDR_KEY,
361 288 				source);
···
368 299 	return 0;
369 300 
370 301 failed:
371 - 	xe_sriov_err(memirq_to_xe(memirq),
372 - 		     "Failed to setup report pages in %s (%pe)\n",
373 - 		     guc_name(guc), ERR_PTR(err));
302 + 	memirq_err(memirq, "Failed to setup report pages in %s (%pe)\n",
303 + 		   guc_name(guc), ERR_PTR(err));
374 304 	return err;
375 305 }
···
379 311  *
380 312  * This is part of the driver IRQ setup flow.
381 313  *
382 - * This function shall only be used by the VF driver on platforms that use
314 +  * This function shall only be used on platforms that use
383 315  * `Memory Based Interrupts`_.
384 316  */
385 317 void xe_memirq_reset(struct xe_memirq *memirq)
386 318 {
387 - 	memirq_assert(memirq, IS_SRIOV_VF(memirq_to_xe(memirq)));
388 - 	memirq_assert(memirq, xe_device_has_memirq(memirq_to_xe(memirq)));
319 + 	memirq_assert(memirq, xe_device_uses_memirq(memirq_to_xe(memirq)));
389 320 
390 321 	if (memirq->bo)
391 322 		memirq_set_enable(memirq, false);
···
396 329  *
397 330  * This is part of the driver IRQ setup flow.
398 331  *
399 - * This function shall only be used by the VF driver on platforms that use
332 +  * This function shall only be used on platforms that use
400 333  * `Memory Based Interrupts`_.
401 334  */
402 335 void xe_memirq_postinstall(struct xe_memirq *memirq)
403 336 {
404 - 	memirq_assert(memirq, IS_SRIOV_VF(memirq_to_xe(memirq)));
405 - 	memirq_assert(memirq, xe_device_has_memirq(memirq_to_xe(memirq)));
337 + 	memirq_assert(memirq, xe_device_uses_memirq(memirq_to_xe(memirq)));
406 338 
407 339 	if (memirq->bo)
408 340 		memirq_set_enable(memirq, true);
···
415 349 	value = iosys_map_rd(vector, offset, u8);
416 350 	if (value) {
417 351 		if (value != 0xff)
418 - 			xe_sriov_err_ratelimited(memirq_to_xe(memirq),
419 - 						 "Unexpected memirq value %#x from %s at %u\n",
420 - 						 value, name, offset);
352 + 			memirq_err_ratelimited(memirq,
353 + 					       "Unexpected memirq value %#x from %s at %u\n",
354 + 					       value, name, offset);
421 355 		iosys_map_wr(vector, offset, u8, 0x00);
422 356 	}
423 357 
···
442 376 
443 377 	if (memirq_received(memirq, status, ilog2(GUC_INTR_GUC2HOST), name))
444 378 		xe_guc_irq_handler(guc, GUC_INTR_GUC2HOST);
379 + }
380 + 
381 + /**
382 +  * xe_memirq_hwe_handler - Check and process interrupts for a specific HW engine.
383 +  * @memirq: the &xe_memirq
384 +  * @hwe: the hw engine to process
385 +  *
386 +  * This function reads and dispatches `Memory Based Interrupts` for the provided HW engine.
387 +  */
388 + void xe_memirq_hwe_handler(struct xe_memirq *memirq, struct xe_hw_engine *hwe)
389 + {
390 + 	u16 offset = hwe->irq_offset;
391 + 	u16 instance = hw_reports_to_instance_zero(memirq) ? hwe->instance : 0;
392 + 	struct iosys_map src_offset = IOSYS_MAP_INIT_OFFSET(&memirq->bo->vmap,
393 + 							    XE_MEMIRQ_SOURCE_OFFSET(instance));
394 + 
395 + 	if (memirq_received(memirq, &src_offset, offset, "SRC")) {
396 + 		struct iosys_map status_offset =
397 + 			IOSYS_MAP_INIT_OFFSET(&memirq->bo->vmap,
398 + 					      XE_MEMIRQ_STATUS_OFFSET(instance) + offset * SZ_16);
399 + 		memirq_dispatch_engine(memirq, &status_offset, hwe);
400 + 	}
445 401 }
446 402 
447 403 /**
···
493 405 		if (gt->tile != tile)
494 406 			continue;
495 407 
496 - 		for_each_hw_engine(hwe, gt, id) {
497 - 			if (memirq_received(memirq, &memirq->source, hwe->irq_offset, "SRC")) {
498 - 				map = IOSYS_MAP_INIT_OFFSET(&memirq->status,
499 - 							    hwe->irq_offset * SZ_16);
500 - 				memirq_dispatch_engine(memirq, &map, hwe);
501 - 			}
502 - 		}
408 + 		for_each_hw_engine(hwe, gt, id)
409 + 			xe_memirq_hwe_handler(memirq, hwe);
503 410 	}
504 411 
505 412 	/* GuC and media GuC (if present) must be checked separately */
+4 -2
drivers/gpu/drm/xe/xe_memirq.h
···
9 9 #include <linux/types.h>
10 10 
11 11 struct xe_guc;
12 + struct xe_hw_engine;
12 13 struct xe_memirq;
13 14 
14 15 int xe_memirq_init(struct xe_memirq *memirq);
15 16 
16 - u32 xe_memirq_source_ptr(struct xe_memirq *memirq);
17 - u32 xe_memirq_status_ptr(struct xe_memirq *memirq);
17 + u32 xe_memirq_source_ptr(struct xe_memirq *memirq, struct xe_hw_engine *hwe);
18 + u32 xe_memirq_status_ptr(struct xe_memirq *memirq, struct xe_hw_engine *hwe);
18 19 u32 xe_memirq_enable_ptr(struct xe_memirq *memirq);
19 20 
20 21 void xe_memirq_reset(struct xe_memirq *memirq);
21 22 void xe_memirq_postinstall(struct xe_memirq *memirq);
23 + void xe_memirq_hwe_handler(struct xe_memirq *memirq, struct xe_hw_engine *hwe);
22 24 void xe_memirq_handler(struct xe_memirq *memirq);
23 25 
24 26 int xe_memirq_init_guc(struct xe_memirq *memirq, struct xe_guc *guc);
+2 -2
drivers/gpu/drm/xe/xe_memirq_types.h
···
11 11 struct xe_bo;
12 12 
13 13 /* ISR */
14 - #define XE_MEMIRQ_STATUS_OFFSET		0x0
14 + #define XE_MEMIRQ_STATUS_OFFSET(inst)	((inst) * SZ_4K + 0x0)
15 15 /* IIR */
16 - #define XE_MEMIRQ_SOURCE_OFFSET		0x400
16 + #define XE_MEMIRQ_SOURCE_OFFSET(inst)	((inst) * SZ_4K + 0x400)
17 17 /* IMR */
18 18 #define XE_MEMIRQ_ENABLE_OFFSET		0x440
19 19 
+71 -68
drivers/gpu/drm/xe/xe_mmio.c
···
36 36 /*
37 37  * On multi-tile devices, partition the BAR space for MMIO on each tile,
38 38  * possibly accounting for register override on the number of tiles available.
39 +  * tile_mmio_size contains both the tile's 4MB register space, as well as
40 +  * additional space for the GTT and other (possibly unused) regions).
39 41  * Resulting memory layout is like below:
40 42  *
41 43  * .----------------------. <- tile_count * tile_mmio_size
42 44  * |         ....         |
43 45  * |----------------------| <- 2 * tile_mmio_size
46 +  * |  tile1 GTT + other   |
47 +  * |----------------------| <- 1 * tile_mmio_size + 4MB
44 48  * |   tile1->mmio.regs   |
45 49  * |----------------------| <- 1 * tile_mmio_size
50 +  * |  tile0 GTT + other   |
51 +  * |----------------------| <- 4MB
46 52  * |   tile0->mmio.regs   |
47 53  * '----------------------' <- 0MB
48 54  */
···
67 61 
68 62 	/* Possibly override number of tile based on configuration register */
69 63 	if (!xe->info.skip_mtcfg) {
70 - 		struct xe_gt *gt = xe_root_mmio_gt(xe);
64 + 		struct xe_mmio *mmio = xe_root_tile_mmio(xe);
71 65 		u8 tile_count;
72 66 		u32 mtcfg;
73 67 
74 68 		/*
75 69 		 * Although the per-tile mmio regs are not yet initialized, this
76 - 		 * is fine as it's going to the root gt, that's guaranteed to be
77 - 		 * initialized earlier in xe_mmio_init()
70 + 		 * is fine as it's going to the root tile's mmio, that's
71 + 		 * guaranteed to be initialized earlier in xe_mmio_init()
78 72 		 */
79 - 		mtcfg = xe_mmio_read64_2x32(gt, XEHP_MTCFG_ADDR);
73 + 		mtcfg = xe_mmio_read64_2x32(mmio, XEHP_MTCFG_ADDR);
80 74 		tile_count = REG_FIELD_GET(TILE_COUNT, mtcfg) + 1;
81 75 
82 76 		if (tile_count < xe->info.tile_count) {
···
96 90 
97 91 	regs = xe->mmio.regs;
98 92 	for_each_tile(tile, xe, id) {
99 - 		tile->mmio.size = tile_mmio_size;
93 + 		tile->mmio.regs_size = SZ_4M;
100 94 		tile->mmio.regs = regs;
95 + 		tile->mmio.tile = tile;
101 96 		regs += tile_mmio_size;
102 97 	}
103 98 }
···
133 126 
134 127 	regs = xe->mmio.regs + tile_mmio_size * xe->info.tile_count;
135 128 	for_each_tile(tile, xe, id) {
136 - 		tile->mmio_ext.size = tile_mmio_ext_size;
129 + 		tile->mmio_ext.regs_size = tile_mmio_ext_size;
137 130 		tile->mmio_ext.regs = regs;
131 + 		tile->mmio_ext.tile = tile;
138 132 		regs += tile_mmio_ext_size;
139 133 	}
140 134 }
···
165 157 {
166 158 	struct xe_tile *root_tile = xe_device_get_root_tile(xe);
167 159 	struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
168 - 	const int mmio_bar = 0;
169 160 
170 161 	/*
171 162 	 * Map the entire BAR.
172 163 	 * The first 16MB of the BAR, belong to the root tile, and include:
173 164 	 * registers (0-4MB), reserved space (4MB-8MB) and GGTT (8MB-16MB).
174 165 	 */
175 - 	xe->mmio.size = pci_resource_len(pdev, mmio_bar);
176 - 	xe->mmio.regs = pci_iomap(pdev, mmio_bar, GTTMMADR_BAR);
166 + 	xe->mmio.size = pci_resource_len(pdev, GTTMMADR_BAR);
167 + 	xe->mmio.regs = pci_iomap(pdev, GTTMMADR_BAR, 0);
177 168 	if (xe->mmio.regs == NULL) {
178 169 		drm_err(&xe->drm, "failed to map registers\n");
179 170 		return -EIO;
180 171 	}
181 172 
182 173 	/* Setup first tile; other tiles (if present) will be setup later. */
183 - 	root_tile->mmio.size = SZ_16M;
174 + 	root_tile->mmio.regs_size = SZ_4M;
184 175 	root_tile->mmio.regs = xe->mmio.regs;
176 + 	root_tile->mmio.tile = root_tile;
185 177 
186 178 	return devm_add_action_or_reset(xe->drm.dev, mmio_fini, xe);
187 179 }
188 180 
189 - static void mmio_flush_pending_writes(struct xe_gt *gt)
181 + static void mmio_flush_pending_writes(struct xe_mmio *mmio)
190 182 {
191 183 #define DUMMY_REG_OFFSET	0x130030
192 - 	struct xe_tile *tile = gt_to_tile(gt);
193 184 	int i;
194 185 
195 - 	if (tile->xe->info.platform != XE_LUNARLAKE)
186 + 	if (mmio->tile->xe->info.platform != XE_LUNARLAKE)
196 187 		return;
197 188 
198 189 	/* 4 dummy writes */
199 190 	for (i = 0; i < 4; i++)
200 - 		writel(0, tile->mmio.regs + DUMMY_REG_OFFSET);
191 + 		writel(0, mmio->regs + DUMMY_REG_OFFSET);
201 192 }
202 193 
203 - u8 xe_mmio_read8(struct xe_gt *gt, struct xe_reg reg)
194 + u8 xe_mmio_read8(struct xe_mmio *mmio, struct xe_reg reg)
204 195 {
205 - 	struct xe_tile *tile = gt_to_tile(gt);
206 - 	u32 addr = xe_mmio_adjusted_addr(gt, reg.addr);
196 + 	u32 addr = xe_mmio_adjusted_addr(mmio, reg.addr);
207 197 	u8 val;
208 198 
209 199 	/* Wa_15015404425 */
210 - 	mmio_flush_pending_writes(gt);
200 + 	mmio_flush_pending_writes(mmio);
211 201 
212 - 	val = readb((reg.ext ? tile->mmio_ext.regs : tile->mmio.regs) + addr);
213 - 	trace_xe_reg_rw(gt, false, addr, val, sizeof(val));
202 + 	val = readb(mmio->regs + addr);
203 + 	trace_xe_reg_rw(mmio, false, addr, val, sizeof(val));
214 204 
215 205 	return val;
216 206 }
217 207 
218 - u16 xe_mmio_read16(struct xe_gt *gt, struct xe_reg reg)
208 + u16 xe_mmio_read16(struct xe_mmio *mmio, struct xe_reg reg)
219 209 {
220 - 	struct xe_tile *tile = gt_to_tile(gt);
221 - 	u32 addr = xe_mmio_adjusted_addr(gt, reg.addr);
210 + 	u32 addr = xe_mmio_adjusted_addr(mmio, reg.addr);
222 211 	u16 val;
223 212 
224 213 	/* Wa_15015404425 */
225 - 	mmio_flush_pending_writes(gt);
214 + 	mmio_flush_pending_writes(mmio);
226 215 
227 - 	val = readw((reg.ext ? tile->mmio_ext.regs : tile->mmio.regs) + addr);
228 - 	trace_xe_reg_rw(gt, false, addr, val, sizeof(val));
216 + 	val = readw(mmio->regs + addr);
217 + 	trace_xe_reg_rw(mmio, false, addr, val, sizeof(val));
229 218 
230 219 	return val;
231 220 }
232 221 
233 - void xe_mmio_write32(struct xe_gt *gt, struct xe_reg reg, u32 val)
222 + void xe_mmio_write32(struct xe_mmio *mmio, struct xe_reg reg, u32 val)
234 223 {
235 - 	struct xe_tile *tile = gt_to_tile(gt);
236 - 	u32 addr = xe_mmio_adjusted_addr(gt, reg.addr);
224 + 	u32 addr = xe_mmio_adjusted_addr(mmio, reg.addr);
237 225 
238 - 	trace_xe_reg_rw(gt, true, addr, val, sizeof(val));
226 + 	trace_xe_reg_rw(mmio, true, addr, val, sizeof(val));
239 227 
240 - 	if (!reg.vf && IS_SRIOV_VF(gt_to_xe(gt)))
241 - 		xe_gt_sriov_vf_write32(gt, reg, val);
228 + 	if (!reg.vf && mmio->sriov_vf_gt)
229 + 		xe_gt_sriov_vf_write32(mmio->sriov_vf_gt, reg, val);
242 230 	else
243 - 		writel(val, (reg.ext ? tile->mmio_ext.regs : tile->mmio.regs) + addr);
231 + 		writel(val, mmio->regs + addr);
244 232 }
245 233 
246 - u32 xe_mmio_read32(struct xe_gt *gt, struct xe_reg reg)
234 + u32 xe_mmio_read32(struct xe_mmio *mmio, struct xe_reg reg)
247 235 {
248 - 	struct xe_tile *tile = gt_to_tile(gt);
249 - 	u32 addr = xe_mmio_adjusted_addr(gt, reg.addr);
236 + 	u32 addr = xe_mmio_adjusted_addr(mmio, reg.addr);
250 237 	u32 val;
251 238 
252 239 	/* Wa_15015404425 */
253 - 	mmio_flush_pending_writes(gt);
240 + 	mmio_flush_pending_writes(mmio);
254 241 
255 - 	if (!reg.vf && IS_SRIOV_VF(gt_to_xe(gt)))
256 - 		val = xe_gt_sriov_vf_read32(gt, reg);
242 + 	if (!reg.vf && mmio->sriov_vf_gt)
243 + 		val = xe_gt_sriov_vf_read32(mmio->sriov_vf_gt, reg);
257 244 	else
258 - 		val = readl((reg.ext ? tile->mmio_ext.regs : tile->mmio.regs) + addr);
245 + 		val = readl(mmio->regs + addr);
259 246 
260 - 	trace_xe_reg_rw(gt, false, addr, val, sizeof(val));
247 + 	trace_xe_reg_rw(mmio, false, addr, val, sizeof(val));
261 248 
262 249 	return val;
263 250 }
264 251 
265 - u32 xe_mmio_rmw32(struct xe_gt *gt, struct xe_reg reg, u32 clr, u32 set)
252 + u32 xe_mmio_rmw32(struct xe_mmio *mmio, struct xe_reg reg, u32 clr, u32 set)
266 253 {
267 254 	u32 old, reg_val;
268 255 
269 - 	old = xe_mmio_read32(gt, reg);
256 + 	old = xe_mmio_read32(mmio, reg);
270 257 	reg_val = (old & ~clr) | set;
271 - 	xe_mmio_write32(gt, reg, reg_val);
258 + 	xe_mmio_write32(mmio, reg, reg_val);
272 259 
273 260 	return old;
274 261 }
275 262 
276 - int xe_mmio_write32_and_verify(struct xe_gt *gt,
263 + int xe_mmio_write32_and_verify(struct xe_mmio *mmio,
277 264 			       struct xe_reg reg, u32 val, u32 mask, u32 eval)
278 265 {
279 266 	u32 reg_val;
280 267 
281 - 	xe_mmio_write32(gt, reg, val);
282 - 	reg_val = xe_mmio_read32(gt, reg);
268 + 	xe_mmio_write32(mmio, reg, val);
269 + 	reg_val = xe_mmio_read32(mmio, reg);
283 270 
284 271 	return (reg_val & mask) != eval ? -EINVAL : 0;
285 272 }
286 273 
287 - bool xe_mmio_in_range(const struct xe_gt *gt,
274 + bool xe_mmio_in_range(const struct xe_mmio *mmio,
288 275 		      const struct xe_mmio_range *range,
289 276 		      struct xe_reg reg)
290 277 {
291 - 	u32 addr = xe_mmio_adjusted_addr(gt, reg.addr);
278 + 	u32 addr = xe_mmio_adjusted_addr(mmio, reg.addr);
292 279 
293 280 	return range && addr >= range->start && addr <= range->end;
294 281 }
295 282 
296 283 /**
297 284  * xe_mmio_read64_2x32() - Read a 64-bit register as two 32-bit reads
298 - * @gt: MMIO target GT
285 +  * @mmio: MMIO target
299 286  * @reg: register to read value from
300 287  *
301 288  * Although Intel GPUs have some 64-bit registers, the hardware officially
···
310 307  *
311 308  * Returns the value of the 64-bit register.
312 309 */ 313 - u64 xe_mmio_read64_2x32(struct xe_gt *gt, struct xe_reg reg) 310 + u64 xe_mmio_read64_2x32(struct xe_mmio *mmio, struct xe_reg reg) 314 311 { 315 312 struct xe_reg reg_udw = { .addr = reg.addr + 0x4 }; 316 313 u32 ldw, udw, oldudw, retries; 317 314 318 - reg.addr = xe_mmio_adjusted_addr(gt, reg.addr); 319 - reg_udw.addr = xe_mmio_adjusted_addr(gt, reg_udw.addr); 315 + reg.addr = xe_mmio_adjusted_addr(mmio, reg.addr); 316 + reg_udw.addr = xe_mmio_adjusted_addr(mmio, reg_udw.addr); 320 317 321 318 /* we shouldn't adjust just one register address */ 322 - xe_gt_assert(gt, reg_udw.addr == reg.addr + 0x4); 319 + xe_tile_assert(mmio->tile, reg_udw.addr == reg.addr + 0x4); 323 320 324 - oldudw = xe_mmio_read32(gt, reg_udw); 321 + oldudw = xe_mmio_read32(mmio, reg_udw); 325 322 for (retries = 5; retries; --retries) { 326 - ldw = xe_mmio_read32(gt, reg); 327 - udw = xe_mmio_read32(gt, reg_udw); 323 + ldw = xe_mmio_read32(mmio, reg); 324 + udw = xe_mmio_read32(mmio, reg_udw); 328 325 329 326 if (udw == oldudw) 330 327 break; ··· 332 329 oldudw = udw; 333 330 } 334 331 335 - xe_gt_WARN(gt, retries == 0, 336 - "64-bit read of %#x did not stabilize\n", reg.addr); 332 + drm_WARN(&mmio->tile->xe->drm, retries == 0, 333 + "64-bit read of %#x did not stabilize\n", reg.addr); 337 334 338 335 return (u64)udw << 32 | ldw; 339 336 } 340 337 341 - static int __xe_mmio_wait32(struct xe_gt *gt, struct xe_reg reg, u32 mask, u32 val, u32 timeout_us, 338 + static int __xe_mmio_wait32(struct xe_mmio *mmio, struct xe_reg reg, u32 mask, u32 val, u32 timeout_us, 342 339 u32 *out_val, bool atomic, bool expect_match) 343 340 { 344 341 ktime_t cur = ktime_get_raw(); ··· 349 346 bool check; 350 347 351 348 for (;;) { 352 - read = xe_mmio_read32(gt, reg); 349 + read = xe_mmio_read32(mmio, reg); 353 350 354 351 check = (read & mask) == val; 355 352 if (!expect_match) ··· 375 372 } 376 373 377 374 if (ret != 0) { 378 - read = xe_mmio_read32(gt, reg); 375 + read = xe_mmio_read32(mmio, 
reg); 379 376 380 377 check = (read & mask) == val; 381 378 if (!expect_match) ··· 393 390 394 391 /** 395 392 * xe_mmio_wait32() - Wait for a register to match the desired masked value 396 - * @gt: MMIO target GT 393 + * @mmio: MMIO target 397 394 * @reg: register to read value from 398 395 * @mask: mask to be applied to the value read from the register 399 396 * @val: desired value after applying the mask ··· 410 407 * @timeout_us for different reasons, specially in non-atomic contexts. Thus, 411 408 * it is possible that this function succeeds even after @timeout_us has passed. 412 409 */ 413 - int xe_mmio_wait32(struct xe_gt *gt, struct xe_reg reg, u32 mask, u32 val, u32 timeout_us, 410 + int xe_mmio_wait32(struct xe_mmio *mmio, struct xe_reg reg, u32 mask, u32 val, u32 timeout_us, 414 411 u32 *out_val, bool atomic) 415 412 { 416 - return __xe_mmio_wait32(gt, reg, mask, val, timeout_us, out_val, atomic, true); 413 + return __xe_mmio_wait32(mmio, reg, mask, val, timeout_us, out_val, atomic, true); 417 414 } 418 415 419 416 /** 420 417 * xe_mmio_wait32_not() - Wait for a register to return anything other than the given masked value 421 - * @gt: MMIO target GT 418 + * @mmio: MMIO target 422 419 * @reg: register to read value from 423 420 * @mask: mask to be applied to the value read from the register 424 421 * @val: value not to be matched after applying the mask ··· 429 426 * This function works exactly like xe_mmio_wait32() with the exception that 430 427 * @val is expected not to be matched. 431 428 */ 432 - int xe_mmio_wait32_not(struct xe_gt *gt, struct xe_reg reg, u32 mask, u32 val, u32 timeout_us, 429 + int xe_mmio_wait32_not(struct xe_mmio *mmio, struct xe_reg reg, u32 mask, u32 val, u32 timeout_us, 433 430 u32 *out_val, bool atomic) 434 431 { 435 - return __xe_mmio_wait32(gt, reg, mask, val, timeout_us, out_val, atomic, false); 432 + return __xe_mmio_wait32(mmio, reg, mask, val, timeout_us, out_val, atomic, false); 436 433 }
drivers/gpu/drm/xe/xe_mmio.h (+20 -15)
··· 14 14 int xe_mmio_init(struct xe_device *xe); 15 15 int xe_mmio_probe_tiles(struct xe_device *xe); 16 16 17 - u8 xe_mmio_read8(struct xe_gt *gt, struct xe_reg reg); 18 - u16 xe_mmio_read16(struct xe_gt *gt, struct xe_reg reg); 19 - void xe_mmio_write32(struct xe_gt *gt, struct xe_reg reg, u32 val); 20 - u32 xe_mmio_read32(struct xe_gt *gt, struct xe_reg reg); 21 - u32 xe_mmio_rmw32(struct xe_gt *gt, struct xe_reg reg, u32 clr, u32 set); 22 - int xe_mmio_write32_and_verify(struct xe_gt *gt, struct xe_reg reg, u32 val, u32 mask, u32 eval); 23 - bool xe_mmio_in_range(const struct xe_gt *gt, const struct xe_mmio_range *range, struct xe_reg reg); 17 + u8 xe_mmio_read8(struct xe_mmio *mmio, struct xe_reg reg); 18 + u16 xe_mmio_read16(struct xe_mmio *mmio, struct xe_reg reg); 19 + void xe_mmio_write32(struct xe_mmio *mmio, struct xe_reg reg, u32 val); 20 + u32 xe_mmio_read32(struct xe_mmio *mmio, struct xe_reg reg); 21 + u32 xe_mmio_rmw32(struct xe_mmio *mmio, struct xe_reg reg, u32 clr, u32 set); 22 + int xe_mmio_write32_and_verify(struct xe_mmio *mmio, struct xe_reg reg, u32 val, u32 mask, u32 eval); 23 + bool xe_mmio_in_range(const struct xe_mmio *mmio, const struct xe_mmio_range *range, struct xe_reg reg); 24 24 25 - u64 xe_mmio_read64_2x32(struct xe_gt *gt, struct xe_reg reg); 26 - int xe_mmio_wait32(struct xe_gt *gt, struct xe_reg reg, u32 mask, u32 val, u32 timeout_us, 27 - u32 *out_val, bool atomic); 28 - int xe_mmio_wait32_not(struct xe_gt *gt, struct xe_reg reg, u32 mask, u32 val, u32 timeout_us, 29 - u32 *out_val, bool atomic); 25 + u64 xe_mmio_read64_2x32(struct xe_mmio *mmio, struct xe_reg reg); 26 + int xe_mmio_wait32(struct xe_mmio *mmio, struct xe_reg reg, u32 mask, u32 val, 27 + u32 timeout_us, u32 *out_val, bool atomic); 28 + int xe_mmio_wait32_not(struct xe_mmio *mmio, struct xe_reg reg, u32 mask, 29 + u32 val, u32 timeout_us, u32 *out_val, bool atomic); 30 30 31 - static inline u32 xe_mmio_adjusted_addr(const struct xe_gt *gt, u32 addr) 31 + static 
inline u32 xe_mmio_adjusted_addr(const struct xe_mmio *mmio, u32 addr) 32 32 { 33 - if (addr < gt->mmio.adj_limit) 34 - addr += gt->mmio.adj_offset; 33 + if (addr < mmio->adj_limit) 34 + addr += mmio->adj_offset; 35 35 return addr; 36 + } 37 + 38 + static inline struct xe_mmio *xe_root_tile_mmio(struct xe_device *xe) 39 + { 40 + return &xe->tiles[0].mmio; 36 41 } 37 42 38 43 #endif
drivers/gpu/drm/xe/xe_mocs.c (+9 -8)
··· 278 278 if (regs_are_mcr(gt)) 279 279 reg_val = xe_gt_mcr_unicast_read_any(gt, XEHP_LNCFCMOCS(i)); 280 280 else 281 - reg_val = xe_mmio_read32(gt, XELP_LNCFCMOCS(i)); 281 + reg_val = xe_mmio_read32(&gt->mmio, XELP_LNCFCMOCS(i)); 282 282 283 283 drm_printf(p, "LNCFCMOCS[%2d] = [%u, %u, %u] (%#8x)\n", 284 284 j++, ··· 310 310 if (regs_are_mcr(gt)) 311 311 reg_val = xe_gt_mcr_unicast_read_any(gt, XEHP_GLOBAL_MOCS(i)); 312 312 else 313 - reg_val = xe_mmio_read32(gt, XELP_GLOBAL_MOCS(i)); 313 + reg_val = xe_mmio_read32(&gt->mmio, XELP_GLOBAL_MOCS(i)); 314 314 315 315 drm_printf(p, "GLOB_MOCS[%2d] = [%u, %u, %u, %u, %u, %u, %u, %u, %u, %u ] (%#8x)\n", 316 316 i, ··· 383 383 if (regs_are_mcr(gt)) 384 384 reg_val = xe_gt_mcr_unicast_read_any(gt, XEHP_LNCFCMOCS(i)); 385 385 else 386 - reg_val = xe_mmio_read32(gt, XELP_LNCFCMOCS(i)); 386 + reg_val = xe_mmio_read32(&gt->mmio, XELP_LNCFCMOCS(i)); 387 387 388 388 drm_printf(p, "LNCFCMOCS[%2d] = [%u, %u, %u] (%#8x)\n", 389 389 j++, ··· 428 428 if (regs_are_mcr(gt)) 429 429 reg_val = xe_gt_mcr_unicast_read_any(gt, XEHP_LNCFCMOCS(i)); 430 430 else 431 - reg_val = xe_mmio_read32(gt, XELP_LNCFCMOCS(i)); 431 + reg_val = xe_mmio_read32(&gt->mmio, XELP_LNCFCMOCS(i)); 432 432 433 433 drm_printf(p, "LNCFCMOCS[%2d] = [ %u ] (%#8x)\n", 434 434 j++, ··· 510 510 if (regs_are_mcr(gt)) 511 511 reg_val = xe_gt_mcr_unicast_read_any(gt, XEHP_GLOBAL_MOCS(i)); 512 512 else 513 - reg_val = xe_mmio_read32(gt, XELP_GLOBAL_MOCS(i)); 513 + reg_val = xe_mmio_read32(&gt->mmio, XELP_GLOBAL_MOCS(i)); 514 514 515 515 drm_printf(p, "GLOB_MOCS[%2d] = [%u, %u] (%#8x)\n", 516 516 i, ··· 553 553 if (regs_are_mcr(gt)) 554 554 reg_val = xe_gt_mcr_unicast_read_any(gt, XEHP_GLOBAL_MOCS(i)); 555 555 else 556 - reg_val = xe_mmio_read32(gt, XELP_GLOBAL_MOCS(i)); 556 + reg_val = xe_mmio_read32(&gt->mmio, XELP_GLOBAL_MOCS(i)); 557 557 558 558 drm_printf(p, "GLOB_MOCS[%2d] = [%u, %u, %u] (%#8x)\n", 559 559 i, ··· 576 576 memset(info, 0, sizeof(struct xe_mocs_info)); 
577 577 578 578 switch (xe->info.platform) { 579 + case XE_PANTHERLAKE: 579 580 case XE_LUNARLAKE: 580 581 case XE_BATTLEMAGE: 581 582 info->ops = &xe2_mocs_ops; ··· 691 690 if (regs_are_mcr(gt)) 692 691 xe_gt_mcr_multicast_write(gt, XEHP_GLOBAL_MOCS(i), mocs); 693 692 else 694 - xe_mmio_write32(gt, XELP_GLOBAL_MOCS(i), mocs); 693 + xe_mmio_write32(&gt->mmio, XELP_GLOBAL_MOCS(i), mocs); 695 694 } 696 695 } 697 696 ··· 731 730 if (regs_are_mcr(gt)) 732 731 xe_gt_mcr_multicast_write(gt, XEHP_LNCFCMOCS(i), l3cc); 733 732 else 734 - xe_mmio_write32(gt, XELP_LNCFCMOCS(i), l3cc); 733 + xe_mmio_write32(&gt->mmio, XELP_LNCFCMOCS(i), l3cc); 735 734 } 736 735 } 737 736
drivers/gpu/drm/xe/xe_oa.c (+27 -21)
··· 176 176 177 177 static u32 xe_oa_hw_tail_read(struct xe_oa_stream *stream) 178 178 { 179 - return xe_mmio_read32(stream->gt, __oa_regs(stream)->oa_tail_ptr) & 179 + return xe_mmio_read32(&stream->gt->mmio, __oa_regs(stream)->oa_tail_ptr) & 180 180 OAG_OATAILPTR_MASK; 181 181 } 182 182 ··· 366 366 struct xe_reg oaheadptr = __oa_regs(stream)->oa_head_ptr; 367 367 368 368 spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags); 369 - xe_mmio_write32(stream->gt, oaheadptr, 369 + xe_mmio_write32(&stream->gt->mmio, oaheadptr, 370 370 (head + gtt_offset) & OAG_OAHEADPTR_MASK); 371 371 stream->oa_buffer.head = head; 372 372 spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags); ··· 377 377 378 378 static void xe_oa_init_oa_buffer(struct xe_oa_stream *stream) 379 379 { 380 + struct xe_mmio *mmio = &stream->gt->mmio; 380 381 u32 gtt_offset = xe_bo_ggtt_addr(stream->oa_buffer.bo); 381 382 u32 oa_buf = gtt_offset | OABUFFER_SIZE_16M | OAG_OABUFFER_MEMORY_SELECT; 382 383 unsigned long flags; 383 384 384 385 spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags); 385 386 386 - xe_mmio_write32(stream->gt, __oa_regs(stream)->oa_status, 0); 387 - xe_mmio_write32(stream->gt, __oa_regs(stream)->oa_head_ptr, 387 + xe_mmio_write32(mmio, __oa_regs(stream)->oa_status, 0); 388 + xe_mmio_write32(mmio, __oa_regs(stream)->oa_head_ptr, 388 389 gtt_offset & OAG_OAHEADPTR_MASK); 389 390 stream->oa_buffer.head = 0; 390 391 /* 391 392 * PRM says: "This MMIO must be set before the OATAILPTR register and after the 392 393 * OAHEADPTR register. This is to enable proper functionality of the overflow bit". 
393 394 */ 394 - xe_mmio_write32(stream->gt, __oa_regs(stream)->oa_buffer, oa_buf); 395 - xe_mmio_write32(stream->gt, __oa_regs(stream)->oa_tail_ptr, 395 + xe_mmio_write32(mmio, __oa_regs(stream)->oa_buffer, oa_buf); 396 + xe_mmio_write32(mmio, __oa_regs(stream)->oa_tail_ptr, 396 397 gtt_offset & OAG_OATAILPTR_MASK); 397 398 398 399 /* Mark that we need updated tail pointer to read from */ ··· 445 444 stream->hwe->oa_unit->type == DRM_XE_OA_UNIT_TYPE_OAG) 446 445 val |= OAG_OACONTROL_OA_PES_DISAG_EN; 447 446 448 - xe_mmio_write32(stream->gt, regs->oa_ctrl, val); 447 + xe_mmio_write32(&stream->gt->mmio, regs->oa_ctrl, val); 449 448 } 450 449 451 450 static void xe_oa_disable(struct xe_oa_stream *stream) 452 451 { 453 - xe_mmio_write32(stream->gt, __oa_regs(stream)->oa_ctrl, 0); 454 - if (xe_mmio_wait32(stream->gt, __oa_regs(stream)->oa_ctrl, 452 + struct xe_mmio *mmio = &stream->gt->mmio; 453 + 454 + xe_mmio_write32(mmio, __oa_regs(stream)->oa_ctrl, 0); 455 + if (xe_mmio_wait32(mmio, __oa_regs(stream)->oa_ctrl, 455 456 OAG_OACONTROL_OA_COUNTER_ENABLE, 0, 50000, NULL, false)) 456 457 drm_err(&stream->oa->xe->drm, 457 458 "wait for OA to be disabled timed out\n"); 458 459 459 460 if (GRAPHICS_VERx100(stream->oa->xe) <= 1270 && GRAPHICS_VERx100(stream->oa->xe) != 1260) { 460 461 /* <= XE_METEORLAKE except XE_PVC */ 461 - xe_mmio_write32(stream->gt, OA_TLB_INV_CR, 1); 462 - if (xe_mmio_wait32(stream->gt, OA_TLB_INV_CR, 1, 0, 50000, NULL, false)) 462 + xe_mmio_write32(mmio, OA_TLB_INV_CR, 1); 463 + if (xe_mmio_wait32(mmio, OA_TLB_INV_CR, 1, 0, 50000, NULL, false)) 463 464 drm_err(&stream->oa->xe->drm, 464 465 "wait for OA tlb invalidate timed out\n"); 465 466 } ··· 484 481 size_t count, size_t *offset) 485 482 { 486 483 /* Only clear our bits to avoid side-effects */ 487 - stream->oa_status = xe_mmio_rmw32(stream->gt, __oa_regs(stream)->oa_status, 484 + stream->oa_status = xe_mmio_rmw32(&stream->gt->mmio, __oa_regs(stream)->oa_status, 488 485 OASTATUS_RELEVANT_BITS, 0); 
489 486 /* 490 487 * Signal to userspace that there is non-zero OA status to read via ··· 752 749 int err; 753 750 754 751 /* Set ccs select to enable programming of OAC_OACONTROL */ 755 - xe_mmio_write32(stream->gt, __oa_regs(stream)->oa_ctrl, __oa_ccs_select(stream)); 752 + xe_mmio_write32(&stream->gt->mmio, __oa_regs(stream)->oa_ctrl, 753 + __oa_ccs_select(stream)); 756 754 757 755 /* Modify stream hwe context image with regs_context */ 758 756 err = xe_oa_modify_ctx_image(stream, stream->exec_q->lrc[0], ··· 789 785 790 786 static void xe_oa_disable_metric_set(struct xe_oa_stream *stream) 791 787 { 788 + struct xe_mmio *mmio = &stream->gt->mmio; 792 789 u32 sqcnt1; 793 790 794 791 /* ··· 803 798 _MASKED_BIT_DISABLE(DISABLE_DOP_GATING)); 804 799 } 805 800 806 - xe_mmio_write32(stream->gt, __oa_regs(stream)->oa_debug, 801 + xe_mmio_write32(mmio, __oa_regs(stream)->oa_debug, 807 802 oag_configure_mmio_trigger(stream, false)); 808 803 809 804 /* disable the context save/restore or OAR counters */ ··· 811 806 xe_oa_configure_oa_context(stream, false); 812 807 813 808 /* Make sure we disable noa to save power. */ 814 - xe_mmio_rmw32(stream->gt, RPM_CONFIG1, GT_NOA_ENABLE, 0); 809 + xe_mmio_rmw32(mmio, RPM_CONFIG1, GT_NOA_ENABLE, 0); 815 810 816 811 sqcnt1 = SQCNT1_PMON_ENABLE | 817 812 (HAS_OA_BPC_REPORTING(stream->oa->xe) ? SQCNT1_OABPC : 0); 818 813 819 814 /* Reset PMON Enable to save power. 
*/ 820 - xe_mmio_rmw32(stream->gt, XELPMP_SQCNT1, sqcnt1, 0); 815 + xe_mmio_rmw32(mmio, XELPMP_SQCNT1, sqcnt1, 0); 821 816 } 822 817 823 818 static void xe_oa_stream_destroy(struct xe_oa_stream *stream) ··· 945 940 946 941 static int xe_oa_enable_metric_set(struct xe_oa_stream *stream) 947 942 { 943 + struct xe_mmio *mmio = &stream->gt->mmio; 948 944 u32 oa_debug, sqcnt1; 949 945 int ret; 950 946 ··· 972 966 OAG_OA_DEBUG_DISABLE_START_TRG_2_COUNT_QUAL | 973 967 OAG_OA_DEBUG_DISABLE_START_TRG_1_COUNT_QUAL; 974 968 975 - xe_mmio_write32(stream->gt, __oa_regs(stream)->oa_debug, 969 + xe_mmio_write32(mmio, __oa_regs(stream)->oa_debug, 976 970 _MASKED_BIT_ENABLE(oa_debug) | 977 971 oag_report_ctx_switches(stream) | 978 972 oag_configure_mmio_trigger(stream, true)); 979 973 980 - xe_mmio_write32(stream->gt, __oa_regs(stream)->oa_ctx_ctrl, stream->periodic ? 974 + xe_mmio_write32(mmio, __oa_regs(stream)->oa_ctx_ctrl, stream->periodic ? 981 975 (OAG_OAGLBCTXCTRL_COUNTER_RESUME | 982 976 OAG_OAGLBCTXCTRL_TIMER_ENABLE | 983 977 REG_FIELD_PREP(OAG_OAGLBCTXCTRL_TIMER_PERIOD_MASK, ··· 991 985 sqcnt1 = SQCNT1_PMON_ENABLE | 992 986 (HAS_OA_BPC_REPORTING(stream->oa->xe) ? 
SQCNT1_OABPC : 0); 993 987 994 - xe_mmio_rmw32(stream->gt, XELPMP_SQCNT1, 0, sqcnt1); 988 + xe_mmio_rmw32(mmio, XELPMP_SQCNT1, 0, sqcnt1); 995 989 996 990 /* Configure OAR/OAC */ 997 991 if (stream->exec_q) { ··· 1539 1533 case XE_PVC: 1540 1534 case XE_METEORLAKE: 1541 1535 xe_pm_runtime_get(gt_to_xe(gt)); 1542 - reg = xe_mmio_read32(gt, RPM_CONFIG0); 1536 + reg = xe_mmio_read32(&gt->mmio, RPM_CONFIG0); 1543 1537 xe_pm_runtime_put(gt_to_xe(gt)); 1544 1538 1545 1539 shift = REG_FIELD_GET(RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK, reg); ··· 2355 2349 } 2356 2350 2357 2351 /* Ensure MMIO trigger remains disabled till there is a stream */ 2358 - xe_mmio_write32(gt, u->regs.oa_debug, 2352 + xe_mmio_write32(&gt->mmio, u->regs.oa_debug, 2359 2353 oag_configure_mmio_trigger(NULL, false)); 2360 2354 2361 2355 /* Set oa_unit_ids now to ensure ids remain contiguous */
drivers/gpu/drm/xe/xe_pat.c (+14 -9)
··· 100 100 * Reserved entries should be programmed with the maximum caching, minimum 101 101 * coherency (which matches an all-0's encoding), so we can just omit them 102 102 * in the table. 103 + * 104 + * Note: There is an implicit assumption in the driver that compression and 105 + * coh_1way+ are mutually exclusive. If this is ever not true then userptr 106 + * and imported dma-buf from external device will have uncleared ccs state. 103 107 */ 104 108 #define XE2_PAT(no_promote, comp_en, l3clos, l3_policy, l4_policy, __coh_mode) \ 105 109 { \ ··· 113 109 REG_FIELD_PREP(XE2_L3_POLICY, l3_policy) | \ 114 110 REG_FIELD_PREP(XE2_L4_POLICY, l4_policy) | \ 115 111 REG_FIELD_PREP(XE2_COH_MODE, __coh_mode), \ 116 - .coh_mode = __coh_mode ? XE_COH_AT_LEAST_1WAY : XE_COH_NONE \ 112 + .coh_mode = (BUILD_BUG_ON_ZERO(__coh_mode && comp_en) || __coh_mode) ? \ 113 + XE_COH_AT_LEAST_1WAY : XE_COH_NONE \ 117 114 } 118 115 119 116 static const struct xe_pat_table_entry xe2_pat_table[] = { ··· 165 160 for (int i = 0; i < n_entries; i++) { 166 161 struct xe_reg reg = XE_REG(_PAT_INDEX(i)); 167 162 168 - xe_mmio_write32(gt, reg, table[i].value); 163 + xe_mmio_write32(&gt->mmio, reg, table[i].value); 169 164 } 170 165 } 171 166 ··· 191 186 drm_printf(p, "PAT table:\n"); 192 187 193 188 for (i = 0; i < xe->pat.n_entries; i++) { 194 - u32 pat = xe_mmio_read32(gt, XE_REG(_PAT_INDEX(i))); 189 + u32 pat = xe_mmio_read32(&gt->mmio, XE_REG(_PAT_INDEX(i))); 195 190 u8 mem_type = REG_FIELD_GET(XELP_MEM_TYPE_MASK, pat); 196 191 197 192 drm_printf(p, "PAT[%2d] = %s (%#8x)\n", i, ··· 283 278 u32 pat; 284 279 285 280 if (xe_gt_is_media_type(gt)) 286 - pat = xe_mmio_read32(gt, XE_REG(_PAT_INDEX(i))); 281 + pat = xe_mmio_read32(&gt->mmio, XE_REG(_PAT_INDEX(i))); 287 282 else 288 283 pat = xe_gt_mcr_unicast_read_any(gt, XE_REG_MCR(_PAT_INDEX(i))); 289 284 ··· 321 316 int n_entries) 322 317 { 323 318 program_pat(gt, table, n_entries); 324 - xe_mmio_write32(gt, XE_REG(_PAT_ATS), xe2_pat_ats.value); 
319 + xe_mmio_write32(&gt->mmio, XE_REG(_PAT_ATS), xe2_pat_ats.value); 325 320 326 321 if (IS_DGFX(gt_to_xe(gt))) 327 - xe_mmio_write32(gt, XE_REG(_PAT_PTA), xe2_pat_pta.value); 322 + xe_mmio_write32(&gt->mmio, XE_REG(_PAT_PTA), xe2_pat_pta.value); 328 323 } 329 324 330 325 static void xe2_dump(struct xe_gt *gt, struct drm_printer *p) ··· 341 336 342 337 for (i = 0; i < xe->pat.n_entries; i++) { 343 338 if (xe_gt_is_media_type(gt)) 344 - pat = xe_mmio_read32(gt, XE_REG(_PAT_INDEX(i))); 339 + pat = xe_mmio_read32(&gt->mmio, XE_REG(_PAT_INDEX(i))); 345 340 else 346 341 pat = xe_gt_mcr_unicast_read_any(gt, XE_REG_MCR(_PAT_INDEX(i))); 347 342 ··· 360 355 * PPGTT entries. 361 356 */ 362 357 if (xe_gt_is_media_type(gt)) 363 - pat = xe_mmio_read32(gt, XE_REG(_PAT_PTA)); 358 + pat = xe_mmio_read32(&gt->mmio, XE_REG(_PAT_PTA)); 364 359 else 365 360 pat = xe_gt_mcr_unicast_read_any(gt, XE_REG_MCR(_PAT_PTA)); 366 361 ··· 387 382 388 383 void xe_pat_init_early(struct xe_device *xe) 389 384 { 390 - if (GRAPHICS_VER(xe) == 20) { 385 + if (GRAPHICS_VER(xe) == 30 || GRAPHICS_VER(xe) == 20) { 391 386 xe->pat.ops = &xe2_pat_ops; 392 387 xe->pat.table = xe2_pat_table; 393 388
drivers/gpu/drm/xe/xe_pci.c (+48 -11)
··· 103 103 104 104 #define XE_HP_FEATURES \ 105 105 .has_range_tlb_invalidation = true, \ 106 - .has_flat_ccs = true, \ 107 106 .dma_mask_size = 46, \ 108 107 .va_bits = 48, \ 109 108 .vm_max_level = 3 ··· 119 120 120 121 XE_HP_FEATURES, 121 122 .vram_flags = XE_VRAM_FLAGS_NEED64K, 123 + 124 + .has_flat_ccs = 1, 122 125 }; 123 126 124 127 static const struct xe_graphics_desc graphics_xehpc = { ··· 146 145 147 146 .has_asid = 1, 148 147 .has_atomic_enable_pte_bit = 1, 149 - .has_flat_ccs = 0, 150 148 .has_usm = 1, 151 149 }; 152 150 ··· 156 156 BIT(XE_HW_ENGINE_CCS0), 157 157 158 158 XE_HP_FEATURES, 159 - .has_flat_ccs = 0, 160 159 }; 161 160 162 161 #define XE2_GFX_FEATURES \ ··· 208 209 }; 209 210 210 211 static const struct xe_media_desc media_xe2 = { 211 - .name = "Xe2_LPM / Xe2_HPM", 212 + .name = "Xe2_LPM / Xe2_HPM / Xe3_LPM", 212 213 .hw_engine_mask = 213 214 GENMASK(XE_HW_ENGINE_VCS7, XE_HW_ENGINE_VCS0) | 214 215 GENMASK(XE_HW_ENGINE_VECS3, XE_HW_ENGINE_VECS0) | ··· 346 347 .has_heci_cscfi = 1, 347 348 }; 348 349 350 + static const struct xe_device_desc ptl_desc = { 351 + PLATFORM(PANTHERLAKE), 352 + .has_display = false, 353 + .require_force_probe = true, 354 + }; 355 + 349 356 #undef PLATFORM 350 357 __diag_pop(); 351 358 ··· 362 357 { 1274, &graphics_xelpg }, /* Xe_LPG+ */ 363 358 { 2001, &graphics_xe2 }, 364 359 { 2004, &graphics_xe2 }, 360 + { 3000, &graphics_xe2 }, 361 + { 3001, &graphics_xe2 }, 365 362 }; 366 363 367 364 /* Map of GMD_ID values to media IP */ ··· 371 364 { 1300, &media_xelpmp }, 372 365 { 1301, &media_xe2 }, 373 366 { 2000, &media_xe2 }, 367 + { 3000, &media_xe2 }, 374 368 }; 375 369 376 370 #define INTEL_VGA_DEVICE(id, info) { \ ··· 391 383 XE_ADLS_IDS(INTEL_VGA_DEVICE, &adl_s_desc), 392 384 XE_ADLP_IDS(INTEL_VGA_DEVICE, &adl_p_desc), 393 385 XE_ADLN_IDS(INTEL_VGA_DEVICE, &adl_n_desc), 386 + XE_RPLU_IDS(INTEL_VGA_DEVICE, &adl_p_desc), 394 387 XE_RPLP_IDS(INTEL_VGA_DEVICE, &adl_p_desc), 395 388 XE_RPLS_IDS(INTEL_VGA_DEVICE, 
&adl_s_desc), 396 389 XE_DG1_IDS(INTEL_VGA_DEVICE, &dg1_desc), 397 390 XE_ATS_M_IDS(INTEL_VGA_DEVICE, &ats_m_desc), 391 + XE_ARL_IDS(INTEL_VGA_DEVICE, &mtl_desc), 398 392 XE_DG2_IDS(INTEL_VGA_DEVICE, &dg2_desc), 399 393 XE_MTL_IDS(INTEL_VGA_DEVICE, &mtl_desc), 400 394 XE_LNL_IDS(INTEL_VGA_DEVICE, &lnl_desc), 401 395 XE_BMG_IDS(INTEL_VGA_DEVICE, &bmg_desc), 396 + XE_PTL_IDS(INTEL_VGA_DEVICE, &ptl_desc), 402 397 { } 403 398 }; 404 399 MODULE_DEVICE_TABLE(pci, pciidlist); ··· 478 467 479 468 static void read_gmdid(struct xe_device *xe, enum xe_gmdid_type type, u32 *ver, u32 *revid) 480 469 { 481 - struct xe_gt *gt = xe_root_mmio_gt(xe); 470 + struct xe_mmio *mmio = xe_root_tile_mmio(xe); 482 471 struct xe_reg gmdid_reg = GMD_ID; 483 472 u32 val; 484 473 485 474 KUNIT_STATIC_STUB_REDIRECT(read_gmdid, xe, type, ver, revid); 486 475 487 476 if (IS_SRIOV_VF(xe)) { 477 + struct xe_gt *gt = xe_root_mmio_gt(xe); 478 + 488 479 /* 489 480 * To get the value of the GMDID register, VFs must obtain it 490 481 * from the GuC using MMIO communication. ··· 522 509 gt->info.type = XE_GT_TYPE_UNINITIALIZED; 523 510 } else { 524 511 /* 525 - * We need to apply the GSI offset explicitly here as at this 526 - * point the xe_gt is not fully uninitialized and only basic 527 - * access to MMIO registers is possible. 512 + * GMD_ID is a GT register, but at this point in the driver 513 + * init we haven't fully initialized the GT yet so we need to 514 + * read the register with the tile's MMIO accessor. That means 515 + * we need to apply the GSI offset manually since it won't get 516 + * automatically added as it would if we were using a GT mmio 517 + * accessor. 
528 518 */ 529 519 if (type == GMDID_MEDIA) 530 520 gmdid_reg.addr += MEDIA_GT_GSI_OFFSET; 531 521 532 - val = xe_mmio_read32(gt, gmdid_reg); 522 + val = xe_mmio_read32(mmio, gmdid_reg); 533 523 } 534 524 535 525 *ver = REG_FIELD_GET(GMD_ID_ARCH_MASK, val) * 100 + REG_FIELD_GET(GMD_ID_RELEASE_MASK, val); ··· 694 678 xe->info.has_atomic_enable_pte_bit = graphics_desc->has_atomic_enable_pte_bit; 695 679 if (xe->info.platform != XE_PVC) 696 680 xe->info.has_device_atomics_on_smem = 1; 681 + 682 + /* Runtime detection may change this later */ 697 683 xe->info.has_flat_ccs = graphics_desc->has_flat_ccs; 684 + 698 685 xe->info.has_range_tlb_invalidation = graphics_desc->has_range_tlb_invalidation; 699 686 xe->info.has_usm = graphics_desc->has_usm; 700 687 ··· 726 707 gt->info.type = XE_GT_TYPE_MAIN; 727 708 gt->info.has_indirect_ring_state = graphics_desc->has_indirect_ring_state; 728 709 gt->info.engine_mask = graphics_desc->hw_engine_mask; 710 + 729 711 if (MEDIA_VER(xe) < 13 && media_desc) 730 712 gt->info.engine_mask |= media_desc->hw_engine_mask; 731 713 ··· 745 725 gt->info.type = XE_GT_TYPE_MEDIA; 746 726 gt->info.has_indirect_ring_state = media_desc->has_indirect_ring_state; 747 727 gt->info.engine_mask = media_desc->hw_engine_mask; 748 - gt->mmio.adj_offset = MEDIA_GT_GSI_OFFSET; 749 - gt->mmio.adj_limit = MEDIA_GT_GSI_LENGTH; 750 728 751 729 /* 752 730 * FIXME: At the moment multi-tile and standalone media are ··· 775 757 pci_set_drvdata(pdev, NULL); 776 758 } 777 759 760 + /* 761 + * Probe the PCI device, initialize various parts of the driver. 762 + * 763 + * Fault injection is used to test the error paths of some initialization 764 + * functions called either directly from xe_pci_probe() or indirectly for 765 + * example through xe_device_probe(). Those functions use the kernel fault 766 + * injection capabilities infrastructure, see 767 + * Documentation/fault-injection/fault-injection.rst for details. 
The macro 768 + ALLOW_ERROR_INJECTION() is used to conditionally skip function execution 769 + at runtime and use a provided return value. The first requirement for 770 + error injectable functions is proper handling of the error code by the 771 + caller for recovery, which is always the case here. The second 772 + requirement is that no state is changed before the first error return. 773 + It is not strictly fulfilled for all initialization functions using the 774 + ALLOW_ERROR_INJECTION() macro but this is acceptable because for those 775 + error cases at probe time, the error code is simply propagated up by the 776 + caller. Therefore there is no consequence on those specific callers when 777 + function error injection skips the whole function. 778 + */ 778 779 static int xe_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) 779 780 { 780 781 const struct xe_device_desc *desc = (const void *)ent->driver_data;
drivers/gpu/drm/xe/xe_pcode.c (+2 -2)
··· 44 44 [PCODE_ERROR_MASK] = {-EPROTO, "Unknown"}, 45 45 }; 46 46 47 - err = xe_mmio_read32(tile->primary_gt, PCODE_MAILBOX) & PCODE_ERROR_MASK; 47 + err = xe_mmio_read32(&tile->mmio, PCODE_MAILBOX) & PCODE_ERROR_MASK; 48 48 if (err) { 49 49 drm_err(&tile_to_xe(tile)->drm, "PCODE Mailbox failed: %d %s", err, 50 50 err_decode[err].str ?: "Unknown"); ··· 58 58 unsigned int timeout_ms, bool return_data, 59 59 bool atomic) 60 60 { 61 - struct xe_gt *mmio = tile->primary_gt; 61 + struct xe_mmio *mmio = &tile->mmio; 62 62 int err; 63 63 64 64 if (tile_to_xe(tile)->info.skip_pcode)
drivers/gpu/drm/xe/xe_platform_types.h (+1)
··· 23 23 XE_METEORLAKE, 24 24 XE_LUNARLAKE, 25 25 XE_BATTLEMAGE, 26 + XE_PANTHERLAKE, 26 27 }; 27 28 28 29 enum xe_subplatform {
drivers/gpu/drm/xe/xe_pm.c (+5 -3)
··· 5 5 6 6 #include "xe_pm.h" 7 7 8 + #include <linux/fault-inject.h> 8 9 #include <linux/pm_runtime.h> 9 10 10 11 #include <drm/drm_managed.h> ··· 124 123 for_each_gt(gt, xe, id) 125 124 xe_gt_suspend_prepare(gt); 126 125 127 - xe_display_pm_suspend(xe, false); 126 + xe_display_pm_suspend(xe); 128 127 129 128 /* FIXME: Super racey... */ 130 129 err = xe_bo_evict_all(xe); ··· 134 133 for_each_gt(gt, xe, id) { 135 134 err = xe_gt_suspend(gt); 136 135 if (err) { 137 - xe_display_pm_resume(xe, false); 136 + xe_display_pm_resume(xe); 138 137 goto err; 139 138 } 140 139 } ··· 188 187 for_each_gt(gt, xe, id) 189 188 xe_gt_resume(gt); 190 189 191 - xe_display_pm_resume(xe, false); 190 + xe_display_pm_resume(xe); 192 191 193 192 err = xe_bo_restore_user(xe); 194 193 if (err) ··· 264 263 265 264 return 0; 266 265 } 266 + ALLOW_ERROR_INJECTION(xe_pm_init_early, ERRNO); /* See xe_pci_probe() */ 267 267 268 268 /** 269 269 * xe_pm_init - Initialize Xe Power Management
drivers/gpu/drm/xe/xe_query.c (+35 -14)
··· 9 9 #include <linux/sched/clock.h> 10 10 11 11 #include <drm/ttm/ttm_placement.h> 12 + #include <generated/xe_wa_oob.h> 12 13 #include <uapi/drm/xe_drm.h> 13 14 14 15 #include "regs/xe_engine_regs.h" ··· 24 23 #include "xe_macros.h" 25 24 #include "xe_mmio.h" 26 25 #include "xe_ttm_vram_mgr.h" 26 + #include "xe_wa.h" 27 27 28 28 static const u16 xe_to_user_engine_class[] = { 29 29 [XE_ENGINE_CLASS_RENDER] = DRM_XE_ENGINE_CLASS_RENDER, ··· 93 91 u64 *cpu_delta, 94 92 __ktime_func_t cpu_clock) 95 93 { 94 + struct xe_mmio *mmio = &gt->mmio; 96 95 u32 upper, lower, old_upper, loop = 0; 97 96 98 - upper = xe_mmio_read32(gt, upper_reg); 97 + upper = xe_mmio_read32(mmio, upper_reg); 99 98 do { 100 99 *cpu_delta = local_clock(); 101 100 *cpu_ts = cpu_clock(); 102 - lower = xe_mmio_read32(gt, lower_reg); 101 + lower = xe_mmio_read32(mmio, lower_reg); 103 102 *cpu_delta = local_clock() - *cpu_delta; 104 103 old_upper = upper; 105 - upper = xe_mmio_read32(gt, upper_reg); 104 + upper = xe_mmio_read32(mmio, upper_reg); 106 105 } while (upper != old_upper && loop++ < 2); 107 106 108 107 *engine_ts = (u64)upper << 32 | lower; ··· 457 454 458 455 static size_t calc_topo_query_size(struct xe_device *xe) 459 456 { 460 - return xe->info.gt_count * 461 - (4 * sizeof(struct drm_xe_query_topology_mask) + 462 - sizeof_field(struct xe_gt, fuse_topo.g_dss_mask) + 463 - sizeof_field(struct xe_gt, fuse_topo.c_dss_mask) + 464 - sizeof_field(struct xe_gt, fuse_topo.l3_bank_mask) + 465 - sizeof_field(struct xe_gt, fuse_topo.eu_mask_per_dss)); 457 + struct xe_gt *gt; 458 + size_t query_size = 0; 459 + int id; 460 + 461 + for_each_gt(gt, xe, id) { 462 + query_size += 3 * sizeof(struct drm_xe_query_topology_mask) + 463 + sizeof_field(struct xe_gt, fuse_topo.g_dss_mask) + 464 + sizeof_field(struct xe_gt, fuse_topo.c_dss_mask) + 465 + sizeof_field(struct xe_gt, fuse_topo.eu_mask_per_dss); 466 + 467 + /* L3bank mask may not be available for some GTs */ 468 + if (!XE_WA(gt, no_media_l3)) 469 + query_size += sizeof(struct drm_xe_query_topology_mask) + 470 + sizeof_field(struct xe_gt, fuse_topo.l3_bank_mask); 471 + } 472 + 473 + return query_size; 466 474 } 467 475 468 476 static int copy_mask(void __user **ptr, ··· 526 512 if (err) 527 513 return err; 528 514 529 - topo.type = DRM_XE_TOPO_L3_BANK; 530 - err = copy_mask(&query_ptr, &topo, gt->fuse_topo.l3_bank_mask, 531 - sizeof(gt->fuse_topo.l3_bank_mask)); 532 - if (err) 533 - return err; 515 + /* 516 + * If the kernel doesn't have a way to obtain a correct L3bank 517 + * mask, then it's better to omit L3 from the query rather than 518 + * reporting bogus or zeroed information to userspace. 519 + */ 520 + if (!XE_WA(gt, no_media_l3)) { 521 + topo.type = DRM_XE_TOPO_L3_BANK; 522 + err = copy_mask(&query_ptr, &topo, gt->fuse_topo.l3_bank_mask, 523 + sizeof(gt->fuse_topo.l3_bank_mask)); 524 + if (err) 525 + return err; 526 + } 534 527 535 528 topo.type = gt->fuse_topo.eu_type == XE_GT_EU_TYPE_SIMD16 ? 536 529 DRM_XE_TOPO_SIMD16_EU_PER_DSS :
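The calc_topo_query_size() change above replaces a fixed per-GT multiplier with a per-GT sum that skips the L3 bank entry when the XE_WA(gt, no_media_l3) workaround applies. A minimal userspace C sketch of the same sizing rule; the header struct and mask byte counts here are illustrative assumptions, not the driver's real values:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for the fixed header of
 * struct drm_xe_query_topology_mask; mask sizes below are made up. */
struct topo_mask_hdr {
	uint16_t gt_id;
	uint16_t type;
	uint32_t num_bytes;
};

#define G_DSS_MASK_BYTES   16
#define C_DSS_MASK_BYTES   16
#define EU_PER_DSS_BYTES   16
#define L3_BANK_MASK_BYTES  8

/* Per-GT size: three mandatory masks, plus the L3 bank mask only when
 * the GT can report one (mirrors the !XE_WA(gt, no_media_l3) check). */
static size_t topo_query_size_one_gt(int has_l3_bank_mask)
{
	size_t sz = 3 * sizeof(struct topo_mask_hdr) +
		    G_DSS_MASK_BYTES + C_DSS_MASK_BYTES + EU_PER_DSS_BYTES;

	if (has_l3_bank_mask)
		sz += sizeof(struct topo_mask_hdr) + L3_BANK_MASK_BYTES;

	return sz;
}
```

The total query size is then the sum of this over all GTs, which is why the driver now loops with for_each_gt() instead of multiplying by gt_count.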
+9 -8
drivers/gpu/drm/xe/xe_reg_sr.c
··· 15 15 16 16 #include "regs/xe_engine_regs.h" 17 17 #include "regs/xe_gt_regs.h" 18 + #include "xe_device.h" 18 19 #include "xe_device_types.h" 19 20 #include "xe_force_wake.h" 20 21 #include "xe_gt.h" ··· 165 164 else if (entry->clr_bits + 1) 166 165 val = (reg.mcr ? 167 166 xe_gt_mcr_unicast_read_any(gt, reg_mcr) : 168 - xe_mmio_read32(gt, reg)) & (~entry->clr_bits); 167 + xe_mmio_read32(&gt->mmio, reg)) & (~entry->clr_bits); 169 168 else 170 169 val = 0; 171 170 ··· 181 180 if (entry->reg.mcr) 182 181 xe_gt_mcr_multicast_write(gt, reg_mcr, val); 183 182 else 184 - xe_mmio_write32(gt, reg, val); 183 + xe_mmio_write32(&gt->mmio, reg, val); 185 184 } 186 185 187 186 void xe_reg_sr_apply_mmio(struct xe_reg_sr *sr, struct xe_gt *gt) ··· 195 194 196 195 xe_gt_dbg(gt, "Applying %s save-restore MMIOs\n", sr->name); 197 196 198 - err = xe_force_wake_get(&gt->mmio.fw, XE_FORCEWAKE_ALL); 197 + err = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL); 199 198 if (err) 200 199 goto err_force_wake; 201 200 202 201 xa_for_each(&sr->xa, reg, entry) 203 202 apply_one_mmio(gt, entry); 204 203 205 - err = xe_force_wake_put(&gt->mmio.fw, XE_FORCEWAKE_ALL); 204 + err = xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL); 206 205 XE_WARN_ON(err); 207 206 208 207 return; ··· 228 227 229 228 drm_dbg(&xe->drm, "Whitelisting %s registers\n", sr->name); 230 229 231 - err = xe_force_wake_get(&gt->mmio.fw, XE_FORCEWAKE_ALL); 230 + err = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL); 232 231 if (err) 233 232 goto err_force_wake; 234 233 ··· 242 241 } 243 242 244 243 xe_reg_whitelist_print_entry(&p, 0, reg, entry); 245 - xe_mmio_write32(gt, RING_FORCE_TO_NONPRIV(mmio_base, slot), 244 + xe_mmio_write32(&gt->mmio, RING_FORCE_TO_NONPRIV(mmio_base, slot), 246 245 reg | entry->set_bits); 247 246 slot++; 248 247 } ··· 251 250 for (; slot < RING_MAX_NONPRIV_SLOTS; slot++) { 252 251 u32 addr = RING_NOPID(mmio_base).addr; 253 252 254 - xe_mmio_write32(gt, RING_FORCE_TO_NONPRIV(mmio_base, slot), addr); 253 + xe_mmio_write32(&gt->mmio, RING_FORCE_TO_NONPRIV(mmio_base, slot), addr); 255 254 } 256 255 257 - err = xe_force_wake_put(&gt->mmio.fw, XE_FORCEWAKE_ALL); 256 + err = xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL); 258 257 XE_WARN_ON(err); 259 258 260 259 return;
+1 -1
drivers/gpu/drm/xe/xe_rtp.c
··· 196 196 *gt = (*hwe)->gt; 197 197 *xe = gt_to_xe(*gt); 198 198 break; 199 - }; 199 + } 200 200 } 201 201 202 202 /**
+1 -1
drivers/gpu/drm/xe/xe_sa.c
··· 53 53 if (IS_ERR(bo)) { 54 54 drm_err(&xe->drm, "failed to allocate bo for sa manager: %ld\n", 55 55 PTR_ERR(bo)); 56 - return (struct xe_sa_manager *)bo; 56 + return ERR_CAST(bo); 57 57 } 58 58 sa_manager->bo = bo; 59 59 sa_manager->is_iomem = bo->vmap.is_iomem;
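The xe_sa.c hunk above swaps a raw pointer cast for ERR_CAST(), the kernel's idiom for propagating an error pointer as a different pointer type. A userspace re-creation of the linux/err.h helpers as a sketch (simplified; the kernel's versions differ in detail, and the bo/sa_manager types here are stand-ins):

```c
#include <assert.h>
#include <stdint.h>

/* Simplified userspace model of the kernel's ERR_PTR family
 * (linux/err.h): error codes are encoded in the top 4095 addresses. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}
/* ERR_CAST documents that an error pointer of one type is being
 * returned as another type, instead of hiding that with a raw cast. */
static inline void *ERR_CAST(const void *ptr) { return (void *)ptr; }

struct bo { int dummy; };
struct sa_manager { int dummy; };

/* Mirrors the pattern fixed in xe_sa.c: propagate the error pointer. */
static struct sa_manager *create_manager(struct bo *bo)
{
	if (IS_ERR(bo))
		return ERR_CAST(bo);	/* was: (struct sa_manager *)bo */
	return (struct sa_manager *)0;	/* success path elided */
}
```

The behavior is identical to the raw cast; the gain is readability and a single place to audit error-pointer propagation.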
+4 -1
drivers/gpu/drm/xe/xe_sriov.c
··· 3 3 * Copyright © 2023 Intel Corporation 4 4 */ 5 5 6 + #include <linux/fault-inject.h> 7 + 6 8 #include <drm/drm_managed.h> 7 9 8 10 #include "regs/xe_regs.h" ··· 37 35 38 36 static bool test_is_vf(struct xe_device *xe) 39 37 { 40 - u32 value = xe_mmio_read32(xe_root_mmio_gt(xe), VF_CAP_REG); 38 + u32 value = xe_mmio_read32(xe_root_tile_mmio(xe), VF_CAP_REG); 41 39 42 40 return value & VF_CAP; 43 41 } ··· 121 119 122 120 return drmm_add_action_or_reset(&xe->drm, fini_sriov, xe); 123 121 } 122 + ALLOW_ERROR_INJECTION(xe_sriov_init, ERRNO); /* See xe_pci_probe() */ 124 123 125 124 /** 126 125 * xe_sriov_print_info - Print basic SR-IOV information.
+3
drivers/gpu/drm/xe/xe_tile.c
··· 3 3 * Copyright © 2023 Intel Corporation 4 4 */ 5 5 6 + #include <linux/fault-inject.h> 7 + 6 8 #include <drm/drm_managed.h> 7 9 8 10 #include "xe_device.h" ··· 131 129 132 130 return 0; 133 131 } 132 + ALLOW_ERROR_INJECTION(xe_tile_init_early, ERRNO); /* See xe_pci_probe() */ 134 133 135 134 static int tile_ttm_mgr_init(struct xe_tile *tile) 136 135 {
+4 -3
drivers/gpu/drm/xe/xe_trace.h
··· 21 21 #include "xe_vm.h" 22 22 23 23 #define __dev_name_xe(xe) dev_name((xe)->drm.dev) 24 + #define __dev_name_tile(tile) __dev_name_xe(tile_to_xe((tile))) 24 25 #define __dev_name_gt(gt) __dev_name_xe(gt_to_xe((gt))) 25 26 #define __dev_name_eq(q) __dev_name_gt((q)->gt) 26 27 ··· 343 342 ); 344 343 345 344 TRACE_EVENT(xe_reg_rw, 346 - TP_PROTO(struct xe_gt *gt, bool write, u32 reg, u64 val, int len), 345 + TP_PROTO(struct xe_mmio *mmio, bool write, u32 reg, u64 val, int len), 347 346 348 - TP_ARGS(gt, write, reg, val, len), 347 + TP_ARGS(mmio, write, reg, val, len), 349 348 350 349 TP_STRUCT__entry( 351 - __string(dev, __dev_name_gt(gt)) 350 + __string(dev, __dev_name_tile(mmio->tile)) 352 351 __field(u64, val) 353 352 __field(u32, reg) 354 353 __field(u16, write)
+1 -1
drivers/gpu/drm/xe/xe_trace_bo.h
··· 189 189 ), 190 190 191 191 TP_printk("dev=%s, vm=%p, asid=0x%05x", __get_str(dev), 192 - __entry->vm, __entry->asid) 192 + __entry->vm, __entry->asid) 193 193 ); 194 194 195 195 DEFINE_EVENT(xe_vm, xe_vm_kill,
+4 -4
drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
··· 60 60 static s64 detect_bar2_dgfx(struct xe_device *xe, struct xe_ttm_stolen_mgr *mgr) 61 61 { 62 62 struct xe_tile *tile = xe_device_get_root_tile(xe); 63 - struct xe_gt *mmio = xe_root_mmio_gt(xe); 63 + struct xe_mmio *mmio = xe_root_tile_mmio(xe); 64 64 struct pci_dev *pdev = to_pci_dev(xe->drm.dev); 65 65 u64 stolen_size; 66 66 u64 tile_offset; ··· 94 94 u32 wopcm_size; 95 95 u64 val; 96 96 97 - val = xe_mmio_read64_2x32(xe_root_mmio_gt(xe), STOLEN_RESERVED); 97 + val = xe_mmio_read64_2x32(xe_root_tile_mmio(xe), STOLEN_RESERVED); 98 98 val = REG_FIELD_GET64(WOPCM_SIZE_MASK, val); 99 99 100 100 switch (val) { ··· 119 119 u32 stolen_size, wopcm_size; 120 120 u32 ggc, gms; 121 121 122 - ggc = xe_mmio_read32(xe_root_mmio_gt(xe), GGC); 122 + ggc = xe_mmio_read32(xe_root_tile_mmio(xe), GGC); 123 123 124 124 /* 125 125 * Check GGMS: it should be fixed 0x3 (8MB), which corresponds to the ··· 159 159 stolen_size -= wopcm_size; 160 160 161 161 if (media_gt && XE_WA(media_gt, 14019821291)) { 162 - u64 gscpsmi_base = xe_mmio_read64_2x32(media_gt, GSCPSMI_BASE) 162 + u64 gscpsmi_base = xe_mmio_read64_2x32(&media_gt->mmio, GSCPSMI_BASE) 163 163 & ~GENMASK_ULL(5, 0); 164 164 165 165 /*
+5 -5
drivers/gpu/drm/xe/xe_tuning.c
··· 33 33 REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f))) 34 34 }, 35 35 { XE_RTP_NAME("Tuning: L3 cache - media"), 36 - XE_RTP_RULES(MEDIA_VERSION(2000)), 36 + XE_RTP_RULES(MEDIA_VERSION_RANGE(2000, XE_RTP_END_VERSION_UNDEFINED)), 37 37 XE_RTP_ACTIONS(FIELD_SET(XE2LPM_L3SQCREG5, L3_PWM_TIMER_INIT_VAL_MASK, 38 38 REG_FIELD_PREP(L3_PWM_TIMER_INIT_VAL_MASK, 0x7f))) 39 39 }, ··· 43 43 SET(CCCHKNREG1, L3CMPCTRL)) 44 44 }, 45 45 { XE_RTP_NAME("Tuning: Compression Overfetch - media"), 46 - XE_RTP_RULES(MEDIA_VERSION(2000)), 46 + XE_RTP_RULES(MEDIA_VERSION_RANGE(2000, XE_RTP_END_VERSION_UNDEFINED)), 47 47 XE_RTP_ACTIONS(CLR(XE2LPM_CCCHKNREG1, ENCOMPPERFFIX), 48 48 SET(XE2LPM_CCCHKNREG1, L3CMPCTRL)) 49 49 }, ··· 52 52 XE_RTP_ACTIONS(SET(L3SQCREG3, COMPPWOVERFETCHEN)) 53 53 }, 54 54 { XE_RTP_NAME("Tuning: Enable compressible partial write overfetch in L3 - media"), 55 - XE_RTP_RULES(MEDIA_VERSION(2000)), 55 + XE_RTP_RULES(MEDIA_VERSION_RANGE(2000, XE_RTP_END_VERSION_UNDEFINED)), 56 56 XE_RTP_ACTIONS(SET(XE2LPM_L3SQCREG3, COMPPWOVERFETCHEN)) 57 57 }, 58 58 { XE_RTP_NAME("Tuning: L2 Overfetch Compressible Only"), ··· 61 61 COMPMEMRD256BOVRFETCHEN)) 62 62 }, 63 63 { XE_RTP_NAME("Tuning: L2 Overfetch Compressible Only - media"), 64 - XE_RTP_RULES(MEDIA_VERSION(2000)), 64 + XE_RTP_RULES(MEDIA_VERSION_RANGE(2000, XE_RTP_END_VERSION_UNDEFINED)), 65 65 XE_RTP_ACTIONS(SET(XE2LPM_L3SQCREG2, 66 66 COMPMEMRD256BOVRFETCHEN)) 67 67 }, ··· 71 71 REG_FIELD_PREP(UNIFIED_COMPRESSION_FORMAT, 0))) 72 72 }, 73 73 { XE_RTP_NAME("Tuning: Stateless compression control - media"), 74 - XE_RTP_RULES(MEDIA_VERSION_RANGE(1301, 2000)), 74 + XE_RTP_RULES(MEDIA_VERSION_RANGE(1301, XE_RTP_END_VERSION_UNDEFINED)), 75 75 XE_RTP_ACTIONS(FIELD_SET(STATELESS_COMPRESSION_CTRL, UNIFIED_COMPRESSION_FORMAT, 76 76 REG_FIELD_PREP(UNIFIED_COMPRESSION_FORMAT, 0))) 77 77 },
+11 -8
drivers/gpu/drm/xe/xe_uc_fw.c
··· 4 4 */ 5 5 6 6 #include <linux/bitfield.h> 7 + #include <linux/fault-inject.h> 7 8 #include <linux/firmware.h> 8 9 9 10 #include <drm/drm_managed.h> ··· 797 796 798 797 return err; 799 798 } 799 + ALLOW_ERROR_INJECTION(xe_uc_fw_init, ERRNO); /* See xe_pci_probe() */ 800 800 801 801 static u32 uc_fw_ggtt_offset(struct xe_uc_fw *uc_fw) 802 802 { ··· 808 806 { 809 807 struct xe_device *xe = uc_fw_to_xe(uc_fw); 810 808 struct xe_gt *gt = uc_fw_to_gt(uc_fw); 809 + struct xe_mmio *mmio = &gt->mmio; 811 810 u64 src_offset; 812 811 u32 dma_ctrl; 813 812 int ret; ··· 817 814 818 815 /* Set the source address for the uCode */ 819 816 src_offset = uc_fw_ggtt_offset(uc_fw) + uc_fw->css_offset; 820 - xe_mmio_write32(gt, DMA_ADDR_0_LOW, lower_32_bits(src_offset)); 821 - xe_mmio_write32(gt, DMA_ADDR_0_HIGH, 817 + xe_mmio_write32(mmio, DMA_ADDR_0_LOW, lower_32_bits(src_offset)); 818 + xe_mmio_write32(mmio, DMA_ADDR_0_HIGH, 822 819 upper_32_bits(src_offset) | DMA_ADDRESS_SPACE_GGTT); 823 820 824 821 /* Set the DMA destination */ 825 - xe_mmio_write32(gt, DMA_ADDR_1_LOW, offset); 826 - xe_mmio_write32(gt, DMA_ADDR_1_HIGH, DMA_ADDRESS_SPACE_WOPCM); 822 + xe_mmio_write32(mmio, DMA_ADDR_1_LOW, offset); 823 + xe_mmio_write32(mmio, DMA_ADDR_1_HIGH, DMA_ADDRESS_SPACE_WOPCM); 827 824 828 825 /* 829 826 * Set the transfer size. The header plus uCode will be copied to WOPCM 830 827 * via DMA, excluding any other components 831 828 */ 832 - xe_mmio_write32(gt, DMA_COPY_SIZE, 829 + xe_mmio_write32(mmio, DMA_COPY_SIZE, 833 830 sizeof(struct uc_css_header) + uc_fw->ucode_size); 834 831 835 832 /* Start the DMA */ 836 - xe_mmio_write32(gt, DMA_CTRL, 833 + xe_mmio_write32(mmio, DMA_CTRL, 837 834 _MASKED_BIT_ENABLE(dma_flags | START_DMA)); 838 835 839 836 /* Wait for DMA to finish */ 840 - ret = xe_mmio_wait32(gt, DMA_CTRL, START_DMA, 0, 100000, &dma_ctrl, 837 + ret = xe_mmio_wait32(mmio, DMA_CTRL, START_DMA, 0, 100000, &dma_ctrl, 841 838 false); 842 839 if (ret) 843 840 drm_err(&xe->drm, "DMA for %s fw failed, DMA_CTRL=%u\n", 844 841 xe_uc_fw_type_repr(uc_fw->type), dma_ctrl); 845 842 846 843 /* Disable the bits once DMA is over */ 847 - xe_mmio_write32(gt, DMA_CTRL, _MASKED_BIT_DISABLE(dma_flags)); 844 + xe_mmio_write32(mmio, DMA_CTRL, _MASKED_BIT_DISABLE(dma_flags)); 848 845 849 846 return ret; 850 847 }
+2 -6
drivers/gpu/drm/xe/xe_vm.c
··· 3199 3199 3200 3200 ret = xe_gt_tlb_invalidation_vma(tile->primary_gt, 3201 3201 &fence[fence_id], vma); 3202 - if (ret < 0) { 3203 - xe_gt_tlb_invalidation_fence_fini(&fence[fence_id]); 3202 + if (ret) 3204 3203 goto wait; 3205 - } 3206 3204 ++fence_id; 3207 3205 3208 3206 if (!tile->media_gt) ··· 3212 3214 3213 3215 ret = xe_gt_tlb_invalidation_vma(tile->media_gt, 3214 3216 &fence[fence_id], vma); 3215 - if (ret < 0) { 3216 - xe_gt_tlb_invalidation_fence_fini(&fence[fence_id]); 3217 + if (ret) 3217 3218 goto wait; 3218 - } 3219 3219 ++fence_id; 3220 3220 } 3221 3221 }
+4 -3
drivers/gpu/drm/xe/xe_vram.c
··· 169 169 u64 offset_hi, offset_lo; 170 170 u32 nodes, num_enabled; 171 171 172 - reg = xe_mmio_read32(gt, MIRROR_FUSE3); 172 + reg = xe_mmio_read32(&gt->mmio, MIRROR_FUSE3); 173 173 nodes = REG_FIELD_GET(XE2_NODE_ENABLE_MASK, reg); 174 174 num_enabled = hweight32(nodes); /* Number of enabled l3 nodes */ 175 175 ··· 185 185 offset = round_up(offset, SZ_128K); /* SW must round up to nearest 128K */ 186 186 187 187 /* We don't expect any holes */ 188 - xe_assert_msg(xe, offset == (xe_mmio_read64_2x32(gt, GSMBASE) - ccs_size), 188 + xe_assert_msg(xe, offset == (xe_mmio_read64_2x32(&gt_to_tile(gt)->mmio, GSMBASE) - 189 + ccs_size), 189 190 "Hole between CCS and GSM.\n"); 190 191 } else { 191 192 reg = xe_gt_mcr_unicast_read_any(gt, XEHP_FLAT_CCS_BASE_ADDR); ··· 258 257 if (xe->info.has_flat_ccs) { 259 258 offset = get_flat_ccs_offset(gt, *tile_size); 260 259 } else { 261 - offset = xe_mmio_read64_2x32(gt, GSMBASE); 260 + offset = xe_mmio_read64_2x32(&tile->mmio, GSMBASE); 262 261 } 263 262 264 263 /* remove the tile offset so we have just the available size */
+55 -2
drivers/gpu/drm/xe/xe_wa.c
··· 8 8 #include <drm/drm_managed.h> 9 9 #include <kunit/visibility.h> 10 10 #include <linux/compiler_types.h> 11 + #include <linux/fault-inject.h> 11 12 12 13 #include <generated/xe_wa_oob.h> 13 14 ··· 249 248 { XE_RTP_NAME("14019449301"), 250 249 XE_RTP_RULES(MEDIA_VERSION(1301), ENGINE_CLASS(VIDEO_DECODE)), 251 250 XE_RTP_ACTIONS(SET(VDBOX_CGCTL3F08(0), CG3DDISHRS_CLKGATE_DIS)), 251 + XE_RTP_ENTRY_FLAG(FOREACH_ENGINE), 252 + }, 253 + 254 + /* Xe3_LPG */ 255 + 256 + { XE_RTP_NAME("14021871409"), 257 + XE_RTP_RULES(GRAPHICS_VERSION(3000), GRAPHICS_STEP(A0, B0)), 258 + XE_RTP_ACTIONS(SET(UNSLCGCTL9454, LSCFE_CLKGATE_DIS)) 259 + }, 260 + 261 + /* Xe3_LPM */ 262 + 263 + { XE_RTP_NAME("16021867713"), 264 + XE_RTP_RULES(MEDIA_VERSION(3000), 265 + ENGINE_CLASS(VIDEO_DECODE)), 266 + XE_RTP_ACTIONS(SET(VDBOX_CGCTL3F1C(0), MFXPIPE_CLKGATE_DIS)), 267 + XE_RTP_ENTRY_FLAG(FOREACH_ENGINE), 268 + }, 269 + { XE_RTP_NAME("16021865536"), 270 + XE_RTP_RULES(MEDIA_VERSION(3000), 271 + ENGINE_CLASS(VIDEO_DECODE)), 272 + XE_RTP_ACTIONS(SET(VDBOX_CGCTL3F10(0), IECPUNIT_CLKGATE_DIS)), 273 + XE_RTP_ENTRY_FLAG(FOREACH_ENGINE), 274 + }, 275 + { XE_RTP_NAME("14021486841"), 276 + XE_RTP_RULES(MEDIA_VERSION(3000), MEDIA_STEP(A0, B0), 277 + ENGINE_CLASS(VIDEO_DECODE)), 278 + XE_RTP_ACTIONS(SET(VDBOX_CGCTL3F10(0), RAMDFTUNIT_CLKGATE_DIS)), 252 279 XE_RTP_ENTRY_FLAG(FOREACH_ENGINE), 253 280 }, 254 281 ··· 596 567 XE_RTP_ACTION_FLAG(ENGINE_BASE))) 597 568 }, 598 569 570 + /* Xe3_LPG */ 571 + 572 + { XE_RTP_NAME("14021402888"), 573 + XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3000, 3001), FUNC(xe_rtp_match_first_render_or_compute)), 574 + XE_RTP_ACTIONS(SET(HALF_SLICE_CHICKEN7, CLEAR_OPTIMIZATION_DISABLE)) 575 + }, 576 + 599 577 {} 600 578 }; 601 579 ··· 746 710 DIS_PARTIAL_AUTOSTRIP | 747 711 DIS_AUTOSTRIP)) 748 712 }, 713 + { XE_RTP_NAME("15016589081"), 714 + XE_RTP_RULES(GRAPHICS_VERSION(2004), ENGINE_CLASS(RENDER)), 715 + XE_RTP_ACTIONS(SET(CHICKEN_RASTER_1, DIS_CLIP_NEGATIVE_BOUNDING_BOX)) 716 + }, 
749 717 750 718 /* Xe2_HPG */ 751 719 { XE_RTP_NAME("15010599737"), ··· 776 736 { XE_RTP_NAME("15016589081"), 777 737 XE_RTP_RULES(GRAPHICS_VERSION(2001), ENGINE_CLASS(RENDER)), 778 738 XE_RTP_ACTIONS(SET(CHICKEN_RASTER_1, DIS_CLIP_NEGATIVE_BOUNDING_BOX)) 739 + }, 740 + 741 + /* Xe3_LPG */ 742 + { XE_RTP_NAME("14021490052"), 743 + XE_RTP_RULES(GRAPHICS_VERSION(3000), GRAPHICS_STEP(A0, B0), 744 + ENGINE_CLASS(RENDER)), 745 + XE_RTP_ACTIONS(SET(FF_MODE, 746 + DIS_MESH_PARTIAL_AUTOSTRIP | 747 + DIS_MESH_AUTOSTRIP), 748 + SET(VFLSKPD, 749 + DIS_PARTIAL_AUTOSTRIP | 750 + DIS_AUTOSTRIP)) 779 751 }, 780 752 781 753 {} ··· 902 850 903 851 return 0; 904 852 } 853 + ALLOW_ERROR_INJECTION(xe_wa_init, ERRNO); /* See xe_pci_probe() */ 905 854 906 855 void xe_wa_dump(struct xe_gt *gt, struct drm_printer *p) 907 856 { ··· 940 887 */ 941 888 void xe_wa_apply_tile_workarounds(struct xe_tile *tile) 942 889 { 943 - struct xe_gt *mmio = tile->primary_gt; 890 + struct xe_mmio *mmio = &tile->mmio; 944 891 945 892 if (IS_SRIOV_VF(tile->xe)) 946 893 return; 947 894 948 - if (XE_WA(mmio, 22010954014)) 895 + if (XE_WA(tile->primary_gt, 22010954014)) 949 896 xe_mmio_rmw32(mmio, XEHP_CLOCK_GATE_DIS, 0, SGSI_SIDECLK_DIS); 950 897 }
+2
drivers/gpu/drm/xe/xe_wa_oob.rules
··· 33 33 GRAPHICS_VERSION(2004) 34 34 22019338487 MEDIA_VERSION(2000) 35 35 GRAPHICS_VERSION(2001) 36 + MEDIA_VERSION(3000), MEDIA_STEP(A0, B0) 36 37 22019338487_display PLATFORM(LUNARLAKE) 37 38 16023588340 GRAPHICS_VERSION(2001) 38 39 14019789679 GRAPHICS_VERSION(1255) 39 40 GRAPHICS_VERSION_RANGE(1270, 2004) 41 + no_media_l3 MEDIA_VERSION(3000)
+9 -6
drivers/gpu/drm/xe/xe_wopcm.c
··· 5 5 6 6 #include "xe_wopcm.h" 7 7 8 + #include <linux/fault-inject.h> 9 + 8 10 #include "regs/xe_guc_regs.h" 9 11 #include "xe_device.h" 10 12 #include "xe_force_wake.h" ··· 125 123 static bool __wopcm_regs_locked(struct xe_gt *gt, 126 124 u32 *guc_wopcm_base, u32 *guc_wopcm_size) 127 125 { 128 - u32 reg_base = xe_mmio_read32(gt, DMA_GUC_WOPCM_OFFSET); 129 - u32 reg_size = xe_mmio_read32(gt, GUC_WOPCM_SIZE); 126 + u32 reg_base = xe_mmio_read32(&gt->mmio, DMA_GUC_WOPCM_OFFSET); 127 + u32 reg_size = xe_mmio_read32(&gt->mmio, GUC_WOPCM_SIZE); 130 128 131 129 if (!(reg_size & GUC_WOPCM_SIZE_LOCKED) || 132 130 !(reg_base & GUC_WOPCM_OFFSET_VALID)) ··· 152 150 XE_WARN_ON(size & ~GUC_WOPCM_SIZE_MASK); 153 151 154 152 mask = GUC_WOPCM_SIZE_MASK | GUC_WOPCM_SIZE_LOCKED; 155 - err = xe_mmio_write32_and_verify(gt, GUC_WOPCM_SIZE, size, mask, 153 + err = xe_mmio_write32_and_verify(&gt->mmio, GUC_WOPCM_SIZE, size, mask, 156 154 size | GUC_WOPCM_SIZE_LOCKED); 157 155 if (err) 158 156 goto err_out; 159 157 160 158 mask = GUC_WOPCM_OFFSET_MASK | GUC_WOPCM_OFFSET_VALID | huc_agent; 161 - err = xe_mmio_write32_and_verify(gt, DMA_GUC_WOPCM_OFFSET, 159 + err = xe_mmio_write32_and_verify(&gt->mmio, DMA_GUC_WOPCM_OFFSET, 162 160 base | huc_agent, mask, 163 161 base | huc_agent | 164 162 GUC_WOPCM_OFFSET_VALID); ··· 171 169 drm_notice(&xe->drm, "Failed to init uC WOPCM registers!\n"); 172 170 drm_notice(&xe->drm, "%s(%#x)=%#x\n", "DMA_GUC_WOPCM_OFFSET", 173 171 DMA_GUC_WOPCM_OFFSET.addr, 174 - xe_mmio_read32(gt, DMA_GUC_WOPCM_OFFSET)); 172 + xe_mmio_read32(&gt->mmio, DMA_GUC_WOPCM_OFFSET)); 175 173 drm_notice(&xe->drm, "%s(%#x)=%#x\n", "GUC_WOPCM_SIZE", 176 174 GUC_WOPCM_SIZE.addr, 177 - xe_mmio_read32(gt, GUC_WOPCM_SIZE)); 175 + xe_mmio_read32(&gt->mmio, GUC_WOPCM_SIZE)); 178 176 179 177 return err; 180 178 } ··· 270 268 271 269 return ret; 272 270 } 271 + ALLOW_ERROR_INJECTION(xe_wopcm_init, ERRNO); /* See xe_pci_probe() */
+64
include/drm/drm_print.h
··· 177 177 void *arg; 178 178 const void *origin; 179 179 const char *prefix; 180 + struct { 181 + unsigned int series; 182 + unsigned int counter; 183 + } line; 180 184 enum drm_debug_category category; 181 185 }; 182 186 ··· 191 187 void __drm_printfn_info(struct drm_printer *p, struct va_format *vaf); 192 188 void __drm_printfn_dbg(struct drm_printer *p, struct va_format *vaf); 193 189 void __drm_printfn_err(struct drm_printer *p, struct va_format *vaf); 190 + void __drm_printfn_line(struct drm_printer *p, struct va_format *vaf); 194 191 195 192 __printf(2, 3) 196 193 void drm_printf(struct drm_printer *p, const char *f, ...); ··· 414 409 .prefix = prefix 415 410 }; 416 411 return p; 412 + } 413 + 414 + /** 415 + * drm_line_printer - construct a &drm_printer that prefixes outputs with line numbers 416 + * @p: the &struct drm_printer which actually generates the output 417 + * @prefix: optional output prefix, or NULL for no prefix 418 + * @series: optional unique series identifier, or 0 to omit identifier in the output 419 + * 420 + * This printer can be used to increase the robustness of the captured output 421 + * to make sure we didn't lose any intermediate lines of the output. Helpful 422 + * while capturing some crash data. 
423 + * 424 + * Example 1:: 425 + * 426 + * void crash_dump(struct drm_device *drm) 427 + * { 428 + * static unsigned int id; 429 + * struct drm_printer p = drm_err_printer(drm, "crash"); 430 + * struct drm_printer lp = drm_line_printer(&p, "dump", ++id); 431 + * 432 + * drm_printf(&lp, "foo"); 433 + * drm_printf(&lp, "bar"); 434 + * } 435 + * 436 + * Above code will print into the dmesg something like:: 437 + * 438 + * [ ] 0000:00:00.0: [drm] *ERROR* crash dump 1.1: foo 439 + * [ ] 0000:00:00.0: [drm] *ERROR* crash dump 1.2: bar 440 + * 441 + * Example 2:: 442 + * 443 + * void line_dump(struct device *dev) 444 + * { 445 + * struct drm_printer p = drm_info_printer(dev); 446 + * struct drm_printer lp = drm_line_printer(&p, NULL, 0); 447 + * 448 + * drm_printf(&lp, "foo"); 449 + * drm_printf(&lp, "bar"); 450 + * } 451 + * 452 + * Above code will print:: 453 + * 454 + * [ ] 0000:00:00.0: [drm] 1: foo 455 + * [ ] 0000:00:00.0: [drm] 2: bar 456 + * 457 + * RETURNS: 458 + * The &drm_printer object 459 + */ 460 + static inline struct drm_printer drm_line_printer(struct drm_printer *p, 461 + const char *prefix, 462 + unsigned int series) 463 + { 464 + struct drm_printer lp = { 465 + .printfn = __drm_printfn_line, 466 + .arg = p, 467 + .prefix = prefix, 468 + .line = { .series = series, }, 469 + }; 470 + return lp; 417 471 } 418 472 419 473 /*
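Not part of the diff: a userspace model of the "prefix series.counter:" numbering that the drm_line_printer kernel-doc above describes. The real formatting lives in __drm_printfn_line() in drm_print.c; this sketch only mirrors the documented output shape, and the struct and function names are invented for illustration:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Model of the line printer's state: an optional textual prefix, an
 * optional series id (0 = omitted), and a per-line counter. */
struct line_printer {
	const char *prefix;   /* may be NULL */
	unsigned int series;  /* 0 means "no series id" */
	unsigned int counter; /* incremented for every emitted line */
};

/* Format one message the way the kernel-doc examples show:
 * "prefix series.counter: msg" or just "counter: msg". */
static void lp_format(struct line_printer *lp, const char *msg,
		      char *out, size_t out_sz)
{
	++lp->counter;
	if (lp->prefix && lp->series)
		snprintf(out, out_sz, "%s %u.%u: %s", lp->prefix,
			 lp->series, lp->counter, msg);
	else if (lp->series)
		snprintf(out, out_sz, "%u.%u: %s", lp->series,
			 lp->counter, msg);
	else if (lp->prefix)
		snprintf(out, out_sz, "%s %u: %s", lp->prefix,
			 lp->counter, msg);
	else
		snprintf(out, out_sz, "%u: %s", lp->counter, msg);
}
```

In the kernel-doc's Example 1, the "crash" part comes from the underlying error printer; the line printer itself only contributes the "dump 1.1:" portion modeled here.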
+39 -7
include/drm/intel/xe_pciids.h
··· 97 97 #define XE_ADLN_IDS(MACRO__, ...) \ 98 98 MACRO__(0x46D0, ## __VA_ARGS__), \ 99 99 MACRO__(0x46D1, ## __VA_ARGS__), \ 100 - MACRO__(0x46D2, ## __VA_ARGS__) 100 + MACRO__(0x46D2, ## __VA_ARGS__), \ 101 + MACRO__(0x46D3, ## __VA_ARGS__), \ 102 + MACRO__(0x46D4, ## __VA_ARGS__) 101 103 102 104 /* RPL-S */ 103 105 #define XE_RPLS_IDS(MACRO__, ...) \ ··· 122 120 123 121 /* RPL-P */ 124 122 #define XE_RPLP_IDS(MACRO__, ...) \ 125 - XE_RPLU_IDS(MACRO__, ## __VA_ARGS__), \ 126 123 MACRO__(0xA720, ## __VA_ARGS__), \ 127 124 MACRO__(0xA7A0, ## __VA_ARGS__), \ 128 125 MACRO__(0xA7A8, ## __VA_ARGS__), \ ··· 176 175 XE_ATS_M150_IDS(MACRO__, ## __VA_ARGS__),\ 177 176 XE_ATS_M75_IDS(MACRO__, ## __VA_ARGS__) 178 177 179 - /* MTL / ARL */ 178 + /* ARL */ 179 + #define XE_ARL_IDS(MACRO__, ...) \ 180 + MACRO__(0x7D41, ## __VA_ARGS__), \ 181 + MACRO__(0x7D51, ## __VA_ARGS__), \ 182 + MACRO__(0x7D67, ## __VA_ARGS__), \ 183 + MACRO__(0x7DD1, ## __VA_ARGS__), \ 184 + MACRO__(0xB640, ## __VA_ARGS__) 185 + 186 + /* MTL */ 180 187 #define XE_MTL_IDS(MACRO__, ...) \ 181 188 MACRO__(0x7D40, ## __VA_ARGS__), \ 182 - MACRO__(0x7D41, ## __VA_ARGS__), \ 183 189 MACRO__(0x7D45, ## __VA_ARGS__), \ 184 - MACRO__(0x7D51, ## __VA_ARGS__), \ 185 190 MACRO__(0x7D55, ## __VA_ARGS__), \ 186 191 MACRO__(0x7D60, ## __VA_ARGS__), \ 187 - MACRO__(0x7D67, ## __VA_ARGS__), \ 188 - MACRO__(0x7DD1, ## __VA_ARGS__), \ 189 192 MACRO__(0x7DD5, ## __VA_ARGS__) 193 + 194 + /* PVC */ 195 + #define XE_PVC_IDS(MACRO__, ...) \ 196 + MACRO__(0x0B69, ## __VA_ARGS__), \ 197 + MACRO__(0x0B6E, ## __VA_ARGS__), \ 198 + MACRO__(0x0BD4, ## __VA_ARGS__), \ 199 + MACRO__(0x0BD5, ## __VA_ARGS__), \ 200 + MACRO__(0x0BD6, ## __VA_ARGS__), \ 201 + MACRO__(0x0BD7, ## __VA_ARGS__), \ 202 + MACRO__(0x0BD8, ## __VA_ARGS__), \ 203 + MACRO__(0x0BD9, ## __VA_ARGS__), \ 204 + MACRO__(0x0BDA, ## __VA_ARGS__), \ 205 + MACRO__(0x0BDB, ## __VA_ARGS__), \ 206 + MACRO__(0x0BE0, ## __VA_ARGS__), \ 207 + MACRO__(0x0BE1, ## __VA_ARGS__), \ 208 + MACRO__(0x0BE5, ## __VA_ARGS__) 190 209 191 210 #define XE_LNL_IDS(MACRO__, ...) \ 192 211 MACRO__(0x6420, ## __VA_ARGS__), \ ··· 219 198 MACRO__(0xE20C, ## __VA_ARGS__), \ 220 199 MACRO__(0xE20D, ## __VA_ARGS__), \ 221 200 MACRO__(0xE212, ## __VA_ARGS__) 201 + 202 + #define XE_PTL_IDS(MACRO__, ...) \ 203 + MACRO__(0xB080, ## __VA_ARGS__), \ 204 + MACRO__(0xB081, ## __VA_ARGS__), \ 205 + MACRO__(0xB082, ## __VA_ARGS__), \ 206 + MACRO__(0xB090, ## __VA_ARGS__), \ 207 + MACRO__(0xB091, ## __VA_ARGS__), \ 208 + MACRO__(0xB092, ## __VA_ARGS__), \ 209 + MACRO__(0xB0A0, ## __VA_ARGS__), \ 210 + MACRO__(0xB0A1, ## __VA_ARGS__), \ 211 + MACRO__(0xB0A2, ## __VA_ARGS__) 212 213 #endif
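The XE_*_IDS lists above follow the X-macro pattern: the caller supplies the MACRO__ that gets applied to every device id, so one id list can expand into PCI tables, switch cases, and so on. A self-contained sketch with made-up ids (DEMO_IDS and its entries are not real hardware) showing how such a list expands into a table:

```c
#include <assert.h>
#include <stdint.h>

/* X-macro id list in the style of XE_*_IDS; the ids are fabricated.
 * `## __VA_ARGS__` is the GNU extension the kernel headers rely on. */
#define DEMO_IDS(MACRO__, ...) \
	MACRO__(0x1234, ## __VA_ARGS__), \
	MACRO__(0x5678, ## __VA_ARGS__)

struct pci_entry {
	uint16_t device;
	uint16_t flags;
};

/* One possible MACRO__: build an initializer for a table entry. */
#define MAKE_ENTRY(dev, fl) { .device = (dev), .flags = (fl) }

/* Expands to: { {0x1234, 0x1}, {0x5678, 0x1} } */
static const struct pci_entry demo_table[] = {
	DEMO_IDS(MAKE_ENTRY, 0x1),
};
```

The payoff is that adding a new device id (as the ADL-N, ARL, PVC, and PTL hunks do) touches exactly one list, and every expansion site picks it up automatically.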
+3 -1
include/uapi/drm/xe_drm.h
··· 512 512 * containing the following in mask: 513 513 * ``DSS_COMPUTE ff ff ff ff 00 00 00 00`` 514 514 * means 32 DSS are available for compute. 515 - * - %DRM_XE_TOPO_L3_BANK - To query the mask of enabled L3 banks 515 + * - %DRM_XE_TOPO_L3_BANK - To query the mask of enabled L3 banks. This type 516 + * may be omitted if the driver is unable to query the mask from the 517 + * hardware. 516 518 * - %DRM_XE_TOPO_EU_PER_DSS - To query the mask of Execution Units (EU) 517 519 * available per Dual Sub Slices (DSS). For example a query response 518 520 * containing the following in mask:
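Since DRM_XE_TOPO_L3_BANK may now be omitted from the topology query, userspace walking the query blob should treat its absence as "mask unavailable" rather than an error. An illustrative parsing sketch; the struct is a local mirror of the fixed part of the uapi layout, and the TOPO_* values and buffer contents are fabricated for the example:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the fixed header of struct drm_xe_query_topology_mask
 * (uapi xe_drm.h); num_bytes of mask data follow each header. */
struct topo_mask {
	uint16_t gt_id;
	uint16_t type;      /* DRM_XE_TOPO_* */
	uint32_t num_bytes; /* length of the mask that follows */
};

#define TOPO_DSS_GEOMETRY 1
#define TOPO_L3_BANK      3

/* Walk the packed, variable-size entries; a missing L3 bank entry is
 * legal and simply means the driver could not query the mask. */
static int find_l3_bank(const uint8_t *buf, size_t len, uint16_t gt_id)
{
	size_t off = 0;

	while (off + sizeof(struct topo_mask) <= len) {
		const struct topo_mask *t = (const void *)(buf + off);

		if (t->gt_id == gt_id && t->type == TOPO_L3_BANK)
			return 1;
		off += sizeof(*t) + t->num_bytes;
	}
	return 0;
}
```

The key design point for callers: iterate by type instead of assuming a fixed entry order or count per GT, so the query stays forward-compatible as entries are added or conditionally omitted.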