
drm/xe/guc: Enable WA_DUAL_QUEUE for newer platforms

The DUAL_QUEUE_WA tells the GuC not to allow concurrent submissions
on the RCS and CCSes with different address spaces, which on DG2 is
required as a WA for a HW bug. On newer platforms, this restriction
has been moved into HW at the CS level, by stalling the RCS/CCS
context switch when one of the other RCS/CCSes is busy with a
different address space. While functionally correct, having a
submission stalled in the HW limits the GuC's ability to shuffle
things around and can cause complications if the non-stalled
submission runs for a long time, because the GuC doesn't know that
the stalled submission isn't actually running and might declare it as
hung. Therefore, we enable the DUAL_QUEUE_WA on all newer platforms
to move management back to the GuC.

Note that the GuC specs also recommend enabling this for all platforms
starting from MTL that have a CCS.

v2: only apply the WA on GTs that have CCS engines
v3: split comment (Jonathan)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Jesus Narvaez <jesus.narvaez@intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241213181012.2178794-1-daniele.ceraolospurio@intel.com

+29 -1
drivers/gpu/drm/xe/xe_guc.c
···
 	return flags;
 }
 
+static bool needs_wa_dual_queue(struct xe_gt *gt)
+{
+	/*
+	 * The DUAL_QUEUE_WA tells the GuC to not allow concurrent submissions
+	 * on RCS and CCSes with different address spaces, which on DG2 is
+	 * required as a WA for an HW bug.
+	 */
+	if (XE_WA(gt, 22011391025))
+		return true;
+
+	/*
+	 * On newer platforms, the HW has been updated to not allow parallel
+	 * execution of different address spaces, so the RCS/CCS will stall the
+	 * context switch if one of the other RCS/CCSes is busy with a different
+	 * address space. While functionally correct, having a submission
+	 * stalled on the HW limits the GuC ability to shuffle things around and
+	 * can cause complications if the non-stalled submission runs for a long
+	 * time, because the GuC doesn't know that the stalled submission isn't
+	 * actually running and might declare it as hung. Therefore, we enable
+	 * the DUAL_QUEUE_WA on all newer platforms on GTs that have CCS engines
+	 * to move management back to the GuC.
+	 */
+	if (CCS_MASK(gt) && GRAPHICS_VERx100(gt_to_xe(gt)) >= 1270)
+		return true;
+
+	return false;
+}
+
 static u32 guc_ctl_wa_flags(struct xe_guc *guc)
 {
 	struct xe_device *xe = guc_to_xe(guc);
···
 	if (XE_WA(gt, 14014475959))
 		flags |= GUC_WA_HOLD_CCS_SWITCHOUT;
 
-	if (XE_WA(gt, 22011391025))
+	if (needs_wa_dual_queue(gt))
 		flags |= GUC_WA_DUAL_QUEUE;
 
 	/*
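The decision made by needs_wa_dual_queue() can be modelled outside the kernel as a small truth table. The following is only a sketch for reasoning about the predicate: struct gt_model, its fields, and model_needs_wa_dual_queue() are hypothetical stand-ins invented for this illustration, not xe driver types. In the driver the inputs come from XE_WA(), CCS_MASK() and GRAPHICS_VERx100().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the GT state that the real helper queries.
 * None of these names exist in the xe driver.
 */
struct gt_model {
	bool has_wa_22011391025;	/* models XE_WA(gt, 22011391025) */
	uint32_t ccs_mask;		/* models CCS_MASK(gt) */
	unsigned int graphics_verx100;	/* models GRAPHICS_VERx100(xe) */
};

/* Mirrors the two conditions in the patch, in the same order. */
static bool model_needs_wa_dual_queue(const struct gt_model *gt)
{
	/* Platforms flagged with the HW workaround always need the flag. */
	if (gt->has_wa_22011391025)
		return true;

	/* Newer platforms need it only on GTs that have CCS engines. */
	return gt->ccs_mask != 0 && gt->graphics_verx100 >= 1270;
}
```

Under this model, a GT at version 1270 or later with an empty CCS mask does not get the flag, which corresponds to the v2 change of applying the WA only on GTs that have CCS engines.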