
Merge tag 'drm-intel-gt-next-2023-05-24' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

UAPI Changes:

- New getparam for querying PXP support and load status

Cross-subsystem Changes:

- GSC/MEI proxy driver

Driver Changes:

Fixes/improvements/new stuff:

- Avoid clearing pre-allocated framebuffers with the TTM backend (Nirmoy Das)
- Implement framebuffer mmap support (Nirmoy Das)
- Disable sampler indirect state in bindless heap (Lionel Landwerlin)
- Avoid out-of-bounds access when loading HuC (Lucas De Marchi)
- Actually return an error if GuC version range check fails (John Harrison)
- Get mutex and rpm ref just once in hwm_power_max_write (Ashutosh Dixit)
- Disable PL1 power limit when loading GuC firmware (Ashutosh Dixit)
- Block in hwmon while waiting for GuC reset to complete (Ashutosh Dixit)
- Provide sysfs for SLPC efficient freq (Vinay Belgaumkar)
- Add support for total context runtime for GuC back-end (Umesh Nerlige Ramappa)
- Enable fdinfo for GuC backends (Umesh Nerlige Ramappa)
- Don't capture Gen8 regs on Xe devices (John Harrison)
- Fix error capture for virtual engines (John Harrison)
- Track patch level versions on reduced version firmware files (John Harrison)
- Decode another GuC load failure case (John Harrison)
- GuC loading and firmware table handling fixes (John Harrison)
- Fix confused register capture list creation (John Harrison)
- Dump error capture to kernel log (John Harrison)
- Dump error capture to dmesg on CTB error (John Harrison)
- Disable rps_boost debugfs when SLPC is used (Vinay Belgaumkar)

Future platform enablement:

- Disable stolen memory backed FB for A0 [mtl] (Nirmoy Das)
- Various refactors for multi-tile enablement (Andi Shyti, Tejas Upadhyay)
- Extend Wa_22011802037 to MTL A-step (Madhumitha Tolakanahalli Pradeep)
- WA to clear RDOP clock gating [mtl] (Haridhar Kalvala)
- Set has_llc=0 [mtl] (Fei Yang)
- Define MOCS and PAT tables for MTL (Madhumitha Tolakanahalli Pradeep)
- Add PTE encode function [mtl] (Fei Yang)
- Fix MOCS selftest [mtl] (Fei Yang)
- Workaround coherency issue for Media [mtl] (Fei Yang)
- Add workaround 14018778641 [mtl] (Tejas Upadhyay)
- Implement Wa_14019141245 [mtl] (Radhakrishna Sripada)
- Fix the wa number for Wa_22016670082 [mtl] (Radhakrishna Sripada)
- Use correct huge page manager for MTL (Jonathan Cavitt)
- GSC/MEI support for Meteorlake (Alexander Usyskin, Daniele Ceraolo Spurio)
- Define GuC firmware version for MTL (John Harrison)
- Drop FLAT CCS check [mtl] (Pallavi Mishra)
- Add MTL for remapping CCS FBs [mtl] (Clint Taylor)
- Meteorlake PXP enablement (Alan Previn)
- Do not enable render power-gating on MTL (Andrzej Hajda)
- Add MTL performance tuning changes (Radhakrishna Sripada)
- Extend Wa_16014892111 to MTL A-step (Radhakrishna Sripada)
- PMU multi-tile support (Tvrtko Ursulin)
- End support for set caching ioctl [mtl] (Fei Yang)

Driver refactors:

- Use i915 instead of dev_priv inside the file_priv structure (Andi Shyti)
- Use proper parameter naming in for_each_engine() (Andi Shyti)
- Use gt_err for GT info (Tejas Upadhyay)
- Consolidate duplicated capture list code (John Harrison)
- Capture list naming clean up (John Harrison)
- Use kernel-doc -Werror when CONFIG_DRM_I915_WERROR=y (Jani Nikula)
- Preparation for using PAT index (Fei Yang)
- Use pat_index instead of cache_level (Fei Yang)

Miscellaneous:

- Fix memory leaks in i915 selftests (Cong Liu)
- Record GT error for gt failure (Tejas Upadhyay)
- Migrate platform-dependent mock hugepage selftests to live (Jonathan Cavitt)
- Update the SLPC selftest (Vinay Belgaumkar)
- Throw out set() wrapper (Jani Nikula)
- Large driver kernel doc cleanup (Jani Nikula)
- Fix probe injection CI failures after recent change (John Harrison)
- Make unexpected firmware versions an error in debug builds (John Harrison)
- Silence UBSAN uninitialized bool variable warning (Ashutosh Dixit)
- Fix memory leaks in function live_nop_switch (Cong Liu)

Merges:

- Merge drm/drm-next into drm-intel-gt-next (Joonas Lahtinen)

Signed-off-by: Dave Airlie <airlied@redhat.com>

# Conflicts:
# drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZG5SxCWRSkZhTDtY@tursulin-desk

+4259 -1201
+4 -2
drivers/gpu/drm/i915/Makefile
···
 # general-purpose microcontroller (GuC) support
 i915-y += \
 	gt/uc/intel_gsc_fw.o \
+	gt/uc/intel_gsc_proxy.o \
 	gt/uc/intel_gsc_uc.o \
 	gt/uc/intel_gsc_uc_heci_cmd_submit.o\
 	gt/uc/intel_guc.o \
···
 i915-$(CONFIG_DRM_I915_PXP) += \
 	pxp/intel_pxp_cmd.o \
 	pxp/intel_pxp_debugfs.o \
+	pxp/intel_pxp_gsccs.o \
 	pxp/intel_pxp_irq.o \
 	pxp/intel_pxp_pm.o \
 	pxp/intel_pxp_session.o
···
 #
 # Enable locally for CONFIG_DRM_I915_WERROR=y. See also scripts/Makefile.build
 ifdef CONFIG_DRM_I915_WERROR
-cmd_checkdoc = $(srctree)/scripts/kernel-doc -none $<
+cmd_checkdoc = $(srctree)/scripts/kernel-doc -none -Werror $<
 endif
···
 quiet_cmd_hdrtest = HDRTEST $(patsubst %.hdrtest,%.h,$@)
 cmd_hdrtest = $(CC) $(filter-out $(CFLAGS_GCOV), $(c_flags)) -S -o /dev/null -x c /dev/null -include $<; \
-	$(srctree)/scripts/kernel-doc -none $<; touch $@
+	$(srctree)/scripts/kernel-doc -none -Werror $<; touch $@
 
 $(obj)/%.hdrtest: $(src)/%.h FORCE
 	$(call if_changed_dep,hdrtest)
+7 -7
drivers/gpu/drm/i915/display/intel_dpt.c
···
 static void dpt_insert_page(struct i915_address_space *vm,
 			    dma_addr_t addr,
 			    u64 offset,
-			    enum i915_cache_level level,
+			    unsigned int pat_index,
 			    u32 flags)
 {
 	struct i915_dpt *dpt = i915_vm_to_dpt(vm);
 	gen8_pte_t __iomem *base = dpt->iomem;
 
 	gen8_set_pte(base + offset / I915_GTT_PAGE_SIZE,
-		     vm->pte_encode(addr, level, flags));
+		     vm->pte_encode(addr, pat_index, flags));
 }
 
 static void dpt_insert_entries(struct i915_address_space *vm,
 			       struct i915_vma_resource *vma_res,
-			       enum i915_cache_level level,
+			       unsigned int pat_index,
 			       u32 flags)
 {
 	struct i915_dpt *dpt = i915_vm_to_dpt(vm);
 	gen8_pte_t __iomem *base = dpt->iomem;
-	const gen8_pte_t pte_encode = vm->pte_encode(0, level, flags);
+	const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
 	struct sgt_iter sgt_iter;
 	dma_addr_t addr;
 	int i;
···
 static void dpt_bind_vma(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash,
 			 struct i915_vma_resource *vma_res,
-			 enum i915_cache_level cache_level,
+			 unsigned int pat_index,
 			 u32 flags)
 {
 	u32 pte_flags;
···
 	if (vma_res->bi.lmem)
 		pte_flags |= PTE_LM;
 
-	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
+	vm->insert_entries(vm, vma_res, pat_index, pte_flags);
 
 	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
···
 	vm->vma_ops.bind_vma    = dpt_bind_vma;
 	vm->vma_ops.unbind_vma  = dpt_unbind_vma;
 
-	vm->pte_encode = gen8_ggtt_pte_encode;
+	vm->pte_encode = vm->gt->ggtt->vm.pte_encode;
 
 	dpt->obj = dpt_obj;
 	dpt->obj->is_dpt = true;
+6 -3
drivers/gpu/drm/i915/display/intel_fb.c
···
 {
 	struct drm_i915_private *i915 = to_i915(fb->base.dev);
 
-	return IS_ALDERLAKE_P(i915) && intel_fb_uses_dpt(&fb->base);
+	return (IS_ALDERLAKE_P(i915) || DISPLAY_VER(i915) >= 14) &&
+		intel_fb_uses_dpt(&fb->base);
 }
 
 static int intel_fb_pitch(const struct intel_framebuffer *fb, int color_plane, unsigned int rotation)
···
 			       unsigned int tile_width,
 			       unsigned int src_stride_tiles, unsigned int dst_stride_tiles)
 {
+	struct drm_i915_private *i915 = to_i915(fb->base.dev);
 	unsigned int stride_tiles;
 
-	if (IS_ALDERLAKE_P(to_i915(fb->base.dev)))
+	if (IS_ALDERLAKE_P(i915) || DISPLAY_VER(i915) >= 14)
 		stride_tiles = src_stride_tiles;
 	else
 		stride_tiles = dst_stride_tiles;
···
 	memset(view, 0, sizeof(*view));
 	view->gtt.type = view_type;
 
-	if (view_type == I915_GTT_VIEW_REMAPPED && IS_ALDERLAKE_P(i915))
+	if (view_type == I915_GTT_VIEW_REMAPPED &&
+	    (IS_ALDERLAKE_P(i915) || DISPLAY_VER(i915) >= 14))
 		view->gtt.remapped.plane_alignment = SZ_2M / PAGE_SIZE;
 }
+24 -14
drivers/gpu/drm/i915/display/intel_fbdev.c
···
 #include <drm/drm_crtc.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_fourcc.h>
+#include <drm/drm_gem_framebuffer_helper.h>
 
 #include "gem/i915_gem_lmem.h"
+#include "gem/i915_gem_mman.h"
 
 #include "i915_drv.h"
 #include "intel_display_types.h"
···
 	struct mutex hpd_lock;
 };
 
+static struct intel_fbdev *to_intel_fbdev(struct drm_fb_helper *fb_helper)
+{
+	return container_of(fb_helper, struct intel_fbdev, helper);
+}
+
 static struct intel_frontbuffer *to_frontbuffer(struct intel_fbdev *ifbdev)
 {
 	return ifbdev->fb->frontbuffer;
···
 static int intel_fbdev_set_par(struct fb_info *info)
 {
-	struct drm_fb_helper *fb_helper = info->par;
-	struct intel_fbdev *ifbdev =
-		container_of(fb_helper, struct intel_fbdev, helper);
+	struct intel_fbdev *ifbdev = to_intel_fbdev(info->par);
 	int ret;
 
 	ret = drm_fb_helper_set_par(info);
···
 static int intel_fbdev_blank(int blank, struct fb_info *info)
 {
-	struct drm_fb_helper *fb_helper = info->par;
-	struct intel_fbdev *ifbdev =
-		container_of(fb_helper, struct intel_fbdev, helper);
+	struct intel_fbdev *ifbdev = to_intel_fbdev(info->par);
 	int ret;
 
 	ret = drm_fb_helper_blank(blank, info);
···
 static int intel_fbdev_pan_display(struct fb_var_screeninfo *var,
 				   struct fb_info *info)
 {
-	struct drm_fb_helper *fb_helper = info->par;
-	struct intel_fbdev *ifbdev =
-		container_of(fb_helper, struct intel_fbdev, helper);
+	struct intel_fbdev *ifbdev = to_intel_fbdev(info->par);
 	int ret;
 
 	ret = drm_fb_helper_pan_display(var, info);
···
 		intel_fbdev_invalidate(ifbdev);
 
 	return ret;
+}
+
+static int intel_fbdev_mmap(struct fb_info *info, struct vm_area_struct *vma)
+{
+	struct intel_fbdev *fbdev = to_intel_fbdev(info->par);
+	struct drm_gem_object *bo = drm_gem_fb_get_obj(&fbdev->fb->base, 0);
+	struct drm_i915_gem_object *obj = to_intel_bo(bo);
+
+	return i915_gem_fb_mmap(obj, vma);
 }
 
 static const struct fb_ops intelfb_ops = {
···
 	.fb_imageblit = drm_fb_helper_cfb_imageblit,
 	.fb_pan_display = intel_fbdev_pan_display,
 	.fb_blank = intel_fbdev_blank,
+	.fb_mmap = intel_fbdev_mmap,
 };
 
 static int intelfb_alloc(struct drm_fb_helper *helper,
 			 struct drm_fb_helper_surface_size *sizes)
 {
-	struct intel_fbdev *ifbdev =
-		container_of(helper, struct intel_fbdev, helper);
+	struct intel_fbdev *ifbdev = to_intel_fbdev(helper);
 	struct drm_framebuffer *fb;
 	struct drm_device *dev = helper->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
···
 	obj = ERR_PTR(-ENODEV);
 	if (HAS_LMEM(dev_priv)) {
 		obj = i915_gem_object_create_lmem(dev_priv, size,
-						  I915_BO_ALLOC_CONTIGUOUS);
+						  I915_BO_ALLOC_CONTIGUOUS |
+						  I915_BO_ALLOC_USER);
 	} else {
 		/*
 		 * If the FB is too big, just don't use it since fbdev is not very
···
 static int intelfb_create(struct drm_fb_helper *helper,
 			  struct drm_fb_helper_surface_size *sizes)
 {
-	struct intel_fbdev *ifbdev =
-		container_of(helper, struct intel_fbdev, helper);
+	struct intel_fbdev *ifbdev = to_intel_fbdev(helper);
 	struct intel_framebuffer *intel_fb = ifbdev->fb;
 	struct drm_device *dev = helper->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
+3 -1
drivers/gpu/drm/i915/display/intel_plane_initial.c
···
 	    size * 2 > i915->dsm.usable_size)
 		return NULL;
 
-	obj = i915_gem_object_create_region_at(mem, phys_base, size, 0);
+	obj = i915_gem_object_create_region_at(mem, phys_base, size,
+					       I915_BO_ALLOC_USER |
+					       I915_BO_PREALLOC);
 	if (IS_ERR(obj))
 		return NULL;
+45 -20
drivers/gpu/drm/i915/gem/i915_gem_domain.c
···
 	if (IS_DGFX(i915))
 		return false;
 
-	return !(obj->cache_level == I915_CACHE_NONE ||
-		 obj->cache_level == I915_CACHE_WT);
+	/*
+	 * For objects created by userspace through GEM_CREATE with pat_index
+	 * set by the set_pat extension, i915_gem_object_has_cache_level() will
+	 * always return true, because the coherency of such objects is managed
+	 * by userspace. Otherwise the call here falls back to checking
+	 * whether the object is un-cached or write-through.
+	 */
+	return !(i915_gem_object_has_cache_level(obj, I915_CACHE_NONE) ||
+		 i915_gem_object_has_cache_level(obj, I915_CACHE_WT));
 }
 
 bool i915_gem_cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
···
 {
 	int ret;
 
-	if (obj->cache_level == cache_level)
+	/*
+	 * For objects created by userspace through GEM_CREATE with pat_index
+	 * set by the set_pat extension, simply return 0 here without touching
+	 * the cache setting, because such objects have an immutable cache
+	 * setting by design and are always managed by userspace.
+	 */
+	if (i915_gem_object_has_cache_level(obj, cache_level))
 		return 0;
 
 	ret = i915_gem_object_wait(obj,
···
 		return ret;
 
 	/* Always invalidate stale cachelines */
-	if (obj->cache_level != cache_level) {
-		i915_gem_object_set_cache_coherency(obj, cache_level);
-		obj->cache_dirty = true;
-	}
+	i915_gem_object_set_cache_coherency(obj, cache_level);
+	obj->cache_dirty = true;
 
 	/* The cache-level will be applied when each vma is rebound. */
 	return i915_gem_object_unbind(obj,
···
 		goto out;
 	}
 
-	switch (obj->cache_level) {
-	case I915_CACHE_LLC:
-	case I915_CACHE_L3_LLC:
-		args->caching = I915_CACHING_CACHED;
-		break;
-
-	case I915_CACHE_WT:
-		args->caching = I915_CACHING_DISPLAY;
-		break;
-
-	default:
-		args->caching = I915_CACHING_NONE;
-		break;
+	/*
+	 * This ioctl should be disabled for objects with pat_index
+	 * set by user space.
+	 */
+	if (obj->pat_set_by_user) {
+		err = -EOPNOTSUPP;
+		goto out;
 	}
+
+	if (i915_gem_object_has_cache_level(obj, I915_CACHE_LLC) ||
+	    i915_gem_object_has_cache_level(obj, I915_CACHE_L3_LLC))
+		args->caching = I915_CACHING_CACHED;
+	else if (i915_gem_object_has_cache_level(obj, I915_CACHE_WT))
+		args->caching = I915_CACHING_DISPLAY;
+	else
+		args->caching = I915_CACHING_NONE;
 out:
 	rcu_read_unlock();
 	return err;
···
 	if (IS_DGFX(i915))
 		return -ENODEV;
+
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
+		return -EOPNOTSUPP;
 
 	switch (args->caching) {
 	case I915_CACHING_NONE:
···
 	obj = i915_gem_object_lookup(file, args->handle);
 	if (!obj)
 		return -ENOENT;
+
+	/*
+	 * This ioctl should be disabled for objects with pat_index
+	 * set by user space.
+	 */
+	if (obj->pat_set_by_user) {
+		ret = -EOPNOTSUPP;
+		goto out;
+	}
 
 	/*
 	 * The caching mode of proxy object is handled by its generator, and
+12 -3
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
···
 	if (DBG_FORCE_RELOC == FORCE_GTT_RELOC)
 		return false;
 
+	/*
+	 * For objects created by userspace through GEM_CREATE with pat_index
+	 * set by the set_pat extension, i915_gem_object_has_cache_level() always
+	 * returns true; otherwise the call falls back to checking whether
+	 * the object is un-cached.
+	 */
 	return (cache->has_llc ||
 		obj->cache_dirty ||
-		obj->cache_level != I915_CACHE_NONE);
+		!i915_gem_object_has_cache_level(obj, I915_CACHE_NONE));
 }
 
 static int eb_reserve_vma(struct i915_execbuffer *eb,
···
 	if (drm_mm_node_allocated(&cache->node)) {
 		ggtt->vm.insert_page(&ggtt->vm,
 				     i915_gem_object_get_dma_address(obj, page),
-				     offset, I915_CACHE_NONE, 0);
+				     offset,
+				     i915_gem_get_pat_index(ggtt->vm.i915,
+							    I915_CACHE_NONE),
+				     0);
 	} else {
 		offset += page << PAGE_SHIFT;
 	}
···
 	reloc_cache_unmap(&eb->reloc_cache);
 	mutex_lock(&vma->vm->mutex);
 	err = i915_vma_bind(target->vma,
-			    target->vma->obj->cache_level,
+			    target->vma->obj->pat_index,
 			    PIN_GLOBAL, NULL, NULL);
 	mutex_unlock(&vma->vm->mutex);
 	reloc_cache_remap(&eb->reloc_cache, ev->vma->obj);
+102 -46
drivers/gpu/drm/i915/gem/i915_gem_mman.c
···
 }
 
 /* Access to snoopable pages through the GTT is incoherent. */
-if (obj->cache_level != I915_CACHE_NONE && !HAS_LLC(i915)) {
+/*
+ * For objects created by userspace through GEM_CREATE with pat_index
+ * set by the set_pat extension, coherency is managed by userspace; make
+ * sure we don't fail handling the vm fault by calling
+ * i915_gem_object_has_cache_level(), which always returns true for such
+ * objects. Otherwise this helper function falls back to checking
+ * whether the object is un-cached.
+ */
+if (!(i915_gem_object_has_cache_level(obj, I915_CACHE_NONE) ||
+      HAS_LLC(i915))) {
 	ret = -EFAULT;
 	goto err_unpin;
 }
···
 	return file;
 }
 
-/*
- * This overcomes the limitation in drm_gem_mmap's assignment of a
- * drm_gem_object as the vma->vm_private_data. Since we need to
- * be able to resolve multiple mmap offsets which could be tied
- * to a single gem object.
- */
-int i915_gem_mmap(struct file *filp, struct vm_area_struct *vma)
+static int
+i915_gem_object_mmap(struct drm_i915_gem_object *obj,
+		     struct i915_mmap_offset *mmo,
+		     struct vm_area_struct *vma)
 {
-	struct drm_vma_offset_node *node;
-	struct drm_file *priv = filp->private_data;
-	struct drm_device *dev = priv->minor->dev;
-	struct drm_i915_gem_object *obj = NULL;
-	struct i915_mmap_offset *mmo = NULL;
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct drm_device *dev = &i915->drm;
 	struct file *anon;
-
-	if (drm_dev_is_unplugged(dev))
-		return -ENODEV;
-
-	rcu_read_lock();
-	drm_vma_offset_lock_lookup(dev->vma_offset_manager);
-	node = drm_vma_offset_exact_lookup_locked(dev->vma_offset_manager,
-						  vma->vm_pgoff,
-						  vma_pages(vma));
-	if (node && drm_vma_node_is_allowed(node, priv)) {
-		/*
-		 * Skip 0-refcnted objects as it is in the process of being
-		 * destroyed and will be invalid when the vma manager lock
-		 * is released.
-		 */
-		if (!node->driver_private) {
-			mmo = container_of(node, struct i915_mmap_offset, vma_node);
-			obj = i915_gem_object_get_rcu(mmo->obj);
-
-			GEM_BUG_ON(obj && obj->ops->mmap_ops);
-		} else {
-			obj = i915_gem_object_get_rcu
-				(container_of(node, struct drm_i915_gem_object,
-					      base.vma_node));
-
-			GEM_BUG_ON(obj && !obj->ops->mmap_ops);
-		}
-	}
-	drm_vma_offset_unlock_lookup(dev->vma_offset_manager);
-	rcu_read_unlock();
-	if (!obj)
-		return node ? -EACCES : -EINVAL;
 
 	if (i915_gem_object_is_readonly(obj)) {
 		if (vma->vm_flags & VM_WRITE) {
···
 	if (obj->ops->mmap_ops) {
 		vma->vm_page_prot = pgprot_decrypted(vm_get_page_prot(vma->vm_flags));
 		vma->vm_ops = obj->ops->mmap_ops;
-		vma->vm_private_data = node->driver_private;
+		vma->vm_private_data = obj->base.vma_node.driver_private;
 		return 0;
 	}
···
 	vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
 
 	return 0;
+}
+
+/*
+ * This overcomes the limitation in drm_gem_mmap's assignment of a
+ * drm_gem_object as the vma->vm_private_data. Since we need to
+ * be able to resolve multiple mmap offsets which could be tied
+ * to a single gem object.
+ */
+int i915_gem_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+	struct drm_vma_offset_node *node;
+	struct drm_file *priv = filp->private_data;
+	struct drm_device *dev = priv->minor->dev;
+	struct drm_i915_gem_object *obj = NULL;
+	struct i915_mmap_offset *mmo = NULL;
+
+	if (drm_dev_is_unplugged(dev))
+		return -ENODEV;
+
+	rcu_read_lock();
+	drm_vma_offset_lock_lookup(dev->vma_offset_manager);
+	node = drm_vma_offset_exact_lookup_locked(dev->vma_offset_manager,
+						  vma->vm_pgoff,
+						  vma_pages(vma));
+	if (node && drm_vma_node_is_allowed(node, priv)) {
+		/*
+		 * Skip 0-refcnted objects as it is in the process of being
+		 * destroyed and will be invalid when the vma manager lock
+		 * is released.
+		 */
+		if (!node->driver_private) {
+			mmo = container_of(node, struct i915_mmap_offset, vma_node);
+			obj = i915_gem_object_get_rcu(mmo->obj);
+
+			GEM_BUG_ON(obj && obj->ops->mmap_ops);
+		} else {
+			obj = i915_gem_object_get_rcu
+				(container_of(node, struct drm_i915_gem_object,
+					      base.vma_node));
+
+			GEM_BUG_ON(obj && !obj->ops->mmap_ops);
+		}
+	}
+	drm_vma_offset_unlock_lookup(dev->vma_offset_manager);
+	rcu_read_unlock();
+	if (!obj)
+		return node ? -EACCES : -EINVAL;
+
+	return i915_gem_object_mmap(obj, mmo, vma);
+}
+
+int i915_gem_fb_mmap(struct drm_i915_gem_object *obj, struct vm_area_struct *vma)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct drm_device *dev = &i915->drm;
+	struct i915_mmap_offset *mmo = NULL;
+	enum i915_mmap_type mmap_type;
+	struct i915_ggtt *ggtt = to_gt(i915)->ggtt;
+
+	if (drm_dev_is_unplugged(dev))
+		return -ENODEV;
+
+	/* handle ttm object */
+	if (obj->ops->mmap_ops) {
+		/*
+		 * ttm fault handler, ttm_bo_vm_fault_reserved() uses fake offset
+		 * to calculate page offset so set that up.
+		 */
+		vma->vm_pgoff += drm_vma_node_start(&obj->base.vma_node);
+	} else {
+		/* handle stolen and smem objects */
+		mmap_type = i915_ggtt_has_aperture(ggtt) ? I915_MMAP_TYPE_GTT : I915_MMAP_TYPE_WC;
+		mmo = mmap_offset_attach(obj, mmap_type, NULL);
+		if (!mmo)
+			return -ENODEV;
+	}
+
+	/*
+	 * When we install vm_ops for mmap we are too late for
+	 * the vm_ops->open() which increases the ref_count of
+	 * this obj and then it gets decreased by the vm_ops->close().
+	 * To balance this increase the obj ref_count here.
+	 */
+	obj = i915_gem_object_get(obj);
+	return i915_gem_object_mmap(obj, mmo, vma);
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+1 -1
drivers/gpu/drm/i915/gem/i915_gem_mman.h
···
 
 void i915_gem_object_runtime_pm_release_mmap_offset(struct drm_i915_gem_object *obj);
 void i915_gem_object_release_mmap_offset(struct drm_i915_gem_object *obj);
-
+int i915_gem_fb_mmap(struct drm_i915_gem_object *obj, struct vm_area_struct *vma);
 #endif
+59 -1
drivers/gpu/drm/i915/gem/i915_gem_object.c
···
 
 static const struct drm_gem_object_funcs i915_gem_object_funcs;
 
+unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
+				    enum i915_cache_level level)
+{
+	if (drm_WARN_ON(&i915->drm, level >= I915_MAX_CACHE_LEVEL))
+		return 0;
+
+	return INTEL_INFO(i915)->cachelevel_to_pat[level];
+}
+
+bool i915_gem_object_has_cache_level(const struct drm_i915_gem_object *obj,
+				     enum i915_cache_level lvl)
+{
+	/*
+	 * In case the pat_index is set by user space, this kernel mode
+	 * driver should leave the coherency to be managed by user space,
+	 * simply return true here.
+	 */
+	if (obj->pat_set_by_user)
+		return true;
+
+	/*
+	 * Otherwise the pat_index should have been converted from cache_level
+	 * so that the following comparison is valid.
+	 */
+	return obj->pat_index == i915_gem_get_pat_index(obj_to_i915(obj), lvl);
+}
+
 struct drm_i915_gem_object *i915_gem_object_alloc(void)
 {
 	struct drm_i915_gem_object *obj;
···
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 
-	obj->cache_level = cache_level;
+	obj->pat_index = i915_gem_get_pat_index(i915, cache_level);
 
 	if (cache_level != I915_CACHE_NONE)
+		obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |
+				       I915_BO_CACHE_COHERENT_FOR_WRITE);
+	else if (HAS_LLC(i915))
+		obj->cache_coherent = I915_BO_CACHE_COHERENT_FOR_READ;
+	else
+		obj->cache_coherent = 0;
+
+	obj->cache_dirty =
+		!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE) &&
+		!IS_DGFX(i915);
+}
+
+/**
+ * i915_gem_object_set_pat_index - set PAT index to be used in PTE encode
+ * @obj: #drm_i915_gem_object
+ * @pat_index: PAT index
+ *
+ * This is a clone of i915_gem_object_set_cache_coherency taking pat index
+ * instead of cache_level as its second argument.
+ */
+void i915_gem_object_set_pat_index(struct drm_i915_gem_object *obj,
+				   unsigned int pat_index)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+
+	if (obj->pat_index == pat_index)
+		return;
+
+	obj->pat_index = pat_index;
+
+	if (pat_index != i915_gem_get_pat_index(i915, I915_CACHE_NONE))
 		obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |
 				       I915_BO_CACHE_COHERENT_FOR_WRITE);
 	else if (HAS_LLC(i915))
+9 -1
drivers/gpu/drm/i915/gem/i915_gem_object.h
···
 
 enum intel_region_id;
 
+#define obj_to_i915(obj__) to_i915((obj__)->base.dev)
+
 static inline bool i915_gem_object_size_2big(u64 size)
 {
 	struct drm_i915_gem_object *obj;
···
 	return false;
 }
 
+unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
+				    enum i915_cache_level level);
+bool i915_gem_object_has_cache_level(const struct drm_i915_gem_object *obj,
+				     enum i915_cache_level lvl);
 void i915_gem_init__objects(struct drm_i915_private *i915);
 
 void i915_objects_module_exit(void);
···
 
 /**
  * i915_gem_object_lookup_rcu - look up a temporary GEM object from its handle
- * @filp: DRM file private date
+ * @file: DRM file private data
  * @handle: userspace handle
  *
  * Returns:
···
 
 void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
 					 unsigned int cache_level);
+void i915_gem_object_set_pat_index(struct drm_i915_gem_object *obj,
+				   unsigned int pat_index);
 bool i915_gem_object_can_bypass_llc(struct drm_i915_gem_object *obj);
 void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj);
 void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj);
+60 -8
drivers/gpu/drm/i915/gem/i915_gem_object_types.h
···
 	 * engine.
 	 */
 	I915_CACHE_WT,
+	/**
+	 * @I915_MAX_CACHE_LEVEL:
+	 *
+	 * Mark the last entry in the enum. Used for defining the
+	 * cachelevel_to_pat array, the cache_level to PAT translation table.
+	 */
+	I915_MAX_CACHE_LEVEL,
 };
 
 enum i915_map_type {
···
 	 */
 #define I915_BO_ALLOC_GPU_ONLY	  BIT(6)
 #define I915_BO_ALLOC_CCS_AUX	  BIT(7)
+/*
+ * Object is allowed to retain its initial data and will not be cleared on first
+ * access if used along with I915_BO_ALLOC_USER. This is mainly to keep
+ * preallocated framebuffer data intact while transitioning it to i915drmfb.
+ */
+#define I915_BO_PREALLOC	  BIT(8)
 #define I915_BO_ALLOC_FLAGS (I915_BO_ALLOC_CONTIGUOUS | \
 			     I915_BO_ALLOC_VOLATILE | \
 			     I915_BO_ALLOC_CPU_CLEAR | \
···
 			     I915_BO_ALLOC_PM_VOLATILE | \
 			     I915_BO_ALLOC_PM_EARLY | \
 			     I915_BO_ALLOC_GPU_ONLY | \
-			     I915_BO_ALLOC_CCS_AUX)
-#define I915_BO_READONLY	  BIT(8)
-#define I915_TILING_QUIRK_BIT	  9 /* unknown swizzling; do not release! */
-#define I915_BO_PROTECTED	  BIT(10)
+			     I915_BO_ALLOC_CCS_AUX | \
+			     I915_BO_PREALLOC)
+#define I915_BO_READONLY	  BIT(9)
+#define I915_TILING_QUIRK_BIT	  10 /* unknown swizzling; do not release! */
+#define I915_BO_PROTECTED	  BIT(11)
 	/**
 	 * @mem_flags - Mutable placement-related flags
 	 *
···
 #define I915_BO_FLAG_STRUCT_PAGE BIT(0) /* Object backed by struct pages */
 #define I915_BO_FLAG_IOMEM       BIT(1) /* Object backed by IO memory */
 	/**
-	 * @cache_level: The desired GTT caching level.
+	 * @pat_index: The desired PAT index.
 	 *
-	 * See enum i915_cache_level for possible values, along with what
-	 * each does.
+	 * See the hardware specification for valid PAT indices on each
+	 * platform. This field replaces the @cache_level field that held a
+	 * value of enum i915_cache_level, since PAT indices are used by both
+	 * userspace and the kernel mode driver for caching policy control
+	 * after GEN12. Platform-specific tables translate i915_cache_level
+	 * into a PAT index; for details see the macros defined in
+	 * i915/i915_pci.c, e.g. PVC_CACHELEVEL.
+	 * For backward compatibility, this field contains values that exactly
+	 * match the entries of enum i915_cache_level for pre-GEN12 platforms
+	 * (see LEGACY_CACHELEVEL), so that the PTE encode functions for these
+	 * legacy platforms can stay the same.
 	 */
-	unsigned int cache_level:3;
+	unsigned int pat_index:6;
+	/**
+	 * @pat_set_by_user: Indicate whether pat_index is set by user space
+	 *
+	 * This field is false by default and only set to true if the
+	 * pat_index is set by user space. By design, user space is capable of
+	 * managing caching behavior by setting pat_index, in which case this
+	 * kernel mode driver should never touch the pat_index.
+	 */
+	unsigned int pat_set_by_user:1;
 	/**
 	 * @cache_coherent:
+	 *
+	 * Note: with the change above which replaced @cache_level with pat_index,
+	 * the use of @cache_coherent is limited to objects created by the kernel
+	 * or by userspace without a PAT index specified.
+	 * Check @pat_set_by_user to find out if an object has its PAT index set
+	 * by userspace. The ioctls to change cache settings have also been
+	 * disabled for objects with a userspace-set PAT index. Please don't
+	 * assume @cache_coherent has the flags set as described here. The helper
+	 * function i915_gem_object_has_cache_level() provides one way to bypass
+	 * the use of this field.
 	 *
 	 * Track whether the pages are coherent with the GPU if reading or
 	 * writing through the CPU caches. This largely depends on the
···
 
 	/**
 	 * @cache_dirty:
+	 *
+	 * Note: with the change above which replaced cache_level with pat_index,
+	 * the use of @cache_dirty is limited to objects created by the kernel
+	 * or by userspace without a PAT index specified.
+	 * Check @pat_set_by_user to find out if an object has its PAT index set
+	 * by userspace. The ioctls to change cache settings have also been
+	 * disabled for objects with a userspace-set PAT index. Please don't
+	 * assume @cache_dirty is set as described here. Also see the helper
+	 * function i915_gem_object_has_cache_level() for possible ways to bypass
+	 * the use of this field.
 	 *
 	 * Track if we are dirty with writes through the CPU cache for this
 	 * object. As a result reading directly from main memory might yield
+4 -1
drivers/gpu/drm/i915/gem/i915_gem_pages.c
···
 			   struct drm_i915_gem_object *obj,
 			   bool always_coherent)
 {
-	if (i915_gem_object_is_lmem(obj))
+	/*
+	 * Wa_22016122933: always return I915_MAP_WC for MTL
+	 */
+	if (i915_gem_object_is_lmem(obj) || IS_METEORLAKE(i915))
 		return I915_MAP_WC;
 	if (HAS_LLC(i915) || always_coherent)
 		return I915_MAP_WB;
+1 -3
drivers/gpu/drm/i915/gem/i915_gem_region.h
···
  */
 struct i915_gem_apply_to_region_ops {
 	/**
-	 * process_obj - Process the current object
-	 * @apply: Embed this for private data.
-	 * @obj: The current object.
+	 * @process_obj: Process the current object
 	 *
 	 * Note that if this function is part of a ww transaction, and
 	 * if returns -EDEADLK for one of the objects, it may be
+8 -1
drivers/gpu/drm/i915/gem/i915_gem_shmem.c
···
 	obj->write_domain = I915_GEM_DOMAIN_CPU;
 	obj->read_domains = I915_GEM_DOMAIN_CPU;
 
-	if (HAS_LLC(i915))
+	/*
+	 * MTL doesn't snoop CPU cache by default for GPU access (namely
+	 * 1-way coherency). However some UMD's are currently depending on
+	 * that. Make 1-way coherent the default setting for MTL. A follow
+	 * up patch will extend the GEM_CREATE uAPI to allow UMD's specify
+	 * caching mode at BO creation time
+	 */
+	if (HAS_LLC(i915) || (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)))
 		/* On some devices, we can have the GPU use the LLC (the CPU
 		 * cache) for about a 10% performance improvement
 		 * compared to uncached. Graphics requests other than
-2
drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
···
 	fs_reclaim_release(GFP_KERNEL);
 }
 
-#define obj_to_i915(obj__) to_i915((obj__)->base.dev)
-
 /**
  * i915_gem_object_make_unshrinkable - Hide the object from the shrinker. By
  * default all object types that support shrinking(see IS_SHRINKABLE), will also
+11 -1
drivers/gpu/drm/i915/gem/i915_gem_stolen.c
···
 	/* Basic memrange allocator for stolen space. */
 	drm_mm_init(&i915->mm.stolen, 0, i915->dsm.usable_size);
 
+	/*
+	 * Access to stolen lmem beyond certain size for MTL A0 stepping
+	 * would crash the machine. Disable stolen lmem for userspace access
+	 * by setting usable_size to zero.
+	 */
+	if (IS_METEORLAKE(i915) && INTEL_REVID(i915) == 0x0)
+		i915->dsm.usable_size = 0;
+
 	return 0;
 }
···
 
 	ggtt->vm.insert_page(&ggtt->vm, addr,
 			     ggtt->error_capture.start,
-			     I915_CACHE_NONE, 0);
+			     i915_gem_get_pat_index(ggtt->vm.i915,
+						    I915_CACHE_NONE),
+			     0);
 	mb();
 
 	s = io_mapping_map_wc(&ggtt->iomap,
+2 -1
drivers/gpu/drm/i915/gem/i915_gem_ttm.h
···
 /**
  * i915_ttm_to_gem - Convert a struct ttm_buffer_object to an embedding
  * struct drm_i915_gem_object.
+ * @bo: Pointer to the ttm buffer object
  *
- * Return: Pointer to the embedding struct ttm_buffer_object.
+ * Return: Pointer to the embedding struct drm_i915_gem_object.
  */
 static inline struct drm_i915_gem_object *
 i915_ttm_to_gem(struct ttm_buffer_object *bo)
+8 -5
drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
···
 
 		intel_engine_pm_get(to_gt(i915)->migrate.context->engine);
 		ret = intel_context_migrate_clear(to_gt(i915)->migrate.context, deps,
-						  dst_st->sgl, dst_level,
+						  dst_st->sgl,
+						  i915_gem_get_pat_index(i915, dst_level),
 						  i915_ttm_gtt_binds_lmem(dst_mem),
 						  0, &rq);
 	} else {
···
 		intel_engine_pm_get(to_gt(i915)->migrate.context->engine);
 		ret = intel_context_migrate_copy(to_gt(i915)->migrate.context,
 						 deps, src_rsgt->table.sgl,
-						 src_level,
+						 i915_gem_get_pat_index(i915, src_level),
 						 i915_ttm_gtt_binds_lmem(bo->resource),
-						 dst_st->sgl, dst_level,
+						 dst_st->sgl,
+						 i915_gem_get_pat_index(i915, dst_level),
 						 i915_ttm_gtt_binds_lmem(dst_mem),
 						 &rq);
 
···
 	struct dma_fence *migration_fence = NULL;
 	struct ttm_tt *ttm = bo->ttm;
 	struct i915_refct_sgt *dst_rsgt;
-	bool clear;
+	bool clear, prealloc_bo;
 	int ret;
 
 	if (GEM_WARN_ON(i915_ttm_is_ghost_object(bo))) {
···
 		return PTR_ERR(dst_rsgt);
 
 	clear = !i915_ttm_cpu_maps_iomem(bo->resource) && (!ttm || !ttm_tt_is_populated(ttm));
-	if (!(clear && ttm && !(ttm->page_flags & TTM_TT_FLAG_ZERO_ALLOC))) {
+	prealloc_bo = obj->flags & I915_BO_PREALLOC;
+	if (!(clear && ttm && !((ttm->page_flags & TTM_TT_FLAG_ZERO_ALLOC) && !prealloc_bo))) {
 		struct i915_deps deps;
 
 		i915_deps_init(&deps, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
+82 -26
drivers/gpu/drm/i915/gem/selftests/huge_pages.c
···
 
 	obj->write_domain = I915_GEM_DOMAIN_CPU;
 	obj->read_domains = I915_GEM_DOMAIN_CPU;
-	obj->cache_level = I915_CACHE_NONE;
+	obj->pat_index = i915_gem_get_pat_index(i915, I915_CACHE_NONE);
 
 	return obj;
 }
···
 	return err;
 }
 
-static void close_object_list(struct list_head *objects,
-			      struct i915_ppgtt *ppgtt)
+static void close_object_list(struct list_head *objects)
 {
 	struct drm_i915_gem_object *obj, *on;
 
···
 	}
 }
 
-static int igt_mock_ppgtt_huge_fill(void *arg)
+static int igt_ppgtt_huge_fill(void *arg)
 {
-	struct i915_ppgtt *ppgtt = arg;
-	struct drm_i915_private *i915 = ppgtt->vm.i915;
-	unsigned long max_pages = ppgtt->vm.total >> PAGE_SHIFT;
+	struct drm_i915_private *i915 = arg;
+	unsigned int supported = RUNTIME_INFO(i915)->page_sizes;
+	bool has_pte64 = GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50);
+	struct i915_address_space *vm;
+	struct i915_gem_context *ctx;
+	unsigned long max_pages;
 	unsigned long page_num;
+	struct file *file;
 	bool single = false;
 	LIST_HEAD(objects);
 	IGT_TIMEOUT(end_time);
 	int err = -ENODEV;
+
+	if (supported == I915_GTT_PAGE_SIZE_4K)
+		return 0;
+
+	file = mock_file(i915);
+	if (IS_ERR(file))
+		return PTR_ERR(file);
+
+	ctx = hugepage_ctx(i915, file);
+	if (IS_ERR(ctx)) {
+		err = PTR_ERR(ctx);
+		goto out;
+	}
+	vm = i915_gem_context_get_eb_vm(ctx);
+	max_pages = vm->total >> PAGE_SHIFT;
 
 	for_each_prime_number_from(page_num, 1, max_pages) {
 		struct drm_i915_gem_object *obj;
···
 
 		list_add(&obj->st_link, &objects);
 
-		vma = i915_vma_instance(obj, &ppgtt->vm, NULL);
+		vma = i915_vma_instance(obj, vm, NULL);
 		if (IS_ERR(vma)) {
 			err = PTR_ERR(vma);
 			break;
 		}
 
-		err = i915_vma_pin(vma, 0, 0, PIN_USER);
+		/* vma start must be aligned to BIT(21) to allow 2M PTEs */
+		err = i915_vma_pin(vma, 0, BIT(21), PIN_USER);
 		if (err)
 			break;
 
···
 		GEM_BUG_ON(!expected_gtt);
 		GEM_BUG_ON(size);
 
-		if (expected_gtt & I915_GTT_PAGE_SIZE_4K)
+		if (!has_pte64 && (obj->base.size < I915_GTT_PAGE_SIZE_2M ||
+				   expected_gtt & I915_GTT_PAGE_SIZE_2M))
 			expected_gtt &= ~I915_GTT_PAGE_SIZE_64K;
 
 		i915_vma_unpin(vma);
 
-		if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K) {
+		if (!has_pte64 && vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K) {
 			if (!IS_ALIGNED(vma->node.start,
 					I915_GTT_PAGE_SIZE_2M)) {
 				pr_err("node.start(%llx) not aligned to 2M\n",
···
 		}
 
 		if (vma->resource->page_sizes_gtt != expected_gtt) {
-			pr_err("gtt=%u, expected=%u, size=%zd, single=%s\n",
+			pr_err("gtt=%#x, expected=%#x, size=0x%zx, single=%s\n",
 			       vma->resource->page_sizes_gtt, expected_gtt,
 			       obj->base.size, str_yes_no(!!single));
 			err = -EINVAL;
···
 		single = !single;
 	}
 
-	close_object_list(&objects, ppgtt);
+	close_object_list(&objects);
 
 	if (err == -ENOMEM || err == -ENOSPC)
 		err = 0;
 
+	i915_vm_put(vm);
+out:
+	fput(file);
 	return err;
 }
 
-static int igt_mock_ppgtt_64K(void *arg)
+static int igt_ppgtt_64K(void *arg)
 {
-	struct i915_ppgtt *ppgtt = arg;
-	struct drm_i915_private *i915 = ppgtt->vm.i915;
+	struct drm_i915_private *i915 = arg;
+	bool has_pte64 = GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50);
 	struct drm_i915_gem_object *obj;
+	struct i915_address_space *vm;
+	struct i915_gem_context *ctx;
+	struct file *file;
 	const struct object_info {
 		unsigned int size;
 		unsigned int gtt;
···
 	if (!HAS_PAGE_SIZES(i915, I915_GTT_PAGE_SIZE_64K))
 		return 0;
 
+	file = mock_file(i915);
+	if (IS_ERR(file))
+		return PTR_ERR(file);
+
+	ctx = hugepage_ctx(i915, file);
+	if (IS_ERR(ctx)) {
+		err = PTR_ERR(ctx);
+		goto out;
+	}
+	vm = i915_gem_context_get_eb_vm(ctx);
+
 	for (i = 0; i < ARRAY_SIZE(objects); ++i) {
 		unsigned int size = objects[i].size;
 		unsigned int expected_gtt = objects[i].gtt;
 		unsigned int offset = objects[i].offset;
 		unsigned int flags = PIN_USER;
 
+		/*
+		 * For modern GTT models, the requirements for marking a page-table
+		 * as 64K have been relaxed. Account for this.
+		 */
+		if (has_pte64) {
+			expected_gtt = 0;
+			if (size >= SZ_64K)
+				expected_gtt |= I915_GTT_PAGE_SIZE_64K;
+			if (size & (SZ_64K - 1))
+				expected_gtt |= I915_GTT_PAGE_SIZE_4K;
+		}
+
 		for (single = 0; single <= 1; single++) {
 			obj = fake_huge_pages_object(i915, size, !!single);
-			if (IS_ERR(obj))
-				return PTR_ERR(obj);
+			if (IS_ERR(obj)) {
+				err = PTR_ERR(obj);
+				goto out_vm;
+			}
 
 			err = i915_gem_object_pin_pages_unlocked(obj);
 			if (err)
···
 			 */
 			obj->mm.page_sizes.sg &= ~I915_GTT_PAGE_SIZE_2M;
 
-			vma = i915_vma_instance(obj, &ppgtt->vm, NULL);
+			vma = i915_vma_instance(obj, vm, NULL);
 			if (IS_ERR(vma)) {
 				err = PTR_ERR(vma);
 				goto out_object_unpin;
···
 			if (err)
 				goto out_vma_unpin;
 
-			if (!offset && vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K) {
+			if (!has_pte64 && !offset &&
+			    vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K) {
 				if (!IS_ALIGNED(vma->node.start,
 						I915_GTT_PAGE_SIZE_2M)) {
 					pr_err("node.start(%llx) not aligned to 2M\n",
···
 			}
 
 			if (vma->resource->page_sizes_gtt != expected_gtt) {
-				pr_err("gtt=%u, expected=%u, i=%d, single=%s\n",
+				pr_err("gtt=%#x, expected=%#x, i=%d, single=%s offset=%#x size=%#x\n",
 				       vma->resource->page_sizes_gtt,
-				       expected_gtt, i, str_yes_no(!!single));
+				       expected_gtt, i, str_yes_no(!!single),
+				       offset, size);
 				err = -EINVAL;
 				goto out_vma_unpin;
 			}
···
 		}
 	}
 
-	return 0;
+	goto out_vm;
 
 out_vma_unpin:
 	i915_vma_unpin(vma);
···
 	i915_gem_object_unlock(obj);
 out_object_put:
 	i915_gem_object_put(obj);
-
+out_vm:
+	i915_vm_put(vm);
+out:
+	fput(file);
 	return err;
 }
 
···
 		SUBTEST(igt_mock_exhaust_device_supported_pages),
 		SUBTEST(igt_mock_memory_region_huge_pages),
 		SUBTEST(igt_mock_ppgtt_misaligned_dma),
-		SUBTEST(igt_mock_ppgtt_huge_fill),
-		SUBTEST(igt_mock_ppgtt_64K),
 	};
 	struct drm_i915_private *dev_priv;
 	struct i915_ppgtt *ppgtt;
···
 		SUBTEST(igt_ppgtt_sanity_check),
 		SUBTEST(igt_ppgtt_compact),
 		SUBTEST(igt_ppgtt_mixed),
+		SUBTEST(igt_ppgtt_huge_fill),
+		SUBTEST(igt_ppgtt_64K),
 	};
 
 	if (!HAS_PPGTT(i915)) {
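With the relaxed 64K accounting that igt_ppgtt_64K now applies on has_pte64 platforms, the expected page-size mask becomes a pure function of the object size. A standalone sketch of that branch — the flag values below are stand-ins for the I915_GTT_PAGE_SIZE_* bits, not taken from the headers:

```c
#include <assert.h>

#define SZ_4K  0x1000u
#define SZ_64K 0x10000u

/* Stand-ins for the I915_GTT_PAGE_SIZE_* flag bits. */
#define GTT_PAGE_SIZE_4K  SZ_4K
#define GTT_PAGE_SIZE_64K SZ_64K

/*
 * Mirror of the has_pte64 branch in igt_ppgtt_64K: whole 64K chunks are
 * expected to be mapped with 64K PTEs, any sub-64K remainder falls back
 * to 4K PTEs.
 */
static unsigned int expected_gtt_mask(unsigned int size)
{
	unsigned int expected_gtt = 0;

	if (size >= SZ_64K)
		expected_gtt |= GTT_PAGE_SIZE_64K;
	if (size & (SZ_64K - 1))
		expected_gtt |= GTT_PAGE_SIZE_4K;
	return expected_gtt;
}
```

So a 68K object is expected to use both page sizes at once, which the old per-object expected_gtt tables could not express.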
+10 -8
drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
···
 		ctx[n] = live_context(i915, file);
 		if (IS_ERR(ctx[n])) {
 			err = PTR_ERR(ctx[n]);
-			goto out_file;
+			goto out_ctx;
 		}
 	}
 
···
 			this = igt_request_alloc(ctx[n], engine);
 			if (IS_ERR(this)) {
 				err = PTR_ERR(this);
-				goto out_file;
+				goto out_ctx;
 			}
 			if (rq) {
 				i915_request_await_dma_fence(this, &rq->fence);
···
 		}
 		if (i915_request_wait(rq, 0, 10 * HZ) < 0) {
 			pr_err("Failed to populated %d contexts\n", nctx);
-			intel_gt_set_wedged(to_gt(i915));
+			intel_gt_set_wedged(engine->gt);
 			i915_request_put(rq);
 			err = -EIO;
-			goto out_file;
+			goto out_ctx;
 		}
 		i915_request_put(rq);
 
···
 
 		err = igt_live_test_begin(&t, i915, __func__, engine->name);
 		if (err)
-			goto out_file;
+			goto out_ctx;
 
 		end_time = jiffies + i915_selftest.timeout_jiffies;
 		for_each_prime_number_from(prime, 2, 8192) {
···
 				this = igt_request_alloc(ctx[n % nctx], engine);
 				if (IS_ERR(this)) {
 					err = PTR_ERR(this);
-					goto out_file;
+					goto out_ctx;
 				}
 
 				if (rq) { /* Force submission order */
···
 			if (i915_request_wait(rq, 0, HZ / 5) < 0) {
 				pr_err("Switching between %ld contexts timed out\n",
 				       prime);
-				intel_gt_set_wedged(to_gt(i915));
+				intel_gt_set_wedged(engine->gt);
 				i915_request_put(rq);
 				break;
 			}
···
 
 		err = igt_live_test_end(&t);
 		if (err)
-			goto out_file;
+			goto out_ctx;
 
 		pr_info("Switch latencies on %s: 1 = %lluns, %lu = %lluns\n",
 			engine->name,
···
 			prime - 1, div64_u64(ktime_to_ns(times[1]), prime - 1));
 	}
 
+out_ctx:
+	kfree(ctx);
 out_file:
 	fput(file);
 	return err;
+1 -1
drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
···
 			continue;
 
 		err = intel_migrate_clear(&gt->migrate, &ww, deps,
-					  obj->mm.pages->sgl, obj->cache_level,
+					  obj->mm.pages->sgl, obj->pat_index,
 					  i915_gem_object_is_lmem(obj),
 					  0xdeadbeaf, &rq);
 		if (rq) {
+1 -1
drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
···
 	}
 
 	err = intel_context_migrate_clear(to_gt(i915)->migrate.context, NULL,
-					  obj->mm.pages->sgl, obj->cache_level,
+					  obj->mm.pages->sgl, obj->pat_index,
 					  i915_gem_object_is_lmem(obj),
 					  expand32(POISON_INUSE), &rq);
 	i915_gem_object_unpin_pages(obj);
+6 -4
drivers/gpu/drm/i915/gt/gen6_ppgtt.c
···
 
 static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 				      struct i915_vma_resource *vma_res,
-				      enum i915_cache_level cache_level,
+				      unsigned int pat_index,
 				      u32 flags)
 {
 	struct i915_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
···
 	unsigned int first_entry = vma_res->start / I915_GTT_PAGE_SIZE;
 	unsigned int act_pt = first_entry / GEN6_PTES;
 	unsigned int act_pte = first_entry % GEN6_PTES;
-	const u32 pte_encode = vm->pte_encode(0, cache_level, flags);
+	const u32 pte_encode = vm->pte_encode(0, pat_index, flags);
 	struct sgt_dma iter = sgt_dma(vma_res);
 	gen6_pte_t *vaddr;
 
···
 
 	vm->scratch[0]->encode =
 		vm->pte_encode(px_dma(vm->scratch[0]),
-			       I915_CACHE_NONE, PTE_READ_ONLY);
+			       i915_gem_get_pat_index(vm->i915,
+						      I915_CACHE_NONE),
+			       PTE_READ_ONLY);
 
 	vm->scratch[1] = vm->alloc_pt_dma(vm, I915_GTT_PAGE_SIZE_4K);
 	if (IS_ERR(vm->scratch[1])) {
···
 static void pd_vma_bind(struct i915_address_space *vm,
 			struct i915_vm_pt_stash *stash,
 			struct i915_vma_resource *vma_res,
-			enum i915_cache_level cache_level,
+			unsigned int pat_index,
 			u32 unused)
 {
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
+61 -23
drivers/gpu/drm/i915/gt/gen8_ppgtt.c
···
 }
 
 static u64 gen8_pte_encode(dma_addr_t addr,
-			   enum i915_cache_level level,
+			   unsigned int pat_index,
 			   u32 flags)
 {
 	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
···
 	if (flags & PTE_LM)
 		pte |= GEN12_PPGTT_PTE_LM;
 
-	switch (level) {
+	/*
+	 * For pre-gen12 platforms pat_index is the same as enum
+	 * i915_cache_level, so the switch-case here is still valid.
+	 * See translation table defined by LEGACY_CACHELEVEL.
+	 */
+	switch (pat_index) {
 	case I915_CACHE_NONE:
 		pte |= PPAT_UNCACHED;
 		break;
···
 		pte |= PPAT_CACHED;
 		break;
 	}
+
+	return pte;
+}
+
+static u64 gen12_pte_encode(dma_addr_t addr,
+			    unsigned int pat_index,
+			    u32 flags)
+{
+	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
+
+	if (unlikely(flags & PTE_READ_ONLY))
+		pte &= ~GEN8_PAGE_RW;
+
+	if (flags & PTE_LM)
+		pte |= GEN12_PPGTT_PTE_LM;
+
+	if (pat_index & BIT(0))
+		pte |= GEN12_PPGTT_PTE_PAT0;
+
+	if (pat_index & BIT(1))
+		pte |= GEN12_PPGTT_PTE_PAT1;
+
+	if (pat_index & BIT(2))
+		pte |= GEN12_PPGTT_PTE_PAT2;
+
+	if (pat_index & BIT(3))
+		pte |= MTL_PPGTT_PTE_PAT3;
 
 	return pte;
 }
···
 			      struct i915_page_directory *pdp,
 			      struct sgt_dma *iter,
 			      u64 idx,
-			      enum i915_cache_level cache_level,
+			      unsigned int pat_index,
 			      u32 flags)
 {
 	struct i915_page_directory *pd;
-	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
+	const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, pat_index, flags);
 	gen8_pte_t *vaddr;
 
 	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
···
 xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm,
 			  struct i915_vma_resource *vma_res,
 			  struct sgt_dma *iter,
-			  enum i915_cache_level cache_level,
+			  unsigned int pat_index,
 			  u32 flags)
 {
-	const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
+	const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
 	unsigned int rem = sg_dma_len(iter->sg);
 	u64 start = vma_res->start;
 	u64 end = start + vma_res->vma_size;
···
 			}
 		} while (rem >= page_size && index < max);
 
+		drm_clflush_virt_range(vaddr, PAGE_SIZE);
 		vma_res->page_sizes_gtt |= page_size;
 	} while (iter->sg && sg_dma_len(iter->sg));
 }
···
 static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
 				   struct i915_vma_resource *vma_res,
 				   struct sgt_dma *iter,
-				   enum i915_cache_level cache_level,
+				   unsigned int pat_index,
 				   u32 flags)
 {
-	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
+	const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
 	unsigned int rem = sg_dma_len(iter->sg);
 	u64 start = vma_res->start;
 
···
 
 static void gen8_ppgtt_insert(struct i915_address_space *vm,
 			      struct i915_vma_resource *vma_res,
-			      enum i915_cache_level cache_level,
+			      unsigned int pat_index,
 			      u32 flags)
 {
 	struct i915_ppgtt * const ppgtt = i915_vm_to_ppgtt(vm);
 	struct sgt_dma iter = sgt_dma(vma_res);
 
 	if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) {
-		if (HAS_64K_PAGES(vm->i915))
-			xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags);
+		if (GRAPHICS_VER_FULL(vm->i915) >= IP_VER(12, 50))
+			xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, pat_index, flags);
 		else
-			gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags);
+			gen8_ppgtt_insert_huge(vm, vma_res, &iter, pat_index, flags);
 	} else {
 		u64 idx = vma_res->start >> GEN8_PTE_SHIFT;
 
···
 				gen8_pdp_for_page_index(vm, idx);
 
 			idx = gen8_ppgtt_insert_pte(ppgtt, pdp, &iter, idx,
-						    cache_level, flags);
+						    pat_index, flags);
 		} while (idx);
 
 		vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
···
 static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
 				    dma_addr_t addr,
 				    u64 offset,
-				    enum i915_cache_level level,
+				    unsigned int pat_index,
 				    u32 flags)
 {
 	u64 idx = offset >> GEN8_PTE_SHIFT;
···
 	GEM_BUG_ON(pt->is_compact);
 
 	vaddr = px_vaddr(pt);
-	vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
+	vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, pat_index, flags);
 	drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
 }
 
 static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
 					    dma_addr_t addr,
 					    u64 offset,
-					    enum i915_cache_level level,
+					    unsigned int pat_index,
 					    u32 flags)
 {
 	u64 idx = offset >> GEN8_PTE_SHIFT;
···
 	}
 
 	vaddr = px_vaddr(pt);
-	vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, flags);
+	vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, pat_index, flags);
 }
 
 static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
 				       dma_addr_t addr,
 				       u64 offset,
-				       enum i915_cache_level level,
+				       unsigned int pat_index,
 				       u32 flags)
 {
 	if (flags & PTE_LM)
 		return __xehpsdv_ppgtt_insert_entry_lm(vm, addr, offset,
-						       level, flags);
+						       pat_index, flags);
 
-	return gen8_ppgtt_insert_entry(vm, addr, offset, level, flags);
+	return gen8_ppgtt_insert_entry(vm, addr, offset, pat_index, flags);
 }
 
 static int gen8_init_scratch(struct i915_address_space *vm)
···
 		pte_flags |= PTE_LM;
 
 	vm->scratch[0]->encode =
-		gen8_pte_encode(px_dma(vm->scratch[0]),
-				I915_CACHE_NONE, pte_flags);
+		vm->pte_encode(px_dma(vm->scratch[0]),
+			       i915_gem_get_pat_index(vm->i915,
+						      I915_CACHE_NONE),
+			       pte_flags);
 
 	for (i = 1; i <= vm->top; i++) {
 		struct drm_i915_gem_object *obj;
···
 	 */
 	ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
 
-	ppgtt->vm.pte_encode = gen8_pte_encode;
+	if (GRAPHICS_VER(gt->i915) >= 12)
+		ppgtt->vm.pte_encode = gen12_pte_encode;
+	else
+		ppgtt->vm.pte_encode = gen8_pte_encode;
 
 	ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
 	ppgtt->vm.insert_entries = gen8_ppgtt_insert;
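The new gen12_pte_encode scatters the 4-bit pat_index across non-contiguous PTE bits instead of using one contiguous field. A minimal sketch of that pattern, with placeholder bit positions (the real GEN12_PPGTT_PTE_PAT0/1/2 and MTL_PPGTT_PTE_PAT3 positions differ):

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder PTE bit positions for the four PAT bits. */
#define PTE_PAT0 (1ull << 3)
#define PTE_PAT1 (1ull << 4)
#define PTE_PAT2 (1ull << 7)
#define PTE_PAT3 (1ull << 62)

/*
 * Mirror of the PAT handling in gen12_pte_encode: each bit of the
 * pat_index is tested individually and mapped to its own PTE bit.
 */
static uint64_t encode_pat(unsigned int pat_index)
{
	uint64_t pte = 0;

	if (pat_index & 1u)
		pte |= PTE_PAT0;
	if (pat_index & 2u)
		pte |= PTE_PAT1;
	if (pat_index & 4u)
		pte |= PTE_PAT2;
	if (pat_index & 8u)
		pte |= PTE_PAT3;
	return pte;
}
```

Because the bits are not adjacent in the PTE, a shift-and-mask encoding is not possible; the per-bit tests above are the straightforward way to express it.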
+1 -2
drivers/gpu/drm/i915/gt/gen8_ppgtt.h
···
 
 struct i915_address_space;
 struct intel_gt;
-enum i915_cache_level;
 
 struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
 				     unsigned long lmem_pt_obj_flags);
 
 u64 gen8_ggtt_pte_encode(dma_addr_t addr,
-			 enum i915_cache_level level,
+			 unsigned int pat_index,
 			 u32 flags);
 
 #endif
+4 -1
drivers/gpu/drm/i915/gt/intel_context.c
···
 	child->parallel.parent = parent;
 }
 
-u64 intel_context_get_total_runtime_ns(const struct intel_context *ce)
+u64 intel_context_get_total_runtime_ns(struct intel_context *ce)
 {
 	u64 total, active;
+
+	if (ce->ops->update_stats)
+		ce->ops->update_stats(ce);
 
 	total = ce->stats.runtime.total;
 	if (ce->ops->flags & COPS_RUNTIME_CYCLES)
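The ->update_stats() hook added here is optional, so intel_context_get_total_runtime_ns() only invokes it for back-ends (such as GuC) that refresh their runtime accounting on demand. The guarded-callback shape can be sketched with illustrative types — the names below are not the driver's:

```c
#include <assert.h>

/* Illustrative stand-ins for intel_context and its ops table. */
struct ctx_stats {
	unsigned long long total_ns;
};

struct ctx;

struct ctx_ops {
	void (*update_stats)(struct ctx *ce);	/* optional, may be NULL */
};

struct ctx {
	const struct ctx_ops *ops;
	struct ctx_stats stats;
};

/* A GuC-like backend refreshes the cached total when asked. */
static void guc_update_stats(struct ctx *ce)
{
	ce->stats.total_ns += 1000;	/* pretend we polled the backend */
}

/* Only backends providing the hook get called; others read the cache. */
static unsigned long long get_total_runtime_ns(struct ctx *ce)
{
	if (ce->ops->update_stats)
		ce->ops->update_stats(ce);
	return ce->stats.total_ns;
}
```

This also explains the signature change above: a hook that refreshes cached stats must be allowed to write into the context, so the `const` qualifier on the parameter had to go.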
+4 -4
drivers/gpu/drm/i915/gt/intel_context.h
···
 
 /**
  * intel_context_lock_pinned - Stablises the 'pinned' status of the HW context
- * @ce - the context
+ * @ce: the context
  *
  * Acquire a lock on the pinned status of the HW context, such that the context
  * can neither be bound to the GPU or unbound whilst the lock is held, i.e.
···
 
 /**
  * intel_context_is_pinned - Reports the 'pinned' status
- * @ce - the context
+ * @ce: the context
  *
  * While in use by the GPU, the context, along with its ring and page
  * tables is pinned into memory and the GTT.
···
 
 /**
  * intel_context_unlock_pinned - Releases the earlier locking of 'pinned' status
- * @ce - the context
+ * @ce: the context
  *
  * Releases the lock earlier acquired by intel_context_unlock_pinned().
  */
···
 	clear_bit(CONTEXT_NOPREEMPT, &ce->flags);
 }
 
-u64 intel_context_get_total_runtime_ns(const struct intel_context *ce);
+u64 intel_context_get_total_runtime_ns(struct intel_context *ce);
 u64 intel_context_get_avg_runtime_ns(struct intel_context *ce);
 
 static inline u64 intel_context_clock(void)
+2
drivers/gpu/drm/i915/gt/intel_context_types.h
···
 
 	void (*sched_disable)(struct intel_context *ce);
 
+	void (*update_stats)(struct intel_context *ce);
+
 	void (*reset)(struct intel_context *ce);
 	void (*destroy)(struct kref *kref);
 
+1 -1
drivers/gpu/drm/i915/gt/intel_engine_cs.c
···
 }
 
 /**
- * intel_engines_cleanup_common - cleans up the engine state created by
+ * intel_engine_cleanup_common - cleans up the engine state created by
  * the common initiailizers.
  * @engine: Engine to cleanup.
  *
+1
drivers/gpu/drm/i915/gt/intel_engine_types.h
···
 	 */
 	u8 csb_head;
 
+	/* private: selftest */
 	I915_SELFTEST_DECLARE(struct st_preempt_hang preempt_hang;)
 };
 
+1 -1
drivers/gpu/drm/i915/gt/intel_engine_user.c
···
 		disabled |= (I915_SCHEDULER_CAP_ENABLED |
 			     I915_SCHEDULER_CAP_PRIORITY);
 
-	if (intel_uc_uses_guc_submission(&to_gt(i915)->uc))
+	if (intel_uc_uses_guc_submission(&engine->gt->uc))
 		enabled |= I915_SCHEDULER_CAP_STATIC_PRIORITY_MAP;
 
 	for (i = 0; i < ARRAY_SIZE(map); i++) {
+59 -25
drivers/gpu/drm/i915/gt/intel_ggtt.c
···
 	}
 }
 
+static u64 mtl_ggtt_pte_encode(dma_addr_t addr,
+			       unsigned int pat_index,
+			       u32 flags)
+{
+	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
+
+	WARN_ON_ONCE(addr & ~GEN12_GGTT_PTE_ADDR_MASK);
+
+	if (flags & PTE_LM)
+		pte |= GEN12_GGTT_PTE_LM;
+
+	if (pat_index & BIT(0))
+		pte |= MTL_GGTT_PTE_PAT0;
+
+	if (pat_index & BIT(1))
+		pte |= MTL_GGTT_PTE_PAT1;
+
+	return pte;
+}
+
 u64 gen8_ggtt_pte_encode(dma_addr_t addr,
-			 enum i915_cache_level level,
+			 unsigned int pat_index,
 			 u32 flags)
 {
 	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;
···
 static void gen8_ggtt_insert_page(struct i915_address_space *vm,
 				  dma_addr_t addr,
 				  u64 offset,
-				  enum i915_cache_level level,
+				  unsigned int pat_index,
 				  u32 flags)
 {
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	gen8_pte_t __iomem *pte =
 		(gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
 
-	gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, flags));
+	gen8_set_pte(pte, ggtt->vm.pte_encode(addr, pat_index, flags));
 
 	ggtt->invalidate(ggtt);
 }
 
 static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 				     struct i915_vma_resource *vma_res,
-				     enum i915_cache_level level,
+				     unsigned int pat_index,
 				     u32 flags)
 {
-	const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, flags);
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
+	const gen8_pte_t pte_encode = ggtt->vm.pte_encode(0, pat_index, flags);
 	gen8_pte_t __iomem *gte;
 	gen8_pte_t __iomem *end;
 	struct sgt_iter iter;
···
 static void gen6_ggtt_insert_page(struct i915_address_space *vm,
 				  dma_addr_t addr,
 				  u64 offset,
-				  enum i915_cache_level level,
+				  unsigned int pat_index,
 				  u32 flags)
 {
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	gen6_pte_t __iomem *pte =
 		(gen6_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
 
-	iowrite32(vm->pte_encode(addr, level, flags), pte);
+	iowrite32(vm->pte_encode(addr, pat_index, flags), pte);
 
 	ggtt->invalidate(ggtt);
 }
···
  */
 static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
 				     struct i915_vma_resource *vma_res,
-				     enum i915_cache_level level,
+				     unsigned int pat_index,
 				     u32 flags)
 {
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
···
 		iowrite32(vm->scratch[0]->encode, gte++);
 	end += (vma_res->node_size + vma_res->guard) / I915_GTT_PAGE_SIZE;
 	for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
-		iowrite32(vm->pte_encode(addr, level, flags), gte++);
+		iowrite32(vm->pte_encode(addr, pat_index, flags), gte++);
 	GEM_BUG_ON(gte > end);
 
 	/* Fill the allocated but "unused" space beyond the end of the buffer */
···
 	struct i915_address_space *vm;
 	dma_addr_t addr;
 	u64 offset;
-	enum i915_cache_level level;
+	unsigned int pat_index;
 };
 
 static int bxt_vtd_ggtt_insert_page__cb(void *_arg)
 {
 	struct insert_page *arg = _arg;
 
-	gen8_ggtt_insert_page(arg->vm, arg->addr, arg->offset, arg->level, 0);
+	gen8_ggtt_insert_page(arg->vm, arg->addr, arg->offset,
+			      arg->pat_index, 0);
 	bxt_vtd_ggtt_wa(arg->vm);
 
 	return 0;
···
 static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
 					  dma_addr_t addr,
 					  u64 offset,
-					  enum i915_cache_level level,
+					  unsigned int pat_index,
 					  u32 unused)
 {
-	struct insert_page arg = { vm, addr, offset, level };
+	struct insert_page arg = { vm, addr, offset, pat_index };
 
 	stop_machine(bxt_vtd_ggtt_insert_page__cb, &arg, NULL);
 }
···
 struct insert_entries {
 	struct i915_address_space *vm;
 	struct i915_vma_resource *vma_res;
-	enum i915_cache_level level;
+	unsigned int pat_index;
 	u32 flags;
 };
 
···
 {
 	struct insert_entries *arg = _arg;
 
-	gen8_ggtt_insert_entries(arg->vm, arg->vma_res, arg->level, arg->flags);
+	gen8_ggtt_insert_entries(arg->vm, arg->vma_res,
+				 arg->pat_index, arg->flags);
 	bxt_vtd_ggtt_wa(arg->vm);
 
 	return 0;
···
 
 static void bxt_vtd_ggtt_insert_entries__BKL(struct i915_address_space *vm,
 					     struct i915_vma_resource *vma_res,
-					     enum i915_cache_level level,
+					     unsigned int pat_index,
 					     u32 flags)
 {
-	struct insert_entries arg = { vm, vma_res, level, flags };
+	struct insert_entries arg = { vm, vma_res, pat_index, flags };
 
 	stop_machine(bxt_vtd_ggtt_insert_entries__cb, &arg, NULL);
 }
···
 void intel_ggtt_bind_vma(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash,
 			 struct i915_vma_resource *vma_res,
-			 enum i915_cache_level cache_level,
+			 unsigned int pat_index,
 			 u32 flags)
 {
 	u32 pte_flags;
···
 	if (vma_res->bi.lmem)
 		pte_flags |= PTE_LM;
 
-	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
+	vm->insert_entries(vm, vma_res, pat_index, pte_flags);
 	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
 }
···
 static void aliasing_gtt_bind_vma(struct i915_address_space *vm,
 				  struct i915_vm_pt_stash *stash,
 				  struct i915_vma_resource *vma_res,
-				  enum i915_cache_level cache_level,
+				  unsigned int pat_index,
 				  u32 flags)
 {
 	u32 pte_flags;
···
 
 	if (flags & I915_VMA_LOCAL_BIND)
 		ppgtt_bind_vma(&i915_vm_to_ggtt(vm)->alias->vm,
-			       stash, vma_res, cache_level, flags);
+			       stash, vma_res, pat_index, flags);
 
 	if (flags & I915_VMA_GLOBAL_BIND)
-		vm->insert_entries(vm, vma_res, cache_level, pte_flags);
+		vm->insert_entries(vm, vma_res, pat_index, pte_flags);
 
 	vma_res->bound_flags |= flags;
 }
···
 
 	ggtt->vm.scratch[0]->encode =
 		ggtt->vm.pte_encode(px_dma(ggtt->vm.scratch[0]),
-				    I915_CACHE_NONE, pte_flags);
+				    i915_gem_get_pat_index(i915,
+							   I915_CACHE_NONE),
+				    pte_flags);
 
 	return 0;
 }
···
 	ggtt->vm.vma_ops.bind_vma = intel_ggtt_bind_vma;
 	ggtt->vm.vma_ops.unbind_vma = intel_ggtt_unbind_vma;
 
-	ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
+	if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
+		ggtt->vm.pte_encode = mtl_ggtt_pte_encode;
+	else
+		ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
 
 	return ggtt_probe_common(ggtt, size);
 }
 
+/*
+ * For pre-gen8 platforms pat_index is the same as enum i915_cache_level,
+ * so these PTE encode functions are left with using cache_level.
+ * See translation table LEGACY_CACHELEVEL.
+ */
 static u64 snb_pte_encode(dma_addr_t addr,
 			  enum i915_cache_level level,
 			  u32 flags)
···
 	 */
 	vma->resource->bound_flags = 0;
 	vma->ops->bind_vma(vm, NULL, vma->resource,
-			   obj ? obj->cache_level : 0,
+			   obj ? obj->pat_index :
+				 i915_gem_get_pat_index(vm->i915,
+							I915_CACHE_NONE),
 			   was_bound);
 
 	if (obj) { /* only used during resume => exclusive access */
+21 -4
drivers/gpu/drm/i915/gt/intel_gt_irq.c
··· 15 15 #include "intel_uncore.h" 16 16 #include "intel_rps.h" 17 17 #include "pxp/intel_pxp_irq.h" 18 + #include "uc/intel_gsc_proxy.h" 18 19 19 20 static void guc_irq_handler(struct intel_guc *guc, u16 iir) 20 21 { ··· 82 81 if (instance == OTHER_GSC_INSTANCE) 83 82 return intel_gsc_irq_handler(gt, iir); 84 83 84 + if (instance == OTHER_GSC_HECI_2_INSTANCE) 85 + return intel_gsc_proxy_irq_handler(&gt->uc.gsc, iir); 86 + 85 87 WARN_ONCE(1, "unhandled other interrupt instance=0x%x, iir=0x%x\n", 86 88 instance, iir); 87 89 } ··· 104 100 case VIDEO_ENHANCEMENT_CLASS: 105 101 return media_gt; 106 102 case OTHER_CLASS: 107 - if (instance == OTHER_GSC_INSTANCE && HAS_ENGINE(media_gt, GSC0)) 103 + if (instance == OTHER_GSC_HECI_2_INSTANCE) 104 + return media_gt; 105 + if ((instance == OTHER_GSC_INSTANCE || instance == OTHER_KCR_INSTANCE) && 106 + HAS_ENGINE(media_gt, GSC0)) 108 107 return media_gt; 109 108 fallthrough; 110 109 default: ··· 263 256 u32 irqs = GT_RENDER_USER_INTERRUPT; 264 257 u32 guc_mask = intel_uc_wants_guc(&gt->uc) ? GUC_INTR_GUC2HOST : 0; 265 258 u32 gsc_mask = 0; 259 + u32 heci_mask = 0; 266 260 u32 dmask; 267 261 u32 smask; 268 262 ··· 275 267 dmask = irqs << 16 | irqs; 276 268 smask = irqs << 16; 277 269 278 - if (HAS_ENGINE(gt, GSC0)) 270 + if (HAS_ENGINE(gt, GSC0)) { 271 + /* 272 + * the heci2 interrupt is enabled via the same register as the 273 + * GSC interrupt, but it has its own mask register. 
274 + */ 279 275 gsc_mask = irqs; 280 - else if (HAS_HECI_GSC(gt->i915)) 276 + heci_mask = GSC_IRQ_INTF(1); /* HECI2 IRQ for SW Proxy*/ 277 + } else if (HAS_HECI_GSC(gt->i915)) { 281 278 gsc_mask = GSC_IRQ_INTF(0) | GSC_IRQ_INTF(1); 279 + } 282 280 283 281 BUILD_BUG_ON(irqs & 0xffff0000); 284 282 ··· 294 280 if (CCS_MASK(gt)) 295 281 intel_uncore_write(uncore, GEN12_CCS_RSVD_INTR_ENABLE, smask); 296 282 if (gsc_mask) 297 - intel_uncore_write(uncore, GEN11_GUNIT_CSME_INTR_ENABLE, gsc_mask); 283 + intel_uncore_write(uncore, GEN11_GUNIT_CSME_INTR_ENABLE, gsc_mask | heci_mask); 298 284 299 285 /* Unmask irqs on RCS, BCS, VCS and VECS engines. */ 300 286 intel_uncore_write(uncore, GEN11_RCS0_RSVD_INTR_MASK, ~smask); ··· 322 308 intel_uncore_write(uncore, GEN12_CCS2_CCS3_INTR_MASK, ~dmask); 323 309 if (gsc_mask) 324 310 intel_uncore_write(uncore, GEN11_GUNIT_CSME_INTR_MASK, ~gsc_mask); 311 + if (heci_mask) 312 + intel_uncore_write(uncore, GEN12_HECI2_RSVD_INTR_MASK, 313 + ~REG_FIELD_PREP(ENGINE1_MASK, heci_mask)); 325 314 326 315 if (guc_mask) { 327 316 /* the enable bit is common for both GTs but the masks are separate */
+2 -2
drivers/gpu/drm/i915/gt/intel_gt_pm.c
···
 
 	intel_rc6_unpark(&gt->rc6);
 	intel_rps_unpark(&gt->rps);
-	i915_pmu_gt_unparked(i915);
+	i915_pmu_gt_unparked(gt);
 	intel_guc_busyness_unpark(gt);
 
 	intel_gt_unpark_requests(gt);
···
 
 	intel_guc_busyness_park(gt);
 	i915_vma_parked(gt);
-	i915_pmu_gt_parked(i915);
+	i915_pmu_gt_parked(gt);
 	intel_rps_park(&gt->rps);
 	intel_rc6_park(&gt->rc6);
 
+4 -1
drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c
···
 {
 	struct intel_gt *gt = data;
 
-	return HAS_RPS(gt->i915);
+	if (intel_guc_slpc_is_used(&gt->uc.guc))
+		return false;
+	else
+		return HAS_RPS(gt->i915);
 }
 
 DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(rps_boost);
+13 -1
drivers/gpu/drm/i915/gt/intel_gt_regs.h
··· 356 356 #define GEN7_TLB_RD_ADDR _MMIO(0x4700) 357 357 358 358 #define GEN12_PAT_INDEX(index) _MMIO(0x4800 + (index) * 4) 359 - #define XEHP_PAT_INDEX(index) MCR_REG(0x4800 + (index) * 4) 359 + #define _PAT_INDEX(index) _PICK_EVEN_2RANGES(index, 8, \ 360 + 0x4800, 0x4804, \ 361 + 0x4848, 0x484c) 362 + #define XEHP_PAT_INDEX(index) MCR_REG(_PAT_INDEX(index)) 363 + #define XELPMP_PAT_INDEX(index) _MMIO(_PAT_INDEX(index)) 360 364 361 365 #define XEHP_TILE0_ADDR_RANGE MCR_REG(0x4900) 362 366 #define XEHP_TILE_LMEM_RANGE_SHIFT 8 ··· 528 524 #define POLYGON_TRIFAN_LINELOOP_DISABLE REG_BIT(4) 529 525 530 526 #define GEN8_RC6_CTX_INFO _MMIO(0x8504) 527 + 528 + #define GEN12_SQCNT1 _MMIO(0x8718) 529 + #define GEN12_SQCNT1_PMON_ENABLE REG_BIT(30) 530 + #define GEN12_SQCNT1_OABPC REG_BIT(29) 531 + #define GEN12_STRICT_RAR_ENABLE REG_BIT(23) 531 532 532 533 #define XEHP_SQCM MCR_REG(0x8724) 533 534 #define EN_32B_ACCESS REG_BIT(30) ··· 1596 1587 1597 1588 #define GEN11_GT_INTR_DW(x) _MMIO(0x190018 + ((x) * 4)) 1598 1589 #define GEN11_CSME (31) 1590 + #define GEN12_HECI_2 (30) 1599 1591 #define GEN11_GUNIT (28) 1600 1592 #define GEN11_GUC (25) 1601 1593 #define MTL_MGUC (24) ··· 1638 1628 /* irq instances for OTHER_CLASS */ 1639 1629 #define OTHER_GUC_INSTANCE 0 1640 1630 #define OTHER_GTPM_INSTANCE 1 1631 + #define OTHER_GSC_HECI_2_INSTANCE 3 1641 1632 #define OTHER_KCR_INSTANCE 4 1642 1633 #define OTHER_GSC_INSTANCE 6 1643 1634 #define OTHER_MEDIA_GUC_INSTANCE 16 ··· 1654 1643 #define GEN12_VCS6_VCS7_INTR_MASK _MMIO(0x1900b4) 1655 1644 #define GEN11_VECS0_VECS1_INTR_MASK _MMIO(0x1900d0) 1656 1645 #define GEN12_VECS2_VECS3_INTR_MASK _MMIO(0x1900d4) 1646 + #define GEN12_HECI2_RSVD_INTR_MASK _MMIO(0x1900e4) 1657 1647 #define GEN11_GUC_SG_INTR_MASK _MMIO(0x1900e8) 1658 1648 #define MTL_GUC_MGUC_INTR_MASK _MMIO(0x1900e8) /* MTL+ */ 1659 1649 #define GEN11_GPM_WGBOXPERF_INTR_MASK _MMIO(0x1900ec)
+35
drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.c
··· 451 451 return sysfs_emit(buff, "%u\n", preq); 452 452 } 453 453 454 + static ssize_t slpc_ignore_eff_freq_show(struct kobject *kobj, 455 + struct kobj_attribute *attr, 456 + char *buff) 457 + { 458 + struct intel_gt *gt = intel_gt_sysfs_get_drvdata(kobj, attr->attr.name); 459 + struct intel_guc_slpc *slpc = &gt->uc.guc.slpc; 460 + 461 + return sysfs_emit(buff, "%u\n", slpc->ignore_eff_freq); 462 + } 463 + 464 + static ssize_t slpc_ignore_eff_freq_store(struct kobject *kobj, 465 + struct kobj_attribute *attr, 466 + const char *buff, size_t count) 467 + { 468 + struct intel_gt *gt = intel_gt_sysfs_get_drvdata(kobj, attr->attr.name); 469 + struct intel_guc_slpc *slpc = &gt->uc.guc.slpc; 470 + int err; 471 + u32 val; 472 + 473 + err = kstrtou32(buff, 0, &val); 474 + if (err) 475 + return err; 476 + 477 + err = intel_guc_slpc_set_ignore_eff_freq(slpc, val); 478 + return err ?: count; 479 + } 480 + 454 481 struct intel_gt_bool_throttle_attr { 455 482 struct attribute attr; 456 483 ssize_t (*show)(struct kobject *kobj, struct kobj_attribute *attr, ··· 690 663 INTEL_GT_ATTR_RO(media_RP0_freq_mhz); 691 664 INTEL_GT_ATTR_RO(media_RPn_freq_mhz); 692 665 666 + INTEL_GT_ATTR_RW(slpc_ignore_eff_freq); 667 + 693 668 static const struct attribute *media_perf_power_attrs[] = { 694 669 &attr_media_freq_factor.attr, 695 670 &attr_media_freq_factor_scale.attr, ··· 772 743 ret = sysfs_create_file(kobj, &attr_punit_req_freq_mhz.attr); 773 744 if (ret) 774 745 gt_warn(gt, "failed to create punit_req_freq_mhz sysfs (%pe)", ERR_PTR(ret)); 746 + 747 + if (intel_uc_uses_guc_slpc(&gt->uc)) { 748 + ret = sysfs_create_file(kobj, &attr_slpc_ignore_eff_freq.attr); 749 + if (ret) 750 + gt_warn(gt, "failed to create ignore_eff_freq sysfs (%pe)", ERR_PTR(ret)); 751 + } 775 752 776 753 if (i915_mmio_reg_valid(intel_gt_perf_limit_reasons_reg(gt))) { 777 754 ret = sysfs_create_files(kobj, throttle_reason_attrs);
+46 -1
drivers/gpu/drm/i915/gt/intel_gtt.c
··· 468 468 } 469 469 } 470 470 471 + static void xelpmp_setup_private_ppat(struct intel_uncore *uncore) 472 + { 473 + intel_uncore_write(uncore, XELPMP_PAT_INDEX(0), 474 + MTL_PPAT_L4_0_WB); 475 + intel_uncore_write(uncore, XELPMP_PAT_INDEX(1), 476 + MTL_PPAT_L4_1_WT); 477 + intel_uncore_write(uncore, XELPMP_PAT_INDEX(2), 478 + MTL_PPAT_L4_3_UC); 479 + intel_uncore_write(uncore, XELPMP_PAT_INDEX(3), 480 + MTL_PPAT_L4_0_WB | MTL_2_COH_1W); 481 + intel_uncore_write(uncore, XELPMP_PAT_INDEX(4), 482 + MTL_PPAT_L4_0_WB | MTL_3_COH_2W); 483 + 484 + /* 485 + * Remaining PAT entries are left at the hardware-default 486 + * fully-cached setting 487 + */ 488 + } 489 + 490 + static void xelpg_setup_private_ppat(struct intel_gt *gt) 491 + { 492 + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0), 493 + MTL_PPAT_L4_0_WB); 494 + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1), 495 + MTL_PPAT_L4_1_WT); 496 + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2), 497 + MTL_PPAT_L4_3_UC); 498 + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3), 499 + MTL_PPAT_L4_0_WB | MTL_2_COH_1W); 500 + intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4), 501 + MTL_PPAT_L4_0_WB | MTL_3_COH_2W); 502 + 503 + /* 504 + * Remaining PAT entries are left at the hardware-default 505 + * fully-cached setting 506 + */ 507 + } 508 + 471 509 static void tgl_setup_private_ppat(struct intel_uncore *uncore) 472 510 { 473 511 /* TGL doesn't support LLC or AGE settings */ ··· 641 603 642 604 GEM_BUG_ON(GRAPHICS_VER(i915) < 8); 643 605 644 - if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) 606 + if (gt->type == GT_MEDIA) { 607 + xelpmp_setup_private_ppat(gt->uncore); 608 + return; 609 + } 610 + 611 + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)) 612 + xelpg_setup_private_ppat(gt); 613 + else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) 645 614 xehp_setup_private_ppat(gt); 646 615 else if (GRAPHICS_VER(i915) >= 12) 647 616 tgl_setup_private_ppat(uncore);
+25 -11
drivers/gpu/drm/i915/gt/intel_gtt.h
··· 88 88 #define BYT_PTE_SNOOPED_BY_CPU_CACHES REG_BIT(2) 89 89 #define BYT_PTE_WRITEABLE REG_BIT(1) 90 90 91 + #define MTL_PPGTT_PTE_PAT3 BIT_ULL(62) 91 92 #define GEN12_PPGTT_PTE_LM BIT_ULL(11) 93 + #define GEN12_PPGTT_PTE_PAT2 BIT_ULL(7) 94 + #define GEN12_PPGTT_PTE_PAT1 BIT_ULL(4) 95 + #define GEN12_PPGTT_PTE_PAT0 BIT_ULL(3) 92 96 93 - #define GEN12_GGTT_PTE_LM BIT_ULL(1) 97 + #define GEN12_GGTT_PTE_LM BIT_ULL(1) 98 + #define MTL_GGTT_PTE_PAT0 BIT_ULL(52) 99 + #define MTL_GGTT_PTE_PAT1 BIT_ULL(53) 100 + #define GEN12_GGTT_PTE_ADDR_MASK GENMASK_ULL(45, 12) 101 + #define MTL_GGTT_PTE_PAT_MASK GENMASK_ULL(53, 52) 94 102 95 103 #define GEN12_PDE_64K BIT(6) 96 104 #define GEN12_PTE_PS64 BIT(8) ··· 155 147 #define GEN8_PDE_IPS_64K BIT(11) 156 148 #define GEN8_PDE_PS_2M BIT(7) 157 149 158 - enum i915_cache_level; 150 + #define MTL_PPAT_L4_CACHE_POLICY_MASK REG_GENMASK(3, 2) 151 + #define MTL_PAT_INDEX_COH_MODE_MASK REG_GENMASK(1, 0) 152 + #define MTL_PPAT_L4_3_UC REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 3) 153 + #define MTL_PPAT_L4_1_WT REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 1) 154 + #define MTL_PPAT_L4_0_WB REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 0) 155 + #define MTL_3_COH_2W REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 3) 156 + #define MTL_2_COH_1W REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 2) 159 157 160 158 struct drm_i915_gem_object; 161 159 struct i915_fence_reg; ··· 230 216 void (*bind_vma)(struct i915_address_space *vm, 231 217 struct i915_vm_pt_stash *stash, 232 218 struct i915_vma_resource *vma_res, 233 - enum i915_cache_level cache_level, 219 + unsigned int pat_index, 234 220 u32 flags); 235 221 /* 236 222 * Unmap an object from an address space. 
This usually consists of ··· 302 288 (*alloc_scratch_dma)(struct i915_address_space *vm, int sz); 303 289 304 290 u64 (*pte_encode)(dma_addr_t addr, 305 - enum i915_cache_level level, 291 + unsigned int pat_index, 306 292 u32 flags); /* Create a valid PTE */ 307 293 #define PTE_READ_ONLY BIT(0) 308 294 #define PTE_LM BIT(1) ··· 317 303 void (*insert_page)(struct i915_address_space *vm, 318 304 dma_addr_t addr, 319 305 u64 offset, 320 - enum i915_cache_level cache_level, 306 + unsigned int pat_index, 321 307 u32 flags); 322 308 void (*insert_entries)(struct i915_address_space *vm, 323 309 struct i915_vma_resource *vma_res, 324 - enum i915_cache_level cache_level, 310 + unsigned int pat_index, 325 311 u32 flags); 326 312 void (*raw_insert_page)(struct i915_address_space *vm, 327 313 dma_addr_t addr, 328 314 u64 offset, 329 - enum i915_cache_level cache_level, 315 + unsigned int pat_index, 330 316 u32 flags); 331 317 void (*raw_insert_entries)(struct i915_address_space *vm, 332 318 struct i915_vma_resource *vma_res, 333 - enum i915_cache_level cache_level, 319 + unsigned int pat_index, 334 320 u32 flags); 335 321 void (*cleanup)(struct i915_address_space *vm); 336 322 ··· 507 493 508 494 /** 509 495 * i915_vm_resv_put - Release a reference on the vm's reservation lock 510 - * @resv: Pointer to a reservation lock obtained from i915_vm_resv_get() 496 + * @vm: The vm whose reservation lock reference we want to release 511 497 */ 512 498 static inline void i915_vm_resv_put(struct i915_address_space *vm) 513 499 { ··· 577 563 void intel_ggtt_bind_vma(struct i915_address_space *vm, 578 564 struct i915_vm_pt_stash *stash, 579 565 struct i915_vma_resource *vma_res, 580 - enum i915_cache_level cache_level, 566 + unsigned int pat_index, 581 567 u32 flags); 582 568 void intel_ggtt_unbind_vma(struct i915_address_space *vm, 583 569 struct i915_vma_resource *vma_res); ··· 655 641 void ppgtt_bind_vma(struct i915_address_space *vm, 656 642 struct i915_vm_pt_stash *stash, 657 643 
struct i915_vma_resource *vma_res, 658 - enum i915_cache_level cache_level, 644 + unsigned int pat_index, 659 645 u32 flags); 660 646 void ppgtt_unbind_vma(struct i915_address_space *vm, 661 647 struct i915_vma_resource *vma_res);
+3 -1
drivers/gpu/drm/i915/gt/intel_lrc.c
··· 1370 1370 cs, GEN12_GFX_CCS_AUX_NV); 1371 1371 1372 1372 /* Wa_16014892111 */ 1373 - if (IS_DG2(ce->engine->i915)) 1373 + if (IS_MTL_GRAPHICS_STEP(ce->engine->i915, M, STEP_A0, STEP_B0) || 1374 + IS_MTL_GRAPHICS_STEP(ce->engine->i915, P, STEP_A0, STEP_B0) || 1375 + IS_DG2(ce->engine->i915)) 1374 1376 cs = dg2_emit_draw_watermark_setting(cs); 1375 1377 1376 1378 return cs;
+28 -23
drivers/gpu/drm/i915/gt/intel_migrate.c
··· 45 45 * Insert a dummy PTE into every PT that will map to LMEM to ensure 46 46 * we have a correctly setup PDE structure for later use. 47 47 */ 48 - vm->insert_page(vm, 0, d->offset, I915_CACHE_NONE, PTE_LM); 48 + vm->insert_page(vm, 0, d->offset, 49 + i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE), 50 + PTE_LM); 49 51 GEM_BUG_ON(!pt->is_compact); 50 52 d->offset += SZ_2M; 51 53 } ··· 65 63 * alignment is 64K underneath for the pt, and we are careful 66 64 * not to access the space in the void. 67 65 */ 68 - vm->insert_page(vm, px_dma(pt), d->offset, I915_CACHE_NONE, PTE_LM); 66 + vm->insert_page(vm, px_dma(pt), d->offset, 67 + i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE), 68 + PTE_LM); 69 69 d->offset += SZ_64K; 70 70 } 71 71 ··· 77 73 { 78 74 struct insert_pte_data *d = data; 79 75 80 - vm->insert_page(vm, px_dma(pt), d->offset, I915_CACHE_NONE, 76 + vm->insert_page(vm, px_dma(pt), d->offset, 77 + i915_gem_get_pat_index(vm->i915, I915_CACHE_NONE), 81 78 i915_gem_object_is_lmem(pt->base) ? PTE_LM : 0); 82 79 d->offset += PAGE_SIZE; 83 80 } ··· 361 356 362 357 static int emit_pte(struct i915_request *rq, 363 358 struct sgt_dma *it, 364 - enum i915_cache_level cache_level, 359 + unsigned int pat_index, 365 360 bool is_lmem, 366 361 u64 offset, 367 362 int length) 368 363 { 369 364 bool has_64K_pages = HAS_64K_PAGES(rq->engine->i915); 370 - const u64 encode = rq->context->vm->pte_encode(0, cache_level, 365 + const u64 encode = rq->context->vm->pte_encode(0, pat_index, 371 366 is_lmem ? 
PTE_LM : 0); 372 367 struct intel_ring *ring = rq->ring; 373 368 int pkt, dword_length; ··· 678 673 intel_context_migrate_copy(struct intel_context *ce, 679 674 const struct i915_deps *deps, 680 675 struct scatterlist *src, 681 - enum i915_cache_level src_cache_level, 676 + unsigned int src_pat_index, 682 677 bool src_is_lmem, 683 678 struct scatterlist *dst, 684 - enum i915_cache_level dst_cache_level, 679 + unsigned int dst_pat_index, 685 680 bool dst_is_lmem, 686 681 struct i915_request **out) 687 682 { 688 683 struct sgt_dma it_src = sg_sgt(src), it_dst = sg_sgt(dst), it_ccs; 689 684 struct drm_i915_private *i915 = ce->engine->i915; 690 685 u64 ccs_bytes_to_cpy = 0, bytes_to_cpy; 691 - enum i915_cache_level ccs_cache_level; 686 + unsigned int ccs_pat_index; 692 687 u32 src_offset, dst_offset; 693 688 u8 src_access, dst_access; 694 689 struct i915_request *rq; ··· 712 707 dst_sz = scatter_list_length(dst); 713 708 if (src_is_lmem) { 714 709 it_ccs = it_dst; 715 - ccs_cache_level = dst_cache_level; 710 + ccs_pat_index = dst_pat_index; 716 711 ccs_is_src = false; 717 712 } else if (dst_is_lmem) { 718 713 bytes_to_cpy = dst_sz; 719 714 it_ccs = it_src; 720 - ccs_cache_level = src_cache_level; 715 + ccs_pat_index = src_pat_index; 721 716 ccs_is_src = true; 722 717 } 723 718 ··· 778 773 src_sz = calculate_chunk_sz(i915, src_is_lmem, 779 774 bytes_to_cpy, ccs_bytes_to_cpy); 780 775 781 - len = emit_pte(rq, &it_src, src_cache_level, src_is_lmem, 776 + len = emit_pte(rq, &it_src, src_pat_index, src_is_lmem, 782 777 src_offset, src_sz); 783 778 if (!len) { 784 779 err = -EINVAL; ··· 789 784 goto out_rq; 790 785 } 791 786 792 - err = emit_pte(rq, &it_dst, dst_cache_level, dst_is_lmem, 787 + err = emit_pte(rq, &it_dst, dst_pat_index, dst_is_lmem, 793 788 dst_offset, len); 794 789 if (err < 0) 795 790 goto out_rq; ··· 816 811 goto out_rq; 817 812 818 813 ccs_sz = GET_CCS_BYTES(i915, len); 819 - err = emit_pte(rq, &it_ccs, ccs_cache_level, false, 814 + err = emit_pte(rq, 
&it_ccs, ccs_pat_index, false, 820 815 ccs_is_src ? src_offset : dst_offset, 821 816 ccs_sz); 822 817 if (err < 0) ··· 925 920 926 921 GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX); 927 922 928 - if (HAS_FLAT_CCS(i915) && ver >= 12) 923 + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) 929 924 ring_sz = XY_FAST_COLOR_BLT_DW; 930 925 else if (ver >= 8) 931 926 ring_sz = 8; ··· 936 931 if (IS_ERR(cs)) 937 932 return PTR_ERR(cs); 938 933 939 - if (HAS_FLAT_CCS(i915) && ver >= 12) { 934 + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { 940 935 *cs++ = XY_FAST_COLOR_BLT_CMD | XY_FAST_COLOR_BLT_DEPTH_32 | 941 936 (XY_FAST_COLOR_BLT_DW - 2); 942 937 *cs++ = FIELD_PREP(XY_FAST_COLOR_BLT_MOCS_MASK, mocs) | ··· 984 979 intel_context_migrate_clear(struct intel_context *ce, 985 980 const struct i915_deps *deps, 986 981 struct scatterlist *sg, 987 - enum i915_cache_level cache_level, 982 + unsigned int pat_index, 988 983 bool is_lmem, 989 984 u32 value, 990 985 struct i915_request **out) ··· 1032 1027 if (err) 1033 1028 goto out_rq; 1034 1029 1035 - len = emit_pte(rq, &it, cache_level, is_lmem, offset, CHUNK_SZ); 1030 + len = emit_pte(rq, &it, pat_index, is_lmem, offset, CHUNK_SZ); 1036 1031 if (len <= 0) { 1037 1032 err = len; 1038 1033 goto out_rq; ··· 1079 1074 struct i915_gem_ww_ctx *ww, 1080 1075 const struct i915_deps *deps, 1081 1076 struct scatterlist *src, 1082 - enum i915_cache_level src_cache_level, 1077 + unsigned int src_pat_index, 1083 1078 bool src_is_lmem, 1084 1079 struct scatterlist *dst, 1085 - enum i915_cache_level dst_cache_level, 1080 + unsigned int dst_pat_index, 1086 1081 bool dst_is_lmem, 1087 1082 struct i915_request **out) 1088 1083 { ··· 1103 1098 goto out; 1104 1099 1105 1100 err = intel_context_migrate_copy(ce, deps, 1106 - src, src_cache_level, src_is_lmem, 1107 - dst, dst_cache_level, dst_is_lmem, 1101 + src, src_pat_index, src_is_lmem, 1102 + dst, dst_pat_index, dst_is_lmem, 1108 1103 out); 1109 1104 1110 1105 intel_context_unpin(ce); ··· 1118 1113 
struct i915_gem_ww_ctx *ww, 1119 1114 const struct i915_deps *deps, 1120 1115 struct scatterlist *sg, 1121 - enum i915_cache_level cache_level, 1116 + unsigned int pat_index, 1122 1117 bool is_lmem, 1123 1118 u32 value, 1124 1119 struct i915_request **out) ··· 1139 1134 if (err) 1140 1135 goto out; 1141 1136 1142 - err = intel_context_migrate_clear(ce, deps, sg, cache_level, 1137 + err = intel_context_migrate_clear(ce, deps, sg, pat_index, 1143 1138 is_lmem, value, out); 1144 1139 1145 1140 intel_context_unpin(ce);
+6 -7
drivers/gpu/drm/i915/gt/intel_migrate.h
···
 struct i915_gem_ww_ctx;
 struct intel_gt;
 struct scatterlist;
-enum i915_cache_level;
 
 int intel_migrate_init(struct intel_migrate *m, struct intel_gt *gt);
 
···
 			  struct i915_gem_ww_ctx *ww,
 			  const struct i915_deps *deps,
 			  struct scatterlist *src,
-			  enum i915_cache_level src_cache_level,
+			  unsigned int src_pat_index,
 			  bool src_is_lmem,
 			  struct scatterlist *dst,
-			  enum i915_cache_level dst_cache_level,
+			  unsigned int dst_pat_index,
 			  bool dst_is_lmem,
 			  struct i915_request **out);
 
 int intel_context_migrate_copy(struct intel_context *ce,
 			       const struct i915_deps *deps,
 			       struct scatterlist *src,
-			       enum i915_cache_level src_cache_level,
+			       unsigned int src_pat_index,
 			       bool src_is_lmem,
 			       struct scatterlist *dst,
-			       enum i915_cache_level dst_cache_level,
+			       unsigned int dst_pat_index,
 			       bool dst_is_lmem,
 			       struct i915_request **out);
···
 			   struct i915_gem_ww_ctx *ww,
 			   const struct i915_deps *deps,
 			   struct scatterlist *sg,
-			   enum i915_cache_level cache_level,
+			   unsigned int pat_index,
 			   bool is_lmem,
 			   u32 value,
 			   struct i915_request **out);
···
 intel_context_migrate_clear(struct intel_context *ce,
 			    const struct i915_deps *deps,
 			    struct scatterlist *sg,
-			    enum i915_cache_level cache_level,
+			    unsigned int pat_index,
 			    bool is_lmem,
 			    u32 value,
 			    struct i915_request **out);
+69 -1
drivers/gpu/drm/i915/gt/intel_mocs.c
··· 40 40 #define LE_COS(value) ((value) << 15) 41 41 #define LE_SSE(value) ((value) << 17) 42 42 43 + /* Defines for the tables (GLOB_MOCS_0 - GLOB_MOCS_16) */ 44 + #define _L4_CACHEABILITY(value) ((value) << 2) 45 + #define IG_PAT(value) ((value) << 8) 46 + 43 47 /* Defines for the tables (LNCFMOCS0 - LNCFMOCS31) - two entries per word */ 44 48 #define L3_ESC(value) ((value) << 0) 45 49 #define L3_SCC(value) ((value) << 1) ··· 54 50 /* Helper defines */ 55 51 #define GEN9_NUM_MOCS_ENTRIES 64 /* 63-64 are reserved, but configured. */ 56 52 #define PVC_NUM_MOCS_ENTRIES 3 53 + #define MTL_NUM_MOCS_ENTRIES 16 57 54 58 55 /* (e)LLC caching options */ 59 56 /* ··· 77 72 #define L3_1_UC _L3_CACHEABILITY(1) 78 73 #define L3_2_RESERVED _L3_CACHEABILITY(2) 79 74 #define L3_3_WB _L3_CACHEABILITY(3) 75 + 76 + /* L4 caching options */ 77 + #define L4_0_WB _L4_CACHEABILITY(0) 78 + #define L4_1_WT _L4_CACHEABILITY(1) 79 + #define L4_2_RESERVED _L4_CACHEABILITY(2) 80 + #define L4_3_UC _L4_CACHEABILITY(3) 80 81 81 82 #define MOCS_ENTRY(__idx, __control_value, __l3cc_value) \ 82 83 [__idx] = { \ ··· 427 416 MOCS_ENTRY(2, 0, L3_3_WB), 428 417 }; 429 418 419 + static const struct drm_i915_mocs_entry mtl_mocs_table[] = { 420 + /* Error - Reserved for Non-Use */ 421 + MOCS_ENTRY(0, 422 + IG_PAT(0), 423 + L3_LKUP(1) | L3_3_WB), 424 + /* Cached - L3 + L4 */ 425 + MOCS_ENTRY(1, 426 + IG_PAT(1), 427 + L3_LKUP(1) | L3_3_WB), 428 + /* L4 - GO:L3 */ 429 + MOCS_ENTRY(2, 430 + IG_PAT(1), 431 + L3_LKUP(1) | L3_1_UC), 432 + /* Uncached - GO:L3 */ 433 + MOCS_ENTRY(3, 434 + IG_PAT(1) | L4_3_UC, 435 + L3_LKUP(1) | L3_1_UC), 436 + /* L4 - GO:Mem */ 437 + MOCS_ENTRY(4, 438 + IG_PAT(1), 439 + L3_LKUP(1) | L3_GLBGO(1) | L3_1_UC), 440 + /* Uncached - GO:Mem */ 441 + MOCS_ENTRY(5, 442 + IG_PAT(1) | L4_3_UC, 443 + L3_LKUP(1) | L3_GLBGO(1) | L3_1_UC), 444 + /* L4 - L3:NoLKUP; GO:L3 */ 445 + MOCS_ENTRY(6, 446 + IG_PAT(1), 447 + L3_1_UC), 448 + /* Uncached - L3:NoLKUP; GO:L3 */ 449 + MOCS_ENTRY(7, 450 + 
IG_PAT(1) | L4_3_UC, 451 + L3_1_UC), 452 + /* L4 - L3:NoLKUP; GO:Mem */ 453 + MOCS_ENTRY(8, 454 + IG_PAT(1), 455 + L3_GLBGO(1) | L3_1_UC), 456 + /* Uncached - L3:NoLKUP; GO:Mem */ 457 + MOCS_ENTRY(9, 458 + IG_PAT(1) | L4_3_UC, 459 + L3_GLBGO(1) | L3_1_UC), 460 + /* Display - L3; L4:WT */ 461 + MOCS_ENTRY(14, 462 + IG_PAT(1) | L4_1_WT, 463 + L3_LKUP(1) | L3_3_WB), 464 + /* CCS - Non-Displayable */ 465 + MOCS_ENTRY(15, 466 + IG_PAT(1), 467 + L3_GLBGO(1) | L3_1_UC), 468 + }; 469 + 430 470 enum { 431 471 HAS_GLOBAL_MOCS = BIT(0), 432 472 HAS_ENGINE_MOCS = BIT(1), ··· 507 445 memset(table, 0, sizeof(struct drm_i915_mocs_table)); 508 446 509 447 table->unused_entries_index = I915_MOCS_PTE; 510 - if (IS_PONTEVECCHIO(i915)) { 448 + if (IS_METEORLAKE(i915)) { 449 + table->size = ARRAY_SIZE(mtl_mocs_table); 450 + table->table = mtl_mocs_table; 451 + table->n_entries = MTL_NUM_MOCS_ENTRIES; 452 + table->uc_index = 9; 453 + table->unused_entries_index = 1; 454 + } else if (IS_PONTEVECCHIO(i915)) { 511 455 table->size = ARRAY_SIZE(pvc_mocs_table); 512 456 table->table = pvc_mocs_table; 513 457 table->n_entries = PVC_NUM_MOCS_ENTRIES;
+2 -2
drivers/gpu/drm/i915/gt/intel_ppgtt.c
···
 void ppgtt_bind_vma(struct i915_address_space *vm,
 		    struct i915_vm_pt_stash *stash,
 		    struct i915_vma_resource *vma_res,
-		    enum i915_cache_level cache_level,
+		    unsigned int pat_index,
 		    u32 flags)
 {
 	u32 pte_flags;
···
 	if (vma_res->bi.lmem)
 		pte_flags |= PTE_LM;
 
-	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
+	vm->insert_entries(vm, vma_res, pat_index, pte_flags);
 	wmb();
 }
 
+84 -83
drivers/gpu/drm/i915/gt/intel_rc6.c
··· 53 53 return rc6_to_gt(rc)->i915; 54 54 } 55 55 56 - static void set(struct intel_uncore *uncore, i915_reg_t reg, u32 val) 57 - { 58 - intel_uncore_write_fw(uncore, reg, val); 59 - } 60 - 61 56 static void gen11_rc6_enable(struct intel_rc6 *rc6) 62 57 { 63 58 struct intel_gt *gt = rc6_to_gt(rc6); ··· 67 72 */ 68 73 if (!intel_uc_uses_guc_rc(&gt->uc)) { 69 74 /* 2b: Program RC6 thresholds.*/ 70 - set(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 54 << 16 | 85); 71 - set(uncore, GEN10_MEDIA_WAKE_RATE_LIMIT, 150); 75 + intel_uncore_write_fw(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 54 << 16 | 85); 76 + intel_uncore_write_fw(uncore, GEN10_MEDIA_WAKE_RATE_LIMIT, 150); 72 77 73 - set(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */ 74 - set(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */ 78 + intel_uncore_write_fw(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */ 79 + intel_uncore_write_fw(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */ 75 80 for_each_engine(engine, rc6_to_gt(rc6), id) 76 - set(uncore, RING_MAX_IDLE(engine->mmio_base), 10); 81 + intel_uncore_write_fw(uncore, RING_MAX_IDLE(engine->mmio_base), 10); 77 82 78 - set(uncore, GUC_MAX_IDLE_COUNT, 0xA); 83 + intel_uncore_write_fw(uncore, GUC_MAX_IDLE_COUNT, 0xA); 79 84 80 - set(uncore, GEN6_RC_SLEEP, 0); 85 + intel_uncore_write_fw(uncore, GEN6_RC_SLEEP, 0); 81 86 82 - set(uncore, GEN6_RC6_THRESHOLD, 50000); /* 50/125ms per EI */ 87 + intel_uncore_write_fw(uncore, GEN6_RC6_THRESHOLD, 50000); /* 50/125ms per EI */ 83 88 } 84 89 85 90 /* ··· 100 105 * Broadwell+, To be conservative, we want to factor in a context 101 106 * switch on top (due to ksoftirqd). 
102 107 */ 103 - set(uncore, GEN9_MEDIA_PG_IDLE_HYSTERESIS, 60); 104 - set(uncore, GEN9_RENDER_PG_IDLE_HYSTERESIS, 60); 108 + intel_uncore_write_fw(uncore, GEN9_MEDIA_PG_IDLE_HYSTERESIS, 60); 109 + intel_uncore_write_fw(uncore, GEN9_RENDER_PG_IDLE_HYSTERESIS, 60); 105 110 106 111 /* 3a: Enable RC6 107 112 * ··· 117 122 GEN6_RC_CTL_RC6_ENABLE | 118 123 GEN6_RC_CTL_EI_MODE(1); 119 124 120 - /* Wa_16011777198 - Render powergating must remain disabled */ 121 - if (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_C0) || 125 + /* 126 + * Wa_16011777198 and BSpec 52698 - Render powergating must be off. 127 + * FIXME BSpec is outdated, disabling powergating for MTL is just 128 + * temporary wa and should be removed after fixing real cause 129 + * of forcewake timeouts. 130 + */ 131 + if (IS_METEORLAKE(gt->i915) || 132 + IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_C0) || 122 133 IS_DG2_GRAPHICS_STEP(gt->i915, G11, STEP_A0, STEP_B0)) 123 134 pg_enable = 124 135 GEN9_MEDIA_PG_ENABLE | ··· 142 141 VDN_MFX_POWERGATE_ENABLE(i)); 143 142 } 144 143 145 - set(uncore, GEN9_PG_ENABLE, pg_enable); 144 + intel_uncore_write_fw(uncore, GEN9_PG_ENABLE, pg_enable); 146 145 } 147 146 148 147 static void gen9_rc6_enable(struct intel_rc6 *rc6) ··· 153 152 154 153 /* 2b: Program RC6 thresholds.*/ 155 154 if (GRAPHICS_VER(rc6_to_i915(rc6)) >= 11) { 156 - set(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 54 << 16 | 85); 157 - set(uncore, GEN10_MEDIA_WAKE_RATE_LIMIT, 150); 155 + intel_uncore_write_fw(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 54 << 16 | 85); 156 + intel_uncore_write_fw(uncore, GEN10_MEDIA_WAKE_RATE_LIMIT, 150); 158 157 } else if (IS_SKYLAKE(rc6_to_i915(rc6))) { 159 158 /* 160 159 * WaRsDoubleRc6WrlWithCoarsePowerGating:skl Doubling WRL only 161 160 * when CPG is enabled 162 161 */ 163 - set(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 108 << 16); 162 + intel_uncore_write_fw(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 108 << 16); 164 163 } else { 165 - set(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 54 << 16); 164 + 
intel_uncore_write_fw(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 54 << 16); 166 165 } 167 166 168 - set(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */ 169 - set(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */ 167 + intel_uncore_write_fw(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */ 168 + intel_uncore_write_fw(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */ 170 169 for_each_engine(engine, rc6_to_gt(rc6), id) 171 - set(uncore, RING_MAX_IDLE(engine->mmio_base), 10); 170 + intel_uncore_write_fw(uncore, RING_MAX_IDLE(engine->mmio_base), 10); 172 171 173 - set(uncore, GUC_MAX_IDLE_COUNT, 0xA); 172 + intel_uncore_write_fw(uncore, GUC_MAX_IDLE_COUNT, 0xA); 174 173 175 - set(uncore, GEN6_RC_SLEEP, 0); 174 + intel_uncore_write_fw(uncore, GEN6_RC_SLEEP, 0); 176 175 177 176 /* 178 177 * 2c: Program Coarse Power Gating Policies. ··· 195 194 * conservative, we have to factor in a context switch on top (due 196 195 * to ksoftirqd). 197 196 */ 198 - set(uncore, GEN9_MEDIA_PG_IDLE_HYSTERESIS, 250); 199 - set(uncore, GEN9_RENDER_PG_IDLE_HYSTERESIS, 250); 197 + intel_uncore_write_fw(uncore, GEN9_MEDIA_PG_IDLE_HYSTERESIS, 250); 198 + intel_uncore_write_fw(uncore, GEN9_RENDER_PG_IDLE_HYSTERESIS, 250); 200 199 201 200 /* 3a: Enable RC6 */ 202 - set(uncore, GEN6_RC6_THRESHOLD, 37500); /* 37.5/125ms per EI */ 201 + intel_uncore_write_fw(uncore, GEN6_RC6_THRESHOLD, 37500); /* 37.5/125ms per EI */ 203 202 204 203 rc6->ctl_enable = 205 204 GEN6_RC_CTL_HW_ENABLE | ··· 211 210 * - Render/Media PG need to be disabled with RC6. 
212 211 */ 213 212 if (!NEEDS_WaRsDisableCoarsePowerGating(rc6_to_i915(rc6))) 214 - set(uncore, GEN9_PG_ENABLE, 215 - GEN9_RENDER_PG_ENABLE | GEN9_MEDIA_PG_ENABLE); 213 + intel_uncore_write_fw(uncore, GEN9_PG_ENABLE, 214 + GEN9_RENDER_PG_ENABLE | GEN9_MEDIA_PG_ENABLE); 216 215 } 217 216 218 217 static void gen8_rc6_enable(struct intel_rc6 *rc6) ··· 222 221 enum intel_engine_id id; 223 222 224 223 /* 2b: Program RC6 thresholds.*/ 225 - set(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 40 << 16); 226 - set(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */ 227 - set(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */ 224 + intel_uncore_write_fw(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 40 << 16); 225 + intel_uncore_write_fw(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */ 226 + intel_uncore_write_fw(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */ 228 227 for_each_engine(engine, rc6_to_gt(rc6), id) 229 - set(uncore, RING_MAX_IDLE(engine->mmio_base), 10); 230 - set(uncore, GEN6_RC_SLEEP, 0); 231 - set(uncore, GEN6_RC6_THRESHOLD, 625); /* 800us/1.28 for TO */ 228 + intel_uncore_write_fw(uncore, RING_MAX_IDLE(engine->mmio_base), 10); 229 + intel_uncore_write_fw(uncore, GEN6_RC_SLEEP, 0); 230 + intel_uncore_write_fw(uncore, GEN6_RC6_THRESHOLD, 625); /* 800us/1.28 for TO */ 232 231 233 232 /* 3: Enable RC6 */ 234 233 rc6->ctl_enable = ··· 246 245 u32 rc6vids, rc6_mask; 247 246 int ret; 248 247 249 - set(uncore, GEN6_RC1_WAKE_RATE_LIMIT, 1000 << 16); 250 - set(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 40 << 16 | 30); 251 - set(uncore, GEN6_RC6pp_WAKE_RATE_LIMIT, 30); 252 - set(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); 253 - set(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); 248 + intel_uncore_write_fw(uncore, GEN6_RC1_WAKE_RATE_LIMIT, 1000 << 16); 249 + intel_uncore_write_fw(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 40 << 16 | 30); 250 + intel_uncore_write_fw(uncore, GEN6_RC6pp_WAKE_RATE_LIMIT, 30); 251 + intel_uncore_write_fw(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); 
252 + intel_uncore_write_fw(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); 254 253 255 254 for_each_engine(engine, rc6_to_gt(rc6), id) 256 - set(uncore, RING_MAX_IDLE(engine->mmio_base), 10); 255 + intel_uncore_write_fw(uncore, RING_MAX_IDLE(engine->mmio_base), 10); 257 256 258 - set(uncore, GEN6_RC_SLEEP, 0); 259 - set(uncore, GEN6_RC1e_THRESHOLD, 1000); 260 - set(uncore, GEN6_RC6_THRESHOLD, 50000); 261 - set(uncore, GEN6_RC6p_THRESHOLD, 150000); 262 - set(uncore, GEN6_RC6pp_THRESHOLD, 64000); /* unused */ 257 + intel_uncore_write_fw(uncore, GEN6_RC_SLEEP, 0); 258 + intel_uncore_write_fw(uncore, GEN6_RC1e_THRESHOLD, 1000); 259 + intel_uncore_write_fw(uncore, GEN6_RC6_THRESHOLD, 50000); 260 + intel_uncore_write_fw(uncore, GEN6_RC6p_THRESHOLD, 150000); 261 + intel_uncore_write_fw(uncore, GEN6_RC6pp_THRESHOLD, 64000); /* unused */ 263 262 264 263 /* We don't use those on Haswell */ 265 264 rc6_mask = GEN6_RC_CTL_RC6_ENABLE; ··· 373 372 enum intel_engine_id id; 374 373 375 374 /* 2a: Program RC6 thresholds.*/ 376 - set(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 40 << 16); 377 - set(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */ 378 - set(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */ 375 + intel_uncore_write_fw(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 40 << 16); 376 + intel_uncore_write_fw(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */ 377 + intel_uncore_write_fw(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */ 379 378 380 379 for_each_engine(engine, rc6_to_gt(rc6), id) 381 - set(uncore, RING_MAX_IDLE(engine->mmio_base), 10); 382 - set(uncore, GEN6_RC_SLEEP, 0); 380 + intel_uncore_write_fw(uncore, RING_MAX_IDLE(engine->mmio_base), 10); 381 + intel_uncore_write_fw(uncore, GEN6_RC_SLEEP, 0); 383 382 384 383 /* TO threshold set to 500 us (0x186 * 1.28 us) */ 385 - set(uncore, GEN6_RC6_THRESHOLD, 0x186); 384 + intel_uncore_write_fw(uncore, GEN6_RC6_THRESHOLD, 0x186); 386 385 387 386 /* Allows RC6 residency counter to work */ 388 - set(uncore, 
VLV_COUNTER_CONTROL, 389 - _MASKED_BIT_ENABLE(VLV_COUNT_RANGE_HIGH | 390 - VLV_MEDIA_RC6_COUNT_EN | 391 - VLV_RENDER_RC6_COUNT_EN)); 387 + intel_uncore_write_fw(uncore, VLV_COUNTER_CONTROL, 388 + _MASKED_BIT_ENABLE(VLV_COUNT_RANGE_HIGH | 389 + VLV_MEDIA_RC6_COUNT_EN | 390 + VLV_RENDER_RC6_COUNT_EN)); 392 391 393 392 /* 3: Enable RC6 */ 394 393 rc6->ctl_enable = GEN7_RC_CTL_TO_MODE; ··· 400 399 struct intel_engine_cs *engine; 401 400 enum intel_engine_id id; 402 401 403 - set(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 0x00280000); 404 - set(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); 405 - set(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); 402 + intel_uncore_write_fw(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 0x00280000); 403 + intel_uncore_write_fw(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); 404 + intel_uncore_write_fw(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); 406 405 407 406 for_each_engine(engine, rc6_to_gt(rc6), id) 408 - set(uncore, RING_MAX_IDLE(engine->mmio_base), 10); 407 + intel_uncore_write_fw(uncore, RING_MAX_IDLE(engine->mmio_base), 10); 409 408 410 - set(uncore, GEN6_RC6_THRESHOLD, 0x557); 409 + intel_uncore_write_fw(uncore, GEN6_RC6_THRESHOLD, 0x557); 411 410 412 411 /* Allows RC6 residency counter to work */ 413 - set(uncore, VLV_COUNTER_CONTROL, 414 - _MASKED_BIT_ENABLE(VLV_COUNT_RANGE_HIGH | 415 - VLV_MEDIA_RC0_COUNT_EN | 416 - VLV_RENDER_RC0_COUNT_EN | 417 - VLV_MEDIA_RC6_COUNT_EN | 418 - VLV_RENDER_RC6_COUNT_EN)); 412 + intel_uncore_write_fw(uncore, VLV_COUNTER_CONTROL, 413 + _MASKED_BIT_ENABLE(VLV_COUNT_RANGE_HIGH | 414 + VLV_MEDIA_RC0_COUNT_EN | 415 + VLV_RENDER_RC0_COUNT_EN | 416 + VLV_MEDIA_RC6_COUNT_EN | 417 + VLV_RENDER_RC6_COUNT_EN)); 419 418 420 419 rc6->ctl_enable = 421 420 GEN7_RC_CTL_TO_MODE | VLV_RC_CTL_CTX_RST_PARALLEL; ··· 576 575 577 576 intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL); 578 577 if (GRAPHICS_VER(i915) >= 9) 579 - set(uncore, GEN9_PG_ENABLE, 0); 580 - set(uncore, GEN6_RC_CONTROL, 0); 581 - set(uncore, GEN6_RC_STATE, 0); 578 + 
intel_uncore_write_fw(uncore, GEN9_PG_ENABLE, 0); 579 + intel_uncore_write_fw(uncore, GEN6_RC_CONTROL, 0); 580 + intel_uncore_write_fw(uncore, GEN6_RC_STATE, 0); 582 581 intel_uncore_forcewake_put(uncore, FORCEWAKE_ALL); 583 582 } 584 583 ··· 685 684 return; 686 685 687 686 /* Restore HW timers for automatic RC6 entry while busy */ 688 - set(uncore, GEN6_RC_CONTROL, rc6->ctl_enable); 687 + intel_uncore_write_fw(uncore, GEN6_RC_CONTROL, rc6->ctl_enable); 689 688 } 690 689 691 690 void intel_rc6_park(struct intel_rc6 *rc6) ··· 705 704 return; 706 705 707 706 /* Turn off the HW timers and go directly to rc6 */ 708 - set(uncore, GEN6_RC_CONTROL, GEN6_RC_CTL_RC6_ENABLE); 707 + intel_uncore_write_fw(uncore, GEN6_RC_CONTROL, GEN6_RC_CTL_RC6_ENABLE); 709 708 710 709 if (HAS_RC6pp(rc6_to_i915(rc6))) 711 710 target = 0x6; /* deepest rc6 */ ··· 713 712 target = 0x5; /* deep rc6 */ 714 713 else 715 714 target = 0x4; /* normal rc6 */ 716 - set(uncore, GEN6_RC_STATE, target << RC_SW_TARGET_STATE_SHIFT); 715 + intel_uncore_write_fw(uncore, GEN6_RC_STATE, target << RC_SW_TARGET_STATE_SHIFT); 717 716 } 718 717 719 718 void intel_rc6_disable(struct intel_rc6 *rc6) ··· 736 735 737 736 /* We want the BIOS C6 state preserved across loads for MTL */ 738 737 if (IS_METEORLAKE(rc6_to_i915(rc6)) && rc6->bios_state_captured) 739 - set(uncore, GEN6_RC_STATE, rc6->bios_rc_state); 738 + intel_uncore_write_fw(uncore, GEN6_RC_STATE, rc6->bios_rc_state); 740 739 741 740 pctx = fetch_and_zero(&rc6->pctx); 742 741 if (pctx) ··· 767 766 * before we have set the default VLV_COUNTER_CONTROL value. So always 768 767 * set the high bit to be safe. 
769 768 */ 770 - set(uncore, VLV_COUNTER_CONTROL, 771 - _MASKED_BIT_ENABLE(VLV_COUNT_RANGE_HIGH)); 769 + intel_uncore_write_fw(uncore, VLV_COUNTER_CONTROL, 770 + _MASKED_BIT_ENABLE(VLV_COUNT_RANGE_HIGH)); 772 771 upper = intel_uncore_read_fw(uncore, reg); 773 772 do { 774 773 tmp = upper; 775 774 776 - set(uncore, VLV_COUNTER_CONTROL, 777 - _MASKED_BIT_DISABLE(VLV_COUNT_RANGE_HIGH)); 775 + intel_uncore_write_fw(uncore, VLV_COUNTER_CONTROL, 776 + _MASKED_BIT_DISABLE(VLV_COUNT_RANGE_HIGH)); 778 777 lower = intel_uncore_read_fw(uncore, reg); 779 778 780 - set(uncore, VLV_COUNTER_CONTROL, 781 - _MASKED_BIT_ENABLE(VLV_COUNT_RANGE_HIGH)); 779 + intel_uncore_write_fw(uncore, VLV_COUNTER_CONTROL, 780 + _MASKED_BIT_ENABLE(VLV_COUNT_RANGE_HIGH)); 782 781 upper = intel_uncore_read_fw(uncore, reg); 783 782 } while (upper != tmp && --loop); 784 783
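The VLV residency hunk above ends with a split 64-bit counter read: the driver reads the upper half, then the lower, then the upper half again, retrying until the upper half is stable so that a carry between the two 32-bit reads cannot produce a torn value. A standalone sketch of that retry loop (mock accessors stand in for the `intel_uncore_read_fw()` MMIO reads; the tick-during-read simulation is purely illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Mock 64-bit hardware counter; a real driver would read MMIO halves. */
static uint64_t hw_counter;

static uint32_t read_upper(void)
{
	return (uint32_t)(hw_counter >> 32);
}

static uint32_t read_lower(void)
{
	uint32_t lo = (uint32_t)hw_counter;

	/* simulate the counter ticking while we are mid-read */
	hw_counter += 0x10;
	return lo;
}

/* Same shape as the VLV path: upper, lower, upper again until stable. */
static uint64_t read_counter64(void)
{
	uint32_t upper, lower, tmp;
	int loop = 2;

	upper = read_upper();
	do {
		tmp = upper;
		lower = read_lower();
		upper = read_upper();
	} while (upper != tmp && --loop);

	return ((uint64_t)upper << 32) | lower;
}
```

If the lower-half read carries into the upper half, the second upper read differs from the first and the loop retries, so the combined value is never torn.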
+41 -15
drivers/gpu/drm/i915/gt/intel_workarounds.c
··· 812 812 wa_masked_en(wal, CACHE_MODE_1, MSAA_OPTIMIZATION_REDUC_DISABLE); 813 813 } 814 814 815 + static void mtl_ctx_gt_tuning_init(struct intel_engine_cs *engine, 816 + struct i915_wa_list *wal) 817 + { 818 + struct drm_i915_private *i915 = engine->i915; 819 + 820 + dg2_ctx_gt_tuning_init(engine, wal); 821 + 822 + if (IS_MTL_GRAPHICS_STEP(i915, M, STEP_B0, STEP_FOREVER) || 823 + IS_MTL_GRAPHICS_STEP(i915, P, STEP_B0, STEP_FOREVER)) 824 + wa_add(wal, DRAW_WATERMARK, VERT_WM_VAL, 0x3FF, 0, false); 825 + } 826 + 815 827 static void mtl_ctx_workarounds_init(struct intel_engine_cs *engine, 816 828 struct i915_wa_list *wal) 817 829 { 818 830 struct drm_i915_private *i915 = engine->i915; 831 + 832 + mtl_ctx_gt_tuning_init(engine, wal); 819 833 820 834 if (IS_MTL_GRAPHICS_STEP(i915, M, STEP_A0, STEP_B0) || 821 835 IS_MTL_GRAPHICS_STEP(i915, P, STEP_A0, STEP_B0)) { ··· 1709 1695 static void 1710 1696 xelpg_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) 1711 1697 { 1698 + /* Wa_14018778641 / Wa_18018781329 */ 1699 + wa_mcr_write_or(wal, RENDER_MOD_CTRL, FORCE_MISS_FTLB); 1700 + wa_mcr_write_or(wal, COMP_MOD_CTRL, FORCE_MISS_FTLB); 1701 + 1702 + /* Wa_22016670082 */ 1703 + wa_write_or(wal, GEN12_SQCNT1, GEN12_STRICT_RAR_ENABLE); 1704 + 1712 1705 if (IS_MTL_GRAPHICS_STEP(gt->i915, M, STEP_A0, STEP_B0) || 1713 1706 IS_MTL_GRAPHICS_STEP(gt->i915, P, STEP_A0, STEP_B0)) { 1714 1707 /* Wa_14014830051 */ 1715 1708 wa_mcr_write_clr(wal, SARB_CHICKEN1, COMP_CKN_IN); 1716 1709 1717 - /* Wa_18018781329 */ 1718 - wa_mcr_write_or(wal, RENDER_MOD_CTRL, FORCE_MISS_FTLB); 1719 - wa_mcr_write_or(wal, COMP_MOD_CTRL, FORCE_MISS_FTLB); 1710 + /* Wa_14015795083 */ 1711 + wa_write_clr(wal, GEN7_MISCCPCTL, GEN12_DOP_CLOCK_GATE_RENDER_ENABLE); 1720 1712 } 1721 1713 1722 1714 /* ··· 1735 1715 static void 1736 1716 xelpmp_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal) 1737 1717 { 1738 - if (IS_MTL_MEDIA_STEP(gt->i915, STEP_A0, STEP_B0)) { 1739 - /* 1740 - 
* Wa_18018781329 1741 - * 1742 - * Note that although these registers are MCR on the primary 1743 - * GT, the media GT's versions are regular singleton registers. 1744 - */ 1745 - wa_write_or(wal, XELPMP_GSC_MOD_CTRL, FORCE_MISS_FTLB); 1746 - wa_write_or(wal, XELPMP_VDBX_MOD_CTRL, FORCE_MISS_FTLB); 1747 - wa_write_or(wal, XELPMP_VEBX_MOD_CTRL, FORCE_MISS_FTLB); 1748 - } 1718 + /* 1719 + * Wa_14018778641 1720 + * Wa_18018781329 1721 + * 1722 + * Note that although these registers are MCR on the primary 1723 + * GT, the media GT's versions are regular singleton registers. 1724 + */ 1725 + wa_write_or(wal, XELPMP_GSC_MOD_CTRL, FORCE_MISS_FTLB); 1726 + wa_write_or(wal, XELPMP_VDBX_MOD_CTRL, FORCE_MISS_FTLB); 1727 + wa_write_or(wal, XELPMP_VEBX_MOD_CTRL, FORCE_MISS_FTLB); 1749 1728 1750 1729 debug_dump_steering(gt); 1751 1730 } ··· 1762 1743 */ 1763 1744 static void gt_tuning_settings(struct intel_gt *gt, struct i915_wa_list *wal) 1764 1745 { 1746 + if (IS_METEORLAKE(gt->i915)) { 1747 + if (gt->type != GT_MEDIA) 1748 + wa_mcr_write_or(wal, XEHP_L3SCQREG7, BLEND_FILL_CACHING_OPT_DIS); 1749 + 1750 + wa_mcr_write_or(wal, XEHP_SQCM, EN_32B_ACCESS); 1751 + } 1752 + 1765 1753 if (IS_PONTEVECCHIO(gt->i915)) { 1766 1754 wa_mcr_write(wal, XEHPC_L3SCRUB, 1767 1755 SCRUB_CL_DWNGRADE_SHARED | SCRUB_RATE_4B_PER_CLK); ··· 2965 2939 add_render_compute_tuning_settings(struct drm_i915_private *i915, 2966 2940 struct i915_wa_list *wal) 2967 2941 { 2968 - if (IS_DG2(i915)) 2942 + if (IS_METEORLAKE(i915) || IS_DG2(i915)) 2969 2943 wa_mcr_write_clr_set(wal, RT_CTRL, STACKID_CTRL, STACKID_CTRL_512); 2970 2944 2971 2945 /*
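The workaround hunks above lean on helpers such as `wa_write_or()` and `wa_write_clr()`, which OR bits into or clear bits from a register (e.g. `FORCE_MISS_FTLB`). A minimal sketch of that read-modify-write semantic, with a toy register array standing in for MMIO; note that the real helpers additionally record each entry in an `i915_wa_list` so the workaround can be re-applied and verified after resets, which this sketch omits:

```c
#include <assert.h>
#include <stdint.h>

/* Toy register file standing in for MMIO space (illustrative only). */
static uint32_t regs[16];

static uint32_t reg_read(unsigned int reg)
{
	return regs[reg];
}

static void reg_write(unsigned int reg, uint32_t val)
{
	regs[reg] = val;
}

/* Mirrors the wa_write_or() idea: read-modify-write to set bits. */
static void wa_write_or(unsigned int reg, uint32_t set)
{
	reg_write(reg, reg_read(reg) | set);
}

/* Mirrors wa_write_clr(): read-modify-write to clear bits. */
static void wa_write_clr(unsigned int reg, uint32_t clr)
{
	reg_write(reg, reg_read(reg) & ~clr);
}
```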
+2 -1
drivers/gpu/drm/i915/gt/selftest_engine_pm.c
···
 
 #include <linux/sort.h>
 
+#include "gt/intel_gt_print.h"
 #include "i915_selftest.h"
 #include "intel_engine_regs.h"
 #include "intel_gpu_commands.h"
···
 
 	/* gt wakeref is async (deferred to workqueue) */
 	if (intel_gt_pm_wait_for_idle(gt)) {
-		pr_err("GT failed to idle\n");
+		gt_err(gt, "GT failed to idle\n");
 		return -EINVAL;
 	}
 }
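The selftest change above switches from `pr_err()` to `gt_err()` so that the message identifies which GT failed to idle, which matters once multi-tile parts expose more than one GT. A hedged sketch of such a prefixing macro (the real one logs to dmesg; this illustrative stand-in formats into a caller-supplied buffer so the result can be checked):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct intel_gt {
	unsigned int id;
};

/* Illustrative stand-in for gt_err(): tag each message with the GT id
 * so errors on multi-GT parts say which GT they came from. */
#define gt_err(buf, len, gt, fmt, ...) \
	snprintf(buf, len, "GT%u: " fmt, (gt)->id, ##__VA_ARGS__)
```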
+25 -22
drivers/gpu/drm/i915/gt/selftest_migrate.c
··· 137 137 static int intel_context_copy_ccs(struct intel_context *ce, 138 138 const struct i915_deps *deps, 139 139 struct scatterlist *sg, 140 - enum i915_cache_level cache_level, 140 + unsigned int pat_index, 141 141 bool write_to_ccs, 142 142 struct i915_request **out) 143 143 { ··· 185 185 if (err) 186 186 goto out_rq; 187 187 188 - len = emit_pte(rq, &it, cache_level, true, offset, CHUNK_SZ); 188 + len = emit_pte(rq, &it, pat_index, true, offset, CHUNK_SZ); 189 189 if (len <= 0) { 190 190 err = len; 191 191 goto out_rq; ··· 223 223 struct i915_gem_ww_ctx *ww, 224 224 const struct i915_deps *deps, 225 225 struct scatterlist *sg, 226 - enum i915_cache_level cache_level, 226 + unsigned int pat_index, 227 227 bool write_to_ccs, 228 228 struct i915_request **out) 229 229 { ··· 243 243 if (err) 244 244 goto out; 245 245 246 - err = intel_context_copy_ccs(ce, deps, sg, cache_level, 246 + err = intel_context_copy_ccs(ce, deps, sg, pat_index, 247 247 write_to_ccs, out); 248 248 249 249 intel_context_unpin(ce); ··· 300 300 /* Write the obj data into ccs surface */ 301 301 err = intel_migrate_ccs_copy(migrate, &ww, NULL, 302 302 obj->mm.pages->sgl, 303 - obj->cache_level, 303 + obj->pat_index, 304 304 true, &rq); 305 305 if (rq && !err) { 306 306 if (i915_request_wait(rq, 0, HZ) < 0) { ··· 351 351 352 352 err = intel_migrate_ccs_copy(migrate, &ww, NULL, 353 353 obj->mm.pages->sgl, 354 - obj->cache_level, 354 + obj->pat_index, 355 355 false, &rq); 356 356 if (rq && !err) { 357 357 if (i915_request_wait(rq, 0, HZ) < 0) { ··· 414 414 struct i915_request **out) 415 415 { 416 416 return intel_migrate_copy(migrate, ww, NULL, 417 - src->mm.pages->sgl, src->cache_level, 417 + src->mm.pages->sgl, src->pat_index, 418 418 i915_gem_object_is_lmem(src), 419 - dst->mm.pages->sgl, dst->cache_level, 419 + dst->mm.pages->sgl, dst->pat_index, 420 420 i915_gem_object_is_lmem(dst), 421 421 out); 422 422 } ··· 428 428 struct i915_request **out) 429 429 { 430 430 return 
intel_context_migrate_copy(migrate->context, NULL, 431 - src->mm.pages->sgl, src->cache_level, 431 + src->mm.pages->sgl, src->pat_index, 432 432 i915_gem_object_is_lmem(src), 433 - dst->mm.pages->sgl, dst->cache_level, 433 + dst->mm.pages->sgl, dst->pat_index, 434 434 i915_gem_object_is_lmem(dst), 435 435 out); 436 436 } ··· 455 455 { 456 456 return intel_migrate_clear(migrate, ww, NULL, 457 457 obj->mm.pages->sgl, 458 - obj->cache_level, 458 + obj->pat_index, 459 459 i915_gem_object_is_lmem(obj), 460 460 value, out); 461 461 } ··· 468 468 { 469 469 return intel_context_migrate_clear(migrate->context, NULL, 470 470 obj->mm.pages->sgl, 471 - obj->cache_level, 471 + obj->pat_index, 472 472 i915_gem_object_is_lmem(obj), 473 473 value, out); 474 474 } ··· 648 648 */ 649 649 pr_info("%s emite_pte ring space=%u\n", __func__, rq->ring->space); 650 650 it = sg_sgt(obj->mm.pages->sgl); 651 - len = emit_pte(rq, &it, obj->cache_level, false, 0, CHUNK_SZ); 651 + len = emit_pte(rq, &it, obj->pat_index, false, 0, CHUNK_SZ); 652 652 if (!len) { 653 653 err = -EINVAL; 654 654 goto out_rq; ··· 844 844 845 845 static int __perf_clear_blt(struct intel_context *ce, 846 846 struct scatterlist *sg, 847 - enum i915_cache_level cache_level, 847 + unsigned int pat_index, 848 848 bool is_lmem, 849 849 size_t sz) 850 850 { ··· 858 858 859 859 t0 = ktime_get(); 860 860 861 - err = intel_context_migrate_clear(ce, NULL, sg, cache_level, 861 + err = intel_context_migrate_clear(ce, NULL, sg, pat_index, 862 862 is_lmem, 0, &rq); 863 863 if (rq) { 864 864 if (i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT) < 0) ··· 904 904 905 905 err = __perf_clear_blt(gt->migrate.context, 906 906 dst->mm.pages->sgl, 907 - I915_CACHE_NONE, 907 + i915_gem_get_pat_index(gt->i915, 908 + I915_CACHE_NONE), 908 909 i915_gem_object_is_lmem(dst), 909 910 sizes[i]); 910 911 ··· 920 919 921 920 static int __perf_copy_blt(struct intel_context *ce, 922 921 struct scatterlist *src, 923 - enum i915_cache_level src_cache_level, 
922 + unsigned int src_pat_index, 924 923 bool src_is_lmem, 925 924 struct scatterlist *dst, 926 - enum i915_cache_level dst_cache_level, 925 + unsigned int dst_pat_index, 927 926 bool dst_is_lmem, 928 927 size_t sz) 929 928 { ··· 938 937 t0 = ktime_get(); 939 938 940 939 err = intel_context_migrate_copy(ce, NULL, 941 - src, src_cache_level, 940 + src, src_pat_index, 942 941 src_is_lmem, 943 - dst, dst_cache_level, 942 + dst, dst_pat_index, 944 943 dst_is_lmem, 945 944 &rq); 946 945 if (rq) { ··· 995 994 996 995 err = __perf_copy_blt(gt->migrate.context, 997 996 src->mm.pages->sgl, 998 - I915_CACHE_NONE, 997 + i915_gem_get_pat_index(gt->i915, 998 + I915_CACHE_NONE), 999 999 i915_gem_object_is_lmem(src), 1000 1000 dst->mm.pages->sgl, 1001 - I915_CACHE_NONE, 1001 + i915_gem_get_pat_index(gt->i915, 1002 + I915_CACHE_NONE), 1002 1003 i915_gem_object_is_lmem(dst), 1003 1004 sz); 1004 1005
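The migrate-selftest hunks above replace `enum i915_cache_level` parameters with a numeric `pat_index` obtained via `i915_gem_get_pat_index()`: newer platforms program caching through PAT table entries, so the fixed cache-level enum is translated through a per-platform table. A sketch of that lookup, with invented table contents for illustration (though the MTL hunk elsewhere in this merge does describe uncached as "PAT index 2" on that platform):

```c
#include <assert.h>

enum cache_level { CACHE_NONE, CACHE_LLC, CACHE_WT, CACHE_MAX };

/* Hypothetical per-platform cache-level -> PAT-index tables; the
 * values are invented for illustration, not a real PAT layout. */
static const unsigned int legacy_pat[CACHE_MAX] = { 0, 1, 2 };
static const unsigned int xelpg_pat[CACHE_MAX]  = { 2, 0, 1 };

struct device_info {
	const unsigned int *cachelevel_to_pat;
};

/* Sketch of the i915_gem_get_pat_index() idea: translate the legacy
 * enum through whichever table the platform registered. */
static unsigned int get_pat_index(const struct device_info *info,
				  enum cache_level level)
{
	return info->cachelevel_to_pat[level];
}
```

Callers that used to pass `I915_CACHE_NONE` directly now pass the platform-specific index this lookup returns, which is exactly the transformation the selftest diffs perform.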
+2 -1
drivers/gpu/drm/i915/gt/selftest_mocs.c
···
 		  const struct drm_i915_mocs_table *table,
 		  u32 *offset)
 {
+	struct intel_gt *gt = rq->engine->gt;
 	u32 addr;
 
 	if (!table)
 		return 0;
 
 	if (HAS_GLOBAL_MOCS_REGISTERS(rq->engine->i915))
-		addr = global_mocs_offset();
+		addr = global_mocs_offset() + gt->uncore->gsi_offset;
 	else
 		addr = mocs_offset(rq->engine);
+6 -2
drivers/gpu/drm/i915/gt/selftest_reset.c
···
 
 	ggtt->vm.insert_page(&ggtt->vm, dma,
 			     ggtt->error_capture.start,
-			     I915_CACHE_NONE, 0);
+			     i915_gem_get_pat_index(gt->i915,
+						    I915_CACHE_NONE),
+			     0);
 	mb();
 
 	s = io_mapping_map_wc(&ggtt->iomap,
···
 
 	ggtt->vm.insert_page(&ggtt->vm, dma,
 			     ggtt->error_capture.start,
-			     I915_CACHE_NONE, 0);
+			     i915_gem_get_pat_index(gt->i915,
+						    I915_CACHE_NONE),
+			     0);
 	mb();
 
 	s = io_mapping_map_wc(&ggtt->iomap,
+37 -5
drivers/gpu/drm/i915/gt/selftest_slpc.c
··· 70 70 return err; 71 71 } 72 72 73 + static int slpc_restore_freq(struct intel_guc_slpc *slpc, u32 min, u32 max) 74 + { 75 + int err; 76 + 77 + err = slpc_set_max_freq(slpc, max); 78 + if (err) { 79 + pr_err("Unable to restore max freq"); 80 + return err; 81 + } 82 + 83 + err = slpc_set_min_freq(slpc, min); 84 + if (err) { 85 + pr_err("Unable to restore min freq"); 86 + return err; 87 + } 88 + 89 + err = intel_guc_slpc_set_ignore_eff_freq(slpc, false); 90 + if (err) { 91 + pr_err("Unable to restore efficient freq"); 92 + return err; 93 + } 94 + 95 + return 0; 96 + } 97 + 73 98 static u64 measure_power_at_freq(struct intel_gt *gt, int *freq, u64 *power) 74 99 { 75 100 int err = 0; ··· 293 268 294 269 /* 295 270 * Set min frequency to RPn so that we can test the whole 296 - * range of RPn-RP0. This also turns off efficient freq 297 - * usage and makes results more predictable. 271 + * range of RPn-RP0. 298 272 */ 299 273 err = slpc_set_min_freq(slpc, slpc->min_freq); 300 274 if (err) { 301 275 pr_err("Unable to update min freq!"); 276 + return err; 277 + } 278 + 279 + /* 280 + * Turn off efficient frequency so RPn/RP0 ranges are obeyed. 281 + */ 282 + err = intel_guc_slpc_set_ignore_eff_freq(slpc, true); 283 + if (err) { 284 + pr_err("Unable to turn off efficient freq!"); 302 285 return err; 303 286 } 304 287 ··· 391 358 break; 392 359 } 393 360 394 - /* Restore min/max frequencies */ 395 - slpc_set_max_freq(slpc, slpc_max_freq); 396 - slpc_set_min_freq(slpc, slpc_min_freq); 361 + /* Restore min/max/efficient frequencies */ 362 + err = slpc_restore_freq(slpc, slpc_min_freq, slpc_max_freq); 397 363 398 364 if (igt_flush_test(gt->i915)) 399 365 err = -EIO;
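The new `slpc_restore_freq()` above restores the max frequency before the min. One plausible reason for that ordering is to avoid a transient request in which the requested min exceeds the currently programmed max; the sketch below makes that concrete with a mock controller that rejects such requests (whether GuC SLPC actually rejects them is an assumption made here for illustration, not something this diff states):

```c
#include <assert.h>

struct slpc {
	unsigned int min_freq, max_freq;
};

/* Mock controllers that, like a real one might, reject min > max. */
static int set_max_freq(struct slpc *s, unsigned int max)
{
	if (max < s->min_freq)
		return -1;
	s->max_freq = max;
	return 0;
}

static int set_min_freq(struct slpc *s, unsigned int min)
{
	if (min > s->max_freq)
		return -1;
	s->min_freq = min;
	return 0;
}

/* Same order as slpc_restore_freq(): max first, then min, so the
 * intermediate state never has min above the programmed max. */
static int restore_freq(struct slpc *s, unsigned int min, unsigned int max)
{
	int err = set_max_freq(s, max);

	if (err)
		return err;
	return set_min_freq(s, min);
}
```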
+1 -1
drivers/gpu/drm/i915/gt/selftest_timeline.c
···
 		return PTR_ERR(obj);
 
 	/* keep the same cache settings as timeline */
-	i915_gem_object_set_cache_coherency(obj, tl->hwsp_ggtt->obj->cache_level);
+	i915_gem_object_set_pat_index(obj, tl->hwsp_ggtt->obj->pat_index);
 	w->map = i915_gem_object_pin_map_unlocked(obj,
 						  page_unmask_bits(tl->hwsp_ggtt->obj->mm.mapping));
 	if (IS_ERR(w->map)) {
+3 -1
drivers/gpu/drm/i915/gt/selftest_tlb.c
···
 	  u64 length,
 	  struct rnd_state *prng)
 {
+	const unsigned int pat_index =
+		i915_gem_get_pat_index(ce->vm->i915, I915_CACHE_NONE);
 	struct drm_i915_gem_object *batch;
 	struct drm_mm_node vb_node;
 	struct i915_request *rq;
···
 	/* Flip the PTE between A and B */
 	if (i915_gem_object_is_lmem(vb->obj))
 		pte_flags |= PTE_LM;
-	ce->vm->insert_entries(ce->vm, &vb_res, 0, pte_flags);
+	ce->vm->insert_entries(ce->vm, &vb_res, pat_index, pte_flags);
 
 	/* Flush the PTE update to concurrent HW */
 	tlbinv(ce->vm, addr & -length, length);
+1
drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
···
 enum intel_bootrom_load_status {
 	INTEL_BOOTROM_STATUS_NO_KEY_FOUND = 0x13,
 	INTEL_BOOTROM_STATUS_AES_PROD_KEY_FOUND = 0x1A,
+	INTEL_BOOTROM_STATUS_PROD_KEY_CHECK_FAILURE = 0x2B,
 	INTEL_BOOTROM_STATUS_RSA_FAILED = 0x50,
 	INTEL_BOOTROM_STATUS_PAVPC_FAILED = 0x73,
 	INTEL_BOOTROM_STATUS_WOPCM_FAILED = 0x74,
+14 -6
drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h
··· 12 12 struct intel_guc; 13 13 struct file; 14 14 15 - /** 15 + /* 16 16 * struct __guc_capture_bufstate 17 17 * 18 18 * Book-keeping structure used to track read and write pointers ··· 26 26 u32 wr; 27 27 }; 28 28 29 - /** 29 + /* 30 30 * struct __guc_capture_parsed_output - extracted error capture node 31 31 * 32 32 * A single unit of extracted error-capture output data grouped together ··· 58 58 #define GCAP_PARSED_REGLIST_INDEX_ENGINST BIT(GUC_CAPTURE_LIST_TYPE_ENGINE_INSTANCE) 59 59 }; 60 60 61 - /** 61 + /* 62 62 * struct guc_debug_capture_list_header / struct guc_debug_capture_list 63 63 * 64 64 * As part of ADS registration, these header structures (followed by ··· 76 76 struct guc_mmio_reg regs[]; 77 77 } __packed; 78 78 79 - /** 79 + /* 80 80 * struct __guc_mmio_reg_descr / struct __guc_mmio_reg_descr_group 81 81 * 82 82 * intel_guc_capture module uses these structures to maintain static ··· 101 101 struct __guc_mmio_reg_descr *extlist; /* only used for steered registers */ 102 102 }; 103 103 104 - /** 104 + /* 105 105 * struct guc_state_capture_header_t / struct guc_state_capture_t / 106 106 * guc_state_capture_group_header_t / guc_state_capture_group_t 107 107 * ··· 148 148 struct guc_state_capture_t capture_entries[]; 149 149 } __packed; 150 150 151 - /** 151 + /* 152 152 * struct __guc_capture_ads_cache 153 153 * 154 154 * A structure to cache register lists that were populated and registered ··· 187 187 struct __guc_capture_ads_cache ads_cache[GUC_CAPTURE_LIST_INDEX_MAX] 188 188 [GUC_CAPTURE_LIST_TYPE_MAX] 189 189 [GUC_MAX_ENGINE_CLASSES]; 190 + 191 + /** 192 + * @ads_null_cache: ADS null cache. 193 + */ 190 194 void *ads_null_cache; 191 195 192 196 /** ··· 206 202 struct list_head cachelist; 207 203 #define PREALLOC_NODES_MAX_COUNT (3 * GUC_MAX_ENGINE_CLASSES * GUC_MAX_INSTANCES_PER_CLASS) 208 204 #define PREALLOC_NODES_DEFAULT_NUMREGS 64 205 + 206 + /** 207 + * @max_mmio_per_node: Max MMIO per node. 
208 + */ 209 209 int max_mmio_per_node; 210 210 211 211 /**
+23
drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
··· 13 13 #define GSC_FW_STATUS_REG _MMIO(0x116C40) 14 14 #define GSC_FW_CURRENT_STATE REG_GENMASK(3, 0) 15 15 #define GSC_FW_CURRENT_STATE_RESET 0 16 + #define GSC_FW_PROXY_STATE_NORMAL 5 16 17 #define GSC_FW_INIT_COMPLETE_BIT REG_BIT(9) 17 18 18 19 static bool gsc_is_in_reset(struct intel_uncore *uncore) ··· 22 21 23 22 return REG_FIELD_GET(GSC_FW_CURRENT_STATE, fw_status) == 24 23 GSC_FW_CURRENT_STATE_RESET; 24 + } 25 + 26 + bool intel_gsc_uc_fw_proxy_init_done(struct intel_gsc_uc *gsc) 27 + { 28 + struct intel_uncore *uncore = gsc_uc_to_gt(gsc)->uncore; 29 + u32 fw_status = intel_uncore_read(uncore, GSC_FW_STATUS_REG); 30 + 31 + return REG_FIELD_GET(GSC_FW_CURRENT_STATE, fw_status) == 32 + GSC_FW_PROXY_STATE_NORMAL; 25 33 } 26 34 27 35 bool intel_gsc_uc_fw_init_done(struct intel_gsc_uc *gsc) ··· 120 110 if (obj->base.size < gsc->fw.size) 121 111 return -ENOSPC; 122 112 113 + /* 114 + * Wa_22016122933: For MTL the shared memory needs to be mapped 115 + * as WC on CPU side and UC (PAT index 2) on GPU side 116 + */ 117 + if (IS_METEORLAKE(i915)) 118 + i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE); 119 + 123 120 dst = i915_gem_object_pin_map_unlocked(obj, 124 121 i915_coherent_map_type(i915, obj, true)); 125 122 if (IS_ERR(dst)) ··· 141 124 142 125 memset(dst, 0, obj->base.size); 143 126 memcpy(dst, src, gsc->fw.size); 127 + 128 + /* 129 + * Wa_22016122933: Making sure the data in dst is 130 + * visible to GSC right away 131 + */ 132 + intel_guc_write_barrier(&gt->uc.guc); 144 133 145 134 i915_gem_object_unpin_map(gsc->fw.obj); 146 135 i915_gem_object_unpin_map(obj);
+1
drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
···
 
 int intel_gsc_uc_fw_upload(struct intel_gsc_uc *gsc);
 bool intel_gsc_uc_fw_init_done(struct intel_gsc_uc *gsc);
+bool intel_gsc_uc_fw_proxy_init_done(struct intel_gsc_uc *gsc);
 
 #endif
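The `intel_gsc_uc_fw_proxy_init_done()` helper declared above extracts the 4-bit current-state field from the GSC firmware status register with `REG_FIELD_GET(GSC_FW_CURRENT_STATE, ...)` and compares it against `GSC_FW_PROXY_STATE_NORMAL` (5). A self-contained sketch using simplified 32-bit variants of the kernel's `GENMASK()`/`FIELD_GET()` helpers:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified 32-bit takes on the kernel's GENMASK()/FIELD_GET(). */
#define GENMASK(h, l)		((~0u << (l)) & (~0u >> (31 - (h))))
/* Dividing by the mask's lowest set bit shifts the field down. */
#define FIELD_GET(mask, val)	(((val) & (mask)) / ((mask) & -(mask)))

#define GSC_FW_CURRENT_STATE		GENMASK(3, 0)
#define GSC_FW_PROXY_STATE_NORMAL	5

/* Sketch of the proxy_init_done check against a raw status value. */
static int proxy_init_done(uint32_t fw_status)
{
	return FIELD_GET(GSC_FW_CURRENT_STATE, fw_status) ==
	       GSC_FW_PROXY_STATE_NORMAL;
}
```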
+424
drivers/gpu/drm/i915/gt/uc/intel_gsc_proxy.c
··· 1 + // SPDX-License-Identifier: MIT 2 + /* 3 + * Copyright © 2023 Intel Corporation 4 + */ 5 + 6 + #include <linux/component.h> 7 + 8 + #include "drm/i915_component.h" 9 + #include "drm/i915_gsc_proxy_mei_interface.h" 10 + 11 + #include "gt/intel_gt.h" 12 + #include "gt/intel_gt_print.h" 13 + #include "intel_gsc_proxy.h" 14 + #include "intel_gsc_uc.h" 15 + #include "intel_gsc_uc_heci_cmd_submit.h" 16 + #include "i915_drv.h" 17 + #include "i915_reg.h" 18 + 19 + /* 20 + * GSC proxy: 21 + * The GSC uC needs to communicate with the CSME to perform certain operations. 22 + * Since the GSC can't perform this communication directly on platforms where it 23 + * is integrated in GT, i915 needs to transfer the messages from GSC to CSME 24 + * and back. i915 must manually start the proxy flow after the GSC is loaded to 25 + * signal to GSC that we're ready to handle its messages and allow it to query 26 + * its init data from CSME; GSC will then trigger an HECI2 interrupt if it needs 27 + * to send messages to CSME again. 28 + * The proxy flow is as follow: 29 + * 1 - i915 submits a request to GSC asking for the message to CSME 30 + * 2 - GSC replies with the proxy header + payload for CSME 31 + * 3 - i915 sends the reply from GSC as-is to CSME via the mei proxy component 32 + * 4 - CSME replies with the proxy header + payload for GSC 33 + * 5 - i915 submits a request to GSC with the reply from CSME 34 + * 6 - GSC replies either with a new header + payload (same as step 2, so we 35 + * restart from there) or with an end message. 36 + */ 37 + 38 + /* 39 + * The component should load quite quickly in most cases, but it could take 40 + * a bit. 
Using a very big timeout just to cover the worst case scenario 41 + */ 42 + #define GSC_PROXY_INIT_TIMEOUT_MS 20000 43 + 44 + /* the protocol supports up to 32K in each direction */ 45 + #define GSC_PROXY_BUFFER_SIZE SZ_32K 46 + #define GSC_PROXY_CHANNEL_SIZE (GSC_PROXY_BUFFER_SIZE * 2) 47 + #define GSC_PROXY_MAX_MSG_SIZE (GSC_PROXY_BUFFER_SIZE - sizeof(struct intel_gsc_mtl_header)) 48 + 49 + /* FW-defined proxy header */ 50 + struct intel_gsc_proxy_header { 51 + /* 52 + * hdr: 53 + * Bits 0-7: type of the proxy message (see enum intel_gsc_proxy_type) 54 + * Bits 8-15: rsvd 55 + * Bits 16-31: length in bytes of the payload following the proxy header 56 + */ 57 + u32 hdr; 58 + #define GSC_PROXY_TYPE GENMASK(7, 0) 59 + #define GSC_PROXY_PAYLOAD_LENGTH GENMASK(31, 16) 60 + 61 + u32 source; /* Source of the Proxy message */ 62 + u32 destination; /* Destination of the Proxy message */ 63 + #define GSC_PROXY_ADDRESSING_KMD 0x10000 64 + #define GSC_PROXY_ADDRESSING_GSC 0x20000 65 + #define GSC_PROXY_ADDRESSING_CSME 0x30000 66 + 67 + u32 status; /* Command status */ 68 + } __packed; 69 + 70 + /* FW-defined proxy types */ 71 + enum intel_gsc_proxy_type { 72 + GSC_PROXY_MSG_TYPE_PROXY_INVALID = 0, 73 + GSC_PROXY_MSG_TYPE_PROXY_QUERY = 1, 74 + GSC_PROXY_MSG_TYPE_PROXY_PAYLOAD = 2, 75 + GSC_PROXY_MSG_TYPE_PROXY_END = 3, 76 + GSC_PROXY_MSG_TYPE_PROXY_NOTIFICATION = 4, 77 + }; 78 + 79 + struct gsc_proxy_msg { 80 + struct intel_gsc_mtl_header header; 81 + struct intel_gsc_proxy_header proxy_header; 82 + } __packed; 83 + 84 + static int proxy_send_to_csme(struct intel_gsc_uc *gsc) 85 + { 86 + struct intel_gt *gt = gsc_uc_to_gt(gsc); 87 + struct i915_gsc_proxy_component *comp = gsc->proxy.component; 88 + struct intel_gsc_mtl_header *hdr; 89 + void *in = gsc->proxy.to_csme; 90 + void *out = gsc->proxy.to_gsc; 91 + u32 in_size; 92 + int ret; 93 + 94 + /* CSME msg only includes the proxy */ 95 + hdr = in; 96 + in += sizeof(struct intel_gsc_mtl_header); 97 + out += sizeof(struct 
intel_gsc_mtl_header); 98 + 99 + in_size = hdr->message_size - sizeof(struct intel_gsc_mtl_header); 100 + 101 + /* the message must contain at least the proxy header */ 102 + if (in_size < sizeof(struct intel_gsc_proxy_header) || 103 + in_size > GSC_PROXY_MAX_MSG_SIZE) { 104 + gt_err(gt, "Invalid CSME message size: %u\n", in_size); 105 + return -EINVAL; 106 + } 107 + 108 + ret = comp->ops->send(comp->mei_dev, in, in_size); 109 + if (ret < 0) { 110 + gt_err(gt, "Failed to send CSME message\n"); 111 + return ret; 112 + } 113 + 114 + ret = comp->ops->recv(comp->mei_dev, out, GSC_PROXY_MAX_MSG_SIZE); 115 + if (ret < 0) { 116 + gt_err(gt, "Failed to receive CSME message\n"); 117 + return ret; 118 + } 119 + 120 + return ret; 121 + } 122 + 123 + static int proxy_send_to_gsc(struct intel_gsc_uc *gsc) 124 + { 125 + struct intel_gt *gt = gsc_uc_to_gt(gsc); 126 + u32 *marker = gsc->proxy.to_csme; /* first dw of the reply header */ 127 + u64 addr_in = i915_ggtt_offset(gsc->proxy.vma); 128 + u64 addr_out = addr_in + GSC_PROXY_BUFFER_SIZE; 129 + u32 size = ((struct gsc_proxy_msg *)gsc->proxy.to_gsc)->header.message_size; 130 + int err; 131 + 132 + /* the message must contain at least the gsc and proxy headers */ 133 + if (size < sizeof(struct gsc_proxy_msg) || size > GSC_PROXY_BUFFER_SIZE) { 134 + gt_err(gt, "Invalid GSC proxy message size: %u\n", size); 135 + return -EINVAL; 136 + } 137 + 138 + /* clear the message marker */ 139 + *marker = 0; 140 + 141 + /* make sure the marker write is flushed */ 142 + wmb(); 143 + 144 + /* send the request */ 145 + err = intel_gsc_uc_heci_cmd_submit_packet(gsc, addr_in, size, 146 + addr_out, GSC_PROXY_BUFFER_SIZE); 147 + 148 + if (!err) { 149 + /* wait for the reply to show up */ 150 + err = wait_for(*marker != 0, 300); 151 + if (err) 152 + gt_err(gt, "Failed to get a proxy reply from gsc\n"); 153 + } 154 + 155 + return err; 156 + } 157 + 158 + static int validate_proxy_header(struct intel_gsc_proxy_header *header, 159 + u32 source, u32 
dest) 160 + { 161 + u32 type = FIELD_GET(GSC_PROXY_TYPE, header->hdr); 162 + u32 length = FIELD_GET(GSC_PROXY_PAYLOAD_LENGTH, header->hdr); 163 + int ret = 0; 164 + 165 + if (header->destination != dest || header->source != source) { 166 + ret = -ENOEXEC; 167 + goto fail; 168 + } 169 + 170 + switch (type) { 171 + case GSC_PROXY_MSG_TYPE_PROXY_PAYLOAD: 172 + if (length > 0) 173 + break; 174 + fallthrough; 175 + case GSC_PROXY_MSG_TYPE_PROXY_INVALID: 176 + ret = -EIO; 177 + goto fail; 178 + default: 179 + break; 180 + } 181 + 182 + fail: 183 + return ret; 184 + } 185 + 186 + static int proxy_query(struct intel_gsc_uc *gsc) 187 + { 188 + struct intel_gt *gt = gsc_uc_to_gt(gsc); 189 + struct gsc_proxy_msg *to_gsc = gsc->proxy.to_gsc; 190 + struct gsc_proxy_msg *to_csme = gsc->proxy.to_csme; 191 + int ret; 192 + 193 + intel_gsc_uc_heci_cmd_emit_mtl_header(&to_gsc->header, 194 + HECI_MEADDRESS_PROXY, 195 + sizeof(struct gsc_proxy_msg), 196 + 0); 197 + 198 + to_gsc->proxy_header.hdr = 199 + FIELD_PREP(GSC_PROXY_TYPE, GSC_PROXY_MSG_TYPE_PROXY_QUERY) | 200 + FIELD_PREP(GSC_PROXY_PAYLOAD_LENGTH, 0); 201 + 202 + to_gsc->proxy_header.source = GSC_PROXY_ADDRESSING_KMD; 203 + to_gsc->proxy_header.destination = GSC_PROXY_ADDRESSING_GSC; 204 + to_gsc->proxy_header.status = 0; 205 + 206 + while (1) { 207 + /* clear the GSC response header space */ 208 + memset(gsc->proxy.to_csme, 0, sizeof(struct gsc_proxy_msg)); 209 + 210 + /* send proxy message to GSC */ 211 + ret = proxy_send_to_gsc(gsc); 212 + if (ret) { 213 + gt_err(gt, "failed to send proxy message to GSC! 
%d\n", ret); 214 + goto proxy_error; 215 + } 216 + 217 + /* stop if this was the last message */ 218 + if (FIELD_GET(GSC_PROXY_TYPE, to_csme->proxy_header.hdr) == 219 + GSC_PROXY_MSG_TYPE_PROXY_END) 220 + break; 221 + 222 + /* make sure the GSC-to-CSME proxy header is sane */ 223 + ret = validate_proxy_header(&to_csme->proxy_header, 224 + GSC_PROXY_ADDRESSING_GSC, 225 + GSC_PROXY_ADDRESSING_CSME); 226 + if (ret) { 227 + gt_err(gt, "invalid GSC to CSME proxy header! %d\n", ret); 228 + goto proxy_error; 229 + } 230 + 231 + /* send the GSC message to the CSME */ 232 + ret = proxy_send_to_csme(gsc); 233 + if (ret < 0) { 234 + gt_err(gt, "failed to send proxy message to CSME! %d\n", ret); 235 + goto proxy_error; 236 + } 237 + 238 + /* update the GSC message size with the returned value from CSME */ 239 + to_gsc->header.message_size = ret + sizeof(struct intel_gsc_mtl_header); 240 + 241 + /* make sure the CSME-to-GSC proxy header is sane */ 242 + ret = validate_proxy_header(&to_gsc->proxy_header, 243 + GSC_PROXY_ADDRESSING_CSME, 244 + GSC_PROXY_ADDRESSING_GSC); 245 + if (ret) { 246 + gt_err(gt, "invalid CSME to GSC proxy header! %d\n", ret); 247 + goto proxy_error; 248 + } 249 + } 250 + 251 + proxy_error: 252 + return ret < 0 ? 
ret : 0; 253 + } 254 + 255 + int intel_gsc_proxy_request_handler(struct intel_gsc_uc *gsc) 256 + { 257 + struct intel_gt *gt = gsc_uc_to_gt(gsc); 258 + int err; 259 + 260 + if (!gsc->proxy.component_added) 261 + return -ENODEV; 262 + 263 + assert_rpm_wakelock_held(gt->uncore->rpm); 264 + 265 + /* when GSC is loaded, we can queue this before the component is bound */ 266 + err = wait_for(gsc->proxy.component, GSC_PROXY_INIT_TIMEOUT_MS); 267 + if (err) { 268 + gt_err(gt, "GSC proxy component didn't bind within the expected timeout\n"); 269 + return -EIO; 270 + } 271 + 272 + mutex_lock(&gsc->proxy.mutex); 273 + if (!gsc->proxy.component) { 274 + gt_err(gt, "GSC proxy worker called without the component being bound!\n"); 275 + err = -EIO; 276 + } else { 277 + /* 278 + * write the status bit to clear it and allow new proxy 279 + * interrupts to be generated while we handle the current 280 + * request, but be sure not to write the reset bit 281 + */ 282 + intel_uncore_rmw(gt->uncore, HECI_H_CSR(MTL_GSC_HECI2_BASE), 283 + HECI_H_CSR_RST, HECI_H_CSR_IS); 284 + err = proxy_query(gsc); 285 + } 286 + mutex_unlock(&gsc->proxy.mutex); 287 + return err; 288 + } 289 + 290 + void intel_gsc_proxy_irq_handler(struct intel_gsc_uc *gsc, u32 iir) 291 + { 292 + struct intel_gt *gt = gsc_uc_to_gt(gsc); 293 + 294 + if (unlikely(!iir)) 295 + return; 296 + 297 + lockdep_assert_held(gt->irq_lock); 298 + 299 + if (!gsc->proxy.component) { 300 + gt_err(gt, "GSC proxy irq received without the component being bound!\n"); 301 + return; 302 + } 303 + 304 + gsc->gsc_work_actions |= GSC_ACTION_SW_PROXY; 305 + queue_work(gsc->wq, &gsc->work); 306 + } 307 + 308 + static int i915_gsc_proxy_component_bind(struct device *i915_kdev, 309 + struct device *mei_kdev, void *data) 310 + { 311 + struct drm_i915_private *i915 = kdev_to_i915(i915_kdev); 312 + struct intel_gt *gt = i915->media_gt; 313 + struct intel_gsc_uc *gsc = &gt->uc.gsc; 314 + intel_wakeref_t wakeref; 315 + 316 + /* enable HECI2 IRQs */ 317 + 
with_intel_runtime_pm(&i915->runtime_pm, wakeref) 318 + intel_uncore_rmw(gt->uncore, HECI_H_CSR(MTL_GSC_HECI2_BASE), 319 + HECI_H_CSR_RST, HECI_H_CSR_IE); 320 + 321 + mutex_lock(&gsc->proxy.mutex); 322 + gsc->proxy.component = data; 323 + gsc->proxy.component->mei_dev = mei_kdev; 324 + mutex_unlock(&gsc->proxy.mutex); 325 + 326 + return 0; 327 + } 328 + 329 + static void i915_gsc_proxy_component_unbind(struct device *i915_kdev, 330 + struct device *mei_kdev, void *data) 331 + { 332 + struct drm_i915_private *i915 = kdev_to_i915(i915_kdev); 333 + struct intel_gt *gt = i915->media_gt; 334 + struct intel_gsc_uc *gsc = &gt->uc.gsc; 335 + intel_wakeref_t wakeref; 336 + 337 + mutex_lock(&gsc->proxy.mutex); 338 + gsc->proxy.component = NULL; 339 + mutex_unlock(&gsc->proxy.mutex); 340 + 341 + /* disable HECI2 IRQs */ 342 + with_intel_runtime_pm(&i915->runtime_pm, wakeref) 343 + intel_uncore_rmw(gt->uncore, HECI_H_CSR(MTL_GSC_HECI2_BASE), 344 + HECI_H_CSR_IE | HECI_H_CSR_RST, 0); 345 + } 346 + 347 + static const struct component_ops i915_gsc_proxy_component_ops = { 348 + .bind = i915_gsc_proxy_component_bind, 349 + .unbind = i915_gsc_proxy_component_unbind, 350 + }; 351 + 352 + static int proxy_channel_alloc(struct intel_gsc_uc *gsc) 353 + { 354 + struct intel_gt *gt = gsc_uc_to_gt(gsc); 355 + struct i915_vma *vma; 356 + void *vaddr; 357 + int err; 358 + 359 + err = intel_guc_allocate_and_map_vma(&gt->uc.guc, GSC_PROXY_CHANNEL_SIZE, 360 + &vma, &vaddr); 361 + if (err) 362 + return err; 363 + 364 + gsc->proxy.vma = vma; 365 + gsc->proxy.to_gsc = vaddr; 366 + gsc->proxy.to_csme = vaddr + GSC_PROXY_BUFFER_SIZE; 367 + 368 + return 0; 369 + } 370 + 371 + static void proxy_channel_free(struct intel_gsc_uc *gsc) 372 + { 373 + if (!gsc->proxy.vma) 374 + return; 375 + 376 + gsc->proxy.to_gsc = NULL; 377 + gsc->proxy.to_csme = NULL; 378 + i915_vma_unpin_and_release(&gsc->proxy.vma, I915_VMA_RELEASE_MAP); 379 + } 380 + 381 + void intel_gsc_proxy_fini(struct intel_gsc_uc *gsc) 382 + { 
383 + struct intel_gt *gt = gsc_uc_to_gt(gsc); 384 + struct drm_i915_private *i915 = gt->i915; 385 + 386 + if (fetch_and_zero(&gsc->proxy.component_added)) 387 + component_del(i915->drm.dev, &i915_gsc_proxy_component_ops); 388 + 389 + proxy_channel_free(gsc); 390 + } 391 + 392 + int intel_gsc_proxy_init(struct intel_gsc_uc *gsc) 393 + { 394 + int err; 395 + struct intel_gt *gt = gsc_uc_to_gt(gsc); 396 + struct drm_i915_private *i915 = gt->i915; 397 + 398 + mutex_init(&gsc->proxy.mutex); 399 + 400 + if (!IS_ENABLED(CONFIG_INTEL_MEI_GSC_PROXY)) { 401 + gt_info(gt, "can't init GSC proxy due to missing mei component\n"); 402 + return -ENODEV; 403 + } 404 + 405 + err = proxy_channel_alloc(gsc); 406 + if (err) 407 + return err; 408 + 409 + err = component_add_typed(i915->drm.dev, &i915_gsc_proxy_component_ops, 410 + I915_COMPONENT_GSC_PROXY); 411 + if (err < 0) { 412 + gt_err(gt, "Failed to add GSC_PROXY component (%d)\n", err); 413 + goto out_free; 414 + } 415 + 416 + gsc->proxy.component_added = true; 417 + 418 + return 0; 419 + 420 + out_free: 421 + proxy_channel_free(gsc); 422 + return err; 423 + } 424 +
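The proxy file above packs the message type and payload length into a single `hdr` dword with `FIELD_PREP`/`FIELD_GET` and rejects misrouted or empty messages in `validate_proxy_header()`. A minimal userspace sketch of that packing and validation logic — the field layout and constants below are illustrative, not the driver's actual `GSC_PROXY_TYPE`/`GSC_PROXY_PAYLOAD_LENGTH` masks:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative layout: type in bits 0-7, payload length in bits 8-31. */
#define PROXY_TYPE_MASK 0xffu
#define PROXY_LEN_SHIFT 8

enum { MSG_PROXY_PAYLOAD = 1, MSG_PROXY_INVALID = 2, MSG_PROXY_END = 3 };

struct proxy_header {
	uint32_t hdr;		/* packed type + payload length */
	uint32_t source;
	uint32_t destination;
};

static uint32_t hdr_pack(uint32_t type, uint32_t len)
{
	return (type & PROXY_TYPE_MASK) | (len << PROXY_LEN_SHIFT);
}

/* Mirrors validate_proxy_header(): wrong routing is an -ENOEXEC-style
 * error (-8 here), a zero-length payload or an explicitly invalid
 * message is -EIO-style (-5), anything else passes. */
static int validate_header(const struct proxy_header *h,
			   uint32_t source, uint32_t dest)
{
	uint32_t type = h->hdr & PROXY_TYPE_MASK;
	uint32_t len = h->hdr >> PROXY_LEN_SHIFT;

	if (h->destination != dest || h->source != source)
		return -8;	/* -ENOEXEC */

	if (type == MSG_PROXY_PAYLOAD)
		return len > 0 ? 0 : -5;	/* empty payload -> -EIO */
	if (type == MSG_PROXY_INVALID)
		return -5;			/* -EIO */
	return 0;
}
```

The same header is validated in both directions in `proxy_query()`, once with GSC as source/CSME as destination and once reversed, which is why source and destination are parameters rather than constants.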
+18
drivers/gpu/drm/i915/gt/uc/intel_gsc_proxy.h
··· 1 + /* SPDX-License-Identifier: MIT */ 2 + /* 3 + * Copyright © 2023 Intel Corporation 4 + */ 5 + 6 + #ifndef _INTEL_GSC_PROXY_H_ 7 + #define _INTEL_GSC_PROXY_H_ 8 + 9 + #include <linux/types.h> 10 + 11 + struct intel_gsc_uc; 12 + 13 + int intel_gsc_proxy_init(struct intel_gsc_uc *gsc); 14 + void intel_gsc_proxy_fini(struct intel_gsc_uc *gsc); 15 + int intel_gsc_proxy_request_handler(struct intel_gsc_uc *gsc); 16 + void intel_gsc_proxy_irq_handler(struct intel_gsc_uc *gsc, u32 iir); 17 + 18 + #endif
+72 -4
drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c
··· 10 10 #include "intel_gsc_uc.h" 11 11 #include "intel_gsc_fw.h" 12 12 #include "i915_drv.h" 13 + #include "intel_gsc_proxy.h" 13 14 14 15 static void gsc_work(struct work_struct *work) 15 16 { 16 17 struct intel_gsc_uc *gsc = container_of(work, typeof(*gsc), work); 17 18 struct intel_gt *gt = gsc_uc_to_gt(gsc); 18 19 intel_wakeref_t wakeref; 20 + u32 actions; 21 + int ret; 19 22 20 - with_intel_runtime_pm(gt->uncore->rpm, wakeref) 21 - intel_gsc_uc_fw_upload(gsc); 23 + wakeref = intel_runtime_pm_get(gt->uncore->rpm); 24 + 25 + spin_lock_irq(gt->irq_lock); 26 + actions = gsc->gsc_work_actions; 27 + gsc->gsc_work_actions = 0; 28 + spin_unlock_irq(gt->irq_lock); 29 + 30 + if (actions & GSC_ACTION_FW_LOAD) { 31 + ret = intel_gsc_uc_fw_upload(gsc); 32 + if (ret == -EEXIST) /* skip proxy if not a new load */ 33 + actions &= ~GSC_ACTION_FW_LOAD; 34 + else if (ret) 35 + goto out_put; 36 + } 37 + 38 + if (actions & (GSC_ACTION_FW_LOAD | GSC_ACTION_SW_PROXY)) { 39 + if (!intel_gsc_uc_fw_init_done(gsc)) { 40 + gt_err(gt, "Proxy request received with GSC not loaded!\n"); 41 + goto out_put; 42 + } 43 + 44 + ret = intel_gsc_proxy_request_handler(gsc); 45 + if (ret) 46 + goto out_put; 47 + 48 + /* mark the GSC FW init as done the first time we run this */ 49 + if (actions & GSC_ACTION_FW_LOAD) { 50 + /* 51 + * If there is a proxy establishment error, the GSC might still 52 + * complete the request handling cleanly, so we need to check the 53 + * status register to check if the proxy init was actually successful 54 + */ 55 + if (intel_gsc_uc_fw_proxy_init_done(gsc)) { 56 + drm_dbg(&gt->i915->drm, "GSC Proxy initialized\n"); 57 + intel_uc_fw_change_status(&gsc->fw, INTEL_UC_FIRMWARE_RUNNING); 58 + } else { 59 + drm_err(&gt->i915->drm, 60 + "GSC status reports proxy init not complete\n"); 61 + } 62 + } 63 + } 64 + 65 + out_put: 66 + intel_runtime_pm_put(gt->uncore->rpm, wakeref); 22 67 } 23 68 24 69 static bool gsc_engine_supported(struct intel_gt *gt) ··· 88 43 89 44 void 
intel_gsc_uc_init_early(struct intel_gsc_uc *gsc) 90 45 { 46 + struct intel_gt *gt = gsc_uc_to_gt(gsc); 47 + 91 48 intel_uc_fw_init_early(&gsc->fw, INTEL_UC_FW_TYPE_GSC); 92 49 INIT_WORK(&gsc->work, gsc_work); 93 50 ··· 97 50 * GT with it being not fully setup hence check device info's 98 51 * engine mask 99 52 */ 100 - if (!gsc_engine_supported(gsc_uc_to_gt(gsc))) { 53 + if (!gsc_engine_supported(gt)) { 101 54 intel_uc_fw_change_status(&gsc->fw, INTEL_UC_FIRMWARE_NOT_SUPPORTED); 102 55 return; 56 + } 57 + 58 + gsc->wq = alloc_ordered_workqueue("i915_gsc", 0); 59 + if (!gsc->wq) { 60 + gt_err(gt, "failed to allocate WQ for GSC, disabling FW\n"); 61 + intel_uc_fw_change_status(&gsc->fw, INTEL_UC_FIRMWARE_NOT_SUPPORTED); 103 62 } 104 63 } 105 64 ··· 141 88 142 89 gsc->ce = ce; 143 90 91 + /* if we fail to init proxy we still want to load GSC for PM */ 92 + intel_gsc_proxy_init(gsc); 93 + 144 94 intel_uc_fw_change_status(&gsc->fw, INTEL_UC_FIRMWARE_LOADABLE); 145 95 146 96 return 0; ··· 163 107 return; 164 108 165 109 flush_work(&gsc->work); 110 + if (gsc->wq) { 111 + destroy_workqueue(gsc->wq); 112 + gsc->wq = NULL; 113 + } 114 + 115 + intel_gsc_proxy_fini(gsc); 166 116 167 117 if (gsc->ce) 168 118 intel_engine_destroy_pinned_context(fetch_and_zero(&gsc->ce)); ··· 207 145 208 146 void intel_gsc_uc_load_start(struct intel_gsc_uc *gsc) 209 147 { 148 + struct intel_gt *gt = gsc_uc_to_gt(gsc); 149 + 210 150 if (!intel_uc_fw_is_loadable(&gsc->fw)) 211 151 return; 212 152 213 153 if (intel_gsc_uc_fw_init_done(gsc)) 214 154 return; 215 155 216 - queue_work(system_unbound_wq, &gsc->work); 156 + spin_lock_irq(gt->irq_lock); 157 + gsc->gsc_work_actions |= GSC_ACTION_FW_LOAD; 158 + spin_unlock_irq(gt->irq_lock); 159 + 160 + queue_work(gsc->wq, &gsc->work); 217 161 }
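`gsc_work()` above snapshots the pending action bits under `gt->irq_lock`, clears them, and only then processes the snapshot, so the IRQ side can post new actions while the worker runs. A plain-C sketch of that take-and-clear pattern, using a C11 atomic in place of the spinlock (names are illustrative):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define ACTION_FW_LOAD	(1u << 0)
#define ACTION_SW_PROXY	(1u << 1)

/* Pending action bits; the driver guards these with gt->irq_lock, an
 * atomic gives equivalent take-and-clear semantics in this sketch. */
static _Atomic uint32_t pending_actions;

/* IRQ side: post a bit (the driver then queues the work item). */
static void post_action(uint32_t bit)
{
	atomic_fetch_or(&pending_actions, bit);
}

/* Worker side: grab the whole snapshot and clear it in one step, so
 * actions posted while we process land in the next worker run. */
static uint32_t take_actions(void)
{
	return atomic_exchange(&pending_actions, 0);
}
```

The snapshot also lets the worker mask bits locally — as the hunk does when a repeated firmware load returns -EEXIST and `GSC_ACTION_FW_LOAD` is dropped — without touching the shared word again.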
+16 -1
drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h
··· 10 10 11 11 struct i915_vma; 12 12 struct intel_context; 13 + struct i915_gsc_proxy_component; 13 14 14 15 struct intel_gsc_uc { 15 16 /* Generic uC firmware management */ ··· 20 19 struct i915_vma *local; /* private memory for GSC usage */ 21 20 struct intel_context *ce; /* for submission to GSC FW via GSC engine */ 22 21 23 - struct work_struct work; /* for delayed load */ 22 + /* for delayed load and proxy handling */ 23 + struct workqueue_struct *wq; 24 + struct work_struct work; 25 + u32 gsc_work_actions; /* protected by gt->irq_lock */ 26 + #define GSC_ACTION_FW_LOAD BIT(0) 27 + #define GSC_ACTION_SW_PROXY BIT(1) 28 + 29 + struct { 30 + struct i915_gsc_proxy_component *component; 31 + bool component_added; 32 + struct i915_vma *vma; 33 + void *to_gsc; 34 + void *to_csme; 35 + struct mutex mutex; /* protects the tee channel binding */ 36 + } proxy; 24 37 }; 25 38 26 39 void intel_gsc_uc_init_early(struct intel_gsc_uc *gsc);
+102
drivers/gpu/drm/i915/gt/uc/intel_gsc_uc_heci_cmd_submit.c
··· 3 3 * Copyright © 2023 Intel Corporation 4 4 */ 5 5 6 + #include "gt/intel_context.h" 6 7 #include "gt/intel_engine_pm.h" 7 8 #include "gt/intel_gpu_commands.h" 8 9 #include "gt/intel_gt.h" ··· 107 106 header->host_session_handle = host_session_id; 108 107 header->header_version = MTL_GSC_HEADER_VERSION; 109 108 header->message_size = message_size; 109 + } 110 + 111 + static void 112 + emit_gsc_heci_pkt_nonpriv(u32 *cmd, struct intel_gsc_heci_non_priv_pkt *pkt) 113 + { 114 + *cmd++ = GSC_HECI_CMD_PKT; 115 + *cmd++ = lower_32_bits(pkt->addr_in); 116 + *cmd++ = upper_32_bits(pkt->addr_in); 117 + *cmd++ = pkt->size_in; 118 + *cmd++ = lower_32_bits(pkt->addr_out); 119 + *cmd++ = upper_32_bits(pkt->addr_out); 120 + *cmd++ = pkt->size_out; 121 + *cmd++ = 0; 122 + *cmd++ = MI_BATCH_BUFFER_END; 123 + } 124 + 125 + int 126 + intel_gsc_uc_heci_cmd_submit_nonpriv(struct intel_gsc_uc *gsc, 127 + struct intel_context *ce, 128 + struct intel_gsc_heci_non_priv_pkt *pkt, 129 + u32 *cmd, int timeout_ms) 130 + { 131 + struct intel_engine_cs *engine; 132 + struct i915_gem_ww_ctx ww; 133 + struct i915_request *rq; 134 + int err, trials = 0; 135 + 136 + i915_gem_ww_ctx_init(&ww, false); 137 + retry: 138 + err = i915_gem_object_lock(pkt->bb_vma->obj, &ww); 139 + if (err) 140 + goto out_ww; 141 + err = i915_gem_object_lock(pkt->heci_pkt_vma->obj, &ww); 142 + if (err) 143 + goto out_ww; 144 + err = intel_context_pin_ww(ce, &ww); 145 + if (err) 146 + goto out_ww; 147 + 148 + rq = i915_request_create(ce); 149 + if (IS_ERR(rq)) { 150 + err = PTR_ERR(rq); 151 + goto out_unpin_ce; 152 + } 153 + 154 + emit_gsc_heci_pkt_nonpriv(cmd, pkt); 155 + 156 + err = i915_vma_move_to_active(pkt->bb_vma, rq, 0); 157 + if (err) 158 + goto out_rq; 159 + err = i915_vma_move_to_active(pkt->heci_pkt_vma, rq, EXEC_OBJECT_WRITE); 160 + if (err) 161 + goto out_rq; 162 + 163 + engine = rq->context->engine; 164 + if (engine->emit_init_breadcrumb) { 165 + err = engine->emit_init_breadcrumb(rq); 166 + if (err) 167 
+ goto out_rq; 168 + } 169 + 170 + err = engine->emit_bb_start(rq, i915_vma_offset(pkt->bb_vma), PAGE_SIZE, 0); 171 + if (err) 172 + goto out_rq; 173 + 174 + err = ce->engine->emit_flush(rq, 0); 175 + if (err) 176 + drm_err(&gsc_uc_to_gt(gsc)->i915->drm, 177 + "Failed emit-flush for gsc-heci-non-priv-pkterr=%d\n", err); 178 + 179 + out_rq: 180 + i915_request_get(rq); 181 + 182 + if (unlikely(err)) 183 + i915_request_set_error_once(rq, err); 184 + 185 + i915_request_add(rq); 186 + 187 + if (!err) { 188 + if (i915_request_wait(rq, I915_WAIT_INTERRUPTIBLE, 189 + msecs_to_jiffies(timeout_ms)) < 0) 190 + err = -ETIME; 191 + } 192 + 193 + i915_request_put(rq); 194 + 195 + out_unpin_ce: 196 + intel_context_unpin(ce); 197 + out_ww: 198 + if (err == -EDEADLK) { 199 + err = i915_gem_ww_ctx_backoff(&ww); 200 + if (!err) { 201 + if (++trials < 10) 202 + goto retry; 203 + else 204 + err = EAGAIN; 205 + } 206 + } 207 + i915_gem_ww_ctx_fini(&ww); 208 + 209 + return err; 110 210 }
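`intel_gsc_uc_heci_cmd_submit_nonpriv()` uses the i915 ww-mutex idiom: when any lock acquisition returns -EDEADLK, it backs off, retries the whole sequence from the top, and gives up after a bounded number of trials. A self-contained sketch of just that control flow — `try_lock_all()` is a made-up stand-in for the chain of `i915_gem_object_lock()`/`intel_context_pin_ww()` calls:

```c
#include <assert.h>

#define ERR_DEADLK (-35)	/* stand-in for -EDEADLK */
#define ERR_AGAIN  (-11)	/* stand-in for -EAGAIN */

/* Illustrative operation: reports a deadlock for the first
 * 'contention_left' attempts, then succeeds. */
static int try_lock_all(int *contention_left)
{
	if (*contention_left > 0) {
		(*contention_left)--;
		return ERR_DEADLK;
	}
	return 0;
}

/* Mirrors the submit path's retry shape: back off on deadlock and
 * restart, bounded at 10 trials like the driver's ++trials < 10. */
static int submit_with_backoff(int contention)
{
	int trials = 0;
	int err;

retry:
	err = try_lock_all(&contention);
	if (err == ERR_DEADLK) {
		/* i915_gem_ww_ctx_backoff() would run here */
		if (++trials < 10)
			goto retry;
		err = ERR_AGAIN;	/* give up, let the caller retry */
	}
	return err;
}
```

The bound matters: a ww acquire context guarantees forward progress eventually, but the driver still caps the loop so a persistently failing backoff cannot spin forever.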
+26 -1
drivers/gpu/drm/i915/gt/uc/intel_gsc_uc_heci_cmd_submit.h
··· 8 8 9 9 #include <linux/types.h> 10 10 11 + struct i915_vma; 12 + struct intel_context; 11 13 struct intel_gsc_uc; 14 + 12 15 struct intel_gsc_mtl_header { 13 16 u32 validity_marker; 14 17 #define GSC_HECI_VALIDITY_MARKER 0xA578875A 15 18 16 19 u8 heci_client_id; 20 + #define HECI_MEADDRESS_PROXY 10 17 21 #define HECI_MEADDRESS_PXP 17 18 22 #define HECI_MEADDRESS_HDCP 18 19 23 ··· 51 47 * we distinguish the flags using OUTFLAG or INFLAG 52 48 */ 53 49 u32 flags; 54 - #define GSC_OUTFLAG_MSG_PENDING 1 50 + #define GSC_OUTFLAG_MSG_PENDING BIT(0) 51 + #define GSC_INFLAG_MSG_CLEANUP BIT(1) 55 52 56 53 u32 status; 57 54 } __packed; ··· 63 58 void intel_gsc_uc_heci_cmd_emit_mtl_header(struct intel_gsc_mtl_header *header, 64 59 u8 heci_client_id, u32 message_size, 65 60 u64 host_session_id); 61 + 62 + struct intel_gsc_heci_non_priv_pkt { 63 + u64 addr_in; 64 + u32 size_in; 65 + u64 addr_out; 66 + u32 size_out; 67 + struct i915_vma *heci_pkt_vma; 68 + struct i915_vma *bb_vma; 69 + }; 70 + 71 + void 72 + intel_gsc_uc_heci_cmd_emit_mtl_header(struct intel_gsc_mtl_header *header, 73 + u8 heci_client_id, u32 msg_size, 74 + u64 host_session_id); 75 + 76 + int 77 + intel_gsc_uc_heci_cmd_submit_nonpriv(struct intel_gsc_uc *gsc, 78 + struct intel_context *ce, 79 + struct intel_gsc_heci_non_priv_pkt *pkt, 80 + u32 *cs, int timeout_ms); 66 81 #endif
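`emit_gsc_heci_pkt_nonpriv()` in the .c hunk writes a fixed nine-dword batch: a command dword, the 64-bit input and output addresses split into low/high halves, the two sizes, a pad, and a batch-buffer-end marker. A host-side sketch of the same layout — the opcode values here are placeholders, not the hardware encoding of `GSC_HECI_CMD_PKT` or `MI_BATCH_BUFFER_END`:

```c
#include <assert.h>
#include <stdint.h>

#define CMD_PKT    0x8000u	/* placeholder opcode */
#define CMD_BB_END 0x0500u	/* placeholder end marker */

struct nonpriv_pkt {
	uint64_t addr_in;
	uint32_t size_in;
	uint64_t addr_out;
	uint32_t size_out;
};

/* Fills exactly 9 dwords, in the same order as the driver's emitter. */
static void emit_pkt(uint32_t *cmd, const struct nonpriv_pkt *pkt)
{
	*cmd++ = CMD_PKT;
	*cmd++ = (uint32_t)pkt->addr_in;		/* lower_32_bits */
	*cmd++ = (uint32_t)(pkt->addr_in >> 32);	/* upper_32_bits */
	*cmd++ = pkt->size_in;
	*cmd++ = (uint32_t)pkt->addr_out;
	*cmd++ = (uint32_t)(pkt->addr_out >> 32);
	*cmd++ = pkt->size_out;
	*cmd++ = 0;					/* pad */
	*cmd++ = CMD_BB_END;
}
```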
+7
drivers/gpu/drm/i915/gt/uc/intel_guc.c
··· 743 743 if (IS_ERR(obj)) 744 744 return ERR_CAST(obj); 745 745 746 + /* 747 + * Wa_22016122933: For MTL the shared memory needs to be mapped 748 + * as WC on CPU side and UC (PAT index 2) on GPU side 749 + */ 750 + if (IS_METEORLAKE(gt->i915)) 751 + i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE); 752 + 746 753 vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL); 747 754 if (IS_ERR(vma)) 748 755 goto err;
+1
drivers/gpu/drm/i915/gt/uc/intel_guc.h
··· 42 42 /** @capture: the error-state-capture module's data and objects */ 43 43 struct intel_guc_state_capture *capture; 44 44 45 + /** @dbgfs_node: debugfs node */ 45 46 struct dentry *dbgfs_node; 46 47 47 48 /** @sched_engine: Global engine used to submit requests to GuC */
+35 -1
drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
··· 643 643 GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size); 644 644 } 645 645 646 + static u32 guc_get_capture_engine_mask(struct iosys_map *info_map, u32 capture_class) 647 + { 648 + u32 mask; 649 + 650 + switch (capture_class) { 651 + case GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE: 652 + mask = info_map_read(info_map, engine_enabled_masks[GUC_RENDER_CLASS]); 653 + mask |= info_map_read(info_map, engine_enabled_masks[GUC_COMPUTE_CLASS]); 654 + break; 655 + 656 + case GUC_CAPTURE_LIST_CLASS_VIDEO: 657 + mask = info_map_read(info_map, engine_enabled_masks[GUC_VIDEO_CLASS]); 658 + break; 659 + 660 + case GUC_CAPTURE_LIST_CLASS_VIDEOENHANCE: 661 + mask = info_map_read(info_map, engine_enabled_masks[GUC_VIDEOENHANCE_CLASS]); 662 + break; 663 + 664 + case GUC_CAPTURE_LIST_CLASS_BLITTER: 665 + mask = info_map_read(info_map, engine_enabled_masks[GUC_BLITTER_CLASS]); 666 + break; 667 + 668 + case GUC_CAPTURE_LIST_CLASS_GSC_OTHER: 669 + mask = info_map_read(info_map, engine_enabled_masks[GUC_GSC_OTHER_CLASS]); 670 + break; 671 + 672 + default: 673 + mask = 0; 674 + } 675 + 676 + return mask; 677 + } 678 + 646 679 static int 647 680 guc_capture_prep_lists(struct intel_guc *guc) 648 681 { ··· 711 678 712 679 for (i = 0; i < GUC_CAPTURE_LIST_INDEX_MAX; i++) { 713 680 for (j = 0; j < GUC_MAX_ENGINE_CLASSES; j++) { 681 + u32 engine_mask = guc_get_capture_engine_mask(&info_map, j); 714 682 715 683 /* null list if we dont have said engine or list */ 716 - if (!info_map_read(&info_map, engine_enabled_masks[j])) { 684 + if (!engine_mask) { 717 685 if (ads_is_mapped) { 718 686 ads_blob_write(guc, ads.capture_class[i][j], null_ggtt); 719 687 ads_blob_write(guc, ads.capture_instance[i][j], null_ggtt);
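The new `guc_get_capture_engine_mask()` maps each GuC capture class onto the enabled-engine masks, with render and compute collapsing into one capture class — the key change that lets the per-class lists above be shared. A standalone sketch of that mapping (the struct and mask values are made up for the test):

```c
#include <assert.h>
#include <stdint.h>

enum capture_class {
	CAP_RENDER_COMPUTE,
	CAP_VIDEO,
	CAP_VIDEOENHANCE,
	CAP_BLITTER,
	CAP_GSC_OTHER,
};

/* Per-class enabled-engine masks as the ADS info map would report. */
struct engine_info {
	uint32_t render, compute, video, vebox, blitter, gsc;
};

/* Mirrors the hunk's switch: render and compute share one capture
 * class, every other class maps 1:1, unknown classes get no engines. */
static uint32_t capture_engine_mask(const struct engine_info *info, int cls)
{
	switch (cls) {
	case CAP_RENDER_COMPUTE:
		return info->render | info->compute;
	case CAP_VIDEO:
		return info->video;
	case CAP_VIDEOENHANCE:
		return info->vebox;
	case CAP_BLITTER:
		return info->blitter;
	case CAP_GSC_OTHER:
		return info->gsc;
	default:
		return 0;
	}
}
```

Returning an empty mask for unknown classes is what lets the caller fall through to the null-list path instead of publishing a capture list for an engine class the platform does not have.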
+118 -148
drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
··· 30 30 #define COMMON_BASE_GLOBAL \ 31 31 { FORCEWAKE_MT, 0, 0, "FORCEWAKE" } 32 32 33 - #define COMMON_GEN9BASE_GLOBAL \ 33 + #define COMMON_GEN8BASE_GLOBAL \ 34 34 { ERROR_GEN6, 0, 0, "ERROR_GEN6" }, \ 35 35 { DONE_REG, 0, 0, "DONE_REG" }, \ 36 36 { HSW_GTT_CACHE_EN, 0, 0, "HSW_GTT_CACHE_EN" } 37 37 38 - #define GEN9_GLOBAL \ 38 + #define GEN8_GLOBAL \ 39 39 { GEN8_FAULT_TLB_DATA0, 0, 0, "GEN8_FAULT_TLB_DATA0" }, \ 40 40 { GEN8_FAULT_TLB_DATA1, 0, 0, "GEN8_FAULT_TLB_DATA1" } 41 41 ··· 96 96 { GEN12_SFC_DONE(2), 0, 0, "SFC_DONE[2]" }, \ 97 97 { GEN12_SFC_DONE(3), 0, 0, "SFC_DONE[3]" } 98 98 99 - /* XE_LPD - Global */ 100 - static const struct __guc_mmio_reg_descr xe_lpd_global_regs[] = { 99 + /* XE_LP Global */ 100 + static const struct __guc_mmio_reg_descr xe_lp_global_regs[] = { 101 101 COMMON_BASE_GLOBAL, 102 - COMMON_GEN9BASE_GLOBAL, 102 + COMMON_GEN8BASE_GLOBAL, 103 103 COMMON_GEN12BASE_GLOBAL, 104 104 }; 105 105 106 - /* XE_LPD - Render / Compute Per-Class */ 107 - static const struct __guc_mmio_reg_descr xe_lpd_rc_class_regs[] = { 106 + /* XE_LP Render / Compute Per-Class */ 107 + static const struct __guc_mmio_reg_descr xe_lp_rc_class_regs[] = { 108 108 COMMON_BASE_HAS_EU, 109 109 COMMON_BASE_RENDER, 110 110 COMMON_GEN12BASE_RENDER, 111 111 }; 112 112 113 - /* GEN9/XE_LPD - Render / Compute Per-Engine-Instance */ 114 - static const struct __guc_mmio_reg_descr xe_lpd_rc_inst_regs[] = { 113 + /* GEN8+ Render / Compute Per-Engine-Instance */ 114 + static const struct __guc_mmio_reg_descr gen8_rc_inst_regs[] = { 115 115 COMMON_BASE_ENGINE_INSTANCE, 116 116 }; 117 117 118 - /* GEN9/XE_LPD - Media Decode/Encode Per-Engine-Instance */ 119 - static const struct __guc_mmio_reg_descr xe_lpd_vd_inst_regs[] = { 118 + /* GEN8+ Media Decode/Encode Per-Engine-Instance */ 119 + static const struct __guc_mmio_reg_descr gen8_vd_inst_regs[] = { 120 120 COMMON_BASE_ENGINE_INSTANCE, 121 121 }; 122 122 123 - /* XE_LPD - Video Enhancement Per-Class */ 124 - static const 
struct __guc_mmio_reg_descr xe_lpd_vec_class_regs[] = { 123 + /* XE_LP Video Enhancement Per-Class */ 124 + static const struct __guc_mmio_reg_descr xe_lp_vec_class_regs[] = { 125 125 COMMON_GEN12BASE_VEC, 126 126 }; 127 127 128 - /* GEN9/XE_LPD - Video Enhancement Per-Engine-Instance */ 129 - static const struct __guc_mmio_reg_descr xe_lpd_vec_inst_regs[] = { 128 + /* GEN8+ Video Enhancement Per-Engine-Instance */ 129 + static const struct __guc_mmio_reg_descr gen8_vec_inst_regs[] = { 130 130 COMMON_BASE_ENGINE_INSTANCE, 131 131 }; 132 132 133 - /* GEN9/XE_LPD - Blitter Per-Engine-Instance */ 134 - static const struct __guc_mmio_reg_descr xe_lpd_blt_inst_regs[] = { 133 + /* GEN8+ Blitter Per-Engine-Instance */ 134 + static const struct __guc_mmio_reg_descr gen8_blt_inst_regs[] = { 135 135 COMMON_BASE_ENGINE_INSTANCE, 136 136 }; 137 137 138 - /* XE_LPD - GSC Per-Engine-Instance */ 139 - static const struct __guc_mmio_reg_descr xe_lpd_gsc_inst_regs[] = { 138 + /* XE_LP - GSC Per-Engine-Instance */ 139 + static const struct __guc_mmio_reg_descr xe_lp_gsc_inst_regs[] = { 140 140 COMMON_BASE_ENGINE_INSTANCE, 141 141 }; 142 142 143 - /* GEN9 - Global */ 144 - static const struct __guc_mmio_reg_descr default_global_regs[] = { 143 + /* GEN8 - Global */ 144 + static const struct __guc_mmio_reg_descr gen8_global_regs[] = { 145 145 COMMON_BASE_GLOBAL, 146 - COMMON_GEN9BASE_GLOBAL, 147 - GEN9_GLOBAL, 146 + COMMON_GEN8BASE_GLOBAL, 147 + GEN8_GLOBAL, 148 148 }; 149 149 150 - static const struct __guc_mmio_reg_descr default_rc_class_regs[] = { 150 + static const struct __guc_mmio_reg_descr gen8_rc_class_regs[] = { 151 151 COMMON_BASE_HAS_EU, 152 152 COMMON_BASE_RENDER, 153 153 }; 154 154 155 155 /* 156 - * Empty lists: 157 - * GEN9/XE_LPD - Blitter Per-Class 158 - * GEN9/XE_LPD - Media Decode/Encode Per-Class 159 - * GEN9 - VEC Class 156 + * Empty list to prevent warnings about unknown class/instance types 157 + * as not all class/instanace types have entries on all platforms. 
160 158 */ 161 159 static const struct __guc_mmio_reg_descr empty_regs_list[] = { 162 160 }; ··· 172 174 } 173 175 174 176 /* List of lists */ 175 - static const struct __guc_mmio_reg_descr_group default_lists[] = { 176 - MAKE_REGLIST(default_global_regs, PF, GLOBAL, 0), 177 - MAKE_REGLIST(default_rc_class_regs, PF, ENGINE_CLASS, GUC_RENDER_CLASS), 178 - MAKE_REGLIST(xe_lpd_rc_inst_regs, PF, ENGINE_INSTANCE, GUC_RENDER_CLASS), 179 - MAKE_REGLIST(default_rc_class_regs, PF, ENGINE_CLASS, GUC_COMPUTE_CLASS), 180 - MAKE_REGLIST(xe_lpd_rc_inst_regs, PF, ENGINE_INSTANCE, GUC_COMPUTE_CLASS), 181 - MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_VIDEO_CLASS), 182 - MAKE_REGLIST(xe_lpd_vd_inst_regs, PF, ENGINE_INSTANCE, GUC_VIDEO_CLASS), 183 - MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_VIDEOENHANCE_CLASS), 184 - MAKE_REGLIST(xe_lpd_vec_inst_regs, PF, ENGINE_INSTANCE, GUC_VIDEOENHANCE_CLASS), 185 - MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_BLITTER_CLASS), 186 - MAKE_REGLIST(xe_lpd_blt_inst_regs, PF, ENGINE_INSTANCE, GUC_BLITTER_CLASS), 187 - MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_GSC_OTHER_CLASS), 188 - MAKE_REGLIST(xe_lpd_gsc_inst_regs, PF, ENGINE_INSTANCE, GUC_GSC_OTHER_CLASS), 177 + static const struct __guc_mmio_reg_descr_group gen8_lists[] = { 178 + MAKE_REGLIST(gen8_global_regs, PF, GLOBAL, 0), 179 + MAKE_REGLIST(gen8_rc_class_regs, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE), 180 + MAKE_REGLIST(gen8_rc_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE), 181 + MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_VIDEO), 182 + MAKE_REGLIST(gen8_vd_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_VIDEO), 183 + MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_VIDEOENHANCE), 184 + MAKE_REGLIST(gen8_vec_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_VIDEOENHANCE), 185 + MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_BLITTER), 186 
+ MAKE_REGLIST(gen8_blt_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_BLITTER), 187 + MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_GSC_OTHER), 188 + MAKE_REGLIST(empty_regs_list, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_GSC_OTHER), 189 189 {} 190 190 }; 191 191 192 - static const struct __guc_mmio_reg_descr_group xe_lpd_lists[] = { 193 - MAKE_REGLIST(xe_lpd_global_regs, PF, GLOBAL, 0), 194 - MAKE_REGLIST(xe_lpd_rc_class_regs, PF, ENGINE_CLASS, GUC_RENDER_CLASS), 195 - MAKE_REGLIST(xe_lpd_rc_inst_regs, PF, ENGINE_INSTANCE, GUC_RENDER_CLASS), 196 - MAKE_REGLIST(xe_lpd_rc_class_regs, PF, ENGINE_CLASS, GUC_COMPUTE_CLASS), 197 - MAKE_REGLIST(xe_lpd_rc_inst_regs, PF, ENGINE_INSTANCE, GUC_COMPUTE_CLASS), 198 - MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_VIDEO_CLASS), 199 - MAKE_REGLIST(xe_lpd_vd_inst_regs, PF, ENGINE_INSTANCE, GUC_VIDEO_CLASS), 200 - MAKE_REGLIST(xe_lpd_vec_class_regs, PF, ENGINE_CLASS, GUC_VIDEOENHANCE_CLASS), 201 - MAKE_REGLIST(xe_lpd_vec_inst_regs, PF, ENGINE_INSTANCE, GUC_VIDEOENHANCE_CLASS), 202 - MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_BLITTER_CLASS), 203 - MAKE_REGLIST(xe_lpd_blt_inst_regs, PF, ENGINE_INSTANCE, GUC_BLITTER_CLASS), 204 - MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_GSC_OTHER_CLASS), 205 - MAKE_REGLIST(xe_lpd_gsc_inst_regs, PF, ENGINE_INSTANCE, GUC_GSC_OTHER_CLASS), 192 + static const struct __guc_mmio_reg_descr_group xe_lp_lists[] = { 193 + MAKE_REGLIST(xe_lp_global_regs, PF, GLOBAL, 0), 194 + MAKE_REGLIST(xe_lp_rc_class_regs, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE), 195 + MAKE_REGLIST(gen8_rc_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE), 196 + MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_VIDEO), 197 + MAKE_REGLIST(gen8_vd_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_VIDEO), 198 + MAKE_REGLIST(xe_lp_vec_class_regs, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_VIDEOENHANCE), 199 + 
MAKE_REGLIST(gen8_vec_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_VIDEOENHANCE), 200 + MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_BLITTER), 201 + MAKE_REGLIST(gen8_blt_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_BLITTER), 202 + MAKE_REGLIST(empty_regs_list, PF, ENGINE_CLASS, GUC_CAPTURE_LIST_CLASS_GSC_OTHER), 203 + MAKE_REGLIST(xe_lp_gsc_inst_regs, PF, ENGINE_INSTANCE, GUC_CAPTURE_LIST_CLASS_GSC_OTHER), 206 204 {} 207 205 }; 208 206 ··· 254 260 i915_mcr_reg_t reg; 255 261 }; 256 262 257 - static const struct __ext_steer_reg xe_extregs[] = { 263 + static const struct __ext_steer_reg gen8_extregs[] = { 258 264 {"GEN8_SAMPLER_INSTDONE", GEN8_SAMPLER_INSTDONE}, 259 265 {"GEN8_ROW_INSTDONE", GEN8_ROW_INSTDONE} 266 + }; 267 + 268 + static const struct __ext_steer_reg xehpg_extregs[] = { 269 + {"XEHPG_INSTDONE_GEOM_SVG", XEHPG_INSTDONE_GEOM_SVG} 260 270 }; 261 271 262 272 static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext, ··· 293 295 } 294 296 295 297 static void 296 - guc_capture_alloc_steered_lists_xe_lpd(struct intel_guc *guc, 297 - const struct __guc_mmio_reg_descr_group *lists) 298 + guc_capture_alloc_steered_lists(struct intel_guc *guc, 299 + const struct __guc_mmio_reg_descr_group *lists) 298 300 { 299 301 struct intel_gt *gt = guc_to_gt(guc); 300 302 int slice, subslice, iter, i, num_steer_regs, num_tot_regs = 0; ··· 302 304 struct __guc_mmio_reg_descr_group *extlists; 303 305 struct __guc_mmio_reg_descr *extarray; 304 306 struct sseu_dev_info *sseu; 307 + bool has_xehpg_extregs; 305 308 306 - /* In XE_LPD we only have steered registers for the render-class */ 309 + /* steered registers currently only exist for the render-class */ 307 310 list = guc_capture_get_one_list(lists, GUC_CAPTURE_LIST_INDEX_PF, 308 - GUC_CAPTURE_LIST_TYPE_ENGINE_CLASS, GUC_RENDER_CLASS); 311 + GUC_CAPTURE_LIST_TYPE_ENGINE_CLASS, 312 + GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE); 309 313 /* skip if extlists was previously allocated */ 
310 314 if (!list || guc->capture->extlists) 311 315 return; 312 316 313 - num_steer_regs = ARRAY_SIZE(xe_extregs); 317 + has_xehpg_extregs = GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 55); 314 318 315 - sseu = &gt->info.sseu; 316 - for_each_ss_steering(iter, gt, slice, subslice) 317 - num_tot_regs += num_steer_regs; 318 - 319 - if (!num_tot_regs) 320 - return; 321 - 322 - /* allocate an extra for an end marker */ 323 - extlists = kcalloc(2, sizeof(struct __guc_mmio_reg_descr_group), GFP_KERNEL); 324 - if (!extlists) 325 - return; 326 - 327 - if (__alloc_ext_regs(&extlists[0], list, num_tot_regs)) { 328 - kfree(extlists); 329 - return; 330 - } 331 - 332 - extarray = extlists[0].extlist; 333 - for_each_ss_steering(iter, gt, slice, subslice) { 334 - for (i = 0; i < num_steer_regs; ++i) { 335 - __fill_ext_reg(extarray, &xe_extregs[i], slice, subslice); 336 - ++extarray; 337 - } 338 - } 339 - 340 - guc->capture->extlists = extlists; 341 - } 342 - 343 - static const struct __ext_steer_reg xehpg_extregs[] = { 344 - {"XEHPG_INSTDONE_GEOM_SVG", XEHPG_INSTDONE_GEOM_SVG} 345 - }; 346 - 347 - static bool __has_xehpg_extregs(u32 ipver) 348 - { 349 - return (ipver >= IP_VER(12, 55)); 350 - } 351 - 352 - static void 353 - guc_capture_alloc_steered_lists_xe_hpg(struct intel_guc *guc, 354 - const struct __guc_mmio_reg_descr_group *lists, 355 - u32 ipver) 356 - { 357 - struct intel_gt *gt = guc_to_gt(guc); 358 - struct sseu_dev_info *sseu; 359 - int slice, subslice, i, iter, num_steer_regs, num_tot_regs = 0; 360 - const struct __guc_mmio_reg_descr_group *list; 361 - struct __guc_mmio_reg_descr_group *extlists; 362 - struct __guc_mmio_reg_descr *extarray; 363 - 364 - /* In XE_LP / HPG we only have render-class steering registers during error-capture */ 365 - list = guc_capture_get_one_list(lists, GUC_CAPTURE_LIST_INDEX_PF, 366 - GUC_CAPTURE_LIST_TYPE_ENGINE_CLASS, GUC_RENDER_CLASS); 367 - /* skip if extlists was previously allocated */ 368 - if (!list || guc->capture->extlists) 369 - 
return; 370 - 371 - num_steer_regs = ARRAY_SIZE(xe_extregs); 372 - if (__has_xehpg_extregs(ipver)) 319 + num_steer_regs = ARRAY_SIZE(gen8_extregs); 320 + if (has_xehpg_extregs) 373 321 num_steer_regs += ARRAY_SIZE(xehpg_extregs); 374 322 375 323 sseu = &gt->info.sseu; ··· 337 393 338 394 extarray = extlists[0].extlist; 339 395 for_each_ss_steering(iter, gt, slice, subslice) { 340 - for (i = 0; i < ARRAY_SIZE(xe_extregs); ++i) { 341 - __fill_ext_reg(extarray, &xe_extregs[i], slice, subslice); 396 + for (i = 0; i < ARRAY_SIZE(gen8_extregs); ++i) { 397 + __fill_ext_reg(extarray, &gen8_extregs[i], slice, subslice); 342 398 ++extarray; 343 399 } 344 - if (__has_xehpg_extregs(ipver)) { 400 + 401 + if (has_xehpg_extregs) { 345 402 for (i = 0; i < ARRAY_SIZE(xehpg_extregs); ++i) { 346 403 __fill_ext_reg(extarray, &xehpg_extregs[i], slice, subslice); 347 404 ++extarray; ··· 358 413 guc_capture_get_device_reglist(struct intel_guc *guc) 359 414 { 360 415 struct drm_i915_private *i915 = guc_to_gt(guc)->i915; 416 + const struct __guc_mmio_reg_descr_group *lists; 361 417 362 - if (GRAPHICS_VER(i915) > 11) { 363 - /* 364 - * For certain engine classes, there are slice and subslice 365 - * level registers requiring steering. We allocate and populate 366 - * these at init time based on hw config add it as an extension 367 - * list at the end of the pre-populated render list. 368 - */ 369 - if (IS_DG2(i915)) 370 - guc_capture_alloc_steered_lists_xe_hpg(guc, xe_lpd_lists, IP_VER(12, 55)); 371 - else if (IS_XEHPSDV(i915)) 372 - guc_capture_alloc_steered_lists_xe_hpg(guc, xe_lpd_lists, IP_VER(12, 50)); 373 - else 374 - guc_capture_alloc_steered_lists_xe_lpd(guc, xe_lpd_lists); 418 + if (GRAPHICS_VER(i915) >= 12) 419 + lists = xe_lp_lists; 420 + else 421 + lists = gen8_lists; 375 422 376 - return xe_lpd_lists; 377 - } 423 + /* 424 + * For certain engine classes, there are slice and subslice 425 + * level registers requiring steering. 
We allocate and populate 426 + * these at init time based on hw config add it as an extension 427 + * list at the end of the pre-populated render list. 428 + */ 429 + guc_capture_alloc_steered_lists(guc, lists); 378 430 379 - /* if GuC submission is enabled on a non-POR platform, just use a common baseline */ 380 - return default_lists; 431 + return lists; 381 432 } 382 433 383 434 static const char * ··· 397 456 __stringify_engclass(u32 class) 398 457 { 399 458 switch (class) { 400 - case GUC_RENDER_CLASS: 401 - return "Render"; 402 - case GUC_VIDEO_CLASS: 459 + case GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE: 460 + return "Render/Compute"; 461 + case GUC_CAPTURE_LIST_CLASS_VIDEO: 403 462 return "Video"; 404 - case GUC_VIDEOENHANCE_CLASS: 463 + case GUC_CAPTURE_LIST_CLASS_VIDEOENHANCE: 405 464 return "VideoEnhance"; 406 - case GUC_BLITTER_CLASS: 465 + case GUC_CAPTURE_LIST_CLASS_BLITTER: 407 466 return "Blitter"; 408 - case GUC_COMPUTE_CLASS: 409 - return "Compute"; 410 - case GUC_GSC_OTHER_CLASS: 467 + case GUC_CAPTURE_LIST_CLASS_GSC_OTHER: 411 468 return "GSC-Other"; 412 469 default: 413 470 break; ··· 1535 1596 ee->guc_capture_node = NULL; 1536 1597 } 1537 1598 1599 + bool intel_guc_capture_is_matching_engine(struct intel_gt *gt, 1600 + struct intel_context *ce, 1601 + struct intel_engine_cs *engine) 1602 + { 1603 + struct __guc_capture_parsed_output *n; 1604 + struct intel_guc *guc; 1605 + 1606 + if (!gt || !ce || !engine) 1607 + return false; 1608 + 1609 + guc = &gt->uc.guc; 1610 + if (!guc->capture) 1611 + return false; 1612 + 1613 + /* 1614 + * Look for a matching GuC reported error capture node from 1615 + * the internal output link-list based on lrca, guc-id and engine 1616 + * identification. 
1617 + */ 1618 + list_for_each_entry(n, &guc->capture->outlist, link) { 1619 + if (n->eng_inst == GUC_ID_TO_ENGINE_INSTANCE(engine->guc_id) && 1620 + n->eng_class == GUC_ID_TO_ENGINE_CLASS(engine->guc_id) && 1621 + n->guc_id == ce->guc_id.id && 1622 + (n->lrca & CTX_GTT_ADDRESS_MASK) == (ce->lrc.lrca & CTX_GTT_ADDRESS_MASK)) 1623 + return true; 1624 + } 1625 + 1626 + return false; 1627 + } 1628 + 1538 1629 void intel_guc_capture_get_matching_node(struct intel_gt *gt, 1539 1630 struct intel_engine_coredump *ee, 1540 1631 struct intel_context *ce) ··· 1580 1611 return; 1581 1612 1582 1613 GEM_BUG_ON(ee->guc_capture_node); 1614 + 1583 1615 /* 1584 1616 * Look for a matching GuC reported error capture node from 1585 1617 * the internal output link-list based on lrca, guc-id and engine
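The new `intel_guc_capture_is_matching_engine()` above matches a GuC capture node against a context by guc-id, engine class/instance, and the LRCA with its low bits masked off. A minimal stand-alone sketch of that masked comparison (the mask value restated here as an assumption, mirroring the kernel's `CTX_GTT_ADDRESS_MASK` of bits 31:12; the struct is a hypothetical stand-in, not the driver's node type):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: the GGTT address lives in bits 31:12 of the LRCA; the
 * low 12 bits carry per-context flags and must be ignored when matching. */
#define CTX_GTT_ADDRESS_MASK 0xfffff000u

/* Hypothetical stand-in for one GuC error-capture node. */
struct capture_node {
	uint32_t eng_class;
	uint32_t eng_inst;
	uint32_t guc_id;
	uint32_t lrca;
};

/* Return true when a reported node identifies the given context/engine,
 * comparing LRCAs only on their address bits. */
static bool node_matches(const struct capture_node *n,
			 uint32_t eng_class, uint32_t eng_inst,
			 uint32_t guc_id, uint32_t lrca)
{
	return n->eng_class == eng_class &&
	       n->eng_inst == eng_inst &&
	       n->guc_id == guc_id &&
	       (n->lrca & CTX_GTT_ADDRESS_MASK) == (lrca & CTX_GTT_ADDRESS_MASK);
}
```

The masking is the important part: two LRCAs that differ only in flag bits still refer to the same context image.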
+3
drivers/gpu/drm/i915/gt/uc/intel_guc_capture.h
··· 11 11 struct drm_i915_error_state_buf; 12 12 struct guc_gt_system_info; 13 13 struct intel_engine_coredump; 14 + struct intel_engine_cs; 14 15 struct intel_context; 15 16 struct intel_gt; 16 17 struct intel_guc; ··· 21 20 const struct intel_engine_coredump *ee); 22 21 void intel_guc_capture_get_matching_node(struct intel_gt *gt, struct intel_engine_coredump *ee, 23 22 struct intel_context *ce); 23 + bool intel_guc_capture_is_matching_engine(struct intel_gt *gt, struct intel_context *ce, 24 + struct intel_engine_cs *engine); 24 25 void intel_guc_capture_process(struct intel_guc *guc); 25 26 int intel_guc_capture_getlist(struct intel_guc *guc, u32 owner, u32 type, u32 classid, 26 27 void **outptr);
+59
drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
··· 13 13 #include "intel_guc_ct.h" 14 14 #include "intel_guc_print.h" 15 15 16 + #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC) 17 + enum { 18 + CT_DEAD_ALIVE = 0, 19 + CT_DEAD_SETUP, 20 + CT_DEAD_WRITE, 21 + CT_DEAD_DEADLOCK, 22 + CT_DEAD_H2G_HAS_ROOM, 23 + CT_DEAD_READ, 24 + CT_DEAD_PROCESS_FAILED, 25 + }; 26 + 27 + static void ct_dead_ct_worker_func(struct work_struct *w); 28 + 29 + #define CT_DEAD(ct, reason) \ 30 + do { \ 31 + if (!(ct)->dead_ct_reported) { \ 32 + (ct)->dead_ct_reason |= 1 << CT_DEAD_##reason; \ 33 + queue_work(system_unbound_wq, &(ct)->dead_ct_worker); \ 34 + } \ 35 + } while (0) 36 + #else 37 + #define CT_DEAD(ct, reason) do { } while (0) 38 + #endif 39 + 16 40 static inline struct intel_guc *ct_to_guc(struct intel_guc_ct *ct) 17 41 { 18 42 return container_of(ct, struct intel_guc, ct); ··· 117 93 spin_lock_init(&ct->requests.lock); 118 94 INIT_LIST_HEAD(&ct->requests.pending); 119 95 INIT_LIST_HEAD(&ct->requests.incoming); 96 + #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC) 97 + INIT_WORK(&ct->dead_ct_worker, ct_dead_ct_worker_func); 98 + #endif 120 99 INIT_WORK(&ct->requests.worker, ct_incoming_request_worker_func); 121 100 tasklet_setup(&ct->receive_tasklet, ct_receive_tasklet_func); 122 101 init_waitqueue_head(&ct->wq); ··· 346 319 347 320 ct->enabled = true; 348 321 ct->stall_time = KTIME_MAX; 322 + #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC) 323 + ct->dead_ct_reported = false; 324 + ct->dead_ct_reason = CT_DEAD_ALIVE; 325 + #endif 349 326 350 327 return 0; 351 328 352 329 err_out: 353 330 CT_PROBE_ERROR(ct, "Failed to enable CTB (%pe)\n", ERR_PTR(err)); 331 + CT_DEAD(ct, SETUP); 354 332 return err; 355 333 } 356 334 ··· 466 434 corrupted: 467 435 CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u status=%#x\n", 468 436 desc->head, desc->tail, desc->status); 437 + CT_DEAD(ct, WRITE); 469 438 ctb->broken = true; 470 439 return -EPIPE; 471 440 } ··· 537 504 CT_ERROR(ct, "Head: %u\n (Dwords)", ct->ctbs.recv.desc->head); 538 505 CT_ERROR(ct, "Tail: 
%u\n (Dwords)", ct->ctbs.recv.desc->tail); 539 506 507 + CT_DEAD(ct, DEADLOCK); 540 508 ct->ctbs.send.broken = true; 541 509 } 542 510 ··· 586 552 head, ctb->size); 587 553 desc->status |= GUC_CTB_STATUS_OVERFLOW; 588 554 ctb->broken = true; 555 + CT_DEAD(ct, H2G_HAS_ROOM); 589 556 return false; 590 557 } 591 558 ··· 937 902 /* now update descriptor */ 938 903 WRITE_ONCE(desc->head, head); 939 904 905 + /* 906 + * Wa_22016122933: Making sure the head update is 907 + * visible to GuC right away 908 + */ 909 + intel_guc_write_barrier(ct_to_guc(ct)); 910 + 940 911 return available - len; 941 912 942 913 corrupted: 943 914 CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u status=%#x\n", 944 915 desc->head, desc->tail, desc->status); 945 916 ctb->broken = true; 917 + CT_DEAD(ct, READ); 946 918 return -EPIPE; 947 919 } 948 920 ··· 1099 1057 if (unlikely(err)) { 1100 1058 CT_ERROR(ct, "Failed to process CT message (%pe) %*ph\n", 1101 1059 ERR_PTR(err), 4 * request->size, request->msg); 1060 + CT_DEAD(ct, PROCESS_FAILED); 1102 1061 ct_free_msg(request); 1103 1062 } 1104 1063 ··· 1276 1233 drm_printf(p, "Tail: %u\n", 1277 1234 ct->ctbs.recv.desc->tail); 1278 1235 } 1236 + 1237 + #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC) 1238 + static void ct_dead_ct_worker_func(struct work_struct *w) 1239 + { 1240 + struct intel_guc_ct *ct = container_of(w, struct intel_guc_ct, dead_ct_worker); 1241 + struct intel_guc *guc = ct_to_guc(ct); 1242 + 1243 + if (ct->dead_ct_reported) 1244 + return; 1245 + 1246 + ct->dead_ct_reported = true; 1247 + 1248 + guc_info(guc, "CTB is dead - reason=0x%X\n", ct->dead_ct_reason); 1249 + intel_klog_error_capture(guc_to_gt(guc), (intel_engine_mask_t)~0U); 1250 + } 1251 + #endif
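The debug-only `CT_DEAD()` machinery added above encodes each failure site as one bit in a reason mask and defers reporting to a worker that fires at most once. A sketch of that one-shot accumulate-and-latch pattern (names and the `pending` flag are simplifications; the kernel queues a real `work_struct` instead):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reason codes, one bit position per failure site,
 * mirroring the CT_DEAD_* enum (bit 0 means "alive"). */
enum { DEAD_ALIVE = 0, DEAD_SETUP, DEAD_WRITE, DEAD_DEADLOCK, DEAD_READ };

struct ct_state {
	unsigned int dead_reason; /* bitmask of reasons seen so far */
	bool dead_reported;       /* report fires at most once */
	bool pending;             /* stand-in for the queued worker */
};

/* Record a failure reason; reasons accumulate until the deferred
 * report runs, after which further failures are ignored. */
static void ct_dead(struct ct_state *ct, int reason)
{
	if (ct->dead_reported)
		return;
	ct->dead_reason |= 1u << reason;
	ct->pending = true; /* queue_work() stand-in */
}

/* Deferred worker: report once, then latch. Returns the mask that
 * would be logged (and trigger the kernel-log error capture). */
static unsigned int ct_dead_worker(struct ct_state *ct)
{
	if (ct->dead_reported || !ct->pending)
		return 0;
	ct->dead_reported = true;
	return ct->dead_reason;
}
```

Multiple failures racing in before the worker runs all land in one report; anything after the latch is dropped, so a dead CT channel cannot spam the log.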
+6
drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
··· 85 85 86 86 /** @stall_time: time of first time a CTB submission is stalled */ 87 87 ktime_t stall_time; 88 + 89 + #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GUC) 90 + int dead_ct_reason; 91 + bool dead_ct_reported; 92 + struct work_struct dead_ct_worker; 93 + #endif 88 94 }; 89 95 90 96 void intel_guc_ct_init_early(struct intel_guc_ct *ct);
+10 -2
drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
··· 129 129 case INTEL_BOOTROM_STATUS_RC6CTXCONFIG_FAILED: 130 130 case INTEL_BOOTROM_STATUS_MPUMAP_INCORRECT: 131 131 case INTEL_BOOTROM_STATUS_EXCEPTION: 132 + case INTEL_BOOTROM_STATUS_PROD_KEY_CHECK_FAILURE: 132 133 *success = false; 133 134 return true; 134 135 } ··· 191 190 if (!ret || !success) 192 191 break; 193 192 194 - guc_dbg(guc, "load still in progress, count = %d, freq = %dMHz\n", 195 - count, intel_rps_read_actual_frequency(&uncore->gt->rps)); 193 + guc_dbg(guc, "load still in progress, count = %d, freq = %dMHz, status = 0x%08X [0x%02X/%02X]\n", 194 + count, intel_rps_read_actual_frequency(&uncore->gt->rps), status, 195 + REG_FIELD_GET(GS_BOOTROM_MASK, status), 196 + REG_FIELD_GET(GS_UKERNEL_MASK, status)); 196 197 } 197 198 after = ktime_get(); 198 199 delta = ktime_sub(after, before); ··· 220 217 221 218 case INTEL_BOOTROM_STATUS_RSA_FAILED: 222 219 guc_info(guc, "firmware signature verification failed\n"); 220 + ret = -ENOEXEC; 221 + break; 222 + 223 + case INTEL_BOOTROM_STATUS_PROD_KEY_CHECK_FAILURE: 224 + guc_info(guc, "firmware production part check failure\n"); 223 225 ret = -ENOEXEC; 224 226 break; 225 227 }
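The improved load-progress message above decodes the raw GuC status word with `REG_FIELD_GET()` into its bootrom and ukernel sub-fields. A sketch of that mask-based extraction (the mask values here are hypothetical, not the real `GS_BOOTROM_MASK`/`GS_UKERNEL_MASK`; `__builtin_ctz` is a GCC/Clang builtin standing in for the kernel's compile-time shift derivation):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout of a 32-bit status register:
 * bits 7:1 = "bootrom" phase, bits 15:8 = "ukernel" phase. */
#define BOOTROM_MASK 0x000000feu
#define UKERNEL_MASK 0x0000ff00u

/* Mirror of the REG_FIELD_GET() idea: mask the register, then shift
 * down by the position of the mask's lowest set bit. Mask must be
 * nonzero (ctz of 0 is undefined). */
static uint32_t field_get(uint32_t mask, uint32_t reg)
{
	return (reg & mask) >> __builtin_ctz(mask);
}
```

Logging the decoded fields alongside the raw value (`status = 0x%08X [0x%02X/%02X]`) lets a stuck load be diagnosed from a single dbg line without re-deriving the shifts by hand.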
+10 -1
drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
··· 411 411 GUC_CAPTURE_LIST_TYPE_MAX, 412 412 }; 413 413 414 + /* Class indices for capture_class and capture_instance arrays */ 415 + enum { 416 + GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE = 0, 417 + GUC_CAPTURE_LIST_CLASS_VIDEO = 1, 418 + GUC_CAPTURE_LIST_CLASS_VIDEOENHANCE = 2, 419 + GUC_CAPTURE_LIST_CLASS_BLITTER = 3, 420 + GUC_CAPTURE_LIST_CLASS_GSC_OTHER = 4, 421 + }; 422 + 414 423 /* GuC Additional Data Struct */ 415 424 struct guc_ads { 416 425 struct guc_mmio_reg_set reg_state_list[GUC_MAX_ENGINE_CLASSES][GUC_MAX_INSTANCES_PER_CLASS]; ··· 460 451 GUC_MAX_LOG_BUFFER 461 452 }; 462 453 463 - /** 454 + /* 464 455 * struct guc_log_buffer_state - GuC log buffer state 465 456 * 466 457 * Below state structure is used for coordination of retrieval of GuC firmware

+29 -11
drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
··· 277 277 278 278 slpc->max_freq_softlimit = 0; 279 279 slpc->min_freq_softlimit = 0; 280 + slpc->ignore_eff_freq = false; 280 281 slpc->min_is_rpmax = false; 281 282 282 283 slpc->boost_freq = 0; ··· 458 457 return ret; 459 458 } 460 459 460 + int intel_guc_slpc_set_ignore_eff_freq(struct intel_guc_slpc *slpc, bool val) 461 + { 462 + struct drm_i915_private *i915 = slpc_to_i915(slpc); 463 + intel_wakeref_t wakeref; 464 + int ret; 465 + 466 + mutex_lock(&slpc->lock); 467 + wakeref = intel_runtime_pm_get(&i915->runtime_pm); 468 + 469 + ret = slpc_set_param(slpc, 470 + SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY, 471 + val); 472 + if (ret) 473 + guc_probe_error(slpc_to_guc(slpc), "Failed to set efficient freq(%d): %pe\n", 474 + val, ERR_PTR(ret)); 475 + else 476 + slpc->ignore_eff_freq = val; 477 + 478 + intel_runtime_pm_put(&i915->runtime_pm, wakeref); 479 + mutex_unlock(&slpc->lock); 480 + return ret; 481 + } 482 + 461 483 /** 462 484 * intel_guc_slpc_set_min_freq() - Set min frequency limit for SLPC. 463 485 * @slpc: pointer to intel_guc_slpc. 
··· 506 482 mutex_lock(&slpc->lock); 507 483 wakeref = intel_runtime_pm_get(&i915->runtime_pm); 508 484 509 - /* Ignore efficient freq if lower min freq is requested */ 510 - ret = slpc_set_param(slpc, 511 - SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY, 512 - val < slpc->rp1_freq); 513 - if (ret) { 514 - guc_probe_error(slpc_to_guc(slpc), "Failed to toggle efficient freq: %pe\n", 515 - ERR_PTR(ret)); 516 - goto out; 517 - } 518 - 519 485 ret = slpc_set_param(slpc, 520 486 SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ, 521 487 val); ··· 513 499 if (!ret) 514 500 slpc->min_freq_softlimit = val; 515 501 516 - out: 517 502 intel_runtime_pm_put(&i915->runtime_pm, wakeref); 518 503 mutex_unlock(&slpc->lock); 519 504 ··· 765 752 /* Set cached media freq ratio mode */ 766 753 intel_guc_slpc_set_media_ratio_mode(slpc, slpc->media_ratio_mode); 767 754 755 + /* Set cached value of ignore efficient freq */ 756 + intel_guc_slpc_set_ignore_eff_freq(slpc, slpc->ignore_eff_freq); 757 + 768 758 return 0; 769 759 } 770 760 ··· 837 821 slpc_decode_min_freq(slpc)); 838 822 drm_printf(p, "\twaitboosts: %u\n", 839 823 slpc->num_boosts); 824 + drm_printf(p, "\tBoosts outstanding: %u\n", 825 + atomic_read(&slpc->num_waiters)); 840 826 } 841 827 } 842 828
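The new `intel_guc_slpc_set_ignore_eff_freq()` above follows the driver's usual SLPC pattern: send the parameter to the firmware, cache the value only on success, and replay the cache when SLPC is re-enabled. A sketch of that pattern (the `set_param_h2g` stub and the parameter id are hypothetical stand-ins for the GuC H2G call and `SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY`):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the firmware interface: returns 0 on success. */
static int fw_fails;
static int set_param_h2g(int param, int val)
{
	(void)param; (void)val;
	return fw_fails ? -1 : 0;
}

#define PARAM_IGNORE_EFF_FREQ 1 /* hypothetical parameter id */

struct slpc { bool ignore_eff_freq; };

/* Cache the new value only if the firmware accepted it, so a later
 * replay re-applies what is actually in effect, not what was asked. */
static int slpc_set_ignore_eff_freq(struct slpc *slpc, bool val)
{
	int ret = set_param_h2g(PARAM_IGNORE_EFF_FREQ, val);

	if (!ret)
		slpc->ignore_eff_freq = val;
	return ret;
}

/* On re-enable (e.g. after reset), push the cached value back. */
static int slpc_replay(struct slpc *slpc)
{
	return set_param_h2g(PARAM_IGNORE_EFF_FREQ, slpc->ignore_eff_freq);
}
```

Caching only on success is what makes the replay safe: a rejected write never poisons the value that gets re-applied at the next enable.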
+1
drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
··· 46 46 void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc); 47 47 int intel_guc_slpc_unset_gucrc_mode(struct intel_guc_slpc *slpc); 48 48 int intel_guc_slpc_override_gucrc_mode(struct intel_guc_slpc *slpc, u32 mode); 49 + int intel_guc_slpc_set_ignore_eff_freq(struct intel_guc_slpc *slpc, bool val); 49 50 50 51 #endif
+1
drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h
··· 31 31 /* frequency softlimits */ 32 32 u32 min_freq_softlimit; 33 33 u32 max_freq_softlimit; 34 + bool ignore_eff_freq; 34 35 35 36 /* cached media ratio mode */ 36 37 u32 media_ratio_mode;
+65 -10
drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
··· 1402 1402 spin_unlock_irqrestore(&guc->timestamp.lock, flags); 1403 1403 } 1404 1404 1405 + static void __guc_context_update_stats(struct intel_context *ce) 1406 + { 1407 + struct intel_guc *guc = ce_to_guc(ce); 1408 + unsigned long flags; 1409 + 1410 + spin_lock_irqsave(&guc->timestamp.lock, flags); 1411 + lrc_update_runtime(ce); 1412 + spin_unlock_irqrestore(&guc->timestamp.lock, flags); 1413 + } 1414 + 1415 + static void guc_context_update_stats(struct intel_context *ce) 1416 + { 1417 + if (!intel_context_pin_if_active(ce)) 1418 + return; 1419 + 1420 + __guc_context_update_stats(ce); 1421 + intel_context_unpin(ce); 1422 + } 1423 + 1405 1424 static void guc_timestamp_ping(struct work_struct *wrk) 1406 1425 { 1407 1426 struct intel_guc *guc = container_of(wrk, typeof(*guc), 1408 1427 timestamp.work.work); 1409 1428 struct intel_uc *uc = container_of(guc, typeof(*uc), guc); 1410 1429 struct intel_gt *gt = guc_to_gt(guc); 1430 + struct intel_context *ce; 1411 1431 intel_wakeref_t wakeref; 1432 + unsigned long index; 1412 1433 int srcu, ret; 1413 1434 1414 1435 /* ··· 1444 1423 1445 1424 with_intel_runtime_pm(&gt->i915->runtime_pm, wakeref) 1446 1425 __update_guc_busyness_stats(guc); 1426 + 1427 + /* adjust context stats for overflow */ 1428 + xa_for_each(&guc->context_lookup, index, ce) 1429 + guc_context_update_stats(ce); 1447 1430 1448 1431 intel_gt_reset_unlock(gt, srcu); 1449 1432 ··· 1654 1629 1655 1630 static void guc_engine_reset_prepare(struct intel_engine_cs *engine) 1656 1631 { 1657 - if (!IS_GRAPHICS_VER(engine->i915, 11, 12)) 1658 - return; 1659 - 1660 - intel_engine_stop_cs(engine); 1661 - 1662 1632 /* 1663 1633 * Wa_22011802037: In addition to stopping the cs, we need 1664 1634 * to wait for any pending mi force wakeups 1665 1635 */ 1666 - intel_engine_wait_for_pending_mi_fw(engine); 1636 + if (IS_MTL_GRAPHICS_STEP(engine->i915, M, STEP_A0, STEP_B0) || 1637 + (GRAPHICS_VER(engine->i915) >= 11 && 1638 + GRAPHICS_VER_FULL(engine->i915) < IP_VER(12, 
70))) { 1639 + intel_engine_stop_cs(engine); 1640 + intel_engine_wait_for_pending_mi_fw(engine); 1641 + } 1667 1642 } 1668 1643 1669 1644 static void guc_reset_nop(struct intel_engine_cs *engine) ··· 2799 2774 { 2800 2775 struct intel_guc *guc = ce_to_guc(ce); 2801 2776 2777 + __guc_context_update_stats(ce); 2802 2778 unpin_guc_id(guc, ce); 2803 2779 lrc_unpin(ce); 2804 2780 ··· 3481 3455 } 3482 3456 3483 3457 static const struct intel_context_ops guc_context_ops = { 3458 + .flags = COPS_RUNTIME_CYCLES, 3484 3459 .alloc = guc_context_alloc, 3485 3460 3486 3461 .close = guc_context_close, ··· 3499 3472 .exit = intel_context_exit_engine, 3500 3473 3501 3474 .sched_disable = guc_context_sched_disable, 3475 + 3476 + .update_stats = guc_context_update_stats, 3502 3477 3503 3478 .reset = lrc_reset, 3504 3479 .destroy = guc_context_destroy, ··· 3757 3728 } 3758 3729 3759 3730 static const struct intel_context_ops virtual_guc_context_ops = { 3731 + .flags = COPS_RUNTIME_CYCLES, 3760 3732 .alloc = guc_virtual_context_alloc, 3761 3733 3762 3734 .close = guc_context_close, ··· 3775 3745 .exit = guc_virtual_context_exit, 3776 3746 3777 3747 .sched_disable = guc_context_sched_disable, 3748 + .update_stats = guc_context_update_stats, 3778 3749 3779 3750 .destroy = guc_context_destroy, 3780 3751 ··· 4728 4697 { 4729 4698 struct intel_gt *gt = guc_to_gt(guc); 4730 4699 struct drm_i915_private *i915 = gt->i915; 4731 - struct intel_engine_cs *engine = __context_to_physical_engine(ce); 4732 4700 intel_wakeref_t wakeref; 4701 + intel_engine_mask_t engine_mask; 4733 4702 4734 - intel_engine_set_hung_context(engine, ce); 4703 + if (intel_engine_is_virtual(ce->engine)) { 4704 + struct intel_engine_cs *e; 4705 + intel_engine_mask_t tmp, virtual_mask = ce->engine->mask; 4706 + 4707 + engine_mask = 0; 4708 + for_each_engine_masked(e, ce->engine->gt, virtual_mask, tmp) { 4709 + bool match = intel_guc_capture_is_matching_engine(gt, ce, e); 4710 + 4711 + if (match) { 4712 + 
intel_engine_set_hung_context(e, ce); 4713 + engine_mask |= e->mask; 4714 + atomic_inc(&i915->gpu_error.reset_engine_count[e->uabi_class]); 4715 + } 4716 + } 4717 + 4718 + if (!engine_mask) { 4719 + guc_warn(guc, "No matching physical engine capture for virtual engine context 0x%04X / %s", 4720 + ce->guc_id.id, ce->engine->name); 4721 + engine_mask = ~0U; 4722 + } 4723 + } else { 4724 + intel_engine_set_hung_context(ce->engine, ce); 4725 + engine_mask = ce->engine->mask; 4726 + atomic_inc(&i915->gpu_error.reset_engine_count[ce->engine->uabi_class]); 4727 + } 4728 + 4735 4729 with_intel_runtime_pm(&i915->runtime_pm, wakeref) 4736 - i915_capture_error_state(gt, engine->mask, CORE_DUMP_FLAG_IS_GUC_CAPTURE); 4737 - atomic_inc(&i915->gpu_error.reset_engine_count[engine->uabi_class]); 4730 + i915_capture_error_state(gt, engine_mask, CORE_DUMP_FLAG_IS_GUC_CAPTURE); 4738 4731 } 4739 4732 4740 4733 static void guc_context_replay(struct intel_context *ce)
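The virtual-engine error capture rework above walks every physical engine behind the virtual mask, marks the ones whose capture node matches, and falls back to capturing everything if nothing matched. A sketch of that mask accumulation with a plain bit loop in place of `for_each_engine_masked()` (the match oracle is a hypothetical stand-in for `intel_guc_capture_is_matching_engine()`):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-engine match oracle: bit set in @matching_set
 * means that engine has a matching GuC capture node. */
static bool engine_matches(int engine_id, uint32_t matching_set)
{
	return (matching_set >> engine_id) & 1;
}

/* Collect the mask of physical engines with a matching capture node;
 * if none matched, fall back to capturing every engine (~0), which is
 * what the driver does after its guc_warn(). */
static uint32_t hung_engine_mask(uint32_t virtual_mask, uint32_t matching_set)
{
	uint32_t engine_mask = 0;

	for (int id = 0; id < 32; id++) {
		if (!((virtual_mask >> id) & 1))
			continue;
		if (engine_matches(id, matching_set))
			engine_mask |= 1u << id;
	}
	return engine_mask ? engine_mask : ~0u;
}
```

The fallback matters because the capture path must always produce an engine mask; a missed match degrades to a broad capture rather than a silently empty one.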
+14 -2
drivers/gpu/drm/i915/gt/uc/intel_uc.c
··· 18 18 #include "intel_uc.h" 19 19 20 20 #include "i915_drv.h" 21 + #include "i915_hwmon.h" 21 22 22 23 static const struct intel_uc_ops uc_ops_off; 23 24 static const struct intel_uc_ops uc_ops_on; ··· 432 431 433 432 static int __uc_check_hw(struct intel_uc *uc) 434 433 { 434 + if (uc->fw_table_invalid) 435 + return -EIO; 436 + 435 437 if (!intel_uc_supports_guc(uc)) 436 438 return 0; 437 439 ··· 465 461 struct intel_guc *guc = &uc->guc; 466 462 struct intel_huc *huc = &uc->huc; 467 463 int ret, attempts; 464 + bool pl1en = false; 468 465 469 466 GEM_BUG_ON(!intel_uc_supports_guc(uc)); 470 467 GEM_BUG_ON(!intel_uc_wants_guc(uc)); ··· 496 491 else 497 492 attempts = 1; 498 493 494 + /* Disable a potentially low PL1 power limit to allow freq to be raised */ 495 + i915_hwmon_power_max_disable(gt->i915, &pl1en); 496 + 499 497 intel_rps_raise_unslice(&uc_to_gt(uc)->rps); 500 498 501 499 while (attempts--) { ··· 508 500 */ 509 501 ret = __uc_sanitize(uc); 510 502 if (ret) 511 - goto err_out; 503 + goto err_rps; 512 504 513 505 intel_huc_fw_upload(huc); 514 506 intel_guc_ads_reset(guc); ··· 555 547 intel_rps_lower_unslice(&uc_to_gt(uc)->rps); 556 548 } 557 549 550 + i915_hwmon_power_max_restore(gt->i915, pl1en); 551 + 558 552 guc_info(guc, "submission %s\n", str_enabled_disabled(intel_uc_uses_guc_submission(uc))); 559 553 guc_info(guc, "SLPC %s\n", str_enabled_disabled(intel_uc_uses_guc_slpc(uc))); 560 554 ··· 569 559 intel_guc_submission_disable(guc); 570 560 err_log_capture: 571 561 __uc_capture_load_err_log(uc); 572 - err_out: 562 + err_rps: 573 563 /* Return GT back to RPn */ 574 564 intel_rps_lower_unslice(&uc_to_gt(uc)->rps); 575 565 566 + i915_hwmon_power_max_restore(gt->i915, pl1en); 567 + err_out: 576 568 __uc_sanitize(uc); 577 569 578 570 if (!ret) {
+1
drivers/gpu/drm/i915/gt/uc/intel_uc.h
··· 36 36 struct drm_i915_gem_object *load_err_log; 37 37 38 38 bool reset_in_progress; 39 + bool fw_table_invalid; 39 40 }; 40 41 41 42 void intel_uc_init_early(struct intel_uc *uc);
+157 -87
drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
··· 17 17 #include "i915_drv.h" 18 18 #include "i915_reg.h" 19 19 20 + #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) 21 + #define UNEXPECTED gt_probe_error 22 + #else 23 + #define UNEXPECTED gt_notice 24 + #endif 25 + 20 26 static inline struct intel_gt * 21 27 ____uc_fw_to_gt(struct intel_uc_fw *uc_fw, enum intel_uc_fw_type type) 22 28 { ··· 85 79 * security fixes, etc. to be enabled. 86 80 */ 87 81 #define INTEL_GUC_FIRMWARE_DEFS(fw_def, guc_maj, guc_mmp) \ 88 - fw_def(DG2, 0, guc_maj(dg2, 70, 5)) \ 89 - fw_def(ALDERLAKE_P, 0, guc_maj(adlp, 70, 5)) \ 82 + fw_def(METEORLAKE, 0, guc_maj(mtl, 70, 6, 6)) \ 83 + fw_def(DG2, 0, guc_maj(dg2, 70, 5, 1)) \ 84 + fw_def(ALDERLAKE_P, 0, guc_maj(adlp, 70, 5, 1)) \ 90 85 fw_def(ALDERLAKE_P, 0, guc_mmp(adlp, 70, 1, 1)) \ 91 86 fw_def(ALDERLAKE_P, 0, guc_mmp(adlp, 69, 0, 3)) \ 92 - fw_def(ALDERLAKE_S, 0, guc_maj(tgl, 70, 5)) \ 87 + fw_def(ALDERLAKE_S, 0, guc_maj(tgl, 70, 5, 1)) \ 93 88 fw_def(ALDERLAKE_S, 0, guc_mmp(tgl, 70, 1, 1)) \ 94 89 fw_def(ALDERLAKE_S, 0, guc_mmp(tgl, 69, 0, 3)) \ 95 - fw_def(DG1, 0, guc_maj(dg1, 70, 5)) \ 90 + fw_def(DG1, 0, guc_maj(dg1, 70, 5, 1)) \ 96 91 fw_def(ROCKETLAKE, 0, guc_mmp(tgl, 70, 1, 1)) \ 97 92 fw_def(TIGERLAKE, 0, guc_mmp(tgl, 70, 1, 1)) \ 98 93 fw_def(JASPERLAKE, 0, guc_mmp(ehl, 70, 1, 1)) \ ··· 147 140 __stringify(patch_) ".bin" 148 141 149 142 /* Minor for internal driver use, not part of file name */ 150 - #define MAKE_GUC_FW_PATH_MAJOR(prefix_, major_, minor_) \ 143 + #define MAKE_GUC_FW_PATH_MAJOR(prefix_, major_, minor_, patch_) \ 151 144 __MAKE_UC_FW_PATH_MAJOR(prefix_, "guc", major_) 152 145 153 146 #define MAKE_GUC_FW_PATH_MMP(prefix_, major_, minor_, patch_) \ ··· 203 196 { UC_FW_BLOB_BASE(major_, minor_, patch_, path_) \ 204 197 .legacy = true } 205 198 206 - #define GUC_FW_BLOB(prefix_, major_, minor_) \ 207 - UC_FW_BLOB_NEW(major_, minor_, 0, false, \ 208 - MAKE_GUC_FW_PATH_MAJOR(prefix_, major_, minor_)) 199 + #define GUC_FW_BLOB(prefix_, major_, minor_, patch_) \ 200 + 
UC_FW_BLOB_NEW(major_, minor_, patch_, false, \ 201 + MAKE_GUC_FW_PATH_MAJOR(prefix_, major_, minor_, patch_)) 209 202 210 203 #define GUC_FW_BLOB_MMP(prefix_, major_, minor_, patch_) \ 211 204 UC_FW_BLOB_OLD(major_, minor_, patch_, \ ··· 239 232 u32 count; 240 233 }; 241 234 235 + static const struct uc_fw_platform_requirement blobs_guc[] = { 236 + INTEL_GUC_FIRMWARE_DEFS(MAKE_FW_LIST, GUC_FW_BLOB, GUC_FW_BLOB_MMP) 237 + }; 238 + 239 + static const struct uc_fw_platform_requirement blobs_huc[] = { 240 + INTEL_HUC_FIRMWARE_DEFS(MAKE_FW_LIST, HUC_FW_BLOB, HUC_FW_BLOB_MMP, HUC_FW_BLOB_GSC) 241 + }; 242 + 243 + static const struct fw_blobs_by_type blobs_all[INTEL_UC_FW_NUM_TYPES] = { 244 + [INTEL_UC_FW_TYPE_GUC] = { blobs_guc, ARRAY_SIZE(blobs_guc) }, 245 + [INTEL_UC_FW_TYPE_HUC] = { blobs_huc, ARRAY_SIZE(blobs_huc) }, 246 + }; 247 + 242 248 static void 243 249 __uc_fw_auto_select(struct drm_i915_private *i915, struct intel_uc_fw *uc_fw) 244 250 { 245 - static const struct uc_fw_platform_requirement blobs_guc[] = { 246 - INTEL_GUC_FIRMWARE_DEFS(MAKE_FW_LIST, GUC_FW_BLOB, GUC_FW_BLOB_MMP) 247 - }; 248 - static const struct uc_fw_platform_requirement blobs_huc[] = { 249 - INTEL_HUC_FIRMWARE_DEFS(MAKE_FW_LIST, HUC_FW_BLOB, HUC_FW_BLOB_MMP, HUC_FW_BLOB_GSC) 250 - }; 251 - static const struct fw_blobs_by_type blobs_all[INTEL_UC_FW_NUM_TYPES] = { 252 - [INTEL_UC_FW_TYPE_GUC] = { blobs_guc, ARRAY_SIZE(blobs_guc) }, 253 - [INTEL_UC_FW_TYPE_HUC] = { blobs_huc, ARRAY_SIZE(blobs_huc) }, 254 - }; 255 - static bool verified[INTEL_UC_FW_NUM_TYPES]; 256 251 const struct uc_fw_platform_requirement *fw_blobs; 257 252 enum intel_platform p = INTEL_INFO(i915)->platform; 258 253 u32 fw_count; ··· 294 285 continue; 295 286 296 287 if (uc_fw->file_selected.path) { 288 + /* 289 + * Continuing an earlier search after a found blob failed to load. 290 + * Once the previously chosen path has been found, clear it out 291 + * and let the search continue from there. 
292 + */ 297 293 if (uc_fw->file_selected.path == blob->path) 298 294 uc_fw->file_selected.path = NULL; 299 295 ··· 309 295 uc_fw->file_wanted.path = blob->path; 310 296 uc_fw->file_wanted.ver.major = blob->major; 311 297 uc_fw->file_wanted.ver.minor = blob->minor; 298 + uc_fw->file_wanted.ver.patch = blob->patch; 312 299 uc_fw->loaded_via_gsc = blob->loaded_via_gsc; 313 300 found = true; 314 301 break; ··· 319 304 /* Failed to find a match for the last attempt?! */ 320 305 uc_fw->file_selected.path = NULL; 321 306 } 307 + } 308 + 309 + static bool validate_fw_table_type(struct drm_i915_private *i915, enum intel_uc_fw_type type) 310 + { 311 + const struct uc_fw_platform_requirement *fw_blobs; 312 + u32 fw_count; 313 + int i, j; 314 + 315 + if (type >= ARRAY_SIZE(blobs_all)) { 316 + drm_err(&i915->drm, "No blob array for %s\n", intel_uc_fw_type_repr(type)); 317 + return false; 318 + } 319 + 320 + fw_blobs = blobs_all[type].blobs; 321 + fw_count = blobs_all[type].count; 322 + 323 + if (!fw_count) 324 + return true; 322 325 323 326 /* make sure the list is ordered as expected */ 324 - if (IS_ENABLED(CONFIG_DRM_I915_SELFTEST) && !verified[uc_fw->type]) { 325 - verified[uc_fw->type] = true; 326 - 327 - for (i = 1; i < fw_count; i++) { 328 - /* Next platform is good: */ 329 - if (fw_blobs[i].p < fw_blobs[i - 1].p) 327 + for (i = 1; i < fw_count; i++) { 328 + /* Versionless file names must be unique per platform: */ 329 + for (j = i + 1; j < fw_count; j++) { 330 + /* Same platform? 
*/ 331 + if (fw_blobs[i].p != fw_blobs[j].p) 330 332 continue; 331 333 332 - /* Next platform revision is good: */ 333 - if (fw_blobs[i].p == fw_blobs[i - 1].p && 334 - fw_blobs[i].rev < fw_blobs[i - 1].rev) 334 + if (fw_blobs[i].blob.path != fw_blobs[j].blob.path) 335 335 continue; 336 336 337 - /* Platform/revision must be in order: */ 338 - if (fw_blobs[i].p != fw_blobs[i - 1].p || 339 - fw_blobs[i].rev != fw_blobs[i - 1].rev) 340 - goto bad; 337 + drm_err(&i915->drm, "Duplicate %s blobs: %s r%u %s%d.%d.%d [%s] matches %s%d.%d.%d [%s]\n", 338 + intel_uc_fw_type_repr(type), 339 + intel_platform_name(fw_blobs[j].p), fw_blobs[j].rev, 340 + fw_blobs[j].blob.legacy ? "L" : "v", 341 + fw_blobs[j].blob.major, fw_blobs[j].blob.minor, 342 + fw_blobs[j].blob.patch, fw_blobs[j].blob.path, 343 + fw_blobs[i].blob.legacy ? "L" : "v", 344 + fw_blobs[i].blob.major, fw_blobs[i].blob.minor, 345 + fw_blobs[i].blob.patch, fw_blobs[i].blob.path); 346 + } 341 347 342 - /* Next major version is good: */ 343 - if (fw_blobs[i].blob.major < fw_blobs[i - 1].blob.major) 348 + /* Next platform is good: */ 349 + if (fw_blobs[i].p < fw_blobs[i - 1].p) 350 + continue; 351 + 352 + /* Next platform revision is good: */ 353 + if (fw_blobs[i].p == fw_blobs[i - 1].p && 354 + fw_blobs[i].rev < fw_blobs[i - 1].rev) 355 + continue; 356 + 357 + /* Platform/revision must be in order: */ 358 + if (fw_blobs[i].p != fw_blobs[i - 1].p || 359 + fw_blobs[i].rev != fw_blobs[i - 1].rev) 360 + goto bad; 361 + 362 + /* Next major version is good: */ 363 + if (fw_blobs[i].blob.major < fw_blobs[i - 1].blob.major) 364 + continue; 365 + 366 + /* New must be before legacy: */ 367 + if (!fw_blobs[i].blob.legacy && fw_blobs[i - 1].blob.legacy) 368 + goto bad; 369 + 370 + /* New to legacy also means 0.0 to X.Y (HuC), or X.0 to X.Y (GuC) */ 371 + if (fw_blobs[i].blob.legacy && !fw_blobs[i - 1].blob.legacy) { 372 + if (!fw_blobs[i - 1].blob.major) 344 373 continue; 345 374 346 - /* New must be before legacy: */ 347 - if 
(!fw_blobs[i].blob.legacy && fw_blobs[i - 1].blob.legacy) 348 - goto bad; 349 - 350 - /* New to legacy also means 0.0 to X.Y (HuC), or X.0 to X.Y (GuC) */ 351 - if (fw_blobs[i].blob.legacy && !fw_blobs[i - 1].blob.legacy) { 352 - if (!fw_blobs[i - 1].blob.major) 353 - continue; 354 - 355 - if (fw_blobs[i].blob.major == fw_blobs[i - 1].blob.major) 356 - continue; 357 - } 358 - 359 - /* Major versions must be in order: */ 360 - if (fw_blobs[i].blob.major != fw_blobs[i - 1].blob.major) 361 - goto bad; 362 - 363 - /* Next minor version is good: */ 364 - if (fw_blobs[i].blob.minor < fw_blobs[i - 1].blob.minor) 375 + if (fw_blobs[i].blob.major == fw_blobs[i - 1].blob.major) 365 376 continue; 377 + } 366 378 367 - /* Minor versions must be in order: */ 368 - if (fw_blobs[i].blob.minor != fw_blobs[i - 1].blob.minor) 369 - goto bad; 379 + /* Major versions must be in order: */ 380 + if (fw_blobs[i].blob.major != fw_blobs[i - 1].blob.major) 381 + goto bad; 370 382 371 - /* Patch versions must be in order: */ 372 - if (fw_blobs[i].blob.patch <= fw_blobs[i - 1].blob.patch) 373 - continue; 383 + /* Next minor version is good: */ 384 + if (fw_blobs[i].blob.minor < fw_blobs[i - 1].blob.minor) 385 + continue; 386 + 387 + /* Minor versions must be in order: */ 388 + if (fw_blobs[i].blob.minor != fw_blobs[i - 1].blob.minor) 389 + goto bad; 390 + 391 + /* Patch versions must be in order and unique: */ 392 + if (fw_blobs[i].blob.patch < fw_blobs[i - 1].blob.patch) 393 + continue; 374 394 375 395 bad: 376 - drm_err(&i915->drm, "Invalid %s blob order: %s r%u %s%d.%d.%d comes before %s r%u %s%d.%d.%d\n", 377 - intel_uc_fw_type_repr(uc_fw->type), 378 - intel_platform_name(fw_blobs[i - 1].p), fw_blobs[i - 1].rev, 379 - fw_blobs[i - 1].blob.legacy ? "L" : "v", 380 - fw_blobs[i - 1].blob.major, 381 - fw_blobs[i - 1].blob.minor, 382 - fw_blobs[i - 1].blob.patch, 383 - intel_platform_name(fw_blobs[i].p), fw_blobs[i].rev, 384 - fw_blobs[i].blob.legacy ? 
"L" : "v", 385 - fw_blobs[i].blob.major, 386 - fw_blobs[i].blob.minor, 387 - fw_blobs[i].blob.patch); 388 - 389 - uc_fw->file_selected.path = NULL; 390 - } 396 + drm_err(&i915->drm, "Invalid %s blob order: %s r%u %s%d.%d.%d comes before %s r%u %s%d.%d.%d\n", 397 + intel_uc_fw_type_repr(type), 398 + intel_platform_name(fw_blobs[i - 1].p), fw_blobs[i - 1].rev, 399 + fw_blobs[i - 1].blob.legacy ? "L" : "v", 400 + fw_blobs[i - 1].blob.major, 401 + fw_blobs[i - 1].blob.minor, 402 + fw_blobs[i - 1].blob.patch, 403 + intel_platform_name(fw_blobs[i].p), fw_blobs[i].rev, 404 + fw_blobs[i].blob.legacy ? "L" : "v", 405 + fw_blobs[i].blob.major, 406 + fw_blobs[i].blob.minor, 407 + fw_blobs[i].blob.patch); 408 + return false; 391 409 } 410 + 411 + return true; 392 412 } 393 413 394 414 static const char *__override_guc_firmware_path(struct drm_i915_private *i915) ··· 478 428 void intel_uc_fw_init_early(struct intel_uc_fw *uc_fw, 479 429 enum intel_uc_fw_type type) 480 430 { 481 - struct drm_i915_private *i915 = ____uc_fw_to_gt(uc_fw, type)->i915; 431 + struct intel_gt *gt = ____uc_fw_to_gt(uc_fw, type); 432 + struct drm_i915_private *i915 = gt->i915; 482 433 483 434 /* 484 435 * we use FIRMWARE_UNINITIALIZED to detect checks against uc_fw->status ··· 492 441 uc_fw->type = type; 493 442 494 443 if (HAS_GT_UC(i915)) { 444 + if (!validate_fw_table_type(i915, type)) { 445 + gt->uc.fw_table_invalid = true; 446 + intel_uc_fw_change_status(uc_fw, INTEL_UC_FIRMWARE_NOT_SUPPORTED); 447 + return; 448 + } 449 + 495 450 __uc_fw_auto_select(i915, uc_fw); 496 451 __uc_fw_user_override(i915, uc_fw); 497 452 } ··· 839 782 if (uc_fw->file_wanted.ver.major && uc_fw->file_selected.ver.major) { 840 783 /* Check the file's major version was as it claimed */ 841 784 if (uc_fw->file_selected.ver.major != uc_fw->file_wanted.ver.major) { 842 - gt_notice(gt, "%s firmware %s: unexpected version: %u.%u != %u.%u\n", 843 - intel_uc_fw_type_repr(uc_fw->type), uc_fw->file_selected.path, 844 - 
uc_fw->file_selected.ver.major, uc_fw->file_selected.ver.minor, 845 - uc_fw->file_wanted.ver.major, uc_fw->file_wanted.ver.minor); 785 + UNEXPECTED(gt, "%s firmware %s: unexpected version: %u.%u != %u.%u\n", 786 + intel_uc_fw_type_repr(uc_fw->type), uc_fw->file_selected.path, 787 + uc_fw->file_selected.ver.major, uc_fw->file_selected.ver.minor, 788 + uc_fw->file_wanted.ver.major, uc_fw->file_wanted.ver.minor); 846 789 if (!intel_uc_fw_is_overridden(uc_fw)) { 847 790 err = -ENOEXEC; 848 791 goto fail; 849 792 } 850 793 } else { 851 794 if (uc_fw->file_selected.ver.minor < uc_fw->file_wanted.ver.minor) 795 + old_ver = true; 796 + else if ((uc_fw->file_selected.ver.minor == uc_fw->file_wanted.ver.minor) && 797 + (uc_fw->file_selected.ver.patch < uc_fw->file_wanted.ver.patch)) 852 798 old_ver = true; 853 799 } 854 800 } ··· 860 800 /* Preserve the version that was really wanted */ 861 801 memcpy(&uc_fw->file_wanted, &file_ideal, sizeof(uc_fw->file_wanted)); 862 802 863 - gt_notice(gt, "%s firmware %s (%d.%d) is recommended, but only %s (%d.%d) was found\n", 864 - intel_uc_fw_type_repr(uc_fw->type), 865 - uc_fw->file_wanted.path, 866 - uc_fw->file_wanted.ver.major, uc_fw->file_wanted.ver.minor, 867 - uc_fw->file_selected.path, 868 - uc_fw->file_selected.ver.major, uc_fw->file_selected.ver.minor); 803 + UNEXPECTED(gt, "%s firmware %s (%d.%d.%d) is recommended, but only %s (%d.%d.%d) was found\n", 804 + intel_uc_fw_type_repr(uc_fw->type), 805 + uc_fw->file_wanted.path, 806 + uc_fw->file_wanted.ver.major, 807 + uc_fw->file_wanted.ver.minor, 808 + uc_fw->file_wanted.ver.patch, 809 + uc_fw->file_selected.path, 810 + uc_fw->file_selected.ver.major, 811 + uc_fw->file_selected.ver.minor, 812 + uc_fw->file_selected.ver.patch); 869 813 gt_info(gt, "Consider updating your linux-firmware pkg or downloading from %s\n", 870 814 INTEL_UC_FIRMWARE_URL); 871 815 } ··· 957 893 pte_flags |= PTE_LM; 958 894 959 895 if (ggtt->vm.raw_insert_entries) 960 - 
ggtt->vm.raw_insert_entries(&ggtt->vm, dummy, I915_CACHE_NONE, pte_flags); 896 + ggtt->vm.raw_insert_entries(&ggtt->vm, dummy, 897 + i915_gem_get_pat_index(ggtt->vm.i915, 898 + I915_CACHE_NONE), 899 + pte_flags); 961 900 else 962 - ggtt->vm.insert_entries(&ggtt->vm, dummy, I915_CACHE_NONE, pte_flags); 901 + ggtt->vm.insert_entries(&ggtt->vm, dummy, 902 + i915_gem_get_pat_index(ggtt->vm.i915, 903 + I915_CACHE_NONE), 904 + pte_flags); 963 905 } 964 906 965 907 static void uc_fw_unbind_ggtt(struct intel_uc_fw *uc_fw)
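The `validate_fw_table_type()` checks above enforce that each platform's blob list is sorted newest-first: major, then minor, then patch, strictly descending with no duplicate versions. The chain of continue/goto checks amounts to a lexicographic triple comparison, sketched stand-alone here (struct and function names are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdbool.h>

struct fw_ver { int major, minor, patch; };

/* Return true when @next may legally follow @prev within one
 * platform/revision group, i.e. @next is strictly older
 * (the table is newest-first; equal triples are duplicates). */
static bool ver_ordered(struct fw_ver prev, struct fw_ver next)
{
	if (next.major != prev.major)
		return next.major < prev.major;
	if (next.minor != prev.minor)
		return next.minor < prev.minor;
	return next.patch < prev.patch;
}
```

Moving this validation out of `__uc_fw_auto_select()` and failing early (setting `fw_table_invalid`) means a mis-ordered table is now an actual error rather than just a selftest-time warning.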
+1 -1
drivers/gpu/drm/i915/gvt/aperture_gm.c
··· 330 330 /** 331 331 * intel_vgpu_alloc_resource() - allocate HW resource for a vGPU 332 332 * @vgpu: vGPU 333 - * @param: vGPU creation params 333 + * @conf: vGPU creation params 334 334 * 335 335 * This function is used to allocate HW resource for a vGPU. User specifies 336 336 * the resource configuration through the creation params.
+7 -7
drivers/gpu/drm/i915/i915_active.h
··· 49 49 50 50 /** 51 51 * __i915_active_fence_init - prepares the activity tracker for use 52 - * @active - the active tracker 53 - * @fence - initial fence to track, can be NULL 54 - * @func - a callback when then the tracker is retired (becomes idle), 52 + * @active: the active tracker 53 + * @fence: initial fence to track, can be NULL 54 + * @fn: a callback when the tracker is retired (becomes idle), 55 55 * can be NULL 56 56 * 57 57 * i915_active_fence_init() prepares the embedded @active struct for use as ··· 77 77 78 78 /** 79 79 * i915_active_fence_set - updates the tracker to watch the current fence 80 - * @active - the active tracker 81 - * @rq - the request to watch 80 + * @active: the active tracker 81 + * @rq: the request to watch 82 82 * 83 83 * i915_active_fence_set() watches the given @rq for completion. While 84 84 * that @rq is busy, the @active reports busy. When that @rq is signaled ··· 89 89 struct i915_request *rq); 90 90 /** 91 91 * i915_active_fence_get - return a reference to the active fence 92 - * @active - the active tracker 92 + * @active: the active tracker 93 93 * 94 94 * i915_active_fence_get() returns a reference to the active fence, 95 95 * or NULL if the active tracker is idle. The reference is obtained under RCU, ··· 111 111 112 112 /** 113 113 * i915_active_fence_isset - report whether the active tracker is assigned 114 - * @active - the active tracker 114 + * @active: the active tracker 115 115 * 116 116 * i915_active_fence_isset() returns true if the active tracker is currently 117 117 * assigned to a fence. Due to the lazy retiring, that fence may be idle
+42 -9
drivers/gpu/drm/i915/i915_debugfs.c
··· 138 138 return "ppgtt"; 139 139 } 140 140 141 - static const char *i915_cache_level_str(struct drm_i915_private *i915, int type) 141 + static const char *i915_cache_level_str(struct drm_i915_gem_object *obj) 142 142 { 143 - switch (type) { 144 - case I915_CACHE_NONE: return " uncached"; 145 - case I915_CACHE_LLC: return HAS_LLC(i915) ? " LLC" : " snooped"; 146 - case I915_CACHE_L3_LLC: return " L3+LLC"; 147 - case I915_CACHE_WT: return " WT"; 148 - default: return ""; 143 + struct drm_i915_private *i915 = obj_to_i915(obj); 144 + 145 + if (IS_METEORLAKE(i915)) { 146 + switch (obj->pat_index) { 147 + case 0: return " WB"; 148 + case 1: return " WT"; 149 + case 2: return " UC"; 150 + case 3: return " WB (1-Way Coh)"; 151 + case 4: return " WB (2-Way Coh)"; 152 + default: return " not defined"; 153 + } 154 + } else if (IS_PONTEVECCHIO(i915)) { 155 + switch (obj->pat_index) { 156 + case 0: return " UC"; 157 + case 1: return " WC"; 158 + case 2: return " WT"; 159 + case 3: return " WB"; 160 + case 4: return " WT (CLOS1)"; 161 + case 5: return " WB (CLOS1)"; 162 + case 6: return " WT (CLOS2)"; 163 + case 7: return " WT (CLOS2)"; 164 + default: return " not defined"; 165 + } 166 + } else if (GRAPHICS_VER(i915) >= 12) { 167 + switch (obj->pat_index) { 168 + case 0: return " WB"; 169 + case 1: return " WC"; 170 + case 2: return " WT"; 171 + case 3: return " UC"; 172 + default: return " not defined"; 173 + } 174 + } else { 175 + switch (obj->pat_index) { 176 + case 0: return " UC"; 177 + case 1: return HAS_LLC(i915) ? 
178 + " LLC" : " snooped"; 179 + case 2: return " L3+LLC"; 180 + case 3: return " WT"; 181 + default: return " not defined"; 182 + } 149 183 } 150 184 } 151 185 152 186 void 153 187 i915_debugfs_describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) 154 188 { 155 - struct drm_i915_private *dev_priv = to_i915(obj->base.dev); 156 189 struct i915_vma *vma; 157 190 int pin_count = 0; 158 191 ··· 197 164 obj->base.size / 1024, 198 165 obj->read_domains, 199 166 obj->write_domain, 200 - i915_cache_level_str(dev_priv, obj->cache_level), 167 + i915_cache_level_str(obj), 201 168 obj->mm.dirty ? " dirty" : "", 202 169 obj->mm.madv == I915_MADV_DONTNEED ? " purgeable" : ""); 203 170 if (obj->base.name)
+1 -5
drivers/gpu/drm/i915/i915_drm_client.c
··· 147 147 PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn)); 148 148 seq_printf(m, "drm-client-id:\t%u\n", client->id); 149 149 150 - /* 151 - * Temporarily skip showing client engine information with GuC submission till 152 - * fetching engine busyness is implemented in the GuC submission backend 153 - */ 154 - if (GRAPHICS_VER(i915) < 8 || intel_uc_uses_guc_submission(&i915->gt0.uc)) 150 + if (GRAPHICS_VER(i915) < 8) 155 151 return; 156 152 157 153 for (i = 0; i < ARRAY_SIZE(uabi_class_names); i++)
+226 -226
drivers/gpu/drm/i915/i915_drv.h
··· 381 381 } 382 382 383 383 /* Simple iterator over all initialised engines */ 384 - #define for_each_engine(engine__, dev_priv__, id__) \ 384 + #define for_each_engine(engine__, gt__, id__) \ 385 385 for ((id__) = 0; \ 386 386 (id__) < I915_NUM_ENGINES; \ 387 387 (id__)++) \ 388 - for_each_if ((engine__) = (dev_priv__)->engine[(id__)]) 388 + for_each_if ((engine__) = (gt__)->engine[(id__)]) 389 389 390 390 /* Iterator over subset of engines selected by mask */ 391 391 #define for_each_engine_masked(engine__, gt__, mask__, tmp__) \ ··· 407 407 (engine__) && (engine__)->uabi_class == (class__); \ 408 408 (engine__) = rb_to_uabi_engine(rb_next(&(engine__)->uabi_node))) 409 409 410 - #define INTEL_INFO(dev_priv) (&(dev_priv)->__info) 411 - #define RUNTIME_INFO(dev_priv) (&(dev_priv)->__runtime) 412 - #define DRIVER_CAPS(dev_priv) (&(dev_priv)->caps) 410 + #define INTEL_INFO(i915) (&(i915)->__info) 411 + #define RUNTIME_INFO(i915) (&(i915)->__runtime) 412 + #define DRIVER_CAPS(i915) (&(i915)->caps) 413 413 414 - #define INTEL_DEVID(dev_priv) (RUNTIME_INFO(dev_priv)->device_id) 414 + #define INTEL_DEVID(i915) (RUNTIME_INFO(i915)->device_id) 415 415 416 416 #define IP_VER(ver, rel) ((ver) << 8 | (rel)) 417 417 ··· 431 431 #define IS_DISPLAY_VER(i915, from, until) \ 432 432 (DISPLAY_VER(i915) >= (from) && DISPLAY_VER(i915) <= (until)) 433 433 434 - #define INTEL_REVID(dev_priv) (to_pci_dev((dev_priv)->drm.dev)->revision) 434 + #define INTEL_REVID(i915) (to_pci_dev((i915)->drm.dev)->revision) 435 435 436 436 #define INTEL_DISPLAY_STEP(__i915) (RUNTIME_INFO(__i915)->step.display_step) 437 437 #define INTEL_GRAPHICS_STEP(__i915) (RUNTIME_INFO(__i915)->step.graphics_step) ··· 516 516 return ((mask << (msb - pb)) & (mask << (msb - s))) & BIT(msb); 517 517 } 518 518 519 - #define IS_MOBILE(dev_priv) (INTEL_INFO(dev_priv)->is_mobile) 520 - #define IS_DGFX(dev_priv) (INTEL_INFO(dev_priv)->is_dgfx) 519 + #define IS_MOBILE(i915) (INTEL_INFO(i915)->is_mobile) 520 + #define 
IS_DGFX(i915) (INTEL_INFO(i915)->is_dgfx) 521 521 522 - #define IS_I830(dev_priv) IS_PLATFORM(dev_priv, INTEL_I830) 523 - #define IS_I845G(dev_priv) IS_PLATFORM(dev_priv, INTEL_I845G) 524 - #define IS_I85X(dev_priv) IS_PLATFORM(dev_priv, INTEL_I85X) 525 - #define IS_I865G(dev_priv) IS_PLATFORM(dev_priv, INTEL_I865G) 526 - #define IS_I915G(dev_priv) IS_PLATFORM(dev_priv, INTEL_I915G) 527 - #define IS_I915GM(dev_priv) IS_PLATFORM(dev_priv, INTEL_I915GM) 528 - #define IS_I945G(dev_priv) IS_PLATFORM(dev_priv, INTEL_I945G) 529 - #define IS_I945GM(dev_priv) IS_PLATFORM(dev_priv, INTEL_I945GM) 530 - #define IS_I965G(dev_priv) IS_PLATFORM(dev_priv, INTEL_I965G) 531 - #define IS_I965GM(dev_priv) IS_PLATFORM(dev_priv, INTEL_I965GM) 532 - #define IS_G45(dev_priv) IS_PLATFORM(dev_priv, INTEL_G45) 533 - #define IS_GM45(dev_priv) IS_PLATFORM(dev_priv, INTEL_GM45) 534 - #define IS_G4X(dev_priv) (IS_G45(dev_priv) || IS_GM45(dev_priv)) 535 - #define IS_PINEVIEW(dev_priv) IS_PLATFORM(dev_priv, INTEL_PINEVIEW) 536 - #define IS_G33(dev_priv) IS_PLATFORM(dev_priv, INTEL_G33) 537 - #define IS_IRONLAKE(dev_priv) IS_PLATFORM(dev_priv, INTEL_IRONLAKE) 538 - #define IS_IRONLAKE_M(dev_priv) \ 539 - (IS_PLATFORM(dev_priv, INTEL_IRONLAKE) && IS_MOBILE(dev_priv)) 540 - #define IS_SANDYBRIDGE(dev_priv) IS_PLATFORM(dev_priv, INTEL_SANDYBRIDGE) 541 - #define IS_IVYBRIDGE(dev_priv) IS_PLATFORM(dev_priv, INTEL_IVYBRIDGE) 542 - #define IS_IVB_GT1(dev_priv) (IS_IVYBRIDGE(dev_priv) && \ 543 - INTEL_INFO(dev_priv)->gt == 1) 544 - #define IS_VALLEYVIEW(dev_priv) IS_PLATFORM(dev_priv, INTEL_VALLEYVIEW) 545 - #define IS_CHERRYVIEW(dev_priv) IS_PLATFORM(dev_priv, INTEL_CHERRYVIEW) 546 - #define IS_HASWELL(dev_priv) IS_PLATFORM(dev_priv, INTEL_HASWELL) 547 - #define IS_BROADWELL(dev_priv) IS_PLATFORM(dev_priv, INTEL_BROADWELL) 548 - #define IS_SKYLAKE(dev_priv) IS_PLATFORM(dev_priv, INTEL_SKYLAKE) 549 - #define IS_BROXTON(dev_priv) IS_PLATFORM(dev_priv, INTEL_BROXTON) 550 - #define IS_KABYLAKE(dev_priv) 
IS_PLATFORM(dev_priv, INTEL_KABYLAKE) 551 - #define IS_GEMINILAKE(dev_priv) IS_PLATFORM(dev_priv, INTEL_GEMINILAKE) 552 - #define IS_COFFEELAKE(dev_priv) IS_PLATFORM(dev_priv, INTEL_COFFEELAKE) 553 - #define IS_COMETLAKE(dev_priv) IS_PLATFORM(dev_priv, INTEL_COMETLAKE) 554 - #define IS_ICELAKE(dev_priv) IS_PLATFORM(dev_priv, INTEL_ICELAKE) 555 - #define IS_JSL_EHL(dev_priv) (IS_PLATFORM(dev_priv, INTEL_JASPERLAKE) || \ 556 - IS_PLATFORM(dev_priv, INTEL_ELKHARTLAKE)) 557 - #define IS_TIGERLAKE(dev_priv) IS_PLATFORM(dev_priv, INTEL_TIGERLAKE) 558 - #define IS_ROCKETLAKE(dev_priv) IS_PLATFORM(dev_priv, INTEL_ROCKETLAKE) 559 - #define IS_DG1(dev_priv) IS_PLATFORM(dev_priv, INTEL_DG1) 560 - #define IS_ALDERLAKE_S(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_S) 561 - #define IS_ALDERLAKE_P(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_P) 562 - #define IS_XEHPSDV(dev_priv) IS_PLATFORM(dev_priv, INTEL_XEHPSDV) 563 - #define IS_DG2(dev_priv) IS_PLATFORM(dev_priv, INTEL_DG2) 564 - #define IS_PONTEVECCHIO(dev_priv) IS_PLATFORM(dev_priv, INTEL_PONTEVECCHIO) 565 - #define IS_METEORLAKE(dev_priv) IS_PLATFORM(dev_priv, INTEL_METEORLAKE) 522 + #define IS_I830(i915) IS_PLATFORM(i915, INTEL_I830) 523 + #define IS_I845G(i915) IS_PLATFORM(i915, INTEL_I845G) 524 + #define IS_I85X(i915) IS_PLATFORM(i915, INTEL_I85X) 525 + #define IS_I865G(i915) IS_PLATFORM(i915, INTEL_I865G) 526 + #define IS_I915G(i915) IS_PLATFORM(i915, INTEL_I915G) 527 + #define IS_I915GM(i915) IS_PLATFORM(i915, INTEL_I915GM) 528 + #define IS_I945G(i915) IS_PLATFORM(i915, INTEL_I945G) 529 + #define IS_I945GM(i915) IS_PLATFORM(i915, INTEL_I945GM) 530 + #define IS_I965G(i915) IS_PLATFORM(i915, INTEL_I965G) 531 + #define IS_I965GM(i915) IS_PLATFORM(i915, INTEL_I965GM) 532 + #define IS_G45(i915) IS_PLATFORM(i915, INTEL_G45) 533 + #define IS_GM45(i915) IS_PLATFORM(i915, INTEL_GM45) 534 + #define IS_G4X(i915) (IS_G45(i915) || IS_GM45(i915)) 535 + #define IS_PINEVIEW(i915) IS_PLATFORM(i915, INTEL_PINEVIEW) 536 + #define 
IS_G33(i915) IS_PLATFORM(i915, INTEL_G33) 537 + #define IS_IRONLAKE(i915) IS_PLATFORM(i915, INTEL_IRONLAKE) 538 + #define IS_IRONLAKE_M(i915) \ 539 + (IS_PLATFORM(i915, INTEL_IRONLAKE) && IS_MOBILE(i915)) 540 + #define IS_SANDYBRIDGE(i915) IS_PLATFORM(i915, INTEL_SANDYBRIDGE) 541 + #define IS_IVYBRIDGE(i915) IS_PLATFORM(i915, INTEL_IVYBRIDGE) 542 + #define IS_IVB_GT1(i915) (IS_IVYBRIDGE(i915) && \ 543 + INTEL_INFO(i915)->gt == 1) 544 + #define IS_VALLEYVIEW(i915) IS_PLATFORM(i915, INTEL_VALLEYVIEW) 545 + #define IS_CHERRYVIEW(i915) IS_PLATFORM(i915, INTEL_CHERRYVIEW) 546 + #define IS_HASWELL(i915) IS_PLATFORM(i915, INTEL_HASWELL) 547 + #define IS_BROADWELL(i915) IS_PLATFORM(i915, INTEL_BROADWELL) 548 + #define IS_SKYLAKE(i915) IS_PLATFORM(i915, INTEL_SKYLAKE) 549 + #define IS_BROXTON(i915) IS_PLATFORM(i915, INTEL_BROXTON) 550 + #define IS_KABYLAKE(i915) IS_PLATFORM(i915, INTEL_KABYLAKE) 551 + #define IS_GEMINILAKE(i915) IS_PLATFORM(i915, INTEL_GEMINILAKE) 552 + #define IS_COFFEELAKE(i915) IS_PLATFORM(i915, INTEL_COFFEELAKE) 553 + #define IS_COMETLAKE(i915) IS_PLATFORM(i915, INTEL_COMETLAKE) 554 + #define IS_ICELAKE(i915) IS_PLATFORM(i915, INTEL_ICELAKE) 555 + #define IS_JSL_EHL(i915) (IS_PLATFORM(i915, INTEL_JASPERLAKE) || \ 556 + IS_PLATFORM(i915, INTEL_ELKHARTLAKE)) 557 + #define IS_TIGERLAKE(i915) IS_PLATFORM(i915, INTEL_TIGERLAKE) 558 + #define IS_ROCKETLAKE(i915) IS_PLATFORM(i915, INTEL_ROCKETLAKE) 559 + #define IS_DG1(i915) IS_PLATFORM(i915, INTEL_DG1) 560 + #define IS_ALDERLAKE_S(i915) IS_PLATFORM(i915, INTEL_ALDERLAKE_S) 561 + #define IS_ALDERLAKE_P(i915) IS_PLATFORM(i915, INTEL_ALDERLAKE_P) 562 + #define IS_XEHPSDV(i915) IS_PLATFORM(i915, INTEL_XEHPSDV) 563 + #define IS_DG2(i915) IS_PLATFORM(i915, INTEL_DG2) 564 + #define IS_PONTEVECCHIO(i915) IS_PLATFORM(i915, INTEL_PONTEVECCHIO) 565 + #define IS_METEORLAKE(i915) IS_PLATFORM(i915, INTEL_METEORLAKE) 566 566 567 - #define IS_METEORLAKE_M(dev_priv) \ 568 - IS_SUBPLATFORM(dev_priv, INTEL_METEORLAKE, 
INTEL_SUBPLATFORM_M) 569 - #define IS_METEORLAKE_P(dev_priv) \ 570 - IS_SUBPLATFORM(dev_priv, INTEL_METEORLAKE, INTEL_SUBPLATFORM_P) 571 - #define IS_DG2_G10(dev_priv) \ 572 - IS_SUBPLATFORM(dev_priv, INTEL_DG2, INTEL_SUBPLATFORM_G10) 573 - #define IS_DG2_G11(dev_priv) \ 574 - IS_SUBPLATFORM(dev_priv, INTEL_DG2, INTEL_SUBPLATFORM_G11) 575 - #define IS_DG2_G12(dev_priv) \ 576 - IS_SUBPLATFORM(dev_priv, INTEL_DG2, INTEL_SUBPLATFORM_G12) 577 - #define IS_ADLS_RPLS(dev_priv) \ 578 - IS_SUBPLATFORM(dev_priv, INTEL_ALDERLAKE_S, INTEL_SUBPLATFORM_RPL) 579 - #define IS_ADLP_N(dev_priv) \ 580 - IS_SUBPLATFORM(dev_priv, INTEL_ALDERLAKE_P, INTEL_SUBPLATFORM_N) 581 - #define IS_ADLP_RPLP(dev_priv) \ 582 - IS_SUBPLATFORM(dev_priv, INTEL_ALDERLAKE_P, INTEL_SUBPLATFORM_RPL) 583 - #define IS_ADLP_RPLU(dev_priv) \ 584 - IS_SUBPLATFORM(dev_priv, INTEL_ALDERLAKE_P, INTEL_SUBPLATFORM_RPLU) 585 - #define IS_HSW_EARLY_SDV(dev_priv) (IS_HASWELL(dev_priv) && \ 586 - (INTEL_DEVID(dev_priv) & 0xFF00) == 0x0C00) 587 - #define IS_BDW_ULT(dev_priv) \ 588 - IS_SUBPLATFORM(dev_priv, INTEL_BROADWELL, INTEL_SUBPLATFORM_ULT) 589 - #define IS_BDW_ULX(dev_priv) \ 590 - IS_SUBPLATFORM(dev_priv, INTEL_BROADWELL, INTEL_SUBPLATFORM_ULX) 591 - #define IS_BDW_GT3(dev_priv) (IS_BROADWELL(dev_priv) && \ 592 - INTEL_INFO(dev_priv)->gt == 3) 593 - #define IS_HSW_ULT(dev_priv) \ 594 - IS_SUBPLATFORM(dev_priv, INTEL_HASWELL, INTEL_SUBPLATFORM_ULT) 595 - #define IS_HSW_GT3(dev_priv) (IS_HASWELL(dev_priv) && \ 596 - INTEL_INFO(dev_priv)->gt == 3) 597 - #define IS_HSW_GT1(dev_priv) (IS_HASWELL(dev_priv) && \ 598 - INTEL_INFO(dev_priv)->gt == 1) 567 + #define IS_METEORLAKE_M(i915) \ 568 + IS_SUBPLATFORM(i915, INTEL_METEORLAKE, INTEL_SUBPLATFORM_M) 569 + #define IS_METEORLAKE_P(i915) \ 570 + IS_SUBPLATFORM(i915, INTEL_METEORLAKE, INTEL_SUBPLATFORM_P) 571 + #define IS_DG2_G10(i915) \ 572 + IS_SUBPLATFORM(i915, INTEL_DG2, INTEL_SUBPLATFORM_G10) 573 + #define IS_DG2_G11(i915) \ 574 + IS_SUBPLATFORM(i915, INTEL_DG2, 
INTEL_SUBPLATFORM_G11) 575 + #define IS_DG2_G12(i915) \ 576 + IS_SUBPLATFORM(i915, INTEL_DG2, INTEL_SUBPLATFORM_G12) 577 + #define IS_ADLS_RPLS(i915) \ 578 + IS_SUBPLATFORM(i915, INTEL_ALDERLAKE_S, INTEL_SUBPLATFORM_RPL) 579 + #define IS_ADLP_N(i915) \ 580 + IS_SUBPLATFORM(i915, INTEL_ALDERLAKE_P, INTEL_SUBPLATFORM_N) 581 + #define IS_ADLP_RPLP(i915) \ 582 + IS_SUBPLATFORM(i915, INTEL_ALDERLAKE_P, INTEL_SUBPLATFORM_RPL) 583 + #define IS_ADLP_RPLU(i915) \ 584 + IS_SUBPLATFORM(i915, INTEL_ALDERLAKE_P, INTEL_SUBPLATFORM_RPLU) 585 + #define IS_HSW_EARLY_SDV(i915) (IS_HASWELL(i915) && \ 586 + (INTEL_DEVID(i915) & 0xFF00) == 0x0C00) 587 + #define IS_BDW_ULT(i915) \ 588 + IS_SUBPLATFORM(i915, INTEL_BROADWELL, INTEL_SUBPLATFORM_ULT) 589 + #define IS_BDW_ULX(i915) \ 590 + IS_SUBPLATFORM(i915, INTEL_BROADWELL, INTEL_SUBPLATFORM_ULX) 591 + #define IS_BDW_GT3(i915) (IS_BROADWELL(i915) && \ 592 + INTEL_INFO(i915)->gt == 3) 593 + #define IS_HSW_ULT(i915) \ 594 + IS_SUBPLATFORM(i915, INTEL_HASWELL, INTEL_SUBPLATFORM_ULT) 595 + #define IS_HSW_GT3(i915) (IS_HASWELL(i915) && \ 596 + INTEL_INFO(i915)->gt == 3) 597 + #define IS_HSW_GT1(i915) (IS_HASWELL(i915) && \ 598 + INTEL_INFO(i915)->gt == 1) 599 599 /* ULX machines are also considered ULT. 
*/ 600 - #define IS_HSW_ULX(dev_priv) \ 601 - IS_SUBPLATFORM(dev_priv, INTEL_HASWELL, INTEL_SUBPLATFORM_ULX) 602 - #define IS_SKL_ULT(dev_priv) \ 603 - IS_SUBPLATFORM(dev_priv, INTEL_SKYLAKE, INTEL_SUBPLATFORM_ULT) 604 - #define IS_SKL_ULX(dev_priv) \ 605 - IS_SUBPLATFORM(dev_priv, INTEL_SKYLAKE, INTEL_SUBPLATFORM_ULX) 606 - #define IS_KBL_ULT(dev_priv) \ 607 - IS_SUBPLATFORM(dev_priv, INTEL_KABYLAKE, INTEL_SUBPLATFORM_ULT) 608 - #define IS_KBL_ULX(dev_priv) \ 609 - IS_SUBPLATFORM(dev_priv, INTEL_KABYLAKE, INTEL_SUBPLATFORM_ULX) 610 - #define IS_SKL_GT2(dev_priv) (IS_SKYLAKE(dev_priv) && \ 611 - INTEL_INFO(dev_priv)->gt == 2) 612 - #define IS_SKL_GT3(dev_priv) (IS_SKYLAKE(dev_priv) && \ 613 - INTEL_INFO(dev_priv)->gt == 3) 614 - #define IS_SKL_GT4(dev_priv) (IS_SKYLAKE(dev_priv) && \ 615 - INTEL_INFO(dev_priv)->gt == 4) 616 - #define IS_KBL_GT2(dev_priv) (IS_KABYLAKE(dev_priv) && \ 617 - INTEL_INFO(dev_priv)->gt == 2) 618 - #define IS_KBL_GT3(dev_priv) (IS_KABYLAKE(dev_priv) && \ 619 - INTEL_INFO(dev_priv)->gt == 3) 620 - #define IS_CFL_ULT(dev_priv) \ 621 - IS_SUBPLATFORM(dev_priv, INTEL_COFFEELAKE, INTEL_SUBPLATFORM_ULT) 622 - #define IS_CFL_ULX(dev_priv) \ 623 - IS_SUBPLATFORM(dev_priv, INTEL_COFFEELAKE, INTEL_SUBPLATFORM_ULX) 624 - #define IS_CFL_GT2(dev_priv) (IS_COFFEELAKE(dev_priv) && \ 625 - INTEL_INFO(dev_priv)->gt == 2) 626 - #define IS_CFL_GT3(dev_priv) (IS_COFFEELAKE(dev_priv) && \ 627 - INTEL_INFO(dev_priv)->gt == 3) 600 + #define IS_HSW_ULX(i915) \ 601 + IS_SUBPLATFORM(i915, INTEL_HASWELL, INTEL_SUBPLATFORM_ULX) 602 + #define IS_SKL_ULT(i915) \ 603 + IS_SUBPLATFORM(i915, INTEL_SKYLAKE, INTEL_SUBPLATFORM_ULT) 604 + #define IS_SKL_ULX(i915) \ 605 + IS_SUBPLATFORM(i915, INTEL_SKYLAKE, INTEL_SUBPLATFORM_ULX) 606 + #define IS_KBL_ULT(i915) \ 607 + IS_SUBPLATFORM(i915, INTEL_KABYLAKE, INTEL_SUBPLATFORM_ULT) 608 + #define IS_KBL_ULX(i915) \ 609 + IS_SUBPLATFORM(i915, INTEL_KABYLAKE, INTEL_SUBPLATFORM_ULX) 610 + #define IS_SKL_GT2(i915) (IS_SKYLAKE(i915) && \ 
611 + INTEL_INFO(i915)->gt == 2) 612 + #define IS_SKL_GT3(i915) (IS_SKYLAKE(i915) && \ 613 + INTEL_INFO(i915)->gt == 3) 614 + #define IS_SKL_GT4(i915) (IS_SKYLAKE(i915) && \ 615 + INTEL_INFO(i915)->gt == 4) 616 + #define IS_KBL_GT2(i915) (IS_KABYLAKE(i915) && \ 617 + INTEL_INFO(i915)->gt == 2) 618 + #define IS_KBL_GT3(i915) (IS_KABYLAKE(i915) && \ 619 + INTEL_INFO(i915)->gt == 3) 620 + #define IS_CFL_ULT(i915) \ 621 + IS_SUBPLATFORM(i915, INTEL_COFFEELAKE, INTEL_SUBPLATFORM_ULT) 622 + #define IS_CFL_ULX(i915) \ 623 + IS_SUBPLATFORM(i915, INTEL_COFFEELAKE, INTEL_SUBPLATFORM_ULX) 624 + #define IS_CFL_GT2(i915) (IS_COFFEELAKE(i915) && \ 625 + INTEL_INFO(i915)->gt == 2) 626 + #define IS_CFL_GT3(i915) (IS_COFFEELAKE(i915) && \ 627 + INTEL_INFO(i915)->gt == 3) 628 628 629 - #define IS_CML_ULT(dev_priv) \ 630 - IS_SUBPLATFORM(dev_priv, INTEL_COMETLAKE, INTEL_SUBPLATFORM_ULT) 631 - #define IS_CML_ULX(dev_priv) \ 632 - IS_SUBPLATFORM(dev_priv, INTEL_COMETLAKE, INTEL_SUBPLATFORM_ULX) 633 - #define IS_CML_GT2(dev_priv) (IS_COMETLAKE(dev_priv) && \ 634 - INTEL_INFO(dev_priv)->gt == 2) 629 + #define IS_CML_ULT(i915) \ 630 + IS_SUBPLATFORM(i915, INTEL_COMETLAKE, INTEL_SUBPLATFORM_ULT) 631 + #define IS_CML_ULX(i915) \ 632 + IS_SUBPLATFORM(i915, INTEL_COMETLAKE, INTEL_SUBPLATFORM_ULX) 633 + #define IS_CML_GT2(i915) (IS_COMETLAKE(i915) && \ 634 + INTEL_INFO(i915)->gt == 2) 635 635 636 - #define IS_ICL_WITH_PORT_F(dev_priv) \ 637 - IS_SUBPLATFORM(dev_priv, INTEL_ICELAKE, INTEL_SUBPLATFORM_PORTF) 636 + #define IS_ICL_WITH_PORT_F(i915) \ 637 + IS_SUBPLATFORM(i915, INTEL_ICELAKE, INTEL_SUBPLATFORM_PORTF) 638 638 639 - #define IS_TGL_UY(dev_priv) \ 640 - IS_SUBPLATFORM(dev_priv, INTEL_TIGERLAKE, INTEL_SUBPLATFORM_UY) 639 + #define IS_TGL_UY(i915) \ 640 + IS_SUBPLATFORM(i915, INTEL_TIGERLAKE, INTEL_SUBPLATFORM_UY) 641 641 642 642 #define IS_SKL_GRAPHICS_STEP(p, since, until) (IS_SKYLAKE(p) && IS_GRAPHICS_STEP(p, since, until)) 643 643 644 - #define IS_KBL_GRAPHICS_STEP(dev_priv, since, 
until) \ 645 - (IS_KABYLAKE(dev_priv) && IS_GRAPHICS_STEP(dev_priv, since, until)) 646 - #define IS_KBL_DISPLAY_STEP(dev_priv, since, until) \ 647 - (IS_KABYLAKE(dev_priv) && IS_DISPLAY_STEP(dev_priv, since, until)) 644 + #define IS_KBL_GRAPHICS_STEP(i915, since, until) \ 645 + (IS_KABYLAKE(i915) && IS_GRAPHICS_STEP(i915, since, until)) 646 + #define IS_KBL_DISPLAY_STEP(i915, since, until) \ 647 + (IS_KABYLAKE(i915) && IS_DISPLAY_STEP(i915, since, until)) 648 648 649 649 #define IS_JSL_EHL_GRAPHICS_STEP(p, since, until) \ 650 650 (IS_JSL_EHL(p) && IS_GRAPHICS_STEP(p, since, until)) ··· 720 720 (IS_PONTEVECCHIO(__i915) && \ 721 721 IS_GRAPHICS_STEP(__i915, since, until)) 722 722 723 - #define IS_LP(dev_priv) (INTEL_INFO(dev_priv)->is_lp) 724 - #define IS_GEN9_LP(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && IS_LP(dev_priv)) 725 - #define IS_GEN9_BC(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && !IS_LP(dev_priv)) 723 + #define IS_LP(i915) (INTEL_INFO(i915)->is_lp) 724 + #define IS_GEN9_LP(i915) (GRAPHICS_VER(i915) == 9 && IS_LP(i915)) 725 + #define IS_GEN9_BC(i915) (GRAPHICS_VER(i915) == 9 && !IS_LP(i915)) 726 726 727 727 #define __HAS_ENGINE(engine_mask, id) ((engine_mask) & BIT(id)) 728 728 #define HAS_ENGINE(gt, id) __HAS_ENGINE((gt)->info.engine_mask, id) ··· 747 747 #define CCS_MASK(gt) \ 748 748 ENGINE_INSTANCES_MASK(gt, CCS0, I915_MAX_CCS) 749 749 750 - #define HAS_MEDIA_RATIO_MODE(dev_priv) (INTEL_INFO(dev_priv)->has_media_ratio_mode) 750 + #define HAS_MEDIA_RATIO_MODE(i915) (INTEL_INFO(i915)->has_media_ratio_mode) 751 751 752 752 /* 753 753 * The Gen7 cmdparser copies the scanned buffer to the ggtt for execution 754 754 * All later gens can run the final buffer from the ppgtt 755 755 */ 756 - #define CMDPARSER_USES_GGTT(dev_priv) (GRAPHICS_VER(dev_priv) == 7) 756 + #define CMDPARSER_USES_GGTT(i915) (GRAPHICS_VER(i915) == 7) 757 757 758 - #define HAS_LLC(dev_priv) (INTEL_INFO(dev_priv)->has_llc) 759 - #define HAS_4TILE(dev_priv) (INTEL_INFO(dev_priv)->has_4tile) 760 - 
#define HAS_SNOOP(dev_priv) (INTEL_INFO(dev_priv)->has_snoop) 761 - #define HAS_EDRAM(dev_priv) ((dev_priv)->edram_size_mb) 762 - #define HAS_SECURE_BATCHES(dev_priv) (GRAPHICS_VER(dev_priv) < 6) 763 - #define HAS_WT(dev_priv) HAS_EDRAM(dev_priv) 758 + #define HAS_LLC(i915) (INTEL_INFO(i915)->has_llc) 759 + #define HAS_4TILE(i915) (INTEL_INFO(i915)->has_4tile) 760 + #define HAS_SNOOP(i915) (INTEL_INFO(i915)->has_snoop) 761 + #define HAS_EDRAM(i915) ((i915)->edram_size_mb) 762 + #define HAS_SECURE_BATCHES(i915) (GRAPHICS_VER(i915) < 6) 763 + #define HAS_WT(i915) HAS_EDRAM(i915) 764 764 765 - #define HWS_NEEDS_PHYSICAL(dev_priv) (INTEL_INFO(dev_priv)->hws_needs_physical) 765 + #define HWS_NEEDS_PHYSICAL(i915) (INTEL_INFO(i915)->hws_needs_physical) 766 766 767 - #define HAS_LOGICAL_RING_CONTEXTS(dev_priv) \ 768 - (INTEL_INFO(dev_priv)->has_logical_ring_contexts) 769 - #define HAS_LOGICAL_RING_ELSQ(dev_priv) \ 770 - (INTEL_INFO(dev_priv)->has_logical_ring_elsq) 767 + #define HAS_LOGICAL_RING_CONTEXTS(i915) \ 768 + (INTEL_INFO(i915)->has_logical_ring_contexts) 769 + #define HAS_LOGICAL_RING_ELSQ(i915) \ 770 + (INTEL_INFO(i915)->has_logical_ring_elsq) 771 771 772 - #define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv) 772 + #define HAS_EXECLISTS(i915) HAS_LOGICAL_RING_CONTEXTS(i915) 773 773 774 - #define INTEL_PPGTT(dev_priv) (RUNTIME_INFO(dev_priv)->ppgtt_type) 775 - #define HAS_PPGTT(dev_priv) \ 776 - (INTEL_PPGTT(dev_priv) != INTEL_PPGTT_NONE) 777 - #define HAS_FULL_PPGTT(dev_priv) \ 778 - (INTEL_PPGTT(dev_priv) >= INTEL_PPGTT_FULL) 774 + #define INTEL_PPGTT(i915) (RUNTIME_INFO(i915)->ppgtt_type) 775 + #define HAS_PPGTT(i915) \ 776 + (INTEL_PPGTT(i915) != INTEL_PPGTT_NONE) 777 + #define HAS_FULL_PPGTT(i915) \ 778 + (INTEL_PPGTT(i915) >= INTEL_PPGTT_FULL) 779 779 780 - #define HAS_PAGE_SIZES(dev_priv, sizes) ({ \ 780 + #define HAS_PAGE_SIZES(i915, sizes) ({ \ 781 781 GEM_BUG_ON((sizes) == 0); \ 782 - ((sizes) & ~RUNTIME_INFO(dev_priv)->page_sizes) == 0; \ 
782 + ((sizes) & ~RUNTIME_INFO(i915)->page_sizes) == 0; \ 783 783 }) 784 784 785 - #define HAS_OVERLAY(dev_priv) (INTEL_INFO(dev_priv)->display.has_overlay) 786 - #define OVERLAY_NEEDS_PHYSICAL(dev_priv) \ 787 - (INTEL_INFO(dev_priv)->display.overlay_needs_physical) 785 + #define HAS_OVERLAY(i915) (INTEL_INFO(i915)->display.has_overlay) 786 + #define OVERLAY_NEEDS_PHYSICAL(i915) \ 787 + (INTEL_INFO(i915)->display.overlay_needs_physical) 788 788 789 789 /* Early gen2 have a totally busted CS tlb and require pinned batches. */ 790 - #define HAS_BROKEN_CS_TLB(dev_priv) (IS_I830(dev_priv) || IS_I845G(dev_priv)) 790 + #define HAS_BROKEN_CS_TLB(i915) (IS_I830(i915) || IS_I845G(i915)) 791 791 792 - #define NEEDS_RC6_CTX_CORRUPTION_WA(dev_priv) \ 793 - (IS_BROADWELL(dev_priv) || GRAPHICS_VER(dev_priv) == 9) 792 + #define NEEDS_RC6_CTX_CORRUPTION_WA(i915) \ 793 + (IS_BROADWELL(i915) || GRAPHICS_VER(i915) == 9) 794 794 795 795 /* WaRsDisableCoarsePowerGating:skl,cnl */ 796 - #define NEEDS_WaRsDisableCoarsePowerGating(dev_priv) \ 797 - (IS_SKL_GT3(dev_priv) || IS_SKL_GT4(dev_priv)) 796 + #define NEEDS_WaRsDisableCoarsePowerGating(i915) \ 797 + (IS_SKL_GT3(i915) || IS_SKL_GT4(i915)) 798 798 799 - #define HAS_GMBUS_IRQ(dev_priv) (DISPLAY_VER(dev_priv) >= 4) 800 - #define HAS_GMBUS_BURST_READ(dev_priv) (DISPLAY_VER(dev_priv) >= 11 || \ 801 - IS_GEMINILAKE(dev_priv) || \ 802 - IS_KABYLAKE(dev_priv)) 799 + #define HAS_GMBUS_IRQ(i915) (DISPLAY_VER(i915) >= 4) 800 + #define HAS_GMBUS_BURST_READ(i915) (DISPLAY_VER(i915) >= 11 || \ 801 + IS_GEMINILAKE(i915) || \ 802 + IS_KABYLAKE(i915)) 803 803 804 804 /* With the 945 and later, Y tiling got adjusted so that it was 32 128-byte 805 805 * rows, which changed the alignment requirements and fence programming. 
806 806 */ 807 - #define HAS_128_BYTE_Y_TILING(dev_priv) (GRAPHICS_VER(dev_priv) != 2 && \ 808 - !(IS_I915G(dev_priv) || IS_I915GM(dev_priv))) 809 - #define SUPPORTS_TV(dev_priv) (INTEL_INFO(dev_priv)->display.supports_tv) 810 - #define I915_HAS_HOTPLUG(dev_priv) (INTEL_INFO(dev_priv)->display.has_hotplug) 807 + #define HAS_128_BYTE_Y_TILING(i915) (GRAPHICS_VER(i915) != 2 && \ 808 + !(IS_I915G(i915) || IS_I915GM(i915))) 809 + #define SUPPORTS_TV(i915) (INTEL_INFO(i915)->display.supports_tv) 810 + #define I915_HAS_HOTPLUG(i915) (INTEL_INFO(i915)->display.has_hotplug) 811 811 812 - #define HAS_FW_BLC(dev_priv) (DISPLAY_VER(dev_priv) > 2) 813 - #define HAS_FBC(dev_priv) (RUNTIME_INFO(dev_priv)->fbc_mask != 0) 814 - #define HAS_CUR_FBC(dev_priv) (!HAS_GMCH(dev_priv) && DISPLAY_VER(dev_priv) >= 7) 812 + #define HAS_FW_BLC(i915) (DISPLAY_VER(i915) > 2) 813 + #define HAS_FBC(i915) (RUNTIME_INFO(i915)->fbc_mask != 0) 814 + #define HAS_CUR_FBC(i915) (!HAS_GMCH(i915) && DISPLAY_VER(i915) >= 7) 815 815 816 - #define HAS_DPT(dev_priv) (DISPLAY_VER(dev_priv) >= 13) 816 + #define HAS_DPT(i915) (DISPLAY_VER(i915) >= 13) 817 817 818 - #define HAS_IPS(dev_priv) (IS_HSW_ULT(dev_priv) || IS_BROADWELL(dev_priv)) 818 + #define HAS_IPS(i915) (IS_HSW_ULT(i915) || IS_BROADWELL(i915)) 819 819 820 - #define HAS_DP_MST(dev_priv) (INTEL_INFO(dev_priv)->display.has_dp_mst) 821 - #define HAS_DP20(dev_priv) (IS_DG2(dev_priv) || DISPLAY_VER(dev_priv) >= 14) 820 + #define HAS_DP_MST(i915) (INTEL_INFO(i915)->display.has_dp_mst) 821 + #define HAS_DP20(i915) (IS_DG2(i915) || DISPLAY_VER(i915) >= 14) 822 822 823 - #define HAS_DOUBLE_BUFFERED_M_N(dev_priv) (DISPLAY_VER(dev_priv) >= 9 || IS_BROADWELL(dev_priv)) 823 + #define HAS_DOUBLE_BUFFERED_M_N(i915) (DISPLAY_VER(i915) >= 9 || IS_BROADWELL(i915)) 824 824 825 - #define HAS_CDCLK_CRAWL(dev_priv) (INTEL_INFO(dev_priv)->display.has_cdclk_crawl) 826 - #define HAS_CDCLK_SQUASH(dev_priv) (INTEL_INFO(dev_priv)->display.has_cdclk_squash) 827 - #define 
HAS_DDI(dev_priv) (INTEL_INFO(dev_priv)->display.has_ddi) 828 - #define HAS_FPGA_DBG_UNCLAIMED(dev_priv) (INTEL_INFO(dev_priv)->display.has_fpga_dbg) 829 - #define HAS_PSR(dev_priv) (INTEL_INFO(dev_priv)->display.has_psr) 830 - #define HAS_PSR_HW_TRACKING(dev_priv) \ 831 - (INTEL_INFO(dev_priv)->display.has_psr_hw_tracking) 832 - #define HAS_PSR2_SEL_FETCH(dev_priv) (DISPLAY_VER(dev_priv) >= 12) 833 - #define HAS_TRANSCODER(dev_priv, trans) ((RUNTIME_INFO(dev_priv)->cpu_transcoder_mask & BIT(trans)) != 0) 825 + #define HAS_CDCLK_CRAWL(i915) (INTEL_INFO(i915)->display.has_cdclk_crawl) 826 + #define HAS_CDCLK_SQUASH(i915) (INTEL_INFO(i915)->display.has_cdclk_squash) 827 + #define HAS_DDI(i915) (INTEL_INFO(i915)->display.has_ddi) 828 + #define HAS_FPGA_DBG_UNCLAIMED(i915) (INTEL_INFO(i915)->display.has_fpga_dbg) 829 + #define HAS_PSR(i915) (INTEL_INFO(i915)->display.has_psr) 830 + #define HAS_PSR_HW_TRACKING(i915) \ 831 + (INTEL_INFO(i915)->display.has_psr_hw_tracking) 832 + #define HAS_PSR2_SEL_FETCH(i915) (DISPLAY_VER(i915) >= 12) 833 + #define HAS_TRANSCODER(i915, trans) ((RUNTIME_INFO(i915)->cpu_transcoder_mask & BIT(trans)) != 0) 834 834 835 - #define HAS_RC6(dev_priv) (INTEL_INFO(dev_priv)->has_rc6) 836 - #define HAS_RC6p(dev_priv) (INTEL_INFO(dev_priv)->has_rc6p) 837 - #define HAS_RC6pp(dev_priv) (false) /* HW was never validated */ 835 + #define HAS_RC6(i915) (INTEL_INFO(i915)->has_rc6) 836 + #define HAS_RC6p(i915) (INTEL_INFO(i915)->has_rc6p) 837 + #define HAS_RC6pp(i915) (false) /* HW was never validated */ 838 838 839 - #define HAS_RPS(dev_priv) (INTEL_INFO(dev_priv)->has_rps) 839 + #define HAS_RPS(i915) (INTEL_INFO(i915)->has_rps) 840 840 841 - #define HAS_DMC(dev_priv) (RUNTIME_INFO(dev_priv)->has_dmc) 842 - #define HAS_DSB(dev_priv) (INTEL_INFO(dev_priv)->display.has_dsb) 841 + #define HAS_DMC(i915) (RUNTIME_INFO(i915)->has_dmc) 842 + #define HAS_DSB(i915) (INTEL_INFO(i915)->display.has_dsb) 843 843 #define HAS_DSC(__i915) (RUNTIME_INFO(__i915)->has_dsc) 
844 844 #define HAS_HW_SAGV_WM(i915) (DISPLAY_VER(i915) >= 13 && !IS_DGFX(i915)) 845 845 846 - #define HAS_HECI_PXP(dev_priv) \ 847 - (INTEL_INFO(dev_priv)->has_heci_pxp) 846 + #define HAS_HECI_PXP(i915) \ 847 + (INTEL_INFO(i915)->has_heci_pxp) 848 848 849 - #define HAS_HECI_GSCFI(dev_priv) \ 850 - (INTEL_INFO(dev_priv)->has_heci_gscfi) 849 + #define HAS_HECI_GSCFI(i915) \ 850 + (INTEL_INFO(i915)->has_heci_gscfi) 851 851 852 - #define HAS_HECI_GSC(dev_priv) (HAS_HECI_PXP(dev_priv) || HAS_HECI_GSCFI(dev_priv)) 852 + #define HAS_HECI_GSC(i915) (HAS_HECI_PXP(i915) || HAS_HECI_GSCFI(i915)) 853 853 854 854 #define HAS_MSO(i915) (DISPLAY_VER(i915) >= 12) 855 855 856 - #define HAS_RUNTIME_PM(dev_priv) (INTEL_INFO(dev_priv)->has_runtime_pm) 857 - #define HAS_64BIT_RELOC(dev_priv) (INTEL_INFO(dev_priv)->has_64bit_reloc) 856 + #define HAS_RUNTIME_PM(i915) (INTEL_INFO(i915)->has_runtime_pm) 857 + #define HAS_64BIT_RELOC(i915) (INTEL_INFO(i915)->has_64bit_reloc) 858 858 859 - #define HAS_OA_BPC_REPORTING(dev_priv) \ 860 - (INTEL_INFO(dev_priv)->has_oa_bpc_reporting) 861 - #define HAS_OA_SLICE_CONTRIB_LIMITS(dev_priv) \ 862 - (INTEL_INFO(dev_priv)->has_oa_slice_contrib_limits) 863 - #define HAS_OAM(dev_priv) \ 864 - (INTEL_INFO(dev_priv)->has_oam) 859 + #define HAS_OA_BPC_REPORTING(i915) \ 860 + (INTEL_INFO(i915)->has_oa_bpc_reporting) 861 + #define HAS_OA_SLICE_CONTRIB_LIMITS(i915) \ 862 + (INTEL_INFO(i915)->has_oa_slice_contrib_limits) 863 + #define HAS_OAM(i915) \ 864 + (INTEL_INFO(i915)->has_oam) 865 865 866 866 /* 867 867 * Set this flag, when platform requires 64K GTT page sizes or larger for 868 868 * device local memory access. 
869 869 */ 870 - #define HAS_64K_PAGES(dev_priv) (INTEL_INFO(dev_priv)->has_64k_pages) 870 + #define HAS_64K_PAGES(i915) (INTEL_INFO(i915)->has_64k_pages) 871 871 872 - #define HAS_IPC(dev_priv) (INTEL_INFO(dev_priv)->display.has_ipc) 873 - #define HAS_SAGV(dev_priv) (DISPLAY_VER(dev_priv) >= 9 && !IS_LP(dev_priv)) 872 + #define HAS_IPC(i915) (INTEL_INFO(i915)->display.has_ipc) 873 + #define HAS_SAGV(i915) (DISPLAY_VER(i915) >= 9 && !IS_LP(i915)) 874 874 875 875 #define HAS_REGION(i915, i) (RUNTIME_INFO(i915)->memory_regions & (i)) 876 876 #define HAS_LMEM(i915) HAS_REGION(i915, REGION_LMEM) 877 877 878 - #define HAS_EXTRA_GT_LIST(dev_priv) (INTEL_INFO(dev_priv)->extra_gt_list) 878 + #define HAS_EXTRA_GT_LIST(i915) (INTEL_INFO(i915)->extra_gt_list) 879 879 880 880 /* 881 881 * Platform has the dedicated compression control state for each lmem surfaces 882 882 * stored in lmem to support the 3D and media compression formats. 883 883 */ 884 - #define HAS_FLAT_CCS(dev_priv) (INTEL_INFO(dev_priv)->has_flat_ccs) 884 + #define HAS_FLAT_CCS(i915) (INTEL_INFO(i915)->has_flat_ccs) 885 885 886 - #define HAS_GT_UC(dev_priv) (INTEL_INFO(dev_priv)->has_gt_uc) 886 + #define HAS_GT_UC(i915) (INTEL_INFO(i915)->has_gt_uc) 887 887 888 - #define HAS_POOLED_EU(dev_priv) (RUNTIME_INFO(dev_priv)->has_pooled_eu) 888 + #define HAS_POOLED_EU(i915) (RUNTIME_INFO(i915)->has_pooled_eu) 889 889 890 - #define HAS_GLOBAL_MOCS_REGISTERS(dev_priv) (INTEL_INFO(dev_priv)->has_global_mocs) 890 + #define HAS_GLOBAL_MOCS_REGISTERS(i915) (INTEL_INFO(i915)->has_global_mocs) 891 891 892 - #define HAS_GMCH(dev_priv) (INTEL_INFO(dev_priv)->display.has_gmch) 892 + #define HAS_GMCH(i915) (INTEL_INFO(i915)->display.has_gmch) 893 893 894 894 #define HAS_GMD_ID(i915) (INTEL_INFO(i915)->has_gmd_id) 895 895 896 - #define HAS_LSPCON(dev_priv) (IS_DISPLAY_VER(dev_priv, 9, 10)) 896 + #define HAS_LSPCON(i915) (IS_DISPLAY_VER(i915, 9, 10)) 897 897 898 898 #define HAS_L3_CCS_READ(i915) 
(INTEL_INFO(i915)->has_l3_ccs_read) 899 899 900 900 /* DPF == dynamic parity feature */ 901 - #define HAS_L3_DPF(dev_priv) (INTEL_INFO(dev_priv)->has_l3_dpf) 902 - #define NUM_L3_SLICES(dev_priv) (IS_HSW_GT3(dev_priv) ? \ 903 - 2 : HAS_L3_DPF(dev_priv)) 901 + #define HAS_L3_DPF(i915) (INTEL_INFO(i915)->has_l3_dpf) 902 + #define NUM_L3_SLICES(i915) (IS_HSW_GT3(i915) ? \ 903 + 2 : HAS_L3_DPF(i915)) 904 904 905 - #define INTEL_NUM_PIPES(dev_priv) (hweight8(RUNTIME_INFO(dev_priv)->pipe_mask)) 905 + #define INTEL_NUM_PIPES(i915) (hweight8(RUNTIME_INFO(i915)->pipe_mask)) 906 906 907 - #define HAS_DISPLAY(dev_priv) (RUNTIME_INFO(dev_priv)->pipe_mask != 0) 907 + #define HAS_DISPLAY(i915) (RUNTIME_INFO(i915)->pipe_mask != 0) 908 908 909 909 #define HAS_VRR(i915) (DISPLAY_VER(i915) >= 11) 910 910 911 911 #define HAS_ASYNC_FLIPS(i915) (DISPLAY_VER(i915) >= 5) 912 912 913 913 /* Only valid when HAS_DISPLAY() is true */ 914 - #define INTEL_DISPLAY_ENABLED(dev_priv) \ 915 - (drm_WARN_ON(&(dev_priv)->drm, !HAS_DISPLAY(dev_priv)), \ 916 - !(dev_priv)->params.disable_display && \ 917 - !intel_opregion_headless_sku(dev_priv)) 914 + #define INTEL_DISPLAY_ENABLED(i915) \ 915 + (drm_WARN_ON(&(i915)->drm, !HAS_DISPLAY(i915)), \ 916 + !(i915)->params.disable_display && \ 917 + !intel_opregion_headless_sku(i915)) 918 918 919 - #define HAS_GUC_DEPRIVILEGE(dev_priv) \ 920 - (INTEL_INFO(dev_priv)->has_guc_deprivilege) 919 + #define HAS_GUC_DEPRIVILEGE(i915) \ 920 + (INTEL_INFO(i915)->has_guc_deprivilege) 921 921 922 - #define HAS_D12_PLANE_MINIMIZATION(dev_priv) (IS_ROCKETLAKE(dev_priv) || \ 923 - IS_ALDERLAKE_S(dev_priv)) 922 + #define HAS_D12_PLANE_MINIMIZATION(i915) (IS_ROCKETLAKE(i915) || \ 923 + IS_ALDERLAKE_S(i915)) 924 924 925 925 #define HAS_MBUS_JOINING(i915) (IS_ALDERLAKE_P(i915) || DISPLAY_VER(i915) >= 14) 926 926
+23 -4
drivers/gpu/drm/i915/i915_gem.c
··· 420 420 page_length = remain < page_length ? remain : page_length; 421 421 if (drm_mm_node_allocated(&node)) { 422 422 ggtt->vm.insert_page(&ggtt->vm, 423 - i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT), 424 - node.start, I915_CACHE_NONE, 0); 423 + i915_gem_object_get_dma_address(obj, 424 + offset >> PAGE_SHIFT), 425 + node.start, 426 + i915_gem_get_pat_index(i915, 427 + I915_CACHE_NONE), 0); 425 428 } else { 426 429 page_base += offset & PAGE_MASK; 427 430 } ··· 601 598 /* flush the write before we modify the GGTT */ 602 599 intel_gt_flush_ggtt_writes(ggtt->vm.gt); 603 600 ggtt->vm.insert_page(&ggtt->vm, 604 - i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT), 605 - node.start, I915_CACHE_NONE, 0); 601 + i915_gem_object_get_dma_address(obj, 602 + offset >> PAGE_SHIFT), 603 + node.start, 604 + i915_gem_get_pat_index(i915, 605 + I915_CACHE_NONE), 0); 606 606 wmb(); /* flush modifications to the GGTT (insert_page) */ 607 607 } else { 608 608 page_base += offset & PAGE_MASK; ··· 1147 1141 struct intel_gt *gt; 1148 1142 unsigned int i; 1149 1143 int ret; 1144 + 1145 + /* 1146 + * In the process of replacing cache_level with pat_index a tricky 1147 + * dependency is created on the definition of the enum i915_cache_level. 1148 + * If this enum is changed, PTE encoding would be broken. 1149 + * Add a WARNING here, and remove it when we completely quit using this 1150 + * enum. 1151 + */ 1152 + BUILD_BUG_ON(I915_CACHE_NONE != 0 || 1153 + I915_CACHE_LLC != 1 || 1154 + I915_CACHE_L3_LLC != 2 || 1155 + I915_CACHE_WT != 3 || 1156 + I915_MAX_CACHE_LEVEL != 4); 1150 1157 1151 1158 /* We need to fallback to 4K pages if host doesn't support huge gtt. */ 1152 1159 if (intel_vgpu_active(dev_priv) && !intel_vgpu_has_huge_gtt(dev_priv))
+7
drivers/gpu/drm/i915/i915_getparam.c
··· 5 5 #include "gem/i915_gem_mman.h" 6 6 #include "gt/intel_engine_user.h" 7 7 8 + #include "pxp/intel_pxp.h" 9 + 8 10 #include "i915_cmd_parser.h" 9 11 #include "i915_drv.h" 10 12 #include "i915_getparam.h" ··· 101 99 break; 102 100 case I915_PARAM_HUC_STATUS: 103 101 value = intel_huc_check_status(&to_gt(i915)->uc.huc); 102 + if (value < 0) 103 + return value; 104 + break; 105 + case I915_PARAM_PXP_STATUS: 106 + value = intel_pxp_get_readiness_status(i915->pxp); 104 107 if (value < 0) 105 108 return value; 106 109 break;
+147 -6
drivers/gpu/drm/i915/i915_gpu_error.c
··· 808 808 for (ee = gt->engine; ee; ee = ee->next) { 809 809 const struct i915_vma_coredump *vma; 810 810 811 - if (ee->guc_capture_node) 812 - intel_guc_capture_print_engine_node(m, ee); 813 - else 811 + if (gt->uc && gt->uc->guc.is_guc_capture) { 812 + if (ee->guc_capture_node) 813 + intel_guc_capture_print_engine_node(m, ee); 814 + else 815 + err_printf(m, " Missing GuC capture node for %s\n", 816 + ee->engine->name); 817 + } else { 814 818 error_print_engine(m, ee); 819 + } 815 820 816 821 err_printf(m, " hung: %u\n", ee->hung); 817 822 err_printf(m, " engine reset count: %u\n", ee->reset_count); ··· 1122 1117 mutex_lock(&ggtt->error_mutex); 1123 1118 if (ggtt->vm.raw_insert_page) 1124 1119 ggtt->vm.raw_insert_page(&ggtt->vm, dma, slot, 1125 - I915_CACHE_NONE, 0); 1120 + i915_gem_get_pat_index(gt->i915, 1121 + I915_CACHE_NONE), 1122 + 0); 1126 1123 else 1127 1124 ggtt->vm.insert_page(&ggtt->vm, dma, slot, 1128 - I915_CACHE_NONE, 0); 1125 + i915_gem_get_pat_index(gt->i915, 1126 + I915_CACHE_NONE), 1127 + 0); 1129 1128 mb(); 1130 1129 1131 1130 s = io_mapping_map_wc(&ggtt->iomap, slot, PAGE_SIZE); ··· 2171 2162 * i915_capture_error_state - capture an error record for later analysis 2172 2163 * @gt: intel_gt which originated the hang 2173 2164 * @engine_mask: hung engines 2174 - * 2165 + * @dump_flags: dump flags 2175 2166 * 2176 2167 * Should be called when an error is detected (either a hang or an error 2177 2168 * interrupt) to capture error state from the time of the error. 
Fills ··· 2228 2219 i915->gpu_error.first_error = ERR_PTR(err); 2229 2220 spin_unlock_irq(&i915->gpu_error.lock); 2230 2221 } 2222 + 2223 + #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) 2224 + void intel_klog_error_capture(struct intel_gt *gt, 2225 + intel_engine_mask_t engine_mask) 2226 + { 2227 + static int g_count; 2228 + struct drm_i915_private *i915 = gt->i915; 2229 + struct i915_gpu_coredump *error; 2230 + intel_wakeref_t wakeref; 2231 + size_t buf_size = PAGE_SIZE * 128; 2232 + size_t pos_err; 2233 + char *buf, *ptr, *next; 2234 + int l_count = g_count++; 2235 + int line = 0; 2236 + 2237 + /* Can't allocate memory during a reset */ 2238 + if (test_bit(I915_RESET_BACKOFF, &gt->reset.flags)) { 2239 + drm_err(&gt->i915->drm, "[Capture/%d.%d] Inside GT reset, skipping error capture :(\n", 2240 + l_count, line++); 2241 + return; 2242 + } 2243 + 2244 + error = READ_ONCE(i915->gpu_error.first_error); 2245 + if (error) { 2246 + drm_err(&i915->drm, "[Capture/%d.%d] Clearing existing error capture first...\n", 2247 + l_count, line++); 2248 + i915_reset_error_state(i915); 2249 + } 2250 + 2251 + with_intel_runtime_pm(&i915->runtime_pm, wakeref) 2252 + error = i915_gpu_coredump(gt, engine_mask, CORE_DUMP_FLAG_NONE); 2253 + 2254 + if (IS_ERR(error)) { 2255 + drm_err(&i915->drm, "[Capture/%d.%d] Failed to capture error capture: %ld!\n", 2256 + l_count, line++, PTR_ERR(error)); 2257 + return; 2258 + } 2259 + 2260 + buf = kvmalloc(buf_size, GFP_KERNEL); 2261 + if (!buf) { 2262 + drm_err(&i915->drm, "[Capture/%d.%d] Failed to allocate buffer for error capture!\n", 2263 + l_count, line++); 2264 + i915_gpu_coredump_put(error); 2265 + return; 2266 + } 2267 + 2268 + drm_info(&i915->drm, "[Capture/%d.%d] Dumping i915 error capture for %ps...\n", 2269 + l_count, line++, __builtin_return_address(0)); 2270 + 2271 + /* Largest string length safe to print via dmesg */ 2272 + # define MAX_CHUNK 800 2273 + 2274 + pos_err = 0; 2275 + while (1) { 2276 + ssize_t got = 
i915_gpu_coredump_copy_to_buffer(error, buf, pos_err, buf_size - 1); 2277 + 2278 + if (got <= 0) 2279 + break; 2280 + 2281 + buf[got] = 0; 2282 + pos_err += got; 2283 + 2284 + ptr = buf; 2285 + while (got > 0) { 2286 + size_t count; 2287 + char tag[2]; 2288 + 2289 + next = strnchr(ptr, got, '\n'); 2290 + if (next) { 2291 + count = next - ptr; 2292 + *next = 0; 2293 + tag[0] = '>'; 2294 + tag[1] = '<'; 2295 + } else { 2296 + count = got; 2297 + tag[0] = '}'; 2298 + tag[1] = '{'; 2299 + } 2300 + 2301 + if (count > MAX_CHUNK) { 2302 + size_t pos; 2303 + char *ptr2 = ptr; 2304 + 2305 + for (pos = MAX_CHUNK; pos < count; pos += MAX_CHUNK) { 2306 + char chr = ptr[pos]; 2307 + 2308 + ptr[pos] = 0; 2309 + drm_info(&i915->drm, "[Capture/%d.%d] }%s{\n", 2310 + l_count, line++, ptr2); 2311 + ptr[pos] = chr; 2312 + ptr2 = ptr + pos; 2313 + 2314 + /* 2315 + * If spewing large amounts of data via a serial console, 2316 + * this can be a very slow process. So be friendly and try 2317 + * not to cause 'softlockup on CPU' problems. 2318 + */ 2319 + cond_resched(); 2320 + } 2321 + 2322 + if (ptr2 < (ptr + count)) 2323 + drm_info(&i915->drm, "[Capture/%d.%d] %c%s%c\n", 2324 + l_count, line++, tag[0], ptr2, tag[1]); 2325 + else if (tag[0] == '>') 2326 + drm_info(&i915->drm, "[Capture/%d.%d] ><\n", 2327 + l_count, line++); 2328 + } else { 2329 + drm_info(&i915->drm, "[Capture/%d.%d] %c%s%c\n", 2330 + l_count, line++, tag[0], ptr, tag[1]); 2331 + } 2332 + 2333 + ptr = next; 2334 + got -= count; 2335 + if (next) { 2336 + ptr++; 2337 + got--; 2338 + } 2339 + 2340 + /* As above. */ 2341 + cond_resched(); 2342 + } 2343 + 2344 + if (got) 2345 + drm_info(&i915->drm, "[Capture/%d.%d] Got %zd bytes remaining!\n", 2346 + l_count, line++, got); 2347 + } 2348 + 2349 + kvfree(buf); 2350 + 2351 + drm_info(&i915->drm, "[Capture/%d.%d] Dumped %zd bytes\n", l_count, line++, pos_err); 2352 + } 2353 + #endif
+10
drivers/gpu/drm/i915/i915_gpu_error.h
··· 258 258 #define CORE_DUMP_FLAG_NONE 0x0 259 259 #define CORE_DUMP_FLAG_IS_GUC_CAPTURE BIT(0) 260 260 261 + #if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR) && IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) 262 + void intel_klog_error_capture(struct intel_gt *gt, 263 + intel_engine_mask_t engine_mask); 264 + #else 265 + static inline void intel_klog_error_capture(struct intel_gt *gt, 266 + intel_engine_mask_t engine_mask) 267 + { 268 + } 269 + #endif 270 + 261 271 #if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR) 262 272 263 273 __printf(2, 3)
+76 -13
drivers/gpu/drm/i915/i915_hwmon.c
··· 50 50 struct hwm_energy_info ei; /* Energy info for energy1_input */ 51 51 char name[12]; 52 52 int gt_n; 53 + bool reset_in_progress; 54 + wait_queue_head_t waitq; 53 55 }; 54 56 55 57 struct i915_hwmon { ··· 398 396 { 399 397 struct i915_hwmon *hwmon = ddat->hwmon; 400 398 intel_wakeref_t wakeref; 399 + DEFINE_WAIT(wait); 400 + int ret = 0; 401 401 u32 nval; 402 + 403 + /* Block waiting for GuC reset to complete when needed */ 404 + for (;;) { 405 + mutex_lock(&hwmon->hwmon_lock); 406 + 407 + prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE); 408 + 409 + if (!hwmon->ddat.reset_in_progress) 410 + break; 411 + 412 + if (signal_pending(current)) { 413 + ret = -EINTR; 414 + break; 415 + } 416 + 417 + mutex_unlock(&hwmon->hwmon_lock); 418 + 419 + schedule(); 420 + } 421 + finish_wait(&ddat->waitq, &wait); 422 + if (ret) 423 + goto unlock; 424 + 425 + wakeref = intel_runtime_pm_get(ddat->uncore->rpm); 402 426 403 427 /* Disable PL1 limit and verify, because the limit cannot be disabled on all platforms */ 404 428 if (val == PL1_DISABLE) { 405 - mutex_lock(&hwmon->hwmon_lock); 406 - with_intel_runtime_pm(ddat->uncore->rpm, wakeref) { 407 - intel_uncore_rmw(ddat->uncore, hwmon->rg.pkg_rapl_limit, 408 - PKG_PWR_LIM_1_EN, 0); 409 - nval = intel_uncore_read(ddat->uncore, hwmon->rg.pkg_rapl_limit); 410 - } 411 - mutex_unlock(&hwmon->hwmon_lock); 429 + intel_uncore_rmw(ddat->uncore, hwmon->rg.pkg_rapl_limit, 430 + PKG_PWR_LIM_1_EN, 0); 431 + nval = intel_uncore_read(ddat->uncore, hwmon->rg.pkg_rapl_limit); 412 432 413 433 if (nval & PKG_PWR_LIM_1_EN) 414 - return -ENODEV; 415 - return 0; 434 + ret = -ENODEV; 435 + goto exit; 416 436 } 417 437 418 438 /* Computation in 64-bits to avoid overflow. Round to nearest. 
*/ 419 439 nval = DIV_ROUND_CLOSEST_ULL((u64)val << hwmon->scl_shift_power, SF_POWER); 420 440 nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval); 421 441 422 - hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit, 423 - PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, 424 - nval); 425 - return 0; 442 + intel_uncore_rmw(ddat->uncore, hwmon->rg.pkg_rapl_limit, 443 + PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval); 444 + exit: 445 + intel_runtime_pm_put(ddat->uncore->rpm, wakeref); 446 + unlock: 447 + mutex_unlock(&hwmon->hwmon_lock); 448 + return ret; 426 449 } 427 450 428 451 static int ··· 495 468 default: 496 469 return -EOPNOTSUPP; 497 470 } 471 + } 472 + 473 + void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old) 474 + { 475 + struct i915_hwmon *hwmon = i915->hwmon; 476 + u32 r; 477 + 478 + if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit)) 479 + return; 480 + 481 + mutex_lock(&hwmon->hwmon_lock); 482 + 483 + hwmon->ddat.reset_in_progress = true; 484 + r = intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit, 485 + PKG_PWR_LIM_1_EN, 0); 486 + *old = !!(r & PKG_PWR_LIM_1_EN); 487 + 488 + mutex_unlock(&hwmon->hwmon_lock); 489 + } 490 + 491 + void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) 492 + { 493 + struct i915_hwmon *hwmon = i915->hwmon; 494 + 495 + if (!hwmon || !i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit)) 496 + return; 497 + 498 + mutex_lock(&hwmon->hwmon_lock); 499 + 500 + intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit, 501 + PKG_PWR_LIM_1_EN, old ? 
PKG_PWR_LIM_1_EN : 0); 502 + hwmon->ddat.reset_in_progress = false; 503 + wake_up_all(&hwmon->ddat.waitq); 504 + 505 + mutex_unlock(&hwmon->hwmon_lock); 498 506 } 499 507 500 508 static umode_t ··· 804 742 ddat->uncore = &i915->uncore; 805 743 snprintf(ddat->name, sizeof(ddat->name), "i915"); 806 744 ddat->gt_n = -1; 745 + init_waitqueue_head(&ddat->waitq); 807 746 808 747 for_each_gt(gt, i915, i) { 809 748 ddat_gt = hwmon->ddat_gt + i;
+7
drivers/gpu/drm/i915/i915_hwmon.h
··· 7 7 #ifndef __I915_HWMON_H__ 8 8 #define __I915_HWMON_H__ 9 9 10 + #include <linux/types.h> 11 + 10 12 struct drm_i915_private; 13 + struct intel_gt; 11 14 12 15 #if IS_REACHABLE(CONFIG_HWMON) 13 16 void i915_hwmon_register(struct drm_i915_private *i915); 14 17 void i915_hwmon_unregister(struct drm_i915_private *i915); 18 + void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old); 19 + void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old); 15 20 #else 16 21 static inline void i915_hwmon_register(struct drm_i915_private *i915) { }; 17 22 static inline void i915_hwmon_unregister(struct drm_i915_private *i915) { }; 23 + static inline void i915_hwmon_power_max_disable(struct drm_i915_private *i915, bool *old) { }; 24 + static inline void i915_hwmon_power_max_restore(struct drm_i915_private *i915, bool old) { }; 18 25 #endif 19 26 20 27 #endif /* __I915_HWMON_H__ */
+11 -6
drivers/gpu/drm/i915/i915_irq.c
··· 2762 2762 2763 2763 static void dg1_irq_reset(struct drm_i915_private *dev_priv) 2764 2764 { 2765 - struct intel_gt *gt = to_gt(dev_priv); 2766 - struct intel_uncore *uncore = gt->uncore; 2765 + struct intel_uncore *uncore = &dev_priv->uncore; 2766 + struct intel_gt *gt; 2767 + unsigned int i; 2767 2768 2768 2769 dg1_master_intr_disable(dev_priv->uncore.regs); 2769 2770 2770 - gen11_gt_irq_reset(gt); 2771 + for_each_gt(gt, dev_priv, i) 2772 + gen11_gt_irq_reset(gt); 2773 + 2771 2774 gen11_display_irq_reset(dev_priv); 2772 2775 2773 2776 GEN3_IRQ_RESET(uncore, GEN11_GU_MISC_); ··· 3428 3425 3429 3426 static void dg1_irq_postinstall(struct drm_i915_private *dev_priv) 3430 3427 { 3431 - struct intel_gt *gt = to_gt(dev_priv); 3432 - struct intel_uncore *uncore = gt->uncore; 3428 + struct intel_uncore *uncore = &dev_priv->uncore; 3433 3429 u32 gu_misc_masked = GEN11_GU_MISC_GSE; 3430 + struct intel_gt *gt; 3431 + unsigned int i; 3434 3432 3435 - gen11_gt_irq_postinstall(gt); 3433 + for_each_gt(gt, dev_priv, i) 3434 + gen11_gt_irq_postinstall(gt); 3436 3435 3437 3436 GEN3_IRQ_INIT(uncore, GEN11_GU_MISC_, ~gu_misc_masked, gu_misc_masked); 3438 3437
+72 -9
drivers/gpu/drm/i915/i915_pci.c
··· 29 29 #include "display/intel_display.h" 30 30 #include "gt/intel_gt_regs.h" 31 31 #include "gt/intel_sa_media.h" 32 + #include "gem/i915_gem_object_types.h" 32 33 33 34 #include "i915_driver.h" 34 35 #include "i915_drv.h" ··· 164 163 .gamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING, \ 165 164 } 166 165 166 + #define LEGACY_CACHELEVEL \ 167 + .cachelevel_to_pat = { \ 168 + [I915_CACHE_NONE] = 0, \ 169 + [I915_CACHE_LLC] = 1, \ 170 + [I915_CACHE_L3_LLC] = 2, \ 171 + [I915_CACHE_WT] = 3, \ 172 + } 173 + 174 + #define TGL_CACHELEVEL \ 175 + .cachelevel_to_pat = { \ 176 + [I915_CACHE_NONE] = 3, \ 177 + [I915_CACHE_LLC] = 0, \ 178 + [I915_CACHE_L3_LLC] = 0, \ 179 + [I915_CACHE_WT] = 2, \ 180 + } 181 + 182 + #define PVC_CACHELEVEL \ 183 + .cachelevel_to_pat = { \ 184 + [I915_CACHE_NONE] = 0, \ 185 + [I915_CACHE_LLC] = 3, \ 186 + [I915_CACHE_L3_LLC] = 3, \ 187 + [I915_CACHE_WT] = 2, \ 188 + } 189 + 190 + #define MTL_CACHELEVEL \ 191 + .cachelevel_to_pat = { \ 192 + [I915_CACHE_NONE] = 2, \ 193 + [I915_CACHE_LLC] = 3, \ 194 + [I915_CACHE_L3_LLC] = 3, \ 195 + [I915_CACHE_WT] = 1, \ 196 + } 197 + 167 198 /* Keep in gen based order, and chronological order within a gen */ 168 199 169 200 #define GEN_DEFAULT_PAGE_SIZES \ ··· 221 188 .has_snoop = true, \ 222 189 .has_coherent_ggtt = false, \ 223 190 .dma_mask_size = 32, \ 191 + .max_pat_index = 3, \ 224 192 I9XX_PIPE_OFFSETS, \ 225 193 I9XX_CURSOR_OFFSETS, \ 226 194 I9XX_COLORS, \ 227 195 GEN_DEFAULT_PAGE_SIZES, \ 228 - GEN_DEFAULT_REGIONS 196 + GEN_DEFAULT_REGIONS, \ 197 + LEGACY_CACHELEVEL 229 198 230 199 #define I845_FEATURES \ 231 200 GEN(2), \ ··· 244 209 .has_snoop = true, \ 245 210 .has_coherent_ggtt = false, \ 246 211 .dma_mask_size = 32, \ 212 + .max_pat_index = 3, \ 247 213 I845_PIPE_OFFSETS, \ 248 214 I845_CURSOR_OFFSETS, \ 249 215 I845_COLORS, \ 250 216 GEN_DEFAULT_PAGE_SIZES, \ 251 - GEN_DEFAULT_REGIONS 217 + GEN_DEFAULT_REGIONS, \ 218 + LEGACY_CACHELEVEL 252 219 253 220 static const struct intel_device_info 
i830_info = { 254 221 I830_FEATURES, ··· 285 248 .has_snoop = true, \ 286 249 .has_coherent_ggtt = true, \ 287 250 .dma_mask_size = 32, \ 251 + .max_pat_index = 3, \ 288 252 I9XX_PIPE_OFFSETS, \ 289 253 I9XX_CURSOR_OFFSETS, \ 290 254 I9XX_COLORS, \ 291 255 GEN_DEFAULT_PAGE_SIZES, \ 292 - GEN_DEFAULT_REGIONS 256 + GEN_DEFAULT_REGIONS, \ 257 + LEGACY_CACHELEVEL 293 258 294 259 static const struct intel_device_info i915g_info = { 295 260 GEN3_FEATURES, ··· 379 340 .has_snoop = true, \ 380 341 .has_coherent_ggtt = true, \ 381 342 .dma_mask_size = 36, \ 343 + .max_pat_index = 3, \ 382 344 I9XX_PIPE_OFFSETS, \ 383 345 I9XX_CURSOR_OFFSETS, \ 384 346 I9XX_COLORS, \ 385 347 GEN_DEFAULT_PAGE_SIZES, \ 386 - GEN_DEFAULT_REGIONS 348 + GEN_DEFAULT_REGIONS, \ 349 + LEGACY_CACHELEVEL 387 350 388 351 static const struct intel_device_info i965g_info = { 389 352 GEN4_FEATURES, ··· 435 394 /* ilk does support rc6, but we do not implement [power] contexts */ \ 436 395 .has_rc6 = 0, \ 437 396 .dma_mask_size = 36, \ 397 + .max_pat_index = 3, \ 438 398 I9XX_PIPE_OFFSETS, \ 439 399 I9XX_CURSOR_OFFSETS, \ 440 400 ILK_COLORS, \ 441 401 GEN_DEFAULT_PAGE_SIZES, \ 442 - GEN_DEFAULT_REGIONS 402 + GEN_DEFAULT_REGIONS, \ 403 + LEGACY_CACHELEVEL 443 404 444 405 static const struct intel_device_info ilk_d_info = { 445 406 GEN5_FEATURES, ··· 471 428 .has_rc6p = 0, \ 472 429 .has_rps = true, \ 473 430 .dma_mask_size = 40, \ 431 + .max_pat_index = 3, \ 474 432 .__runtime.ppgtt_type = INTEL_PPGTT_ALIASING, \ 475 433 .__runtime.ppgtt_size = 31, \ 476 434 I9XX_PIPE_OFFSETS, \ 477 435 I9XX_CURSOR_OFFSETS, \ 478 436 ILK_COLORS, \ 479 437 GEN_DEFAULT_PAGE_SIZES, \ 480 - GEN_DEFAULT_REGIONS 438 + GEN_DEFAULT_REGIONS, \ 439 + LEGACY_CACHELEVEL 481 440 482 441 #define SNB_D_PLATFORM \ 483 442 GEN6_FEATURES, \ ··· 526 481 .has_reset_engine = true, \ 527 482 .has_rps = true, \ 528 483 .dma_mask_size = 40, \ 484 + .max_pat_index = 3, \ 529 485 .__runtime.ppgtt_type = INTEL_PPGTT_ALIASING, \ 530 486 
.__runtime.ppgtt_size = 31, \ 531 487 IVB_PIPE_OFFSETS, \ 532 488 IVB_CURSOR_OFFSETS, \ 533 489 IVB_COLORS, \ 534 490 GEN_DEFAULT_PAGE_SIZES, \ 535 - GEN_DEFAULT_REGIONS 491 + GEN_DEFAULT_REGIONS, \ 492 + LEGACY_CACHELEVEL 536 493 537 494 #define IVB_D_PLATFORM \ 538 495 GEN7_FEATURES, \ ··· 588 541 .display.has_gmch = 1, 589 542 .display.has_hotplug = 1, 590 543 .dma_mask_size = 40, 544 + .max_pat_index = 3, 591 545 .__runtime.ppgtt_type = INTEL_PPGTT_ALIASING, 592 546 .__runtime.ppgtt_size = 31, 593 547 .has_snoop = true, ··· 600 552 I9XX_COLORS, 601 553 GEN_DEFAULT_PAGE_SIZES, 602 554 GEN_DEFAULT_REGIONS, 555 + LEGACY_CACHELEVEL, 603 556 }; 604 557 605 558 #define G75_FEATURES \ ··· 688 639 .has_logical_ring_contexts = 1, 689 640 .display.has_gmch = 1, 690 641 .dma_mask_size = 39, 642 + .max_pat_index = 3, 691 643 .__runtime.ppgtt_type = INTEL_PPGTT_FULL, 692 644 .__runtime.ppgtt_size = 32, 693 645 .has_reset_engine = 1, ··· 700 650 CHV_COLORS, 701 651 GEN_DEFAULT_PAGE_SIZES, 702 652 GEN_DEFAULT_REGIONS, 653 + LEGACY_CACHELEVEL, 703 654 }; 704 655 705 656 #define GEN9_DEFAULT_PAGE_SIZES \ ··· 782 731 .has_snoop = true, \ 783 732 .has_coherent_ggtt = false, \ 784 733 .display.has_ipc = 1, \ 734 + .max_pat_index = 3, \ 785 735 HSW_PIPE_OFFSETS, \ 786 736 IVB_CURSOR_OFFSETS, \ 787 737 IVB_COLORS, \ 788 738 GEN9_DEFAULT_PAGE_SIZES, \ 789 - GEN_DEFAULT_REGIONS 739 + GEN_DEFAULT_REGIONS, \ 740 + LEGACY_CACHELEVEL 790 741 791 742 static const struct intel_device_info bxt_info = { 792 743 GEN9_LP_FEATURES, ··· 942 889 [TRANSCODER_DSI_1] = TRANSCODER_DSI1_OFFSET, \ 943 890 }, \ 944 891 TGL_CURSOR_OFFSETS, \ 892 + TGL_CACHELEVEL, \ 945 893 .has_global_mocs = 1, \ 946 894 .has_pxp = 1, \ 947 - .display.has_dsb = 1 895 + .display.has_dsb = 1, \ 896 + .max_pat_index = 3 948 897 949 898 static const struct intel_device_info tgl_info = { 950 899 GEN12_FEATURES, ··· 1068 1013 .__runtime.graphics.ip.ver = 12, \ 1069 1014 .__runtime.graphics.ip.rel = 50, \ 1070 1015 
XE_HP_PAGE_SIZES, \ 1016 + TGL_CACHELEVEL, \ 1071 1017 .dma_mask_size = 46, \ 1072 1018 .has_3d_pipeline = 1, \ 1073 1019 .has_64bit_reloc = 1, \ ··· 1087 1031 .has_reset_engine = 1, \ 1088 1032 .has_rps = 1, \ 1089 1033 .has_runtime_pm = 1, \ 1034 + .max_pat_index = 3, \ 1090 1035 .__runtime.ppgtt_size = 48, \ 1091 1036 .__runtime.ppgtt_type = INTEL_PPGTT_FULL 1092 1037 ··· 1164 1107 PLATFORM(INTEL_PONTEVECCHIO), 1165 1108 NO_DISPLAY, 1166 1109 .has_flat_ccs = 0, 1110 + .max_pat_index = 7, 1167 1111 .__runtime.platform_engine_mask = 1168 1112 BIT(BCS0) | 1169 1113 BIT(VCS0) | 1170 1114 BIT(CCS0) | BIT(CCS1) | BIT(CCS2) | BIT(CCS3), 1171 1115 .require_force_probe = 1, 1116 + PVC_CACHELEVEL, 1172 1117 }; 1173 1118 1174 1119 #define XE_LPDP_FEATURES \ ··· 1207 1148 .has_flat_ccs = 0, 1208 1149 .has_gmd_id = 1, 1209 1150 .has_guc_deprivilege = 1, 1151 + .has_llc = 0, 1210 1152 .has_mslice_steering = 0, 1211 1153 .has_snoop = 1, 1154 + .max_pat_index = 4, 1155 + .has_pxp = 1, 1212 1156 .__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM, 1213 1157 .__runtime.platform_engine_mask = BIT(RCS0) | BIT(BCS0) | BIT(CCS0), 1214 1158 .require_force_probe = 1, 1159 + MTL_CACHELEVEL, 1215 1160 }; 1216 1161 1217 1162 #undef PLATFORM
+1
drivers/gpu/drm/i915/i915_perf.c
··· 5300 5300 5301 5301 /** 5302 5302 * i915_perf_ioctl_version - Version of the i915-perf subsystem 5303 + * @i915: The i915 device 5303 5304 * 5304 5305 * This version number is used by userspace to detect available features. 5305 5306 */
-4
drivers/gpu/drm/i915/i915_perf_oa_regs.h
··· 134 134 #define GDT_CHICKEN_BITS _MMIO(0x9840) 135 135 #define GT_NOA_ENABLE 0x00000080 136 136 137 - #define GEN12_SQCNT1 _MMIO(0x8718) 138 - #define GEN12_SQCNT1_PMON_ENABLE REG_BIT(30) 139 - #define GEN12_SQCNT1_OABPC REG_BIT(29) 140 - 141 137 /* Gen12 OAM unit */ 142 138 #define GEN12_OAM_HEAD_POINTER_OFFSET (0x1a0) 143 139 #define GEN12_OAM_HEAD_POINTER_MASK 0xffffffc0
+205 -85
drivers/gpu/drm/i915/i915_pmu.c
··· 10 10 #include "gt/intel_engine_pm.h" 11 11 #include "gt/intel_engine_regs.h" 12 12 #include "gt/intel_engine_user.h" 13 + #include "gt/intel_gt.h" 13 14 #include "gt/intel_gt_pm.h" 14 15 #include "gt/intel_gt_regs.h" 15 16 #include "gt/intel_rc6.h" ··· 51 50 return (event->attr.config >> I915_PMU_SAMPLE_BITS) & 0xff; 52 51 } 53 52 54 - static bool is_engine_config(u64 config) 53 + static bool is_engine_config(const u64 config) 55 54 { 56 55 return config < __I915_PMU_OTHER(0); 56 + } 57 + 58 + static unsigned int config_gt_id(const u64 config) 59 + { 60 + return config >> __I915_PMU_GT_SHIFT; 61 + } 62 + 63 + static u64 config_counter(const u64 config) 64 + { 65 + return config & ~(~0ULL << __I915_PMU_GT_SHIFT); 57 66 } 58 67 59 68 static unsigned int other_bit(const u64 config) 60 69 { 61 70 unsigned int val; 62 71 63 - switch (config) { 72 + switch (config_counter(config)) { 64 73 case I915_PMU_ACTUAL_FREQUENCY: 65 74 val = __I915_PMU_ACTUAL_FREQUENCY_ENABLED; 66 75 break; ··· 88 77 return -1; 89 78 } 90 79 91 - return I915_ENGINE_SAMPLE_COUNT + val; 80 + return I915_ENGINE_SAMPLE_COUNT + 81 + config_gt_id(config) * __I915_PMU_TRACKED_EVENT_COUNT + 82 + val; 92 83 } 93 84 94 85 static unsigned int config_bit(const u64 config) ··· 101 88 return other_bit(config); 102 89 } 103 90 104 - static u64 config_mask(u64 config) 91 + static u32 config_mask(const u64 config) 105 92 { 106 - return BIT_ULL(config_bit(config)); 93 + unsigned int bit = config_bit(config); 94 + 95 + if (__builtin_constant_p(config)) 96 + BUILD_BUG_ON(bit > 97 + BITS_PER_TYPE(typeof_member(struct i915_pmu, 98 + enable)) - 1); 99 + else 100 + WARN_ON_ONCE(bit > 101 + BITS_PER_TYPE(typeof_member(struct i915_pmu, 102 + enable)) - 1); 103 + 104 + return BIT(config_bit(config)); 107 105 } 108 106 109 107 static bool is_engine_event(struct perf_event *event) ··· 125 101 static unsigned int event_bit(struct perf_event *event) 126 102 { 127 103 return config_bit(event->attr.config); 104 + } 105 + 106 
+ static u32 frequency_enabled_mask(void) 107 + { 108 + unsigned int i; 109 + u32 mask = 0; 110 + 111 + for (i = 0; i < I915_PMU_MAX_GTS; i++) 112 + mask |= config_mask(__I915_PMU_ACTUAL_FREQUENCY(i)) | 113 + config_mask(__I915_PMU_REQUESTED_FREQUENCY(i)); 114 + 115 + return mask; 128 116 } 129 117 130 118 static bool pmu_needs_timer(struct i915_pmu *pmu, bool gpu_active) ··· 155 119 * Mask out all the ones which do not need the timer, or in 156 120 * other words keep all the ones that could need the timer. 157 121 */ 158 - enable &= config_mask(I915_PMU_ACTUAL_FREQUENCY) | 159 - config_mask(I915_PMU_REQUESTED_FREQUENCY) | 160 - ENGINE_SAMPLE_MASK; 122 + enable &= frequency_enabled_mask() | ENGINE_SAMPLE_MASK; 161 123 162 124 /* 163 125 * When the GPU is idle per-engine counters do not need to be ··· 197 163 return ktime_to_ns(ktime_sub(ktime_get_raw(), kt)); 198 164 } 199 165 166 + static unsigned int 167 + __sample_idx(struct i915_pmu *pmu, unsigned int gt_id, int sample) 168 + { 169 + unsigned int idx = gt_id * __I915_NUM_PMU_SAMPLERS + sample; 170 + 171 + GEM_BUG_ON(idx >= ARRAY_SIZE(pmu->sample)); 172 + 173 + return idx; 174 + } 175 + 176 + static u64 read_sample(struct i915_pmu *pmu, unsigned int gt_id, int sample) 177 + { 178 + return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur; 179 + } 180 + 181 + static void 182 + store_sample(struct i915_pmu *pmu, unsigned int gt_id, int sample, u64 val) 183 + { 184 + pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val; 185 + } 186 + 187 + static void 188 + add_sample_mult(struct i915_pmu *pmu, unsigned int gt_id, int sample, u32 val, u32 mul) 189 + { 190 + pmu->sample[__sample_idx(pmu, gt_id, sample)].cur += mul_u32_u32(val, mul); 191 + } 192 + 200 193 static u64 get_rc6(struct intel_gt *gt) 201 194 { 202 195 struct drm_i915_private *i915 = gt->i915; 196 + const unsigned int gt_id = gt->info.id; 203 197 struct i915_pmu *pmu = &i915->pmu; 204 198 unsigned long flags; 205 199 bool awake = false; ··· 242 180 
spin_lock_irqsave(&pmu->lock, flags); 243 181 244 182 if (awake) { 245 - pmu->sample[__I915_SAMPLE_RC6].cur = val; 183 + store_sample(pmu, gt_id, __I915_SAMPLE_RC6, val); 246 184 } else { 247 185 /* 248 186 * We think we are runtime suspended. ··· 251 189 * on top of the last known real value, as the approximated RC6 252 190 * counter value. 253 191 */ 254 - val = ktime_since_raw(pmu->sleep_last); 255 - val += pmu->sample[__I915_SAMPLE_RC6].cur; 192 + val = ktime_since_raw(pmu->sleep_last[gt_id]); 193 + val += read_sample(pmu, gt_id, __I915_SAMPLE_RC6); 256 194 } 257 195 258 - if (val < pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur) 259 - val = pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur; 196 + if (val < read_sample(pmu, gt_id, __I915_SAMPLE_RC6_LAST_REPORTED)) 197 + val = read_sample(pmu, gt_id, __I915_SAMPLE_RC6_LAST_REPORTED); 260 198 else 261 - pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = val; 199 + store_sample(pmu, gt_id, __I915_SAMPLE_RC6_LAST_REPORTED, val); 262 200 263 201 spin_unlock_irqrestore(&pmu->lock, flags); 264 202 ··· 268 206 static void init_rc6(struct i915_pmu *pmu) 269 207 { 270 208 struct drm_i915_private *i915 = container_of(pmu, typeof(*i915), pmu); 271 - intel_wakeref_t wakeref; 209 + struct intel_gt *gt; 210 + unsigned int i; 272 211 273 - with_intel_runtime_pm(to_gt(i915)->uncore->rpm, wakeref) { 274 - pmu->sample[__I915_SAMPLE_RC6].cur = __get_rc6(to_gt(i915)); 275 - pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = 276 - pmu->sample[__I915_SAMPLE_RC6].cur; 277 - pmu->sleep_last = ktime_get_raw(); 212 + for_each_gt(gt, i915, i) { 213 + intel_wakeref_t wakeref; 214 + 215 + with_intel_runtime_pm(gt->uncore->rpm, wakeref) { 216 + u64 val = __get_rc6(gt); 217 + 218 + store_sample(pmu, i, __I915_SAMPLE_RC6, val); 219 + store_sample(pmu, i, __I915_SAMPLE_RC6_LAST_REPORTED, 220 + val); 221 + pmu->sleep_last[i] = ktime_get_raw(); 222 + } 278 223 } 279 224 } 280 225 281 - static void park_rc6(struct drm_i915_private *i915) 226 + 
 static void park_rc6(struct intel_gt *gt)
 {
-	struct i915_pmu *pmu = &i915->pmu;
+	struct i915_pmu *pmu = &gt->i915->pmu;

-	pmu->sample[__I915_SAMPLE_RC6].cur = __get_rc6(to_gt(i915));
-	pmu->sleep_last = ktime_get_raw();
+	store_sample(pmu, gt->info.id, __I915_SAMPLE_RC6, __get_rc6(gt));
+	pmu->sleep_last[gt->info.id] = ktime_get_raw();
 }

 static void __i915_pmu_maybe_start_timer(struct i915_pmu *pmu)
···
 	}
 }

-void i915_pmu_gt_parked(struct drm_i915_private *i915)
+void i915_pmu_gt_parked(struct intel_gt *gt)
 {
-	struct i915_pmu *pmu = &i915->pmu;
+	struct i915_pmu *pmu = &gt->i915->pmu;

 	if (!pmu->base.event_init)
 		return;

 	spin_lock_irq(&pmu->lock);

-	park_rc6(i915);
+	park_rc6(gt);

 	/*
 	 * Signal sampling timer to stop if only engine events are enabled and
 	 * GPU went idle.
 	 */
-	pmu->timer_enabled = pmu_needs_timer(pmu, false);
+	pmu->unparked &= ~BIT(gt->info.id);
+	if (pmu->unparked == 0)
+		pmu->timer_enabled = pmu_needs_timer(pmu, false);

 	spin_unlock_irq(&pmu->lock);
 }

-void i915_pmu_gt_unparked(struct drm_i915_private *i915)
+void i915_pmu_gt_unparked(struct intel_gt *gt)
 {
-	struct i915_pmu *pmu = &i915->pmu;
+	struct i915_pmu *pmu = &gt->i915->pmu;

 	if (!pmu->base.event_init)
 		return;
···
 	/*
 	 * Re-enable sampling timer when GPU goes active.
 	 */
-	__i915_pmu_maybe_start_timer(pmu);
+	if (pmu->unparked == 0)
+		__i915_pmu_maybe_start_timer(pmu);
+
+	pmu->unparked |= BIT(gt->info.id);

 	spin_unlock_irq(&pmu->lock);
 }
···
 		return;

 	for_each_engine(engine, gt, id) {
+		if (!engine->pmu.enable)
+			continue;
+
 		if (!intel_engine_pm_get_if_awake(engine))
 			continue;
···
 	}
 }

-static void
-add_sample_mult(struct i915_pmu_sample *sample, u32 val, u32 mul)
-{
-	sample->cur += mul_u32_u32(val, mul);
-}
-
-static bool frequency_sampling_enabled(struct i915_pmu *pmu)
+static bool
+frequency_sampling_enabled(struct i915_pmu *pmu, unsigned int gt)
 {
 	return pmu->enable &
-	       (config_mask(I915_PMU_ACTUAL_FREQUENCY) |
-		config_mask(I915_PMU_REQUESTED_FREQUENCY));
+	       (config_mask(__I915_PMU_ACTUAL_FREQUENCY(gt)) |
+		config_mask(__I915_PMU_REQUESTED_FREQUENCY(gt)));
 }

 static void
 frequency_sample(struct intel_gt *gt, unsigned int period_ns)
 {
 	struct drm_i915_private *i915 = gt->i915;
+	const unsigned int gt_id = gt->info.id;
 	struct i915_pmu *pmu = &i915->pmu;
 	struct intel_rps *rps = &gt->rps;

-	if (!frequency_sampling_enabled(pmu))
+	if (!frequency_sampling_enabled(pmu, gt_id))
 		return;

 	/* Report 0/0 (actual/requested) frequency while parked. */
 	if (!intel_gt_pm_get_if_awake(gt))
 		return;

-	if (pmu->enable & config_mask(I915_PMU_ACTUAL_FREQUENCY)) {
+	if (pmu->enable & config_mask(__I915_PMU_ACTUAL_FREQUENCY(gt_id))) {
 		u32 val;

 		/*
···
 		if (!val)
 			val = intel_gpu_freq(rps, rps->cur_freq);

-		add_sample_mult(&pmu->sample[__I915_SAMPLE_FREQ_ACT],
+		add_sample_mult(pmu, gt_id, __I915_SAMPLE_FREQ_ACT,
 				val, period_ns / 1000);
 	}

-	if (pmu->enable & config_mask(I915_PMU_REQUESTED_FREQUENCY)) {
-		add_sample_mult(&pmu->sample[__I915_SAMPLE_FREQ_REQ],
+	if (pmu->enable & config_mask(__I915_PMU_REQUESTED_FREQUENCY(gt_id))) {
+		add_sample_mult(pmu, gt_id, __I915_SAMPLE_FREQ_REQ,
 				intel_rps_get_requested_frequency(rps),
 				period_ns / 1000);
 	}
···
 	struct drm_i915_private *i915 =
 		container_of(hrtimer, struct drm_i915_private, pmu.timer);
 	struct i915_pmu *pmu = &i915->pmu;
-	struct intel_gt *gt = to_gt(i915);
 	unsigned int period_ns;
+	struct intel_gt *gt;
+	unsigned int i;
 	ktime_t now;

 	if (!READ_ONCE(pmu->timer_enabled))
···
 	 * grabbing the forcewake. However the potential error from timer call-
 	 * back delay greatly dominates this so we keep it simple.
 	 */
-	engines_sample(gt, period_ns);
-	frequency_sample(gt, period_ns);
+
+	for_each_gt(gt, i915, i) {
+		if (!(pmu->unparked & BIT(i)))
+			continue;
+
+		engines_sample(gt, period_ns);
+		frequency_sample(gt, period_ns);
+	}

 	hrtimer_forward(hrtimer, now, ns_to_ktime(PERIOD));
···
 {
 	struct intel_gt *gt = to_gt(i915);
+	unsigned int gt_id = config_gt_id(config);
+	unsigned int max_gt_id = HAS_EXTRA_GT_LIST(i915) ? 1 : 0;
+
+	if (gt_id > max_gt_id)
+		return -ENOENT;

-	switch (config) {
+	switch (config_counter(config)) {
 	case I915_PMU_ACTUAL_FREQUENCY:
 		if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
 			/* Requires a mutex for sampling! */
···
 			return -ENODEV;
 		break;
 	case I915_PMU_INTERRUPTS:
+		if (gt_id)
+			return -ENOENT;
 		break;
 	case I915_PMU_RC6_RESIDENCY:
 		if (!gt->rc6.supported)
···
 			val = engine->pmu.sample[sample].cur;
 		}
 	} else {
-		switch (event->attr.config) {
+		const unsigned int gt_id = config_gt_id(event->attr.config);
+		const u64 config = config_counter(event->attr.config);
+
+		switch (config) {
 		case I915_PMU_ACTUAL_FREQUENCY:
 			val =
-			   div_u64(pmu->sample[__I915_SAMPLE_FREQ_ACT].cur,
+			   div_u64(read_sample(pmu, gt_id,
+					       __I915_SAMPLE_FREQ_ACT),
 				   USEC_PER_SEC /* to MHz */);
 			break;
 		case I915_PMU_REQUESTED_FREQUENCY:
 			val =
-			   div_u64(pmu->sample[__I915_SAMPLE_FREQ_REQ].cur,
+			   div_u64(read_sample(pmu, gt_id,
+					       __I915_SAMPLE_FREQ_REQ),
 				   USEC_PER_SEC /* to MHz */);
 			break;
 		case I915_PMU_INTERRUPTS:
 			val = READ_ONCE(pmu->irq_count);
 			break;
 		case I915_PMU_RC6_RESIDENCY:
-			val = get_rc6(to_gt(i915));
+			val = get_rc6(i915->gt[gt_id]);
 			break;
 		case I915_PMU_SOFTWARE_GT_AWAKE_TIME:
 			val = ktime_to_ns(intel_gt_get_awake_time(to_gt(i915)));
···
 {
 	struct drm_i915_private *i915 =
 		container_of(event->pmu, typeof(*i915), pmu.base);
+	const unsigned int bit = event_bit(event);
 	struct i915_pmu *pmu = &i915->pmu;
 	unsigned long flags;
-	unsigned int bit;

-	bit = event_bit(event);
 	if (bit == -1)
 		goto update;
···
 	GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count));
 	GEM_BUG_ON(pmu->enable_count[bit] == ~0);

-	pmu->enable |= BIT_ULL(bit);
+	pmu->enable |= BIT(bit);
 	pmu->enable_count[bit]++;

 	/*
···
 {
 	struct drm_i915_private *i915 =
 		container_of(event->pmu, typeof(*i915), pmu.base);
-	unsigned int bit = event_bit(event);
+	const unsigned int bit = event_bit(event);
 	struct i915_pmu *pmu = &i915->pmu;
 	unsigned long flags;

···
 	 * bitmask when the last listener on an event goes away.
 	 */
 	if (--pmu->enable_count[bit] == 0) {
-		pmu->enable &= ~BIT_ULL(bit);
+		pmu->enable &= ~BIT(bit);
 		pmu->timer_enabled &= pmu_needs_timer(pmu, true);
 	}

···
 	.attrs = i915_cpumask_attrs,
 };

-#define __event(__config, __name, __unit) \
+#define __event(__counter, __name, __unit) \
 { \
-	.config = (__config), \
+	.counter = (__counter), \
 	.name = (__name), \
 	.unit = (__unit), \
+	.global = false, \
+}
+
+#define __global_event(__counter, __name, __unit) \
+{ \
+	.counter = (__counter), \
+	.name = (__name), \
+	.unit = (__unit), \
+	.global = true, \
 }

 #define __engine_event(__sample, __name) \
···
 {
 	struct drm_i915_private *i915 = container_of(pmu, typeof(*i915), pmu);
 	static const struct {
-		u64 config;
+		unsigned int counter;
 		const char *name;
 		const char *unit;
+		bool global;
 	} events[] = {
-		__event(I915_PMU_ACTUAL_FREQUENCY, "actual-frequency", "M"),
-		__event(I915_PMU_REQUESTED_FREQUENCY, "requested-frequency", "M"),
-		__event(I915_PMU_INTERRUPTS, "interrupts", NULL),
-		__event(I915_PMU_RC6_RESIDENCY, "rc6-residency", "ns"),
-		__event(I915_PMU_SOFTWARE_GT_AWAKE_TIME, "software-gt-awake-time", "ns"),
+		__event(0, "actual-frequency", "M"),
+		__event(1, "requested-frequency", "M"),
+		__global_event(2, "interrupts", NULL),
+		__event(3, "rc6-residency", "ns"),
+		__event(4, "software-gt-awake-time", "ns"),
 	};
 	static const struct {
 		enum drm_i915_pmu_engine_sample sample;
···
 	struct i915_ext_attribute *i915_attr = NULL, *i915_iter;
 	struct attribute **attr = NULL, **attr_iter;
 	struct intel_engine_cs *engine;
-	unsigned int i;
+	struct intel_gt *gt;
+	unsigned int i, j;

 	/* Count how many counters we will be exposing. */
-	for (i = 0; i < ARRAY_SIZE(events); i++) {
-		if (!config_status(i915, events[i].config))
-			count++;
+	for_each_gt(gt, i915, j) {
+		for (i = 0; i < ARRAY_SIZE(events); i++) {
+			u64 config = ___I915_PMU_OTHER(j, events[i].counter);
+
+			if (!config_status(i915, config))
+				count++;
+		}
 	}

 	for_each_uabi_engine(engine, i915) {
···
 	attr_iter = attr;

 	/* Initialize supported non-engine counters. */
-	for (i = 0; i < ARRAY_SIZE(events); i++) {
-		char *str;
+	for_each_gt(gt, i915, j) {
+		for (i = 0; i < ARRAY_SIZE(events); i++) {
+			u64 config = ___I915_PMU_OTHER(j, events[i].counter);
+			char *str;

-		if (config_status(i915, events[i].config))
-			continue;
+			if (config_status(i915, config))
+				continue;

-		str = kstrdup(events[i].name, GFP_KERNEL);
-		if (!str)
-			goto err;
-
-		*attr_iter++ = &i915_iter->attr.attr;
-		i915_iter = add_i915_attr(i915_iter, str, events[i].config);
-
-		if (events[i].unit) {
-			str = kasprintf(GFP_KERNEL, "%s.unit", events[i].name);
+			if (events[i].global || !HAS_EXTRA_GT_LIST(i915))
+				str = kstrdup(events[i].name, GFP_KERNEL);
+			else
+				str = kasprintf(GFP_KERNEL, "%s-gt%u",
+						events[i].name, j);
 			if (!str)
 				goto err;

-			*attr_iter++ = &pmu_iter->attr.attr;
-			pmu_iter = add_pmu_attr(pmu_iter, str, events[i].unit);
+			*attr_iter++ = &i915_iter->attr.attr;
+			i915_iter = add_i915_attr(i915_iter, str, config);
+
+			if (events[i].unit) {
+				if (events[i].global || !HAS_EXTRA_GT_LIST(i915))
+					str = kasprintf(GFP_KERNEL, "%s.unit",
+							events[i].name);
+				else
+					str = kasprintf(GFP_KERNEL, "%s-gt%u.unit",
+							events[i].name, j);
+				if (!str)
+					goto err;
+
+				*attr_iter++ = &pmu_iter->attr.attr;
+				pmu_iter = add_pmu_attr(pmu_iter, str,
+							events[i].unit);
+			}
 		}
 	}
+18 -10
drivers/gpu/drm/i915/i915_pmu.h
···
 #include <uapi/drm/i915_drm.h>

 struct drm_i915_private;
+struct intel_gt;

-/**
+/*
  * Non-engine events that we need to track enabled-disabled transition and
  * current state.
  */
···
 	__I915_PMU_TRACKED_EVENT_COUNT, /* count marker */
 };

-/**
+/*
  * Slots used from the sampling timer (non-engine events) with some extras for
  * convenience.
  */
···
 	__I915_NUM_PMU_SAMPLERS
 };

-/**
+#define I915_PMU_MAX_GTS 2
+
+/*
  * How many different events we track in the global PMU mask.
  *
  * It is also used to know the needed number of event reference counters.
  */
 #define I915_PMU_MASK_BITS \
-	(I915_ENGINE_SAMPLE_COUNT + __I915_PMU_TRACKED_EVENT_COUNT)
+	(I915_ENGINE_SAMPLE_COUNT + \
+	 I915_PMU_MAX_GTS * __I915_PMU_TRACKED_EVENT_COUNT)

 #define I915_ENGINE_SAMPLE_COUNT (I915_SAMPLE_SEMA + 1)
···
 	 * @lock: Lock protecting enable mask and ref count handling.
 	 */
 	spinlock_t lock;
+	/**
+	 * @unparked: GT unparked mask.
+	 */
+	unsigned int unparked;
 	/**
 	 * @timer: Timer for internal i915 PMU sampling.
 	 */
···
 	 * Only global counters are held here, while the per-engine ones are in
 	 * struct intel_engine_cs.
 	 */
-	struct i915_pmu_sample sample[__I915_NUM_PMU_SAMPLERS];
+	struct i915_pmu_sample sample[I915_PMU_MAX_GTS * __I915_NUM_PMU_SAMPLERS];
 	/**
 	 * @sleep_last: Last time GT parked for RC6 estimation.
 	 */
-	ktime_t sleep_last;
+	ktime_t sleep_last[I915_PMU_MAX_GTS];
 	/**
 	 * @irq_count: Number of interrupts
 	 *
···
 void i915_pmu_exit(void);
 void i915_pmu_register(struct drm_i915_private *i915);
 void i915_pmu_unregister(struct drm_i915_private *i915);
-void i915_pmu_gt_parked(struct drm_i915_private *i915);
-void i915_pmu_gt_unparked(struct drm_i915_private *i915);
+void i915_pmu_gt_parked(struct intel_gt *gt);
+void i915_pmu_gt_unparked(struct intel_gt *gt);
 #else
 static inline int i915_pmu_init(void) { return 0; }
 static inline void i915_pmu_exit(void) {}
 static inline void i915_pmu_register(struct drm_i915_private *i915) {}
 static inline void i915_pmu_unregister(struct drm_i915_private *i915) {}
-static inline void i915_pmu_gt_parked(struct drm_i915_private *i915) {}
-static inline void i915_pmu_gt_unparked(struct drm_i915_private *i915) {}
+static inline void i915_pmu_gt_parked(struct intel_gt *gt) {}
+static inline void i915_pmu_gt_unparked(struct intel_gt *gt) {}
 #endif

 #endif
+26 -26
drivers/gpu/drm/i915/i915_request.h
···
 	I915_FENCE_FLAG_COMPOSITE,
 };

-/**
+/*
  * Request queue structure.
  *
  * The request queue allows us to note sequence numbers that have been emitted
···

 	struct drm_i915_private *i915;

-	/**
+	/*
 	 * Context and ring buffer related to this request
 	 * Contexts are refcounted, so when this request is associated with a
 	 * context, we must increment the context's refcount, to guarantee that
···
 	};
 	struct llist_head execute_cb;
 	struct i915_sw_fence semaphore;
-	/**
-	 * @submit_work: complete submit fence from an IRQ if needed for
-	 * locking hierarchy reasons.
+	/*
+	 * complete submit fence from an IRQ if needed for locking hierarchy
+	 * reasons.
 	 */
 	struct irq_work submit_work;

···
 	 */
 	const u32 *hwsp_seqno;

-	/** Position in the ring of the start of the request */
+	/* Position in the ring of the start of the request */
 	u32 head;

-	/** Position in the ring of the start of the user packets */
+	/* Position in the ring of the start of the user packets */
 	u32 infix;

-	/**
+	/*
 	 * Position in the ring of the start of the postfix.
 	 * This is required to calculate the maximum available ring space
 	 * without overwriting the postfix.
 	 */
 	u32 postfix;

-	/** Position in the ring of the end of the whole request */
+	/* Position in the ring of the end of the whole request */
 	u32 tail;

-	/** Position in the ring of the end of any workarounds after the tail */
+	/* Position in the ring of the end of any workarounds after the tail */
 	u32 wa_tail;

-	/** Preallocate space in the ring for emitting the request */
+	/* Preallocate space in the ring for emitting the request */
 	u32 reserved_space;

-	/** Batch buffer pointer for selftest internal use. */
+	/* Batch buffer pointer for selftest internal use. */
 	I915_SELFTEST_DECLARE(struct i915_vma *batch);

 	struct i915_vma_resource *batch_res;

 #if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
-	/**
+	/*
 	 * Additional buffers requested by userspace to be captured upon
 	 * a GPU hang. The vma/obj on this list are protected by their
 	 * active reference - all objects on this list must also be
···
 	struct i915_capture_list *capture_list;
 #endif

-	/** Time at which this request was emitted, in jiffies. */
+	/* Time at which this request was emitted, in jiffies. */
 	unsigned long emitted_jiffies;

-	/** timeline->request entry for this request */
+	/* timeline->request entry for this request */
 	struct list_head link;

-	/** Watchdog support fields. */
+	/* Watchdog support fields. */
 	struct i915_request_watchdog {
 		struct llist_node link;
 		struct hrtimer timer;
 	} watchdog;

-	/**
-	 * @guc_fence_link: Requests may need to be stalled when using GuC
-	 * submission waiting for certain GuC operations to complete. If that is
-	 * the case, stalled requests are added to a per context list of stalled
-	 * requests. The below list_head is the link in that list. Protected by
+	/*
+	 * Requests may need to be stalled when using GuC submission waiting for
+	 * certain GuC operations to complete. If that is the case, stalled
+	 * requests are added to a per context list of stalled requests. The
+	 * below list_head is the link in that list. Protected by
 	 * ce->guc_state.lock.
 	 */
 	struct list_head guc_fence_link;

-	/**
-	 * @guc_prio: Priority level while the request is in flight. Differs
+	/*
+	 * Priority level while the request is in flight. Differs
 	 * from i915 scheduler priority. See comment above
 	 * I915_SCHEDULER_CAP_STATIC_PRIORITY_MAP for details. Protected by
 	 * ce->guc_active.lock. Two special values (GUC_PRIO_INIT and
···
 #define	GUC_PRIO_FINI	0xfe
 	u8 guc_prio;

-	/**
-	 * @hucq: wait queue entry used to wait on the HuC load to complete
+	/*
+	 * wait queue entry used to wait on the HuC load to complete
 	 */
 	wait_queue_entry_t hucq;

···
 	return test_bit(I915_FENCE_FLAG_INITIAL_BREADCRUMB, &rq->fence.flags);
 }

-/**
+/*
  * Returns true if seq1 is later than seq2.
  */
 static inline bool i915_seqno_passed(u32 seq1, u32 seq2)
+4 -5
drivers/gpu/drm/i915/i915_scatterlist.h
···
 	 */
 struct i915_refct_sgt_ops {
 	/**
-	 * release() - Free the memory of the struct i915_refct_sgt
-	 * @ref: struct kref that is embedded in the struct i915_refct_sgt
+	 * @release: Free the memory of the struct i915_refct_sgt
 	 */
 	void (*release)(struct kref *ref);
 };
···

 /**
  * i915_refct_sgt_put - Put a refcounted sg-table
- * @rsgt the struct i915_refct_sgt to put.
+ * @rsgt: the struct i915_refct_sgt to put.
  */
 static inline void i915_refct_sgt_put(struct i915_refct_sgt *rsgt)
 {
···

 /**
  * i915_refct_sgt_get - Get a refcounted sg-table
- * @rsgt the struct i915_refct_sgt to get.
+ * @rsgt: the struct i915_refct_sgt to get.
  */
 static inline struct i915_refct_sgt *
 i915_refct_sgt_get(struct i915_refct_sgt *rsgt)
···
 /**
  * __i915_refct_sgt_init - Initialize a refcounted sg-list with a custom
  * operations structure
- * @rsgt The struct i915_refct_sgt to initialize.
+ * @rsgt: The struct i915_refct_sgt to initialize.
  * @size: Size in bytes of the underlying memory buffer.
  * @ops: A customized operations structure in case the refcounted sg-list
  * is embedded into another structure.
+1 -1
drivers/gpu/drm/i915/i915_utils.h
···
 	}
 }

-/**
+/*
  * __wait_for - magic wait macro
  *
  * Macro to help avoid open coding check/wait/timeout patterns. Note that it's
+8 -8
drivers/gpu/drm/i915/i915_vma.c
···
 	struct i915_vma_resource *vma_res;
 	struct drm_i915_gem_object *obj;
 	struct i915_sw_dma_fence_cb cb;
-	enum i915_cache_level cache_level;
+	unsigned int pat_index;
 	unsigned int flags;
 };

···
 		return;

 	vma_res->ops->bind_vma(vma_res->vm, &vw->stash,
-			       vma_res, vw->cache_level, vw->flags);
+			       vma_res, vw->pat_index, vw->flags);
 }

 static void __vma_release(struct dma_fence_work *work)
···
 /**
  * i915_vma_bind - Sets up PTEs for a VMA in its corresponding address space.
  * @vma: VMA to map
- * @cache_level: mapping cache level
+ * @pat_index: PAT index to set in PTE
  * @flags: flags like global or local mapping
  * @work: preallocated worker for allocating and binding the PTE
  * @vma_res: pointer to a preallocated vma resource. The resource is either
···
  * Note that DMA addresses are also the only part of the SG table we care about.
  */
 int i915_vma_bind(struct i915_vma *vma,
-		  enum i915_cache_level cache_level,
+		  unsigned int pat_index,
 		  u32 flags,
 		  struct i915_vma_work *work,
 		  struct i915_vma_resource *vma_res)
···
 		struct dma_fence *prev;

 		work->vma_res = i915_vma_resource_get(vma->resource);
-		work->cache_level = cache_level;
+		work->pat_index = pat_index;
 		work->flags = bind_flags;

 		/*
···

 			return ret;
 		}
-		vma->ops->bind_vma(vma->vm, NULL, vma->resource, cache_level,
+		vma->ops->bind_vma(vma->vm, NULL, vma->resource, pat_index,
 				   bind_flags);
 	}

···
 		color = 0;

 	if (i915_vm_has_cache_coloring(vma->vm))
-		color = vma->obj->cache_level;
+		color = vma->obj->pat_index;

 	if (flags & PIN_OFFSET_FIXED) {
 		u64 offset = flags & PIN_OFFSET_MASK;
···

 	GEM_BUG_ON(!vma->pages);
 	err = i915_vma_bind(vma,
-			    vma->obj->cache_level,
+			    vma->obj->pat_index,
 			    flags, work, vma_res);
 	vma_res = NULL;
 	if (err)
+2 -2
drivers/gpu/drm/i915/i915_vma.h
···
 }

 /**
- * i915_vma_offset - Obtain the va range size of the vma
+ * i915_vma_size - Obtain the va range size of the vma
  * @vma: The vma
  *
  * GPU virtual address space may be allocated with padding. This
···

 struct i915_vma_work *i915_vma_work(void);
 int i915_vma_bind(struct i915_vma *vma,
-		  enum i915_cache_level cache_level,
+		  unsigned int pat_index,
 		  u32 flags,
 		  struct i915_vma_work *work,
 		  struct i915_vma_resource *vma_res);
+28 -18
drivers/gpu/drm/i915/i915_vma_resource.h
···
 };

 /**
+ * struct i915_vma_bindinfo - Information needed for async bind
+ * only but that can be dropped after the bind has taken place.
+ * Consider making this a separate argument to the bind_vma
+ * op, coalescing with other arguments like vm, stash, cache_level
+ * and flags
+ * @pages: The pages sg-table.
+ * @page_sizes: Page sizes of the pages.
+ * @pages_rsgt: Refcounted sg-table when delayed object destruction
+ * is supported. May be NULL.
+ * @readonly: Whether the vma should be bound read-only.
+ * @lmem: Whether the vma points to lmem.
+ */
+struct i915_vma_bindinfo {
+	struct sg_table *pages;
+	struct i915_page_sizes page_sizes;
+	struct i915_refct_sgt *pages_rsgt;
+	bool readonly:1;
+	bool lmem:1;
+};
+
+/**
  * struct i915_vma_resource - Snapshotted unbind information.
  * @unbind_fence: Fence to mark unbinding complete. Note that this fence
  * is not considered published until unbind is scheduled, and as such it
···
  * @chain: Pointer to struct i915_sw_fence used to await dependencies.
  * @rb: Rb node for the vm's pending unbind interval tree.
  * @__subtree_last: Interval tree private member.
+ * @wakeref: wakeref.
  * @vm: non-refcounted pointer to the vm. This is for internal use only and
  * this member is cleared after vm_resource unbind.
  * @mr: The memory region of the object pointed to by the vma.
···
 	intel_wakeref_t wakeref;

 	/**
-	 * struct i915_vma_bindinfo - Information needed for async bind
-	 * only but that can be dropped after the bind has taken place.
-	 * Consider making this a separate argument to the bind_vma
-	 * op, coalescing with other arguments like vm, stash, cache_level
-	 * and flags
-	 * @pages: The pages sg-table.
-	 * @page_sizes: Page sizes of the pages.
-	 * @pages_rsgt: Refcounted sg-table when delayed object destruction
-	 * is supported. May be NULL.
-	 * @readonly: Whether the vma should be bound read-only.
-	 * @lmem: Whether the vma points to lmem.
+	 * @bi: Information needed for async bind only but that can be dropped
+	 * after the bind has taken place.
+	 *
+	 * Consider making this a separate argument to the bind_vma op,
+	 * coalescing with other arguments like vm, stash, cache_level and flags
 	 */
-	struct i915_vma_bindinfo {
-		struct sg_table *pages;
-		struct i915_page_sizes page_sizes;
-		struct i915_refct_sgt *pages_rsgt;
-		bool readonly:1;
-		bool lmem:1;
-	} bi;
+	struct i915_vma_bindinfo bi;

 #if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
 	struct intel_memory_region *mr;
-2
drivers/gpu/drm/i915/i915_vma_types.h
···

 #include "gem/i915_gem_object_types.h"

-enum i915_cache_level;
-
 /**
  * DOC: Global GTT views
  *
+5
drivers/gpu/drm/i915/intel_device_info.h
···
 #include "gt/intel_context_types.h"
 #include "gt/intel_sseu.h"

+#include "gem/i915_gem_object_types.h"
+
 struct drm_printer;
 struct drm_i915_private;
 struct intel_gt_definition;
···
 	 * Initial runtime info. Do not access outside of i915_driver_create().
 	 */
 	const struct intel_runtime_info __runtime;
+
+	u32 cachelevel_to_pat[I915_MAX_CACHE_LEVEL];
+	u32 max_pat_index;
 };

 struct intel_driver_caps {
+79 -25
drivers/gpu/drm/i915/pxp/intel_pxp.c
···
 #include "i915_drv.h"

 #include "intel_pxp.h"
+#include "intel_pxp_gsccs.h"
 #include "intel_pxp_irq.h"
+#include "intel_pxp_regs.h"
 #include "intel_pxp_session.h"
 #include "intel_pxp_tee.h"
 #include "intel_pxp_types.h"
···
 	return IS_ENABLED(CONFIG_DRM_I915_PXP) && pxp && pxp->arb_is_valid;
 }

-/* KCR register definitions */
-#define KCR_INIT _MMIO(0x320f0)
-/* Setting KCR Init bit is required after system boot */
-#define KCR_INIT_ALLOW_DISPLAY_ME_WRITES REG_BIT(14)
-
-static void kcr_pxp_enable(struct intel_gt *gt)
+static void kcr_pxp_set_status(const struct intel_pxp *pxp, bool enable)
 {
-	intel_uncore_write(gt->uncore, KCR_INIT,
-			   _MASKED_BIT_ENABLE(KCR_INIT_ALLOW_DISPLAY_ME_WRITES));
+	u32 val = enable ? _MASKED_BIT_ENABLE(KCR_INIT_ALLOW_DISPLAY_ME_WRITES) :
+		  _MASKED_BIT_DISABLE(KCR_INIT_ALLOW_DISPLAY_ME_WRITES);
+
+	intel_uncore_write(pxp->ctrl_gt->uncore, KCR_INIT(pxp->kcr_base), val);
 }

-static void kcr_pxp_disable(struct intel_gt *gt)
+static void kcr_pxp_enable(const struct intel_pxp *pxp)
 {
-	intel_uncore_write(gt->uncore, KCR_INIT,
-			   _MASKED_BIT_DISABLE(KCR_INIT_ALLOW_DISPLAY_ME_WRITES));
+	kcr_pxp_set_status(pxp, true);
+}
+
+static void kcr_pxp_disable(const struct intel_pxp *pxp)
+{
+	kcr_pxp_set_status(pxp, false);
 }

 static int create_vcs_context(struct intel_pxp *pxp)
···
 	init_completion(&pxp->termination);
 	complete_all(&pxp->termination);

+	if (pxp->ctrl_gt->type == GT_MEDIA)
+		pxp->kcr_base = MTL_KCR_BASE;
+	else
+		pxp->kcr_base = GEN12_KCR_BASE;
+
 	intel_pxp_session_management_init(pxp);

 	ret = create_vcs_context(pxp);
 	if (ret)
 		return;

-	ret = intel_pxp_tee_component_init(pxp);
+	if (HAS_ENGINE(pxp->ctrl_gt, GSC0))
+		ret = intel_pxp_gsccs_init(pxp);
+	else
+		ret = intel_pxp_tee_component_init(pxp);
 	if (ret)
 		goto out_context;

···
 	/*
 	 * For MTL onwards, PXP-controller-GT needs to have a valid GSC engine
 	 * on the media GT. NOTE: if we have a media-tile with a GSC-engine,
-	 * the VDBOX is already present so skip that check
+	 * the VDBOX is already present so skip that check. We also have to
+	 * ensure the GSC and HUC firmware are coming online
 	 */
-	if (i915->media_gt && HAS_ENGINE(i915->media_gt, GSC0))
+	if (i915->media_gt && HAS_ENGINE(i915->media_gt, GSC0) &&
+	    intel_uc_fw_is_loadable(&i915->media_gt->uc.gsc.fw) &&
+	    intel_uc_fw_is_loadable(&i915->media_gt->uc.huc.fw))
 		return i915->media_gt;

 	/*
···
 	if (!i915->pxp)
 		return -ENOMEM;

+	/* init common info used by all feature-mode usages */
 	i915->pxp->ctrl_gt = gt;
+	mutex_init(&i915->pxp->tee_mutex);

 	/*
 	 * If full PXP feature is not available but HuC is loaded by GSC on pre-MTL
···

 	i915->pxp->arb_is_valid = false;

-	intel_pxp_tee_component_fini(i915->pxp);
+	if (HAS_ENGINE(i915->pxp->ctrl_gt, GSC0))
+		intel_pxp_gsccs_fini(i915->pxp);
+	else
+		intel_pxp_tee_component_fini(i915->pxp);

 	destroy_vcs_context(i915->pxp);

···
 	return bound;
 }

+int intel_pxp_get_backend_timeout_ms(struct intel_pxp *pxp)
+{
+	if (HAS_ENGINE(pxp->ctrl_gt, GSC0))
+		return GSCFW_MAX_ROUND_TRIP_LATENCY_MS;
+	else
+		return 250;
+}
+
 static int __pxp_global_teardown_final(struct intel_pxp *pxp)
 {
+	int timeout;
+
 	if (!pxp->arb_is_valid)
 		return 0;
 	/*
···
 	intel_pxp_mark_termination_in_progress(pxp);
 	intel_pxp_terminate(pxp, false);

-	if (!wait_for_completion_timeout(&pxp->termination, msecs_to_jiffies(250)))
+	timeout = intel_pxp_get_backend_timeout_ms(pxp);
+
+	if (!wait_for_completion_timeout(&pxp->termination, msecs_to_jiffies(timeout)))
 		return -ETIMEDOUT;

 	return 0;
···

 static int __pxp_global_teardown_restart(struct intel_pxp *pxp)
 {
+	int timeout;
+
 	if (pxp->arb_is_valid)
 		return 0;
 	/*
···
 	 */
 	pxp_queue_termination(pxp);

-	if (!wait_for_completion_timeout(&pxp->termination, msecs_to_jiffies(250)))
+	timeout = intel_pxp_get_backend_timeout_ms(pxp);
+
+	if (!wait_for_completion_timeout(&pxp->termination, msecs_to_jiffies(timeout)))
 		return -ETIMEDOUT;

 	return 0;
···
 }

 /*
+ * This helper is used both by intel_pxp_start and by
+ * the GET_PARAM IOCTL that user space calls. Thus, the
+ * return values here should match the UAPI spec.
+ */
+int intel_pxp_get_readiness_status(struct intel_pxp *pxp)
+{
+	if (!intel_pxp_is_enabled(pxp))
+		return -ENODEV;
+
+	if (HAS_ENGINE(pxp->ctrl_gt, GSC0)) {
+		if (wait_for(intel_pxp_gsccs_is_ready_for_sessions(pxp), 250))
+			return 2;
+	} else {
+		if (wait_for(pxp_component_bound(pxp), 250))
+			return 2;
+	}
+	return 1;
+}
+
+/*
  * the arb session is restarted from the irq work when we receive the
  * termination completion interrupt
  */
···
 {
 	int ret = 0;

-	if (!intel_pxp_is_enabled(pxp))
-		return -ENODEV;
-
-	if (wait_for(pxp_component_bound(pxp), 250))
-		return -ENXIO;
+	ret = intel_pxp_get_readiness_status(pxp);
+	if (ret < 0)
+		return ret;
+	else if (ret > 1)
+		return -EIO; /* per UAPI spec, user may retry later */

 	mutex_lock(&pxp->arb_mutex);

···

 void intel_pxp_init_hw(struct intel_pxp *pxp)
 {
-	kcr_pxp_enable(pxp->ctrl_gt);
+	kcr_pxp_enable(pxp);
 	intel_pxp_irq_enable(pxp);
 }

 void intel_pxp_fini_hw(struct intel_pxp *pxp)
 {
-	kcr_pxp_disable(pxp->ctrl_gt);
-
+	kcr_pxp_disable(pxp);
 	intel_pxp_irq_disable(pxp);
 }

+2
drivers/gpu/drm/i915/pxp/intel_pxp.h
···
 void intel_pxp_mark_termination_in_progress(struct intel_pxp *pxp);
 void intel_pxp_tee_end_arb_fw_session(struct intel_pxp *pxp, u32 arb_session_id);

+int intel_pxp_get_readiness_status(struct intel_pxp *pxp);
+int intel_pxp_get_backend_timeout_ms(struct intel_pxp *pxp);
 int intel_pxp_start(struct intel_pxp *pxp);
 void intel_pxp_end(struct intel_pxp *pxp);

+24
drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_43.h
···

 /* PXP-Cmd-Op definitions */
 #define PXP43_CMDID_START_HUC_AUTH 0x0000003A
+#define PXP43_CMDID_INIT_SESSION 0x00000036
+
+/* PXP-Packet sizes for MTL's GSCCS-HECI instruction */
+#define PXP43_MAX_HECI_INOUT_SIZE (SZ_32K)

 /* PXP-Input-Packet: HUC-Authentication */
 struct pxp43_start_huc_auth_in {
···
 /* PXP-Output-Packet: HUC-Authentication */
 struct pxp43_start_huc_auth_out {
 	struct pxp_cmd_header header;
+} __packed;
+
+/* PXP-Input-Packet: Init PXP session */
+struct pxp43_create_arb_in {
+	struct pxp_cmd_header header;
+		/* header.stream_id fields for version 4.3 of Init PXP session: */
+		#define PXP43_INIT_SESSION_VALID BIT(0)
+		#define PXP43_INIT_SESSION_APPTYPE BIT(1)
+		#define PXP43_INIT_SESSION_APPID GENMASK(17, 2)
+	u32 protection_mode;
+		#define PXP43_INIT_SESSION_PROTECTION_ARB 0x2
+	u32 sub_session_id;
+	u32 init_flags;
+	u32 rsvd[12];
+} __packed;
+
+/* PXP-Output-Packet: Init PXP session */
+struct pxp43_create_arb_out {
+	struct pxp_cmd_header header;
+	u32 rsvd[8];
 } __packed;

 #endif /* __INTEL_PXP_FW_INTERFACE_43_H__ */
+5 -1
drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
···

 #include "intel_pxp.h"
 #include "intel_pxp_debugfs.h"
+#include "intel_pxp_gsccs.h"
 #include "intel_pxp_irq.h"
 #include "intel_pxp_types.h"

···
 {
 	struct intel_pxp *pxp = data;
 	struct intel_gt *gt = pxp->ctrl_gt;
+	int timeout_ms;

 	if (!intel_pxp_is_active(pxp))
 		return -ENODEV;
···
 	intel_pxp_irq_handler(pxp, GEN12_DISPLAY_PXP_STATE_TERMINATED_INTERRUPT);
 	spin_unlock_irq(gt->irq_lock);

+	timeout_ms = intel_pxp_get_backend_timeout_ms(pxp);
+
 	if (!wait_for_completion_timeout(&pxp->termination,
-					 msecs_to_jiffies(100)))
+					 msecs_to_jiffies(timeout_ms)))
 		return -ETIMEDOUT;

 	return 0;
+444
drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c
··· 1 + // SPDX-License-Identifier: MIT 2 + /* 3 + * Copyright(c) 2023 Intel Corporation. 4 + */ 5 + 6 + #include "gem/i915_gem_internal.h" 7 + 8 + #include "gt/intel_context.h" 9 + #include "gt/uc/intel_gsc_fw.h" 10 + #include "gt/uc/intel_gsc_uc_heci_cmd_submit.h" 11 + 12 + #include "i915_drv.h" 13 + #include "intel_pxp.h" 14 + #include "intel_pxp_cmd_interface_42.h" 15 + #include "intel_pxp_cmd_interface_43.h" 16 + #include "intel_pxp_gsccs.h" 17 + #include "intel_pxp_types.h" 18 + 19 + static bool 20 + is_fw_err_platform_config(u32 type) 21 + { 22 + switch (type) { 23 + case PXP_STATUS_ERROR_API_VERSION: 24 + case PXP_STATUS_PLATFCONFIG_KF1_NOVERIF: 25 + case PXP_STATUS_PLATFCONFIG_KF1_BAD: 26 + return true; 27 + default: 28 + break; 29 + } 30 + return false; 31 + } 32 + 33 + static const char * 34 + fw_err_to_string(u32 type) 35 + { 36 + switch (type) { 37 + case PXP_STATUS_ERROR_API_VERSION: 38 + return "ERR_API_VERSION"; 39 + case PXP_STATUS_NOT_READY: 40 + return "ERR_NOT_READY"; 41 + case PXP_STATUS_PLATFCONFIG_KF1_NOVERIF: 42 + case PXP_STATUS_PLATFCONFIG_KF1_BAD: 43 + return "ERR_PLATFORM_CONFIG"; 44 + default: 45 + break; 46 + } 47 + return NULL; 48 + } 49 + 50 + static int 51 + gsccs_send_message(struct intel_pxp *pxp, 52 + void *msg_in, size_t msg_in_size, 53 + void *msg_out, size_t msg_out_size_max, 54 + size_t *msg_out_len, 55 + u64 *gsc_msg_handle_retry) 56 + { 57 + struct intel_gt *gt = pxp->ctrl_gt; 58 + struct drm_i915_private *i915 = gt->i915; 59 + struct gsccs_session_resources *exec_res = &pxp->gsccs_res; 60 + struct intel_gsc_mtl_header *header = exec_res->pkt_vaddr; 61 + struct intel_gsc_heci_non_priv_pkt pkt; 62 + size_t max_msg_size; 63 + u32 reply_size; 64 + int ret; 65 + 66 + if (!exec_res->ce) 67 + return -ENODEV; 68 + 69 + max_msg_size = PXP43_MAX_HECI_INOUT_SIZE - sizeof(*header); 70 + 71 + if (msg_in_size > max_msg_size || msg_out_size_max > max_msg_size) 72 + return -ENOSPC; 73 + 74 + if (!exec_res->pkt_vma || !exec_res->bb_vma) 75 
+ return -ENOENT; 76 + 77 + GEM_BUG_ON(exec_res->pkt_vma->size < (2 * PXP43_MAX_HECI_INOUT_SIZE)); 78 + 79 + mutex_lock(&pxp->tee_mutex); 80 + 81 + memset(header, 0, sizeof(*header)); 82 + intel_gsc_uc_heci_cmd_emit_mtl_header(header, HECI_MEADDRESS_PXP, 83 + msg_in_size + sizeof(*header), 84 + exec_res->host_session_handle); 85 + 86 + /* check if this is a host-session-handle cleanup call (empty packet) */ 87 + if (!msg_in && !msg_out) 88 + header->flags |= GSC_INFLAG_MSG_CLEANUP; 89 + 90 + /* copy caller provided gsc message handle if this is polling for a prior msg completion */ 91 + header->gsc_message_handle = *gsc_msg_handle_retry; 92 + 93 + /* NOTE: zero size packets are used for session-cleanups */ 94 + if (msg_in && msg_in_size) 95 + memcpy(exec_res->pkt_vaddr + sizeof(*header), msg_in, msg_in_size); 96 + 97 + pkt.addr_in = i915_vma_offset(exec_res->pkt_vma); 98 + pkt.size_in = header->message_size; 99 + pkt.addr_out = pkt.addr_in + PXP43_MAX_HECI_INOUT_SIZE; 100 + pkt.size_out = msg_out_size_max + sizeof(*header); 101 + pkt.heci_pkt_vma = exec_res->pkt_vma; 102 + pkt.bb_vma = exec_res->bb_vma; 103 + 104 + /* 105 + * Before submitting, let's clear-out the validity marker on the reply offset. 106 + * We use offset PXP43_MAX_HECI_INOUT_SIZE for reply location so point header there. 
107 + */ 108 + header = exec_res->pkt_vaddr + PXP43_MAX_HECI_INOUT_SIZE; 109 + header->validity_marker = 0; 110 + 111 + ret = intel_gsc_uc_heci_cmd_submit_nonpriv(&gt->uc.gsc, 112 + exec_res->ce, &pkt, exec_res->bb_vaddr, 113 + GSC_REPLY_LATENCY_MS); 114 + if (ret) { 115 + drm_err(&i915->drm, "failed to send gsc PXP msg (%d)\n", ret); 116 + goto unlock; 117 + } 118 + 119 + /* Response validity marker, status and busyness */ 120 + if (header->validity_marker != GSC_HECI_VALIDITY_MARKER) { 121 + drm_err(&i915->drm, "gsc PXP reply with invalid validity marker\n"); 122 + ret = -EINVAL; 123 + goto unlock; 124 + } 125 + if (header->status != 0) { 126 + drm_dbg(&i915->drm, "gsc PXP reply status has error = 0x%08x\n", 127 + header->status); 128 + ret = -EINVAL; 129 + goto unlock; 130 + } 131 + if (header->flags & GSC_OUTFLAG_MSG_PENDING) { 132 + drm_dbg(&i915->drm, "gsc PXP reply is busy\n"); 133 + /* 134 + * When the GSC firmware replies with the pending bit, it means that the requested 135 + * operation has begun but the completion is pending, and the caller needs 136 + * to re-request with the gsc_message_handle that was returned by the firmware, 137 + * until the pending bit is turned off. 
138 + */ 139 + *gsc_msg_handle_retry = header->gsc_message_handle; 140 + ret = -EAGAIN; 141 + goto unlock; 142 + } 143 + 144 + reply_size = header->message_size - sizeof(*header); 145 + if (reply_size > msg_out_size_max) { 146 + drm_warn(&i915->drm, "caller with insufficient PXP reply size %u (%zu)\n", 147 + reply_size, msg_out_size_max); 148 + reply_size = msg_out_size_max; 149 + } 150 + 151 + if (msg_out) 152 + memcpy(msg_out, exec_res->pkt_vaddr + PXP43_MAX_HECI_INOUT_SIZE + sizeof(*header), 153 + reply_size); 154 + if (msg_out_len) 155 + *msg_out_len = reply_size; 156 + 157 + unlock: 158 + mutex_unlock(&pxp->tee_mutex); 159 + return ret; 160 + } 161 + 162 + static int 163 + gsccs_send_message_retry_complete(struct intel_pxp *pxp, 164 + void *msg_in, size_t msg_in_size, 165 + void *msg_out, size_t msg_out_size_max, 166 + size_t *msg_out_len) 167 + { 168 + u64 gsc_session_retry = 0; 169 + int ret, tries = 0; 170 + 171 + /* 172 + * Keep resending the request if the GSC firmware was busy. Based on fw specs + 173 + * sw overhead (and testing) we expect a worst case pending-bit delay of 174 + * GSC_PENDING_RETRY_MAXCOUNT x GSC_PENDING_RETRY_PAUSE_MS millisecs. 175 + */ 176 + do { 177 + ret = gsccs_send_message(pxp, msg_in, msg_in_size, msg_out, msg_out_size_max, 178 + msg_out_len, &gsc_session_retry); 179 + /* Only try again if gsc says so */ 180 + if (ret != -EAGAIN) 181 + break; 182 + 183 + msleep(GSC_PENDING_RETRY_PAUSE_MS); 184 + } while (++tries < GSC_PENDING_RETRY_MAXCOUNT); 185 + 186 + return ret; 187 + } 188 + 189 + bool intel_pxp_gsccs_is_ready_for_sessions(struct intel_pxp *pxp) 190 + { 191 + /* 192 + * GSC-fw loading, HuC-fw loading, HuC-fw authentication and 193 + * GSC-proxy init flow (requiring an mei component driver) 194 + * must all complete before we can start requesting PXP 195 + * sessions. Checking for completion on HuC authentication and 196 + * gsc-proxy init flow (the last set of dependencies that 197 + * are out of order) will suffice. 
198 + */ 199 + if (intel_huc_is_authenticated(&pxp->ctrl_gt->uc.huc) && 200 + intel_gsc_uc_fw_proxy_init_done(&pxp->ctrl_gt->uc.gsc)) 201 + return true; 202 + 203 + return false; 204 + } 205 + 206 + int intel_pxp_gsccs_create_session(struct intel_pxp *pxp, 207 + int arb_session_id) 208 + { 209 + struct drm_i915_private *i915 = pxp->ctrl_gt->i915; 210 + struct pxp43_create_arb_in msg_in = {0}; 211 + struct pxp43_create_arb_out msg_out = {0}; 212 + int ret; 213 + 214 + msg_in.header.api_version = PXP_APIVER(4, 3); 215 + msg_in.header.command_id = PXP43_CMDID_INIT_SESSION; 216 + msg_in.header.stream_id = (FIELD_PREP(PXP43_INIT_SESSION_APPID, arb_session_id) | 217 + FIELD_PREP(PXP43_INIT_SESSION_VALID, 1) | 218 + FIELD_PREP(PXP43_INIT_SESSION_APPTYPE, 0)); 219 + msg_in.header.buffer_len = sizeof(msg_in) - sizeof(msg_in.header); 220 + msg_in.protection_mode = PXP43_INIT_SESSION_PROTECTION_ARB; 221 + 222 + ret = gsccs_send_message_retry_complete(pxp, 223 + &msg_in, sizeof(msg_in), 224 + &msg_out, sizeof(msg_out), NULL); 225 + if (ret) { 226 + drm_err(&i915->drm, "Failed to init session %d, ret=[%d]\n", arb_session_id, ret); 227 + } else if (msg_out.header.status != 0) { 228 + if (is_fw_err_platform_config(msg_out.header.status)) { 229 + drm_info_once(&i915->drm, 230 + "PXP init-session-%d failed due to BIOS/SOC:0x%08x:%s\n", 231 + arb_session_id, msg_out.header.status, 232 + fw_err_to_string(msg_out.header.status)); 233 + } else { 234 + drm_dbg(&i915->drm, "PXP init-session-%d failed 0x%08x:%s:\n", 235 + arb_session_id, msg_out.header.status, 236 + fw_err_to_string(msg_out.header.status)); 237 + drm_dbg(&i915->drm, " cmd-detail: ID=[0x%08x],API-Ver-[0x%08x]\n", 238 + msg_in.header.command_id, msg_in.header.api_version); 239 + } 240 + } 241 + 242 + return ret; 243 + } 244 + 245 + void intel_pxp_gsccs_end_arb_fw_session(struct intel_pxp *pxp, u32 session_id) 246 + { 247 + struct drm_i915_private *i915 = pxp->ctrl_gt->i915; 248 + struct pxp42_inv_stream_key_in msg_in = 
{0}; 249 + struct pxp42_inv_stream_key_out msg_out = {0}; 250 + int ret = 0; 251 + 252 + /* 253 + * Stream key invalidation reuses the same version 4.2 input/output 254 + * command format but firmware requires 4.3 API interaction 255 + */ 256 + msg_in.header.api_version = PXP_APIVER(4, 3); 257 + msg_in.header.command_id = PXP42_CMDID_INVALIDATE_STREAM_KEY; 258 + msg_in.header.buffer_len = sizeof(msg_in) - sizeof(msg_in.header); 259 + 260 + msg_in.header.stream_id = FIELD_PREP(PXP_CMDHDR_EXTDATA_SESSION_VALID, 1); 261 + msg_in.header.stream_id |= FIELD_PREP(PXP_CMDHDR_EXTDATA_APP_TYPE, 0); 262 + msg_in.header.stream_id |= FIELD_PREP(PXP_CMDHDR_EXTDATA_SESSION_ID, session_id); 263 + 264 + ret = gsccs_send_message_retry_complete(pxp, 265 + &msg_in, sizeof(msg_in), 266 + &msg_out, sizeof(msg_out), NULL); 267 + if (ret) { 268 + drm_err(&i915->drm, "Failed to inv-stream-key-%u, ret=[%d]\n", 269 + session_id, ret); 270 + } else if (msg_out.header.status != 0) { 271 + if (is_fw_err_platform_config(msg_out.header.status)) { 272 + drm_info_once(&i915->drm, 273 + "PXP inv-stream-key-%u failed due to BIOS/SOC :0x%08x:%s\n", 274 + session_id, msg_out.header.status, 275 + fw_err_to_string(msg_out.header.status)); 276 + } else { 277 + drm_dbg(&i915->drm, "PXP inv-stream-key-%u failed 0x%08x:%s:\n", 278 + session_id, msg_out.header.status, 279 + fw_err_to_string(msg_out.header.status)); 280 + drm_dbg(&i915->drm, " cmd-detail: ID=[0x%08x],API-Ver-[0x%08x]\n", 281 + msg_in.header.command_id, msg_in.header.api_version); 282 + } 283 + } 284 + } 285 + 286 + static void 287 + gsccs_cleanup_fw_host_session_handle(struct intel_pxp *pxp) 288 + { 289 + struct drm_i915_private *i915 = pxp->ctrl_gt->i915; 290 + int ret; 291 + 292 + ret = gsccs_send_message_retry_complete(pxp, NULL, 0, NULL, 0, NULL); 293 + if (ret) 294 + drm_dbg(&i915->drm, "Failed to send gsccs msg host-session-cleanup: ret=[%d]\n", 295 + ret); 296 + } 297 + 298 + static void 299 + gsccs_destroy_execution_resource(struct 
intel_pxp *pxp) 300 + { 301 + struct gsccs_session_resources *exec_res = &pxp->gsccs_res; 302 + 303 + if (exec_res->host_session_handle) 304 + gsccs_cleanup_fw_host_session_handle(pxp); 305 + if (exec_res->ce) 306 + intel_context_put(exec_res->ce); 307 + if (exec_res->bb_vma) 308 + i915_vma_unpin_and_release(&exec_res->bb_vma, I915_VMA_RELEASE_MAP); 309 + if (exec_res->pkt_vma) 310 + i915_vma_unpin_and_release(&exec_res->pkt_vma, I915_VMA_RELEASE_MAP); 311 + 312 + memset(exec_res, 0, sizeof(*exec_res)); 313 + } 314 + 315 + static int 316 + gsccs_create_buffer(struct intel_gt *gt, 317 + const char *bufname, size_t size, 318 + struct i915_vma **vma, void **map) 319 + { 320 + struct drm_i915_private *i915 = gt->i915; 321 + struct drm_i915_gem_object *obj; 322 + int err = 0; 323 + 324 + obj = i915_gem_object_create_internal(i915, size); 325 + if (IS_ERR(obj)) { 326 + drm_err(&i915->drm, "Failed to allocate gsccs backend %s.\n", bufname); 327 + err = PTR_ERR(obj); 328 + goto out_none; 329 + } 330 + 331 + *vma = i915_vma_instance(obj, gt->vm, NULL); 332 + if (IS_ERR(*vma)) { 333 + drm_err(&i915->drm, "Failed to vma-instance gsccs backend %s.\n", bufname); 334 + err = PTR_ERR(*vma); 335 + goto out_put; 336 + } 337 + 338 + /* return a virtual pointer */ 339 + *map = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(i915, obj, true)); 340 + if (IS_ERR(*map)) { 341 + drm_err(&i915->drm, "Failed to map gsccs backend %s.\n", bufname); 342 + err = PTR_ERR(*map); 343 + goto out_put; 344 + } 345 + 346 + /* all PXP sessions commands are treated as non-privileged */ 347 + err = i915_vma_pin(*vma, 0, 0, PIN_USER); 348 + if (err) { 349 + drm_err(&i915->drm, "Failed to vma-pin gsccs backend %s.\n", bufname); 350 + goto out_unmap; 351 + } 352 + 353 + return 0; 354 + 355 + out_unmap: 356 + i915_gem_object_unpin_map(obj); 357 + out_put: 358 + i915_gem_object_put(obj); 359 + out_none: 360 + *vma = NULL; 361 + *map = NULL; 362 + 363 + return err; 364 + } 365 + 366 + static int 
367 + gsccs_allocate_execution_resource(struct intel_pxp *pxp) 368 + { 369 + struct intel_gt *gt = pxp->ctrl_gt; 370 + struct gsccs_session_resources *exec_res = &pxp->gsccs_res; 371 + struct intel_engine_cs *engine = gt->engine[GSC0]; 372 + struct intel_context *ce; 373 + int err = 0; 374 + 375 + /* 376 + * First, ensure the GSC engine is present. 377 + * NOTE: Backend would only be called with the correct gt. 378 + */ 379 + if (!engine) 380 + return -ENODEV; 381 + 382 + /* 383 + * Now, allocate, pin and map two objects, one for the heci message packet 384 + * and another for the batch buffer we submit into GSC engine (that includes the packet). 385 + * NOTE: GSC-CS backend is currently only supported on MTL, so we allocate shmem. 386 + */ 387 + err = gsccs_create_buffer(pxp->ctrl_gt, "Heci Packet", 388 + 2 * PXP43_MAX_HECI_INOUT_SIZE, 389 + &exec_res->pkt_vma, &exec_res->pkt_vaddr); 390 + if (err) 391 + return err; 392 + 393 + err = gsccs_create_buffer(pxp->ctrl_gt, "Batch Buffer", PAGE_SIZE, 394 + &exec_res->bb_vma, &exec_res->bb_vaddr); 395 + if (err) 396 + goto free_pkt; 397 + 398 + /* Finally, create an intel_context to be used during the submission */ 399 + ce = intel_context_create(engine); 400 + if (IS_ERR(ce)) { 401 + drm_err(&gt->i915->drm, "Failed creating gsccs backend ctx\n"); 402 + err = PTR_ERR(ce); 403 + goto free_batch; 404 + } 405 + 406 + i915_vm_put(ce->vm); 407 + ce->vm = i915_vm_get(pxp->ctrl_gt->vm); 408 + exec_res->ce = ce; 409 + 410 + /* initialize host-session-handle (for all i915-to-gsc-firmware PXP cmds) */ 411 + get_random_bytes(&exec_res->host_session_handle, sizeof(exec_res->host_session_handle)); 412 + 413 + return 0; 414 + 415 + free_batch: 416 + i915_vma_unpin_and_release(&exec_res->bb_vma, I915_VMA_RELEASE_MAP); 417 + free_pkt: 418 + i915_vma_unpin_and_release(&exec_res->pkt_vma, I915_VMA_RELEASE_MAP); 419 + memset(exec_res, 0, sizeof(*exec_res)); 420 + 421 + return err; 422 + } 423 + 424 + void intel_pxp_gsccs_fini(struct 
intel_pxp *pxp) 425 + { 426 + intel_wakeref_t wakeref; 427 + 428 + gsccs_destroy_execution_resource(pxp); 429 + with_intel_runtime_pm(&pxp->ctrl_gt->i915->runtime_pm, wakeref) 430 + intel_pxp_fini_hw(pxp); 431 + } 432 + 433 + int intel_pxp_gsccs_init(struct intel_pxp *pxp) 434 + { 435 + int ret; 436 + intel_wakeref_t wakeref; 437 + 438 + ret = gsccs_allocate_execution_resource(pxp); 439 + if (!ret) { 440 + with_intel_runtime_pm(&pxp->ctrl_gt->i915->runtime_pm, wakeref) 441 + intel_pxp_init_hw(pxp); 442 + } 443 + return ret; 444 + }
+43
drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.h
··· 1 + /* SPDX-License-Identifier: MIT */ 2 + /* 3 + * Copyright(c) 2022, Intel Corporation. All rights reserved. 4 + */ 5 + 6 + #ifndef __INTEL_PXP_GSCCS_H__ 7 + #define __INTEL_PXP_GSCCS_H__ 8 + 9 + #include <linux/types.h> 10 + 11 + struct intel_pxp; 12 + 13 + #define GSC_REPLY_LATENCY_MS 210 14 + /* 15 + * Max FW response time is 200ms, to which we add 10ms to account for overhead 16 + * such as request preparation, GuC submission to hw and pipeline completion times. 17 + */ 18 + #define GSC_PENDING_RETRY_MAXCOUNT 40 19 + #define GSC_PENDING_RETRY_PAUSE_MS 50 20 + #define GSCFW_MAX_ROUND_TRIP_LATENCY_MS (GSC_PENDING_RETRY_MAXCOUNT * GSC_PENDING_RETRY_PAUSE_MS) 21 + 22 + #ifdef CONFIG_DRM_I915_PXP 23 + void intel_pxp_gsccs_fini(struct intel_pxp *pxp); 24 + int intel_pxp_gsccs_init(struct intel_pxp *pxp); 25 + 26 + int intel_pxp_gsccs_create_session(struct intel_pxp *pxp, int arb_session_id); 27 + void intel_pxp_gsccs_end_arb_fw_session(struct intel_pxp *pxp, u32 arb_session_id); 28 + 29 + #else 30 + static inline void intel_pxp_gsccs_fini(struct intel_pxp *pxp) 31 + { 32 + } 33 + 34 + static inline int intel_pxp_gsccs_init(struct intel_pxp *pxp) 35 + { 36 + return 0; 37 + } 38 + 39 + #endif 40 + 41 + bool intel_pxp_gsccs_is_ready_for_sessions(struct intel_pxp *pxp); 42 + 43 + #endif /*__INTEL_PXP_GSCCS_H__ */
+2 -1
drivers/gpu/drm/i915/pxp/intel_pxp_pm.c
··· 43 43 * The PXP component gets automatically unbound when we go into S3 and 44 44 * re-bound after we come out, so in that scenario we can defer the 45 45 * hw init to the bind call. 46 + * NOTE: GSC-CS backend doesn't rely on components. 46 47 */ 47 - if (!pxp->pxp_component) 48 + if (!HAS_ENGINE(pxp->ctrl_gt, GSC0) && !pxp->pxp_component) 48 49 return; 49 50 50 51 intel_pxp_init_hw(pxp);
+27
drivers/gpu/drm/i915/pxp/intel_pxp_regs.h
··· 1 + /* SPDX-License-Identifier: MIT */ 2 + /* 3 + * Copyright(c) 2023, Intel Corporation. All rights reserved. 4 + */ 5 + 6 + #ifndef __INTEL_PXP_REGS_H__ 7 + #define __INTEL_PXP_REGS_H__ 8 + 9 + #include "i915_reg_defs.h" 10 + 11 + /* KCR subsystem register base address */ 12 + #define GEN12_KCR_BASE 0x32000 13 + #define MTL_KCR_BASE 0x386000 14 + 15 + /* KCR enable/disable control */ 16 + #define KCR_INIT(base) _MMIO((base) + 0xf0) 17 + 18 + /* Setting KCR Init bit is required after system boot */ 19 + #define KCR_INIT_ALLOW_DISPLAY_ME_WRITES REG_BIT(14) 20 + 21 + /* KCR hwdrm session in play status 0-31 */ 22 + #define KCR_SIP(base) _MMIO((base) + 0x260) 23 + 24 + /* PXP global terminate register for session termination */ 25 + #define KCR_GLOBAL_TERMINATE(base) _MMIO((base) + 0xf8) 26 + 27 + #endif /* __INTEL_PXP_REGS_H__ */
+14 -11
drivers/gpu/drm/i915/pxp/intel_pxp_session.c
··· 7 7 8 8 #include "intel_pxp.h" 9 9 #include "intel_pxp_cmd.h" 10 + #include "intel_pxp_gsccs.h" 10 11 #include "intel_pxp_session.h" 11 12 #include "intel_pxp_tee.h" 12 13 #include "intel_pxp_types.h" 14 + #include "intel_pxp_regs.h" 13 15 14 16 #define ARB_SESSION I915_PROTECTED_CONTENT_DEFAULT_SESSION /* shorter define */ 15 - 16 - #define GEN12_KCR_SIP _MMIO(0x32260) /* KCR hwdrm session in play 0-31 */ 17 - 18 - /* PXP global terminate register for session termination */ 19 - #define PXP_GLOBAL_TERMINATE _MMIO(0x320f8) 20 17 21 18 static bool intel_pxp_session_is_in_play(struct intel_pxp *pxp, u32 id) 22 19 { ··· 23 26 24 27 /* if we're suspended the session is considered off */ 25 28 with_intel_runtime_pm_if_in_use(uncore->rpm, wakeref) 26 - sip = intel_uncore_read(uncore, GEN12_KCR_SIP); 29 + sip = intel_uncore_read(uncore, KCR_SIP(pxp->kcr_base)); 27 30 28 31 return sip & BIT(id); 29 32 } ··· 41 44 return in_play ? -ENODEV : 0; 42 45 43 46 ret = intel_wait_for_register(uncore, 44 - GEN12_KCR_SIP, 47 + KCR_SIP(pxp->kcr_base), 45 48 mask, 46 49 in_play ? 
mask : 0, 47 - 100); 50 + 250); 48 51 49 52 intel_runtime_pm_put(uncore->rpm, wakeref); 50 53 ··· 63 66 return -EEXIST; 64 67 } 65 68 66 - ret = intel_pxp_tee_cmd_create_arb_session(pxp, ARB_SESSION); 69 + if (HAS_ENGINE(pxp->ctrl_gt, GSC0)) 70 + ret = intel_pxp_gsccs_create_session(pxp, ARB_SESSION); 71 + else 72 + ret = intel_pxp_tee_cmd_create_arb_session(pxp, ARB_SESSION); 67 73 if (ret) { 68 74 drm_err(&gt->i915->drm, "tee cmd for arb session creation failed\n"); 69 75 return ret; ··· 108 108 return ret; 109 109 } 110 110 111 - intel_uncore_write(gt->uncore, PXP_GLOBAL_TERMINATE, 1); 111 + intel_uncore_write(gt->uncore, KCR_GLOBAL_TERMINATE(pxp->kcr_base), 1); 112 112 113 - intel_pxp_tee_end_arb_fw_session(pxp, ARB_SESSION); 113 + if (HAS_ENGINE(gt, GSC0)) 114 + intel_pxp_gsccs_end_arb_fw_session(pxp, ARB_SESSION); 115 + else 116 + intel_pxp_tee_end_arb_fw_session(pxp, ARB_SESSION); 114 117 115 118 return ret; 116 119 }
-2
drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
··· 284 284 struct intel_gt *gt = pxp->ctrl_gt; 285 285 struct drm_i915_private *i915 = gt->i915; 286 286 287 - mutex_init(&pxp->tee_mutex); 288 - 289 287 ret = alloc_streaming_command(pxp); 290 288 if (ret) 291 289 return ret;
+23 -1
drivers/gpu/drm/i915/pxp/intel_pxp_types.h
··· 27 27 struct intel_gt *ctrl_gt; 28 28 29 29 /** 30 + * @kcr_base: base mmio offset for the KCR engine which is different on legacy platforms 31 + * vs newer platforms where the KCR is inside the media-tile. 32 + */ 33 + u32 kcr_base; 34 + 35 + /** 36 + * @gsccs_res: resources for request submission for platforms that have a GSC engine. 37 + */ 38 + struct gsccs_session_resources { 39 + u64 host_session_handle; /* used by firmware to link commands to sessions */ 40 + struct intel_context *ce; /* context for gsc command submission */ 41 + 42 + struct i915_vma *pkt_vma; /* GSC FW cmd packet vma */ 43 + void *pkt_vaddr; /* GSC FW cmd packet virt pointer */ 44 + 45 + struct i915_vma *bb_vma; /* HECI_PKT batch buffer vma */ 46 + void *bb_vaddr; /* HECI_PKT batch buffer virt pointer */ 47 + } gsccs_res; 48 + 49 + /** 30 50 * @pxp_component: i915_pxp_component struct of the bound mei_pxp 31 51 * module. Only set and cleared inside component bind/unbind functions, 32 52 * which are protected by &tee_mutex. 33 53 */ 34 54 struct i915_pxp_component *pxp_component; 35 55 36 - /* @dev_link: Enforce module relationship for power management ordering. */ 56 + /** 57 + * @dev_link: Enforce module relationship for power management ordering. 58 + */ 37 59 struct device_link *dev_link; 38 60 /** 39 61 * @pxp_component_added: track if the pxp component has been added.
+4 -1
drivers/gpu/drm/i915/selftests/i915_gem.c
··· 57 57 u32 __iomem *s; 58 58 int x; 59 59 60 - ggtt->vm.insert_page(&ggtt->vm, dma, slot, I915_CACHE_NONE, 0); 60 + ggtt->vm.insert_page(&ggtt->vm, dma, slot, 61 + i915_gem_get_pat_index(i915, 62 + I915_CACHE_NONE), 63 + 0); 61 64 62 65 s = io_mapping_map_atomic_wc(&ggtt->iomap, slot); 63 66 for (x = 0; x < PAGE_SIZE / sizeof(u32); x++) {
+5 -3
drivers/gpu/drm/i915/selftests/i915_gem_evict.c
··· 27 27 #include "gem/selftests/igt_gem_utils.h" 28 28 #include "gem/selftests/mock_context.h" 29 29 #include "gt/intel_gt.h" 30 + #include "gt/intel_gt_print.h" 30 31 31 32 #include "i915_selftest.h" 32 33 ··· 246 245 struct drm_mm_node target = { 247 246 .start = I915_GTT_PAGE_SIZE * 2, 248 247 .size = I915_GTT_PAGE_SIZE, 249 - .color = I915_CACHE_LLC, 248 + .color = i915_gem_get_pat_index(gt->i915, I915_CACHE_LLC), 250 249 }; 251 250 struct drm_i915_gem_object *obj; 252 251 struct i915_vma *vma; ··· 309 308 /* Attempt to remove the first *pinned* vma, by removing the (empty) 310 309 * neighbour -- this should fail. 311 310 */ 312 - target.color = I915_CACHE_L3_LLC; 311 + target.color = i915_gem_get_pat_index(gt->i915, I915_CACHE_L3_LLC); 313 312 314 313 mutex_lock(&ggtt->vm.mutex); 315 314 err = i915_gem_evict_for_node(&ggtt->vm, NULL, &target, 0); ··· 508 507 } 509 508 err = intel_gt_wait_for_idle(engine->gt, HZ * 3); 510 509 if (err) { 511 - pr_err("Failed to idle GT (on %s)", engine->name); 510 + gt_err(engine->gt, "Failed to idle GT (on %s)", 511 + engine->name); 512 512 break; 513 513 } 514 514 }
+10 -5
drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
··· 135 135 136 136 obj->write_domain = I915_GEM_DOMAIN_CPU; 137 137 obj->read_domains = I915_GEM_DOMAIN_CPU; 138 - obj->cache_level = I915_CACHE_NONE; 138 + obj->pat_index = i915_gem_get_pat_index(i915, I915_CACHE_NONE); 139 139 140 140 /* Preallocate the "backing storage" */ 141 141 if (i915_gem_object_pin_pages_unlocked(obj)) ··· 359 359 360 360 with_intel_runtime_pm(vm->gt->uncore->rpm, wakeref) 361 361 vm->insert_entries(vm, mock_vma_res, 362 - I915_CACHE_NONE, 0); 362 + i915_gem_get_pat_index(vm->i915, 363 + I915_CACHE_NONE), 364 + 0); 363 365 } 364 366 count = n; 365 367 ··· 1379 1377 1380 1378 ggtt->vm.insert_page(&ggtt->vm, 1381 1379 i915_gem_object_get_dma_address(obj, 0), 1382 - offset, I915_CACHE_NONE, 0); 1380 + offset, 1381 + i915_gem_get_pat_index(i915, 1382 + I915_CACHE_NONE), 1383 + 0); 1383 1384 } 1384 1385 1385 1386 order = i915_random_order(count, &prng); ··· 1515 1510 mutex_lock(&vm->mutex); 1516 1511 err = i915_gem_gtt_reserve(vm, NULL, &vma->node, obj->base.size, 1517 1512 offset, 1518 - obj->cache_level, 1513 + obj->pat_index, 1519 1514 0); 1520 1515 if (!err) { 1521 1516 i915_vma_resource_init_from_vma(vma_res, vma); ··· 1695 1690 1696 1691 mutex_lock(&vm->mutex); 1697 1692 err = i915_gem_gtt_insert(vm, NULL, &vma->node, obj->base.size, 0, 1698 - obj->cache_level, 0, vm->total, 0); 1693 + obj->pat_index, 0, vm->total, 0); 1699 1694 if (!err) { 1700 1695 i915_vma_resource_init_from_vma(vma_res, vma); 1701 1696 vma->resource = vma_res;
+28 -19
drivers/gpu/drm/i915/selftests/igt_live_test.c
··· 6 6 7 7 #include "i915_drv.h" 8 8 #include "gt/intel_gt.h" 9 + #include "gt/intel_gt_print.h" 9 10 10 11 #include "../i915_selftest.h" 11 12 #include "igt_flush_test.h" ··· 17 16 const char *func, 18 17 const char *name) 19 18 { 20 - struct intel_gt *gt = to_gt(i915); 21 19 struct intel_engine_cs *engine; 22 20 enum intel_engine_id id; 21 + struct intel_gt *gt; 22 + unsigned int i; 23 23 int err; 24 24 25 25 t->i915 = i915; 26 26 t->func = func; 27 27 t->name = name; 28 28 29 - err = intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT); 30 - if (err) { 31 - pr_err("%s(%s): failed to idle before, with err=%d!", 32 - func, name, err); 33 - return err; 29 + for_each_gt(gt, i915, i) { 30 + 31 + err = intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT); 32 + if (err) { 33 + gt_err(gt, "%s(%s): GT failed to idle before, with err=%d!", 34 + func, name, err); 35 + return err; 36 + } 37 + 38 + for_each_engine(engine, gt, id) 39 + t->reset_engine[id] = 40 + i915_reset_engine_count(&i915->gpu_error, engine); 34 41 } 35 42 36 43 t->reset_global = i915_reset_count(&i915->gpu_error); 37 - 38 - for_each_engine(engine, gt, id) 39 - t->reset_engine[id] = 40 - i915_reset_engine_count(&i915->gpu_error, engine); 41 44 42 45 return 0; 43 46 } ··· 51 46 struct drm_i915_private *i915 = t->i915; 52 47 struct intel_engine_cs *engine; 53 48 enum intel_engine_id id; 49 + struct intel_gt *gt; 50 + unsigned int i; 54 51 55 52 if (igt_flush_test(i915)) 56 53 return -EIO; ··· 64 57 return -EIO; 65 58 } 66 59 67 - for_each_engine(engine, to_gt(i915), id) { 68 - if (t->reset_engine[id] == 69 - i915_reset_engine_count(&i915->gpu_error, engine)) 70 - continue; 60 + for_each_gt(gt, i915, i) { 61 + for_each_engine(engine, gt, id) { 62 + if (t->reset_engine[id] == 63 + i915_reset_engine_count(&i915->gpu_error, engine)) 64 + continue; 71 65 72 - pr_err("%s(%s): engine '%s' was reset %d times!\n", 73 - t->func, t->name, engine->name, 74 - i915_reset_engine_count(&i915->gpu_error, engine) - 75 - 
t->reset_engine[id]); 76 - return -EIO; 66 + gt_err(gt, "%s(%s): engine '%s' was reset %d times!\n", 67 + t->func, t->name, engine->name, 68 + i915_reset_engine_count(&i915->gpu_error, engine) - 69 + t->reset_engine[id]); 70 + return -EIO; 71 + } 77 72 } 78 73 79 74 return 0;
+3 -1
drivers/gpu/drm/i915/selftests/intel_memory_region.c
··· 1070 1070 /* Put the pages into a known state -- from the gpu for added fun */ 1071 1071 intel_engine_pm_get(engine); 1072 1072 err = intel_context_migrate_clear(engine->gt->migrate.context, NULL, 1073 - obj->mm.pages->sgl, I915_CACHE_NONE, 1073 + obj->mm.pages->sgl, 1074 + i915_gem_get_pat_index(i915, 1075 + I915_CACHE_NONE), 1074 1076 true, 0xdeadbeaf, &rq); 1075 1077 if (rq) { 1076 1078 dma_resv_add_fence(obj->base.resv, &rq->fence,
+9
drivers/gpu/drm/i915/selftests/mock_gem_device.c
··· 123 123 static struct dev_iommu fake_iommu = { .priv = (void *)-1 }; 124 124 #endif 125 125 struct drm_i915_private *i915; 126 + struct intel_device_info *i915_info; 126 127 struct pci_dev *pdev; 128 + unsigned int i; 127 129 int ret; 128 130 129 131 pdev = kzalloc(sizeof(*pdev), GFP_KERNEL); ··· 182 180 I915_GTT_PAGE_SIZE_2M; 183 181 184 182 RUNTIME_INFO(i915)->memory_regions = REGION_SMEM; 183 + 184 + /* simply use legacy cache level for mock device */ 185 + i915_info = (struct intel_device_info *)INTEL_INFO(i915); 186 + i915_info->max_pat_index = 3; 187 + for (i = 0; i < I915_MAX_CACHE_LEVEL; i++) 188 + i915_info->cachelevel_to_pat[i] = i; 189 + 185 190 intel_memory_regions_hw_probe(i915); 186 191 187 192 spin_lock_init(&i915->gpu_error.lock);
+4 -4
drivers/gpu/drm/i915/selftests/mock_gtt.c
··· 27 27 static void mock_insert_page(struct i915_address_space *vm, 28 28 dma_addr_t addr, 29 29 u64 offset, 30 - enum i915_cache_level level, 30 + unsigned int pat_index, 31 31 u32 flags) 32 32 { 33 33 } 34 34 35 35 static void mock_insert_entries(struct i915_address_space *vm, 36 36 struct i915_vma_resource *vma_res, 37 - enum i915_cache_level level, u32 flags) 37 + unsigned int pat_index, u32 flags) 38 38 { 39 39 } 40 40 41 41 static void mock_bind_ppgtt(struct i915_address_space *vm, 42 42 struct i915_vm_pt_stash *stash, 43 43 struct i915_vma_resource *vma_res, 44 - enum i915_cache_level cache_level, 44 + unsigned int pat_index, 45 45 u32 flags) 46 46 { 47 47 GEM_BUG_ON(flags & I915_VMA_GLOBAL_BIND); ··· 94 94 static void mock_bind_ggtt(struct i915_address_space *vm, 95 95 struct i915_vm_pt_stash *stash, 96 96 struct i915_vma_resource *vma_res, 97 - enum i915_cache_level cache_level, 97 + unsigned int pat_index, 98 98 u32 flags) 99 99 { 100 100 }
+1 -1
drivers/misc/mei/Kconfig
··· 62 62 63 63 source "drivers/misc/mei/hdcp/Kconfig" 64 64 source "drivers/misc/mei/pxp/Kconfig" 65 - 65 + source "drivers/misc/mei/gsc_proxy/Kconfig"
+1
drivers/misc/mei/Makefile
··· 30 30 31 31 obj-$(CONFIG_INTEL_MEI_HDCP) += hdcp/ 32 32 obj-$(CONFIG_INTEL_MEI_PXP) += pxp/ 33 + obj-$(CONFIG_INTEL_MEI_GSC_PROXY) += gsc_proxy/
+14
drivers/misc/mei/gsc_proxy/Kconfig
··· 1 + # SPDX-License-Identifier: GPL-2.0 2 + # Copyright (c) 2022-2023, Intel Corporation. All rights reserved. 3 + # 4 + config INTEL_MEI_GSC_PROXY 5 + tristate "Intel GSC Proxy services of ME Interface" 6 + select INTEL_MEI_ME 7 + depends on DRM_I915 8 + help 9 + MEI Support for GSC Proxy Services on Intel platforms. 10 + 11 + MEI GSC proxy enables messaging between the GSC service on 12 + the Intel graphics card and services in the CSE (MEI) firmware 13 + residing in the SoC or PCH. 14 +
+7
drivers/misc/mei/gsc_proxy/Makefile
··· 1 + # SPDX-License-Identifier: GPL-2.0 2 + # 3 + # Copyright (c) 2022-2023, Intel Corporation. All rights reserved. 4 + # 5 + # Makefile - GSC Proxy client driver for Intel MEI Bus Driver. 6 + 7 + obj-$(CONFIG_INTEL_MEI_GSC_PROXY) += mei_gsc_proxy.o
+208
drivers/misc/mei/gsc_proxy/mei_gsc_proxy.c
// SPDX-License-Identifier: GPL-2.0
/*
 * Copyright (c) 2022-2023 Intel Corporation
 */

/**
 * DOC: MEI_GSC_PROXY Client Driver
 *
 * The mei_gsc_proxy driver acts as a translation layer between
 * proxy user (I915) and ME FW by proxying messages to ME FW
 */

#include <linux/component.h>
#include <linux/mei_cl_bus.h>
#include <linux/module.h>
#include <linux/pci.h>
#include <linux/slab.h>
#include <linux/uuid.h>
#include <drm/drm_connector.h>
#include <drm/i915_component.h>
#include <drm/i915_gsc_proxy_mei_interface.h>

/**
 * mei_gsc_proxy_send - Sends a proxy message to ME FW.
 * @dev: device corresponding to the mei_cl_device
 * @buf: a message buffer to send
 * @size: size of the message
 * Return: bytes sent on Success, <0 on Failure
 */
static int mei_gsc_proxy_send(struct device *dev, const void *buf, size_t size)
{
	ssize_t ret;

	if (!dev || !buf)
		return -EINVAL;

	ret = mei_cldev_send(to_mei_cl_device(dev), buf, size);
	if (ret < 0)
		dev_dbg(dev, "mei_cldev_send failed. %zd\n", ret);

	return ret;
}

/**
 * mei_gsc_proxy_recv - Receives a proxy message from ME FW.
 * @dev: device corresponding to the mei_cl_device
 * @buf: a message buffer to contain the received message
 * @size: size of the buffer
 * Return: bytes received on Success, <0 on Failure
 */
static int mei_gsc_proxy_recv(struct device *dev, void *buf, size_t size)
{
	ssize_t ret;

	if (!dev || !buf)
		return -EINVAL;

	ret = mei_cldev_recv(to_mei_cl_device(dev), buf, size);
	if (ret < 0)
		dev_dbg(dev, "mei_cldev_recv failed. %zd\n", ret);

	return ret;
}

static const struct i915_gsc_proxy_component_ops mei_gsc_proxy_ops = {
	.owner = THIS_MODULE,
	.send = mei_gsc_proxy_send,
	.recv = mei_gsc_proxy_recv,
};

static int mei_component_master_bind(struct device *dev)
{
	struct mei_cl_device *cldev = to_mei_cl_device(dev);
	struct i915_gsc_proxy_component *comp_master = mei_cldev_get_drvdata(cldev);

	comp_master->ops = &mei_gsc_proxy_ops;
	comp_master->mei_dev = dev;
	return component_bind_all(dev, comp_master);
}

static void mei_component_master_unbind(struct device *dev)
{
	struct mei_cl_device *cldev = to_mei_cl_device(dev);
	struct i915_gsc_proxy_component *comp_master = mei_cldev_get_drvdata(cldev);

	component_unbind_all(dev, comp_master);
}

static const struct component_master_ops mei_component_master_ops = {
	.bind = mei_component_master_bind,
	.unbind = mei_component_master_unbind,
};

/**
 * mei_gsc_proxy_component_match - compare function for matching mei.
 *
 * The function checks if the device is pci device and
 * Intel VGA adapter, the subcomponent is SW Proxy
 * and the parent of MEI PCI and the parent of VGA are the same PCH device.
 *
 * @dev: master device
 * @subcomponent: subcomponent to match (I915_COMPONENT_SWPROXY)
 * @data: compare data (mei pci parent)
 *
 * Return:
 * * 1 - if components match
 * * 0 - otherwise
 */
static int mei_gsc_proxy_component_match(struct device *dev, int subcomponent,
					 void *data)
{
	struct pci_dev *pdev;

	if (!dev_is_pci(dev))
		return 0;

	pdev = to_pci_dev(dev);

	if (pdev->class != (PCI_CLASS_DISPLAY_VGA << 8) ||
	    pdev->vendor != PCI_VENDOR_ID_INTEL)
		return 0;

	if (subcomponent != I915_COMPONENT_GSC_PROXY)
		return 0;

	return component_compare_dev(dev->parent, ((struct device *)data)->parent);
}

static int mei_gsc_proxy_probe(struct mei_cl_device *cldev,
			       const struct mei_cl_device_id *id)
{
	struct i915_gsc_proxy_component *comp_master;
	struct component_match *master_match = NULL;
	int ret;

	ret = mei_cldev_enable(cldev);
	if (ret < 0) {
		dev_err(&cldev->dev, "mei_cldev_enable Failed. %d\n", ret);
		goto enable_err_exit;
	}

	comp_master = kzalloc(sizeof(*comp_master), GFP_KERNEL);
	if (!comp_master) {
		ret = -ENOMEM;
		goto err_exit;
	}

	component_match_add_typed(&cldev->dev, &master_match,
				  mei_gsc_proxy_component_match, cldev->dev.parent);
	if (IS_ERR_OR_NULL(master_match)) {
		ret = -ENOMEM;
		goto err_exit;
	}

	mei_cldev_set_drvdata(cldev, comp_master);
	ret = component_master_add_with_match(&cldev->dev,
					      &mei_component_master_ops,
					      master_match);
	if (ret < 0) {
		dev_err(&cldev->dev, "Master comp add failed %d\n", ret);
		goto err_exit;
	}

	return 0;

err_exit:
	mei_cldev_set_drvdata(cldev, NULL);
	kfree(comp_master);
	mei_cldev_disable(cldev);
enable_err_exit:
	return ret;
}

static void mei_gsc_proxy_remove(struct mei_cl_device *cldev)
{
	struct i915_gsc_proxy_component *comp_master = mei_cldev_get_drvdata(cldev);
	int ret;

	component_master_del(&cldev->dev, &mei_component_master_ops);
	kfree(comp_master);
	mei_cldev_set_drvdata(cldev, NULL);

	ret = mei_cldev_disable(cldev);
	if (ret)
		dev_warn(&cldev->dev, "mei_cldev_disable() failed %d\n", ret);
}

#define MEI_UUID_GSC_PROXY UUID_LE(0xf73db04, 0x97ab, 0x4125, \
				   0xb8, 0x93, 0xe9, 0x4, 0xad, 0xd, 0x54, 0x64)

static struct mei_cl_device_id mei_gsc_proxy_tbl[] = {
	{ .uuid = MEI_UUID_GSC_PROXY, .version = MEI_CL_VERSION_ANY },
	{ }
};
MODULE_DEVICE_TABLE(mei, mei_gsc_proxy_tbl);

static struct mei_cl_driver mei_gsc_proxy_driver = {
	.id_table = mei_gsc_proxy_tbl,
	.name = KBUILD_MODNAME,
	.probe = mei_gsc_proxy_probe,
	.remove = mei_gsc_proxy_remove,
};

module_mei_cl_driver(mei_gsc_proxy_driver);

MODULE_AUTHOR("Intel Corporation");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("MEI GSC PROXY");
+2 -1
include/drm/i915_component.h
···
 enum i915_component_type {
 	I915_COMPONENT_AUDIO = 1,
 	I915_COMPONENT_HDCP,
-	I915_COMPONENT_PXP
+	I915_COMPONENT_PXP,
+	I915_COMPONENT_GSC_PROXY,
 };

 /* MAX_PORT is the number of port
+53
include/drm/i915_gsc_proxy_mei_interface.h
/* SPDX-License-Identifier: MIT */
/*
 * Copyright (c) 2022-2023 Intel Corporation
 */

#ifndef _I915_GSC_PROXY_MEI_INTERFACE_H_
#define _I915_GSC_PROXY_MEI_INTERFACE_H_

#include <linux/types.h>

struct device;
struct module;

/**
 * struct i915_gsc_proxy_component_ops - ops for GSC Proxy services.
 * @owner: Module providing the ops
 * @send: sends a proxy message from GSC FW to ME FW
 * @recv: receives a proxy message for GSC FW from ME FW
 */
struct i915_gsc_proxy_component_ops {
	struct module *owner;

	/**
	 * send - Sends a proxy message to ME FW.
	 * @dev: device struct corresponding to the mei device
	 * @buf: message buffer to send
	 * @size: size of the message
	 * Return: bytes sent on success, negative errno value on failure
	 */
	int (*send)(struct device *dev, const void *buf, size_t size);

	/**
	 * recv - Receives a proxy message from ME FW.
	 * @dev: device struct corresponding to the mei device
	 * @buf: message buffer to contain the received message
	 * @size: size of the buffer
	 * Return: bytes received on success, negative errno value on failure
	 */
	int (*recv)(struct device *dev, void *buf, size_t size);
};

/**
 * struct i915_gsc_proxy_component - Used for communication between i915 and
 * MEI drivers for GSC proxy services
 * @mei_dev: device that provides the GSC proxy service.
 * @ops: Ops implemented by GSC proxy driver, used by i915 driver.
 */
struct i915_gsc_proxy_component {
	struct device *mei_dev;
	const struct i915_gsc_proxy_component_ops *ops;
};

#endif /* _I915_GSC_PROXY_MEI_INTERFACE_H_ */
+50 -1
include/uapi/drm/i915_drm.h
···
 #define I915_PMU_ENGINE_SEMA(class, instance) \
 	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)

-#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
+/*
+ * Top 4 bits of every non-engine counter are GT id.
+ */
+#define __I915_PMU_GT_SHIFT (60)
+
+#define ___I915_PMU_OTHER(gt, x) \
+	(((__u64)__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x)) | \
+	((__u64)(gt) << __I915_PMU_GT_SHIFT))
+
+#define __I915_PMU_OTHER(x) ___I915_PMU_OTHER(0, x)

 #define I915_PMU_ACTUAL_FREQUENCY	__I915_PMU_OTHER(0)
 #define I915_PMU_REQUESTED_FREQUENCY	__I915_PMU_OTHER(1)
···
 #define I915_PMU_SOFTWARE_GT_AWAKE_TIME	__I915_PMU_OTHER(4)

 #define I915_PMU_LAST /* Deprecated - do not use */ I915_PMU_RC6_RESIDENCY
+
+#define __I915_PMU_ACTUAL_FREQUENCY(gt)		___I915_PMU_OTHER(gt, 0)
+#define __I915_PMU_REQUESTED_FREQUENCY(gt)	___I915_PMU_OTHER(gt, 1)
+#define __I915_PMU_INTERRUPTS(gt)		___I915_PMU_OTHER(gt, 2)
+#define __I915_PMU_RC6_RESIDENCY(gt)		___I915_PMU_OTHER(gt, 3)
+#define __I915_PMU_SOFTWARE_GT_AWAKE_TIME(gt)	___I915_PMU_OTHER(gt, 4)

 /* Each region is a minimum of 16k, and there are at most 255 of them.
  */
···
  * timestamp frequency, but differs on some platforms.
  */
 #define I915_PARAM_OA_TIMESTAMP_FREQUENCY 57
+
+/*
+ * Query the status of PXP support in i915.
+ *
+ * The query can fail in the following scenarios with the listed error codes:
+ *     -ENODEV = PXP support is not available on the GPU device or in the
+ *               kernel due to missing component drivers or kernel configs.
+ *
+ * If the IOCTL is successful, the returned parameter will be set to one of
+ * the following values:
+ *     1 = PXP feature is supported and is ready for use.
+ *     2 = PXP feature is supported but should be ready soon (pending
+ *         initialization of non-i915 system dependencies).
+ *
+ * NOTE: When param is supported (positive return values), user space should
+ *       still refer to the GEM PXP context-creation UAPI header specs to be
+ *       aware of possible failure due to system state machine at the time.
+ */
+#define I915_PARAM_PXP_STATUS 58

 /* Must be kept compact -- no holes and well documented */
···
  *
  * -ENODEV: feature not available
  * -EPERM: trying to mark a recoverable or not bannable context as protected
+ * -ENXIO: A dependency such as a component driver or firmware is not yet
+ *         loaded so user space may need to attempt again. Depending on the
+ *         device, this error may be reported if protected context creation is
+ *         attempted very early after kernel start because the internal timeout
+ *         waiting for such dependencies is not guaranteed to be larger than
+ *         required (numbers differ depending on system and kernel config):
+ *         - ADL/RPL: dependencies may take up to 3 seconds from kernel start
+ *           while context creation internal timeout is 250 milliseconds
+ *         - MTL: dependencies may take up to 8 seconds from kernel start
+ *           while context creation internal timeout is 250 milliseconds
+ *         NOTE: such dependencies happen once, so a subsequent call to create a
+ *         protected context after a prior successful call will not experience
+ *         such timeouts and will not return -ENXIO (unless the driver is
+ *         reloaded, or, depending on the device, resumes from a suspended
+ *         state).
+ * -EIO: The firmware did not succeed in creating the protected context.
  */
 #define I915_CONTEXT_PARAM_PROTECTED_CONTENT 0xd
 /* Must be kept compact -- no holes and well documented */