Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem (v2)

Instead of handling it like a context param, unconditionally set it when
intel_contexts are created. For years we've had the idea of a watchdog
uAPI floating about. The aim was for media, so that they could set very
tight deadlines for their transcode jobs, so that if you have a corrupt
bitstream (especially for decoding) you don't hang your desktop too
hard. But it's been stuck in limbo since forever, and this simplifies
things a bit in preparation for the proto-context work. If we decide to
actually make said uAPI a reality, we can do it through the
proto-context easily enough.

This does mean that we move to reading the request_timeout_ms param
once per engine when engines are created instead of once at context
creation. If someone changes request_timeout_ms between creating a
context and setting engines, it will mean that they get the new timeout.
If someone races setting request_timeout_ms and context creation, they
can theoretically end up with different timeouts. However, since both
of these are fairly harmless and require changing kernel params, we
don't care.
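
The timing change described above can be modelled in a few lines of userspace C. This is a minimal sketch, not driver code: `model_params`, `model_engine_ctx`, and `model_set_gem` are hypothetical stand-ins for the module parameters, `intel_context`, and `intel_context_set_gem`, and the `CONFIG_DRM_I915_REQUEST_TIMEOUT` check is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the i915 module parameters. */
struct model_params {
	unsigned int request_timeout_ms;
};

/* Hypothetical stand-in for one per-engine intel_context. */
struct model_engine_ctx {
	uint64_t watchdog_timeout_us;
};

/*
 * Models the new behaviour: the parameter is sampled each time an
 * engine context is set up, not once when the GEM context is created.
 * A zero parameter leaves the watchdog untouched, as in the patch.
 */
void model_set_gem(struct model_engine_ctx *ce,
		   const struct model_params *params)
{
	assert(ce != NULL && params != NULL);

	if (params->request_timeout_ms) {
		unsigned int timeout_ms = params->request_timeout_ms;

		/* Widen before multiplying, mirroring the (u64) cast. */
		ce->watchdog_timeout_us = (uint64_t)timeout_ms * 1000;
	}
}
```

Calling `model_set_gem` for one engine, changing `request_timeout_ms`, and then calling it for a second engine leaves the two engines with different timeouts, which is one of the harmless outcomes the message describes.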

v2 (Tvrtko Ursulin):
- Add a comment about races with request_timeout_ms

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-5-jason@jlekstrand.net

Authored by Jason Ekstrand and committed by Daniel Vetter
677db6ad 6ff6d61d

+7 -44
+6 -38
drivers/gpu/drm/i915/gem/i915_gem_context.c
···
 	    intel_engine_has_timeslices(ce->engine))
 		__set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
 
-	intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us);
+	if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
+	    ctx->i915->params.request_timeout_ms) {
+		unsigned int timeout_ms = ctx->i915->params.request_timeout_ms;
+
+		intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);
+	}
 }
 
 static void __free_engines(struct i915_gem_engines *e, unsigned int count)
···
 	context_apply_all(ctx, __apply_timeline, timeline);
 }
 
-static int __apply_watchdog(struct intel_context *ce, void *timeout_us)
-{
-	return intel_context_set_watchdog_us(ce, (uintptr_t)timeout_us);
-}
-
-static int
-__set_watchdog(struct i915_gem_context *ctx, unsigned long timeout_us)
-{
-	int ret;
-
-	ret = context_apply_all(ctx, __apply_watchdog,
-				(void *)(uintptr_t)timeout_us);
-	if (!ret)
-		ctx->watchdog.timeout_us = timeout_us;
-
-	return ret;
-}
-
-static void __set_default_fence_expiry(struct i915_gem_context *ctx)
-{
-	struct drm_i915_private *i915 = ctx->i915;
-	int ret;
-
-	if (!IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) ||
-	    !i915->params.request_timeout_ms)
-		return;
-
-	/* Default expiry for user fences. */
-	ret = __set_watchdog(ctx, i915->params.request_timeout_ms * 1000);
-	if (ret)
-		drm_notice(&i915->drm,
-			   "Failed to configure default fence expiry! (%d)",
-			   ret);
-}
-
 static struct i915_gem_context *
 i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
 {
···
 		__assign_timeline(ctx, timeline);
 		intel_timeline_put(timeline);
 	}
-
-	__set_default_fence_expiry(ctx);
 
 	trace_i915_context_create(ctx);
 
-4
drivers/gpu/drm/i915/gem/i915_gem_context_types.h
···
 	 */
 	atomic_t active_count;
 
-	struct {
-		u64 timeout_us;
-	} watchdog;
-
 	/**
 	 * @hang_timestamp: The last time(s) this context caused a GPU hang
 	 */
+1 -2
drivers/gpu/drm/i915/gt/intel_context_param.h
···
 
 #include "intel_context.h"
 
-static inline int
+static inline void
 intel_context_set_watchdog_us(struct intel_context *ce, u64 timeout_us)
 {
 	ce->watchdog.timeout_us = timeout_us;
-	return 0;
 }
 
 #endif /* INTEL_CONTEXT_PARAM_H */
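
One detail of the new call site is the `(u64)` cast applied before multiplying by 1000. The sketch below illustrates why the order matters, assuming `request_timeout_ms` is a 32-bit `unsigned int` (the diff suggests this through the `unsigned int timeout_ms` local, but does not show the parameter's declaration). `ms_to_us_narrow` and `ms_to_us_wide` are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Multiplies in 32-bit arithmetic; the product wraps modulo 2^32
 * once the input exceeds roughly 4.29 million milliseconds.
 */
uint32_t ms_to_us_narrow(uint32_t timeout_ms)
{
	return timeout_ms * 1000u;
}

/*
 * Widens to 64 bits first, as the (u64) cast in the patch does,
 * so the product is exact for every 32-bit input.
 */
uint64_t ms_to_us_wide(uint32_t timeout_ms)
{
	return (uint64_t)timeout_ms * 1000u;
}
```

For everyday timeouts of a few seconds the two helpers agree; they only diverge for very large parameter values, so the cast is best read as defensive rather than as a fix for an observed bug.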