
drm/i915: Force CPU synchronisation even if userspace requests ASYNC

The goal here was to minimise doing anything, or any check, inside the
kernel that was not strictly required. For a userspace that assumes
complete control over the cache domains, the kernel is usually using
outdated information and may trigger clflushes where none were
required.

However, swapping is a situation where userspace has no knowledge of the
domain transfer, and will leave the object in the CPU cache. The kernel
must flush this out to the backing storage prior to use with the GPU. As
we use an asynchronous task tracked by an implicit fence for this, we
also need to cancel the ASYNC flag on the object so that the object will
wait for the clflush to complete before being executed. This also absolves
userspace of the responsibility imposed by commit 77ae9957897d ("drm/i915:
Enable userspace to opt-out of implicit fencing") that it needed to ensure
that the object was out of the CPU cache prior to use on the GPU.
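The decision logic above (a required clflush cancels EXEC_OBJECT_ASYNC so the
request waits on the implicit flush fence) can be sketched in standalone C. The
struct, the helper names, and the flag's bit position here are simplified,
hypothetical stand-ins for illustration only; the real code operates on
struct drm_i915_gem_object inside the i915 execbuffer path:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit position, for illustration only. */
#define EXEC_OBJECT_ASYNC (1u << 6)

/* Simplified stand-in for struct drm_i915_gem_object. */
struct obj {
	bool cache_dirty;    /* CPU cache holds data not yet in backing store */
	bool cache_coherent; /* GPU snoops the CPU cache; no flush needed */
};

/* Mirrors the new bool return of i915_gem_clflush_object(): true if a
 * flush was (notionally) queued, false if none was required. */
static bool clflush_object(struct obj *o)
{
	if (o->cache_coherent)
		return false;      /* coherent: nothing to flush */
	o->cache_dirty = false;    /* flush queued; cache no longer dirty */
	return true;
}

/* Mirrors the execbuffer change: if a flush had to be queued, clear
 * ASYNC so execution waits for the clflush fence to signal. */
static unsigned int prepare_flags(struct obj *o, unsigned int flags)
{
	if (o->cache_dirty && !o->cache_coherent) {
		if (clflush_object(o))
			flags &= ~EXEC_OBJECT_ASYNC;
	}
	return flags;
}
```

A swapped-out object (dirty, non-coherent) thus loses its ASYNC request and
serialises on the flush, while a coherent object keeps ASYNC untouched.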

Fixes: 77ae9957897d ("drm/i915: Enable userspace to opt-out of implicit fencing")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101571
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20170721145037.25105-5-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit 0f46daa1a273779a0b73d768a788ca3f04238f9c)
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Authored by Chris Wilson; committed by Daniel Vetter
7b98da66 adf27835

Diffstat: +11 -8
drivers/gpu/drm/i915/i915_gem_clflush.c (+4 -3)
--- a/drivers/gpu/drm/i915/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/i915_gem_clflush.c
@@ -114,7 +114,7 @@
 	return NOTIFY_DONE;
 }
 
-void i915_gem_clflush_object(struct drm_i915_gem_object *obj,
+bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
 			     unsigned int flags)
 {
 	struct clflush *clflush;
@@ -128,7 +128,7 @@
 	 */
 	if (!i915_gem_object_has_struct_page(obj)) {
 		obj->cache_dirty = false;
-		return;
+		return false;
 	}
 
 	/* If the GPU is snooping the contents of the CPU cache,
@@ -140,7 +140,7 @@
 	 * tracking.
 	 */
 	if (!(flags & I915_CLFLUSH_FORCE) && obj->cache_coherent)
-		return;
+		return false;
 
 	trace_i915_gem_object_clflush(obj);
 
@@ -179,4 +179,5 @@
 	}
 
 	obj->cache_dirty = false;
+	return true;
 }
drivers/gpu/drm/i915/i915_gem_clflush.h (+1 -1)
--- a/drivers/gpu/drm/i915/i915_gem_clflush.h
+++ b/drivers/gpu/drm/i915/i915_gem_clflush.h
@@ -28,7 +28,7 @@
 struct drm_i915_private;
 struct drm_i915_gem_object;
 
-void i915_gem_clflush_object(struct drm_i915_gem_object *obj,
+bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
 			     unsigned int flags);
 #define I915_CLFLUSH_FORCE BIT(0)
 #define I915_CLFLUSH_SYNC BIT(1)
drivers/gpu/drm/i915/i915_gem_execbuffer.c (+6 -4)
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1825,7 +1825,7 @@
 	int err;
 
 	for (i = 0; i < count; i++) {
-		const struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
+		struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
 		struct i915_vma *vma = exec_to_vma(entry);
 		struct drm_i915_gem_object *obj = vma->obj;
 
@@ -1841,11 +1841,13 @@
 			eb->request->capture_list = capture;
 		}
 
+		if (unlikely(obj->cache_dirty && !obj->cache_coherent)) {
+			if (i915_gem_clflush_object(obj, 0))
+				entry->flags &= ~EXEC_OBJECT_ASYNC;
+		}
+
 		if (entry->flags & EXEC_OBJECT_ASYNC)
 			goto skip_flushes;
-
-		if (unlikely(obj->cache_dirty && !obj->cache_coherent))
-			i915_gem_clflush_object(obj, 0);
 
 		err = i915_gem_request_await_object
 			(eb->request, obj, entry->flags & EXEC_OBJECT_WRITE);