drm: Get ref on CRTC commit object when waiting for flip_done

This fixes a general protection fault caused by accessing the contents
of a flip_done completion object that has already been freed. It occurs
when a non-blocking commit worker thread W is preempted by another
commit thread X. X runs to completion and clears its atomic state at
the end, destroying the CRTC commit object that W still needs. When
execution switches back to W, it accesses the freed commit object and
faults.

Worker W becomes preemptable when waiting for flip_done to complete. At
this point, a frequently occurring commit thread X can take over. Here's
an example where W is a worker thread that flips on both CRTCs, and X
does a legacy cursor update on both CRTCs:

...
1. W does flip work
2. W runs commit_hw_done()
3. W waits for flip_done on CRTC 1
4. > flip_done for CRTC 1 completes
5. W finishes waiting for CRTC 1
6. W waits for flip_done on CRTC 2

7. > Preempted by X
8. > flip_done for CRTC 2 completes
9. X atomic_check: hw_done and flip_done are complete on all CRTCs
10. X updates cursor on both CRTCs
11. X destroys atomic state
12. X done

13. > Switch back to W
14. W waits for flip_done on CRTC 2
15. W raises general protection fault

The error looks like this:

general protection fault: 0000 [#1] PREEMPT SMP PTI
**snip**
Call Trace:
lock_acquire+0xa2/0x1b0
_raw_spin_lock_irq+0x39/0x70
wait_for_completion_timeout+0x31/0x130
drm_atomic_helper_wait_for_flip_done+0x64/0x90 [drm_kms_helper]
amdgpu_dm_atomic_commit_tail+0xcae/0xdd0 [amdgpu]
commit_tail+0x3d/0x70 [drm_kms_helper]
process_one_work+0x212/0x650
worker_thread+0x49/0x420
kthread+0xfb/0x130
ret_from_fork+0x3a/0x50
Modules linked in: x86_pkg_temp_thermal amdgpu(O) chash(O)
gpu_sched(O) drm_kms_helper(O) syscopyarea sysfillrect sysimgblt
fb_sys_fops ttm(O) drm(O)

Note that this issue is masked on i915, since it signals hw_done only
after waiting for flip_done. That ordering blocks the cursor update
until hw_done is signaled, which prevents the cursor commit from
destroying the state while it is still in use.

v2: The reference on the commit object needs to be obtained before
hw_done() is signaled, since that's the point where another commit
is allowed to modify the state. Assuming that the
new_crtc_state->commit object still exists within
wait_for_flip_done() is incorrect.

Fix by getting a reference in setup_commit(), and releasing it
during default_clear().
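
The lifetime rule behind the fix can be illustrated with a small userspace
sketch. This is not the kernel code: struct commit, struct crtc_slot and
every function below are hypothetical stand-ins, and the refcounting is
reduced to a plain integer. It only models the idea that the atomic state
pins the commit object with its own reference, so that another commit
dropping the CRTC state's reference cannot free it early.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct drm_crtc_commit; only the refcount matters. */
struct commit {
	int refcount;
	bool *freed;	/* set to true when the last reference is dropped */
};

static struct commit *commit_get(struct commit *c)
{
	c->refcount++;
	return c;
}

static void commit_put(struct commit *c)
{
	if (--c->refcount == 0) {
		*c->freed = true;
		free(c);
	}
}

/* Hypothetical stand-in for one entry of state->crtcs[]. */
struct crtc_slot {
	struct commit *crtc_state_commit;	/* models new_crtc_state->commit */
	struct commit *commit;			/* models state->crtcs[i].commit */
};

/* Models setup_commit(): create the commit, optionally pin it in the slot. */
static void setup_commit(struct crtc_slot *slot, bool *freed, bool take_extra_ref)
{
	struct commit *c = malloc(sizeof(*c));

	assert(c);
	c->refcount = 1;	/* reference owned by the CRTC state */
	c->freed = freed;
	slot->crtc_state_commit = c;
	slot->commit = take_extra_ref ? commit_get(c) : NULL;
}

/* Models thread X destroying the CRTC state that owned the commit. */
static void other_commit_destroys_state(struct crtc_slot *slot)
{
	commit_put(slot->crtc_state_commit);
	slot->crtc_state_commit = NULL;
}

/* Models default_clear(): drop the slot's own reference, if it holds one. */
static void default_clear(struct crtc_slot *slot)
{
	if (slot->commit) {
		commit_put(slot->commit);
		slot->commit = NULL;
	}
}

/* Reports whether the commit is still alive when W resumes its wait. */
static bool commit_alive_when_w_resumes(bool take_extra_ref)
{
	struct crtc_slot slot;
	bool freed = false;
	bool alive;

	setup_commit(&slot, &freed, take_extra_ref);
	other_commit_destroys_state(&slot);	/* X runs and cleans up */
	alive = !freed;				/* W switches back in */
	default_clear(&slot);			/* W clears its own state */
	return alive;
}
```

Calling commit_alive_when_w_resumes(false) models the old behaviour, where
the commit is already gone when W resumes. With true, the extra reference
keeps it alive until W's own state is cleared.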

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1539611200-6184-1-git-send-email-sunpeng.li@amd.com

authored by Leo Li and committed by Harry Wentland 4364bcb2 9068e02f

+24 -4
+5
drivers/gpu/drm/drm_atomic.c
···
 		state->crtcs[i].state = NULL;
 		state->crtcs[i].old_state = NULL;
 		state->crtcs[i].new_state = NULL;
+
+		if (state->crtcs[i].commit) {
+			drm_crtc_commit_put(state->crtcs[i].commit);
+			state->crtcs[i].commit = NULL;
+		}
 	}
 
 	for (i = 0; i < config->num_total_plane; i++) {
+8 -4
drivers/gpu/drm/drm_atomic_helper.c
···
 void drm_atomic_helper_wait_for_flip_done(struct drm_device *dev,
 					  struct drm_atomic_state *old_state)
 {
-	struct drm_crtc_state *new_crtc_state;
 	struct drm_crtc *crtc;
 	int i;
 
-	for_each_new_crtc_in_state(old_state, crtc, new_crtc_state, i) {
-		struct drm_crtc_commit *commit = new_crtc_state->commit;
+	for (i = 0; i < dev->mode_config.num_crtc; i++) {
+		struct drm_crtc_commit *commit = old_state->crtcs[i].commit;
 		int ret;
 
-		if (!commit)
+		crtc = old_state->crtcs[i].ptr;
+
+		if (!crtc || !commit)
 			continue;
 
 		ret = wait_for_completion_timeout(&commit->flip_done, 10 * HZ);
···
 		drm_crtc_commit_get(commit);
 
 		commit->abort_completion = true;
+
+		state->crtcs[i].commit = commit;
+		drm_crtc_commit_get(commit);
 	}
 
 	for_each_oldnew_connector_in_state(state, conn, old_conn_state, new_conn_state, i) {
+11
include/drm/drm_atomic.h
···
 struct __drm_crtcs_state {
 	struct drm_crtc *ptr;
 	struct drm_crtc_state *state, *old_state, *new_state;
+
+	/**
+	 * @commit:
+	 *
+	 * A reference to the CRTC commit object that is kept for use by
+	 * drm_atomic_helper_wait_for_flip_done() after
+	 * drm_atomic_helper_commit_hw_done() is called. This ensures that a
+	 * concurrent commit won't free a commit object that is still in use.
+	 */
+	struct drm_crtc_commit *commit;
+
 	s32 __user *out_fence_ptr;
 	u64 last_vblank_count;
 };