Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/xe: Resume TDR after GT reset

Not restarting the TDR after a GT reset on exec queues which have been
restarted can allow jobs to run forever. Fix this by restarting the
TDR.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240724235919.1917216-1-matthew.brost@intel.com
(cherry picked from commit 8ec5a4e5ce97d6ee9f5eb5b4ce4cfc831976fdec)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

Authored by Matthew Brost and committed by Lucas De Marchi
1b30f87e 6ef5a042

+8
+5
drivers/gpu/drm/xe/xe_gpu_scheduler.c
@@ -90,6 +90,11 @@
 	cancel_work_sync(&sched->work_process_msg);
 }
 
+void xe_sched_submission_resume_tdr(struct xe_gpu_scheduler *sched)
+{
+	drm_sched_resume_timeout(&sched->base, sched->base.timeout);
+}
+
 void xe_sched_add_msg(struct xe_gpu_scheduler *sched,
 		      struct xe_sched_msg *msg)
 {
+2
drivers/gpu/drm/xe/xe_gpu_scheduler.h
@@ -22,6 +22,8 @@
 void xe_sched_submission_start(struct xe_gpu_scheduler *sched);
 void xe_sched_submission_stop(struct xe_gpu_scheduler *sched);
 
+void xe_sched_submission_resume_tdr(struct xe_gpu_scheduler *sched);
+
 void xe_sched_add_msg(struct xe_gpu_scheduler *sched,
 		      struct xe_sched_msg *msg);
 void xe_sched_add_msg_locked(struct xe_gpu_scheduler *sched,
+1
drivers/gpu/drm/xe/xe_guc_submit.c
@@ -1826,6 +1826,7 @@
 	}
 
 	xe_sched_submission_start(sched);
+	xe_sched_submission_resume_tdr(sched);
 }
 
 int xe_guc_submit_start(struct xe_guc *guc)
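
The failure mode described in the commit message can be illustrated with a small userspace model. This is only a sketch and not driver code: every toy_* name below is hypothetical, and the model simply assumes that the timeout watchdog is left unarmed once a reset has happened, which is the situation the commit message describes. It shows that restarting submission alone leaves a job without a deadline, while re-arming the watchdog restores one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy scheduler; not the real struct xe_gpu_scheduler. */
struct toy_sched {
	bool running;   /* submission is enabled */
	bool tdr_armed; /* timeout watchdog is armed */
	long timeout;   /* ticks a job may run before the watchdog fires */
};

/* Assumption of this model: a reset stops submission and leaves the
 * watchdog unarmed. */
void toy_gt_reset(struct toy_sched *s)
{
	assert(s != NULL);
	s->running = false;
	s->tdr_armed = false;
}

/* Restart submission only; the watchdog state is left untouched. */
void toy_submission_start(struct toy_sched *s)
{
	assert(s != NULL);
	s->running = true;
}

/* Re-arm the watchdog, mirroring the intent of the added helper. */
void toy_submission_resume_tdr(struct toy_sched *s)
{
	assert(s != NULL);
	s->tdr_armed = true;
}

/*
 * Simulate a job that never completes on its own. Returns the tick at
 * which the watchdog would fire, or -1 if the job is still running
 * after max_ticks.
 */
long toy_ticks_until_timeout(const struct toy_sched *s, long max_ticks)
{
	assert(s != NULL);
	for (long t = 1; t <= max_ticks; t++) {
		if (s->running && s->tdr_armed && t >= s->timeout)
			return t;
	}
	return -1;
}
```

In this model, calling toy_gt_reset() followed only by toy_submission_start() makes toy_ticks_until_timeout() return -1 for any bound, so the hung job is never caught. Adding a toy_submission_resume_tdr() call after the restart makes it return the configured timeout instead.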