Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/scheduler: signal scheduled fence when kill job

When an entity from application B is killed, drm_sched_entity_kill()
removes all jobs belonging to that entity through
drm_sched_entity_kill_jobs_work(). If application A's job depends on a
scheduled fence from application B's job, and that fence is not properly
signaled during the killing process, application A's dependency cannot be
cleared.

This leads to application A hanging indefinitely while waiting for a
dependency that will never be resolved. Fix this issue by ensuring that
scheduled fences are properly signaled when an entity is killed, allowing
dependent applications to continue execution.

Signed-off-by: Lin.Cao <lincao12@amd.com>
Reviewed-by: Philipp Stanner <phasta@kernel.org>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20250515020713.1110476-1-lincao12@amd.com

Authored by Lin.Cao and committed by Christian König
471db2c2 6692dbc1

 drivers/gpu/drm/scheduler/sched_entity.c | 1 +
 1 file changed, 1 insertion(+)

--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -176,6 +176,7 @@
 {
 	struct drm_sched_job *job = container_of(wrk, typeof(*job), work);
 
+	drm_sched_fence_scheduled(job->s_fence, NULL);
 	drm_sched_fence_finished(job->s_fence, -ESRCH);
 	WARN_ON(job->s_fence->parent);
 	job->sched->ops->free_job(job);
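
The hang described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: every `toy_*` name below is hypothetical, and the real scheduler uses `struct dma_fence` objects with callbacks rather than boolean flags. The model keeps just the one property that matters here: a fence has separate "scheduled" and "finished" signals, and a job waiting on the "scheduled" signal is not released by "finished" alone.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a scheduler fence: two independent signals. */
struct toy_sched_fence {
	bool scheduled;      /* set once the job has been "scheduled" */
	bool finished;       /* set once the job has completed or been killed */
	int  finished_error; /* 0 on success, negative errno otherwise */
};

/* Hypothetical job; dep points at another job's fence, or is NULL. */
struct toy_job {
	struct toy_sched_fence fence;
	const struct toy_sched_fence *dep;
};

static void toy_fence_scheduled(struct toy_sched_fence *f)
{
	f->scheduled = true;
}

static void toy_fence_finished(struct toy_sched_fence *f, int error)
{
	f->finished_error = error;
	f->finished = true;
}

/* Kill path modeled on the pre-patch code: only "finished" is signaled. */
static void toy_kill_job_unpatched(struct toy_job *job)
{
	toy_fence_finished(&job->fence, -ESRCH);
}

/* Kill path modeled on the patched code: "scheduled" first, then "finished". */
static void toy_kill_job_patched(struct toy_job *job)
{
	toy_fence_scheduled(&job->fence);
	toy_fence_finished(&job->fence, -ESRCH);
}

/* A job may run once the "scheduled" signal it depends on has fired. */
static bool toy_job_can_run(const struct toy_job *job)
{
	assert(job != NULL);
	return job->dep == NULL || job->dep->scheduled;
}

/* Kill B's job, then report whether A's dependent job is unblocked. */
static bool toy_demo(bool patched)
{
	struct toy_job b = { .dep = NULL };
	struct toy_job a = { .dep = &b.fence };

	if (patched)
		toy_kill_job_patched(&b);
	else
		toy_kill_job_unpatched(&b);

	return toy_job_can_run(&a);
}
```

In this model, `toy_demo(false)` returns `false` (job A stays blocked forever, mirroring the hang), while `toy_demo(true)` returns `true` because the kill path signals the scheduled fence before the finished fence.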