Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/panfrost: Use DRM_GPU_SCHED_STAT_NO_HANG to skip the reset

Panfrost can skip the reset if the TDR (timeout detection and recovery)
handler has fired before the free-job worker. Currently, Panfrost takes no
action in this scenario, so the job is leaked because `free_job()` is
never called.

To avoid such leaks, use the new status code DRM_GPU_SCHED_STAT_NO_HANG to
inform the scheduler that the job did not actually time out and that no
reset was performed.
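
The decision described above can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not the kernel code:
every `model_*` / `MODEL_*` name is hypothetical, the boolean field stands
in for `dma_fence_is_signaled(job->done_fence)`, and the scheduler-side
behaviour (putting the job back on a pending list so that it can still be
freed) is an assumption about how the status code is consumed, not
something this patch shows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the scheduler status codes named above. */
enum model_sched_stat {
	MODEL_STAT_RESET,	/* models DRM_GPU_SCHED_STAT_RESET */
	MODEL_STAT_NO_HANG,	/* models DRM_GPU_SCHED_STAT_NO_HANG */
};

/* Hypothetical, heavily reduced job state. */
struct model_job {
	bool done_fence_signaled;	/* models dma_fence_is_signaled() */
	bool on_pending_list;		/* can the free-job worker still find it? */
};

/* Driver side: report a spurious timeout instead of claiming a reset. */
static enum model_sched_stat model_timedout_job(const struct model_job *job)
{
	assert(job != NULL);

	/* The job already completed, so there is nothing to reset. */
	if (job->done_fence_signaled)
		return MODEL_STAT_NO_HANG;

	/* A real driver would reset the hardware here (not modelled). */
	return MODEL_STAT_RESET;
}

/*
 * Scheduler side (assumption): the job is taken off the pending list while
 * the timeout handler runs and is put back when no hang is reported, so it
 * is not leaked. Returns true if a reset was reported.
 */
static bool model_sched_handle_timeout(struct model_job *job)
{
	job->on_pending_list = false;

	if (model_timedout_job(job) == MODEL_STAT_NO_HANG) {
		job->on_pending_list = true;
		return false;
	}

	/* What happens to the job after a reset is not modelled here. */
	return true;
}
```

In this model, a job whose fence has already signaled goes back on the
pending list and no reset is reported, while a job that is still running
takes the reset path.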

Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-8-5c5ba4f55039@igalia.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>

+4 -4
drivers/gpu/drm/panfrost/panfrost_job.c
@@ -751,11 +751,11 @@
 	int js = panfrost_job_get_slot(job);
 
 	/*
-	 * If the GPU managed to complete this jobs fence, the timeout is
-	 * spurious. Bail out.
+	 * If the GPU managed to complete this jobs fence, the timeout has
+	 * fired before free-job worker. The timeout is spurious, so bail out.
 	 */
 	if (dma_fence_is_signaled(job->done_fence))
-		return DRM_GPU_SCHED_STAT_RESET;
+		return DRM_GPU_SCHED_STAT_NO_HANG;
 
 	/*
 	 * Panfrost IRQ handler may take a long time to process an interrupt
@@ -770,7 +770,7 @@
 
 	if (dma_fence_is_signaled(job->done_fence)) {
 		dev_warn(pfdev->dev, "unexpectedly high interrupt latency\n");
-		return DRM_GPU_SCHED_STAT_RESET;
+		return DRM_GPU_SCHED_STAT_NO_HANG;
 	}
 
 	dev_err(pfdev->dev, "gpu sched timeout, js=%d, config=0x%x, status=0x%x, head=0x%x, tail=0x%x, sched_job=%p",