Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

drm/amdgpu: Cancel delayed work when GFXOFF is disabled

schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
checking for it to be 0 (Evan Quan)

Cc: stable@vger.kernel.org
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3
Acked-by: Christian König <christian.koenig@amd.com> # v3
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

authored by

Michel Dänzer and committed by
Alex Deucher
32bc8f83 2a7b9a84

+30 -17
+5 -6
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
··· 2777 2777 struct amdgpu_device *adev = 2778 2778 container_of(work, struct amdgpu_device, gfx.gfx_off_delay_work.work); 2779 2779 2780 - mutex_lock(&adev->gfx.gfx_off_mutex); 2781 - if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) { 2782 - if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, true)) 2783 - adev->gfx.gfx_off_state = true; 2784 - } 2785 - mutex_unlock(&adev->gfx.gfx_off_mutex); 2780 + WARN_ON_ONCE(adev->gfx.gfx_off_state); 2781 + WARN_ON_ONCE(adev->gfx.gfx_off_req_count); 2782 + 2783 + if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, true)) 2784 + adev->gfx.gfx_off_state = true; 2786 2785 } 2787 2786 2788 2787 /**
+25 -11
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
··· 563 563 564 564 mutex_lock(&adev->gfx.gfx_off_mutex); 565 565 566 - if (!enable) 567 - adev->gfx.gfx_off_req_count++; 568 - else if (adev->gfx.gfx_off_req_count > 0) 566 + if (enable) { 567 + /* If the count is already 0, it means there's an imbalance bug somewhere. 568 + * Note that the bug may be in a different caller than the one which triggers the 569 + * WARN_ON_ONCE. 570 + */ 571 + if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0)) 572 + goto unlock; 573 + 569 574 adev->gfx.gfx_off_req_count--; 570 575 571 - if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) { 572 - schedule_delayed_work(&adev->gfx.gfx_off_delay_work, GFX_OFF_DELAY_ENABLE); 573 - } else if (!enable && adev->gfx.gfx_off_state) { 574 - if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) { 575 - adev->gfx.gfx_off_state = false; 576 + if (adev->gfx.gfx_off_req_count == 0 && !adev->gfx.gfx_off_state) 577 + schedule_delayed_work(&adev->gfx.gfx_off_delay_work, GFX_OFF_DELAY_ENABLE); 578 + } else { 579 + if (adev->gfx.gfx_off_req_count == 0) { 580 + cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work); 576 581 577 - if (adev->gfx.funcs->init_spm_golden) { 578 - dev_dbg(adev->dev, "GFXOFF is disabled, re-init SPM golden settings\n"); 579 - amdgpu_gfx_init_spm_golden(adev); 582 + if (adev->gfx.gfx_off_state && 583 + !amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) { 584 + adev->gfx.gfx_off_state = false; 585 + 586 + if (adev->gfx.funcs->init_spm_golden) { 587 + dev_dbg(adev->dev, 588 + "GFXOFF is disabled, re-init SPM golden settings\n"); 589 + amdgpu_gfx_init_spm_golden(adev); 590 + } 580 591 } 581 592 } 593 + 594 + adev->gfx.gfx_off_req_count++; 582 595 } 583 596 597 + unlock: 584 598 mutex_unlock(&adev->gfx.gfx_off_mutex); 585 599 } 586 600