Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/amdgpu: Disable RPM helpers while reprobing connectors on resume

Just about all of amdgpu's connector probing functions try to acquire
runtime PM refs. If we try to do this in the context of
amdgpu_resume_kms by calling drm_helper_hpd_irq_event(), we end up
deadlocking the system.

Since we're guaranteed to be holding the spinlock for RPM in
amdgpu_resume_kms, and we already know the GPU is in working order, we
need to prevent the RPM helpers from trying to run during the initial
connector reprobe on resume.

There are a couple of solutions I've explored for fixing this, but this
one seems by far the simplest and most reliable (plus I'm pretty
sure that's what disable_depth is there for anyway).
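
The effect of bracketing the reprobe with disable_depth can be sketched in a small userspace model. This is only an illustration under stated assumptions, not kernel code: struct toy_pm and every toy_* function are hypothetical names made up for this example. The model assumes that an RPM "get" call made while disable_depth is nonzero fails immediately with -EACCES, and that one made during a resume with the helpers enabled would wait forever.

```c
#include <assert.h>
#include <errno.h>

/* Userspace toy model of a device's runtime PM state; not kernel code. */
struct toy_pm {
	unsigned int disable_depth; /* nonzero: RPM helpers are disabled */
	int resuming;               /* nonzero: a runtime resume is in flight */
	int blocked_calls;          /* calls that would have waited on the resume */
};

/*
 * Hypothetical stand-in for an RPM "get" helper. With the helpers
 * disabled it returns -EACCES at once (an assumption of this model);
 * otherwise a call made during a resume is counted as one that would
 * block forever.
 */
static int toy_rpm_get(struct toy_pm *pm)
{
	if (pm->disable_depth > 0)
		return -EACCES;
	if (pm->resuming) {
		pm->blocked_calls++;
		return -EDEADLK;
	}
	return 0;
}

/* Toy connector probe: tries to take an RPM ref before probing. */
static void toy_probe_connectors(struct toy_pm *pm)
{
	(void)toy_rpm_get(pm);
}

/*
 * Toy resume path. When guard is nonzero, the reprobe is bracketed by
 * disable_depth++ and disable_depth--, mirroring the shape of the patch.
 * Returns how many probe calls would have deadlocked.
 */
static int toy_resume(struct toy_pm *pm, int guard)
{
	unsigned int depth_before = pm->disable_depth;

	pm->resuming = 1;
	pm->blocked_calls = 0;

	if (guard)
		pm->disable_depth++;
	toy_probe_connectors(pm);
	if (guard)
		pm->disable_depth--;

	pm->resuming = 0;
	/* The increment and decrement must stay balanced. */
	assert(pm->disable_depth == depth_before);
	return pm->blocked_calls;
}
```

Calling toy_resume() on a zero-initialized struct toy_pm with guard set to 0 reports one call that would have blocked. With guard set to 1 it reports none, and disable_depth ends at its starting value.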

Reproduction recipe:
- Get any laptop with dual GPUs using PRIME
- Make sure runtime PM is enabled for amdgpu
- Boot the machine
- If the machine managed to boot without hanging, switch out of X to
another VT. This should definitely cause X to hang indefinitely.

Changes since v1:
- Add appropriate #ifdef checks for CONFIG_PM. This is not very
useful, but it appears some kernel test suites test compiling amdgpu
with CONFIG_PM disabled, which results in this patch breaking the build
if we don't include this #ifdef.

Cc: stable@vger.kernel.org
Cc: Alex Deucher <alexdeucher@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Lyude <cpaul@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Authored by Lyude and committed by Alex Deucher
23a1a9e5 69ee9742

+16
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
···
 	}
 
 	drm_kms_helper_poll_enable(dev);
+
+	/*
+	 * Most of the connector probing functions try to acquire runtime pm
+	 * refs to ensure that the GPU is powered on when connector polling is
+	 * performed. Since we're calling this from a runtime PM callback,
+	 * trying to acquire rpm refs will cause us to deadlock.
+	 *
+	 * Since we're guaranteed to be holding the rpm lock, it's safe to
+	 * temporarily disable the rpm helpers so this doesn't deadlock us.
+	 */
+#ifdef CONFIG_PM
+	dev->dev->power.disable_depth++;
+#endif
 	drm_helper_hpd_irq_event(dev);
+#ifdef CONFIG_PM
+	dev->dev->power.disable_depth--;
+#endif
 
 	if (fbcon) {
 		amdgpu_fbdev_set_suspend(adev, 0);