Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/amdgpu: put the SMC into the proper state on reset/unload

When doing a GPU reset or unloading the driver, we need to
put the SMU into the appropriate state for the re-init after
the reset or unload to work reliably.

I don't think this is necessary for BACO because the SMU actually
controls the BACO state, so it needs to be active.

For suspend (S3), the ASIC is put into D3, so the SMU would be
powered down. I don't think we need to put the SMU into
any special state in that case.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

+30
+1
drivers/gpu/drm/amd/amdgpu/amdgpu.h
···
 	/* record last mm index being written through WREG32*/
 	unsigned long last_mm_index;
 	bool in_gpu_reset;
+	enum pp_mp1_state mp1_state;
 	struct mutex lock_reset;
 	struct amdgpu_doorbell_index doorbell_index;
+27
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
···
 			DRM_ERROR("suspend of IP block <%s> failed %d\n",
 				  adev->ip_blocks[i].version->funcs->name, r);
 		}
+		/* handle putting the SMC in the appropriate state */
+		if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_SMC) {
+			if (is_support_sw_smu(adev)) {
+				/* todo */
+			} else if (adev->powerplay.pp_funcs &&
+				   adev->powerplay.pp_funcs->set_mp1_state) {
+				r = adev->powerplay.pp_funcs->set_mp1_state(
+					adev->powerplay.pp_handle,
+					adev->mp1_state);
+				if (r) {
+					DRM_ERROR("SMC failed to set mp1 state %d, %d\n",
+						  adev->mp1_state, r);
+				}
+			}
+		}
 	}
 
 	return 0;
···
 
 	atomic_inc(&adev->gpu_reset_counter);
 	adev->in_gpu_reset = 1;
+	switch (amdgpu_asic_reset_method(adev)) {
+	case AMD_RESET_METHOD_MODE1:
+		adev->mp1_state = PP_MP1_STATE_SHUTDOWN;
+		break;
+	case AMD_RESET_METHOD_MODE2:
+		adev->mp1_state = PP_MP1_STATE_RESET;
+		break;
+	default:
+		adev->mp1_state = PP_MP1_STATE_NONE;
+		break;
+	}
 	/* Block kfd: SRIOV would do it separately */
 	if (!amdgpu_sriov_vf(adev))
 		amdgpu_amdkfd_pre_reset(adev);
···
 	if (!amdgpu_sriov_vf(adev))
 		amdgpu_amdkfd_post_reset(adev);
 	amdgpu_vf_error_trans_all(adev);
+	adev->mp1_state = PP_MP1_STATE_NONE;
 	adev->in_gpu_reset = 0;
 	mutex_unlock(&adev->lock_reset);
 }
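The reset-method switch in amdgpu_device.c boils down to a small lookup from reset method to MP1 state. The following is a minimal user-space sketch of that mapping, not the driver code: the `sketch_*` enums and `sketch_mp1_state_for_reset()` are hypothetical stand-ins invented for illustration, and their numeric values are not meant to match the kernel's `amd_reset_method` or `pp_mp1_state` definitions.

```c
/* Hypothetical stand-ins for the kernel enums; values are illustrative only. */
enum sketch_reset_method {
	SKETCH_RESET_LEGACY,
	SKETCH_RESET_MODE1,
	SKETCH_RESET_MODE2,
	SKETCH_RESET_BACO,
};

enum sketch_mp1_state {
	SKETCH_MP1_NONE,
	SKETCH_MP1_SHUTDOWN,
	SKETCH_MP1_UNLOAD,
	SKETCH_MP1_RESET,
};

/*
 * Mirror the switch added in this patch: mode1 asks for a shutdown,
 * mode2 asks for a reset, and every other method (including BACO,
 * where the SMU has to stay active) requests no special state.
 */
static enum sketch_mp1_state
sketch_mp1_state_for_reset(enum sketch_reset_method method)
{
	switch (method) {
	case SKETCH_RESET_MODE1:
		return SKETCH_MP1_SHUTDOWN;
	case SKETCH_RESET_MODE2:
		return SKETCH_MP1_RESET;
	default:
		return SKETCH_MP1_NONE;
	}
}
```

The unload state is deliberately absent from this mapping; in the patch it is only set on the driver unload/shutdown path in amdgpu_drv.c.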
+2
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
···
 	 * unfortunately we can't detect certain
 	 * hypervisors so just do this all the time.
 	 */
+	adev->mp1_state = PP_MP1_STATE_UNLOAD;
 	amdgpu_device_ip_suspend(adev);
+	adev->mp1_state = PP_MP1_STATE_NONE;
 }
 
 static int amdgpu_pmops_suspend(struct device *dev)
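The amdgpu_drv.c change brackets the suspend call: it records the desired MP1 state, lets the suspend path hand it to an optional hook, then clears it again. Below is a small user-space sketch of that pattern under stated assumptions: `struct demo_dev`, `demo_record_state()`, `demo_ip_suspend()` and `demo_unload()` are all hypothetical names, and unlike the patch (which only logs a failure from `set_mp1_state`), this sketch propagates the hook's return value.

```c
#include <stddef.h>

/* Hypothetical stand-in for enum pp_mp1_state; values are illustrative. */
enum demo_mp1_state {
	DEMO_MP1_NONE,
	DEMO_MP1_SHUTDOWN,
	DEMO_MP1_UNLOAD,
	DEMO_MP1_RESET,
};

/* Hypothetical device: holds the pending state plus an optional hook. */
struct demo_dev {
	enum demo_mp1_state mp1_state;
	/* May be NULL, like pp_funcs->set_mp1_state in the patch. */
	int (*set_mp1_state)(struct demo_dev *dev, enum demo_mp1_state state);
	/* Bookkeeping so the behaviour can be observed. */
	enum demo_mp1_state last_sent;
	int sent_count;
};

/* Example hook that just records what it was asked to do. */
static int demo_record_state(struct demo_dev *dev, enum demo_mp1_state state)
{
	dev->last_sent = state;
	dev->sent_count++;
	return 0;
}

/* Suspend path: forward the pending state only if a hook is installed. */
static int demo_ip_suspend(struct demo_dev *dev)
{
	if (dev->set_mp1_state)
		return dev->set_mp1_state(dev, dev->mp1_state);
	return 0;
}

/* Unload path: set the state, suspend, then always restore NONE. */
static int demo_unload(struct demo_dev *dev)
{
	int r;

	dev->mp1_state = DEMO_MP1_UNLOAD;
	r = demo_ip_suspend(dev);
	dev->mp1_state = DEMO_MP1_NONE;
	return r;
}
```

Restoring the state to NONE after the call keeps a later, unrelated suspend from reusing a stale request, which appears to be why the patch resets `mp1_state` on both the unload and the GPU reset paths.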