Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/amdgpu/gfx10: Refine Cleaner Shader for GFX10.1.10

This patch updates the cleaner shader, which is responsible for
initializing GPU resources such as Local Data Share (LDS), Vector
General Purpose Registers (VGPRs), and Scalar General Purpose Registers
(SGPRs). Changes include adjustments to register clearing and shader
configuration.

- Updated the SGPR-clearing instruction encodings in the cleaner shader
binary (for example, `be803080` became `be803000`).
- Simplified the logic in the SGPR clearing section, ensuring all SGPRs
are set to zero.

Fixes: 25961bad9212 ("drm/amdgpu/gfx10: Add cleaner shader for GFX10.1.10")
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Manu Rastogi <manu.rastogi@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Authored by Vitaly Prosyak and committed by Alex Deucher
d26625d0 719d84f8

+9 -10
+3 -3
drivers/gpu/drm/amd/amdgpu/gfx_v10_0_cleaner_shader.h
···
 0xd70f6a01, 0x000202ff,
 0x00000400, 0x80828102,
 0xbf84fff7, 0xbefc03ff,
-0x00000068, 0xbe803080,
-0xbe813080, 0xbe823080,
-0xbe833080, 0x80fc847c,
+0x00000068, 0xbe803000,
+0xbe813000, 0xbe823000,
+0xbe833000, 0x80fc847c,
 0xbf84fffa, 0xbeea0480,
 0xbeec0480, 0xbeee0480,
 0xbef00480, 0xbef20480,
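The four changed words differ only in their low byte (`0x80` became `0x00`). The sketch below splits each word into fields to make that visible. It assumes the SOP1 scalar-instruction layout described in AMD's public RDNA ISA documentation (encoding in bits 31:23, destination in 22:16, opcode in 15:8, source in 7:0), and that source code 128 denotes the inline constant 0 while codes 0 to 105 denote SGPRs. The helper names are hypothetical and are not part of the kernel tree.

```python
# Assumed SOP1 layout (from AMD's public RDNA ISA documentation):
#   [31:23] encoding = 0b101111101, [22:16] SDST, [15:8] OP, [7:0] SSRC0
SOP1_ENCODING = 0x17D


def decode_sop1(word):
    """Split a 32-bit word into SOP1 fields (hypothetical helper)."""
    if (word >> 23) != SOP1_ENCODING:
        raise ValueError(f"0x{word:08x} does not carry the SOP1 encoding")
    return {
        "sdst": (word >> 16) & 0x7F,
        "op": (word >> 8) & 0xFF,
        "ssrc0": word & 0xFF,
    }


def describe_operand(code):
    """Name a source operand code under the assumptions stated above."""
    if code <= 105:
        return f"s{code}"
    if code == 128:
        return "0"
    return f"operand#{code}"


# Words taken from the removed and added lines of the header diff.
OLD_WORDS = [0xBE803080, 0xBE813080, 0xBE823080, 0xBE833080]
NEW_WORDS = [0xBE803000, 0xBE813000, 0xBE823000, 0xBE833000]

for old, new in zip(OLD_WORDS, NEW_WORDS):
    o, n = decode_sop1(old), decode_sop1(new)
    print(
        f"op 0x{o['op']:02x}: dst s{o['sdst']}, "
        f"src {describe_operand(o['ssrc0'])} -> {describe_operand(n['ssrc0'])}"
    )
```

Under these assumptions the destination and opcode fields are unchanged and only the source operand moves from the constant 0 to `s0`, which lines up with the `s_movreld_b32` change in the `.asm` diff.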
+6 -7
drivers/gpu/drm/amd/amdgpu/gfx_v10_1_10_cleaner_shader.asm
···
 type(CS)
 wave_size(32)
 // Note: original source code from SQ team
-
 //
 // Create 32 waves in a threadgroup (CS waves)
 // Each allocates 64 VGPRs
···
 s_sub_u32 s2, s2, 8
 s_cbranch_scc0 label_0005
 //
-s_mov_b32 s2, 0x80000000 // Bit31 is first_wave
-s_and_b32 s2, s2, s0 // sgpr0 has tg_size (first_wave) term as in ucode only COMPUTE_PGM_RSRC2.tg_size_en is set
+s_mov_b32 s2, 0x80000000 // Bit31 is first_wave
+s_and_b32 s2, s2, s1 // sgpr0 has tg_size (first_wave) term as in ucode only COMPUTE_PGM_RSRC2.tg_size_en is set
 s_cbranch_scc0 label_0023 // Clean LDS if its first wave of ThreadGroup/WorkGroup
 // CLEAR LDS
 //
···
 label_0023:
 s_mov_b32 m0, 0x00000068 // Loop 108/4=27 times (loop unrolled for performance)
 label_sgpr_loop:
-s_movreld_b32 s0, 0
-s_movreld_b32 s1, 0
-s_movreld_b32 s2, 0
-s_movreld_b32 s3, 0
+s_movreld_b32 s0, s0
+s_movreld_b32 s1, s0
+s_movreld_b32 s2, s0
+s_movreld_b32 s3, s0
 s_sub_u32 m0, m0, 4
 s_cbranch_scc0 label_sgpr_loop
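To check the "108/4=27" comment against the loop in the last hunk, the following is a rough Python model of `label_sgpr_loop` as it reads after the patch. It assumes that `s_movreld_b32 sN, s0` writes the value of `s0` to register slot `N + m0`, and that `s_sub_u32` sets SCC only when the unsigned subtraction borrows. Both are taken from the public ISA description, not from this commit. The flat 108-slot register list, the function name, and the fill value are hypothetical. Whether `s0` actually holds zero when the loop starts is not visible in this diff, so it is left as a parameter.

```python
SLOTS = 108  # hypothetical flat model of register slots 0..107


def run_sgpr_loop(s0_value, fill=0xDEADBEEF):
    """Model label_sgpr_loop; returns (registers, iteration count)."""
    regs = [fill] * SLOTS
    regs[0] = s0_value
    m0 = 0x68  # s_mov_b32 m0, 0x00000068
    iterations = 0
    while True:
        for dst in range(4):
            # s_movreld_b32 s<dst>, s0: destination index is offset by m0
            regs[dst + m0] = regs[0]
        iterations += 1
        borrow = m0 < 4  # s_sub_u32 sets SCC on unsigned borrow
        m0 = (m0 - 4) & 0xFFFFFFFF
        if borrow:
            break  # s_cbranch_scc0 is not taken once SCC is set
    return regs, iterations


regs, count = run_sgpr_loop(0)
print(count, all(value == 0 for value in regs))
```

In this model the loop body runs for `m0` = 104, 100, ..., 0 and every slot ends up holding whatever `s0` contained on entry, so the registers are zeroed only if `s0` is zero at that point.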