Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Revert "drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole"

That commit causes NULL pointer dereferences in dmesgs when
running applications using ROCm, including clinfo, blender,
and PyTorch, since v6.6.1. Revert it to fix blender again.

This reverts commit 96c211f1f9ef82183493f4ceed4e347b52849149.

Closes: https://github.com/ROCm/ROCm/issues/2596
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2991
Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Kaibo Ma <ent3rm4n@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Authored by Kaibo Ma and committed by Alex Deucher
0f35b0a7 c966dc0e

+13 -13
drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
···
 	pdd->gpuvm_limit =
 		pdd->dev->kfd->shared_resources.gpuvm_size - 1;
 
-	/* dGPUs: the reserved space for kernel
-	 * before SVM
-	 */
-	pdd->qpd.cwsr_base = SVM_CWSR_BASE;
-	pdd->qpd.ib_base = SVM_IB_BASE;
-
 	pdd->scratch_base = MAKE_SCRATCH_APP_BASE_VI();
 	pdd->scratch_limit = MAKE_SCRATCH_APP_LIMIT(pdd->scratch_base);
 }
···
 	pdd->lds_base = MAKE_LDS_APP_BASE_V9();
 	pdd->lds_limit = MAKE_LDS_APP_LIMIT(pdd->lds_base);
 
-	pdd->gpuvm_base = PAGE_SIZE;
+	/* Raven needs SVM to support graphic handle, etc. Leave the small
+	 * reserved space before SVM on Raven as well, even though we don't
+	 * have to.
+	 * Set gpuvm_base and gpuvm_limit to CANONICAL addresses so that they
+	 * are used in Thunk to reserve SVM.
+	 */
+	pdd->gpuvm_base = SVM_USER_BASE;
 	pdd->gpuvm_limit =
 		pdd->dev->kfd->shared_resources.gpuvm_size - 1;
 
 	pdd->scratch_base = MAKE_SCRATCH_APP_BASE_V9();
 	pdd->scratch_limit = MAKE_SCRATCH_APP_LIMIT(pdd->scratch_base);
-
-	/*
-	 * Place TBA/TMA on opposite side of VM hole to prevent
-	 * stray faults from triggering SVM on these pages.
-	 */
-	pdd->qpd.cwsr_base = pdd->dev->kfd->shared_resources.gpuvm_size;
 }
 
 int kfd_init_apertures(struct kfd_process *process)
···
 					return -EINVAL;
 				}
 			}
+
+			/* dGPUs: the reserved space for kernel
+			 * before SVM
+			 */
+			pdd->qpd.cwsr_base = SVM_CWSR_BASE;
+			pdd->qpd.ib_base = SVM_IB_BASE;
 		}
 
 		dev_dbg(kfd_device, "node id %u\n", id);
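
The effect of the revert on the aperture layout can be sketched in standalone C. This is an illustration only, not driver code: every `DEMO_*` constant, the `demo_apertures` struct, and both helper functions are hypothetical, and the numeric values are assumptions chosen so that the IB page and the CWSR (TBA/TMA) area sit in a small reserved region just below the user SVM base, as the restored comments describe. The real definitions of `SVM_USER_BASE`, `SVM_CWSR_BASE`, and `SVM_IB_BASE` live in the amdkfd headers and may differ.

```c
#include <stdint.h>

/* Assumed values for illustration; not taken from the kernel headers. */
#define DEMO_PAGE_SIZE     4096ull
#define DEMO_CWSR_SIZE     (2 * DEMO_PAGE_SIZE)  /* assumed TBA + TMA size */
#define DEMO_SVM_USER_BASE (DEMO_CWSR_SIZE + 2 * DEMO_PAGE_SIZE)
#define DEMO_SVM_CWSR_BASE (DEMO_SVM_USER_BASE - DEMO_CWSR_SIZE)
#define DEMO_SVM_IB_BASE   (DEMO_SVM_CWSR_BASE - DEMO_PAGE_SIZE)

/* Hypothetical stand-in for the per-device aperture fields in the diff. */
struct demo_apertures {
	uint64_t gpuvm_base;
	uint64_t gpuvm_limit;
	uint64_t cwsr_base;
	uint64_t ib_base;
};

/* Layout restored by the revert: reserved kernel space before SVM. */
static struct demo_apertures demo_layout_reverted(uint64_t gpuvm_size)
{
	struct demo_apertures a = {
		.gpuvm_base  = DEMO_SVM_USER_BASE,
		.gpuvm_limit = gpuvm_size - 1,
		.cwsr_base   = DEMO_SVM_CWSR_BASE,
		.ib_base     = DEMO_SVM_IB_BASE,
	};
	return a;
}

/* Layout from the reverted commit (v9 path): TBA/TMA placed just past
 * gpuvm_limit. ib_base is left at 0 because the v9 hunk does not set it.
 */
static struct demo_apertures demo_layout_relocated(uint64_t gpuvm_size)
{
	struct demo_apertures a = {
		.gpuvm_base  = DEMO_PAGE_SIZE,
		.gpuvm_limit = gpuvm_size - 1,
		.cwsr_base   = gpuvm_size,
		.ib_base     = 0,
	};
	return a;
}
```

Calling `demo_layout_reverted(1ull << 47)` (the size is an assumed example) yields `ib_base < cwsr_base < gpuvm_base`, with the CWSR area ending exactly where the user range begins. `demo_layout_relocated` with the same size puts `cwsr_base` one byte above `gpuvm_limit`, which is the placement this commit undoes.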