Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/ttm: fix two regressions since move_notify changes

Both changes in dc97b3409a790d2a21aac6e5cdb99558b5944119 cause serious
regressions in the nouveau driver.

move_notify() was originally able to presume that bo->mem is the old node,
and new_mem is the new node. The above commit moves the call to
move_notify() to after move() has been done, which means that now, sometimes,
new_mem isn't the new node at all, bo->mem is, and new_mem points at a
stale, possibly-just-been-killed-by-move node.

This is clearly not a good situation. This patch reverts this change, and
replaces it with a cleanup in the move() failure path instead.
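
The ordering described above can be sketched in isolation. This is a minimal model, not the TTM code: `struct mem_reg`, `struct buffer_object`, `handle_move()`, the `move_ret` parameter and the `seen_*` variables are hypothetical stand-ins, and the move itself is simulated by passing in its return code. It assumes the hook's contract is "bo->mem is the old node, new_mem is the new node", so the failure path swaps the two before notifying a second time.

```c
/* Hypothetical stand-ins for ttm_mem_reg and ttm_buffer_object. */
struct mem_reg { int mem_type; };
struct buffer_object { struct mem_reg mem; };

/* What the hook saw on its most recent call, recorded for inspection. */
static int seen_old_type = -1;
static int seen_new_type = -1;

/* Driver hook: relies on bo->mem being the old node, new_mem the new one. */
static void move_notify(struct buffer_object *bo, struct mem_reg *new_mem)
{
	seen_old_type = bo->mem.mem_type;
	seen_new_type = new_mem->mem_type;
}

/* move_ret simulates the result of the actual move (0 means success). */
static int handle_move(struct buffer_object *bo, struct mem_reg *mem,
		       int move_ret)
{
	/* Notify before the move, while bo->mem is still the old node. */
	move_notify(bo, mem);

	if (move_ret) {
		/* Failure: swap the nodes so the hook sees a move back. */
		struct mem_reg tmp_mem = *mem;
		*mem = bo->mem;
		bo->mem = tmp_mem;
		move_notify(bo, mem);
		bo->mem = *mem;	/* the bo keeps its old placement */
		return move_ret;
	}

	bo->mem = *mem;		/* success: the bo now lives in the new node */
	return 0;
}
```

In this model, a failed move from type 0 to type 2 leaves the hook having last seen a move from 2 back to 0, and the buffer object still at type 0.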

The second issue is that the call to move_notify() from cleanup_memtype_use()
causes the TTM ghost objects to get passed into the driver. This is clearly
bad as the driver knows nothing about these "fake" TTM BOs, and ends up
accessing uninitialised memory.

I worked around this in nouveau's move_notify() hook by ensuring the BO
destructor was nouveau's. I don't particularly like this solution, and
would rather TTM never pass the driver these objects. However, I don't
clearly understand the reason why we're calling move_notify() here anyway
and am happy to work around the problem in nouveau instead of breaking the
behaviour expected by other drivers.
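
The workaround amounts to using the destructor pointer as an ownership tag. The sketch below shows that pattern on its own, under stated assumptions: `struct driver_bo`, `driver_bo_destroy()`, `ghost_destroy()` and the `vma_count` field are hypothetical and only stand in for nouveau's types; they are not the driver's actual definitions.

```c
/* Hypothetical core object: all the core knows is how to destroy it. */
struct buffer_object {
	void (*destroy)(struct buffer_object *bo);
};

/* Hypothetical driver object embedding the core object as first member. */
struct driver_bo {
	struct buffer_object base;
	int vma_count;		/* driver-private state */
};

/* Destructor used only for objects the driver itself created. */
static void driver_bo_destroy(struct buffer_object *bo) { (void)bo; }

/* Destructor of a core-created "ghost" object the driver never set up. */
static void ghost_destroy(struct buffer_object *bo) { (void)bo; }

/* Returns 1 if the notification was handled, 0 if the object was skipped. */
static int driver_move_notify(struct buffer_object *bo)
{
	/* Not ours: the cast below would read memory we never initialised. */
	if (bo->destroy != driver_bo_destroy)
		return 0;

	struct driver_bo *dbo = (struct driver_bo *)bo;
	dbo->vma_count++;	/* touch driver-private state safely */
	return 1;
}
```

The check is cheap, but it ties correctness to every driver-created object using the same destructor, which is one reason to prefer that the core never hand over foreign objects in the first place.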

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Jerome Glisse <j.glisse@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

Authored by Ben Skeggs and committed by Dave Airlie
9f1feed2 9fc04b50

+17 -4
+4
drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -812,6 +812,10 @@
 	struct nouveau_bo *nvbo = nouveau_bo(bo);
 	struct nouveau_vma *vma;
 
+	/* ttm can now (stupidly) pass the driver bos it didn't create... */
+	if (bo->destroy != nouveau_bo_del_ttm)
+		return;
+
 	list_for_each_entry(vma, &nvbo->vma_list, head) {
 		if (new_mem && new_mem->mem_type == TTM_PL_VRAM) {
 			nouveau_vm_map(vma, new_mem->mm_node);
+13 -4
drivers/gpu/drm/ttm/ttm_bo.c
@@ -404,6 +404,9 @@
 		}
 	}
 
+	if (bdev->driver->move_notify)
+		bdev->driver->move_notify(bo, mem);
+
 	if (!(old_man->flags & TTM_MEMTYPE_FLAG_FIXED) &&
 	    !(new_man->flags & TTM_MEMTYPE_FLAG_FIXED))
 		ret = ttm_bo_move_ttm(bo, evict, no_wait_reserve, no_wait_gpu, mem);
@@ -413,11 +416,17 @@
 	else
 		ret = ttm_bo_move_memcpy(bo, evict, no_wait_reserve, no_wait_gpu, mem);
 
-	if (ret)
-		goto out_err;
+	if (ret) {
+		if (bdev->driver->move_notify) {
+			struct ttm_mem_reg tmp_mem = *mem;
+			*mem = bo->mem;
+			bo->mem = tmp_mem;
+			bdev->driver->move_notify(bo, mem);
+			bo->mem = *mem;
+		}
 
-	if (bdev->driver->move_notify)
-		bdev->driver->move_notify(bo, mem);
+		goto out_err;
+	}
 
 moved:
 	if (bo->evicted) {