Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm/khugepaged: invoke MMU notifiers in shmem/file collapse paths

Any codepath that zaps page table entries must invoke MMU notifiers to
ensure that secondary MMUs (like KVM) don't keep accessing pages which
aren't mapped anymore. Secondary MMUs don't hold their own references to
pages that are mirrored over, so failing to notify them can lead to page
use-after-free.

I'm marking this as addressing an issue introduced in commit f3f0e1d2150b
("khugepaged: add support of collapse for tmpfs/shmem pages"), but most of
the security impact of this only came in commit 27e1f8273113 ("khugepaged:
enable collapse pmd for pte-mapped THP"), which actually omitted flushes
for the removal of present PTEs, not just for the removal of empty page
tables.

Link: https://lkml.kernel.org/r/20221129154730.2274278-3-jannh@google.com
Link: https://lkml.kernel.org/r/20221128180252.1684965-3-jannh@google.com
Link: https://lkml.kernel.org/r/20221125213714.4115729-3-jannh@google.com
Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Jann Horn and committed by Andrew Morton
f268f6cf 2ba99c5e

mm/khugepaged.c (+5)
···
 				  unsigned long addr, pmd_t *pmdp)
 {
 	pmd_t pmd;
+	struct mmu_notifier_range range;
 
 	mmap_assert_write_locked(mm);
 	if (vma->vm_file)
···
 	if (vma->anon_vma)
 		lockdep_assert_held_write(&vma->anon_vma->root->rwsem);
 
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, NULL, mm, addr,
+				addr + HPAGE_PMD_SIZE);
+	mmu_notifier_invalidate_range_start(&range);
 	pmd = pmdp_collapse_flush(vma, addr, pmdp);
 	tlb_remove_table_sync_one();
+	mmu_notifier_invalidate_range_end(&range);
 	mm_dec_nr_ptes(mm);
 	page_table_check_pte_clear_range(mm, addr, pmd);
 	pte_free(mm, pmd_pgtable(pmd));
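
The start/end bracket that the patch places around the page table teardown can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `model_*` name below is hypothetical and is not a kernel API, the "secondary MMU" is reduced to an array of cached translations, and locking and concurrency are ignored. It shows that a zap without notification leaves the secondary side holding translations for pages the primary side no longer maps, while a bracketed zap does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_NPAGES 8

struct model_primary {
	bool present[MODEL_NPAGES];	/* primary page table: is page i mapped? */
};

struct model_secondary {
	bool mirrored[MODEL_NPAGES];	/* translations cached by the secondary MMU */
	int invalidations_in_flight;	/* > 0 between range start and range end */
};

/* Start of an invalidation: drop cached translations for [start, end). */
static void model_invalidate_range_start(struct model_secondary *s,
					 size_t start, size_t end)
{
	s->invalidations_in_flight++;
	for (size_t i = start; i < end && i < MODEL_NPAGES; i++)
		s->mirrored[i] = false;
}

/* End of an invalidation: the secondary MMU may establish mirrors again. */
static void model_invalidate_range_end(struct model_secondary *s)
{
	assert(s->invalidations_in_flight > 0);
	s->invalidations_in_flight--;
}

/* The secondary MMU mirrors a page only if it is mapped and no invalidation is running. */
static bool model_secondary_fault(struct model_secondary *s,
				  const struct model_primary *p, size_t page)
{
	if (page >= MODEL_NPAGES || s->invalidations_in_flight > 0 ||
	    !p->present[page])
		return false;
	s->mirrored[page] = true;
	return true;
}

/* Zap primary mappings in [start, end), optionally bracketed by notifications. */
static void model_zap_range(struct model_primary *p, struct model_secondary *s,
			    size_t start, size_t end, bool notify)
{
	if (notify)
		model_invalidate_range_start(s, start, end);
	for (size_t i = start; i < end && i < MODEL_NPAGES; i++)
		p->present[i] = false;
	if (notify)
		model_invalidate_range_end(s);
}

/* Returns how many pages the secondary MMU still mirrors after they were zapped. */
int model_run(bool notify)
{
	struct model_primary p;
	struct model_secondary s = { .invalidations_in_flight = 0 };
	int stale = 0;

	for (size_t i = 0; i < MODEL_NPAGES; i++)
		p.present[i] = true;

	model_secondary_fault(&s, &p, 2);
	model_secondary_fault(&s, &p, 3);
	model_zap_range(&p, &s, 2, 4, notify);

	for (size_t i = 0; i < MODEL_NPAGES; i++)
		if (s.mirrored[i] && !p.present[i])
			stale++;
	return stale;
}
```

In this model, `model_run(false)` returns 2 (two stale mirrored pages, the analogue of the use-after-free risk described in the commit message) and `model_run(true)` returns 0.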