Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm/hugetlb: fix two comments related to huge_pmd_unshare()

Ever since we stopped using the page count to detect shared PMD page
tables, these comments are outdated.

The only reason we have to flush the TLB early is that once we drop the
i_mmap_rwsem, the previously shared page table could get freed (and then
get reallocated and used for another purpose). So we really have to
flush the TLB before that can happen.

So let's simplify the comments a bit.

The "If we unshared PMDs, the TLB flush was not recorded in mmu_gather."
part introduced in commit a4a118f2eead ("hugetlbfs: flush TLBs
correctly after huge_pmd_unshare") was confusing: of course it is
recorded in the mmu_gather, otherwise tlb_flush_mmu_tlbonly() wouldn't
do anything. So let's drop that comment while at it as well.

We'll centralize these comments in a single helper as we rework the code
next.

Link: https://lkml.kernel.org/r/20251223214037.580860-3-david@kernel.org
Fixes: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count")
Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org>
Reviewed-by: Rik van Riel <riel@surriel.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Cc: Liu Shixin <liushixin2@huawei.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: "Uschakow, Stanislav" <suschako@amazon.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by David Hildenbrand (Red Hat), committed by Andrew Morton
3937027c ca1a47cd

+8 -16
mm/hugetlb.c
@@ -5320,17 +5320,10 @@
 	tlb_end_vma(tlb, vma);
 
 	/*
-	 * If we unshared PMDs, the TLB flush was not recorded in mmu_gather. We
-	 * could defer the flush until now, since by holding i_mmap_rwsem we
-	 * guaranteed that the last reference would not be dropped. But we must
-	 * do the flushing before we return, as otherwise i_mmap_rwsem will be
-	 * dropped and the last reference to the shared PMDs page might be
-	 * dropped as well.
-	 *
-	 * In theory we could defer the freeing of the PMD pages as well, but
-	 * huge_pmd_unshare() relies on the exact page_count for the PMD page to
-	 * detect sharing, so we cannot defer the release of the page either.
-	 * Instead, do flush now.
+	 * There is nothing protecting a previously-shared page table that we
+	 * unshared through huge_pmd_unshare() from getting freed after we
+	 * release i_mmap_rwsem, so flush the TLB now. If huge_pmd_unshare()
+	 * succeeded, flush the range corresponding to the pud.
 	 */
 	if (force_flush)
 		tlb_flush_mmu_tlbonly(tlb);
@@ -6545,11 +6552,10 @@
 		cond_resched();
 	}
 	/*
-	 * Must flush TLB before releasing i_mmap_rwsem: x86's huge_pmd_unshare
-	 * may have cleared our pud entry and done put_page on the page table:
-	 * once we release i_mmap_rwsem, another task can do the final put_page
-	 * and that page table be reused and filled with junk. If we actually
-	 * did unshare a page of pmds, flush the range corresponding to the pud.
+	 * There is nothing protecting a previously-shared page table that we
+	 * unshared through huge_pmd_unshare() from getting freed after we
+	 * release i_mmap_rwsem, so flush the TLB now. If huge_pmd_unshare()
+	 * succeeded, flush the range corresponding to the pud.
 	 */
 	if (shared_pmd)
 		flush_hugetlb_tlb_range(vma, range.start, range.end);