Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm/thp: fix __split_huge_pmd_locked() for migration PMD

A migrating transparent huge page has to already be unmapped. Otherwise,
the page could be modified while it is being copied to a new page and data
could be lost. The function __split_huge_pmd() checks for a PMD migration
entry before calling __split_huge_pmd_locked(), which suggests that
__split_huge_pmd_locked() can handle splitting a migrating PMD.

However, __split_huge_pmd_locked() always increments page->_mapcount and
adjusts the memory control group accounting as if the page were mapped.
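
The accounting problem can be illustrated with a small userspace model. This
is only a sketch, not kernel code: struct toy_page, TOY_NR_SUBPAGES and the
toy_* helpers are hypothetical names invented for this example, and the only
thing borrowed from the kernel is the convention that a _mapcount of -1
means "not mapped".

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_SUBPAGES 8 /* stand-in for HPAGE_PMD_NR, kept small on purpose */

/* Hypothetical stand-in for the per-subpage struct page::_mapcount field. */
struct toy_page {
	int mapcount; /* kernel convention: -1 means "not mapped" */
};

/*
 * Models only the accounting step of a PMD split. When the PMD held a
 * migration entry, the huge page is already unmapped, so the per-subpage
 * counts must be left alone.
 */
static void toy_split_accounting(struct toy_page *pages, bool pmd_migration)
{
	for (size_t i = 0; i < TOY_NR_SUBPAGES; i++) {
		if (!pmd_migration)
			pages[i].mapcount++;
	}
}

/* Returns true if any subpage looks mapped (mapcount >= 0). */
static bool toy_any_mapped(const struct toy_page *pages)
{
	for (size_t i = 0; i < TOY_NR_SUBPAGES; i++) {
		if (pages[i].mapcount >= 0)
			return true;
	}
	return false;
}
```

In this model, calling toy_split_accounting() with pmd_migration set leaves
every count at -1, whereas an unconditional increment would make an unmapped
page appear mapped.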

Also, if the PMD entry is a migration PMD entry, the call to
is_huge_zero_pmd(*pmd) is incorrect because it calls pmd_pfn(pmd) instead
of migration_entry_to_pfn(pmd_to_swp_entry(pmd)). Fix these problems by
checking for a PMD migration entry.
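
Why the entry type has to be checked before the PFN is decoded can be shown
with a toy encoding. The layout below is an assumption made up for this
sketch and does not match any real architecture's PMD format; toy_pmd_t, the
TOY_* constants and the toy_* helpers are all hypothetical. The guarded
helper mirrors the shape of the fix, pmd_trans_huge(*pmd) &&
is_huge_zero_pmd(*pmd).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy encoding, NOT the real x86 or arm64 PMD layout. */
#define TOY_PRESENT   0x1u /* entry maps a page directly */
#define TOY_PFN_SHIFT 12   /* PFN position in a present entry */
#define TOY_SWP_SHIFT 20   /* migration entries keep the PFN elsewhere */

typedef uint64_t toy_pmd_t;

/* Hypothetical PFN of the huge zero page in this model. */
static const uint64_t toy_huge_zero_pfn = 0x1200;

static toy_pmd_t toy_mk_present(uint64_t pfn)
{
	return (pfn << TOY_PFN_SHIFT) | TOY_PRESENT;
}

static toy_pmd_t toy_mk_migration(uint64_t pfn)
{
	return pfn << TOY_SWP_SHIFT; /* present bit stays clear */
}

static bool toy_is_present(toy_pmd_t pmd)
{
	return (pmd & TOY_PRESENT) != 0;
}

/* Only meaningful for present entries. */
static uint64_t toy_pmd_pfn(toy_pmd_t pmd)
{
	return pmd >> TOY_PFN_SHIFT;
}

/* Unguarded check: decodes any entry as if it were present. */
static bool toy_is_zero_unguarded(toy_pmd_t pmd)
{
	return toy_pmd_pfn(pmd) == toy_huge_zero_pfn;
}

/* Guarded check: confirm the entry kind first, then compare the PFN. */
static bool toy_is_zero_guarded(toy_pmd_t pmd)
{
	return toy_is_present(pmd) && toy_is_zero_unguarded(pmd);
}
```

With this encoding, a migration entry for PFN 0x12 decodes to 0x1200 when it
is misread as a present entry, so the unguarded check reports a false match
while the guarded check rejects it.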

Fixes: 84c3fc4e9c56 ("mm: thp: check pmd migration entry in common path")
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org> [4.14+]
Link: https://lkml.kernel.org/r/20200903183140.19055-1-rcampbell@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Ralph Campbell and committed by Linus Torvalds
ec0abae6 b0399092

+23 -19
mm/huge_memory.c
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2022,7 +2022,7 @@
 		put_page(page);
 		add_mm_counter(mm, mm_counter_file(page), -HPAGE_PMD_NR);
 		return;
-	} else if (is_huge_zero_pmd(*pmd)) {
+	} else if (pmd_trans_huge(*pmd) && is_huge_zero_pmd(*pmd)) {
 		/*
 		 * FIXME: Do we want to invalidate secondary mmu by calling
 		 * mmu_notifier_invalidate_range() see comments below inside
@@ -2116,30 +2116,34 @@
 		pte = pte_offset_map(&_pmd, addr);
 		BUG_ON(!pte_none(*pte));
 		set_pte_at(mm, addr, pte, entry);
-		atomic_inc(&page[i]._mapcount);
+		if (!pmd_migration)
+			atomic_inc(&page[i]._mapcount);
 		pte_unmap(pte);
 	}
 
-	/*
-	 * Set PG_double_map before dropping compound_mapcount to avoid
-	 * false-negative page_mapped().
-	 */
-	if (compound_mapcount(page) > 1 && !TestSetPageDoubleMap(page)) {
-		for (i = 0; i < HPAGE_PMD_NR; i++)
-			atomic_inc(&page[i]._mapcount);
-	}
-
-	lock_page_memcg(page);
-	if (atomic_add_negative(-1, compound_mapcount_ptr(page))) {
-		/* Last compound_mapcount is gone. */
-		__dec_lruvec_page_state(page, NR_ANON_THPS);
-		if (TestClearPageDoubleMap(page)) {
-			/* No need in mapcount reference anymore */
+	if (!pmd_migration) {
+		/*
+		 * Set PG_double_map before dropping compound_mapcount to avoid
+		 * false-negative page_mapped().
+		 */
+		if (compound_mapcount(page) > 1 &&
+		    !TestSetPageDoubleMap(page)) {
 			for (i = 0; i < HPAGE_PMD_NR; i++)
-				atomic_dec(&page[i]._mapcount);
+				atomic_inc(&page[i]._mapcount);
 		}
+
+		lock_page_memcg(page);
+		if (atomic_add_negative(-1, compound_mapcount_ptr(page))) {
+			/* Last compound_mapcount is gone. */
+			__dec_lruvec_page_state(page, NR_ANON_THPS);
+			if (TestClearPageDoubleMap(page)) {
+				/* No need in mapcount reference anymore */
+				for (i = 0; i < HPAGE_PMD_NR; i++)
+					atomic_dec(&page[i]._mapcount);
+			}
+		}
+		unlock_page_memcg(page);
 	}
-	unlock_page_memcg(page);
 
 	smp_wmb(); /* make pte visible before pmd */
 	pmd_populate(mm, pmd, pgtable);