Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: page migration avoid touching newpage until no going back

We have had trouble in the past from the way in which page migration's
newpage is initialized in dribs and drabs - see commit 8bdd63809160 ("mm:
fix direct reclaim writeback regression") which proposed a cleanup.

We have no actual problem now, but I think the procedure would be clearer
(and alternative get_new_page pools safer to implement) if we assert that
newpage is not touched until we are sure that it's going to be used -
except for taking the trylock on it in __unmap_and_move().

So shift the early initializations from move_to_new_page() into
migrate_page_move_mapping(), mapping and NULL-mapping paths. Similarly
migrate_huge_page_move_mapping(), but its NULL-mapping path can just be
deleted: you cannot reach hugetlbfs_migrate_page() with a NULL mapping.

Adjust stages 3 to 8 in the Documentation file accordingly.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Hugh Dickins, committed by Linus Torvalds
commit cf4b769a, parent 470f119f

2 files changed, +30 -38

Documentation/vm/page_migration: +9 -10
--- a/Documentation/vm/page_migration
+++ b/Documentation/vm/page_migration
@@ -92,26 +92,25 @@
 
 2. Insure that writeback is complete.
 
-3. Prep the new page that we want to move to. It is locked
-   and set to not being uptodate so that all accesses to the new
-   page immediately lock while the move is in progress.
+3. Lock the new page that we want to move to. It is locked so that accesses to
+   this (not yet uptodate) page immediately lock while the move is in progress.
 
-4. The new page is prepped with some settings from the old page so that
-   accesses to the new page will discover a page with the correct settings.
-
-5. All the page table references to the page are converted to migration
+4. All the page table references to the page are converted to migration
    entries. This decreases the mapcount of a page. If the resulting
    mapcount is not zero then we do not migrate the page. All user space
    processes that attempt to access the page will now wait on the page lock.
 
-6. The radix tree lock is taken. This will cause all processes trying
+5. The radix tree lock is taken. This will cause all processes trying
    to access the page via the mapping to block on the radix tree spinlock.
 
-7. The refcount of the page is examined and we back out if references remain
+6. The refcount of the page is examined and we back out if references remain
    otherwise we know that we are the only one referencing this page.
 
-8. The radix tree is checked and if it does not contain the pointer to this
+7. The radix tree is checked and if it does not contain the pointer to this
    page then we back out because someone else modified the radix tree.
+
+8. The new page is prepped with some settings from the old page so that
+   accesses to the new page will discover a page with the correct settings.
 
 9. The radix tree is changed to point to the new page.
 
mm/migrate.c: +21 -28
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -320,5 +320,13 @@
 		/* Anonymous page without mapping */
 		if (page_count(page) != expected_count)
 			return -EAGAIN;
+
+		/* No turning back from here */
+		set_page_memcg(newpage, page_memcg(page));
+		newpage->index = page->index;
+		newpage->mapping = page->mapping;
+		if (PageSwapBacked(page))
+			SetPageSwapBacked(newpage);
+
 		return MIGRATEPAGE_SUCCESS;
 	}
@@ -363,8 +355,15 @@
 	}
 
 	/*
-	 * Now we know that no one else is looking at the page.
+	 * Now we know that no one else is looking at the page:
+	 * no turning back from here.
 	 */
+	set_page_memcg(newpage, page_memcg(page));
+	newpage->index = page->index;
+	newpage->mapping = page->mapping;
+	if (PageSwapBacked(page))
+		SetPageSwapBacked(newpage);
+
 	get_page(newpage);	/* add cache reference */
 	if (PageSwapCache(page)) {
 		SetPageSwapCache(newpage);
@@ -418,12 +403,6 @@
 	int expected_count;
 	void **pslot;
 
-	if (!mapping) {
-		if (page_count(page) != 1)
-			return -EAGAIN;
-		return MIGRATEPAGE_SUCCESS;
-	}
-
 	spin_lock_irq(&mapping->tree_lock);
 
 	pslot = radix_tree_lookup_slot(&mapping->page_tree,
@@ -435,6 +426,9 @@
 		return -EAGAIN;
 	}
 
+	set_page_memcg(newpage, page_memcg(page));
+	newpage->index = page->index;
+	newpage->mapping = page->mapping;
 	get_page(newpage);
 
 	radix_tree_replace_slot(pslot, newpage);
@@ -742,21 +730,6 @@
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 	VM_BUG_ON_PAGE(!PageLocked(newpage), newpage);
 
-	/* Prepare mapping for the new page.*/
-	newpage->index = page->index;
-	newpage->mapping = page->mapping;
-	if (PageSwapBacked(page))
-		SetPageSwapBacked(newpage);
-
-	/*
-	 * Indirectly called below, migrate_page_copy() copies PG_dirty and thus
-	 * needs newpage's memcg set to transfer memcg dirty page accounting.
-	 * So perform memcg migration in two steps:
-	 * 1. set newpage->mem_cgroup (here)
-	 * 2. clear page->mem_cgroup (below)
-	 */
-	set_page_memcg(newpage, page_memcg(page));
-
 	mapping = page_mapping(page);
 	if (!mapping)
 		rc = migrate_page(mapping, newpage, page, mode);
@@ -764,9 +767,6 @@
 		set_page_memcg(page, NULL);
 		if (!PageAnon(page))
 			page->mapping = NULL;
-	} else {
-		set_page_memcg(newpage, NULL);
-		newpage->mapping = NULL;
 	}
 	return rc;
 }
@@ -965,10 +971,9 @@
 	 * it. Otherwise, putback_lru_page() will drop the reference grabbed
 	 * during isolation.
 	 */
-	if (put_new_page) {
-		ClearPageSwapBacked(newpage);
+	if (put_new_page)
 		put_new_page(newpage, private);
-	} else if (unlikely(__is_movable_balloon_page(newpage))) {
+	else if (unlikely(__is_movable_balloon_page(newpage))) {
 		/* drop our reference, page already in the balloon */
 		put_page(newpage);
 	} else