Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: hugetlb: introduce page_huge_active

We are not safe against isolate_huge_page() being called on a hugepage
concurrently, which can leave the victim hugepage in an invalid state and
result in a BUG_ON().

The root problem is that we don't keep any (easily accessible) information
in struct page about a hugepage's activeness. Note that a hugepage's
activeness simply means being linked to hstate->hugepage_activelist, which
is not the same as normal pages' activeness represented by the PageActive
flag.

Normal pages are isolated by isolate_lru_page(), which prechecks PageLRU
before isolation, so let's do similarly for hugetlb with a new
page_huge_active().

set/clear_page_huge_active() should be called while holding hugetlb_lock.
But hugetlb_cow() and hugetlb_no_page() don't do this; that is justified
because there set_page_huge_active() is called right after the hugepage is
allocated, when no other thread can be trying to isolate it.

[akpm@linux-foundation.org: s/PageHugeActive/page_huge_active/, make it return bool]
[fengguang.wu@intel.com: set_page_huge_active() can be static]
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Hugh Dickins <hughd@google.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

authored by Naoya Horiguchi, committed by Linus Torvalds
commit bcc54222, parent 822fc613
2 files changed: +50 -5

mm/hugetlb.c (+38 -3)
@@ -924,6 +924,31 @@
 	return NULL;
 }
 
+/*
+ * Test to determine whether the hugepage is "active/in-use" (i.e. being linked
+ * to hstate->hugepage_activelist.)
+ *
+ * This function can be called for tail pages, but never returns true for them.
+ */
+bool page_huge_active(struct page *page)
+{
+	VM_BUG_ON_PAGE(!PageHuge(page), page);
+	return PageHead(page) && PagePrivate(&page[1]);
+}
+
+/* never called for tail page */
+static void set_page_huge_active(struct page *page)
+{
+	VM_BUG_ON_PAGE(!PageHeadHuge(page), page);
+	SetPagePrivate(&page[1]);
+}
+
+static void clear_page_huge_active(struct page *page)
+{
+	VM_BUG_ON_PAGE(!PageHeadHuge(page), page);
+	ClearPagePrivate(&page[1]);
+}
+
 void free_huge_page(struct page *page)
 {
 	/*
@@ -952,6 +977,7 @@
 		restore_reserve = true;
 
 	spin_lock(&hugetlb_lock);
+	clear_page_huge_active(page);
 	hugetlb_cgroup_uncharge_page(hstate_index(h),
 				     pages_per_huge_page(h), page);
 	if (restore_reserve)
@@ -2972,6 +2998,7 @@
 	copy_user_huge_page(new_page, old_page, address, vma,
 			    pages_per_huge_page(h));
 	__SetPageUptodate(new_page);
+	set_page_huge_active(new_page);
 
 	mmun_start = address & huge_page_mask(h);
 	mmun_end = mmun_start + huge_page_size(h);
@@ -3084,6 +3111,7 @@
 	}
 	clear_huge_page(page, address, pages_per_huge_page(h));
 	__SetPageUptodate(page);
+	set_page_huge_active(page);
 
 	if (vma->vm_flags & VM_MAYSHARE) {
 		int err;
@@ -3913,19 +3941,26 @@
 
 bool isolate_huge_page(struct page *page, struct list_head *list)
 {
+	bool ret = true;
+
 	VM_BUG_ON_PAGE(!PageHead(page), page);
-	if (!get_page_unless_zero(page))
-		return false;
 	spin_lock(&hugetlb_lock);
+	if (!page_huge_active(page) || !get_page_unless_zero(page)) {
+		ret = false;
+		goto unlock;
+	}
+	clear_page_huge_active(page);
 	list_move_tail(&page->lru, list);
+unlock:
 	spin_unlock(&hugetlb_lock);
-	return true;
+	return ret;
 }
 
 void putback_active_hugepage(struct page *page)
 {
 	VM_BUG_ON_PAGE(!PageHead(page), page);
 	spin_lock(&hugetlb_lock);
+	set_page_huge_active(page);
 	list_move_tail(&page->lru, &(page_hstate(page))->hugepage_activelist);
 	spin_unlock(&hugetlb_lock);
 	put_page(page);
mm/memory-failure.c (+12 -2)
@@ -1586,8 +1586,18 @@
 	}
 	unlock_page(hpage);
 
-	/* Keep page count to indicate a given hugepage is isolated. */
-	list_move(&hpage->lru, &pagelist);
+	ret = isolate_huge_page(hpage, &pagelist);
+	if (ret) {
+		/*
+		 * get_any_page() and isolate_huge_page() takes a refcount each,
+		 * so need to drop one here.
+		 */
+		put_page(hpage);
+	} else {
+		pr_info("soft offline: %#lx hugepage failed to isolate\n", pfn);
+		return -EBUSY;
+	}
+
 	ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
 				MIGRATE_SYNC, MR_MEMORY_FAILURE);
 	if (ret) {