userfaultfd: hugetlbfs: fix new flag usage in error path

In commit d6995da31122 ("hugetlb: use page.private for hugetlb specific
page flags"), the use of PagePrivate to indicate that a reservation
count should be restored at free time was replaced by the
hugetlb-specific flag HPageRestoreReserve. Two users of the old flag
were overlooked in that conversion: a userfaultfd error path and a
VM_BUG_ON() in remove_inode_hugepages().
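
For reference, d6995da31122 moved hugetlb-specific flags into the
page.private field of the hugetlb head page and generates the
accessors with wrapper macros. The following is a simplified sketch of
that scheme, not a verbatim copy of include/linux/hugetlb.h (the real
header defines more flags and uses __always_inline):

	/* Simplified sketch of the hugetlb page flag scheme (kernel code). */
	enum hugetlb_page_flags {
		HPG_restore_reserve = 0,	/* restore reserve count on free */
		__NR_HPAGEFLAGS,
	};

	/* Flags live in page.private of the head page, so they no longer
	 * collide with generic uses of PagePrivate. */
	#define TESTHPAGEFLAG(uname, flname)				\
	static inline bool HPage##uname(struct page *page)		\
		{ return test_bit(HPG_##flname, &(page->private)); }

	#define SETHPAGEFLAG(uname, flname)				\
	static inline void SetHPage##uname(struct page *page)		\
		{ set_bit(HPG_##flname, &(page->private)); }

	#define CLEARHPAGEFLAG(uname, flname)				\
	static inline void ClearHPage##uname(struct page *page)	\
		{ clear_bit(HPG_##flname, &(page->private)); }

	#define HPAGEFLAG(uname, flname)	\
		TESTHPAGEFLAG(uname, flname)	\
		SETHPAGEFLAG(uname, flname)	\
		CLEARHPAGEFLAG(uname, flname)

	/* Generates HPageRestoreReserve(), SetHPageRestoreReserve(), ... */
	HPAGEFLAG(RestoreReserve, restore_reserve)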

Users could see incorrect hugetlb reserve counts if they experience an
error with a UFFDIO_COPY operation. Specifically, this would be the
result of an unlikely copy_huge_page_from_user error. There is no
increased chance of hitting the VM_BUG_ON().
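
For illustration, the affected kernel path is reached from an ordinary
UFFDIO_COPY request on a registered hugetlbfs range; a minimal sketch
of such a request follows (userfaultfd setup and error handling
elided; the bug only manifests if the in-kernel copy from 'src' fails
partway through):

	#include <linux/userfaultfd.h>
	#include <sys/ioctl.h>

	/*
	 * Minimal sketch: resolve a fault on a hugetlbfs mapping with
	 * UFFDIO_COPY.  'uffd' must already be registered on the range,
	 * 'dst' must be huge-page aligned, and 'src' must point at a
	 * buffer of at least one huge page.
	 */
	static int uffd_copy_hugepage(int uffd, unsigned long dst,
				      unsigned long src, unsigned long hpage_size)
	{
		struct uffdio_copy copy = {
			.dst = dst,
			.src = src,
			.len = hpage_size,
			.mode = 0,
		};

		return ioctl(uffd, UFFDIO_COPY, &copy);
	}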

Link: https://lkml.kernel.org/r/20210521233952.236434-1-mike.kravetz@oracle.com
Fixes: d6995da31122 ("hugetlb: use page.private for hugetlb specific page flags")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Mina Almasry <almasry.mina@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

---
 fs/hugetlbfs/inode.c |  2 +-
 mm/userfaultfd.c     | 28 ++++++++++++++--------------
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -529,7 +529,7 @@
 			 * the subpool and global reserve usage count can need
 			 * to be adjusted.
 			 */
-			VM_BUG_ON(PagePrivate(page));
+			VM_BUG_ON(HPageRestoreReserve(page));
 			remove_huge_page(page);
 			freed++;
 			if (!truncate_op) {
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -360,38 +360,38 @@
 		 * If a reservation for the page existed in the reservation
 		 * map of a private mapping, the map was modified to indicate
 		 * the reservation was consumed when the page was allocated.
-		 * We clear the PagePrivate flag now so that the global
+		 * We clear the HPageRestoreReserve flag now so that the global
 		 * reserve count will not be incremented in free_huge_page.
 		 * The reservation map will still indicate the reservation
 		 * was consumed and possibly prevent later page allocation.
 		 * This is better than leaking a global reservation. If no
-		 * reservation existed, it is still safe to clear PagePrivate
-		 * as no adjustments to reservation counts were made during
-		 * allocation.
+		 * reservation existed, it is still safe to clear
+		 * HPageRestoreReserve as no adjustments to reservation counts
+		 * were made during allocation.
 		 *
 		 * The reservation map for shared mappings indicates which
 		 * pages have reservations. When a huge page is allocated
 		 * for an address with a reservation, no change is made to
-		 * the reserve map. In this case PagePrivate will be set
-		 * to indicate that the global reservation count should be
+		 * the reserve map. In this case HPageRestoreReserve will be
+		 * set to indicate that the global reservation count should be
 		 * incremented when the page is freed. This is the desired
 		 * behavior. However, when a huge page is allocated for an
 		 * address without a reservation a reservation entry is added
-		 * to the reservation map, and PagePrivate will not be set.
-		 * When the page is freed, the global reserve count will NOT
-		 * be incremented and it will appear as though we have leaked
-		 * reserved page. In this case, set PagePrivate so that the
-		 * global reserve count will be incremented to match the
-		 * reservation map entry which was created.
+		 * to the reservation map, and HPageRestoreReserve will not be
+		 * set. When the page is freed, the global reserve count will
+		 * NOT be incremented and it will appear as though we have
+		 * leaked reserved page. In this case, set HPageRestoreReserve
+		 * so that the global reserve count will be incremented to
+		 * match the reservation map entry which was created.
 		 *
 		 * Note that vm_alloc_shared is based on the flags of the vma
 		 * for which the page was originally allocated. dst_vma could
 		 * be different or NULL on error.
 		 */
 		if (vm_alloc_shared)
-			SetPagePrivate(page);
+			SetHPageRestoreReserve(page);
 		else
-			ClearPagePrivate(page);
+			ClearHPageRestoreReserve(page);
 		put_page(page);
 	}
 	BUG_ON(copied < 0);
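
For context, the flag is consumed at free time. A rough, abridged
sketch of the 5.12-era free path in mm/hugetlb.c (subpool accounting
and page-list details omitted; not the verbatim kernel code) shows why
setting or clearing the wrong flag in the error path skews the global
reserve count:

	/* Abridged sketch of the free-time path, not verbatim mm/hugetlb.c. */
	static void free_huge_page_sketch(struct hstate *h, struct page *page)
	{
		bool restore_reserve = HPageRestoreReserve(page);

		ClearHPageRestoreReserve(page);

		spin_lock(&hugetlb_lock);
		if (restore_reserve)
			h->resv_huge_pages++;	/* return the reservation */
		enqueue_huge_page(h, page);	/* back to the free list */
		spin_unlock(&hugetlb_lock);
	}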