Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

userfaultfd: non-cooperative: closing the uffd without triggering SIGBUS

This enhancement spares a non-cooperative userfaultfd manager from
having to unregister all regions before it can close the uffd once all
userfaultfd activity has completed.

UFFDIO_UNREGISTER would serialize against handle_userfault by taking
the mmap_sem for writing, but we can simply repeat the page fault if we
detect that the uffd was closed, so that the regular page fault paths
take over.
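
As a rough illustration of the retry idea described above, the following userspace toy model (not kernel code) contrasts the two behaviours for a fault that arrives after the uffd was closed. Every name in it (`model_fault`, `model_fault_once`, `model_access`, `armed_faults`) is hypothetical, the enum values are stand-ins rather than the real VM_FAULT_* constants, and it assumes that closing the uffd eventually detaches the userfaultfd from the VMA so that a later retry reaches the regular fault path.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's VM_FAULT_* codes (not the real values). */
enum model_fault {
	MODEL_FAULT_SIGBUS,	/* the faulting task would receive SIGBUS */
	MODEL_FAULT_NOPAGE,	/* nothing installed; the access is retried */
	MODEL_FAULT_HANDLED	/* the regular page fault path resolved the fault */
};

/*
 * One pass through the modelled fault handler, for a uffd that has
 * already been closed (the ctx->released case in the patch).
 * vma_armed: the VMA still points at the closed userfaultfd.
 * patched:   model the behaviour after this commit.
 */
static enum model_fault model_fault_once(bool vma_armed, bool patched)
{
	if (!vma_armed)
		return MODEL_FAULT_HANDLED;	/* regular path has taken over */
	return patched ? MODEL_FAULT_NOPAGE : MODEL_FAULT_SIGBUS;
}

/*
 * Model one memory access that keeps faulting until it is resolved or
 * the task is killed. armed_faults is the assumed number of faults that
 * still see the VMA armed before the close finishes detaching it.
 * *attempts receives the number of passes through the handler.
 */
enum model_fault model_access(int armed_faults, bool patched, int *attempts)
{
	int remaining = armed_faults;

	*attempts = 0;
	for (;;) {
		enum model_fault r = model_fault_once(remaining > 0, patched);

		(*attempts)++;
		if (r != MODEL_FAULT_NOPAGE)
			return r;
		remaining--;	/* the close makes progress between retries */
	}
}
```

In this model, `model_access(2, true, &n)` retries twice and then resolves through the regular path, while `model_access(2, false, &n)` ends in the SIGBUS outcome on the first pass, which is the case the manager previously had to avoid by calling UFFDIO_UNREGISTER before closing.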

Link: http://lkml.kernel.org/r/20170823181227.19926-1-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

authored by Andrea Arcangeli and committed by Linus Torvalds
656710a6 98c70baa

+19 -1
fs/userfaultfd.c
@@ -381,8 +381,26 @@
 	 * in __get_user_pages if userfaultfd_release waits on the
 	 * caller of handle_userfault to release the mmap_sem.
 	 */
-	if (unlikely(ACCESS_ONCE(ctx->released)))
+	if (unlikely(ACCESS_ONCE(ctx->released))) {
+		/*
+		 * Don't return VM_FAULT_SIGBUS in this case, so a non
+		 * cooperative manager can close the uffd after the
+		 * last UFFDIO_COPY, without risking to trigger an
+		 * involuntary SIGBUS if the process was starting the
+		 * userfaultfd while the userfaultfd was still armed
+		 * (but after the last UFFDIO_COPY). If the uffd
+		 * wasn't already closed when the userfault reached
+		 * this point, that would normally be solved by
+		 * userfaultfd_must_wait returning 'false'.
+		 *
+		 * If we were to return VM_FAULT_SIGBUS here, the non
+		 * cooperative manager would be instead forced to
+		 * always call UFFDIO_UNREGISTER before it can safely
+		 * close the uffd.
+		 */
+		ret = VM_FAULT_NOPAGE;
 		goto out;
+	}
 
 	/*
 	 * Check that we can return VM_FAULT_RETRY.