Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

mm/shmem: ensure proper fallback if page faults

The kernel test robot flagged a recursive lock as a result of a conversion
from kmap_atomic() to kmap_local_folio()[Link]

The cause was due to the code depending on the kmap_atomic() side effect
of disabling page faults. In that case the code expects the fault to fail
and take the fallback case.

git archaeology implied that the recursion may not be an actual bug.[1]
However, depending on the implementation of the mmap_lock and the
condition of the call there may still be a deadlock.[2] So this is not
purely a lockdep issue. Considering a single threaded call stack there
are 3 options.

1) Different mm's are in play (no issue)
2) Readlock implementation is recursive and same mm is in play
(no issue)
3) Readlock implementation is _not_ recursive (issue)

The mmap_lock is recursive so with a single thread there is no issue.

However, Matthew pointed out a deadlock scenario when you consider
additional process' and threads thusly.

"The readlock implementation is only recursive if nobody else has taken a
write lock. If you have a multithreaded process, one of the other threads
can call mmap() and that will prevent recursion (due to fairness). Even
if it's a different process that you're trying to acquire the mmap read
lock on, you can still get into a deadly embrace. eg:

process A thread 1 takes read lock on own mmap_lock
process A thread 2 calls mmap, blocks taking write lock
process B thread 1 takes page fault, read lock on own mmap lock
process B thread 2 calls mmap, blocks taking write lock
process A thread 1 blocks taking read lock on process B
process B thread 1 blocks taking read lock on process A

Now all four threads are blocked waiting for each other."

Regardless using pagefault_disable() ensures that no matter what locking
implementation is used a deadlock will not occur. Add an explicit
pagefault_disable() and a big comment to explain this for future souls
looking at this code.

[1] https://lore.kernel.org/all/Y1MymJ%2FINb45AdaY@iweiny-desk3/
[2] https://lore.kernel.org/lkml/Y1bXBtGTCym77%2FoD@casper.infradead.org/

Link: https://lkml.kernel.org/r/20221025220108.2366043-1-ira.weiny@intel.com
Link: https://lore.kernel.org/r/202210211215.9dc6efb5-yujie.liu@intel.com
Fixes: 7a7256d5f512 ("shmem: convert shmem_mfill_atomic_pte() to use a folio")
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: kernel test robot <yujie.liu@intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

authored by

Ira Weiny and committed by
Andrew Morton
5dc21f0c 5521de7d

+17
+17
mm/shmem.c
··· 2424 2424 2425 2425 if (!zeropage) { /* COPY */ 2426 2426 page_kaddr = kmap_local_folio(folio, 0); 2427 + /* 2428 + * The read mmap_lock is held here. Despite the 2429 + * mmap_lock being read recursive a deadlock is still 2430 + * possible if a writer has taken a lock. For example: 2431 + * 2432 + * process A thread 1 takes read lock on own mmap_lock 2433 + * process A thread 2 calls mmap, blocks taking write lock 2434 + * process B thread 1 takes page fault, read lock on own mmap lock 2435 + * process B thread 2 calls mmap, blocks taking write lock 2436 + * process A thread 1 blocks taking read lock on process B 2437 + * process B thread 1 blocks taking read lock on process A 2438 + * 2439 + * Disable page faults to prevent potential deadlock 2440 + * and retry the copy outside the mmap_lock. 2441 + */ 2442 + pagefault_disable(); 2427 2443 ret = copy_from_user(page_kaddr, 2428 2444 (const void __user *)src_addr, 2429 2445 PAGE_SIZE); 2446 + pagefault_enable(); 2430 2447 kunmap_local(page_kaddr); 2431 2448 2432 2449 /* fallback to copy_from_user outside mmap_lock */