Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

x86, 64-bit: Clean up user address masking

The discussion about using "access_ok()" in get_user_pages_fast() (see
commit 7f8189068726492950bf1a2dcfd9b51314560abf: "x86: don't use
'access_ok()' as a range check in get_user_pages_fast()" for details and
end result) made us notice that x86-64 was being very sloppy about
virtual address checking.

So be way more careful and straightforward about masking x86-64 virtual
addresses:

- All the VIRTUAL_MASK* variants now cover only half of the address
space. We cannot use the full mask on a signed integer anyway,
and the larger mask just invites mistakes when it is applied
to either half of the 48-bit address space.

- /proc/kcore's kc_offset_to_vaddr() becomes a lot more
obvious when it transforms a file offset into a
(kernel-half) virtual address.

- Unify/simplify the 32-bit and 64-bit USER_DS definition to
be based on TASK_SIZE_MAX.
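
The effect of the narrower mask can be sketched in plain user-space C.
This is only an illustration, not kernel code: it assumes that
__VIRTUAL_MASK is derived as (1UL << __VIRTUAL_MASK_SHIFT) - 1, and the
helper names (vaddr_to_offset, offset_to_vaddr, is_canonical,
mask_self_check) are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdint.h>

/* Shift value from this patch: the mask covers one half of the
 * 48-bit virtual address space (assumed derivation of the mask). */
#define VIRTUAL_MASK_SHIFT 47
#define VIRTUAL_MASK ((UINT64_C(1) << VIRTUAL_MASK_SHIFT) - 1)

/* Stand-in for kc_vaddr_to_offset(): keep only the low 47 bits. */
static uint64_t vaddr_to_offset(uint64_t v)
{
	return v & VIRTUAL_MASK;
}

/* Stand-in for the new kc_offset_to_vaddr(): set every bit above the
 * mask, which always yields a kernel-half address. */
static uint64_t offset_to_vaddr(uint64_t o)
{
	return o | ~VIRTUAL_MASK;
}

/* Hypothetical helper: an address is canonical when the bits above
 * the mask are either all clear (user half) or all set (kernel half). */
static int is_canonical(uint64_t v)
{
	uint64_t upper = v & ~VIRTUAL_MASK;

	return upper == 0 || upper == ~VIRTUAL_MASK;
}

/* Round-trip a sample kernel-half address through both conversions. */
void mask_self_check(void)
{
	uint64_t kaddr = UINT64_C(0xffff880000000000);

	assert(offset_to_vaddr(vaddr_to_offset(kaddr)) == kaddr);
	assert(is_canonical(kaddr));
	assert(!is_canonical(UINT64_C(0x0000800000000000)));
}
```

With a 47-bit mask the same constant describes both halves, so the
offset-to-address conversion needs no conditional on bit 47.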

This cleanup and more careful/obvious user virtual address checking also
uncovered a buglet in the x86-64 implementation of strnlen_user(): it
would do an "access_ok()" check on the whole potential area, even if the
string itself was much shorter, and thus return an error even for valid
strings. Our sloppy checking had hidden this.

So this fixes 'strnlen_user()' to do this properly, the same way we
already handle user strings in 'strncpy_from_user()': check just the
first byte, then rely on fault handling for the rest. That always
works, since we impose a guard page that cannot be mapped at the end of
the user address space (and even if we didn't, we'd have the address
space hole).
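
The difference between the two checks can be sketched in user-space C.
This is a simplified model, not the kernel's access_ok(): USER_LIMIT is
an assumed value standing in for TASK_SIZE_MAX, and range_ok(),
old_check(), new_check() and strnlen_check_demo() are hypothetical
names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed user address limit: top of the user half minus one guard
 * page. This stands in for TASK_SIZE_MAX. */
#define USER_LIMIT UINT64_C(0x00007ffffffff000)

/* Simplified range check: [addr, addr + size) must lie entirely below
 * USER_LIMIT. Written so the arithmetic cannot overflow. */
static int range_ok(uint64_t addr, uint64_t size)
{
	return size <= USER_LIMIT && addr <= USER_LIMIT - size;
}

/* Old behaviour: validate the whole potential area of n bytes. */
static int old_check(uint64_t s, uint64_t n)
{
	return range_ok(s, n);
}

/* New behaviour: validate only the first byte; the rest is left to
 * fault handling (not modelled here). */
static int new_check(uint64_t s)
{
	return range_ok(s, 1);
}

/* A short string placed 16 bytes below the limit, scanned with a
 * maximum length of 4096 bytes. */
void strnlen_check_demo(void)
{
	uint64_t s = USER_LIMIT - 16;

	assert(!old_check(s, 4096));	/* valid string wrongly rejected */
	assert(new_check(s));		/* first byte is in range */
	assert(!new_check(USER_LIMIT));	/* start at the limit still fails */
}
```

In this model the old check rejects a valid short string simply because
the maximum length would run past the limit, which is the buglet
described above.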

Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

+4 -12
+1 -1
arch/x86/include/asm/page_64_types.h
@@ -41,7 +41,7 @@
 
 /* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */
 #define __PHYSICAL_MASK_SHIFT 46
-#define __VIRTUAL_MASK_SHIFT 48
+#define __VIRTUAL_MASK_SHIFT 47
 
 /*
  * Kernel image size is limited to 512 MB (see level2_kernel_pgt in
+1 -4
arch/x86/include/asm/pgtable_64.h
@@ -165,10 +165,7 @@
 
 /* fs/proc/kcore.c */
 #define kc_vaddr_to_offset(v) ((v) & __VIRTUAL_MASK)
-#define kc_offset_to_vaddr(o) \
-	(((o) & (1UL << (__VIRTUAL_MASK_SHIFT - 1))) \
-	? ((o) | ~__VIRTUAL_MASK) \
-	: (o))
+#define kc_offset_to_vaddr(o) ((o) | ~__VIRTUAL_MASK)
 
 #define __HAVE_ARCH_PTE_SAME
 #endif /* !__ASSEMBLY__ */
+1 -6
arch/x86/include/asm/uaccess.h
@@ -25,12 +25,7 @@
 #define MAKE_MM_SEG(s) ((mm_segment_t) { (s) })
 
 #define KERNEL_DS MAKE_MM_SEG(-1UL)
-
-#ifdef CONFIG_X86_32
-# define USER_DS MAKE_MM_SEG(PAGE_OFFSET)
-#else
-# define USER_DS MAKE_MM_SEG(__VIRTUAL_MASK)
-#endif
+#define USER_DS MAKE_MM_SEG(TASK_SIZE_MAX)
 
 #define get_ds() (KERNEL_DS)
 #define get_fs() (current_thread_info()->addr_limit)
+1 -1
arch/x86/lib/usercopy_64.c
@@ -127,7 +127,7 @@
 
 long strnlen_user(const char __user *s, long n)
 {
-	if (!access_ok(VERIFY_READ, s, n))
+	if (!access_ok(VERIFY_READ, s, 1))
 		return 0;
 	return __strnlen_user(s, n);
 }