Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

x86/modules: Avoid breaking W^X while loading modules

When modules and BPF filters are loaded, there is a time window in
which some memory is both writable and executable. An attacker who has
already found another vulnerability (e.g., a dangling pointer) might be
able to exploit this behavior to overwrite kernel code. Prevent
writable and executable PTEs from existing at this stage.

In addition, avoiding W+X mappings also slightly simplifies the
patching of module code on initialization (e.g., by alternatives and
static keys), as is done in the next patch. This was actually the
main motivation for this patch.

To avoid W+X mappings, map the memory initially as RW (NX) and, only
after it has been set RO, set it X as well. Setting it executable is
done as a separate step to avoid a window in which one core still has
the old PTE cached (hence writable) while another core sees the updated
PTE (executable), which would break the W^X protection.
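
The ordering argument above can be checked with a toy model. This is a
minimal userspace sketch, not kernel code: the PERM_* bits, struct region
and the model_* helpers are hypothetical names invented for illustration,
and the model assumes each permission change (including its TLB flush)
completes before the next one starts. During a change, one core may still
use the old permissions while another already uses the new ones, so the
worst case is the union of the two.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical permission bits for this model; not the kernel's PTE flags. */
enum { PERM_W = 1u << 0, PERM_X = 1u << 1 };

struct region {
	unsigned int perms;
};

static bool violates_wx(unsigned int perms)
{
	return (perms & PERM_W) && (perms & PERM_X);
}

/*
 * While a change is in flight, a core with a stale cached entry uses
 * old_perms and a core that reloaded the entry uses new_perms.
 */
static unsigned int worst_case_view(unsigned int old_perms,
				    unsigned int new_perms)
{
	return old_perms | new_perms;
}

/* Stand-ins for set_memory_ro() and set_memory_x() in this model. */
static void model_set_ro(struct region *r) { r->perms &= ~PERM_W; }
static void model_set_x(struct region *r)  { r->perms |= PERM_X; }

/* RW (NX) -> RO -> RO+X, as described in the commit message. */
bool two_step_keeps_wx(void)
{
	struct region r = { .perms = PERM_W };
	unsigned int old;

	old = r.perms;
	model_set_ro(&r);
	if (violates_wx(worst_case_view(old, r.perms)))
		return false;

	old = r.perms;
	model_set_x(&r);
	if (violates_wx(worst_case_view(old, r.perms)))
		return false;

	return r.perms == PERM_X;
}

/* RW (NX) -> RO+X in a single change: stale W meets fresh X. */
bool one_step_keeps_wx(void)
{
	struct region r = { .perms = PERM_W };
	unsigned int old = r.perms;

	r.perms = (r.perms & ~PERM_W) | PERM_X;
	return !violates_wx(worst_case_view(old, r.perms));
}
```

In this model the two-step sequence never exposes a combined W+X view,
while the single combined change does, which is the window the separate
set_memory_x() step is meant to close.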

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <akpm@linux-foundation.org>
Cc: <ard.biesheuvel@linaro.org>
Cc: <deneen.t.dock@intel.com>
Cc: <kernel-hardening@lists.openwall.com>
Cc: <kristen@linux.intel.com>
Cc: <linux_dti@icloud.com>
Cc: <will.deacon@arm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Link: https://lkml.kernel.org/r/20190426001143.4983-12-namit@vmware.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

authored by Nadav Amit and committed by Ingo Molnar
f2c65fb3 7298e24f

+28 -8
+21 -7
arch/x86/kernel/alternative.c
···
  * handlers seeing an inconsistent instruction while you patch.
  */
 void *__init_or_module text_poke_early(void *addr, const void *opcode,
-					      size_t len)
+				       size_t len)
 {
 	unsigned long flags;
-	local_irq_save(flags);
-	memcpy(addr, opcode, len);
-	local_irq_restore(flags);
-	sync_core();
-	/* Could also do a CLFLUSH here to speed up CPU recovery; but
-	   that causes hangs on some VIA CPUs. */
+
+	if (boot_cpu_has(X86_FEATURE_NX) &&
+	    is_module_text_address((unsigned long)addr)) {
+		/*
+		 * Modules text is marked initially as non-executable, so the
+		 * code cannot be running and speculative code-fetches are
+		 * prevented. Just change the code.
+		 */
+		memcpy(addr, opcode, len);
+	} else {
+		local_irq_save(flags);
+		memcpy(addr, opcode, len);
+		local_irq_restore(flags);
+		sync_core();
+
+		/*
+		 * Could also do a CLFLUSH here to speed up CPU recovery; but
+		 * that causes hangs on some VIA CPUs.
+		 */
+	}
 	return addr;
 }
+1 -1
arch/x86/kernel/module.c
···
 	p = __vmalloc_node_range(size, MODULE_ALIGN,
 				    MODULES_VADDR + get_module_load_offset(),
 				    MODULES_END, GFP_KERNEL,
-				    PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
+				    PAGE_KERNEL, 0, NUMA_NO_NODE,
 				    __builtin_return_address(0));
 	if (p && (kasan_module_alloc(p, size) < 0)) {
 		vfree(p);
+1
include/linux/filter.h
···
 static inline void bpf_jit_binary_lock_ro(struct bpf_binary_header *hdr)
 {
 	set_memory_ro((unsigned long)hdr, hdr->pages);
+	set_memory_x((unsigned long)hdr, hdr->pages);
 }

 static inline void bpf_jit_binary_unlock_ro(struct bpf_binary_header *hdr)
+5
kernel/module.c
···
 		return;

 	frob_text(&mod->core_layout, set_memory_ro);
+	frob_text(&mod->core_layout, set_memory_x);
+
 	frob_rodata(&mod->core_layout, set_memory_ro);
+
 	frob_text(&mod->init_layout, set_memory_ro);
+	frob_text(&mod->init_layout, set_memory_x);
+
 	frob_rodata(&mod->init_layout, set_memory_ro);

 	if (after_init)