commits

The recent refactoring of the powerpc page fault handler in commit
c3350602e876 ("powerpc/mm: Make bad_area* helper functions") caused
access to protected memory regions to indicate SEGV_MAPERR instead of
the traditional SEGV_ACCERR in the si_code field of a user-space
signal handler. This can confuse debug libraries that temporarily
change the protection of memory regions, and expect to use SEGV_ACCERR
as an indication to restore access to a region.

This commit restores the previous behavior. The following program
exhibits the issue:

$ ./repro read || echo "FAILED"
$ ./repro write || echo "FAILED"
$ ./repro exec || echo "FAILED"

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <signal.h>
#include <sys/mman.h>
#include <assert.h>

static void segv_handler(int n, siginfo_t *info, void *arg) {
_exit(info->si_code == SEGV_ACCERR ? 0 : 1);
}

int main(int argc, char **argv)
{
void *p = NULL;
struct sigaction act = {
.sa_sigaction = segv_handler,
.sa_flags = SA_SIGINFO,
};

assert(argc == 2);
p = mmap(NULL, getpagesize(),
(strcmp(argv[1], "write") == 0) ? PROT_READ : 0,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
assert(p != MAP_FAILED);

assert(sigaction(SIGSEGV, &act, NULL) == 0);
if (strcmp(argv[1], "read") == 0)
printf("%c", *(unsigned char *)p);
else if (strcmp(argv[1], "write") == 0)
*(unsigned char *)p = 0;
else if (strcmp(argv[1], "exec") == 0)
((void (*)(void))p)();
return 1; /* failed to generate SEGV */
}

Fixes: c3350602e876 ("powerpc/mm: Make bad_area* helper functions")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: John Sperbeck <jsperbeck@google.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Add commit references in change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

8y ago

Jim Mattson

0cb5b306

kvm: vmx: Scrub hardware GPRs at VM-exit

8y ago

Christian Borntraeger

c2cf265d

KVM: s390: prevent buffer overrun on memory hotplug during migration

8y ago

Tetsuo Handa

bb422a73

mm,vmscan: Make unregister_shrinker() no-op if register_shrinker() failed.

Syzbot caught an oops at unregister_shrinker() because combination of
commit 1d3d4437eae1bb29 ("vmscan: per-node deferred work") and fault
injection made register_shrinker() fail and the caller of
register_shrinker() did not check for failure.

----------
[ 554.881422] FAULT_INJECTION: forcing a failure.
[ 554.881422] name failslab, interval 1, probability 0, space 0, times 0
[ 554.881438] CPU: 1 PID: 13231 Comm: syz-executor1 Not tainted 4.14.0-rc8+ #82
[ 554.881443] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[ 554.881445] Call Trace:
[ 554.881459] dump_stack+0x194/0x257
[ 554.881474] ? arch_local_irq_restore+0x53/0x53
[ 554.881486] ? find_held_lock+0x35/0x1d0
[ 554.881507] should_fail+0x8c0/0xa40
[ 554.881522] ? fault_create_debugfs_attr+0x1f0/0x1f0
[ 554.881537] ? check_noncircular+0x20/0x20
[ 554.881546] ? find_next_zero_bit+0x2c/0x40
[ 554.881560] ? ida_get_new_above+0x421/0x9d0
[ 554.881577] ? find_held_lock+0x35/0x1d0
[ 554.881594] ? __lock_is_held+0xb6/0x140
[ 554.881628] ? check_same_owner+0x320/0x320
[ 554.881634] ? lock_downgrade+0x990/0x990
[ 554.881649] ? find_held_lock+0x35/0x1d0
[ 554.881672] should_failslab+0xec/0x120
[ 554.881684] __kmalloc+0x63/0x760
[ 554.881692] ? lock_downgrade+0x990/0x990
[ 554.881712] ? register_shrinker+0x10e/0x2d0
[ 554.881721] ? trace_event_raw_event_module_request+0x320/0x320
[ 554.881737] register_shrinker+0x10e/0x2d0
[ 554.881747] ? prepare_kswapd_sleep+0x1f0/0x1f0
[ 554.881755] ? _down_write_nest_lock+0x120/0x120
[ 554.881765] ? memcpy+0x45/0x50
[ 554.881785] sget_userns+0xbcd/0xe20
(...snipped...)
[ 554.898693] kasan: CONFIG_KASAN_INLINE enabled
[ 554.898724] kasan: GPF could be caused by NULL-ptr deref or user memory access
[ 554.898732] general protection fault: 0000 [#1] SMP KASAN
[ 554.898737] Dumping ftrace buffer:
[ 554.898741] (ftrace buffer empty)
[ 554.898743] Modules linked in:
[ 554.898752] CPU: 1 PID: 13231 Comm: syz-executor1 Not tainted 4.14.0-rc8+ #82
[ 554.898755] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[ 554.898760] task: ffff8801d1dbe5c0 task.stack: ffff8801c9e38000
[ 554.898772] RIP: 0010:__list_del_entry_valid+0x7e/0x150
[ 554.898775] RSP: 0018:ffff8801c9e3f108 EFLAGS: 00010246
[ 554.898780] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 554.898784] RDX: 0000000000000000 RSI: ffff8801c53c6f98 RDI: ffff8801c53c6fa0
[ 554.898788] RBP: ffff8801c9e3f120 R08: 1ffff100393c7d55 R09: 0000000000000004
[ 554.898791] R10: ffff8801c9e3ef70 R11: 0000000000000000 R12: 0000000000000000
[ 554.898795] R13: dffffc0000000000 R14: 1ffff100393c7e45 R15: ffff8801c53c6f98
[ 554.898800] FS: 0000000000000000(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
[ 554.898804] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[ 554.898807] CR2: 00000000dbc23000 CR3: 00000001c7269000 CR4: 00000000001406e0
[ 554.898813] DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
[ 554.898816] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
[ 554.898818] Call Trace:
[ 554.898828] unregister_shrinker+0x79/0x300
[ 554.898837] ? perf_trace_mm_vmscan_writepage+0x750/0x750
[ 554.898844] ? down_write+0x87/0x120
[ 554.898851] ? deactivate_super+0x139/0x1b0
[ 554.898857] ? down_read+0x150/0x150
[ 554.898864] ? check_same_owner+0x320/0x320
[ 554.898875] deactivate_locked_super+0x64/0xd0
[ 554.898883] deactivate_super+0x141/0x1b0
----------

Since allowing register_shrinker() callers to call unregister_shrinker()
when register_shrinker() failed can simplify error recovery path, this
patch makes unregister_shrinker() no-op when register_shrinker() failed.
Also, reset shrinker->nr_deferred in case unregister_shrinker() was
by error called twice.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Aliaksei Karaliou <akaraliou.dev@gmail.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Glauber Costa <glauber@scylladb.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

8y ago

Linus Torvalds

8d517bdf

Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

8y ago

Thomas Gleixner

9f4533cd

timerqueue: Document return values of timerqueue_add/del()

8y ago

Thomas Gleixner

a62d6985

x86/ldt: Plug memory leak in error path

8y ago

Dou Liyang

4fcab669

x86/apic: Avoid wrong warning when parsing 'apic=' in X86-32 case

8y ago

Linus Torvalds

313243aa

Merge tag 'iommu-v4.15-rc7' of git://github.com/awilliam/linux-vfio

8y ago

Oleksandr Andrushchenko

02a0d921

Input: xen-kbdfront - do not advertise multi-touch pressure support

8y ago

Laurent Vivier

7333b5ac

KVM: PPC: Book3S HV: Fix pending_pri value in kvmppc_xive_get_icp()

8y ago

Stefan Raspl

aa12f594

tools/kvm_stat: sort '-f help' output

8y ago

Christian Borntraeger

32aa144f

KVM: s390: fix cmma migration for multiple memory slots

8y ago

Markus Trippelsdorf

d7ee9469

VFS: Handle lazytime in do_mount()

8y ago

Linus Torvalds

4c470317

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

8y ago

Mathieu Malaterre

76dc6c09

cpu/hotplug: Move inline keyword at the beginning of declaration

8y ago

Thomas Gleixner

fd45bb77

timers: Invoke timer_start_debug() where it makes sense

8y ago

Thomas Gleixner

decab088

x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()

8y ago

Linus Torvalds

ac461122

x86-32: Fix kexec with stack canary (CONFIG_CC_STACKPROTECTOR)

Commit e802a51ede91 ("x86/idt: Consolidate IDT invalidation") cleaned up
and unified the IDT invalidation that existed in a couple of places. It
changed no actual real code.

Despite not changing any actual real code, it _did_ change code generation:
by implementing the common idt_invalidate() function in
archx86/kernel/idt.c, it made the use of the function in
arch/x86/kernel/machine_kexec_32.c be a real function call rather than an
(accidental) inlining of the function.

That, in turn, exposed two issues:

- in load_segments(), we had incorrectly reset all the segment
registers, which then made the stack canary load (which gcc does
using offset of %gs) cause a trap. Instead of %gs pointing to the
stack canary, it will be the normal zero-based kernel segment, and
the stack canary load will take a page fault at address 0x14.

- to make this even harder to debug, we had invalidated the GDT just
before calling idt_invalidate(), which meant that the fault happened
with an invalid GDT, which in turn causes a triple fault and
immediate reboot.

Fix this by

(a) not reloading the special segments in load_segments(). We currently
don't do any percpu accesses (which would require %fs on x86-32) in
this area, but there's no reason to think that we might not want to
do them, and like %gs, it's pointless to break it.

(b) doing idt_invalidate() before invalidating the GDT, to keep things
at least _slightly_ more debuggable for a bit longer. Without a
IDT, traps will not work. Without a GDT, traps also will not work,
but neither will any segment loads etc. So in a very real sense,
the GDT is even more core than the IDT.

Fixes: e802a51ede91 ("x86/idt: Consolidate IDT invalidation")
Reported-and-tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.LFD.2.21.1712271143180.8572@i7.lan