commits

The following testcase may result in a page table entries with a invalid
CCA field being generated:

static void *bindstack;

static int sysrqfd;

static void protect_low(int protect)
{
mprotect(bindstack, BINDSTACK_SIZE, protect);
}

static void sigbus_handler(int signal, siginfo_t * info, void *context)
{
void *addr = info->si_addr;

write(sysrqfd, "x", 1);

printf("sigbus, fault address %p (should not happen, but might)\n",
addr);
abort();
}

static void run_bind_test(void)
{
unsigned int *p = bindstack;

p[0] = 0xf001f001;

write(sysrqfd, "x", 1);

/* Set trap on access to p[0] */
protect_low(PROT_NONE);

write(sysrqfd, "x", 1);

/* Clear trap on access to p[0] */
protect_low(PROT_READ | PROT_WRITE | PROT_EXEC);

write(sysrqfd, "x", 1);

/* Check the contents of p[0] */
if (p[0] != 0xf001f001) {
write(sysrqfd, "x", 1);

/* Reached, but shouldn't be */
printf("badness, shouldn't happen but does\n");
abort();
}
}

int main(void)
{
struct sigaction sa;

sysrqfd = open("/proc/sysrq-trigger", O_WRONLY);

if (sigprocmask(SIG_BLOCK, NULL, &sa.sa_mask)) {
perror("sigprocmask");
return 0;
}

sa.sa_sigaction = sigbus_handler;
sa.sa_flags = SA_SIGINFO | SA_NODEFER | SA_RESTART;
if (sigaction(SIGBUS, &sa, NULL)) {
perror("sigaction");
return 0;
}

bindstack = mmap(NULL,
BINDSTACK_SIZE,
PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (bindstack == MAP_FAILED) {
perror("mmap bindstack");
return 0;
}

printf("bindstack: %p\n", bindstack);

run_bind_test();

printf("done\n");

return 0;
}

There are multiple ingredients for this:

1) PAGE_NONE is defined to _CACHE_CACHABLE_NONCOHERENT, which is CCA 3
on all platforms except SB1 where it's CCA 5.
2) _page_cachable_default must have bits set which are not set
_CACHE_CACHABLE_NONCOHERENT.
3) Either the defective version of pte_modify for XPA or the standard
version must be in used. However pte_modify for the 36 bit address
space support is no affected.

In that case additional bits in the final CCA mode may generate an invalid
value for the CCA field. On the R10000 system where this was tracked
down for example a CCA 7 has been observed, which is Uncached Accelerated.

Fixed by:

1) Using the proper CCA mode for PAGE_NONE just like for all the other
PAGE_* pte/pmd bits.
2) Fix the two affected variants of pte_modify.

Further code inspection also shows the same issue to exist in pmd_modify
which would affect huge page systems.

Issue in pte_modify tracked down by Alastair Bridgewater, PAGE_NONE
and pmd_modify issue found by me.

The history of this goes back beyond Linus' git history. Chris Dearman's
commit 351336929ccf222ae38ff0cb7a8dd5fd5c6236a0 ("[MIPS] Allow setting of
the cache attribute at run time.") missed the opportunity to fix this
but it was originally introduced in lmo commit
d523832cf12007b3242e50bb77d0c9e63e0b6518 ("Missing from last commit.")
and 32cc38229ac7538f2346918a09e75413e8861f87 ("New configuration option
CONFIG_MIPS_UNCACHED.")

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>

9y ago

Miklos Szeredi

03bea604

ovl: get_write_access() in truncate

9y ago

Linus Torvalds

2ac9b973

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

9y ago

Linus Torvalds

99b0f54e

Merge tag 'drm-fixes-for-v4.7-rc6' of git://people.freedesktop.org/~airlied/linux

9y ago

Darren Stevens

bfa37087

powerpc: Initialise pci_io_base as early as possible

9y ago

Miklos Szeredi

a4859d75

ovl: fix dentry leak for default_permissions

9y ago

Linus Torvalds

da2f6aba

Merge branch 'for-linus-4.7-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

9y ago

James Bottomley

951d77fd

Merge remote-tracking branch 'mkp-scsi/4.7/scsi-fixes' into fixes

9y ago

Linus Torvalds

467ce769

Merge tag 'spi-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

9y ago

Dave Airlie

88c08710

Merge tag 'drm-intel-fixes-2016-06-30' of git://anongit.freedesktop.org/drm-intel into drm-fixes

9y ago

Michael Neuling

190ce869

powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0

Currently we have 2 segments that are bolted for the kernel linear
mapping (ie 0xc000... addresses). This is 0 to 1TB and also the kernel
stacks. Anything accessed outside of these regions may need to be
faulted in. (In practice machines with TM always have 1T segments)

If a machine has < 2TB of memory we never fault on the kernel linear
mapping as these two segments cover all physical memory. If a machine
has > 2TB of memory, there may be structures outside of these two
segments that need to be faulted in. This faulting can occur when
running as a guest as the hypervisor may remove any SLB that's not
bolted.

When we treclaim and trecheckpoint we have a window where we need to
run with the userspace GPRs. This means that we no longer have a valid
stack pointer in r1. For this window we therefore clear MSR RI to
indicate that any exceptions taken at this point won't be able to be
handled. This means that we can't take segment misses in this RI=0
window.

In this RI=0 region, we currently access the thread_struct for the
process being context switched to or from. This thread_struct access
may cause a segment fault since it's not guaranteed to be covered by
the two bolted segment entries described above.

We've seen this with a crash when running as a guest with > 2TB of
memory on PowerVM:

Unrecoverable exception 4100 at c00000000004f138
Oops: Unrecoverable exception, sig: 6 [#1]
SMP NR_CPUS=2048 NUMA pSeries
CPU: 1280 PID: 7755 Comm: kworker/1280:1 Tainted: G X 4.4.13-46-default #1
task: c000189001df4210 ti: c000189001d5c000 task.ti: c000189001d5c000
NIP: c00000000004f138 LR: 0000000010003a24 CTR: 0000000010001b20
REGS: c000189001d5f730 TRAP: 4100 Tainted: G X (4.4.13-46-default)
MSR: 8000000100001031 <SF,ME,IR,DR,LE> CR: 24000048 XER: 00000000
CFAR: c00000000004ed18 SOFTE: 0
GPR00: ffffffffc58d7b60 c000189001d5f9b0 00000000100d7d00 000000003a738288
GPR04: 0000000000002781 0000000000000006 0000000000000000 c0000d1f4d889620
GPR08: 000000000000c350 00000000000008ab 00000000000008ab 00000000100d7af0
GPR12: 00000000100d7ae8 00003ffe787e67a0 0000000000000000 0000000000000211
GPR16: 0000000010001b20 0000000000000000 0000000000800000 00003ffe787df110
GPR20: 0000000000000001 00000000100d1e10 0000000000000000 00003ffe787df050
GPR24: 0000000000000003 0000000000010000 0000000000000000 00003fffe79e2e30
GPR28: 00003fffe79e2e68 00000000003d0f00 00003ffe787e67a0 00003ffe787de680
NIP [c00000000004f138] restore_gprs+0xd0/0x16c
LR [0000000010003a24] 0x10003a24
Call Trace:
[c000189001d5f9b0] [c000189001d5f9f0] 0xc000189001d5f9f0 (unreliable)
[c000189001d5fb90] [c00000000001583c] tm_recheckpoint+0x6c/0xa0
[c000189001d5fbd0] [c000000000015c40] __switch_to+0x2c0/0x350
[c000189001d5fc30] [c0000000007e647c] __schedule+0x32c/0x9c0
[c000189001d5fcb0] [c0000000007e6b58] schedule+0x48/0xc0
[c000189001d5fce0] [c0000000000deabc] worker_thread+0x22c/0x5b0
[c000189001d5fd80] [c0000000000e7000] kthread+0x110/0x130
[c000189001d5fe30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
Instruction dump:
7cb103a6 7cc0e3a6 7ca222a6 78a58402 38c00800 7cc62838 08860000 7cc000a6
38a00006 78c60022 7cc62838 0b060000 <e8c701a0> 7ccff120 e8270078 e8a70098
---[ end trace 602126d0a1dedd54 ]---

This fixes this by copying the required data from the thread_struct to
the stack before we clear MSR RI. Then once we clear RI, we only access
the stack, guaranteeing there's no segment miss.

We also tighten the region over which we set RI=0 on the treclaim()
path. This may have a slight performance impact since we're adding an
mtmsr instruction.

Fixes: 090b9284d725 ("powerpc/tm: Clear MSR RI in non-recoverable TM code")
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

9y ago

Linus Torvalds

b971712a

Merge branch 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

9y ago

Omar Sandoval

02dbfc99

Btrfs: fix ->iterate_shared() by upgrading i_rwsem for delayed nodes

9y ago

James Bottomley

27ea13e6

Merge remote-tracking branch 'mkp-scsi/4.7/scsi-fixes' into fixes

9y ago

James Bottomley

8beb3300

53c700: fix BUG on untagged commands

9y ago

Linus Torvalds

a2b0db5b

Merge tag 'regulator-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

9y ago

Mark Brown

2a9b27b3

Merge remote-tracking branches 'spi/fix/ep93xx', 'spi/fix/rockchip', 'spi/fix/sunxi' and 'spi/fix/ti-qspi' into spi-linus

9y ago

Dave Airlie

40793e85

Merge branch 'drm-fixes-4.7' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

9y ago

Wei Yongjun

cab10327

drm/i915: Fix missing unlock on error in i915_ppgtt_info()

9y ago

Gavin Shan

cca0e542

powerpc/eeh: Fix wrong argument passed to eeh_rmv_device()

9y ago

Linus Torvalds

ca83a55c

Merge tag 'sound-4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

9y ago

Chandan Rajendra

b7f67055

Btrfs: Force stripesize to the value of sectorsize

9y ago

Linus Torvalds

33688abb

Linux 4.7-rc4 v4.7-rc4

9y ago

Linus Torvalds

b02b1fbd

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

9y ago

Martin K. Petersen

6b7e9cde

sd: Fix rw_max for devices that report an optimal xfer size

9y ago

Wei Fang

72d8c36e

scsi: fix race between simultaneous decrements of ->host_failed

9y ago

Linus Torvalds

44385120

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

9y ago

Mark Brown

a29a36f2

Merge remote-tracking branches 'regulator/fix/anatop' and 'regulator/fix/max77620' into regulator-linus

9y ago

Mark Brown

bea8205f

Merge branch 'asoc-fix-ep93xx' into spi-fix-ep93xx

9y ago

Tomeu Vizoso

4dc0dd83

spi: rockchip: Signal unfinished DMA transfers

9y ago

Michal Suchanek

719bd654

spi: sunxi: fix transfer timeout

9y ago

Jean-Jacques Hiblot

3ac066e2

spi: spi-ti-qspi: Suspend the queue before removing the device

9y ago

Linus Torvalds

dbdc3bb7

Merge tag 'acpi-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

9y ago

Rex Zhu

a7f14a18

drm/amd/powerplay: workaround for UVD clock issue

9y ago

Rodrigo Vivi

6b9dc6a4

drm/i915: Removing PCI IDs that are no longer listed as Kabylake.

9y ago

Cyril Bur

8e96a87c

powerpc/tm: Always reclaim in start_thread() for exec() class syscalls

Userspace can quite legitimately perform an exec() syscall with a
suspended transaction. exec() does not return to the old process, rather
it load a new one and starts that, the expectation therefore is that the
new process starts not in a transaction. Currently exec() is not treated
any differently to any other syscall which creates problems.

Firstly it could allow a new process to start with a suspended
transaction for a binary that no longer exists. This means that the
checkpointed state won't be valid and if the suspended transaction were
ever to be resumed and subsequently aborted (a possibility which is
exceedingly likely as exec()ing will likely doom the transaction) the
new process will jump to invalid state.

Secondly the incorrect attempt to keep the transactional state while
still zeroing state for the new process creates at least two TM Bad
Things. The first triggers on the rfid to return to userspace as
start_thread() has given the new process a 'clean' MSR but the suspend
will still be set in the hardware MSR. The second TM Bad Thing triggers
in __switch_to() as the processor is still transactionally suspended but
__switch_to() wants to zero the TM sprs for the new process.

This is an example of the outcome of calling exec() with a suspended
transaction. Note the first 700 is likely the first TM bad thing
decsribed earlier only the kernel can't report it as we've loaded
userspace registers. c000000000009980 is the rfid in
fast_exception_return()

Bad kernel stack pointer 3fffcfa1a370 at c000000000009980
Oops: Bad kernel stack pointer, sig: 6 [#1]
CPU: 0 PID: 2006 Comm: tm-execed Not tainted
NIP: c000000000009980 LR: 0000000000000000 CTR: 0000000000000000
REGS: c00000003ffefd40 TRAP: 0700 Not tainted
MSR: 8000000300201031 <SF,ME,IR,DR,LE,TM[SE]> CR: 00000000 XER: 00000000
CFAR: c0000000000098b4 SOFTE: 0
PACATMSCRATCH: b00000010000d033
GPR00: 0000000000000000 00003fffcfa1a370 0000000000000000 0000000000000000
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR12: 00003fff966611c0 0000000000000000 0000000000000000 0000000000000000
NIP [c000000000009980] fast_exception_return+0xb0/0xb8
LR [0000000000000000] (null)
Call Trace:
Instruction dump:
f84d0278 e9a100d8 7c7b03a6 e84101a0 7c4ff120 e8410170 7c5a03a6 e8010070
e8410080 e8610088 e8810090 e8210078 <4c000024> 48000000 e8610178 88ed023b

Kernel BUG at c000000000043e80 [verbose debug info unavailable]
Unexpected TM Bad Thing exception at c000000000043e80 (msr 0x201033)
Oops: Unrecoverable exception, sig: 6 [#2]
CPU: 0 PID: 2006 Comm: tm-execed Tainted: G D
task: c0000000fbea6d80 ti: c00000003ffec000 task.ti: c0000000fb7ec000
NIP: c000000000043e80 LR: c000000000015a24 CTR: 0000000000000000
REGS: c00000003ffef7e0 TRAP: 0700 Tainted: G D
MSR: 8000000300201033 <SF,ME,IR,DR,RI,LE,TM[SE]> CR: 28002828 XER: 00000000
CFAR: c000000000015a20 SOFTE: 0
PACATMSCRATCH: b00000010000d033
GPR00: 0000000000000000 c00000003ffefa60 c000000000db5500 c0000000fbead000
GPR04: 8000000300001033 2222222222222222 2222222222222222 00000000ff160000
GPR08: 0000000000000000 800000010000d033 c0000000fb7e3ea0 c00000000fe00004
GPR12: 0000000000002200 c00000000fe00000 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 c0000000fbea7410 00000000ff160000
GPR24: c0000000ffe1f600 c0000000fbea8700 c0000000fbea8700 c0000000fbead000
GPR28: c000000000e20198 c0000000fbea6d80 c0000000fbeab680 c0000000fbea6d80
NIP [c000000000043e80] tm_restore_sprs+0xc/0x1c
LR [c000000000015a24] __switch_to+0x1f4/0x420
Call Trace:
Instruction dump:
7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8
4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020

This fixes CVE-2016-5828.

Fixes: bc2a9408fa65 ("powerpc: Hook in new transactional memory code")
Cc: stable@vger.kernel.org # v3.9+
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

9y ago

Linus Torvalds

9a949a98

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

9y ago

Takashi Iwai

d5dbbe65

ALSA: dummy: Fix a use-after-free at closing

syzkaller fuzzer spotted a potential use-after-free case in snd-dummy
driver when hrtimer is used as backend:
> ==================================================================
> BUG: KASAN: use-after-free in rb_erase+0x1b17/0x2010 at addr ffff88005e5b6f68
> Read of size 8 by task syz-executor/8984
> =============================================================================
> BUG kmalloc-192 (Not tainted): kasan: bad access detected
> -----------------------------------------------------------------------------
>
> Disabling lock debugging due to kernel taint
> INFO: Allocated in 0xbbbbbbbbbbbbbbbb age=18446705582212484632
> ....
> [< none >] dummy_hrtimer_create+0x49/0x1a0 sound/drivers/dummy.c:464
> ....
> INFO: Freed in 0xfffd8e09 age=18446705496313138713 cpu=2164287125 pid=-1
> [< none >] dummy_hrtimer_free+0x68/0x80 sound/drivers/dummy.c:481
> ....
> Call Trace:
> [<ffffffff8179e59e>] __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:333
> [< inline >] rb_set_parent include/linux/rbtree_augmented.h:111
> [< inline >] __rb_erase_augmented include/linux/rbtree_augmented.h:218
> [<ffffffff82ca5787>] rb_erase+0x1b17/0x2010 lib/rbtree.c:427
> [<ffffffff82cb02e8>] timerqueue_del+0x78/0x170 lib/timerqueue.c:86
> [<ffffffff814d0c80>] __remove_hrtimer+0x90/0x220 kernel/time/hrtimer.c:903
> [< inline >] remove_hrtimer kernel/time/hrtimer.c:945
> [<ffffffff814d23da>] hrtimer_try_to_cancel+0x22a/0x570 kernel/time/hrtimer.c:1046
> [<ffffffff814d2742>] hrtimer_cancel+0x22/0x40 kernel/time/hrtimer.c:1066
> [<ffffffff85420531>] dummy_hrtimer_stop+0x91/0xb0 sound/drivers/dummy.c:417
> [<ffffffff854228bf>] dummy_pcm_trigger+0x17f/0x1e0 sound/drivers/dummy.c:507
> [<ffffffff85392170>] snd_pcm_do_stop+0x160/0x1b0 sound/core/pcm_native.c:1106
> [<ffffffff85391b26>] snd_pcm_action_single+0x76/0x120 sound/core/pcm_native.c:956
> [<ffffffff85391e01>] snd_pcm_action+0x231/0x290 sound/core/pcm_native.c:974
> [< inline >] snd_pcm_stop sound/core/pcm_native.c:1139
> [<ffffffff8539754d>] snd_pcm_drop+0x12d/0x1d0 sound/core/pcm_native.c:1784
> [<ffffffff8539d3be>] snd_pcm_common_ioctl1+0xfae/0x2150 sound/core/pcm_native.c:2805
> [<ffffffff8539ee91>] snd_pcm_capture_ioctl1+0x2a1/0x5e0 sound/core/pcm_native.c:2976
> [<ffffffff8539f2ec>] snd_pcm_kernel_ioctl+0x11c/0x160 sound/core/pcm_native.c:3020
> [<ffffffff853d9a44>] snd_pcm_oss_sync+0x3a4/0xa30 sound/core/oss/pcm_oss.c:1693
> [<ffffffff853da27d>] snd_pcm_oss_release+0x1ad/0x280 sound/core/oss/pcm_oss.c:2483
> .....

A workaround is to call hrtimer_cancel() in dummy_hrtimer_sync() which
is called certainly before other blocking ops.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>