commits
Pull SCSI fixes from James Bottomley:
"A couple of major (hang and deadlock) fixes with fortunately fairly
rare triggering conditions. The PM oops is only really triggered by
people using enclosure services (rare) and the fnic driver is mostly
used in enterprise environments"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
SCSI: Fix NULL pointer dereference in runtime PM
fnic: Use the local variable instead of I/O flag to acquire io_req_lock in fnic_queuecommand() to avoid deadloack
Pull MIPS bug fixes from Ralf Baechle:
"Two more fixes for 4.2.
One fixes a build issue with the LLVM assembler - LLVM assembler macro
names are case sensitive, GNU as macro names are insensitive; the
other corrects a license string (GPL v2, not GPLv2) such that the
module loader will recognice the license correctly"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
FIRMWARE: bcm47xx_nvram: Fix module license.
MIPS: Fix LLVM build issue.
The routines in scsi_rpm.c assume that if a runtime-PM callback is
invoked for a SCSI device, it can only mean that the device's driver
has asked the block layer to handle the runtime power management (by
calling blk_pm_runtime_init(), which among other things sets q->dev).
However, this assumption turns out to be wrong for things like the ses
driver. Normally ses devices are not allowed to do runtime PM, but
userspace can override this setting. If this happens, the kernel gets
a NULL pointer dereference when blk_post_runtime_resume() tries to use
the uninitialized q->dev pointer.
This patch fixes the problem by calling the block layer's runtime-PM
routines only if the device's driver really does have a runtime-PM
callback routine. Since ses doesn't define any such callbacks, the
crash won't occur.
This fixes Bugzilla #101371.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Stanisław Pitucha <viraptor@gmail.com>
Reported-by: Ilan Cohen <ilanco@gmail.com>
Tested-by: Ilan Cohen <ilanco@gmail.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Pull 9p regression fix from Al Viro:
"Fix for breakage introduced when switching p9_client_{read,write}() to
struct iov_iter * (went into 4.1)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
9p: ensure err is initialized to 0 in p9_client_read/write
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Rafał Miłecki <zajec5@gmail.com>
Cc: Paul Walmsley <paul@pwsan.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11020/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We added changes in fnic driver patch 1.6.0.16 to acquire
io_req_lock in fnic_queuecommand() before issuing I/O so that io completion
is serialized. But when releasing the lock we check for the I/O flag and
this could be modified if IO abort occurs before I/O completion. In this case
we wont release the lock and causes deadlock in some scenerios. Using the
local variable to check the IO lock status will resolve the problem.
Fixes: 41df7b02db82cf6c14f094757bac3830d10a827f
Signed-off-by: Hiral Shah <hishah@cisco.com>
Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com>
Signed-off-by: Anil Chintalapati <achintal@cisco.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Pull ARM fixes from Russell King:
"Another couple of small ARM fixes.
A patch from Masahiro Yamada who noticed that "make -jN all zImage"
would end up generating bad images where N > 1, and a patch from
Nicolas to fix the Marvell CPU user access optimisation code when page
faults are disabled"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8418/1: add boot image dependencies to not generate invalid images
ARM: 8414/1: __copy_to_user_memcpy: fix mmap semaphore usage
Some use of those functions were providing unitialized values to those
functions. Notably, when reading 0 bytes from an empty file on a 9P
filesystem, the return code of read() was not 0.
Tested with this simple program:
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
int main(int argc, const char **argv)
{
assert(argc == 2);
char buffer[256];
int fd = open(argv[1], O_RDONLY|O_NOCTTY);
assert(fd >= 0);
assert(read(fd, buffer, 0) == 0);
return 0;
}
Cc: stable@vger.kernel.org # v4.1
Signed-off-by: Vincent Bernat <vincent@bernat.im>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Matthew Fortune <Matthew.Fortune@imgtec.com> reports:
The genex.S file appears to mix the case of a macro between its definition and
use. A cut down example of this is below. The macro __build_clear_none has
lower case 'build' but ends up being instantiated with upper case BUILD. Can
this be fixed on master. It has been picked up by the LLVM integrated assembler
which is currently case sensitive. We are likely to fix the assembler as well
but the code is currently inconsistent in the kernel.
.macro __build_clear_none
.endm
.macro __BUILD_HANDLER exception handler clear verbose ext
.align 5
.globl handle_\exception; .align 2; .type handle_\exception, @function; .ent
handle_\exception, 0; handle_\exception: .frame $29, 184, $29
.set noat
.globl handle_\exception\ext; .type handle_\exception\ext, @function;
handle_\exception\ext:
__BUILD_clear_\clear
.endm
.macro BUILD_HANDLER exception handler clear verbose
__BUILD_HANDLER \exception \handler \clear \verbose _int
.endm
BUILD_HANDLER ftlb ftlb none silent
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: Matthew Fortune <Matthew.Fortune@imgtec.com>
Commit bcdb247c6b6a ("sd: Limit transfer length") clamped the maximum
size of an I/O request to the MAXIMUM TRANSFER LENGTH field in the BLOCK
LIMITS VPD. This had the unfortunate effect of also limiting the maximum
size of non-filesystem requests sent to the device through sg/bsg.
Avoid using blk_queue_max_hw_sectors() and set the max_sectors queue
limit directly.
Also update the comment in blk_limits_max_hw_sectors() to clarify that
max_hw_sectors defines the limit for the I/O controller only.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Brian King <brking@linux.vnet.ibm.com>
Tested-by: Brian King <brking@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # 3.17+
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Pull x86 fixes from Ingo Molnar:
"Various low level fixes: fix more fallout from the FPU rework and the
asm entry code rework, plus an MSI rework fix, and an idle-tracing fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu/math-emu: Fix crash in fork()
x86/fpu/math-emu: Fix math-emu boot crash
x86/idle: Restore trace_cpu_idle to mwait_idle() calls
x86/irq: Build correct vector mapping for multiple MSI interrupts
Revert "sched/x86_64: Don't save flags on context switch"
U-Boot is often used to boot the kernel on ARM boards, but uImage
is not built by "make all", so we are often inclined to do
"make all uImage" to generate DTBs, modules and uImage in a single
command, but we should notice a pitfall behind it. In fact,
"make all uImage" could generate an invalid uImage if it is run with
the parallel option (-j).
You can reproduce this problem with the following procedure:
[1] First, build "all" and "uImage" separately.
You will get a valid uImage
$ git clean -f -x -d
$ export CROSS_COMPILE=<your-tools-prefix>
$ make -s -j8 ARCH=arm multi_v7_defconfig
$ make -s -j8 ARCH=arm all
$ make -j8 ARCH=arm UIMAGE_LOADADDR=0x80208000 uImage
CHK include/config/kernel.release
CHK include/generated/uapi/linux/version.h
CHK include/generated/utsrelease.h
make[1]: `include/generated/mach-types.h' is up to date.
CHK include/generated/timeconst.h
CHK include/generated/bounds.h
CHK include/generated/asm-offsets.h
CALL scripts/checksyscalls.sh
CHK include/generated/compile.h
Kernel: arch/arm/boot/Image is ready
Kernel: arch/arm/boot/zImage is ready
UIMAGE arch/arm/boot/uImage
Image Name: Linux-4.2.0-rc5-00156-gdd2384a-d
Created: Sat Aug 8 23:21:35 2015
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 6138648 Bytes = 5994.77 kB = 5.85 MB
Load Address: 80208000
Entry Point: 80208000
Image arch/arm/boot/uImage is ready
$ ls -l arch/arm/boot/*Image
-rwxrwxr-x 1 masahiro masahiro 13766656 Aug 8 23:20 arch/arm/boot/Image
-rw-rw-r-- 1 masahiro masahiro 6138712 Aug 8 23:21 arch/arm/boot/uImage
-rwxrwxr-x 1 masahiro masahiro 6138648 Aug 8 23:20 arch/arm/boot/zImage
[2] Update some source file(s)
$ touch init/main.c
[3] Then, re-build "all" and "uImage" simultaneously.
You will get an invalid uImage at random.
$ make -j8 ARCH=arm UIMAGE_LOADADDR=0x80208000 all uImage
CHK include/config/kernel.release
CHK include/generated/uapi/linux/version.h
CHK include/generated/utsrelease.h
make[1]: `include/generated/mach-types.h' is up to date.
CHK include/generated/timeconst.h
CHK include/generated/bounds.h
CHK include/generated/asm-offsets.h
CALL scripts/checksyscalls.sh
CC init/main.o
CHK include/generated/compile.h
LD init/built-in.o
LINK vmlinux
LD vmlinux.o
MODPOST vmlinux.o
GEN .version
CHK include/generated/compile.h
UPD include/generated/compile.h
CC init/version.o
LD init/built-in.o
KSYM .tmp_kallsyms1.o
KSYM .tmp_kallsyms2.o
LD vmlinux
SORTEX vmlinux
SYSMAP System.map
OBJCOPY arch/arm/boot/Image
Building modules, stage 2.
Kernel: arch/arm/boot/Image is ready
GZIP arch/arm/boot/compressed/piggy.gzip
AS arch/arm/boot/compressed/piggy.gzip.o
Kernel: arch/arm/boot/Image is ready
LD arch/arm/boot/compressed/vmlinux
GZIP arch/arm/boot/compressed/piggy.gzip
OBJCOPY arch/arm/boot/zImage
Kernel: arch/arm/boot/zImage is ready
UIMAGE arch/arm/boot/uImage
Image Name: Linux-4.2.0-rc5-00156-gdd2384a-d
Created: Sat Aug 8 23:23:14 2015
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 26472 Bytes = 25.85 kB = 0.03 MB
Load Address: 80208000
Entry Point: 80208000
Image arch/arm/boot/uImage is ready
MODPOST 192 modules
AS arch/arm/boot/compressed/piggy.gzip.o
LD arch/arm/boot/compressed/vmlinux
OBJCOPY arch/arm/boot/zImage
Kernel: arch/arm/boot/zImage is ready
$ ls -l arch/arm/boot/*Image
-rwxrwxr-x 1 masahiro masahiro 13766656 Aug 8 23:23 arch/arm/boot/Image
-rw-rw-r-- 1 masahiro masahiro 26536 Aug 8 23:23 arch/arm/boot/uImage
-rwxrwxr-x 1 masahiro masahiro 6138648 Aug 8 23:23 arch/arm/boot/zImage
Please notice the uImage is extremely small when this issue is
encountered. Besides, "Kernel: arch/arm/boot/zImage is ready" is
displayed twice, before and after the uImage log.
The root cause of this is the race condition between zImage and
uImage. Actually, uImage depends on zImage, but the dependency
between the two is only described in arch/arm/boot/Makefile.
Because arch/arm/boot/Makefile is not included from the top-level
Makefile, it cannot know the dependency between zImage and uImage.
Consequently, when we run make with the parallel option, Kbuild
updates vmlinux first, and then two different threads descends into
the arch/arm/boot/Makefile almost at the same time, one for updating
zImage and the other for uImage. While one thread is re-generating
zImage, the other also tries to update zImage before creating uImage
on top of that. zImage is overwritten by the slower thread and then
uImage is created based on the half-written zImage.
This is the reason why "Kernel: arch/arm/boot/zImage is ready" is
displayed twice, and a broken uImage is created.
The same problem could happen on bootpImage.
This commit adds dependencies among Image, zImage, uImage, and
bootpImage to arch/arm/Makefile, which is included from the
top-level Makefile.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull KVM fixes from Paolo Bonzini:
"Just two very small & simple patches"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Use adjustment in guest cycles when handling MSR_IA32_TSC_ADJUST
KVM: x86: zero IDT limit on entry to SMM
Commit 4c21b8fd8f14 ("MIPS: seccomp: Handle indirect system calls (o32)")
fixed indirect system calls on O32 but it also introduced a bug for MIPS64
where it erroneously modified the v0 (syscall) register with the assumption
that the sycall offset hasn't been taken into consideration. This breaks
seccomp on MIPS64 n64 and n32 ABIs. We fix this by replacing the addition
with a move instruction.
Fixes: 4c21b8fd8f14 ("MIPS: seccomp: Handle indirect system calls (o32)")
Cc: <stable@vger.kernel.org> # 3.15+
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10951/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Since fc_fcp_cleanup_cmd() can sleep this function must not
be called while holding a spinlock. This patch avoids that
fc_fcp_cleanup_each_cmd() triggers the following bug:
BUG: scheduling while atomic: sg_reset/1512/0x00000202
1 lock held by sg_reset/1512:
#0: (&(&fsp->scsi_pkt_lock)->rlock){+.-...}, at: [<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
Preemption disabled at:[<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
Call Trace:
[<ffffffff816c612c>] dump_stack+0x4f/0x7b
[<ffffffff810828bc>] __schedule_bug+0x6c/0xd0
[<ffffffff816c87aa>] __schedule+0x71a/0xa10
[<ffffffff816c8ad2>] schedule+0x32/0x80
[<ffffffffc0217eac>] fc_seq_set_resp+0xac/0x100 [libfc]
[<ffffffffc0218b11>] fc_exch_done+0x41/0x60 [libfc]
[<ffffffffc0225cff>] fc_fcp_cleanup_each_cmd.isra.21+0xcf/0x150 [libfc]
[<ffffffffc0225f43>] fc_eh_device_reset+0x1c3/0x270 [libfc]
[<ffffffff814a2cc9>] scsi_try_bus_device_reset+0x29/0x60
[<ffffffff814a3908>] scsi_ioctl_reset+0x258/0x2d0
[<ffffffff814a2650>] scsi_ioctl+0x150/0x440
[<ffffffff814b3a9d>] sd_ioctl+0xad/0x120
[<ffffffff8132f266>] blkdev_ioctl+0x1b6/0x810
[<ffffffff811da608>] block_ioctl+0x38/0x40
[<ffffffff811b4e08>] do_vfs_ioctl+0x2f8/0x530
[<ffffffff811b50c1>] SyS_ioctl+0x81/0xa0
[<ffffffff816cf8b2>] system_call_fastpath+0x16/0x7a
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Pull perf fixes from Ingo Molnar:
"Tooling fixes: a 'perf record' deadlock fix plus debuggability fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf top: Show backtrace when handling a SIGSEGV on --stdio mode
perf tools: Fix buildid processing
perf tools: Make fork event processing more resilient
perf tools: Avoid deadlock when map_groups are broken
During later stages of math-emu bootup the following crash triggers:
math_emulate: 0060:c100d0a8
Kernel panic - not syncing: Math emulation needed in kernel
CPU: 0 PID: 1511 Comm: login Not tainted 4.2.0-rc7+ #1012
[...]
Call Trace:
[<c181d50d>] dump_stack+0x41/0x52
[<c181c918>] panic+0x77/0x189
[<c1003530>] ? math_error+0x140/0x140
[<c164c2d7>] math_emulate+0xba7/0xbd0
[<c100d0a8>] ? fpu__copy+0x138/0x1c0
[<c1109c3c>] ? __alloc_pages_nodemask+0x12c/0x870
[<c136ac20>] ? proc_clear_tty+0x40/0x70
[<c136ac6e>] ? session_clear_tty+0x1e/0x30
[<c1003530>] ? math_error+0x140/0x140
[<c1003575>] do_device_not_available+0x45/0x70
[<c100d0a8>] ? fpu__copy+0x138/0x1c0
[<c18258e6>] error_code+0x5a/0x60
[<c1003530>] ? math_error+0x140/0x140
[<c100d0a8>] ? fpu__copy+0x138/0x1c0
[<c100c205>] arch_dup_task_struct+0x25/0x30
[<c1048cea>] copy_process.part.51+0xea/0x1480
[<c115a8e5>] ? dput+0x175/0x200
[<c136af70>] ? no_tty+0x30/0x30
[<c1157242>] ? do_vfs_ioctl+0x322/0x540
[<c104a21a>] _do_fork+0xca/0x340
[<c1057b06>] ? SyS_rt_sigaction+0x66/0x90
[<c104a557>] SyS_clone+0x27/0x30
[<c1824a80>] sysenter_do_call+0x12/0x12
The reason is the incorrect assumption in fpu_copy(), that FNSAVE
can be executed from math-emu kernels as well.
Don't try to copy the registers, the soft state will be copied
by fork anyway, so the child task inherits the parent task's
soft math state.
With this fix applied math-emu kernels boot up fine on modern
hardware and the 'no387 nofxsr' boot options.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The mmap semaphore should not be taken when page faults are disabled.
Since pagefault_disable() no longer disables preemption, we now need
to use faulthandler_disabled() in place of in_atomic().
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Merge fixes from Andrew Morton:
"11 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
Update maintainers for DRM STI driver
mm: cma: mark cma_bitmap_maxno() inline in header
zram: fix pool name truncation
memory-hotplug: fix wrong edge when hot add a new node
.mailmap: Andrey Ryabinin has moved
ipc/sem.c: update/correct memory barriers
mm/hwpoison: fix panic due to split huge zero page
ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()
ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits
mm/hwpoison: fix fail isolate hugetlbfs page w/ refcount held
mm/hwpoison: fix page refcount of unknown non LRU page
When kvm_set_msr_common() handles a guest's write to
MSR_IA32_TSC_ADJUST, it will calcuate an adjustment based on the data
written by guest and then use it to adjust TSC offset by calling a
call-back adjust_tsc_offset(). The 3rd parameter of adjust_tsc_offset()
indicates whether the adjustment is in host TSC cycles or in guest TSC
cycles. If SVM TSC scaling is enabled, adjust_tsc_offset()
[i.e. svm_adjust_tsc_offset()] will first scale the adjustment;
otherwise, it will just use the unscaled one. As the MSR write here
comes from the guest, the adjustment is in guest TSC cycles. However,
the current kvm_set_msr_common() uses it as a value in host TSC
cycles (by using true as the 3rd parameter of adjust_tsc_offset()),
which can result in an incorrect adjustment of TSC offset if SVM TSC
scaling is enabled. This patch fixes this problem.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Cc: stable@vger.linux.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Due to patch "libfc: Do not invoke the response handler after
fc_exch_done()" (commit ID 7030fd62) the lport_recv() call
in fc_exch_recv_req() is passed a dangling pointer. Avoid this
by moving the fc_frame_free() call from fc_invoke_resp() to its
callers. This patch fixes the following crash:
general protection fault: 0000 [#3] PREEMPT SMP
RIP: fc_lport_recv_req+0x72/0x280 [libfc]
Call Trace:
fc_exch_recv+0x642/0xde0 [libfc]
fcoe_percpu_receive_thread+0x46a/0x5ed [fcoe]
kthread+0x10a/0x120
ret_from_fork+0x42/0x70
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Pull irq fixes from Thomas Gleixner:
"A series of small fixlets for a regression visible on OMAP devices
caused by the conversion of the OMAP interrupt chips to hierarchical
interrupt domains. Mostly one liners on the driver side plus a small
helper function in the core to avoid open coded mess in the drivers"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/crossbar: Restore set_wake functionality
irqchip/crossbar: Restore the mask on suspend behaviour
ARM: OMAP: wakeupgen: Restore the irq_set_type() mechanism
irqchip/crossbar: Restore the irq_set_type() mechanism
genirq: Introduce irq_chip_set_type_parent() helper
genirq: Don't return ENOSYS in irq_chip_retrigger_hierarchy
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix buildid processing done at the end of a 'perf record' session, a
problem that happened in workloads involving lots of small short-lived
processes. That code was not asking the perf_session layer to order
the events.
Make the code more robust to handle some of the problems with such
out-of-order events and fix 'perf record' to ask for ordered events
on systems where we have perf_event_attr.sample_id_all. (Adrian Hunter)
- Show backtrace when handling a SIGSEGV in 'perf top --stdio' (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
On a math-emu bootup the following crash occurs:
Initializing CPU#0
------------[ cut here ]------------
kernel BUG at arch/x86/kernel/traps.c:779!
invalid opcode: 0000 [#1] SMP
[...]
EIP is at do_device_not_available+0xe/0x70
[...]
Call Trace:
[<c18238e6>] error_code+0x5a/0x60
[<c1002bd0>] ? math_error+0x140/0x140
[<c100bbd9>] ? fpu__init_cpu+0x59/0xa0
[<c1012322>] cpu_init+0x202/0x330
[<c104509f>] ? __native_set_fixmap+0x1f/0x30
[<c1b56ab0>] trap_init+0x305/0x346
[<c1b548af>] start_kernel+0x1a5/0x35d
[<c1b542b4>] i386_start_kernel+0x82/0x86
The reason is that in the following commit:
b1276c48e91b ("x86/fpu: Initialize fpregs in fpu__init_cpu_generic()")
I failed to consider math-emu's limitation that it cannot execute the
FNINIT instruction in kernel mode.
The long term fix might be to allow math-emu to execute (certain) kernel
mode FPU instructions, but for now apply the safe (albeit somewhat ugly)
fix: initialize the emulation state explicitly without trapping out to
the FPU emulator.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Since 906c55579a63 ("timekeeping: Copy the shadow-timekeeper over the
real timekeeper last") it has become possible on ARM to:
- Obtain a CLOCK_MONOTONIC_COARSE or CLOCK_REALTIME_COARSE timestamp
via syscall.
- Subsequently obtain a timestamp for the same clock ID via VDSO which
predates the first timestamp (by one jiffy).
This is because ARM's update_vsyscall is deriving the coarse time
using the __current_kernel_time interface, when it should really be
using the timekeeper object provided to it by the timekeeping core.
It happened to work before only because __current_kernel_time would
access the same timekeeper object which had been passed to
update_vsyscall. This is no longer the case.
Cc: stable@vger.kernel.org
Fixes: 906c55579a63 ("timekeeping: Copy the shadow-timekeeper over the real timekeeper last")
Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull clock fix from Stephen Boyd:
"A one-liner for a regression found in the PXA clock driver"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: pxa: pxa3xx: fix CKEN register access
Add Vincent Abriou and myself as maintainers.
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Cc: Vincent Abriou <vincent.abriou@st.com>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The recent BlackHat 2015 presentation "The Memory Sinkhole"
mentions that the IDT limit is zeroed on entry to SMM.
This is not documented, and must have changed some time after 2010
(see http://www.ssi.gouv.fr/uploads/IMG/pdf/IT_Defense_2010_final.pdf).
KVM was not doing it, but the fix is easy.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull input subsystem fixes from Dmitry Torokhov:
"Just small ALPS and Elan touchpads, and other driver fixups"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elantech - add special check for fw_version 0x470f01 touchpad
Input: twl4030-vibra - fix ERROR: Bad of_node_put() warning
Input: alps - only Dell laptops have separate button bits for v2 dualpoint sticks
Input: axp20x-pek - add module alias
Input: turbografx - fix potential out of bound access
In case of hw iscsi offload, an host can have N-number of active
connections. There can be IO's running on some connections which
make host->host_busy always TRUE. Now if logout from a connection
is tried then the code gets into an infinite loop as host->host_busy
is always TRUE.
iscsi_conn_teardown(....)
{
.........
/*
* Block until all in-progress commands for this connection
* time out or fail.
*/
for (;;) {
spin_lock_irqsave(session->host->host_lock, flags);
if (!atomic_read(&session->host->host_busy)) { /* OK for ERL == 0 */
spin_unlock_irqrestore(session->host->host_lock, flags);
break;
}
spin_unlock_irqrestore(session->host->host_lock, flags);
msleep_interruptible(500);
iscsi_conn_printk(KERN_INFO, conn, "iscsi conn_destroy(): "
"host_busy %d host_failed %d\n",
atomic_read(&session->host->host_busy),
session->host->host_failed);
................
...............
}
}
This is not an issue with software-iscsi/iser as each cxn is a separate
host.
Fix:
Acquiring eh_mutex in iscsi_conn_teardown() before setting
session->state = ISCSI_STATE_TERMINATE.
Signed-off-by: John Soni Jose <sony.john@avagotech.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Chris Leech <cleech@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Pull timer fixes from Thomas Gleixner:
"Two minimalistic fixes for 4.2 regressions:
- Eric fixed a thinko in the timer_list base switching code caused by
the overhaul of the timer wheel. It can cause a cpu to see the
wrong base for a timer while we move the timer around.
- Guenter fixed a regression for IMX if booted w/o device tree, where
the timer interrupt is not initialized and therefor the machine
fails to boot"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/imx: Fix boot with non-DT systems
timer: Write timer->flags atomically
The TI crossbar irqchip doesn't provides any facility to configure the
wakeup sources, but the conversion to hierarchical irqdomains set the
irq_set_wake callback to irq_chip_set_wake_parent. The parent chip
(OMAP wakeupgen) has no irq_set_wake function either so the call will
fail with -ENOSYS. As a result the irq_set_wake() call in the resume
path will trigger an 'Unbalanced wake disable' warning.
Before the conversion the GIC irqchip was the top level irqchip and
correctly flagged with IRQCHIP_SKIP_SET_WAKE.
Restore the correct behaviour by removing the irq_set_type callback
from the crossbar irqchip and set the IRQCHIP_SKIP_SET_WAKE flag which
lets the irq_set_irq_wake() call from the driver succeed.
[ tglx: Massaged changelog ]
Fixes: 783d31863fb8 ('irqchip: crossbar: Convert dra7 crossbar...')
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: <linux@arm.linux.org.uk>
Cc: <nsekhar@ti.com>
Cc: <jason@lakedaemon.net>
Cc: <balbi@ti.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: <tony@atomide.com>
Cc: <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org # 4.1
Link: http://lkml.kernel.org/r/1439554830-19502-7-git-send-email-grygorii.strashko@ti.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull dmaengine fix from Vinod Koul:
"We recently found issue with dma_request_slave_channel() API causing
privatecnt value to go bad. This is fixed by balancing the privatecnt"
* tag 'dmaengine-fix-4.2-rc8' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: fix balance of privatecnt inc/dec operations
It was just freezing instead of informing about the SEGV, fix it and
also print a backtrace, just like in the TUI mode and in 'perf trace'.
Tested by provoking a NULL deref when pressing 'z':
0.31% libc-2.20.so [.] malloc_consolidate
0.31% ld-2.20.so [.] _dl_relocate_object
0.28% cc1 [.] ht_lookup
0.28% cc1 [.] ira_init_register_move_cost
perf: Segmentation fault
Obtained 7 stack frames.
perf(dump_stack+0x32) [0x4d69f2]
perf(sighandler_dump_stack+0x29) [0x4d6a89]
/lib64/libc.so.6(+0x34960) [0x7f5064333960]
perf() [0x438790]
/lib64/libpthread.so.0(+0x752a) [0x7f50663dd52a]
/lib64/libc.so.6(clone+0x6d) [0x7f50643ff22d]
#
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-pewrpzqd29rgmhu2wkk7fhww@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit b253149b843f ("sched/idle/x86: Restore mwait_idle() to fix boot
hangs, to improve power savings and to improve performance") restores
mwait_idle(), but the trace_cpu_idle related calls are missing. This
causes powertop on my old desktop powered by Intel Core2 E6550 to
report zero wakeups and zero events.
Add them back to restore the proper behaviour.
Fixes: b253149b843f ("sched/idle/x86: Restore mwait_idle() to ...")
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Cc: <len.brown@intel.com>
Cc: stable@vger.kernel.org # 4.1
Link: http://lkml.kernel.org/r/1440046479-4262-1-git-send-email-jszhang@marvell.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
ret_fast_syscall runs when user space makes a syscall. However it
needs to be marked as such so the ELF information is correct. Before
it was:
101: 8000f300 0 NOTYPE LOCAL DEFAULT 2 ret_fast_syscall
But with this change it correctly shows as:
101: 8000f300 96 FUNC LOCAL DEFAULT 2 ret_fast_syscall
I see this function when using perf to unwind call stacks from kernel
space to user space. Without this change I would need to add some
special case logic when using the vmlinux ELF information.
Signed-off-by: Drew Richardson <drew.richardson@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull timer fix from Ingo Molnar:
"A single clocksource driver suspend/resume fix"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clockevents/drivers/sh_cmt: Only perform clocksource suspend/resume if enabled
Clocks 0 to 31 are on CKENA, and not CKENB. The clock register names
were inadequately inverted. As a consequence, all clock operations were
happening on CKENB, because almost all but 2 clocks are on CKENA.
As the clocks were activated by the bootloader in the former tests, it
escaped the testing that the wrong clock gate was manipulated. The error
was revealed by changing the pxa3xx-nand driver to a module, where upon
unloading, the wrong clock was disabled in CKENB.
Fixes: 9bbb8a338fb2 ("clk: pxa: add pxa3xx clock driver")
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
cma_bitmap_maxno() was marked as static and not static inline, which can
cause warnings about this function not being used if this file is included
in a file that does not call that function, and violates the conventions
used elsewhere. The two options are to move the function implementation
back to mm/cma.c or make it inline here, and it's simple enough for the
latter to make sense.
Signed-off-by: Gregory Fong <gregory.0xf0@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The patch was munged on commit to re-order these tests resulting in
excessive warnings when trying to do device assignment. Return to
original ordering: https://lkml.org/lkml/2015/7/15/769
Fixes: 3e5d2fdceda1 ("KVM: MTRR: simplify kvm_mtrr_get_guest_memory_type")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull MIPS fixes from Ralf Baechle:
"Another round of MIPS fixes for 4.2. No area does particularly stand
out but we have a two unpleasant ones:
- Kernel ptes are marked with a global bit which allows the kernel to
share kernel TLB entries between all processes. For this to work
both entries of an adjacent even/odd pte pair need to have the
global bit set. There has been a subtle race in setting the other
entry's global bit since ~ 2000 but it take particularly
pathological workloads that essentially do mostly vmalloc/vfree to
trigger this.
This pull request fixes the 64-bit case but leaves the case of 32
bit CPUs with 64 bit ptes unsolved for now. The unfixed cases
affect hardware that is not available in the field yet.
- Instruction emulation requires loading instructions from user space
but the current fast but simplistic approach will fail on pages
that are PROT_EXEC but !PROT_READ. For this reason we temporarily
do not permit this permission and will map pages with PROT_EXEC |
PROT_READ.
The remainder of this pull request is more or less across the field
and the short log explains them well"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Make set_pte() SMP safe.
MIPS: Replace add and sub instructions in relocate_kernel.S with addiu
MIPS: Flush RPS on kernel entry with EVA
Revert "MIPS: BCM63xx: Provide a plat_post_dma_flush hook"
MIPS: BMIPS: Delete unused Kconfig symbol
MIPS: Export get_c0_perfcount_int()
MIPS: show_stack: Fix stack trace with EVA
MIPS: do_mcheck: Fix kernel code dump with EVA
MIPS: SMP: Don't increment irq_count multiple times for call function IPIs
MIPS: Partially disable RIXI support.
MIPS: Handle page faults of executable but unreadable pages correctly.
MIPS: Malta: Don't reinitialise RTC
MIPS: unaligned: Fix build error on big endian R6 kernels
MIPS: Fix sched_getaffinity with MT FPAFF enabled
MIPS: Fix build with CONFIG_OF=y for non OF-enabled targets
CPUFREQ: Loongson2: Fix broken build due to incorrect include.
It is no need to check the packet[0] for sanity check when doing
elantech_packet_check_v4() function for fw_version = 0x470f01 touchpad.
Signed-off by: Duson Lin <dusonlin@emc.com.tw>
Reviewed-by: Ulrik De Bie <ulrik.debie-os@e2big.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Fix a memory leak with scsi-mq triggered by commands with large data
transfer length.
__sg_alloc_table() sets both table->nents and table->orig_nents to the
same value. When the scatterlist is DMA-mapped, table->nents is
overwritten with the (possibly smaller) size of the DMA-mapped
scatterlist, while table->orig_nents retains the original size of the
allocated scatterlist. scsi_free_sgtable() should therefore check
orig_nents instead of nents, and all code that initializes sdb->table
without calling __sg_alloc_table() should set both nents and orig_nents.
Fixes: d285203cf647 ("scsi: add support for a blk-mq based I/O path.")
Cc: <stable@vger.kernel.org> # 3.17+
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
While the idea behind get_maintainer seems highly useful it's
unfortunately way to trigger happy to grab people that once had a few
commits to files. For someone like me who does a lot of tree-wide API
work that leads to an incredible amount of Cc spam.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 6dd747825b20 ("ARM: imx: move timer resources into a structure")
moved initialization parameters into a data structure, but neglected to set
the irq field in that data structure for non-DT boots. This causes the system
to hang if a non-DT boot is attempted.
Fixes: 6dd747825b20 ("ARM: imx: move timer resources into a structure")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: http://lkml.kernel.org/r/1440066441-13930-1-git-send-email-linux@roeck-us.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The ARM GIC requires that all interrupts which are not used as a
wakeup source have to be masked during suspend.
The conversion of the crossbar irqchip to hierarchical irq domains
failed to mark the crossbar irqchip with the IRQCHIP_MASK_ON_SUSPEND
flag and therefor broke the suspend requirement of the GIC.
Before the conversion the flags were visible because the GIC was the
top level irqchip. After the conversion the crossbar irqchip is the
top level irq chip whose flags are evaluated in suspend_device_irq().
As the flag is not set the masking of the non-wakeup irqs is not
invoked which breaks suspend.
Add the IRQCHIP_MASK_ON_SUSPEND flag to the crossbar irqchip, so the
GIC interrupts get masked properly.
[ tglx: Massaged changelog ]
Fixes: 783d31863fb8 ('irqchip: crossbar: Convert dra7 crossbar...')
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: <linux@arm.linux.org.uk>
Cc: <nsekhar@ti.com>
Cc: <jason@lakedaemon.net>
Cc: <balbi@ti.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: <tony@atomide.com>
Cc: <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org # 4.1
Link: http://lkml.kernel.org/r/1439554830-19502-6-git-send-email-grygorii.strashko@ti.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull drm fixes from Dave Airlie:
"These came in late last week, I wanted to look over the mst one before
forwarding, but it seems good.
Just three i915 and one MST fix"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/i915: Commit planes on each crtc separately.
drm/i915: calculate primary visibility changes instead of calling from set_config
drm/i915: Only dither on 6bpc panels
drm/dp/mst: Remove port after removing connector.
This patch increments privatecnt value and set DMA_PRIVATE in device
caps in dma_request_slave_channel() function. This is needed to keep
privatecnt increment/decrement balance.
As function dma_release_channel() decrements privatecnt counter, we need
to increment it when channel is requested. Otherwise privatecnt drops
into negatives after few dma_release_channel() calls.
Reported-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
After recording, 'perf record' post-processes the data to determine
which buildids are needed.
That processing must process the data in time order, if possible,
because otherwise dependent events, like forks and mmaps, will not make
sense.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1439994561-27436-4-git-send-email-adrian.hunter@intel.com
[ Moved the sample_id_add to after trying to open the events, use pr_warning ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Since the commit "b2c3e38a5471 ARM: redo TTBR setup code for LPAE",
the setup code had been reworked. As a result the secondary CPUs
failed to come online in Big Endian.
As explained by Russell, the new code expected the value in r4/r5 to
be the least significant 32bits in r4 and the most significant 32bits
in r5. However, in the secondary code, we load this using ldrd, which
on BE reverses that.
This patch swap r4/r5 after the ldrd. It is done using the xor
instructions in order to not use a temporary register.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull perf fixes from Ingo Molnar:
"Misc fixes: PMU driver corner cases, tooling fixes, and an 'AUX'
(Intel PT) race related core fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/cqm: Do not access cpu_data() from CPU_UP_PREPARE handler
perf/x86/intel: Fix memory leak on hot-plug allocation fail
perf: Fix PERF_EVENT_IOC_PERIOD migration race
perf: Fix double-free of the AUX buffer
perf: Fix fasync handling on inherited events
perf tools: Fix test build error when bindir contains double slash
perf stat: Fix transaction lenght metrics
perf: Fix running time accounting
Currently the sh_cmt clocksource timer is disabled or enabled
unconditionally on clocksource suspend resp. resume, even if a
better clocksource is present (e.g. arch_sys_counter) and the
sh_cmt clocksource is not enabled.
As sh_cmt is a syscore device when its timer is enabled, this
may lead to a genpd.prepared_count imbalance in the presence of
PM Domains, which may cause a lock-up during reboot after s2ram.
During suspend:
- pm_genpd_prepare() is called for all non-syscore devices (incl.
sh_cmt), increasing genpd.prepared_count for each device,
- clocksource.suspend() is called for all clocksource devices,
- sh_cmt_clocksource_suspend() calls sh_cmt_stop(), which is a no-op
as the clocksource was not enabled.
During resume:
- clocksource.resume() is called for all clocksource devices,
- sh_cmt_clocksource_resume() calls sh_cmt_start(), which enables the
clocksource timer, and turns sh_cmt into a syscore device,
- pm_genpd_complete() is called for all non-syscore devices (excl.
sh_cmt now!), decreasing genpd.prepared_count for each device but
sh_cmt.
Now genpd.prepared_count of the PM Domain containing sh_cmt is
still 1 instead of zero. On subsequent suspend/resume cycles,
sh_cmt is still a syscore device, hence it's skipped for
pm_genpd_{prepare,complete}(), keeping the imbalance of
genpd.prepared_count at 1.
During reboot:
- platform_drv_shutdown() is called for any platform device that has
a driver with a .shutdown() method (only rcar-dmac on R-Car Gen2),
- platform_drv_shutdown() calls dev_pm_domain_detach(), which
calls genpd_dev_pm_detach(),
- genpd_dev_pm_detach() keeps calling pm_genpd_remove_device() until
it doesn't return -EAGAIN[*],
- If the device is part of the same PM Domain as sh_cmt,
pm_genpd_remove_device() always fails with -EAGAIN due to
genpd.prepared_count > 0.
- Infinite loop in genpd_dev_pm_detach()[*].
[*] Commit 93af5e9354432828 ("PM / Domains: Avoid infinite loops in
attach/detach code") already limited the number of loop iterations,
avoiding the lock-up.
To fix this, only disable or enable the clocksource timer on
clocksource suspend resp. resume if the clocksource was enabled.
This was tested on r8a7791/koelsch with the CPG Clock Domain:
- using arch_sys_counter as the clocksource, which is the default, and
which showed the problem,
- using sh_cmt as a clocksource ("echo ffca0000.timer > \
/sys/devices/system/clocksource/clocksource0/current_clocksource"),
which behaves the same as before.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1438875126-12596-2-git-send-email-daniel.lezcano@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
zram_meta_alloc() constructs a pool name for zs_create_pool() call as
snprintf(pool_name, sizeof(pool_name), "zram%d", device_id);
However, it defines pool name buffer to be only 8 bytes long (minus
trailing zero), which means that we can have only 1000 pool names: zram0
-- zram999.
With CONFIG_ZSMALLOC_STAT enabled an attempt to create a device zram1000
can fail if device zram100 already exists, because snprintf() will
truncate new pool name to zram100 and pass it debugfs_create_dir(),
causing:
debugfs dir <zram100> creation failed
zram: Error creating memory pool
... and so on.
Fix it by passing zram->disk->disk_name to zram_meta_alloc() instead of
divice_id. We construct zram%d name earlier and keep it as a ->disk_name,
no need to snprintf() it again.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KVM: s390: bugfix for kvm/master (4.2)
Here is a bugfix for a regression that was introduced after 4.1
with the commit commit 785dbef407d8 ("KVM: s390: optimize round
trip time in request handling"). After lots of cpu hotplugs in the
guest (online/offline) sometimes a guest CPU did loop within host
KVM code. Reason was that PROG_REQUEST was set in the sie control
block, but no request was pending. This made commit 785dbef407d8
the suspect and changing that area to always reset PROG_REQUEST
did indeed fix the problem.
Special thanks to David Hildenbrand, who helped understanding the
exact sequence that led to the problem.
Pull btrfs fix from Chris Mason:
"We have a btrfs quota regression fix.
I merged this one on Thursday and have run it through tests against
current master.
Normally I wouldn't have sent this while you were finalizing rc6, but
I'm feeding mosquitoes in the adirondacks next week, so I wanted to
get this one out before leaving. I'll leave longer tests running and
check on things during the week, but I don't expect any problems"
* 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: qgroup: Fix a regression in qgroup reserved space.
On MIPS the GLOBAL bit of the PTE must have the same value in any
aligned pair of PTEs. These pairs of PTEs are referred to as
"buddies". In a SMP system is is possible for two CPUs to be calling
set_pte() on adjacent PTEs at the same time. There is a race between
setting the PTE and a different CPU setting the GLOBAL bit in its
buddy PTE.
This race can be observed when multiple CPUs are executing
vmap()/vfree() at the same time.
Make setting the buddy PTE's GLOBAL bit an atomic operation to close
the race condition.
The case of CONFIG_64BIT_PHYS_ADDR && CONFIG_CPU_MIPS32 is *not*
handled.
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: <stable@vger.kernel.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10835/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Pull SCSI fixes from James Bottomley:
"A couple of major (hang and deadlock) fixes with fortunately fairly
rare triggering conditions. The PM oops is only really triggered by
people using enclosure services (rare) and the fnic driver is mostly
used in enterprise environments"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
SCSI: Fix NULL pointer dereference in runtime PM
fnic: Use the local variable instead of I/O flag to acquire io_req_lock in fnic_queuecommand() to avoid deadloack
Pull MIPS bug fixes from Ralf Baechle:
"Two more fixes for 4.2.
One fixes a build issue with the LLVM assembler - LLVM assembler macro
names are case sensitive, GNU as macro names are insensitive; the
other corrects a license string (GPL v2, not GPLv2) such that the
module loader will recognice the license correctly"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
FIRMWARE: bcm47xx_nvram: Fix module license.
MIPS: Fix LLVM build issue.
The routines in scsi_rpm.c assume that if a runtime-PM callback is
invoked for a SCSI device, it can only mean that the device's driver
has asked the block layer to handle the runtime power management (by
calling blk_pm_runtime_init(), which among other things sets q->dev).
However, this assumption turns out to be wrong for things like the ses
driver. Normally ses devices are not allowed to do runtime PM, but
userspace can override this setting. If this happens, the kernel gets
a NULL pointer dereference when blk_post_runtime_resume() tries to use
the uninitialized q->dev pointer.
This patch fixes the problem by calling the block layer's runtime-PM
routines only if the device's driver really does have a runtime-PM
callback routine. Since ses doesn't define any such callbacks, the
crash won't occur.
This fixes Bugzilla #101371.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Stanisław Pitucha <viraptor@gmail.com>
Reported-by: Ilan Cohen <ilanco@gmail.com>
Tested-by: Ilan Cohen <ilanco@gmail.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Odin.com>
We added changes in fnic driver patch 1.6.0.16 to acquire
io_req_lock in fnic_queuecommand() before issuing I/O so that io completion
is serialized. But when releasing the lock we check for the I/O flag and
this could be modified if IO abort occurs before I/O completion. In this case
we wont release the lock and causes deadlock in some scenerios. Using the
local variable to check the IO lock status will resolve the problem.
Fixes: 41df7b02db82cf6c14f094757bac3830d10a827f
Signed-off-by: Hiral Shah <hishah@cisco.com>
Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com>
Signed-off-by: Anil Chintalapati <achintal@cisco.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Pull ARM fixes from Russell King:
"Another couple of small ARM fixes.
A patch from Masahiro Yamada who noticed that "make -jN all zImage"
would end up generating bad images where N > 1, and a patch from
Nicolas to fix the Marvell CPU user access optimisation code when page
faults are disabled"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8418/1: add boot image dependencies to not generate invalid images
ARM: 8414/1: __copy_to_user_memcpy: fix mmap semaphore usage
Some use of those functions were providing unitialized values to those
functions. Notably, when reading 0 bytes from an empty file on a 9P
filesystem, the return code of read() was not 0.
Tested with this simple program:
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
int main(int argc, const char **argv)
{
assert(argc == 2);
char buffer[256];
int fd = open(argv[1], O_RDONLY|O_NOCTTY);
assert(fd >= 0);
assert(read(fd, buffer, 0) == 0);
return 0;
}
Cc: stable@vger.kernel.org # v4.1
Signed-off-by: Vincent Bernat <vincent@bernat.im>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Matthew Fortune <Matthew.Fortune@imgtec.com> reports:
The genex.S file appears to mix the case of a macro between its definition and
use. A cut down example of this is below. The macro __build_clear_none has
lower case 'build' but ends up being instantiated with upper case BUILD. Can
this be fixed on master. It has been picked up by the LLVM integrated assembler
which is currently case sensitive. We are likely to fix the assembler as well
but the code is currently inconsistent in the kernel.
.macro __build_clear_none
.endm
.macro __BUILD_HANDLER exception handler clear verbose ext
.align 5
.globl handle_\exception; .align 2; .type handle_\exception, @function; .ent
handle_\exception, 0; handle_\exception: .frame $29, 184, $29
.set noat
.globl handle_\exception\ext; .type handle_\exception\ext, @function;
handle_\exception\ext:
__BUILD_clear_\clear
.endm
.macro BUILD_HANDLER exception handler clear verbose
__BUILD_HANDLER \exception \handler \clear \verbose _int
.endm
BUILD_HANDLER ftlb ftlb none silent
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: Matthew Fortune <Matthew.Fortune@imgtec.com>
Commit bcdb247c6b6a ("sd: Limit transfer length") clamped the maximum
size of an I/O request to the MAXIMUM TRANSFER LENGTH field in the BLOCK
LIMITS VPD. This had the unfortunate effect of also limiting the maximum
size of non-filesystem requests sent to the device through sg/bsg.
Avoid using blk_queue_max_hw_sectors() and set the max_sectors queue
limit directly.
Also update the comment in blk_limits_max_hw_sectors() to clarify that
max_hw_sectors defines the limit for the I/O controller only.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Brian King <brking@linux.vnet.ibm.com>
Tested-by: Brian King <brking@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # 3.17+
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Pull x86 fixes from Ingo Molnar:
"Various low level fixes: fix more fallout from the FPU rework and the
asm entry code rework, plus an MSI rework fix, and an idle-tracing fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu/math-emu: Fix crash in fork()
x86/fpu/math-emu: Fix math-emu boot crash
x86/idle: Restore trace_cpu_idle to mwait_idle() calls
x86/irq: Build correct vector mapping for multiple MSI interrupts
Revert "sched/x86_64: Don't save flags on context switch"
U-Boot is often used to boot the kernel on ARM boards, but uImage
is not built by "make all", so we are often inclined to do
"make all uImage" to generate DTBs, modules and uImage in a single
command, but we should notice a pitfall behind it. In fact,
"make all uImage" could generate an invalid uImage if it is run with
the parallel option (-j).
You can reproduce this problem with the following procedure:
[1] First, build "all" and "uImage" separately.
You will get a valid uImage
$ git clean -f -x -d
$ export CROSS_COMPILE=<your-tools-prefix>
$ make -s -j8 ARCH=arm multi_v7_defconfig
$ make -s -j8 ARCH=arm all
$ make -j8 ARCH=arm UIMAGE_LOADADDR=0x80208000 uImage
CHK include/config/kernel.release
CHK include/generated/uapi/linux/version.h
CHK include/generated/utsrelease.h
make[1]: `include/generated/mach-types.h' is up to date.
CHK include/generated/timeconst.h
CHK include/generated/bounds.h
CHK include/generated/asm-offsets.h
CALL scripts/checksyscalls.sh
CHK include/generated/compile.h
Kernel: arch/arm/boot/Image is ready
Kernel: arch/arm/boot/zImage is ready
UIMAGE arch/arm/boot/uImage
Image Name: Linux-4.2.0-rc5-00156-gdd2384a-d
Created: Sat Aug 8 23:21:35 2015
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 6138648 Bytes = 5994.77 kB = 5.85 MB
Load Address: 80208000
Entry Point: 80208000
Image arch/arm/boot/uImage is ready
$ ls -l arch/arm/boot/*Image
-rwxrwxr-x 1 masahiro masahiro 13766656 Aug 8 23:20 arch/arm/boot/Image
-rw-rw-r-- 1 masahiro masahiro 6138712 Aug 8 23:21 arch/arm/boot/uImage
-rwxrwxr-x 1 masahiro masahiro 6138648 Aug 8 23:20 arch/arm/boot/zImage
[2] Update some source file(s)
$ touch init/main.c
[3] Then, re-build "all" and "uImage" simultaneously.
You will get an invalid uImage at random.
$ make -j8 ARCH=arm UIMAGE_LOADADDR=0x80208000 all uImage
CHK include/config/kernel.release
CHK include/generated/uapi/linux/version.h
CHK include/generated/utsrelease.h
make[1]: `include/generated/mach-types.h' is up to date.
CHK include/generated/timeconst.h
CHK include/generated/bounds.h
CHK include/generated/asm-offsets.h
CALL scripts/checksyscalls.sh
CC init/main.o
CHK include/generated/compile.h
LD init/built-in.o
LINK vmlinux
LD vmlinux.o
MODPOST vmlinux.o
GEN .version
CHK include/generated/compile.h
UPD include/generated/compile.h
CC init/version.o
LD init/built-in.o
KSYM .tmp_kallsyms1.o
KSYM .tmp_kallsyms2.o
LD vmlinux
SORTEX vmlinux
SYSMAP System.map
OBJCOPY arch/arm/boot/Image
Building modules, stage 2.
Kernel: arch/arm/boot/Image is ready
GZIP arch/arm/boot/compressed/piggy.gzip
AS arch/arm/boot/compressed/piggy.gzip.o
Kernel: arch/arm/boot/Image is ready
LD arch/arm/boot/compressed/vmlinux
GZIP arch/arm/boot/compressed/piggy.gzip
OBJCOPY arch/arm/boot/zImage
Kernel: arch/arm/boot/zImage is ready
UIMAGE arch/arm/boot/uImage
Image Name: Linux-4.2.0-rc5-00156-gdd2384a-d
Created: Sat Aug 8 23:23:14 2015
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 26472 Bytes = 25.85 kB = 0.03 MB
Load Address: 80208000
Entry Point: 80208000
Image arch/arm/boot/uImage is ready
MODPOST 192 modules
AS arch/arm/boot/compressed/piggy.gzip.o
LD arch/arm/boot/compressed/vmlinux
OBJCOPY arch/arm/boot/zImage
Kernel: arch/arm/boot/zImage is ready
$ ls -l arch/arm/boot/*Image
-rwxrwxr-x 1 masahiro masahiro 13766656 Aug 8 23:23 arch/arm/boot/Image
-rw-rw-r-- 1 masahiro masahiro 26536 Aug 8 23:23 arch/arm/boot/uImage
-rwxrwxr-x 1 masahiro masahiro 6138648 Aug 8 23:23 arch/arm/boot/zImage
Please notice the uImage is extremely small when this issue is
encountered. Besides, "Kernel: arch/arm/boot/zImage is ready" is
displayed twice, before and after the uImage log.
The root cause of this is the race condition between zImage and
uImage. Actually, uImage depends on zImage, but the dependency
between the two is only described in arch/arm/boot/Makefile.
Because arch/arm/boot/Makefile is not included from the top-level
Makefile, it cannot know the dependency between zImage and uImage.
Consequently, when we run make with the parallel option, Kbuild
updates vmlinux first, and then two different threads descends into
the arch/arm/boot/Makefile almost at the same time, one for updating
zImage and the other for uImage. While one thread is re-generating
zImage, the other also tries to update zImage before creating uImage
on top of that. zImage is overwritten by the slower thread and then
uImage is created based on the half-written zImage.
This is the reason why "Kernel: arch/arm/boot/zImage is ready" is
displayed twice, and a broken uImage is created.
The same problem could happen on bootpImage.
This commit adds dependencies among Image, zImage, uImage, and
bootpImage to arch/arm/Makefile, which is included from the
top-level Makefile.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 4c21b8fd8f14 ("MIPS: seccomp: Handle indirect system calls (o32)")
fixed indirect system calls on O32 but it also introduced a bug for MIPS64
where it erroneously modified the v0 (syscall) register with the assumption
that the sycall offset hasn't been taken into consideration. This breaks
seccomp on MIPS64 n64 and n32 ABIs. We fix this by replacing the addition
with a move instruction.
Fixes: 4c21b8fd8f14 ("MIPS: seccomp: Handle indirect system calls (o32)")
Cc: <stable@vger.kernel.org> # 3.15+
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10951/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Since fc_fcp_cleanup_cmd() can sleep this function must not
be called while holding a spinlock. This patch avoids that
fc_fcp_cleanup_each_cmd() triggers the following bug:
BUG: scheduling while atomic: sg_reset/1512/0x00000202
1 lock held by sg_reset/1512:
#0: (&(&fsp->scsi_pkt_lock)->rlock){+.-...}, at: [<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
Preemption disabled at:[<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
Call Trace:
[<ffffffff816c612c>] dump_stack+0x4f/0x7b
[<ffffffff810828bc>] __schedule_bug+0x6c/0xd0
[<ffffffff816c87aa>] __schedule+0x71a/0xa10
[<ffffffff816c8ad2>] schedule+0x32/0x80
[<ffffffffc0217eac>] fc_seq_set_resp+0xac/0x100 [libfc]
[<ffffffffc0218b11>] fc_exch_done+0x41/0x60 [libfc]
[<ffffffffc0225cff>] fc_fcp_cleanup_each_cmd.isra.21+0xcf/0x150 [libfc]
[<ffffffffc0225f43>] fc_eh_device_reset+0x1c3/0x270 [libfc]
[<ffffffff814a2cc9>] scsi_try_bus_device_reset+0x29/0x60
[<ffffffff814a3908>] scsi_ioctl_reset+0x258/0x2d0
[<ffffffff814a2650>] scsi_ioctl+0x150/0x440
[<ffffffff814b3a9d>] sd_ioctl+0xad/0x120
[<ffffffff8132f266>] blkdev_ioctl+0x1b6/0x810
[<ffffffff811da608>] block_ioctl+0x38/0x40
[<ffffffff811b4e08>] do_vfs_ioctl+0x2f8/0x530
[<ffffffff811b50c1>] SyS_ioctl+0x81/0xa0
[<ffffffff816cf8b2>] system_call_fastpath+0x16/0x7a
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Pull perf fixes from Ingo Molnar:
"Tooling fixes: a 'perf record' deadlock fix plus debuggability fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf top: Show backtrace when handling a SIGSEGV on --stdio mode
perf tools: Fix buildid processing
perf tools: Make fork event processing more resilient
perf tools: Avoid deadlock when map_groups are broken
During later stages of math-emu bootup the following crash triggers:
math_emulate: 0060:c100d0a8
Kernel panic - not syncing: Math emulation needed in kernel
CPU: 0 PID: 1511 Comm: login Not tainted 4.2.0-rc7+ #1012
[...]
Call Trace:
[<c181d50d>] dump_stack+0x41/0x52
[<c181c918>] panic+0x77/0x189
[<c1003530>] ? math_error+0x140/0x140
[<c164c2d7>] math_emulate+0xba7/0xbd0
[<c100d0a8>] ? fpu__copy+0x138/0x1c0
[<c1109c3c>] ? __alloc_pages_nodemask+0x12c/0x870
[<c136ac20>] ? proc_clear_tty+0x40/0x70
[<c136ac6e>] ? session_clear_tty+0x1e/0x30
[<c1003530>] ? math_error+0x140/0x140
[<c1003575>] do_device_not_available+0x45/0x70
[<c100d0a8>] ? fpu__copy+0x138/0x1c0
[<c18258e6>] error_code+0x5a/0x60
[<c1003530>] ? math_error+0x140/0x140
[<c100d0a8>] ? fpu__copy+0x138/0x1c0
[<c100c205>] arch_dup_task_struct+0x25/0x30
[<c1048cea>] copy_process.part.51+0xea/0x1480
[<c115a8e5>] ? dput+0x175/0x200
[<c136af70>] ? no_tty+0x30/0x30
[<c1157242>] ? do_vfs_ioctl+0x322/0x540
[<c104a21a>] _do_fork+0xca/0x340
[<c1057b06>] ? SyS_rt_sigaction+0x66/0x90
[<c104a557>] SyS_clone+0x27/0x30
[<c1824a80>] sysenter_do_call+0x12/0x12
The reason is the incorrect assumption in fpu_copy(), that FNSAVE
can be executed from math-emu kernels as well.
Don't try to copy the registers, the soft state will be copied
by fork anyway, so the child task inherits the parent task's
soft math state.
With this fix applied math-emu kernels boot up fine on modern
hardware and the 'no387 nofxsr' boot options.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The mmap semaphore should not be taken when page faults are disabled.
Since pagefault_disable() no longer disables preemption, we now need
to use faulthandler_disabled() in place of in_atomic().
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Merge fixes from Andrew Morton:
"11 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
Update maintainers for DRM STI driver
mm: cma: mark cma_bitmap_maxno() inline in header
zram: fix pool name truncation
memory-hotplug: fix wrong edge when hot add a new node
.mailmap: Andrey Ryabinin has moved
ipc/sem.c: update/correct memory barriers
mm/hwpoison: fix panic due to split huge zero page
ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()
ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits
mm/hwpoison: fix fail isolate hugetlbfs page w/ refcount held
mm/hwpoison: fix page refcount of unknown non LRU page
When kvm_set_msr_common() handles a guest's write to
MSR_IA32_TSC_ADJUST, it will calcuate an adjustment based on the data
written by guest and then use it to adjust TSC offset by calling a
call-back adjust_tsc_offset(). The 3rd parameter of adjust_tsc_offset()
indicates whether the adjustment is in host TSC cycles or in guest TSC
cycles. If SVM TSC scaling is enabled, adjust_tsc_offset()
[i.e. svm_adjust_tsc_offset()] will first scale the adjustment;
otherwise, it will just use the unscaled one. As the MSR write here
comes from the guest, the adjustment is in guest TSC cycles. However,
the current kvm_set_msr_common() uses it as a value in host TSC
cycles (by using true as the 3rd parameter of adjust_tsc_offset()),
which can result in an incorrect adjustment of TSC offset if SVM TSC
scaling is enabled. This patch fixes this problem.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Cc: stable@vger.linux.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Due to patch "libfc: Do not invoke the response handler after
fc_exch_done()" (commit ID 7030fd62) the lport_recv() call
in fc_exch_recv_req() is passed a dangling pointer. Avoid this
by moving the fc_frame_free() call from fc_invoke_resp() to its
callers. This patch fixes the following crash:
general protection fault: 0000 [#3] PREEMPT SMP
RIP: fc_lport_recv_req+0x72/0x280 [libfc]
Call Trace:
fc_exch_recv+0x642/0xde0 [libfc]
fcoe_percpu_receive_thread+0x46a/0x5ed [fcoe]
kthread+0x10a/0x120
ret_from_fork+0x42/0x70
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Pull irq fixes from Thomas Gleixner:
"A series of small fixlets for a regression visible on OMAP devices
caused by the conversion of the OMAP interrupt chips to hierarchical
interrupt domains. Mostly one liners on the driver side plus a small
helper function in the core to avoid open coded mess in the drivers"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/crossbar: Restore set_wake functionality
irqchip/crossbar: Restore the mask on suspend behaviour
ARM: OMAP: wakeupgen: Restore the irq_set_type() mechanism
irqchip/crossbar: Restore the irq_set_type() mechanism
genirq: Introduce irq_chip_set_type_parent() helper
genirq: Don't return ENOSYS in irq_chip_retrigger_hierarchy
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix buildid processing done at the end of a 'perf record' session, a
problem that happened in workloads involving lots of small short-lived
processes. That code was not asking the perf_session layer to order
the events.
Make the code more robust to handle some of the problems with such
out-of-order events and fix 'perf record' to ask for ordered events
on systems where we have perf_event_attr.sample_id_all. (Adrian Hunter)
- Show backtrace when handling a SIGSEGV in 'perf top --stdio' (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
On a math-emu bootup the following crash occurs:
Initializing CPU#0
------------[ cut here ]------------
kernel BUG at arch/x86/kernel/traps.c:779!
invalid opcode: 0000 [#1] SMP
[...]
EIP is at do_device_not_available+0xe/0x70
[...]
Call Trace:
[<c18238e6>] error_code+0x5a/0x60
[<c1002bd0>] ? math_error+0x140/0x140
[<c100bbd9>] ? fpu__init_cpu+0x59/0xa0
[<c1012322>] cpu_init+0x202/0x330
[<c104509f>] ? __native_set_fixmap+0x1f/0x30
[<c1b56ab0>] trap_init+0x305/0x346
[<c1b548af>] start_kernel+0x1a5/0x35d
[<c1b542b4>] i386_start_kernel+0x82/0x86
The reason is that in the following commit:
b1276c48e91b ("x86/fpu: Initialize fpregs in fpu__init_cpu_generic()")
I failed to consider math-emu's limitation that it cannot execute the
FNINIT instruction in kernel mode.
The long term fix might be to allow math-emu to execute (certain) kernel
mode FPU instructions, but for now apply the safe (albeit somewhat ugly)
fix: initialize the emulation state explicitly without trapping out to
the FPU emulator.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Since 906c55579a63 ("timekeeping: Copy the shadow-timekeeper over the
real timekeeper last") it has become possible on ARM to:
- Obtain a CLOCK_MONOTONIC_COARSE or CLOCK_REALTIME_COARSE timestamp
via syscall.
- Subsequently obtain a timestamp for the same clock ID via VDSO which
predates the first timestamp (by one jiffy).
This is because ARM's update_vsyscall is deriving the coarse time
using the __current_kernel_time interface, when it should really be
using the timekeeper object provided to it by the timekeeping core.
It happened to work before only because __current_kernel_time would
access the same timekeeper object which had been passed to
update_vsyscall. This is no longer the case.
Cc: stable@vger.kernel.org
Fixes: 906c55579a63 ("timekeeping: Copy the shadow-timekeeper over the real timekeeper last")
Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add Vincent Abriou and myself as maintainers.
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Cc: Vincent Abriou <vincent.abriou@st.com>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The recent BlackHat 2015 presentation "The Memory Sinkhole"
mentions that the IDT limit is zeroed on entry to SMM.
This is not documented, and must have changed some time after 2010
(see http://www.ssi.gouv.fr/uploads/IMG/pdf/IT_Defense_2010_final.pdf).
KVM was not doing it, but the fix is easy.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull input subsystem fixes from Dmitry Torokhov:
"Just small ALPS and Elan touchpads, and other driver fixups"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elantech - add special check for fw_version 0x470f01 touchpad
Input: twl4030-vibra - fix ERROR: Bad of_node_put() warning
Input: alps - only Dell laptops have separate button bits for v2 dualpoint sticks
Input: axp20x-pek - add module alias
Input: turbografx - fix potential out of bound access
In case of hw iscsi offload, an host can have N-number of active
connections. There can be IO's running on some connections which
make host->host_busy always TRUE. Now if logout from a connection
is tried then the code gets into an infinite loop as host->host_busy
is always TRUE.
iscsi_conn_teardown(....)
{
.........
/*
* Block until all in-progress commands for this connection
* time out or fail.
*/
for (;;) {
spin_lock_irqsave(session->host->host_lock, flags);
if (!atomic_read(&session->host->host_busy)) { /* OK for ERL == 0 */
spin_unlock_irqrestore(session->host->host_lock, flags);
break;
}
spin_unlock_irqrestore(session->host->host_lock, flags);
msleep_interruptible(500);
iscsi_conn_printk(KERN_INFO, conn, "iscsi conn_destroy(): "
"host_busy %d host_failed %d\n",
atomic_read(&session->host->host_busy),
session->host->host_failed);
................
...............
}
}
This is not an issue with software-iscsi/iser as each cxn is a separate
host.
Fix:
Acquiring eh_mutex in iscsi_conn_teardown() before setting
session->state = ISCSI_STATE_TERMINATE.
Signed-off-by: John Soni Jose <sony.john@avagotech.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Chris Leech <cleech@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Pull timer fixes from Thomas Gleixner:
"Two minimalistic fixes for 4.2 regressions:
- Eric fixed a thinko in the timer_list base switching code caused by
the overhaul of the timer wheel. It can cause a cpu to see the
wrong base for a timer while we move the timer around.
- Guenter fixed a regression for IMX if booted w/o device tree, where
the timer interrupt is not initialized and therefor the machine
fails to boot"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/imx: Fix boot with non-DT systems
timer: Write timer->flags atomically
The TI crossbar irqchip doesn't provides any facility to configure the
wakeup sources, but the conversion to hierarchical irqdomains set the
irq_set_wake callback to irq_chip_set_wake_parent. The parent chip
(OMAP wakeupgen) has no irq_set_wake function either so the call will
fail with -ENOSYS. As a result the irq_set_wake() call in the resume
path will trigger an 'Unbalanced wake disable' warning.
Before the conversion the GIC irqchip was the top level irqchip and
correctly flagged with IRQCHIP_SKIP_SET_WAKE.
Restore the correct behaviour by removing the irq_set_type callback
from the crossbar irqchip and set the IRQCHIP_SKIP_SET_WAKE flag which
lets the irq_set_irq_wake() call from the driver succeed.
[ tglx: Massaged changelog ]
Fixes: 783d31863fb8 ('irqchip: crossbar: Convert dra7 crossbar...')
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: <linux@arm.linux.org.uk>
Cc: <nsekhar@ti.com>
Cc: <jason@lakedaemon.net>
Cc: <balbi@ti.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: <tony@atomide.com>
Cc: <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org # 4.1
Link: http://lkml.kernel.org/r/1439554830-19502-7-git-send-email-grygorii.strashko@ti.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull dmaengine fix from Vinod Koul:
"We recently found issue with dma_request_slave_channel() API causing
privatecnt value to go bad. This is fixed by balancing the privatecnt"
* tag 'dmaengine-fix-4.2-rc8' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: fix balance of privatecnt inc/dec operations
It was just freezing instead of informing about the SEGV, fix it and
also print a backtrace, just like in the TUI mode and in 'perf trace'.
Tested by provoking a NULL deref when pressing 'z':
0.31% libc-2.20.so [.] malloc_consolidate
0.31% ld-2.20.so [.] _dl_relocate_object
0.28% cc1 [.] ht_lookup
0.28% cc1 [.] ira_init_register_move_cost
perf: Segmentation fault
Obtained 7 stack frames.
perf(dump_stack+0x32) [0x4d69f2]
perf(sighandler_dump_stack+0x29) [0x4d6a89]
/lib64/libc.so.6(+0x34960) [0x7f5064333960]
perf() [0x438790]
/lib64/libpthread.so.0(+0x752a) [0x7f50663dd52a]
/lib64/libc.so.6(clone+0x6d) [0x7f50643ff22d]
#
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-pewrpzqd29rgmhu2wkk7fhww@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit b253149b843f ("sched/idle/x86: Restore mwait_idle() to fix boot
hangs, to improve power savings and to improve performance") restores
mwait_idle(), but the trace_cpu_idle related calls are missing. This
causes powertop on my old desktop powered by Intel Core2 E6550 to
report zero wakeups and zero events.
Add them back to restore the proper behaviour.
Fixes: b253149b843f ("sched/idle/x86: Restore mwait_idle() to ...")
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Cc: <len.brown@intel.com>
Cc: stable@vger.kernel.org # 4.1
Link: http://lkml.kernel.org/r/1440046479-4262-1-git-send-email-jszhang@marvell.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
ret_fast_syscall runs when user space makes a syscall. However it
needs to be marked as such so the ELF information is correct. Before
it was:
101: 8000f300 0 NOTYPE LOCAL DEFAULT 2 ret_fast_syscall
But with this change it correctly shows as:
101: 8000f300 96 FUNC LOCAL DEFAULT 2 ret_fast_syscall
I see this function when using perf to unwind call stacks from kernel
space to user space. Without this change I would need to add some
special case logic when using the vmlinux ELF information.
Signed-off-by: Drew Richardson <drew.richardson@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Clocks 0 to 31 are on CKENA, and not CKENB. The clock register names
were inadequately inverted. As a consequence, all clock operations were
happening on CKENB, because almost all but 2 clocks are on CKENA.
As the clocks were activated by the bootloader in the former tests, it
escaped the testing that the wrong clock gate was manipulated. The error
was revealed by changing the pxa3xx-nand driver to a module, where upon
unloading, the wrong clock was disabled in CKENB.
Fixes: 9bbb8a338fb2 ("clk: pxa: add pxa3xx clock driver")
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
cma_bitmap_maxno() was marked as static and not static inline, which can
cause warnings about this function not being used if this file is included
in a file that does not call that function, and violates the conventions
used elsewhere. The two options are to move the function implementation
back to mm/cma.c or make it inline here, and it's simple enough for the
latter to make sense.
Signed-off-by: Gregory Fong <gregory.0xf0@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The patch was munged on commit to re-order these tests resulting in
excessive warnings when trying to do device assignment. Return to
original ordering: https://lkml.org/lkml/2015/7/15/769
Fixes: 3e5d2fdceda1 ("KVM: MTRR: simplify kvm_mtrr_get_guest_memory_type")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull MIPS fixes from Ralf Baechle:
"Another round of MIPS fixes for 4.2. No area does particularly stand
out but we have a two unpleasant ones:
- Kernel ptes are marked with a global bit which allows the kernel to
share kernel TLB entries between all processes. For this to work
both entries of an adjacent even/odd pte pair need to have the
global bit set. There has been a subtle race in setting the other
entry's global bit since ~ 2000 but it take particularly
pathological workloads that essentially do mostly vmalloc/vfree to
trigger this.
This pull request fixes the 64-bit case but leaves the case of 32
bit CPUs with 64 bit ptes unsolved for now. The unfixed cases
affect hardware that is not available in the field yet.
- Instruction emulation requires loading instructions from user space
but the current fast but simplistic approach will fail on pages
that are PROT_EXEC but !PROT_READ. For this reason we temporarily
do not permit this permission and will map pages with PROT_EXEC |
PROT_READ.
The remainder of this pull request is more or less across the field
and the short log explains them well"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Make set_pte() SMP safe.
MIPS: Replace add and sub instructions in relocate_kernel.S with addiu
MIPS: Flush RPS on kernel entry with EVA
Revert "MIPS: BCM63xx: Provide a plat_post_dma_flush hook"
MIPS: BMIPS: Delete unused Kconfig symbol
MIPS: Export get_c0_perfcount_int()
MIPS: show_stack: Fix stack trace with EVA
MIPS: do_mcheck: Fix kernel code dump with EVA
MIPS: SMP: Don't increment irq_count multiple times for call function IPIs
MIPS: Partially disable RIXI support.
MIPS: Handle page faults of executable but unreadable pages correctly.
MIPS: Malta: Don't reinitialise RTC
MIPS: unaligned: Fix build error on big endian R6 kernels
MIPS: Fix sched_getaffinity with MT FPAFF enabled
MIPS: Fix build with CONFIG_OF=y for non OF-enabled targets
CPUFREQ: Loongson2: Fix broken build due to incorrect include.
Fix a memory leak with scsi-mq triggered by commands with large data
transfer length.
__sg_alloc_table() sets both table->nents and table->orig_nents to the
same value. When the scatterlist is DMA-mapped, table->nents is
overwritten with the (possibly smaller) size of the DMA-mapped
scatterlist, while table->orig_nents retains the original size of the
allocated scatterlist. scsi_free_sgtable() should therefore check
orig_nents instead of nents, and all code that initializes sdb->table
without calling __sg_alloc_table() should set both nents and orig_nents.
Fixes: d285203cf647 ("scsi: add support for a blk-mq based I/O path.")
Cc: <stable@vger.kernel.org> # 3.17+
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
While the idea behind get_maintainer seems highly useful it's
unfortunately way to trigger happy to grab people that once had a few
commits to files. For someone like me who does a lot of tree-wide API
work that leads to an incredible amount of Cc spam.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 6dd747825b20 ("ARM: imx: move timer resources into a structure")
moved initialization parameters into a data structure, but neglected to set
the irq field in that data structure for non-DT boots. This causes the system
to hang if a non-DT boot is attempted.
Fixes: 6dd747825b20 ("ARM: imx: move timer resources into a structure")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: http://lkml.kernel.org/r/1440066441-13930-1-git-send-email-linux@roeck-us.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The ARM GIC requires that all interrupts which are not used as a
wakeup source have to be masked during suspend.
The conversion of the crossbar irqchip to hierarchical irq domains
failed to mark the crossbar irqchip with the IRQCHIP_MASK_ON_SUSPEND
flag and therefor broke the suspend requirement of the GIC.
Before the conversion the flags were visible because the GIC was the
top level irqchip. After the conversion the crossbar irqchip is the
top level irq chip whose flags are evaluated in suspend_device_irq().
As the flag is not set the masking of the non-wakeup irqs is not
invoked which breaks suspend.
Add the IRQCHIP_MASK_ON_SUSPEND flag to the crossbar irqchip, so the
GIC interrupts get masked properly.
[ tglx: Massaged changelog ]
Fixes: 783d31863fb8 ('irqchip: crossbar: Convert dra7 crossbar...')
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: <linux@arm.linux.org.uk>
Cc: <nsekhar@ti.com>
Cc: <jason@lakedaemon.net>
Cc: <balbi@ti.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: <tony@atomide.com>
Cc: <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org # 4.1
Link: http://lkml.kernel.org/r/1439554830-19502-6-git-send-email-grygorii.strashko@ti.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull drm fixes from Dave Airlie:
"These came in late last week, I wanted to look over the mst one before
forwarding, but it seems good.
Just three i915 and one MST fix"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/i915: Commit planes on each crtc separately.
drm/i915: calculate primary visibility changes instead of calling from set_config
drm/i915: Only dither on 6bpc panels
drm/dp/mst: Remove port after removing connector.
This patch increments privatecnt value and set DMA_PRIVATE in device
caps in dma_request_slave_channel() function. This is needed to keep
privatecnt increment/decrement balance.
As function dma_release_channel() decrements privatecnt counter, we need
to increment it when channel is requested. Otherwise privatecnt drops
into negatives after few dma_release_channel() calls.
Reported-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
After recording, 'perf record' post-processes the data to determine
which buildids are needed.
That processing must process the data in time order, if possible,
because otherwise dependent events, like forks and mmaps, will not make
sense.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1439994561-27436-4-git-send-email-adrian.hunter@intel.com
[ Moved the sample_id_add to after trying to open the events, use pr_warning ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since the commit "b2c3e38a5471 ARM: redo TTBR setup code for LPAE",
the setup code had been reworked. As a result the secondary CPUs
failed to come online in Big Endian.
As explained by Russell, the new code expected the value in r4/r5 to
be the least significant 32bits in r4 and the most significant 32bits
in r5. However, in the secondary code, we load this using ldrd, which
on BE reverses that.
This patch swap r4/r5 after the ldrd. It is done using the xor
instructions in order to not use a temporary register.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull perf fixes from Ingo Molnar:
"Misc fixes: PMU driver corner cases, tooling fixes, and an 'AUX'
(Intel PT) race related core fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/cqm: Do not access cpu_data() from CPU_UP_PREPARE handler
perf/x86/intel: Fix memory leak on hot-plug allocation fail
perf: Fix PERF_EVENT_IOC_PERIOD migration race
perf: Fix double-free of the AUX buffer
perf: Fix fasync handling on inherited events
perf tools: Fix test build error when bindir contains double slash
perf stat: Fix transaction lenght metrics
perf: Fix running time accounting
Currently the sh_cmt clocksource timer is disabled or enabled
unconditionally on clocksource suspend resp. resume, even if a
better clocksource is present (e.g. arch_sys_counter) and the
sh_cmt clocksource is not enabled.
As sh_cmt is a syscore device when its timer is enabled, this
may lead to a genpd.prepared_count imbalance in the presence of
PM Domains, which may cause a lock-up during reboot after s2ram.
During suspend:
- pm_genpd_prepare() is called for all non-syscore devices (incl.
sh_cmt), increasing genpd.prepared_count for each device,
- clocksource.suspend() is called for all clocksource devices,
- sh_cmt_clocksource_suspend() calls sh_cmt_stop(), which is a no-op
as the clocksource was not enabled.
During resume:
- clocksource.resume() is called for all clocksource devices,
- sh_cmt_clocksource_resume() calls sh_cmt_start(), which enables the
clocksource timer, and turns sh_cmt into a syscore device,
- pm_genpd_complete() is called for all non-syscore devices (excl.
sh_cmt now!), decreasing genpd.prepared_count for each device but
sh_cmt.
Now genpd.prepared_count of the PM Domain containing sh_cmt is
still 1 instead of zero. On subsequent suspend/resume cycles,
sh_cmt is still a syscore device, hence it's skipped for
pm_genpd_{prepare,complete}(), keeping the imbalance of
genpd.prepared_count at 1.
During reboot:
- platform_drv_shutdown() is called for any platform device that has
a driver with a .shutdown() method (only rcar-dmac on R-Car Gen2),
- platform_drv_shutdown() calls dev_pm_domain_detach(), which
calls genpd_dev_pm_detach(),
- genpd_dev_pm_detach() keeps calling pm_genpd_remove_device() until
it doesn't return -EAGAIN[*],
- If the device is part of the same PM Domain as sh_cmt,
pm_genpd_remove_device() always fails with -EAGAIN due to
genpd.prepared_count > 0.
- Infinite loop in genpd_dev_pm_detach()[*].
[*] Commit 93af5e9354432828 ("PM / Domains: Avoid infinite loops in
attach/detach code") already limited the number of loop iterations,
avoiding the lock-up.
To fix this, only disable or enable the clocksource timer on
clocksource suspend resp. resume if the clocksource was enabled.
This was tested on r8a7791/koelsch with the CPG Clock Domain:
- using arch_sys_counter as the clocksource, which is the default, and
which showed the problem,
- using sh_cmt as a clocksource ("echo ffca0000.timer > \
/sys/devices/system/clocksource/clocksource0/current_clocksource"),
which behaves the same as before.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1438875126-12596-2-git-send-email-daniel.lezcano@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
zram_meta_alloc() constructs a pool name for zs_create_pool() call as
snprintf(pool_name, sizeof(pool_name), "zram%d", device_id);
However, it defines pool name buffer to be only 8 bytes long (minus
trailing zero), which means that we can have only 1000 pool names: zram0
-- zram999.
With CONFIG_ZSMALLOC_STAT enabled an attempt to create a device zram1000
can fail if device zram100 already exists, because snprintf() will
truncate new pool name to zram100 and pass it debugfs_create_dir(),
causing:
debugfs dir <zram100> creation failed
zram: Error creating memory pool
... and so on.
Fix it by passing zram->disk->disk_name to zram_meta_alloc() instead of
divice_id. We construct zram%d name earlier and keep it as a ->disk_name,
no need to snprintf() it again.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KVM: s390: bugfix for kvm/master (4.2)
Here is a bugfix for a regression that was introduced after 4.1
with the commit commit 785dbef407d8 ("KVM: s390: optimize round
trip time in request handling"). After lots of cpu hotplugs in the
guest (online/offline) sometimes a guest CPU did loop within host
KVM code. Reason was that PROG_REQUEST was set in the sie control
block, but no request was pending. This made commit 785dbef407d8
the suspect and changing that area to always reset PROG_REQUEST
did indeed fix the problem.
Special thanks to David Hildenbrand, who helped understanding the
exact sequence that led to the problem.
Pull btrfs fix from Chris Mason:
"We have a btrfs quota regression fix.
I merged this one on Thursday and have run it through tests against
current master.
Normally I wouldn't have sent this while you were finalizing rc6, but
I'm feeding mosquitoes in the adirondacks next week, so I wanted to
get this one out before leaving. I'll leave longer tests running and
check on things during the week, but I don't expect any problems"
* 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: qgroup: Fix a regression in qgroup reserved space.
On MIPS the GLOBAL bit of the PTE must have the same value in any
aligned pair of PTEs. These pairs of PTEs are referred to as
"buddies". In a SMP system is is possible for two CPUs to be calling
set_pte() on adjacent PTEs at the same time. There is a race between
setting the PTE and a different CPU setting the GLOBAL bit in its
buddy PTE.
This race can be observed when multiple CPUs are executing
vmap()/vfree() at the same time.
Make setting the buddy PTE's GLOBAL bit an atomic operation to close
the race condition.
The case of CONFIG_64BIT_PHYS_ADDR && CONFIG_CPU_MIPS32 is *not*
handled.
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: <stable@vger.kernel.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10835/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>