commits
Pull MIPS fixes from Ralf Baechle:
"Another round of MIPS fixes for 4.5:
- Fix JZ4780 build with DEBUG_ZBOOT and MACH_JZ4780
- Fix build with DEBUG_ZBOOT and MACH_JZ4780
- Fix issue with uninitialised temp_foreign_map
- Fix awk regex compile failure with certain versions of awk. At
this time, the sole user, ld-ifversion, is only used on MIPS"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: smp.c: Fix uninitialised temp_foreign_map
MIPS: Fix build error when SMP is used without GIC
ld-version: Fix awk regex compile failure
MIPS: Fix build with DEBUG_ZBOOT and MACH_JZ4780
Pull block merge fix from Jens Axboe.
This fixes the block segment counting bug and resulting sg overrun
reported by Kent Overstreet, introduced with the last block pull.
* 'for-linus' of git://git.kernel.dk/linux-block:
block: don't optimize for non-cloned bio in bio_get_last_bvec()
When calculate_cpu_foreign_map() recalculates the cpu_foreign_map
cpumask it uses the local variable temp_foreign_map without initialising
it to zero. Since the calculation only ever sets bits in this cpumask
any existing bits at that memory location will remain set and find their
way into cpu_foreign_map too. This could potentially lead to cache
operations suboptimally doing smp calls to multiple VPEs in the same
core, even though the VPEs share primary caches.
Therefore initialise temp_foreign_map using cpumask_clear() before use.
Fixes: cccf34e9411c ("MIPS: c-r4k: Fix cache flushing for MT cores")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12759/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Pull x86 fixes from Ingo Molnar:
"This fixes 3 FPU handling related bugs, an EFI boot crash and a
runtime warning.
The EFI fix arrived late but I didn't want to delay it to after v4.5
because the effects are pretty bad for the systems that are affected
by it"
[ Actually, I don't think the EFI fix really matters yet, because we
haven't switched to the separate EFI page tables in mainline yet ]
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/efi: Fix boot crash by always mapping boot service regions into new EFI page tables
x86/fpu: Fix eager-FPU handling on legacy FPU machines
x86/delay: Avoid preemptible context checks in delay_mwaitx()
x86/fpu: Revert ("x86/fpu: Disable AVX when eagerfpu is off")
x86/fpu: Fix 'no387' regression
For !BIO_CLONED bio, we can use .bi_vcnt safely, but it
doesn't mean we can just simply return .bi_io_vec[.bi_vcnt - 1]
because the start postion may have been moved in the middle of
the bvec, such as splitting in the middle of bvec.
Fixes: 7bcd79ac50d9(block: bio: introduce helpers to get the 1st and last bvec)
Cc: stable@vger.kernel.org
Reported-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The MIPS_GIC_IPI should only be selected when MIPS_GIC is also
selected, otherwise it results in a compile error. smp-gic.c uses some
functions from include/linux/irqchip/mips-gic.h like
plat_ipi_call_int_xlate() which are only added to the header file when
MIPS_GIC is set. The Lantiq SoC does not use the GIC, but supports SMP.
The calls top the functions from smp-gic.c are already protected by
some #ifdefs
The first part of this was introduced in commit 72e20142b2bf ("MIPS:
Move GIC IPI functions out of smp-cmp.c")
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: stable@vger.kernel.org # v3.15+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12774/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Pull SCSI target bug fix from Nicholas Bellinger:
"Here is an outstanding target-core bug-fix for v4.5 code."
This patch addresses a recent Task Management (TMR) regression related
to larger set of multi-port LUN_RESET bug-fixes in v4.5-rc5.
It drops a left-over target_put_sess_cmd() of se_cmd->cmd_kref within
ABORT_TASK failure path, once a se_cmd descriptor has already
completed posting response to fabric driver logic, and must be skipped
during normal ABORT_TASK se_cmd->tag lookup"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target: Drop incorrect ABORT_TASK put for completed commands
Some machines have EFI regions in page zero (physical address
0x00000000) and historically that region has been added to the e820
map via trim_bios_range(), and ultimately mapped into the kernel page
tables. It was not mapped via efi_map_regions() as one would expect.
Alexis reports that with the new separate EFI page tables some boot
services regions, such as page zero, are not mapped. This triggers an
oops during the SetVirtualAddressMap() runtime call.
For the EFI boot services quirk on x86 we need to memblock_reserve()
boot services regions until after SetVirtualAddressMap(). Doing that
while respecting the ownership of regions that may have already been
reserved by the kernel was the motivation behind this commit:
7d68dc3f1003 ("x86, efi: Do not reserve boot services regions within reserved areas")
That patch was merged at a time when the EFI runtime virtual mappings
were inserted into the kernel page tables as described above, and the
trick of setting ->numpages (and hence the region size) to zero to
track regions that should not be freed in efi_free_boot_services()
meant that we never mapped those regions in efi_map_regions(). Instead
we were relying solely on the existing kernel mappings.
Now that we have separate page tables we need to make sure the EFI
boot services regions are mapped correctly, even if someone else has
already called memblock_reserve(). Instead of stashing a tag in
->numpages, set the EFI_MEMORY_RUNTIME bit of ->attribute. Since it
generally makes no sense to mark a boot services region as required at
runtime, it's pretty much guaranteed the firmware will not have
already set this bit.
For the record, the specific circumstances under which Alexis
triggered this bug was that an EFI runtime driver on his machine was
responding to the EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE event during
SetVirtualAddressMap().
The event handler for this driver looks like this,
sub rsp,0x28
lea rdx,[rip+0x2445] # 0xaa948720
mov ecx,0x4
call func_aa9447c0 ; call to ConvertPointer(4, & 0xaa948720)
mov r11,QWORD PTR [rip+0x2434] # 0xaa948720
xor eax,eax
mov BYTE PTR [r11+0x1],0x1
add rsp,0x28
ret
Which is pretty typical code for an EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE
handler. The "mov r11, QWORD PTR [rip+0x2424]" was the faulting
instruction because ConvertPointer() was being called to convert the
address 0x0000000000000000, which when converted is left unchanged and
remains 0x0000000000000000.
The output of the oops trace gave the impression of a standard NULL
pointer dereference bug, but because we're accessing physical
addresses during ConvertPointer(), it wasn't. EFI boot services code
is stored at that address on Alexis' machine.
Reported-by: Alexis Murzeau <amurzeau@gmail.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raphael Hertzog <hertzog@debian.org>
Cc: Roger Shimizu <rogershimizu@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1457695163-29632-2-git-send-email-matt@codeblueprint.co.uk
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=815125
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull KVM fixes from Paolo Bonzini:
"A few simple fixes for ARM, x86, PPC and generic code.
The x86 MMU fix is a bit larger because the surrounding code needed a
cleanup, but nothing worrisome"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: MMU: fix reserved bit check for ept=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0
KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo
kvm: cap halt polling at exactly halt_poll_ns
KVM: s390: correct fprs on SIGP (STOP AND) STORE STATUS
KVM: VMX: disable PEBS before a guest entry
KVM: PPC: Book3S HV: Sanitize special-purpose register values on guest exit
The ld-version.sh script fails on some versions of awk with the
following error, resulting in build failures for MIPS:
awk: scripts/ld-version.sh: line 4: regular expression compile failed (missing '(')
This is due to the regular expression ".*)", meant to strip off the
beginning of the ld version string up to the close bracket, however
brackets have a meaning in regular expressions, so lets escape it so
that awk doesn't expect a corresponding open bracket.
Fixes: ccbef1674a15 ("Kbuild, lto: add ld-version and ld-ifversion ...")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Cc: Michal Marek <mmarek@suse.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kbuild@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # 4.4.x-
Patchwork: https://patchwork.linux-mips.org/patch/12838/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Pull MTD fixes from Brian Norris:
"Late MTD fix for v4.5:
- A simple error code handling fix for the NAND ECC test; this was a
regression in v4.5-rc1
- A MAINTAINERS update, which might as well go in ASAP"
* tag 'for-linus-20160311' of git://git.infradead.org/linux-mtd:
MAINTAINERS: add a maintainer for the NAND subsystem
mtd: nand: tests: fix regression introduced in mtd_nandectest
This patch fixes a recent ABORT_TASK regression associated
with commit febe562c, where a left-over target_put_sess_cmd()
would still be called when __target_check_io_state() detected
a command has already been completed, and explicit ABORT must
be avoided.
Note commit febe562c dropped the local kref_get_unless_zero()
check in core_tmr_abort_task(), but did not drop this extra
corresponding target_put_sess_cmd() in the failure path.
So go ahead and drop this now bogus target_put_sess_cmd(),
and avoid this potential use-after-free.
Reported-by: Dan Lane <dracodan@gmail.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Andy Grover <agrover@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
i486 derived cores like Intel Quark support only the very old,
legacy x87 FPU (FSAVE/FRSTOR, CPUID bit FXSR is not set), and
our FPU code wasn't handling the saving and restoring there
properly in the 'eagerfpu' case.
So after we made eagerfpu the default for all CPU types:
58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs
these old FPU designs broke. First, Andy Shevchenko reported a splat:
WARNING: CPU: 0 PID: 823 at arch/x86/include/asm/fpu/internal.h:163 fpu__clear+0x8c/0x160
which was us trying to execute FXRSTOR on those machines even though
they don't support it.
After taking care of that, Bryan O'Donoghue reported that a simple FPU
test still failed because we weren't initializing the FPU state properly
on those machines.
Take care of all that.
Reported-and-tested-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Reported-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng <yu-cheng.yu@intel.com>
Link: http://lkml.kernel.org/r/20160311113206.GD4312@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull arm64 fixes from Will Deacon:
"I thought we were done for 4.5, but then the 64k-page chaps came
crawling out of the woodwork. *sigh*
The vmemmap fix I sent for -rc7 caused a regression with 64k pages and
sparsemem and at some point during the release cycle the new hugetlb
code using contiguous ptes started failing the libhugetlbfs tests with
64k pages enabled.
So here are a couple of patches that fix the vmemmap alignment and
disable the new hugetlb page sizes whilst a proper fix is being
developed:
- Temporarily disable huge pages built using contiguous ptes
- Ensure vmemmap region is sufficiently aligned for sparsemem
sections"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: hugetlb: partial revert of 66b3923a1a0f
arm64: account for sparsemem section alignment when choosing vmemmap offset
KVM has special logic to handle pages with pte.u=1 and pte.w=0 when
CR0.WP=1. These pages' SPTEs flip continuously between two states:
U=1/W=0 (user and supervisor reads allowed, supervisor writes not allowed)
and U=0/W=1 (supervisor reads and writes allowed, user writes not allowed).
When SMEP is in effect, however, U=0 will enable kernel execution of
this page. To avoid this, KVM also sets NX=1 in the shadow PTE together
with U=0, making the two states U=1/W=0/NX=gpte.NX and U=0/W=1/NX=1.
When guest EFER has the NX bit cleared, the reserved bit check thinks
that the latter state is invalid; teach it that the smep_andnot_wp case
will also use the NX bit of SPTEs.
Cc: stable@vger.kernel.org
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.inel.com>
Fixes: c258b62b264fdc469b6d3610a907708068145e3b
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ingenic SoC declares ZBOOT support, but debug definitions are missing
for MACH_JZ4780 resulting in a build failure when DEBUG_ZBOOT is set.
The UART addresses are same as with JZ4740, so fix by covering JZ4780
with those as well.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12830/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Pull drm/i915 fixes from Dave Airlie:
"Just two i915 regression fixes, that should be it from me"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/i915: Actually retry with bit-banging after GMBUS timeout
drm/i915: Fix bogus dig_port_map[] assignment for pre-HSW
Add myself as the maintainer of the NAND subsystem.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Richard Weinberger <richard@nod.at>
Acked-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
We do use this_cpu_ptr(&cpu_tss) as a cacheline-aligned, seldomly
accessed per-cpu var as the MONITORX target in delay_mwaitx(). However,
when called in preemptible context, this_cpu_ptr -> smp_processor_id() ->
debug_smp_processor_id() fires:
BUG: using smp_processor_id() in preemptible [00000000] code: udevd/312
caller is delay_mwaitx+0x40/0xa0
But we don't care about that check - we only need cpu_tss as a MONITORX
target and it doesn't really matter which CPU's var we're touching as
we're going idle anyway. Fix that.
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: spg_linux_kernel@amd.com
Link: http://lkml.kernel.org/r/20160309205622.GG6564@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull s390 fixes from Martin Schwidefsky:
"Three bug fixes:
- The fix for the page table corruption (CVE-2016-2143)
- The diagnose statistics introduced a regression for the dasd diag
driver
- Boot crash on systems without the set-program-parameters facility"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/mm: four page table levels vs. fork
s390/cpumf: Fix lpp detection
s390/dasd: fix diag 0x250 inline assembly
Commit 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit")
introduced support for huge pages using the contiguous bit in the PTE
as opposed to block mappings, which may be slightly unwieldy (512M) in
64k page configurations.
Unfortunately, this support has resulted in some late regressions when
running the libhugetlbfs test suite with 64k pages and CONFIG_DEBUG_VM
as a result of a BUG:
| readback (2M: 64): ------------[ cut here ]------------
| kernel BUG at fs/hugetlbfs/inode.c:446!
| Internal error: Oops - BUG: 0 [#1] SMP
| Modules linked in:
| CPU: 7 PID: 1448 Comm: readback Not tainted 4.5.0-rc7 #148
| Hardware name: linux,dummy-virt (DT)
| task: fffffe0040964b00 ti: fffffe00c2668000 task.ti: fffffe00c2668000
| PC is at remove_inode_hugepages+0x44c/0x480
| LR is at remove_inode_hugepages+0x264/0x480
Rather than revert the entire patch, simply avoid advertising the
contiguous huge page sizes for now while people are actively working on
a fix. This patch can then be reverted once things have been sorted out.
Cc: David Woods <dwoods@ezchip.com>
Reported-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Yes, all of these are needed. :) This is admittedly a bit odd, but
kvm-unit-tests access.flat tests this if you run it with "-cpu host"
and of course ept=0.
KVM runs the guest with CR0.WP=1, so it must handle supervisor writes
specially when pte.u=1/pte.w=0/CR0.WP=0. Such writes cause a fault
when U=1 and W=0 in the SPTE, but they must succeed because CR0.WP=0.
When KVM gets the fault, it sets U=0 and W=1 in the shadow PTE and
restarts execution. This will still cause a user write to fault, while
supervisor writes will succeed. User reads will fault spuriously now,
and KVM will then flip U and W again in the SPTE (U=1, W=0). User reads
will be enabled and supervisor writes disabled, going back to the
originary situation where supervisor writes fault spuriously.
When SMEP is in effect, however, U=0 will enable kernel execution of
this page. To avoid this, KVM also sets NX=1 in the shadow PTE together
with U=0. If the guest has not enabled NX, the result is a continuous
stream of page faults due to the NX bit being reserved.
The fix is to force EFER.NX=1 even if the CPU is taking care of the EFER
switch. (All machines with SMEP have the CPU_LOAD_IA32_EFER vm-entry
control, so they do not use user-return notifiers for EFER---if they did,
EFER.NX would be forced to the same value as the host).
There is another bug in the reserved bit check, which I've split to a
separate patch for easier application to stable kernels.
Cc: stable@vger.kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Fixes: f6577a5fa15d82217ca73c74cd2dcbc0f6c781dd
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When removing an element from the mempool, mark it as unpoisoned in KASAN
before verifying its contents for SLUB/SLAB debugging. Otherwise KASAN
will flag the reads checking the element use-after-free writes as
use-after-free reads.
Signed-off-by: Matthew Dawson <matthew@mjdsystems.ca>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After the GMBUS transfer times out, we set force_bit=1 and
return -EAGAIN expecting the i2c core to call the .master_xfer
hook again so that we will retry the same transfer via bit-banging.
This is in case the gmbus hardware is somehow faulty.
Unfortunately we left adapter->retries to 0, meaning the i2c core
didn't actually do the retry. Let's tell the core we want one retry
when we return -EAGAIN.
Note that i2c-algo-bit also uses this retry count for some internal
retries, so we'll end up increasing those a bit as well.
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: drm-intel-fixes@lists.freedesktop.org
Fixes: bffce907d640 ("drm/i915: abstract i2c bit banging fallback in gmbus xfer")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457366220-29409-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 8b1f165a4a8f64c28cf42d10e1f4d3b451dedc51)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Offending Commit: 6e94119 "mtd: nand: return consistent error codes in
ecc.correct() implementations"
The new error code was not being handled properly in double bit error
detection.
Signed-off-by: Jorge Ramirez-Ortiz <jorge.ramirez-ortiz@linaro.org>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Tested-by: Franklin S Cooper Jr <fcooper@ti.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Pull x86 fixes from Ingo Molnar:
"This is unusually large, partly due to the EFI fixes that prevent
accidental deletion of EFI variables through efivarfs that may brick
machines. These fixes are somewhat involved to maintain compatibility
with existing install methods and other usage modes, while trying to
turn off the 'rm -rf' bricking vector.
Other fixes are for large page ioremap()s and for non-temporal
user-memcpy()s"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix vmalloc_fault() to handle large pages properly
hpet: Drop stale URLs
x86/uaccess/64: Handle the caching of 4-byte nocache copies properly in __copy_user_nocache()
x86/uaccess/64: Make the __copy_user_nocache() assembly code more readable
lib/ucs2_string: Correct ucs2 -> utf8 conversion
efi: Add pstore variables to the deletion whitelist
efi: Make efivarfs entries immutable by default
efi: Make our variable validation list include the guid
efi: Do variable name validation tests in utf8
efi: Use ucs2_as_utf8 in efivarfs instead of open coding a bad version
lib/ucs2_string: Add ucs2 -> utf8 helper functions
Leonid Shatz noticed that the SDM interpretation of the following
recent commit:
394db20ca240741 ("x86/fpu: Disable AVX when eagerfpu is off")
... is incorrect and that the original behavior of the FPU code was correct.
Because AVX is not stated in CR0 TS bit description, it was mistakenly
believed to be not supported for lazy context switch. This turns out
to be false:
Intel Software Developer's Manual Vol. 3A, Sec. 2.5 Control Registers:
'TS Task Switched bit (bit 3 of CR0) -- Allows the saving of the x87 FPU/
MMX/SSE/SSE2/SSE3/SSSE3/SSE4 context on a task switch to be delayed until
an x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instruction is actually executed
by the new task.'
Intel Software Developer's Manual Vol. 2A, Sec. 2.4 Instruction Exception
Specification:
'AVX instructions refer to exceptions by classes that include #NM
"Device Not Available" exception for lazy context switch.'
So revert the commit.
Reported-by: Leonid Shatz <leonid.shatz@ravellosystems.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457569734-3785-1-git-send-email-yu-cheng.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull spi fixes from Mark Brown:
"A few driver specific fixes for the Rockchip and i.MX SPI controllers,
especially for the i.MX they're annoying bugs if you run into them"
* tag 'spi-fix-v4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: imx: fix spi resource leak with dma transfer
spi: imx: allow only WML aligned transfers to use DMA
spi: rockchip: add missing spi_master_put
spi: rockchip: disable runtime pm when in err case
The fork of a process with four page table levels is broken since
git commit 6252d702c5311ce9 "[S390] dynamic page tables."
All new mm contexts are created with three page table levels and
an asce limit of 4TB. If the parent has four levels dup_mmap will
add vmas to the new context which are outside of the asce limit.
The subsequent call to copy_page_range will walk the three level
page table structure of the new process with non-zero pgd and pud
indexes. This leads to memory clobbers as the pgd_index *and* the
pud_index is added to the mm->pgd pointer without a pgd_deref
in between.
The init_new_context() function is selecting the number of page
table levels for a new context. The function is used by mm_init()
which in turn is called by dup_mm() and mm_alloc(). These two are
used by fork() and exec(). The init_new_context() function can
distinguish the two cases by looking at mm->context.asce_limit,
for fork() the mm struct has been copied and the number of page
table levels may not change. For exec() the mm_alloc() function
set the new mm structure to zero, in this case a three-level page
table is created as the temporary stack space is located at
STACK_TOP_MAX = 4TB.
This fixes CVE-2016-2143.
Reported-by: Marcin Kościelnicki <koriakin@0x04.net>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Commit dfd55ad85e4a ("arm64: vmemmap: use virtual projection of linear
region") fixed an issue where the struct page array would overflow into the
adjacent virtual memory region if system RAM was placed so high up in
physical memory that its addresses were not representable in the build time
configured virtual address size.
However, the fix failed to take into account that the vmemmap region needs
to be relatively aligned with respect to the sparsemem section size, so that
a sequence of page structs corresponding with a sparsemem section in the
linear region appears naturally aligned in the vmemmap region.
So round up vmemmap to sparsemem section size. Since this essentially moves
the projection of the linear region up in memory, also revert the reduction
of the size of the vmemmap region.
Cc: <stable@vger.kernel.org>
Fixes: dfd55ad85e4a ("arm64: vmemmap: use virtual projection of linear region")
Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
Tested-by: David Daney <david.daney@cavium.com>
Tested-by: Robert Richter <rrichter@cavium.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When growing halt-polling, there is no check that the poll time exceeds
the limit. It's possible for vcpu->halt_poll_ns grow once past
halt_poll_ns, and stay there until a halt which takes longer than
vcpu->halt_poll_ns. For example, booting a Linux guest with
halt_poll_ns=11000:
... kvm:kvm_halt_poll_ns: vcpu 0: halt_poll_ns 0 (shrink 10000)
... kvm:kvm_halt_poll_ns: vcpu 0: halt_poll_ns 10000 (grow 0)
... kvm:kvm_halt_poll_ns: vcpu 0: halt_poll_ns 20000 (grow 10000)
Signed-off-by: David Matlack <dmatlack@google.com>
Fixes: aca6ff29c4063a8d467cdee241e6b3bf7dc4a171
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull ARM SoC fix from Olof Johansson:
"Tiny fixes branch this week, in fact only one patch.
Turns out the USB support for a Renesas board was developed on a
pre-release board that ended up being changed before shipping. To
avoid breakage on those boards, and avoid confusion, it's a reasonable
idea to patch now instead of later. There are no known users of the
pre-release variant any more"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: dts: porter: remove enable prop from HS-USB device node
Pull ARM SoC fixes from Olof Johansson:
"Two more fixes for 4.5:
- One is a fix for OMAP that is urgently needed to avoid DRA7xx chips
from premature aging, by always keeping the Ethernet clock enabled.
- The other solves a I/O memory layout issue on Armada, where SROM
and PCI memory windows were conflicting in some configurations"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: mvebu: fix overlap of Crypto SRAM with PCIe memory window
ARM: dts: dra7: do not gate cpsw clock due to errata i877
ARM: OMAP2+: hwmod: Introduce ti,no-idle dt property
The recent commit [0bdf5a05647a: drm/i915: Add reverse mapping between
port and intel_encoder] introduced a reverse mapping to retrieve
intel_dig_port object from the port number. The code assumed that the
port vs intel_dig_port are 1:1 mapping. But in reality, this was a
too naive assumption.
As Martin reported about the missing HDMI audio on his SNB machine,
pre-HSW chips may have multiple intel_dig_port objects corresponding
to the same port. Since we assign the mapping statically at the init
time and the multiple objects override the map, it may not match with
the actually enabled output.
This patch tries to address the regression above. The reverse mapping
is provided basically only for the audio callbacks, so now we set /
clear the mapping dynamically at enabling and disabling HDMI/DP audio,
so that we can always track the latest and correct object
corresponding to the given port.
Fixes: 0bdf5a05647a ('drm/i915: Add reverse mapping between port and intel_encoder')
Reported-and-tested-by: Martin Kepplinger <martink@posteo.de>
Cc: drm-intel-fixes@lists.freedesktop.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Martin Kepplinger <martink@posteo.de>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456324522-21591-1-git-send-email-tiwai@suse.de
(cherry picked from commit 9dfbffcf4ac0707097af9e6c1372192b9d03a357)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Add a maintainer entry for FREESCALE GPMI NAND driver and add myself as a
maintainer.
Signed-off-by: Han Xu <han.xu@nxp.com>
Acked-by: Huang Shijie <shijie.huang@arm.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Pull perf fixes from Ingo Molnar:
"A handful of CPU hotplug related fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Plug potential memory leak in CPU_UP_PREPARE
perf/core: Remove the bogus and dangerous CPU_DOWN_FAILED hotplug state
perf/core: Remove bogus UP_CANCELED hotplug state
perf/x86/amd/uncore: Plug reference leak
A kernel page fault oops with the callstack below was observed
when a read syscall was made to a pmem device after a huge amount
(>512GB) of vmalloc ranges was allocated by ioremap() on a x86_64
system:
BUG: unable to handle kernel paging request at ffff880840000ff8
IP: vmalloc_fault+0x1be/0x300
PGD c7f03a067 PUD 0
Oops: 0000 [#1] SM
Call Trace:
__do_page_fault+0x285/0x3e0
do_page_fault+0x2f/0x80
? put_prev_entity+0x35/0x7a0
page_fault+0x28/0x30
? memcpy_erms+0x6/0x10
? schedule+0x35/0x80
? pmem_rw_bytes+0x6a/0x190 [nd_pmem]
? schedule_timeout+0x183/0x240
btt_log_read+0x63/0x140 [nd_btt]
:
? __symbol_put+0x60/0x60
? kernel_read+0x50/0x80
SyS_finit_module+0xb9/0xf0
entry_SYSCALL_64_fastpath+0x1a/0xa4
Since v4.1, ioremap() supports large page (pud/pmd) mappings in
x86_64 and PAE. vmalloc_fault() however assumes that the vmalloc
range is limited to pte mappings.
vmalloc faults do not normally happen in ioremap'd ranges since
ioremap() sets up the kernel page tables, which are shared by
user processes. pgd_ctor() sets the kernel's PGD entries to
user's during fork(). When allocation of the vmalloc ranges
crosses a 512GB boundary, ioremap() allocates a new pud table
and updates the kernel PGD entry to point it. If user process's
PGD entry does not have this update yet, a read/write syscall
to the range will cause a vmalloc fault, which hits the Oops
above as it does not handle a large page properly.
Following changes are made to vmalloc_fault().
64-bit:
- No change for the PGD sync operation as it handles large
pages already.
- Add pud_huge() and pmd_huge() to the validation code to
handle large pages.
- Change pud_page_vaddr() to pud_pfn() since an ioremap range
is not directly mapped (while the if-statement still works
with a bogus addr).
- Change pmd_page() to pmd_pfn() since an ioremap range is not
backed by struct page (while the if-statement still works
with a bogus addr).
32-bit:
- No change for the sync operation since the index3 PGD entry
covers the entire vmalloc range, which is always valid.
(A separate change to sync PGD entry is necessary if this
memory layout is changed regardless of the page size.)
- Add pmd_huge() to the validation code to handle large pages.
This is for completeness since vmalloc_fault() won't happen
in ioremap'd ranges as its PGD entry is always valid.
Reported-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: <stable@vger.kernel.org> # 4.1+
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: linux-mm@kvack.org
Cc: linux-nvdimm@lists.01.org
Link: http://lkml.kernel.org/r/1455758214-24623-1-git-send-email-toshi.kani@hpe.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
After fixing FPU option parsing, we now parse the 'no387' boot option
too early: no387 clears X86_FEATURE_FPU before it's even probed, so
the boot CPU promptly re-enables it.
I suspect it gets even more confused on SMP.
Fix the probing code to leave X86_FEATURE_FPU off if it's been
disabled by setup_clear_cpu_cap().
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: yu-cheng yu <yu-cheng.yu@intel.com>
Fixes: 4f81cbafcce2 ("x86/fpu: Fix early FPU command-line parsing")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull ext4 fix from Ted Ts'o:
"This fixes a regression which crept in v4.5-rc5"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: iterate over buffer heads correctly in move_extent_per_page()
we have to check bit 40 of the facility list before issuing LPP
and not bit 48. Otherwise a guest running on a system with
"The decimal-floating-point zoned-conversion facility" and without
the "The set-program-parameters facility" might crash on an lpp
instruction.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: e22cf8ca6f75 ("s390/cpumf: rework program parameter setting to detect guest samples")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
With MACHINE_HAS_VX, we convert the floating point registers from the
vector registeres when storing the status. For other VCPUs, these are
stored to vcpu->run->s.regs.vrs, but we are using current->thread.fpu.vxrs,
which resolves to the currently loaded VCPU.
So kvm_s390_store_status_unloaded() currently writes the wrong floating
point registers (converted from the vector registers) when called from
another VCPU on a z13.
This is only the case for old user space not handling SIGP STORE STATUS and
SIGP STOP AND STORE STATUS, but relying on the kernel implementation. All
other calls come from the loaded VCPU via kvm_s390_store_status().
Fixes: 9abc2a08a7d6 (KVM: s390: fix memory overwrites when vx is disabled)
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull ARM fixes from Russell King:
"Just two ARM fixes this time: one to fix the hyp-stub for older ARM
CPUs, and another to fix the set_memory_xx() permission functions to
deal with zero sizes correctly"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8544/1: set_memory_xx fixes
ARM: 8534/1: virt: fix hyp-stub build for pre-ARMv7 CPUs
Merge "Second Round of Renesas ARM Based SoC DT Fixes for v4.5" from Simon Horman:
* remove enable prop from HS-USB device node on porter board
* tag 'renesas-dt-fixes2-for-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: dts: porter: remove enable prop from HS-USB device node
Pull media fix from Mauro Carvalho Chehab:
"One last time fix: It adds a code that prevents some media tools like
media-ctl to hide some entities that have their IDs out of the range
expected by those apps"
* tag 'media/v4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] media-device: map new functions into old types for legacy API
When the Crypto SRAM mappings were added to the Device Tree files
describing the Armada XP boards in commit c466d997bb16 ("ARM: mvebu:
define crypto SRAM ranges for all armada-xp boards"), the fact that
those mappings were overlaping with the PCIe memory aperture was
overlooked. Due to this, we currently have for all Armada XP platforms
a situation that looks like this:
Memory mapping on Armada XP boards with internal registers at
0xf1000000:
- 0x00000000 -> 0xf0000000 3.75G RAM
- 0xf0000000 -> 0xf1000000 16M NOR flashes (AXP GP / AXP DB)
- 0xf1000000 -> 0xf1100000 1M internal registers
- 0xf8000000 -> 0xffe0000 126M PCIe memory aperture
- 0xf8100000 -> 0xf8110000 64KB Crypto SRAM #0 => OVERLAPS WITH PCIE !
- 0xf8110000 -> 0xf8120000 64KB Crypto SRAM #1 => OVERLAPS WITH PCIE !
- 0xffe00000 -> 0xfff00000 1M PCIe I/O aperture
- 0xfff0000 -> 0xffffffff 1M BootROM
The overlap means that when PCIe devices are added, depending on their
memory window needs, they might or might not be mapped into the
physical address space. Indeed, they will not be mapped if the area
allocated in the PCIe memory aperture by the PCI core overlaps with
one of the Crypto SRAM. Typically, a Intel IGB PCIe NIC that needs 8MB
of PCIe memory will see its PCIe memory window allocated from
0xf80000000 for 8MB, which overlaps with the Crypto SRAM windows. Due
to this, the PCIe window is not created, and any attempt to access the
PCIe window makes the kernel explode:
[ 3.302213] igb: Copyright (c) 2007-2014 Intel Corporation.
[ 3.307841] pci 0000:00:09.0: enabling device (0140 -> 0143)
[ 3.313539] mvebu_mbus: cannot add window '4:f8', conflicts with another window
[ 3.320870] mvebu-pcie soc:pcie-controller: Could not create MBus window at [mem 0xf8000000-0xf87fffff]: -22
[ 3.330811] Unhandled fault: external abort on non-linefetch (0x1008) at 0xf08c0018
This problem does not occur on Armada 370 boards, because we use the
following memory mapping (for boards that have internal registers at
0xf1000000):
- 0x00000000 -> 0xf0000000 3.75G RAM
- 0xf0000000 -> 0xf1000000 16M NOR flashes (AXP GP / AXP DB)
- 0xf1000000 -> 0xf1100000 1M internal registers
- 0xf1100000 -> 0xf1110000 64KB Crypto SRAM #0 => OK !
- 0xf8000000 -> 0xffe0000 126M PCIe memory
- 0xffe00000 -> 0xfff00000 1M PCIe I/O
- 0xfff0000 -> 0xffffffff 1M BootROM
Obviously, the solution is to align the location of the Crypto SRAM
mappings of Armada XP to be similar with the ones on Armada 370, i.e
have them between the "internal registers" area and the beginning of
the PCIe aperture.
However, we have a special case with the OpenBlocks AX3-4 platform,
which has a 128 MB NOR flash. Currently, this NOR flash is mapped from
0xf0000000 to 0xf8000000. This is possible because on OpenBlocks
AX3-4, the internal registers are not at 0xf1000000. And this explains
why the Crypto SRAM mappings were not configured at the same place on
Armada XP.
Hence, the solution is two-fold:
(1) Move the NOR flash mapping on Armada XP OpenBlocks AX3-4 from
0xe8000000 to 0xf0000000. This frees the 0xf0000000 ->
0xf80000000 space.
(2) Move the Crypto SRAM mappings on Armada XP to be similar to
Armada 370 (except of course that Armada XP has two Crypto SRAM
and not one).
After this patch, the memory mapping on Armada XP boards with
registers at 0xf1 is:
- 0x00000000 -> 0xf0000000 3.75G RAM
- 0xf0000000 -> 0xf1000000 16M NOR flashes (AXP GP / AXP DB)
- 0xf1000000 -> 0xf1100000 1M internal registers
- 0xf1100000 -> 0xf1110000 64KB Crypto SRAM #0
- 0xf1110000 -> 0xf1120000 64KB Crypto SRAM #1
- 0xf8000000 -> 0xffe0000 126M PCIe memory
- 0xffe00000 -> 0xfff00000 1M PCIe I/O
- 0xfff0000 -> 0xffffffff 1M BootROM
And the memory mapping for the special case of the OpenBlocks AX3-4
(internal registers at 0xd0000000, NOR of 128 MB):
- 0x00000000 -> 0xc0000000 3G RAM
- 0xd0000000 -> 0xd1000000 1M internal registers
- 0xe800000 -> 0xf0000000 128M NOR flash
- 0xf1100000 -> 0xf1110000 64KB Crypto SRAM #0
- 0xf1110000 -> 0xf1120000 64KB Crypto SRAM #1
- 0xf8000000 -> 0xffe0000 126M PCIe memory
- 0xffe00000 -> 0xfff00000 1M PCIe I/O
- 0xfff0000 -> 0xffffffff 1M BootROM
Fixes: c466d997bb16 ("ARM: mvebu: define crypto SRAM ranges for all armada-xp boards")
Reported-by: Phil Sutter <phil@nwl.cc>
Cc: Phil Sutter <phil@nwl.cc>
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
When a directory is deleted, we don't take too much care about killing off
all the dirents that belong to it — on the basis that on remount, the scan
will conclude that the directory is dead anyway.
This doesn't work though, when the deleted directory contained a child
directory which was moved *out*. In the early stages of the fs build
we can then end up with an apparent hard link, with the child directory
appearing both in its true location, and as a child of the original
directory which are this stage of the mount process we don't *yet* know
is defunct.
To resolve this, take out the early special-casing of the "directories
shall not have hard links" rule in jffs2_build_inode_pass1(), and let the
normal nlink processing happen for directories as well as other inodes.
Then later in the build process we can set ic->pino_nlink to the parent
inode#, as is required for directories during normal operaton, instead
of the nlink. And complain only *then* about hard links which are still
in evidence even after killing off all the unreachable paths.
Reported-by: Liu Song <liu.song11@zte.com.cn>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@vger.kernel.org
Pull powerpc fixes from Michael Ellerman:
- Fix build error on 32-bit with checkpoint restart from Aneesh Kumar
- Fix dedotify for binutils >= 2.26 from Andreas Schwab
- Don't trace hcalls on offline CPUs from Denis Kirjanov
- eeh: Fix stale cached primary bus from Gavin Shan
- eeh: Fix stale PE primary bus from Gavin Shan
- mm: Fix Multi hit ERAT cause by recent THP update from Aneesh Kumar K.V
- ioda: Set "read" permission when "write" is set from Alexey Kardashevskiy
* tag 'powerpc-4.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/ioda: Set "read" permission when "write" is set
powerpc/mm: Fix Multi hit ERAT cause by recent THP update
powerpc/powernv: Fix stale PE primary bus
powerpc/eeh: Fix stale cached primary bus
powerpc/pseries: Don't trace hcalls on offline CPUs
powerpc: Fix dedotify for binutils >= 2.26
powerpc/book3s_32: Fix build error with checkpoint restart
If CPU_UP_PREPARE is called it is not guaranteed, that a previously allocated
and assigned hash has been freed already, but perf_event_init_cpu()
unconditionally allocates and assignes a new hash if the swhash is referenced.
By overwriting the pointer the existing hash is not longer accessible.
Verify that there is no hash assigned on this cpu before allocating and
assigning a new one.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20160209201007.843269966@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Looks like the HPET spec at intel.com got moved.
It isn't hard to find so drop the link, just mention
the revision assumed.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/1455145462-3877-1-git-send-email-mst@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull networking fixes from David Miller:
1) Fix ordering of WEXT netlink messages so we don't see a newlink
after a dellink, from Johannes Berg.
2) Out of bounds access in minstrel_ht_set_best_prob_rage, from
Konstantin Khlebnikov.
3) Paging buffer memory leak in iwlwifi, from Matti Gottlieb.
4) Wrong units used to set initial TCP rto from cached metrics, also
from Konstantin Khlebnikov.
5) Fix stale IP options data in the SKB control block from leaking
through layers of encapsulation, from Bernie Harris.
6) Zero padding len miscalculated in bnxt_en, from Michael Chan.
7) Only CHECKSUM_PARTIAL packets should be passed down through GSO, fix
from Hannes Frederic Sowa.
8) Fix suspend/resume with JME networking devices, from Diego Violat
and Guo-Fu Tseng.
9) Checksums not validated properly in bridge multicast support due to
the placement of the SKB header pointers at the time of the check,
fix from Álvaro Fernández Rojas.
10) Fix hang/tiemout with r8169 if a stats fetch is done while the
device is runtime suspended. From Chun-Hao Lin.
11) The forwarding database netlink dump facilities don't track the
state of the dump properly, resulting in skipped/missed entries.
From Minoura Makoto.
12) Fix regression from a recent 3c59x bug fix, from Neil Horman.
13) Fix list corruption in bna driver, from Ivan Vecera.
14) Big endian machines crash on vlan add in bnx2x, fix from Michal
Schmidt.
15) Ethtool RSS configuration not propagated properly in mlx5 driver,
from Tariq Toukan.
16) Fix regression in PHY probing in stmmac driver, from Gabriel
Fernandez.
17) Fix SKB tailroom calculation in igmp/mld code, from Benjamin
Poirier.
18) A past change to skip empty routing headers in ipv6 extention header
parsing accidently caused fragment headers to not be matched any
longer. Fix from Florian Westphal.
19) eTSEC-106 erratum needs to be applied to more gianfar chips, from
Atsushi Nemoto.
20) Fix netdev reference after free via workqueues in usb networking
drivers, from Oliver Neukum and Bjørn Mork.
21) mdio->irq is now an array rather than a pointer to dynamic memory,
but several drivers were still trying to free it :-/ Fixes from
Colin Ian King.
22) act_ipt iptables action forgets to set the family field, thus LOG
netfilter targets don't work with it. Fix from Phil Sutter.
23) SKB leak in ibmveth when skb_linearize() fails, from Thomas Falcon.
24) pskb_may_pull() cannot be called with interrupts disabled, fix code
that tries to do this in vmxnet3 driver, from Neil Horman.
25) be2net driver leaks iomap'd memory on removal, fix from Douglas
Miller.
26) Forgotton RTNL mutex unlock in ppp_create_interface() error paths,
from Guillaume Nault.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (97 commits)
ppp: release rtnl mutex when interface creation fails
cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind
tcp: fix tcpi_segs_in after connection establishment
net: hns: fix the bug about loopback
jme: Fix device PM wakeup API usage
jme: Do not enable NIC WoL functions on S0
udp6: fix UDP/IPv6 encap resubmit path
be2net: Don't leak iomapped memory on removal.
vmxnet3: avoid calling pskb_may_pull with interrupts disabled
net: ethernet: Add missing MFD_SYSCON dependency on HAS_IOMEM
ibmveth: check return of skb_linearize in ibmveth_start_xmit
cdc_ncm: toggle altsetting to force reset before setup
usbnet: cleanup after bind() in probe()
mlxsw: pci: Correctly determine if descriptor queue is full
mlxsw: spectrum: Always decrement bridge's ref count
tipc: fix nullptr crash during subscription cancel
net: eth: altera: do not free array priv->mdio->irq
net/ethoc: do not free array priv->mdio->irq
net: sched: fix act_ipt for LOG target
asix: do not free array priv->mdio->irq
...
Pull drm fixes from Dave Airlie:
"A few imx fixes I missed from a couple of weeks ago, they still aren't
that big and fix some regression and a fail to boot problem.
Other than that, a couple of regression fixes for radeon/amdgpu, one
regression fix for vmwgfx and one regression fix for tda998x"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
Revert "drm/radeon/pm: adjust display configuration after powerstate"
drm/amdgpu/dp: add back special handling for NUTMEG
drm/radeon/dp: add back special handling for NUTMEG
drm/i2c: tda998x: Choose between atomic or non atomic dpms helper
drm/vmwgfx: Add back ->detect() and ->fill_modes()
drm/radeon: Fix error handling in radeon_flip_work_func.
drm/amdgpu: Fix error handling in amdgpu_flip_work_func.
drm/imx: Add missing DRM_FORMAT_RGB565 to ipu_plane_formats
drm/imx: notify DRM core about CRTC vblank state
gpu: ipu-v3: Reset IPU before activating IRQ
gpu: ipu-v3: Do not bail out on missing optional port nodes
In commit bcff24887d00 ("ext4: don't read blocks from disk after extents
being swapped") bh is not updated correctly in the for loop and wrong
data has been written to disk. generic/324 catches this on sub-page
block size ext4.
Fixes: bcff24887d00 ("ext4: don't read blocks from disk after extentsbeing swapped")
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add missing spi_master_put for rockchip_spi_remove since
it calls spi_master_get already.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
git commit 1ec2772e0c3c ("s390/diag: add a statistic for diagnose
calls") added function calls to gather diagnose statistics.
In case of the dasd diag driver the function call was added between a
register asm statement which initialized register r2 and the inline
assembly itself. The function call clobbers the contents of register
r2 and therefore the diag 0x250 call behaves in a more or less random
way.
Fix this by extracting the function call into a separate function like
we do everywhere else.
Fixes: 1ec2772e0c3c ("s390/diag: add a statistic for diagnose calls")
Cc: <stable@vger.kernel.org> # 4.4+
Reported-and-tested-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull MIPS fixes from Ralf Baechle:
"Another round of MIPS fixes for 4.5:
- Fix JZ4780 build with DEBUG_ZBOOT and MACH_JZ4780
- Fix build with DEBUG_ZBOOT and MACH_JZ4780
- Fix issue with uninitialised temp_foreign_map
- Fix awk regex compile failure with certain versions of awk. At
this time, the sole user, ld-ifversion, is only used on MIPS"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: smp.c: Fix uninitialised temp_foreign_map
MIPS: Fix build error when SMP is used without GIC
ld-version: Fix awk regex compile failure
MIPS: Fix build with DEBUG_ZBOOT and MACH_JZ4780
When calculate_cpu_foreign_map() recalculates the cpu_foreign_map
cpumask it uses the local variable temp_foreign_map without initialising
it to zero. Since the calculation only ever sets bits in this cpumask
any existing bits at that memory location will remain set and find their
way into cpu_foreign_map too. This could potentially lead to cache
operations suboptimally doing smp calls to multiple VPEs in the same
core, even though the VPEs share primary caches.
Therefore initialise temp_foreign_map using cpumask_clear() before use.
Fixes: cccf34e9411c ("MIPS: c-r4k: Fix cache flushing for MT cores")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12759/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Pull x86 fixes from Ingo Molnar:
"This fixes 3 FPU handling related bugs, an EFI boot crash and a
runtime warning.
The EFI fix arrived late but I didn't want to delay it to after v4.5
because the effects are pretty bad for the systems that are affected
by it"
[ Actually, I don't think the EFI fix really matters yet, because we
haven't switched to the separate EFI page tables in mainline yet ]
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/efi: Fix boot crash by always mapping boot service regions into new EFI page tables
x86/fpu: Fix eager-FPU handling on legacy FPU machines
x86/delay: Avoid preemptible context checks in delay_mwaitx()
x86/fpu: Revert ("x86/fpu: Disable AVX when eagerfpu is off")
x86/fpu: Fix 'no387' regression
For !BIO_CLONED bio, we can use .bi_vcnt safely, but it
doesn't mean we can just simply return .bi_io_vec[.bi_vcnt - 1]
because the start postion may have been moved in the middle of
the bvec, such as splitting in the middle of bvec.
Fixes: 7bcd79ac50d9(block: bio: introduce helpers to get the 1st and last bvec)
Cc: stable@vger.kernel.org
Reported-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The MIPS_GIC_IPI should only be selected when MIPS_GIC is also
selected, otherwise it results in a compile error. smp-gic.c uses some
functions from include/linux/irqchip/mips-gic.h like
plat_ipi_call_int_xlate() which are only added to the header file when
MIPS_GIC is set. The Lantiq SoC does not use the GIC, but supports SMP.
The calls top the functions from smp-gic.c are already protected by
some #ifdefs
The first part of this was introduced in commit 72e20142b2bf ("MIPS:
Move GIC IPI functions out of smp-cmp.c")
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: stable@vger.kernel.org # v3.15+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12774/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Pull SCSI target bug fix from Nicholas Bellinger:
"Here is an outstanding target-core bug-fix for v4.5 code."
This patch addresses a recent Task Management (TMR) regression related
to larger set of multi-port LUN_RESET bug-fixes in v4.5-rc5.
It drops a left-over target_put_sess_cmd() of se_cmd->cmd_kref within
ABORT_TASK failure path, once a se_cmd descriptor has already
completed posting response to fabric driver logic, and must be skipped
during normal ABORT_TASK se_cmd->tag lookup"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target: Drop incorrect ABORT_TASK put for completed commands
Some machines have EFI regions in page zero (physical address
0x00000000) and historically that region has been added to the e820
map via trim_bios_range(), and ultimately mapped into the kernel page
tables. It was not mapped via efi_map_regions() as one would expect.
Alexis reports that with the new separate EFI page tables some boot
services regions, such as page zero, are not mapped. This triggers an
oops during the SetVirtualAddressMap() runtime call.
For the EFI boot services quirk on x86 we need to memblock_reserve()
boot services regions until after SetVirtualAddressMap(). Doing that
while respecting the ownership of regions that may have already been
reserved by the kernel was the motivation behind this commit:
7d68dc3f1003 ("x86, efi: Do not reserve boot services regions within reserved areas")
That patch was merged at a time when the EFI runtime virtual mappings
were inserted into the kernel page tables as described above, and the
trick of setting ->numpages (and hence the region size) to zero to
track regions that should not be freed in efi_free_boot_services()
meant that we never mapped those regions in efi_map_regions(). Instead
we were relying solely on the existing kernel mappings.
Now that we have separate page tables we need to make sure the EFI
boot services regions are mapped correctly, even if someone else has
already called memblock_reserve(). Instead of stashing a tag in
->numpages, set the EFI_MEMORY_RUNTIME bit of ->attribute. Since it
generally makes no sense to mark a boot services region as required at
runtime, it's pretty much guaranteed the firmware will not have
already set this bit.
For the record, the specific circumstances under which Alexis
triggered this bug was that an EFI runtime driver on his machine was
responding to the EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE event during
SetVirtualAddressMap().
The event handler for this driver looks like this,
sub rsp,0x28
lea rdx,[rip+0x2445] # 0xaa948720
mov ecx,0x4
call func_aa9447c0 ; call to ConvertPointer(4, & 0xaa948720)
mov r11,QWORD PTR [rip+0x2434] # 0xaa948720
xor eax,eax
mov BYTE PTR [r11+0x1],0x1
add rsp,0x28
ret
Which is pretty typical code for an EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE
handler. The "mov r11, QWORD PTR [rip+0x2424]" was the faulting
instruction because ConvertPointer() was being called to convert the
address 0x0000000000000000, which when converted is left unchanged and
remains 0x0000000000000000.
The output of the oops trace gave the impression of a standard NULL
pointer dereference bug, but because we're accessing physical
addresses during ConvertPointer(), it wasn't. EFI boot services code
is stored at that address on Alexis' machine.
Reported-by: Alexis Murzeau <amurzeau@gmail.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raphael Hertzog <hertzog@debian.org>
Cc: Roger Shimizu <rogershimizu@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1457695163-29632-2-git-send-email-matt@codeblueprint.co.uk
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=815125
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull KVM fixes from Paolo Bonzini:
"A few simple fixes for ARM, x86, PPC and generic code.
The x86 MMU fix is a bit larger because the surrounding code needed a
cleanup, but nothing worrisome"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: MMU: fix reserved bit check for ept=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0
KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo
kvm: cap halt polling at exactly halt_poll_ns
KVM: s390: correct fprs on SIGP (STOP AND) STORE STATUS
KVM: VMX: disable PEBS before a guest entry
KVM: PPC: Book3S HV: Sanitize special-purpose register values on guest exit
The ld-version.sh script fails on some versions of awk with the
following error, resulting in build failures for MIPS:
awk: scripts/ld-version.sh: line 4: regular expression compile failed (missing '(')
This is due to the regular expression ".*)", meant to strip off the
beginning of the ld version string up to the close bracket, however
brackets have a meaning in regular expressions, so lets escape it so
that awk doesn't expect a corresponding open bracket.
Fixes: ccbef1674a15 ("Kbuild, lto: add ld-version and ld-ifversion ...")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Cc: Michal Marek <mmarek@suse.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kbuild@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # 4.4.x-
Patchwork: https://patchwork.linux-mips.org/patch/12838/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Pull MTD fixes from Brian Norris:
"Late MTD fix for v4.5:
- A simple error code handling fix for the NAND ECC test; this was a
regression in v4.5-rc1
- A MAINTAINERS update, which might as well go in ASAP"
* tag 'for-linus-20160311' of git://git.infradead.org/linux-mtd:
MAINTAINERS: add a maintainer for the NAND subsystem
mtd: nand: tests: fix regression introduced in mtd_nandectest
This patch fixes a recent ABORT_TASK regression associated
with commit febe562c, where a left-over target_put_sess_cmd()
would still be called when __target_check_io_state() detected
a command has already been completed, and explicit ABORT must
be avoided.
Note commit febe562c dropped the local kref_get_unless_zero()
check in core_tmr_abort_task(), but did not drop this extra
corresponding target_put_sess_cmd() in the failure path.
So go ahead and drop this now bogus target_put_sess_cmd(),
and avoid this potential use-after-free.
Reported-by: Dan Lane <dracodan@gmail.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Andy Grover <agrover@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
i486 derived cores like Intel Quark support only the very old,
legacy x87 FPU (FSAVE/FRSTOR, CPUID bit FXSR is not set), and
our FPU code wasn't handling the saving and restoring there
properly in the 'eagerfpu' case.
So after we made eagerfpu the default for all CPU types:
58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs
these old FPU designs broke. First, Andy Shevchenko reported a splat:
WARNING: CPU: 0 PID: 823 at arch/x86/include/asm/fpu/internal.h:163 fpu__clear+0x8c/0x160
which was us trying to execute FXRSTOR on those machines even though
they don't support it.
After taking care of that, Bryan O'Donoghue reported that a simple FPU
test still failed because we weren't initializing the FPU state properly
on those machines.
Take care of all that.
Reported-and-tested-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Reported-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng <yu-cheng.yu@intel.com>
Link: http://lkml.kernel.org/r/20160311113206.GD4312@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull arm64 fixes from Will Deacon:
"I thought we were done for 4.5, but then the 64k-page chaps came
crawling out of the woodwork. *sigh*
The vmemmap fix I sent for -rc7 caused a regression with 64k pages and
sparsemem and at some point during the release cycle the new hugetlb
code using contiguous ptes started failing the libhugetlbfs tests with
64k pages enabled.
So here are a couple of patches that fix the vmemmap alignment and
disable the new hugetlb page sizes whilst a proper fix is being
developed:
- Temporarily disable huge pages built using contiguous ptes
- Ensure vmemmap region is sufficiently aligned for sparsemem
sections"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: hugetlb: partial revert of 66b3923a1a0f
arm64: account for sparsemem section alignment when choosing vmemmap offset
KVM has special logic to handle pages with pte.u=1 and pte.w=0 when
CR0.WP=1. These pages' SPTEs flip continuously between two states:
U=1/W=0 (user and supervisor reads allowed, supervisor writes not allowed)
and U=0/W=1 (supervisor reads and writes allowed, user writes not allowed).
When SMEP is in effect, however, U=0 will enable kernel execution of
this page. To avoid this, KVM also sets NX=1 in the shadow PTE together
with U=0, making the two states U=1/W=0/NX=gpte.NX and U=0/W=1/NX=1.
When guest EFER has the NX bit cleared, the reserved bit check thinks
that the latter state is invalid; teach it that the smep_andnot_wp case
will also use the NX bit of SPTEs.
Cc: stable@vger.kernel.org
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.inel.com>
Fixes: c258b62b264fdc469b6d3610a907708068145e3b
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ingenic SoC declares ZBOOT support, but debug definitions are missing
for MACH_JZ4780 resulting in a build failure when DEBUG_ZBOOT is set.
The UART addresses are same as with JZ4740, so fix by covering JZ4780
with those as well.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12830/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Add myself as the maintainer of the NAND subsystem.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Richard Weinberger <richard@nod.at>
Acked-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
We do use this_cpu_ptr(&cpu_tss) as a cacheline-aligned, seldomly
accessed per-cpu var as the MONITORX target in delay_mwaitx(). However,
when called in preemptible context, this_cpu_ptr -> smp_processor_id() ->
debug_smp_processor_id() fires:
BUG: using smp_processor_id() in preemptible [00000000] code: udevd/312
caller is delay_mwaitx+0x40/0xa0
But we don't care about that check - we only need cpu_tss as a MONITORX
target and it doesn't really matter which CPU's var we're touching as
we're going idle anyway. Fix that.
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: spg_linux_kernel@amd.com
Link: http://lkml.kernel.org/r/20160309205622.GG6564@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull s390 fixes from Martin Schwidefsky:
"Three bug fixes:
- The fix for the page table corruption (CVE-2016-2143)
- The diagnose statistics introduced a regression for the dasd diag
driver
- Boot crash on systems without the set-program-parameters facility"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/mm: four page table levels vs. fork
s390/cpumf: Fix lpp detection
s390/dasd: fix diag 0x250 inline assembly
Commit 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit")
introduced support for huge pages using the contiguous bit in the PTE
as opposed to block mappings, which may be slightly unwieldy (512M) in
64k page configurations.
Unfortunately, this support has resulted in some late regressions when
running the libhugetlbfs test suite with 64k pages and CONFIG_DEBUG_VM
as a result of a BUG:
| readback (2M: 64): ------------[ cut here ]------------
| kernel BUG at fs/hugetlbfs/inode.c:446!
| Internal error: Oops - BUG: 0 [#1] SMP
| Modules linked in:
| CPU: 7 PID: 1448 Comm: readback Not tainted 4.5.0-rc7 #148
| Hardware name: linux,dummy-virt (DT)
| task: fffffe0040964b00 ti: fffffe00c2668000 task.ti: fffffe00c2668000
| PC is at remove_inode_hugepages+0x44c/0x480
| LR is at remove_inode_hugepages+0x264/0x480
Rather than revert the entire patch, simply avoid advertising the
contiguous huge page sizes for now while people are actively working on
a fix. This patch can then be reverted once things have been sorted out.
Cc: David Woods <dwoods@ezchip.com>
Reported-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Yes, all of these are needed. :) This is admittedly a bit odd, but
kvm-unit-tests access.flat tests this if you run it with "-cpu host"
and of course ept=0.
KVM runs the guest with CR0.WP=1, so it must handle supervisor writes
specially when pte.u=1/pte.w=0/CR0.WP=0. Such writes cause a fault
when U=1 and W=0 in the SPTE, but they must succeed because CR0.WP=0.
When KVM gets the fault, it sets U=0 and W=1 in the shadow PTE and
restarts execution. This will still cause a user write to fault, while
supervisor writes will succeed. User reads will fault spuriously now,
and KVM will then flip U and W again in the SPTE (U=1, W=0). User reads
will be enabled and supervisor writes disabled, going back to the
originary situation where supervisor writes fault spuriously.
When SMEP is in effect, however, U=0 will enable kernel execution of
this page. To avoid this, KVM also sets NX=1 in the shadow PTE together
with U=0. If the guest has not enabled NX, the result is a continuous
stream of page faults due to the NX bit being reserved.
The fix is to force EFER.NX=1 even if the CPU is taking care of the EFER
switch. (All machines with SMEP have the CPU_LOAD_IA32_EFER vm-entry
control, so they do not use user-return notifiers for EFER---if they did,
EFER.NX would be forced to the same value as the host).
There is another bug in the reserved bit check, which I've split to a
separate patch for easier application to stable kernels.
Cc: stable@vger.kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Fixes: f6577a5fa15d82217ca73c74cd2dcbc0f6c781dd
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When removing an element from the mempool, mark it as unpoisoned in KASAN
before verifying its contents for SLUB/SLAB debugging. Otherwise KASAN
will flag the reads checking the element use-after-free writes as
use-after-free reads.
Signed-off-by: Matthew Dawson <matthew@mjdsystems.ca>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After the GMBUS transfer times out, we set force_bit=1 and
return -EAGAIN expecting the i2c core to call the .master_xfer
hook again so that we will retry the same transfer via bit-banging.
This is in case the gmbus hardware is somehow faulty.
Unfortunately we left adapter->retries to 0, meaning the i2c core
didn't actually do the retry. Let's tell the core we want one retry
when we return -EAGAIN.
Note that i2c-algo-bit also uses this retry count for some internal
retries, so we'll end up increasing those a bit as well.
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: drm-intel-fixes@lists.freedesktop.org
Fixes: bffce907d640 ("drm/i915: abstract i2c bit banging fallback in gmbus xfer")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457366220-29409-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 8b1f165a4a8f64c28cf42d10e1f4d3b451dedc51)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Offending Commit: 6e94119 "mtd: nand: return consistent error codes in
ecc.correct() implementations"
The new error code was not being handled properly in double bit error
detection.
Signed-off-by: Jorge Ramirez-Ortiz <jorge.ramirez-ortiz@linaro.org>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Tested-by: Franklin S Cooper Jr <fcooper@ti.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Pull x86 fixes from Ingo Molnar:
"This is unusually large, partly due to the EFI fixes that prevent
accidental deletion of EFI variables through efivarfs that may brick
machines. These fixes are somewhat involved to maintain compatibility
with existing install methods and other usage modes, while trying to
turn off the 'rm -rf' bricking vector.
Other fixes are for large page ioremap()s and for non-temporal
user-memcpy()s"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix vmalloc_fault() to handle large pages properly
hpet: Drop stale URLs
x86/uaccess/64: Handle the caching of 4-byte nocache copies properly in __copy_user_nocache()
x86/uaccess/64: Make the __copy_user_nocache() assembly code more readable
lib/ucs2_string: Correct ucs2 -> utf8 conversion
efi: Add pstore variables to the deletion whitelist
efi: Make efivarfs entries immutable by default
efi: Make our variable validation list include the guid
efi: Do variable name validation tests in utf8
efi: Use ucs2_as_utf8 in efivarfs instead of open coding a bad version
lib/ucs2_string: Add ucs2 -> utf8 helper functions
Leonid Shatz noticed that the SDM interpretation of the following
recent commit:
394db20ca240741 ("x86/fpu: Disable AVX when eagerfpu is off")
... is incorrect and that the original behavior of the FPU code was correct.
Because AVX is not stated in CR0 TS bit description, it was mistakenly
believed to be not supported for lazy context switch. This turns out
to be false:
Intel Software Developer's Manual Vol. 3A, Sec. 2.5 Control Registers:
'TS Task Switched bit (bit 3 of CR0) -- Allows the saving of the x87 FPU/
MMX/SSE/SSE2/SSE3/SSSE3/SSE4 context on a task switch to be delayed until
an x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instruction is actually executed
by the new task.'
Intel Software Developer's Manual Vol. 2A, Sec. 2.4 Instruction Exception
Specification:
'AVX instructions refer to exceptions by classes that include #NM
"Device Not Available" exception for lazy context switch.'
So revert the commit.
Reported-by: Leonid Shatz <leonid.shatz@ravellosystems.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457569734-3785-1-git-send-email-yu-cheng.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull spi fixes from Mark Brown:
"A few driver specific fixes for the Rockchip and i.MX SPI controllers,
especially for the i.MX they're annoying bugs if you run into them"
* tag 'spi-fix-v4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: imx: fix spi resource leak with dma transfer
spi: imx: allow only WML aligned transfers to use DMA
spi: rockchip: add missing spi_master_put
spi: rockchip: disable runtime pm when in err case
The fork of a process with four page table levels is broken since
git commit 6252d702c5311ce9 "[S390] dynamic page tables."
All new mm contexts are created with three page table levels and
an asce limit of 4TB. If the parent has four levels dup_mmap will
add vmas to the new context which are outside of the asce limit.
The subsequent call to copy_page_range will walk the three level
page table structure of the new process with non-zero pgd and pud
indexes. This leads to memory clobbers as the pgd_index *and* the
pud_index is added to the mm->pgd pointer without a pgd_deref
in between.
The init_new_context() function is selecting the number of page
table levels for a new context. The function is used by mm_init()
which in turn is called by dup_mm() and mm_alloc(). These two are
used by fork() and exec(). The init_new_context() function can
distinguish the two cases by looking at mm->context.asce_limit,
for fork() the mm struct has been copied and the number of page
table levels may not change. For exec() the mm_alloc() function
set the new mm structure to zero, in this case a three-level page
table is created as the temporary stack space is located at
STACK_TOP_MAX = 4TB.
This fixes CVE-2016-2143.
Reported-by: Marcin Kościelnicki <koriakin@0x04.net>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Commit dfd55ad85e4a ("arm64: vmemmap: use virtual projection of linear
region") fixed an issue where the struct page array would overflow into the
adjacent virtual memory region if system RAM was placed so high up in
physical memory that its addresses were not representable in the build time
configured virtual address size.
However, the fix failed to take into account that the vmemmap region needs
to be relatively aligned with respect to the sparsemem section size, so that
a sequence of page structs corresponding with a sparsemem section in the
linear region appears naturally aligned in the vmemmap region.
So round up vmemmap to sparsemem section size. Since this essentially moves
the projection of the linear region up in memory, also revert the reduction
of the size of the vmemmap region.
Cc: <stable@vger.kernel.org>
Fixes: dfd55ad85e4a ("arm64: vmemmap: use virtual projection of linear region")
Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
Tested-by: David Daney <david.daney@cavium.com>
Tested-by: Robert Richter <rrichter@cavium.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When growing halt-polling, there is no check that the poll time exceeds
the limit. It's possible for vcpu->halt_poll_ns grow once past
halt_poll_ns, and stay there until a halt which takes longer than
vcpu->halt_poll_ns. For example, booting a Linux guest with
halt_poll_ns=11000:
... kvm:kvm_halt_poll_ns: vcpu 0: halt_poll_ns 0 (shrink 10000)
... kvm:kvm_halt_poll_ns: vcpu 0: halt_poll_ns 10000 (grow 0)
... kvm:kvm_halt_poll_ns: vcpu 0: halt_poll_ns 20000 (grow 10000)
Signed-off-by: David Matlack <dmatlack@google.com>
Fixes: aca6ff29c4063a8d467cdee241e6b3bf7dc4a171
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull ARM SoC fix from Olof Johansson:
"Tiny fixes branch this week, in fact only one patch.
Turns out the USB support for a Renesas board was developed on a
pre-release board that ended up being changed before shipping. To
avoid breakage on those boards, and avoid confusion, it's a reasonable
idea to patch now instead of later. There are no known users of the
pre-release variant any more"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: dts: porter: remove enable prop from HS-USB device node
Pull ARM SoC fixes from Olof Johansson:
"Two more fixes for 4.5:
- One is a fix for OMAP that is urgently needed to avoid DRA7xx chips
from premature aging, by always keeping the Ethernet clock enabled.
- The other solves a I/O memory layout issue on Armada, where SROM
and PCI memory windows were conflicting in some configurations"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: mvebu: fix overlap of Crypto SRAM with PCIe memory window
ARM: dts: dra7: do not gate cpsw clock due to errata i877
ARM: OMAP2+: hwmod: Introduce ti,no-idle dt property
The recent commit [0bdf5a05647a: drm/i915: Add reverse mapping between
port and intel_encoder] introduced a reverse mapping to retrieve
intel_dig_port object from the port number. The code assumed that the
port vs intel_dig_port are 1:1 mapping. But in reality, this was a
too naive assumption.
As Martin reported about the missing HDMI audio on his SNB machine,
pre-HSW chips may have multiple intel_dig_port objects corresponding
to the same port. Since we assign the mapping statically at the init
time and the multiple objects override the map, it may not match with
the actually enabled output.
This patch tries to address the regression above. The reverse mapping
is provided basically only for the audio callbacks, so now we set /
clear the mapping dynamically at enabling and disabling HDMI/DP audio,
so that we can always track the latest and correct object
corresponding to the given port.
Fixes: 0bdf5a05647a ('drm/i915: Add reverse mapping between port and intel_encoder')
Reported-and-tested-by: Martin Kepplinger <martink@posteo.de>
Cc: drm-intel-fixes@lists.freedesktop.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Martin Kepplinger <martink@posteo.de>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456324522-21591-1-git-send-email-tiwai@suse.de
(cherry picked from commit 9dfbffcf4ac0707097af9e6c1372192b9d03a357)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Pull perf fixes from Ingo Molnar:
"A handful of CPU hotplug related fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Plug potential memory leak in CPU_UP_PREPARE
perf/core: Remove the bogus and dangerous CPU_DOWN_FAILED hotplug state
perf/core: Remove bogus UP_CANCELED hotplug state
perf/x86/amd/uncore: Plug reference leak
A kernel page fault oops with the callstack below was observed
when a read syscall was made to a pmem device after a huge amount
(>512GB) of vmalloc ranges was allocated by ioremap() on a x86_64
system:
BUG: unable to handle kernel paging request at ffff880840000ff8
IP: vmalloc_fault+0x1be/0x300
PGD c7f03a067 PUD 0
Oops: 0000 [#1] SM
Call Trace:
__do_page_fault+0x285/0x3e0
do_page_fault+0x2f/0x80
? put_prev_entity+0x35/0x7a0
page_fault+0x28/0x30
? memcpy_erms+0x6/0x10
? schedule+0x35/0x80
? pmem_rw_bytes+0x6a/0x190 [nd_pmem]
? schedule_timeout+0x183/0x240
btt_log_read+0x63/0x140 [nd_btt]
:
? __symbol_put+0x60/0x60
? kernel_read+0x50/0x80
SyS_finit_module+0xb9/0xf0
entry_SYSCALL_64_fastpath+0x1a/0xa4
Since v4.1, ioremap() supports large page (pud/pmd) mappings in
x86_64 and PAE. vmalloc_fault() however assumes that the vmalloc
range is limited to pte mappings.
vmalloc faults do not normally happen in ioremap'd ranges since
ioremap() sets up the kernel page tables, which are shared by
user processes. pgd_ctor() sets the kernel's PGD entries to
user's during fork(). When allocation of the vmalloc ranges
crosses a 512GB boundary, ioremap() allocates a new pud table
and updates the kernel PGD entry to point it. If user process's
PGD entry does not have this update yet, a read/write syscall
to the range will cause a vmalloc fault, which hits the Oops
above as it does not handle a large page properly.
Following changes are made to vmalloc_fault().
64-bit:
- No change for the PGD sync operation as it handles large
pages already.
- Add pud_huge() and pmd_huge() to the validation code to
handle large pages.
- Change pud_page_vaddr() to pud_pfn() since an ioremap range
is not directly mapped (while the if-statement still works
with a bogus addr).
- Change pmd_page() to pmd_pfn() since an ioremap range is not
backed by struct page (while the if-statement still works
with a bogus addr).
32-bit:
- No change for the sync operation since the index3 PGD entry
covers the entire vmalloc range, which is always valid.
(A separate change to sync PGD entry is necessary if this
memory layout is changed regardless of the page size.)
- Add pmd_huge() to the validation code to handle large pages.
This is for completeness since vmalloc_fault() won't happen
in ioremap'd ranges as its PGD entry is always valid.
Reported-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: <stable@vger.kernel.org> # 4.1+
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: linux-mm@kvack.org
Cc: linux-nvdimm@lists.01.org
Link: http://lkml.kernel.org/r/1455758214-24623-1-git-send-email-toshi.kani@hpe.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
After fixing FPU option parsing, we now parse the 'no387' boot option
too early: no387 clears X86_FEATURE_FPU before it's even probed, so
the boot CPU promptly re-enables it.
I suspect it gets even more confused on SMP.
Fix the probing code to leave X86_FEATURE_FPU off if it's been
disabled by setup_clear_cpu_cap().
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: yu-cheng yu <yu-cheng.yu@intel.com>
Fixes: 4f81cbafcce2 ("x86/fpu: Fix early FPU command-line parsing")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
we have to check bit 40 of the facility list before issuing LPP
and not bit 48. Otherwise a guest running on a system with
"The decimal-floating-point zoned-conversion facility" and without
the "The set-program-parameters facility" might crash on an lpp
instruction.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: e22cf8ca6f75 ("s390/cpumf: rework program parameter setting to detect guest samples")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
With MACHINE_HAS_VX, we convert the floating point registers from the
vector registeres when storing the status. For other VCPUs, these are
stored to vcpu->run->s.regs.vrs, but we are using current->thread.fpu.vxrs,
which resolves to the currently loaded VCPU.
So kvm_s390_store_status_unloaded() currently writes the wrong floating
point registers (converted from the vector registers) when called from
another VCPU on a z13.
This is only the case for old user space not handling SIGP STORE STATUS and
SIGP STOP AND STORE STATUS, but relying on the kernel implementation. All
other calls come from the loaded VCPU via kvm_s390_store_status().
Fixes: 9abc2a08a7d6 (KVM: s390: fix memory overwrites when vx is disabled)
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull ARM fixes from Russell King:
"Just two ARM fixes this time: one to fix the hyp-stub for older ARM
CPUs, and another to fix the set_memory_xx() permission functions to
deal with zero sizes correctly"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8544/1: set_memory_xx fixes
ARM: 8534/1: virt: fix hyp-stub build for pre-ARMv7 CPUs
Merge "Second Round of Renesas ARM Based SoC DT Fixes for v4.5" from Simon Horman:
* remove enable prop from HS-USB device node on porter board
* tag 'renesas-dt-fixes2-for-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: dts: porter: remove enable prop from HS-USB device node
Pull media fix from Mauro Carvalho Chehab:
"One last time fix: It adds a code that prevents some media tools like
media-ctl to hide some entities that have their IDs out of the range
expected by those apps"
* tag 'media/v4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] media-device: map new functions into old types for legacy API
When the Crypto SRAM mappings were added to the Device Tree files
describing the Armada XP boards in commit c466d997bb16 ("ARM: mvebu:
define crypto SRAM ranges for all armada-xp boards"), the fact that
those mappings were overlaping with the PCIe memory aperture was
overlooked. Due to this, we currently have for all Armada XP platforms
a situation that looks like this:
Memory mapping on Armada XP boards with internal registers at
0xf1000000:
- 0x00000000 -> 0xf0000000 3.75G RAM
- 0xf0000000 -> 0xf1000000 16M NOR flashes (AXP GP / AXP DB)
- 0xf1000000 -> 0xf1100000 1M internal registers
- 0xf8000000 -> 0xffe0000 126M PCIe memory aperture
- 0xf8100000 -> 0xf8110000 64KB Crypto SRAM #0 => OVERLAPS WITH PCIE !
- 0xf8110000 -> 0xf8120000 64KB Crypto SRAM #1 => OVERLAPS WITH PCIE !
- 0xffe00000 -> 0xfff00000 1M PCIe I/O aperture
- 0xfff0000 -> 0xffffffff 1M BootROM
The overlap means that when PCIe devices are added, depending on their
memory window needs, they might or might not be mapped into the
physical address space. Indeed, they will not be mapped if the area
allocated in the PCIe memory aperture by the PCI core overlaps with
one of the Crypto SRAM. Typically, a Intel IGB PCIe NIC that needs 8MB
of PCIe memory will see its PCIe memory window allocated from
0xf80000000 for 8MB, which overlaps with the Crypto SRAM windows. Due
to this, the PCIe window is not created, and any attempt to access the
PCIe window makes the kernel explode:
[ 3.302213] igb: Copyright (c) 2007-2014 Intel Corporation.
[ 3.307841] pci 0000:00:09.0: enabling device (0140 -> 0143)
[ 3.313539] mvebu_mbus: cannot add window '4:f8', conflicts with another window
[ 3.320870] mvebu-pcie soc:pcie-controller: Could not create MBus window at [mem 0xf8000000-0xf87fffff]: -22
[ 3.330811] Unhandled fault: external abort on non-linefetch (0x1008) at 0xf08c0018
This problem does not occur on Armada 370 boards, because we use the
following memory mapping (for boards that have internal registers at
0xf1000000):
- 0x00000000 -> 0xf0000000 3.75G RAM
- 0xf0000000 -> 0xf1000000 16M NOR flashes (AXP GP / AXP DB)
- 0xf1000000 -> 0xf1100000 1M internal registers
- 0xf1100000 -> 0xf1110000 64KB Crypto SRAM #0 => OK !
- 0xf8000000 -> 0xffe0000 126M PCIe memory
- 0xffe00000 -> 0xfff00000 1M PCIe I/O
- 0xfff0000 -> 0xffffffff 1M BootROM
Obviously, the solution is to align the location of the Crypto SRAM
mappings of Armada XP to be similar with the ones on Armada 370, i.e
have them between the "internal registers" area and the beginning of
the PCIe aperture.
However, we have a special case with the OpenBlocks AX3-4 platform,
which has a 128 MB NOR flash. Currently, this NOR flash is mapped from
0xf0000000 to 0xf8000000. This is possible because on OpenBlocks
AX3-4, the internal registers are not at 0xf1000000. And this explains
why the Crypto SRAM mappings were not configured at the same place on
Armada XP.
Hence, the solution is two-fold:
(1) Move the NOR flash mapping on Armada XP OpenBlocks AX3-4 from
0xe8000000 to 0xf0000000. This frees the 0xf0000000 ->
0xf80000000 space.
(2) Move the Crypto SRAM mappings on Armada XP to be similar to
Armada 370 (except of course that Armada XP has two Crypto SRAM
and not one).
After this patch, the memory mapping on Armada XP boards with
registers at 0xf1 is:
- 0x00000000 -> 0xf0000000 3.75G RAM
- 0xf0000000 -> 0xf1000000 16M NOR flashes (AXP GP / AXP DB)
- 0xf1000000 -> 0xf1100000 1M internal registers
- 0xf1100000 -> 0xf1110000 64KB Crypto SRAM #0
- 0xf1110000 -> 0xf1120000 64KB Crypto SRAM #1
- 0xf8000000 -> 0xffe0000 126M PCIe memory
- 0xffe00000 -> 0xfff00000 1M PCIe I/O
- 0xfff0000 -> 0xffffffff 1M BootROM
And the memory mapping for the special case of the OpenBlocks AX3-4
(internal registers at 0xd0000000, NOR of 128 MB):
- 0x00000000 -> 0xc0000000 3G RAM
- 0xd0000000 -> 0xd1000000 1M internal registers
- 0xe800000 -> 0xf0000000 128M NOR flash
- 0xf1100000 -> 0xf1110000 64KB Crypto SRAM #0
- 0xf1110000 -> 0xf1120000 64KB Crypto SRAM #1
- 0xf8000000 -> 0xffe0000 126M PCIe memory
- 0xffe00000 -> 0xfff00000 1M PCIe I/O
- 0xfff0000 -> 0xffffffff 1M BootROM
Fixes: c466d997bb16 ("ARM: mvebu: define crypto SRAM ranges for all armada-xp boards")
Reported-by: Phil Sutter <phil@nwl.cc>
Cc: Phil Sutter <phil@nwl.cc>
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
When a directory is deleted, we don't take too much care about killing off
all the dirents that belong to it — on the basis that on remount, the scan
will conclude that the directory is dead anyway.
This doesn't work though, when the deleted directory contained a child
directory which was moved *out*. In the early stages of the fs build
we can then end up with an apparent hard link, with the child directory
appearing both in its true location, and as a child of the original
directory which are this stage of the mount process we don't *yet* know
is defunct.
To resolve this, take out the early special-casing of the "directories
shall not have hard links" rule in jffs2_build_inode_pass1(), and let the
normal nlink processing happen for directories as well as other inodes.
Then later in the build process we can set ic->pino_nlink to the parent
inode#, as is required for directories during normal operaton, instead
of the nlink. And complain only *then* about hard links which are still
in evidence even after killing off all the unreachable paths.
Reported-by: Liu Song <liu.song11@zte.com.cn>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@vger.kernel.org
Pull powerpc fixes from Michael Ellerman:
- Fix build error on 32-bit with checkpoint restart from Aneesh Kumar
- Fix dedotify for binutils >= 2.26 from Andreas Schwab
- Don't trace hcalls on offline CPUs from Denis Kirjanov
- eeh: Fix stale cached primary bus from Gavin Shan
- eeh: Fix stale PE primary bus from Gavin Shan
- mm: Fix Multi hit ERAT cause by recent THP update from Aneesh Kumar K.V
- ioda: Set "read" permission when "write" is set from Alexey Kardashevskiy
* tag 'powerpc-4.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/ioda: Set "read" permission when "write" is set
powerpc/mm: Fix Multi hit ERAT cause by recent THP update
powerpc/powernv: Fix stale PE primary bus
powerpc/eeh: Fix stale cached primary bus
powerpc/pseries: Don't trace hcalls on offline CPUs
powerpc: Fix dedotify for binutils >= 2.26
powerpc/book3s_32: Fix build error with checkpoint restart
If CPU_UP_PREPARE is called it is not guaranteed, that a previously allocated
and assigned hash has been freed already, but perf_event_init_cpu()
unconditionally allocates and assignes a new hash if the swhash is referenced.
By overwriting the pointer the existing hash is not longer accessible.
Verify that there is no hash assigned on this cpu before allocating and
assigning a new one.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20160209201007.843269966@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Looks like the HPET spec at intel.com got moved.
It isn't hard to find so drop the link, just mention
the revision assumed.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/1455145462-3877-1-git-send-email-mst@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull networking fixes from David Miller:
1) Fix ordering of WEXT netlink messages so we don't see a newlink
after a dellink, from Johannes Berg.
2) Out of bounds access in minstrel_ht_set_best_prob_rage, from
Konstantin Khlebnikov.
3) Paging buffer memory leak in iwlwifi, from Matti Gottlieb.
4) Wrong units used to set initial TCP rto from cached metrics, also
from Konstantin Khlebnikov.
5) Fix stale IP options data in the SKB control block from leaking
through layers of encapsulation, from Bernie Harris.
6) Zero padding len miscalculated in bnxt_en, from Michael Chan.
7) Only CHECKSUM_PARTIAL packets should be passed down through GSO, fix
from Hannes Frederic Sowa.
8) Fix suspend/resume with JME networking devices, from Diego Violat
and Guo-Fu Tseng.
9) Checksums not validated properly in bridge multicast support due to
the placement of the SKB header pointers at the time of the check,
fix from Álvaro Fernández Rojas.
10) Fix hang/tiemout with r8169 if a stats fetch is done while the
device is runtime suspended. From Chun-Hao Lin.
11) The forwarding database netlink dump facilities don't track the
state of the dump properly, resulting in skipped/missed entries.
From Minoura Makoto.
12) Fix regression from a recent 3c59x bug fix, from Neil Horman.
13) Fix list corruption in bna driver, from Ivan Vecera.
14) Big endian machines crash on vlan add in bnx2x, fix from Michal
Schmidt.
15) Ethtool RSS configuration not propagated properly in mlx5 driver,
from Tariq Toukan.
16) Fix regression in PHY probing in stmmac driver, from Gabriel
Fernandez.
17) Fix SKB tailroom calculation in igmp/mld code, from Benjamin
Poirier.
18) A past change to skip empty routing headers in ipv6 extention header
parsing accidently caused fragment headers to not be matched any
longer. Fix from Florian Westphal.
19) eTSEC-106 erratum needs to be applied to more gianfar chips, from
Atsushi Nemoto.
20) Fix netdev reference after free via workqueues in usb networking
drivers, from Oliver Neukum and Bjørn Mork.
21) mdio->irq is now an array rather than a pointer to dynamic memory,
but several drivers were still trying to free it :-/ Fixes from
Colin Ian King.
22) act_ipt iptables action forgets to set the family field, thus LOG
netfilter targets don't work with it. Fix from Phil Sutter.
23) SKB leak in ibmveth when skb_linearize() fails, from Thomas Falcon.
24) pskb_may_pull() cannot be called with interrupts disabled, fix code
that tries to do this in vmxnet3 driver, from Neil Horman.
25) be2net driver leaks iomap'd memory on removal, fix from Douglas
Miller.
26) Forgotton RTNL mutex unlock in ppp_create_interface() error paths,
from Guillaume Nault.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (97 commits)
ppp: release rtnl mutex when interface creation fails
cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind
tcp: fix tcpi_segs_in after connection establishment
net: hns: fix the bug about loopback
jme: Fix device PM wakeup API usage
jme: Do not enable NIC WoL functions on S0
udp6: fix UDP/IPv6 encap resubmit path
be2net: Don't leak iomapped memory on removal.
vmxnet3: avoid calling pskb_may_pull with interrupts disabled
net: ethernet: Add missing MFD_SYSCON dependency on HAS_IOMEM
ibmveth: check return of skb_linearize in ibmveth_start_xmit
cdc_ncm: toggle altsetting to force reset before setup
usbnet: cleanup after bind() in probe()
mlxsw: pci: Correctly determine if descriptor queue is full
mlxsw: spectrum: Always decrement bridge's ref count
tipc: fix nullptr crash during subscription cancel
net: eth: altera: do not free array priv->mdio->irq
net/ethoc: do not free array priv->mdio->irq
net: sched: fix act_ipt for LOG target
asix: do not free array priv->mdio->irq
...
Pull drm fixes from Dave Airlie:
"A few imx fixes I missed from a couple of weeks ago, they still aren't
that big and fix some regression and a fail to boot problem.
Other than that, a couple of regression fixes for radeon/amdgpu, one
regression fix for vmwgfx and one regression fix for tda998x"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
Revert "drm/radeon/pm: adjust display configuration after powerstate"
drm/amdgpu/dp: add back special handling for NUTMEG
drm/radeon/dp: add back special handling for NUTMEG
drm/i2c: tda998x: Choose between atomic or non atomic dpms helper
drm/vmwgfx: Add back ->detect() and ->fill_modes()
drm/radeon: Fix error handling in radeon_flip_work_func.
drm/amdgpu: Fix error handling in amdgpu_flip_work_func.
drm/imx: Add missing DRM_FORMAT_RGB565 to ipu_plane_formats
drm/imx: notify DRM core about CRTC vblank state
gpu: ipu-v3: Reset IPU before activating IRQ
gpu: ipu-v3: Do not bail out on missing optional port nodes
In commit bcff24887d00 ("ext4: don't read blocks from disk after extents
being swapped") bh is not updated correctly in the for loop and wrong
data has been written to disk. generic/324 catches this on sub-page
block size ext4.
Fixes: bcff24887d00 ("ext4: don't read blocks from disk after extentsbeing swapped")
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
git commit 1ec2772e0c3c ("s390/diag: add a statistic for diagnose
calls") added function calls to gather diagnose statistics.
In case of the dasd diag driver the function call was added between a
register asm statement which initialized register r2 and the inline
assembly itself. The function call clobbers the contents of register
r2 and therefore the diag 0x250 call behaves in a more or less random
way.
Fix this by extracting the function call into a separate function like
we do everywhere else.
Fixes: 1ec2772e0c3c ("s390/diag: add a statistic for diagnose calls")
Cc: <stable@vger.kernel.org> # 4.4+
Reported-and-tested-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>