commits
The 'move_paghes()' system call was introduced long long ago with the
same permission checks as for sending a signal (except using
CAP_SYS_NICE instead of CAP_SYS_KILL for the overriding capability).
That turns out to not be a great choice - while the system call really
only moves physical page allocations around (and you need other
capabilities to do a lot of it), you can check the return value to map
out some the virtual address choices and defeat ASLR of a binary that
still shares your uid.
So change the access checks to the more common 'ptrace_may_access()'
model instead.
This tightens the access checks for the uid, and also effectively
changes the CAP_SYS_NICE check to CAP_SYS_PTRACE, but it's unlikely that
anybody really _uses_ this legacy system call any more (we hav ebetter
NUMA placement models these days), so I expect nobody to notice.
Famous last words.
Reported-by: Otto Ebeling <otto.ebeling@iki.fi>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull x86 fixes from Thomas Gleixner:
"Another pile of small fixes and updates for x86:
- Plug a hole in the SMAP implementation which misses to clear AC on
NMI entry
- Fix the norandmaps/ADDR_NO_RANDOMIZE logic so the command line
parameter works correctly again
- Use the proper accessor in the startup64 code for next_early_pgt to
prevent accessing of invalid addresses and faulting in the early
boot code.
- Prevent CPU hotplug lock recursion in the MTRR code
- Unbreak CPU0 hotplugging
- Rename overly long CPUID bits which got introduced in this cycle
- Two commits which mark data 'const' and restrict the scope of data
and functions to file scope by making them 'static'"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Constify attribute_group structures
x86/boot/64/clang: Use fixup_pointer() to access 'next_early_pgt'
x86/elf: Remove the unnecessary ADDR_NO_RANDOMIZE checks
x86: Fix norandmaps/ADDR_NO_RANDOMIZE
x86/mtrr: Prevent CPU hotplug lock recursion
x86: Mark various structures and functions as 'static'
x86/cpufeature, kvm/svm: Rename (shorten) the new "virtualized VMSAVE/VMLOAD" CPUID flag
x86/smpboot: Unbreak CPU0 hotplug
x86/asm/64: Clear AC on NMI entries
Pull timer fixes from Thomas Gleixner:
"A few small fixes for timer drivers:
- Prevent infinite recursion in the arm architected timer driver with
ftrace
- Propagate error codes to the caller in case of failure in EM STI
driver
- Adjust a bogus loop iteration in the arm architected timer driver
- Add a missing Kconfig dependency to the pistachio clocksource to
prevent build failures
- Correctly check for IS_ERR() instead of NULL in the shared timer-of
code"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/drivers/arm_arch_timer: Avoid infinite recursion when ftrace is enabled
clocksource/drivers/Kconfig: Fix CLKSRC_PISTACHIO dependencies
clocksource/drivers/timer-of: Checking for IS_ERR() instead of NULL
clocksource/drivers/em_sti: Fix error return codes in em_sti_probe()
clocksource/drivers/arm_arch_timer: Fix mem frame loop initialization
attribute_groups are not supposed to change at runtime and none of the
groups is modified.
Mark the non-const structs as const.
[ tglx: Folded into one big patch ]
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: tony.luck@intel.com
Cc: bp@alien8.de
Link: http://lkml.kernel.org/r/1500550238-15655-2-git-send-email-arvind.yadav.cs@gmail.com
Pull perf fixes from Thomas Gleixner:
"Two fixes for the perf subsystem:
- Fix an inconsistency of RDPMC mm struct tagging across exec() which
causes RDPMC to fault.
- Correct the timestamp mechanics across IOC_DISABLE/ENABLE which
causes incorrect timestamps and total time calculations"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix time on IOC_ENABLE
perf/x86: Fix RDPMC vs. mm_struct tracking
Pull clockevents fixes from Daniel Lezcano:
" - Fix error check against IS_ERR() instead of NULL for the timer-of code (Dan Carpenter)
- Fix infinite recusion with ftrace for the ARM architected timer (Ding Tianhong)
- Fix the error code return in the em_sti's probe function (Gustavo A. R. Silva)
- Fix Kconfig dependency for the pistachio driver (Matt Redfearn)
- Fix mem frame loop initialization for the ARM architected timer (Matthias Kaehlcke)"
Signed-off-by: Ingo Molnar <mingo@kernel.org>
__startup_64() is normally using fixup_pointer() to access globals in a
position-independent fashion. However 'next_early_pgt' was accessed
directly, which wasn't guaranteed to work.
Luckily GCC was generating a R_X86_64_PC32 PC-relative relocation for
'next_early_pgt', but Clang emitted a R_X86_64_32S, which led to
accessing invalid memory and rebooting the kernel.
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Davidson <md@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: c88d71508e36 ("x86/boot/64: Rewrite startup_64() in C")
Link: http://lkml.kernel.org/r/20170816190808.131748-1-glider@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull irq fixes from Thomas Gleixner:
"A pile of smallish changes all over the place:
- Add a missing ISB in the GIC V1 driver
- Remove an ACPI version check in the GIC V3 ITS driver
- Add the missing irq_pm_shutdown function for BRCMSTB-L2 to avoid
spurious wakeups
- Remove the artifical limitation of ITS instances to the number of
NUMA nodes which prevents utilizing the ITS hardware correctly
- Prevent a infinite parsing loop in the GIC-V3 ITS/MSI code
- Honour the force affinity argument in the GIC-V3 driver which is
required to make perf work correctly
- Correctly report allocation failures in GIC-V2/V3 to avoid using
half allocated and initialized interrupts.
- Fixup checks against nr_cpu_ids in the generic IPI code"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/ipi: Fixup checks against nr_cpu_ids
genirq: Restore trigger settings in irq_modify_status()
MAINTAINERS: Remove Jason Cooper's irqchip git tree
irqchip/gic-v3-its-platform-msi: Fix msi-parent parsing loop
irqchip/gic-v3-its: Allow GIC ITS number more than MAX_NUMNODES
irqchip: brcmstb-l2: Define an irq_pm_shutdown function
irqchip/gic: Ensure we have an ISB between ack and ->handle_irq
irqchip/gic-v3-its: Remove ACPICA version check for ACPI NUMA
irqchip/gic-v3: Honor forced affinity setting
irqchip/gic-v3: Report failures in gic_irq_domain_alloc
irqchip/gic-v2: Report failures in gic_irq_domain_alloc
irqchip/atmel-aic: Remove root argument from ->fixup() prototype
irqchip/atmel-aic: Fix unbalanced refcount in aic_common_rtc_irq_fixup()
irqchip/atmel-aic: Fix unbalanced of_node_put() in aic_common_irq_fixup()
Vince reported that when we do IOC_ENABLE/IOC_DISABLE while the task
is SIGSTOP'ed state the timestamps go wobbly.
It turns out we indeed fail to correctly account time while in 'OFF'
state and doing IOC_ENABLE without getting scheduled in exposes the
problem.
Further thinking about this problem, it occurred to me that we can
suffer a similar fate when we migrate an uncore event between CPUs.
The perf_event_install() on the 'new' CPU will do add_event_to_ctx()
which will reset all the time stamp, resulting in a subsequent
update_event_times() to overwrite the total_time_* fields with smaller
values.
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull NFS client fixes from Anna Schumaker:
"A few more NFS client bugfixes from me for rc5.
Dros has a stable fix for flexfiles to prevent leaking the
nfs4_ff_ds_version arrays when freeing a layout, Trond fixed a
potential recovery loop situation with the TEST_STATEID operation, and
Christoph fixed up the pNFS blocklayout Kconfig options to prevent
unsafe use with kernels that don't have large block device support.
Summary:
Stable fix:
- fix leaking nfs4_ff_ds_version array
Other fixes:
- improve TEST_STATEID OLD_STATEID handling to prevent recovery loop
- require 64-bit sector_t for pNFS blocklayout to prevent 32-bit
compile errors"
* tag 'nfs-for-4.13-5' of git://git.linux-nfs.org/projects/anna/linux-nfs:
pnfs/blocklayout: require 64-bit sector_t
NFSv4: Ignore NFS4ERR_OLD_STATEID in nfs41_check_open_stateid()
nfs/flexfiles: fix leak of nfs4_ff_ds_version arrays
On platforms with an arch timer erratum workaround, it's possible for
arch_timer_reg_read_stable() to recurse into itself when certain
tracing options are enabled, leading to stack overflows and related
problems.
For example, when PREEMPT_TRACER and FUNCTION_GRAPH_TRACER are
selected, it's possible to trigger this with:
$ mount -t debugfs nodev /sys/kernel/debug/
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer
The problem is that in such cases, preempt_disable() instrumentation
attempts to acquire a timestamp via trace_clock(), resulting in a call
back to arch_timer_reg_read_stable(), and hence recursion.
This patch changes arch_timer_reg_read_stable() to use
preempt_{disable,enable}_notrace(), which avoids this.
This problem is similar to the fixed by upstream commit 96b3d28bf4
("sched/clock: Prevent tracing recursion in sched_clock_cpu()").
Fixes: 6acc71ccac71 ("arm64: arch_timer: Allows a CPU-specific erratum to only affect a subset of CPUs")
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The ADDR_NO_RANDOMIZE checks in stack_maxrandom_size() and
randomize_stack_top() are not required.
PF_RANDOMIZE is set by load_elf_binary() only if ADDR_NO_RANDOMIZE is not
set, no need to re-check after that.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: stable@vger.kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: http://lkml.kernel.org/r/20170815154011.GB1076@redhat.com
Pull watchdog fix from Thomas Gleixner:
"A fix for the hardlockup watchdog to prevent false positives with
extreme Turbo-Modes which make the perf/NMI watchdog fire faster than
the hrtimer which is used to verify.
Slightly larger than the minimal fix, which just would increase the
hrtimer frequency, but comes with extra overhead of more watchdog
timer interrupts and thread wakeups for all users.
With this change we restrict the overhead to the extreme Turbo-Mode
systems"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kernel/watchdog: Prevent false positives with turbo modes
Valid CPU ids are [0, nr_cpu_ids-1] inclusive.
Fixes: 3b8e29a82dd1 ("genirq: Implement ipi_send_mask/single()")
Fixes: f9bce791ae2a ("genirq: Add a new function to get IPI reverse mapping")
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170819095751.GB27864@avx2
Vince reported the following rdpmc() testcase failure:
> Failing test case:
>
> fd=perf_event_open();
> addr=mmap(fd);
> exec() // without closing or unmapping the event
> fd=perf_event_open();
> addr=mmap(fd);
> rdpmc() // GPFs due to rdpmc being disabled
The problem is of course that exec() plays tricks with what is
current->mm, only destroying the old mappings after having
installed the new mm.
Fix this confusion by passing along vma->vm_mm instead of relying on
current->mm.
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: 1e0fb9ec679c ("perf: Add pmu callbacks to track event mapping and unmapping")
Link: http://lkml.kernel.org/r/20170802173930.cstykcqefmqt7jau@hirez.programming.kicks-ass.net
[ Minor cleanups. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull block fixes from Jens Axboe:
"A set of fixes that should go into this series. This contains:
- Fix from Bart for blk-mq requeue queue running, preventing a
continued loop of run/restart.
- Fix for a bio/blk-integrity issue, in two parts. One from
Christoph, fixing where verification happens, and one from Milan,
for a NULL profile.
- NVMe pull request, most of the changes being for nvme-fc, but also
a few trivial core/pci fixes"
* 'for-linus' of git://git.kernel.dk/linux-block:
nvme: fix directive command numd calculation
nvme: fix nvme reset command timeout handling
nvme-pci: fix CMB sysfs file removal in reset path
lpfc: support nvmet_fc defer_rcv callback
nvmet_fc: add defer_req callback for deferment of cmd buffer return
nvme: strip trailing 0-bytes in wwid_show
block: Make blk_mq_delay_kick_requeue_list() rerun the queue at a quiet time
bio-integrity: only verify integrity on the lowest stacked driver
bio-integrity: Fix regression if profile verify_fn is NULL
The blocklayout code does not compile cleanly for a 32-bit sector_t,
and also has no reliable checks for devices sizes, which makes it
unsafe to use with a kernel that doesn't support large block devices.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 5c83746a0cf2 ("pnfs/blocklayout: in-kernel GETDEVICEINFO XDR parsing")
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
In v4.13, CLKSRC_PISTACHIO can select TIMER_OF on architectures without
GENERIC_CLOCKEVENTS, resulting in a struct clock_event_device missing
some required features and build breakage compiling timer_of.c. One of
the symbols selecting TIMER_OF is CLKSRC_PISTACHIO, so add the
dependency on GENERIC_CLOCKEVENTS.
Thanks to kbuild test robot for finding this error
(https://lkml.org/lkml/2017/7/16/249)
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Suggested-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Documentation/admin-guide/kernel-parameters.txt says:
norandmaps Don't use address space randomization. Equivalent
to echo 0 > /proc/sys/kernel/randomize_va_space
but it doesn't work because arch_rnd() which is used to randomize
mm->mmap_base returns a random value unconditionally. And as Kirill
pointed out, ADDR_NO_RANDOMIZE is broken by the same reason.
Just shift the PF_RANDOMIZE check from arch_mmap_rnd() to arch_rnd().
Fixes: 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: stable@vger.kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20170815153952.GA1076@redhat.com
Merge misc fixes from Andrew Morton:
"14 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes
mm/vmalloc.c: don't unconditonally use __GFP_HIGHMEM
mm/mempolicy: fix use after free when calling get_mempolicy
mm/cma_debug.c: fix stack corruption due to sprintf usage
signal: don't remove SIGNAL_UNKILLABLE for traced tasks.
mm, oom: fix potential data corruption when oom_reaper races with writer
mm: fix double mmap_sem unlock on MMF_UNSTABLE enforced SIGBUS
slub: fix per memcg cache leak on css offline
mm: discard memblock data later
test_kmod: fix description for -s -and -c parameters
kmod: fix wait on recursive loop
wait: add wait_event_killable_timeout()
kernel/watchdog: fix Kconfig constraints for perf hardlockup watchdog
mm: memcontrol: fix NULL pointer crash in test_clear_page_writeback()
The hardlockup detector on x86 uses a performance counter based on unhalted
CPU cycles and a periodic hrtimer. The hrtimer period is about 2/5 of the
performance counter period, so the hrtimer should fire 2-3 times before the
performance counter NMI fires. The NMI code checks whether the hrtimer
fired since the last invocation. If not, it assumess a hard lockup.
The calculation of those periods is based on the nominal CPU
frequency. Turbo modes increase the CPU clock frequency and therefore
shorten the period of the perf/NMI watchdog. With extreme Turbo-modes (3x
nominal frequency) the perf/NMI period is shorter than the hrtimer period
which leads to false positives.
A simple fix would be to shorten the hrtimer period, but that comes with
the side effect of more frequent hrtimer and softlockup thread wakeups,
which is not desired.
Implement a low pass filter, which checks the perf/NMI period against
kernel time. If the perf/NMI fires before 4/5 of the watchdog period has
elapsed then the event is ignored and postponed to the next perf/NMI.
That solves the problem and avoids the overhead of shorter hrtimer periods
and more frequent softlockup thread wakeups.
Fixes: 58687acba592 ("lockup_detector: Combine nmi_watchdog and softlockup detector")
Reported-and-tested-by: Kan Liang <Kan.liang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: dzickus@redhat.com
Cc: prarit@redhat.com
Cc: ak@linux.intel.com
Cc: babu.moger@oracle.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: acme@redhat.com
Cc: stable@vger.kernel.org
Cc: atomlin@redhat.com
Cc: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1708150931310.1886@nanos
irq_modify_status starts by clearing the trigger settings from
irq_data before applying the new settings, but doesn't restore them,
leaving them to IRQ_TYPE_NONE.
That's pretty confusing to the potential request_irq() that could
follow. Instead, snapshot the settings before clearing them, and restore
them if the irq_modify_status() invocation was not changing the trigger.
Fixes: 1e2a7d78499e ("irqdomain: Don't set type when mapping an IRQ")
Reported-and-tested-by: jeffy <jeffy.chen@rock-chips.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jon Hunter <jonathanh@nvidia.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170818095345.12378-1-marc.zyngier@arm.com
Pull pin control fixes from Linus Walleij:
"These are the pin control fixes I have gathered since the return from
my vacation. They boiled in -next a while so let's get them in.
Apart from the documentation build it is purely driver fixes. Which is
nice. The Intel fixes seem kind of important.
- Fix the documentation build as the docs were moved
- Correct the UART pin list on the Intel Merrifield
- Fix pin assignment and number of pins on the Marvell Armada 37xx
pin controller
- Cover the Setzer models in the Chromebook DMI quirk in the Intel
cheryview driver so they start working
- Add the missing "sim" function to the sunxi driver
- Fix USB pin definitions on Uniphier Pro4
- Smatch fix for invalid reference in the zx pin control driver"
* tag 'pinctrl-v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: generic: update references to Documentation/pinctrl.txt
pinctrl: intel: merrifield: Correct UART pin lists
pinctrl: armada-37xx: Fix number of pin in south bridge
pinctrl: armada-37xx: Fix the pin 23 on south bridge
pinctrl: cherryview: Add Setzer models to the Chromebook DMI quirk
pinctrl: sunxi: add a missing function of A10/A20 pinctrl driver
pinctrl: uniphier: fix USB3 pin assignment for Pro4
pinctrl: zte: fix dereference of 'data' in zx_set_mux()
Pull MMC fixes from Ulf Hansson:
"MMC core:
- fix lockdep splat when removing mmc_block module
- fix the logic for setting eMMC HS400ES signal voltage
MMC host:
- omap_hsmmc: add CMD23 capability to fix -EIO errors"
* tag 'mmc-v4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: block: fix lockdep splat when removing mmc_block module
mmc: mmc: correct the logic for setting HS400ES signal voltage
mmc: host: omap_hsmmc: Add CMD23 capability to omap_hsmmc driver
Pull NVMe fixes from Christoph:
"A few more small fixes - the fc/lpfc update is the biggest by far."
If the call to TEST_STATEID returns NFS4ERR_OLD_STATEID, then it just
means we raced with other calls to OPEN.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
The current code checks the return value of the of_io_request_and_map()
function as it was returning a NULL pointer in case of error.
However, it returns an error code encoded in the pointer return value, not a
NULL value. Fix this by checking the returned pointer against IS_ERR() and
return the error with PTR_ERR().
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Larry reported a CPU hotplug lock recursion in the MTRR code.
============================================
WARNING: possible recursive locking detected
systemd-udevd/153 is trying to acquire lock:
(cpu_hotplug_lock.rw_sem){.+.+.+}, at: [<c030fc26>] stop_machine+0x16/0x30
but task is already holding lock:
(cpu_hotplug_lock.rw_sem){.+.+.+}, at: [<c0234353>] mtrr_add_page+0x83/0x470
....
cpus_read_lock+0x48/0x90
stop_machine+0x16/0x30
mtrr_add_page+0x18b/0x470
mtrr_add+0x3e/0x70
mtrr_add_page() holds the hotplug rwsem already and calls stop_machine()
which acquires it again.
Call stop_machine_cpuslocked() instead.
Reported-and-tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1708140920250.1865@nanos
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Borislav Petkov <bp@suse.de>
Pull xfs fixes from Darrick Wong:
"A handful more bug fixes for you today.
Changes since last time:
- Don't leak resources when mount fails
- Don't accidentally clobber variables when looking for free inodes"
* tag 'xfs-4.13-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: don't leak quotacheck dquots when cow recovery
xfs: clear MS_ACTIVE after finishing log recovery
iomap: fix integer truncation issues in the zeroing and dirtying helpers
xfs: fix inobt inode allocation search optimization
Moving the x86_64 and arm64 PIE base from 0x555555554000 to 0x000100000000
broke AddressSanitizer. This is a partial revert of:
eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE")
02445990a96e ("arm64: move ELF_ET_DYN_BASE to 4GB / 4MB")
The AddressSanitizer tool has hard-coded expectations about where
executable mappings are loaded.
The motivation for changing the PIE base in the above commits was to
avoid the Stack-Clash CVEs that allowed executable mappings to get too
close to heap and stack. This was mainly a problem on 32-bit, but the
64-bit bases were moved too, in an effort to proactively protect those
systems (proofs of concept do exist that show 64-bit collisions, but
other recent changes to fix stack accounting and setuid behaviors will
minimize the impact).
The new 32-bit PIE base is fine for ASan (since it matches the ET_EXEC
base), so only the 64-bit PIE base needs to be reverted to let x86 and
arm64 ASan binaries run again. Future changes to the 64-bit PIE base on
these architectures can be made optional once a more dynamic method for
dealing with AddressSanitizer is found. (e.g. always loading PIE into
the mmap region for marked binaries.)
Link: http://lkml.kernel.org/r/20170807201542.GA21271@beast
Fixes: eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE")
Fixes: 02445990a96e ("arm64: move ELF_ET_DYN_BASE to 4GB / 4MB")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Kostya Serebryany <kcc@google.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jason's irqchip tree does not seem to have been updated for many months
now, remove it from the list of trees to avoid any possible confusion.
Jason says:
"Unfortunately, when I have time for irqchip, I don't always have the
time to properly follow up with pull-requests. So, for the time being,
I'll stick to reviewing as I can."
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Cc: marc.zyngier@arm.com
Link: http://lkml.kernel.org/r/20170727224733.8288-1-f.fainelli@gmail.com
Commit 65d8fc777f6d ("futex: Remove requirement for lock_page() in
get_futex_key()") removed an unnecessary lock_page() with the
side-effect that page->mapping needed to be treated very carefully.
Two defensive warnings were added in case any assumption was missed and
the first warning assumed a correct application would not alter a
mapping backing a futex key. Since merging, it has not triggered for
any unexpected case but Mark Rutland reported the following bug
triggering due to the first warning.
kernel BUG at kernel/futex.c:679!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 3695 Comm: syz-executor1 Not tainted 4.13.0-rc3-00020-g307fec773ba3 #3
Hardware name: linux,dummy-virt (DT)
task: ffff80001e271780 task.stack: ffff000010908000
PC is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679
LR is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679
pc : [<ffff00000821ac14>] lr : [<ffff00000821ac14>] pstate: 80000145
The fact that it's a bug instead of a warning was due to an unrelated
arm64 problem, but the warning itself triggered because the underlying
mapping changed.
This is an application issue but from a kernel perspective it's a
recoverable situation and the warning is unnecessary so this patch
removes the warning. The warning may potentially be triggered with the
following test program from Mark although it may be necessary to adjust
NR_FUTEX_THREADS to be a value smaller than the number of CPUs in the
system.
#include <linux/futex.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <unistd.h>
#define NR_FUTEX_THREADS 16
pthread_t threads[NR_FUTEX_THREADS];
void *mem;
#define MEM_PROT (PROT_READ | PROT_WRITE)
#define MEM_SIZE 65536
static int futex_wrapper(int *uaddr, int op, int val,
const struct timespec *timeout,
int *uaddr2, int val3)
{
syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3);
}
void *poll_futex(void *unused)
{
for (;;) {
futex_wrapper(mem, FUTEX_CMP_REQUEUE_PI, 1, NULL, mem + 4, 1);
}
}
int main(int argc, char *argv[])
{
int i;
mem = mmap(NULL, MEM_SIZE, MEM_PROT,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
printf("Mapping @ %p\n", mem);
printf("Creating futex threads...\n");
for (i = 0; i < NR_FUTEX_THREADS; i++)
pthread_create(&threads[i], NULL, poll_futex, NULL);
printf("Flipping mapping...\n");
for (;;) {
mmap(mem, MEM_SIZE, MEM_PROT,
MAP_FIXED | MAP_SHARED | MAP_ANONYMOUS, -1, 0);
}
return 0;
}
Reported-and-tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org # 4.7+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Update deprecated references to Documentation/pinctrl.txt since it has been
moved to Documentation/driver-api/pinctl.rst.
Signed-off-by: Ludovic Desroches <ludovic.desroches@o2linux.fr>
Fixes: 5a9b73832e9e ("pinctrl.txt: move it to the driver-api book")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Pull fbdev fixes from Bartlomiej Zolnierkiewicz:
- allow user to disable write combined mapping in efifb driver (Dave
Airlie)
- fix use after free bugs on driver removal in imxfb driver (Dan
Carpenter)
- fix unused variable warning in omapfb driver (Arnd Bergmann)
* tag 'fbdev-v4.13-rc5' of git://github.com/bzolnier/linux:
efifb: allow user to disable write combined mapping.
fbdev: omapfb: remove unused variable
video: fbdev: imxfb: use after free in imxfb_remove()
Fix lockdep splat introduced in v4.13-rc4.
[ 266.297226] ------------[ cut here ]------------
[ 266.300078] WARNING: CPU: 2 PID: 176 at /mnt/src/jaja/git/tf300t/include/linux/blkdev.h:657 mmc_blk_remove_req+0xd0/0xe8 [mmc_block]
[ 266.302937] Modules linked in: mmc_block(-) sdhci_tegra sdhci_pltfm sdhci pwrseq_simple pwrseq_emmc mmc_core
[ 266.305941] CPU: 2 PID: 176 Comm: rmmod Tainted: G W 4.13.0-rc4mq-00208-gb691e67724b8-dirty #694
[ 266.308852] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
[ 266.311719] [<b011144c>] (unwind_backtrace) from [<b010ca54>] (show_stack+0x18/0x1c)
[ 266.314664] [<b010ca54>] (show_stack) from [<b062e3f4>] (dump_stack+0x84/0x98)
[ 266.317644] [<b062e3f4>] (dump_stack) from [<b01214f4>] (__warn+0xf4/0x10c)
[ 266.320542] [<b01214f4>] (__warn) from [<b01215d4>] (warn_slowpath_null+0x28/0x30)
[ 266.323534] [<b01215d4>] (warn_slowpath_null) from [<af067858>] (mmc_blk_remove_req+0xd0/0xe8 [mmc_block])
[ 266.326568] [<af067858>] (mmc_blk_remove_req [mmc_block]) from [<af068f40>] (mmc_blk_remove_parts.constprop.6+0x50/0x64 [mmc_block])
[ 266.329678] [<af068f40>] (mmc_blk_remove_parts.constprop.6 [mmc_block]) from [<af0693b8>] (mmc_blk_remove+0x24/0x140 [mmc_block])
[ 266.332894] [<af0693b8>] (mmc_blk_remove [mmc_block]) from [<af0052ec>] (mmc_bus_remove+0x20/0x28 [mmc_core])
[ 266.336198] [<af0052ec>] (mmc_bus_remove [mmc_core]) from [<b046aa64>] (device_release_driver_internal+0x164/0x200)
[ 266.339367] [<b046aa64>] (device_release_driver_internal) from [<b046ab54>] (driver_detach+0x40/0x74)
[ 266.342537] [<b046ab54>] (driver_detach) from [<b046982c>] (bus_remove_driver+0x68/0xdc)
[ 266.345660] [<b046982c>] (bus_remove_driver) from [<af06ad40>] (mmc_blk_exit+0xc/0x2cc [mmc_block])
[ 266.348875] [<af06ad40>] (mmc_blk_exit [mmc_block]) from [<b01aee30>] (SyS_delete_module+0x1c4/0x254)
[ 266.352068] [<b01aee30>] (SyS_delete_module) from [<b0108480>] (ret_fast_syscall+0x0/0x34)
[ 266.355308] ---[ end trace f68728a0d3053b72 ]---
Fixes: 7c84b8b43d3d ("mmc: block: bypass the queue even if usage is present for hotplug")
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The blk_mq_delay_kick_requeue_list() function is used by the device
mapper and only by the device mapper to rerun the queue and requeue
list after a delay. This function is called once per request that
gets requeued. Modify this function such that the queue is run once
per path change event instead of once per request that is requeued.
Fixes: commit 2849450ad39d ("blk-mq: introduce blk_mq_delay_kick_requeue_list()")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The numd field of directive receive command takes number of dwords to
transfer. This fix has the correct calculation for numd.
Signed-off-by: Kwan (Hingkwan) Huen-SSI <kwan.huen@samsung.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The client was freeing the nfs4_ff_layout_ds, but not the contained
nfs4_ff_ds_version array.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Propagate the return values of platform_get_irq and devm_request_irq on
failure.
Cc: Frans Klaver <fransklaver@gmail.com>
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Mark a couple of structures and functions as 'static', pointed out by Sparse:
warning: symbol 'bts_pmu' was not declared. Should it be static?
warning: symbol 'p4_event_aliases' was not declared. Should it be static?
warning: symbol 'rapl_attr_groups' was not declared. Should it be static?
symbol 'process_uv2_message' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Andrew Banman <abanman@hpe.com> # for the UV change
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20170810155709.7094-1-colin.king@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull block fixes from Jens Axboe:
"A small set of fixes that should go into this release. This contains:
- An NVMe pull request from Christoph, with a few select fixes.
One of them fix a polling regression in this series, in which it's
trivial to cause the kernel to disable most of the hardware queue
interrupts.
- Fixup for a blk-mq queue usage imbalance on request allocation,
from Keith.
- A xen block pull request from Konrad, fixing two issues with
xen/xen-blkfront"
* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq-pci: add a fallback when pci_irq_get_affinity returns NULL
nvme-pci: set cqe_seen on polled completions
nvme-fabrics: fix reporting of unrecognized options
nvmet-fc: eliminate incorrect static markers on local variables
nvmet-fc: correct use after free on list teardown
nvmet: don't overwrite identify sn/fr with 0-bytes
xen-blkfront: use a right index when checking requests
xen: fix bio vec merging
blk-mq: Fix queue usage on failed request allocation
If we fail a mount on account of cow recovery errors, it's possible that
a previous quotacheck left some dquots in memory. The bailout clause of
xfs_mountfs forgets to purge these, and so we leak them. Fix that.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Commit 19809c2da28a ("mm, vmalloc: use __GFP_HIGHMEM implicitly") added
use of __GFP_HIGHMEM for allocations. vmalloc_32 may use
GFP_DMA/GFP_DMA32 which does not play nice with __GFP_HIGHMEM and will
trigger a BUG in gfp_zone.
Only add __GFP_HIGHMEM if we aren't using GFP_DMA/GFP_DMA32.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1482249
Link: http://lkml.kernel.org/r/20170816220705.31374-1-labbott@redhat.com
Fixes: 19809c2da28a ("mm, vmalloc: use __GFP_HIGHMEM implicitly")
Signed-off-by: Laura Abbott <labbott@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull MIPS fixes from Ralf Baechle:
"Another round of MIPS fixes:
- compressed boot: Ignore a generated .c file
- VDSO: Fix a register clobber list
- DECstation: Fix an int-handler.S CPU_DADDI_WORKAROUNDS regression
- Octeon: Fix recent cleanups that cleaned away a bit too much thus
breaking the arch side of the EDAC and USB drivers.
- uasm: Fix duplicate const in "const struct foo const bar[]" which
GCC 7.1 no longer accepts.
- Fix race on setting and getting cpu_online_mask
- Fix preemption issue. To do so cleanly introduce macro to get the
size of L3 cache line.
- Revert include cleanup that sometimes results in build error
- MicroMIPS uses bit 0 of the PC to indicate microMIPS mode. Make
sure this bit is set for kernel entry as well.
- Prevent configuring the kernel for both microMIPS and MT. There are
no such CPUs currently and thus the combination is unsupported and
results in build errors.
This has been sitting in linux-next for a few days and has survived
automated testing by Imagination's test farm. No known regressions
pending except a number of issues that crept up due to lots of people
switching to GCC 7.1"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Set ISA bit in entry-y for microMIPS kernels
MIPS: Prevent building MT support for microMIPS kernels
MIPS: PCI: Fix smp_processor_id() in preemptible
MIPS: Introduce cpu_tcache_line_size
MIPS: DEC: Fix an int-handler.S CPU_DADDI_WORKAROUNDS regression
MIPS: VDSO: Fix clobber lists in fallback code paths
Revert "MIPS: Don't unnecessarily include kmalloc.h into <asm/cache.h>."
MIPS: OCTEON: Fix USB platform code breakage.
MIPS: Octeon: Fix broken EDAC driver.
MIPS: gitignore: ignore generated .c files
MIPS: Fix race on setting and getting cpu_online_mask
MIPS: mm: remove duplicate "const" qualifier on insn_table
Pull irqchip fixes for 4.13 from Marc Zyngier
Mostly GIC related, again:
- GICv3 ITS NUMA handling fixes
- GICv3 force affinity handling
- Barrier adjustment in both GIC interrupt handling
- Error reporting when the DT presents an incompatible interrupt
- GICv3 platform MSI DT parsing bug fix
- Broadcom L2 PM fix
- Atmel AIC cleanups
Pull i2c fixes from Wolfram Sang:
"The main thing is to allow empty id_tables for ACPI to make some
drivers get probed again. It looks a bit bigger than usual because it
needs some internal renaming, too.
Other than that, there is a fix for broken DSTDs, a super simple
enablement for ARM MPS, and two documentation fixes which I'd like to
see in v4.13 already"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: rephrase explanation of I2C_CLASS_DEPRECATED
i2c: allow i2c-versatile for ARM MPS platforms
i2c: designware: Some broken DSTDs use 1MiHz instead of 1MHz
i2c: designware: Print clock freq on invalid clock freq error
i2c: core: Allow empty id_table in ACPI case as well
i2c: mux: pinctrl: mention correct module name in Kconfig help text
UART pin lists consist GPIO numbers which is simply wrong.
Replace it by pin numbers.
Fixes: 4e80c8f50574 ("pinctrl: intel: Add Intel Merrifield pin controller support")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Pull fuse fixes from Miklos Szeredi:
"Fix a few bugs in fuse"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: set mapping error in writepage_locked when it fails
fuse: Dont call set_page_dirty_lock() for ITER_BVEC pages for async_dio
fuse: initialize the flock flag in fuse_file on allocation
This patch allows the user to disable write combined mapping
of the efifb framebuffer console using an nowc option.
A customer noticed major slowdowns while logging to the console
with write combining enabled, on other tasks running on the same
CPU. (10x or greater slow down on all other cores on the same CPU
as is doing the logging).
I reproduced this on a machine with dual CPUs.
Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz (6 core)
I wrote a test that just mmaps the pci bar and writes to it in
a loop, while this was running in the background one a single
core with (taskset -c 1), building a kernel up to init/version.o
(taskset -c 8) went from 13s to 133s or so. I've yet to explain
why this occurs or what is going wrong I haven't managed to find
a perf command that in any way gives insight into this.
11,885,070,715 instructions # 1.39 insns per cycle
vs
12,082,592,342 instructions # 0.13 insns per cycle
is the only thing I've spotted of interest, I've tried at least:
dTLB-stores,dTLB-store-misses,L1-dcache-stores,LLC-store,LLC-store-misses,LLC-load-misses,LLC-loads,\mem-loads,mem-stores,iTLB-loads,iTLB-load-misses,cache-references,cache-misses
For now it seems at least a good idea to allow a user to disable write
combining if they see this until we can figure it out.
Note also most users get a real framebuffer driver loaded when kms
kicks in, it just happens on these machines the kernel didn't support
the gpu specific driver.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Acked-by: Peter Jones <pjones@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Change the default err value to -EINVAL, make sure the card only
has type EXT_CSD_CARD_TYPE_HS400_1_8V also do the signal voltage
setting when select hs400es mode.
Fixes: commit 1720d3545b77 ("mmc: core: switch to 1V8 or 1V2 for hs400es mode")
Cc: <stable@vger.kernel.org>
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This gets us back to the behavior in 4.12 and earlier.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Fixes: 7c20f116 ("bio-integrity: stop abusing bi_end_io")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We need to return an error if a timeout occurs on any NVMe command during
initialization. Without this, the nvme reset work will be stuck. A timeout
will have a negative error code, meaning we need to stop initializing
the controller. All postitive returns mean the controller is still usable.
bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196325
Signed-off-by: Keith Busch <keith.busch@intel.com>
Cc: Martin Peres <martin.peres@intel.com>
[jth consolidated cleanup path ]
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
rpc_clnt_add_xprt() expects the callback function to be synchronous, and
expects to release the transport and switch references itself.
Fixes: 04fa2c6bb51b1 ("NFS pnfs data server multipath session trunking")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
The loop to find the best memory frame in arch_timer_mem_acpi_init()
initializes the loop counter with itself ('i = i'), which is suspicious
in the first place and pointed out by clang. The loop condition is
'i < timer_count' and a prior for loop exits when 'i' reaches
'timer_count', therefore the second loop is never executed.
Initialize the loop counter with 0 to iterate over all timers, which
supposedly was the intention before the typo monster attacked.
Fixes: c2743a36765d3 ("clocksource: arm_arch_timer: add GTDT support for memory-mapped timer")
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
"virtual_vmload_vmsave" is what is going to land in /proc/cpuinfo now
as per v4.13-rc4, for a single feature bit which is clearly too long.
So rename it to what it is called in the processor manual.
"v_vmsave_vmload" is a bit shorter, after all.
We could go more aggressively here but having it the same as in the
processor manual is advantageous.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Radim Krčmář <rkrcmar@redhat.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: Jörg Rödel <joro@8bytes.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm-ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170801185552.GA3743@nazgul.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull rdma fixes from Doug Ledford:
"Fourth set of -rc fixes for 4.13 cycle. This is all of the -rc fixes
that we know of. I suspect this will be the last rc pull request, but
you never know, I could be wrong.
Nothing major here. There are the i40iw patches I mentioned in my last
pull request minus one that I pulled out because it wasn't a fix and
not appropriate for the rc cycle. Then a few other items trickled in
and were added to the pull request. It's fairly small aside from those
five i40iw patches
- Set of five i40iw fixes (the first of these is rather large by line
count consideration, but I decided to send it because if fixes a
legitimate issue and the line count is because it does so by
creating a new function and using it where needed instead of just
patching up a few lines...a smaller fix could probably be done, but
the larger fix is the better code solution)
- One vmw_pvrdma fix
- One hns_roce fix (this silences a checker warning, but can't
actually happen, I expect a patch to remove this from all drivers
that share this same check in for-next)
- One iw_cxgb4 fix
- Two IB core fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/uverbs: Fix NULL pointer dereference during device removal
IB/core: Protect sysfs entry on ib_unregister_device
iw_cxgb4: fix misuse of integer variable
IB/hns: fix memory leak on ah on error return path
i40iw: Fix potential fcn_id_array out of bounds
i40iw: Use correct alignment for CQ0 memory
i40iw: Fix typecast of tcp_seq_num
i40iw: Correct variable names
i40iw: Fix parsing of query/commit FPM buffers
RDMA/vmw_pvrdma: Report CQ missed events
While pci_irq_get_affinity should never fail for SMP kernel that
implement the affinity mapping, it will always return NULL in the
UP case, so provide a fallback mapping of all queues to CPU 0 in
that case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Way back when we established inode block-map redo log items, it was
discovered that we needed to prevent the VFS from evicting inodes during
log recovery because any given inode might be have bmap redo items to
replay even if the inode has no link count and is ultimately deleted,
and any eviction of an unlinked inode causes the inode to be truncated
and freed too early.
To make this possible, we set MS_ACTIVE so that inodes would not be torn
down immediately upon release. Unfortunately, this also results in the
quota inodes not being released at all if a later part of the mount
process should fail, because we never reclaim the inodes. So, set
MS_ACTIVE right before we do the last part of log recovery and clear it
immediately after we finish the log recovery so that everything
will be torn down properly if we abort the mount.
Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
The 'move_paghes()' system call was introduced long long ago with the
same permission checks as for sending a signal (except using
CAP_SYS_NICE instead of CAP_SYS_KILL for the overriding capability).
That turns out to not be a great choice - while the system call really
only moves physical page allocations around (and you need other
capabilities to do a lot of it), you can check the return value to map
out some the virtual address choices and defeat ASLR of a binary that
still shares your uid.
So change the access checks to the more common 'ptrace_may_access()'
model instead.
This tightens the access checks for the uid, and also effectively
changes the CAP_SYS_NICE check to CAP_SYS_PTRACE, but it's unlikely that
anybody really _uses_ this legacy system call any more (we hav ebetter
NUMA placement models these days), so I expect nobody to notice.
Famous last words.
Reported-by: Otto Ebeling <otto.ebeling@iki.fi>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull x86 fixes from Thomas Gleixner:
"Another pile of small fixes and updates for x86:
- Plug a hole in the SMAP implementation which misses to clear AC on
NMI entry
- Fix the norandmaps/ADDR_NO_RANDOMIZE logic so the command line
parameter works correctly again
- Use the proper accessor in the startup64 code for next_early_pgt to
prevent accessing of invalid addresses and faulting in the early
boot code.
- Prevent CPU hotplug lock recursion in the MTRR code
- Unbreak CPU0 hotplugging
- Rename overly long CPUID bits which got introduced in this cycle
- Two commits which mark data 'const' and restrict the scope of data
and functions to file scope by making them 'static'"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Constify attribute_group structures
x86/boot/64/clang: Use fixup_pointer() to access 'next_early_pgt'
x86/elf: Remove the unnecessary ADDR_NO_RANDOMIZE checks
x86: Fix norandmaps/ADDR_NO_RANDOMIZE
x86/mtrr: Prevent CPU hotplug lock recursion
x86: Mark various structures and functions as 'static'
x86/cpufeature, kvm/svm: Rename (shorten) the new "virtualized VMSAVE/VMLOAD" CPUID flag
x86/smpboot: Unbreak CPU0 hotplug
x86/asm/64: Clear AC on NMI entries
Pull timer fixes from Thomas Gleixner:
"A few small fixes for timer drivers:
- Prevent infinite recursion in the arm architected timer driver with
ftrace
- Propagate error codes to the caller in case of failure in EM STI
driver
- Adjust a bogus loop iteration in the arm architected timer driver
- Add a missing Kconfig dependency to the pistachio clocksource to
prevent build failures
- Correctly check for IS_ERR() instead of NULL in the shared timer-of
code"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/drivers/arm_arch_timer: Avoid infinite recursion when ftrace is enabled
clocksource/drivers/Kconfig: Fix CLKSRC_PISTACHIO dependencies
clocksource/drivers/timer-of: Checking for IS_ERR() instead of NULL
clocksource/drivers/em_sti: Fix error return codes in em_sti_probe()
clocksource/drivers/arm_arch_timer: Fix mem frame loop initialization
attribute_groups are not supposed to change at runtime and none of the
groups is modified.
Mark the non-const structs as const.
[ tglx: Folded into one big patch ]
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: tony.luck@intel.com
Cc: bp@alien8.de
Link: http://lkml.kernel.org/r/1500550238-15655-2-git-send-email-arvind.yadav.cs@gmail.com
Pull perf fixes from Thomas Gleixner:
"Two fixes for the perf subsystem:
- Fix an inconsistency of RDPMC mm struct tagging across exec() which
causes RDPMC to fault.
- Correct the timestamp mechanics across IOC_DISABLE/ENABLE which
causes incorrect timestamps and total time calculations"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix time on IOC_ENABLE
perf/x86: Fix RDPMC vs. mm_struct tracking
Pull clockevents fixes from Daniel Lezcano:
" - Fix error check against IS_ERR() instead of NULL for the timer-of code (Dan Carpenter)
- Fix infinite recusion with ftrace for the ARM architected timer (Ding Tianhong)
- Fix the error code return in the em_sti's probe function (Gustavo A. R. Silva)
- Fix Kconfig dependency for the pistachio driver (Matt Redfearn)
- Fix mem frame loop initialization for the ARM architected timer (Matthias Kaehlcke)"
Signed-off-by: Ingo Molnar <mingo@kernel.org>
__startup_64() is normally using fixup_pointer() to access globals in a
position-independent fashion. However 'next_early_pgt' was accessed
directly, which wasn't guaranteed to work.
Luckily GCC was generating a R_X86_64_PC32 PC-relative relocation for
'next_early_pgt', but Clang emitted a R_X86_64_32S, which led to
accessing invalid memory and rebooting the kernel.
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Davidson <md@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: c88d71508e36 ("x86/boot/64: Rewrite startup_64() in C")
Link: http://lkml.kernel.org/r/20170816190808.131748-1-glider@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull irq fixes from Thomas Gleixner:
"A pile of smallish changes all over the place:
- Add a missing ISB in the GIC V1 driver
- Remove an ACPI version check in the GIC V3 ITS driver
- Add the missing irq_pm_shutdown function for BRCMSTB-L2 to avoid
spurious wakeups
- Remove the artifical limitation of ITS instances to the number of
NUMA nodes which prevents utilizing the ITS hardware correctly
- Prevent a infinite parsing loop in the GIC-V3 ITS/MSI code
- Honour the force affinity argument in the GIC-V3 driver which is
required to make perf work correctly
- Correctly report allocation failures in GIC-V2/V3 to avoid using
half allocated and initialized interrupts.
- Fixup checks against nr_cpu_ids in the generic IPI code"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/ipi: Fixup checks against nr_cpu_ids
genirq: Restore trigger settings in irq_modify_status()
MAINTAINERS: Remove Jason Cooper's irqchip git tree
irqchip/gic-v3-its-platform-msi: Fix msi-parent parsing loop
irqchip/gic-v3-its: Allow GIC ITS number more than MAX_NUMNODES
irqchip: brcmstb-l2: Define an irq_pm_shutdown function
irqchip/gic: Ensure we have an ISB between ack and ->handle_irq
irqchip/gic-v3-its: Remove ACPICA version check for ACPI NUMA
irqchip/gic-v3: Honor forced affinity setting
irqchip/gic-v3: Report failures in gic_irq_domain_alloc
irqchip/gic-v2: Report failures in gic_irq_domain_alloc
irqchip/atmel-aic: Remove root argument from ->fixup() prototype
irqchip/atmel-aic: Fix unbalanced refcount in aic_common_rtc_irq_fixup()
irqchip/atmel-aic: Fix unbalanced of_node_put() in aic_common_irq_fixup()
Vince reported that when we do IOC_ENABLE/IOC_DISABLE while the task
is SIGSTOP'ed state the timestamps go wobbly.
It turns out we indeed fail to correctly account time while in 'OFF'
state and doing IOC_ENABLE without getting scheduled in exposes the
problem.
Further thinking about this problem, it occurred to me that we can
suffer a similar fate when we migrate an uncore event between CPUs.
The perf_event_install() on the 'new' CPU will do add_event_to_ctx()
which will reset all the time stamp, resulting in a subsequent
update_event_times() to overwrite the total_time_* fields with smaller
values.
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull NFS client fixes from Anna Schumaker:
"A few more NFS client bugfixes from me for rc5.
Dros has a stable fix for flexfiles to prevent leaking the
nfs4_ff_ds_version arrays when freeing a layout, Trond fixed a
potential recovery loop situation with the TEST_STATEID operation, and
Christoph fixed up the pNFS blocklayout Kconfig options to prevent
unsafe use with kernels that don't have large block device support.
Summary:
Stable fix:
- fix leaking nfs4_ff_ds_version array
Other fixes:
- improve TEST_STATEID OLD_STATEID handling to prevent recovery loop
- require 64-bit sector_t for pNFS blocklayout to prevent 32-bit
compile errors"
* tag 'nfs-for-4.13-5' of git://git.linux-nfs.org/projects/anna/linux-nfs:
pnfs/blocklayout: require 64-bit sector_t
NFSv4: Ignore NFS4ERR_OLD_STATEID in nfs41_check_open_stateid()
nfs/flexfiles: fix leak of nfs4_ff_ds_version arrays
On platforms with an arch timer erratum workaround, it's possible for
arch_timer_reg_read_stable() to recurse into itself when certain
tracing options are enabled, leading to stack overflows and related
problems.
For example, when PREEMPT_TRACER and FUNCTION_GRAPH_TRACER are
selected, it's possible to trigger this with:
$ mount -t debugfs nodev /sys/kernel/debug/
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer
The problem is that in such cases, preempt_disable() instrumentation
attempts to acquire a timestamp via trace_clock(), resulting in a call
back to arch_timer_reg_read_stable(), and hence recursion.
This patch changes arch_timer_reg_read_stable() to use
preempt_{disable,enable}_notrace(), which avoids this.
This problem is similar to the fixed by upstream commit 96b3d28bf4
("sched/clock: Prevent tracing recursion in sched_clock_cpu()").
Fixes: 6acc71ccac71 ("arm64: arch_timer: Allows a CPU-specific erratum to only affect a subset of CPUs")
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The ADDR_NO_RANDOMIZE checks in stack_maxrandom_size() and
randomize_stack_top() are not required.
PF_RANDOMIZE is set by load_elf_binary() only if ADDR_NO_RANDOMIZE is not
set, no need to re-check after that.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: stable@vger.kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: http://lkml.kernel.org/r/20170815154011.GB1076@redhat.com
Pull watchdog fix from Thomas Gleixner:
"A fix for the hardlockup watchdog to prevent false positives with
extreme Turbo-Modes which make the perf/NMI watchdog fire faster than
the hrtimer which is used to verify.
Slightly larger than the minimal fix, which just would increase the
hrtimer frequency, but comes with extra overhead of more watchdog
timer interrupts and thread wakeups for all users.
With this change we restrict the overhead to the extreme Turbo-Mode
systems"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kernel/watchdog: Prevent false positives with turbo modes
Valid CPU ids are [0, nr_cpu_ids-1] inclusive.
Fixes: 3b8e29a82dd1 ("genirq: Implement ipi_send_mask/single()")
Fixes: f9bce791ae2a ("genirq: Add a new function to get IPI reverse mapping")
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170819095751.GB27864@avx2
Vince reported the following rdpmc() testcase failure:
> Failing test case:
>
> fd=perf_event_open();
> addr=mmap(fd);
> exec() // without closing or unmapping the event
> fd=perf_event_open();
> addr=mmap(fd);
> rdpmc() // GPFs due to rdpmc being disabled
The problem is of course that exec() plays tricks with what is
current->mm, only destroying the old mappings after having
installed the new mm.
Fix this confusion by passing along vma->vm_mm instead of relying on
current->mm.
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: 1e0fb9ec679c ("perf: Add pmu callbacks to track event mapping and unmapping")
Link: http://lkml.kernel.org/r/20170802173930.cstykcqefmqt7jau@hirez.programming.kicks-ass.net
[ Minor cleanups. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull block fixes from Jens Axboe:
"A set of fixes that should go into this series. This contains:
- Fix from Bart for blk-mq requeue queue running, preventing a
continued loop of run/restart.
- Fix for a bio/blk-integrity issue, in two parts. One from
Christoph, fixing where verification happens, and one from Milan,
for a NULL profile.
- NVMe pull request, most of the changes being for nvme-fc, but also
a few trivial core/pci fixes"
* 'for-linus' of git://git.kernel.dk/linux-block:
nvme: fix directive command numd calculation
nvme: fix nvme reset command timeout handling
nvme-pci: fix CMB sysfs file removal in reset path
lpfc: support nvmet_fc defer_rcv callback
nvmet_fc: add defer_req callback for deferment of cmd buffer return
nvme: strip trailing 0-bytes in wwid_show
block: Make blk_mq_delay_kick_requeue_list() rerun the queue at a quiet time
bio-integrity: only verify integrity on the lowest stacked driver
bio-integrity: Fix regression if profile verify_fn is NULL
The blocklayout code does not compile cleanly for a 32-bit sector_t,
and also has no reliable checks for devices sizes, which makes it
unsafe to use with a kernel that doesn't support large block devices.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 5c83746a0cf2 ("pnfs/blocklayout: in-kernel GETDEVICEINFO XDR parsing")
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
In v4.13, CLKSRC_PISTACHIO can select TIMER_OF on architectures without
GENERIC_CLOCKEVENTS, resulting in a struct clock_event_device missing
some required features and build breakage compiling timer_of.c. One of
the symbols selecting TIMER_OF is CLKSRC_PISTACHIO, so add the
dependency on GENERIC_CLOCKEVENTS.
Thanks to kbuild test robot for finding this error
(https://lkml.org/lkml/2017/7/16/249)
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Suggested-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Documentation/admin-guide/kernel-parameters.txt says:
norandmaps Don't use address space randomization. Equivalent
to echo 0 > /proc/sys/kernel/randomize_va_space
but it doesn't work because arch_rnd() which is used to randomize
mm->mmap_base returns a random value unconditionally. And as Kirill
pointed out, ADDR_NO_RANDOMIZE is broken by the same reason.
Just shift the PF_RANDOMIZE check from arch_mmap_rnd() to arch_rnd().
Fixes: 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: stable@vger.kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20170815153952.GA1076@redhat.com
Merge misc fixes from Andrew Morton:
"14 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes
mm/vmalloc.c: don't unconditonally use __GFP_HIGHMEM
mm/mempolicy: fix use after free when calling get_mempolicy
mm/cma_debug.c: fix stack corruption due to sprintf usage
signal: don't remove SIGNAL_UNKILLABLE for traced tasks.
mm, oom: fix potential data corruption when oom_reaper races with writer
mm: fix double mmap_sem unlock on MMF_UNSTABLE enforced SIGBUS
slub: fix per memcg cache leak on css offline
mm: discard memblock data later
test_kmod: fix description for -s -and -c parameters
kmod: fix wait on recursive loop
wait: add wait_event_killable_timeout()
kernel/watchdog: fix Kconfig constraints for perf hardlockup watchdog
mm: memcontrol: fix NULL pointer crash in test_clear_page_writeback()
The hardlockup detector on x86 uses a performance counter based on unhalted
CPU cycles and a periodic hrtimer. The hrtimer period is about 2/5 of the
performance counter period, so the hrtimer should fire 2-3 times before the
performance counter NMI fires. The NMI code checks whether the hrtimer
fired since the last invocation. If not, it assumess a hard lockup.
The calculation of those periods is based on the nominal CPU
frequency. Turbo modes increase the CPU clock frequency and therefore
shorten the period of the perf/NMI watchdog. With extreme Turbo-modes (3x
nominal frequency) the perf/NMI period is shorter than the hrtimer period
which leads to false positives.
A simple fix would be to shorten the hrtimer period, but that comes with
the side effect of more frequent hrtimer and softlockup thread wakeups,
which is not desired.
Implement a low pass filter, which checks the perf/NMI period against
kernel time. If the perf/NMI fires before 4/5 of the watchdog period has
elapsed then the event is ignored and postponed to the next perf/NMI.
That solves the problem and avoids the overhead of shorter hrtimer periods
and more frequent softlockup thread wakeups.
Fixes: 58687acba592 ("lockup_detector: Combine nmi_watchdog and softlockup detector")
Reported-and-tested-by: Kan Liang <Kan.liang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: dzickus@redhat.com
Cc: prarit@redhat.com
Cc: ak@linux.intel.com
Cc: babu.moger@oracle.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: acme@redhat.com
Cc: stable@vger.kernel.org
Cc: atomlin@redhat.com
Cc: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1708150931310.1886@nanos
irq_modify_status starts by clearing the trigger settings from
irq_data before applying the new settings, but doesn't restore them,
leaving them to IRQ_TYPE_NONE.
That's pretty confusing to the potential request_irq() that could
follow. Instead, snapshot the settings before clearing them, and restore
them if the irq_modify_status() invocation was not changing the trigger.
Fixes: 1e2a7d78499e ("irqdomain: Don't set type when mapping an IRQ")
Reported-and-tested-by: jeffy <jeffy.chen@rock-chips.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jon Hunter <jonathanh@nvidia.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170818095345.12378-1-marc.zyngier@arm.com
Pull pin control fixes from Linus Walleij:
"These are the pin control fixes I have gathered since the return from
my vacation. They boiled in -next a while so let's get them in.
Apart from the documentation build it is purely driver fixes. Which is
nice. The Intel fixes seem kind of important.
- Fix the documentation build as the docs were moved
- Correct the UART pin list on the Intel Merrifield
- Fix pin assignment and number of pins on the Marvell Armada 37xx
pin controller
- Cover the Setzer models in the Chromebook DMI quirk in the Intel
cheryview driver so they start working
- Add the missing "sim" function to the sunxi driver
- Fix USB pin definitions on Uniphier Pro4
- Smatch fix for invalid reference in the zx pin control driver"
* tag 'pinctrl-v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: generic: update references to Documentation/pinctrl.txt
pinctrl: intel: merrifield: Correct UART pin lists
pinctrl: armada-37xx: Fix number of pin in south bridge
pinctrl: armada-37xx: Fix the pin 23 on south bridge
pinctrl: cherryview: Add Setzer models to the Chromebook DMI quirk
pinctrl: sunxi: add a missing function of A10/A20 pinctrl driver
pinctrl: uniphier: fix USB3 pin assignment for Pro4
pinctrl: zte: fix dereference of 'data' in zx_set_mux()
Pull MMC fixes from Ulf Hansson:
"MMC core:
- fix lockdep splat when removing mmc_block module
- fix the logic for setting eMMC HS400ES signal voltage
MMC host:
- omap_hsmmc: add CMD23 capability to fix -EIO errors"
* tag 'mmc-v4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: block: fix lockdep splat when removing mmc_block module
mmc: mmc: correct the logic for setting HS400ES signal voltage
mmc: host: omap_hsmmc: Add CMD23 capability to omap_hsmmc driver
The current code checks the return value of the of_io_request_and_map()
function as it was returning a NULL pointer in case of error.
However, it returns an error code encoded in the pointer return value, not a
NULL value. Fix this by checking the returned pointer against IS_ERR() and
return the error with PTR_ERR().
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Larry reported a CPU hotplug lock recursion in the MTRR code.
============================================
WARNING: possible recursive locking detected
systemd-udevd/153 is trying to acquire lock:
(cpu_hotplug_lock.rw_sem){.+.+.+}, at: [<c030fc26>] stop_machine+0x16/0x30
but task is already holding lock:
(cpu_hotplug_lock.rw_sem){.+.+.+}, at: [<c0234353>] mtrr_add_page+0x83/0x470
....
cpus_read_lock+0x48/0x90
stop_machine+0x16/0x30
mtrr_add_page+0x18b/0x470
mtrr_add+0x3e/0x70
mtrr_add_page() holds the hotplug rwsem already and calls stop_machine()
which acquires it again.
Call stop_machine_cpuslocked() instead.
Reported-and-tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1708140920250.1865@nanos
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Borislav Petkov <bp@suse.de>
Pull xfs fixes from Darrick Wong:
"A handful more bug fixes for you today.
Changes since last time:
- Don't leak resources when mount fails
- Don't accidentally clobber variables when looking for free inodes"
* tag 'xfs-4.13-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: don't leak quotacheck dquots when cow recovery
xfs: clear MS_ACTIVE after finishing log recovery
iomap: fix integer truncation issues in the zeroing and dirtying helpers
xfs: fix inobt inode allocation search optimization
Moving the x86_64 and arm64 PIE base from 0x555555554000 to 0x000100000000
broke AddressSanitizer. This is a partial revert of:
eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE")
02445990a96e ("arm64: move ELF_ET_DYN_BASE to 4GB / 4MB")
The AddressSanitizer tool has hard-coded expectations about where
executable mappings are loaded.
The motivation for changing the PIE base in the above commits was to
avoid the Stack-Clash CVEs that allowed executable mappings to get too
close to heap and stack. This was mainly a problem on 32-bit, but the
64-bit bases were moved too, in an effort to proactively protect those
systems (proofs of concept do exist that show 64-bit collisions, but
other recent changes to fix stack accounting and setuid behaviors will
minimize the impact).
The new 32-bit PIE base is fine for ASan (since it matches the ET_EXEC
base), so only the 64-bit PIE base needs to be reverted to let x86 and
arm64 ASan binaries run again. Future changes to the 64-bit PIE base on
these architectures can be made optional once a more dynamic method for
dealing with AddressSanitizer is found. (e.g. always loading PIE into
the mmap region for marked binaries.)
Link: http://lkml.kernel.org/r/20170807201542.GA21271@beast
Fixes: eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE")
Fixes: 02445990a96e ("arm64: move ELF_ET_DYN_BASE to 4GB / 4MB")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Kostya Serebryany <kcc@google.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jason's irqchip tree does not seem to have been updated for many months
now, remove it from the list of trees to avoid any possible confusion.
Jason says:
"Unfortunately, when I have time for irqchip, I don't always have the
time to properly follow up with pull-requests. So, for the time being,
I'll stick to reviewing as I can."
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Cc: marc.zyngier@arm.com
Link: http://lkml.kernel.org/r/20170727224733.8288-1-f.fainelli@gmail.com
Commit 65d8fc777f6d ("futex: Remove requirement for lock_page() in
get_futex_key()") removed an unnecessary lock_page() with the
side-effect that page->mapping needed to be treated very carefully.
Two defensive warnings were added in case any assumption was missed and
the first warning assumed a correct application would not alter a
mapping backing a futex key. Since merging, it has not triggered for
any unexpected case but Mark Rutland reported the following bug
triggering due to the first warning.
kernel BUG at kernel/futex.c:679!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 3695 Comm: syz-executor1 Not tainted 4.13.0-rc3-00020-g307fec773ba3 #3
Hardware name: linux,dummy-virt (DT)
task: ffff80001e271780 task.stack: ffff000010908000
PC is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679
LR is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679
pc : [<ffff00000821ac14>] lr : [<ffff00000821ac14>] pstate: 80000145
The fact that it's a bug instead of a warning was due to an unrelated
arm64 problem, but the warning itself triggered because the underlying
mapping changed.
This is an application issue but from a kernel perspective it's a
recoverable situation and the warning is unnecessary so this patch
removes the warning. The warning may potentially be triggered with the
following test program from Mark although it may be necessary to adjust
NR_FUTEX_THREADS to be a value smaller than the number of CPUs in the
system.
#include <linux/futex.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <unistd.h>
#define NR_FUTEX_THREADS 16
pthread_t threads[NR_FUTEX_THREADS];
void *mem;
#define MEM_PROT (PROT_READ | PROT_WRITE)
#define MEM_SIZE 65536
static int futex_wrapper(int *uaddr, int op, int val,
const struct timespec *timeout,
int *uaddr2, int val3)
{
syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3);
}
void *poll_futex(void *unused)
{
for (;;) {
futex_wrapper(mem, FUTEX_CMP_REQUEUE_PI, 1, NULL, mem + 4, 1);
}
}
int main(int argc, char *argv[])
{
int i;
mem = mmap(NULL, MEM_SIZE, MEM_PROT,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
printf("Mapping @ %p\n", mem);
printf("Creating futex threads...\n");
for (i = 0; i < NR_FUTEX_THREADS; i++)
pthread_create(&threads[i], NULL, poll_futex, NULL);
printf("Flipping mapping...\n");
for (;;) {
mmap(mem, MEM_SIZE, MEM_PROT,
MAP_FIXED | MAP_SHARED | MAP_ANONYMOUS, -1, 0);
}
return 0;
}
Reported-and-tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org # 4.7+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Update deprecated references to Documentation/pinctrl.txt since it has been
moved to Documentation/driver-api/pinctl.rst.
Signed-off-by: Ludovic Desroches <ludovic.desroches@o2linux.fr>
Fixes: 5a9b73832e9e ("pinctrl.txt: move it to the driver-api book")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Pull fbdev fixes from Bartlomiej Zolnierkiewicz:
- allow user to disable write combined mapping in efifb driver (Dave
Airlie)
- fix use after free bugs on driver removal in imxfb driver (Dan
Carpenter)
- fix unused variable warning in omapfb driver (Arnd Bergmann)
* tag 'fbdev-v4.13-rc5' of git://github.com/bzolnier/linux:
efifb: allow user to disable write combined mapping.
fbdev: omapfb: remove unused variable
video: fbdev: imxfb: use after free in imxfb_remove()
Fix lockdep splat introduced in v4.13-rc4.
[ 266.297226] ------------[ cut here ]------------
[ 266.300078] WARNING: CPU: 2 PID: 176 at /mnt/src/jaja/git/tf300t/include/linux/blkdev.h:657 mmc_blk_remove_req+0xd0/0xe8 [mmc_block]
[ 266.302937] Modules linked in: mmc_block(-) sdhci_tegra sdhci_pltfm sdhci pwrseq_simple pwrseq_emmc mmc_core
[ 266.305941] CPU: 2 PID: 176 Comm: rmmod Tainted: G W 4.13.0-rc4mq-00208-gb691e67724b8-dirty #694
[ 266.308852] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
[ 266.311719] [<b011144c>] (unwind_backtrace) from [<b010ca54>] (show_stack+0x18/0x1c)
[ 266.314664] [<b010ca54>] (show_stack) from [<b062e3f4>] (dump_stack+0x84/0x98)
[ 266.317644] [<b062e3f4>] (dump_stack) from [<b01214f4>] (__warn+0xf4/0x10c)
[ 266.320542] [<b01214f4>] (__warn) from [<b01215d4>] (warn_slowpath_null+0x28/0x30)
[ 266.323534] [<b01215d4>] (warn_slowpath_null) from [<af067858>] (mmc_blk_remove_req+0xd0/0xe8 [mmc_block])
[ 266.326568] [<af067858>] (mmc_blk_remove_req [mmc_block]) from [<af068f40>] (mmc_blk_remove_parts.constprop.6+0x50/0x64 [mmc_block])
[ 266.329678] [<af068f40>] (mmc_blk_remove_parts.constprop.6 [mmc_block]) from [<af0693b8>] (mmc_blk_remove+0x24/0x140 [mmc_block])
[ 266.332894] [<af0693b8>] (mmc_blk_remove [mmc_block]) from [<af0052ec>] (mmc_bus_remove+0x20/0x28 [mmc_core])
[ 266.336198] [<af0052ec>] (mmc_bus_remove [mmc_core]) from [<b046aa64>] (device_release_driver_internal+0x164/0x200)
[ 266.339367] [<b046aa64>] (device_release_driver_internal) from [<b046ab54>] (driver_detach+0x40/0x74)
[ 266.342537] [<b046ab54>] (driver_detach) from [<b046982c>] (bus_remove_driver+0x68/0xdc)
[ 266.345660] [<b046982c>] (bus_remove_driver) from [<af06ad40>] (mmc_blk_exit+0xc/0x2cc [mmc_block])
[ 266.348875] [<af06ad40>] (mmc_blk_exit [mmc_block]) from [<b01aee30>] (SyS_delete_module+0x1c4/0x254)
[ 266.352068] [<b01aee30>] (SyS_delete_module) from [<b0108480>] (ret_fast_syscall+0x0/0x34)
[ 266.355308] ---[ end trace f68728a0d3053b72 ]---
Fixes: 7c84b8b43d3d ("mmc: block: bypass the queue even if usage is present for hotplug")
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The blk_mq_delay_kick_requeue_list() function is used by the device
mapper and only by the device mapper to rerun the queue and requeue
list after a delay. This function is called once per request that
gets requeued. Modify this function such that the queue is run once
per path change event instead of once per request that is requeued.
Fixes: commit 2849450ad39d ("blk-mq: introduce blk_mq_delay_kick_requeue_list()")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Mark a couple of structures and functions as 'static', pointed out by Sparse:
warning: symbol 'bts_pmu' was not declared. Should it be static?
warning: symbol 'p4_event_aliases' was not declared. Should it be static?
warning: symbol 'rapl_attr_groups' was not declared. Should it be static?
symbol 'process_uv2_message' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Andrew Banman <abanman@hpe.com> # for the UV change
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20170810155709.7094-1-colin.king@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull block fixes from Jens Axboe:
"A small set of fixes that should go into this release. This contains:
- An NVMe pull request from Christoph, with a few select fixes.
One of them fix a polling regression in this series, in which it's
trivial to cause the kernel to disable most of the hardware queue
interrupts.
- Fixup for a blk-mq queue usage imbalance on request allocation,
from Keith.
- A xen block pull request from Konrad, fixing two issues with
xen/xen-blkfront"
* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq-pci: add a fallback when pci_irq_get_affinity returns NULL
nvme-pci: set cqe_seen on polled completions
nvme-fabrics: fix reporting of unrecognized options
nvmet-fc: eliminate incorrect static markers on local variables
nvmet-fc: correct use after free on list teardown
nvmet: don't overwrite identify sn/fr with 0-bytes
xen-blkfront: use a right index when checking requests
xen: fix bio vec merging
blk-mq: Fix queue usage on failed request allocation
If we fail a mount on account of cow recovery errors, it's possible that
a previous quotacheck left some dquots in memory. The bailout clause of
xfs_mountfs forgets to purge these, and so we leak them. Fix that.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Commit 19809c2da28a ("mm, vmalloc: use __GFP_HIGHMEM implicitly") added
use of __GFP_HIGHMEM for allocations. vmalloc_32 may use
GFP_DMA/GFP_DMA32 which does not play nice with __GFP_HIGHMEM and will
trigger a BUG in gfp_zone.
Only add __GFP_HIGHMEM if we aren't using GFP_DMA/GFP_DMA32.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1482249
Link: http://lkml.kernel.org/r/20170816220705.31374-1-labbott@redhat.com
Fixes: 19809c2da28a ("mm, vmalloc: use __GFP_HIGHMEM implicitly")
Signed-off-by: Laura Abbott <labbott@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull MIPS fixes from Ralf Baechle:
"Another round of MIPS fixes:
- compressed boot: Ignore a generated .c file
- VDSO: Fix a register clobber list
- DECstation: Fix an int-handler.S CPU_DADDI_WORKAROUNDS regression
- Octeon: Fix recent cleanups that cleaned away a bit too much thus
breaking the arch side of the EDAC and USB drivers.
- uasm: Fix duplicate const in "const struct foo const bar[]" which
GCC 7.1 no longer accepts.
- Fix race on setting and getting cpu_online_mask
- Fix preemption issue. To do so cleanly introduce macro to get the
size of L3 cache line.
- Revert include cleanup that sometimes results in build error
- MicroMIPS uses bit 0 of the PC to indicate microMIPS mode. Make
sure this bit is set for kernel entry as well.
- Prevent configuring the kernel for both microMIPS and MT. There are
no such CPUs currently and thus the combination is unsupported and
results in build errors.
This has been sitting in linux-next for a few days and has survived
automated testing by Imagination's test farm. No known regressions
pending except a number of issues that crept up due to lots of people
switching to GCC 7.1"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Set ISA bit in entry-y for microMIPS kernels
MIPS: Prevent building MT support for microMIPS kernels
MIPS: PCI: Fix smp_processor_id() in preemptible
MIPS: Introduce cpu_tcache_line_size
MIPS: DEC: Fix an int-handler.S CPU_DADDI_WORKAROUNDS regression
MIPS: VDSO: Fix clobber lists in fallback code paths
Revert "MIPS: Don't unnecessarily include kmalloc.h into <asm/cache.h>."
MIPS: OCTEON: Fix USB platform code breakage.
MIPS: Octeon: Fix broken EDAC driver.
MIPS: gitignore: ignore generated .c files
MIPS: Fix race on setting and getting cpu_online_mask
MIPS: mm: remove duplicate "const" qualifier on insn_table
Pull irqchip fixes for 4.13 from Marc Zyngier
Mostly GIC related, again:
- GICv3 ITS NUMA handling fixes
- GICv3 force affinity handling
- Barrier adjustment in both GIC interrupt handling
- Error reporting when the DT presents an incompatible interrupt
- GICv3 platform MSI DT parsing bug fix
- Broadcom L2 PM fix
- Atmel AIC cleanups
Pull i2c fixes from Wolfram Sang:
"The main thing is to allow empty id_tables for ACPI to make some
drivers get probed again. It looks a bit bigger than usual because it
needs some internal renaming, too.
Other than that, there is a fix for broken DSTDs, a super simple
enablement for ARM MPS, and two documentation fixes which I'd like to
see in v4.13 already"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: rephrase explanation of I2C_CLASS_DEPRECATED
i2c: allow i2c-versatile for ARM MPS platforms
i2c: designware: Some broken DSTDs use 1MiHz instead of 1MHz
i2c: designware: Print clock freq on invalid clock freq error
i2c: core: Allow empty id_table in ACPI case as well
i2c: mux: pinctrl: mention correct module name in Kconfig help text
UART pin lists consist GPIO numbers which is simply wrong.
Replace it by pin numbers.
Fixes: 4e80c8f50574 ("pinctrl: intel: Add Intel Merrifield pin controller support")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Pull fuse fixes from Miklos Szeredi:
"Fix a few bugs in fuse"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: set mapping error in writepage_locked when it fails
fuse: Dont call set_page_dirty_lock() for ITER_BVEC pages for async_dio
fuse: initialize the flock flag in fuse_file on allocation
This patch allows the user to disable write combined mapping
of the efifb framebuffer console using an nowc option.
A customer noticed major slowdowns while logging to the console
with write combining enabled, on other tasks running on the same
CPU. (10x or greater slow down on all other cores on the same CPU
as is doing the logging).
I reproduced this on a machine with dual CPUs.
Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz (6 core)
I wrote a test that just mmaps the pci bar and writes to it in
a loop, while this was running in the background one a single
core with (taskset -c 1), building a kernel up to init/version.o
(taskset -c 8) went from 13s to 133s or so. I've yet to explain
why this occurs or what is going wrong I haven't managed to find
a perf command that in any way gives insight into this.
11,885,070,715 instructions # 1.39 insns per cycle
vs
12,082,592,342 instructions # 0.13 insns per cycle
is the only thing I've spotted of interest, I've tried at least:
dTLB-stores,dTLB-store-misses,L1-dcache-stores,LLC-store,LLC-store-misses,LLC-load-misses,LLC-loads,\mem-loads,mem-stores,iTLB-loads,iTLB-load-misses,cache-references,cache-misses
For now it seems at least a good idea to allow a user to disable write
combining if they see this until we can figure it out.
Note also most users get a real framebuffer driver loaded when kms
kicks in, it just happens on these machines the kernel didn't support
the gpu specific driver.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Acked-by: Peter Jones <pjones@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Change the default err value to -EINVAL, make sure the card only
has type EXT_CSD_CARD_TYPE_HS400_1_8V also do the signal voltage
setting when select hs400es mode.
Fixes: commit 1720d3545b77 ("mmc: core: switch to 1V8 or 1V2 for hs400es mode")
Cc: <stable@vger.kernel.org>
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
We need to return an error if a timeout occurs on any NVMe command during
initialization. Without this, the nvme reset work will be stuck. A timeout
will have a negative error code, meaning we need to stop initializing
the controller. All postitive returns mean the controller is still usable.
bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196325
Signed-off-by: Keith Busch <keith.busch@intel.com>
Cc: Martin Peres <martin.peres@intel.com>
[jth consolidated cleanup path ]
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
rpc_clnt_add_xprt() expects the callback function to be synchronous, and
expects to release the transport and switch references itself.
Fixes: 04fa2c6bb51b1 ("NFS pnfs data server multipath session trunking")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
The loop to find the best memory frame in arch_timer_mem_acpi_init()
initializes the loop counter with itself ('i = i'), which is suspicious
in the first place and pointed out by clang. The loop condition is
'i < timer_count' and a prior for loop exits when 'i' reaches
'timer_count', therefore the second loop is never executed.
Initialize the loop counter with 0 to iterate over all timers, which
supposedly was the intention before the typo monster attacked.
Fixes: c2743a36765d3 ("clocksource: arm_arch_timer: add GTDT support for memory-mapped timer")
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
"virtual_vmload_vmsave" is what is going to land in /proc/cpuinfo now
as per v4.13-rc4, for a single feature bit which is clearly too long.
So rename it to what it is called in the processor manual.
"v_vmsave_vmload" is a bit shorter, after all.
We could go more aggressively here but having it the same as in the
processor manual is advantageous.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Radim Krčmář <rkrcmar@redhat.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: Jörg Rödel <joro@8bytes.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm-ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170801185552.GA3743@nazgul.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull rdma fixes from Doug Ledford:
"Fourth set of -rc fixes for 4.13 cycle. This is all of the -rc fixes
that we know of. I suspect this will be the last rc pull request, but
you never know, I could be wrong.
Nothing major here. There are the i40iw patches I mentioned in my last
pull request minus one that I pulled out because it wasn't a fix and
not appropriate for the rc cycle. Then a few other items trickled in
and were added to the pull request. It's fairly small aside from those
five i40iw patches
- Set of five i40iw fixes (the first of these is rather large by line
count consideration, but I decided to send it because if fixes a
legitimate issue and the line count is because it does so by
creating a new function and using it where needed instead of just
patching up a few lines...a smaller fix could probably be done, but
the larger fix is the better code solution)
- One vmw_pvrdma fix
- One hns_roce fix (this silences a checker warning, but can't
actually happen, I expect a patch to remove this from all drivers
that share this same check in for-next)
- One iw_cxgb4 fix
- Two IB core fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/uverbs: Fix NULL pointer dereference during device removal
IB/core: Protect sysfs entry on ib_unregister_device
iw_cxgb4: fix misuse of integer variable
IB/hns: fix memory leak on ah on error return path
i40iw: Fix potential fcn_id_array out of bounds
i40iw: Use correct alignment for CQ0 memory
i40iw: Fix typecast of tcp_seq_num
i40iw: Correct variable names
i40iw: Fix parsing of query/commit FPM buffers
RDMA/vmw_pvrdma: Report CQ missed events
While pci_irq_get_affinity should never fail for SMP kernel that
implement the affinity mapping, it will always return NULL in the
UP case, so provide a fallback mapping of all queues to CPU 0 in
that case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Way back when we established inode block-map redo log items, it was
discovered that we needed to prevent the VFS from evicting inodes during
log recovery because any given inode might be have bmap redo items to
replay even if the inode has no link count and is ultimately deleted,
and any eviction of an unlinked inode causes the inode to be truncated
and freed too early.
To make this possible, we set MS_ACTIVE so that inodes would not be torn
down immediately upon release. Unfortunately, this also results in the
quota inodes not being released at all if a later part of the mount
process should fail, because we never reclaim the inodes. So, set
MS_ACTIVE right before we do the last part of log recovery and clear it
immediately after we finish the log recovery so that everything
will be torn down properly if we abort the mount.
Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>