commits
Pull thermal fixes from Eduardo Valentin:
"A couple of minor fixes for the thermal subsystem.
Specifics in this pull request:
- Fixes in hisilicon thermal driver
- More fixes of unsigned to int type change in thermal_core.c"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
thermal: use %d to print S32 parameters
thermal: hisilicon: increase temperature resolution
Pull powerpc fixes from Michael Ellerman:
"A few more powerpc fixes for 4.6:
- cxl: Keep IRQ mappings on context teardown from Michael Neuling
- cxl: Poll for outstanding IRQs when detaching a context from
Michael Neuling
- Wire up preadv2 and pwritev2 syscalls from Rui Salvaterra"
* tag 'powerpc-4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: wire up preadv2 and pwritev2 syscalls
cxl: Poll for outstanding IRQs when detaching a context
cxl: Keep IRQ mappings on context teardown
Power allocator's parameters are S32 type, so use %d to print them.
Acked-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Pull EDAC fix from Borislav Petkov:
"Make sure sb_edac and i7core_edac do not terminate MCE processing on
the decoding callchain prematurely"
* tag 'edac_fix_for_4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback
Wire up preadv2/pwritev2 in the same way as preadv/pwritev. Fixes two
build warnings on ppc64.
mpe: Lightly tested with fio (slightly hacked to add the syscall
wrappers):
fio-4217 [009] .... 1304.635300: sys_preadv2(fd: 3, vec:
10025821de0, vlen: 1, pos_l: 6253000, pos_h: 0, flags: 1)
fio-4217 [009] .... 1304.635474: sys_preadv2 -> 0x1000
Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When calculate temperature, old code firstly do division and then
convert to "millicelsius" unit. This will lose resolution and only can
read back temperature with "Celsius" unit.
So firstly scale step value to "millicelsius" and then do division, so
finally we can increase resolution for temperature value. Also refine
the calculation from temperature value to step value.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Pull power management fixes from Rafael Wysocki:
"One revert of a recent cpufreq commit that introduced a regression and
a fix for intel_pstate's Turbo Activation Ratio handling code.
Specifics:
- Revert cpufreq commit that attempted to fix a problem in the
ondemand/conservative governor code, but did that incorrectly and
introduced another problem instead (Rafael Wysocki).
- Fix incorrect decoding of MSR contents related to the Turbo
Activation Ratio (TAR) handling in the intel_pstate driver
(Srinivas Pandruvada)"
* tag 'pm+acpi-4.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: intel_pstate: Fix processing for turbo activation ratio
Revert "cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC"
Both of these drivers can return NOTIFY_BAD, but this terminates
processing other callbacks that were registered later on the chain.
Since the driver did nothing to log the error it seems wrong to prevent
other interested parties from seeing it. E.g. neither of them had even
bothered to check the type of the error to see if it was a memory error
before the return NOTIFY_BAD.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/72937355dd92318d2630979666063f8a2853495b.1461864507.git.tony.luck@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
When detaching contexts, we may still have interrupts in the system
which are yet to be delivered to any CPU and be acked in the PSL.
This can result in a subsequent unrelated process getting an spurious
IRQ or an interrupt for a non-existent context.
This polls the PSL to ensure that the PSL is clear of IRQs for the
detached context, before removing the context from the idr.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Tested-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull thermal fixes from Eduardo Valentin:
"Specifics in this pull request:
- Fixes in mediatek and OF thermal drivers
- Fixes in power_allocator governor
- More fixes of unsigned to int type change in thermal_core.c.
These change have been CI tested using KernelCI bot. \o/"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
thermal: fix Mediatek thermal controller build
thermal: consistently use int for trip temp
thermal: fix mtk_thermal build dependency
thermal: minor mtk_thermal.c cleanups
thermal: power_allocator: req_range multiplication should be a 64 bit type
thermal: of: add __init attribute
Pull MMC fixes from Ulf Hansson:
"Here are a two MMC host fixes:
- sdhci-acpi: Reduce Baytrail eMMC/SD/SDIO hangs
- sunxi: Disable eMMC HS-DDR for Allwinner A80"
* tag 'mmc-v4.6-rc4' of git://git.linaro.org/people/ulf.hansson/mmc:
mmc: sunxi: Disable eMMC HS-DDR (MMC_CAP_1_8V_DDR) for Allwinner A80
mmc: sdhci-acpi: Reduce Baytrail eMMC/SD/SDIO hangs
* pm-cpufreq-fixes:
cpufreq: intel_pstate: Fix processing for turbo activation ratio
Revert "cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC"
Keep IRQ mappings on context teardown. This won't leak IRQs as if we
allocate the mapping again, the generic code will give the same
mapping used last time.
Doing this works around a race in the generic code. Masking the
interrupt introduces a race which can crash the kernel or result in
IRQ that is never EOIed. The lost of EOI results in all subsequent
mappings to the same HW IRQ never receiving an interrupt.
We've seen this race with cxl test cases which are doing heavy context
startup and teardown at the same time as heavy interrupt load.
A fix to the generic code is being investigated also.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: stable@vger.kernel.org # 3.8
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Tested-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull asm-generic update from Arnd Bergmann:
"Here is one patch to wire up the preadv/pwritev system calls in the
generic system call table, which is required for all architectures
that were merged in the last few years, including arm64.
Usually these get merged along with the syscall implementation or one
of the architecture trees, but this time that did not happen.
Andre and Christoph both sent a version of this patch, I picked the
one I got first"
* tag 'asm-generic-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
generic syscalls: wire up preadv2 and pwritev2 syscalls
At least with CONFIG_COMPILE_TEST, there's no reason to assume
that CONFIG_RESET_CONTROLLER is set, but the code for this
controller requires it since it calls device_reset().
Make CONFIG_MTK_THERMAL properly depend on CONFIG_RESET_CONTROLLER.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Pull drm fixes from Dave Airlie:
"A few fixes all over the place:
radeon is probably the biggest standout, it's a fix for screen
corruption or hung black outputs so I thought it was worth pulling in.
Otherwise some amdgpu power control fixes, some misc vmwgfx fixes, one
etnaviv fix, one virtio-gpu fix, two DP MST fixes, and a single TTM
fix"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/vmwgfx: Fix order of operation
drm/vmwgfx: use vmw_cmd_dx_cid_check for query commands.
drm/vmwgfx: Enable SVGA_3D_CMD_DX_SET_PREDICATION
drm/amdgpu: disable vm interrupts with vm_fault_stop=2
drm/amdgpu: print a message if ATPX dGPU power control is missing
Revert "drm/amdgpu: disable runtime pm on PX laptops without dGPU power control"
drm/radeon: fix vertical bars appear on monitor (v2)
drm/ttm: fix kref count mess in ttm_bo_move_to_lru_tail
drm/virtio: send vblank event after crtc updates
drm/dp/mst: Restore primary hub guid on resume
drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1()
drm/etnaviv: don't move linear memory window on 3D cores without MC2.0
eMMC HS-DDR no longer works on the A80, despite it working when support
for this developed.
Disable it for now.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
When the config TDP level is not nominal (level = 0), the MSR values for
reading level 1 and level 2 ratios contain power in low 14 bits and actual
ratio bits are at bits [23:16]. The current processing for level 1 and
level 2 is wrong as there is no shift done to get actual ratio.
Fixes: 6a35fc2d6c22 (cpufreq: intel_pstate: get P1 from TAR when available)
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 4.4+ <stable@vger.kernel.org> # 4.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We need to update the user TM feature bits (PPC_FEATURE2_HTM and
PPC_FEATURE2_HTM) to mirror what we do with the kernel TM feature
bit.
At the moment, if firmware reports TM is not available we turn off
the kernel TM feature bit but leave the userspace ones on. Userspace
thinks it can execute TM instructions and it dies trying.
This (together with a QEMU patch) fixes PR KVM, which doesn't currently
support TM.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull x86 fixes from Ingo Molnar:
"Misc fixes: two EDAC driver fixes, a Xen crash fix, a HyperV log spam
fix and a documentation fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86 EDAC, sb_edac.c: Take account of channel hashing when needed
x86 EDAC, sb_edac.c: Repair damage introduced when "fixing" channel address
x86/mm/xen: Suppress hugetlbfs in PV guests
x86/doc: Correct limits in Documentation/x86/x86_64/mm.txt
x86/hyperv: Avoid reporting bogus NMI status for Gen2 instances
These new syscalls are implemented as generic code, so enable them for
architectures like arm64 which use the generic syscall table.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The commit 17e8351a7739 consistently use int for temperature,
however it missed a few in trip temperature and thermal_core.
In current codes, the trip->temperature used "unsigned long"
and zone->temperature used"int", if the temperature is negative
value, it will get wrong result when compare temperature with
trip temperature.
This patch can fix it.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Pull rdma fixes from Doug Ledford:
"Final set of -rc fixes for 4.6.
I've collected up a number of patches that are all pretty small with
the exception of only a couple. The hfi1 driver has a number of
important patches, and it is what really drives the line count of this
pull request up. These are all small and I've got this kernel built
and running in the test lab (I have most of the hardware, I think nes
is the only thing in this patch set that I can't say I've personally
tested and have up and running).
Summary:
- A number of collected fixes for oopses, memory corruptions,
deadlocks, etc. All of these fixes are small (many only 5-10
lines), obvious, and tested.
- Fix for the security issue related to the use of write for
bi-directional communications"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
RDMA/nes: don't leak skb if carrier down
IB/security: Restrict use of the write() interface
IB/hfi1: Use kernel default llseek for ui device
IB/hfi1: Don't attempt to free resources if initialization failed
IB/hfi1: Fix missing lock/unlock in verbs drain callback
IB/rdmavt: Fix send scheduling
IB/hfi1: Prevent unpinning of wrong pages
IB/hfi1: Fix deadlock caused by locking with wrong scope
IB/hfi1: Prevent NULL pointer deferences in caching code
MAINTAINERS: Update iser/isert maintainer contact info
IB/mlx5: Expose correct max_sge_rd limit
RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chips
iw_cxgb4: handle draining an idle qp
iw_cxgb3: initialize ibdev.iwcm->ifname for port mapping
iw_cxgb4: initialize ibdev.iwcm->ifname for port mapping
IB/core: Don't drain non-existent rq queue-pair
IB/core: Fix oops in ib_cache_gid_set_default_gid
A few fixes for 4.6.
- revert amdgpu PX commit that was previously reverted on the radeon side
- cleaned up version of the NI+ MC update display fix for radeon
- TTM kref fix
* 'drm-fixes-4.6' of git://people.freedesktop.org/~agd5f/linux:
drm/amdgpu: disable vm interrupts with vm_fault_stop=2
drm/amdgpu: print a message if ATPX dGPU power control is missing
Revert "drm/amdgpu: disable runtime pm on PX laptops without dGPU power control"
drm/radeon: fix vertical bars appear on monitor (v2)
drm/ttm: fix kref count mess in ttm_bo_move_to_lru_tail
Baytrail eMMC/SD/SDIO host controllers have been known to
hang. A change to a hardware setting has been found to
reduce the occurrence of such hangs. This patch ensures
the correct setting.
This patch applies cleanly to v4.4+. It could go to
earlier kernels also, so I will send backports to the
stable list in due course.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Revert commit 0df35026c6a5 (cpufreq: governor: Fix negative idle_time
when configured with CONFIG_HZ_PERIODIC) that introduced a regression
by causing the ondemand cpufreq governor to misbehave for
CONFIG_TICK_CPU_ACCOUNTING unset (the frequency goes up to the max at
one point and stays there indefinitely).
The revert takes subsequent modifications of the code in question into
account.
Fixes: 0df35026c6a5 (cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=115261
Reported-and-tested-by: Timo Valtoaho <timo.valtoaho@gmail.com>
Cc: 4.5+ <stable@vger.kernel.org> # 4.5+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
scan_features() updates cpu_user_features but not cpu_user_features2.
Amongst other things, cpu_user_features2 contains the user TM feature
bits which we must keep in sync with the kernel TM feature bit.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull perf, cpu hotplug and timer fixes from Ingo Molnar:
"perf:
- A single tooling fix for a user-triggerable segfault.
CPU hotplug:
- Fix a CPU hotplug corner case regression, introduced by the recent
hotplug rework
timers:
- Fix a boot hang in the ARM based Tango SoC clocksource driver"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf intel-pt: Fix segfault tracing transactions
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/hotplug: Fix rollback during error-out in __cpu_disable()
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/drivers/tango-xtal: Fix boot hang due to incorrect test
Haswell and Broadwell can be configured to hash the channel
interleave function using bits [27:12] of the physical address.
On those processor models we must check to see if hashing is
enabled (bit21 of the HASWELL_HASYSDEFEATURE2 register) and
act accordingly.
Based on a patch by patrickg <patrickg@supermicro.com>
Tested-by: Patrick Geary <patrickg@supermicro.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fix build errors when MTK_THERMAL=y and NVMEM=m by preventing that
Kconfig combination.
drivers/built-in.o: In function `mtk_thermal_probe':
mtk_thermal.c:(.text+0xffa8f): undefined reference to `nvmem_cell_get'
mtk_thermal.c:(.text+0xffabe): undefined reference to `nvmem_cell_read'
mtk_thermal.c:(.text+0xffac9): undefined reference to `nvmem_cell_put'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: <linux-pm@vger.kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Hanyi Wu <hanyi.wu@mediatek.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Merge fixes from Andrew Morton:
"20 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
Documentation/sysctl/vm.txt: update numa_zonelist_order description
lib/stackdepot.c: allow the stack trace hash to be zero
rapidio: fix potential NULL pointer dereference
mm/memory-failure: fix race with compound page split/merge
ocfs2/dlm: return zero if deref_done message is successfully handled
Ananth has moved
kcov: don't profile branches in kcov
kcov: don't trace the code coverage code
mm: wake kcompactd before kswapd's short sleep
.mailmap: add Frank Rowand
mm/hwpoison: fix wrong num_poisoned_pages accounting
mm: call swap_slot_free_notify() with page lock held
mm: vmscan: reclaim highmem zone if buffer_heads is over limit
numa: fix /proc/<pid>/numa_maps for THP
mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check
mailmap: fix Krzysztof Kozlowski's misspelled name
thp: keep huge zero page pinned until tlb flush
mm: exclude HugeTLB pages from THP page_mapped() logic
kexec: export OFFSET(page.compound_head) to find out compound tail page
kexec: update VMCOREINFO for compound_order/dtor
Alternatively one could free the skb, OTOH I don't think this test is
useful so just remove it.
Cc: <linux-rdma@vger.kernel.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
three misc vmwgfx fixes
* 'drm-vmwgfx-fixes' of git://people.freedesktop.org/~syeh/repos_linux:
drm/vmwgfx: Fix order of operation
drm/vmwgfx: use vmw_cmd_dx_cid_check for query commands.
drm/vmwgfx: Enable SVGA_3D_CMD_DX_SET_PREDICATION
V2: disable all vm interrupts in late_init()
Signed-off-by: Flora Cui <Flora.Cui@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since governor operations are generally skipped if cpufreq_suspended
is set, cpufreq_start_governor() should do nothing in that case.
That function is called in the cpufreq_online() path, and may also
be called from cpufreq_offline() in some cases, which are invoked
by the nonboot CPUs disabing/enabling code during system suspend
to RAM and resume. That happens when all devices have been
suspended, so if the cpufreq driver relies on things like I2C to
get the current frequency, it may not be ready to do that then.
To prevent problems from happening for this reason, make
cpufreq_update_current_freq(), which is the only function invoked
by cpufreq_start_governor() that doesn't check cpufreq_suspended
already, return 0 upfront if cpufreq_suspended is set.
Fixes: 3bbf8fe3ae08 (cpufreq: Always update current frequency before startig governor)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
The REAL_LE feature entry in the ibm_pa_feature struct is missing an MMU
feature value, meaning all the remaining elements initialise the wrong
values.
This means instead of checking for byte 5, bit 0, we check for byte 0,
bit 0, and then we incorrectly set the CPU feature bit as well as MMU
feature bit 1 and CPU user feature bits 0 and 2 (5).
Checking byte 0 bit 0 (IBM numbering), means we're looking at the
"Memory Management Unit (MMU)" feature - ie. does the CPU have an MMU.
In practice that bit is set on all platforms which have the property.
This means we set CPU_FTR_REAL_LE always. In practice that seems not to
matter because all the modern cpus which have this property also
implement REAL_LE, and we've never needed to disable it.
We're also incorrectly setting MMU feature bit 1, which is:
#define MMU_FTR_TYPE_8xx 0x00000002
Luckily the only place that looks for MMU_FTR_TYPE_8xx is in Book3E
code, which can't run on the same cpus as scan_features(). So this also
doesn't matter in practice.
Finally in the CPU user feature mask, we're setting bits 0 and 2. Bit 2
is not currently used, and bit 0 is:
#define PPC_FEATURE_PPC_LE 0x00000001
Which says the CPU supports the old style "PPC Little Endian" mode.
Again this should be harmless in practice as no 64-bit CPUs implement
that mode.
Fix the code by adding the missing initialisation of the MMU feature.
Also add a comment marking CPU user feature bit 2 (0x4) as reserved. It
would be unsafe to start using it as old kernels incorrectly set it.
Fixes: 44ae3ab3358e ("powerpc: Free up some CPU feature bits by moving out MMU-related features")
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
[mpe: Flesh out changelog, add comment reserving 0x4]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull locking fixes from Ingo Molnar:
"Misc fixes:
pvqspinlocks:
- an instrumentation fix
futexes:
- preempt-count vs pagefault_disable decouple corner case fix
- futex requeue plist race window fix
- futex UNLOCK_PI transaction fix for a corner case"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
asm-generic/futex: Re-enable preemption in futex_atomic_cmpxchg_inatomic()
futex: Acknowledge a new waiter in counter before plist
futex: Handle unlock_pi race gracefully
locking/pvqspinlock: Fix division by zero in qstat_read()
Pull a perf/urgent fix from Arnaldo Carvalho de Melo:
- Fix segfault tracing transactions in Intel PT (Adrian Hunter)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The recent introduction of the hotplug thread which invokes the callbacks on
the plugged cpu, cased the following regression:
If takedown_cpu() fails, then we run into several issues:
1) The rollback of the target cpu states is not invoked. That leaves the smp
threads and the hotplug thread in disabled state.
2) notify_online() is executed due to a missing skip_onerr flag. That causes
that both CPU_DOWN_FAILED and CPU_ONLINE notifications are invoked which
confuses quite some notifiers.
3) The CPU_DOWN_FAILED notification is not invoked on the target CPU. That's
not an issue per se, but it is inconsistent and in consequence blocks the
patches which rely on these states being invoked on the target CPU and not
on the controlling cpu. It also does not preserve the strict call order on
rollback which is problematic for the ongoing state machine conversion as
well.
To fix this we add a rollback flag to the remote callback machinery and invoke
the rollback including the CPU_DOWN_FAILED notification on the remote
cpu. Further mark the notify online state with 'skip_onerr' so we don't get a
double invokation.
This workaround will go away once we moved the unplug invocation to the target
cpu itself.
[ tglx: Massaged changelog and moved the CPU_DOWN_FAILED notifiaction to the
target cpu ]
Fixes: 4cb28ced23c4 ("cpu/hotplug: Create hotplug threads")
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-s390@vger.kernel.org
Cc: rt@linutronix.de
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Link: http://lkml.kernel.org/r/20160408124015.GA21960@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Commit 0881841f7e78 introduced a regression by inverting a test check
after calling clocksource_mmio_init(). That results on the system to
hang at boot time.
Fix it by inverting the test again.
Fixes: 0881841f7e78 ("Replace code by clocksource_mmio_init")
Reported-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
In commit:
eb1af3b71f9d ("Fix computation of channel address")
I switched the "sck_way" variable from holding the log2 value read
from the h/w to instead be the actual number. Unfortunately it
is needed in log2 form when used to shift the address.
Tested-by: Patrick Geary <patrickg@supermicro.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: eb1af3b71f9d ("Fix computation of channel address")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull device mapper fix from Mike Snitzer:
"Fix for earlier 4.6-rc4 stable@ commit that introduced improper use of
write lock in cmd_read_lock() -- due to cut-n-paste gone awry (and
sparse didn't catch it)"
* tag 'dm-4.6-fix-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache metadata: fix cmd_read_lock() acquiring write lock
Trivial cleanups:
- delete one duplicate #include
- end email address with closing '>'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Hanyi Wu <hanyi.wu@mediatek.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Pull x86 fixes from Ingo Molnar:
"Two boot crash fixes and an IRQ handling crash fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic: Handle zero vector gracefully in clear_vector_irq()
Revert "x86/mm/32: Set NX in __supported_pte_mask before enabling paging"
xen/qspinlock: Don't kick CPU if IRQ is not initialized
Commit 3193913ce62c ("mm: page_alloc: default node-ordering on 64-bit
NUMA, zone-ordering on 32-bit") changes the default value of
numa_zonelist_order. Update the document.
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The drivers/infiniband stack uses write() as a replacement for
bi-directional ioctl(). This is not safe. There are ways to
trigger write calls that result in the return structure that
is normally written to user space being shunted off to user
specified kernel memory instead.
For the immediate repair, detect and deny suspicious accesses to
the write API.
For long term, update the user space libraries and the kernel API
to something that doesn't present the same security vulnerabilities
(likely a structured ioctl() interface).
The impacted uAPI interfaces are generally only available if
hardware from drivers/infiniband is installed in the system.
Reported-by: Jann Horn <jann@thejh.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
[ Expanded check to all known write() entry points ]
Cc: stable@vger.kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
virtio_gpu was failing to send vblank events when using the atomic IOCTL
with the DRM_MODE_PAGE_FLIP_EVENT flag set. This patch fixes each and
enables atomic pageflips updates.
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
mode->hdisplay * (var->bits_per_pixel + 7) gets evaluated before
the division, potentially making the pitch larger than it should
be.
Since the original intention is to do a div-round-up, just use
the macro instead.
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
It will help identify problematic boards.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jörg Otte reports that commit a4675fbc4a7a (cpufreq: intel_pstate:
Replace timers with utilization update callbacks) caused the CPUs in
his Haswell-based system to stay in the very high frequency region
even if the system is completely idle.
That turns out to be an existing problem in the intel_pstate driver's
P-state selection algorithm for Core processors. Namely, all
decisions made by that algorithm are based on the average frequency
of the CPU between sampling events and on the P-state requested on
the last invocation, so it may get stuck at a very hight frequency
even if the utilization of the CPU is very low (in fact, it may get
stuck in a inadequate P-state regardless of the CPU utilization).
The only way to kick it out of that limbo is a sufficiently long idle
period (3 times longer than the prescribed sampling interval), but if
that doesn't happen often enough (eg. due to a timing change like
after the above commit), the P-state of the CPU may be inadequate
pretty much all the time.
To address the most egregious manifestations of that issue, reset the
core_busy value used to determine the next P-state to request if the
utilization of the CPU, determined with the help of the MPERF
feedback register and the TSC, is below 1%.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=115771
Reported-and-tested-by: Jörg Otte <jrg.otte@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The used_vsr flag is set if process has used VSX registers, not Altivec
registers. But the comment says otherwise, correct the comment.
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull irq fixes from Ingo Molnar:
"A core irq affinity masks related fix and a MIPS irqchip driver fix"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/mips-gic: Don't overrun pcpu_masks array
genirq: Dont allow affinity mask to be updated on IPIs
The recent decoupling of pagefault disable and preempt disable added an
explicit preempt_disable/enable() pair to the futex_atomic_cmpxchg_inatomic()
implementation in asm-generic/futex.h. But it forgot to add preempt_enable()
calls to the error handling code pathes, which results in a preemption count
imbalance.
This is observable on boot when the test for atomic_cmpxchg() is calling
futex_atomic_cmpxchg_inatomic() on a NULL pointer.
Add the missing preempt_enable() calls to the error handling code pathes.
[ tglx: Massaged changelog ]
Fixes: d9b9ff8c1889 ("sched/preempt, futex: Disable preemption in UP futex_atomic_cmpxchg_inatomic() explicitly")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Cc: linux-arch@vger.kernel.org
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1460640963-690-1-git-send-email-romain.perier@free-electrons.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tracing a workload that uses transactions gave a seg fault as follows:
perf record -e intel_pt// workload
perf report
Program received signal SIGSEGV, Segmentation fault.
0x000000000054b58c in intel_pt_reset_last_branch_rb (ptq=0x1a36110)
at util/intel-pt.c:929
929 ptq->last_branch_rb->nr = 0;
(gdb) p ptq->last_branch_rb
$1 = (struct branch_stack *) 0x0
(gdb) up
1148 intel_pt_reset_last_branch_rb(ptq);
(gdb) l
1143 if (ret)
1144 pr_err("Intel Processor Trace: failed to deliver transaction event
1145 ret);
1146
1147 if (pt->synth_opts.callchain)
1148 intel_pt_reset_last_branch_rb(ptq);
1149
1150 return ret;
1151 }
1152
(gdb) p pt->synth_opts.callchain
$2 = true
(gdb)
(gdb) bt
#0 0x000000000054b58c in intel_pt_reset_last_branch_rb (ptq=0x1a36110)
#1 0x000000000054c1e0 in intel_pt_synth_transaction_sample (ptq=0x1a36110)
#2 0x000000000054c5b2 in intel_pt_sample (ptq=0x1a36110)
Caused by checking the 'callchain' flag when it should have been the
'last_branch' flag. Fix that.
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: f14445ee72c5 ("perf intel-pt: Support generating branch stack")
Link: http://lkml.kernel.org/r/1460977068-11566-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use signed integer output in the pr_err() string format, here PTR_ERR() value
is negative and it should be reported in human readable way.
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Cc: Jisheng Zhang <jszhang@marvell.com>
Cc: Ezequiel Garcia <ezequiel.garcia@imgtec.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: http://lkml.kernel.org/r/1458603727-4446-1-git-send-email-vz@mleia.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Huge pages are not normally available to PV guests. Not suppressing
hugetlbfs use results in an endless loop of page faults when user mode
code tries to access a hugetlbfs mapped area (since the hypervisor
denies such PTEs to be created, but error indications can't be
propagated out of xen_set_pte_at(), just like for various of its
siblings), and - once killed in an oops like this:
kernel BUG at .../fs/hugetlbfs/inode.c:428!
invalid opcode: 0000 [#1] SMP
...
RIP: e030:[<ffffffff811c333b>] [<ffffffff811c333b>] remove_inode_hugepages+0x25b/0x320
...
Call Trace:
[<ffffffff811c3415>] hugetlbfs_evict_inode+0x15/0x40
[<ffffffff81167b3d>] evict+0xbd/0x1b0
[<ffffffff8116514a>] __dentry_kill+0x19a/0x1f0
[<ffffffff81165b0e>] dput+0x1fe/0x220
[<ffffffff81150535>] __fput+0x155/0x200
[<ffffffff81079fc0>] task_work_run+0x60/0xa0
[<ffffffff81063510>] do_exit+0x160/0x400
[<ffffffff810637eb>] do_group_exit+0x3b/0xa0
[<ffffffff8106e8bd>] get_signal+0x1ed/0x470
[<ffffffff8100f854>] do_signal+0x14/0x110
[<ffffffff810030e9>] prepare_exit_to_usermode+0xe9/0xf0
[<ffffffff814178a5>] retint_user+0x8/0x13
This is CVE-2016-3961 / XSA-174.
Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <JGross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: stable@vger.kernel.org
Cc: xen-devel <xen-devel@lists.xenproject.org>
Link: http://lkml.kernel.org/r/57188ED802000078000E431C@prv-mh.provo.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull char/misc fixes from Greg KH:
"Here are some small char/misc driver fixes for 4.6-rc4. Full details
are in the shortlog, nothing major here.
These have all been in linux-next for a while with no reported issues"
* tag 'char-misc-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
lkdtm: do not leak free page on kmalloc failure
lkdtm: fix memory leak of base
lkdtm: fix memory leak of val
extcon: palmas: Drop stray IRQF_EARLY_RESUME flag
Pull thermal fixes from Eduardo Valentin:
"A couple of minor fixes for the thermal subsystem.
Specifics in this pull request:
- Fixes in hisilicon thermal driver
- More fixes of unsigned to int type change in thermal_core.c"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
thermal: use %d to print S32 parameters
thermal: hisilicon: increase temperature resolution
Pull powerpc fixes from Michael Ellerman:
"A few more powerpc fixes for 4.6:
- cxl: Keep IRQ mappings on context teardown from Michael Neuling
- cxl: Poll for outstanding IRQs when detaching a context from
Michael Neuling
- Wire up preadv2 and pwritev2 syscalls from Rui Salvaterra"
* tag 'powerpc-4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: wire up preadv2 and pwritev2 syscalls
cxl: Poll for outstanding IRQs when detaching a context
cxl: Keep IRQ mappings on context teardown
Wire up preadv2/pwritev2 in the same way as preadv/pwritev. Fixes two
build warnings on ppc64.
mpe: Lightly tested with fio (slightly hacked to add the syscall
wrappers):
fio-4217 [009] .... 1304.635300: sys_preadv2(fd: 3, vec:
10025821de0, vlen: 1, pos_l: 6253000, pos_h: 0, flags: 1)
fio-4217 [009] .... 1304.635474: sys_preadv2 -> 0x1000
Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When calculate temperature, old code firstly do division and then
convert to "millicelsius" unit. This will lose resolution and only can
read back temperature with "Celsius" unit.
So firstly scale step value to "millicelsius" and then do division, so
finally we can increase resolution for temperature value. Also refine
the calculation from temperature value to step value.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Pull power management fixes from Rafael Wysocki:
"One revert of a recent cpufreq commit that introduced a regression and
a fix for intel_pstate's Turbo Activation Ratio handling code.
Specifics:
- Revert cpufreq commit that attempted to fix a problem in the
ondemand/conservative governor code, but did that incorrectly and
introduced another problem instead (Rafael Wysocki).
- Fix incorrect decoding of MSR contents related to the Turbo
Activation Ratio (TAR) handling in the intel_pstate driver
(Srinivas Pandruvada)"
* tag 'pm+acpi-4.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: intel_pstate: Fix processing for turbo activation ratio
Revert "cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC"
Both of these drivers can return NOTIFY_BAD, but this terminates
processing other callbacks that were registered later on the chain.
Since the driver did nothing to log the error it seems wrong to prevent
other interested parties from seeing it. E.g. neither of them had even
bothered to check the type of the error to see if it was a memory error
before the return NOTIFY_BAD.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/72937355dd92318d2630979666063f8a2853495b.1461864507.git.tony.luck@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
When detaching contexts, we may still have interrupts in the system
which are yet to be delivered to any CPU and be acked in the PSL.
This can result in a subsequent unrelated process getting an spurious
IRQ or an interrupt for a non-existent context.
This polls the PSL to ensure that the PSL is clear of IRQs for the
detached context, before removing the context from the idr.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Tested-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull thermal fixes from Eduardo Valentin:
"Specifics in this pull request:
- Fixes in mediatek and OF thermal drivers
- Fixes in power_allocator governor
- More fixes of unsigned to int type change in thermal_core.c.
These change have been CI tested using KernelCI bot. \o/"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
thermal: fix Mediatek thermal controller build
thermal: consistently use int for trip temp
thermal: fix mtk_thermal build dependency
thermal: minor mtk_thermal.c cleanups
thermal: power_allocator: req_range multiplication should be a 64 bit type
thermal: of: add __init attribute
Pull MMC fixes from Ulf Hansson:
"Here are a two MMC host fixes:
- sdhci-acpi: Reduce Baytrail eMMC/SD/SDIO hangs
- sunxi: Disable eMMC HS-DDR for Allwinner A80"
* tag 'mmc-v4.6-rc4' of git://git.linaro.org/people/ulf.hansson/mmc:
mmc: sunxi: Disable eMMC HS-DDR (MMC_CAP_1_8V_DDR) for Allwinner A80
mmc: sdhci-acpi: Reduce Baytrail eMMC/SD/SDIO hangs
Keep IRQ mappings on context teardown. This won't leak IRQs as if we
allocate the mapping again, the generic code will give the same
mapping used last time.
Doing this works around a race in the generic code. Masking the
interrupt introduces a race which can crash the kernel or result in
IRQ that is never EOIed. The lost of EOI results in all subsequent
mappings to the same HW IRQ never receiving an interrupt.
We've seen this race with cxl test cases which are doing heavy context
startup and teardown at the same time as heavy interrupt load.
A fix to the generic code is being investigated also.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: stable@vger.kernel.org # 3.8
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Tested-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull asm-generic update from Arnd Bergmann:
"Here is one patch to wire up the preadv/pwritev system calls in the
generic system call table, which is required for all architectures
that were merged in the last few years, including arm64.
Usually these get merged along with the syscall implementation or one
of the architecture trees, but this time that did not happen.
Andre and Christoph both sent a version of this patch, I picked the
one I got first"
* tag 'asm-generic-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
generic syscalls: wire up preadv2 and pwritev2 syscalls
At least with CONFIG_COMPILE_TEST, there's no reason to assume
that CONFIG_RESET_CONTROLLER is set, but the code for this
controller requires it since it calls device_reset().
Make CONFIG_MTK_THERMAL properly depend on CONFIG_RESET_CONTROLLER.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Pull drm fixes from Dave Airlie:
"A few fixes all over the place:
radeon is probably the biggest standout, it's a fix for screen
corruption or hung black outputs so I thought it was worth pulling in.
Otherwise some amdgpu power control fixes, some misc vmwgfx fixes, one
etnaviv fix, one virtio-gpu fix, two DP MST fixes, and a single TTM
fix"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/vmwgfx: Fix order of operation
drm/vmwgfx: use vmw_cmd_dx_cid_check for query commands.
drm/vmwgfx: Enable SVGA_3D_CMD_DX_SET_PREDICATION
drm/amdgpu: disable vm interrupts with vm_fault_stop=2
drm/amdgpu: print a message if ATPX dGPU power control is missing
Revert "drm/amdgpu: disable runtime pm on PX laptops without dGPU power control"
drm/radeon: fix vertical bars appear on monitor (v2)
drm/ttm: fix kref count mess in ttm_bo_move_to_lru_tail
drm/virtio: send vblank event after crtc updates
drm/dp/mst: Restore primary hub guid on resume
drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1()
drm/etnaviv: don't move linear memory window on 3D cores without MC2.0
When the config TDP level is not nominal (level = 0), the MSR values for
reading level 1 and level 2 ratios contain power in low 14 bits and actual
ratio bits are at bits [23:16]. The current processing for level 1 and
level 2 is wrong as there is no shift done to get actual ratio.
Fixes: 6a35fc2d6c22 (cpufreq: intel_pstate: get P1 from TAR when available)
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 4.4+ <stable@vger.kernel.org> # 4.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We need to update the user TM feature bits (PPC_FEATURE2_HTM and
PPC_FEATURE2_HTM) to mirror what we do with the kernel TM feature
bit.
At the moment, if firmware reports TM is not available we turn off
the kernel TM feature bit but leave the userspace ones on. Userspace
thinks it can execute TM instructions and it dies trying.
This (together with a QEMU patch) fixes PR KVM, which doesn't currently
support TM.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull x86 fixes from Ingo Molnar:
"Misc fixes: two EDAC driver fixes, a Xen crash fix, a HyperV log spam
fix and a documentation fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86 EDAC, sb_edac.c: Take account of channel hashing when needed
x86 EDAC, sb_edac.c: Repair damage introduced when "fixing" channel address
x86/mm/xen: Suppress hugetlbfs in PV guests
x86/doc: Correct limits in Documentation/x86/x86_64/mm.txt
x86/hyperv: Avoid reporting bogus NMI status for Gen2 instances
The commit 17e8351a7739 consistently use int for temperature,
however it missed a few in trip temperature and thermal_core.
In current codes, the trip->temperature used "unsigned long"
and zone->temperature used"int", if the temperature is negative
value, it will get wrong result when compare temperature with
trip temperature.
This patch can fix it.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Pull rdma fixes from Doug Ledford:
"Final set of -rc fixes for 4.6.
I've collected up a number of patches that are all pretty small with
the exception of only a couple. The hfi1 driver has a number of
important patches, and it is what really drives the line count of this
pull request up. These are all small and I've got this kernel built
and running in the test lab (I have most of the hardware, I think nes
is the only thing in this patch set that I can't say I've personally
tested and have up and running).
Summary:
- A number of collected fixes for oopses, memory corruptions,
deadlocks, etc. All of these fixes are small (many only 5-10
lines), obvious, and tested.
- Fix for the security issue related to the use of write for
bi-directional communications"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
RDMA/nes: don't leak skb if carrier down
IB/security: Restrict use of the write() interface
IB/hfi1: Use kernel default llseek for ui device
IB/hfi1: Don't attempt to free resources if initialization failed
IB/hfi1: Fix missing lock/unlock in verbs drain callback
IB/rdmavt: Fix send scheduling
IB/hfi1: Prevent unpinning of wrong pages
IB/hfi1: Fix deadlock caused by locking with wrong scope
IB/hfi1: Prevent NULL pointer deferences in caching code
MAINTAINERS: Update iser/isert maintainer contact info
IB/mlx5: Expose correct max_sge_rd limit
RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chips
iw_cxgb4: handle draining an idle qp
iw_cxgb3: initialize ibdev.iwcm->ifname for port mapping
iw_cxgb4: initialize ibdev.iwcm->ifname for port mapping
IB/core: Don't drain non-existent rq queue-pair
IB/core: Fix oops in ib_cache_gid_set_default_gid
A few fixes for 4.6.
- revert amdgpu PX commit that was previously reverted on the radeon side
- cleaned up version of the NI+ MC update display fix for radeon
- TTM kref fix
* 'drm-fixes-4.6' of git://people.freedesktop.org/~agd5f/linux:
drm/amdgpu: disable vm interrupts with vm_fault_stop=2
drm/amdgpu: print a message if ATPX dGPU power control is missing
Revert "drm/amdgpu: disable runtime pm on PX laptops without dGPU power control"
drm/radeon: fix vertical bars appear on monitor (v2)
drm/ttm: fix kref count mess in ttm_bo_move_to_lru_tail
Baytrail eMMC/SD/SDIO host controllers have been known to
hang. A change to a hardware setting has been found to
reduce the occurrence of such hangs. This patch ensures
the correct setting.
This patch applies cleanly to v4.4+. It could go to
earlier kernels also, so I will send backports to the
stable list in due course.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Revert commit 0df35026c6a5 (cpufreq: governor: Fix negative idle_time
when configured with CONFIG_HZ_PERIODIC) that introduced a regression
by causing the ondemand cpufreq governor to misbehave for
CONFIG_TICK_CPU_ACCOUNTING unset (the frequency goes up to the max at
one point and stays there indefinitely).
The revert takes subsequent modifications of the code in question into
account.
Fixes: 0df35026c6a5 (cpufreq: governor: Fix negative idle_time when configured with CONFIG_HZ_PERIODIC)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=115261
Reported-and-tested-by: Timo Valtoaho <timo.valtoaho@gmail.com>
Cc: 4.5+ <stable@vger.kernel.org> # 4.5+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
scan_features() updates cpu_user_features but not cpu_user_features2.
Amongst other things, cpu_user_features2 contains the user TM feature
bits which we must keep in sync with the kernel TM feature bit.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull perf, cpu hotplug and timer fixes from Ingo Molnar:
"perf:
- A single tooling fix for a user-triggerable segfault.
CPU hotplug:
- Fix a CPU hotplug corner case regression, introduced by the recent
hotplug rework
timers:
- Fix a boot hang in the ARM based Tango SoC clocksource driver"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf intel-pt: Fix segfault tracing transactions
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/hotplug: Fix rollback during error-out in __cpu_disable()
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/drivers/tango-xtal: Fix boot hang due to incorrect test
Haswell and Broadwell can be configured to hash the channel
interleave function using bits [27:12] of the physical address.
On those processor models we must check to see if hashing is
enabled (bit21 of the HASWELL_HASYSDEFEATURE2 register) and
act accordingly.
Based on a patch by patrickg <patrickg@supermicro.com>
Tested-by: Patrick Geary <patrickg@supermicro.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fix build errors when MTK_THERMAL=y and NVMEM=m by preventing that
Kconfig combination.
drivers/built-in.o: In function `mtk_thermal_probe':
mtk_thermal.c:(.text+0xffa8f): undefined reference to `nvmem_cell_get'
mtk_thermal.c:(.text+0xffabe): undefined reference to `nvmem_cell_read'
mtk_thermal.c:(.text+0xffac9): undefined reference to `nvmem_cell_put'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: <linux-pm@vger.kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Hanyi Wu <hanyi.wu@mediatek.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Merge fixes from Andrew Morton:
"20 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
Documentation/sysctl/vm.txt: update numa_zonelist_order description
lib/stackdepot.c: allow the stack trace hash to be zero
rapidio: fix potential NULL pointer dereference
mm/memory-failure: fix race with compound page split/merge
ocfs2/dlm: return zero if deref_done message is successfully handled
Ananth has moved
kcov: don't profile branches in kcov
kcov: don't trace the code coverage code
mm: wake kcompactd before kswapd's short sleep
.mailmap: add Frank Rowand
mm/hwpoison: fix wrong num_poisoned_pages accounting
mm: call swap_slot_free_notify() with page lock held
mm: vmscan: reclaim highmem zone if buffer_heads is over limit
numa: fix /proc/<pid>/numa_maps for THP
mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check
mailmap: fix Krzysztof Kozlowski's misspelled name
thp: keep huge zero page pinned until tlb flush
mm: exclude HugeTLB pages from THP page_mapped() logic
kexec: export OFFSET(page.compound_head) to find out compound tail page
kexec: update VMCOREINFO for compound_order/dtor
Since governor operations are generally skipped if cpufreq_suspended
is set, cpufreq_start_governor() should do nothing in that case.
That function is called in the cpufreq_online() path, and may also
be called from cpufreq_offline() in some cases, which are invoked
by the nonboot CPUs disabing/enabling code during system suspend
to RAM and resume. That happens when all devices have been
suspended, so if the cpufreq driver relies on things like I2C to
get the current frequency, it may not be ready to do that then.
To prevent problems from happening for this reason, make
cpufreq_update_current_freq(), which is the only function invoked
by cpufreq_start_governor() that doesn't check cpufreq_suspended
already, return 0 upfront if cpufreq_suspended is set.
Fixes: 3bbf8fe3ae08 (cpufreq: Always update current frequency before startig governor)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
The REAL_LE feature entry in the ibm_pa_feature struct is missing an MMU
feature value, meaning all the remaining elements initialise the wrong
values.
This means instead of checking for byte 5, bit 0, we check for byte 0,
bit 0, and then we incorrectly set the CPU feature bit as well as MMU
feature bit 1 and CPU user feature bits 0 and 2 (5).
Checking byte 0 bit 0 (IBM numbering), means we're looking at the
"Memory Management Unit (MMU)" feature - ie. does the CPU have an MMU.
In practice that bit is set on all platforms which have the property.
This means we set CPU_FTR_REAL_LE always. In practice that seems not to
matter because all the modern cpus which have this property also
implement REAL_LE, and we've never needed to disable it.
We're also incorrectly setting MMU feature bit 1, which is:
#define MMU_FTR_TYPE_8xx 0x00000002
Luckily the only place that looks for MMU_FTR_TYPE_8xx is in Book3E
code, which can't run on the same cpus as scan_features(). So this also
doesn't matter in practice.
Finally in the CPU user feature mask, we're setting bits 0 and 2. Bit 2
is not currently used, and bit 0 is:
#define PPC_FEATURE_PPC_LE 0x00000001
Which says the CPU supports the old style "PPC Little Endian" mode.
Again this should be harmless in practice as no 64-bit CPUs implement
that mode.
Fix the code by adding the missing initialisation of the MMU feature.
Also add a comment marking CPU user feature bit 2 (0x4) as reserved. It
would be unsafe to start using it as old kernels incorrectly set it.
Fixes: 44ae3ab3358e ("powerpc: Free up some CPU feature bits by moving out MMU-related features")
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
[mpe: Flesh out changelog, add comment reserving 0x4]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull locking fixes from Ingo Molnar:
"Misc fixes:
pvqspinlocks:
- an instrumentation fix
futexes:
- preempt-count vs pagefault_disable decouple corner case fix
- futex requeue plist race window fix
- futex UNLOCK_PI transaction fix for a corner case"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
asm-generic/futex: Re-enable preemption in futex_atomic_cmpxchg_inatomic()
futex: Acknowledge a new waiter in counter before plist
futex: Handle unlock_pi race gracefully
locking/pvqspinlock: Fix division by zero in qstat_read()
The recent introduction of the hotplug thread which invokes the callbacks on
the plugged cpu, cased the following regression:
If takedown_cpu() fails, then we run into several issues:
1) The rollback of the target cpu states is not invoked. That leaves the smp
threads and the hotplug thread in disabled state.
2) notify_online() is executed due to a missing skip_onerr flag. That causes
that both CPU_DOWN_FAILED and CPU_ONLINE notifications are invoked which
confuses quite some notifiers.
3) The CPU_DOWN_FAILED notification is not invoked on the target CPU. That's
not an issue per se, but it is inconsistent and in consequence blocks the
patches which rely on these states being invoked on the target CPU and not
on the controlling cpu. It also does not preserve the strict call order on
rollback which is problematic for the ongoing state machine conversion as
well.
To fix this we add a rollback flag to the remote callback machinery and invoke
the rollback including the CPU_DOWN_FAILED notification on the remote
cpu. Further mark the notify online state with 'skip_onerr' so we don't get a
double invokation.
This workaround will go away once we moved the unplug invocation to the target
cpu itself.
[ tglx: Massaged changelog and moved the CPU_DOWN_FAILED notifiaction to the
target cpu ]
Fixes: 4cb28ced23c4 ("cpu/hotplug: Create hotplug threads")
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-s390@vger.kernel.org
Cc: rt@linutronix.de
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Link: http://lkml.kernel.org/r/20160408124015.GA21960@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Commit 0881841f7e78 introduced a regression by inverting a test check
after calling clocksource_mmio_init(). That results on the system to
hang at boot time.
Fix it by inverting the test again.
Fixes: 0881841f7e78 ("Replace code by clocksource_mmio_init")
Reported-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
In commit:
eb1af3b71f9d ("Fix computation of channel address")
I switched the "sck_way" variable from holding the log2 value read
from the h/w to instead be the actual number. Unfortunately it
is needed in log2 form when used to shift the address.
Tested-by: Patrick Geary <patrickg@supermicro.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: eb1af3b71f9d ("Fix computation of channel address")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull device mapper fix from Mike Snitzer:
"Fix for earlier 4.6-rc4 stable@ commit that introduced improper use of
write lock in cmd_read_lock() -- due to cut-n-paste gone awry (and
sparse didn't catch it)"
* tag 'dm-4.6-fix-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache metadata: fix cmd_read_lock() acquiring write lock
Trivial cleanups:
- delete one duplicate #include
- end email address with closing '>'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Hanyi Wu <hanyi.wu@mediatek.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Pull x86 fixes from Ingo Molnar:
"Two boot crash fixes and an IRQ handling crash fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic: Handle zero vector gracefully in clear_vector_irq()
Revert "x86/mm/32: Set NX in __supported_pte_mask before enabling paging"
xen/qspinlock: Don't kick CPU if IRQ is not initialized
Commit 3193913ce62c ("mm: page_alloc: default node-ordering on 64-bit
NUMA, zone-ordering on 32-bit") changes the default value of
numa_zonelist_order. Update the document.
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The drivers/infiniband stack uses write() as a replacement for
bi-directional ioctl(). This is not safe. There are ways to
trigger write calls that result in the return structure that
is normally written to user space being shunted off to user
specified kernel memory instead.
For the immediate repair, detect and deny suspicious accesses to
the write API.
For long term, update the user space libraries and the kernel API
to something that doesn't present the same security vulnerabilities
(likely a structured ioctl() interface).
The impacted uAPI interfaces are generally only available if
hardware from drivers/infiniband is installed in the system.
Reported-by: Jann Horn <jann@thejh.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
[ Expanded check to all known write() entry points ]
Cc: stable@vger.kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
virtio_gpu was failing to send vblank events when using the atomic IOCTL
with the DRM_MODE_PAGE_FLIP_EVENT flag set. This patch fixes each and
enables atomic pageflips updates.
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
mode->hdisplay * (var->bits_per_pixel + 7) gets evaluated before
the division, potentially making the pitch larger than it should
be.
Since the original intention is to do a div-round-up, just use
the macro instead.
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Jörg Otte reports that commit a4675fbc4a7a (cpufreq: intel_pstate:
Replace timers with utilization update callbacks) caused the CPUs in
his Haswell-based system to stay in the very high frequency region
even if the system is completely idle.
That turns out to be an existing problem in the intel_pstate driver's
P-state selection algorithm for Core processors. Namely, all
decisions made by that algorithm are based on the average frequency
of the CPU between sampling events and on the P-state requested on
the last invocation, so it may get stuck at a very hight frequency
even if the utilization of the CPU is very low (in fact, it may get
stuck in a inadequate P-state regardless of the CPU utilization).
The only way to kick it out of that limbo is a sufficiently long idle
period (3 times longer than the prescribed sampling interval), but if
that doesn't happen often enough (eg. due to a timing change like
after the above commit), the P-state of the CPU may be inadequate
pretty much all the time.
To address the most egregious manifestations of that issue, reset the
core_busy value used to determine the next P-state to request if the
utilization of the CPU, determined with the help of the MPERF
feedback register and the TSC, is below 1%.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=115771
Reported-and-tested-by: Jörg Otte <jrg.otte@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The recent decoupling of pagefault disable and preempt disable added an
explicit preempt_disable/enable() pair to the futex_atomic_cmpxchg_inatomic()
implementation in asm-generic/futex.h. But it forgot to add preempt_enable()
calls to the error handling code pathes, which results in a preemption count
imbalance.
This is observable on boot when the test for atomic_cmpxchg() is calling
futex_atomic_cmpxchg_inatomic() on a NULL pointer.
Add the missing preempt_enable() calls to the error handling code pathes.
[ tglx: Massaged changelog ]
Fixes: d9b9ff8c1889 ("sched/preempt, futex: Disable preemption in UP futex_atomic_cmpxchg_inatomic() explicitly")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Cc: linux-arch@vger.kernel.org
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1460640963-690-1-git-send-email-romain.perier@free-electrons.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tracing a workload that uses transactions gave a seg fault as follows:
perf record -e intel_pt// workload
perf report
Program received signal SIGSEGV, Segmentation fault.
0x000000000054b58c in intel_pt_reset_last_branch_rb (ptq=0x1a36110)
at util/intel-pt.c:929
929 ptq->last_branch_rb->nr = 0;
(gdb) p ptq->last_branch_rb
$1 = (struct branch_stack *) 0x0
(gdb) up
1148 intel_pt_reset_last_branch_rb(ptq);
(gdb) l
1143 if (ret)
1144 pr_err("Intel Processor Trace: failed to deliver transaction event
1145 ret);
1146
1147 if (pt->synth_opts.callchain)
1148 intel_pt_reset_last_branch_rb(ptq);
1149
1150 return ret;
1151 }
1152
(gdb) p pt->synth_opts.callchain
$2 = true
(gdb)
(gdb) bt
#0 0x000000000054b58c in intel_pt_reset_last_branch_rb (ptq=0x1a36110)
#1 0x000000000054c1e0 in intel_pt_synth_transaction_sample (ptq=0x1a36110)
#2 0x000000000054c5b2 in intel_pt_sample (ptq=0x1a36110)
Caused by checking the 'callchain' flag when it should have been the
'last_branch' flag. Fix that.
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: f14445ee72c5 ("perf intel-pt: Support generating branch stack")
Link: http://lkml.kernel.org/r/1460977068-11566-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use signed integer output in the pr_err() string format, here PTR_ERR() value
is negative and it should be reported in human readable way.
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Cc: Jisheng Zhang <jszhang@marvell.com>
Cc: Ezequiel Garcia <ezequiel.garcia@imgtec.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: http://lkml.kernel.org/r/1458603727-4446-1-git-send-email-vz@mleia.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Huge pages are not normally available to PV guests. Not suppressing
hugetlbfs use results in an endless loop of page faults when user mode
code tries to access a hugetlbfs mapped area (since the hypervisor
denies such PTEs to be created, but error indications can't be
propagated out of xen_set_pte_at(), just like for various of its
siblings), and - once killed in an oops like this:
kernel BUG at .../fs/hugetlbfs/inode.c:428!
invalid opcode: 0000 [#1] SMP
...
RIP: e030:[<ffffffff811c333b>] [<ffffffff811c333b>] remove_inode_hugepages+0x25b/0x320
...
Call Trace:
[<ffffffff811c3415>] hugetlbfs_evict_inode+0x15/0x40
[<ffffffff81167b3d>] evict+0xbd/0x1b0
[<ffffffff8116514a>] __dentry_kill+0x19a/0x1f0
[<ffffffff81165b0e>] dput+0x1fe/0x220
[<ffffffff81150535>] __fput+0x155/0x200
[<ffffffff81079fc0>] task_work_run+0x60/0xa0
[<ffffffff81063510>] do_exit+0x160/0x400
[<ffffffff810637eb>] do_group_exit+0x3b/0xa0
[<ffffffff8106e8bd>] get_signal+0x1ed/0x470
[<ffffffff8100f854>] do_signal+0x14/0x110
[<ffffffff810030e9>] prepare_exit_to_usermode+0xe9/0xf0
[<ffffffff814178a5>] retint_user+0x8/0x13
This is CVE-2016-3961 / XSA-174.
Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <JGross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: stable@vger.kernel.org
Cc: xen-devel <xen-devel@lists.xenproject.org>
Link: http://lkml.kernel.org/r/57188ED802000078000E431C@prv-mh.provo.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull char/misc fixes from Greg KH:
"Here are some small char/misc driver fixes for 4.6-rc4. Full details
are in the shortlog, nothing major here.
These have all been in linux-next for a while with no reported issues"
* tag 'char-misc-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
lkdtm: do not leak free page on kmalloc failure
lkdtm: fix memory leak of base
lkdtm: fix memory leak of val
extcon: palmas: Drop stray IRQF_EARLY_RESUME flag