commits

It allows to flush more than 4GB of device TLBs. So the mask should be
64bit wide. UBSAN captured this fault as below.

[ 3.760024] ================================================================================
[ 3.768440] UBSAN: Undefined behaviour in drivers/iommu/dmar.c:1348:3
[ 3.774864] shift exponent 64 is too large for 32-bit type 'int'
[ 3.780853] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G U 4.17.0-rc1+ #89
[ 3.788661] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016
[ 3.796034] Call Trace:
[ 3.798472] <IRQ>
[ 3.800479] dump_stack+0x90/0xfb
[ 3.803787] ubsan_epilogue+0x9/0x40
[ 3.807353] __ubsan_handle_shift_out_of_bounds+0x10e/0x170
[ 3.812916] ? qi_flush_dev_iotlb+0x124/0x180
[ 3.817261] qi_flush_dev_iotlb+0x124/0x180
[ 3.821437] iommu_flush_dev_iotlb+0x94/0xf0
[ 3.825698] iommu_flush_iova+0x10b/0x1c0
[ 3.829699] ? fq_ring_free+0x1d0/0x1d0
[ 3.833527] iova_domain_flush+0x25/0x40
[ 3.837448] fq_flush_timeout+0x55/0x160
[ 3.841368] ? fq_ring_free+0x1d0/0x1d0
[ 3.845200] ? fq_ring_free+0x1d0/0x1d0
[ 3.849034] call_timer_fn+0xbe/0x310
[ 3.852696] ? fq_ring_free+0x1d0/0x1d0
[ 3.856530] run_timer_softirq+0x223/0x6e0
[ 3.860625] ? sched_clock+0x5/0x10
[ 3.864108] ? sched_clock+0x5/0x10
[ 3.867594] __do_softirq+0x1b5/0x6f5
[ 3.871250] irq_exit+0xd4/0x130
[ 3.874470] smp_apic_timer_interrupt+0xb8/0x2f0
[ 3.879075] apic_timer_interrupt+0xf/0x20
[ 3.883159] </IRQ>
[ 3.885255] RIP: 0010:poll_idle+0x60/0xe7
[ 3.889252] RSP: 0018:ffffb1b201943e30 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[ 3.896802] RAX: 0000000080200000 RBX: 000000000000008e RCX: 000000000000001f
[ 3.903918] RDX: 0000000000000000 RSI: 000000002819aa06 RDI: 0000000000000000
[ 3.911031] RBP: ffff9e93c6b33280 R08: 00000010f717d567 R09: 000000000010d205
[ 3.918146] R10: ffffb1b201943df8 R11: 0000000000000001 R12: 00000000e01b169d
[ 3.925260] R13: 0000000000000000 R14: ffffffffb12aa400 R15: 0000000000000000
[ 3.932382] cpuidle_enter_state+0xb4/0x470
[ 3.936558] do_idle+0x222/0x310
[ 3.939779] cpu_startup_entry+0x78/0x90
[ 3.943693] start_secondary+0x205/0x2e0
[ 3.947607] secondary_startup_64+0xa5/0xb0
[ 3.951783] ================================================================================

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

7y ago

Linus Torvalds

c61a56ab

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
"Another set of x86 related updates:

- Fix the long broken x32 version of the IPC user space headers which
was noticed by Arnd Bergman in course of his ongoing y2038 work.
GLIBC seems to have non broken private copies of these headers so
this went unnoticed.

- Two microcode fixlets which address some more fallout from the
recent modifications in that area:

- Unconditionally save the microcode patch, which was only saved
when CPU_HOTPLUG was enabled causing failures in the late
loading mechanism

- Make the later loader synchronization finally work under all
circumstances. It was exiting early and causing timeout failures
due to a missing synchronization point.

- Do not use mwait_play_dead() on AMD systems to prevent excessive
power consumption as the CPU cannot go into deep power states from
there.

- Address an annoying sparse warning due to lost type qualifiers of
the vmemmap and vmalloc base address constants.

- Prevent reserving crash kernel region on Xen PV as this leads to
the wrong perception that crash kernels actually work there which
is not the case. Xen PV has its own crash mechanism handled by the
hypervisor.

- Add missing TLB cpuid values to the table to make the printout on
certain machines correct.

- Enumerate the new CLDEMOTE instruction

- Fix an incorrect SPDX identifier

- Remove stale macros"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ipc: Fix x32 version of shmid64_ds and msqid64_ds
x86/setup: Do not reserve a crash kernel region if booted on Xen PV
x86/cpu/intel: Add missing TLB cpuid values
x86/smpboot: Don't use mwait_play_dead() on AMD systems
x86/mm: Make vmemmap and vmalloc base address constants unsigned long
x86/vector: Remove the unused macro FPU_IRQ
x86/vector: Remove the macro VECTOR_OFFSET_START
x86/cpufeatures: Enumerate cldemote instruction
x86/microcode: Do not exit early from __reload_late()
x86/microcode/intel: Save microcode patch unconditionally
x86/jailhouse: Fix incorrect SPDX identifier

7y ago

Valentin Schneider

c3616a07

KVM: arm/arm64: vgic_init: Cleanup reference to process_maintenance

7y ago

Linus Torvalds

ee946c36

Merge tag 'platform-drivers-x86-v4.17-2' of git://git.infradead.org/linux-platform-drivers-x86

7y ago

Agustin Vega-Frias

1bc2463c

irqchip/qcom: Fix check for spurious interrupts

7y ago

Peter Zijlstra

cd2af07d

clocksource: Consistent de-rate when marking unstable

7y ago

Shameer Kolothum

cd2c9fcf

iommu/dma: Move PCI window region reservation back into dma specific path.

7y ago

Linus Torvalds

65f4d6d0

Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

7y ago

Arnd Bergmann

1a512c08

x86/ipc: Fix x32 version of shmid64_ds and msqid64_ds

7y ago

James Morse

1975fa56

KVM: arm64: Fix order of vcpu_write_sys_reg() arguments

7y ago

Linus Torvalds

8e95cb33

Merge tag 'usb-4.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

7y ago

Mario Limonciello

7fe3fa3b

platform/x86: Kconfig: Fix dell-laptop dependency chain.

7y ago

Peter Zijlstra

e3b4f790

x86/tsc: Fix mark_tsc_unstable()

7y ago

Heiko Stuebner

2f8c7f2e

iommu/rockchip: Make clock handling optional

7y ago

Linus Torvalds

810fb07a

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

7y ago

Andy Lutomirski

8bb2610b

x86/entry/64/compat: Preserve r8-r11 in int $0x80

7y ago

Petr Tesarik

3db3eb28

x86/setup: Do not reserve a crash kernel region if booted on Xen PV

7y ago

Marc Zyngier

53692908

KVM: arm/arm64: vgic: Fix source vcpu issues for GICv2 SGI

7y ago

Linus Torvalds

c1c07416

Merge tag 'kbuild-fixes-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

7y ago

Greg Kroah-Hartman

6844dc42

Merge tag 'usb-serial-4.17-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus

7y ago

João Paulo Rechi Vita

9f0a93de

platform/x86: asus-wireless: Fix NULL pointer dereference

When the module is removed the led workqueue is destroyed in the remove
callback, before the led device is unregistered from the led subsystem.

This leads to a NULL pointer derefence when the led device is
unregistered automatically later as part of the module removal cleanup.
Bellow is the backtrace showing the problem.

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: __queue_work+0x8c/0x410
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
Modules linked in: ccm edac_mce_amd kvm_amd kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 joydev crypto_simd asus_nb_wmi glue_helper uvcvideo snd_hda_codec_conexant snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel asus_wmi snd_hda_codec cryptd snd_hda_core sparse_keymap videobuf2_vmalloc arc4 videobuf2_memops snd_hwdep input_leds videobuf2_v4l2 ath9k psmouse videobuf2_core videodev ath9k_common snd_pcm ath9k_hw media fam15h_power ath k10temp snd_timer mac80211 i2c_piix4 r8169 mii mac_hid cfg80211 asus_wireless(-) snd soundcore wmi shpchp 8250_dw ip_tables x_tables amdkfd amd_iommu_v2 amdgpu radeon chash i2c_algo_bit drm_kms_helper syscopyarea serio_raw sysfillrect sysimgblt fb_sys_fops ahci ttm libahci drm video
CPU: 3 PID: 2177 Comm: rmmod Not tainted 4.15.0-5-generic #6+dev94.b4287e5bem1-Endless
Hardware name: ASUSTeK COMPUTER INC. X555DG/X555DG, BIOS 5.011 05/05/2015
RIP: 0010:__queue_work+0x8c/0x410
RSP: 0018:ffffbe8cc249fcd8 EFLAGS: 00010086
RAX: ffff992ac6810800 RBX: 0000000000000000 RCX: 0000000000000008
RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff992ac6400e18
RBP: ffffbe8cc249fd18 R08: ffff992ac6400db0 R09: 0000000000000000
R10: 0000000000000040 R11: ffff992ac6400dd8 R12: 0000000000002000
R13: ffff992abd762e00 R14: ffff992abd763e38 R15: 000000000001ebe0
FS: 00007f318203e700(0000) GS:ffff992aced80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000001c720e000 CR4: 00000000001406e0
Call Trace:
queue_work_on+0x38/0x40
led_state_set+0x2c/0x40 [asus_wireless]
led_set_brightness_nopm+0x14/0x40
led_set_brightness+0x37/0x60
led_trigger_set+0xfc/0x1d0
led_classdev_unregister+0x32/0xd0
devm_led_classdev_release+0x11/0x20
release_nodes+0x109/0x1f0
devres_release_all+0x3c/0x50
device_release_driver_internal+0x16d/0x220
driver_detach+0x3f/0x80
bus_remove_driver+0x55/0xd0
driver_unregister+0x2c/0x40
acpi_bus_unregister_driver+0x15/0x20
asus_wireless_driver_exit+0x10/0xb7c [asus_wireless]
SyS_delete_module+0x1da/0x2b0
entry_SYSCALL_64_fastpath+0x24/0x87
RIP: 0033:0x7f3181b65fd7
RSP: 002b:00007ffe74bcbe18 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3181b65fd7
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000555ea2559258
RBP: 0000555ea25591f0 R08: 00007ffe74bcad91 R09: 000000000000000a
R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000003
R13: 00007ffe74bcae00 R14: 0000000000000000 R15: 0000555ea25591f0
Code: 01 00 00 02 0f 85 7d 01 00 00 48 63 45 d4 48 c7 c6 00 f4 fa 87 49 8b 9d 08 01 00 00 48 03 1c c6 4c 89 f7 e8 87 fb ff ff 48 85 c0 <48> 8b 3b 0f 84 c5 01 00 00 48 39 f8 0f 84 bc 01 00 00 48 89 c7
RIP: __queue_work+0x8c/0x410 RSP: ffffbe8cc249fcd8
CR2: 0000000000000000
---[ end trace 7aa4f4a232e9c39c ]---

Unregistering the led device on the remove callback before destroying the
workqueue avoids this problem.

https://bugzilla.kernel.org/show_bug.cgi?id=196097

Reported-by: Dun Hum <bitter.taste@gmx.com>
Cc: stable@vger.kernel.org
Signed-off-by: João Paulo Rechi Vita <jprvita@endlessm.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>

7y ago

Peter Zijlstra

5b9e886a

clocksource: Initialize cs->wd_list

7y ago

Arnd Bergmann

94c793ac

iommu/amd: Hide unused iommu_table_lock

7y ago

Linus Torvalds

7d9e55fe

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
"The perf update contains the following bits:

x86:
- Prevent setting freeze_on_smi on PerfMon V1 CPUs to avoid #GP

perf stat:
- Keep the '/' event modifier separator in fallback, for example when
fallbacking from 'cpu/cpu-cycles/' to user level only, where it
should become 'cpu/cpu-cycles/u' and not 'cpu/cpu-cycles/:u' (Jiri
Olsa)

- Fix PMU events parsing rule, improving error reporting for invalid
events (Jiri Olsa)

- Disable write_backward and other event attributes for !group events
in a group, fixing, for instance this group: '{cycles,msr/aperf/}:S'
that has leader sampling (:S) and where just the 'cycles', the
leader event, should have the write_backward attribute set, in this
case it all fails because the PMU where 'msr/aperf/' lives doesn't
accepts write_backward style sampling (Jiri Olsa)

- Only fall back group read for leader (Kan Liang)

- Fix core PMU alias list for x86 platform (Kan Liang)

- Print out hint for mixed PMU group error (Kan Liang)

- Fix duplicate PMU name for interval print (Kan Liang)

Core:
- Set main kernel end address properly when reading kernel and module
maps (Namhyung Kim)

perf mem:
- Fix incorrect entries and add missing man options (Sangwon Hong)

s/390:
- Remove s390 specific strcmp_cpuid_cmp function (Thomas Richter)

- Adapt 'perf test' case record+probe_libc_inet_pton.sh for s390

- Fix s390 undefined record__auxtrace_init() return value in 'perf
record' (Thomas Richter)"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Don't enable freeze-on-smi for PerfMon V1
perf stat: Fix duplicate PMU name for interval print
perf evsel: Only fall back group read for leader
perf stat: Print out hint for mixed PMU group error
perf pmu: Fix core PMU alias list for X86 platform
perf record: Fix s390 undefined record__auxtrace_init() return value
perf mem: Document incorrect and missing options
perf evsel: Disable write_backward for leader sampling group events
perf pmu: Fix pmu events parsing rule
perf stat: Keep the / modifier separator in fallback
perf test: Adapt test case record+probe_libc_inet_pton.sh for s390
perf list: Remove s390 specific strcmp_cpuid_cmp function
perf machine: Set main kernel end address properly

7y ago

Thomas Gleixner

a3ed0e43

Revert: Unify CLOCK_MONOTONIC and CLOCK_BOOTTIME

Revert commits

92af4dcb4e1c ("tracing: Unify the "boot" and "mono" tracing clocks")
127bfa5f4342 ("hrtimer: Unify MONOTONIC and BOOTTIME clock behavior")
7250a4047aa6 ("posix-timers: Unify MONOTONIC and BOOTTIME clock behavior")
d6c7270e913d ("timekeeping: Remove boot time specific code")
f2d6fdbfd238 ("Input: Evdev - unify MONOTONIC and BOOTTIME clock behavior")
d6ed449afdb3 ("timekeeping: Make the MONOTONIC clock behave like the BOOTTIME clock")
72199320d49d ("timekeeping: Add the new CLOCK_MONOTONIC_ACTIVE clock")

As stated in the pull request for the unification of CLOCK_MONOTONIC and
CLOCK_BOOTTIME, it was clear that we might have to revert the change.

As reported by several folks systemd and other applications rely on the
documented behaviour of CLOCK_MONOTONIC on Linux and break with the above
changes. After resume daemons time out and other timeout related issues are
observed. Rafael compiled this list:

* systemd kills daemons on resume, after >WatchdogSec seconds
of suspending (Genki Sky). [Verified that that's because systemd uses
CLOCK_MONOTONIC and expects it to not include the suspend time.]

* systemd-journald misbehaves after resume:
systemd-journald[7266]: File /var/log/journal/016627c3c4784cd4812d4b7e96a34226/system.journal
corrupted or uncleanly shut down, renaming and replacing.
(Mike Galbraith).

* NetworkManager reports "networking disabled" and networking is broken
after resume 50% of the time (Pavel). [May be because of systemd.]

* MATE desktop dims the display and starts the screensaver right after
system resume (Pavel).

* Full system hang during resume (me). [May be due to systemd or NM or both.]

That happens on debian and open suse systems.

It's sad, that these problems were neither catched in -next nor by those
folks who expressed interest in this change.

Reported-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Reported-by: Genki Sky <sky@genki.is>,
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kevin Easton <kevin@guarana.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salyzyn <salyzyn@android.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>

7y ago

Dave Hansen

316d097c

x86/pti: Filter at vma->vm_page_prot population

7y ago

jacek.tomaka@poczta.fm

b837913f

x86/cpu/intel: Add missing TLB cpuid values

7y ago

Marc Zyngier

85bd0ba1

arm/arm64: KVM: Add PSCI version selection API

7y ago

Linus Torvalds

4a7a7729

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

7y ago

Mauro Rossi

0da7e432

genksyms: fix typo in parse.tab.{c,h} generation rules

7y ago

Greg Kroah-Hartman

43b78f11

Revert "usb: host: ehci: Use dma_pool_zalloc()"

7y ago

Greg Kroah-Hartman

4842ed5b

USB: serial: visor: handle potential invalid device configuration

7y ago

Linus Torvalds

60cc43fc

Linux 4.17-rc1 v4.17-rc1

7y ago

Peter Zijlstra

2aae7bcf

clocksource: Allow clocksource_mark_unstable() on unregistered clocksources

7y ago

Jagannathan Raman

aa7528fe

iommu/vt-d: Fix usage of force parameter in intel_ir_reconfigure_irte()

7y ago

Linus Torvalds

cdface52

Merge tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

7y ago

Ingo Molnar

d4652f61

Merge tag 'perf-urgent-for-mingo-4.17-20180425' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

7y ago

Thomas Gleixner

1f71addd

tick/sched: Do not mess with an enqueued hrtimer

Kaike reported that in tests rdma hrtimers occasionaly stopped working. He
did great debugging, which provided enough context to decode the problem.

CPU 3 CPU 2

idle
start sched_timer expires = 712171000000
queue->next = sched_timer
start rdmavt timer. expires = 712172915662
lock(baseof(CPU3))
tick_nohz_stop_tick()
tick = 716767000000 timerqueue_add(tmr)

hrtimer_set_expires(sched_timer, tick);
sched_timer->expires = 716767000000 <---- FAIL
if (tmr->expires < queue->next->expires)
hrtimer_start(sched_timer) queue->next = tmr;
lock(baseof(CPU3))
unlock(baseof(CPU3))
timerqueue_remove()
timerqueue_add()

ts->sched_timer is queued and queue->next is pointing to it, but then
ts->sched_timer.expires is modified.

This not only corrupts the ordering of the timerqueue RB tree, it also
makes CPU2 see the new expiry time of timerqueue->next->expires when
checking whether timerqueue->next needs to be updated. So CPU2 sees that
the rdma timer is earlier than timerqueue->next and sets the rdma timer as
new next.

Depending on whether it had also seen the new time at RB tree enqueue, it
might have queued the rdma timer at the wrong place and then after removing
the sched_timer the RB tree is completely hosed.

The problem was introduced with a commit which tried to solve inconsistency
between the hrtimer in the tick_sched data and the underlying hardware
clockevent. It split out hrtimer_set_expires() to store the new tick time
in both the NOHZ and the NOHZ + HIGHRES case, but missed the fact that in
the NOHZ + HIGHRES case the hrtimer might still be queued.

Use hrtimer_start(timer, tick...) for the NOHZ + HIGHRES case which sets
timer->expires after canceling the timer and move the hrtimer_set_expires()
invocation into the NOHZ only code path which is not affected as it merily
uses the hrtimer as next event storage so code pathes can be shared with
the NOHZ + HIGHRES case.

Fixes: d4af6d933ccf ("nohz: Fix spurious warning when hrtimer and clockevent get out of sync")
Reported-by: "Wan Kaike" <kaike.wan@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Cc: "Marciniszyn Mike" <mike.marciniszyn@intel.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: linux-rdma@vger.kernel.org
Cc: "Dalessandro Dennis" <dennis.dalessandro@intel.com>
Cc: "Fleck John" <john.fleck@intel.com>
Cc: stable@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "Weiny Ira" <ira.weiny@intel.com>
Cc: "linux-rdma@vger.kernel.org"
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1804241637390.1679@nanos.tec.linutronix.de
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1804242119210.1597@nanos.tec.linutronix.de