commits

Laurent reported that STRICT_MODULE_RWX was causing intermittent crashes
on one of his systems:

kernel tried to execute exec-protected page (c008000004073278) - exploit attempt? (uid: 0)
BUG: Unable to handle kernel instruction fetch
Faulting instruction address: 0xc008000004073278
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks ...
CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
Workqueue: events control_work_handler [virtio_console]
NIP: c008000004073278 LR: c008000004073278 CTR: c0000000001e9de0
REGS: c00000002e4ef7e0 TRAP: 0400 Not tainted (5.14.0-rc4+)
MSR: 800000004280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24002822 XER: 200400cf
...
NIP fill_queue+0xf0/0x210 [virtio_console]
LR fill_queue+0xf0/0x210 [virtio_console]
Call Trace:
fill_queue+0xb4/0x210 [virtio_console] (unreliable)
add_port+0x1a8/0x470 [virtio_console]
control_work_handler+0xbc/0x1e8 [virtio_console]
process_one_work+0x290/0x590
worker_thread+0x88/0x620
kthread+0x194/0x1a0
ret_from_kernel_thread+0x5c/0x64

Jordan, Fabiano & Murilo were able to reproduce and identify that the
problem is caused by the call to module_enable_ro() in do_init_module(),
which happens after the module's init function has already been called.

Our current implementation of change_page_attr() is not safe against
concurrent accesses, because it invalidates the PTE before flushing the
TLB and then installing the new PTE. That leaves a window in time where
there is no valid PTE for the page, if another CPU tries to access the
page at that time we see something like the fault above.

We can't simply switch to set_pte_at()/flush TLB, because our hash MMU
code doesn't handle a set_pte_at() of a valid PTE. See [1].

But we do have pte_update(), which replaces the old PTE with the new,
meaning there's no window where the PTE is invalid. And the hash MMU
version hash__pte_update() deals with synchronising the hash page table
correctly.

[1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r.fsf@linux.ibm.com/

Fixes: 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
Reported-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Murilo Opsfelder Araújo <muriloo@linux.ibm.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210818120518.3603172-1-mpe@ellerman.id.au

4y ago

Linus Torvalds

9085423f

Merge tag 'char-misc-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

4y ago

Bjorn Andersson

9711759a

clk: qcom: gdsc: Ensure regulator init state matches GDSC state

4y ago

Christophe Leroy

ef486bf4

powerpc/32s: Fix random crashes by adding isync() after locking/unlocking KUEP

Commit b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C")
removed the 'isync' instruction after adding/removing NX bit in user
segments. The reasoning behind this change was that when setting the
NX bit we don't mind it taking effect with delay as the kernel never
executes text from userspace, and when clearing the NX bit this is
to return to userspace and then the 'rfi' should synchronise the
context.

However, it looks like on book3s/32 having a hash page table, at least
on the G3 processor, we get an unexpected fault from userspace, then
this is followed by something wrong in the verification of MSR_PR
at end of another interrupt.

This is fixed by adding back the removed isync() following update
of NX bit in user segment registers. Only do it for cores with an
hash table, as 603 cores don't exhibit that problem and the two isync
increase ./null_syscall selftest by 6 cycles on an MPC 832x.

First problem: unexpected WARN_ON() for mysterious PROTFAULT

WARNING: CPU: 0 PID: 1660 at arch/powerpc/mm/fault.c:354 do_page_fault+0x6c/0x5b0
Modules linked in:
CPU: 0 PID: 1660 Comm: Xorg Not tainted 5.13.0-pmac-00028-gb3c15b60339a #40
NIP: c001b5c8 LR: c001b6f8 CTR: 00000000
REGS: e2d09e40 TRAP: 0700 Not tainted (5.13.0-pmac-00028-gb3c15b60339a)
MSR: 00021032 <ME,IR,DR,RI> CR: 42d04f30 XER: 20000000
GPR00: c000424c e2d09f00 c301b680 e2d09f40 0000001e 42000000 00cba028 00000000
GPR08: 08000000 48000010 c301b680 e2d09f30 22d09f30 00c1fff0 00cba000 a7b7ba4c
GPR16: 00000031 00000000 00000000 00000000 00000000 00000000 a7b7b0d0 00c5c010
GPR24: a7b7b64c a7b7d2f0 00000004 00000000 c1efa6c0 00cba02c 00000300 e2d09f40
NIP [c001b5c8] do_page_fault+0x6c/0x5b0
LR [c001b6f8] do_page_fault+0x19c/0x5b0
Call Trace:
[e2d09f00] [e2d09f04] 0xe2d09f04 (unreliable)
[e2d09f30] [c000424c] DataAccess_virt+0xd4/0xe4
--- interrupt: 300 at 0xa7a261dc
NIP: a7a261dc LR: a7a253bc CTR: 00000000
REGS: e2d09f40 TRAP: 0300 Not tainted (5.13.0-pmac-00028-gb3c15b60339a)
MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 228428e2 XER: 20000000
DAR: 00cba02c DSISR: 42000000
GPR00: a7a27448 afa6b0e0 a74c35c0 a7b7b614 0000001e a7b7b614 00cba028 00000000
GPR08: 00020fd9 00000031 00cb9ff8 a7a273b0 220028e2 00c1fff0 00cba000 a7b7ba4c
GPR16: 00000031 00000000 00000000 00000000 00000000 00000000 a7b7b0d0 00c5c010
GPR24: a7b7b64c a7b7d2f0 00000004 00000002 0000001e a7b7b614 a7b7aff4 00000030
NIP [a7a261dc] 0xa7a261dc
LR [a7a253bc] 0xa7a253bc
--- interrupt: 300
Instruction dump:
7c4a1378 810300a0 75278410 83820298 83a300a4 553b018c 551e0036 4082038c
2e1b0000 40920228 75280800 41820220 <0fe00000> 3b600000 41920214 81420594

Second problem: MSR PR is seen unset allthough the interrupt frame shows it set

kernel BUG at arch/powerpc/kernel/interrupt.c:458!
Oops: Exception in kernel mode, sig: 5 [#1]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in:
CPU: 0 PID: 1660 Comm: Xorg Tainted: G W 5.13.0-pmac-00028-gb3c15b60339a #40
NIP: c0011434 LR: c001629c CTR: 00000000
REGS: e2d09e70 TRAP: 0700 Tainted: G W (5.13.0-pmac-00028-gb3c15b60339a)
MSR: 00029032 <EE,ME,IR,DR,RI> CR: 42d09f30 XER: 00000000
GPR00: 00000000 e2d09f30 c301b680 e2d09f40 83440000 c44d0e68 e2d09e8c 00000000
GPR08: 00000002 00dc228a 00004000 e2d09f30 22d09f30 00c1fff0 afa6ceb4 00c26144
GPR16: 00c25fb8 00c26140 afa6ceb8 90000000 00c944d8 0000001c 00000000 00200000
GPR24: 00000000 000001fb afa6d1b4 00000001 00000000 a539a2a0 a530fd80 00000089
NIP [c0011434] interrupt_exit_kernel_prepare+0x10/0x70
LR [c001629c] interrupt_return+0x9c/0x144
Call Trace:
[e2d09f30] [c000424c] DataAccess_virt+0xd4/0xe4 (unreliable)
--- interrupt: 300 at 0xa09be008
NIP: a09be008 LR: a09bdfe8 CTR: a09bdfc0
REGS: e2d09f40 TRAP: 0300 Tainted: G W (5.13.0-pmac-00028-gb3c15b60339a)
MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 420028e2 XER: 20000000
DAR: a539a308 DSISR: 0a000000
GPR00: a7b90d50 afa6b2d0 a74c35c0 a0a8b690 a0a8b698 a5365d70 a4fa82a8 00000004
GPR08: 00000000 a09bdfc0 00000000 a5360000 a09bde7c 00c1fff0 afa6ceb4 00c26144
GPR16: 00c25fb8 00c26140 afa6ceb8 90000000 00c944d8 0000001c 00000000 00200000
GPR24: 00000000 000001fb afa6d1b4 00000001 00000000 a539a2a0 a530fd80 00000089
NIP [a09be008] 0xa09be008
LR [a09bdfe8] 0xa09bdfe8
--- interrupt: 300
Instruction dump:
80010024 83e1001c 7c0803a6 4bffff80 3bc00800 4bffffd0 486b42fd 4bffffcc
81430084 71480002 41820038 554a0462 <0f0a0000> 80620060 74630001 40820034

Fixes: b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C")
Cc: stable@vger.kernel.org # v5.13+
Reported-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4856f5574906e2aec0522be17bf3848a22b2cd0b.1629269345.git.christophe.leroy@csgroup.eu

4y ago

Linus Torvalds

f4ff9e6b

Merge tag 'usb-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

4y ago

Greg Kroah-Hartman

d30836a9

Merge tag 'icc-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-linus

4y ago

Dong Aisheng

283f1b9a

clk: imx6q: fix uart earlycon unwork

4y ago

Nathan Chancellor

3f78c90f

powerpc/xive: Do not mark xive_request_ipi() as __init

4y ago

Linus Torvalds

a09434f1

Merge tag 'riscv-for-linus-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

4y ago

Hans de Goede

5571ea31

usb: typec: tcpm: Fix VDMs sometimes not being forwarded to alt-mode drivers

4y ago

Dongliang Mu

50f05bd1

ipack: tpci200: fix memory leak in the tpci200_register

4y ago

Georgi Djakov

f7530674

Revert "interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate"

4y ago

Brian Norris

f828b0bc

clk: fix leak on devm_clk_bulk_get_all() unwind

4y ago

Cédric Le Goater

cbc06f05

powerpc/xive: Do not skip CPU-less nodes when creating the IPIs

4y ago

Linus Torvalds

5479a7fe

Merge tag 's390-5.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

4y ago

Petr Pavlu

aa3e1ba3

riscv: Fix a number of free'd resources in init_resources()

4y ago

Linus Torvalds

7c60610d

Linux 5.14-rc6 v5.14-rc6

4y ago

Dongliang Mu

57a16810

ipack: tpci200: fix many double free issues in tpci200_pci_probe

4y ago

Linus Torvalds

36a21d51

Linux 5.14-rc5 v5.14-rc5

4y ago

Dmitry Osipenko

2bcc025a

clk: tegra: Implement disable_unused() of tegra_clk_sdmmc_mux_ops

4y ago

Christophe Leroy

01fcac8e

powerpc/interrupt: Do not call single_step_exception() from other exceptions

4y ago

Linus Torvalds

15517c72

Merge tag 'locks-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux

4y ago

Niklas Schnelle

2a671f77

s390/pci: fix use after free of zpci_dev

4y ago

Rob Herring

1c8094e3

dt-bindings: sifive-l2-cache: Fix 'select' matching

4y ago

Linus Torvalds

ecf93431

Merge tag 'powerpc-5.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

4y ago

Srinivas Kandagatla

d7777253

slimbus: ngd: reset dma setup during runtime pm

4y ago

Linus Torvalds

cceb6347

Merge tag 'timers-urgent-2021-08-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

4y ago

Shawn Guo

4ee107c5

clk: qcom: smd-rpm: Fix MSM8936 RPM_SMD_PCNOC_A_CLK

Commit a0384ecfe2aa ("clk: qcom: smd-rpm: De-duplicate identical
entries") introduces the following regression on MSM8936/MSM8939, as
RPM_SMD_PCNOC_A_CLK gets pointed to pcnoc_clk by mistake. Fix it by
correcting the clock to pcnoc_a_clk.

[ 1.307363] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 1.313593] Mem abort info:
[ 1.322512] ESR = 0x96000004
[ 1.325132] EC = 0x25: DABT (current EL), IL = 32 bits
[ 1.338872] SET = 0, FnV = 0
[ 1.355483] EA = 0, S1PTW = 0
[ 1.368702] FSC = 0x04: level 0 translation fault
[ 1.383294] Data abort info:
[ 1.398292] ISV = 0, ISS = 0x00000004
[ 1.398297] CM = 0, WnR = 0
[ 1.398301] [0000000000000000] user address but active_mm is swapper
[ 1.404193] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 1.420596] Modules linked in:
[ 1.420604] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.14.0-rc3+ #198
[ 1.441010] pc : __clk_register+0x48/0x780
[ 1.446045] lr : __clk_register+0x3c/0x780
[ 1.449953] sp : ffff800010063440
[ 1.454031] x29: ffff800010063440 x28: 0000000000000004 x27: 0000000000000066
[ 1.457423] x26: 0000000000000001 x25: 000000007fffffff x24: ffff800010f9f388
[ 1.464540] x23: ffff00007fc12a90 x22: ffff0000034b2010 x21: 0000000000000000
[ 1.471658] x20: ffff800010f9fff8 x19: ffff00000152a700 x18: 0000000000000001
[ 1.478778] x17: ffff00007fbd40c8 x16: 0000000000000460 x15: 0000000000000465
[ 1.485895] x14: ffffffffffffffff x13: 746e756f635f7265 x12: 696669746f6e5f6b
[ 1.493013] x11: 0000000000000006 x10: 0000000000000000 x9 : 0000000000000000
[ 1.500131] x8 : ffff00000152a800 x7 : 0000000000000000 x6 : 000000000000003f
[ 1.507249] x5 : 0000000000000040 x4 : 0000000000000000 x3 : 0000000000000004
[ 1.514367] x2 : 0000000000000000 x1 : 0000000000000cc0 x0 : ffff00000152a700
[ 1.521486] Call trace:
[ 1.528598] __clk_register+0x48/0x780
[ 1.530855] clk_hw_register+0x20/0x60
[ 1.534674] devm_clk_hw_register+0x50/0xa8
[ 1.538408] rpm_smd_clk_probe+0x1a4/0x260
[ 1.542488] platform_probe+0x68/0xd8
[ 1.546653] really_probe+0x140/0x2f8
[ 1.550386] __driver_probe_device+0x78/0xe0
[ 1.554033] driver_probe_device+0x80/0x110
[ 1.558373] __device_attach_driver+0x90/0xe0
[ 1.562280] bus_for_each_drv+0x78/0xc8
[ 1.566793] __device_attach+0xf0/0x150
[ 1.570438] device_initial_probe+0x14/0x20
[ 1.574259] bus_probe_device+0x9c/0xa8
[ 1.578425] device_add+0x378/0x870
[ 1.582243] of_device_add+0x44/0x60
[ 1.585716] of_platform_device_create_pdata+0xc0/0x110
[ 1.589538] of_platform_bus_create+0x17c/0x388
[ 1.594485] of_platform_populate+0x50/0xf0
[ 1.598998] qcom_smd_rpm_probe+0xd4/0x128
[ 1.603164] rpmsg_dev_probe+0xbc/0x1a8
[ 1.607330] really_probe+0x140/0x2f8
[ 1.611063] __driver_probe_device+0x78/0xe0
[ 1.614883] driver_probe_device+0x80/0x110
[ 1.619224] __device_attach_driver+0x90/0xe0
[ 1.623131] bus_for_each_drv+0x78/0xc8
[ 1.627643] __device_attach+0xf0/0x150
[ 1.631289] device_initial_probe+0x14/0x20
[ 1.635109] bus_probe_device+0x9c/0xa8
[ 1.639275] device_add+0x378/0x870
[ 1.643095] device_register+0x20/0x30
[ 1.646567] rpmsg_register_device+0x54/0x90
[ 1.650387] qcom_channel_state_worker+0x168/0x288
[ 1.654814] process_one_work+0x1a0/0x328
[ 1.659415] worker_thread+0x4c/0x420
[ 1.663494] kthread+0x150/0x160
[ 1.667138] ret_from_fork+0x10/0x18
[ 1.670442] Code: 97f56b92 b40034a0 aa0003f3 52819801 (f94002a0)
[ 1.674004] ---[ end trace 412fa6f47384cdfe ]---

Fixes: a0384ecfe2aa ("clk: qcom: smd-rpm: De-duplicate identical entries")
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Link: https://lore.kernel.org/r/20210727092613.23056-1-shawn.guo@linaro.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>

4y ago

Christophe Leroy

98694166

powerpc/interrupt: Fix OOPS by not calling do_IRQ() from timer_interrupt()

An interrupt handler shall not be called from another interrupt
handler otherwise this leads to problems like the following:

Kernel attempted to write user page (afd4fa84) - exploit attempt? (uid: 1000)
------------[ cut here ]------------
Bug: Write fault blocked by KUAP!
WARNING: CPU: 0 PID: 1617 at arch/powerpc/mm/fault.c:230 do_page_fault+0x484/0x720
Modules linked in:
CPU: 0 PID: 1617 Comm: sshd Tainted: G W 5.13.0-pmac-00010-g8393422eb77 #7
NIP: c001b77c LR: c001b77c CTR: 00000000
REGS: cb9e5bc0 TRAP: 0700 Tainted: G W (5.13.0-pmac-00010-g8393422eb77)
MSR: 00021032 <ME,IR,DR,RI> CR: 24942424 XER: 00000000

GPR00: c001b77c cb9e5c80 c1582c00 00000021 3ffffbff 085b0000 00000027 c8eb644c
GPR08: 00000023 00000000 00000000 00000000 24942424 0063f8c8 00000000 000186a0
GPR16: afd52dd4 afd52dd0 afd52dcc afd52dc8 0065a990 c07640c4 cb9e5e98 cb9e5e90
GPR24: 00000040 afd4fa96 00000040 02000000 c1fda6c0 afd4fa84 00000300 cb9e5cc0
NIP [c001b77c] do_page_fault+0x484/0x720
LR [c001b77c] do_page_fault+0x484/0x720
Call Trace:
[cb9e5c80] [c001b77c] do_page_fault+0x484/0x720 (unreliable)
[cb9e5cb0] [c000424c] DataAccess_virt+0xd4/0xe4
--- interrupt: 300 at __copy_tofrom_user+0x110/0x20c
NIP: c001f9b4 LR: c03250a0 CTR: 00000004
REGS: cb9e5cc0 TRAP: 0300 Tainted: G W (5.13.0-pmac-00010-g8393422eb77)
MSR: 00009032 <EE,ME,IR,DR,RI> CR: 48028468 XER: 20000000
DAR: afd4fa84 DSISR: 0a000000
GPR00: 20726f6f cb9e5d80 c1582c00 00000004 cb9e5e3a 00000016 afd4fa80 00000000
GPR08: 3835202d 72777872 2d78722d 00000004 28028464 0063f8c8 00000000 000186a0
GPR16: afd52dd4 afd52dd0 afd52dcc afd52dc8 0065a990 c07640c4 cb9e5e98 cb9e5e90
GPR24: 00000040 afd4fa96 00000040 cb9e5e0c 00000daa a0000000 cb9e5e98 afd4fa56
NIP [c001f9b4] __copy_tofrom_user+0x110/0x20c
LR [c03250a0] _copy_to_iter+0x144/0x990
--- interrupt: 300
[cb9e5d80] [c03e89c0] n_tty_read+0xa4/0x598 (unreliable)
[cb9e5df0] [c03e2a0c] tty_read+0xdc/0x2b4
[cb9e5e80] [c0156bf8] vfs_read+0x274/0x340
[cb9e5f00] [c01571ac] ksys_read+0x70/0x118
[cb9e5f30] [c0016048] ret_from_syscall+0x0/0x28
--- interrupt: c00 at 0xa7855c88
NIP: a7855c88 LR: a7855c5c CTR: 00000000
REGS: cb9e5f40 TRAP: 0c00 Tainted: G W (5.13.0-pmac-00010-g8393422eb77)
MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 2402446c XER: 00000000

GPR00: 00000003 afd4ec70 a72137d0 0000000b afd4ecac 00004000 0065a990 00000800
GPR08: 00000000 a7947930 00000000 00000004 c15831b0 0063f8c8 00000000 000186a0
GPR16: afd52dd4 afd52dd0 afd52dcc afd52dc8 0065a990 0065a9e0 00000001 0065fac0
GPR24: 00000000 00000089 00664050 00000000 00668e30 a720c8dc a7943ff4 0065f9b0
NIP [a7855c88] 0xa7855c88
LR [a7855c5c] 0xa7855c5c
--- interrupt: c00
Instruction dump:
3884aa88 38630178 48076861 807f0080 48042e45 2f830000 419e0148 3c80c079
3c60c076 38841be4 386301c0 4801f705 <0fe00000> 3860000b 4bfffe30 3c80c06b
---[ end trace fd69b91a8046c2e5 ]---

Here the problem is that by re-enterring an exception handler,
kuap_save_and_lock() is called a second time with this time KUAP
access locked, leading to regs->kuap being overwritten hence
KUAP not being unlocked at exception exit as expected.

Do not call do_IRQ() from timer_interrupt() directly. Instead,
redefine do_IRQ() as a standard function named __do_IRQ(), and
call it from both do_IRQ() and time_interrupt() handlers.

Fixes: 3a96570ffceb ("powerpc: convert interrupt handlers to use wrappers")
Cc: stable@vger.kernel.org # v5.12+
Reported-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c17d234f4927d39a1d7100864a8e1145323d33a0.1628611927.git.christophe.leroy@csgroup.eu

4y ago

Linus Torvalds

002c0aef

Merge tag 'block-5.14-2021-08-20' of git://git.kernel.dk/linux-block

4y ago

Jeff Layton

fdd92b64

fs: warn about impending deprecation of mandatory locks

4y ago

Alexandre Ghiti

fdf3a7a1

riscv: Fix comment regarding kernel mapping overlapping with IS_ERR_VALUE

4y ago

Linus Torvalds

c4f14eac

Merge tag 'irq-urgent-2021-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

4y ago

Srinivas Kandagatla

c0e38eaa

slimbus: ngd: set correct device for pm

4y ago

Linus Torvalds

713f0f37

Merge tag 'sched-urgent-2021-08-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

4y ago

Thomas Gleixner

bb7262b2

timers: Move clearing of base::timer_running under base:: Lock

4y ago

Randy Dunlap

953a92f0

clk: hisilicon: hi3559a: select RESET_HISI

4y ago

Pu Lehui

43e8f760

powerpc/kprobes: Fix kprobe Oops happens in booke

When using kprobe on powerpc booke series processor, Oops happens
as show bellow:

/ # echo "p:myprobe do_nanosleep" > /sys/kernel/debug/tracing/kprobe_events
/ # echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable
/ # sleep 1
[ 50.076730] Oops: Exception in kernel mode, sig: 5 [#1]
[ 50.077017] BE PAGE_SIZE=4K SMP NR_CPUS=24 QEMU e500
[ 50.077221] Modules linked in:
[ 50.077462] CPU: 0 PID: 77 Comm: sleep Not tainted 5.14.0-rc4-00022-g251a1524293d #21
[ 50.077887] NIP: c0b9c4e0 LR: c00ebecc CTR: 00000000
[ 50.078067] REGS: c3883de0 TRAP: 0700 Not tainted (5.14.0-rc4-00022-g251a1524293d)
[ 50.078349] MSR: 00029000 <CE,EE,ME> CR: 24000228 XER: 20000000
[ 50.078675]
[ 50.078675] GPR00: c00ebdf0 c3883e90 c313e300 c3883ea0 00000001 00000000 c3883ecc 00000001
[ 50.078675] GPR08: c100598c c00ea250 00000004 00000000 24000222 102490c2 bff4180c 101e60d4
[ 50.078675] GPR16: 00000000 102454ac 00000040 10240000 10241100 102410f8 10240000 00500000
[ 50.078675] GPR24: 00000002 00000000 c3883ea0 00000001 00000000 0000c350 3b9b8d50 00000000
[ 50.080151] NIP [c0b9c4e0] do_nanosleep+0x0/0x190
[ 50.080352] LR [c00ebecc] hrtimer_nanosleep+0x14c/0x1e0
[ 50.080638] Call Trace:
[ 50.080801] [c3883e90] [c00ebdf0] hrtimer_nanosleep+0x70/0x1e0 (unreliable)
[ 50.081110] [c3883f00] [c00ec004] sys_nanosleep_time32+0xa4/0x110
[ 50.081336] [c3883f40] [c001509c] ret_from_syscall+0x0/0x28
[ 50.081541] --- interrupt: c00 at 0x100a4d08
[ 50.081749] NIP: 100a4d08 LR: 101b5234 CTR: 00000003
[ 50.081931] REGS: c3883f50 TRAP: 0c00 Not tainted (5.14.0-rc4-00022-g251a1524293d)
[ 50.082183] MSR: 0002f902 <CE,EE,PR,FP,ME> CR: 24000222 XER: 00000000
[ 50.082457]
[ 50.082457] GPR00: 000000a2 bf980040 1024b4d0 bf980084 bf980084 64000000 00555345 fefefeff
[ 50.082457] GPR08: 7f7f7f7f 101e0000 00000069 00000003 28000422 102490c2 bff4180c 101e60d4
[ 50.082457] GPR16: 00000000 102454ac 00000040 10240000 10241100 102410f8 10240000 00500000
[ 50.082457] GPR24: 00000002 bf9803f4 10240000 00000000 00000000 100039e0 00000000 102444e8
[ 50.083789] NIP [100a4d08] 0x100a4d08
[ 50.083917] LR [101b5234] 0x101b5234
[ 50.084042] --- interrupt: c00
[ 50.084238] Instruction dump:
[ 50.084483] 4bfffc40 60000000 60000000 60000000 9421fff0 39400402 914200c0 38210010
[ 50.084841] 4bfffc20 00000000 00000000 00000000 <7fe00008> 7c0802a6 7c892378 93c10048
[ 50.085487] ---[ end trace f6fffe98e2fa8f3e ]---
[ 50.085678]
Trace/breakpoint trap

There is no real mode for booke arch and the MMU translation is
always on. The corresponding MSR_IS/MSR_DS bit in booke is used
to switch the address space, but not for real mode judgment.

Fixes: 21f8b2fa3ca5 ("powerpc/kprobes: Ignore traps that happened in real mode")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210809023658.218915-1-pulehui@huawei.com

4y ago

Linus Torvalds

1e6907d5

Merge tag 'io_uring-5.14-2021-08-20' of git://git.kernel.dk/linux-block

4y ago

Ming Lei

a9ed27a7

blk-mq: fix is_flush_rq

4y ago

Linus Torvalds

3dbdb38e

Merge branch 'for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

4y ago

Changbin Du

030d6dbf

riscv: kexec: do not add '-mno-relax' flag if compiler doesn't support it

4y ago

Linus Torvalds

839da253

Merge tag 'locking_urgent_for_v5.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

4y ago

Thomas Gleixner

7a3dc4f3

driver core: Add missing kernel doc for device::msi_lock

4y ago

Srinivas Kandagatla

a263c1ff

slimbus: messaging: check for valid transaction id

4y ago

Linus Torvalds

74eedeba

Merge tag 'perf-urgent-2021-08-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

4y ago

Peter Zijlstra

f558c2b8

sched/rt: Fix double enqueue caused by rt_effective_prio

Double enqueues in rt runqueues (list) have been reported while running
a simple test that spawns a number of threads doing a short sleep/run
pattern while being concurrently setscheduled between rt and fair class.

WARNING: CPU: 3 PID: 2825 at kernel/sched/rt.c:1294 enqueue_task_rt+0x355/0x360
CPU: 3 PID: 2825 Comm: setsched__13
RIP: 0010:enqueue_task_rt+0x355/0x360
Call Trace:
__sched_setscheduler+0x581/0x9d0
_sched_setscheduler+0x63/0xa0
do_sched_setscheduler+0xa0/0x150
__x64_sys_sched_setscheduler+0x1a/0x30
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae

list_add double add: new=ffff9867cb629b40, prev=ffff9867cb629b40,
next=ffff98679fc67ca0.
kernel BUG at lib/list_debug.c:31!
invalid opcode: 0000 [#1] PREEMPT_RT SMP PTI
CPU: 3 PID: 2825 Comm: setsched__13
RIP: 0010:__list_add_valid+0x41/0x50
Call Trace:
enqueue_task_rt+0x291/0x360
__sched_setscheduler+0x581/0x9d0
_sched_setscheduler+0x63/0xa0
do_sched_setscheduler+0xa0/0x150
__x64_sys_sched_setscheduler+0x1a/0x30
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae

__sched_setscheduler() uses rt_effective_prio() to handle proper queuing
of priority boosted tasks that are setscheduled while being boosted.
rt_effective_prio() is however called twice per each
__sched_setscheduler() call: first directly by __sched_setscheduler()
before dequeuing the task and then by __setscheduler() to actually do
the priority change. If the priority of the pi_top_task is concurrently
being changed however, it might happen that the two calls return
different results. If, for example, the first call returned the same rt
priority the task was running at and the second one a fair priority, the
task won't be removed by the rt list (on_list still set) and then
enqueued in the fair runqueue. When eventually setscheduled back to rt
it will be seen as enqueued already and the WARNING/BUG be issued.

Fix this by calling rt_effective_prio() only once and then reusing the
return value. While at it refactor code as well for clarity. Concurrent
priority inheritance handling is still safe and will eventually converge
to a new state by following the inheritance chain(s).

Fixes: 0782e63bc6fe ("sched: Handle priority boosted tasks proper in setscheduler()")
[squashed Peterz changes; added changelog]
Reported-by: Mark Simmons <msimmons@redhat.com>
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210803104501.38333-1-juri.lelli@redhat.com