commits

Pull irq updates from Thomas Gleixner:
"Updates for the interrupt subsystem:

Core:

- Convert the interrupt descriptor storage to a maple tree to
overcome the limitations of the radixtree + fixed size bitmap.

This allows us to handle very large servers with a huge number of
guests without imposing a huge memory overhead on everyone

- Implement optional retriggering of interrupts which utilize the
fasteoi handler to work around a GICv3 architecture issue

Drivers:

- A set of fixes and updates for the Loongson/Loongarch related
drivers

- Workaround for an ASR8601 integration hiccup which ends up with CPU
numbering which can't be represented in the GIC implementation

- The usual set of boring fixes and updates all over the place"

* tag 'irq-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
Revert "irqchip/mxs: Include linux/irqchip/mxs.h"
irqchip/jcore-aic: Fix missing allocation of IRQ descriptors
irqchip/stm32-exti: Fix warning on initialized field overwritten
irqchip/stm32-exti: Add STM32MP15xx IWDG2 EXTI to GIC map
irqchip/gicv3: Add a iort_pmsi_get_dev_id() prototype
irqchip/mxs: Include linux/irqchip/mxs.h
irqchip/clps711x: Remove unused clps711x_intc_init() function
irqchip/mmp: Remove non-DT codepath
irqchip/ftintc010: Mark all function static
irqdomain: Include internals.h for function prototypes
irqchip/loongson-eiointc: Add DT init support
dt-bindings: interrupt-controller: Add Loongson EIOINTC
irqchip/loongson-eiointc: Fix irq affinity setting during resume
irqchip/loongson-liointc: Add IRQCHIP_SKIP_SET_WAKE flag
irqchip/loongson-liointc: Fix IRQ trigger polarity
irqchip/loongson-pch-pic: Fix potential incorrect hwirq assignment
irqchip/loongson-pch-pic: Fix initialization of HT vector register
irqchip/gic-v3-its: Enable RESEND_WHEN_IN_PROGRESS for LPIs
genirq: Allow fasteoi handler to resend interrupts on concurrent handling
genirq: Expand doc for PENDING and REPLAY flags
...

2y ago

Christoph Hellwig

0a2f6372

drm/nouveau: stop using is_swiotlb_active

2y ago

Linus Torvalds

831fe284

Merge tag 'spi-fix-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

2y ago

Masami Hiramatsu (Google)

797311bc

tracing/probes: Fix to record 0-length data_loc in fetch_store_string*() if fails

2y ago

Kees Cook

e0b7b208

MAINTAINERS: Foolishly claim maintainership of string routines

2y ago

Michael Ellerman

5bcedc59

powerpc/security: Fix Speculation_Store_Bypass reporting on Power10

2y ago

Gustavo A. R. Silva

f1f047bd

smb: client: Fix -Wstringop-overflow issues

pSMB->hdr.Protocol is an array of size 4 bytes, hence when the compiler
analyzes this line of code

parm_data = ((char *) &pSMB->hdr.Protocol) + offset;

it legitimately complains about the fact that offset points outside the
bounds of the array. Notice that the compiler gives priority to the object
as an array, rather than merely the address of one more byte in a structure
to which offset should be added (which seems to be the actual intention of
the original implementation).

Fix this by explicitly instructing the compiler to treat the code as a
sequence of bytes in struct smb_com_transaction2_spi_req, and not as an
array accessed through pointer notation.

Notice that ((char *)pSMB) + sizeof(pSMB->hdr.smb_buf_length) points to
the same address as ((char *) &pSMB->hdr.Protocol), therefore this results
in no differences in binary output.
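
The address equivalence described above can be illustrated with a minimal userspace sketch. The struct and function names below (demo_hdr, demo_req, param_ptr_old, param_ptr_new) are hypothetical stand-ins, not the real CIFS definitions; the sketch only assumes that a 4-byte length field immediately precedes the 4-byte Protocol array, as the text states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the real CIFS structures. */
struct demo_hdr {
	uint32_t smb_buf_length;	/* length field preceding Protocol */
	uint8_t  Protocol[4];
	uint8_t  Command;
};

struct demo_req {
	struct demo_hdr hdr;
	uint8_t payload[64];
};

/*
 * Old form: the pointer is derived from the 4-byte array, so the
 * compiler is entitled to treat the array bounds as the limit.
 */
static char *param_ptr_old(struct demo_req *req, size_t offset)
{
	assert(req != NULL);
	return ((char *)&req->hdr.Protocol) + offset;
}

/*
 * New form: the pointer is derived from the start of the whole
 * request, so the bound is the enclosing structure.
 */
static char *param_ptr_new(struct demo_req *req, size_t offset)
{
	assert(req != NULL);
	return ((char *)req) + sizeof(req->hdr.smb_buf_length) + offset;
}
```

Both helpers return the same address for any given offset, because in this layout offsetof(struct demo_hdr, Protocol) equals sizeof(smb_buf_length). Only the object the pointer is derived from differs, and that is what the warning keys on.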

Fixes the following -Wstringop-overflow warnings when building for the s390
architecture with defconfig (GCC 13):
CC [M] fs/smb/client/cifssmb.o
In function 'cifs_init_ace',
inlined from 'posix_acl_to_cifs' at fs/smb/client/cifssmb.c:3046:3,
inlined from 'cifs_do_set_acl' at fs/smb/client/cifssmb.c:3191:15:
fs/smb/client/cifssmb.c:2987:31: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
2987 | cifs_ace->cifs_e_perm = local_ace->e_perm;
| ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~
In file included from fs/smb/client/cifssmb.c:27:
fs/smb/client/cifspdu.h: In function 'cifs_do_set_acl':
fs/smb/client/cifspdu.h:384:14: note: at offset [7, 11] into destination object 'Protocol' of size 4
384 | __u8 Protocol[4];
| ^~~~~~~~
In function 'cifs_init_ace',
inlined from 'posix_acl_to_cifs' at fs/smb/client/cifssmb.c:3046:3,
inlined from 'cifs_do_set_acl' at fs/smb/client/cifssmb.c:3191:15:
fs/smb/client/cifssmb.c:2988:30: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
2988 | cifs_ace->cifs_e_tag = local_ace->e_tag;
| ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~
fs/smb/client/cifspdu.h: In function 'cifs_do_set_acl':
fs/smb/client/cifspdu.h:384:14: note: at offset [6, 10] into destination object 'Protocol' of size 4
384 | __u8 Protocol[4];
| ^~~~~~~~

This helps with the ongoing efforts to globally enable
-Wstringop-overflow.

Link: https://github.com/KSPP/linux/issues/310
Fixes: dc1af4c4b472 ("cifs: implement set acl method")
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Steve French <stfrench@microsoft.com>

2y ago

Mario Limonciello

0d5ace1a

pinctrl: amd: Only use special debounce behavior for GPIO 0

2y ago

Linus Torvalds

74099e20

Merge tag 'mips_6.5_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

2y ago

Thomas Gleixner

b1472a60

x86/smp: Don't send INIT to boot CPU

2y ago

Thomas Gleixner

0303c972

x86/efi: Make efi_set_virtual_address_map IBT safe

2y ago

Linus Torvalds

cef2dd76

Merge tag 'core-debugobjects-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip

2y ago

Thomas Gleixner

f121ab7f

Merge tag 'irqchip-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

2y ago

Petr Tesarik

693405cf

swiotlb: use the atomic counter of total used slabs if available

2y ago

Linus Torvalds

393ea781

Merge tag 'regmap-fix-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

2y ago

Jonas Gorski

54ccc875

mailmap: add entry for Jonas Gorski

2y ago

Masami Hiramatsu (Google)

4ed8f337

Revert "tracing: Add "(fault)" name injection to kernel probes"

2y ago

Yonghong Song

8cc32a9b

kallsyms: strip LTO-only suffixes from promoted global functions

Commit 6eb4bd92c1ce ("kallsyms: strip LTO suffixes from static functions")
stripped all function/variable suffixes starting with '.' regardless
of whether those suffixes are generated in LTO mode or not. In fact,
as far as I know, in LTO mode, when a static function/variable is
promoted to the global scope, '.llvm.<...>' suffix is added.

The existing mechanism breaks live patch for an LTO kernel even if
no <symbol>.llvm.<...> symbols are involved. For example, for the following
kernel symbols:
$ grep bpf_verifier_vlog /proc/kallsyms
ffffffff81549f60 t bpf_verifier_vlog
ffffffff8268b430 d bpf_verifier_vlog._entry
ffffffff8282a958 d bpf_verifier_vlog._entry_ptr
ffffffff82e12a1f d bpf_verifier_vlog.__already_done
'bpf_verifier_vlog' is a static function. '_entry', '_entry_ptr' and
'__already_done' are static variables used inside 'bpf_verifier_vlog',
so llvm promotes them to file-level static with prefix 'bpf_verifier_vlog.'.
Note that the func-level to file-level static function promotion also
happens without LTO.

Given the symbol name 'bpf_verifier_vlog', with an LTO kernel the current
mechanism returns 4 symbols to the live patching subsystem, which it
cannot handle. With a non-LTO kernel, only one symbol
is returned.

After a lengthy discussion in [1], the suggestion is to separate two
cases:
(1). new symbols with suffix which are generated regardless of whether
LTO is enabled or not, and
(2). new symbols with suffix generated only when LTO is enabled.

The cleanup_symbol_name() should only remove suffixes for case (2).
Case (1) should not be changed so it can work uniformly with or without LTO.

This patch removes only the LTO-only suffix '.llvm.<...>' so live patching and
tracing work the same way as for a non-LTO kernel.
The cleanup_symbol_name() in scripts/kallsyms.c is also changed to have the same
filtering pattern so both kernel and kallsyms tool have the same
expectation on the order of symbols.
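
The filtering rule for case (2) can be sketched as follows. This is a minimal userspace illustration, not the kernel's cleanup_symbol_name(); the helper name strip_lto_only_suffix is hypothetical, and the sketch assumes the LTO-only suffix always begins with the literal ".llvm.".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Illustrative helper (hypothetical name): truncate a symbol name at the
 * first ".llvm." and report whether anything was stripped. Suffixes that
 * exist with or without LTO, such as "._entry", are left untouched.
 */
static bool strip_lto_only_suffix(char *name)
{
	char *suffix;

	assert(name != NULL);

	suffix = strstr(name, ".llvm.");
	if (!suffix)
		return false;

	*suffix = '\0';
	return true;
}
```

Under this rule a name such as "foo.llvm.1234" becomes "foo", while "bpf_verifier_vlog._entry" is returned unchanged, so a lookup for 'bpf_verifier_vlog' no longer matches the three static variables listed earlier.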

[1] https://lore.kernel.org/live-patching/20230615170048.2382735-1-song@kernel.org/T/#u

Fixes: 6eb4bd92c1ce ("kallsyms: strip LTO suffixes from static functions")
Reported-by: Song Liu <song@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230628181926.4102448-1-yhs@fb.com
Signed-off-by: Kees Cook <keescook@chromium.org>

2y ago

Michael Ellerman

8bbe9fee

powerpc/64s: Fix native_hpte_remove() to be irq-safe

Lockdep warns that the use of the hpte_lock in native_hpte_remove() is
not safe against an IRQ coming in:

================================
WARNING: inconsistent lock state
6.4.0-rc2-g0c54f4d30ecc #1 Not tainted
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
qemu-system-ppc/93865 [HC0[0]:SC0[0]:HE1:SE1] takes:
c0000000021f5180 (hpte_lock){+.?.}-{0:0}, at: native_lock_hpte+0x8/0xd0
{IN-SOFTIRQ-W} state was registered at:
lock_acquire+0x134/0x3f0
native_lock_hpte+0x44/0xd0
native_hpte_insert+0xd4/0x2a0
__hash_page_64K+0x218/0x4f0
hash_page_mm+0x464/0x840
do_hash_fault+0x11c/0x260
data_access_common_virt+0x210/0x220
__ip_select_ident+0x140/0x150
...
net_rx_action+0x3bc/0x440
__do_softirq+0x180/0x534
...
sys_sendmmsg+0x34/0x50
system_call_exception+0x128/0x320
system_call_common+0x160/0x2e4
...
Possible unsafe locking scenario:

CPU0
----
lock(hpte_lock);
<Interrupt>
lock(hpte_lock);

*** DEADLOCK ***
...
Call Trace:
dump_stack_lvl+0x98/0xe0 (unreliable)
print_usage_bug.part.0+0x250/0x278
mark_lock+0xc9c/0xd30
__lock_acquire+0x440/0x1ca0
lock_acquire+0x134/0x3f0
native_lock_hpte+0x44/0xd0
native_hpte_remove+0xb0/0x190
kvmppc_mmu_map_page+0x650/0x698 [kvm_pr]
kvmppc_handle_pagefault+0x534/0x6e8 [kvm_pr]
kvmppc_handle_exit_pr+0x6d8/0xe90 [kvm_pr]
after_sprg3_load+0x80/0x90 [kvm_pr]
kvmppc_vcpu_run_pr+0x108/0x270 [kvm_pr]
kvmppc_vcpu_run+0x34/0x48 [kvm]
kvm_arch_vcpu_ioctl_run+0x340/0x470 [kvm]
kvm_vcpu_ioctl+0x338/0x8b8 [kvm]
sys_ioctl+0x7c4/0x13e0
system_call_exception+0x128/0x320
system_call_common+0x160/0x2e4

I suspect kvm_pr is the only caller that doesn't already have IRQs
disabled, which is why this hasn't been reported previously.

Fix it by disabling IRQs in native_hpte_remove().

Fixes: 35159b5717fa ("powerpc/64s: make HPTE lock and native_tlbie_lock irq-safe")
Cc: stable@vger.kernel.org # v6.1+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230517123033.18430-1-mpe@ellerman.id.au

2y ago

Bharath SM

df9d70c1

cifs: if deferred close is disabled then close files immediately

2y ago

Linux 6.5-rc2 v6.5-rc2

fdf0eaf1

Linus Torvalds

Merge tag 'xtensa-20230716' of https://github.com/jcmvbkbc/linux-xtensa

5b8d6e85

Linus Torvalds

Merge tag 'perf_urgent_for_v6.5_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

1667e630

Linus Torvalds

xtensa: fix unaligned and load/store configuration interaction

a160e941

Max Filippov

Merge tag 'objtool_urgent_for_v6.5_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

8a3e4a64

Linus Torvalds

perf/x86: Fix lockdep warning in for_each_sibling_event() on SPR

27c68c21

Namhyung Kim

xtensa: ISS: fix call to split_if_spec

bc8d5916

Max Filippov

Merge tag 'sched_urgent_for_v6.5_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

f61a89ca

Linus Torvalds

iov_iter: Mark copy_iovec_from_user() noclone

719a937b

Peter Zijlstra

Linux 6.5-rc1 v6.5-rc1

06c2afb8

Linus Torvalds

xtensa: ISS: add comment about etherdev freeing

c44e783e

Max Filippov

Merge tag 'pinctrl-v6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

ede950b0

Linus Torvalds

sched/psi: use kernfs polling functions for PSI trigger polling

Destroying psi trigger in cgroup_file_release causes UAF issues when
a cgroup is removed from under a polling process. This is happening
because cgroup removal causes a call to cgroup_file_release while the
actual file is still alive. Destroying the trigger at this point would
also destroy its waitqueue head and if there is still a polling process
on that file accessing the waitqueue, it will step on the freed pointer:

do_select
  vfs_poll
                           do_rmdir
                             cgroup_rmdir
                               kernfs_drain_open_files
                                 cgroup_file_release
                                   cgroup_pressure_release
                                     psi_trigger_destroy
                                       wake_up_pollfree(&t->event_wait)
// vfs_poll is unblocked
                                       synchronize_rcu
                                       kfree(t)
poll_freewait -> UAF access to the trigger's waitqueue head

Patch [1] fixed this issue for the epoll() case using wake_up_pollfree(),
however the same issue exists for the synchronous poll() case.
The root cause of this issue is that the lifecycles of the psi trigger's
waitqueue and of the file associated with the trigger are different. Fix
this by using kernfs_generic_poll function when polling on cgroup-specific
psi triggers. It internally uses kernfs_open_node->poll waitqueue head
with its lifecycle tied to the file's lifecycle. This also renders the
fix in [1] obsolete, so revert it.

[1] commit c2dbe32d5db5 ("sched/psi: Fix use-after-free in ep_remove_wait_queue()")

Fixes: 0e94682b73bf ("psi: introduce psi monitor")
Closes: https://lore.kernel.org/all/20230613062306.101831-1-lujialin4@huawei.com/
Reported-by: Lu Jialin <lujialin4@huawei.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20230630005612.1014540-1-surenb@google.com