commits

When ring_buffer_swap_cpu() is called while a resize is in progress,
the cpu buffer is swapped mid-resize, leaving it in an inconsistent state.
Continuing to run in that state results in an oops.

This issue can be easily reproduced using the following two scripts:
/tmp # cat test1.sh
//#! /bin/sh
for i in `seq 0 100000`
do
echo 2000 > /sys/kernel/debug/tracing/buffer_size_kb
sleep 0.5
echo 5000 > /sys/kernel/debug/tracing/buffer_size_kb
sleep 0.5
done
/tmp # cat test2.sh
//#! /bin/sh
for i in `seq 0 100000`
do
echo irqsoff > /sys/kernel/debug/tracing/current_tracer
sleep 1
echo nop > /sys/kernel/debug/tracing/current_tracer
sleep 1
done
/tmp # ./test1.sh &
/tmp # ./test2.sh &

A typical oops log is shown below; other oops signatures are sometimes seen as well.

[ 231.711293] WARNING: CPU: 0 PID: 9 at kernel/trace/ring_buffer.c:2026 rb_update_pages+0x378/0x3f8
[ 231.713375] Modules linked in:
[ 231.714735] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W 6.5.0-rc1-00276-g20edcec23f92 #15
[ 231.716750] Hardware name: linux,dummy-virt (DT)
[ 231.718152] Workqueue: events update_pages_handler
[ 231.719714] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 231.721171] pc : rb_update_pages+0x378/0x3f8
[ 231.722212] lr : rb_update_pages+0x25c/0x3f8
[ 231.723248] sp : ffff800082b9bd50
[ 231.724169] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000
[ 231.726102] x26: 0000000000000001 x25: fffffffffffff010 x24: 0000000000000ff0
[ 231.728122] x23: ffff0000c3a0b600 x22: ffff0000c3a0b5c0 x21: fffffffffffffe0a
[ 231.730203] x20: ffff0000c3a0b600 x19: ffff0000c0102400 x18: 0000000000000000
[ 231.732329] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffe7aa8510
[ 231.734212] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000002
[ 231.736291] x11: ffff8000826998a8 x10: ffff800082b9baf0 x9 : ffff800081137558
[ 231.738195] x8 : fffffc00030e82c8 x7 : 0000000000000000 x6 : 0000000000000001
[ 231.740192] x5 : ffff0000ffbafe00 x4 : 0000000000000000 x3 : 0000000000000000
[ 231.742118] x2 : 00000000000006aa x1 : 0000000000000001 x0 : ffff0000c0007208
[ 231.744196] Call trace:
[ 231.744892] rb_update_pages+0x378/0x3f8
[ 231.745893] update_pages_handler+0x1c/0x38
[ 231.746893] process_one_work+0x1f0/0x468
[ 231.747852] worker_thread+0x54/0x410
[ 231.748737] kthread+0x124/0x138
[ 231.749549] ret_from_fork+0x10/0x20
[ 231.750434] ---[ end trace 0000000000000000 ]---
[ 233.720486] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 233.721696] Mem abort info:
[ 233.721935] ESR = 0x0000000096000004
[ 233.722283] EC = 0x25: DABT (current EL), IL = 32 bits
[ 233.722596] SET = 0, FnV = 0
[ 233.722805] EA = 0, S1PTW = 0
[ 233.723026] FSC = 0x04: level 0 translation fault
[ 233.723458] Data abort info:
[ 233.723734] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 233.724176] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 233.724589] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 233.725075] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000104943000
[ 233.725592] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[ 233.726231] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 233.726720] Modules linked in:
[ 233.727007] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W 6.5.0-rc1-00276-g20edcec23f92 #15
[ 233.727777] Hardware name: linux,dummy-virt (DT)
[ 233.728225] Workqueue: events update_pages_handler
[ 233.728655] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 233.729054] pc : rb_update_pages+0x1a8/0x3f8
[ 233.729334] lr : rb_update_pages+0x154/0x3f8
[ 233.729592] sp : ffff800082b9bd50
[ 233.729792] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000
[ 233.730220] x26: 0000000000000000 x25: ffff800082a8b840 x24: ffff0000c0102418
[ 233.730653] x23: 0000000000000000 x22: fffffc000304c880 x21: 0000000000000003
[ 233.731105] x20: 00000000000001f4 x19: ffff0000c0102400 x18: ffff800082fcbc58
[ 233.731727] x17: 0000000000000000 x16: 0000000000000001 x15: 0000000000000001
[ 233.732282] x14: ffff8000825fe0c8 x13: 0000000000000001 x12: 0000000000000000
[ 233.732709] x11: ffff8000826998a8 x10: 0000000000000ae0 x9 : ffff8000801b760c
[ 233.733148] x8 : fefefefefefefeff x7 : 0000000000000018 x6 : ffff0000c03298c0
[ 233.733553] x5 : 0000000000000002 x4 : 0000000000000000 x3 : 0000000000000000
[ 233.733972] x2 : ffff0000c3a0b600 x1 : 0000000000000000 x0 : 0000000000000000
[ 233.734418] Call trace:
[ 233.734593] rb_update_pages+0x1a8/0x3f8
[ 233.734853] update_pages_handler+0x1c/0x38
[ 233.735148] process_one_work+0x1f0/0x468
[ 233.735525] worker_thread+0x54/0x410
[ 233.735852] kthread+0x124/0x138
[ 233.736064] ret_from_fork+0x10/0x20
[ 233.736387] Code: 92400000 910006b5 aa000021 aa0303f7 (f9400060)
[ 233.736959] ---[ end trace 0000000000000000 ]---

After analysis, the sequence of events leading to the error is as follows [1-5]:

int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
int cpu_id)
{
for_each_buffer_cpu(buffer, cpu) {
cpu_buffer = buffer->buffers[cpu];
//1. get cpu_buffer, aka cpu_buffer(A)
...
...
schedule_work_on(cpu,
&cpu_buffer->update_pages_work);
//2. 'update_pages_work' is queued on 'cpu'; cpu_buffer(A) is passed to
// update_pages_handler, which does the update and then calls
// complete(&cpu_buffer->update_done) to wake up the resize process.
//---->
//3. Just at this moment, ring_buffer_swap_cpu is triggered and
//cpu_buffer(A) is swapped with cpu_buffer(B), the max_buffer.
//ring_buffer_swap_cpu is called as shown in the 'Call trace' below.

Call trace:
dump_backtrace+0x0/0x2f8
show_stack+0x18/0x28
dump_stack+0x12c/0x188
ring_buffer_swap_cpu+0x2f8/0x328
update_max_tr_single+0x180/0x210
check_critical_timing+0x2b4/0x2c8
tracer_hardirqs_on+0x1c0/0x200
trace_hardirqs_on+0xec/0x378
el0_svc_common+0x64/0x260
do_el0_svc+0x90/0xf8
el0_svc+0x20/0x30
el0_sync_handler+0xb0/0xb8
el0_sync+0x180/0x1c0
//<----

/* wait for all the updates to complete */
for_each_buffer_cpu(buffer, cpu) {
cpu_buffer = buffer->buffers[cpu];
//4. get cpu_buffer; cpu_buffer(B) is now used in the following process,
//so the state of cpu_buffer(A) and cpu_buffer(B) is inconsistent.
//For example, cpu_buffer(A)->update_done is left set to 1, so the next
//resize round will not block in 'wait_for_completion'.
if (!cpu_buffer->nr_pages_to_update)
continue;

if (cpu_online(cpu))
wait_for_completion(&cpu_buffer->update_done);
cpu_buffer->nr_pages_to_update = 0;
}
...
}
//5. With cpu_buffer(A) and cpu_buffer(B) in an inconsistent state,
//continuing to run leads to the oops.
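
One way to picture "do not swap during resize" is a counter that the resize path raises and the swap path checks. The user-space sketch below is only an illustration under that assumption: the toy_* names, the 'resizing' field, and the integer standing in for buffer->buffers[cpu] are hypothetical and not taken from the patch, and the check-then-swap shown here is not a complete synchronization scheme.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical, simplified stand-in for struct trace_buffer. */
struct toy_buffer {
	atomic_int resizing;	/* non-zero while a resize is in flight (assumed field) */
	int cpu_buffer_id;	/* stands in for buffer->buffers[cpu] */
};

/* Mark the start and the end of a resize. */
static void toy_resize_begin(struct toy_buffer *b)
{
	atomic_fetch_add(&b->resizing, 1);
}

static void toy_resize_end(struct toy_buffer *b)
{
	atomic_fetch_sub(&b->resizing, 1);
}

/* Swap the per-CPU buffers unless either side is being resized. */
static int toy_swap_cpu(struct toy_buffer *a, struct toy_buffer *b)
{
	int tmp;

	if (atomic_load(&a->resizing) || atomic_load(&b->resizing))
		return -EBUSY;	/* refuse: a resize owns the buffers right now */

	tmp = a->cpu_buffer_id;
	a->cpu_buffer_id = b->cpu_buffer_id;
	b->cpu_buffer_id = tmp;
	return 0;
}
```

With this shape, a swap attempted between toy_resize_begin() and toy_resize_end() fails with -EBUSY and leaves both buffers untouched, so step 4 above would still see cpu_buffer(A).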

Link: https://lore.kernel.org/linux-trace-kernel/202307191558478409990@zte.com.cn

Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

2y ago

Linus Torvalds

15b593ba

Merge tag 'ext4_for_linus-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

2y ago

Paolo Bonzini

0c189708

Merge tag 'kvm-s390-master-6.5-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

2y ago

Alexey Dobriyan

0817d259

kbuild: flatten KBUILD_CFLAGS

2y ago

YueHaibing

1faf7e4a

tracing: Remove unused extern declaration tracing_map_set_field_descr()

2y ago

Linus Torvalds

8266f53b

Merge tag '6.5-rc2-smb3-client-fixes-ver2' of git://git.samba.org/sfrench/cifs-2.6

2y ago

Ojaswin Mujoo

9d3de7ee

ext4: fix rbtree traversal bug in ext4_mb_use_preallocated

During allocations, while looking for preallocations (PA) in the per
inode rbtree, we can't do a direct traversal of the tree because
ext4_mb_discard_group_preallocation() can mark the pa deleted in parallel,
and that can cause direct traversal to skip some entries. This was
leading to a BUG_ON() being hit [1] when we missed a PA that could satisfy
our request and ultimately tried to create a new PA that would overlap
with the missed one.

To make sure we handle that case while still keeping the performance of
the rbtree, we make use of the fact that the only pa that could possibly
overlap the original goal start is the one that satisfies the below
conditions:

1. It must have its logical start immediately to the left of
(ie less than) original logical start.

2. It must not be deleted

To find this pa we use the following traversal method:

1. Descend into the rbtree normally to find the immediate neighboring
PA. Here we keep descending irrespective of if the PA is deleted or if
it overlaps with our request etc. The goal is to find an immediately
adjacent PA.

2. If the found PA is on right of original goal, use rb_prev() to find
the left adjacent PA.

3. Check if this PA is deleted and keep moving left with rb_prev() until
a non deleted PA is found.

4. This is the PA we are looking for. Now we can check if it can satisfy
the original request and proceed accordingly.
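
The four steps above can be sketched in user space with a sorted array standing in for the rbtree, so that walking left through the array plays the role of rb_prev(). The struct toy_pa type and the toy_* helpers are hypothetical, and treating a PA whose start equals the goal as "left of" it is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified preallocation record. */
struct toy_pa {
	unsigned long lstart;	/* logical start block */
	unsigned long len;	/* length in blocks */
	bool deleted;
};

/*
 * Return the closest non-deleted PA whose lstart is <= goal, or NULL.
 * 'pas' must be sorted by lstart.
 */
static const struct toy_pa *toy_find_left_pa(const struct toy_pa *pas,
					     size_t n, unsigned long goal)
{
	size_t lo = 0, hi = n;

	/* Steps 1-2: locate the first entry right of goal; its left
	 * neighbour is the adjacent PA, deleted or not. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (pas[mid].lstart <= goal)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* Step 3: keep moving left past deleted entries. */
	while (lo > 0) {
		lo--;
		if (!pas[lo].deleted)
			return &pas[lo];
	}
	return NULL;
}

/*
 * Step 4: does the PA cover the goal block? Written as a subtraction so
 * that lstart + len never has to be computed and cannot overflow.
 */
static bool toy_pa_covers(const struct toy_pa *pa, unsigned long goal)
{
	return goal >= pa->lstart && goal - pa->lstart < pa->len;
}
```

A caller would run toy_find_left_pa() first and, if it returns a PA, use toy_pa_covers() to decide whether that PA can serve the request or a new one is needed.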

This approach also takes care of having deleted PAs in the tree.

(While we are at it, also fix a possible overflow bug in calculating the
end of a PA)

[1] https://lore.kernel.org/linux-ext4/CA+G9fYv2FRpLqBZf34ZinR8bU2_ZRAUOjKAD3+tKRFaEQHtt8Q@mail.gmail.com/

Cc: stable@kernel.org # 6.4
Fixes: 3872778664e3 ("ext4: Use rbtrees to manage PAs instead of inode i_prealloc_list")
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Tested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/edd2efda6a83e6343c5ace9deea44813e71dbe20.1690045963.git.ojaswin@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

2y ago

Paolo Bonzini

675a15f4

Merge tag 'kvmarm-fixes-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

2y ago

Claudio Imbrenda

c2fceb59

KVM: s390: pv: fix index value of replaced ASCE

2y ago

Benjamin Gray

1c679214

gen_compile_commands: add assembly files to compilation database

2y ago

Linus Torvalds

fdf0eaf1

Linux 6.5-rc2 v6.5-rc2

2y ago

Linus Torvalds

c2782531

Merge tag 'powerpc-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

2y ago

Steve French

ba61a03a

cifs: update internal module version number for cifs.ko

2y ago

Ojaswin Mujoo

5d5460fa

ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail()

2y ago

Xiang Chen

9d2a55b4

KVM: arm64: Fix the name of sys_reg_desc related to PMU

2y ago

Claudio Imbrenda

5ff92181

KVM: s390: pv: simplify shutdown and fix race

2y ago

Randy Dunlap

30ebf2ce

kconfig: gconfig: correct program name in help text

2y ago

Linus Torvalds

5b8d6e85

Merge tag 'xtensa-20230716' of https://github.com/jcmvbkbc/linux-xtensa

2y ago

Linus Torvalds

295e1388

Merge tag 's390-6.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

2y ago

Andrew Donnellan

106ea7ff

Revert "powerpc/64s: Remove support for ELFv1 little endian userspace"

2y ago

Shyam Prasad N

b3edef6b

cifs: allow dumping keys for directories too

2y ago

Eric Whitney

6909cf5c

ext4: correct inline offset when handling xattrs in inode body

2y ago

Oliver Upton

6d4f9236

KVM: arm64: Correctly handle RES0 bits PMEVTYPER<n>_EL0.evtCount

2y ago

Randy Dunlap

390ef8c0

kconfig: gconfig: drop the Show Debug Info help text

2y ago

Linus Torvalds

1667e630

Merge tag 'perf_urgent_for_v6.5_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2y ago

Max Filippov

a160e941

xtensa: fix unaligned and load/store configuration interaction

2y ago

Linus Torvalds

f036d67c

Merge tag 'block-6.5-2023-07-21' of git://git.kernel.dk/linux

2y ago

Harald Freudenberger

4cfca532

s390/zcrypt: fix reply buffer calculations for CCA replies

2y ago

Benjamin Gray

ccb381e1

powerpc/kasan: Disable KCOV in KASAN code

2y ago

Zhang Yi

3c55097c

jbd2: remove __journal_try_to_free_buffer()

2y ago

Marc Zyngier

b321c31c

KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption

2y ago

Linus Torvalds

06c2afb8

Linux 6.5-rc1 v6.5-rc1

2y ago

Linus Torvalds

8a3e4a64

Merge tag 'objtool_urgent_for_v6.5_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2y ago

Namhyung Kim

27c68c21

perf/x86: Fix lockdep warning in for_each_sibling_event() on SPR

2y ago

Max Filippov

bc8d5916

xtensa: ISS: fix call to split_if_spec

2y ago

Linus Torvalds

bdd1d82e

Merge tag 'io_uring-6.5-2023-07-21' of git://git.kernel.dk/linux

2y ago

Mauricio Faria de Oliveira

bb5faa99

loop: do not enforce max_loop hard limit by (new) default

Problem:

The max_loop parameter is used for 2 different purposes:

1) initial number of loop devices to pre-create on init
2) maximum number of loop devices to add on access/open()

Historically, its default value (zero) caused 1) to create non-zero
number of devices (CONFIG_BLK_DEV_LOOP_MIN_COUNT), and no hard limit on
2) to add devices with autoloading.

However, the default value changed in commit 85c50197716c ("loop: Fix
the max_loop commandline argument treatment when it is set to 0") to
CONFIG_BLK_DEV_LOOP_MIN_COUNT, for max_loop=0 not to pre-create devices.

That does improve 1), but unfortunately it breaks 2), as the default
behavior changed from no-limit to hard-limit.

Example:

For example, this userspace code broke for N >= CONFIG, if the user
relied on the default value 0 for max_loop:

mknod("/dev/loopN");
open("/dev/loopN"); // now fails with ENXIO

Though affected users may "fix" it with (loop.)max_loop=0, this means
requiring a kernel parameter change on a stable kernel update (that commit
has a Fixes: tag for an old commit in stable).

Solution:

The original semantics for the default value in 2) can be applied if the
parameter is not set (ie, default behavior).

This still keeps the intended function in 1) and 2) if set, and that
commit's intended improvement in 1) if max_loop=0.

Before 85c50197716c:
- default: 1) CONFIG devices 2) no limit
- max_loop=0: 1) CONFIG devices 2) no limit
- max_loop=X: 1) X devices 2) X limit

After 85c50197716c:
- default: 1) CONFIG devices 2) CONFIG limit (*)
- max_loop=0: 1) 0 devices (*) 2) no limit
- max_loop=X: 1) X devices 2) X limit

This commit:
- default: 1) CONFIG devices 2) no limit (*)
- max_loop=0: 1) 0 devices 2) no limit
- max_loop=X: 1) X devices 2) X limit
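
The "This commit" rows can be read as a small decision function. The sketch below is a user-space illustration only: the toy_* names are hypothetical, a hard limit of 0 is used to mean "no limit", and it models the behaviour with CONFIG_BLOCK_LEGACY_AUTOLOAD enabled.

```c
#include <stdbool.h>

/* Hypothetical result type for this sketch. */
struct toy_loop_policy {
	int precreate;	/* 1) devices pre-created on init */
	int hard_limit;	/* 2) cap on devices added on open(); 0 = no limit */
};

/*
 * param_set:        whether max_loop was given on the command line
 * max_loop:         the value given (ignored when param_set is false)
 * config_min_count: stands in for CONFIG_BLK_DEV_LOOP_MIN_COUNT
 */
static struct toy_loop_policy toy_loop_decide(bool param_set, int max_loop,
					      int config_min_count)
{
	struct toy_loop_policy p;

	if (!param_set) {
		/* default: CONFIG devices, no limit */
		p.precreate = config_min_count;
		p.hard_limit = 0;
	} else {
		/* max_loop=0: 0 devices, no limit; max_loop=X: X devices, X limit */
		p.precreate = max_loop;
		p.hard_limit = max_loop;
	}
	return p;
}
```

The only row that differs from "After 85c50197716c" is the default one, which is why the decision hinges on whether the parameter was set at all rather than on its value.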

Future:

The issue/regression from that commit only affects code under the
CONFIG_BLOCK_LEGACY_AUTOLOAD deprecation guard, thus the fix too is
contained under it.

Once that deprecated functionality/code is removed, the purpose 2) of
max_loop (hard limit) is no longer in use, so the module parameter
description can be changed then.

Tests:

Linux 6.4-rc7
CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
CONFIG_BLOCK_LEGACY_AUTOLOAD=y

- default (original)

# ls -1 /dev/loop*
/dev/loop-control
/dev/loop0
...
/dev/loop7

# ./test-loop
open: /dev/loop8: No such device or address

- default (patched)

# ls -1 /dev/loop*
/dev/loop-control
/dev/loop0
...
/dev/loop7

# ./test-loop
#

- max_loop=0 (original & patched):

# ls -1 /dev/loop*
/dev/loop-control

# ./test-loop
#

- max_loop=8 (original & patched):

# ls -1 /dev/loop*
/dev/loop-control
/dev/loop0
...
/dev/loop7

# ./test-loop
open: /dev/loop8: No such device or address

- max_loop=0 (patched; CONFIG_BLOCK_LEGACY_AUTOLOAD is not set)

# ls -1 /dev/loop*
/dev/loop-control

# ./test-loop
open: /dev/loop8: No such device or address

Fixes: 85c50197716c ("loop: Fix the max_loop commandline argument treatment when it is set to 0")
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230720143033.841001-3-mfo@canonical.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

2y ago

Wang Ming

1f7e9067

s390/crypto: use kfree_sensitive() instead of kfree()

2y ago

Uwe Kleine-König

8739312e

powerpc/512x: lpbfifo: Convert to platform remove callback returning void

2y ago

Zhang Yi

46f881b5

jbd2: fix a race when checking checkpoint buffer busy

2y ago

Mostafa Saleh

dcf89d11

KVM: arm64: Add missing BTI instructions

2y ago

Linus Torvalds

c192ac73

MAINTAINERS 2: Electric Boogaloo

2y ago

Linus Torvalds

f61a89ca

Merge tag 'sched_urgent_for_v6.5_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2y ago

Peter Zijlstra

719a937b

iov_iter: Mark copy_iovec_from_user() noclone

2y ago

Max Filippov

c44e783e

xtensa: ISS: add comment about etherdev freeing

2y ago

Linus Torvalds

725d444d

Merge tag 'devicetree-fixes-for-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

2y ago

Helge Deller

07e98113

ia64: mmap: Consider pgoff when searching for free mapping

2y ago

Mauricio Faria de Oliveira

23881aec

loop: deprecate autoloading callback loop_probe()

2y ago

Sven Schnelle

7686762d

s390/mm: fix per vma lock fault handling

2y ago

Russell Currey

fb74c4e3

powerpc/crypto: Add gitignore for generated P10 AES/GCM .S files

2y ago

Zhihao Cheng

e34c8dd2

jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint

2y ago

Oliver Upton

df6556ad

KVM: arm64: Correctly handle page aging notifiers for unaligned memslot

2y ago

Linus Torvalds

f71f6421

Merge tag 'dma-mapping-6.5-2023-07-09' of git://git.infradead.org/users/hch/dma-mapping

2y ago

Linus Torvalds

ede950b0

Merge tag 'pinctrl-v6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

2y ago

Linux 6.5-rc3 v6.5-rc3

6eaae198

Linus Torvalds

Merge tag 'trace-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

3b4e48b8

Linus Torvalds

Merge tag 'kbuild-fixes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

12a5336c

Linus Torvalds

tracing/histograms: Return an error if we fail to add histogram to hist_vars list

4b8b3905

Mohamed Khalfella

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

269f4a4b

Linus Torvalds

kbuild: rust: avoid creating temporary files

df01b7cf

Miguel Ojeda

ring-buffer: Do not swap cpu_buffer during resize process

8a96c028

Chen Lin

KVM: arm64: Add missing BTI instructions

Some bti instructions were missing from
commit b53d4a272349 ("KVM: arm64: Use BTI for nvhe")

1) kvm_host_psci_cpu_entry
kvm_host_psci_cpu_entry is called from __kvm_hyp_init_cpu through a "br"
instruction, because __kvm_hyp_init_cpu resides in the idmap section while
kvm_host_psci_cpu_entry is in hyp .text, so the offset is larger than the
128MB range covered by "b".
This means that the function should start with a "bti j" instruction.
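
The 128MB figure follows from the encoding of "b": a signed 26-bit immediate counted in 4-byte words, i.e. byte offsets from -2^27 to 2^27 - 4. A small user-space sketch of that range check, with hypothetical names (TOY_B_RANGE, toy_fits_b) that are not kernel symbols:

```c
#include <stdbool.h>
#include <stdint.h>

/* Reach of AArch64 "b": signed 26-bit word offset, so +/-128 MiB in bytes. */
#define TOY_B_RANGE	(INT64_C(1) << 27)

/* True if a byte offset can be encoded directly in a "b" instruction. */
static bool toy_fits_b(int64_t byte_offset)
{
	return byte_offset % 4 == 0 &&
	       byte_offset >= -TOY_B_RANGE &&
	       byte_offset < TOY_B_RANGE;
}
```

When the check fails, the target has to be reached through a register with "br", which is the case that needs a matching landing pad at the destination.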

LLVM, which is the only compiler supporting BTI for Linux, adds "bti j"
for jump tables or when taking the address of a block [1].
The same behaviour is observed with GCC.

As kvm_host_psci_cpu_entry is a C function, this must be done in
assembly.

Another solution is to use X16/X17 with "br", as according to ARM
ARM DDI0487I.a RLJHCL/IGMGRS, PACIASP has an implicit branch
target identification instruction that is compatible with
PSTATE.BTYPE 0b01, which includes "br X16/X17".
kvm_host_psci_cpu_entry has PACIASP as it is an external
function.
However, using an explicit "bti" is clearer than relying on
which register is used.

A third solution is to clear SCTLR_EL2.BT, which would make PACIASP
compatible with PSTATE.BTYPE 0b11 ("br" to other registers).
However, this deviates from the kernel behaviour (in bti_enable()).

2) Spectre vector table
"br" instructions are generated at runtime for the vector table
(__bp_harden_hyp_vecs).
These branches would land on vectors in __kvm_hyp_vector at offset 8.
As all the macros are defined with valid_vect/invalid_vect, it is
sufficient to add "bti j" at the correct offset.

[1] https://reviews.llvm.org/D52867

Fixes: b53d4a272349 ("KVM: arm64: Use BTI for nvhe")
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reported-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20230706152240.685684-1-smostafa@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>