commits

Pull tracing fixes from Steven Rostedt:
"Eventfs fixes:

- With the usage of simple_recursive_remove() recommended by Al Viro,
the code should not be calling "d_invalidate()" itself. Doing so is
causing crashes. The code was calling d_invalidate() on the race of
trying to look up a file while the parent was being deleted. This
was detected, and the added dentry was having d_invalidate() called
on it, but the deletion of the directory was also calling
d_invalidate() on that same dentry.

- A fix to not free the eventfs_inode (ei) until the last dput() was
called on its ei->dentry made the ei->dentry exist even after it
was marked for free by setting the ei->is_freed. But code elsewhere
still was checking if ei->dentry was NULL if ei->is_freed is set
and would trigger WARN_ON if that was the case. That's no longer
true and there should not be any warnings when it is true.

- Use GFP_NOFS for allocations done under eventfs_mutex. The
eventfs_mutex can be taken on file system reclaim, make sure that
allocations done under that mutex do not trigger file system
reclaim.

- Clean up code by moving the taking of inode_lock out of the helper
functions and into where they are needed, and not use the parameter
to know to take it or not. It must always be held but some callers
of the helper function have it taken when they were called.

- Warn if the inode_lock is not held in the helper functions.

- Warn if eventfs_start_creating() is called without a parent. As
eventfs is underneath tracefs, all files created will have a parent
(the top one will have a tracefs parent).

Tracing update:

- Add Mathieu Desnoyers as an official reviewer of the tracing subsystem"

* tag 'trace-v6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
MAINTAINERS: TRACING: Add Mathieu Desnoyers as Reviewer
eventfs: Make sure that parent->d_inode is locked in creating files/dirs
eventfs: Do not allow NULL parent to eventfs_start_creating()
eventfs: Move taking of inode_lock into dcache_dir_open_wrapper()
eventfs: Use GFP_NOFS for allocation when eventfs_mutex is held
eventfs: Do not invalidate dentry in create_file/dir_dentry()
eventfs: Remove expectation that ei->is_freed means ei->dentry == NULL

2y ago

Linus Torvalds

5ef3720d

Merge tag 'powerpc-6.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

2y ago

Shubhrajyoti Datta

9483aa44

EDAC/versal: Read num_csrows and num_chans using the correct bitfield macro

2y ago

Ira Weiny

c65efe36

cxl/cdat: Free correct buffer on checksum error

2y ago

Guanjun

0c154698

dmaengine: idxd: Fix incorrect descriptions for GRPCFG register

2y ago

Linus Torvalds

b57b17e8

Merge tag 'parisc-for-6.7-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

2y ago

Boris Burkov

f63e1164

btrfs: free qgroup reserve when ORDERED_IOERR is set

2y ago

Linus Torvalds

c527f560

Merge tag 'powerpc-6.7-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

2y ago

Sean Christopherson

4cdf351d

KVM: SVM: Update EFER software model on CR0 trap for SEV-ES

2y ago

Ashwin Dayanand Kamat

27d25348

x86/sev: Fix kernel crash due to late update to read-only ghcb_version

2y ago

Linus Torvalds

d2da77f4

Merge tag 'parisc-for-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

2y ago

Mathieu Desnoyers

76d9eaff

MAINTAINERS: TRACING: Add Mathieu Desnoyers as Reviewer

2y ago

Linus Torvalds

dde0672b

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

2y ago

Michael Ellerman

d2441d3e

MAINTAINERS: powerpc: Add Aneesh & Naveen

2y ago

Dan Williams

6f5c4eca

cxl/hdm: Fix dpa translation locking

The helper, cxl_dpa_resource_start(), snapshots the dpa-address of an
endpoint-decoder after acquiring the cxl_dpa_rwsem. However, it is
sufficient to assert that cxl_dpa_rwsem is held rather than acquire it
in the helper. Otherwise, it triggers multiple lockdep reports:

1/ Tracing callbacks are in an atomic context that can not acquire sleeping
locks:

BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1525
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1288, name: bash
preempt_count: 2, expected: 0
RCU nest depth: 0, expected: 0
[..]
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20230524-3.fc38 05/24/2023
Call Trace:
<TASK>
dump_stack_lvl+0x71/0x90
__might_resched+0x1b2/0x2c0
down_read+0x1a/0x190
cxl_dpa_resource_start+0x15/0x50 [cxl_core]
cxl_trace_hpa+0x122/0x300 [cxl_core]
trace_event_raw_event_cxl_poison+0x1c9/0x2d0 [cxl_core]

2/ The rwsem is already held in the inject poison path:

WARNING: possible recursive locking detected
6.7.0-rc2+ #12 Tainted: G W OE N
--------------------------------------------
bash/1288 is trying to acquire lock:
ffffffffc05f73d0 (cxl_dpa_rwsem){++++}-{3:3}, at: cxl_dpa_resource_start+0x15/0x50 [cxl_core]

but task is already holding lock:
ffffffffc05f73d0 (cxl_dpa_rwsem){++++}-{3:3}, at: cxl_inject_poison+0x7d/0x1e0 [cxl_core]
[..]
Call Trace:
<TASK>
dump_stack_lvl+0x71/0x90
__might_resched+0x1b2/0x2c0
down_read+0x1a/0x190
cxl_dpa_resource_start+0x15/0x50 [cxl_core]
cxl_trace_hpa+0x122/0x300 [cxl_core]
trace_event_raw_event_cxl_poison+0x1c9/0x2d0 [cxl_core]
__traceiter_cxl_poison+0x5c/0x80 [cxl_core]
cxl_inject_poison+0x1bc/0x1e0 [cxl_core]

This appears to have been an issue since the initial implementation and
uncovered by the new cxl-poison.sh test [1]. That test is now passing with
these changes.

Fixes: 28a3ae4ff66c ("cxl/trace: Add an HPA to cxl_poison trace events")
Link: http://lore.kernel.org/r/e4f2716646918135ddbadf4146e92abb659de734.1700615159.git.alison.schofield@intel.com [1]
Cc: <stable@vger.kernel.org>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
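
The locking change described in this commit message (the helper asserts that the lock is held rather than taking it) can be illustrated with a small userspace sketch. Nothing here is the cxl_core code: `toy_rwsem`, `toy_decoder`, `toy_dpa_resource_start()` and `toy_inject_poison()` are hypothetical stand-ins, and the "lock" is only a reader counter. In the kernel, a check of this kind is typically written with `lockdep_assert_held()`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a reader/writer semaphore: it only counts readers. */
struct toy_rwsem {
    int readers;
};

static void toy_down_read(struct toy_rwsem *s) { s->readers++; }

static void toy_up_read(struct toy_rwsem *s)
{
    assert(s->readers > 0);
    s->readers--;
}

static int toy_is_held(const struct toy_rwsem *s) { return s->readers > 0; }

/* Hypothetical decoder carrying the value the helper snapshots. */
struct toy_decoder {
    uint64_t dpa_start;
};

static struct toy_rwsem toy_dpa_lock;

/*
 * Helper: does NOT acquire the lock. It only asserts that the caller
 * already holds it, so it is safe to call from a path that took the
 * lock earlier and from contexts that must not sleep.
 */
static uint64_t toy_dpa_resource_start(const struct toy_decoder *d)
{
    assert(toy_is_held(&toy_dpa_lock));
    return d->dpa_start;
}

/* Caller: takes the lock once around the whole operation. */
static uint64_t toy_inject_poison(const struct toy_decoder *d)
{
    uint64_t start;

    toy_down_read(&toy_dpa_lock);
    start = toy_dpa_resource_start(d); /* no second acquisition here */
    toy_up_read(&toy_dpa_lock);
    return start;
}
```

With this shape, the outer path is the only place that acquires the lock, which avoids both the recursive acquisition and the sleeping-lock call from the trace callback that the reports above show.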

2y ago

Guanjun

778dfacc

dmaengine: idxd: Protect int_handle field in hw descriptor

2y ago

Linus Torvalds

4eeee663

Merge tag 'loongarch-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

2y ago

Helge Deller

a406b8b4

parisc: Prevent booting 64-bit kernels on PA1.x machines

2y ago

Jann Horn

0ac1d13a

btrfs: send: ensure send_fd is writable

2y ago

Linus Torvalds

99d4cf76

Merge tag 'gpio-fixes-for-v6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

2y ago

Naveen N Rao

4b3338aa

powerpc/ftrace: Fix stack teardown in ftrace_no_trace

2y ago

David Woodhouse

96f12401

KVM: selftests: add -MP to CFLAGS

2y ago

Linus Torvalds

4892711a

Merge tag 'x86-urgent-2023-11-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2y ago

Helge Deller

43266838

parisc: Reduce size of the bug_table on 64-bit kernel by half

2y ago

Steven Rostedt (Google)

f49f950c

eventfs: Make sure that parent->d_inode is locked in creating files/dirs

2y ago

Linus Torvalds

3b8a9b2e

Merge tag 'trace-v6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

- Fix eventfs to check creating new files for events with names greater
than NAME_MAX. The eventfs lookup needs to check the return result of
simple_lookup().

- Fix the ring buffer to check the proper max data size. Events must be
able to fit on the ring buffer sub-buffer, if it cannot, then it
fails to be written and the logic to add the event is avoided. The
code to check if an event can fit failed to add the possible absolute
timestamp which may make the event not be able to fit. This causes
the ring buffer to go into an infinite loop trying to find a
sub-buffer that would fit the event. Luckily, there's a check that
will bail out if it looped over a 1000 times and it also warns.

The real fix is not to add the absolute timestamp to an event that is
starting at the beginning of a sub-buffer because it uses the
sub-buffer timestamp.

By avoiding the timestamp at the start of the sub-buffer allows
events that pass the first check to always find a sub-buffer that it
can fit on.

- Have large events that do not fit on a trace_seq to print "LINE TOO
BIG" like it does for the trace_pipe instead of what it does now
which is to silently drop the output.

- Fix a memory leak of forgetting to free the spare page that is saved
by a trace instance.

- Update the size of the snapshot buffer when the main buffer is
updated if the snapshot buffer is allocated.

- Fix ring buffer timestamp logic by removing all the places that tried
to put the before_stamp back to the write stamp so that the next
event doesn't add an absolute timestamp. But each of these updates
added a race where by making the two timestamp equal, it was
validating the write_stamp so that it can be incorrectly used for
calculating the delta of an event.

- There's a temp buffer used for printing the event that was using the
event data size for allocation when it needed to use the size of the
entire event (meta-data and payload data)

- For hardening, use "%.*s" for printing the trace_marker output, to
limit the amount that is printed by the size of the event. This was
discovered by development that added a bug that truncated the '\0'
and caused a crash.

- Fix a use-after-free bug in the use of the histogram files when an
instance is being removed.

- Remove a useless update in the rb_try_to_discard of the write_stamp.
The before_stamp was already changed to force the next event to add
an absolute timestamp that the write_stamp is not used. But the
write_stamp is modified again using an unneeded 64-bit cmpxchg.

- Fix several races in the 32-bit implementation of the
rb_time_cmpxchg() that does a 64-bit cmpxchg.

- While looking at fixing the 64-bit cmpxchg, I noticed that because
the ring buffer uses normal cmpxchg, and this can be done in NMI
context, there's some architectures that do not have a working
cmpxchg in NMI context. For these architectures, fail recording
events that happen in NMI context.

* tag 'trace-v6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ring-buffer: Do not record in NMI if the arch does not support cmpxchg in NMI
ring-buffer: Have rb_time_cmpxchg() set the msb counter too
ring-buffer: Fix 32-bit rb_time_read() race with rb_time_cmpxchg()
ring-buffer: Fix a race in rb_time_cmpxchg() for 32 bit archs
ring-buffer: Remove useless update to write_stamp in rb_try_to_discard()
ring-buffer: Do not try to put back write_stamp
tracing: Fix uaf issue when open the hist or hist_debug file
tracing: Add size check when printing trace_marker output
ring-buffer: Have saved event hold the entire event
ring-buffer: Do not update before stamp when switching sub-buffers
tracing: Update snapshot buffer on resize if it is allocated
ring-buffer: Fix memory leak of free page
eventfs: Fix events beyond NAME_MAX blocking tasks
tracing: Have large events show up as '[LINE TOO BIG]' instead of nothing
ring-buffer: Fix writing to the buffer with max_data_size
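
The hardening item above, printing with "%.*s" so that output is bounded by the event size, can be sketched in plain userspace C. This is an illustration only, not the tracing code: `struct marker_event` and `format_marker()` are hypothetical names, and the payload is deliberately left without a terminating '\0'.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record: data is NOT guaranteed to be NUL-terminated. */
struct marker_event {
    size_t len;     /* number of valid bytes in data */
    char data[16];
};

/*
 * Format at most ev->len bytes of the payload into out.
 * The precision given through '*' caps how many bytes "%s" reads,
 * so a missing '\0' in data cannot cause an overread.
 */
static int format_marker(char *out, size_t out_size,
                         const struct marker_event *ev)
{
    /* The '*' precision argument must be an int. */
    return snprintf(out, out_size, "%.*s", (int)ev->len, ev->data);
}
```

A plain "%s" would keep reading past `data` until it happened to find a zero byte; the explicit precision ties the read to the recorded length instead.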

2y ago

Stephen Boyd

8defec03

Merge tag 'v6.7-rockchip-clkfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into clk-fixes

2y ago

Linux 6.7-rc6 v6.7-rc6

ceb6a6f0

Linus Torvalds

Merge tag 'perf_urgent_for_v6.7_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

177c2ffe

Linus Torvalds

Merge tag 'for-6.7-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

0e389834

Linus Torvalds

perf: Fix perf_event_validate_size() lockdep splat

7e2c1e4b

Mark Rutland

Merge tag 'soundwire-6.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire

accc98af

Linus Torvalds

btrfs: do not allow non subvolume root targets for snapshot

a8892fd7

Josef Bacik

Linux 6.7-rc5 v6.7-rc5

a39b6ac3

Linus Torvalds

Merge tag 'phy-fixes-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy

7f499ec2

Linus Torvalds

soundwire: intel_ace2x: fix AC timing setting for ACE2.x

393cae5f

Chao Song

btrfs: ensure releasing squota reserve on head refs

e85a0ada

Boris Burkov

Merge tag 'sched_urgent_for_v6.7_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

3a874988

Linus Torvalds

Merge tag 'dmaengine-fix-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

6d04b70e

Linus Torvalds

phy: sunplus: return negative error code in sp_usb_phy_probe

2a9c7138

Su Hui

soundwire: stream: fix NULL pointer dereference for multi_link

e199bf52

Krzysztof Kozlowski

btrfs: don't clear qgroup reserved bit in release_folio

a8680550

Boris Burkov

Merge tag 'perf_urgent_for_v6.7_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

537ccb5d

Linus Torvalds

freezer,sched: Do not restore saved_state of a thawed task

23ab79e8

Elliot Berman

Merge tag 'cxl-fixes-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

134fdb80

Linus Torvalds

dmaengine: fsl-edma: fix DMA channel leak in eDMAv4

4ee632c8

Frank Li

phy: mediatek: mipi: mt8183: fix minimal supported frequency

06f76e46

Michael Walle

Linux 6.7-rc1 v6.7-rc1

b85ea95d

Linus Torvalds

btrfs: free qgroup pertrans reserve on transaction abort

b321a52c

Boris Burkov

Merge tag 'x86_urgent_for_v6.7_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

5412fed7

Linus Torvalds

perf: Fix perf_event_validate_size()

382c27f4

Peter Zijlstra

Linux 6.7-rc3 v6.7-rc3

2cc14f52

Linus Torvalds

Merge tag 'edac_urgent_for_v6.7_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

ef6a7c27

Linus Torvalds

cxl/pmu: Ensure put_device on pmu devices

ef3d5cf9

Ira Weiny

dmaengine: fsl-edma: fix wrong pointer check in fsl_edma3_attach_pd()

bffa7218

Yang Yingliang

phy: ti: gmii-sel: Fix register offset when parent is not a syscon node

0f40d509

Andrew Davis

wifi: iwlwifi: fix system commands group ordering

e257da57

Miri Korenblit

btrfs: fix qgroup_free_reserved_data int overflow

9e65bfca

Boris Burkov

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

0aea22c7

Linus Torvalds

x86/CPU/AMD: Check vendor in the AMD microcode callback

9b8493dc

Borislav Petkov (AMD)

Merge tag 'trace-v6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

5b2b1173

Linus Torvalds

Merge tag 'powerpc-6.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

5ef3720d

Linus Torvalds

EDAC/versal: Read num_csrows and num_chans using the correct bitfield macro

9483aa44

Shubhrajyoti Datta

cxl/cdat: Free correct buffer on checksum error

c65efe36

Ira Weiny

dmaengine: idxd: Fix incorrect descriptions for GRPCFG register

0c154698

Guanjun

Merge tag 'parisc-for-6.7-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

b57b17e8

Linus Torvalds

btrfs: free qgroup reserve when ORDERED_IOERR is set

f63e1164

Boris Burkov

Merge tag 'powerpc-6.7-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

c527f560

Linus Torvalds

KVM: SVM: Update EFER software model on CR0 trap for SEV-ES

In general, activating long mode involves setting the EFER_LME bit in
the EFER register and then enabling the X86_CR0_PG bit in the CR0
register. At this point, the EFER_LMA bit will be set automatically by
hardware.

In the case of SVM/SEV guests where writes to CR0 are intercepted, it's
necessary for the host to set EFER_LMA on behalf of the guest since
hardware does not see the actual CR0 write.

In the case of SEV-ES guests where writes to CR0 are trapped instead of
intercepted, the hardware *does* see/record the write to CR0 before
exiting and passing the value on to the host, so as part of enabling
SEV-ES support commit f1c6366e3043 ("KVM: SVM: Add required changes to
support intercepts under SEV-ES") dropped special handling of the
EFER_LMA bit with the understanding that it would be set automatically.

However, since the guest never explicitly sets the EFER_LMA bit, the
host never becomes aware that it has been set. This becomes problematic
when userspace tries to get/set the EFER values via
KVM_GET_SREGS/KVM_SET_SREGS, since the EFER contents tracked by the host
will be missing the EFER_LMA bit, and when userspace attempts to pass
the EFER value back via KVM_SET_SREGS it will fail a sanity check that
asserts that EFER_LMA should always be set when X86_CR0_PG and EFER_LME
are set.

Fix this by always inferring the value of EFER_LMA based on X86_CR0_PG
and EFER_LME, regardless of whether or not SEV-ES is enabled.

Fixes: f1c6366e3043 ("KVM: SVM: Add required changes to support intercepts under SEV-ES")
Reported-by: Peter Gonda <pgonda@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210507165947.2502412-2-seanjc@google.com>
[A two year old patch that was revived after we noticed the failure in
KVM_SET_SREGS and a similar patch was posted by Michael Roth. This is
Sean's patch, but with Michael's more complete commit message. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
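
The rule stated in the fix (EFER_LMA is inferred from X86_CR0_PG and EFER_LME) can be written as a small pure function. This is a simplified sketch, not the KVM implementation: `infer_efer()` is a hypothetical helper, and the bit positions are the architectural ones (CR0.PG is bit 31, EFER.LME is bit 8, EFER.LMA is bit 10).

```c
#include <stdint.h>

#define X86_CR0_PG (1ULL << 31) /* CR0: paging enabled */
#define EFER_LME   (1ULL << 8)  /* EFER: long mode enable */
#define EFER_LMA   (1ULL << 10) /* EFER: long mode active */

/*
 * Hypothetical helper: return the EFER value a software model should
 * track for the given guest EFER and CR0. LMA is set exactly when both
 * LME and paging are enabled, and cleared otherwise.
 */
static uint64_t infer_efer(uint64_t efer, uint64_t cr0)
{
    if ((efer & EFER_LME) && (cr0 & X86_CR0_PG))
        return efer | EFER_LMA;
    return efer & ~EFER_LMA;
}
```

Applying the same inference on every CR0 update, whether the write was intercepted or trapped, keeps the tracked EFER consistent with the sanity check that KVM_SET_SREGS performs.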