commits

Pull tracing fixes from Steven Rostedt:
"Eventfs fixes:

 - With the usage of simple_recursive_remove() recommended by Al Viro,
   the code should not be calling "d_invalidate()" itself. Doing so is
   causing crashes. The code was calling d_invalidate() on the race of
   trying to look up a file while the parent was being deleted. This
   was detected, and the added dentry was having d_invalidate() called
   on it, but the deletion of the directory was also calling
   d_invalidate() on that same dentry.

 - A fix to not free the eventfs_inode (ei) until the last dput() was
   called on its ei->dentry made the ei->dentry exist even after it
   was marked for free by setting the ei->is_freed. But code elsewhere
   still was checking if ei->dentry was NULL if ei->is_freed is set
   and would trigger WARN_ON if that was the case. That's no longer
   true and there should not be any warnings when it is true.

 - Use GFP_NOFS for allocations done under eventfs_mutex. The
   eventfs_mutex can be taken on file system reclaim, so make sure
   that allocations done under that mutex do not trigger file system
   reclaim.

 - Clean up code by moving the taking of inode_lock out of the helper
   functions and into where they are needed, and not use the parameter
   to know to take it or not. It must always be held but some callers
   of the helper function have it taken when they were called.

 - Warn if the inode_lock is not held in the helper functions.

 - Warn if eventfs_start_creating() is called without a parent. As
   eventfs is underneath tracefs, all files created will have a parent
   (the top one will have a tracefs parent).

Tracing update:

- Add Mathieu Desnoyers as an official reviewer of the tracing subsystem"

* tag 'trace-v6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  MAINTAINERS: TRACING: Add Mathieu Desnoyers as Reviewer
  eventfs: Make sure that parent->d_inode is locked in creating files/dirs
  eventfs: Do not allow NULL parent to eventfs_start_creating()
  eventfs: Move taking of inode_lock into dcache_dir_open_wrapper()
  eventfs: Use GFP_NOFS for allocation when eventfs_mutex is held
  eventfs: Do not invalidate dentry in create_file/dir_dentry()
  eventfs: Remove expectation that ei->is_freed means ei->dentry == NULL

2y ago

Linus Torvalds

c527f560

Merge tag 'powerpc-6.7-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

2y ago

Sean Christopherson

4cdf351d

KVM: SVM: Update EFER software model on CR0 trap for SEV-ES

2y ago

Ashwin Dayanand Kamat

27d25348

x86/sev: Fix kernel crash due to late update to read-only ghcb_version

2y ago

Linus Torvalds

d2da77f4

Merge tag 'parisc-for-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

2y ago

Mathieu Desnoyers

76d9eaff

MAINTAINERS: TRACING: Add Mathieu Desnoyers as Reviewer

2y ago

Linus Torvalds

99d4cf76

Merge tag 'gpio-fixes-for-v6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

2y ago

Naveen N Rao

4b3338aa

powerpc/ftrace: Fix stack teardown in ftrace_no_trace

2y ago

David Woodhouse

96f12401

KVM: selftests: add -MP to CFLAGS

2y ago

Linus Torvalds

4892711a

Merge tag 'x86-urgent-2023-11-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2y ago

Helge Deller

43266838

parisc: Reduce size of the bug_table on 64-bit kernel by half

2y ago

Steven Rostedt (Google)

f49f950c

eventfs: Make sure that parent->d_inode is locked in creating files/dirs

2y ago

Linus Torvalds

21b73ffc

Merge tag 'usb-6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

2y ago

Boerge Struempfel

95dd1e34

gpiolib: sysfs: Fix error handling on failed export

2y ago

Nicholas Piggin

dc158d23

KVM: PPC: Book3S HV: Fix KVM_RUN clobbering FP/VEC user registers

2y ago

angquan yu

4a073e81

KVM: selftests: Actually print out magic token in NX hugepages skip message

2y ago

Linus Torvalds

e81fe505

Merge tag 'perf-urgent-2023-11-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2y ago

Borislav Petkov (AMD)

080990aa

x86/microcode: Rework early revisions reporting

2y ago

Helge Deller

e5f3e299

parisc: Drop the HP-UX ENOSYM and EREMOTERELEASE error codes

2y ago

Steven Rostedt (Google)

fc456122

eventfs: Do not allow NULL parent to eventfs_start_creating()

2y ago

Linus Torvalds

0b526090

Merge tag 'tty-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

2y ago

Konstantin Aladyshev

61890dc2

usb: gadget: f_hid: fix report descriptor allocation

2y ago

Linus Torvalds

b85ea95d

Linux 6.7-rc1 v6.7-rc1

2y ago

Timothy Pearson

5e1d824f

powerpc: Don't clobber f0/vs0 during fp|altivec register save

During floating point and vector save to thread data f0/vs0 are
clobbered by the FPSCR/VSCR store routine. This has been observed to
lead to userspace register corruption and application data corruption
with io-uring.

Fix it by restoring f0/vs0 after FPSCR/VSCR store has completed for
all the FP, altivec, VMX register save paths.

Tested under QEMU in kvm mode, running on a Talos II workstation with
dual POWER9 DD2.2 CPUs.

Additional detail (mpe):

Typically save_fpu() is called from __giveup_fpu() which saves the FP
regs and also *turns off FP* in the task's MSR, meaning the kernel will
reload the FP regs from the thread struct before letting the task use FP
again. So in that case save_fpu() is free to clobber f0 because the FP
regs no longer hold live values for the task.

There is another case though, which is the path via:
  sys_clone()
    ...
    copy_process()
      dup_task_struct()
        arch_dup_task_struct()
          flush_all_to_thread()
            save_all()

That path saves the FP regs but leaves them live. That's meant as an
optimisation for a process that's using FP/VSX and then calls fork(),
leaving the regs live means the parent process doesn't have to take a
fault after the fork to get its FP regs back. The optimisation was added
in commit 8792468da5e1 ("powerpc: Add the ability to save FPU without
giving it up").

That path does clobber f0, but f0 is volatile across function calls,
and typically programs reach copy_process() from userspace via a syscall
wrapper function. So in normal usage f0 being clobbered across a
syscall doesn't cause visible data corruption.

But there is now a new path, because io-uring can call copy_process()
via create_io_thread() from the signal handling path. That's OK if the
signal is handled as part of syscall return, but it's not OK if the
signal is handled due to some other interrupt.

That path is:

  interrupt_return_srr_user()
    interrupt_exit_user_prepare()
      interrupt_exit_user_prepare_main()
        do_notify_resume()
          get_signal()
            task_work_run()
              create_worker_cb()
                create_io_worker()
                  copy_process()
                    dup_task_struct()
                      arch_dup_task_struct()
                        flush_all_to_thread()
                          save_all()
                            if (tsk->thread.regs->msr & MSR_FP)
                              save_fpu()
                              # f0 is clobbered and potentially live in userspace

Note the above discussion applies equally to save_altivec().

Fixes: 8792468da5e1 ("powerpc: Add the ability to save FPU without giving it up")
Cc: stable@vger.kernel.org # v4.6+
Closes: https://lore.kernel.org/all/480932026.45576726.1699374859845.JavaMail.zimbra@raptorengineeringinc.com/
Closes: https://lore.kernel.org/linuxppc-dev/480221078.47953493.1700206777956.JavaMail.zimbra@raptorengineeringinc.com/
Tested-by: Timothy Pearson <tpearson@raptorengineering.com>
Tested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
[mpe: Reword change log to describe exact path of corruption & other minor tweaks]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/1921539696.48534988.1700407082933.JavaMail.zimbra@raptorengineeringinc.com

2y ago

Paolo Bonzini

6254eeba

Merge tag 'kvm-x86-fixes-6.7-rcN' of https://github.com/kvm-x86/linux into kvm-master

2y ago

Linus Torvalds

1d0dbc3d

Merge tag 'locking-urgent-2023-11-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2y ago

Dapeng Mi

e8df9d9f

perf/x86/intel: Correct incorrect 'or' operation for PMU capabilities

2y ago

Borislav Petkov (AMD)

2e569ada

x86/microcode: Remove the driver announcement and version

2y ago

Helge Deller

fe76a134

parisc: Use natural CPU alignment for bug_table

2y ago

Steven Rostedt (Google)

bcae32c5

eventfs: Move taking of inode_lock into dcache_dir_open_wrapper()

2y ago

Linus Torvalds

ca20f162

Merge tag 'char-misc-6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

2y ago

Andy Shevchenko

e92fad02

serial: 8250_dw: Add ACPI ID for Granite Rapids-D UART

2y ago

Mathias Nyman

24be0b3c

Revert "xhci: Loosen RPM as default policy to cover for AMD xHC 1.1"

2y ago

Miri Korenblit

e257da57

wifi: iwlwifi: fix system commands group ordering

2y ago

Linus Torvalds

98b1cc82

Linux 6.7-rc2 v6.7-rc2

2y ago

Paolo Bonzini

aa0ae3df

Merge tag 'kvm-s390-master-6.7-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master

2y ago

Like Xu

ef8d8903

KVM: x86: Remove 'return void' expression for 'void function'

2y ago

Linus Torvalds

4515866d

Merge tag '6.7-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

2y ago

Peter Zijlstra

bca4104b

lockdep: Fix block chain corruption

2y ago

Helge Deller

c9fcb2b6

parisc: Ensure 32-bit alignment on parisc unwind section

2y ago

Steven Rostedt (Google)

4763d635

eventfs: Use GFP_NOFS for allocation when eventfs_mutex is held

2y ago

Linus Torvalds

b10a3cca

Merge tag 'loongarch-fixes-6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

2y ago

Miquel Raynal

b7c1e537

nvmem: Do not expect fixed layouts to grab a layout driver

2y ago

Andi Shyti

f0b9d97a

serial: ma35d1: Validate console index before assignment

2y ago

RD Babiera

b17b7fe6

usb: typec: class: fix typec_altmode_put_partner to put plugs

2y ago

Linus Torvalds

b57b17e8

Merge tag 'parisc-for-6.7-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

2y ago

Linus Torvalds

eb3479bc

Merge tag 'kbuild-fixes-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

2y ago

Paolo Bonzini

c8a11a93

Merge tag 'kvmarm-fixes-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

2y ago

Claudio Imbrenda

27072b8e

KVM: s390/mm: Properly reset no-dat

2y ago

Sean Christopherson

ea61294b

Revert "KVM: Prevent module exit until all VMs are freed"

2y ago

Linus Torvalds

090472ed

Merge tag 'usb-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

2y ago

Linux 6.7-rc5 v6.7-rc5

a39b6ac3

Linus Torvalds

Merge tag 'sched_urgent_for_v6.7_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

3a874988

Linus Torvalds

Merge tag 'perf_urgent_for_v6.7_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

537ccb5d

Linus Torvalds

freezer,sched: Do not restore saved_state of a thawed task

23ab79e8

Elliot Berman

Merge tag 'x86_urgent_for_v6.7_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

5412fed7

Linus Torvalds

perf: Fix perf_event_validate_size()

382c27f4

Peter Zijlstra

Linux 6.7-rc3 v6.7-rc3

2cc14f52

Linus Torvalds

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

0aea22c7

Linus Torvalds

x86/CPU/AMD: Check vendor in the AMD microcode callback

9b8493dc

Borislav Petkov (AMD)

Merge tag 'trace-v6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

5b2b1173

Linus Torvalds

Merge tag 'powerpc-6.7-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

c527f560

Linus Torvalds

KVM: SVM: Update EFER software model on CR0 trap for SEV-ES

In general, activating long mode involves setting the EFER_LME bit in
the EFER register and then enabling the X86_CR0_PG bit in the CR0
register. At this point, the EFER_LMA bit will be set automatically by
hardware.

In the case of SVM/SEV guests where writes to CR0 are intercepted, it's
necessary for the host to set EFER_LMA on behalf of the guest since
hardware does not see the actual CR0 write.

In the case of SEV-ES guests where writes to CR0 are trapped instead of
intercepted, the hardware *does* see/record the write to CR0 before
exiting and passing the value on to the host, so as part of enabling
SEV-ES support commit f1c6366e3043 ("KVM: SVM: Add required changes to
support intercepts under SEV-ES") dropped special handling of the
EFER_LMA bit with the understanding that it would be set automatically.

However, since the guest never explicitly sets the EFER_LMA bit, the
host never becomes aware that it has been set. This becomes problematic
when userspace tries to get/set the EFER values via
KVM_GET_SREGS/KVM_SET_SREGS, since the EFER contents tracked by the host
will be missing the EFER_LMA bit, and when userspace attempts to pass
the EFER value back via KVM_SET_SREGS it will fail a sanity check that
asserts that EFER_LMA should always be set when X86_CR0_PG and EFER_LME
are set.

Fix this by always inferring the value of EFER_LMA based on X86_CR0_PG
and EFER_LME, regardless of whether or not SEV-ES is enabled.

Fixes: f1c6366e3043 ("KVM: SVM: Add required changes to support intercepts under SEV-ES")
Reported-by: Peter Gonda <pgonda@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210507165947.2502412-2-seanjc@google.com>
[A two year old patch that was revived after we noticed the failure in
KVM_SET_SREGS and a similar patch was posted by Michael Roth. This is
Sean's patch, but with Michael's more complete commit message. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>