commits
Pull USB / Thunderbolt driver fixes from Greg KH:
"Here are a small set of USB and Thunderbolt driver fixes for reported
problems and a documentation update, for 6.3-rc4.
Included in here are:
- documentation update for uvc gadget driver
- small thunderbolt driver fixes
- cdns3 driver fixes
- dwc3 driver fixes
- dwc2 driver fixes
- chipidea driver fixes
- typec driver fixes
- onboard_usb_hub device id updates
- quirk updates
All of these have been in linux-next with no reported problems"
* tag 'usb-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (30 commits)
usb: dwc2: fix a race, don't power off/on phy for dual-role mode
usb: dwc2: fix a devres leak in hw_enable upon suspend resume
usb: chipidea: core: fix possible concurrent when switch role
usb: chipdea: core: fix return -EINVAL if request role is the same with current role
thunderbolt: Rename shadowed variables bit to interrupt_bit and auto_clear_bit
thunderbolt: Disable interrupt auto clear for rings
thunderbolt: Use const qualifier for `ring_interrupt_index`
usb: gadget: Use correct endianness of the wLength field for WebUSB
uas: Add US_FL_NO_REPORT_OPCODES for JMicron JMS583Gen 2
usb: cdnsp: changes PCI Device ID to fix conflict with CNDS3 driver
usb: cdns3: Fix issue with using incorrect PCI device function
usb: cdnsp: Fixes issue with redundant Status Stage
MAINTAINERS: make me a reviewer of USB/IP
thunderbolt: Use scale field when allocating USB3 bandwidth
thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host routers
thunderbolt: Call tb_check_quirks() after initializing adapters
thunderbolt: Add missing UNSET_INBOUND_SBTX for retimer access
thunderbolt: Fix memory leak in margining
usb: dwc2: drd: fix inconsistent mode if role-switch-default-mode="host"
docs: usb: Add documentation for the UVC Gadget
...
Pull scheduler fix from Borislav Petkov:
- Fix a corner case where vruntime of a task is not being sanitized
* tag 'sched_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Sanitize vruntime of entity being migrated
When in dual role mode (dr_mode == USB_DR_MODE_OTG), platform probe
successively basically calls:
- dwc2_gadget_init()
- dwc2_hcd_init()
- dwc2_lowlevel_hw_disable() since recent change [1]
- usb_add_gadget_udc()
The PHYs (and so the clocks it may provide) shouldn't be disabled for all
SoCs, in OTG mode, as the HCD part has been initialized.
On STM32 this creates some weird race condition upon boot, when:
- initially attached as a device, to a HOST
- and there is a gadget script invoked to setup the device part.
Below issue becomes systematic, as long as the gadget script isn't
started by userland: the hardware PHYs (and so the clocks provided by the
PHYs) remains disabled.
It ends up in having an endless interrupt storm, before the watchdog
resets the platform.
[ 16.924163] dwc2 49000000.usb-otg: EPs: 9, dedicated fifos, 952 entries in SPRAM
[ 16.962704] dwc2 49000000.usb-otg: DWC OTG Controller
[ 16.966488] dwc2 49000000.usb-otg: new USB bus registered, assigned bus number 2
[ 16.974051] dwc2 49000000.usb-otg: irq 77, io mem 0x49000000
[ 17.032170] hub 2-0:1.0: USB hub found
[ 17.042299] hub 2-0:1.0: 1 port detected
[ 17.175408] dwc2 49000000.usb-otg: Mode Mismatch Interrupt: currently in Host mode
[ 17.181741] dwc2 49000000.usb-otg: Mode Mismatch Interrupt: currently in Host mode
[ 17.189303] dwc2 49000000.usb-otg: Mode Mismatch Interrupt: currently in Host mode
...
The host part is also not functional, until the gadget part is configured.
The HW may only be disabled for peripheral mode (original init), e.g.
dr_mode == USB_DR_MODE_PERIPHERAL, until the gadget driver initializes.
But when in USB_DR_MODE_OTG, the HW should remain enabled, as the HCD part
is able to run, while the gadget part isn't necessarily configured.
I don't fully get the of purpose the original change, that claims disabling
the hardware is missing. It creates conditions on SOCs using the PHY
initialization to be completely non working in OTG mode. Original
change [1] should be reworked to be platform specific.
[1] https://lore.kernel.org/r/20221206-dwc2-gadget-dual-role-v1-2-36515e1092cd@theobroma-systems.com
Fixes: ade23d7b7ec5 ("usb: dwc2: power on/off phy for peripheral mode in dual-role mode")
Cc: stable <stable@kernel.org>
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Reviewed-by: Quentin Schulz <quentin.schulz@theobroma-systems.com>
Tested-by: Quentin Schulz <quentin.schulz@theobroma-systems.com>
Link: https://lore.kernel.org/r/20230315144433.3095859-1-fabrice.gasnier@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull perf fix from Borislav Petkov:
- Properly clear perf event status tracking in the AMD perf event
overflow handler
* tag 'perf_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/amd/core: Always clear status for idx
Commit 829c1651e9c4 ("sched/fair: sanitize vruntime of entity being placed")
fixes an overflowing bug, but ignore a case that se->exec_start is reset
after a migration.
For fixing this case, we delay the reset of se->exec_start after
placing the entity which se->exec_start to detect long sleeping task.
In order to take into account a possible divergence between the clock_task
of 2 rqs, we increase the threshold to around 104 days.
Fixes: 829c1651e9c4 ("sched/fair: sanitize vruntime of entity being placed")
Originally-by: Zhang Qiao <zhangqiao22@huawei.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Zhang Qiao <zhangqiao22@huawei.com>
Link: https://lore.kernel.org/r/20230317160810.107988-1-vincent.guittot@linaro.org
Each time the platform goes to low power, PM suspend / resume routines
call: __dwc2_lowlevel_hw_enable -> devm_add_action_or_reset().
This adds a new devres each time.
This may also happen at runtime, as dwc2_lowlevel_hw_enable() can be
called from udc_start().
This can be seen with tracing:
- echo 1 > /sys/kernel/debug/tracing/events/dev/devres_log/enable
- go to low power
- cat /sys/kernel/debug/tracing/trace
A new "ADD" entry is found upon each low power cycle:
... devres_log: 49000000.usb-otg ADD 82a13bba devm_action_release (8 bytes)
... devres_log: 49000000.usb-otg ADD 49889daf devm_action_release (8 bytes)
...
A second issue is addressed here:
- regulator_bulk_enable() is called upon each PM cycle (suspend/resume).
- regulator_bulk_disable() never gets called.
So the reference count for these regulators constantly increase, by one
upon each low power cycle, due to missing regulator_bulk_disable() call
in __dwc2_lowlevel_hw_disable().
The original fix that introduced the devm_add_action_or_reset() call,
fixed an issue during probe, that happens due to other errors in
dwc2_driver_probe() -> dwc2_core_reset(). Then the probe fails without
disabling regulators, when dr_mode == USB_DR_MODE_PERIPHERAL.
Rather fix the error path: disable all the low level hardware in the
error path, by using the "hsotg->ll_hw_enabled" flag. Checking dr_mode
has been introduced to avoid a dual call to dwc2_lowlevel_hw_disable().
"ll_hw_enabled" should achieve the same (and is used currently in the
remove() routine).
Fixes: 54c196060510 ("usb: dwc2: Always disable regulators on driver teardown")
Fixes: 33a06f1300a7 ("usb: dwc2: Fix error path in gadget registration")
Cc: stable <stable@kernel.org>
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Link: https://lore.kernel.org/r/20230316084127.126084-1-fabrice.gasnier@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull core fixes from Borislav Petkov:
- Do the delayed RCU wakeup for kthreads in the proper order so that
former doesn't get ignored
- A noinstr warning fix
* tag 'core_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
entry/rcu: Check TIF_RESCHED _after_ delayed RCU wake-up
entry: Fix noinstr warning in __enter_from_user_mode()
The variable 'status' (which contains the unhandled overflow bits) is
not being properly masked in some cases, displaying the following
warning:
WARNING: CPU: 156 PID: 475601 at arch/x86/events/amd/core.c:972 amd_pmu_v2_handle_irq+0x216/0x270
This seems to be happening because the loop is being continued before
the status bit being unset, in case x86_perf_event_set_period()
returns 0. This is also causing an inconsistency because the "handled"
counter is incremented, but the status bit is not cleaned.
Move the bit cleaning together above, together when the "handled"
counter is incremented.
Fixes: 7685665c390d ("perf/x86/amd/core: Add PerfMonV2 overflow handling")
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230321113338.1669660-1-leitao@debian.org
The user may call role_store() when driver is handling
ci_handle_id_switch() which is triggerred by otg event or power lost
event. Unfortunately, the controller may go into chaos in this case.
Fix this by protecting it with mutex lock.
Fixes: a932a8041ff9 ("usb: chipidea: core: add sysfs group")
cc: <stable@vger.kernel.org>
Acked-by: Peter Chen <peter.chen@kernel.org>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20230317061516.2451728-2-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull x86 fixes from Borislav Petkov:
- Add a AMX ptrace self test
- Prevent a false-positive warning when retrieving the (invalid)
address of dynamic FPU features in their init state which are not
saved in init_fpstate at all
- Randomize per-CPU entry areas only when KASLR is enabled
* tag 'x86_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests/x86/amx: Add a ptrace test
x86/fpu/xstate: Prevent false-positive warning in __copy_xstate_uabi_buf()
x86/mm: Do not shuffle CPU entry areas without KASLR
RCU sometimes needs to perform a delayed wake up for specific kthreads
handling offloaded callbacks (RCU_NOCB). These wakeups are performed
by timers and upon entry to idle (also to guest and to user on nohz_full).
However the delayed wake-up on kernel exit is actually performed after
the thread flags are fetched towards the fast path check for work to
do on exit to user. As a result, and if there is no other pending work
to do upon that kernel exit, the current task will resume to userspace
with TIF_RESCHED set and the pending wake up ignored.
Fix this with fetching the thread flags _after_ the delayed RCU-nocb
kthread wake-up.
Fixes: 47b8ff194c1f ("entry: Explicitly flush pending rcuog wakeup before last rescheduling point")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230315194349.10798-3-joel@joelfernandes.org
Pull tracing fixes from Steven Rostedt:
- Fix setting affinity of hwlat threads in containers
Using sched_set_affinity() has unwanted side effects when being
called within a container. Use set_cpus_allowed_ptr() instead
- Fix per cpu thread management of the hwlat tracer:
- Do not start per_cpu threads if one is already running for the CPU
- When starting per_cpu threads, do not clear the kthread variable
as it may already be set to running per cpu threads
- Fix return value for test_gen_kprobe_cmd()
On error the return value was overwritten by being set to the result
of the call from kprobe_event_delete(), which would likely succeed,
and thus have the function return success
- Fix splice() reads from the trace file that was broken by commit
36e2c7421f02 ("fs: don't allow splice read/write without explicit
ops")
- Remove obsolete and confusing comment in ring_buffer.c
The original design of the ring buffer used struct page flags for
tricks to optimize, which was shortly removed due to them being
tricks. But a comment for those tricks remained
- Set local functions and variables to static
* tag 'trace-v6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/hwlat: Replace sched_setaffinity with set_cpus_allowed_ptr
ring-buffer: remove obsolete comment for free_buffer_page()
tracing: Make splice_read available again
ftrace: Set direct_ops storage-class-specifier to static
trace/hwlat: Do not start per-cpu thread if it is already running
trace/hwlat: Do not wipe the contents of per-cpu thread data
tracing/osnoise: set several trace_osnoise.c variables storage-class-specifier to static
tracing: Fix wrong return in kprobe_event_gen_test.c
It should not return -EINVAL if the request role is the same with current
role, return non-error and without do anything instead.
Fixes: a932a8041ff9 ("usb: chipidea: core: add sysfs group")
cc: <stable@vger.kernel.org>
Acked-by: Peter Chen <peter.chen@kernel.org>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20230317061516.2451728-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull cifs client fixes from Steve French:
"Twelve cifs/smb3 client fixes (most also for stable)
- forced umount fix
- fix for two perf regressions
- reconnect fixes
- small debugging improvements
- multichannel fixes"
* tag 'smb3-client-fixes-6.3-rc3' of git://git.samba.org/sfrench/cifs-2.6:
smb3: fix unusable share after force unmount failure
cifs: fix dentry lookups in directory handle cache
smb3: lower default deferred close timeout to address perf regression
cifs: fix missing unload_nls() in smb2_reconnect()
cifs: avoid race conditions with parallel reconnects
cifs: append path to open_enter trace event
cifs: print session id while listing open files
cifs: dump pending mids for all channels in DebugData
cifs: empty interface list when server doesn't support query interfaces
cifs: do not poll server interfaces too regularly
cifs: lock chan_lock outside match_session
cifs: check only tcon status on tcon related functions
Include a test case to validate the XTILEDATA injection to the target.
Also, it ensures the kernel's ability to copy states between different
XSAVE formats.
Refactor the memcmp() code to be usable for the state validation.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20230227210504.18520-3-chang.seok.bae%40intel.com
__enter_from_user_mode() is triggering noinstr warnings with
CONFIG_DEBUG_PREEMPT due to its call of preempt_count_add() via
ct_state().
The preemption disable isn't needed as interrupts are already disabled.
And the context_tracking_enabled() check in ct_state() also isn't needed
as that's already being done by the CT_WARN_ON().
Just use __ct_state() instead.
Fixes the following warnings:
vmlinux.o: warning: objtool: enter_from_user_mode+0xba: call to preempt_count_add() leaves .noinstr.text section
vmlinux.o: warning: objtool: syscall_enter_from_user_mode+0xf9: call to preempt_count_add() leaves .noinstr.text section
vmlinux.o: warning: objtool: syscall_enter_from_user_mode_prepare+0xc7: call to preempt_count_add() leaves .noinstr.text section
vmlinux.o: warning: objtool: irqentry_enter_from_user_mode+0xba: call to preempt_count_add() leaves .noinstr.text section
Fixes: 171476775d32 ("context_tracking: Convert state to atomic_t")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/d8955fa6d68dc955dda19baf13ae014ae27926f5.1677369694.git.jpoimboe@kernel.org
Pull tty/serial driver fixes from Greg KH:
"Here are some small tty and serial driver fixes for 6.3-rc3 to resolve
some reported issues.
They include:
- 8250 driver Kconfig issue pointed out by you that showed up in -rc1
- qcom-geni serial driver fixes
- various 8250 driver fixes for reported problems
- fsl_lpuart driver fixes
- serdev fix for regression in -rc1
- vt.c bugfix
All have been in linux-next for over a week with no reported problems"
* tag 'tty-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: vt: protect KD_FONT_OP_GET_TALL from unbound access
serial: qcom-geni: drop bogus uart_write_wakeup()
serial: qcom-geni: fix mapping of empty DMA buffer
serial: qcom-geni: fix DMA mapping leak on shutdown
serial: qcom-geni: fix console shutdown hang
serdev: Set fwnode for serdev devices
tty: serial: fsl_lpuart: fix race on RX DMA shutdown
serial: 8250_pci1xxxx: Disable SERIAL_8250_PCI1XXXX config by default
serial: 8250_fsl: fix handle_irq locking
serial: 8250_em: Fix UART port type
serial: 8250: ASPEED_VUART: select REGMAP instead of depending on it
tty: serial: fsl_lpuart: skip waiting for transmission complete when UARTCTRL_SBK is asserted
Revert "tty: serial: fsl_lpuart: adjust SERIAL_FSL_LPUART_CONSOLE config dependency"
There is a problem with the behavior of hwlat in a container,
resulting in incorrect output. A warning message is generated:
"cpumask changed while in round-robin mode, switching to mode none",
and the tracing_cpumask is ignored. This issue arises because
the kernel thread, hwlatd, is not a part of the container, and
the function sched_setaffinity is unable to locate it using its PID.
Additionally, the task_struct of hwlatd is already known.
Ultimately, the function set_cpus_allowed_ptr achieves
the same outcome as sched_setaffinity, but employs task_struct
instead of PID.
Test case:
# cd /sys/kernel/tracing
# echo 0 > tracing_on
# echo round-robin > hwlat_detector/mode
# echo hwlat > current_tracer
# unshare --fork --pid bash -c 'echo 1 > tracing_on'
# dmesg -c
Actual behavior:
[573502.809060] hwlat_detector: cpumask changed while in round-robin mode, switching to mode none
Link: https://lore.kernel.org/linux-trace-kernel/20230316144535.1004952-1-costa.shul@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: 0330f7aa8ee63 ("tracing: Have hwlat trace migrate across tracing_cpumask CPUs")
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Mika writes:
thunderbolt: Fixes for v6.3-rc4
This includes following fixes and quirks for v6.3-rc:
- Quirk to disable CL-states on AMD USB4 host routers
- Fix memory leak in lane margining
- Correct the retimer access flows
- Quirk to limit USB3 bandwidth on certain Intel USB4 host routers
- Fix usage of scale field when allocting USB3 bandwidth
- Fix interrupt "auto clear" on non-Intel USB4 host routers.
There are also two commits that are not fixes themselves but are needed
for the USB3 bandwidth quirk and for the interrupt auto clear fix to
work.
All these have been in linux-next with no reported issues.
* tag 'thunderbolt-for-v6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt:
thunderbolt: Rename shadowed variables bit to interrupt_bit and auto_clear_bit
thunderbolt: Disable interrupt auto clear for rings
thunderbolt: Use const qualifier for `ring_interrupt_index`
thunderbolt: Use scale field when allocating USB3 bandwidth
thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host routers
thunderbolt: Call tb_check_quirks() after initializing adapters
thunderbolt: Add missing UNSET_INBOUND_SBTX for retimer access
thunderbolt: Fix memory leak in margining
thunderbolt: Add quirk to disable CLx
Pull nfsd fix from Chuck Lever:
- Fix a crash when using NFS with krb5p
* tag 'nfsd-6.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
SUNRPC: Fix a crash in gss_krb5_checksum()
If user does forced unmount ("umount -f") while files are still open
on the share (as was seen in a Kubernetes example running on SMB3.1.1
mount) then we were marking the share as "TID_EXITING" in umount_begin()
which caused all subsequent operations (except write) to fail ... but
unfortunately when umount_begin() is called we do not know yet that
there are open files or active references on the share that would prevent
unmount from succeeding. Kubernetes had example when they were doing
umount -f when files were open which caused the share to become
unusable until the files were closed (and the umount retried).
Fix this so that TID_EXITING is not set until we are about to send
the tree disconnect (not at the beginning of forced umounts in
umount_begin) so that if "umount -f" fails (due to open files or
references) the mount is still usable.
Cc: stable@vger.kernel.org
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
__copy_xstate_to_uabi_buf() copies either from the tasks XSAVE buffer
or from init_fpstate into the ptrace buffer. Dynamic features, like
XTILEDATA, have an all zeroes init state and are not saved in
init_fpstate, which means the corresponding bit is not set in the
xfeatures bitmap of the init_fpstate header.
But __copy_xstate_to_uabi_buf() retrieves addresses for both the tasks
xstate and init_fpstate unconditionally via __raw_xsave_addr().
So if the tasks XSAVE buffer has a dynamic feature set, then the
address retrieval for init_fpstate triggers the warning in
__raw_xsave_addr() which checks the feature bit in the init_fpstate
header.
Remove the address retrieval from init_fpstate for extended features.
They have an all zeroes init state so init_fpstate has zeros for them.
Then zeroing the user buffer for the init state is the same as copying
them from init_fpstate.
Fixes: 2308ee57d93d ("x86/fpu/amx: Enable the AMX feature in 64-bit mode")
Reported-by: Mingwei Zhang <mizhang@google.com>
Link: https://lore.kernel.org/kvm/20230221163655.920289-2-mizhang@google.com/
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Mingwei Zhang <mizhang@google.com>
Link: https://lore.kernel.org/all/20230227210504.18520-2-chang.seok.bae%40intel.com
Cc: stable@vger.kernel.org
Pull char/misc driver fixes from Greg KH:
"Here are a few small char/misc/other driver subsystem patches to
resolve reported problems for 6.3-rc3.
Included in here are:
- Interconnect driver fixes for reported problems
- Memory driver fixes for reported problems
- nvmem core fix
- firmware driver fix for reported problem
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (23 commits)
memory: tegra30-emc: fix interconnect registration race
memory: tegra20-emc: fix interconnect registration race
memory: tegra124-emc: fix interconnect registration race
memory: tegra: fix interconnect registration race
interconnect: exynos: drop redundant link destroy
interconnect: exynos: fix registration race
interconnect: exynos: fix node leak in probe PM QoS error path
interconnect: qcom: msm8974: fix registration race
interconnect: qcom: rpmh: fix registration race
interconnect: qcom: rpmh: fix probe child-node error handling
interconnect: qcom: rpm: fix registration race
nvmem: core: return -ENOENT if nvmem cell is not found
firmware: xilinx: don't make a sleepable memory allocation from an atomic context
interconnect: qcom: rpm: fix probe child-node error handling
interconnect: qcom: osm-l3: fix registration race
interconnect: imx: fix registration race
interconnect: fix provider registration API
interconnect: fix icc_provider_del() error handling
interconnect: fix mem leak when freeing nodes
interconnect: qcom: qcm2290: Fix MASTER_SNOC_BIMC_NRT
...
In ioctl(KD_FONT_OP_GET_TALL), userland tells through op->height which
vpitch should be used to copy over the font. In con_font_get, we were
not checking that it is within the maximum height value, and thus
userland could make the vc->vc_sw->con_font_get(vc, &font, vpitch);
call possibly overflow the allocated max_font_size bytes, and the
copy_to_user(op->data, font.data, c) call possibly read out of that
allocated buffer.
By checking vpitch against max_font_height, the max_font_size buffer
will always be large enough for the vc->vc_sw->con_font_get(vc, &font,
vpitch) call (since we already prevent loading a font larger than that),
and c = (font.width+7)/8 * vpitch * font.charcount will always remain
below max_font_size.
Fixes: 24d69384bcd3 ("VT: Add KD_FONT_OP_SET/GET_TALL operations")
Reported-by: syzbot+3af17071816b61e807ed@syzkaller.appspotmail.com
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
Link: https://lore.kernel.org/r/20230306094921.tik5ewne4ft6mfpo@begin
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The comment refers to mm/slob.c which is being removed. It comes from
commit ed56829cb319 ("ring_buffer: reset buffer page when freeing") and
according to Steven the borrowed code was a page mapcount and mapping
reset, which was later removed by commit e4c2ce82ca27 ("ring_buffer:
allocate buffer page pointer"). Thus the comment is not accurate anyway,
remove it.
Link: https://lore.kernel.org/linux-trace-kernel/20230315142446.27040-1-vbabka@suse.cz
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Reported-by: Mike Rapoport <mike.rapoport@gmail.com>
Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Fixes: e4c2ce82ca27 ("ring_buffer: allocate buffer page pointer")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
WebUSB code uses wLength directly without proper endianness conversion.
Update it to use already prepared temporary variable w_length instead.
Fixes: 93c473948c58 ("usb: gadget: add WebUSB landing page support")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-By: Jó Ágila Bitsch <jgilab@gmail.com>
Link: https://lore.kernel.org/r/20230313154522.52684-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cppcheck reports
drivers/thunderbolt/nhi.c:74:7: style: Local variable 'bit' shadows outer variable [shadowVariable]
int bit;
^
drivers/thunderbolt/nhi.c:66:6: note: Shadowed declaration
int bit = ring_interrupt_index(ring) & 31;
^
drivers/thunderbolt/nhi.c:74:7: note: Shadow variable
int bit;
^
For readablity rename the outer to interrupt_bit and the innner
to auto_clear_bit.
Fixes: 468c49f44759 ("thunderbolt: Disable interrupt auto clear for ring")
Cc: stable@vger.kernel.org
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Pull yet more xfs bug fixes from Darrick Wong:
"The first bugfix addresses a longstanding problem where we use the
wrong file mapping cursors when trying to compute the speculative
preallocation quantity. This has been causing sporadic crashes when
alwayscow mode is engaged.
The other two fixes correct minor problems in more recent changes.
- Fix the new allocator tracepoints because git am mismerged the
changes such that the trace_XXX got rebased to be in function YYY
instead of XXX
- Ensure that the perag AGFL_RESET state is consistent with whatever
we've just read off the disk
- Fix a bug where we used the wrong iext cursor during a write begin"
* tag 'xfs-6.3-fixes-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix mismerged tracepoints
xfs: clear incore AGFL_RESET state if it's not needed
xfs: pass the correct cursor to xfs_iomap_prealloc_size
Anna says:
> KASAN reports [...] a slab-out-of-bounds in gss_krb5_checksum(),
> and it can cause my client to panic when running cthon basic
> tests with krb5p.
> Running faddr2line gives me:
>
> gss_krb5_checksum+0x4b6/0x630:
> ahash_request_free at
> /home/anna/Programs/linux-nfs.git/./include/crypto/hash.h:619
> (inlined by) gss_krb5_checksum at
> /home/anna/Programs/linux-nfs.git/net/sunrpc/auth_gss/gss_krb5_crypto.c:358
My diagnosis is that the memcpy() at the end of gss_krb5_checksum()
reads past the end of the buffer containing the checksum data
because the callers have ignored gss_krb5_checksum()'s API contract:
* Caller provides the truncation length of the output token (h) in
* cksumout.len.
Instead they provide the fixed length of the hmac buffer. This
length happens to be larger than the value returned by
crypto_ahash_digestsize().
Change these errant callers to work like krb5_etm_{en,de}crypt().
As a defensive measure, bound the length of the byte copy at the
end of gss_krb5_checksum().
Kunit sez:
Testing complete. Ran 68 tests: passed: 68
Elapsed time: 81.680s total, 5.875s configuring, 75.610s building, 0.103s running
Reported-by: Anna Schumaker <schumaker.anna@gmail.com>
Fixes: 8270dbfcebea ("SUNRPC: Obscure Kerberos integrity keys")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Get rid of any prefix paths in @path before lookup_positive_unlocked()
as it will call ->lookup() which already adds those prefix paths
through build_path_from_dentry().
This has caused a performance regression when mounting shares with a
prefix path where readdir(2) would end up retrying several times to
open bad directory names that contained duplicate prefix paths.
Fix this by skipping any prefix paths in @path before calling
lookup_positive_unlocked().
Fixes: e4029e072673 ("cifs: find and use the dentry for cached non-root directories also")
Cc: stable@vger.kernel.org # 6.1+
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
The commit 97e3d26b5e5f ("x86/mm: Randomize per-cpu entry area") fixed
an omission of KASLR on CPU entry areas. It doesn't take into account
KASLR switches though, which may result in unintended non-determinism
when a user wants to avoid it (e.g. debugging, benchmarking).
Generate only a single combination of CPU entry areas offsets -- the
linear array that existed prior randomization when KASLR is turned off.
Since we have 3f148f331814 ("x86/kasan: Map shadow for percpu pages on
demand") and followups, we can use the more relaxed guard
kasrl_enabled() (in contrast to kaslr_memory_enabled()).
Fixes: 97e3d26b5e5f ("x86/mm: Randomize per-cpu entry area")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20230306193144.24605-1-mkoutny%40suse.com
Pull RAS fix from Borislav Petkov:
- Flush out logged errors immediately after MCA banks configuration
changes over sysfs have been done instead of waiting until something
else triggers the workqueue later - another error or the polling
interval cycle is reached
* tag 'ras_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Make sure logged MCEs are processed after sysfs update
Georgi writes:
interconnect fixes for v6.3-rc
This contains a bunch of fixes with the highlight being fixes for a race
condition that could sometimes occur during the interconnect provider
driver registration. There are also fixes for memory overallocation and
a memory leak.
- interconnect: qcom: osm-l3: fix icc_onecell_data allocation
- interconnect: qcom: sm8450: switch to qcom_icc_rpmh_* function
- interconnect: qcom: sm8550: switch to qcom_icc_rpmh_* function
- interconnect: qcom: qcm2290: Fix MASTER_SNOC_BIMC_NRT
- interconnect: fix mem leak when freeing nodes
- interconnect: fix icc_provider_del() error handling
- interconnect: fix provider registration API
- interconnect: imx: fix registration race
- interconnect: qcom: osm-l3: fix registration race
- interconnect: qcom: rpm: fix probe child-node error handling
- interconnect: qcom: rpm: fix registration race
- interconnect: qcom: rpmh: fix probe child-node error handling
- interconnect: qcom: rpmh: fix registration race
- interconnect: qcom: msm8974: fix registration race
- interconnect: exynos: fix node leak in probe PM QoS error path
- interconnect: exynos: fix registration race
- interconnect: exynos: drop redundant link destroy
- memory: tegra: fix interconnect registration race
- memory: tegra124-emc: fix interconnect registration race
- memory: tegra20-emc: fix interconnect registration race
- memory: tegra30-emc: fix interconnect registration race
Signed-off-by: Georgi Djakov <djakov@kernel.org>
* tag 'icc-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc: (21 commits)
memory: tegra30-emc: fix interconnect registration race
memory: tegra20-emc: fix interconnect registration race
memory: tegra124-emc: fix interconnect registration race
memory: tegra: fix interconnect registration race
interconnect: exynos: drop redundant link destroy
interconnect: exynos: fix registration race
interconnect: exynos: fix node leak in probe PM QoS error path
interconnect: qcom: msm8974: fix registration race
interconnect: qcom: rpmh: fix registration race
interconnect: qcom: rpmh: fix probe child-node error handling
interconnect: qcom: rpm: fix registration race
interconnect: qcom: rpm: fix probe child-node error handling
interconnect: qcom: osm-l3: fix registration race
interconnect: imx: fix registration race
interconnect: fix provider registration API
interconnect: fix icc_provider_del() error handling
interconnect: fix mem leak when freeing nodes
interconnect: qcom: qcm2290: Fix MASTER_SNOC_BIMC_NRT
interconnect: qcom: sm8550: switch to qcom_icc_rpmh_* function
interconnect: qcom: sm8450: switch to qcom_icc_rpmh_* function
...
Drop the bogus uart_write_wakeup() from when setting up a new DMA
transfer, which does not free up any more space in the ring buffer.
Any pending writers will be woken up when the transfer completes.
Cc: stable <stable@kernel.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Tested-by: Andrew Halaney <ahalaney@redhat.com> # sa8540p-ride
Link: https://lore.kernel.org/r/20230307164405.14218-5-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since the commit 36e2c7421f02 ("fs: don't allow splice read/write
without explicit ops") is applied to the kernel, splice() and
sendfile() calls on the trace file (/sys/kernel/debug/tracing
/trace) return EINVAL.
This patch restores these system calls by initializing splice_read
in file_operations of the trace file. This patch only enables such
functionalities for the read case.
Link: https://lore.kernel.org/linux-trace-kernel/20230314013707.28814-1-sfoon.kim@samsung.com
Cc: stable@vger.kernel.org
Fixes: 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops")
Signed-off-by: Sung-hun Kim <sfoon.kim@samsung.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Just like other JMicron JMS5xx enclosures, it chokes on report-opcodes,
let's avoid them.
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Cc: stable <stable@kernel.org>
Link: https://lore.kernel.org/r/20230312090745.47962-1-yaro330@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When interrupt auto clear is programmed, any read to the interrupt
status register will clear all interrupts. If two interrupts have
come in before one can be serviced then this will cause lost interrupts.
On AMD USB4 routers this has manifested in odd problems particularly
with long strings of control tranfers such as reading the DROM via bit
banging.
Instead of clearing interrupts automatically, clear the bit corresponding
to the given ring's interrupt in the ISR.
Fixes: 7a1808f82a37 ("thunderbolt: Handle ring interrupt by reading interrupt status register")
Cc: Sanju Mehta <Sanju.Mehta@amd.com>
Cc: stable@vger.kernel.org
Tested-by: Anson Tsao <anson.tsao@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Pull xfs percpu counter fixes from Darrick Wong:
"We discovered a filesystem summary counter corruption problem that was
traced to cpu hot-remove racing with the call to percpu_counter_sum
that sets the free block count in the superblock when writing it to
disk. The root cause is that percpu_counter_sum doesn't cull from
dying cpus and hence misses those counter values if the cpu shutdown
hooks have not yet run to merge the values.
I'm hoping this is a fairly painless fix to the problem, since the
dying cpu mask should generally be empty. It's been in for-next for a
week without any complaints from the bots.
- Fix a race in the percpu counters summation code where the
summation failed to add in the values for any CPUs that were dying
but not yet dead. This fixes some minor discrepancies and incorrect
assertions when running generic/650"
* tag 'xfs-6.3-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
pcpcntr: remove percpu_counter_sum_all()
fork: remove use of percpu_counter_sum_all
pcpcntrs: fix dying cpu summation race
cpumask: introduce for_each_cpu_or
At some point in between sending this patch to the list and merging it
into for-next, the tracepoints got all mixed up because I've
over-reliant on automated tools not sucking. The end result is that the
tracepoints are all wrong, so fix them.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
The splice read calls nfsd_splice_actor to put the pages containing file
data into the svc_rqst->rq_pages array. It's possible however to get a
splice result that only has a partial page at the end, if (e.g.) the
filesystem hands back a short read that doesn't cover the whole page.
nfsd_splice_actor will plop the partial page into its rq_pages array and
return. Then later, when nfsd_splice_actor is called again, the
remainder of the page may end up being filled out. At this point,
nfsd_splice_actor will put the page into the array _again_ corrupting
the reply. If this is done enough times, rq_next_page will overrun the
array and corrupt the trailing fields -- the rq_respages and
rq_next_page pointers themselves.
If we've already added the page to the array in the last pass, don't add
it to the array a second time when dealing with a splice continuation.
This was originally handled properly in nfsd_splice_actor, but commit
91e23b1c3982 ("NFSD: Clean up nfsd_splice_actor()") removed the check
for it.
Fixes: 91e23b1c3982 ("NFSD: Clean up nfsd_splice_actor()")
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Dario Lesca <d.lesca@solinos.it>
Tested-by: David Critch <dcritch@redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2150630
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Performance tests with large number of threads noted that the change
of the default closetimeo (deferred close timeout between when
close is done by application and when client has to send the close
to the server), to 5 seconds from 1 second, significantly degraded
perf in some cases like this (in the filebench example reported,
the stats show close requests on the wire taking twice as long,
and 50% regression in filebench perf). This is stil configurable
via mount parm closetimeo, but to be safe, decrease default back
to its previous value of 1 second.
Reported-by: Yin Fengwei <fengwei.yin@intel.com>
Reported-by: kernel test robot <yujie.liu@intel.com>
Link: https://lore.kernel.org/lkml/997614df-10d4-af53-9571-edec36b0e2f3@intel.com/
Fixes: 5efdd9122eff ("smb3: allow deferred close timeout to be configurable")
Cc: stable@vger.kernel.org # 6.0+
Tested-by: Yin Fengwei <fengwei.yin@intel.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
cmdline_find_option() may fail before doing any initialization of
the buffer array. This may lead to unpredictable results when the same
buffer is used later in calls to strncmp() function. Fix the issue by
returning early if cmdline_find_option() returns an error.
Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.
Fixes: aca20d546214 ("x86/mm: Add support to make use of Secure Memory Encryption")
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20230306160656.14844-1-n.zhandarovich@fintech.ru
Pull perf fixes from Borislav Petkov:
- Check whether sibling events have been deactivated before adding them
to groups
- Update the proper event time tracking variable depending on the event
type
- Fix a memory overwrite issue due to using the wrong function argument
when outputting perf events
* tag 'perf_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Fix check before add_event_to_groups() in perf_group_detach()
perf: fix perf_event_context->time
perf/core: Fix perf_output_begin parameter is incorrectly invoked in perf_event_bpf_output
A recent change introduced a flag to queue up errors found during
boot-time polling. These errors will be processed during late init once
the MCE subsystem is fully set up.
A number of sysfs updates call mce_restart() which goes through a subset
of the CPU init flow. This includes polling MCA banks and logging any
errors found. Since the same function is used as boot-time polling,
errors will be queued. However, the system is now past late init, so the
errors will remain queued until another error is found and the workqueue
is triggered.
Call mce_schedule_work() at the end of mce_restart() so that queued
errors are processed.
Fixes: 3bff147b187d ("x86/mce: Defer processing of early errors")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230301221420.2203184-1-yazen.ghannam@amd.com
Prior to commit 5d8e6e6c10a3 ("nvmem: core: add an index parameter to
the cell") of_nvmem_cell_get() would return -ENOENT if the cell wasn't
found. Particularly, if of_property_match_string() returned -EINVAL,
that return code was passed as the index to of_parse_phandle(), which
then detected it as invalid and returned NULL. That led to an return
code of -ENOENT.
With the new code, the negative index will lead to an -EINVAL of
of_parse_phandle_with_optional_args() which pass straight to the
caller and break those who expect an -ENOENT.
Fix it by always returning -ENOENT.
Fixes: 5d8e6e6c10a3 ("nvmem: core: add an index parameter to the cell")
Reported-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Link: https://lore.kernel.org/r/2143916.GUh0CODmnK@steina-w/
Signed-off-by: Michael Walle <michael@walle.cc>
Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20230310094845.139400-1-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current interconnect provider registration interface is inherently
racy as nodes are not added until the after adding the provider. This
can specifically cause racing DT lookups to fail.
Switch to using the new API where the provider is not registered until
after it has been fully initialised.
Fixes: d5ef16ba5fbe ("memory: tegra20: Support interconnect framework")
Cc: stable@vger.kernel.org # 5.11
Cc: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230306075651.2449-21-johan+linaro@kernel.org
Signed-off-by: Georgi Djakov <djakov@kernel.org>
Make sure that there is data in the ring buffer before trying to set up
a zero-length DMA transfer.
This specifically fixes the following warning when unmapping the empty
buffer on the sc8280xp-crd:
WARNING: CPU: 0 PID: 138 at drivers/iommu/dma-iommu.c:1046 iommu_dma_unmap_page+0xbc/0xd8
...
Call trace:
iommu_dma_unmap_page+0xbc/0xd8
dma_unmap_page_attrs+0x30/0x1c8
geni_se_tx_dma_unprep+0x28/0x38
qcom_geni_serial_isr+0x358/0x75c
Fixes: 2aaa43c70778 ("tty: serial: qcom-geni-serial: add support for serial engine DMA")
Cc: stable <stable@kernel.org>
Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Tested-by: Andrew Halaney <ahalaney@redhat.com> # sa8540p-ride
Link: https://lore.kernel.org/r/20230307164405.14218-4-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
smatch reports this warning
kernel/trace/ftrace.c:2594:19: warning:
symbol 'direct_ops' was not declared. Should it be static?
The variable direct_ops is only used in ftrace.c, so it should be static
Link: https://lore.kernel.org/linux-trace-kernel/20230311135113.711824-1-trix@redhat.com
Signed-off-by: Tom Rix <trix@redhat.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Patch changes CDNS_DEVICE_ID in USBSSP PCI Glue driver to remove
the conflict with Cadence USBSS driver.
cc: <stable@vger.kernel.org>
Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
Signed-off-by: Pawel Laszczak <pawell@cadence.com>
Link: https://lore.kernel.org/r/20230309063048.299378-1-pawell@cadence.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
`ring_interrupt_index` doesn't change the data for `ring` so mark it as
const. This is needed by the following patch that disables interrupt
auto clear for rings.
Cc: Sanju Mehta <Sanju.Mehta@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Pull xfs fixes from Darrick Wong:
"This batch started with some debugging enhancements to the new
allocator refactoring that we put in 6.3-rc1 to assist developers in
rebasing their dev branches.
As for more serious code changes -- there's a bug fix to make the
lockless allocator scan the whole filesystem before resorting to the
locking allocator. We're also adding a selftest for the venerable
directory/xattr hash function to make sure that it produces consistent
results so that we can address any fallout as soon as possible.
- Add a few debugging assertions so that people (me) trying to port
code to the new allocator functions don't mess up the caller
requirements
- Relax some overly cautious lock ordering enforcement in the new
allocator code, which means that file allocations will locklessly
scan for the best space they can get before backing off to the
traditional lock-and-really-get-it behavior
- Add tracepoints to make it easier to trace the xfs allocator
behavior
- Actually test the dir/xattr hash algorithm to make sure it produces
consistent results across all the platforms XFS supports"
* tag 'xfs-6.3-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: test dir/attr hash when loading module
xfs: add tracepoints for each of the externally visible allocators
xfs: walk all AGs if TRYLOCK passed to xfs_alloc_vextent_iterate_ags
xfs: try to idiot-proof the allocators
percpu_counter_sum_all() is now redundant as the race condition it
was invented to handle is now dealt with by percpu_counter_sum()
directly and all users of percpu_counter_sum_all() have been
removed.
Remove it.
This effectively reverts the changes made in f689054aace2
("percpu_counter: add percpu_counter_sum_all interface") except for
the cpumask iteration that fixes percpu_counter_sum() made earlier
in this series.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Prior to commit 7ac2ff8bb371, when we loaded the incore perag structure
with information from the AGF header, we would set or clear the
pagf_agfl_reset field based on whether or not the AGFL list was
misaligned within the block. IOWs, it's an incore state bit that's
supposed to cache something in the ondisk metadata. Therefore, the code
still needs to support clearing the incore bit if (somehow) the AGFL
were to correct itself.
It turns out that xfs_repair does exactly this -- phase 4 loads the AGF
to scan the rmapbt for corrupt records, which can set NEEDS_AGFL_RESET.
The scan unsets AGF_INIT but doesn't unset NEEDS_AGFL_RESET. Phase 5
totally rewrites the AGFL and fixes the alignment problem, didn't clear
NEEDS_AGFL_RESET historically, and reloads the perag state to fix the
freelist. This results in the AGFL being reset based on stale data,
which then causes the new AGFL blocks to be leaked. A subsequent
xfs_repair -n then complains about the leaks.
One could argue that phase 5 ought to clear this bit directly when it
reloads the perag AGF data after rewriting the AGFL, but libxfs used to
handle this for us, so it should go back to doing that.
Found by fuzzing flfirst = ones in xfs/352.
Fixes: 7ac2ff8bb371 ("xfs: perags need atomic operational state")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Geert reports that:
> On v6.2, "make ARCH=m68k defconfig" gives you
> CONFIG_RPCSEC_GSS_KRB5=m
> On v6.3, it became builtin, due to dropping the dependencies on
> the individual crypto modules.
>
> $ grep -E "CRYPTO_(MD5|DES|CBC|CTS|ECB|HMAC|SHA1|AES)" .config
> CONFIG_CRYPTO_AES=y
> CONFIG_CRYPTO_AES_TI=m
> CONFIG_CRYPTO_DES=m
> CONFIG_CRYPTO_CBC=m
> CONFIG_CRYPTO_CTS=m
> CONFIG_CRYPTO_ECB=m
> CONFIG_CRYPTO_HMAC=m
> CONFIG_CRYPTO_MD5=m
> CONFIG_CRYPTO_SHA1=m
This behavior is triggered by the "default y" in the definition of
RPCSEC_GSS.
The "default y" was added in 2010 by commit df486a25900f ("NFS: Fix
the selection of security flavours in Kconfig"). However,
svc_gss_principal was removed in 2012 by commit 03a4e1f6ddf2
("nfsd4: move principal name into svc_cred"), so the 2010 fix is
no longer necessary. We can safely change the NFS_V4 and NFSD_V4
dependencies back to RPCSEC_GSS_KRB5 to get the nicer v6.2
behavior back.
Selecting KRB5 symbolically represents the true requirement here:
that all spec-compliant NFSv4 implementations must have Kerberos
available to use.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: dfe9a123451a ("SUNRPC: Enable rpcsec_gss_krb5.ko to be built without CRYPTO_DES")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Make sure to unload_nls() @nls_codepage if we no longer need it.
Fixes: bc962159e8e3 ("cifs: avoid race conditions with parallel reconnects")
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
As a temporary storage, staged_config[] in rdt_domain should be cleared
before and after it is used. The stale value in staged_config[] could
cause an MSR access error.
Here is a reproducer on a system with 16 usable CLOSIDs for a 15-way L3
Cache (MBA should be disabled if the number of CLOSIDs for MB is less than
16.) :
mount -t resctrl resctrl -o cdp /sys/fs/resctrl
mkdir /sys/fs/resctrl/p{1..7}
umount /sys/fs/resctrl/
mount -t resctrl resctrl /sys/fs/resctrl
mkdir /sys/fs/resctrl/p{1..8}
An error occurs when creating resource group named p8:
unchecked MSR access error: WRMSR to 0xca0 (tried to write 0x00000000000007ff) at rIP: 0xffffffff82249142 (cat_wrmsr+0x32/0x60)
Call Trace:
<IRQ>
__flush_smp_call_function_queue+0x11d/0x170
__sysvec_call_function+0x24/0xd0
sysvec_call_function+0x89/0xc0
</IRQ>
<TASK>
asm_sysvec_call_function+0x16/0x20
When creating a new resource control group, hardware will be configured
by the following process:
rdtgroup_mkdir()
rdtgroup_mkdir_ctrl_mon()
rdtgroup_init_alloc()
resctrl_arch_update_domains()
resctrl_arch_update_domains() iterates and updates all resctrl_conf_type
whose have_new_ctrl is true. Since staged_config[] holds the same values as
when CDP was enabled, it will continue to update the CDP_CODE and CDP_DATA
configurations. When group p8 is created, get_config_index() called in
resctrl_arch_update_domains() will return 16 and 17 as the CLOSIDs for
CDP_CODE and CDP_DATA, which will be translated to an invalid register -
0xca0 in this scenario.
Fix it by clearing staged_config[] before and after it is used.
[reinette: re-order commit tags]
Fixes: 75408e43509e ("x86/resctrl: Allow different CODE/DATA configurations to be staged")
Suggested-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/2fad13f49fbe89687fc40e9a5a61f23a28d1507a.1673988935.git.reinette.chatre%40intel.com
Pull x86 fixes from Borislav Petkov:
"There's a little bit more 'movement' in there for my taste but it
needs to happen and should make the code better after it.
- Check cmdline_find_option()'s return value before further
processing
- Clear temporary storage in the resctrl code to prevent access to an
unexistent MSR
- Add a simple throttling mechanism to protect the hypervisor from
potentially malicious SEV guests issuing requests in rapid
succession.
In order to not jeopardize the sanity of everyone involved in
maintaining this code, the request issuing side has received a
cleanup, split in more or less trivial, small and digestible
pieces. Otherwise, the code was threatening to become an
unmaintainable mess.
Therefore, that cleanup is marked indirectly also for stable so
that there's no differences between the upstream code and the
stable variant when it comes down to backporting more there"
* tag 'x86_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix use of uninitialized buffer in sme_enable()
x86/resctrl: Clear staged_config[] before and after it is used
virt/coco/sev-guest: Add throttling awareness
virt/coco/sev-guest: Convert the sw_exit_info_2 checking to a switch-case
virt/coco/sev-guest: Do some code style cleanups
virt/coco/sev-guest: Carve out the request issuing logic into a helper
virt/coco/sev-guest: Remove the disable_vmpck label in handle_guest_request()
virt/coco/sev-guest: Simplify extended guest request handling
virt/coco/sev-guest: Check SEV_SNP attribute at probe time
Events should only be added to a groups rb tree if they have not been
removed from their context by list_del_event(). Since remove_on_exec
made it possible to call list_del_event() on individual events before
they are detached from their group, perf_group_detach() should check each
sibling's attach_state before calling add_event_to_groups() on it.
Fixes: 2e498d0a74e5 ("perf: Add support for event removal on exec")
Signed-off-by: Budimir Markovic <markovicbudimir@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/ZBFzvQV9tEqoHEtH@gentoo
Pull USB / Thunderbolt driver fixes from Greg KH:
"Here are a small set of USB and Thunderbolt driver fixes for reported
problems and a documentation update, for 6.3-rc4.
Included in here are:
- documentation update for uvc gadget driver
- small thunderbolt driver fixes
- cdns3 driver fixes
- dwc3 driver fixes
- dwc2 driver fixes
- chipidea driver fixes
- typec driver fixes
- onboard_usb_hub device id updates
- quirk updates
All of these have been in linux-next with no reported problems"
* tag 'usb-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (30 commits)
usb: dwc2: fix a race, don't power off/on phy for dual-role mode
usb: dwc2: fix a devres leak in hw_enable upon suspend resume
usb: chipidea: core: fix possible concurrent when switch role
usb: chipdea: core: fix return -EINVAL if request role is the same with current role
thunderbolt: Rename shadowed variables bit to interrupt_bit and auto_clear_bit
thunderbolt: Disable interrupt auto clear for rings
thunderbolt: Use const qualifier for `ring_interrupt_index`
usb: gadget: Use correct endianness of the wLength field for WebUSB
uas: Add US_FL_NO_REPORT_OPCODES for JMicron JMS583Gen 2
usb: cdnsp: changes PCI Device ID to fix conflict with CNDS3 driver
usb: cdns3: Fix issue with using incorrect PCI device function
usb: cdnsp: Fixes issue with redundant Status Stage
MAINTAINERS: make me a reviewer of USB/IP
thunderbolt: Use scale field when allocating USB3 bandwidth
thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host routers
thunderbolt: Call tb_check_quirks() after initializing adapters
thunderbolt: Add missing UNSET_INBOUND_SBTX for retimer access
thunderbolt: Fix memory leak in margining
usb: dwc2: drd: fix inconsistent mode if role-switch-default-mode="host"
docs: usb: Add documentation for the UVC Gadget
...
When in dual role mode (dr_mode == USB_DR_MODE_OTG), platform probe
successively basically calls:
- dwc2_gadget_init()
- dwc2_hcd_init()
- dwc2_lowlevel_hw_disable() since recent change [1]
- usb_add_gadget_udc()
The PHYs (and so the clocks it may provide) shouldn't be disabled for all
SoCs, in OTG mode, as the HCD part has been initialized.
On STM32 this creates some weird race condition upon boot, when:
- initially attached as a device, to a HOST
- and there is a gadget script invoked to setup the device part.
Below issue becomes systematic, as long as the gadget script isn't
started by userland: the hardware PHYs (and so the clocks provided by the
PHYs) remains disabled.
It ends up in having an endless interrupt storm, before the watchdog
resets the platform.
[ 16.924163] dwc2 49000000.usb-otg: EPs: 9, dedicated fifos, 952 entries in SPRAM
[ 16.962704] dwc2 49000000.usb-otg: DWC OTG Controller
[ 16.966488] dwc2 49000000.usb-otg: new USB bus registered, assigned bus number 2
[ 16.974051] dwc2 49000000.usb-otg: irq 77, io mem 0x49000000
[ 17.032170] hub 2-0:1.0: USB hub found
[ 17.042299] hub 2-0:1.0: 1 port detected
[ 17.175408] dwc2 49000000.usb-otg: Mode Mismatch Interrupt: currently in Host mode
[ 17.181741] dwc2 49000000.usb-otg: Mode Mismatch Interrupt: currently in Host mode
[ 17.189303] dwc2 49000000.usb-otg: Mode Mismatch Interrupt: currently in Host mode
...
The host part is also not functional, until the gadget part is configured.
The HW may only be disabled for peripheral mode (original init), e.g.
dr_mode == USB_DR_MODE_PERIPHERAL, until the gadget driver initializes.
But when in USB_DR_MODE_OTG, the HW should remain enabled, as the HCD part
is able to run, while the gadget part isn't necessarily configured.
I don't fully get the of purpose the original change, that claims disabling
the hardware is missing. It creates conditions on SOCs using the PHY
initialization to be completely non working in OTG mode. Original
change [1] should be reworked to be platform specific.
[1] https://lore.kernel.org/r/20221206-dwc2-gadget-dual-role-v1-2-36515e1092cd@theobroma-systems.com
Fixes: ade23d7b7ec5 ("usb: dwc2: power on/off phy for peripheral mode in dual-role mode")
Cc: stable <stable@kernel.org>
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Reviewed-by: Quentin Schulz <quentin.schulz@theobroma-systems.com>
Tested-by: Quentin Schulz <quentin.schulz@theobroma-systems.com>
Link: https://lore.kernel.org/r/20230315144433.3095859-1-fabrice.gasnier@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 829c1651e9c4 ("sched/fair: sanitize vruntime of entity being placed")
fixes an overflowing bug, but ignore a case that se->exec_start is reset
after a migration.
For fixing this case, we delay the reset of se->exec_start after
placing the entity which se->exec_start to detect long sleeping task.
In order to take into account a possible divergence between the clock_task
of 2 rqs, we increase the threshold to around 104 days.
Fixes: 829c1651e9c4 ("sched/fair: sanitize vruntime of entity being placed")
Originally-by: Zhang Qiao <zhangqiao22@huawei.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Zhang Qiao <zhangqiao22@huawei.com>
Link: https://lore.kernel.org/r/20230317160810.107988-1-vincent.guittot@linaro.org
Each time the platform goes to low power, PM suspend / resume routines
call: __dwc2_lowlevel_hw_enable -> devm_add_action_or_reset().
This adds a new devres each time.
This may also happen at runtime, as dwc2_lowlevel_hw_enable() can be
called from udc_start().
This can be seen with tracing:
- echo 1 > /sys/kernel/debug/tracing/events/dev/devres_log/enable
- go to low power
- cat /sys/kernel/debug/tracing/trace
A new "ADD" entry is found upon each low power cycle:
... devres_log: 49000000.usb-otg ADD 82a13bba devm_action_release (8 bytes)
... devres_log: 49000000.usb-otg ADD 49889daf devm_action_release (8 bytes)
...
A second issue is addressed here:
- regulator_bulk_enable() is called upon each PM cycle (suspend/resume).
- regulator_bulk_disable() never gets called.
So the reference count for these regulators constantly increase, by one
upon each low power cycle, due to missing regulator_bulk_disable() call
in __dwc2_lowlevel_hw_disable().
The original fix that introduced the devm_add_action_or_reset() call,
fixed an issue during probe, that happens due to other errors in
dwc2_driver_probe() -> dwc2_core_reset(). Then the probe fails without
disabling regulators, when dr_mode == USB_DR_MODE_PERIPHERAL.
Rather fix the error path: disable all the low level hardware in the
error path, by using the "hsotg->ll_hw_enabled" flag. Checking dr_mode
has been introduced to avoid a dual call to dwc2_lowlevel_hw_disable().
"ll_hw_enabled" should achieve the same (and is used currently in the
remove() routine).
Fixes: 54c196060510 ("usb: dwc2: Always disable regulators on driver teardown")
Fixes: 33a06f1300a7 ("usb: dwc2: Fix error path in gadget registration")
Cc: stable <stable@kernel.org>
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Link: https://lore.kernel.org/r/20230316084127.126084-1-fabrice.gasnier@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull core fixes from Borislav Petkov:
- Do the delayed RCU wakeup for kthreads in the proper order so that
former doesn't get ignored
- A noinstr warning fix
* tag 'core_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
entry/rcu: Check TIF_RESCHED _after_ delayed RCU wake-up
entry: Fix noinstr warning in __enter_from_user_mode()
The variable 'status' (which contains the unhandled overflow bits) is
not being properly masked in some cases, displaying the following
warning:
WARNING: CPU: 156 PID: 475601 at arch/x86/events/amd/core.c:972 amd_pmu_v2_handle_irq+0x216/0x270
This seems to be happening because the loop is being continued before
the status bit being unset, in case x86_perf_event_set_period()
returns 0. This is also causing an inconsistency because the "handled"
counter is incremented, but the status bit is not cleaned.
Move the bit cleaning together above, together when the "handled"
counter is incremented.
Fixes: 7685665c390d ("perf/x86/amd/core: Add PerfMonV2 overflow handling")
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230321113338.1669660-1-leitao@debian.org
The user may call role_store() when driver is handling
ci_handle_id_switch() which is triggerred by otg event or power lost
event. Unfortunately, the controller may go into chaos in this case.
Fix this by protecting it with mutex lock.
Fixes: a932a8041ff9 ("usb: chipidea: core: add sysfs group")
cc: <stable@vger.kernel.org>
Acked-by: Peter Chen <peter.chen@kernel.org>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20230317061516.2451728-2-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull x86 fixes from Borislav Petkov:
- Add a AMX ptrace self test
- Prevent a false-positive warning when retrieving the (invalid)
address of dynamic FPU features in their init state which are not
saved in init_fpstate at all
- Randomize per-CPU entry areas only when KASLR is enabled
* tag 'x86_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests/x86/amx: Add a ptrace test
x86/fpu/xstate: Prevent false-positive warning in __copy_xstate_uabi_buf()
x86/mm: Do not shuffle CPU entry areas without KASLR
RCU sometimes needs to perform a delayed wake up for specific kthreads
handling offloaded callbacks (RCU_NOCB). These wakeups are performed
by timers and upon entry to idle (also to guest and to user on nohz_full).
However the delayed wake-up on kernel exit is actually performed after
the thread flags are fetched towards the fast path check for work to
do on exit to user. As a result, and if there is no other pending work
to do upon that kernel exit, the current task will resume to userspace
with TIF_RESCHED set and the pending wake up ignored.
Fix this with fetching the thread flags _after_ the delayed RCU-nocb
kthread wake-up.
Fixes: 47b8ff194c1f ("entry: Explicitly flush pending rcuog wakeup before last rescheduling point")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230315194349.10798-3-joel@joelfernandes.org
Pull tracing fixes from Steven Rostedt:
- Fix setting affinity of hwlat threads in containers
Using sched_set_affinity() has unwanted side effects when being
called within a container. Use set_cpus_allowed_ptr() instead
- Fix per cpu thread management of the hwlat tracer:
- Do not start per_cpu threads if one is already running for the CPU
- When starting per_cpu threads, do not clear the kthread variable
as it may already be set to running per cpu threads
- Fix return value for test_gen_kprobe_cmd()
On error the return value was overwritten by being set to the result
of the call from kprobe_event_delete(), which would likely succeed,
and thus have the function return success
- Fix splice() reads from the trace file that was broken by commit
36e2c7421f02 ("fs: don't allow splice read/write without explicit
ops")
- Remove obsolete and confusing comment in ring_buffer.c
The original design of the ring buffer used struct page flags for
tricks to optimize, which was shortly removed due to them being
tricks. But a comment for those tricks remained
- Set local functions and variables to static
* tag 'trace-v6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/hwlat: Replace sched_setaffinity with set_cpus_allowed_ptr
ring-buffer: remove obsolete comment for free_buffer_page()
tracing: Make splice_read available again
ftrace: Set direct_ops storage-class-specifier to static
trace/hwlat: Do not start per-cpu thread if it is already running
trace/hwlat: Do not wipe the contents of per-cpu thread data
tracing/osnoise: set several trace_osnoise.c variables storage-class-specifier to static
tracing: Fix wrong return in kprobe_event_gen_test.c
It should not return -EINVAL if the request role is the same with current
role, return non-error and without do anything instead.
Fixes: a932a8041ff9 ("usb: chipidea: core: add sysfs group")
cc: <stable@vger.kernel.org>
Acked-by: Peter Chen <peter.chen@kernel.org>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20230317061516.2451728-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull cifs client fixes from Steve French:
"Twelve cifs/smb3 client fixes (most also for stable)
- forced umount fix
- fix for two perf regressions
- reconnect fixes
- small debugging improvements
- multichannel fixes"
* tag 'smb3-client-fixes-6.3-rc3' of git://git.samba.org/sfrench/cifs-2.6:
smb3: fix unusable share after force unmount failure
cifs: fix dentry lookups in directory handle cache
smb3: lower default deferred close timeout to address perf regression
cifs: fix missing unload_nls() in smb2_reconnect()
cifs: avoid race conditions with parallel reconnects
cifs: append path to open_enter trace event
cifs: print session id while listing open files
cifs: dump pending mids for all channels in DebugData
cifs: empty interface list when server doesn't support query interfaces
cifs: do not poll server interfaces too regularly
cifs: lock chan_lock outside match_session
cifs: check only tcon status on tcon related functions
Include a test case to validate the XTILEDATA injection to the target.
Also, it ensures the kernel's ability to copy states between different
XSAVE formats.
Refactor the memcmp() code to be usable for the state validation.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20230227210504.18520-3-chang.seok.bae%40intel.com
__enter_from_user_mode() is triggering noinstr warnings with
CONFIG_DEBUG_PREEMPT due to its call of preempt_count_add() via
ct_state().
The preemption disable isn't needed as interrupts are already disabled.
And the context_tracking_enabled() check in ct_state() also isn't needed
as that's already being done by the CT_WARN_ON().
Just use __ct_state() instead.
Fixes the following warnings:
vmlinux.o: warning: objtool: enter_from_user_mode+0xba: call to preempt_count_add() leaves .noinstr.text section
vmlinux.o: warning: objtool: syscall_enter_from_user_mode+0xf9: call to preempt_count_add() leaves .noinstr.text section
vmlinux.o: warning: objtool: syscall_enter_from_user_mode_prepare+0xc7: call to preempt_count_add() leaves .noinstr.text section
vmlinux.o: warning: objtool: irqentry_enter_from_user_mode+0xba: call to preempt_count_add() leaves .noinstr.text section
Fixes: 171476775d32 ("context_tracking: Convert state to atomic_t")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/d8955fa6d68dc955dda19baf13ae014ae27926f5.1677369694.git.jpoimboe@kernel.org
Pull tty/serial driver fixes from Greg KH:
"Here are some small tty and serial driver fixes for 6.3-rc3 to resolve
some reported issues.
They include:
- 8250 driver Kconfig issue pointed out by you that showed up in -rc1
- qcom-geni serial driver fixes
- various 8250 driver fixes for reported problems
- fsl_lpuart driver fixes
- serdev fix for regression in -rc1
- vt.c bugfix
All have been in linux-next for over a week with no reported problems"
* tag 'tty-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: vt: protect KD_FONT_OP_GET_TALL from unbound access
serial: qcom-geni: drop bogus uart_write_wakeup()
serial: qcom-geni: fix mapping of empty DMA buffer
serial: qcom-geni: fix DMA mapping leak on shutdown
serial: qcom-geni: fix console shutdown hang
serdev: Set fwnode for serdev devices
tty: serial: fsl_lpuart: fix race on RX DMA shutdown
serial: 8250_pci1xxxx: Disable SERIAL_8250_PCI1XXXX config by default
serial: 8250_fsl: fix handle_irq locking
serial: 8250_em: Fix UART port type
serial: 8250: ASPEED_VUART: select REGMAP instead of depending on it
tty: serial: fsl_lpuart: skip waiting for transmission complete when UARTCTRL_SBK is asserted
Revert "tty: serial: fsl_lpuart: adjust SERIAL_FSL_LPUART_CONSOLE config dependency"
There is a problem with the behavior of hwlat in a container,
resulting in incorrect output. A warning message is generated:
"cpumask changed while in round-robin mode, switching to mode none",
and the tracing_cpumask is ignored. This issue arises because
the kernel thread, hwlatd, is not a part of the container, and
the function sched_setaffinity is unable to locate it using its PID.
Additionally, the task_struct of hwlatd is already known.
Ultimately, the function set_cpus_allowed_ptr achieves
the same outcome as sched_setaffinity, but employs task_struct
instead of PID.
Test case:
# cd /sys/kernel/tracing
# echo 0 > tracing_on
# echo round-robin > hwlat_detector/mode
# echo hwlat > current_tracer
# unshare --fork --pid bash -c 'echo 1 > tracing_on'
# dmesg -c
Actual behavior:
[573502.809060] hwlat_detector: cpumask changed while in round-robin mode, switching to mode none
Link: https://lore.kernel.org/linux-trace-kernel/20230316144535.1004952-1-costa.shul@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: 0330f7aa8ee63 ("tracing: Have hwlat trace migrate across tracing_cpumask CPUs")
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Mika writes:
thunderbolt: Fixes for v6.3-rc4
This includes following fixes and quirks for v6.3-rc:
- Quirk to disable CL-states on AMD USB4 host routers
- Fix memory leak in lane margining
- Correct the retimer access flows
- Quirk to limit USB3 bandwidth on certain Intel USB4 host routers
- Fix usage of scale field when allocting USB3 bandwidth
- Fix interrupt "auto clear" on non-Intel USB4 host routers.
There are also two commits that are not fixes themselves but are needed
for the USB3 bandwidth quirk and for the interrupt auto clear fix to
work.
All these have been in linux-next with no reported issues.
* tag 'thunderbolt-for-v6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt:
thunderbolt: Rename shadowed variables bit to interrupt_bit and auto_clear_bit
thunderbolt: Disable interrupt auto clear for rings
thunderbolt: Use const qualifier for `ring_interrupt_index`
thunderbolt: Use scale field when allocating USB3 bandwidth
thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host routers
thunderbolt: Call tb_check_quirks() after initializing adapters
thunderbolt: Add missing UNSET_INBOUND_SBTX for retimer access
thunderbolt: Fix memory leak in margining
thunderbolt: Add quirk to disable CLx
If user does forced unmount ("umount -f") while files are still open
on the share (as was seen in a Kubernetes example running on SMB3.1.1
mount) then we were marking the share as "TID_EXITING" in umount_begin()
which caused all subsequent operations (except write) to fail ... but
unfortunately when umount_begin() is called we do not know yet that
there are open files or active references on the share that would prevent
unmount from succeeding. Kubernetes had example when they were doing
umount -f when files were open which caused the share to become
unusable until the files were closed (and the umount retried).
Fix this so that TID_EXITING is not set until we are about to send
the tree disconnect (not at the beginning of forced umounts in
umount_begin) so that if "umount -f" fails (due to open files or
references) the mount is still usable.
Cc: stable@vger.kernel.org
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
__copy_xstate_to_uabi_buf() copies either from the tasks XSAVE buffer
or from init_fpstate into the ptrace buffer. Dynamic features, like
XTILEDATA, have an all zeroes init state and are not saved in
init_fpstate, which means the corresponding bit is not set in the
xfeatures bitmap of the init_fpstate header.
But __copy_xstate_to_uabi_buf() retrieves addresses for both the tasks
xstate and init_fpstate unconditionally via __raw_xsave_addr().
So if the tasks XSAVE buffer has a dynamic feature set, then the
address retrieval for init_fpstate triggers the warning in
__raw_xsave_addr() which checks the feature bit in the init_fpstate
header.
Remove the address retrieval from init_fpstate for extended features.
They have an all zeroes init state so init_fpstate has zeros for them.
Then zeroing the user buffer for the init state is the same as copying
them from init_fpstate.
Fixes: 2308ee57d93d ("x86/fpu/amx: Enable the AMX feature in 64-bit mode")
Reported-by: Mingwei Zhang <mizhang@google.com>
Link: https://lore.kernel.org/kvm/20230221163655.920289-2-mizhang@google.com/
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Mingwei Zhang <mizhang@google.com>
Link: https://lore.kernel.org/all/20230227210504.18520-2-chang.seok.bae%40intel.com
Cc: stable@vger.kernel.org
Pull char/misc driver fixes from Greg KH:
"Here are a few small char/misc/other driver subsystem patches to
resolve reported problems for 6.3-rc3.
Included in here are:
- Interconnect driver fixes for reported problems
- Memory driver fixes for reported problems
- nvmem core fix
- firmware driver fix for reported problem
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (23 commits)
memory: tegra30-emc: fix interconnect registration race
memory: tegra20-emc: fix interconnect registration race
memory: tegra124-emc: fix interconnect registration race
memory: tegra: fix interconnect registration race
interconnect: exynos: drop redundant link destroy
interconnect: exynos: fix registration race
interconnect: exynos: fix node leak in probe PM QoS error path
interconnect: qcom: msm8974: fix registration race
interconnect: qcom: rpmh: fix registration race
interconnect: qcom: rpmh: fix probe child-node error handling
interconnect: qcom: rpm: fix registration race
nvmem: core: return -ENOENT if nvmem cell is not found
firmware: xilinx: don't make a sleepable memory allocation from an atomic context
interconnect: qcom: rpm: fix probe child-node error handling
interconnect: qcom: osm-l3: fix registration race
interconnect: imx: fix registration race
interconnect: fix provider registration API
interconnect: fix icc_provider_del() error handling
interconnect: fix mem leak when freeing nodes
interconnect: qcom: qcm2290: Fix MASTER_SNOC_BIMC_NRT
...
In ioctl(KD_FONT_OP_GET_TALL), userland tells through op->height which
vpitch should be used to copy over the font. In con_font_get, we were
not checking that it is within the maximum height value, and thus
userland could make the vc->vc_sw->con_font_get(vc, &font, vpitch);
call possibly overflow the allocated max_font_size bytes, and the
copy_to_user(op->data, font.data, c) call possibly read out of that
allocated buffer.
By checking vpitch against max_font_height, the max_font_size buffer
will always be large enough for the vc->vc_sw->con_font_get(vc, &font,
vpitch) call (since we already prevent loading a font larger than that),
and c = (font.width+7)/8 * vpitch * font.charcount will always remain
below max_font_size.
Fixes: 24d69384bcd3 ("VT: Add KD_FONT_OP_SET/GET_TALL operations")
Reported-by: syzbot+3af17071816b61e807ed@syzkaller.appspotmail.com
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
Link: https://lore.kernel.org/r/20230306094921.tik5ewne4ft6mfpo@begin
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The comment refers to mm/slob.c which is being removed. It comes from
commit ed56829cb319 ("ring_buffer: reset buffer page when freeing") and
according to Steven the borrowed code was a page mapcount and mapping
reset, which was later removed by commit e4c2ce82ca27 ("ring_buffer:
allocate buffer page pointer"). Thus the comment is not accurate anyway,
remove it.
Link: https://lore.kernel.org/linux-trace-kernel/20230315142446.27040-1-vbabka@suse.cz
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Reported-by: Mike Rapoport <mike.rapoport@gmail.com>
Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Fixes: e4c2ce82ca27 ("ring_buffer: allocate buffer page pointer")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
WebUSB code uses wLength directly without proper endianness conversion.
Update it to use already prepared temporary variable w_length instead.
Fixes: 93c473948c58 ("usb: gadget: add WebUSB landing page support")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-By: Jó Ágila Bitsch <jgilab@gmail.com>
Link: https://lore.kernel.org/r/20230313154522.52684-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cppcheck reports
drivers/thunderbolt/nhi.c:74:7: style: Local variable 'bit' shadows outer variable [shadowVariable]
int bit;
^
drivers/thunderbolt/nhi.c:66:6: note: Shadowed declaration
int bit = ring_interrupt_index(ring) & 31;
^
drivers/thunderbolt/nhi.c:74:7: note: Shadow variable
int bit;
^
For readablity rename the outer to interrupt_bit and the innner
to auto_clear_bit.
Fixes: 468c49f44759 ("thunderbolt: Disable interrupt auto clear for ring")
Cc: stable@vger.kernel.org
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Pull yet more xfs bug fixes from Darrick Wong:
"The first bugfix addresses a longstanding problem where we use the
wrong file mapping cursors when trying to compute the speculative
preallocation quantity. This has been causing sporadic crashes when
alwayscow mode is engaged.
The other two fixes correct minor problems in more recent changes.
- Fix the new allocator tracepoints because git am mismerged the
changes such that the trace_XXX got rebased to be in function YYY
instead of XXX
- Ensure that the perag AGFL_RESET state is consistent with whatever
we've just read off the disk
- Fix a bug where we used the wrong iext cursor during a write begin"
* tag 'xfs-6.3-fixes-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix mismerged tracepoints
xfs: clear incore AGFL_RESET state if it's not needed
xfs: pass the correct cursor to xfs_iomap_prealloc_size
Anna says:
> KASAN reports [...] a slab-out-of-bounds in gss_krb5_checksum(),
> and it can cause my client to panic when running cthon basic
> tests with krb5p.
> Running faddr2line gives me:
>
> gss_krb5_checksum+0x4b6/0x630:
> ahash_request_free at
> /home/anna/Programs/linux-nfs.git/./include/crypto/hash.h:619
> (inlined by) gss_krb5_checksum at
> /home/anna/Programs/linux-nfs.git/net/sunrpc/auth_gss/gss_krb5_crypto.c:358
My diagnosis is that the memcpy() at the end of gss_krb5_checksum()
reads past the end of the buffer containing the checksum data
because the callers have ignored gss_krb5_checksum()'s API contract:
* Caller provides the truncation length of the output token (h) in
* cksumout.len.
Instead they provide the fixed length of the hmac buffer. This
length happens to be larger than the value returned by
crypto_ahash_digestsize().
Change these errant callers to work like krb5_etm_{en,de}crypt().
As a defensive measure, bound the length of the byte copy at the
end of gss_krb5_checksum().
Kunit sez:
Testing complete. Ran 68 tests: passed: 68
Elapsed time: 81.680s total, 5.875s configuring, 75.610s building, 0.103s running
Reported-by: Anna Schumaker <schumaker.anna@gmail.com>
Fixes: 8270dbfcebea ("SUNRPC: Obscure Kerberos integrity keys")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Get rid of any prefix paths in @path before lookup_positive_unlocked()
as it will call ->lookup() which already adds those prefix paths
through build_path_from_dentry().
This has caused a performance regression when mounting shares with a
prefix path where readdir(2) would end up retrying several times to
open bad directory names that contained duplicate prefix paths.
Fix this by skipping any prefix paths in @path before calling
lookup_positive_unlocked().
Fixes: e4029e072673 ("cifs: find and use the dentry for cached non-root directories also")
Cc: stable@vger.kernel.org # 6.1+
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
The commit 97e3d26b5e5f ("x86/mm: Randomize per-cpu entry area") fixed
an omission of KASLR on CPU entry areas. It doesn't take into account
KASLR switches though, which may result in unintended non-determinism
when a user wants to avoid it (e.g. debugging, benchmarking).
Generate only a single combination of CPU entry areas offsets -- the
linear array that existed prior randomization when KASLR is turned off.
Since we have 3f148f331814 ("x86/kasan: Map shadow for percpu pages on
demand") and followups, we can use the more relaxed guard
kasrl_enabled() (in contrast to kaslr_memory_enabled()).
Fixes: 97e3d26b5e5f ("x86/mm: Randomize per-cpu entry area")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20230306193144.24605-1-mkoutny%40suse.com
Pull RAS fix from Borislav Petkov:
- Flush out logged errors immediately after MCA banks configuration
changes over sysfs have been done instead of waiting until something
else triggers the workqueue later - another error or the polling
interval cycle is reached
* tag 'ras_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Make sure logged MCEs are processed after sysfs update
Georgi writes:
interconnect fixes for v6.3-rc
This contains a bunch of fixes with the highlight being fixes for a race
condition that could sometimes occur during the interconnect provider
driver registration. There are also fixes for memory overallocation and
a memory leak.
- interconnect: qcom: osm-l3: fix icc_onecell_data allocation
- interconnect: qcom: sm8450: switch to qcom_icc_rpmh_* function
- interconnect: qcom: sm8550: switch to qcom_icc_rpmh_* function
- interconnect: qcom: qcm2290: Fix MASTER_SNOC_BIMC_NRT
- interconnect: fix mem leak when freeing nodes
- interconnect: fix icc_provider_del() error handling
- interconnect: fix provider registration API
- interconnect: imx: fix registration race
- interconnect: qcom: osm-l3: fix registration race
- interconnect: qcom: rpm: fix probe child-node error handling
- interconnect: qcom: rpm: fix registration race
- interconnect: qcom: rpmh: fix probe child-node error handling
- interconnect: qcom: rpmh: fix registration race
- interconnect: qcom: msm8974: fix registration race
- interconnect: exynos: fix node leak in probe PM QoS error path
- interconnect: exynos: fix registration race
- interconnect: exynos: drop redundant link destroy
- memory: tegra: fix interconnect registration race
- memory: tegra124-emc: fix interconnect registration race
- memory: tegra20-emc: fix interconnect registration race
- memory: tegra30-emc: fix interconnect registration race
Signed-off-by: Georgi Djakov <djakov@kernel.org>
* tag 'icc-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc: (21 commits)
memory: tegra30-emc: fix interconnect registration race
memory: tegra20-emc: fix interconnect registration race
memory: tegra124-emc: fix interconnect registration race
memory: tegra: fix interconnect registration race
interconnect: exynos: drop redundant link destroy
interconnect: exynos: fix registration race
interconnect: exynos: fix node leak in probe PM QoS error path
interconnect: qcom: msm8974: fix registration race
interconnect: qcom: rpmh: fix registration race
interconnect: qcom: rpmh: fix probe child-node error handling
interconnect: qcom: rpm: fix registration race
interconnect: qcom: rpm: fix probe child-node error handling
interconnect: qcom: osm-l3: fix registration race
interconnect: imx: fix registration race
interconnect: fix provider registration API
interconnect: fix icc_provider_del() error handling
interconnect: fix mem leak when freeing nodes
interconnect: qcom: qcm2290: Fix MASTER_SNOC_BIMC_NRT
interconnect: qcom: sm8550: switch to qcom_icc_rpmh_* function
interconnect: qcom: sm8450: switch to qcom_icc_rpmh_* function
...
Drop the bogus uart_write_wakeup() from when setting up a new DMA
transfer, which does not free up any more space in the ring buffer.
Any pending writers will be woken up when the transfer completes.
Cc: stable <stable@kernel.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Tested-by: Andrew Halaney <ahalaney@redhat.com> # sa8540p-ride
Link: https://lore.kernel.org/r/20230307164405.14218-5-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since the commit 36e2c7421f02 ("fs: don't allow splice read/write
without explicit ops") is applied to the kernel, splice() and
sendfile() calls on the trace file (/sys/kernel/debug/tracing
/trace) return EINVAL.
This patch restores these system calls by initializing splice_read
in file_operations of the trace file. This patch only enables such
functionalities for the read case.
Link: https://lore.kernel.org/linux-trace-kernel/20230314013707.28814-1-sfoon.kim@samsung.com
Cc: stable@vger.kernel.org
Fixes: 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops")
Signed-off-by: Sung-hun Kim <sfoon.kim@samsung.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Just like other JMicron JMS5xx enclosures, it chokes on report-opcodes,
let's avoid them.
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Cc: stable <stable@kernel.org>
Link: https://lore.kernel.org/r/20230312090745.47962-1-yaro330@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When interrupt auto clear is programmed, any read to the interrupt
status register will clear all interrupts. If two interrupts have
come in before one can be serviced then this will cause lost interrupts.
On AMD USB4 routers this has manifested in odd problems particularly
with long strings of control tranfers such as reading the DROM via bit
banging.
Instead of clearing interrupts automatically, clear the bit corresponding
to the given ring's interrupt in the ISR.
Fixes: 7a1808f82a37 ("thunderbolt: Handle ring interrupt by reading interrupt status register")
Cc: Sanju Mehta <Sanju.Mehta@amd.com>
Cc: stable@vger.kernel.org
Tested-by: Anson Tsao <anson.tsao@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Pull xfs percpu counter fixes from Darrick Wong:
"We discovered a filesystem summary counter corruption problem that was
traced to cpu hot-remove racing with the call to percpu_counter_sum
that sets the free block count in the superblock when writing it to
disk. The root cause is that percpu_counter_sum doesn't cull from
dying cpus and hence misses those counter values if the cpu shutdown
hooks have not yet run to merge the values.
I'm hoping this is a fairly painless fix to the problem, since the
dying cpu mask should generally be empty. It's been in for-next for a
week without any complaints from the bots.
- Fix a race in the percpu counters summation code where the
summation failed to add in the values for any CPUs that were dying
but not yet dead. This fixes some minor discrepancies and incorrect
assertions when running generic/650"
* tag 'xfs-6.3-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
pcpcntr: remove percpu_counter_sum_all()
fork: remove use of percpu_counter_sum_all
pcpcntrs: fix dying cpu summation race
cpumask: introduce for_each_cpu_or
The splice read calls nfsd_splice_actor to put the pages containing file
data into the svc_rqst->rq_pages array. It's possible however to get a
splice result that only has a partial page at the end, if (e.g.) the
filesystem hands back a short read that doesn't cover the whole page.
nfsd_splice_actor will plop the partial page into its rq_pages array and
return. Then later, when nfsd_splice_actor is called again, the
remainder of the page may end up being filled out. At this point,
nfsd_splice_actor will put the page into the array _again_ corrupting
the reply. If this is done enough times, rq_next_page will overrun the
array and corrupt the trailing fields -- the rq_respages and
rq_next_page pointers themselves.
If we've already added the page to the array in the last pass, don't add
it to the array a second time when dealing with a splice continuation.
This was originally handled properly in nfsd_splice_actor, but commit
91e23b1c3982 ("NFSD: Clean up nfsd_splice_actor()") removed the check
for it.
Fixes: 91e23b1c3982 ("NFSD: Clean up nfsd_splice_actor()")
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Dario Lesca <d.lesca@solinos.it>
Tested-by: David Critch <dcritch@redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2150630
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Performance tests with large number of threads noted that the change
of the default closetimeo (deferred close timeout between when
close is done by application and when client has to send the close
to the server), to 5 seconds from 1 second, significantly degraded
perf in some cases like this (in the filebench example reported,
the stats show close requests on the wire taking twice as long,
and 50% regression in filebench perf). This is stil configurable
via mount parm closetimeo, but to be safe, decrease default back
to its previous value of 1 second.
Reported-by: Yin Fengwei <fengwei.yin@intel.com>
Reported-by: kernel test robot <yujie.liu@intel.com>
Link: https://lore.kernel.org/lkml/997614df-10d4-af53-9571-edec36b0e2f3@intel.com/
Fixes: 5efdd9122eff ("smb3: allow deferred close timeout to be configurable")
Cc: stable@vger.kernel.org # 6.0+
Tested-by: Yin Fengwei <fengwei.yin@intel.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
cmdline_find_option() may fail before doing any initialization of
the buffer array. This may lead to unpredictable results when the same
buffer is used later in calls to strncmp() function. Fix the issue by
returning early if cmdline_find_option() returns an error.
Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.
Fixes: aca20d546214 ("x86/mm: Add support to make use of Secure Memory Encryption")
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20230306160656.14844-1-n.zhandarovich@fintech.ru
Pull perf fixes from Borislav Petkov:
- Check whether sibling events have been deactivated before adding them
to groups
- Update the proper event time tracking variable depending on the event
type
- Fix a memory overwrite issue due to using the wrong function argument
when outputting perf events
* tag 'perf_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Fix check before add_event_to_groups() in perf_group_detach()
perf: fix perf_event_context->time
perf/core: Fix perf_output_begin parameter is incorrectly invoked in perf_event_bpf_output
A recent change introduced a flag to queue up errors found during
boot-time polling. These errors will be processed during late init once
the MCE subsystem is fully set up.
A number of sysfs updates call mce_restart() which goes through a subset
of the CPU init flow. This includes polling MCA banks and logging any
errors found. Since the same function is used as boot-time polling,
errors will be queued. However, the system is now past late init, so the
errors will remain queued until another error is found and the workqueue
is triggered.
Call mce_schedule_work() at the end of mce_restart() so that queued
errors are processed.
Fixes: 3bff147b187d ("x86/mce: Defer processing of early errors")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230301221420.2203184-1-yazen.ghannam@amd.com
Prior to commit 5d8e6e6c10a3 ("nvmem: core: add an index parameter to
the cell") of_nvmem_cell_get() would return -ENOENT if the cell wasn't
found. Particularly, if of_property_match_string() returned -EINVAL,
that return code was passed as the index to of_parse_phandle(), which
then detected it as invalid and returned NULL. That led to an return
code of -ENOENT.
With the new code, the negative index will lead to an -EINVAL of
of_parse_phandle_with_optional_args() which pass straight to the
caller and break those who expect an -ENOENT.
Fix it by always returning -ENOENT.
Fixes: 5d8e6e6c10a3 ("nvmem: core: add an index parameter to the cell")
Reported-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Link: https://lore.kernel.org/r/2143916.GUh0CODmnK@steina-w/
Signed-off-by: Michael Walle <michael@walle.cc>
Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20230310094845.139400-1-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current interconnect provider registration interface is inherently
racy as nodes are not added until the after adding the provider. This
can specifically cause racing DT lookups to fail.
Switch to using the new API where the provider is not registered until
after it has been fully initialised.
Fixes: d5ef16ba5fbe ("memory: tegra20: Support interconnect framework")
Cc: stable@vger.kernel.org # 5.11
Cc: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230306075651.2449-21-johan+linaro@kernel.org
Signed-off-by: Georgi Djakov <djakov@kernel.org>
Make sure that there is data in the ring buffer before trying to set up
a zero-length DMA transfer.
This specifically fixes the following warning when unmapping the empty
buffer on the sc8280xp-crd:
WARNING: CPU: 0 PID: 138 at drivers/iommu/dma-iommu.c:1046 iommu_dma_unmap_page+0xbc/0xd8
...
Call trace:
iommu_dma_unmap_page+0xbc/0xd8
dma_unmap_page_attrs+0x30/0x1c8
geni_se_tx_dma_unprep+0x28/0x38
qcom_geni_serial_isr+0x358/0x75c
Fixes: 2aaa43c70778 ("tty: serial: qcom-geni-serial: add support for serial engine DMA")
Cc: stable <stable@kernel.org>
Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Tested-by: Andrew Halaney <ahalaney@redhat.com> # sa8540p-ride
Link: https://lore.kernel.org/r/20230307164405.14218-4-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
smatch reports this warning
kernel/trace/ftrace.c:2594:19: warning:
symbol 'direct_ops' was not declared. Should it be static?
The variable direct_ops is only used in ftrace.c, so it should be static
Link: https://lore.kernel.org/linux-trace-kernel/20230311135113.711824-1-trix@redhat.com
Signed-off-by: Tom Rix <trix@redhat.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Patch changes CDNS_DEVICE_ID in USBSSP PCI Glue driver to remove
the conflict with Cadence USBSS driver.
cc: <stable@vger.kernel.org>
Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
Signed-off-by: Pawel Laszczak <pawell@cadence.com>
Link: https://lore.kernel.org/r/20230309063048.299378-1-pawell@cadence.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
`ring_interrupt_index` doesn't change the data for `ring` so mark it as
const. This is needed by the following patch that disables interrupt
auto clear for rings.
Cc: Sanju Mehta <Sanju.Mehta@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Pull xfs fixes from Darrick Wong:
"This batch started with some debugging enhancements to the new
allocator refactoring that we put in 6.3-rc1 to assist developers in
rebasing their dev branches.
As for more serious code changes -- there's a bug fix to make the
lockless allocator scan the whole filesystem before resorting to the
locking allocator. We're also adding a selftest for the venerable
directory/xattr hash function to make sure that it produces consistent
results so that we can address any fallout as soon as possible.
- Add a few debugging assertions so that people (me) trying to port
code to the new allocator functions don't mess up the caller
requirements
- Relax some overly cautious lock ordering enforcement in the new
allocator code, which means that file allocations will locklessly
scan for the best space they can get before backing off to the
traditional lock-and-really-get-it behavior
- Add tracepoints to make it easier to trace the xfs allocator
behavior
- Actually test the dir/xattr hash algorithm to make sure it produces
consistent results across all the platforms XFS supports"
* tag 'xfs-6.3-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: test dir/attr hash when loading module
xfs: add tracepoints for each of the externally visible allocators
xfs: walk all AGs if TRYLOCK passed to xfs_alloc_vextent_iterate_ags
xfs: try to idiot-proof the allocators
percpu_counter_sum_all() is now redundant as the race condition it
was invented to handle is now dealt with by percpu_counter_sum()
directly and all users of percpu_counter_sum_all() have been
removed.
Remove it.
This effectively reverts the changes made in f689054aace2
("percpu_counter: add percpu_counter_sum_all interface") except for
the cpumask iteration that fixes percpu_counter_sum() made earlier
in this series.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Prior to commit 7ac2ff8bb371, when we loaded the incore perag structure
with information from the AGF header, we would set or clear the
pagf_agfl_reset field based on whether or not the AGFL list was
misaligned within the block. IOWs, it's an incore state bit that's
supposed to cache something in the ondisk metadata. Therefore, the code
still needs to support clearing the incore bit if (somehow) the AGFL
were to correct itself.
It turns out that xfs_repair does exactly this -- phase 4 loads the AGF
to scan the rmapbt for corrupt records, which can set NEEDS_AGFL_RESET.
The scan unsets AGF_INIT but doesn't unset NEEDS_AGFL_RESET. Phase 5
totally rewrites the AGFL and fixes the alignment problem, didn't clear
NEEDS_AGFL_RESET historically, and reloads the perag state to fix the
freelist. This results in the AGFL being reset based on stale data,
which then causes the new AGFL blocks to be leaked. A subsequent
xfs_repair -n then complains about the leaks.
One could argue that phase 5 ought to clear this bit directly when it
reloads the perag AGF data after rewriting the AGFL, but libxfs used to
handle this for us, so it should go back to doing that.
Found by fuzzing flfirst = ones in xfs/352.
Fixes: 7ac2ff8bb371 ("xfs: perags need atomic operational state")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Geert reports that:
> On v6.2, "make ARCH=m68k defconfig" gives you
> CONFIG_RPCSEC_GSS_KRB5=m
> On v6.3, it became builtin, due to dropping the dependencies on
> the individual crypto modules.
>
> $ grep -E "CRYPTO_(MD5|DES|CBC|CTS|ECB|HMAC|SHA1|AES)" .config
> CONFIG_CRYPTO_AES=y
> CONFIG_CRYPTO_AES_TI=m
> CONFIG_CRYPTO_DES=m
> CONFIG_CRYPTO_CBC=m
> CONFIG_CRYPTO_CTS=m
> CONFIG_CRYPTO_ECB=m
> CONFIG_CRYPTO_HMAC=m
> CONFIG_CRYPTO_MD5=m
> CONFIG_CRYPTO_SHA1=m
This behavior is triggered by the "default y" in the definition of
RPCSEC_GSS.
The "default y" was added in 2010 by commit df486a25900f ("NFS: Fix
the selection of security flavours in Kconfig"). However,
svc_gss_principal was removed in 2012 by commit 03a4e1f6ddf2
("nfsd4: move principal name into svc_cred"), so the 2010 fix is
no longer necessary. We can safely change the NFS_V4 and NFSD_V4
dependencies back to RPCSEC_GSS_KRB5 to get the nicer v6.2
behavior back.
Selecting KRB5 symbolically represents the true requirement here:
that all spec-compliant NFSv4 implementations must have Kerberos
available to use.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: dfe9a123451a ("SUNRPC: Enable rpcsec_gss_krb5.ko to be built without CRYPTO_DES")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
As a temporary storage, staged_config[] in rdt_domain should be cleared
before and after it is used. The stale value in staged_config[] could
cause an MSR access error.
Here is a reproducer on a system with 16 usable CLOSIDs for a 15-way L3
Cache (MBA should be disabled if the number of CLOSIDs for MB is less than
16.) :
mount -t resctrl resctrl -o cdp /sys/fs/resctrl
mkdir /sys/fs/resctrl/p{1..7}
umount /sys/fs/resctrl/
mount -t resctrl resctrl /sys/fs/resctrl
mkdir /sys/fs/resctrl/p{1..8}
An error occurs when creating resource group named p8:
unchecked MSR access error: WRMSR to 0xca0 (tried to write 0x00000000000007ff) at rIP: 0xffffffff82249142 (cat_wrmsr+0x32/0x60)
Call Trace:
<IRQ>
__flush_smp_call_function_queue+0x11d/0x170
__sysvec_call_function+0x24/0xd0
sysvec_call_function+0x89/0xc0
</IRQ>
<TASK>
asm_sysvec_call_function+0x16/0x20
When creating a new resource control group, hardware will be configured
by the following process:
rdtgroup_mkdir()
rdtgroup_mkdir_ctrl_mon()
rdtgroup_init_alloc()
resctrl_arch_update_domains()
resctrl_arch_update_domains() iterates and updates all resctrl_conf_type
whose have_new_ctrl is true. Since staged_config[] holds the same values as
when CDP was enabled, it will continue to update the CDP_CODE and CDP_DATA
configurations. When group p8 is created, get_config_index() called in
resctrl_arch_update_domains() will return 16 and 17 as the CLOSIDs for
CDP_CODE and CDP_DATA, which will be translated to an invalid register -
0xca0 in this scenario.
Fix it by clearing staged_config[] before and after it is used.
[reinette: re-order commit tags]
Fixes: 75408e43509e ("x86/resctrl: Allow different CODE/DATA configurations to be staged")
Suggested-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/2fad13f49fbe89687fc40e9a5a61f23a28d1507a.1673988935.git.reinette.chatre%40intel.com
Pull x86 fixes from Borislav Petkov:
"There's a little bit more 'movement' in there for my taste but it
needs to happen and should make the code better after it.
- Check cmdline_find_option()'s return value before further
processing
- Clear temporary storage in the resctrl code to prevent access to an
unexistent MSR
- Add a simple throttling mechanism to protect the hypervisor from
potentially malicious SEV guests issuing requests in rapid
succession.
In order to not jeopardize the sanity of everyone involved in
maintaining this code, the request issuing side has received a
cleanup, split in more or less trivial, small and digestible
pieces. Otherwise, the code was threatening to become an
unmaintainable mess.
Therefore, that cleanup is marked indirectly also for stable so
that there's no differences between the upstream code and the
stable variant when it comes down to backporting more there"
* tag 'x86_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix use of uninitialized buffer in sme_enable()
x86/resctrl: Clear staged_config[] before and after it is used
virt/coco/sev-guest: Add throttling awareness
virt/coco/sev-guest: Convert the sw_exit_info_2 checking to a switch-case
virt/coco/sev-guest: Do some code style cleanups
virt/coco/sev-guest: Carve out the request issuing logic into a helper
virt/coco/sev-guest: Remove the disable_vmpck label in handle_guest_request()
virt/coco/sev-guest: Simplify extended guest request handling
virt/coco/sev-guest: Check SEV_SNP attribute at probe time
Events should only be added to a groups rb tree if they have not been
removed from their context by list_del_event(). Since remove_on_exec
made it possible to call list_del_event() on individual events before
they are detached from their group, perf_group_detach() should check each
sibling's attach_state before calling add_event_to_groups() on it.
Fixes: 2e498d0a74e5 ("perf: Add support for event removal on exec")
Signed-off-by: Budimir Markovic <markovicbudimir@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/ZBFzvQV9tEqoHEtH@gentoo