Linux kernel
============
There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.
In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``. The formatted documentation can also be read online at:
https://www.kernel.org/doc/html/latest/
There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.
code
Clone this repository
https://tangled.org/tjh.dev/kernel
git@gordian.tjh.dev:tjh.dev/kernel
For self-hosted knots, clone URLs may differ based on your setup.
Pull phy fixes from Vinod Koul:
- mapphone-mdm6600 runtime pm & pinctrl handling fixes
- Qualcomm qmp usb pcs register fixes, qmp pcie register size warning
fix, m31 fixes for wrong pointer in PTR_ERR and dropping wrong vreg
check, qmp combo fix for 8550 power config register
- realtek usb fix for debugfs_create_dir() and kconfig dependency
* tag 'phy-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
phy: realtek: Realtek PHYs should depend on ARCH_REALTEK
phy: qualcomm: Fix typos in comments
phy: qcom-qmp-combo: initialize PCS_USB registers
phy: qcom-qmp-combo: Square out 8550 POWER_STATE_CONFIG1
phy: qcom: m31: Remove unwanted qphy->vreg is NULL check
phy: realtek: usb: Drop unnecessary error check for debugfs_create_dir()
phy: qcom: phy-qcom-m31: change m31_ipq5332_regs to static
phy: qcom: phy-qcom-m31: fix wrong pointer pass to PTR_ERR()
dt-bindings: phy: qcom,ipq8074-qmp-pcie: fix warning regarding reg size
phy: qcom-qmp-usb: split PCS_USB init table for sc8280xp and sa8775p
phy: qcom-qmp-usb: initialize PCS_USB registers
phy: mapphone-mdm6600: Fix pinctrl_pm handling for sleep pins
phy: mapphone-mdm6600: Fix runtime PM for remove
phy: mapphone-mdm6600: Fix runtime disable on probe
Pull EFI fixes from Ard Biesheuvel:
"The boot_params pointer fix uses a somewhat ugly extern struct
declaration but this will be cleaned up the next cycle.
- don't try to print warnings to the console when it is no longer
available
- fix theoretical memory leak in SSDT override handling
- make sure that the boot_params global variable is set before the
KASLR code attempts to hash it for 'randomness'
- avoid soft lockups in the memory acceptance code"
* tag 'efi-fixes-for-v6.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi/unaccepted: Fix soft lockups caused by parallel memory acceptance
x86/boot: efistub: Assign global boot_params variable
efi: fix memory leak in krealloc failure handling
x86/efistub: Don't try to print after ExitBootService()
The Realtek SoC USB2 and USB3 PHY Transceivers are only present on
Realtek Digital Home Center (DHC) RTD series SoCs. Hence add a
dependency on ARCH_REALTEK, to prevent asking the user about these
drivers when configuring a kernel without Realtek SoC support.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/2892527cac9af6fa8f5e7b8daeffd7d4351fde68.1692113167.git.geert+renesas@glider.be
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Pull powerpc fixes from Michael Ellerman:
- Fix stale propagated yield_cpu in qspinlocks leading to lockups
- Fix broken hugepages on some configs due to ARCH_FORCE_MAX_ORDER
- Fix a spurious warning when copros are in use at exit time
Thanks to Nicholas Piggin, Christophe Leroy, Nysal Jan K.A Sachin Sant,
and Shrikanth Hegde.
* tag 'powerpc-6.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/qspinlock: Fix stale propagated yield_cpu
powerpc/64s/radix: Don't warn on copros in radix__tlb_flush()
powerpc/mm: Allow ARCH_FORCE_MAX_ORDER up to 12
Fix typo in the description of the 'succesfully'.
Signed-off-by: Bo Liu <liubo03@inspur.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Link: https://lore.kernel.org/r/20230912114646.8452-1-liubo03@inspur.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Pull gpio fixes from Bartosz Golaszewski:
- fix interrupt handling in suspend and wakeup in gpio-vf610
- fix a bug on setting direction to output in gpio-vf610
- add a missing memset() in gpio ACPI code
* tag 'gpio-fixes-for-v6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpiolib: acpi: Add missing memset(0) to acpi_get_gpiod_from_data()
gpio: vf610: set value before the direction to avoid a glitch
gpio: vf610: mask the gpio irq in system suspend and support wakeup
yield_cpu is a sample of a preempted lock holder that gets propagated
back through the queue. Queued waiters use this to yield to the
preempted lock holder without continually sampling the lock word (which
would defeat the purpose of MCS queueing by bouncing the cache line).
The problem is that yield_cpu can become stale. It can take some time to
be passed down the chain, and if any queued waiter gets preempted then
it will cease to propagate the yield_cpu to later waiters.
This can result in yielding to a CPU that no longer holds the lock,
which is bad, but particularly if it is currently in H_CEDE (idle),
then it appears to be preempted and some hypervisors (PowerVM) can
cause very long H_CONFER latencies waiting for H_CEDE wakeup. This
results in latency spikes and hard lockups on oversubscribed
partitions with lock contention.
This is a minimal fix. Before yielding to yield_cpu, sample the lock
word to confirm yield_cpu is still the owner, and bail out of it is not.
Thanks to a bunch of people who reported this and tracked down the
exact problem using tracepoints and dispatch trace logs.
Fixes: 28db61e207ea ("powerpc/qspinlock: allow propagation of yield CPU down the queue")
Cc: stable@vger.kernel.org # v6.2+
Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reported-by: Laurent Dufour <ldufour@linux.ibm.com>
Reported-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
Debugged-by: "Nysal Jan K.A" <nysal@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231016124305.139923-2-npiggin@gmail.com
Unaccepted table is now allocated from EFI_ACPI_RECLAIM_MEMORY. It
translates into E820_TYPE_ACPI, which is not added to memblock and
therefore not mapped in the direct mapping.
This causes a crash on the first touch of the table.
Use memblock_add() to make sure that the table is mapped in direct
mapping.
Align the range to the nearest page borders. Ranges smaller than page
size are not mapped.
Fixes: e7761d827e99 ("efi/unaccepted: Use ACPI reclaim memory for unaccepted memory table")
Reported-by: Hongyu Ning <hongyu.ning@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Michael reported soft lockups on a system that has unaccepted memory.
This occurs when a user attempts to allocate and accept memory on
multiple CPUs simultaneously.
The root cause of the issue is that memory acceptance is serialized with
a spinlock, allowing only one CPU to accept memory at a time. The other
CPUs spin and wait for their turn, leading to starvation and soft lockup
reports.
To address this, the code has been modified to release the spinlock
while accepting memory. This allows for parallel memory acceptance on
multiple CPUs.
A newly introduced "accepting_list" keeps track of which memory is
currently being accepted. This is necessary to prevent parallel
acceptance of the same memory block. If a collision occurs, the lock is
released and the process is retried.
Such collisions should rarely occur. The main path for memory acceptance
is the page allocator, which accepts memory in MAX_ORDER chunks. As long
as MAX_ORDER is equal to or larger than the unit_size, collisions will
never occur because the caller fully owns the memory block being
accepted.
Aside from the page allocator, only memblock and deferered_free_range()
accept memory, but this only happens during boot.
The code has been tested with unit_size == 128MiB to trigger collisions
and validate the retry codepath.
Fixes: 2053bc57f367 ("efi: Add unaccepted memory support")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Michael Roth <michael.roth@amd.com
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Michael Roth <michael.roth@amd.com>
[ardb: drop unnecessary cpu_relax() call]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Currently, PCS_USB registers that have their initialization data in a
pcs_usb_tbl table are never initialized. Fix that.
Fixes: fc64623637da ("phy: qcom-qmp-combo,usb: add support for separate PCS_USB region")
Reported-by: Adrien Thierry <athierry@redhat.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Link: https://lore.kernel.org/r/20230829-topic-8550_usbphy-v3-2-34ec434194c5@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Pull rust fixes from Miguel Ojeda:
- GCC build: fix bindgen build error with '-fstrict-flex-arrays'
- Error module: fix the description for 'ECHILD' and fix Markdown
style nit
- Code docs: fix logo replacement
- Docs: update docs output path
- Kbuild: remove old docs output path in 'cleandocs' target
* tag 'rust-fixes-6.6' of https://github.com/Rust-for-Linux/linux:
rust: docs: fix logo replacement
kbuild: remove old Rust docs output path
docs: rust: update Rust docs output path
rust: fix bindgen build error with fstrict-flex-arrays
rust: error: Markdown style nit
rust: error: fix the description for `ECHILD`
When refactoring the acpi_get_gpiod_from_data() the change missed
cleaning up the variable on stack. Add missing memset().
Reported-by: Ferry Toth <ftoth@exalondelft.nl>
Fixes: 16ba046e86e9 ("gpiolib: acpi: teach acpi_find_gpio() to handle data-only nodes")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Sachin reported a warning when running the inject-ra-err selftest:
# selftests: powerpc/mce: inject-ra-err
Disabling lock debugging due to kernel taint
MCE: CPU19: machine check (Severe) Real address Load/Store (foreign/control memory) [Not recovered]
MCE: CPU19: PID: 5254 Comm: inject-ra-err NIP: [0000000010000e48]
MCE: CPU19: Initiator CPU
MCE: CPU19: Unknown
------------[ cut here ]------------
WARNING: CPU: 19 PID: 5254 at arch/powerpc/mm/book3s64/radix_tlb.c:1221 radix__tlb_flush+0x160/0x180
CPU: 19 PID: 5254 Comm: inject-ra-err Kdump: loaded Tainted: G M E 6.6.0-rc3-00055-g9ed22ae6be81 #4
Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
...
NIP radix__tlb_flush+0x160/0x180
LR radix__tlb_flush+0x104/0x180
Call Trace:
radix__tlb_flush+0xf4/0x180 (unreliable)
tlb_finish_mmu+0x15c/0x1e0
exit_mmap+0x1a0/0x510
__mmput+0x60/0x1e0
exit_mm+0xdc/0x170
do_exit+0x2bc/0x5a0
do_group_exit+0x4c/0xc0
sys_exit_group+0x28/0x30
system_call_exception+0x138/0x330
system_call_vectored_common+0x15c/0x2ec
And bisected it to commit e43c0a0c3c28 ("powerpc/64s/radix: combine
final TLB flush and lazy tlb mm shootdown IPIs"), which added a warning
in radix__tlb_flush() if mm->context.copros is still elevated.
However it's possible for the copros count to be elevated if a process
exits without first closing file descriptors that are associated with a
copro, eg. VAS.
If the process exits with a VAS file still open, the release callback
is queued up for exit_task_work() via:
exit_files()
put_files_struct()
close_files()
filp_close()
fput()
And called via:
exit_task_work()
____fput()
__fput()
file->f_op->release(inode, file)
coproc_release()
vas_user_win_ops->close_win()
vas_deallocate_window()
mm_context_remove_vas_window()
mm_context_remove_copro()
But that is after exit_mm() has been called from do_exit() and triggered
the warning.
Fix it by dropping the warning, and always calling __flush_all_mm().
In the normal case of no copros, that will result in a call to
_tlbiel_pid(mm->context.id, RIC_FLUSH_ALL) just as the current code
does.
If the copros count is elevated then it will cause a global flush, which
should flush translations from any copros. Note that the process table
entry was cleared in arch_exit_mmap(), so copros should not be able to
fetch any new translations.
Fixes: e43c0a0c3c28 ("powerpc/64s/radix: combine final TLB flush and lazy tlb mm shootdown IPIs")
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Closes: https://lore.kernel.org/all/A8E52547-4BF1-47CE-8AEA-BC5A9D7E3567@linux.ibm.com/
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Link: https://msgid.link/20231017121527.1574104-1-mpe@ellerman.id.au
Some firmware (notably U-Boot) provides GetVariable() and
GetNextVariableName() but not QueryVariableInfo().
With commit d86ff3333cb1 ("efivarfs: expose used and total size") the
statfs syscall was broken for such firmware.
If QueryVariableInfo() does not exist or returns EFI_UNSUPPORTED, just
report the file system size as 0 as statfs_simple() previously did.
Fixes: d86ff3333cb1 ("efivarfs: expose used and total size")
Link: https://lore.kernel.org/all/20230910045445.41632-1-heinrich.schuchardt@canonical.com/
Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
[ardb: log warning on QueryVariableInfo() failure]
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>