Linux kernel
============
There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.
In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``. The formatted documentation can also be read online at:
https://www.kernel.org/doc/html/latest/
There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.
Clone this repository
For self-hosted knots, clone URLs may differ based on your setup.
Download tar.gz
Pull xen security fixes from Juergen Gross:
- XSA-403 (4 patches for blkfront and netfront drivers):
Linux Block and Network PV device frontends don't zero memory regions
before sharing them with the backend (CVE-2022-26365,
CVE-2022-33740). Additionally the granularity of the grant table
doesn't allow sharing less than a 4K page, leading to unrelated data
residing in the same 4K page as data shared with a backend being
accessible by such backend (CVE-2022-33741, CVE-2022-33742).
- XSA-405 (1 patch for netfront driver, only 5.10 and newer):
While adding logic to support XDP (eXpress Data Path), a code label
was moved in a way allowing for SKBs having references (pointers)
retained for further processing to nevertheless be freed.
- XSA-406 (1 patch for Arm specific dom0 code):
When mapping pages of guests on Arm, dom0 is using an rbtree to keep
track of the foreign mappings.
Updating of that rbtree is not always done completely with the
related lock held, resulting in a small race window, which can be
used by unprivileged guests via PV devices to cause inconsistencies
of the rbtree. These inconsistencies can lead to Denial of Service
(DoS) of dom0, e.g. by causing crashes or the inability to perform
further mappings of other guests' memory pages.
* tag 'xsa-5.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/arm: Fix race in RB-tree based P2M accounting
xen-netfront: restore __skb_queue_tail() positioning in xennet_get_responses()
xen/blkfront: force data bouncing when backend is untrusted
xen/netfront: force data bouncing when backend is untrusted
xen/netfront: fix leaking data in shared pages
xen/blkfront: fix leaking data in shared pages
Pull ARM SoC fixes from Arnd Bergmann:
"Another set of minor patches for Arm DTS files and soc specific
drivers:
- More reference counting bug fixes for DT nodes, and other trivial
code fixes
- Multiple code fixes for the Arm SCMI firmware driver to improve
compatibility with firmware implementations.
- A patch series for at91 to address power management issues from
using the wrong DT compatible properties.
- A series of patches to fix pad settings for NXP imx8mp to leave the
configuration untouched from the boot loader
- Additional DT fixes for qualcomm and NXP platforms
- A boot time fix for stm32mp15 DT
- Konrad Dybcio becomes an additional reviewer for the Qualcomm
platforms"
* tag 'soc-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (42 commits)
soc: qcom: smem: use correct format characters
ARM: dts: stm32: add missing usbh clock and fix clk order on stm32mp15
ARM: dts: stm32: delete fixed clock node on STM32MP15-SCMI
ARM: dts: stm32: DSI should use LSE SCMI clock on DK1/ED1 STM32 board
ARM: dts: stm32: use the correct clock source for CEC on stm32mp151
ARM: dts: stm32: fix pwr regulators references to use scmi
soc: ixp4xx/npe: Fix unused match warning
ARM: at91: pm: Mark at91_pm_secure_init as __init
ARM: at91: fix soc detection for SAM9X60 SiPs
ARM: dts: at91: sama5d2_icp: fix eeprom compatibles
ARM: dts: at91: sam9x60ek: fix eeprom compatible and size
ARM: at91: pm: use proper compatibles for sama7g5's rtc and rtt
ARM: at91: pm: use proper compatibles for sam9x60's rtc and rtt
ARM: at91: pm: use proper compatible for sama5d2's rtc
arm64: dts: qcom: msm8992-*: Fix vdd_lvs1_2-supply typo
firmware: arm_scmi: Remove usage of the deprecated ida_simple_xxx API
firmware: arm_scmi: Fix response size warning for OPTEE transport
arm64: dts: imx8mp-icore-mx8mp-edim2.2: correct pad settings
arm64: dts: imx8mp-phyboard-pollux-rdk: correct i2c2 & mmc settings
arm64: dts: imx8mp-phyboard-pollux-rdk: correct eqos pad settings
...
During the PV driver life cycle the mappings are added to
the RB-tree by set_foreign_p2m_mapping(), which is called from
gnttab_map_refs() and are removed by clear_foreign_p2m_mapping()
which is called from gnttab_unmap_refs(). As both functions end
up calling __set_phys_to_machine_multi() which updates the RB-tree,
this function can be called concurrently.
There is already a "p2m_lock" to protect against concurrent accesses,
but the problem is that the first read of "phys_to_mach.rb_node"
in __set_phys_to_machine_multi() is not covered by it, so this might
lead to the incorrect mappings update (removing in our case) in RB-tree.
In my environment the related issue happens rarely and only when
PV net backend is running, the xen_add_phys_to_mach_entry() claims
that it cannot add new pfn <-> mfn mapping to the tree since it is
already exists which results in a failure when mapping foreign pages.
But there might be other bad consequences related to the non-protected
root reads such use-after-free, etc.
While at it, also fix the similar usage in __pfn_to_mfn(), so
initialize "struct rb_node *n" with the "p2m_lock" held in both
functions to avoid possible bad consequences.
This is CVE-2022-33744 / XSA-406.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
STM32 DT fixes for v5.19, round 2
Highlights:
-----------
-Fixes STM32MP15:
- Add missing usbh clock and fix clk order for usbh to avoid PLL
issue.
- Fix SCMI version: use scmi regulator and update missing SCMI
clocks to be able to correcly boot.
* tag 'stm32-dt-for-v5.19-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/atorgue/stm32:
ARM: dts: stm32: add missing usbh clock and fix clk order on stm32mp15
ARM: dts: stm32: delete fixed clock node on STM32MP15-SCMI
ARM: dts: stm32: DSI should use LSE SCMI clock on DK1/ED1 STM32 board
ARM: dts: stm32: use the correct clock source for CEC on stm32mp151
ARM: dts: stm32: fix pwr regulators references to use scmi
Link: https://lore.kernel.org/r/1259e082-a3a4-96a5-ec9c-05dbb893a746@foss.st.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The commit referenced below moved the invocation past the "next" label,
without any explanation. In fact this allows misbehaving backends undue
control over the domain the frontend runs in, as earlier detected errors
require the skb to not be freed (it may be retained for later processing
via xennet_move_rx_slot(), or it may simply be unsafe to have it freed).
This is CVE-2022-33743 / XSA-405.
Fixes: 6c5aa6fc4def ("xen networking: add basic XDP support for xen-netfront")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Looking at the conditional lock acquire functions in the kernel due to
the new sparse support (see commit 4a557a5d1a61 "sparse: introduce
conditional lock acquire function attribute"), it became obvious that
the lockref code has a couple of them, but they don't match the usual
naming convention for the other ones, and their return value logic is
also reversed.
In the other very similar places, the naming pattern is '*_and_lock()'
(eg 'atomic_put_and_lock()' and 'refcount_dec_and_lock()'), and the
function returns true when the lock is taken.
The lockref code is superficially very similar to the refcount code,
only with the special "atomic wrt the embedded lock" semantics. But
instead of the '*_and_lock()' naming it uses '*_or_lock()'.
And instead of returning true in case it took the lock, it returns true
if it *didn't* take the lock.
Now, arguably the reflock code is quite logical: it really is a "either
decrement _or_ lock" kind of situation - and the return value is about
whether the operation succeeded without any special care needed.
So despite the similarities, the differences do make some sense, and
maybe it's not worth trying to unify the different conditional locking
primitives in this area.
But while looking at this all, it did become obvious that the
'lockref_get_or_lock()' function hasn't actually had any users for
almost a decade.
The only user it ever had was the shortlived 'd_rcu_to_refcount()'
function, and it got removed and replaced with 'lockref_get_not_dead()'
back in 2013 in commits 0d98439ea3c6 ("vfs: use lockred 'dead' flag to
mark unrecoverably dead dentries") and e5c832d55588 ("vfs: fix dentry
RCU to refcounting possibly sleeping dput()")
In fact, that single use was removed less than a week after the whole
function was introduced in commit b3abd80250c1 ("lockref: add
'lockref_get_or_lock() helper") so this function has been around for a
decade, but only had a user for six days.
Let's just put this mis-designed and unused function out of its misery.
We can think about the naming and semantic oddities of the remaining
'lockref_put_or_lock()' later, but at least that function has users.
And while the naming is different and the return value doesn't match,
that function matches the whole '{atomic,refcount}_dec_and_test()'
pattern much better (ie the magic happens when the count goes down to
zero, not when it is incremented from zero).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When compiling with -Wformat, clang emits the following warnings:
drivers/soc/qcom/smem.c:847:41: warning: format specifies type 'unsigned
short' but the argument has type 'unsigned int' [-Wformat]
dev_err(smem->dev, "bad host %hu\n", remote_host);
~~~ ^~~~~~~~~~~
%u
./include/linux/dev_printk.h:144:65: note: expanded from macro 'dev_err'
dev_printk_index_wrap(_dev_err, KERN_ERR, dev, dev_fmt(fmt), ##__VA_ARGS__)
~~~ ^~~~~~~~~~~
./include/linux/dev_printk.h:110:23: note: expanded from macro 'dev_printk_index_wrap'
_p_func(dev, fmt, ##__VA_ARGS__); \
~~~ ^~~~~~~~~~~
drivers/soc/qcom/smem.c:852:47: warning: format specifies type 'unsigned
short' but the argument has type 'unsigned int' [-Wformat]
dev_err(smem->dev, "duplicate host %hu\n", remote_host);
~~~ ^~~~~~~~~~~
%u
./include/linux/dev_printk.h:144:65: note: expanded from macro 'dev_err'
dev_printk_index_wrap(_dev_err, KERN_ERR, dev, dev_fmt(fmt), ##__VA_ARGS__)
~~~ ^~~~~~~~~~~
./include/linux/dev_printk.h:110:23: note: expanded from macro 'dev_printk_index_wrap'
_p_func(dev, fmt, ##__VA_ARGS__); \
~~~ ^~~~~~~~~~~
The types of these arguments are unconditionally defined, so this patch
updates the format character to the correct one and change type of
remote_host to "u16" to match with other types.
Signed-off-by: Bill Wendling <morbo@google.com>
Tested-by: Justin Stitt <jstitt007@gmail.com>
Reviewed-by: Justin Stitt <jstitt007@gmail.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/378
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The USBH composed of EHCI and OHCI controllers needs the PHY clock to be
initialized first, before enabling (gating) them. The reverse is also
required when going to suspend.
So, add USBPHY clock as 1st entry in both controllers, so the USBPHY PLL
gets enabled 1st upon controller init. Upon suspend/resume, this also makes
the clock to be disabled/re-enabled in the correct order.
This fixes some IRQ storm conditions seen when going to low-power, due to
PHY PLL being disabled before all clocks are cleanly gated.
Fixes: 949a0c0dec85 ("ARM: dts: stm32: add USB Host (USBH) support to stm32mp157c")
Fixes: db7be2cb87ae ("ARM: dts: stm32: use usbphyc ck_usbo_48m as USBH OHCI clock on stm32mp151")
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com>
Split the current bounce buffering logic used with persistent grants
into it's own option, and allow enabling it independently of
persistent grants. This allows to reuse the same code paths to
perform the bounce buffering required to avoid leaking contiguous data
in shared pages not part of the request fragments.
Reporting whether the backend is to be trusted can be done using a
module parameter, or from the xenstore frontend path as set by the
toolstack when adding the device.
This is CVE-2022-33742, part of XSA-403.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
The kernel tends to try to avoid conditional locking semantics because
it makes it harder to think about and statically check locking rules,
but we do have a few fundamental locking primitives that take locks
conditionally - most obviously the 'trylock' functions.
That has always been a problem for 'sparse' checking for locking
imbalance, and we've had a special '__cond_lock()' macro that we've used
to let sparse know how the locking works:
# define __cond_lock(x,c) ((c) ? ({ __acquire(x); 1; }) : 0)
so that you can then use this to tell sparse that (for example) the
spinlock trylock macro ends up acquiring the lock when it succeeds, but
not when it fails:
#define raw_spin_trylock(lock) __cond_lock(lock, _raw_spin_trylock(lock))
and then sparse can follow along the locking rules when you have code like
if (!spin_trylock(&dentry->d_lock))
return LRU_SKIP;
.. sparse sees that the lock is held here..
spin_unlock(&dentry->d_lock);
and sparse ends up happy about the lock contexts.
However, this '__cond_lock()' use does result in very ugly header files,
and requires you to basically wrap the real function with that macro
that uses '__cond_lock'. Which has made PeterZ NAK things that try to
fix sparse warnings over the years [1].
To solve this, there is now a very experimental patch to sparse that
basically does the exact same thing as '__cond_lock()' did, but using a
function attribute instead. That seems to make PeterZ happy [2].
Note that this does not replace existing use of '__cond_lock()', but
only exposes the new proposed attribute and uses it for the previously
unannotated 'refcount_dec_and_lock()' family of functions.
For existing sparse installations, this will make no difference (a
negative output context was ignored), but if you have the experimental
sparse patch it will make sparse now understand code that uses those
functions, the same way '__cond_lock()' makes sparse understand the very
similar 'atomic_dec_and_lock()' uses that have the old '__cond_lock()'
annotations.
Note that in some cases this will silence existing context imbalance
warnings. But in other cases it may end up exposing new sparse warnings
for code that sparse just didn't see the locking for at all before.
This is a trial, in other words. I'd expect that if it ends up being
successful, and new sparse releases end up having this new attribute,
we'll migrate the old-style '__cond_lock()' users to use the new-style
'__cond_acquires' function attribute.
The actual experimental sparse patch was posted in [3].
Link: https://lore.kernel.org/all/20130930134434.GC12926@twins.programming.kicks-ass.net/ [1]
Link: https://lore.kernel.org/all/Yr60tWxN4P568x3W@worktop.programming.kicks-ass.net/ [2]
Link: https://lore.kernel.org/all/CAHk-=wjZfO9hGqJ2_hGQG3U_XzSh9_XaXze=HgPdvJbgrvASfA@mail.gmail.com/ [3]
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Aring <aahringo@redhat.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Qualcomm ARM64 DT fixes for v5.19
This removes duplicate includes in the sc7180-trogdor files, which
accidentally ended up disabling nodes intended to be enabled.
It corrects identifiers for CPU6/7 on MSM8994. On SM8450 the UFS node's
interconnects property is updated to match the #interconnect-cells,
avoiding sync_state issues and the GIC ITS is defined, to correct the
references from the PCIe nodes. On SDM845 the display subsystem's AHB
clock is corrected and on msm8992 devices, the supplies for lvs 1 and 2
are correctly specified.
Lastly, a welcome addition of Konrad as reviewer for the Qualcomm SoC.
* tag 'qcom-arm64-fixes-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
arm64: dts: qcom: msm8992-*: Fix vdd_lvs1_2-supply typo
MAINTAINERS: Add myself as a reviewer for Qualcomm ARM/64 support
arm64: dts: qcom: sdm845: use dispcc AHB clock for mdss node
arm64: dts: qcom: sm8450 add ITS device tree node
arm64: dts: qcom: msm8994: Fix CPU6/7 reg values
arm64: dts: qcom: sm8450: fix interconnects property of UFS node
arm64: dts: qcom: Remove duplicate sc7180-trogdor include on lazor/homestar
Link: https://lore.kernel.org/r/20220703030208.408109-1-bjorn.andersson@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Delete the node fixed clock managed by secure world with SCMI.
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@foss.st.com>
Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com>