commits
Pull Kbuild fixes from Masahiro Yamada:
- Remove unused scripts/gcc-ld script
- Add zstd support to scripts/extract-ikconfig
- Check 'make headers' for UML
- Fix scripts/mksysmap to ignore local symbols
* tag 'kbuild-fixes-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
mksysmap: Fix the mismatch of 'L0' symbols in System.map
kbuild: disable header exports for UML in a straightforward way
scripts/extract-ikconfig: add zstd compression support
scripts: remove obsolete gcc-ld script
Pull arm64 fixes from Will Deacon:
"Three small arm64 fixes, all related to optional architecture
extensions: BTI, SME and 52-bit virtual addressing:
- Disable in-kernel BTI when compiling with GCC, as it makes invalid
assumptions about the distance between functions which has led to
crashes when calling modules on a CPU with BTI support
- Remove bogus TIF_SME flag management if memory allocation fails in
the ptrace code
- Fix the resume path when configured for 52-bit virtual addressing"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: fix resume for 52-bit enabled builds
arm64/ptrace: Don't clear calling process' TIF_SME on OOM
arm64/bti: Disable in kernel BTI when cross section thunks are broken
When System.map was generated, the kernel used mksysmap to filter the
kernel symbols, we need to filter "L0" symbols in LoongArch architecture.
$ cat System.map | grep L0
9000000000221540 t L0
The L0 symbol exists in System.map, but not in .tmp_System.map. When
"cmp -s System.map .tmp_System.map" will show "Inconsistent kallsyms
data" error message in link-vmlinux.sh script.
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Pull i2c fixes from Wolfram Sang:
"Only documentation and DT binding fixes and improvements"
* tag 'i2c-for-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
dt-bindings: i2c: renesas,riic: Fix 'unevaluatedProperties' warnings
docs: i2c: piix4: Fix typos, add markup, drop link
docs: i2c: i2c-topology: reorder sections more logically
docs: i2c: i2c-topology: fix incorrect heading
docs: i2c: i2c-topology: fix typo
__cpu_setup() was changed to take the actual number of VA bits in x0,
however the resume path was not updated at the same time.
Load `vabits_actual` in the resume path, to ensure that the correct
number of VA bits is used.
This fixes booting v6.0-rc kernels on my Juno.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Fixes: 0aaa68532e9d ("arm64: mm: fix booting with 52-bit address space")
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220909124311.38489-1-joey.gouly@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Previously 'make ARCH=um headers' stopped because of missing
arch/um/include/uapi/asm/Kbuild.
The error is not shown since commit ed102bf2afed ("um: Fix W=1
missing-include-dirs warnings") added arch/um/include/uapi/asm/Kbuild.
Hard-code the unsupported architecture, so it works like before.
Fixes: ed102bf2afed ("um: Fix W=1 missing-include-dirs warnings")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Richard Weinberger <richard@nod.at>
Pull iommu fixes from Joerg Roedel:
- Intel VT-d fixes from Lu Baolu:
- Boot kdump kernels with VT-d scalable mode on
- Calculate the right page table levels
- Fix two recursive locking issues
- Fix a lockdep splat issue
- AMD IOMMU fixes:
- Fix for completion-wait command to use full 64 bits of data
- Fix PASID related issue where GPU sound devices failed to
initialize
- Fix for Virtio-IOMMU to report correct caching behavior, needed for
use with VFIO
* tag 'iommu-fixes-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu: Fix false ownership failure on AMD systems with PASID activated
iommu/vt-d: Fix possible recursive locking in intel_iommu_init()
iommu/virtio: Fix interaction with VFIO
iommu/vt-d: Fix lockdep splat due to klist iteration in atomic context
iommu/vt-d: Fix recursive lock issue in iommu_flush_dev_iotlb()
iommu/vt-d: Correctly calculate sagaw value of IOMMU
iommu/vt-d: Fix kdump kernels boot failure with scalable mode
iommu/amd: use full 64-bit value in build_completion_wait()
With 'unevaluatedProperties' support implemented, there's a number of
warnings when running dtbs_check:
arch/arm64/boot/dts/renesas/r9a07g043u11-smarc.dtb: i2c@10058000: Unevaluated properties are not allowed ('resets' was unexpected)
From schema: Documentation/devicetree/bindings/i2c/renesas,riic.yaml
The main problem is that bindings schema marks resets as a required
property for RZ/G2L (and alike) SoC's but resets property is not part
of schema. So to fix this just add a resets property with maxItems
set to 1.
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
If allocating memory for the target SVE state in za_set() fails we clear
TIF_SME for the ptracing task which is obviously not correct. If we are
here we know that the target task already had neither TIF_SVE nor
TIF_SME set since we only need to allocate if either the target had not
used either SVE or SME and had no need to allocate state before or we
just changed the vector length with vec_set_vector_length() which clears
TIF_ for us on allocation failure so just remove the clear entirely.
Reported-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220902132802.39682-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Add extract-ikconfig support for kernel images compressed with zstd.
Signed-off-by: Thitat Auareesuksakul <thitat@flux.ci>
Tested-by: Nicolas Schier <n.schier@avm.de>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Pull MIPS fixes from Thomas Bogendoerfer:
- fix for loongson32 starup hang
- fix for octeon irq setup problem
- fix compiler warning for new CONFIG option
- switch to SPARSEMEM_EXTREME for all platforms selecting SPARSEMEM
* tag 'mips-fixes_6.0_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
mips: Select SPARSEMEM_EXTREME
MIPS: OCTEON: irq: Fix octeon_irq_force_ciu_mapping()
MIPS: octeon: Get rid of preprocessor directives around RESERVE32
MIPS: loongson32: ls1c: Fix hang during startup
The AMD IOMMU driver cannot activate PASID mode on a RID without the RID's
translation being set to IDENTITY. Further it requires changing the RID's
page table layout from the normal v1 IOMMU_DOMAIN_IDENTITY layout to a
different v2 layout.
It does this by creating a new iommu_domain, configuring that domain for
v2 identity operation and then attaching it to the group, from within the
driver. This logic assumes the group is already set to the IDENTITY domain
and is being used by the DMA API.
However, since the ownership logic is based on the group's domain pointer
equaling the default domain to detect DMA API ownership, this causes it to
look like the group is not attached to the DMA API any more. This blocks
attaching drivers to any other devices in the group.
In a real system this manifests itself as the HD-audio devices on some AMD
platforms losing their device drivers.
Work around this unique behavior of the AMD driver by checking for
equality of IDENTITY domains based on their type, not their pointer
value. This allows the AMD driver to have two IDENTITY domains for
internal purposes without breaking the check.
Have the AMD driver properly declare that the special domain it created is
actually an IDENTITY domain.
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: stable@vger.kernel.org
Fixes: 512881eacfa7 ("bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management")
Reported-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/0-v1-ea566e16b06b+811-amd_owner_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
[JD: Update the subject
One more typo fixed
Drop the link to lm-sensors' README, it's irrelevant]
Signed-off-by: Bruce Duncan <bwduncan@gmail.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
GCC does not insert a `bti c` instruction at the beginning of a function
when it believes that all callers reach the function through a direct
branch[1]. Unfortunately the logic it uses to determine this is not
sufficiently robust, for example not taking account of functions being
placed in different sections which may be loaded separately, so we may
still see thunks being generated to these functions. If that happens,
the first instruction in the callee function will result in a Branch
Target Exception due to the missing landing pad.
While this has currently only been observed in the case of modules
having their main code loaded sufficiently far from their init section
to require thunks it could potentially happen for other cases so the
safest thing is to disable BTI for the kernel when building with an
affected toolchain.
[1]: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671
Reported-by: D Scott Phillips <scott@os.amperecomputing.com>
[Bits of the commit message are lifted from his report & workaround]
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220905142255.591990-1-broonie@kernel.org
Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Will Deacon <will@kernel.org>
Since commit 8564ed2b3888 ("Kbuild, lto: Add a gcc-ld script to let run gcc
as ld") in 2014, there was not specific work on this the gcc-ld script
other than treewide clean-ups.
There are no users within the kernel tree, and probably no out-of-tree
users either, and there is no dedicated maintainer in MAINTAINERS.
Delete this obsolete gcc-ld script.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Pull s390 fixes from Vasily Gorbik:
- Fix absolute zero lowcore corruption on kdump when CPU0 is offline
- Fix lowcore protection setup for offline CPU restart
* tag 's390-6.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/smp: enforce lowcore protection on CPU restart
s390/boot: fix absolute zero lowcore corruption on boot
Commit c46173183657 ("MIPS: Add NUMA support for Loongson-3") has increased
.bss size of the Octeon kernel from 16k to 16M. Providing the conditions
for SPARSEMEM_EXTREME avoids the waste of memory.
Thomas has tested the loogsoon64 kernel, where .bss is being reduced by
this patch from 16.5M to 515k.
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac932
("iommu/vt-d: Introduce a rwsem to protect global data structures"). It
is used to protect DMAR related global data from DMAR hotplug operations.
The dmar_global_lock used in the intel_iommu_init() might cause recursive
locking issue, for example, intel_iommu_get_resv_regions() is taking the
dmar_global_lock from within a section where intel_iommu_init() already
holds it via probe_acpi_namespace_devices().
Using dmar_global_lock in intel_iommu_init() could be relaxed since it is
unlikely that any IO board must be hot added before the IOMMU subsystem is
initialized. This eliminates the possible recursive locking issue by moving
down DMAR hotplug support after the IOMMU is initialized and removing the
uses of dmar_global_lock in intel_iommu_init().
Fixes: d5692d4af08cd ("iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices()")
Reported-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/894db0ccae854b35c73814485569b634237b5538.1657034828.git.robin.murphy@arm.com
Link: https://lore.kernel.org/r/20220718235325.3952426-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The sequence of sections is a bit confusing here:
* we list the mux locking scheme for existing drivers before introducing
what mux locking schemes are
* we list the caveats for each locking scheme (which are tricky) before
the example of the simple use case
Restructure it entirely with the following logic:
* Intro ("I2C muxes and complex topologies")
* Locking
- mux-locked
- example
- caveats
- parent-locked
- example
- caveats
* Complex examples
* Mux type of existing device drivers
While there, also apply some other improvements:
* convert the caveat list from a table (with only one column carrying
content) to a bullet list.
* add a small introductory text to bridge the gap from listing the use
cases to telling about the hardware components to handle them and then
the device drivers that implement those.
* make empty lines usage more uniform
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Acked-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
The logic that conditionally allocates one additional page at each
swapper page table level if KASLR is enabled is also applied to the
initial ID map, now that we have started using the same set of macros
to allocate the space for it.
However, the placement of the kernel in physical memory might result in
additional pages being needed at any level, even if KASLR is disabled in
the build. So account for this in the computation.
Fixes: c3cee924bd85 ("arm64: head: cover entire kernel image in initial ID map")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220826164800.2059148-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Pull hwmon fixes from Guenter Roeck:
- Fix severe regression in asus-ec-sensors driver
which resulted in EC driver failures
- Fix various bugs in mr75203 driver
- Fix byte order bug in tps23861 driver
* tag 'hwmon-for-v6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (asus-ec-sensors) autoload module via DMI data
hwmon: (mr75203) enable polling for all VM channels
hwmon: (mr75203) fix multi-channel voltage reading
hwmon: (mr75203) fix voltage equation for negative source input
hwmon: (mr75203) update pvt->v_num and vm_num to the actual number of used sensors
hwmon: (mr75203) fix VM sensor allocation when "intel,vm-map" not defined
dt-bindings: hwmon: (mr75203) fix "intel,vm-map" property to be optional
hwmon: (tps23861) fix byte order in resistance register
As result of commit 915fea04f932 ("s390/smp: enable DAT before
CPU restart callback is called") the low-address protection bit
gets mistakenly unset in control register 0 save area of the
absolute zero memory. That area is used when manual PSW restart
happened to hit an offline CPU. In this case the low-address
protection for that CPU will be dropped.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Fixes: 915fea04f932 ("s390/smp: enable DAT before CPU restart callback is called")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
For irq_domain_associate() to work the virq descriptor has to be
pre-allocated in advance. Otherwise the following happens:
WARNING: CPU: 0 PID: 0 at .../kernel/irq/irqdomain.c:527 irq_domain_associate+0x298/0x2e8
error: virq128 is not allocated
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.78-... #1
...
Call Trace:
[<ffffffff801344c4>] show_stack+0x9c/0x130
[<ffffffff80769550>] dump_stack+0x90/0xd0
[<ffffffff801576d0>] __warn+0x118/0x130
[<ffffffff80157734>] warn_slowpath_fmt+0x4c/0x70
[<ffffffff801b83c0>] irq_domain_associate+0x298/0x2e8
[<ffffffff80a43bb8>] octeon_irq_init_ciu+0x4c8/0x53c
[<ffffffff80a76cbc>] of_irq_init+0x1e0/0x388
[<ffffffff80a452cc>] init_IRQ+0x4c/0xf4
[<ffffffff80a3cc00>] start_kernel+0x404/0x698
Use irq_alloc_desc_at() to avoid the above problem.
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Commit e8ae0e140c05 ("vfio: Require that devices support DMA cache
coherence") requires IOMMU drivers to advertise
IOMMU_CAP_CACHE_COHERENCY, in order to be used by VFIO. Since VFIO does
not provide to userspace the ability to maintain coherency through cache
invalidations, it requires hardware coherency. Advertise the capability
in order to restore VFIO support.
The meaning of IOMMU_CAP_CACHE_COHERENCY also changed from "IOMMU can
enforce cache coherent DMA transactions" to "IOMMU_CACHE is supported".
While virtio-iommu cannot enforce coherency (of PCIe no-snoop
transactions), it does support IOMMU_CACHE.
We can distinguish different cases of non-coherent DMA:
(1) When accesses from a hardware endpoint are not coherent. The host
would describe such a device using firmware methods ('dma-coherent'
in device-tree, '_CCA' in ACPI), since they are also needed without
a vIOMMU. In this case mappings are created without IOMMU_CACHE.
virtio-iommu doesn't need any additional support. It sends the same
requests as for coherent devices.
(2) When the physical IOMMU supports non-cacheable mappings. Supporting
those would require a new feature in virtio-iommu, new PROBE request
property and MAP flags. Device drivers would use a new API to
discover this since it depends on the architecture and the physical
IOMMU.
(3) When the hardware supports PCIe no-snoop. It is possible for
assigned PCIe devices to issue no-snoop transactions, and the
virtio-iommu specification is lacking any mention of this.
Arm platforms don't necessarily support no-snoop, and those that do
cannot enforce coherency of no-snoop transactions. Device drivers
must be careful about assuming that no-snoop transactions won't end
up cached; see commit e02f5c1bb228 ("drm: disable uncached DMA
optimization for ARM and arm64"). On x86 platforms, the host may or
may not enforce coherency of no-snoop transactions with the physical
IOMMU. But according to the above commit, on x86 a driver which
assumes that no-snoop DMA is compatible with uncached CPU mappings
will also work if the host enforces coherency.
Although these issues are not specific to virtio-iommu, it could be
used to facilitate discovery and configuration of no-snoop. This
would require a new feature bit, PROBE property and ATTACH/MAP
flags.
Cc: stable@vger.kernel.org
Fixes: e8ae0e140c05 ("vfio: Require that devices support DMA cache coherence")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20220825154622.86759-1-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
"Etc" here was never meant to be a heading, it became one while converting
to ReST.
It would be easy to just convert it to plain text, but rather remove it and
add an introductory text before the list that conveys the same meaning but
with a better reading flow.
Fixes: ccf988b66d69 ("docs: i2c: convert to ReST and add to driver-api bookset")
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Acked-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
The platform_get_irq() returns negative error codes. It can't actually
return zero.
Signed-off-by: Yu Zhe <yuzhe@nfschina.com>
Link: https://lore.kernel.org/r/20220825011844.8536-1-yuzhe@nfschina.com
Signed-off-by: Will Deacon <will@kernel.org>
Pull more hotfixes from Andrew Morton:
"Seventeen hotfixes. Mostly memory management things.
Ten patches are cc:stable, addressing pre-6.0 issues"
* tag 'mm-hotfixes-stable-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
.mailmap: update Luca Ceresoli's e-mail address
mm/mprotect: only reference swap pfn page if type match
squashfs: don't call kmalloc in decompressors
mm/damon/dbgfs: avoid duplicate context directory creation
mailmap: update email address for Colin King
asm-generic: sections: refactor memory_intersects
bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem
ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown
Revert "memcg: cleanup racy sum avoidance code"
mm/zsmalloc: do not attempt to free IS_ERR handle
binder_alloc: add missing mmap_lock calls when using the VMA
mm: re-allow pinning of zero pfns (again)
vmcoreinfo: add kallsyms_num_syms symbol
mailmap: update Guilherme G. Piccoli's email addresses
writeback: avoid use-after-free after removing device
shmem: update folio if shmem_replace_page() updates the page
mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte
Pull dma-mapping fixes from Christoph Hellwig:
- revert a panic on swiotlb initialization failure (Yu Zhao)
- fix the lookup for partial syncs in dma-debug (Robin Murphy)
- fix a shift overflow in swiotlb (Chao Gao)
- fix a comment typo in swiotlb (Chao Gao)
- mark a function static now that all abusers are gone (Christoph
Hellwig)
* tag 'dma-mapping-6.0-2022-09-10' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: mark dma_supported static
swiotlb: fix a typo
swiotlb: avoid potential left shift overflow
dma-debug: improve search for partial syncs
Revert "swiotlb: panic if nslabs is too small"
Replace autoloading data based on the ACPI EC device with the DMI
records for motherboards models. The ACPI method created a bug that when
this driver returns error from the probe function because of the
unsupported motherboard model, the ACPI subsystem concludes
that the EC device does not work properly.
Fixes: 5cd29012028d ("hwmon: (asus-ec-sensors) introduce ec_board_info struct for board data")
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=216412
Bug: https://bugzilla.redhat.com/show_bug.cgi?id=2121844
Signed-off-by: Eugene Shalygin <eugene.shalygin@gmail.com>
Link: https://lore.kernel.org/r/20220909155654.123398-2-eugene.shalygin@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Crash dump always starts on CPU0. In case CPU0 is offline the
prefix page is not installed and the absolute zero lowcore is
used. However, struct lowcore::mcesad is never assigned and
stays zero. That leads to __machine_kdump() -> save_vx_regs()
call silently stores vector registers to the absolute lowcore
at 0x11b0 offset.
Fixes: a62bc0739253 ("s390/kdump: add support for vector extension")
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Some of them were pointless because CONFIG_CAVIUM_RESERVE32 is now always
defined, some were not enough (Yu Zhao reported
"Failed to allocate CAVIUM_RESERVE32 memory area" error).
Removing the directives allows for compiler coverage of RESERVE32 code and
replacing one of [always-true] "ifdef" with a compiler conditional fixes
the [cosmetic] error message.
Fixes: 3e3114ac460e ("MIPS: Introduce CAVIUM_RESERVE32 Kconfig option")
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
With CONFIG_INTEL_IOMMU_DEBUGFS enabled, below lockdep splat are seen
when an I/O fault occurs on a machine with an Intel IOMMU in it.
DMAR: DRHD: handling fault status reg 3
DMAR: [DMA Write NO_PASID] Request device [00:1a.0] fault addr 0x0
[fault reason 0x05] PTE Write access is not set
DMAR: Dump dmar0 table entries for IOVA 0x0
DMAR: root entry: 0x0000000127f42001
DMAR: context entry: hi 0x0000000000001502, low 0x000000012d8ab001
================================
WARNING: inconsistent lock state
5.20.0-0.rc0.20220812git7ebfc85e2cd7.10.fc38.x86_64 #1 Not tainted
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
rngd/1006 [HC1[1]:SC0[0]:HE0:SE1] takes:
ff177021416f2d78 (&k->k_lock){?.+.}-{2:2}, at: klist_next+0x1b/0x160
{HARDIRQ-ON-W} state was registered at:
lock_acquire+0xce/0x2d0
_raw_spin_lock+0x33/0x80
klist_add_tail+0x46/0x80
bus_add_device+0xee/0x150
device_add+0x39d/0x9a0
add_memory_block+0x108/0x1d0
memory_dev_init+0xe1/0x117
driver_init+0x43/0x4d
kernel_init_freeable+0x1c2/0x2cc
kernel_init+0x16/0x140
ret_from_fork+0x1f/0x30
irq event stamp: 7812
hardirqs last enabled at (7811): [<ffffffff85000e86>] asm_sysvec_apic_timer_interrupt+0x16/0x20
hardirqs last disabled at (7812): [<ffffffff84f16894>] irqentry_enter+0x54/0x60
softirqs last enabled at (7794): [<ffffffff840ff669>] __irq_exit_rcu+0xf9/0x170
softirqs last disabled at (7787): [<ffffffff840ff669>] __irq_exit_rcu+0xf9/0x170
The klist iterator functions using spin_*lock_irq*() but the klist
insertion functions using spin_*lock(), combined with the Intel DMAR
IOMMU driver iterating over klists from atomic (hardirq) context, where
pci_get_domain_bus_and_slot() calls into bus_find_device() which iterates
over klists.
As currently there's no plan to fix the klist to make it safe to use in
atomic context, this fixes the lockdep splat by avoid calling
pci_get_domain_bus_and_slot() in the hardirq context.
Fixes: 8ac0b64b9735 ("iommu/vt-d: Use pci_get_domain_bus_and_slot() in pgtable_walk()")
Reported-by: Lennert Buytenhek <buytenh@wantstofly.org>
Link: https://lore.kernel.org/linux-iommu/Yvo2dfpEh%2FWC+Wrr@wantstofly.org/
Link: https://lore.kernel.org/linux-iommu/YvyBdPwrTuHHbn5X@wantstofly.org/
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220819015949.4795-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
"intension" should have probably been "intention", however "intent" seems
even better.
Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Acked-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Even non-KASLR kernels can be built as relocatable, to work around
broken bootloaders that violate the rules regarding physical placement
of the kernel image - in this case, the physical offset modulo 2 MiB is
used as the KASLR offset, and all absolute symbol references are fixed
up in the usual way. This workaround is enabled by default.
CONFIG_RELOCATABLE can also be disabled entirely, in which case the
relocation code and the code that captures the offset are omitted from
the build. However, since commit aacd149b6238 ("arm64: head: avoid
relocating the kernel twice for KASLR"), this code got out of sync, and
we still add the offset to the kernel virtual address before populating
the page tables even though we never capture it. This means we add a
bogus value instead, breaking the boot entirely.
Fixes: aacd149b6238 ("arm64: head: avoid relocating the kernel twice for KASLR")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Mikulas Patocka <mpatocka@redhat.com>
Link: https://lore.kernel.org/r/20220827070904.2216989-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Pull bitmap fixes from Yury Norov:
"Fix the reported issues, and implements the suggested improvements,
for the version of the cpumask tests [1] that was merged with commit
c41e8866c28c ("lib/test: introduce cpumask KUnit test suite").
These changes include fixes for the tests, and better alignment with
the KUnit style guidelines"
* tag 'bitmap-6.0-rc3' of github.com:/norov/linux:
lib/cpumask_kunit: add tests file to MAINTAINERS
lib/cpumask_kunit: log mask contents
lib/test_cpumask: follow KUnit style guidelines
lib/test_cpumask: fix cpu_possible_mask last test
lib/test_cpumask: drop cpu_possible_mask full test
My Bootlin address is preferred from now on.
Link: https://lkml.kernel.org/r/20220826130515.3011951-1-luca.ceresoli@bootlin.com
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Atish Patra <atishp@atishpatra.org>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pull SCSI fixes from James Bottomley:
"Eight patches which looks like quite a large core change, but most of
the diffstat is reverting the attempt to rejig reference counting
introduced in the last merge window which caused issues with device
and module removal.
Of the remaining four patches, only the fix use-after-free is
substantial"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mpt3sas: Fix use-after-free warning
scsi: core: Fix a use-after-free
scsi: core: Revert "Make sure that targets outlive devices"
scsi: core: Revert "Make sure that hosts outlive targets"
scsi: core: Revert "Simplify LLD module reference counting"
scsi: core: Revert "Call blk_mq_free_tag_set() earlier"
scsi: lpfc: Add missing destroy_workqueue() in error path
scsi: lpfc: Return DID_TRANSPORT_DISRUPTED instead of DID_REQUEUE
Now that the remaining users in drivers are gone, this function can be
marked static.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Configure ip-polling register to enable polling for all voltage monitor
channels.
This enables reading the voltage values for all inputs other than just
input 0.
Fixes: 9d823351a337 ("hwmon: Add hardware monitoring driver for Moortec MR75203 PVT controller")
Signed-off-by: Eliav Farber <farbere@amazon.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220908152449.35457-7-farbere@amazon.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The alignment check in prepare_hugepage_range() is wrong for 2 GB
hugepages, it only checks for 1 MB hugepage alignment.
This can result in kernel crash in __unmap_hugepage_range() at the
BUG_ON(start & ~huge_page_mask(h)) alignment check, for mappings
created with MAP_FIXED at unaligned address.
Fix this by correctly handling multiple hugepage sizes, similar to the
generic version of prepare_hugepage_range().
Fixes: d08de8e2d867 ("s390/mm: add support for 2GB hugepages")
Cc: <stable@vger.kernel.org> # 4.8+
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
The RTCCTRL reg of LS1C is obselete.
Writing this reg will cause system hang.
Fixes: 60219c563c9b6 ("MIPS: Add RTC support for Loongson1C board")
Signed-off-by: Yang Ling <gnaygnil@gmail.com>
Tested-by: Keguang Zhang <keguang.zhang@gmail.com>
Acked-by: Keguang Zhang <keguang.zhang@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
The per domain spinlock is acquired in iommu_flush_dev_iotlb(), which
is possbile to be called in the interrupt context. For example, the
drm-intel's CI system got completely blocked with below error:
WARNING: inconsistent lock state
6.0.0-rc1-CI_DRM_11990-g6590d43d39b9+ #1 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/6/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
ffff88810440d678 (&domain->lock){+.?.}-{2:2}, at: iommu_flush_dev_iotlb.part.61+0x23/0x80
{SOFTIRQ-ON-W} state was registered at:
lock_acquire+0xd3/0x310
_raw_spin_lock+0x2a/0x40
domain_update_iommu_cap+0x20b/0x2c0
intel_iommu_attach_device+0x5bd/0x860
__iommu_attach_device+0x18/0xe0
bus_iommu_probe+0x1f3/0x2d0
bus_set_iommu+0x82/0xd0
intel_iommu_init+0xe45/0x102a
pci_iommu_init+0x9/0x31
do_one_initcall+0x53/0x2f0
kernel_init_freeable+0x18f/0x1e1
kernel_init+0x11/0x120
ret_from_fork+0x1f/0x30
irq event stamp: 162354
hardirqs last enabled at (162354): [<ffffffff81b59274>] _raw_spin_unlock_irqrestore+0x54/0x70
hardirqs last disabled at (162353): [<ffffffff81b5901b>] _raw_spin_lock_irqsave+0x4b/0x50
softirqs last enabled at (162338): [<ffffffff81e00323>] __do_softirq+0x323/0x48e
softirqs last disabled at (162349): [<ffffffff810c1588>] irq_exit_rcu+0xb8/0xe0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&domain->lock);
<Interrupt>
lock(&domain->lock);
*** DEADLOCK ***
1 lock held by swapper/6/0:
This coverts the spin_lock/unlock() into the irq save/restore varieties
to fix the recursive locking issues.
Fixes: ffd5869d93530 ("iommu/vt-d: Replace spin_lock_irqsave() with spin_lock()")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20220817025650.3253959-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Like crashk_res, Calling crash_exclude_mem_range function with
crashk_low_res area would need extra crash_mem range too.
Add one more extra cmem slot in case of crashk_low_res is used.
Signed-off-by: Levi Yun <ppbuk5246@gmail.com>
Fixes: 944a45abfabc ("arm64: kdump: Reimplement crashkernel=X")
Cc: <stable@vger.kernel.org> # 5.19.x
Acked-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220831103913.12661-1-ppbuk5246@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
Pull btrfs fixes from David Sterba:
"Fixes:
- check that subvolume is writable when changing xattrs from security
namespace
- fix memory leak in device lookup helper
- update generation of hole file extent item when merging holes
- fix space cache corruption and potential double allocations; this
is a rare bug but can be serious once it happens, stable backports
and analysis tool will be provided
- fix error handling when deleting root references
- fix crash due to assert when attempting to cancel suspended device
replace, add message what to do if mount fails due to missing
replace item
Regressions:
- don't merge pages into bio if their page offset is not contiguous
- don't allow large NOWAIT direct reads, this could lead to short
reads eg. in io_uring"
* tag 'for-6.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: add info when mount fails due to stale replace target
btrfs: replace: drop assert for suspended replace
btrfs: fix silent failure when deleting root reference
btrfs: fix space cache corruption and potential double allocations
btrfs: don't allow large NOWAIT direct reads
btrfs: don't merge pages into bio if their page offset is not contiguous
btrfs: update generation of hole file extent item when merging holes
btrfs: fix possible memory leak in btrfs_get_dev_args_from_path()
btrfs: check if root is readonly while setting security xattr
cpumask related files are listed under the BITMAP API section, so the
file with tests for cpumask should be added to that list.
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Reviewed-by: David Gow <davidgow@google.com>
Acked-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Yu Zhao reported a bug after the commit "mm/swap: Add swp_offset_pfn() to
fetch PFN from swap entry" added a check in swp_offset_pfn() for swap type [1]:
kernel BUG at include/linux/swapops.h:117!
CPU: 46 PID: 5245 Comm: EventManager_De Tainted: G S O L 6.0.0-dbg-DEV #2
RIP: 0010:pfn_swap_entry_to_page+0x72/0xf0
Code: c6 48 8b 36 48 83 fe ff 74 53 48 01 d1 48 83 c1 08 48 8b 09 f6
c1 01 75 7b 66 90 48 89 c1 48 8b 09 f6 c1 01 74 74 5d c3 eb 9e <0f> 0b
48 ba ff ff ff ff 03 00 00 00 eb ae a9 ff 0f 00 00 75 13 48
RSP: 0018:ffffa59e73fabb80 EFLAGS: 00010282
RAX: 00000000ffffffe8 RBX: 0c00000000000000 RCX: ffffcd5440000000
RDX: 1ffffffffff7a80a RSI: 0000000000000000 RDI: 0c0000000000042b
RBP: ffffa59e73fabb80 R08: ffff9965ca6e8bb8 R09: 0000000000000000
R10: ffffffffa5a2f62d R11: 0000030b372e9fff R12: ffff997b79db5738
R13: 000000000000042b R14: 0c0000000000042b R15: 1ffffffffff7a80a
FS: 00007f549d1bb700(0000) GS:ffff99d3cf680000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000440d035b3180 CR3: 0000002243176004 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
change_pte_range+0x36e/0x880
change_p4d_range+0x2e8/0x670
change_protection_range+0x14e/0x2c0
mprotect_fixup+0x1ee/0x330
do_mprotect_pkey+0x34c/0x440
__x64_sys_mprotect+0x1d/0x30
It triggers because pfn_swap_entry_to_page() could be called upon e.g. a
genuine swap entry.
Fix it by only calling it when it's a write migration entry where the page*
is used.
[1] https://lore.kernel.org/lkml/CAOUHufaVC2Za-p8m0aiHw6YkheDcrO-C3wRGixwDS32VTS+k1w@mail.gmail.com/
Link: https://lkml.kernel.org/r/20220823221138.45602-1-peterx@redhat.com
Fixes: 6c287605fd56 ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Yu Zhao <yuzhao@google.com>
Tested-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pull driver core fixes from Greg KH:
"Here are some small driver core and debugfs fixes for 6.0-rc5.
Included in here are:
- multiple attempts to get the arch_topology code to work properly on
non-cluster SMT systems. First attempt caused build breakages in
linux-next and 0-day, second try worked.
- debugfs fixes for a long-suffering memory leak. The pattern of
debugfs_remove(debugfs_lookup(...)) turns out to leak dentries, so
add debugfs_lookup_and_remove() to fix this problem. Also fix up
the scheduler debug code that highlighted this problem. Fixes for
other subsystems will be trickling in over the next few months for
this same issue once the debugfs function is merged.
All of these have been in linux-next since Wednesday with no reported
problems"
* tag 'driver-core-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
arch_topology: Make cluster topology span at least SMT CPUs
sched/debug: fix dentry leak in update_sched_domain_debugfs
debugfs: add debugfs_lookup_and_remove()
driver core: fix driver_set_override() issue with empty strings
Revert "arch_topology: Make cluster topology span at least SMT CPUs"
arch_topology: Make cluster topology span at least SMT CPUs
Fix the following use-after-free warning which is observed during
controller reset:
refcount_t: underflow; use-after-free.
WARNING: CPU: 23 PID: 5399 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0
Link: https://lore.kernel.org/r/20220906134908.1039-2-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
"overwirte" isn't a word. It should be "overwrite".
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Fix voltage allocation and reading to support all channels in all VMs.
Prior to this change allocation and reading were done only for the first
channel in each VM.
This change counts the total number of channels for allocation, and takes
into account the channel offset when reading the sample data register.
Fixes: 9d823351a337 ("hwmon: Add hardware monitoring driver for Moortec MR75203 PVT controller")
Signed-off-by: Eliav Farber <farbere@amazon.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220908152449.35457-6-farbere@amazon.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
The Intel IOMMU driver possibly selects between the first-level and the
second-level translation tables for DMA address translation. However,
the levels of page-table walks for the 4KB base page size are calculated
from the SAGAW field of the capability register, which is only valid for
the second-level page table. This causes the IOMMU driver to stop working
if the hardware (or the emulated IOMMU) advertises only first-level
translation capability and reports the SAGAW field as 0.
This solves the above problem by considering both the first level and the
second level when calculating the supported page table levels.
Fixes: b802d070a52a1 ("iommu/vt-d: Use iova over first level")
Cc: stable@vger.kernel.org
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220817023558.3253263-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Currently as part of handling a SME access trap we flush the SVE register
state. This is not needed and would corrupt register state if the task has
access to the SVE registers already. For non-streaming mode accesses the
required flushing will be done in the SVE access trap. For streaming
mode SVE register accesses the architecture guarantees that the register
state will be flushed when streaming mode is entered or exited so there is
no need for us to do so. Simply remove the register initialisation.
Fixes: 8bd7f91c03d8 ("arm64/sme: Implement traps and syscall handling for SME")
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220817182324.638214-5-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Pull cfis fixes from Steve French:
- two locking fixes (zero range, punch hole)
- DFS 9 fix (padding), affecting some servers
- three minor cleanup changes
* tag '6.0-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Add helper function to check smb1+ server
cifs: Use help macro to get the mid header size
cifs: Use help macro to get the header preamble size
cifs: skip extra NULL byte in filenames
smb3: missing inode locks in punch hole
smb3: missing inode locks in zero range
If the replace target device reappears after the suspended replace is
cancelled, it blocks the mount operation as it can't find the matching
replace-item in the metadata. As shown below,
BTRFS error (device sda5): replace devid present without an active replace item
To overcome this situation, the user can run the command
btrfs device scan --forget <replace target device>
and try the mount command again. And also, to avoid repeating the issue,
superblock on the devid=0 must be wiped.
wipefs -a device-path-to-devid=0.
This patch adds some info when this situation occurs.
Reported-by: Samuel Greiner <samuel@balkonien.org>
Link: https://lore.kernel.org/linux-btrfs/b4f62b10-b295-26ea-71f9-9a5c9299d42c@balkonien.org/T/
CC: stable@vger.kernel.org # 5.0+
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For extra context, log the contents of the masks under test. This
should help with finding out why a certain test fails.
Link: https://lore.kernel.org/lkml/CABVgOSkPXBc-PWk1zBZRQ_Tt+Sz1ruFHBj3ixojymZF=Vi4tpQ@mail.gmail.com/
Suggested-by: David Gow <davidgow@google.com>
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
The decompressors may be called while in an atomic section. So move the
kmalloc() out of this path, and into the "page actor" init function.
This fixes a regression introduced by commit
f268eedddf35 ("squashfs: extend "page actor" to handle missing pages")
Link: https://lkml.kernel.org/r/20220822215430.15933-1-phillip@squashfs.org.uk
Fixes: f268eedddf35 ("squashfs: extend "page actor" to handle missing pages")
Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pull Kbuild fixes from Masahiro Yamada:
- Remove unused scripts/gcc-ld script
- Add zstd support to scripts/extract-ikconfig
- Check 'make headers' for UML
- Fix scripts/mksysmap to ignore local symbols
* tag 'kbuild-fixes-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
mksysmap: Fix the mismatch of 'L0' symbols in System.map
kbuild: disable header exports for UML in a straightforward way
scripts/extract-ikconfig: add zstd compression support
scripts: remove obsolete gcc-ld script
Pull arm64 fixes from Will Deacon:
"Three small arm64 fixes, all related to optional architecture
extensions: BTI, SME and 52-bit virtual addressing:
- Disable in-kernel BTI when compiling with GCC, as it makes invalid
assumptions about the distance between functions which has led to
crashes when calling modules on a CPU with BTI support
- Remove bogus TIF_SME flag management if memory allocation fails in
the ptrace code
- Fix the resume path when configured for 52-bit virtual addressing"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: fix resume for 52-bit enabled builds
arm64/ptrace: Don't clear calling process' TIF_SME on OOM
arm64/bti: Disable in kernel BTI when cross section thunks are broken
When System.map was generated, the kernel used mksysmap to filter the
kernel symbols, we need to filter "L0" symbols in LoongArch architecture.
$ cat System.map | grep L0
9000000000221540 t L0
The L0 symbol exists in System.map, but not in .tmp_System.map. When
"cmp -s System.map .tmp_System.map" will show "Inconsistent kallsyms
data" error message in link-vmlinux.sh script.
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Pull i2c fixes from Wolfram Sang:
"Only documentation and DT binding fixes and improvements"
* tag 'i2c-for-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
dt-bindings: i2c: renesas,riic: Fix 'unevaluatedProperties' warnings
docs: i2c: piix4: Fix typos, add markup, drop link
docs: i2c: i2c-topology: reorder sections more logically
docs: i2c: i2c-topology: fix incorrect heading
docs: i2c: i2c-topology: fix typo
__cpu_setup() was changed to take the actual number of VA bits in x0,
however the resume path was not updated at the same time.
Load `vabits_actual` in the resume path, to ensure that the correct
number of VA bits is used.
This fixes booting v6.0-rc kernels on my Juno.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Fixes: 0aaa68532e9d ("arm64: mm: fix booting with 52-bit address space")
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220909124311.38489-1-joey.gouly@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Previously 'make ARCH=um headers' stopped because of missing
arch/um/include/uapi/asm/Kbuild.
The error is not shown since commit ed102bf2afed ("um: Fix W=1
missing-include-dirs warnings") added arch/um/include/uapi/asm/Kbuild.
Hard-code the unsupported architecture, so it works like before.
Fixes: ed102bf2afed ("um: Fix W=1 missing-include-dirs warnings")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Richard Weinberger <richard@nod.at>
Pull iommu fixes from Joerg Roedel:
- Intel VT-d fixes from Lu Baolu:
- Boot kdump kernels with VT-d scalable mode on
- Calculate the right page table levels
- Fix two recursive locking issues
- Fix a lockdep splat issue
- AMD IOMMU fixes:
- Fix for completion-wait command to use full 64 bits of data
- Fix PASID related issue where GPU sound devices failed to
initialize
- Fix for Virtio-IOMMU to report correct caching behavior, needed for
use with VFIO
* tag 'iommu-fixes-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu: Fix false ownership failure on AMD systems with PASID activated
iommu/vt-d: Fix possible recursive locking in intel_iommu_init()
iommu/virtio: Fix interaction with VFIO
iommu/vt-d: Fix lockdep splat due to klist iteration in atomic context
iommu/vt-d: Fix recursive lock issue in iommu_flush_dev_iotlb()
iommu/vt-d: Correctly calculate sagaw value of IOMMU
iommu/vt-d: Fix kdump kernels boot failure with scalable mode
iommu/amd: use full 64-bit value in build_completion_wait()
With 'unevaluatedProperties' support implemented, there's a number of
warnings when running dtbs_check:
arch/arm64/boot/dts/renesas/r9a07g043u11-smarc.dtb: i2c@10058000: Unevaluated properties are not allowed ('resets' was unexpected)
From schema: Documentation/devicetree/bindings/i2c/renesas,riic.yaml
The main problem is that bindings schema marks resets as a required
property for RZ/G2L (and alike) SoC's but resets property is not part
of schema. So to fix this just add a resets property with maxItems
set to 1.
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
If allocating memory for the target SVE state in za_set() fails we clear
TIF_SME for the ptracing task which is obviously not correct. If we are
here we know that the target task already had neither TIF_SVE nor
TIF_SME set since we only need to allocate if either the target had not
used either SVE or SME and had no need to allocate state before or we
just changed the vector length with vec_set_vector_length() which clears
TIF_ for us on allocation failure so just remove the clear entirely.
Reported-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220902132802.39682-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Pull MIPS fixes from Thomas Bogendoerfer:
- fix for loongson32 starup hang
- fix for octeon irq setup problem
- fix compiler warning for new CONFIG option
- switch to SPARSEMEM_EXTREME for all platforms selecting SPARSEMEM
* tag 'mips-fixes_6.0_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
mips: Select SPARSEMEM_EXTREME
MIPS: OCTEON: irq: Fix octeon_irq_force_ciu_mapping()
MIPS: octeon: Get rid of preprocessor directives around RESERVE32
MIPS: loongson32: ls1c: Fix hang during startup
The AMD IOMMU driver cannot activate PASID mode on a RID without the RID's
translation being set to IDENTITY. Further it requires changing the RID's
page table layout from the normal v1 IOMMU_DOMAIN_IDENTITY layout to a
different v2 layout.
It does this by creating a new iommu_domain, configuring that domain for
v2 identity operation and then attaching it to the group, from within the
driver. This logic assumes the group is already set to the IDENTITY domain
and is being used by the DMA API.
However, since the ownership logic is based on the group's domain pointer
equaling the default domain to detect DMA API ownership, this causes it to
look like the group is not attached to the DMA API any more. This blocks
attaching drivers to any other devices in the group.
In a real system this manifests itself as the HD-audio devices on some AMD
platforms losing their device drivers.
Work around this unique behavior of the AMD driver by checking for
equality of IDENTITY domains based on their type, not their pointer
value. This allows the AMD driver to have two IDENTITY domains for
internal purposes without breaking the check.
Have the AMD driver properly declare that the special domain it created is
actually an IDENTITY domain.
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: stable@vger.kernel.org
Fixes: 512881eacfa7 ("bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management")
Reported-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/0-v1-ea566e16b06b+811-amd_owner_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
GCC does not insert a `bti c` instruction at the beginning of a function
when it believes that all callers reach the function through a direct
branch[1]. Unfortunately the logic it uses to determine this is not
sufficiently robust, for example not taking account of functions being
placed in different sections which may be loaded separately, so we may
still see thunks being generated to these functions. If that happens,
the first instruction in the callee function will result in a Branch
Target Exception due to the missing landing pad.
While this has currently only been observed in the case of modules
having their main code loaded sufficiently far from their init section
to require thunks it could potentially happen for other cases so the
safest thing is to disable BTI for the kernel when building with an
affected toolchain.
[1]: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671
Reported-by: D Scott Phillips <scott@os.amperecomputing.com>
[Bits of the commit message are lifted from his report & workaround]
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220905142255.591990-1-broonie@kernel.org
Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Will Deacon <will@kernel.org>
Since commit 8564ed2b3888 ("Kbuild, lto: Add a gcc-ld script to let run gcc
as ld") in 2014, there was not specific work on this the gcc-ld script
other than treewide clean-ups.
There are no users within the kernel tree, and probably no out-of-tree
users either, and there is no dedicated maintainer in MAINTAINERS.
Delete this obsolete gcc-ld script.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Pull s390 fixes from Vasily Gorbik:
- Fix absolute zero lowcore corruption on kdump when CPU0 is offline
- Fix lowcore protection setup for offline CPU restart
* tag 's390-6.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/smp: enforce lowcore protection on CPU restart
s390/boot: fix absolute zero lowcore corruption on boot
Commit c46173183657 ("MIPS: Add NUMA support for Loongson-3") has increased
.bss size of the Octeon kernel from 16k to 16M. Providing the conditions
for SPARSEMEM_EXTREME avoids the waste of memory.
Thomas has tested the loogsoon64 kernel, where .bss is being reduced by
this patch from 16.5M to 515k.
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac932
("iommu/vt-d: Introduce a rwsem to protect global data structures"). It
is used to protect DMAR related global data from DMAR hotplug operations.
The dmar_global_lock used in the intel_iommu_init() might cause recursive
locking issue, for example, intel_iommu_get_resv_regions() is taking the
dmar_global_lock from within a section where intel_iommu_init() already
holds it via probe_acpi_namespace_devices().
Using dmar_global_lock in intel_iommu_init() could be relaxed since it is
unlikely that any IO board must be hot added before the IOMMU subsystem is
initialized. This eliminates the possible recursive locking issue by moving
down DMAR hotplug support after the IOMMU is initialized and removing the
uses of dmar_global_lock in intel_iommu_init().
Fixes: d5692d4af08cd ("iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices()")
Reported-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/894db0ccae854b35c73814485569b634237b5538.1657034828.git.robin.murphy@arm.com
Link: https://lore.kernel.org/r/20220718235325.3952426-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The sequence of sections is a bit confusing here:
* we list the mux locking scheme for existing drivers before introducing
what mux locking schemes are
* we list the caveats for each locking scheme (which are tricky) before
the example of the simple use case
Restructure it entirely with the following logic:
* Intro ("I2C muxes and complex topologies")
* Locking
- mux-locked
- example
- caveats
- parent-locked
- example
- caveats
* Complex examples
* Mux type of existing device drivers
While there, also apply some other improvements:
* convert the caveat list from a table (with only one column carrying
content) to a bullet list.
* add a small introductory text to bridge the gap from listing the use
cases to telling about the hardware components to handle them and then
the device drivers that implement those.
* make empty lines usage more uniform
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Acked-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
The logic that conditionally allocates one additional page at each
swapper page table level if KASLR is enabled is also applied to the
initial ID map, now that we have started using the same set of macros
to allocate the space for it.
However, the placement of the kernel in physical memory might result in
additional pages being needed at any level, even if KASLR is disabled in
the build. So account for this in the computation.
Fixes: c3cee924bd85 ("arm64: head: cover entire kernel image in initial ID map")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220826164800.2059148-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Pull hwmon fixes from Guenter Roeck:
- Fix severe regression in asus-ec-sensors driver
which resulted in EC driver failures
- Fix various bugs in mr75203 driver
- Fix byte order bug in tps23861 driver
* tag 'hwmon-for-v6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (asus-ec-sensors) autoload module via DMI data
hwmon: (mr75203) enable polling for all VM channels
hwmon: (mr75203) fix multi-channel voltage reading
hwmon: (mr75203) fix voltage equation for negative source input
hwmon: (mr75203) update pvt->v_num and vm_num to the actual number of used sensors
hwmon: (mr75203) fix VM sensor allocation when "intel,vm-map" not defined
dt-bindings: hwmon: (mr75203) fix "intel,vm-map" property to be optional
hwmon: (tps23861) fix byte order in resistance register
As result of commit 915fea04f932 ("s390/smp: enable DAT before
CPU restart callback is called") the low-address protection bit
gets mistakenly unset in control register 0 save area of the
absolute zero memory. That area is used when manual PSW restart
happened to hit an offline CPU. In this case the low-address
protection for that CPU will be dropped.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Fixes: 915fea04f932 ("s390/smp: enable DAT before CPU restart callback is called")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
For irq_domain_associate() to work the virq descriptor has to be
pre-allocated in advance. Otherwise the following happens:
WARNING: CPU: 0 PID: 0 at .../kernel/irq/irqdomain.c:527 irq_domain_associate+0x298/0x2e8
error: virq128 is not allocated
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.78-... #1
...
Call Trace:
[<ffffffff801344c4>] show_stack+0x9c/0x130
[<ffffffff80769550>] dump_stack+0x90/0xd0
[<ffffffff801576d0>] __warn+0x118/0x130
[<ffffffff80157734>] warn_slowpath_fmt+0x4c/0x70
[<ffffffff801b83c0>] irq_domain_associate+0x298/0x2e8
[<ffffffff80a43bb8>] octeon_irq_init_ciu+0x4c8/0x53c
[<ffffffff80a76cbc>] of_irq_init+0x1e0/0x388
[<ffffffff80a452cc>] init_IRQ+0x4c/0xf4
[<ffffffff80a3cc00>] start_kernel+0x404/0x698
Use irq_alloc_desc_at() to avoid the above problem.
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Commit e8ae0e140c05 ("vfio: Require that devices support DMA cache
coherence") requires IOMMU drivers to advertise
IOMMU_CAP_CACHE_COHERENCY, in order to be used by VFIO. Since VFIO does
not provide to userspace the ability to maintain coherency through cache
invalidations, it requires hardware coherency. Advertise the capability
in order to restore VFIO support.
The meaning of IOMMU_CAP_CACHE_COHERENCY also changed from "IOMMU can
enforce cache coherent DMA transactions" to "IOMMU_CACHE is supported".
While virtio-iommu cannot enforce coherency (of PCIe no-snoop
transactions), it does support IOMMU_CACHE.
We can distinguish different cases of non-coherent DMA:
(1) When accesses from a hardware endpoint are not coherent. The host
would describe such a device using firmware methods ('dma-coherent'
in device-tree, '_CCA' in ACPI), since they are also needed without
a vIOMMU. In this case mappings are created without IOMMU_CACHE.
virtio-iommu doesn't need any additional support. It sends the same
requests as for coherent devices.
(2) When the physical IOMMU supports non-cacheable mappings. Supporting
those would require a new feature in virtio-iommu, new PROBE request
property and MAP flags. Device drivers would use a new API to
discover this since it depends on the architecture and the physical
IOMMU.
(3) When the hardware supports PCIe no-snoop. It is possible for
assigned PCIe devices to issue no-snoop transactions, and the
virtio-iommu specification is lacking any mention of this.
Arm platforms don't necessarily support no-snoop, and those that do
cannot enforce coherency of no-snoop transactions. Device drivers
must be careful about assuming that no-snoop transactions won't end
up cached; see commit e02f5c1bb228 ("drm: disable uncached DMA
optimization for ARM and arm64"). On x86 platforms, the host may or
may not enforce coherency of no-snoop transactions with the physical
IOMMU. But according to the above commit, on x86 a driver which
assumes that no-snoop DMA is compatible with uncached CPU mappings
will also work if the host enforces coherency.
Although these issues are not specific to virtio-iommu, it could be
used to facilitate discovery and configuration of no-snoop. This
would require a new feature bit, PROBE property and ATTACH/MAP
flags.
Cc: stable@vger.kernel.org
Fixes: e8ae0e140c05 ("vfio: Require that devices support DMA cache coherence")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20220825154622.86759-1-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
"Etc" here was never meant to be a heading, it became one while converting
to ReST.
It would be easy to just convert it to plain text, but rather remove it and
add an introductory text before the list that conveys the same meaning but
with a better reading flow.
Fixes: ccf988b66d69 ("docs: i2c: convert to ReST and add to driver-api bookset")
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Acked-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Pull more hotfixes from Andrew Morton:
"Seventeen hotfixes. Mostly memory management things.
Ten patches are cc:stable, addressing pre-6.0 issues"
* tag 'mm-hotfixes-stable-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
.mailmap: update Luca Ceresoli's e-mail address
mm/mprotect: only reference swap pfn page if type match
squashfs: don't call kmalloc in decompressors
mm/damon/dbgfs: avoid duplicate context directory creation
mailmap: update email address for Colin King
asm-generic: sections: refactor memory_intersects
bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem
ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown
Revert "memcg: cleanup racy sum avoidance code"
mm/zsmalloc: do not attempt to free IS_ERR handle
binder_alloc: add missing mmap_lock calls when using the VMA
mm: re-allow pinning of zero pfns (again)
vmcoreinfo: add kallsyms_num_syms symbol
mailmap: update Guilherme G. Piccoli's email addresses
writeback: avoid use-after-free after removing device
shmem: update folio if shmem_replace_page() updates the page
mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte
Pull dma-mapping fixes from Christoph Hellwig:
- revert a panic on swiotlb initialization failure (Yu Zhao)
- fix the lookup for partial syncs in dma-debug (Robin Murphy)
- fix a shift overflow in swiotlb (Chao Gao)
- fix a comment typo in swiotlb (Chao Gao)
- mark a function static now that all abusers are gone (Christoph
Hellwig)
* tag 'dma-mapping-6.0-2022-09-10' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: mark dma_supported static
swiotlb: fix a typo
swiotlb: avoid potential left shift overflow
dma-debug: improve search for partial syncs
Revert "swiotlb: panic if nslabs is too small"
Replace autoloading data based on the ACPI EC device with the DMI
records for motherboards models. The ACPI method created a bug that when
this driver returns error from the probe function because of the
unsupported motherboard model, the ACPI subsystem concludes
that the EC device does not work properly.
Fixes: 5cd29012028d ("hwmon: (asus-ec-sensors) introduce ec_board_info struct for board data")
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=216412
Bug: https://bugzilla.redhat.com/show_bug.cgi?id=2121844
Signed-off-by: Eugene Shalygin <eugene.shalygin@gmail.com>
Link: https://lore.kernel.org/r/20220909155654.123398-2-eugene.shalygin@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Crash dump always starts on CPU0. In case CPU0 is offline the
prefix page is not installed and the absolute zero lowcore is
used. However, struct lowcore::mcesad is never assigned and
stays zero. That leads to __machine_kdump() -> save_vx_regs()
call silently stores vector registers to the absolute lowcore
at 0x11b0 offset.
Fixes: a62bc0739253 ("s390/kdump: add support for vector extension")
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Some of them were pointless because CONFIG_CAVIUM_RESERVE32 is now always
defined, some were not enough (Yu Zhao reported
"Failed to allocate CAVIUM_RESERVE32 memory area" error).
Removing the directives allows for compiler coverage of RESERVE32 code and
replacing one of [always-true] "ifdef" with a compiler conditional fixes
the [cosmetic] error message.
Fixes: 3e3114ac460e ("MIPS: Introduce CAVIUM_RESERVE32 Kconfig option")
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
With CONFIG_INTEL_IOMMU_DEBUGFS enabled, below lockdep splat are seen
when an I/O fault occurs on a machine with an Intel IOMMU in it.
DMAR: DRHD: handling fault status reg 3
DMAR: [DMA Write NO_PASID] Request device [00:1a.0] fault addr 0x0
[fault reason 0x05] PTE Write access is not set
DMAR: Dump dmar0 table entries for IOVA 0x0
DMAR: root entry: 0x0000000127f42001
DMAR: context entry: hi 0x0000000000001502, low 0x000000012d8ab001
================================
WARNING: inconsistent lock state
5.20.0-0.rc0.20220812git7ebfc85e2cd7.10.fc38.x86_64 #1 Not tainted
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
rngd/1006 [HC1[1]:SC0[0]:HE0:SE1] takes:
ff177021416f2d78 (&k->k_lock){?.+.}-{2:2}, at: klist_next+0x1b/0x160
{HARDIRQ-ON-W} state was registered at:
lock_acquire+0xce/0x2d0
_raw_spin_lock+0x33/0x80
klist_add_tail+0x46/0x80
bus_add_device+0xee/0x150
device_add+0x39d/0x9a0
add_memory_block+0x108/0x1d0
memory_dev_init+0xe1/0x117
driver_init+0x43/0x4d
kernel_init_freeable+0x1c2/0x2cc
kernel_init+0x16/0x140
ret_from_fork+0x1f/0x30
irq event stamp: 7812
hardirqs last enabled at (7811): [<ffffffff85000e86>] asm_sysvec_apic_timer_interrupt+0x16/0x20
hardirqs last disabled at (7812): [<ffffffff84f16894>] irqentry_enter+0x54/0x60
softirqs last enabled at (7794): [<ffffffff840ff669>] __irq_exit_rcu+0xf9/0x170
softirqs last disabled at (7787): [<ffffffff840ff669>] __irq_exit_rcu+0xf9/0x170
The klist iterator functions using spin_*lock_irq*() but the klist
insertion functions using spin_*lock(), combined with the Intel DMAR
IOMMU driver iterating over klists from atomic (hardirq) context, where
pci_get_domain_bus_and_slot() calls into bus_find_device() which iterates
over klists.
As currently there's no plan to fix the klist to make it safe to use in
atomic context, this fixes the lockdep splat by avoid calling
pci_get_domain_bus_and_slot() in the hardirq context.
Fixes: 8ac0b64b9735 ("iommu/vt-d: Use pci_get_domain_bus_and_slot() in pgtable_walk()")
Reported-by: Lennert Buytenhek <buytenh@wantstofly.org>
Link: https://lore.kernel.org/linux-iommu/Yvo2dfpEh%2FWC+Wrr@wantstofly.org/
Link: https://lore.kernel.org/linux-iommu/YvyBdPwrTuHHbn5X@wantstofly.org/
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220819015949.4795-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Even non-KASLR kernels can be built as relocatable, to work around
broken bootloaders that violate the rules regarding physical placement
of the kernel image - in this case, the physical offset modulo 2 MiB is
used as the KASLR offset, and all absolute symbol references are fixed
up in the usual way. This workaround is enabled by default.
CONFIG_RELOCATABLE can also be disabled entirely, in which case the
relocation code and the code that captures the offset are omitted from
the build. However, since commit aacd149b6238 ("arm64: head: avoid
relocating the kernel twice for KASLR"), this code got out of sync, and
we still add the offset to the kernel virtual address before populating
the page tables even though we never capture it. This means we add a
bogus value instead, breaking the boot entirely.
Fixes: aacd149b6238 ("arm64: head: avoid relocating the kernel twice for KASLR")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Mikulas Patocka <mpatocka@redhat.com>
Link: https://lore.kernel.org/r/20220827070904.2216989-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Pull bitmap fixes from Yury Norov:
"Fix the reported issues, and implements the suggested improvements,
for the version of the cpumask tests [1] that was merged with commit
c41e8866c28c ("lib/test: introduce cpumask KUnit test suite").
These changes include fixes for the tests, and better alignment with
the KUnit style guidelines"
* tag 'bitmap-6.0-rc3' of github.com:/norov/linux:
lib/cpumask_kunit: add tests file to MAINTAINERS
lib/cpumask_kunit: log mask contents
lib/test_cpumask: follow KUnit style guidelines
lib/test_cpumask: fix cpu_possible_mask last test
lib/test_cpumask: drop cpu_possible_mask full test
My Bootlin address is preferred from now on.
Link: https://lkml.kernel.org/r/20220826130515.3011951-1-luca.ceresoli@bootlin.com
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Atish Patra <atishp@atishpatra.org>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pull SCSI fixes from James Bottomley:
"Eight patches which looks like quite a large core change, but most of
the diffstat is reverting the attempt to rejig reference counting
introduced in the last merge window which caused issues with device
and module removal.
Of the remaining four patches, only the fix use-after-free is
substantial"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mpt3sas: Fix use-after-free warning
scsi: core: Fix a use-after-free
scsi: core: Revert "Make sure that targets outlive devices"
scsi: core: Revert "Make sure that hosts outlive targets"
scsi: core: Revert "Simplify LLD module reference counting"
scsi: core: Revert "Call blk_mq_free_tag_set() earlier"
scsi: lpfc: Add missing destroy_workqueue() in error path
scsi: lpfc: Return DID_TRANSPORT_DISRUPTED instead of DID_REQUEUE
Configure ip-polling register to enable polling for all voltage monitor
channels.
This enables reading the voltage values for all inputs other than just
input 0.
Fixes: 9d823351a337 ("hwmon: Add hardware monitoring driver for Moortec MR75203 PVT controller")
Signed-off-by: Eliav Farber <farbere@amazon.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220908152449.35457-7-farbere@amazon.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The alignment check in prepare_hugepage_range() is wrong for 2 GB
hugepages, it only checks for 1 MB hugepage alignment.
This can result in kernel crash in __unmap_hugepage_range() at the
BUG_ON(start & ~huge_page_mask(h)) alignment check, for mappings
created with MAP_FIXED at unaligned address.
Fix this by correctly handling multiple hugepage sizes, similar to the
generic version of prepare_hugepage_range().
Fixes: d08de8e2d867 ("s390/mm: add support for 2GB hugepages")
Cc: <stable@vger.kernel.org> # 4.8+
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
The RTCCTRL reg of LS1C is obselete.
Writing this reg will cause system hang.
Fixes: 60219c563c9b6 ("MIPS: Add RTC support for Loongson1C board")
Signed-off-by: Yang Ling <gnaygnil@gmail.com>
Tested-by: Keguang Zhang <keguang.zhang@gmail.com>
Acked-by: Keguang Zhang <keguang.zhang@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
The per domain spinlock is acquired in iommu_flush_dev_iotlb(), which
is possbile to be called in the interrupt context. For example, the
drm-intel's CI system got completely blocked with below error:
WARNING: inconsistent lock state
6.0.0-rc1-CI_DRM_11990-g6590d43d39b9+ #1 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/6/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
ffff88810440d678 (&domain->lock){+.?.}-{2:2}, at: iommu_flush_dev_iotlb.part.61+0x23/0x80
{SOFTIRQ-ON-W} state was registered at:
lock_acquire+0xd3/0x310
_raw_spin_lock+0x2a/0x40
domain_update_iommu_cap+0x20b/0x2c0
intel_iommu_attach_device+0x5bd/0x860
__iommu_attach_device+0x18/0xe0
bus_iommu_probe+0x1f3/0x2d0
bus_set_iommu+0x82/0xd0
intel_iommu_init+0xe45/0x102a
pci_iommu_init+0x9/0x31
do_one_initcall+0x53/0x2f0
kernel_init_freeable+0x18f/0x1e1
kernel_init+0x11/0x120
ret_from_fork+0x1f/0x30
irq event stamp: 162354
hardirqs last enabled at (162354): [<ffffffff81b59274>] _raw_spin_unlock_irqrestore+0x54/0x70
hardirqs last disabled at (162353): [<ffffffff81b5901b>] _raw_spin_lock_irqsave+0x4b/0x50
softirqs last enabled at (162338): [<ffffffff81e00323>] __do_softirq+0x323/0x48e
softirqs last disabled at (162349): [<ffffffff810c1588>] irq_exit_rcu+0xb8/0xe0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&domain->lock);
<Interrupt>
lock(&domain->lock);
*** DEADLOCK ***
1 lock held by swapper/6/0:
This coverts the spin_lock/unlock() into the irq save/restore varieties
to fix the recursive locking issues.
Fixes: ffd5869d93530 ("iommu/vt-d: Replace spin_lock_irqsave() with spin_lock()")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20220817025650.3253959-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Like crashk_res, Calling crash_exclude_mem_range function with
crashk_low_res area would need extra crash_mem range too.
Add one more extra cmem slot in case of crashk_low_res is used.
Signed-off-by: Levi Yun <ppbuk5246@gmail.com>
Fixes: 944a45abfabc ("arm64: kdump: Reimplement crashkernel=X")
Cc: <stable@vger.kernel.org> # 5.19.x
Acked-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220831103913.12661-1-ppbuk5246@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
Pull btrfs fixes from David Sterba:
"Fixes:
- check that subvolume is writable when changing xattrs from security
namespace
- fix memory leak in device lookup helper
- update generation of hole file extent item when merging holes
- fix space cache corruption and potential double allocations; this
is a rare bug but can be serious once it happens, stable backports
and analysis tool will be provided
- fix error handling when deleting root references
- fix crash due to assert when attempting to cancel suspended device
replace, add message what to do if mount fails due to missing
replace item
Regressions:
- don't merge pages into bio if their page offset is not contiguous
- don't allow large NOWAIT direct reads, this could lead to short
reads eg. in io_uring"
* tag 'for-6.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: add info when mount fails due to stale replace target
btrfs: replace: drop assert for suspended replace
btrfs: fix silent failure when deleting root reference
btrfs: fix space cache corruption and potential double allocations
btrfs: don't allow large NOWAIT direct reads
btrfs: don't merge pages into bio if their page offset is not contiguous
btrfs: update generation of hole file extent item when merging holes
btrfs: fix possible memory leak in btrfs_get_dev_args_from_path()
btrfs: check if root is readonly while setting security xattr
cpumask related files are listed under the BITMAP API section, so the
file with tests for cpumask should be added to that list.
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Reviewed-by: David Gow <davidgow@google.com>
Acked-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Yu Zhao reported a bug after the commit "mm/swap: Add swp_offset_pfn() to
fetch PFN from swap entry" added a check in swp_offset_pfn() for swap type [1]:
kernel BUG at include/linux/swapops.h:117!
CPU: 46 PID: 5245 Comm: EventManager_De Tainted: G S O L 6.0.0-dbg-DEV #2
RIP: 0010:pfn_swap_entry_to_page+0x72/0xf0
Code: c6 48 8b 36 48 83 fe ff 74 53 48 01 d1 48 83 c1 08 48 8b 09 f6
c1 01 75 7b 66 90 48 89 c1 48 8b 09 f6 c1 01 74 74 5d c3 eb 9e <0f> 0b
48 ba ff ff ff ff 03 00 00 00 eb ae a9 ff 0f 00 00 75 13 48
RSP: 0018:ffffa59e73fabb80 EFLAGS: 00010282
RAX: 00000000ffffffe8 RBX: 0c00000000000000 RCX: ffffcd5440000000
RDX: 1ffffffffff7a80a RSI: 0000000000000000 RDI: 0c0000000000042b
RBP: ffffa59e73fabb80 R08: ffff9965ca6e8bb8 R09: 0000000000000000
R10: ffffffffa5a2f62d R11: 0000030b372e9fff R12: ffff997b79db5738
R13: 000000000000042b R14: 0c0000000000042b R15: 1ffffffffff7a80a
FS: 00007f549d1bb700(0000) GS:ffff99d3cf680000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000440d035b3180 CR3: 0000002243176004 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
change_pte_range+0x36e/0x880
change_p4d_range+0x2e8/0x670
change_protection_range+0x14e/0x2c0
mprotect_fixup+0x1ee/0x330
do_mprotect_pkey+0x34c/0x440
__x64_sys_mprotect+0x1d/0x30
It triggers because pfn_swap_entry_to_page() could be called upon e.g. a
genuine swap entry.
Fix it by only calling it when it's a write migration entry where the page*
is used.
[1] https://lore.kernel.org/lkml/CAOUHufaVC2Za-p8m0aiHw6YkheDcrO-C3wRGixwDS32VTS+k1w@mail.gmail.com/
Link: https://lkml.kernel.org/r/20220823221138.45602-1-peterx@redhat.com
Fixes: 6c287605fd56 ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Yu Zhao <yuzhao@google.com>
Tested-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pull driver core fixes from Greg KH:
"Here are some small driver core and debugfs fixes for 6.0-rc5.
Included in here are:
- multiple attempts to get the arch_topology code to work properly on
non-cluster SMT systems. First attempt caused build breakages in
linux-next and 0-day, second try worked.
- debugfs fixes for a long-suffering memory leak. The pattern of
debugfs_remove(debugfs_lookup(...)) turns out to leak dentries, so
add debugfs_lookup_and_remove() to fix this problem. Also fix up
the scheduler debug code that highlighted this problem. Fixes for
other subsystems will be trickling in over the next few months for
this same issue once the debugfs function is merged.
All of these have been in linux-next since Wednesday with no reported
problems"
* tag 'driver-core-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
arch_topology: Make cluster topology span at least SMT CPUs
sched/debug: fix dentry leak in update_sched_domain_debugfs
debugfs: add debugfs_lookup_and_remove()
driver core: fix driver_set_override() issue with empty strings
Revert "arch_topology: Make cluster topology span at least SMT CPUs"
arch_topology: Make cluster topology span at least SMT CPUs
Fix the following use-after-free warning which is observed during
controller reset:
refcount_t: underflow; use-after-free.
WARNING: CPU: 23 PID: 5399 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0
Link: https://lore.kernel.org/r/20220906134908.1039-2-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix voltage allocation and reading to support all channels in all VMs.
Prior to this change allocation and reading were done only for the first
channel in each VM.
This change counts the total number of channels for allocation, and takes
into account the channel offset when reading the sample data register.
Fixes: 9d823351a337 ("hwmon: Add hardware monitoring driver for Moortec MR75203 PVT controller")
Signed-off-by: Eliav Farber <farbere@amazon.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220908152449.35457-6-farbere@amazon.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
The Intel IOMMU driver possibly selects between the first-level and the
second-level translation tables for DMA address translation. However,
the levels of page-table walks for the 4KB base page size are calculated
from the SAGAW field of the capability register, which is only valid for
the second-level page table. This causes the IOMMU driver to stop working
if the hardware (or the emulated IOMMU) advertises only first-level
translation capability and reports the SAGAW field as 0.
This solves the above problem by considering both the first level and the
second level when calculating the supported page table levels.
Fixes: b802d070a52a1 ("iommu/vt-d: Use iova over first level")
Cc: stable@vger.kernel.org
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220817023558.3253263-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Currently as part of handling a SME access trap we flush the SVE register
state. This is not needed and would corrupt register state if the task has
access to the SVE registers already. For non-streaming mode accesses the
required flushing will be done in the SVE access trap. For streaming
mode SVE register accesses the architecture guarantees that the register
state will be flushed when streaming mode is entered or exited so there is
no need for us to do so. Simply remove the register initialisation.
Fixes: 8bd7f91c03d8 ("arm64/sme: Implement traps and syscall handling for SME")
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220817182324.638214-5-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Pull cfis fixes from Steve French:
- two locking fixes (zero range, punch hole)
- DFS 9 fix (padding), affecting some servers
- three minor cleanup changes
* tag '6.0-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Add helper function to check smb1+ server
cifs: Use help macro to get the mid header size
cifs: Use help macro to get the header preamble size
cifs: skip extra NULL byte in filenames
smb3: missing inode locks in punch hole
smb3: missing inode locks in zero range
If the replace target device reappears after the suspended replace is
cancelled, it blocks the mount operation as it can't find the matching
replace-item in the metadata. As shown below,
BTRFS error (device sda5): replace devid present without an active replace item
To overcome this situation, the user can run the command
btrfs device scan --forget <replace target device>
and try the mount command again. And also, to avoid repeating the issue,
superblock on the devid=0 must be wiped.
wipefs -a device-path-to-devid=0.
This patch adds some info when this situation occurs.
Reported-by: Samuel Greiner <samuel@balkonien.org>
Link: https://lore.kernel.org/linux-btrfs/b4f62b10-b295-26ea-71f9-9a5c9299d42c@balkonien.org/T/
CC: stable@vger.kernel.org # 5.0+
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For extra context, log the contents of the masks under test. This
should help with finding out why a certain test fails.
Link: https://lore.kernel.org/lkml/CABVgOSkPXBc-PWk1zBZRQ_Tt+Sz1ruFHBj3ixojymZF=Vi4tpQ@mail.gmail.com/
Suggested-by: David Gow <davidgow@google.com>
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
The decompressors may be called while in an atomic section. So move the
kmalloc() out of this path, and into the "page actor" init function.
This fixes a regression introduced by commit
f268eedddf35 ("squashfs: extend "page actor" to handle missing pages")
Link: https://lkml.kernel.org/r/20220822215430.15933-1-phillip@squashfs.org.uk
Fixes: f268eedddf35 ("squashfs: extend "page actor" to handle missing pages")
Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>