commits
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix compilation of callchain related code on powerpc with gcc11+
- Fix PERF_SAMPLE_WEIGHT_STRUCT support in 'perf script'
- Check session->header.env.arch before using it, fixing a segmentation
fault
- Suppress 'rm dlfilter' build messages
* tag 'perf-tools-fixes-for-v5.15-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf script: Fix PERF_SAMPLE_WEIGHT_STRUCT support
perf callchain: Fix compilation on powerpc with gcc11+
perf script: Check session->header.env.arch before using it
perf build: Suppress 'rm dlfilter' build message
Pull kvm fixes from Paolo Bonzini:
- Fixes for s390 interrupt delivery
- Fixes for Xen emulator bugs showing up as debug kernel WARNs
- Fix another issue with SEV/ES string I/O VMGEXITs
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Take srcu lock in post_kvm_run_save()
KVM: SEV-ES: fix another issue with string I/O VMGEXITs
KVM: x86/xen: Fix kvm_xen_has_interrupt() sleeping in kvm_vcpu_block()
KVM: x86: switch pvclock_gtod_sync_lock to a raw spinlock
KVM: s390: preserve deliverable_mask in __airqs_kick_single_vcpu
KVM: s390: clear kicked_mask before sleeping again
-F weight in perf script is broken.
# ./perf mem record
# ./perf script -F weight
Samples for 'dummy:HG' event do not have WEIGHT attribute set. Cannot
print 'weight' field.
The sample type, PERF_SAMPLE_WEIGHT_STRUCT, is an alternative of the
PERF_SAMPLE_WEIGHT sample type. They share the same space, weight. The
lower 32 bits are exactly the same for both sample type. The higher 32
bits may be different for different architecture. For a new kernel on
x86, the PERF_SAMPLE_WEIGHT_STRUCT is used. For an old kernel or other
ARCHs, the PERF_SAMPLE_WEIGHT is used.
With -F weight, current perf script will only check the input string
"weight" with the PERF_SAMPLE_WEIGHT sample type. Because the commit
ea8d0ed6eae3 ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT") didn't
update the PERF_SAMPLE_WEIGHT_STRUCT sample type for perf script. For a
new kernel on x86, the check fails.
Use PERF_SAMPLE_WEIGHT_TYPE, which supports both sample types, to
replace PERF_SAMPLE_WEIGHT
Fixes: ea8d0ed6eae37b01 ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT")
Reported-by: Joe Mario <jmario@redhat.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Joe Mario <jmario@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Joe Mario <jmario@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1632929894-102778-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull SCSI fixes from James Bottomley:
"Three small fixes, all in drivers, and one sizeable update to the UFS
driver to remove the HPB 2.0 feature that has been objected to by Jens
and Christoph.
Although the UFS patch is large and last minute, it's essentially the
least intrusive way of resolving the objections in time for the 5.15
release"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: ufshpb: Remove HPB2.0 flows
scsi: mpt3sas: Fix reference tag handling for WRITE_INSERT
scsi: ufs: ufs-exynos: Correct timeout value setting registers
scsi: ibmvfc: Fix up duplicate response detection
The Xen interrupt injection for event channels relies on accessing the
guest's vcpu_info structure in __kvm_xen_has_interrupt(), through a
gfn_to_hva_cache.
This requires the srcu lock to be held, which is mostly the case except
for this code path:
[ 11.822877] WARNING: suspicious RCU usage
[ 11.822965] -----------------------------
[ 11.823013] include/linux/kvm_host.h:664 suspicious rcu_dereference_check() usage!
[ 11.823131]
[ 11.823131] other info that might help us debug this:
[ 11.823131]
[ 11.823196]
[ 11.823196] rcu_scheduler_active = 2, debug_locks = 1
[ 11.823253] 1 lock held by dom:0/90:
[ 11.823292] #0: ffff998956ec8118 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x85/0x680
[ 11.823379]
[ 11.823379] stack backtrace:
[ 11.823428] CPU: 2 PID: 90 Comm: dom:0 Kdump: loaded Not tainted 5.4.34+ #5
[ 11.823496] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[ 11.823612] Call Trace:
[ 11.823645] dump_stack+0x7a/0xa5
[ 11.823681] lockdep_rcu_suspicious+0xc5/0x100
[ 11.823726] __kvm_xen_has_interrupt+0x179/0x190
[ 11.823773] kvm_cpu_has_extint+0x6d/0x90
[ 11.823813] kvm_cpu_accept_dm_intr+0xd/0x40
[ 11.823853] kvm_vcpu_ready_for_interrupt_injection+0x20/0x30
< post_kvm_run_save() inlined here >
[ 11.823906] kvm_arch_vcpu_ioctl_run+0x135/0x6a0
[ 11.823947] kvm_vcpu_ioctl+0x263/0x680
Fixes: 40da8ccd724f ("KVM: x86/xen: Add event channel interrupt vector upcall")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: stable@vger.kernel.org
Message-Id: <606aaaf29fca3850a63aa4499826104e77a72346.camel@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Got following build fail on powerpc:
CC arch/powerpc/util/skip-callchain-idx.o
In function ‘check_return_reg’,
inlined from ‘check_return_addr’ at arch/powerpc/util/skip-callchain-idx.c:213:7,
inlined from ‘arch_skip_callchain_idx’ at arch/powerpc/util/skip-callchain-idx.c:265:7:
arch/powerpc/util/skip-callchain-idx.c:54:18: error: ‘dwarf_frame_register’ accessing 96 bytes \
in a region of size 64 [-Werror=stringop-overflow=]
54 | result = dwarf_frame_register(frame, ra_regno, ops_mem, &ops, &nops);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/util/skip-callchain-idx.c: In function ‘arch_skip_callchain_idx’:
arch/powerpc/util/skip-callchain-idx.c:54:18: note: referencing argument 3 of type ‘Dwarf_Op *’
In file included from /usr/include/elfutils/libdwfl.h:32,
from arch/powerpc/util/skip-callchain-idx.c:10:
/usr/include/elfutils/libdw.h:1069:12: note: in a call to function ‘dwarf_frame_register’
1069 | extern int dwarf_frame_register (Dwarf_Frame *frame, int regno,
| ^~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
The dwarf_frame_register args changed with [1],
Updating ops_mem accordingly.
[1] https://sourceware.org/git/?p=elfutils.git;a=commit;h=5621fe5443da23112170235dd5cac161e5c75e65
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Mark Wieelard <mjw@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20210928195253.1267023-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull clk fix from Stephen Boyd:
"One fix for the composite clk that broke when we changed this clk type
to use the determine_rate instead of round_rate clk op by default.
This caused lots of problems on Rockchip SoCs because they heavily use
the composite clk code to model the clk tree"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: composite: Also consider .determine_rate for rate + mux composites
The Host Performance Buffer feature allows UFS read commands to carry the
physical media addresses along with the LBAs, thus allowing less internal
L2P-table switches in the device. HPB1.0 allowed a single LBA, while
HPB2.0 increases this capacity up to 255 blocks.
Carrying more than a single record, the read operation is no longer purely
of type "read" but a "hybrid" command: Writing the physical address to the
device in one operation and reading back the required payload in another.
The JEDEC HPB spec defines two commands for this operation:
HPB-WRITE-BUFFER (0x2) to write the physical addresses to device, and
HPB-READ to read the payload.
With the current HPB design the UFS driver has no alternative but to divide
the READ request into 2 separate commands: HPB-WRITE-BUFFER and HPB-READ.
This causes a great deal of aggravation to the block layer guys who
demanded that we completely revert the entire HPB driver regardless of the
huge amount of corporate effort already invested in it.
As a compromise, remove only the pieces that implement the 2.0
specification. This is done as a matter of urgency for the final 5.15
release.
Link: https://lore.kernel.org/r/20211030062301.248-1-avri.altman@wdc.com
Tested-by: Avri Altman <avri.altman@wdc.com>
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Co-developed-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If the guest requests string I/O from the hypervisor via VMGEXIT,
SW_EXITINFO2 will contain the REP count. However, sev_es_string_io
was incorrectly treating it as the size of the GHCB buffer in
bytes.
This fixes the "outsw" test in the experimental SEV tests of
kvm-unit-tests.
Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e9f ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reported-by: Marc Orr <marcorr@google.com>
Tested-by: Marc Orr <marcorr@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When perf.data is not written cleanly, we would like to process existing
data as much as possible (please see f_header.data.size == 0 condition
in perf_session__read_header). However, perf.data with partial data may
crash perf. Specifically, we see crash in 'perf script' for NULL
session->header.env.arch.
Fix this by checking session->header.env.arch before using it to determine
native_arch. Also split the if condition so it is easier to read.
Committer notes:
If it is a pipe, we already assume is a native arch, so no need to check
session->header.env.arch.
Signed-off-by: Song Liu <songliubraving@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211004053238.514936-1-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull RISC-V fixes from Palmer Dabbelt:
"These are pretty late, but they do fix concrete issues.
- ensure the trap vector's address is aligned.
- avoid re-populating the KASAN shadow memory.
- allow kasan to build without warnings, which have recently become
errors"
* tag 'riscv-for-linus-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Fix asan-stack clang build
riscv: Do not re-populate shadow memory with kasan_populate_early_shadow
riscv: fix misalgned trap vector base address
Commit 69a00fb3d69706 ("clk: divider: Implement and wire up
.determine_rate by default") switches clk_divider_ops to implement
.determine_rate by default. This breaks composite clocks with multiple
parents because clk-composite.c does not use the special handling for
mux + divider combinations anymore (that was restricted to rate clocks
which only implement .round_rate, but not .determine_rate).
Alex reports:
This breaks lot of clocks for Rockchip which intensively uses
composites, i.e. those clocks will always stay at the initial parent,
which in some cases is the XTAL clock and I strongly guess it is the
same for other platforms, which use composite clocks having more than
one parent (e.g. mediatek, ti ...)
Example (RK3399)
clk_sdio is set (initialized) with XTAL (24 MHz) as parent in u-boot.
It will always stay at this parent, even if the mmc driver sets a rate
of 200 MHz (fails, as the nature of things), which should switch it
to any of its possible parent PLLs defined in
mux_pll_src_cpll_gpll_npll_ppll_upll_24m_p (see clk-rk3399.c) - which
never happens.
Restore the original behavior by changing the priority of the conditions
inside clk-composite.c. Now the special rate + mux case (with rate_ops
having a .round_rate - which is still the case for the default
clk_divider_ops) is preferred over rate_ops which have .determine_rate
defined (and not further considering the mux).
Fixes: 69a00fb3d69706 ("clk: divider: Implement and wire up .determine_rate by default")
Reported-by: Alex Bee <knaerzche@gmail.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Link: https://lore.kernel.org/r/20211016105022.303413-2-martin.blumenstingl@googlemail.com
Tested-by: Alex Bee <knaerzche@gmail.com>
Tested-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Testing revealed a problem with how the reference tag was handled for
a WRITE_INSERT operation. The SCSI_PROT_REF_CHECK flag is not set when
the controller is asked to generate the protection information
(i.e. not DIX). And as a result the initial reference tag would not be
set in the WRITE_INSERT case.
Separate handling of the REF_CHECK and REF_INCREMENT flags to align
with both the DIX spec and the MPI implementation.
Link: https://lore.kernel.org/r/20211028034202.24225-1-martin.petersen@oracle.com
Fixes: b3e2c72af1d5 ("scsi: mpt3sas: Use the proper SCSI midlayer interfaces for PI")
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In kvm_vcpu_block, the current task is set to TASK_INTERRUPTIBLE before
making a final check whether the vCPU should be woken from HLT by any
incoming interrupt.
This is a problem for the get_user() in __kvm_xen_has_interrupt(), which
really shouldn't be sleeping when the task state has already been set.
I think it's actually harmless as it would just manifest itself as a
spurious wakeup, but it's causing a debug warning:
[ 230.963649] do not call blocking ops when !TASK_RUNNING; state=1 set at [<00000000b6bcdbc9>] prepare_to_swait_exclusive+0x30/0x80
Fix the warning by turning it into an *explicit* spurious wakeup. When
invoked with !task_is_running(current) (and we might as well add
in_atomic() there while we're at it), just return 1 to indicate that
an IRQ is pending, which will cause a wakeup and then something will
call it again in a context that *can* sleep so it can fault the page
back in.
Cc: stable@vger.kernel.org
Fixes: 40da8ccd724f ("KVM: x86/xen: Add event channel interrupt vector upcall")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <168bf8c689561da904e48e2ff5ae4713eaef9e2d.camel@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The following build message:
rm dlfilters/dlfilter-test-api-v0.o
is unwanted.
The object file is being treated as an intermediate file and being
automatically removed. Mark the object file as .SECONDARY to prevent
removal and hence the message.
Requested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210930062849.110416-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull powerpc fixes from Michael Ellerman:
"Three commits fixing some issues introduced with the recent IOMMU
changes we merged.
Thanks to Alexey Kardashevskiy"
* tag 'powerpc-5.15-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/pseries/iommu: Create huge DMA window if no MMIO32 is present
powerpc/pseries/iommu: Check if the default window in use before removing it
powerpc/pseries/iommu: Use correct vfree for it_map
Nathan reported that because KASAN_SHADOW_OFFSET was not defined in
Kconfig, it prevents asan-stack from getting disabled with clang even
when CONFIG_KASAN_STACK is disabled: fix this by defining the
corresponding config.
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: 8ad8b72721d0 ("riscv: Add KASAN support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
PA_PWRMODEUSERDATA0 -> DL_FC0PROTTIMEOUTVAL
PA_PWRMODEUSERDATA1 -> DL_TC0REPLAYTIMEOUTVAL
PA_PWRMODEUSERDATA2 -> DL_AFC0REQTIMEOUTVAL
Link: https://lore.kernel.org/r/20211018062841.18226-1-chanho61.park@samsung.com
Fixes: a967ddb22d94 ("scsi: ufs: ufs-exynos: Apply vendor-specific values for three timeouts")
Cc: Alim Akhtar <alim.akhtar@samsung.com>
Cc: Kiwoong Kim <kwmad.kim@samsung.com>
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
KVM: s390: Fixes for interrupt delivery
Two bugs that might result in CPUs not woken up when interrupts are
pending.
Pull gpio fixes from Bartosz Golaszewski:
- fix the return value check when parsing the ngpios property in
gpio-xgs-iproc
- check the return value of bgpio_init() in gpio-mlxbf2
* tag 'gpio-fixes-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: mlxbf2.c: Add check for bgpio_init failure
gpio: xgs-iproc: fix parsing of ngpios property
The iommu_init_table() helper takes an address range to reserve in
the IOMMU table being initialized to exclude MMIO addresses, this is
useful if the window stretches far beyond 4GB (although wastes some TCEs).
At the moment the code searches for such MMIO32 range and fails if none
found which is considered a problem while it really is not: it is actually
better as this says there is no MMIO32 to reserve and we can use
usually wasted TCEs. Furthermore PHYP never actually allows creating
windows starting at busaddress=0 so this MMIO32 range is never useful.
This removes error exit and initializes the table with zero range if
no MMIO32 is detected.
Fixes: 381ceda88c4c ("powerpc/pseries/iommu: Make use of DDW for indirect mapping")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211020132315.2287178-5-aik@ozlabs.ru
When calling this function, all the shadow memory is already populated
with kasan_early_shadow_pte which has PAGE_KERNEL protection.
kasan_populate_early_shadow write-protects the mapping of the range
of addresses passed in argument in zero_pte_populate, which actually
write-protects all the shadow memory mapping since kasan_early_shadow_pte
is used for all the shadow memory at this point. And then when using
memblock API to populate the shadow memory, the first write access to the
kernel stack triggers a trap. This becomes visible with the next commit
that contains a fix for asan-stack.
We already manually populate all the shadow memory in kasan_early_init
and we write-protect kasan_early_shadow_pte at the end of kasan_init
which makes the calls to kasan_populate_early_shadow superfluous so
we can remove them.
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: e178d670f251 ("riscv/kasan: add KASAN_VMALLOC support")
Fixes: 8ad8b72721d0 ("riscv: Add KASAN support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Pull more perf tools updates from Arnaldo Carvalho de Melo:
- Add missing fields and remove some duplicate fields when printing a
perf_event_attr.
- Fix hybrid config terms list corruption.
- Update kernel header copies, some resulted in new kernel features
being automagically added to 'perf trace' syscall/tracepoint argument
id->string translators.
- Add a file generated during the documentation build to .gitignore.
- Add an option to build without libbfd, as some distros, like Debian
consider its ABI unstable.
- Add support to print a textual representation of IBS raw sample data
in 'perf report'.
- Fix bpf 'perf test' sample mismatch reporting
- Fix passing arguments to stackcollapse report in a 'perf script'
python script.
- Allow build-id with trailing zeros.
- Look for ImageBase in PE file to compute .text offset.
* tag 'perf-tools-for-v5.15-2021-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (25 commits)
tools headers UAPI: Update tools's copy of drm.h headers
tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
tools headers UAPI: Sync linux/fs.h with the kernel sources
tools headers UAPI: Sync linux/in.h copy with the kernel sources
perf tools: Add an option to build without libbfd
perf tools: Allow build-id with trailing zeros
perf tools: Fix hybrid config terms list corruption
perf tools: Factor out copy_config_terms() and free_config_terms()
perf tools: Fix perf_event_attr__fprintf() missing/dupl. fields
perf tools: Ignore Documentation dependency file
perf bpf: Provide a weak btf__load_from_kernel_by_id() for older libbpf versions
tools include UAPI: Update linux/mount.h copy
perf beauty: Cover more flags in the move_mount syscall argument beautifier
tools headers UAPI: Sync linux/prctl.h with the kernel sources
tools include UAPI: Sync sound/asound.h copy with the kernel sources
tools headers UAPI: Sync linux/kvm.h with the kernel sources
tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
perf report: Add support to print a textual representation of IBS raw sample data
perf report: Add tools/arch/x86/include/asm/amd-ibs.h
perf env: Add perf_env__cpuid, perf_env__{nr_}pmu_mappings
...
Commit a264cf5e81c7 ("scsi: ibmvfc: Fix command state accounting and stale
response detection") introduced a regression in detecting duplicate
responses. This was observed in test where a command was sent to the VIOS
and completed before ibmvfc_send_event() set the active flag to 1, which
resulted in the atomic_dec_if_positive() call in ibmvfc_handle_crq()
thinking this was a duplicate response, which resulted in scsi_done() not
getting called, so we then hit a SCSI command timeout for this command once
the timeout expires. This simply ensures the active flag gets set prior to
making the hcall to send the command to the VIOS, in order to close this
window.
Link: https://lore.kernel.org/r/20211019152129.16558-1-brking@linux.vnet.ibm.com
Fixes: a264cf5e81c7 ("scsi: ibmvfc: Fix command state accounting and stale response detection")
Cc: stable@vger.kernel.org
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On the preemption path when updating a Xen guest's runstate times, this
lock is taken inside the scheduler rq->lock, which is a raw spinlock.
This was shown in a lockdep warning:
[ 89.138354] =============================
[ 89.138356] [ BUG: Invalid wait context ]
[ 89.138358] 5.15.0-rc5+ #834 Tainted: G S I E
[ 89.138360] -----------------------------
[ 89.138361] xen_shinfo_test/2575 is trying to lock:
[ 89.138363] ffffa34a0364efd8 (&kvm->arch.pvclock_gtod_sync_lock){....}-{3:3}, at: get_kvmclock_ns+0x1f/0x130 [kvm]
[ 89.138442] other info that might help us debug this:
[ 89.138444] context-{5:5}
[ 89.138445] 4 locks held by xen_shinfo_test/2575:
[ 89.138447] #0: ffff972bdc3b8108 (&vcpu->mutex){+.+.}-{4:4}, at: kvm_vcpu_ioctl+0x77/0x6f0 [kvm]
[ 89.138483] #1: ffffa34a03662e90 (&kvm->srcu){....}-{0:0}, at: kvm_arch_vcpu_ioctl_run+0xdc/0x8b0 [kvm]
[ 89.138526] #2: ffff97331fdbac98 (&rq->__lock){-.-.}-{2:2}, at: __schedule+0xff/0xbd0
[ 89.138534] #3: ffffa34a03662e90 (&kvm->srcu){....}-{0:0}, at: kvm_arch_vcpu_put+0x26/0x170 [kvm]
...
[ 89.138695] get_kvmclock_ns+0x1f/0x130 [kvm]
[ 89.138734] kvm_xen_update_runstate+0x14/0x90 [kvm]
[ 89.138783] kvm_xen_update_runstate_guest+0x15/0xd0 [kvm]
[ 89.138830] kvm_arch_vcpu_put+0xe6/0x170 [kvm]
[ 89.138870] kvm_sched_out+0x2f/0x40 [kvm]
[ 89.138900] __schedule+0x5de/0xbd0
Cc: stable@vger.kernel.org
Reported-by: syzbot+b282b65c2c68492df769@syzkaller.appspotmail.com
Fixes: 30b5c851af79 ("KVM: x86/xen: Add support for vCPU runstate information")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <1b02a06421c17993df337493a68ba923f3bd5c0f.camel@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Changing the deliverable mask in __airqs_kick_single_vcpu() is a bug. If
one idle vcpu can't take the interrupts we want to deliver, we should
look for another vcpu that can, instead of saying that we don't want
to deliver these interrupts by clearing the bits from the
deliverable_mask.
Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20211019175401.3757927-3-pasic@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Pull block fixes from Jens Axboe:
- NVMe pull request:
- fix nvmet-tcp header digest verification (Amit Engel)
- fix a memory leak in nvmet-tcp when releasing a queue (Maurizio
Lombardi)
- fix nvme-tcp H2CData PDU send accounting again (Sagi Grimberg)
- fix digest pointer calculation in nvme-tcp and nvmet-tcp (Varun
Prakash)
- fix possible nvme-tcp req->offset corruption (Varun Prakash)
- Queue drain ordering fix (Ming)
- Partition check regression for zoned devices (Shin'ichiro)
- Zone queue restart fix (Naohiro)
* tag 'block-5.15-2021-10-29' of git://git.kernel.dk/linux-block:
block: Fix partition check for host-aware zoned block devices
nvmet-tcp: fix header digest verification
nvmet-tcp: fix data digest pointer calculation
nvme-tcp: fix data digest pointer calculation
nvme-tcp: fix possible req->offset corruption
block: schedule queue restart after BLK_STS_ZONE_RESOURCE
block: drain queue after disk is removed from sysfs
nvme-tcp: fix H2CData PDU send accounting (again)
nvmet-tcp: fix a memory leak when releasing a queue
Add a check if bgpio_init fails.
Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
At the moment this check is performed after we remove the default window
which is late and disallows to revert whatever changes enable_ddw()
has made to DMA windows.
This moves the check and error exit before removing the window.
This raised the message severity from "debug" to "warning" as this
should not happen in practice and cannot be triggered by the userspace.
Fixes: 381ceda88c4c ("powerpc/pseries/iommu: Make use of DDW for indirect mapping")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211020132315.2287178-4-aik@ozlabs.ru
The trap vector marked by label .Lsecondary_park must align on a
4-byte boundary, as the {m,s}tvec is defined to require 4-byte
alignment.
Signed-off-by: Chen Lu <181250012@smail.nju.edu.cn>
Reviewed-by: Anup Patel <anup.patel@wdc.com>
Fixes: e011995e826f ("RISC-V: Move relocate and few other functions out of __init")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Pull compiler attributes updates from Miguel Ojeda:
- Fix __has_attribute(__no_sanitize_coverage__) for GCC 4 (Marco Elver)
- Add Nick as Reviewer for compiler_attributes.h (Nick Desaulniers)
- Move __compiletime_{error|warning} (Nick Desaulniers)
* tag 'compiler-attributes-for-linus-v5.15-rc1-v2' of git://github.com/ojeda/linux:
compiler_attributes.h: move __compiletime_{error|warning}
MAINTAINERS: add Nick as Reviewer for compiler_attributes.h
Compiler Attributes: fix __has_attribute(__no_sanitize_coverage__) for GCC 4
Picking the changes from:
17ce9c61c71cbc0d ("drm: document DRM_IOCTL_MODE_RMFB")
Doesn't result in any tooling changes:
$ tools/perf/trace/beauty/drm_ioctl.sh > before
$ cp include/uapi/drm/drm.h tools/include/uapi/drm/drm.h
$ tools/perf/trace/beauty/drm_ioctl.sh > after
$ diff -u before after
Silencing these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/drm/drm.h' differs from latest version at 'include/uapi/drm/drm.h'
diff -u tools/include/uapi/drm/drm.h include/uapi/drm/drm.h
Cc: Simon Ser <contact@emersion.fr>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implement the ->restore() PM operation and set the link to off, which will
force a full reset and restore. This ensures that Host Performance Booster
is reset after suspend-to-disk.
The Host Performance Booster feature caches logical-to-physical mapping
information in the host memory. After suspend-to-disk, such information is
not valid, so a full reset and restore is needed.
A full reset and restore is done if the SPM level is 5 or 6, but not for
other SPM levels, so this change fixes those cases.
A full reset and restore also restores base address registers, so that code
is removed.
Link: https://lore.kernel.org/r/20211018151004.284200-2-adrian.hunter@intel.com
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If allocation of rmaps fails, but some of the pointers have already been written,
those pointers can be cleaned up when the memslot is freed, or even reused later
for another attempt at allocating the rmaps. Therefore there is no need to
WARN, as done for example in memslot_rmap_alloc, but the allocation *must* be
skipped lest KVM will overwrite the previous pointer and will indeed leak memory.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The idea behind kicked mask is that we should not re-kick a vcpu that
is already in the "kick" process, i.e. that was kicked and is
is about to be dispatched if certain conditions are met.
The problem with the current implementation is, that it assumes the
kicked vcpu is going to enter SIE shortly. But under certain
circumstances, the vcpu we just kicked will be deemed non-runnable and
will remain in wait state. This can happen, if the interrupt(s) this
vcpu got kicked to deal with got already cleared (because the interrupts
got delivered to another vcpu). In this case kvm_arch_vcpu_runnable()
would return false, and the vcpu would remain in kvm_vcpu_block(),
but this time with its kicked_mask bit set. So next time around we
wouldn't kick the vcpu form __airqs_kick_single_vcpu(), but would assume
that we just kicked it.
Let us make sure the kicked_mask is cleared before we give up on
re-dispatching the vcpu.
Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")
Reported-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20211019175401.3757927-2-pasic@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Pull MMC fixes from Ulf Hansson:
- tmio: Re-enable card irqs after a reset
- mtk-sd: Fixup probing of cqhci for crypto
- cqhci: Fix support for suspend/resume
- vub300: Fix control-message timeouts
- dw_mmc-exynos: Fix support for tuning
- winbond: Silences build errors on M68K
- sdhci-esdhc-imx: Fix support for tuning
- sdhci-pci: Read card detect from ACPI for Intel Merrifield
- sdhci: Fix eMMC support for Thundercomm TurboX CM2290
* tag 'mmc-v5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: tmio: reenable card irqs after the reset callback
mmc: mediatek: Move cqhci init behind ungate clock
mmc: cqhci: clear HALT state after CQE enable
mmc: vub300: fix control-message timeouts
mmc: dw_mmc: exynos: fix the finding clock sample value
mmc: winbond: don't build on M68K
mmc: sdhci-esdhc-imx: clear the buffer_read_ready to reset standard tuning circuit
mmc: sdhci-pci: Read card detect from ACPI for Intel Merrifield
mmc: sdhci: Map more voltage level to SDHCI_POWER_330
Pull NVMe fixes from Christoph:
"nvme fixe for Linux 5.15
- fix nvmet-tcp header digest verification (Amit Engel)
- fix a memory leak in nvmet-tcp when releasing a queue
(Maurizio Lombardi)
- fix nvme-tcp H2CData PDU send accounting again (Sagi Grimberg)
- fix digest pointer calculation in nvme-tcp and nvmet-tcp
(Varun Prakash)
- fix possible nvme-tcp req->offset corruption (Varun Prakash)"
* tag 'nvme-5.15-2021-10-28' of git://git.infradead.org/nvme:
nvmet-tcp: fix header digest verification
nvmet-tcp: fix data digest pointer calculation
nvme-tcp: fix data digest pointer calculation
nvme-tcp: fix possible req->offset corruption
nvme-tcp: fix H2CData PDU send accounting (again)
nvmet-tcp: fix a memory leak when releasing a queue
of_property_read_u32 returns 0 on success, not true, so we need to
invert the check to actually take over the provided ngpio value.
Fixes: 6a41b6c5fc20 ("gpio: Add xgs-iproc driver")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
The it_map array is vzalloc'ed so use vfree() for it when creating
a huge DMA window failed for whatever reason.
While at this, write zero to it_map.
Fixes: 381ceda88c4c ("powerpc/pseries/iommu: Make use of DDW for indirect mapping")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211020132315.2287178-3-aik@ozlabs.ru
These can be replaced by statx(). Since rv32 has a 64-bit time_t we
just never ended up with them in the first place. This is now an error
due to -Werror.
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Pull auxdisplay updates from Miguel Ojeda:
"An assortment of improvements for auxdisplay:
- Replace symbolic permissions with octal permissions (Jinchao Wang)
- ks0108: Switch to use module_parport_driver() (Andy Shevchenko)
- charlcd: Drop unneeded initializers and switch to C99 style (Andy
Shevchenko)
- hd44780: Fix oops on module unloading (Lars Poeschel)
- Add I2C gpio expander example (Ralf Schlatterbeck)"
* tag 'auxdisplay-for-linus-v5.15-rc1' of git://github.com/ojeda/linux:
auxdisplay: Replace symbolic permissions with octal permissions
auxdisplay: ks0108: Switch to use module_parport_driver()
auxdisplay: charlcd: Drop unneeded initializers and switch to C99 style
auxdisplay: hd44780: Fix oops on module unloading
auxdisplay: Add I2C gpio expander example
Clang 14 will add support for __attribute__((__error__(""))) and
__attribute__((__warning__(""))). To make use of these in
__compiletime_error and __compiletime_warning (as used by BUILD_BUG and
friends) for newer clang and detect/fallback for older versions of
clang, move these to compiler_attributes.h and guard them with
__has_attribute preprocessor guards.
Link: https://reviews.llvm.org/D106030
Link: https://bugs.llvm.org/show_bug.cgi?id=16428
Link: https://github.com/ClangBuiltLinux/linux/issues/1173
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
[Reworded, landed in Clang 14]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
To pick the changes in:
b65a9489730a2494 ("drm/i915/userptr: Probe existence of backing struct pages upon creation")
ee242ca704d38699 ("drm/i915/guc: Implement GuC priority management")
81340cf3bddded4f ("drm/i915/uapi: reject set_domain for discrete")
7961c5b60f23dff5 ("drm/i915: Add TTM offset argument to mmap.")
aef7b67a79564f6c ("drm/i915/uapi: convert drm_i915_gem_userptr to kernel doc")
e7737b67ab46ee0e ("drm/i915/uapi: reject caching ioctls for discrete")
3aa8c57fe25a9247 ("drm/i915/uapi: convert drm_i915_gem_set_domain to kernel doc")
289f5a72009b8f67 ("drm/i915/uapi: convert drm_i915_gem_caching to kernel doc")
4a766ae40ec83301 ("drm/i915: Drop the CONTEXT_CLONE API (v2)")
6ff6d61dd2a943bd ("drm/i915: Drop I915_CONTEXT_PARAM_NO_ZEROMAP")
fe4751c3d513ff4f ("drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZE")
577729533cdc4e37 ("drm/i915: Document the Virtual Engine uAPI")
c649432e86ca677d ("drm/i915: Fix busy ioctl commentary")
That doesn't result in any changes to tooling as no new ioctl were
added (at least not perceived by tools/perf/trace/beauty/drm_ioctl.sh).
Addressing this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The sgl is freed in the target stack in target_release_cmd_kref() before
calling qlt_free_cmd() but there is an unmap of sgl in qlt_free_cmd() that
causes a panic if sgl is not yet DMA unmapped:
NIP dma_direct_unmap_sg+0xdc/0x180
LR dma_direct_unmap_sg+0xc8/0x180
Call Trace:
ql_dbg_prefix+0x68/0xc0 [qla2xxx] (unreliable)
dma_unmap_sg_attrs+0x54/0xf0
qlt_unmap_sg.part.19+0x54/0x1c0 [qla2xxx]
qlt_free_cmd+0x124/0x1d0 [qla2xxx]
tcm_qla2xxx_release_cmd+0x4c/0xa0 [tcm_qla2xxx]
target_put_sess_cmd+0x198/0x370 [target_core_mod]
transport_generic_free_cmd+0x6c/0x1b0 [target_core_mod]
tcm_qla2xxx_complete_free+0x6c/0x90 [tcm_qla2xxx]
The sgl may be left unmapped in error cases of response sending. For
instance, qlt_rdy_to_xfer() maps sgl and exits when session is being
deleted keeping the sgl mapped.
This patch removes use-after-free of the sgl and ensures that the sgl is
unmapped for any command that was not sent to firmware.
Link: https://lore.kernel.org/r/20211018122650.11846-1-d.bogdanov@yadro.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
KVM/arm64 fixes for 5.15, take #2
- Properly refcount pages used as a concatenated stage-2 PGD
- Fix missing unlock when detecting the use of MTE+VM_SHARED
The latest compile changes pointed us to a few instances where we use
the kernel documentation style but don't explain all variables or
don't adhere to it 100%.
It's easy to fix so let's do that.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Pull btrfs fixes from David Sterba:
"Last minute fixes for crash on 32bit architectures when compression is
in use. It's a regression introduced in 5.15-rc and I'd really like
not let this into the final release, fixes via stable trees would add
unnecessary delay.
The problem is on 32bit architectures with highmem enabled, the pages
for compression may need to be kmapped, while the patches removed that
as we don't use GFP_HIGHMEM allocations anymore. The pages that don't
come from local allocation still may be from highmem. Despite being on
32bit there's enough such ARM machines in use so it's not a marginal
issue.
I did full reverts of the patches one by one instead of a huge one.
There's one exception for the "lzo" revert as there was an
intermediate patch touching the same code to make it compatible with
subpage. I can't revert that one too, so the revert in lzo.c is
manual. Qu Wenruo has worked on that with me and verified the changes"
* tag 'for-5.15-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
Revert "btrfs: compression: drop kmap/kunmap from lzo"
Revert "btrfs: compression: drop kmap/kunmap from zlib"
Revert "btrfs: compression: drop kmap/kunmap from zstd"
Revert "btrfs: compression: drop kmap/kunmap from generic helpers"
The reset callback may clear the internal card detect interrupts, so
make sure to reenable them if needed.
Fixes: b4d86f37eacb ("mmc: renesas_sdhi: do hard reset if possible")
Reported-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211028195149.8003-1-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Commit a33df75c6328 ("block: use an xarray for disk->part_tbl") modified
the method to check partition existence in host-aware zoned block
devices from disk_has_partitions() helper function call to empty check
of xarray disk->part_tbl. However, disk->part_tbl always has single
entry for disk->part0 and never becomes empty. This resulted in the
host-aware zoned devices always judged to have partitions, and it made
the sysfs queue/zoned attribute to be "none" instead of "host-aware"
regardless of partition existence in the devices.
This also caused DEBUG_LOCKS_WARN_ON(lock->magic != lock) for
sdkp->rev_mutex in scsi layer when the kernel detects host-aware zoned
device. Since block layer handled the host-aware zoned devices as non-
zoned devices, scsi layer did not have chance to initialize the mutex
for zone revalidation. Therefore, the warning was triggered.
To fix the issues, call the helper function disk_has_partitions() in
place of disk->part_tbl empty check. Since the function was removed with
the commit a33df75c6328, reimplement it to walk through entries in the
xarray disk->part_tbl.
Fixes: a33df75c6328 ("block: use an xarray for disk->part_tbl")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Cc: stable@vger.kernel.org # v5.14+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211026060115.753746-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pass the correct length to nvmet_tcp_verify_hdgst, which is the pdu
header length. This fixes a wrong behaviour where header digest
verification passes although the digest is wrong.
Signed-off-by: Amit Engel <amit.engel@dell.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The gpio-mockup driver creates the properties that are shared between
platform and GPIO devices. Because of that, the properties may not
be removed at the proper point of time without provoking a use-after-free
as shown in the following backtrace:
refcount_t: underflow; use-after-free.
WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
...
Call Trace:
kobject_put+0xdc/0xf0
software_node_notify_remove+0xa8/0xc0
device_del+0x15a/0x3e0
That's why the driver has to manage the lifetime of the software nodes
by itself.
The problem originates from the old device_add_properties() API, but
has been only revealed after the commit bd1e336aa853 ("driver core: platform:
Remove platform_device_add_properties()"). Hence, it's used as a landmark
for backporting.
Fixes: bd1e336aa853 ("driver core: platform: Remove platform_device_add_properties()")
Reported-by: Kent Gibson <warthog618@gmail.com>
Tested-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[Bartosz: tweaked local variable placement]
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
With PREEMPT_COUNT=y, when a CPU is offlined and then onlined again, we
get:
BUG: scheduling while atomic: swapper/1/0/0x00000000
no locks held by swapper/1/0.
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.15.0-rc2+ #100
Call Trace:
dump_stack_lvl+0xac/0x108
__schedule_bug+0xac/0xe0
__schedule+0xcf8/0x10d0
schedule_idle+0x3c/0x70
do_idle+0x2d8/0x4a0
cpu_startup_entry+0x38/0x40
start_secondary+0x2ec/0x3a0
start_secondary_prolog+0x10/0x14
This is because powerpc's arch_cpu_idle_dead() decrements the idle task's
preempt count, for reasons explained in commit a7c2bb8279d2 ("powerpc:
Re-enable preemption before cpu_die()"), specifically "start_secondary()
expects a preempt_count() of 0."
However, since commit 2c669ef6979c ("powerpc/preempt: Don't touch the idle
task's preempt_count during hotplug") and commit f1a0a376ca0c ("sched/core:
Initialize the idle task with preemption disabled"), that justification no
longer holds.
The idle task isn't supposed to re-enable preemption, so remove the
vestigial preempt_enable() from the CPU offline path.
Tested with pseries and powernv in qemu, and pseries on PowerVM.
Fixes: 2c669ef6979c ("powerpc/preempt: Don't touch the idle task's preempt_count during hotplug")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211015173902.2278118-1-nathanl@linux.ibm.com
On SiFive Unmatched, I recently fell onto the following BUG when booting:
[ 0.000000] ftrace: allocating 36610 entries in 144 pages
[ 0.000000] Oops - illegal instruction [#1]
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.13.1+ #5
[ 0.000000] Hardware name: SiFive HiFive Unmatched A00 (DT)
[ 0.000000] epc : riscv_cpuid_to_hartid_mask+0x6/0xae
[ 0.000000] ra : __sbi_rfence_v02+0xc8/0x10a
[ 0.000000] epc : ffffffff80007240 ra : ffffffff80009964 sp : ffffffff81803e10
[ 0.000000] gp : ffffffff81a1ea70 tp : ffffffff8180f500 t0 : ffffffe07fe30000
[ 0.000000] t1 : 0000000000000004 t2 : 0000000000000000 s0 : ffffffff81803e60
[ 0.000000] s1 : 0000000000000000 a0 : ffffffff81a22238 a1 : ffffffff81803e10
[ 0.000000] a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000
[ 0.000000] a5 : 0000000000000000 a6 : ffffffff8000989c a7 : 0000000052464e43
[ 0.000000] s2 : ffffffff81a220c8 s3 : 0000000000000000 s4 : 0000000000000000
[ 0.000000] s5 : 0000000000000000 s6 : 0000000200000100 s7 : 0000000000000001
[ 0.000000] s8 : ffffffe07fe04040 s9 : ffffffff81a22c80 s10: 0000000000001000
[ 0.000000] s11: 0000000000000004 t3 : 0000000000000001 t4 : 0000000000000008
[ 0.000000] t5 : ffffffcf04000808 t6 : ffffffe3ffddf188
[ 0.000000] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000002
[ 0.000000] [<ffffffff80007240>] riscv_cpuid_to_hartid_mask+0x6/0xae
[ 0.000000] [<ffffffff80009474>] sbi_remote_fence_i+0x1e/0x26
[ 0.000000] [<ffffffff8000b8f4>] flush_icache_all+0x12/0x1a
[ 0.000000] [<ffffffff8000666c>] patch_text_nosync+0x26/0x32
[ 0.000000] [<ffffffff8000884e>] ftrace_init_nop+0x52/0x8c
[ 0.000000] [<ffffffff800f051e>] ftrace_process_locs.isra.0+0x29c/0x360
[ 0.000000] [<ffffffff80a0e3c6>] ftrace_init+0x80/0x130
[ 0.000000] [<ffffffff80a00f8c>] start_kernel+0x5c4/0x8f6
[ 0.000000] ---[ end trace f67eb9af4d8d492b ]---
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
While ftrace is looping over a list of addresses to patch, it always failed
when patching the same function: riscv_cpuid_to_hartid_mask. Looking at the
backtrace, the illegal instruction is encountered in this same function.
However, patch_text_nosync, after patching the instructions, calls
flush_icache_range. But looking at what happens in this function:
flush_icache_range -> flush_icache_all
-> sbi_remote_fence_i
-> __sbi_rfence_v02
-> riscv_cpuid_to_hartid_mask
The icache and dcache of the current cpu are never synchronized between the
patching of riscv_cpuid_to_hartid_mask and calling this same function.
So fix this by flushing the current cpu's icache before asking for the other
cpus to do the same.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Fixes: fab957c11efe ("RISC-V: Atomic and Locking Code")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Pull CPU hotplug updates from Thomas Gleixner:
"Updates for the SMP and CPU hotplug:
- Remove DEFINE_SMP_CALL_CACHE_FUNCTION() which is a left over of the
original hotplug code and now causing trouble with the ARM64 cache
topology setup due to the pointless SMP function call.
It's not longer required as the hotplug callbacks are guaranteed to
be invoked on the upcoming CPU.
- Remove the deprecated and now unused CPU hotplug functions
- Rewrite the CPU hotplug API documentation"
* tag 'smp-urgent-2021-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Documentation: core-api/cpuhotplug: Rewrite the API section
cpu/hotplug: Remove deprecated CPU-hotplug functions.
thermal: Replace deprecated CPU-hotplug functions.
drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION()
Resolves the checkpatch warning.
Signed-off-by: Jinchao Wang <wjc@cdjrlc.com>
[edited wording]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
$ ./scripts/get_maintainer.pl --rolestats --git-blame -f \
include/linux/compiler_attributes.h
...
Nick Desaulniers <ndesaulniers@google.com> (supporter:CLANG/LLVM BUILD
SUPPORT,authored lines:43/331=13%,commits:8/15=53%)
It's also important for me to stay up on which compiler attributes clang
is missing.
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
To pick the change in:
7957d93bf32bc211 ("block: add ioctl to read the disk sequence number")
It adds a new ioctl, but we are still not using that to generate tables
for 'perf trace', so no changes in tooling.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/fs.h' differs from latest version at 'include/uapi/linux/fs.h'
diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 8c0eb596baa5 ("[SCSI] qla2xxx: Fix a memory leak in an error path of
qla2x00_process_els()"), intended to change:
bsg_job->request->msgcode == FC_BSG_HST_ELS_NOLOGIN
to:
bsg_job->request->msgcode != FC_BSG_RPT_ELS
but changed it to:
bsg_job->request->msgcode == FC_BSG_RPT_ELS
instead.
Change the == to a != to avoid leaking the fcport structure or freeing
unallocated memory.
Link: https://lore.kernel.org/r/20211012191834.90306-2-jgu@purestorage.com
Fixes: 8c0eb596baa5 ("[SCSI] qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Joy Gu <jgu@purestorage.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix compilation of callchain related code on powerpc with gcc11+
- Fix PERF_SAMPLE_WEIGHT_STRUCT support in 'perf script'
- Check session->header.env.arch before using it, fixing a segmentation
fault
- Suppress 'rm dlfilter' build messages
* tag 'perf-tools-fixes-for-v5.15-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf script: Fix PERF_SAMPLE_WEIGHT_STRUCT support
perf callchain: Fix compilation on powerpc with gcc11+
perf script: Check session->header.env.arch before using it
perf build: Suppress 'rm dlfilter' build message
Pull kvm fixes from Paolo Bonzini:
- Fixes for s390 interrupt delivery
- Fixes for Xen emulator bugs showing up as debug kernel WARNs
- Fix another issue with SEV/ES string I/O VMGEXITs
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Take srcu lock in post_kvm_run_save()
KVM: SEV-ES: fix another issue with string I/O VMGEXITs
KVM: x86/xen: Fix kvm_xen_has_interrupt() sleeping in kvm_vcpu_block()
KVM: x86: switch pvclock_gtod_sync_lock to a raw spinlock
KVM: s390: preserve deliverable_mask in __airqs_kick_single_vcpu
KVM: s390: clear kicked_mask before sleeping again
-F weight in perf script is broken.
# ./perf mem record
# ./perf script -F weight
Samples for 'dummy:HG' event do not have WEIGHT attribute set. Cannot
print 'weight' field.
The sample type, PERF_SAMPLE_WEIGHT_STRUCT, is an alternative of the
PERF_SAMPLE_WEIGHT sample type. They share the same space, weight. The
lower 32 bits are exactly the same for both sample type. The higher 32
bits may be different for different architecture. For a new kernel on
x86, the PERF_SAMPLE_WEIGHT_STRUCT is used. For an old kernel or other
ARCHs, the PERF_SAMPLE_WEIGHT is used.
With -F weight, current perf script will only check the input string
"weight" with the PERF_SAMPLE_WEIGHT sample type. Because the commit
ea8d0ed6eae3 ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT") didn't
update the PERF_SAMPLE_WEIGHT_STRUCT sample type for perf script. For a
new kernel on x86, the check fails.
Use PERF_SAMPLE_WEIGHT_TYPE, which supports both sample types, to
replace PERF_SAMPLE_WEIGHT
Fixes: ea8d0ed6eae37b01 ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT")
Reported-by: Joe Mario <jmario@redhat.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Joe Mario <jmario@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Joe Mario <jmario@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1632929894-102778-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull SCSI fixes from James Bottomley:
"Three small fixes, all in drivers, and one sizeable update to the UFS
driver to remove the HPB 2.0 feature that has been objected to by Jens
and Christoph.
Although the UFS patch is large and last minute, it's essentially the
least intrusive way of resolving the objections in time for the 5.15
release"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: ufshpb: Remove HPB2.0 flows
scsi: mpt3sas: Fix reference tag handling for WRITE_INSERT
scsi: ufs: ufs-exynos: Correct timeout value setting registers
scsi: ibmvfc: Fix up duplicate response detection
The Xen interrupt injection for event channels relies on accessing the
guest's vcpu_info structure in __kvm_xen_has_interrupt(), through a
gfn_to_hva_cache.
This requires the srcu lock to be held, which is mostly the case except
for this code path:
[ 11.822877] WARNING: suspicious RCU usage
[ 11.822965] -----------------------------
[ 11.823013] include/linux/kvm_host.h:664 suspicious rcu_dereference_check() usage!
[ 11.823131]
[ 11.823131] other info that might help us debug this:
[ 11.823131]
[ 11.823196]
[ 11.823196] rcu_scheduler_active = 2, debug_locks = 1
[ 11.823253] 1 lock held by dom:0/90:
[ 11.823292] #0: ffff998956ec8118 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x85/0x680
[ 11.823379]
[ 11.823379] stack backtrace:
[ 11.823428] CPU: 2 PID: 90 Comm: dom:0 Kdump: loaded Not tainted 5.4.34+ #5
[ 11.823496] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[ 11.823612] Call Trace:
[ 11.823645] dump_stack+0x7a/0xa5
[ 11.823681] lockdep_rcu_suspicious+0xc5/0x100
[ 11.823726] __kvm_xen_has_interrupt+0x179/0x190
[ 11.823773] kvm_cpu_has_extint+0x6d/0x90
[ 11.823813] kvm_cpu_accept_dm_intr+0xd/0x40
[ 11.823853] kvm_vcpu_ready_for_interrupt_injection+0x20/0x30
< post_kvm_run_save() inlined here >
[ 11.823906] kvm_arch_vcpu_ioctl_run+0x135/0x6a0
[ 11.823947] kvm_vcpu_ioctl+0x263/0x680
Fixes: 40da8ccd724f ("KVM: x86/xen: Add event channel interrupt vector upcall")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: stable@vger.kernel.org
Message-Id: <606aaaf29fca3850a63aa4499826104e77a72346.camel@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Got following build fail on powerpc:
CC arch/powerpc/util/skip-callchain-idx.o
In function ‘check_return_reg’,
inlined from ‘check_return_addr’ at arch/powerpc/util/skip-callchain-idx.c:213:7,
inlined from ‘arch_skip_callchain_idx’ at arch/powerpc/util/skip-callchain-idx.c:265:7:
arch/powerpc/util/skip-callchain-idx.c:54:18: error: ‘dwarf_frame_register’ accessing 96 bytes \
in a region of size 64 [-Werror=stringop-overflow=]
54 | result = dwarf_frame_register(frame, ra_regno, ops_mem, &ops, &nops);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/util/skip-callchain-idx.c: In function ‘arch_skip_callchain_idx’:
arch/powerpc/util/skip-callchain-idx.c:54:18: note: referencing argument 3 of type ‘Dwarf_Op *’
In file included from /usr/include/elfutils/libdwfl.h:32,
from arch/powerpc/util/skip-callchain-idx.c:10:
/usr/include/elfutils/libdw.h:1069:12: note: in a call to function ‘dwarf_frame_register’
1069 | extern int dwarf_frame_register (Dwarf_Frame *frame, int regno,
| ^~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
The dwarf_frame_register args changed with [1],
Updating ops_mem accordingly.
[1] https://sourceware.org/git/?p=elfutils.git;a=commit;h=5621fe5443da23112170235dd5cac161e5c75e65
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Mark Wieelard <mjw@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20210928195253.1267023-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull clk fix from Stephen Boyd:
"One fix for the composite clk that broke when we changed this clk type
to use the determine_rate instead of round_rate clk op by default.
This caused lots of problems on Rockchip SoCs because they heavily use
the composite clk code to model the clk tree"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: composite: Also consider .determine_rate for rate + mux composites
The Host Performance Buffer feature allows UFS read commands to carry the
physical media addresses along with the LBAs, thus allowing less internal
L2P-table switches in the device. HPB1.0 allowed a single LBA, while
HPB2.0 increases this capacity up to 255 blocks.
Carrying more than a single record, the read operation is no longer purely
of type "read" but a "hybrid" command: Writing the physical address to the
device in one operation and reading back the required payload in another.
The JEDEC HPB spec defines two commands for this operation:
HPB-WRITE-BUFFER (0x2) to write the physical addresses to device, and
HPB-READ to read the payload.
With the current HPB design the UFS driver has no alternative but to divide
the READ request into 2 separate commands: HPB-WRITE-BUFFER and HPB-READ.
This causes a great deal of aggravation to the block layer guys who
demanded that we completely revert the entire HPB driver regardless of the
huge amount of corporate effort already invested in it.
As a compromise, remove only the pieces that implement the 2.0
specification. This is done as a matter of urgency for the final 5.15
release.
Link: https://lore.kernel.org/r/20211030062301.248-1-avri.altman@wdc.com
Tested-by: Avri Altman <avri.altman@wdc.com>
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Co-developed-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If the guest requests string I/O from the hypervisor via VMGEXIT,
SW_EXITINFO2 will contain the REP count. However, sev_es_string_io
was incorrectly treating it as the size of the GHCB buffer in
bytes.
This fixes the "outsw" test in the experimental SEV tests of
kvm-unit-tests.
Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e9f ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reported-by: Marc Orr <marcorr@google.com>
Tested-by: Marc Orr <marcorr@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When perf.data is not written cleanly, we would like to process existing
data as much as possible (please see f_header.data.size == 0 condition
in perf_session__read_header). However, perf.data with partial data may
crash perf. Specifically, we see crash in 'perf script' for NULL
session->header.env.arch.
Fix this by checking session->header.env.arch before using it to determine
native_arch. Also split the if condition so it is easier to read.
Committer notes:
If it is a pipe, we already assume is a native arch, so no need to check
session->header.env.arch.
Signed-off-by: Song Liu <songliubraving@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211004053238.514936-1-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull RISC-V fixes from Palmer Dabbelt:
"These are pretty late, but they do fix concrete issues.
- ensure the trap vector's address is aligned.
- avoid re-populating the KASAN shadow memory.
- allow kasan to build without warnings, which have recently become
errors"
* tag 'riscv-for-linus-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Fix asan-stack clang build
riscv: Do not re-populate shadow memory with kasan_populate_early_shadow
riscv: fix misalgned trap vector base address
Commit 69a00fb3d69706 ("clk: divider: Implement and wire up
.determine_rate by default") switches clk_divider_ops to implement
.determine_rate by default. This breaks composite clocks with multiple
parents because clk-composite.c does not use the special handling for
mux + divider combinations anymore (that was restricted to rate clocks
which only implement .round_rate, but not .determine_rate).
Alex reports:
This breaks lot of clocks for Rockchip which intensively uses
composites, i.e. those clocks will always stay at the initial parent,
which in some cases is the XTAL clock and I strongly guess it is the
same for other platforms, which use composite clocks having more than
one parent (e.g. mediatek, ti ...)
Example (RK3399)
clk_sdio is set (initialized) with XTAL (24 MHz) as parent in u-boot.
It will always stay at this parent, even if the mmc driver sets a rate
of 200 MHz (fails, as the nature of things), which should switch it
to any of its possible parent PLLs defined in
mux_pll_src_cpll_gpll_npll_ppll_upll_24m_p (see clk-rk3399.c) - which
never happens.
Restore the original behavior by changing the priority of the conditions
inside clk-composite.c. Now the special rate + mux case (with rate_ops
having a .round_rate - which is still the case for the default
clk_divider_ops) is preferred over rate_ops which have .determine_rate
defined (and not further considering the mux).
Fixes: 69a00fb3d69706 ("clk: divider: Implement and wire up .determine_rate by default")
Reported-by: Alex Bee <knaerzche@gmail.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Link: https://lore.kernel.org/r/20211016105022.303413-2-martin.blumenstingl@googlemail.com
Tested-by: Alex Bee <knaerzche@gmail.com>
Tested-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Testing revealed a problem with how the reference tag was handled for
a WRITE_INSERT operation. The SCSI_PROT_REF_CHECK flag is not set when
the controller is asked to generate the protection information
(i.e. not DIX). And as a result the initial reference tag would not be
set in the WRITE_INSERT case.
Separate handling of the REF_CHECK and REF_INCREMENT flags to align
with both the DIX spec and the MPI implementation.
Link: https://lore.kernel.org/r/20211028034202.24225-1-martin.petersen@oracle.com
Fixes: b3e2c72af1d5 ("scsi: mpt3sas: Use the proper SCSI midlayer interfaces for PI")
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In kvm_vcpu_block, the current task is set to TASK_INTERRUPTIBLE before
making a final check whether the vCPU should be woken from HLT by any
incoming interrupt.
This is a problem for the get_user() in __kvm_xen_has_interrupt(), which
really shouldn't be sleeping when the task state has already been set.
I think it's actually harmless as it would just manifest itself as a
spurious wakeup, but it's causing a debug warning:
[ 230.963649] do not call blocking ops when !TASK_RUNNING; state=1 set at [<00000000b6bcdbc9>] prepare_to_swait_exclusive+0x30/0x80
Fix the warning by turning it into an *explicit* spurious wakeup. When
invoked with !task_is_running(current) (and we might as well add
in_atomic() there while we're at it), just return 1 to indicate that
an IRQ is pending, which will cause a wakeup and then something will
call it again in a context that *can* sleep so it can fault the page
back in.
Cc: stable@vger.kernel.org
Fixes: 40da8ccd724f ("KVM: x86/xen: Add event channel interrupt vector upcall")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <168bf8c689561da904e48e2ff5ae4713eaef9e2d.camel@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The following build message:
rm dlfilters/dlfilter-test-api-v0.o
is unwanted.
The object file is being treated as an intermediate file and being
automatically removed. Mark the object file as .SECONDARY to prevent
removal and hence the message.
Requested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210930062849.110416-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull powerpc fixes from Michael Ellerman:
"Three commits fixing some issues introduced with the recent IOMMU
changes we merged.
Thanks to Alexey Kardashevskiy"
* tag 'powerpc-5.15-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/pseries/iommu: Create huge DMA window if no MMIO32 is present
powerpc/pseries/iommu: Check if the default window in use before removing it
powerpc/pseries/iommu: Use correct vfree for it_map
Nathan reported that because KASAN_SHADOW_OFFSET was not defined in
Kconfig, it prevents asan-stack from getting disabled with clang even
when CONFIG_KASAN_STACK is disabled: fix this by defining the
corresponding config.
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: 8ad8b72721d0 ("riscv: Add KASAN support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
PA_PWRMODEUSERDATA0 -> DL_FC0PROTTIMEOUTVAL
PA_PWRMODEUSERDATA1 -> DL_TC0REPLAYTIMEOUTVAL
PA_PWRMODEUSERDATA2 -> DL_AFC0REQTIMEOUTVAL
Link: https://lore.kernel.org/r/20211018062841.18226-1-chanho61.park@samsung.com
Fixes: a967ddb22d94 ("scsi: ufs: ufs-exynos: Apply vendor-specific values for three timeouts")
Cc: Alim Akhtar <alim.akhtar@samsung.com>
Cc: Kiwoong Kim <kwmad.kim@samsung.com>
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull gpio fixes from Bartosz Golaszewski:
- fix the return value check when parsing the ngpios property in
gpio-xgs-iproc
- check the return value of bgpio_init() in gpio-mlxbf2
* tag 'gpio-fixes-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: mlxbf2.c: Add check for bgpio_init failure
gpio: xgs-iproc: fix parsing of ngpios property
The iommu_init_table() helper takes an address range to reserve in
the IOMMU table being initialized to exclude MMIO addresses, this is
useful if the window stretches far beyond 4GB (although wastes some TCEs).
At the moment the code searches for such MMIO32 range and fails if none
found which is considered a problem while it really is not: it is actually
better as this says there is no MMIO32 to reserve and we can use
usually wasted TCEs. Furthermore PHYP never actually allows creating
windows starting at busaddress=0 so this MMIO32 range is never useful.
This removes error exit and initializes the table with zero range if
no MMIO32 is detected.
Fixes: 381ceda88c4c ("powerpc/pseries/iommu: Make use of DDW for indirect mapping")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211020132315.2287178-5-aik@ozlabs.ru
When calling this function, all the shadow memory is already populated
with kasan_early_shadow_pte which has PAGE_KERNEL protection.
kasan_populate_early_shadow write-protects the mapping of the range
of addresses passed in argument in zero_pte_populate, which actually
write-protects all the shadow memory mapping since kasan_early_shadow_pte
is used for all the shadow memory at this point. And then when using
memblock API to populate the shadow memory, the first write access to the
kernel stack triggers a trap. This becomes visible with the next commit
that contains a fix for asan-stack.
We already manually populate all the shadow memory in kasan_early_init
and we write-protect kasan_early_shadow_pte at the end of kasan_init
which makes the calls to kasan_populate_early_shadow superfluous so
we can remove them.
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: e178d670f251 ("riscv/kasan: add KASAN_VMALLOC support")
Fixes: 8ad8b72721d0 ("riscv: Add KASAN support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Pull more perf tools updates from Arnaldo Carvalho de Melo:
- Add missing fields and remove some duplicate fields when printing a
perf_event_attr.
- Fix hybrid config terms list corruption.
- Update kernel header copies, some resulted in new kernel features
being automagically added to 'perf trace' syscall/tracepoint argument
id->string translators.
- Add a file generated during the documentation build to .gitignore.
- Add an option to build without libbfd, as some distros, like Debian
consider its ABI unstable.
- Add support to print a textual representation of IBS raw sample data
in 'perf report'.
- Fix bpf 'perf test' sample mismatch reporting
- Fix passing arguments to stackcollapse report in a 'perf script'
python script.
- Allow build-id with trailing zeros.
- Look for ImageBase in PE file to compute .text offset.
* tag 'perf-tools-for-v5.15-2021-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (25 commits)
tools headers UAPI: Update tools's copy of drm.h headers
tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
tools headers UAPI: Sync linux/fs.h with the kernel sources
tools headers UAPI: Sync linux/in.h copy with the kernel sources
perf tools: Add an option to build without libbfd
perf tools: Allow build-id with trailing zeros
perf tools: Fix hybrid config terms list corruption
perf tools: Factor out copy_config_terms() and free_config_terms()
perf tools: Fix perf_event_attr__fprintf() missing/dupl. fields
perf tools: Ignore Documentation dependency file
perf bpf: Provide a weak btf__load_from_kernel_by_id() for older libbpf versions
tools include UAPI: Update linux/mount.h copy
perf beauty: Cover more flags in the move_mount syscall argument beautifier
tools headers UAPI: Sync linux/prctl.h with the kernel sources
tools include UAPI: Sync sound/asound.h copy with the kernel sources
tools headers UAPI: Sync linux/kvm.h with the kernel sources
tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
perf report: Add support to print a textual representation of IBS raw sample data
perf report: Add tools/arch/x86/include/asm/amd-ibs.h
perf env: Add perf_env__cpuid, perf_env__{nr_}pmu_mappings
...
Commit a264cf5e81c7 ("scsi: ibmvfc: Fix command state accounting and stale
response detection") introduced a regression in detecting duplicate
responses. This was observed in test where a command was sent to the VIOS
and completed before ibmvfc_send_event() set the active flag to 1, which
resulted in the atomic_dec_if_positive() call in ibmvfc_handle_crq()
thinking this was a duplicate response, which resulted in scsi_done() not
getting called, so we then hit a SCSI command timeout for this command once
the timeout expires. This simply ensures the active flag gets set prior to
making the hcall to send the command to the VIOS, in order to close this
window.
Link: https://lore.kernel.org/r/20211019152129.16558-1-brking@linux.vnet.ibm.com
Fixes: a264cf5e81c7 ("scsi: ibmvfc: Fix command state accounting and stale response detection")
Cc: stable@vger.kernel.org
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On the preemption path when updating a Xen guest's runstate times, this
lock is taken inside the scheduler rq->lock, which is a raw spinlock.
This was shown in a lockdep warning:
[ 89.138354] =============================
[ 89.138356] [ BUG: Invalid wait context ]
[ 89.138358] 5.15.0-rc5+ #834 Tainted: G S I E
[ 89.138360] -----------------------------
[ 89.138361] xen_shinfo_test/2575 is trying to lock:
[ 89.138363] ffffa34a0364efd8 (&kvm->arch.pvclock_gtod_sync_lock){....}-{3:3}, at: get_kvmclock_ns+0x1f/0x130 [kvm]
[ 89.138442] other info that might help us debug this:
[ 89.138444] context-{5:5}
[ 89.138445] 4 locks held by xen_shinfo_test/2575:
[ 89.138447] #0: ffff972bdc3b8108 (&vcpu->mutex){+.+.}-{4:4}, at: kvm_vcpu_ioctl+0x77/0x6f0 [kvm]
[ 89.138483] #1: ffffa34a03662e90 (&kvm->srcu){....}-{0:0}, at: kvm_arch_vcpu_ioctl_run+0xdc/0x8b0 [kvm]
[ 89.138526] #2: ffff97331fdbac98 (&rq->__lock){-.-.}-{2:2}, at: __schedule+0xff/0xbd0
[ 89.138534] #3: ffffa34a03662e90 (&kvm->srcu){....}-{0:0}, at: kvm_arch_vcpu_put+0x26/0x170 [kvm]
...
[ 89.138695] get_kvmclock_ns+0x1f/0x130 [kvm]
[ 89.138734] kvm_xen_update_runstate+0x14/0x90 [kvm]
[ 89.138783] kvm_xen_update_runstate_guest+0x15/0xd0 [kvm]
[ 89.138830] kvm_arch_vcpu_put+0xe6/0x170 [kvm]
[ 89.138870] kvm_sched_out+0x2f/0x40 [kvm]
[ 89.138900] __schedule+0x5de/0xbd0
Cc: stable@vger.kernel.org
Reported-by: syzbot+b282b65c2c68492df769@syzkaller.appspotmail.com
Fixes: 30b5c851af79 ("KVM: x86/xen: Add support for vCPU runstate information")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <1b02a06421c17993df337493a68ba923f3bd5c0f.camel@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Changing the deliverable mask in __airqs_kick_single_vcpu() is a bug. If
one idle vcpu can't take the interrupts we want to deliver, we should
look for another vcpu that can, instead of saying that we don't want
to deliver these interrupts by clearing the bits from the
deliverable_mask.
Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20211019175401.3757927-3-pasic@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Pull block fixes from Jens Axboe:
- NVMe pull request:
- fix nvmet-tcp header digest verification (Amit Engel)
- fix a memory leak in nvmet-tcp when releasing a queue (Maurizio
Lombardi)
- fix nvme-tcp H2CData PDU send accounting again (Sagi Grimberg)
- fix digest pointer calculation in nvme-tcp and nvmet-tcp (Varun
Prakash)
- fix possible nvme-tcp req->offset corruption (Varun Prakash)
- Queue drain ordering fix (Ming)
- Partition check regression for zoned devices (Shin'ichiro)
- Zone queue restart fix (Naohiro)
* tag 'block-5.15-2021-10-29' of git://git.kernel.dk/linux-block:
block: Fix partition check for host-aware zoned block devices
nvmet-tcp: fix header digest verification
nvmet-tcp: fix data digest pointer calculation
nvme-tcp: fix data digest pointer calculation
nvme-tcp: fix possible req->offset corruption
block: schedule queue restart after BLK_STS_ZONE_RESOURCE
block: drain queue after disk is removed from sysfs
nvme-tcp: fix H2CData PDU send accounting (again)
nvmet-tcp: fix a memory leak when releasing a queue
At the moment this check is performed after we remove the default window
which is late and disallows to revert whatever changes enable_ddw()
has made to DMA windows.
This moves the check and error exit before removing the window.
This raised the message severity from "debug" to "warning" as this
should not happen in practice and cannot be triggered by the userspace.
Fixes: 381ceda88c4c ("powerpc/pseries/iommu: Make use of DDW for indirect mapping")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211020132315.2287178-4-aik@ozlabs.ru
The trap vector marked by label .Lsecondary_park must align on a
4-byte boundary, as the {m,s}tvec is defined to require 4-byte
alignment.
Signed-off-by: Chen Lu <181250012@smail.nju.edu.cn>
Reviewed-by: Anup Patel <anup.patel@wdc.com>
Fixes: e011995e826f ("RISC-V: Move relocate and few other functions out of __init")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Pull compiler attributes updates from Miguel Ojeda:
- Fix __has_attribute(__no_sanitize_coverage__) for GCC 4 (Marco Elver)
- Add Nick as Reviewer for compiler_attributes.h (Nick Desaulniers)
- Move __compiletime_{error|warning} (Nick Desaulniers)
* tag 'compiler-attributes-for-linus-v5.15-rc1-v2' of git://github.com/ojeda/linux:
compiler_attributes.h: move __compiletime_{error|warning}
MAINTAINERS: add Nick as Reviewer for compiler_attributes.h
Compiler Attributes: fix __has_attribute(__no_sanitize_coverage__) for GCC 4
Picking the changes from:
17ce9c61c71cbc0d ("drm: document DRM_IOCTL_MODE_RMFB")
Doesn't result in any tooling changes:
$ tools/perf/trace/beauty/drm_ioctl.sh > before
$ cp include/uapi/drm/drm.h tools/include/uapi/drm/drm.h
$ tools/perf/trace/beauty/drm_ioctl.sh > after
$ diff -u before after
Silencing these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/drm/drm.h' differs from latest version at 'include/uapi/drm/drm.h'
diff -u tools/include/uapi/drm/drm.h include/uapi/drm/drm.h
Cc: Simon Ser <contact@emersion.fr>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implement the ->restore() PM operation and set the link to off, which will
force a full reset and restore. This ensures that Host Performance Booster
is reset after suspend-to-disk.
The Host Performance Booster feature caches logical-to-physical mapping
information in the host memory. After suspend-to-disk, such information is
not valid, so a full reset and restore is needed.
A full reset and restore is done if the SPM level is 5 or 6, but not for
other SPM levels, so this change fixes those cases.
A full reset and restore also restores base address registers, so that code
is removed.
Link: https://lore.kernel.org/r/20211018151004.284200-2-adrian.hunter@intel.com
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If allocation of rmaps fails, but some of the pointers have already been written,
those pointers can be cleaned up when the memslot is freed, or even reused later
for another attempt at allocating the rmaps. Therefore there is no need to
WARN, as done for example in memslot_rmap_alloc, but the allocation *must* be
skipped lest KVM will overwrite the previous pointer and will indeed leak memory.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The idea behind kicked mask is that we should not re-kick a vcpu that
is already in the "kick" process, i.e. that was kicked and is
is about to be dispatched if certain conditions are met.
The problem with the current implementation is, that it assumes the
kicked vcpu is going to enter SIE shortly. But under certain
circumstances, the vcpu we just kicked will be deemed non-runnable and
will remain in wait state. This can happen, if the interrupt(s) this
vcpu got kicked to deal with got already cleared (because the interrupts
got delivered to another vcpu). In this case kvm_arch_vcpu_runnable()
would return false, and the vcpu would remain in kvm_vcpu_block(),
but this time with its kicked_mask bit set. So next time around we
wouldn't kick the vcpu form __airqs_kick_single_vcpu(), but would assume
that we just kicked it.
Let us make sure the kicked_mask is cleared before we give up on
re-dispatching the vcpu.
Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")
Reported-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20211019175401.3757927-2-pasic@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Pull MMC fixes from Ulf Hansson:
- tmio: Re-enable card irqs after a reset
- mtk-sd: Fixup probing of cqhci for crypto
- cqhci: Fix support for suspend/resume
- vub300: Fix control-message timeouts
- dw_mmc-exynos: Fix support for tuning
- winbond: Silences build errors on M68K
- sdhci-esdhc-imx: Fix support for tuning
- sdhci-pci: Read card detect from ACPI for Intel Merrifield
- sdhci: Fix eMMC support for Thundercomm TurboX CM2290
* tag 'mmc-v5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: tmio: reenable card irqs after the reset callback
mmc: mediatek: Move cqhci init behind ungate clock
mmc: cqhci: clear HALT state after CQE enable
mmc: vub300: fix control-message timeouts
mmc: dw_mmc: exynos: fix the finding clock sample value
mmc: winbond: don't build on M68K
mmc: sdhci-esdhc-imx: clear the buffer_read_ready to reset standard tuning circuit
mmc: sdhci-pci: Read card detect from ACPI for Intel Merrifield
mmc: sdhci: Map more voltage level to SDHCI_POWER_330
Pull NVMe fixes from Christoph:
"nvme fixe for Linux 5.15
- fix nvmet-tcp header digest verification (Amit Engel)
- fix a memory leak in nvmet-tcp when releasing a queue
(Maurizio Lombardi)
- fix nvme-tcp H2CData PDU send accounting again (Sagi Grimberg)
- fix digest pointer calculation in nvme-tcp and nvmet-tcp
(Varun Prakash)
- fix possible nvme-tcp req->offset corruption (Varun Prakash)"
* tag 'nvme-5.15-2021-10-28' of git://git.infradead.org/nvme:
nvmet-tcp: fix header digest verification
nvmet-tcp: fix data digest pointer calculation
nvme-tcp: fix data digest pointer calculation
nvme-tcp: fix possible req->offset corruption
nvme-tcp: fix H2CData PDU send accounting (again)
nvmet-tcp: fix a memory leak when releasing a queue
of_property_read_u32 returns 0 on success, not true, so we need to
invert the check to actually take over the provided ngpio value.
Fixes: 6a41b6c5fc20 ("gpio: Add xgs-iproc driver")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
The it_map array is vzalloc'ed so use vfree() for it when creating
a huge DMA window failed for whatever reason.
While at this, write zero to it_map.
Fixes: 381ceda88c4c ("powerpc/pseries/iommu: Make use of DDW for indirect mapping")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211020132315.2287178-3-aik@ozlabs.ru
Pull auxdisplay updates from Miguel Ojeda:
"An assortment of improvements for auxdisplay:
- Replace symbolic permissions with octal permissions (Jinchao Wang)
- ks0108: Switch to use module_parport_driver() (Andy Shevchenko)
- charlcd: Drop unneeded initializers and switch to C99 style (Andy
Shevchenko)
- hd44780: Fix oops on module unloading (Lars Poeschel)
- Add I2C gpio expander example (Ralf Schlatterbeck)"
* tag 'auxdisplay-for-linus-v5.15-rc1' of git://github.com/ojeda/linux:
auxdisplay: Replace symbolic permissions with octal permissions
auxdisplay: ks0108: Switch to use module_parport_driver()
auxdisplay: charlcd: Drop unneeded initializers and switch to C99 style
auxdisplay: hd44780: Fix oops on module unloading
auxdisplay: Add I2C gpio expander example
Clang 14 will add support for __attribute__((__error__(""))) and
__attribute__((__warning__(""))). To make use of these in
__compiletime_error and __compiletime_warning (as used by BUILD_BUG and
friends) for newer clang and detect/fallback for older versions of
clang, move these to compiler_attributes.h and guard them with
__has_attribute preprocessor guards.
Link: https://reviews.llvm.org/D106030
Link: https://bugs.llvm.org/show_bug.cgi?id=16428
Link: https://github.com/ClangBuiltLinux/linux/issues/1173
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
[Reworded, landed in Clang 14]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
To pick the changes in:
b65a9489730a2494 ("drm/i915/userptr: Probe existence of backing struct pages upon creation")
ee242ca704d38699 ("drm/i915/guc: Implement GuC priority management")
81340cf3bddded4f ("drm/i915/uapi: reject set_domain for discrete")
7961c5b60f23dff5 ("drm/i915: Add TTM offset argument to mmap.")
aef7b67a79564f6c ("drm/i915/uapi: convert drm_i915_gem_userptr to kernel doc")
e7737b67ab46ee0e ("drm/i915/uapi: reject caching ioctls for discrete")
3aa8c57fe25a9247 ("drm/i915/uapi: convert drm_i915_gem_set_domain to kernel doc")
289f5a72009b8f67 ("drm/i915/uapi: convert drm_i915_gem_caching to kernel doc")
4a766ae40ec83301 ("drm/i915: Drop the CONTEXT_CLONE API (v2)")
6ff6d61dd2a943bd ("drm/i915: Drop I915_CONTEXT_PARAM_NO_ZEROMAP")
fe4751c3d513ff4f ("drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZE")
577729533cdc4e37 ("drm/i915: Document the Virtual Engine uAPI")
c649432e86ca677d ("drm/i915: Fix busy ioctl commentary")
That doesn't result in any changes to tooling as no new ioctl were
added (at least not perceived by tools/perf/trace/beauty/drm_ioctl.sh).
Addressing this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The sgl is freed in the target stack in target_release_cmd_kref() before
calling qlt_free_cmd() but there is an unmap of sgl in qlt_free_cmd() that
causes a panic if sgl is not yet DMA unmapped:
NIP dma_direct_unmap_sg+0xdc/0x180
LR dma_direct_unmap_sg+0xc8/0x180
Call Trace:
ql_dbg_prefix+0x68/0xc0 [qla2xxx] (unreliable)
dma_unmap_sg_attrs+0x54/0xf0
qlt_unmap_sg.part.19+0x54/0x1c0 [qla2xxx]
qlt_free_cmd+0x124/0x1d0 [qla2xxx]
tcm_qla2xxx_release_cmd+0x4c/0xa0 [tcm_qla2xxx]
target_put_sess_cmd+0x198/0x370 [target_core_mod]
transport_generic_free_cmd+0x6c/0x1b0 [target_core_mod]
tcm_qla2xxx_complete_free+0x6c/0x90 [tcm_qla2xxx]
The sgl may be left unmapped in error cases of response sending. For
instance, qlt_rdy_to_xfer() maps sgl and exits when session is being
deleted keeping the sgl mapped.
This patch removes use-after-free of the sgl and ensures that the sgl is
unmapped for any command that was not sent to firmware.
Link: https://lore.kernel.org/r/20211018122650.11846-1-d.bogdanov@yadro.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The latest compile changes pointed us to a few instances where we use
the kernel documentation style but don't explain all variables or
don't adhere to it 100%.
It's easy to fix so let's do that.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Pull btrfs fixes from David Sterba:
"Last minute fixes for crash on 32bit architectures when compression is
in use. It's a regression introduced in 5.15-rc and I'd really like
not let this into the final release, fixes via stable trees would add
unnecessary delay.
The problem is on 32bit architectures with highmem enabled, the pages
for compression may need to be kmapped, while the patches removed that
as we don't use GFP_HIGHMEM allocations anymore. The pages that don't
come from local allocation still may be from highmem. Despite being on
32bit there's enough such ARM machines in use so it's not a marginal
issue.
I did full reverts of the patches one by one instead of a huge one.
There's one exception for the "lzo" revert as there was an
intermediate patch touching the same code to make it compatible with
subpage. I can't revert that one too, so the revert in lzo.c is
manual. Qu Wenruo has worked on that with me and verified the changes"
* tag 'for-5.15-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
Revert "btrfs: compression: drop kmap/kunmap from lzo"
Revert "btrfs: compression: drop kmap/kunmap from zlib"
Revert "btrfs: compression: drop kmap/kunmap from zstd"
Revert "btrfs: compression: drop kmap/kunmap from generic helpers"
The reset callback may clear the internal card detect interrupts, so
make sure to reenable them if needed.
Fixes: b4d86f37eacb ("mmc: renesas_sdhi: do hard reset if possible")
Reported-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211028195149.8003-1-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Commit a33df75c6328 ("block: use an xarray for disk->part_tbl") modified
the method to check partition existence in host-aware zoned block
devices from disk_has_partitions() helper function call to empty check
of xarray disk->part_tbl. However, disk->part_tbl always has single
entry for disk->part0 and never becomes empty. This resulted in the
host-aware zoned devices always judged to have partitions, and it made
the sysfs queue/zoned attribute to be "none" instead of "host-aware"
regardless of partition existence in the devices.
This also caused DEBUG_LOCKS_WARN_ON(lock->magic != lock) for
sdkp->rev_mutex in scsi layer when the kernel detects host-aware zoned
device. Since block layer handled the host-aware zoned devices as non-
zoned devices, scsi layer did not have chance to initialize the mutex
for zone revalidation. Therefore, the warning was triggered.
To fix the issues, call the helper function disk_has_partitions() in
place of disk->part_tbl empty check. Since the function was removed with
the commit a33df75c6328, reimplement it to walk through entries in the
xarray disk->part_tbl.
Fixes: a33df75c6328 ("block: use an xarray for disk->part_tbl")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Cc: stable@vger.kernel.org # v5.14+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211026060115.753746-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pass the correct length to nvmet_tcp_verify_hdgst, which is the pdu
header length. This fixes a wrong behaviour where header digest
verification passes although the digest is wrong.
Signed-off-by: Amit Engel <amit.engel@dell.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The gpio-mockup driver creates the properties that are shared between
platform and GPIO devices. Because of that, the properties may not
be removed at the proper point of time without provoking a use-after-free
as shown in the following backtrace:
refcount_t: underflow; use-after-free.
WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
...
Call Trace:
kobject_put+0xdc/0xf0
software_node_notify_remove+0xa8/0xc0
device_del+0x15a/0x3e0
That's why the driver has to manage the lifetime of the software nodes
by itself.
The problem originates from the old device_add_properties() API, but
has been only revealed after the commit bd1e336aa853 ("driver core: platform:
Remove platform_device_add_properties()"). Hence, it's used as a landmark
for backporting.
Fixes: bd1e336aa853 ("driver core: platform: Remove platform_device_add_properties()")
Reported-by: Kent Gibson <warthog618@gmail.com>
Tested-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[Bartosz: tweaked local variable placement]
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
With PREEMPT_COUNT=y, when a CPU is offlined and then onlined again, we
get:
BUG: scheduling while atomic: swapper/1/0/0x00000000
no locks held by swapper/1/0.
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.15.0-rc2+ #100
Call Trace:
dump_stack_lvl+0xac/0x108
__schedule_bug+0xac/0xe0
__schedule+0xcf8/0x10d0
schedule_idle+0x3c/0x70
do_idle+0x2d8/0x4a0
cpu_startup_entry+0x38/0x40
start_secondary+0x2ec/0x3a0
start_secondary_prolog+0x10/0x14
This is because powerpc's arch_cpu_idle_dead() decrements the idle task's
preempt count, for reasons explained in commit a7c2bb8279d2 ("powerpc:
Re-enable preemption before cpu_die()"), specifically "start_secondary()
expects a preempt_count() of 0."
However, since commit 2c669ef6979c ("powerpc/preempt: Don't touch the idle
task's preempt_count during hotplug") and commit f1a0a376ca0c ("sched/core:
Initialize the idle task with preemption disabled"), that justification no
longer holds.
The idle task isn't supposed to re-enable preemption, so remove the
vestigial preempt_enable() from the CPU offline path.
Tested with pseries and powernv in qemu, and pseries on PowerVM.
Fixes: 2c669ef6979c ("powerpc/preempt: Don't touch the idle task's preempt_count during hotplug")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211015173902.2278118-1-nathanl@linux.ibm.com
On SiFive Unmatched, I recently fell onto the following BUG when booting:
[ 0.000000] ftrace: allocating 36610 entries in 144 pages
[ 0.000000] Oops - illegal instruction [#1]
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.13.1+ #5
[ 0.000000] Hardware name: SiFive HiFive Unmatched A00 (DT)
[ 0.000000] epc : riscv_cpuid_to_hartid_mask+0x6/0xae
[ 0.000000] ra : __sbi_rfence_v02+0xc8/0x10a
[ 0.000000] epc : ffffffff80007240 ra : ffffffff80009964 sp : ffffffff81803e10
[ 0.000000] gp : ffffffff81a1ea70 tp : ffffffff8180f500 t0 : ffffffe07fe30000
[ 0.000000] t1 : 0000000000000004 t2 : 0000000000000000 s0 : ffffffff81803e60
[ 0.000000] s1 : 0000000000000000 a0 : ffffffff81a22238 a1 : ffffffff81803e10
[ 0.000000] a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000
[ 0.000000] a5 : 0000000000000000 a6 : ffffffff8000989c a7 : 0000000052464e43
[ 0.000000] s2 : ffffffff81a220c8 s3 : 0000000000000000 s4 : 0000000000000000
[ 0.000000] s5 : 0000000000000000 s6 : 0000000200000100 s7 : 0000000000000001
[ 0.000000] s8 : ffffffe07fe04040 s9 : ffffffff81a22c80 s10: 0000000000001000
[ 0.000000] s11: 0000000000000004 t3 : 0000000000000001 t4 : 0000000000000008
[ 0.000000] t5 : ffffffcf04000808 t6 : ffffffe3ffddf188
[ 0.000000] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000002
[ 0.000000] [<ffffffff80007240>] riscv_cpuid_to_hartid_mask+0x6/0xae
[ 0.000000] [<ffffffff80009474>] sbi_remote_fence_i+0x1e/0x26
[ 0.000000] [<ffffffff8000b8f4>] flush_icache_all+0x12/0x1a
[ 0.000000] [<ffffffff8000666c>] patch_text_nosync+0x26/0x32
[ 0.000000] [<ffffffff8000884e>] ftrace_init_nop+0x52/0x8c
[ 0.000000] [<ffffffff800f051e>] ftrace_process_locs.isra.0+0x29c/0x360
[ 0.000000] [<ffffffff80a0e3c6>] ftrace_init+0x80/0x130
[ 0.000000] [<ffffffff80a00f8c>] start_kernel+0x5c4/0x8f6
[ 0.000000] ---[ end trace f67eb9af4d8d492b ]---
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
While ftrace is looping over a list of addresses to patch, it always failed
when patching the same function: riscv_cpuid_to_hartid_mask. Looking at the
backtrace, the illegal instruction is encountered in this same function.
However, patch_text_nosync, after patching the instructions, calls
flush_icache_range. But looking at what happens in this function:
flush_icache_range -> flush_icache_all
-> sbi_remote_fence_i
-> __sbi_rfence_v02
-> riscv_cpuid_to_hartid_mask
The icache and dcache of the current cpu are never synchronized between the
patching of riscv_cpuid_to_hartid_mask and calling this same function.
So fix this by flushing the current cpu's icache before asking for the other
cpus to do the same.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Fixes: fab957c11efe ("RISC-V: Atomic and Locking Code")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Pull CPU hotplug updates from Thomas Gleixner:
"Updates for the SMP and CPU hotplug:
- Remove DEFINE_SMP_CALL_CACHE_FUNCTION() which is a left over of the
original hotplug code and now causing trouble with the ARM64 cache
topology setup due to the pointless SMP function call.
It's not longer required as the hotplug callbacks are guaranteed to
be invoked on the upcoming CPU.
- Remove the deprecated and now unused CPU hotplug functions
- Rewrite the CPU hotplug API documentation"
* tag 'smp-urgent-2021-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Documentation: core-api/cpuhotplug: Rewrite the API section
cpu/hotplug: Remove deprecated CPU-hotplug functions.
thermal: Replace deprecated CPU-hotplug functions.
drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION()
$ ./scripts/get_maintainer.pl --rolestats --git-blame -f \
include/linux/compiler_attributes.h
...
Nick Desaulniers <ndesaulniers@google.com> (supporter:CLANG/LLVM BUILD
SUPPORT,authored lines:43/331=13%,commits:8/15=53%)
It's also important for me to stay up on which compiler attributes clang
is missing.
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
To pick the change in:
7957d93bf32bc211 ("block: add ioctl to read the disk sequence number")
It adds a new ioctl, but we are still not using that to generate tables
for 'perf trace', so no changes in tooling.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/fs.h' differs from latest version at 'include/uapi/linux/fs.h'
diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 8c0eb596baa5 ("[SCSI] qla2xxx: Fix a memory leak in an error path of
qla2x00_process_els()"), intended to change:
bsg_job->request->msgcode == FC_BSG_HST_ELS_NOLOGIN
to:
bsg_job->request->msgcode != FC_BSG_RPT_ELS
but changed it to:
bsg_job->request->msgcode == FC_BSG_RPT_ELS
instead.
Change the == to a != to avoid leaking the fcport structure or freeing
unallocated memory.
Link: https://lore.kernel.org/r/20211012191834.90306-2-jgu@purestorage.com
Fixes: 8c0eb596baa5 ("[SCSI] qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Joy Gu <jgu@purestorage.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>