···
 Julien Thierry <julien.thierry.kdev@gmail.com> <julien.thierry@arm.com>
 Iskren Chernev <me@iskren.info> <iskren.chernev@gmail.com>
 Kalle Valo <kvalo@kernel.org> <kvalo@codeaurora.org>
+Kalle Valo <kvalo@kernel.org> <quic_kvalo@quicinc.com>
 Kalyan Thota <quic_kalyant@quicinc.com> <kalyan_t@codeaurora.org>
 Karthikeyan Periyasamy <quic_periyasa@quicinc.com> <periyasa@codeaurora.org>
 Kathiravan T <quic_kathirav@quicinc.com> <kathirav@codeaurora.org>
+2-4
CREDITS
···
 D: Initial implementation of VC's, pty's and select()

 N: Pavel Machek
-E: pavel@ucw.cz
+E: pavel@kernel.org
 P: 4096R/92DFCE96 4FA7 9EEF FCD4 C44F C585 B8C7 C060 2241 92DF CE96
-D: Softcursor for vga, hypertech cdrom support, vcsa bugfix, nbd,
-D: sun4/330 port, capabilities for elf, speedup for rm on ext2, USB,
-D: work on suspend-to-ram/disk, killing duplicates from ioctl32,
+D: NBD, Sun4/330 port, USB, work on suspend-to-ram/disk,
 D: Altera SoCFPGA and Nokia N900 support.
 S: Czech Republic
+1-1
Documentation/arch/arm64/gcs.rst
···
   shadow stacks rather than GCS.

 * Support for GCS is reported to userspace via HWCAP_GCS in the aux vector
-  AT_HWCAP2 entry.
+  AT_HWCAP entry.

 * GCS is enabled per thread. While there is support for disabling GCS
   at runtime this should be done with great care.
···
 maintainers:
   - Taniya Das <quic_tdas@quicinc.com>
+  - Imran Shaik <quic_imrashai@quicinc.com>

 description: |
   Qualcomm camera clock control module provides the clocks, resets and power
   domains on SA8775p.

-  See also: include/dt-bindings/clock/qcom,sa8775p-camcc.h
+  See also:
+    include/dt-bindings/clock/qcom,qcs8300-camcc.h
+    include/dt-bindings/clock/qcom,sa8775p-camcc.h

 properties:
   compatible:
     enum:
+      - qcom,qcs8300-camcc
       - qcom,sa8775p-camcc

   clocks:
···
 description: |
   The Microchip LAN966x outband interrupt controller (OIC) maps the internal
-  interrupt sources of the LAN966x device to an external interrupt.
-  When the LAN966x device is used as a PCI device, the external interrupt is
-  routed to the PCI interrupt.
+  interrupt sources of the LAN966x device to a PCI interrupt when the LAN966x
+  device is used as a PCI device.

 properties:
   compatible:
···
 title: Qualcomm Technologies ath10k wireless devices

 maintainers:
-  - Kalle Valo <kvalo@kernel.org>
   - Jeff Johnson <jjohnson@kernel.org>

 description:
···
 title: Qualcomm Technologies ath11k wireless devices (PCIe)

 maintainers:
-  - Kalle Valo <kvalo@kernel.org>
   - Jeff Johnson <jjohnson@kernel.org>

 description: |
···
 title: Qualcomm Technologies ath11k wireless devices

 maintainers:
-  - Kalle Valo <kvalo@kernel.org>
   - Jeff Johnson <jjohnson@kernel.org>

 description: |
···
 maintainers:
   - Jeff Johnson <jjohnson@kernel.org>
-  - Kalle Valo <kvalo@kernel.org>

 description: |
   Qualcomm Technologies IEEE 802.11be PCIe devices with WSI interface.
···
 maintainers:
   - Jeff Johnson <quic_jjohnson@quicinc.com>
-  - Kalle Valo <kvalo@kernel.org>

 description:
   Qualcomm Technologies IEEE 802.11be PCIe devices.
···
   Each sub-node is identified using the node's name, with valid values listed
   for each of the pmics below.

-  For mp5496, s1, s2
+  For mp5496, s1, s2, l2, l5

   For pm2250, s1, s2, s3, s4, l1, l2, l3, l4, l5, l6, l7, l8, l9, l10, l11,
   l12, l13, l14, l15, l16, l17, l18, l19, l20, l21, l22
+Submitting patches to bcachefs:
+===============================
+
+Patches must be tested before being submitted, either with the xfstests suite
+[0], or the full bcachefs test suite in ktest [1], depending on what's being
+touched. Note that ktest wraps xfstests and will be an easier way to run it
+for most users; it includes single-command wrappers for all the mainstream
+in-kernel local filesystems.
+
+Patches will undergo more testing after being merged (including
+lockdep/kasan/preempt/etc. variants); these are not generally required to be
+run by the submitter - but do put some thought into what you're changing and
+which tests might be relevant. For example: tricky memory layout work calls
+for kasan, locking work calls for lockdep, and ktest includes single-command
+variants for the debug build types you'll most likely need.
+
+The exception to this rule is incomplete WIP/RFC patches: if you're working on
+something nontrivial, it's encouraged to send out a WIP patch to let people
+know what you're doing and make sure you're on the right track. Just make sure
+it includes a brief note as to what's done and what's incomplete, to avoid
+confusion.
+
+Rigorous checkpatch.pl adherence is not required (many of its warnings are
+considered out of date), but try not to deviate too much without reason.
+
+Focus on writing code that reads well and is organized well; code should be
+aesthetically pleasing.
+
+CI:
+===
+
+Instead of running your tests locally, when running the full test suite it's
+preferable to let a server farm do it in parallel, and then have the results
+in a nice test dashboard (which can tell you which failures are new, and
+presents results in a git log view, avoiding the need for most bisecting).
+
+That exists [2], and community members may request an account. If you work for
+a big tech company, you'll need to help out with server costs to get access -
+but the CI is not restricted to running bcachefs tests: it runs any ktest test
+(which generally makes it easy to wrap other tests that can run in qemu).
+
+Other things to think about:
+============================
+
+- How will we debug this code? Is there sufficient introspection to diagnose
+  when something starts acting wonky on a user machine?
+
+  We don't necessarily need every single field of every data structure visible
+  with introspection, but having the important fields of all the core data
+  types wired up makes debugging drastically easier - a bit of thoughtful
+  foresight greatly reduces the need to have people build custom kernels with
+  debug patches.
+
+  More broadly, think about all the debug tooling that might be needed.
+
+- Does it make the codebase more or less of a mess? Can we try to do some
+  organizing, too?
+
+- Do new tests need to be written? New assertions? How do we know and verify
+  that the code is correct, and what happens if something goes wrong?
+
+  We don't yet have automated code coverage analysis or easy fault injection -
+  but for now, pretend we did and ask what they might tell us.
+
+  Assertions are hugely important, given that we don't yet have a systems
+  language that can do ergonomic embedded correctness proofs. Hitting an assert
+  in testing is much better than wandering off into undefined behaviour la-la
+  land - use them. Use them judiciously, and not as a replacement for proper
+  error handling, but use them.
+
+- Does it need to be performance tested? Should we add new performance
+  counters?
+
+  bcachefs has a set of persistent runtime counters which can be viewed with
+  the 'bcachefs fs top' command; this should give users a basic idea of what
+  their filesystem is currently doing. If you're doing a new feature or
+  looking at old code, think about whether anything should be added.
+
+- If it's a new on-disk format feature - have upgrades and downgrades been
+  tested? (Automated tests exist but aren't in the CI, due to the hassle of
+  disk image management; coordinate to have them run.)
+
+Mailing list, IRC:
+==================
+
+Patches should hit the list [3], but much discussion and code review happens
+on IRC as well [4]; many people appreciate the more conversational approach
+and quicker feedback.
+
+Additionally, we have a lively user community doing excellent QA work, which
+exists primarily on IRC. Please make use of that resource; user feedback is
+important for any nontrivial feature, and documenting it in commit messages
+would be a good idea.
+
+[0]: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
+[1]: https://evilpiepirate.org/git/ktest.git/
+[2]: https://evilpiepirate.org/~testdashboard/ci/
+[3]: linux-bcachefs@vger.kernel.org
+[4]: irc.oftc.net#bcache, #bcachefs-dev
···
 S390:
 ^^^^^

-Returns -EINVAL if the VM has the KVM_VM_S390_UCONTROL flag set.
+Returns -EINVAL or -EEXIST if the VM has the KVM_VM_S390_UCONTROL flag set.
 Returns -EINVAL if called on a protected VM.

 4.36 KVM_SET_TSS_ADDR
+65-20
MAINTAINERS
···
 F:	sound/soc/codecs/ssm3515.c

 ARM/APPLE MACHINE SUPPORT
-M:	Hector Martin <marcan@marcan.st>
 M:	Sven Peter <sven@svenpeter.dev>
 R:	Alyssa Rosenzweig <alyssa@rosenzweig.io>
 L:	asahi@lists.linux.dev
···
 F:	drivers/phy/qualcomm/phy-ath79-usb.c

 ATHEROS ATH GENERIC UTILITIES
-M:	Kalle Valo <kvalo@kernel.org>
 M:	Jeff Johnson <jjohnson@kernel.org>
 L:	linux-wireless@vger.kernel.org
 S:	Supported
···
 F:	Documentation/devicetree/bindings/pwm/adi,axi-pwmgen.yaml
 F:	drivers/pwm/pwm-axi-pwmgen.c

-AXXIA I2C CONTROLLER
-M:	Krzysztof Adamski <krzysztof.adamski@nokia.com>
-L:	linux-i2c@vger.kernel.org
-S:	Maintained
-F:	Documentation/devicetree/bindings/i2c/i2c-axxia.txt
-F:	drivers/i2c/busses/i2c-axxia.c
-
 AZ6007 DVB DRIVER
 M:	Mauro Carvalho Chehab <mchehab@kernel.org>
 L:	linux-media@vger.kernel.org
···
 L:	linux-bcachefs@vger.kernel.org
 S:	Supported
 C:	irc://irc.oftc.net/bcache
+P:	Documentation/filesystems/bcachefs/SubmittingPatches.rst
 T:	git https://evilpiepirate.org/git/bcachefs.git
 F:	fs/bcachefs/
 F:	Documentation/filesystems/bcachefs/
···
 F:	rust/kernel/device_id.rs
 F:	rust/kernel/devres.rs
 F:	rust/kernel/driver.rs
+F:	rust/kernel/faux.rs
 F:	rust/kernel/platform.rs
 F:	samples/rust/rust_driver_platform.rs
+F:	samples/rust/rust_driver_faux.rs

 DRIVERS FOR OMAP ADAPTIVE VOLTAGE SCALING (AVS)
 M:	Nishanth Menon <nm@ti.com>
···
 FREEZER
 M:	"Rafael J. Wysocki" <rafael@kernel.org>
-M:	Pavel Machek <pavel@ucw.cz>
+M:	Pavel Machek <pavel@kernel.org>
 L:	linux-pm@vger.kernel.org
 S:	Supported
 F:	Documentation/power/freezing-of-tasks.rst
···
 F:	drivers/staging/gpib/

 GPIO ACPI SUPPORT
-M:	Mika Westerberg <mika.westerberg@linux.intel.com>
+M:	Mika Westerberg <westeri@kernel.org>
 M:	Andy Shevchenko <andriy.shevchenko@linux.intel.com>
 L:	linux-gpio@vger.kernel.org
 L:	linux-acpi@vger.kernel.org
···
 HIBERNATION (aka Software Suspend, aka swsusp)
 M:	"Rafael J. Wysocki" <rafael@kernel.org>
-M:	Pavel Machek <pavel@ucw.cz>
+M:	Pavel Machek <pavel@kernel.org>
 L:	linux-pm@vger.kernel.org
 S:	Supported
 B:	https://bugzilla.kernel.org
···
 F:	drivers/tty/hvc/

 I2C ACPI SUPPORT
-M:	Mika Westerberg <mika.westerberg@linux.intel.com>
+M:	Mika Westerberg <westeri@kernel.org>
 L:	linux-i2c@vger.kernel.org
 L:	linux-acpi@vger.kernel.org
 S:	Maintained
···
 F:	scripts/leaking_addresses.pl

 LED SUBSYSTEM
-M:	Pavel Machek <pavel@ucw.cz>
 M:	Lee Jones <lee@kernel.org>
+M:	Pavel Machek <pavel@kernel.org>
 L:	linux-leds@vger.kernel.org
 S:	Maintained
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds.git
···
 X:	drivers/net/wireless/

 NETWORKING DRIVERS (WIRELESS)
-M:	Kalle Valo <kvalo@kernel.org>
+M:	Johannes Berg <johannes@sipsolutions.net>
 L:	linux-wireless@vger.kernel.org
 S:	Maintained
 W:	https://wireless.wiki.kernel.org/
···
 F:	include/net/dsa.h
 F:	net/dsa/
 F:	tools/testing/selftests/drivers/net/dsa/
+
+NETWORKING [ETHTOOL]
+M:	Andrew Lunn <andrew@lunn.ch>
+M:	Jakub Kicinski <kuba@kernel.org>
+F:	Documentation/netlink/specs/ethtool.yaml
+F:	Documentation/networking/ethtool-netlink.rst
+F:	include/linux/ethtool*
+F:	include/uapi/linux/ethtool*
+F:	net/ethtool/
+F:	tools/testing/selftests/drivers/net/*/ethtool*
+
+NETWORKING [ETHTOOL CABLE TEST]
+M:	Andrew Lunn <andrew@lunn.ch>
+F:	net/ethtool/cabletest.c
+F:	tools/testing/selftests/drivers/net/*/ethtool*
+K:	cable_test

 NETWORKING [GENERAL]
 M:	"David S. Miller" <davem@davemloft.net>
···
 F:	include/linux/netlink.h
 F:	include/linux/netpoll.h
 F:	include/linux/rtnetlink.h
+F:	include/linux/sctp.h
 F:	include/linux/seq_file_net.h
 F:	include/linux/skbuff*
 F:	include/net/
···
 F:	include/uapi/linux/netlink.h
 F:	include/uapi/linux/netlink_diag.h
 F:	include/uapi/linux/rtnetlink.h
+F:	include/uapi/linux/sctp.h
 F:	lib/net_utils.c
 F:	lib/random32.c
 F:	net/
···
 NETWORKING [TCP]
 M:	Eric Dumazet <edumazet@google.com>
 M:	Neal Cardwell <ncardwell@google.com>
+R:	Kuniyuki Iwashima <kuniyu@amazon.com>
 L:	netdev@vger.kernel.org
 S:	Maintained
 F:	Documentation/networking/net_cachelines/tcp_sock.rst
···
 F:	include/net/tls.h
 F:	include/uapi/linux/tls.h
 F:	net/tls/*
+
+NETWORKING [SOCKETS]
+M:	Eric Dumazet <edumazet@google.com>
+M:	Kuniyuki Iwashima <kuniyu@amazon.com>
+M:	Paolo Abeni <pabeni@redhat.com>
+M:	Willem de Bruijn <willemb@google.com>
+S:	Maintained
+F:	include/linux/sock_diag.h
+F:	include/linux/socket.h
+F:	include/linux/sockptr.h
+F:	include/net/sock.h
+F:	include/net/sock_reuseport.h
+F:	include/uapi/linux/socket.h
+F:	net/core/*sock*
+F:	net/core/scm.c
+F:	net/socket.c
+
+NETWORKING [UNIX SOCKETS]
+M:	Kuniyuki Iwashima <kuniyu@amazon.com>
+S:	Maintained
+F:	include/net/af_unix.h
+F:	include/net/netns/unix.h
+F:	include/uapi/linux/unix_diag.h
+F:	net/unix/
+F:	tools/testing/selftests/net/af_unix/

 NETXEN (1/10) GbE SUPPORT
 M:	Manish Chopra <manishc@marvell.com>
···
 F:	kernel/time/tick*.*

 NOKIA N900 CAMERA SUPPORT (ET8EK8 SENSOR, AD5820 FOCUS)
-M:	Pavel Machek <pavel@ucw.cz>
+M:	Pavel Machek <pavel@kernel.org>
 M:	Sakari Ailus <sakari.ailus@iki.fi>
 L:	linux-media@vger.kernel.org
 S:	Maintained
···
 L:	dev@openvswitch.org
 S:	Maintained
 W:	http://openvswitch.org
+F:	Documentation/networking/openvswitch.rst
 F:	include/uapi/linux/openvswitch.h
 F:	net/openvswitch/
 F:	tools/testing/selftests/net/openvswitch/
···
 F:	drivers/media/tuners/qt1010*

 QUALCOMM ATH12K WIRELESS DRIVER
-M:	Kalle Valo <kvalo@kernel.org>
 M:	Jeff Johnson <jjohnson@kernel.org>
 L:	ath12k@lists.infradead.org
 S:	Supported
···
 N:	ath12k

 QUALCOMM ATHEROS ATH10K WIRELESS DRIVER
-M:	Kalle Valo <kvalo@kernel.org>
 M:	Jeff Johnson <jjohnson@kernel.org>
 L:	ath10k@lists.infradead.org
 S:	Supported
···
 N:	ath10k

 QUALCOMM ATHEROS ATH11K WIRELESS DRIVER
-M:	Kalle Valo <kvalo@kernel.org>
 M:	Jeff Johnson <jjohnson@kernel.org>
 L:	ath11k@lists.infradead.org
 S:	Supported
···
 L:	dmaengine@vger.kernel.org
 S:	Supported
 F:	drivers/dma/qcom/hidma*
+
+QUALCOMM I2C QCOM GENI DRIVER
+M:	Mukesh Kumar Savaliya <quic_msavaliy@quicinc.com>
+M:	Viken Dadhaniya <quic_vdadhani@quicinc.com>
+L:	linux-i2c@vger.kernel.org
+L:	linux-arm-msm@vger.kernel.org
+S:	Maintained
+F:	Documentation/devicetree/bindings/i2c/qcom,i2c-geni-qcom.yaml
+F:	drivers/i2c/busses/i2c-qcom-geni.c

 QUALCOMM I2C CCI DRIVER
 M:	Loic Poulain <loic.poulain@linaro.org>
···
 SUSPEND TO RAM
 M:	"Rafael J. Wysocki" <rafael@kernel.org>
 M:	Len Brown <len.brown@intel.com>
-M:	Pavel Machek <pavel@ucw.cz>
+M:	Pavel Machek <pavel@kernel.org>
 L:	linux-pm@vger.kernel.org
 S:	Supported
 B:	https://bugzilla.kernel.org
+5-10
Makefile
···
 VERSION = 6
 PATCHLEVEL = 14
 SUBLEVEL = 0
-EXTRAVERSION = -rc1
+EXTRAVERSION = -rc3
 NAME = Baby Opossum Posse

 # *DOCUMENTATION*
···
 endif

 # Align the bit size of userspace programs with the kernel
-KBUILD_USERCFLAGS  += $(filter -m32 -m64 --target=%, $(KBUILD_CFLAGS))
-KBUILD_USERLDFLAGS += $(filter -m32 -m64 --target=%, $(KBUILD_CFLAGS))
+KBUILD_USERCFLAGS  += $(filter -m32 -m64 --target=%, $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS))
+KBUILD_USERLDFLAGS += $(filter -m32 -m64 --target=%, $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS))

 # make the checker run with the right architecture
 CHECKFLAGS += --arch=$(ARCH)
···
 	$(Q)$(MAKE) -sC $(srctree)/tools/bpf/resolve_btfids O=$(resolve_btfids_O) clean
 endif

-# Clear a bunch of variables before executing the submake
-ifeq ($(quiet),silent_)
-tools_silent=s
-endif
-
 tools/: FORCE
 	$(Q)mkdir -p $(objtree)/tools
-	$(Q)$(MAKE) LDFLAGS= MAKEFLAGS="$(tools_silent) $(filter --j% -j,$(MAKEFLAGS))" O=$(abspath $(objtree)) subdir=tools -C $(srctree)/tools/
+	$(Q)$(MAKE) LDFLAGS= O=$(abspath $(objtree)) subdir=tools -C $(srctree)/tools/

 tools/%: FORCE
 	$(Q)mkdir -p $(objtree)/tools
-	$(Q)$(MAKE) LDFLAGS= MAKEFLAGS="$(tools_silent) $(filter --j% -j,$(MAKEFLAGS))" O=$(abspath $(objtree)) subdir=tools -C $(srctree)/tools/ $*
+	$(Q)$(MAKE) LDFLAGS= O=$(abspath $(objtree)) subdir=tools -C $(srctree)/tools/ $*

 # ---------------------------------------------------------------------------
 # Kernel selftest
+1-5
arch/alpha/include/asm/elf.h
···
 /*
  * This is used to ensure we don't load something for the wrong architecture.
  */
-#define elf_check_arch(x) ((x)->e_machine == EM_ALPHA)
+#define elf_check_arch(x) (((x)->e_machine == EM_ALPHA) && !((x)->e_flags & EF_ALPHA_32BIT))

 /*
  * These are used to set parameters in the core dumps.
···
 	( i_ == IMPLVER_EV5 ? "ev56" \
 	  : amask (AMASK_CIX) ? "ev6" : "ev67"); \
 })
-
-#define SET_PERSONALITY(EX) \
-	set_personality(((EX).e_flags & EF_ALPHA_32BIT) \
-		? PER_LINUX_32BIT : PER_LINUX)

 extern int alpha_l1i_cacheshape;
 extern int alpha_l1d_cacheshape;
+1-1
arch/alpha/include/asm/hwrpb.h
···
 	/* virtual->physical map */
 	unsigned long map_entries;
 	unsigned long map_pages;
-	struct vf_map_struct map[1];
+	struct vf_map_struct map[];
 };

 struct memclust_struct {
+1-1
arch/alpha/include/asm/pgtable.h
···

 extern void paging_init(void);

-/* We have our own get_unmapped_area to cope with ADDR_LIMIT_32BIT.  */
+/* We have our own get_unmapped_area */
 #define HAVE_ARCH_UNMAPPED_AREA

 #endif /* _ALPHA_PGTABLE_H */
+2-6
arch/alpha/include/asm/processor.h
···
 #ifndef __ASM_ALPHA_PROCESSOR_H
 #define __ASM_ALPHA_PROCESSOR_H

-#include <linux/personality.h>	/* for ADDR_LIMIT_32BIT */
-
 /*
  * We have a 42-bit user address space: 4TB user VM...
  */
 #define TASK_SIZE (0x40000000000UL)

-#define STACK_TOP \
-  (current->personality & ADDR_LIMIT_32BIT ? 0x80000000 : 0x00120000000UL)
+#define STACK_TOP (0x00120000000UL)

 #define STACK_TOP_MAX	0x00120000000UL

 /* This decides where the kernel will search for a free chunk of vm
  * space during mmap's.
  */
-#define TASK_UNMAPPED_BASE \
-  ((current->personality & ADDR_LIMIT_32BIT) ? 0x40000000 : TASK_SIZE / 2)
+#define TASK_UNMAPPED_BASE (TASK_SIZE / 2)

 /* This is dead.  Everything has been moved to thread_info. */
 struct thread_struct { };
+2
arch/alpha/include/uapi/asm/ptrace.h
···
 	unsigned long trap_a0;
 	unsigned long trap_a1;
 	unsigned long trap_a2;
+/* This makes the stack 16-byte aligned as GCC expects */
+	unsigned long __pad0;
 /* These are saved by PAL-code: */
 	unsigned long ps;
 	unsigned long pc;
···
 		return ret;
 }

-/* Get an address range which is currently unmapped.  Similar to the
-   generic version except that we know how to honor ADDR_LIMIT_32BIT.  */
+/* Get an address range which is currently unmapped. */

 static unsigned long
 arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
···
 		     unsigned long len, unsigned long pgoff,
 		     unsigned long flags, vm_flags_t vm_flags)
 {
-	unsigned long limit;
-
-	/* "32 bit" actually means 31 bit, since pointers sign extend.  */
-	if (current->personality & ADDR_LIMIT_32BIT)
-		limit = 0x80000000;
-	else
-		limit = TASK_SIZE;
+	unsigned long limit = TASK_SIZE;

 	if (len > limit)
 		return -ENOMEM;
+2-1
arch/alpha/kernel/pci_iommu.c
···
 #include <linux/log2.h>
 #include <linux/dma-map-ops.h>
 #include <linux/iommu-helper.h>
+#include <linux/string_choices.h>

 #include <asm/io.h>
 #include <asm/hwrpb.h>
···
 	/* If both conditions above are met, we are fine. */
 	DBGA("pci_dac_dma_supported %s from %ps\n",
-	     ok ? "yes" : "no", __builtin_return_address(0));
+	     str_yes_no(ok), __builtin_return_address(0));

 	return ok;
 }
+1-1
arch/alpha/kernel/traps.c
···
 static int unauser_reg_offsets[32] = {
 	R(r0), R(r1), R(r2), R(r3), R(r4), R(r5), R(r6), R(r7), R(r8),
 	/* r9 ... r15 are stored in front of regs. */
-	-56, -48, -40, -32, -24, -16, -8,
+	-64, -56, -48, -40, -32, -24, -16, /* padding at -8 */
 	R(r16), R(r17), R(r18),
 	R(r19), R(r20), R(r21), R(r22), R(r23), R(r24), R(r25), R(r26),
 	R(r27), R(r28), R(gp),
···
 		__cpacr_to_cptr_set(clr, set));\
 	} while (0)

-static __always_inline void kvm_write_cptr_el2(u64 val)
-{
-	if (has_vhe() || has_hvhe())
-		write_sysreg(val, cpacr_el1);
-	else
-		write_sysreg(val, cptr_el2);
-}
-
-/* Resets the value of cptr_el2 when returning to the host. */
-static __always_inline void __kvm_reset_cptr_el2(struct kvm *kvm)
-{
-	u64 val;
-
-	if (has_vhe()) {
-		val = (CPACR_EL1_FPEN | CPACR_EL1_ZEN_EL1EN);
-		if (cpus_have_final_cap(ARM64_SME))
-			val |= CPACR_EL1_SMEN_EL1EN;
-	} else if (has_hvhe()) {
-		val = CPACR_EL1_FPEN;
-
-		if (!kvm_has_sve(kvm) || !guest_owns_fp_regs())
-			val |= CPACR_EL1_ZEN;
-		if (cpus_have_final_cap(ARM64_SME))
-			val |= CPACR_EL1_SMEN;
-	} else {
-		val = CPTR_NVHE_EL2_RES1;
-
-		if (kvm_has_sve(kvm) && guest_owns_fp_regs())
-			val |= CPTR_EL2_TZ;
-		if (!cpus_have_final_cap(ARM64_SME))
-			val |= CPTR_EL2_TSM;
-	}
-
-	kvm_write_cptr_el2(val);
-}
-
-#ifdef __KVM_NVHE_HYPERVISOR__
-#define kvm_reset_cptr_el2(v)	__kvm_reset_cptr_el2(kern_hyp_va((v)->kvm))
-#else
-#define kvm_reset_cptr_el2(v)	__kvm_reset_cptr_el2((v)->kvm)
-#endif
-
 /*
  * Returns a 'sanitised' view of CPTR_EL2, translating from nVHE to the VHE
  * format if E2H isn't set.
+5-17
arch/arm64/include/asm/kvm_host.h
···
 static inline void *pop_hyp_memcache(struct kvm_hyp_memcache *mc,
 				     void *(*to_va)(phys_addr_t phys))
 {
-	phys_addr_t *p = to_va(mc->head);
+	phys_addr_t *p = to_va(mc->head & PAGE_MASK);

 	if (!mc->nr_pages)
 		return NULL;
···
 struct kvm_host_data {
 #define KVM_HOST_DATA_FLAG_HAS_SPE	0
 #define KVM_HOST_DATA_FLAG_HAS_TRBE	1
-#define KVM_HOST_DATA_FLAG_HOST_SVE_ENABLED	2
-#define KVM_HOST_DATA_FLAG_HOST_SME_ENABLED	3
 #define KVM_HOST_DATA_FLAG_TRBE_ENABLED	4
 #define KVM_HOST_DATA_FLAG_EL1_TRACING_CONFIGURED	5
 	unsigned long flags;
···
 	struct kvm_cpu_context host_ctxt;

 	/*
-	 * All pointers in this union are hyp VA.
+	 * Hyp VA.
 	 * sve_state is only used in pKVM and if system_supports_sve().
 	 */
-	union {
-		struct user_fpsimd_state *fpsimd_state;
-		struct cpu_sve_state *sve_state;
-	};
+	struct cpu_sve_state *sve_state;

-	union {
-		/* HYP VA pointer to the host storage for FPMR */
-		u64	*fpmr_ptr;
-		/*
-		 * Used by pKVM only, as it needs to provide storage
-		 * for the host
-		 */
-		u64	fpmr;
-	};
+	/* Used by pKVM only. */
+	u64	fpmr;

 	/* Ownership of the FP regs */
 	enum {
···
 }

 /*
- * Called by KVM when entering the guest.
- */
-void fpsimd_kvm_prepare(void)
-{
-	if (!system_supports_sve())
-		return;
-
-	/*
-	 * KVM does not save host SVE state since we can only enter
-	 * the guest from a syscall so the ABI means that only the
-	 * non-saved SVE state needs to be saved.  If we have left
-	 * SVE enabled for performance reasons then update the task
-	 * state to be FPSIMD only.
-	 */
-	get_cpu_fpsimd_context();
-
-	if (test_and_clear_thread_flag(TIF_SVE)) {
-		sve_to_fpsimd(current);
-		current->thread.fp_type = FP_STATE_FPSIMD;
-	}
-
-	put_cpu_fpsimd_context();
-}
-
-/*
  * Associate current's FPSIMD context with this cpu
  * The caller must have ownership of the cpu FPSIMD context before calling
  * this function.
+10-12
arch/arm64/kernel/topology.c
···
 	int cpu;

 	/* We are already set since the last insmod of cpufreq driver */
-	if (unlikely(cpumask_subset(cpus, amu_fie_cpus)))
+	if (cpumask_available(amu_fie_cpus) &&
+	    unlikely(cpumask_subset(cpus, amu_fie_cpus)))
 		return;

-	for_each_cpu(cpu, cpus) {
+	for_each_cpu(cpu, cpus)
 		if (!freq_counters_valid(cpu))
 			return;
+
+	if (!cpumask_available(amu_fie_cpus) &&
+	    !zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL)) {
+		WARN_ONCE(1, "Failed to allocate FIE cpumask for CPUs[%*pbl]\n",
+			  cpumask_pr_args(cpus));
+		return;
 	}

 	cpumask_or(amu_fie_cpus, amu_fie_cpus, cpus);
···

 static int __init init_amu_fie(void)
 {
-	int ret;
-
-	if (!zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL))
-		return -ENOMEM;
-
-	ret = cpufreq_register_notifier(&init_amu_fie_notifier,
+	return cpufreq_register_notifier(&init_amu_fie_notifier,
 					CPUFREQ_POLICY_NOTIFIER);
-	if (ret)
-		free_cpumask_var(amu_fie_cpus);
-
-	return ret;
 }
 core_initcall(init_amu_fie);
···
 static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
 				 struct arch_timer_context *timer_ctx)
 {
-	int ret;
-
 	kvm_timer_update_status(timer_ctx, new_level);

 	timer_ctx->irq.level = new_level;
 	trace_kvm_timer_update_irq(vcpu->vcpu_id, timer_irq(timer_ctx),
 				   timer_ctx->irq.level);

-	if (!userspace_irqchip(vcpu->kvm)) {
-		ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu,
-					  timer_irq(timer_ctx),
-					  timer_ctx->irq.level,
-					  timer_ctx);
-		WARN_ON(ret);
-	}
+	if (userspace_irqchip(vcpu->kvm))
+		return;
+
+	kvm_vgic_inject_irq(vcpu->kvm, vcpu,
+			    timer_irq(timer_ctx),
+			    timer_ctx->irq.level,
+			    timer_ctx);
 }

 /* Only called for a fully emulated timer */
···

 	trace_kvm_timer_emulate(ctx, should_fire);

-	if (should_fire != ctx->irq.level) {
+	if (should_fire != ctx->irq.level)
 		kvm_timer_update_irq(ctx->vcpu, should_fire, ctx);
-		return;
-	}

 	kvm_timer_update_status(ctx, should_fire);
···
 					    timer_irq(map->direct_ptimer),
 					    &arch_timer_irq_ops);
 		WARN_ON_ONCE(ret);
-
-		/*
-		 * The virtual offset behaviour is "interesting", as it
-		 * always applies when HCR_EL2.E2H==0, but only when
-		 * accessed from EL1 when HCR_EL2.E2H==1. So make sure we
-		 * track E2H when putting the HV timer in "direct" mode.
-		 */
-		if (map->direct_vtimer == vcpu_hvtimer(vcpu)) {
-			struct arch_timer_offset *offs = &map->direct_vtimer->offset;
-
-			if (vcpu_el2_e2h_is_set(vcpu))
-				offs->vcpu_offset = NULL;
-			else
-				offs->vcpu_offset = &__vcpu_sys_reg(vcpu, CNTVOFF_EL2);
-		}
 	}
 }
···
 	 * which allows trapping of the timer registers even with NV2.
 	 * Still, this is still worse than FEAT_NV on its own. Meh.
 	 */
-	if (!vcpu_el2_e2h_is_set(vcpu)) {
-		if (cpus_have_final_cap(ARM64_HAS_ECV))
-			return;
-
-		/*
-		 * A non-VHE guest hypervisor doesn't have any direct access
-		 * to its timers: the EL2 registers trap (and the HW is
-		 * fully emulated), while the EL0 registers access memory
-		 * despite the access being notionally direct. Boo.
-		 *
-		 * We update the hardware timer registers with the
-		 * latest value written by the guest to the VNCR page
-		 * and let the hardware take care of the rest.
-		 */
-		write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CTL_EL0),  SYS_CNTV_CTL);
-		write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CVAL_EL0), SYS_CNTV_CVAL);
-		write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CTL_EL0),  SYS_CNTP_CTL);
-		write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), SYS_CNTP_CVAL);
-	} else {
+	if (!cpus_have_final_cap(ARM64_HAS_ECV)) {
 		/*
 		 * For a VHE guest hypervisor, the EL2 state is directly
-		 * stored in the host EL1 timers, while the emulated EL0
+		 * stored in the host EL1 timers, while the emulated EL1
 		 * state is stored in the VNCR page. The latter could have
 		 * been updated behind our back, and we must reset the
 		 * emulation of the timers.
+		 *
+		 * A non-VHE guest hypervisor doesn't have any direct access
+		 * to its timers: the EL2 registers trap despite being
+		 * notionally direct (we use the EL1 HW, as for VHE), while
+		 * the EL1 registers access memory.
+		 *
+		 * In both cases, process the emulated timers on each guest
+		 * exit. Boo.
 		 */
 		struct timer_map map;
 		get_timer_map(vcpu, &map);
+20-8
arch/arm64/kvm/arm.c
···
 		break;
 	case -ENODEV:
 	case -ENXIO:
+		/*
+		 * No VGIC? No pKVM for you.
+		 *
+		 * Protected mode assumes that VGICv3 is present, so no point
+		 * in trying to hobble along if vgic initialization fails.
+		 */
+		if (is_protected_kvm_enabled())
+			goto out;
+
+		/*
+		 * Otherwise, userspace could choose to implement a GIC for its
+		 * guest on non-cooperative hardware.
+		 */
 		vgic_present = false;
 		err = 0;
 		break;
···
 	kvm_nvhe_sym(id_aa64smfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64SMFR0_EL1);
 	kvm_nvhe_sym(__icache_flags) = __icache_flags;
 	kvm_nvhe_sym(kvm_arm_vmid_bits) = kvm_arm_vmid_bits;
+
+	/*
+	 * Flush entire BSS since part of its data containing init symbols is read
+	 * while the MMU is off.
+	 */
+	kvm_flush_dcache_to_poc(kvm_ksym_ref(__hyp_bss_start),
+				kvm_ksym_ref(__hyp_bss_end) - kvm_ksym_ref(__hyp_bss_start));
 }

 static int __init kvm_hyp_init_protection(u32 hyp_va_bits)
···
 		sve_state = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state;
 		per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state =
 			kern_hyp_va(sve_state);
-		}
-	} else {
-		for_each_possible_cpu(cpu) {
-			struct user_fpsimd_state *fpsimd_state;
-
-			fpsimd_state = &per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->host_ctxt.fp_regs;
-			per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->fpsimd_state =
-				kern_hyp_va(fpsimd_state);
 		}
 	}
 }
+9-98
arch/arm64/kvm/fpsimd.c
···5454 if (!system_supports_fpsimd())5555 return;56565757- fpsimd_kvm_prepare();5858-5957 /*6060- * We will check TIF_FOREIGN_FPSTATE just before entering the6161- * guest in kvm_arch_vcpu_ctxflush_fp() and override this to6262- * FP_STATE_FREE if the flag set.5858+ * Ensure that any host FPSIMD/SVE/SME state is saved and unbound such5959+ * that the host kernel is responsible for restoring this state upon6060+ * return to userspace, and the hyp code doesn't need to save anything.6161+ *6262+ * When the host may use SME, fpsimd_save_and_flush_cpu_state() ensures6363+ * that PSTATE.{SM,ZA} == {0,0}.6364 */6464- *host_data_ptr(fp_owner) = FP_STATE_HOST_OWNED;6565- *host_data_ptr(fpsimd_state) = kern_hyp_va(&current->thread.uw.fpsimd_state);6666- *host_data_ptr(fpmr_ptr) = kern_hyp_va(&current->thread.uw.fpmr);6565+ fpsimd_save_and_flush_cpu_state();6666+ *host_data_ptr(fp_owner) = FP_STATE_FREE;67676868- host_data_clear_flag(HOST_SVE_ENABLED);6969- if (read_sysreg(cpacr_el1) & CPACR_EL1_ZEN_EL0EN)7070- host_data_set_flag(HOST_SVE_ENABLED);7171-7272- if (system_supports_sme()) {7373- host_data_clear_flag(HOST_SME_ENABLED);7474- if (read_sysreg(cpacr_el1) & CPACR_EL1_SMEN_EL0EN)7575- host_data_set_flag(HOST_SME_ENABLED);7676-7777- /*7878- * If PSTATE.SM is enabled then save any pending FP7979- * state and disable PSTATE.SM. 
If we leave PSTATE.SM8080- * enabled and the guest does not enable SME via8181- * CPACR_EL1.SMEN then operations that should be valid8282- * may generate SME traps from EL1 to EL1 which we8383- * can't intercept and which would confuse the guest.8484- *8585- * Do the same for PSTATE.ZA in the case where there8686- * is state in the registers which has not already8787- * been saved, this is very unlikely to happen.8888- */8989- if (read_sysreg_s(SYS_SVCR) & (SVCR_SM_MASK | SVCR_ZA_MASK)) {9090- *host_data_ptr(fp_owner) = FP_STATE_FREE;9191- fpsimd_save_and_flush_cpu_state();9292- }9393- }9494-9595- /*9696- * If normal guests gain SME support, maintain this behavior for pKVM9797- * guests, which don't support SME.9898- */9999- WARN_ON(is_protected_kvm_enabled() && system_supports_sme() &&100100- read_sysreg_s(SYS_SVCR));6868+ WARN_ON_ONCE(system_supports_sme() && read_sysreg_s(SYS_SVCR));10169}1027010371/*···130162131163 local_irq_save(flags);132164133133- /*134134- * If we have VHE then the Hyp code will reset CPACR_EL1 to135135- * the default value and we need to reenable SME.136136- */137137- if (has_vhe() && system_supports_sme()) {138138- /* Also restore EL0 state seen on entry */139139- if (host_data_test_flag(HOST_SME_ENABLED))140140- sysreg_clear_set(CPACR_EL1, 0, CPACR_EL1_SMEN);141141- else142142- sysreg_clear_set(CPACR_EL1,143143- CPACR_EL1_SMEN_EL0EN,144144- CPACR_EL1_SMEN_EL1EN);145145- isb();146146- }147147-148165 if (guest_owns_fp_regs()) {149149- if (vcpu_has_sve(vcpu)) {150150- u64 zcr = read_sysreg_el1(SYS_ZCR);151151-152152- /*153153- * If the vCPU is in the hyp context then ZCR_EL1 is154154- * loaded with its vEL2 counterpart.155155- */156156- __vcpu_sys_reg(vcpu, vcpu_sve_zcr_elx(vcpu)) = zcr;157157-158158- /*159159- * Restore the VL that was saved when bound to the CPU,160160- * which is the maximum VL for the guest. 
Because the161161- * layout of the data when saving the sve state depends162162- * on the VL, we need to use a consistent (i.e., the163163- * maximum) VL.164164- * Note that this means that at guest exit ZCR_EL1 is165165- * not necessarily the same as on guest entry.166166- *167167- * ZCR_EL2 holds the guest hypervisor's VL when running168168- * a nested guest, which could be smaller than the169169- * max for the vCPU. Similar to above, we first need to170170- * switch to a VL consistent with the layout of the171171- * vCPU's SVE state. KVM support for NV implies VHE, so172172- * using the ZCR_EL1 alias is safe.173173- */174174- if (!has_vhe() || (vcpu_has_nv(vcpu) && !is_hyp_ctxt(vcpu)))175175- sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1,176176- SYS_ZCR_EL1);177177- }178178-179166 /*180167 * Flush (save and invalidate) the fpsimd/sve state so that if181168 * the host tries to use fpsimd/sve, it's not using stale data···142219 * when needed.143220 */144221 fpsimd_save_and_flush_cpu_state();145145- } else if (has_vhe() && system_supports_sve()) {146146- /*147147- * The FPSIMD/SVE state in the CPU has not been touched, and we148148- * have SVE (and VHE): CPACR_EL1 (alias CPTR_EL2) has been149149- * reset by kvm_reset_cptr_el2() in the Hyp code, disabling SVE150150- * for EL0. To avoid spurious traps, restore the trap state151151- * seen by kvm_arch_vcpu_load_fp():152152- */153153- if (host_data_test_flag(HOST_SVE_ENABLED))154154- sysreg_clear_set(CPACR_EL1, 0, CPACR_EL1_ZEN_EL0EN);155155- else156156- sysreg_clear_set(CPACR_EL1, CPACR_EL1_ZEN_EL0EN, 0);157222 }158223159224 local_irq_restore(flags);
+5
arch/arm64/kvm/hyp/entry.S
···4444alternative_else_nop_endif4545 mrs x1, isr_el14646 cbz x1, 1f4747+4848+ // Ensure that __guest_enter() always provides a context4949+ // synchronization event so that callers don't need ISBs for anything5050+ // that would usually be synchronized by the ERET.5151+ isb4752 mov x0, #ARM_EXCEPTION_IRQ4853 ret4954
+111-37
arch/arm64/kvm/hyp/include/hyp/switch.h
···326326 return __get_fault_info(vcpu->arch.fault.esr_el2, &vcpu->arch.fault);327327}328328329329-static bool kvm_hyp_handle_mops(struct kvm_vcpu *vcpu, u64 *exit_code)329329+static inline bool kvm_hyp_handle_mops(struct kvm_vcpu *vcpu, u64 *exit_code)330330{331331 *vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);332332 arm64_mops_reset_regs(vcpu_gp_regs(vcpu), vcpu->arch.fault.esr_el2);···375375 true);376376}377377378378-static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu);378378+static inline void fpsimd_lazy_switch_to_guest(struct kvm_vcpu *vcpu)379379+{380380+ u64 zcr_el1, zcr_el2;381381+382382+ if (!guest_owns_fp_regs())383383+ return;384384+385385+ if (vcpu_has_sve(vcpu)) {386386+ /* A guest hypervisor may restrict the effective max VL. */387387+ if (vcpu_has_nv(vcpu) && !is_hyp_ctxt(vcpu))388388+ zcr_el2 = __vcpu_sys_reg(vcpu, ZCR_EL2);389389+ else390390+ zcr_el2 = vcpu_sve_max_vq(vcpu) - 1;391391+392392+ write_sysreg_el2(zcr_el2, SYS_ZCR);393393+394394+ zcr_el1 = __vcpu_sys_reg(vcpu, vcpu_sve_zcr_elx(vcpu));395395+ write_sysreg_el1(zcr_el1, SYS_ZCR);396396+ }397397+}398398+399399+static inline void fpsimd_lazy_switch_to_host(struct kvm_vcpu *vcpu)400400+{401401+ u64 zcr_el1, zcr_el2;402402+403403+ if (!guest_owns_fp_regs())404404+ return;405405+406406+ /*407407+ * When the guest owns the FP regs, we know that guest+hyp traps for408408+ * any FPSIMD/SVE/SME features exposed to the guest have been disabled409409+ * by either fpsimd_lazy_switch_to_guest() or kvm_hyp_handle_fpsimd()410410+ * prior to __guest_entry(). 
As __guest_entry() guarantees a context411411+ * synchronization event, we don't need an ISB here to avoid taking412412+ * traps for anything that was exposed to the guest.413413+ */414414+ if (vcpu_has_sve(vcpu)) {415415+ zcr_el1 = read_sysreg_el1(SYS_ZCR);416416+ __vcpu_sys_reg(vcpu, vcpu_sve_zcr_elx(vcpu)) = zcr_el1;417417+418418+ /*419419+ * The guest's state is always saved using the guest's max VL.420420+ * Ensure that the host has the guest's max VL active such that421421+ * the host can save the guest's state lazily, but don't422422+ * artificially restrict the host to the guest's max VL.423423+ */424424+ if (has_vhe()) {425425+ zcr_el2 = vcpu_sve_max_vq(vcpu) - 1;426426+ write_sysreg_el2(zcr_el2, SYS_ZCR);427427+ } else {428428+ zcr_el2 = sve_vq_from_vl(kvm_host_sve_max_vl) - 1;429429+ write_sysreg_el2(zcr_el2, SYS_ZCR);430430+431431+ zcr_el1 = vcpu_sve_max_vq(vcpu) - 1;432432+ write_sysreg_el1(zcr_el1, SYS_ZCR);433433+ }434434+ }435435+}436436+437437+static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)438438+{439439+ /*440440+ * Non-protected kvm relies on the host restoring its sve state.441441+ * Protected kvm restores the host's sve state as not to reveal that442442+ * fpsimd was used by a guest nor leak upper sve bits.443443+ */444444+ if (system_supports_sve()) {445445+ __hyp_sve_save_host();446446+447447+ /* Re-enable SVE traps if not supported for the guest vcpu. */448448+ if (!vcpu_has_sve(vcpu))449449+ cpacr_clear_set(CPACR_EL1_ZEN, 0);450450+451451+ } else {452452+ __fpsimd_save_state(host_data_ptr(host_ctxt.fp_regs));453453+ }454454+455455+ if (kvm_has_fpmr(kern_hyp_va(vcpu->kvm)))456456+ *host_data_ptr(fpmr) = read_sysreg_s(SYS_FPMR);457457+}458458+379459380460/*381461 * We trap the first access to the FP/SIMD to save the host context and···463383 * If FP/SIMD is not implemented, handle the trap and inject an undefined464384 * instruction exception to the guest. 
Similarly for trapped SVE accesses.465385 */466466-static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)386386+static inline bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)467387{468388 bool sve_guest;469389 u8 esr_ec;···505425 isb();506426507427 /* Write out the host state if it's in the registers */508508- if (host_owns_fp_regs())428428+ if (is_protected_kvm_enabled() && host_owns_fp_regs())509429 kvm_hyp_save_fpsimd_host(vcpu);510430511431 /* Restore the guest state */···581501 return true;582502}583503504504+/* Open-coded version of timer_get_offset() to allow for kern_hyp_va() */505505+static inline u64 hyp_timer_get_offset(struct arch_timer_context *ctxt)506506+{507507+ u64 offset = 0;508508+509509+ if (ctxt->offset.vm_offset)510510+ offset += *kern_hyp_va(ctxt->offset.vm_offset);511511+ if (ctxt->offset.vcpu_offset)512512+ offset += *kern_hyp_va(ctxt->offset.vcpu_offset);513513+514514+ return offset;515515+}516516+584517static inline u64 compute_counter_value(struct arch_timer_context *ctxt)585518{586586- return arch_timer_read_cntpct_el0() - timer_get_offset(ctxt);519519+ return arch_timer_read_cntpct_el0() - hyp_timer_get_offset(ctxt);587520}588521589522static bool kvm_handle_cntxct(struct kvm_vcpu *vcpu)···680587 return true;681588}682589683683-static bool kvm_hyp_handle_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)590590+static inline bool kvm_hyp_handle_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)684591{685592 if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM) &&686593 handle_tx2_tvm(vcpu))···700607 return false;701608}702609703703-static bool kvm_hyp_handle_cp15_32(struct kvm_vcpu *vcpu, u64 *exit_code)610610+static inline bool kvm_hyp_handle_cp15_32(struct kvm_vcpu *vcpu, u64 *exit_code)704611{705612 if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&706613 __vgic_v3_perform_cpuif_access(vcpu) == 1)···709616 return false;710617}711618712712-static bool kvm_hyp_handle_memory_fault(struct kvm_vcpu *vcpu, 
u64 *exit_code)619619+static inline bool kvm_hyp_handle_memory_fault(struct kvm_vcpu *vcpu,620620+ u64 *exit_code)713621{714622 if (!__populate_fault_info(vcpu))715623 return true;716624717625 return false;718626}719719-static bool kvm_hyp_handle_iabt_low(struct kvm_vcpu *vcpu, u64 *exit_code)720720- __alias(kvm_hyp_handle_memory_fault);721721-static bool kvm_hyp_handle_watchpt_low(struct kvm_vcpu *vcpu, u64 *exit_code)722722- __alias(kvm_hyp_handle_memory_fault);627627+#define kvm_hyp_handle_iabt_low kvm_hyp_handle_memory_fault628628+#define kvm_hyp_handle_watchpt_low kvm_hyp_handle_memory_fault723629724724-static bool kvm_hyp_handle_dabt_low(struct kvm_vcpu *vcpu, u64 *exit_code)630630+static inline bool kvm_hyp_handle_dabt_low(struct kvm_vcpu *vcpu, u64 *exit_code)725631{726632 if (kvm_hyp_handle_memory_fault(vcpu, exit_code))727633 return true;···750658751659typedef bool (*exit_handler_fn)(struct kvm_vcpu *, u64 *);752660753753-static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm_vcpu *vcpu);754754-755755-static void early_exit_filter(struct kvm_vcpu *vcpu, u64 *exit_code);756756-757661/*758662 * Allow the hypervisor to handle the exit with an exit handler if it has one.759663 *760664 * Returns true if the hypervisor handled the exit, and control should go back761665 * to the guest, or false if it hasn't.762666 */763763-static inline bool kvm_hyp_handle_exit(struct kvm_vcpu *vcpu, u64 *exit_code)667667+static inline bool kvm_hyp_handle_exit(struct kvm_vcpu *vcpu, u64 *exit_code,668668+ const exit_handler_fn *handlers)764669{765765- const exit_handler_fn *handlers = kvm_get_exit_handler_array(vcpu);766766- exit_handler_fn fn;767767-768768- fn = handlers[kvm_vcpu_trap_get_class(vcpu)];769769-670670+ exit_handler_fn fn = handlers[kvm_vcpu_trap_get_class(vcpu)];770671 if (fn)771672 return fn(vcpu, exit_code);772673···789704 * the guest, false when we should restore the host state and return to the790705 * main run loop.791706 */792792-static inline 
bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)707707+static inline bool __fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code,708708+ const exit_handler_fn *handlers)793709{794794- /*795795- * Save PSTATE early so that we can evaluate the vcpu mode796796- * early on.797797- */798798- synchronize_vcpu_pstate(vcpu, exit_code);799799-800800- /*801801- * Check whether we want to repaint the state one way or802802- * another.803803- */804804- early_exit_filter(vcpu, exit_code);805805-806710 if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)807711 vcpu->arch.fault.esr_el2 = read_sysreg_el2(SYS_ESR);808712···821747 goto exit;822748823749 /* Check if there's an exit handler and allow it to handle the exit. */824824- if (kvm_hyp_handle_exit(vcpu, exit_code))750750+ if (kvm_hyp_handle_exit(vcpu, exit_code, handlers))825751 goto guest;826752exit:827753 /* Return to the host kernel and handle the exit */
+31-8
arch/arm64/kvm/hyp/nvhe/hyp-main.c
···55 */6677#include <hyp/adjust_pc.h>88+#include <hyp/switch.h>89910#include <asm/pgtable-types.h>1011#include <asm/kvm_asm.h>···8483 if (system_supports_sve())8584 __hyp_sve_restore_host();8685 else8787- __fpsimd_restore_state(*host_data_ptr(fpsimd_state));8686+ __fpsimd_restore_state(host_data_ptr(host_ctxt.fp_regs));88878988 if (has_fpmr)9089 write_sysreg_s(*host_data_ptr(fpmr), SYS_FPMR);···9291 *host_data_ptr(fp_owner) = FP_STATE_HOST_OWNED;9392}94939494+static void flush_debug_state(struct pkvm_hyp_vcpu *hyp_vcpu)9595+{9696+ struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;9797+9898+ hyp_vcpu->vcpu.arch.debug_owner = host_vcpu->arch.debug_owner;9999+100100+ if (kvm_guest_owns_debug_regs(&hyp_vcpu->vcpu))101101+ hyp_vcpu->vcpu.arch.vcpu_debug_state = host_vcpu->arch.vcpu_debug_state;102102+ else if (kvm_host_owns_debug_regs(&hyp_vcpu->vcpu))103103+ hyp_vcpu->vcpu.arch.external_debug_state = host_vcpu->arch.external_debug_state;104104+}105105+106106+static void sync_debug_state(struct pkvm_hyp_vcpu *hyp_vcpu)107107+{108108+ struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;109109+110110+ if (kvm_guest_owns_debug_regs(&hyp_vcpu->vcpu))111111+ host_vcpu->arch.vcpu_debug_state = hyp_vcpu->vcpu.arch.vcpu_debug_state;112112+ else if (kvm_host_owns_debug_regs(&hyp_vcpu->vcpu))113113+ host_vcpu->arch.external_debug_state = hyp_vcpu->vcpu.arch.external_debug_state;114114+}115115+95116static void flush_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)96117{97118 struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;9811999120 fpsimd_sve_flush();121121+ flush_debug_state(hyp_vcpu);100122101123 hyp_vcpu->vcpu.arch.ctxt = host_vcpu->arch.ctxt;102124···147123 unsigned int i;148124149125 fpsimd_sve_sync(&hyp_vcpu->vcpu);126126+ sync_debug_state(hyp_vcpu);150127151128 host_vcpu->arch.ctxt = hyp_vcpu->vcpu.arch.ctxt;152129···225200226201 sync_hyp_vcpu(hyp_vcpu);227202 } else {203203+ struct kvm_vcpu *vcpu = kern_hyp_va(host_vcpu);204204+228205 /* The host is fully trusted, run its vCPU 
directly. */229229- ret = __kvm_vcpu_run(kern_hyp_va(host_vcpu));206206+ fpsimd_lazy_switch_to_guest(vcpu);207207+ ret = __kvm_vcpu_run(vcpu);208208+ fpsimd_lazy_switch_to_host(vcpu);230209 }231210out:232211 cpu_reg(host_ctxt, 1) = ret;···679650 break;680651 case ESR_ELx_EC_SMC64:681652 handle_host_smc(host_ctxt);682682- break;683683- case ESR_ELx_EC_SVE:684684- cpacr_clear_set(0, CPACR_EL1_ZEN);685685- isb();686686- sve_cond_update_zcr_vq(sve_vq_from_vl(kvm_host_sve_max_vl) - 1,687687- SYS_ZCR_EL2);688653 break;689654 case ESR_ELx_EC_IABT_LOW:690655 case ESR_ELx_EC_DABT_LOW:
+41-35
arch/arm64/kvm/hyp/nvhe/mem_protect.c
···943943 ret = kvm_pgtable_get_leaf(&vm->pgt, ipa, &pte, &level);944944 if (ret)945945 return ret;946946- if (level != KVM_PGTABLE_LAST_LEVEL)947947- return -E2BIG;948946 if (!kvm_pte_valid(pte))949947 return -ENOENT;948948+ if (level != KVM_PGTABLE_LAST_LEVEL)949949+ return -E2BIG;950950951951 state = guest_get_page_state(pte, ipa);952952 if (state != PKVM_PAGE_SHARED_BORROWED)···998998 return ret;999999}1000100010011001-int __pkvm_host_relax_perms_guest(u64 gfn, struct pkvm_hyp_vcpu *vcpu, enum kvm_pgtable_prot prot)10011001+static void assert_host_shared_guest(struct pkvm_hyp_vm *vm, u64 ipa)10021002{10031003- struct pkvm_hyp_vm *vm = pkvm_hyp_vcpu_to_hyp_vm(vcpu);10041004- u64 ipa = hyp_pfn_to_phys(gfn);10051003 u64 phys;10061004 int ret;1007100510081008- if (prot & ~KVM_PGTABLE_PROT_RWX)10091009- return -EINVAL;10061006+ if (!IS_ENABLED(CONFIG_NVHE_EL2_DEBUG))10071007+ return;1010100810111009 host_lock_component();10121010 guest_lock_component(vm);1013101110141012 ret = __check_host_shared_guest(vm, &phys, ipa);10151015- if (!ret)10161016- ret = kvm_pgtable_stage2_relax_perms(&vm->pgt, ipa, prot, 0);1017101310181014 guest_unlock_component(vm);10191015 host_unlock_component();10161016+10171017+ WARN_ON(ret && ret != -ENOENT);10181018+}10191019+10201020+int __pkvm_host_relax_perms_guest(u64 gfn, struct pkvm_hyp_vcpu *vcpu, enum kvm_pgtable_prot prot)10211021+{10221022+ struct pkvm_hyp_vm *vm = pkvm_hyp_vcpu_to_hyp_vm(vcpu);10231023+ u64 ipa = hyp_pfn_to_phys(gfn);10241024+ int ret;10251025+10261026+ if (pkvm_hyp_vm_is_protected(vm))10271027+ return -EPERM;10281028+10291029+ if (prot & ~KVM_PGTABLE_PROT_RWX)10301030+ return -EINVAL;10311031+10321032+ assert_host_shared_guest(vm, ipa);10331033+ guest_lock_component(vm);10341034+ ret = kvm_pgtable_stage2_relax_perms(&vm->pgt, ipa, prot, 0);10351035+ guest_unlock_component(vm);1020103610211037 return ret;10221038}···10401024int __pkvm_host_wrprotect_guest(u64 gfn, struct pkvm_hyp_vm *vm)10411025{10421026 u64 ipa = 
hyp_pfn_to_phys(gfn);10431043- u64 phys;10441027 int ret;1045102810461046- host_lock_component();10291029+ if (pkvm_hyp_vm_is_protected(vm))10301030+ return -EPERM;10311031+10321032+ assert_host_shared_guest(vm, ipa);10471033 guest_lock_component(vm);10481048-10491049- ret = __check_host_shared_guest(vm, &phys, ipa);10501050- if (!ret)10511051- ret = kvm_pgtable_stage2_wrprotect(&vm->pgt, ipa, PAGE_SIZE);10521052-10341034+ ret = kvm_pgtable_stage2_wrprotect(&vm->pgt, ipa, PAGE_SIZE);10531035 guest_unlock_component(vm);10541054- host_unlock_component();1055103610561037 return ret;10571038}···10561043int __pkvm_host_test_clear_young_guest(u64 gfn, bool mkold, struct pkvm_hyp_vm *vm)10571044{10581045 u64 ipa = hyp_pfn_to_phys(gfn);10591059- u64 phys;10601046 int ret;1061104710621062- host_lock_component();10481048+ if (pkvm_hyp_vm_is_protected(vm))10491049+ return -EPERM;10501050+10511051+ assert_host_shared_guest(vm, ipa);10631052 guest_lock_component(vm);10641064-10651065- ret = __check_host_shared_guest(vm, &phys, ipa);10661066- if (!ret)10671067- ret = kvm_pgtable_stage2_test_clear_young(&vm->pgt, ipa, PAGE_SIZE, mkold);10681068-10531053+ ret = kvm_pgtable_stage2_test_clear_young(&vm->pgt, ipa, PAGE_SIZE, mkold);10691054 guest_unlock_component(vm);10701070- host_unlock_component();1071105510721056 return ret;10731057}···10731063{10741064 struct pkvm_hyp_vm *vm = pkvm_hyp_vcpu_to_hyp_vm(vcpu);10751065 u64 ipa = hyp_pfn_to_phys(gfn);10761076- u64 phys;10771077- int ret;1078106610791079- host_lock_component();10671067+ if (pkvm_hyp_vm_is_protected(vm))10681068+ return -EPERM;10691069+10701070+ assert_host_shared_guest(vm, ipa);10801071 guest_lock_component(vm);10811081-10821082- ret = __check_host_shared_guest(vm, &phys, ipa);10831083- if (!ret)10841084- kvm_pgtable_stage2_mkyoung(&vm->pgt, ipa, 0);10851085-10721072+ kvm_pgtable_stage2_mkyoung(&vm->pgt, ipa, 0);10861073 guest_unlock_component(vm);10871087- host_unlock_component();1088107410891089- return 
ret;10751075+ return 0;10901076}
+45-44
arch/arm64/kvm/hyp/nvhe/switch.c
···3939{4040 u64 val = CPTR_EL2_TAM; /* Same bit irrespective of E2H */41414242+ if (!guest_owns_fp_regs())4343+ __activate_traps_fpsimd32(vcpu);4444+4245 if (has_hvhe()) {4346 val |= CPACR_EL1_TTA;4447···5047 if (vcpu_has_sve(vcpu))5148 val |= CPACR_EL1_ZEN;5249 }5050+5151+ write_sysreg(val, cpacr_el1);5352 } else {5453 val |= CPTR_EL2_TTA | CPTR_NVHE_EL2_RES1;5554···66616762 if (!guest_owns_fp_regs())6863 val |= CPTR_EL2_TFP;6464+6565+ write_sysreg(val, cptr_el2);6966 }6767+}70687171- if (!guest_owns_fp_regs())7272- __activate_traps_fpsimd32(vcpu);6969+static void __deactivate_cptr_traps(struct kvm_vcpu *vcpu)7070+{7171+ if (has_hvhe()) {7272+ u64 val = CPACR_EL1_FPEN;73737474- kvm_write_cptr_el2(val);7474+ if (cpus_have_final_cap(ARM64_SVE))7575+ val |= CPACR_EL1_ZEN;7676+ if (cpus_have_final_cap(ARM64_SME))7777+ val |= CPACR_EL1_SMEN;7878+7979+ write_sysreg(val, cpacr_el1);8080+ } else {8181+ u64 val = CPTR_NVHE_EL2_RES1;8282+8383+ if (!cpus_have_final_cap(ARM64_SVE))8484+ val |= CPTR_EL2_TZ;8585+ if (!cpus_have_final_cap(ARM64_SME))8686+ val |= CPTR_EL2_TSM;8787+8888+ write_sysreg(val, cptr_el2);8989+ }7590}76917792static void __activate_traps(struct kvm_vcpu *vcpu)···144119145120 write_sysreg(this_cpu_ptr(&kvm_init_params)->hcr_el2, hcr_el2);146121147147- kvm_reset_cptr_el2(vcpu);122122+ __deactivate_cptr_traps(vcpu);148123 write_sysreg(__kvm_hyp_host_vector, vbar_el2);149124}150125···217192 kvm_handle_pvm_sysreg(vcpu, exit_code));218193}219194220220-static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)221221-{222222- /*223223- * Non-protected kvm relies on the host restoring its sve state.224224- * Protected kvm restores the host's sve state as not to reveal that225225- * fpsimd was used by a guest nor leak upper sve bits.226226- */227227- if (unlikely(is_protected_kvm_enabled() && system_supports_sve())) {228228- __hyp_sve_save_host();229229-230230- /* Re-enable SVE traps if not supported for the guest vcpu. 
*/231231- if (!vcpu_has_sve(vcpu))232232- cpacr_clear_set(CPACR_EL1_ZEN, 0);233233-234234- } else {235235- __fpsimd_save_state(*host_data_ptr(fpsimd_state));236236- }237237-238238- if (kvm_has_fpmr(kern_hyp_va(vcpu->kvm))) {239239- u64 val = read_sysreg_s(SYS_FPMR);240240-241241- if (unlikely(is_protected_kvm_enabled()))242242- *host_data_ptr(fpmr) = val;243243- else244244- **host_data_ptr(fpmr_ptr) = val;245245- }246246-}247247-248195static const exit_handler_fn hyp_exit_handlers[] = {249196 [0 ... ESR_ELx_EC_MAX] = NULL,250197 [ESR_ELx_EC_CP15_32] = kvm_hyp_handle_cp15_32,···248251 return hyp_exit_handlers;249252}250253251251-/*252252- * Some guests (e.g., protected VMs) are not be allowed to run in AArch32.253253- * The ARMv8 architecture does not give the hypervisor a mechanism to prevent a254254- * guest from dropping to AArch32 EL0 if implemented by the CPU. If the255255- * hypervisor spots a guest in such a state ensure it is handled, and don't256256- * trust the host to spot or fix it. The check below is based on the one in257257- * kvm_arch_vcpu_ioctl_run().258258- *259259- * Returns false if the guest ran in AArch32 when it shouldn't have, and260260- * thus should exit to the host, or true if a the guest run loop can continue.261261- */262262-static void early_exit_filter(struct kvm_vcpu *vcpu, u64 *exit_code)254254+static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)263255{256256+ const exit_handler_fn *handlers = kvm_get_exit_handler_array(vcpu);257257+258258+ synchronize_vcpu_pstate(vcpu, exit_code);259259+260260+ /*261261+ * Some guests (e.g., protected VMs) are not allowed to run in262262+ * AArch32. The ARMv8 architecture does not give the hypervisor a263263+ * mechanism to prevent a guest from dropping to AArch32 EL0 if264264+ * implemented by the CPU. If the hypervisor spots a guest in such a265265+ * state, ensure it is handled, and don't trust the host to spot or fix266266+ * it. 
The check below is based on the one in267267+ * kvm_arch_vcpu_ioctl_run().268268+ */264269 if (unlikely(vcpu_is_protected(vcpu) && vcpu_mode_is_32bit(vcpu))) {265270 /*266271 * As we have caught the guest red-handed, decide that it isn't···275276 *exit_code &= BIT(ARM_EXIT_WITH_SERROR_BIT);276277 *exit_code |= ARM_EXCEPTION_IL;277278 }279279+280280+ return __fixup_guest_exit(vcpu, exit_code, handlers);278281}279282280283/* Switch to the guest for legacy non-VHE systems */
+19-14
arch/arm64/kvm/hyp/vhe/switch.c
···136136 write_sysreg(val, cpacr_el1);137137}138138139139+static void __deactivate_cptr_traps(struct kvm_vcpu *vcpu)140140+{141141+ u64 val = CPACR_EL1_FPEN | CPACR_EL1_ZEN_EL1EN;142142+143143+ if (cpus_have_final_cap(ARM64_SME))144144+ val |= CPACR_EL1_SMEN_EL1EN;145145+146146+ write_sysreg(val, cpacr_el1);147147+}148148+139149static void __activate_traps(struct kvm_vcpu *vcpu)140150{141151 u64 val;···217207 */218208 asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));219209220220- kvm_reset_cptr_el2(vcpu);210210+ __deactivate_cptr_traps(vcpu);221211222212 if (!arm64_kernel_unmapped_at_el0())223213 host_vectors = __this_cpu_read(this_cpu_vector);···423413 return true;424414}425415426426-static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)427427-{428428- __fpsimd_save_state(*host_data_ptr(fpsimd_state));429429-430430- if (kvm_has_fpmr(vcpu->kvm))431431- **host_data_ptr(fpmr_ptr) = read_sysreg_s(SYS_FPMR);432432-}433433-434416static bool kvm_hyp_handle_tlbi_el2(struct kvm_vcpu *vcpu, u64 *exit_code)435417{436418 int ret = -EINVAL;···540538 [ESR_ELx_EC_MOPS] = kvm_hyp_handle_mops,541539};542540543543-static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm_vcpu *vcpu)541541+static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)544542{545545- return hyp_exit_handlers;546546-}543543+ synchronize_vcpu_pstate(vcpu, exit_code);547544548548-static void early_exit_filter(struct kvm_vcpu *vcpu, u64 *exit_code)549549-{550545 /*551546 * If we were in HYP context on entry, adjust the PSTATE view552547 * so that the usual helpers work correctly.···563564 *vcpu_cpsr(vcpu) &= ~(PSR_MODE_MASK | PSR_MODE32_BIT);564565 *vcpu_cpsr(vcpu) |= mode;565566 }567567+568568+ return __fixup_guest_exit(vcpu, exit_code, hyp_exit_handlers);566569}567570568571/* Switch to the guest for VHE systems running in EL2 */···578577 guest_ctxt = &vcpu->arch.ctxt;579578580579 sysreg_save_host_state_vhe(host_ctxt);580580+581581+ 
fpsimd_lazy_switch_to_guest(vcpu);581582582583 /*583584 * Note that ARM erratum 1165522 requires us to configure both stage 1···604601 sysreg_save_guest_state_vhe(guest_ctxt);605602606603 __deactivate_traps(vcpu);604604+605605+ fpsimd_lazy_switch_to_host(vcpu);607606608607 sysreg_restore_host_state_vhe(host_ctxt);609608
+5-4
arch/arm64/kvm/nested.c
···6767 if (!tmp)6868 return -ENOMEM;69697070+ swap(kvm->arch.nested_mmus, tmp);7171+7072 /*7173 * If we went through a reallocation, adjust the MMU back-pointers in7274 * the previously initialised kvm_pgtable structures.7375 */7476 if (kvm->arch.nested_mmus != tmp)7577 for (int i = 0; i < kvm->arch.nested_mmus_size; i++)7676- tmp[i].pgt->mmu = &tmp[i];7878+ kvm->arch.nested_mmus[i].pgt->mmu = &kvm->arch.nested_mmus[i];77797880 for (int i = kvm->arch.nested_mmus_size; !ret && i < num_mmus; i++)7979- ret = init_nested_s2_mmu(kvm, &tmp[i]);8181+ ret = init_nested_s2_mmu(kvm, &kvm->arch.nested_mmus[i]);80828183 if (ret) {8284 for (int i = kvm->arch.nested_mmus_size; i < num_mmus; i++)8383- kvm_free_stage2_pgd(&tmp[i]);8585+ kvm_free_stage2_pgd(&kvm->arch.nested_mmus[i]);84868587 return ret;8688 }87898890 kvm->arch.nested_mmus_size = num_mmus;8989- kvm->arch.nested_mmus = tmp;90919192 return 0;9293}
···3434 *3535 * CPU Interface:3636 *3737- * - kvm_vgic_vcpu_init(): initialization of static data that3838- * doesn't depend on any sizing information or emulation type. No3939- * allocation is allowed there.3737+ * - kvm_vgic_vcpu_init(): initialization of static data that doesn't depend3838+ * on any sizing information. Private interrupts are allocated if not3939+ * already allocated at vgic-creation time.4040 */41414242/* EARLY INIT */···5757}58585959/* CREATION */6060+6161+static int vgic_allocate_private_irqs_locked(struct kvm_vcpu *vcpu, u32 type);60626163/**6264 * kvm_vgic_create: triggered by the instantiation of the VGIC device by···111109112110 if (atomic_read(&kvm->online_vcpus) > kvm->max_vcpus) {113111 ret = -E2BIG;112112+ goto out_unlock;113113+ }114114+115115+ kvm_for_each_vcpu(i, vcpu, kvm) {116116+ ret = vgic_allocate_private_irqs_locked(vcpu, type);117117+ if (ret)118118+ break;119119+ }120120+121121+ if (ret) {122122+ kvm_for_each_vcpu(i, vcpu, kvm) {123123+ struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;124124+ kfree(vgic_cpu->private_irqs);125125+ vgic_cpu->private_irqs = NULL;126126+ }127127+114128 goto out_unlock;115129 }116130···198180 return 0;199181}200182201201-static int vgic_allocate_private_irqs_locked(struct kvm_vcpu *vcpu)183183+static int vgic_allocate_private_irqs_locked(struct kvm_vcpu *vcpu, u32 type)202184{203185 struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;204186 int i;···236218 /* PPIs */237219 irq->config = VGIC_CONFIG_LEVEL;238220 }221221+222222+ switch (type) {223223+ case KVM_DEV_TYPE_ARM_VGIC_V3:224224+ irq->group = 1;225225+ irq->mpidr = kvm_vcpu_get_mpidr_aff(vcpu);226226+ break;227227+ case KVM_DEV_TYPE_ARM_VGIC_V2:228228+ irq->group = 0;229229+ irq->targets = BIT(vcpu->vcpu_id);230230+ break;231231+ }239232 }240233241234 return 0;242235}243236244244-static int vgic_allocate_private_irqs(struct kvm_vcpu *vcpu)237237+static int vgic_allocate_private_irqs(struct kvm_vcpu *vcpu, u32 type)245238{246239 int 
ret;247240248241 mutex_lock(&vcpu->kvm->arch.config_lock);249249- ret = vgic_allocate_private_irqs_locked(vcpu);242242+ ret = vgic_allocate_private_irqs_locked(vcpu, type);250243 mutex_unlock(&vcpu->kvm->arch.config_lock);251244252245 return ret;···287258 if (!irqchip_in_kernel(vcpu->kvm))288259 return 0;289260290290- ret = vgic_allocate_private_irqs(vcpu);261261+ ret = vgic_allocate_private_irqs(vcpu, dist->vgic_model);291262 if (ret)292263 return ret;293264···324295{325296 struct vgic_dist *dist = &kvm->arch.vgic;326297 struct kvm_vcpu *vcpu;327327- int ret = 0, i;298298+ int ret = 0;328299 unsigned long idx;329300330301 lockdep_assert_held(&kvm->arch.config_lock);···343314 ret = kvm_vgic_dist_init(kvm, dist->nr_spis);344315 if (ret)345316 goto out;346346-347347- /* Initialize groups on CPUs created before the VGIC type was known */348348- kvm_for_each_vcpu(idx, vcpu, kvm) {349349- ret = vgic_allocate_private_irqs_locked(vcpu);350350- if (ret)351351- goto out;352352-353353- for (i = 0; i < VGIC_NR_PRIVATE_IRQS; i++) {354354- struct vgic_irq *irq = vgic_get_vcpu_irq(vcpu, i);355355-356356- switch (dist->vgic_model) {357357- case KVM_DEV_TYPE_ARM_VGIC_V3:358358- irq->group = 1;359359- irq->mpidr = kvm_vcpu_get_mpidr_aff(vcpu);360360- break;361361- case KVM_DEV_TYPE_ARM_VGIC_V2:362362- irq->group = 0;363363- irq->targets = 1U << idx;364364- break;365365- default:366366- ret = -EINVAL;367367- }368368-369369- vgic_put_irq(kvm, irq);370370-371371- if (ret)372372- goto out;373373- }374374- }375317376318 /*377319 * If we have GICv4.1 enabled, unconditionally request enable the
+7
arch/arm64/mm/trans_pgd.c
···162162 unsigned long next;163163 unsigned long addr = start;164164165165+ if (pgd_none(READ_ONCE(*dst_pgdp))) {166166+ dst_p4dp = trans_alloc(info);167167+ if (!dst_p4dp)168168+ return -ENOMEM;169169+ pgd_populate(NULL, dst_pgdp, dst_p4dp);170170+ }171171+165172 dst_p4dp = p4d_offset(dst_pgdp, start);166173 src_p4dp = p4d_offset(src_pgdp, start);167174 do {
-21
arch/loongarch/include/asm/cpu-info.h
···7676#define cpu_family_string() __cpu_family[raw_smp_processor_id()]7777#define cpu_full_name_string() __cpu_full_name[raw_smp_processor_id()]78787979-struct seq_file;8080-struct notifier_block;8181-8282-extern int register_proc_cpuinfo_notifier(struct notifier_block *nb);8383-extern int proc_cpuinfo_notifier_call_chain(unsigned long val, void *v);8484-8585-#define proc_cpuinfo_notifier(fn, pri) \8686-({ \8787- static struct notifier_block fn##_nb = { \8888- .notifier_call = fn, \8989- .priority = pri \9090- }; \9191- \9292- register_proc_cpuinfo_notifier(&fn##_nb); \9393-})9494-9595-struct proc_cpuinfo_notifier_args {9696- struct seq_file *m;9797- unsigned long n;9898-};9999-10079static inline bool cpus_are_siblings(int cpua, int cpub)10180{10281 struct cpuinfo_loongarch *infoa = &cpu_data[cpua];
+2
arch/loongarch/include/asm/smp.h
···7777#define SMP_IRQ_WORK BIT(ACTION_IRQ_WORK)7878#define SMP_CLEAR_VECTOR BIT(ACTION_CLEAR_VECTOR)79798080+struct seq_file;8181+8082struct secondary_data {8183 unsigned long stack;8284 unsigned long thread_info;
+15-13
arch/loongarch/kernel/genex.S
···18181919 .align 52020SYM_FUNC_START(__arch_cpu_idle)2121- /* start of rollback region */2222- LONG_L t0, tp, TI_FLAGS2323- nop2424- andi t0, t0, _TIF_NEED_RESCHED2525- bnez t0, 1f2626- nop2727- nop2828- nop2121+ /* start of idle interrupt region */2222+ ori t0, zero, CSR_CRMD_IE2323+ /* idle instruction needs irq enabled */2424+ csrxchg t0, t0, LOONGARCH_CSR_CRMD2525+ /*2626+ * If an interrupt lands here, between enabling interrupts above and2727+ * going idle on the next instruction, we must *NOT* go idle since the2828+ * interrupt could have set TIF_NEED_RESCHED or caused a timer to need2929+ * reprogramming. Fall through -- see handle_vint() below -- and have3030+ * the idle loop take care of things.3131+ */2932 idle 03030- /* end of rollback region */3333+ /* end of idle interrupt region */31341: jr ra3235SYM_FUNC_END(__arch_cpu_idle)3336···3835 UNWIND_HINT_UNDEFINED3936 BACKUP_T0T14037 SAVE_ALL4141- la_abs t1, __arch_cpu_idle3838+ la_abs t1, 1b4239 LONG_L t0, sp, PT_ERA4343- /* 32 byte rollback region */4444- ori t0, t0, 0x1f4545- xori t0, t0, 0x1f4040+ /* 3 instructions idle interrupt region */4141+ ori t0, t0, 0b11004642 bne t0, t1, 1f4743 LONG_S t0, sp, PT_ERA48441: move a0, sp
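The new ERA fix-up replaces the old 32-byte rollback with a single `ori t0, t0, 0b1100` before the compare: assuming the layout shown (function 32-byte aligned via `.align 5`, the three region instructions at offsets 0, 4 and 8, and the `1:` label at offset 12), OR-ing `0xc` into any in-region ERA yields exactly the label address. A userspace sketch of that check, with a made-up base address:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical addresses; the real region base comes from ".align 5". */
#define IDLE_REGION_BASE 0x9000000000001000ULL /* ori  (offset 0)  */
#define IDLE_LABEL (IDLE_REGION_BASE + 12)     /* "1:" (offset 12) */

/*
 * Mirrors "ori t0, t0, 0b1100; bne t0, t1, 1f": true when the saved
 * exception PC lies within the 3-instruction idle interrupt region,
 * in which case the handler rewrites PT_ERA to the "1:" label.
 */
static int era_in_idle_region(uint64_t era)
{
	return (era | 0xc) == IDLE_LABEL;
}
```

The trick only needs the region to stay within one 16-byte-aligned block, which the function alignment guarantees.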
···303303 * TOE=0: Trap on Exception.304304 * TIT=0: Trap on Timer.305305 */306306- if (env & CSR_GCFG_GCIP_ALL)306306+ if (env & CSR_GCFG_GCIP_SECURE)307307 gcfg |= CSR_GCFG_GCI_SECURE;308308- if (env & CSR_GCFG_MATC_ROOT)308308+ if (env & CSR_GCFG_MATP_ROOT)309309 gcfg |= CSR_GCFG_MATC_ROOT;310310311311 write_csr_gcfg(gcfg);
+1-1
arch/loongarch/kvm/switch.S
···8585 * Guest CRMD comes from separate GCSR_CRMD register8686 */8787 ori t0, zero, CSR_PRMD_PIE8888- csrxchg t0, t0, LOONGARCH_CSR_PRMD8888+ csrwr t0, LOONGARCH_CSR_PRMD89899090 /* Set PVM bit to setup ertn to guest context */9191 ori t0, zero, CSR_GSTAT_PVM
-3
arch/loongarch/kvm/vcpu.c
···1548154815491549 /* Restore timer state regardless */15501550 kvm_restore_timer(vcpu);15511551-15521552- /* Control guest page CCA attribute */15531553- change_csr_gcfg(CSR_GCFG_MATC_MASK, CSR_GCFG_MATC_ROOT);15541551 kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu);1555155215561553 /* Restore hardware PMU CSRs */
···2727 */2828struct pt_regs {2929#ifdef CONFIG_32BIT3030- /* Pad bytes for argument save space on the stack. */3131- unsigned long pad0[8];3030+ /* Saved syscall stack arguments; entries 0-3 unused. */3131+ unsigned long args[8];3232#endif33333434 /* Saved main processor registers. */
+8-24
arch/mips/include/asm/syscall.h
···5757static inline void mips_get_syscall_arg(unsigned long *arg,5858 struct task_struct *task, struct pt_regs *regs, unsigned int n)5959{6060- unsigned long usp __maybe_unused = regs->regs[29];6161-6060+#ifdef CONFIG_32BIT6261 switch (n) {6362 case 0: case 1: case 2: case 3:6463 *arg = regs->regs[4 + n];6565-6664 return;6767-6868-#ifdef CONFIG_32BIT6965 case 4: case 5: case 6: case 7:7070- get_user(*arg, (int *)usp + n);6666+ *arg = regs->args[n];7167 return;7272-#endif7373-7474-#ifdef CONFIG_64BIT7575- case 4: case 5: case 6: case 7:7676-#ifdef CONFIG_MIPS32_O327777- if (test_tsk_thread_flag(task, TIF_32BIT_REGS))7878- get_user(*arg, (int *)usp + n);7979- else8080-#endif8181- *arg = regs->regs[4 + n];8282-8383- return;8484-#endif8585-8686- default:8787- BUG();8868 }8989-9090- unreachable();6969+#else7070+ *arg = regs->regs[4 + n];7171+ if ((IS_ENABLED(CONFIG_MIPS32_O32) &&7272+ test_tsk_thread_flag(task, TIF_32BIT_REGS)))7373+ *arg = (unsigned int)*arg;7474+#endif9175}92769377static inline long syscall_get_error(struct task_struct *task,
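On 64-bit kernels the rewrite above stops fetching o32 arguments from the user stack: all eight arguments live in registers, and a compat (o32) task only needs the register value narrowed to its low 32 bits. That narrowing step is plain C and can be sketched in isolation (the function name is illustrative, not kernel API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * An o32 register value is kept sign-extended in a 64-bit kernel
 * register; syscall arguments of compat tasks must be truncated to
 * their low 32 bits, as "(unsigned int)*arg" does in the patch.
 */
static uint64_t narrow_compat_arg(uint64_t arg, bool compat_task)
{
	if (compat_task)
		arg = (uint32_t)arg;
	return arg;
}
```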
···740740CONFIG_IMA_DEFAULT_HASH_SHA256=y741741CONFIG_IMA_WRITE_POLICY=y742742CONFIG_IMA_APPRAISE=y743743-CONFIG_LSM="yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"744743CONFIG_BUG_ON_DATA_CORRUPTION=y745744CONFIG_CRYPTO_USER=m746745# CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is not set
···6262# CONFIG_INOTIFY_USER is not set6363# CONFIG_MISC_FILESYSTEMS is not set6464# CONFIG_NETWORK_FILESYSTEMS is not set6565-CONFIG_LSM="yama,loadpin,safesetid,integrity"6665# CONFIG_ZLIB_DFLTCC is not set6766CONFIG_XZ_DEC_MICROLZMA=y6867CONFIG_PRINTK_TIME=y
+5-1
arch/s390/include/asm/bitops.h
···5353 unsigned long mask;5454 int cc;55555656- if (__builtin_constant_p(nr)) {5656+ /*5757+ * With CONFIG_PROFILE_ALL_BRANCHES enabled gcc fails to5858+ * handle __builtin_constant_p() in some cases.5959+ */6060+ if (!IS_ENABLED(CONFIG_PROFILE_ALL_BRANCHES) && __builtin_constant_p(nr)) {5761 addr = (const volatile unsigned char *)ptr;5862 addr += (nr ^ (BITS_PER_LONG - BITS_PER_BYTE)) / BITS_PER_BYTE;5963 mask = 1UL << (nr & (BITS_PER_BYTE - 1));
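The constant-`nr` fast path being guarded here turns a bit number into a byte address plus mask. On big-endian s390, bit `nr` of a 64-bit word lives in byte `7 - nr / 8` of its storage, which the XOR with `BITS_PER_LONG - BITS_PER_BYTE` computes branch-free. A standalone sketch of that index math (kernel types replaced with plain C):

```c
#include <assert.h>

#define BITS_PER_LONG 64
#define BITS_PER_BYTE 8

/* Byte offset of bit 'nr' inside a big-endian unsigned long:
 * nr 0..7 -> byte 7 (least significant), nr 56..63 -> byte 0. */
static unsigned int be_byte_offset(unsigned int nr)
{
	return (nr ^ (BITS_PER_LONG - BITS_PER_BYTE)) / BITS_PER_BYTE;
}

/* Mask selecting bit 'nr' within that byte. */
static unsigned char be_byte_mask(unsigned int nr)
{
	return 1U << (nr & (BITS_PER_BYTE - 1));
}
```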
+6-14
arch/s390/include/asm/gmap.h
···2323/**2424 * struct gmap_struct - guest address space2525 * @list: list head for the mm->context gmap list2626- * @crst_list: list of all crst tables used in the guest address space2726 * @mm: pointer to the parent mm_struct2827 * @guest_to_host: radix tree with guest to host address translation2928 * @host_to_guest: radix tree with pointer to segment table entries···3435 * @guest_handle: protected virtual machine handle for the ultravisor3536 * @host_to_rmap: radix tree with gmap_rmap lists3637 * @children: list of shadow gmap structures3737- * @pt_list: list of all page tables used in the shadow guest address space3838 * @shadow_lock: spinlock to protect the shadow gmap list3939 * @parent: pointer to the parent gmap for shadow guest address spaces4040 * @orig_asce: ASCE for which the shadow page table has been created···4345 */4446struct gmap {4547 struct list_head list;4646- struct list_head crst_list;4748 struct mm_struct *mm;4849 struct radix_tree_root guest_to_host;4950 struct radix_tree_root host_to_guest;···5861 /* Additional data for shadow guest address spaces */5962 struct radix_tree_root host_to_rmap;6063 struct list_head children;6161- struct list_head pt_list;6264 spinlock_t shadow_lock;6365 struct gmap *parent;6466 unsigned long orig_asce;···102106void gmap_remove(struct gmap *gmap);103107struct gmap *gmap_get(struct gmap *gmap);104108void gmap_put(struct gmap *gmap);109109+void gmap_free(struct gmap *gmap);110110+struct gmap *gmap_alloc(unsigned long limit);105111106112int gmap_map_segment(struct gmap *gmap, unsigned long from,107113 unsigned long to, unsigned long len);108114int gmap_unmap_segment(struct gmap *gmap, unsigned long to, unsigned long len);109115unsigned long __gmap_translate(struct gmap *, unsigned long gaddr);110110-unsigned long gmap_translate(struct gmap *, unsigned long gaddr);111116int __gmap_link(struct gmap *gmap, unsigned long gaddr, unsigned long vmaddr);112112-int gmap_fault(struct gmap *, unsigned long gaddr, unsigned 
int fault_flags);113117void gmap_discard(struct gmap *, unsigned long from, unsigned long to);114118void __gmap_zap(struct gmap *, unsigned long gaddr);115119void gmap_unlink(struct mm_struct *, unsigned long *table, unsigned long vmaddr);116120117121int gmap_read_table(struct gmap *gmap, unsigned long gaddr, unsigned long *val);118122119119-struct gmap *gmap_shadow(struct gmap *parent, unsigned long asce,120120- int edat_level);121121-int gmap_shadow_valid(struct gmap *sg, unsigned long asce, int edat_level);123123+void gmap_unshadow(struct gmap *sg);122124int gmap_shadow_r2t(struct gmap *sg, unsigned long saddr, unsigned long r2t,123125 int fake);124126int gmap_shadow_r3t(struct gmap *sg, unsigned long saddr, unsigned long r3t,···125131 int fake);126132int gmap_shadow_pgt(struct gmap *sg, unsigned long saddr, unsigned long pgt,127133 int fake);128128-int gmap_shadow_pgt_lookup(struct gmap *sg, unsigned long saddr,129129- unsigned long *pgt, int *dat_protection, int *fake);130134int gmap_shadow_page(struct gmap *sg, unsigned long saddr, pte_t pte);131135132136void gmap_register_pte_notifier(struct gmap_notifier *);133137void gmap_unregister_pte_notifier(struct gmap_notifier *);134138135135-int gmap_mprotect_notify(struct gmap *, unsigned long start,136136- unsigned long len, int prot);139139+int gmap_protect_one(struct gmap *gmap, unsigned long gaddr, int prot, unsigned long bits);137140138141void gmap_sync_dirty_log_pmd(struct gmap *gmap, unsigned long dirty_bitmap[4],139142 unsigned long gaddr, unsigned long vmaddr);140143int s390_disable_cow_sharing(void);141141-void s390_unlist_old_asce(struct gmap *gmap);142144int s390_replace_asce(struct gmap *gmap);143145void s390_uv_destroy_pfns(unsigned long count, unsigned long *pfns);144146int __s390_uv_destroy_range(struct mm_struct *mm, unsigned long start,145147 unsigned long end, bool interruptible);148148+int kvm_s390_wiggle_split_folio(struct mm_struct *mm, struct folio *folio, bool split);149149+unsigned long 
*gmap_table_walk(struct gmap *gmap, unsigned long gaddr, int level);146150147151/**148152 * s390_uv_destroy_range - Destroy a range of pages in the given mm.
+5-1
arch/s390/include/asm/kvm_host.h
···3030#define KVM_S390_ESCA_CPU_SLOTS 2483131#define KVM_MAX_VCPUS 25532323333+#define KVM_INTERNAL_MEM_SLOTS 13434+3335/*3436 * These seem to be used for allocating ->chip in the routing table, which we3537 * don't use. 1 is as small as we can get to reduce the needed memory. If we···933931 u8 reserved928[0x1000 - 0x928]; /* 0x0928 */934932};935933934934+struct vsie_page;935935+936936struct kvm_s390_vsie {937937 struct mutex mutex;938938 struct radix_tree_root addr_to_page;939939 int page_count;940940 int next;941941- struct page *pages[KVM_MAX_VCPUS];941941+ struct vsie_page *pages[KVM_MAX_VCPUS];942942};943943944944struct kvm_s390_gisa_iam {
+18-3
arch/s390/include/asm/pgtable.h
···420420#define PGSTE_HC_BIT 0x0020000000000000UL421421#define PGSTE_GR_BIT 0x0004000000000000UL422422#define PGSTE_GC_BIT 0x0002000000000000UL423423-#define PGSTE_UC_BIT 0x0000800000000000UL /* user dirty (migration) */424424-#define PGSTE_IN_BIT 0x0000400000000000UL /* IPTE notify bit */425425-#define PGSTE_VSIE_BIT 0x0000200000000000UL /* ref'd in a shadow table */423423+#define PGSTE_ST2_MASK 0x0000ffff00000000UL424424+#define PGSTE_UC_BIT 0x0000000000008000UL /* user dirty (migration) */425425+#define PGSTE_IN_BIT 0x0000000000004000UL /* IPTE notify bit */426426+#define PGSTE_VSIE_BIT 0x0000000000002000UL /* ref'd in a shadow table */426427427428/* Guest Page State used for virtualization */428429#define _PGSTE_GPS_ZERO 0x0000000080000000UL···2007200620082007#define pmd_pgtable(pmd) \20092008 ((pgtable_t)__va(pmd_val(pmd) & -sizeof(pte_t)*PTRS_PER_PTE))20092009+20102010+static inline unsigned long gmap_pgste_get_pgt_addr(unsigned long *pgt)20112011+{20122012+ unsigned long *pgstes, res;20132013+20142014+ pgstes = pgt + _PAGE_ENTRIES;20152015+20162016+ res = (pgstes[0] & PGSTE_ST2_MASK) << 16;20172017+ res |= pgstes[1] & PGSTE_ST2_MASK;20182018+ res |= (pgstes[2] & PGSTE_ST2_MASK) >> 16;20192019+ res |= (pgstes[3] & PGSTE_ST2_MASK) >> 32;20202020+20212021+ return res;20222022+}2010202320112024#endif /* _S390_PAGE_H */
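The new `gmap_pgste_get_pgt_addr()` reassembles a 64-bit value that was scattered, 16 bits at a time, across the ST2 field (bits 32-47, `PGSTE_ST2_MASK`) of the first four PGSTEs. A round-trip sketch of the bit layout; the setter below is a hypothetical inverse for illustration, not kernel code, and it takes the pgste array directly rather than offsetting past `_PAGE_ENTRIES`:

```c
#include <assert.h>
#include <stdint.h>

#define PGSTE_ST2_MASK 0x0000ffff00000000ULL

/* Hypothetical inverse of the kernel getter: scatter 'val' into the
 * ST2 fields of four pgstes, most significant 16 bits first. */
static void pgste_set_pgt_addr(uint64_t *pgstes, uint64_t val)
{
	pgstes[0] = (val >> 16) & PGSTE_ST2_MASK; /* val bits 48-63 */
	pgstes[1] = val & PGSTE_ST2_MASK;         /* val bits 32-47 */
	pgstes[2] = (val << 16) & PGSTE_ST2_MASK; /* val bits 16-31 */
	pgstes[3] = (val << 32) & PGSTE_ST2_MASK; /* val bits  0-15 */
}

/* Same shifts as gmap_pgste_get_pgt_addr() in the patch above. */
static uint64_t pgste_get_pgt_addr(const uint64_t *pgstes)
{
	uint64_t res;

	res  = (pgstes[0] & PGSTE_ST2_MASK) << 16;
	res |=  pgstes[1] & PGSTE_ST2_MASK;
	res |= (pgstes[2] & PGSTE_ST2_MASK) >> 16;
	res |= (pgstes[3] & PGSTE_ST2_MASK) >> 32;

	return res;
}
```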
+3-3
arch/s390/include/asm/uv.h
···628628}629629630630int uv_pin_shared(unsigned long paddr);631631-int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb);632632-int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr);633631int uv_destroy_folio(struct folio *folio);634632int uv_destroy_pte(pte_t pte);635633int uv_convert_from_secure_pte(pte_t pte);636636-int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr);634634+int make_folio_secure(struct folio *folio, struct uv_cb_header *uvcb);635635+int uv_convert_from_secure(unsigned long paddr);636636+int uv_convert_from_secure_folio(struct folio *folio);637637638638void setup_uv(void);639639
+29-263
arch/s390/kernel/uv.c
···1919#include <asm/sections.h>2020#include <asm/uv.h>21212222-#if !IS_ENABLED(CONFIG_KVM)2323-unsigned long __gmap_translate(struct gmap *gmap, unsigned long gaddr)2424-{2525- return 0;2626-}2727-2828-int gmap_fault(struct gmap *gmap, unsigned long gaddr,2929- unsigned int fault_flags)3030-{3131- return 0;3232-}3333-#endif3434-3522/* the bootdata_preserved fields come from ones in arch/s390/boot/uv.c */3623int __bootdata_preserved(prot_virt_guest);3724EXPORT_SYMBOL(prot_virt_guest);···146159 folio_put(folio);147160 return rc;148161}162162+EXPORT_SYMBOL(uv_destroy_folio);149163150164/*151165 * The present PTE still indirectly holds a folio reference through the mapping.···163175 *164176 * @paddr: Absolute host address of page to be exported165177 */166166-static int uv_convert_from_secure(unsigned long paddr)178178+int uv_convert_from_secure(unsigned long paddr)167179{168180 struct uv_cb_cfs uvcb = {169181 .header.cmd = UVC_CMD_CONV_FROM_SEC_STOR,···175187 return -EINVAL;176188 return 0;177189}190190+EXPORT_SYMBOL_GPL(uv_convert_from_secure);178191179192/*180193 * The caller must already hold a reference to the folio.181194 */182182-static int uv_convert_from_secure_folio(struct folio *folio)195195+int uv_convert_from_secure_folio(struct folio *folio)183196{184197 int rc;185198···195206 folio_put(folio);196207 return rc;197208}209209+EXPORT_SYMBOL_GPL(uv_convert_from_secure_folio);198210199211/*200212 * The present PTE still indirectly holds a folio reference through the mapping.···227237 return res;228238}229239230230-static int make_folio_secure(struct folio *folio, struct uv_cb_header *uvcb)240240+/**241241+ * make_folio_secure() - make a folio secure242242+ * @folio: the folio to make secure243243+ * @uvcb: the uvcb that describes the UVC to be used244244+ *245245+ * The folio @folio will be made secure if possible, @uvcb will be passed246246+ * as-is to the UVC.247247+ *248248+ * Return: 0 on success;249249+ * -EBUSY if the folio is in writeback or has too 
many references;250250+ * -E2BIG if the folio is large;251251+ * -EAGAIN if the UVC needs to be attempted again;252252+ * -ENXIO if the address is not mapped;253253+ * -EINVAL if the UVC failed for other reasons.254254+ *255255+ * Context: The caller must hold exactly one extra reference on the folio256256+ * (it's the same logic as split_folio())257257+ */258258+int make_folio_secure(struct folio *folio, struct uv_cb_header *uvcb)231259{232260 int expected, cc = 0;233261262262+ if (folio_test_large(folio))263263+ return -E2BIG;234264 if (folio_test_writeback(folio))235235- return -EAGAIN;236236- expected = expected_folio_refs(folio);265265+ return -EBUSY;266266+ expected = expected_folio_refs(folio) + 1;237267 if (!folio_ref_freeze(folio, expected))238268 return -EBUSY;239269 set_bit(PG_arch_1, &folio->flags);···277267 return -EAGAIN;278268 return uvcb->rc == 0x10a ? -ENXIO : -EINVAL;279269}280280-281281-/**282282- * should_export_before_import - Determine whether an export is needed283283- * before an import-like operation284284- * @uvcb: the Ultravisor control block of the UVC to be performed285285- * @mm: the mm of the process286286- *287287- * Returns whether an export is needed before every import-like operation.288288- * This is needed for shared pages, which don't trigger a secure storage289289- * exception when accessed from a different guest.290290- *291291- * Although considered as one, the Unpin Page UVC is not an actual import,292292- * so it is not affected.293293- *294294- * No export is needed also when there is only one protected VM, because the295295- * page cannot belong to the wrong VM in that case (there is no "other VM"296296- * it can belong to).297297- *298298- * Return: true if an export is needed before every import, otherwise false.299299- */300300-static bool should_export_before_import(struct uv_cb_header *uvcb, struct mm_struct *mm)301301-{302302- /*303303- * The misc feature indicates, among other things, that importing a304304- * 
shared page from a different protected VM will automatically also305305- * transfer its ownership.306306- */307307- if (uv_has_feature(BIT_UV_FEAT_MISC))308308- return false;309309- if (uvcb->cmd == UVC_CMD_UNPIN_PAGE_SHARED)310310- return false;311311- return atomic_read(&mm->context.protected_count) > 1;312312-}313313-314314-/*315315- * Drain LRU caches: the local one on first invocation and the ones of all316316- * CPUs on successive invocations. Returns "true" on the first invocation.317317- */318318-static bool drain_lru(bool *drain_lru_called)319319-{320320- /*321321- * If we have tried a local drain and the folio refcount322322- * still does not match our expected safe value, try with a323323- * system wide drain. This is needed if the pagevecs holding324324- * the page are on a different CPU.325325- */326326- if (*drain_lru_called) {327327- lru_add_drain_all();328328- /* We give up here, don't retry immediately. */329329- return false;330330- }331331- /*332332- * We are here if the folio refcount does not match the333333- * expected safe value. The main culprits are usually334334- * pagevecs. With lru_add_drain() we drain the pagevecs335335- * on the local CPU so that hopefully the refcount will336336- * reach the expected safe value.337337- */338338- lru_add_drain();339339- *drain_lru_called = true;340340- /* The caller should try again immediately */341341- return true;342342-}343343-344344-/*345345- * Requests the Ultravisor to make a page accessible to a guest.346346- * If it's brought in the first time, it will be cleared. 
If347347- * it has been exported before, it will be decrypted and integrity348348- * checked.349349- */350350-int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)351351-{352352- struct vm_area_struct *vma;353353- bool drain_lru_called = false;354354- spinlock_t *ptelock;355355- unsigned long uaddr;356356- struct folio *folio;357357- pte_t *ptep;358358- int rc;359359-360360-again:361361- rc = -EFAULT;362362- mmap_read_lock(gmap->mm);363363-364364- uaddr = __gmap_translate(gmap, gaddr);365365- if (IS_ERR_VALUE(uaddr))366366- goto out;367367- vma = vma_lookup(gmap->mm, uaddr);368368- if (!vma)369369- goto out;370370- /*371371- * Secure pages cannot be huge and userspace should not combine both.372372- * In case userspace does it anyway this will result in an -EFAULT for373373- * the unpack. The guest is thus never reaching secure mode. If374374- * userspace is playing dirty tricky with mapping huge pages later375375- * on this will result in a segmentation fault.376376- */377377- if (is_vm_hugetlb_page(vma))378378- goto out;379379-380380- rc = -ENXIO;381381- ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);382382- if (!ptep)383383- goto out;384384- if (pte_present(*ptep) && !(pte_val(*ptep) & _PAGE_INVALID) && pte_write(*ptep)) {385385- folio = page_folio(pte_page(*ptep));386386- rc = -EAGAIN;387387- if (folio_test_large(folio)) {388388- rc = -E2BIG;389389- } else if (folio_trylock(folio)) {390390- if (should_export_before_import(uvcb, gmap->mm))391391- uv_convert_from_secure(PFN_PHYS(folio_pfn(folio)));392392- rc = make_folio_secure(folio, uvcb);393393- folio_unlock(folio);394394- }395395-396396- /*397397- * Once we drop the PTL, the folio may get unmapped and398398- * freed immediately. 
We need a temporary reference.399399- */400400- if (rc == -EAGAIN || rc == -E2BIG)401401- folio_get(folio);402402- }403403- pte_unmap_unlock(ptep, ptelock);404404-out:405405- mmap_read_unlock(gmap->mm);406406-407407- switch (rc) {408408- case -E2BIG:409409- folio_lock(folio);410410- rc = split_folio(folio);411411- folio_unlock(folio);412412- folio_put(folio);413413-414414- switch (rc) {415415- case 0:416416- /* Splitting succeeded, try again immediately. */417417- goto again;418418- case -EAGAIN:419419- /* Additional folio references. */420420- if (drain_lru(&drain_lru_called))421421- goto again;422422- return -EAGAIN;423423- case -EBUSY:424424- /* Unexpected race. */425425- return -EAGAIN;426426- }427427- WARN_ON_ONCE(1);428428- return -ENXIO;429429- case -EAGAIN:430430- /*431431- * If we are here because the UVC returned busy or partial432432- * completion, this is just a useless check, but it is safe.433433- */434434- folio_wait_writeback(folio);435435- folio_put(folio);436436- return -EAGAIN;437437- case -EBUSY:438438- /* Additional folio references. */439439- if (drain_lru(&drain_lru_called))440440- goto again;441441- return -EAGAIN;442442- case -ENXIO:443443- if (gmap_fault(gmap, gaddr, FAULT_FLAG_WRITE))444444- return -EFAULT;445445- return -EAGAIN;446446- }447447- return rc;448448-}449449-EXPORT_SYMBOL_GPL(gmap_make_secure);450450-451451-int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr)452452-{453453- struct uv_cb_cts uvcb = {454454- .header.cmd = UVC_CMD_CONV_TO_SEC_STOR,455455- .header.len = sizeof(uvcb),456456- .guest_handle = gmap->guest_handle,457457- .gaddr = gaddr,458458- };459459-460460- return gmap_make_secure(gmap, gaddr, &uvcb);461461-}462462-EXPORT_SYMBOL_GPL(gmap_convert_to_secure);463463-464464-/**465465- * gmap_destroy_page - Destroy a guest page.466466- * @gmap: the gmap of the guest467467- * @gaddr: the guest address to destroy468468- *469469- * An attempt will be made to destroy the given guest page. 
If the attempt470470- * fails, an attempt is made to export the page. If both attempts fail, an471471- * appropriate error is returned.472472- */473473-int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr)474474-{475475- struct vm_area_struct *vma;476476- struct folio_walk fw;477477- unsigned long uaddr;478478- struct folio *folio;479479- int rc;480480-481481- rc = -EFAULT;482482- mmap_read_lock(gmap->mm);483483-484484- uaddr = __gmap_translate(gmap, gaddr);485485- if (IS_ERR_VALUE(uaddr))486486- goto out;487487- vma = vma_lookup(gmap->mm, uaddr);488488- if (!vma)489489- goto out;490490- /*491491- * Huge pages should not be able to become secure492492- */493493- if (is_vm_hugetlb_page(vma))494494- goto out;495495-496496- rc = 0;497497- folio = folio_walk_start(&fw, vma, uaddr, 0);498498- if (!folio)499499- goto out;500500- /*501501- * See gmap_make_secure(): large folios cannot be secure. Small502502- * folio implies FW_LEVEL_PTE.503503- */504504- if (folio_test_large(folio) || !pte_write(fw.pte))505505- goto out_walk_end;506506- rc = uv_destroy_folio(folio);507507- /*508508- * Fault handlers can race; it is possible that two CPUs will fault509509- * on the same secure page. One CPU can destroy the page, reboot,510510- * re-enter secure mode and import it, while the second CPU was511511- * stuck at the beginning of the handler. At some point the second512512- * CPU will be able to progress, and it will not be able to destroy513513- * the page. In that case we do not want to terminate the process,514514- * we instead try to export the page.515515- */516516- if (rc)517517- rc = uv_convert_from_secure_folio(folio);518518-out_walk_end:519519- folio_walk_end(&fw, vma);520520-out:521521- mmap_read_unlock(gmap->mm);522522- return rc;523523-}524524-EXPORT_SYMBOL_GPL(gmap_destroy_page);270270+EXPORT_SYMBOL_GPL(make_folio_secure);525271526272/*527273 * To be called with the folio locked or with an extra reference! This will
···1616#include <asm/gmap.h>1717#include <asm/dat-bits.h>1818#include "kvm-s390.h"1919+#include "gmap.h"1920#include "gaccess.h"20212122/*···13941393}1395139413961395/**13961396+ * shadow_pgt_lookup() - find a shadow page table13971397+ * @sg: pointer to the shadow guest address space structure13981398+ * @saddr: the address in the shadow guest address space13991399+ * @pgt: parent gmap address of the page table to get shadowed14001400+ * @dat_protection: if the pgtable is marked as protected by dat14011401+ * @fake: pgt references contiguous guest memory block, not a pgtable14021402+ *14031403+ * Returns 0 if the shadow page table was found and -EAGAIN if the page14041404+ * table was not found.14051405+ *14061406+ * Called with sg->mm->mmap_lock in read.14071407+ */14081408+static int shadow_pgt_lookup(struct gmap *sg, unsigned long saddr, unsigned long *pgt,14091409+ int *dat_protection, int *fake)14101410+{14111411+ unsigned long pt_index;14121412+ unsigned long *table;14131413+ struct page *page;14141414+ int rc;14151415+14161416+ spin_lock(&sg->guest_table_lock);14171417+ table = gmap_table_walk(sg, saddr, 1); /* get segment pointer */14181418+ if (table && !(*table & _SEGMENT_ENTRY_INVALID)) {14191419+ /* Shadow page tables are full pages (pte+pgste) */14201420+ page = pfn_to_page(*table >> PAGE_SHIFT);14211421+ pt_index = gmap_pgste_get_pgt_addr(page_to_virt(page));14221422+ *pgt = pt_index & ~GMAP_SHADOW_FAKE_TABLE;14231423+ *dat_protection = !!(*table & _SEGMENT_ENTRY_PROTECT);14241424+ *fake = !!(pt_index & GMAP_SHADOW_FAKE_TABLE);14251425+ rc = 0;14261426+ } else {14271427+ rc = -EAGAIN;14281428+ }14291429+ spin_unlock(&sg->guest_table_lock);14301430+ return rc;14311431+}14321432+14331433+/**13971434 * kvm_s390_shadow_fault - handle fault on a shadow page table13981435 * @vcpu: virtual cpu13991436 * @sg: pointer to the shadow guest address space structure···14541415 int dat_protection, fake;14551416 int rc;1456141714181418+ if 
(KVM_BUG_ON(!gmap_is_shadow(sg), vcpu->kvm))14191419+ return -EFAULT;14201420+14571421 mmap_read_lock(sg->mm);14581422 /*14591423 * We don't want any guest-2 tables to change - so the parent···14651423 */14661424 ipte_lock(vcpu->kvm);1467142514681468- rc = gmap_shadow_pgt_lookup(sg, saddr, &pgt, &dat_protection, &fake);14261426+ rc = shadow_pgt_lookup(sg, saddr, &pgt, &dat_protection, &fake);14691427 if (rc)14701428 rc = kvm_s390_shadow_tables(sg, saddr, &pgt, &dat_protection,14711429 &fake);
+142
arch/s390/kvm/gmap-vsie.c
···11+// SPDX-License-Identifier: GPL-2.022+/*33+ * Guest memory management for KVM/s390 nested VMs.44+ *55+ * Copyright IBM Corp. 2008, 2020, 202466+ *77+ * Author(s): Claudio Imbrenda <imbrenda@linux.ibm.com>88+ * Martin Schwidefsky <schwidefsky@de.ibm.com>99+ * David Hildenbrand <david@redhat.com>1010+ * Janosch Frank <frankja@linux.vnet.ibm.com>1111+ */1212+1313+#include <linux/compiler.h>1414+#include <linux/kvm.h>1515+#include <linux/kvm_host.h>1616+#include <linux/pgtable.h>1717+#include <linux/pagemap.h>1818+#include <linux/mman.h>1919+2020+#include <asm/lowcore.h>2121+#include <asm/gmap.h>2222+#include <asm/uv.h>2323+2424+#include "kvm-s390.h"2525+#include "gmap.h"2626+2727+/**2828+ * gmap_find_shadow - find a specific asce in the list of shadow tables2929+ * @parent: pointer to the parent gmap3030+ * @asce: ASCE for which the shadow table is created3131+ * @edat_level: edat level to be used for the shadow translation3232+ *3333+ * Returns the pointer to a gmap if a shadow table with the given asce is3434+ * already available, ERR_PTR(-EAGAIN) if another one is just being created,3535+ * otherwise NULL3636+ *3737+ * Context: Called with parent->shadow_lock held3838+ */3939+static struct gmap *gmap_find_shadow(struct gmap *parent, unsigned long asce, int edat_level)4040+{4141+ struct gmap *sg;4242+4343+ lockdep_assert_held(&parent->shadow_lock);4444+ list_for_each_entry(sg, &parent->children, list) {4545+ if (!gmap_shadow_valid(sg, asce, edat_level))4646+ continue;4747+ if (!sg->initialized)4848+ return ERR_PTR(-EAGAIN);4949+ refcount_inc(&sg->ref_count);5050+ return sg;5151+ }5252+ return NULL;5353+}5454+5555+/**5656+ * gmap_shadow - create/find a shadow guest address space5757+ * @parent: pointer to the parent gmap5858+ * @asce: ASCE for which the shadow table is created5959+ * @edat_level: edat level to be used for the shadow translation6060+ *6161+ * The pages of the top level page table referred by the asce parameter6262+ * will be set to read-only and 
marked in the PGSTEs of the kvm process.6363+ * The shadow table will be removed automatically on any change to the6464+ * PTE mapping for the source table.6565+ *6666+ * Returns a guest address space structure, ERR_PTR(-ENOMEM) if out of memory,6767+ * ERR_PTR(-EAGAIN) if the caller has to retry and ERR_PTR(-EFAULT) if the6868+ * parent gmap table could not be protected.6969+ */7070+struct gmap *gmap_shadow(struct gmap *parent, unsigned long asce, int edat_level)7171+{7272+ struct gmap *sg, *new;7373+ unsigned long limit;7474+ int rc;7575+7676+ if (KVM_BUG_ON(parent->mm->context.allow_gmap_hpage_1m, (struct kvm *)parent->private) ||7777+ KVM_BUG_ON(gmap_is_shadow(parent), (struct kvm *)parent->private))7878+ return ERR_PTR(-EFAULT);7979+ spin_lock(&parent->shadow_lock);8080+ sg = gmap_find_shadow(parent, asce, edat_level);8181+ spin_unlock(&parent->shadow_lock);8282+ if (sg)8383+ return sg;8484+ /* Create a new shadow gmap */8585+ limit = -1UL >> (33 - (((asce & _ASCE_TYPE_MASK) >> 2) * 11));8686+ if (asce & _ASCE_REAL_SPACE)8787+ limit = -1UL;8888+ new = gmap_alloc(limit);8989+ if (!new)9090+ return ERR_PTR(-ENOMEM);9191+ new->mm = parent->mm;9292+ new->parent = gmap_get(parent);9393+ new->private = parent->private;9494+ new->orig_asce = asce;9595+ new->edat_level = edat_level;9696+ new->initialized = false;9797+ spin_lock(&parent->shadow_lock);9898+ /* Recheck if another CPU created the same shadow */9999+ sg = gmap_find_shadow(parent, asce, edat_level);100100+ if (sg) {101101+ spin_unlock(&parent->shadow_lock);102102+ gmap_free(new);103103+ return sg;104104+ }105105+ if (asce & _ASCE_REAL_SPACE) {106106+ /* only allow one real-space gmap shadow */107107+ list_for_each_entry(sg, &parent->children, list) {108108+ if (sg->orig_asce & _ASCE_REAL_SPACE) {109109+ spin_lock(&sg->guest_table_lock);110110+ gmap_unshadow(sg);111111+ spin_unlock(&sg->guest_table_lock);112112+ list_del(&sg->list);113113+ gmap_put(sg);114114+ break;115115+ }116116+ }117117+ }118118+ 
refcount_set(&new->ref_count, 2);119119+ list_add(&new->list, &parent->children);120120+ if (asce & _ASCE_REAL_SPACE) {121121+ /* nothing to protect, return right away */122122+ new->initialized = true;123123+ spin_unlock(&parent->shadow_lock);124124+ return new;125125+ }126126+ spin_unlock(&parent->shadow_lock);127127+ /* protect after insertion, so it will get properly invalidated */128128+ mmap_read_lock(parent->mm);129129+ rc = __kvm_s390_mprotect_many(parent, asce & _ASCE_ORIGIN,130130+ ((asce & _ASCE_TABLE_LENGTH) + 1),131131+ PROT_READ, GMAP_NOTIFY_SHADOW);132132+ mmap_read_unlock(parent->mm);133133+ spin_lock(&parent->shadow_lock);134134+ new->initialized = true;135135+ if (rc) {136136+ list_del(&new->list);137137+ gmap_free(new);138138+ new = ERR_PTR(rc);139139+ }140140+ spin_unlock(&parent->shadow_lock);141141+ return new;142142+}
+212
arch/s390/kvm/gmap.c
···11+// SPDX-License-Identifier: GPL-2.022+/*33+ * Guest memory management for KVM/s39044+ *55+ * Copyright IBM Corp. 2008, 2020, 202466+ *77+ * Author(s): Claudio Imbrenda <imbrenda@linux.ibm.com>88+ * Martin Schwidefsky <schwidefsky@de.ibm.com>99+ * David Hildenbrand <david@redhat.com>1010+ * Janosch Frank <frankja@linux.vnet.ibm.com>1111+ */1212+1313+#include <linux/compiler.h>1414+#include <linux/kvm.h>1515+#include <linux/kvm_host.h>1616+#include <linux/pgtable.h>1717+#include <linux/pagemap.h>1818+1919+#include <asm/lowcore.h>2020+#include <asm/gmap.h>2121+#include <asm/uv.h>2222+2323+#include "gmap.h"2424+2525+/**2626+ * should_export_before_import - Determine whether an export is needed2727+ * before an import-like operation2828+ * @uvcb: the Ultravisor control block of the UVC to be performed2929+ * @mm: the mm of the process3030+ *3131+ * Returns whether an export is needed before every import-like operation.3232+ * This is needed for shared pages, which don't trigger a secure storage3333+ * exception when accessed from a different guest.3434+ *3535+ * Although considered as one, the Unpin Page UVC is not an actual import,3636+ * so it is not affected.3737+ *3838+ * No export is needed also when there is only one protected VM, because the3939+ * page cannot belong to the wrong VM in that case (there is no "other VM"4040+ * it can belong to).4141+ *4242+ * Return: true if an export is needed before every import, otherwise false.4343+ */4444+static bool should_export_before_import(struct uv_cb_header *uvcb, struct mm_struct *mm)4545+{4646+ /*4747+ * The misc feature indicates, among other things, that importing a4848+ * shared page from a different protected VM will automatically also4949+ * transfer its ownership.5050+ */5151+ if (uv_has_feature(BIT_UV_FEAT_MISC))5252+ return false;5353+ if (uvcb->cmd == UVC_CMD_UNPIN_PAGE_SHARED)5454+ return false;5555+ return atomic_read(&mm->context.protected_count) > 1;5656+}5757+5858+static int 
__gmap_make_secure(struct gmap *gmap, struct page *page, void *uvcb)5959+{6060+ struct folio *folio = page_folio(page);6161+ int rc;6262+6363+ /*6464+ * Secure pages cannot be huge and userspace should not combine both.6565+ * In case userspace does it anyway this will result in an -EFAULT for6666+ * the unpack. The guest thus never reaches secure mode.6767+ * If userspace plays dirty tricks and decides to map huge pages at a6868+ * later point in time, it will receive a segmentation fault or6969+ * KVM_RUN will return -EFAULT.7070+ */7171+ if (folio_test_hugetlb(folio))7272+ return -EFAULT;7373+ if (folio_test_large(folio)) {7474+ mmap_read_unlock(gmap->mm);7575+ rc = kvm_s390_wiggle_split_folio(gmap->mm, folio, true);7676+ mmap_read_lock(gmap->mm);7777+ if (rc)7878+ return rc;7979+ folio = page_folio(page);8080+ }8181+8282+ if (!folio_trylock(folio))8383+ return -EAGAIN;8484+ if (should_export_before_import(uvcb, gmap->mm))8585+ uv_convert_from_secure(folio_to_phys(folio));8686+ rc = make_folio_secure(folio, uvcb);8787+ folio_unlock(folio);8888+8989+ /*9090+ * In theory a race is possible and the folio might have become9191+ * large again before the folio_trylock() above. 
In that case, no9292+ * action is performed and -EAGAIN is returned; the callers will9393+ * have to try again later.9494+ * In most cases this implies running the VM again, getting the same9595+ * exception again, and making another attempt in this function.9696+ * This is expected to happen extremely rarely.9797+ */9898+ if (rc == -E2BIG)9999+ return -EAGAIN;100100+ /* The folio has too many references, try to shake some off */101101+ if (rc == -EBUSY) {102102+ mmap_read_unlock(gmap->mm);103103+ kvm_s390_wiggle_split_folio(gmap->mm, folio, false);104104+ mmap_read_lock(gmap->mm);105105+ return -EAGAIN;106106+ }107107+108108+ return rc;109109+}110110+111111+/**112112+ * gmap_make_secure() - make one guest page secure113113+ * @gmap: the guest gmap114114+ * @gaddr: the guest address that needs to be made secure115115+ * @uvcb: the UVCB specifying which operation needs to be performed116116+ *117117+ * Context: needs to be called with kvm->srcu held.118118+ * Return: 0 on success, < 0 in case of error (see __gmap_make_secure()).119119+ */120120+int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)121121+{122122+ struct kvm *kvm = gmap->private;123123+ struct page *page;124124+ int rc = 0;125125+126126+ lockdep_assert_held(&kvm->srcu);127127+128128+ page = gfn_to_page(kvm, gpa_to_gfn(gaddr));129129+ mmap_read_lock(gmap->mm);130130+ if (page)131131+ rc = __gmap_make_secure(gmap, page, uvcb);132132+ kvm_release_page_clean(page);133133+ mmap_read_unlock(gmap->mm);134134+135135+ return rc;136136+}137137+138138+int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr)139139+{140140+ struct uv_cb_cts uvcb = {141141+ .header.cmd = UVC_CMD_CONV_TO_SEC_STOR,142142+ .header.len = sizeof(uvcb),143143+ .guest_handle = gmap->guest_handle,144144+ .gaddr = gaddr,145145+ };146146+147147+ return gmap_make_secure(gmap, gaddr, &uvcb);148148+}149149+150150+/**151151+ * __gmap_destroy_page() - Destroy a guest page.152152+ * @gmap: the gmap of the guest153153+ * 
@page: the page to destroy154154+ *155155+ * An attempt will be made to destroy the given guest page. If the attempt156156+ * fails, an attempt is made to export the page. If both attempts fail, an157157+ * appropriate error is returned.158158+ *159159+ * Context: must be called holding the mm lock for gmap->mm160160+ */161161+static int __gmap_destroy_page(struct gmap *gmap, struct page *page)162162+{163163+ struct folio *folio = page_folio(page);164164+ int rc;165165+166166+ /*167167+ * See gmap_make_secure(): large folios cannot be secure. Small168168+ * folio implies FW_LEVEL_PTE.169169+ */170170+ if (folio_test_large(folio))171171+ return -EFAULT;172172+173173+ rc = uv_destroy_folio(folio);174174+ /*175175+ * Fault handlers can race; it is possible that two CPUs will fault176176+ * on the same secure page. One CPU can destroy the page, reboot,177177+ * re-enter secure mode and import it, while the second CPU was178178+ * stuck at the beginning of the handler. At some point the second179179+ * CPU will be able to progress, and it will not be able to destroy180180+ * the page. In that case we do not want to terminate the process,181181+ * we instead try to export the page.182182+ */183183+ if (rc)184184+ rc = uv_convert_from_secure_folio(folio);185185+186186+ return rc;187187+}188188+189189+/**190190+ * gmap_destroy_page() - Destroy a guest page.191191+ * @gmap: the gmap of the guest192192+ * @gaddr: the guest address to destroy193193+ *194194+ * An attempt will be made to destroy the given guest page. If the attempt195195+ * fails, an attempt is made to export the page. 
If both attempts fail, an196196+ * appropriate error is returned.197197+ *198198+ * Context: may sleep.199199+ */200200+int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr)201201+{202202+ struct page *page;203203+ int rc = 0;204204+205205+ mmap_read_lock(gmap->mm);206206+ page = gfn_to_page(gmap->private, gpa_to_gfn(gaddr));207207+ if (page)208208+ rc = __gmap_destroy_page(gmap, page);209209+ kvm_release_page_clean(page);210210+ mmap_read_unlock(gmap->mm);211211+ return rc;212212+}
+39
arch/s390/kvm/gmap.h
···11+/* SPDX-License-Identifier: GPL-2.0 */22+/*33+ * KVM guest address space mapping code44+ *55+ * Copyright IBM Corp. 2007, 2016, 202566+ * Author(s): Martin Schwidefsky <schwidefsky@de.ibm.com>77+ * Claudio Imbrenda <imbrenda@linux.ibm.com>88+ */99+1010+#ifndef ARCH_KVM_S390_GMAP_H1111+#define ARCH_KVM_S390_GMAP_H1212+1313+#define GMAP_SHADOW_FAKE_TABLE 1ULL1414+1515+int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb);1616+int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr);1717+int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr);1818+struct gmap *gmap_shadow(struct gmap *parent, unsigned long asce, int edat_level);1919+2020+/**2121+ * gmap_shadow_valid - check if a shadow guest address space matches the2222+ * given properties and is still valid2323+ * @sg: pointer to the shadow guest address space structure2424+ * @asce: ASCE for which the shadow table is requested2525+ * @edat_level: edat level to be used for the shadow translation2626+ *2727+ * Returns 1 if the gmap shadow is still valid and matches the given2828+ * properties; the caller can continue using it. Returns 0 otherwise; the2929+ * caller has to request a new shadow gmap in this case.3030+ *3131+ */3232+static inline int gmap_shadow_valid(struct gmap *sg, unsigned long asce, int edat_level)3333+{3434+ if (sg->removed)3535+ return 0;3636+ return sg->orig_asce == asce && sg->edat_level == edat_level;3737+}3838+3939+#endif
+4-3
arch/s390/kvm/intercept.c
···2121#include "gaccess.h"2222#include "trace.h"2323#include "trace-s390.h"2424+#include "gmap.h"24252526u8 kvm_s390_get_ilen(struct kvm_vcpu *vcpu)2627{···368367 reg2, &srcaddr, GACC_FETCH, 0);369368 if (rc)370369 return kvm_s390_inject_prog_cond(vcpu, rc);371371- rc = gmap_fault(vcpu->arch.gmap, srcaddr, 0);370370+ rc = kvm_s390_handle_dat_fault(vcpu, srcaddr, 0);372371 if (rc != 0)373372 return rc;374373···377376 reg1, &dstaddr, GACC_STORE, 0);378377 if (rc)379378 return kvm_s390_inject_prog_cond(vcpu, rc);380380- rc = gmap_fault(vcpu->arch.gmap, dstaddr, FAULT_FLAG_WRITE);379379+ rc = kvm_s390_handle_dat_fault(vcpu, dstaddr, FOLL_WRITE);381380 if (rc != 0)382381 return rc;383382···550549 * If the unpin did not succeed, the guest will exit again for the UVC551550 * and we will retry the unpin.552551 */553553- if (rc == -EINVAL)552552+ if (rc == -EINVAL || rc == -ENXIO)554553 return 0;555554 /*556555 * If we got -EAGAIN here, we simply return it. It will eventually
+11-8
arch/s390/kvm/interrupt.c
···28932893 struct kvm_kernel_irq_routing_entry *e,28942894 const struct kvm_irq_routing_entry *ue)28952895{28962896- u64 uaddr;28962896+ u64 uaddr_s, uaddr_i;28972897+ int idx;2897289828982899 switch (ue->type) {28992900 /* we store the userspace addresses instead of the guest addresses */···29022901 if (kvm_is_ucontrol(kvm))29032902 return -EINVAL;29042903 e->set = set_adapter_int;29052905- uaddr = gmap_translate(kvm->arch.gmap, ue->u.adapter.summary_addr);29062906- if (uaddr == -EFAULT)29042904+29052905+ idx = srcu_read_lock(&kvm->srcu);29062906+ uaddr_s = gpa_to_hva(kvm, ue->u.adapter.summary_addr);29072907+ uaddr_i = gpa_to_hva(kvm, ue->u.adapter.ind_addr);29082908+ srcu_read_unlock(&kvm->srcu, idx);29092909+29102910+ if (kvm_is_error_hva(uaddr_s) || kvm_is_error_hva(uaddr_i))29072911 return -EFAULT;29082908- e->adapter.summary_addr = uaddr;29092909- uaddr = gmap_translate(kvm->arch.gmap, ue->u.adapter.ind_addr);29102910- if (uaddr == -EFAULT)29112911- return -EFAULT;29122912- e->adapter.ind_addr = uaddr;29122912+ e->adapter.summary_addr = uaddr_s;29132913+ e->adapter.ind_addr = uaddr_i;29132914 e->adapter.summary_offset = ue->u.adapter.summary_offset;29142915 e->adapter.ind_offset = ue->u.adapter.ind_offset;29152916 e->adapter.adapter_id = ue->u.adapter.adapter_id;
+197-40
arch/s390/kvm/kvm-s390.c
···5050#include "kvm-s390.h"5151#include "gaccess.h"5252#include "pci.h"5353+#include "gmap.h"53545455#define CREATE_TRACE_POINTS5556#include "trace.h"···34293428 VM_EVENT(kvm, 3, "vm created with type %lu", type);3430342934313430 if (type & KVM_VM_S390_UCONTROL) {34313431+ struct kvm_userspace_memory_region2 fake_memslot = {34323432+ .slot = KVM_S390_UCONTROL_MEMSLOT,34333433+ .guest_phys_addr = 0,34343434+ .userspace_addr = 0,34353435+ .memory_size = ALIGN_DOWN(TASK_SIZE, _SEGMENT_SIZE),34363436+ .flags = 0,34373437+ };34383438+34323439 kvm->arch.gmap = NULL;34333440 kvm->arch.mem_limit = KVM_S390_NO_MEM_LIMIT;34413441+ /* one flat fake memslot covering the whole address-space */34423442+ mutex_lock(&kvm->slots_lock);34433443+ KVM_BUG_ON(kvm_set_internal_memslot(kvm, &fake_memslot), kvm);34443444+ mutex_unlock(&kvm->slots_lock);34343445 } else {34353446 if (sclp.hamax == U64_MAX)34363447 kvm->arch.mem_limit = TASK_SIZE_MAX;···45114498 return kvm_s390_test_cpuflags(vcpu, CPUSTAT_IBS);45124499}4513450045014501+static int __kvm_s390_fixup_fault_sync(struct gmap *gmap, gpa_t gaddr, unsigned int flags)45024502+{45034503+ struct kvm *kvm = gmap->private;45044504+ gfn_t gfn = gpa_to_gfn(gaddr);45054505+ bool unlocked;45064506+ hva_t vmaddr;45074507+ gpa_t tmp;45084508+ int rc;45094509+45104510+ if (kvm_is_ucontrol(kvm)) {45114511+ tmp = __gmap_translate(gmap, gaddr);45124512+ gfn = gpa_to_gfn(tmp);45134513+ }45144514+45154515+ vmaddr = gfn_to_hva(kvm, gfn);45164516+ rc = fixup_user_fault(gmap->mm, vmaddr, FAULT_FLAG_WRITE, &unlocked);45174517+ if (!rc)45184518+ rc = __gmap_link(gmap, gaddr, vmaddr);45194519+ return rc;45204520+}45214521+45224522+/**45234523+ * __kvm_s390_mprotect_many() - Apply specified protection to guest pages45244524+ * @gmap: the gmap of the guest45254525+ * @gpa: the starting guest address45264526+ * @npages: how many pages to protect45274527+ * @prot: indicates access rights: PROT_NONE, PROT_READ or PROT_WRITE45284528+ * @bits: pgste notification 
bits to set45294529+ *45304530+ * Returns: 0 in case of success, < 0 in case of error - see gmap_protect_one()45314531+ *45324532+ * Context: kvm->srcu and gmap->mm need to be held in read mode45334533+ */45344534+int __kvm_s390_mprotect_many(struct gmap *gmap, gpa_t gpa, u8 npages, unsigned int prot,45354535+ unsigned long bits)45364536+{45374537+ unsigned int fault_flag = (prot & PROT_WRITE) ? FAULT_FLAG_WRITE : 0;45384538+ gpa_t end = gpa + npages * PAGE_SIZE;45394539+ int rc;45404540+45414541+ for (; gpa < end; gpa = ALIGN(gpa + 1, rc)) {45424542+ rc = gmap_protect_one(gmap, gpa, prot, bits);45434543+ if (rc == -EAGAIN) {45444544+ __kvm_s390_fixup_fault_sync(gmap, gpa, fault_flag);45454545+ rc = gmap_protect_one(gmap, gpa, prot, bits);45464546+ }45474547+ if (rc < 0)45484548+ return rc;45494549+ }45504550+45514551+ return 0;45524552+}45534553+45544554+static int kvm_s390_mprotect_notify_prefix(struct kvm_vcpu *vcpu)45554555+{45564556+ gpa_t gaddr = kvm_s390_get_prefix(vcpu);45574557+ int idx, rc;45584558+45594559+ idx = srcu_read_lock(&vcpu->kvm->srcu);45604560+ mmap_read_lock(vcpu->arch.gmap->mm);45614561+45624562+ rc = __kvm_s390_mprotect_many(vcpu->arch.gmap, gaddr, 2, PROT_WRITE, GMAP_NOTIFY_MPROT);45634563+45644564+ mmap_read_unlock(vcpu->arch.gmap->mm);45654565+ srcu_read_unlock(&vcpu->kvm->srcu, idx);45664566+45674567+ return rc;45684568+}45694569+45144570static int kvm_s390_handle_requests(struct kvm_vcpu *vcpu)45154571{45164572retry:···45954513 */45964514 if (kvm_check_request(KVM_REQ_REFRESH_GUEST_PREFIX, vcpu)) {45974515 int rc;45984598- rc = gmap_mprotect_notify(vcpu->arch.gmap,45994599- kvm_s390_get_prefix(vcpu),46004600- PAGE_SIZE * 2, PROT_WRITE);45164516+45174517+ rc = kvm_s390_mprotect_notify_prefix(vcpu);46014518 if (rc) {46024519 kvm_make_request(KVM_REQ_REFRESH_GUEST_PREFIX, vcpu);46034520 return rc;···48474766 return kvm_s390_inject_prog_irq(vcpu, &pgm_info);48484767}4849476847694769+static void kvm_s390_assert_primary_as(struct kvm_vcpu 
*vcpu)47704770+{47714771+ KVM_BUG(current->thread.gmap_teid.as != PSW_BITS_AS_PRIMARY, vcpu->kvm,47724772+ "Unexpected program interrupt 0x%x, TEID 0x%016lx",47734773+ current->thread.gmap_int_code, current->thread.gmap_teid.val);47744774+}47754775+47764776+/*47774777+ * __kvm_s390_handle_dat_fault() - handle a dat fault for the gmap of a vcpu47784778+ * @vcpu: the vCPU whose gmap is to be fixed up47794779+ * @gfn: the guest frame number used for memslots (including fake memslots)47804780+ * @gaddr: the gmap address, does not have to match @gfn for ucontrol gmaps47814781+ * @flags: FOLL_* flags47824782+ *47834783+ * Return: 0 on success, < 0 in case of error.47844784+ * Context: The mm lock must not be held before calling. May sleep.47854785+ */47864786+int __kvm_s390_handle_dat_fault(struct kvm_vcpu *vcpu, gfn_t gfn, gpa_t gaddr, unsigned int flags)47874787+{47884788+ struct kvm_memory_slot *slot;47894789+ unsigned int fault_flags;47904790+ bool writable, unlocked;47914791+ unsigned long vmaddr;47924792+ struct page *page;47934793+ kvm_pfn_t pfn;47944794+ int rc;47954795+47964796+ slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);47974797+ if (!slot || slot->flags & KVM_MEMSLOT_INVALID)47984798+ return vcpu_post_run_addressing_exception(vcpu);47994799+48004800+ fault_flags = flags & FOLL_WRITE ? 
FAULT_FLAG_WRITE : 0;48014801+ if (vcpu->arch.gmap->pfault_enabled)48024802+ flags |= FOLL_NOWAIT;48034803+ vmaddr = __gfn_to_hva_memslot(slot, gfn);48044804+48054805+try_again:48064806+ pfn = __kvm_faultin_pfn(slot, gfn, flags, &writable, &page);48074807+48084808+ /* Access outside memory, inject addressing exception */48094809+ if (is_noslot_pfn(pfn))48104810+ return vcpu_post_run_addressing_exception(vcpu);48114811+ /* Signal pending: try again */48124812+ if (pfn == KVM_PFN_ERR_SIGPENDING)48134813+ return -EAGAIN;48144814+48154815+ /* Needs I/O, try to setup async pfault (only possible with FOLL_NOWAIT) */48164816+ if (pfn == KVM_PFN_ERR_NEEDS_IO) {48174817+ trace_kvm_s390_major_guest_pfault(vcpu);48184818+ if (kvm_arch_setup_async_pf(vcpu))48194819+ return 0;48204820+ vcpu->stat.pfault_sync++;48214821+ /* Could not setup async pfault, try again synchronously */48224822+ flags &= ~FOLL_NOWAIT;48234823+ goto try_again;48244824+ }48254825+ /* Any other error */48264826+ if (is_error_pfn(pfn))48274827+ return -EFAULT;48284828+48294829+ /* Success */48304830+ mmap_read_lock(vcpu->arch.gmap->mm);48314831+ /* Mark the userspace PTEs as young and/or dirty, to avoid page fault loops */48324832+ rc = fixup_user_fault(vcpu->arch.gmap->mm, vmaddr, fault_flags, &unlocked);48334833+ if (!rc)48344834+ rc = __gmap_link(vcpu->arch.gmap, gaddr, vmaddr);48354835+ scoped_guard(spinlock, &vcpu->kvm->mmu_lock) {48364836+ kvm_release_faultin_page(vcpu->kvm, page, false, writable);48374837+ }48384838+ mmap_read_unlock(vcpu->arch.gmap->mm);48394839+ return rc;48404840+}48414841+48424842+static int vcpu_dat_fault_handler(struct kvm_vcpu *vcpu, unsigned long gaddr, unsigned int flags)48434843+{48444844+ unsigned long gaddr_tmp;48454845+ gfn_t gfn;48464846+48474847+ gfn = gpa_to_gfn(gaddr);48484848+ if (kvm_is_ucontrol(vcpu->kvm)) {48494849+ /*48504850+ * This translates the per-vCPU guest address into a48514851+ * fake guest address, which can then be used with the48524852+ * fake 
memslots that are identity mapping userspace.48534853+ * This allows ucontrol VMs to use the normal fault48544854+ * resolution path, like normal VMs.48554855+ */48564856+ mmap_read_lock(vcpu->arch.gmap->mm);48574857+ gaddr_tmp = __gmap_translate(vcpu->arch.gmap, gaddr);48584858+ mmap_read_unlock(vcpu->arch.gmap->mm);48594859+ if (gaddr_tmp == -EFAULT) {48604860+ vcpu->run->exit_reason = KVM_EXIT_S390_UCONTROL;48614861+ vcpu->run->s390_ucontrol.trans_exc_code = gaddr;48624862+ vcpu->run->s390_ucontrol.pgm_code = PGM_SEGMENT_TRANSLATION;48634863+ return -EREMOTE;48644864+ }48654865+ gfn = gpa_to_gfn(gaddr_tmp);48664866+ }48674867+ return __kvm_s390_handle_dat_fault(vcpu, gfn, gaddr, flags);48684868+}48694869+48504870static int vcpu_post_run_handle_fault(struct kvm_vcpu *vcpu)48514871{48524872 unsigned int flags = 0;48534873 unsigned long gaddr;48544854- int rc = 0;4855487448564875 gaddr = current->thread.gmap_teid.addr * PAGE_SIZE;48574876 if (kvm_s390_cur_gmap_fault_is_write())···49624781 vcpu->stat.exit_null++;49634782 break;49644783 case PGM_NON_SECURE_STORAGE_ACCESS:49654965- KVM_BUG(current->thread.gmap_teid.as != PSW_BITS_AS_PRIMARY, vcpu->kvm,49664966- "Unexpected program interrupt 0x%x, TEID 0x%016lx",49674967- current->thread.gmap_int_code, current->thread.gmap_teid.val);47844784+ kvm_s390_assert_primary_as(vcpu);49684785 /*49694786 * This is normal operation; a page belonging to a protected49704787 * guest has not been imported yet. 
Try to import the page into···49734794 break;49744795 case PGM_SECURE_STORAGE_ACCESS:49754796 case PGM_SECURE_STORAGE_VIOLATION:49764976- KVM_BUG(current->thread.gmap_teid.as != PSW_BITS_AS_PRIMARY, vcpu->kvm,49774977- "Unexpected program interrupt 0x%x, TEID 0x%016lx",49784978- current->thread.gmap_int_code, current->thread.gmap_teid.val);47974797+ kvm_s390_assert_primary_as(vcpu);49794798 /*49804799 * This can happen after a reboot with asynchronous teardown;49814800 * the new guest (normal or protected) will run on top of the···50024825 case PGM_REGION_FIRST_TRANS:50034826 case PGM_REGION_SECOND_TRANS:50044827 case PGM_REGION_THIRD_TRANS:50055005- KVM_BUG(current->thread.gmap_teid.as != PSW_BITS_AS_PRIMARY, vcpu->kvm,50065006- "Unexpected program interrupt 0x%x, TEID 0x%016lx",50075007- current->thread.gmap_int_code, current->thread.gmap_teid.val);50085008- if (vcpu->arch.gmap->pfault_enabled) {50095009- rc = gmap_fault(vcpu->arch.gmap, gaddr, flags | FAULT_FLAG_RETRY_NOWAIT);50105010- if (rc == -EFAULT)50115011- return vcpu_post_run_addressing_exception(vcpu);50125012- if (rc == -EAGAIN) {50135013- trace_kvm_s390_major_guest_pfault(vcpu);50145014- if (kvm_arch_setup_async_pf(vcpu))50155015- return 0;50165016- vcpu->stat.pfault_sync++;50175017- } else {50185018- return rc;50195019- }50205020- }50215021- rc = gmap_fault(vcpu->arch.gmap, gaddr, flags);50225022- if (rc == -EFAULT) {50235023- if (kvm_is_ucontrol(vcpu->kvm)) {50245024- vcpu->run->exit_reason = KVM_EXIT_S390_UCONTROL;50255025- vcpu->run->s390_ucontrol.trans_exc_code = gaddr;50265026- vcpu->run->s390_ucontrol.pgm_code = 0x10;50275027- return -EREMOTE;50285028- }50295029- return vcpu_post_run_addressing_exception(vcpu);50305030- }50315031- break;48284828+ kvm_s390_assert_primary_as(vcpu);48294829+ return vcpu_dat_fault_handler(vcpu, gaddr, flags);50324830 default:50334831 KVM_BUG(1, vcpu->kvm, "Unexpected program interrupt 0x%x, TEID 0x%016lx",50344832 current->thread.gmap_int_code, 
current->thread.gmap_teid.val);50354833 send_sig(SIGSEGV, current, 0);50364834 break;50374835 }50385038- return rc;48364836+ return 0;50394837}5040483850414839static int vcpu_post_run(struct kvm_vcpu *vcpu, int exit_reason)···58895737 }58905738#endif58915739 case KVM_S390_VCPU_FAULT: {58925892- r = gmap_fault(vcpu->arch.gmap, arg, 0);57405740+ idx = srcu_read_lock(&vcpu->kvm->srcu);57415741+ r = vcpu_dat_fault_handler(vcpu, arg, 0);57425742+ srcu_read_unlock(&vcpu->kvm->srcu, idx);58935743 break;58945744 }58955745 case KVM_ENABLE_CAP:···60075853{60085854 gpa_t size;6009585560106010- if (kvm_is_ucontrol(kvm))58565856+ if (kvm_is_ucontrol(kvm) && new->id < KVM_USER_MEM_SLOTS)60115857 return -EINVAL;6012585860135859 /* When we are protected, we should not change the memory slots */···60585904 enum kvm_mr_change change)60595905{60605906 int rc = 0;59075907+59085908+ if (kvm_is_ucontrol(kvm))59095909+ return;6061591060625911 switch (change) {60635912 case KVM_MR_DELETE:
arch/s390/kvm/pv.c
···1717#include <linux/sched/mm.h>1818#include <linux/mmu_notifier.h>1919#include "kvm-s390.h"2020+#include "gmap.h"20212122bool kvm_s390_pv_is_protected(struct kvm *kvm)2223{···639638 .tweak[1] = offset,640639 };641640 int ret = gmap_make_secure(kvm->arch.gmap, addr, &uvcb);641641+ unsigned long vmaddr;642642+ bool unlocked;642643643644 *rc = uvcb.header.rc;644645 *rrc = uvcb.header.rrc;646646+647647+ if (ret == -ENXIO) {648648+ mmap_read_lock(kvm->mm);649649+ vmaddr = gfn_to_hva(kvm, gpa_to_gfn(addr));650650+ if (kvm_is_error_hva(vmaddr)) {651651+ ret = -EFAULT;652652+ } else {653653+ ret = fixup_user_fault(kvm->mm, vmaddr, FAULT_FLAG_WRITE, &unlocked);654654+ if (!ret)655655+ ret = __gmap_link(kvm->arch.gmap, addr, vmaddr);656656+ }657657+ mmap_read_unlock(kvm->mm);658658+ if (!ret)659659+ return -EAGAIN;660660+ return ret;661661+ }645662646663 if (ret && ret != -EAGAIN)647664 KVM_UV_EVENT(kvm, 3, "PROTVIRT VM UNPACK: failed addr %llx with rc %x rrc %x",···678659679660 KVM_UV_EVENT(kvm, 3, "PROTVIRT VM UNPACK: start addr %lx size %lx",680661 addr, size);662662+663663+ guard(srcu)(&kvm->srcu);681664682665 while (offset < size) {683666 ret = unpack_one(kvm, addr, tweak, offset, rc, rrc);
+68-38
arch/s390/kvm/vsie.c
···1313#include <linux/bitmap.h>1414#include <linux/sched/signal.h>1515#include <linux/io.h>1616+#include <linux/mman.h>16171718#include <asm/gmap.h>1819#include <asm/mmu_context.h>···2322#include <asm/facility.h>2423#include "kvm-s390.h"2524#include "gaccess.h"2525+#include "gmap.h"2626+2727+enum vsie_page_flags {2828+ VSIE_PAGE_IN_USE = 0,2929+};26302731struct vsie_page {2832 struct kvm_s390_sie_block scb_s; /* 0x0000 */···5246 gpa_t gvrd_gpa; /* 0x0240 */5347 gpa_t riccbd_gpa; /* 0x0248 */5448 gpa_t sdnx_gpa; /* 0x0250 */5555- __u8 reserved[0x0700 - 0x0258]; /* 0x0258 */4949+ /*5050+ * guest address of the original SCB. Remains set for free vsie5151+ * pages, so we can properly look them up in our addr_to_page5252+ * radix tree.5353+ */5454+ gpa_t scb_gpa; /* 0x0258 */5555+ /*5656+ * Flags: must be set/cleared atomically after the vsie page can be5757+ * looked up by other CPUs.5858+ */5959+ unsigned long flags; /* 0x0260 */6060+ __u8 reserved[0x0700 - 0x0268]; /* 0x0268 */5661 struct kvm_s390_crypto_cb crycb; /* 0x0700 */5762 __u8 fac[S390_ARCH_FAC_LIST_SIZE_BYTE]; /* 0x0800 */5863};···601584 struct kvm *kvm = gmap->private;602585 struct vsie_page *cur;603586 unsigned long prefix;604604- struct page *page;605587 int i;606588607589 if (!gmap_is_shadow(gmap))···610594 * therefore we can safely reference them all the time.611595 */612596 for (i = 0; i < kvm->arch.vsie.page_count; i++) {613613- page = READ_ONCE(kvm->arch.vsie.pages[i]);614614- if (!page)597597+ cur = READ_ONCE(kvm->arch.vsie.pages[i]);598598+ if (!cur)615599 continue;616616- cur = page_to_virt(page);617600 if (READ_ONCE(cur->gmap) != gmap)618601 continue;619602 prefix = cur->scb_s.prefix << GUEST_PREFIX_SHIFT;···13601345 return rc;13611346}1362134713481348+/* Try getting a given vsie page, returning "true" on success. 
*/13491349+static inline bool try_get_vsie_page(struct vsie_page *vsie_page)13501350+{13511351+ if (test_bit(VSIE_PAGE_IN_USE, &vsie_page->flags))13521352+ return false;13531353+ return !test_and_set_bit(VSIE_PAGE_IN_USE, &vsie_page->flags);13541354+}13551355+13561356+/* Put a vsie page acquired through get_vsie_page / try_get_vsie_page. */13571357+static void put_vsie_page(struct vsie_page *vsie_page)13581358+{13591359+ clear_bit(VSIE_PAGE_IN_USE, &vsie_page->flags);13601360+}13611361+13631362/*13641363 * Get or create a vsie page for a scb address.13651364 *···13841355static struct vsie_page *get_vsie_page(struct kvm *kvm, unsigned long addr)13851356{13861357 struct vsie_page *vsie_page;13871387- struct page *page;13881358 int nr_vcpus;1389135913901360 rcu_read_lock();13911391- page = radix_tree_lookup(&kvm->arch.vsie.addr_to_page, addr >> 9);13611361+ vsie_page = radix_tree_lookup(&kvm->arch.vsie.addr_to_page, addr >> 9);13921362 rcu_read_unlock();13931393- if (page) {13941394- if (page_ref_inc_return(page) == 2)13951395- return page_to_virt(page);13961396- page_ref_dec(page);13631363+ if (vsie_page) {13641364+ if (try_get_vsie_page(vsie_page)) {13651365+ if (vsie_page->scb_gpa == addr)13661366+ return vsie_page;13671367+ /*13681368+ * We raced with someone reusing + putting this vsie13691369+ * page before we grabbed it.13701370+ */13711371+ put_vsie_page(vsie_page);13721372+ }13971373 }1398137413991375 /*···1409137514101376 mutex_lock(&kvm->arch.vsie.mutex);14111377 if (kvm->arch.vsie.page_count < nr_vcpus) {14121412- page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO | GFP_DMA);14131413- if (!page) {13781378+ vsie_page = (void *)__get_free_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO | GFP_DMA);13791379+ if (!vsie_page) {14141380 mutex_unlock(&kvm->arch.vsie.mutex);14151381 return ERR_PTR(-ENOMEM);14161382 }14171417- page_ref_inc(page);14181418- kvm->arch.vsie.pages[kvm->arch.vsie.page_count] = page;13831383+ __set_bit(VSIE_PAGE_IN_USE, &vsie_page->flags);13841384+ 
kvm->arch.vsie.pages[kvm->arch.vsie.page_count] = vsie_page;14191385 kvm->arch.vsie.page_count++;14201386 } else {14211387 /* reuse an existing entry that belongs to nobody */14221388 while (true) {14231423- page = kvm->arch.vsie.pages[kvm->arch.vsie.next];14241424- if (page_ref_inc_return(page) == 2)13891389+ vsie_page = kvm->arch.vsie.pages[kvm->arch.vsie.next];13901390+ if (try_get_vsie_page(vsie_page))14251391 break;14261426- page_ref_dec(page);14271392 kvm->arch.vsie.next++;14281393 kvm->arch.vsie.next %= nr_vcpus;14291394 }14301430- radix_tree_delete(&kvm->arch.vsie.addr_to_page, page->index >> 9);13951395+ if (vsie_page->scb_gpa != ULONG_MAX)13961396+ radix_tree_delete(&kvm->arch.vsie.addr_to_page,13971397+ vsie_page->scb_gpa >> 9);14311398 }14321432- page->index = addr;14331433- /* double use of the same address */14341434- if (radix_tree_insert(&kvm->arch.vsie.addr_to_page, addr >> 9, page)) {14351435- page_ref_dec(page);13991399+ /* Mark it as invalid until it resides in the tree. */14001400+ vsie_page->scb_gpa = ULONG_MAX;14011401+14021402+ /* Double use of the same address or allocation failure. 
*/14031403+ if (radix_tree_insert(&kvm->arch.vsie.addr_to_page, addr >> 9,14041404+ vsie_page)) {14051405+ put_vsie_page(vsie_page);14361406 mutex_unlock(&kvm->arch.vsie.mutex);14371407 return NULL;14381408 }14091409+ vsie_page->scb_gpa = addr;14391410 mutex_unlock(&kvm->arch.vsie.mutex);1440141114411441- vsie_page = page_to_virt(page);14421412 memset(&vsie_page->scb_s, 0, sizeof(struct kvm_s390_sie_block));14431413 release_gmap_shadow(vsie_page);14441414 vsie_page->fault_addr = 0;14451415 vsie_page->scb_s.ihcpu = 0xffffU;14461416 return vsie_page;14471447-}14481448-14491449-/* put a vsie page acquired via get_vsie_page */14501450-static void put_vsie_page(struct kvm *kvm, struct vsie_page *vsie_page)14511451-{14521452- struct page *page = pfn_to_page(__pa(vsie_page) >> PAGE_SHIFT);14531453-14541454- page_ref_dec(page);14551417}1456141814571419int kvm_s390_handle_vsie(struct kvm_vcpu *vcpu)···15001470out_unpin_scb:15011471 unpin_scb(vcpu, vsie_page, scb_addr);15021472out_put:15031503- put_vsie_page(vcpu->kvm, vsie_page);14731473+ put_vsie_page(vsie_page);1504147415051475 return rc < 0 ? 
rc : 0;15061476}···15161486void kvm_s390_vsie_destroy(struct kvm *kvm)15171487{15181488 struct vsie_page *vsie_page;15191519- struct page *page;15201489 int i;1521149015221491 mutex_lock(&kvm->arch.vsie.mutex);15231492 for (i = 0; i < kvm->arch.vsie.page_count; i++) {15241524- page = kvm->arch.vsie.pages[i];14931493+ vsie_page = kvm->arch.vsie.pages[i];15251494 kvm->arch.vsie.pages[i] = NULL;15261526- vsie_page = page_to_virt(page);15271495 release_gmap_shadow(vsie_page);15281496 /* free the radix tree entry */15291529- radix_tree_delete(&kvm->arch.vsie.addr_to_page, page->index >> 9);15301530- __free_page(page);14971497+ if (vsie_page->scb_gpa != ULONG_MAX)14981498+ radix_tree_delete(&kvm->arch.vsie.addr_to_page,14991499+ vsie_page->scb_gpa >> 9);15001500+ free_page((unsigned long)vsie_page);15311501 }15321502 kvm->arch.vsie.page_count = 0;15331503 mutex_unlock(&kvm->arch.vsie.mutex);
+150-531
arch/s390/mm/gmap.c
···2424#include <asm/page.h>2525#include <asm/tlb.h>26262727+/*2828+ * The address is saved in a radix tree directly; NULL would be ambiguous,2929+ * since 0 is a valid address, and NULL is returned when nothing was found.3030+ * The lower bits are ignored by all users of the macro, so it can be used3131+ * to distinguish a valid address 0 from a NULL.3232+ */3333+#define VALID_GADDR_FLAG 13434+#define IS_GADDR_VALID(gaddr) ((gaddr) & VALID_GADDR_FLAG)3535+#define MAKE_VALID_GADDR(gaddr) (((gaddr) & HPAGE_MASK) | VALID_GADDR_FLAG)3636+2737#define GMAP_SHADOW_FAKE_TABLE 1ULL28382939static struct page *gmap_alloc_crst(void)···5343 *5444 * Returns a guest address space structure.5545 */5656-static struct gmap *gmap_alloc(unsigned long limit)4646+struct gmap *gmap_alloc(unsigned long limit)5747{5848 struct gmap *gmap;5949 struct page *page;···8070 gmap = kzalloc(sizeof(struct gmap), GFP_KERNEL_ACCOUNT);8171 if (!gmap)8272 goto out;8383- INIT_LIST_HEAD(&gmap->crst_list);8473 INIT_LIST_HEAD(&gmap->children);8585- INIT_LIST_HEAD(&gmap->pt_list);8674 INIT_RADIX_TREE(&gmap->guest_to_host, GFP_KERNEL_ACCOUNT);8775 INIT_RADIX_TREE(&gmap->host_to_guest, GFP_ATOMIC | __GFP_ACCOUNT);8876 INIT_RADIX_TREE(&gmap->host_to_rmap, GFP_ATOMIC | __GFP_ACCOUNT);···9082 page = gmap_alloc_crst();9183 if (!page)9284 goto out_free;9393- page->index = 0;9494- list_add(&page->lru, &gmap->crst_list);9585 table = page_to_virt(page);9686 crst_table_init(table, etype);9787 gmap->table = table;···10397out:10498 return NULL;10599}100100+EXPORT_SYMBOL_GPL(gmap_alloc);106101107102/**108103 * gmap_create - create a guest address space···192185 } while (nr > 0);193186}194187188188+static void gmap_free_crst(unsigned long *table, bool free_ptes)189189+{190190+ bool is_segment = (table[0] & _SEGMENT_ENTRY_TYPE_MASK) == 0;191191+ int i;192192+193193+ if (is_segment) {194194+ if (!free_ptes)195195+ goto out;196196+ for (i = 0; i < _CRST_ENTRIES; i++)197197+ if (!(table[i] & _SEGMENT_ENTRY_INVALID))198198+ 
page_table_free_pgste(page_ptdesc(phys_to_page(table[i])));199199+ } else {200200+ for (i = 0; i < _CRST_ENTRIES; i++)201201+ if (!(table[i] & _REGION_ENTRY_INVALID))202202+ gmap_free_crst(__va(table[i] & PAGE_MASK), free_ptes);203203+ }204204+205205+out:206206+ free_pages((unsigned long)table, CRST_ALLOC_ORDER);207207+}208208+195209/**196210 * gmap_free - free a guest address space197211 * @gmap: pointer to the guest address space structure198212 *199213 * No locks required. There are no references to this gmap anymore.200214 */201201-static void gmap_free(struct gmap *gmap)215215+void gmap_free(struct gmap *gmap)202216{203203- struct page *page, *next;204204-205217 /* Flush tlb of all gmaps (if not already done for shadows) */206218 if (!(gmap_is_shadow(gmap) && gmap->removed))207219 gmap_flush_tlb(gmap);208220 /* Free all segment & region tables. */209209- list_for_each_entry_safe(page, next, &gmap->crst_list, lru)210210- __free_pages(page, CRST_ALLOC_ORDER);221221+ gmap_free_crst(gmap->table, gmap_is_shadow(gmap));222222+211223 gmap_radix_tree_free(&gmap->guest_to_host);212224 gmap_radix_tree_free(&gmap->host_to_guest);213225214226 /* Free additional data for a shadow gmap */215227 if (gmap_is_shadow(gmap)) {216216- struct ptdesc *ptdesc, *n;217217-218218- /* Free all page tables. 
*/219219- list_for_each_entry_safe(ptdesc, n, &gmap->pt_list, pt_list)220220- page_table_free_pgste(ptdesc);221228 gmap_rmap_radix_tree_free(&gmap->host_to_rmap);222229 /* Release reference to the parent */223230 gmap_put(gmap->parent);···239218240219 kfree(gmap);241220}221221+EXPORT_SYMBOL_GPL(gmap_free);242222243223/**244224 * gmap_get - increase reference counter for guest address space···320298 crst_table_init(new, init);321299 spin_lock(&gmap->guest_table_lock);322300 if (*table & _REGION_ENTRY_INVALID) {323323- list_add(&page->lru, &gmap->crst_list);324301 *table = __pa(new) | _REGION_ENTRY_LENGTH |325302 (*table & _REGION_ENTRY_TYPE_MASK);326326- page->index = gaddr;327303 page = NULL;328304 }329305 spin_unlock(&gmap->guest_table_lock);···330310 return 0;331311}332312333333-/**334334- * __gmap_segment_gaddr - find virtual address from segment pointer335335- * @entry: pointer to a segment table entry in the guest address space336336- *337337- * Returns the virtual address in the guest address space for the segment338338- */339339-static unsigned long __gmap_segment_gaddr(unsigned long *entry)313313+static unsigned long host_to_guest_lookup(struct gmap *gmap, unsigned long vmaddr)340314{341341- struct page *page;342342- unsigned long offset;315315+ return (unsigned long)radix_tree_lookup(&gmap->host_to_guest, vmaddr >> PMD_SHIFT);316316+}343317344344- offset = (unsigned long) entry / sizeof(unsigned long);345345- offset = (offset & (PTRS_PER_PMD - 1)) * PMD_SIZE;346346- page = pmd_pgtable_page((pmd_t *) entry);347347- return page->index + offset;318318+static unsigned long host_to_guest_delete(struct gmap *gmap, unsigned long vmaddr)319319+{320320+ return (unsigned long)radix_tree_delete(&gmap->host_to_guest, vmaddr >> PMD_SHIFT);321321+}322322+323323+static pmd_t *host_to_guest_pmd_delete(struct gmap *gmap, unsigned long vmaddr,324324+ unsigned long *gaddr)325325+{326326+ *gaddr = host_to_guest_delete(gmap, vmaddr);327327+ if (IS_GADDR_VALID(*gaddr))328328+ 
return (pmd_t *)gmap_table_walk(gmap, *gaddr, 1);329329+ return NULL;348330}349331350332/**···358336 */359337static int __gmap_unlink_by_vmaddr(struct gmap *gmap, unsigned long vmaddr)360338{361361- unsigned long *entry;339339+ unsigned long gaddr;362340 int flush = 0;341341+ pmd_t *pmdp;363342364343 BUG_ON(gmap_is_shadow(gmap));365344 spin_lock(&gmap->guest_table_lock);366366- entry = radix_tree_delete(&gmap->host_to_guest, vmaddr >> PMD_SHIFT);367367- if (entry) {368368- flush = (*entry != _SEGMENT_ENTRY_EMPTY);369369- *entry = _SEGMENT_ENTRY_EMPTY;345345+346346+ pmdp = host_to_guest_pmd_delete(gmap, vmaddr, &gaddr);347347+ if (pmdp) {348348+ flush = (pmd_val(*pmdp) != _SEGMENT_ENTRY_EMPTY);349349+ *pmdp = __pmd(_SEGMENT_ENTRY_EMPTY);370350 }351351+371352 spin_unlock(&gmap->guest_table_lock);372353 return flush;373354}···489464EXPORT_SYMBOL_GPL(__gmap_translate);490465491466/**492492- * gmap_translate - translate a guest address to a user space address493493- * @gmap: pointer to guest mapping meta data structure494494- * @gaddr: guest address495495- *496496- * Returns user space address which corresponds to the guest address or497497- * -EFAULT if no such mapping exists.498498- * This function does not establish potentially missing page table entries.499499- */500500-unsigned long gmap_translate(struct gmap *gmap, unsigned long gaddr)501501-{502502- unsigned long rc;503503-504504- mmap_read_lock(gmap->mm);505505- rc = __gmap_translate(gmap, gaddr);506506- mmap_read_unlock(gmap->mm);507507- return rc;508508-}509509-EXPORT_SYMBOL_GPL(gmap_translate);510510-511511-/**512467 * gmap_unlink - disconnect a page table from the gmap shadow tables513468 * @mm: pointer to the parent mm_struct514469 * @table: pointer to the host page table···587582 spin_lock(&gmap->guest_table_lock);588583 if (*table == _SEGMENT_ENTRY_EMPTY) {589584 rc = radix_tree_insert(&gmap->host_to_guest,590590- vmaddr >> PMD_SHIFT, table);585585+ vmaddr >> PMD_SHIFT,586586+ (void 
*)MAKE_VALID_GADDR(gaddr));591587 if (!rc) {592588 if (pmd_leaf(*pmd)) {593589 *table = (pmd_val(*pmd) &···611605 radix_tree_preload_end();612606 return rc;613607}614614-615615-/**616616- * fixup_user_fault_nowait - manually resolve a user page fault without waiting617617- * @mm: mm_struct of target mm618618- * @address: user address619619- * @fault_flags:flags to pass down to handle_mm_fault()620620- * @unlocked: did we unlock the mmap_lock while retrying621621- *622622- * This function behaves similarly to fixup_user_fault(), but it guarantees623623- * that the fault will be resolved without waiting. The function might drop624624- * and re-acquire the mm lock, in which case @unlocked will be set to true.625625- *626626- * The guarantee is that the fault is handled without waiting, but the627627- * function itself might sleep, due to the lock.628628- *629629- * Context: Needs to be called with mm->mmap_lock held in read mode, and will630630- * return with the lock held in read mode; @unlocked will indicate whether631631- * the lock has been dropped and re-acquired. This is the same behaviour as632632- * fixup_user_fault().633633- *634634- * Return: 0 on success, -EAGAIN if the fault cannot be resolved without635635- * waiting, -EFAULT if the fault cannot be resolved, -ENOMEM if out of636636- * memory.637637- */638638-static int fixup_user_fault_nowait(struct mm_struct *mm, unsigned long address,639639- unsigned int fault_flags, bool *unlocked)640640-{641641- struct vm_area_struct *vma;642642- unsigned int test_flags;643643- vm_fault_t fault;644644- int rc;645645-646646- fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT;647647- test_flags = fault_flags & FAULT_FLAG_WRITE ? 
VM_WRITE : VM_READ;648648-649649- vma = find_vma(mm, address);650650- if (unlikely(!vma || address < vma->vm_start))651651- return -EFAULT;652652- if (unlikely(!(vma->vm_flags & test_flags)))653653- return -EFAULT;654654-655655- fault = handle_mm_fault(vma, address, fault_flags, NULL);656656- /* the mm lock has been dropped, take it again */657657- if (fault & VM_FAULT_COMPLETED) {658658- *unlocked = true;659659- mmap_read_lock(mm);660660- return 0;661661- }662662- /* the mm lock has not been dropped */663663- if (fault & VM_FAULT_ERROR) {664664- rc = vm_fault_to_errno(fault, 0);665665- BUG_ON(!rc);666666- return rc;667667- }668668- /* the mm lock has not been dropped because of FAULT_FLAG_RETRY_NOWAIT */669669- if (fault & VM_FAULT_RETRY)670670- return -EAGAIN;671671- /* nothing needed to be done and the mm lock has not been dropped */672672- return 0;673673-}674674-675675-/**676676- * __gmap_fault - resolve a fault on a guest address677677- * @gmap: pointer to guest mapping meta data structure678678- * @gaddr: guest address679679- * @fault_flags: flags to pass down to handle_mm_fault()680680- *681681- * Context: Needs to be called with mm->mmap_lock held in read mode. Might682682- * drop and re-acquire the lock. 
Will always return with the lock held.683683- */684684-static int __gmap_fault(struct gmap *gmap, unsigned long gaddr, unsigned int fault_flags)685685-{686686- unsigned long vmaddr;687687- bool unlocked;688688- int rc = 0;689689-690690-retry:691691- unlocked = false;692692-693693- vmaddr = __gmap_translate(gmap, gaddr);694694- if (IS_ERR_VALUE(vmaddr))695695- return vmaddr;696696-697697- if (fault_flags & FAULT_FLAG_RETRY_NOWAIT)698698- rc = fixup_user_fault_nowait(gmap->mm, vmaddr, fault_flags, &unlocked);699699- else700700- rc = fixup_user_fault(gmap->mm, vmaddr, fault_flags, &unlocked);701701- if (rc)702702- return rc;703703- /*704704- * In the case that fixup_user_fault unlocked the mmap_lock during705705- * fault-in, redo __gmap_translate() to avoid racing with a706706- * map/unmap_segment.707707- * In particular, __gmap_translate(), fixup_user_fault{,_nowait}(),708708- * and __gmap_link() must all be called atomically in one go; if the709709- * lock had been dropped in between, a retry is needed.710710- */711711- if (unlocked)712712- goto retry;713713-714714- return __gmap_link(gmap, gaddr, vmaddr);715715-}716716-717717-/**718718- * gmap_fault - resolve a fault on a guest address719719- * @gmap: pointer to guest mapping meta data structure720720- * @gaddr: guest address721721- * @fault_flags: flags to pass down to handle_mm_fault()722722- *723723- * Returns 0 on success, -ENOMEM for out of memory conditions, -EFAULT if the724724- * vm address is already mapped to a different guest segment, and -EAGAIN if725725- * FAULT_FLAG_RETRY_NOWAIT was specified and the fault could not be processed726726- * immediately.727727- */728728-int gmap_fault(struct gmap *gmap, unsigned long gaddr, unsigned int fault_flags)729729-{730730- int rc;731731-732732- mmap_read_lock(gmap->mm);733733- rc = __gmap_fault(gmap, gaddr, fault_flags);734734- mmap_read_unlock(gmap->mm);735735- return 
rc;736736-}737737-EXPORT_SYMBOL_GPL(gmap_fault);608608+EXPORT_SYMBOL(__gmap_link);738609739610/*740611 * this function is assumed to be called with mmap_lock held···736853 *737854 * Note: Can also be called for shadow gmaps.738855 */739739-static inline unsigned long *gmap_table_walk(struct gmap *gmap,740740- unsigned long gaddr, int level)856856+unsigned long *gmap_table_walk(struct gmap *gmap, unsigned long gaddr, int level)741857{742858 const int asce_type = gmap->asce & _ASCE_TYPE_MASK;743859 unsigned long *table = gmap->table;···787905 }788906 return table;789907}908908+EXPORT_SYMBOL(gmap_table_walk);790909791910/**792911 * gmap_pte_op_walk - walk the gmap page table, get the page table lock···9841101 * @prot: indicates access rights: PROT_NONE, PROT_READ or PROT_WRITE9851102 * @bits: pgste notification bits to set9861103 *987987- * Returns 0 if successfully protected, -ENOMEM if out of memory and988988- * -EFAULT if gaddr is invalid (or mapping for shadows is missing).11041104+ * Returns:11051105+ * PAGE_SIZE if a small page was successfully protected;11061106+ * HPAGE_SIZE if a large page was successfully protected;11071107+ * -ENOMEM if out of memory;11081108+ * -EFAULT if gaddr is invalid (or mapping for shadows is missing);11091109+ * -EAGAIN if the guest mapping is missing and should be fixed by the caller.9891110 *990990- * Called with sg->mm->mmap_lock in read.11111111+ * Context: Called with sg->mm->mmap_lock in read.9911112 */992992-static int gmap_protect_range(struct gmap *gmap, unsigned long gaddr,993993- unsigned long len, int prot, unsigned long bits)11131113+int gmap_protect_one(struct gmap *gmap, unsigned long gaddr, int prot, unsigned long bits)9941114{995995- unsigned long vmaddr, dist;9961115 pmd_t *pmdp;997997- int rc;11161116+ int rc = 0;99811179991118 BUG_ON(gmap_is_shadow(gmap));10001000- while (len) {10011001- rc = -EAGAIN;10021002- pmdp = gmap_pmd_op_walk(gmap, gaddr);10031003- if (pmdp) {10041004- if (!pmd_leaf(*pmdp)) {10051005- rc 
= gmap_protect_pte(gmap, gaddr, pmdp, prot,10061006- bits);10071007- if (!rc) {10081008- len -= PAGE_SIZE;10091009- gaddr += PAGE_SIZE;10101010- }10111011- } else {10121012- rc = gmap_protect_pmd(gmap, gaddr, pmdp, prot,10131013- bits);10141014- if (!rc) {10151015- dist = HPAGE_SIZE - (gaddr & ~HPAGE_MASK);10161016- len = len < dist ? 0 : len - dist;10171017- gaddr = (gaddr & HPAGE_MASK) + HPAGE_SIZE;10181018- }10191019- }10201020- gmap_pmd_op_end(gmap, pmdp);10211021- }10221022- if (rc) {10231023- if (rc == -EINVAL)10241024- return rc;1025111910261026- /* -EAGAIN, fixup of userspace mm and gmap */10271027- vmaddr = __gmap_translate(gmap, gaddr);10281028- if (IS_ERR_VALUE(vmaddr))10291029- return vmaddr;10301030- rc = gmap_pte_op_fixup(gmap, gaddr, vmaddr, prot);10311031- if (rc)10321032- return rc;10331033- }11201120+ pmdp = gmap_pmd_op_walk(gmap, gaddr);11211121+ if (!pmdp)11221122+ return -EAGAIN;11231123+11241124+ if (!pmd_leaf(*pmdp)) {11251125+ rc = gmap_protect_pte(gmap, gaddr, pmdp, prot, bits);11261126+ if (!rc)11271127+ rc = PAGE_SIZE;11281128+ } else {11291129+ rc = gmap_protect_pmd(gmap, gaddr, pmdp, prot, bits);11301130+ if (!rc)11311131+ rc = HPAGE_SIZE;10341132 }10351035- return 0;10361036-}11331133+ gmap_pmd_op_end(gmap, pmdp);1037113410381038-/**10391039- * gmap_mprotect_notify - change access rights for a range of ptes and10401040- * call the notifier if any pte changes again10411041- * @gmap: pointer to guest mapping meta data structure10421042- * @gaddr: virtual address in the guest address space10431043- * @len: size of area10441044- * @prot: indicates access rights: PROT_NONE, PROT_READ or PROT_WRITE10451045- *10461046- * Returns 0 if for each page in the given range a gmap mapping exists,10471047- * the new access rights could be set and the notifier could be armed.10481048- * If the gmap mapping is missing for one or more pages -EFAULT is10491049- * returned. 
If no memory could be allocated -ENOMEM is returned.10501050- * This function establishes missing page table entries.10511051- */10521052-int gmap_mprotect_notify(struct gmap *gmap, unsigned long gaddr,10531053- unsigned long len, int prot)10541054-{10551055- int rc;10561056-10571057- if ((gaddr & ~PAGE_MASK) || (len & ~PAGE_MASK) || gmap_is_shadow(gmap))10581058- return -EINVAL;10591059- if (!MACHINE_HAS_ESOP && prot == PROT_READ)10601060- return -EINVAL;10611061- mmap_read_lock(gmap->mm);10621062- rc = gmap_protect_range(gmap, gaddr, len, prot, GMAP_NOTIFY_MPROT);10631063- mmap_read_unlock(gmap->mm);10641135 return rc;10651136}10661066-EXPORT_SYMBOL_GPL(gmap_mprotect_notify);11371137+EXPORT_SYMBOL_GPL(gmap_protect_one);1067113810681139/**10691140 * gmap_read_table - get an unsigned long value from a guest page table using···12511414 __gmap_unshadow_pgt(sg, raddr, __va(pgt));12521415 /* Free page table */12531416 ptdesc = page_ptdesc(phys_to_page(pgt));12541254- list_del(&ptdesc->pt_list);12551417 page_table_free_pgste(ptdesc);12561418}12571419···12781442 __gmap_unshadow_pgt(sg, raddr, __va(pgt));12791443 /* Free page table */12801444 ptdesc = page_ptdesc(phys_to_page(pgt));12811281- list_del(&ptdesc->pt_list);12821445 page_table_free_pgste(ptdesc);12831446 }12841447}···13071472 __gmap_unshadow_sgt(sg, raddr, __va(sgt));13081473 /* Free segment table */13091474 page = phys_to_page(sgt);13101310- list_del(&page->lru);13111475 __free_pages(page, CRST_ALLOC_ORDER);13121476}13131477···13341500 __gmap_unshadow_sgt(sg, raddr, __va(sgt));13351501 /* Free segment table */13361502 page = phys_to_page(sgt);13371337- list_del(&page->lru);13381503 __free_pages(page, CRST_ALLOC_ORDER);13391504 }13401505}···13631530 __gmap_unshadow_r3t(sg, raddr, __va(r3t));13641531 /* Free region 3 table */13651532 page = phys_to_page(r3t);13661366- list_del(&page->lru);13671533 __free_pages(page, CRST_ALLOC_ORDER);13681534}13691535···13901558 __gmap_unshadow_r3t(sg, raddr, __va(r3t));13911559 
/* Free region 3 table */13921560 page = phys_to_page(r3t);13931393- list_del(&page->lru);13941561 __free_pages(page, CRST_ALLOC_ORDER);13951562 }13961563}···14191588 __gmap_unshadow_r2t(sg, raddr, __va(r2t));14201589 /* Free region 2 table */14211590 page = phys_to_page(r2t);14221422- list_del(&page->lru);14231591 __free_pages(page, CRST_ALLOC_ORDER);14241592}14251593···14501620 r1t[i] = _REGION1_ENTRY_EMPTY;14511621 /* Free region 2 table */14521622 page = phys_to_page(r2t);14531453- list_del(&page->lru);14541623 __free_pages(page, CRST_ALLOC_ORDER);14551624 }14561625}···14601631 *14611632 * Called with sg->guest_table_lock14621633 */14631463-static void gmap_unshadow(struct gmap *sg)16341634+void gmap_unshadow(struct gmap *sg)14641635{14651636 unsigned long *table;14661637···14861657 break;14871658 }14881659}14891489-14901490-/**14911491- * gmap_find_shadow - find a specific asce in the list of shadow tables14921492- * @parent: pointer to the parent gmap14931493- * @asce: ASCE for which the shadow table is created14941494- * @edat_level: edat level to be used for the shadow translation14951495- *14961496- * Returns the pointer to a gmap if a shadow table with the given asce is14971497- * already available, ERR_PTR(-EAGAIN) if another one is just being created,14981498- * otherwise NULL14991499- */15001500-static struct gmap *gmap_find_shadow(struct gmap *parent, unsigned long asce,15011501- int edat_level)15021502-{15031503- struct gmap *sg;15041504-15051505- list_for_each_entry(sg, &parent->children, list) {15061506- if (sg->orig_asce != asce || sg->edat_level != edat_level ||15071507- sg->removed)15081508- continue;15091509- if (!sg->initialized)15101510- return ERR_PTR(-EAGAIN);15111511- refcount_inc(&sg->ref_count);15121512- return sg;15131513- }15141514- return NULL;15151515-}15161516-15171517-/**15181518- * gmap_shadow_valid - check if a shadow guest address space matches the15191519- * given properties and is still valid15201520- * @sg: pointer to the 
shadow guest address space structure15211521- * @asce: ASCE for which the shadow table is requested15221522- * @edat_level: edat level to be used for the shadow translation15231523- *15241524- * Returns 1 if the gmap shadow is still valid and matches the given15251525- * properties, the caller can continue using it. Returns 0 otherwise, the15261526- * caller has to request a new shadow gmap in this case.15271527- *15281528- */15291529-int gmap_shadow_valid(struct gmap *sg, unsigned long asce, int edat_level)15301530-{15311531- if (sg->removed)15321532- return 0;15331533- return sg->orig_asce == asce && sg->edat_level == edat_level;15341534-}15351535-EXPORT_SYMBOL_GPL(gmap_shadow_valid);15361536-15371537-/**15381538- * gmap_shadow - create/find a shadow guest address space15391539- * @parent: pointer to the parent gmap15401540- * @asce: ASCE for which the shadow table is created15411541- * @edat_level: edat level to be used for the shadow translation15421542- *15431543- * The pages of the top level page table referred by the asce parameter15441544- * will be set to read-only and marked in the PGSTEs of the kvm process.15451545- * The shadow table will be removed automatically on any change to the15461546- * PTE mapping for the source table.15471547- *15481548- * Returns a guest address space structure, ERR_PTR(-ENOMEM) if out of memory,15491549- * ERR_PTR(-EAGAIN) if the caller has to retry and ERR_PTR(-EFAULT) if the15501550- * parent gmap table could not be protected.15511551- */15521552-struct gmap *gmap_shadow(struct gmap *parent, unsigned long asce,15531553- int edat_level)15541554-{15551555- struct gmap *sg, *new;15561556- unsigned long limit;15571557- int rc;15581558-15591559- BUG_ON(parent->mm->context.allow_gmap_hpage_1m);15601560- BUG_ON(gmap_is_shadow(parent));15611561- spin_lock(&parent->shadow_lock);15621562- sg = gmap_find_shadow(parent, asce, edat_level);15631563- spin_unlock(&parent->shadow_lock);15641564- if (sg)15651565- return sg;15661566- /* 
Create a new shadow gmap */15671567- limit = -1UL >> (33 - (((asce & _ASCE_TYPE_MASK) >> 2) * 11));15681568- if (asce & _ASCE_REAL_SPACE)15691569- limit = -1UL;15701570- new = gmap_alloc(limit);15711571- if (!new)15721572- return ERR_PTR(-ENOMEM);15731573- new->mm = parent->mm;15741574- new->parent = gmap_get(parent);15751575- new->private = parent->private;15761576- new->orig_asce = asce;15771577- new->edat_level = edat_level;15781578- new->initialized = false;15791579- spin_lock(&parent->shadow_lock);15801580- /* Recheck if another CPU created the same shadow */15811581- sg = gmap_find_shadow(parent, asce, edat_level);15821582- if (sg) {15831583- spin_unlock(&parent->shadow_lock);15841584- gmap_free(new);15851585- return sg;15861586- }15871587- if (asce & _ASCE_REAL_SPACE) {15881588- /* only allow one real-space gmap shadow */15891589- list_for_each_entry(sg, &parent->children, list) {15901590- if (sg->orig_asce & _ASCE_REAL_SPACE) {15911591- spin_lock(&sg->guest_table_lock);15921592- gmap_unshadow(sg);15931593- spin_unlock(&sg->guest_table_lock);15941594- list_del(&sg->list);15951595- gmap_put(sg);15961596- break;15971597- }15981598- }15991599- }16001600- refcount_set(&new->ref_count, 2);16011601- list_add(&new->list, &parent->children);16021602- if (asce & _ASCE_REAL_SPACE) {16031603- /* nothing to protect, return right away */16041604- new->initialized = true;16051605- spin_unlock(&parent->shadow_lock);16061606- return new;16071607- }16081608- spin_unlock(&parent->shadow_lock);16091609- /* protect after insertion, so it will get properly invalidated */16101610- mmap_read_lock(parent->mm);16111611- rc = gmap_protect_range(parent, asce & _ASCE_ORIGIN,16121612- ((asce & _ASCE_TABLE_LENGTH) + 1) * PAGE_SIZE,16131613- PROT_READ, GMAP_NOTIFY_SHADOW);16141614- mmap_read_unlock(parent->mm);16151615- spin_lock(&parent->shadow_lock);16161616- new->initialized = true;16171617- if (rc) {16181618- list_del(&new->list);16191619- gmap_free(new);16201620- new = 
ERR_PTR(rc);16211621- }16221622- spin_unlock(&parent->shadow_lock);16231623- return new;16241624-}16251625-EXPORT_SYMBOL_GPL(gmap_shadow);16601660+EXPORT_SYMBOL(gmap_unshadow);1626166116271662/**16281663 * gmap_shadow_r2t - create an empty shadow region 2 table···15201827 page = gmap_alloc_crst();15211828 if (!page)15221829 return -ENOMEM;15231523- page->index = r2t & _REGION_ENTRY_ORIGIN;15241524- if (fake)15251525- page->index |= GMAP_SHADOW_FAKE_TABLE;15261830 s_r2t = page_to_phys(page);15271831 /* Install shadow region second table */15281832 spin_lock(&sg->guest_table_lock);···15411851 _REGION_ENTRY_TYPE_R1 | _REGION_ENTRY_INVALID;15421852 if (sg->edat_level >= 1)15431853 *table |= (r2t & _REGION_ENTRY_PROTECT);15441544- list_add(&page->lru, &sg->crst_list);15451854 if (fake) {15461855 /* nothing to protect for fake tables */15471856 *table &= ~_REGION_ENTRY_INVALID;···16001911 page = gmap_alloc_crst();16011912 if (!page)16021913 return -ENOMEM;16031603- page->index = r3t & _REGION_ENTRY_ORIGIN;16041604- if (fake)16051605- page->index |= GMAP_SHADOW_FAKE_TABLE;16061914 s_r3t = page_to_phys(page);16071915 /* Install shadow region second table */16081916 spin_lock(&sg->guest_table_lock);···16211935 _REGION_ENTRY_TYPE_R2 | _REGION_ENTRY_INVALID;16221936 if (sg->edat_level >= 1)16231937 *table |= (r3t & _REGION_ENTRY_PROTECT);16241624- list_add(&page->lru, &sg->crst_list);16251938 if (fake) {16261939 /* nothing to protect for fake tables */16271940 *table &= ~_REGION_ENTRY_INVALID;···16801995 page = gmap_alloc_crst();16811996 if (!page)16821997 return -ENOMEM;16831683- page->index = sgt & _REGION_ENTRY_ORIGIN;16841684- if (fake)16851685- page->index |= GMAP_SHADOW_FAKE_TABLE;16861998 s_sgt = page_to_phys(page);16871999 /* Install shadow region second table */16882000 spin_lock(&sg->guest_table_lock);···17012019 _REGION_ENTRY_TYPE_R3 | _REGION_ENTRY_INVALID;17022020 if (sg->edat_level >= 1)17032021 *table |= sgt & _REGION_ENTRY_PROTECT;17041704- 
list_add(&page->lru, &sg->crst_list);17052022 if (fake) {17062023 /* nothing to protect for fake tables */17072024 *table &= ~_REGION_ENTRY_INVALID;···17332052}17342053EXPORT_SYMBOL_GPL(gmap_shadow_sgt);1735205417361736-/**17371737- * gmap_shadow_pgt_lookup - find a shadow page table17381738- * @sg: pointer to the shadow guest address space structure17391739- * @saddr: the address in the shadow aguest address space17401740- * @pgt: parent gmap address of the page table to get shadowed17411741- * @dat_protection: if the pgtable is marked as protected by dat17421742- * @fake: pgt references contiguous guest memory block, not a pgtable17431743- *17441744- * Returns 0 if the shadow page table was found and -EAGAIN if the page17451745- * table was not found.17461746- *17471747- * Called with sg->mm->mmap_lock in read.17481748- */17491749-int gmap_shadow_pgt_lookup(struct gmap *sg, unsigned long saddr,17501750- unsigned long *pgt, int *dat_protection,17511751- int *fake)20552055+static void gmap_pgste_set_pgt_addr(struct ptdesc *ptdesc, unsigned long pgt_addr)17522056{17531753- unsigned long *table;17541754- struct page *page;17551755- int rc;20572057+ unsigned long *pgstes = page_to_virt(ptdesc_page(ptdesc));1756205817571757- BUG_ON(!gmap_is_shadow(sg));17581758- spin_lock(&sg->guest_table_lock);17591759- table = gmap_table_walk(sg, saddr, 1); /* get segment pointer */17601760- if (table && !(*table & _SEGMENT_ENTRY_INVALID)) {17611761- /* Shadow page tables are full pages (pte+pgste) */17621762- page = pfn_to_page(*table >> PAGE_SHIFT);17631763- *pgt = page->index & ~GMAP_SHADOW_FAKE_TABLE;17641764- *dat_protection = !!(*table & _SEGMENT_ENTRY_PROTECT);17651765- *fake = !!(page->index & GMAP_SHADOW_FAKE_TABLE);17661766- rc = 0;17671767- } else {17681768- rc = -EAGAIN;17691769- }17701770- spin_unlock(&sg->guest_table_lock);17711771- return rc;20592059+ pgstes += _PAGE_ENTRIES;1772206020612061+ pgstes[0] &= ~PGSTE_ST2_MASK;20622062+ pgstes[1] &= ~PGSTE_ST2_MASK;20632063+ 
pgstes[2] &= ~PGSTE_ST2_MASK;20642064+ pgstes[3] &= ~PGSTE_ST2_MASK;20652065+20662066+ pgstes[0] |= (pgt_addr >> 16) & PGSTE_ST2_MASK;20672067+ pgstes[1] |= pgt_addr & PGSTE_ST2_MASK;20682068+ pgstes[2] |= (pgt_addr << 16) & PGSTE_ST2_MASK;20692069+ pgstes[3] |= (pgt_addr << 32) & PGSTE_ST2_MASK;17732070}17741774-EXPORT_SYMBOL_GPL(gmap_shadow_pgt_lookup);1775207117762072/**17772073 * gmap_shadow_pgt - instantiate a shadow page table···17772119 ptdesc = page_table_alloc_pgste(sg->mm);17782120 if (!ptdesc)17792121 return -ENOMEM;17801780- ptdesc->pt_index = pgt & _SEGMENT_ENTRY_ORIGIN;21222122+ origin = pgt & _SEGMENT_ENTRY_ORIGIN;17812123 if (fake)17821782- ptdesc->pt_index |= GMAP_SHADOW_FAKE_TABLE;21242124+ origin |= GMAP_SHADOW_FAKE_TABLE;21252125+ gmap_pgste_set_pgt_addr(ptdesc, origin);17832126 s_pgt = page_to_phys(ptdesc_page(ptdesc));17842127 /* Install shadow page table */17852128 spin_lock(&sg->guest_table_lock);···17992140 /* mark as invalid as long as the parent table is not protected */18002141 *table = (unsigned long) s_pgt | _SEGMENT_ENTRY |18012142 (pgt & _SEGMENT_ENTRY_PROTECT) | _SEGMENT_ENTRY_INVALID;18021802- list_add(&ptdesc->pt_list, &sg->pt_list);18032143 if (fake) {18042144 /* nothing to protect for fake tables */18052145 *table &= ~_SEGMENT_ENTRY_INVALID;···19762318 pte_t *pte, unsigned long bits)19772319{19782320 unsigned long offset, gaddr = 0;19791979- unsigned long *table;19802321 struct gmap *gmap, *sg, *next;1981232219822323 offset = ((unsigned long) pte) & (255 * sizeof(pte_t));···19832326 rcu_read_lock();19842327 list_for_each_entry_rcu(gmap, &mm->context.gmap_list, list) {19852328 spin_lock(&gmap->guest_table_lock);19861986- table = radix_tree_lookup(&gmap->host_to_guest,19871987- vmaddr >> PMD_SHIFT);19881988- if (table)19891989- gaddr = __gmap_segment_gaddr(table) + offset;23292329+ gaddr = host_to_guest_lookup(gmap, vmaddr) + offset;19902330 spin_unlock(&gmap->guest_table_lock);19911991- if (!table)23312331+ if 
(!IS_GADDR_VALID(gaddr))19922332 continue;1993233319942334 if (!list_empty(&gmap->children) && (bits & PGSTE_VSIE_BIT)) {···20452391 rcu_read_lock();20462392 list_for_each_entry_rcu(gmap, &mm->context.gmap_list, list) {20472393 spin_lock(&gmap->guest_table_lock);20482048- pmdp = (pmd_t *)radix_tree_delete(&gmap->host_to_guest,20492049- vmaddr >> PMD_SHIFT);23942394+ pmdp = host_to_guest_pmd_delete(gmap, vmaddr, &gaddr);20502395 if (pmdp) {20512051- gaddr = __gmap_segment_gaddr((unsigned long *)pmdp);20522396 pmdp_notify_gmap(gmap, pmdp, gaddr);20532397 WARN_ON(pmd_val(*pmdp) & ~(_SEGMENT_ENTRY_HARDWARE_BITS_LARGE |20542398 _SEGMENT_ENTRY_GMAP_UC |···20902438 */20912439void gmap_pmdp_idte_local(struct mm_struct *mm, unsigned long vmaddr)20922440{20932093- unsigned long *entry, gaddr;24412441+ unsigned long gaddr;20942442 struct gmap *gmap;20952443 pmd_t *pmdp;2096244420972445 rcu_read_lock();20982446 list_for_each_entry_rcu(gmap, &mm->context.gmap_list, list) {20992447 spin_lock(&gmap->guest_table_lock);21002100- entry = radix_tree_delete(&gmap->host_to_guest,21012101- vmaddr >> PMD_SHIFT);21022102- if (entry) {21032103- pmdp = (pmd_t *)entry;21042104- gaddr = __gmap_segment_gaddr(entry);24482448+ pmdp = host_to_guest_pmd_delete(gmap, vmaddr, &gaddr);24492449+ if (pmdp) {21052450 pmdp_notify_gmap(gmap, pmdp, gaddr);21062106- WARN_ON(*entry & ~(_SEGMENT_ENTRY_HARDWARE_BITS_LARGE |21072107- _SEGMENT_ENTRY_GMAP_UC |21082108- _SEGMENT_ENTRY));24512451+ WARN_ON(pmd_val(*pmdp) & ~(_SEGMENT_ENTRY_HARDWARE_BITS_LARGE |24522452+ _SEGMENT_ENTRY_GMAP_UC |24532453+ _SEGMENT_ENTRY));21092454 if (MACHINE_HAS_TLB_GUEST)21102455 __pmdp_idte(gaddr, pmdp, IDTE_GUEST_ASCE,21112456 gmap->asce, IDTE_LOCAL);21122457 else if (MACHINE_HAS_IDTE)21132458 __pmdp_idte(gaddr, pmdp, 0, 0, IDTE_LOCAL);21142114- *entry = _SEGMENT_ENTRY_EMPTY;24592459+ *pmdp = __pmd(_SEGMENT_ENTRY_EMPTY);21152460 }21162461 spin_unlock(&gmap->guest_table_lock);21172462 }···21232474 */21242475void 
gmap_pmdp_idte_global(struct mm_struct *mm, unsigned long vmaddr)21252476{21262126- unsigned long *entry, gaddr;24772477+ unsigned long gaddr;21272478 struct gmap *gmap;21282479 pmd_t *pmdp;2129248021302481 rcu_read_lock();21312482 list_for_each_entry_rcu(gmap, &mm->context.gmap_list, list) {21322483 spin_lock(&gmap->guest_table_lock);21332133- entry = radix_tree_delete(&gmap->host_to_guest,21342134- vmaddr >> PMD_SHIFT);21352135- if (entry) {21362136- pmdp = (pmd_t *)entry;21372137- gaddr = __gmap_segment_gaddr(entry);24842484+ pmdp = host_to_guest_pmd_delete(gmap, vmaddr, &gaddr);24852485+ if (pmdp) {21382486 pmdp_notify_gmap(gmap, pmdp, gaddr);21392139- WARN_ON(*entry & ~(_SEGMENT_ENTRY_HARDWARE_BITS_LARGE |21402140- _SEGMENT_ENTRY_GMAP_UC |21412141- _SEGMENT_ENTRY));24872487+ WARN_ON(pmd_val(*pmdp) & ~(_SEGMENT_ENTRY_HARDWARE_BITS_LARGE |24882488+ _SEGMENT_ENTRY_GMAP_UC |24892489+ _SEGMENT_ENTRY));21422490 if (MACHINE_HAS_TLB_GUEST)21432491 __pmdp_idte(gaddr, pmdp, IDTE_GUEST_ASCE,21442492 gmap->asce, IDTE_GLOBAL);···21432497 __pmdp_idte(gaddr, pmdp, 0, 0, IDTE_GLOBAL);21442498 else21452499 __pmdp_csp(pmdp);21462146- *entry = _SEGMENT_ENTRY_EMPTY;25002500+ *pmdp = __pmd(_SEGMENT_ENTRY_EMPTY);21472501 }21482502 spin_unlock(&gmap->guest_table_lock);21492503 }···25892943EXPORT_SYMBOL_GPL(__s390_uv_destroy_range);2590294425912945/**25922592- * s390_unlist_old_asce - Remove the topmost level of page tables from the25932593- * list of page tables of the gmap.25942594- * @gmap: the gmap whose table is to be removed25952595- *25962596- * On s390x, KVM keeps a list of all pages containing the page tables of the25972597- * gmap (the CRST list). 
This list is used at tear down time to free all25982598- * pages that are now not needed anymore.25992599- *26002600- * This function removes the topmost page of the tree (the one pointed to by26012601- * the ASCE) from the CRST list.26022602- *26032603- * This means that it will not be freed when the VM is torn down, and needs26042604- * to be handled separately by the caller, unless a leak is actually26052605- * intended. Notice that this function will only remove the page from the26062606- * list, the page will still be used as a top level page table (and ASCE).26072607- */26082608-void s390_unlist_old_asce(struct gmap *gmap)26092609-{26102610- struct page *old;26112611-26122612- old = virt_to_page(gmap->table);26132613- spin_lock(&gmap->guest_table_lock);26142614- list_del(&old->lru);26152615- /*26162616- * Sometimes the topmost page might need to be "removed" multiple26172617- * times, for example if the VM is rebooted into secure mode several26182618- * times concurrently, or if s390_replace_asce fails after calling26192619- * s390_remove_old_asce and is attempted again later. 
In that case26202620- * the old asce has been removed from the list, and therefore it26212621- * will not be freed when the VM terminates, but the ASCE is still26222622- * in use and still pointed to.26232623- * A subsequent call to replace_asce will follow the pointer and try26242624- * to remove the same page from the list again.26252625- * Therefore it's necessary that the page of the ASCE has valid26262626- * pointers, so list_del can work (and do nothing) without26272627- * dereferencing stale or invalid pointers.26282628- */26292629- INIT_LIST_HEAD(&old->lru);26302630- spin_unlock(&gmap->guest_table_lock);26312631-}26322632-EXPORT_SYMBOL_GPL(s390_unlist_old_asce);26332633-26342634-/**26352946 * s390_replace_asce - Try to replace the current ASCE of a gmap with a copy26362947 * @gmap: the gmap whose ASCE needs to be replaced26372948 *···26073004 struct page *page;26083005 void *table;2609300626102610- s390_unlist_old_asce(gmap);26112611-26123007 /* Replacing segment type ASCEs would cause serious issues */26133008 if ((gmap->asce & _ASCE_TYPE_MASK) == _ASCE_TYPE_SEGMENT)26143009 return -EINVAL;···26143013 page = gmap_alloc_crst();26153014 if (!page)26163015 return -ENOMEM;26172617- page->index = 0;26183016 table = page_to_virt(page);26193017 memcpy(table, gmap->table, 1UL << (CRST_ALLOC_ORDER + PAGE_SHIFT));26202620-26212621- /*26222622- * The caller has to deal with the old ASCE, but here we make sure26232623- * the new one is properly added to the CRST list, so that26242624- * it will be freed when the VM is torn down.26252625- */26262626- spin_lock(&gmap->guest_table_lock);26272627- list_add(&page->lru, &gmap->crst_list);26282628- spin_unlock(&gmap->guest_table_lock);2629301826303019 /* Set new table origin while preserving existing ASCE control bits */26313020 asce = (gmap->asce & ~_ASCE_ORIGIN) | __pa(table);···26263035 return 0;26273036}26283037EXPORT_SYMBOL_GPL(s390_replace_asce);30383038+30393039+/**30403040+ * kvm_s390_wiggle_split_folio() - try to 
drain extra references to a folio and optionally split30413041+ * @mm: the mm containing the folio to work on30423042+ * @folio: the folio30433043+ * @split: whether to split a large folio30443044+ *30453045+ * Context: Must be called while holding an extra reference to the folio;30463046+ * the mm lock should not be held.30473047+ */30483048+int kvm_s390_wiggle_split_folio(struct mm_struct *mm, struct folio *folio, bool split)30493049+{30503050+ int rc;30513051+30523052+ lockdep_assert_not_held(&mm->mmap_lock);30533053+ folio_wait_writeback(folio);30543054+ lru_add_drain_all();30553055+ if (split) {30563056+ folio_lock(folio);30573057+ rc = split_folio(folio);30583058+ folio_unlock(folio);30593059+30603060+ if (rc != -EBUSY)30613061+ return rc;30623062+ }30633063+ return -EAGAIN;30643064+}30653065+EXPORT_SYMBOL_GPL(kvm_s390_wiggle_split_folio);
-2
arch/s390/mm/pgalloc.c
···176176 }177177 table = ptdesc_to_virt(ptdesc);178178 __arch_set_page_dat(table, 1);179179- /* pt_list is used by gmap only */180180- INIT_LIST_HEAD(&ptdesc->pt_list);181179 memset64((u64 *)table, _PAGE_INVALID, PTRS_PER_PTE);182180 memset64((u64 *)table + PTRS_PER_PTE, 0, PTRS_PER_PTE);183181 return table;
+20
arch/s390/pci/pci_bus.c
···331331 return rc;332332}333333334334+static bool zpci_bus_is_isolated_vf(struct zpci_bus *zbus, struct zpci_dev *zdev)335335+{336336+ struct pci_dev *pdev;337337+338338+ pdev = zpci_iov_find_parent_pf(zbus, zdev);339339+ if (!pdev)340340+ return true;341341+ pci_dev_put(pdev);342342+ return false;343343+}344344+334345int zpci_bus_device_register(struct zpci_dev *zdev, struct pci_ops *ops)335346{336347 bool topo_is_tid = zdev->tid_avail;···356345357346 topo = topo_is_tid ? zdev->tid : zdev->pchid;358347 zbus = zpci_bus_get(topo, topo_is_tid);348348+ /*349349+ * An isolated VF gets its own domain/bus even if there exists350350+ * a matching domain/bus already351351+ */352352+ if (zbus && zpci_bus_is_isolated_vf(zbus, zdev)) {353353+ zpci_bus_put(zbus);354354+ zbus = NULL;355355+ }356356+359357 if (!zbus) {360358 zbus = zpci_bus_alloc(topo, topo_is_tid);361359 if (!zbus)
+42-14
arch/s390/pci/pci_iov.c
···6060 return 0;6161}62626363-int zpci_iov_setup_virtfn(struct zpci_bus *zbus, struct pci_dev *virtfn, int vfn)6363+/**6464+ * zpci_iov_find_parent_pf - Find the parent PF, if any, of the given function6565+ * @zbus: The bus that the PCI function is on, or would be added on6666+ * @zdev: The PCI function6767+ *6868+ * Finds the parent PF, if it exists and is configured, of the given PCI function6969+ * and increments its refcount. The PF is searched for on the provided bus so the7070+ * caller has to ensure that this is the correct bus to search. This function may7171+ * be used before adding the PCI function to a zbus.7272+ *7373+ * Return: Pointer to the struct pci_dev of the parent PF or NULL if it is not7474+ * found. If the function is not a VF or has no RequesterID information,7575+ * NULL is returned as well.7676+ */7777+struct pci_dev *zpci_iov_find_parent_pf(struct zpci_bus *zbus, struct zpci_dev *zdev)6478{6565- int i, cand_devfn;6666- struct zpci_dev *zdev;7979+ int i, vfid, devfn, cand_devfn;6780 struct pci_dev *pdev;6868- int vfid = vfn - 1; /* Linux' vfid's start at 0 vfn at 1*/6969- int rc = 0;70817182 if (!zbus->multifunction)7272- return 0;7373-7474- /* If the parent PF for the given VF is also configured in the8383+ return NULL;8484+ /* Non-VFs and VFs without RID available don't have a parent */8585+ if (!zdev->vfn || !zdev->rid_available)8686+ return NULL;8787+ /* Linux vfid starts at 0, vfn at 1 */8888+ vfid = zdev->vfn - 1;8989+ devfn = zdev->rid & ZPCI_RID_MASK_DEVFN;9090+ /*9191+ * If the parent PF for the given VF is also configured in the7592 * instance, it must be on the same zbus.7693 * We can then identify the parent PF by checking what7794 * devfn the VF would have if it belonged to that PF using the PF's
pci_dev_put(pdev);109109- break;110110- }8888+ if (cand_devfn == devfn)8989+ return pdev;11190 /* balance pci_get_slot() */11291 pci_dev_put(pdev);11392 }9393+ }9494+ return NULL;9595+}9696+9797+int zpci_iov_setup_virtfn(struct zpci_bus *zbus, struct pci_dev *virtfn, int vfn)9898+{9999+ struct zpci_dev *zdev = to_zpci(virtfn);100100+ struct pci_dev *pdev_pf;101101+ int rc = 0;102102+103103+ pdev_pf = zpci_iov_find_parent_pf(zbus, zdev);104104+ if (pdev_pf) {105105+ /* Linux' vfids start at 0 while zdev->vfn starts at 1 */106106+ rc = zpci_iov_link_virtfn(pdev_pf, virtfn, zdev->vfn - 1);107107+ pci_dev_put(pdev_pf);114108 }115109 return rc;116110}
···181181182182static int stub_exe_fd;183183184184+#ifndef CLOSE_RANGE_CLOEXEC185185+#define CLOSE_RANGE_CLOEXEC (1U << 2)186186+#endif187187+184188static int userspace_tramp(void *stack)185189{186190 char *const argv[] = { "uml-userspace", NULL };···206202 init_data.stub_data_fd = phys_mapping(uml_to_phys(stack), &offset);207203 init_data.stub_data_offset = MMAP_OFFSET(offset);208204209209- /* Set CLOEXEC on all FDs and then unset on all memory related FDs */210210- close_range(0, ~0U, CLOSE_RANGE_CLOEXEC);205205+ /*206206+ * Avoid leaking unneeded FDs to the stub by setting CLOEXEC on all FDs207207+ * and then unsetting it on all memory related FDs.208208+ * This is not strictly necessary from a safety perspective.209209+ */210210+ syscall(__NR_close_range, 0, ~0U, CLOSE_RANGE_CLOEXEC);211211212212 fcntl(init_data.stub_data_fd, F_SETFD, 0);213213 for (iomem = iomem_regions; iomem; iomem = iomem->next)···232224 if (ret != sizeof(init_data))233225 exit(4);234226235235- execveat(stub_exe_fd, "", argv, NULL, AT_EMPTY_PATH);227227+ /* Raw execveat for compatibility with older libc versions */228228+ syscall(__NR_execveat, stub_exe_fd, (unsigned long)"",229229+ (unsigned long)argv, NULL, AT_EMPTY_PATH);236230237231 exit(5);238232}
+2-1
arch/x86/Kconfig
···25992599 depends on CPU_SUP_AMD && X86_6426002600 default y26012601 help26022602- Compile the kernel with support for the retbleed=ibpb mitigation.26022602+ Compile the kernel with support for the retbleed=ibpb and26032603+ spec_rstack_overflow={ibpb,ibpb-vmexit} mitigations.2603260426042605config MITIGATION_IBRS_ENTRY26052606 bool "Enable IBRS on kernel entry"
+1
arch/x86/boot/compressed/Makefile
···2525# avoid errors with '-march=i386', and future flags may depend on the target to2626# be valid.2727KBUILD_CFLAGS := -m$(BITS) -O2 $(CLANG_FLAGS)2828+KBUILD_CFLAGS += -std=gnu112829KBUILD_CFLAGS += -fno-strict-aliasing -fPIE2930KBUILD_CFLAGS += -Wundef3031KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING
+14-19
arch/x86/events/intel/core.c
···4905490549064906static void update_pmu_cap(struct x86_hybrid_pmu *pmu)49074907{49084908- unsigned int sub_bitmaps, eax, ebx, ecx, edx;49084908+ unsigned int cntr, fixed_cntr, ecx, edx;49094909+ union cpuid35_eax eax;49104910+ union cpuid35_ebx ebx;4909491149104910- cpuid(ARCH_PERFMON_EXT_LEAF, &sub_bitmaps, &ebx, &ecx, &edx);49124912+ cpuid(ARCH_PERFMON_EXT_LEAF, &eax.full, &ebx.full, &ecx, &edx);4911491349124912- if (ebx & ARCH_PERFMON_EXT_UMASK2)49144914+ if (ebx.split.umask2)49134915 pmu->config_mask |= ARCH_PERFMON_EVENTSEL_UMASK2;49144914- if (ebx & ARCH_PERFMON_EXT_EQ)49164916+ if (ebx.split.eq)49154917 pmu->config_mask |= ARCH_PERFMON_EVENTSEL_EQ;4916491849174917- if (sub_bitmaps & ARCH_PERFMON_NUM_COUNTER_LEAF_BIT) {49194919+ if (eax.split.cntr_subleaf) {49184920 cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF,49194919- &eax, &ebx, &ecx, &edx);49204920- pmu->cntr_mask64 = eax;49214921- pmu->fixed_cntr_mask64 = ebx;49214921+ &cntr, &fixed_cntr, &ecx, &edx);49224922+ pmu->cntr_mask64 = cntr;49234923+ pmu->fixed_cntr_mask64 = fixed_cntr;49224924 }4923492549244926 if (!intel_pmu_broken_perf_cap()) {···49424940 pmu->intel_ctrl |= 1ULL << GLOBAL_CTRL_EN_PERF_METRICS;49434941 else49444942 pmu->intel_ctrl &= ~(1ULL << GLOBAL_CTRL_EN_PERF_METRICS);49454945-49464946- if (pmu->intel_cap.pebs_output_pt_available)49474947- pmu->pmu.capabilities |= PERF_PMU_CAP_AUX_OUTPUT;49484948- else49494949- pmu->pmu.capabilities &= ~PERF_PMU_CAP_AUX_OUTPUT;4950494349514944 intel_pmu_check_event_constraints(pmu->event_constraints,49524945 pmu->cntr_mask64,···5020502350215024 pr_info("%s PMU driver: ", pmu->name);5022502550235023- if (pmu->intel_cap.pebs_output_pt_available)50245024- pr_cont("PEBS-via-PT ");50255025-50265026 pr_cont("\n");5027502750285028 x86_pmu_show_pmu_cap(&pmu->pmu);···5042504850435049 init_debug_store_on_cpu(cpu);50445050 /*50455045- * Deal with CPUs that don't clear their LBRs on power-up.50515051+ * Deal with CPUs that don't clear their LBRs 
on power-up, and that may50525052+ * even boot with LBRs enabled.50465053 */50545054+ if (!static_cpu_has(X86_FEATURE_ARCH_LBR) && x86_pmu.lbr_nr)50555055+ msr_clear_bit(MSR_IA32_DEBUGCTLMSR, DEBUGCTLMSR_LBR_BIT);50475056 intel_pmu_lbr_reset();5048505750495058 cpuc->lbr_sel = NULL;···63676370 pmu->intel_cap.capabilities = x86_pmu.intel_cap.capabilities;63686371 if (pmu->pmu_type & hybrid_small_tiny) {63696372 pmu->intel_cap.perf_metrics = 0;63706370- pmu->intel_cap.pebs_output_pt_available = 1;63716373 pmu->mid_ack = true;63726374 } else if (pmu->pmu_type & hybrid_big) {63736375 pmu->intel_cap.perf_metrics = 1;63746374- pmu->intel_cap.pebs_output_pt_available = 0;63756376 pmu->late_ack = true;63766377 }63776378 }
+9-1
arch/x86/events/intel/ds.c
···25782578 }25792579 pr_cont("PEBS fmt4%c%s, ", pebs_type, pebs_qual);2580258025812581- if (!is_hybrid() && x86_pmu.intel_cap.pebs_output_pt_available) {25812581+ /*25822582+ * PEBS-via-PT is not supported on hybrid platforms,25832583+ * because not all CPUs of a hybrid machine support it.25842584+ * The global x86_pmu.intel_cap, which only contains the25852585+ * common capabilities, is used to check the availability25862586+ * of the feature. The per-PMU pebs_output_pt_available25872587+ * in a hybrid machine should be ignored.25882588+ */25892589+ if (x86_pmu.intel_cap.pebs_output_pt_available) {25822590 pr_cont("PEBS-via-PT, ");25832591 x86_get_pmu(smp_processor_id())->capabilities |= PERF_PMU_CAP_AUX_OUTPUT;25842592 }
+4-8
arch/x86/events/rapl.c
···370370 unsigned int rapl_pmu_idx;371371 struct rapl_pmus *rapl_pmus;372372373373+ /* only look at RAPL events */374374+ if (event->attr.type != event->pmu->type)375375+ return -ENOENT;376376+373377 /* unsupported modes and filters */374378 if (event->attr.sample_period) /* no sampling */375379 return -EINVAL;···391387 rapl_pmus_scope = rapl_pmus->pmu.scope;392388393389 if (rapl_pmus_scope == PERF_PMU_SCOPE_PKG || rapl_pmus_scope == PERF_PMU_SCOPE_DIE) {394394- /* only look at RAPL package events */395395- if (event->attr.type != rapl_pmus_pkg->pmu.type)396396- return -ENOENT;397397-398390 cfg = array_index_nospec((long)cfg, NR_RAPL_PKG_DOMAINS + 1);399391 if (!cfg || cfg >= NR_RAPL_PKG_DOMAINS + 1)400392 return -EINVAL;···398398 bit = cfg - 1;399399 event->hw.event_base = rapl_model->rapl_pkg_msrs[bit].msr;400400 } else if (rapl_pmus_scope == PERF_PMU_SCOPE_CORE) {401401- /* only look at RAPL core events */402402- if (event->attr.type != rapl_pmus_core->pmu.type)403403- return -ENOENT;404404-405401 cfg = array_index_nospec((long)cfg, NR_RAPL_CORE_DOMAINS + 1);406402 if (!cfg || cfg >= NR_RAPL_PKG_DOMAINS + 1)407403 return -EINVAL;
···188188 * detection/enumeration details:189189 */190190#define ARCH_PERFMON_EXT_LEAF 0x00000023191191-#define ARCH_PERFMON_EXT_UMASK2 0x1192192-#define ARCH_PERFMON_EXT_EQ 0x2193193-#define ARCH_PERFMON_NUM_COUNTER_LEAF_BIT 0x1194191#define ARCH_PERFMON_NUM_COUNTER_LEAF 0x1192192+193193+union cpuid35_eax {194194+ struct {195195+ unsigned int leaf0:1;196196+ /* Counters Sub-Leaf */197197+ unsigned int cntr_subleaf:1;198198+ /* Auto Counter Reload Sub-Leaf */199199+ unsigned int acr_subleaf:1;200200+ /* Events Sub-Leaf */201201+ unsigned int events_subleaf:1;202202+ unsigned int reserved:28;203203+ } split;204204+ unsigned int full;205205+};206206+207207+union cpuid35_ebx {208208+ struct {209209+ /* UnitMask2 Supported */210210+ unsigned int umask2:1;211211+ /* EQ-bit Supported */212212+ unsigned int eq:1;213213+ unsigned int reserved:30;214214+ } split;215215+ unsigned int full;216216+};195217196218/*197219 * Intel Architectural LBR CPUID detection/enumeration details:
+2
arch/x86/include/asm/sev.h
···531531532532#ifdef CONFIG_KVM_AMD_SEV533533bool snp_probe_rmptable_info(void);534534+int snp_rmptable_init(void);534535int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level);535536void snp_dump_hva_rmpentry(unsigned long address);536537int psmash(u64 pfn);···542541void snp_fixup_e820_tables(void);543542#else544543static inline bool snp_probe_rmptable_info(void) { return false; }544544+static inline int snp_rmptable_init(void) { return -ENOSYS; }545545static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENODEV; }546546static inline void snp_dump_hva_rmpentry(unsigned long address) {}547547static inline int psmash(u64 pfn) { return -ENODEV; }
+14-7
arch/x86/kernel/cpu/bugs.c
···1115111511161116 case RETBLEED_MITIGATION_IBPB:11171117 setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);11181118+ setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);11191119+ mitigate_smt = true;1118112011191121 /*11201122 * IBPB on entry already obviates the need for···11251123 */11261124 setup_clear_cpu_cap(X86_FEATURE_UNRET);11271125 setup_clear_cpu_cap(X86_FEATURE_RETHUNK);11281128-11291129- setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);11301130- mitigate_smt = true;1131112611321127 /*11331128 * There is no need for RSB filling: entry_ibpb() ensures···26452646 if (IS_ENABLED(CONFIG_MITIGATION_IBPB_ENTRY)) {26462647 if (has_microcode) {26472648 setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);26492649+ setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);26482650 srso_mitigation = SRSO_MITIGATION_IBPB;2649265126502652 /*···26552655 */26562656 setup_clear_cpu_cap(X86_FEATURE_UNRET);26572657 setup_clear_cpu_cap(X86_FEATURE_RETHUNK);26582658+26592659+ /*26602660+ * There is no need for RSB filling: entry_ibpb() ensures26612661+ * all predictions, including the RSB, are invalidated,26622662+ * regardless of IBPB implementation.26632663+ */26642664+ setup_clear_cpu_cap(X86_FEATURE_RSB_VMEXIT);26582665 }26592666 } else {26602667 pr_err("WARNING: kernel not compiled with MITIGATION_IBPB_ENTRY.\n");···2670266326712664ibpb_on_vmexit:26722665 case SRSO_CMD_IBPB_ON_VMEXIT:26732673- if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {26742674- if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {26662666+ if (IS_ENABLED(CONFIG_MITIGATION_IBPB_ENTRY)) {26672667+ if (has_microcode) {26752668 setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);26762669 srso_mitigation = SRSO_MITIGATION_IBPB_ON_VMEXIT;26772670···26832676 setup_clear_cpu_cap(X86_FEATURE_RSB_VMEXIT);26842677 }26852678 } else {26862686- pr_err("WARNING: kernel not compiled with MITIGATION_SRSO.\n");26872687- }26792679+ pr_err("WARNING: kernel not compiled with MITIGATION_IBPB_ENTRY.\n");26802680+ }26882681 break;26892682 
default:26902683 break;
···22262226 u32 vector;22272227 bool all_cpus;2228222822292229+ if (!lapic_in_kernel(vcpu))22302230+ return HV_STATUS_INVALID_HYPERCALL_INPUT;22312231+22292232 if (hc->code == HVCALL_SEND_IPI) {22302233 if (!hc->fast) {22312234 if (unlikely(kvm_read_guest(kvm, hc->ingpa, &send_ipi,···28552852 ent->eax |= HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED;28562853 ent->eax |= HV_X64_APIC_ACCESS_RECOMMENDED;28572854 ent->eax |= HV_X64_RELAXED_TIMING_RECOMMENDED;28582858- ent->eax |= HV_X64_CLUSTER_IPI_RECOMMENDED;28552855+ if (!vcpu || lapic_in_kernel(vcpu))28562856+ ent->eax |= HV_X64_CLUSTER_IPI_RECOMMENDED;28592857 ent->eax |= HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED;28602858 if (evmcs_ver)28612859 ent->eax |= HV_X64_ENLIGHTENED_VMCS_RECOMMENDED;
+27-8
arch/x86/kvm/mmu/mmu.c
···55405540 union kvm_mmu_page_role root_role;5541554155425542 /* NPT requires CR0.PG=1. */55435543- WARN_ON_ONCE(cpu_role.base.direct);55435543+ WARN_ON_ONCE(cpu_role.base.direct || !cpu_role.base.guest_mode);5544554455455545 root_role = cpu_role.base;55465546 root_role.level = kvm_mmu_get_tdp_level(vcpu);···71207120 kmem_cache_destroy(mmu_page_header_cache);71217121}7122712271237123+static void kvm_wake_nx_recovery_thread(struct kvm *kvm)71247124+{71257125+ /*71267126+ * The NX recovery thread is spawned on-demand at the first KVM_RUN and71277127+ * may not be valid even though the VM is globally visible. Do nothing,71287128+ * as such a VM can't have any possible NX huge pages.71297129+ */71307130+ struct vhost_task *nx_thread = READ_ONCE(kvm->arch.nx_huge_page_recovery_thread);71317131+71327132+ if (nx_thread)71337133+ vhost_task_wake(nx_thread);71347134+}71357135+71237136static int get_nx_huge_pages(char *buffer, const struct kernel_param *kp)71247137{71257138 if (nx_hugepage_mitigation_hard_disabled)···71937180 kvm_mmu_zap_all_fast(kvm);71947181 mutex_unlock(&kvm->slots_lock);7195718271967196- vhost_task_wake(kvm->arch.nx_huge_page_recovery_thread);71837183+ kvm_wake_nx_recovery_thread(kvm);71977184 }71987185 mutex_unlock(&kvm_lock);71997186 }···73287315 mutex_lock(&kvm_lock);7329731673307317 list_for_each_entry(kvm, &vm_list, vm_list)73317331- vhost_task_wake(kvm->arch.nx_huge_page_recovery_thread);73187318+ kvm_wake_nx_recovery_thread(kvm);7332731973337320 mutex_unlock(&kvm_lock);73347321 }···74647451{74657452 struct kvm_arch *ka = container_of(once, struct kvm_arch, nx_once);74667453 struct kvm *kvm = container_of(ka, struct kvm, arch);74547454+ struct vhost_task *nx_thread;7467745574687456 kvm->arch.nx_huge_page_last = get_jiffies_64();74697469- kvm->arch.nx_huge_page_recovery_thread = vhost_task_create(74707470- kvm_nx_huge_page_recovery_worker, kvm_nx_huge_page_recovery_worker_kill,74717471- kvm, "kvm-nx-lpage-recovery");74577457+ nx_thread = 
vhost_task_create(kvm_nx_huge_page_recovery_worker,74587458+ kvm_nx_huge_page_recovery_worker_kill,74597459+ kvm, "kvm-nx-lpage-recovery");7472746074737473- if (kvm->arch.nx_huge_page_recovery_thread)74747474- vhost_task_start(kvm->arch.nx_huge_page_recovery_thread);74617461+ if (!nx_thread)74627462+ return;74637463+74647464+ vhost_task_start(nx_thread);74657465+74667466+ /* Make the task visible only once it is fully started. */74677467+ WRITE_ONCE(kvm->arch.nx_huge_page_recovery_thread, nx_thread);74757468}7476746974777470int kvm_mmu_post_init_vm(struct kvm *kvm)
+5-5
arch/x86/kvm/svm/nested.c
···646646 u32 pause_count12;647647 u32 pause_thresh12;648648649649+ nested_svm_transition_tlb_flush(vcpu);650650+651651+ /* Enter Guest-Mode */652652+ enter_guest_mode(vcpu);653653+649654 /*650655 * Filled at exit: exit_code, exit_code_hi, exit_info_1, exit_info_2,651656 * exit_int_info, exit_int_info_err, next_rip, insn_len, insn_bytes.···766761 vmcb02->control.pause_filter_thresh = 0;767762 }768763 }769769-770770- nested_svm_transition_tlb_flush(vcpu);771771-772772- /* Enter Guest-Mode */773773- enter_guest_mode(vcpu);774764775765 /*776766 * Merge guest and host intercepts - must be called with vcpu in
+10
arch/x86/kvm/svm/sev.c
···29722972 WARN_ON_ONCE(!boot_cpu_has(X86_FEATURE_FLUSHBYASID)))29732973 goto out;2974297429752975+ /*29762976+ * The kernel's initcall infrastructure lacks the ability to express29772977+ * dependencies between initcalls, whereas the modules infrastructure29782978+ * automatically handles dependencies via symbol loading. Ensure the29792979+ * PSP SEV driver is initialized before proceeding if KVM is built-in,29802980+ * as the dependency isn't handled by the initcall infrastructure.29812981+ */29822982+ if (IS_BUILTIN(CONFIG_KVM_AMD) && sev_module_init())29832983+ goto out;29842984+29752985 /* Retrieve SEV CPUID information */29762986 cpuid(0x8000001f, &eax, &ebx, &ecx, &edx);29772987
+6-7
arch/x86/kvm/svm/svm.c
···19911991 svm->asid = sd->next_asid++;19921992}1993199319941994-static void svm_set_dr6(struct vcpu_svm *svm, unsigned long value)19941994+static void svm_set_dr6(struct kvm_vcpu *vcpu, unsigned long value)19951995{19961996- struct vmcb *vmcb = svm->vmcb;19961996+ struct vmcb *vmcb = to_svm(vcpu)->vmcb;1997199719981998- if (svm->vcpu.arch.guest_state_protected)19981998+ if (vcpu->arch.guest_state_protected)19991999 return;2000200020012001 if (unlikely(value != vmcb->save.dr6)) {···42474247 * Run with all-zero DR6 unless needed, so that we can get the exact cause42484248 * of a #DB.42494249 */42504250- if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT))42514251- svm_set_dr6(svm, vcpu->arch.dr6);42524252- else42534253- svm_set_dr6(svm, DR6_ACTIVE_LOW);42504250+ if (likely(!(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)))42514251+ svm_set_dr6(vcpu, DR6_ACTIVE_LOW);4254425242554253 clgi();42564254 kvm_load_guest_xsave_state(vcpu);···50415043 .set_idt = svm_set_idt,50425044 .get_gdt = svm_get_gdt,50435045 .set_gdt = svm_set_gdt,50465046+ .set_dr6 = svm_set_dr6,50445047 .set_dr7 = svm_set_dr7,50455048 .sync_dirty_debug_regs = svm_sync_dirty_debug_regs,50465049 .cache_reg = svm_cache_reg,
···56485648 set_debugreg(DR6_RESERVED, 6);56495649}5650565056515651+void vmx_set_dr6(struct kvm_vcpu *vcpu, unsigned long val)56525652+{56535653+ lockdep_assert_irqs_disabled();56545654+ set_debugreg(vcpu->arch.dr6, 6);56555655+}56565656+56515657void vmx_set_dr7(struct kvm_vcpu *vcpu, unsigned long val)56525658{56535659 vmcs_writel(GUEST_DR7, val);···74227416 vmcs_writel(HOST_CR4, cr4);74237417 vmx->loaded_vmcs->host_state.cr4 = cr4;74247418 }74257425-74267426- /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */74277427- if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT))74287428- set_debugreg(vcpu->arch.dr6, 6);7429741974307420 /* When single-stepping over STI and MOV SS, we must clear the74317421 * corresponding interruptibility bits in the guest state. Otherwise
···1096110961 set_debugreg(vcpu->arch.eff_db[1], 1);1096210962 set_debugreg(vcpu->arch.eff_db[2], 2);1096310963 set_debugreg(vcpu->arch.eff_db[3], 3);1096410964+ /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */1096510965+ if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT))1096610966+ kvm_x86_call(set_dr6)(vcpu, vcpu->arch.dr6);1096410967 } else if (unlikely(hw_breakpoint_active())) {1096510968 set_debugreg(0, 7);1096610969 }···1274412741 "does not run without ignore_msrs=1, please report it to kvm@vger.kernel.org.\n");1274512742 }12746127431274412744+ once_init(&kvm->arch.nx_once);1274712745 return 0;12748127461274912747out_uninit_mmu:···1275212748 kvm_page_track_cleanup(kvm);1275312749out:1275412750 return ret;1275512755-}1275612756-1275712757-int kvm_arch_post_init_vm(struct kvm *kvm)1275812758-{1275912759- once_init(&kvm->arch.nx_once);1276012760- return 0;1276112751}12762127521276312753static void kvm_unload_vcpu_mmu(struct kvm_vcpu *vcpu)
+18-3
arch/x86/um/os-Linux/registers.c
···1818#include <registers.h>1919#include <sys/mman.h>20202121+static unsigned long ptrace_regset;2122unsigned long host_fp_size;22232324int get_fp_registers(int pid, unsigned long *regs)···2827 .iov_len = host_fp_size,2928 };30293131- if (ptrace(PTRACE_GETREGSET, pid, NT_X86_XSTATE, &iov) < 0)3030+ if (ptrace(PTRACE_GETREGSET, pid, ptrace_regset, &iov) < 0)3231 return -errno;3332 return 0;3433}···4039 .iov_len = host_fp_size,4140 };42414343- if (ptrace(PTRACE_SETREGSET, pid, NT_X86_XSTATE, &iov) < 0)4242+ if (ptrace(PTRACE_SETREGSET, pid, ptrace_regset, &iov) < 0)4443 return -errno;4544 return 0;4645}···5958 return -ENOMEM;60596160 /* GDB has x86_xsave_length, which uses x86_cpuid_count */6262- ret = ptrace(PTRACE_GETREGSET, pid, NT_X86_XSTATE, &iov);6161+ ptrace_regset = NT_X86_XSTATE;6262+ ret = ptrace(PTRACE_GETREGSET, pid, ptrace_regset, &iov);6363 if (ret)6464 ret = -errno;6565+6666+ if (ret == -ENODEV) {6767+#ifdef CONFIG_X86_326868+ ptrace_regset = NT_PRXFPREG;6969+#else7070+ ptrace_regset = NT_PRFPREG;7171+#endif7272+ iov.iov_len = 2 * 1024 * 1024;7373+ ret = ptrace(PTRACE_GETREGSET, pid, ptrace_regset, &iov);7474+ if (ret)7575+ ret = -errno;7676+ }7777+6578 munmap(iov.iov_base, 2 * 1024 * 1024);66796780 host_fp_size = iov.iov_len;
+10-3
arch/x86/um/signal.c
···187187 * Put magic/size values for userspace. We do not bother to verify them188188 * later on, however, userspace needs them should it try to read the189189 * XSTATE data. And ptrace does not fill in these parts.190190+ *191191+ * Skip this if we do not have an XSTATE frame.190192 */193193+ if (host_fp_size <= sizeof(to_fp64->fpstate))194194+ return 0;195195+191196 BUILD_BUG_ON(sizeof(int) != FP_XSTATE_MAGIC2_SIZE);192197#ifdef CONFIG_X86_32193198 __put_user(offsetof(struct _fpstate_32, _fxsr_env) +···372367 int err = 0, sig = ksig->sig;373368 unsigned long fp_to;374369375375- frame = (struct rt_sigframe __user *)376376- round_down(stack_top - sizeof(struct rt_sigframe), 16);370370+ frame = (void __user *)stack_top - sizeof(struct rt_sigframe);377371378372 /* Add required space for math frame */379379- frame = (struct rt_sigframe __user *)((unsigned long)frame - math_size);373373+ frame = (void __user *)((unsigned long)frame - math_size);374374+375375+ /* ABI requires 16 byte boundary alignment */376376+ frame = (void __user *)round_down((unsigned long)frame, 16);380377381378 /* Subtract 128 for a red zone and 8 for proper alignment */382379 frame = (struct rt_sigframe __user *) ((unsigned long) frame - 128 - 8);
+7-16
arch/x86/virt/svm/sev.c
···505505 * described in the SNP_INIT_EX firmware command description in the SNP506506 * firmware ABI spec.507507 */508508-static int __init snp_rmptable_init(void)508508+int __init snp_rmptable_init(void)509509{510510 unsigned int i;511511 u64 val;512512513513- if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))514514- return 0;513513+ if (WARN_ON_ONCE(!cc_platform_has(CC_ATTR_HOST_SEV_SNP)))514514+ return -ENOSYS;515515516516- if (!amd_iommu_snp_en)517517- goto nosnp;516516+ if (WARN_ON_ONCE(!amd_iommu_snp_en))517517+ return -ENOSYS;518518519519 if (!setup_rmptable())520520- goto nosnp;520520+ return -ENOSYS;521521522522 /*523523 * Check if SEV-SNP is already enabled, this can happen in case of···530530 /* Zero out the RMP bookkeeping area */531531 if (!clear_rmptable_bookkeeping()) {532532 free_rmp_segment_table();533533- goto nosnp;533533+ return -ENOSYS;534534 }535535536536 /* Zero out the RMP entries */···562562 crash_kexec_post_notifiers = true;563563564564 return 0;565565-566566-nosnp:567567- cc_platform_clear(CC_ATTR_HOST_SEV_SNP);568568- return -ENOSYS;569565}570570-571571-/*572572- * This must be called after the IOMMU has been initialized.573573- */574574-device_initcall(snp_rmptable_init);575566576567static void set_rmp_segment_info(unsigned int segment_shift)577568{
+62-9
arch/x86/xen/mmu_pv.c
···111111 */112112static DEFINE_SPINLOCK(xen_reservation_lock);113113114114+/* Protected by xen_reservation_lock. */115115+#define MIN_CONTIG_ORDER 9 /* 2MB */116116+static unsigned int discontig_frames_order = MIN_CONTIG_ORDER;117117+static unsigned long discontig_frames_early[1UL << MIN_CONTIG_ORDER] __initdata;118118+static unsigned long *discontig_frames __refdata = discontig_frames_early;119119+static bool discontig_frames_dyn;120120+121121+static int alloc_discontig_frames(unsigned int order)122122+{123123+ unsigned long *new_array, *old_array;124124+ unsigned int old_order;125125+ unsigned long flags;126126+127127+ BUG_ON(order < MIN_CONTIG_ORDER);128128+ BUILD_BUG_ON(sizeof(discontig_frames_early) != PAGE_SIZE);129129+130130+ new_array = (unsigned long *)__get_free_pages(GFP_KERNEL,131131+ order - MIN_CONTIG_ORDER);132132+ if (!new_array)133133+ return -ENOMEM;134134+135135+ spin_lock_irqsave(&xen_reservation_lock, flags);136136+137137+ old_order = discontig_frames_order;138138+139139+ if (order > discontig_frames_order || !discontig_frames_dyn) {140140+ if (!discontig_frames_dyn)141141+ old_array = NULL;142142+ else143143+ old_array = discontig_frames;144144+145145+ discontig_frames = new_array;146146+ discontig_frames_order = order;147147+ discontig_frames_dyn = true;148148+ } else {149149+ old_array = new_array;150150+ }151151+152152+ spin_unlock_irqrestore(&xen_reservation_lock, flags);153153+154154+ free_pages((unsigned long)old_array, old_order - MIN_CONTIG_ORDER);155155+156156+ return 0;157157+}158158+114159/*115160 * Note about cr3 (pagetable base) values:116161 *···859814 SetPagePinned(virt_to_page(level3_user_vsyscall));860815#endif861816 xen_pgd_walk(&init_mm, xen_mark_pinned, FIXADDR_TOP);817817+818818+ if (alloc_discontig_frames(MIN_CONTIG_ORDER))819819+ BUG();862820}863821864822static void xen_unpin_page(struct mm_struct *mm, struct page *page,···22512203 memset(dummy_mapping, 0xff, PAGE_SIZE);22522204}2253220522542254-/* Protected by 
xen_reservation_lock. */22552255-#define MAX_CONTIG_ORDER 9 /* 2MB */22562256-static unsigned long discontig_frames[1<<MAX_CONTIG_ORDER];22572257-22582206#define VOID_PTE (mfn_pte(0, __pgprot(0)))22592207static void xen_zap_pfn_range(unsigned long vaddr, unsigned int order,22602208 unsigned long *in_frames,···23672323 unsigned int address_bits,23682324 dma_addr_t *dma_handle)23692325{23702370- unsigned long *in_frames = discontig_frames, out_frame;23262326+ unsigned long *in_frames, out_frame;23712327 unsigned long flags;23722328 int success;23732329 unsigned long vstart = (unsigned long)phys_to_virt(pstart);2374233023752375- if (unlikely(order > MAX_CONTIG_ORDER))23762376- return -ENOMEM;23312331+ if (unlikely(order > discontig_frames_order)) {23322332+ if (!discontig_frames_dyn)23332333+ return -ENOMEM;23342334+23352335+ if (alloc_discontig_frames(order))23362336+ return -ENOMEM;23372337+ }2377233823782339 memset((void *) vstart, 0, PAGE_SIZE << order);2379234023802341 spin_lock_irqsave(&xen_reservation_lock, flags);23422342+23432343+ in_frames = discontig_frames;2381234423822345 /* 1. Zap current PTEs, remembering MFNs. */23832346 xen_zap_pfn_range(vstart, order, in_frames, NULL);···2409235824102359void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)24112360{24122412- unsigned long *out_frames = discontig_frames, in_frame;23612361+ unsigned long *out_frames, in_frame;24132362 unsigned long flags;24142363 int success;24152364 unsigned long vstart;2416236524172417- if (unlikely(order > MAX_CONTIG_ORDER))23662366+ if (unlikely(order > discontig_frames_order))24182367 return;2419236824202369 vstart = (unsigned long)phys_to_virt(pstart);24212370 memset((void *) vstart, 0, PAGE_SIZE << order);2422237124232372 spin_lock_irqsave(&xen_reservation_lock, flags);23732373+23742374+ out_frames = discontig_frames;2424237524252376 /* 1. Find start MFN of contiguous extent. */24262377 in_frame = virt_to_mfn((void *)vstart);
+3-8
arch/x86/xen/xen-head.S
···100100 push %r10101101 push %r9102102 push %r8103103-#ifdef CONFIG_FRAME_POINTER104104- pushq $0 /* Dummy push for stack alignment. */105105-#endif106103#endif107104 /* Set the vendor specific function. */108105 call __xen_hypercall_setfunc···114117 pop %ebx115118 pop %eax116119#else117117- lea xen_hypercall_amd(%rip), %rbx118118- cmp %rax, %rbx119119-#ifdef CONFIG_FRAME_POINTER120120- pop %rax /* Dummy pop. */121121-#endif120120+ lea xen_hypercall_amd(%rip), %rcx121121+ cmp %rax, %rcx122122 pop %r8123123 pop %r9124124 pop %r10···126132 pop %rcx127133 pop %rax128134#endif135135+ FRAME_END129136 /* Use correct hypercall function. */130137 jz xen_hypercall_amd131138 jmp xen_hypercall_intel
+15-3
block/partitions/mac.c
···5353 }5454 secsize = be16_to_cpu(md->block_size);5555 put_dev_sector(sect);5656+5757+ /*5858+ * If the "block size" is not a power of 2, things get weird - we might5959+ * end up with a partition straddling a sector boundary, so we wouldn't6060+ * be able to read a partition entry with read_part_sector().6161+ * Real block sizes are probably (?) powers of two, so just require6262+ * that.6363+ */6464+ if (!is_power_of_2(secsize))6565+ return -1;5666 datasize = round_down(secsize, 512);5767 data = read_part_sector(state, datasize / 512, §);5868 if (!data)5969 return -1;6070 partoffset = secsize % 512;6161- if (partoffset + sizeof(*part) > datasize)7171+ if (partoffset + sizeof(*part) > datasize) {7272+ put_dev_sector(sect);6273 return -1;7474+ }6375 part = (struct mac_partition *) (data + partoffset);6476 if (be16_to_cpu(part->signature) != MAC_PARTITION_MAGIC) {6577 put_dev_sector(sect);···124112 int i, l;125113126114 goodness++;127127- l = strlen(part->name);128128- if (strcmp(part->name, "/") == 0)115115+ l = strnlen(part->name, sizeof(part->name));116116+ if (strncmp(part->name, "/", sizeof(part->name)) == 0)129117 goodness++;130118 for (i = 0; i <= l - 4; ++i) {131119 if (strncasecmp(part->name + i, "root",
+5
drivers/accel/amdxdna/amdxdna_pci_drv.c
···21212222#define AMDXDNA_AUTOSUSPEND_DELAY 5000 /* milliseconds */23232424+MODULE_FIRMWARE("amdnpu/1502_00/npu.sbin");2525+MODULE_FIRMWARE("amdnpu/17f0_10/npu.sbin");2626+MODULE_FIRMWARE("amdnpu/17f0_11/npu.sbin");2727+MODULE_FIRMWARE("amdnpu/17f0_20/npu.sbin");2828+2429/*2530 * Bind the driver base on (vendor_id, device_id) pair and later use the2631 * (device_id, rev_id) pair as a key to select the devices. The devices with
+6-2
drivers/accel/ivpu/ivpu_drv.c
···397397 if (ivpu_fw_is_cold_boot(vdev)) {398398 ret = ivpu_pm_dct_init(vdev);399399 if (ret)400400- goto err_diagnose_failure;400400+ goto err_disable_ipc;401401402402 ret = ivpu_hw_sched_init(vdev);403403 if (ret)404404- goto err_diagnose_failure;404404+ goto err_disable_ipc;405405 }406406407407 return 0;408408409409+err_disable_ipc:410410+ ivpu_ipc_disable(vdev);411411+ ivpu_hw_irq_disable(vdev);412412+ disable_irq(vdev->irq);409413err_diagnose_failure:410414 ivpu_hw_diagnose_failure(vdev);411415 ivpu_mmu_evtq_dump(vdev);
+232
drivers/base/faux.c
···11+// SPDX-License-Identifier: GPL-2.0-only22+/*33+ * Copyright (c) 2025 Greg Kroah-Hartman <gregkh@linuxfoundation.org>44+ * Copyright (c) 2025 The Linux Foundation55+ *66+ * A "simple" faux bus that allows devices to be created and added77+ * automatically to it. This is to be used whenever you need to create a88+ * device that is not associated with any "real" system resources, and do99+ * not want to have to deal with a bus/driver binding logic. It is1010+ * intended to be very simple, with only a create and a destroy function1111+ * available.1212+ */1313+#include <linux/err.h>1414+#include <linux/init.h>1515+#include <linux/slab.h>1616+#include <linux/string.h>1717+#include <linux/container_of.h>1818+#include <linux/device/faux.h>1919+#include "base.h"2020+2121+/*2222+ * Internal wrapper structure so we can hold a pointer to the2323+ * faux_device_ops for this device.2424+ */2525+struct faux_object {2626+ struct faux_device faux_dev;2727+ const struct faux_device_ops *faux_ops;2828+};2929+#define to_faux_object(dev) container_of_const(dev, struct faux_object, faux_dev.dev)3030+3131+static struct device faux_bus_root = {3232+ .init_name = "faux",3333+};3434+3535+static int faux_match(struct device *dev, const struct device_driver *drv)3636+{3737+ /* Match always succeeds, we only have one driver */3838+ return 1;3939+}4040+4141+static int faux_probe(struct device *dev)4242+{4343+ struct faux_object *faux_obj = to_faux_object(dev);4444+ struct faux_device *faux_dev = &faux_obj->faux_dev;4545+ const struct faux_device_ops *faux_ops = faux_obj->faux_ops;4646+ int ret = 0;4747+4848+ if (faux_ops && faux_ops->probe)4949+ ret = faux_ops->probe(faux_dev);5050+5151+ return ret;5252+}5353+5454+static void faux_remove(struct device *dev)5555+{5656+ struct faux_object *faux_obj = to_faux_object(dev);5757+ struct faux_device *faux_dev = &faux_obj->faux_dev;5858+ const struct faux_device_ops *faux_ops = faux_obj->faux_ops;5959+6060+ if (faux_ops && 
faux_ops->remove)6161+ faux_ops->remove(faux_dev);6262+}6363+6464+static const struct bus_type faux_bus_type = {6565+ .name = "faux",6666+ .match = faux_match,6767+ .probe = faux_probe,6868+ .remove = faux_remove,6969+};7070+7171+static struct device_driver faux_driver = {7272+ .name = "faux_driver",7373+ .bus = &faux_bus_type,7474+ .probe_type = PROBE_FORCE_SYNCHRONOUS,7575+};7676+7777+static void faux_device_release(struct device *dev)7878+{7979+ struct faux_object *faux_obj = to_faux_object(dev);8080+8181+ kfree(faux_obj);8282+}8383+8484+/**8585+ * faux_device_create_with_groups - Create and register with the driver8686+ * core a faux device and populate the device with an initial8787+ * set of sysfs attributes.8888+ * @name: The name of the device we are adding, must be unique for8989+ * all faux devices.9090+ * @parent: Pointer to a potential parent struct device. If set to9191+ * NULL, the device will be created in the "root" of the faux9292+ * device tree in sysfs.9393+ * @faux_ops: struct faux_device_ops that the new device will call back9494+ * into, can be NULL.9595+ * @groups: The set of sysfs attributes that will be created for this9696+ * device when it is registered with the driver core.9797+ *9898+ * Create a new faux device and register it in the driver core properly.9999+ * If present, callbacks in @faux_ops will be called with the device that100100+ * for the caller to do something with at the proper time given the101101+ * device's lifecycle.102102+ *103103+ * Note, when this function is called, the functions specified in struct104104+ * faux_ops can be called before the function returns, so be prepared for105105+ * everything to be properly initialized before that point in time.106106+ *107107+ * Return:108108+ * * NULL if an error happened with creating the device109109+ * * pointer to a valid struct faux_device that is registered with sysfs110110+ */111111+struct faux_device *faux_device_create_with_groups(const char *name,112112+ struct 
device *parent,113113+ const struct faux_device_ops *faux_ops,114114+ const struct attribute_group **groups)115115+{116116+ struct faux_object *faux_obj;117117+ struct faux_device *faux_dev;118118+ struct device *dev;119119+ int ret;120120+121121+ faux_obj = kzalloc(sizeof(*faux_obj), GFP_KERNEL);122122+ if (!faux_obj)123123+ return NULL;124124+125125+ /* Save off the callbacks so we can use them in the future */126126+ faux_obj->faux_ops = faux_ops;127127+128128+ /* Initialize the device portion and register it with the driver core */129129+ faux_dev = &faux_obj->faux_dev;130130+ dev = &faux_dev->dev;131131+132132+ device_initialize(dev);133133+ dev->release = faux_device_release;134134+ if (parent)135135+ dev->parent = parent;136136+ else137137+ dev->parent = &faux_bus_root;138138+ dev->bus = &faux_bus_type;139139+ dev->groups = groups;140140+ dev_set_name(dev, "%s", name);141141+142142+ ret = device_add(dev);143143+ if (ret) {144144+ pr_err("%s: device_add for faux device '%s' failed with %d\n",145145+ __func__, name, ret);146146+ put_device(dev);147147+ return NULL;148148+ }149149+150150+ return faux_dev;151151+}152152+EXPORT_SYMBOL_GPL(faux_device_create_with_groups);153153+154154+/**155155+ * faux_device_create - create and register with the driver core a faux device156156+ * @name: The name of the device we are adding, must be unique for all157157+ * faux devices.158158+ * @parent: Pointer to a potential parent struct device. 
If set to159159+ * NULL, the device will be created in the "root" of the faux160160+ * device tree in sysfs.161161+ * @faux_ops: struct faux_device_ops that the new device will call back162162+ * into, can be NULL.163163+ *164164+ * Create a new faux device and register it in the driver core properly.165165+ * If present, callbacks in @faux_ops will be called with the device that166166+ * for the caller to do something with at the proper time given the167167+ * device's lifecycle.168168+ *169169+ * Note, when this function is called, the functions specified in struct170170+ * faux_ops can be called before the function returns, so be prepared for171171+ * everything to be properly initialized before that point in time.172172+ *173173+ * Return:174174+ * * NULL if an error happened with creating the device175175+ * * pointer to a valid struct faux_device that is registered with sysfs176176+ */177177+struct faux_device *faux_device_create(const char *name,178178+ struct device *parent,179179+ const struct faux_device_ops *faux_ops)180180+{181181+ return faux_device_create_with_groups(name, parent, faux_ops, NULL);182182+}183183+EXPORT_SYMBOL_GPL(faux_device_create);184184+185185+/**186186+ * faux_device_destroy - destroy a faux device187187+ * @faux_dev: faux device to destroy188188+ *189189+ * Unregisters and cleans up a device that was created with a call to190190+ * faux_device_create()191191+ */192192+void faux_device_destroy(struct faux_device *faux_dev)193193+{194194+ struct device *dev = &faux_dev->dev;195195+196196+ if (!faux_dev)197197+ return;198198+199199+ device_del(dev);200200+201201+ /* The final put_device() will clean up the memory we allocated for this device. 
*/202202+ put_device(dev);203203+}204204+EXPORT_SYMBOL_GPL(faux_device_destroy);205205+206206+int __init faux_bus_init(void)207207+{208208+ int ret;209209+210210+ ret = device_register(&faux_bus_root);211211+ if (ret) {212212+ put_device(&faux_bus_root);213213+ return ret;214214+ }215215+216216+ ret = bus_register(&faux_bus_type);217217+ if (ret)218218+ goto error_bus;219219+220220+ ret = driver_register(&faux_driver);221221+ if (ret)222222+ goto error_driver;223223+224224+ return ret;225225+226226+error_driver:227227+ bus_unregister(&faux_bus_type);228228+229229+error_bus:230230+ device_unregister(&faux_bus_root);231231+ return ret;232232+}
+1
drivers/base/init.c
···3232 /* These are also core pieces, but must come after the3333 * core core pieces.3434 */3535+ faux_bus_init();3536 of_core_init();3637 platform_bus_init();3738 auxiliary_bus_init();
+2-1
drivers/cpufreq/Kconfig.arm
···17171818config ARM_AIROHA_SOC_CPUFREQ1919 tristate "Airoha EN7581 SoC CPUFreq support"2020- depends on (ARCH_AIROHA && OF) || COMPILE_TEST2020+ depends on ARCH_AIROHA || COMPILE_TEST2121+ depends on OF2122 select PM_OPP2223 default ARCH_AIROHA2324 help
+10-10
drivers/cpufreq/amd-pstate.c
···699699 if (min_perf < lowest_nonlinear_perf)700700 min_perf = lowest_nonlinear_perf;701701702702- max_perf = cap_perf;702702+ max_perf = cpudata->max_limit_perf;703703 if (max_perf < min_perf)704704 max_perf = min_perf;705705···747747 guard(mutex)(&amd_pstate_driver_lock);748748749749 ret = amd_pstate_cpu_boost_update(policy, state);750750- policy->boost_enabled = !ret ? state : false;751750 refresh_frequency_limits(policy);752751753752 return ret;···821822822823static void amd_pstate_update_limits(unsigned int cpu)823824{824824- struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);825825+ struct cpufreq_policy *policy = NULL;825826 struct amd_cpudata *cpudata;826827 u32 prev_high = 0, cur_high = 0;827828 int ret;828829 bool highest_perf_changed = false;829830831831+ if (!amd_pstate_prefcore)832832+ return;833833+834834+ policy = cpufreq_cpu_get(cpu);830835 if (!policy)831836 return;832837833838 cpudata = policy->driver_data;834839835835- if (!amd_pstate_prefcore)836836- return;837837-838840 guard(mutex)(&amd_pstate_driver_lock);839841840842 ret = amd_get_highest_perf(cpu, &cur_high);841841- if (ret)842842- goto free_cpufreq_put;843843+ if (ret) {844844+ cpufreq_cpu_put(policy);845845+ return;846846+ }843847844848 prev_high = READ_ONCE(cpudata->prefcore_ranking);845849 highest_perf_changed = (prev_high != cur_high);···852850 if (cur_high < CPPC_MAX_PERF)853851 sched_set_itmt_core_prio((int)cur_high, cpu);854852 }855855-856856-free_cpufreq_put:857853 cpufreq_cpu_put(policy);858854859855 if (!highest_perf_changed)
+2-1
drivers/cpufreq/cpufreq.c
···15711571 policy->cdev = of_cpufreq_cooling_register(policy);1572157215731573 /* Let the per-policy boost flag mirror the cpufreq_driver boost during init */15741574- if (policy->boost_enabled != cpufreq_boost_enabled()) {15741574+ if (cpufreq_driver->set_boost &&15751575+ policy->boost_enabled != cpufreq_boost_enabled()) {15751576 policy->boost_enabled = cpufreq_boost_enabled();15761577 ret = cpufreq_driver->set_boost(policy, policy->boost_enabled);15771578 if (ret) {
+14
drivers/crypto/ccp/sp-dev.c
···1919#include <linux/types.h>2020#include <linux/ccp.h>21212222+#include "sev-dev.h"2223#include "ccp-dev.h"2324#include "sp-dev.h"2425···254253static int __init sp_mod_init(void)255254{256255#ifdef CONFIG_X86256256+ static bool initialized;257257 int ret;258258+259259+ if (initialized)260260+ return 0;258261259262 ret = sp_pci_init();260263 if (ret)···267262#ifdef CONFIG_CRYPTO_DEV_SP_PSP268263 psp_pci_init();269264#endif265265+266266+ initialized = true;270267271268 return 0;272269#endif···285278286279 return -ENODEV;287280}281281+282282+#if IS_BUILTIN(CONFIG_KVM_AMD) && IS_ENABLED(CONFIG_KVM_AMD_SEV)283283+int __init sev_module_init(void)284284+{285285+ return sp_mod_init();286286+}287287+#endif288288289289static void __exit sp_mod_exit(void)290290{
+14-3
drivers/dma/tegra210-adma.c
···887887 const struct tegra_adma_chip_data *cdata;888888 struct tegra_adma *tdma;889889 struct resource *res_page, *res_base;890890- int ret, i, page_no;890890+ int ret, i;891891892892 cdata = of_device_get_match_data(&pdev->dev);893893 if (!cdata) {···914914915915 res_base = platform_get_resource_byname(pdev, IORESOURCE_MEM, "global");916916 if (res_base) {917917- page_no = (res_page->start - res_base->start) / cdata->ch_base_offset;918918- if (page_no <= 0)917917+ resource_size_t page_offset, page_no;918918+ unsigned int ch_base_offset;919919+920920+ if (res_page->start < res_base->start)919921 return -EINVAL;922922+ page_offset = res_page->start - res_base->start;923923+ ch_base_offset = cdata->ch_base_offset;924924+ if (!ch_base_offset)925925+ return -EINVAL;926926+927927+ page_no = div_u64(page_offset, ch_base_offset);928928+ if (!page_no || page_no > INT_MAX)929929+ return -EINVAL;930930+920931 tdma->ch_page_no = page_no - 1;921932 tdma->base_addr = devm_ioremap_resource(&pdev->dev, res_base);922933 if (IS_ERR(tdma->base_addr))
+1-1
drivers/firmware/Kconfig
···106106 select ISCSI_BOOT_SYSFS107107 select ISCSI_IBFT_FIND if X86108108 depends on ACPI && SCSI && SCSI_LOWLEVEL109109- default n109109+ default n110110 help111111 This option enables support for detection and exposing of iSCSI112112 Boot Firmware Table (iBFT) via sysfs to userspace. If you wish to
+3
drivers/firmware/efi/libstub/randomalloc.c
···2525 if (md->type != EFI_CONVENTIONAL_MEMORY)2626 return 0;27272828+ if (md->attribute & EFI_MEMORY_HOT_PLUGGABLE)2929+ return 0;3030+2831 if (efi_soft_reserve_enabled() &&2932 (md->attribute & EFI_MEMORY_SP))3033 return 0;
+3
drivers/firmware/efi/libstub/relocate.c
···5353 if (desc->type != EFI_CONVENTIONAL_MEMORY)5454 continue;55555656+ if (desc->attribute & EFI_MEMORY_HOT_PLUGGABLE)5757+ continue;5858+5659 if (efi_soft_reserve_enabled() &&5760 (desc->attribute & EFI_MEMORY_SP))5861 continue;
+4-1
drivers/firmware/iscsi_ibft.c
···310310 str += sprintf_ipaddr(str, nic->ip_addr);311311 break;312312 case ISCSI_BOOT_ETH_SUBNET_MASK:313313- val = cpu_to_be32(~((1 << (32-nic->subnet_mask_prefix))-1));313313+ if (nic->subnet_mask_prefix > 32)314314+ val = cpu_to_be32(~0);315315+ else316316+ val = cpu_to_be32(~((1 << (32-nic->subnet_mask_prefix))-1));314317 str += sprintf(str, "%pI4", &val);315318 break;316319 case ISCSI_BOOT_ETH_PREFIX_LEN:
+1
drivers/gpio/Kconfig
···338338339339config GPIO_GRGPIO340340 tristate "Aeroflex Gaisler GRGPIO support"341341+ depends on OF || COMPILE_TEST341342 select GPIO_GENERIC342343 select IRQ_DOMAIN343344 help
+58-13
drivers/gpio/gpio-bcm-kona.c
···6969struct bcm_kona_gpio_bank {7070 int id;7171 int irq;7272+ /*7373+ * Used to keep track of lock/unlock operations for each GPIO in the7474+ * bank.7575+ *7676+ * All GPIOs are locked by default (see bcm_kona_gpio_reset), and the7777+ * unlock count for all GPIOs is 0 by default. Each unlock increments7878+ * the counter, and each lock decrements the counter.7979+ *8080+ * The lock function only locks the GPIO once its unlock counter is8181+ * down to 0. This is necessary because the GPIO is unlocked in two8282+ * places in this driver: once for requested GPIOs, and once for8383+ * requested IRQs. Since it is possible for a GPIO to be requested8484+ * as both a GPIO and an IRQ, we need to ensure that we don't lock it8585+ * too early.8686+ */8787+ u8 gpio_unlock_count[GPIO_PER_BANK];7288 /* Used in the interrupt handler */7389 struct bcm_kona_gpio *kona_gpio;7490};···10286 u32 val;10387 unsigned long flags;10488 int bank_id = GPIO_BANK(gpio);8989+ int bit = GPIO_BIT(gpio);9090+ struct bcm_kona_gpio_bank *bank = &kona_gpio->banks[bank_id];10591106106- raw_spin_lock_irqsave(&kona_gpio->lock, flags);9292+ if (bank->gpio_unlock_count[bit] == 0) {9393+ dev_err(kona_gpio->gpio_chip.parent,9494+ "Unbalanced locks for GPIO %u\n", gpio);9595+ return;9696+ }10797108108- val = readl(kona_gpio->reg_base + GPIO_PWD_STATUS(bank_id));109109- val |= BIT(gpio);110110- bcm_kona_gpio_write_lock_regs(kona_gpio->reg_base, bank_id, val);9898+ if (--bank->gpio_unlock_count[bit] == 0) {9999+ raw_spin_lock_irqsave(&kona_gpio->lock, flags);111100112112- raw_spin_unlock_irqrestore(&kona_gpio->lock, flags);101101+ val = readl(kona_gpio->reg_base + GPIO_PWD_STATUS(bank_id));102102+ val |= BIT(bit);103103+ bcm_kona_gpio_write_lock_regs(kona_gpio->reg_base, bank_id, val);104104+105105+ raw_spin_unlock_irqrestore(&kona_gpio->lock, flags);106106+ }113107}114108115109static void bcm_kona_gpio_unlock_gpio(struct bcm_kona_gpio *kona_gpio,···128102 u32 val;129103 unsigned long flags;130104 int 
bank_id = GPIO_BANK(gpio);105105+ int bit = GPIO_BIT(gpio);106106+ struct bcm_kona_gpio_bank *bank = &kona_gpio->banks[bank_id];131107132132- raw_spin_lock_irqsave(&kona_gpio->lock, flags);108108+ if (bank->gpio_unlock_count[bit] == 0) {109109+ raw_spin_lock_irqsave(&kona_gpio->lock, flags);133110134134- val = readl(kona_gpio->reg_base + GPIO_PWD_STATUS(bank_id));135135- val &= ~BIT(gpio);136136- bcm_kona_gpio_write_lock_regs(kona_gpio->reg_base, bank_id, val);111111+ val = readl(kona_gpio->reg_base + GPIO_PWD_STATUS(bank_id));112112+ val &= ~BIT(bit);113113+ bcm_kona_gpio_write_lock_regs(kona_gpio->reg_base, bank_id, val);137114138138- raw_spin_unlock_irqrestore(&kona_gpio->lock, flags);115115+ raw_spin_unlock_irqrestore(&kona_gpio->lock, flags);116116+ }117117+118118+ ++bank->gpio_unlock_count[bit];139119}140120141121static int bcm_kona_gpio_get_dir(struct gpio_chip *chip, unsigned gpio)···392360393361 kona_gpio = irq_data_get_irq_chip_data(d);394362 reg_base = kona_gpio->reg_base;363363+395364 raw_spin_lock_irqsave(&kona_gpio->lock, flags);396365397366 val = readl(reg_base + GPIO_INT_MASK(bank_id));···415382416383 kona_gpio = irq_data_get_irq_chip_data(d);417384 reg_base = kona_gpio->reg_base;385385+418386 raw_spin_lock_irqsave(&kona_gpio->lock, flags);419387420388 val = readl(reg_base + GPIO_INT_MSKCLR(bank_id));···511477static int bcm_kona_gpio_irq_reqres(struct irq_data *d)512478{513479 struct bcm_kona_gpio *kona_gpio = irq_data_get_irq_chip_data(d);480480+ unsigned int gpio = d->hwirq;514481515515- return gpiochip_reqres_irq(&kona_gpio->gpio_chip, d->hwirq);482482+ /*483483+ * We need to unlock the GPIO before any other operations are performed484484+ * on the relevant GPIO configuration registers485485+ */486486+ bcm_kona_gpio_unlock_gpio(kona_gpio, gpio);487487+488488+ return gpiochip_reqres_irq(&kona_gpio->gpio_chip, gpio);516489}517490518491static void bcm_kona_gpio_irq_relres(struct irq_data *d)519492{520493 struct bcm_kona_gpio *kona_gpio = 
irq_data_get_irq_chip_data(d);494494+ unsigned int gpio = d->hwirq;521495522522- gpiochip_relres_irq(&kona_gpio->gpio_chip, d->hwirq);496496+ /* Once we no longer use it, lock the GPIO again */497497+ bcm_kona_gpio_lock_gpio(kona_gpio, gpio);498498+499499+ gpiochip_relres_irq(&kona_gpio->gpio_chip, gpio);523500}524501525502static struct irq_chip bcm_gpio_irq_chip = {···659614 bank->irq = platform_get_irq(pdev, i);660615 bank->kona_gpio = kona_gpio;661616 if (bank->irq < 0) {662662- dev_err(dev, "Couldn't get IRQ for bank %d", i);617617+ dev_err(dev, "Couldn't get IRQ for bank %d\n", i);663618 ret = -ENOENT;664619 goto err_irq_domain;665620 }
-19
drivers/gpio/gpio-pca953x.c
···841841 DECLARE_BITMAP(trigger, MAX_LINE);842842 int ret;843843844844- if (chip->driver_data & PCA_PCAL) {845845- /* Read the current interrupt status from the device */846846- ret = pca953x_read_regs(chip, PCAL953X_INT_STAT, trigger);847847- if (ret)848848- return false;849849-850850- /* Check latched inputs and clear interrupt status */851851- ret = pca953x_read_regs(chip, chip->regs->input, cur_stat);852852- if (ret)853853- return false;854854-855855- /* Apply filter for rising/falling edge selection */856856- bitmap_replace(new_stat, chip->irq_trig_fall, chip->irq_trig_raise, cur_stat, gc->ngpio);857857-858858- bitmap_and(pending, new_stat, trigger, gc->ngpio);859859-860860- return !bitmap_empty(pending, gc->ngpio);861861- }862862-863844 ret = pca953x_read_regs(chip, chip->regs->input, cur_stat);864845 if (ret)865846 return false;
+8-5
drivers/gpio/gpio-sim.c
···10281028 struct configfs_subsystem *subsys = dev->group.cg_subsys;10291029 struct gpio_sim_bank *bank;10301030 struct gpio_sim_line *line;10311031+ struct config_item *item;1031103210321033 /*10331033- * The device only needs to depend on leaf line entries. This is10341034+ * The device only needs to depend on leaf entries. This is10341035 * sufficient to lock up all the configfs entries that the10351036 * instantiated, alive device depends on.10361037 */10371038 list_for_each_entry(bank, &dev->bank_list, siblings) {10381039 list_for_each_entry(line, &bank->line_list, siblings) {10401040+ item = line->hog ? &line->hog->item10411041+ : &line->group.cg_item;10421042+10391043 if (lock)10401040- WARN_ON(configfs_depend_item_unlocked(10411041- subsys, &line->group.cg_item));10441044+ WARN_ON(configfs_depend_item_unlocked(subsys,10451045+ item));10421046 else10431043- configfs_undepend_item_unlocked(10441044- &line->group.cg_item);10471047+ configfs_undepend_item_unlocked(item);10451048 }10461049 }10471050}
+12-3
drivers/gpio/gpio-stmpe.c
···191191 [REG_IE][CSB] = STMPE_IDX_IEGPIOR_CSB,192192 [REG_IE][MSB] = STMPE_IDX_IEGPIOR_MSB,193193 };194194- int i, j;194194+ int ret, i, j;195195196196 /*197197 * STMPE1600: to be able to get IRQ from pins,···199199 * GPSR or GPCR registers200200 */201201 if (stmpe->partnum == STMPE1600) {202202- stmpe_reg_read(stmpe, stmpe->regs[STMPE_IDX_GPMR_LSB]);203203- stmpe_reg_read(stmpe, stmpe->regs[STMPE_IDX_GPMR_CSB]);202202+ ret = stmpe_reg_read(stmpe, stmpe->regs[STMPE_IDX_GPMR_LSB]);203203+ if (ret < 0) {204204+ dev_err(stmpe->dev, "Failed to read GPMR_LSB: %d\n", ret);205205+ goto err;206206+ }207207+ ret = stmpe_reg_read(stmpe, stmpe->regs[STMPE_IDX_GPMR_CSB]);208208+ if (ret < 0) {209209+ dev_err(stmpe->dev, "Failed to read GPMR_CSB: %d\n", ret);210210+ goto err;211211+ }204212 }205213206214 for (i = 0; i < CACHE_NR_REGS; i++) {···230222 }231223 }232224225225+err:233226 mutex_unlock(&stmpe_gpio->irq_lock);234227}235228
+3-3
drivers/gpio/gpiolib.c
···904904 }905905906906 if (gc->ngpio == 0) {907907- chip_err(gc, "tried to insert a GPIO chip with zero lines\n");907907+ dev_err(dev, "tried to insert a GPIO chip with zero lines\n");908908 return -EINVAL;909909 }910910911911 if (gc->ngpio > FASTPATH_NGPIO)912912- chip_warn(gc, "line cnt %u is greater than fast path cnt %u\n",913913- gc->ngpio, FASTPATH_NGPIO);912912+ dev_warn(dev, "line cnt %u is greater than fast path cnt %u\n",913913+ gc->ngpio, FASTPATH_NGPIO);914914915915 return 0;916916}
+3-2
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
···38153815 if (err == -ENODEV) {38163816 dev_warn(adev->dev, "cap microcode does not exist, skip\n");38173817 err = 0;38183818- goto out;38183818+ } else {38193819+ dev_err(adev->dev, "fail to initialize cap microcode\n");38193820 }38203820- dev_err(adev->dev, "fail to initialize cap microcode\n");38213821+ goto out;38213822 }3822382338233824 info = &adev->firmware.ucode[AMDGPU_UCODE_ID_CAP];
+6-2
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
···309309 mutex_lock(&adev->mman.gtt_window_lock);310310 while (src_mm.remaining) {311311 uint64_t from, to, cur_size, tiling_flags;312312- uint32_t num_type, data_format, max_com;312312+ uint32_t num_type, data_format, max_com, write_compress_disable;313313 struct dma_fence *next;314314315315 /* Never copy more than 256MiB at once to avoid a timeout */···340340 max_com = AMDGPU_TILING_GET(tiling_flags, GFX12_DCC_MAX_COMPRESSED_BLOCK);341341 num_type = AMDGPU_TILING_GET(tiling_flags, GFX12_DCC_NUMBER_TYPE);342342 data_format = AMDGPU_TILING_GET(tiling_flags, GFX12_DCC_DATA_FORMAT);343343+ write_compress_disable =344344+ AMDGPU_TILING_GET(tiling_flags, GFX12_DCC_WRITE_COMPRESS_DISABLE);343345 copy_flags |= (AMDGPU_COPY_FLAGS_SET(MAX_COMPRESSED, max_com) |344346 AMDGPU_COPY_FLAGS_SET(NUMBER_TYPE, num_type) |345345- AMDGPU_COPY_FLAGS_SET(DATA_FORMAT, data_format));347347+ AMDGPU_COPY_FLAGS_SET(DATA_FORMAT, data_format) |348348+ AMDGPU_COPY_FLAGS_SET(WRITE_COMPRESS_DISABLE,349349+ write_compress_disable));346350 }347351348352 r = amdgpu_copy_buffer(ring, from, to, cur_size, resv,
···10491049 s_rfe_b64 s_restore_pc_lo //Return to the main shader program and resume execution1050105010511051L_END_PGM:10521052+ // Make sure that no wave of the workgroup can exit the trap handler10531053+ // before the workgroup barrier state is saved.10541054+ s_barrier_signal -210551055+ s_barrier_wait -210521056 s_endpgm_saved10531057end10541058
+1-1
drivers/gpu/drm/amd/display/dc/core/dc.c
···2133213321342134 dc_enable_stereo(dc, context, dc_streams, context->stream_count);2135213521362136- if (context->stream_count > get_seamless_boot_stream_count(context) ||21362136+ if (get_seamless_boot_stream_count(context) == 0 ||21372137 context->stream_count == 0) {21382138 /* Must wait for no flips to be pending before doing optimize bw */21392139 hwss_wait_for_no_pipes_pending(dc, context);
+1-1
drivers/gpu/drm/ast/ast_dp.c
···195195 if (enabled)196196 vgacrdf_test |= AST_IO_VGACRDF_DP_VIDEO_ENABLE;197197198198- for (i = 0; i < 200; ++i) {198198+ for (i = 0; i < 1000; ++i) {199199 if (i)200200 mdelay(1);201201 vgacrdf = ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xdf,
+3-11
drivers/gpu/drm/display/drm_dp_cec.c
···311311 if (!aux->transfer)312312 return;313313314314-#ifndef CONFIG_MEDIA_CEC_RC315315- /*316316- * CEC_CAP_RC is part of CEC_CAP_DEFAULTS, but it is stripped by317317- * cec_allocate_adapter() if CONFIG_MEDIA_CEC_RC is undefined.318318- *319319- * Do this here as well to ensure the tests against cec_caps are320320- * correct.321321- */322322- cec_caps &= ~CEC_CAP_RC;323323-#endif324314 cancel_delayed_work_sync(&aux->cec.unregister_work);325315326316 mutex_lock(&aux->cec.lock);···327337 num_las = CEC_MAX_LOG_ADDRS;328338329339 if (aux->cec.adap) {330330- if (aux->cec.adap->capabilities == cec_caps &&340340+ /* Check if the adapter properties have changed */341341+ if ((aux->cec.adap->capabilities & CEC_CAP_MONITOR_ALL) ==342342+ (cec_caps & CEC_CAP_MONITOR_ALL) &&331343 aux->cec.adap->available_log_addrs == num_las) {332344 /* Unchanged, so just set the phys addr */333345 cec_s_phys_addr(aux->cec.adap, source_physical_address, false);
+1-5
drivers/gpu/drm/i915/gem/i915_gem_shmem.c
···209209 struct address_space *mapping = obj->base.filp->f_mapping;210210 unsigned int max_segment = i915_sg_segment_size(i915->drm.dev);211211 struct sg_table *st;212212- struct sgt_iter sgt_iter;213213- struct page *page;214212 int ret;215213216214 /*···237239 * for PAGE_SIZE chunks instead may be helpful.238240 */239241 if (max_segment > PAGE_SIZE) {240240- for_each_sgt_page(page, sgt_iter, st)241241- put_page(page);242242- sg_free_table(st);242242+ shmem_sg_free_table(st, mapping, false, false);243243 kfree(st);244244245245 max_segment = PAGE_SIZE;
+30-6
drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
···14691469 spin_unlock_irqrestore(&guc->timestamp.lock, flags);14701470}1471147114721472+static void __update_guc_busyness_running_state(struct intel_guc *guc)14731473+{14741474+ struct intel_gt *gt = guc_to_gt(guc);14751475+ struct intel_engine_cs *engine;14761476+ enum intel_engine_id id;14771477+ unsigned long flags;14781478+14791479+ spin_lock_irqsave(&guc->timestamp.lock, flags);14801480+ for_each_engine(engine, gt, id)14811481+ engine->stats.guc.running = false;14821482+ spin_unlock_irqrestore(&guc->timestamp.lock, flags);14831483+}14841484+14721485static void __update_guc_busyness_stats(struct intel_guc *guc)14731486{14741487 struct intel_gt *gt = guc_to_gt(guc);···1631161816321619 if (!guc_submission_initialized(guc))16331620 return;16211621+16221622+ /* Assume no engines are running and set running state to false */16231623+ __update_guc_busyness_running_state(guc);1634162416351625 /*16361626 * There is a race with suspend flow where the worker runs after suspend···55355519{55365520 drm_printf(p, "GuC lrc descriptor %u:\n", ce->guc_id.id);55375521 drm_printf(p, "\tHW Context Desc: 0x%08x\n", ce->lrc.lrca);55385538- drm_printf(p, "\t\tLRC Head: Internal %u, Memory %u\n",55395539- ce->ring->head,55405540- ce->lrc_reg_state[CTX_RING_HEAD]);55415541- drm_printf(p, "\t\tLRC Tail: Internal %u, Memory %u\n",55425542- ce->ring->tail,55435543- ce->lrc_reg_state[CTX_RING_TAIL]);55225522+ if (intel_context_pin_if_active(ce)) {55235523+ drm_printf(p, "\t\tLRC Head: Internal %u, Memory %u\n",55245524+ ce->ring->head,55255525+ ce->lrc_reg_state[CTX_RING_HEAD]);55265526+ drm_printf(p, "\t\tLRC Tail: Internal %u, Memory %u\n",55275527+ ce->ring->tail,55285528+ ce->lrc_reg_state[CTX_RING_TAIL]);55295529+ intel_context_unpin(ce);55305530+ } else {55315531+ drm_printf(p, "\t\tLRC Head: Internal %u, Memory not pinned\n",55325532+ ce->ring->head);55335533+ drm_printf(p, "\t\tLRC Tail: Internal %u, Memory not pinned\n",55345534+ ce->ring->tail);55355535+ }55445536 
drm_printf(p, "\t\tContext Pin Count: %u\n",55455537 atomic_read(&ce->pin_count));55465538 drm_printf(p, "\t\tGuC ID Ref Count: %u\n",
+2-2
drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
···168168 return PTR_ERR(ppgtt);169169170170 if (!ppgtt->vm.allocate_va_range)171171- goto err_ppgtt_cleanup;171171+ goto ppgtt_vm_put;172172173173 /*174174 * While we only allocate the page tables here and so we could···236236 goto retry;237237 }238238 i915_gem_ww_ctx_fini(&ww);239239-239239+ppgtt_vm_put:240240 i915_vm_put(&ppgtt->vm);241241 return err;242242}
+1
drivers/gpu/drm/panthor/panthor_drv.c
···802802{803803 int prio;804804805805+ memset(arg, 0, sizeof(*arg));805806 for (prio = PANTHOR_GROUP_PRIORITY_REALTIME; prio >= 0; prio--) {806807 if (!group_priority_permit(file, prio))807808 arg->allowed_mask |= BIT(prio);
+15-27
drivers/gpu/drm/xe/xe_devcoredump.c
···119119 drm_puts(&p, "\n**** GuC CT ****\n");120120 xe_guc_ct_snapshot_print(ss->guc.ct, &p);121121122122- /*123123- * Don't add a new section header here because the mesa debug decoder124124- * tool expects the context information to be in the 'GuC CT' section.125125- */126126- /* drm_puts(&p, "\n**** Contexts ****\n"); */122122+ drm_puts(&p, "\n**** Contexts ****\n");127123 xe_guc_exec_queue_snapshot_print(ss->ge, &p);128124129125 drm_puts(&p, "\n**** Job ****\n");···391395/**392396 * xe_print_blob_ascii85 - print a BLOB to some useful location in ASCII85393397 *394394- * The output is split to multiple lines because some print targets, e.g. dmesg395395- * cannot handle arbitrarily long lines. Note also that printing to dmesg in396396- * piece-meal fashion is not possible, each separate call to drm_puts() has a397397- * line-feed automatically added! Therefore, the entire output line must be398398- * constructed in a local buffer first, then printed in one atomic output call.398398+ * The output is split into multiple calls to drm_puts() because some print399399+ * targets, e.g. dmesg, cannot handle arbitrarily long lines. These targets may400400+ * add newlines, as is the case with dmesg: each drm_puts() call creates a401401+ * separate line.399402 *400403 * There is also a scheduler yield call to prevent the 'task has been stuck for401404 * 120s' kernel hang check feature from firing when printing to a slow target402405 * such as dmesg over a serial port.403406 *404404- * TODO: Add compression prior to the ASCII85 encoding to shrink huge buffers down.405405- *406407 * @p: the printer object to output to407408 * @prefix: optional prefix to add to output string409409+ * @suffix: optional suffix to add at the end. 
0 disables it and is410410+ * not added to the output, which is useful when using multiple calls411411+ * to dump data to @p408412 * @blob: the Binary Large OBject to dump out409413 * @offset: offset in bytes to skip from the front of the BLOB, must be a multiple of sizeof(u32)410414 * @size: the size in bytes of the BLOB, must be a multiple of sizeof(u32)411415 */412412-void xe_print_blob_ascii85(struct drm_printer *p, const char *prefix,416416+void xe_print_blob_ascii85(struct drm_printer *p, const char *prefix, char suffix,413417 const void *blob, size_t offset, size_t size)414418{415419 const u32 *blob32 = (const u32 *)blob;416420 char buff[ASCII85_BUFSZ], *line_buff;417421 size_t line_pos = 0;418422419419- /*420420- * Splitting blobs across multiple lines is not compatible with the mesa421421- * debug decoder tool. Note that even dropping the explicit '\n' below422422- * doesn't help because the GuC log is so big some underlying implementation423423- * still splits the lines at 512K characters. So just bail completely for424424- * the moment.425425- */426426- return;427427-428423#define DMESG_MAX_LINE_LEN 800429429-#define MIN_SPACE (ASCII85_BUFSZ + 2) /* 85 + "\n\0" */424424+ /* Always leave space for the suffix char and the \0 */425425+#define MIN_SPACE (ASCII85_BUFSZ + 2) /* 85 + "<suffix>\0" */430426431427 if (size & 3)432428 drm_printf(p, "Size not word aligned: %zu", size);···450462 line_pos += strlen(line_buff + line_pos);451463452464 if ((line_pos + MIN_SPACE) >= DMESG_MAX_LINE_LEN) {453453- line_buff[line_pos++] = '\n';454465 line_buff[line_pos++] = 0;455466456467 drm_puts(p, line_buff);···461474 }462475 }463476464464- if (line_pos) {465465- line_buff[line_pos++] = '\n';466466- line_buff[line_pos++] = 0;477477+ if (suffix)478478+ line_buff[line_pos++] = suffix;467479480480+ if (line_pos) {481481+ line_buff[line_pos++] = 0;468482 drm_puts(p, line_buff);469483 }470484
···5757 return GRAPHICS_VERx100(xe) < 1270 && !IS_DGFX(xe);5858}59596060-static s64 detect_bar2_dgfx(struct xe_device *xe, struct xe_ttm_stolen_mgr *mgr)6161-{6262- struct xe_tile *tile = xe_device_get_root_tile(xe);6363- struct xe_mmio *mmio = xe_root_tile_mmio(xe);6464- struct pci_dev *pdev = to_pci_dev(xe->drm.dev);6565- u64 stolen_size;6666- u64 tile_offset;6767- u64 tile_size;6868-6969- tile_offset = tile->mem.vram.io_start - xe->mem.vram.io_start;7070- tile_size = tile->mem.vram.actual_physical_size;7171-7272- /* Use DSM base address instead for stolen memory */7373- mgr->stolen_base = (xe_mmio_read64_2x32(mmio, DSMBASE) & BDSM_MASK) - tile_offset;7474- if (drm_WARN_ON(&xe->drm, tile_size < mgr->stolen_base))7575- return 0;7676-7777- stolen_size = tile_size - mgr->stolen_base;7878-7979- /* Verify usage fits in the actual resource available */8080- if (mgr->stolen_base + stolen_size <= pci_resource_len(pdev, LMEM_BAR))8181- mgr->io_base = tile->mem.vram.io_start + mgr->stolen_base;8282-8383- /*8484- * There may be few KB of platform dependent reserved memory at the end8585- * of vram which is not part of the DSM. 
Such reserved memory portion is8686- * always less then DSM granularity so align down the stolen_size to DSM8787- * granularity to accommodate such reserve vram portion.8888- */8989- return ALIGN_DOWN(stolen_size, SZ_1M);9090-}9191-9260static u32 get_wopcm_size(struct xe_device *xe)9361{9462 u32 wopcm_size;···78110 }7911180112 return wopcm_size;113113+}114114+115115+static s64 detect_bar2_dgfx(struct xe_device *xe, struct xe_ttm_stolen_mgr *mgr)116116+{117117+ struct xe_tile *tile = xe_device_get_root_tile(xe);118118+ struct xe_mmio *mmio = xe_root_tile_mmio(xe);119119+ struct pci_dev *pdev = to_pci_dev(xe->drm.dev);120120+ u64 stolen_size, wopcm_size;121121+ u64 tile_offset;122122+ u64 tile_size;123123+124124+ tile_offset = tile->mem.vram.io_start - xe->mem.vram.io_start;125125+ tile_size = tile->mem.vram.actual_physical_size;126126+127127+ /* Use DSM base address instead for stolen memory */128128+ mgr->stolen_base = (xe_mmio_read64_2x32(mmio, DSMBASE) & BDSM_MASK) - tile_offset;129129+ if (drm_WARN_ON(&xe->drm, tile_size < mgr->stolen_base))130130+ return 0;131131+132132+ /* Carve out the top of DSM as it contains the reserved WOPCM region */133133+ wopcm_size = get_wopcm_size(xe);134134+ if (drm_WARN_ON(&xe->drm, !wopcm_size))135135+ return 0;136136+137137+ stolen_size = tile_size - mgr->stolen_base;138138+ stolen_size -= wopcm_size;139139+140140+ /* Verify usage fits in the actual resource available */141141+ if (mgr->stolen_base + stolen_size <= pci_resource_len(pdev, LMEM_BAR))142142+ mgr->io_base = tile->mem.vram.io_start + mgr->stolen_base;143143+144144+ /*145145+ * There may be few KB of platform dependent reserved memory at the end146146+ * of vram which is not part of the DSM. 
Such reserved memory portion is147147+ * always less than DSM granularity so align down the stolen_size to DSM148148+ * granularity to accommodate such reserved vram portion.149149+ */150150+ return ALIGN_DOWN(stolen_size, SZ_1M);81151}8215283153static u32 detect_bar2_integrated(struct xe_device *xe, struct xe_ttm_stolen_mgr *mgr)
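The moved detect_bar2_dgfx() now sizes stolen memory as the span from the DSM base to the end of the tile, minus the WOPCM carve-out at the top, rounded down to the 1 MiB DSM granularity. A userspace sketch of that arithmetic (values are made up; `ALIGN_DOWN` mirrors the kernel macro for power-of-two alignments):

```c
#include <assert.h>
#include <stdint.h>

#define SZ_1M (1ull << 20)
#define ALIGN_DOWN(x, a) ((x) & ~((uint64_t)(a) - 1))

/* Compute the usable stolen-memory size: tile size minus the DSM base offset
 * and the WOPCM region, aligned down to DSM granularity. The zero returns
 * mirror the drm_WARN_ON() early exits in the patched function. */
static int64_t stolen_size_bytes(uint64_t tile_size, uint64_t stolen_base,
				 uint64_t wopcm_size)
{
	if (tile_size < stolen_base || !wopcm_size)
		return 0;
	return ALIGN_DOWN(tile_size - stolen_base - wopcm_size, SZ_1M);
}
```

The function was moved below get_wopcm_size() so the carve-out can be queried before the subtraction, rather than relying on alignment alone to hide the reserved region.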
···104104 unsigned int id;105105 int i, err;106106107107- mutex_init(&host->intr_mutex);108108-109107 for (id = 0; id < host1x_syncpt_nb_pts(host); ++id) {110108 struct host1x_syncpt *syncpt = &host->syncpt[id];111109
+10-5
drivers/hid/Kconfig
···570570571571config HID_LENOVO572572 tristate "Lenovo / Thinkpad devices"573573+ depends on ACPI574574+ select ACPI_PLATFORM_PROFILE573575 select NEW_LEDS574576 select LEDS_CLASS575577 help···11691167 tristate "Topre REALFORCE keyboards"11701168 depends on HID11711169 help11721172- Say Y for N-key rollover support on Topre REALFORCE R2 108/87 key keyboards.11701170+ Say Y for N-key rollover support on Topre REALFORCE R2 108/87 key and11711171+ Topre REALFORCE R3S 87 key keyboards.1173117211741173config HID_THINGM11751174 tristate "ThingM blink(1) USB RGB LED"···1377137413781375source "drivers/hid/bpf/Kconfig"1379137613801380-endif # HID13811381-13821382-source "drivers/hid/usbhid/Kconfig"13831383-13841377source "drivers/hid/i2c-hid/Kconfig"1385137813861379source "drivers/hid/intel-ish-hid/Kconfig"···13861387source "drivers/hid/surface-hid/Kconfig"1387138813881389source "drivers/hid/intel-thc-hid/Kconfig"13901390+13911391+endif # HID13921392+13931393+# USB support may be used with HID disabled13941394+13951395+source "drivers/hid/usbhid/Kconfig"1389139613901397endif # HID_SUPPORT
-1
drivers/hid/amd-sfh-hid/Kconfig
···5566config AMD_SFH_HID77 tristate "AMD Sensor Fusion Hub"88- depends on HID98 depends on X86109 help1110 If you say yes to this option, support will be included for the
···106106 "%s::%s",107107 dev_name(&input->dev),108108 info->led_name);109109+ if (!led->cdev.name)110110+ return -ENOMEM;109111110112 ret = devm_led_classdev_register(&hdev->dev, &led->cdev);111113 if (ret)
+1-1
drivers/hid/i2c-hid/Kconfig
···22menuconfig I2C_HID33 tristate "I2C HID support"44 default y55- depends on I2C && INPUT && HID55+ depends on I2C6677if I2C_HID88
-1
drivers/hid/intel-ish-hid/Kconfig
···66 tristate "Intel Integrated Sensor Hub"77 default n88 depends on X8699- depends on HID109 help1110 The Integrated Sensor Hub (ISH) enables the ability to offload1211 sensor polling and algorithm processing to a dedicated low power
···517517 /* ISH FW is dead */518518 if (!ish_is_input_ready(dev))519519 return -EPIPE;520520+521521+ /* Send clock sync at once after reset */522522+ ishtp_dev->prev_sync = 0;523523+520524 /*521525 * Set HOST2ISH.ILUP. Apparently we need this BEFORE sending522526 * RESET_NOTIFY_ACK - FW will be checking for it···581577 */582578static void _ish_sync_fw_clock(struct ishtp_device *dev)583579{584584- static unsigned long prev_sync;585585- uint64_t usec;580580+ struct ipc_time_update_msg time = {};586581587587- if (prev_sync && time_before(jiffies, prev_sync + 20 * HZ))582582+ if (dev->prev_sync && time_before(jiffies, dev->prev_sync + 20 * HZ))588583 return;589584590590- prev_sync = jiffies;591591- usec = ktime_to_us(ktime_get_boottime());592592- ipc_send_mng_msg(dev, MNG_SYNC_FW_CLOCK, &usec, sizeof(uint64_t));585585+ dev->prev_sync = jiffies;586586+ /* The fields of time would be updated while sending message */587587+ ipc_send_mng_msg(dev, MNG_SYNC_FW_CLOCK, &time, sizeof(time));593588}594589595590/**
···253253 unsigned int ipc_tx_cnt;254254 unsigned long long ipc_tx_bytes_cnt;255255256256+ /* Time of the last clock sync */257257+ unsigned long prev_sync;256258 const struct ishtp_hw_ops *ops;257259 size_t mtu;258260 uint32_t ishtp_msg_hdr;
-1
drivers/hid/intel-thc-hid/Kconfig
···77config INTEL_THC_HID88 tristate "Intel Touch Host Controller"99 depends on ACPI1010- select HID1110 help1211 THC (Touch Host Controller) is the name of the IP block in PCH that1312 interfaces with Touch Devices (ex: touchscreen, touchpad etc.). It
-2
drivers/hid/surface-hid/Kconfig
···11# SPDX-License-Identifier: GPL-2.0+22menu "Surface System Aggregator Module HID support"33 depends on SURFACE_AGGREGATOR44- depends on INPUT5465config SURFACE_HID76 tristate "HID transport driver for Surface System Aggregator Module"···38393940config SURFACE_HID_CORE4041 tristate4141- select HID
+1-2
drivers/hid/usbhid/Kconfig
···55config USB_HID66 tristate "USB HID transport layer"77 default y88- depends on USB && INPUT99- select HID88+ depends on HID109 help1110 Say Y here if you want to connect USB keyboards,1211 mice, joysticks, graphic tablets, or any other HID based devices
+76-39
drivers/i2c/i2c-core-base.c
···13001300 info.flags |= I2C_CLIENT_SLAVE;13011301 }1302130213031303- info.flags |= I2C_CLIENT_USER;13041304-13051303 client = i2c_new_client_device(adap, &info);13061304 if (IS_ERR(client))13071305 return PTR_ERR(client);1308130613071307+ /* Keep track of the added device */13081308+ mutex_lock(&adap->userspace_clients_lock);13091309+ list_add_tail(&client->detected, &adap->userspace_clients);13101310+ mutex_unlock(&adap->userspace_clients_lock);13091311 dev_info(dev, "%s: Instantiated device %s at 0x%02hx\n", "new_device",13101312 info.type, info.addr);1311131313121314 return count;13131315}13141316static DEVICE_ATTR_WO(new_device);13151315-13161316-static int __i2c_find_user_addr(struct device *dev, const void *addrp)13171317-{13181318- struct i2c_client *client = i2c_verify_client(dev);13191319- unsigned short addr = *(unsigned short *)addrp;13201320-13211321- return client && client->flags & I2C_CLIENT_USER &&13221322- i2c_encode_flags_to_addr(client) == addr;13231323-}1324131713251318/*13261319 * And of course let the users delete the devices they instantiated, if···13291336 const char *buf, size_t count)13301337{13311338 struct i2c_adapter *adap = to_i2c_adapter(dev);13321332- struct device *child_dev;13391339+ struct i2c_client *client, *next;13331340 unsigned short addr;13341341 char end;13351342 int res;···13451352 return -EINVAL;13461353 }1347135413481348- mutex_lock(&core_lock);13491355 /* Make sure the device was added through sysfs */13501350- child_dev = device_find_child(&adap->dev, &addr, __i2c_find_user_addr);13511351- if (child_dev) {13521352- i2c_unregister_device(i2c_verify_client(child_dev));13531353- put_device(child_dev);13541354- } else {13551355- dev_err(dev, "Can't find userspace-created device at %#x\n", addr);13561356- count = -ENOENT;13571357- }13581358- mutex_unlock(&core_lock);13561356+ res = -ENOENT;13571357+ mutex_lock_nested(&adap->userspace_clients_lock,13581358+ i2c_adapter_depth(adap));13591359+ 
list_for_each_entry_safe(client, next, &adap->userspace_clients,13601360+ detected) {13611361+ if (i2c_encode_flags_to_addr(client) == addr) {13621362+ dev_info(dev, "%s: Deleting device %s at 0x%02hx\n",13631363+ "delete_device", client->name, client->addr);1359136413601360- return count;13651365+ list_del(&client->detected);13661366+ i2c_unregister_device(client);13671367+ res = count;13681368+ break;13691369+ }13701370+ }13711371+ mutex_unlock(&adap->userspace_clients_lock);13721372+13731373+ if (res < 0)13741374+ dev_err(dev, "%s: Can't find device in list\n",13751375+ "delete_device");13761376+ return res;13611377}13621378static DEVICE_ATTR_IGNORE_LOCKDEP(delete_device, S_IWUSR, NULL,13631379 delete_device_store);···15371535 adap->locked_flags = 0;15381536 rt_mutex_init(&adap->bus_lock);15391537 rt_mutex_init(&adap->mux_lock);15381538+ mutex_init(&adap->userspace_clients_lock);15391539+ INIT_LIST_HEAD(&adap->userspace_clients);1540154015411541 /* Set default timeout to 1 second if not already set */15421542 if (adap->timeout == 0)···17041700}17051701EXPORT_SYMBOL_GPL(i2c_add_numbered_adapter);1706170217031703+static void i2c_do_del_adapter(struct i2c_driver *driver,17041704+ struct i2c_adapter *adapter)17051705+{17061706+ struct i2c_client *client, *_n;17071707+17081708+ /* Remove the devices we created ourselves as the result of hardware17091709+ * probing (using a driver's detect method) */17101710+ list_for_each_entry_safe(client, _n, &driver->clients, detected) {17111711+ if (client->adapter == adapter) {17121712+ dev_dbg(&adapter->dev, "Removing %s at 0x%x\n",17131713+ client->name, client->addr);17141714+ list_del(&client->detected);17151715+ i2c_unregister_device(client);17161716+ }17171717+ }17181718+}17191719+17071720static int __unregister_client(struct device *dev, void *dummy)17081721{17091722 struct i2c_client *client = i2c_verify_client(dev);···17361715 return 0;17371716}1738171717181718+static int __process_removed_adapter(struct device_driver 
*d, void *data)17191719+{17201720+ i2c_do_del_adapter(to_i2c_driver(d), data);17211721+ return 0;17221722+}17231723+17391724/**17401725 * i2c_del_adapter - unregister I2C adapter17411726 * @adap: the adapter being unregistered···17531726void i2c_del_adapter(struct i2c_adapter *adap)17541727{17551728 struct i2c_adapter *found;17291729+ struct i2c_client *client, *next;1756173017571731 /* First make sure that this adapter was ever added */17581732 mutex_lock(&core_lock);···17651737 }1766173817671739 i2c_acpi_remove_space_handler(adap);17401740+ /* Tell drivers about this removal */17411741+ mutex_lock(&core_lock);17421742+ bus_for_each_drv(&i2c_bus_type, NULL, adap,17431743+ __process_removed_adapter);17441744+ mutex_unlock(&core_lock);17451745+17461746+ /* Remove devices instantiated from sysfs */17471747+ mutex_lock_nested(&adap->userspace_clients_lock,17481748+ i2c_adapter_depth(adap));17491749+ list_for_each_entry_safe(client, next, &adap->userspace_clients,17501750+ detected) {17511751+ dev_dbg(&adap->dev, "Removing %s at 0x%x\n", client->name,17521752+ client->addr);17531753+ list_del(&client->detected);17541754+ i2c_unregister_device(client);17551755+ }17561756+ mutex_unlock(&adap->userspace_clients_lock);1768175717691758 /* Detach any active clients. This can't fail, thus we do not17701759 * check the returned value. This is a two-pass process, because17711760 * we can't remove the dummy devices during the first pass: they17721761 * could have been instantiated by real devices wishing to clean17731762 * them up properly, so we give them a chance to do that first. 
*/17741774- mutex_lock(&core_lock);17751763 device_for_each_child(&adap->dev, NULL, __unregister_client);17761764 device_for_each_child(&adap->dev, NULL, __unregister_dummy);17771777- mutex_unlock(&core_lock);1778176517791766 /* device name is gone after device_unregister */17801767 dev_dbg(&adap->dev, "adapter [%s] unregistered\n", adap->name);···20091966 /* add the driver to the list of i2c drivers in the driver core */20101967 driver->driver.owner = owner;20111968 driver->driver.bus = &i2c_bus_type;19691969+ INIT_LIST_HEAD(&driver->clients);2012197020131971 /* When registration returns, the driver core20141972 * will have called probe() for all matching-but-unbound devices.···20271983}20281984EXPORT_SYMBOL(i2c_register_driver);2029198520302030-static int __i2c_unregister_detected_client(struct device *dev, void *argp)19861986+static int __process_removed_driver(struct device *dev, void *data)20311987{20322032- struct i2c_client *client = i2c_verify_client(dev);20332033-20342034- if (client && client->flags & I2C_CLIENT_AUTO)20352035- i2c_unregister_device(client);20362036-19881988+ if (dev->type == &i2c_adapter_type)19891989+ i2c_do_del_adapter(data, to_i2c_adapter(dev));20371990 return 0;20381991}20391992···20412000 */20422001void i2c_del_driver(struct i2c_driver *driver)20432002{20442044- mutex_lock(&core_lock);20452045- /* Satisfy __must_check, function can't fail */20462046- if (driver_for_each_device(&driver->driver, NULL, NULL,20472047- __i2c_unregister_detected_client)) {20482048- }20492049- mutex_unlock(&core_lock);20032003+ i2c_for_each_dev(driver, __process_removed_driver);2050200420512005 driver_unregister(&driver->driver);20522006 pr_debug("driver [%s] unregistered\n", driver->driver.name);···24682432 /* Finally call the custom detection function */24692433 memset(&info, 0, sizeof(struct i2c_board_info));24702434 info.addr = addr;24712471- info.flags = I2C_CLIENT_AUTO;24722435 err = driver->detect(temp_client, &info);24732436 if (err) {24742437 /* 
-ENODEV is returned if the detection fails. We catch it···24942459 dev_dbg(&adapter->dev, "Creating %s at 0x%02x\n",24952460 info.type, info.addr);24962461 client = i2c_new_client_device(adapter, &info);24972497- if (IS_ERR(client))24622462+ if (!IS_ERR(client))24632463+ list_add_tail(&client->detected, &driver->clients);24642464+ else24982465 dev_err(&adapter->dev, "Failed creating %s at 0x%02x\n",24992466 info.type, info.addr);25002467 }
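The i2c-core revert above restores per-adapter tracking of sysfs-instantiated devices: "new_device" adds the client to `adap->userspace_clients`, and "delete_device" only removes clients found on that list. A simplified userspace sketch of that ownership pattern (types and names are stand-ins for the kernel structures; locking omitted):

```c
#include <assert.h>
#include <stdlib.h>

struct fake_client {
	unsigned short addr;
	struct fake_client *next;
};

struct fake_adapter {
	struct fake_client *userspace_clients; /* head of the tracked list */
};

/* "new_device": instantiate and remember the client on the adapter's list */
static void add_userspace_client(struct fake_adapter *adap, unsigned short addr)
{
	struct fake_client *c = malloc(sizeof(*c));

	if (!c)
		return;
	c->addr = addr;
	c->next = adap->userspace_clients;
	adap->userspace_clients = c;
}

/* "delete_device": remove only clients that were added through sysfs.
 * Returns 1 on success, 0 if not found (the -ENOENT path in the kernel). */
static int delete_userspace_client(struct fake_adapter *adap, unsigned short addr)
{
	struct fake_client **pp = &adap->userspace_clients;

	for (; *pp; pp = &(*pp)->next) {
		if ((*pp)->addr == addr) {
			struct fake_client *victim = *pp;

			*pp = victim->next;
			free(victim);
			return 1;
		}
	}
	return 0;
}
```

Keeping a dedicated list (rather than walking all children and matching a flag) is what lets adapter teardown unregister only the userspace-created devices without holding the global core lock.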
···2653265326542654 /* Set IOTLB invalidation timeout to 1s */26552655 iommu_set_inv_tlb_timeout(iommu, CTRL_INV_TO_1S);26562656+26572657+ /* Enable Enhanced Peripheral Page Request Handling */26582658+ if (check_feature(FEATURE_EPHSUP))26592659+ iommu_feature_enable(iommu, CONTROL_EPH_EN);26562660}2657266126582662static void iommu_apply_resume_quirks(struct amd_iommu *iommu)···31983194 return true;31993195}3200319632013201-static void iommu_snp_enable(void)31973197+static __init void iommu_snp_enable(void)32023198{32033199#ifdef CONFIG_KVM_AMD_SEV32043200 if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))···32203216 amd_iommu_snp_en = check_feature(FEATURE_SNP);32213217 if (!amd_iommu_snp_en) {32223218 pr_warn("SNP: IOMMU SNP feature not enabled, SNP cannot be supported.\n");32193219+ goto disable_snp;32203220+ }32213221+32223222+ /*32233223+ * Enable host SNP support once SNP support is checked on IOMMU.32243224+ */32253225+ if (snp_rmptable_init()) {32263226+ pr_warn("SNP: RMP initialization failed, SNP cannot be supported.\n");32233227 goto disable_snp;32243228 }32253229···33293317 break;33303318 ret = state_next();33313319 }33203320+33213321+ /*33223322+ * SNP platform initialization requires IOMMUs to be fully configured.33233323+ * If the SNP support on IOMMUs has NOT been checked, simply mark SNP33243324+ * as unsupported. If the SNP support on IOMMUs has been checked and33253325+ * host SNP support enabled but RMP enforcement has not been enabled33263326+ * in IOMMUs, then the system is in a half-baked state, but can limp33273327+ * along as all memory should be Hypervisor-Owned in the RMP. 
WARN,33283328+ * but leave SNP as "supported" to avoid confusing the kernel.33293329+ */33303330+ if (ret && cc_platform_has(CC_ATTR_HOST_SEV_SNP) &&33313331+ !WARN_ON_ONCE(amd_iommu_snp_en))33323332+ cc_platform_clear(CC_ATTR_HOST_SEV_SNP);3332333333333334 return ret;33343335}···34513426 int ret;3452342734533428 if (no_iommu || (iommu_detected && !gart_iommu_aperture))34543454- return;34293429+ goto disable_snp;3455343034563431 if (!amd_iommu_sme_check())34573457- return;34323432+ goto disable_snp;3458343334593434 ret = iommu_go_to_state(IOMMU_IVRS_DETECTED);34603435 if (ret)34613461- return;34363436+ goto disable_snp;3462343734633438 amd_iommu_detected = true;34643439 iommu_detected = 1;34653440 x86_init.iommu.iommu_init = amd_iommu_init;34413441+ return;34423442+34433443+disable_snp:34443444+ if (cc_platform_has(CC_ATTR_HOST_SEV_SNP))34453445+ cc_platform_clear(CC_ATTR_HOST_SEV_SNP);34663446}3467344734683448/****************************************************************************
+3-3
drivers/iommu/exynos-iommu.c
···249249 struct list_head clients; /* list of sysmmu_drvdata.domain_node */250250 sysmmu_pte_t *pgtable; /* lv1 page table, 16KB */251251 short *lv2entcnt; /* free lv2 entry counter for each section */252252- spinlock_t lock; /* lock for modyfying list of clients */252252+ spinlock_t lock; /* lock for modifying list of clients */253253 spinlock_t pgtablelock; /* lock for modifying page table @ pgtable */254254 struct iommu_domain domain; /* generic domain data structure */255255};···292292 struct clk *aclk; /* SYSMMU's aclk clock */293293 struct clk *pclk; /* SYSMMU's pclk clock */294294 struct clk *clk_master; /* master's device clock */295295- spinlock_t lock; /* lock for modyfying state */295295+ spinlock_t lock; /* lock for modifying state */296296 bool active; /* current status */297297 struct exynos_iommu_domain *domain; /* domain we belong to */298298 struct list_head domain_node; /* node for domain clients list */···746746 ret = devm_request_irq(dev, irq, exynos_sysmmu_irq, 0,747747 dev_name(dev), data);748748 if (ret) {749749- dev_err(dev, "Unabled to register handler of irq %d\n", irq);749749+ dev_err(dev, "Unable to register handler of irq %d\n", irq);750750 return ret;751751 }752752
···17561756 group->id);1757175717581758 /*17591759- * Try to recover, drivers are allowed to force IDENITY or DMA, IDENTITY17591759+ * Try to recover, drivers are allowed to force IDENTITY or DMA, IDENTITY17601760 * takes precedence.17611761 */17621762 if (type == IOMMU_DOMAIN_IDENTITY)
+1
drivers/irqchip/Kconfig
···169169170170config LAN966X_OIC171171 tristate "Microchip LAN966x OIC Support"172172+ depends on MCHP_LAN966X_PCI || COMPILE_TEST172173 select GENERIC_IRQ_CHIP173174 select IRQ_DOMAIN174175 help
+2-1
drivers/irqchip/irq-apple-aic.c
···577577 AIC_FIQ_HWIRQ(AIC_TMR_EL02_VIRT));578578 }579579580580- if (read_sysreg_s(SYS_IMP_APL_PMCR0_EL1) & PMCR0_IACT) {580580+ if ((read_sysreg_s(SYS_IMP_APL_PMCR0_EL1) & (PMCR0_IMODE | PMCR0_IACT)) ==581581+ (FIELD_PREP(PMCR0_IMODE, PMCR0_IMODE_FIQ) | PMCR0_IACT)) {581582 int irq;582583 if (cpumask_test_cpu(smp_processor_id(),583584 &aic_irqc->fiq_aff[AIC_CPU_PMU_P]->aff))
+2-1
drivers/irqchip/irq-mvebu-icu.c
···6868 unsigned long *hwirq, unsigned int *type)6969{7070 unsigned int param_count = static_branch_unlikely(&legacy_bindings) ? 3 : 2;7171- struct mvebu_icu_msi_data *msi_data = d->host_data;7171+ struct msi_domain_info *info = d->host_data;7272+ struct mvebu_icu_msi_data *msi_data = info->chip_data;7273 struct mvebu_icu *icu = msi_data->icu;73747475 /* Check the count of the parameters in dt */
···159159}160160161161static struct regmap *device_node_get_regmap(struct device_node *np,162162+ bool create_regmap,162163 bool check_res)163164{164165 struct syscon *entry, *syscon = NULL;···173172 }174173175174 if (!syscon) {176176- if (of_device_is_compatible(np, "syscon"))175175+ if (create_regmap)177176 syscon = of_syscon_register(np, check_res);178177 else179178 syscon = ERR_PTR(-EINVAL);···234233}235234EXPORT_SYMBOL_GPL(of_syscon_register_regmap);236235236236+/**237237+ * device_node_to_regmap() - Get or create a regmap for specified device node238238+ * @np: Device tree node239239+ *240240+ * Get a regmap for the specified device node. If there's not an existing241241+ * regmap, then one is instantiated. This function should not be used if the242242+ * device node has a custom regmap driver or has resources (clocks, resets) to243243+ * be managed. Use syscon_node_to_regmap() instead for those cases.244244+ *245245+ * Return: regmap ptr on success, negative error code on failure.246246+ */237247struct regmap *device_node_to_regmap(struct device_node *np)238248{239239- return device_node_get_regmap(np, false);249249+ return device_node_get_regmap(np, true, false);240250}241251EXPORT_SYMBOL_GPL(device_node_to_regmap);242252253253+/**254254+ * syscon_node_to_regmap() - Get or create a regmap for specified syscon device node255255+ * @np: Device tree node256256+ *257257+ * Get a regmap for the specified device node. If there's not an existing258258+ * regmap, then one is instantiated if the node is a generic "syscon". 
This259259+ * function is safe to use for a syscon registered with260260+ * of_syscon_register_regmap().261261+ *262262+ * Return: regmap ptr on success, negative error code on failure.263263+ */243264struct regmap *syscon_node_to_regmap(struct device_node *np)244265{245245- return device_node_get_regmap(np, true);266266+ return device_node_get_regmap(np, of_device_is_compatible(np, "syscon"), true);246267}247268EXPORT_SYMBOL_GPL(syscon_node_to_regmap);248269
···622622 netdev_dbg(priv->ndev, "RX-FIFO overflow\n");623623624624 skb = rkcanfd_alloc_can_err_skb(priv, &cf, ×tamp);625625- if (skb)625625+ if (!skb)626626 return 0;627627628628 rkcanfd_get_berr_counter_corrected(priv, &bec);
+5-1
drivers/net/can/usb/etas_es58x/es58x_devlink.c
···248248 return ret;249249 }250250251251- return devlink_info_serial_number_put(req, es58x_dev->udev->serial);251251+ if (es58x_dev->udev->serial)252252+ ret = devlink_info_serial_number_put(req,253253+ es58x_dev->udev->serial);254254+255255+ return ret;252256}253257254258const struct devlink_ops es58x_dl_ops = {
+3-1
drivers/net/ethernet/aquantia/atlantic/aq_nic.c
···14411441 aq_ptp_ring_free(self);14421442 aq_ptp_free(self);1443144314441444- if (likely(self->aq_fw_ops->deinit) && link_down) {14441444+ /* May be invoked during hot unplug. */14451445+ if (pci_device_is_present(self->pdev) &&14461446+ likely(self->aq_fw_ops->deinit) && link_down) {14451447 mutex_lock(&self->fwreq_mutex);14461448 self->aq_fw_ops->deinit(self->aq_hw);14471449 mutex_unlock(&self->fwreq_mutex);
···4141{4242 struct bcmgenet_priv *priv = netdev_priv(dev);4343 struct device *kdev = &priv->pdev->dev;4444+ u32 phy_wolopts = 0;44454545- if (dev->phydev)4646+ if (dev->phydev) {4647 phy_ethtool_get_wol(dev->phydev, wol);4848+ phy_wolopts = wol->wolopts;4949+ }47504851 /* MAC is not wake-up capable, return what the PHY does */4952 if (!device_can_wakeup(kdev))···54515552 /* Overlay MAC capabilities with that of the PHY queried before */5653 wol->supported |= WAKE_MAGIC | WAKE_MAGICSECURE | WAKE_FILTER;5757- wol->wolopts = priv->wolopts;5858- memset(wol->sopass, 0, sizeof(wol->sopass));5454+ wol->wolopts |= priv->wolopts;59555656+ /* Return the PHY configured magic password */5757+ if (phy_wolopts & WAKE_MAGICSECURE)5858+ return;5959+6060+ /* Otherwise the MAC one */6161+ memset(wol->sopass, 0, sizeof(wol->sopass));6062 if (wol->wolopts & WAKE_MAGICSECURE)6163 memcpy(wol->sopass, priv->sopass, sizeof(priv->sopass));6264}···7870 /* Try Wake-on-LAN from the PHY first */7971 if (dev->phydev) {8072 ret = phy_ethtool_set_wol(dev->phydev, wol);8181- if (ret != -EOPNOTSUPP)7373+ if (ret != -EOPNOTSUPP && wol->wolopts)8274 return ret;8375 }8476
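The bcmgenet get_wol() change above reports the union of the PHY's and the MAC's wake options, and prefers the PHY's SecureOn password when the PHY has WAKE_MAGICSECURE armed. A sketch of that precedence logic (flag values here are illustrative, not the real `WAKE_*` constants):

```c
#include <assert.h>

#define WAKE_MAGIC       0x1
#define WAKE_MAGICSECURE 0x2

struct wol_state {
	unsigned int wolopts;
	int sopass_from_phy; /* 1 = PHY password, 0 = MAC password, -1 = none */
};

/* Merge PHY and MAC wake-on-LAN state: the reported options are the union,
 * and the PHY's SecureOn password takes precedence over the MAC's. */
static struct wol_state merge_wol(unsigned int phy_wolopts,
				  unsigned int mac_wolopts)
{
	struct wol_state s;

	s.wolopts = phy_wolopts | mac_wolopts;
	if (phy_wolopts & WAKE_MAGICSECURE)
		s.sopass_from_phy = 1;
	else if (s.wolopts & WAKE_MAGICSECURE)
		s.sopass_from_phy = 0;
	else
		s.sopass_from_phy = -1;
	return s;
}
```

The set_wol() side mirrors this: the PHY result is only returned early when it actually configured something, so a zero `wolopts` request still falls through to clear the MAC state.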
+58
drivers/net/ethernet/broadcom/tg3.c
···5555#include <linux/hwmon.h>5656#include <linux/hwmon-sysfs.h>5757#include <linux/crc32poly.h>5858+#include <linux/dmi.h>58595960#include <net/checksum.h>6061#include <net/gso.h>···18213182121821418213static SIMPLE_DEV_PM_OPS(tg3_pm_ops, tg3_suspend, tg3_resume);18215182141821518215+/* Systems where ACPI _PTS (Prepare To Sleep) S5 will result in a fatal1821618216+ * PCIe AER event on the tg3 device if the tg3 device is not, or cannot1821718217+ * be, powered down.1821818218+ */1821918219+static const struct dmi_system_id tg3_restart_aer_quirk_table[] = {1822018220+ {1822118221+ .matches = {1822218222+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),1822318223+ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R440"),1822418224+ },1822518225+ },1822618226+ {1822718227+ .matches = {1822818228+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),1822918229+ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R540"),1823018230+ },1823118231+ },1823218232+ {1823318233+ .matches = {1823418234+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),1823518235+ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R640"),1823618236+ },1823718237+ },1823818238+ {1823918239+ .matches = {1824018240+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),1824118241+ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R650"),1824218242+ },1824318243+ },1824418244+ {1824518245+ .matches = {1824618246+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),1824718247+ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R740"),1824818248+ },1824918249+ },1825018250+ {1825118251+ .matches = {1825218252+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),1825318253+ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R750"),1825418254+ },1825518255+ },1825618256+ {}1825718257+};1825818258+1821618259static void tg3_shutdown(struct pci_dev *pdev)1821718260{1821818261 struct net_device *dev = pci_get_drvdata(pdev);···18273182281827418229 if (system_state == SYSTEM_POWER_OFF)1827518230 tg3_power_down(tp);1823118231+ else if (system_state == SYSTEM_RESTART &&1823218232+ dmi_first_match(tg3_restart_aer_quirk_table) &&1823318233+ 
pdev->current_state != PCI_D3cold &&1823418234+ pdev->current_state != PCI_UNKNOWN) {1823518235+ /* Disable PCIe AER on the tg3 to avoid a fatal1823618236+ * error during this system restart.1823718237+ */1823818238+ pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL,1823918239+ PCI_EXP_DEVCTL_CERE |1824018240+ PCI_EXP_DEVCTL_NFERE |1824118241+ PCI_EXP_DEVCTL_FERE |1824218242+ PCI_EXP_DEVCTL_URRE);1824318243+ }18276182441827718245 rtnl_unlock();1827818246
+1-1
drivers/net/ethernet/intel/iavf/iavf_main.c
···29032903 }2904290429052905 mutex_unlock(&adapter->crit_lock);29062906- netdev_unlock(netdev);29072906restart_watchdog:29072907+ netdev_unlock(netdev);29082908 if (adapter->state >= __IAVF_DOWN)29092909 queue_work(adapter->wq, &adapter->adminq_task);29102910 if (adapter->aq_required)
···527527 * @xdp: xdp_buff used as input to the XDP program528528 * @xdp_prog: XDP program to run529529 * @xdp_ring: ring to be used for XDP_TX action530530- * @rx_buf: Rx buffer to store the XDP action531530 * @eop_desc: Last descriptor in packet to read metadata from532531 *533532 * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}534533 */535535-static void534534+static u32536535ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,537536 struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,538538- struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)537537+ union ice_32b_rx_flex_desc *eop_desc)539538{540539 unsigned int ret = ICE_XDP_PASS;541540 u32 act;···573574 ret = ICE_XDP_CONSUMED;574575 }575576exit:576576- ice_set_rx_bufs_act(xdp, rx_ring, ret);577577+ return ret;577578}578579579580/**···859860 xdp_buff_set_frags_flag(xdp);860861 }861862862862- if (unlikely(sinfo->nr_frags == MAX_SKB_FRAGS)) {863863- ice_set_rx_bufs_act(xdp, rx_ring, ICE_XDP_CONSUMED);863863+ if (unlikely(sinfo->nr_frags == MAX_SKB_FRAGS))864864 return -ENOMEM;865865- }866865867866 __skb_fill_page_desc_noacc(sinfo, sinfo->nr_frags++, rx_buf->page,868867 rx_buf->page_offset, size);···921924 struct ice_rx_buf *rx_buf;922925923926 rx_buf = &rx_ring->rx_buf[ntc];924924- rx_buf->pgcnt = page_count(rx_buf->page);925927 prefetchw(rx_buf->page);926928927929 if (!size)···934938 rx_buf->pagecnt_bias--;935939936940 return rx_buf;941941+}942942+943943+/**944944+ * ice_get_pgcnts - grab page_count() for gathered fragments945945+ * @rx_ring: Rx descriptor ring to store the page counts on946946+ *947947+ * This function is intended to be called right before running XDP948948+ * program so that the page recycling mechanism will be able to take949949+ * a correct decision regarding underlying pages; this is done in such950950+ * way as XDP program can change the refcount of page951951+ */952952+static void ice_get_pgcnts(struct ice_rx_ring *rx_ring)953953+{954954+ u32 nr_frags = 
rx_ring->nr_frags + 1;
+	u32 idx = rx_ring->first_desc;
+	struct ice_rx_buf *rx_buf;
+	u32 cnt = rx_ring->count;
+
+	for (int i = 0; i < nr_frags; i++) {
+		rx_buf = &rx_ring->rx_buf[idx];
+		rx_buf->pgcnt = page_count(rx_buf->page);
+
+		if (++idx == cnt)
+			idx = 0;
+	}
 }
 
 /**
···
 				   rx_buf->page_offset + headlen, size,
 				   xdp->frame_sz);
 	} else {
-		/* buffer is unused, change the act that should be taken later
-		 * on; data was copied onto skb's linear part so there's no
+		/* buffer is unused, restore biased page count in Rx buffer;
+		 * data was copied onto skb's linear part so there's no
 		 * need for adjusting page offset and we can reuse this buffer
 		 * as-is
 		 */
-		rx_buf->act = ICE_SKB_CONSUMED;
+		rx_buf->pagecnt_bias++;
 	}
 
 	if (unlikely(xdp_buff_has_frags(xdp))) {
···
 }
 
 /**
+ * ice_put_rx_mbuf - ice_put_rx_buf() caller, for all frame frags
+ * @rx_ring: Rx ring with all the auxiliary data
+ * @xdp: XDP buffer carrying linear + frags part
+ * @xdp_xmit: XDP_TX/XDP_REDIRECT verdict storage
+ * @ntc: a current next_to_clean value to be stored at rx_ring
+ * @verdict: return code from XDP program execution
+ *
+ * Walk through gathered fragments and satisfy internal page
+ * recycle mechanism; we take here an action related to verdict
+ * returned by XDP program;
+ */
+static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
+			    u32 *xdp_xmit, u32 ntc, u32 verdict)
+{
+	u32 nr_frags = rx_ring->nr_frags + 1;
+	u32 idx = rx_ring->first_desc;
+	u32 cnt = rx_ring->count;
+	u32 post_xdp_frags = 1;
+	struct ice_rx_buf *buf;
+	int i;
+
+	if (unlikely(xdp_buff_has_frags(xdp)))
+		post_xdp_frags += xdp_get_shared_info_from_buff(xdp)->nr_frags;
+
+	for (i = 0; i < post_xdp_frags; i++) {
+		buf = &rx_ring->rx_buf[idx];
+
+		if (verdict & (ICE_XDP_TX | ICE_XDP_REDIR)) {
+			ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
+			*xdp_xmit |= verdict;
+		} else if (verdict & ICE_XDP_CONSUMED) {
+			buf->pagecnt_bias++;
+		} else if (verdict == ICE_XDP_PASS) {
+			ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
+		}
+
+		ice_put_rx_buf(rx_ring, buf);
+
+		if (++idx == cnt)
+			idx = 0;
+	}
+	/* handle buffers that represented frags released by XDP prog;
+	 * for these we keep pagecnt_bias as-is; refcount from struct page
+	 * has been decremented within XDP prog and we do not have to increase
+	 * the biased refcnt
+	 */
+	for (; i < nr_frags; i++) {
+		buf = &rx_ring->rx_buf[idx];
+		ice_put_rx_buf(rx_ring, buf);
+		if (++idx == cnt)
+			idx = 0;
+	}
+
+	xdp->data = NULL;
+	rx_ring->first_desc = ntc;
+	rx_ring->nr_frags = 0;
+}
+
+/**
  * ice_clean_rx_irq - Clean completed descriptors from Rx ring - bounce buf
  * @rx_ring: Rx descriptor ring to transact packets on
  * @budget: Total limit on number of packets to process
···
 	unsigned int total_rx_bytes = 0, total_rx_pkts = 0;
 	unsigned int offset = rx_ring->rx_offset;
 	struct xdp_buff *xdp = &rx_ring->xdp;
-	u32 cached_ntc = rx_ring->first_desc;
 	struct ice_tx_ring *xdp_ring = NULL;
 	struct bpf_prog *xdp_prog = NULL;
 	u32 ntc = rx_ring->next_to_clean;
+	u32 cached_ntu, xdp_verdict;
 	u32 cnt = rx_ring->count;
 	u32 xdp_xmit = 0;
-	u32 cached_ntu;
 	bool failure;
-	u32 first;
 
 	xdp_prog = READ_ONCE(rx_ring->xdp_prog);
 	if (xdp_prog) {
···
 			xdp_prepare_buff(xdp, hard_start, offset, size, !!offset);
 			xdp_buff_clear_frags_flag(xdp);
 		} else if (ice_add_xdp_frag(rx_ring, xdp, rx_buf, size)) {
+			ice_put_rx_mbuf(rx_ring, xdp, NULL, ntc, ICE_XDP_CONSUMED);
 			break;
 		}
 		if (++ntc == cnt)
···
 		if (ice_is_non_eop(rx_ring, rx_desc))
 			continue;
 
-		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc);
-		if (rx_buf->act == ICE_XDP_PASS)
+		ice_get_pgcnts(rx_ring);
+		xdp_verdict = ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_desc);
+		if (xdp_verdict == ICE_XDP_PASS)
 			goto construct_skb;
 		total_rx_bytes += xdp_get_buff_len(xdp);
 		total_rx_pkts++;
 
-		xdp->data = NULL;
-		rx_ring->first_desc = ntc;
-		rx_ring->nr_frags = 0;
+		ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict);
+
 		continue;
construct_skb:
 		if (likely(ice_ring_uses_build_skb(rx_ring)))
···
 		/* exit if we failed to retrieve a buffer */
 		if (!skb) {
 			rx_ring->ring_stats->rx_stats.alloc_page_failed++;
-			rx_buf->act = ICE_XDP_CONSUMED;
-			if (unlikely(xdp_buff_has_frags(xdp)))
-				ice_set_rx_bufs_act(xdp, rx_ring,
-						    ICE_XDP_CONSUMED);
-			xdp->data = NULL;
-			rx_ring->first_desc = ntc;
-			rx_ring->nr_frags = 0;
-			break;
+			xdp_verdict = ICE_XDP_CONSUMED;
 		}
-		xdp->data = NULL;
-		rx_ring->first_desc = ntc;
-		rx_ring->nr_frags = 0;
+		ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict);
+
+		if (!skb)
+			break;
 
 		stat_err_bits = BIT(ICE_RX_FLEX_DESC_STATUS0_RXE_S);
 		if (unlikely(ice_test_staterr(rx_desc->wb.status_error0,
···
 		total_rx_pkts++;
 	}
 
-	first = rx_ring->first_desc;
-	while (cached_ntc != first) {
-		struct ice_rx_buf *buf = &rx_ring->rx_buf[cached_ntc];
-
-		if (buf->act & (ICE_XDP_TX | ICE_XDP_REDIR)) {
-			ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
-			xdp_xmit |= buf->act;
-		} else if (buf->act & ICE_XDP_CONSUMED) {
-			buf->pagecnt_bias++;
-		} else if (buf->act == ICE_XDP_PASS) {
-			ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
-		}
-
-		ice_put_rx_buf(rx_ring, buf);
-		if (++cached_ntc >= cnt)
-			cached_ntc = 0;
-	}
 	rx_ring->next_to_clean = ntc;
 	/* return up to cleaned_count buffers to hardware */
 	failure = ice_alloc_rx_bufs(rx_ring, ICE_RX_DESC_UNUSED(rx_ring));
-1
drivers/net/ethernet/intel/ice/ice_txrx.h
···
 	struct page *page;
 	unsigned int page_offset;
 	unsigned int pgcnt;
-	unsigned int act;
 	unsigned int pagecnt_bias;
 };
 
-43
drivers/net/ethernet/intel/ice/ice_txrx_lib.h
···
 #include "ice.h"
 
 /**
- * ice_set_rx_bufs_act - propagate Rx buffer action to frags
- * @xdp: XDP buffer representing frame (linear and frags part)
- * @rx_ring: Rx ring struct
- * act: action to store onto Rx buffers related to XDP buffer parts
- *
- * Set action that should be taken before putting Rx buffer from first frag
- * to the last.
- */
-static inline void
-ice_set_rx_bufs_act(struct xdp_buff *xdp, const struct ice_rx_ring *rx_ring,
-		    const unsigned int act)
-{
-	u32 sinfo_frags = xdp_get_shared_info_from_buff(xdp)->nr_frags;
-	u32 nr_frags = rx_ring->nr_frags + 1;
-	u32 idx = rx_ring->first_desc;
-	u32 cnt = rx_ring->count;
-	struct ice_rx_buf *buf;
-
-	for (int i = 0; i < nr_frags; i++) {
-		buf = &rx_ring->rx_buf[idx];
-		buf->act = act;
-
-		if (++idx == cnt)
-			idx = 0;
-	}
-
-	/* adjust pagecnt_bias on frags freed by XDP prog */
-	if (sinfo_frags < rx_ring->nr_frags && act == ICE_XDP_CONSUMED) {
-		u32 delta = rx_ring->nr_frags - sinfo_frags;
-
-		while (delta) {
-			if (idx == 0)
-				idx = cnt - 1;
-			else
-				idx--;
-			buf = &rx_ring->rx_buf[idx];
-			buf->pagecnt_bias--;
-			delta--;
-		}
-	}
-}
-
-/**
  * ice_test_staterr - tests bits in Rx descriptor status and error fields
  * @status_err_n: Rx descriptor status_error0 or status_error1 bits
  * @stat_err_bits: value to mask
···
 		/* hand second half of page back to the ring */
 		ixgbe_reuse_rx_page(rx_ring, rx_buffer);
 	} else {
-		if (!IS_ERR(skb) && IXGBE_CB(skb)->dma == rx_buffer->dma) {
+		if (skb && IXGBE_CB(skb)->dma == rx_buffer->dma) {
 			/* the page has been released from the ring */
 			IXGBE_CB(skb)->page_released = true;
 		} else {
···
 	/* Allow the MAC to stop its clock if the PHY has the capability */
 	pl->mac_tx_clk_stop = phy_eee_tx_clock_stop_capable(phy) > 0;
 
-	/* Explicitly configure whether the PHY is allowed to stop it's
-	 * receive clock.
-	 */
-	ret = phy_eee_rx_clock_stop(phy, pl->config->eee_rx_clk_stop_enable);
-	if (ret == -EOPNOTSUPP)
-		ret = 0;
+	if (pl->mac_supports_eee_ops) {
+		/* Explicitly configure whether the PHY is allowed to stop it's
+		 * receive clock.
+		 */
+		ret = phy_eee_rx_clock_stop(phy,
+					    pl->config->eee_rx_clk_stop_enable);
+		if (ret == -EOPNOTSUPP)
+			ret = 0;
+	}
 
 	return ret;
 }
+2-2
drivers/net/pse-pd/pse_core.c
···
 		goto out;
 	mW = ret;
 
-	ret = pse_pi_get_voltage(rdev);
+	ret = _pse_pi_get_voltage(rdev);
 	if (!ret) {
 		dev_err(pcdev->dev, "Voltage null\n");
 		ret = -ERANGE;
···
 
 	id = rdev_get_id(rdev);
 	mutex_lock(&pcdev->lock);
-	ret = pse_pi_get_voltage(rdev);
+	ret = _pse_pi_get_voltage(rdev);
 	if (!ret) {
 		dev_err(pcdev->dev, "Voltage null\n");
 		ret = -ERANGE;
+3-1
drivers/net/team/team_core.c
···
 		ctx.data.u32_val = nla_get_u32(attr_data);
 		break;
 	case TEAM_OPTION_TYPE_STRING:
-		if (nla_len(attr_data) > TEAM_STRING_MAX_LEN) {
+		if (nla_len(attr_data) > TEAM_STRING_MAX_LEN ||
+		    !memchr(nla_data(attr_data), '\0',
+			    nla_len(attr_data))) {
 			err = -EINVAL;
 			goto team_put;
 		}
···
 	if (likely(cpu < tq_number))
 		tq = &adapter->tx_queue[cpu];
 	else
-		tq = &adapter->tx_queue[reciprocal_scale(cpu, tq_number)];
+		tq = &adapter->tx_queue[cpu % tq_number];
 
 	return tq;
 }
···
 	u32 buf_size;
 	u32 dw2;
 
+	spin_lock_irq(&tq->tx_lock);
 	dw2 = (tq->tx_ring.gen ^ 0x1) << VMXNET3_TXD_GEN_SHIFT;
 	dw2 |= xdpf->len;
 	ctx.sop_txd = tq->tx_ring.base + tq->tx_ring.next2fill;
···
 
 	if (vmxnet3_cmd_ring_desc_avail(&tq->tx_ring) == 0) {
 		tq->stats.tx_ring_full++;
+		spin_unlock_irq(&tq->tx_lock);
 		return -ENOSPC;
 	}
 
···
 		tbi->dma_addr = dma_map_single(&adapter->pdev->dev,
 					       xdpf->data, buf_size,
 					       DMA_TO_DEVICE);
-		if (dma_mapping_error(&adapter->pdev->dev, tbi->dma_addr))
+		if (dma_mapping_error(&adapter->pdev->dev, tbi->dma_addr)) {
+			spin_unlock_irq(&tq->tx_lock);
 			return -EFAULT;
+		}
 		tbi->map_type |= VMXNET3_MAP_SINGLE;
 	} else { /* XDP buffer from page pool */
 		page = virt_to_page(xdpf->data);
···
 	dma_wmb();
 	gdesc->dword[2] = cpu_to_le32(le32_to_cpu(gdesc->dword[2]) ^
 				      VMXNET3_TXD_GEN);
+	spin_unlock_irq(&tq->tx_lock);
 
 	/* No need to handle the case when tx_num_deferred doesn't reach
 	 * threshold. Backend driver at hypervisor side will poll and reset
···
 {
 	struct vmxnet3_adapter *adapter = netdev_priv(dev);
 	struct vmxnet3_tx_queue *tq;
+	struct netdev_queue *nq;
 	int i;
 
 	if (unlikely(test_bit(VMXNET3_STATE_BIT_QUIESCED, &adapter->state)))
···
 	if (tq->stopped)
 		return -ENETDOWN;
 
+	nq = netdev_get_tx_queue(adapter->netdev, tq->qid);
+
+	__netif_tx_lock(nq, smp_processor_id());
 	for (i = 0; i < n; i++) {
 		if (vmxnet3_xdp_xmit_frame(adapter, frames[i], tq, true)) {
 			tq->stats.xdp_xmit_err++;
···
 		}
 	}
 	tq->stats.xdp_xmit += i;
+	__netif_tx_unlock(nq);
 
 	return i;
 }
+5-2
drivers/net/vxlan/vxlan_core.c
···
 	struct vxlan_dev *vxlan = netdev_priv(dev);
 	int err;
 
-	if (vxlan->cfg.flags & VXLAN_F_VNIFILTER)
-		vxlan_vnigroup_init(vxlan);
+	if (vxlan->cfg.flags & VXLAN_F_VNIFILTER) {
+		err = vxlan_vnigroup_init(vxlan);
+		if (err)
+			return err;
+	}
 
 	err = gro_cells_init(&vxlan->gro_cells, dev);
 	if (err)
+45-16
drivers/net/wireless/ath/ath12k/wmi.c
···
 	return reg_rule_ptr;
 }
 
+static u8 ath12k_wmi_ignore_num_extra_rules(struct ath12k_wmi_reg_rule_ext_params *rule,
+					    u32 num_reg_rules)
+{
+	u8 num_invalid_5ghz_rules = 0;
+	u32 count, start_freq;
+
+	for (count = 0; count < num_reg_rules; count++) {
+		start_freq = le32_get_bits(rule[count].freq_info, REG_RULE_START_FREQ);
+
+		if (start_freq >= ATH12K_MIN_6G_FREQ)
+			num_invalid_5ghz_rules++;
+	}
+
+	return num_invalid_5ghz_rules;
+}
+
 static int ath12k_pull_reg_chan_list_ext_update_ev(struct ath12k_base *ab,
 						   struct sk_buff *skb,
 						   struct ath12k_reg_info *reg_info)
···
 	u32 num_2g_reg_rules, num_5g_reg_rules;
 	u32 num_6g_reg_rules_ap[WMI_REG_CURRENT_MAX_AP_TYPE];
 	u32 num_6g_reg_rules_cl[WMI_REG_CURRENT_MAX_AP_TYPE][WMI_REG_MAX_CLIENT_TYPE];
+	u8 num_invalid_5ghz_ext_rules;
 	u32 total_reg_rules = 0;
 	int ret, i, j;
···
 	}
 
 	memcpy(reg_info->alpha2, &ev->alpha2, REG_ALPHA2_LEN);
-
-	/* FIXME: Currently FW includes 6G reg rule also in 5G rule
-	 * list for country US.
-	 * Having same 6G reg rule in 5G and 6G rules list causes
-	 * intersect check to be true, and same rules will be shown
-	 * multiple times in iw cmd. So added hack below to avoid
-	 * parsing 6G rule from 5G reg rule list, and this can be
-	 * removed later, after FW updates to remove 6G reg rule
-	 * from 5G rules list.
-	 */
-	if (memcmp(reg_info->alpha2, "US", 2) == 0) {
-		reg_info->num_5g_reg_rules = REG_US_5G_NUM_REG_RULES;
-		num_5g_reg_rules = reg_info->num_5g_reg_rules;
-	}
 
 	reg_info->dfs_region = le32_to_cpu(ev->dfs_region);
 	reg_info->phybitmap = le32_to_cpu(ev->phybitmap);
···
 		}
 	}
 
+	ext_wmi_reg_rule += num_2g_reg_rules;
+
+	/* Firmware might include 6 GHz reg rule in 5 GHz rule list
+	 * for few countries along with separate 6 GHz rule.
+	 * Having same 6 GHz reg rule in 5 GHz and 6 GHz rules list
+	 * causes intersect check to be true, and same rules will be
+	 * shown multiple times in iw cmd.
+	 * Hence, avoid parsing 6 GHz rule from 5 GHz reg rule list
+	 */
+	num_invalid_5ghz_ext_rules = ath12k_wmi_ignore_num_extra_rules(ext_wmi_reg_rule,
+								       num_5g_reg_rules);
+
+	if (num_invalid_5ghz_ext_rules) {
+		ath12k_dbg(ab, ATH12K_DBG_WMI,
+			   "CC: %s 5 GHz reg rules number %d from fw, %d number of invalid 5 GHz rules",
+			   reg_info->alpha2, reg_info->num_5g_reg_rules,
+			   num_invalid_5ghz_ext_rules);
+
+		num_5g_reg_rules = num_5g_reg_rules - num_invalid_5ghz_ext_rules;
+		reg_info->num_5g_reg_rules = num_5g_reg_rules;
+	}
+
 	if (num_5g_reg_rules) {
-		ext_wmi_reg_rule += num_2g_reg_rules;
 		reg_info->reg_rules_5g_ptr =
 			create_ext_reg_rules_from_wmi(num_5g_reg_rules,
 						      ext_wmi_reg_rule);
···
 		}
 	}
 
-	ext_wmi_reg_rule += num_5g_reg_rules;
+	/* We have adjusted the number of 5 GHz reg rules above. But still those
+	 * many rules needs to be adjusted in ext_wmi_reg_rule.
+	 *
+	 * NOTE: num_invalid_5ghz_ext_rules will be 0 for rest other cases.
+	 */
+	ext_wmi_reg_rule += (num_5g_reg_rules + num_invalid_5ghz_ext_rules);
 
 	for (i = 0; i < WMI_REG_CURRENT_MAX_AP_TYPE; i++) {
 		reg_info->reg_rules_6g_ap_ptr[i] =
···
 
 	status = nvme_set_features(ctrl, NVME_FEAT_NUM_QUEUES, q_count, NULL, 0,
 			&result);
-	if (status < 0)
+
+	/*
+	 * It's either a kernel error or the host observed a connection
+	 * lost. In either case it's not possible communicate with the
+	 * controller and thus enter the error code path.
+	 */
+	if (status < 0 || status == NVME_SC_HOST_PATH_ERROR)
 		return status;
 
 	/*
+25-10
drivers/nvme/host/fc.c
···
 static void
 nvme_fc_ctrl_connectivity_loss(struct nvme_fc_ctrl *ctrl)
 {
+	enum nvme_ctrl_state state;
+	unsigned long flags;
+
 	dev_info(ctrl->ctrl.device,
 		"NVME-FC{%d}: controller connectivity lost. Awaiting "
 		"Reconnect", ctrl->cnum);
 
-	switch (nvme_ctrl_state(&ctrl->ctrl)) {
+	spin_lock_irqsave(&ctrl->lock, flags);
+	set_bit(ASSOC_FAILED, &ctrl->flags);
+	state = nvme_ctrl_state(&ctrl->ctrl);
+	spin_unlock_irqrestore(&ctrl->lock, flags);
+
+	switch (state) {
 	case NVME_CTRL_NEW:
 	case NVME_CTRL_LIVE:
 		/*
···
 	nvme_fc_complete_rq(rq);
 
check_error:
-	if (terminate_assoc && ctrl->ctrl.state != NVME_CTRL_RESETTING)
+	if (terminate_assoc &&
+	    nvme_ctrl_state(&ctrl->ctrl) != NVME_CTRL_RESETTING)
 		queue_work(nvme_reset_wq, &ctrl->ioerr_work);
 }
···
 static void
 nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl, char *errmsg)
 {
+	enum nvme_ctrl_state state = nvme_ctrl_state(&ctrl->ctrl);
+
 	/*
 	 * if an error (io timeout, etc) while (re)connecting, the remote
 	 * port requested terminating of the association (disconnect_ls)
···
 	 * the controller. Abort any ios on the association and let the
 	 * create_association error path resolve things.
 	 */
-	if (ctrl->ctrl.state == NVME_CTRL_CONNECTING) {
+	if (state == NVME_CTRL_CONNECTING) {
 		__nvme_fc_abort_outstanding_ios(ctrl, true);
-		set_bit(ASSOC_FAILED, &ctrl->flags);
 		dev_warn(ctrl->ctrl.device,
 			"NVME-FC{%d}: transport error during (re)connect\n",
 			ctrl->cnum);
···
 	}
 
 	/* Otherwise, only proceed if in LIVE state - e.g. on first error */
-	if (ctrl->ctrl.state != NVME_CTRL_LIVE)
+	if (state != NVME_CTRL_LIVE)
 		return;
 
 	dev_warn(ctrl->ctrl.device,
···
 		else
 			ret = nvme_fc_recreate_io_queues(ctrl);
 	}
-	if (!ret && test_bit(ASSOC_FAILED, &ctrl->flags))
-		ret = -EIO;
 	if (ret)
 		goto out_term_aen_ops;
 
-	changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);
+	spin_lock_irqsave(&ctrl->lock, flags);
+	if (!test_bit(ASSOC_FAILED, &ctrl->flags))
+		changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);
+	else
+		ret = -EIO;
+	spin_unlock_irqrestore(&ctrl->lock, flags);
+
+	if (ret)
+		goto out_term_aen_ops;
 
 	ctrl->ctrl.nr_reconnects = 0;
···
 	list_add_tail(&ctrl->ctrl_list, &rport->ctrl_list);
 	spin_unlock_irqrestore(&rport->lock, flags);
 
-	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING) ||
-	    !nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
+	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
 		dev_err(ctrl->ctrl.device,
 			"NVME-FC{%d}: failed to init ctrl state\n", ctrl->cnum);
 		goto fail_ctrl;
+3-9
drivers/nvme/host/pci.c
···
 	return 0;
 
out_free_bufs:
-	while (--i >= 0) {
-		size_t size = le32_to_cpu(descs[i].size) * NVME_CTRL_PAGE_SIZE;
-
-		dma_free_attrs(dev->dev, size, bufs[i],
-			       le64_to_cpu(descs[i].addr),
-			       DMA_ATTR_NO_KERNEL_MAPPING | DMA_ATTR_NO_WARN);
-	}
-
 	kfree(bufs);
out_free_descs:
 	dma_free_coherent(dev->dev, descs_size, descs, descs_dma);
···
 		 * because of high power consumption (> 2 Watt) in s2idle
 		 * sleep. Only some boards with Intel CPU are affected.
 		 */
-		if (dmi_match(DMI_BOARD_NAME, "GMxPXxx") ||
+		if (dmi_match(DMI_BOARD_NAME, "DN50Z-140HC-YD") ||
+		    dmi_match(DMI_BOARD_NAME, "GMxPXxx") ||
+		    dmi_match(DMI_BOARD_NAME, "GXxMRXx") ||
 		    dmi_match(DMI_BOARD_NAME, "PH4PG31") ||
 		    dmi_match(DMI_BOARD_NAME, "PH4PRX1_PH6PRX1") ||
 		    dmi_match(DMI_BOARD_NAME, "PH6PG01_PH6PG71"))
···
 	pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL2, cap++);
 	pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL1, cap++);
 
-	if (parent->state_saved)
-		return;
-
 	/*
 	 * Save parent's L1 substate configuration so we have it for
 	 * pci_restore_aspm_l1ss_state(pdev) to restore.
+3-2
drivers/pci/probe.c
···
 	return (res->flags & IORESOURCE_MEM_64) ? 1 : 0;
 }
 
-static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
+static __always_inline void pci_read_bases(struct pci_dev *dev,
+					   unsigned int howmany, int rom)
 {
 	u32 rombar, stdbars[PCI_STD_NUM_BARS];
 	unsigned int pos, reg;
 	u16 orig_cmd;
 
-	BUILD_BUG_ON(howmany > PCI_STD_NUM_BARS);
+	BUILD_BUG_ON(statically_true(howmany > PCI_STD_NUM_BARS));
 
 	if (dev->non_compliant_bars)
 		return;
···
  * IFS Image
  * ---------
  *
- * Intel provides a firmware file containing the scan tests via
- * github [#f1]_.  Similar to microcode there is a separate file for each
+ * Intel provides firmware files containing the scan tests via the webpage [#f1]_.
+ * Look under "In-Field Scan Test Images Download" section towards the
+ * end of the page. Similar to microcode, there are separate files for each
  * family-model-stepping. IFS Images are not applicable for some test types.
  * Wherever applicable the sysfs directory would provide a "current_batch" file
  * (see below) for loading the image.
  *
+ * .. [#f1] https://intel.com/InFieldScan
  *
  * IFS Image Loading
  * -----------------
···
  *
  * 2) Hardware allows for some number of cores to be tested in parallel.
  * The driver does not make use of this, it only tests one core at a time.
- *
- * .. [#f1] https://github.com/intel/TBD
- *
  *
  * Structural Based Functional Test at Field (SBAF):
  * -------------------------------------------------
+65-20
drivers/platform/x86/intel/int3472/discrete.c
···
 /* Author: Dan Scally <djrscally@gmail.com> */
 
 #include <linux/acpi.h>
+#include <linux/array_size.h>
 #include <linux/bitfield.h>
 #include <linux/device.h>
 #include <linux/gpio/consumer.h>
···
 
 static int skl_int3472_fill_gpiod_lookup(struct gpiod_lookup *table_entry,
 					 struct acpi_resource_gpio *agpio,
-					 const char *func, u32 polarity)
+					 const char *func, unsigned long gpio_flags)
 {
 	char *path = agpio->resource_source.string_ptr;
 	struct acpi_device *adev;
···
 	if (!adev)
 		return -ENODEV;
 
-	*table_entry = GPIO_LOOKUP(acpi_dev_name(adev), agpio->pin_table[0], func, polarity);
+	*table_entry = GPIO_LOOKUP(acpi_dev_name(adev), agpio->pin_table[0], func, gpio_flags);
 
 	return 0;
 }
 
 static int skl_int3472_map_gpio_to_sensor(struct int3472_discrete_device *int3472,
 					  struct acpi_resource_gpio *agpio,
-					  const char *func, u32 polarity)
+					  const char *func, unsigned long gpio_flags)
 {
 	int ret;
···
 	}
 
 	ret = skl_int3472_fill_gpiod_lookup(&int3472->gpios.table[int3472->n_sensor_gpios],
-					    agpio, func, polarity);
+					    agpio, func, gpio_flags);
 	if (ret)
 		return ret;
···
 static struct gpio_desc *
 skl_int3472_gpiod_get_from_temp_lookup(struct int3472_discrete_device *int3472,
 				       struct acpi_resource_gpio *agpio,
-				       const char *func, u32 polarity)
+				       const char *func, unsigned long gpio_flags)
 {
 	struct gpio_desc *desc;
 	int ret;
···
 		return ERR_PTR(-ENOMEM);
 
 	lookup->dev_id = dev_name(int3472->dev);
-	ret = skl_int3472_fill_gpiod_lookup(&lookup->table[0], agpio, func, polarity);
+	ret = skl_int3472_fill_gpiod_lookup(&lookup->table[0], agpio, func, gpio_flags);
 	if (ret)
 		return ERR_PTR(ret);
···
 	return desc;
 }
 
-static void int3472_get_func_and_polarity(u8 type, const char **func, u32 *polarity)
+/**
+ * struct int3472_gpio_map - Map GPIOs to whatever is expected by the
+ * sensor driver (as in DT bindings)
+ * @hid: The ACPI HID of the device without the instance number e.g. INT347E
+ * @type_from: The GPIO type from ACPI ?SDT
+ * @type_to: The assigned GPIO type, typically same as @type_from
+ * @func: The function, e.g. "enable"
+ * @polarity_low: GPIO_ACTIVE_LOW true if the @polarity_low is true,
+ * GPIO_ACTIVE_HIGH otherwise
+ */
+struct int3472_gpio_map {
+	const char *hid;
+	u8 type_from;
+	u8 type_to;
+	bool polarity_low;
+	const char *func;
+};
+
+static const struct int3472_gpio_map int3472_gpio_map[] = {
+	{ "INT347E", INT3472_GPIO_TYPE_RESET, INT3472_GPIO_TYPE_RESET, false, "enable" },
+};
+
+static void int3472_get_func_and_polarity(struct acpi_device *adev, u8 *type,
+					  const char **func, unsigned long *gpio_flags)
 {
-	switch (type) {
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(int3472_gpio_map); i++) {
+		/*
+		 * Map the firmware-provided GPIO to whatever a driver expects
+		 * (as in DT bindings). First check if the type matches with the
+		 * GPIO map, then further check that the device _HID matches.
+		 */
+		if (*type != int3472_gpio_map[i].type_from)
+			continue;
+
+		if (!acpi_dev_hid_uid_match(adev, int3472_gpio_map[i].hid, NULL))
+			continue;
+
+		*type = int3472_gpio_map[i].type_to;
+		*gpio_flags = int3472_gpio_map[i].polarity_low ?
+			      GPIO_ACTIVE_LOW : GPIO_ACTIVE_HIGH;
+		*func = int3472_gpio_map[i].func;
+		return;
+	}
+
+	switch (*type) {
 	case INT3472_GPIO_TYPE_RESET:
 		*func = "reset";
-		*polarity = GPIO_ACTIVE_LOW;
+		*gpio_flags = GPIO_ACTIVE_LOW;
 		break;
 	case INT3472_GPIO_TYPE_POWERDOWN:
 		*func = "powerdown";
-		*polarity = GPIO_ACTIVE_LOW;
+		*gpio_flags = GPIO_ACTIVE_LOW;
 		break;
 	case INT3472_GPIO_TYPE_CLK_ENABLE:
 		*func = "clk-enable";
-		*polarity = GPIO_ACTIVE_HIGH;
+		*gpio_flags = GPIO_ACTIVE_HIGH;
 		break;
 	case INT3472_GPIO_TYPE_PRIVACY_LED:
 		*func = "privacy-led";
-		*polarity = GPIO_ACTIVE_HIGH;
+		*gpio_flags = GPIO_ACTIVE_HIGH;
 		break;
 	case INT3472_GPIO_TYPE_POWER_ENABLE:
 		*func = "power-enable";
-		*polarity = GPIO_ACTIVE_HIGH;
+		*gpio_flags = GPIO_ACTIVE_HIGH;
 		break;
 	default:
 		*func = "unknown";
-		*polarity = GPIO_ACTIVE_HIGH;
+		*gpio_flags = GPIO_ACTIVE_HIGH;
 		break;
 	}
 }
···
 	struct gpio_desc *gpio;
 	const char *err_msg;
 	const char *func;
-	u32 polarity;
+	unsigned long gpio_flags;
 	int ret;
 
 	if (!acpi_gpio_get_io_resource(ares, &agpio))
···
 
 	type = FIELD_GET(INT3472_GPIO_DSM_TYPE, obj->integer.value);
 
-	int3472_get_func_and_polarity(type, &func, &polarity);
+	int3472_get_func_and_polarity(int3472->sensor, &type, &func, &gpio_flags);
 
 	pin = FIELD_GET(INT3472_GPIO_DSM_PIN, obj->integer.value);
 	/* Pin field is not really used under Windows and wraps around at 8 bits */
···
 
 	active_value = FIELD_GET(INT3472_GPIO_DSM_SENSOR_ON_VAL, obj->integer.value);
 	if (!active_value)
-		polarity ^= GPIO_ACTIVE_LOW;
+		gpio_flags ^= GPIO_ACTIVE_LOW;
 
 	dev_dbg(int3472->dev, "%s %s pin %d active-%s\n", func,
 		agpio->resource_source.string_ptr, agpio->pin_table[0],
-		str_high_low(polarity == GPIO_ACTIVE_HIGH));
+		str_high_low(gpio_flags == GPIO_ACTIVE_HIGH));
 
 	switch (type) {
 	case INT3472_GPIO_TYPE_RESET:
 	case INT3472_GPIO_TYPE_POWERDOWN:
-		ret = skl_int3472_map_gpio_to_sensor(int3472, agpio, func, polarity);
+		ret = skl_int3472_map_gpio_to_sensor(int3472, agpio, func, gpio_flags);
 		if (ret)
 			err_msg = "Failed to map GPIO pin to sensor\n";
···
 	case INT3472_GPIO_TYPE_CLK_ENABLE:
 	case INT3472_GPIO_TYPE_PRIVACY_LED:
 	case INT3472_GPIO_TYPE_POWER_ENABLE:
-		gpio = skl_int3472_gpiod_get_from_temp_lookup(int3472, agpio, func, polarity);
+		gpio = skl_int3472_gpiod_get_from_temp_lookup(int3472, agpio, func, gpio_flags);
 		if (IS_ERR(gpio)) {
 			ret = PTR_ERR(gpio);
 			err_msg = "Failed to get GPIO\n";
···
 
 #define FAN_NS_CTRL_STATUS	BIT(2)		/* Bit which determines control is enabled or not */
 #define FAN_NS_CTRL		BIT(4)		/* Bit which determines control is by host or EC */
+#define FAN_CLOCK_TPM		(22500*60)	/* Ticks per minute for a 22.5 kHz clock */
 
 enum {					/* Fan control constants */
 	fan_status_offset = 0x2f,	/* EC register 0x2f */
···
 
 static bool fan_with_ns_addr;
 static bool ecfw_with_fan_dec_rpm;
+static bool fan_speed_in_tpr;
 
 static struct mutex fan_mutex;
 
···
 			     !acpi_ec_read(fan_rpm_offset + 1, &hi)))
 			return -EIO;
 
-		if (likely(speed))
+		if (likely(speed)) {
 			*speed = (hi << 8) | lo;
+			if (fan_speed_in_tpr && *speed != 0)
+				*speed = FAN_CLOCK_TPM / *speed;
+		}
 		break;
 	case TPACPI_FAN_RD_TPEC_NS:
 		if (!acpi_ec_read(fan_rpm_status_ns, &lo))
···
 		if (rc)
 			return -EIO;
 
-		if (likely(speed))
+		if (likely(speed)) {
 			*speed = (hi << 8) | lo;
+			if (fan_speed_in_tpr && *speed != 0)
+				*speed = FAN_CLOCK_TPM / *speed;
+		}
 		break;
 
 	case TPACPI_FAN_RD_TPEC_NS:
···
 #define TPACPI_FAN_NOFAN	0x0008		/* no fan available */
 #define TPACPI_FAN_NS		0x0010		/* For EC with non-Standard register addresses */
 #define TPACPI_FAN_DECRPM	0x0020		/* For ECFW's with RPM in register as decimal */
+#define TPACPI_FAN_TPR		0x0040		/* Fan speed is in Ticks Per Revolution */
 
 static const struct tpacpi_quirk fan_quirk_table[] __initconst = {
 	TPACPI_QEC_IBM('1', 'Y', TPACPI_FAN_Q1),
···
 	TPACPI_Q_LNV3('R', '0', 'V', TPACPI_FAN_NS),	/* 11e Gen5 KL-Y */
 	TPACPI_Q_LNV3('N', '1', 'O', TPACPI_FAN_NOFAN),	/* X1 Tablet (2nd gen) */
 	TPACPI_Q_LNV3('R', '0', 'Q', TPACPI_FAN_DECRPM),/* L480 */
+	TPACPI_Q_LNV('8', 'F', TPACPI_FAN_TPR),		/* ThinkPad x120e */
 };
 
 static int __init fan_init(struct ibm_init_struct *iibm)
···
 
 		if (quirks & TPACPI_FAN_Q1)
 			fan_quirk1_setup();
+		if (quirks & TPACPI_FAN_TPR)
+			fan_speed_in_tpr = true;
 		/* Try and probe the 2nd fan */
 		tp_features.second_fan = 1; /* needed for get_speed to work */
 		res = fan2_get_speed(&speed);
···
 #define DYTC_MODE_PSC_BALANCE	5  /* Default mode aka balanced */
 #define DYTC_MODE_PSC_PERFORM	7  /* High power mode aka performance */
 
+#define DYTC_MODE_PSCV9_LOWPOWER 1  /* Low power mode */
+#define DYTC_MODE_PSCV9_BALANCE	3  /* Default mode aka balanced */
+#define DYTC_MODE_PSCV9_PERFORM	4  /* High power mode aka performance */
+
 #define DYTC_ERR_MASK       0xF  /* Bits 0-3 in cmd result are the error result */
 #define DYTC_ERR_SUCCESS      1  /* CMD completed successful */
 
···
 static int dytc_capabilities;
 static bool dytc_mmc_get_available;
 static int profile_force;
+
+static int platform_psc_profile_lowpower = DYTC_MODE_PSC_LOWPOWER;
+static int platform_psc_profile_balanced = DYTC_MODE_PSC_BALANCE;
+static int platform_psc_profile_performance = DYTC_MODE_PSC_PERFORM;
 
 static int convert_dytc_to_profile(int funcmode, int dytcmode,
 		enum platform_profile_option *profile)
···
 		}
 		return 0;
 	case DYTC_FUNCTION_PSC:
-		switch (dytcmode) {
-		case DYTC_MODE_PSC_LOWPOWER:
+		if (dytcmode == platform_psc_profile_lowpower)
 			*profile = PLATFORM_PROFILE_LOW_POWER;
-			break;
-		case DYTC_MODE_PSC_BALANCE:
+		else if (dytcmode == platform_psc_profile_balanced)
 			*profile = PLATFORM_PROFILE_BALANCED;
-			break;
-		case DYTC_MODE_PSC_PERFORM:
+		else if (dytcmode == platform_psc_profile_performance)
 			*profile = PLATFORM_PROFILE_PERFORMANCE;
-			break;
-		default: /* Unknown mode */
+		else
 			return -EINVAL;
-		}
+
 		return 0;
 	case DYTC_FUNCTION_AMT:
 		/* For now return balanced. It's the closest we have to 'auto' */
···
 		if (dytc_capabilities & BIT(DYTC_FC_MMC))
 			*perfmode = DYTC_MODE_MMC_LOWPOWER;
 		else if (dytc_capabilities & BIT(DYTC_FC_PSC))
-			*perfmode = DYTC_MODE_PSC_LOWPOWER;
+			*perfmode = platform_psc_profile_lowpower;
 		break;
 	case PLATFORM_PROFILE_BALANCED:
 		if (dytc_capabilities & BIT(DYTC_FC_MMC))
 			*perfmode = DYTC_MODE_MMC_BALANCE;
 		else if (dytc_capabilities & BIT(DYTC_FC_PSC))
-			*perfmode = DYTC_MODE_PSC_BALANCE;
+			*perfmode = platform_psc_profile_balanced;
 		break;
 	case PLATFORM_PROFILE_PERFORMANCE:
 		if (dytc_capabilities & BIT(DYTC_FC_MMC))
 			*perfmode = DYTC_MODE_MMC_PERFORM;
 		else if (dytc_capabilities & BIT(DYTC_FC_PSC))
-			*perfmode = DYTC_MODE_PSC_PERFORM;
+			*perfmode = platform_psc_profile_performance;
 		break;
 	default: /* Unknown profile */
 		return -EOPNOTSUPP;
···
 	if (output & BIT(DYTC_QUERY_ENABLE_BIT))
 		dytc_version = (output >> DYTC_QUERY_REV_BIT) & 0xF;
 
+	dbg_printk(TPACPI_DBG_INIT, "DYTC version %d\n", dytc_version);
 	/* Check DYTC is enabled and supports mode setting */
 	if (dytc_version < 5)
 		return -ENODEV;
···
 		}
 	} else if (dytc_capabilities & BIT(DYTC_FC_PSC)) { /* PSC MODE */
 		pr_debug("PSC is supported\n");
+		if (dytc_version >= 9) { /* update profiles for DYTC 9 and up */
+			platform_psc_profile_lowpower = DYTC_MODE_PSCV9_LOWPOWER;
+			platform_psc_profile_balanced = DYTC_MODE_PSCV9_BALANCE;
+			platform_psc_profile_performance = DYTC_MODE_PSCV9_PERFORM;
+		}
 	} else {
 		dbg_printk(TPACPI_DBG_INIT, "No DYTC support available\n");
 		return -ENODEV;
···
 		"DYTC version %d: thermal mode available\n", dytc_version);
 
 	/* Create platform_profile structure and register */
-	tpacpi_pprof = devm_platform_profile_register(&tpacpi_pdev->dev, "thinkpad-acpi",
-						      NULL, &dytc_profile_ops);
+	tpacpi_pprof = platform_profile_register(&tpacpi_pdev->dev, "thinkpad-acpi-profile",
+						 NULL, &dytc_profile_ops);
 	/*
 	 * If for some reason platform_profiles aren't enabled
 	 * don't quit terminally.
···
 	return 0;
 }
 
+static void dytc_profile_exit(void)
+{
+	if (!IS_ERR_OR_NULL(tpacpi_pprof))
+		platform_profile_remove(tpacpi_pprof);
+}
+
 static struct ibm_struct  dytc_profile_driver_data = {
 	.name = "dytc-profile",
+	.exit = dytc_profile_exit,
 };
 
 /*************************************************************************
+1-2
drivers/powercap/powercap_sys.c
···627627 dev_set_name(&control_type->dev, "%s", name);628628 result = device_register(&control_type->dev);629629 if (result) {630630- if (control_type->allocated)631631- kfree(control_type);630630+ put_device(&control_type->dev);632631 return ERR_PTR(result);633632 }634633 idr_init(&control_type->idr);
+21-26
drivers/ptp/ptp_vmclock.c
···414414}415415416416static const struct file_operations vmclock_miscdev_fops = {417417+ .owner = THIS_MODULE,417418 .mmap = vmclock_miscdev_mmap,418419 .read = vmclock_miscdev_read,419420};420421421422/* module operations */422423423423-static void vmclock_remove(struct platform_device *pdev)424424+static void vmclock_remove(void *data)424425{425425- struct device *dev = &pdev->dev;426426- struct vmclock_state *st = dev_get_drvdata(dev);426426+ struct vmclock_state *st = data;427427428428 if (st->ptp_clock)429429 ptp_clock_unregister(st->ptp_clock);···506506507507 if (ret) {508508 dev_info(dev, "Failed to obtain physical address: %d\n", ret);509509- goto out;509509+ return ret;510510 }511511512512 if (resource_size(&st->res) < VMCLOCK_MIN_SIZE) {513513 dev_info(dev, "Region too small (0x%llx)\n",514514 resource_size(&st->res));515515- ret = -EINVAL;516516- goto out;515515+ return -EINVAL;517516 }518517 st->clk = devm_memremap(dev, st->res.start, resource_size(&st->res),519518 MEMREMAP_WB | MEMREMAP_DEC);···520521 ret = PTR_ERR(st->clk);521522 dev_info(dev, "failed to map shared memory\n");522523 st->clk = NULL;523523- goto out;524524+ return ret;524525 }525526526527 if (le32_to_cpu(st->clk->magic) != VMCLOCK_MAGIC ||527528 le32_to_cpu(st->clk->size) > resource_size(&st->res) ||528529 le16_to_cpu(st->clk->version) != 1) {529530 dev_info(dev, "vmclock magic fields invalid\n");530530- ret = -EINVAL;531531- goto out;531531+ return -EINVAL;532532 }533533534534 ret = ida_alloc(&vmclock_ida, GFP_KERNEL);535535 if (ret < 0)536536- goto out;536536+ return ret;537537538538 st->index = ret;539539 ret = devm_add_action_or_reset(&pdev->dev, vmclock_put_idx, st);540540 if (ret)541541- goto out;541541+ return ret;542542543543 st->name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "vmclock%d", st->index);544544- if (!st->name) {545545- ret = -ENOMEM;546546- goto out;547547- }544544+ if (!st->name)545545+ return -ENOMEM;546546+547547+ st->miscdev.minor = 
MISC_DYNAMIC_MINOR;548548+549549+ ret = devm_add_action_or_reset(&pdev->dev, vmclock_remove, st);550550+ if (ret)551551+ return ret;548552549553 /*550554 * If the structure is big enough, it can be mapped to userspace.···556554 * cross that bridge if/when we come to it.557555 */558556 if (le32_to_cpu(st->clk->size) >= PAGE_SIZE) {559559- st->miscdev.minor = MISC_DYNAMIC_MINOR;560557 st->miscdev.fops = &vmclock_miscdev_fops;561558 st->miscdev.name = st->name;562559563560 ret = misc_register(&st->miscdev);564561 if (ret)565565- goto out;562562+ return ret;566563 }567564568565 /* If there is valid clock information, register a PTP clock */···571570 if (IS_ERR(st->ptp_clock)) {572571 ret = PTR_ERR(st->ptp_clock);573572 st->ptp_clock = NULL;574574- vmclock_remove(pdev);575575- goto out;573573+ return ret;576574 }577575 }578576579577 if (!st->miscdev.minor && !st->ptp_clock) {580578 /* Neither miscdev nor PTP registered */581579 dev_info(dev, "vmclock: Neither miscdev nor PTP available; not registering\n");582582- ret = -ENODEV;583583- goto out;580580+ return -ENODEV;584581 }585582586583 dev_info(dev, "%s: registered %s%s%s\n", st->name,···586587 (st->miscdev.minor && st->ptp_clock) ? ", " : "",587588 st->ptp_clock ? "PTP" : "");588589589589- dev_set_drvdata(dev, st);590590-591591- out:592592- return ret;590590+ return 0;593591}594592595593static const struct acpi_device_id vmclock_acpi_ids[] = {···597601598602static struct platform_driver vmclock_platform_driver = {599603 .probe = vmclock_probe,600600- .remove = vmclock_remove,601604 .driver = {602605 .name = "vmclock",603606 .acpi_match_table = vmclock_acpi_ids,
+27-34
drivers/regulator/core.c
···57745774 goto clean;57755775 }5776577657775777- if (config->init_data) {57785778- /*57795779- * Providing of_match means the framework is expected to parse57805780- * DT to get the init_data. This would conflict with provided57815781- * init_data, if set. Warn if it happens.57825782- */57835783- if (regulator_desc->of_match)57845784- dev_warn(dev, "Using provided init data - OF match ignored\n");57775777+ /*57785778+ * DT may override the config->init_data provided if the platform57795779+ * needs to do so. If so, config->init_data is completely ignored.57805780+ */57815781+ init_data = regulator_of_get_init_data(dev, regulator_desc, config,57825782+ &rdev->dev.of_node);5785578357845784+ /*57855785+ * Sometimes not all resources are probed already so we need to take57865786+ * that into account. This happens most the time if the ena_gpiod comes57875787+ * from a gpio extender or something else.57885788+ */57895789+ if (PTR_ERR(init_data) == -EPROBE_DEFER) {57905790+ ret = -EPROBE_DEFER;57915791+ goto clean;57925792+ }57935793+57945794+ /*57955795+ * We need to keep track of any GPIO descriptor coming from the57965796+ * device tree until we have handled it over to the core. If the57975797+ * config that was passed in to this function DOES NOT contain57985798+ * a descriptor, and the config after this call DOES contain57995799+ * a descriptor, we definitely got one from parsing the device58005800+ * tree.58015801+ */58025802+ if (!cfg->ena_gpiod && config->ena_gpiod)58035803+ dangling_of_gpiod = true;58045804+ if (!init_data) {57865805 init_data = config->init_data;57875806 rdev->dev.of_node = of_node_get(config->of_node);57885788-57895789- } else {57905790- init_data = regulator_of_get_init_data(dev, regulator_desc,57915791- config,57925792- &rdev->dev.of_node);57935793-57945794- /*57955795- * Sometimes not all resources are probed already so we need to57965796- * take that into account. 
This happens most the time if the57975797- * ena_gpiod comes from a gpio extender or something else.57985798- */57995799- if (PTR_ERR(init_data) == -EPROBE_DEFER) {58005800- ret = -EPROBE_DEFER;58015801- goto clean;58025802- }58035803-58045804- /*58055805- * We need to keep track of any GPIO descriptor coming from the58065806- * device tree until we have handled it over to the core. If the58075807- * config that was passed in to this function DOES NOT contain a58085808- * descriptor, and the config after this call DOES contain a58095809- * descriptor, we definitely got one from parsing the device58105810- * tree.58115811- */58125812- if (!cfg->ena_gpiod && config->ena_gpiod)58135813- dangling_of_gpiod = true;58145807 }5815580858165809 ww_mutex_init(&rdev->mutex, ®ulator_ww_class);
+2-1
drivers/s390/cio/chp.c
···695695 if (time_after(jiffies, chp_info_expires)) {696696 /* Data is too old, update. */697697 rc = sclp_chp_read_info(&chp_info);698698- chp_info_expires = jiffies + CHP_INFO_UPDATE_INTERVAL ;698698+ if (!rc)699699+ chp_info_expires = jiffies + CHP_INFO_UPDATE_INTERVAL;699700 }700701 mutex_unlock(&info_lock);701702
···872872 case 0x1a: /* start stop unit in progress */873873 case 0x1b: /* sanitize in progress */874874 case 0x1d: /* configuration in progress */875875- case 0x24: /* depopulation in progress */876876- case 0x25: /* depopulation restore in progress */877875 action = ACTION_DELAYED_RETRY;878876 break;879877 case 0x0a: /* ALUA state transition */880878 action = ACTION_DELAYED_REPREP;881879 break;880880+ /*881881+ * Depopulation might take many hours,882882+ * thus it is not worthwhile to retry.883883+ */884884+ case 0x24: /* depopulation in progress */885885+ case 0x25: /* depopulation restore in progress */886886+ fallthrough;882887 default:883888 action = ACTION_FAIL;884889 break;
+7
drivers/scsi/scsi_lib_test.c
···6767 };6868 int i;69697070+ /* Success */7171+ sc.result = 0;7272+ KUNIT_EXPECT_EQ(test, 0, scsi_check_passthrough(&sc, &failures));7373+ KUNIT_EXPECT_EQ(test, 0, scsi_check_passthrough(&sc, NULL));7474+ /* Command failed but caller did not pass in a failures array */7575+ scsi_build_sense(&sc, 0, ILLEGAL_REQUEST, 0x91, 0x36);7676+ KUNIT_EXPECT_EQ(test, 0, scsi_check_passthrough(&sc, NULL));7077 /* Match end of array */7178 scsi_build_sense(&sc, 0, ILLEGAL_REQUEST, 0x91, 0x36);7279 KUNIT_EXPECT_EQ(test, -EAGAIN, scsi_check_passthrough(&sc, &failures));
+1-1
drivers/scsi/scsi_scan.c
···246246 }247247 ret = sbitmap_init_node(&sdev->budget_map,248248 scsi_device_max_queue_depth(sdev),249249- new_shift, GFP_KERNEL,249249+ new_shift, GFP_NOIO,250250 sdev->request_queue->node, false, true);251251 if (!ret)252252 sbitmap_resize(&sdev->budget_map, depth);
···5757 * @max_level: maximum cooling level. One less than total number of valid5858 * cpufreq frequencies.5959 * @em: Reference on the Energy Model of the device6060- * @cdev: thermal_cooling_device pointer to keep track of the6161- * registered cooling device.6260 * @policy: cpufreq policy.6361 * @cooling_ops: cpufreq callbacks to thermal cooling device ops6462 * @idle_time: idle time stats
+1-1
drivers/tty/pty.c
···798798 nonseekable_open(inode, filp);799799800800 /* We refuse fsnotify events on ptmx, since it's a shared resource */801801- filp->f_mode |= FMODE_NONOTIFY;801801+ file_set_fsnotify_mode(filp, FMODE_NONOTIFY);802802803803 retval = tty_alloc_file(filp);804804 if (retval)
+2
drivers/tty/serial/8250/8250.h
···374374375375#ifdef CONFIG_SERIAL_8250_DMA376376extern int serial8250_tx_dma(struct uart_8250_port *);377377+extern void serial8250_tx_dma_flush(struct uart_8250_port *);377378extern int serial8250_rx_dma(struct uart_8250_port *);378379extern void serial8250_rx_dma_flush(struct uart_8250_port *);379380extern int serial8250_request_dma(struct uart_8250_port *);···407406{408407 return -1;409408}409409+static inline void serial8250_tx_dma_flush(struct uart_8250_port *p) { }410410static inline int serial8250_rx_dma(struct uart_8250_port *p)411411{412412 return -1;
+16
drivers/tty/serial/8250/8250_dma.c
···149149 return ret;150150}151151152152+void serial8250_tx_dma_flush(struct uart_8250_port *p)153153+{154154+ struct uart_8250_dma *dma = p->dma;155155+156156+ if (!dma->tx_running)157157+ return;158158+159159+ /*160160+ * kfifo_reset() has been called by the serial core, avoid161161+ * advancing and underflowing in __dma_tx_complete().162162+ */163163+ dma->tx_size = 0;164164+165165+ dmaengine_terminate_async(dma->txchan);166166+167167+152168int serial8250_rx_dma(struct uart_8250_port *p)153169{154170 struct uart_8250_dma *dma = p->dma;
···15611561 /* Always ask for fixed clock rate from a property. */15621562 device_property_read_u32(dev, "clock-frequency", &uartclk);1563156315641564- s->polling = !!irq;15641564+ s->polling = (irq <= 0);15651565 if (s->polling)15661566 dev_dbg(dev,15671567 "No interrupt pin definition, falling back to polling mode\n");
+7-5
drivers/tty/serial/serial_port.c
···173173 * The caller is responsible to initialize the following fields of the @port174174 * ->dev (must be valid)175175 * ->flags176176+ * ->iobase176177 * ->mapbase177178 * ->mapsize178179 * ->regshift (if @use_defaults is false)···215214 /* Read the registers I/O access type (default: MMIO 8-bit) */216215 ret = device_property_read_u32(dev, "reg-io-width", &value);217216 if (ret) {218218- port->iotype = UPIO_MEM;217217+ port->iotype = port->iobase ? UPIO_PORT : UPIO_MEM;219218 } else {220219 switch (value) {221220 case 1:···228227 port->iotype = device_is_big_endian(dev) ? UPIO_MEM32BE : UPIO_MEM32;229228 break;230229 default:231231- if (!use_defaults) {232232- dev_err(dev, "Unsupported reg-io-width (%u)\n", value);233233- return -EINVAL;234234- }235230 port->iotype = UPIO_UNKNOWN;236231 break;237232 }233233+ }234234+235235+ if (!use_defaults && port->iotype == UPIO_UNKNOWN) {236236+ dev_err(dev, "Unsupported reg-io-width (%u)\n", value);237237+ return -EINVAL;238238 }239239240240 /* Read the address mapping base offset (default: no offset) */
+35-33
drivers/ufs/core/ufshcd.c
···21202120 INIT_DELAYED_WORK(&hba->clk_gating.gate_work, ufshcd_gate_work);21212121 INIT_WORK(&hba->clk_gating.ungate_work, ufshcd_ungate_work);2122212221232123- spin_lock_init(&hba->clk_gating.lock);21242124-21252123 hba->clk_gating.clk_gating_workq = alloc_ordered_workqueue(21262124 "ufs_clk_gating_%d", WQ_MEM_RECLAIM | WQ_HIGHPRI,21272125 hba->host->host_no);···31043106 case UPIU_TRANSACTION_QUERY_RSP: {31053107 u8 response = lrbp->ucd_rsp_ptr->header.response;3106310831073107- if (response == 0)31093109+ if (response == 0) {31083110 err = ufshcd_copy_query_response(hba, lrbp);31113111+ } else {31123112+ err = -EINVAL;31133113+ dev_err(hba->dev, "%s: unexpected response in Query RSP: %x\n",31143114+ __func__, response);31153115+ }31093116 break;31103117 }31113118 case UPIU_TRANSACTION_REJECT_UPIU:···59795976 __func__, err);59805977}5981597859825982-static void ufshcd_temp_exception_event_handler(struct ufs_hba *hba, u16 status)59835983-{59845984- u32 value;59855985-59865986- if (ufshcd_query_attr_retry(hba, UPIU_QUERY_OPCODE_READ_ATTR,59875987- QUERY_ATTR_IDN_CASE_ROUGH_TEMP, 0, 0, &value))59885988- return;59895989-59905990- dev_info(hba->dev, "exception Tcase %d\n", value - 80);59915991-59925992- ufs_hwmon_notify_event(hba, status & MASK_EE_URGENT_TEMP);59935993-59945994- /*59955995- * A placeholder for the platform vendors to add whatever additional59965996- * steps required59975997- */59985998-}59995999-60005979static int __ufshcd_wb_toggle(struct ufs_hba *hba, bool set, enum flag_idn idn)60015980{60025981 u8 index;···61996214 ufshcd_bkops_exception_event_handler(hba);6200621562016216 if (status & hba->ee_drv_mask & MASK_EE_URGENT_TEMP)62026202- ufshcd_temp_exception_event_handler(hba, status);62176217+ ufs_hwmon_notify_event(hba, status & MASK_EE_URGENT_TEMP);6203621862046219 ufs_debugfs_exception_event(hba, status);62056220}···91459160 if (!IS_ERR_OR_NULL(clki->clk) && clki->enabled)91469161 clk_disable_unprepare(clki->clk);91479162 }91489148- } else if 
(!ret && on) {91639163+ } else if (!ret && on && hba->clk_gating.is_initialized) {91499164 scoped_guard(spinlock_irqsave, &hba->clk_gating.lock)91509165 hba->clk_gating.state = CLKS_ON;91519166 trace_ufshcd_clk_gating(dev_name(hba->dev),···1023210247#endif /* CONFIG_PM_SLEEP */10233102481023410249/**1023510235- * ufshcd_dealloc_host - deallocate Host Bus Adapter (HBA)1023610236- * @hba: pointer to Host Bus Adapter (HBA)1023710237- */1023810238-void ufshcd_dealloc_host(struct ufs_hba *hba)1023910239-{1024010240- scsi_host_put(hba->host);1024110241-}1024210242-EXPORT_SYMBOL_GPL(ufshcd_dealloc_host);1024310243-1024410244-/**1024510250 * ufshcd_set_dma_mask - Set dma mask based on the controller1024610251 * addressing capability1024710252 * @hba: per adapter instance···1025010275}10251102761025210277/**1027810278+ * ufshcd_devres_release - devres cleanup handler, invoked during release of1027910279+ * hba->dev1028010280+ * @host: pointer to SCSI host1028110281+ */1028210282+static void ufshcd_devres_release(void *host)1028310283+{1028410284+ scsi_host_put(host);1028510285+}1028610286+1028710287+/**1025310288 * ufshcd_alloc_host - allocate Host Bus Adapter (HBA)1025410289 * @dev: pointer to device handle1025510290 * @hba_handle: driver private handle1025610291 *1025710292 * Return: 0 on success, non-zero value on failure.1029310293+ *1029410294+ * NOTE: There is no corresponding ufshcd_dealloc_host() because this function1029510295+ * keeps track of its allocations using devres and deallocates everything on1029610296+ * device removal automatically.1025810297 */1025910298int ufshcd_alloc_host(struct device *dev, struct ufs_hba **hba_handle)1026010299{···1029010301 err = -ENOMEM;1029110302 goto out_error;1029210303 }1030410304+1030510305+ err = devm_add_action_or_reset(dev, ufshcd_devres_release,1030610306+ host);1030710307+ if (err)1030810308+ return dev_err_probe(dev, err,1030910309+ "failed to add ufshcd dealloc action\n");1031010310+1029310311 host->nr_maps = 
HCTX_TYPE_POLL + 1;1029410312 hba = shost_priv(host);1029510313 hba->host = host;···1042410428 hba->mmio_base = mmio_base;1042510429 hba->irq = irq;1042610430 hba->vps = &ufs_hba_vps;1043110431+1043210432+ /*1043310433+ * Initialize clk_gating.lock early since it is being used in1043410434+ * ufshcd_setup_clocks()1043510435+ */1043610436+ spin_lock_init(&hba->clk_gating.lock);10427104371042810438 err = ufshcd_hba_init(hba);1042910439 if (err)
···371371static void acm_ctrl_irq(struct urb *urb)372372{373373 struct acm *acm = urb->context;374374- struct usb_cdc_notification *dr = urb->transfer_buffer;374374+ struct usb_cdc_notification *dr;375375 unsigned int current_size = urb->actual_length;376376 unsigned int expected_size, copy_size, alloc_size;377377 int retval;···398398399399 usb_mark_last_busy(acm->dev);400400401401- if (acm->nb_index)401401+ if (acm->nb_index == 0) {402402+ /*403403+ * The first chunk of a message must contain at least the404404+ * notification header with the length field, otherwise we405405+ * can't get an expected_size.406406+ */407407+ if (current_size < sizeof(struct usb_cdc_notification)) {408408+ dev_dbg(&acm->control->dev, "urb too short\n");409409+ goto exit;410410+ }411411+ dr = urb->transfer_buffer;412412+ } else {402413 dr = (struct usb_cdc_notification *)acm->notification_buffer;403403-414414+ }404415 /* size = notification-header + (optional) data */405416 expected_size = sizeof(struct usb_cdc_notification) +406417 le16_to_cpu(dr->wLength);407418408408- if (current_size < expected_size) {419419+ if (acm->nb_index != 0 || current_size < expected_size) {409420 /* notification is transmitted fragmented, reassemble */410421 if (acm->nb_size < expected_size) {411422 u8 *new_buffer;···17381727 { USB_DEVICE(0x0870, 0x0001), /* Metricom GS Modem */17391728 .driver_info = NO_UNION_NORMAL, /* has no union descriptor */17401729 },17411741- { USB_DEVICE(0x045b, 0x023c), /* Renesas USB Download mode */17301730+ { USB_DEVICE(0x045b, 0x023c), /* Renesas R-Car H3 USB Download mode */17421731 .driver_info = DISABLE_ECHO, /* Don't echo banner */17431732 },17441744- { USB_DEVICE(0x045b, 0x0248), /* Renesas USB Download mode */17331733+ { USB_DEVICE(0x045b, 0x0247), /* Renesas R-Car D3 USB Download mode */17451734 .driver_info = DISABLE_ECHO, /* Don't echo banner */17461735 },17471747- { USB_DEVICE(0x045b, 0x024D), /* Renesas USB Download mode */17361736+ { USB_DEVICE(0x045b, 0x0248), /* 
Renesas R-Car M3-N USB Download mode */17371737+ .driver_info = DISABLE_ECHO, /* Don't echo banner */17381738+ },17391739+ { USB_DEVICE(0x045b, 0x024D), /* Renesas R-Car E3 USB Download mode */17481740 .driver_info = DISABLE_ECHO, /* Don't echo banner */17491741 },17501742 { USB_DEVICE(0x0e8d, 0x0003), /* FIREFLY, MediaTek Inc; andrey.arapov@gmail.com */
+12-2
drivers/usb/core/hub.c
···18491849 hdev = interface_to_usbdev(intf);1850185018511851 /*18521852+ * The USB 2.0 spec prohibits hubs from having more than one18531853+ * configuration or interface, and we rely on this prohibition.18541854+ * Refuse to accept a device that violates it.18551855+ */18561856+ if (hdev->descriptor.bNumConfigurations > 1 ||18571857+ hdev->actconfig->desc.bNumInterfaces > 1) {18581858+ dev_err(&intf->dev, "Invalid hub with more than one config or interface\n");18591859+ return -EINVAL;18601860+ }18611861+18621862+ /*18521863 * Set default autosuspend delay as 0 to speedup bus suspend,18531864 * based on the below considerations:18541865 *···47094698EXPORT_SYMBOL_GPL(usb_ep0_reinit);4710469947114700#define usb_sndaddr0pipe() (PIPE_CONTROL << 30)47124712-#define usb_rcvaddr0pipe() ((PIPE_CONTROL << 30) | USB_DIR_IN)4713470147144702static int hub_set_address(struct usb_device *udev, int devnum)47154703{···48144804 for (i = 0; i < GET_MAXPACKET0_TRIES; ++i) {48154805 /* Start with invalid values in case the transfer fails */48164806 buf->bDescriptorType = buf->bMaxPacketSize0 = 0;48174817- rc = usb_control_msg(udev, usb_rcvaddr0pipe(),48074807+ rc = usb_control_msg(udev, usb_rcvctrlpipe(udev, 0),48184808 USB_REQ_GET_DESCRIPTOR, USB_DIR_IN,48194809 USB_DT_DEVICE << 8, 0,48204810 buf, size,
···717717/**718718 * struct dwc3_ep - device side endpoint representation719719 * @endpoint: usb endpoint720720+ * @nostream_work: work for handling bulk NoStream720721 * @cancelled_list: list of cancelled requests for this endpoint721722 * @pending_list: list of pending requests for this endpoint722723 * @started_list: list of started requests on this endpoint
+34
drivers/usb/dwc3/gadget.c
···26372637{26382638 u32 reg;26392639 u32 timeout = 2000;26402640+ u32 saved_config = 0;2640264126412642 if (pm_runtime_suspended(dwc->dev))26422643 return 0;26442644+26452645+ /*26462646+ * When operating in USB 2.0 speeds (HS/FS), ensure that26472647+ * GUSB2PHYCFG.ENBLSLPM and GUSB2PHYCFG.SUSPHY are cleared before starting26482648+ * or stopping the controller. This resolves timeout issues that occur26492649+ * during frequent role switches between host and device modes.26502650+ *26512651+ * Save and clear these settings, then restore them after completing the26522652+ * controller start or stop sequence.26532653+ *26542654+ * This solution was discovered through experimentation as it is not26552655+ * mentioned in the dwc3 programming guide. It has been tested on26562656+ * Exynos platforms.26572657+ */26582658+ reg = dwc3_readl(dwc->regs, DWC3_GUSB2PHYCFG(0));26592659+ if (reg & DWC3_GUSB2PHYCFG_SUSPHY) {26602660+ saved_config |= DWC3_GUSB2PHYCFG_SUSPHY;26612661+ reg &= ~DWC3_GUSB2PHYCFG_SUSPHY;26622662+ }26632663+26642664+ if (reg & DWC3_GUSB2PHYCFG_ENBLSLPM) {26652665+ saved_config |= DWC3_GUSB2PHYCFG_ENBLSLPM;26662666+ reg &= ~DWC3_GUSB2PHYCFG_ENBLSLPM;26672667+ }26682668+26692669+ if (saved_config)26702670+ dwc3_writel(dwc->regs, DWC3_GUSB2PHYCFG(0), reg);2643267126442672 reg = dwc3_readl(dwc->regs, DWC3_DCTL);26452673 if (is_on) {
+14-5
drivers/usb/gadget/function/f_midi.c
···283283 /* Our transmit completed. See if there's more to go.284284 * f_midi_transmit eats req, don't queue it again. */285285 req->length = 0;286286- f_midi_transmit(midi);286286+ queue_work(system_highpri_wq, &midi->work);287287 return;288288 }289289 break;···907907908908 status = -ENODEV;909909910910+ /*911911+ * Reset wMaxPacketSize to the maximum packet size of a FS bulk transfer before912912+ * endpoint claim. This ensures that wMaxPacketSize does not exceed the limit913913+ * during bind retries on dwc3 controllers whose TX/RX FIFOs are configured914914+ * with a 512-byte maxpacket size for IN/OUT endpoints that support HS only.915915+ */916916+ bulk_in_desc.wMaxPacketSize = cpu_to_le16(64);917917+ bulk_out_desc.wMaxPacketSize = cpu_to_le16(64);918918+910919 /* allocate instance-specific endpoints */911920 midi->in_ep = usb_ep_autoconfig(cdev->gadget, &bulk_in_desc);912921 if (!midi->in_ep)···10091000 }1010100110111002 /* configure the endpoint descriptors ... */10121012- ms_out_desc.bLength = USB_DT_MS_ENDPOINT_SIZE(midi->in_ports);10131013- ms_out_desc.bNumEmbMIDIJack = midi->in_ports;10031003+ ms_out_desc.bLength = USB_DT_MS_ENDPOINT_SIZE(midi->out_ports);10041004+ ms_out_desc.bNumEmbMIDIJack = midi->out_ports;1014100510151015- ms_in_desc.bLength = USB_DT_MS_ENDPOINT_SIZE(midi->out_ports);10161016- ms_in_desc.bNumEmbMIDIJack = midi->out_ports;10061006+ ms_in_desc.bLength = USB_DT_MS_ENDPOINT_SIZE(midi->in_ports);10071007+ ms_in_desc.bNumEmbMIDIJack = midi->in_ports;1017100810181009 /* ... and add them to the list */10191010 endpoint_descriptor_index = i;
+1-1
drivers/usb/gadget/function/uvc_video.c
···818818 return -EINVAL;819819820820 /* Allocate a kthread for asynchronous hw submit handler. */821821- video->kworker = kthread_create_worker(0, "UVCG");821821+ video->kworker = kthread_run_worker(0, "UVCG");822822 if (IS_ERR(video->kworker)) {823823 uvcg_err(&video->uvc->func, "failed to create UVCG kworker\n");824824 return PTR_ERR(video->kworker);
···958958 * booting from USB disk or using a usb keyboard959959 */960960 hcc_params = readl(base + EHCI_HCC_PARAMS);961961+962962+ /* LS7A EHCI controller doesn't have extended capabilities, the963963+ * EECP (EHCI Extended Capabilities Pointer) field of HCCPARAMS964964+ * register should be 0x0 but it reads as 0xa0. So clear it to965965+ * avoid error messages on boot.966966+ */967967+ if (pdev->vendor == PCI_VENDOR_ID_LOONGSON && pdev->device == 0x7a14)968968+ hcc_params &= ~(0xffL << 8);969969+961970 offset = (hcc_params >> 8) & 0xff;962971 while (offset && --count) {963972 pci_read_config_dword(pdev, offset, &cap);
+4-3
drivers/usb/host/xhci-pci.c
···653653}654654EXPORT_SYMBOL_NS_GPL(xhci_pci_common_probe, "xhci");655655656656-static const struct pci_device_id pci_ids_reject[] = {657657- /* handled by xhci-pci-renesas */656656+/* handled by xhci-pci-renesas if enabled */657657+static const struct pci_device_id pci_ids_renesas[] = {658658 { PCI_DEVICE(PCI_VENDOR_ID_RENESAS, 0x0014) },659659 { PCI_DEVICE(PCI_VENDOR_ID_RENESAS, 0x0015) },660660 { /* end: all zeroes */ }···662662663663static int xhci_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)664664{665665- if (pci_match_id(pci_ids_reject, dev))665665+ if (IS_ENABLED(CONFIG_USB_XHCI_PCI_RENESAS) &&666666+ pci_match_id(pci_ids_renesas, dev))666667 return -ENODEV;667668668669 return xhci_pci_common_probe(dev, id);
···7474 return xen_bus_to_phys(dev, dma_to_phys(dev, dma_addr));7575}76767777+static inline bool range_requires_alignment(phys_addr_t p, size_t size)7878+{7979+ phys_addr_t algn = 1ULL << (get_order(size) + PAGE_SHIFT);8080+ phys_addr_t bus_addr = pfn_to_bfn(XEN_PFN_DOWN(p)) << XEN_PAGE_SHIFT;8181+8282+ return IS_ALIGNED(p, algn) && !IS_ALIGNED(bus_addr, algn);8383+}8484+7785static inline int range_straddles_page_boundary(phys_addr_t p, size_t size)7886{7987 unsigned long next_bfn, xen_pfn = XEN_PFN_DOWN(p);8088 unsigned int i, nr_pages = XEN_PFN_UP(xen_offset_in_page(p) + size);8181- phys_addr_t algn = 1ULL << (get_order(size) + PAGE_SHIFT);82898390 next_bfn = pfn_to_bfn(xen_pfn);8484-8585- /* If buffer is physically aligned, ensure DMA alignment. */8686- if (IS_ALIGNED(p, algn) &&8787- !IS_ALIGNED((phys_addr_t)next_bfn << XEN_PAGE_SHIFT, algn))8888- return 1;89919092 for (i = 1; i < nr_pages; i++)9193 if (pfn_to_bfn(++xen_pfn) != ++next_bfn)···113111}114112115113#ifdef CONFIG_X86116116-int xen_swiotlb_fixup(void *buf, unsigned long nslabs)114114+int __init xen_swiotlb_fixup(void *buf, unsigned long nslabs)117115{118116 int rc;119117 unsigned int order = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT);···158156159157 *dma_handle = xen_phys_to_dma(dev, phys);160158 if (*dma_handle + size - 1 > dma_mask ||161161- range_straddles_page_boundary(phys, size)) {159159+ range_straddles_page_boundary(phys, size) ||160160+ range_requires_alignment(phys, size)) {162161 if (xen_create_contiguous_region(phys, order, fls64(dma_mask),163162 dma_handle) != 0)164163 goto out_free_pages;···185182 size = ALIGN(size, XEN_PAGE_SIZE);186183187184 if (WARN_ON_ONCE(dma_handle + size - 1 > dev->coherent_dma_mask) ||188188- WARN_ON_ONCE(range_straddles_page_boundary(phys, size)))185185+ WARN_ON_ONCE(range_straddles_page_boundary(phys, size) ||186186+ range_requires_alignment(phys, size)))189187 return;190188191189 if (TestClearPageXenRemapped(virt_to_page(vaddr)))
+7
fs/bcachefs/Kconfig
···6161 The resulting code will be significantly slower than normal; you6262 probably shouldn't select this option unless you're a developer.63636464+config BCACHEFS_INJECT_TRANSACTION_RESTARTS6565+ bool "Randomly inject transaction restarts"6666+ depends on BCACHEFS_DEBUG6767+ help6868+ Randomly inject transaction restarts in a few core paths - may have a6969+ significant performance penalty7070+6471config BCACHEFS_TESTS6572 bool "bcachefs unit and performance tests"6673 depends on BCACHEFS_FS
+25-22
fs/bcachefs/alloc_background.c
···18031803 u64 open;18041804 u64 need_journal_commit;18051805 u64 discarded;18061806- u64 need_journal_commit_this_dev;18071806};1808180718091808static int bch2_discard_one_bucket(struct btree_trans *trans,···18261827 goto out;18271828 }1828182918291829- if (bch2_bucket_needs_journal_commit(&c->buckets_waiting_for_journal,18301830- c->journal.flushed_seq_ondisk,18311831- pos.inode, pos.offset)) {18321832- s->need_journal_commit++;18331833- s->need_journal_commit_this_dev++;18301830+ u64 seq_ready = bch2_bucket_journal_seq_ready(&c->buckets_waiting_for_journal,18311831+ pos.inode, pos.offset);18321832+ if (seq_ready > c->journal.flushed_seq_ondisk) {18331833+ if (seq_ready > c->journal.flushing_seq)18341834+ s->need_journal_commit++;18341835 goto out;18351836 }18361837···18641865 discard_locked = true;18651866 }1866186718671867- if (!bkey_eq(*discard_pos_done, iter.pos) &&18681868- ca->mi.discard && !c->opts.nochanges) {18691869- /*18701870- * This works without any other locks because this is the only18711871- * thread that removes items from the need_discard tree18721872- */18731873- bch2_trans_unlock_long(trans);18741874- blkdev_issue_discard(ca->disk_sb.bdev,18751875- k.k->p.offset * ca->mi.bucket_size,18761876- ca->mi.bucket_size,18771877- GFP_KERNEL);18781878- *discard_pos_done = iter.pos;18681868+ if (!bkey_eq(*discard_pos_done, iter.pos)) {18791869 s->discarded++;18701870+ *discard_pos_done = iter.pos;1880187118811881- ret = bch2_trans_relock_notrace(trans);18821882- if (ret)18831883- goto out;18721872+ if (ca->mi.discard && !c->opts.nochanges) {18731873+ /*18741874+ * This works without any other locks because this is the only18751875+ * thread that removes items from the need_discard tree18761876+ */18771877+ bch2_trans_unlock_long(trans);18781878+ blkdev_issue_discard(ca->disk_sb.bdev,18791879+ k.k->p.offset * ca->mi.bucket_size,18801880+ ca->mi.bucket_size,18811881+ GFP_KERNEL);18821882+ ret = bch2_trans_relock_notrace(trans);18831883+ if (ret)18841884+ 
goto out;18851885+ }18841886 }1885188718861888 SET_BCH_ALLOC_V4_NEED_DISCARD(&a->v, false);···19281928 POS(ca->dev_idx, 0),19291929 POS(ca->dev_idx, U64_MAX), 0, k,19301930 bch2_discard_one_bucket(trans, ca, &iter, &discard_pos_done, &s, false)));19311931+19321932+ if (s.need_journal_commit > dev_buckets_available(ca, BCH_WATERMARK_normal))19331933+ bch2_journal_flush_async(&c->journal, NULL);1931193419321935 trace_discard_buckets(c, s.seen, s.open, s.need_journal_commit, s.discarded,19331936 bch2_err_str(ret));···20272024 break;20282025 }2029202620302030- trace_discard_buckets(c, s.seen, s.open, s.need_journal_commit, s.discarded, bch2_err_str(ret));20272027+ trace_discard_buckets_fast(c, s.seen, s.open, s.need_journal_commit, s.discarded, bch2_err_str(ret));2031202820322029 bch2_trans_put(trans);20332030 percpu_ref_put(&ca->io_ref);
+7-3
fs/bcachefs/alloc_foreground.c
···205205 return false;206206 }207207208208- if (bch2_bucket_needs_journal_commit(&c->buckets_waiting_for_journal,209209- c->journal.flushed_seq_ondisk, bucket.inode, bucket.offset)) {208208+ u64 journal_seq_ready =209209+ bch2_bucket_journal_seq_ready(&c->buckets_waiting_for_journal,210210+ bucket.inode, bucket.offset);211211+ if (journal_seq_ready > c->journal.flushed_seq_ondisk) {212212+ if (journal_seq_ready > c->journal.flushing_seq)213213+ s->need_journal_commit++;210214 s->skipped_need_journal_commit++;211215 return false;212216 }···574570 ? bch2_bucket_alloc_freelist(trans, ca, watermark, &s, cl)575571 : bch2_bucket_alloc_early(trans, ca, watermark, &s, cl);576572577577- if (s.skipped_need_journal_commit * 2 > avail)573573+ if (s.need_journal_commit * 2 > avail)578574 bch2_journal_flush_async(&c->journal, NULL);579575580576 if (!ob && s.btree_bitmap != BTREE_BITMAP_ANY) {
···748748 rcu_read_unlock();749749 mutex_lock(&bc->table.mutex);750750 mutex_unlock(&bc->table.mutex);751751- rcu_read_lock();752751 continue;753752 }754753 for (i = 0; i < tbl->size; i++)
+4
fs/bcachefs/btree_trans_commit.c
···99999910001000 bch2_trans_verify_not_unlocked_or_in_restart(trans);1001100110021002+ ret = trans_maybe_inject_restart(trans, _RET_IP_);10031003+ if (unlikely(ret))10041004+ goto out_reset;10051005+10021006 if (!trans->nr_updates &&10031007 !trans->journal_entries_u64s)10041008 goto out_reset;
···381381not_found:382382 if (flags & BTREE_TRIGGER_check_repair) {383383 ret = bch2_indirect_extent_missing_error(trans, p, *idx, next_idx, false);384384+ if (ret == -BCH_ERR_missing_indirect_extent)385385+ ret = 0;384386 if (ret)385387 goto err;386388 }
···523523	u64 end;524524	u32 len;525525526526-	/* For now only order 0 folios are supported for data. */527527-	ASSERT(folio_order(folio) == 0);528526	btrfs_debug(fs_info,529527		"%s: bi_sector=%llu, err=%d, mirror=%u",530528		__func__, bio->bi_iter.bi_sector, bio->bi_status,···550552551553	if (likely(uptodate)) {552554		loff_t i_size = i_size_read(inode);553553-		pgoff_t end_index = i_size >> folio_shift(folio);554555555556		/*556557		 * Zero out the remaining part if this range straddles···558561		 * Here we should only zero the range inside the folio,559562		 * not touch anything else.560563		 *561561-		 * NOTE: i_size is exclusive while end is inclusive.564564+		 * NOTE: i_size is exclusive while end is inclusive and565565+		 * folio_contains() takes PAGE_SIZE units.562566		 */563563-		if (folio_index(folio) == end_index && i_size <= end) {567567+		if (folio_contains(folio, i_size >> PAGE_SHIFT) &&568568+		    i_size <= end) {564569			u32 zero_start = max(offset_in_folio(folio, i_size),565570					     offset_in_folio(folio, start));566571			u32 zero_len = offset_in_folio(folio, end) + 1 -···898899					 u64 len, struct extent_map **em_cached)899900{900901	struct extent_map *em;901901-	struct extent_state *cached_state = NULL;902902903903	ASSERT(em_cached);904904···913915		*em_cached = NULL;914916	}915917916916-	btrfs_lock_and_flush_ordered_range(inode, start, start + len - 1, &cached_state);917918	em = btrfs_get_extent(inode, folio, start, len);918919	if (!IS_ERR(em)) {919920		BUG_ON(*em_cached);920921		refcount_inc(&em->refs);921922		*em_cached = em;922923	}923923-	unlock_extent(&inode->io_tree, start, start + len - 1, &cached_state);924924925925	return em;926926}···952956			return ret;953957	}954958955955-	if (folio->index == last_byte >> folio_shift(folio)) {959959+	if (folio_contains(folio, last_byte >> PAGE_SHIFT)) {956960		size_t zero_offset = offset_in_folio(folio, last_byte);957961958962		if (zero_offset) {···1075107910761080int btrfs_read_folio(struct file *file, struct folio *folio)10771081{10821082+	struct btrfs_inode *inode = folio_to_inode(folio);10831083+	const u64 start = folio_pos(folio);10841084+	const u64 end = start + folio_size(folio) - 1;10851085+	struct extent_state *cached_state = NULL;10781086	struct btrfs_bio_ctrl bio_ctrl = { .opf = REQ_OP_READ };10791087	struct extent_map *em_cached = NULL;10801088	int ret;1081108910901090+	btrfs_lock_and_flush_ordered_range(inode, start, end, &cached_state);10821091	ret = btrfs_do_readpage(folio, &em_cached, &bio_ctrl, NULL);10921092+	unlock_extent(&inode->io_tree, start, end, &cached_state);10931093+10831094	free_extent_map(em_cached);1084109510851096	/*···23832380{23842381	struct btrfs_bio_ctrl bio_ctrl = { .opf = REQ_OP_READ | REQ_RAHEAD };23852382	struct folio *folio;23832383+	struct btrfs_inode *inode = BTRFS_I(rac->mapping->host);23842384+	const u64 start = readahead_pos(rac);23852385+	const u64 end = start + readahead_length(rac) - 1;23862386+	struct extent_state *cached_state = NULL;23862387	struct extent_map *em_cached = NULL;23872388	u64 prev_em_start = (u64)-1;2388238923902390+	btrfs_lock_and_flush_ordered_range(inode, start, end, &cached_state);23912391+23892392	while ((folio = readahead_folio(rac)) != NULL)23902393		btrfs_do_readpage(folio, &em_cached, &bio_ctrl, &prev_em_start);23942394+23952395+	unlock_extent(&inode->io_tree, start, end, &cached_state);2391239623922397	if (em_cached)23932398		free_extent_map(em_cached);
+1-3
fs/btrfs/file.c
···10391039 loff_t pos = iocb->ki_pos;10401040 int ret;10411041 loff_t oldsize;10421042- loff_t start_pos;1043104210441043 /*10451044 * Quickly bail out on NOWAIT writes if we don't have the nodatacow or···10651066 inode_inc_iversion(inode);10661067 }1067106810681068- start_pos = round_down(pos, fs_info->sectorsize);10691069 oldsize = i_size_read(inode);10701070- if (start_pos > oldsize) {10701070+ if (pos > oldsize) {10711071 /* Expand hole size to cover write data, preventing empty gap */10721072 loff_t end_pos = round_up(pos + count, fs_info->sectorsize);10731073
+12
fs/btrfs/ordered-data.c
···12291229 */12301230 if (WARN_ON_ONCE(len >= ordered->num_bytes))12311231 return ERR_PTR(-EINVAL);12321232+ /*12331233+ * If our ordered extent had an error there's no point in continuing.12341234+ * The error may have come from a transaction abort done either by this12351235+ * task or some other concurrent task, and the transaction abort path12361236+ * iterates over all existing ordered extents and sets the flag12371237+ * BTRFS_ORDERED_IOERR on them.12381238+ */12391239+ if (unlikely(flags & (1U << BTRFS_ORDERED_IOERR))) {12401240+ const int fs_error = BTRFS_FS_ERROR(fs_info);12411241+12421242+ return fs_error ? ERR_PTR(fs_error) : ERR_PTR(-EIO);12431243+ }12321244 /* We cannot split partially completed ordered extents. */12331245 if (ordered->bytes_left) {12341246 ASSERT(!(flags & ~BTRFS_ORDERED_TYPE_FLAGS));
+5-6
fs/btrfs/qgroup.c
···18801880 * Commit current transaction to make sure all the rfer/excl numbers18811881 * get updated.18821882 */18831883- trans = btrfs_start_transaction(fs_info->quota_root, 0);18841884- if (IS_ERR(trans))18851885- return PTR_ERR(trans);18861886-18871887- ret = btrfs_commit_transaction(trans);18831883+ ret = btrfs_commit_current_transaction(fs_info->quota_root);18881884 if (ret < 0)18891885 return ret;18901886···18931897 /*18941898 * It's squota and the subvolume still has numbers needed for future18951899 * accounting, in this case we can not delete it. Just skip it.19001900+ *19011901+ * Or the qgroup is already removed by a qgroup rescan. For both cases we're19021902+ * safe to ignore them.18961903 */18971897- if (ret == -EBUSY)19041904+ if (ret == -EBUSY || ret == -ENOENT)18981905 ret = 0;18991906 return ret;19001907}
+3-1
fs/btrfs/transaction.c
···274274 cur_trans = fs_info->running_transaction;275275 if (cur_trans) {276276 if (TRANS_ABORTED(cur_trans)) {277277+ const int abort_error = cur_trans->aborted;278278+277279 spin_unlock(&fs_info->trans_lock);278278- return cur_trans->aborted;280280+ return abort_error;279281 }280282 if (btrfs_blocked_trans_types[cur_trans->state] & type) {281283 spin_unlock(&fs_info->trans_lock);
+3-3
fs/dcache.c
···17001700 smp_store_release(&dentry->d_name.name, dname); /* ^^^ */1701170117021702 dentry->d_flags = 0;17031703- lockref_init(&dentry->d_lockref, 1);17031703+ lockref_init(&dentry->d_lockref);17041704 seqcount_spinlock_init(&dentry->d_seq, &dentry->d_lock);17051705 dentry->d_inode = NULL;17061706 dentry->d_parent = dentry;···29662966 goto out_err;29672967 m2 = &alias->d_parent->d_inode->i_rwsem;29682968out_unalias:29692969- if (alias->d_op->d_unalias_trylock &&29692969+ if (alias->d_op && alias->d_op->d_unalias_trylock &&29702970 !alias->d_op->d_unalias_trylock(alias))29712971 goto out_err;29722972 __d_move(alias, dentry, false);29732973- if (alias->d_op->d_unalias_unlock)29732973+ if (alias->d_op && alias->d_op->d_unalias_unlock)29742974 alias->d_op->d_unalias_unlock(alias);29752975 ret = 0;29762976out_err:
+1-1
fs/erofs/zdata.c
···726726 if (IS_ERR(pcl))727727 return PTR_ERR(pcl);728728729729- lockref_init(&pcl->lockref, 1); /* one ref for this request */729729+ lockref_init(&pcl->lockref); /* one ref for this request */730730 pcl->algorithmformat = map->m_algorithmformat;731731 pcl->length = 0;732732 pcl->partial = true;
+16
fs/file_table.c
···194194 * refcount bumps we should reinitialize the reused file first.195195 */196196 file_ref_init(&f->f_ref, 1);197197+ /*198198+ * Disable permission and pre-content events for all files by default.199199+ * They may be enabled later by file_set_fsnotify_mode_from_watchers().200200+ */201201+ file_set_fsnotify_mode(f, FMODE_NONOTIFY_PERM);197202 return 0;198203}199204···380375 if (IS_ERR(file)) {381376 ihold(inode);382377 path_put(&path);378378+ return file;383379 }380380+ /*381381+ * Disable all fsnotify events for pseudo files by default.382382+ * They may be enabled by caller with file_set_fsnotify_mode().383383+ */384384+ file_set_fsnotify_mode(file, FMODE_NONOTIFY);384385 return file;385386}386387EXPORT_SYMBOL(alloc_file_pseudo);···411400 return file;412401 }413402 file_init_path(file, &path, fops);403403+ /*404404+ * Disable all fsnotify events for pseudo files by default.405405+ * They may be enabled by caller with file_set_fsnotify_mode().406406+ */407407+ file_set_fsnotify_mode(file, FMODE_NONOTIFY);414408 return file;415409}416410EXPORT_SYMBOL_GPL(alloc_file_pseudo_noaccount);
···380380	error = check_nfsd_access(exp, rqstp, may_bypass_gss);381381	if (error)382382		goto out;383383-384384-	svc_xprt_set_valid(rqstp->rq_xprt);383383+	/* During a LOCALIO call, fh_verify is called with a NULL rqstp */384384+	if (rqstp)385385+		svc_xprt_set_valid(rqstp->rq_xprt);385386386387	/* Finally, check access permissions. */387388	error = nfsd_permission(cred, exp, dentry, access);
+12-6
fs/notify/fsnotify.c
···648648 * Later, fsnotify permission hooks do not check if there are permission event649649 * watches, but that there were permission event watches at open time.650650 */651651-void file_set_fsnotify_mode(struct file *file)651651+void file_set_fsnotify_mode_from_watchers(struct file *file)652652{653653 struct dentry *dentry = file->f_path.dentry, *parent;654654 struct super_block *sb = dentry->d_sb;···665665 */666666 if (likely(!fsnotify_sb_has_priority_watchers(sb,667667 FSNOTIFY_PRIO_CONTENT))) {668668- file->f_mode |= FMODE_NONOTIFY_PERM;668668+ file_set_fsnotify_mode(file, FMODE_NONOTIFY_PERM);669669 return;670670 }671671···676676 if ((!d_is_dir(dentry) && !d_is_reg(dentry)) ||677677 likely(!fsnotify_sb_has_priority_watchers(sb,678678 FSNOTIFY_PRIO_PRE_CONTENT))) {679679- file->f_mode |= FMODE_NONOTIFY | FMODE_NONOTIFY_PERM;679679+ file_set_fsnotify_mode(file, FMODE_NONOTIFY | FMODE_NONOTIFY_PERM);680680 return;681681 }682682···686686 */687687 mnt_mask = READ_ONCE(real_mount(file->f_path.mnt)->mnt_fsnotify_mask);688688 if (unlikely(fsnotify_object_watched(d_inode(dentry), mnt_mask,689689- FSNOTIFY_PRE_CONTENT_EVENTS)))689689+ FSNOTIFY_PRE_CONTENT_EVENTS))) {690690+ /* Enable pre-content events */691691+ file_set_fsnotify_mode(file, 0);690692 return;693693+ }691694692695 /* Is parent watching for pre-content events on this file? */693696 if (dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED) {694697 parent = dget_parent(dentry);695698 p_mask = fsnotify_inode_watches_children(d_inode(parent));696699 dput(parent);697697- if (p_mask & FSNOTIFY_PRE_CONTENT_EVENTS)700700+ if (p_mask & FSNOTIFY_PRE_CONTENT_EVENTS) {701701+ /* Enable pre-content events */702702+ file_set_fsnotify_mode(file, 0);698703 return;704704+ }699705 }700706 /* Nobody watching for pre-content events from this file */701701- file->f_mode |= FMODE_NONOTIFY | FMODE_NONOTIFY_PERM;707707+ file_set_fsnotify_mode(file, FMODE_NONOTIFY | FMODE_NONOTIFY_PERM);702708}703709#endif704710
+6-5
fs/open.c
···905905 f->f_sb_err = file_sample_sb_err(f);906906907907 if (unlikely(f->f_flags & O_PATH)) {908908- f->f_mode = FMODE_PATH | FMODE_OPENED | FMODE_NONOTIFY;908908+ f->f_mode = FMODE_PATH | FMODE_OPENED;909909+ file_set_fsnotify_mode(f, FMODE_NONOTIFY);909910 f->f_op = &empty_fops;910911 return 0;911912 }···936935937936 /*938937 * Set FMODE_NONOTIFY_* bits according to existing permission watches.939939- * If FMODE_NONOTIFY was already set for an fanotify fd, this doesn't940940- * change anything.938938+ * If FMODE_NONOTIFY mode was already set for an fanotify fd or for a939939+ * pseudo file, this call will not change the mode.941940 */942942- file_set_fsnotify_mode(f);941941+ file_set_fsnotify_mode_from_watchers(f);943942 error = fsnotify_open_perm(f);944943 if (error)945944 goto cleanup_all;···11231122 if (!IS_ERR(f)) {11241123 int error;1125112411261126- f->f_mode |= FMODE_NONOTIFY;11251125+ file_set_fsnotify_mode(f, FMODE_NONOTIFY);11271126 error = vfs_open(path, f);11281127 if (error) {11291128 fput(f);
+11-1
fs/pidfs.c
···287287	switch (cmd) {288288	case FS_IOC_GETVERSION:289289	case PIDFD_GET_CGROUP_NAMESPACE:290290-	case PIDFD_GET_INFO:291290	case PIDFD_GET_IPC_NAMESPACE:292291	case PIDFD_GET_MNT_NAMESPACE:293292	case PIDFD_GET_NET_NAMESPACE:···297298	case PIDFD_GET_USER_NAMESPACE:298299	case PIDFD_GET_PID_NAMESPACE:299300		return true;301301+	}302302+303303+	/* Extensible ioctls require some more careful checks. */304304+	switch (_IOC_NR(cmd)) {305305+	case _IOC_NR(PIDFD_GET_INFO):306306+		/*307307+		 * Try to prevent performing a pidfd ioctl when someone308308+		 * erroneously mistook the file descriptor for a pidfd.309309+		 * This is not perfect but will catch most cases.310310+		 */311311+		return (_IOC_TYPE(cmd) == _IOC_TYPE(PIDFD_GET_INFO));300312	}301313302314	return false;
···35633563 int error;3564356435653565 /*35663566- * If there are already extents in the file, try an exact EOF block35673567- * allocation to extend the file as a contiguous extent. If that fails,35683568- * or it's the first allocation in a file, just try for a stripe aligned35693569- * allocation.35663566+ * If there are already extents in the file, and xfs_bmap_adjacent() has35673567+ * given a better blkno, try an exact EOF block allocation to extend the35683568+ * file as a contiguous extent. If that fails, or it's the first35693569+ * allocation in a file, just try for a stripe aligned allocation.35703570 */35713571- if (ap->offset) {35713571+ if (ap->eof) {35723572 xfs_extlen_t nextminlen = 0;3573357335743574 /*···37363736 int error;3737373737383738 ap->blkno = XFS_INO_TO_FSB(args->mp, ap->ip->i_ino);37393739- xfs_bmap_adjacent(ap);37393739+ if (!xfs_bmap_adjacent(ap))37403740+ ap->eof = false;3740374137413742 /*37423743 * Search for an allocation group with a single extent large enough for
+17-19
fs/xfs/xfs_buf.c
···4141 *4242 * xfs_buf_rele:4343 *	b_lock4444- *	  pag_buf_lock4545- *	    lru_lock4444+ *	  lru_lock4645 *4746 * xfs_buftarg_drain_rele4847 *	lru_lock···219220	 */220221	flags &= ~(XBF_UNMAPPED | XBF_TRYLOCK | XBF_ASYNC | XBF_READ_AHEAD);221222222222-	spin_lock_init(&bp->b_lock);223223+	/*224224+	 * A new buffer is held and locked by the owner.  This ensures that the225225+	 * buffer is owned by the caller and racing RCU lookups right after226226+	 * inserting into the hash table are safe (and will have to wait for227227+	 * the unlock to do anything non-trivial).228228+	 */223229	bp->b_hold = 1;230230+	sema_init(&bp->b_sema, 0); /* held, no waiters */231231+232232+	spin_lock_init(&bp->b_lock);224233	atomic_set(&bp->b_lru_ref, 1);225234	init_completion(&bp->b_iowait);226235	INIT_LIST_HEAD(&bp->b_lru);227236	INIT_LIST_HEAD(&bp->b_list);228237	INIT_LIST_HEAD(&bp->b_li_list);229229-	sema_init(&bp->b_sema, 0); /* held, no waiters */230238	bp->b_target = target;231239	bp->b_mount = target->bt_mount;232240	bp->b_flags = flags;233241234234-	/*235235-	 * Set length and io_length to the same value initially.236236-	 * I/O routines should use io_length, which will be the same in237237-	 * most cases but may be reset (e.g. XFS recovery).238238-	 */239242	error = xfs_buf_get_maps(bp, nmaps);240243	if (error) {241244		kmem_cache_free(xfs_buf_cache, bp);···503502xfs_buf_cache_init(504503	struct xfs_buf_cache	*bch)505504{506506-	spin_lock_init(&bch->bc_lock);507505	return rhashtable_init(&bch->bc_hash, &xfs_buf_hash_params);508506}509507···652652	if (error)653653		goto out_free_buf;654654655655-	spin_lock(&bch->bc_lock);655655+	/* The new buffer keeps the perag reference until it is freed. */656656+	new_bp->b_pag = pag;657657+658658+	rcu_read_lock();656659	bp = rhashtable_lookup_get_insert_fast(&bch->bc_hash,657660			&new_bp->b_rhash_head, xfs_buf_hash_params);658661	if (IS_ERR(bp)) {662662+		rcu_read_unlock();659663		error = PTR_ERR(bp);660660-		spin_unlock(&bch->bc_lock);661664		goto out_free_buf;662665	}663666	if (bp && xfs_buf_try_hold(bp)) {664667		/* found an existing buffer */665665-		spin_unlock(&bch->bc_lock);668668+		rcu_read_unlock();666669		error = xfs_buf_find_lock(bp, flags);667670		if (error)668671			xfs_buf_rele(bp);···673670			*bpp = bp;674671		goto out_free_buf;675672	}673673+	rcu_read_unlock();676674677677-	/* The new buffer keeps the perag reference until it is freed. */678678-	new_bp->b_pag = pag;679679-	spin_unlock(&bch->bc_lock);680675	*bpp = new_bp;681676	return 0;682677···10911090	}1092109110931092	/* we are asked to drop the last reference */10941094-	spin_lock(&bch->bc_lock);10951093	__xfs_buf_ioacct_dec(bp);10961094	if (!(bp->b_flags & XBF_STALE) && atomic_read(&bp->b_lru_ref)) {10971095		/*···11021102			bp->b_state &= ~XFS_BSTATE_DISPOSE;11031103		else11041104			bp->b_hold--;11051105-		spin_unlock(&bch->bc_lock);11061105	} else {11071106		bp->b_hold--;11081107		/*···11191120			ASSERT(!(bp->b_flags & _XBF_DELWRI_Q));11201121			rhashtable_remove_fast(&bch->bc_hash, &bp->b_rhash_head,11211122					xfs_buf_hash_params);11221122-		spin_unlock(&bch->bc_lock);11231123		if (pag)11241124			xfs_perag_put(pag);11251125		freebuf = true;
···329329 * successfully but before locks are dropped.330330 */331331332332-/* Verify that we have security clearance to perform this operation. */333333-static int334334-xfs_exchange_range_verify_area(335335-	struct xfs_exchrange	*fxr)336336-{337337-	int			ret;338338-339339-	ret = remap_verify_area(fxr->file1, fxr->file1_offset, fxr->length,340340-			true);341341-	if (ret)342342-		return ret;343343-344344-	return remap_verify_area(fxr->file2, fxr->file2_offset, fxr->length,345345-			true);346346-}347347-348332/*349333 * Performs necessary checks before doing a range exchange, having stabilized350334 * mutable inode attributes via i_rwsem.···339355	unsigned int		alloc_unit)340356{341357	struct inode		*inode1 = file_inode(fxr->file1);358358+	loff_t			size1 = i_size_read(inode1);342359	struct inode		*inode2 = file_inode(fxr->file2);360360+	loff_t			size2 = i_size_read(inode2);343361	uint64_t		allocmask = alloc_unit - 1;344362	int64_t			test_len;345363	uint64_t		blen;346346-	loff_t			size1, size2, tmp;364364+	loff_t			tmp;347365	int			error;348366349367	/* Don't touch certain kinds of inodes */···354368	if (IS_SWAPFILE(inode1) || IS_SWAPFILE(inode2))355369		return -ETXTBSY;356370357357-	size1 = i_size_read(inode1);358358-	size2 = i_size_read(inode2);359359-360371	/* Ranges cannot start after EOF. */361372	if (fxr->file1_offset > size1 || fxr->file2_offset > size2)362373		return -EINVAL;363374364364-	/*365365-	 * If the caller said to exchange to EOF, we set the length of the366366-	 * request large enough to cover everything to the end of both files.367367-	 */368375	if (fxr->flags & XFS_EXCHANGE_RANGE_TO_EOF) {376376+		/*377377+		 * If the caller said to exchange to EOF, we set the length of378378+		 * the request large enough to cover everything to the end of379379+		 * both files.380380+		 */369381		fxr->length = max_t(int64_t, size1 - fxr->file1_offset,370382					     size2 - fxr->file2_offset);371371-372372-		error = xfs_exchange_range_verify_area(fxr);373373-		if (error)374374-			return error;383383+	} else {384384+		/*385385+		 * Otherwise we require both ranges to end within EOF.386386+		 */387387+		if (fxr->file1_offset + fxr->length > size1 ||388388+		    fxr->file2_offset + fxr->length > size2)389389+			return -EINVAL;375390	}376391377392	/*···386399	/* Ensure offsets don't wrap. */387400	if (check_add_overflow(fxr->file1_offset, fxr->length, &tmp) ||388401	    check_add_overflow(fxr->file2_offset, fxr->length, &tmp))389389-		return -EINVAL;390390-391391-	/*392392-	 * We require both ranges to end within EOF, unless we're exchanging393393-	 * to EOF.394394-	 */395395-	if (!(fxr->flags & XFS_EXCHANGE_RANGE_TO_EOF) &&396396-	    (fxr->file1_offset + fxr->length > size1 ||397397-	     fxr->file2_offset + fxr->length > size2))398402		return -EINVAL;399403400404	/*···725747{726748	struct inode		*inode1 = file_inode(fxr->file1);727749	struct inode		*inode2 = file_inode(fxr->file2);750750+	loff_t			check_len = fxr->length;728751	int			ret;729752730753	BUILD_BUG_ON(XFS_EXCHANGE_RANGE_ALL_FLAGS &···758779		return -EBADF;759780760781	/*761761-	 * If we're not exchanging to EOF, we can check the areas before762762-	 * stabilizing both files' i_size.782782+	 * If we're exchanging to EOF we can't calculate the length until taking783783+	 * the iolock.  Pass a 0 length to remap_verify_area similar to the784784+	 * FICLONE and FICLONERANGE ioctls that support cloning to EOF as well.763785	 */764764-	if (!(fxr->flags & XFS_EXCHANGE_RANGE_TO_EOF)) {765765-		ret = xfs_exchange_range_verify_area(fxr);766766-		if (ret)767767-			return ret;768768-	}786786+	if (fxr->flags & XFS_EXCHANGE_RANGE_TO_EOF)787787+		check_len = 0;788788+	ret = remap_verify_area(fxr->file1, fxr->file1_offset, check_len, true);789789+	if (ret)790790+		return ret;791791+	ret = remap_verify_area(fxr->file2, fxr->file2_offset, check_len, true);792792+	if (ret)793793+		return ret;769794770795	/* Update cmtime if the fd/inode don't forbid it. */771796	if (!(fxr->file1->f_mode & FMODE_NOCMTIME) && !IS_NOCMTIME(inode1))
+5-2
fs/xfs/xfs_inode.c
···14041404 goto out;1405140514061406 /* Try to clean out the cow blocks if there are any. */14071407- if (xfs_inode_has_cow_data(ip))14081408- xfs_reflink_cancel_cow_range(ip, 0, NULLFILEOFF, true);14071407+ if (xfs_inode_has_cow_data(ip)) {14081408+ error = xfs_reflink_cancel_cow_range(ip, 0, NULLFILEOFF, true);14091409+ if (error)14101410+ goto out;14111411+ }1409141214101413 if (VFS_I(ip)->i_nlink != 0) {14111414 /*
+2-4
fs/xfs/xfs_iomap.c
···976976 if (!xfs_is_cow_inode(ip))977977 return 0;978978979979- if (!written) {980980- xfs_reflink_cancel_cow_range(ip, pos, length, true);981981- return 0;982982- }979979+ if (!written)980980+ return xfs_reflink_cancel_cow_range(ip, pos, length, true);983981984982 return xfs_reflink_end_cow(ip, pos, written);985983}
+1
include/asm-generic/vmlinux.lds.h
···10381038 *(.discard) \10391039 *(.discard.*) \10401040 *(.export_symbol) \10411041+ *(.no_trim_symbol) \10411042 *(.modinfo) \10421043 /* ld.bfd warns about .gnu.version* even when not emitted */ \10431044 *(.gnu.version*) \
···11+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */22+/*33+ * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.44+ */55+66+#ifndef _DT_BINDINGS_CLK_QCOM_QCS8300_CAM_CC_H77+#define _DT_BINDINGS_CLK_QCOM_QCS8300_CAM_CC_H88+99+#include "qcom,sa8775p-camcc.h"1010+1111+/* QCS8300 introduces below new clocks compared to SA8775P */1212+1313+/* CAM_CC clocks */1414+#define CAM_CC_TITAN_TOP_ACCU_SHIFT_CLK 861515+1616+#endif
+17
include/dt-bindings/clock/qcom,qcs8300-gpucc.h
···11+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */22+/*33+ * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.44+ */55+66+#ifndef _DT_BINDINGS_CLK_QCOM_GPUCC_QCS8300_H77+#define _DT_BINDINGS_CLK_QCOM_GPUCC_QCS8300_H88+99+#include "qcom,sa8775p-gpucc.h"1010+1111+/* QCS8300 introduces below new clocks compared to SA8775P */1212+1313+/* GPU_CC clocks */1414+#define GPU_CC_CX_ACCU_SHIFT_CLK 231515+#define GPU_CC_GX_ACCU_SHIFT_CLK 241616+1717+#endif
+14-4
include/linux/blk-mq.h
···861861 void (*complete)(struct io_comp_batch *))862862{863863 /*864864- * blk_mq_end_request_batch() can't end request allocated from865865- * sched tags864864+ * Check various conditions that exclude batch processing:865865+ * 1) No batch container866866+ * 2) Has scheduler data attached867867+ * 3) Not a passthrough request and end_io set868868+ * 4) Not a passthrough request and an ioerror866869 */867867- if (!iob || (req->rq_flags & RQF_SCHED_TAGS) || ioerror ||868868- (req->end_io && !blk_rq_is_passthrough(req)))870870+ if (!iob)869871 return false;872872+ if (req->rq_flags & RQF_SCHED_TAGS)873873+ return false;874874+ if (!blk_rq_is_passthrough(req)) {875875+ if (req->end_io)876876+ return false;877877+ if (ioerror < 0)878878+ return false;879879+ }870880871881 if (!iob->complete)872882 iob->complete = complete;
+3-3
include/linux/cgroup-defs.h
···71717272 /* Cgroup is frozen. */7373 CGRP_FROZEN,7474-7575- /* Control group has to be killed. */7676- CGRP_KILL,7774};78757976/* cgroup_root->flags */···457460 int nr_populated_threaded_children;458461459462 int nr_threaded_children; /* # of live threaded child cgroups */463463+464464+ /* sequence number for cgroup.kill, serialized by css_set_lock. */465465+ unsigned int kill_seq;460466461467 struct kernfs_node *kn; /* cgroup kernfs entry */462468 struct cgroup_file procs_file; /* handle for "cgroup.procs" */
+19-13
include/linux/compiler.h
···191191 __v; \192192})193193194194+#ifdef __CHECKER__195195+#define __BUILD_BUG_ON_ZERO_MSG(e, msg) (0)196196+#else /* __CHECKER__ */197197+#define __BUILD_BUG_ON_ZERO_MSG(e, msg) ((int)sizeof(struct {_Static_assert(!(e), msg);}))198198+#endif /* __CHECKER__ */199199+200200+/* &a[0] degrades to a pointer: a different type from an array */201201+#define __is_array(a) (!__same_type((a), &(a)[0]))202202+#define __must_be_array(a) __BUILD_BUG_ON_ZERO_MSG(!__is_array(a), \203203+ "must be array")204204+205205+#define __is_byte_array(a) (__is_array(a) && sizeof((a)[0]) == 1)206206+#define __must_be_byte_array(a) __BUILD_BUG_ON_ZERO_MSG(!__is_byte_array(a), \207207+ "must be byte array")208208+209209+/* Require C Strings (i.e. NUL-terminated) lack the "nonstring" attribute. */210210+#define __must_be_cstr(p) \211211+ __BUILD_BUG_ON_ZERO_MSG(__annotated(p, nonstring), "must be cstr (NUL-terminated)")212212+194213#endif /* __KERNEL__ */195214196215/**···249230 .popsection;250231251232#define __ADDRESSABLE_ASM_STR(sym) __stringify(__ADDRESSABLE_ASM(sym))252252-253253-#ifdef __CHECKER__254254-#define __BUILD_BUG_ON_ZERO_MSG(e, msg) (0)255255-#else /* __CHECKER__ */256256-#define __BUILD_BUG_ON_ZERO_MSG(e, msg) ((int)sizeof(struct {_Static_assert(!(e), msg);}))257257-#endif /* __CHECKER__ */258258-259259-/* &a[0] degrades to a pointer: a different type from an array */260260-#define __must_be_array(a) __BUILD_BUG_ON_ZERO_MSG(__same_type((a), &(a)[0]), "must be array")261261-262262-/* Require C Strings (i.e. NUL-terminated) lack the "nonstring" attribute. */263263-#define __must_be_cstr(p) \264264- __BUILD_BUG_ON_ZERO_MSG(__annotated(p, nonstring), "must be cstr (NUL-terminated)")265233266234/*267235 * This returns a constant expression while determining if an argument is
+69
include/linux/device/faux.h
···11+/* SPDX-License-Identifier: GPL-2.0-only */22+/*33+ * Copyright (c) 2025 Greg Kroah-Hartman <gregkh@linuxfoundation.org>44+ * Copyright (c) 2025 The Linux Foundation55+ *66+ * A "simple" faux bus that allows devices to be created and added77+ * automatically to it.  This is to be used whenever you need to create a88+ * device that is not associated with any "real" system resources, and do99+ * not want to have to deal with a bus/driver binding logic.  It is1010+ * intended to be very simple, with only a create and a destroy function1111+ * available.1212+ */1313+#ifndef _FAUX_DEVICE_H_1414+#define _FAUX_DEVICE_H_1515+1616+#include <linux/container_of.h>1717+#include <linux/device.h>1818+1919+/**2020+ * struct faux_device - a "faux" device2121+ * @dev:	internal struct device of the object2222+ *2323+ * A simple faux device that can be created/destroyed.  To be used when a2424+ * driver only needs to have a device to "hang" something off.  This can be2525+ * used for downloading firmware or other basic tasks.  Use this instead of2626+ * a struct platform_device if the device has no resources assigned to2727+ * it at all.2828+ */2929+struct faux_device {3030+	struct device dev;3131+};3232+#define to_faux_device(x) container_of_const((x), struct faux_device, dev)3333+3434+/**3535+ * struct faux_device_ops - a set of callbacks for a struct faux_device3636+ * @probe:	called when a faux device is probed by the driver core3737+ *		before the device is fully bound to the internal faux bus3838+ *		code.  If probe succeeds, return 0, otherwise return a3939+ *		negative error number to stop the probe sequence from4040+ *		succeeding.4141+ * @remove:	called when a faux device is removed from the system4242+ *4343+ * Both @probe and @remove are optional, if not needed, set to NULL.4444+ */4545+struct faux_device_ops {4646+	int (*probe)(struct faux_device *faux_dev);4747+	void (*remove)(struct faux_device *faux_dev);4848+};4949+5050+struct faux_device *faux_device_create(const char *name,5151+				       struct device *parent,5252+				       const struct faux_device_ops *faux_ops);5353+struct faux_device *faux_device_create_with_groups(const char *name,5454+						   struct device *parent,5555+						   const struct faux_device_ops *faux_ops,5656+						   const struct attribute_group **groups);5757+void faux_device_destroy(struct faux_device *faux_dev);5858+5959+static inline void *faux_device_get_drvdata(const struct faux_device *faux_dev)6060+{6161+	return dev_get_drvdata(&faux_dev->dev);6262+}6363+6464+static inline void faux_device_set_drvdata(struct faux_device *faux_dev, void *data)6565+{6666+	dev_set_drvdata(&faux_dev->dev, data);6767+}6868+6969+#endif /* _FAUX_DEVICE_H_ */
···222222#define FMODE_FSNOTIFY_HSM(mode) 0223223#endif224224225225-226225/*227226 * Attribute flags. These should be or-ed together to figure out what228227 * has been changed!···790791791792static inline void inode_set_cached_link(struct inode *inode, char *link, int linklen)792793{794794+ int testlen;795795+796796+ /*797797+ * TODO: patch it into a debug-only check if relevant macros show up.798798+ * In the meantime, since we are suffering strlen even on production kernels799799+ * to find the right length, do a fixup if the wrong value got passed.800800+ */801801+ testlen = strlen(link);802802+ if (testlen != linklen) {803803+ WARN_ONCE(1, "bad length passed for symlink [%s] (got %d, expected %d)",804804+ link, linklen, testlen);805805+ linklen = testlen;806806+ }793807 inode->i_link = link;794808 inode->i_linklen = linklen;795809 inode->i_opflags |= IOP_CACHED_LINK;···31503138 if (unlikely(!exe_file || FMODE_FSNOTIFY_HSM(exe_file->f_mode)))31513139 return;31523140 allow_write_access(exe_file);31413141+}31423142+31433143+static inline void file_set_fsnotify_mode(struct file *file, fmode_t mode)31443144+{31453145+ file->f_mode &= ~FMODE_FSNOTIFY_MASK;31463146+ file->f_mode |= mode;31533147}3154314831553149static inline bool inode_is_open_for_write(const struct inode *inode)
···244244 * @id_table: List of I2C devices supported by this driver245245 * @detect: Callback for device detection246246 * @address_list: The I2C addresses to probe (for detect)247247+ * @clients: List of detected clients we created (for i2c-core use only)247248 * @flags: A bitmask of flags defined in &enum i2c_driver_flags248249 *249250 * The driver.owner field should be set to the module owner of this driver.···299298	/* Device detection callback for automatic device creation */300299	int (*detect)(struct i2c_client *client, struct i2c_board_info *info);301300	const unsigned short *address_list;301301+	struct list_head clients;302302303303	u32 flags;304304};···315313 * @dev: Driver model device node for the slave.316314 * @init_irq: IRQ that was set at initialization317315 * @irq: indicates the IRQ generated by this device (if any)316316+ * @detected: member of an i2c_driver.clients list or i2c-core's317317+ *	userspace_devices list318318 * @slave_cb: Callback when I2C slave mode of an adapter is used. The adapter319319 *	calls it to pass on slave events to the slave driver.320320 * @devres_group_id: id of the devres group that will be created for resources···336332#define I2C_CLIENT_SLAVE	0x20	/* we are the slave */337333#define I2C_CLIENT_HOST_NOTIFY	0x40	/* We want to use I2C host notify */338334#define I2C_CLIENT_WAKE		0x80	/* for board_info; true iff can wake */339339-#define I2C_CLIENT_AUTO		0x100	/* client was auto-detected */340340-#define I2C_CLIENT_USER		0x200	/* client was userspace-created */341335#define I2C_CLIENT_SCCB		0x9000	/* Use Omnivision SCCB protocol */342336					/* Must match I2C_M_STOP|IGNORE_NAK */343337···347345	struct device dev;		/* the device structure		*/348346	int init_irq;			/* irq set at initialization	*/349347	int irq;			/* irq issued by device		*/348348+	struct list_head detected;350349#if IS_ENABLED(CONFIG_I2C_SLAVE)351350	i2c_slave_cb_t slave_cb;	/* callback for slave mode	*/352351#endif···753750	int nr;754751	char name[48];755752	struct completion dev_released;753753+754754+	struct mutex userspace_clients_lock;755755+	struct list_head userspace_clients;756756757757	struct i2c_bus_recovery_info *bus_recovery_info;758758	const struct i2c_adapter_quirks *quirks;
···815815#ifdef CONFIG_CRYPTO_DEV_SP_PSP816816817817/**818818+ * sev_module_init - perform PSP SEV module initialization819819+ *820820+ * Returns:821821+ * 0 if the PSP module is successfully initialized822822+ * negative value if the PSP module initialization fails823823+ */824824+int sev_module_init(void);825825+826826+/**818827 * sev_platform_init - perform SEV INIT command819828 *820829 * @args: struct sev_platform_init_args to pass in arguments
···411411/* GFX12 and later: */412412#define AMDGPU_TILING_GFX12_SWIZZLE_MODE_SHIFT 0413413#define AMDGPU_TILING_GFX12_SWIZZLE_MODE_MASK 0x7414414-/* These are DCC recompression setting for memory management: */414414+/* These are DCC recompression settings for memory management: */415415#define AMDGPU_TILING_GFX12_DCC_MAX_COMPRESSED_BLOCK_SHIFT 3416416#define AMDGPU_TILING_GFX12_DCC_MAX_COMPRESSED_BLOCK_MASK 0x3 /* 0:64B, 1:128B, 2:256B */417417#define AMDGPU_TILING_GFX12_DCC_NUMBER_TYPE_SHIFT 5418418#define AMDGPU_TILING_GFX12_DCC_NUMBER_TYPE_MASK 0x7 /* CB_COLOR0_INFO.NUMBER_TYPE */419419#define AMDGPU_TILING_GFX12_DCC_DATA_FORMAT_SHIFT 8420420#define AMDGPU_TILING_GFX12_DCC_DATA_FORMAT_MASK 0x3f /* [0:4]:CB_COLOR0_INFO.FORMAT, [5]:MM */421421+/* When clearing the buffer or moving it from VRAM to GTT, don't compress and set DCC metadata422422+ * to uncompressed. Set when parts of an allocation bypass DCC and read raw data. */423423+#define AMDGPU_TILING_GFX12_DCC_WRITE_COMPRESS_DISABLE_SHIFT 14424424+#define AMDGPU_TILING_GFX12_DCC_WRITE_COMPRESS_DISABLE_MASK 0x1425425+/* bit gap */426426+#define AMDGPU_TILING_GFX12_SCANOUT_SHIFT 63427427+#define AMDGPU_TILING_GFX12_SCANOUT_MASK 0x1421428422429/* Set/Get helpers for tiling flags. */423430#define AMDGPU_TILING_SET(field, value) \
+2
include/uapi/linux/ethtool.h
···682682 * @ETH_SS_STATS_ETH_CTRL: names of IEEE 802.3 MAC Control statistics683683 * @ETH_SS_STATS_RMON: names of RMON statistics684684 * @ETH_SS_STATS_PHY: names of PHY(dev) statistics685685+ * @ETH_SS_TS_FLAGS: hardware timestamping flags685686 *686687 * @ETH_SS_COUNT: number of defined string sets687688 */···709708 ETH_SS_STATS_ETH_CTRL,710709 ETH_SS_STATS_RMON,711710 ETH_SS_STATS_PHY,711711+ ETH_SS_TS_FLAGS,712712713713 /* add new constants above here */714714 ETH_SS_COUNT
···5454 continue;55555656 if (cmd->flags & IORING_URING_CMD_CANCELABLE) {5757- /* ->sqe isn't available if no async data */5858- if (!req_has_async_data(req))5959- cmd->sqe = NULL;6057 file->f_op->uring_cmd(cmd, IO_URING_F_CANCEL |6158 IO_URING_F_COMPLETE_DEFER);6259 ret = true;···176179 return -ENOMEM;177180 cache->op_data = NULL;178181179179- if (!(req->flags & REQ_F_FORCE_ASYNC)) {180180- /* defer memcpy until we need it */181181- ioucmd->sqe = sqe;182182- return 0;183183- }184184-182182+ /*183183+ * Unconditionally cache the SQE for now - this is only needed for184184+ * requests that go async, but prep handlers must ensure that any185185+ * sqe data is stable beyond prep. Since uring_cmd is special in186186+ * that it doesn't read in per-op data, play it safe and ensure that187187+ * any SQE data is stable beyond prep. This can later get relaxed.188188+ */185189 memcpy(cache->sqes, sqe, uring_sqe_size(req->ctx));186190 ioucmd->sqe = cache->sqes;187191 return 0;···247249 }248250249251 ret = file->f_op->uring_cmd(ioucmd, issue_flags);250250- if (ret == -EAGAIN) {251251- struct io_uring_cmd_data *cache = req->async_data;252252-253253- if (ioucmd->sqe != (void *) cache)254254- memcpy(cache->sqes, ioucmd->sqe, uring_sqe_size(req->ctx));255255- return -EAGAIN;256256- } else if (ret == -EIOCBQUEUED) {257257- return -EIOCBQUEUED;258258- }259259-252252+ if (ret == -EAGAIN || ret == -EIOCBQUEUED)253253+ return ret;260254 if (ret < 0)261255 req_set_fail(req);262256 io_req_uring_cleanup(req, issue_flags);
···285285}286286287287extern void __futex_unqueue(struct futex_q *q);288288-extern void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb);288288+extern void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb,289289+ struct task_struct *task);289290extern int futex_unqueue(struct futex_q *q);290291291292/**292293 * futex_queue() - Enqueue the futex_q on the futex_hash_bucket293294 * @q: The futex_q to enqueue294295 * @hb: The destination hash bucket296296+ * @task: Task queueing this futex295297 *296298 * The hb->lock must be held by the caller, and is released here. A call to297299 * futex_queue() is typically paired with exactly one call to futex_unqueue(). The···301299 * or nothing if the unqueue is done as part of the wake process and the unqueue302300 * state is implicit in the state of woken task (see futex_wait_requeue_pi() for303301 * an example).302302+ *303303+ * Note that @task may be NULL, for async usage of futexes.304304 */305305-static inline void futex_queue(struct futex_q *q, struct futex_hash_bucket *hb)305305+static inline void futex_queue(struct futex_q *q, struct futex_hash_bucket *hb,306306+ struct task_struct *task)306307 __releases(&hb->lock)307308{308308- __futex_queue(q, hb);309309+ __futex_queue(q, hb, task);309310 spin_unlock(&hb->lock);310311}311312
+1-1
kernel/futex/pi.c
···982982 /*983983 * Only actually queue now that the atomic ops are done:984984 */985985- __futex_queue(&q, hb);985985+ __futex_queue(&q, hb, current);986986987987 if (trylock) {988988 ret = rt_mutex_futex_trylock(&q.pi_state->pi_mutex);
+2-2
kernel/futex/waitwake.c
···349349 * access to the hash list and forcing another memory barrier.350350 */351351 set_current_state(TASK_INTERRUPTIBLE|TASK_FREEZABLE);352352- futex_queue(q, hb);352352+ futex_queue(q, hb, current);353353354354 /* Arm the timer */355355 if (timeout)···460460 * next futex. Queue each futex at this moment so hb can461461 * be unlocked.462462 */463463- futex_queue(q, hb);463463+ futex_queue(q, hb, current);464464 continue;465465 }466466
-4
kernel/irq/Kconfig
···3131config GENERIC_PENDING_IRQ3232 bool33333434-# Deduce delayed migration from top-level interrupt chip flags3535-config GENERIC_PENDING_IRQ_CHIPFLAGS3636- bool3737-3834# Support for generic irq migrating off cpu before the cpu is offline.3935config GENERIC_IRQ_MIGRATION4036 bool
+2-2
kernel/kthread.c
···859859 struct kthread *kthread = to_kthread(p);860860 cpumask_var_t affinity;861861 unsigned long flags;862862- int ret;862862+ int ret = 0;863863864864 if (!wait_task_inactive(p, TASK_UNINTERRUPTIBLE) || kthread->started) {865865 WARN_ON(1);···892892out:893893 free_cpumask_var(affinity);894894895895- return 0;895895+ return ret;896896}897897898898/*
+2-2
kernel/sched/autogroup.c
···150150 * see this thread after that: we can no longer use signal->autogroup.151151 * See the PF_EXITING check in task_wants_autogroup().152152 */153153- sched_move_task(p);153153+ sched_move_task(p, true);154154}155155156156static void···182182 * sched_autogroup_exit_task().183183 */184184 for_each_thread(p, t)185185- sched_move_task(t);185185+ sched_move_task(t, true);186186187187 unlock_task_sighand(p, &flags);188188 autogroup_kref_put(prev);
+7-5
kernel/sched/core.c
···10631063 struct task_struct *task;1064106410651065 task = container_of(node, struct task_struct, wake_q);10661066- /* Task can safely be re-inserted now: */10671066 node = node->next;10681068- task->wake_q.next = NULL;10671067+ /* pairs with cmpxchg_relaxed() in __wake_q_add() */10681068+ WRITE_ONCE(task->wake_q.next, NULL);10691069+ /* Task can safely be re-inserted now. */1069107010701071 /*10711072 * wake_up_process() executes a full barrier, which pairs with···90519050 * now. This function just updates tsk->se.cfs_rq and tsk->se.parent to reflect90529051 * its new group.90539052 */90549054-void sched_move_task(struct task_struct *tsk)90539053+void sched_move_task(struct task_struct *tsk, bool for_autogroup)90559054{90569055 int queued, running, queue_flags =90579056 DEQUEUE_SAVE | DEQUEUE_MOVE | DEQUEUE_NOCLOCK;···90809079 put_prev_task(rq, tsk);9081908090829081 sched_change_group(tsk, group);90839083- scx_move_task(tsk);90829082+ if (!for_autogroup)90839083+ scx_cgroup_move_task(tsk);9084908490859085 if (queued)90869086 enqueue_task(rq, tsk, queue_flags);···91829180 struct cgroup_subsys_state *css;9183918191849182 cgroup_taskset_for_each(task, css, tset)91859185- sched_move_task(task);91839183+ sched_move_task(task, false);9186918491879185 scx_cgroup_finish_attach();91889186}
+2
kernel/sched/debug.c
···12621262 if (task_has_dl_policy(p)) {12631263 P(dl.runtime);12641264 P(dl.deadline);12651265+ } else if (fair_policy(p->policy)) {12661266+ P(se.slice);12651267 }12661268#ifdef CONFIG_SCHED_CLASS_EXT12671269 __PS("ext.enabled", task_on_scx(p));
+76-37
kernel/sched/ext.c
···123123 SCX_OPS_SWITCH_PARTIAL = 1LLU << 3,124124125125 /*126126+ * A migration disabled task can only execute on its current CPU. By127127+ * default, such tasks are automatically put on the CPU's local DSQ with128128+ * the default slice on enqueue. If this ops flag is set, they also go129129+ * through ops.enqueue().130130+ *131131+ * A migration disabled task never invokes ops.select_cpu() as it can132132+ * only select the current CPU. Also, p->cpus_ptr will only contain its133133+ * current CPU while p->nr_cpus_allowed keeps tracking p->user_cpus_ptr134134+ * and thus may disagree with cpumask_weight(p->cpus_ptr).135135+ */136136+ SCX_OPS_ENQ_MIGRATION_DISABLED = 1LLU << 4,137137+138138+ /*126139 * CPU cgroup support flags127140 */128141 SCX_OPS_HAS_CGROUP_WEIGHT = 1LLU << 16, /* cpu.weight */···143130 SCX_OPS_ALL_FLAGS = SCX_OPS_KEEP_BUILTIN_IDLE |144131 SCX_OPS_ENQ_LAST |145132 SCX_OPS_ENQ_EXITING |133133+ SCX_OPS_ENQ_MIGRATION_DISABLED |146134 SCX_OPS_SWITCH_PARTIAL |147135 SCX_OPS_HAS_CGROUP_WEIGHT,148136};···430416431417 /**432418 * @update_idle: Update the idle state of a CPU433433- * @cpu: CPU to udpate the idle state for419419+ * @cpu: CPU to update the idle state for434420 * @idle: whether entering or exiting the idle state435421 *436422 * This operation is called when @rq's CPU goes or leaves the idle···896882897883static DEFINE_STATIC_KEY_FALSE(scx_ops_enq_last);898884static DEFINE_STATIC_KEY_FALSE(scx_ops_enq_exiting);885885+static DEFINE_STATIC_KEY_FALSE(scx_ops_enq_migration_disabled);899886static DEFINE_STATIC_KEY_FALSE(scx_ops_cpu_preempt);900887static DEFINE_STATIC_KEY_FALSE(scx_builtin_idle_enabled);901888···1229121412301215/**12311216 * nldsq_next_task - Iterate to the next task in a non-local DSQ12321232- * @dsq: user dsq being interated12171217+ * @dsq: user dsq being iterated12331218 * @cur: current position, %NULL to start iteration12341219 * @rev: walk backwards12351220 *···20292014 unlikely(p->flags & PF_EXITING))20302015 goto 
local;2031201620172017+ /* see %SCX_OPS_ENQ_MIGRATION_DISABLED */20182018+ if (!static_branch_unlikely(&scx_ops_enq_migration_disabled) &&20192019+ is_migration_disabled(p))20202020+ goto local;20212021+20322022 if (!SCX_HAS_OP(enqueue))20332023 goto global;20342024···2098207820992079 /*21002080 * list_add_tail() must be used. scx_ops_bypass() depends on tasks being21012101- * appened to the runnable_list.20812081+ * appended to the runnable_list.21022082 */21032083 list_add_tail(&p->scx.runnable_node, &rq->scx.runnable_list);21042084}···23332313 *23342314 * - The BPF scheduler is bypassed while the rq is offline and we can always say23352315 * no to the BPF scheduler initiated migrations while offline.23162316+ *23172317+ * The caller must ensure that @p and @rq are on different CPUs.23362318 */23372319static bool task_can_run_on_remote_rq(struct task_struct *p, struct rq *rq,23382320 bool trigger_error)23392321{23402322 int cpu = cpu_of(rq);23232323+23242324+ SCHED_WARN_ON(task_cpu(p) == cpu);23252325+23262326+ /*23272327+ * If @p has migration disabled, @p->cpus_ptr is updated to contain only23282328+ * the pinned CPU in migrate_disable_switch() while @p is being switched23292329+ * out. 
However, put_prev_task_scx() is called before @p->cpus_ptr is23302330+	 * updated and thus another CPU may see @p on a DSQ in between leading to23312331+	 * @p passing the below task_allowed_on_cpu() check while migration is23322332+	 * disabled.23332333+	 *23342334+	 * Test the migration disabled state first as the race window is narrow23352335+	 * and the BPF scheduler failing to check migration disabled state can23362336+	 * easily be masked if task_allowed_on_cpu() is done first.23372337+	 */23382338+	if (unlikely(is_migration_disabled(p))) {23392339+		if (trigger_error)23402340+			scx_ops_error("SCX_DSQ_LOCAL[_ON] cannot move migration disabled %s[%d] from CPU %d to %d",23412341+				      p->comm, p->pid, task_cpu(p), cpu);23422342+		return false;23432343+	}2341234423422345	/*23432346	 * We don't require the BPF scheduler to avoid dispatching to offline···23702327	 */23712328	if (!task_allowed_on_cpu(p, cpu)) {23722329		if (trigger_error)23732373-			scx_ops_error("SCX_DSQ_LOCAL[_ON] verdict target cpu %d not allowed for %s[%d]",23742374-				      cpu_of(rq), p->comm, p->pid);23302330+			scx_ops_error("SCX_DSQ_LOCAL[_ON] target CPU %d not allowed for %s[%d]",23312331+				      cpu, p->comm, p->pid);23752332		return false;23762333	}23772377-23782378-	if (unlikely(is_migration_disabled(p)))23792379-		return false;2380233423812335	if (!scx_rq_online(rq))23822336		return false;···24772437	24782438	if (dst_dsq->id == SCX_DSQ_LOCAL) {24792439		dst_rq = container_of(dst_dsq, struct rq, scx.local_dsq);24802480-		if (!task_can_run_on_remote_rq(p, dst_rq, true)) {24402440+		if (src_rq != dst_rq &&24412441+		    unlikely(!task_can_run_on_remote_rq(p, dst_rq, true))) {24812442			dst_dsq = find_global_dsq(p);24822443			dst_rq = src_rq;24832444		}···25212480/*25222481 * A poorly behaving BPF scheduler can live-lock the system by e.g. incessantly25232482 * banging on the same DSQ on a large NUMA system to the point where switching25242524- * to the bypass mode can take a long time. 
Inject artifical delays while the24832483+ * to the bypass mode can take a long time. Inject artificial delays while the25252484 * bypass mode is switching to guarantee timely completion.25262485 */25272486static void scx_ops_breather(struct rq *rq)···26162575{26172576 struct rq *src_rq = task_rq(p);26182577 struct rq *dst_rq = container_of(dst_dsq, struct rq, scx.local_dsq);25782578+#ifdef CONFIG_SMP25792579+ struct rq *locked_rq = rq;25802580+#endif2619258126202582 /*26212583 * We're synchronized against dequeue through DISPATCHING. As @p can't···26322588 }2633258926342590#ifdef CONFIG_SMP26352635- if (unlikely(!task_can_run_on_remote_rq(p, dst_rq, true))) {25912591+ if (src_rq != dst_rq &&25922592+ unlikely(!task_can_run_on_remote_rq(p, dst_rq, true))) {26362593 dispatch_enqueue(find_global_dsq(p), p,26372594 enq_flags | SCX_ENQ_CLEAR_OPSS);26382595 return;···26562611 atomic_long_set_release(&p->scx.ops_state, SCX_OPSS_NONE);2657261226582613 /* switch to @src_rq lock */26592659- if (rq != src_rq) {26602660- raw_spin_rq_unlock(rq);26142614+ if (locked_rq != src_rq) {26152615+ raw_spin_rq_unlock(locked_rq);26162616+ locked_rq = src_rq;26612617 raw_spin_rq_lock(src_rq);26622618 }26632619···26762630 } else {26772631 move_remote_task_to_local_dsq(p, enq_flags,26782632 src_rq, dst_rq);26332633+ /* task has been moved to dst_rq, which is now locked */26342634+ locked_rq = dst_rq;26792635 }2680263626812637 /* if the destination CPU is idle, wake it up */···26862638 }2687263926882640 /* switch back to @rq lock */26892689- if (rq != dst_rq) {26902690- raw_spin_rq_unlock(dst_rq);26412641+ if (locked_rq != rq) {26422642+ raw_spin_rq_unlock(locked_rq);26912643 raw_spin_rq_lock(rq);26922644 }26932645#else /* CONFIG_SMP */···31923144 *31933145 * Unless overridden by ops.core_sched_before(), @p->scx.core_sched_at is used31943146 * to implement the default task ordering. 
The older the timestamp, the higher31953195- * prority the task - the global FIFO ordering matching the default scheduling31473147+ * priority the task - the global FIFO ordering matching the default scheduling31963148 * behavior.31973149 *31983150 * When ops.core_sched_before() is enabled, @p->scx.core_sched_at is used to···38993851 curr->scx.slice = 0;39003852 touch_core_sched(rq, curr);39013853 } else if (SCX_HAS_OP(tick)) {39023902- SCX_CALL_OP(SCX_KF_REST, tick, curr);38543854+ SCX_CALL_OP_TASK(SCX_KF_REST, tick, curr);39033855 }3904385639053857 if (!curr->scx.slice)···40463998 WARN_ON_ONCE(scx_get_task_state(p) != SCX_TASK_ENABLED);4047399940484000 if (SCX_HAS_OP(disable))40494049- SCX_CALL_OP(SCX_KF_REST, disable, p);40014001+ SCX_CALL_OP_TASK(SCX_KF_REST, disable, p);40504002 scx_set_task_state(p, SCX_TASK_READY);40514003}40524004···40754027 }4076402840774029 if (SCX_HAS_OP(exit_task))40784078- SCX_CALL_OP(SCX_KF_REST, exit_task, p, &args);40304030+ SCX_CALL_OP_TASK(SCX_KF_REST, exit_task, p, &args);40794031 scx_set_task_state(p, SCX_TASK_NONE);40804032}40814033···43714323 return ops_sanitize_err("cgroup_prep_move", ret);43724324}4373432543744374-void scx_move_task(struct task_struct *p)43264326+void scx_cgroup_move_task(struct task_struct *p)43754327{43764328 if (!scx_cgroup_enabled)43774377- return;43784378-43794379- /*43804380- * We're called from sched_move_task() which handles both cgroup and43814381- * autogroup moves. Ignore the latter.43824382- *43834383- * Also ignore exiting tasks, because in the exit path tasks transition43844384- * from the autogroup to the root group, so task_group_is_autogroup()43854385- * alone isn't able to catch exiting autogroup tasks. 
This is safe for43864386- * cgroup_move(), because cgroup migrations never happen for PF_EXITING43874387- * tasks.43884388- */43894389- if (task_group_is_autogroup(task_group(p)) || (p->flags & PF_EXITING))43904329 return;4391433043924331 /*···46254590 cgroup_warned_missing_idle = false;4626459146274592 /*46284628- * scx_tg_on/offline() are excluded thorugh scx_cgroup_rwsem. If we walk45934593+ * scx_tg_on/offline() are excluded through scx_cgroup_rwsem. If we walk46294594 * cgroups and init, all online cgroups are initialized.46304595 */46314596 rcu_read_lock();···50945059 static_branch_disable(&scx_has_op[i]);50955060 static_branch_disable(&scx_ops_enq_last);50965061 static_branch_disable(&scx_ops_enq_exiting);50625062+ static_branch_disable(&scx_ops_enq_migration_disabled);50975063 static_branch_disable(&scx_ops_cpu_preempt);50985064 static_branch_disable(&scx_builtin_idle_enabled);50995065 synchronize_rcu();···53135277 scx_get_task_state(p), p->scx.flags & ~SCX_TASK_STATE_MASK,53145278 p->scx.dsq_flags, ops_state & SCX_OPSS_STATE_MASK,53155279 ops_state >> SCX_OPSS_QSEQ_SHIFT);53165316- dump_line(s, " sticky/holding_cpu=%d/%d dsq_id=%s dsq_vtime=%llu slice=%llu",53175317- p->scx.sticky_cpu, p->scx.holding_cpu, dsq_id_buf,53185318- p->scx.dsq_vtime, p->scx.slice);52805280+ dump_line(s, " sticky/holding_cpu=%d/%d dsq_id=%s",52815281+ p->scx.sticky_cpu, p->scx.holding_cpu, dsq_id_buf);52825282+ dump_line(s, " dsq_vtime=%llu slice=%llu weight=%u",52835283+ p->scx.dsq_vtime, p->scx.slice, p->scx.weight);53195284 dump_line(s, " cpus=%*pb", cpumask_pr_args(p->cpus_ptr));5320528553215286 if (SCX_HAS_OP(dump_task)) {···5704566757055668 if (ops->flags & SCX_OPS_ENQ_EXITING)57065669 static_branch_enable(&scx_ops_enq_exiting);56705670+ if (ops->flags & SCX_OPS_ENQ_MIGRATION_DISABLED)56715671+ static_branch_enable(&scx_ops_enq_migration_disabled);57075672 if (scx_ops.cpu_acquire || scx_ops.cpu_release)57085673 static_branch_enable(&scx_ops_cpu_preempt);57095674
···53855385static void set_delayed(struct sched_entity *se)53865386{53875387 se->sched_delayed = 1;53885388+53895389+ /*53905390+ * Delayed se of cfs_rq have no tasks queued on them.53915391+ * Do not adjust h_nr_runnable since dequeue_entities()53925392+ * will account it for blocked tasks.53935393+ */53945394+ if (!entity_is_task(se))53955395+ return;53965396+53885397 for_each_sched_entity(se) {53895398 struct cfs_rq *cfs_rq = cfs_rq_of(se);53905399···54065397static void clear_delayed(struct sched_entity *se)54075398{54085399 se->sched_delayed = 0;54005400+54015401+ /*54025402+ * Delayed se of cfs_rq have no tasks queued on them.54035403+ * Do not adjust h_nr_runnable since a dequeue has54045404+ * already accounted for it or an enqueue of a task54055405+ * below it will account for it in enqueue_task_fair().54065406+ */54075407+ if (!entity_is_task(se))54085408+ return;54095409+54095410 for_each_sched_entity(se) {54105411 struct cfs_rq *cfs_rq = cfs_rq_of(se);54115412
···749749 if (WARN_ON_ONCE(!fprog))750750 return false;751751752752+ /* Our single exception to filtering. */753753+#ifdef __NR_uretprobe754754+#ifdef SECCOMP_ARCH_COMPAT755755+ if (sd->arch == SECCOMP_ARCH_NATIVE)756756+#endif757757+ if (sd->nr == __NR_uretprobe)758758+ return true;759759+#endif760760+752761 for (pc = 0; pc < fprog->len; pc++) {753762 struct sock_filter *insn = &fprog->filter[pc];754763 u16 code = insn->code;···10321023 */10331024static const int mode1_syscalls[] = {10341025 __NR_seccomp_read, __NR_seccomp_write, __NR_seccomp_exit, __NR_seccomp_sigreturn,10261026+#ifdef __NR_uretprobe10271027+ __NR_uretprobe,10281028+#endif10351029 -1, /* negative terminated */10361030};10371031
+6-3
kernel/time/clocksource.c
···373373 cpumask_clear(&cpus_ahead);374374 cpumask_clear(&cpus_behind);375375 cpus_read_lock();376376- preempt_disable();376376+ migrate_disable();377377 clocksource_verify_choose_cpus();378378 if (cpumask_empty(&cpus_chosen)) {379379- preempt_enable();379379+ migrate_enable();380380 cpus_read_unlock();381381 pr_warn("Not enough CPUs to check clocksource '%s'.\n", cs->name);382382 return;383383 }384384 testcpu = smp_processor_id();385385- pr_warn("Checking clocksource %s synchronization from CPU %d to CPUs %*pbl.\n", cs->name, testcpu, cpumask_pr_args(&cpus_chosen));385385+ pr_info("Checking clocksource %s synchronization from CPU %d to CPUs %*pbl.\n",386386+ cs->name, testcpu, cpumask_pr_args(&cpus_chosen));387387+ preempt_disable();386388 for_each_cpu(cpu, &cpus_chosen) {387389 if (cpu == testcpu)388390 continue;···404402 cs_nsec_min = cs_nsec;405403 }406404 preempt_enable();405405+ migrate_enable();407406 cpus_read_unlock();408407 if (!cpumask_empty(&cpus_ahead))409408 pr_warn(" CPUs %*pbl ahead of CPU %d for clocksource %s.\n",
+94-31
kernel/time/hrtimer.c
···5858#define HRTIMER_ACTIVE_SOFT (HRTIMER_ACTIVE_HARD << MASK_SHIFT)5959#define HRTIMER_ACTIVE_ALL (HRTIMER_ACTIVE_SOFT | HRTIMER_ACTIVE_HARD)60606161+static void retrigger_next_event(void *arg);6262+6163/*6264 * The timer bases:6365 *···113111 .clockid = CLOCK_TAI,114112 .get_time = &ktime_get_clocktai,115113 },116116- }114114+ },115115+ .csd = CSD_INIT(retrigger_next_event, NULL)117116};118117119118static const int hrtimer_clock_to_base_table[MAX_CLOCKS] = {···126123 [CLOCK_BOOTTIME] = HRTIMER_BASE_BOOTTIME,127124 [CLOCK_TAI] = HRTIMER_BASE_TAI,128125};126126+127127+static inline bool hrtimer_base_is_online(struct hrtimer_cpu_base *base)128128+{129129+ if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))130130+ return true;131131+ else132132+ return likely(base->online);133133+}129134130135/*131136 * Functions and macros which are different for UP/SMP systems are kept in a···155144};156145157146#define migration_base migration_cpu_base.clock_base[0]158158-159159-static inline bool is_migration_base(struct hrtimer_clock_base *base)160160-{161161- return base == &migration_base;162162-}163147164148/*165149 * We are using hashed locking: holding per_cpu(hrtimer_bases)[n].lock···189183}190184191185/*192192- * We do not migrate the timer when it is expiring before the next193193- * event on the target cpu. When high resolution is enabled, we cannot194194- * reprogram the target cpu hardware and we would cause it to fire195195- * late. To keep it simple, we handle the high resolution enabled and196196- * disabled case similar.186186+ * Check if the elected target is suitable considering its next187187+ * event and the hotplug state of the current CPU.188188+ *189189+ * If the elected target is remote and its next event is after the timer190190+ * to queue, then a remote reprogram is necessary. However there is no191191+ * guarantee the IPI handling the operation would arrive in time to meet192192+ * the high resolution deadline. 
In this case the local CPU becomes a193193+ * preferred target, unless it is offline.194194+ *195195+ * High and low resolution modes are handled the same way for simplicity.197196 *198197 * Called with cpu_base->lock of target cpu held.199198 */200200-static int201201-hrtimer_check_target(struct hrtimer *timer, struct hrtimer_clock_base *new_base)199199+static bool hrtimer_suitable_target(struct hrtimer *timer, struct hrtimer_clock_base *new_base,200200+				    struct hrtimer_cpu_base *new_cpu_base,201201+				    struct hrtimer_cpu_base *this_cpu_base)202202{203203	ktime_t expires;204204205205+	/*206206+	 * The local CPU clockevent can be reprogrammed. Also get_target_base()207207+	 * guarantees it is online.208208+	 */209209+	if (new_cpu_base == this_cpu_base)210210+		return true;211211+212212+	/*213213+	 * The offline local CPU can't be the default target if the214214+	 * next remote target event is after this timer. Keep the215215+	 * elected new base. An IPI will be issued to reprogram216216+	 * it as a last resort.217217+	 */218218+	if (!hrtimer_base_is_online(this_cpu_base))219219+		return true;220220+205221	expires = ktime_sub(hrtimer_get_expires(timer), new_base->offset);206206-	return expires < new_base->cpu_base->expires_next;222222+223223+	return expires >= new_base->cpu_base->expires_next;207224}208225209209-static inline210210-struct hrtimer_cpu_base *get_target_base(struct hrtimer_cpu_base *base,211211-					 int pinned)226226+static inline struct hrtimer_cpu_base *get_target_base(struct hrtimer_cpu_base *base, int pinned)212227{228228+	if (!hrtimer_base_is_online(base)) {229229+		int cpu = cpumask_any_and(cpu_online_mask, housekeeping_cpumask(HK_TYPE_TIMER));230230+231231+		return &per_cpu(hrtimer_bases, cpu);232232+	}233233+213234#if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)214235	if (static_branch_likely(&timers_migration_enabled) && !pinned)215236		return &per_cpu(hrtimer_bases, get_nohz_timer_target());···287254	raw_spin_unlock(&base->cpu_base->lock);288255	
raw_spin_lock(&new_base->cpu_base->lock);289256290290-		if (new_cpu_base != this_cpu_base &&291291-		    hrtimer_check_target(timer, new_base)) {257257+		if (!hrtimer_suitable_target(timer, new_base, new_cpu_base,258258+					     this_cpu_base)) {292259			raw_spin_unlock(&new_base->cpu_base->lock);293260			raw_spin_lock(&base->cpu_base->lock);294261			new_cpu_base = this_cpu_base;···297264		}298265		WRITE_ONCE(timer->base, new_base);299266	} else {300300-		if (new_cpu_base != this_cpu_base &&301301-		    hrtimer_check_target(timer, new_base)) {267267+		if (!hrtimer_suitable_target(timer, new_base, new_cpu_base, this_cpu_base)) {302268			new_cpu_base = this_cpu_base;303269			goto again;304270		}···306274}307275308276#else /* CONFIG_SMP */309309-310310-static inline bool is_migration_base(struct hrtimer_clock_base *base)311311-{312312-	return false;313313-}314277315278static inline struct hrtimer_clock_base *316279lock_hrtimer_base(const struct hrtimer *timer, unsigned long *flags)···742715{743716	return hrtimer_hres_enabled;744717}745745-746746-static void retrigger_next_event(void *arg);747718748719/*749720 * Switch to high resolution mode···12301205				    u64 delta_ns, const enum hrtimer_mode mode,12311206				    struct hrtimer_clock_base *base)12321207{12081208+	struct hrtimer_cpu_base *this_cpu_base = this_cpu_ptr(&hrtimer_bases);12331209	struct hrtimer_clock_base *new_base;12341210	bool force_local, first;12351211···12421216	 * and enforce reprogramming after it is queued no matter whether12431217	 * it is the new first expiring timer again or not.12441218	 */12451245-	force_local = base->cpu_base == this_cpu_ptr(&hrtimer_bases);12191219+	force_local = base->cpu_base == this_cpu_base;12461220	force_local &= base->cpu_base->next_timer == timer;12211221+12221222+	/*12231223+	 * Don't force local queuing if this enqueue happens on an unplugged12241224+	 * CPU after hrtimer_cpu_dying() has been invoked.12251225+	 */12261226+	force_local &= this_cpu_base->online;1247122712481228	/*12491229	 * Remove an active timer from 
the queue. In case it is not queued···12801248 }1281124912821250 first = enqueue_hrtimer(timer, new_base, mode);12831283- if (!force_local)12841284- return first;12511251+ if (!force_local) {12521252+ /*12531253+ * If the current CPU base is online, then the timer is12541254+ * never queued on a remote CPU if it would be the first12551255+ * expiring timer there.12561256+ */12571257+ if (hrtimer_base_is_online(this_cpu_base))12581258+ return first;12591259+12601260+ /*12611261+ * Timer was enqueued remote because the current base is12621262+ * already offline. If the timer is the first to expire,12631263+ * kick the remote CPU to reprogram the clock event.12641264+ */12651265+ if (first) {12661266+ struct hrtimer_cpu_base *new_cpu_base = new_base->cpu_base;12671267+12681268+ smp_call_function_single_async(new_cpu_base->cpu, &new_cpu_base->csd);12691269+ }12701270+ return 0;12711271+ }1285127212861273 /*12871274 * Timer was forced to stay on the current CPU to avoid···14201369 raw_spin_lock_irq(&cpu_base->lock);14211370 }14221371}13721372+13731373+#ifdef CONFIG_SMP13741374+static __always_inline bool is_migration_base(struct hrtimer_clock_base *base)13751375+{13761376+ return base == &migration_base;13771377+}13781378+#else13791379+static __always_inline bool is_migration_base(struct hrtimer_clock_base *base)13801380+{13811381+ return false;13821382+}13831383+#endif1423138414241385/*14251386 * This function is called on PREEMPT_RT kernels when the fast path
+9-1
kernel/time/timer_migration.c
···1675167516761676 } while (i < tmigr_hierarchy_levels);1677167716781678+ /* Assert single root */16791679+ WARN_ON_ONCE(!err && !group->parent && !list_is_singular(&tmigr_level_list[top]));16801680+16781681 while (i > 0) {16791682 group = stack[--i];16801683···17191716 WARN_ON_ONCE(top == 0);1720171717211718 lvllist = &tmigr_level_list[top];17221722- if (group->num_children == 1 && list_is_singular(lvllist)) {17191719+17201720+ /*17211721+ * Newly created root level should have accounted the upcoming17221722+ * CPU's child group and pre-accounted the old root.17231723+ */17241724+ if (group->num_children == 2 && list_is_singular(lvllist)) {17231725 /*17241726 * The target CPU must never do the prepare work, except17251727 * on early boot when the boot CPU is the target. Otherwise
+26-2
kernel/trace/ring_buffer.c
···16721672 * must be the same.16731673 */16741674static bool rb_meta_valid(struct ring_buffer_meta *meta, int cpu,16751675- struct trace_buffer *buffer, int nr_pages)16751675+ struct trace_buffer *buffer, int nr_pages,16761676+ unsigned long *subbuf_mask)16761677{16771678 int subbuf_size = PAGE_SIZE;16781679 struct buffer_data_page *subbuf;16791680 unsigned long buffers_start;16801681 unsigned long buffers_end;16811682 int i;16831683+16841684+ if (!subbuf_mask)16851685+ return false;1682168616831687 /* Check the meta magic and meta struct size */16841688 if (meta->magic != RING_BUFFER_META_MAGIC ||···1716171217171713 subbuf = rb_subbufs_from_meta(meta);1718171417151715+ bitmap_clear(subbuf_mask, 0, meta->nr_subbufs);17161716+17191717 /* Is the meta buffers and the subbufs themselves have correct data? */17201718 for (i = 0; i < meta->nr_subbufs; i++) {17211719 if (meta->buffers[i] < 0 ||···17311725 return false;17321726 }1733172717281728+ if (test_bit(meta->buffers[i], subbuf_mask)) {17291729+ pr_info("Ring buffer boot meta [%d] array has duplicates\n", cpu);17301730+ return false;17311731+ }17321732+17331733+ set_bit(meta->buffers[i], subbuf_mask);17341734 subbuf = (void *)subbuf + subbuf_size;17351735 }17361736···18501838 cpu_buffer->cpu);18511839 goto invalid;18521840 }18411841+18421842+ /* If the buffer has content, update pages_touched */18431843+ if (ret)18441844+ local_inc(&cpu_buffer->pages_touched);18451845+18531846 entries += ret;18541847 entry_bytes += local_read(&head_page->page->commit);18551848 local_set(&cpu_buffer->head_page->entries, ret);···19061889static void rb_range_meta_init(struct trace_buffer *buffer, int nr_pages)19071890{19081891 struct ring_buffer_meta *meta;18921892+ unsigned long *subbuf_mask;19091893 unsigned long delta;19101894 void *subbuf;19111895 int cpu;19121896 int i;18971897+18981898+ /* Create a mask to test the subbuf array */18991899+ subbuf_mask = bitmap_alloc(nr_pages + 1, GFP_KERNEL);19001900+ /* If subbuf_mask fails to 
allocate, then rb_meta_valid() will return false */1913190119141902 for (cpu = 0; cpu < nr_cpu_ids; cpu++) {19151903 void *next_meta;1916190419171905 meta = rb_range_meta(buffer, nr_pages, cpu);1918190619191919- if (rb_meta_valid(meta, cpu, buffer, nr_pages)) {19071907+ if (rb_meta_valid(meta, cpu, buffer, nr_pages, subbuf_mask)) {19201908 /* Make the mappings match the current address */19211909 subbuf = rb_subbufs_from_meta(meta);19221910 delta = (unsigned long)subbuf - meta->first_buffer;···19651943 subbuf += meta->subbuf_size;19661944 }19671945 }19461946+ bitmap_free(subbuf_mask);19681947}1969194819701949static void *rbm_start(struct seq_file *m, loff_t *pos)···71497126 kfree(cpu_buffer->subbuf_ids);71507127 cpu_buffer->subbuf_ids = NULL;71517128 rb_free_meta_page(cpu_buffer);71297129+ atomic_dec(&cpu_buffer->resize_disabled);71527130 }7153713171547132unlock:
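The ring_buffer.c change validates the boot-mapped meta data by clearing a bitmap and doing a test-then-set per subbuf index, rejecting the buffer on the first duplicate. A minimal userspace sketch of that pattern (helper names here are illustrative stand-ins for the kernel's `bitmap_clear()`/`test_bit()`/`set_bit()`):

```c
#include <stdbool.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Zero every word that covers nbits bits. */
static void bitmap_clear_all(unsigned long *map, size_t nbits)
{
	for (size_t i = 0; i < (nbits + BITS_PER_LONG - 1) / BITS_PER_LONG; i++)
		map[i] = 0;
}

/* Set the bit and report whether it was already set. */
static bool test_and_set(unsigned long *map, size_t bit)
{
	unsigned long mask = 1UL << (bit % BITS_PER_LONG);
	bool was_set = map[bit / BITS_PER_LONG] & mask;

	map[bit / BITS_PER_LONG] |= mask;
	return was_set;
}

/* Return false when any index occurs twice, mirroring the duplicate
 * check rb_meta_valid() now performs on meta->buffers[]. */
static bool indices_unique(const int *buffers, size_t n, unsigned long *map)
{
	bitmap_clear_all(map, n);
	for (size_t i = 0; i < n; i++)
		if (test_and_set(map, (size_t)buffers[i]))
			return false; /* duplicate: meta data is corrupt */
	return true;
}
```

Allocating the mask once in the caller and passing it in, as rb_range_meta_init() does, avoids re-allocating per CPU.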
+5-7
kernel/trace/trace.c
···59775977ssize_t tracing_resize_ring_buffer(struct trace_array *tr,59785978 unsigned long size, int cpu_id)59795979{59805980- int ret;59815981-59825980 guard(mutex)(&trace_types_lock);5983598159845982 if (cpu_id != RING_BUFFER_ALL_CPUS) {···59855987 return -EINVAL;59865988 }5987598959885988- ret = __tracing_resize_ring_buffer(tr, size, cpu_id);59895989- if (ret < 0)59905990- ret = -ENOMEM;59915991-59925992- return ret;59905990+ return __tracing_resize_ring_buffer(tr, size, cpu_id);59935991}5994599259955993static void update_last_data(struct trace_array *tr)···82788284 struct ftrace_buffer_info *info = filp->private_data;82798285 struct trace_iterator *iter = &info->iter;82808286 int ret = 0;82878287+82888288+ /* Currently the boot mapped buffer is not supported for mmap */82898289+ if (iter->tr->flags & TRACE_ARRAY_FL_BOOT)82908290+ return -ENODEV;8281829182828292 ret = get_snapshot_map(iter->tr);82838293 if (ret)
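The trace.c hunk can drop the error-remapping tail because `guard(mutex)(&trace_types_lock)` releases the lock on every return path, so a bare `return __tracing_resize_ring_buffer(...)` is safe. A userspace sketch of that scope-based guard idiom, using the GCC/Clang `cleanup` attribute with a plain flag standing in for the mutex (all names illustrative):

```c
/* Flag standing in for a real lock so the sketch stays self-contained. */
static int lock_held;

static void unguard(int *unused)
{
	(void)unused;
	lock_held = 0;		/* runs whenever the guarded scope is left */
}

#define guard_lock() \
	int guard_ __attribute__((cleanup(unguard))) = (lock_held = 1)

static int resize(int size)
{
	guard_lock();
	if (size < 0)
		return -1;	/* early return: the cleanup still fires */
	return size;		/* normal return: ditto */
}
```

The kernel's real `guard()` in `linux/cleanup.h` is built on the same compiler feature.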
+1-1
kernel/trace/trace_functions_graph.c
···198198 * returning from the function.199199 */200200 if (ftrace_graph_notrace_addr(trace->func)) {201201- *task_var |= TRACE_GRAPH_NOTRACE_BIT;201201+ *task_var |= TRACE_GRAPH_NOTRACE;202202 /*203203 * Need to return 1 to have the return called204204 * that will clear the NOTRACE bit.
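The trace_functions_graph.c fix is the classic bit-index-versus-bit-mask confusion: `TRACE_GRAPH_NOTRACE_BIT` is a bit *number*, so OR-ing it into the flags word sets the wrong bits entirely. A minimal sketch with illustrative stand-in definitions:

```c
#include <stdbool.h>

#define NOTRACE_BIT 3			/* bit number  */
#define NOTRACE     (1UL << NOTRACE_BIT) /* bit mask */

static bool notrace_set(unsigned long flags)
{
	return flags & NOTRACE;
}
```

OR-ing the index (value 3) sets bits 0 and 1, so the flag test never fires; OR-ing the mask sets bit 3 as intended.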
+6-6
kernel/workqueue.c
···35173517 }3518351835193519 /*35203520- * Put the reference grabbed by send_mayday(). @pool won't35213521- * go away while we're still attached to it.35223522- */35233523- put_pwq(pwq);35243524-35253525- /*35263520 * Leave this pool. Notify regular workers; otherwise, we end up35273521 * with 0 concurrency and stalling the execution.35283522 */···35253531 raw_spin_unlock_irq(&pool->lock);3526353235273533 worker_detach_from_pool(rescuer);35343534+35353535+ /*35363536+ * Put the reference grabbed by send_mayday(). @pool might35373537+ * go away any time after it.35383538+ */35393539+ put_pwq_unlocked(pwq);3528354035293541 raw_spin_lock_irq(&wq_mayday_lock);35303542 }
+4-2
lib/stackinit_kunit.c
···7575 */7676#ifdef CONFIG_M68K7777#define FILL_SIZE_STRING 87878+#define FILL_SIZE_ARRAY 27879#else7980#define FILL_SIZE_STRING 168181+#define FILL_SIZE_ARRAY 88082#endif81838284#define INIT_CLONE_SCALAR /**/···347345 short three;348346 unsigned long four;349347 struct big_struct {350350- unsigned long array[8];348348+ unsigned long array[FILL_SIZE_ARRAY];351349 } big;352350};353351354354-/* Mismatched sizes, with one and two being small */352352+/* Mismatched sizes, with three and four being small */355353union test_small_end {356354 short one;357355 unsigned long two;
···1818#include <linux/if_ether.h>1919#include <linux/jiffies.h>2020#include <linux/kref.h>2121+#include <linux/list.h>2122#include <linux/minmax.h>2223#include <linux/netdevice.h>2324#include <linux/nl80211.h>···2726#include <linux/rcupdate.h>2827#include <linux/rtnetlink.h>2928#include <linux/skbuff.h>2929+#include <linux/slab.h>3030#include <linux/stddef.h>3131#include <linux/string.h>3232#include <linux/types.h>···4240#include "originator.h"4341#include "routing.h"4442#include "send.h"4343+4444+/**4545+ * struct batadv_v_metric_queue_entry - list of hardif neighbors which require4646+ * and metric update4747+ */4848+struct batadv_v_metric_queue_entry {4949+ /** @hardif_neigh: hardif neighbor scheduled for metric update */5050+ struct batadv_hardif_neigh_node *hardif_neigh;5151+5252+ /** @list: list node for metric_queue */5353+ struct list_head list;5454+};45554656/**4757 * batadv_v_elp_start_timer() - restart timer for ELP periodic work···7359/**7460 * batadv_v_elp_get_throughput() - get the throughput towards a neighbour7561 * @neigh: the neighbour for which the throughput has to be obtained6262+ * @pthroughput: calculated throughput towards the given neighbour in multiples6363+ * of 100kpbs (a value of '1' equals 0.1Mbps, '10' equals 1Mbps, etc).7664 *7777- * Return: The throughput towards the given neighbour in multiples of 100kpbs7878- * (a value of '1' equals 0.1Mbps, '10' equals 1Mbps, etc).6565+ * Return: true when value behind @pthroughput was set7966 */8080-static u32 batadv_v_elp_get_throughput(struct batadv_hardif_neigh_node *neigh)6767+static bool batadv_v_elp_get_throughput(struct batadv_hardif_neigh_node *neigh,6868+ u32 *pthroughput)8169{8270 struct batadv_hard_iface *hard_iface = neigh->if_incoming;7171+ struct net_device *soft_iface = hard_iface->soft_iface;8372 struct ethtool_link_ksettings link_settings;8473 struct net_device *real_netdev;8574 struct station_info sinfo;8675 u32 throughput;8776 int ret;88777878+ /* don't query throughput when 
no longer associated with any7979+ * batman-adv interface8080+ */8181+ if (!soft_iface)8282+ return false;8383+8984 /* if the user specified a customised value for this interface, then9085 * return it directly9186 */9287 throughput = atomic_read(&hard_iface->bat_v.throughput_override);9393- if (throughput != 0)9494- return throughput;8888+ if (throughput != 0) {8989+ *pthroughput = throughput;9090+ return true;9191+ }95929693 /* if this is a wireless device, then ask its throughput through9794 * cfg80211 API···129104 * possible to delete this neighbor. For now set130105 * the throughput metric to 0.131106 */132132- return 0;107107+ *pthroughput = 0;108108+ return true;133109 }134110 if (ret)135111 goto default_throughput;136112137137- if (sinfo.filled & BIT(NL80211_STA_INFO_EXPECTED_THROUGHPUT))138138- return sinfo.expected_throughput / 100;113113+ if (sinfo.filled & BIT(NL80211_STA_INFO_EXPECTED_THROUGHPUT)) {114114+ *pthroughput = sinfo.expected_throughput / 100;115115+ return true;116116+ }139117140118 /* try to estimate the expected throughput based on reported tx141119 * rates142120 */143143- if (sinfo.filled & BIT(NL80211_STA_INFO_TX_BITRATE))144144- return cfg80211_calculate_bitrate(&sinfo.txrate) / 3;121121+ if (sinfo.filled & BIT(NL80211_STA_INFO_TX_BITRATE)) {122122+ *pthroughput = cfg80211_calculate_bitrate(&sinfo.txrate) / 3;123123+ return true;124124+ }145125146126 goto default_throughput;147127 }148128129129+ /* only use rtnl_trylock because the elp worker will be cancelled while130130+ * the rntl_lock is held. the cancel_delayed_work_sync() would otherwise131131+ * wait forever when the elp work_item was started and it is then also132132+ * trying to rtnl_lock133133+ */134134+ if (!rtnl_trylock())135135+ return false;136136+149137 /* if not a wifi interface, check if this device provides data via150138 * ethtool (e.g. 
an Ethernet adapter)151139 */152152- rtnl_lock();153140 ret = __ethtool_get_link_ksettings(hard_iface->net_dev, &link_settings);154141 rtnl_unlock();155142 if (ret == 0) {···172135 hard_iface->bat_v.flags &= ~BATADV_FULL_DUPLEX;173136174137 throughput = link_settings.base.speed;175175- if (throughput && throughput != SPEED_UNKNOWN)176176- return throughput * 10;138138+ if (throughput && throughput != SPEED_UNKNOWN) {139139+ *pthroughput = throughput * 10;140140+ return true;141141+ }177142 }178143179144default_throughput:180145 if (!(hard_iface->bat_v.flags & BATADV_WARNING_DEFAULT)) {181181- batadv_info(hard_iface->soft_iface,146146+ batadv_info(soft_iface,182147 "WiFi driver or ethtool info does not provide information about link speeds on interface %s, therefore defaulting to hardcoded throughput values of %u.%1u Mbps. Consider overriding the throughput manually or checking your driver.\n",183148 hard_iface->net_dev->name,184149 BATADV_THROUGHPUT_DEFAULT_VALUE / 10,···189150 }190151191152 /* if none of the above cases apply, return the base_throughput */192192- return BATADV_THROUGHPUT_DEFAULT_VALUE;153153+ *pthroughput = BATADV_THROUGHPUT_DEFAULT_VALUE;154154+ return true;193155}194156195157/**196158 * batadv_v_elp_throughput_metric_update() - worker updating the throughput197159 * metric of a single hop neighbour198198- * @work: the work queue item160160+ * @neigh: the neighbour to probe199161 */200200-void batadv_v_elp_throughput_metric_update(struct work_struct *work)162162+static void163163+batadv_v_elp_throughput_metric_update(struct batadv_hardif_neigh_node *neigh)201164{202202- struct batadv_hardif_neigh_node_bat_v *neigh_bat_v;203203- struct batadv_hardif_neigh_node *neigh;165165+ u32 throughput;166166+ bool valid;204167205205- neigh_bat_v = container_of(work, struct batadv_hardif_neigh_node_bat_v,206206- metric_work);207207- neigh = container_of(neigh_bat_v, struct batadv_hardif_neigh_node,208208- bat_v);168168+ valid = 
batadv_v_elp_get_throughput(neigh, &throughput);169169+ if (!valid)170170+ return;209171210210- ewma_throughput_add(&neigh->bat_v.throughput,211211- batadv_v_elp_get_throughput(neigh));212212-213213- /* decrement refcounter to balance increment performed before scheduling214214- * this task215215- */216216- batadv_hardif_neigh_put(neigh);172172+ ewma_throughput_add(&neigh->bat_v.throughput, throughput);217173}218174219175/**···282248 */283249static void batadv_v_elp_periodic_work(struct work_struct *work)284250{251251+ struct batadv_v_metric_queue_entry *metric_entry;252252+ struct batadv_v_metric_queue_entry *metric_safe;285253 struct batadv_hardif_neigh_node *hardif_neigh;286254 struct batadv_hard_iface *hard_iface;287255 struct batadv_hard_iface_bat_v *bat_v;288256 struct batadv_elp_packet *elp_packet;257257+ struct list_head metric_queue;289258 struct batadv_priv *bat_priv;290259 struct sk_buff *skb;291260 u32 elp_interval;292292- bool ret;293261294262 bat_v = container_of(work, struct batadv_hard_iface_bat_v, elp_wq.work);295263 hard_iface = container_of(bat_v, struct batadv_hard_iface, bat_v);···327291328292 atomic_inc(&hard_iface->bat_v.elp_seqno);329293294294+ INIT_LIST_HEAD(&metric_queue);295295+330296 /* The throughput metric is updated on each sent packet. This way, if a331297 * node is dead and no longer sends packets, batman-adv is still able to332298 * react timely to its death.···353315354316 /* Reading the estimated throughput from cfg80211 is a task that355317 * may sleep and that is not allowed in an rcu protected356356- * context. Therefore schedule a task for that.318318+ * context. 
Therefore add it to metric_queue and process it319319+ * outside rcu protected context.357320 */358358- ret = queue_work(batadv_event_workqueue,359359- &hardif_neigh->bat_v.metric_work);360360-361361- if (!ret)321321+ metric_entry = kzalloc(sizeof(*metric_entry), GFP_ATOMIC);322322+ if (!metric_entry) {362323 batadv_hardif_neigh_put(hardif_neigh);324324+ continue;325325+ }326326+327327+ metric_entry->hardif_neigh = hardif_neigh;328328+ list_add(&metric_entry->list, &metric_queue);363329 }364330 rcu_read_unlock();331331+332332+ list_for_each_entry_safe(metric_entry, metric_safe, &metric_queue, list) {333333+ batadv_v_elp_throughput_metric_update(metric_entry->hardif_neigh);334334+335335+ batadv_hardif_neigh_put(metric_entry->hardif_neigh);336336+ list_del(&metric_entry->list);337337+ kfree(metric_entry);338338+ }365339366340restart_timer:367341 batadv_v_elp_start_timer(hard_iface);
···596596 * neighbor597597 */598598 unsigned long last_unicast_tx;599599-600600- /** @metric_work: work queue callback item for metric update */601601- struct work_struct metric_work;602599};603600604601/**
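The batman-adv change replaces per-neighbour `queue_work()` with a private `metric_queue`: entries discovered under `rcu_read_lock()` are only allocated and linked, and the potentially sleeping throughput update runs after the lock is dropped. A userspace sketch of that two-phase pattern (types and names are illustrative):

```c
#include <stdlib.h>

struct neigh { int id; int metric; };

struct queue_entry {
	struct neigh *n;
	struct queue_entry *next;
};

/* Phase 1: under the read-side lock, only allocate and link - never sleep. */
static struct queue_entry *queue_neigh(struct queue_entry *head,
				       struct neigh *n)
{
	struct queue_entry *e = malloc(sizeof(*e));

	if (!e)
		return head;	/* skip this neighbour, as the patch does */
	e->n = n;
	e->next = head;
	return e;
}

/* Phase 2: lock dropped - walk the private list, update, and free. */
static void drain_queue(struct queue_entry *head)
{
	while (head) {
		struct queue_entry *e = head;

		head = e->next;
		e->n->metric++;	/* stands in for the throughput update */
		free(e);
	}
}
```

This also removes the `metric_work` member from the neighbour struct, since there is no longer a per-neighbour work item to embed.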
+1-2
net/bluetooth/hidp/Kconfig
···11# SPDX-License-Identifier: GPL-2.0-only22config BT_HIDP33 tristate "HIDP protocol support"44- depends on BT_BREDR && INPUT && HID_SUPPORT55- select HID44+ depends on BT_BREDR && HID65 help76 HIDP (Human Interface Device Protocol) is a transport layer87 for HID reports. HIDP is required for the Bluetooth Human
+79-90
net/bluetooth/l2cap_core.c
···119119{120120 struct l2cap_chan *c;121121122122- mutex_lock(&conn->chan_lock);123122 c = __l2cap_get_chan_by_scid(conn, cid);124123 if (c) {125124 /* Only lock if chan reference is not 0 */···126127 if (c)127128 l2cap_chan_lock(c);128129 }129129- mutex_unlock(&conn->chan_lock);130130131131 return c;132132}···138140{139141 struct l2cap_chan *c;140142141141- mutex_lock(&conn->chan_lock);142143 c = __l2cap_get_chan_by_dcid(conn, cid);143144 if (c) {144145 /* Only lock if chan reference is not 0 */···145148 if (c)146149 l2cap_chan_lock(c);147150 }148148- mutex_unlock(&conn->chan_lock);149151150152 return c;151153}···414418 if (!conn)415419 return;416420417417- mutex_lock(&conn->chan_lock);421421+ mutex_lock(&conn->lock);418422 /* __set_chan_timer() calls l2cap_chan_hold(chan) while scheduling419423 * this work. No need to call l2cap_chan_hold(chan) here again.420424 */···435439 l2cap_chan_unlock(chan);436440 l2cap_chan_put(chan);437441438438- mutex_unlock(&conn->chan_lock);442442+ mutex_unlock(&conn->lock);439443}440444441445struct l2cap_chan *l2cap_chan_create(void)···637641638642void l2cap_chan_add(struct l2cap_conn *conn, struct l2cap_chan *chan)639643{640640- mutex_lock(&conn->chan_lock);644644+ mutex_lock(&conn->lock);641645 __l2cap_chan_add(conn, chan);642642- mutex_unlock(&conn->chan_lock);646646+ mutex_unlock(&conn->lock);643647}644648645649void l2cap_chan_del(struct l2cap_chan *chan, int err)···727731 if (!conn)728732 return;729733730730- mutex_lock(&conn->chan_lock);734734+ mutex_lock(&conn->lock);731735 __l2cap_chan_list(conn, func, data);732732- mutex_unlock(&conn->chan_lock);736736+ mutex_unlock(&conn->lock);733737}734738735739EXPORT_SYMBOL_GPL(l2cap_chan_list);···741745 struct hci_conn *hcon = conn->hcon;742746 struct l2cap_chan *chan;743747744744- mutex_lock(&conn->chan_lock);748748+ mutex_lock(&conn->lock);745749746750 list_for_each_entry(chan, &conn->chan_l, list) {747751 l2cap_chan_lock(chan);···750754 l2cap_chan_unlock(chan);751755 }752756753753- 
mutex_unlock(&conn->chan_lock);757757+ mutex_unlock(&conn->lock);754758}755759756760static void l2cap_chan_le_connect_reject(struct l2cap_chan *chan)···944948 return id;945949}946950951951+static void l2cap_send_acl(struct l2cap_conn *conn, struct sk_buff *skb,952952+ u8 flags)953953+{954954+ /* Check if the hcon still valid before attempting to send */955955+ if (hci_conn_valid(conn->hcon->hdev, conn->hcon))956956+ hci_send_acl(conn->hchan, skb, flags);957957+ else958958+ kfree_skb(skb);959959+}960960+947961static void l2cap_send_cmd(struct l2cap_conn *conn, u8 ident, u8 code, u16 len,948962 void *data)949963{···976970 bt_cb(skb)->force_active = BT_POWER_FORCE_ACTIVE_ON;977971 skb->priority = HCI_PRIO_MAX;978972979979- hci_send_acl(conn->hchan, skb, flags);973973+ l2cap_send_acl(conn, skb, flags);980974}981975982976static void l2cap_do_send(struct l2cap_chan *chan, struct sk_buff *skb)···1503149715041498 BT_DBG("conn %p", conn);1505149915061506- mutex_lock(&conn->chan_lock);15071507-15081500 list_for_each_entry_safe(chan, tmp, &conn->chan_l, list) {15091501 l2cap_chan_lock(chan);15101502···1571156715721568 l2cap_chan_unlock(chan);15731569 }15741574-15751575- mutex_unlock(&conn->chan_lock);15761570}1577157115781572static void l2cap_le_conn_ready(struct l2cap_conn *conn)···16161614 if (hcon->type == ACL_LINK)16171615 l2cap_request_info(conn);1618161616191619- mutex_lock(&conn->chan_lock);16171617+ mutex_lock(&conn->lock);1620161816211619 list_for_each_entry(chan, &conn->chan_l, list) {16221620···16341632 l2cap_chan_unlock(chan);16351633 }1636163416371637- mutex_unlock(&conn->chan_lock);16351635+ mutex_unlock(&conn->lock);1638163616391637 if (hcon->type == LE_LINK)16401638 l2cap_le_conn_ready(conn);···1649164716501648 BT_DBG("conn %p", conn);1651164916521652- mutex_lock(&conn->chan_lock);16531653-16541650 list_for_each_entry(chan, &conn->chan_l, list) {16551651 if (test_bit(FLAG_FORCE_RELIABLE, &chan->flags))16561652 l2cap_chan_set_err(chan, err);16571653 
}16581658-16591659- mutex_unlock(&conn->chan_lock);16601654}1661165516621656static void l2cap_info_timeout(struct work_struct *work)···16631665 conn->info_state |= L2CAP_INFO_FEAT_MASK_REQ_DONE;16641666 conn->info_ident = 0;1665166716681668+ mutex_lock(&conn->lock);16661669 l2cap_conn_start(conn);16701670+ mutex_unlock(&conn->lock);16671671}1668167216691673/*···1757175717581758 BT_DBG("hcon %p conn %p, err %d", hcon, conn, err);1759175917601760+ mutex_lock(&conn->lock);17611761+17601762 kfree_skb(conn->rx_skb);1761176317621764 skb_queue_purge(&conn->pending_rx);···17771775 /* Force the connection to be immediately dropped */17781776 hcon->disc_timeout = 0;1779177717801780- mutex_lock(&conn->chan_lock);17811781-17821778 /* Kill channels */17831779 list_for_each_entry_safe(chan, l, &conn->chan_l, list) {17841780 l2cap_chan_hold(chan);···17901790 l2cap_chan_put(chan);17911791 }1792179217931793- mutex_unlock(&conn->chan_lock);17941794-17951795- hci_chan_del(conn->hchan);17961796-17971793 if (conn->info_state & L2CAP_INFO_FEAT_MASK_REQ_SENT)17981794 cancel_delayed_work_sync(&conn->info_timer);1799179518001800- hcon->l2cap_data = NULL;17961796+ hci_chan_del(conn->hchan);18011797 conn->hchan = NULL;17981798+17991799+ hcon->l2cap_data = NULL;18001800+ mutex_unlock(&conn->lock);18021801 l2cap_conn_put(conn);18031802}18041803···2915291629162917 BT_DBG("conn %p", conn);2917291829182918- mutex_lock(&conn->chan_lock);29192919-29202919 list_for_each_entry(chan, &conn->chan_l, list) {29212920 if (chan->chan_type != L2CAP_CHAN_RAW)29222921 continue;···29292932 if (chan->ops->recv(chan, nskb))29302933 kfree_skb(nskb);29312934 }29322932-29332933- mutex_unlock(&conn->chan_lock);29342935}2935293629362937/* ---- L2CAP signalling commands ---- */···39473952 goto response;39483953 }3949395439503950- mutex_lock(&conn->chan_lock);39513955 l2cap_chan_lock(pchan);3952395639533957 /* Check if the ACL is secure enough (if not SDP) */···40534059 }4054406040554061 
l2cap_chan_unlock(pchan);40564056- mutex_unlock(&conn->chan_lock);40574062 l2cap_chan_put(pchan);40584063}40594064···40914098 BT_DBG("dcid 0x%4.4x scid 0x%4.4x result 0x%2.2x status 0x%2.2x",40924099 dcid, scid, result, status);4093410040944094- mutex_lock(&conn->chan_lock);40954095-40964101 if (scid) {40974102 chan = __l2cap_get_chan_by_scid(conn, scid);40984098- if (!chan) {40994099- err = -EBADSLT;41004100- goto unlock;41014101- }41034103+ if (!chan)41044104+ return -EBADSLT;41024105 } else {41034106 chan = __l2cap_get_chan_by_ident(conn, cmd->ident);41044104- if (!chan) {41054105- err = -EBADSLT;41064106- goto unlock;41074107- }41074107+ if (!chan)41084108+ return -EBADSLT;41084109 }4109411041104111 chan = l2cap_chan_hold_unless_zero(chan);41114111- if (!chan) {41124112- err = -EBADSLT;41134113- goto unlock;41144114- }41124112+ if (!chan)41134113+ return -EBADSLT;4115411441164115 err = 0;41174116···4140415541414156 l2cap_chan_unlock(chan);41424157 l2cap_chan_put(chan);41434143-41444144-unlock:41454145- mutex_unlock(&conn->chan_lock);4146415841474159 return err;41484160}···4428444644294447 chan->ops->set_shutdown(chan);4430444844314431- l2cap_chan_unlock(chan);44324432- mutex_lock(&conn->chan_lock);44334433- l2cap_chan_lock(chan);44344449 l2cap_chan_del(chan, ECONNRESET);44354435- mutex_unlock(&conn->chan_lock);4436445044374451 chan->ops->close(chan);44384452···44654487 return 0;44664488 }4467448944684468- l2cap_chan_unlock(chan);44694469- mutex_lock(&conn->chan_lock);44704470- l2cap_chan_lock(chan);44714490 l2cap_chan_del(chan, 0);44724472- mutex_unlock(&conn->chan_lock);4473449144744492 chan->ops->close(chan);44754493···46634689 BT_DBG("dcid 0x%4.4x mtu %u mps %u credits %u result 0x%2.2x",46644690 dcid, mtu, mps, credits, result);4665469146664666- mutex_lock(&conn->chan_lock);46674667-46684692 chan = __l2cap_get_chan_by_ident(conn, cmd->ident);46694669- if (!chan) {46704670- err = -EBADSLT;46714671- goto unlock;46724672- }46934693+ if (!chan)46944694+ return 
-EBADSLT;4673469546744696 err = 0;46754697···47124742 }4713474347144744 l2cap_chan_unlock(chan);47154715-47164716-unlock:47174717- mutex_unlock(&conn->chan_lock);4718474547194746 return err;47204747}···48244857 goto response;48254858 }4826485948274827- mutex_lock(&conn->chan_lock);48284860 l2cap_chan_lock(pchan);4829486148304862 if (!smp_sufficient_security(conn->hcon, pchan->sec_level,···4889492348904924response_unlock:48914925 l2cap_chan_unlock(pchan);48924892- mutex_unlock(&conn->chan_lock);48934926 l2cap_chan_put(pchan);4894492748954928 if (result == L2CAP_CR_PEND)···50225057 goto response;50235058 }5024505950255025- mutex_lock(&conn->chan_lock);50265060 l2cap_chan_lock(pchan);5027506150285062 if (!smp_sufficient_security(conn->hcon, pchan->sec_level,···5096513250975133unlock:50985134 l2cap_chan_unlock(pchan);50995099- mutex_unlock(&conn->chan_lock);51005135 l2cap_chan_put(pchan);5101513651025137response:···5131516851325169 BT_DBG("mtu %u mps %u credits %u result 0x%4.4x", mtu, mps, credits,51335170 result);51345134-51355135- mutex_lock(&conn->chan_lock);5136517151375172 cmd_len -= sizeof(*rsp);51385173···5216525552175256 l2cap_chan_unlock(chan);52185257 }52195219-52205220- mutex_unlock(&conn->chan_lock);5221525852225259 return err;52235260}···53295370 if (cmd_len < sizeof(*rej))53305371 return -EPROTO;5331537253325332- mutex_lock(&conn->chan_lock);53335333-53345373 chan = __l2cap_get_chan_by_ident(conn, cmd->ident);53355374 if (!chan)53365375 goto done;···53435386 l2cap_chan_put(chan);5344538753455388done:53465346- mutex_unlock(&conn->chan_lock);53475389 return 0;53485390}53495391···6797684167986842 BT_DBG("");6799684368446844+ mutex_lock(&conn->lock);68456845+68006846 while ((skb = skb_dequeue(&conn->pending_rx)))68016847 l2cap_recv_frame(conn, skb);68486848+68496849+ mutex_unlock(&conn->lock);68026850}6803685168046852static struct l2cap_conn *l2cap_conn_add(struct hci_conn *hcon)···68416881 conn->local_fixed_chan |= L2CAP_FC_SMP_BREDR;6842688268436883 
mutex_init(&conn->ident_lock);68446844- mutex_init(&conn->chan_lock);68846884+ mutex_init(&conn->lock);6845688568466886 INIT_LIST_HEAD(&conn->chan_l);68476887 INIT_LIST_HEAD(&conn->users);···70327072 }70337073 }7034707470357035- mutex_lock(&conn->chan_lock);70757075+ mutex_lock(&conn->lock);70367076 l2cap_chan_lock(chan);7037707770387078 if (cid && __l2cap_get_chan_by_dcid(conn, cid)) {···7073711370747114chan_unlock:70757115 l2cap_chan_unlock(chan);70767076- mutex_unlock(&conn->chan_lock);71167116+ mutex_unlock(&conn->lock);70777117done:70787118 hci_dev_unlock(hdev);70797119 hci_dev_put(hdev);···7285732572867326 BT_DBG("conn %p status 0x%2.2x encrypt %u", conn, status, encrypt);7287732772887288- mutex_lock(&conn->chan_lock);73287328+ mutex_lock(&conn->lock);7289732972907330 list_for_each_entry(chan, &conn->chan_l, list) {72917331 l2cap_chan_lock(chan);···73597399 l2cap_chan_unlock(chan);73607400 }7361740173627362- mutex_unlock(&conn->chan_lock);74027402+ mutex_unlock(&conn->lock);73637403}7364740473657405/* Append fragment into frame respecting the maximum len of rx_skb */···74267466 conn->rx_len = 0;74277467}7428746874697469+struct l2cap_conn *l2cap_conn_hold_unless_zero(struct l2cap_conn *c)74707470+{74717471+ if (!c)74727472+ return NULL;74737473+74747474+ BT_DBG("conn %p orig refcnt %u", c, kref_read(&c->ref));74757475+74767476+ if (!kref_get_unless_zero(&c->ref))74777477+ return NULL;74787478+74797479+ return c;74807480+}74817481+74297482void l2cap_recv_acldata(struct hci_conn *hcon, struct sk_buff *skb, u16 flags)74307483{74317431- struct l2cap_conn *conn = hcon->l2cap_data;74847484+ struct l2cap_conn *conn;74327485 int len;74867486+74877487+ /* Lock hdev to access l2cap_data to avoid race with l2cap_conn_del */74887488+ hci_dev_lock(hcon->hdev);74897489+74907490+ conn = hcon->l2cap_data;7433749174347492 if (!conn)74357493 conn = l2cap_conn_add(hcon);7436749474377437- if (!conn)74387438- goto drop;74957495+ conn = 
l2cap_conn_hold_unless_zero(conn);74967496+74977497+ hci_dev_unlock(hcon->hdev);74987498+74997499+ if (!conn) {75007500+ kfree_skb(skb);75017501+ return;75027502+ }7439750374407504 BT_DBG("conn %p len %u flags 0x%x", conn, skb->len, flags);75057505+75067506+ mutex_lock(&conn->lock);7441750774427508 switch (flags) {74437509 case ACL_START:···74897503 if (len == skb->len) {74907504 /* Complete frame received */74917505 l2cap_recv_frame(conn, skb);74927492- return;75067506+ goto unlock;74937507 }7494750874957509 BT_DBG("Start: total len %d, frag len %u", len, skb->len);···7553756775547568drop:75557569 kfree_skb(skb);75707570+unlock:75717571+ mutex_unlock(&conn->lock);75727572+ l2cap_conn_put(conn);75567573}7557757475587575static struct hci_cb l2cap_cb = {
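The new `l2cap_conn_hold_unless_zero()` is built on `kref_get_unless_zero()`: take a reference only when the count has not already hit zero, so a connection that is mid-teardown is never revived. A userspace sketch of that compare-exchange loop using C11 atomics (struct and names illustrative):

```c
#include <stdatomic.h>
#include <stddef.h>

struct conn { atomic_int ref; };

/* Return c with an extra reference, or NULL if c is NULL or dying. */
static struct conn *conn_hold_unless_zero(struct conn *c)
{
	int old;

	if (!c)
		return NULL;

	old = atomic_load(&c->ref);
	do {
		if (old == 0)
			return NULL;	/* already on its way to being freed */
	} while (!atomic_compare_exchange_weak(&c->ref, &old, old + 1));

	return c;
}
```

On failure `atomic_compare_exchange_weak()` reloads `old`, so the zero check is re-evaluated on every retry.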
+7-8
net/bluetooth/l2cap_sock.c
···13261326 /* prevent sk structure from being freed whilst unlocked */13271327 sock_hold(sk);1328132813291329- chan = l2cap_pi(sk)->chan;13301329 /* prevent chan structure from being freed whilst unlocked */13311331- l2cap_chan_hold(chan);13301330+ chan = l2cap_chan_hold_unless_zero(l2cap_pi(sk)->chan);13311331+ if (!chan)13321332+ goto shutdown_already;1332133313331334 BT_DBG("chan %p state %s", chan, state_to_string(chan->state));13341335···13591358 release_sock(sk);1360135913611360 l2cap_chan_lock(chan);13621362- conn = chan->conn;13631363- if (conn)13641364- /* prevent conn structure from being freed */13651365- l2cap_conn_get(conn);13611361+ /* prevent conn structure from being freed */13621362+ conn = l2cap_conn_hold_unless_zero(chan->conn);13661363 l2cap_chan_unlock(chan);1367136413681365 if (conn)13691366 /* mutex lock must be taken before l2cap_chan_lock() */13701370- mutex_lock(&conn->chan_lock);13671367+ mutex_lock(&conn->lock);1371136813721369 l2cap_chan_lock(chan);13731370 l2cap_chan_close(chan, 0);13741371 l2cap_chan_unlock(chan);1375137213761373 if (conn) {13771377- mutex_unlock(&conn->chan_lock);13741374+ mutex_unlock(&conn->lock);13781375 l2cap_conn_put(conn);13791376 }13801377
···993993 return rc;994994995995 /* Nonzero ring with RSS only makes sense if NIC adds them together */996996- if (cmd == ETHTOOL_SRXCLSRLINS && info.flow_type & FLOW_RSS &&996996+ if (cmd == ETHTOOL_SRXCLSRLINS && info.fs.flow_type & FLOW_RSS &&997997 !ops->cap_rss_rxnfc_adds &&998998 ethtool_get_flow_spec_ring(info.fs.ring_cookie))999999 return -EINVAL;
···418418{419419 int hlen = LL_RESERVED_SPACE(dev);420420 int tlen = dev->needed_tailroom;421421- struct sock *sk = dev_net(dev)->ipv6.ndisc_sk;422421 struct sk_buff *skb;423422424423 skb = alloc_skb(hlen + sizeof(struct ipv6hdr) + len + tlen, GFP_ATOMIC);425425- if (!skb) {426426- ND_PRINTK(0, err, "ndisc: %s failed to allocate an skb\n",427427- __func__);424424+ if (!skb)428425 return NULL;429429- }430426431427 skb->protocol = htons(ETH_P_IPV6);432428 skb->dev = dev;···433437 /* Manually assign socket ownership as we avoid calling434438 * sock_alloc_send_pskb() to bypass wmem buffer limits435439 */436436- skb_set_owner_w(skb, sk);440440+ rcu_read_lock();441441+ skb_set_owner_w(skb, dev_net_rcu(dev)->ipv6.ndisc_sk);442442+ rcu_read_unlock();437443438444 return skb;439445}···471473void ndisc_send_skb(struct sk_buff *skb, const struct in6_addr *daddr,472474 const struct in6_addr *saddr)473475{474474- struct dst_entry *dst = skb_dst(skb);475475- struct net *net = dev_net(skb->dev);476476- struct sock *sk = net->ipv6.ndisc_sk;477477- struct inet6_dev *idev;478478- int err;479476 struct icmp6hdr *icmp6h = icmp6_hdr(skb);477477+ struct dst_entry *dst = skb_dst(skb);478478+ struct inet6_dev *idev;479479+ struct net *net;480480+ struct sock *sk;481481+ int err;480482 u8 type;481483482484 type = icmp6h->icmp6_type;483485486486+ rcu_read_lock();487487+488488+ net = dev_net_rcu(skb->dev);489489+ sk = net->ipv6.ndisc_sk;484490 if (!dst) {485491 struct flowi6 fl6;486492 int oif = skb->dev->ifindex;···492490 icmpv6_flow_init(sk, &fl6, type, saddr, daddr, oif);493491 dst = icmp6_dst_alloc(skb->dev, &fl6);494492 if (IS_ERR(dst)) {493493+ rcu_read_unlock();495494 kfree_skb(skb);496495 return;497496 }···507504508505 ip6_nd_hdr(skb, saddr, daddr, READ_ONCE(inet6_sk(sk)->hop_limit), skb->len);509506510510- rcu_read_lock();511507 idev = __in6_dev_get(dst->dev);512508 IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTREQUESTS);513509···16961694 bool ret;1697169516981696 if 
(netif_is_l3_master(skb->dev)) {16991699- dev = __dev_get_by_index(dev_net(skb->dev), IPCB(skb)->iif);16971697+ dev = dev_get_by_index_rcu(dev_net(skb->dev), IPCB(skb)->iif);17001698 if (!dev)17011699 return;17021700 }
+6-1
net/ipv6/route.c
···31963196{31973197 struct net_device *dev = dst->dev;31983198 unsigned int mtu = dst_mtu(dst);31993199- struct net *net = dev_net(dev);31993199+ struct net *net;3200320032013201 mtu -= sizeof(struct ipv6hdr) + sizeof(struct tcphdr);3202320232033203+ rcu_read_lock();32043204+32053205+ net = dev_net_rcu(dev);32033206 if (mtu < net->ipv6.sysctl.ip6_rt_min_advmss)32043207 mtu = net->ipv6.sysctl.ip6_rt_min_advmss;32083208+32093209+ rcu_read_unlock();3205321032063211 /*32073212 * Maximal non-jumbo IPv6 payload is IPV6_MAXPLEN and
+10-5
net/ipv6/rpl_iptunnel.c
···232232 dst = ip6_route_output(net, NULL, &fl6);233233 if (dst->error) {234234 err = dst->error;235235- dst_release(dst);236235 goto drop;237236 }238237239239- local_bh_disable();240240- dst_cache_set_ip6(&rlwt->cache, dst, &fl6.saddr);241241- local_bh_enable();238238+ /* cache only if we don't create a dst reference loop */239239+ if (orig_dst->lwtstate != dst->lwtstate) {240240+ local_bh_disable();241241+ dst_cache_set_ip6(&rlwt->cache, dst, &fl6.saddr);242242+ local_bh_enable();243243+ }242244243245 err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev));244246 if (unlikely(err))···253251 return dst_output(net, sk, skb);254252255253drop:254254+ dst_release(dst);256255 kfree_skb(skb);257256 return err;258257}···272269 local_bh_enable();273270274271 err = rpl_do_srh(skb, rlwt, dst);275275- if (unlikely(err))272272+ if (unlikely(err)) {273273+ dst_release(dst);276274 goto drop;275275+ }277276278277 if (!dst) {279278 ip6_route_input(skb);
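The rpl_iptunnel.c guard (mirrored in seg6_iptunnel.c below) caches the freshly looked-up route only when its `lwtstate` differs from the input route's; otherwise the cache would hold a reference on a dst that indirectly holds the cache, leaking both. A minimal sketch of that identity check before caching (types illustrative):

```c
#include <stddef.h>

struct lwtstate { int id; };
struct dst { struct lwtstate *lwt; int refs; };

static struct dst *cached;

static void cache_dst(struct dst *orig, struct dst *fresh)
{
	if (orig->lwt == fresh->lwt)
		return;		/* caching would create a dst -> cache -> dst loop */
	fresh->refs++;		/* reference now owned by the cache */
	cached = fresh;
}
```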
+10-5
net/ipv6/seg6_iptunnel.c
···482482 local_bh_enable();483483484484 err = seg6_do_srh(skb, dst);485485- if (unlikely(err))485485+ if (unlikely(err)) {486486+ dst_release(dst);486487 goto drop;488488+ }487489488490 if (!dst) {489491 ip6_route_input(skb);···573571 dst = ip6_route_output(net, NULL, &fl6);574572 if (dst->error) {575573 err = dst->error;576576- dst_release(dst);577574 goto drop;578575 }579576580580- local_bh_disable();581581- dst_cache_set_ip6(&slwt->cache, dst, &fl6.saddr);582582- local_bh_enable();577577+ /* cache only if we don't create a dst reference loop */578578+ if (orig_dst->lwtstate != dst->lwtstate) {579579+ local_bh_disable();580580+ dst_cache_set_ip6(&slwt->cache, dst, &fl6.saddr);581581+ local_bh_enable();582582+ }583583584584 err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev));585585 if (unlikely(err))···597593598594 return dst_output(net, sk, skb);599595drop:596596+ dst_release(dst);600597 kfree_skb(skb);601598 return err;602599}
+2-2
net/ipv6/udp.c
···13891389 const int hlen = skb_network_header_len(skb) +13901390 sizeof(struct udphdr);1391139113921392- if (hlen + cork->gso_size > cork->fragsize) {13921392+ if (hlen + min(datalen, cork->gso_size) > cork->fragsize) {13931393 kfree_skb(skb);13941394- return -EINVAL;13941394+ return -EMSGSIZE;13951395 }13961396 if (datalen > cork->gso_size * UDP_MAX_SEGMENTS) {13971397 kfree_skb(skb);
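The udp.c fix compares `fragsize` against the first segment that will actually be built, `min(datalen, gso_size)`, instead of `gso_size` alone, and reports the failure as `-EMSGSIZE` rather than `-EINVAL`. A sketch of the corrected check with illustrative values:

```c
#include <errno.h>

/* Only the first segment - at most gso_size, but never more than the
 * payload - has to fit in the path-MTU-derived fragsize. */
static int check_gso(int hlen, int datalen, int gso_size, int fragsize)
{
	int seg = datalen < gso_size ? datalen : gso_size;

	if (hlen + seg > fragsize)
		return -EMSGSIZE;
	return 0;
}
```

With the old check, a small payload sent on a socket whose `gso_size` exceeded a reduced PMTU was dropped even though it would have fit in a single packet.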
···21012101{21022102 struct ovs_header *ovs_header;21032103 struct ovs_vport_stats vport_stats;21042104+ struct net *net_vport;21042105 int err;2105210621062107 ovs_header = genlmsg_put(skb, portid, seq, &dp_vport_genl_family,···21182117 nla_put_u32(skb, OVS_VPORT_ATTR_IFINDEX, vport->dev->ifindex))21192118 goto nla_put_failure;2120211921212121- if (!net_eq(net, dev_net(vport->dev))) {21222122- int id = peernet2id_alloc(net, dev_net(vport->dev), gfp);21202120+ rcu_read_lock();21212121+ net_vport = dev_net_rcu(vport->dev);21222122+ if (!net_eq(net, net_vport)) {21232123+ int id = peernet2id_alloc(net, net_vport, GFP_ATOMIC);2123212421242125 if (nla_put_s32(skb, OVS_VPORT_ATTR_NETNSID, id))21252125- goto nla_put_failure;21262126+ goto nla_put_failure_unlock;21262127 }21282128+ rcu_read_unlock();2127212921282130 ovs_vport_get_stats(vport, &vport_stats);21292131 if (nla_put_64bit(skb, OVS_VPORT_ATTR_STATS,···21472143 genlmsg_end(skb, ovs_header);21482144 return 0;2149214521462146+nla_put_failure_unlock:21472147+ rcu_read_unlock();21502148nla_put_failure:21512149 err = -EMSGSIZE;21522150error:
+16-8
net/rose/af_rose.c
···701701 struct net_device *dev;702702 ax25_address *source;703703 ax25_uid_assoc *user;704704+ int err = -EINVAL;704705 int n;705705-706706- if (!sock_flag(sk, SOCK_ZAPPED))707707- return -EINVAL;708706709707 if (addr_len != sizeof(struct sockaddr_rose) && addr_len != sizeof(struct full_sockaddr_rose))710708 return -EINVAL;···716718 if ((unsigned int) addr->srose_ndigis > ROSE_MAX_DIGIS)717719 return -EINVAL;718720719719- if ((dev = rose_dev_get(&addr->srose_addr)) == NULL)720720- return -EADDRNOTAVAIL;721721+ lock_sock(sk);722722+723723+ if (!sock_flag(sk, SOCK_ZAPPED))724724+ goto out_release;725725+726726+ err = -EADDRNOTAVAIL;727727+ dev = rose_dev_get(&addr->srose_addr);728728+ if (!dev)729729+ goto out_release;721730722731 source = &addr->srose_call;723732···735730 } else {736731 if (ax25_uid_policy && !capable(CAP_NET_BIND_SERVICE)) {737732 dev_put(dev);738738- return -EACCES;733733+ err = -EACCES;734734+ goto out_release;739735 }740736 rose->source_call = *source;741737 }···759753 rose_insert_socket(sk);760754761755 sock_reset_flag(sk, SOCK_ZAPPED);762762-763763- return 0;756756+ err = 0;757757+out_release:758758+ release_sock(sk);759759+ return err;764760}765761766762static int rose_connect(struct socket *sock, struct sockaddr *uaddr, int addr_len, int flags)
+4-5
net/rxrpc/ar-internal.h
···327327 * packet with a maximum set of jumbo subpackets or a PING ACK padded328328 * out to 64K with zeropages for PMTUD.329329 */330330- struct kvec kvec[RXRPC_MAX_NR_JUMBO > 3 + 16 ?331331- RXRPC_MAX_NR_JUMBO : 3 + 16];330330+ struct kvec kvec[1 + RXRPC_MAX_NR_JUMBO > 3 + 16 ?331331+ 1 + RXRPC_MAX_NR_JUMBO : 3 + 16];332332};333333334334/*···582582 RXRPC_CALL_EXCLUSIVE, /* The call uses a once-only connection */583583 RXRPC_CALL_RX_IS_IDLE, /* recvmsg() is idle - send an ACK */584584 RXRPC_CALL_RECVMSG_READ_ALL, /* recvmsg() read all of the received data */585585+ RXRPC_CALL_CONN_CHALLENGING, /* The connection is being challenged */585586};586587587588/*···603602 RXRPC_CALL_CLIENT_AWAIT_REPLY, /* - client awaiting reply */604603 RXRPC_CALL_CLIENT_RECV_REPLY, /* - client receiving reply phase */605604 RXRPC_CALL_SERVER_PREALLOC, /* - service preallocation */606606- RXRPC_CALL_SERVER_SECURING, /* - server securing request connection */607605 RXRPC_CALL_SERVER_RECV_REQUEST, /* - server receiving request */608606 RXRPC_CALL_SERVER_ACK_REQUEST, /* - server pending ACK of request */609607 RXRPC_CALL_SERVER_SEND_REPLY, /* - server sending reply */···874874#define RXRPC_TXBUF_RESENT 0x100 /* Set if has been resent */875875 __be16 cksum; /* Checksum to go in header */876876 bool jumboable; /* Can be non-terminal jumbo subpacket */877877- u8 nr_kvec; /* Amount of kvec[] used */878878- struct kvec kvec[1];877877+ void *data; /* Data with preceding jumbo header */879878};880879881880static inline bool rxrpc_sending_to_server(const struct rxrpc_txbuf *txb)
···448448 struct rxrpc_skb_priv *sp = rxrpc_skb(skb);449449 bool last = sp->hdr.flags & RXRPC_LAST_PACKET;450450451451- skb_queue_tail(&call->recvmsg_queue, skb);451451+ spin_lock_irq(&call->recvmsg_queue.lock);452452+453453+ __skb_queue_tail(&call->recvmsg_queue, skb);452454 rxrpc_input_update_ack_window(call, window, wtop);453455 trace_rxrpc_receive(call, last ? why + 1 : why, sp->hdr.serial, sp->hdr.seq);454456 if (last)457457+ /* Change the state inside the lock so that recvmsg syncs458458+ * correctly with it and using sendmsg() to send a reply459459+ * doesn't race.460460+ */455461 rxrpc_end_rx_phase(call, sp->hdr.serial);462462+463463+ spin_unlock_irq(&call->recvmsg_queue.lock);456464}457465458466/*···665657 rxrpc_propose_delay_ACK(call, sp->hdr.serial,666658 rxrpc_propose_ack_input_data);667659 }668668- if (notify) {660660+ if (notify && !test_bit(RXRPC_CALL_CONN_CHALLENGING, &call->flags)) {669661 trace_rxrpc_notify_socket(call->debug_id, sp->hdr.serial);670662 rxrpc_notify_socket(call);671663 }
+35-15
net/rxrpc/output.c
···428428static size_t rxrpc_prepare_data_subpacket(struct rxrpc_call *call,429429 struct rxrpc_send_data_req *req,430430 struct rxrpc_txbuf *txb,431431+ struct rxrpc_wire_header *whdr,431432 rxrpc_serial_t serial, int subpkt)432433{433433- struct rxrpc_wire_header *whdr = txb->kvec[0].iov_base;434434- struct rxrpc_jumbo_header *jumbo = (void *)(whdr + 1) - sizeof(*jumbo);434434+ struct rxrpc_jumbo_header *jumbo = txb->data - sizeof(*jumbo);435435 enum rxrpc_req_ack_trace why;436436 struct rxrpc_connection *conn = call->conn;437437- struct kvec *kv = &call->local->kvec[subpkt];437437+ struct kvec *kv = &call->local->kvec[1 + subpkt];438438 size_t len = txb->pkt_len;439439 bool last;440440 u8 flags;···491491 }492492dont_set_request_ack:493493494494- /* The jumbo header overlays the wire header in the txbuf. */494494+ /* There's a jumbo header prepended to the data if we need it. */495495 if (subpkt < req->n - 1)496496 flags |= RXRPC_JUMBO_PACKET;497497 else498498 flags &= ~RXRPC_JUMBO_PACKET;499499 if (subpkt == 0) {500500 whdr->flags = flags;501501- whdr->serial = htonl(txb->serial);502501 whdr->cksum = txb->cksum;503503- whdr->serviceId = htons(conn->service_id);504504- kv->iov_base = whdr;505505- len += sizeof(*whdr);502502+ kv->iov_base = txb->data;506503 } else {507504 jumbo->flags = flags;508505 jumbo->pad = 0;···532535/*533536 * Prepare a (jumbo) packet for transmission.534537 */535535-static size_t rxrpc_prepare_data_packet(struct rxrpc_call *call, struct rxrpc_send_data_req *req)538538+static size_t rxrpc_prepare_data_packet(struct rxrpc_call *call,539539+ struct rxrpc_send_data_req *req,540540+ struct rxrpc_wire_header *whdr)536541{537542 struct rxrpc_txqueue *tq = req->tq;538543 rxrpc_serial_t serial;···547548548549 /* Each transmission of a Tx packet needs a new serial number */549550 serial = rxrpc_get_next_serials(call->conn, req->n);551551+552552+ whdr->epoch = htonl(call->conn->proto.epoch);553553+ whdr->cid = htonl(call->cid);554554+ whdr->callNumber = htonl(call->call_id);555555+ whdr->seq = htonl(seq);556556+ whdr->serial = htonl(serial);557557+ whdr->type = RXRPC_PACKET_TYPE_DATA;558558+ whdr->flags = 0;559559+ whdr->userStatus = 0;560560+ whdr->securityIndex = call->security_ix;561561+ whdr->_rsvd = 0;562562+ whdr->serviceId = htons(call->conn->service_id);550563551564 call->tx_last_serial = serial + req->n - 1;552565 call->tx_last_sent = req->now;···587576 if (i + 1 == req->n)588577 /* Only sample the last subpacket in a jumbo. */589578 __set_bit(ix, &tq->rtt_samples);590590- len += rxrpc_prepare_data_subpacket(call, req, txb, serial, i);579579+ len += rxrpc_prepare_data_subpacket(call, req, txb, whdr, serial, i);591580 serial++;592581 seq++;593582 i++;···629618 }630619631620 rxrpc_set_keepalive(call, req->now);621621+ page_frag_free(whdr);632622 return len;633623}634624···638626 */639627void rxrpc_send_data_packet(struct rxrpc_call *call, struct rxrpc_send_data_req *req)640628{629629+ struct rxrpc_wire_header *whdr;641630 struct rxrpc_connection *conn = call->conn;642631 enum rxrpc_tx_point frag;643632 struct rxrpc_txqueue *tq = req->tq;644633 struct rxrpc_txbuf *txb;645634 struct msghdr msg;646635 rxrpc_seq_t seq = req->seq;647647- size_t len;636636+ size_t len = sizeof(*whdr);648637 bool new_call = test_bit(RXRPC_CALL_BEGAN_RX_TIMER, &call->flags);649638 int ret, stat_ix;650639651640 _enter("%x,%x-%x", tq->qbase, seq, seq + req->n - 1);652641642642+ whdr = page_frag_alloc(&call->local->tx_alloc, sizeof(*whdr), GFP_NOFS);643643+ if (!whdr)644644+ return; /* Drop the packet if no memory. */645645+646646+ call->local->kvec[0].iov_base = whdr;647647+ call->local->kvec[0].iov_len = sizeof(*whdr);648648+653649 stat_ix = umin(req->n, ARRAY_SIZE(call->rxnet->stat_tx_jumbo)) - 1;654650 atomic_inc(&call->rxnet->stat_tx_jumbo[stat_ix]);655651656656- len = rxrpc_prepare_data_packet(call, req);652652+ len += rxrpc_prepare_data_packet(call, req, whdr);657653 txb = tq->bufs[seq & RXRPC_TXQ_MASK];658654659659- iov_iter_kvec(&msg.msg_iter, WRITE, call->local->kvec, req->n, len);655655+ iov_iter_kvec(&msg.msg_iter, WRITE, call->local->kvec, 1 + req->n, len);660656661657 msg.msg_name = &call->peer->srx.transport;662658 msg.msg_namelen = call->peer->srx.transport_len;···715695716696 if (ret == -EMSGSIZE) {717697 rxrpc_inc_stat(call->rxnet, stat_tx_data_send_msgsize);718718- trace_rxrpc_tx_packet(call->debug_id, call->local->kvec[0].iov_base, frag);698698+ trace_rxrpc_tx_packet(call->debug_id, whdr, frag);719699 ret = 0;720700 } else if (ret < 0) {721701 rxrpc_inc_stat(call->rxnet, stat_tx_data_send_fail);722702 trace_rxrpc_tx_fail(call->debug_id, txb->serial, ret, frag);723703 } else {724724- trace_rxrpc_tx_packet(call->debug_id, call->local->kvec[0].iov_base, frag);704704+ trace_rxrpc_tx_packet(call->debug_id, whdr, frag);725705 }726706727707 rxrpc_tx_backoff(call, ret);
···479479 sock->file = file;480480 file->private_data = sock;481481 stream_open(SOCK_INODE(sock), file);482482+ /*483483+ * Disable permission and pre-content events, but enable legacy484484+ * inotify events for legacy users.485485+ */486486+ file_set_fsnotify_mode(file, FMODE_NONOTIFY_PERM);482487 return file;483488}484489EXPORT_SYMBOL(sock_alloc_file);
+7-1
net/vmw_vsock/af_vsock.c
···824824 */825825 lock_sock_nested(sk, level);826826827827- sock_orphan(sk);827827+ /* Indicate to vsock_remove_sock() that the socket is being released and828828+ * can be removed from the bound_table. Unlike transport reassignment829829+ * case, where the socket must remain bound despite vsock_remove_sock()830830+ * being called from the transport release() callback.831831+ */832832+ sock_set_flag(sk, SOCK_DEAD);828833829834 if (vsk->transport)830835 vsk->transport->release(vsk);831836 else if (sock_type_connectible(sk->sk_type))832837 vsock_remove_sock(vsk);833838839839+ sock_orphan(sk);834840 sk->sk_shutdown = SHUTDOWN_MASK;835841836842 skb_queue_purge(&sk->sk_receive_queue);
···11+// SPDX-License-Identifier: GPL-2.0-only22+33+//! Abstractions for the faux bus.44+//!55+//! This module provides bindings for working with faux devices in kernel modules.66+//!77+//! C header: [`include/linux/device/faux.h`]88+99+use crate::{bindings, device, error::code::*, prelude::*};1010+use core::ptr::{addr_of_mut, null, null_mut, NonNull};1111+1212+/// The registration of a faux device.1313+///1414+/// This type represents the registration of a [`struct faux_device`]. When an instance of this type1515+/// is dropped, its respective faux device will be unregistered from the system.1616+///1717+/// # Invariants1818+///1919+/// `self.0` always holds a valid pointer to an initialized and registered [`struct faux_device`].2020+///2121+/// [`struct faux_device`]: srctree/include/linux/device/faux.h2222+#[repr(transparent)]2323+pub struct Registration(NonNull<bindings::faux_device>);2424+2525+impl Registration {2626+ /// Create and register a new faux device with the given name.2727+ pub fn new(name: &CStr) -> Result<Self> {2828+ // SAFETY:2929+ // - `name` is copied by this function into its own storage3030+ // - `faux_ops` is safe to leave NULL according to the C API3131+ let dev = unsafe { bindings::faux_device_create(name.as_char_ptr(), null_mut(), null()) };3232+3333+ // The above function will return either a valid device, or NULL on failure3434+ // INVARIANT: The device will remain registered until faux_device_destroy() is called, which3535+ // happens in our Drop implementation.3636+ Ok(Self(NonNull::new(dev).ok_or(ENODEV)?))3737+ }3838+3939+ fn as_raw(&self) -> *mut bindings::faux_device {4040+ self.0.as_ptr()4141+ }4242+}4343+4444+impl AsRef<device::Device> for Registration {4545+ fn as_ref(&self) -> &device::Device {4646+ // SAFETY: The underlying `device` in `faux_device` is guaranteed by the C API to be4747+ // a valid initialized `device`.4848+ unsafe { device::Device::as_ref(addr_of_mut!((*self.as_raw()).dev)) }4949+ }5050+}5151+5252+impl Drop for Registration {5353+ fn drop(&mut self) {5454+ // SAFETY: `self.0` is a valid registered faux_device via our type invariants.5555+ unsafe { bindings::faux_device_destroy(self.as_raw()) }5656+ }5757+}5858+5959+// SAFETY: The faux device API is thread-safe as guaranteed by the device core, as long as6060+// faux_device_destroy() is guaranteed to only be called once - which is guaranteed by our type not6161+// having Copy/Clone.6262+unsafe impl Send for Registration {}6363+6464+// SAFETY: The faux device API is thread-safe as guaranteed by the device core, as long as6565+// faux_device_destroy() is guaranteed to only be called once - which is guaranteed by our type not6666+// having Copy/Clone.6767+unsafe impl Sync for Registration {}
···4646pub mod devres;4747pub mod driver;4848pub mod error;4949+pub mod faux;4950#[cfg(CONFIG_RUST_FW_LOADER_ABSTRACTIONS)]5051pub mod firmware;5152pub mod fs;
+1-1
rust/kernel/rbtree.rs
···11491149/// # Invariants11501150/// - `parent` may be null if the new node becomes the root.11511151/// - `child_field_of_parent` is a valid pointer to the left-child or right-child of `parent`. If `parent` is11521152-/// null, it is a pointer to the root of the [`RBTree`].11521152+/// null, it is a pointer to the root of the [`RBTree`].11531153struct RawVacantEntry<'a, K, V> {11541154 rbtree: *mut RBTree<K, V>,11551155 /// The node that will become the parent of the new node if we insert one.
···61616262 If unsure, say N.63636464+config SAMPLE_RUST_DRIVER_FAUX6565+ tristate "Faux Driver"6666+ help6767+ This option builds the Rust Faux driver sample.6868+6969+ To compile this as a module, choose M here:7070+ the module will be called rust_driver_faux.7171+7272+ If unsure, say N.7373+6474config SAMPLE_RUST_HOSTPROGS6575 bool "Host programs"6676 help
···3131ifdef CONFIG_CC_IS_CLANG3232# The kernel builds with '-std=gnu11' so use of GNU extensions is acceptable.3333KBUILD_CFLAGS += -Wno-gnu3434+3535+# Clang checks for overflow/truncation with '%p', while GCC does not:3636+# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=1112193737+KBUILD_CFLAGS += $(call cc-disable-warning, format-overflow-non-kprintf)3838+KBUILD_CFLAGS += $(call cc-disable-warning, format-truncation-non-kprintf)3439else35403641# gcc inanely warns about local variables called 'main'···110105KBUILD_CFLAGS += $(call cc-disable-warning, format-overflow)111106ifdef CONFIG_CC_IS_GCC112107KBUILD_CFLAGS += $(call cc-disable-warning, format-truncation)113113-else114114-# Clang checks for overflow/truncation with '%p', while GCC does not:115115-# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111219116116-KBUILD_CFLAGS += $(call cc-disable-warning, format-overflow-non-kprintf)117117-KBUILD_CFLAGS += $(call cc-disable-warning, format-truncation-non-kprintf)118108endif119109KBUILD_CFLAGS += $(call cc-disable-warning, stringop-truncation)120110···133133KBUILD_CFLAGS += -Wno-tautological-constant-out-of-range-compare134134KBUILD_CFLAGS += $(call cc-disable-warning, unaligned-access)135135KBUILD_CFLAGS += -Wno-enum-compare-conditional136136-KBUILD_CFLAGS += -Wno-enum-enum-conversion137136endif138137139138endif···155156KBUILD_CFLAGS += -Wno-missing-field-initializers156157KBUILD_CFLAGS += -Wno-type-limits157158KBUILD_CFLAGS += -Wno-shift-negative-value159159+160160+ifdef CONFIG_CC_IS_CLANG161161+KBUILD_CFLAGS += -Wno-enum-enum-conversion162162+endif158163159164ifdef CONFIG_CC_IS_GCC160165KBUILD_CFLAGS += -Wno-maybe-uninitialized
+1-1
scripts/Makefile.lib
···305305# These are shared by some Makefile.* files.306306307307ifdef CONFIG_LTO_CLANG308308-# Run $(LD) here to covert LLVM IR to ELF in the following cases:308308+# Run $(LD) here to convert LLVM IR to ELF in the following cases:309309# - when this object needs objtool processing, as objtool cannot process LLVM IR310310# - when this is a single-object module, as modpost cannot process LLVM IR311311cmd_ld_single = $(if $(objtool-enabled)$(is-single-obj-m), ; $(LD) $(ld_flags) -r -o $(tmp-target) $@; mv $(tmp-target) $@)
+18
scripts/generate_rust_target.rs
···165165 let option = "CONFIG_".to_owned() + option;166166 self.0.contains_key(&option)167167 }168168+169169+ /// Is the rustc version at least `major.minor.patch`?170170+ fn rustc_version_atleast(&self, major: u32, minor: u32, patch: u32) -> bool {171171+ let check_version = 100000 * major + 100 * minor + patch;172172+ let actual_version = self173173+ .0174174+ .get("CONFIG_RUSTC_VERSION")175175+ .unwrap()176176+ .parse::<u32>()177177+ .unwrap();178178+ check_version <= actual_version179179+ }168180}169181170182fn main() {···194182 }195183 } else if cfg.has("X86_64") {196184 ts.push("arch", "x86_64");185185+ if cfg.rustc_version_atleast(1, 86, 0) {186186+ ts.push("rustc-abi", "x86-softfloat");187187+ }197188 ts.push(198189 "data-layout",199190 "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128",···230215 panic!("32-bit x86 only works under UML");231216 }232217 ts.push("arch", "x86");218218+ if cfg.rustc_version_atleast(1, 86, 0) {219219+ ts.push("rustc-abi", "x86-softfloat");220220+ }233221 ts.push(234222 "data-layout",235223 "e-m:e-p:32:32-p270:32:32-p271:32:32-p272:64:64-i128:128-f64:32:64-f80:32-n8:16:32-S128",
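`rustc_version_atleast()` above compares versions encoded as a single integer, the same scheme the parsed `CONFIG_RUSTC_VERSION` value uses: `100000 * major + 100 * minor + patch`. A quick Python sketch of that encoding (hypothetical helper names):

```python
def encode(major, minor, patch):
    # Two decimal digits each for minor and patch,
    # so 1.86.0 encodes as 108600.
    return 100000 * major + 100 * minor + patch

def atleast(actual, major, minor, patch):
    # Mirrors rustc_version_atleast(): requested <= actual.
    return encode(major, minor, patch) <= actual

assert encode(1, 86, 0) == 108600
```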
+37-2
scripts/mod/modpost.c
···190190191191 /*192192 * Set mod->is_gpl_compatible to true by default. If MODULE_LICENSE()193193- * is missing, do not check the use for EXPORT_SYMBOL_GPL() becasue194194- * modpost will exit wiht error anyway.193193+ * is missing, do not check the use for EXPORT_SYMBOL_GPL() because194194+ * modpost will exit with an error anyway.195195 */196196 mod->is_gpl_compatible = true;197197···507507 info->modinfo_len = sechdrs[i].sh_size;508508 } else if (!strcmp(secname, ".export_symbol")) {509509 info->export_symbol_secndx = i;510510+ } else if (!strcmp(secname, ".no_trim_symbol")) {511511+ info->no_trim_symbol = (void *)hdr + sechdrs[i].sh_offset;512512+ info->no_trim_symbol_len = sechdrs[i].sh_size;510513 }511514512515 if (sechdrs[i].sh_type == SHT_SYMTAB) {···15691566 /* strip trailing .o */15701567 mod = new_module(modname, strlen(modname) - strlen(".o"));1571156815691569+ /* save .no_trim_symbol section for later use */15701570+ if (info.no_trim_symbol_len) {15711571+ mod->no_trim_symbol = xmalloc(info.no_trim_symbol_len);15721572+ memcpy(mod->no_trim_symbol, info.no_trim_symbol,15731573+ info.no_trim_symbol_len);15741574+ mod->no_trim_symbol_len = info.no_trim_symbol_len;15751575+ }15761576+15721577 if (!mod->is_vmlinux) {15731578 license = get_modinfo(&info, "license");15741579 if (!license)···17371726 }1738172717391728 free(buf);17291729+}17301730+17311731+/*17321732+ * Keep symbols recorded in the .no_trim_symbol section. This is necessary to17331733+ * prevent CONFIG_TRIM_UNUSED_KSYMS from dropping EXPORT_SYMBOL because17341734+ * symbol_get() relies on the symbol being present in the ksymtab for lookups.17351735+ */17361736+static void keep_no_trim_symbols(struct module *mod)17371737+{17381738+ unsigned long size = mod->no_trim_symbol_len;17391739+17401740+ for (char *s = mod->no_trim_symbol; s; s = next_string(s, &size)) {17411741+ struct symbol *sym;17421742+17431743+ /*17441744+ * If find_symbol() returns NULL, this symbol is not provided17451745+ * by any module, and symbol_get() will fail.17461746+ */17471747+ sym = find_symbol(s);17481748+ if (sym)17491749+ sym->used = true;17501750+ }17401751}1741175217421753static void check_modname_len(struct module *mod)···22872254 read_symbols_from_files(files_source);2288225522892256 list_for_each_entry(mod, &modules, list) {22572257+ keep_no_trim_symbols(mod);22582258+22902259 if (mod->dump_file || mod->is_vmlinux)22912260 continue;22922261
+6
scripts/mod/modpost.h
···111111 *112112 * @dump_file: path to the .symvers file if loaded from a file113113 * @aliases: list head for module_aliases114114+ * @no_trim_symbol: .no_trim_symbol section data115115+ * @no_trim_symbol_len: length of the .no_trim_symbol section114116 */115117struct module {116118 struct list_head list;···130128 // Actual imported namespaces131129 struct list_head imported_namespaces;132130 struct list_head aliases;131131+ char *no_trim_symbol;132132+ unsigned int no_trim_symbol_len;133133 char name[];134134};135135···145141 char *strtab;146142 char *modinfo;147143 unsigned int modinfo_len;144144+ char *no_trim_symbol;145145+ unsigned int no_trim_symbol_len;148146149147 /* support for 32bit section numbers */150148
···6262 #6363 # Clear VPATH and srcroot because the source files reside in the output6464 # directory.6565- # shellcheck disable=SC2016 # $(MAKE), $(CC), and $(build) will be expanded by Make6666- "${MAKE}" run-command KBUILD_RUN_COMMAND='+$(MAKE) HOSTCC=$(CC) VPATH= srcroot=. $(build)='"${destdir}"/scripts6565+ # shellcheck disable=SC2016 # $(MAKE) and $(build) will be expanded by Make6666+ "${MAKE}" run-command KBUILD_RUN_COMMAND='+$(MAKE) HOSTCC='"${CC}"' VPATH= srcroot=. $(build)='"${destdir}"/scripts67676868 rm -f "${destdir}/scripts/Kbuild"6969fi
+112-33
security/tomoyo/common.c
···19811981}1982198219831983/**19841984+ * tomoyo_numscan - sscanf() which stores the length of a decimal integer value.19851985+ *19861986+ * @str: String to scan.19871987+ * @head: Leading string that @str must start with.19881988+ * @width: Pointer to "int" for storing length of a decimal integer value after @head.19891989+ * @tail: Optional character that must match after a decimal integer value.19901990+ *19911991+ * Returns whether @str starts with @head and a decimal value follows @head.19921992+ */19931993+static bool tomoyo_numscan(const char *str, const char *head, int *width, const char tail)19941994+{19951995+ const char *cp;19961996+ const int n = strlen(head);19971997+19981998+ if (!strncmp(str, head, n)) {19991999+ cp = str + n;20002000+ while (*cp && *cp >= '0' && *cp <= '9')20012001+ cp++;20022002+ if (*cp == tail || !tail) {20032003+ *width = cp - (str + n);20042004+ return *width != 0;20052005+ }20062006+ }20072007+ *width = 0;20082008+ return false;20092009+}20102010+20112011+/**20122012+ * tomoyo_patternize_path - Make patterns for file path. Used by learning mode.20132013+ *20142014+ * @buffer: Destination buffer.20152015+ * @len: Size of @buffer.20162016+ * @entry: Original line.20172017+ *20182018+ * Returns nothing.20192019+ */20202020+static void tomoyo_patternize_path(char *buffer, const int len, char *entry)20212021+{20222022+ int width;20232023+ char *cp = entry;20242024+20252025+ /* Nothing to do if this line is not for "file" related entry. */20262026+ if (strncmp(entry, "file ", 5))20272027+ goto flush;20282028+ /*20292029+ * Nothing to do if there is no colon in this line, for this rewriting20302030+ * applies to only filesystems where numeric values in the path are volatile.20312031+ */20322032+ cp = strchr(entry + 5, ':');20332033+ if (!cp) {20342034+ cp = entry;20352035+ goto flush;20362036+ }20372037+ /* Flush e.g. "file ioctl" part. */20382038+ while (*cp != ' ')20392039+ cp--;20402040+ *cp++ = '\0';20412041+ tomoyo_addprintf(buffer, len, "%s ", entry);20422042+ /* e.g. file ioctl pipe:[$INO] $CMD */20432043+ if (tomoyo_numscan(cp, "pipe:[", &width, ']')) {20442044+ cp += width + 7;20452045+ tomoyo_addprintf(buffer, len, "pipe:[\\$]");20462046+ goto flush;20472047+ }20482048+ /* e.g. file ioctl socket:[$INO] $CMD */20492049+ if (tomoyo_numscan(cp, "socket:[", &width, ']')) {20502050+ cp += width + 9;20512051+ tomoyo_addprintf(buffer, len, "socket:[\\$]");20522052+ goto flush;20532053+ }20542054+ if (!strncmp(cp, "proc:/self", 10)) {20552055+ /* e.g. file read proc:/self/task/$TID/fdinfo/$FD */20562056+ cp += 10;20572057+ tomoyo_addprintf(buffer, len, "proc:/self");20582058+ } else if (tomoyo_numscan(cp, "proc:/", &width, 0)) {20592059+ /* e.g. file read proc:/$PID/task/$TID/fdinfo/$FD */20602060+ /*20612061+ * Don't patternize $PID part if $PID == 1, for several20622062+ * programs access only files in /proc/1/ directory.20632063+ */20642064+ cp += width + 6;20652065+ if (width == 1 && *(cp - 1) == '1')20662066+ tomoyo_addprintf(buffer, len, "proc:/1");20672067+ else20682068+ tomoyo_addprintf(buffer, len, "proc:/\\$");20692069+ } else {20702070+ goto flush;20712071+ }20722072+ /* Patternize $TID part if "/task/" follows. */20732073+ if (tomoyo_numscan(cp, "/task/", &width, 0)) {20742074+ cp += width + 6;20752075+ tomoyo_addprintf(buffer, len, "/task/\\$");20762076+ }20772077+ /* Patternize $FD part if "/fd/" or "/fdinfo/" follows. */20782078+ if (tomoyo_numscan(cp, "/fd/", &width, 0)) {20792079+ cp += width + 4;20802080+ tomoyo_addprintf(buffer, len, "/fd/\\$");20812081+ } else if (tomoyo_numscan(cp, "/fdinfo/", &width, 0)) {20822082+ cp += width + 8;20832083+ tomoyo_addprintf(buffer, len, "/fdinfo/\\$");20842084+ }20852085+flush:20862086+ /* Flush remaining part if any. */20872087+ if (*cp)20882088+ tomoyo_addprintf(buffer, len, "%s", cp);20892089+}20902090+20912091+/**19842092 * tomoyo_add_entry - Add an ACL to current thread's domain. Used by learning mode.19852093 *19862094 * @domain: Pointer to "struct tomoyo_domain_info".···21112003 if (!cp)21122004 return;21132005 *cp++ = '\0';21142114- len = strlen(cp) + 1;20062006+ /* Reserve some space for potentially using patterns. */20072007+ len = strlen(cp) + 16;21152008 /* strstr() will return NULL if ordering is wrong. */21162009 if (*cp == 'f') {21172010 argv0 = strstr(header, " argv[]={ \"");···21292020 if (symlink)21302021 len += tomoyo_truncate(symlink + 1) + 1;21312022 }21322132- buffer = kmalloc(len, GFP_NOFS);20232023+ buffer = kmalloc(len, GFP_NOFS | __GFP_ZERO);21332024 if (!buffer)21342025 return;21352135- snprintf(buffer, len - 1, "%s", cp);21362136- if (*cp == 'f' && strchr(buffer, ':')) {21372137- /* Automatically replace 2 or more digits with \$ pattern. */21382138- char *cp2;21392139-21402140- /* e.g. file read proc:/$PID/stat */21412141- cp = strstr(buffer, " proc:/");21422142- if (cp && simple_strtoul(cp + 7, &cp2, 10) >= 10 && *cp2 == '/') {21432143- *(cp + 7) = '\\';21442144- *(cp + 8) = '$';21452145- memmove(cp + 9, cp2, strlen(cp2) + 1);21462146- goto ok;21472147- }21482148- /* e.g. file ioctl pipe:[$INO] $CMD */21492149- cp = strstr(buffer, " pipe:[");21502150- if (cp && simple_strtoul(cp + 7, &cp2, 10) >= 10 && *cp2 == ']') {21512151- *(cp + 7) = '\\';21522152- *(cp + 8) = '$';21532153- memmove(cp + 9, cp2, strlen(cp2) + 1);21542154- goto ok;21552155- }21562156- /* e.g. file ioctl socket:[$INO] $CMD */21572157- cp = strstr(buffer, " socket:[");21582158- if (cp && simple_strtoul(cp + 9, &cp2, 10) >= 10 && *cp2 == ']') {21592159- *(cp + 9) = '\\';21602160- *(cp + 10) = '$';21612161- memmove(cp + 11, cp2, strlen(cp2) + 1);21622162- goto ok;21632163- }21642164- }21652165-ok:20262026+ tomoyo_patternize_path(buffer, len, cp);21662027 if (realpath)21672028 tomoyo_addprintf(buffer, len, " exec.%s", realpath);21682029 if (argv0)
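The new tomoyo_patternize_path() helper replaces the hand-rolled strstr()/simple_strtoul() scans with a single patternizer for volatile numeric path components. A rough regex-based Python approximation of the rewriting it performs (a sketch, not the exact C algorithm, which scans the path left to right):

```python
import re

def patternize(line):
    # Only "file ..." entries containing ':' are rewritten,
    # mirroring the early-out checks in tomoyo_patternize_path().
    if not line.startswith("file ") or ":" not in line:
        return line
    line = re.sub(r"pipe:\[\d+\]", r"pipe:[\\$]", line)
    line = re.sub(r"socket:\[\d+\]", r"socket:[\\$]", line)
    # Keep proc:/self and proc:/1 as-is; patternize other PIDs.
    line = re.sub(r"proc:/(?!self)(?!1\b)\d+", r"proc:/\\$", line)
    line = re.sub(r"/task/\d+", r"/task/\\$", line)
    line = re.sub(r"/(fd|fdinfo)/\d+", r"/\1/\\$", line)
    return line
```

For example, `patternize("file read proc:/123/task/45/fdinfo/6")` yields the learned pattern with `\$` in place of the PID, TID, and FD components, while `proc:/1/...` paths stay literal.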
+1-1
security/tomoyo/domain.c
···920920#ifdef CONFIG_MMU921921 /*922922 * This is called at execve() time in order to dig around923923- * in the argv/environment of the new proceess923923+ * in the argv/environment of the new process924924 * (represented by bprm).925925 */926926 mmap_read_lock(bprm->mm);
···549549 .id = LSM_ID_TOMOYO,550550};551551552552-/*553553- * tomoyo_security_ops is a "struct security_operations" which is used for554554- * registering TOMOYO.555555- */552552+/* tomoyo_hooks is used for registering TOMOYO. */556553static struct security_hook_list tomoyo_hooks[] __ro_after_init = {557554 LSM_HOOK_INIT(cred_prepare, tomoyo_cred_prepare),558555 LSM_HOOK_INIT(bprm_committed_creds, tomoyo_bprm_committed_creds),
+11-1
tools/objtool/check.c
···227227 str_ends_with(func->name, "_4core9panicking18panic_bounds_check") ||228228 str_ends_with(func->name, "_4core9panicking19assert_failed_inner") ||229229 str_ends_with(func->name, "_4core9panicking36panic_misaligned_pointer_dereference") ||230230+ strstr(func->name, "_4core9panicking13assert_failed") ||230231 strstr(func->name, "_4core9panicking11panic_const24panic_const_") ||231232 (strstr(func->name, "_4core5slice5index24slice_") &&232233 str_ends_with(func->name, "_fail"));···19761975 reloc_addend(reloc) == pfunc->offset)19771976 break;1978197719781978+ /*19791979+ * Clang sometimes leaves dangling unused jump table entries19801980+ * which point to the end of the function. Ignore them.19811981+ */19821982+ if (reloc->sym->sec == pfunc->sec &&19831983+ reloc_addend(reloc) == pfunc->offset + pfunc->len)19841984+ goto next;19851985+19791986 dest_insn = find_insn(file, reloc->sym->sec, reloc_addend(reloc));19801987 if (!dest_insn)19811988 break;···20011992 alt->insn = dest_insn;20021993 alt->next = insn->alts;20031994 insn->alts = alt;19951995+next:20041996 prev_offset = reloc_offset(reloc);20051997 }20061998···2274226422752265 if (sec->sh.sh_entsize != 8) {22762266 static bool warned = false;22772277- if (!warned) {22672267+ if (!warned && opts.verbose) {22782268 WARN("%s: dodgy linker, sh_entsize != 8", sec->name);22792269 warned = true;22802270 }
···49495050 SCX_ASSERT(is_cpu_online());51515252- skel = hotplug__open_and_load();5353- SCX_ASSERT(skel);5252+ skel = hotplug__open();5353+ SCX_FAIL_IF(!skel, "Failed to open");5454+ SCX_ENUM_INIT(skel);5555+ SCX_FAIL_IF(hotplug__load(skel), "Failed to load skel");54565557 /* Testing the offline -> online path, so go offline before starting */5658 if (onlining)
···15151616#define SCHED_EXT 717171818-static struct init_enable_count *1919-open_load_prog(bool global)2020-{2121- struct init_enable_count *skel;2222-2323- skel = init_enable_count__open();2424- SCX_BUG_ON(!skel, "Failed to open skel");2525-2626- if (!global)2727- skel->struct_ops.init_enable_count_ops->flags |= SCX_OPS_SWITCH_PARTIAL;2828-2929- SCX_BUG_ON(init_enable_count__load(skel), "Failed to load skel");3030-3131- return skel;3232-}3333-3418static enum scx_test_status run_test(bool global)3519{3620 struct init_enable_count *skel;···2440 struct sched_param param = {};2541 pid_t pids[num_pre_forks];26422727- skel = open_load_prog(global);4343+ skel = init_enable_count__open();4444+ SCX_FAIL_IF(!skel, "Failed to open");4545+ SCX_ENUM_INIT(skel);4646+4747+ if (!global)4848+ skel->struct_ops.init_enable_count_ops->flags |= SCX_OPS_SWITCH_PARTIAL;4949+5050+ SCX_FAIL_IF(init_enable_count__load(skel), "Failed to load skel");28512952 /*3053 * Fork a bunch of children before we attach the scheduler so that we···150159151160struct scx_test init_enable_count = {152161 .name = "init_enable_count",153153- .description = "Verify we do the correct amount of counting of init, "162162+ .description = "Verify we correctly count the occurrences of init, "154163 "enable, etc callbacks.",155164 .run = run,156165};
+5-2
tools/testing/selftests/sched_ext/maximal.c
···1414{1515 struct maximal *skel;16161717- skel = maximal__open_and_load();1818- SCX_FAIL_IF(!skel, "Failed to open and load skel");1717+ skel = maximal__open();1818+ SCX_FAIL_IF(!skel, "Failed to open");1919+ SCX_ENUM_INIT(skel);2020+ SCX_FAIL_IF(maximal__load(skel), "Failed to load skel");2121+1922 *ctx = skel;20232124 return SCX_TEST_PASS;
+1-1
tools/testing/selftests/sched_ext/maybe_null.c
···43434444struct scx_test maybe_null = {4545 .name = "maybe_null",4646- .description = "Verify if PTR_MAYBE_NULL work for .dispatch",4646+ .description = "Verify if PTR_MAYBE_NULL works for .dispatch",4747 .run = run,4848};4949REGISTER_SCX_TEST(&maybe_null)
+5-5
tools/testing/selftests/sched_ext/minimal.c
···1515{1616 struct minimal *skel;17171818- skel = minimal__open_and_load();1919- if (!skel) {2020- SCX_ERR("Failed to open and load skel");2121- return SCX_TEST_FAIL;2222- }1818+ skel = minimal__open();1919+ SCX_FAIL_IF(!skel, "Failed to open");2020+ SCX_ENUM_INIT(skel);2121+ SCX_FAIL_IF(minimal__load(skel), "Failed to load skel");2222+2323 *ctx = skel;24242525 return SCX_TEST_PASS;
+5-5
tools/testing/selftests/sched_ext/prog_run.c
···1515{1616 struct prog_run *skel;17171818- skel = prog_run__open_and_load();1919- if (!skel) {2020- SCX_ERR("Failed to open and load skel");2121- return SCX_TEST_FAIL;2222- }1818+ skel = prog_run__open();1919+ SCX_FAIL_IF(!skel, "Failed to open");2020+ SCX_ENUM_INIT(skel);2121+ SCX_FAIL_IF(prog_run__load(skel), "Failed to load skel");2222+2323 *ctx = skel;24242525 return SCX_TEST_PASS;
+4-5
tools/testing/selftests/sched_ext/reload_loop.c
···18181919static enum scx_test_status setup(void **ctx)2020{2121- skel = maximal__open_and_load();2222- if (!skel) {2323- SCX_ERR("Failed to open and load skel");2424- return SCX_TEST_FAIL;2525- }2121+ skel = maximal__open();2222+ SCX_FAIL_IF(!skel, "Failed to open");2323+ SCX_ENUM_INIT(skel);2424+ SCX_FAIL_IF(maximal__load(skel), "Failed to load skel");26252726 return SCX_TEST_PASS;2827}
···10711071}1072107210731073/*10741074- * Called after the VM is otherwise initialized, but just before adding it to10751075- * the vm_list.10761076- */10771077-int __weak kvm_arch_post_init_vm(struct kvm *kvm)10781078-{10791079- return 0;10801080-}10811081-10821082-/*10831074 * Called just after removing the VM from the vm_list, but before doing any10841075 * other destruction.10851076 */···11901199 if (r)11911200 goto out_err_no_debugfs;1192120111931193- r = kvm_arch_post_init_vm(kvm);11941194- if (r)11951195- goto out_err;11961196-11971202 mutex_lock(&kvm_lock);11981203 list_add(&kvm->vm_list, &vm_list);11991204 mutex_unlock(&kvm_lock);···1199121212001213 return kvm;1201121412021202-out_err:12031203- kvm_destroy_vm_debugfs(kvm);12041215out_err_no_debugfs:12051216 kvm_coalesced_mmio_free(kvm);12061217out_no_coalesced_mmio:···19561971 return -EINVAL;19571972 if (mem->guest_phys_addr + mem->memory_size < mem->guest_phys_addr)19581973 return -EINVAL;19591959- if ((mem->memory_size >> PAGE_SHIFT) > KVM_MEM_MAX_NR_PAGES)19741974+19751975+ /*19761976+ * The size of userspace-defined memory regions is restricted in order19771977+ * to play nice with dirty bitmap operations, which are indexed with an19781978+ * "unsigned int". KVM's internal memory regions don't support dirty19791979+ * logging, and so are exempt.19801980+ */19811981+ if (id < KVM_USER_MEM_SLOTS &&19821982+ (mem->memory_size >> PAGE_SHIFT) > KVM_MEM_MAX_NR_PAGES)19601983 return -EINVAL;1961198419621985 slots = __kvm_memslots(kvm, as_id);
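The relocated size check above now applies only to userspace-defined slots: internal memslots (id >= KVM_USER_MEM_SLOTS) never support dirty logging, so they are exempt from the "unsigned int" dirty-bitmap index limit. A Python sketch of the guard (the constants here are illustrative, not the kernel's exact values):

```python
PAGE_SHIFT = 12                        # illustrative: 4 KiB pages
KVM_USER_MEM_SLOTS = 509               # illustrative slot-id boundary
KVM_MEM_MAX_NR_PAGES = (1 << 31) - 1   # pages indexable by an "unsigned int"

def memslot_size_ok(slot_id, memory_size):
    # Mirror of the patched check: only user slots are bounded by
    # the dirty-bitmap-driven page-count limit.
    if (slot_id < KVM_USER_MEM_SLOTS and
            (memory_size >> PAGE_SHIFT) > KVM_MEM_MAX_NR_PAGES):
        return False
    return True
```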