···
 Julien Thierry <julien.thierry.kdev@gmail.com> <julien.thierry@arm.com>
 Iskren Chernev <me@iskren.info> <iskren.chernev@gmail.com>
 Kalle Valo <kvalo@kernel.org> <kvalo@codeaurora.org>
+Kalle Valo <kvalo@kernel.org> <quic_kvalo@quicinc.com>
 Kalyan Thota <quic_kalyant@quicinc.com> <kalyan_t@codeaurora.org>
 Karthikeyan Periyasamy <quic_periyasa@quicinc.com> <periyasa@codeaurora.org>
 Kathiravan T <quic_kathirav@quicinc.com> <kathirav@codeaurora.org>
+2-4
CREDITS
···
 D: Initial implementation of VC's, pty's and select()

 N: Pavel Machek
-E: pavel@ucw.cz
+E: pavel@kernel.org
 P: 4096R/92DFCE96 4FA7 9EEF FCD4 C44F C585 B8C7 C060 2241 92DF CE96
-D: Softcursor for vga, hypertech cdrom support, vcsa bugfix, nbd,
-D: sun4/330 port, capabilities for elf, speedup for rm on ext2, USB,
-D: work on suspend-to-ram/disk, killing duplicates from ioctl32,
+D: NBD, Sun4/330 port, USB, work on suspend-to-ram/disk,
 D: Altera SoCFPGA and Nokia N900 support.
 S: Czech Republic
+1-1
Documentation/arch/arm64/gcs.rst
···
   shadow stacks rather than GCS.

 * Support for GCS is reported to userspace via HWCAP_GCS in the aux vector
-  AT_HWCAP2 entry.
+  AT_HWCAP entry.

 * GCS is enabled per thread. While there is support for disabling GCS
   at runtime this should be done with great care.
···
 maintainers:
   - Taniya Das <quic_tdas@quicinc.com>
+  - Imran Shaik <quic_imrashai@quicinc.com>

 description: |
   Qualcomm camera clock control module provides the clocks, resets and power
   domains on SA8775p.

-  See also: include/dt-bindings/clock/qcom,sa8775p-camcc.h
+  See also:
+    include/dt-bindings/clock/qcom,qcs8300-camcc.h
+    include/dt-bindings/clock/qcom,sa8775p-camcc.h

 properties:
   compatible:
     enum:
+      - qcom,qcs8300-camcc
       - qcom,sa8775p-camcc

   clocks:
···
 description: |
   The Microchip LAN966x outband interrupt controller (OIC) maps the internal
-  interrupt sources of the LAN966x device to an external interrupt.
-  When the LAN966x device is used as a PCI device, the external interrupt is
-  routed to the PCI interrupt.
+  interrupt sources of the LAN966x device to a PCI interrupt when the LAN966x
+  device is used as a PCI device.

 properties:
   compatible:
···
 title: Qualcomm Technologies ath10k wireless devices

 maintainers:
-  - Kalle Valo <kvalo@kernel.org>
   - Jeff Johnson <jjohnson@kernel.org>

 description:
···
 title: Qualcomm Technologies ath11k wireless devices (PCIe)

 maintainers:
-  - Kalle Valo <kvalo@kernel.org>
   - Jeff Johnson <jjohnson@kernel.org>

 description: |
···
 title: Qualcomm Technologies ath11k wireless devices

 maintainers:
-  - Kalle Valo <kvalo@kernel.org>
   - Jeff Johnson <jjohnson@kernel.org>

 description: |
···
 maintainers:
   - Jeff Johnson <jjohnson@kernel.org>
-  - Kalle Valo <kvalo@kernel.org>

 description: |
   Qualcomm Technologies IEEE 802.11be PCIe devices with WSI interface.
···
 maintainers:
   - Jeff Johnson <quic_jjohnson@quicinc.com>
-  - Kalle Valo <kvalo@kernel.org>

 description:
   Qualcomm Technologies IEEE 802.11be PCIe devices.
···
   Each sub-node is identified using the node's name, with valid values listed
   for each of the pmics below.

-  For mp5496, s1, s2
+  For mp5496, s1, s2, l2, l5

   For pm2250, s1, s2, s3, s4, l1, l2, l3, l4, l5, l6, l7, l8, l9, l10, l11,
   l12, l13, l14, l15, l16, l17, l18, l19, l20, l21, l22
+Submitting patches to bcachefs:
+===============================
+
+Patches must be tested before being submitted, either with the xfstests suite
+[0], or the full bcachefs test suite in ktest [1], depending on what's being
+touched. Note that ktest wraps xfstests and will be an easier way to run it
+for most users; it includes single-command wrappers for all the mainstream
+in-kernel local filesystems.
+
+Patches will undergo more testing after being merged (including
+lockdep/kasan/preempt/etc. variants); these are not generally required to be
+run by the submitter - but do put some thought into what you're changing and
+which tests might be relevant. For example: tricky memory layout work calls
+for kasan, locking work calls for lockdep, and ktest includes single-command
+variants for the debug build types you'll most likely need.
+
+The exception to this rule is incomplete WIP/RFC patches: if you're working on
+something nontrivial, it's encouraged to send out a WIP patch to let people
+know what you're doing and make sure you're on the right track. Just make sure
+it includes a brief note as to what's done and what's incomplete, to avoid
+confusion.
+
+Rigorous checkpatch.pl adherence is not required (many of its warnings are
+considered out of date), but try not to deviate too much without reason.
+
+Focus on writing code that reads well and is organized well; code should be
+aesthetically pleasing.
+
+CI:
+===
+
+Instead of running your tests locally, when running the full test suite it's
+preferable to let a server farm do it in parallel, and then have the results
+in a nice test dashboard (which can tell you which failures are new, and
+presents results in a git log view, avoiding the need for most bisecting).
+
+That exists [2], and community members may request an account. If you work for
+a big tech company, you'll need to help out with server costs to get access -
+but the CI is not restricted to running bcachefs tests: it runs any ktest test
+(which generally makes it easy to wrap other tests that can run in qemu).
+
+Other things to think about:
+============================
+
+- How will we debug this code? Is there sufficient introspection to diagnose
+  when something starts acting wonky on a user machine?
+
+  We don't necessarily need every single field of every data structure visible
+  with introspection, but having the important fields of all the core data
+  types wired up makes debugging drastically easier - a bit of thoughtful
+  foresight greatly reduces the need to have people build custom kernels with
+  debug patches.
+
+  More broadly, think about all the debug tooling that might be needed.
+
+- Does it make the codebase more or less of a mess? Can we try to do some
+  organizing, too?
+
+- Do new tests need to be written? New assertions? How do we know and verify
+  that the code is correct, and what happens if something goes wrong?
+
+  We don't yet have automated code coverage analysis or easy fault injection -
+  but for now, pretend we did and ask what they might tell us.
+
+  Assertions are hugely important, given that we don't yet have a systems
+  language that can do ergonomic embedded correctness proofs. Hitting an assert
+  in testing is much better than wandering off into undefined behaviour la-la
+  land - use them. Use them judiciously, and not as a replacement for proper
+  error handling, but use them.
+
+- Does it need to be performance tested? Should we add new performance
+  counters?
+
+  bcachefs has a set of persistent runtime counters which can be viewed with
+  the 'bcachefs fs top' command; this should give users a basic idea of what
+  their filesystem is currently doing. If you're doing a new feature or
+  looking at old code, think about whether anything should be added.
+
+- If it's a new on-disk format feature - have upgrades and downgrades been
+  tested? (Automated tests exist but aren't in the CI, due to the hassle of
+  disk image management; coordinate to have them run.)
+
+Mailing list, IRC:
+==================
+
+Patches should hit the list [3], but much discussion and code review happens
+on IRC as well [4]; many people appreciate the more conversational approach
+and quicker feedback.
+
+Additionally, we have a lively user community doing excellent QA work, which
+exists primarily on IRC. Please make use of that resource; user feedback is
+important for any nontrivial feature, and documenting it in commit messages
+would be a good idea.
+
+[0]: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
+[1]: https://evilpiepirate.org/git/ktest.git/
+[2]: https://evilpiepirate.org/~testdashboard/ci/
+[3]: linux-bcachefs@vger.kernel.org
+[4]: irc.oftc.net#bcache, #bcachefs-dev
···
 S390:
 ^^^^^

-Returns -EINVAL if the VM has the KVM_VM_S390_UCONTROL flag set.
+Returns -EINVAL or -EEXIST if the VM has the KVM_VM_S390_UCONTROL flag set.
 Returns -EINVAL if called on a protected VM.

 4.36 KVM_SET_TSS_ADDR
+65-20
MAINTAINERS
···
 F:	sound/soc/codecs/ssm3515.c

 ARM/APPLE MACHINE SUPPORT
-M:	Hector Martin <marcan@marcan.st>
 M:	Sven Peter <sven@svenpeter.dev>
 R:	Alyssa Rosenzweig <alyssa@rosenzweig.io>
 L:	asahi@lists.linux.dev
···
 F:	drivers/phy/qualcomm/phy-ath79-usb.c

 ATHEROS ATH GENERIC UTILITIES
-M:	Kalle Valo <kvalo@kernel.org>
 M:	Jeff Johnson <jjohnson@kernel.org>
 L:	linux-wireless@vger.kernel.org
 S:	Supported
···
 F:	Documentation/devicetree/bindings/pwm/adi,axi-pwmgen.yaml
 F:	drivers/pwm/pwm-axi-pwmgen.c

-AXXIA I2C CONTROLLER
-M:	Krzysztof Adamski <krzysztof.adamski@nokia.com>
-L:	linux-i2c@vger.kernel.org
-S:	Maintained
-F:	Documentation/devicetree/bindings/i2c/i2c-axxia.txt
-F:	drivers/i2c/busses/i2c-axxia.c
-
 AZ6007 DVB DRIVER
 M:	Mauro Carvalho Chehab <mchehab@kernel.org>
 L:	linux-media@vger.kernel.org
···
 L:	linux-bcachefs@vger.kernel.org
 S:	Supported
 C:	irc://irc.oftc.net/bcache
+P:	Documentation/filesystems/bcachefs/SubmittingPatches.rst
 T:	git https://evilpiepirate.org/git/bcachefs.git
 F:	fs/bcachefs/
 F:	Documentation/filesystems/bcachefs/
···
 F:	rust/kernel/device_id.rs
 F:	rust/kernel/devres.rs
 F:	rust/kernel/driver.rs
+F:	rust/kernel/faux.rs
 F:	rust/kernel/platform.rs
 F:	samples/rust/rust_driver_platform.rs
+F:	samples/rust/rust_driver_faux.rs

 DRIVERS FOR OMAP ADAPTIVE VOLTAGE SCALING (AVS)
 M:	Nishanth Menon <nm@ti.com>
···
 FREEZER
 M:	"Rafael J. Wysocki" <rafael@kernel.org>
-M:	Pavel Machek <pavel@ucw.cz>
+M:	Pavel Machek <pavel@kernel.org>
 L:	linux-pm@vger.kernel.org
 S:	Supported
 F:	Documentation/power/freezing-of-tasks.rst
···
 F:	drivers/staging/gpib/

 GPIO ACPI SUPPORT
-M:	Mika Westerberg <mika.westerberg@linux.intel.com>
+M:	Mika Westerberg <westeri@kernel.org>
 M:	Andy Shevchenko <andriy.shevchenko@linux.intel.com>
 L:	linux-gpio@vger.kernel.org
 L:	linux-acpi@vger.kernel.org
···
 HIBERNATION (aka Software Suspend, aka swsusp)
 M:	"Rafael J. Wysocki" <rafael@kernel.org>
-M:	Pavel Machek <pavel@ucw.cz>
+M:	Pavel Machek <pavel@kernel.org>
 L:	linux-pm@vger.kernel.org
 S:	Supported
 B:	https://bugzilla.kernel.org
···
 F:	drivers/tty/hvc/

 I2C ACPI SUPPORT
-M:	Mika Westerberg <mika.westerberg@linux.intel.com>
+M:	Mika Westerberg <westeri@kernel.org>
 L:	linux-i2c@vger.kernel.org
 L:	linux-acpi@vger.kernel.org
 S:	Maintained
···
 F:	scripts/leaking_addresses.pl

 LED SUBSYSTEM
-M:	Pavel Machek <pavel@ucw.cz>
 M:	Lee Jones <lee@kernel.org>
+M:	Pavel Machek <pavel@kernel.org>
 L:	linux-leds@vger.kernel.org
 S:	Maintained
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds.git
···
 X:	drivers/net/wireless/

 NETWORKING DRIVERS (WIRELESS)
-M:	Kalle Valo <kvalo@kernel.org>
+M:	Johannes Berg <johannes@sipsolutions.net>
 L:	linux-wireless@vger.kernel.org
 S:	Maintained
 W:	https://wireless.wiki.kernel.org/
···
 F:	include/net/dsa.h
 F:	net/dsa/
 F:	tools/testing/selftests/drivers/net/dsa/
+
+NETWORKING [ETHTOOL]
+M:	Andrew Lunn <andrew@lunn.ch>
+M:	Jakub Kicinski <kuba@kernel.org>
+F:	Documentation/netlink/specs/ethtool.yaml
+F:	Documentation/networking/ethtool-netlink.rst
+F:	include/linux/ethtool*
+F:	include/uapi/linux/ethtool*
+F:	net/ethtool/
+F:	tools/testing/selftests/drivers/net/*/ethtool*
+
+NETWORKING [ETHTOOL CABLE TEST]
+M:	Andrew Lunn <andrew@lunn.ch>
+F:	net/ethtool/cabletest.c
+F:	tools/testing/selftests/drivers/net/*/ethtool*
+K:	cable_test

 NETWORKING [GENERAL]
 M:	"David S. Miller" <davem@davemloft.net>
···
 F:	include/linux/netlink.h
 F:	include/linux/netpoll.h
 F:	include/linux/rtnetlink.h
+F:	include/linux/sctp.h
 F:	include/linux/seq_file_net.h
 F:	include/linux/skbuff*
 F:	include/net/
···
 F:	include/uapi/linux/netlink.h
 F:	include/uapi/linux/netlink_diag.h
 F:	include/uapi/linux/rtnetlink.h
+F:	include/uapi/linux/sctp.h
 F:	lib/net_utils.c
 F:	lib/random32.c
 F:	net/
···
 NETWORKING [TCP]
 M:	Eric Dumazet <edumazet@google.com>
 M:	Neal Cardwell <ncardwell@google.com>
+R:	Kuniyuki Iwashima <kuniyu@amazon.com>
 L:	netdev@vger.kernel.org
 S:	Maintained
 F:	Documentation/networking/net_cachelines/tcp_sock.rst
···
 F:	include/net/tls.h
 F:	include/uapi/linux/tls.h
 F:	net/tls/*
+
+NETWORKING [SOCKETS]
+M:	Eric Dumazet <edumazet@google.com>
+M:	Kuniyuki Iwashima <kuniyu@amazon.com>
+M:	Paolo Abeni <pabeni@redhat.com>
+M:	Willem de Bruijn <willemb@google.com>
+S:	Maintained
+F:	include/linux/sock_diag.h
+F:	include/linux/socket.h
+F:	include/linux/sockptr.h
+F:	include/net/sock.h
+F:	include/net/sock_reuseport.h
+F:	include/uapi/linux/socket.h
+F:	net/core/*sock*
+F:	net/core/scm.c
+F:	net/socket.c
+
+NETWORKING [UNIX SOCKETS]
+M:	Kuniyuki Iwashima <kuniyu@amazon.com>
+S:	Maintained
+F:	include/net/af_unix.h
+F:	include/net/netns/unix.h
+F:	include/uapi/linux/unix_diag.h
+F:	net/unix/
+F:	tools/testing/selftests/net/af_unix/

 NETXEN (1/10) GbE SUPPORT
 M:	Manish Chopra <manishc@marvell.com>
···
 F:	kernel/time/tick*.*

 NOKIA N900 CAMERA SUPPORT (ET8EK8 SENSOR, AD5820 FOCUS)
-M:	Pavel Machek <pavel@ucw.cz>
+M:	Pavel Machek <pavel@kernel.org>
 M:	Sakari Ailus <sakari.ailus@iki.fi>
 L:	linux-media@vger.kernel.org
 S:	Maintained
···
 L:	dev@openvswitch.org
 S:	Maintained
 W:	http://openvswitch.org
+F:	Documentation/networking/openvswitch.rst
 F:	include/uapi/linux/openvswitch.h
 F:	net/openvswitch/
 F:	tools/testing/selftests/net/openvswitch/
···
 F:	drivers/media/tuners/qt1010*

 QUALCOMM ATH12K WIRELESS DRIVER
-M:	Kalle Valo <kvalo@kernel.org>
 M:	Jeff Johnson <jjohnson@kernel.org>
 L:	ath12k@lists.infradead.org
 S:	Supported
···
 N:	ath12k

 QUALCOMM ATHEROS ATH10K WIRELESS DRIVER
-M:	Kalle Valo <kvalo@kernel.org>
 M:	Jeff Johnson <jjohnson@kernel.org>
 L:	ath10k@lists.infradead.org
 S:	Supported
···
 N:	ath10k

 QUALCOMM ATHEROS ATH11K WIRELESS DRIVER
-M:	Kalle Valo <kvalo@kernel.org>
 M:	Jeff Johnson <jjohnson@kernel.org>
 L:	ath11k@lists.infradead.org
 S:	Supported
···
 L:	dmaengine@vger.kernel.org
 S:	Supported
 F:	drivers/dma/qcom/hidma*
+
+QUALCOMM I2C QCOM GENI DRIVER
+M:	Mukesh Kumar Savaliya <quic_msavaliy@quicinc.com>
+M:	Viken Dadhaniya <quic_vdadhani@quicinc.com>
+L:	linux-i2c@vger.kernel.org
+L:	linux-arm-msm@vger.kernel.org
+S:	Maintained
+F:	Documentation/devicetree/bindings/i2c/qcom,i2c-geni-qcom.yaml
+F:	drivers/i2c/busses/i2c-qcom-geni.c

 QUALCOMM I2C CCI DRIVER
 M:	Loic Poulain <loic.poulain@linaro.org>
···
 SUSPEND TO RAM
 M:	"Rafael J. Wysocki" <rafael@kernel.org>
 M:	Len Brown <len.brown@intel.com>
-M:	Pavel Machek <pavel@ucw.cz>
+M:	Pavel Machek <pavel@kernel.org>
 L:	linux-pm@vger.kernel.org
 S:	Supported
 B:	https://bugzilla.kernel.org
+5-10
Makefile
···
 VERSION = 6
 PATCHLEVEL = 14
 SUBLEVEL = 0
-EXTRAVERSION = -rc1
+EXTRAVERSION = -rc3
 NAME = Baby Opossum Posse

 # *DOCUMENTATION*
···
 endif

 # Align the bit size of userspace programs with the kernel
-KBUILD_USERCFLAGS  += $(filter -m32 -m64 --target=%, $(KBUILD_CFLAGS))
-KBUILD_USERLDFLAGS += $(filter -m32 -m64 --target=%, $(KBUILD_CFLAGS))
+KBUILD_USERCFLAGS  += $(filter -m32 -m64 --target=%, $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS))
+KBUILD_USERLDFLAGS += $(filter -m32 -m64 --target=%, $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS))

 # make the checker run with the right architecture
 CHECKFLAGS += --arch=$(ARCH)
···
 	$(Q)$(MAKE) -sC $(srctree)/tools/bpf/resolve_btfids O=$(resolve_btfids_O) clean
 endif

-# Clear a bunch of variables before executing the submake
-ifeq ($(quiet),silent_)
-tools_silent=s
-endif
-
 tools/: FORCE
 	$(Q)mkdir -p $(objtree)/tools
-	$(Q)$(MAKE) LDFLAGS= MAKEFLAGS="$(tools_silent) $(filter --j% -j,$(MAKEFLAGS))" O=$(abspath $(objtree)) subdir=tools -C $(srctree)/tools/
+	$(Q)$(MAKE) LDFLAGS= O=$(abspath $(objtree)) subdir=tools -C $(srctree)/tools/

 tools/%: FORCE
 	$(Q)mkdir -p $(objtree)/tools
-	$(Q)$(MAKE) LDFLAGS= MAKEFLAGS="$(tools_silent) $(filter --j% -j,$(MAKEFLAGS))" O=$(abspath $(objtree)) subdir=tools -C $(srctree)/tools/ $*
+	$(Q)$(MAKE) LDFLAGS= O=$(abspath $(objtree)) subdir=tools -C $(srctree)/tools/ $*

 # ---------------------------------------------------------------------------
 # Kernel selftest
+1-5
arch/alpha/include/asm/elf.h
···
 /*
  * This is used to ensure we don't load something for the wrong architecture.
  */
-#define elf_check_arch(x) ((x)->e_machine == EM_ALPHA)
+#define elf_check_arch(x) (((x)->e_machine == EM_ALPHA) && !((x)->e_flags & EF_ALPHA_32BIT))

 /*
  * These are used to set parameters in the core dumps.
···
 	( i_ == IMPLVER_EV5 ? "ev56" \
 	  : amask (AMASK_CIX) ? "ev6" : "ev67"); \
 })
-
-#define SET_PERSONALITY(EX) \
-	set_personality(((EX).e_flags & EF_ALPHA_32BIT) \
-		? PER_LINUX_32BIT : PER_LINUX)

 extern int alpha_l1i_cacheshape;
 extern int alpha_l1d_cacheshape;
+1-1
arch/alpha/include/asm/hwrpb.h
···
 	/* virtual->physical map */
 	unsigned long map_entries;
 	unsigned long map_pages;
-	struct vf_map_struct map[1];
+	struct vf_map_struct map[];
 };

 struct memclust_struct {
+1-1
arch/alpha/include/asm/pgtable.h
···

 extern void paging_init(void);

-/* We have our own get_unmapped_area to cope with ADDR_LIMIT_32BIT.  */
+/* We have our own get_unmapped_area */
 #define HAVE_ARCH_UNMAPPED_AREA

 #endif /* _ALPHA_PGTABLE_H */
+2-6
arch/alpha/include/asm/processor.h
···
 #ifndef __ASM_ALPHA_PROCESSOR_H
 #define __ASM_ALPHA_PROCESSOR_H

-#include <linux/personality.h>	/* for ADDR_LIMIT_32BIT */
-
 /*
  * We have a 42-bit user address space: 4TB user VM...
  */
 #define TASK_SIZE (0x40000000000UL)

-#define STACK_TOP \
-  (current->personality & ADDR_LIMIT_32BIT ? 0x80000000 : 0x00120000000UL)
+#define STACK_TOP (0x00120000000UL)

 #define STACK_TOP_MAX	0x00120000000UL

 /* This decides where the kernel will search for a free chunk of vm
  * space during mmap's.
  */
-#define TASK_UNMAPPED_BASE \
-  ((current->personality & ADDR_LIMIT_32BIT) ? 0x40000000 : TASK_SIZE / 2)
+#define TASK_UNMAPPED_BASE (TASK_SIZE / 2)

 /* This is dead.  Everything has been moved to thread_info. */
 struct thread_struct { };
+2
arch/alpha/include/uapi/asm/ptrace.h
···
 	unsigned long trap_a0;
 	unsigned long trap_a1;
 	unsigned long trap_a2;
+/* This makes the stack 16-byte aligned as GCC expects */
+	unsigned long __pad0;
 /* These are saved by PAL-code: */
 	unsigned long ps;
 	unsigned long pc;
···
 		return ret;
 }

-/* Get an address range which is currently unmapped.  Similar to the
-   generic version except that we know how to honor ADDR_LIMIT_32BIT.  */
+/* Get an address range which is currently unmapped. */

 static unsigned long
 arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
···
 		     unsigned long len, unsigned long pgoff,
 		     unsigned long flags, vm_flags_t vm_flags)
 {
-	unsigned long limit;
-
-	/* "32 bit" actually means 31 bit, since pointers sign extend.  */
-	if (current->personality & ADDR_LIMIT_32BIT)
-		limit = 0x80000000;
-	else
-		limit = TASK_SIZE;
+	unsigned long limit = TASK_SIZE;

 	if (len > limit)
 		return -ENOMEM;
+2-1
arch/alpha/kernel/pci_iommu.c
···
 #include <linux/log2.h>
 #include <linux/dma-map-ops.h>
 #include <linux/iommu-helper.h>
+#include <linux/string_choices.h>

 #include <asm/io.h>
 #include <asm/hwrpb.h>
···
 	/* If both conditions above are met, we are fine. */
 	DBGA("pci_dac_dma_supported %s from %ps\n",
-	     ok ? "yes" : "no", __builtin_return_address(0));
+	     str_yes_no(ok), __builtin_return_address(0));

 	return ok;
 }
+1-1
arch/alpha/kernel/traps.c
···
 static int unauser_reg_offsets[32] = {
 	R(r0), R(r1), R(r2), R(r3), R(r4), R(r5), R(r6), R(r7), R(r8),
 	/* r9 ... r15 are stored in front of regs. */
-	-56, -48, -40, -32, -24, -16, -8,
+	-64, -56, -48, -40, -32, -24, -16, /* padding at -8 */
 	R(r16), R(r17), R(r18),
 	R(r19), R(r20), R(r21), R(r22), R(r23), R(r24), R(r25), R(r26),
 	R(r27), R(r28), R(gp),
···
 		__cpacr_to_cptr_set(clr, set));\
 	} while (0)

-static __always_inline void kvm_write_cptr_el2(u64 val)
-{
-	if (has_vhe() || has_hvhe())
-		write_sysreg(val, cpacr_el1);
-	else
-		write_sysreg(val, cptr_el2);
-}
-
-/* Resets the value of cptr_el2 when returning to the host. */
-static __always_inline void __kvm_reset_cptr_el2(struct kvm *kvm)
-{
-	u64 val;
-
-	if (has_vhe()) {
-		val = (CPACR_EL1_FPEN | CPACR_EL1_ZEN_EL1EN);
-		if (cpus_have_final_cap(ARM64_SME))
-			val |= CPACR_EL1_SMEN_EL1EN;
-	} else if (has_hvhe()) {
-		val = CPACR_EL1_FPEN;
-
-		if (!kvm_has_sve(kvm) || !guest_owns_fp_regs())
-			val |= CPACR_EL1_ZEN;
-		if (cpus_have_final_cap(ARM64_SME))
-			val |= CPACR_EL1_SMEN;
-	} else {
-		val = CPTR_NVHE_EL2_RES1;
-
-		if (kvm_has_sve(kvm) && guest_owns_fp_regs())
-			val |= CPTR_EL2_TZ;
-		if (!cpus_have_final_cap(ARM64_SME))
-			val |= CPTR_EL2_TSM;
-	}
-
-	kvm_write_cptr_el2(val);
-}
-
-#ifdef __KVM_NVHE_HYPERVISOR__
-#define kvm_reset_cptr_el2(v)	__kvm_reset_cptr_el2(kern_hyp_va((v)->kvm))
-#else
-#define kvm_reset_cptr_el2(v)	__kvm_reset_cptr_el2((v)->kvm)
-#endif
-
 /*
  * Returns a 'sanitised' view of CPTR_EL2, translating from nVHE to the VHE
  * format if E2H isn't set.
+5-17
arch/arm64/include/asm/kvm_host.h
···
 static inline void *pop_hyp_memcache(struct kvm_hyp_memcache *mc,
 				     void *(*to_va)(phys_addr_t phys))
 {
-	phys_addr_t *p = to_va(mc->head);
+	phys_addr_t *p = to_va(mc->head & PAGE_MASK);

 	if (!mc->nr_pages)
 		return NULL;
···
 struct kvm_host_data {
 #define KVM_HOST_DATA_FLAG_HAS_SPE	0
 #define KVM_HOST_DATA_FLAG_HAS_TRBE	1
-#define KVM_HOST_DATA_FLAG_HOST_SVE_ENABLED	2
-#define KVM_HOST_DATA_FLAG_HOST_SME_ENABLED	3
 #define KVM_HOST_DATA_FLAG_TRBE_ENABLED	4
 #define KVM_HOST_DATA_FLAG_EL1_TRACING_CONFIGURED	5
 	unsigned long flags;
···
 	struct kvm_cpu_context host_ctxt;

 	/*
-	 * All pointers in this union are hyp VA.
+	 * Hyp VA.
 	 * sve_state is only used in pKVM and if system_supports_sve().
 	 */
-	union {
-		struct user_fpsimd_state *fpsimd_state;
-		struct cpu_sve_state *sve_state;
-	};
+	struct cpu_sve_state *sve_state;

-	union {
-		/* HYP VA pointer to the host storage for FPMR */
-		u64	*fpmr_ptr;
-		/*
-		 * Used by pKVM only, as it needs to provide storage
-		 * for the host
-		 */
-		u64	fpmr;
-	};
+	/* Used by pKVM only. */
+	u64	fpmr;

 	/* Ownership of the FP regs */
 	enum {
···
 }

 /*
- * Called by KVM when entering the guest.
- */
-void fpsimd_kvm_prepare(void)
-{
-	if (!system_supports_sve())
-		return;
-
-	/*
-	 * KVM does not save host SVE state since we can only enter
-	 * the guest from a syscall so the ABI means that only the
-	 * non-saved SVE state needs to be saved.  If we have left
-	 * SVE enabled for performance reasons then update the task
-	 * state to be FPSIMD only.
-	 */
-	get_cpu_fpsimd_context();
-
-	if (test_and_clear_thread_flag(TIF_SVE)) {
-		sve_to_fpsimd(current);
-		current->thread.fp_type = FP_STATE_FPSIMD;
-	}
-
-	put_cpu_fpsimd_context();
-}
-
-/*
  * Associate current's FPSIMD context with this cpu
  * The caller must have ownership of the cpu FPSIMD context before calling
  * this function.
+10-12
arch/arm64/kernel/topology.c
···
 	int cpu;

 	/* We are already set since the last insmod of cpufreq driver */
-	if (unlikely(cpumask_subset(cpus, amu_fie_cpus)))
+	if (cpumask_available(amu_fie_cpus) &&
+	    unlikely(cpumask_subset(cpus, amu_fie_cpus)))
 		return;

-	for_each_cpu(cpu, cpus) {
+	for_each_cpu(cpu, cpus)
 		if (!freq_counters_valid(cpu))
 			return;
+
+	if (!cpumask_available(amu_fie_cpus) &&
+	    !zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL)) {
+		WARN_ONCE(1, "Failed to allocate FIE cpumask for CPUs[%*pbl]\n",
+			  cpumask_pr_args(cpus));
+		return;
 	}

 	cpumask_or(amu_fie_cpus, amu_fie_cpus, cpus);
···

 static int __init init_amu_fie(void)
 {
-	int ret;
-
-	if (!zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL))
-		return -ENOMEM;
-
-	ret = cpufreq_register_notifier(&init_amu_fie_notifier,
+	return cpufreq_register_notifier(&init_amu_fie_notifier,
 					CPUFREQ_POLICY_NOTIFIER);
-	if (ret)
-		free_cpumask_var(amu_fie_cpus);
-
-	return ret;
 }
 core_initcall(init_amu_fie);
···
 static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
 				 struct arch_timer_context *timer_ctx)
 {
-	int ret;
-
 	kvm_timer_update_status(timer_ctx, new_level);

 	timer_ctx->irq.level = new_level;
 	trace_kvm_timer_update_irq(vcpu->vcpu_id, timer_irq(timer_ctx),
 				   timer_ctx->irq.level);

-	if (!userspace_irqchip(vcpu->kvm)) {
-		ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu,
-					  timer_irq(timer_ctx),
-					  timer_ctx->irq.level,
-					  timer_ctx);
-		WARN_ON(ret);
-	}
+	if (userspace_irqchip(vcpu->kvm))
+		return;
+
+	kvm_vgic_inject_irq(vcpu->kvm, vcpu,
+			    timer_irq(timer_ctx),
+			    timer_ctx->irq.level,
+			    timer_ctx);
 }

 /* Only called for a fully emulated timer */
···

 	trace_kvm_timer_emulate(ctx, should_fire);

-	if (should_fire != ctx->irq.level) {
+	if (should_fire != ctx->irq.level)
 		kvm_timer_update_irq(ctx->vcpu, should_fire, ctx);
-		return;
-	}

 	kvm_timer_update_status(ctx, should_fire);
···
 					    timer_irq(map->direct_ptimer),
 					    &arch_timer_irq_ops);
 		WARN_ON_ONCE(ret);
-
-		/*
-		 * The virtual offset behaviour is "interesting", as it
-		 * always applies when HCR_EL2.E2H==0, but only when
-		 * accessed from EL1 when HCR_EL2.E2H==1. So make sure we
-		 * track E2H when putting the HV timer in "direct" mode.
-		 */
-		if (map->direct_vtimer == vcpu_hvtimer(vcpu)) {
-			struct arch_timer_offset *offs = &map->direct_vtimer->offset;
-
-			if (vcpu_el2_e2h_is_set(vcpu))
-				offs->vcpu_offset = NULL;
-			else
-				offs->vcpu_offset = &__vcpu_sys_reg(vcpu, CNTVOFF_EL2);
-		}
 	}
 }
···
 	 * which allows trapping of the timer registers even with NV2.
 	 * Still, this is still worse than FEAT_NV on its own. Meh.
 	 */
-	if (!vcpu_el2_e2h_is_set(vcpu)) {
-		if (cpus_have_final_cap(ARM64_HAS_ECV))
-			return;
-
-		/*
-		 * A non-VHE guest hypervisor doesn't have any direct access
-		 * to its timers: the EL2 registers trap (and the HW is
-		 * fully emulated), while the EL0 registers access memory
-		 * despite the access being notionally direct. Boo.
-		 *
-		 * We update the hardware timer registers with the
-		 * latest value written by the guest to the VNCR page
-		 * and let the hardware take care of the rest.
-		 */
-		write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CTL_EL0),  SYS_CNTV_CTL);
-		write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CVAL_EL0), SYS_CNTV_CVAL);
-		write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CTL_EL0),  SYS_CNTP_CTL);
-		write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), SYS_CNTP_CVAL);
-	} else {
+	if (!cpus_have_final_cap(ARM64_HAS_ECV)) {
 		/*
 		 * For a VHE guest hypervisor, the EL2 state is directly
-		 * stored in the host EL1 timers, while the emulated EL0
+		 * stored in the host EL1 timers, while the emulated EL1
 		 * state is stored in the VNCR page. The latter could have
 		 * been updated behind our back, and we must reset the
 		 * emulation of the timers.
+		 *
+		 * A non-VHE guest hypervisor doesn't have any direct access
+		 * to its timers: the EL2 registers trap despite being
+		 * notionally direct (we use the EL1 HW, as for VHE), while
+		 * the EL1 registers access memory.
+		 *
+		 * In both cases, process the emulated timers on each guest
+		 * exit. Boo.
 		 */
 		struct timer_map map;
 		get_timer_map(vcpu, &map);
+20-8
arch/arm64/kvm/arm.c
···
 		break;
 	case -ENODEV:
 	case -ENXIO:
+		/*
+		 * No VGIC? No pKVM for you.
+		 *
+		 * Protected mode assumes that VGICv3 is present, so no point
+		 * in trying to hobble along if vgic initialization fails.
+		 */
+		if (is_protected_kvm_enabled())
+			goto out;
+
+		/*
+		 * Otherwise, userspace could choose to implement a GIC for its
+		 * guest on non-cooperative hardware.
+		 */
 		vgic_present = false;
 		err = 0;
 		break;
···
 	kvm_nvhe_sym(id_aa64smfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64SMFR0_EL1);
 	kvm_nvhe_sym(__icache_flags) = __icache_flags;
 	kvm_nvhe_sym(kvm_arm_vmid_bits) = kvm_arm_vmid_bits;
+
+	/*
+	 * Flush entire BSS since part of its data containing init symbols is read
+	 * while the MMU is off.
+	 */
+	kvm_flush_dcache_to_poc(kvm_ksym_ref(__hyp_bss_start),
+				kvm_ksym_ref(__hyp_bss_end) - kvm_ksym_ref(__hyp_bss_start));
 }

 static int __init kvm_hyp_init_protection(u32 hyp_va_bits)
···
 		sve_state = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state;
 		per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state =
 			kern_hyp_va(sve_state);
-		}
-	} else {
-		for_each_possible_cpu(cpu) {
-			struct user_fpsimd_state *fpsimd_state;
-
-			fpsimd_state = &per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->host_ctxt.fp_regs;
-			per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->fpsimd_state =
-				kern_hyp_va(fpsimd_state);
 		}
 	}
 }
+9-98
arch/arm64/kvm/fpsimd.c
···5454 if (!system_supports_fpsimd())5555 return;56565757- fpsimd_kvm_prepare();5858-5957 /*6060- * We will check TIF_FOREIGN_FPSTATE just before entering the6161- * guest in kvm_arch_vcpu_ctxflush_fp() and override this to6262- * FP_STATE_FREE if the flag set.5858+ * Ensure that any host FPSIMD/SVE/SME state is saved and unbound such5959+ * that the host kernel is responsible for restoring this state upon6060+ * return to userspace, and the hyp code doesn't need to save anything.6161+ *6262+ * When the host may use SME, fpsimd_save_and_flush_cpu_state() ensures6363+ * that PSTATE.{SM,ZA} == {0,0}.6364 */6464- *host_data_ptr(fp_owner) = FP_STATE_HOST_OWNED;6565- *host_data_ptr(fpsimd_state) = kern_hyp_va(&current->thread.uw.fpsimd_state);6666- *host_data_ptr(fpmr_ptr) = kern_hyp_va(&current->thread.uw.fpmr);6565+ fpsimd_save_and_flush_cpu_state();6666+ *host_data_ptr(fp_owner) = FP_STATE_FREE;67676868- host_data_clear_flag(HOST_SVE_ENABLED);6969- if (read_sysreg(cpacr_el1) & CPACR_EL1_ZEN_EL0EN)7070- host_data_set_flag(HOST_SVE_ENABLED);7171-7272- if (system_supports_sme()) {7373- host_data_clear_flag(HOST_SME_ENABLED);7474- if (read_sysreg(cpacr_el1) & CPACR_EL1_SMEN_EL0EN)7575- host_data_set_flag(HOST_SME_ENABLED);7676-7777- /*7878- * If PSTATE.SM is enabled then save any pending FP7979- * state and disable PSTATE.SM. 
If we leave PSTATE.SM8080- * enabled and the guest does not enable SME via8181- * CPACR_EL1.SMEN then operations that should be valid8282- * may generate SME traps from EL1 to EL1 which we8383- * can't intercept and which would confuse the guest.8484- *8585- * Do the same for PSTATE.ZA in the case where there8686- * is state in the registers which has not already8787- * been saved, this is very unlikely to happen.8888- */8989- if (read_sysreg_s(SYS_SVCR) & (SVCR_SM_MASK | SVCR_ZA_MASK)) {9090- *host_data_ptr(fp_owner) = FP_STATE_FREE;9191- fpsimd_save_and_flush_cpu_state();9292- }9393- }9494-9595- /*9696- * If normal guests gain SME support, maintain this behavior for pKVM9797- * guests, which don't support SME.9898- */9999- WARN_ON(is_protected_kvm_enabled() && system_supports_sme() &&100100- read_sysreg_s(SYS_SVCR));6868+ WARN_ON_ONCE(system_supports_sme() && read_sysreg_s(SYS_SVCR));10169}1027010371/*···130162131163 local_irq_save(flags);132164133133- /*134134- * If we have VHE then the Hyp code will reset CPACR_EL1 to135135- * the default value and we need to reenable SME.136136- */137137- if (has_vhe() && system_supports_sme()) {138138- /* Also restore EL0 state seen on entry */139139- if (host_data_test_flag(HOST_SME_ENABLED))140140- sysreg_clear_set(CPACR_EL1, 0, CPACR_EL1_SMEN);141141- else142142- sysreg_clear_set(CPACR_EL1,143143- CPACR_EL1_SMEN_EL0EN,144144- CPACR_EL1_SMEN_EL1EN);145145- isb();146146- }147147-148165 if (guest_owns_fp_regs()) {149149- if (vcpu_has_sve(vcpu)) {150150- u64 zcr = read_sysreg_el1(SYS_ZCR);151151-152152- /*153153- * If the vCPU is in the hyp context then ZCR_EL1 is154154- * loaded with its vEL2 counterpart.155155- */156156- __vcpu_sys_reg(vcpu, vcpu_sve_zcr_elx(vcpu)) = zcr;157157-158158- /*159159- * Restore the VL that was saved when bound to the CPU,160160- * which is the maximum VL for the guest. 
Because the161161- * layout of the data when saving the sve state depends162162- * on the VL, we need to use a consistent (i.e., the163163- * maximum) VL.164164- * Note that this means that at guest exit ZCR_EL1 is165165- * not necessarily the same as on guest entry.166166- *167167- * ZCR_EL2 holds the guest hypervisor's VL when running168168- * a nested guest, which could be smaller than the169169- * max for the vCPU. Similar to above, we first need to170170- * switch to a VL consistent with the layout of the171171- * vCPU's SVE state. KVM support for NV implies VHE, so172172- * using the ZCR_EL1 alias is safe.173173- */174174- if (!has_vhe() || (vcpu_has_nv(vcpu) && !is_hyp_ctxt(vcpu)))175175- sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1,176176- SYS_ZCR_EL1);177177- }178178-179166 /*180167 * Flush (save and invalidate) the fpsimd/sve state so that if181168 * the host tries to use fpsimd/sve, it's not using stale data···142219 * when needed.143220 */144221 fpsimd_save_and_flush_cpu_state();145145- } else if (has_vhe() && system_supports_sve()) {146146- /*147147- * The FPSIMD/SVE state in the CPU has not been touched, and we148148- * have SVE (and VHE): CPACR_EL1 (alias CPTR_EL2) has been149149- * reset by kvm_reset_cptr_el2() in the Hyp code, disabling SVE150150- * for EL0. To avoid spurious traps, restore the trap state151151- * seen by kvm_arch_vcpu_load_fp():152152- */153153- if (host_data_test_flag(HOST_SVE_ENABLED))154154- sysreg_clear_set(CPACR_EL1, 0, CPACR_EL1_ZEN_EL0EN);155155- else156156- sysreg_clear_set(CPACR_EL1, CPACR_EL1_ZEN_EL0EN, 0);157222 }158223159224 local_irq_restore(flags);
+5
arch/arm64/kvm/hyp/entry.S
···4444alternative_else_nop_endif4545 mrs x1, isr_el14646 cbz x1, 1f4747+4848+ // Ensure that __guest_enter() always provides a context4949+ // synchronization event so that callers don't need ISBs for anything5050+ // that would usually be synchronized by the ERET.5151+ isb4752 mov x0, #ARM_EXCEPTION_IRQ4853 ret4954
+111-37
arch/arm64/kvm/hyp/include/hyp/switch.h
···326326 return __get_fault_info(vcpu->arch.fault.esr_el2, &vcpu->arch.fault);327327}328328329329-static bool kvm_hyp_handle_mops(struct kvm_vcpu *vcpu, u64 *exit_code)329329+static inline bool kvm_hyp_handle_mops(struct kvm_vcpu *vcpu, u64 *exit_code)330330{331331 *vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);332332 arm64_mops_reset_regs(vcpu_gp_regs(vcpu), vcpu->arch.fault.esr_el2);···375375 true);376376}377377378378-static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu);378378+static inline void fpsimd_lazy_switch_to_guest(struct kvm_vcpu *vcpu)379379+{380380+ u64 zcr_el1, zcr_el2;381381+382382+ if (!guest_owns_fp_regs())383383+ return;384384+385385+ if (vcpu_has_sve(vcpu)) {386386+ /* A guest hypervisor may restrict the effective max VL. */387387+ if (vcpu_has_nv(vcpu) && !is_hyp_ctxt(vcpu))388388+ zcr_el2 = __vcpu_sys_reg(vcpu, ZCR_EL2);389389+ else390390+ zcr_el2 = vcpu_sve_max_vq(vcpu) - 1;391391+392392+ write_sysreg_el2(zcr_el2, SYS_ZCR);393393+394394+ zcr_el1 = __vcpu_sys_reg(vcpu, vcpu_sve_zcr_elx(vcpu));395395+ write_sysreg_el1(zcr_el1, SYS_ZCR);396396+ }397397+}398398+399399+static inline void fpsimd_lazy_switch_to_host(struct kvm_vcpu *vcpu)400400+{401401+ u64 zcr_el1, zcr_el2;402402+403403+ if (!guest_owns_fp_regs())404404+ return;405405+406406+ /*407407+ * When the guest owns the FP regs, we know that guest+hyp traps for408408+ * any FPSIMD/SVE/SME features exposed to the guest have been disabled409409+ * by either fpsimd_lazy_switch_to_guest() or kvm_hyp_handle_fpsimd()410410+ * prior to __guest_entry(). 
As __guest_entry() guarantees a context411411+ * synchronization event, we don't need an ISB here to avoid taking412412+ * traps for anything that was exposed to the guest.413413+ */414414+ if (vcpu_has_sve(vcpu)) {415415+ zcr_el1 = read_sysreg_el1(SYS_ZCR);416416+ __vcpu_sys_reg(vcpu, vcpu_sve_zcr_elx(vcpu)) = zcr_el1;417417+418418+ /*419419+ * The guest's state is always saved using the guest's max VL.420420+ * Ensure that the host has the guest's max VL active such that421421+ * the host can save the guest's state lazily, but don't422422+ * artificially restrict the host to the guest's max VL.423423+ */424424+ if (has_vhe()) {425425+ zcr_el2 = vcpu_sve_max_vq(vcpu) - 1;426426+ write_sysreg_el2(zcr_el2, SYS_ZCR);427427+ } else {428428+ zcr_el2 = sve_vq_from_vl(kvm_host_sve_max_vl) - 1;429429+ write_sysreg_el2(zcr_el2, SYS_ZCR);430430+431431+ zcr_el1 = vcpu_sve_max_vq(vcpu) - 1;432432+ write_sysreg_el1(zcr_el1, SYS_ZCR);433433+ }434434+ }435435+}436436+437437+static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)438438+{439439+ /*440440+ * Non-protected kvm relies on the host restoring its sve state.441441+ * Protected kvm restores the host's sve state as not to reveal that442442+ * fpsimd was used by a guest nor leak upper sve bits.443443+ */444444+ if (system_supports_sve()) {445445+ __hyp_sve_save_host();446446+447447+ /* Re-enable SVE traps if not supported for the guest vcpu. */448448+ if (!vcpu_has_sve(vcpu))449449+ cpacr_clear_set(CPACR_EL1_ZEN, 0);450450+451451+ } else {452452+ __fpsimd_save_state(host_data_ptr(host_ctxt.fp_regs));453453+ }454454+455455+ if (kvm_has_fpmr(kern_hyp_va(vcpu->kvm)))456456+ *host_data_ptr(fpmr) = read_sysreg_s(SYS_FPMR);457457+}458458+379459380460/*381461 * We trap the first access to the FP/SIMD to save the host context and···463383 * If FP/SIMD is not implemented, handle the trap and inject an undefined464384 * instruction exception to the guest. 
Similarly for trapped SVE accesses.465385 */466466-static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)386386+static inline bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)467387{468388 bool sve_guest;469389 u8 esr_ec;···505425 isb();506426507427 /* Write out the host state if it's in the registers */508508- if (host_owns_fp_regs())428428+ if (is_protected_kvm_enabled() && host_owns_fp_regs())509429 kvm_hyp_save_fpsimd_host(vcpu);510430511431 /* Restore the guest state */···581501 return true;582502}583503504504+/* Open-coded version of timer_get_offset() to allow for kern_hyp_va() */505505+static inline u64 hyp_timer_get_offset(struct arch_timer_context *ctxt)506506+{507507+ u64 offset = 0;508508+509509+ if (ctxt->offset.vm_offset)510510+ offset += *kern_hyp_va(ctxt->offset.vm_offset);511511+ if (ctxt->offset.vcpu_offset)512512+ offset += *kern_hyp_va(ctxt->offset.vcpu_offset);513513+514514+ return offset;515515+}516516+584517static inline u64 compute_counter_value(struct arch_timer_context *ctxt)585518{586586- return arch_timer_read_cntpct_el0() - timer_get_offset(ctxt);519519+ return arch_timer_read_cntpct_el0() - hyp_timer_get_offset(ctxt);587520}588521589522static bool kvm_handle_cntxct(struct kvm_vcpu *vcpu)···680587 return true;681588}682589683683-static bool kvm_hyp_handle_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)590590+static inline bool kvm_hyp_handle_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)684591{685592 if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM) &&686593 handle_tx2_tvm(vcpu))···700607 return false;701608}702609703703-static bool kvm_hyp_handle_cp15_32(struct kvm_vcpu *vcpu, u64 *exit_code)610610+static inline bool kvm_hyp_handle_cp15_32(struct kvm_vcpu *vcpu, u64 *exit_code)704611{705612 if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&706613 __vgic_v3_perform_cpuif_access(vcpu) == 1)···709616 return false;710617}711618712712-static bool kvm_hyp_handle_memory_fault(struct kvm_vcpu *vcpu, 
u64 *exit_code)619619+static inline bool kvm_hyp_handle_memory_fault(struct kvm_vcpu *vcpu,620620+ u64 *exit_code)713621{714622 if (!__populate_fault_info(vcpu))715623 return true;716624717625 return false;718626}719719-static bool kvm_hyp_handle_iabt_low(struct kvm_vcpu *vcpu, u64 *exit_code)720720- __alias(kvm_hyp_handle_memory_fault);721721-static bool kvm_hyp_handle_watchpt_low(struct kvm_vcpu *vcpu, u64 *exit_code)722722- __alias(kvm_hyp_handle_memory_fault);627627+#define kvm_hyp_handle_iabt_low kvm_hyp_handle_memory_fault628628+#define kvm_hyp_handle_watchpt_low kvm_hyp_handle_memory_fault723629724724-static bool kvm_hyp_handle_dabt_low(struct kvm_vcpu *vcpu, u64 *exit_code)630630+static inline bool kvm_hyp_handle_dabt_low(struct kvm_vcpu *vcpu, u64 *exit_code)725631{726632 if (kvm_hyp_handle_memory_fault(vcpu, exit_code))727633 return true;···750658751659typedef bool (*exit_handler_fn)(struct kvm_vcpu *, u64 *);752660753753-static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm_vcpu *vcpu);754754-755755-static void early_exit_filter(struct kvm_vcpu *vcpu, u64 *exit_code);756756-757661/*758662 * Allow the hypervisor to handle the exit with an exit handler if it has one.759663 *760664 * Returns true if the hypervisor handled the exit, and control should go back761665 * to the guest, or false if it hasn't.762666 */763763-static inline bool kvm_hyp_handle_exit(struct kvm_vcpu *vcpu, u64 *exit_code)667667+static inline bool kvm_hyp_handle_exit(struct kvm_vcpu *vcpu, u64 *exit_code,668668+ const exit_handler_fn *handlers)764669{765765- const exit_handler_fn *handlers = kvm_get_exit_handler_array(vcpu);766766- exit_handler_fn fn;767767-768768- fn = handlers[kvm_vcpu_trap_get_class(vcpu)];769769-670670+ exit_handler_fn fn = handlers[kvm_vcpu_trap_get_class(vcpu)];770671 if (fn)771672 return fn(vcpu, exit_code);772673···789704 * the guest, false when we should restore the host state and return to the790705 * main run loop.791706 */792792-static inline 
bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)707707+static inline bool __fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code,708708+ const exit_handler_fn *handlers)793709{794794- /*795795- * Save PSTATE early so that we can evaluate the vcpu mode796796- * early on.797797- */798798- synchronize_vcpu_pstate(vcpu, exit_code);799799-800800- /*801801- * Check whether we want to repaint the state one way or802802- * another.803803- */804804- early_exit_filter(vcpu, exit_code);805805-806710 if (ARM_EXCEPTION_CODE(*exit_code) != ARM_EXCEPTION_IRQ)807711 vcpu->arch.fault.esr_el2 = read_sysreg_el2(SYS_ESR);808712···821747 goto exit;822748823749 /* Check if there's an exit handler and allow it to handle the exit. */824824- if (kvm_hyp_handle_exit(vcpu, exit_code))750750+ if (kvm_hyp_handle_exit(vcpu, exit_code, handlers))825751 goto guest;826752exit:827753 /* Return to the host kernel and handle the exit */
+31-8
arch/arm64/kvm/hyp/nvhe/hyp-main.c
···55 */6677#include <hyp/adjust_pc.h>88+#include <hyp/switch.h>89910#include <asm/pgtable-types.h>1011#include <asm/kvm_asm.h>···8483 if (system_supports_sve())8584 __hyp_sve_restore_host();8685 else8787- __fpsimd_restore_state(*host_data_ptr(fpsimd_state));8686+ __fpsimd_restore_state(host_data_ptr(host_ctxt.fp_regs));88878988 if (has_fpmr)9089 write_sysreg_s(*host_data_ptr(fpmr), SYS_FPMR);···9291 *host_data_ptr(fp_owner) = FP_STATE_HOST_OWNED;9392}94939494+static void flush_debug_state(struct pkvm_hyp_vcpu *hyp_vcpu)9595+{9696+ struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;9797+9898+ hyp_vcpu->vcpu.arch.debug_owner = host_vcpu->arch.debug_owner;9999+100100+ if (kvm_guest_owns_debug_regs(&hyp_vcpu->vcpu))101101+ hyp_vcpu->vcpu.arch.vcpu_debug_state = host_vcpu->arch.vcpu_debug_state;102102+ else if (kvm_host_owns_debug_regs(&hyp_vcpu->vcpu))103103+ hyp_vcpu->vcpu.arch.external_debug_state = host_vcpu->arch.external_debug_state;104104+}105105+106106+static void sync_debug_state(struct pkvm_hyp_vcpu *hyp_vcpu)107107+{108108+ struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;109109+110110+ if (kvm_guest_owns_debug_regs(&hyp_vcpu->vcpu))111111+ host_vcpu->arch.vcpu_debug_state = hyp_vcpu->vcpu.arch.vcpu_debug_state;112112+ else if (kvm_host_owns_debug_regs(&hyp_vcpu->vcpu))113113+ host_vcpu->arch.external_debug_state = hyp_vcpu->vcpu.arch.external_debug_state;114114+}115115+95116static void flush_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)96117{97118 struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;9811999120 fpsimd_sve_flush();121121+ flush_debug_state(hyp_vcpu);100122101123 hyp_vcpu->vcpu.arch.ctxt = host_vcpu->arch.ctxt;102124···147123 unsigned int i;148124149125 fpsimd_sve_sync(&hyp_vcpu->vcpu);126126+ sync_debug_state(hyp_vcpu);150127151128 host_vcpu->arch.ctxt = hyp_vcpu->vcpu.arch.ctxt;152129···225200226201 sync_hyp_vcpu(hyp_vcpu);227202 } else {203203+ struct kvm_vcpu *vcpu = kern_hyp_va(host_vcpu);204204+228205 /* The host is fully trusted, run its vCPU 
directly. */229229- ret = __kvm_vcpu_run(kern_hyp_va(host_vcpu));206206+ fpsimd_lazy_switch_to_guest(vcpu);207207+ ret = __kvm_vcpu_run(vcpu);208208+ fpsimd_lazy_switch_to_host(vcpu);230209 }231210out:232211 cpu_reg(host_ctxt, 1) = ret;···679650 break;680651 case ESR_ELx_EC_SMC64:681652 handle_host_smc(host_ctxt);682682- break;683683- case ESR_ELx_EC_SVE:684684- cpacr_clear_set(0, CPACR_EL1_ZEN);685685- isb();686686- sve_cond_update_zcr_vq(sve_vq_from_vl(kvm_host_sve_max_vl) - 1,687687- SYS_ZCR_EL2);688653 break;689654 case ESR_ELx_EC_IABT_LOW:690655 case ESR_ELx_EC_DABT_LOW:
+41-35
arch/arm64/kvm/hyp/nvhe/mem_protect.c
···943943 ret = kvm_pgtable_get_leaf(&vm->pgt, ipa, &pte, &level);944944 if (ret)945945 return ret;946946- if (level != KVM_PGTABLE_LAST_LEVEL)947947- return -E2BIG;948946 if (!kvm_pte_valid(pte))949947 return -ENOENT;948948+ if (level != KVM_PGTABLE_LAST_LEVEL)949949+ return -E2BIG;950950951951 state = guest_get_page_state(pte, ipa);952952 if (state != PKVM_PAGE_SHARED_BORROWED)···998998 return ret;999999}1000100010011001-int __pkvm_host_relax_perms_guest(u64 gfn, struct pkvm_hyp_vcpu *vcpu, enum kvm_pgtable_prot prot)10011001+static void assert_host_shared_guest(struct pkvm_hyp_vm *vm, u64 ipa)10021002{10031003- struct pkvm_hyp_vm *vm = pkvm_hyp_vcpu_to_hyp_vm(vcpu);10041004- u64 ipa = hyp_pfn_to_phys(gfn);10051003 u64 phys;10061004 int ret;1007100510081008- if (prot & ~KVM_PGTABLE_PROT_RWX)10091009- return -EINVAL;10061006+ if (!IS_ENABLED(CONFIG_NVHE_EL2_DEBUG))10071007+ return;1010100810111009 host_lock_component();10121010 guest_lock_component(vm);1013101110141012 ret = __check_host_shared_guest(vm, &phys, ipa);10151015- if (!ret)10161016- ret = kvm_pgtable_stage2_relax_perms(&vm->pgt, ipa, prot, 0);1017101310181014 guest_unlock_component(vm);10191015 host_unlock_component();10161016+10171017+ WARN_ON(ret && ret != -ENOENT);10181018+}10191019+10201020+int __pkvm_host_relax_perms_guest(u64 gfn, struct pkvm_hyp_vcpu *vcpu, enum kvm_pgtable_prot prot)10211021+{10221022+ struct pkvm_hyp_vm *vm = pkvm_hyp_vcpu_to_hyp_vm(vcpu);10231023+ u64 ipa = hyp_pfn_to_phys(gfn);10241024+ int ret;10251025+10261026+ if (pkvm_hyp_vm_is_protected(vm))10271027+ return -EPERM;10281028+10291029+ if (prot & ~KVM_PGTABLE_PROT_RWX)10301030+ return -EINVAL;10311031+10321032+ assert_host_shared_guest(vm, ipa);10331033+ guest_lock_component(vm);10341034+ ret = kvm_pgtable_stage2_relax_perms(&vm->pgt, ipa, prot, 0);10351035+ guest_unlock_component(vm);1020103610211037 return ret;10221038}···10401024int __pkvm_host_wrprotect_guest(u64 gfn, struct pkvm_hyp_vm *vm)10411025{10421026 u64 ipa = 
hyp_pfn_to_phys(gfn);10431043- u64 phys;10441027 int ret;1045102810461046- host_lock_component();10291029+ if (pkvm_hyp_vm_is_protected(vm))10301030+ return -EPERM;10311031+10321032+ assert_host_shared_guest(vm, ipa);10471033 guest_lock_component(vm);10481048-10491049- ret = __check_host_shared_guest(vm, &phys, ipa);10501050- if (!ret)10511051- ret = kvm_pgtable_stage2_wrprotect(&vm->pgt, ipa, PAGE_SIZE);10521052-10341034+ ret = kvm_pgtable_stage2_wrprotect(&vm->pgt, ipa, PAGE_SIZE);10531035 guest_unlock_component(vm);10541054- host_unlock_component();1055103610561037 return ret;10571038}···10561043int __pkvm_host_test_clear_young_guest(u64 gfn, bool mkold, struct pkvm_hyp_vm *vm)10571044{10581045 u64 ipa = hyp_pfn_to_phys(gfn);10591059- u64 phys;10601046 int ret;1061104710621062- host_lock_component();10481048+ if (pkvm_hyp_vm_is_protected(vm))10491049+ return -EPERM;10501050+10511051+ assert_host_shared_guest(vm, ipa);10631052 guest_lock_component(vm);10641064-10651065- ret = __check_host_shared_guest(vm, &phys, ipa);10661066- if (!ret)10671067- ret = kvm_pgtable_stage2_test_clear_young(&vm->pgt, ipa, PAGE_SIZE, mkold);10681068-10531053+ ret = kvm_pgtable_stage2_test_clear_young(&vm->pgt, ipa, PAGE_SIZE, mkold);10691054 guest_unlock_component(vm);10701070- host_unlock_component();1071105510721056 return ret;10731057}···10731063{10741064 struct pkvm_hyp_vm *vm = pkvm_hyp_vcpu_to_hyp_vm(vcpu);10751065 u64 ipa = hyp_pfn_to_phys(gfn);10761076- u64 phys;10771077- int ret;1078106610791079- host_lock_component();10671067+ if (pkvm_hyp_vm_is_protected(vm))10681068+ return -EPERM;10691069+10701070+ assert_host_shared_guest(vm, ipa);10801071 guest_lock_component(vm);10811081-10821082- ret = __check_host_shared_guest(vm, &phys, ipa);10831083- if (!ret)10841084- kvm_pgtable_stage2_mkyoung(&vm->pgt, ipa, 0);10851085-10721072+ kvm_pgtable_stage2_mkyoung(&vm->pgt, ipa, 0);10861073 guest_unlock_component(vm);10871087- host_unlock_component();1088107410891089- return 
ret;10751075+ return 0;10901076}
+45-44
arch/arm64/kvm/hyp/nvhe/switch.c
···3939{4040 u64 val = CPTR_EL2_TAM; /* Same bit irrespective of E2H */41414242+ if (!guest_owns_fp_regs())4343+ __activate_traps_fpsimd32(vcpu);4444+4245 if (has_hvhe()) {4346 val |= CPACR_EL1_TTA;4447···5047 if (vcpu_has_sve(vcpu))5148 val |= CPACR_EL1_ZEN;5249 }5050+5151+ write_sysreg(val, cpacr_el1);5352 } else {5453 val |= CPTR_EL2_TTA | CPTR_NVHE_EL2_RES1;5554···66616762 if (!guest_owns_fp_regs())6863 val |= CPTR_EL2_TFP;6464+6565+ write_sysreg(val, cptr_el2);6966 }6767+}70687171- if (!guest_owns_fp_regs())7272- __activate_traps_fpsimd32(vcpu);6969+static void __deactivate_cptr_traps(struct kvm_vcpu *vcpu)7070+{7171+ if (has_hvhe()) {7272+ u64 val = CPACR_EL1_FPEN;73737474- kvm_write_cptr_el2(val);7474+ if (cpus_have_final_cap(ARM64_SVE))7575+ val |= CPACR_EL1_ZEN;7676+ if (cpus_have_final_cap(ARM64_SME))7777+ val |= CPACR_EL1_SMEN;7878+7979+ write_sysreg(val, cpacr_el1);8080+ } else {8181+ u64 val = CPTR_NVHE_EL2_RES1;8282+8383+ if (!cpus_have_final_cap(ARM64_SVE))8484+ val |= CPTR_EL2_TZ;8585+ if (!cpus_have_final_cap(ARM64_SME))8686+ val |= CPTR_EL2_TSM;8787+8888+ write_sysreg(val, cptr_el2);8989+ }7590}76917792static void __activate_traps(struct kvm_vcpu *vcpu)···144119145120 write_sysreg(this_cpu_ptr(&kvm_init_params)->hcr_el2, hcr_el2);146121147147- kvm_reset_cptr_el2(vcpu);122122+ __deactivate_cptr_traps(vcpu);148123 write_sysreg(__kvm_hyp_host_vector, vbar_el2);149124}150125···217192 kvm_handle_pvm_sysreg(vcpu, exit_code));218193}219194220220-static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)221221-{222222- /*223223- * Non-protected kvm relies on the host restoring its sve state.224224- * Protected kvm restores the host's sve state as not to reveal that225225- * fpsimd was used by a guest nor leak upper sve bits.226226- */227227- if (unlikely(is_protected_kvm_enabled() && system_supports_sve())) {228228- __hyp_sve_save_host();229229-230230- /* Re-enable SVE traps if not supported for the guest vcpu. 
*/231231- if (!vcpu_has_sve(vcpu))232232- cpacr_clear_set(CPACR_EL1_ZEN, 0);233233-234234- } else {235235- __fpsimd_save_state(*host_data_ptr(fpsimd_state));236236- }237237-238238- if (kvm_has_fpmr(kern_hyp_va(vcpu->kvm))) {239239- u64 val = read_sysreg_s(SYS_FPMR);240240-241241- if (unlikely(is_protected_kvm_enabled()))242242- *host_data_ptr(fpmr) = val;243243- else244244- **host_data_ptr(fpmr_ptr) = val;245245- }246246-}247247-248195static const exit_handler_fn hyp_exit_handlers[] = {249196 [0 ... ESR_ELx_EC_MAX] = NULL,250197 [ESR_ELx_EC_CP15_32] = kvm_hyp_handle_cp15_32,···248251 return hyp_exit_handlers;249252}250253251251-/*252252- * Some guests (e.g., protected VMs) are not be allowed to run in AArch32.253253- * The ARMv8 architecture does not give the hypervisor a mechanism to prevent a254254- * guest from dropping to AArch32 EL0 if implemented by the CPU. If the255255- * hypervisor spots a guest in such a state ensure it is handled, and don't256256- * trust the host to spot or fix it. The check below is based on the one in257257- * kvm_arch_vcpu_ioctl_run().258258- *259259- * Returns false if the guest ran in AArch32 when it shouldn't have, and260260- * thus should exit to the host, or true if a the guest run loop can continue.261261- */262262-static void early_exit_filter(struct kvm_vcpu *vcpu, u64 *exit_code)254254+static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)263255{256256+ const exit_handler_fn *handlers = kvm_get_exit_handler_array(vcpu);257257+258258+ synchronize_vcpu_pstate(vcpu, exit_code);259259+260260+ /*261261+ * Some guests (e.g., protected VMs) are not allowed to run in262262+ * AArch32. The ARMv8 architecture does not give the hypervisor a263263+ * mechanism to prevent a guest from dropping to AArch32 EL0 if264264+ * implemented by the CPU. If the hypervisor spots a guest in such a265265+ * state, ensure it is handled, and don't trust the host to spot or fix266266+ * it. 
The check below is based on the one in267267+ * kvm_arch_vcpu_ioctl_run().268268+ */264269 if (unlikely(vcpu_is_protected(vcpu) && vcpu_mode_is_32bit(vcpu))) {265270 /*266271 * As we have caught the guest red-handed, decide that it isn't···275276 *exit_code &= BIT(ARM_EXIT_WITH_SERROR_BIT);276277 *exit_code |= ARM_EXCEPTION_IL;277278 }279279+280280+ return __fixup_guest_exit(vcpu, exit_code, handlers);278281}279282280283/* Switch to the guest for legacy non-VHE systems */
+19-14
arch/arm64/kvm/hyp/vhe/switch.c
···136136 write_sysreg(val, cpacr_el1);137137}138138139139+static void __deactivate_cptr_traps(struct kvm_vcpu *vcpu)140140+{141141+ u64 val = CPACR_EL1_FPEN | CPACR_EL1_ZEN_EL1EN;142142+143143+ if (cpus_have_final_cap(ARM64_SME))144144+ val |= CPACR_EL1_SMEN_EL1EN;145145+146146+ write_sysreg(val, cpacr_el1);147147+}148148+139149static void __activate_traps(struct kvm_vcpu *vcpu)140150{141151 u64 val;···217207 */218208 asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));219209220220- kvm_reset_cptr_el2(vcpu);210210+ __deactivate_cptr_traps(vcpu);221211222212 if (!arm64_kernel_unmapped_at_el0())223213 host_vectors = __this_cpu_read(this_cpu_vector);···423413 return true;424414}425415426426-static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)427427-{428428- __fpsimd_save_state(*host_data_ptr(fpsimd_state));429429-430430- if (kvm_has_fpmr(vcpu->kvm))431431- **host_data_ptr(fpmr_ptr) = read_sysreg_s(SYS_FPMR);432432-}433433-434416static bool kvm_hyp_handle_tlbi_el2(struct kvm_vcpu *vcpu, u64 *exit_code)435417{436418 int ret = -EINVAL;···540538 [ESR_ELx_EC_MOPS] = kvm_hyp_handle_mops,541539};542540543543-static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm_vcpu *vcpu)541541+static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)544542{545545- return hyp_exit_handlers;546546-}543543+ synchronize_vcpu_pstate(vcpu, exit_code);547544548548-static void early_exit_filter(struct kvm_vcpu *vcpu, u64 *exit_code)549549-{550545 /*551546 * If we were in HYP context on entry, adjust the PSTATE view552547 * so that the usual helpers work correctly.···563564 *vcpu_cpsr(vcpu) &= ~(PSR_MODE_MASK | PSR_MODE32_BIT);564565 *vcpu_cpsr(vcpu) |= mode;565566 }567567+568568+ return __fixup_guest_exit(vcpu, exit_code, hyp_exit_handlers);566569}567570568571/* Switch to the guest for VHE systems running in EL2 */···578577 guest_ctxt = &vcpu->arch.ctxt;579578580579 sysreg_save_host_state_vhe(host_ctxt);580580+581581+ 
fpsimd_lazy_switch_to_guest(vcpu);581582582583 /*583584 * Note that ARM erratum 1165522 requires us to configure both stage 1···604601 sysreg_save_guest_state_vhe(guest_ctxt);605602606603 __deactivate_traps(vcpu);604604+605605+ fpsimd_lazy_switch_to_host(vcpu);607606608607 sysreg_restore_host_state_vhe(host_ctxt);609608
+5-4
arch/arm64/kvm/nested.c
···6767 if (!tmp)6868 return -ENOMEM;69697070+ swap(kvm->arch.nested_mmus, tmp);7171+7072 /*7173 * If we went through a reallocation, adjust the MMU back-pointers in7274 * the previously initialised kvm_pgtable structures.7375 */7476 if (kvm->arch.nested_mmus != tmp)7577 for (int i = 0; i < kvm->arch.nested_mmus_size; i++)7676- tmp[i].pgt->mmu = &tmp[i];7878+ kvm->arch.nested_mmus[i].pgt->mmu = &kvm->arch.nested_mmus[i];77797880 for (int i = kvm->arch.nested_mmus_size; !ret && i < num_mmus; i++)7979- ret = init_nested_s2_mmu(kvm, &tmp[i]);8181+ ret = init_nested_s2_mmu(kvm, &kvm->arch.nested_mmus[i]);80828183 if (ret) {8284 for (int i = kvm->arch.nested_mmus_size; i < num_mmus; i++)8383- kvm_free_stage2_pgd(&tmp[i]);8585+ kvm_free_stage2_pgd(&kvm->arch.nested_mmus[i]);84868587 return ret;8688 }87898890 kvm->arch.nested_mmus_size = num_mmus;8989- kvm->arch.nested_mmus = tmp;90919192 return 0;9293}
···3434 *3535 * CPU Interface:3636 *3737- * - kvm_vgic_vcpu_init(): initialization of static data that3838- * doesn't depend on any sizing information or emulation type. No3939- * allocation is allowed there.3737+ * - kvm_vgic_vcpu_init(): initialization of static data that doesn't depend3838+ * on any sizing information. Private interrupts are allocated if not3939+ * already allocated at vgic-creation time.4040 */41414242/* EARLY INIT */···5757}58585959/* CREATION */6060+6161+static int vgic_allocate_private_irqs_locked(struct kvm_vcpu *vcpu, u32 type);60626163/**6264 * kvm_vgic_create: triggered by the instantiation of the VGIC device by···111109112110 if (atomic_read(&kvm->online_vcpus) > kvm->max_vcpus) {113111 ret = -E2BIG;112112+ goto out_unlock;113113+ }114114+115115+ kvm_for_each_vcpu(i, vcpu, kvm) {116116+ ret = vgic_allocate_private_irqs_locked(vcpu, type);117117+ if (ret)118118+ break;119119+ }120120+121121+ if (ret) {122122+ kvm_for_each_vcpu(i, vcpu, kvm) {123123+ struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;124124+ kfree(vgic_cpu->private_irqs);125125+ vgic_cpu->private_irqs = NULL;126126+ }127127+114128 goto out_unlock;115129 }116130···198180 return 0;199181}200182201201-static int vgic_allocate_private_irqs_locked(struct kvm_vcpu *vcpu)183183+static int vgic_allocate_private_irqs_locked(struct kvm_vcpu *vcpu, u32 type)202184{203185 struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;204186 int i;···236218 /* PPIs */237219 irq->config = VGIC_CONFIG_LEVEL;238220 }221221+222222+ switch (type) {223223+ case KVM_DEV_TYPE_ARM_VGIC_V3:224224+ irq->group = 1;225225+ irq->mpidr = kvm_vcpu_get_mpidr_aff(vcpu);226226+ break;227227+ case KVM_DEV_TYPE_ARM_VGIC_V2:228228+ irq->group = 0;229229+ irq->targets = BIT(vcpu->vcpu_id);230230+ break;231231+ }239232 }240233241234 return 0;242235}243236244244-static int vgic_allocate_private_irqs(struct kvm_vcpu *vcpu)237237+static int vgic_allocate_private_irqs(struct kvm_vcpu *vcpu, u32 type)245238{246239 int 
ret;247240248241 mutex_lock(&vcpu->kvm->arch.config_lock);249249- ret = vgic_allocate_private_irqs_locked(vcpu);242242+ ret = vgic_allocate_private_irqs_locked(vcpu, type);250243 mutex_unlock(&vcpu->kvm->arch.config_lock);251244252245 return ret;···287258 if (!irqchip_in_kernel(vcpu->kvm))288259 return 0;289260290290- ret = vgic_allocate_private_irqs(vcpu);261261+ ret = vgic_allocate_private_irqs(vcpu, dist->vgic_model);291262 if (ret)292263 return ret;293264···324295{325296 struct vgic_dist *dist = &kvm->arch.vgic;326297 struct kvm_vcpu *vcpu;327327- int ret = 0, i;298298+ int ret = 0;328299 unsigned long idx;329300330301 lockdep_assert_held(&kvm->arch.config_lock);···343314 ret = kvm_vgic_dist_init(kvm, dist->nr_spis);344315 if (ret)345316 goto out;346346-347347- /* Initialize groups on CPUs created before the VGIC type was known */348348- kvm_for_each_vcpu(idx, vcpu, kvm) {349349- ret = vgic_allocate_private_irqs_locked(vcpu);350350- if (ret)351351- goto out;352352-353353- for (i = 0; i < VGIC_NR_PRIVATE_IRQS; i++) {354354- struct vgic_irq *irq = vgic_get_vcpu_irq(vcpu, i);355355-356356- switch (dist->vgic_model) {357357- case KVM_DEV_TYPE_ARM_VGIC_V3:358358- irq->group = 1;359359- irq->mpidr = kvm_vcpu_get_mpidr_aff(vcpu);360360- break;361361- case KVM_DEV_TYPE_ARM_VGIC_V2:362362- irq->group = 0;363363- irq->targets = 1U << idx;364364- break;365365- default:366366- ret = -EINVAL;367367- }368368-369369- vgic_put_irq(kvm, irq);370370-371371- if (ret)372372- goto out;373373- }374374- }375317376318 /*377319 * If we have GICv4.1 enabled, unconditionally request enable the
+7
arch/arm64/mm/trans_pgd.c
···162162 unsigned long next;163163 unsigned long addr = start;164164165165+ if (pgd_none(READ_ONCE(*dst_pgdp))) {166166+ dst_p4dp = trans_alloc(info);167167+ if (!dst_p4dp)168168+ return -ENOMEM;169169+ pgd_populate(NULL, dst_pgdp, dst_p4dp);170170+ }171171+165172 dst_p4dp = p4d_offset(dst_pgdp, start);166173 src_p4dp = p4d_offset(src_pgdp, start);167174 do {
-21
arch/loongarch/include/asm/cpu-info.h
···7676#define cpu_family_string() __cpu_family[raw_smp_processor_id()]7777#define cpu_full_name_string() __cpu_full_name[raw_smp_processor_id()]78787979-struct seq_file;8080-struct notifier_block;8181-8282-extern int register_proc_cpuinfo_notifier(struct notifier_block *nb);8383-extern int proc_cpuinfo_notifier_call_chain(unsigned long val, void *v);8484-8585-#define proc_cpuinfo_notifier(fn, pri) \8686-({ \8787- static struct notifier_block fn##_nb = { \8888- .notifier_call = fn, \8989- .priority = pri \9090- }; \9191- \9292- register_proc_cpuinfo_notifier(&fn##_nb); \9393-})9494-9595-struct proc_cpuinfo_notifier_args {9696- struct seq_file *m;9797- unsigned long n;9898-};9999-10079static inline bool cpus_are_siblings(int cpua, int cpub)10180{10281 struct cpuinfo_loongarch *infoa = &cpu_data[cpua];
+2
arch/loongarch/include/asm/smp.h
···7777#define SMP_IRQ_WORK BIT(ACTION_IRQ_WORK)7878#define SMP_CLEAR_VECTOR BIT(ACTION_CLEAR_VECTOR)79798080+struct seq_file;8181+8082struct secondary_data {8183 unsigned long stack;8284 unsigned long thread_info;
+15-13
arch/loongarch/kernel/genex.S
···18181919 .align 52020SYM_FUNC_START(__arch_cpu_idle)2121- /* start of rollback region */2222- LONG_L t0, tp, TI_FLAGS2323- nop2424- andi t0, t0, _TIF_NEED_RESCHED2525- bnez t0, 1f2626- nop2727- nop2828- nop2121+ /* start of idle interrupt region */2222+ ori t0, zero, CSR_CRMD_IE2323+ /* idle instruction needs irq enabled */2424+ csrxchg t0, t0, LOONGARCH_CSR_CRMD2525+ /*2626+ * If an interrupt lands here, between enabling interrupts above and2727+ * going idle on the next instruction, we must *NOT* go idle since the2828+ * interrupt could have set TIF_NEED_RESCHED or caused a timer to need2929+ * reprogramming. Fall through -- see handle_vint() below -- and have3030+ * the idle loop take care of things.3131+ */2932 idle 03030- /* end of rollback region */3333+ /* end of idle interrupt region */31341: jr ra3235SYM_FUNC_END(__arch_cpu_idle)3336···3835 UNWIND_HINT_UNDEFINED3936 BACKUP_T0T14037 SAVE_ALL4141- la_abs t1, __arch_cpu_idle3838+ la_abs t1, 1b4239 LONG_L t0, sp, PT_ERA4343- /* 32 byte rollback region */4444- ori t0, t0, 0x1f4545- xori t0, t0, 0x1f4040+ /* 3 instructions idle interrupt region */4141+ ori t0, t0, 0b11004642 bne t0, t1, 1f4743 LONG_S t0, sp, PT_ERA48441: move a0, sp
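The new ERA fix-up replaces the old 32-byte rollback with a single `ori t0, t0, 0b1100` before the compare: assuming the layout shown (function 32-byte aligned via `.align 5`, the three region instructions at offsets 0, 4 and 8, and the `1:` label at offset 12), OR-ing `0xc` into any in-region ERA yields exactly the label address. A userspace sketch of that check, with a made-up base address:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical addresses; the real region base comes from ".align 5". */
#define IDLE_REGION_BASE 0x9000000000001000ULL /* ori  (offset 0)  */
#define IDLE_LABEL (IDLE_REGION_BASE + 12)     /* "1:" (offset 12) */

/*
 * Mirrors "ori t0, t0, 0b1100; bne t0, t1, 1f": true when the saved
 * exception PC lies within the 3-instruction idle interrupt region,
 * in which case the handler rewrites PT_ERA to the "1:" label.
 */
static int era_in_idle_region(uint64_t era)
{
	return (era | 0xc) == IDLE_LABEL;
}
```

The trick only needs the region to stay within one 16-byte-aligned block, which the function alignment guarantees.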
···303303 * TOE=0: Trap on Exception.304304 * TIT=0: Trap on Timer.305305 */306306- if (env & CSR_GCFG_GCIP_ALL)306306+ if (env & CSR_GCFG_GCIP_SECURE)307307 gcfg |= CSR_GCFG_GCI_SECURE;308308- if (env & CSR_GCFG_MATC_ROOT)308308+ if (env & CSR_GCFG_MATP_ROOT)309309 gcfg |= CSR_GCFG_MATC_ROOT;310310311311 write_csr_gcfg(gcfg);
+1-1
arch/loongarch/kvm/switch.S
···8585 * Guest CRMD comes from separate GCSR_CRMD register8686 */8787 ori t0, zero, CSR_PRMD_PIE8888- csrxchg t0, t0, LOONGARCH_CSR_PRMD8888+ csrwr t0, LOONGARCH_CSR_PRMD89899090 /* Set PVM bit to setup ertn to guest context */9191 ori t0, zero, CSR_GSTAT_PVM
-3
arch/loongarch/kvm/vcpu.c
···1548154815491549 /* Restore timer state regardless */15501550 kvm_restore_timer(vcpu);15511551-15521552- /* Control guest page CCA attribute */15531553- change_csr_gcfg(CSR_GCFG_MATC_MASK, CSR_GCFG_MATC_ROOT);15541551 kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu);1555155215561553 /* Restore hardware PMU CSRs */
···2727 */2828struct pt_regs {2929#ifdef CONFIG_32BIT3030- /* Pad bytes for argument save space on the stack. */3131- unsigned long pad0[8];3030+ /* Saved syscall stack arguments; entries 0-3 unused. */3131+ unsigned long args[8];3232#endif33333434 /* Saved main processor registers. */
+8-24
arch/mips/include/asm/syscall.h
···5757static inline void mips_get_syscall_arg(unsigned long *arg,5858 struct task_struct *task, struct pt_regs *regs, unsigned int n)5959{6060- unsigned long usp __maybe_unused = regs->regs[29];6161-6060+#ifdef CONFIG_32BIT6261 switch (n) {6362 case 0: case 1: case 2: case 3:6463 *arg = regs->regs[4 + n];6565-6664 return;6767-6868-#ifdef CONFIG_32BIT6965 case 4: case 5: case 6: case 7:7070- get_user(*arg, (int *)usp + n);6666+ *arg = regs->args[n];7167 return;7272-#endif7373-7474-#ifdef CONFIG_64BIT7575- case 4: case 5: case 6: case 7:7676-#ifdef CONFIG_MIPS32_O327777- if (test_tsk_thread_flag(task, TIF_32BIT_REGS))7878- get_user(*arg, (int *)usp + n);7979- else8080-#endif8181- *arg = regs->regs[4 + n];8282-8383- return;8484-#endif8585-8686- default:8787- BUG();8868 }8989-9090- unreachable();6969+#else7070+ *arg = regs->regs[4 + n];7171+ if ((IS_ENABLED(CONFIG_MIPS32_O32) &&7272+ test_tsk_thread_flag(task, TIF_32BIT_REGS)))7373+ *arg = (unsigned int)*arg;7474+#endif9175}92769377static inline long syscall_get_error(struct task_struct *task,
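On 64-bit kernels the rewrite above stops fetching o32 arguments from the user stack: all eight arguments live in registers, and a compat (o32) task only needs the register value narrowed to its low 32 bits. That narrowing step is plain C and can be sketched in isolation (the function name is illustrative, not kernel API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * An o32 register value is kept sign-extended in a 64-bit kernel
 * register; syscall arguments of compat tasks must be truncated to
 * their low 32 bits, as "(unsigned int)*arg" does in the patch.
 */
static uint64_t narrow_compat_arg(uint64_t arg, bool compat_task)
{
	if (compat_task)
		arg = (uint32_t)arg;
	return arg;
}
```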
···740740CONFIG_IMA_DEFAULT_HASH_SHA256=y741741CONFIG_IMA_WRITE_POLICY=y742742CONFIG_IMA_APPRAISE=y743743-CONFIG_LSM="yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"744743CONFIG_BUG_ON_DATA_CORRUPTION=y745744CONFIG_CRYPTO_USER=m746745# CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is not set
···6262# CONFIG_INOTIFY_USER is not set6363# CONFIG_MISC_FILESYSTEMS is not set6464# CONFIG_NETWORK_FILESYSTEMS is not set6565-CONFIG_LSM="yama,loadpin,safesetid,integrity"6665# CONFIG_ZLIB_DFLTCC is not set6766CONFIG_XZ_DEC_MICROLZMA=y6867CONFIG_PRINTK_TIME=y
+5-1
arch/s390/include/asm/bitops.h
···5353 unsigned long mask;5454 int cc;55555656- if (__builtin_constant_p(nr)) {5656+ /*5757+ * With CONFIG_PROFILE_ALL_BRANCHES enabled gcc fails to5858+ * handle __builtin_constant_p() in some cases.5959+ */6060+ if (!IS_ENABLED(CONFIG_PROFILE_ALL_BRANCHES) && __builtin_constant_p(nr)) {5761 addr = (const volatile unsigned char *)ptr;5862 addr += (nr ^ (BITS_PER_LONG - BITS_PER_BYTE)) / BITS_PER_BYTE;5963 mask = 1UL << (nr & (BITS_PER_BYTE - 1));
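The constant-`nr` fast path being guarded here turns a bit number into a byte address plus mask. On big-endian s390, bit `nr` of a 64-bit word lives in byte `7 - nr / 8` of its storage, which the XOR with `BITS_PER_LONG - BITS_PER_BYTE` computes branch-free. A standalone sketch of that index math (kernel types replaced with plain C):

```c
#include <assert.h>

#define BITS_PER_LONG 64
#define BITS_PER_BYTE 8

/* Byte offset of bit 'nr' inside a big-endian unsigned long:
 * nr 0..7 -> byte 7 (least significant), nr 56..63 -> byte 0. */
static unsigned int be_byte_offset(unsigned int nr)
{
	return (nr ^ (BITS_PER_LONG - BITS_PER_BYTE)) / BITS_PER_BYTE;
}

/* Mask selecting bit 'nr' within that byte. */
static unsigned char be_byte_mask(unsigned int nr)
{
	return 1U << (nr & (BITS_PER_BYTE - 1));
}
```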
+6-14
arch/s390/include/asm/gmap.h
···2323/**2424 * struct gmap_struct - guest address space2525 * @list: list head for the mm->context gmap list2626- * @crst_list: list of all crst tables used in the guest address space2726 * @mm: pointer to the parent mm_struct2827 * @guest_to_host: radix tree with guest to host address translation2928 * @host_to_guest: radix tree with pointer to segment table entries···3435 * @guest_handle: protected virtual machine handle for the ultravisor3536 * @host_to_rmap: radix tree with gmap_rmap lists3637 * @children: list of shadow gmap structures3737- * @pt_list: list of all page tables used in the shadow guest address space3838 * @shadow_lock: spinlock to protect the shadow gmap list3939 * @parent: pointer to the parent gmap for shadow guest address spaces4040 * @orig_asce: ASCE for which the shadow page table has been created···4345 */4446struct gmap {4547 struct list_head list;4646- struct list_head crst_list;4748 struct mm_struct *mm;4849 struct radix_tree_root guest_to_host;4950 struct radix_tree_root host_to_guest;···5861 /* Additional data for shadow guest address spaces */5962 struct radix_tree_root host_to_rmap;6063 struct list_head children;6161- struct list_head pt_list;6264 spinlock_t shadow_lock;6365 struct gmap *parent;6466 unsigned long orig_asce;···102106void gmap_remove(struct gmap *gmap);103107struct gmap *gmap_get(struct gmap *gmap);104108void gmap_put(struct gmap *gmap);109109+void gmap_free(struct gmap *gmap);110110+struct gmap *gmap_alloc(unsigned long limit);105111106112int gmap_map_segment(struct gmap *gmap, unsigned long from,107113 unsigned long to, unsigned long len);108114int gmap_unmap_segment(struct gmap *gmap, unsigned long to, unsigned long len);109115unsigned long __gmap_translate(struct gmap *, unsigned long gaddr);110110-unsigned long gmap_translate(struct gmap *, unsigned long gaddr);111116int __gmap_link(struct gmap *gmap, unsigned long gaddr, unsigned long vmaddr);112112-int gmap_fault(struct gmap *, unsigned long gaddr, unsigned 
int fault_flags);113117void gmap_discard(struct gmap *, unsigned long from, unsigned long to);114118void __gmap_zap(struct gmap *, unsigned long gaddr);115119void gmap_unlink(struct mm_struct *, unsigned long *table, unsigned long vmaddr);116120117121int gmap_read_table(struct gmap *gmap, unsigned long gaddr, unsigned long *val);118122119119-struct gmap *gmap_shadow(struct gmap *parent, unsigned long asce,120120- int edat_level);121121-int gmap_shadow_valid(struct gmap *sg, unsigned long asce, int edat_level);123123+void gmap_unshadow(struct gmap *sg);122124int gmap_shadow_r2t(struct gmap *sg, unsigned long saddr, unsigned long r2t,123125 int fake);124126int gmap_shadow_r3t(struct gmap *sg, unsigned long saddr, unsigned long r3t,···125131 int fake);126132int gmap_shadow_pgt(struct gmap *sg, unsigned long saddr, unsigned long pgt,127133 int fake);128128-int gmap_shadow_pgt_lookup(struct gmap *sg, unsigned long saddr,129129- unsigned long *pgt, int *dat_protection, int *fake);130134int gmap_shadow_page(struct gmap *sg, unsigned long saddr, pte_t pte);131135132136void gmap_register_pte_notifier(struct gmap_notifier *);133137void gmap_unregister_pte_notifier(struct gmap_notifier *);134138135135-int gmap_mprotect_notify(struct gmap *, unsigned long start,136136- unsigned long len, int prot);139139+int gmap_protect_one(struct gmap *gmap, unsigned long gaddr, int prot, unsigned long bits);137140138141void gmap_sync_dirty_log_pmd(struct gmap *gmap, unsigned long dirty_bitmap[4],139142 unsigned long gaddr, unsigned long vmaddr);140143int s390_disable_cow_sharing(void);141141-void s390_unlist_old_asce(struct gmap *gmap);142144int s390_replace_asce(struct gmap *gmap);143145void s390_uv_destroy_pfns(unsigned long count, unsigned long *pfns);144146int __s390_uv_destroy_range(struct mm_struct *mm, unsigned long start,145147 unsigned long end, bool interruptible);148148+int kvm_s390_wiggle_split_folio(struct mm_struct *mm, struct folio *folio, bool split);149149+unsigned long 
*gmap_table_walk(struct gmap *gmap, unsigned long gaddr, int level);146150147151/**148152 * s390_uv_destroy_range - Destroy a range of pages in the given mm.
+5-1
arch/s390/include/asm/kvm_host.h
···3030#define KVM_S390_ESCA_CPU_SLOTS 2483131#define KVM_MAX_VCPUS 25532323333+#define KVM_INTERNAL_MEM_SLOTS 13434+3335/*3436 * These seem to be used for allocating ->chip in the routing table, which we3537 * don't use. 1 is as small as we can get to reduce the needed memory. If we···933931 u8 reserved928[0x1000 - 0x928]; /* 0x0928 */934932};935933934934+struct vsie_page;935935+936936struct kvm_s390_vsie {937937 struct mutex mutex;938938 struct radix_tree_root addr_to_page;939939 int page_count;940940 int next;941941- struct page *pages[KVM_MAX_VCPUS];941941+ struct vsie_page *pages[KVM_MAX_VCPUS];942942};943943944944struct kvm_s390_gisa_iam {
+18-3
arch/s390/include/asm/pgtable.h
···420420#define PGSTE_HC_BIT 0x0020000000000000UL421421#define PGSTE_GR_BIT 0x0004000000000000UL422422#define PGSTE_GC_BIT 0x0002000000000000UL423423-#define PGSTE_UC_BIT 0x0000800000000000UL /* user dirty (migration) */424424-#define PGSTE_IN_BIT 0x0000400000000000UL /* IPTE notify bit */425425-#define PGSTE_VSIE_BIT 0x0000200000000000UL /* ref'd in a shadow table */423423+#define PGSTE_ST2_MASK 0x0000ffff00000000UL424424+#define PGSTE_UC_BIT 0x0000000000008000UL /* user dirty (migration) */425425+#define PGSTE_IN_BIT 0x0000000000004000UL /* IPTE notify bit */426426+#define PGSTE_VSIE_BIT 0x0000000000002000UL /* ref'd in a shadow table */426427427428/* Guest Page State used for virtualization */428429#define _PGSTE_GPS_ZERO 0x0000000080000000UL···2007200620082007#define pmd_pgtable(pmd) \20092008 ((pgtable_t)__va(pmd_val(pmd) & -sizeof(pte_t)*PTRS_PER_PTE))20092009+20102010+static inline unsigned long gmap_pgste_get_pgt_addr(unsigned long *pgt)20112011+{20122012+ unsigned long *pgstes, res;20132013+20142014+ pgstes = pgt + _PAGE_ENTRIES;20152015+20162016+ res = (pgstes[0] & PGSTE_ST2_MASK) << 16;20172017+ res |= pgstes[1] & PGSTE_ST2_MASK;20182018+ res |= (pgstes[2] & PGSTE_ST2_MASK) >> 16;20192019+ res |= (pgstes[3] & PGSTE_ST2_MASK) >> 32;20202020+20212021+ return res;20222022+}2010202320112024#endif /* _S390_PAGE_H */
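The new `gmap_pgste_get_pgt_addr()` reassembles a 64-bit value that was scattered, 16 bits at a time, across the ST2 field (bits 32-47, `PGSTE_ST2_MASK`) of the first four PGSTEs. A round-trip sketch of the bit layout; the setter below is a hypothetical inverse for illustration, not kernel code, and it takes the pgste array directly rather than offsetting past `_PAGE_ENTRIES`:

```c
#include <assert.h>
#include <stdint.h>

#define PGSTE_ST2_MASK 0x0000ffff00000000ULL

/* Hypothetical inverse of the kernel getter: scatter 'val' into the
 * ST2 fields of four pgstes, most significant 16 bits first. */
static void pgste_set_pgt_addr(uint64_t *pgstes, uint64_t val)
{
	pgstes[0] = (val >> 16) & PGSTE_ST2_MASK; /* val bits 48-63 */
	pgstes[1] = val & PGSTE_ST2_MASK;         /* val bits 32-47 */
	pgstes[2] = (val << 16) & PGSTE_ST2_MASK; /* val bits 16-31 */
	pgstes[3] = (val << 32) & PGSTE_ST2_MASK; /* val bits  0-15 */
}

/* Same shifts as gmap_pgste_get_pgt_addr() in the patch above. */
static uint64_t pgste_get_pgt_addr(const uint64_t *pgstes)
{
	uint64_t res;

	res  = (pgstes[0] & PGSTE_ST2_MASK) << 16;
	res |=  pgstes[1] & PGSTE_ST2_MASK;
	res |= (pgstes[2] & PGSTE_ST2_MASK) >> 16;
	res |= (pgstes[3] & PGSTE_ST2_MASK) >> 32;

	return res;
}
```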
+3-3
arch/s390/include/asm/uv.h
···628628}629629630630int uv_pin_shared(unsigned long paddr);631631-int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb);632632-int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr);633631int uv_destroy_folio(struct folio *folio);634632int uv_destroy_pte(pte_t pte);635633int uv_convert_from_secure_pte(pte_t pte);636636-int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr);634634+int make_folio_secure(struct folio *folio, struct uv_cb_header *uvcb);635635+int uv_convert_from_secure(unsigned long paddr);636636+int uv_convert_from_secure_folio(struct folio *folio);637637638638void setup_uv(void);639639
+29-263
arch/s390/kernel/uv.c
···1919#include <asm/sections.h>2020#include <asm/uv.h>21212222-#if !IS_ENABLED(CONFIG_KVM)2323-unsigned long __gmap_translate(struct gmap *gmap, unsigned long gaddr)2424-{2525- return 0;2626-}2727-2828-int gmap_fault(struct gmap *gmap, unsigned long gaddr,2929- unsigned int fault_flags)3030-{3131- return 0;3232-}3333-#endif3434-3522/* the bootdata_preserved fields come from ones in arch/s390/boot/uv.c */3623int __bootdata_preserved(prot_virt_guest);3724EXPORT_SYMBOL(prot_virt_guest);···146159 folio_put(folio);147160 return rc;148161}162162+EXPORT_SYMBOL(uv_destroy_folio);149163150164/*151165 * The present PTE still indirectly holds a folio reference through the mapping.···163175 *164176 * @paddr: Absolute host address of page to be exported165177 */166166-static int uv_convert_from_secure(unsigned long paddr)178178+int uv_convert_from_secure(unsigned long paddr)167179{168180 struct uv_cb_cfs uvcb = {169181 .header.cmd = UVC_CMD_CONV_FROM_SEC_STOR,···175187 return -EINVAL;176188 return 0;177189}190190+EXPORT_SYMBOL_GPL(uv_convert_from_secure);178191179192/*180193 * The caller must already hold a reference to the folio.181194 */182182-static int uv_convert_from_secure_folio(struct folio *folio)195195+int uv_convert_from_secure_folio(struct folio *folio)183196{184197 int rc;185198···195206 folio_put(folio);196207 return rc;197208}209209+EXPORT_SYMBOL_GPL(uv_convert_from_secure_folio);198210199211/*200212 * The present PTE still indirectly holds a folio reference through the mapping.···227237 return res;228238}229239230230-static int make_folio_secure(struct folio *folio, struct uv_cb_header *uvcb)240240+/**241241+ * make_folio_secure() - make a folio secure242242+ * @folio: the folio to make secure243243+ * @uvcb: the uvcb that describes the UVC to be used244244+ *245245+ * The folio @folio will be made secure if possible, @uvcb will be passed246246+ * as-is to the UVC.247247+ *248248+ * Return: 0 on success;249249+ * -EBUSY if the folio is in writeback or has too 
many references;250250+ * -E2BIG if the folio is large;251251+ * -EAGAIN if the UVC needs to be attempted again;252252+ * -ENXIO if the address is not mapped;253253+ * -EINVAL if the UVC failed for other reasons.254254+ *255255+ * Context: The caller must hold exactly one extra reference on the folio256256+ * (it's the same logic as split_folio())257257+ */258258+int make_folio_secure(struct folio *folio, struct uv_cb_header *uvcb)231259{232260 int expected, cc = 0;233261262262+ if (folio_test_large(folio))263263+ return -E2BIG;234264 if (folio_test_writeback(folio))235235- return -EAGAIN;236236- expected = expected_folio_refs(folio);265265+ return -EBUSY;266266+ expected = expected_folio_refs(folio) + 1;237267 if (!folio_ref_freeze(folio, expected))238268 return -EBUSY;239269 set_bit(PG_arch_1, &folio->flags);···277267 return -EAGAIN;278268 return uvcb->rc == 0x10a ? -ENXIO : -EINVAL;279269}280280-281281-/**282282- * should_export_before_import - Determine whether an export is needed283283- * before an import-like operation284284- * @uvcb: the Ultravisor control block of the UVC to be performed285285- * @mm: the mm of the process286286- *287287- * Returns whether an export is needed before every import-like operation.288288- * This is needed for shared pages, which don't trigger a secure storage289289- * exception when accessed from a different guest.290290- *291291- * Although considered as one, the Unpin Page UVC is not an actual import,292292- * so it is not affected.293293- *294294- * No export is needed also when there is only one protected VM, because the295295- * page cannot belong to the wrong VM in that case (there is no "other VM"296296- * it can belong to).297297- *298298- * Return: true if an export is needed before every import, otherwise false.299299- */300300-static bool should_export_before_import(struct uv_cb_header *uvcb, struct mm_struct *mm)301301-{302302- /*303303- * The misc feature indicates, among other things, that importing a304304- * 
shared page from a different protected VM will automatically also305305- * transfer its ownership.306306- */307307- if (uv_has_feature(BIT_UV_FEAT_MISC))308308- return false;309309- if (uvcb->cmd == UVC_CMD_UNPIN_PAGE_SHARED)310310- return false;311311- return atomic_read(&mm->context.protected_count) > 1;312312-}313313-314314-/*315315- * Drain LRU caches: the local one on first invocation and the ones of all316316- * CPUs on successive invocations. Returns "true" on the first invocation.317317- */318318-static bool drain_lru(bool *drain_lru_called)319319-{320320- /*321321- * If we have tried a local drain and the folio refcount322322- * still does not match our expected safe value, try with a323323- * system wide drain. This is needed if the pagevecs holding324324- * the page are on a different CPU.325325- */326326- if (*drain_lru_called) {327327- lru_add_drain_all();328328- /* We give up here, don't retry immediately. */329329- return false;330330- }331331- /*332332- * We are here if the folio refcount does not match the333333- * expected safe value. The main culprits are usually334334- * pagevecs. With lru_add_drain() we drain the pagevecs335335- * on the local CPU so that hopefully the refcount will336336- * reach the expected safe value.337337- */338338- lru_add_drain();339339- *drain_lru_called = true;340340- /* The caller should try again immediately */341341- return true;342342-}343343-344344-/*345345- * Requests the Ultravisor to make a page accessible to a guest.346346- * If it's brought in the first time, it will be cleared. 
If347347- * it has been exported before, it will be decrypted and integrity348348- * checked.349349- */350350-int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)351351-{352352- struct vm_area_struct *vma;353353- bool drain_lru_called = false;354354- spinlock_t *ptelock;355355- unsigned long uaddr;356356- struct folio *folio;357357- pte_t *ptep;358358- int rc;359359-360360-again:361361- rc = -EFAULT;362362- mmap_read_lock(gmap->mm);363363-364364- uaddr = __gmap_translate(gmap, gaddr);365365- if (IS_ERR_VALUE(uaddr))366366- goto out;367367- vma = vma_lookup(gmap->mm, uaddr);368368- if (!vma)369369- goto out;370370- /*371371- * Secure pages cannot be huge and userspace should not combine both.372372- * In case userspace does it anyway this will result in an -EFAULT for373373- * the unpack. The guest is thus never reaching secure mode. If374374- * userspace is playing dirty tricky with mapping huge pages later375375- * on this will result in a segmentation fault.376376- */377377- if (is_vm_hugetlb_page(vma))378378- goto out;379379-380380- rc = -ENXIO;381381- ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);382382- if (!ptep)383383- goto out;384384- if (pte_present(*ptep) && !(pte_val(*ptep) & _PAGE_INVALID) && pte_write(*ptep)) {385385- folio = page_folio(pte_page(*ptep));386386- rc = -EAGAIN;387387- if (folio_test_large(folio)) {388388- rc = -E2BIG;389389- } else if (folio_trylock(folio)) {390390- if (should_export_before_import(uvcb, gmap->mm))391391- uv_convert_from_secure(PFN_PHYS(folio_pfn(folio)));392392- rc = make_folio_secure(folio, uvcb);393393- folio_unlock(folio);394394- }395395-396396- /*397397- * Once we drop the PTL, the folio may get unmapped and398398- * freed immediately. 
We need a temporary reference.399399- */400400- if (rc == -EAGAIN || rc == -E2BIG)401401- folio_get(folio);402402- }403403- pte_unmap_unlock(ptep, ptelock);404404-out:405405- mmap_read_unlock(gmap->mm);406406-407407- switch (rc) {408408- case -E2BIG:409409- folio_lock(folio);410410- rc = split_folio(folio);411411- folio_unlock(folio);412412- folio_put(folio);413413-414414- switch (rc) {415415- case 0:416416- /* Splitting succeeded, try again immediately. */417417- goto again;418418- case -EAGAIN:419419- /* Additional folio references. */420420- if (drain_lru(&drain_lru_called))421421- goto again;422422- return -EAGAIN;423423- case -EBUSY:424424- /* Unexpected race. */425425- return -EAGAIN;426426- }427427- WARN_ON_ONCE(1);428428- return -ENXIO;429429- case -EAGAIN:430430- /*431431- * If we are here because the UVC returned busy or partial432432- * completion, this is just a useless check, but it is safe.433433- */434434- folio_wait_writeback(folio);435435- folio_put(folio);436436- return -EAGAIN;437437- case -EBUSY:438438- /* Additional folio references. */439439- if (drain_lru(&drain_lru_called))440440- goto again;441441- return -EAGAIN;442442- case -ENXIO:443443- if (gmap_fault(gmap, gaddr, FAULT_FLAG_WRITE))444444- return -EFAULT;445445- return -EAGAIN;446446- }447447- return rc;448448-}449449-EXPORT_SYMBOL_GPL(gmap_make_secure);450450-451451-int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr)452452-{453453- struct uv_cb_cts uvcb = {454454- .header.cmd = UVC_CMD_CONV_TO_SEC_STOR,455455- .header.len = sizeof(uvcb),456456- .guest_handle = gmap->guest_handle,457457- .gaddr = gaddr,458458- };459459-460460- return gmap_make_secure(gmap, gaddr, &uvcb);461461-}462462-EXPORT_SYMBOL_GPL(gmap_convert_to_secure);463463-464464-/**465465- * gmap_destroy_page - Destroy a guest page.466466- * @gmap: the gmap of the guest467467- * @gaddr: the guest address to destroy468468- *469469- * An attempt will be made to destroy the given guest page. 
If the attempt470470- * fails, an attempt is made to export the page. If both attempts fail, an471471- * appropriate error is returned.472472- */473473-int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr)474474-{475475- struct vm_area_struct *vma;476476- struct folio_walk fw;477477- unsigned long uaddr;478478- struct folio *folio;479479- int rc;480480-481481- rc = -EFAULT;482482- mmap_read_lock(gmap->mm);483483-484484- uaddr = __gmap_translate(gmap, gaddr);485485- if (IS_ERR_VALUE(uaddr))486486- goto out;487487- vma = vma_lookup(gmap->mm, uaddr);488488- if (!vma)489489- goto out;490490- /*491491- * Huge pages should not be able to become secure492492- */493493- if (is_vm_hugetlb_page(vma))494494- goto out;495495-496496- rc = 0;497497- folio = folio_walk_start(&fw, vma, uaddr, 0);498498- if (!folio)499499- goto out;500500- /*501501- * See gmap_make_secure(): large folios cannot be secure. Small502502- * folio implies FW_LEVEL_PTE.503503- */504504- if (folio_test_large(folio) || !pte_write(fw.pte))505505- goto out_walk_end;506506- rc = uv_destroy_folio(folio);507507- /*508508- * Fault handlers can race; it is possible that two CPUs will fault509509- * on the same secure page. One CPU can destroy the page, reboot,510510- * re-enter secure mode and import it, while the second CPU was511511- * stuck at the beginning of the handler. At some point the second512512- * CPU will be able to progress, and it will not be able to destroy513513- * the page. In that case we do not want to terminate the process,514514- * we instead try to export the page.515515- */516516- if (rc)517517- rc = uv_convert_from_secure_folio(folio);518518-out_walk_end:519519- folio_walk_end(&fw, vma);520520-out:521521- mmap_read_unlock(gmap->mm);522522- return rc;523523-}524524-EXPORT_SYMBOL_GPL(gmap_destroy_page);270270+EXPORT_SYMBOL_GPL(make_folio_secure);525271526272/*527273 * To be called with the folio locked or with an extra reference! This will
···1616#include <asm/gmap.h>1717#include <asm/dat-bits.h>1818#include "kvm-s390.h"1919+#include "gmap.h"1920#include "gaccess.h"20212122/*···13941393}1395139413961395/**13961396+ * shadow_pgt_lookup() - find a shadow page table13971397+ * @sg: pointer to the shadow guest address space structure13981398+ * @saddr: the address in the shadow guest address space13991399+ * @pgt: parent gmap address of the page table to get shadowed14001400+ * @dat_protection: if the pgtable is marked as protected by dat14011401+ * @fake: pgt references contiguous guest memory block, not a pgtable14021402+ *14031403+ * Returns 0 if the shadow page table was found and -EAGAIN if the page14041404+ * table was not found.14051405+ *14061406+ * Called with sg->mm->mmap_lock in read.14071407+ */14081408+static int shadow_pgt_lookup(struct gmap *sg, unsigned long saddr, unsigned long *pgt,14091409+ int *dat_protection, int *fake)14101410+{14111411+ unsigned long pt_index;14121412+ unsigned long *table;14131413+ struct page *page;14141414+ int rc;14151415+14161416+ spin_lock(&sg->guest_table_lock);14171417+ table = gmap_table_walk(sg, saddr, 1); /* get segment pointer */14181418+ if (table && !(*table & _SEGMENT_ENTRY_INVALID)) {14191419+ /* Shadow page tables are full pages (pte+pgste) */14201420+ page = pfn_to_page(*table >> PAGE_SHIFT);14211421+ pt_index = gmap_pgste_get_pgt_addr(page_to_virt(page));14221422+ *pgt = pt_index & ~GMAP_SHADOW_FAKE_TABLE;14231423+ *dat_protection = !!(*table & _SEGMENT_ENTRY_PROTECT);14241424+ *fake = !!(pt_index & GMAP_SHADOW_FAKE_TABLE);14251425+ rc = 0;14261426+ } else {14271427+ rc = -EAGAIN;14281428+ }14291429+ spin_unlock(&sg->guest_table_lock);14301430+ return rc;14311431+}14321432+14331433+/**13971434 * kvm_s390_shadow_fault - handle fault on a shadow page table13981435 * @vcpu: virtual cpu13991436 * @sg: pointer to the shadow guest address space structure···14541415 int dat_protection, fake;14551416 int rc;1456141714181418+ if 
(KVM_BUG_ON(!gmap_is_shadow(sg), vcpu->kvm))14191419+ return -EFAULT;14201420+14571421 mmap_read_lock(sg->mm);14581422 /*14591423 * We don't want any guest-2 tables to change - so the parent···14651423 */14661424 ipte_lock(vcpu->kvm);1467142514681468- rc = gmap_shadow_pgt_lookup(sg, saddr, &pgt, &dat_protection, &fake);14261426+ rc = shadow_pgt_lookup(sg, saddr, &pgt, &dat_protection, &fake);14691427 if (rc)14701428 rc = kvm_s390_shadow_tables(sg, saddr, &pgt, &dat_protection,14711429 &fake);
+142
arch/s390/kvm/gmap-vsie.c
···11+// SPDX-License-Identifier: GPL-2.022+/*33+ * Guest memory management for KVM/s390 nested VMs.44+ *55+ * Copyright IBM Corp. 2008, 2020, 202466+ *77+ * Author(s): Claudio Imbrenda <imbrenda@linux.ibm.com>88+ * Martin Schwidefsky <schwidefsky@de.ibm.com>99+ * David Hildenbrand <david@redhat.com>1010+ * Janosch Frank <frankja@linux.vnet.ibm.com>1111+ */1212+1313+#include <linux/compiler.h>1414+#include <linux/kvm.h>1515+#include <linux/kvm_host.h>1616+#include <linux/pgtable.h>1717+#include <linux/pagemap.h>1818+#include <linux/mman.h>1919+2020+#include <asm/lowcore.h>2121+#include <asm/gmap.h>2222+#include <asm/uv.h>2323+2424+#include "kvm-s390.h"2525+#include "gmap.h"2626+2727+/**2828+ * gmap_find_shadow - find a specific asce in the list of shadow tables2929+ * @parent: pointer to the parent gmap3030+ * @asce: ASCE for which the shadow table is created3131+ * @edat_level: edat level to be used for the shadow translation3232+ *3333+ * Returns the pointer to a gmap if a shadow table with the given asce is3434+ * already available, ERR_PTR(-EAGAIN) if another one is just being created,3535+ * otherwise NULL3636+ *3737+ * Context: Called with parent->shadow_lock held3838+ */3939+static struct gmap *gmap_find_shadow(struct gmap *parent, unsigned long asce, int edat_level)4040+{4141+ struct gmap *sg;4242+4343+ lockdep_assert_held(&parent->shadow_lock);4444+ list_for_each_entry(sg, &parent->children, list) {4545+ if (!gmap_shadow_valid(sg, asce, edat_level))4646+ continue;4747+ if (!sg->initialized)4848+ return ERR_PTR(-EAGAIN);4949+ refcount_inc(&sg->ref_count);5050+ return sg;5151+ }5252+ return NULL;5353+}5454+5555+/**5656+ * gmap_shadow - create/find a shadow guest address space5757+ * @parent: pointer to the parent gmap5858+ * @asce: ASCE for which the shadow table is created5959+ * @edat_level: edat level to be used for the shadow translation6060+ *6161+ * The pages of the top level page table referred by the asce parameter6262+ * will be set to read-only and 
marked in the PGSTEs of the kvm process.6363+ * The shadow table will be removed automatically on any change to the6464+ * PTE mapping for the source table.6565+ *6666+ * Returns a guest address space structure, ERR_PTR(-ENOMEM) if out of memory,6767+ * ERR_PTR(-EAGAIN) if the caller has to retry and ERR_PTR(-EFAULT) if the6868+ * parent gmap table could not be protected.6969+ */7070+struct gmap *gmap_shadow(struct gmap *parent, unsigned long asce, int edat_level)7171+{7272+ struct gmap *sg, *new;7373+ unsigned long limit;7474+ int rc;7575+7676+ if (KVM_BUG_ON(parent->mm->context.allow_gmap_hpage_1m, (struct kvm *)parent->private) ||7777+ KVM_BUG_ON(gmap_is_shadow(parent), (struct kvm *)parent->private))7878+ return ERR_PTR(-EFAULT);7979+ spin_lock(&parent->shadow_lock);8080+ sg = gmap_find_shadow(parent, asce, edat_level);8181+ spin_unlock(&parent->shadow_lock);8282+ if (sg)8383+ return sg;8484+ /* Create a new shadow gmap */8585+ limit = -1UL >> (33 - (((asce & _ASCE_TYPE_MASK) >> 2) * 11));8686+ if (asce & _ASCE_REAL_SPACE)8787+ limit = -1UL;8888+ new = gmap_alloc(limit);8989+ if (!new)9090+ return ERR_PTR(-ENOMEM);9191+ new->mm = parent->mm;9292+ new->parent = gmap_get(parent);9393+ new->private = parent->private;9494+ new->orig_asce = asce;9595+ new->edat_level = edat_level;9696+ new->initialized = false;9797+ spin_lock(&parent->shadow_lock);9898+ /* Recheck if another CPU created the same shadow */9999+ sg = gmap_find_shadow(parent, asce, edat_level);100100+ if (sg) {101101+ spin_unlock(&parent->shadow_lock);102102+ gmap_free(new);103103+ return sg;104104+ }105105+ if (asce & _ASCE_REAL_SPACE) {106106+ /* only allow one real-space gmap shadow */107107+ list_for_each_entry(sg, &parent->children, list) {108108+ if (sg->orig_asce & _ASCE_REAL_SPACE) {109109+ spin_lock(&sg->guest_table_lock);110110+ gmap_unshadow(sg);111111+ spin_unlock(&sg->guest_table_lock);112112+ list_del(&sg->list);113113+ gmap_put(sg);114114+ break;115115+ }116116+ }117117+ }118118+ 
refcount_set(&new->ref_count, 2);119119+ list_add(&new->list, &parent->children);120120+ if (asce & _ASCE_REAL_SPACE) {121121+ /* nothing to protect, return right away */122122+ new->initialized = true;123123+ spin_unlock(&parent->shadow_lock);124124+ return new;125125+ }126126+ spin_unlock(&parent->shadow_lock);127127+ /* protect after insertion, so it will get properly invalidated */128128+ mmap_read_lock(parent->mm);129129+ rc = __kvm_s390_mprotect_many(parent, asce & _ASCE_ORIGIN,130130+ ((asce & _ASCE_TABLE_LENGTH) + 1),131131+ PROT_READ, GMAP_NOTIFY_SHADOW);132132+ mmap_read_unlock(parent->mm);133133+ spin_lock(&parent->shadow_lock);134134+ new->initialized = true;135135+ if (rc) {136136+ list_del(&new->list);137137+ gmap_free(new);138138+ new = ERR_PTR(rc);139139+ }140140+ spin_unlock(&parent->shadow_lock);141141+ return new;142142+}
+212
arch/s390/kvm/gmap.c
···11+// SPDX-License-Identifier: GPL-2.022+/*33+ * Guest memory management for KVM/s39044+ *55+ * Copyright IBM Corp. 2008, 2020, 202466+ *77+ * Author(s): Claudio Imbrenda <imbrenda@linux.ibm.com>88+ * Martin Schwidefsky <schwidefsky@de.ibm.com>99+ * David Hildenbrand <david@redhat.com>1010+ * Janosch Frank <frankja@linux.vnet.ibm.com>1111+ */1212+1313+#include <linux/compiler.h>1414+#include <linux/kvm.h>1515+#include <linux/kvm_host.h>1616+#include <linux/pgtable.h>1717+#include <linux/pagemap.h>1818+1919+#include <asm/lowcore.h>2020+#include <asm/gmap.h>2121+#include <asm/uv.h>2222+2323+#include "gmap.h"2424+2525+/**2626+ * should_export_before_import - Determine whether an export is needed2727+ * before an import-like operation2828+ * @uvcb: the Ultravisor control block of the UVC to be performed2929+ * @mm: the mm of the process3030+ *3131+ * Returns whether an export is needed before every import-like operation.3232+ * This is needed for shared pages, which don't trigger a secure storage3333+ * exception when accessed from a different guest.3434+ *3535+ * Although considered as one, the Unpin Page UVC is not an actual import,3636+ * so it is not affected.3737+ *3838+ * No export is needed also when there is only one protected VM, because the3939+ * page cannot belong to the wrong VM in that case (there is no "other VM"4040+ * it can belong to).4141+ *4242+ * Return: true if an export is needed before every import, otherwise false.4343+ */4444+static bool should_export_before_import(struct uv_cb_header *uvcb, struct mm_struct *mm)4545+{4646+ /*4747+ * The misc feature indicates, among other things, that importing a4848+ * shared page from a different protected VM will automatically also4949+ * transfer its ownership.5050+ */5151+ if (uv_has_feature(BIT_UV_FEAT_MISC))5252+ return false;5353+ if (uvcb->cmd == UVC_CMD_UNPIN_PAGE_SHARED)5454+ return false;5555+ return atomic_read(&mm->context.protected_count) > 1;5656+}5757+5858+static int 
__gmap_make_secure(struct gmap *gmap, struct page *page, void *uvcb)5959+{6060+ struct folio *folio = page_folio(page);6161+ int rc;6262+6363+ /*6464+ * Secure pages cannot be huge and userspace should not combine both.6565+ * In case userspace does it anyway this will result in an -EFAULT for6666+ * the unpack. The guest thus never reaches secure mode.6767+ * If userspace plays dirty tricks and decides to map huge pages at a6868+ * later point in time, it will receive a segmentation fault or6969+ * KVM_RUN will return -EFAULT.7070+ */7171+ if (folio_test_hugetlb(folio))7272+ return -EFAULT;7373+ if (folio_test_large(folio)) {7474+ mmap_read_unlock(gmap->mm);7575+ rc = kvm_s390_wiggle_split_folio(gmap->mm, folio, true);7676+ mmap_read_lock(gmap->mm);7777+ if (rc)7878+ return rc;7979+ folio = page_folio(page);8080+ }8181+8282+ if (!folio_trylock(folio))8383+ return -EAGAIN;8484+ if (should_export_before_import(uvcb, gmap->mm))8585+ uv_convert_from_secure(folio_to_phys(folio));8686+ rc = make_folio_secure(folio, uvcb);8787+ folio_unlock(folio);8888+8989+ /*9090+ * In theory a race is possible and the folio might have become9191+ * large again before the folio_trylock() above. 
In that case, no9292+ * action is performed and -EAGAIN is returned; the callers will9393+ * have to try again later.9494+ * In most cases this implies running the VM again, getting the same9595+ * exception again, and making another attempt in this function.9696+ * This is expected to happen extremely rarely.9797+ */9898+ if (rc == -E2BIG)9999+ return -EAGAIN;100100+ /* The folio has too many references, try to shake some off */101101+ if (rc == -EBUSY) {102102+ mmap_read_unlock(gmap->mm);103103+ kvm_s390_wiggle_split_folio(gmap->mm, folio, false);104104+ mmap_read_lock(gmap->mm);105105+ return -EAGAIN;106106+ }107107+108108+ return rc;109109+}110110+111111+/**112112+ * gmap_make_secure() - make one guest page secure113113+ * @gmap: the guest gmap114114+ * @gaddr: the guest address that needs to be made secure115115+ * @uvcb: the UVCB specifying which operation needs to be performed116116+ *117117+ * Context: needs to be called with kvm->srcu held.118118+ * Return: 0 on success, < 0 in case of error (see __gmap_make_secure()).119119+ */120120+int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)121121+{122122+ struct kvm *kvm = gmap->private;123123+ struct page *page;124124+ int rc = 0;125125+126126+ lockdep_assert_held(&kvm->srcu);127127+128128+ page = gfn_to_page(kvm, gpa_to_gfn(gaddr));129129+ mmap_read_lock(gmap->mm);130130+ if (page)131131+ rc = __gmap_make_secure(gmap, page, uvcb);132132+ kvm_release_page_clean(page);133133+ mmap_read_unlock(gmap->mm);134134+135135+ return rc;136136+}137137+138138+int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr)139139+{140140+ struct uv_cb_cts uvcb = {141141+ .header.cmd = UVC_CMD_CONV_TO_SEC_STOR,142142+ .header.len = sizeof(uvcb),143143+ .guest_handle = gmap->guest_handle,144144+ .gaddr = gaddr,145145+ };146146+147147+ return gmap_make_secure(gmap, gaddr, &uvcb);148148+}149149+150150+/**151151+ * __gmap_destroy_page() - Destroy a guest page.152152+ * @gmap: the gmap of the guest153153+ * 
@page: the page to destroy154154+ *155155+ * An attempt will be made to destroy the given guest page. If the attempt156156+ * fails, an attempt is made to export the page. If both attempts fail, an157157+ * appropriate error is returned.158158+ *159159+ * Context: must be called holding the mm lock for gmap->mm160160+ */161161+static int __gmap_destroy_page(struct gmap *gmap, struct page *page)162162+{163163+ struct folio *folio = page_folio(page);164164+ int rc;165165+166166+ /*167167+ * See gmap_make_secure(): large folios cannot be secure. Small168168+ * folio implies FW_LEVEL_PTE.169169+ */170170+ if (folio_test_large(folio))171171+ return -EFAULT;172172+173173+ rc = uv_destroy_folio(folio);174174+ /*175175+ * Fault handlers can race; it is possible that two CPUs will fault176176+ * on the same secure page. One CPU can destroy the page, reboot,177177+ * re-enter secure mode and import it, while the second CPU was178178+ * stuck at the beginning of the handler. At some point the second179179+ * CPU will be able to progress, and it will not be able to destroy180180+ * the page. In that case we do not want to terminate the process,181181+ * we instead try to export the page.182182+ */183183+ if (rc)184184+ rc = uv_convert_from_secure_folio(folio);185185+186186+ return rc;187187+}188188+189189+/**190190+ * gmap_destroy_page() - Destroy a guest page.191191+ * @gmap: the gmap of the guest192192+ * @gaddr: the guest address to destroy193193+ *194194+ * An attempt will be made to destroy the given guest page. If the attempt195195+ * fails, an attempt is made to export the page. 
If both attempts fail, an196196+ * appropriate error is returned.197197+ *198198+ * Context: may sleep.199199+ */200200+int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr)201201+{202202+ struct page *page;203203+ int rc = 0;204204+205205+ mmap_read_lock(gmap->mm);206206+ page = gfn_to_page(gmap->private, gpa_to_gfn(gaddr));207207+ if (page)208208+ rc = __gmap_destroy_page(gmap, page);209209+ kvm_release_page_clean(page);210210+ mmap_read_unlock(gmap->mm);211211+ return rc;212212+}
+39
arch/s390/kvm/gmap.h
···11+/* SPDX-License-Identifier: GPL-2.0 */22+/*33+ * KVM guest address space mapping code44+ *55+ * Copyright IBM Corp. 2007, 2016, 202566+ * Author(s): Martin Schwidefsky <schwidefsky@de.ibm.com>77+ * Claudio Imbrenda <imbrenda@linux.ibm.com>88+ */99+1010+#ifndef ARCH_KVM_S390_GMAP_H1111+#define ARCH_KVM_S390_GMAP_H1212+1313+#define GMAP_SHADOW_FAKE_TABLE 1ULL1414+1515+int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb);1616+int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr);1717+int gmap_destroy_page(struct gmap *gmap, unsigned long gaddr);1818+struct gmap *gmap_shadow(struct gmap *parent, unsigned long asce, int edat_level);1919+2020+/**2121+ * gmap_shadow_valid - check if a shadow guest address space matches the2222+ * given properties and is still valid2323+ * @sg: pointer to the shadow guest address space structure2424+ * @asce: ASCE for which the shadow table is requested2525+ * @edat_level: edat level to be used for the shadow translation2626+ *2727+ * Returns 1 if the gmap shadow is still valid and matches the given2828+ * properties; the caller can continue using it. Returns 0 otherwise; the2929+ * caller has to request a new shadow gmap in this case.3030+ *3131+ */3232+static inline int gmap_shadow_valid(struct gmap *sg, unsigned long asce, int edat_level)3333+{3434+ if (sg->removed)3535+ return 0;3636+ return sg->orig_asce == asce && sg->edat_level == edat_level;3737+}3838+3939+#endif
+4-3
arch/s390/kvm/intercept.c
···2121#include "gaccess.h"2222#include "trace.h"2323#include "trace-s390.h"2424+#include "gmap.h"24252526u8 kvm_s390_get_ilen(struct kvm_vcpu *vcpu)2627{···368367 reg2, &srcaddr, GACC_FETCH, 0);369368 if (rc)370369 return kvm_s390_inject_prog_cond(vcpu, rc);371371- rc = gmap_fault(vcpu->arch.gmap, srcaddr, 0);370370+ rc = kvm_s390_handle_dat_fault(vcpu, srcaddr, 0);372371 if (rc != 0)373372 return rc;374373···377376 reg1, &dstaddr, GACC_STORE, 0);378377 if (rc)379378 return kvm_s390_inject_prog_cond(vcpu, rc);380380- rc = gmap_fault(vcpu->arch.gmap, dstaddr, FAULT_FLAG_WRITE);379379+ rc = kvm_s390_handle_dat_fault(vcpu, dstaddr, FOLL_WRITE);381380 if (rc != 0)382381 return rc;383382···550549 * If the unpin did not succeed, the guest will exit again for the UVC551550 * and we will retry the unpin.552551 */553553- if (rc == -EINVAL)552552+ if (rc == -EINVAL || rc == -ENXIO)554553 return 0;555554 /*556555 * If we got -EAGAIN here, we simply return it. It will eventually
+11-8
arch/s390/kvm/interrupt.c
···28932893 struct kvm_kernel_irq_routing_entry *e,28942894 const struct kvm_irq_routing_entry *ue)28952895{28962896- u64 uaddr;28962896+ u64 uaddr_s, uaddr_i;28972897+ int idx;2897289828982899 switch (ue->type) {28992900 /* we store the userspace addresses instead of the guest addresses */···29022901 if (kvm_is_ucontrol(kvm))29032902 return -EINVAL;29042903 e->set = set_adapter_int;29052905- uaddr = gmap_translate(kvm->arch.gmap, ue->u.adapter.summary_addr);29062906- if (uaddr == -EFAULT)29042904+29052905+ idx = srcu_read_lock(&kvm->srcu);29062906+ uaddr_s = gpa_to_hva(kvm, ue->u.adapter.summary_addr);29072907+ uaddr_i = gpa_to_hva(kvm, ue->u.adapter.ind_addr);29082908+ srcu_read_unlock(&kvm->srcu, idx);29092909+29102910+ if (kvm_is_error_hva(uaddr_s) || kvm_is_error_hva(uaddr_i))29072911 return -EFAULT;29082908- e->adapter.summary_addr = uaddr;29092909- uaddr = gmap_translate(kvm->arch.gmap, ue->u.adapter.ind_addr);29102910- if (uaddr == -EFAULT)29112911- return -EFAULT;29122912- e->adapter.ind_addr = uaddr;29122912+ e->adapter.summary_addr = uaddr_s;29132913+ e->adapter.ind_addr = uaddr_i;29132914 e->adapter.summary_offset = ue->u.adapter.summary_offset;29142915 e->adapter.ind_offset = ue->u.adapter.ind_offset;29152916 e->adapter.adapter_id = ue->u.adapter.adapter_id;
+197-40
arch/s390/kvm/kvm-s390.c
···5050#include "kvm-s390.h"5151#include "gaccess.h"5252#include "pci.h"5353+#include "gmap.h"53545455#define CREATE_TRACE_POINTS5556#include "trace.h"···34293428 VM_EVENT(kvm, 3, "vm created with type %lu", type);3430342934313430 if (type & KVM_VM_S390_UCONTROL) {34313431+ struct kvm_userspace_memory_region2 fake_memslot = {34323432+ .slot = KVM_S390_UCONTROL_MEMSLOT,34333433+ .guest_phys_addr = 0,34343434+ .userspace_addr = 0,34353435+ .memory_size = ALIGN_DOWN(TASK_SIZE, _SEGMENT_SIZE),34363436+ .flags = 0,34373437+ };34383438+34323439 kvm->arch.gmap = NULL;34333440 kvm->arch.mem_limit = KVM_S390_NO_MEM_LIMIT;34413441+ /* one flat fake memslot covering the whole address-space */34423442+ mutex_lock(&kvm->slots_lock);34433443+ KVM_BUG_ON(kvm_set_internal_memslot(kvm, &fake_memslot), kvm);34443444+ mutex_unlock(&kvm->slots_lock);34343445 } else {34353446 if (sclp.hamax == U64_MAX)34363447 kvm->arch.mem_limit = TASK_SIZE_MAX;···45114498 return kvm_s390_test_cpuflags(vcpu, CPUSTAT_IBS);45124499}4513450045014501+static int __kvm_s390_fixup_fault_sync(struct gmap *gmap, gpa_t gaddr, unsigned int flags)45024502+{45034503+ struct kvm *kvm = gmap->private;45044504+ gfn_t gfn = gpa_to_gfn(gaddr);45054505+ bool unlocked;45064506+ hva_t vmaddr;45074507+ gpa_t tmp;45084508+ int rc;45094509+45104510+ if (kvm_is_ucontrol(kvm)) {45114511+ tmp = __gmap_translate(gmap, gaddr);45124512+ gfn = gpa_to_gfn(tmp);45134513+ }45144514+45154515+ vmaddr = gfn_to_hva(kvm, gfn);45164516+ rc = fixup_user_fault(gmap->mm, vmaddr, FAULT_FLAG_WRITE, &unlocked);45174517+ if (!rc)45184518+ rc = __gmap_link(gmap, gaddr, vmaddr);45194519+ return rc;45204520+}45214521+45224522+/**45234523+ * __kvm_s390_mprotect_many() - Apply specified protection to guest pages45244524+ * @gmap: the gmap of the guest45254525+ * @gpa: the starting guest address45264526+ * @npages: how many pages to protect45274527+ * @prot: indicates access rights: PROT_NONE, PROT_READ or PROT_WRITE45284528+ * @bits: pgste notification 
bits to set45294529+ *45304530+ * Returns: 0 in case of success, < 0 in case of error - see gmap_protect_one()45314531+ *45324532+ * Context: kvm->srcu and gmap->mm need to be held in read mode45334533+ */45344534+int __kvm_s390_mprotect_many(struct gmap *gmap, gpa_t gpa, u8 npages, unsigned int prot,45354535+ unsigned long bits)45364536+{45374537+ unsigned int fault_flag = (prot & PROT_WRITE) ? FAULT_FLAG_WRITE : 0;45384538+ gpa_t end = gpa + npages * PAGE_SIZE;45394539+ int rc;45404540+45414541+ for (; gpa < end; gpa = ALIGN(gpa + 1, rc)) {45424542+ rc = gmap_protect_one(gmap, gpa, prot, bits);45434543+ if (rc == -EAGAIN) {45444544+ __kvm_s390_fixup_fault_sync(gmap, gpa, fault_flag);45454545+ rc = gmap_protect_one(gmap, gpa, prot, bits);45464546+ }45474547+ if (rc < 0)45484548+ return rc;45494549+ }45504550+45514551+ return 0;45524552+}45534553+45544554+static int kvm_s390_mprotect_notify_prefix(struct kvm_vcpu *vcpu)45554555+{45564556+ gpa_t gaddr = kvm_s390_get_prefix(vcpu);45574557+ int idx, rc;45584558+45594559+ idx = srcu_read_lock(&vcpu->kvm->srcu);45604560+ mmap_read_lock(vcpu->arch.gmap->mm);45614561+45624562+ rc = __kvm_s390_mprotect_many(vcpu->arch.gmap, gaddr, 2, PROT_WRITE, GMAP_NOTIFY_MPROT);45634563+45644564+ mmap_read_unlock(vcpu->arch.gmap->mm);45654565+ srcu_read_unlock(&vcpu->kvm->srcu, idx);45664566+45674567+ return rc;45684568+}45694569+45144570static int kvm_s390_handle_requests(struct kvm_vcpu *vcpu)45154571{45164572retry:···45954513 */45964514 if (kvm_check_request(KVM_REQ_REFRESH_GUEST_PREFIX, vcpu)) {45974515 int rc;45984598- rc = gmap_mprotect_notify(vcpu->arch.gmap,45994599- kvm_s390_get_prefix(vcpu),46004600- PAGE_SIZE * 2, PROT_WRITE);45164516+45174517+ rc = kvm_s390_mprotect_notify_prefix(vcpu);46014518 if (rc) {46024519 kvm_make_request(KVM_REQ_REFRESH_GUEST_PREFIX, vcpu);46034520 return rc;···48474766 return kvm_s390_inject_prog_irq(vcpu, &pgm_info);48484767}4849476847694769+static void kvm_s390_assert_primary_as(struct kvm_vcpu 
*vcpu)47704770+{47714771+ KVM_BUG(current->thread.gmap_teid.as != PSW_BITS_AS_PRIMARY, vcpu->kvm,47724772+ "Unexpected program interrupt 0x%x, TEID 0x%016lx",47734773+ current->thread.gmap_int_code, current->thread.gmap_teid.val);47744774+}47754775+47764776+/*47774777+ * __kvm_s390_handle_dat_fault() - handle a dat fault for the gmap of a vcpu47784778+ * @vcpu: the vCPU whose gmap is to be fixed up47794779+ * @gfn: the guest frame number used for memslots (including fake memslots)47804780+ * @gaddr: the gmap address, does not have to match @gfn for ucontrol gmaps47814781+ * @flags: FOLL_* flags47824782+ *47834783+ * Return: 0 on success, < 0 in case of error.47844784+ * Context: The mm lock must not be held before calling. May sleep.47854785+ */47864786+int __kvm_s390_handle_dat_fault(struct kvm_vcpu *vcpu, gfn_t gfn, gpa_t gaddr, unsigned int flags)47874787+{47884788+ struct kvm_memory_slot *slot;47894789+ unsigned int fault_flags;47904790+ bool writable, unlocked;47914791+ unsigned long vmaddr;47924792+ struct page *page;47934793+ kvm_pfn_t pfn;47944794+ int rc;47954795+47964796+ slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);47974797+ if (!slot || slot->flags & KVM_MEMSLOT_INVALID)47984798+ return vcpu_post_run_addressing_exception(vcpu);47994799+48004800+ fault_flags = flags & FOLL_WRITE ? 
FAULT_FLAG_WRITE : 0;48014801+ if (vcpu->arch.gmap->pfault_enabled)48024802+ flags |= FOLL_NOWAIT;48034803+ vmaddr = __gfn_to_hva_memslot(slot, gfn);48044804+48054805+try_again:48064806+ pfn = __kvm_faultin_pfn(slot, gfn, flags, &writable, &page);48074807+48084808+ /* Access outside memory, inject addressing exception */48094809+ if (is_noslot_pfn(pfn))48104810+ return vcpu_post_run_addressing_exception(vcpu);48114811+ /* Signal pending: try again */48124812+ if (pfn == KVM_PFN_ERR_SIGPENDING)48134813+ return -EAGAIN;48144814+48154815+ /* Needs I/O, try to setup async pfault (only possible with FOLL_NOWAIT) */48164816+ if (pfn == KVM_PFN_ERR_NEEDS_IO) {48174817+ trace_kvm_s390_major_guest_pfault(vcpu);48184818+ if (kvm_arch_setup_async_pf(vcpu))48194819+ return 0;48204820+ vcpu->stat.pfault_sync++;48214821+ /* Could not setup async pfault, try again synchronously */48224822+ flags &= ~FOLL_NOWAIT;48234823+ goto try_again;48244824+ }48254825+ /* Any other error */48264826+ if (is_error_pfn(pfn))48274827+ return -EFAULT;48284828+48294829+ /* Success */48304830+ mmap_read_lock(vcpu->arch.gmap->mm);48314831+ /* Mark the userspace PTEs as young and/or dirty, to avoid page fault loops */48324832+ rc = fixup_user_fault(vcpu->arch.gmap->mm, vmaddr, fault_flags, &unlocked);48334833+ if (!rc)48344834+ rc = __gmap_link(vcpu->arch.gmap, gaddr, vmaddr);48354835+ scoped_guard(spinlock, &vcpu->kvm->mmu_lock) {48364836+ kvm_release_faultin_page(vcpu->kvm, page, false, writable);48374837+ }48384838+ mmap_read_unlock(vcpu->arch.gmap->mm);48394839+ return rc;48404840+}48414841+48424842+static int vcpu_dat_fault_handler(struct kvm_vcpu *vcpu, unsigned long gaddr, unsigned int flags)48434843+{48444844+ unsigned long gaddr_tmp;48454845+ gfn_t gfn;48464846+48474847+ gfn = gpa_to_gfn(gaddr);48484848+ if (kvm_is_ucontrol(vcpu->kvm)) {48494849+ /*48504850+ * This translates the per-vCPU guest address into a48514851+ * fake guest address, which can then be used with the48524852+ * fake 
memslots that are identity mapping userspace.48534853+ * This allows ucontrol VMs to use the normal fault48544854+ * resolution path, like normal VMs.48554855+ */48564856+ mmap_read_lock(vcpu->arch.gmap->mm);48574857+ gaddr_tmp = __gmap_translate(vcpu->arch.gmap, gaddr);48584858+ mmap_read_unlock(vcpu->arch.gmap->mm);48594859+ if (gaddr_tmp == -EFAULT) {48604860+ vcpu->run->exit_reason = KVM_EXIT_S390_UCONTROL;48614861+ vcpu->run->s390_ucontrol.trans_exc_code = gaddr;48624862+ vcpu->run->s390_ucontrol.pgm_code = PGM_SEGMENT_TRANSLATION;48634863+ return -EREMOTE;48644864+ }48654865+ gfn = gpa_to_gfn(gaddr_tmp);48664866+ }48674867+ return __kvm_s390_handle_dat_fault(vcpu, gfn, gaddr, flags);48684868+}48694869+48504870static int vcpu_post_run_handle_fault(struct kvm_vcpu *vcpu)48514871{48524872 unsigned int flags = 0;48534873 unsigned long gaddr;48544854- int rc = 0;4855487448564875 gaddr = current->thread.gmap_teid.addr * PAGE_SIZE;48574876 if (kvm_s390_cur_gmap_fault_is_write())···49624781 vcpu->stat.exit_null++;49634782 break;49644783 case PGM_NON_SECURE_STORAGE_ACCESS:49654965- KVM_BUG(current->thread.gmap_teid.as != PSW_BITS_AS_PRIMARY, vcpu->kvm,49664966- "Unexpected program interrupt 0x%x, TEID 0x%016lx",49674967- current->thread.gmap_int_code, current->thread.gmap_teid.val);47844784+ kvm_s390_assert_primary_as(vcpu);49684785 /*49694786 * This is normal operation; a page belonging to a protected49704787 * guest has not been imported yet. 
Try to import the page into···49734794 break;49744795 case PGM_SECURE_STORAGE_ACCESS:49754796 case PGM_SECURE_STORAGE_VIOLATION:49764976- KVM_BUG(current->thread.gmap_teid.as != PSW_BITS_AS_PRIMARY, vcpu->kvm,49774977- "Unexpected program interrupt 0x%x, TEID 0x%016lx",49784978- current->thread.gmap_int_code, current->thread.gmap_teid.val);47974797+ kvm_s390_assert_primary_as(vcpu);49794798 /*49804799 * This can happen after a reboot with asynchronous teardown;49814800 * the new guest (normal or protected) will run on top of the···50024825 case PGM_REGION_FIRST_TRANS:50034826 case PGM_REGION_SECOND_TRANS:50044827 case PGM_REGION_THIRD_TRANS:50055005- KVM_BUG(current->thread.gmap_teid.as != PSW_BITS_AS_PRIMARY, vcpu->kvm,50065006- "Unexpected program interrupt 0x%x, TEID 0x%016lx",50075007- current->thread.gmap_int_code, current->thread.gmap_teid.val);50085008- if (vcpu->arch.gmap->pfault_enabled) {50095009- rc = gmap_fault(vcpu->arch.gmap, gaddr, flags | FAULT_FLAG_RETRY_NOWAIT);50105010- if (rc == -EFAULT)50115011- return vcpu_post_run_addressing_exception(vcpu);50125012- if (rc == -EAGAIN) {50135013- trace_kvm_s390_major_guest_pfault(vcpu);50145014- if (kvm_arch_setup_async_pf(vcpu))50155015- return 0;50165016- vcpu->stat.pfault_sync++;50175017- } else {50185018- return rc;50195019- }50205020- }50215021- rc = gmap_fault(vcpu->arch.gmap, gaddr, flags);50225022- if (rc == -EFAULT) {50235023- if (kvm_is_ucontrol(vcpu->kvm)) {50245024- vcpu->run->exit_reason = KVM_EXIT_S390_UCONTROL;50255025- vcpu->run->s390_ucontrol.trans_exc_code = gaddr;50265026- vcpu->run->s390_ucontrol.pgm_code = 0x10;50275027- return -EREMOTE;50285028- }50295029- return vcpu_post_run_addressing_exception(vcpu);50305030- }50315031- break;48284828+ kvm_s390_assert_primary_as(vcpu);48294829+ return vcpu_dat_fault_handler(vcpu, gaddr, flags);50324830 default:50334831 KVM_BUG(1, vcpu->kvm, "Unexpected program interrupt 0x%x, TEID 0x%016lx",50344832 current->thread.gmap_int_code, 
current->thread.gmap_teid.val);50354833 send_sig(SIGSEGV, current, 0);50364834 break;50374835 }50385038- return rc;48364836+ return 0;50394837}5040483850414839static int vcpu_post_run(struct kvm_vcpu *vcpu, int exit_reason)···58895737 }58905738#endif58915739 case KVM_S390_VCPU_FAULT: {58925892- r = gmap_fault(vcpu->arch.gmap, arg, 0);57405740+ idx = srcu_read_lock(&vcpu->kvm->srcu);57415741+ r = vcpu_dat_fault_handler(vcpu, arg, 0);57425742+ srcu_read_unlock(&vcpu->kvm->srcu, idx);58935743 break;58945744 }58955745 case KVM_ENABLE_CAP:···60075853{60085854 gpa_t size;6009585560106010- if (kvm_is_ucontrol(kvm))58565856+ if (kvm_is_ucontrol(kvm) && new->id < KVM_USER_MEM_SLOTS)60115857 return -EINVAL;6012585860135859 /* When we are protected, we should not change the memory slots */···60585904 enum kvm_mr_change change)60595905{60605906 int rc = 0;59075907+59085908+ if (kvm_is_ucontrol(kvm))59095909+ return;6061591060625911 switch (change) {60635912 case KVM_MR_DELETE:
arch/s390/kvm/pv.c
···1717#include <linux/sched/mm.h>1818#include <linux/mmu_notifier.h>1919#include "kvm-s390.h"2020+#include "gmap.h"20212122bool kvm_s390_pv_is_protected(struct kvm *kvm)2223{···639638 .tweak[1] = offset,640639 };641640 int ret = gmap_make_secure(kvm->arch.gmap, addr, &uvcb);641641+ unsigned long vmaddr;642642+ bool unlocked;642643643644 *rc = uvcb.header.rc;644645 *rrc = uvcb.header.rrc;646646+647647+ if (ret == -ENXIO) {648648+ mmap_read_lock(kvm->mm);649649+ vmaddr = gfn_to_hva(kvm, gpa_to_gfn(addr));650650+ if (kvm_is_error_hva(vmaddr)) {651651+ ret = -EFAULT;652652+ } else {653653+ ret = fixup_user_fault(kvm->mm, vmaddr, FAULT_FLAG_WRITE, &unlocked);654654+ if (!ret)655655+ ret = __gmap_link(kvm->arch.gmap, addr, vmaddr);656656+ }657657+ mmap_read_unlock(kvm->mm);658658+ if (!ret)659659+ return -EAGAIN;660660+ return ret;661661+ }645662646663 if (ret && ret != -EAGAIN)647664 KVM_UV_EVENT(kvm, 3, "PROTVIRT VM UNPACK: failed addr %llx with rc %x rrc %x",···678659679660 KVM_UV_EVENT(kvm, 3, "PROTVIRT VM UNPACK: start addr %lx size %lx",680661 addr, size);662662+663663+ guard(srcu)(&kvm->srcu);681664682665 while (offset < size) {683666 ret = unpack_one(kvm, addr, tweak, offset, rc, rrc);
+68-38
arch/s390/kvm/vsie.c
···1313#include <linux/bitmap.h>1414#include <linux/sched/signal.h>1515#include <linux/io.h>1616+#include <linux/mman.h>16171718#include <asm/gmap.h>1819#include <asm/mmu_context.h>···2322#include <asm/facility.h>2423#include "kvm-s390.h"2524#include "gaccess.h"2525+#include "gmap.h"2626+2727+enum vsie_page_flags {2828+ VSIE_PAGE_IN_USE = 0,2929+};26302731struct vsie_page {2832 struct kvm_s390_sie_block scb_s; /* 0x0000 */···5246 gpa_t gvrd_gpa; /* 0x0240 */5347 gpa_t riccbd_gpa; /* 0x0248 */5448 gpa_t sdnx_gpa; /* 0x0250 */5555- __u8 reserved[0x0700 - 0x0258]; /* 0x0258 */4949+ /*5050+ * guest address of the original SCB. Remains set for free vsie5151+ * pages, so we can properly look them up in our addr_to_page5252+ * radix tree.5353+ */5454+ gpa_t scb_gpa; /* 0x0258 */5555+ /*5656+ * Flags: must be set/cleared atomically after the vsie page can be5757+ * looked up by other CPUs.5858+ */5959+ unsigned long flags; /* 0x0260 */6060+ __u8 reserved[0x0700 - 0x0268]; /* 0x0268 */5661 struct kvm_s390_crypto_cb crycb; /* 0x0700 */5762 __u8 fac[S390_ARCH_FAC_LIST_SIZE_BYTE]; /* 0x0800 */5863};···601584 struct kvm *kvm = gmap->private;602585 struct vsie_page *cur;603586 unsigned long prefix;604604- struct page *page;605587 int i;606588607589 if (!gmap_is_shadow(gmap))···610594 * therefore we can safely reference them all the time.611595 */612596 for (i = 0; i < kvm->arch.vsie.page_count; i++) {613613- page = READ_ONCE(kvm->arch.vsie.pages[i]);614614- if (!page)597597+ cur = READ_ONCE(kvm->arch.vsie.pages[i]);598598+ if (!cur)615599 continue;616616- cur = page_to_virt(page);617600 if (READ_ONCE(cur->gmap) != gmap)618601 continue;619602 prefix = cur->scb_s.prefix << GUEST_PREFIX_SHIFT;···13601345 return rc;13611346}1362134713481348+/* Try getting a given vsie page, returning "true" on success. 
*/13491349+static inline bool try_get_vsie_page(struct vsie_page *vsie_page)13501350+{13511351+ if (test_bit(VSIE_PAGE_IN_USE, &vsie_page->flags))13521352+ return false;13531353+ return !test_and_set_bit(VSIE_PAGE_IN_USE, &vsie_page->flags);13541354+}13551355+13561356+/* Put a vsie page acquired through get_vsie_page / try_get_vsie_page. */13571357+static void put_vsie_page(struct vsie_page *vsie_page)13581358+{13591359+ clear_bit(VSIE_PAGE_IN_USE, &vsie_page->flags);13601360+}13611361+13631362/*13641363 * Get or create a vsie page for a scb address.13651364 *···13841355static struct vsie_page *get_vsie_page(struct kvm *kvm, unsigned long addr)13851356{13861357 struct vsie_page *vsie_page;13871387- struct page *page;13881358 int nr_vcpus;1389135913901360 rcu_read_lock();13911391- page = radix_tree_lookup(&kvm->arch.vsie.addr_to_page, addr >> 9);13611361+ vsie_page = radix_tree_lookup(&kvm->arch.vsie.addr_to_page, addr >> 9);13921362 rcu_read_unlock();13931393- if (page) {13941394- if (page_ref_inc_return(page) == 2)13951395- return page_to_virt(page);13961396- page_ref_dec(page);13631363+ if (vsie_page) {13641364+ if (try_get_vsie_page(vsie_page)) {13651365+ if (vsie_page->scb_gpa == addr)13661366+ return vsie_page;13671367+ /*13681368+ * We raced with someone reusing + putting this vsie13691369+ * page before we grabbed it.13701370+ */13711371+ put_vsie_page(vsie_page);13721372+ }13971373 }1398137413991375 /*···1409137514101376 mutex_lock(&kvm->arch.vsie.mutex);14111377 if (kvm->arch.vsie.page_count < nr_vcpus) {14121412- page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO | GFP_DMA);14131413- if (!page) {13781378+ vsie_page = (void *)__get_free_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO | GFP_DMA);13791379+ if (!vsie_page) {14141380 mutex_unlock(&kvm->arch.vsie.mutex);14151381 return ERR_PTR(-ENOMEM);14161382 }14171417- page_ref_inc(page);14181418- kvm->arch.vsie.pages[kvm->arch.vsie.page_count] = page;13831383+ __set_bit(VSIE_PAGE_IN_USE, &vsie_page->flags);13841384+ 
kvm->arch.vsie.pages[kvm->arch.vsie.page_count] = vsie_page;14191385 kvm->arch.vsie.page_count++;14201386 } else {14211387 /* reuse an existing entry that belongs to nobody */14221388 while (true) {14231423- page = kvm->arch.vsie.pages[kvm->arch.vsie.next];14241424- if (page_ref_inc_return(page) == 2)13891389+ vsie_page = kvm->arch.vsie.pages[kvm->arch.vsie.next];13901390+ if (try_get_vsie_page(vsie_page))14251391 break;14261426- page_ref_dec(page);14271392 kvm->arch.vsie.next++;14281393 kvm->arch.vsie.next %= nr_vcpus;14291394 }14301430- radix_tree_delete(&kvm->arch.vsie.addr_to_page, page->index >> 9);13951395+ if (vsie_page->scb_gpa != ULONG_MAX)13961396+ radix_tree_delete(&kvm->arch.vsie.addr_to_page,13971397+ vsie_page->scb_gpa >> 9);14311398 }14321432- page->index = addr;14331433- /* double use of the same address */14341434- if (radix_tree_insert(&kvm->arch.vsie.addr_to_page, addr >> 9, page)) {14351435- page_ref_dec(page);13991399+ /* Mark it as invalid until it resides in the tree. */14001400+ vsie_page->scb_gpa = ULONG_MAX;14011401+14021402+ /* Double use of the same address or allocation failure. 
*/14031403+ if (radix_tree_insert(&kvm->arch.vsie.addr_to_page, addr >> 9,14041404+ vsie_page)) {14051405+ put_vsie_page(vsie_page);14361406 mutex_unlock(&kvm->arch.vsie.mutex);14371407 return NULL;14381408 }14091409+ vsie_page->scb_gpa = addr;14391410 mutex_unlock(&kvm->arch.vsie.mutex);1440141114411441- vsie_page = page_to_virt(page);14421412 memset(&vsie_page->scb_s, 0, sizeof(struct kvm_s390_sie_block));14431413 release_gmap_shadow(vsie_page);14441414 vsie_page->fault_addr = 0;14451415 vsie_page->scb_s.ihcpu = 0xffffU;14461416 return vsie_page;14471447-}14481448-14491449-/* put a vsie page acquired via get_vsie_page */14501450-static void put_vsie_page(struct kvm *kvm, struct vsie_page *vsie_page)14511451-{14521452- struct page *page = pfn_to_page(__pa(vsie_page) >> PAGE_SHIFT);14531453-14541454- page_ref_dec(page);14551417}1456141814571419int kvm_s390_handle_vsie(struct kvm_vcpu *vcpu)···15001470out_unpin_scb:15011471 unpin_scb(vcpu, vsie_page, scb_addr);15021472out_put:15031503- put_vsie_page(vcpu->kvm, vsie_page);14731473+ put_vsie_page(vsie_page);1504147415051475 return rc < 0 ? 
rc : 0;15061476}···15161486void kvm_s390_vsie_destroy(struct kvm *kvm)15171487{15181488 struct vsie_page *vsie_page;15191519- struct page *page;15201489 int i;1521149015221491 mutex_lock(&kvm->arch.vsie.mutex);15231492 for (i = 0; i < kvm->arch.vsie.page_count; i++) {15241524- page = kvm->arch.vsie.pages[i];14931493+ vsie_page = kvm->arch.vsie.pages[i];15251494 kvm->arch.vsie.pages[i] = NULL;15261526- vsie_page = page_to_virt(page);15271495 release_gmap_shadow(vsie_page);15281496 /* free the radix tree entry */15291529- radix_tree_delete(&kvm->arch.vsie.addr_to_page, page->index >> 9);15301530- __free_page(page);14971497+ if (vsie_page->scb_gpa != ULONG_MAX)14981498+ radix_tree_delete(&kvm->arch.vsie.addr_to_page,14991499+ vsie_page->scb_gpa >> 9);15001500+ free_page((unsigned long)vsie_page);15311501 }15321502 kvm->arch.vsie.page_count = 0;15331503 mutex_unlock(&kvm->arch.vsie.mutex);
+150-531
arch/s390/mm/gmap.c
···2424#include <asm/page.h>2525#include <asm/tlb.h>26262727+/*2828+ * The address is saved in a radix tree directly; NULL would be ambiguous,2929+ * since 0 is a valid address, and NULL is returned when nothing was found.3030+ * The lower bits are ignored by all users of the macro, so it can be used3131+ * to distinguish a valid address 0 from a NULL.3232+ */3333+#define VALID_GADDR_FLAG 13434+#define IS_GADDR_VALID(gaddr) ((gaddr) & VALID_GADDR_FLAG)3535+#define MAKE_VALID_GADDR(gaddr) (((gaddr) & HPAGE_MASK) | VALID_GADDR_FLAG)3636+2737#define GMAP_SHADOW_FAKE_TABLE 1ULL28382939static struct page *gmap_alloc_crst(void)···5343 *5444 * Returns a guest address space structure.5545 */5656-static struct gmap *gmap_alloc(unsigned long limit)4646+struct gmap *gmap_alloc(unsigned long limit)5747{5848 struct gmap *gmap;5949 struct page *page;···8070 gmap = kzalloc(sizeof(struct gmap), GFP_KERNEL_ACCOUNT);8171 if (!gmap)8272 goto out;8383- INIT_LIST_HEAD(&gmap->crst_list);8473 INIT_LIST_HEAD(&gmap->children);8585- INIT_LIST_HEAD(&gmap->pt_list);8674 INIT_RADIX_TREE(&gmap->guest_to_host, GFP_KERNEL_ACCOUNT);8775 INIT_RADIX_TREE(&gmap->host_to_guest, GFP_ATOMIC | __GFP_ACCOUNT);8876 INIT_RADIX_TREE(&gmap->host_to_rmap, GFP_ATOMIC | __GFP_ACCOUNT);···9082 page = gmap_alloc_crst();9183 if (!page)9284 goto out_free;9393- page->index = 0;9494- list_add(&page->lru, &gmap->crst_list);9585 table = page_to_virt(page);9686 crst_table_init(table, etype);9787 gmap->table = table;···10397out:10498 return NULL;10599}100100+EXPORT_SYMBOL_GPL(gmap_alloc);106101107102/**108103 * gmap_create - create a guest address space···192185 } while (nr > 0);193186}194187188188+static void gmap_free_crst(unsigned long *table, bool free_ptes)189189+{190190+ bool is_segment = (table[0] & _SEGMENT_ENTRY_TYPE_MASK) == 0;191191+ int i;192192+193193+ if (is_segment) {194194+ if (!free_ptes)195195+ goto out;196196+ for (i = 0; i < _CRST_ENTRIES; i++)197197+ if (!(table[i] & _SEGMENT_ENTRY_INVALID))198198+ 
page_table_free_pgste(page_ptdesc(phys_to_page(table[i])));199199+ } else {200200+ for (i = 0; i < _CRST_ENTRIES; i++)201201+ if (!(table[i] & _REGION_ENTRY_INVALID))202202+ gmap_free_crst(__va(table[i] & PAGE_MASK), free_ptes);203203+ }204204+205205+out:206206+ free_pages((unsigned long)table, CRST_ALLOC_ORDER);207207+}208208+195209/**196210 * gmap_free - free a guest address space197211 * @gmap: pointer to the guest address space structure198212 *199213 * No locks required. There are no references to this gmap anymore.200214 */201201-static void gmap_free(struct gmap *gmap)215215+void gmap_free(struct gmap *gmap)202216{203203- struct page *page, *next;204204-205217 /* Flush tlb of all gmaps (if not already done for shadows) */206218 if (!(gmap_is_shadow(gmap) && gmap->removed))207219 gmap_flush_tlb(gmap);208220 /* Free all segment & region tables. */209209- list_for_each_entry_safe(page, next, &gmap->crst_list, lru)210210- __free_pages(page, CRST_ALLOC_ORDER);221221+ gmap_free_crst(gmap->table, gmap_is_shadow(gmap));222222+211223 gmap_radix_tree_free(&gmap->guest_to_host);212224 gmap_radix_tree_free(&gmap->host_to_guest);213225214226 /* Free additional data for a shadow gmap */215227 if (gmap_is_shadow(gmap)) {216216- struct ptdesc *ptdesc, *n;217217-218218- /* Free all page tables. 
*/219219- list_for_each_entry_safe(ptdesc, n, &gmap->pt_list, pt_list)220220- page_table_free_pgste(ptdesc);221228 gmap_rmap_radix_tree_free(&gmap->host_to_rmap);222229 /* Release reference to the parent */223230 gmap_put(gmap->parent);···239218240219 kfree(gmap);241220}221221+EXPORT_SYMBOL_GPL(gmap_free);242222243223/**244224 * gmap_get - increase reference counter for guest address space···320298 crst_table_init(new, init);321299 spin_lock(&gmap->guest_table_lock);322300 if (*table & _REGION_ENTRY_INVALID) {323323- list_add(&page->lru, &gmap->crst_list);324301 *table = __pa(new) | _REGION_ENTRY_LENGTH |325302 (*table & _REGION_ENTRY_TYPE_MASK);326326- page->index = gaddr;327303 page = NULL;328304 }329305 spin_unlock(&gmap->guest_table_lock);···330310 return 0;331311}332312333333-/**334334- * __gmap_segment_gaddr - find virtual address from segment pointer335335- * @entry: pointer to a segment table entry in the guest address space336336- *337337- * Returns the virtual address in the guest address space for the segment338338- */339339-static unsigned long __gmap_segment_gaddr(unsigned long *entry)313313+static unsigned long host_to_guest_lookup(struct gmap *gmap, unsigned long vmaddr)340314{341341- struct page *page;342342- unsigned long offset;315315+ return (unsigned long)radix_tree_lookup(&gmap->host_to_guest, vmaddr >> PMD_SHIFT);316316+}343317344344- offset = (unsigned long) entry / sizeof(unsigned long);345345- offset = (offset & (PTRS_PER_PMD - 1)) * PMD_SIZE;346346- page = pmd_pgtable_page((pmd_t *) entry);347347- return page->index + offset;318318+static unsigned long host_to_guest_delete(struct gmap *gmap, unsigned long vmaddr)319319+{320320+ return (unsigned long)radix_tree_delete(&gmap->host_to_guest, vmaddr >> PMD_SHIFT);321321+}322322+323323+static pmd_t *host_to_guest_pmd_delete(struct gmap *gmap, unsigned long vmaddr,324324+ unsigned long *gaddr)325325+{326326+ *gaddr = host_to_guest_delete(gmap, vmaddr);327327+ if (IS_GADDR_VALID(*gaddr))328328+ 
return (pmd_t *)gmap_table_walk(gmap, *gaddr, 1);329329+ return NULL;348330}349331350332/**···358336 */359337static int __gmap_unlink_by_vmaddr(struct gmap *gmap, unsigned long vmaddr)360338{361361- unsigned long *entry;339339+ unsigned long gaddr;362340 int flush = 0;341341+ pmd_t *pmdp;363342364343 BUG_ON(gmap_is_shadow(gmap));365344 spin_lock(&gmap->guest_table_lock);366366- entry = radix_tree_delete(&gmap->host_to_guest, vmaddr >> PMD_SHIFT);367367- if (entry) {368368- flush = (*entry != _SEGMENT_ENTRY_EMPTY);369369- *entry = _SEGMENT_ENTRY_EMPTY;345345+346346+ pmdp = host_to_guest_pmd_delete(gmap, vmaddr, &gaddr);347347+ if (pmdp) {348348+ flush = (pmd_val(*pmdp) != _SEGMENT_ENTRY_EMPTY);349349+ *pmdp = __pmd(_SEGMENT_ENTRY_EMPTY);370350 }351351+371352 spin_unlock(&gmap->guest_table_lock);372353 return flush;373354}···489464EXPORT_SYMBOL_GPL(__gmap_translate);490465491466/**492492- * gmap_translate - translate a guest address to a user space address493493- * @gmap: pointer to guest mapping meta data structure494494- * @gaddr: guest address495495- *496496- * Returns user space address which corresponds to the guest address or497497- * -EFAULT if no such mapping exists.498498- * This function does not establish potentially missing page table entries.499499- */500500-unsigned long gmap_translate(struct gmap *gmap, unsigned long gaddr)501501-{502502- unsigned long rc;503503-504504- mmap_read_lock(gmap->mm);505505- rc = __gmap_translate(gmap, gaddr);506506- mmap_read_unlock(gmap->mm);507507- return rc;508508-}509509-EXPORT_SYMBOL_GPL(gmap_translate);510510-511511-/**512467 * gmap_unlink - disconnect a page table from the gmap shadow tables513468 * @mm: pointer to the parent mm_struct514469 * @table: pointer to the host page table···587582 spin_lock(&gmap->guest_table_lock);588583 if (*table == _SEGMENT_ENTRY_EMPTY) {589584 rc = radix_tree_insert(&gmap->host_to_guest,590590- vmaddr >> PMD_SHIFT, table);585585+ vmaddr >> PMD_SHIFT,586586+ (void 
*)MAKE_VALID_GADDR(gaddr));591587 if (!rc) {592588 if (pmd_leaf(*pmd)) {593589 *table = (pmd_val(*pmd) &···611605 radix_tree_preload_end();612606 return rc;613607}614614-615615-/**616616- * fixup_user_fault_nowait - manually resolve a user page fault without waiting617617- * @mm: mm_struct of target mm618618- * @address: user address619619- * @fault_flags:flags to pass down to handle_mm_fault()620620- * @unlocked: did we unlock the mmap_lock while retrying621621- *622622- * This function behaves similarly to fixup_user_fault(), but it guarantees623623- * that the fault will be resolved without waiting. The function might drop624624- * and re-acquire the mm lock, in which case @unlocked will be set to true.625625- *626626- * The guarantee is that the fault is handled without waiting, but the627627- * function itself might sleep, due to the lock.628628- *629629- * Context: Needs to be called with mm->mmap_lock held in read mode, and will630630- * return with the lock held in read mode; @unlocked will indicate whether631631- * the lock has been dropped and re-acquired. This is the same behaviour as632632- * fixup_user_fault().633633- *634634- * Return: 0 on success, -EAGAIN if the fault cannot be resolved without635635- * waiting, -EFAULT if the fault cannot be resolved, -ENOMEM if out of636636- * memory.637637- */638638-static int fixup_user_fault_nowait(struct mm_struct *mm, unsigned long address,639639- unsigned int fault_flags, bool *unlocked)640640-{641641- struct vm_area_struct *vma;642642- unsigned int test_flags;643643- vm_fault_t fault;644644- int rc;645645-646646- fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT;647647- test_flags = fault_flags & FAULT_FLAG_WRITE ? 
VM_WRITE : VM_READ;648648-649649- vma = find_vma(mm, address);650650- if (unlikely(!vma || address < vma->vm_start))651651- return -EFAULT;652652- if (unlikely(!(vma->vm_flags & test_flags)))653653- return -EFAULT;654654-655655- fault = handle_mm_fault(vma, address, fault_flags, NULL);656656- /* the mm lock has been dropped, take it again */657657- if (fault & VM_FAULT_COMPLETED) {658658- *unlocked = true;659659- mmap_read_lock(mm);660660- return 0;661661- }662662- /* the mm lock has not been dropped */663663- if (fault & VM_FAULT_ERROR) {664664- rc = vm_fault_to_errno(fault, 0);665665- BUG_ON(!rc);666666- return rc;667667- }668668- /* the mm lock has not been dropped because of FAULT_FLAG_RETRY_NOWAIT */669669- if (fault & VM_FAULT_RETRY)670670- return -EAGAIN;671671- /* nothing needed to be done and the mm lock has not been dropped */672672- return 0;673673-}674674-675675-/**676676- * __gmap_fault - resolve a fault on a guest address677677- * @gmap: pointer to guest mapping meta data structure678678- * @gaddr: guest address679679- * @fault_flags: flags to pass down to handle_mm_fault()680680- *681681- * Context: Needs to be called with mm->mmap_lock held in read mode. Might682682- * drop and re-acquire the lock. 
Will always return with the lock held.683683- */684684-static int __gmap_fault(struct gmap *gmap, unsigned long gaddr, unsigned int fault_flags)685685-{686686- unsigned long vmaddr;687687- bool unlocked;688688- int rc = 0;689689-690690-retry:691691- unlocked = false;692692-693693- vmaddr = __gmap_translate(gmap, gaddr);694694- if (IS_ERR_VALUE(vmaddr))695695- return vmaddr;696696-697697- if (fault_flags & FAULT_FLAG_RETRY_NOWAIT)698698- rc = fixup_user_fault_nowait(gmap->mm, vmaddr, fault_flags, &unlocked);699699- else700700- rc = fixup_user_fault(gmap->mm, vmaddr, fault_flags, &unlocked);701701- if (rc)702702- return rc;703703- /*704704- * In the case that fixup_user_fault unlocked the mmap_lock during705705- * fault-in, redo __gmap_translate() to avoid racing with a706706- * map/unmap_segment.707707- * In particular, __gmap_translate(), fixup_user_fault{,_nowait}(),708708- * and __gmap_link() must all be called atomically in one go; if the709709- * lock had been dropped in between, a retry is needed.710710- */711711- if (unlocked)712712- goto retry;713713-714714- return __gmap_link(gmap, gaddr, vmaddr);715715-}716716-717717-/**718718- * gmap_fault - resolve a fault on a guest address719719- * @gmap: pointer to guest mapping meta data structure720720- * @gaddr: guest address721721- * @fault_flags: flags to pass down to handle_mm_fault()722722- *723723- * Returns 0 on success, -ENOMEM for out of memory conditions, -EFAULT if the724724- * vm address is already mapped to a different guest segment, and -EAGAIN if725725- * FAULT_FLAG_RETRY_NOWAIT was specified and the fault could not be processed726726- * immediately.727727- */728728-int gmap_fault(struct gmap *gmap, unsigned long gaddr, unsigned int fault_flags)729729-{730730- int rc;731731-732732- mmap_read_lock(gmap->mm);733733- rc = __gmap_fault(gmap, gaddr, fault_flags);734734- mmap_read_unlock(gmap->mm);735735- return 
rc;736736-}737737-EXPORT_SYMBOL_GPL(gmap_fault);608608+EXPORT_SYMBOL(__gmap_link);738609739610/*740611 * this function is assumed to be called with mmap_lock held···736853 *737854 * Note: Can also be called for shadow gmaps.738855 */739739-static inline unsigned long *gmap_table_walk(struct gmap *gmap,740740- unsigned long gaddr, int level)856856+unsigned long *gmap_table_walk(struct gmap *gmap, unsigned long gaddr, int level)741857{742858 const int asce_type = gmap->asce & _ASCE_TYPE_MASK;743859 unsigned long *table = gmap->table;···787905 }788906 return table;789907}908908+EXPORT_SYMBOL(gmap_table_walk);790909791910/**792911 * gmap_pte_op_walk - walk the gmap page table, get the page table lock···9841101 * @prot: indicates access rights: PROT_NONE, PROT_READ or PROT_WRITE9851102 * @bits: pgste notification bits to set9861103 *987987- * Returns 0 if successfully protected, -ENOMEM if out of memory and988988- * -EFAULT if gaddr is invalid (or mapping for shadows is missing).11041104+ * Returns:11051105+ * PAGE_SIZE if a small page was successfully protected;11061106+ * HPAGE_SIZE if a large page was successfully protected;11071107+ * -ENOMEM if out of memory;11081108+ * -EFAULT if gaddr is invalid (or mapping for shadows is missing);11091109+ * -EAGAIN if the guest mapping is missing and should be fixed by the caller.9891110 *990990- * Called with sg->mm->mmap_lock in read.11111111+ * Context: Called with sg->mm->mmap_lock in read.9911112 */992992-static int gmap_protect_range(struct gmap *gmap, unsigned long gaddr,993993- unsigned long len, int prot, unsigned long bits)11131113+int gmap_protect_one(struct gmap *gmap, unsigned long gaddr, int prot, unsigned long bits)9941114{995995- unsigned long vmaddr, dist;9961115 pmd_t *pmdp;997997- int rc;11161116+ int rc = 0;99811179991118 BUG_ON(gmap_is_shadow(gmap));10001000- while (len) {10011001- rc = -EAGAIN;10021002- pmdp = gmap_pmd_op_walk(gmap, gaddr);10031003- if (pmdp) {10041004- if (!pmd_leaf(*pmdp)) {10051005- rc 
= gmap_protect_pte(gmap, gaddr, pmdp, prot,10061006- bits);10071007- if (!rc) {10081008- len -= PAGE_SIZE;10091009- gaddr += PAGE_SIZE;10101010- }10111011- } else {10121012- rc = gmap_protect_pmd(gmap, gaddr, pmdp, prot,10131013- bits);10141014- if (!rc) {10151015- dist = HPAGE_SIZE - (gaddr & ~HPAGE_MASK);10161016- len = len < dist ? 0 : len - dist;10171017- gaddr = (gaddr & HPAGE_MASK) + HPAGE_SIZE;10181018- }10191019- }10201020- gmap_pmd_op_end(gmap, pmdp);10211021- }10221022- if (rc) {10231023- if (rc == -EINVAL)10241024- return rc;1025111910261026- /* -EAGAIN, fixup of userspace mm and gmap */10271027- vmaddr = __gmap_translate(gmap, gaddr);10281028- if (IS_ERR_VALUE(vmaddr))10291029- return vmaddr;10301030- rc = gmap_pte_op_fixup(gmap, gaddr, vmaddr, prot);10311031- if (rc)10321032- return rc;10331033- }11201120+ pmdp = gmap_pmd_op_walk(gmap, gaddr);11211121+ if (!pmdp)11221122+ return -EAGAIN;11231123+11241124+ if (!pmd_leaf(*pmdp)) {11251125+ rc = gmap_protect_pte(gmap, gaddr, pmdp, prot, bits);11261126+ if (!rc)11271127+ rc = PAGE_SIZE;11281128+ } else {11291129+ rc = gmap_protect_pmd(gmap, gaddr, pmdp, prot, bits);11301130+ if (!rc)11311131+ rc = HPAGE_SIZE;10341132 }10351035- return 0;10361036-}11331133+ gmap_pmd_op_end(gmap, pmdp);1037113410381038-/**10391039- * gmap_mprotect_notify - change access rights for a range of ptes and10401040- * call the notifier if any pte changes again10411041- * @gmap: pointer to guest mapping meta data structure10421042- * @gaddr: virtual address in the guest address space10431043- * @len: size of area10441044- * @prot: indicates access rights: PROT_NONE, PROT_READ or PROT_WRITE10451045- *10461046- * Returns 0 if for each page in the given range a gmap mapping exists,10471047- * the new access rights could be set and the notifier could be armed.10481048- * If the gmap mapping is missing for one or more pages -EFAULT is10491049- * returned. 
If no memory could be allocated -ENOMEM is returned.10501050- * This function establishes missing page table entries.10511051- */10521052-int gmap_mprotect_notify(struct gmap *gmap, unsigned long gaddr,10531053- unsigned long len, int prot)10541054-{10551055- int rc;10561056-10571057- if ((gaddr & ~PAGE_MASK) || (len & ~PAGE_MASK) || gmap_is_shadow(gmap))10581058- return -EINVAL;10591059- if (!MACHINE_HAS_ESOP && prot == PROT_READ)10601060- return -EINVAL;10611061- mmap_read_lock(gmap->mm);10621062- rc = gmap_protect_range(gmap, gaddr, len, prot, GMAP_NOTIFY_MPROT);10631063- mmap_read_unlock(gmap->mm);10641135 return rc;10651136}10661066-EXPORT_SYMBOL_GPL(gmap_mprotect_notify);11371137+EXPORT_SYMBOL_GPL(gmap_protect_one);1067113810681139/**10691140 * gmap_read_table - get an unsigned long value from a guest page table using···12511414 __gmap_unshadow_pgt(sg, raddr, __va(pgt));12521415 /* Free page table */12531416 ptdesc = page_ptdesc(phys_to_page(pgt));12541254- list_del(&ptdesc->pt_list);12551417 page_table_free_pgste(ptdesc);12561418}12571419···12781442 __gmap_unshadow_pgt(sg, raddr, __va(pgt));12791443 /* Free page table */12801444 ptdesc = page_ptdesc(phys_to_page(pgt));12811281- list_del(&ptdesc->pt_list);12821445 page_table_free_pgste(ptdesc);12831446 }12841447}···13071472 __gmap_unshadow_sgt(sg, raddr, __va(sgt));13081473 /* Free segment table */13091474 page = phys_to_page(sgt);13101310- list_del(&page->lru);13111475 __free_pages(page, CRST_ALLOC_ORDER);13121476}13131477···13341500 __gmap_unshadow_sgt(sg, raddr, __va(sgt));13351501 /* Free segment table */13361502 page = phys_to_page(sgt);13371337- list_del(&page->lru);13381503 __free_pages(page, CRST_ALLOC_ORDER);13391504 }13401505}···13631530 __gmap_unshadow_r3t(sg, raddr, __va(r3t));13641531 /* Free region 3 table */13651532 page = phys_to_page(r3t);13661366- list_del(&page->lru);13671533 __free_pages(page, CRST_ALLOC_ORDER);13681534}13691535···13901558 __gmap_unshadow_r3t(sg, raddr, __va(r3t));13911559 
/* Free region 3 table */13921560 page = phys_to_page(r3t);13931393- list_del(&page->lru);13941561 __free_pages(page, CRST_ALLOC_ORDER);13951562 }13961563}···14191588 __gmap_unshadow_r2t(sg, raddr, __va(r2t));14201589 /* Free region 2 table */14211590 page = phys_to_page(r2t);14221422- list_del(&page->lru);14231591 __free_pages(page, CRST_ALLOC_ORDER);14241592}14251593···14501620 r1t[i] = _REGION1_ENTRY_EMPTY;14511621 /* Free region 2 table */14521622 page = phys_to_page(r2t);14531453- list_del(&page->lru);14541623 __free_pages(page, CRST_ALLOC_ORDER);14551624 }14561625}···14601631 *14611632 * Called with sg->guest_table_lock14621633 */14631463-static void gmap_unshadow(struct gmap *sg)16341634+void gmap_unshadow(struct gmap *sg)14641635{14651636 unsigned long *table;14661637···14861657 break;14871658 }14881659}14891489-14901490-/**14911491- * gmap_find_shadow - find a specific asce in the list of shadow tables14921492- * @parent: pointer to the parent gmap14931493- * @asce: ASCE for which the shadow table is created14941494- * @edat_level: edat level to be used for the shadow translation14951495- *14961496- * Returns the pointer to a gmap if a shadow table with the given asce is14971497- * already available, ERR_PTR(-EAGAIN) if another one is just being created,14981498- * otherwise NULL14991499- */15001500-static struct gmap *gmap_find_shadow(struct gmap *parent, unsigned long asce,15011501- int edat_level)15021502-{15031503- struct gmap *sg;15041504-15051505- list_for_each_entry(sg, &parent->children, list) {15061506- if (sg->orig_asce != asce || sg->edat_level != edat_level ||15071507- sg->removed)15081508- continue;15091509- if (!sg->initialized)15101510- return ERR_PTR(-EAGAIN);15111511- refcount_inc(&sg->ref_count);15121512- return sg;15131513- }15141514- return NULL;15151515-}15161516-15171517-/**15181518- * gmap_shadow_valid - check if a shadow guest address space matches the15191519- * given properties and is still valid15201520- * @sg: pointer to the 
shadow guest address space structure15211521- * @asce: ASCE for which the shadow table is requested15221522- * @edat_level: edat level to be used for the shadow translation15231523- *15241524- * Returns 1 if the gmap shadow is still valid and matches the given15251525- * properties, the caller can continue using it. Returns 0 otherwise, the15261526- * caller has to request a new shadow gmap in this case.15271527- *15281528- */15291529-int gmap_shadow_valid(struct gmap *sg, unsigned long asce, int edat_level)15301530-{15311531- if (sg->removed)15321532- return 0;15331533- return sg->orig_asce == asce && sg->edat_level == edat_level;15341534-}15351535-EXPORT_SYMBOL_GPL(gmap_shadow_valid);15361536-15371537-/**15381538- * gmap_shadow - create/find a shadow guest address space15391539- * @parent: pointer to the parent gmap15401540- * @asce: ASCE for which the shadow table is created15411541- * @edat_level: edat level to be used for the shadow translation15421542- *15431543- * The pages of the top level page table referred by the asce parameter15441544- * will be set to read-only and marked in the PGSTEs of the kvm process.15451545- * The shadow table will be removed automatically on any change to the15461546- * PTE mapping for the source table.15471547- *15481548- * Returns a guest address space structure, ERR_PTR(-ENOMEM) if out of memory,15491549- * ERR_PTR(-EAGAIN) if the caller has to retry and ERR_PTR(-EFAULT) if the15501550- * parent gmap table could not be protected.15511551- */15521552-struct gmap *gmap_shadow(struct gmap *parent, unsigned long asce,15531553- int edat_level)15541554-{15551555- struct gmap *sg, *new;15561556- unsigned long limit;15571557- int rc;15581558-15591559- BUG_ON(parent->mm->context.allow_gmap_hpage_1m);15601560- BUG_ON(gmap_is_shadow(parent));15611561- spin_lock(&parent->shadow_lock);15621562- sg = gmap_find_shadow(parent, asce, edat_level);15631563- spin_unlock(&parent->shadow_lock);15641564- if (sg)15651565- return sg;15661566- /* 
Create a new shadow gmap */15671567- limit = -1UL >> (33 - (((asce & _ASCE_TYPE_MASK) >> 2) * 11));15681568- if (asce & _ASCE_REAL_SPACE)15691569- limit = -1UL;15701570- new = gmap_alloc(limit);15711571- if (!new)15721572- return ERR_PTR(-ENOMEM);15731573- new->mm = parent->mm;15741574- new->parent = gmap_get(parent);15751575- new->private = parent->private;15761576- new->orig_asce = asce;15771577- new->edat_level = edat_level;15781578- new->initialized = false;15791579- spin_lock(&parent->shadow_lock);15801580- /* Recheck if another CPU created the same shadow */15811581- sg = gmap_find_shadow(parent, asce, edat_level);15821582- if (sg) {15831583- spin_unlock(&parent->shadow_lock);15841584- gmap_free(new);15851585- return sg;15861586- }15871587- if (asce & _ASCE_REAL_SPACE) {15881588- /* only allow one real-space gmap shadow */15891589- list_for_each_entry(sg, &parent->children, list) {15901590- if (sg->orig_asce & _ASCE_REAL_SPACE) {15911591- spin_lock(&sg->guest_table_lock);15921592- gmap_unshadow(sg);15931593- spin_unlock(&sg->guest_table_lock);15941594- list_del(&sg->list);15951595- gmap_put(sg);15961596- break;15971597- }15981598- }15991599- }16001600- refcount_set(&new->ref_count, 2);16011601- list_add(&new->list, &parent->children);16021602- if (asce & _ASCE_REAL_SPACE) {16031603- /* nothing to protect, return right away */16041604- new->initialized = true;16051605- spin_unlock(&parent->shadow_lock);16061606- return new;16071607- }16081608- spin_unlock(&parent->shadow_lock);16091609- /* protect after insertion, so it will get properly invalidated */16101610- mmap_read_lock(parent->mm);16111611- rc = gmap_protect_range(parent, asce & _ASCE_ORIGIN,16121612- ((asce & _ASCE_TABLE_LENGTH) + 1) * PAGE_SIZE,16131613- PROT_READ, GMAP_NOTIFY_SHADOW);16141614- mmap_read_unlock(parent->mm);16151615- spin_lock(&parent->shadow_lock);16161616- new->initialized = true;16171617- if (rc) {16181618- list_del(&new->list);16191619- gmap_free(new);16201620- new = 
ERR_PTR(rc);16211621- }16221622- spin_unlock(&parent->shadow_lock);16231623- return new;16241624-}16251625-EXPORT_SYMBOL_GPL(gmap_shadow);16601660+EXPORT_SYMBOL(gmap_unshadow);1626166116271662/**16281663 * gmap_shadow_r2t - create an empty shadow region 2 table···15201827 page = gmap_alloc_crst();15211828 if (!page)15221829 return -ENOMEM;15231523- page->index = r2t & _REGION_ENTRY_ORIGIN;15241524- if (fake)15251525- page->index |= GMAP_SHADOW_FAKE_TABLE;15261830 s_r2t = page_to_phys(page);15271831 /* Install shadow region second table */15281832 spin_lock(&sg->guest_table_lock);···15411851 _REGION_ENTRY_TYPE_R1 | _REGION_ENTRY_INVALID;15421852 if (sg->edat_level >= 1)15431853 *table |= (r2t & _REGION_ENTRY_PROTECT);15441544- list_add(&page->lru, &sg->crst_list);15451854 if (fake) {15461855 /* nothing to protect for fake tables */15471856 *table &= ~_REGION_ENTRY_INVALID;···16001911 page = gmap_alloc_crst();16011912 if (!page)16021913 return -ENOMEM;16031603- page->index = r3t & _REGION_ENTRY_ORIGIN;16041604- if (fake)16051605- page->index |= GMAP_SHADOW_FAKE_TABLE;16061914 s_r3t = page_to_phys(page);16071915 /* Install shadow region second table */16081916 spin_lock(&sg->guest_table_lock);···16211935 _REGION_ENTRY_TYPE_R2 | _REGION_ENTRY_INVALID;16221936 if (sg->edat_level >= 1)16231937 *table |= (r3t & _REGION_ENTRY_PROTECT);16241624- list_add(&page->lru, &sg->crst_list);16251938 if (fake) {16261939 /* nothing to protect for fake tables */16271940 *table &= ~_REGION_ENTRY_INVALID;···16801995 page = gmap_alloc_crst();16811996 if (!page)16821997 return -ENOMEM;16831683- page->index = sgt & _REGION_ENTRY_ORIGIN;16841684- if (fake)16851685- page->index |= GMAP_SHADOW_FAKE_TABLE;16861998 s_sgt = page_to_phys(page);16871999 /* Install shadow region second table */16882000 spin_lock(&sg->guest_table_lock);···17012019 _REGION_ENTRY_TYPE_R3 | _REGION_ENTRY_INVALID;17022020 if (sg->edat_level >= 1)17032021 *table |= sgt & _REGION_ENTRY_PROTECT;17041704- 
list_add(&page->lru, &sg->crst_list);17052022 if (fake) {17062023 /* nothing to protect for fake tables */17072024 *table &= ~_REGION_ENTRY_INVALID;···17332052}17342053EXPORT_SYMBOL_GPL(gmap_shadow_sgt);1735205417361736-/**17371737- * gmap_shadow_pgt_lookup - find a shadow page table17381738- * @sg: pointer to the shadow guest address space structure17391739- * @saddr: the address in the shadow aguest address space17401740- * @pgt: parent gmap address of the page table to get shadowed17411741- * @dat_protection: if the pgtable is marked as protected by dat17421742- * @fake: pgt references contiguous guest memory block, not a pgtable17431743- *17441744- * Returns 0 if the shadow page table was found and -EAGAIN if the page17451745- * table was not found.17461746- *17471747- * Called with sg->mm->mmap_lock in read.17481748- */17491749-int gmap_shadow_pgt_lookup(struct gmap *sg, unsigned long saddr,17501750- unsigned long *pgt, int *dat_protection,17511751- int *fake)20552055+static void gmap_pgste_set_pgt_addr(struct ptdesc *ptdesc, unsigned long pgt_addr)17522056{17531753- unsigned long *table;17541754- struct page *page;17551755- int rc;20572057+ unsigned long *pgstes = page_to_virt(ptdesc_page(ptdesc));1756205817571757- BUG_ON(!gmap_is_shadow(sg));17581758- spin_lock(&sg->guest_table_lock);17591759- table = gmap_table_walk(sg, saddr, 1); /* get segment pointer */17601760- if (table && !(*table & _SEGMENT_ENTRY_INVALID)) {17611761- /* Shadow page tables are full pages (pte+pgste) */17621762- page = pfn_to_page(*table >> PAGE_SHIFT);17631763- *pgt = page->index & ~GMAP_SHADOW_FAKE_TABLE;17641764- *dat_protection = !!(*table & _SEGMENT_ENTRY_PROTECT);17651765- *fake = !!(page->index & GMAP_SHADOW_FAKE_TABLE);17661766- rc = 0;17671767- } else {17681768- rc = -EAGAIN;17691769- }17701770- spin_unlock(&sg->guest_table_lock);17711771- return rc;20592059+ pgstes += _PAGE_ENTRIES;1772206020612061+ pgstes[0] &= ~PGSTE_ST2_MASK;20622062+ pgstes[1] &= ~PGSTE_ST2_MASK;20632063+ 
pgstes[2] &= ~PGSTE_ST2_MASK;20642064+ pgstes[3] &= ~PGSTE_ST2_MASK;20652065+20662066+ pgstes[0] |= (pgt_addr >> 16) & PGSTE_ST2_MASK;20672067+ pgstes[1] |= pgt_addr & PGSTE_ST2_MASK;20682068+ pgstes[2] |= (pgt_addr << 16) & PGSTE_ST2_MASK;20692069+ pgstes[3] |= (pgt_addr << 32) & PGSTE_ST2_MASK;17732070}17741774-EXPORT_SYMBOL_GPL(gmap_shadow_pgt_lookup);1775207117762072/**17772073 * gmap_shadow_pgt - instantiate a shadow page table···17772119 ptdesc = page_table_alloc_pgste(sg->mm);17782120 if (!ptdesc)17792121 return -ENOMEM;17801780- ptdesc->pt_index = pgt & _SEGMENT_ENTRY_ORIGIN;21222122+ origin = pgt & _SEGMENT_ENTRY_ORIGIN;17812123 if (fake)17821782- ptdesc->pt_index |= GMAP_SHADOW_FAKE_TABLE;21242124+ origin |= GMAP_SHADOW_FAKE_TABLE;21252125+ gmap_pgste_set_pgt_addr(ptdesc, origin);17832126 s_pgt = page_to_phys(ptdesc_page(ptdesc));17842127 /* Install shadow page table */17852128 spin_lock(&sg->guest_table_lock);···17992140 /* mark as invalid as long as the parent table is not protected */18002141 *table = (unsigned long) s_pgt | _SEGMENT_ENTRY |18012142 (pgt & _SEGMENT_ENTRY_PROTECT) | _SEGMENT_ENTRY_INVALID;18021802- list_add(&ptdesc->pt_list, &sg->pt_list);18032143 if (fake) {18042144 /* nothing to protect for fake tables */18052145 *table &= ~_SEGMENT_ENTRY_INVALID;···19762318 pte_t *pte, unsigned long bits)19772319{19782320 unsigned long offset, gaddr = 0;19791979- unsigned long *table;19802321 struct gmap *gmap, *sg, *next;1981232219822323 offset = ((unsigned long) pte) & (255 * sizeof(pte_t));···19832326 rcu_read_lock();19842327 list_for_each_entry_rcu(gmap, &mm->context.gmap_list, list) {19852328 spin_lock(&gmap->guest_table_lock);19861986- table = radix_tree_lookup(&gmap->host_to_guest,19871987- vmaddr >> PMD_SHIFT);19881988- if (table)19891989- gaddr = __gmap_segment_gaddr(table) + offset;23292329+ gaddr = host_to_guest_lookup(gmap, vmaddr) + offset;19902330 spin_unlock(&gmap->guest_table_lock);19911991- if (!table)23312331+ if 
(!IS_GADDR_VALID(gaddr))19922332 continue;1993233319942334 if (!list_empty(&gmap->children) && (bits & PGSTE_VSIE_BIT)) {···20452391 rcu_read_lock();20462392 list_for_each_entry_rcu(gmap, &mm->context.gmap_list, list) {20472393 spin_lock(&gmap->guest_table_lock);20482048- pmdp = (pmd_t *)radix_tree_delete(&gmap->host_to_guest,20492049- vmaddr >> PMD_SHIFT);23942394+ pmdp = host_to_guest_pmd_delete(gmap, vmaddr, &gaddr);20502395 if (pmdp) {20512051- gaddr = __gmap_segment_gaddr((unsigned long *)pmdp);20522396 pmdp_notify_gmap(gmap, pmdp, gaddr);20532397 WARN_ON(pmd_val(*pmdp) & ~(_SEGMENT_ENTRY_HARDWARE_BITS_LARGE |20542398 _SEGMENT_ENTRY_GMAP_UC |···20902438 */20912439void gmap_pmdp_idte_local(struct mm_struct *mm, unsigned long vmaddr)20922440{20932093- unsigned long *entry, gaddr;24412441+ unsigned long gaddr;20942442 struct gmap *gmap;20952443 pmd_t *pmdp;2096244420972445 rcu_read_lock();20982446 list_for_each_entry_rcu(gmap, &mm->context.gmap_list, list) {20992447 spin_lock(&gmap->guest_table_lock);21002100- entry = radix_tree_delete(&gmap->host_to_guest,21012101- vmaddr >> PMD_SHIFT);21022102- if (entry) {21032103- pmdp = (pmd_t *)entry;21042104- gaddr = __gmap_segment_gaddr(entry);24482448+ pmdp = host_to_guest_pmd_delete(gmap, vmaddr, &gaddr);24492449+ if (pmdp) {21052450 pmdp_notify_gmap(gmap, pmdp, gaddr);21062106- WARN_ON(*entry & ~(_SEGMENT_ENTRY_HARDWARE_BITS_LARGE |21072107- _SEGMENT_ENTRY_GMAP_UC |21082108- _SEGMENT_ENTRY));24512451+ WARN_ON(pmd_val(*pmdp) & ~(_SEGMENT_ENTRY_HARDWARE_BITS_LARGE |24522452+ _SEGMENT_ENTRY_GMAP_UC |24532453+ _SEGMENT_ENTRY));21092454 if (MACHINE_HAS_TLB_GUEST)21102455 __pmdp_idte(gaddr, pmdp, IDTE_GUEST_ASCE,21112456 gmap->asce, IDTE_LOCAL);21122457 else if (MACHINE_HAS_IDTE)21132458 __pmdp_idte(gaddr, pmdp, 0, 0, IDTE_LOCAL);21142114- *entry = _SEGMENT_ENTRY_EMPTY;24592459+ *pmdp = __pmd(_SEGMENT_ENTRY_EMPTY);21152460 }21162461 spin_unlock(&gmap->guest_table_lock);21172462 }···21232474 */21242475void 
gmap_pmdp_idte_global(struct mm_struct *mm, unsigned long vmaddr)21252476{21262126- unsigned long *entry, gaddr;24772477+ unsigned long gaddr;21272478 struct gmap *gmap;21282479 pmd_t *pmdp;2129248021302481 rcu_read_lock();21312482 list_for_each_entry_rcu(gmap, &mm->context.gmap_list, list) {21322483 spin_lock(&gmap->guest_table_lock);21332133- entry = radix_tree_delete(&gmap->host_to_guest,21342134- vmaddr >> PMD_SHIFT);21352135- if (entry) {21362136- pmdp = (pmd_t *)entry;21372137- gaddr = __gmap_segment_gaddr(entry);24842484+ pmdp = host_to_guest_pmd_delete(gmap, vmaddr, &gaddr);24852485+ if (pmdp) {21382486 pmdp_notify_gmap(gmap, pmdp, gaddr);21392139- WARN_ON(*entry & ~(_SEGMENT_ENTRY_HARDWARE_BITS_LARGE |21402140- _SEGMENT_ENTRY_GMAP_UC |21412141- _SEGMENT_ENTRY));24872487+ WARN_ON(pmd_val(*pmdp) & ~(_SEGMENT_ENTRY_HARDWARE_BITS_LARGE |24882488+ _SEGMENT_ENTRY_GMAP_UC |24892489+ _SEGMENT_ENTRY));21422490 if (MACHINE_HAS_TLB_GUEST)21432491 __pmdp_idte(gaddr, pmdp, IDTE_GUEST_ASCE,21442492 gmap->asce, IDTE_GLOBAL);···21432497 __pmdp_idte(gaddr, pmdp, 0, 0, IDTE_GLOBAL);21442498 else21452499 __pmdp_csp(pmdp);21462146- *entry = _SEGMENT_ENTRY_EMPTY;25002500+ *pmdp = __pmd(_SEGMENT_ENTRY_EMPTY);21472501 }21482502 spin_unlock(&gmap->guest_table_lock);21492503 }···25892943EXPORT_SYMBOL_GPL(__s390_uv_destroy_range);2590294425912945/**25922592- * s390_unlist_old_asce - Remove the topmost level of page tables from the25932593- * list of page tables of the gmap.25942594- * @gmap: the gmap whose table is to be removed25952595- *25962596- * On s390x, KVM keeps a list of all pages containing the page tables of the25972597- * gmap (the CRST list). 
This list is used at tear down time to free all25982598- * pages that are now not needed anymore.25992599- *26002600- * This function removes the topmost page of the tree (the one pointed to by26012601- * the ASCE) from the CRST list.26022602- *26032603- * This means that it will not be freed when the VM is torn down, and needs26042604- * to be handled separately by the caller, unless a leak is actually26052605- * intended. Notice that this function will only remove the page from the26062606- * list, the page will still be used as a top level page table (and ASCE).26072607- */26082608-void s390_unlist_old_asce(struct gmap *gmap)26092609-{26102610- struct page *old;26112611-26122612- old = virt_to_page(gmap->table);26132613- spin_lock(&gmap->guest_table_lock);26142614- list_del(&old->lru);26152615- /*26162616- * Sometimes the topmost page might need to be "removed" multiple26172617- * times, for example if the VM is rebooted into secure mode several26182618- * times concurrently, or if s390_replace_asce fails after calling26192619- * s390_remove_old_asce and is attempted again later. 
In that case26202620- * the old asce has been removed from the list, and therefore it26212621- * will not be freed when the VM terminates, but the ASCE is still26222622- * in use and still pointed to.26232623- * A subsequent call to replace_asce will follow the pointer and try26242624- * to remove the same page from the list again.26252625- * Therefore it's necessary that the page of the ASCE has valid26262626- * pointers, so list_del can work (and do nothing) without26272627- * dereferencing stale or invalid pointers.26282628- */26292629- INIT_LIST_HEAD(&old->lru);26302630- spin_unlock(&gmap->guest_table_lock);26312631-}26322632-EXPORT_SYMBOL_GPL(s390_unlist_old_asce);26332633-26342634-/**26352946 * s390_replace_asce - Try to replace the current ASCE of a gmap with a copy26362947 * @gmap: the gmap whose ASCE needs to be replaced26372948 *···26073004 struct page *page;26083005 void *table;2609300626102610- s390_unlist_old_asce(gmap);26112611-26123007 /* Replacing segment type ASCEs would cause serious issues */26133008 if ((gmap->asce & _ASCE_TYPE_MASK) == _ASCE_TYPE_SEGMENT)26143009 return -EINVAL;···26143013 page = gmap_alloc_crst();26153014 if (!page)26163015 return -ENOMEM;26172617- page->index = 0;26183016 table = page_to_virt(page);26193017 memcpy(table, gmap->table, 1UL << (CRST_ALLOC_ORDER + PAGE_SHIFT));26202620-26212621- /*26222622- * The caller has to deal with the old ASCE, but here we make sure26232623- * the new one is properly added to the CRST list, so that26242624- * it will be freed when the VM is torn down.26252625- */26262626- spin_lock(&gmap->guest_table_lock);26272627- list_add(&page->lru, &gmap->crst_list);26282628- spin_unlock(&gmap->guest_table_lock);2629301826303019 /* Set new table origin while preserving existing ASCE control bits */26313020 asce = (gmap->asce & ~_ASCE_ORIGIN) | __pa(table);···26263035 return 0;26273036}26283037EXPORT_SYMBOL_GPL(s390_replace_asce);30383038+30393039+/**30403040+ * kvm_s390_wiggle_split_folio() - try to 
drain extra references to a folio and optionally split30413041+ * @mm: the mm containing the folio to work on30423042+ * @folio: the folio30433043+ * @split: whether to split a large folio30443044+ *30453045+ * Context: Must be called while holding an extra reference to the folio;30463046+ * the mm lock should not be held.30473047+ */30483048+int kvm_s390_wiggle_split_folio(struct mm_struct *mm, struct folio *folio, bool split)30493049+{30503050+ int rc;30513051+30523052+ lockdep_assert_not_held(&mm->mmap_lock);30533053+ folio_wait_writeback(folio);30543054+ lru_add_drain_all();30553055+ if (split) {30563056+ folio_lock(folio);30573057+ rc = split_folio(folio);30583058+ folio_unlock(folio);30593059+30603060+ if (rc != -EBUSY)30613061+ return rc;30623062+ }30633063+ return -EAGAIN;30643064+}30653065+EXPORT_SYMBOL_GPL(kvm_s390_wiggle_split_folio);
-2
arch/s390/mm/pgalloc.c
···176176 }177177 table = ptdesc_to_virt(ptdesc);178178 __arch_set_page_dat(table, 1);179179- /* pt_list is used by gmap only */180180- INIT_LIST_HEAD(&ptdesc->pt_list);181179 memset64((u64 *)table, _PAGE_INVALID, PTRS_PER_PTE);182180 memset64((u64 *)table + PTRS_PER_PTE, 0, PTRS_PER_PTE);183181 return table;
+20
arch/s390/pci/pci_bus.c
···331331 return rc;332332}333333334334+static bool zpci_bus_is_isolated_vf(struct zpci_bus *zbus, struct zpci_dev *zdev)335335+{336336+ struct pci_dev *pdev;337337+338338+ pdev = zpci_iov_find_parent_pf(zbus, zdev);339339+ if (!pdev)340340+ return true;341341+ pci_dev_put(pdev);342342+ return false;343343+}344344+334345int zpci_bus_device_register(struct zpci_dev *zdev, struct pci_ops *ops)335346{336347 bool topo_is_tid = zdev->tid_avail;···356345357346 topo = topo_is_tid ? zdev->tid : zdev->pchid;358347 zbus = zpci_bus_get(topo, topo_is_tid);348348+ /*349349+ * An isolated VF gets its own domain/bus even if there exists350350+ * a matching domain/bus already351351+ */352352+ if (zbus && zpci_bus_is_isolated_vf(zbus, zdev)) {353353+ zpci_bus_put(zbus);354354+ zbus = NULL;355355+ }356356+359357 if (!zbus) {360358 zbus = zpci_bus_alloc(topo, topo_is_tid);361359 if (!zbus)
+42-14
arch/s390/pci/pci_iov.c
···6060 return 0;6161}62626363-int zpci_iov_setup_virtfn(struct zpci_bus *zbus, struct pci_dev *virtfn, int vfn)6363+/**6464+ * zpci_iov_find_parent_pf - Find the parent PF, if any, of the given function6565+ * @zbus: The bus that the PCI function is on, or would be added on6666+ * @zdev: The PCI function6767+ *6868+ * Finds the parent PF, if it exists and is configured, of the given PCI function6969+ * and increments its refcount. The PF is searched for on the provided bus so the7070+ * caller has to ensure that this is the correct bus to search. This function may7171+ * be used before adding the PCI function to a zbus.7272+ *7373+ * Return: Pointer to the struct pci_dev of the parent PF or NULL if it is not7474+ * found. If the function is not a VF or has no RequesterID information,7575+ * NULL is returned as well.7676+ */7777+struct pci_dev *zpci_iov_find_parent_pf(struct zpci_bus *zbus, struct zpci_dev *zdev)6478{6565- int i, cand_devfn;6666- struct zpci_dev *zdev;7979+ int i, vfid, devfn, cand_devfn;6780 struct pci_dev *pdev;6868- int vfid = vfn - 1; /* Linux' vfid's start at 0 vfn at 1*/6969- int rc = 0;70817182 if (!zbus->multifunction)7272- return 0;7373-7474- /* If the parent PF for the given VF is also configured in the8383+ return NULL;8484+ /* Non-VFs and VFs without RID available don't have a parent */8585+ if (!zdev->vfn || !zdev->rid_available)8686+ return NULL;8787+ /* Linux vfid starts at 0, vfn at 1 */8888+ vfid = zdev->vfn - 1;8989+ devfn = zdev->rid & ZPCI_RID_MASK_DEVFN;9090+ /*9191+ * If the parent PF for the given VF is also configured in the7592 * instance, it must be on the same zbus.7693 * We can then identify the parent PF by checking what7794 * devfn the VF would have if it belonged to that PF using the PF's
pci_dev_put(pdev);109109- break;110110- }8888+ if (cand_devfn == devfn)8989+ return pdev;11190 /* balance pci_get_slot() */11291 pci_dev_put(pdev);11392 }9393+ }9494+ return NULL;9595+}9696+9797+int zpci_iov_setup_virtfn(struct zpci_bus *zbus, struct pci_dev *virtfn, int vfn)9898+{9999+ struct zpci_dev *zdev = to_zpci(virtfn);100100+ struct pci_dev *pdev_pf;101101+ int rc = 0;102102+103103+ pdev_pf = zpci_iov_find_parent_pf(zbus, zdev);104104+ if (pdev_pf) {105105+ /* Linux' vfids start at 0 while zdev->vfn starts at 1 */106106+ rc = zpci_iov_link_virtfn(pdev_pf, virtfn, zdev->vfn - 1);107107+ pci_dev_put(pdev_pf);114108 }115109 return rc;116110}
···181181182182static int stub_exe_fd;183183184184+#ifndef CLOSE_RANGE_CLOEXEC185185+#define CLOSE_RANGE_CLOEXEC (1U << 2)186186+#endif187187+184188static int userspace_tramp(void *stack)185189{186190 char *const argv[] = { "uml-userspace", NULL };···206202 init_data.stub_data_fd = phys_mapping(uml_to_phys(stack), &offset);207203 init_data.stub_data_offset = MMAP_OFFSET(offset);208204209209- /* Set CLOEXEC on all FDs and then unset on all memory related FDs */210210- close_range(0, ~0U, CLOSE_RANGE_CLOEXEC);205205+ /*206206+ * Avoid leaking unneeded FDs to the stub by setting CLOEXEC on all FDs207207+ * and then unsetting it on all memory related FDs.208208+ * This is not strictly necessary from a safety perspective.209209+ */210210+ syscall(__NR_close_range, 0, ~0U, CLOSE_RANGE_CLOEXEC);211211212212 fcntl(init_data.stub_data_fd, F_SETFD, 0);213213 for (iomem = iomem_regions; iomem; iomem = iomem->next)···232224 if (ret != sizeof(init_data))233225 exit(4);234226235235- execveat(stub_exe_fd, "", argv, NULL, AT_EMPTY_PATH);227227+ /* Raw execveat for compatibility with older libc versions */228228+ syscall(__NR_execveat, stub_exe_fd, (unsigned long)"",229229+ (unsigned long)argv, NULL, AT_EMPTY_PATH);236230237231 exit(5);238232}
+2-1
arch/x86/Kconfig
···25992599 depends on CPU_SUP_AMD && X86_6426002600 default y26012601 help26022602- Compile the kernel with support for the retbleed=ibpb mitigation.26022602+ Compile the kernel with support for the retbleed=ibpb and26032603+ spec_rstack_overflow={ibpb,ibpb-vmexit} mitigations.2603260426042605config MITIGATION_IBRS_ENTRY26052606 bool "Enable IBRS on kernel entry"
+1
arch/x86/boot/compressed/Makefile
···2525# avoid errors with '-march=i386', and future flags may depend on the target to2626# be valid.2727KBUILD_CFLAGS := -m$(BITS) -O2 $(CLANG_FLAGS)2828+KBUILD_CFLAGS += -std=gnu112829KBUILD_CFLAGS += -fno-strict-aliasing -fPIE2930KBUILD_CFLAGS += -Wundef3031KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING
+14-19
arch/x86/events/intel/core.c
···4905490549064906static void update_pmu_cap(struct x86_hybrid_pmu *pmu)49074907{49084908- unsigned int sub_bitmaps, eax, ebx, ecx, edx;49084908+ unsigned int cntr, fixed_cntr, ecx, edx;49094909+ union cpuid35_eax eax;49104910+ union cpuid35_ebx ebx;4909491149104910- cpuid(ARCH_PERFMON_EXT_LEAF, &sub_bitmaps, &ebx, &ecx, &edx);49124912+ cpuid(ARCH_PERFMON_EXT_LEAF, &eax.full, &ebx.full, &ecx, &edx);4911491349124912- if (ebx & ARCH_PERFMON_EXT_UMASK2)49144914+ if (ebx.split.umask2)49134915 pmu->config_mask |= ARCH_PERFMON_EVENTSEL_UMASK2;49144914- if (ebx & ARCH_PERFMON_EXT_EQ)49164916+ if (ebx.split.eq)49154917 pmu->config_mask |= ARCH_PERFMON_EVENTSEL_EQ;4916491849174917- if (sub_bitmaps & ARCH_PERFMON_NUM_COUNTER_LEAF_BIT) {49194919+ if (eax.split.cntr_subleaf) {49184920 cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF,49194919- &eax, &ebx, &ecx, &edx);49204920- pmu->cntr_mask64 = eax;49214921- pmu->fixed_cntr_mask64 = ebx;49214921+ &cntr, &fixed_cntr, &ecx, &edx);49224922+ pmu->cntr_mask64 = cntr;49234923+ pmu->fixed_cntr_mask64 = fixed_cntr;49224924 }4923492549244926 if (!intel_pmu_broken_perf_cap()) {···49424940 pmu->intel_ctrl |= 1ULL << GLOBAL_CTRL_EN_PERF_METRICS;49434941 else49444942 pmu->intel_ctrl &= ~(1ULL << GLOBAL_CTRL_EN_PERF_METRICS);49454945-49464946- if (pmu->intel_cap.pebs_output_pt_available)49474947- pmu->pmu.capabilities |= PERF_PMU_CAP_AUX_OUTPUT;49484948- else49494949- pmu->pmu.capabilities &= ~PERF_PMU_CAP_AUX_OUTPUT;4950494349514944 intel_pmu_check_event_constraints(pmu->event_constraints,49524945 pmu->cntr_mask64,···5020502350215024 pr_info("%s PMU driver: ", pmu->name);5022502550235023- if (pmu->intel_cap.pebs_output_pt_available)50245024- pr_cont("PEBS-via-PT ");50255025-50265026 pr_cont("\n");5027502750285028 x86_pmu_show_pmu_cap(&pmu->pmu);···5042504850435049 init_debug_store_on_cpu(cpu);50445050 /*50455045- * Deal with CPUs that don't clear their LBRs on power-up.50515051+ * Deal with CPUs that don't clear their LBRs 
on power-up, and that may50525052+ * even boot with LBRs enabled.50465053 */50545054+ if (!static_cpu_has(X86_FEATURE_ARCH_LBR) && x86_pmu.lbr_nr)50555055+ msr_clear_bit(MSR_IA32_DEBUGCTLMSR, DEBUGCTLMSR_LBR_BIT);50475056 intel_pmu_lbr_reset();5048505750495058 cpuc->lbr_sel = NULL;···63676370 pmu->intel_cap.capabilities = x86_pmu.intel_cap.capabilities;63686371 if (pmu->pmu_type & hybrid_small_tiny) {63696372 pmu->intel_cap.perf_metrics = 0;63706370- pmu->intel_cap.pebs_output_pt_available = 1;63716373 pmu->mid_ack = true;63726374 } else if (pmu->pmu_type & hybrid_big) {63736375 pmu->intel_cap.perf_metrics = 1;63746374- pmu->intel_cap.pebs_output_pt_available = 0;63756376 pmu->late_ack = true;63766377 }63776378 }
+9-1
arch/x86/events/intel/ds.c
···25782578 }25792579 pr_cont("PEBS fmt4%c%s, ", pebs_type, pebs_qual);2580258025812581- if (!is_hybrid() && x86_pmu.intel_cap.pebs_output_pt_available) {25812581+ /*25822582+ * PEBS-via-PT is not supported on hybrid platforms,25832583+ * because not all CPUs of a hybrid machine support it.25842584+ * The global x86_pmu.intel_cap, which only contains the25852585+ * common capabilities, is used to check the availability25862586+ * of the feature. The per-PMU pebs_output_pt_available25872587+ * in a hybrid machine should be ignored.25882588+ */25892589+ if (x86_pmu.intel_cap.pebs_output_pt_available) {25822590 pr_cont("PEBS-via-PT, ");25832591 x86_get_pmu(smp_processor_id())->capabilities |= PERF_PMU_CAP_AUX_OUTPUT;25842592 }
+4-8
arch/x86/events/rapl.c
···370370 unsigned int rapl_pmu_idx;371371 struct rapl_pmus *rapl_pmus;372372373373+ /* only look at RAPL events */374374+ if (event->attr.type != event->pmu->type)375375+ return -ENOENT;376376+373377 /* unsupported modes and filters */374378 if (event->attr.sample_period) /* no sampling */375379 return -EINVAL;···391387 rapl_pmus_scope = rapl_pmus->pmu.scope;392388393389 if (rapl_pmus_scope == PERF_PMU_SCOPE_PKG || rapl_pmus_scope == PERF_PMU_SCOPE_DIE) {394394- /* only look at RAPL package events */395395- if (event->attr.type != rapl_pmus_pkg->pmu.type)396396- return -ENOENT;397397-398390 cfg = array_index_nospec((long)cfg, NR_RAPL_PKG_DOMAINS + 1);399391 if (!cfg || cfg >= NR_RAPL_PKG_DOMAINS + 1)400392 return -EINVAL;···398398 bit = cfg - 1;399399 event->hw.event_base = rapl_model->rapl_pkg_msrs[bit].msr;400400 } else if (rapl_pmus_scope == PERF_PMU_SCOPE_CORE) {401401- /* only look at RAPL core events */402402- if (event->attr.type != rapl_pmus_core->pmu.type)403403- return -ENOENT;404404-405401 cfg = array_index_nospec((long)cfg, NR_RAPL_CORE_DOMAINS + 1);406402 if (!cfg || cfg >= NR_RAPL_PKG_DOMAINS + 1)407403 return -EINVAL;
···188188 * detection/enumeration details:189189 */190190#define ARCH_PERFMON_EXT_LEAF 0x00000023191191-#define ARCH_PERFMON_EXT_UMASK2 0x1192192-#define ARCH_PERFMON_EXT_EQ 0x2193193-#define ARCH_PERFMON_NUM_COUNTER_LEAF_BIT 0x1194191#define ARCH_PERFMON_NUM_COUNTER_LEAF 0x1192192+193193+union cpuid35_eax {194194+ struct {195195+ unsigned int leaf0:1;196196+ /* Counters Sub-Leaf */197197+ unsigned int cntr_subleaf:1;198198+ /* Auto Counter Reload Sub-Leaf */199199+ unsigned int acr_subleaf:1;200200+ /* Events Sub-Leaf */201201+ unsigned int events_subleaf:1;202202+ unsigned int reserved:28;203203+ } split;204204+ unsigned int full;205205+};206206+207207+union cpuid35_ebx {208208+ struct {209209+ /* UnitMask2 Supported */210210+ unsigned int umask2:1;211211+ /* EQ-bit Supported */212212+ unsigned int eq:1;213213+ unsigned int reserved:30;214214+ } split;215215+ unsigned int full;216216+};195217196218/*197219 * Intel Architectural LBR CPUID detection/enumeration details:
+2
arch/x86/include/asm/sev.h
···531531532532#ifdef CONFIG_KVM_AMD_SEV533533bool snp_probe_rmptable_info(void);534534+int snp_rmptable_init(void);534535int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level);535536void snp_dump_hva_rmpentry(unsigned long address);536537int psmash(u64 pfn);···542541void snp_fixup_e820_tables(void);543542#else544543static inline bool snp_probe_rmptable_info(void) { return false; }544544+static inline int snp_rmptable_init(void) { return -ENOSYS; }545545static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENODEV; }546546static inline void snp_dump_hva_rmpentry(unsigned long address) {}547547static inline int psmash(u64 pfn) { return -ENODEV; }
+14-7
arch/x86/kernel/cpu/bugs.c
···1115111511161116 case RETBLEED_MITIGATION_IBPB:11171117 setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);11181118+ setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);11191119+ mitigate_smt = true;1118112011191121 /*11201122 * IBPB on entry already obviates the need for···11251123 */11261124 setup_clear_cpu_cap(X86_FEATURE_UNRET);11271125 setup_clear_cpu_cap(X86_FEATURE_RETHUNK);11281128-11291129- setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);11301130- mitigate_smt = true;1131112611321127 /*11331128 * There is no need for RSB filling: entry_ibpb() ensures···26452646 if (IS_ENABLED(CONFIG_MITIGATION_IBPB_ENTRY)) {26462647 if (has_microcode) {26472648 setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);26492649+ setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);26482650 srso_mitigation = SRSO_MITIGATION_IBPB;2649265126502652 /*···26552655 */26562656 setup_clear_cpu_cap(X86_FEATURE_UNRET);26572657 setup_clear_cpu_cap(X86_FEATURE_RETHUNK);26582658+26592659+ /*26602660+ * There is no need for RSB filling: entry_ibpb() ensures26612661+ * all predictions, including the RSB, are invalidated,26622662+ * regardless of IBPB implementation.26632663+ */26642664+ setup_clear_cpu_cap(X86_FEATURE_RSB_VMEXIT);26582665 }26592666 } else {26602667 pr_err("WARNING: kernel not compiled with MITIGATION_IBPB_ENTRY.\n");···2670266326712664ibpb_on_vmexit:26722665 case SRSO_CMD_IBPB_ON_VMEXIT:26732673- if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {26742674- if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {26662666+ if (IS_ENABLED(CONFIG_MITIGATION_IBPB_ENTRY)) {26672667+ if (has_microcode) {26752668 setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);26762669 srso_mitigation = SRSO_MITIGATION_IBPB_ON_VMEXIT;26772670···26832676 setup_clear_cpu_cap(X86_FEATURE_RSB_VMEXIT);26842677 }26852678 } else {26862686- pr_err("WARNING: kernel not compiled with MITIGATION_SRSO.\n");26872687- }26792679+ pr_err("WARNING: kernel not compiled with MITIGATION_IBPB_ENTRY.\n");26802680+ }26882681 break;26892682 
default:26902683 break;
···22262226 u32 vector;22272227 bool all_cpus;2228222822292229+ if (!lapic_in_kernel(vcpu))22302230+ return HV_STATUS_INVALID_HYPERCALL_INPUT;22312231+22292232 if (hc->code == HVCALL_SEND_IPI) {22302233 if (!hc->fast) {22312234 if (unlikely(kvm_read_guest(kvm, hc->ingpa, &send_ipi,···28552852 ent->eax |= HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED;28562853 ent->eax |= HV_X64_APIC_ACCESS_RECOMMENDED;28572854 ent->eax |= HV_X64_RELAXED_TIMING_RECOMMENDED;28582858- ent->eax |= HV_X64_CLUSTER_IPI_RECOMMENDED;28552855+ if (!vcpu || lapic_in_kernel(vcpu))28562856+ ent->eax |= HV_X64_CLUSTER_IPI_RECOMMENDED;28592857 ent->eax |= HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED;28602858 if (evmcs_ver)28612859 ent->eax |= HV_X64_ENLIGHTENED_VMCS_RECOMMENDED;
+27-8
arch/x86/kvm/mmu/mmu.c
···55405540 union kvm_mmu_page_role root_role;5541554155425542 /* NPT requires CR0.PG=1. */55435543- WARN_ON_ONCE(cpu_role.base.direct);55435543+ WARN_ON_ONCE(cpu_role.base.direct || !cpu_role.base.guest_mode);5544554455455545 root_role = cpu_role.base;55465546 root_role.level = kvm_mmu_get_tdp_level(vcpu);···71207120 kmem_cache_destroy(mmu_page_header_cache);71217121}7122712271237123+static void kvm_wake_nx_recovery_thread(struct kvm *kvm)71247124+{71257125+ /*71267126+ * The NX recovery thread is spawned on-demand at the first KVM_RUN and71277127+ * may not be valid even though the VM is globally visible. Do nothing,71287128+ * as such a VM can't have any possible NX huge pages.71297129+ */71307130+ struct vhost_task *nx_thread = READ_ONCE(kvm->arch.nx_huge_page_recovery_thread);71317131+71327132+ if (nx_thread)71337133+ vhost_task_wake(nx_thread);71347134+}71357135+71237136static int get_nx_huge_pages(char *buffer, const struct kernel_param *kp)71247137{71257138 if (nx_hugepage_mitigation_hard_disabled)···71937180 kvm_mmu_zap_all_fast(kvm);71947181 mutex_unlock(&kvm->slots_lock);7195718271967196- vhost_task_wake(kvm->arch.nx_huge_page_recovery_thread);71837183+ kvm_wake_nx_recovery_thread(kvm);71977184 }71987185 mutex_unlock(&kvm_lock);71997186 }···73287315 mutex_lock(&kvm_lock);7329731673307317 list_for_each_entry(kvm, &vm_list, vm_list)73317331- vhost_task_wake(kvm->arch.nx_huge_page_recovery_thread);73187318+ kvm_wake_nx_recovery_thread(kvm);7332731973337320 mutex_unlock(&kvm_lock);73347321 }···74647451{74657452 struct kvm_arch *ka = container_of(once, struct kvm_arch, nx_once);74667453 struct kvm *kvm = container_of(ka, struct kvm, arch);74547454+ struct vhost_task *nx_thread;7467745574687456 kvm->arch.nx_huge_page_last = get_jiffies_64();74697469- kvm->arch.nx_huge_page_recovery_thread = vhost_task_create(74707470- kvm_nx_huge_page_recovery_worker, kvm_nx_huge_page_recovery_worker_kill,74717471- kvm, "kvm-nx-lpage-recovery");74577457+ nx_thread = 
vhost_task_create(kvm_nx_huge_page_recovery_worker,74587458+ kvm_nx_huge_page_recovery_worker_kill,74597459+ kvm, "kvm-nx-lpage-recovery");7472746074737473- if (kvm->arch.nx_huge_page_recovery_thread)74747474- vhost_task_start(kvm->arch.nx_huge_page_recovery_thread);74617461+ if (!nx_thread)74627462+ return;74637463+74647464+ vhost_task_start(nx_thread);74657465+74667466+ /* Make the task visible only once it is fully started. */74677467+ WRITE_ONCE(kvm->arch.nx_huge_page_recovery_thread, nx_thread);74757468}7476746974777470int kvm_mmu_post_init_vm(struct kvm *kvm)
+5-5
arch/x86/kvm/svm/nested.c
···646646 u32 pause_count12;647647 u32 pause_thresh12;648648649649+ nested_svm_transition_tlb_flush(vcpu);650650+651651+ /* Enter Guest-Mode */652652+ enter_guest_mode(vcpu);653653+649654 /*650655 * Filled at exit: exit_code, exit_code_hi, exit_info_1, exit_info_2,651656 * exit_int_info, exit_int_info_err, next_rip, insn_len, insn_bytes.···766761 vmcb02->control.pause_filter_thresh = 0;767762 }768763 }769769-770770- nested_svm_transition_tlb_flush(vcpu);771771-772772- /* Enter Guest-Mode */773773- enter_guest_mode(vcpu);774764775765 /*776766 * Merge guest and host intercepts - must be called with vcpu in
+10
arch/x86/kvm/svm/sev.c
···29722972 WARN_ON_ONCE(!boot_cpu_has(X86_FEATURE_FLUSHBYASID)))29732973 goto out;2974297429752975+ /*29762976+ * The kernel's initcall infrastructure lacks the ability to express29772977+ * dependencies between initcalls, whereas the modules infrastructure29782978+ * automatically handles dependencies via symbol loading. Ensure the29792979+ * PSP SEV driver is initialized before proceeding if KVM is built-in,29802980+ * as the dependency isn't handled by the initcall infrastructure.29812981+ */29822982+ if (IS_BUILTIN(CONFIG_KVM_AMD) && sev_module_init())29832983+ goto out;29842984+29752985 /* Retrieve SEV CPUID information */29762986 cpuid(0x8000001f, &eax, &ebx, &ecx, &edx);29772987
+6-7
arch/x86/kvm/svm/svm.c
···19911991 svm->asid = sd->next_asid++;19921992}1993199319941994-static void svm_set_dr6(struct vcpu_svm *svm, unsigned long value)19941994+static void svm_set_dr6(struct kvm_vcpu *vcpu, unsigned long value)19951995{19961996- struct vmcb *vmcb = svm->vmcb;19961996+ struct vmcb *vmcb = to_svm(vcpu)->vmcb;1997199719981998- if (svm->vcpu.arch.guest_state_protected)19981998+ if (vcpu->arch.guest_state_protected)19991999 return;2000200020012001 if (unlikely(value != vmcb->save.dr6)) {···42474247 * Run with all-zero DR6 unless needed, so that we can get the exact cause42484248 * of a #DB.42494249 */42504250- if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT))42514251- svm_set_dr6(svm, vcpu->arch.dr6);42524252- else42534253- svm_set_dr6(svm, DR6_ACTIVE_LOW);42504250+ if (likely(!(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)))42514251+ svm_set_dr6(vcpu, DR6_ACTIVE_LOW);4254425242554253 clgi();42564254 kvm_load_guest_xsave_state(vcpu);···50415043 .set_idt = svm_set_idt,50425044 .get_gdt = svm_get_gdt,50435045 .set_gdt = svm_set_gdt,50465046+ .set_dr6 = svm_set_dr6,50445047 .set_dr7 = svm_set_dr7,50455048 .sync_dirty_debug_regs = svm_sync_dirty_debug_regs,50465049 .cache_reg = svm_cache_reg,
···56485648 set_debugreg(DR6_RESERVED, 6);56495649}5650565056515651+void vmx_set_dr6(struct kvm_vcpu *vcpu, unsigned long val)56525652+{56535653+ lockdep_assert_irqs_disabled();56545654+ set_debugreg(vcpu->arch.dr6, 6);56555655+}56565656+56515657void vmx_set_dr7(struct kvm_vcpu *vcpu, unsigned long val)56525658{56535659 vmcs_writel(GUEST_DR7, val);···74227416 vmcs_writel(HOST_CR4, cr4);74237417 vmx->loaded_vmcs->host_state.cr4 = cr4;74247418 }74257425-74267426- /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */74277427- if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT))74287428- set_debugreg(vcpu->arch.dr6, 6);7429741974307420 /* When single-stepping over STI and MOV SS, we must clear the74317421 * corresponding interruptibility bits in the guest state. Otherwise
···1096110961 set_debugreg(vcpu->arch.eff_db[1], 1);1096210962 set_debugreg(vcpu->arch.eff_db[2], 2);1096310963 set_debugreg(vcpu->arch.eff_db[3], 3);1096410964+ /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */1096510965+ if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT))1096610966+ kvm_x86_call(set_dr6)(vcpu, vcpu->arch.dr6);1096410967 } else if (unlikely(hw_breakpoint_active())) {1096510968 set_debugreg(0, 7);1096610969 }···1274412741 "does not run without ignore_msrs=1, please report it to kvm@vger.kernel.org.\n");1274512742 }12746127431274412744+ once_init(&kvm->arch.nx_once);1274712745 return 0;12748127461274912747out_uninit_mmu:···1275212748 kvm_page_track_cleanup(kvm);1275312749out:1275412750 return ret;1275512755-}1275612756-1275712757-int kvm_arch_post_init_vm(struct kvm *kvm)1275812758-{1275912759- once_init(&kvm->arch.nx_once);1276012760- return 0;1276112751}12762127521276312753static void kvm_unload_vcpu_mmu(struct kvm_vcpu *vcpu)
+18-3
arch/x86/um/os-Linux/registers.c
···1818#include <registers.h>1919#include <sys/mman.h>20202121+static unsigned long ptrace_regset;2122unsigned long host_fp_size;22232324int get_fp_registers(int pid, unsigned long *regs)···2827 .iov_len = host_fp_size,2928 };30293131- if (ptrace(PTRACE_GETREGSET, pid, NT_X86_XSTATE, &iov) < 0)3030+ if (ptrace(PTRACE_GETREGSET, pid, ptrace_regset, &iov) < 0)3231 return -errno;3332 return 0;3433}···4039 .iov_len = host_fp_size,4140 };42414343- if (ptrace(PTRACE_SETREGSET, pid, NT_X86_XSTATE, &iov) < 0)4242+ if (ptrace(PTRACE_SETREGSET, pid, ptrace_regset, &iov) < 0)4443 return -errno;4544 return 0;4645}···5958 return -ENOMEM;60596160 /* GDB has x86_xsave_length, which uses x86_cpuid_count */6262- ret = ptrace(PTRACE_GETREGSET, pid, NT_X86_XSTATE, &iov);6161+ ptrace_regset = NT_X86_XSTATE;6262+ ret = ptrace(PTRACE_GETREGSET, pid, ptrace_regset, &iov);6363 if (ret)6464 ret = -errno;6565+6666+ if (ret == -ENODEV) {6767+#ifdef CONFIG_X86_326868+ ptrace_regset = NT_PRXFPREG;6969+#else7070+ ptrace_regset = NT_PRFPREG;7171+#endif7272+ iov.iov_len = 2 * 1024 * 1024;7373+ ret = ptrace(PTRACE_GETREGSET, pid, ptrace_regset, &iov);7474+ if (ret)7575+ ret = -errno;7676+ }7777+6578 munmap(iov.iov_base, 2 * 1024 * 1024);66796780 host_fp_size = iov.iov_len;
+10-3
arch/x86/um/signal.c
···187187 * Put magic/size values for userspace. We do not bother to verify them188188 * later on, however, userspace needs them should it try to read the189189 * XSTATE data. And ptrace does not fill in these parts.190190+ *191191+ * Skip this if we do not have an XSTATE frame.190192 */193193+ if (host_fp_size <= sizeof(to_fp64->fpstate))194194+ return 0;195195+191196 BUILD_BUG_ON(sizeof(int) != FP_XSTATE_MAGIC2_SIZE);192197#ifdef CONFIG_X86_32193198 __put_user(offsetof(struct _fpstate_32, _fxsr_env) +···372367 int err = 0, sig = ksig->sig;373368 unsigned long fp_to;374369375375- frame = (struct rt_sigframe __user *)376376- round_down(stack_top - sizeof(struct rt_sigframe), 16);370370+ frame = (void __user *)stack_top - sizeof(struct rt_sigframe);377371378372 /* Add required space for math frame */379379- frame = (struct rt_sigframe __user *)((unsigned long)frame - math_size);373373+ frame = (void __user *)((unsigned long)frame - math_size);374374+375375+ /* ABI requires 16 byte boundary alignment */376376+ frame = (void __user *)round_down((unsigned long)frame, 16);380377381378 /* Subtract 128 for a red zone and 8 for proper alignment */382379 frame = (struct rt_sigframe __user *) ((unsigned long) frame - 128 - 8);
+7-16
arch/x86/virt/svm/sev.c
···505505 * described in the SNP_INIT_EX firmware command description in the SNP506506 * firmware ABI spec.507507 */508508-static int __init snp_rmptable_init(void)508508+int __init snp_rmptable_init(void)509509{510510 unsigned int i;511511 u64 val;512512513513- if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))514514- return 0;513513+ if (WARN_ON_ONCE(!cc_platform_has(CC_ATTR_HOST_SEV_SNP)))514514+ return -ENOSYS;515515516516- if (!amd_iommu_snp_en)517517- goto nosnp;516516+ if (WARN_ON_ONCE(!amd_iommu_snp_en))517517+ return -ENOSYS;518518519519 if (!setup_rmptable())520520- goto nosnp;520520+ return -ENOSYS;521521522522 /*523523 * Check if SEV-SNP is already enabled, this can happen in case of···530530 /* Zero out the RMP bookkeeping area */531531 if (!clear_rmptable_bookkeeping()) {532532 free_rmp_segment_table();533533- goto nosnp;533533+ return -ENOSYS;534534 }535535536536 /* Zero out the RMP entries */···562562 crash_kexec_post_notifiers = true;563563564564 return 0;565565-566566-nosnp:567567- cc_platform_clear(CC_ATTR_HOST_SEV_SNP);568568- return -ENOSYS;569565}570570-571571-/*572572- * This must be called after the IOMMU has been initialized.573573- */574574-device_initcall(snp_rmptable_init);575566576567static void set_rmp_segment_info(unsigned int segment_shift)577568{
+62-9
arch/x86/xen/mmu_pv.c
···111111 */112112static DEFINE_SPINLOCK(xen_reservation_lock);113113114114+/* Protected by xen_reservation_lock. */115115+#define MIN_CONTIG_ORDER 9 /* 2MB */116116+static unsigned int discontig_frames_order = MIN_CONTIG_ORDER;117117+static unsigned long discontig_frames_early[1UL << MIN_CONTIG_ORDER] __initdata;118118+static unsigned long *discontig_frames __refdata = discontig_frames_early;119119+static bool discontig_frames_dyn;120120+121121+static int alloc_discontig_frames(unsigned int order)122122+{123123+ unsigned long *new_array, *old_array;124124+ unsigned int old_order;125125+ unsigned long flags;126126+127127+ BUG_ON(order < MIN_CONTIG_ORDER);128128+ BUILD_BUG_ON(sizeof(discontig_frames_early) != PAGE_SIZE);129129+130130+ new_array = (unsigned long *)__get_free_pages(GFP_KERNEL,131131+ order - MIN_CONTIG_ORDER);132132+ if (!new_array)133133+ return -ENOMEM;134134+135135+ spin_lock_irqsave(&xen_reservation_lock, flags);136136+137137+ old_order = discontig_frames_order;138138+139139+ if (order > discontig_frames_order || !discontig_frames_dyn) {140140+ if (!discontig_frames_dyn)141141+ old_array = NULL;142142+ else143143+ old_array = discontig_frames;144144+145145+ discontig_frames = new_array;146146+ discontig_frames_order = order;147147+ discontig_frames_dyn = true;148148+ } else {149149+ old_array = new_array;150150+ }151151+152152+ spin_unlock_irqrestore(&xen_reservation_lock, flags);153153+154154+ free_pages((unsigned long)old_array, old_order - MIN_CONTIG_ORDER);155155+156156+ return 0;157157+}158158+114159/*115160 * Note about cr3 (pagetable base) values:116161 *···859814 SetPagePinned(virt_to_page(level3_user_vsyscall));860815#endif861816 xen_pgd_walk(&init_mm, xen_mark_pinned, FIXADDR_TOP);817817+818818+ if (alloc_discontig_frames(MIN_CONTIG_ORDER))819819+ BUG();862820}863821864822static void xen_unpin_page(struct mm_struct *mm, struct page *page,···22512203 memset(dummy_mapping, 0xff, PAGE_SIZE);22522204}2253220522542254-/* Protected by 
xen_reservation_lock. */22552255-#define MAX_CONTIG_ORDER 9 /* 2MB */22562256-static unsigned long discontig_frames[1<<MAX_CONTIG_ORDER];22572257-22582206#define VOID_PTE (mfn_pte(0, __pgprot(0)))22592207static void xen_zap_pfn_range(unsigned long vaddr, unsigned int order,22602208 unsigned long *in_frames,···23672323 unsigned int address_bits,23682324 dma_addr_t *dma_handle)23692325{23702370- unsigned long *in_frames = discontig_frames, out_frame;23262326+ unsigned long *in_frames, out_frame;23712327 unsigned long flags;23722328 int success;23732329 unsigned long vstart = (unsigned long)phys_to_virt(pstart);2374233023752375- if (unlikely(order > MAX_CONTIG_ORDER))23762376- return -ENOMEM;23312331+ if (unlikely(order > discontig_frames_order)) {23322332+ if (!discontig_frames_dyn)23332333+ return -ENOMEM;23342334+23352335+ if (alloc_discontig_frames(order))23362336+ return -ENOMEM;23372337+ }2377233823782339 memset((void *) vstart, 0, PAGE_SIZE << order);2379234023802341 spin_lock_irqsave(&xen_reservation_lock, flags);23422342+23432343+ in_frames = discontig_frames;2381234423822345 /* 1. Zap current PTEs, remembering MFNs. */23832346 xen_zap_pfn_range(vstart, order, in_frames, NULL);···2409235824102359void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)24112360{24122412- unsigned long *out_frames = discontig_frames, in_frame;23612361+ unsigned long *out_frames, in_frame;24132362 unsigned long flags;24142363 int success;24152364 unsigned long vstart;2416236524172417- if (unlikely(order > MAX_CONTIG_ORDER))23662366+ if (unlikely(order > discontig_frames_order))24182367 return;2419236824202369 vstart = (unsigned long)phys_to_virt(pstart);24212370 memset((void *) vstart, 0, PAGE_SIZE << order);2422237124232372 spin_lock_irqsave(&xen_reservation_lock, flags);23732373+23742374+ out_frames = discontig_frames;2424237524252376 /* 1. Find start MFN of contiguous extent. */24262377 in_frame = virt_to_mfn((void *)vstart);
+3-8
arch/x86/xen/xen-head.S
···100100 push %r10101101 push %r9102102 push %r8103103-#ifdef CONFIG_FRAME_POINTER104104- pushq $0 /* Dummy push for stack alignment. */105105-#endif106103#endif107104 /* Set the vendor specific function. */108105 call __xen_hypercall_setfunc···114117 pop %ebx115118 pop %eax116119#else117117- lea xen_hypercall_amd(%rip), %rbx118118- cmp %rax, %rbx119119-#ifdef CONFIG_FRAME_POINTER120120- pop %rax /* Dummy pop. */121121-#endif120120+ lea xen_hypercall_amd(%rip), %rcx121121+ cmp %rax, %rcx122122 pop %r8123123 pop %r9124124 pop %r10···126132 pop %rcx127133 pop %rax128134#endif135135+ FRAME_END129136 /* Use correct hypercall function. */130137 jz xen_hypercall_amd131138 jmp xen_hypercall_intel
+15-3
block/partitions/mac.c
···5353 }5454 secsize = be16_to_cpu(md->block_size);5555 put_dev_sector(sect);5656+5757+ /*5858+ * If the "block size" is not a power of 2, things get weird - we might5959+ * end up with a partition straddling a sector boundary, so we wouldn't6060+ * be able to read a partition entry with read_part_sector().6161+ * Real block sizes are probably (?) powers of two, so just require6262+ * that.6363+ */6464+ if (!is_power_of_2(secsize))6565+ return -1;5666 datasize = round_down(secsize, 512);5767 data = read_part_sector(state, datasize / 512, §);5868 if (!data)5969 return -1;6070 partoffset = secsize % 512;6161- if (partoffset + sizeof(*part) > datasize)7171+ if (partoffset + sizeof(*part) > datasize) {7272+ put_dev_sector(sect);6273 return -1;7474+ }6375 part = (struct mac_partition *) (data + partoffset);6476 if (be16_to_cpu(part->signature) != MAC_PARTITION_MAGIC) {6577 put_dev_sector(sect);···124112 int i, l;125113126114 goodness++;127127- l = strlen(part->name);128128- if (strcmp(part->name, "/") == 0)115115+ l = strnlen(part->name, sizeof(part->name));116116+ if (strncmp(part->name, "/", sizeof(part->name)) == 0)129117 goodness++;130118 for (i = 0; i <= l - 4; ++i) {131119 if (strncasecmp(part->name + i, "root",
+5
drivers/accel/amdxdna/amdxdna_pci_drv.c
···21212222#define AMDXDNA_AUTOSUSPEND_DELAY 5000 /* milliseconds */23232424+MODULE_FIRMWARE("amdnpu/1502_00/npu.sbin");2525+MODULE_FIRMWARE("amdnpu/17f0_10/npu.sbin");2626+MODULE_FIRMWARE("amdnpu/17f0_11/npu.sbin");2727+MODULE_FIRMWARE("amdnpu/17f0_20/npu.sbin");2828+2429/*2530 * Bind the driver base on (vendor_id, device_id) pair and later use the2631 * (device_id, rev_id) pair as a key to select the devices. The devices with
+6-2
drivers/accel/ivpu/ivpu_drv.c
···397397 if (ivpu_fw_is_cold_boot(vdev)) {398398 ret = ivpu_pm_dct_init(vdev);399399 if (ret)400400- goto err_diagnose_failure;400400+ goto err_disable_ipc;401401402402 ret = ivpu_hw_sched_init(vdev);403403 if (ret)404404- goto err_diagnose_failure;404404+ goto err_disable_ipc;405405 }406406407407 return 0;408408409409+err_disable_ipc:410410+ ivpu_ipc_disable(vdev);411411+ ivpu_hw_irq_disable(vdev);412412+ disable_irq(vdev->irq);409413err_diagnose_failure:410414 ivpu_hw_diagnose_failure(vdev);411415 ivpu_mmu_evtq_dump(vdev);
+232
drivers/base/faux.c
···11+// SPDX-License-Identifier: GPL-2.0-only22+/*33+ * Copyright (c) 2025 Greg Kroah-Hartman <gregkh@linuxfoundation.org>44+ * Copyright (c) 2025 The Linux Foundation55+ *66+ * A "simple" faux bus that allows devices to be created and added77+ * automatically to it. This is to be used whenever you need to create a88+ * device that is not associated with any "real" system resources, and do99+ * not want to have to deal with a bus/driver binding logic. It is1010+ * intended to be very simple, with only a create and a destroy function1111+ * available.1212+ */1313+#include <linux/err.h>1414+#include <linux/init.h>1515+#include <linux/slab.h>1616+#include <linux/string.h>1717+#include <linux/container_of.h>1818+#include <linux/device/faux.h>1919+#include "base.h"2020+2121+/*2222+ * Internal wrapper structure so we can hold a pointer to the2323+ * faux_device_ops for this device.2424+ */2525+struct faux_object {2626+ struct faux_device faux_dev;2727+ const struct faux_device_ops *faux_ops;2828+};2929+#define to_faux_object(dev) container_of_const(dev, struct faux_object, faux_dev.dev)3030+3131+static struct device faux_bus_root = {3232+ .init_name = "faux",3333+};3434+3535+static int faux_match(struct device *dev, const struct device_driver *drv)3636+{3737+ /* Match always succeeds, we only have one driver */3838+ return 1;3939+}4040+4141+static int faux_probe(struct device *dev)4242+{4343+ struct faux_object *faux_obj = to_faux_object(dev);4444+ struct faux_device *faux_dev = &faux_obj->faux_dev;4545+ const struct faux_device_ops *faux_ops = faux_obj->faux_ops;4646+ int ret = 0;4747+4848+ if (faux_ops && faux_ops->probe)4949+ ret = faux_ops->probe(faux_dev);5050+5151+ return ret;5252+}5353+5454+static void faux_remove(struct device *dev)5555+{5656+ struct faux_object *faux_obj = to_faux_object(dev);5757+ struct faux_device *faux_dev = &faux_obj->faux_dev;5858+ const struct faux_device_ops *faux_ops = faux_obj->faux_ops;5959+6060+ if (faux_ops && 
faux_ops->remove)6161+ faux_ops->remove(faux_dev);6262+}6363+6464+static const struct bus_type faux_bus_type = {6565+ .name = "faux",6666+ .match = faux_match,6767+ .probe = faux_probe,6868+ .remove = faux_remove,6969+};7070+7171+static struct device_driver faux_driver = {7272+ .name = "faux_driver",7373+ .bus = &faux_bus_type,7474+ .probe_type = PROBE_FORCE_SYNCHRONOUS,7575+};7676+7777+static void faux_device_release(struct device *dev)7878+{7979+ struct faux_object *faux_obj = to_faux_object(dev);8080+8181+ kfree(faux_obj);8282+}8383+8484+/**8585+ * faux_device_create_with_groups - Create and register with the driver8686+ * core a faux device and populate the device with an initial8787+ * set of sysfs attributes.8888+ * @name: The name of the device we are adding, must be unique for8989+ * all faux devices.9090+ * @parent: Pointer to a potential parent struct device. If set to9191+ * NULL, the device will be created in the "root" of the faux9292+ * device tree in sysfs.9393+ * @faux_ops: struct faux_device_ops that the new device will call back9494+ * into, can be NULL.9595+ * @groups: The set of sysfs attributes that will be created for this9696+ * device when it is registered with the driver core.9797+ *9898+ * Create a new faux device and register it in the driver core properly.9999+ * If present, callbacks in @faux_ops will be called with the device that100100+ * for the caller to do something with at the proper time given the101101+ * device's lifecycle.102102+ *103103+ * Note, when this function is called, the functions specified in struct104104+ * faux_ops can be called before the function returns, so be prepared for105105+ * everything to be properly initialized before that point in time.106106+ *107107+ * Return:108108+ * * NULL if an error happened with creating the device109109+ * * pointer to a valid struct faux_device that is registered with sysfs110110+ */111111+struct faux_device *faux_device_create_with_groups(const char *name,112112+ struct 
device *parent,113113+ const struct faux_device_ops *faux_ops,114114+ const struct attribute_group **groups)115115+{116116+ struct faux_object *faux_obj;117117+ struct faux_device *faux_dev;118118+ struct device *dev;119119+ int ret;120120+121121+ faux_obj = kzalloc(sizeof(*faux_obj), GFP_KERNEL);122122+ if (!faux_obj)123123+ return NULL;124124+125125+ /* Save off the callbacks so we can use them in the future */126126+ faux_obj->faux_ops = faux_ops;127127+128128+ /* Initialize the device portion and register it with the driver core */129129+ faux_dev = &faux_obj->faux_dev;130130+ dev = &faux_dev->dev;131131+132132+ device_initialize(dev);133133+ dev->release = faux_device_release;134134+ if (parent)135135+ dev->parent = parent;136136+ else137137+ dev->parent = &faux_bus_root;138138+ dev->bus = &faux_bus_type;139139+ dev->groups = groups;140140+ dev_set_name(dev, "%s", name);141141+142142+ ret = device_add(dev);143143+ if (ret) {144144+ pr_err("%s: device_add for faux device '%s' failed with %d\n",145145+ __func__, name, ret);146146+ put_device(dev);147147+ return NULL;148148+ }149149+150150+ return faux_dev;151151+}152152+EXPORT_SYMBOL_GPL(faux_device_create_with_groups);153153+154154+/**155155+ * faux_device_create - create and register with the driver core a faux device156156+ * @name: The name of the device we are adding, must be unique for all157157+ * faux devices.158158+ * @parent: Pointer to a potential parent struct device. 
If set to159159+ * NULL, the device will be created in the "root" of the faux160160+ * device tree in sysfs.161161+ * @faux_ops: struct faux_device_ops that the new device will call back162162+ * into, can be NULL.163163+ *164164+ * Create a new faux device and register it in the driver core properly.165165+ * If present, callbacks in @faux_ops will be called with the device that166166+ * for the caller to do something with at the proper time given the167167+ * device's lifecycle.168168+ *169169+ * Note, when this function is called, the functions specified in struct170170+ * faux_ops can be called before the function returns, so be prepared for171171+ * everything to be properly initialized before that point in time.172172+ *173173+ * Return:174174+ * * NULL if an error happened with creating the device175175+ * * pointer to a valid struct faux_device that is registered with sysfs176176+ */177177+struct faux_device *faux_device_create(const char *name,178178+ struct device *parent,179179+ const struct faux_device_ops *faux_ops)180180+{181181+ return faux_device_create_with_groups(name, parent, faux_ops, NULL);182182+}183183+EXPORT_SYMBOL_GPL(faux_device_create);184184+185185+/**186186+ * faux_device_destroy - destroy a faux device187187+ * @faux_dev: faux device to destroy188188+ *189189+ * Unregisters and cleans up a device that was created with a call to190190+ * faux_device_create()191191+ */192192+void faux_device_destroy(struct faux_device *faux_dev)193193+{194194+ struct device *dev = &faux_dev->dev;195195+196196+ if (!faux_dev)197197+ return;198198+199199+ device_del(dev);200200+201201+ /* The final put_device() will clean up the memory we allocated for this device. 
*/202202+ put_device(dev);203203+}204204+EXPORT_SYMBOL_GPL(faux_device_destroy);205205+206206+int __init faux_bus_init(void)207207+{208208+ int ret;209209+210210+ ret = device_register(&faux_bus_root);211211+ if (ret) {212212+ put_device(&faux_bus_root);213213+ return ret;214214+ }215215+216216+ ret = bus_register(&faux_bus_type);217217+ if (ret)218218+ goto error_bus;219219+220220+ ret = driver_register(&faux_driver);221221+ if (ret)222222+ goto error_driver;223223+224224+ return ret;225225+226226+error_driver:227227+ bus_unregister(&faux_bus_type);228228+229229+error_bus:230230+ device_unregister(&faux_bus_root);231231+ return ret;232232+}
+1
drivers/base/init.c
···3232 /* These are also core pieces, but must come after the3333 * core core pieces.3434 */3535+ faux_bus_init();3536 of_core_init();3637 platform_bus_init();3738 auxiliary_bus_init();
+2-1
drivers/cpufreq/Kconfig.arm
···17171818config ARM_AIROHA_SOC_CPUFREQ1919 tristate "Airoha EN7581 SoC CPUFreq support"2020- depends on (ARCH_AIROHA && OF) || COMPILE_TEST2020+ depends on ARCH_AIROHA || COMPILE_TEST2121+ depends on OF2122 select PM_OPP2223 default ARCH_AIROHA2324 help
+10-10
drivers/cpufreq/amd-pstate.c
···699699 if (min_perf < lowest_nonlinear_perf)700700 min_perf = lowest_nonlinear_perf;701701702702- max_perf = cap_perf;702702+ max_perf = cpudata->max_limit_perf;703703 if (max_perf < min_perf)704704 max_perf = min_perf;705705···747747 guard(mutex)(&amd_pstate_driver_lock);748748749749 ret = amd_pstate_cpu_boost_update(policy, state);750750- policy->boost_enabled = !ret ? state : false;751750 refresh_frequency_limits(policy);752751753752 return ret;···821822822823static void amd_pstate_update_limits(unsigned int cpu)823824{824824- struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);825825+ struct cpufreq_policy *policy = NULL;825826 struct amd_cpudata *cpudata;826827 u32 prev_high = 0, cur_high = 0;827828 int ret;828829 bool highest_perf_changed = false;829830831831+ if (!amd_pstate_prefcore)832832+ return;833833+834834+ policy = cpufreq_cpu_get(cpu);830835 if (!policy)831836 return;832837833838 cpudata = policy->driver_data;834839835835- if (!amd_pstate_prefcore)836836- return;837837-838840 guard(mutex)(&amd_pstate_driver_lock);839841840842 ret = amd_get_highest_perf(cpu, &cur_high);841841- if (ret)842842- goto free_cpufreq_put;843843+ if (ret) {844844+ cpufreq_cpu_put(policy);845845+ return;846846+ }843847844848 prev_high = READ_ONCE(cpudata->prefcore_ranking);845849 highest_perf_changed = (prev_high != cur_high);···852850 if (cur_high < CPPC_MAX_PERF)853851 sched_set_itmt_core_prio((int)cur_high, cpu);854852 }855855-856856-free_cpufreq_put:857853 cpufreq_cpu_put(policy);858854859855 if (!highest_perf_changed)
+2-1
drivers/cpufreq/cpufreq.c
···15711571 policy->cdev = of_cpufreq_cooling_register(policy);1572157215731573 /* Let the per-policy boost flag mirror the cpufreq_driver boost during init */15741574- if (policy->boost_enabled != cpufreq_boost_enabled()) {15741574+ if (cpufreq_driver->set_boost &&15751575+ policy->boost_enabled != cpufreq_boost_enabled()) {15751576 policy->boost_enabled = cpufreq_boost_enabled();15761577 ret = cpufreq_driver->set_boost(policy, policy->boost_enabled);15771578 if (ret) {
+14
drivers/crypto/ccp/sp-dev.c
···1919#include <linux/types.h>2020#include <linux/ccp.h>21212222+#include "sev-dev.h"2223#include "ccp-dev.h"2324#include "sp-dev.h"2425···254253static int __init sp_mod_init(void)255254{256255#ifdef CONFIG_X86256256+ static bool initialized;257257 int ret;258258+259259+ if (initialized)260260+ return 0;258261259262 ret = sp_pci_init();260263 if (ret)···267262#ifdef CONFIG_CRYPTO_DEV_SP_PSP268263 psp_pci_init();269264#endif265265+266266+ initialized = true;270267271268 return 0;272269#endif···285278286279 return -ENODEV;287280}281281+282282+#if IS_BUILTIN(CONFIG_KVM_AMD) && IS_ENABLED(CONFIG_KVM_AMD_SEV)283283+int __init sev_module_init(void)284284+{285285+ return sp_mod_init();286286+}287287+#endif288288289289static void __exit sp_mod_exit(void)290290{
+14-3
drivers/dma/tegra210-adma.c
···887887 const struct tegra_adma_chip_data *cdata;888888 struct tegra_adma *tdma;889889 struct resource *res_page, *res_base;890890- int ret, i, page_no;890890+ int ret, i;891891892892 cdata = of_device_get_match_data(&pdev->dev);893893 if (!cdata) {···914914915915 res_base = platform_get_resource_byname(pdev, IORESOURCE_MEM, "global");916916 if (res_base) {917917- page_no = (res_page->start - res_base->start) / cdata->ch_base_offset;918918- if (page_no <= 0)917917+ resource_size_t page_offset, page_no;918918+ unsigned int ch_base_offset;919919+920920+ if (res_page->start < res_base->start)919921 return -EINVAL;922922+ page_offset = res_page->start - res_base->start;923923+ ch_base_offset = cdata->ch_base_offset;924924+ if (!ch_base_offset)925925+ return -EINVAL;926926+927927+ page_no = div_u64(page_offset, ch_base_offset);928928+ if (!page_no || page_no > INT_MAX)929929+ return -EINVAL;930930+920931 tdma->ch_page_no = page_no - 1;921932 tdma->base_addr = devm_ioremap_resource(&pdev->dev, res_base);922933 if (IS_ERR(tdma->base_addr))
+1-1
drivers/firmware/Kconfig
···106106 select ISCSI_BOOT_SYSFS107107 select ISCSI_IBFT_FIND if X86108108 depends on ACPI && SCSI && SCSI_LOWLEVEL109109- default n109109+ default n110110 help111111 This option enables support for detection and exposing of iSCSI112112 Boot Firmware Table (iBFT) via sysfs to userspace. If you wish to
+3
drivers/firmware/efi/libstub/randomalloc.c
···2525 if (md->type != EFI_CONVENTIONAL_MEMORY)2626 return 0;27272828+ if (md->attribute & EFI_MEMORY_HOT_PLUGGABLE)2929+ return 0;3030+2831 if (efi_soft_reserve_enabled() &&2932 (md->attribute & EFI_MEMORY_SP))3033 return 0;
+3
drivers/firmware/efi/libstub/relocate.c
···5353 if (desc->type != EFI_CONVENTIONAL_MEMORY)5454 continue;55555656+ if (desc->attribute & EFI_MEMORY_HOT_PLUGGABLE)5757+ continue;5858+5659 if (efi_soft_reserve_enabled() &&5760 (desc->attribute & EFI_MEMORY_SP))5861 continue;
+4-1
drivers/firmware/iscsi_ibft.c
···310310 str += sprintf_ipaddr(str, nic->ip_addr);311311 break;312312 case ISCSI_BOOT_ETH_SUBNET_MASK:313313- val = cpu_to_be32(~((1 << (32-nic->subnet_mask_prefix))-1));313313+ if (nic->subnet_mask_prefix > 32)314314+ val = cpu_to_be32(~0);315315+ else316316+ val = cpu_to_be32(~((1 << (32-nic->subnet_mask_prefix))-1));314317 str += sprintf(str, "%pI4", &val);315318 break;316319 case ISCSI_BOOT_ETH_PREFIX_LEN:
+1
drivers/gpio/Kconfig
···338338339339config GPIO_GRGPIO340340 tristate "Aeroflex Gaisler GRGPIO support"341341+ depends on OF || COMPILE_TEST341342 select GPIO_GENERIC342343 select IRQ_DOMAIN343344 help
+58-13
drivers/gpio/gpio-bcm-kona.c
···6969struct bcm_kona_gpio_bank {7070 int id;7171 int irq;7272+ /*7373+ * Used to keep track of lock/unlock operations for each GPIO in the7474+ * bank.7575+ *7676+ * All GPIOs are locked by default (see bcm_kona_gpio_reset), and the7777+ * unlock count for all GPIOs is 0 by default. Each unlock increments7878+ * the counter, and each lock decrements the counter.7979+ *8080+ * The lock function only locks the GPIO once its unlock counter is8181+ * down to 0. This is necessary because the GPIO is unlocked in two8282+ * places in this driver: once for requested GPIOs, and once for8383+ * requested IRQs. Since it is possible for a GPIO to be requested8484+ * as both a GPIO and an IRQ, we need to ensure that we don't lock it8585+ * too early.8686+ */8787+ u8 gpio_unlock_count[GPIO_PER_BANK];7288 /* Used in the interrupt handler */7389 struct bcm_kona_gpio *kona_gpio;7490};···10286 u32 val;10387 unsigned long flags;10488 int bank_id = GPIO_BANK(gpio);8989+ int bit = GPIO_BIT(gpio);9090+ struct bcm_kona_gpio_bank *bank = &kona_gpio->banks[bank_id];10591106106- raw_spin_lock_irqsave(&kona_gpio->lock, flags);9292+ if (bank->gpio_unlock_count[bit] == 0) {9393+ dev_err(kona_gpio->gpio_chip.parent,9494+ "Unbalanced locks for GPIO %u\n", gpio);9595+ return;9696+ }10797108108- val = readl(kona_gpio->reg_base + GPIO_PWD_STATUS(bank_id));109109- val |= BIT(gpio);110110- bcm_kona_gpio_write_lock_regs(kona_gpio->reg_base, bank_id, val);9898+ if (--bank->gpio_unlock_count[bit] == 0) {9999+ raw_spin_lock_irqsave(&kona_gpio->lock, flags);111100112112- raw_spin_unlock_irqrestore(&kona_gpio->lock, flags);101101+ val = readl(kona_gpio->reg_base + GPIO_PWD_STATUS(bank_id));102102+ val |= BIT(bit);103103+ bcm_kona_gpio_write_lock_regs(kona_gpio->reg_base, bank_id, val);104104+105105+ raw_spin_unlock_irqrestore(&kona_gpio->lock, flags);106106+ }113107}114108115109static void bcm_kona_gpio_unlock_gpio(struct bcm_kona_gpio *kona_gpio,···128102 u32 val;129103 unsigned long flags;130104 int 
bank_id = GPIO_BANK(gpio);105105+ int bit = GPIO_BIT(gpio);106106+ struct bcm_kona_gpio_bank *bank = &kona_gpio->banks[bank_id];131107132132- raw_spin_lock_irqsave(&kona_gpio->lock, flags);108108+ if (bank->gpio_unlock_count[bit] == 0) {109109+ raw_spin_lock_irqsave(&kona_gpio->lock, flags);133110134134- val = readl(kona_gpio->reg_base + GPIO_PWD_STATUS(bank_id));135135- val &= ~BIT(gpio);136136- bcm_kona_gpio_write_lock_regs(kona_gpio->reg_base, bank_id, val);111111+ val = readl(kona_gpio->reg_base + GPIO_PWD_STATUS(bank_id));112112+ val &= ~BIT(bit);113113+ bcm_kona_gpio_write_lock_regs(kona_gpio->reg_base, bank_id, val);137114138138- raw_spin_unlock_irqrestore(&kona_gpio->lock, flags);115115+ raw_spin_unlock_irqrestore(&kona_gpio->lock, flags);116116+ }117117+118118+ ++bank->gpio_unlock_count[bit];139119}140120141121static int bcm_kona_gpio_get_dir(struct gpio_chip *chip, unsigned gpio)···392360393361 kona_gpio = irq_data_get_irq_chip_data(d);394362 reg_base = kona_gpio->reg_base;363363+395364 raw_spin_lock_irqsave(&kona_gpio->lock, flags);396365397366 val = readl(reg_base + GPIO_INT_MASK(bank_id));···415382416383 kona_gpio = irq_data_get_irq_chip_data(d);417384 reg_base = kona_gpio->reg_base;385385+418386 raw_spin_lock_irqsave(&kona_gpio->lock, flags);419387420388 val = readl(reg_base + GPIO_INT_MSKCLR(bank_id));···511477static int bcm_kona_gpio_irq_reqres(struct irq_data *d)512478{513479 struct bcm_kona_gpio *kona_gpio = irq_data_get_irq_chip_data(d);480480+ unsigned int gpio = d->hwirq;514481515515- return gpiochip_reqres_irq(&kona_gpio->gpio_chip, d->hwirq);482482+ /*483483+ * We need to unlock the GPIO before any other operations are performed484484+ * on the relevant GPIO configuration registers485485+ */486486+ bcm_kona_gpio_unlock_gpio(kona_gpio, gpio);487487+488488+ return gpiochip_reqres_irq(&kona_gpio->gpio_chip, gpio);516489}517490518491static void bcm_kona_gpio_irq_relres(struct irq_data *d)519492{520493 struct bcm_kona_gpio *kona_gpio = 
irq_data_get_irq_chip_data(d);494494+ unsigned int gpio = d->hwirq;521495522522- gpiochip_relres_irq(&kona_gpio->gpio_chip, d->hwirq);496496+ /* Once we no longer use it, lock the GPIO again */497497+ bcm_kona_gpio_lock_gpio(kona_gpio, gpio);498498+499499+ gpiochip_relres_irq(&kona_gpio->gpio_chip, gpio);523500}524501525502static struct irq_chip bcm_gpio_irq_chip = {···659614 bank->irq = platform_get_irq(pdev, i);660615 bank->kona_gpio = kona_gpio;661616 if (bank->irq < 0) {662662- dev_err(dev, "Couldn't get IRQ for bank %d", i);617617+ dev_err(dev, "Couldn't get IRQ for bank %d\n", i);663618 ret = -ENOENT;664619 goto err_irq_domain;665620 }
-19
drivers/gpio/gpio-pca953x.c
···841841 DECLARE_BITMAP(trigger, MAX_LINE);842842 int ret;843843844844- if (chip->driver_data & PCA_PCAL) {845845- /* Read the current interrupt status from the device */846846- ret = pca953x_read_regs(chip, PCAL953X_INT_STAT, trigger);847847- if (ret)848848- return false;849849-850850- /* Check latched inputs and clear interrupt status */851851- ret = pca953x_read_regs(chip, chip->regs->input, cur_stat);852852- if (ret)853853- return false;854854-855855- /* Apply filter for rising/falling edge selection */856856- bitmap_replace(new_stat, chip->irq_trig_fall, chip->irq_trig_raise, cur_stat, gc->ngpio);857857-858858- bitmap_and(pending, new_stat, trigger, gc->ngpio);859859-860860- return !bitmap_empty(pending, gc->ngpio);861861- }862862-863844 ret = pca953x_read_regs(chip, chip->regs->input, cur_stat);864845 if (ret)865846 return false;
+8-5
drivers/gpio/gpio-sim.c
···10281028 struct configfs_subsystem *subsys = dev->group.cg_subsys;10291029 struct gpio_sim_bank *bank;10301030 struct gpio_sim_line *line;10311031+ struct config_item *item;1031103210321033 /*10331033- * The device only needs to depend on leaf line entries. This is10341034+ * The device only needs to depend on leaf entries. This is10341035 * sufficient to lock up all the configfs entries that the10351036 * instantiated, alive device depends on.10361037 */10371038 list_for_each_entry(bank, &dev->bank_list, siblings) {10381039 list_for_each_entry(line, &bank->line_list, siblings) {10401040+ item = line->hog ? &line->hog->item10411041+ : &line->group.cg_item;10421042+10391043 if (lock)10401040- WARN_ON(configfs_depend_item_unlocked(10411041- subsys, &line->group.cg_item));10441044+ WARN_ON(configfs_depend_item_unlocked(subsys,10451045+ item));10421046 else10431043- configfs_undepend_item_unlocked(10441044- &line->group.cg_item);10471047+ configfs_undepend_item_unlocked(item);10451048 }10461049 }10471050}
+12-3
drivers/gpio/gpio-stmpe.c
···191191 [REG_IE][CSB] = STMPE_IDX_IEGPIOR_CSB,192192 [REG_IE][MSB] = STMPE_IDX_IEGPIOR_MSB,193193 };194194- int i, j;194194+ int ret, i, j;195195196196 /*197197 * STMPE1600: to be able to get IRQ from pins,···199199 * GPSR or GPCR registers200200 */201201 if (stmpe->partnum == STMPE1600) {202202- stmpe_reg_read(stmpe, stmpe->regs[STMPE_IDX_GPMR_LSB]);203203- stmpe_reg_read(stmpe, stmpe->regs[STMPE_IDX_GPMR_CSB]);202202+ ret = stmpe_reg_read(stmpe, stmpe->regs[STMPE_IDX_GPMR_LSB]);203203+ if (ret < 0) {204204+ dev_err(stmpe->dev, "Failed to read GPMR_LSB: %d\n", ret);205205+ goto err;206206+ }207207+ ret = stmpe_reg_read(stmpe, stmpe->regs[STMPE_IDX_GPMR_CSB]);208208+ if (ret < 0) {209209+ dev_err(stmpe->dev, "Failed to read GPMR_CSB: %d\n", ret);210210+ goto err;211211+ }204212 }205213206214 for (i = 0; i < CACHE_NR_REGS; i++) {···230222 }231223 }232224225225+err:233226 mutex_unlock(&stmpe_gpio->irq_lock);234227}235228
+3-3
drivers/gpio/gpiolib.c
···904904 }905905906906 if (gc->ngpio == 0) {907907- chip_err(gc, "tried to insert a GPIO chip with zero lines\n");907907+ dev_err(dev, "tried to insert a GPIO chip with zero lines\n");908908 return -EINVAL;909909 }910910911911 if (gc->ngpio > FASTPATH_NGPIO)912912- chip_warn(gc, "line cnt %u is greater than fast path cnt %u\n",913913- gc->ngpio, FASTPATH_NGPIO);912912+ dev_warn(dev, "line cnt %u is greater than fast path cnt %u\n",913913+ gc->ngpio, FASTPATH_NGPIO);914914915915 return 0;916916}
+3-2
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
···38153815 if (err == -ENODEV) {38163816 dev_warn(adev->dev, "cap microcode does not exist, skip\n");38173817 err = 0;38183818- goto out;38183818+ } else {38193819+ dev_err(adev->dev, "fail to initialize cap microcode\n");38193820 }38203820- dev_err(adev->dev, "fail to initialize cap microcode\n");38213821+ goto out;38213822 }3822382338233824 info = &adev->firmware.ucode[AMDGPU_UCODE_ID_CAP];
+6-2
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
···309309 mutex_lock(&adev->mman.gtt_window_lock);310310 while (src_mm.remaining) {311311 uint64_t from, to, cur_size, tiling_flags;312312- uint32_t num_type, data_format, max_com;312312+ uint32_t num_type, data_format, max_com, write_compress_disable;313313 struct dma_fence *next;314314315315 /* Never copy more than 256MiB at once to avoid a timeout */···340340 max_com = AMDGPU_TILING_GET(tiling_flags, GFX12_DCC_MAX_COMPRESSED_BLOCK);341341 num_type = AMDGPU_TILING_GET(tiling_flags, GFX12_DCC_NUMBER_TYPE);342342 data_format = AMDGPU_TILING_GET(tiling_flags, GFX12_DCC_DATA_FORMAT);343343+ write_compress_disable =344344+ AMDGPU_TILING_GET(tiling_flags, GFX12_DCC_WRITE_COMPRESS_DISABLE);343345 copy_flags |= (AMDGPU_COPY_FLAGS_SET(MAX_COMPRESSED, max_com) |344346 AMDGPU_COPY_FLAGS_SET(NUMBER_TYPE, num_type) |345345- AMDGPU_COPY_FLAGS_SET(DATA_FORMAT, data_format));347347+ AMDGPU_COPY_FLAGS_SET(DATA_FORMAT, data_format) |348348+ AMDGPU_COPY_FLAGS_SET(WRITE_COMPRESS_DISABLE,349349+ write_compress_disable));346350 }347351348352 r = amdgpu_copy_buffer(ring, from, to, cur_size, resv,
···10491049 s_rfe_b64 s_restore_pc_lo //Return to the main shader program and resume execution1050105010511051L_END_PGM:10521052+ // Make sure that no wave of the workgroup can exit the trap handler10531053+ // before the workgroup barrier state is saved.10541054+ s_barrier_signal -210551055+ s_barrier_wait -210521056 s_endpgm_saved10531057end10541058
+1-1
drivers/gpu/drm/amd/display/dc/core/dc.c
···2133213321342134 dc_enable_stereo(dc, context, dc_streams, context->stream_count);2135213521362136- if (context->stream_count > get_seamless_boot_stream_count(context) ||21362136+ if (get_seamless_boot_stream_count(context) == 0 ||21372137 context->stream_count == 0) {21382138 /* Must wait for no flips to be pending before doing optimize bw */21392139 hwss_wait_for_no_pipes_pending(dc, context);
+1-1
drivers/gpu/drm/ast/ast_dp.c
···195195 if (enabled)196196 vgacrdf_test |= AST_IO_VGACRDF_DP_VIDEO_ENABLE;197197198198- for (i = 0; i < 200; ++i) {198198+ for (i = 0; i < 1000; ++i) {199199 if (i)200200 mdelay(1);201201 vgacrdf = ast_get_index_reg_mask(ast, AST_IO_VGACRI, 0xdf,
+3-11
drivers/gpu/drm/display/drm_dp_cec.c
···311311 if (!aux->transfer)312312 return;313313314314-#ifndef CONFIG_MEDIA_CEC_RC315315- /*316316- * CEC_CAP_RC is part of CEC_CAP_DEFAULTS, but it is stripped by317317- * cec_allocate_adapter() if CONFIG_MEDIA_CEC_RC is undefined.318318- *319319- * Do this here as well to ensure the tests against cec_caps are320320- * correct.321321- */322322- cec_caps &= ~CEC_CAP_RC;323323-#endif324314 cancel_delayed_work_sync(&aux->cec.unregister_work);325315326316 mutex_lock(&aux->cec.lock);···327337 num_las = CEC_MAX_LOG_ADDRS;328338329339 if (aux->cec.adap) {330330- if (aux->cec.adap->capabilities == cec_caps &&340340+ /* Check if the adapter properties have changed */341341+ if ((aux->cec.adap->capabilities & CEC_CAP_MONITOR_ALL) ==342342+ (cec_caps & CEC_CAP_MONITOR_ALL) &&331343 aux->cec.adap->available_log_addrs == num_las) {332344 /* Unchanged, so just set the phys addr */333345 cec_s_phys_addr(aux->cec.adap, source_physical_address, false);
+1-5
drivers/gpu/drm/i915/gem/i915_gem_shmem.c
···209209 struct address_space *mapping = obj->base.filp->f_mapping;210210 unsigned int max_segment = i915_sg_segment_size(i915->drm.dev);211211 struct sg_table *st;212212- struct sgt_iter sgt_iter;213213- struct page *page;214212 int ret;215213216214 /*···237239 * for PAGE_SIZE chunks instead may be helpful.238240 */239241 if (max_segment > PAGE_SIZE) {240240- for_each_sgt_page(page, sgt_iter, st)241241- put_page(page);242242- sg_free_table(st);242242+ shmem_sg_free_table(st, mapping, false, false);243243 kfree(st);244244245245 max_segment = PAGE_SIZE;
+30-6
drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
···14691469 spin_unlock_irqrestore(&guc->timestamp.lock, flags);14701470}1471147114721472+static void __update_guc_busyness_running_state(struct intel_guc *guc)14731473+{14741474+ struct intel_gt *gt = guc_to_gt(guc);14751475+ struct intel_engine_cs *engine;14761476+ enum intel_engine_id id;14771477+ unsigned long flags;14781478+14791479+ spin_lock_irqsave(&guc->timestamp.lock, flags);14801480+ for_each_engine(engine, gt, id)14811481+ engine->stats.guc.running = false;14821482+ spin_unlock_irqrestore(&guc->timestamp.lock, flags);14831483+}14841484+14721485static void __update_guc_busyness_stats(struct intel_guc *guc)14731486{14741487 struct intel_gt *gt = guc_to_gt(guc);···1631161816321619 if (!guc_submission_initialized(guc))16331620 return;16211621+16221622+ /* Assume no engines are running and set running state to false */16231623+ __update_guc_busyness_running_state(guc);1634162416351625 /*16361626 * There is a race with suspend flow where the worker runs after suspend···55355519{55365520 drm_printf(p, "GuC lrc descriptor %u:\n", ce->guc_id.id);55375521 drm_printf(p, "\tHW Context Desc: 0x%08x\n", ce->lrc.lrca);55385538- drm_printf(p, "\t\tLRC Head: Internal %u, Memory %u\n",55395539- ce->ring->head,55405540- ce->lrc_reg_state[CTX_RING_HEAD]);55415541- drm_printf(p, "\t\tLRC Tail: Internal %u, Memory %u\n",55425542- ce->ring->tail,55435543- ce->lrc_reg_state[CTX_RING_TAIL]);55225522+ if (intel_context_pin_if_active(ce)) {55235523+ drm_printf(p, "\t\tLRC Head: Internal %u, Memory %u\n",55245524+ ce->ring->head,55255525+ ce->lrc_reg_state[CTX_RING_HEAD]);55265526+ drm_printf(p, "\t\tLRC Tail: Internal %u, Memory %u\n",55275527+ ce->ring->tail,55285528+ ce->lrc_reg_state[CTX_RING_TAIL]);55295529+ intel_context_unpin(ce);55305530+ } else {55315531+ drm_printf(p, "\t\tLRC Head: Internal %u, Memory not pinned\n",55325532+ ce->ring->head);55335533+ drm_printf(p, "\t\tLRC Tail: Internal %u, Memory not pinned\n",55345534+ ce->ring->tail);55355535+ }55445536 
drm_printf(p, "\t\tContext Pin Count: %u\n",55455537 atomic_read(&ce->pin_count));55465538 drm_printf(p, "\t\tGuC ID Ref Count: %u\n",
+2-2
drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
···168168 return PTR_ERR(ppgtt);169169170170 if (!ppgtt->vm.allocate_va_range)171171- goto err_ppgtt_cleanup;171171+ goto ppgtt_vm_put;172172173173 /*174174 * While we only allocate the page tables here and so we could···236236 goto retry;237237 }238238 i915_gem_ww_ctx_fini(&ww);239239-239239+ppgtt_vm_put:240240 i915_vm_put(&ppgtt->vm);241241 return err;242242}
+1
drivers/gpu/drm/panthor/panthor_drv.c
···802802{803803 int prio;804804805805+ memset(arg, 0, sizeof(*arg));805806 for (prio = PANTHOR_GROUP_PRIORITY_REALTIME; prio >= 0; prio--) {806807 if (!group_priority_permit(file, prio))807808 arg->allowed_mask |= BIT(prio);
+15-27
drivers/gpu/drm/xe/xe_devcoredump.c
···119119 drm_puts(&p, "\n**** GuC CT ****\n");120120 xe_guc_ct_snapshot_print(ss->guc.ct, &p);121121122122- /*123123- * Don't add a new section header here because the mesa debug decoder124124- * tool expects the context information to be in the 'GuC CT' section.125125- */126126- /* drm_puts(&p, "\n**** Contexts ****\n"); */122122+ drm_puts(&p, "\n**** Contexts ****\n");127123 xe_guc_exec_queue_snapshot_print(ss->ge, &p);128124129125 drm_puts(&p, "\n**** Job ****\n");···391395/**392396 * xe_print_blob_ascii85 - print a BLOB to some useful location in ASCII85393397 *394394- * The output is split to multiple lines because some print targets, e.g. dmesg395395- * cannot handle arbitrarily long lines. Note also that printing to dmesg in396396- * piece-meal fashion is not possible, each separate call to drm_puts() has a397397- * line-feed automatically added! Therefore, the entire output line must be398398- * constructed in a local buffer first, then printed in one atomic output call.398398+ * The output is split into multiple calls to drm_puts() because some print399399+ * targets, e.g. dmesg, cannot handle arbitrarily long lines. These targets may400400+ * add newlines, as is the case with dmesg: each drm_puts() call creates a401401+ * separate line.399402 *400403 * There is also a scheduler yield call to prevent the 'task has been stuck for401404 * 120s' kernel hang check feature from firing when printing to a slow target402405 * such as dmesg over a serial port.403406 *404404- * TODO: Add compression prior to the ASCII85 encoding to shrink huge buffers down.405405- *406407 * @p: the printer object to output to407408 * @prefix: optional prefix to add to output string409409+ * @suffix: optional suffix to add at the end. 
0 disables it and is410410+ * not added to the output, which is useful when using multiple calls411411+ * to dump data to @p408412 * @blob: the Binary Large OBject to dump out409413 * @offset: offset in bytes to skip from the front of the BLOB, must be a multiple of sizeof(u32)410414 * @size: the size in bytes of the BLOB, must be a multiple of sizeof(u32)411415 */412412-void xe_print_blob_ascii85(struct drm_printer *p, const char *prefix,416416+void xe_print_blob_ascii85(struct drm_printer *p, const char *prefix, char suffix,413417 const void *blob, size_t offset, size_t size)414418{415419 const u32 *blob32 = (const u32 *)blob;416420 char buff[ASCII85_BUFSZ], *line_buff;417421 size_t line_pos = 0;418422419419- /*420420- * Splitting blobs across multiple lines is not compatible with the mesa421421- * debug decoder tool. Note that even dropping the explicit '\n' below422422- * doesn't help because the GuC log is so big some underlying implementation423423- * still splits the lines at 512K characters. So just bail completely for424424- * the moment.425425- */426426- return;427427-428423#define DMESG_MAX_LINE_LEN 800429429-#define MIN_SPACE (ASCII85_BUFSZ + 2) /* 85 + "\n\0" */424424+ /* Always leave space for the suffix char and the \0 */425425+#define MIN_SPACE (ASCII85_BUFSZ + 2) /* 85 + "<suffix>\0" */430426431427 if (size & 3)432428 drm_printf(p, "Size not word aligned: %zu", size);···450462 line_pos += strlen(line_buff + line_pos);451463452464 if ((line_pos + MIN_SPACE) >= DMESG_MAX_LINE_LEN) {453453- line_buff[line_pos++] = '\n';454465 line_buff[line_pos++] = 0;455466456467 drm_puts(p, line_buff);···461474 }462475 }463476464464- if (line_pos) {465465- line_buff[line_pos++] = '\n';466466- line_buff[line_pos++] = 0;477477+ if (suffix)478478+ line_buff[line_pos++] = suffix;467479480480+ if (line_pos) {481481+ line_buff[line_pos++] = 0;468482 drm_puts(p, line_buff);469483 }470484
···5757 return GRAPHICS_VERx100(xe) < 1270 && !IS_DGFX(xe);5858}59596060-static s64 detect_bar2_dgfx(struct xe_device *xe, struct xe_ttm_stolen_mgr *mgr)6161-{6262- struct xe_tile *tile = xe_device_get_root_tile(xe);6363- struct xe_mmio *mmio = xe_root_tile_mmio(xe);6464- struct pci_dev *pdev = to_pci_dev(xe->drm.dev);6565- u64 stolen_size;6666- u64 tile_offset;6767- u64 tile_size;6868-6969- tile_offset = tile->mem.vram.io_start - xe->mem.vram.io_start;7070- tile_size = tile->mem.vram.actual_physical_size;7171-7272- /* Use DSM base address instead for stolen memory */7373- mgr->stolen_base = (xe_mmio_read64_2x32(mmio, DSMBASE) & BDSM_MASK) - tile_offset;7474- if (drm_WARN_ON(&xe->drm, tile_size < mgr->stolen_base))7575- return 0;7676-7777- stolen_size = tile_size - mgr->stolen_base;7878-7979- /* Verify usage fits in the actual resource available */8080- if (mgr->stolen_base + stolen_size <= pci_resource_len(pdev, LMEM_BAR))8181- mgr->io_base = tile->mem.vram.io_start + mgr->stolen_base;8282-8383- /*8484- * There may be few KB of platform dependent reserved memory at the end8585- * of vram which is not part of the DSM. 
Such reserved memory portion is8686- * always less then DSM granularity so align down the stolen_size to DSM8787- * granularity to accommodate such reserve vram portion.8888- */8989- return ALIGN_DOWN(stolen_size, SZ_1M);9090-}9191-9260static u32 get_wopcm_size(struct xe_device *xe)9361{9462 u32 wopcm_size;···78110 }7911180112 return wopcm_size;113113+}114114+115115+static s64 detect_bar2_dgfx(struct xe_device *xe, struct xe_ttm_stolen_mgr *mgr)116116+{117117+ struct xe_tile *tile = xe_device_get_root_tile(xe);118118+ struct xe_mmio *mmio = xe_root_tile_mmio(xe);119119+ struct pci_dev *pdev = to_pci_dev(xe->drm.dev);120120+ u64 stolen_size, wopcm_size;121121+ u64 tile_offset;122122+ u64 tile_size;123123+124124+ tile_offset = tile->mem.vram.io_start - xe->mem.vram.io_start;125125+ tile_size = tile->mem.vram.actual_physical_size;126126+127127+ /* Use DSM base address instead for stolen memory */128128+ mgr->stolen_base = (xe_mmio_read64_2x32(mmio, DSMBASE) & BDSM_MASK) - tile_offset;129129+ if (drm_WARN_ON(&xe->drm, tile_size < mgr->stolen_base))130130+ return 0;131131+132132+ /* Carve out the top of DSM as it contains the reserved WOPCM region */133133+ wopcm_size = get_wopcm_size(xe);134134+ if (drm_WARN_ON(&xe->drm, !wopcm_size))135135+ return 0;136136+137137+ stolen_size = tile_size - mgr->stolen_base;138138+ stolen_size -= wopcm_size;139139+140140+ /* Verify usage fits in the actual resource available */141141+ if (mgr->stolen_base + stolen_size <= pci_resource_len(pdev, LMEM_BAR))142142+ mgr->io_base = tile->mem.vram.io_start + mgr->stolen_base;143143+144144+ /*145145+ * There may be few KB of platform dependent reserved memory at the end146146+ * of vram which is not part of the DSM. 
Such reserved memory portion is147147+ * always less than DSM granularity so align down the stolen_size to DSM148148+ * granularity to accommodate such reserved vram portion.149149+ */150150+ return ALIGN_DOWN(stolen_size, SZ_1M);81151}8215283153static u32 detect_bar2_integrated(struct xe_device *xe, struct xe_ttm_stolen_mgr *mgr)
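The moved detect_bar2_dgfx() now sizes stolen memory as the span from the DSM base to the end of the tile, minus the WOPCM carve-out at the top, rounded down to the 1 MiB DSM granularity. A userspace sketch of that arithmetic (values are made up; `ALIGN_DOWN` mirrors the kernel macro for power-of-two alignments):

```c
#include <assert.h>
#include <stdint.h>

#define SZ_1M (1ull << 20)
#define ALIGN_DOWN(x, a) ((x) & ~((uint64_t)(a) - 1))

/* Compute the usable stolen-memory size: tile size minus the DSM base offset
 * and the WOPCM region, aligned down to DSM granularity. The zero returns
 * mirror the drm_WARN_ON() early exits in the patched function. */
static int64_t stolen_size_bytes(uint64_t tile_size, uint64_t stolen_base,
				 uint64_t wopcm_size)
{
	if (tile_size < stolen_base || !wopcm_size)
		return 0;
	return ALIGN_DOWN(tile_size - stolen_base - wopcm_size, SZ_1M);
}
```

The function was moved below get_wopcm_size() so the carve-out can be queried before the subtraction, rather than relying on alignment alone to hide the reserved region.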
···104104 unsigned int id;105105 int i, err;106106107107- mutex_init(&host->intr_mutex);108108-109107 for (id = 0; id < host1x_syncpt_nb_pts(host); ++id) {110108 struct host1x_syncpt *syncpt = &host->syncpt[id];111109
+10-5
drivers/hid/Kconfig
···570570571571config HID_LENOVO572572 tristate "Lenovo / Thinkpad devices"573573+ depends on ACPI574574+ select ACPI_PLATFORM_PROFILE573575 select NEW_LEDS574576 select LEDS_CLASS575577 help···11691167 tristate "Topre REALFORCE keyboards"11701168 depends on HID11711169 help11721172- Say Y for N-key rollover support on Topre REALFORCE R2 108/87 key keyboards.11701170+ Say Y for N-key rollover support on Topre REALFORCE R2 108/87 key and11711171+ Topre REALFORCE R3S 87 key keyboards.1173117211741173config HID_THINGM11751174 tristate "ThingM blink(1) USB RGB LED"···1377137413781375source "drivers/hid/bpf/Kconfig"1379137613801380-endif # HID13811381-13821382-source "drivers/hid/usbhid/Kconfig"13831383-13841377source "drivers/hid/i2c-hid/Kconfig"1385137813861379source "drivers/hid/intel-ish-hid/Kconfig"···13861387source "drivers/hid/surface-hid/Kconfig"1387138813881389source "drivers/hid/intel-thc-hid/Kconfig"13901390+13911391+endif # HID13921392+13931393+# USB support may be used with HID disabled13941394+13951395+source "drivers/hid/usbhid/Kconfig"1389139613901397endif # HID_SUPPORT
-1
drivers/hid/amd-sfh-hid/Kconfig
···5566config AMD_SFH_HID77 tristate "AMD Sensor Fusion Hub"88- depends on HID98 depends on X86109 help1110 If you say yes to this option, support will be included for the
···106106 "%s::%s",107107 dev_name(&input->dev),108108 info->led_name);109109+ if (!led->cdev.name)110110+ return -ENOMEM;109111110112 ret = devm_led_classdev_register(&hdev->dev, &led->cdev);111113 if (ret)
+1-1
drivers/hid/i2c-hid/Kconfig
···22menuconfig I2C_HID33 tristate "I2C HID support"44 default y55- depends on I2C && INPUT && HID55+ depends on I2C6677if I2C_HID88
-1
drivers/hid/intel-ish-hid/Kconfig
···66 tristate "Intel Integrated Sensor Hub"77 default n88 depends on X8699- depends on HID109 help1110 The Integrated Sensor Hub (ISH) enables the ability to offload1211 sensor polling and algorithm processing to a dedicated low power
···517517 /* ISH FW is dead */518518 if (!ish_is_input_ready(dev))519519 return -EPIPE;520520+521521+ /* Send clock sync at once after reset */522522+ ishtp_dev->prev_sync = 0;523523+520524 /*521525 * Set HOST2ISH.ILUP. Apparently we need this BEFORE sending522526 * RESET_NOTIFY_ACK - FW will be checking for it···581577 */582578static void _ish_sync_fw_clock(struct ishtp_device *dev)583579{584584- static unsigned long prev_sync;585585- uint64_t usec;580580+ struct ipc_time_update_msg time = {};586581587587- if (prev_sync && time_before(jiffies, prev_sync + 20 * HZ))582582+ if (dev->prev_sync && time_before(jiffies, dev->prev_sync + 20 * HZ))588583 return;589584590590- prev_sync = jiffies;591591- usec = ktime_to_us(ktime_get_boottime());592592- ipc_send_mng_msg(dev, MNG_SYNC_FW_CLOCK, &usec, sizeof(uint64_t));585585+ dev->prev_sync = jiffies;586586+ /* The fields of time would be updated while sending message */587587+ ipc_send_mng_msg(dev, MNG_SYNC_FW_CLOCK, &time, sizeof(time));593588}594589595590/**
···253253 unsigned int ipc_tx_cnt;254254 unsigned long long ipc_tx_bytes_cnt;255255256256+ /* Time of the last clock sync */257257+ unsigned long prev_sync;256258 const struct ishtp_hw_ops *ops;257259 size_t mtu;258260 uint32_t ishtp_msg_hdr;
-1
drivers/hid/intel-thc-hid/Kconfig
···77config INTEL_THC_HID88 tristate "Intel Touch Host Controller"99 depends on ACPI1010- select HID1110 help1211 THC (Touch Host Controller) is the name of the IP block in PCH that1312 interfaces with Touch Devices (ex: touchscreen, touchpad etc.). It
-2
drivers/hid/surface-hid/Kconfig
···11# SPDX-License-Identifier: GPL-2.0+22menu "Surface System Aggregator Module HID support"33 depends on SURFACE_AGGREGATOR44- depends on INPUT5465config SURFACE_HID76 tristate "HID transport driver for Surface System Aggregator Module"···38393940config SURFACE_HID_CORE4041 tristate4141- select HID
+1-2
drivers/hid/usbhid/Kconfig
···55config USB_HID66 tristate "USB HID transport layer"77 default y88- depends on USB && INPUT99- select HID88+ depends on HID109 help1110 Say Y here if you want to connect USB keyboards,1211 mice, joysticks, graphic tablets, or any other HID based devices
+76-39
drivers/i2c/i2c-core-base.c
···13001300 info.flags |= I2C_CLIENT_SLAVE;13011301 }1302130213031303- info.flags |= I2C_CLIENT_USER;13041304-13051303 client = i2c_new_client_device(adap, &info);13061304 if (IS_ERR(client))13071305 return PTR_ERR(client);1308130613071307+ /* Keep track of the added device */13081308+ mutex_lock(&adap->userspace_clients_lock);13091309+ list_add_tail(&client->detected, &adap->userspace_clients);13101310+ mutex_unlock(&adap->userspace_clients_lock);13091311 dev_info(dev, "%s: Instantiated device %s at 0x%02hx\n", "new_device",13101312 info.type, info.addr);1311131313121314 return count;13131315}13141316static DEVICE_ATTR_WO(new_device);13151315-13161316-static int __i2c_find_user_addr(struct device *dev, const void *addrp)13171317-{13181318- struct i2c_client *client = i2c_verify_client(dev);13191319- unsigned short addr = *(unsigned short *)addrp;13201320-13211321- return client && client->flags & I2C_CLIENT_USER &&13221322- i2c_encode_flags_to_addr(client) == addr;13231323-}1324131713251318/*13261319 * And of course let the users delete the devices they instantiated, if···13291336 const char *buf, size_t count)13301337{13311338 struct i2c_adapter *adap = to_i2c_adapter(dev);13321332- struct device *child_dev;13391339+ struct i2c_client *client, *next;13331340 unsigned short addr;13341341 char end;13351342 int res;···13451352 return -EINVAL;13461353 }1347135413481348- mutex_lock(&core_lock);13491355 /* Make sure the device was added through sysfs */13501350- child_dev = device_find_child(&adap->dev, &addr, __i2c_find_user_addr);13511351- if (child_dev) {13521352- i2c_unregister_device(i2c_verify_client(child_dev));13531353- put_device(child_dev);13541354- } else {13551355- dev_err(dev, "Can't find userspace-created device at %#x\n", addr);13561356- count = -ENOENT;13571357- }13581358- mutex_unlock(&core_lock);13561356+ res = -ENOENT;13571357+ mutex_lock_nested(&adap->userspace_clients_lock,13581358+ i2c_adapter_depth(adap));13591359+ 
list_for_each_entry_safe(client, next, &adap->userspace_clients,13601360+ detected) {13611361+ if (i2c_encode_flags_to_addr(client) == addr) {13621362+ dev_info(dev, "%s: Deleting device %s at 0x%02hx\n",13631363+ "delete_device", client->name, client->addr);1359136413601360- return count;13651365+ list_del(&client->detected);13661366+ i2c_unregister_device(client);13671367+ res = count;13681368+ break;13691369+ }13701370+ }13711371+ mutex_unlock(&adap->userspace_clients_lock);13721372+13731373+ if (res < 0)13741374+ dev_err(dev, "%s: Can't find device in list\n",13751375+ "delete_device");13761376+ return res;13611377}13621378static DEVICE_ATTR_IGNORE_LOCKDEP(delete_device, S_IWUSR, NULL,13631379 delete_device_store);···15371535 adap->locked_flags = 0;15381536 rt_mutex_init(&adap->bus_lock);15391537 rt_mutex_init(&adap->mux_lock);15381538+ mutex_init(&adap->userspace_clients_lock);15391539+ INIT_LIST_HEAD(&adap->userspace_clients);1540154015411541 /* Set default timeout to 1 second if not already set */15421542 if (adap->timeout == 0)···17041700}17051701EXPORT_SYMBOL_GPL(i2c_add_numbered_adapter);1706170217031703+static void i2c_do_del_adapter(struct i2c_driver *driver,17041704+ struct i2c_adapter *adapter)17051705+{17061706+ struct i2c_client *client, *_n;17071707+17081708+ /* Remove the devices we created ourselves as the result of hardware17091709+ * probing (using a driver's detect method) */17101710+ list_for_each_entry_safe(client, _n, &driver->clients, detected) {17111711+ if (client->adapter == adapter) {17121712+ dev_dbg(&adapter->dev, "Removing %s at 0x%x\n",17131713+ client->name, client->addr);17141714+ list_del(&client->detected);17151715+ i2c_unregister_device(client);17161716+ }17171717+ }17181718+}17191719+17071720static int __unregister_client(struct device *dev, void *dummy)17081721{17091722 struct i2c_client *client = i2c_verify_client(dev);···17361715 return 0;17371716}1738171717181718+static int __process_removed_adapter(struct device_driver 
*d, void *data)17191719+{17201720+ i2c_do_del_adapter(to_i2c_driver(d), data);17211721+ return 0;17221722+}17231723+17391724/**17401725 * i2c_del_adapter - unregister I2C adapter17411726 * @adap: the adapter being unregistered···17531726void i2c_del_adapter(struct i2c_adapter *adap)17541727{17551728 struct i2c_adapter *found;17291729+ struct i2c_client *client, *next;1756173017571731 /* First make sure that this adapter was ever added */17581732 mutex_lock(&core_lock);···17651737 }1766173817671739 i2c_acpi_remove_space_handler(adap);17401740+ /* Tell drivers about this removal */17411741+ mutex_lock(&core_lock);17421742+ bus_for_each_drv(&i2c_bus_type, NULL, adap,17431743+ __process_removed_adapter);17441744+ mutex_unlock(&core_lock);17451745+17461746+ /* Remove devices instantiated from sysfs */17471747+ mutex_lock_nested(&adap->userspace_clients_lock,17481748+ i2c_adapter_depth(adap));17491749+ list_for_each_entry_safe(client, next, &adap->userspace_clients,17501750+ detected) {17511751+ dev_dbg(&adap->dev, "Removing %s at 0x%x\n", client->name,17521752+ client->addr);17531753+ list_del(&client->detected);17541754+ i2c_unregister_device(client);17551755+ }17561756+ mutex_unlock(&adap->userspace_clients_lock);1768175717691758 /* Detach any active clients. This can't fail, thus we do not17701759 * check the returned value. This is a two-pass process, because17711760 * we can't remove the dummy devices during the first pass: they17721761 * could have been instantiated by real devices wishing to clean17731762 * them up properly, so we give them a chance to do that first. 
*/17741774- mutex_lock(&core_lock);17751763 device_for_each_child(&adap->dev, NULL, __unregister_client);17761764 device_for_each_child(&adap->dev, NULL, __unregister_dummy);17771777- mutex_unlock(&core_lock);1778176517791766 /* device name is gone after device_unregister */17801767 dev_dbg(&adap->dev, "adapter [%s] unregistered\n", adap->name);···20091966 /* add the driver to the list of i2c drivers in the driver core */20101967 driver->driver.owner = owner;20111968 driver->driver.bus = &i2c_bus_type;19691969+ INIT_LIST_HEAD(&driver->clients);2012197020131971 /* When registration returns, the driver core20141972 * will have called probe() for all matching-but-unbound devices.···20271983}20281984EXPORT_SYMBOL(i2c_register_driver);2029198520302030-static int __i2c_unregister_detected_client(struct device *dev, void *argp)19861986+static int __process_removed_driver(struct device *dev, void *data)20311987{20322032- struct i2c_client *client = i2c_verify_client(dev);20332033-20342034- if (client && client->flags & I2C_CLIENT_AUTO)20352035- i2c_unregister_device(client);20362036-19881988+ if (dev->type == &i2c_adapter_type)19891989+ i2c_do_del_adapter(data, to_i2c_adapter(dev));20371990 return 0;20381991}20391992···20412000 */20422001void i2c_del_driver(struct i2c_driver *driver)20432002{20442044- mutex_lock(&core_lock);20452045- /* Satisfy __must_check, function can't fail */20462046- if (driver_for_each_device(&driver->driver, NULL, NULL,20472047- __i2c_unregister_detected_client)) {20482048- }20492049- mutex_unlock(&core_lock);20032003+ i2c_for_each_dev(driver, __process_removed_driver);2050200420512005 driver_unregister(&driver->driver);20522006 pr_debug("driver [%s] unregistered\n", driver->driver.name);···24682432 /* Finally call the custom detection function */24692433 memset(&info, 0, sizeof(struct i2c_board_info));24702434 info.addr = addr;24712471- info.flags = I2C_CLIENT_AUTO;24722435 err = driver->detect(temp_client, &info);24732436 if (err) {24742437 /* 
-ENODEV is returned if the detection fails. We catch it···24942459 dev_dbg(&adapter->dev, "Creating %s at 0x%02x\n",24952460 info.type, info.addr);24962461 client = i2c_new_client_device(adapter, &info);24972497- if (IS_ERR(client))24622462+ if (!IS_ERR(client))24632463+ list_add_tail(&client->detected, &driver->clients);24642464+ else24982465 dev_err(&adapter->dev, "Failed creating %s at 0x%02x\n",24992466 info.type, info.addr);25002467 }
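The i2c-core revert above restores per-adapter tracking of sysfs-instantiated devices: "new_device" adds the client to `adap->userspace_clients`, and "delete_device" only removes clients found on that list. A simplified userspace sketch of that ownership pattern (types and names are stand-ins for the kernel structures; locking omitted):

```c
#include <assert.h>
#include <stdlib.h>

struct fake_client {
	unsigned short addr;
	struct fake_client *next;
};

struct fake_adapter {
	struct fake_client *userspace_clients; /* head of the tracked list */
};

/* "new_device": instantiate and remember the client on the adapter's list */
static void add_userspace_client(struct fake_adapter *adap, unsigned short addr)
{
	struct fake_client *c = malloc(sizeof(*c));

	if (!c)
		return;
	c->addr = addr;
	c->next = adap->userspace_clients;
	adap->userspace_clients = c;
}

/* "delete_device": remove only clients that were added through sysfs.
 * Returns 1 on success, 0 if not found (the -ENOENT path in the kernel). */
static int delete_userspace_client(struct fake_adapter *adap, unsigned short addr)
{
	struct fake_client **pp = &adap->userspace_clients;

	for (; *pp; pp = &(*pp)->next) {
		if ((*pp)->addr == addr) {
			struct fake_client *victim = *pp;

			*pp = victim->next;
			free(victim);
			return 1;
		}
	}
	return 0;
}
```

Keeping a dedicated list (rather than walking all children and matching a flag) is what lets adapter teardown unregister only the userspace-created devices without holding the global core lock.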
···2653265326542654 /* Set IOTLB invalidation timeout to 1s */26552655 iommu_set_inv_tlb_timeout(iommu, CTRL_INV_TO_1S);26562656+26572657+ /* Enable Enhanced Peripheral Page Request Handling */26582658+ if (check_feature(FEATURE_EPHSUP))26592659+ iommu_feature_enable(iommu, CONTROL_EPH_EN);26562660}2657266126582662static void iommu_apply_resume_quirks(struct amd_iommu *iommu)···31983194 return true;31993195}3200319632013201-static void iommu_snp_enable(void)31973197+static __init void iommu_snp_enable(void)32023198{32033199#ifdef CONFIG_KVM_AMD_SEV32043200 if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))···32203216 amd_iommu_snp_en = check_feature(FEATURE_SNP);32213217 if (!amd_iommu_snp_en) {32223218 pr_warn("SNP: IOMMU SNP feature not enabled, SNP cannot be supported.\n");32193219+ goto disable_snp;32203220+ }32213221+32223222+ /*32233223+ * Enable host SNP support once SNP support is checked on IOMMU.32243224+ */32253225+ if (snp_rmptable_init()) {32263226+ pr_warn("SNP: RMP initialization failed, SNP cannot be supported.\n");32233227 goto disable_snp;32243228 }32253229···33293317 break;33303318 ret = state_next();33313319 }33203320+33213321+ /*33223322+ * SNP platform initialization requires IOMMUs to be fully configured.33233323+ * If the SNP support on IOMMUs has NOT been checked, simply mark SNP33243324+ * as unsupported. If the SNP support on IOMMUs has been checked and33253325+ * host SNP support enabled but RMP enforcement has not been enabled33263326+ * in IOMMUs, then the system is in a half-baked state, but can limp33273327+ * along as all memory should be Hypervisor-Owned in the RMP. 
WARN,33283328+ * but leave SNP as "supported" to avoid confusing the kernel.33293329+ */33303330+ if (ret && cc_platform_has(CC_ATTR_HOST_SEV_SNP) &&33313331+ !WARN_ON_ONCE(amd_iommu_snp_en))33323332+ cc_platform_clear(CC_ATTR_HOST_SEV_SNP);3332333333333334 return ret;33343335}···34513426 int ret;3452342734533428 if (no_iommu || (iommu_detected && !gart_iommu_aperture))34543454- return;34293429+ goto disable_snp;3455343034563431 if (!amd_iommu_sme_check())34573457- return;34323432+ goto disable_snp;3458343334593434 ret = iommu_go_to_state(IOMMU_IVRS_DETECTED);34603435 if (ret)34613461- return;34363436+ goto disable_snp;3462343734633438 amd_iommu_detected = true;34643439 iommu_detected = 1;34653440 x86_init.iommu.iommu_init = amd_iommu_init;34413441+ return;34423442+34433443+disable_snp:34443444+ if (cc_platform_has(CC_ATTR_HOST_SEV_SNP))34453445+ cc_platform_clear(CC_ATTR_HOST_SEV_SNP);34663446}3467344734683448/****************************************************************************
+3-3
drivers/iommu/exynos-iommu.c
···249249 struct list_head clients; /* list of sysmmu_drvdata.domain_node */250250 sysmmu_pte_t *pgtable; /* lv1 page table, 16KB */251251 short *lv2entcnt; /* free lv2 entry counter for each section */252252- spinlock_t lock; /* lock for modyfying list of clients */252252+ spinlock_t lock; /* lock for modifying list of clients */253253 spinlock_t pgtablelock; /* lock for modifying page table @ pgtable */254254 struct iommu_domain domain; /* generic domain data structure */255255};···292292 struct clk *aclk; /* SYSMMU's aclk clock */293293 struct clk *pclk; /* SYSMMU's pclk clock */294294 struct clk *clk_master; /* master's device clock */295295- spinlock_t lock; /* lock for modyfying state */295295+ spinlock_t lock; /* lock for modifying state */296296 bool active; /* current status */297297 struct exynos_iommu_domain *domain; /* domain we belong to */298298 struct list_head domain_node; /* node for domain clients list */···746746 ret = devm_request_irq(dev, irq, exynos_sysmmu_irq, 0,747747 dev_name(dev), data);748748 if (ret) {749749- dev_err(dev, "Unabled to register handler of irq %d\n", irq);749749+ dev_err(dev, "Unable to register handler of irq %d\n", irq);750750 return ret;751751 }752752
···17561756 group->id);1757175717581758 /*17591759- * Try to recover, drivers are allowed to force IDENITY or DMA, IDENTITY17591759+ * Try to recover, drivers are allowed to force IDENTITY or DMA, IDENTITY17601760 * takes precedence.17611761 */17621762 if (type == IOMMU_DOMAIN_IDENTITY)
+1
drivers/irqchip/Kconfig
···169169170170config LAN966X_OIC171171 tristate "Microchip LAN966x OIC Support"172172+ depends on MCHP_LAN966X_PCI || COMPILE_TEST172173 select GENERIC_IRQ_CHIP173174 select IRQ_DOMAIN174175 help
+2-1
drivers/irqchip/irq-apple-aic.c
···577577 AIC_FIQ_HWIRQ(AIC_TMR_EL02_VIRT));578578 }579579580580- if (read_sysreg_s(SYS_IMP_APL_PMCR0_EL1) & PMCR0_IACT) {580580+ if ((read_sysreg_s(SYS_IMP_APL_PMCR0_EL1) & (PMCR0_IMODE | PMCR0_IACT)) ==581581+ (FIELD_PREP(PMCR0_IMODE, PMCR0_IMODE_FIQ) | PMCR0_IACT)) {581582 int irq;582583 if (cpumask_test_cpu(smp_processor_id(),583584 &aic_irqc->fiq_aff[AIC_CPU_PMU_P]->aff))
+2-1
drivers/irqchip/irq-mvebu-icu.c
···6868 unsigned long *hwirq, unsigned int *type)6969{7070 unsigned int param_count = static_branch_unlikely(&legacy_bindings) ? 3 : 2;7171- struct mvebu_icu_msi_data *msi_data = d->host_data;7171+ struct msi_domain_info *info = d->host_data;7272+ struct mvebu_icu_msi_data *msi_data = info->chip_data;7273 struct mvebu_icu *icu = msi_data->icu;73747475 /* Check the count of the parameters in dt */
···159159}160160161161static struct regmap *device_node_get_regmap(struct device_node *np,162162+ bool create_regmap,162163 bool check_res)163164{164165 struct syscon *entry, *syscon = NULL;···173172 }174173175174 if (!syscon) {176176- if (of_device_is_compatible(np, "syscon"))175175+ if (create_regmap)177176 syscon = of_syscon_register(np, check_res);178177 else179178 syscon = ERR_PTR(-EINVAL);···234233}235234EXPORT_SYMBOL_GPL(of_syscon_register_regmap);236235236236+/**237237+ * device_node_to_regmap() - Get or create a regmap for specified device node238238+ * @np: Device tree node239239+ *240240+ * Get a regmap for the specified device node. If there's not an existing241241+ * regmap, then one is instantiated. This function should not be used if the242242+ * device node has a custom regmap driver or has resources (clocks, resets) to243243+ * be managed. Use syscon_node_to_regmap() instead for those cases.244244+ *245245+ * Return: regmap ptr on success, negative error code on failure.246246+ */237247struct regmap *device_node_to_regmap(struct device_node *np)238248{239239- return device_node_get_regmap(np, false);249249+ return device_node_get_regmap(np, true, false);240250}241251EXPORT_SYMBOL_GPL(device_node_to_regmap);242252253253+/**254254+ * syscon_node_to_regmap() - Get or create a regmap for specified syscon device node255255+ * @np: Device tree node256256+ *257257+ * Get a regmap for the specified device node. If there's not an existing258258+ * regmap, then one is instantiated if the node is a generic "syscon". 
This259259+ * function is safe to use for a syscon registered with260260+ * of_syscon_register_regmap().261261+ *262262+ * Return: regmap ptr on success, negative error code on failure.263263+ */243264struct regmap *syscon_node_to_regmap(struct device_node *np)244265{245245- return device_node_get_regmap(np, true);266266+ return device_node_get_regmap(np, of_device_is_compatible(np, "syscon"), true);246267}247268EXPORT_SYMBOL_GPL(syscon_node_to_regmap);248269
···622622 netdev_dbg(priv->ndev, "RX-FIFO overflow\n");623623624624 skb = rkcanfd_alloc_can_err_skb(priv, &cf, ×tamp);625625- if (skb)625625+ if (!skb)626626 return 0;627627628628 rkcanfd_get_berr_counter_corrected(priv, &bec);
+5-1
drivers/net/can/usb/etas_es58x/es58x_devlink.c
···248248 return ret;249249 }250250251251- return devlink_info_serial_number_put(req, es58x_dev->udev->serial);251251+ if (es58x_dev->udev->serial)252252+ ret = devlink_info_serial_number_put(req,253253+ es58x_dev->udev->serial);254254+255255+ return ret;252256}253257254258const struct devlink_ops es58x_dl_ops = {
+3-1
drivers/net/ethernet/aquantia/atlantic/aq_nic.c
···14411441 aq_ptp_ring_free(self);14421442 aq_ptp_free(self);1443144314441444- if (likely(self->aq_fw_ops->deinit) && link_down) {14441444+ /* May be invoked during hot unplug. */14451445+ if (pci_device_is_present(self->pdev) &&14461446+ likely(self->aq_fw_ops->deinit) && link_down) {14451447 mutex_lock(&self->fwreq_mutex);14461448 self->aq_fw_ops->deinit(self->aq_hw);14471449 mutex_unlock(&self->fwreq_mutex);
···4141{4242 struct bcmgenet_priv *priv = netdev_priv(dev);4343 struct device *kdev = &priv->pdev->dev;4444+ u32 phy_wolopts = 0;44454545- if (dev->phydev)4646+ if (dev->phydev) {4647 phy_ethtool_get_wol(dev->phydev, wol);4848+ phy_wolopts = wol->wolopts;4949+ }47504851 /* MAC is not wake-up capable, return what the PHY does */4952 if (!device_can_wakeup(kdev))···54515552 /* Overlay MAC capabilities with that of the PHY queried before */5653 wol->supported |= WAKE_MAGIC | WAKE_MAGICSECURE | WAKE_FILTER;5757- wol->wolopts = priv->wolopts;5858- memset(wol->sopass, 0, sizeof(wol->sopass));5454+ wol->wolopts |= priv->wolopts;59555656+ /* Return the PHY configured magic password */5757+ if (phy_wolopts & WAKE_MAGICSECURE)5858+ return;5959+6060+ /* Otherwise the MAC one */6161+ memset(wol->sopass, 0, sizeof(wol->sopass));6062 if (wol->wolopts & WAKE_MAGICSECURE)6163 memcpy(wol->sopass, priv->sopass, sizeof(priv->sopass));6264}···7870 /* Try Wake-on-LAN from the PHY first */7971 if (dev->phydev) {8072 ret = phy_ethtool_set_wol(dev->phydev, wol);8181- if (ret != -EOPNOTSUPP)7373+ if (ret != -EOPNOTSUPP && wol->wolopts)8274 return ret;8375 }8476
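The bcmgenet get_wol() change above reports the union of the PHY's and the MAC's wake options, and prefers the PHY's SecureOn password when the PHY has WAKE_MAGICSECURE armed. A sketch of that precedence logic (flag values here are illustrative, not the real `WAKE_*` constants):

```c
#include <assert.h>

#define WAKE_MAGIC       0x1
#define WAKE_MAGICSECURE 0x2

struct wol_state {
	unsigned int wolopts;
	int sopass_from_phy; /* 1 = PHY password, 0 = MAC password, -1 = none */
};

/* Merge PHY and MAC wake-on-LAN state: the reported options are the union,
 * and the PHY's SecureOn password takes precedence over the MAC's. */
static struct wol_state merge_wol(unsigned int phy_wolopts,
				  unsigned int mac_wolopts)
{
	struct wol_state s;

	s.wolopts = phy_wolopts | mac_wolopts;
	if (phy_wolopts & WAKE_MAGICSECURE)
		s.sopass_from_phy = 1;
	else if (s.wolopts & WAKE_MAGICSECURE)
		s.sopass_from_phy = 0;
	else
		s.sopass_from_phy = -1;
	return s;
}
```

The set_wol() side mirrors this: the PHY result is only returned early when it actually configured something, so a zero `wolopts` request still falls through to clear the MAC state.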
+58
drivers/net/ethernet/broadcom/tg3.c
···5555#include <linux/hwmon.h>5656#include <linux/hwmon-sysfs.h>5757#include <linux/crc32poly.h>5858+#include <linux/dmi.h>58595960#include <net/checksum.h>6061#include <net/gso.h>···18213182121821418213static SIMPLE_DEV_PM_OPS(tg3_pm_ops, tg3_suspend, tg3_resume);18215182141821518215+/* Systems where ACPI _PTS (Prepare To Sleep) S5 will result in a fatal1821618216+ * PCIe AER event on the tg3 device if the tg3 device is not, or cannot1821718217+ * be, powered down.1821818218+ */1821918219+static const struct dmi_system_id tg3_restart_aer_quirk_table[] = {1822018220+ {1822118221+ .matches = {1822218222+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),1822318223+ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R440"),1822418224+ },1822518225+ },1822618226+ {1822718227+ .matches = {1822818228+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),1822918229+ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R540"),1823018230+ },1823118231+ },1823218232+ {1823318233+ .matches = {1823418234+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),1823518235+ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R640"),1823618236+ },1823718237+ },1823818238+ {1823918239+ .matches = {1824018240+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),1824118241+ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R650"),1824218242+ },1824318243+ },1824418244+ {1824518245+ .matches = {1824618246+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),1824718247+ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R740"),1824818248+ },1824918249+ },1825018250+ {1825118251+ .matches = {1825218252+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),1825318253+ DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R750"),1825418254+ },1825518255+ },1825618256+ {}1825718257+};1825818258+1821618259static void tg3_shutdown(struct pci_dev *pdev)1821718260{1821818261 struct net_device *dev = pci_get_drvdata(pdev);···18273182281827418229 if (system_state == SYSTEM_POWER_OFF)1827518230 tg3_power_down(tp);1823118231+ else if (system_state == SYSTEM_RESTART &&1823218232+ dmi_first_match(tg3_restart_aer_quirk_table) &&1823318233+ 
pdev->current_state != PCI_D3cold &&1823418234+ pdev->current_state != PCI_UNKNOWN) {1823518235+ /* Disable PCIe AER on the tg3 to avoid a fatal1823618236+ * error during this system restart.1823718237+ */1823818238+ pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL,1823918239+ PCI_EXP_DEVCTL_CERE |1824018240+ PCI_EXP_DEVCTL_NFERE |1824118241+ PCI_EXP_DEVCTL_FERE |1824218242+ PCI_EXP_DEVCTL_URRE);1824318243+ }18276182441827718245 rtnl_unlock();1827818246
+1-1
drivers/net/ethernet/intel/iavf/iavf_main.c
···29032903 }2904290429052905 mutex_unlock(&adapter->crit_lock);29062906- netdev_unlock(netdev);29072906restart_watchdog:29072907+ netdev_unlock(netdev);29082908 if (adapter->state >= __IAVF_DOWN)29092909 queue_work(adapter->wq, &adapter->adminq_task);29102910 if (adapter->aq_required)
···527527 * @xdp: xdp_buff used as input to the XDP program528528 * @xdp_prog: XDP program to run529529 * @xdp_ring: ring to be used for XDP_TX action530530- * @rx_buf: Rx buffer to store the XDP action531530 * @eop_desc: Last descriptor in packet to read metadata from532531 *533532 * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}534533 */535535-static void534534+static u32536535ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,537536 struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,538538- struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)537537+ union ice_32b_rx_flex_desc *eop_desc)539538{540539 unsigned int ret = ICE_XDP_PASS;541540 u32 act;···573574 ret = ICE_XDP_CONSUMED;574575 }575576exit:576576- ice_set_rx_bufs_act(xdp, rx_ring, ret);577577+ return ret;577578}578579579580/**···859860 xdp_buff_set_frags_flag(xdp);860861 }861862862862- if (unlikely(sinfo->nr_frags == MAX_SKB_FRAGS)) {863863- ice_set_rx_bufs_act(xdp, rx_ring, ICE_XDP_CONSUMED);863863+ if (unlikely(sinfo->nr_frags == MAX_SKB_FRAGS))864864 return -ENOMEM;865865- }866865867866 __skb_fill_page_desc_noacc(sinfo, sinfo->nr_frags++, rx_buf->page,868867 rx_buf->page_offset, size);···921924 struct ice_rx_buf *rx_buf;922925923926 rx_buf = &rx_ring->rx_buf[ntc];924924- rx_buf->pgcnt = page_count(rx_buf->page);925927 prefetchw(rx_buf->page);926928927929 if (!size)···934938 rx_buf->pagecnt_bias--;935939936940 return rx_buf;941941+}942942+943943+/**944944+ * ice_get_pgcnts - grab page_count() for gathered fragments945945+ * @rx_ring: Rx descriptor ring to store the page counts on946946+ *947947+ * This function is intended to be called right before running XDP948948+ * program so that the page recycling mechanism will be able to take949949+ * a correct decision regarding underlying pages; this is done in such950950+ * way as XDP program can change the refcount of page951951+ */952952+static void ice_get_pgcnts(struct ice_rx_ring *rx_ring)953953+{954954+ u32 nr_frags = 
rx_ring->nr_frags + 1;
+	u32 idx = rx_ring->first_desc;
+	struct ice_rx_buf *rx_buf;
+	u32 cnt = rx_ring->count;
+
+	for (int i = 0; i < nr_frags; i++) {
+		rx_buf = &rx_ring->rx_buf[idx];
+		rx_buf->pgcnt = page_count(rx_buf->page);
+
+		if (++idx == cnt)
+			idx = 0;
+	}
 }
 
 /**
···
 				   rx_buf->page_offset + headlen, size,
 				   xdp->frame_sz);
 	} else {
-		/* buffer is unused, change the act that should be taken later
-		 * on; data was copied onto skb's linear part so there's no
+		/* buffer is unused, restore biased page count in Rx buffer;
+		 * data was copied onto skb's linear part so there's no
 		 * need for adjusting page offset and we can reuse this buffer
 		 * as-is
 		 */
-		rx_buf->act = ICE_SKB_CONSUMED;
+		rx_buf->pagecnt_bias++;
 	}
 
 	if (unlikely(xdp_buff_has_frags(xdp))) {
···
 }
 
 /**
+ * ice_put_rx_mbuf - ice_put_rx_buf() caller, for all frame frags
+ * @rx_ring: Rx ring with all the auxiliary data
+ * @xdp: XDP buffer carrying linear + frags part
+ * @xdp_xmit: XDP_TX/XDP_REDIRECT verdict storage
+ * @ntc: a current next_to_clean value to be stored at rx_ring
+ * @verdict: return code from XDP program execution
+ *
+ * Walk through gathered fragments and satisfy internal page
+ * recycle mechanism; we take here an action related to verdict
+ * returned by XDP program;
+ */
+static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
+			    u32 *xdp_xmit, u32 ntc, u32 verdict)
+{
+	u32 nr_frags = rx_ring->nr_frags + 1;
+	u32 idx = rx_ring->first_desc;
+	u32 cnt = rx_ring->count;
+	u32 post_xdp_frags = 1;
+	struct ice_rx_buf *buf;
+	int i;
+
+	if (unlikely(xdp_buff_has_frags(xdp)))
+		post_xdp_frags += xdp_get_shared_info_from_buff(xdp)->nr_frags;
+
+	for (i = 0; i < post_xdp_frags; i++) {
+		buf = &rx_ring->rx_buf[idx];
+
+		if (verdict & (ICE_XDP_TX | ICE_XDP_REDIR)) {
+			ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
+			*xdp_xmit |= verdict;
+		} else if (verdict & ICE_XDP_CONSUMED) {
+			buf->pagecnt_bias++;
+		} else if (verdict == ICE_XDP_PASS) {
+			ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
+		}
+
+		ice_put_rx_buf(rx_ring, buf);
+
+		if (++idx == cnt)
+			idx = 0;
+	}
+	/* handle buffers that represented frags released by XDP prog;
+	 * for these we keep pagecnt_bias as-is; refcount from struct page
+	 * has been decremented within XDP prog and we do not have to increase
+	 * the biased refcnt
+	 */
+	for (; i < nr_frags; i++) {
+		buf = &rx_ring->rx_buf[idx];
+		ice_put_rx_buf(rx_ring, buf);
+		if (++idx == cnt)
+			idx = 0;
+	}
+
+	xdp->data = NULL;
+	rx_ring->first_desc = ntc;
+	rx_ring->nr_frags = 0;
+}
+
+/**
  * ice_clean_rx_irq - Clean completed descriptors from Rx ring - bounce buf
  * @rx_ring: Rx descriptor ring to transact packets on
  * @budget: Total limit on number of packets to process
···
 	unsigned int total_rx_bytes = 0, total_rx_pkts = 0;
 	unsigned int offset = rx_ring->rx_offset;
 	struct xdp_buff *xdp = &rx_ring->xdp;
-	u32 cached_ntc = rx_ring->first_desc;
 	struct ice_tx_ring *xdp_ring = NULL;
 	struct bpf_prog *xdp_prog = NULL;
 	u32 ntc = rx_ring->next_to_clean;
+	u32 cached_ntu, xdp_verdict;
 	u32 cnt = rx_ring->count;
 	u32 xdp_xmit = 0;
-	u32 cached_ntu;
 	bool failure;
-	u32 first;
 
 	xdp_prog = READ_ONCE(rx_ring->xdp_prog);
 	if (xdp_prog) {
···
 			xdp_prepare_buff(xdp, hard_start, offset, size, !!offset);
 			xdp_buff_clear_frags_flag(xdp);
 		} else if (ice_add_xdp_frag(rx_ring, xdp, rx_buf, size)) {
+			ice_put_rx_mbuf(rx_ring, xdp, NULL, ntc, ICE_XDP_CONSUMED);
 			break;
 		}
 		if (++ntc == cnt)
···
 		if (ice_is_non_eop(rx_ring, rx_desc))
 			continue;
 
-		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc);
-		if (rx_buf->act == ICE_XDP_PASS)
+		ice_get_pgcnts(rx_ring);
+		xdp_verdict = ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_desc);
+		if (xdp_verdict == ICE_XDP_PASS)
 			goto construct_skb;
 		total_rx_bytes += xdp_get_buff_len(xdp);
 		total_rx_pkts++;
 
-		xdp->data = NULL;
-		rx_ring->first_desc = ntc;
-		rx_ring->nr_frags = 0;
+		ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict);
+
 		continue;
construct_skb:
 		if (likely(ice_ring_uses_build_skb(rx_ring)))
···
 		/* exit if we failed to retrieve a buffer */
 		if (!skb) {
 			rx_ring->ring_stats->rx_stats.alloc_page_failed++;
-			rx_buf->act = ICE_XDP_CONSUMED;
-			if (unlikely(xdp_buff_has_frags(xdp)))
-				ice_set_rx_bufs_act(xdp, rx_ring,
-						    ICE_XDP_CONSUMED);
-			xdp->data = NULL;
-			rx_ring->first_desc = ntc;
-			rx_ring->nr_frags = 0;
-			break;
+			xdp_verdict = ICE_XDP_CONSUMED;
 		}
-		xdp->data = NULL;
-		rx_ring->first_desc = ntc;
-		rx_ring->nr_frags = 0;
+		ice_put_rx_mbuf(rx_ring, xdp, &xdp_xmit, ntc, xdp_verdict);
+
+		if (!skb)
+			break;
 
 		stat_err_bits = BIT(ICE_RX_FLEX_DESC_STATUS0_RXE_S);
 		if (unlikely(ice_test_staterr(rx_desc->wb.status_error0,
···
 		total_rx_pkts++;
 	}
 
-	first = rx_ring->first_desc;
-	while (cached_ntc != first) {
-		struct ice_rx_buf *buf = &rx_ring->rx_buf[cached_ntc];
-
-		if (buf->act & (ICE_XDP_TX | ICE_XDP_REDIR)) {
-			ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
-			xdp_xmit |= buf->act;
-		} else if (buf->act & ICE_XDP_CONSUMED) {
-			buf->pagecnt_bias++;
-		} else if (buf->act == ICE_XDP_PASS) {
-			ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
-		}
-
-		ice_put_rx_buf(rx_ring, buf);
-		if (++cached_ntc >= cnt)
-			cached_ntc = 0;
-	}
 	rx_ring->next_to_clean = ntc;
 	/* return up to cleaned_count buffers to hardware */
 	failure = ice_alloc_rx_bufs(rx_ring, ICE_RX_DESC_UNUSED(rx_ring));
-1
drivers/net/ethernet/intel/ice/ice_txrx.h
···
 	struct page *page;
 	unsigned int page_offset;
 	unsigned int pgcnt;
-	unsigned int act;
 	unsigned int pagecnt_bias;
 };
 
-43
drivers/net/ethernet/intel/ice/ice_txrx_lib.h
···
 #include "ice.h"
 
 /**
- * ice_set_rx_bufs_act - propagate Rx buffer action to frags
- * @xdp: XDP buffer representing frame (linear and frags part)
- * @rx_ring: Rx ring struct
- * act: action to store onto Rx buffers related to XDP buffer parts
- *
- * Set action that should be taken before putting Rx buffer from first frag
- * to the last.
- */
-static inline void
-ice_set_rx_bufs_act(struct xdp_buff *xdp, const struct ice_rx_ring *rx_ring,
-		    const unsigned int act)
-{
-	u32 sinfo_frags = xdp_get_shared_info_from_buff(xdp)->nr_frags;
-	u32 nr_frags = rx_ring->nr_frags + 1;
-	u32 idx = rx_ring->first_desc;
-	u32 cnt = rx_ring->count;
-	struct ice_rx_buf *buf;
-
-	for (int i = 0; i < nr_frags; i++) {
-		buf = &rx_ring->rx_buf[idx];
-		buf->act = act;
-
-		if (++idx == cnt)
-			idx = 0;
-	}
-
-	/* adjust pagecnt_bias on frags freed by XDP prog */
-	if (sinfo_frags < rx_ring->nr_frags && act == ICE_XDP_CONSUMED) {
-		u32 delta = rx_ring->nr_frags - sinfo_frags;
-
-		while (delta) {
-			if (idx == 0)
-				idx = cnt - 1;
-			else
-				idx--;
-			buf = &rx_ring->rx_buf[idx];
-			buf->pagecnt_bias--;
-			delta--;
-		}
-	}
-}
-
-/**
  * ice_test_staterr - tests bits in Rx descriptor status and error fields
  * @status_err_n: Rx descriptor status_error0 or status_error1 bits
  * @stat_err_bits: value to mask
···
 		/* hand second half of page back to the ring */
 		ixgbe_reuse_rx_page(rx_ring, rx_buffer);
 	} else {
-		if (!IS_ERR(skb) && IXGBE_CB(skb)->dma == rx_buffer->dma) {
+		if (skb && IXGBE_CB(skb)->dma == rx_buffer->dma) {
 			/* the page has been released from the ring */
 			IXGBE_CB(skb)->page_released = true;
 		} else {
···
 	/* Allow the MAC to stop its clock if the PHY has the capability */
 	pl->mac_tx_clk_stop = phy_eee_tx_clock_stop_capable(phy) > 0;
 
-	/* Explicitly configure whether the PHY is allowed to stop it's
-	 * receive clock.
-	 */
-	ret = phy_eee_rx_clock_stop(phy, pl->config->eee_rx_clk_stop_enable);
-	if (ret == -EOPNOTSUPP)
-		ret = 0;
+	if (pl->mac_supports_eee_ops) {
+		/* Explicitly configure whether the PHY is allowed to stop it's
+		 * receive clock.
+		 */
+		ret = phy_eee_rx_clock_stop(phy,
+					    pl->config->eee_rx_clk_stop_enable);
+		if (ret == -EOPNOTSUPP)
+			ret = 0;
+	}
 
 	return ret;
 }
+2-2
drivers/net/pse-pd/pse_core.c
···
 		goto out;
 	mW = ret;
 
-	ret = pse_pi_get_voltage(rdev);
+	ret = _pse_pi_get_voltage(rdev);
 	if (!ret) {
 		dev_err(pcdev->dev, "Voltage null\n");
 		ret = -ERANGE;
···
 
 	id = rdev_get_id(rdev);
 	mutex_lock(&pcdev->lock);
-	ret = pse_pi_get_voltage(rdev);
+	ret = _pse_pi_get_voltage(rdev);
 	if (!ret) {
 		dev_err(pcdev->dev, "Voltage null\n");
 		ret = -ERANGE;
+3-1
drivers/net/team/team_core.c
···
 		ctx.data.u32_val = nla_get_u32(attr_data);
 		break;
 	case TEAM_OPTION_TYPE_STRING:
-		if (nla_len(attr_data) > TEAM_STRING_MAX_LEN) {
+		if (nla_len(attr_data) > TEAM_STRING_MAX_LEN ||
+		    !memchr(nla_data(attr_data), '\0',
+			    nla_len(attr_data))) {
 			err = -EINVAL;
 			goto team_put;
 		}
···
 	if (likely(cpu < tq_number))
 		tq = &adapter->tx_queue[cpu];
 	else
-		tq = &adapter->tx_queue[reciprocal_scale(cpu, tq_number)];
+		tq = &adapter->tx_queue[cpu % tq_number];
 
 	return tq;
 }
···
 	u32 buf_size;
 	u32 dw2;
 
+	spin_lock_irq(&tq->tx_lock);
 	dw2 = (tq->tx_ring.gen ^ 0x1) << VMXNET3_TXD_GEN_SHIFT;
 	dw2 |= xdpf->len;
 	ctx.sop_txd = tq->tx_ring.base + tq->tx_ring.next2fill;
···
 
 	if (vmxnet3_cmd_ring_desc_avail(&tq->tx_ring) == 0) {
 		tq->stats.tx_ring_full++;
+		spin_unlock_irq(&tq->tx_lock);
 		return -ENOSPC;
 	}
 
···
 		tbi->dma_addr = dma_map_single(&adapter->pdev->dev,
 					       xdpf->data, buf_size,
 					       DMA_TO_DEVICE);
-		if (dma_mapping_error(&adapter->pdev->dev, tbi->dma_addr))
+		if (dma_mapping_error(&adapter->pdev->dev, tbi->dma_addr)) {
+			spin_unlock_irq(&tq->tx_lock);
 			return -EFAULT;
+		}
 		tbi->map_type |= VMXNET3_MAP_SINGLE;
 	} else { /* XDP buffer from page pool */
 		page = virt_to_page(xdpf->data);
···
 	dma_wmb();
 	gdesc->dword[2] = cpu_to_le32(le32_to_cpu(gdesc->dword[2]) ^
 				      VMXNET3_TXD_GEN);
+	spin_unlock_irq(&tq->tx_lock);
 
 	/* No need to handle the case when tx_num_deferred doesn't reach
 	 * threshold. Backend driver at hypervisor side will poll and reset
···
 {
 	struct vmxnet3_adapter *adapter = netdev_priv(dev);
 	struct vmxnet3_tx_queue *tq;
+	struct netdev_queue *nq;
 	int i;
 
 	if (unlikely(test_bit(VMXNET3_STATE_BIT_QUIESCED, &adapter->state)))
···
 	if (tq->stopped)
 		return -ENETDOWN;
 
+	nq = netdev_get_tx_queue(adapter->netdev, tq->qid);
+
+	__netif_tx_lock(nq, smp_processor_id());
 	for (i = 0; i < n; i++) {
 		if (vmxnet3_xdp_xmit_frame(adapter, frames[i], tq, true)) {
 			tq->stats.xdp_xmit_err++;
···
 		}
 	}
 	tq->stats.xdp_xmit += i;
+	__netif_tx_unlock(nq);
 
 	return i;
 }
+5-2
drivers/net/vxlan/vxlan_core.c
···
 	struct vxlan_dev *vxlan = netdev_priv(dev);
 	int err;
 
-	if (vxlan->cfg.flags & VXLAN_F_VNIFILTER)
-		vxlan_vnigroup_init(vxlan);
+	if (vxlan->cfg.flags & VXLAN_F_VNIFILTER) {
+		err = vxlan_vnigroup_init(vxlan);
+		if (err)
+			return err;
+	}
 
 	err = gro_cells_init(&vxlan->gro_cells, dev);
 	if (err)
+45-16
drivers/net/wireless/ath/ath12k/wmi.c
···
 	return reg_rule_ptr;
 }
 
+static u8 ath12k_wmi_ignore_num_extra_rules(struct ath12k_wmi_reg_rule_ext_params *rule,
+					    u32 num_reg_rules)
+{
+	u8 num_invalid_5ghz_rules = 0;
+	u32 count, start_freq;
+
+	for (count = 0; count < num_reg_rules; count++) {
+		start_freq = le32_get_bits(rule[count].freq_info, REG_RULE_START_FREQ);
+
+		if (start_freq >= ATH12K_MIN_6G_FREQ)
+			num_invalid_5ghz_rules++;
+	}
+
+	return num_invalid_5ghz_rules;
+}
+
 static int ath12k_pull_reg_chan_list_ext_update_ev(struct ath12k_base *ab,
 						   struct sk_buff *skb,
 						   struct ath12k_reg_info *reg_info)
···
 	u32 num_2g_reg_rules, num_5g_reg_rules;
 	u32 num_6g_reg_rules_ap[WMI_REG_CURRENT_MAX_AP_TYPE];
 	u32 num_6g_reg_rules_cl[WMI_REG_CURRENT_MAX_AP_TYPE][WMI_REG_MAX_CLIENT_TYPE];
+	u8 num_invalid_5ghz_ext_rules;
 	u32 total_reg_rules = 0;
 	int ret, i, j;
···
 	}
 
 	memcpy(reg_info->alpha2, &ev->alpha2, REG_ALPHA2_LEN);
-
-	/* FIXME: Currently FW includes 6G reg rule also in 5G rule
-	 * list for country US.
-	 * Having same 6G reg rule in 5G and 6G rules list causes
-	 * intersect check to be true, and same rules will be shown
-	 * multiple times in iw cmd. So added hack below to avoid
-	 * parsing 6G rule from 5G reg rule list, and this can be
-	 * removed later, after FW updates to remove 6G reg rule
-	 * from 5G rules list.
-	 */
-	if (memcmp(reg_info->alpha2, "US", 2) == 0) {
-		reg_info->num_5g_reg_rules = REG_US_5G_NUM_REG_RULES;
-		num_5g_reg_rules = reg_info->num_5g_reg_rules;
-	}
 
 	reg_info->dfs_region = le32_to_cpu(ev->dfs_region);
 	reg_info->phybitmap = le32_to_cpu(ev->phybitmap);
···
 		}
 	}
 
+	ext_wmi_reg_rule += num_2g_reg_rules;
+
+	/* Firmware might include 6 GHz reg rule in 5 GHz rule list
+	 * for few countries along with separate 6 GHz rule.
+	 * Having same 6 GHz reg rule in 5 GHz and 6 GHz rules list
+	 * causes intersect check to be true, and same rules will be
+	 * shown multiple times in iw cmd.
+	 * Hence, avoid parsing 6 GHz rule from 5 GHz reg rule list
+	 */
+	num_invalid_5ghz_ext_rules = ath12k_wmi_ignore_num_extra_rules(ext_wmi_reg_rule,
+								       num_5g_reg_rules);
+
+	if (num_invalid_5ghz_ext_rules) {
+		ath12k_dbg(ab, ATH12K_DBG_WMI,
+			   "CC: %s 5 GHz reg rules number %d from fw, %d number of invalid 5 GHz rules",
+			   reg_info->alpha2, reg_info->num_5g_reg_rules,
+			   num_invalid_5ghz_ext_rules);
+
+		num_5g_reg_rules = num_5g_reg_rules - num_invalid_5ghz_ext_rules;
+		reg_info->num_5g_reg_rules = num_5g_reg_rules;
+	}
+
 	if (num_5g_reg_rules) {
-		ext_wmi_reg_rule += num_2g_reg_rules;
 		reg_info->reg_rules_5g_ptr =
 			create_ext_reg_rules_from_wmi(num_5g_reg_rules,
 						      ext_wmi_reg_rule);
···
 		}
 	}
 
-	ext_wmi_reg_rule += num_5g_reg_rules;
+	/* We have adjusted the number of 5 GHz reg rules above. But still those
+	 * many rules needs to be adjusted in ext_wmi_reg_rule.
+	 *
+	 * NOTE: num_invalid_5ghz_ext_rules will be 0 for rest other cases.
+	 */
+	ext_wmi_reg_rule += (num_5g_reg_rules + num_invalid_5ghz_ext_rules);
 
 	for (i = 0; i < WMI_REG_CURRENT_MAX_AP_TYPE; i++) {
 		reg_info->reg_rules_6g_ap_ptr[i] =
···
 
 	status = nvme_set_features(ctrl, NVME_FEAT_NUM_QUEUES, q_count, NULL, 0,
 			&result);
-	if (status < 0)
+
+	/*
+	 * It's either a kernel error or the host observed a connection
+	 * lost. In either case it's not possible communicate with the
+	 * controller and thus enter the error code path.
+	 */
+	if (status < 0 || status == NVME_SC_HOST_PATH_ERROR)
 		return status;
 
 	/*
+25-10
drivers/nvme/host/fc.c
···
 static void
 nvme_fc_ctrl_connectivity_loss(struct nvme_fc_ctrl *ctrl)
 {
+	enum nvme_ctrl_state state;
+	unsigned long flags;
+
 	dev_info(ctrl->ctrl.device,
 		"NVME-FC{%d}: controller connectivity lost. Awaiting "
 		"Reconnect", ctrl->cnum);
 
-	switch (nvme_ctrl_state(&ctrl->ctrl)) {
+	spin_lock_irqsave(&ctrl->lock, flags);
+	set_bit(ASSOC_FAILED, &ctrl->flags);
+	state = nvme_ctrl_state(&ctrl->ctrl);
+	spin_unlock_irqrestore(&ctrl->lock, flags);
+
+	switch (state) {
 	case NVME_CTRL_NEW:
 	case NVME_CTRL_LIVE:
 		/*
···
 	nvme_fc_complete_rq(rq);
 
check_error:
-	if (terminate_assoc && ctrl->ctrl.state != NVME_CTRL_RESETTING)
+	if (terminate_assoc &&
+	    nvme_ctrl_state(&ctrl->ctrl) != NVME_CTRL_RESETTING)
 		queue_work(nvme_reset_wq, &ctrl->ioerr_work);
 }
···
 static void
 nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl, char *errmsg)
 {
+	enum nvme_ctrl_state state = nvme_ctrl_state(&ctrl->ctrl);
+
 	/*
 	 * if an error (io timeout, etc) while (re)connecting, the remote
 	 * port requested terminating of the association (disconnect_ls)
···
 	 * the controller. Abort any ios on the association and let the
 	 * create_association error path resolve things.
 	 */
-	if (ctrl->ctrl.state == NVME_CTRL_CONNECTING) {
+	if (state == NVME_CTRL_CONNECTING) {
 		__nvme_fc_abort_outstanding_ios(ctrl, true);
-		set_bit(ASSOC_FAILED, &ctrl->flags);
 		dev_warn(ctrl->ctrl.device,
 			"NVME-FC{%d}: transport error during (re)connect\n",
 			ctrl->cnum);
···
 	}
 
 	/* Otherwise, only proceed if in LIVE state - e.g. on first error */
-	if (ctrl->ctrl.state != NVME_CTRL_LIVE)
+	if (state != NVME_CTRL_LIVE)
 		return;
 
 	dev_warn(ctrl->ctrl.device,
···
 		else
 			ret = nvme_fc_recreate_io_queues(ctrl);
 	}
-	if (!ret && test_bit(ASSOC_FAILED, &ctrl->flags))
-		ret = -EIO;
 	if (ret)
 		goto out_term_aen_ops;
 
-	changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);
+	spin_lock_irqsave(&ctrl->lock, flags);
+	if (!test_bit(ASSOC_FAILED, &ctrl->flags))
+		changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);
+	else
+		ret = -EIO;
+	spin_unlock_irqrestore(&ctrl->lock, flags);
+
+	if (ret)
+		goto out_term_aen_ops;
 
 	ctrl->ctrl.nr_reconnects = 0;
···
 	list_add_tail(&ctrl->ctrl_list, &rport->ctrl_list);
 	spin_unlock_irqrestore(&rport->lock, flags);
 
-	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING) ||
-	    !nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
+	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
 		dev_err(ctrl->ctrl.device,
 			"NVME-FC{%d}: failed to init ctrl state\n", ctrl->cnum);
 		goto fail_ctrl;
+3-9
drivers/nvme/host/pci.c
···
 	return 0;
 
out_free_bufs:
-	while (--i >= 0) {
-		size_t size = le32_to_cpu(descs[i].size) * NVME_CTRL_PAGE_SIZE;
-
-		dma_free_attrs(dev->dev, size, bufs[i],
-			       le64_to_cpu(descs[i].addr),
-			       DMA_ATTR_NO_KERNEL_MAPPING | DMA_ATTR_NO_WARN);
-	}
-
 	kfree(bufs);
out_free_descs:
 	dma_free_coherent(dev->dev, descs_size, descs, descs_dma);
···
 		 * because of high power consumption (> 2 Watt) in s2idle
 		 * sleep. Only some boards with Intel CPU are affected.
 		 */
-		if (dmi_match(DMI_BOARD_NAME, "GMxPXxx") ||
+		if (dmi_match(DMI_BOARD_NAME, "DN50Z-140HC-YD") ||
+		    dmi_match(DMI_BOARD_NAME, "GMxPXxx") ||
+		    dmi_match(DMI_BOARD_NAME, "GXxMRXx") ||
 		    dmi_match(DMI_BOARD_NAME, "PH4PG31") ||
 		    dmi_match(DMI_BOARD_NAME, "PH4PRX1_PH6PRX1") ||
 		    dmi_match(DMI_BOARD_NAME, "PH6PG01_PH6PG71"))
···
 	pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL2, cap++);
 	pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL1, cap++);
 
-	if (parent->state_saved)
-		return;
-
 	/*
 	 * Save parent's L1 substate configuration so we have it for
 	 * pci_restore_aspm_l1ss_state(pdev) to restore.
+3-2
drivers/pci/probe.c
···
 	return (res->flags & IORESOURCE_MEM_64) ? 1 : 0;
 }
 
-static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
+static __always_inline void pci_read_bases(struct pci_dev *dev,
+					   unsigned int howmany, int rom)
 {
 	u32 rombar, stdbars[PCI_STD_NUM_BARS];
 	unsigned int pos, reg;
 	u16 orig_cmd;
 
-	BUILD_BUG_ON(howmany > PCI_STD_NUM_BARS);
+	BUILD_BUG_ON(statically_true(howmany > PCI_STD_NUM_BARS));
 
 	if (dev->non_compliant_bars)
 		return;
···
  * IFS Image
  * ---------
  *
- * Intel provides a firmware file containing the scan tests via
- * github [#f1]_.  Similar to microcode there is a separate file for each
+ * Intel provides firmware files containing the scan tests via the webpage [#f1]_.
+ * Look under "In-Field Scan Test Images Download" section towards the
+ * end of the page. Similar to microcode, there are separate files for each
  * family-model-stepping. IFS Images are not applicable for some test types.
  * Wherever applicable the sysfs directory would provide a "current_batch" file
  * (see below) for loading the image.
  *
+ * .. [#f1] https://intel.com/InFieldScan
  *
  * IFS Image Loading
  * -----------------
···
  *
  * 2) Hardware allows for some number of cores to be tested in parallel.
  * The driver does not make use of this, it only tests one core at a time.
- *
- * .. [#f1] https://github.com/intel/TBD
- *
  *
  * Structural Based Functional Test at Field (SBAF):
  * -------------------------------------------------
+65-20
drivers/platform/x86/intel/int3472/discrete.c
···
 /* Author: Dan Scally <djrscally@gmail.com> */
 
 #include <linux/acpi.h>
+#include <linux/array_size.h>
 #include <linux/bitfield.h>
 #include <linux/device.h>
 #include <linux/gpio/consumer.h>
···
 
 static int skl_int3472_fill_gpiod_lookup(struct gpiod_lookup *table_entry,
 					 struct acpi_resource_gpio *agpio,
-					 const char *func, u32 polarity)
+					 const char *func, unsigned long gpio_flags)
 {
 	char *path = agpio->resource_source.string_ptr;
 	struct acpi_device *adev;
···
 	if (!adev)
 		return -ENODEV;
 
-	*table_entry = GPIO_LOOKUP(acpi_dev_name(adev), agpio->pin_table[0], func, polarity);
+	*table_entry = GPIO_LOOKUP(acpi_dev_name(adev), agpio->pin_table[0], func, gpio_flags);
 
 	return 0;
 }
 
 static int skl_int3472_map_gpio_to_sensor(struct int3472_discrete_device *int3472,
 					  struct acpi_resource_gpio *agpio,
-					  const char *func, u32 polarity)
+					  const char *func, unsigned long gpio_flags)
 {
 	int ret;
···
 	}
 
 	ret = skl_int3472_fill_gpiod_lookup(&int3472->gpios.table[int3472->n_sensor_gpios],
-					    agpio, func, polarity);
+					    agpio, func, gpio_flags);
 	if (ret)
 		return ret;
···
 static struct gpio_desc *
 skl_int3472_gpiod_get_from_temp_lookup(struct int3472_discrete_device *int3472,
 				       struct acpi_resource_gpio *agpio,
-				       const char *func, u32 polarity)
+				       const char *func, unsigned long gpio_flags)
 {
 	struct gpio_desc *desc;
 	int ret;
···
 		return ERR_PTR(-ENOMEM);
 
 	lookup->dev_id = dev_name(int3472->dev);
-	ret = skl_int3472_fill_gpiod_lookup(&lookup->table[0], agpio, func, polarity);
+	ret = skl_int3472_fill_gpiod_lookup(&lookup->table[0], agpio, func, gpio_flags);
 	if (ret)
 		return ERR_PTR(ret);
···
 	return desc;
 }
 
-static void int3472_get_func_and_polarity(u8 type, const char **func, u32 *polarity)
+/**
+ * struct int3472_gpio_map - Map GPIOs to whatever is expected by the
+ * sensor driver (as in DT bindings)
+ * @hid: The ACPI HID of the device without the instance number e.g. INT347E
+ * @type_from: The GPIO type from ACPI ?SDT
+ * @type_to: The assigned GPIO type, typically same as @type_from
+ * @func: The function, e.g. "enable"
+ * @polarity_low: GPIO_ACTIVE_LOW true if the @polarity_low is true,
+ * GPIO_ACTIVE_HIGH otherwise
+ */
+struct int3472_gpio_map {
+	const char *hid;
+	u8 type_from;
+	u8 type_to;
+	bool polarity_low;
+	const char *func;
+};
+
+static const struct int3472_gpio_map int3472_gpio_map[] = {
+	{ "INT347E", INT3472_GPIO_TYPE_RESET, INT3472_GPIO_TYPE_RESET, false, "enable" },
+};
+
+static void int3472_get_func_and_polarity(struct acpi_device *adev, u8 *type,
+					  const char **func, unsigned long *gpio_flags)
 {
-	switch (type) {
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(int3472_gpio_map); i++) {
+		/*
+		 * Map the firmware-provided GPIO to whatever a driver expects
+		 * (as in DT bindings). First check if the type matches with the
+		 * GPIO map, then further check that the device _HID matches.
+		 */
+		if (*type != int3472_gpio_map[i].type_from)
+			continue;
+
+		if (!acpi_dev_hid_uid_match(adev, int3472_gpio_map[i].hid, NULL))
+			continue;
+
+		*type = int3472_gpio_map[i].type_to;
+		*gpio_flags = int3472_gpio_map[i].polarity_low ?
+			      GPIO_ACTIVE_LOW : GPIO_ACTIVE_HIGH;
+		*func = int3472_gpio_map[i].func;
+		return;
+	}
+
+	switch (*type) {
 	case INT3472_GPIO_TYPE_RESET:
 		*func = "reset";
-		*polarity = GPIO_ACTIVE_LOW;
+		*gpio_flags = GPIO_ACTIVE_LOW;
 		break;
 	case INT3472_GPIO_TYPE_POWERDOWN:
 		*func = "powerdown";
-		*polarity = GPIO_ACTIVE_LOW;
+		*gpio_flags = GPIO_ACTIVE_LOW;
 		break;
 	case INT3472_GPIO_TYPE_CLK_ENABLE:
 		*func = "clk-enable";
-		*polarity = GPIO_ACTIVE_HIGH;
+		*gpio_flags = GPIO_ACTIVE_HIGH;
 		break;
 	case INT3472_GPIO_TYPE_PRIVACY_LED:
 		*func = "privacy-led";
-		*polarity = GPIO_ACTIVE_HIGH;
+		*gpio_flags = GPIO_ACTIVE_HIGH;
 		break;
 	case INT3472_GPIO_TYPE_POWER_ENABLE:
 		*func = "power-enable";
-		*polarity = GPIO_ACTIVE_HIGH;
+		*gpio_flags = GPIO_ACTIVE_HIGH;
 		break;
 	default:
 		*func = "unknown";
-		*polarity = GPIO_ACTIVE_HIGH;
+		*gpio_flags = GPIO_ACTIVE_HIGH;
 		break;
 	}
 }
···
 	struct gpio_desc *gpio;
 	const char *err_msg;
 	const char *func;
-	u32 polarity;
+	unsigned long gpio_flags;
 	int ret;
 
 	if (!acpi_gpio_get_io_resource(ares, &agpio))
···
 
 	type = FIELD_GET(INT3472_GPIO_DSM_TYPE, obj->integer.value);
 
-	int3472_get_func_and_polarity(type, &func, &polarity);
+	int3472_get_func_and_polarity(int3472->sensor, &type, &func, &gpio_flags);
 
 	pin = FIELD_GET(INT3472_GPIO_DSM_PIN, obj->integer.value);
 	/* Pin field is not really used under Windows and wraps around at 8 bits */
···
 
 	active_value = FIELD_GET(INT3472_GPIO_DSM_SENSOR_ON_VAL, obj->integer.value);
 	if (!active_value)
-		polarity ^= GPIO_ACTIVE_LOW;
+		gpio_flags ^= GPIO_ACTIVE_LOW;
 
 	dev_dbg(int3472->dev, "%s %s pin %d active-%s\n", func,
 		agpio->resource_source.string_ptr, agpio->pin_table[0],
-		str_high_low(polarity == GPIO_ACTIVE_HIGH));
+		str_high_low(gpio_flags == GPIO_ACTIVE_HIGH));
 
 	switch (type) {
 	case INT3472_GPIO_TYPE_RESET:
 	case INT3472_GPIO_TYPE_POWERDOWN:
-		ret = skl_int3472_map_gpio_to_sensor(int3472, agpio, func, polarity);
+		ret = skl_int3472_map_gpio_to_sensor(int3472, agpio, func, gpio_flags);
 		if (ret)
 			err_msg = "Failed to map GPIO pin to sensor\n";
···
 	case INT3472_GPIO_TYPE_CLK_ENABLE:
 	case INT3472_GPIO_TYPE_PRIVACY_LED:
 	case INT3472_GPIO_TYPE_POWER_ENABLE:
-		gpio = skl_int3472_gpiod_get_from_temp_lookup(int3472, agpio, func, polarity);
+		gpio = skl_int3472_gpiod_get_from_temp_lookup(int3472, agpio, func, gpio_flags);
 		if (IS_ERR(gpio)) {
 			ret = PTR_ERR(gpio);
 			err_msg = "Failed to get GPIO\n";
···
 
 #define FAN_NS_CTRL_STATUS	BIT(2)		/* Bit which determines control is enabled or not */
 #define FAN_NS_CTRL		BIT(4)		/* Bit which determines control is by host or EC */
+#define FAN_CLOCK_TPM		(22500*60)	/* Ticks per minute for a 22.5 kHz clock */
 
 enum {					/* Fan control constants */
 	fan_status_offset = 0x2f,	/* EC register 0x2f */
···
 
 static bool fan_with_ns_addr;
 static bool ecfw_with_fan_dec_rpm;
+static bool fan_speed_in_tpr;
 
 static struct mutex fan_mutex;
 
···
 			     !acpi_ec_read(fan_rpm_offset + 1, &hi)))
 			return -EIO;
 
-		if (likely(speed))
+		if (likely(speed)) {
 			*speed = (hi << 8) | lo;
+			if (fan_speed_in_tpr && *speed != 0)
+				*speed = FAN_CLOCK_TPM / *speed;
+		}
 		break;
 	case TPACPI_FAN_RD_TPEC_NS:
 		if (!acpi_ec_read(fan_rpm_status_ns, &lo))
···
 		if (rc)
 			return -EIO;
 
-		if (likely(speed))
+		if (likely(speed)) {
 			*speed = (hi << 8) | lo;
+			if (fan_speed_in_tpr && *speed != 0)
+				*speed = FAN_CLOCK_TPM / *speed;
+		}
 		break;
 
 	case TPACPI_FAN_RD_TPEC_NS:
···
 #define TPACPI_FAN_NOFAN	0x0008		/* no fan available */
 #define TPACPI_FAN_NS		0x0010		/* For EC with non-Standard register addresses */
 #define TPACPI_FAN_DECRPM	0x0020		/* For ECFW's with RPM in register as decimal */
+#define TPACPI_FAN_TPR		0x0040		/* Fan speed is in Ticks Per Revolution */
 
 static const struct tpacpi_quirk fan_quirk_table[] __initconst = {
 	TPACPI_QEC_IBM('1', 'Y', TPACPI_FAN_Q1),
···
 	TPACPI_Q_LNV3('R', '0', 'V', TPACPI_FAN_NS),	/* 11e Gen5 KL-Y */
 	TPACPI_Q_LNV3('N', '1', 'O', TPACPI_FAN_NOFAN),	/* X1 Tablet (2nd gen) */
 	TPACPI_Q_LNV3('R', '0', 'Q', TPACPI_FAN_DECRPM),/* L480 */
+	TPACPI_Q_LNV('8', 'F', TPACPI_FAN_TPR),		/* ThinkPad x120e */
 };
 
 static int __init fan_init(struct ibm_init_struct *iibm)
···
 
 		if (quirks & TPACPI_FAN_Q1)
 			fan_quirk1_setup();
+		if (quirks & TPACPI_FAN_TPR)
+			fan_speed_in_tpr = true;
 		/* Try and probe the 2nd fan */
 		tp_features.second_fan = 1; /* needed for get_speed to work */
 		res = fan2_get_speed(&speed);
···
 #define DYTC_MODE_PSC_BALANCE	5  /* Default mode aka balanced */
 #define DYTC_MODE_PSC_PERFORM	7  /* High power mode aka performance */
 
+#define DYTC_MODE_PSCV9_LOWPOWER 1  /* Low power mode */
+#define DYTC_MODE_PSCV9_BALANCE	3  /* Default mode aka balanced */
+#define DYTC_MODE_PSCV9_PERFORM	4  /* High power mode aka performance */
+
 #define DYTC_ERR_MASK       0xF  /* Bits 0-3 in cmd result are the error result */
 #define DYTC_ERR_SUCCESS      1  /* CMD completed successful */
 
···
 static int dytc_capabilities;
 static bool dytc_mmc_get_available;
 static int profile_force;
+
+static int platform_psc_profile_lowpower = DYTC_MODE_PSC_LOWPOWER;
+static int platform_psc_profile_balanced = DYTC_MODE_PSC_BALANCE;
+static int platform_psc_profile_performance = DYTC_MODE_PSC_PERFORM;
 
 static int convert_dytc_to_profile(int funcmode, int dytcmode,
 		enum platform_profile_option *profile)
···
 		}
 		return 0;
 	case DYTC_FUNCTION_PSC:
-		switch (dytcmode) {
-		case DYTC_MODE_PSC_LOWPOWER:
+		if (dytcmode == platform_psc_profile_lowpower)
 			*profile = PLATFORM_PROFILE_LOW_POWER;
-			break;
-		case DYTC_MODE_PSC_BALANCE:
+		else if (dytcmode == platform_psc_profile_balanced)
 			*profile = PLATFORM_PROFILE_BALANCED;
-			break;
-		case DYTC_MODE_PSC_PERFORM:
+		else if (dytcmode == platform_psc_profile_performance)
 			*profile = PLATFORM_PROFILE_PERFORMANCE;
-			break;
-		default: /* Unknown mode */
+		else
 			return -EINVAL;
-		}
+
 		return 0;
 	case DYTC_FUNCTION_AMT:
 		/* For now return balanced. It's the closest we have to 'auto' */
···
 		if (dytc_capabilities & BIT(DYTC_FC_MMC))
 			*perfmode = DYTC_MODE_MMC_LOWPOWER;
 		else if (dytc_capabilities & BIT(DYTC_FC_PSC))
-			*perfmode = DYTC_MODE_PSC_LOWPOWER;
+			*perfmode = platform_psc_profile_lowpower;
 		break;
 	case PLATFORM_PROFILE_BALANCED:
 		if (dytc_capabilities & BIT(DYTC_FC_MMC))
 			*perfmode = DYTC_MODE_MMC_BALANCE;
 		else if (dytc_capabilities & BIT(DYTC_FC_PSC))
-			*perfmode = DYTC_MODE_PSC_BALANCE;
+			*perfmode = platform_psc_profile_balanced;
 		break;
 	case PLATFORM_PROFILE_PERFORMANCE:
 		if (dytc_capabilities & BIT(DYTC_FC_MMC))
 			*perfmode = DYTC_MODE_MMC_PERFORM;
 		else if (dytc_capabilities & BIT(DYTC_FC_PSC))
-			*perfmode = DYTC_MODE_PSC_PERFORM;
+			*perfmode = platform_psc_profile_performance;
 		break;
 	default: /* Unknown profile */
 		return -EOPNOTSUPP;
···
 	if (output & BIT(DYTC_QUERY_ENABLE_BIT))
 		dytc_version = (output >> DYTC_QUERY_REV_BIT) & 0xF;
 
+	dbg_printk(TPACPI_DBG_INIT, "DYTC version %d\n", dytc_version);
 	/* Check DYTC is enabled and supports mode setting */
 	if (dytc_version < 5)
 		return -ENODEV;
···
 		}
 	} else if (dytc_capabilities & BIT(DYTC_FC_PSC)) { /* PSC MODE */
 		pr_debug("PSC is supported\n");
+		if (dytc_version >= 9) { /* update profiles for DYTC 9 and up */
+			platform_psc_profile_lowpower = DYTC_MODE_PSCV9_LOWPOWER;
+			platform_psc_profile_balanced = DYTC_MODE_PSCV9_BALANCE;
+			platform_psc_profile_performance = DYTC_MODE_PSCV9_PERFORM;
+		}
 	} else {
 		dbg_printk(TPACPI_DBG_INIT, "No DYTC support available\n");
 		return -ENODEV;
···
 		"DYTC version %d: thermal mode available\n", dytc_version);
 
 	/* Create platform_profile structure and register */
-	tpacpi_pprof = devm_platform_profile_register(&tpacpi_pdev->dev, "thinkpad-acpi",
-						      NULL, &dytc_profile_ops);
+	tpacpi_pprof = platform_profile_register(&tpacpi_pdev->dev, "thinkpad-acpi-profile",
+						 NULL, &dytc_profile_ops);
 	/*
 	 * If for some reason platform_profiles aren't enabled
 	 * don't quit terminally.
···
 	return 0;
 }
 
+static void dytc_profile_exit(void)
+{
+	if (!IS_ERR_OR_NULL(tpacpi_pprof))
+		platform_profile_remove(tpacpi_pprof);
+}
+
 static struct ibm_struct  dytc_profile_driver_data = {
 	.name = "dytc-profile",
+	.exit = dytc_profile_exit,
 };
 
 /*************************************************************************
+1-2
drivers/powercap/powercap_sys.c
···627627 dev_set_name(&control_type->dev, "%s", name);628628 result = device_register(&control_type->dev);629629 if (result) {630630- if (control_type->allocated)631631- kfree(control_type);630630+ put_device(&control_type->dev);632631 return ERR_PTR(result);633632 }634633 idr_init(&control_type->idr);
+21-26
drivers/ptp/ptp_vmclock.c
···414414}415415416416static const struct file_operations vmclock_miscdev_fops = {417417+ .owner = THIS_MODULE,417418 .mmap = vmclock_miscdev_mmap,418419 .read = vmclock_miscdev_read,419420};420421421422/* module operations */422423423423-static void vmclock_remove(struct platform_device *pdev)424424+static void vmclock_remove(void *data)424425{425425- struct device *dev = &pdev->dev;426426- struct vmclock_state *st = dev_get_drvdata(dev);426426+ struct vmclock_state *st = data;427427428428 if (st->ptp_clock)429429 ptp_clock_unregister(st->ptp_clock);···506506507507 if (ret) {508508 dev_info(dev, "Failed to obtain physical address: %d\n", ret);509509- goto out;509509+ return ret;510510 }511511512512 if (resource_size(&st->res) < VMCLOCK_MIN_SIZE) {513513 dev_info(dev, "Region too small (0x%llx)\n",514514 resource_size(&st->res));515515- ret = -EINVAL;516516- goto out;515515+ return -EINVAL;517516 }518517 st->clk = devm_memremap(dev, st->res.start, resource_size(&st->res),519518 MEMREMAP_WB | MEMREMAP_DEC);···520521 ret = PTR_ERR(st->clk);521522 dev_info(dev, "failed to map shared memory\n");522523 st->clk = NULL;523523- goto out;524524+ return ret;524525 }525526526527 if (le32_to_cpu(st->clk->magic) != VMCLOCK_MAGIC ||527528 le32_to_cpu(st->clk->size) > resource_size(&st->res) ||528529 le16_to_cpu(st->clk->version) != 1) {529530 dev_info(dev, "vmclock magic fields invalid\n");530530- ret = -EINVAL;531531- goto out;531531+ return -EINVAL;532532 }533533534534 ret = ida_alloc(&vmclock_ida, GFP_KERNEL);535535 if (ret < 0)536536- goto out;536536+ return ret;537537538538 st->index = ret;539539 ret = devm_add_action_or_reset(&pdev->dev, vmclock_put_idx, st);540540 if (ret)541541- goto out;541541+ return ret;542542543543 st->name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "vmclock%d", st->index);544544- if (!st->name) {545545- ret = -ENOMEM;546546- goto out;547547- }544544+ if (!st->name)545545+ return -ENOMEM;546546+547547+ st->miscdev.minor = 
MISC_DYNAMIC_MINOR;548548+549549+ ret = devm_add_action_or_reset(&pdev->dev, vmclock_remove, st);550550+ if (ret)551551+ return ret;548552549553 /*550554 * If the structure is big enough, it can be mapped to userspace.···556554 * cross that bridge if/when we come to it.557555 */558556 if (le32_to_cpu(st->clk->size) >= PAGE_SIZE) {559559- st->miscdev.minor = MISC_DYNAMIC_MINOR;560557 st->miscdev.fops = &vmclock_miscdev_fops;561558 st->miscdev.name = st->name;562559563560 ret = misc_register(&st->miscdev);564561 if (ret)565565- goto out;562562+ return ret;566563 }567564568565 /* If there is valid clock information, register a PTP clock */···571570 if (IS_ERR(st->ptp_clock)) {572571 ret = PTR_ERR(st->ptp_clock);573572 st->ptp_clock = NULL;574574- vmclock_remove(pdev);575575- goto out;573573+ return ret;576574 }577575 }578576579577 if (!st->miscdev.minor && !st->ptp_clock) {580578 /* Neither miscdev nor PTP registered */581579 dev_info(dev, "vmclock: Neither miscdev nor PTP available; not registering\n");582582- ret = -ENODEV;583583- goto out;580580+ return -ENODEV;584581 }585582586583 dev_info(dev, "%s: registered %s%s%s\n", st->name,···586587 (st->miscdev.minor && st->ptp_clock) ? ", " : "",587588 st->ptp_clock ? "PTP" : "");588589589589- dev_set_drvdata(dev, st);590590-591591- out:592592- return ret;590590+ return 0;593591}594592595593static const struct acpi_device_id vmclock_acpi_ids[] = {···597601598602static struct platform_driver vmclock_platform_driver = {599603 .probe = vmclock_probe,600600- .remove = vmclock_remove,601604 .driver = {602605 .name = "vmclock",603606 .acpi_match_table = vmclock_acpi_ids,
+27-34
drivers/regulator/core.c
···57745774 goto clean;57755775 }5776577657775777- if (config->init_data) {57785778- /*57795779- * Providing of_match means the framework is expected to parse57805780- * DT to get the init_data. This would conflict with provided57815781- * init_data, if set. Warn if it happens.57825782- */57835783- if (regulator_desc->of_match)57845784- dev_warn(dev, "Using provided init data - OF match ignored\n");57775777+ /*57785778+ * DT may override the config->init_data provided if the platform57795779+ * needs to do so. If so, config->init_data is completely ignored.57805780+ */57815781+ init_data = regulator_of_get_init_data(dev, regulator_desc, config,57825782+ &rdev->dev.of_node);5785578357845784+ /*57855785+ * Sometimes not all resources are probed already so we need to take57865786+ * that into account. This happens most the time if the ena_gpiod comes57875787+ * from a gpio extender or something else.57885788+ */57895789+ if (PTR_ERR(init_data) == -EPROBE_DEFER) {57905790+ ret = -EPROBE_DEFER;57915791+ goto clean;57925792+ }57935793+57945794+ /*57955795+ * We need to keep track of any GPIO descriptor coming from the57965796+ * device tree until we have handled it over to the core. If the57975797+ * config that was passed in to this function DOES NOT contain57985798+ * a descriptor, and the config after this call DOES contain57995799+ * a descriptor, we definitely got one from parsing the device58005800+ * tree.58015801+ */58025802+ if (!cfg->ena_gpiod && config->ena_gpiod)58035803+ dangling_of_gpiod = true;58045804+ if (!init_data) {57865805 init_data = config->init_data;57875806 rdev->dev.of_node = of_node_get(config->of_node);57885788-57895789- } else {57905790- init_data = regulator_of_get_init_data(dev, regulator_desc,57915791- config,57925792- &rdev->dev.of_node);57935793-57945794- /*57955795- * Sometimes not all resources are probed already so we need to57965796- * take that into account. 
This happens most the time if the57975797- * ena_gpiod comes from a gpio extender or something else.57985798- */57995799- if (PTR_ERR(init_data) == -EPROBE_DEFER) {58005800- ret = -EPROBE_DEFER;58015801- goto clean;58025802- }58035803-58045804- /*58055805- * We need to keep track of any GPIO descriptor coming from the58065806- * device tree until we have handled it over to the core. If the58075807- * config that was passed in to this function DOES NOT contain a58085808- * descriptor, and the config after this call DOES contain a58095809- * descriptor, we definitely got one from parsing the device58105810- * tree.58115811- */58125812- if (!cfg->ena_gpiod && config->ena_gpiod)58135813- dangling_of_gpiod = true;58145807 }5815580858165809 ww_mutex_init(&rdev->mutex, ®ulator_ww_class);
+2-1
drivers/s390/cio/chp.c
···695695 if (time_after(jiffies, chp_info_expires)) {696696 /* Data is too old, update. */697697 rc = sclp_chp_read_info(&chp_info);698698- chp_info_expires = jiffies + CHP_INFO_UPDATE_INTERVAL ;698698+ if (!rc)699699+ chp_info_expires = jiffies + CHP_INFO_UPDATE_INTERVAL;699700 }700701 mutex_unlock(&info_lock);701702
···872872 case 0x1a: /* start stop unit in progress */873873 case 0x1b: /* sanitize in progress */874874 case 0x1d: /* configuration in progress */875875- case 0x24: /* depopulation in progress */876876- case 0x25: /* depopulation restore in progress */877875 action = ACTION_DELAYED_RETRY;878876 break;879877 case 0x0a: /* ALUA state transition */880878 action = ACTION_DELAYED_REPREP;881879 break;880880+ /*881881+ * Depopulation might take many hours,882882+ * thus it is not worthwhile to retry.883883+ */884884+ case 0x24: /* depopulation in progress */885885+ case 0x25: /* depopulation restore in progress */886886+ fallthrough;882887 default:883888 action = ACTION_FAIL;884889 break;
+7
drivers/scsi/scsi_lib_test.c
···6767 };6868 int i;69697070+ /* Success */7171+ sc.result = 0;7272+ KUNIT_EXPECT_EQ(test, 0, scsi_check_passthrough(&sc, &failures));7373+ KUNIT_EXPECT_EQ(test, 0, scsi_check_passthrough(&sc, NULL));7474+ /* Command failed but caller did not pass in a failures array */7575+ scsi_build_sense(&sc, 0, ILLEGAL_REQUEST, 0x91, 0x36);7676+ KUNIT_EXPECT_EQ(test, 0, scsi_check_passthrough(&sc, NULL));7077 /* Match end of array */7178 scsi_build_sense(&sc, 0, ILLEGAL_REQUEST, 0x91, 0x36);7279 KUNIT_EXPECT_EQ(test, -EAGAIN, scsi_check_passthrough(&sc, &failures));
+1-1
drivers/scsi/scsi_scan.c
···246246 }247247 ret = sbitmap_init_node(&sdev->budget_map,248248 scsi_device_max_queue_depth(sdev),249249- new_shift, GFP_KERNEL,249249+ new_shift, GFP_NOIO,250250 sdev->request_queue->node, false, true);251251 if (!ret)252252 sbitmap_resize(&sdev->budget_map, depth);
···5757 * @max_level: maximum cooling level. One less than total number of valid5858 * cpufreq frequencies.5959 * @em: Reference on the Energy Model of the device6060- * @cdev: thermal_cooling_device pointer to keep track of the6161- * registered cooling device.6260 * @policy: cpufreq policy.6361 * @cooling_ops: cpufreq callbacks to thermal cooling device ops6462 * @idle_time: idle time stats
+1-1
drivers/tty/pty.c
···798798 nonseekable_open(inode, filp);799799800800 /* We refuse fsnotify events on ptmx, since it's a shared resource */801801- filp->f_mode |= FMODE_NONOTIFY;801801+ file_set_fsnotify_mode(filp, FMODE_NONOTIFY);802802803803 retval = tty_alloc_file(filp);804804 if (retval)
+2
drivers/tty/serial/8250/8250.h
···374374375375#ifdef CONFIG_SERIAL_8250_DMA376376extern int serial8250_tx_dma(struct uart_8250_port *);377377+extern void serial8250_tx_dma_flush(struct uart_8250_port *);377378extern int serial8250_rx_dma(struct uart_8250_port *);378379extern void serial8250_rx_dma_flush(struct uart_8250_port *);379380extern int serial8250_request_dma(struct uart_8250_port *);···407406{408407 return -1;409408}409409+static inline void serial8250_tx_dma_flush(struct uart_8250_port *p) { }410410static inline int serial8250_rx_dma(struct uart_8250_port *p)411411{412412 return -1;
+16
drivers/tty/serial/8250/8250_dma.c
···149149 return ret;150150}151151152152+void serial8250_tx_dma_flush(struct uart_8250_port *p)153153+{154154+ struct uart_8250_dma *dma = p->dma;155155+156156+ if (!dma->tx_running)157157+ return;158158+159159+ /*160160+ * kfifo_reset() has been called by the serial core, avoid161161+ * advancing and underflowing in __dma_tx_complete().162162+ */163163+ dma->tx_size = 0;164164+165165+ dmaengine_terminate_async(dma->txchan);166166+167167+152168int serial8250_rx_dma(struct uart_8250_port *p)153169{154170 struct uart_8250_dma *dma = p->dma;
···15611561 /* Always ask for fixed clock rate from a property. */15621562 device_property_read_u32(dev, "clock-frequency", &uartclk);1563156315641564- s->polling = !!irq;15641564+ s->polling = (irq <= 0);15651565 if (s->polling)15661566 dev_dbg(dev,15671567 "No interrupt pin definition, falling back to polling mode\n");
+7-5
drivers/tty/serial/serial_port.c
···173173 * The caller is responsible to initialize the following fields of the @port174174 * ->dev (must be valid)175175 * ->flags176176+ * ->iobase176177 * ->mapbase177178 * ->mapsize178179 * ->regshift (if @use_defaults is false)···215214 /* Read the registers I/O access type (default: MMIO 8-bit) */216215 ret = device_property_read_u32(dev, "reg-io-width", &value);217216 if (ret) {218218- port->iotype = UPIO_MEM;217217+ port->iotype = port->iobase ? UPIO_PORT : UPIO_MEM;219218 } else {220219 switch (value) {221220 case 1:···228227 port->iotype = device_is_big_endian(dev) ? UPIO_MEM32BE : UPIO_MEM32;229228 break;230229 default:231231- if (!use_defaults) {232232- dev_err(dev, "Unsupported reg-io-width (%u)\n", value);233233- return -EINVAL;234234- }235230 port->iotype = UPIO_UNKNOWN;236231 break;237232 }233233+ }234234+235235+ if (!use_defaults && port->iotype == UPIO_UNKNOWN) {236236+ dev_err(dev, "Unsupported reg-io-width (%u)\n", value);237237+ return -EINVAL;238238 }239239240240 /* Read the address mapping base offset (default: no offset) */
+35-33
drivers/ufs/core/ufshcd.c
···21202120 INIT_DELAYED_WORK(&hba->clk_gating.gate_work, ufshcd_gate_work);21212121 INIT_WORK(&hba->clk_gating.ungate_work, ufshcd_ungate_work);2122212221232123- spin_lock_init(&hba->clk_gating.lock);21242124-21252123 hba->clk_gating.clk_gating_workq = alloc_ordered_workqueue(21262124 "ufs_clk_gating_%d", WQ_MEM_RECLAIM | WQ_HIGHPRI,21272125 hba->host->host_no);···31043106 case UPIU_TRANSACTION_QUERY_RSP: {31053107 u8 response = lrbp->ucd_rsp_ptr->header.response;3106310831073107- if (response == 0)31093109+ if (response == 0) {31083110 err = ufshcd_copy_query_response(hba, lrbp);31113111+ } else {31123112+ err = -EINVAL;31133113+ dev_err(hba->dev, "%s: unexpected response in Query RSP: %x\n",31143114+ __func__, response);31153115+ }31093116 break;31103117 }31113118 case UPIU_TRANSACTION_REJECT_UPIU:···59795976 __func__, err);59805977}5981597859825982-static void ufshcd_temp_exception_event_handler(struct ufs_hba *hba, u16 status)59835983-{59845984- u32 value;59855985-59865986- if (ufshcd_query_attr_retry(hba, UPIU_QUERY_OPCODE_READ_ATTR,59875987- QUERY_ATTR_IDN_CASE_ROUGH_TEMP, 0, 0, &value))59885988- return;59895989-59905990- dev_info(hba->dev, "exception Tcase %d\n", value - 80);59915991-59925992- ufs_hwmon_notify_event(hba, status & MASK_EE_URGENT_TEMP);59935993-59945994- /*59955995- * A placeholder for the platform vendors to add whatever additional59965996- * steps required59975997- */59985998-}59995999-60005979static int __ufshcd_wb_toggle(struct ufs_hba *hba, bool set, enum flag_idn idn)60015980{60025981 u8 index;···61996214 ufshcd_bkops_exception_event_handler(hba);6200621562016216 if (status & hba->ee_drv_mask & MASK_EE_URGENT_TEMP)62026202- ufshcd_temp_exception_event_handler(hba, status);62176217+ ufs_hwmon_notify_event(hba, status & MASK_EE_URGENT_TEMP);6203621862046219 ufs_debugfs_exception_event(hba, status);62056220}···91459160 if (!IS_ERR_OR_NULL(clki->clk) && clki->enabled)91469161 clk_disable_unprepare(clki->clk);91479162 }91489148- } else if 
(!ret && on) {91639163+ } else if (!ret && on && hba->clk_gating.is_initialized) {91499164 scoped_guard(spinlock_irqsave, &hba->clk_gating.lock)91509165 hba->clk_gating.state = CLKS_ON;91519166 trace_ufshcd_clk_gating(dev_name(hba->dev),···1023210247#endif /* CONFIG_PM_SLEEP */10233102481023410249/**1023510235- * ufshcd_dealloc_host - deallocate Host Bus Adapter (HBA)1023610236- * @hba: pointer to Host Bus Adapter (HBA)1023710237- */1023810238-void ufshcd_dealloc_host(struct ufs_hba *hba)1023910239-{1024010240- scsi_host_put(hba->host);1024110241-}1024210242-EXPORT_SYMBOL_GPL(ufshcd_dealloc_host);1024310243-1024410244-/**1024510250 * ufshcd_set_dma_mask - Set dma mask based on the controller1024610251 * addressing capability1024710252 * @hba: per adapter instance···1025010275}10251102761025210277/**1027810278+ * ufshcd_devres_release - devres cleanup handler, invoked during release of1027910279+ * hba->dev1028010280+ * @host: pointer to SCSI host1028110281+ */1028210282+static void ufshcd_devres_release(void *host)1028310283+{1028410284+ scsi_host_put(host);1028510285+}1028610286+1028710287+/**1025310288 * ufshcd_alloc_host - allocate Host Bus Adapter (HBA)1025410289 * @dev: pointer to device handle1025510290 * @hba_handle: driver private handle1025610291 *1025710292 * Return: 0 on success, non-zero value on failure.1029310293+ *1029410294+ * NOTE: There is no corresponding ufshcd_dealloc_host() because this function1029510295+ * keeps track of its allocations using devres and deallocates everything on1029610296+ * device removal automatically.1025810297 */1025910298int ufshcd_alloc_host(struct device *dev, struct ufs_hba **hba_handle)1026010299{···1029010301 err = -ENOMEM;1029110302 goto out_error;1029210303 }1030410304+1030510305+ err = devm_add_action_or_reset(dev, ufshcd_devres_release,1030610306+ host);1030710307+ if (err)1030810308+ return dev_err_probe(dev, err,1030910309+ "failed to add ufshcd dealloc action\n");1031010310+1029310311 host->nr_maps = 
HCTX_TYPE_POLL + 1;1029410312 hba = shost_priv(host);1029510313 hba->host = host;···1042410428 hba->mmio_base = mmio_base;1042510429 hba->irq = irq;1042610430 hba->vps = &ufs_hba_vps;1043110431+1043210432+ /*1043310433+ * Initialize clk_gating.lock early since it is being used in1043410434+ * ufshcd_setup_clocks()1043510435+ */1043610436+ spin_lock_init(&hba->clk_gating.lock);10427104371042810438 err = ufshcd_hba_init(hba);1042910439 if (err)
···371371static void acm_ctrl_irq(struct urb *urb)372372{373373 struct acm *acm = urb->context;374374- struct usb_cdc_notification *dr = urb->transfer_buffer;374374+ struct usb_cdc_notification *dr;375375 unsigned int current_size = urb->actual_length;376376 unsigned int expected_size, copy_size, alloc_size;377377 int retval;···398398399399 usb_mark_last_busy(acm->dev);400400401401- if (acm->nb_index)401401+ if (acm->nb_index == 0) {402402+ /*403403+ * The first chunk of a message must contain at least the404404+ * notification header with the length field, otherwise we405405+ * can't get an expected_size.406406+ */407407+ if (current_size < sizeof(struct usb_cdc_notification)) {408408+ dev_dbg(&acm->control->dev, "urb too short\n");409409+ goto exit;410410+ }411411+ dr = urb->transfer_buffer;412412+ } else {402413 dr = (struct usb_cdc_notification *)acm->notification_buffer;403403-414414+ }404415 /* size = notification-header + (optional) data */405416 expected_size = sizeof(struct usb_cdc_notification) +406417 le16_to_cpu(dr->wLength);407418408408- if (current_size < expected_size) {419419+ if (acm->nb_index != 0 || current_size < expected_size) {409420 /* notification is transmitted fragmented, reassemble */410421 if (acm->nb_size < expected_size) {411422 u8 *new_buffer;···17381727 { USB_DEVICE(0x0870, 0x0001), /* Metricom GS Modem */17391728 .driver_info = NO_UNION_NORMAL, /* has no union descriptor */17401729 },17411741- { USB_DEVICE(0x045b, 0x023c), /* Renesas USB Download mode */17301730+ { USB_DEVICE(0x045b, 0x023c), /* Renesas R-Car H3 USB Download mode */17421731 .driver_info = DISABLE_ECHO, /* Don't echo banner */17431732 },17441744- { USB_DEVICE(0x045b, 0x0248), /* Renesas USB Download mode */17331733+ { USB_DEVICE(0x045b, 0x0247), /* Renesas R-Car D3 USB Download mode */17451734 .driver_info = DISABLE_ECHO, /* Don't echo banner */17461735 },17471747- { USB_DEVICE(0x045b, 0x024D), /* Renesas USB Download mode */17361736+ { USB_DEVICE(0x045b, 0x0248), /* 
Renesas R-Car M3-N USB Download mode */17371737+ .driver_info = DISABLE_ECHO, /* Don't echo banner */17381738+ },17391739+ { USB_DEVICE(0x045b, 0x024D), /* Renesas R-Car E3 USB Download mode */17481740 .driver_info = DISABLE_ECHO, /* Don't echo banner */17491741 },17501742 { USB_DEVICE(0x0e8d, 0x0003), /* FIREFLY, MediaTek Inc; andrey.arapov@gmail.com */
+12-2
drivers/usb/core/hub.c
···18491849 hdev = interface_to_usbdev(intf);1850185018511851 /*18521852+ * The USB 2.0 spec prohibits hubs from having more than one18531853+ * configuration or interface, and we rely on this prohibition.18541854+ * Refuse to accept a device that violates it.18551855+ */18561856+ if (hdev->descriptor.bNumConfigurations > 1 ||18571857+ hdev->actconfig->desc.bNumInterfaces > 1) {18581858+ dev_err(&intf->dev, "Invalid hub with more than one config or interface\n");18591859+ return -EINVAL;18601860+ }18611861+18621862+ /*18521863 * Set default autosuspend delay as 0 to speedup bus suspend,18531864 * based on the below considerations:18541865 *···47094698EXPORT_SYMBOL_GPL(usb_ep0_reinit);4710469947114700#define usb_sndaddr0pipe() (PIPE_CONTROL << 30)47124712-#define usb_rcvaddr0pipe() ((PIPE_CONTROL << 30) | USB_DIR_IN)4713470147144702static int hub_set_address(struct usb_device *udev, int devnum)47154703{···48144804 for (i = 0; i < GET_MAXPACKET0_TRIES; ++i) {48154805 /* Start with invalid values in case the transfer fails */48164806 buf->bDescriptorType = buf->bMaxPacketSize0 = 0;48174817- rc = usb_control_msg(udev, usb_rcvaddr0pipe(),48074807+ rc = usb_control_msg(udev, usb_rcvctrlpipe(udev, 0),48184808 USB_REQ_GET_DESCRIPTOR, USB_DIR_IN,48194809 USB_DT_DEVICE << 8, 0,48204810 buf, size,
···717717/**718718 * struct dwc3_ep - device side endpoint representation719719 * @endpoint: usb endpoint720720+ * @nostream_work: work for handling bulk NoStream720721 * @cancelled_list: list of cancelled requests for this endpoint721722 * @pending_list: list of pending requests for this endpoint722723 * @started_list: list of started requests on this endpoint
+34
drivers/usb/dwc3/gadget.c
···26372637{26382638 u32 reg;26392639 u32 timeout = 2000;26402640+ u32 saved_config = 0;2640264126412642 if (pm_runtime_suspended(dwc->dev))26422643 return 0;26442644+26452645+ /*26462646+ * When operating in USB 2.0 speeds (HS/FS), ensure that26472647+ * GUSB2PHYCFG.ENBLSLPM and GUSB2PHYCFG.SUSPHY are cleared before starting26482648+ * or stopping the controller. This resolves timeout issues that occur26492649+ * during frequent role switches between host and device modes.26502650+ *26512651+ * Save and clear these settings, then restore them after completing the26522652+ * controller start or stop sequence.26532653+ *26542654+ * This solution was discovered through experimentation as it is not26552655+ * mentioned in the dwc3 programming guide. It has been tested on26562656+ * Exynos platforms.26572657+ */26582658+ reg = dwc3_readl(dwc->regs, DWC3_GUSB2PHYCFG(0));26592659+ if (reg & DWC3_GUSB2PHYCFG_SUSPHY) {26602660+ saved_config |= DWC3_GUSB2PHYCFG_SUSPHY;26612661+ reg &= ~DWC3_GUSB2PHYCFG_SUSPHY;26622662+ }26632663+26642664+ if (reg & DWC3_GUSB2PHYCFG_ENBLSLPM) {26652665+ saved_config |= DWC3_GUSB2PHYCFG_ENBLSLPM;26662666+ reg &= ~DWC3_GUSB2PHYCFG_ENBLSLPM;26672667+ }26682668+26692669+ if (saved_config)26702670+ dwc3_writel(dwc->regs, DWC3_GUSB2PHYCFG(0), reg);2643267126442672 reg = dwc3_readl(dwc->regs, DWC3_DCTL);26452673 if (is_on) {
+14-5
drivers/usb/gadget/function/f_midi.c
···283283 /* Our transmit completed. See if there's more to go.284284 * f_midi_transmit eats req, don't queue it again. */285285 req->length = 0;286286- f_midi_transmit(midi);286286+ queue_work(system_highpri_wq, &midi->work);287287 return;288288 }289289 break;···907907908908 status = -ENODEV;909909910910+ /*911911+ * Reset wMaxPacketSize to the maximum packet size of a FS bulk transfer before912912+ * endpoint claim. This ensures that wMaxPacketSize does not exceed the limit913913+ * during bind retries on dwc3 controllers whose TX/RX FIFOs are configured914914+ * with a 512-byte maxpacket size for IN/OUT endpoints that support HS only.915915+ */916916+ bulk_in_desc.wMaxPacketSize = cpu_to_le16(64);917917+ bulk_out_desc.wMaxPacketSize = cpu_to_le16(64);918918+910919 /* allocate instance-specific endpoints */911920 midi->in_ep = usb_ep_autoconfig(cdev->gadget, &bulk_in_desc);912921 if (!midi->in_ep)···10091000 }1010100110111002 /* configure the endpoint descriptors ... */10121012- ms_out_desc.bLength = USB_DT_MS_ENDPOINT_SIZE(midi->in_ports);10131013- ms_out_desc.bNumEmbMIDIJack = midi->in_ports;10031003+ ms_out_desc.bLength = USB_DT_MS_ENDPOINT_SIZE(midi->out_ports);10041004+ ms_out_desc.bNumEmbMIDIJack = midi->out_ports;1014100510151015- ms_in_desc.bLength = USB_DT_MS_ENDPOINT_SIZE(midi->out_ports);10161016- ms_in_desc.bNumEmbMIDIJack = midi->out_ports;10061006+ ms_in_desc.bLength = USB_DT_MS_ENDPOINT_SIZE(midi->in_ports);10071007+ ms_in_desc.bNumEmbMIDIJack = midi->in_ports;1017100810181009 /* ... and add them to the list */10191010 endpoint_descriptor_index = i;
+1-1
drivers/usb/gadget/function/uvc_video.c
···818818 return -EINVAL;819819820820 /* Allocate a kthread for asynchronous hw submit handler. */821821- video->kworker = kthread_create_worker(0, "UVCG");821821+ video->kworker = kthread_run_worker(0, "UVCG");822822 if (IS_ERR(video->kworker)) {823823 uvcg_err(&video->uvc->func, "failed to create UVCG kworker\n");824824 return PTR_ERR(video->kworker);
···958958 * booting from USB disk or using a usb keyboard959959 */960960 hcc_params = readl(base + EHCI_HCC_PARAMS);961961+962962+ /* LS7A EHCI controller doesn't have extended capabilities, the963963+ * EECP (EHCI Extended Capabilities Pointer) field of HCCPARAMS964964+ * register should be 0x0 but it reads as 0xa0. So clear it to965965+ * avoid error messages on boot.966966+ */967967+ if (pdev->vendor == PCI_VENDOR_ID_LOONGSON && pdev->device == 0x7a14)968968+ hcc_params &= ~(0xffL << 8);969969+961970 offset = (hcc_params >> 8) & 0xff;962971 while (offset && --count) {963972 pci_read_config_dword(pdev, offset, &cap);
+4-3
drivers/usb/host/xhci-pci.c
···653653}654654EXPORT_SYMBOL_NS_GPL(xhci_pci_common_probe, "xhci");655655656656-static const struct pci_device_id pci_ids_reject[] = {657657- /* handled by xhci-pci-renesas */656656+/* handled by xhci-pci-renesas if enabled */657657+static const struct pci_device_id pci_ids_renesas[] = {658658 { PCI_DEVICE(PCI_VENDOR_ID_RENESAS, 0x0014) },659659 { PCI_DEVICE(PCI_VENDOR_ID_RENESAS, 0x0015) },660660 { /* end: all zeroes */ }···662662663663static int xhci_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)664664{665665- if (pci_match_id(pci_ids_reject, dev))665665+ if (IS_ENABLED(CONFIG_USB_XHCI_PCI_RENESAS) &&666666+ pci_match_id(pci_ids_renesas, dev))666667 return -ENODEV;667668668669 return xhci_pci_common_probe(dev, id);
···7474 return xen_bus_to_phys(dev, dma_to_phys(dev, dma_addr));7575}76767777+static inline bool range_requires_alignment(phys_addr_t p, size_t size)7878+{7979+ phys_addr_t algn = 1ULL << (get_order(size) + PAGE_SHIFT);8080+ phys_addr_t bus_addr = pfn_to_bfn(XEN_PFN_DOWN(p)) << XEN_PAGE_SHIFT;8181+8282+ return IS_ALIGNED(p, algn) && !IS_ALIGNED(bus_addr, algn);8383+}8484+7785static inline int range_straddles_page_boundary(phys_addr_t p, size_t size)7886{7987 unsigned long next_bfn, xen_pfn = XEN_PFN_DOWN(p);8088 unsigned int i, nr_pages = XEN_PFN_UP(xen_offset_in_page(p) + size);8181- phys_addr_t algn = 1ULL << (get_order(size) + PAGE_SHIFT);82898390 next_bfn = pfn_to_bfn(xen_pfn);8484-8585- /* If buffer is physically aligned, ensure DMA alignment. */8686- if (IS_ALIGNED(p, algn) &&8787- !IS_ALIGNED((phys_addr_t)next_bfn << XEN_PAGE_SHIFT, algn))8888- return 1;89919092 for (i = 1; i < nr_pages; i++)9193 if (pfn_to_bfn(++xen_pfn) != ++next_bfn)···113111}114112115113#ifdef CONFIG_X86116116-int xen_swiotlb_fixup(void *buf, unsigned long nslabs)114114+int __init xen_swiotlb_fixup(void *buf, unsigned long nslabs)117115{118116 int rc;119117 unsigned int order = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT);···158156159157 *dma_handle = xen_phys_to_dma(dev, phys);160158 if (*dma_handle + size - 1 > dma_mask ||161161- range_straddles_page_boundary(phys, size)) {159159+ range_straddles_page_boundary(phys, size) ||160160+ range_requires_alignment(phys, size)) {162161 if (xen_create_contiguous_region(phys, order, fls64(dma_mask),163162 dma_handle) != 0)164163 goto out_free_pages;···185182 size = ALIGN(size, XEN_PAGE_SIZE);186183187184 if (WARN_ON_ONCE(dma_handle + size - 1 > dev->coherent_dma_mask) ||188188- WARN_ON_ONCE(range_straddles_page_boundary(phys, size)))185185+ WARN_ON_ONCE(range_straddles_page_boundary(phys, size) ||186186+ range_requires_alignment(phys, size)))189187 return;190188191189 if (TestClearPageXenRemapped(virt_to_page(vaddr)))
+7
fs/bcachefs/Kconfig
···6161 The resulting code will be significantly slower than normal; you6262 probably shouldn't select this option unless you're a developer.63636464+config BCACHEFS_INJECT_TRANSACTION_RESTARTS6565+ bool "Randomly inject transaction restarts"6666+ depends on BCACHEFS_DEBUG6767+ help6868+ Randomly inject transaction restarts in a few core paths - may have a6969+ significant performance penalty7070+6471config BCACHEFS_TESTS6572 bool "bcachefs unit and performance tests"6673 depends on BCACHEFS_FS
+25-22
fs/bcachefs/alloc_background.c
···18031803 u64 open;18041804 u64 need_journal_commit;18051805 u64 discarded;18061806- u64 need_journal_commit_this_dev;18071806};1808180718091808static int bch2_discard_one_bucket(struct btree_trans *trans,···18261827 goto out;18271828 }1828182918291829- if (bch2_bucket_needs_journal_commit(&c->buckets_waiting_for_journal,18301830- c->journal.flushed_seq_ondisk,18311831- pos.inode, pos.offset)) {18321832- s->need_journal_commit++;18331833- s->need_journal_commit_this_dev++;18301830+ u64 seq_ready = bch2_bucket_journal_seq_ready(&c->buckets_waiting_for_journal,18311831+ pos.inode, pos.offset);18321832+ if (seq_ready > c->journal.flushed_seq_ondisk) {18331833+ if (seq_ready > c->journal.flushing_seq)18341834+ s->need_journal_commit++;18341835 goto out;18351836 }18361837···18641865 discard_locked = true;18651866 }1866186718671867- if (!bkey_eq(*discard_pos_done, iter.pos) &&18681868- ca->mi.discard && !c->opts.nochanges) {18691869- /*18701870- * This works without any other locks because this is the only18711871- * thread that removes items from the need_discard tree18721872- */18731873- bch2_trans_unlock_long(trans);18741874- blkdev_issue_discard(ca->disk_sb.bdev,18751875- k.k->p.offset * ca->mi.bucket_size,18761876- ca->mi.bucket_size,18771877- GFP_KERNEL);18781878- *discard_pos_done = iter.pos;18681868+ if (!bkey_eq(*discard_pos_done, iter.pos)) {18791869 s->discarded++;18701870+ *discard_pos_done = iter.pos;1880187118811881- ret = bch2_trans_relock_notrace(trans);18821882- if (ret)18831883- goto out;18721872+ if (ca->mi.discard && !c->opts.nochanges) {18731873+ /*18741874+ * This works without any other locks because this is the only18751875+ * thread that removes items from the need_discard tree18761876+ */18771877+ bch2_trans_unlock_long(trans);18781878+ blkdev_issue_discard(ca->disk_sb.bdev,18791879+ k.k->p.offset * ca->mi.bucket_size,18801880+ ca->mi.bucket_size,18811881+ GFP_KERNEL);18821882+ ret = bch2_trans_relock_notrace(trans);18831883+ if (ret)18841884+ 
goto out;18851885+ }18841886 }1885188718861888 SET_BCH_ALLOC_V4_NEED_DISCARD(&a->v, false);···19281928 POS(ca->dev_idx, 0),19291929 POS(ca->dev_idx, U64_MAX), 0, k,19301930 bch2_discard_one_bucket(trans, ca, &iter, &discard_pos_done, &s, false)));19311931+19321932+ if (s.need_journal_commit > dev_buckets_available(ca, BCH_WATERMARK_normal))19331933+ bch2_journal_flush_async(&c->journal, NULL);1931193419321935 trace_discard_buckets(c, s.seen, s.open, s.need_journal_commit, s.discarded,19331936 bch2_err_str(ret));···20272024 break;20282025 }2029202620302030- trace_discard_buckets(c, s.seen, s.open, s.need_journal_commit, s.discarded, bch2_err_str(ret));20272027+ trace_discard_buckets_fast(c, s.seen, s.open, s.need_journal_commit, s.discarded, bch2_err_str(ret));2031202820322029 bch2_trans_put(trans);20332030 percpu_ref_put(&ca->io_ref);
+7-3
fs/bcachefs/alloc_foreground.c
···205205 return false;206206 }207207208208- if (bch2_bucket_needs_journal_commit(&c->buckets_waiting_for_journal,209209- c->journal.flushed_seq_ondisk, bucket.inode, bucket.offset)) {208208+ u64 journal_seq_ready =209209+ bch2_bucket_journal_seq_ready(&c->buckets_waiting_for_journal,210210+ bucket.inode, bucket.offset);211211+ if (journal_seq_ready > c->journal.flushed_seq_ondisk) {212212+ if (journal_seq_ready > c->journal.flushing_seq)213213+ s->need_journal_commit++;210214 s->skipped_need_journal_commit++;211215 return false;212216 }···574570 ? bch2_bucket_alloc_freelist(trans, ca, watermark, &s, cl)575571 : bch2_bucket_alloc_early(trans, ca, watermark, &s, cl);576572577577- if (s.skipped_need_journal_commit * 2 > avail)573573+ if (s.need_journal_commit * 2 > avail)578574 bch2_journal_flush_async(&c->journal, NULL);579575580576 if (!ob && s.btree_bitmap != BTREE_BITMAP_ANY) {
···748748 rcu_read_unlock();749749 mutex_lock(&bc->table.mutex);750750 mutex_unlock(&bc->table.mutex);751751- rcu_read_lock();752751 continue;753752 }754753 for (i = 0; i < tbl->size; i++)
+4
fs/bcachefs/btree_trans_commit.c
···99999910001000 bch2_trans_verify_not_unlocked_or_in_restart(trans);1001100110021002+ ret = trans_maybe_inject_restart(trans, _RET_IP_);10031003+ if (unlikely(ret))10041004+ goto out_reset;10051005+10021006 if (!trans->nr_updates &&10031007 !trans->journal_entries_u64s)10041008 goto out_reset;
···381381not_found:382382 if (flags & BTREE_TRIGGER_check_repair) {383383 ret = bch2_indirect_extent_missing_error(trans, p, *idx, next_idx, false);384384+ if (ret == -BCH_ERR_missing_indirect_extent)385385+ ret = 0;384386 if (ret)385387 goto err;386388 }
···523523	u64 end;524524	u32 len;525525526526-	/* For now only order 0 folios are supported for data. */527527-	ASSERT(folio_order(folio) == 0);528526	btrfs_debug(fs_info,529527		"%s: bi_sector=%llu, err=%d, mirror=%u",530528		__func__, bio->bi_iter.bi_sector, bio->bi_status,···550552551553	if (likely(uptodate)) {552554		loff_t i_size = i_size_read(inode);553553-		pgoff_t end_index = i_size >> folio_shift(folio);554555555556		/*556557		 * Zero out the remaining part if this range straddles···558561		 * Here we should only zero the range inside the folio,559562		 * not touch anything else.560563		 *561561-		 * NOTE: i_size is exclusive while end is inclusive.564564+		 * NOTE: i_size is exclusive while end is inclusive and565565+		 * folio_contains() takes PAGE_SIZE units.562566		 */563563-		if (folio_index(folio) == end_index && i_size <= end) {567567+		if (folio_contains(folio, i_size >> PAGE_SHIFT) &&568568+		    i_size <= end) {564569			u32 zero_start = max(offset_in_folio(folio, i_size),565570					     offset_in_folio(folio, start));566571			u32 zero_len = offset_in_folio(folio, end) + 1 -···898899					 u64 len, struct extent_map **em_cached)899900{900901	struct extent_map *em;901901-	struct extent_state *cached_state = NULL;902902903903	ASSERT(em_cached);904904···913915		*em_cached = NULL;914916	}915917916916-	btrfs_lock_and_flush_ordered_range(inode, start, start + len - 1, &cached_state);917918	em = btrfs_get_extent(inode, folio, start, len);918919	if (!IS_ERR(em)) {919920		BUG_ON(*em_cached);920921		refcount_inc(&em->refs);921922		*em_cached = em;922923	}923923-	unlock_extent(&inode->io_tree, start, start + len - 1, &cached_state);924924925925	return em;926926}···952956			return ret;953957	}954958955955-	if (folio->index == last_byte >> folio_shift(folio)) {959959+	if (folio_contains(folio, last_byte >> PAGE_SHIFT)) {956960		size_t zero_offset = offset_in_folio(folio, last_byte);957961958962		if (zero_offset) {···1075107910761080int btrfs_read_folio(struct file *file, struct folio *folio)10771081{10821082+	struct btrfs_inode *inode = folio_to_inode(folio);10831083+	const u64 start = folio_pos(folio);10841084+	const u64 end = start + folio_size(folio) - 1;10851085+	struct extent_state *cached_state = NULL;10781086	struct btrfs_bio_ctrl bio_ctrl = { .opf = REQ_OP_READ };10791087	struct extent_map *em_cached = NULL;10801088	int ret;1081108910901090+	btrfs_lock_and_flush_ordered_range(inode, start, end, &cached_state);10821091	ret = btrfs_do_readpage(folio, &em_cached, &bio_ctrl, NULL);10921092+	unlock_extent(&inode->io_tree, start, end, &cached_state);10931093+10831094	free_extent_map(em_cached);1084109510851096	/*···23832380{23842381	struct btrfs_bio_ctrl bio_ctrl = { .opf = REQ_OP_READ | REQ_RAHEAD };23852382	struct folio *folio;23832383+	struct btrfs_inode *inode = BTRFS_I(rac->mapping->host);23842384+	const u64 start = readahead_pos(rac);23852385+	const u64 end = start + readahead_length(rac) - 1;23862386+	struct extent_state *cached_state = NULL;23862387	struct extent_map *em_cached = NULL;23872388	u64 prev_em_start = (u64)-1;2388238923902390+	btrfs_lock_and_flush_ordered_range(inode, start, end, &cached_state);23912391+23892392	while ((folio = readahead_folio(rac)) != NULL)23902393		btrfs_do_readpage(folio, &em_cached, &bio_ctrl, &prev_em_start);23942394+23952395+	unlock_extent(&inode->io_tree, start, end, &cached_state);2391239623922397	if (em_cached)23932398		free_extent_map(em_cached);
+1-3
fs/btrfs/file.c
···10391039 loff_t pos = iocb->ki_pos;10401040 int ret;10411041 loff_t oldsize;10421042- loff_t start_pos;1043104210441043 /*10451044 * Quickly bail out on NOWAIT writes if we don't have the nodatacow or···10651066 inode_inc_iversion(inode);10661067 }1067106810681068- start_pos = round_down(pos, fs_info->sectorsize);10691069 oldsize = i_size_read(inode);10701070- if (start_pos > oldsize) {10701070+ if (pos > oldsize) {10711071 /* Expand hole size to cover write data, preventing empty gap */10721072 loff_t end_pos = round_up(pos + count, fs_info->sectorsize);10731073
+12
fs/btrfs/ordered-data.c
···12291229 */12301230 if (WARN_ON_ONCE(len >= ordered->num_bytes))12311231 return ERR_PTR(-EINVAL);12321232+ /*12331233+ * If our ordered extent had an error there's no point in continuing.12341234+ * The error may have come from a transaction abort done either by this12351235+ * task or some other concurrent task, and the transaction abort path12361236+ * iterates over all existing ordered extents and sets the flag12371237+ * BTRFS_ORDERED_IOERR on them.12381238+ */12391239+ if (unlikely(flags & (1U << BTRFS_ORDERED_IOERR))) {12401240+ const int fs_error = BTRFS_FS_ERROR(fs_info);12411241+12421242+ return fs_error ? ERR_PTR(fs_error) : ERR_PTR(-EIO);12431243+ }12321244 /* We cannot split partially completed ordered extents. */12331245 if (ordered->bytes_left) {12341246 ASSERT(!(flags & ~BTRFS_ORDERED_TYPE_FLAGS));
+5-6
fs/btrfs/qgroup.c
···18801880 * Commit current transaction to make sure all the rfer/excl numbers18811881 * get updated.18821882 */18831883- trans = btrfs_start_transaction(fs_info->quota_root, 0);18841884- if (IS_ERR(trans))18851885- return PTR_ERR(trans);18861886-18871887- ret = btrfs_commit_transaction(trans);18831883+ ret = btrfs_commit_current_transaction(fs_info->quota_root);18881884 if (ret < 0)18891885 return ret;18901886···18931897 /*18941898 * It's squota and the subvolume still has numbers needed for future18951899 * accounting, in this case we can not delete it. Just skip it.19001900+ *19011901+ * Or the qgroup is already removed by a qgroup rescan. For both cases we're19021902+ * safe to ignore them.18961903 */18971897- if (ret == -EBUSY)19041904+ if (ret == -EBUSY || ret == -ENOENT)18981905 ret = 0;18991906 return ret;19001907}
+3-1
fs/btrfs/transaction.c
···274274 cur_trans = fs_info->running_transaction;275275 if (cur_trans) {276276 if (TRANS_ABORTED(cur_trans)) {277277+ const int abort_error = cur_trans->aborted;278278+277279 spin_unlock(&fs_info->trans_lock);278278- return cur_trans->aborted;280280+ return abort_error;279281 }280282 if (btrfs_blocked_trans_types[cur_trans->state] & type) {281283 spin_unlock(&fs_info->trans_lock);
+3-3
fs/dcache.c
···17001700 smp_store_release(&dentry->d_name.name, dname); /* ^^^ */1701170117021702 dentry->d_flags = 0;17031703- lockref_init(&dentry->d_lockref, 1);17031703+ lockref_init(&dentry->d_lockref);17041704 seqcount_spinlock_init(&dentry->d_seq, &dentry->d_lock);17051705 dentry->d_inode = NULL;17061706 dentry->d_parent = dentry;···29662966 goto out_err;29672967 m2 = &alias->d_parent->d_inode->i_rwsem;29682968out_unalias:29692969- if (alias->d_op->d_unalias_trylock &&29692969+ if (alias->d_op && alias->d_op->d_unalias_trylock &&29702970 !alias->d_op->d_unalias_trylock(alias))29712971 goto out_err;29722972 __d_move(alias, dentry, false);29732973- if (alias->d_op->d_unalias_unlock)29732973+ if (alias->d_op && alias->d_op->d_unalias_unlock)29742974 alias->d_op->d_unalias_unlock(alias);29752975 ret = 0;29762976out_err:
+1-1
fs/erofs/zdata.c
···726726 if (IS_ERR(pcl))727727 return PTR_ERR(pcl);728728729729- lockref_init(&pcl->lockref, 1); /* one ref for this request */729729+ lockref_init(&pcl->lockref); /* one ref for this request */730730 pcl->algorithmformat = map->m_algorithmformat;731731 pcl->length = 0;732732 pcl->partial = true;
+16
fs/file_table.c
···194194 * refcount bumps we should reinitialize the reused file first.195195 */196196 file_ref_init(&f->f_ref, 1);197197+ /*198198+ * Disable permission and pre-content events for all files by default.199199+ * They may be enabled later by file_set_fsnotify_mode_from_watchers().200200+ */201201+ file_set_fsnotify_mode(f, FMODE_NONOTIFY_PERM);197202 return 0;198203}199204···380375 if (IS_ERR(file)) {381376 ihold(inode);382377 path_put(&path);378378+ return file;383379 }380380+ /*381381+ * Disable all fsnotify events for pseudo files by default.382382+ * They may be enabled by caller with file_set_fsnotify_mode().383383+ */384384+ file_set_fsnotify_mode(file, FMODE_NONOTIFY);384385 return file;385386}386387EXPORT_SYMBOL(alloc_file_pseudo);···411400 return file;412401 }413402 file_init_path(file, &path, fops);403403+ /*404404+ * Disable all fsnotify events for pseudo files by default.405405+ * They may be enabled by caller with file_set_fsnotify_mode().406406+ */407407+ file_set_fsnotify_mode(file, FMODE_NONOTIFY);414408 return file;415409}416410EXPORT_SYMBOL_GPL(alloc_file_pseudo_noaccount);
···380380	error = check_nfsd_access(exp, rqstp, may_bypass_gss);381381	if (error)382382		goto out;383383-384384-	svc_xprt_set_valid(rqstp->rq_xprt);383383+	/* During a LOCALIO call, fh_verify is called with a NULL rqstp */384384+	if (rqstp)385385+		svc_xprt_set_valid(rqstp->rq_xprt);385386386387	/* Finally, check access permissions. */387388	error = nfsd_permission(cred, exp, dentry, access);
+12-6
fs/notify/fsnotify.c
···648648 * Later, fsnotify permission hooks do not check if there are permission event649649 * watches, but that there were permission event watches at open time.650650 */651651-void file_set_fsnotify_mode(struct file *file)651651+void file_set_fsnotify_mode_from_watchers(struct file *file)652652{653653 struct dentry *dentry = file->f_path.dentry, *parent;654654 struct super_block *sb = dentry->d_sb;···665665 */666666 if (likely(!fsnotify_sb_has_priority_watchers(sb,667667 FSNOTIFY_PRIO_CONTENT))) {668668- file->f_mode |= FMODE_NONOTIFY_PERM;668668+ file_set_fsnotify_mode(file, FMODE_NONOTIFY_PERM);669669 return;670670 }671671···676676 if ((!d_is_dir(dentry) && !d_is_reg(dentry)) ||677677 likely(!fsnotify_sb_has_priority_watchers(sb,678678 FSNOTIFY_PRIO_PRE_CONTENT))) {679679- file->f_mode |= FMODE_NONOTIFY | FMODE_NONOTIFY_PERM;679679+ file_set_fsnotify_mode(file, FMODE_NONOTIFY | FMODE_NONOTIFY_PERM);680680 return;681681 }682682···686686 */687687 mnt_mask = READ_ONCE(real_mount(file->f_path.mnt)->mnt_fsnotify_mask);688688 if (unlikely(fsnotify_object_watched(d_inode(dentry), mnt_mask,689689- FSNOTIFY_PRE_CONTENT_EVENTS)))689689+ FSNOTIFY_PRE_CONTENT_EVENTS))) {690690+ /* Enable pre-content events */691691+ file_set_fsnotify_mode(file, 0);690692 return;693693+ }691694692695 /* Is parent watching for pre-content events on this file? */693696 if (dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED) {694697 parent = dget_parent(dentry);695698 p_mask = fsnotify_inode_watches_children(d_inode(parent));696699 dput(parent);697697- if (p_mask & FSNOTIFY_PRE_CONTENT_EVENTS)700700+ if (p_mask & FSNOTIFY_PRE_CONTENT_EVENTS) {701701+ /* Enable pre-content events */702702+ file_set_fsnotify_mode(file, 0);698703 return;704704+ }699705 }700706 /* Nobody watching for pre-content events from this file */701701- file->f_mode |= FMODE_NONOTIFY | FMODE_NONOTIFY_PERM;707707+ file_set_fsnotify_mode(file, FMODE_NONOTIFY | FMODE_NONOTIFY_PERM);702708}703709#endif704710
+6-5
fs/open.c
···905905 f->f_sb_err = file_sample_sb_err(f);906906907907 if (unlikely(f->f_flags & O_PATH)) {908908- f->f_mode = FMODE_PATH | FMODE_OPENED | FMODE_NONOTIFY;908908+ f->f_mode = FMODE_PATH | FMODE_OPENED;909909+ file_set_fsnotify_mode(f, FMODE_NONOTIFY);909910 f->f_op = &empty_fops;910911 return 0;911912 }···936935937936 /*938937 * Set FMODE_NONOTIFY_* bits according to existing permission watches.939939- * If FMODE_NONOTIFY was already set for an fanotify fd, this doesn't940940- * change anything.938938+ * If FMODE_NONOTIFY mode was already set for an fanotify fd or for a939939+ * pseudo file, this call will not change the mode.941940 */942942- file_set_fsnotify_mode(f);941941+ file_set_fsnotify_mode_from_watchers(f);943942 error = fsnotify_open_perm(f);944943 if (error)945944 goto cleanup_all;···11231122 if (!IS_ERR(f)) {11241123 int error;1125112411261126- f->f_mode |= FMODE_NONOTIFY;11251125+ file_set_fsnotify_mode(f, FMODE_NONOTIFY);11271126 error = vfs_open(path, f);11281127 if (error) {11291128 fput(f);
+11-1
fs/pidfs.c
···287287	switch (cmd) {288288	case FS_IOC_GETVERSION:289289	case PIDFD_GET_CGROUP_NAMESPACE:290290-	case PIDFD_GET_INFO:291290	case PIDFD_GET_IPC_NAMESPACE:292291	case PIDFD_GET_MNT_NAMESPACE:293292	case PIDFD_GET_NET_NAMESPACE:···297298	case PIDFD_GET_USER_NAMESPACE:298299	case PIDFD_GET_PID_NAMESPACE:299300		return true;301301+	}302302+303303+	/* Extensible ioctls require some more careful checks. */304304+	switch (_IOC_NR(cmd)) {305305+	case _IOC_NR(PIDFD_GET_INFO):306306+		/*307307+		 * Try to prevent performing a pidfd ioctl when someone308308+		 * erroneously mistook the file descriptor for a pidfd.309309+		 * This is not perfect but will catch most cases.310310+		 */311311+		return (_IOC_TYPE(cmd) == _IOC_TYPE(PIDFD_GET_INFO));300312	}301313302314	return false;
···35633563 int error;3564356435653565 /*35663566- * If there are already extents in the file, try an exact EOF block35673567- * allocation to extend the file as a contiguous extent. If that fails,35683568- * or it's the first allocation in a file, just try for a stripe aligned35693569- * allocation.35663566+ * If there are already extents in the file, and xfs_bmap_adjacent() has35673567+ * given a better blkno, try an exact EOF block allocation to extend the35683568+ * file as a contiguous extent. If that fails, or it's the first35693569+ * allocation in a file, just try for a stripe aligned allocation.35703570 */35713571- if (ap->offset) {35713571+ if (ap->eof) {35723572 xfs_extlen_t nextminlen = 0;3573357335743574 /*···37363736 int error;3737373737383738 ap->blkno = XFS_INO_TO_FSB(args->mp, ap->ip->i_ino);37393739- xfs_bmap_adjacent(ap);37393739+ if (!xfs_bmap_adjacent(ap))37403740+ ap->eof = false;3740374137413742 /*37423743 * Search for an allocation group with a single extent large enough for
+17-19
fs/xfs/xfs_buf.c
···4141 *4242 * xfs_buf_rele:4343 *	b_lock4444- *	  pag_buf_lock4545- *	    lru_lock4444+ *	  lru_lock4645 *4746 * xfs_buftarg_drain_rele4847 *	lru_lock···219220	 */220221	flags &= ~(XBF_UNMAPPED | XBF_TRYLOCK | XBF_ASYNC | XBF_READ_AHEAD);221222222222-	spin_lock_init(&bp->b_lock);223223+	/*224224+	 * A new buffer is held and locked by the owner.  This ensures that the225225+	 * buffer is owned by the caller and racing RCU lookups right after226226+	 * inserting into the hash table are safe (and will have to wait for227227+	 * the unlock to do anything non-trivial).228228+	 */223229	bp->b_hold = 1;230230+	sema_init(&bp->b_sema, 0); /* held, no waiters */231231+232232+	spin_lock_init(&bp->b_lock);224233	atomic_set(&bp->b_lru_ref, 1);225234	init_completion(&bp->b_iowait);226235	INIT_LIST_HEAD(&bp->b_lru);227236	INIT_LIST_HEAD(&bp->b_list);228237	INIT_LIST_HEAD(&bp->b_li_list);229229-	sema_init(&bp->b_sema, 0); /* held, no waiters */230238	bp->b_target = target;231239	bp->b_mount = target->bt_mount;232240	bp->b_flags = flags;233241234234-	/*235235-	 * Set length and io_length to the same value initially.236236-	 * I/O routines should use io_length, which will be the same in237237-	 * most cases but may be reset (e.g. XFS recovery).238238-	 */239242	error = xfs_buf_get_maps(bp, nmaps);240243	if (error) {241244		kmem_cache_free(xfs_buf_cache, bp);···503502xfs_buf_cache_init(504503	struct xfs_buf_cache	*bch)505504{506506-	spin_lock_init(&bch->bc_lock);507505	return rhashtable_init(&bch->bc_hash, &xfs_buf_hash_params);508506}509507···652652	if (error)653653		goto out_free_buf;654654655655-	spin_lock(&bch->bc_lock);655655+	/* The new buffer keeps the perag reference until it is freed. */656656+	new_bp->b_pag = pag;657657+658658+	rcu_read_lock();656659	bp = rhashtable_lookup_get_insert_fast(&bch->bc_hash,657660			&new_bp->b_rhash_head, xfs_buf_hash_params);658661	if (IS_ERR(bp)) {662662+		rcu_read_unlock();659663		error = PTR_ERR(bp);660660-		spin_unlock(&bch->bc_lock);661664		goto out_free_buf;662665	}663666	if (bp && xfs_buf_try_hold(bp)) {664667		/* found an existing buffer */665665-		spin_unlock(&bch->bc_lock);668668+		rcu_read_unlock();666669		error = xfs_buf_find_lock(bp, flags);667670		if (error)668671			xfs_buf_rele(bp);···673670			*bpp = bp;674671		goto out_free_buf;675672	}673673+	rcu_read_unlock();676674677677-	/* The new buffer keeps the perag reference until it is freed. */678678-	new_bp->b_pag = pag;679679-	spin_unlock(&bch->bc_lock);680675	*bpp = new_bp;681676	return 0;682677···10911090	}1092109110931092	/* we are asked to drop the last reference */10941094-	spin_lock(&bch->bc_lock);10951093	__xfs_buf_ioacct_dec(bp);10961094	if (!(bp->b_flags & XBF_STALE) && atomic_read(&bp->b_lru_ref)) {10971095		/*···11021102			bp->b_state &= ~XFS_BSTATE_DISPOSE;11031103		else11041104			bp->b_hold--;11051105-		spin_unlock(&bch->bc_lock);11061105	} else {11071106		bp->b_hold--;11081107		/*···11191120			ASSERT(!(bp->b_flags & _XBF_DELWRI_Q));11201121			rhashtable_remove_fast(&bch->bc_hash, &bp->b_rhash_head,11211122					xfs_buf_hash_params);11221122-		spin_unlock(&bch->bc_lock);11231123		if (pag)11241124			xfs_perag_put(pag);11251125		freebuf = true;
···329329 * successfully but before locks are dropped.330330 */331331332332-/* Verify that we have security clearance to perform this operation. */333333-static int334334-xfs_exchange_range_verify_area(335335-	struct xfs_exchrange	*fxr)336336-{337337-	int			ret;338338-339339-	ret = remap_verify_area(fxr->file1, fxr->file1_offset, fxr->length,340340-			true);341341-	if (ret)342342-		return ret;343343-344344-	return remap_verify_area(fxr->file2, fxr->file2_offset, fxr->length,345345-			true);346346-}347347-348332/*349333 * Performs necessary checks before doing a range exchange, having stabilized350334 * mutable inode attributes via i_rwsem.···339355	unsigned int		alloc_unit)340356{341357	struct inode		*inode1 = file_inode(fxr->file1);358358+	loff_t			size1 = i_size_read(inode1);342359	struct inode		*inode2 = file_inode(fxr->file2);360360+	loff_t			size2 = i_size_read(inode2);343361	uint64_t		allocmask = alloc_unit - 1;344362	int64_t			test_len;345363	uint64_t		blen;346346-	loff_t			size1, size2, tmp;364364+	loff_t			tmp;347365	int			error;348366349367	/* Don't touch certain kinds of inodes */···354368	if (IS_SWAPFILE(inode1) || IS_SWAPFILE(inode2))355369		return -ETXTBSY;356370357357-	size1 = i_size_read(inode1);358358-	size2 = i_size_read(inode2);359359-360371	/* Ranges cannot start after EOF. */361372	if (fxr->file1_offset > size1 || fxr->file2_offset > size2)362373		return -EINVAL;363374364364-	/*365365-	 * If the caller said to exchange to EOF, we set the length of the366366-	 * request large enough to cover everything to the end of both files.367367-	 */368375	if (fxr->flags & XFS_EXCHANGE_RANGE_TO_EOF) {376376+		/*377377+		 * If the caller said to exchange to EOF, we set the length of378378+		 * the request large enough to cover everything to the end of379379+		 * both files.380380+		 */369381		fxr->length = max_t(int64_t, size1 - fxr->file1_offset,370382					     size2 - fxr->file2_offset);371371-372372-		error = xfs_exchange_range_verify_area(fxr);373373-		if (error)374374-			return error;383383+	} else {384384+		/*385385+		 * Otherwise we require both ranges to end within EOF.386386+		 */387387+		if (fxr->file1_offset + fxr->length > size1 ||388388+		    fxr->file2_offset + fxr->length > size2)389389+			return -EINVAL;375390	}376391377392	/*···386399	/* Ensure offsets don't wrap. */387400	if (check_add_overflow(fxr->file1_offset, fxr->length, &tmp) ||388401	    check_add_overflow(fxr->file2_offset, fxr->length, &tmp))389389-		return -EINVAL;390390-391391-	/*392392-	 * We require both ranges to end within EOF, unless we're exchanging393393-	 * to EOF.394394-	 */395395-	if (!(fxr->flags & XFS_EXCHANGE_RANGE_TO_EOF) &&396396-	    (fxr->file1_offset + fxr->length > size1 ||397397-	     fxr->file2_offset + fxr->length > size2))398402		return -EINVAL;399403400404	/*···725747{726748	struct inode		*inode1 = file_inode(fxr->file1);727749	struct inode		*inode2 = file_inode(fxr->file2);750750+	loff_t			check_len = fxr->length;728751	int			ret;729752730753	BUILD_BUG_ON(XFS_EXCHANGE_RANGE_ALL_FLAGS &···758779		return -EBADF;759780760781	/*761761-	 * If we're not exchanging to EOF, we can check the areas before762762-	 * stabilizing both files' i_size.782782+	 * If we're exchanging to EOF we can't calculate the length until taking783783+	 * the iolock.  Pass a 0 length to remap_verify_area similar to the784784+	 * FICLONE and FICLONERANGE ioctls that support cloning to EOF as well.763785	 */764764-	if (!(fxr->flags & XFS_EXCHANGE_RANGE_TO_EOF)) {765765-		ret = xfs_exchange_range_verify_area(fxr);766766-		if (ret)767767-			return ret;768768-	}786786+	if (fxr->flags & XFS_EXCHANGE_RANGE_TO_EOF)787787+		check_len = 0;788788+	ret = remap_verify_area(fxr->file1, fxr->file1_offset, check_len, true);789789+	if (ret)790790+		return ret;791791+	ret = remap_verify_area(fxr->file2, fxr->file2_offset, check_len, true);792792+	if (ret)793793+		return ret;769794770795	/* Update cmtime if the fd/inode don't forbid it. */771796	if (!(fxr->file1->f_mode & FMODE_NOCMTIME) && !IS_NOCMTIME(inode1))
+5-2
fs/xfs/xfs_inode.c
···14041404 goto out;1405140514061406 /* Try to clean out the cow blocks if there are any. */14071407- if (xfs_inode_has_cow_data(ip))14081408- xfs_reflink_cancel_cow_range(ip, 0, NULLFILEOFF, true);14071407+ if (xfs_inode_has_cow_data(ip)) {14081408+ error = xfs_reflink_cancel_cow_range(ip, 0, NULLFILEOFF, true);14091409+ if (error)14101410+ goto out;14111411+ }1409141214101413 if (VFS_I(ip)->i_nlink != 0) {14111414 /*
+2-4
fs/xfs/xfs_iomap.c
···976976 if (!xfs_is_cow_inode(ip))977977 return 0;978978979979- if (!written) {980980- xfs_reflink_cancel_cow_range(ip, pos, length, true);981981- return 0;982982- }979979+ if (!written)980980+ return xfs_reflink_cancel_cow_range(ip, pos, length, true);983981984982 return xfs_reflink_end_cow(ip, pos, written);985983}
+1
include/asm-generic/vmlinux.lds.h
···10381038 *(.discard) \10391039 *(.discard.*) \10401040 *(.export_symbol) \10411041+ *(.no_trim_symbol) \10411042 *(.modinfo) \10421043 /* ld.bfd warns about .gnu.version* even when not emitted */ \10431044 *(.gnu.version*) \
···11+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */22+/*33+ * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.44+ */55+66+#ifndef _DT_BINDINGS_CLK_QCOM_QCS8300_CAM_CC_H77+#define _DT_BINDINGS_CLK_QCOM_QCS8300_CAM_CC_H88+99+#include "qcom,sa8775p-camcc.h"1010+1111+/* QCS8300 introduces below new clocks compared to SA8775P */1212+1313+/* CAM_CC clocks */1414+#define CAM_CC_TITAN_TOP_ACCU_SHIFT_CLK 861515+1616+#endif
+17
include/dt-bindings/clock/qcom,qcs8300-gpucc.h
···11+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */22+/*33+ * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.44+ */55+66+#ifndef _DT_BINDINGS_CLK_QCOM_GPUCC_QCS8300_H77+#define _DT_BINDINGS_CLK_QCOM_GPUCC_QCS8300_H88+99+#include "qcom,sa8775p-gpucc.h"1010+1111+/* QCS8300 introduces below new clocks compared to SA8775P */1212+1313+/* GPU_CC clocks */1414+#define GPU_CC_CX_ACCU_SHIFT_CLK 231515+#define GPU_CC_GX_ACCU_SHIFT_CLK 241616+1717+#endif
+14-4
include/linux/blk-mq.h
···861861 void (*complete)(struct io_comp_batch *))862862{863863 /*864864- * blk_mq_end_request_batch() can't end request allocated from865865- * sched tags864864+ * Check various conditions that exclude batch processing:865865+ * 1) No batch container866866+ * 2) Has scheduler data attached867867+ * 3) Not a passthrough request and end_io set868868+ * 4) Not a passthrough request and an ioerror866869 */867867- if (!iob || (req->rq_flags & RQF_SCHED_TAGS) || ioerror ||868868- (req->end_io && !blk_rq_is_passthrough(req)))870870+ if (!iob)869871 return false;872872+ if (req->rq_flags & RQF_SCHED_TAGS)873873+ return false;874874+ if (!blk_rq_is_passthrough(req)) {875875+ if (req->end_io)876876+ return false;877877+ if (ioerror < 0)878878+ return false;879879+ }870880871881 if (!iob->complete)872882 iob->complete = complete;
+3-3
include/linux/cgroup-defs.h
···71717272 /* Cgroup is frozen. */7373 CGRP_FROZEN,7474-7575- /* Control group has to be killed. */7676- CGRP_KILL,7774};78757976/* cgroup_root->flags */···457460 int nr_populated_threaded_children;458461459462 int nr_threaded_children; /* # of live threaded child cgroups */463463+464464+ /* sequence number for cgroup.kill, serialized by css_set_lock. */465465+ unsigned int kill_seq;460466461467 struct kernfs_node *kn; /* cgroup kernfs entry */462468 struct cgroup_file procs_file; /* handle for "cgroup.procs" */
+19-13
include/linux/compiler.h
···191191 __v; \192192})193193194194+#ifdef __CHECKER__195195+#define __BUILD_BUG_ON_ZERO_MSG(e, msg) (0)196196+#else /* __CHECKER__ */197197+#define __BUILD_BUG_ON_ZERO_MSG(e, msg) ((int)sizeof(struct {_Static_assert(!(e), msg);}))198198+#endif /* __CHECKER__ */199199+200200+/* &a[0] degrades to a pointer: a different type from an array */201201+#define __is_array(a) (!__same_type((a), &(a)[0]))202202+#define __must_be_array(a) __BUILD_BUG_ON_ZERO_MSG(!__is_array(a), \203203+ "must be array")204204+205205+#define __is_byte_array(a) (__is_array(a) && sizeof((a)[0]) == 1)206206+#define __must_be_byte_array(a) __BUILD_BUG_ON_ZERO_MSG(!__is_byte_array(a), \207207+ "must be byte array")208208+209209+/* Require C Strings (i.e. NUL-terminated) lack the "nonstring" attribute. */210210+#define __must_be_cstr(p) \211211+ __BUILD_BUG_ON_ZERO_MSG(__annotated(p, nonstring), "must be cstr (NUL-terminated)")212212+194213#endif /* __KERNEL__ */195214196215/**···249230 .popsection;250231251232#define __ADDRESSABLE_ASM_STR(sym) __stringify(__ADDRESSABLE_ASM(sym))252252-253253-#ifdef __CHECKER__254254-#define __BUILD_BUG_ON_ZERO_MSG(e, msg) (0)255255-#else /* __CHECKER__ */256256-#define __BUILD_BUG_ON_ZERO_MSG(e, msg) ((int)sizeof(struct {_Static_assert(!(e), msg);}))257257-#endif /* __CHECKER__ */258258-259259-/* &a[0] degrades to a pointer: a different type from an array */260260-#define __must_be_array(a) __BUILD_BUG_ON_ZERO_MSG(__same_type((a), &(a)[0]), "must be array")261261-262262-/* Require C Strings (i.e. NUL-terminated) lack the "nonstring" attribute. */263263-#define __must_be_cstr(p) \264264- __BUILD_BUG_ON_ZERO_MSG(__annotated(p, nonstring), "must be cstr (NUL-terminated)")265233266234/*267235 * This returns a constant expression while determining if an argument is
+69
include/linux/device/faux.h
···11+/* SPDX-License-Identifier: GPL-2.0-only */22+/*33+ * Copyright (c) 2025 Greg Kroah-Hartman <gregkh@linuxfoundation.org>44+ * Copyright (c) 2025 The Linux Foundation55+ *66+ * A "simple" faux bus that allows devices to be created and added77+ * automatically to it.  This is to be used whenever you need to create a88+ * device that is not associated with any "real" system resources, and do99+ * not want to have to deal with a bus/driver binding logic.  It is1010+ * intended to be very simple, with only a create and a destroy function1111+ * available.1212+ */1313+#ifndef _FAUX_DEVICE_H_1414+#define _FAUX_DEVICE_H_1515+1616+#include <linux/container_of.h>1717+#include <linux/device.h>1818+1919+/**2020+ * struct faux_device - a "faux" device2121+ * @dev:	internal struct device of the object2222+ *2323+ * A simple faux device that can be created/destroyed.  To be used when a2424+ * driver only needs to have a device to "hang" something off.  This can be2525+ * used for downloading firmware or other basic tasks.  Use this instead of2626+ * a struct platform_device if the device has no resources assigned to2727+ * it at all.2828+ */2929+struct faux_device {3030+	struct device dev;3131+};3232+#define to_faux_device(x) container_of_const((x), struct faux_device, dev)3333+3434+/**3535+ * struct faux_device_ops - a set of callbacks for a struct faux_device3636+ * @probe:	called when a faux device is probed by the driver core3737+ *		before the device is fully bound to the internal faux bus3838+ *		code.  If probe succeeds, return 0, otherwise return a3939+ *		negative error number to stop the probe sequence from4040+ *		succeeding.4141+ * @remove:	called when a faux device is removed from the system4242+ *4343+ * Both @probe and @remove are optional, if not needed, set to NULL.4444+ */4545+struct faux_device_ops {4646+	int (*probe)(struct faux_device *faux_dev);4747+	void (*remove)(struct faux_device *faux_dev);4848+};4949+5050+struct faux_device *faux_device_create(const char *name,5151+				       struct device *parent,5252+				       const struct faux_device_ops *faux_ops);5353+struct faux_device *faux_device_create_with_groups(const char *name,5454+						   struct device *parent,5555+						   const struct faux_device_ops *faux_ops,5656+						   const struct attribute_group **groups);5757+void faux_device_destroy(struct faux_device *faux_dev);5858+5959+static inline void *faux_device_get_drvdata(const struct faux_device *faux_dev)6060+{6161+	return dev_get_drvdata(&faux_dev->dev);6262+}6363+6464+static inline void faux_device_set_drvdata(struct faux_device *faux_dev, void *data)6565+{6666+	dev_set_drvdata(&faux_dev->dev, data);6767+}6868+6969+#endif /* _FAUX_DEVICE_H_ */
···222222#define FMODE_FSNOTIFY_HSM(mode) 0223223#endif224224225225-226225/*227226 * Attribute flags. These should be or-ed together to figure out what228227 * has been changed!···790791791792static inline void inode_set_cached_link(struct inode *inode, char *link, int linklen)792793{794794+ int testlen;795795+796796+ /*797797+ * TODO: patch it into a debug-only check if relevant macros show up.798798+ * In the meantime, since we are suffering strlen even on production kernels799799+ * to find the right length, do a fixup if the wrong value got passed.800800+ */801801+ testlen = strlen(link);802802+ if (testlen != linklen) {803803+ WARN_ONCE(1, "bad length passed for symlink [%s] (got %d, expected %d)",804804+ link, linklen, testlen);805805+ linklen = testlen;806806+ }793807 inode->i_link = link;794808 inode->i_linklen = linklen;795809 inode->i_opflags |= IOP_CACHED_LINK;···31503138 if (unlikely(!exe_file || FMODE_FSNOTIFY_HSM(exe_file->f_mode)))31513139 return;31523140 allow_write_access(exe_file);31413141+}31423142+31433143+static inline void file_set_fsnotify_mode(struct file *file, fmode_t mode)31443144+{31453145+ file->f_mode &= ~FMODE_FSNOTIFY_MASK;31463146+ file->f_mode |= mode;31533147}3154314831553149static inline bool inode_is_open_for_write(const struct inode *inode)
···244244 * @id_table: List of I2C devices supported by this driver245245 * @detect: Callback for device detection246246 * @address_list: The I2C addresses to probe (for detect)247247+ * @clients: List of detected clients we created (for i2c-core use only)247248 * @flags: A bitmask of flags defined in &enum i2c_driver_flags248249 *249250 * The driver.owner field should be set to the module owner of this driver.···299298	/* Device detection callback for automatic device creation */300299	int (*detect)(struct i2c_client *client, struct i2c_board_info *info);301300	const unsigned short *address_list;301301+	struct list_head clients;302302303303	u32 flags;304304};···315313 * @dev: Driver model device node for the slave.316314 * @init_irq: IRQ that was set at initialization317315 * @irq: indicates the IRQ generated by this device (if any)316316+ * @detected: member of an i2c_driver.clients list or i2c-core's317317+ *	userspace_devices list318318 * @slave_cb: Callback when I2C slave mode of an adapter is used. The adapter319319 *	calls it to pass on slave events to the slave driver.320320 * @devres_group_id: id of the devres group that will be created for resources···336332#define I2C_CLIENT_SLAVE	0x20	/* we are the slave */337333#define I2C_CLIENT_HOST_NOTIFY	0x40	/* We want to use I2C host notify */338334#define I2C_CLIENT_WAKE		0x80	/* for board_info; true iff can wake */339339-#define I2C_CLIENT_AUTO		0x100	/* client was auto-detected */340340-#define I2C_CLIENT_USER		0x200	/* client was userspace-created */341335#define I2C_CLIENT_SCCB		0x9000	/* Use Omnivision SCCB protocol */342336					/* Must match I2C_M_STOP|IGNORE_NAK */343337···347345	struct device dev;		/* the device structure		*/348346	int init_irq;			/* irq set at initialization	*/349347	int irq;			/* irq issued by device		*/348348+	struct list_head detected;350349#if IS_ENABLED(CONFIG_I2C_SLAVE)351350	i2c_slave_cb_t slave_cb;	/* callback for slave mode	*/352351#endif···753750	int nr;754751	char name[48];755752	struct completion dev_released;753753+754754+	struct mutex userspace_clients_lock;755755+	struct list_head userspace_clients;756756757757	struct i2c_bus_recovery_info *bus_recovery_info;758758	const struct i2c_adapter_quirks *quirks;
···815815#ifdef CONFIG_CRYPTO_DEV_SP_PSP816816817817/**818818+ * sev_module_init - perform PSP SEV module initialization819819+ *820820+ * Returns:821821+ * 0 if the PSP module is successfully initialized822822+ * negative value if the PSP module initialization fails823823+ */824824+int sev_module_init(void);825825+826826+/**818827 * sev_platform_init - perform SEV INIT command819828 *820829 * @args: struct sev_platform_init_args to pass in arguments
···411411/* GFX12 and later: */412412#define AMDGPU_TILING_GFX12_SWIZZLE_MODE_SHIFT 0413413#define AMDGPU_TILING_GFX12_SWIZZLE_MODE_MASK 0x7414414-/* These are DCC recompression setting for memory management: */414414+/* These are DCC recompression settings for memory management: */415415#define AMDGPU_TILING_GFX12_DCC_MAX_COMPRESSED_BLOCK_SHIFT 3416416#define AMDGPU_TILING_GFX12_DCC_MAX_COMPRESSED_BLOCK_MASK 0x3 /* 0:64B, 1:128B, 2:256B */417417#define AMDGPU_TILING_GFX12_DCC_NUMBER_TYPE_SHIFT 5418418#define AMDGPU_TILING_GFX12_DCC_NUMBER_TYPE_MASK 0x7 /* CB_COLOR0_INFO.NUMBER_TYPE */419419#define AMDGPU_TILING_GFX12_DCC_DATA_FORMAT_SHIFT 8420420#define AMDGPU_TILING_GFX12_DCC_DATA_FORMAT_MASK 0x3f /* [0:4]:CB_COLOR0_INFO.FORMAT, [5]:MM */421421+/* When clearing the buffer or moving it from VRAM to GTT, don't compress and set DCC metadata422422+ * to uncompressed. Set when parts of an allocation bypass DCC and read raw data. */423423+#define AMDGPU_TILING_GFX12_DCC_WRITE_COMPRESS_DISABLE_SHIFT 14424424+#define AMDGPU_TILING_GFX12_DCC_WRITE_COMPRESS_DISABLE_MASK 0x1425425+/* bit gap */426426+#define AMDGPU_TILING_GFX12_SCANOUT_SHIFT 63427427+#define AMDGPU_TILING_GFX12_SCANOUT_MASK 0x1421428422429/* Set/Get helpers for tiling flags. */423430#define AMDGPU_TILING_SET(field, value) \
+2
include/uapi/linux/ethtool.h
···682682 * @ETH_SS_STATS_ETH_CTRL: names of IEEE 802.3 MAC Control statistics683683 * @ETH_SS_STATS_RMON: names of RMON statistics684684 * @ETH_SS_STATS_PHY: names of PHY(dev) statistics685685+ * @ETH_SS_TS_FLAGS: hardware timestamping flags685686 *686687 * @ETH_SS_COUNT: number of defined string sets687688 */···709708 ETH_SS_STATS_ETH_CTRL,710709 ETH_SS_STATS_RMON,711710 ETH_SS_STATS_PHY,711711+ ETH_SS_TS_FLAGS,712712713713 /* add new constants above here */714714 ETH_SS_COUNT
···5454 continue;55555656 if (cmd->flags & IORING_URING_CMD_CANCELABLE) {5757- /* ->sqe isn't available if no async data */5858- if (!req_has_async_data(req))5959- cmd->sqe = NULL;6057 file->f_op->uring_cmd(cmd, IO_URING_F_CANCEL |6158 IO_URING_F_COMPLETE_DEFER);6259 ret = true;···176179 return -ENOMEM;177180 cache->op_data = NULL;178181179179- if (!(req->flags & REQ_F_FORCE_ASYNC)) {180180- /* defer memcpy until we need it */181181- ioucmd->sqe = sqe;182182- return 0;183183- }184184-182182+ /*183183+ * Unconditionally cache the SQE for now - this is only needed for184184+ * requests that go async, but prep handlers must ensure that any185185+ * sqe data is stable beyond prep. Since uring_cmd is special in186186+ * that it doesn't read in per-op data, play it safe and ensure that187187+ * any SQE data is stable beyond prep. This can later get relaxed.188188+ */185189 memcpy(cache->sqes, sqe, uring_sqe_size(req->ctx));186190 ioucmd->sqe = cache->sqes;187191 return 0;···247249 }248250249251 ret = file->f_op->uring_cmd(ioucmd, issue_flags);250250- if (ret == -EAGAIN) {251251- struct io_uring_cmd_data *cache = req->async_data;252252-253253- if (ioucmd->sqe != (void *) cache)254254- memcpy(cache->sqes, ioucmd->sqe, uring_sqe_size(req->ctx));255255- return -EAGAIN;256256- } else if (ret == -EIOCBQUEUED) {257257- return -EIOCBQUEUED;258258- }259259-252252+ if (ret == -EAGAIN || ret == -EIOCBQUEUED)253253+ return ret;260254 if (ret < 0)261255 req_set_fail(req);262256 io_req_uring_cleanup(req, issue_flags);
···285285}286286287287extern void __futex_unqueue(struct futex_q *q);288288-extern void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb);288288+extern void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb,289289+ struct task_struct *task);289290extern int futex_unqueue(struct futex_q *q);290291291292/**292293 * futex_queue() - Enqueue the futex_q on the futex_hash_bucket293294 * @q: The futex_q to enqueue294295 * @hb: The destination hash bucket296296+ * @task: Task queueing this futex295297 *296298 * The hb->lock must be held by the caller, and is released here. A call to297299 * futex_queue() is typically paired with exactly one call to futex_unqueue(). The···301299 * or nothing if the unqueue is done as part of the wake process and the unqueue302300 * state is implicit in the state of woken task (see futex_wait_requeue_pi() for303301 * an example).302302+ *303303+ * Note that @task may be NULL, for async usage of futexes.304304 */305305-static inline void futex_queue(struct futex_q *q, struct futex_hash_bucket *hb)305305+static inline void futex_queue(struct futex_q *q, struct futex_hash_bucket *hb,306306+ struct task_struct *task)306307 __releases(&hb->lock)307308{308308- __futex_queue(q, hb);309309+ __futex_queue(q, hb, task);309310 spin_unlock(&hb->lock);310311}311312
+1-1
kernel/futex/pi.c
···982982 /*983983 * Only actually queue now that the atomic ops are done:984984 */985985- __futex_queue(&q, hb);985985+ __futex_queue(&q, hb, current);986986987987 if (trylock) {988988 ret = rt_mutex_futex_trylock(&q.pi_state->pi_mutex);
+2-2
kernel/futex/waitwake.c
···349349 * access to the hash list and forcing another memory barrier.350350 */351351 set_current_state(TASK_INTERRUPTIBLE|TASK_FREEZABLE);352352- futex_queue(q, hb);352352+ futex_queue(q, hb, current);353353354354 /* Arm the timer */355355 if (timeout)···460460 * next futex. Queue each futex at this moment so hb can461461 * be unlocked.462462 */463463- futex_queue(q, hb);463463+ futex_queue(q, hb, current);464464 continue;465465 }466466
-4
kernel/irq/Kconfig
···3131config GENERIC_PENDING_IRQ3232 bool33333434-# Deduce delayed migration from top-level interrupt chip flags3535-config GENERIC_PENDING_IRQ_CHIPFLAGS3636- bool3737-3834# Support for generic irq migrating off cpu before the cpu is offline.3935config GENERIC_IRQ_MIGRATION4036 bool
+2-2
kernel/kthread.c
···859859 struct kthread *kthread = to_kthread(p);860860 cpumask_var_t affinity;861861 unsigned long flags;862862- int ret;862862+ int ret = 0;863863864864 if (!wait_task_inactive(p, TASK_UNINTERRUPTIBLE) || kthread->started) {865865 WARN_ON(1);···892892out:893893 free_cpumask_var(affinity);894894895895- return 0;895895+ return ret;896896}897897898898/*
+2-2
kernel/sched/autogroup.c
···150150 * see this thread after that: we can no longer use signal->autogroup.151151 * See the PF_EXITING check in task_wants_autogroup().152152 */153153- sched_move_task(p);153153+ sched_move_task(p, true);154154}155155156156static void···182182 * sched_autogroup_exit_task().183183 */184184 for_each_thread(p, t)185185- sched_move_task(t);185185+ sched_move_task(t, true);186186187187 unlock_task_sighand(p, &flags);188188 autogroup_kref_put(prev);
+7-5
kernel/sched/core.c
···10631063 struct task_struct *task;1064106410651065 task = container_of(node, struct task_struct, wake_q);10661066- /* Task can safely be re-inserted now: */10671066 node = node->next;10681068- task->wake_q.next = NULL;10671067+ /* pairs with cmpxchg_relaxed() in __wake_q_add() */10681068+ WRITE_ONCE(task->wake_q.next, NULL);10691069+ /* Task can safely be re-inserted now. */1069107010701071 /*10711072 * wake_up_process() executes a full barrier, which pairs with···90519050 * now. This function just updates tsk->se.cfs_rq and tsk->se.parent to reflect90529051 * its new group.90539052 */90549054-void sched_move_task(struct task_struct *tsk)90539053+void sched_move_task(struct task_struct *tsk, bool for_autogroup)90559054{90569055 int queued, running, queue_flags =90579056 DEQUEUE_SAVE | DEQUEUE_MOVE | DEQUEUE_NOCLOCK;···90809079 put_prev_task(rq, tsk);9081908090829081 sched_change_group(tsk, group);90839083- scx_move_task(tsk);90829082+ if (!for_autogroup)90839083+ scx_cgroup_move_task(tsk);9084908490859085 if (queued)90869086 enqueue_task(rq, tsk, queue_flags);···91829180 struct cgroup_subsys_state *css;9183918191849182 cgroup_taskset_for_each(task, css, tset)91859185- sched_move_task(task);91839183+ sched_move_task(task, false);9186918491879185 scx_cgroup_finish_attach();91889186}
+2
kernel/sched/debug.c
···12621262 if (task_has_dl_policy(p)) {12631263 P(dl.runtime);12641264 P(dl.deadline);12651265+ } else if (fair_policy(p->policy)) {12661266+ P(se.slice);12651267 }12661268#ifdef CONFIG_SCHED_CLASS_EXT12671269 __PS("ext.enabled", task_on_scx(p));
+76-37
kernel/sched/ext.c
···123123 SCX_OPS_SWITCH_PARTIAL = 1LLU << 3,124124125125 /*126126+ * A migration disabled task can only execute on its current CPU. By127127+ * default, such tasks are automatically put on the CPU's local DSQ with128128+ * the default slice on enqueue. If this ops flag is set, they also go129129+ * through ops.enqueue().130130+ *131131+ * A migration disabled task never invokes ops.select_cpu() as it can132132+ * only select the current CPU. Also, p->cpus_ptr will only contain its133133+ * current CPU while p->nr_cpus_allowed keeps tracking p->user_cpus_ptr134134+ * and thus may disagree with cpumask_weight(p->cpus_ptr).135135+ */136136+ SCX_OPS_ENQ_MIGRATION_DISABLED = 1LLU << 4,137137+138138+ /*126139 * CPU cgroup support flags127140 */128141 SCX_OPS_HAS_CGROUP_WEIGHT = 1LLU << 16, /* cpu.weight */···143130 SCX_OPS_ALL_FLAGS = SCX_OPS_KEEP_BUILTIN_IDLE |144131 SCX_OPS_ENQ_LAST |145132 SCX_OPS_ENQ_EXITING |133133+ SCX_OPS_ENQ_MIGRATION_DISABLED |146134 SCX_OPS_SWITCH_PARTIAL |147135 SCX_OPS_HAS_CGROUP_WEIGHT,148136};···430416431417 /**432418 * @update_idle: Update the idle state of a CPU433433- * @cpu: CPU to udpate the idle state for419419+ * @cpu: CPU to update the idle state for434420 * @idle: whether entering or exiting the idle state435421 *436422 * This operation is called when @rq's CPU goes or leaves the idle···896882897883static DEFINE_STATIC_KEY_FALSE(scx_ops_enq_last);898884static DEFINE_STATIC_KEY_FALSE(scx_ops_enq_exiting);885885+static DEFINE_STATIC_KEY_FALSE(scx_ops_enq_migration_disabled);899886static DEFINE_STATIC_KEY_FALSE(scx_ops_cpu_preempt);900887static DEFINE_STATIC_KEY_FALSE(scx_builtin_idle_enabled);901888···1229121412301215/**12311216 * nldsq_next_task - Iterate to the next task in a non-local DSQ12321232- * @dsq: user dsq being interated12171217+ * @dsq: user dsq being iterated12331218 * @cur: current position, %NULL to start iteration12341219 * @rev: walk backwards12351220 *···20292014 unlikely(p->flags & PF_EXITING))20302015 goto 
local;2031201620172017+ /* see %SCX_OPS_ENQ_MIGRATION_DISABLED */20182018+ if (!static_branch_unlikely(&scx_ops_enq_migration_disabled) &&20192019+ is_migration_disabled(p))20202020+ goto local;20212021+20322022 if (!SCX_HAS_OP(enqueue))20332023 goto global;20342024···2098207820992079 /*21002080 * list_add_tail() must be used. scx_ops_bypass() depends on tasks being21012101- * appened to the runnable_list.20812081+ * appended to the runnable_list.21022082 */21032083 list_add_tail(&p->scx.runnable_node, &rq->scx.runnable_list);21042084}···23332313 *23342314 * - The BPF scheduler is bypassed while the rq is offline and we can always say23352315 * no to the BPF scheduler initiated migrations while offline.23162316+ *23172317+ * The caller must ensure that @p and @rq are on different CPUs.23362318 */23372319static bool task_can_run_on_remote_rq(struct task_struct *p, struct rq *rq,23382320 bool trigger_error)23392321{23402322 int cpu = cpu_of(rq);23232323+23242324+ SCHED_WARN_ON(task_cpu(p) == cpu);23252325+23262326+ /*23272327+ * If @p has migration disabled, @p->cpus_ptr is updated to contain only23282328+ * the pinned CPU in migrate_disable_switch() while @p is being switched23292329+ * out. 
However, put_prev_task_scx() is called before @p->cpus_ptr is23302330+	 * updated and thus another CPU may see @p on a DSQ in between leading to23312331+	 * @p passing the below task_allowed_on_cpu() check while migration is23322332+	 * disabled.23332333+	 *23342334+	 * Test the migration disabled state first as the race window is narrow23352335+	 * and the BPF scheduler failing to check migration disabled state can23362336+	 * easily be masked if task_allowed_on_cpu() is done first.23372337+	 */23382338+	if (unlikely(is_migration_disabled(p))) {23392339+		if (trigger_error)23402340+			scx_ops_error("SCX_DSQ_LOCAL[_ON] cannot move migration disabled %s[%d] from CPU %d to %d",23412341+				      p->comm, p->pid, task_cpu(p), cpu);23422342+		return false;23432343+	}2341234423422345	/*23432346	 * We don't require the BPF scheduler to avoid dispatching to offline···23702327	 */23712328	if (!task_allowed_on_cpu(p, cpu)) {23722329		if (trigger_error)23732373-			scx_ops_error("SCX_DSQ_LOCAL[_ON] verdict target cpu %d not allowed for %s[%d]",23742374-				      cpu_of(rq), p->comm, p->pid);23302330+			scx_ops_error("SCX_DSQ_LOCAL[_ON] target CPU %d not allowed for %s[%d]",23312331+				      cpu, p->comm, p->pid);23752332		return false;23762333	}23772377-23782378-	if (unlikely(is_migration_disabled(p)))23792379-		return false;2380233423812335	if (!scx_rq_online(rq))23822336		return false;···24772437	24782438	if (dst_dsq->id == SCX_DSQ_LOCAL) {24792439		dst_rq = container_of(dst_dsq, struct rq, scx.local_dsq);24802480-		if (!task_can_run_on_remote_rq(p, dst_rq, true)) {24402440+		if (src_rq != dst_rq &&24412441+		    unlikely(!task_can_run_on_remote_rq(p, dst_rq, true))) {24812442			dst_dsq = find_global_dsq(p);24822443			dst_rq = src_rq;24832444		}···25212480/*25222481 * A poorly behaving BPF scheduler can live-lock the system by e.g. incessantly25232482 * banging on the same DSQ on a large NUMA system to the point where switching25242524- * to the bypass mode can take a long time. 
Inject artifical delays while the24832483+ * to the bypass mode can take a long time. Inject artificial delays while the25252484 * bypass mode is switching to guarantee timely completion.25262485 */25272486static void scx_ops_breather(struct rq *rq)···26162575{26172576 struct rq *src_rq = task_rq(p);26182577 struct rq *dst_rq = container_of(dst_dsq, struct rq, scx.local_dsq);25782578+#ifdef CONFIG_SMP25792579+ struct rq *locked_rq = rq;25802580+#endif2619258126202582 /*26212583 * We're synchronized against dequeue through DISPATCHING. As @p can't···26322588 }2633258926342590#ifdef CONFIG_SMP26352635- if (unlikely(!task_can_run_on_remote_rq(p, dst_rq, true))) {25912591+ if (src_rq != dst_rq &&25922592+ unlikely(!task_can_run_on_remote_rq(p, dst_rq, true))) {26362593 dispatch_enqueue(find_global_dsq(p), p,26372594 enq_flags | SCX_ENQ_CLEAR_OPSS);26382595 return;···26562611 atomic_long_set_release(&p->scx.ops_state, SCX_OPSS_NONE);2657261226582613 /* switch to @src_rq lock */26592659- if (rq != src_rq) {26602660- raw_spin_rq_unlock(rq);26142614+ if (locked_rq != src_rq) {26152615+ raw_spin_rq_unlock(locked_rq);26162616+ locked_rq = src_rq;26612617 raw_spin_rq_lock(src_rq);26622618 }26632619···26762630 } else {26772631 move_remote_task_to_local_dsq(p, enq_flags,26782632 src_rq, dst_rq);26332633+ /* task has been moved to dst_rq, which is now locked */26342634+ locked_rq = dst_rq;26792635 }2680263626812637 /* if the destination CPU is idle, wake it up */···26862638 }2687263926882640 /* switch back to @rq lock */26892689- if (rq != dst_rq) {26902690- raw_spin_rq_unlock(dst_rq);26412641+ if (locked_rq != rq) {26422642+ raw_spin_rq_unlock(locked_rq);26912643 raw_spin_rq_lock(rq);26922644 }26932645#else /* CONFIG_SMP */···31923144 *31933145 * Unless overridden by ops.core_sched_before(), @p->scx.core_sched_at is used31943146 * to implement the default task ordering. 
The older the timestamp, the higher31953195- * prority the task - the global FIFO ordering matching the default scheduling31473147+ * priority the task - the global FIFO ordering matching the default scheduling31963148 * behavior.31973149 *31983150 * When ops.core_sched_before() is enabled, @p->scx.core_sched_at is used to···38993851 curr->scx.slice = 0;39003852 touch_core_sched(rq, curr);39013853 } else if (SCX_HAS_OP(tick)) {39023902- SCX_CALL_OP(SCX_KF_REST, tick, curr);38543854+ SCX_CALL_OP_TASK(SCX_KF_REST, tick, curr);39033855 }3904385639053857 if (!curr->scx.slice)···40463998 WARN_ON_ONCE(scx_get_task_state(p) != SCX_TASK_ENABLED);4047399940484000 if (SCX_HAS_OP(disable))40494049- SCX_CALL_OP(SCX_KF_REST, disable, p);40014001+ SCX_CALL_OP_TASK(SCX_KF_REST, disable, p);40504002 scx_set_task_state(p, SCX_TASK_READY);40514003}40524004···40754027 }4076402840774029 if (SCX_HAS_OP(exit_task))40784078- SCX_CALL_OP(SCX_KF_REST, exit_task, p, &args);40304030+ SCX_CALL_OP_TASK(SCX_KF_REST, exit_task, p, &args);40794031 scx_set_task_state(p, SCX_TASK_NONE);40804032}40814033···43714323 return ops_sanitize_err("cgroup_prep_move", ret);43724324}4373432543744374-void scx_move_task(struct task_struct *p)43264326+void scx_cgroup_move_task(struct task_struct *p)43754327{43764328 if (!scx_cgroup_enabled)43774377- return;43784378-43794379- /*43804380- * We're called from sched_move_task() which handles both cgroup and43814381- * autogroup moves. Ignore the latter.43824382- *43834383- * Also ignore exiting tasks, because in the exit path tasks transition43844384- * from the autogroup to the root group, so task_group_is_autogroup()43854385- * alone isn't able to catch exiting autogroup tasks. 
This is safe for43864386- * cgroup_move(), because cgroup migrations never happen for PF_EXITING43874387- * tasks.43884388- */43894389- if (task_group_is_autogroup(task_group(p)) || (p->flags & PF_EXITING))43904329 return;4391433043924331 /*···46254590 cgroup_warned_missing_idle = false;4626459146274592 /*46284628- * scx_tg_on/offline() are excluded thorugh scx_cgroup_rwsem. If we walk45934593+ * scx_tg_on/offline() are excluded through scx_cgroup_rwsem. If we walk46294594 * cgroups and init, all online cgroups are initialized.46304595 */46314596 rcu_read_lock();···50945059 static_branch_disable(&scx_has_op[i]);50955060 static_branch_disable(&scx_ops_enq_last);50965061 static_branch_disable(&scx_ops_enq_exiting);50625062+ static_branch_disable(&scx_ops_enq_migration_disabled);50975063 static_branch_disable(&scx_ops_cpu_preempt);50985064 static_branch_disable(&scx_builtin_idle_enabled);50995065 synchronize_rcu();···53135277 scx_get_task_state(p), p->scx.flags & ~SCX_TASK_STATE_MASK,53145278 p->scx.dsq_flags, ops_state & SCX_OPSS_STATE_MASK,53155279 ops_state >> SCX_OPSS_QSEQ_SHIFT);53165316- dump_line(s, " sticky/holding_cpu=%d/%d dsq_id=%s dsq_vtime=%llu slice=%llu",53175317- p->scx.sticky_cpu, p->scx.holding_cpu, dsq_id_buf,53185318- p->scx.dsq_vtime, p->scx.slice);52805280+ dump_line(s, " sticky/holding_cpu=%d/%d dsq_id=%s",52815281+ p->scx.sticky_cpu, p->scx.holding_cpu, dsq_id_buf);52825282+ dump_line(s, " dsq_vtime=%llu slice=%llu weight=%u",52835283+ p->scx.dsq_vtime, p->scx.slice, p->scx.weight);53195284 dump_line(s, " cpus=%*pb", cpumask_pr_args(p->cpus_ptr));5320528553215286 if (SCX_HAS_OP(dump_task)) {···5704566757055668 if (ops->flags & SCX_OPS_ENQ_EXITING)57065669 static_branch_enable(&scx_ops_enq_exiting);56705670+ if (ops->flags & SCX_OPS_ENQ_MIGRATION_DISABLED)56715671+ static_branch_enable(&scx_ops_enq_migration_disabled);57075672 if (scx_ops.cpu_acquire || scx_ops.cpu_release)57085673 static_branch_enable(&scx_ops_cpu_preempt);57095674
···53855385static void set_delayed(struct sched_entity *se)53865386{53875387 se->sched_delayed = 1;53885388+53895389+ /*53905390+ * Delayed se of cfs_rq have no tasks queued on them.53915391+ * Do not adjust h_nr_runnable since dequeue_entities()53925392+ * will account it for blocked tasks.53935393+ */53945394+ if (!entity_is_task(se))53955395+ return;53965396+53885397 for_each_sched_entity(se) {53895398 struct cfs_rq *cfs_rq = cfs_rq_of(se);53905399···54065397static void clear_delayed(struct sched_entity *se)54075398{54085399 se->sched_delayed = 0;54005400+54015401+ /*54025402+ * Delayed se of cfs_rq have no tasks queued on them.54035403+ * Do not adjust h_nr_runnable since a dequeue has54045404+ * already accounted for it or an enqueue of a task54055405+ * below it will account for it in enqueue_task_fair().54065406+ */54075407+ if (!entity_is_task(se))54085408+ return;54095409+54095410 for_each_sched_entity(se) {54105411 struct cfs_rq *cfs_rq = cfs_rq_of(se);54115412
···749749 if (WARN_ON_ONCE(!fprog))750750 return false;751751752752+ /* Our single exception to filtering. */753753+#ifdef __NR_uretprobe754754+#ifdef SECCOMP_ARCH_COMPAT755755+ if (sd->arch == SECCOMP_ARCH_NATIVE)756756+#endif757757+ if (sd->nr == __NR_uretprobe)758758+ return true;759759+#endif760760+752761 for (pc = 0; pc < fprog->len; pc++) {753762 struct sock_filter *insn = &fprog->filter[pc];754763 u16 code = insn->code;···10321023 */10331024static const int mode1_syscalls[] = {10341025 __NR_seccomp_read, __NR_seccomp_write, __NR_seccomp_exit, __NR_seccomp_sigreturn,10261026+#ifdef __NR_uretprobe10271027+ __NR_uretprobe,10281028+#endif10351029 -1, /* negative terminated */10361030};10371031
+6-3
kernel/time/clocksource.c
···373373 cpumask_clear(&cpus_ahead);374374 cpumask_clear(&cpus_behind);375375 cpus_read_lock();376376- preempt_disable();376376+ migrate_disable();377377 clocksource_verify_choose_cpus();378378 if (cpumask_empty(&cpus_chosen)) {379379- preempt_enable();379379+ migrate_enable();380380 cpus_read_unlock();381381 pr_warn("Not enough CPUs to check clocksource '%s'.\n", cs->name);382382 return;383383 }384384 testcpu = smp_processor_id();385385- pr_warn("Checking clocksource %s synchronization from CPU %d to CPUs %*pbl.\n", cs->name, testcpu, cpumask_pr_args(&cpus_chosen));385385+ pr_info("Checking clocksource %s synchronization from CPU %d to CPUs %*pbl.\n",386386+ cs->name, testcpu, cpumask_pr_args(&cpus_chosen));387387+ preempt_disable();386388 for_each_cpu(cpu, &cpus_chosen) {387389 if (cpu == testcpu)388390 continue;···404402 cs_nsec_min = cs_nsec;405403 }406404 preempt_enable();405405+ migrate_enable();407406 cpus_read_unlock();408407 if (!cpumask_empty(&cpus_ahead))409408 pr_warn(" CPUs %*pbl ahead of CPU %d for clocksource %s.\n",
+94-31
kernel/time/hrtimer.c
···5858#define HRTIMER_ACTIVE_SOFT (HRTIMER_ACTIVE_HARD << MASK_SHIFT)5959#define HRTIMER_ACTIVE_ALL (HRTIMER_ACTIVE_SOFT | HRTIMER_ACTIVE_HARD)60606161+static void retrigger_next_event(void *arg);6262+6163/*6264 * The timer bases:6365 *···113111 .clockid = CLOCK_TAI,114112 .get_time = &ktime_get_clocktai,115113 },116116- }114114+ },115115+ .csd = CSD_INIT(retrigger_next_event, NULL)117116};118117119118static const int hrtimer_clock_to_base_table[MAX_CLOCKS] = {···126123 [CLOCK_BOOTTIME] = HRTIMER_BASE_BOOTTIME,127124 [CLOCK_TAI] = HRTIMER_BASE_TAI,128125};126126+127127+static inline bool hrtimer_base_is_online(struct hrtimer_cpu_base *base)128128+{129129+ if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))130130+ return true;131131+ else132132+ return likely(base->online);133133+}129134130135/*131136 * Functions and macros which are different for UP/SMP systems are kept in a···155144};156145157146#define migration_base migration_cpu_base.clock_base[0]158158-159159-static inline bool is_migration_base(struct hrtimer_clock_base *base)160160-{161161- return base == &migration_base;162162-}163147164148/*165149 * We are using hashed locking: holding per_cpu(hrtimer_bases)[n].lock···189183}190184191185/*192192- * We do not migrate the timer when it is expiring before the next193193- * event on the target cpu. When high resolution is enabled, we cannot194194- * reprogram the target cpu hardware and we would cause it to fire195195- * late. To keep it simple, we handle the high resolution enabled and196196- * disabled case similar.186186+ * Check if the elected target is suitable considering its next187187+ * event and the hotplug state of the current CPU.188188+ *189189+ * If the elected target is remote and its next event is after the timer190190+ * to queue, then a remote reprogram is necessary. However there is no191191+ * guarantee the IPI handling the operation would arrive in time to meet192192+ * the high resolution deadline. 
In this case the local CPU becomes a193193+ * preferred target, unless it is offline.194194+ *195195+ * High and low resolution modes are handled the same way for simplicity.197196 *198197 * Called with cpu_base->lock of target cpu held.199198 */200200-static int201201-hrtimer_check_target(struct hrtimer *timer, struct hrtimer_clock_base *new_base)199199+static bool hrtimer_suitable_target(struct hrtimer *timer, struct hrtimer_clock_base *new_base,200200+				    struct hrtimer_cpu_base *new_cpu_base,201201+				    struct hrtimer_cpu_base *this_cpu_base)202202{203203	ktime_t expires;204204205205+	/*206206+	 * The local CPU clockevent can be reprogrammed. Also get_target_base()207207+	 * guarantees it is online.208208+	 */209209+	if (new_cpu_base == this_cpu_base)210210+		return true;211211+212212+	/*213213+	 * The offline local CPU can't be the default target if the214214+	 * next remote target event is after this timer. Keep the215215+	 * elected new base. An IPI will be issued to reprogram216216+	 * it as a last resort.217217+	 */218218+	if (!hrtimer_base_is_online(this_cpu_base))219219+		return true;220220+205221	expires = ktime_sub(hrtimer_get_expires(timer), new_base->offset);206206-	return expires < new_base->cpu_base->expires_next;222222+223223+	return expires >= new_base->cpu_base->expires_next;207224}208225209209-static inline210210-struct hrtimer_cpu_base *get_target_base(struct hrtimer_cpu_base *base,211211-					 int pinned)226226+static inline struct hrtimer_cpu_base *get_target_base(struct hrtimer_cpu_base *base, int pinned)212227{228228+	if (!hrtimer_base_is_online(base)) {229229+		int cpu = cpumask_any_and(cpu_online_mask, housekeeping_cpumask(HK_TYPE_TIMER));230230+231231+		return &per_cpu(hrtimer_bases, cpu);232232+	}233233+213234#if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)214235	if (static_branch_likely(&timers_migration_enabled) && !pinned)215236		return &per_cpu(hrtimer_bases, get_nohz_timer_target());···287254	raw_spin_unlock(&base->cpu_base->lock);288255	
raw_spin_lock(&new_base->cpu_base->lock);289256290290-		if (new_cpu_base != this_cpu_base &&291291-		    hrtimer_check_target(timer, new_base)) {257257+		if (!hrtimer_suitable_target(timer, new_base, new_cpu_base,258258+					     this_cpu_base)) {292259			raw_spin_unlock(&new_base->cpu_base->lock);293260			raw_spin_lock(&base->cpu_base->lock);294261			new_cpu_base = this_cpu_base;···297264		}298265		WRITE_ONCE(timer->base, new_base);299266	} else {300300-		if (new_cpu_base != this_cpu_base &&301301-		    hrtimer_check_target(timer, new_base)) {267267+		if (!hrtimer_suitable_target(timer, new_base, new_cpu_base, this_cpu_base)) {302268			new_cpu_base = this_cpu_base;303269			goto again;304270		}···306274}307275308276#else /* CONFIG_SMP */309309-310310-static inline bool is_migration_base(struct hrtimer_clock_base *base)311311-{312312-	return false;313313-}314277315278static inline struct hrtimer_clock_base *316279lock_hrtimer_base(const struct hrtimer *timer, unsigned long *flags)···742715{743716	return hrtimer_hres_enabled;744717}745745-746746-static void retrigger_next_event(void *arg);747718748719/*749720 * Switch to high resolution mode···12301205				    u64 delta_ns, const enum hrtimer_mode mode,12311206				    struct hrtimer_clock_base *base)12321207{12081208+	struct hrtimer_cpu_base *this_cpu_base = this_cpu_ptr(&hrtimer_bases);12331209	struct hrtimer_clock_base *new_base;12341210	bool force_local, first;12351211···12421216	 * and enforce reprogramming after it is queued no matter whether12431217	 * it is the new first expiring timer again or not.12441218	 */12451245-	force_local = base->cpu_base == this_cpu_ptr(&hrtimer_bases);12191219+	force_local = base->cpu_base == this_cpu_base;12461220	force_local &= base->cpu_base->next_timer == timer;12211221+12221222+	/*12231223+	 * Don't force local queuing if this enqueue happens on an unplugged12241224+	 * CPU after hrtimer_cpu_dying() has been invoked.12251225+	 */12261226+	force_local &= this_cpu_base->online;1247122712481228	/*12491229	 * Remove an active timer from 
the queue. In case it is not queued···12801248 }1281124912821250 first = enqueue_hrtimer(timer, new_base, mode);12831283- if (!force_local)12841284- return first;12511251+ if (!force_local) {12521252+ /*12531253+ * If the current CPU base is online, then the timer is12541254+ * never queued on a remote CPU if it would be the first12551255+ * expiring timer there.12561256+ */12571257+ if (hrtimer_base_is_online(this_cpu_base))12581258+ return first;12591259+12601260+ /*12611261+ * Timer was enqueued remote because the current base is12621262+ * already offline. If the timer is the first to expire,12631263+ * kick the remote CPU to reprogram the clock event.12641264+ */12651265+ if (first) {12661266+ struct hrtimer_cpu_base *new_cpu_base = new_base->cpu_base;12671267+12681268+ smp_call_function_single_async(new_cpu_base->cpu, &new_cpu_base->csd);12691269+ }12701270+ return 0;12711271+ }1285127212861273 /*12871274 * Timer was forced to stay on the current CPU to avoid···14201369 raw_spin_lock_irq(&cpu_base->lock);14211370 }14221371}13721372+13731373+#ifdef CONFIG_SMP13741374+static __always_inline bool is_migration_base(struct hrtimer_clock_base *base)13751375+{13761376+ return base == &migration_base;13771377+}13781378+#else13791379+static __always_inline bool is_migration_base(struct hrtimer_clock_base *base)13801380+{13811381+ return false;13821382+}13831383+#endif1423138414241385/*14251386 * This function is called on PREEMPT_RT kernels when the fast path
+9-1
kernel/time/timer_migration.c
···1675167516761676 } while (i < tmigr_hierarchy_levels);1677167716781678+ /* Assert single root */16791679+ WARN_ON_ONCE(!err && !group->parent && !list_is_singular(&tmigr_level_list[top]));16801680+16781681 while (i > 0) {16791682 group = stack[--i];16801683···17191716 WARN_ON_ONCE(top == 0);1720171717211718 lvllist = &tmigr_level_list[top];17221722- if (group->num_children == 1 && list_is_singular(lvllist)) {17191719+17201720+ /*17211721+ * Newly created root level should have accounted the upcoming17221722+ * CPU's child group and pre-accounted the old root.17231723+ */17241724+ if (group->num_children == 2 && list_is_singular(lvllist)) {17231725 /*17241726 * The target CPU must never do the prepare work, except17251727 * on early boot when the boot CPU is the target. Otherwise
+26-2
kernel/trace/ring_buffer.c
···16721672 * must be the same.16731673 */16741674static bool rb_meta_valid(struct ring_buffer_meta *meta, int cpu,16751675- struct trace_buffer *buffer, int nr_pages)16751675+ struct trace_buffer *buffer, int nr_pages,16761676+ unsigned long *subbuf_mask)16761677{16771678 int subbuf_size = PAGE_SIZE;16781679 struct buffer_data_page *subbuf;16791680 unsigned long buffers_start;16801681 unsigned long buffers_end;16811682 int i;16831683+16841684+ if (!subbuf_mask)16851685+ return false;1682168616831687 /* Check the meta magic and meta struct size */16841688 if (meta->magic != RING_BUFFER_META_MAGIC ||···1716171217171713 subbuf = rb_subbufs_from_meta(meta);1718171417151715+ bitmap_clear(subbuf_mask, 0, meta->nr_subbufs);17161716+17191717 /* Is the meta buffers and the subbufs themselves have correct data? */17201718 for (i = 0; i < meta->nr_subbufs; i++) {17211719 if (meta->buffers[i] < 0 ||···17311725 return false;17321726 }1733172717281728+ if (test_bit(meta->buffers[i], subbuf_mask)) {17291729+ pr_info("Ring buffer boot meta [%d] array has duplicates\n", cpu);17301730+ return false;17311731+ }17321732+17331733+ set_bit(meta->buffers[i], subbuf_mask);17341734 subbuf = (void *)subbuf + subbuf_size;17351735 }17361736···18501838 cpu_buffer->cpu);18511839 goto invalid;18521840 }18411841+18421842+ /* If the buffer has content, update pages_touched */18431843+ if (ret)18441844+ local_inc(&cpu_buffer->pages_touched);18451845+18531846 entries += ret;18541847 entry_bytes += local_read(&head_page->page->commit);18551848 local_set(&cpu_buffer->head_page->entries, ret);···19061889static void rb_range_meta_init(struct trace_buffer *buffer, int nr_pages)19071890{19081891 struct ring_buffer_meta *meta;18921892+ unsigned long *subbuf_mask;19091893 unsigned long delta;19101894 void *subbuf;19111895 int cpu;19121896 int i;18971897+18981898+ /* Create a mask to test the subbuf array */18991899+ subbuf_mask = bitmap_alloc(nr_pages + 1, GFP_KERNEL);19001900+ /* If subbuf_mask fails to 
allocate, then rb_meta_valid() will return false */1913190119141902 for (cpu = 0; cpu < nr_cpu_ids; cpu++) {19151903 void *next_meta;1916190419171905 meta = rb_range_meta(buffer, nr_pages, cpu);1918190619191919- if (rb_meta_valid(meta, cpu, buffer, nr_pages)) {19071907+ if (rb_meta_valid(meta, cpu, buffer, nr_pages, subbuf_mask)) {19201908 /* Make the mappings match the current address */19211909 subbuf = rb_subbufs_from_meta(meta);19221910 delta = (unsigned long)subbuf - meta->first_buffer;···19651943 subbuf += meta->subbuf_size;19661944 }19671945 }19461946+ bitmap_free(subbuf_mask);19681947}1969194819701949static void *rbm_start(struct seq_file *m, loff_t *pos)···71497126 kfree(cpu_buffer->subbuf_ids);71507127 cpu_buffer->subbuf_ids = NULL;71517128 rb_free_meta_page(cpu_buffer);71297129+ atomic_dec(&cpu_buffer->resize_disabled);71527130 }7153713171547132unlock:
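The ring_buffer.c change validates the boot-mapped meta data by clearing a bitmap and doing a test-then-set per subbuf index, rejecting the buffer on the first duplicate. A minimal userspace sketch of that pattern (helper names here are illustrative stand-ins for the kernel's `bitmap_clear()`/`test_bit()`/`set_bit()`):

```c
#include <stdbool.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Zero every word that covers nbits bits. */
static void bitmap_clear_all(unsigned long *map, size_t nbits)
{
	for (size_t i = 0; i < (nbits + BITS_PER_LONG - 1) / BITS_PER_LONG; i++)
		map[i] = 0;
}

/* Set the bit and report whether it was already set. */
static bool test_and_set(unsigned long *map, size_t bit)
{
	unsigned long mask = 1UL << (bit % BITS_PER_LONG);
	bool was_set = map[bit / BITS_PER_LONG] & mask;

	map[bit / BITS_PER_LONG] |= mask;
	return was_set;
}

/* Return false when any index occurs twice, mirroring the duplicate
 * check rb_meta_valid() now performs on meta->buffers[]. */
static bool indices_unique(const int *buffers, size_t n, unsigned long *map)
{
	bitmap_clear_all(map, n);
	for (size_t i = 0; i < n; i++)
		if (test_and_set(map, (size_t)buffers[i]))
			return false; /* duplicate: meta data is corrupt */
	return true;
}
```

Allocating the mask once in the caller and passing it in, as rb_range_meta_init() does, avoids re-allocating per CPU.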
+5-7
kernel/trace/trace.c
···59775977ssize_t tracing_resize_ring_buffer(struct trace_array *tr,59785978 unsigned long size, int cpu_id)59795979{59805980- int ret;59815981-59825980 guard(mutex)(&trace_types_lock);5983598159845982 if (cpu_id != RING_BUFFER_ALL_CPUS) {···59855987 return -EINVAL;59865988 }5987598959885988- ret = __tracing_resize_ring_buffer(tr, size, cpu_id);59895989- if (ret < 0)59905990- ret = -ENOMEM;59915991-59925992- return ret;59905990+ return __tracing_resize_ring_buffer(tr, size, cpu_id);59935991}5994599259955993static void update_last_data(struct trace_array *tr)···82788284 struct ftrace_buffer_info *info = filp->private_data;82798285 struct trace_iterator *iter = &info->iter;82808286 int ret = 0;82878287+82888288+ /* Currently the boot mapped buffer is not supported for mmap */82898289+ if (iter->tr->flags & TRACE_ARRAY_FL_BOOT)82908290+ return -ENODEV;8281829182828292 ret = get_snapshot_map(iter->tr);82838293 if (ret)
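The trace.c hunk can drop the error-remapping tail because `guard(mutex)(&trace_types_lock)` releases the lock on every return path, so a bare `return __tracing_resize_ring_buffer(...)` is safe. A userspace sketch of that scope-based guard idiom, using the GCC/Clang `cleanup` attribute with a plain flag standing in for the mutex (all names illustrative):

```c
/* Flag standing in for a real lock so the sketch stays self-contained. */
static int lock_held;

static void unguard(int *unused)
{
	(void)unused;
	lock_held = 0;		/* runs whenever the guarded scope is left */
}

#define guard_lock() \
	int guard_ __attribute__((cleanup(unguard))) = (lock_held = 1)

static int resize(int size)
{
	guard_lock();
	if (size < 0)
		return -1;	/* early return: the cleanup still fires */
	return size;		/* normal return: ditto */
}
```

The kernel's real `guard()` in `linux/cleanup.h` is built on the same compiler feature.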
+1-1
kernel/trace/trace_functions_graph.c
···198198 * returning from the function.199199 */200200 if (ftrace_graph_notrace_addr(trace->func)) {201201- *task_var |= TRACE_GRAPH_NOTRACE_BIT;201201+ *task_var |= TRACE_GRAPH_NOTRACE;202202 /*203203 * Need to return 1 to have the return called204204 * that will clear the NOTRACE bit.
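The trace_functions_graph.c fix is the classic bit-index-versus-bit-mask confusion: `TRACE_GRAPH_NOTRACE_BIT` is a bit *number*, so OR-ing it into the flags word sets the wrong bits entirely. A minimal sketch with illustrative stand-in definitions:

```c
#include <stdbool.h>

#define NOTRACE_BIT 3			/* bit number  */
#define NOTRACE     (1UL << NOTRACE_BIT) /* bit mask */

static bool notrace_set(unsigned long flags)
{
	return flags & NOTRACE;
}
```

OR-ing the index (value 3) sets bits 0 and 1, so the flag test never fires; OR-ing the mask sets bit 3 as intended.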
+6-6
kernel/workqueue.c
···35173517 }3518351835193519 /*35203520- * Put the reference grabbed by send_mayday(). @pool won't35213521- * go away while we're still attached to it.35223522- */35233523- put_pwq(pwq);35243524-35253525- /*35263520 * Leave this pool. Notify regular workers; otherwise, we end up35273521 * with 0 concurrency and stalling the execution.35283522 */···35253531 raw_spin_unlock_irq(&pool->lock);3526353235273533 worker_detach_from_pool(rescuer);35343534+35353535+ /*35363536+ * Put the reference grabbed by send_mayday(). @pool might35373537+ * go away any time after it.35383538+ */35393539+ put_pwq_unlocked(pwq);3528354035293541 raw_spin_lock_irq(&wq_mayday_lock);35303542 }
+4-2
lib/stackinit_kunit.c
···7575 */7676#ifdef CONFIG_M68K7777#define FILL_SIZE_STRING 87878+#define FILL_SIZE_ARRAY 27879#else7980#define FILL_SIZE_STRING 168181+#define FILL_SIZE_ARRAY 88082#endif81838284#define INIT_CLONE_SCALAR /**/···347345 short three;348346 unsigned long four;349347 struct big_struct {350350- unsigned long array[8];348348+ unsigned long array[FILL_SIZE_ARRAY];351349 } big;352350};353351354354-/* Mismatched sizes, with one and two being small */352352+/* Mismatched sizes, with three and four being small */355353union test_small_end {356354 short one;357355 unsigned long two;
···1818#include <linux/if_ether.h>1919#include <linux/jiffies.h>2020#include <linux/kref.h>2121+#include <linux/list.h>2122#include <linux/minmax.h>2223#include <linux/netdevice.h>2324#include <linux/nl80211.h>···2726#include <linux/rcupdate.h>2827#include <linux/rtnetlink.h>2928#include <linux/skbuff.h>2929+#include <linux/slab.h>3030#include <linux/stddef.h>3131#include <linux/string.h>3232#include <linux/types.h>···4240#include "originator.h"4341#include "routing.h"4442#include "send.h"4343+4444+/**4545+ * struct batadv_v_metric_queue_entry - list of hardif neighbors which require4646+ * and metric update4747+ */4848+struct batadv_v_metric_queue_entry {4949+ /** @hardif_neigh: hardif neighbor scheduled for metric update */5050+ struct batadv_hardif_neigh_node *hardif_neigh;5151+5252+ /** @list: list node for metric_queue */5353+ struct list_head list;5454+};45554656/**4757 * batadv_v_elp_start_timer() - restart timer for ELP periodic work···7359/**7460 * batadv_v_elp_get_throughput() - get the throughput towards a neighbour7561 * @neigh: the neighbour for which the throughput has to be obtained6262+ * @pthroughput: calculated throughput towards the given neighbour in multiples6363+ * of 100kpbs (a value of '1' equals 0.1Mbps, '10' equals 1Mbps, etc).7664 *7777- * Return: The throughput towards the given neighbour in multiples of 100kpbs7878- * (a value of '1' equals 0.1Mbps, '10' equals 1Mbps, etc).6565+ * Return: true when value behind @pthroughput was set7966 */8080-static u32 batadv_v_elp_get_throughput(struct batadv_hardif_neigh_node *neigh)6767+static bool batadv_v_elp_get_throughput(struct batadv_hardif_neigh_node *neigh,6868+ u32 *pthroughput)8169{8270 struct batadv_hard_iface *hard_iface = neigh->if_incoming;7171+ struct net_device *soft_iface = hard_iface->soft_iface;8372 struct ethtool_link_ksettings link_settings;8473 struct net_device *real_netdev;8574 struct station_info sinfo;8675 u32 throughput;8776 int ret;88777878+ /* don't query throughput when 
no longer associated with any7979+ * batman-adv interface8080+ */8181+ if (!soft_iface)8282+ return false;8383+8984 /* if the user specified a customised value for this interface, then9085 * return it directly9186 */9287 throughput = atomic_read(&hard_iface->bat_v.throughput_override);9393- if (throughput != 0)9494- return throughput;8888+ if (throughput != 0) {8989+ *pthroughput = throughput;9090+ return true;9191+ }95929693 /* if this is a wireless device, then ask its throughput through9794 * cfg80211 API···129104 * possible to delete this neighbor. For now set130105 * the throughput metric to 0.131106 */132132- return 0;107107+ *pthroughput = 0;108108+ return true;133109 }134110 if (ret)135111 goto default_throughput;136112137137- if (sinfo.filled & BIT(NL80211_STA_INFO_EXPECTED_THROUGHPUT))138138- return sinfo.expected_throughput / 100;113113+ if (sinfo.filled & BIT(NL80211_STA_INFO_EXPECTED_THROUGHPUT)) {114114+ *pthroughput = sinfo.expected_throughput / 100;115115+ return true;116116+ }139117140118 /* try to estimate the expected throughput based on reported tx141119 * rates142120 */143143- if (sinfo.filled & BIT(NL80211_STA_INFO_TX_BITRATE))144144- return cfg80211_calculate_bitrate(&sinfo.txrate) / 3;121121+ if (sinfo.filled & BIT(NL80211_STA_INFO_TX_BITRATE)) {122122+ *pthroughput = cfg80211_calculate_bitrate(&sinfo.txrate) / 3;123123+ return true;124124+ }145125146126 goto default_throughput;147127 }148128129129+ /* only use rtnl_trylock because the elp worker will be cancelled while130130+ * the rntl_lock is held. the cancel_delayed_work_sync() would otherwise131131+ * wait forever when the elp work_item was started and it is then also132132+ * trying to rtnl_lock133133+ */134134+ if (!rtnl_trylock())135135+ return false;136136+149137 /* if not a wifi interface, check if this device provides data via150138 * ethtool (e.g. 
an Ethernet adapter)151139 */152152- rtnl_lock();153140 ret = __ethtool_get_link_ksettings(hard_iface->net_dev, &link_settings);154141 rtnl_unlock();155142 if (ret == 0) {···172135 hard_iface->bat_v.flags &= ~BATADV_FULL_DUPLEX;173136174137 throughput = link_settings.base.speed;175175- if (throughput && throughput != SPEED_UNKNOWN)176176- return throughput * 10;138138+ if (throughput && throughput != SPEED_UNKNOWN) {139139+ *pthroughput = throughput * 10;140140+ return true;141141+ }177142 }178143179144default_throughput:180145 if (!(hard_iface->bat_v.flags & BATADV_WARNING_DEFAULT)) {181181- batadv_info(hard_iface->soft_iface,146146+ batadv_info(soft_iface,182147 "WiFi driver or ethtool info does not provide information about link speeds on interface %s, therefore defaulting to hardcoded throughput values of %u.%1u Mbps. Consider overriding the throughput manually or checking your driver.\n",183148 hard_iface->net_dev->name,184149 BATADV_THROUGHPUT_DEFAULT_VALUE / 10,···189150 }190151191152 /* if none of the above cases apply, return the base_throughput */192192- return BATADV_THROUGHPUT_DEFAULT_VALUE;153153+ *pthroughput = BATADV_THROUGHPUT_DEFAULT_VALUE;154154+ return true;193155}194156195157/**196158 * batadv_v_elp_throughput_metric_update() - worker updating the throughput197159 * metric of a single hop neighbour198198- * @work: the work queue item160160+ * @neigh: the neighbour to probe199161 */200200-void batadv_v_elp_throughput_metric_update(struct work_struct *work)162162+static void163163+batadv_v_elp_throughput_metric_update(struct batadv_hardif_neigh_node *neigh)201164{202202- struct batadv_hardif_neigh_node_bat_v *neigh_bat_v;203203- struct batadv_hardif_neigh_node *neigh;165165+ u32 throughput;166166+ bool valid;204167205205- neigh_bat_v = container_of(work, struct batadv_hardif_neigh_node_bat_v,206206- metric_work);207207- neigh = container_of(neigh_bat_v, struct batadv_hardif_neigh_node,208208- bat_v);168168+ valid = 
batadv_v_elp_get_throughput(neigh, &throughput);169169+ if (!valid)170170+ return;209171210210- ewma_throughput_add(&neigh->bat_v.throughput,211211- batadv_v_elp_get_throughput(neigh));212212-213213- /* decrement refcounter to balance increment performed before scheduling214214- * this task215215- */216216- batadv_hardif_neigh_put(neigh);172172+ ewma_throughput_add(&neigh->bat_v.throughput, throughput);217173}218174219175/**···282248 */283249static void batadv_v_elp_periodic_work(struct work_struct *work)284250{251251+ struct batadv_v_metric_queue_entry *metric_entry;252252+ struct batadv_v_metric_queue_entry *metric_safe;285253 struct batadv_hardif_neigh_node *hardif_neigh;286254 struct batadv_hard_iface *hard_iface;287255 struct batadv_hard_iface_bat_v *bat_v;288256 struct batadv_elp_packet *elp_packet;257257+ struct list_head metric_queue;289258 struct batadv_priv *bat_priv;290259 struct sk_buff *skb;291260 u32 elp_interval;292292- bool ret;293261294262 bat_v = container_of(work, struct batadv_hard_iface_bat_v, elp_wq.work);295263 hard_iface = container_of(bat_v, struct batadv_hard_iface, bat_v);···327291328292 atomic_inc(&hard_iface->bat_v.elp_seqno);329293294294+ INIT_LIST_HEAD(&metric_queue);295295+330296 /* The throughput metric is updated on each sent packet. This way, if a331297 * node is dead and no longer sends packets, batman-adv is still able to332298 * react timely to its death.···353315354316 /* Reading the estimated throughput from cfg80211 is a task that355317 * may sleep and that is not allowed in an rcu protected356356- * context. Therefore schedule a task for that.318318+ * context. 
Therefore add it to metric_queue and process it319319+ * outside rcu protected context.357320 */358358- ret = queue_work(batadv_event_workqueue,359359- &hardif_neigh->bat_v.metric_work);360360-361361- if (!ret)321321+ metric_entry = kzalloc(sizeof(*metric_entry), GFP_ATOMIC);322322+ if (!metric_entry) {362323 batadv_hardif_neigh_put(hardif_neigh);324324+ continue;325325+ }326326+327327+ metric_entry->hardif_neigh = hardif_neigh;328328+ list_add(&metric_entry->list, &metric_queue);363329 }364330 rcu_read_unlock();331331+332332+ list_for_each_entry_safe(metric_entry, metric_safe, &metric_queue, list) {333333+ batadv_v_elp_throughput_metric_update(metric_entry->hardif_neigh);334334+335335+ batadv_hardif_neigh_put(metric_entry->hardif_neigh);336336+ list_del(&metric_entry->list);337337+ kfree(metric_entry);338338+ }365339366340restart_timer:367341 batadv_v_elp_start_timer(hard_iface);
···596596 * neighbor597597 */598598 unsigned long last_unicast_tx;599599-600600- /** @metric_work: work queue callback item for metric update */601601- struct work_struct metric_work;602599};603600604601/**
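The batman-adv change replaces per-neighbour `queue_work()` with a private `metric_queue`: entries discovered under `rcu_read_lock()` are only allocated and linked, and the potentially sleeping throughput update runs after the lock is dropped. A userspace sketch of that two-phase pattern (types and names are illustrative):

```c
#include <stdlib.h>

struct neigh { int id; int metric; };

struct queue_entry {
	struct neigh *n;
	struct queue_entry *next;
};

/* Phase 1: under the read-side lock, only allocate and link - never sleep. */
static struct queue_entry *queue_neigh(struct queue_entry *head,
				       struct neigh *n)
{
	struct queue_entry *e = malloc(sizeof(*e));

	if (!e)
		return head;	/* skip this neighbour, as the patch does */
	e->n = n;
	e->next = head;
	return e;
}

/* Phase 2: lock dropped - walk the private list, update, and free. */
static void drain_queue(struct queue_entry *head)
{
	while (head) {
		struct queue_entry *e = head;

		head = e->next;
		e->n->metric++;	/* stands in for the throughput update */
		free(e);
	}
}
```

This also removes the `metric_work` member from the neighbour struct, since there is no longer a per-neighbour work item to embed.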
+1-2
net/bluetooth/hidp/Kconfig
···11# SPDX-License-Identifier: GPL-2.0-only22config BT_HIDP33 tristate "HIDP protocol support"44- depends on BT_BREDR && INPUT && HID_SUPPORT55- select HID44+ depends on BT_BREDR && HID65 help76 HIDP (Human Interface Device Protocol) is a transport layer87 for HID reports. HIDP is required for the Bluetooth Human
+79-90
net/bluetooth/l2cap_core.c
···119119{120120 struct l2cap_chan *c;121121122122- mutex_lock(&conn->chan_lock);123122 c = __l2cap_get_chan_by_scid(conn, cid);124123 if (c) {125124 /* Only lock if chan reference is not 0 */···126127 if (c)127128 l2cap_chan_lock(c);128129 }129129- mutex_unlock(&conn->chan_lock);130130131131 return c;132132}···138140{139141 struct l2cap_chan *c;140142141141- mutex_lock(&conn->chan_lock);142143 c = __l2cap_get_chan_by_dcid(conn, cid);143144 if (c) {144145 /* Only lock if chan reference is not 0 */···145148 if (c)146149 l2cap_chan_lock(c);147150 }148148- mutex_unlock(&conn->chan_lock);149151150152 return c;151153}···414418 if (!conn)415419 return;416420417417- mutex_lock(&conn->chan_lock);421421+ mutex_lock(&conn->lock);418422 /* __set_chan_timer() calls l2cap_chan_hold(chan) while scheduling419423 * this work. No need to call l2cap_chan_hold(chan) here again.420424 */···435439 l2cap_chan_unlock(chan);436440 l2cap_chan_put(chan);437441438438- mutex_unlock(&conn->chan_lock);442442+ mutex_unlock(&conn->lock);439443}440444441445struct l2cap_chan *l2cap_chan_create(void)···637641638642void l2cap_chan_add(struct l2cap_conn *conn, struct l2cap_chan *chan)639643{640640- mutex_lock(&conn->chan_lock);644644+ mutex_lock(&conn->lock);641645 __l2cap_chan_add(conn, chan);642642- mutex_unlock(&conn->chan_lock);646646+ mutex_unlock(&conn->lock);643647}644648645649void l2cap_chan_del(struct l2cap_chan *chan, int err)···727731 if (!conn)728732 return;729733730730- mutex_lock(&conn->chan_lock);734734+ mutex_lock(&conn->lock);731735 __l2cap_chan_list(conn, func, data);732732- mutex_unlock(&conn->chan_lock);736736+ mutex_unlock(&conn->lock);733737}734738735739EXPORT_SYMBOL_GPL(l2cap_chan_list);···741745 struct hci_conn *hcon = conn->hcon;742746 struct l2cap_chan *chan;743747744744- mutex_lock(&conn->chan_lock);748748+ mutex_lock(&conn->lock);745749746750 list_for_each_entry(chan, &conn->chan_l, list) {747751 l2cap_chan_lock(chan);···750754 l2cap_chan_unlock(chan);751755 }752756753753- 
mutex_unlock(&conn->chan_lock);757757+ mutex_unlock(&conn->lock);754758}755759756760static void l2cap_chan_le_connect_reject(struct l2cap_chan *chan)···944948 return id;945949}946950951951+static void l2cap_send_acl(struct l2cap_conn *conn, struct sk_buff *skb,952952+ u8 flags)953953+{954954+ /* Check if the hcon still valid before attempting to send */955955+ if (hci_conn_valid(conn->hcon->hdev, conn->hcon))956956+ hci_send_acl(conn->hchan, skb, flags);957957+ else958958+ kfree_skb(skb);959959+}960960+947961static void l2cap_send_cmd(struct l2cap_conn *conn, u8 ident, u8 code, u16 len,948962 void *data)949963{···976970 bt_cb(skb)->force_active = BT_POWER_FORCE_ACTIVE_ON;977971 skb->priority = HCI_PRIO_MAX;978972979979- hci_send_acl(conn->hchan, skb, flags);973973+ l2cap_send_acl(conn, skb, flags);980974}981975982976static void l2cap_do_send(struct l2cap_chan *chan, struct sk_buff *skb)···1503149715041498 BT_DBG("conn %p", conn);1505149915061506- mutex_lock(&conn->chan_lock);15071507-15081500 list_for_each_entry_safe(chan, tmp, &conn->chan_l, list) {15091501 l2cap_chan_lock(chan);15101502···1571156715721568 l2cap_chan_unlock(chan);15731569 }15741574-15751575- mutex_unlock(&conn->chan_lock);15761570}1577157115781572static void l2cap_le_conn_ready(struct l2cap_conn *conn)···16161614 if (hcon->type == ACL_LINK)16171615 l2cap_request_info(conn);1618161616191619- mutex_lock(&conn->chan_lock);16171617+ mutex_lock(&conn->lock);1620161816211619 list_for_each_entry(chan, &conn->chan_l, list) {16221620···16341632 l2cap_chan_unlock(chan);16351633 }1636163416371637- mutex_unlock(&conn->chan_lock);16351635+ mutex_unlock(&conn->lock);1638163616391637 if (hcon->type == LE_LINK)16401638 l2cap_le_conn_ready(conn);···1649164716501648 BT_DBG("conn %p", conn);1651164916521652- mutex_lock(&conn->chan_lock);16531653-16541650 list_for_each_entry(chan, &conn->chan_l, list) {16551651 if (test_bit(FLAG_FORCE_RELIABLE, &chan->flags))16561652 l2cap_chan_set_err(chan, err);16571653 
}16581658-16591659- mutex_unlock(&conn->chan_lock);16601654}1661165516621656static void l2cap_info_timeout(struct work_struct *work)···16631665 conn->info_state |= L2CAP_INFO_FEAT_MASK_REQ_DONE;16641666 conn->info_ident = 0;1665166716681668+ mutex_lock(&conn->lock);16661669 l2cap_conn_start(conn);16701670+ mutex_unlock(&conn->lock);16671671}1668167216691673/*···1757175717581758 BT_DBG("hcon %p conn %p, err %d", hcon, conn, err);1759175917601760+ mutex_lock(&conn->lock);17611761+17601762 kfree_skb(conn->rx_skb);1761176317621764 skb_queue_purge(&conn->pending_rx);···17771775 /* Force the connection to be immediately dropped */17781776 hcon->disc_timeout = 0;1779177717801780- mutex_lock(&conn->chan_lock);17811781-17821778 /* Kill channels */17831779 list_for_each_entry_safe(chan, l, &conn->chan_l, list) {17841780 l2cap_chan_hold(chan);···17901790 l2cap_chan_put(chan);17911791 }1792179217931793- mutex_unlock(&conn->chan_lock);17941794-17951795- hci_chan_del(conn->hchan);17961796-17971793 if (conn->info_state & L2CAP_INFO_FEAT_MASK_REQ_SENT)17981794 cancel_delayed_work_sync(&conn->info_timer);1799179518001800- hcon->l2cap_data = NULL;17961796+ hci_chan_del(conn->hchan);18011797 conn->hchan = NULL;17981798+17991799+ hcon->l2cap_data = NULL;18001800+ mutex_unlock(&conn->lock);18021801 l2cap_conn_put(conn);18031802}18041803···2915291629162917 BT_DBG("conn %p", conn);2917291829182918- mutex_lock(&conn->chan_lock);29192919-29202919 list_for_each_entry(chan, &conn->chan_l, list) {29212920 if (chan->chan_type != L2CAP_CHAN_RAW)29222921 continue;···29292932 if (chan->ops->recv(chan, nskb))29302933 kfree_skb(nskb);29312934 }29322932-29332933- mutex_unlock(&conn->chan_lock);29342935}2935293629362937/* ---- L2CAP signalling commands ---- */···39473952 goto response;39483953 }3949395439503950- mutex_lock(&conn->chan_lock);39513955 l2cap_chan_lock(pchan);3952395639533957 /* Check if the ACL is secure enough (if not SDP) */···40534059 }4054406040554061 
l2cap_chan_unlock(pchan);40564056- mutex_unlock(&conn->chan_lock);40574062 l2cap_chan_put(pchan);40584063}40594064···40914098 BT_DBG("dcid 0x%4.4x scid 0x%4.4x result 0x%2.2x status 0x%2.2x",40924099 dcid, scid, result, status);4093410040944094- mutex_lock(&conn->chan_lock);40954095-40964101 if (scid) {40974102 chan = __l2cap_get_chan_by_scid(conn, scid);40984098- if (!chan) {40994099- err = -EBADSLT;41004100- goto unlock;41014101- }41034103+ if (!chan)41044104+ return -EBADSLT;41024105 } else {41034106 chan = __l2cap_get_chan_by_ident(conn, cmd->ident);41044104- if (!chan) {41054105- err = -EBADSLT;41064106- goto unlock;41074107- }41074107+ if (!chan)41084108+ return -EBADSLT;41084109 }4109411041104111 chan = l2cap_chan_hold_unless_zero(chan);41114111- if (!chan) {41124112- err = -EBADSLT;41134113- goto unlock;41144114- }41124112+ if (!chan)41134113+ return -EBADSLT;4115411441164115 err = 0;41174116···4140415541414156 l2cap_chan_unlock(chan);41424157 l2cap_chan_put(chan);41434143-41444144-unlock:41454145- mutex_unlock(&conn->chan_lock);4146415841474159 return err;41484160}···4428444644294447 chan->ops->set_shutdown(chan);4430444844314431- l2cap_chan_unlock(chan);44324432- mutex_lock(&conn->chan_lock);44334433- l2cap_chan_lock(chan);44344449 l2cap_chan_del(chan, ECONNRESET);44354435- mutex_unlock(&conn->chan_lock);4436445044374451 chan->ops->close(chan);44384452···44654487 return 0;44664488 }4467448944684468- l2cap_chan_unlock(chan);44694469- mutex_lock(&conn->chan_lock);44704470- l2cap_chan_lock(chan);44714490 l2cap_chan_del(chan, 0);44724472- mutex_unlock(&conn->chan_lock);4473449144744492 chan->ops->close(chan);44754493···46634689 BT_DBG("dcid 0x%4.4x mtu %u mps %u credits %u result 0x%2.2x",46644690 dcid, mtu, mps, credits, result);4665469146664666- mutex_lock(&conn->chan_lock);46674667-46684692 chan = __l2cap_get_chan_by_ident(conn, cmd->ident);46694669- if (!chan) {46704670- err = -EBADSLT;46714671- goto unlock;46724672- }46934693+ if (!chan)46944694+ return 
-EBADSLT;4673469546744696 err = 0;46754697···47124742 }4713474347144744 l2cap_chan_unlock(chan);47154715-47164716-unlock:47174717- mutex_unlock(&conn->chan_lock);4718474547194746 return err;47204747}···48244857 goto response;48254858 }4826485948274827- mutex_lock(&conn->chan_lock);48284860 l2cap_chan_lock(pchan);4829486148304862 if (!smp_sufficient_security(conn->hcon, pchan->sec_level,···4889492348904924response_unlock:48914925 l2cap_chan_unlock(pchan);48924892- mutex_unlock(&conn->chan_lock);48934926 l2cap_chan_put(pchan);4894492748954928 if (result == L2CAP_CR_PEND)···50225057 goto response;50235058 }5024505950255025- mutex_lock(&conn->chan_lock);50265060 l2cap_chan_lock(pchan);5027506150285062 if (!smp_sufficient_security(conn->hcon, pchan->sec_level,···5096513250975133unlock:50985134 l2cap_chan_unlock(pchan);50995099- mutex_unlock(&conn->chan_lock);51005135 l2cap_chan_put(pchan);5101513651025137response:···5131516851325169 BT_DBG("mtu %u mps %u credits %u result 0x%4.4x", mtu, mps, credits,51335170 result);51345134-51355135- mutex_lock(&conn->chan_lock);5136517151375172 cmd_len -= sizeof(*rsp);51385173···5216525552175256 l2cap_chan_unlock(chan);52185257 }52195219-52205220- mutex_unlock(&conn->chan_lock);5221525852225259 return err;52235260}···53295370 if (cmd_len < sizeof(*rej))53305371 return -EPROTO;5331537253325332- mutex_lock(&conn->chan_lock);53335333-53345373 chan = __l2cap_get_chan_by_ident(conn, cmd->ident);53355374 if (!chan)53365375 goto done;···53435386 l2cap_chan_put(chan);5344538753455388done:53465346- mutex_unlock(&conn->chan_lock);53475389 return 0;53485390}53495391···6797684167986842 BT_DBG("");6799684368446844+ mutex_lock(&conn->lock);68456845+68006846 while ((skb = skb_dequeue(&conn->pending_rx)))68016847 l2cap_recv_frame(conn, skb);68486848+68496849+ mutex_unlock(&conn->lock);68026850}6803685168046852static struct l2cap_conn *l2cap_conn_add(struct hci_conn *hcon)···68416881 conn->local_fixed_chan |= L2CAP_FC_SMP_BREDR;6842688268436883 
mutex_init(&conn->ident_lock);68446844- mutex_init(&conn->chan_lock);68846884+ mutex_init(&conn->lock);6845688568466886 INIT_LIST_HEAD(&conn->chan_l);68476887 INIT_LIST_HEAD(&conn->users);···70327072 }70337073 }7034707470357035- mutex_lock(&conn->chan_lock);70757075+ mutex_lock(&conn->lock);70367076 l2cap_chan_lock(chan);7037707770387078 if (cid && __l2cap_get_chan_by_dcid(conn, cid)) {···7073711370747114chan_unlock:70757115 l2cap_chan_unlock(chan);70767076- mutex_unlock(&conn->chan_lock);71167116+ mutex_unlock(&conn->lock);70777117done:70787118 hci_dev_unlock(hdev);70797119 hci_dev_put(hdev);···7285732572867326 BT_DBG("conn %p status 0x%2.2x encrypt %u", conn, status, encrypt);7287732772887288- mutex_lock(&conn->chan_lock);73287328+ mutex_lock(&conn->lock);7289732972907330 list_for_each_entry(chan, &conn->chan_l, list) {72917331 l2cap_chan_lock(chan);···73597399 l2cap_chan_unlock(chan);73607400 }7361740173627362- mutex_unlock(&conn->chan_lock);74027402+ mutex_unlock(&conn->lock);73637403}7364740473657405/* Append fragment into frame respecting the maximum len of rx_skb */···74267466 conn->rx_len = 0;74277467}7428746874697469+struct l2cap_conn *l2cap_conn_hold_unless_zero(struct l2cap_conn *c)74707470+{74717471+ if (!c)74727472+ return NULL;74737473+74747474+ BT_DBG("conn %p orig refcnt %u", c, kref_read(&c->ref));74757475+74767476+ if (!kref_get_unless_zero(&c->ref))74777477+ return NULL;74787478+74797479+ return c;74807480+}74817481+74297482void l2cap_recv_acldata(struct hci_conn *hcon, struct sk_buff *skb, u16 flags)74307483{74317431- struct l2cap_conn *conn = hcon->l2cap_data;74847484+ struct l2cap_conn *conn;74327485 int len;74867486+74877487+ /* Lock hdev to access l2cap_data to avoid race with l2cap_conn_del */74887488+ hci_dev_lock(hcon->hdev);74897489+74907490+ conn = hcon->l2cap_data;7433749174347492 if (!conn)74357493 conn = l2cap_conn_add(hcon);7436749474377437- if (!conn)74387438- goto drop;74957495+ conn = 
l2cap_conn_hold_unless_zero(conn);74967496+74977497+ hci_dev_unlock(hcon->hdev);74987498+74997499+ if (!conn) {75007500+ kfree_skb(skb);75017501+ return;75027502+ }7439750374407504 BT_DBG("conn %p len %u flags 0x%x", conn, skb->len, flags);75057505+75067506+ mutex_lock(&conn->lock);7441750774427508 switch (flags) {74437509 case ACL_START:···74897503 if (len == skb->len) {74907504 /* Complete frame received */74917505 l2cap_recv_frame(conn, skb);74927492- return;75067506+ goto unlock;74937507 }7494750874957509 BT_DBG("Start: total len %d, frag len %u", len, skb->len);···7553756775547568drop:75557569 kfree_skb(skb);75707570+unlock:75717571+ mutex_unlock(&conn->lock);75727572+ l2cap_conn_put(conn);75567573}7557757475587575static struct hci_cb l2cap_cb = {
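The new `l2cap_conn_hold_unless_zero()` is built on `kref_get_unless_zero()`: take a reference only when the count has not already hit zero, so a connection that is mid-teardown is never revived. A userspace sketch of that compare-exchange loop using C11 atomics (struct and names illustrative):

```c
#include <stdatomic.h>
#include <stddef.h>

struct conn { atomic_int ref; };

/* Return c with an extra reference, or NULL if c is NULL or dying. */
static struct conn *conn_hold_unless_zero(struct conn *c)
{
	int old;

	if (!c)
		return NULL;

	old = atomic_load(&c->ref);
	do {
		if (old == 0)
			return NULL;	/* already on its way to being freed */
	} while (!atomic_compare_exchange_weak(&c->ref, &old, old + 1));

	return c;
}
```

On failure `atomic_compare_exchange_weak()` reloads `old`, so the zero check is re-evaluated on every retry.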
+7-8
net/bluetooth/l2cap_sock.c
···13261326 /* prevent sk structure from being freed whilst unlocked */13271327 sock_hold(sk);1328132813291329- chan = l2cap_pi(sk)->chan;13301329 /* prevent chan structure from being freed whilst unlocked */13311331- l2cap_chan_hold(chan);13301330+ chan = l2cap_chan_hold_unless_zero(l2cap_pi(sk)->chan);13311331+ if (!chan)13321332+ goto shutdown_already;1332133313331334 BT_DBG("chan %p state %s", chan, state_to_string(chan->state));13341335···13591358 release_sock(sk);1360135913611360 l2cap_chan_lock(chan);13621362- conn = chan->conn;13631363- if (conn)13641364- /* prevent conn structure from being freed */13651365- l2cap_conn_get(conn);13611361+ /* prevent conn structure from being freed */13621362+ conn = l2cap_conn_hold_unless_zero(chan->conn);13661363 l2cap_chan_unlock(chan);1367136413681365 if (conn)13691366 /* mutex lock must be taken before l2cap_chan_lock() */13701370- mutex_lock(&conn->chan_lock);13671367+ mutex_lock(&conn->lock);1371136813721369 l2cap_chan_lock(chan);13731370 l2cap_chan_close(chan, 0);13741371 l2cap_chan_unlock(chan);1375137213761373 if (conn) {13771377- mutex_unlock(&conn->chan_lock);13741374+ mutex_unlock(&conn->lock);13781375 l2cap_conn_put(conn);13791376 }13801377
···993993 return rc;994994995995 /* Nonzero ring with RSS only makes sense if NIC adds them together */996996- if (cmd == ETHTOOL_SRXCLSRLINS && info.flow_type & FLOW_RSS &&996996+ if (cmd == ETHTOOL_SRXCLSRLINS && info.fs.flow_type & FLOW_RSS &&997997 !ops->cap_rss_rxnfc_adds &&998998 ethtool_get_flow_spec_ring(info.fs.ring_cookie))999999 return -EINVAL;
···418418{419419 int hlen = LL_RESERVED_SPACE(dev);420420 int tlen = dev->needed_tailroom;421421- struct sock *sk = dev_net(dev)->ipv6.ndisc_sk;422421 struct sk_buff *skb;423422424423 skb = alloc_skb(hlen + sizeof(struct ipv6hdr) + len + tlen, GFP_ATOMIC);425425- if (!skb) {426426- ND_PRINTK(0, err, "ndisc: %s failed to allocate an skb\n",427427- __func__);424424+ if (!skb)428425 return NULL;429429- }430426431427 skb->protocol = htons(ETH_P_IPV6);432428 skb->dev = dev;···433437 /* Manually assign socket ownership as we avoid calling434438 * sock_alloc_send_pskb() to bypass wmem buffer limits435439 */436436- skb_set_owner_w(skb, sk);440440+ rcu_read_lock();441441+ skb_set_owner_w(skb, dev_net_rcu(dev)->ipv6.ndisc_sk);442442+ rcu_read_unlock();437443438444 return skb;439445}···471473void ndisc_send_skb(struct sk_buff *skb, const struct in6_addr *daddr,472474 const struct in6_addr *saddr)473475{474474- struct dst_entry *dst = skb_dst(skb);475475- struct net *net = dev_net(skb->dev);476476- struct sock *sk = net->ipv6.ndisc_sk;477477- struct inet6_dev *idev;478478- int err;479476 struct icmp6hdr *icmp6h = icmp6_hdr(skb);477477+ struct dst_entry *dst = skb_dst(skb);478478+ struct inet6_dev *idev;479479+ struct net *net;480480+ struct sock *sk;481481+ int err;480482 u8 type;481483482484 type = icmp6h->icmp6_type;483485486486+ rcu_read_lock();487487+488488+ net = dev_net_rcu(skb->dev);489489+ sk = net->ipv6.ndisc_sk;484490 if (!dst) {485491 struct flowi6 fl6;486492 int oif = skb->dev->ifindex;···492490 icmpv6_flow_init(sk, &fl6, type, saddr, daddr, oif);493491 dst = icmp6_dst_alloc(skb->dev, &fl6);494492 if (IS_ERR(dst)) {493493+ rcu_read_unlock();495494 kfree_skb(skb);496495 return;497496 }···507504508505 ip6_nd_hdr(skb, saddr, daddr, READ_ONCE(inet6_sk(sk)->hop_limit), skb->len);509506510510- rcu_read_lock();511507 idev = __in6_dev_get(dst->dev);512508 IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTREQUESTS);513509···16961694 bool ret;1697169516981696 if 
(netif_is_l3_master(skb->dev)) {16991699- dev = __dev_get_by_index(dev_net(skb->dev), IPCB(skb)->iif);16971697+ dev = dev_get_by_index_rcu(dev_net(skb->dev), IPCB(skb)->iif);17001698 if (!dev)17011699 return;17021700 }
+6-1
net/ipv6/route.c
···31963196{31973197 struct net_device *dev = dst->dev;31983198 unsigned int mtu = dst_mtu(dst);31993199- struct net *net = dev_net(dev);31993199+ struct net *net;3200320032013201 mtu -= sizeof(struct ipv6hdr) + sizeof(struct tcphdr);3202320232033203+ rcu_read_lock();32043204+32053205+ net = dev_net_rcu(dev);32033206 if (mtu < net->ipv6.sysctl.ip6_rt_min_advmss)32043207 mtu = net->ipv6.sysctl.ip6_rt_min_advmss;32083208+32093209+ rcu_read_unlock();3205321032063211 /*32073212 * Maximal non-jumbo IPv6 payload is IPV6_MAXPLEN and
+10-5
net/ipv6/rpl_iptunnel.c
···232232 dst = ip6_route_output(net, NULL, &fl6);233233 if (dst->error) {234234 err = dst->error;235235- dst_release(dst);236235 goto drop;237236 }238237239239- local_bh_disable();240240- dst_cache_set_ip6(&rlwt->cache, dst, &fl6.saddr);241241- local_bh_enable();238238+ /* cache only if we don't create a dst reference loop */239239+ if (orig_dst->lwtstate != dst->lwtstate) {240240+ local_bh_disable();241241+ dst_cache_set_ip6(&rlwt->cache, dst, &fl6.saddr);242242+ local_bh_enable();243243+ }242244243245 err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev));244246 if (unlikely(err))···253251 return dst_output(net, sk, skb);254252255253drop:254254+ dst_release(dst);256255 kfree_skb(skb);257256 return err;258257}···272269 local_bh_enable();273270274271 err = rpl_do_srh(skb, rlwt, dst);275275- if (unlikely(err))272272+ if (unlikely(err)) {273273+ dst_release(dst);276274 goto drop;275275+ }277276278277 if (!dst) {279278 ip6_route_input(skb);
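The rpl_iptunnel.c guard (mirrored in seg6_iptunnel.c below) caches the freshly looked-up route only when its `lwtstate` differs from the input route's; otherwise the cache would hold a reference on a dst that indirectly holds the cache, leaking both. A minimal sketch of that identity check before caching (types illustrative):

```c
#include <stddef.h>

struct lwtstate { int id; };
struct dst { struct lwtstate *lwt; int refs; };

static struct dst *cached;

static void cache_dst(struct dst *orig, struct dst *fresh)
{
	if (orig->lwt == fresh->lwt)
		return;		/* caching would create a dst -> cache -> dst loop */
	fresh->refs++;		/* reference now owned by the cache */
	cached = fresh;
}
```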
+10-5
net/ipv6/seg6_iptunnel.c
···482482 local_bh_enable();483483484484 err = seg6_do_srh(skb, dst);485485- if (unlikely(err))485485+ if (unlikely(err)) {486486+ dst_release(dst);486487 goto drop;488488+ }487489488490 if (!dst) {489491 ip6_route_input(skb);···573571 dst = ip6_route_output(net, NULL, &fl6);574572 if (dst->error) {575573 err = dst->error;576576- dst_release(dst);577574 goto drop;578575 }579576580580- local_bh_disable();581581- dst_cache_set_ip6(&slwt->cache, dst, &fl6.saddr);582582- local_bh_enable();577577+ /* cache only if we don't create a dst reference loop */578578+ if (orig_dst->lwtstate != dst->lwtstate) {579579+ local_bh_disable();580580+ dst_cache_set_ip6(&slwt->cache, dst, &fl6.saddr);581581+ local_bh_enable();582582+ }583583584584 err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev));585585 if (unlikely(err))···597593598594 return dst_output(net, sk, skb);599595drop:596596+ dst_release(dst);600597 kfree_skb(skb);601598 return err;602599}
+2-2
net/ipv6/udp.c
···13891389 const int hlen = skb_network_header_len(skb) +13901390 sizeof(struct udphdr);1391139113921392- if (hlen + cork->gso_size > cork->fragsize) {13921392+ if (hlen + min(datalen, cork->gso_size) > cork->fragsize) {13931393 kfree_skb(skb);13941394- return -EINVAL;13941394+ return -EMSGSIZE;13951395 }13961396 if (datalen > cork->gso_size * UDP_MAX_SEGMENTS) {13971397 kfree_skb(skb);
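The udp.c fix compares `fragsize` against the first segment that will actually be built, `min(datalen, gso_size)`, instead of `gso_size` alone, and reports the failure as `-EMSGSIZE` rather than `-EINVAL`. A sketch of the corrected check with illustrative values:

```c
#include <errno.h>

/* Only the first segment - at most gso_size, but never more than the
 * payload - has to fit in the path-MTU-derived fragsize. */
static int check_gso(int hlen, int datalen, int gso_size, int fragsize)
{
	int seg = datalen < gso_size ? datalen : gso_size;

	if (hlen + seg > fragsize)
		return -EMSGSIZE;
	return 0;
}
```

With the old check, a small payload sent on a socket whose `gso_size` exceeded a reduced PMTU was dropped even though it would have fit in a single packet.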
···21012101{21022102 struct ovs_header *ovs_header;21032103 struct ovs_vport_stats vport_stats;21042104+ struct net *net_vport;21042105 int err;2105210621062107 ovs_header = genlmsg_put(skb, portid, seq, &dp_vport_genl_family,···21182117 nla_put_u32(skb, OVS_VPORT_ATTR_IFINDEX, vport->dev->ifindex))21192118 goto nla_put_failure;2120211921212121- if (!net_eq(net, dev_net(vport->dev))) {21222122- int id = peernet2id_alloc(net, dev_net(vport->dev), gfp);21202120+ rcu_read_lock();21212121+ net_vport = dev_net_rcu(vport->dev);21222122+ if (!net_eq(net, net_vport)) {21232123+ int id = peernet2id_alloc(net, net_vport, GFP_ATOMIC);2123212421242125 if (nla_put_s32(skb, OVS_VPORT_ATTR_NETNSID, id))21252125- goto nla_put_failure;21262126+ goto nla_put_failure_unlock;21262127 }21282128+ rcu_read_unlock();2127212921282130 ovs_vport_get_stats(vport, &vport_stats);21292131 if (nla_put_64bit(skb, OVS_VPORT_ATTR_STATS,···21472143 genlmsg_end(skb, ovs_header);21482144 return 0;2149214521462146+nla_put_failure_unlock:21472147+ rcu_read_unlock();21502148nla_put_failure:21512149 err = -EMSGSIZE;21522150error:
+16-8
net/rose/af_rose.c
···701701 struct net_device *dev;702702 ax25_address *source;703703 ax25_uid_assoc *user;704704+ int err = -EINVAL;704705 int n;705705-706706- if (!sock_flag(sk, SOCK_ZAPPED))707707- return -EINVAL;708706709707 if (addr_len != sizeof(struct sockaddr_rose) && addr_len != sizeof(struct full_sockaddr_rose))710708 return -EINVAL;···716718 if ((unsigned int) addr->srose_ndigis > ROSE_MAX_DIGIS)717719 return -EINVAL;718720719719- if ((dev = rose_dev_get(&addr->srose_addr)) == NULL)720720- return -EADDRNOTAVAIL;721721+ lock_sock(sk);722722+723723+ if (!sock_flag(sk, SOCK_ZAPPED))724724+ goto out_release;725725+726726+ err = -EADDRNOTAVAIL;727727+ dev = rose_dev_get(&addr->srose_addr);728728+ if (!dev)729729+ goto out_release;721730722731 source = &addr->srose_call;723732···735730 } else {736731 if (ax25_uid_policy && !capable(CAP_NET_BIND_SERVICE)) {737732 dev_put(dev);738738- return -EACCES;733733+ err = -EACCES;734734+ goto out_release;739735 }740736 rose->source_call = *source;741737 }···759753 rose_insert_socket(sk);760754761755 sock_reset_flag(sk, SOCK_ZAPPED);762762-763763- return 0;756756+ err = 0;757757+out_release:758758+ release_sock(sk);759759+ return err;764760}765761766762static int rose_connect(struct socket *sock, struct sockaddr *uaddr, int addr_len, int flags)
+4-5
net/rxrpc/ar-internal.h
···327327 * packet with a maximum set of jumbo subpackets or a PING ACK padded328328 * out to 64K with zeropages for PMTUD.329329 */330330- struct kvec kvec[RXRPC_MAX_NR_JUMBO > 3 + 16 ?331331- RXRPC_MAX_NR_JUMBO : 3 + 16];330330+ struct kvec kvec[1 + RXRPC_MAX_NR_JUMBO > 3 + 16 ?331331+ 1 + RXRPC_MAX_NR_JUMBO : 3 + 16];332332};333333334334/*···582582 RXRPC_CALL_EXCLUSIVE, /* The call uses a once-only connection */583583 RXRPC_CALL_RX_IS_IDLE, /* recvmsg() is idle - send an ACK */584584 RXRPC_CALL_RECVMSG_READ_ALL, /* recvmsg() read all of the received data */585585+ RXRPC_CALL_CONN_CHALLENGING, /* The connection is being challenged */585586};586587587588/*···603602 RXRPC_CALL_CLIENT_AWAIT_REPLY, /* - client awaiting reply */604603 RXRPC_CALL_CLIENT_RECV_REPLY, /* - client receiving reply phase */605604 RXRPC_CALL_SERVER_PREALLOC, /* - service preallocation */606606- RXRPC_CALL_SERVER_SECURING, /* - server securing request connection */607605 RXRPC_CALL_SERVER_RECV_REQUEST, /* - server receiving request */608606 RXRPC_CALL_SERVER_ACK_REQUEST, /* - server pending ACK of request */609607 RXRPC_CALL_SERVER_SEND_REPLY, /* - server sending reply */···874874#define RXRPC_TXBUF_RESENT 0x100 /* Set if has been resent */875875 __be16 cksum; /* Checksum to go in header */876876 bool jumboable; /* Can be non-terminal jumbo subpacket */877877- u8 nr_kvec; /* Amount of kvec[] used */878878- struct kvec kvec[1];877877+ void *data; /* Data with preceding jumbo header */879878};880879881880static inline bool rxrpc_sending_to_server(const struct rxrpc_txbuf *txb)
···448448 struct rxrpc_skb_priv *sp = rxrpc_skb(skb);449449 bool last = sp->hdr.flags & RXRPC_LAST_PACKET;450450451451- skb_queue_tail(&call->recvmsg_queue, skb);451451+ spin_lock_irq(&call->recvmsg_queue.lock);452452+453453+ __skb_queue_tail(&call->recvmsg_queue, skb);452454 rxrpc_input_update_ack_window(call, window, wtop);453455 trace_rxrpc_receive(call, last ? why + 1 : why, sp->hdr.serial, sp->hdr.seq);454456 if (last)457457+ /* Change the state inside the lock so that recvmsg syncs458458+ * correctly with it and using sendmsg() to send a reply459459+ * doesn't race.460460+ */455461 rxrpc_end_rx_phase(call, sp->hdr.serial);462462+463463+ spin_unlock_irq(&call->recvmsg_queue.lock);456464}457465458466/*···665657 rxrpc_propose_delay_ACK(call, sp->hdr.serial,666658 rxrpc_propose_ack_input_data);667659 }668668- if (notify) {660660+ if (notify && !test_bit(RXRPC_CALL_CONN_CHALLENGING, &call->flags)) {669661 trace_rxrpc_notify_socket(call->debug_id, sp->hdr.serial);670662 rxrpc_notify_socket(call);671663 }
+35-15
net/rxrpc/output.c
···428428static size_t rxrpc_prepare_data_subpacket(struct rxrpc_call *call,429429 struct rxrpc_send_data_req *req,430430 struct rxrpc_txbuf *txb,431431+ struct rxrpc_wire_header *whdr,431432 rxrpc_serial_t serial, int subpkt)432433{433433- struct rxrpc_wire_header *whdr = txb->kvec[0].iov_base;434434- struct rxrpc_jumbo_header *jumbo = (void *)(whdr + 1) - sizeof(*jumbo);434434+ struct rxrpc_jumbo_header *jumbo = txb->data - sizeof(*jumbo);435435 enum rxrpc_req_ack_trace why;436436 struct rxrpc_connection *conn = call->conn;437437- struct kvec *kv = &call->local->kvec[subpkt];437437+ struct kvec *kv = &call->local->kvec[1 + subpkt];438438 size_t len = txb->pkt_len;439439 bool last;440440 u8 flags;···491491 }492492dont_set_request_ack:493493494494- /* The jumbo header overlays the wire header in the txbuf. */494494+ /* There's a jumbo header prepended to the data if we need it. */495495 if (subpkt < req->n - 1)496496 flags |= RXRPC_JUMBO_PACKET;497497 else498498 flags &= ~RXRPC_JUMBO_PACKET;499499 if (subpkt == 0) {500500 whdr->flags = flags;501501- whdr->serial = htonl(txb->serial);502501 whdr->cksum = txb->cksum;503503- whdr->serviceId = htons(conn->service_id);504504- kv->iov_base = whdr;505505- len += sizeof(*whdr);502502+ kv->iov_base = txb->data;506503 } else {507504 jumbo->flags = flags;508505 jumbo->pad = 0;···532535/*533536 * Prepare a (jumbo) packet for transmission.534537 */535535-static size_t rxrpc_prepare_data_packet(struct rxrpc_call *call, struct rxrpc_send_data_req *req)538538+static size_t rxrpc_prepare_data_packet(struct rxrpc_call *call,539539+ struct rxrpc_send_data_req *req,540540+ struct rxrpc_wire_header *whdr)536541{537542 struct rxrpc_txqueue *tq = req->tq;538543 rxrpc_serial_t serial;···547548548549 /* Each transmission of a Tx packet needs a new serial number */549550 serial = rxrpc_get_next_serials(call->conn, req->n);551551+552552+ whdr->epoch = htonl(call->conn->proto.epoch);553553+ whdr->cid = htonl(call->cid);554554+ whdr->callNumber = htonl(call->call_id);555555+ whdr->seq = htonl(seq);556556+ whdr->serial = htonl(serial);557557+ whdr->type = RXRPC_PACKET_TYPE_DATA;558558+ whdr->flags = 0;559559+ whdr->userStatus = 0;560560+ whdr->securityIndex = call->security_ix;561561+ whdr->_rsvd = 0;562562+ whdr->serviceId = htons(call->conn->service_id);550563551564 call->tx_last_serial = serial + req->n - 1;552565 call->tx_last_sent = req->now;···587576 if (i + 1 == req->n)588577 /* Only sample the last subpacket in a jumbo. */589578 __set_bit(ix, &tq->rtt_samples);590590- len += rxrpc_prepare_data_subpacket(call, req, txb, serial, i);579579+ len += rxrpc_prepare_data_subpacket(call, req, txb, whdr, serial, i);591580 serial++;592581 seq++;593582 i++;···629618 }630619631620 rxrpc_set_keepalive(call, req->now);621621+ page_frag_free(whdr);632622 return len;633623}634624···638626 */639627void rxrpc_send_data_packet(struct rxrpc_call *call, struct rxrpc_send_data_req *req)640628{629629+ struct rxrpc_wire_header *whdr;641630 struct rxrpc_connection *conn = call->conn;642631 enum rxrpc_tx_point frag;643632 struct rxrpc_txqueue *tq = req->tq;644633 struct rxrpc_txbuf *txb;645634 struct msghdr msg;646635 rxrpc_seq_t seq = req->seq;647647- size_t len;636636+ size_t len = sizeof(*whdr);648637 bool new_call = test_bit(RXRPC_CALL_BEGAN_RX_TIMER, &call->flags);649638 int ret, stat_ix;650639651640 _enter("%x,%x-%x", tq->qbase, seq, seq + req->n - 1);652641642642+ whdr = page_frag_alloc(&call->local->tx_alloc, sizeof(*whdr), GFP_NOFS);643643+ if (!whdr)644644+ return; /* Drop the packet if no memory. */645645+646646+ call->local->kvec[0].iov_base = whdr;647647+ call->local->kvec[0].iov_len = sizeof(*whdr);648648+653649 stat_ix = umin(req->n, ARRAY_SIZE(call->rxnet->stat_tx_jumbo)) - 1;654650 atomic_inc(&call->rxnet->stat_tx_jumbo[stat_ix]);655651656656- len = rxrpc_prepare_data_packet(call, req);652652+ len += rxrpc_prepare_data_packet(call, req, whdr);657653 txb = tq->bufs[seq & RXRPC_TXQ_MASK];658654659659- iov_iter_kvec(&msg.msg_iter, WRITE, call->local->kvec, req->n, len);655655+ iov_iter_kvec(&msg.msg_iter, WRITE, call->local->kvec, 1 + req->n, len);660656661657 msg.msg_name = &call->peer->srx.transport;662658 msg.msg_namelen = call->peer->srx.transport_len;···715695716696 if (ret == -EMSGSIZE) {717697 rxrpc_inc_stat(call->rxnet, stat_tx_data_send_msgsize);718718- trace_rxrpc_tx_packet(call->debug_id, call->local->kvec[0].iov_base, frag);698698+ trace_rxrpc_tx_packet(call->debug_id, whdr, frag);719699 ret = 0;720700 } else if (ret < 0) {721701 rxrpc_inc_stat(call->rxnet, stat_tx_data_send_fail);722702 trace_rxrpc_tx_fail(call->debug_id, txb->serial, ret, frag);723703 } else {724724- trace_rxrpc_tx_packet(call->debug_id, call->local->kvec[0].iov_base, frag);704704+ trace_rxrpc_tx_packet(call->debug_id, whdr, frag);725705 }726706727707 rxrpc_tx_backoff(call, ret);
···479479 sock->file = file;480480 file->private_data = sock;481481 stream_open(SOCK_INODE(sock), file);482482+ /*483483+ * Disable permission and pre-content events, but enable legacy484484+ * inotify events for legacy users.485485+ */486486+ file_set_fsnotify_mode(file, FMODE_NONOTIFY_PERM);482487 return file;483488}484489EXPORT_SYMBOL(sock_alloc_file);
+7-1
net/vmw_vsock/af_vsock.c
···824824 */825825 lock_sock_nested(sk, level);826826827827- sock_orphan(sk);827827+ /* Indicate to vsock_remove_sock() that the socket is being released and828828+ * can be removed from the bound_table. Unlike transport reassignment829829+ * case, where the socket must remain bound despite vsock_remove_sock()830830+ * being called from the transport release() callback.831831+ */832832+ sock_set_flag(sk, SOCK_DEAD);828833829834 if (vsk->transport)830835 vsk->transport->release(vsk);831836 else if (sock_type_connectible(sk->sk_type))832837 vsock_remove_sock(vsk);833838839839+ sock_orphan(sk);834840 sk->sk_shutdown = SHUTDOWN_MASK;835841836842 skb_queue_purge(&sk->sk_receive_queue);
···11+// SPDX-License-Identifier: GPL-2.0-only22+33+//! Abstractions for the faux bus.44+//!55+//! This module provides bindings for working with faux devices in kernel modules.66+//!77+//! C header: [`include/linux/device/faux.h`]88+99+use crate::{bindings, device, error::code::*, prelude::*};1010+use core::ptr::{addr_of_mut, null, null_mut, NonNull};1111+1212+/// The registration of a faux device.1313+///1414+/// This type represents the registration of a [`struct faux_device`]. When an instance of this type1515+/// is dropped, its respective faux device will be unregistered from the system.1616+///1717+/// # Invariants1818+///1919+/// `self.0` always holds a valid pointer to an initialized and registered [`struct faux_device`].2020+///2121+/// [`struct faux_device`]: srctree/include/linux/device/faux.h2222+#[repr(transparent)]2323+pub struct Registration(NonNull<bindings::faux_device>);2424+2525+impl Registration {2626+ /// Create and register a new faux device with the given name.2727+ pub fn new(name: &CStr) -> Result<Self> {2828+ // SAFETY:2929+ // - `name` is copied by this function into its own storage3030+ // - `faux_ops` is safe to leave NULL according to the C API3131+ let dev = unsafe { bindings::faux_device_create(name.as_char_ptr(), null_mut(), null()) };3232+3333+ // The above function will return either a valid device, or NULL on failure3434+ // INVARIANT: The device will remain registered until faux_device_destroy() is called, which3535+ // happens in our Drop implementation.3636+ Ok(Self(NonNull::new(dev).ok_or(ENODEV)?))3737+ }3838+3939+ fn as_raw(&self) -> *mut bindings::faux_device {4040+ self.0.as_ptr()4141+ }4242+}4343+4444+impl AsRef<device::Device> for Registration {4545+ fn as_ref(&self) -> &device::Device {4646+ // SAFETY: The underlying `device` in `faux_device` is guaranteed by the C API to be4747+ // a valid initialized `device`.4848+ unsafe { device::Device::as_ref(addr_of_mut!((*self.as_raw()).dev)) }4949+ }5050+}5151+5252+impl Drop for Registration {5353+ fn drop(&mut self) {5454+ // SAFETY: `self.0` is a valid registered faux_device via our type invariants.5555+ unsafe { bindings::faux_device_destroy(self.as_raw()) }5656+ }5757+}5858+5959+// SAFETY: The faux device API is thread-safe as guaranteed by the device core, as long as6060+// faux_device_destroy() is guaranteed to only be called once - which is guaranteed by our type not6161+// having Copy/Clone.6262+unsafe impl Send for Registration {}6363+6464+// SAFETY: The faux device API is thread-safe as guaranteed by the device core, as long as6565+// faux_device_destroy() is guaranteed to only be called once - which is guaranteed by our type not6666+// having Copy/Clone.6767+unsafe impl Sync for Registration {}
···4646pub mod devres;4747pub mod driver;4848pub mod error;4949+pub mod faux;4950#[cfg(CONFIG_RUST_FW_LOADER_ABSTRACTIONS)]5051pub mod firmware;5152pub mod fs;
+1-1
rust/kernel/rbtree.rs
···11491149/// # Invariants11501150/// - `parent` may be null if the new node becomes the root.11511151/// - `child_field_of_parent` is a valid pointer to the left-child or right-child of `parent`. If `parent` is11521152-/// null, it is a pointer to the root of the [`RBTree`].11521152+/// null, it is a pointer to the root of the [`RBTree`].11531153struct RawVacantEntry<'a, K, V> {11541154 rbtree: *mut RBTree<K, V>,11551155 /// The node that will become the parent of the new node if we insert one.
···61616262 If unsure, say N.63636464+config SAMPLE_RUST_DRIVER_FAUX6565+ tristate "Faux Driver"6666+ help6767+ This option builds the Rust Faux driver sample.6868+6969+ To compile this as a module, choose M here:7070+ the module will be called rust_driver_faux.7171+7272+ If unsure, say N.7373+6474config SAMPLE_RUST_HOSTPROGS6575 bool "Host programs"6676 help
···3131ifdef CONFIG_CC_IS_CLANG3232# The kernel builds with '-std=gnu11' so use of GNU extensions is acceptable.3333KBUILD_CFLAGS += -Wno-gnu3434+3535+# Clang checks for overflow/truncation with '%p', while GCC does not:3636+# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=1112193737+KBUILD_CFLAGS += $(call cc-disable-warning, format-overflow-non-kprintf)3838+KBUILD_CFLAGS += $(call cc-disable-warning, format-truncation-non-kprintf)3439else35403641# gcc inanely warns about local variables called 'main'···110105KBUILD_CFLAGS += $(call cc-disable-warning, format-overflow)111106ifdef CONFIG_CC_IS_GCC112107KBUILD_CFLAGS += $(call cc-disable-warning, format-truncation)113113-else114114-# Clang checks for overflow/truncation with '%p', while GCC does not:115115-# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111219116116-KBUILD_CFLAGS += $(call cc-disable-warning, format-overflow-non-kprintf)117117-KBUILD_CFLAGS += $(call cc-disable-warning, format-truncation-non-kprintf)118108endif119109KBUILD_CFLAGS += $(call cc-disable-warning, stringop-truncation)120110···133133KBUILD_CFLAGS += -Wno-tautological-constant-out-of-range-compare134134KBUILD_CFLAGS += $(call cc-disable-warning, unaligned-access)135135KBUILD_CFLAGS += -Wno-enum-compare-conditional136136-KBUILD_CFLAGS += -Wno-enum-enum-conversion137136endif138137139138endif···155156KBUILD_CFLAGS += -Wno-missing-field-initializers156157KBUILD_CFLAGS += -Wno-type-limits157158KBUILD_CFLAGS += -Wno-shift-negative-value159159+160160+ifdef CONFIG_CC_IS_CLANG161161+KBUILD_CFLAGS += -Wno-enum-enum-conversion162162+endif158163159164ifdef CONFIG_CC_IS_GCC160165KBUILD_CFLAGS += -Wno-maybe-uninitialized
+1-1
scripts/Makefile.lib
···305305# These are shared by some Makefile.* files.306306307307ifdef CONFIG_LTO_CLANG308308-# Run $(LD) here to covert LLVM IR to ELF in the following cases:308308+# Run $(LD) here to convert LLVM IR to ELF in the following cases:309309# - when this object needs objtool processing, as objtool cannot process LLVM IR310310# - when this is a single-object module, as modpost cannot process LLVM IR311311cmd_ld_single = $(if $(objtool-enabled)$(is-single-obj-m), ; $(LD) $(ld_flags) -r -o $(tmp-target) $@; mv $(tmp-target) $@)
+18
scripts/generate_rust_target.rs
···165165 let option = "CONFIG_".to_owned() + option;166166 self.0.contains_key(&option)167167 }168168+169169+ /// Is the rustc version at least `major.minor.patch`?170170+ fn rustc_version_atleast(&self, major: u32, minor: u32, patch: u32) -> bool {171171+ let check_version = 100000 * major + 100 * minor + patch;172172+ let actual_version = self173173+ .0174174+ .get("CONFIG_RUSTC_VERSION")175175+ .unwrap()176176+ .parse::<u32>()177177+ .unwrap();178178+ check_version <= actual_version179179+ }168180}169181170182fn main() {···194182 }195183 } else if cfg.has("X86_64") {196184 ts.push("arch", "x86_64");185185+ if cfg.rustc_version_atleast(1, 86, 0) {186186+ ts.push("rustc-abi", "x86-softfloat");187187+ }197188 ts.push(198189 "data-layout",199190 "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128",···230215 panic!("32-bit x86 only works under UML");231216 }232217 ts.push("arch", "x86");218218+ if cfg.rustc_version_atleast(1, 86, 0) {219219+ ts.push("rustc-abi", "x86-softfloat");220220+ }233221 ts.push(234222 "data-layout",235223 "e-m:e-p:32:32-p270:32:32-p271:32:32-p272:64:64-i128:128-f64:32:64-f80:32-n8:16:32-S128",
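`rustc_version_atleast()` above compares versions encoded as a single integer, the same scheme the parsed `CONFIG_RUSTC_VERSION` value uses: `100000 * major + 100 * minor + patch`. A quick Python sketch of that encoding (hypothetical helper names):

```python
def encode(major, minor, patch):
    # Two decimal digits each for minor and patch,
    # so 1.86.0 encodes as 108600.
    return 100000 * major + 100 * minor + patch

def atleast(actual, major, minor, patch):
    # Mirrors rustc_version_atleast(): requested <= actual.
    return encode(major, minor, patch) <= actual

assert encode(1, 86, 0) == 108600
```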
+37-2
scripts/mod/modpost.c
···190190191191 /*192192 * Set mod->is_gpl_compatible to true by default. If MODULE_LICENSE()193193- * is missing, do not check the use for EXPORT_SYMBOL_GPL() becasue194194- * modpost will exit wiht error anyway.193193+ * is missing, do not check the use for EXPORT_SYMBOL_GPL() because194194+ * modpost will exit with an error anyway.195195 */196196 mod->is_gpl_compatible = true;197197···507507 info->modinfo_len = sechdrs[i].sh_size;508508 } else if (!strcmp(secname, ".export_symbol")) {509509 info->export_symbol_secndx = i;510510+ } else if (!strcmp(secname, ".no_trim_symbol")) {511511+ info->no_trim_symbol = (void *)hdr + sechdrs[i].sh_offset;512512+ info->no_trim_symbol_len = sechdrs[i].sh_size;510513 }511514512515 if (sechdrs[i].sh_type == SHT_SYMTAB) {···15691566 /* strip trailing .o */15701567 mod = new_module(modname, strlen(modname) - strlen(".o"));1571156815691569+ /* save .no_trim_symbol section for later use */15701570+ if (info.no_trim_symbol_len) {15711571+ mod->no_trim_symbol = xmalloc(info.no_trim_symbol_len);15721572+ memcpy(mod->no_trim_symbol, info.no_trim_symbol,15731573+ info.no_trim_symbol_len);15741574+ mod->no_trim_symbol_len = info.no_trim_symbol_len;15751575+ }15761576+15721577 if (!mod->is_vmlinux) {15731578 license = get_modinfo(&info, "license");15741579 if (!license)···17371726 }1738172717391728 free(buf);17291729+}17301730+17311731+/*17321732+ * Keep symbols recorded in the .no_trim_symbol section. This is necessary to17331733+ * prevent CONFIG_TRIM_UNUSED_KSYMS from dropping EXPORT_SYMBOL because17341734+ * symbol_get() relies on the symbol being present in the ksymtab for lookups.17351735+ */17361736+static void keep_no_trim_symbols(struct module *mod)17371737+{17381738+ unsigned long size = mod->no_trim_symbol_len;17391739+17401740+ for (char *s = mod->no_trim_symbol; s; s = next_string(s, &size)) {17411741+ struct symbol *sym;17421742+17431743+ /*17441744+ * If find_symbol() returns NULL, this symbol is not provided17451745+ * by any module, and symbol_get() will fail.17461746+ */17471747+ sym = find_symbol(s);17481748+ if (sym)17491749+ sym->used = true;17501750+ }17401751}1741175217421753static void check_modname_len(struct module *mod)···22872254 read_symbols_from_files(files_source);2288225522892256 list_for_each_entry(mod, &modules, list) {22572257+ keep_no_trim_symbols(mod);22582258+22902259 if (mod->dump_file || mod->is_vmlinux)22912260 continue;22922261
+6
scripts/mod/modpost.h
···111111 *112112 * @dump_file: path to the .symvers file if loaded from a file113113 * @aliases: list head for module_aliases114114+ * @no_trim_symbol: .no_trim_symbol section data115115+ * @no_trim_symbol_len: length of the .no_trim_symbol section114116 */115117struct module {116118 struct list_head list;···130128 // Actual imported namespaces131129 struct list_head imported_namespaces;132130 struct list_head aliases;131131+ char *no_trim_symbol;132132+ unsigned int no_trim_symbol_len;133133 char name[];134134};135135···145141 char *strtab;146142 char *modinfo;147143 unsigned int modinfo_len;144144+ char *no_trim_symbol;145145+ unsigned int no_trim_symbol_len;148146149147 /* support for 32bit section numbers */150148
···6262 #6363 # Clear VPATH and srcroot because the source files reside in the output6464 # directory.6565- # shellcheck disable=SC2016 # $(MAKE), $(CC), and $(build) will be expanded by Make6666- "${MAKE}" run-command KBUILD_RUN_COMMAND='+$(MAKE) HOSTCC=$(CC) VPATH= srcroot=. $(build)='"${destdir}"/scripts6565+ # shellcheck disable=SC2016 # $(MAKE) and $(build) will be expanded by Make6666+ "${MAKE}" run-command KBUILD_RUN_COMMAND='+$(MAKE) HOSTCC='"${CC}"' VPATH= srcroot=. $(build)='"${destdir}"/scripts67676868 rm -f "${destdir}/scripts/Kbuild"6969fi
+112-33
security/tomoyo/common.c
···19811981}1982198219831983/**19841984+ * tomoyo_numscan - sscanf() which stores the length of a decimal integer value.19851985+ *19861986+ * @str: String to scan.19871987+ * @head: Leading string that @str must start with.19881988+ * @width: Pointer to "int" for storing length of a decimal integer value after @head.19891989+ * @tail: Optional character that must match after a decimal integer value.19901990+ *19911991+ * Returns whether @str starts with @head and a decimal value follows @head.19921992+ */19931993+static bool tomoyo_numscan(const char *str, const char *head, int *width, const char tail)19941994+{19951995+ const char *cp;19961996+ const int n = strlen(head);19971997+19981998+ if (!strncmp(str, head, n)) {19991999+ cp = str + n;20002000+ while (*cp && *cp >= '0' && *cp <= '9')20012001+ cp++;20022002+ if (*cp == tail || !tail) {20032003+ *width = cp - (str + n);20042004+ return *width != 0;20052005+ }20062006+ }20072007+ *width = 0;20082008+ return false;20092009+}20102010+20112011+/**20122012+ * tomoyo_patternize_path - Make patterns for file path. Used by learning mode.20132013+ *20142014+ * @buffer: Destination buffer.20152015+ * @len: Size of @buffer.20162016+ * @entry: Original line.20172017+ *20182018+ * Returns nothing.20192019+ */20202020+static void tomoyo_patternize_path(char *buffer, const int len, char *entry)20212021+{20222022+ int width;20232023+ char *cp = entry;20242024+20252025+ /* Nothing to do if this line is not for "file" related entry. */20262026+ if (strncmp(entry, "file ", 5))20272027+ goto flush;20282028+ /*20292029+ * Nothing to do if there is no colon in this line, for this rewriting20302030+ * applies to only filesystems where numeric values in the path are volatile.20312031+ */20322032+ cp = strchr(entry + 5, ':');20332033+ if (!cp) {20342034+ cp = entry;20352035+ goto flush;20362036+ }20372037+ /* Flush e.g. "file ioctl" part. */20382038+ while (*cp != ' ')20392039+ cp--;20402040+ *cp++ = '\0';20412041+ tomoyo_addprintf(buffer, len, "%s ", entry);20422042+ /* e.g. file ioctl pipe:[$INO] $CMD */20432043+ if (tomoyo_numscan(cp, "pipe:[", &width, ']')) {20442044+ cp += width + 7;20452045+ tomoyo_addprintf(buffer, len, "pipe:[\\$]");20462046+ goto flush;20472047+ }20482048+ /* e.g. file ioctl socket:[$INO] $CMD */20492049+ if (tomoyo_numscan(cp, "socket:[", &width, ']')) {20502050+ cp += width + 9;20512051+ tomoyo_addprintf(buffer, len, "socket:[\\$]");20522052+ goto flush;20532053+ }20542054+ if (!strncmp(cp, "proc:/self", 10)) {20552055+ /* e.g. file read proc:/self/task/$TID/fdinfo/$FD */20562056+ cp += 10;20572057+ tomoyo_addprintf(buffer, len, "proc:/self");20582058+ } else if (tomoyo_numscan(cp, "proc:/", &width, 0)) {20592059+ /* e.g. file read proc:/$PID/task/$TID/fdinfo/$FD */20602060+ /*20612061+ * Don't patternize $PID part if $PID == 1, for several20622062+ * programs access only files in /proc/1/ directory.20632063+ */20642064+ cp += width + 6;20652065+ if (width == 1 && *(cp - 1) == '1')20662066+ tomoyo_addprintf(buffer, len, "proc:/1");20672067+ else20682068+ tomoyo_addprintf(buffer, len, "proc:/\\$");20692069+ } else {20702070+ goto flush;20712071+ }20722072+ /* Patternize $TID part if "/task/" follows. */20732073+ if (tomoyo_numscan(cp, "/task/", &width, 0)) {20742074+ cp += width + 6;20752075+ tomoyo_addprintf(buffer, len, "/task/\\$");20762076+ }20772077+ /* Patternize $FD part if "/fd/" or "/fdinfo/" follows. */20782078+ if (tomoyo_numscan(cp, "/fd/", &width, 0)) {20792079+ cp += width + 4;20802080+ tomoyo_addprintf(buffer, len, "/fd/\\$");20812081+ } else if (tomoyo_numscan(cp, "/fdinfo/", &width, 0)) {20822082+ cp += width + 8;20832083+ tomoyo_addprintf(buffer, len, "/fdinfo/\\$");20842084+ }20852085+flush:20862086+ /* Flush remaining part if any. */20872087+ if (*cp)20882088+ tomoyo_addprintf(buffer, len, "%s", cp);20892089+}20902090+20912091+/**19842092 * tomoyo_add_entry - Add an ACL to current thread's domain. Used by learning mode.19852093 *19862094 * @domain: Pointer to "struct tomoyo_domain_info".···21112003 if (!cp)21122004 return;21132005 *cp++ = '\0';21142114- len = strlen(cp) + 1;20062006+ /* Reserve some space for potentially using patterns. */20072007+ len = strlen(cp) + 16;21152008 /* strstr() will return NULL if ordering is wrong. */21162009 if (*cp == 'f') {21172010 argv0 = strstr(header, " argv[]={ \"");···21292020 if (symlink)21302021 len += tomoyo_truncate(symlink + 1) + 1;21312022 }21322132- buffer = kmalloc(len, GFP_NOFS);20232023+ buffer = kmalloc(len, GFP_NOFS | __GFP_ZERO);21332024 if (!buffer)21342025 return;21352135- snprintf(buffer, len - 1, "%s", cp);21362136- if (*cp == 'f' && strchr(buffer, ':')) {21372137- /* Automatically replace 2 or more digits with \$ pattern. */21382138- char *cp2;21392139-21402140- /* e.g. file read proc:/$PID/stat */21412141- cp = strstr(buffer, " proc:/");21422142- if (cp && simple_strtoul(cp + 7, &cp2, 10) >= 10 && *cp2 == '/') {21432143- *(cp + 7) = '\\';21442144- *(cp + 8) = '$';21452145- memmove(cp + 9, cp2, strlen(cp2) + 1);21462146- goto ok;21472147- }21482148- /* e.g. file ioctl pipe:[$INO] $CMD */21492149- cp = strstr(buffer, " pipe:[");21502150- if (cp && simple_strtoul(cp + 7, &cp2, 10) >= 10 && *cp2 == ']') {21512151- *(cp + 7) = '\\';21522152- *(cp + 8) = '$';21532153- memmove(cp + 9, cp2, strlen(cp2) + 1);21542154- goto ok;21552155- }21562156- /* e.g. file ioctl socket:[$INO] $CMD */21572157- cp = strstr(buffer, " socket:[");21582158- if (cp && simple_strtoul(cp + 9, &cp2, 10) >= 10 && *cp2 == ']') {21592159- *(cp + 9) = '\\';21602160- *(cp + 10) = '$';21612161- memmove(cp + 11, cp2, strlen(cp2) + 1);21622162- goto ok;21632163- }21642164- }21652165-ok:20262026+ tomoyo_patternize_path(buffer, len, cp);21662027 if (realpath)21672028 tomoyo_addprintf(buffer, len, " exec.%s", realpath);21682029 if (argv0)
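The new tomoyo_patternize_path() helper replaces the hand-rolled strstr()/simple_strtoul() scans with a single patternizer for volatile numeric path components. A rough regex-based Python approximation of the rewriting it performs (a sketch, not the exact C algorithm, which scans the path left to right):

```python
import re

def patternize(line):
    # Only "file ..." entries containing ':' are rewritten,
    # mirroring the early-out checks in tomoyo_patternize_path().
    if not line.startswith("file ") or ":" not in line:
        return line
    line = re.sub(r"pipe:\[\d+\]", r"pipe:[\\$]", line)
    line = re.sub(r"socket:\[\d+\]", r"socket:[\\$]", line)
    # Keep proc:/self and proc:/1 as-is; patternize other PIDs.
    line = re.sub(r"proc:/(?!self)(?!1\b)\d+", r"proc:/\\$", line)
    line = re.sub(r"/task/\d+", r"/task/\\$", line)
    line = re.sub(r"/(fd|fdinfo)/\d+", r"/\1/\\$", line)
    return line
```

For example, `patternize("file read proc:/123/task/45/fdinfo/6")` yields the learned pattern with `\$` in place of the PID, TID, and FD components, while `proc:/1/...` paths stay literal.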
+1-1
security/tomoyo/domain.c
···920920#ifdef CONFIG_MMU921921 /*922922 * This is called at execve() time in order to dig around923923- * in the argv/environment of the new proceess923923+ * in the argv/environment of the new process924924 * (represented by bprm).925925 */926926 mmap_read_lock(bprm->mm);
···549549 .id = LSM_ID_TOMOYO,550550};551551552552-/*553553- * tomoyo_security_ops is a "struct security_operations" which is used for554554- * registering TOMOYO.555555- */552552+/* tomoyo_hooks is used for registering TOMOYO. */556553static struct security_hook_list tomoyo_hooks[] __ro_after_init = {557554 LSM_HOOK_INIT(cred_prepare, tomoyo_cred_prepare),558555 LSM_HOOK_INIT(bprm_committed_creds, tomoyo_bprm_committed_creds),
+11-1
tools/objtool/check.c
···227227 str_ends_with(func->name, "_4core9panicking18panic_bounds_check") ||228228 str_ends_with(func->name, "_4core9panicking19assert_failed_inner") ||229229 str_ends_with(func->name, "_4core9panicking36panic_misaligned_pointer_dereference") ||230230+ strstr(func->name, "_4core9panicking13assert_failed") ||230231 strstr(func->name, "_4core9panicking11panic_const24panic_const_") ||231232 (strstr(func->name, "_4core5slice5index24slice_") &&232233 str_ends_with(func->name, "_fail"));···19761975 reloc_addend(reloc) == pfunc->offset)19771976 break;1978197719781978+ /*19791979+ * Clang sometimes leaves dangling unused jump table entries19801980+ * which point to the end of the function. Ignore them.19811981+ */19821982+ if (reloc->sym->sec == pfunc->sec &&19831983+ reloc_addend(reloc) == pfunc->offset + pfunc->len)19841984+ goto next;19851985+19791986 dest_insn = find_insn(file, reloc->sym->sec, reloc_addend(reloc));19801987 if (!dest_insn)19811988 break;···20011992 alt->insn = dest_insn;20021993 alt->next = insn->alts;20031994 insn->alts = alt;19951995+next:20041996 prev_offset = reloc_offset(reloc);20051997 }20061998···2274226422752265 if (sec->sh.sh_entsize != 8) {22762266 static bool warned = false;22772277- if (!warned) {22672267+ if (!warned && opts.verbose) {22782268 WARN("%s: dodgy linker, sh_entsize != 8", sec->name);22792269 warned = true;22802270 }
···49495050 SCX_ASSERT(is_cpu_online());51515252- skel = hotplug__open_and_load();5353- SCX_ASSERT(skel);5252+ skel = hotplug__open();5353+ SCX_FAIL_IF(!skel, "Failed to open");5454+ SCX_ENUM_INIT(skel);5555+ SCX_FAIL_IF(hotplug__load(skel), "Failed to load skel");54565557 /* Testing the offline -> online path, so go offline before starting */5658 if (onlining)
···15151616#define SCHED_EXT 717171818-static struct init_enable_count *1919-open_load_prog(bool global)2020-{2121- struct init_enable_count *skel;2222-2323- skel = init_enable_count__open();2424- SCX_BUG_ON(!skel, "Failed to open skel");2525-2626- if (!global)2727- skel->struct_ops.init_enable_count_ops->flags |= SCX_OPS_SWITCH_PARTIAL;2828-2929- SCX_BUG_ON(init_enable_count__load(skel), "Failed to load skel");3030-3131- return skel;3232-}3333-3418static enum scx_test_status run_test(bool global)3519{3620 struct init_enable_count *skel;···2440 struct sched_param param = {};2541 pid_t pids[num_pre_forks];26422727- skel = open_load_prog(global);4343+ skel = init_enable_count__open();4444+ SCX_FAIL_IF(!skel, "Failed to open");4545+ SCX_ENUM_INIT(skel);4646+4747+ if (!global)4848+ skel->struct_ops.init_enable_count_ops->flags |= SCX_OPS_SWITCH_PARTIAL;4949+5050+ SCX_FAIL_IF(init_enable_count__load(skel), "Failed to load skel");28512952 /*3053 * Fork a bunch of children before we attach the scheduler so that we···150159151160struct scx_test init_enable_count = {152161 .name = "init_enable_count",153153- .description = "Verify we do the correct amount of counting of init, "162162+ .description = "Verify we correctly count the occurrences of init, "154163 "enable, etc callbacks.",155164 .run = run,156165};
+5-2
tools/testing/selftests/sched_ext/maximal.c
···1414{1515 struct maximal *skel;16161717- skel = maximal__open_and_load();1818- SCX_FAIL_IF(!skel, "Failed to open and load skel");1717+ skel = maximal__open();1818+ SCX_FAIL_IF(!skel, "Failed to open");1919+ SCX_ENUM_INIT(skel);2020+ SCX_FAIL_IF(maximal__load(skel), "Failed to load skel");2121+1922 *ctx = skel;20232124 return SCX_TEST_PASS;
+1-1
tools/testing/selftests/sched_ext/maybe_null.c
···43434444struct scx_test maybe_null = {4545 .name = "maybe_null",4646- .description = "Verify if PTR_MAYBE_NULL work for .dispatch",4646+ .description = "Verify if PTR_MAYBE_NULL works for .dispatch",4747 .run = run,4848};4949REGISTER_SCX_TEST(&maybe_null)
+5-5
tools/testing/selftests/sched_ext/minimal.c
···1515{1616 struct minimal *skel;17171818- skel = minimal__open_and_load();1919- if (!skel) {2020- SCX_ERR("Failed to open and load skel");2121- return SCX_TEST_FAIL;2222- }1818+ skel = minimal__open();1919+ SCX_FAIL_IF(!skel, "Failed to open");2020+ SCX_ENUM_INIT(skel);2121+ SCX_FAIL_IF(minimal__load(skel), "Failed to load skel");2222+2323 *ctx = skel;24242525 return SCX_TEST_PASS;
+5-5
tools/testing/selftests/sched_ext/prog_run.c
···1515{1616 struct prog_run *skel;17171818- skel = prog_run__open_and_load();1919- if (!skel) {2020- SCX_ERR("Failed to open and load skel");2121- return SCX_TEST_FAIL;2222- }1818+ skel = prog_run__open();1919+ SCX_FAIL_IF(!skel, "Failed to open");2020+ SCX_ENUM_INIT(skel);2121+ SCX_FAIL_IF(prog_run__load(skel), "Failed to load skel");2222+2323 *ctx = skel;24242525 return SCX_TEST_PASS;
+4-5
tools/testing/selftests/sched_ext/reload_loop.c
···18181919static enum scx_test_status setup(void **ctx)2020{2121- skel = maximal__open_and_load();2222- if (!skel) {2323- SCX_ERR("Failed to open and load skel");2424- return SCX_TEST_FAIL;2525- }2121+ skel = maximal__open();2222+ SCX_FAIL_IF(!skel, "Failed to open");2323+ SCX_ENUM_INIT(skel);2424+ SCX_FAIL_IF(maximal__load(skel), "Failed to load skel");26252726 return SCX_TEST_PASS;2827}
···10711071}1072107210731073/*10741074- * Called after the VM is otherwise initialized, but just before adding it to10751075- * the vm_list.10761076- */10771077-int __weak kvm_arch_post_init_vm(struct kvm *kvm)10781078-{10791079- return 0;10801080-}10811081-10821082-/*10831074 * Called just after removing the VM from the vm_list, but before doing any10841075 * other destruction.10851076 */···11901199 if (r)11911200 goto out_err_no_debugfs;1192120111931193- r = kvm_arch_post_init_vm(kvm);11941194- if (r)11951195- goto out_err;11961196-11971202 mutex_lock(&kvm_lock);11981203 list_add(&kvm->vm_list, &vm_list);11991204 mutex_unlock(&kvm_lock);···1199121212001213 return kvm;1201121412021202-out_err:12031203- kvm_destroy_vm_debugfs(kvm);12041215out_err_no_debugfs:12051216 kvm_coalesced_mmio_free(kvm);12061217out_no_coalesced_mmio:···19561971 return -EINVAL;19571972 if (mem->guest_phys_addr + mem->memory_size < mem->guest_phys_addr)19581973 return -EINVAL;19591959- if ((mem->memory_size >> PAGE_SHIFT) > KVM_MEM_MAX_NR_PAGES)19741974+19751975+ /*19761976+ * The size of userspace-defined memory regions is restricted in order19771977+ * to play nice with dirty bitmap operations, which are indexed with an19781978+ * "unsigned int". KVM's internal memory regions don't support dirty19791979+ * logging, and so are exempt.19801980+ */19811981+ if (id < KVM_USER_MEM_SLOTS &&19821982+ (mem->memory_size >> PAGE_SHIFT) > KVM_MEM_MAX_NR_PAGES)19601983 return -EINVAL;1961198419621985 slots = __kvm_memslots(kvm, as_id);
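The relocated size check above now applies only to userspace-defined slots: internal memslots (id >= KVM_USER_MEM_SLOTS) never support dirty logging, so they are exempt from the "unsigned int" dirty-bitmap index limit. A Python sketch of the guard (the constants here are illustrative, not the kernel's exact values):

```python
PAGE_SHIFT = 12                        # illustrative: 4 KiB pages
KVM_USER_MEM_SLOTS = 509               # illustrative slot-id boundary
KVM_MEM_MAX_NR_PAGES = (1 << 31) - 1   # pages indexable by an "unsigned int"

def memslot_size_ok(slot_id, memory_size):
    # Mirror of the patched check: only user slots are bounded by
    # the dirty-bitmap-driven page-count limit.
    if (slot_id < KVM_USER_MEM_SLOTS and
            (memory_size >> PAGE_SHIFT) > KVM_MEM_MAX_NR_PAGES):
        return False
    return True
```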