Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Add sound card support for QCS9100 and QCS9075

Merge series from Mohammad Rafi Shaik <mohammad.rafi.shaik@oss.qualcomm.com>:

This patchset adds sound card support for the Qualcomm QCS9100 and
QCS9075 boards.

Mark Brown 831962e9 e4dca67b

+4699 -1446
+1
.mailmap
··· 102 102 Arnaud Patard <arnaud.patard@rtp-net.org> 103 103 Arnd Bergmann <arnd@arndb.de> 104 104 Arun Kumar Neelakantam <quic_aneela@quicinc.com> <aneela@codeaurora.org> 105 + Asahi Lina <lina+kernel@asahilina.net> <lina@asahilina.net> 105 106 Ashok Raj Nagarajan <quic_arnagara@quicinc.com> <arnagara@codeaurora.org> 106 107 Ashwin Chaugule <quic_ashwinc@quicinc.com> <ashwinc@codeaurora.org> 107 108 Asutosh Das <quic_asutoshd@quicinc.com> <asutoshd@codeaurora.org>
+1
Documentation/ABI/testing/sysfs-devices-system-cpu
··· 511 511 512 512 What: /sys/devices/system/cpu/vulnerabilities 513 513 /sys/devices/system/cpu/vulnerabilities/gather_data_sampling 514 + /sys/devices/system/cpu/vulnerabilities/indirect_target_selection 514 515 /sys/devices/system/cpu/vulnerabilities/itlb_multihit 515 516 /sys/devices/system/cpu/vulnerabilities/l1tf 516 517 /sys/devices/system/cpu/vulnerabilities/mds
+2 -2
Documentation/ABI/testing/sysfs-driver-hid-appletb-kbd
··· 1 1 What: /sys/bus/hid/drivers/hid-appletb-kbd/<dev>/mode 2 - Date: September, 2023 3 - KernelVersion: 6.5 2 + Date: March, 2025 3 + KernelVersion: 6.15 4 4 Contact: linux-input@vger.kernel.org 5 5 Description: 6 6 The set of keys displayed on the Touch Bar.
+1
Documentation/admin-guide/hw-vuln/index.rst
··· 23 23 gather_data_sampling 24 24 reg-file-data-sampling 25 25 rsb 26 + indirect-target-selection
+168
Documentation/admin-guide/hw-vuln/indirect-target-selection.rst
··· 1 + .. SPDX-License-Identifier: GPL-2.0 2 + 3 + Indirect Target Selection (ITS) 4 + =============================== 5 + 6 + ITS is a vulnerability in some Intel CPUs that support Enhanced IBRS and were 7 + released before Alder Lake. ITS may allow an attacker to control the prediction 8 + of indirect branches and RETs located in the lower half of a cacheline. 9 + 10 + ITS is assigned CVE-2024-28956 with a CVSS score of 4.7 (Medium). 11 + 12 + Scope of Impact 13 + --------------- 14 + - **eIBRS Guest/Host Isolation**: Indirect branches in KVM/kernel may still be 15 + predicted with unintended target corresponding to a branch in the guest. 16 + 17 + - **Intra-Mode BTI**: In-kernel training such as through cBPF or other native 18 + gadgets. 19 + 20 + - **Indirect Branch Prediction Barrier (IBPB)**: After an IBPB, indirect 21 + branches may still be predicted with targets corresponding to direct branches 22 + executed prior to the IBPB. This is fixed by the IPU 2025.1 microcode, which 23 + should be available via distro updates. Alternatively microcode can be 24 + obtained from Intel's github repository [#f1]_. 
25 + 26 + Affected CPUs 27 + ------------- 28 + Below is the list of ITS affected CPUs [#f2]_ [#f3]_: 29 + 30 + ======================== ============ ==================== =============== 31 + Common name Family_Model eIBRS Intra-mode BTI 32 + Guest/Host Isolation 33 + ======================== ============ ==================== =============== 34 + SKYLAKE_X (step >= 6) 06_55H Affected Affected 35 + ICELAKE_X 06_6AH Not affected Affected 36 + ICELAKE_D 06_6CH Not affected Affected 37 + ICELAKE_L 06_7EH Not affected Affected 38 + TIGERLAKE_L 06_8CH Not affected Affected 39 + TIGERLAKE 06_8DH Not affected Affected 40 + KABYLAKE_L (step >= 12) 06_8EH Affected Affected 41 + KABYLAKE (step >= 13) 06_9EH Affected Affected 42 + COMETLAKE 06_A5H Affected Affected 43 + COMETLAKE_L 06_A6H Affected Affected 44 + ROCKETLAKE 06_A7H Not affected Affected 45 + ======================== ============ ==================== =============== 46 + 47 + - All affected CPUs enumerate the Enhanced IBRS feature. 48 + - IBPB isolation is affected on all ITS affected CPUs, and needs a microcode 49 + update for mitigation. 50 + - None of the affected CPUs enumerate BHI_CTRL, which was introduced in Golden 51 + Cove (Alder Lake and Sapphire Rapids). This can help guests determine the 52 + host's affected status. 53 + - Intel Atom CPUs are not affected by ITS. 54 + 55 + Mitigation 56 + ---------- 57 + As only the indirect branches and RETs that have their last byte of instruction 58 + in the lower half of the cacheline are vulnerable to ITS, the basic idea behind 59 + the mitigation is to not allow indirect branches in the lower half. 60 + 61 + This is achieved by relying on existing retpoline support in the kernel, and in 62 + compilers. ITS-vulnerable retpoline sites are runtime patched to point to newly 63 + added ITS-safe thunks. These safe thunks consist of an indirect branch in the 64 + second half of the cacheline. 
Not all retpoline sites are patched to thunks; if 65 + a retpoline site is evaluated to be ITS-safe, it is replaced with an inline 66 + indirect branch. 67 + 68 + Dynamic thunks 69 + ~~~~~~~~~~~~~~ 70 + From a dynamically allocated pool of safe-thunks, each vulnerable site is 71 + replaced with a new thunk, such that they get a unique address. This could 72 + improve the branch prediction accuracy. Also, it is a defense-in-depth measure 73 + against aliasing. 74 + 75 + Note, for simplicity, indirect branches in eBPF programs are always replaced 76 + with a jump to a static thunk in __x86_indirect_its_thunk_array. If required, 77 + in the future this can be changed to use dynamic thunks. 78 + 79 + All vulnerable RETs are replaced with a static thunk; they do not use dynamic 80 + thunks. This is because RETs mostly get their prediction from the RSB, which does not 81 + depend on the source address. RETs that underflow the RSB may benefit from dynamic 82 + thunks. But RETs significantly outnumber indirect branches, and any benefit 83 + from a unique source address could be outweighed by the increased icache 84 + footprint and iTLB pressure. 85 + 86 + Retpoline 87 + ~~~~~~~~~ 88 + The retpoline sequence also mitigates ITS-unsafe indirect branches. For this 89 + reason, when retpoline is enabled, ITS mitigation only relocates the RETs to 90 + safe thunks, unless the user requested the RSB-stuffing mitigation. 91 + 92 + RSB Stuffing 93 + ~~~~~~~~~~~~ 94 + RSB-stuffing via Call Depth Tracking is a mitigation for Retbleed RSB-underflow 95 + attacks, and it also mitigates RETs that are vulnerable to ITS. 96 + 97 + Mitigation in guests 98 + ^^^^^^^^^^^^^^^^^^^^ 99 + All guests deploy ITS mitigation by default, irrespective of eIBRS enumeration 100 + and Family/Model of the guest. This is because the eIBRS feature could be hidden 101 + from a guest. One exception to this is when a guest enumerates BHI_DIS_S, which 102 + indicates that the guest is running on an unaffected host. 
103 + 104 + To prevent guests from unnecessarily deploying the mitigation on unaffected 105 + platforms, Intel has defined ITS_NO bit(62) in MSR IA32_ARCH_CAPABILITIES. When 106 + a guest sees this bit set, it should not enumerate the ITS bug. Note, this bit 107 + is not set by any hardware, but is **intended for VMMs to synthesize** it for 108 + guests as per the host's affected status. 109 + 110 + Mitigation options 111 + ^^^^^^^^^^^^^^^^^^ 112 + The ITS mitigation can be controlled using the "indirect_target_selection" 113 + kernel parameter. The available options are: 114 + 115 + ======== =================================================================== 116 + on (default) Deploy the "Aligned branch/return thunks" mitigation. 117 + If spectre_v2 mitigation enables retpoline, aligned-thunks are only 118 + deployed for the affected RET instructions. Retpoline mitigates 119 + indirect branches. 120 + 121 + off Disable ITS mitigation. 122 + 123 + vmexit Equivalent to "=on" if the CPU is affected by guest/host isolation 124 + part of ITS. Otherwise, mitigation is not deployed. This option is 125 + useful when host userspace is not in the threat model, and only 126 + attacks from guest to host are considered. 127 + 128 + stuff Deploy RSB-fill mitigation when retpoline is also deployed. 129 + Otherwise, deploy the default mitigation. When retpoline mitigation 130 + is enabled, RSB-stuffing via Call-Depth-Tracking also mitigates 131 + ITS. 132 + 133 + force Force the ITS bug and deploy the default mitigation. 134 + ======== =================================================================== 135 + 136 + Sysfs reporting 137 + --------------- 138 + 139 + The sysfs file showing ITS mitigation status is: 140 + 141 + /sys/devices/system/cpu/vulnerabilities/indirect_target_selection 142 + 143 + Note, microcode mitigation status is not reported in this file. 144 + 145 + The possible values in this file are: 146 + 147 + .. 
list-table:: 148 + 149 + * - Not affected 150 + - The processor is not vulnerable. 151 + * - Vulnerable 152 + - System is vulnerable and no mitigation has been applied. 153 + * - Vulnerable, KVM: Not affected 154 + - System is vulnerable to intra-mode BTI, but not affected by eIBRS 155 + guest/host isolation. 156 + * - Mitigation: Aligned branch/return thunks 157 + - The mitigation is enabled, affected indirect branches and RETs are 158 + relocated to safe thunks. 159 + * - Mitigation: Retpolines, Stuffing RSB 160 + - The mitigation is enabled using retpoline and RSB stuffing. 161 + 162 + References 163 + ---------- 164 + .. [#f1] Microcode repository - https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files 165 + 166 + .. [#f2] Affected Processors list - https://www.intel.com/content/www/us/en/developer/topic-technology/software-security-guidance/processors-affected-consolidated-product-cpu-model.html 167 + 168 + .. [#f3] Affected Processors list (machine readable) - https://github.com/intel/Intel-affected-processor-list
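The sysfs values documented above lend themselves to simple scripting. As a sketch (the helper name and the category labels are illustrative, not part of the patch), a shell snippet that reads and classifies the reported status:

```shell
#!/bin/sh
# Classify an ITS status string as reported by
# /sys/devices/system/cpu/vulnerabilities/indirect_target_selection.
# The category names (safe/mitigated/vulnerable/unknown) are invented
# for this example; only the matched strings come from the document.
classify_its() {
    case "$1" in
        "Not affected")  echo safe ;;
        "Mitigation:"*)  echo mitigated ;;
        "Vulnerable"*)   echo vulnerable ;;
        *)               echo unknown ;;
    esac
}

f=/sys/devices/system/cpu/vulnerabilities/indirect_target_selection
if [ -r "$f" ]; then
    classify_its "$(cat "$f")"
else
    # Kernels predating this patch do not expose the file.
    echo unknown
fi
```

Note that "Vulnerable, KVM: Not affected" still classifies as vulnerable here, since the system remains exposed to intra-mode BTI.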
+18
Documentation/admin-guide/kernel-parameters.txt
··· 2202 2202 different crypto accelerators. This option can be used 2203 2203 to achieve best performance for particular HW. 2204 2204 2205 + indirect_target_selection= [X86,Intel] Mitigation control for Indirect 2206 + Target Selection(ITS) bug in Intel CPUs. Updated 2207 + microcode is also required for a fix in IBPB. 2208 + 2209 + on: Enable mitigation (default). 2210 + off: Disable mitigation. 2211 + force: Force the ITS bug and deploy default 2212 + mitigation. 2213 + vmexit: Only deploy mitigation if CPU is affected by 2214 + guest/host isolation part of ITS. 2215 + stuff: Deploy RSB-fill mitigation when retpoline is 2216 + also deployed. Otherwise, deploy the default 2217 + mitigation. 2218 + 2219 + For details see: 2220 + Documentation/admin-guide/hw-vuln/indirect-target-selection.rst 2221 + 2205 2222 init= [KNL] 2206 2223 Format: <full_path> 2207 2224 Run specified binary instead of /sbin/init as init ··· 3710 3693 expose users to several CPU vulnerabilities. 3711 3694 Equivalent to: if nokaslr then kpti=0 [ARM64] 3712 3695 gather_data_sampling=off [X86] 3696 + indirect_target_selection=off [X86] 3713 3697 kvm.nx_huge_pages=off [X86] 3714 3698 l1tf=off [X86] 3715 3699 mds=off [X86]
+2
Documentation/devicetree/bindings/sound/qcom,sm8250.yaml
··· 31 31 - qcom,apq8096-sndcard 32 32 - qcom,qcm6490-idp-sndcard 33 33 - qcom,qcs6490-rb3gen2-sndcard 34 + - qcom,qcs9075-sndcard 35 + - qcom,qcs9100-sndcard 34 36 - qcom,qrb4210-rb2-sndcard 35 37 - qcom,qrb5165-rb5-sndcard 36 38 - qcom,sc7180-qdsp6-sndcard
+17
Documentation/kbuild/reproducible-builds.rst
··· 46 46 `KBUILD_BUILD_USER and KBUILD_BUILD_HOST`_ variables. If you are 47 47 building from a git commit, you could use its committer address. 48 48 49 + Absolute filenames 50 + ------------------ 51 + 52 + When the kernel is built out-of-tree, debug information may include 53 + absolute filenames for the source files. This must be overridden by 54 + including the ``-fdebug-prefix-map`` option in the `KCFLAGS`_ variable. 55 + 56 + Depending on the compiler used, the ``__FILE__`` macro may also expand 57 + to an absolute filename in an out-of-tree build. Kbuild automatically 58 + uses the ``-fmacro-prefix-map`` option to prevent this, if it is 59 + supported. 60 + 61 + The Reproducible Builds web site has more information about these 62 + `prefix-map options`_. 63 + 49 64 Generated files in source packages 50 65 ---------------------------------- 51 66 ··· 131 116 132 117 .. _KBUILD_BUILD_TIMESTAMP: kbuild.html#kbuild-build-timestamp 133 118 .. _KBUILD_BUILD_USER and KBUILD_BUILD_HOST: kbuild.html#kbuild-build-user-kbuild-build-host 119 + .. _KCFLAGS: kbuild.html#kcflags 120 + .. _prefix-map options: https://reproducible-builds.org/docs/build-path/ 134 121 .. _Reproducible Builds project: https://reproducible-builds.org/ 135 122 .. _SOURCE_DATE_EPOCH: https://reproducible-builds.org/docs/source-date-epoch/
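The KCFLAGS advice in the new "Absolute filenames" section can be sketched as a build invocation (the output directory name and use of the current directory are assumptions for this example):

```shell
# Map the absolute source path to "." in DWARF debug info so an
# out-of-tree build does not embed the build machine's directory layout.
# "build" is a placeholder output directory; adjust to your setup.
make O=build KCFLAGS="-fdebug-prefix-map=$PWD=."
```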
+6 -4
Documentation/netlink/specs/tc.yaml
··· 2017 2017 attributes: 2018 2018 - 2019 2019 name: act 2020 - type: nest 2020 + type: indexed-array 2021 + sub-type: nest 2021 2022 nested-attributes: tc-act-attrs 2022 2023 - 2023 2024 name: police ··· 2251 2250 attributes: 2252 2251 - 2253 2252 name: act 2254 - type: nest 2253 + type: indexed-array 2254 + sub-type: nest 2255 2255 nested-attributes: tc-act-attrs 2256 2256 - 2257 2257 name: police ··· 2747 2745 type: u16 2748 2746 byte-order: big-endian 2749 2747 - 2750 - name: key-l2-tpv3-sid 2748 + name: key-l2tpv3-sid 2751 2749 type: u32 2752 2750 byte-order: big-endian 2753 2751 - ··· 3506 3504 name: rate64 3507 3505 type: u64 3508 3506 - 3509 - name: prate4 3507 + name: prate64 3510 3508 type: u64 3511 3509 - 3512 3510 name: burst
+3 -5
Documentation/networking/timestamping.rst
··· 811 811 3.2.4 Other caveats for MAC drivers 812 812 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 813 813 814 - Stacked PHCs, especially DSA (but not only) - since that doesn't require any 815 - modification to MAC drivers, so it is more difficult to ensure correctness of 816 - all possible code paths - is that they uncover bugs which were impossible to 817 - trigger before the existence of stacked PTP clocks. One example has to do with 818 - this line of code, already presented earlier:: 814 + The use of stacked PHCs may uncover MAC driver bugs which were impossible to 815 + trigger without them. One example has to do with this line of code, already 816 + presented earlier:: 819 817 820 818 skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS; 821 819
+22 -4
MAINTAINERS
··· 10147 10147 F: include/linux/gpio/regmap.h 10148 10148 K: (devm_)?gpio_regmap_(un)?register 10149 10149 10150 + GPIO SLOPPY LOGIC ANALYZER 10151 + M: Wolfram Sang <wsa+renesas@sang-engineering.com> 10152 + S: Supported 10153 + F: Documentation/dev-tools/gpio-sloppy-logic-analyzer.rst 10154 + F: drivers/gpio/gpio-sloppy-logic-analyzer.c 10155 + F: tools/gpio/gpio-sloppy-logic-analyzer.sh 10156 + 10150 10157 GPIO SUBSYSTEM 10151 10158 M: Linus Walleij <linus.walleij@linaro.org> 10152 10159 M: Bartosz Golaszewski <brgl@bgdev.pl> ··· 15549 15542 F: include/linux/execmem.h 15550 15543 F: mm/execmem.c 15551 15544 15545 + MEMORY MANAGEMENT - GUP (GET USER PAGES) 15546 + M: Andrew Morton <akpm@linux-foundation.org> 15547 + M: David Hildenbrand <david@redhat.com> 15548 + R: Jason Gunthorpe <jgg@nvidia.com> 15549 + R: John Hubbard <jhubbard@nvidia.com> 15550 + R: Peter Xu <peterx@redhat.com> 15551 + L: linux-mm@kvack.org 15552 + S: Maintained 15553 + W: http://www.linux-mm.org 15554 + T: git git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm 15555 + F: mm/gup.c 15556 + 15552 15557 MEMORY MANAGEMENT - NUMA MEMBLOCKS AND NUMA EMULATION 15553 15558 M: Andrew Morton <akpm@linux-foundation.org> 15554 15559 M: Mike Rapoport <rppt@kernel.org> ··· 18452 18433 PARAVIRT_OPS INTERFACE 18453 18434 M: Juergen Gross <jgross@suse.com> 18454 18435 R: Ajay Kaher <ajay.kaher@broadcom.com> 18455 - R: Alexey Makhalov <alexey.amakhalov@broadcom.com> 18436 + R: Alexey Makhalov <alexey.makhalov@broadcom.com> 18456 18437 R: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com> 18457 18438 L: virtualization@lists.linux.dev 18458 18439 L: x86@kernel.org ··· 22943 22924 22944 22925 SPEAR PLATFORM/CLOCK/PINCTRL SUPPORT 22945 22926 M: Viresh Kumar <vireshk@kernel.org> 22946 - M: Shiraz Hashim <shiraz.linux.kernel@gmail.com> 22947 22927 L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) 22948 22928 L: soc@lists.linux.dev 22949 22929 S: Maintained ··· 
25943 25925 25944 25926 VMWARE HYPERVISOR INTERFACE 25945 25927 M: Ajay Kaher <ajay.kaher@broadcom.com> 25946 - M: Alexey Makhalov <alexey.amakhalov@broadcom.com> 25928 + M: Alexey Makhalov <alexey.makhalov@broadcom.com> 25947 25929 R: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com> 25948 25930 L: virtualization@lists.linux.dev 25949 25931 L: x86@kernel.org ··· 25971 25953 VMWARE VIRTUAL PTP CLOCK DRIVER 25972 25954 M: Nick Shi <nick.shi@broadcom.com> 25973 25955 R: Ajay Kaher <ajay.kaher@broadcom.com> 25974 - R: Alexey Makhalov <alexey.amakhalov@broadcom.com> 25956 + R: Alexey Makhalov <alexey.makhalov@broadcom.com> 25975 25957 R: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com> 25976 25958 L: netdev@vger.kernel.org 25977 25959 S: Supported
+2 -3
Makefile
··· 2 2 VERSION = 6 3 3 PATCHLEVEL = 15 4 4 SUBLEVEL = 0 5 - EXTRAVERSION = -rc6 5 + EXTRAVERSION = -rc7 6 6 NAME = Baby Opossum Posse 7 7 8 8 # *DOCUMENTATION* ··· 1068 1068 1069 1069 # change __FILE__ to the relative path to the source directory 1070 1070 ifdef building_out_of_srctree 1071 - KBUILD_CPPFLAGS += $(call cc-option,-ffile-prefix-map=$(srcroot)/=) 1072 - KBUILD_RUSTFLAGS += --remap-path-prefix=$(srcroot)/= 1071 + KBUILD_CPPFLAGS += $(call cc-option,-fmacro-prefix-map=$(srcroot)/=) 1073 1072 endif 1074 1073 1075 1074 # include additional Makefiles when needed
+3 -3
arch/arm/boot/dts/amlogic/meson8.dtsi
··· 451 451 pwm_ef: pwm@86c0 { 452 452 compatible = "amlogic,meson8-pwm-v2"; 453 453 clocks = <&xtal>, 454 - <>, /* unknown/untested, the datasheet calls it "Video PLL" */ 454 + <0>, /* unknown/untested, the datasheet calls it "Video PLL" */ 455 455 <&clkc CLKID_FCLK_DIV4>, 456 456 <&clkc CLKID_FCLK_DIV3>; 457 457 reg = <0x86c0 0x10>; ··· 705 705 &pwm_ab { 706 706 compatible = "amlogic,meson8-pwm-v2"; 707 707 clocks = <&xtal>, 708 - <>, /* unknown/untested, the datasheet calls it "Video PLL" */ 708 + <0>, /* unknown/untested, the datasheet calls it "Video PLL" */ 709 709 <&clkc CLKID_FCLK_DIV4>, 710 710 <&clkc CLKID_FCLK_DIV3>; 711 711 }; ··· 713 713 &pwm_cd { 714 714 compatible = "amlogic,meson8-pwm-v2"; 715 715 clocks = <&xtal>, 716 - <>, /* unknown/untested, the datasheet calls it "Video PLL" */ 716 + <0>, /* unknown/untested, the datasheet calls it "Video PLL" */ 717 717 <&clkc CLKID_FCLK_DIV4>, 718 718 <&clkc CLKID_FCLK_DIV3>; 719 719 };
+3 -3
arch/arm/boot/dts/amlogic/meson8b.dtsi
··· 406 406 compatible = "amlogic,meson8b-pwm-v2", "amlogic,meson8-pwm-v2"; 407 407 reg = <0x86c0 0x10>; 408 408 clocks = <&xtal>, 409 - <>, /* unknown/untested, the datasheet calls it "Video PLL" */ 409 + <0>, /* unknown/untested, the datasheet calls it "Video PLL" */ 410 410 <&clkc CLKID_FCLK_DIV4>, 411 411 <&clkc CLKID_FCLK_DIV3>; 412 412 #pwm-cells = <3>; ··· 680 680 &pwm_ab { 681 681 compatible = "amlogic,meson8b-pwm-v2", "amlogic,meson8-pwm-v2"; 682 682 clocks = <&xtal>, 683 - <>, /* unknown/untested, the datasheet calls it "Video PLL" */ 683 + <0>, /* unknown/untested, the datasheet calls it "Video PLL" */ 684 684 <&clkc CLKID_FCLK_DIV4>, 685 685 <&clkc CLKID_FCLK_DIV3>; 686 686 }; ··· 688 688 &pwm_cd { 689 689 compatible = "amlogic,meson8b-pwm-v2", "amlogic,meson8-pwm-v2"; 690 690 clocks = <&xtal>, 691 - <>, /* unknown/untested, the datasheet calls it "Video PLL" */ 691 + <0>, /* unknown/untested, the datasheet calls it "Video PLL" */ 692 692 <&clkc CLKID_FCLK_DIV4>, 693 693 <&clkc CLKID_FCLK_DIV3>; 694 694 };
+1 -1
arch/arm64/boot/dts/amazon/alpine-v2.dtsi
··· 151 151 al,msi-num-spis = <160>; 152 152 }; 153 153 154 - io-fabric@fc000000 { 154 + io-bus@fc000000 { 155 155 compatible = "simple-bus"; 156 156 #address-cells = <1>; 157 157 #size-cells = <1>;
+1 -1
arch/arm64/boot/dts/amazon/alpine-v3.dtsi
··· 361 361 interrupt-parent = <&gic>; 362 362 }; 363 363 364 - io-fabric@fc000000 { 364 + io-bus@fc000000 { 365 365 compatible = "simple-bus"; 366 366 #address-cells = <1>; 367 367 #size-cells = <1>;
+3 -3
arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi
··· 2313 2313 "amlogic,meson8-pwm-v2"; 2314 2314 reg = <0x0 0x19000 0x0 0x20>; 2315 2315 clocks = <&xtal>, 2316 - <>, /* unknown/untested, the datasheet calls it "vid_pll" */ 2316 + <0>, /* unknown/untested, the datasheet calls it "vid_pll" */ 2317 2317 <&clkc CLKID_FCLK_DIV4>, 2318 2318 <&clkc CLKID_FCLK_DIV3>; 2319 2319 #pwm-cells = <3>; ··· 2325 2325 "amlogic,meson8-pwm-v2"; 2326 2326 reg = <0x0 0x1a000 0x0 0x20>; 2327 2327 clocks = <&xtal>, 2328 - <>, /* unknown/untested, the datasheet calls it "vid_pll" */ 2328 + <0>, /* unknown/untested, the datasheet calls it "vid_pll" */ 2329 2329 <&clkc CLKID_FCLK_DIV4>, 2330 2330 <&clkc CLKID_FCLK_DIV3>; 2331 2331 #pwm-cells = <3>; ··· 2337 2337 "amlogic,meson8-pwm-v2"; 2338 2338 reg = <0x0 0x1b000 0x0 0x20>; 2339 2339 clocks = <&xtal>, 2340 - <>, /* unknown/untested, the datasheet calls it "vid_pll" */ 2340 + <0>, /* unknown/untested, the datasheet calls it "vid_pll" */ 2341 2341 <&clkc CLKID_FCLK_DIV4>, 2342 2342 <&clkc CLKID_FCLK_DIV3>; 2343 2343 #pwm-cells = <3>;
+4
arch/arm64/boot/dts/amlogic/meson-g12b-dreambox.dtsi
··· 116 116 status = "okay"; 117 117 }; 118 118 119 + &clkc_audio { 120 + status = "okay"; 121 + }; 122 + 119 123 &frddr_a { 120 124 status = "okay"; 121 125 };
+3 -3
arch/arm64/boot/dts/amlogic/meson-gxbb.dtsi
··· 741 741 742 742 &pwm_ab { 743 743 clocks = <&xtal>, 744 - <>, /* unknown/untested, the datasheet calls it "vid_pll" */ 744 + <0>, /* unknown/untested, the datasheet calls it "vid_pll" */ 745 745 <&clkc CLKID_FCLK_DIV4>, 746 746 <&clkc CLKID_FCLK_DIV3>; 747 747 }; ··· 752 752 753 753 &pwm_cd { 754 754 clocks = <&xtal>, 755 - <>, /* unknown/untested, the datasheet calls it "vid_pll" */ 755 + <0>, /* unknown/untested, the datasheet calls it "vid_pll" */ 756 756 <&clkc CLKID_FCLK_DIV4>, 757 757 <&clkc CLKID_FCLK_DIV3>; 758 758 }; 759 759 760 760 &pwm_ef { 761 761 clocks = <&xtal>, 762 - <>, /* unknown/untested, the datasheet calls it "vid_pll" */ 762 + <0>, /* unknown/untested, the datasheet calls it "vid_pll" */ 763 763 <&clkc CLKID_FCLK_DIV4>, 764 764 <&clkc CLKID_FCLK_DIV3>; 765 765 };
+3 -3
arch/arm64/boot/dts/amlogic/meson-gxl.dtsi
··· 811 811 812 812 &pwm_ab { 813 813 clocks = <&xtal>, 814 - <>, /* unknown/untested, the datasheet calls it "vid_pll" */ 814 + <0>, /* unknown/untested, the datasheet calls it "vid_pll" */ 815 815 <&clkc CLKID_FCLK_DIV4>, 816 816 <&clkc CLKID_FCLK_DIV3>; 817 817 }; ··· 822 822 823 823 &pwm_cd { 824 824 clocks = <&xtal>, 825 - <>, /* unknown/untested, the datasheet calls it "vid_pll" */ 825 + <0>, /* unknown/untested, the datasheet calls it "vid_pll" */ 826 826 <&clkc CLKID_FCLK_DIV4>, 827 827 <&clkc CLKID_FCLK_DIV3>; 828 828 }; 829 829 830 830 &pwm_ef { 831 831 clocks = <&xtal>, 832 - <>, /* unknown/untested, the datasheet calls it "vid_pll" */ 832 + <0>, /* unknown/untested, the datasheet calls it "vid_pll" */ 833 833 <&clkc CLKID_FCLK_DIV4>, 834 834 <&clkc CLKID_FCLK_DIV3>; 835 835 };
+10
arch/arm64/boot/dts/apple/t8103-j293.dts
··· 77 77 }; 78 78 }; 79 79 80 + /* 81 + * The driver depends on boot loader initialized state which resets when this 82 + * power-domain is powered off. This happens on suspend or when the driver is 83 + * missing during boot. Mark the domain as always on until the driver can 84 + * handle this. 85 + */ 86 + &ps_dispdfr_be { 87 + apple,always-on; 88 + }; 89 + 80 90 &display_dfr { 81 91 status = "okay"; 82 92 };
+10
arch/arm64/boot/dts/apple/t8112-j493.dts
··· 40 40 }; 41 41 }; 42 42 43 + /* 44 + * The driver depends on boot loader initialized state which resets when this 45 + * power-domain is powered off. This happens on suspend or when the driver is 46 + * missing during boot. Mark the domain as always on until the driver can 47 + * handle this. 48 + */ 49 + &ps_dispdfr_be { 50 + apple,always-on; 51 + }; 52 + 43 53 &display_dfr { 44 54 status = "okay"; 45 55 };
+2
arch/arm64/boot/dts/freescale/imx8mp-nominal.dtsi
··· 88 88 <0>, <0>, <400000000>, 89 89 <1039500000>; 90 90 }; 91 + 92 + /delete-node/ &{noc_opp_table/opp-1000000000};
+11 -1
arch/arm64/boot/dts/freescale/imx8mp-var-som.dtsi
··· 35 35 <0x1 0x00000000 0 0xc0000000>; 36 36 }; 37 37 38 - 39 38 reg_usdhc2_vmmc: regulator-usdhc2-vmmc { 40 39 compatible = "regulator-fixed"; 41 40 regulator-name = "VSD_3V3"; ··· 44 45 enable-active-high; 45 46 startup-delay-us = <100>; 46 47 off-on-delay-us = <12000>; 48 + }; 49 + 50 + reg_usdhc2_vqmmc: regulator-usdhc2-vqmmc { 51 + compatible = "regulator-gpio"; 52 + regulator-name = "VSD_VSEL"; 53 + regulator-min-microvolt = <1800000>; 54 + regulator-max-microvolt = <3300000>; 55 + gpios = <&gpio2 12 GPIO_ACTIVE_HIGH>; 56 + states = <3300000 0x0 1800000 0x1>; 57 + vin-supply = <&ldo5>; 47 58 }; 48 59 }; 49 60 ··· 214 205 pinctrl-2 = <&pinctrl_usdhc2_200mhz>, <&pinctrl_usdhc2_gpio>; 215 206 cd-gpios = <&gpio1 14 GPIO_ACTIVE_LOW>; 216 207 vmmc-supply = <&reg_usdhc2_vmmc>; 208 + vqmmc-supply = <&reg_usdhc2_vqmmc>; 217 209 bus-width = <4>; 218 210 status = "okay"; 219 211 };
+6
arch/arm64/boot/dts/freescale/imx8mp.dtsi
··· 1645 1645 opp-hz = /bits/ 64 <200000000>; 1646 1646 }; 1647 1647 1648 + /* Nominal drive mode maximum */ 1649 + opp-800000000 { 1650 + opp-hz = /bits/ 64 <800000000>; 1651 + }; 1652 + 1653 + /* Overdrive mode maximum */ 1648 1654 opp-1000000000 { 1649 1655 opp-hz = /bits/ 64 <1000000000>; 1650 1656 };
+1 -2
arch/arm64/boot/dts/rockchip/px30-engicam-common.dtsi
··· 31 31 }; 32 32 33 33 vcc3v3_btreg: vcc3v3-btreg { 34 - compatible = "regulator-gpio"; 34 + compatible = "regulator-fixed"; 35 35 enable-active-high; 36 36 pinctrl-names = "default"; 37 37 pinctrl-0 = <&bt_enable_h>; ··· 39 39 regulator-min-microvolt = <3300000>; 40 40 regulator-max-microvolt = <3300000>; 41 41 regulator-always-on; 42 - states = <3300000 0x0>; 43 42 }; 44 43 45 44 vcc3v3_rf_aux_mod: regulator-vcc3v3-rf-aux-mod {
+1 -1
arch/arm64/boot/dts/rockchip/px30-engicam-ctouch2.dtsi
··· 26 26 }; 27 27 28 28 &vcc3v3_btreg { 29 - enable-gpios = <&gpio1 RK_PC3 GPIO_ACTIVE_HIGH>; 29 + gpios = <&gpio1 RK_PC3 GPIO_ACTIVE_HIGH>; 30 30 };
+1 -1
arch/arm64/boot/dts/rockchip/px30-engicam-px30-core-edimm2.2.dts
··· 39 39 }; 40 40 41 41 &vcc3v3_btreg { 42 - enable-gpios = <&gpio1 RK_PC2 GPIO_ACTIVE_HIGH>; 42 + gpios = <&gpio1 RK_PC2 GPIO_ACTIVE_HIGH>; 43 43 };
+1 -1
arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi
··· 43 43 sdio_pwrseq: sdio-pwrseq { 44 44 compatible = "mmc-pwrseq-simple"; 45 45 clocks = <&rk808 1>; 46 - clock-names = "lpo"; 46 + clock-names = "ext_clock"; 47 47 pinctrl-names = "default"; 48 48 pinctrl-0 = <&wifi_enable_h>; 49 49 reset-gpios = <&gpio0 RK_PB2 GPIO_ACTIVE_LOW>;
+1 -1
arch/arm64/boot/dts/rockchip/rk3566-bigtreetech-cb2.dtsi
··· 775 775 rockchip,default-sample-phase = <90>; 776 776 status = "okay"; 777 777 778 - sdio-wifi@1 { 778 + wifi@1 { 779 779 compatible = "brcm,bcm4329-fmac"; 780 780 reg = <1>; 781 781 interrupt-parent = <&gpio2>;
+2
arch/arm64/boot/dts/rockchip/rk3568-qnap-ts433.dts
··· 619 619 bus-width = <8>; 620 620 max-frequency = <200000000>; 621 621 non-removable; 622 + pinctrl-names = "default"; 623 + pinctrl-0 = <&emmc_bus8 &emmc_clk &emmc_cmd &emmc_datastrobe>; 622 624 status = "okay"; 623 625 }; 624 626
+1 -1
arch/arm64/boot/dts/rockchip/rk3576-armsom-sige5.dts
··· 610 610 reg = <0x51>; 611 611 clock-output-names = "hym8563"; 612 612 interrupt-parent = <&gpio0>; 613 - interrupts = <RK_PB0 IRQ_TYPE_LEVEL_LOW>; 613 + interrupts = <RK_PA0 IRQ_TYPE_LEVEL_LOW>; 614 614 pinctrl-names = "default"; 615 615 pinctrl-0 = <&hym8563_int>; 616 616 wakeup-source;
+4
arch/arm64/boot/dts/rockchip/rk3588-friendlyelec-cm3588.dtsi
··· 222 222 compatible = "realtek,rt5616"; 223 223 reg = <0x1b>; 224 224 #sound-dai-cells = <0>; 225 + assigned-clocks = <&cru I2S0_8CH_MCLKOUT>; 226 + assigned-clock-rates = <12288000>; 227 + clocks = <&cru I2S0_8CH_MCLKOUT>; 228 + clock-names = "mclk"; 225 229 }; 226 230 }; 227 231
+2
arch/arm64/boot/dts/rockchip/rk3588-turing-rk1.dtsi
··· 214 214 }; 215 215 216 216 &package_thermal { 217 + polling-delay = <1000>; 218 + 217 219 trips { 218 220 package_active1: trip-active1 { 219 221 temperature = <45000>;
+17 -36
arch/arm64/boot/dts/rockchip/rk3588j.dtsi
··· 11 11 compatible = "operating-points-v2"; 12 12 opp-shared; 13 13 14 - opp-1416000000 { 15 - opp-hz = /bits/ 64 <1416000000>; 14 + opp-1200000000 { 15 + opp-hz = /bits/ 64 <1200000000>; 16 16 opp-microvolt = <750000 750000 950000>; 17 17 clock-latency-ns = <40000>; 18 18 opp-suspend; 19 19 }; 20 - opp-1608000000 { 21 - opp-hz = /bits/ 64 <1608000000>; 22 - opp-microvolt = <887500 887500 950000>; 23 - clock-latency-ns = <40000>; 24 - }; 25 - opp-1704000000 { 26 - opp-hz = /bits/ 64 <1704000000>; 27 - opp-microvolt = <937500 937500 950000>; 20 + opp-1296000000 { 21 + opp-hz = /bits/ 64 <1296000000>; 22 + opp-microvolt = <775000 775000 950000>; 28 23 clock-latency-ns = <40000>; 29 24 }; 30 25 }; ··· 28 33 compatible = "operating-points-v2"; 29 34 opp-shared; 30 35 36 + opp-1200000000{ 37 + opp-hz = /bits/ 64 <1200000000>; 38 + opp-microvolt = <750000 750000 950000>; 39 + clock-latency-ns = <40000>; 40 + }; 31 41 opp-1416000000 { 32 42 opp-hz = /bits/ 64 <1416000000>; 33 - opp-microvolt = <750000 750000 950000>; 43 + opp-microvolt = <762500 762500 950000>; 34 44 clock-latency-ns = <40000>; 35 45 }; 36 46 opp-1608000000 { 37 47 opp-hz = /bits/ 64 <1608000000>; 38 48 opp-microvolt = <787500 787500 950000>; 39 - clock-latency-ns = <40000>; 40 - }; 41 - opp-1800000000 { 42 - opp-hz = /bits/ 64 <1800000000>; 43 - opp-microvolt = <875000 875000 950000>; 44 - clock-latency-ns = <40000>; 45 - }; 46 - opp-2016000000 { 47 - opp-hz = /bits/ 64 <2016000000>; 48 - opp-microvolt = <950000 950000 950000>; 49 49 clock-latency-ns = <40000>; 50 50 }; 51 51 }; ··· 49 59 compatible = "operating-points-v2"; 50 60 opp-shared; 51 61 62 + opp-1200000000{ 63 + opp-hz = /bits/ 64 <1200000000>; 64 + opp-microvolt = <750000 750000 950000>; 65 + clock-latency-ns = <40000>; 66 + }; 52 67 opp-1416000000 { 53 68 opp-hz = /bits/ 64 <1416000000>; 54 - opp-microvolt = <750000 750000 950000>; 69 + opp-microvolt = <762500 762500 950000>; 55 70 clock-latency-ns = <40000>; 56 71 }; 57 72 opp-1608000000 
{ 58 73 opp-hz = /bits/ 64 <1608000000>; 59 74 opp-microvolt = <787500 787500 950000>; 60 - clock-latency-ns = <40000>; 61 - }; 62 - opp-1800000000 { 63 - opp-hz = /bits/ 64 <1800000000>; 64 - opp-microvolt = <875000 875000 950000>; 65 - clock-latency-ns = <40000>; 66 - }; 67 - opp-2016000000 { 68 - opp-hz = /bits/ 64 <2016000000>; 69 - opp-microvolt = <950000 950000 950000>; 70 75 clock-latency-ns = <40000>; 71 76 }; 72 77 }; ··· 88 103 opp-700000000 { 89 104 opp-hz = /bits/ 64 <700000000>; 90 105 opp-microvolt = <750000 750000 850000>; 91 - }; 92 - opp-850000000 { 93 - opp-hz = /bits/ 64 <800000000>; 94 - opp-microvolt = <787500 787500 850000>; 95 106 }; 96 107 }; 97 108 };
+2
arch/arm64/include/asm/cputype.h
··· 81 81 #define ARM_CPU_PART_CORTEX_A78AE 0xD42 82 82 #define ARM_CPU_PART_CORTEX_X1 0xD44 83 83 #define ARM_CPU_PART_CORTEX_A510 0xD46 84 + #define ARM_CPU_PART_CORTEX_X1C 0xD4C 84 85 #define ARM_CPU_PART_CORTEX_A520 0xD80 85 86 #define ARM_CPU_PART_CORTEX_A710 0xD47 86 87 #define ARM_CPU_PART_CORTEX_A715 0xD4D ··· 169 168 #define MIDR_CORTEX_A78AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78AE) 170 169 #define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1) 171 170 #define MIDR_CORTEX_A510 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A510) 171 + #define MIDR_CORTEX_X1C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1C) 172 172 #define MIDR_CORTEX_A520 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A520) 173 173 #define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710) 174 174 #define MIDR_CORTEX_A715 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A715)
+1
arch/arm64/include/asm/insn.h
··· 706 706 } 707 707 #endif 708 708 u32 aarch64_insn_gen_dmb(enum aarch64_insn_mb_type type); 709 + u32 aarch64_insn_gen_dsb(enum aarch64_insn_mb_type type); 709 710 u32 aarch64_insn_gen_mrs(enum aarch64_insn_register result, 710 711 enum aarch64_insn_system_register sysreg); 711 712
+3
arch/arm64/include/asm/spectre.h
··· 97 97 98 98 enum mitigation_state arm64_get_spectre_bhb_state(void); 99 99 bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry, int scope); 100 + extern bool __nospectre_bhb; 101 + u8 get_spectre_bhb_loop_value(void); 102 + bool is_spectre_bhb_fw_mitigated(void); 100 103 void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *__unused); 101 104 bool try_emulate_el1_ssbs(struct pt_regs *regs, u32 instr); 102 105
+12 -1
arch/arm64/kernel/proton-pack.c
··· 891 891 MIDR_ALL_VERSIONS(MIDR_CORTEX_A78AE), 892 892 MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C), 893 893 MIDR_ALL_VERSIONS(MIDR_CORTEX_X1), 894 + MIDR_ALL_VERSIONS(MIDR_CORTEX_X1C), 894 895 MIDR_ALL_VERSIONS(MIDR_CORTEX_A710), 895 896 MIDR_ALL_VERSIONS(MIDR_CORTEX_X2), 896 897 MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2), ··· 1000 999 return true; 1001 1000 } 1002 1001 1002 + u8 get_spectre_bhb_loop_value(void) 1003 + { 1004 + return max_bhb_k; 1005 + } 1006 + 1003 1007 static void this_cpu_set_vectors(enum arm64_bp_harden_el1_vectors slot) 1004 1008 { 1005 1009 const char *v = arm64_get_bp_hardening_vector(slot); ··· 1022 1016 isb(); 1023 1017 } 1024 1018 1025 - static bool __read_mostly __nospectre_bhb; 1019 + bool __read_mostly __nospectre_bhb; 1026 1020 static int __init parse_spectre_bhb_param(char *str) 1027 1021 { 1028 1022 __nospectre_bhb = true; ··· 1098 1092 } 1099 1093 1100 1094 update_mitigation_state(&spectre_bhb_state, state); 1095 + } 1096 + 1097 + bool is_spectre_bhb_fw_mitigated(void) 1098 + { 1099 + return test_bit(BHB_FW, &system_bhb_mitigations); 1101 1100 } 1102 1101 1103 1102 /* Patched to NOP when enabled */
+45 -31
arch/arm64/lib/insn.c
··· 5 5 * 6 6 * Copyright (C) 2014-2016 Zi Shen Lim <zlim.lnx@gmail.com> 7 7 */ 8 + #include <linux/bitfield.h> 8 9 #include <linux/bitops.h> 9 10 #include <linux/bug.h> 10 11 #include <linux/printk.h> ··· 1501 1500 return aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RM, insn, Rm); 1502 1501 } 1503 1502 1503 + static u32 __get_barrier_crm_val(enum aarch64_insn_mb_type type) 1504 + { 1505 + switch (type) { 1506 + case AARCH64_INSN_MB_SY: 1507 + return 0xf; 1508 + case AARCH64_INSN_MB_ST: 1509 + return 0xe; 1510 + case AARCH64_INSN_MB_LD: 1511 + return 0xd; 1512 + case AARCH64_INSN_MB_ISH: 1513 + return 0xb; 1514 + case AARCH64_INSN_MB_ISHST: 1515 + return 0xa; 1516 + case AARCH64_INSN_MB_ISHLD: 1517 + return 0x9; 1518 + case AARCH64_INSN_MB_NSH: 1519 + return 0x7; 1520 + case AARCH64_INSN_MB_NSHST: 1521 + return 0x6; 1522 + case AARCH64_INSN_MB_NSHLD: 1523 + return 0x5; 1524 + default: 1525 + pr_err("%s: unknown barrier type %d\n", __func__, type); 1526 + return AARCH64_BREAK_FAULT; 1527 + } 1528 + } 1529 + 1504 1530 u32 aarch64_insn_gen_dmb(enum aarch64_insn_mb_type type) 1505 1531 { 1506 1532 u32 opt; 1507 1533 u32 insn; 1508 1534 1509 - switch (type) { 1510 - case AARCH64_INSN_MB_SY: 1511 - opt = 0xf; 1512 - break; 1513 - case AARCH64_INSN_MB_ST: 1514 - opt = 0xe; 1515 - break; 1516 - case AARCH64_INSN_MB_LD: 1517 - opt = 0xd; 1518 - break; 1519 - case AARCH64_INSN_MB_ISH: 1520 - opt = 0xb; 1521 - break; 1522 - case AARCH64_INSN_MB_ISHST: 1523 - opt = 0xa; 1524 - break; 1525 - case AARCH64_INSN_MB_ISHLD: 1526 - opt = 0x9; 1527 - break; 1528 - case AARCH64_INSN_MB_NSH: 1529 - opt = 0x7; 1530 - break; 1531 - case AARCH64_INSN_MB_NSHST: 1532 - opt = 0x6; 1533 - break; 1534 - case AARCH64_INSN_MB_NSHLD: 1535 - opt = 0x5; 1536 - break; 1537 - default: 1538 - pr_err("%s: unknown dmb type %d\n", __func__, type); 1535 + opt = __get_barrier_crm_val(type); 1536 + if (opt == AARCH64_BREAK_FAULT) 1539 1537 return AARCH64_BREAK_FAULT; 1540 - } 1541 1538 1542 1539 insn = 
aarch64_insn_get_dmb_value(); 1540 + insn &= ~GENMASK(11, 8); 1541 + insn |= (opt << 8); 1542 + 1543 + return insn; 1544 + } 1545 + 1546 + u32 aarch64_insn_gen_dsb(enum aarch64_insn_mb_type type) 1547 + { 1548 + u32 opt, insn; 1549 + 1550 + opt = __get_barrier_crm_val(type); 1551 + if (opt == AARCH64_BREAK_FAULT) 1552 + return AARCH64_BREAK_FAULT; 1553 + 1554 + insn = aarch64_insn_get_dsb_base_value(); 1543 1555 insn &= ~GENMASK(11, 8); 1544 1556 insn |= (opt << 8); 1545 1557
+53 -4
arch/arm64/net/bpf_jit_comp.c
··· 7 7 8 8 #define pr_fmt(fmt) "bpf_jit: " fmt 9 9 10 + #include <linux/arm-smccc.h> 10 11 #include <linux/bitfield.h> 11 12 #include <linux/bpf.h> 12 13 #include <linux/filter.h> ··· 18 17 #include <asm/asm-extable.h> 19 18 #include <asm/byteorder.h> 20 19 #include <asm/cacheflush.h> 20 + #include <asm/cpufeature.h> 21 21 #include <asm/debug-monitors.h> 22 22 #include <asm/insn.h> 23 23 #include <asm/text-patching.h> ··· 941 939 plt->target = (u64)&dummy_tramp; 942 940 } 943 941 944 - static void build_epilogue(struct jit_ctx *ctx) 942 + /* Clobbers BPF registers 1-4, aka x0-x3 */ 943 + static void __maybe_unused build_bhb_mitigation(struct jit_ctx *ctx) 944 + { 945 + const u8 r1 = bpf2a64[BPF_REG_1]; /* aka x0 */ 946 + u8 k = get_spectre_bhb_loop_value(); 947 + 948 + if (!IS_ENABLED(CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY) || 949 + cpu_mitigations_off() || __nospectre_bhb || 950 + arm64_get_spectre_v2_state() == SPECTRE_VULNERABLE) 951 + return; 952 + 953 + if (capable(CAP_SYS_ADMIN)) 954 + return; 955 + 956 + if (supports_clearbhb(SCOPE_SYSTEM)) { 957 + emit(aarch64_insn_gen_hint(AARCH64_INSN_HINT_CLEARBHB), ctx); 958 + return; 959 + } 960 + 961 + if (k) { 962 + emit_a64_mov_i64(r1, k, ctx); 963 + emit(A64_B(1), ctx); 964 + emit(A64_SUBS_I(true, r1, r1, 1), ctx); 965 + emit(A64_B_(A64_COND_NE, -2), ctx); 966 + emit(aarch64_insn_gen_dsb(AARCH64_INSN_MB_ISH), ctx); 967 + emit(aarch64_insn_get_isb_value(), ctx); 968 + } 969 + 970 + if (is_spectre_bhb_fw_mitigated()) { 971 + emit(A64_ORR_I(false, r1, AARCH64_INSN_REG_ZR, 972 + ARM_SMCCC_ARCH_WORKAROUND_3), ctx); 973 + switch (arm_smccc_1_1_get_conduit()) { 974 + case SMCCC_CONDUIT_HVC: 975 + emit(aarch64_insn_get_hvc_value(), ctx); 976 + break; 977 + case SMCCC_CONDUIT_SMC: 978 + emit(aarch64_insn_get_smc_value(), ctx); 979 + break; 980 + default: 981 + pr_err_once("Firmware mitigation enabled with unknown conduit\n"); 982 + } 983 + } 984 + } 985 + 986 + static void build_epilogue(struct jit_ctx *ctx, bool 
was_classic) 945 987 { 946 988 const u8 r0 = bpf2a64[BPF_REG_0]; 947 989 const u8 ptr = bpf2a64[TCCNT_PTR]; ··· 998 952 999 953 emit(A64_POP(A64_ZR, ptr, A64_SP), ctx); 1000 954 955 + if (was_classic) 956 + build_bhb_mitigation(ctx); 957 + 1001 958 /* Restore FP/LR registers */ 1002 959 emit(A64_POP(A64_FP, A64_LR, A64_SP), ctx); 1003 960 1004 - /* Set return value */ 961 + /* Move the return value from bpf:r0 (aka x7) to x0 */ 1005 962 emit(A64_MOV(1, A64_R(0), r0), ctx); 1006 963 1007 964 /* Authenticate lr */ ··· 1947 1898 } 1948 1899 1949 1900 ctx.epilogue_offset = ctx.idx; 1950 - build_epilogue(&ctx); 1901 + build_epilogue(&ctx, was_classic); 1951 1902 build_plt(&ctx); 1952 1903 1953 1904 extable_align = __alignof__(struct exception_table_entry); ··· 2010 1961 goto out_free_hdr; 2011 1962 } 2012 1963 2013 - build_epilogue(&ctx); 1964 + build_epilogue(&ctx, was_classic); 2014 1965 build_plt(&ctx); 2015 1966 2016 1967 /* Extra pass to validate JITed code. */
+1 -1
arch/loongarch/include/asm/ptrace.h
··· 55 55 56 56 /* Query offset/name of register from its name/offset */ 57 57 extern int regs_query_register_offset(const char *name); 58 - #define MAX_REG_OFFSET (offsetof(struct pt_regs, __last)) 58 + #define MAX_REG_OFFSET (offsetof(struct pt_regs, __last) - sizeof(unsigned long)) 59 59 60 60 /** 61 61 * regs_get_register() - get register value from its offset
-1
arch/loongarch/include/asm/uprobes.h
··· 15 15 #define UPROBE_XOLBP_INSN __emit_break(BRK_UPROBE_XOLBP) 16 16 17 17 struct arch_uprobe { 18 - unsigned long resume_era; 19 18 u32 insn[2]; 20 19 u32 ixol[2]; 21 20 bool simulate;
+5 -2
arch/loongarch/kernel/genex.S
··· 16 16 #include <asm/stackframe.h> 17 17 #include <asm/thread_info.h> 18 18 19 + .section .cpuidle.text, "ax" 19 20 .align 5 20 21 SYM_FUNC_START(__arch_cpu_idle) 21 22 /* start of idle interrupt region */ ··· 32 31 */ 33 32 idle 0 34 33 /* end of idle interrupt region */ 35 - 1: jr ra 34 + idle_exit: 35 + jr ra 36 36 SYM_FUNC_END(__arch_cpu_idle) 37 + .previous 37 38 38 39 SYM_CODE_START(handle_vint) 39 40 UNWIND_HINT_UNDEFINED 40 41 BACKUP_T0T1 41 42 SAVE_ALL 42 - la_abs t1, 1b 43 + la_abs t1, idle_exit 43 44 LONG_L t0, sp, PT_ERA 44 45 /* 3 instructions idle interrupt region */ 45 46 ori t0, t0, 0b1100
+20 -2
arch/loongarch/kernel/kfpu.c
··· 18 18 static DEFINE_PER_CPU(bool, in_kernel_fpu); 19 19 static DEFINE_PER_CPU(unsigned int, euen_current); 20 20 21 + static inline void fpregs_lock(void) 22 + { 23 + if (IS_ENABLED(CONFIG_PREEMPT_RT)) 24 + preempt_disable(); 25 + else 26 + local_bh_disable(); 27 + } 28 + 29 + static inline void fpregs_unlock(void) 30 + { 31 + if (IS_ENABLED(CONFIG_PREEMPT_RT)) 32 + preempt_enable(); 33 + else 34 + local_bh_enable(); 35 + } 36 + 21 37 void kernel_fpu_begin(void) 22 38 { 23 39 unsigned int *euen_curr; 24 40 25 - preempt_disable(); 41 + if (!irqs_disabled()) 42 + fpregs_lock(); 26 43 27 44 WARN_ON(this_cpu_read(in_kernel_fpu)); 28 45 ··· 90 73 91 74 this_cpu_write(in_kernel_fpu, false); 92 75 93 - preempt_enable(); 76 + if (!irqs_disabled()) 77 + fpregs_unlock(); 94 78 } 95 79 EXPORT_SYMBOL_GPL(kernel_fpu_end); 96 80
+1 -1
arch/loongarch/kernel/time.c
··· 111 111 return lpj; 112 112 } 113 113 114 - static long init_offset __nosavedata; 114 + static long init_offset; 115 115 116 116 void save_counter(void) 117 117 {
+1 -10
arch/loongarch/kernel/uprobes.c
··· 42 42 utask->autask.saved_trap_nr = current->thread.trap_nr; 43 43 current->thread.trap_nr = UPROBE_TRAP_NR; 44 44 instruction_pointer_set(regs, utask->xol_vaddr); 45 - user_enable_single_step(current); 46 45 47 46 return 0; 48 47 } ··· 52 53 53 54 WARN_ON_ONCE(current->thread.trap_nr != UPROBE_TRAP_NR); 54 55 current->thread.trap_nr = utask->autask.saved_trap_nr; 55 - 56 - if (auprobe->simulate) 57 - instruction_pointer_set(regs, auprobe->resume_era); 58 - else 59 - instruction_pointer_set(regs, utask->vaddr + LOONGARCH_INSN_SIZE); 60 - 61 - user_disable_single_step(current); 56 + instruction_pointer_set(regs, utask->vaddr + LOONGARCH_INSN_SIZE); 62 57 63 58 return 0; 64 59 } ··· 63 70 64 71 current->thread.trap_nr = utask->autask.saved_trap_nr; 65 72 instruction_pointer_set(regs, utask->vaddr); 66 - user_disable_single_step(current); 67 73 } 68 74 69 75 bool arch_uprobe_xol_was_trapped(struct task_struct *t) ··· 82 90 83 91 insn.word = auprobe->insn[0]; 84 92 arch_simulate_insn(insn, regs); 85 - auprobe->resume_era = regs->csr_era; 86 93 87 94 return true; 88 95 }
+3
arch/loongarch/power/hibernate.c
··· 2 2 #include <asm/fpu.h> 3 3 #include <asm/loongson.h> 4 4 #include <asm/sections.h> 5 + #include <asm/time.h> 5 6 #include <asm/tlbflush.h> 6 7 #include <linux/suspend.h> 7 8 ··· 15 14 16 15 void save_processor_state(void) 17 16 { 17 + save_counter(); 18 18 saved_crmd = csr_read32(LOONGARCH_CSR_CRMD); 19 19 saved_prmd = csr_read32(LOONGARCH_CSR_PRMD); 20 20 saved_euen = csr_read32(LOONGARCH_CSR_EUEN); ··· 28 26 29 27 void restore_processor_state(void) 30 28 { 29 + sync_counter(); 31 30 csr_write32(saved_crmd, LOONGARCH_CSR_CRMD); 32 31 csr_write32(saved_prmd, LOONGARCH_CSR_PRMD); 33 32 csr_write32(saved_euen, LOONGARCH_CSR_EUEN);
+1 -1
arch/riscv/boot/dts/sophgo/cv18xx.dtsi
··· 341 341 1024 1024 1024 1024>; 342 342 snps,priority = <0 1 2 3 4 5 6 7>; 343 343 snps,dma-masters = <2>; 344 - snps,data-width = <4>; 344 + snps,data-width = <2>; 345 345 status = "disabled"; 346 346 }; 347 347
+1
arch/um/Makefile
··· 154 154 archclean: 155 155 @find . \( -name '*.bb' -o -name '*.bbg' -o -name '*.da' \ 156 156 -o -name '*.gcov' \) -type f -print | xargs rm -f 157 + $(Q)$(MAKE) -f $(srctree)/Makefile ARCH=$(HEADER_ARCH) clean 157 158 158 159 export HEADER_ARCH SUBARCH USER_CFLAGS CFLAGS_NO_HARDENING DEV_NULL_PATH
+12
arch/x86/Kconfig
··· 2711 2711 of speculative execution in a similar way to the Meltdown and Spectre 2712 2712 security vulnerabilities. 2713 2713 2714 + config MITIGATION_ITS 2715 + bool "Enable Indirect Target Selection mitigation" 2716 + depends on CPU_SUP_INTEL && X86_64 2717 + depends on MITIGATION_RETPOLINE && MITIGATION_RETHUNK 2718 + select EXECMEM 2719 + default y 2720 + help 2721 + Enable Indirect Target Selection (ITS) mitigation. ITS is a bug in 2722 + BPU on some Intel CPUs that may allow Spectre V2 style attacks. If 2723 + disabled, mitigation cannot be enabled via cmdline. 2724 + See <file:Documentation/admin-guide/hw-vuln/indirect-target-selection.rst> 2725 + 2714 2726 endif 2715 2727 2716 2728 config ARCH_HAS_ADD_PAGES
+165 -90
arch/x86/coco/sev/core.c
··· 959 959 set_pages_state(vaddr, npages, SNP_PAGE_STATE_PRIVATE); 960 960 } 961 961 962 + static int vmgexit_ap_control(u64 event, struct sev_es_save_area *vmsa, u32 apic_id) 963 + { 964 + bool create = event != SVM_VMGEXIT_AP_DESTROY; 965 + struct ghcb_state state; 966 + unsigned long flags; 967 + struct ghcb *ghcb; 968 + int ret = 0; 969 + 970 + local_irq_save(flags); 971 + 972 + ghcb = __sev_get_ghcb(&state); 973 + 974 + vc_ghcb_invalidate(ghcb); 975 + 976 + if (create) 977 + ghcb_set_rax(ghcb, vmsa->sev_features); 978 + 979 + ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_AP_CREATION); 980 + ghcb_set_sw_exit_info_1(ghcb, 981 + ((u64)apic_id << 32) | 982 + ((u64)snp_vmpl << 16) | 983 + event); 984 + ghcb_set_sw_exit_info_2(ghcb, __pa(vmsa)); 985 + 986 + sev_es_wr_ghcb_msr(__pa(ghcb)); 987 + VMGEXIT(); 988 + 989 + if (!ghcb_sw_exit_info_1_is_valid(ghcb) || 990 + lower_32_bits(ghcb->save.sw_exit_info_1)) { 991 + pr_err("SNP AP %s error\n", (create ? "CREATE" : "DESTROY")); 992 + ret = -EINVAL; 993 + } 994 + 995 + __sev_put_ghcb(&state); 996 + 997 + local_irq_restore(flags); 998 + 999 + return ret; 1000 + } 1001 + 1002 + static int snp_set_vmsa(void *va, void *caa, int apic_id, bool make_vmsa) 1003 + { 1004 + int ret; 1005 + 1006 + if (snp_vmpl) { 1007 + struct svsm_call call = {}; 1008 + unsigned long flags; 1009 + 1010 + local_irq_save(flags); 1011 + 1012 + call.caa = this_cpu_read(svsm_caa); 1013 + call.rcx = __pa(va); 1014 + 1015 + if (make_vmsa) { 1016 + /* Protocol 0, Call ID 2 */ 1017 + call.rax = SVSM_CORE_CALL(SVSM_CORE_CREATE_VCPU); 1018 + call.rdx = __pa(caa); 1019 + call.r8 = apic_id; 1020 + } else { 1021 + /* Protocol 0, Call ID 3 */ 1022 + call.rax = SVSM_CORE_CALL(SVSM_CORE_DELETE_VCPU); 1023 + } 1024 + 1025 + ret = svsm_perform_call_protocol(&call); 1026 + 1027 + local_irq_restore(flags); 1028 + } else { 1029 + /* 1030 + * If the kernel runs at VMPL0, it can change the VMSA 1031 + * bit for a page using the RMPADJUST instruction. 
1032 + * However, for the instruction to succeed it must 1033 + * target the permissions of a lesser privileged (higher 1034 + * numbered) VMPL level, so use VMPL1. 1035 + */ 1036 + u64 attrs = 1; 1037 + 1038 + if (make_vmsa) 1039 + attrs |= RMPADJUST_VMSA_PAGE_BIT; 1040 + 1041 + ret = rmpadjust((unsigned long)va, RMP_PG_SIZE_4K, attrs); 1042 + } 1043 + 1044 + return ret; 1045 + } 1046 + 1047 + static void snp_cleanup_vmsa(struct sev_es_save_area *vmsa, int apic_id) 1048 + { 1049 + int err; 1050 + 1051 + err = snp_set_vmsa(vmsa, NULL, apic_id, false); 1052 + if (err) 1053 + pr_err("clear VMSA page failed (%u), leaking page\n", err); 1054 + else 1055 + free_page((unsigned long)vmsa); 1056 + } 1057 + 962 1058 static void set_pte_enc(pte_t *kpte, int level, void *va) 963 1059 { 964 1060 struct pte_enc_desc d = { ··· 1101 1005 data = per_cpu(runtime_data, cpu); 1102 1006 ghcb = (unsigned long)&data->ghcb_page; 1103 1007 1104 - if (addr <= ghcb && ghcb <= addr + size) { 1008 + /* Handle the case of a huge page containing the GHCB page */ 1009 + if (addr <= ghcb && ghcb < addr + size) { 1105 1010 skipped_addr = true; 1106 1011 break; 1107 1012 } ··· 1152 1055 pr_warn("Failed to stop shared<->private conversions\n"); 1153 1056 } 1057 + 1058 + /* 1059 + * Shut down all APs except the one handling kexec/kdump, and clear 1060 + * the VMSA tag on the APs' VMSA pages since they are no longer used 1061 + * as VMSA pages. 1062 + */ 1063 + static void shutdown_all_aps(void) 1064 + { 1065 + struct sev_es_save_area *vmsa; 1066 + int apic_id, this_cpu, cpu; 1067 + 1068 + this_cpu = get_cpu(); 1069 + 1070 + /* 1071 + * APs are already in the HLT loop when the enc_kexec_finish() 1072 + * callback is invoked. 1073 + */ 1074 + for_each_present_cpu(cpu) { 1075 + vmsa = per_cpu(sev_vmsa, cpu); 1076 + 1077 + /* 1078 + * The BSP or offlined APs do not have a guest-allocated VMSA 1079 + * and there is no need to clear the VMSA tag for this page.
1080 + */ 1081 + if (!vmsa) 1082 + continue; 1083 + 1084 + /* 1085 + * Cannot clear the VMSA tag for the currently running vCPU. 1086 + */ 1087 + if (this_cpu == cpu) { 1088 + unsigned long pa; 1089 + struct page *p; 1090 + 1091 + pa = __pa(vmsa); 1092 + /* 1093 + * Mark the VMSA page of the running vCPU as offline 1094 + * so that it is excluded and not touched by makedumpfile 1095 + * while generating vmcore during kdump. 1096 + */ 1097 + p = pfn_to_online_page(pa >> PAGE_SHIFT); 1098 + if (p) 1099 + __SetPageOffline(p); 1100 + continue; 1101 + } 1102 + 1103 + apic_id = cpuid_to_apicid[cpu]; 1104 + 1105 + /* 1106 + * Issue AP destroy to ensure the AP gets kicked out of guest mode 1107 + * to allow using RMPADJUST to remove the VMSA tag on its 1108 + * VMSA page. 1109 + */ 1110 + vmgexit_ap_control(SVM_VMGEXIT_AP_DESTROY, vmsa, apic_id); 1111 + snp_cleanup_vmsa(vmsa, apic_id); 1112 + } 1113 + 1114 + put_cpu(); 1115 + } 1116 + 1155 1117 void snp_kexec_finish(void) 1156 1118 { 1157 1119 struct sev_es_runtime_data *data; 1120 + unsigned long size, addr; 1158 1121 unsigned int level, cpu; 1159 - unsigned long size; 1160 1122 struct ghcb *ghcb; 1161 1123 pte_t *pte; 1162 1124 ··· 1224 1068 1225 1069 if (!IS_ENABLED(CONFIG_KEXEC_CORE)) 1226 1070 return; 1071 + 1072 + shutdown_all_aps(); 1227 1073 1228 1074 unshare_all_memory(); 1229 1075 ··· 1243 1085 ghcb = &data->ghcb_page; 1244 1086 pte = lookup_address((unsigned long)ghcb, &level); 1245 1087 size = page_level_size(level); 1246 - set_pte_enc(pte, level, (void *)ghcb); 1247 - snp_set_memory_private((unsigned long)ghcb, (size / PAGE_SIZE)); 1088 + /* Handle the case of a huge page containing the GHCB page */ 1089 + addr = (unsigned long)ghcb & page_level_mask(level); 1090 + set_pte_enc(pte, level, (void *)addr); 1091 + snp_set_memory_private(addr, (size / PAGE_SIZE)); 1248 1092 } 1249 - } 1250 - 1251 - static int snp_set_vmsa(void *va, void *caa, int apic_id, bool make_vmsa) 1252 - { 1253 - int ret; 1254 - 1255 - if
(snp_vmpl) { 1256 - struct svsm_call call = {}; 1257 - unsigned long flags; 1258 - 1259 - local_irq_save(flags); 1260 - 1261 - call.caa = this_cpu_read(svsm_caa); 1262 - call.rcx = __pa(va); 1263 - 1264 - if (make_vmsa) { 1265 - /* Protocol 0, Call ID 2 */ 1266 - call.rax = SVSM_CORE_CALL(SVSM_CORE_CREATE_VCPU); 1267 - call.rdx = __pa(caa); 1268 - call.r8 = apic_id; 1269 - } else { 1270 - /* Protocol 0, Call ID 3 */ 1271 - call.rax = SVSM_CORE_CALL(SVSM_CORE_DELETE_VCPU); 1272 - } 1273 - 1274 - ret = svsm_perform_call_protocol(&call); 1275 - 1276 - local_irq_restore(flags); 1277 - } else { 1278 - /* 1279 - * If the kernel runs at VMPL0, it can change the VMSA 1280 - * bit for a page using the RMPADJUST instruction. 1281 - * However, for the instruction to succeed it must 1282 - * target the permissions of a lesser privileged (higher 1283 - * numbered) VMPL level, so use VMPL1. 1284 - */ 1285 - u64 attrs = 1; 1286 - 1287 - if (make_vmsa) 1288 - attrs |= RMPADJUST_VMSA_PAGE_BIT; 1289 - 1290 - ret = rmpadjust((unsigned long)va, RMP_PG_SIZE_4K, attrs); 1291 - } 1292 - 1293 - return ret; 1294 1093 } 1295 1094 1296 1095 #define __ATTR_BASE (SVM_SELECTOR_P_MASK | SVM_SELECTOR_S_MASK) ··· 1281 1166 return page_address(p + 1); 1282 1167 } 1283 1168 1284 - static void snp_cleanup_vmsa(struct sev_es_save_area *vmsa, int apic_id) 1285 - { 1286 - int err; 1287 - 1288 - err = snp_set_vmsa(vmsa, NULL, apic_id, false); 1289 - if (err) 1290 - pr_err("clear VMSA page failed (%u), leaking page\n", err); 1291 - else 1292 - free_page((unsigned long)vmsa); 1293 - } 1294 - 1295 1169 static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip) 1296 1170 { 1297 1171 struct sev_es_save_area *cur_vmsa, *vmsa; 1298 - struct ghcb_state state; 1299 1172 struct svsm_ca *caa; 1300 - unsigned long flags; 1301 - struct ghcb *ghcb; 1302 1173 u8 sipi_vector; 1303 1174 int cpu, ret; 1304 1175 u64 cr4; ··· 1398 1297 } 1399 1298 1400 1299 /* Issue VMGEXIT AP Creation NAE event */ 1401 - 
local_irq_save(flags); 1402 - 1403 - ghcb = __sev_get_ghcb(&state); 1404 - 1405 - vc_ghcb_invalidate(ghcb); 1406 - ghcb_set_rax(ghcb, vmsa->sev_features); 1407 - ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_AP_CREATION); 1408 - ghcb_set_sw_exit_info_1(ghcb, 1409 - ((u64)apic_id << 32) | 1410 - ((u64)snp_vmpl << 16) | 1411 - SVM_VMGEXIT_AP_CREATE); 1412 - ghcb_set_sw_exit_info_2(ghcb, __pa(vmsa)); 1413 - 1414 - sev_es_wr_ghcb_msr(__pa(ghcb)); 1415 - VMGEXIT(); 1416 - 1417 - if (!ghcb_sw_exit_info_1_is_valid(ghcb) || 1418 - lower_32_bits(ghcb->save.sw_exit_info_1)) { 1419 - pr_err("SNP AP Creation error\n"); 1420 - ret = -EINVAL; 1421 - } 1422 - 1423 - __sev_put_ghcb(&state); 1424 - 1425 - local_irq_restore(flags); 1426 - 1427 - /* Perform cleanup if there was an error */ 1300 + ret = vmgexit_ap_control(SVM_VMGEXIT_AP_CREATE, vmsa, apic_id); 1428 1301 if (ret) { 1429 1302 snp_cleanup_vmsa(vmsa, apic_id); 1430 1303 vmsa = NULL;
+17 -3
arch/x86/entry/entry_64.S
··· 1525 1525 * ORC to unwind properly. 1526 1526 * 1527 1527 * The alignment is for performance and not for safety, and may be safely 1528 - * refactored in the future if needed. 1528 + * refactored in the future if needed. The .skips are for safety, to ensure 1529 + * that all RETs are in the second half of a cacheline to mitigate Indirect 1530 + * Target Selection, rather than taking the slowpath via its_return_thunk. 1529 1531 */ 1530 1532 SYM_FUNC_START(clear_bhb_loop) 1531 1533 ANNOTATE_NOENDBR ··· 1538 1536 call 1f 1539 1537 jmp 5f 1540 1538 .align 64, 0xcc 1539 + /* 1540 + * Shift instructions so that the RET is in the upper half of the 1541 + * cacheline and don't take the slowpath to its_return_thunk. 1542 + */ 1543 + .skip 32 - (.Lret1 - 1f), 0xcc 1541 1544 ANNOTATE_INTRA_FUNCTION_CALL 1542 1545 1: call 2f 1543 - RET 1546 + .Lret1: RET 1544 1547 .align 64, 0xcc 1548 + /* 1549 + * As above, shift instructions for the RET at .Lret2 as well. 1550 + * 1551 + * This should ideally be: .skip 32 - (.Lret2 - 2f), 0xcc 1552 + * but some Clang versions (e.g. 18) don't like this. 1553 + */ 1554 + .skip 32 - 18, 0xcc 1545 1555 2: movl $5, %eax 1546 1556 3: jmp 4f 1547 1557 nop ··· 1561 1547 jnz 3b 1562 1548 sub $1, %ecx 1563 1549 jnz 1b 1564 - RET 1550 + .Lret2: RET 1565 1551 5: lfence 1566 1552 pop %rbp 1567 1553 RET
+5 -4
arch/x86/events/intel/ds.c
··· 2465 2465 setup_pebs_fixed_sample_data); 2466 2466 } 2467 2467 2468 - static void intel_pmu_pebs_event_update_no_drain(struct cpu_hw_events *cpuc, int size) 2468 + static void intel_pmu_pebs_event_update_no_drain(struct cpu_hw_events *cpuc, u64 mask) 2469 2469 { 2470 + u64 pebs_enabled = cpuc->pebs_enabled & mask; 2470 2471 struct perf_event *event; 2471 2472 int bit; 2472 2473 ··· 2478 2477 * It needs to call intel_pmu_save_and_restart_reload() to 2479 2478 * update the event->count for this case. 2480 2479 */ 2481 - for_each_set_bit(bit, (unsigned long *)&cpuc->pebs_enabled, size) { 2480 + for_each_set_bit(bit, (unsigned long *)&pebs_enabled, X86_PMC_IDX_MAX) { 2482 2481 event = cpuc->events[bit]; 2483 2482 if (event->hw.flags & PERF_X86_EVENT_AUTO_RELOAD) 2484 2483 intel_pmu_save_and_restart_reload(event, 0); ··· 2513 2512 } 2514 2513 2515 2514 if (unlikely(base >= top)) { 2516 - intel_pmu_pebs_event_update_no_drain(cpuc, size); 2515 + intel_pmu_pebs_event_update_no_drain(cpuc, mask); 2517 2516 return; 2518 2517 } 2519 2518 ··· 2627 2626 (hybrid(cpuc->pmu, fixed_cntr_mask64) << INTEL_PMC_IDX_FIXED); 2628 2627 2629 2628 if (unlikely(base >= top)) { 2630 - intel_pmu_pebs_event_update_no_drain(cpuc, X86_PMC_IDX_MAX); 2629 + intel_pmu_pebs_event_update_no_drain(cpuc, mask); 2631 2630 return; 2632 2631 } 2633 2632
+32
arch/x86/include/asm/alternative.h
··· 6 6 #include <linux/stringify.h> 7 7 #include <linux/objtool.h> 8 8 #include <asm/asm.h> 9 + #include <asm/bug.h> 9 10 10 11 #define ALT_FLAGS_SHIFT 16 11 12 ··· 122 121 void *func, void *ip) 123 122 { 124 123 return 0; 124 + } 125 + #endif 126 + 127 + #ifdef CONFIG_MITIGATION_ITS 128 + extern void its_init_mod(struct module *mod); 129 + extern void its_fini_mod(struct module *mod); 130 + extern void its_free_mod(struct module *mod); 131 + extern u8 *its_static_thunk(int reg); 132 + #else /* CONFIG_MITIGATION_ITS */ 133 + static inline void its_init_mod(struct module *mod) { } 134 + static inline void its_fini_mod(struct module *mod) { } 135 + static inline void its_free_mod(struct module *mod) { } 136 + static inline u8 *its_static_thunk(int reg) 137 + { 138 + WARN_ONCE(1, "ITS not compiled in"); 139 + 140 + return NULL; 141 + } 142 + #endif 143 + 144 + #if defined(CONFIG_MITIGATION_RETHUNK) && defined(CONFIG_OBJTOOL) 145 + extern bool cpu_wants_rethunk(void); 146 + extern bool cpu_wants_rethunk_at(void *addr); 147 + #else 148 + static __always_inline bool cpu_wants_rethunk(void) 149 + { 150 + return false; 151 + } 152 + static __always_inline bool cpu_wants_rethunk_at(void *addr) 153 + { 154 + return false; 125 155 } 126 156 #endif 127 157
+4 -1
arch/x86/include/asm/cpufeatures.h
··· 75 75 #define X86_FEATURE_CENTAUR_MCR ( 3*32+ 3) /* "centaur_mcr" Centaur MCRs (= MTRRs) */ 76 76 #define X86_FEATURE_K8 ( 3*32+ 4) /* Opteron, Athlon64 */ 77 77 #define X86_FEATURE_ZEN5 ( 3*32+ 5) /* CPU based on Zen5 microarchitecture */ 78 - /* Free ( 3*32+ 6) */ 78 + #define X86_FEATURE_ZEN6 ( 3*32+ 6) /* CPU based on Zen6 microarchitecture */ 79 79 /* Free ( 3*32+ 7) */ 80 80 #define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* "constant_tsc" TSC ticks at a constant rate */ 81 81 #define X86_FEATURE_UP ( 3*32+ 9) /* "up" SMP kernel running on UP */ ··· 481 481 #define X86_FEATURE_AMD_HETEROGENEOUS_CORES (21*32 + 6) /* Heterogeneous Core Topology */ 482 482 #define X86_FEATURE_AMD_WORKLOAD_CLASS (21*32 + 7) /* Workload Classification */ 483 483 #define X86_FEATURE_PREFER_YMM (21*32 + 8) /* Avoid ZMM registers due to downclocking */ 484 + #define X86_FEATURE_INDIRECT_THUNK_ITS (21*32 + 9) /* Use thunk for indirect branches in lower half of cacheline */ 484 485 485 486 /* 486 487 * BUG word(s) ··· 534 533 #define X86_BUG_BHI X86_BUG(1*32 + 3) /* "bhi" CPU is affected by Branch History Injection */ 535 534 #define X86_BUG_IBPB_NO_RET X86_BUG(1*32 + 4) /* "ibpb_no_ret" IBPB omits return target predictions */ 536 535 #define X86_BUG_SPECTRE_V2_USER X86_BUG(1*32 + 5) /* "spectre_v2_user" CPU is affected by Spectre variant 2 attack between user processes */ 536 + #define X86_BUG_ITS X86_BUG(1*32 + 6) /* "its" CPU is affected by Indirect Target Selection */ 537 + #define X86_BUG_ITS_NATIVE_ONLY X86_BUG(1*32 + 7) /* "its_native_only" CPU is affected by ITS, VMX is not affected */ 537 538 #endif /* _ASM_X86_CPUFEATURES_H */
+8
arch/x86/include/asm/msr-index.h
··· 211 211 * VERW clears CPU Register 212 212 * File. 213 213 */ 214 + #define ARCH_CAP_ITS_NO BIT_ULL(62) /* 215 + * Not susceptible to 216 + * Indirect Target Selection. 217 + * This bit is not set by 218 + * HW, but is synthesized by 219 + * VMMs for guests to know 220 + * their affected status. 221 + */ 214 222 215 223 #define MSR_IA32_FLUSH_CMD 0x0000010b 216 224 #define L1D_FLUSH BIT(0) /*
+10
arch/x86/include/asm/nospec-branch.h
··· 336 336 337 337 #else /* __ASSEMBLER__ */ 338 338 339 + #define ITS_THUNK_SIZE 64 340 + 339 341 typedef u8 retpoline_thunk_t[RETPOLINE_THUNK_SIZE]; 342 + typedef u8 its_thunk_t[ITS_THUNK_SIZE]; 340 343 extern retpoline_thunk_t __x86_indirect_thunk_array[]; 341 344 extern retpoline_thunk_t __x86_indirect_call_thunk_array[]; 342 345 extern retpoline_thunk_t __x86_indirect_jump_thunk_array[]; 346 + extern its_thunk_t __x86_indirect_its_thunk_array[]; 343 347 344 348 #ifdef CONFIG_MITIGATION_RETHUNK 345 349 extern void __x86_return_thunk(void); ··· 365 361 #else 366 362 static inline void srso_return_thunk(void) {} 367 363 static inline void srso_alias_return_thunk(void) {} 364 + #endif 365 + 366 + #ifdef CONFIG_MITIGATION_ITS 367 + extern void its_return_thunk(void); 368 + #else 369 + static inline void its_return_thunk(void) {} 368 370 #endif 369 371 370 372 extern void retbleed_return_thunk(void);
+1 -1
arch/x86/include/asm/sev-common.h
··· 116 116 #define GHCB_MSR_VMPL_REQ 0x016 117 117 #define GHCB_MSR_VMPL_REQ_LEVEL(v) \ 118 118 /* GHCBData[39:32] */ \ 119 - (((u64)(v) & GENMASK_ULL(7, 0) << 32) | \ 119 + ((((u64)(v) & GENMASK_ULL(7, 0)) << 32) | \ 120 120 /* GHCBData[11:0] */ \ 121 121 GHCB_MSR_VMPL_REQ) 122 122
+322 -20
arch/x86/kernel/alternative.c
··· 18 18 #include <linux/mmu_context.h> 19 19 #include <linux/bsearch.h> 20 20 #include <linux/sync_core.h> 21 + #include <linux/execmem.h> 21 22 #include <asm/text-patching.h> 22 23 #include <asm/alternative.h> 23 24 #include <asm/sections.h> ··· 32 31 #include <asm/paravirt.h> 33 32 #include <asm/asm-prototypes.h> 34 33 #include <asm/cfi.h> 34 + #include <asm/ibt.h> 35 + #include <asm/set_memory.h> 35 36 36 37 int __read_mostly alternatives_patched; 37 38 ··· 126 123 x86nops + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10, 127 124 #endif 128 125 }; 126 + 127 + #ifdef CONFIG_FINEIBT 128 + static bool cfi_paranoid __ro_after_init; 129 + #endif 130 + 131 + #ifdef CONFIG_MITIGATION_ITS 132 + 133 + #ifdef CONFIG_MODULES 134 + static struct module *its_mod; 135 + #endif 136 + static void *its_page; 137 + static unsigned int its_offset; 138 + 139 + /* Initialize a thunk with the "jmp *reg; int3" instructions. */ 140 + static void *its_init_thunk(void *thunk, int reg) 141 + { 142 + u8 *bytes = thunk; 143 + int offset = 0; 144 + int i = 0; 145 + 146 + #ifdef CONFIG_FINEIBT 147 + if (cfi_paranoid) { 148 + /* 149 + * When ITS uses indirect branch thunk the fineibt_paranoid 150 + * caller sequence doesn't fit in the caller site. So put the 151 + * remaining part of the sequence (<ea> + JNE) into the ITS 152 + * thunk. 
153 + */ 154 + bytes[i++] = 0xea; /* invalid instruction */ 155 + bytes[i++] = 0x75; /* JNE */ 156 + bytes[i++] = 0xfd; 157 + 158 + offset = 1; 159 + } 160 + #endif 161 + 162 + if (reg >= 8) { 163 + bytes[i++] = 0x41; /* REX.B prefix */ 164 + reg -= 8; 165 + } 166 + bytes[i++] = 0xff; 167 + bytes[i++] = 0xe0 + reg; /* jmp *reg */ 168 + bytes[i++] = 0xcc; 169 + 170 + return thunk + offset; 171 + } 172 + 173 + #ifdef CONFIG_MODULES 174 + void its_init_mod(struct module *mod) 175 + { 176 + if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) 177 + return; 178 + 179 + mutex_lock(&text_mutex); 180 + its_mod = mod; 181 + its_page = NULL; 182 + } 183 + 184 + void its_fini_mod(struct module *mod) 185 + { 186 + if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) 187 + return; 188 + 189 + WARN_ON_ONCE(its_mod != mod); 190 + 191 + its_mod = NULL; 192 + its_page = NULL; 193 + mutex_unlock(&text_mutex); 194 + 195 + for (int i = 0; i < mod->its_num_pages; i++) { 196 + void *page = mod->its_page_array[i]; 197 + execmem_restore_rox(page, PAGE_SIZE); 198 + } 199 + } 200 + 201 + void its_free_mod(struct module *mod) 202 + { 203 + if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) 204 + return; 205 + 206 + for (int i = 0; i < mod->its_num_pages; i++) { 207 + void *page = mod->its_page_array[i]; 208 + execmem_free(page); 209 + } 210 + kfree(mod->its_page_array); 211 + } 212 + #endif /* CONFIG_MODULES */ 213 + 214 + static void *its_alloc(void) 215 + { 216 + void *page __free(execmem) = execmem_alloc(EXECMEM_MODULE_TEXT, PAGE_SIZE); 217 + 218 + if (!page) 219 + return NULL; 220 + 221 + #ifdef CONFIG_MODULES 222 + if (its_mod) { 223 + void *tmp = krealloc(its_mod->its_page_array, 224 + (its_mod->its_num_pages+1) * sizeof(void *), 225 + GFP_KERNEL); 226 + if (!tmp) 227 + return NULL; 228 + 229 + its_mod->its_page_array = tmp; 230 + its_mod->its_page_array[its_mod->its_num_pages++] = page; 231 + 232 + execmem_make_temp_rw(page, PAGE_SIZE); 233 + } 234 + #endif /* 
CONFIG_MODULES */ 235 + 236 + return no_free_ptr(page); 237 + } 238 + 239 + static void *its_allocate_thunk(int reg) 240 + { 241 + int size = 3 + (reg / 8); 242 + void *thunk; 243 + 244 + #ifdef CONFIG_FINEIBT 245 + /* 246 + * The ITS thunk contains an indirect jump and an int3 instruction so 247 + * its size is 3 or 4 bytes depending on the register used. If CFI 248 + * paranoid is used then 3 extra bytes are added in the ITS thunk to 249 + * complete the fineibt_paranoid caller sequence. 250 + */ 251 + if (cfi_paranoid) 252 + size += 3; 253 + #endif 254 + 255 + if (!its_page || (its_offset + size - 1) >= PAGE_SIZE) { 256 + its_page = its_alloc(); 257 + if (!its_page) { 258 + pr_err("ITS page allocation failed\n"); 259 + return NULL; 260 + } 261 + memset(its_page, INT3_INSN_OPCODE, PAGE_SIZE); 262 + its_offset = 32; 263 + } 264 + 265 + /* 266 + * If the indirect branch instruction will be in the lower half 267 + * of a cacheline, then update the offset to reach the upper half. 268 + */ 269 + if ((its_offset + size - 1) % 64 < 32) 270 + its_offset = ((its_offset - 1) | 0x3F) + 33; 271 + 272 + thunk = its_page + its_offset; 273 + its_offset += size; 274 + 275 + return its_init_thunk(thunk, reg); 276 + } 277 + 278 + u8 *its_static_thunk(int reg) 279 + { 280 + u8 *thunk = __x86_indirect_its_thunk_array[reg]; 281 + 282 + #ifdef CONFIG_FINEIBT 283 + /* Paranoid thunk starts 2 bytes before */ 284 + if (cfi_paranoid) 285 + return thunk - 2; 286 + #endif 287 + return thunk; 288 + } 289 + 290 + #endif 129 291 130 292 /* 131 293 * Nomenclature for variable names to simplify and clarify this code and ease ··· 749 581 return i; 750 582 } 751 583 752 - static int emit_call_track_retpoline(void *addr, struct insn *insn, int reg, u8 *bytes) 584 + static int __emit_trampoline(void *addr, struct insn *insn, u8 *bytes, 585 + void *call_dest, void *jmp_dest) 753 586 { 754 587 u8 op = insn->opcode.bytes[0]; 755 588 int i = 0; ··· 771 602 switch (op) { 772 603 case CALL_INSN_OPCODE: 
773 604 __text_gen_insn(bytes+i, op, addr+i, 774 - __x86_indirect_call_thunk_array[reg], 605 + call_dest, 775 606 CALL_INSN_SIZE); 776 607 i += CALL_INSN_SIZE; 777 608 break; ··· 779 610 case JMP32_INSN_OPCODE: 780 611 clang_jcc: 781 612 __text_gen_insn(bytes+i, op, addr+i, 782 - __x86_indirect_jump_thunk_array[reg], 613 + jmp_dest, 783 614 JMP32_INSN_SIZE); 784 615 i += JMP32_INSN_SIZE; 785 616 break; ··· 793 624 794 625 return i; 795 626 } 627 + 628 + static int emit_call_track_retpoline(void *addr, struct insn *insn, int reg, u8 *bytes) 629 + { 630 + return __emit_trampoline(addr, insn, bytes, 631 + __x86_indirect_call_thunk_array[reg], 632 + __x86_indirect_jump_thunk_array[reg]); 633 + } 634 + 635 + #ifdef CONFIG_MITIGATION_ITS 636 + static int emit_its_trampoline(void *addr, struct insn *insn, int reg, u8 *bytes) 637 + { 638 + u8 *thunk = __x86_indirect_its_thunk_array[reg]; 639 + u8 *tmp = its_allocate_thunk(reg); 640 + 641 + if (tmp) 642 + thunk = tmp; 643 + 644 + return __emit_trampoline(addr, insn, bytes, thunk, thunk); 645 + } 646 + 647 + /* Check if an indirect branch is at ITS-unsafe address */ 648 + static bool cpu_wants_indirect_its_thunk_at(unsigned long addr, int reg) 649 + { 650 + if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) 651 + return false; 652 + 653 + /* Indirect branch opcode is 2 or 3 bytes depending on reg */ 654 + addr += 1 + reg / 8; 655 + 656 + /* Lower-half of the cacheline? */ 657 + return !(addr & 0x20); 658 + } 659 + #else /* CONFIG_MITIGATION_ITS */ 660 + 661 + #ifdef CONFIG_FINEIBT 662 + static bool cpu_wants_indirect_its_thunk_at(unsigned long addr, int reg) 663 + { 664 + return false; 665 + } 666 + #endif 667 + 668 + #endif /* CONFIG_MITIGATION_ITS */ 796 669 797 670 /* 798 671 * Rewrite the compiler generated retpoline thunk calls. 
··· 910 699 bytes[i++] = 0xe8; /* LFENCE */ 911 700 } 912 701 702 + #ifdef CONFIG_MITIGATION_ITS 703 + /* 704 + * Check if the address of last byte of emitted-indirect is in 705 + * lower-half of the cacheline. Such branches need ITS mitigation. 706 + */ 707 + if (cpu_wants_indirect_its_thunk_at((unsigned long)addr + i, reg)) 708 + return emit_its_trampoline(addr, insn, reg, bytes); 709 + #endif 710 + 913 711 ret = emit_indirect(op, reg, bytes + i); 914 712 if (ret < 0) 915 713 return ret; ··· 952 732 int len, ret; 953 733 u8 bytes[16]; 954 734 u8 op1, op2; 735 + u8 *dest; 955 736 956 737 ret = insn_decode_kernel(&insn, addr); 957 738 if (WARN_ON_ONCE(ret < 0)) ··· 969 748 970 749 case CALL_INSN_OPCODE: 971 750 case JMP32_INSN_OPCODE: 751 + /* Check for cfi_paranoid + ITS */ 752 + dest = addr + insn.length + insn.immediate.value; 753 + if (dest[-1] == 0xea && (dest[0] & 0xf0) == 0x70) { 754 + WARN_ON_ONCE(cfi_mode != CFI_FINEIBT); 755 + continue; 756 + } 972 757 break; 973 758 974 759 case 0x0f: /* escape */ ··· 1002 775 1003 776 #ifdef CONFIG_MITIGATION_RETHUNK 1004 777 778 + bool cpu_wants_rethunk(void) 779 + { 780 + return cpu_feature_enabled(X86_FEATURE_RETHUNK); 781 + } 782 + 783 + bool cpu_wants_rethunk_at(void *addr) 784 + { 785 + if (!cpu_feature_enabled(X86_FEATURE_RETHUNK)) 786 + return false; 787 + if (x86_return_thunk != its_return_thunk) 788 + return true; 789 + 790 + return !((unsigned long)addr & 0x20); 791 + } 792 + 1005 793 /* 1006 794 * Rewrite the compiler generated return thunk tail-calls. 1007 795 * ··· 1033 791 int i = 0; 1034 792 1035 793 /* Patch the custom return thunks... 
*/ 1036 - if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) { 794 + if (cpu_wants_rethunk_at(addr)) { 1037 795 i = JMP32_INSN_SIZE; 1038 796 __text_gen_insn(bytes, JMP32_INSN_OPCODE, addr, x86_return_thunk, i); 1039 797 } else { ··· 1050 808 { 1051 809 s32 *s; 1052 810 1053 - if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) 811 + if (cpu_wants_rethunk()) 1054 812 static_call_force_reinit(); 1055 813 1056 814 for (s = start; s < end; s++) { ··· 1263 1021 1264 1022 static bool cfi_rand __ro_after_init = true; 1265 1023 static u32 cfi_seed __ro_after_init; 1266 - 1267 - static bool cfi_paranoid __ro_after_init = false; 1268 1024 1269 1025 /* 1270 1026 * Re-hash the CFI hash with a boot-time seed while making sure the result is ··· 1676 1436 return 0; 1677 1437 } 1678 1438 1439 + static int emit_paranoid_trampoline(void *addr, struct insn *insn, int reg, u8 *bytes) 1440 + { 1441 + u8 *thunk = (void *)__x86_indirect_its_thunk_array[reg] - 2; 1442 + 1443 + #ifdef CONFIG_MITIGATION_ITS 1444 + u8 *tmp = its_allocate_thunk(reg); 1445 + if (tmp) 1446 + thunk = tmp; 1447 + #endif 1448 + 1449 + return __emit_trampoline(addr, insn, bytes, thunk, thunk); 1450 + } 1451 + 1679 1452 static int cfi_rewrite_callers(s32 *start, s32 *end) 1680 1453 { 1681 1454 s32 *s; ··· 1730 1477 memcpy(bytes, fineibt_paranoid_start, fineibt_paranoid_size); 1731 1478 memcpy(bytes + fineibt_caller_hash, &hash, 4); 1732 1479 1733 - ret = emit_indirect(op, 11, bytes + fineibt_paranoid_ind); 1734 - if (WARN_ON_ONCE(ret != 3)) 1735 - continue; 1480 + if (cpu_wants_indirect_its_thunk_at((unsigned long)addr + fineibt_paranoid_ind, 11)) { 1481 + emit_paranoid_trampoline(addr + fineibt_caller_size, 1482 + &insn, 11, bytes + fineibt_caller_size); 1483 + } else { 1484 + ret = emit_indirect(op, 11, bytes + fineibt_paranoid_ind); 1485 + if (WARN_ON_ONCE(ret != 3)) 1486 + continue; 1487 + } 1736 1488 1737 1489 text_poke_early(addr, bytes, fineibt_paranoid_size); 1738 1490 } ··· 1964 1706 return false; 1965 1707 } 1966 
1708 1709 + static bool is_paranoid_thunk(unsigned long addr) 1710 + { 1711 + u32 thunk; 1712 + 1713 + __get_kernel_nofault(&thunk, (u32 *)addr, u32, Efault); 1714 + return (thunk & 0x00FFFFFF) == 0xfd75ea; 1715 + 1716 + Efault: 1717 + return false; 1718 + } 1719 + 1967 1720 /* 1968 1721 * regs->ip points to a LOCK Jcc.d8 instruction from the fineibt_paranoid_start[] 1969 - * sequence. 1722 + * sequence, or to an invalid instruction (0xea) + Jcc.d8 for cfi_paranoid + ITS 1723 + * thunk. 1970 1724 */ 1971 1725 static bool decode_fineibt_paranoid(struct pt_regs *regs, unsigned long *target, u32 *type) 1972 1726 { 1973 1727 unsigned long addr = regs->ip - fineibt_paranoid_ud; 1974 - u32 hash; 1975 1728 1976 - if (!cfi_paranoid || !is_cfi_trap(addr + fineibt_caller_size - LEN_UD2)) 1729 + if (!cfi_paranoid) 1977 1730 return false; 1978 1731 1979 - __get_kernel_nofault(&hash, addr + fineibt_caller_hash, u32, Efault); 1980 - *target = regs->r11 + fineibt_preamble_size; 1981 - *type = regs->r10; 1732 + if (is_cfi_trap(addr + fineibt_caller_size - LEN_UD2)) { 1733 + *target = regs->r11 + fineibt_preamble_size; 1734 + *type = regs->r10; 1735 + 1736 + /* 1737 + * Since the trapping instruction is the exact, but LOCK prefixed, 1738 + * Jcc.d8 that got us here, the normal fixup will work. 1739 + */ 1740 + return true; 1741 + } 1982 1742 1983 1743 /* 1984 - * Since the trapping instruction is the exact, but LOCK prefixed, 1985 - * Jcc.d8 that got us here, the normal fixup will work. 
1744 + * The cfi_paranoid + ITS thunk combination results in: 1745 + * 1746 + * 0: 41 ba 78 56 34 12 mov $0x12345678, %r10d 1747 + * 6: 45 3b 53 f7 cmp -0x9(%r11), %r10d 1748 + * a: 4d 8d 5b f0 lea -0x10(%r11), %r11 1749 + * e: 2e e8 XX XX XX XX cs call __x86_indirect_paranoid_thunk_r11 1750 + * 1751 + * Where the paranoid_thunk looks like: 1752 + * 1753 + * 1d: <ea> (bad) 1754 + * __x86_indirect_paranoid_thunk_r11: 1755 + * 1e: 75 fd jne 1d 1756 + * __x86_indirect_its_thunk_r11: 1757 + * 20: 41 ff e3 jmp *%r11 1758 + * 23: cc int3 1759 + * 1986 1760 */ 1987 - return true; 1761 + if (is_paranoid_thunk(regs->ip)) { 1762 + *target = regs->r11 + fineibt_preamble_size; 1763 + *type = regs->r10; 1988 1764 1989 - Efault: 1765 + regs->ip = *target; 1766 + return true; 1767 + } 1768 + 1990 1769 return false; 1991 1770 } 1992 1771 ··· 2326 2031 2327 2032 void __init alternative_instructions(void) 2328 2033 { 2034 + u64 ibt; 2035 + 2329 2036 int3_selftest(); 2330 2037 2331 2038 /* ··· 2354 2057 */ 2355 2058 paravirt_set_cap(); 2356 2059 2060 + /* Keep CET-IBT disabled until caller/callee are patched */ 2061 + ibt = ibt_save(/*disable*/ true); 2062 + 2357 2063 __apply_fineibt(__retpoline_sites, __retpoline_sites_end, 2358 2064 __cfi_sites, __cfi_sites_end, true); 2359 2065 ··· 2379 2079 * Seal all functions that do not have their address taken. 2380 2080 */ 2381 2081 apply_seal_endbr(__ibt_endbr_seal, __ibt_endbr_seal_end); 2082 + 2083 + ibt_restore(ibt); 2382 2084 2383 2085 #ifdef CONFIG_SMP 2384 2086 /* Patch to UP if other cpus not imminent. */
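The cacheline bookkeeping in its_allocate_thunk() above is compact enough to misread. A minimal user-space sketch of the same arithmetic (hypothetical helper names; 64-byte lines assumed) shows the offset bump always pushes a thunk's last byte into the upper 32 bytes of a line:

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of the placement rule in its_allocate_thunk(): if the thunk's
 * last byte would land in the lower half of a 64-byte cacheline, bump
 * the offset into the upper half of the next line.
 * ((offset - 1) | 0x3F) is the last byte of the current line; adding 33
 * lands 32 bytes into the line that follows. */
static unsigned int its_place(unsigned int offset, int size)
{
	if ((offset + size - 1) % 64 < 32)
		offset = ((offset - 1) | 0x3F) + 33;
	return offset;
}

/* ITS-safe means the last emitted byte sits in the upper half. */
static bool its_safe(unsigned int offset, int size)
{
	return (offset + size - 1) % 64 >= 32;
}
```

The kernel additionally starts each fresh ITS page at offset 32, so the very first thunk already satisfies the invariant.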
+5
arch/x86/kernel/cpu/amd.c
··· 472 472 case 0x60 ... 0x7f: 473 473 setup_force_cpu_cap(X86_FEATURE_ZEN5); 474 474 break; 475 + case 0x50 ... 0x5f: 476 + case 0x90 ... 0xaf: 477 + case 0xc0 ... 0xcf: 478 + setup_force_cpu_cap(X86_FEATURE_ZEN6); 479 + break; 475 480 default: 476 481 goto warn; 477 482 }
+169 -7
arch/x86/kernel/cpu/bugs.c
··· 49 49 static void __init l1d_flush_select_mitigation(void); 50 50 static void __init srso_select_mitigation(void); 51 51 static void __init gds_select_mitigation(void); 52 + static void __init its_select_mitigation(void); 52 53 53 54 /* The base value of the SPEC_CTRL MSR without task-specific bits set */ 54 55 u64 x86_spec_ctrl_base; ··· 66 65 static DEFINE_MUTEX(spec_ctrl_mutex); 67 66 68 67 void (*x86_return_thunk)(void) __ro_after_init = __x86_return_thunk; 68 + 69 + static void __init set_return_thunk(void *thunk) 70 + { 71 + if (x86_return_thunk != __x86_return_thunk) 72 + pr_warn("x86/bugs: return thunk changed\n"); 73 + 74 + x86_return_thunk = thunk; 75 + } 69 76 70 77 /* Update SPEC_CTRL MSR and its cached copy unconditionally */ 71 78 static void update_spec_ctrl(u64 val) ··· 187 178 */ 188 179 srso_select_mitigation(); 189 180 gds_select_mitigation(); 181 + its_select_mitigation(); 190 182 } 191 183 192 184 /* ··· 1128 1118 setup_force_cpu_cap(X86_FEATURE_RETHUNK); 1129 1119 setup_force_cpu_cap(X86_FEATURE_UNRET); 1130 1120 1131 - x86_return_thunk = retbleed_return_thunk; 1121 + set_return_thunk(retbleed_return_thunk); 1132 1122 1133 1123 if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD && 1134 1124 boot_cpu_data.x86_vendor != X86_VENDOR_HYGON) ··· 1163 1153 setup_force_cpu_cap(X86_FEATURE_RETHUNK); 1164 1154 setup_force_cpu_cap(X86_FEATURE_CALL_DEPTH); 1165 1155 1166 - x86_return_thunk = call_depth_return_thunk; 1156 + set_return_thunk(call_depth_return_thunk); 1167 1157 break; 1168 1158 1169 1159 default: ··· 1195 1185 } 1196 1186 1197 1187 pr_info("%s\n", retbleed_strings[retbleed_mitigation]); 1188 + } 1189 + 1190 + #undef pr_fmt 1191 + #define pr_fmt(fmt) "ITS: " fmt 1192 + 1193 + enum its_mitigation_cmd { 1194 + ITS_CMD_OFF, 1195 + ITS_CMD_ON, 1196 + ITS_CMD_VMEXIT, 1197 + ITS_CMD_RSB_STUFF, 1198 + }; 1199 + 1200 + enum its_mitigation { 1201 + ITS_MITIGATION_OFF, 1202 + ITS_MITIGATION_VMEXIT_ONLY, 1203 + ITS_MITIGATION_ALIGNED_THUNKS, 1204 + 
ITS_MITIGATION_RETPOLINE_STUFF, 1205 + }; 1206 + 1207 + static const char * const its_strings[] = { 1208 + [ITS_MITIGATION_OFF] = "Vulnerable", 1209 + [ITS_MITIGATION_VMEXIT_ONLY] = "Mitigation: Vulnerable, KVM: Not affected", 1210 + [ITS_MITIGATION_ALIGNED_THUNKS] = "Mitigation: Aligned branch/return thunks", 1211 + [ITS_MITIGATION_RETPOLINE_STUFF] = "Mitigation: Retpolines, Stuffing RSB", 1212 + }; 1213 + 1214 + static enum its_mitigation its_mitigation __ro_after_init = ITS_MITIGATION_ALIGNED_THUNKS; 1215 + 1216 + static enum its_mitigation_cmd its_cmd __ro_after_init = 1217 + IS_ENABLED(CONFIG_MITIGATION_ITS) ? ITS_CMD_ON : ITS_CMD_OFF; 1218 + 1219 + static int __init its_parse_cmdline(char *str) 1220 + { 1221 + if (!str) 1222 + return -EINVAL; 1223 + 1224 + if (!IS_ENABLED(CONFIG_MITIGATION_ITS)) { 1225 + pr_err("Mitigation disabled at compile time, ignoring option (%s)", str); 1226 + return 0; 1227 + } 1228 + 1229 + if (!strcmp(str, "off")) { 1230 + its_cmd = ITS_CMD_OFF; 1231 + } else if (!strcmp(str, "on")) { 1232 + its_cmd = ITS_CMD_ON; 1233 + } else if (!strcmp(str, "force")) { 1234 + its_cmd = ITS_CMD_ON; 1235 + setup_force_cpu_bug(X86_BUG_ITS); 1236 + } else if (!strcmp(str, "vmexit")) { 1237 + its_cmd = ITS_CMD_VMEXIT; 1238 + } else if (!strcmp(str, "stuff")) { 1239 + its_cmd = ITS_CMD_RSB_STUFF; 1240 + } else { 1241 + pr_err("Ignoring unknown indirect_target_selection option (%s).", str); 1242 + } 1243 + 1244 + return 0; 1245 + } 1246 + early_param("indirect_target_selection", its_parse_cmdline); 1247 + 1248 + static void __init its_select_mitigation(void) 1249 + { 1250 + enum its_mitigation_cmd cmd = its_cmd; 1251 + 1252 + if (!boot_cpu_has_bug(X86_BUG_ITS) || cpu_mitigations_off()) { 1253 + its_mitigation = ITS_MITIGATION_OFF; 1254 + return; 1255 + } 1256 + 1257 + /* Retpoline+CDT mitigates ITS, bail out */ 1258 + if (boot_cpu_has(X86_FEATURE_RETPOLINE) && 1259 + boot_cpu_has(X86_FEATURE_CALL_DEPTH)) { 1260 + its_mitigation = 
ITS_MITIGATION_RETPOLINE_STUFF; 1261 + goto out; 1262 + } 1263 + 1264 + /* Exit early to avoid irrelevant warnings */ 1265 + if (cmd == ITS_CMD_OFF) { 1266 + its_mitigation = ITS_MITIGATION_OFF; 1267 + goto out; 1268 + } 1269 + if (spectre_v2_enabled == SPECTRE_V2_NONE) { 1270 + pr_err("WARNING: Spectre-v2 mitigation is off, disabling ITS\n"); 1271 + its_mitigation = ITS_MITIGATION_OFF; 1272 + goto out; 1273 + } 1274 + if (!IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) || 1275 + !IS_ENABLED(CONFIG_MITIGATION_RETHUNK)) { 1276 + pr_err("WARNING: ITS mitigation depends on retpoline and rethunk support\n"); 1277 + its_mitigation = ITS_MITIGATION_OFF; 1278 + goto out; 1279 + } 1280 + if (IS_ENABLED(CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B)) { 1281 + pr_err("WARNING: ITS mitigation is not compatible with CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B\n"); 1282 + its_mitigation = ITS_MITIGATION_OFF; 1283 + goto out; 1284 + } 1285 + if (boot_cpu_has(X86_FEATURE_RETPOLINE_LFENCE)) { 1286 + pr_err("WARNING: ITS mitigation is not compatible with lfence mitigation\n"); 1287 + its_mitigation = ITS_MITIGATION_OFF; 1288 + goto out; 1289 + } 1290 + 1291 + if (cmd == ITS_CMD_RSB_STUFF && 1292 + (!boot_cpu_has(X86_FEATURE_RETPOLINE) || !IS_ENABLED(CONFIG_MITIGATION_CALL_DEPTH_TRACKING))) { 1293 + pr_err("RSB stuff mitigation not supported, using default\n"); 1294 + cmd = ITS_CMD_ON; 1295 + } 1296 + 1297 + switch (cmd) { 1298 + case ITS_CMD_OFF: 1299 + its_mitigation = ITS_MITIGATION_OFF; 1300 + break; 1301 + case ITS_CMD_VMEXIT: 1302 + if (boot_cpu_has_bug(X86_BUG_ITS_NATIVE_ONLY)) { 1303 + its_mitigation = ITS_MITIGATION_VMEXIT_ONLY; 1304 + goto out; 1305 + } 1306 + fallthrough; 1307 + case ITS_CMD_ON: 1308 + its_mitigation = ITS_MITIGATION_ALIGNED_THUNKS; 1309 + if (!boot_cpu_has(X86_FEATURE_RETPOLINE)) 1310 + setup_force_cpu_cap(X86_FEATURE_INDIRECT_THUNK_ITS); 1311 + setup_force_cpu_cap(X86_FEATURE_RETHUNK); 1312 + set_return_thunk(its_return_thunk); 1313 + break; 1314 + case ITS_CMD_RSB_STUFF: 
1315 + its_mitigation = ITS_MITIGATION_RETPOLINE_STUFF; 1316 + setup_force_cpu_cap(X86_FEATURE_RETHUNK); 1317 + setup_force_cpu_cap(X86_FEATURE_CALL_DEPTH); 1318 + set_return_thunk(call_depth_return_thunk); 1319 + if (retbleed_mitigation == RETBLEED_MITIGATION_NONE) { 1320 + retbleed_mitigation = RETBLEED_MITIGATION_STUFF; 1321 + pr_info("Retbleed mitigation updated to stuffing\n"); 1322 + } 1323 + break; 1324 + } 1325 + out: 1326 + pr_info("%s\n", its_strings[its_mitigation]); 1198 1327 } 1199 1328 1200 1329 #undef pr_fmt ··· 1846 1697 return; 1847 1698 } 1848 1699 1849 - /* Mitigate in hardware if supported */ 1850 - if (spec_ctrl_bhi_dis()) 1700 + if (!IS_ENABLED(CONFIG_X86_64)) 1851 1701 return; 1852 1702 1853 - if (!IS_ENABLED(CONFIG_X86_64)) 1703 + /* Mitigate in hardware if supported */ 1704 + if (spec_ctrl_bhi_dis()) 1854 1705 return; 1855 1706 1856 1707 if (bhi_mitigation == BHI_MITIGATION_VMEXIT_ONLY) { ··· 2756 2607 2757 2608 if (boot_cpu_data.x86 == 0x19) { 2758 2609 setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS); 2759 - x86_return_thunk = srso_alias_return_thunk; 2610 + set_return_thunk(srso_alias_return_thunk); 2760 2611 } else { 2761 2612 setup_force_cpu_cap(X86_FEATURE_SRSO); 2762 - x86_return_thunk = srso_return_thunk; 2613 + set_return_thunk(srso_return_thunk); 2763 2614 } 2764 2615 if (has_microcode) 2765 2616 srso_mitigation = SRSO_MITIGATION_SAFE_RET; ··· 2949 2800 return sysfs_emit(buf, "%s\n", rfds_strings[rfds_mitigation]); 2950 2801 } 2951 2802 2803 + static ssize_t its_show_state(char *buf) 2804 + { 2805 + return sysfs_emit(buf, "%s\n", its_strings[its_mitigation]); 2806 + } 2807 + 2952 2808 static char *stibp_state(void) 2953 2809 { 2954 2810 if (spectre_v2_in_eibrs_mode(spectre_v2_enabled) && ··· 3136 2982 case X86_BUG_RFDS: 3137 2983 return rfds_show_state(buf); 3138 2984 2985 + case X86_BUG_ITS: 2986 + return its_show_state(buf); 2987 + 3139 2988 default: 3140 2989 break; 3141 2990 } ··· 3217 3060 ssize_t 
cpu_show_reg_file_data_sampling(struct device *dev, struct device_attribute *attr, char *buf) 3218 3061 { 3219 3062 return cpu_show_common(dev, attr, buf, X86_BUG_RFDS); 3063 + } 3064 + 3065 + ssize_t cpu_show_indirect_target_selection(struct device *dev, struct device_attribute *attr, char *buf) 3066 + { 3067 + return cpu_show_common(dev, attr, buf, X86_BUG_ITS); 3220 3068 } 3221 3069 #endif 3222 3070
+57 -15
arch/x86/kernel/cpu/common.c
··· 1227 1227 #define GDS BIT(6) 1228 1228 /* CPU is affected by Register File Data Sampling */ 1229 1229 #define RFDS BIT(7) 1230 + /* CPU is affected by Indirect Target Selection */ 1231 + #define ITS BIT(8) 1232 + /* CPU is affected by Indirect Target Selection, but guest-host isolation is not affected */ 1233 + #define ITS_NATIVE_ONLY BIT(9) 1230 1234 1231 1235 static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = { 1232 1236 VULNBL_INTEL_STEPS(INTEL_IVYBRIDGE, X86_STEP_MAX, SRBDS), ··· 1242 1238 VULNBL_INTEL_STEPS(INTEL_BROADWELL_G, X86_STEP_MAX, SRBDS), 1243 1239 VULNBL_INTEL_STEPS(INTEL_BROADWELL_X, X86_STEP_MAX, MMIO), 1244 1240 VULNBL_INTEL_STEPS(INTEL_BROADWELL, X86_STEP_MAX, SRBDS), 1245 - VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X, X86_STEP_MAX, MMIO | RETBLEED | GDS), 1241 + VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X, 0x5, MMIO | RETBLEED | GDS), 1242 + VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X, X86_STEP_MAX, MMIO | RETBLEED | GDS | ITS), 1246 1243 VULNBL_INTEL_STEPS(INTEL_SKYLAKE_L, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS), 1247 1244 VULNBL_INTEL_STEPS(INTEL_SKYLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS), 1248 - VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS), 1249 - VULNBL_INTEL_STEPS(INTEL_KABYLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS), 1245 + VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L, 0xb, MMIO | RETBLEED | GDS | SRBDS), 1246 + VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | ITS), 1247 + VULNBL_INTEL_STEPS(INTEL_KABYLAKE, 0xc, MMIO | RETBLEED | GDS | SRBDS), 1248 + VULNBL_INTEL_STEPS(INTEL_KABYLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | ITS), 1250 1249 VULNBL_INTEL_STEPS(INTEL_CANNONLAKE_L, X86_STEP_MAX, RETBLEED), 1251 - VULNBL_INTEL_STEPS(INTEL_ICELAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS), 1252 - VULNBL_INTEL_STEPS(INTEL_ICELAKE_D, X86_STEP_MAX, MMIO | GDS), 1253 - VULNBL_INTEL_STEPS(INTEL_ICELAKE_X, X86_STEP_MAX, MMIO | GDS), 1254 - 
VULNBL_INTEL_STEPS(INTEL_COMETLAKE, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS), 1255 - VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, 0x0, MMIO | RETBLEED), 1256 - VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS), 1257 - VULNBL_INTEL_STEPS(INTEL_TIGERLAKE_L, X86_STEP_MAX, GDS), 1258 - VULNBL_INTEL_STEPS(INTEL_TIGERLAKE, X86_STEP_MAX, GDS), 1250 + VULNBL_INTEL_STEPS(INTEL_ICELAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS | ITS_NATIVE_ONLY), 1251 + VULNBL_INTEL_STEPS(INTEL_ICELAKE_D, X86_STEP_MAX, MMIO | GDS | ITS | ITS_NATIVE_ONLY), 1252 + VULNBL_INTEL_STEPS(INTEL_ICELAKE_X, X86_STEP_MAX, MMIO | GDS | ITS | ITS_NATIVE_ONLY), 1253 + VULNBL_INTEL_STEPS(INTEL_COMETLAKE, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), 1254 + VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, 0x0, MMIO | RETBLEED | ITS), 1255 + VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), 1256 + VULNBL_INTEL_STEPS(INTEL_TIGERLAKE_L, X86_STEP_MAX, GDS | ITS | ITS_NATIVE_ONLY), 1257 + VULNBL_INTEL_STEPS(INTEL_TIGERLAKE, X86_STEP_MAX, GDS | ITS | ITS_NATIVE_ONLY), 1259 1258 VULNBL_INTEL_STEPS(INTEL_LAKEFIELD, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED), 1260 - VULNBL_INTEL_STEPS(INTEL_ROCKETLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS), 1259 + VULNBL_INTEL_STEPS(INTEL_ROCKETLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | ITS | ITS_NATIVE_ONLY), 1261 1260 VULNBL_INTEL_TYPE(INTEL_ALDERLAKE, ATOM, RFDS), 1262 1261 VULNBL_INTEL_STEPS(INTEL_ALDERLAKE_L, X86_STEP_MAX, RFDS), 1263 1262 VULNBL_INTEL_TYPE(INTEL_RAPTORLAKE, ATOM, RFDS), ··· 1323 1316 1324 1317 /* Only consult the blacklist when there is no enumeration: */ 1325 1318 return cpu_matches(cpu_vuln_blacklist, RFDS); 1319 + } 1320 + 1321 + static bool __init vulnerable_to_its(u64 x86_arch_cap_msr) 1322 + { 1323 + /* The "immunity" bit trumps everything else: */ 1324 + if (x86_arch_cap_msr & ARCH_CAP_ITS_NO) 1325 + return false; 1326 + if (boot_cpu_data.x86_vendor != 
X86_VENDOR_INTEL) 1327 + return false; 1328 + 1329 + /* None of the affected CPUs have BHI_CTRL */ 1330 + if (boot_cpu_has(X86_FEATURE_BHI_CTRL)) 1331 + return false; 1332 + 1333 + /* 1334 + * If a VMM did not expose ITS_NO, assume that a guest could 1335 + * be running on a vulnerable hardware or may migrate to such 1336 + * hardware. 1337 + */ 1338 + if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) 1339 + return true; 1340 + 1341 + if (cpu_matches(cpu_vuln_blacklist, ITS)) 1342 + return true; 1343 + 1344 + return false; 1326 1345 } 1327 1346 1328 1347 static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) ··· 1472 1439 if (vulnerable_to_rfds(x86_arch_cap_msr)) 1473 1440 setup_force_cpu_bug(X86_BUG_RFDS); 1474 1441 1475 - /* When virtualized, eIBRS could be hidden, assume vulnerable */ 1476 - if (!(x86_arch_cap_msr & ARCH_CAP_BHI_NO) && 1477 - !cpu_matches(cpu_vuln_whitelist, NO_BHI) && 1442 + /* 1443 + * Intel parts with eIBRS are vulnerable to BHI attacks. Parts with 1444 + * BHI_NO still need to use the BHI mitigation to prevent Intra-mode 1445 + * attacks. When virtualized, eIBRS could be hidden, assume vulnerable. 1446 + */ 1447 + if (!cpu_matches(cpu_vuln_whitelist, NO_BHI) && 1478 1448 (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED) || 1479 1449 boot_cpu_has(X86_FEATURE_HYPERVISOR))) 1480 1450 setup_force_cpu_bug(X86_BUG_BHI); 1481 1451 1482 1452 if (cpu_has(c, X86_FEATURE_AMD_IBPB) && !cpu_has(c, X86_FEATURE_AMD_IBPB_RET)) 1483 1453 setup_force_cpu_bug(X86_BUG_IBPB_NO_RET); 1454 + 1455 + if (vulnerable_to_its(x86_arch_cap_msr)) { 1456 + setup_force_cpu_bug(X86_BUG_ITS); 1457 + if (cpu_matches(cpu_vuln_blacklist, ITS_NATIVE_ONLY)) 1458 + setup_force_cpu_bug(X86_BUG_ITS_NATIVE_ONLY); 1459 + } 1484 1460 1485 1461 if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN)) 1486 1462 return;
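The ordering of the checks in vulnerable_to_its() matters: the ITS_NO "immunity" bit and the vendor/BHI_CTRL screens short-circuit before the pessimistic hypervisor assumption. A sketch of the same decision as a pure predicate (hypothetical name, booleans standing in for the MSR and feature tests):

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the check order of vulnerable_to_its(): immunity bit first,
 * then vendor and BHI_CTRL screens, then the pessimistic guest case,
 * and finally the model blacklist. */
static bool its_vulnerable(bool its_no, bool intel, bool bhi_ctrl,
			   bool hypervisor, bool blacklisted)
{
	if (its_no)
		return false;	/* ARCH_CAP_ITS_NO trumps everything */
	if (!intel)
		return false;
	if (bhi_ctrl)
		return false;	/* none of the affected CPUs have BHI_CTRL */
	if (hypervisor)
		return true;	/* VMM may hide ITS_NO; assume vulnerable */
	return blacklisted;
}
```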
+1 -1
arch/x86/kernel/ftrace.c
··· 354 354 goto fail; 355 355 356 356 ip = trampoline + size; 357 - if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) 357 + if (cpu_wants_rethunk_at(ip)) 358 358 __text_gen_insn(ip, JMP32_INSN_OPCODE, ip, x86_return_thunk, JMP32_INSN_SIZE); 359 359 else 360 360 memcpy(ip, retq, sizeof(retq));
+6
arch/x86/kernel/module.c
··· 266 266 ibt_endbr = s; 267 267 } 268 268 269 + its_init_mod(me); 270 + 269 271 if (retpolines || cfi) { 270 272 void *rseg = NULL, *cseg = NULL; 271 273 unsigned int rsize = 0, csize = 0; ··· 288 286 void *rseg = (void *)retpolines->sh_addr; 289 287 apply_retpolines(rseg, rseg + retpolines->sh_size); 290 288 } 289 + 290 + its_fini_mod(me); 291 + 291 292 if (returns) { 292 293 void *rseg = (void *)returns->sh_addr; 293 294 apply_returns(rseg, rseg + returns->sh_size); ··· 331 326 void module_arch_cleanup(struct module *mod) 332 327 { 333 328 alternatives_smp_module_del(mod); 329 + its_free_mod(mod); 334 330 }
+2 -2
arch/x86/kernel/static_call.c
··· 81 81 break; 82 82 83 83 case RET: 84 - if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) 84 + if (cpu_wants_rethunk_at(insn)) 85 85 code = text_gen_insn(JMP32_INSN_OPCODE, insn, x86_return_thunk); 86 86 else 87 87 code = &retinsn; ··· 90 90 case JCC: 91 91 if (!func) { 92 92 func = __static_call_return; 93 - if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) 93 + if (cpu_wants_rethunk()) 94 94 func = x86_return_thunk; 95 95 } 96 96
+10
arch/x86/kernel/vmlinux.lds.S
··· 505 505 "SRSO function pair won't alias"); 506 506 #endif 507 507 508 + #if defined(CONFIG_MITIGATION_ITS) && !defined(CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B) 509 + . = ASSERT(__x86_indirect_its_thunk_rax & 0x20, "__x86_indirect_its_thunk_rax not in second half of cacheline"); 510 + . = ASSERT(((__x86_indirect_its_thunk_rcx - __x86_indirect_its_thunk_rax) % 64) == 0, "Indirect thunks are not cacheline apart"); 511 + . = ASSERT(__x86_indirect_its_thunk_array == __x86_indirect_its_thunk_rax, "Gap in ITS thunk array"); 512 + #endif 513 + 514 + #if defined(CONFIG_MITIGATION_ITS) && !defined(CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B) 515 + . = ASSERT(its_return_thunk & 0x20, "its_return_thunk not in second half of cacheline"); 516 + #endif 517 + 508 518 #endif /* CONFIG_X86_64 */ 509 519 510 520 /*
+3 -1
arch/x86/kvm/x86.c
··· 1584 1584 ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \ 1585 1585 ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \ 1586 1586 ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | ARCH_CAP_GDS_NO | \ 1587 - ARCH_CAP_RFDS_NO | ARCH_CAP_RFDS_CLEAR | ARCH_CAP_BHI_NO) 1587 + ARCH_CAP_RFDS_NO | ARCH_CAP_RFDS_CLEAR | ARCH_CAP_BHI_NO | ARCH_CAP_ITS_NO) 1588 1588 1589 1589 static u64 kvm_get_arch_capabilities(void) 1590 1590 { ··· 1618 1618 data |= ARCH_CAP_MDS_NO; 1619 1619 if (!boot_cpu_has_bug(X86_BUG_RFDS)) 1620 1620 data |= ARCH_CAP_RFDS_NO; 1621 + if (!boot_cpu_has_bug(X86_BUG_ITS)) 1622 + data |= ARCH_CAP_ITS_NO; 1621 1623 1622 1624 if (!boot_cpu_has(X86_FEATURE_RTM)) { 1623 1625 /*
+48
arch/x86/lib/retpoline.S
··· 367 367 368 368 #endif /* CONFIG_MITIGATION_CALL_DEPTH_TRACKING */ 369 369 370 + #ifdef CONFIG_MITIGATION_ITS 371 + 372 + .macro ITS_THUNK reg 373 + 374 + /* 375 + * If CFI paranoid is used then the ITS thunk starts with opcodes (0xea; jne 1b) 376 + * that complete the fineibt_paranoid caller sequence. 377 + */ 378 + 1: .byte 0xea 379 + SYM_INNER_LABEL(__x86_indirect_paranoid_thunk_\reg, SYM_L_GLOBAL) 380 + UNWIND_HINT_UNDEFINED 381 + ANNOTATE_NOENDBR 382 + jne 1b 383 + SYM_INNER_LABEL(__x86_indirect_its_thunk_\reg, SYM_L_GLOBAL) 384 + UNWIND_HINT_UNDEFINED 385 + ANNOTATE_NOENDBR 386 + ANNOTATE_RETPOLINE_SAFE 387 + jmp *%\reg 388 + int3 389 + .align 32, 0xcc /* fill to the end of the line */ 390 + .skip 32 - (__x86_indirect_its_thunk_\reg - 1b), 0xcc /* skip to the next upper half */ 391 + .endm 392 + 393 + /* ITS mitigation requires thunks be aligned to upper half of cacheline */ 394 + .align 64, 0xcc 395 + .skip 29, 0xcc 396 + 397 + #define GEN(reg) ITS_THUNK reg 398 + #include <asm/GEN-for-each-reg.h> 399 + #undef GEN 400 + 401 + .align 64, 0xcc 402 + SYM_FUNC_ALIAS(__x86_indirect_its_thunk_array, __x86_indirect_its_thunk_rax) 403 + SYM_CODE_END(__x86_indirect_its_thunk_array) 404 + 405 + .align 64, 0xcc 406 + .skip 32, 0xcc 407 + SYM_CODE_START(its_return_thunk) 408 + UNWIND_HINT_FUNC 409 + ANNOTATE_NOENDBR 410 + ANNOTATE_UNRET_SAFE 411 + ret 412 + int3 413 + SYM_CODE_END(its_return_thunk) 414 + EXPORT_SYMBOL(its_return_thunk) 415 + 416 + #endif /* CONFIG_MITIGATION_ITS */ 417 + 370 418 /* 371 419 * This function name is magical and is used by -mfunction-return=thunk-extern 372 420 * for the compiler to generate JMPs to it.
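The thunk body generated per register by the ITS_THUNK macro is the same "jmp *reg; int3" sequence that its_init_thunk() writes at runtime. A byte-level sketch (hypothetical helper; the optional FineIBT prefix bytes are left out) makes the encoding concrete:

```c
#include <assert.h>
#include <stdint.h>

/* Emit "jmp *reg; int3" as its_init_thunk() does: registers r8..r15
 * need a REX.B prefix, so the body is 3 or 4 bytes. Returns length. */
static int encode_its_thunk(uint8_t *bytes, int reg)
{
	int i = 0;

	if (reg >= 8) {
		bytes[i++] = 0x41;	/* REX.B selects r8..r15 */
		reg -= 8;
	}
	bytes[i++] = 0xff;		/* opcode group 5 */
	bytes[i++] = 0xe0 + reg;	/* ModRM /4: jmp *reg */
	bytes[i++] = 0xcc;		/* int3 stops straight-line speculation */
	return i;
}
```

This is also why its_allocate_thunk() computes `size = 3 + (reg / 8)`: only the REX.B prefix varies with the register number.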
+4 -1
arch/x86/mm/init_32.c
··· 30 30 #include <linux/initrd.h> 31 31 #include <linux/cpumask.h> 32 32 #include <linux/gfp.h> 33 + #include <linux/execmem.h> 33 34 34 35 #include <asm/asm.h> 35 36 #include <asm/bios_ebda.h> ··· 566 565 "only %luMB highmem pages available, ignoring highmem size of %luMB!\n" 567 566 568 567 #define MSG_HIGHMEM_TRIMMED \ 569 - "Warning: only 4GB will be used. Support for for CONFIG_HIGHMEM64G was removed!\n" 568 + "Warning: only 4GB will be used. Support for CONFIG_HIGHMEM64G was removed!\n" 570 569 /* 571 570 * We have more RAM than fits into lowmem - we try to put it into 572 571 * highmem, also taking the highmem=x boot parameter into account: ··· 755 754 set_pages_ro(virt_to_page(start), size >> PAGE_SHIFT); 756 755 pr_info("Write protecting kernel text and read-only data: %luk\n", 757 756 size >> 10); 757 + 758 + execmem_cache_make_ro(); 758 759 759 760 kernel_set_to_readonly = 1; 760 761
+3
arch/x86/mm/init_64.c
··· 34 34 #include <linux/gfp.h> 35 35 #include <linux/kcore.h> 36 36 #include <linux/bootmem_info.h> 37 + #include <linux/execmem.h> 37 38 38 39 #include <asm/processor.h> 39 40 #include <asm/bios_ebda.h> ··· 1391 1390 printk(KERN_INFO "Write protecting the kernel read-only data: %luk\n", 1392 1391 (end - start) >> 10); 1393 1392 set_memory_ro(start, (end - start) >> PAGE_SHIFT); 1393 + 1394 + execmem_cache_make_ro(); 1394 1395 1395 1396 kernel_set_to_readonly = 1; 1396 1397
+56 -2
arch/x86/net/bpf_jit_comp.c
··· 41 41 #define EMIT2(b1, b2) EMIT((b1) + ((b2) << 8), 2) 42 42 #define EMIT3(b1, b2, b3) EMIT((b1) + ((b2) << 8) + ((b3) << 16), 3) 43 43 #define EMIT4(b1, b2, b3, b4) EMIT((b1) + ((b2) << 8) + ((b3) << 16) + ((b4) << 24), 4) 44 + #define EMIT5(b1, b2, b3, b4, b5) \ 45 + do { EMIT1(b1); EMIT4(b2, b3, b4, b5); } while (0) 44 46 45 47 #define EMIT1_off32(b1, off) \ 46 48 do { EMIT1(b1); EMIT(off, 4); } while (0) ··· 663 661 { 664 662 u8 *prog = *pprog; 665 663 666 - if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) { 664 + if (cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) { 665 + OPTIMIZER_HIDE_VAR(reg); 666 + emit_jump(&prog, its_static_thunk(reg), ip); 667 + } else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) { 667 668 EMIT_LFENCE(); 668 669 EMIT2(0xFF, 0xE0 + reg); 669 670 } else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE)) { ··· 688 683 { 689 684 u8 *prog = *pprog; 690 685 691 - if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) { 686 + if (cpu_wants_rethunk()) { 692 687 emit_jump(&prog, x86_return_thunk, ip); 693 688 } else { 694 689 EMIT1(0xC3); /* ret */ ··· 1506 1501 /* Memory size/value to protect private stack overflow/underflow */ 1507 1502 #define PRIV_STACK_GUARD_SZ 8 1508 1503 #define PRIV_STACK_GUARD_VAL 0xEB9F12345678eb9fULL 1504 + 1505 + static int emit_spectre_bhb_barrier(u8 **pprog, u8 *ip, 1506 + struct bpf_prog *bpf_prog) 1507 + { 1508 + u8 *prog = *pprog; 1509 + u8 *func; 1510 + 1511 + if (cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_LOOP)) { 1512 + /* The clearing sequence clobbers eax and ecx. */ 1513 + EMIT1(0x50); /* push rax */ 1514 + EMIT1(0x51); /* push rcx */ 1515 + ip += 2; 1516 + 1517 + func = (u8 *)clear_bhb_loop; 1518 + ip += x86_call_depth_emit_accounting(&prog, func, ip); 1519 + 1520 + if (emit_call(&prog, func, ip)) 1521 + return -EINVAL; 1522 + EMIT1(0x59); /* pop rcx */ 1523 + EMIT1(0x58); /* pop rax */ 1524 + } 1525 + /* Insert IBHF instruction */ 1526 + if ((cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_LOOP) && 1527 + cpu_feature_enabled(X86_FEATURE_HYPERVISOR)) || 1528 + cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_HW)) { 1529 + /* 1530 + * Add an Indirect Branch History Fence (IBHF). IBHF acts as a 1531 + * fence preventing branch history from before the fence from 1532 + * affecting indirect branches after the fence. This is 1533 + * specifically used in cBPF jitted code to prevent Intra-mode 1534 + * BHI attacks. The IBHF instruction is designed to be a NOP on 1535 + * hardware that doesn't need or support it. The REP and REX.W 1536 + * prefixes are required by the microcode, and they also ensure 1537 + * that the NOP is unlikely to be used in existing code. 1538 + * 1539 + * IBHF is not a valid instruction in 32-bit mode. 1540 + */ 1541 + EMIT5(0xF3, 0x48, 0x0F, 0x1E, 0xF8); /* ibhf */ 1542 + } 1543 + *pprog = prog; 1544 + return 0; 1545 + } 1509 1546 1510 1547 static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image, 1511 1548 int oldproglen, struct jit_context *ctx, bool jmp_padding) ··· 2591 2544 seen_exit = true; 2592 2545 /* Update cleanup_addr */ 2593 2546 ctx->cleanup_addr = proglen; 2547 + if (bpf_prog_was_classic(bpf_prog) && 2548 + !capable(CAP_SYS_ADMIN)) { 2549 + u8 *ip = image + addrs[i - 1]; 2550 + 2551 + if (emit_spectre_bhb_barrier(&prog, ip, bpf_prog)) 2552 + return -EINVAL; 2553 + } 2594 2554 if (bpf_prog->aux->exception_boundary) { 2595 2555 pop_callee_regs(&prog, all_callee_regs_used); 2596 2556 pop_r12(&prog);
+47 -15
block/bio-integrity-auto.c
··· 9 9 * not aware of PI. 10 10 */ 11 11 #include <linux/blk-integrity.h> 12 + #include <linux/t10-pi.h> 12 13 #include <linux/workqueue.h> 13 14 #include "blk.h" 14 15 ··· 44 43 bio_endio(bio); 45 44 } 46 45 46 + #define BIP_CHECK_FLAGS (BIP_CHECK_GUARD | BIP_CHECK_REFTAG | BIP_CHECK_APPTAG) 47 + static bool bip_should_check(struct bio_integrity_payload *bip) 48 + { 49 + return bip->bip_flags & BIP_CHECK_FLAGS; 50 + } 51 + 52 + static bool bi_offload_capable(struct blk_integrity *bi) 53 + { 54 + switch (bi->csum_type) { 55 + case BLK_INTEGRITY_CSUM_CRC64: 56 + return bi->tuple_size == sizeof(struct crc64_pi_tuple); 57 + case BLK_INTEGRITY_CSUM_CRC: 58 + case BLK_INTEGRITY_CSUM_IP: 59 + return bi->tuple_size == sizeof(struct t10_pi_tuple); 60 + default: 61 + pr_warn_once("%s: unknown integrity checksum type:%d\n", 62 + __func__, bi->csum_type); 63 + fallthrough; 64 + case BLK_INTEGRITY_CSUM_NONE: 65 + return false; 66 + } 67 + } 68 + 47 69 /** 48 70 * __bio_integrity_endio - Integrity I/O completion function 49 71 * @bio: Protected bio ··· 78 54 */ 79 55 bool __bio_integrity_endio(struct bio *bio) 80 56 { 81 - struct blk_integrity *bi = blk_get_integrity(bio->bi_bdev->bd_disk); 82 57 struct bio_integrity_payload *bip = bio_integrity(bio); 83 58 struct bio_integrity_data *bid = 84 59 container_of(bip, struct bio_integrity_data, bip); 85 60 86 - if (bio_op(bio) == REQ_OP_READ && !bio->bi_status && bi->csum_type) { 61 + if (bio_op(bio) == REQ_OP_READ && !bio->bi_status && 62 + bip_should_check(bip)) { 87 63 INIT_WORK(&bid->work, bio_integrity_verify_fn); 88 64 queue_work(kintegrityd_wq, &bid->work); 89 65 return false; ··· 108 84 { 109 85 struct blk_integrity *bi = blk_get_integrity(bio->bi_bdev->bd_disk); 110 86 struct bio_integrity_data *bid; 87 + bool set_flags = true; 111 88 gfp_t gfp = GFP_NOIO; 112 89 unsigned int len; 113 90 void *buf; ··· 125 100 126 101 switch (bio_op(bio)) { 127 102 case REQ_OP_READ: 128 - if (bi->flags & BLK_INTEGRITY_NOVERIFY) 129 - return true; 103 + if (bi->flags & BLK_INTEGRITY_NOVERIFY) { 104 + if (bi_offload_capable(bi)) 105 + return true; 106 + set_flags = false; 107 + } 130 108 break; 131 109 case REQ_OP_WRITE: 132 - if (bi->flags & BLK_INTEGRITY_NOGENERATE) 133 - return true; 134 - 135 110 /* 136 111 * Zero the memory allocated to not leak uninitialized kernel 137 112 * memory to disk for non-integrity metadata where nothing else 138 113 * initializes the memory. 139 114 */ 140 - if (bi->csum_type == BLK_INTEGRITY_CSUM_NONE) 115 + if (bi->flags & BLK_INTEGRITY_NOGENERATE) { 116 + if (bi_offload_capable(bi)) 117 + return true; 118 + set_flags = false; 119 + gfp |= __GFP_ZERO; 120 + } else if (bi->csum_type == BLK_INTEGRITY_CSUM_NONE) 141 121 gfp |= __GFP_ZERO; 142 122 break; 143 123 default: ··· 167 137 bid->bip.bip_flags |= BIP_BLOCK_INTEGRITY; 168 138 bip_set_seed(&bid->bip, bio->bi_iter.bi_sector); 169 139 170 - if (bi->csum_type == BLK_INTEGRITY_CSUM_IP) 171 - bid->bip.bip_flags |= BIP_IP_CHECKSUM; 172 - if (bi->csum_type) 173 - bid->bip.bip_flags |= BIP_CHECK_GUARD; 174 - if (bi->flags & BLK_INTEGRITY_REF_TAG) 175 - bid->bip.bip_flags |= BIP_CHECK_REFTAG; 140 + if (set_flags) { 141 + if (bi->csum_type == BLK_INTEGRITY_CSUM_IP) 142 + bid->bip.bip_flags |= BIP_IP_CHECKSUM; 143 + if (bi->csum_type) 144 + bid->bip.bip_flags |= BIP_CHECK_GUARD; 145 + if (bi->flags & BLK_INTEGRITY_REF_TAG) 146 + bid->bip.bip_flags |= BIP_CHECK_REFTAG; 147 + } 176 148 177 149 if (bio_integrity_add_page(bio, virt_to_page(buf), len, 178 150 offset_in_page(buf)) < len) 179 151 goto err_end_io; 180 152 181 153 /* Auto-generate integrity metadata if this is a write */ 182 - if (bio_data_dir(bio) == WRITE) 154 + if (bio_data_dir(bio) == WRITE && bip_should_check(&bid->bip)) 183 155 blk_integrity_generate(bio); 184 156 else 185 157 bid->saved_bio_iter = bio->bi_iter;
+1 -1
block/bio.c
··· 611 611 { 612 612 struct bio *bio; 613 613 614 - if (nr_vecs > UIO_MAXIOV) 614 + if (nr_vecs > BIO_MAX_INLINE_VECS) 615 615 return NULL; 616 616 return kmalloc(struct_size(bio, bi_inline_vecs, nr_vecs), gfp_mask); 617 617 }
+1 -1
drivers/accel/ivpu/ivpu_debugfs.c
··· 455 455 if (ret < 0) 456 456 return ret; 457 457 458 - buf[size] = '\0'; 458 + buf[ret] = '\0'; 459 459 ret = sscanf(buf, "%u %u %u %u", &band, &grace_period, &process_grace_period, 460 460 &process_quantum); 461 461 if (ret != 4)
+8 -3
drivers/acpi/pptt.c
··· 231 231 sizeof(struct acpi_table_pptt)); 232 232 proc_sz = sizeof(struct acpi_pptt_processor); 233 233 234 - while ((unsigned long)entry + proc_sz < table_end) { 234 + /* ignore subtable types that are smaller than a processor node */ 235 + while ((unsigned long)entry + proc_sz <= table_end) { 235 236 cpu_node = (struct acpi_pptt_processor *)entry; 237 + 236 238 if (entry->type == ACPI_PPTT_TYPE_PROCESSOR && 237 239 cpu_node->parent == node_entry) 238 240 return 0; 239 241 if (entry->length == 0) 240 242 return 0; 243 + 241 244 entry = ACPI_ADD_PTR(struct acpi_subtable_header, entry, 242 245 entry->length); 243 - 244 246 } 245 247 return 1; 246 248 } ··· 275 273 proc_sz = sizeof(struct acpi_pptt_processor); 276 274 277 275 /* find the processor structure associated with this cpuid */ 278 - while ((unsigned long)entry + proc_sz < table_end) { 276 + while ((unsigned long)entry + proc_sz <= table_end) { 279 277 cpu_node = (struct acpi_pptt_processor *)entry; 280 278 281 279 if (entry->length == 0) { 282 280 pr_warn("Invalid zero length subtable\n"); 283 281 break; 284 282 } 283 + /* entry->length may not equal proc_sz, revalidate the processor structure length */ 285 284 if (entry->type == ACPI_PPTT_TYPE_PROCESSOR && 286 285 acpi_cpu_id == cpu_node->acpi_processor_id && 286 + (unsigned long)entry + entry->length <= table_end && 287 + entry->length == proc_sz + cpu_node->number_of_priv_resources * sizeof(u32) && 287 288 acpi_pptt_leaf_node(table_hdr, cpu_node)) { 288 289 return (struct acpi_pptt_processor *)entry; 289 290 }
+3
drivers/base/cpu.c
··· 600 600 CPU_SHOW_VULN_FALLBACK(gds); 601 601 CPU_SHOW_VULN_FALLBACK(reg_file_data_sampling); 602 602 CPU_SHOW_VULN_FALLBACK(ghostwrite); 603 + CPU_SHOW_VULN_FALLBACK(indirect_target_selection); 603 604 604 605 static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL); 605 606 static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL); ··· 617 616 static DEVICE_ATTR(gather_data_sampling, 0444, cpu_show_gds, NULL); 618 617 static DEVICE_ATTR(reg_file_data_sampling, 0444, cpu_show_reg_file_data_sampling, NULL); 619 618 static DEVICE_ATTR(ghostwrite, 0444, cpu_show_ghostwrite, NULL); 619 + static DEVICE_ATTR(indirect_target_selection, 0444, cpu_show_indirect_target_selection, NULL); 620 620 621 621 static struct attribute *cpu_root_vulnerabilities_attrs[] = { 622 622 &dev_attr_meltdown.attr, ··· 635 633 &dev_attr_gather_data_sampling.attr, 636 634 &dev_attr_reg_file_data_sampling.attr, 637 635 &dev_attr_ghostwrite.attr, 636 + &dev_attr_indirect_target_selection.attr, 638 637 NULL 639 638 }; 640 639
+1 -1
drivers/block/ublk_drv.c
··· 1708 1708 * that ublk_dispatch_req() is always called 1709 1709 */ 1710 1710 req = blk_mq_tag_to_rq(ub->tag_set.tags[ubq->q_id], tag); 1711 - if (req && blk_mq_request_started(req)) 1711 + if (req && blk_mq_request_started(req) && req->tag == tag) 1712 1712 return; 1713 1713 1714 1714 spin_lock(&ubq->cancel_lock);
+3 -3
drivers/char/tpm/tpm-buf.c
··· 201 201 */ 202 202 u8 tpm_buf_read_u8(struct tpm_buf *buf, off_t *offset) 203 203 { 204 - u8 value; 204 + u8 value = 0; 205 205 206 206 tpm_buf_read(buf, offset, sizeof(value), &value); 207 207 ··· 218 218 */ 219 219 u16 tpm_buf_read_u16(struct tpm_buf *buf, off_t *offset) 220 220 { 221 - u16 value; 221 + u16 value = 0; 222 222 223 223 tpm_buf_read(buf, offset, sizeof(value), &value); 224 224 ··· 235 235 */ 236 236 u32 tpm_buf_read_u32(struct tpm_buf *buf, off_t *offset) 237 237 { 238 - u32 value; 238 + u32 value = 0; 239 239 240 240 tpm_buf_read(buf, offset, sizeof(value), &value); 241 241
+6 -14
drivers/char/tpm/tpm2-sessions.c
··· 40 40 * 41 41 * These are the usage functions: 42 42 * 43 - * tpm2_start_auth_session() which allocates the opaque auth structure 44 - * and gets a session from the TPM. This must be called before 45 - * any of the following functions. The session is protected by a 46 - * session_key which is derived from a random salt value 47 - * encrypted to the NULL seed. 48 43 * tpm2_end_auth_session() kills the session and frees the resources. 49 44 * Under normal operation this function is done by 50 45 * tpm_buf_check_hmac_response(), so this is only to be used on ··· 958 963 } 959 964 960 965 /** 961 - * tpm2_start_auth_session() - create a HMAC authentication session with the TPM 962 - * @chip: the TPM chip structure to create the session with 966 + * tpm2_start_auth_session() - Create an a HMAC authentication session 967 + * @chip: A TPM chip 963 968 * 964 - * This function loads the NULL seed from its saved context and starts 965 - * an authentication session on the null seed, fills in the 966 - * @chip->auth structure to contain all the session details necessary 967 - * for performing the HMAC, encrypt and decrypt operations and 968 - * returns. The NULL seed is flushed before this function returns. 969 + * Loads the ephemeral key (null seed), and starts an HMAC authenticated 970 + * session. The null seed is flushed before the return. 969 971 * 970 - * Return: zero on success or actual error encountered. 972 + * Returns zero on success, or a POSIX error code. 971 973 */ 972 974 int tpm2_start_auth_session(struct tpm_chip *chip) 973 975 { ··· 1016 1024 /* hash algorithm for session */ 1017 1025 tpm_buf_append_u16(&buf, TPM_ALG_SHA256); 1018 1026 1019 - rc = tpm_transmit_cmd(chip, &buf, 0, "start auth session"); 1027 + rc = tpm_ret_to_err(tpm_transmit_cmd(chip, &buf, 0, "StartAuthSession")); 1020 1028 tpm2_flush_context(chip, null_key); 1021 1029 1022 1030 if (rc == TPM2_RC_SUCCESS)
+1 -1
drivers/char/tpm/tpm_tis_core.h
··· 54 54 enum tis_defaults { 55 55 TIS_MEM_LEN = 0x5000, 56 56 TIS_SHORT_TIMEOUT = 750, /* ms */ 57 - TIS_LONG_TIMEOUT = 2000, /* 2 sec */ 57 + TIS_LONG_TIMEOUT = 4000, /* 4 secs */ 58 58 TIS_TIMEOUT_MIN_ATML = 14700, /* usecs */ 59 59 TIS_TIMEOUT_MAX_ATML = 15000, /* usecs */ 60 60 };
+3 -2
drivers/dma-buf/dma-resv.c
··· 320 320 count++; 321 321 322 322 dma_resv_list_set(fobj, i, fence, usage); 323 - /* pointer update must be visible before we extend the num_fences */ 324 - smp_store_mb(fobj->num_fences, count); 323 + /* fence update must be visible before we extend the num_fences */ 324 + smp_wmb(); 325 + fobj->num_fences = count; 325 326 } 326 327 EXPORT_SYMBOL(dma_resv_add_fence); 327 328
+10 -9
drivers/dma/amd/ptdma/ptdma-dmaengine.c
··· 342 342 struct pt_dma_chan *chan; 343 343 unsigned long flags; 344 344 345 + if (!desc) 346 + return; 347 + 345 348 dma_chan = desc->vd.tx.chan; 346 349 chan = to_pt_chan(dma_chan); 347 350 ··· 358 355 desc->status = DMA_ERROR; 359 356 360 357 spin_lock_irqsave(&chan->vc.lock, flags); 361 - if (desc) { 362 - if (desc->status != DMA_COMPLETE) { 363 - if (desc->status != DMA_ERROR) 364 - desc->status = DMA_COMPLETE; 358 + if (desc->status != DMA_COMPLETE) { 359 + if (desc->status != DMA_ERROR) 360 + desc->status = DMA_COMPLETE; 365 361 366 - dma_cookie_complete(tx_desc); 367 - dma_descriptor_unmap(tx_desc); 368 - } else { 369 - tx_desc = NULL; 370 - } 362 + dma_cookie_complete(tx_desc); 363 + dma_descriptor_unmap(tx_desc); 364 + } else { 365 + tx_desc = NULL; 371 366 } 372 367 spin_unlock_irqrestore(&chan->vc.lock, flags); 373 368
+3 -3
drivers/dma/dmatest.c
··· 841 841 } else { 842 842 dma_async_issue_pending(chan); 843 843 844 - wait_event_timeout(thread->done_wait, 845 - done->done, 846 - msecs_to_jiffies(params->timeout)); 844 + wait_event_freezable_timeout(thread->done_wait, 845 + done->done, 846 + msecs_to_jiffies(params->timeout)); 847 847 848 848 status = dma_async_is_tx_complete(chan, cookie, NULL, 849 849 NULL);
+1 -1
drivers/dma/fsl-edma-main.c
··· 57 57 58 58 intr = edma_readl_chreg(fsl_chan, ch_int); 59 59 if (!intr) 60 - return IRQ_HANDLED; 60 + return IRQ_NONE; 61 61 62 62 edma_writel_chreg(fsl_chan, 1, ch_int); 63 63
+11 -2
drivers/dma/idxd/cdev.c
··· 222 222 struct idxd_wq *wq; 223 223 struct device *dev, *fdev; 224 224 int rc = 0; 225 - struct iommu_sva *sva; 225 + struct iommu_sva *sva = NULL; 226 226 unsigned int pasid; 227 227 struct idxd_cdev *idxd_cdev; 228 228 ··· 317 317 if (device_user_pasid_enabled(idxd)) 318 318 idxd_xa_pasid_remove(ctx); 319 319 failed_get_pasid: 320 - if (device_user_pasid_enabled(idxd)) 320 + if (device_user_pasid_enabled(idxd) && !IS_ERR_OR_NULL(sva)) 321 321 iommu_sva_unbind_device(sva); 322 322 failed: 323 323 mutex_unlock(&wq->wq_lock); ··· 407 407 if (!idxd->user_submission_safe && !capable(CAP_SYS_RAWIO)) 408 408 return -EPERM; 409 409 410 + if (current->mm != ctx->mm) 411 + return -EPERM; 412 + 410 413 rc = check_vma(wq, vma, __func__); 411 414 if (rc < 0) 412 415 return rc; ··· 476 473 ssize_t written = 0; 477 474 int i; 478 475 476 + if (current->mm != ctx->mm) 477 + return -EPERM; 478 + 479 479 for (i = 0; i < len/sizeof(struct dsa_hw_desc); i++) { 480 480 int rc = idxd_submit_user_descriptor(ctx, udesc + i); 481 481 ··· 498 492 struct idxd_wq *wq = ctx->wq; 499 493 struct idxd_device *idxd = wq->idxd; 500 494 __poll_t out = 0; 495 + 496 + if (current->mm != ctx->mm) 497 + return POLLNVAL; 501 498 502 499 poll_wait(filp, &wq->err_queue, wait); 503 500 spin_lock(&idxd->dev_lock);
+113 -46
drivers/dma/idxd/init.c
··· 155 155 pci_free_irq_vectors(pdev); 156 156 } 157 157 158 + static void idxd_clean_wqs(struct idxd_device *idxd) 159 + { 160 + struct idxd_wq *wq; 161 + struct device *conf_dev; 162 + int i; 163 + 164 + for (i = 0; i < idxd->max_wqs; i++) { 165 + wq = idxd->wqs[i]; 166 + if (idxd->hw.wq_cap.op_config) 167 + bitmap_free(wq->opcap_bmap); 168 + kfree(wq->wqcfg); 169 + conf_dev = wq_confdev(wq); 170 + put_device(conf_dev); 171 + kfree(wq); 172 + } 173 + bitmap_free(idxd->wq_enable_map); 174 + kfree(idxd->wqs); 175 + } 176 + 158 177 static int idxd_setup_wqs(struct idxd_device *idxd) 159 178 { 160 179 struct device *dev = &idxd->pdev->dev; ··· 188 169 189 170 idxd->wq_enable_map = bitmap_zalloc_node(idxd->max_wqs, GFP_KERNEL, dev_to_node(dev)); 190 171 if (!idxd->wq_enable_map) { 191 - kfree(idxd->wqs); 192 - return -ENOMEM; 172 + rc = -ENOMEM; 173 + goto err_bitmap; 193 174 } 194 175 195 176 for (i = 0; i < idxd->max_wqs; i++) { ··· 208 189 conf_dev->bus = &dsa_bus_type; 209 190 conf_dev->type = &idxd_wq_device_type; 210 191 rc = dev_set_name(conf_dev, "wq%d.%d", idxd->id, wq->id); 211 - if (rc < 0) { 212 - put_device(conf_dev); 192 + if (rc < 0) 213 193 goto err; 214 - } 215 194 216 195 mutex_init(&wq->wq_lock); 217 196 init_waitqueue_head(&wq->err_queue); ··· 220 203 wq->enqcmds_retries = IDXD_ENQCMDS_RETRIES; 221 204 wq->wqcfg = kzalloc_node(idxd->wqcfg_size, GFP_KERNEL, dev_to_node(dev)); 222 205 if (!wq->wqcfg) { 223 - put_device(conf_dev); 224 206 rc = -ENOMEM; 225 207 goto err; 226 208 } ··· 227 211 if (idxd->hw.wq_cap.op_config) { 228 212 wq->opcap_bmap = bitmap_zalloc(IDXD_MAX_OPCAP_BITS, GFP_KERNEL); 229 213 if (!wq->opcap_bmap) { 230 - put_device(conf_dev); 231 214 rc = -ENOMEM; 232 - goto err; 215 + goto err_opcap_bmap; 233 216 } 234 217 bitmap_copy(wq->opcap_bmap, idxd->opcap_bmap, IDXD_MAX_OPCAP_BITS); 235 218 } ··· 239 224 240 225 return 0; 241 226 242 - err: 227 + err_opcap_bmap: 228 + kfree(wq->wqcfg); 229 + 230 + err: 231 + put_device(conf_dev); 232 + kfree(wq); 233 + 243 234 while (--i >= 0) { 244 235 wq = idxd->wqs[i]; 236 + if (idxd->hw.wq_cap.op_config) 237 + bitmap_free(wq->opcap_bmap); 238 + kfree(wq->wqcfg); 245 239 conf_dev = wq_confdev(wq); 246 240 put_device(conf_dev); 241 + kfree(wq); 242 + 247 243 } 244 + bitmap_free(idxd->wq_enable_map); 245 + 246 + err_bitmap: 247 + kfree(idxd->wqs); 248 + 248 249 return rc; 250 + } 251 + 252 + static void idxd_clean_engines(struct idxd_device *idxd) 253 + { 254 + struct idxd_engine *engine; 255 + struct device *conf_dev; 256 + int i; 257 + 258 + for (i = 0; i < idxd->max_engines; i++) { 259 + engine = idxd->engines[i]; 260 + conf_dev = engine_confdev(engine); 261 + put_device(conf_dev); 262 + kfree(engine); 263 + } 264 + kfree(idxd->engines); 249 265 } 250 266 251 267 static int idxd_setup_engines(struct idxd_device *idxd) ··· 309 263 rc = dev_set_name(conf_dev, "engine%d.%d", idxd->id, engine->id); 310 264 if (rc < 0) { 311 265 put_device(conf_dev); 266 + kfree(engine); 312 267 goto err; 313 268 } 314 269 ··· 323 276 engine = idxd->engines[i]; 324 277 conf_dev = engine_confdev(engine); 325 278 put_device(conf_dev); 279 + kfree(engine); 326 280 } 281 + kfree(idxd->engines); 282 + 327 283 return rc; 284 + } 285 + 286 + static void idxd_clean_groups(struct idxd_device *idxd) 287 + { 288 + struct idxd_group *group; 289 + int i; 290 + 291 + for (i = 0; i < idxd->max_groups; i++) { 292 + group = idxd->groups[i]; 293 + put_device(group_confdev(group)); 294 + kfree(group); 295 + } 296 + kfree(idxd->groups); 328 297 } 329 298 330 299 static int idxd_setup_groups(struct idxd_device *idxd) ··· 373 310 rc = dev_set_name(conf_dev, "group%d.%d", idxd->id, group->id); 374 311 if (rc < 0) { 375 312 put_device(conf_dev); 313 + kfree(group); 376 314 goto err; 377 315 } 378 316 ··· 398 334 while (--i >= 0) { 399 335 group = idxd->groups[i]; 400 336 put_device(group_confdev(group)); 337 + kfree(group); 401 338 } 339 + kfree(idxd->groups); 340 + 402 341 return rc; 403 342 } 404 343 405 344 static void idxd_cleanup_internals(struct idxd_device *idxd) 406 345 { 407 - int i; 408 - 409 - for (i = 0; i < idxd->max_groups; i++) 410 - put_device(group_confdev(idxd->groups[i])); 411 - for (i = 0; i < idxd->max_engines; i++) 412 - put_device(engine_confdev(idxd->engines[i])); 413 - for (i = 0; i < idxd->max_wqs; i++) 414 - put_device(wq_confdev(idxd->wqs[i])); 346 + idxd_clean_groups(idxd); 347 + idxd_clean_engines(idxd); 348 + idxd_clean_wqs(idxd); 415 349 destroy_workqueue(idxd->wq); 416 350 } 417 351 ··· 452 390 static int idxd_setup_internals(struct idxd_device *idxd) 453 391 { 454 392 struct device *dev = &idxd->pdev->dev; 455 - int rc, i; 393 + int rc; 456 394 457 395 init_waitqueue_head(&idxd->cmd_waitq); 458 396 ··· 483 421 err_evl: 484 422 destroy_workqueue(idxd->wq); 485 423 err_wkq_create: 486 - for (i = 0; i < idxd->max_groups; i++) 487 - put_device(group_confdev(idxd->groups[i])); 424 + idxd_clean_groups(idxd); 488 425 err_group: 489 - for (i = 0; i < idxd->max_engines; i++) 490 - put_device(engine_confdev(idxd->engines[i])); 426 + idxd_clean_engines(idxd); 491 427 err_engine: 492 - for (i = 0; i < idxd->max_wqs; i++) 493 - put_device(wq_confdev(idxd->wqs[i])); 428 + idxd_clean_wqs(idxd); 494 429 err_wqs: 495 430 return rc; 496 431 } ··· 587 528 idxd->hw.iaa_cap.bits = ioread64(idxd->reg_base + IDXD_IAACAP_OFFSET); 588 529 } 589 530 531 + static void idxd_free(struct idxd_device *idxd) 532 + { 533 + if (!idxd) 534 + return; 535 + 536 + put_device(idxd_confdev(idxd)); 537 + bitmap_free(idxd->opcap_bmap); 538 + ida_free(&idxd_ida, idxd->id); 539 + kfree(idxd); 540 + } 541 + 590 542 static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_data *data) 591 543 { 592 544 struct device *dev = &pdev->dev; ··· 615 545 idxd_dev_set_type(&idxd->idxd_dev, idxd->data->type); 616 546 idxd->id = ida_alloc(&idxd_ida, GFP_KERNEL); 617 547 if (idxd->id < 0) 618 - return NULL; 548 + goto err_ida; 619 549 620 550 idxd->opcap_bmap = bitmap_zalloc_node(IDXD_MAX_OPCAP_BITS, GFP_KERNEL, dev_to_node(dev)); 621 - if (!idxd->opcap_bmap) { 622 - ida_free(&idxd_ida, idxd->id); 623 - return NULL; 624 - } 551 + if (!idxd->opcap_bmap) 552 + goto err_opcap; 625 553 626 554 device_initialize(conf_dev); 627 555 conf_dev->parent = dev; 628 556 conf_dev->bus = &dsa_bus_type; 629 557 conf_dev->type = idxd->data->dev_type; 630 558 rc = dev_set_name(conf_dev, "%s%d", idxd->data->name_prefix, idxd->id); 631 - if (rc < 0) { 632 - put_device(conf_dev); 633 - return NULL; 634 - } 559 + if (rc < 0) 560 + goto err_name; 635 561 636 562 spin_lock_init(&idxd->dev_lock); 637 563 spin_lock_init(&idxd->cmd_lock); 638 564 639 565 return idxd; 566 + 567 + err_name: 568 + put_device(conf_dev); 569 + bitmap_free(idxd->opcap_bmap); 570 + err_opcap: 571 + ida_free(&idxd_ida, idxd->id); 572 + err_ida: 573 + kfree(idxd); 574 + 575 + return NULL; 640 576 } 641 577 642 578 static int idxd_enable_system_pasid(struct idxd_device *idxd) ··· 1266 1190 err: 1267 1191 pci_iounmap(pdev, idxd->reg_base); 1268 1192 err_iomap: 1269 - put_device(idxd_confdev(idxd)); 1193 + idxd_free(idxd); 1270 1194 err_idxd_alloc: 1271 1195 pci_disable_device(pdev); 1272 1196 return rc; ··· 1308 1232 static void idxd_remove(struct pci_dev *pdev) 1309 1233 { 1310 1234 struct idxd_device *idxd = pci_get_drvdata(pdev); 1311 - struct idxd_irq_entry *irq_entry; 1312 1235 1313 1236 idxd_unregister_devices(idxd); 1314 1237 /* ··· 1320 1245 get_device(idxd_confdev(idxd)); 1321 1246 device_unregister(idxd_confdev(idxd)); 1322 1247 idxd_shutdown(pdev); 1323 - if (device_pasid_enabled(idxd)) 1324 - idxd_disable_system_pasid(idxd); 1325 1248 idxd_device_remove_debugfs(idxd); 1326 - 1327 - irq_entry = idxd_get_ie(idxd, 0); 1328 - free_irq(irq_entry->vector, irq_entry); 1329 - pci_free_irq_vectors(pdev); 1249 + idxd_cleanup(idxd); 1330 1250 pci_iounmap(pdev, idxd->reg_base); 1331 - if (device_user_pasid_enabled(idxd)) 1332 - idxd_disable_sva(pdev); 1333 - pci_disable_device(pdev); 1334 - destroy_workqueue(idxd->wq); 1335 - perfmon_pmu_remove(idxd); 1336 1251 put_device(idxd_confdev(idxd)); 1252 + idxd_free(idxd); 1253 + pci_disable_device(pdev); 1337 1254 } 1338 1255 1339 1256 static struct pci_driver idxd_pci_driver = {
+2 -4
drivers/dma/mediatek/mtk-cqdma.c
··· 420 420 { 421 421 struct mtk_cqdma_vchan *cvc = to_cqdma_vchan(c); 422 422 struct virt_dma_desc *vd; 423 - unsigned long flags; 424 423 425 - spin_lock_irqsave(&cvc->pc->lock, flags); 426 424 list_for_each_entry(vd, &cvc->pc->queue, node) 427 425 if (vd->tx.cookie == cookie) { 428 - spin_unlock_irqrestore(&cvc->pc->lock, flags); 429 426 return vd; 430 427 } 431 - spin_unlock_irqrestore(&cvc->pc->lock, flags); 432 428 433 429 list_for_each_entry(vd, &cvc->vc.desc_issued, node) 434 430 if (vd->tx.cookie == cookie) ··· 448 452 if (ret == DMA_COMPLETE || !txstate) 449 453 return ret; 450 454 455 + spin_lock_irqsave(&cvc->pc->lock, flags); 451 456 spin_lock_irqsave(&cvc->vc.lock, flags); 452 457 vd = mtk_cqdma_find_active_desc(c, cookie); 453 458 spin_unlock_irqrestore(&cvc->vc.lock, flags); 459 + spin_unlock_irqrestore(&cvc->pc->lock, flags); 454 460 455 461 if (vd) { 456 462 cvd = to_cqdma_vdesc(vd);
+8 -2
drivers/dma/ti/k3-udma.c
··· 1091 1091 u32 residue_diff; 1092 1092 ktime_t time_diff; 1093 1093 unsigned long delay; 1094 + unsigned long flags; 1094 1095 1095 1096 while (1) { 1097 + spin_lock_irqsave(&uc->vc.lock, flags); 1098 + 1096 1099 if (uc->desc) { 1097 1100 /* Get previous residue and time stamp */ 1098 1101 residue_diff = uc->tx_drain.residue; ··· 1130 1127 break; 1131 1128 } 1132 1129 1130 + spin_unlock_irqrestore(&uc->vc.lock, flags); 1131 + 1133 1132 usleep_range(ktime_to_us(delay), 1134 1133 ktime_to_us(delay) + 10); 1135 1134 continue; ··· 1148 1143 1149 1144 break; 1150 1145 } 1146 + 1147 + spin_unlock_irqrestore(&uc->vc.lock, flags); 1151 1148 } 1152 1149 1153 1150 static irqreturn_t udma_ring_irq_handler(int irq, void *data) ··· 4253 4246 struct of_dma *ofdma) 4254 4247 { 4255 4248 struct udma_dev *ud = ofdma->of_dma_data; 4256 - dma_cap_mask_t mask = ud->ddev.cap_mask; 4257 4249 struct udma_filter_param filter_param; 4258 4250 struct dma_chan *chan; 4259 4251 ··· 4284 4278 } 4285 4279 } 4286 4280 4287 - chan = __dma_request_channel(&mask, udma_dma_filter_fn, &filter_param, 4281 + chan = __dma_request_channel(&ud->ddev.cap_mask, udma_dma_filter_fn, &filter_param, 4288 4282 ofdma->of_node); 4289 4283 if (!chan) { 4290 4284 dev_err(ud->dev, "get channel fail in %s.\n", __func__);
+6
drivers/gpio/gpio-pca953x.c
··· 1204 1204 1205 1205 guard(mutex)(&chip->i2c_lock); 1206 1206 1207 + if (chip->client->irq > 0) 1208 + enable_irq(chip->client->irq); 1207 1209 regcache_cache_only(chip->regmap, false); 1208 1210 regcache_mark_dirty(chip->regmap); 1209 1211 ret = pca953x_regcache_sync(chip); ··· 1218 1216 static void pca953x_save_context(struct pca953x_chip *chip) 1219 1217 { 1220 1218 guard(mutex)(&chip->i2c_lock); 1219 + 1220 + /* Disable IRQ to prevent early triggering while regmap "cache only" is on */ 1221 + if (chip->client->irq > 0) 1222 + disable_irq(chip->client->irq); 1221 1223 regcache_cache_only(chip->regmap, true); 1222 1224 } 1223 1225
+10 -2
drivers/gpio/gpio-virtuser.c
··· 401 401 char buf[32], *trimmed; 402 402 int ret, dir, val = 0; 403 403 404 - ret = simple_write_to_buffer(buf, sizeof(buf), ppos, user_buf, count); 404 + if (count >= sizeof(buf)) 405 + return -EINVAL; 406 + 407 + ret = simple_write_to_buffer(buf, sizeof(buf) - 1, ppos, user_buf, count); 405 408 if (ret < 0) 406 409 return ret; 410 + 411 + buf[ret] = '\0'; 407 412 408 413 trimmed = strim(buf); 409 414 ··· 628 623 char buf[GPIO_VIRTUSER_NAME_BUF_LEN + 2]; 629 624 int ret; 630 625 626 + if (count >= sizeof(buf)) 627 + return -EINVAL; 628 + 631 629 ret = simple_write_to_buffer(buf, GPIO_VIRTUSER_NAME_BUF_LEN, ppos, 632 630 user_buf, count); 633 631 if (ret < 0) 634 632 return ret; 635 633 636 - buf[strlen(buf) - 1] = '\0'; 634 + buf[ret] = '\0'; 637 635 638 636 ret = gpiod_set_consumer_name(data->ad.desc, buf); 639 637 if (ret)
+1 -1
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
··· 109 109 struct drm_exec exec; 110 110 int r; 111 111 112 - drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0); 112 + drm_exec_init(&exec, 0, 0); 113 113 drm_exec_until_all_locked(&exec) { 114 114 r = amdgpu_vm_lock_pd(vm, &exec, 0); 115 115 if (likely(!r))
+12
drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
··· 752 752 adev->gmc.vram_type = vram_type; 753 753 adev->gmc.vram_vendor = vram_vendor; 754 754 755 + /* The mall_size is already calculated as mall_size_per_umc * num_umc. 756 + * However, for gfx1151, which features a 2-to-1 UMC mapping, 757 + * the result must be multiplied by 2 to determine the actual mall size. 758 + */ 759 + switch (amdgpu_ip_version(adev, GC_HWIP, 0)) { 760 + case IP_VERSION(11, 5, 1): 761 + adev->gmc.mall_size *= 2; 762 + break; 763 + default: 764 + break; 765 + } 766 + 755 767 switch (amdgpu_ip_version(adev, GC_HWIP, 0)) { 756 768 case IP_VERSION(11, 0, 0): 757 769 case IP_VERSION(11, 0, 1):
+8
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c
··· 1023 1023 ring->doorbell_index << VCN_RB1_DB_CTRL__OFFSET__SHIFT | 1024 1024 VCN_RB1_DB_CTRL__EN_MASK); 1025 1025 1026 + /* Keeping one read-back to ensure all register writes are done, otherwise 1027 + * it may introduce race conditions */ 1028 + RREG32_SOC15(VCN, inst_idx, regVCN_RB1_DB_CTRL); 1029 + 1026 1030 return 0; 1027 1031 } 1028 1032 ··· 1208 1204 tmp |= VCN_RB_ENABLE__RB1_EN_MASK; 1209 1205 WREG32_SOC15(VCN, i, regVCN_RB_ENABLE, tmp); 1210 1206 fw_shared->sq.queue_mode &= ~(FW_QUEUE_RING_RESET | FW_QUEUE_DPG_HOLD_OFF); 1207 + 1208 + /* Keeping one read-back to ensure all register writes are done, otherwise 1209 + * it may introduce race conditions */ 1210 + RREG32_SOC15(VCN, i, regVCN_RB_ENABLE); 1211 1211 1212 1212 return 0; 1213 1213 }
+4 -1
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
··· 372 372 static inline bool is_dc_timing_adjust_needed(struct dm_crtc_state *old_state, 373 373 struct dm_crtc_state *new_state) 374 374 { 375 + if (new_state->stream->adjust.timing_adjust_pending) 376 + return true; 375 377 if (new_state->freesync_config.state == VRR_STATE_ACTIVE_FIXED) 376 378 return true; 377 379 else if (amdgpu_dm_crtc_vrr_active(old_state) != amdgpu_dm_crtc_vrr_active(new_state)) ··· 12765 12763 /* The reply is stored in the top nibble of the command. */ 12766 12764 payload->reply[0] = (adev->dm.dmub_notify->aux_reply.command >> 4) & 0xF; 12767 12765 12768 - if (!payload->write && p_notify->aux_reply.length) 12766 + /*write req may receive a byte indicating partially written number as well*/ 12767 + if (p_notify->aux_reply.length) 12769 12768 memcpy(payload->data, p_notify->aux_reply.data, 12770 12769 p_notify->aux_reply.length); 12771 12770
+11 -5
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
··· 62 62 enum aux_return_code_type operation_result; 63 63 struct amdgpu_device *adev; 64 64 struct ddc_service *ddc; 65 + uint8_t copy[16]; 65 66 66 67 if (WARN_ON(msg->size > 16)) 67 68 return -E2BIG; ··· 77 76 payload.write_status_update = 78 77 (msg->request & DP_AUX_I2C_WRITE_STATUS_UPDATE) != 0; 79 78 payload.defer_delay = 0; 79 + 80 + if (payload.write) { 81 + memcpy(copy, msg->buffer, msg->size); 82 + payload.data = copy; 83 + } 80 84 81 85 result = dc_link_aux_transfer_raw(TO_DM_AUX(aux)->ddc_service, &payload, 82 86 &operation_result); ··· 106 100 */ 107 101 if (payload.write && result >= 0) { 108 102 if (result) { 109 - /*one byte indicating partially written bytes. Force 0 to retry*/ 110 - drm_info(adev_to_drm(adev), "amdgpu: AUX partially written\n"); 111 - result = 0; 103 + /*one byte indicating partially written bytes*/ 104 + drm_dbg_dp(adev_to_drm(adev), "amdgpu: AUX partially written\n"); 105 + result = payload.data[0]; 112 106 } else if (!payload.reply[0]) 113 107 /*I2C_ACK|AUX_ACK*/ 114 108 result = msg->size; ··· 133 127 break; 134 128 } 135 129 136 - drm_info(adev_to_drm(adev), "amdgpu: DP AUX transfer fail:%d\n", operation_result); 130 + drm_dbg_dp(adev_to_drm(adev), "amdgpu: DP AUX transfer fail:%d\n", operation_result); 137 131 } 138 132 139 133 if (payload.reply[0]) 140 - drm_info(adev_to_drm(adev), "amdgpu: AUX reply command not ACK: 0x%02x.", 134 + drm_dbg_dp(adev_to_drm(adev), "amdgpu: AUX reply command not ACK: 0x%02x.", 141 135 payload.reply[0]); 142 136 143 137 return result;
+7 -3
drivers/gpu/drm/amd/display/dc/core/dc.c
··· 439 439 * Don't adjust DRR while there's bandwidth optimizations pending to 440 440 * avoid conflicting with firmware updates. 441 441 */ 442 - if (dc->ctx->dce_version > DCE_VERSION_MAX) 443 - if (dc->optimized_required || dc->wm_optimized_required) 442 + if (dc->ctx->dce_version > DCE_VERSION_MAX) { 443 + if (dc->optimized_required || dc->wm_optimized_required) { 444 + stream->adjust.timing_adjust_pending = true; 444 445 return false; 446 + } 447 + } 445 448 446 449 dc_exit_ips_for_hw_access(dc); 447 450 ··· 3171 3168 3172 3169 if (update->crtc_timing_adjust) { 3173 3170 if (stream->adjust.v_total_min != update->crtc_timing_adjust->v_total_min || 3174 - stream->adjust.v_total_max != update->crtc_timing_adjust->v_total_max) 3171 + stream->adjust.v_total_max != update->crtc_timing_adjust->v_total_max || 3172 + stream->adjust.timing_adjust_pending) 3175 3173 update->crtc_timing_adjust->timing_adjust_pending = true; 3176 3174 stream->adjust = *update->crtc_timing_adjust; 3177 3175 update->crtc_timing_adjust->timing_adjust_pending = false;
+2 -2
drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c
··· 195 195 .dcn_downspread_percent = 0.5, 196 196 .gpuvm_min_page_size_bytes = 4096, 197 197 .hostvm_min_page_size_bytes = 4096, 198 - .do_urgent_latency_adjustment = 1, 198 + .do_urgent_latency_adjustment = 0, 199 199 .urgent_latency_adjustment_fabric_clock_component_us = 0, 200 - .urgent_latency_adjustment_fabric_clock_reference_mhz = 3000, 200 + .urgent_latency_adjustment_fabric_clock_reference_mhz = 0, 201 201 }; 202 202 203 203 void dcn35_build_wm_range_table_fpu(struct clk_mgr *clk_mgr)
+11 -9
drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_translation_helper.c
··· 910 910 }
911 911
912 912 //TODO : Could be possibly moved to a common helper layer.
913 - static bool dml21_wrapper_get_plane_id(const struct dc_state *context, const struct dc_plane_state *plane, unsigned int *plane_id)
913 + static bool dml21_wrapper_get_plane_id(const struct dc_state *context, unsigned int stream_id, const struct dc_plane_state *plane, unsigned int *plane_id)
914 914 {
915 915 int i, j;
916 916
··· 918 918 return false;
919 919
920 920 for (i = 0; i < context->stream_count; i++) {
921 - for (j = 0; j < context->stream_status[i].plane_count; j++) {
922 - if (context->stream_status[i].plane_states[j] == plane) {
923 - *plane_id = (i << 16) | j;
924 - return true;
921 + if (context->streams[i]->stream_id == stream_id) {
922 + for (j = 0; j < context->stream_status[i].plane_count; j++) {
923 + if (context->stream_status[i].plane_states[j] == plane) {
924 + *plane_id = (i << 16) | j;
925 + return true;
926 + }
925 927 }
926 928 }
927 929 }
··· 946 944 return location;
947 945 }
948 946
949 - static unsigned int map_plane_to_dml21_display_cfg(const struct dml2_context *dml_ctx,
947 + static unsigned int map_plane_to_dml21_display_cfg(const struct dml2_context *dml_ctx, unsigned int stream_id,
950 948 const struct dc_plane_state *plane, const struct dc_state *context)
951 949 {
952 950 unsigned int plane_id;
953 951 int i = 0;
954 952 int location = -1;
955 953
956 - if (!dml21_wrapper_get_plane_id(context, plane, &plane_id)) {
954 + if (!dml21_wrapper_get_plane_id(context, stream_id, plane, &plane_id)) {
957 955 ASSERT(false);
958 956 return -1;
959 957 }
··· 1039 1037 dml_dispcfg->plane_descriptors[disp_cfg_plane_location].stream_index = disp_cfg_stream_location;
1040 1038 } else {
1041 1039 for (plane_index = 0; plane_index < context->stream_status[stream_index].plane_count; plane_index++) {
1042 - disp_cfg_plane_location = map_plane_to_dml21_display_cfg(dml_ctx, context->stream_status[stream_index].plane_states[plane_index], context);
1040 + disp_cfg_plane_location = map_plane_to_dml21_display_cfg(dml_ctx, context->streams[stream_index]->stream_id, context->stream_status[stream_index].plane_states[plane_index], context);
1043 1041
1044 1042 if (disp_cfg_plane_location < 0)
1045 1043 disp_cfg_plane_location = dml_dispcfg->num_planes++;
··· 1050 1048 populate_dml21_plane_config_from_plane_state(dml_ctx, &dml_dispcfg->plane_descriptors[disp_cfg_plane_location], context->stream_status[stream_index].plane_states[plane_index], context, stream_index);
1051 1049 dml_dispcfg->plane_descriptors[disp_cfg_plane_location].stream_index = disp_cfg_stream_location;
1052 1050
1053 1051 if (dml21_wrapper_get_plane_id(context, context->streams[stream_index]->stream_id, context->stream_status[stream_index].plane_states[plane_index], &dml_ctx->v21.dml_to_dc_pipe_mapping.disp_cfg_to_plane_id[disp_cfg_plane_location]))
1054 1052 dml_ctx->v21.dml_to_dc_pipe_mapping.disp_cfg_to_plane_id_valid[disp_cfg_plane_location] = true;
1055 1053
1056 1054 /* apply forced pstate policy */
+3 -2
drivers/gpu/drm/amd/display/dc/dpp/dcn401/dcn401_dpp_cm.c
··· 120 120 enum dc_cursor_color_format color_format = cursor_attributes->color_format; 121 121 int cur_rom_en = 0; 122 122 123 - // DCN4 should always do Cursor degamma for Cursor Color modes 124 123 if (color_format == CURSOR_MODE_COLOR_PRE_MULTIPLIED_ALPHA || 125 124 color_format == CURSOR_MODE_COLOR_UN_PRE_MULTIPLIED_ALPHA) { 126 - cur_rom_en = 1; 125 + if (cursor_attributes->attribute_flags.bits.ENABLE_CURSOR_DEGAMMA) { 126 + cur_rom_en = 1; 127 + } 127 128 } 128 129 129 130 REG_UPDATE_3(CURSOR0_CONTROL,
+3 -3
drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c
··· 1980 1980 dc->res_pool->hubbub, pipe_ctx->plane_res.hubp->inst, pipe_ctx->hubp_regs.det_size); 1981 1981 } 1982 1982 1983 - if (pipe_ctx->update_flags.raw || 1984 - (pipe_ctx->plane_state && pipe_ctx->plane_state->update_flags.raw) || 1985 - pipe_ctx->stream->update_flags.raw) 1983 + if (pipe_ctx->plane_state && (pipe_ctx->update_flags.raw || 1984 + pipe_ctx->plane_state->update_flags.raw || 1985 + pipe_ctx->stream->update_flags.raw)) 1986 1986 dc->hwss.update_dchubp_dpp(dc, pipe_ctx, context); 1987 1987 1988 1988 if (pipe_ctx->plane_state && (pipe_ctx->update_flags.bits.enable ||
+32 -5
drivers/gpu/drm/drm_gpusvm.c
··· 1118 1118 lockdep_assert_held(&gpusvm->notifier_lock);
1119 1119
1120 1120 if (range->flags.has_dma_mapping) {
1121 + struct drm_gpusvm_range_flags flags = {
1122 + .__flags = range->flags.__flags,
1123 + };
1124 +
1121 1125 for (i = 0, j = 0; i < npages; j++) {
1122 1126 struct drm_pagemap_device_addr *addr = &range->dma_addr[j];
1123 1127
··· 1135 1131 dev, *addr);
1136 1132 i += 1 << addr->order;
1137 1133 }
1138 - range->flags.has_devmem_pages = false;
1139 - range->flags.has_dma_mapping = false;
1134 +
1135 + /* WRITE_ONCE pairs with READ_ONCE for opportunistic checks */
1136 + flags.has_devmem_pages = false;
1137 + flags.has_dma_mapping = false;
1138 + WRITE_ONCE(range->flags.__flags, flags.__flags);
1139 +
1140 1140 range->dpagemap = NULL;
1141 1141 }
1142 1142 }
··· 1342 1334 int err = 0;
1343 1335 struct dev_pagemap *pagemap;
1344 1336 struct drm_pagemap *dpagemap;
1337 + struct drm_gpusvm_range_flags flags;
1345 1338
1346 1339 retry:
1347 1340 hmm_range.notifier_seq = mmu_interval_read_begin(notifier);
··· 1387 1378 */
1388 1379 drm_gpusvm_notifier_lock(gpusvm);
1389 1380
1390 - if (range->flags.unmapped) {
1381 + flags.__flags = range->flags.__flags;
1382 + if (flags.unmapped) {
1391 1383 drm_gpusvm_notifier_unlock(gpusvm);
1392 1384 err = -EFAULT;
1393 1385 goto err_free;
··· 1464 1454 goto err_unmap;
1465 1455 }
1466 1456
1457 + if (ctx->devmem_only) {
1458 + err = -EFAULT;
1459 + goto err_unmap;
1460 + }
1461 +
1467 1462 addr = dma_map_page(gpusvm->drm->dev,
1468 1463 page, 0,
1469 1464 PAGE_SIZE << order,
··· 1484 1469 }
1485 1470 i += 1 << order;
1486 1471 num_dma_mapped = i;
1487 - range->flags.has_dma_mapping = true;
1472 + flags.has_dma_mapping = true;
1488 1473 }
1489 1474
1490 1475 if (zdd) {
1491 - range->flags.has_devmem_pages = true;
1476 + flags.has_devmem_pages = true;
1492 1477 range->dpagemap = dpagemap;
1493 1478 }
1479 +
1480 + /* WRITE_ONCE pairs with READ_ONCE for opportunistic checks */
1481 + WRITE_ONCE(range->flags.__flags, flags.__flags);
1494 1482
1495 1483 drm_gpusvm_notifier_unlock(gpusvm);
1496 1484 kvfree(pfns);
··· 1783 1765 goto err_finalize;
1784 1766
1785 1767 /* Upon success bind devmem allocation to range and zdd */
1768 + devmem_allocation->timeslice_expiration = get_jiffies_64() +
1769 + msecs_to_jiffies(ctx->timeslice_ms);
1786 1770 zdd->devmem_allocation = devmem_allocation; /* Owns ref */
1787 1771
1788 1772 err_finalize:
··· 2004 1984 unsigned long start, end;
2005 1985 void *buf;
2006 1986 int i, err = 0;
1987 +
1988 + if (page) {
1989 + zdd = page->zone_device_data;
1990 + if (time_before64(get_jiffies_64(),
1991 + zdd->devmem_allocation->timeslice_expiration))
1992 + return 0;
1993 + }
2007 1994
2008 1995 start = ALIGN_DOWN(fault_addr, size);
2009 1996 end = ALIGN(fault_addr + 1, size);
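The drm_gpusvm hunk above edits a local copy of the range flags and publishes every bit with a single WRITE_ONCE, pairing with READ_ONCE on the lockless reader side. A small userspace model of that snapshot-and-publish pattern (macro definitions and struct names are stand-ins, assuming GNU C `__typeof__`):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in macros modelling the kernel's READ_ONCE/WRITE_ONCE: a single
 * volatile access, so the compiler cannot tear, fuse, or re-load it. */
#define READ_ONCE(x)	   (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define HAS_DMA_MAPPING	 (1u << 0)
#define HAS_DEVMEM_PAGES (1u << 1)

struct range_flags {
	uint32_t __flags;	/* all flag bits live in one word */
};

/* Writer (holding the heavier lock): edit a local snapshot, then publish
 * all bits with one store, so an opportunistic reader doing
 * READ_ONCE(r->__flags) never observes a half-updated combination such
 * as has_devmem_pages cleared while has_dma_mapping is still set. */
static void unmap_publish(struct range_flags *r)
{
	uint32_t snap = r->__flags;

	snap &= ~(HAS_DMA_MAPPING | HAS_DEVMEM_PAGES);
	WRITE_ONCE(r->__flags, snap);
}

static uint32_t demo(void)
{
	struct range_flags r = { .__flags = HAS_DMA_MAPPING | HAS_DEVMEM_PAGES };

	unmap_publish(&r);
	return READ_ONCE(r.__flags);
}
```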
+2 -2
drivers/gpu/drm/meson/meson_encoder_hdmi.c
··· 75 75 unsigned long long venc_freq; 76 76 unsigned long long hdmi_freq; 77 77 78 - vclk_freq = mode->clock * 1000; 78 + vclk_freq = mode->clock * 1000ULL; 79 79 80 80 /* For 420, pixel clock is half unlike venc clock */ 81 81 if (encoder_hdmi->output_bus_fmt == MEDIA_BUS_FMT_UYYVYY8_0_5X24) ··· 123 123 struct meson_encoder_hdmi *encoder_hdmi = bridge_to_meson_encoder_hdmi(bridge); 124 124 struct meson_drm *priv = encoder_hdmi->priv; 125 125 bool is_hdmi2_sink = display_info->hdmi.scdc.supported; 126 - unsigned long long clock = mode->clock * 1000; 126 + unsigned long long clock = mode->clock * 1000ULL; 127 127 unsigned long long phy_freq; 128 128 unsigned long long vclk_freq; 129 129 unsigned long long venc_freq;
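The meson fix above appends `ULL` so the kHz-to-Hz multiply happens in 64-bit arithmetic. With a plain `mode->clock * 1000`, both operands are `int`, so the product is computed in 32 bits and wraps before being widened into the `unsigned long long` destination. A small sketch of both forms (helper names invented; the 5940000 kHz input is only an illustrative value in the 5.94 GHz TMDS range the driver's downstream frequency math can reach):

```c
#include <assert.h>
#include <stdint.h>

/* Emulates the buggy form deterministically: the multiply is done in
 * 32 bits and wraps at 2^32 (uint32_t is used here because signed int
 * overflow is undefined behavior in C). */
static unsigned long long khz_to_hz_32(int clock_khz)
{
	return (uint32_t)((uint32_t)clock_khz * 1000u);	/* wraps at 2^32 */
}

/* The fixed form: one ULL operand promotes the whole multiply to
 * 64-bit arithmetic, matching the diff's `* 1000ULL`. */
static unsigned long long khz_to_hz_64(int clock_khz)
{
	return clock_khz * 1000ULL;
}
```

For ordinary mode clocks the two agree; only once the product exceeds 32 bits does the truncated variant silently diverge.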
+4 -1
drivers/gpu/drm/tiny/panel-mipi-dbi.c
··· 390 390 391 391 spi_set_drvdata(spi, drm); 392 392 393 - drm_client_setup(drm, NULL); 393 + if (bpp == 16) 394 + drm_client_setup_with_fourcc(drm, DRM_FORMAT_RGB565); 395 + else 396 + drm_client_setup_with_fourcc(drm, DRM_FORMAT_RGB888); 394 397 395 398 return 0; 396 399 }
+4
drivers/gpu/drm/xe/instructions/xe_mi_commands.h
··· 47 47 #define MI_LRI_FORCE_POSTED REG_BIT(12) 48 48 #define MI_LRI_LEN(x) (((x) & 0xff) + 1) 49 49 50 + #define MI_STORE_REGISTER_MEM (__MI_INSTR(0x24) | XE_INSTR_NUM_DW(4)) 51 + #define MI_SRM_USE_GGTT REG_BIT(22) 52 + #define MI_SRM_ADD_CS_OFFSET REG_BIT(19) 53 + 50 54 #define MI_FLUSH_DW __MI_INSTR(0x26) 51 55 #define MI_FLUSH_DW_PROTECTED_MEM_EN REG_BIT(22) 52 56 #define MI_FLUSH_DW_STORE_INDEX REG_BIT(21)
+5
drivers/gpu/drm/xe/regs/xe_engine_regs.h
··· 43 43 #define XEHPC_BCS8_RING_BASE 0x3ee000 44 44 #define GSCCS_RING_BASE 0x11a000 45 45 46 + #define ENGINE_ID(base) XE_REG((base) + 0x8c) 47 + #define ENGINE_INSTANCE_ID REG_GENMASK(9, 4) 48 + #define ENGINE_CLASS_ID REG_GENMASK(2, 0) 49 + 46 50 #define RING_TAIL(base) XE_REG((base) + 0x30) 47 51 #define TAIL_ADDR REG_GENMASK(20, 3) 48 52 ··· 158 154 #define STOP_RING REG_BIT(8) 159 155 160 156 #define RING_CTX_TIMESTAMP(base) XE_REG((base) + 0x3a8) 157 + #define RING_CTX_TIMESTAMP_UDW(base) XE_REG((base) + 0x3ac) 161 158 #define CSBE_DEBUG_STATUS(base) XE_REG((base) + 0x3fc) 162 159 163 160 #define RING_FORCE_TO_NONPRIV(base, i) XE_REG(((base) + 0x4d0) + (i) * 4)
+1
drivers/gpu/drm/xe/regs/xe_gt_regs.h
··· 157 157 #define XEHPG_SC_INSTDONE_EXTRA2 XE_REG_MCR(0x7108) 158 158 159 159 #define COMMON_SLICE_CHICKEN4 XE_REG(0x7300, XE_REG_OPTION_MASKED) 160 + #define SBE_PUSH_CONSTANT_BEHIND_FIX_ENABLE REG_BIT(12) 160 161 #define DISABLE_TDC_LOAD_BALANCING_CALC REG_BIT(6) 161 162 162 163 #define COMMON_SLICE_CHICKEN3 XE_REG(0x7304, XE_REG_OPTION_MASKED)
+2
drivers/gpu/drm/xe/regs/xe_lrc_layout.h
··· 11 11 #define CTX_RING_TAIL (0x06 + 1) 12 12 #define CTX_RING_START (0x08 + 1) 13 13 #define CTX_RING_CTL (0x0a + 1) 14 + #define CTX_BB_PER_CTX_PTR (0x12 + 1) 14 15 #define CTX_TIMESTAMP (0x22 + 1) 16 + #define CTX_TIMESTAMP_UDW (0x24 + 1) 15 17 #define CTX_INDIRECT_RING_STATE (0x26 + 1) 16 18 #define CTX_PDP0_UDW (0x30 + 1) 17 19 #define CTX_PDP0_LDW (0x32 + 1)
+2
drivers/gpu/drm/xe/xe_device_types.h
··· 330 330 u8 has_sriov:1; 331 331 /** @info.has_usm: Device has unified shared memory support */ 332 332 u8 has_usm:1; 333 + /** @info.has_64bit_timestamp: Device supports 64-bit timestamps */ 334 + u8 has_64bit_timestamp:1; 333 335 /** @info.is_dgfx: is discrete device */ 334 336 u8 is_dgfx:1; 335 337 /**
+1 -1
drivers/gpu/drm/xe/xe_exec_queue.c
··· 830 830 { 831 831 struct xe_device *xe = gt_to_xe(q->gt); 832 832 struct xe_lrc *lrc; 833 - u32 old_ts, new_ts; 833 + u64 old_ts, new_ts; 834 834 int idx; 835 835 836 836 /*
+1 -1
drivers/gpu/drm/xe/xe_guc_submit.c
··· 941 941 return xe_sched_invalidate_job(job, 2); 942 942 } 943 943 944 - ctx_timestamp = xe_lrc_ctx_timestamp(q->lrc[0]); 944 + ctx_timestamp = lower_32_bits(xe_lrc_ctx_timestamp(q->lrc[0])); 945 945 ctx_job_timestamp = xe_lrc_ctx_job_timestamp(q->lrc[0]); 946 946 947 947 /*
+186 -13
drivers/gpu/drm/xe/xe_lrc.c
··· 24 24 #include "xe_hw_fence.h"
25 25 #include "xe_map.h"
26 26 #include "xe_memirq.h"
27 + #include "xe_mmio.h"
27 28 #include "xe_sriov.h"
28 29 #include "xe_trace_lrc.h"
29 30 #include "xe_vm.h"
··· 651 650 #define LRC_START_SEQNO_PPHWSP_OFFSET (LRC_SEQNO_PPHWSP_OFFSET + 8)
652 651 #define LRC_CTX_JOB_TIMESTAMP_OFFSET (LRC_START_SEQNO_PPHWSP_OFFSET + 8)
653 652 #define LRC_PARALLEL_PPHWSP_OFFSET 2048
653 + #define LRC_ENGINE_ID_PPHWSP_OFFSET 2096
654 654 #define LRC_PPHWSP_SIZE SZ_4K
655 655
656 656 u32 xe_lrc_regs_offset(struct xe_lrc *lrc)
··· 686 684
687 685 static u32 __xe_lrc_ctx_job_timestamp_offset(struct xe_lrc *lrc)
688 686 {
689 - /* The start seqno is stored in the driver-defined portion of PPHWSP */
687 + /* This is stored in the driver-defined portion of PPHWSP */
690 688 return xe_lrc_pphwsp_offset(lrc) + LRC_CTX_JOB_TIMESTAMP_OFFSET;
691 689 }
692 690
··· 696 694 return xe_lrc_pphwsp_offset(lrc) + LRC_PARALLEL_PPHWSP_OFFSET;
697 695 }
698 696
697 + static inline u32 __xe_lrc_engine_id_offset(struct xe_lrc *lrc)
698 + {
699 + return xe_lrc_pphwsp_offset(lrc) + LRC_ENGINE_ID_PPHWSP_OFFSET;
700 + }
701 +
699 702 static u32 __xe_lrc_ctx_timestamp_offset(struct xe_lrc *lrc)
700 703 {
701 704 return __xe_lrc_regs_offset(lrc) + CTX_TIMESTAMP * sizeof(u32);
705 + }
706 +
707 + static u32 __xe_lrc_ctx_timestamp_udw_offset(struct xe_lrc *lrc)
708 + {
709 + return __xe_lrc_regs_offset(lrc) + CTX_TIMESTAMP_UDW * sizeof(u32);
702 710 }
703 711
704 712 static inline u32 __xe_lrc_indirect_ring_offset(struct xe_lrc *lrc)
··· 738 726 DECL_MAP_ADDR_HELPERS(start_seqno)
739 727 DECL_MAP_ADDR_HELPERS(ctx_job_timestamp)
740 728 DECL_MAP_ADDR_HELPERS(ctx_timestamp)
729 + DECL_MAP_ADDR_HELPERS(ctx_timestamp_udw)
741 730 DECL_MAP_ADDR_HELPERS(parallel)
742 731 DECL_MAP_ADDR_HELPERS(indirect_ring)
732 + DECL_MAP_ADDR_HELPERS(engine_id)
743 733
744 734 #undef DECL_MAP_ADDR_HELPERS
745 735
··· 757 743 }
758 744
759 745 /**
746 + * xe_lrc_ctx_timestamp_udw_ggtt_addr() - Get ctx timestamp udw GGTT address
747 + * @lrc: Pointer to the lrc.
748 + *
749 + * Returns: ctx timestamp udw GGTT address
750 + */
751 + u32 xe_lrc_ctx_timestamp_udw_ggtt_addr(struct xe_lrc *lrc)
752 + {
753 + return __xe_lrc_ctx_timestamp_udw_ggtt_addr(lrc);
754 + }
755 +
756 + /**
760 757 * xe_lrc_ctx_timestamp() - Read ctx timestamp value
761 758 * @lrc: Pointer to the lrc.
762 759 *
763 760 * Returns: ctx timestamp value
764 761 */
765 - u32 xe_lrc_ctx_timestamp(struct xe_lrc *lrc)
762 + u64 xe_lrc_ctx_timestamp(struct xe_lrc *lrc)
766 763 {
767 764 struct xe_device *xe = lrc_to_xe(lrc);
768 765 struct iosys_map map;
766 + u32 ldw, udw = 0;
769 767
770 768 map = __xe_lrc_ctx_timestamp_map(lrc);
771 - return xe_map_read32(xe, &map);
769 + ldw = xe_map_read32(xe, &map);
770 +
771 + if (xe->info.has_64bit_timestamp) {
772 + map = __xe_lrc_ctx_timestamp_udw_map(lrc);
773 + udw = xe_map_read32(xe, &map);
774 + }
775 +
776 + return (u64)udw << 32 | ldw;
772 777 }
773 778
774 779 /**
··· 897 864
898 865 static void xe_lrc_set_ppgtt(struct xe_lrc *lrc, struct xe_vm *vm)
899 866 {
900 - u64 desc = xe_vm_pdp4_descriptor(vm, lrc->tile);
867 + u64 desc = xe_vm_pdp4_descriptor(vm, gt_to_tile(lrc->gt));
901 868
902 869 xe_lrc_write_ctx_reg(lrc, CTX_PDP0_UDW, upper_32_bits(desc));
903 870 xe_lrc_write_ctx_reg(lrc, CTX_PDP0_LDW, lower_32_bits(desc));
··· 910 877 xe_bo_unpin(lrc->bo);
911 878 xe_bo_unlock(lrc->bo);
912 879 xe_bo_put(lrc->bo);
880 + xe_bo_unpin_map_no_vm(lrc->bb_per_ctx_bo);
881 + }
882 +
883 + /*
884 + * xe_lrc_setup_utilization() - Setup wa bb to assist in calculating active
885 + * context run ticks.
886 + * @lrc: Pointer to the lrc.
887 + *
888 + * Context Timestamp (CTX_TIMESTAMP) in the LRC accumulates the run ticks of the
889 + * context, but only gets updated when the context switches out. In order to
890 + check how long a context has been active before it switches out, two things
891 + are required:
892 +
893 + * (1) Determine if the context is running:
894 + To do so, we program the WA BB to set an initial value for CTX_TIMESTAMP in
895 + the LRC. The value chosen is 1 since 0 is the initial value when the LRC is
896 + initialized. During a query, we just check for this value to determine if the
897 + context is active. If the context switched out, it would overwrite this
898 + location with the actual CTX_TIMESTAMP MMIO value. Note that WA BB runs as
899 + the last part of context restore, so reusing this LRC location will not
900 + clobber anything.
901 +
902 + * (2) Calculate the time that the context has been active for:
903 + The CTX_TIMESTAMP ticks only when the context is active. If a context is
904 + active, we just use the CTX_TIMESTAMP MMIO as the new value of utilization.
905 + While doing so, we need to read the CTX_TIMESTAMP MMIO for the specific
906 + engine instance. Since we do not know which instance the context is running
907 + on until it is scheduled, we also read the ENGINE_ID MMIO in the WA BB and
908 + store it in the PPHWSP.
909 + */
910 + #define CONTEXT_ACTIVE 1ULL
911 + static void xe_lrc_setup_utilization(struct xe_lrc *lrc)
912 + {
913 + u32 *cmd;
914 +
915 + cmd = lrc->bb_per_ctx_bo->vmap.vaddr;
916 +
917 + *cmd++ = MI_STORE_REGISTER_MEM | MI_SRM_USE_GGTT | MI_SRM_ADD_CS_OFFSET;
918 + *cmd++ = ENGINE_ID(0).addr;
919 + *cmd++ = __xe_lrc_engine_id_ggtt_addr(lrc);
920 + *cmd++ = 0;
921 +
922 + *cmd++ = MI_STORE_DATA_IMM | MI_SDI_GGTT | MI_SDI_NUM_DW(1);
923 + *cmd++ = __xe_lrc_ctx_timestamp_ggtt_addr(lrc);
924 + *cmd++ = 0;
925 + *cmd++ = lower_32_bits(CONTEXT_ACTIVE);
926 +
927 + if (lrc_to_xe(lrc)->info.has_64bit_timestamp) {
928 + *cmd++ = MI_STORE_DATA_IMM | MI_SDI_GGTT | MI_SDI_NUM_DW(1);
929 + *cmd++ = __xe_lrc_ctx_timestamp_udw_ggtt_addr(lrc);
930 + *cmd++ = 0;
931 + *cmd++ = upper_32_bits(CONTEXT_ACTIVE);
932 + }
933 +
934 + *cmd++ = MI_BATCH_BUFFER_END;
935 +
936 + xe_lrc_write_ctx_reg(lrc, CTX_BB_PER_CTX_PTR,
937 + xe_bo_ggtt_addr(lrc->bb_per_ctx_bo) | 1);
938 +
913 939 }
914 940
915 941 #define PVC_CTX_ASID (0x2e + 1)
··· 985 893 void *init_data = NULL;
986 894 u32 arb_enable;
987 895 u32 lrc_size;
896 + u32 bo_flags;
988 897 int err;
989 898
990 899 kref_init(&lrc->refcount);
900 + lrc->gt = gt;
991 901 lrc->flags = 0;
992 902 lrc_size = ring_size + xe_gt_lrc_size(gt, hwe->class);
993 903 if (xe_gt_has_indirect_ring_state(gt))
994 904 lrc->flags |= XE_LRC_FLAG_INDIRECT_RING_STATE;
905 +
906 + bo_flags = XE_BO_FLAG_VRAM_IF_DGFX(tile) | XE_BO_FLAG_GGTT |
907 + XE_BO_FLAG_GGTT_INVALIDATE;
995 908
996 909 /*
997 910 * FIXME: Perma-pinning LRC as we don't yet support moving GGTT address
··· 1004 907 */
1005 908 lrc->bo = xe_bo_create_pin_map(xe, tile, vm, lrc_size,
1006 909 ttm_bo_type_kernel,
1007 - XE_BO_FLAG_VRAM_IF_DGFX(tile) |
1008 - XE_BO_FLAG_GGTT |
1009 - XE_BO_FLAG_GGTT_INVALIDATE);
910 + bo_flags);
1010 911 if (IS_ERR(lrc->bo))
1011 912 return PTR_ERR(lrc->bo);
1012 913
914 + lrc->bb_per_ctx_bo = xe_bo_create_pin_map(xe, tile, NULL, SZ_4K,
915 + ttm_bo_type_kernel,
916 + bo_flags);
917 + if (IS_ERR(lrc->bb_per_ctx_bo)) {
918 + err = PTR_ERR(lrc->bb_per_ctx_bo);
919 + goto err_lrc_finish;
920 + }
921 +
1013 922 lrc->size = lrc_size;
1014 - lrc->tile = gt_to_tile(hwe->gt);
1015 923 lrc->ring.size = ring_size;
1016 924 lrc->ring.tail = 0;
1017 - lrc->ctx_timestamp = 0;
1018 925
1019 926 xe_hw_fence_ctx_init(&lrc->fence_ctx, hwe->gt,
1020 927 hwe->fence_irq, hwe->name);
··· 1091 990 xe_lrc_read_ctx_reg(lrc, CTX_CONTEXT_CONTROL) |
1092 991 _MASKED_BIT_ENABLE(CTX_CTRL_PXP_ENABLE));
1093 992
993 + lrc->ctx_timestamp = 0;
1094 994 xe_lrc_write_ctx_reg(lrc, CTX_TIMESTAMP, 0);
995 + if (lrc_to_xe(lrc)->info.has_64bit_timestamp)
996 + xe_lrc_write_ctx_reg(lrc, CTX_TIMESTAMP_UDW, 0);
1095 997
1096 998 if (xe->info.has_asid && vm)
1097 999 xe_lrc_write_ctx_reg(lrc, PVC_CTX_ASID, vm->usm.asid);
··· 1122 1018
1123 1019 map = __xe_lrc_start_seqno_map(lrc);
1124 1020 xe_map_write32(lrc_to_xe(lrc), &map, lrc->fence_ctx.next_seqno - 1);
1021 +
1022 + xe_lrc_setup_utilization(lrc);
1125 1023
1126 1024 return 0;
1127 1025
··· 1342 1236 struct iosys_map xe_lrc_parallel_map(struct xe_lrc *lrc)
1343 1237 {
1344 1238 return __xe_lrc_parallel_map(lrc);
1239 + }
1240 +
1241 + /**
1242 + * xe_lrc_engine_id() - Read engine id value
1243 + * @lrc: Pointer to the lrc.
1244 + *
1245 + * Returns: engine id value
1246 + */
1247 + static u32 xe_lrc_engine_id(struct xe_lrc *lrc)
1248 + {
1249 + struct xe_device *xe = lrc_to_xe(lrc);
1250 + struct iosys_map map;
1251 +
1252 + map = __xe_lrc_engine_id_map(lrc);
1253 + return xe_map_read32(xe, &map);
1345 1254 }
1346 1255
1347 1256 static int instr_dw(u32 cmd_header)
··· 1805 1684 snapshot->lrc_offset = xe_lrc_pphwsp_offset(lrc);
1806 1685 snapshot->lrc_size = lrc->bo->size - snapshot->lrc_offset;
1807 1686 snapshot->lrc_snapshot = NULL;
1808 - snapshot->ctx_timestamp = xe_lrc_ctx_timestamp(lrc);
1687 + snapshot->ctx_timestamp = lower_32_bits(xe_lrc_ctx_timestamp(lrc));
1809 1688 snapshot->ctx_job_timestamp = xe_lrc_ctx_job_timestamp(lrc);
1810 1689 return snapshot;
1811 1690 }
··· 1905 1784 kfree(snapshot);
1906 1785 }
1907 1786
1787 + static int get_ctx_timestamp(struct xe_lrc *lrc, u32 engine_id, u64 *reg_ctx_ts)
1788 + {
1789 + u16 class = REG_FIELD_GET(ENGINE_CLASS_ID, engine_id);
1790 + u16 instance = REG_FIELD_GET(ENGINE_INSTANCE_ID, engine_id);
1791 + struct xe_hw_engine *hwe;
1792 + u64 val;
1793 +
1794 + hwe = xe_gt_hw_engine(lrc->gt, class, instance, false);
1795 + if (xe_gt_WARN_ONCE(lrc->gt, !hwe || xe_hw_engine_is_reserved(hwe),
1796 + "Unexpected engine class:instance %d:%d for context utilization\n",
1797 + class, instance))
1798 + return -1;
1799 +
1800 + if (lrc_to_xe(lrc)->info.has_64bit_timestamp)
1801 + val = xe_mmio_read64_2x32(&hwe->gt->mmio,
1802 + RING_CTX_TIMESTAMP(hwe->mmio_base));
1803 + else
1804 + val = xe_mmio_read32(&hwe->gt->mmio,
1805 + RING_CTX_TIMESTAMP(hwe->mmio_base));
1806 +
1807 + *reg_ctx_ts = val;
1808 +
1809 + return 0;
1810 + }
1811 +
1908 1812 /**
1909 1813 * xe_lrc_update_timestamp() - Update ctx timestamp
1910 1814 * @lrc: Pointer to the lrc.
1911 1815 * @old_ts: Old timestamp value
1912 1816 *
1913 1817 * Populate @old_ts current saved ctx timestamp, read new ctx timestamp and
1914 - * update saved value.
1818 + * update saved value. With support for active contexts, the calculation may be
1819 + slightly racy, so follow a read-again logic to ensure that the context is
1820 + still active before returning the right timestamp.
1915 1821 *
1916 1822 * Returns: New ctx timestamp value
1917 1823 */
1918 - u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts)
1824 + u64 xe_lrc_update_timestamp(struct xe_lrc *lrc, u64 *old_ts)
1919 1825 {
1826 + u64 lrc_ts, reg_ts;
1827 + u32 engine_id;
1828 +
1920 1829 *old_ts = lrc->ctx_timestamp;
1921 1830
1922 - lrc->ctx_timestamp = xe_lrc_ctx_timestamp(lrc);
1831 + lrc_ts = xe_lrc_ctx_timestamp(lrc);
1832 + /* CTX_TIMESTAMP mmio read is invalid on VF, so return the LRC value */
1833 + if (IS_SRIOV_VF(lrc_to_xe(lrc))) {
1834 + lrc->ctx_timestamp = lrc_ts;
1835 + goto done;
1836 + }
1923 1837
1838 + if (lrc_ts == CONTEXT_ACTIVE) {
1839 + engine_id = xe_lrc_engine_id(lrc);
1840 + if (!get_ctx_timestamp(lrc, engine_id, &reg_ts))
1841 + lrc->ctx_timestamp = reg_ts;
1842 +
1843 + /* read lrc again to ensure context is still active */
1844 + lrc_ts = xe_lrc_ctx_timestamp(lrc);
1845 + }
1846 +
1847 + /*
1848 + * If context switched out, just use the lrc_ts. Note that this needs to
1849 + * be a separate if condition.
1850 + */
1851 + if (lrc_ts != CONTEXT_ACTIVE)
1852 + lrc->ctx_timestamp = lrc_ts;
1853 +
1854 + done:
1924 1855 trace_xe_lrc_update_timestamp(lrc, *old_ts);
1925 1856
1926 1857 return lrc->ctx_timestamp;
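The update path in the hunk above follows the sentinel-plus-read-again protocol described in the `xe_lrc_setup_utilization()` comment. A condensed userspace model of just that logic (function and variable names invented, not the driver's API):

```c
#include <assert.h>
#include <stdint.h>

#define CONTEXT_ACTIVE 1ULL

/* Userspace model of the read-again logic: the WA BB seeds the saved
 * CTX_TIMESTAMP slot with the sentinel 1, so finding the sentinel means
 * the context is still running and the live engine counter should be
 * used. The slot is then read a second time, because a context switch
 * between the two reads overwrites the sentinel with the final,
 * authoritative timestamp, which must win. */
static uint64_t resolve_timestamp(const volatile uint64_t *lrc_slot,
				  uint64_t live_counter)
{
	uint64_t ts = *lrc_slot;
	uint64_t result = ts;

	if (ts == CONTEXT_ACTIVE) {
		result = live_counter;	/* context active: trust the MMIO read */
		ts = *lrc_slot;		/* read again: still active? */
	}
	if (ts != CONTEXT_ACTIVE)
		result = ts;		/* switched out: LRC value wins */

	return result;
}

static uint64_t demo_active(void)
{
	volatile uint64_t slot = CONTEXT_ACTIVE;	/* sentinel still present */

	return resolve_timestamp(&slot, 777);
}

static uint64_t demo_idle(void)
{
	volatile uint64_t slot = 12345;	/* context already switched out */

	return resolve_timestamp(&slot, 777);
}
```

Note the two `if` conditions are deliberately separate, mirroring the comment in the diff: the second one handles a switch-out that lands between the first read and the MMIO sample.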
+3 -2
drivers/gpu/drm/xe/xe_lrc.h
··· 120 120 void xe_lrc_snapshot_free(struct xe_lrc_snapshot *snapshot); 121 121 122 122 u32 xe_lrc_ctx_timestamp_ggtt_addr(struct xe_lrc *lrc); 123 - u32 xe_lrc_ctx_timestamp(struct xe_lrc *lrc); 123 + u32 xe_lrc_ctx_timestamp_udw_ggtt_addr(struct xe_lrc *lrc); 124 + u64 xe_lrc_ctx_timestamp(struct xe_lrc *lrc); 124 125 u32 xe_lrc_ctx_job_timestamp_ggtt_addr(struct xe_lrc *lrc); 125 126 u32 xe_lrc_ctx_job_timestamp(struct xe_lrc *lrc); 126 127 ··· 137 136 * 138 137 * Returns the current LRC timestamp 139 138 */ 140 - u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts); 139 + u64 xe_lrc_update_timestamp(struct xe_lrc *lrc, u64 *old_ts); 141 140 142 141 #endif
+6 -3
drivers/gpu/drm/xe/xe_lrc_types.h
··· 25 25 /** @size: size of lrc including any indirect ring state page */ 26 26 u32 size; 27 27 28 - /** @tile: tile which this LRC belongs to */ 29 - struct xe_tile *tile; 28 + /** @gt: gt which this LRC belongs to */ 29 + struct xe_gt *gt; 30 30 31 31 /** @flags: LRC flags */ 32 32 #define XE_LRC_FLAG_INDIRECT_RING_STATE 0x1 ··· 52 52 struct xe_hw_fence_ctx fence_ctx; 53 53 54 54 /** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */ 55 - u32 ctx_timestamp; 55 + u64 ctx_timestamp; 56 + 57 + /** @bb_per_ctx_bo: buffer object for per context batch wa buffer */ 58 + struct xe_bo *bb_per_ctx_bo; 56 59 }; 57 60 58 61 struct xe_lrc_snapshot;
-3
drivers/gpu/drm/xe/xe_module.c
··· 29 29 module_param_named(svm_notifier_size, xe_modparam.svm_notifier_size, uint, 0600); 30 30 MODULE_PARM_DESC(svm_notifier_size, "Set the svm notifier size(in MiB), must be power of 2"); 31 31 32 - module_param_named(always_migrate_to_vram, xe_modparam.always_migrate_to_vram, bool, 0444); 33 - MODULE_PARM_DESC(always_migrate_to_vram, "Always migrate to VRAM on GPU fault"); 34 - 35 32 module_param_named_unsafe(force_execlist, xe_modparam.force_execlist, bool, 0444); 36 33 MODULE_PARM_DESC(force_execlist, "Force Execlist submission"); 37 34
-1
drivers/gpu/drm/xe/xe_module.h
··· 12 12 struct xe_modparam { 13 13 bool force_execlist; 14 14 bool probe_display; 15 - bool always_migrate_to_vram; 16 15 u32 force_vram_bar_size; 17 16 int guc_log_level; 18 17 char *guc_firmware_path;
+2
drivers/gpu/drm/xe/xe_pci.c
··· 140 140 .has_indirect_ring_state = 1, \ 141 141 .has_range_tlb_invalidation = 1, \ 142 142 .has_usm = 1, \ 143 + .has_64bit_timestamp = 1, \ 143 144 .va_bits = 48, \ 144 145 .vm_max_level = 4, \ 145 146 .hw_engine_mask = \ ··· 669 668 670 669 xe->info.has_range_tlb_invalidation = graphics_desc->has_range_tlb_invalidation; 671 670 xe->info.has_usm = graphics_desc->has_usm; 671 + xe->info.has_64bit_timestamp = graphics_desc->has_64bit_timestamp; 672 672 673 673 for_each_remote_tile(tile, xe, id) { 674 674 int err;
+1
drivers/gpu/drm/xe/xe_pci_types.h
··· 21 21 u8 has_indirect_ring_state:1; 22 22 u8 has_range_tlb_invalidation:1; 23 23 u8 has_usm:1; 24 + u8 has_64bit_timestamp:1; 24 25 }; 25 26 26 27 struct xe_media_desc {
+11 -3
drivers/gpu/drm/xe/xe_pt.c
··· 2232 2232 } 2233 2233 case DRM_GPUVA_OP_DRIVER: 2234 2234 { 2235 + /* WRITE_ONCE pairs with READ_ONCE in xe_svm.c */ 2236 + 2235 2237 if (op->subop == XE_VMA_SUBOP_MAP_RANGE) { 2236 - op->map_range.range->tile_present |= BIT(tile->id); 2237 - op->map_range.range->tile_invalidated &= ~BIT(tile->id); 2238 + WRITE_ONCE(op->map_range.range->tile_present, 2239 + op->map_range.range->tile_present | 2240 + BIT(tile->id)); 2241 + WRITE_ONCE(op->map_range.range->tile_invalidated, 2242 + op->map_range.range->tile_invalidated & 2243 + ~BIT(tile->id)); 2238 2244 } else if (op->subop == XE_VMA_SUBOP_UNMAP_RANGE) { 2239 - op->unmap_range.range->tile_present &= ~BIT(tile->id); 2245 + WRITE_ONCE(op->unmap_range.range->tile_present, 2246 + op->unmap_range.range->tile_present & 2247 + ~BIT(tile->id)); 2240 2248 } 2241 2249 break; 2242 2250 }
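The xe_pt change above publishes the `tile_present`/`tile_invalidated` bitmasks with WRITE_ONCE so that `xe_svm_range_is_valid()` in xe_svm.c can test them locklessly. The predicate itself reduces to a small bit test; a standalone sketch (plain integers rather than the driver's structs):

```c
#include <assert.h>
#include <stdint.h>

/* Standalone sketch of the validity test: a range is mapped for a tile
 * when the tile's bit is set in `present` and clear in `invalidated`,
 * i.e. (present & ~invalidated) & BIT(tile_id). */
static int range_valid_for_tile(uint8_t present, uint8_t invalidated,
				unsigned int tile_id)
{
	return ((present & ~invalidated) >> tile_id) & 1;
}
```

Setting a bit in `invalidated` is how an MMU-notifier invalidation marks a tile's mapping stale without having to clear `present` under the same store.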
+2 -5
drivers/gpu/drm/xe/xe_ring_ops.c
··· 234 234 235 235 static int emit_copy_timestamp(struct xe_lrc *lrc, u32 *dw, int i) 236 236 { 237 - dw[i++] = MI_COPY_MEM_MEM | MI_COPY_MEM_MEM_SRC_GGTT | 238 - MI_COPY_MEM_MEM_DST_GGTT; 237 + dw[i++] = MI_STORE_REGISTER_MEM | MI_SRM_USE_GGTT | MI_SRM_ADD_CS_OFFSET; 238 + dw[i++] = RING_CTX_TIMESTAMP(0).addr; 239 239 dw[i++] = xe_lrc_ctx_job_timestamp_ggtt_addr(lrc); 240 240 dw[i++] = 0; 241 - dw[i++] = xe_lrc_ctx_timestamp_ggtt_addr(lrc); 242 - dw[i++] = 0; 243 - dw[i++] = MI_NOOP; 244 241 245 242 return i; 246 243 }
+1 -1
drivers/gpu/drm/xe/xe_shrinker.c
··· 227 227 if (!shrinker) 228 228 return ERR_PTR(-ENOMEM); 229 229 230 - shrinker->shrink = shrinker_alloc(0, "xe system shrinker"); 230 + shrinker->shrink = shrinker_alloc(0, "drm-xe_gem:%s", xe->drm.unique); 231 231 if (!shrinker->shrink) { 232 232 kfree(shrinker); 233 233 return ERR_PTR(-ENOMEM);
+90 -26
drivers/gpu/drm/xe/xe_svm.c
··· 15 15
16 16 static bool xe_svm_range_in_vram(struct xe_svm_range *range)
17 17 {
18 - /* Not reliable without notifier lock */
19 - return range->base.flags.has_devmem_pages;
18 + /*
19 + * Advisory only check whether the range is currently backed by VRAM
20 + * memory.
21 + */
22 +
23 + struct drm_gpusvm_range_flags flags = {
24 + /* Pairs with WRITE_ONCE in drm_gpusvm.c */
25 + .__flags = READ_ONCE(range->base.flags.__flags),
26 + };
27 +
28 + return flags.has_devmem_pages;
20 29 }
21 30
22 31 static bool xe_svm_range_has_vram_binding(struct xe_svm_range *range)
··· 654 645 }
655 646
656 647 static bool xe_svm_range_is_valid(struct xe_svm_range *range,
657 - struct xe_tile *tile)
648 + struct xe_tile *tile,
649 + bool devmem_only)
658 650 {
659 - return (range->tile_present & ~range->tile_invalidated) & BIT(tile->id);
651 + /*
652 + * Advisory only check whether the range currently has a valid mapping,
653 + * READ_ONCE pairs with WRITE_ONCE in xe_pt.c
654 + */
655 + return ((READ_ONCE(range->tile_present) &
656 + ~READ_ONCE(range->tile_invalidated)) & BIT(tile->id)) &&
657 + (!devmem_only || xe_svm_range_in_vram(range));
660 658 }
661 659
662 660 static struct xe_vram_region *tile_to_vr(struct xe_tile *tile)
··· 728 712 return err;
729 713 }
730 714
715 + static bool supports_4K_migration(struct xe_device *xe)
716 + {
717 + if (xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K)
718 + return false;
719 +
720 + return true;
721 + }
722 +
723 + static bool xe_svm_range_needs_migrate_to_vram(struct xe_svm_range *range,
724 + struct xe_vma *vma)
725 + {
726 + struct xe_vm *vm = range_to_vm(&range->base);
727 + u64 range_size = xe_svm_range_size(range);
728 +
729 + if (!range->base.flags.migrate_devmem)
730 + return false;
731 +
732 + if (xe_svm_range_in_vram(range)) {
733 + drm_dbg(&vm->xe->drm, "Range is already in VRAM\n");
734 + return false;
735 + }
736 +
737 + if (range_size <= SZ_64K && !supports_4K_migration(vm->xe)) {
738 + drm_dbg(&vm->xe->drm, "Platform doesn't support SZ_4K range migration\n");
739 + return false;
740 + }
741 +
742 + return true;
743 + }
744 +
731 745 /**
732 746 * xe_svm_handle_pagefault() - SVM handle page fault
733 747 * @vm: The VM.
··· 781 735 IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
782 736 .check_pages_threshold = IS_DGFX(vm->xe) &&
783 737 IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? SZ_64K : 0,
738 + .devmem_only = atomic && IS_DGFX(vm->xe) &&
739 + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
740 + .timeslice_ms = atomic && IS_DGFX(vm->xe) &&
741 + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ? 5 : 0,
784 742 };
785 743 struct xe_svm_range *range;
786 744 struct drm_gpusvm_range *r;
787 745 struct drm_exec exec;
788 746 struct dma_fence *fence;
747 + int migrate_try_count = ctx.devmem_only ? 3 : 1;
789 748 ktime_t end = 0;
790 749 int err;
791 750
··· 809 758 if (IS_ERR(r))
810 759 return PTR_ERR(r);
811 760
761 + if (ctx.devmem_only && !r->flags.migrate_devmem)
762 + return -EACCES;
763 +
812 764 range = to_xe_range(r);
813 - if (xe_svm_range_is_valid(range, tile))
765 + if (xe_svm_range_is_valid(range, tile, ctx.devmem_only))
814 766 return 0;
815 767
816 768 range_debug(range, "PAGE FAULT");
817 769
818 - /* XXX: Add migration policy, for now migrate range once */
819 - if (!range->skip_migrate && range->base.flags.migrate_devmem &&
820 - xe_svm_range_size(range) >= SZ_64K) {
821 - range->skip_migrate = true;
822 -
770 + if (--migrate_try_count >= 0 &&
771 + xe_svm_range_needs_migrate_to_vram(range, vma)) {
823 772 err = xe_svm_alloc_vram(vm, tile, range, &ctx);
773 + ctx.timeslice_ms <<= 1; /* Double timeslice if we have to retry */
824 774 if (err) {
825 - drm_dbg(&vm->xe->drm,
826 - "VRAM allocation failed, falling back to "
827 - "retrying fault, asid=%u, errno=%pe\n",
828 - vm->usm.asid, ERR_PTR(err));
829 - goto retry;
775 + if (migrate_try_count || !ctx.devmem_only) {
776 + drm_dbg(&vm->xe->drm,
777 + "VRAM allocation failed, falling back to retrying fault, asid=%u, errno=%pe\n",
778 + vm->usm.asid, ERR_PTR(err));
779 + goto retry;
780 + } else {
781 + drm_err(&vm->xe->drm,
782 + "VRAM allocation failed, retry count exceeded, asid=%u, errno=%pe\n",
783 + vm->usm.asid, ERR_PTR(err));
784 + return err;
785 + }
830 786 }
831 787 }
832 788
··· 841 783 err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r, &ctx);
842 784 /* Corner where CPU mappings have changed */
843 785 if (err == -EOPNOTSUPP || err == -EFAULT || err == -EPERM) {
844 - if (err == -EOPNOTSUPP) {
845 - range_debug(range, "PAGE FAULT - EVICT PAGES");
846 - drm_gpusvm_range_evict(&vm->svm.gpusvm, &range->base);
786 + ctx.timeslice_ms <<= 1; /* Double timeslice if we have to retry */
787 + if (migrate_try_count > 0 || !ctx.devmem_only) {
788 + if (err == -EOPNOTSUPP) {
789 + range_debug(range, "PAGE FAULT - EVICT PAGES");
790 + drm_gpusvm_range_evict(&vm->svm.gpusvm,
791 + &range->base);
792 + }
793 + drm_dbg(&vm->xe->drm,
794 + "Get pages failed, falling back to retrying, asid=%u, gpusvm=%p, errno=%pe\n",
795 + vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err));
796 + range_debug(range, "PAGE FAULT - RETRY PAGES");
797 + goto retry;
798 + } else {
799 + drm_err(&vm->xe->drm,
800 + "Get pages failed, retry count exceeded, asid=%u, gpusvm=%p, errno=%pe\n",
801 + vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err));
847 802 }
848 - drm_dbg(&vm->xe->drm,
849 - "Get pages failed, falling back to retrying, asid=%u, gpusvm=%p, errno=%pe\n",
850 - vm->usm.asid, &vm->svm.gpusvm, ERR_PTR(err));
851 - range_debug(range, "PAGE FAULT - RETRY PAGES");
852 - goto retry;
853 803 }
854 804 if (err) {
855 805 range_debug(range, "PAGE FAULT - FAIL PAGE COLLECT");
··· 881 815 drm_exec_fini(&exec);
882 816 err = PTR_ERR(fence);
883 817 if (err == -EAGAIN) {
818 + ctx.timeslice_ms <<= 1; /* Double timeslice if we have to retry */
884 819 range_debug(range, "PAGE FAULT - RETRY BIND");
885 820 goto retry;
886 821 }
··· 891 824 }
892 825 }
893 826 drm_exec_fini(&exec);
894 -
895 - if (xe_modparam.always_migrate_to_vram)
896 -
range->skip_migrate = false; 897 827 898 828 dma_fence_wait(fence, false); 899 829 dma_fence_put(fence);
-5
drivers/gpu/drm/xe/xe_svm.h
··· 36 36 * range. Protected by GPU SVM notifier lock. 37 37 */ 38 38 u8 tile_invalidated; 39 - /** 40 - * @skip_migrate: Skip migration to VRAM, protected by GPU fault handler 41 - * locking. 42 - */ 43 - u8 skip_migrate :1; 44 39 }; 45 40 46 41 #if IS_ENABLED(CONFIG_DRM_GPUSVM)
+4 -4
drivers/gpu/drm/xe/xe_trace_lrc.h
··· 19 19 #define __dev_name_lrc(lrc) dev_name(gt_to_xe((lrc)->fence_ctx.gt)->drm.dev) 20 20 21 21 TRACE_EVENT(xe_lrc_update_timestamp, 22 - TP_PROTO(struct xe_lrc *lrc, uint32_t old), 22 + TP_PROTO(struct xe_lrc *lrc, uint64_t old), 23 23 TP_ARGS(lrc, old), 24 24 TP_STRUCT__entry( 25 25 __field(struct xe_lrc *, lrc) 26 - __field(u32, old) 27 - __field(u32, new) 26 + __field(u64, old) 27 + __field(u64, new) 28 28 __string(name, lrc->fence_ctx.name) 29 29 __string(device_id, __dev_name_lrc(lrc)) 30 30 ), ··· 36 36 __assign_str(name); 37 37 __assign_str(device_id); 38 38 ), 39 - TP_printk("lrc=:%p lrc->name=%s old=%u new=%u device_id:%s", 39 + TP_printk("lrc=:%p lrc->name=%s old=%llu new=%llu device_id:%s", 40 40 __entry->lrc, __get_str(name), 41 41 __entry->old, __entry->new, 42 42 __get_str(device_id))
+4
drivers/gpu/drm/xe/xe_wa.c
··· 815 815 XE_RTP_RULES(GRAPHICS_VERSION(2001), ENGINE_CLASS(RENDER)), 816 816 XE_RTP_ACTIONS(SET(CHICKEN_RASTER_1, DIS_CLIP_NEGATIVE_BOUNDING_BOX)) 817 817 }, 818 + { XE_RTP_NAME("22021007897"), 819 + XE_RTP_RULES(GRAPHICS_VERSION(2001), ENGINE_CLASS(RENDER)), 820 + XE_RTP_ACTIONS(SET(COMMON_SLICE_CHICKEN4, SBE_PUSH_CONSTANT_BEHIND_FIX_ENABLE)) 821 + }, 818 822 819 823 /* Xe3_LPG */ 820 824 { XE_RTP_NAME("14021490052"),
+8 -4
drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_init.c
··· 83 83 case ALS_IDX: 84 84 privdata->dev_en.is_als_present = false; 85 85 break; 86 + case SRA_IDX: 87 + privdata->dev_en.is_sra_present = false; 88 + break; 86 89 } 87 90 88 91 if (cl_data->sensor_sts[i] == SENSOR_ENABLED) { ··· 137 134 for (i = 0; i < cl_data->num_hid_devices; i++) { 138 135 cl_data->sensor_sts[i] = SENSOR_DISABLED; 139 136 140 - if (cl_data->num_hid_devices == 1 && cl_data->sensor_idx[0] == SRA_IDX) 141 - break; 142 - 143 137 if (cl_data->sensor_idx[i] == SRA_IDX) { 144 138 info.sensor_idx = cl_data->sensor_idx[i]; 145 139 writel(0, privdata->mmio + amd_get_p2c_val(privdata, 0)); ··· 145 145 (privdata, cl_data->sensor_idx[i], ENABLE_SENSOR); 146 146 147 147 cl_data->sensor_sts[i] = (status == 0) ? SENSOR_ENABLED : SENSOR_DISABLED; 148 - if (cl_data->sensor_sts[i] == SENSOR_ENABLED) 148 + if (cl_data->sensor_sts[i] == SENSOR_ENABLED) { 149 + cl_data->is_any_sensor_enabled = true; 149 150 privdata->dev_en.is_sra_present = true; 151 + } 150 152 continue; 151 153 } 152 154 ··· 240 238 cleanup: 241 239 amd_sfh_hid_client_deinit(privdata); 242 240 for (i = 0; i < cl_data->num_hid_devices; i++) { 241 + if (cl_data->sensor_idx[i] == SRA_IDX) 242 + continue; 243 243 devm_kfree(dev, cl_data->feature_report[i]); 244 244 devm_kfree(dev, in_data->input_report[i]); 245 245 devm_kfree(dev, cl_data->report_descr[i]);
+9
drivers/hid/bpf/hid_bpf_dispatch.c
··· 38 38 struct hid_bpf_ops *e; 39 39 int ret; 40 40 41 + if (unlikely(hdev->bpf.destroyed)) 42 + return ERR_PTR(-ENODEV); 43 + 41 44 if (type >= HID_REPORT_TYPES) 42 45 return ERR_PTR(-EINVAL); 43 46 ··· 96 93 struct hid_bpf_ops *e; 97 94 int ret, idx; 98 95 96 + if (unlikely(hdev->bpf.destroyed)) 97 + return -ENODEV; 98 + 99 99 if (rtype >= HID_REPORT_TYPES) 100 100 return -EINVAL; 101 101 ··· 135 129 }; 136 130 struct hid_bpf_ops *e; 137 131 int ret, idx; 132 + 133 + if (unlikely(hdev->bpf.destroyed)) 134 + return -ENODEV; 138 135 139 136 idx = srcu_read_lock(&hdev->bpf.srcu); 140 137 list_for_each_entry_srcu(e, &hdev->bpf.prog_list, list,
+1
drivers/hid/bpf/progs/XPPen__ACK05.bpf.c
··· 157 157 ReportCount(5) // padding 158 158 Input(Const) 159 159 // Byte 4 in report - just exists so we get to be a tablet pad 160 + UsagePage_Digitizers 160 161 Usage_Dig_BarrelSwitch // BTN_STYLUS 161 162 ReportCount(1) 162 163 ReportSize(1)
+4
drivers/hid/hid-ids.h
··· 41 41 #define USB_VENDOR_ID_ACTIONSTAR 0x2101 42 42 #define USB_DEVICE_ID_ACTIONSTAR_1011 0x1011 43 43 44 + #define USB_VENDOR_ID_ADATA_XPG 0x125f 45 + #define USB_VENDOR_ID_ADATA_XPG_WL_GAMING_MOUSE 0x7505 46 + #define USB_VENDOR_ID_ADATA_XPG_WL_GAMING_MOUSE_DONGLE 0x7506 47 + 44 48 #define USB_VENDOR_ID_ADS_TECH 0x06e1 45 49 #define USB_DEVICE_ID_ADS_TECH_RADIO_SI470X 0xa155 46 50
+2
drivers/hid/hid-quirks.c
··· 27 27 static const struct hid_device_id hid_quirks[] = { 28 28 { HID_USB_DEVICE(USB_VENDOR_ID_AASHIMA, USB_DEVICE_ID_AASHIMA_GAMEPAD), HID_QUIRK_BADPAD }, 29 29 { HID_USB_DEVICE(USB_VENDOR_ID_AASHIMA, USB_DEVICE_ID_AASHIMA_PREDATOR), HID_QUIRK_BADPAD }, 30 + { HID_USB_DEVICE(USB_VENDOR_ID_ADATA_XPG, USB_VENDOR_ID_ADATA_XPG_WL_GAMING_MOUSE), HID_QUIRK_ALWAYS_POLL }, 31 + { HID_USB_DEVICE(USB_VENDOR_ID_ADATA_XPG, USB_VENDOR_ID_ADATA_XPG_WL_GAMING_MOUSE_DONGLE), HID_QUIRK_ALWAYS_POLL }, 30 32 { HID_USB_DEVICE(USB_VENDOR_ID_AFATECH, USB_DEVICE_ID_AFATECH_AF9016), HID_QUIRK_FULLSPEED_INTERVAL }, 31 33 { HID_USB_DEVICE(USB_VENDOR_ID_AIREN, USB_DEVICE_ID_AIREN_SLIMPLUS), HID_QUIRK_NOGET }, 32 34 { HID_USB_DEVICE(USB_VENDOR_ID_AKAI_09E8, USB_DEVICE_ID_AKAI_09E8_MIDIMIX), HID_QUIRK_NO_INIT_REPORTS },
-2
drivers/hid/hid-steam.c
··· 1150 1150 struct steam_device *steam = hdev->driver_data; 1151 1151 1152 1152 unsigned long flags; 1153 - bool connected; 1154 1153 1155 1154 spin_lock_irqsave(&steam->lock, flags); 1156 1155 steam->client_opened--; 1157 - connected = steam->connected && !steam->client_opened; 1158 1156 spin_unlock_irqrestore(&steam->lock, flags); 1159 1157 1160 1158 schedule_work(&steam->unregister_work);
+1
drivers/hid/hid-thrustmaster.c
··· 174 174 u8 ep_addr[2] = {b_ep, 0}; 175 175 176 176 if (!usb_check_int_endpoints(usbif, ep_addr)) { 177 + kfree(send_buf); 177 178 hid_err(hdev, "Unexpected non-int endpoint\n"); 178 179 return; 179 180 }
+4 -3
drivers/hid/hid-uclogic-core.c
··· 142 142 suffix = "System Control"; 143 143 break; 144 144 } 145 - } 146 - 147 - if (suffix) 145 + } else { 148 146 hi->input->name = devm_kasprintf(&hdev->dev, GFP_KERNEL, 149 147 "%s %s", hdev->name, suffix); 148 + if (!hi->input->name) 149 + return -ENOMEM; 150 + } 150 151 151 152 return 0; 152 153 }
+10 -1
drivers/hid/wacom_sys.c
··· 70 70 { 71 71 while (!kfifo_is_empty(fifo)) { 72 72 int size = kfifo_peek_len(fifo); 73 - u8 *buf = kzalloc(size, GFP_KERNEL); 73 + u8 *buf; 74 74 unsigned int count; 75 75 int err; 76 + 77 + buf = kzalloc(size, GFP_KERNEL); 78 + if (!buf) { 79 + kfifo_skip(fifo); 80 + continue; 81 + } 76 82 77 83 count = kfifo_out(fifo, buf, size); 78 84 if (count != size) { ··· 87 81 // to flush seems reasonable enough, however. 88 82 hid_warn(hdev, "%s: removed fifo entry with unexpected size\n", 89 83 __func__); 84 + kfree(buf); 90 85 continue; 91 86 } 92 87 err = hid_report_raw_event(hdev, HID_INPUT_REPORT, buf, size, false); ··· 2368 2361 unsigned int connect_mask = HID_CONNECT_HIDRAW; 2369 2362 2370 2363 features->pktlen = wacom_compute_pktlen(hdev); 2364 + if (!features->pktlen) 2365 + return -ENODEV; 2371 2366 2372 2367 if (!devres_open_group(&hdev->dev, wacom, GFP_KERNEL)) 2373 2368 return -ENOMEM;
+3 -62
drivers/hv/channel.c
··· 1077 1077 EXPORT_SYMBOL(vmbus_sendpacket); 1078 1078 1079 1079 /* 1080 - * vmbus_sendpacket_pagebuffer - Send a range of single-page buffer 1081 - * packets using a GPADL Direct packet type. This interface allows you 1082 - * to control notifying the host. This will be useful for sending 1083 - * batched data. Also the sender can control the send flags 1084 - * explicitly. 1085 - */ 1086 - int vmbus_sendpacket_pagebuffer(struct vmbus_channel *channel, 1087 - struct hv_page_buffer pagebuffers[], 1088 - u32 pagecount, void *buffer, u32 bufferlen, 1089 - u64 requestid) 1090 - { 1091 - int i; 1092 - struct vmbus_channel_packet_page_buffer desc; 1093 - u32 descsize; 1094 - u32 packetlen; 1095 - u32 packetlen_aligned; 1096 - struct kvec bufferlist[3]; 1097 - u64 aligned_data = 0; 1098 - 1099 - if (pagecount > MAX_PAGE_BUFFER_COUNT) 1100 - return -EINVAL; 1101 - 1102 - /* 1103 - * Adjust the size down since vmbus_channel_packet_page_buffer is the 1104 - * largest size we support 1105 - */ 1106 - descsize = sizeof(struct vmbus_channel_packet_page_buffer) - 1107 - ((MAX_PAGE_BUFFER_COUNT - pagecount) * 1108 - sizeof(struct hv_page_buffer)); 1109 - packetlen = descsize + bufferlen; 1110 - packetlen_aligned = ALIGN(packetlen, sizeof(u64)); 1111 - 1112 - /* Setup the descriptor */ 1113 - desc.type = VM_PKT_DATA_USING_GPA_DIRECT; 1114 - desc.flags = VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED; 1115 - desc.dataoffset8 = descsize >> 3; /* in 8-bytes granularity */ 1116 - desc.length8 = (u16)(packetlen_aligned >> 3); 1117 - desc.transactionid = VMBUS_RQST_ERROR; /* will be updated in hv_ringbuffer_write() */ 1118 - desc.reserved = 0; 1119 - desc.rangecount = pagecount; 1120 - 1121 - for (i = 0; i < pagecount; i++) { 1122 - desc.range[i].len = pagebuffers[i].len; 1123 - desc.range[i].offset = pagebuffers[i].offset; 1124 - desc.range[i].pfn = pagebuffers[i].pfn; 1125 - } 1126 - 1127 - bufferlist[0].iov_base = &desc; 1128 - bufferlist[0].iov_len = descsize; 1129 - bufferlist[1].iov_base = buffer; 1130 - bufferlist[1].iov_len = bufferlen; 1131 - bufferlist[2].iov_base = &aligned_data; 1132 - bufferlist[2].iov_len = (packetlen_aligned - packetlen); 1133 - 1134 - return hv_ringbuffer_write(channel, bufferlist, 3, requestid, NULL); 1135 - } 1136 - EXPORT_SYMBOL_GPL(vmbus_sendpacket_pagebuffer); 1137 - 1138 - /* 1139 - * vmbus_sendpacket_multipagebuffer - Send a multi-page buffer packet 1080 + * vmbus_sendpacket_mpb_desc - Send one or more multi-page buffer packets 1140 1081 * using a GPADL Direct packet type. 1141 - * The buffer includes the vmbus descriptor. 1082 + * The desc argument must include space for the VMBus descriptor. The 1083 + * rangecount field must already be set. 1142 1084 */ 1143 1085 int vmbus_sendpacket_mpb_desc(struct vmbus_channel *channel, 1144 1086 struct vmbus_packet_mpb_array *desc,
··· 1102 1160 desc->length8 = (u16)(packetlen_aligned >> 3); 1103 1161 desc->transactionid = VMBUS_RQST_ERROR; /* will be updated in hv_ringbuffer_write() */ 1104 1162 desc->reserved = 0; 1105 - desc->rangecount = 1; 1106 1163 1107 1164 bufferlist[0].iov_base = desc; 1108 1165 bufferlist[0].iov_len = desc_size;
+3 -1
drivers/i2c/busses/i2c-designware-pcidrv.c
··· 278 278 279 279 if ((dev->flags & MODEL_MASK) == MODEL_AMD_NAVI_GPU) { 280 280 dev->slave = i2c_new_ccgx_ucsi(&dev->adapter, dev->irq, &dgpu_node); 281 - if (IS_ERR(dev->slave)) 281 + if (IS_ERR(dev->slave)) { 282 + i2c_del_adapter(&dev->adapter); 282 283 return dev_err_probe(device, PTR_ERR(dev->slave), 283 284 "register UCSI failed\n"); 285 + } 284 286 } 285 287 286 288 pm_runtime_set_autosuspend_delay(device, 1000);
+4 -2
drivers/infiniband/core/device.c
··· 1352 1352 1353 1353 down_read(&devices_rwsem); 1354 1354 1355 + /* Mark for userspace that device is ready */ 1356 + kobject_uevent(&device->dev.kobj, KOBJ_ADD); 1357 + 1355 1358 ret = rdma_nl_notify_event(device, 0, RDMA_REGISTER_EVENT); 1356 1359 if (ret) 1357 1360 goto out; ··· 1471 1468 return ret; 1472 1469 } 1473 1470 dev_set_uevent_suppress(&device->dev, false); 1474 - /* Mark for userspace that device is ready */ 1475 - kobject_uevent(&device->dev.kobj, KOBJ_ADD); 1476 1471 1477 1472 ib_device_notify_register(device); 1473 + 1478 1474 ib_device_put(device); 1479 1475 1480 1476 return 0;
+3 -1
drivers/infiniband/hw/irdma/main.c
··· 221 221 break; 222 222 223 223 if (i < IRDMA_MIN_MSIX) { 224 - for (; i > 0; i--) 224 + while (--i >= 0) 225 225 ice_free_rdma_qvector(pf, &rf->msix_entries[i]); 226 226 227 227 kfree(rf->msix_entries); ··· 254 254 irdma_ib_unregister_device(iwdev); 255 255 ice_rdma_update_vsi_filter(pf, iwdev->vsi_num, false); 256 256 irdma_deinit_interrupts(iwdev->rf, pf); 257 + 258 + kfree(iwdev->rf); 257 259 258 260 pr_debug("INIT: Gen2 PF[%d] device remove success\n", PCI_FUNC(pf->pdev->devfn)); 259 261 }
-1
drivers/infiniband/hw/irdma/verbs.c
··· 4871 4871 4872 4872 irdma_rt_deinit_hw(iwdev); 4873 4873 irdma_ctrl_deinit_hw(iwdev->rf); 4874 - kfree(iwdev->rf); 4875 4874 }
+1 -4
drivers/infiniband/sw/rxe/rxe_cq.c
··· 56 56 57 57 err = do_mmap_info(rxe, uresp ? &uresp->mi : NULL, udata, 58 58 cq->queue->buf, cq->queue->buf_size, &cq->queue->ip); 59 - if (err) { 60 - vfree(cq->queue->buf); 61 - kfree(cq->queue); 59 + if (err) 62 60 return err; 63 - } 64 61 65 62 cq->is_user = uresp; 66 63
+1 -1
drivers/irqchip/irq-gic-v2m.c
··· 252 252 static struct msi_parent_ops gicv2m_msi_parent_ops = { 253 253 .supported_flags = GICV2M_MSI_FLAGS_SUPPORTED, 254 254 .required_flags = GICV2M_MSI_FLAGS_REQUIRED, 255 - .chip_flags = MSI_CHIP_FLAG_SET_EOI | MSI_CHIP_FLAG_SET_ACK, 255 + .chip_flags = MSI_CHIP_FLAG_SET_EOI, 256 256 .bus_select_token = DOMAIN_BUS_NEXUS, 257 257 .bus_select_mask = MATCH_PCI_MSI | MATCH_PLATFORM_MSI, 258 258 .prefix = "GICv2m-",
+1 -1
drivers/irqchip/irq-gic-v3-its-msi-parent.c
··· 203 203 const struct msi_parent_ops gic_v3_its_msi_parent_ops = { 204 204 .supported_flags = ITS_MSI_FLAGS_SUPPORTED, 205 205 .required_flags = ITS_MSI_FLAGS_REQUIRED, 206 - .chip_flags = MSI_CHIP_FLAG_SET_EOI | MSI_CHIP_FLAG_SET_ACK, 206 + .chip_flags = MSI_CHIP_FLAG_SET_EOI, 207 207 .bus_select_token = DOMAIN_BUS_NEXUS, 208 208 .bus_select_mask = MATCH_PCI_MSI | MATCH_PLATFORM_MSI, 209 209 .prefix = "ITS-",
+1 -1
drivers/irqchip/irq-gic-v3-mbi.c
··· 197 197 static const struct msi_parent_ops gic_v3_mbi_msi_parent_ops = { 198 198 .supported_flags = MBI_MSI_FLAGS_SUPPORTED, 199 199 .required_flags = MBI_MSI_FLAGS_REQUIRED, 200 - .chip_flags = MSI_CHIP_FLAG_SET_EOI | MSI_CHIP_FLAG_SET_ACK, 200 + .chip_flags = MSI_CHIP_FLAG_SET_EOI, 201 201 .bus_select_token = DOMAIN_BUS_NEXUS, 202 202 .bus_select_mask = MATCH_PCI_MSI | MATCH_PLATFORM_MSI, 203 203 .prefix = "MBI-",
+1 -1
drivers/irqchip/irq-mvebu-gicp.c
··· 161 161 static const struct msi_parent_ops gicp_msi_parent_ops = { 162 162 .supported_flags = GICP_MSI_FLAGS_SUPPORTED, 163 163 .required_flags = GICP_MSI_FLAGS_REQUIRED, 164 - .chip_flags = MSI_CHIP_FLAG_SET_EOI | MSI_CHIP_FLAG_SET_ACK, 164 + .chip_flags = MSI_CHIP_FLAG_SET_EOI, 165 165 .bus_select_token = DOMAIN_BUS_GENERIC_MSI, 166 166 .bus_select_mask = MATCH_PLATFORM_MSI, 167 167 .prefix = "GICP-",
+1 -1
drivers/irqchip/irq-mvebu-odmi.c
··· 157 157 static const struct msi_parent_ops odmi_msi_parent_ops = { 158 158 .supported_flags = ODMI_MSI_FLAGS_SUPPORTED, 159 159 .required_flags = ODMI_MSI_FLAGS_REQUIRED, 160 - .chip_flags = MSI_CHIP_FLAG_SET_EOI | MSI_CHIP_FLAG_SET_ACK, 160 + .chip_flags = MSI_CHIP_FLAG_SET_EOI, 161 161 .bus_select_token = DOMAIN_BUS_GENERIC_MSI, 162 162 .bus_select_mask = MATCH_PLATFORM_MSI, 163 163 .prefix = "ODMI-",
+5 -5
drivers/irqchip/irq-riscv-imsic-state.c
··· 208 208 } 209 209 210 210 #ifdef CONFIG_SMP 211 - static void __imsic_local_timer_start(struct imsic_local_priv *lpriv) 211 + static void __imsic_local_timer_start(struct imsic_local_priv *lpriv, unsigned int cpu) 212 212 { 213 213 lockdep_assert_held(&lpriv->lock); 214 214 215 215 if (!timer_pending(&lpriv->timer)) { 216 216 lpriv->timer.expires = jiffies + 1; 217 - add_timer_on(&lpriv->timer, smp_processor_id()); 217 + add_timer_on(&lpriv->timer, cpu); 218 218 } 219 219 } 220 220 #else 221 - static inline void __imsic_local_timer_start(struct imsic_local_priv *lpriv) 221 + static inline void __imsic_local_timer_start(struct imsic_local_priv *lpriv, unsigned int cpu) 222 222 { 223 223 } 224 224 #endif ··· 233 233 if (force_all) 234 234 bitmap_fill(lpriv->dirty_bitmap, imsic->global.nr_ids + 1); 235 235 if (!__imsic_local_sync(lpriv)) 236 - __imsic_local_timer_start(lpriv); 236 + __imsic_local_timer_start(lpriv, smp_processor_id()); 237 237 238 238 raw_spin_unlock_irqrestore(&lpriv->lock, flags); 239 239 } ··· 278 278 return; 279 279 } 280 280 281 - __imsic_local_timer_start(lpriv); 281 + __imsic_local_timer_start(lpriv, cpu); 282 282 } 283 283 } 284 284 #else
+33
drivers/net/dsa/b53/b53_common.c
··· 326 326 } 327 327 } 328 328 329 + static void b53_set_eap_mode(struct b53_device *dev, int port, int mode) 330 + { 331 + u64 eap_conf; 332 + 333 + if (is5325(dev) || is5365(dev) || dev->chip_id == BCM5389_DEVICE_ID) 334 + return; 335 + 336 + b53_read64(dev, B53_EAP_PAGE, B53_PORT_EAP_CONF(port), &eap_conf); 337 + 338 + if (is63xx(dev)) { 339 + eap_conf &= ~EAP_MODE_MASK_63XX; 340 + eap_conf |= (u64)mode << EAP_MODE_SHIFT_63XX; 341 + } else { 342 + eap_conf &= ~EAP_MODE_MASK; 343 + eap_conf |= (u64)mode << EAP_MODE_SHIFT; 344 + } 345 + 346 + b53_write64(dev, B53_EAP_PAGE, B53_PORT_EAP_CONF(port), eap_conf); 347 + } 348 + 329 349 static void b53_set_forwarding(struct b53_device *dev, int enable) 330 350 { 331 351 u8 mgmt; ··· 605 585 b53_port_set_ucast_flood(dev, port, true); 606 586 b53_port_set_mcast_flood(dev, port, true); 607 587 b53_port_set_learning(dev, port, false); 588 + 589 + /* Force all traffic to go to the CPU port to prevent the ASIC from 590 + * trying to forward to bridged ports on matching FDB entries, then 591 + * dropping frames because it isn't allowed to forward there. 592 + */ 593 + if (dsa_is_user_port(ds, port)) 594 + b53_set_eap_mode(dev, port, EAP_MODE_SIMPLIFIED); 608 595 609 596 return 0; 610 597 } ··· 2069 2042 pvlan |= BIT(i); 2070 2043 } 2071 2044 2045 + /* Disable redirection of unknown SA to the CPU port */ 2046 + b53_set_eap_mode(dev, port, EAP_MODE_BASIC); 2047 + 2072 2048 /* Configure the local port VLAN control membership to include 2073 2049 * remote ports and update the local port bitmask 2074 2050 */ ··· 2106 2076 if (port != i) 2107 2077 pvlan &= ~BIT(i); 2108 2078 } 2079 + 2080 + /* Enable redirection of unknown SA to the CPU port */ 2081 + b53_set_eap_mode(dev, port, EAP_MODE_SIMPLIFIED); 2109 2082 2110 2083 b53_write16(dev, B53_PVLAN_PAGE, B53_PVLAN_PORT_MASK(port), pvlan); 2111 2084 dev->ports[port].vlan_ctl_mask = pvlan;
+14
drivers/net/dsa/b53/b53_regs.h
··· 50 50 /* Jumbo Frame Registers */ 51 51 #define B53_JUMBO_PAGE 0x40 52 52 53 + /* EAP Registers */ 54 + #define B53_EAP_PAGE 0x42 55 + 53 56 /* EEE Control Registers Page */ 54 57 #define B53_EEE_PAGE 0x92 55 58 ··· 482 479 #define B53_JUMBO_MAX_SIZE_63XX 0x08 483 480 #define JMS_MIN_SIZE 1518 484 481 #define JMS_MAX_SIZE 9724 482 + 483 + /************************************************************************* 484 + * EAP Page Registers 485 + *************************************************************************/ 486 + #define B53_PORT_EAP_CONF(i) (0x20 + 8 * (i)) 487 + #define EAP_MODE_SHIFT 51 488 + #define EAP_MODE_SHIFT_63XX 50 489 + #define EAP_MODE_MASK (0x3ull << EAP_MODE_SHIFT) 490 + #define EAP_MODE_MASK_63XX (0x3ull << EAP_MODE_SHIFT_63XX) 491 + #define EAP_MODE_BASIC 0 492 + #define EAP_MODE_SIMPLIFIED 3 485 493 486 494 /************************************************************************* 487 495 * EEE Configuration Page Registers
+108 -29
drivers/net/dsa/microchip/ksz_common.c
··· 265 265 unsigned int mode, 266 266 phy_interface_t interface); 267 267 268 + /** 269 + * ksz_phylink_mac_disable_tx_lpi() - Callback to signal LPI support (Dummy) 270 + * @config: phylink config structure 271 + * 272 + * This function is a dummy handler. See ksz_phylink_mac_enable_tx_lpi() for 273 + * a detailed explanation of EEE/LPI handling in KSZ switches. 274 + */ 275 + static void ksz_phylink_mac_disable_tx_lpi(struct phylink_config *config) 276 + { 277 + } 278 + 279 + /** 280 + * ksz_phylink_mac_enable_tx_lpi() - Callback to signal LPI support (Dummy) 281 + * @config: phylink config structure 282 + * @timer: timer value before entering LPI (unused) 283 + * @tx_clock_stop: whether to stop the TX clock in LPI mode (unused) 284 + * 285 + * This function signals to phylink that the driver architecture supports 286 + * LPI management, enabling phylink to control EEE advertisement during 287 + * negotiation according to IEEE Std 802.3 (Clause 78). 288 + * 289 + * Hardware Management of EEE/LPI State: 290 + * For KSZ switch ports with integrated PHYs (e.g., KSZ9893R ports 1-2), 291 + * observation and testing suggest that the actual EEE / Low Power Idle (LPI) 292 + * state transitions are managed autonomously by the hardware based on 293 + * the auto-negotiation results. (Note: While the datasheet describes EEE 294 + * operation based on negotiation, it doesn't explicitly detail the internal 295 + * MAC/PHY interaction, so autonomous hardware management of the MAC state 296 + * for LPI is inferred from observed behavior). 297 + * This hardware control, consistent with the switch's ability to operate 298 + * autonomously via strapping, means MAC-level software intervention is not 299 + * required or exposed for managing the LPI state once EEE is negotiated. 300 + * (Ref: KSZ9893R Data Sheet DS00002420D, primarily Section 4.7.5 explaining 301 + * EEE, also Sections 4.1.7 on Auto-Negotiation and 3.2.1 on Configuration 302 + * Straps). 
303 + * 304 + * Additionally, ports configured as MAC interfaces (e.g., KSZ9893R port 3) 305 + * lack documented MAC-level LPI control. 306 + * 307 + * Therefore, this callback performs no action and serves primarily to inform 308 + * phylink of LPI awareness and to document the inferred hardware behavior. 309 + * 310 + * Returns: 0 (Always success) 311 + */ 312 + static int ksz_phylink_mac_enable_tx_lpi(struct phylink_config *config, 313 + u32 timer, bool tx_clock_stop) 314 + { 315 + return 0; 316 + } 317 + 268 318 static const struct phylink_mac_ops ksz88x3_phylink_mac_ops = { 269 319 .mac_config = ksz88x3_phylink_mac_config, 270 320 .mac_link_down = ksz_phylink_mac_link_down, 271 321 .mac_link_up = ksz8_phylink_mac_link_up, 322 + .mac_disable_tx_lpi = ksz_phylink_mac_disable_tx_lpi, 323 + .mac_enable_tx_lpi = ksz_phylink_mac_enable_tx_lpi, 272 324 }; 273 325 274 326 static const struct phylink_mac_ops ksz8_phylink_mac_ops = { 275 327 .mac_config = ksz_phylink_mac_config, 276 328 .mac_link_down = ksz_phylink_mac_link_down, 277 329 .mac_link_up = ksz8_phylink_mac_link_up, 330 + .mac_disable_tx_lpi = ksz_phylink_mac_disable_tx_lpi, 331 + .mac_enable_tx_lpi = ksz_phylink_mac_enable_tx_lpi, 278 332 }; 279 333 280 334 static const struct ksz_dev_ops ksz88xx_dev_ops = { ··· 412 358 .mac_config = ksz_phylink_mac_config, 413 359 .mac_link_down = ksz_phylink_mac_link_down, 414 360 .mac_link_up = ksz9477_phylink_mac_link_up, 361 + .mac_disable_tx_lpi = ksz_phylink_mac_disable_tx_lpi, 362 + .mac_enable_tx_lpi = ksz_phylink_mac_enable_tx_lpi, 415 363 }; 416 364 417 365 static const struct ksz_dev_ops ksz9477_dev_ops = { ··· 457 401 .mac_config = ksz_phylink_mac_config, 458 402 .mac_link_down = ksz_phylink_mac_link_down, 459 403 .mac_link_up = ksz9477_phylink_mac_link_up, 404 + .mac_disable_tx_lpi = ksz_phylink_mac_disable_tx_lpi, 405 + .mac_enable_tx_lpi = ksz_phylink_mac_enable_tx_lpi, 460 406 }; 461 407 462 408 static const struct ksz_dev_ops lan937x_dev_ops = { ··· 2074 
2016 2075 2017 if (dev->dev_ops->get_caps) 2076 2018 dev->dev_ops->get_caps(dev, port, config); 2019 + 2020 + if (ds->ops->support_eee && ds->ops->support_eee(ds, port)) { 2021 + memcpy(config->lpi_interfaces, config->supported_interfaces, 2022 + sizeof(config->lpi_interfaces)); 2023 + 2024 + config->lpi_capabilities = MAC_100FD; 2025 + if (dev->info->gbit_capable[port]) 2026 + config->lpi_capabilities |= MAC_1000FD; 2027 + 2028 + /* EEE is fully operational */ 2029 + config->eee_enabled_default = true; 2030 + } 2077 2031 } 2078 2032 2079 2033 void ksz_r_mib_stats64(struct ksz_device *dev, int port) ··· 3078 3008 if (!port) 3079 3009 return MICREL_KSZ8_P1_ERRATA; 3080 3010 break; 3081 - case KSZ8567_CHIP_ID: 3082 - /* KSZ8567R Errata DS80000752C Module 4 */ 3083 - case KSZ8765_CHIP_ID: 3084 - case KSZ8794_CHIP_ID: 3085 - case KSZ8795_CHIP_ID: 3086 - /* KSZ879x/KSZ877x/KSZ876x Errata DS80000687C Module 2 */ 3087 - case KSZ9477_CHIP_ID: 3088 - /* KSZ9477S Errata DS80000754A Module 4 */ 3089 - case KSZ9567_CHIP_ID: 3090 - /* KSZ9567S Errata DS80000756A Module 4 */ 3091 - case KSZ9896_CHIP_ID: 3092 - /* KSZ9896C Errata DS80000757A Module 3 */ 3093 - case KSZ9897_CHIP_ID: 3094 - case LAN9646_CHIP_ID: 3095 - /* KSZ9897R Errata DS80000758C Module 4 */ 3096 - /* Energy Efficient Ethernet (EEE) feature select must be manually disabled 3097 - * The EEE feature is enabled by default, but it is not fully 3098 - * operational. It must be manually disabled through register 3099 - * controls. If not disabled, the PHY ports can auto-negotiate 3100 - * to enable EEE, and this feature can cause link drops when 3101 - * linked to another device supporting EEE. 3102 - * 3103 - * The same item appears in the errata for all switches above. 
3104 - */ 3105 - return MICREL_NO_EEE; 3106 3011 } 3107 3012 3108 3013 return 0;
··· 3511 3466 return -EOPNOTSUPP; 3512 3467 } 3513 3468 3469 + /** 3470 + * ksz_support_eee - Determine Energy Efficient Ethernet (EEE) support for a 3471 + * port 3472 + * @ds: Pointer to the DSA switch structure 3473 + * @port: Port number to check 3474 + * 3475 + * This function also documents devices where EEE was initially advertised but 3476 + * later withdrawn due to reliability issues, as described in official errata 3477 + * documents. These devices are explicitly listed to record known limitations, 3478 + * even if there is no technical necessity for runtime checks. 3479 + * 3480 + * Returns: true if the internal PHY on the given port supports fully 3481 + * operational EEE, false otherwise. 3482 + */ 3514 3483 static bool ksz_support_eee(struct dsa_switch *ds, int port) 3515 3484 { 3516 3485 struct ksz_device *dev = ds->priv;
··· 3534 3475 3535 3476 switch (dev->chip_id) { 3536 3477 case KSZ8563_CHIP_ID: 3537 - case KSZ8567_CHIP_ID: 3538 - case KSZ9477_CHIP_ID: 3539 3478 case KSZ9563_CHIP_ID: 3540 - case KSZ9567_CHIP_ID: 3541 3479 case KSZ9893_CHIP_ID: 3480 + return true; 3481 + case KSZ8567_CHIP_ID: 3482 + /* KSZ8567R Errata DS80000752C Module 4 */ 3483 + case KSZ8765_CHIP_ID: 3484 + case KSZ8794_CHIP_ID: 3485 + case KSZ8795_CHIP_ID: 3486 + /* KSZ879x/KSZ877x/KSZ876x Errata DS80000687C Module 2 */ 3487 + case KSZ9477_CHIP_ID: 3488 + /* KSZ9477S Errata DS80000754A Module 4 */ 3489 + case KSZ9567_CHIP_ID: 3490 + /* KSZ9567S Errata DS80000756A Module 4 */ 3542 3491 case KSZ9896_CHIP_ID: 3492 + /* KSZ9896C Errata DS80000757A Module 3 */ 3543 3493 case KSZ9897_CHIP_ID: 3544 3494 case LAN9646_CHIP_ID: 3545 - return true; 3495 + /* KSZ9897R Errata DS80000758C Module 4 */ 3496 + /* Energy Efficient Ethernet (EEE) feature select must be 3497 + * manually disabled 3498 + * The EEE feature is enabled by default, but it is not fully 3499 + * operational. It must be manually disabled through register 3500 + * controls. If not disabled, the PHY ports can auto-negotiate 3501 + * to enable EEE, and this feature can cause link drops when 3502 + * linked to another device supporting EEE. 3503 + * 3504 + * The same item appears in the errata for all switches above. 3505 + */ 3506 + break; 3546 3507 } 3547 3508 3548 3509 return false;
+1 -5
drivers/net/dsa/sja1105/sja1105_main.c
··· 2081 2081 switch (state) { 2082 2082 case BR_STATE_DISABLED: 2083 2083 case BR_STATE_BLOCKING: 2084 + case BR_STATE_LISTENING: 2084 2085 /* From UM10944 description of DRPDTAG (why put this there?): 2085 2086 * "Management traffic flows to the port regardless of the state 2086 2087 * of the INGRESS flag". So BPDUs are still be allowed to pass. 2087 2088 * At the moment no difference between DISABLED and BLOCKING. 2088 2089 */ 2089 2090 mac[port].ingress = false; 2090 - mac[port].egress = false; 2091 - mac[port].dyn_learn = false; 2092 - break; 2093 - case BR_STATE_LISTENING: 2094 - mac[port].ingress = true; 2095 2091 mac[port].egress = false; 2096 2092 mac[port].dyn_learn = false; 2097 2093 break;
+29 -7
drivers/net/ethernet/broadcom/bnxt/bnxt.c
··· 14013 14013 netdev_unlock(bp->dev); 14014 14014 } 14015 14015 14016 + /* Same as bnxt_lock_sp() with additional rtnl_lock */ 14017 + static void bnxt_rtnl_lock_sp(struct bnxt *bp) 14018 + { 14019 + clear_bit(BNXT_STATE_IN_SP_TASK, &bp->state); 14020 + rtnl_lock(); 14021 + netdev_lock(bp->dev); 14022 + } 14023 + 14024 + static void bnxt_rtnl_unlock_sp(struct bnxt *bp) 14025 + { 14026 + set_bit(BNXT_STATE_IN_SP_TASK, &bp->state); 14027 + netdev_unlock(bp->dev); 14028 + rtnl_unlock(); 14029 + } 14030 + 14016 14031 /* Only called from bnxt_sp_task() */ 14017 14032 static void bnxt_reset(struct bnxt *bp, bool silent) 14018 14033 { 14019 - bnxt_lock_sp(bp); 14034 + bnxt_rtnl_lock_sp(bp); 14020 14035 if (test_bit(BNXT_STATE_OPEN, &bp->state)) 14021 14036 bnxt_reset_task(bp, silent); 14022 - bnxt_unlock_sp(bp); 14037 + bnxt_rtnl_unlock_sp(bp); 14023 14038 } 14024 14039 14025 14040 /* Only called from bnxt_sp_task() */
··· 14042 14027 { 14043 14028 int i; 14044 14029 14045 - bnxt_lock_sp(bp); 14030 + bnxt_rtnl_lock_sp(bp); 14046 14031 if (!test_bit(BNXT_STATE_OPEN, &bp->state)) { 14047 - bnxt_unlock_sp(bp); 14032 + bnxt_rtnl_unlock_sp(bp); 14048 14033 return; 14049 14034 } 14050 14035 /* Disable and flush TPA before resetting the RX ring */
··· 14083 14068 } 14084 14069 if (bp->flags & BNXT_FLAG_TPA) 14085 14070 bnxt_set_tpa(bp, true); 14086 - bnxt_unlock_sp(bp); 14071 + bnxt_rtnl_unlock_sp(bp); 14087 14072 } 14088 14073 14089 14074 static void bnxt_fw_fatal_close(struct bnxt *bp)
··· 14975 14960 bp->fw_reset_state = BNXT_FW_RESET_STATE_OPENING; 14976 14961 fallthrough; 14977 14962 case BNXT_FW_RESET_STATE_OPENING: 14978 - while (!netdev_trylock(bp->dev)) { 14963 + while (!rtnl_trylock()) { 14979 14964 bnxt_queue_fw_reset_work(bp, HZ / 10); 14980 14965 return; 14981 14966 } 14967 + netdev_lock(bp->dev); 14982 14968 rc = bnxt_open(bp->dev); 14983 14969 if (rc) { 14984 14970 netdev_err(bp->dev, "bnxt_open() failed during FW reset\n"); 14985 14971 bnxt_fw_reset_abort(bp, rc); 14986 14972 netdev_unlock(bp->dev); 14973 + rtnl_unlock(); 14987 14974 goto ulp_start; 14988 14975 }
··· 15005 14988 bnxt_dl_health_fw_status_update(bp, true); 15006 14989 } 15007 14990 netdev_unlock(bp->dev); 14991 + rtnl_unlock(); 15008 14992 bnxt_ulp_start(bp, 0); 15009 14993 bnxt_reenable_sriov(bp); 15010 14994 netdev_lock(bp->dev);
··· 15954 15936 rc); 15955 15937 napi_enable_locked(&bnapi->napi); 15956 15938 bnxt_db_nq_arm(bp, &cpr->cp_db, cpr->cp_raw_cons); 15957 - bnxt_reset_task(bp, true); 15939 + netif_close(dev); 15958 15940 return rc; 15959 15941 }
··· 16770 16752 struct bnxt *bp = netdev_priv(dev); 16771 16753 int rc = 0; 16772 16754 16755 + rtnl_lock(); 16773 16756 netdev_lock(dev); 16774 16757 rc = pci_enable_device(bp->pdev); 16775 16758 if (rc) {
··· 16815 16796 16816 16797 resume_exit: 16817 16798 netdev_unlock(bp->dev); 16799 + rtnl_unlock(); 16818 16800 bnxt_ulp_start(bp, rc); 16819 16801 if (!rc) 16820 16802 bnxt_reenable_sriov(bp);
··· 16981 16961 int err; 16982 16962 16983 16963 netdev_info(bp->dev, "PCI Slot Resume\n"); 16964 + rtnl_lock(); 16984 16965 netdev_lock(netdev); 16985 16966 16986 16967 err = bnxt_hwrm_func_qcaps(bp);
··· 16999 16978 netif_device_attach(netdev); 17000 16979 17001 16980 netdev_unlock(netdev); 16981 + rtnl_unlock(); 17002 16982 bnxt_ulp_start(bp, err); 17003 16983 if (!err) 17004 16984 bnxt_reenable_sriov(bp);
+6 -13
drivers/net/ethernet/cadence/macb_main.c
··· 997 997 998 998 static int macb_halt_tx(struct macb *bp) 999 999 { 1000 - unsigned long halt_time, timeout; 1001 - u32 status; 1000 + u32 status; 1002 1001 1003 1002 macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(THALT)); 1004 1003 1005 - timeout = jiffies + usecs_to_jiffies(MACB_HALT_TIMEOUT); 1006 - do { 1007 - halt_time = jiffies; 1008 - status = macb_readl(bp, TSR); 1009 - if (!(status & MACB_BIT(TGO))) 1010 - return 0; 1011 - 1012 - udelay(250); 1013 - } while (time_before(halt_time, timeout)); 1014 - 1015 - return -ETIMEDOUT; 1004 + /* Poll TSR until TGO is cleared or timeout. */ 1005 + return read_poll_timeout_atomic(macb_readl, status, 1006 + !(status & MACB_BIT(TGO)), 1007 + 250, MACB_HALT_TIMEOUT, false, 1008 + bp, TSR); 1016 1009 } 1017 1010 1018 1011 static void macb_tx_unmap(struct macb *bp, struct macb_tx_skb *tx_skb, int budget)
+19 -11
drivers/net/ethernet/engleder/tsnep_main.c
··· 67 67 #define TSNEP_TX_TYPE_XDP_NDO_MAP_PAGE (TSNEP_TX_TYPE_XDP_NDO | TSNEP_TX_TYPE_MAP_PAGE) 68 68 #define TSNEP_TX_TYPE_XDP (TSNEP_TX_TYPE_XDP_TX | TSNEP_TX_TYPE_XDP_NDO) 69 69 #define TSNEP_TX_TYPE_XSK BIT(12) 70 + #define TSNEP_TX_TYPE_TSTAMP BIT(13) 71 + #define TSNEP_TX_TYPE_SKB_TSTAMP (TSNEP_TX_TYPE_SKB | TSNEP_TX_TYPE_TSTAMP) 70 72 71 73 #define TSNEP_XDP_TX BIT(0) 72 74 #define TSNEP_XDP_REDIRECT BIT(1) ··· 388 386 if (entry->skb) { 389 387 entry->properties = length & TSNEP_DESC_LENGTH_MASK; 390 388 entry->properties |= TSNEP_DESC_INTERRUPT_FLAG; 391 - if ((entry->type & TSNEP_TX_TYPE_SKB) && 392 - (skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS)) 389 + if ((entry->type & TSNEP_TX_TYPE_SKB_TSTAMP) == TSNEP_TX_TYPE_SKB_TSTAMP) 393 390 entry->properties |= TSNEP_DESC_EXTENDED_WRITEBACK_FLAG; 394 391 395 392 /* toggle user flag to prevent false acknowledge ··· 480 479 return mapped; 481 480 } 482 481 483 - static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count) 482 + static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count, 483 + bool do_tstamp) 484 484 { 485 485 struct device *dmadev = tx->adapter->dmadev; 486 486 struct tsnep_tx_entry *entry; ··· 507 505 entry->type = TSNEP_TX_TYPE_SKB_INLINE; 508 506 mapped = 0; 509 507 } 508 + 509 + if (do_tstamp) 510 + entry->type |= TSNEP_TX_TYPE_TSTAMP; 510 511 } else { 511 512 skb_frag_t *frag = &skb_shinfo(skb)->frags[i - 1]; 512 513 ··· 563 558 static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb, 564 559 struct tsnep_tx *tx) 565 560 { 566 - int count = 1; 567 561 struct tsnep_tx_entry *entry; 562 + bool do_tstamp = false; 563 + int count = 1; 568 564 int length; 569 - int i; 570 565 int retval; 566 + int i; 571 567 572 568 if (skb_shinfo(skb)->nr_frags > 0) 573 569 count += skb_shinfo(skb)->nr_frags; ··· 585 579 entry = &tx->entry[tx->write]; 586 580 entry->skb = skb; 587 581 588 - retval = tsnep_tx_map(skb, tx, count); 582 + if 
(unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) && 583 + tx->adapter->hwtstamp_config.tx_type == HWTSTAMP_TX_ON) { 584 + skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS; 585 + do_tstamp = true; 586 + } 587 + 588 + retval = tsnep_tx_map(skb, tx, count, do_tstamp); 589 589 if (retval < 0) { 590 590 tsnep_tx_unmap(tx, tx->write, count); 591 591 dev_kfree_skb_any(entry->skb); ··· 602 590 return NETDEV_TX_OK; 603 591 } 604 592 length = retval; 605 - 606 - if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) 607 - skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS; 608 593 609 594 for (i = 0; i < count; i++) 610 595 tsnep_tx_activate(tx, (tx->write + i) & TSNEP_RING_MASK, length, ··· 853 844 854 845 length = tsnep_tx_unmap(tx, tx->read, count); 855 846 856 - if ((entry->type & TSNEP_TX_TYPE_SKB) && 857 - (skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS) && 847 + if (((entry->type & TSNEP_TX_TYPE_SKB_TSTAMP) == TSNEP_TX_TYPE_SKB_TSTAMP) && 858 848 (__le32_to_cpu(entry->desc_wb->properties) & 859 849 TSNEP_DESC_EXTENDED_WRITEBACK_FLAG)) { 860 850 struct skb_shared_hwtstamps hwtstamps;
+5
drivers/net/ethernet/marvell/octeontx2/af/cgx.c
··· 717 717 718 718 if (!is_lmac_valid(cgx, lmac_id)) 719 719 return -ENODEV; 720 + 721 + /* pass lmac as 0 for CGX_CMR_RX_STAT9-12 */ 722 + if (idx >= CGX_RX_STAT_GLOBAL_INDEX) 723 + lmac_id = 0; 724 + 720 725 *rx_stat = cgx_read(cgx, lmac_id, CGXX_CMRX_RX_STAT0 + (idx * 8)); 721 726 return 0; 722 727 }
+2 -1
drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c
··· 531 531 if (sw_tx_sc->encrypt) 532 532 sectag_tci |= (MCS_TCI_E | MCS_TCI_C); 533 533 534 - policy = FIELD_PREP(MCS_TX_SECY_PLCY_MTU, secy->netdev->mtu); 534 + policy = FIELD_PREP(MCS_TX_SECY_PLCY_MTU, 535 + pfvf->netdev->mtu + OTX2_ETH_HLEN); 535 536 /* Write SecTag excluding AN bits(1..0) */ 536 537 policy |= FIELD_PREP(MCS_TX_SECY_PLCY_ST_TCI, sectag_tci >> 2); 537 538 policy |= FIELD_PREP(MCS_TX_SECY_PLCY_ST_OFFSET, tag_offset);
+1
drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
··· 356 356 struct list_head flow_list_tc; 357 357 u8 ucast_flt_cnt; 358 358 bool ntuple; 359 + u16 ntuple_cnt; 359 360 }; 360 361 361 362 struct dev_hw_ops {
+1
drivers/net/ethernet/marvell/octeontx2/nic/otx2_devlink.c
··· 41 41 if (!pfvf->flow_cfg) 42 42 return 0; 43 43 44 + pfvf->flow_cfg->ntuple_cnt = ctx->val.vu16; 44 45 otx2_alloc_mcam_entries(pfvf, ctx->val.vu16); 45 46 46 47 return 0;
+5 -5
drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c
··· 315 315 struct otx2_nic *pfvf = netdev_priv(netdev); 316 316 struct cgx_pause_frm_cfg *req, *rsp; 317 317 318 - if (is_otx2_lbkvf(pfvf->pdev)) 318 + if (is_otx2_lbkvf(pfvf->pdev) || is_otx2_sdp_rep(pfvf->pdev)) 319 319 return; 320 320 321 321 mutex_lock(&pfvf->mbox.lock); ··· 347 347 if (pause->autoneg) 348 348 return -EOPNOTSUPP; 349 349 350 - if (is_otx2_lbkvf(pfvf->pdev)) 350 + if (is_otx2_lbkvf(pfvf->pdev) || is_otx2_sdp_rep(pfvf->pdev)) 351 351 return -EOPNOTSUPP; 352 352 353 353 if (pause->rx_pause) ··· 941 941 { 942 942 struct otx2_nic *pfvf = netdev_priv(netdev); 943 943 944 - /* LBK link is internal and always UP */ 945 - if (is_otx2_lbkvf(pfvf->pdev)) 944 + /* LBK and SDP links are internal and always UP */ 945 + if (is_otx2_lbkvf(pfvf->pdev) || is_otx2_sdp_rep(pfvf->pdev)) 946 946 return 1; 947 947 return pfvf->linfo.link_up; 948 948 } ··· 1413 1413 { 1414 1414 struct otx2_nic *pfvf = netdev_priv(netdev); 1415 1415 1416 - if (is_otx2_lbkvf(pfvf->pdev)) { 1416 + if (is_otx2_lbkvf(pfvf->pdev) || is_otx2_sdp_rep(pfvf->pdev)) { 1417 1417 cmd->base.duplex = DUPLEX_FULL; 1418 1418 cmd->base.speed = SPEED_100000; 1419 1419 } else {
+2 -1
drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
··· 247 247 mutex_unlock(&pfvf->mbox.lock); 248 248 249 249 /* Allocate entries for Ntuple filters */ 250 - count = otx2_alloc_mcam_entries(pfvf, OTX2_DEFAULT_FLOWCOUNT); 250 + count = otx2_alloc_mcam_entries(pfvf, flow_cfg->ntuple_cnt); 251 251 if (count <= 0) { 252 252 otx2_clear_ntuple_flow_info(pfvf, flow_cfg); 253 253 return 0; ··· 307 307 INIT_LIST_HEAD(&pf->flow_cfg->flow_list_tc); 308 308 309 309 pf->flow_cfg->ucast_flt_cnt = OTX2_DEFAULT_UNICAST_FLOWS; 310 + pf->flow_cfg->ntuple_cnt = OTX2_DEFAULT_FLOWCOUNT; 310 311 311 312 /* Allocate bare minimum number of MCAM entries needed for 312 313 * unicast and ntuple filters.
+1 -1
drivers/net/ethernet/mediatek/mtk_eth_soc.c
··· 4748 4748 } 4749 4749 4750 4750 if (mtk_is_netsys_v3_or_greater(mac->hw) && 4751 - MTK_HAS_CAPS(mac->hw->soc->caps, MTK_ESW_BIT) && 4751 + MTK_HAS_CAPS(mac->hw->soc->caps, MTK_ESW) && 4752 4752 id == MTK_GMAC1_ID) { 4753 4753 mac->phylink_config.mac_capabilities = MAC_ASYM_PAUSE | 4754 4754 MAC_SYM_PAUSE |
+4
drivers/net/ethernet/mellanox/mlx5/core/en_main.c
··· 4349 4349 if (netdev->features & NETIF_F_HW_VLAN_CTAG_FILTER) 4350 4350 netdev_warn(netdev, "Disabling HW_VLAN CTAG FILTERING, not supported in switchdev mode\n"); 4351 4351 4352 + features &= ~NETIF_F_HW_MACSEC; 4353 + if (netdev->features & NETIF_F_HW_MACSEC) 4354 + netdev_warn(netdev, "Disabling HW MACsec offload, not supported in switchdev mode\n"); 4355 + 4352 4356 return features; 4353 4357 } 4354 4358
+3
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
··· 3014 3014 .rif = rif, 3015 3015 }; 3016 3016 3017 + if (!mlxsw_sp_dev_lower_is_port(mlxsw_sp_rif_dev(rif))) 3018 + return 0; 3019 + 3017 3020 neigh_for_each(&arp_tbl, mlxsw_sp_neigh_rif_made_sync_each, &rms); 3018 3021 if (rms.err) 3019 3022 goto err_arp;
+1 -1
drivers/net/ethernet/qlogic/qede/qede_main.c
··· 203 203 }; 204 204 205 205 static struct qed_eth_cb_ops qede_ll_ops = { 206 - { 206 + .common = { 207 207 #ifdef CONFIG_RFS_ACCEL 208 208 .arfs_filter_op = qede_arfs_filter_op, 209 209 #endif
+5 -2
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c
··· 1484 1484 } 1485 1485 1486 1486 cmd_op = (cmd.rsp.arg[0] & 0xff); 1487 - if (cmd.rsp.arg[0] >> 25 == 2) 1488 - return 2; 1487 + if (cmd.rsp.arg[0] >> 25 == 2) { 1488 + ret = 2; 1489 + goto out; 1490 + } 1491 + 1489 1492 if (cmd_op == QLCNIC_BC_CMD_CHANNEL_INIT) 1490 1493 set_bit(QLC_BC_VF_STATE, &vf->state); 1491 1494 else
+8 -2
drivers/net/ethernet/wangxun/libwx/wx_hw.c
··· 434 434 wr32m(wx, WX_SW2FW_MBOX_CMD, WX_SW2FW_MBOX_CMD_VLD, WX_SW2FW_MBOX_CMD_VLD); 435 435 436 436 /* polling reply from FW */ 437 - err = read_poll_timeout(wx_poll_fw_reply, reply, reply, 1000, 50000, 438 - true, wx, buffer, send_cmd); 437 + err = read_poll_timeout(wx_poll_fw_reply, reply, reply, 2000, 438 + timeout * 1000, true, wx, buffer, send_cmd); 439 439 if (err) { 440 440 wx_err(wx, "Polling from FW messages timeout, cmd: 0x%x, index: %d\n", 441 441 send_cmd, wx->swfw_index); 442 + goto rel_out; 443 + } 444 + 445 + if (hdr->cmd_or_resp.ret_status == 0x80) { 446 + wx_err(wx, "Unknown FW command: 0x%x\n", send_cmd); 447 + err = -EINVAL; 442 448 goto rel_out; 443 449 } 444 450
+7 -1
drivers/net/ethernet/wangxun/txgbe/txgbe_hw.c
··· 99 99 } 100 100 local_buffer = eeprom_ptrs; 101 101 102 - for (i = 0; i < TXGBE_EEPROM_LAST_WORD; i++) 102 + for (i = 0; i < TXGBE_EEPROM_LAST_WORD; i++) { 103 + if (wx->mac.type == wx_mac_aml) { 104 + if (i >= TXGBE_EEPROM_I2C_SRART_PTR && 105 + i < TXGBE_EEPROM_I2C_END_PTR) 106 + local_buffer[i] = 0xffff; 107 + } 103 108 if (i != wx->eeprom.sw_region_offset + TXGBE_EEPROM_CHECKSUM) 104 109 *checksum += local_buffer[i]; 110 + } 105 111 106 112 kvfree(eeprom_ptrs); 107 113
+2
drivers/net/ethernet/wangxun/txgbe/txgbe_type.h
··· 158 158 #define TXGBE_EEPROM_VERSION_L 0x1D 159 159 #define TXGBE_EEPROM_VERSION_H 0x1E 160 160 #define TXGBE_ISCSI_BOOT_CONFIG 0x07 161 + #define TXGBE_EEPROM_I2C_SRART_PTR 0x580 162 + #define TXGBE_EEPROM_I2C_END_PTR 0x800 161 163 162 164 #define TXGBE_MAX_MSIX_VECTORS 64 163 165 #define TXGBE_MAX_FDIR_INDICES 63
+12 -1
drivers/net/hyperv/hyperv_net.h
··· 158 158 u8 cp_partial; /* partial copy into send buffer */ 159 159 160 160 u8 rmsg_size; /* RNDIS header and PPI size */ 161 - u8 rmsg_pgcnt; /* page count of RNDIS header and PPI */ 162 161 u8 page_buf_cnt; 163 162 164 163 u16 q_idx; ··· 891 892 #define NETVSC_MIN_OUT_MSG_SIZE (sizeof(struct vmpacket_descriptor) + \ 892 893 sizeof(struct nvsp_message)) 893 894 #define NETVSC_MIN_IN_MSG_SIZE sizeof(struct vmpacket_descriptor) 895 + 896 + /* Maximum # of contiguous data ranges that can make up a transmitted packet. 897 + * Typically it's the max SKB fragments plus 2 for the rndis packet and the 898 + * linear portion of the SKB. But if MAX_SKB_FRAGS is large, the value may 899 + * need to be limited to MAX_PAGE_BUFFER_COUNT, which is the max # of entries 900 + * in a GPA direct packet sent to netvsp over VMBus. 901 + */ 902 + #if MAX_SKB_FRAGS + 2 < MAX_PAGE_BUFFER_COUNT 903 + #define MAX_DATA_RANGES (MAX_SKB_FRAGS + 2) 904 + #else 905 + #define MAX_DATA_RANGES MAX_PAGE_BUFFER_COUNT 906 + #endif 894 907 895 908 /* Estimated requestor size: 896 909 * out_ring_size/min_out_msg_size + in_ring_size/min_in_msg_size
+48 -9
drivers/net/hyperv/netvsc.c
··· 953 953 + pend_size; 954 954 int i; 955 955 u32 padding = 0; 956 - u32 page_count = packet->cp_partial ? packet->rmsg_pgcnt : 957 - packet->page_buf_cnt; 956 + u32 page_count = packet->cp_partial ? 1 : packet->page_buf_cnt; 958 957 u32 remain; 959 958 960 959 /* Add padding */ ··· 1054 1055 return 0; 1055 1056 } 1056 1057 1058 + /* Build an "array" of mpb entries describing the data to be transferred 1059 + * over VMBus. After the desc header fields, each "array" entry is variable 1060 + * size, and each entry starts after the end of the previous entry. The 1061 + * "offset" and "len" fields for each entry imply the size of the entry. 1062 + * 1063 + * The pfns are in HV_HYP_PAGE_SIZE, because all communication with Hyper-V 1064 + * uses that granularity, even if the system page size of the guest is larger. 1065 + * Each entry in the input "pb" array must describe a contiguous range of 1066 + * guest physical memory so that the pfns are sequential if the range crosses 1067 + * a page boundary. The offset field must be < HV_HYP_PAGE_SIZE. 
1068 + */ 1069 + static inline void netvsc_build_mpb_array(struct hv_page_buffer *pb, 1070 + u32 page_buffer_count, 1071 + struct vmbus_packet_mpb_array *desc, 1072 + u32 *desc_size) 1073 + { 1074 + struct hv_mpb_array *mpb_entry = &desc->range; 1075 + int i, j; 1076 + 1077 + for (i = 0; i < page_buffer_count; i++) { 1078 + u32 offset = pb[i].offset; 1079 + u32 len = pb[i].len; 1080 + 1081 + mpb_entry->offset = offset; 1082 + mpb_entry->len = len; 1083 + 1084 + for (j = 0; j < HVPFN_UP(offset + len); j++) 1085 + mpb_entry->pfn_array[j] = pb[i].pfn + j; 1086 + 1087 + mpb_entry = (struct hv_mpb_array *)&mpb_entry->pfn_array[j]; 1088 + } 1089 + 1090 + desc->rangecount = page_buffer_count; 1091 + *desc_size = (char *)mpb_entry - (char *)desc; 1092 + } 1093 + 1057 1094 static inline int netvsc_send_pkt( 1058 1095 struct hv_device *device, 1059 1096 struct hv_netvsc_packet *packet, ··· 1132 1097 1133 1098 packet->dma_range = NULL; 1134 1099 if (packet->page_buf_cnt) { 1100 + struct vmbus_channel_packet_page_buffer desc; 1101 + u32 desc_size; 1102 + 1135 1103 if (packet->cp_partial) 1136 - pb += packet->rmsg_pgcnt; 1104 + pb++; 1137 1105 1138 1106 ret = netvsc_dma_map(ndev_ctx->device_ctx, packet, pb); 1139 1107 if (ret) { ··· 1144 1106 goto exit; 1145 1107 } 1146 1108 1147 - ret = vmbus_sendpacket_pagebuffer(out_channel, 1148 - pb, packet->page_buf_cnt, 1149 - &nvmsg, sizeof(nvmsg), 1150 - req_id); 1151 - 1109 + netvsc_build_mpb_array(pb, packet->page_buf_cnt, 1110 + (struct vmbus_packet_mpb_array *)&desc, 1111 + &desc_size); 1112 + ret = vmbus_sendpacket_mpb_desc(out_channel, 1113 + (struct vmbus_packet_mpb_array *)&desc, 1114 + desc_size, &nvmsg, sizeof(nvmsg), req_id); 1152 1115 if (ret) 1153 1116 netvsc_dma_unmap(ndev_ctx->device_ctx, packet); 1154 1117 } else { ··· 1298 1259 packet->send_buf_index = section_index; 1299 1260 1300 1261 if (packet->cp_partial) { 1301 - packet->page_buf_cnt -= packet->rmsg_pgcnt; 1262 + packet->page_buf_cnt--; 1302 1263 
packet->total_data_buflen = msd_len + packet->rmsg_size; 1303 1264 } else { 1304 1265 packet->page_buf_cnt = 0;
+14 -48
drivers/net/hyperv/netvsc_drv.c
··· 326 326 return txq; 327 327 } 328 328 329 - static u32 fill_pg_buf(unsigned long hvpfn, u32 offset, u32 len, 330 - struct hv_page_buffer *pb) 331 - { 332 - int j = 0; 333 - 334 - hvpfn += offset >> HV_HYP_PAGE_SHIFT; 335 - offset = offset & ~HV_HYP_PAGE_MASK; 336 - 337 - while (len > 0) { 338 - unsigned long bytes; 339 - 340 - bytes = HV_HYP_PAGE_SIZE - offset; 341 - if (bytes > len) 342 - bytes = len; 343 - pb[j].pfn = hvpfn; 344 - pb[j].offset = offset; 345 - pb[j].len = bytes; 346 - 347 - offset += bytes; 348 - len -= bytes; 349 - 350 - if (offset == HV_HYP_PAGE_SIZE && len) { 351 - hvpfn++; 352 - offset = 0; 353 - j++; 354 - } 355 - } 356 - 357 - return j + 1; 358 - } 359 - 360 329 static u32 init_page_array(void *hdr, u32 len, struct sk_buff *skb, 361 330 struct hv_netvsc_packet *packet, 362 331 struct hv_page_buffer *pb) 363 332 { 364 - u32 slots_used = 0; 365 - char *data = skb->data; 366 333 int frags = skb_shinfo(skb)->nr_frags; 367 334 int i; 368 335 ··· 338 371 * 2. skb linear data 339 372 * 3. 
skb fragment data 340 373 */ 341 - slots_used += fill_pg_buf(virt_to_hvpfn(hdr), 342 - offset_in_hvpage(hdr), 343 - len, 344 - &pb[slots_used]); 345 374 375 + pb[0].offset = offset_in_hvpage(hdr); 376 + pb[0].len = len; 377 + pb[0].pfn = virt_to_hvpfn(hdr); 346 378 packet->rmsg_size = len; 347 - packet->rmsg_pgcnt = slots_used; 348 379 349 - slots_used += fill_pg_buf(virt_to_hvpfn(data), 350 - offset_in_hvpage(data), 351 - skb_headlen(skb), 352 - &pb[slots_used]); 380 + pb[1].offset = offset_in_hvpage(skb->data); 381 + pb[1].len = skb_headlen(skb); 382 + pb[1].pfn = virt_to_hvpfn(skb->data); 353 383 354 384 for (i = 0; i < frags; i++) { 355 385 skb_frag_t *frag = skb_shinfo(skb)->frags + i; 386 + struct hv_page_buffer *cur_pb = &pb[i + 2]; 387 + u64 pfn = page_to_hvpfn(skb_frag_page(frag)); 388 + u32 offset = skb_frag_off(frag); 356 389 357 - slots_used += fill_pg_buf(page_to_hvpfn(skb_frag_page(frag)), 358 - skb_frag_off(frag), 359 - skb_frag_size(frag), 360 - &pb[slots_used]); 390 + cur_pb->offset = offset_in_hvpage(offset); 391 + cur_pb->len = skb_frag_size(frag); 392 + cur_pb->pfn = pfn + (offset >> HV_HYP_PAGE_SHIFT); 361 393 } 362 - return slots_used; 394 + return frags + 2; 363 395 } 364 396 365 397 static int count_skb_frag_slots(struct sk_buff *skb) ··· 449 483 struct net_device *vf_netdev; 450 484 u32 rndis_msg_size; 451 485 u32 hash; 452 - struct hv_page_buffer pb[MAX_PAGE_BUFFER_COUNT]; 486 + struct hv_page_buffer pb[MAX_DATA_RANGES]; 453 487 454 488 /* If VF is present and up then redirect packets to it. 455 489 * Skip the VF if it is marked down or has no carrier.
+5 -19
drivers/net/hyperv/rndis_filter.c
··· 225 225 struct rndis_request *req) 226 226 { 227 227 struct hv_netvsc_packet *packet; 228 - struct hv_page_buffer page_buf[2]; 229 - struct hv_page_buffer *pb = page_buf; 228 + struct hv_page_buffer pb; 230 229 int ret; 231 230 232 231 /* Setup the packet to send it */ ··· 234 235 packet->total_data_buflen = req->request_msg.msg_len; 235 236 packet->page_buf_cnt = 1; 236 237 237 - pb[0].pfn = virt_to_phys(&req->request_msg) >> 238 - HV_HYP_PAGE_SHIFT; 239 - pb[0].len = req->request_msg.msg_len; 240 - pb[0].offset = offset_in_hvpage(&req->request_msg); 241 - 242 - /* Add one page_buf when request_msg crossing page boundary */ 243 - if (pb[0].offset + pb[0].len > HV_HYP_PAGE_SIZE) { 244 - packet->page_buf_cnt++; 245 - pb[0].len = HV_HYP_PAGE_SIZE - 246 - pb[0].offset; 247 - pb[1].pfn = virt_to_phys((void *)&req->request_msg 248 - + pb[0].len) >> HV_HYP_PAGE_SHIFT; 249 - pb[1].offset = 0; 250 - pb[1].len = req->request_msg.msg_len - 251 - pb[0].len; 252 - } 238 + pb.pfn = virt_to_phys(&req->request_msg) >> HV_HYP_PAGE_SHIFT; 239 + pb.len = req->request_msg.msg_len; 240 + pb.offset = offset_in_hvpage(&req->request_msg); 253 241 254 242 trace_rndis_send(dev->ndev, 0, &req->request_msg); 255 243 256 244 rcu_read_lock_bh(); 257 - ret = netvsc_send(dev->ndev, packet, NULL, pb, NULL, false); 245 + ret = netvsc_send(dev->ndev, packet, NULL, &pb, NULL, false); 258 246 rcu_read_unlock_bh(); 259 247 260 248 return ret;
-7
drivers/net/phy/micrel.c
··· 2027 2027 return err; 2028 2028 } 2029 2029 2030 - /* According to KSZ9477 Errata DS80000754C (Module 4) all EEE modes 2031 - * in this switch shall be regarded as broken. 2032 - */ 2033 - if (phydev->dev_flags & MICREL_NO_EEE) 2034 - phy_disable_eee(phydev); 2035 - 2036 2030 return kszphy_config_init(phydev); 2037 2031 } 2038 2032 ··· 5699 5705 .handle_interrupt = kszphy_handle_interrupt, 5700 5706 .suspend = genphy_suspend, 5701 5707 .resume = ksz9477_resume, 5702 - .get_features = ksz9477_get_features, 5703 5708 } }; 5704 5709 5705 5710 module_phy_driver(ksphy_driver);
+1
drivers/net/wireless/mediatek/mt76/dma.c
··· 1011 1011 int i; 1012 1012 1013 1013 mt76_worker_disable(&dev->tx_worker); 1014 + napi_disable(&dev->tx_napi); 1014 1015 netif_napi_del(&dev->tx_napi); 1015 1016 1016 1017 for (i = 0; i < ARRAY_SIZE(dev->phys); i++) {
+2 -2
drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
··· 1924 1924 mt7925_mcu_sta_mld_tlv(skb, info->vif, info->link_sta->sta); 1925 1925 mt7925_mcu_sta_eht_mld_tlv(skb, info->vif, info->link_sta->sta); 1926 1926 } 1927 - 1928 - mt7925_mcu_sta_hdr_trans_tlv(skb, info->vif, info->link_sta); 1929 1927 } 1930 1928 1931 1929 if (!info->enable) { 1932 1930 mt7925_mcu_sta_remove_tlv(skb); 1933 1931 mt76_connac_mcu_add_tlv(skb, STA_REC_MLD_OFF, 1934 1932 sizeof(struct tlv)); 1933 + } else { 1934 + mt7925_mcu_sta_hdr_trans_tlv(skb, info->vif, info->link_sta); 1935 1935 } 1936 1936 1937 1937 return mt76_mcu_skb_send_msg(dev, skb, info->cmd, true);
+27 -3
drivers/nvme/host/core.c
··· 2059 2059 if (id->nsfeat & NVME_NS_FEAT_ATOMICS && id->nawupf) 2060 2060 atomic_bs = (1 + le16_to_cpu(id->nawupf)) * bs; 2061 2061 else 2062 - atomic_bs = (1 + ns->ctrl->subsys->awupf) * bs; 2062 + atomic_bs = (1 + ns->ctrl->awupf) * bs; 2063 + 2064 + /* 2065 + * Set subsystem atomic bs. 2066 + */ 2067 + if (ns->ctrl->subsys->atomic_bs) { 2068 + if (atomic_bs != ns->ctrl->subsys->atomic_bs) { 2069 + dev_err_ratelimited(ns->ctrl->device, 2070 + "%s: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=%d bytes, Controller/Namespace=%d bytes\n", 2071 + ns->disk ? ns->disk->disk_name : "?", 2072 + ns->ctrl->subsys->atomic_bs, 2073 + atomic_bs); 2074 + } 2075 + } else 2076 + ns->ctrl->subsys->atomic_bs = atomic_bs; 2063 2077 2064 2078 nvme_update_atomic_write_disk_info(ns, id, lim, bs, atomic_bs); 2065 2079 } ··· 2215 2201 nvme_set_chunk_sectors(ns, id, &lim); 2216 2202 if (!nvme_update_disk_info(ns, id, &lim)) 2217 2203 capacity = 0; 2204 + 2205 + /* 2206 + * Validate the max atomic write size fits within the subsystem's 2207 + * atomic write capabilities. 2208 + */ 2209 + if (lim.atomic_write_hw_max > ns->ctrl->subsys->atomic_bs) { 2210 + blk_mq_unfreeze_queue(ns->disk->queue, memflags); 2211 + ret = -ENXIO; 2212 + goto out; 2213 + } 2214 + 2218 2215 nvme_config_discard(ns, &lim); 2219 2216 if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) && 2220 2217 ns->head->ids.csi == NVME_CSI_ZNS) ··· 3056 3031 kfree(subsys); 3057 3032 return -EINVAL; 3058 3033 } 3059 - subsys->awupf = le16_to_cpu(id->awupf); 3060 3034 nvme_mpath_default_iopolicy(subsys); 3061 3035 3062 3036 subsys->dev.class = &nvme_subsys_class; ··· 3465 3441 dev_pm_qos_expose_latency_tolerance(ctrl->device); 3466 3442 else if (!ctrl->apst_enabled && prev_apst_enabled) 3467 3443 dev_pm_qos_hide_latency_tolerance(ctrl->device); 3468 - 3444 + ctrl->awupf = le16_to_cpu(id->awupf); 3469 3445 out_free: 3470 3446 kfree(id); 3471 3447 return ret;
+2 -1
drivers/nvme/host/multipath.c
··· 638 638 639 639 blk_set_stacking_limits(&lim); 640 640 lim.dma_alignment = 3; 641 - lim.features |= BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT | BLK_FEAT_POLL; 641 + lim.features |= BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT | 642 + BLK_FEAT_POLL | BLK_FEAT_ATOMIC_WRITES; 642 643 if (head->ids.csi == NVME_CSI_ZNS) 643 644 lim.features |= BLK_FEAT_ZONED; 644 645
+2 -1
drivers/nvme/host/nvme.h
··· 410 410 411 411 enum nvme_ctrl_type cntrltype; 412 412 enum nvme_dctype dctype; 413 + u16 awupf; /* 0's based value. */ 413 414 }; 414 415 415 416 static inline enum nvme_ctrl_state nvme_ctrl_state(struct nvme_ctrl *ctrl) ··· 443 442 u8 cmic; 444 443 enum nvme_subsys_type subtype; 445 444 u16 vendor_id; 446 - u16 awupf; /* 0's based awupf value. */ 447 445 struct ida ns_ida; 448 446 #ifdef CONFIG_NVME_MULTIPATH 449 447 enum nvme_iopolicy iopolicy; 450 448 #endif 449 + u32 atomic_bs; 451 450 }; 452 451 453 452 /*
+5 -1
drivers/nvme/host/pci.c
··· 390 390 * as it only leads to a small amount of wasted memory for the lifetime of 391 391 * the I/O. 392 392 */ 393 - static int nvme_pci_npages_prp(void) 393 + static __always_inline int nvme_pci_npages_prp(void) 394 394 { 395 395 unsigned max_bytes = (NVME_MAX_KB_SZ * 1024) + NVME_CTRL_PAGE_SIZE; 396 396 unsigned nprps = DIV_ROUND_UP(max_bytes, NVME_CTRL_PAGE_SIZE); ··· 1202 1202 WARN_ON_ONCE(test_bit(NVMEQ_POLLED, &nvmeq->flags)); 1203 1203 1204 1204 disable_irq(pci_irq_vector(pdev, nvmeq->cq_vector)); 1205 + spin_lock(&nvmeq->cq_poll_lock); 1205 1206 nvme_poll_cq(nvmeq, NULL); 1207 + spin_unlock(&nvmeq->cq_poll_lock); 1206 1208 enable_irq(pci_irq_vector(pdev, nvmeq->cq_vector)); 1207 1209 } 1208 1210 ··· 3738 3736 { PCI_DEVICE(0x1e49, 0x0021), /* ZHITAI TiPro5000 NVMe SSD */ 3739 3737 .driver_data = NVME_QUIRK_NO_DEEPEST_PS, }, 3740 3738 { PCI_DEVICE(0x1e49, 0x0041), /* ZHITAI TiPro7000 NVMe SSD */ 3739 + .driver_data = NVME_QUIRK_NO_DEEPEST_PS, }, 3740 + { PCI_DEVICE(0x025e, 0xf1ac), /* SOLIDIGM P44 pro SSDPFKKW020X7 */ 3741 3741 .driver_data = NVME_QUIRK_NO_DEEPEST_PS, }, 3742 3742 { PCI_DEVICE(0xc0a9, 0x540a), /* Crucial P2 */ 3743 3743 .driver_data = NVME_QUIRK_BOGUS_NID, },
+23 -16
drivers/nvme/target/pci-epf.c
··· 62 62 #define NVMET_PCI_EPF_CQ_RETRY_INTERVAL msecs_to_jiffies(1) 63 63 64 64 enum nvmet_pci_epf_queue_flags { 65 - NVMET_PCI_EPF_Q_IS_SQ = 0, /* The queue is a submission queue */ 66 - NVMET_PCI_EPF_Q_LIVE, /* The queue is live */ 65 + NVMET_PCI_EPF_Q_LIVE = 0, /* The queue is live */ 67 66 NVMET_PCI_EPF_Q_IRQ_ENABLED, /* IRQ is enabled for this queue */ 68 67 }; 69 68 ··· 595 596 struct nvmet_pci_epf_irq_vector *iv = cq->iv; 596 597 bool ret; 597 598 598 - if (!test_bit(NVMET_PCI_EPF_Q_IRQ_ENABLED, &cq->flags)) 599 - return false; 600 - 601 599 /* IRQ coalescing for the admin queue is not allowed. */ 602 600 if (!cq->qid) 603 601 return true; ··· 621 625 struct pci_epf *epf = nvme_epf->epf; 622 626 int ret = 0; 623 627 624 - if (!test_bit(NVMET_PCI_EPF_Q_LIVE, &cq->flags)) 628 + if (!test_bit(NVMET_PCI_EPF_Q_LIVE, &cq->flags) || 629 + !test_bit(NVMET_PCI_EPF_Q_IRQ_ENABLED, &cq->flags)) 625 630 return; 626 631 627 632 mutex_lock(&ctrl->irq_lock); ··· 633 636 switch (nvme_epf->irq_type) { 634 637 case PCI_IRQ_MSIX: 635 638 case PCI_IRQ_MSI: 639 + /* 640 + * If we fail to raise an MSI or MSI-X interrupt, it is likely 641 + * because the host is using legacy INTX IRQs (e.g. BIOS, 642 + * grub), but we can fallback to the INTX type only if the 643 + * endpoint controller supports this type. 644 + */ 636 645 ret = pci_epc_raise_irq(epf->epc, epf->func_no, epf->vfunc_no, 637 646 nvme_epf->irq_type, cq->vector + 1); 638 - if (!ret) 647 + if (!ret || !nvme_epf->epc_features->intx_capable) 639 648 break; 640 - /* 641 - * If we got an error, it is likely because the host is using 642 - * legacy IRQs (e.g. BIOS, grub). 
643 - */ 644 649 fallthrough; 645 650 case PCI_IRQ_INTX: 646 651 ret = pci_epc_raise_irq(epf->epc, epf->func_no, epf->vfunc_no, ··· 655 656 } 656 657 657 658 if (ret) 658 - dev_err(ctrl->dev, "Failed to raise IRQ (err=%d)\n", ret); 659 + dev_err_ratelimited(ctrl->dev, 660 + "CQ[%u]: Failed to raise IRQ (err=%d)\n", 661 + cq->qid, ret); 659 662 660 663 unlock: 661 664 mutex_unlock(&ctrl->irq_lock); ··· 1320 1319 1321 1320 set_bit(NVMET_PCI_EPF_Q_LIVE, &cq->flags); 1322 1321 1323 - dev_dbg(ctrl->dev, "CQ[%u]: %u entries of %zu B, IRQ vector %u\n", 1324 - cqid, qsize, cq->qes, cq->vector); 1322 + if (test_bit(NVMET_PCI_EPF_Q_IRQ_ENABLED, &cq->flags)) 1323 + dev_dbg(ctrl->dev, 1324 + "CQ[%u]: %u entries of %zu B, IRQ vector %u\n", 1325 + cqid, qsize, cq->qes, cq->vector); 1326 + else 1327 + dev_dbg(ctrl->dev, 1328 + "CQ[%u]: %u entries of %zu B, IRQ disabled\n", 1329 + cqid, qsize, cq->qes); 1325 1330 1326 1331 return NVME_SC_SUCCESS; 1327 1332 ··· 1351 1344 1352 1345 cancel_delayed_work_sync(&cq->work); 1353 1346 nvmet_pci_epf_drain_queue(cq); 1354 - nvmet_pci_epf_remove_irq_vector(ctrl, cq->vector); 1347 + if (test_and_clear_bit(NVMET_PCI_EPF_Q_IRQ_ENABLED, &cq->flags)) 1348 + nvmet_pci_epf_remove_irq_vector(ctrl, cq->vector); 1355 1349 nvmet_pci_epf_mem_unmap(ctrl->nvme_epf, &cq->pci_map); 1356 1350 1357 1351 return NVME_SC_SUCCESS; ··· 1541 1533 1542 1534 if (sq) { 1543 1535 queue = &ctrl->sq[qid]; 1544 - set_bit(NVMET_PCI_EPF_Q_IS_SQ, &queue->flags); 1545 1536 } else { 1546 1537 queue = &ctrl->cq[qid]; 1547 1538 INIT_DELAYED_WORK(&queue->work, nvmet_pci_epf_cq_work);
+15 -7
drivers/phy/phy-can-transceiver.c
··· 93 93 }; 94 94 MODULE_DEVICE_TABLE(of, can_transceiver_phy_ids); 95 95 96 + /* Temporary wrapper until the multiplexer subsystem supports optional muxes */ 97 + static inline struct mux_state * 98 + devm_mux_state_get_optional(struct device *dev, const char *mux_name) 99 + { 100 + if (!of_property_present(dev->of_node, "mux-states")) 101 + return NULL; 102 + 103 + return devm_mux_state_get(dev, mux_name); 104 + } 105 + 96 106 static int can_transceiver_phy_probe(struct platform_device *pdev) 97 107 { 98 108 struct phy_provider *phy_provider; ··· 124 114 match = of_match_node(can_transceiver_phy_ids, pdev->dev.of_node); 125 115 drvdata = match->data; 126 116 127 - mux_state = devm_mux_state_get(dev, NULL); 128 - if (IS_ERR(mux_state)) { 129 - if (PTR_ERR(mux_state) == -EPROBE_DEFER) 130 - return PTR_ERR(mux_state); 131 - } else { 132 - can_transceiver_phy->mux_state = mux_state; 133 - } 117 + mux_state = devm_mux_state_get_optional(dev, NULL); 118 + if (IS_ERR(mux_state)) 119 + return PTR_ERR(mux_state); 120 + 121 + can_transceiver_phy->mux_state = mux_state; 134 122 135 123 phy = devm_phy_create(dev, dev->of_node, 136 124 &can_transceiver_phy_ops);
+2 -1
drivers/phy/qualcomm/phy-qcom-qmp-ufs.c
··· 1754 1754 qmp_ufs_init_all(qmp, &cfg->tbls_hs_overlay[i]); 1755 1755 } 1756 1756 1757 - qmp_ufs_init_all(qmp, &cfg->tbls_hs_b); 1757 + if (qmp->mode == PHY_MODE_UFS_HS_B) 1758 + qmp_ufs_init_all(qmp, &cfg->tbls_hs_b); 1758 1759 } 1759 1760 1760 1761 static int qmp_ufs_com_init(struct qmp_ufs *qmp)
+74 -59
drivers/phy/renesas/phy-rcar-gen3-usb2.c
··· 9 9 * Copyright (C) 2014 Cogent Embedded, Inc. 10 10 */ 11 11 12 + #include <linux/cleanup.h> 12 13 #include <linux/extcon-provider.h> 13 14 #include <linux/interrupt.h> 14 15 #include <linux/io.h> ··· 108 107 struct rcar_gen3_chan *ch; 109 108 u32 int_enable_bits; 110 109 bool initialized; 111 - bool otg_initialized; 112 110 bool powered; 113 111 }; 114 112 ··· 119 119 struct regulator *vbus; 120 120 struct reset_control *rstc; 121 121 struct work_struct work; 122 - struct mutex lock; /* protects rphys[...].powered */ 122 + spinlock_t lock; /* protects access to hardware and driver data structure. */ 123 123 enum usb_dr_mode dr_mode; 124 - int irq; 125 124 u32 obint_enable_bits; 126 125 bool extcon_host; 127 126 bool is_otg_channel; ··· 319 320 return false; 320 321 } 321 322 322 - static bool rcar_gen3_needs_init_otg(struct rcar_gen3_chan *ch) 323 + static bool rcar_gen3_is_any_otg_rphy_initialized(struct rcar_gen3_chan *ch) 323 324 { 324 - int i; 325 - 326 - for (i = 0; i < NUM_OF_PHYS; i++) { 327 - if (ch->rphys[i].otg_initialized) 328 - return false; 325 + for (enum rcar_gen3_phy_index i = PHY_INDEX_BOTH_HC; i <= PHY_INDEX_EHCI; 326 + i++) { 327 + if (ch->rphys[i].initialized) 328 + return true; 329 329 } 330 330 331 - return true; 331 + return false; 332 332 } 333 333 334 334 static bool rcar_gen3_are_all_rphys_power_off(struct rcar_gen3_chan *ch) ··· 349 351 bool is_b_device; 350 352 enum phy_mode cur_mode, new_mode; 351 353 352 - if (!ch->is_otg_channel || !rcar_gen3_is_any_rphy_initialized(ch)) 354 + guard(spinlock_irqsave)(&ch->lock); 355 + 356 + if (!ch->is_otg_channel || !rcar_gen3_is_any_otg_rphy_initialized(ch)) 353 357 return -EIO; 354 358 355 359 if (sysfs_streq(buf, "host")) ··· 389 389 { 390 390 struct rcar_gen3_chan *ch = dev_get_drvdata(dev); 391 391 392 - if (!ch->is_otg_channel || !rcar_gen3_is_any_rphy_initialized(ch)) 392 + if (!ch->is_otg_channel || !rcar_gen3_is_any_otg_rphy_initialized(ch)) 393 393 return -EIO; 394 394 395 395 return 
sprintf(buf, "%s\n", rcar_gen3_is_host(ch) ? "host" : ··· 401 401 { 402 402 void __iomem *usb2_base = ch->base; 403 403 u32 val; 404 + 405 + if (!ch->is_otg_channel || rcar_gen3_is_any_otg_rphy_initialized(ch)) 406 + return; 404 407 405 408 /* Should not use functions of read-modify-write a register */ 406 409 val = readl(usb2_base + USB2_LINECTRL1); ··· 418 415 val = readl(usb2_base + USB2_ADPCTRL); 419 416 writel(val | USB2_ADPCTRL_IDPULLUP, usb2_base + USB2_ADPCTRL); 420 417 } 421 - msleep(20); 418 + mdelay(20); 422 419 423 420 writel(0xffffffff, usb2_base + USB2_OBINTSTA); 424 421 writel(ch->obint_enable_bits, usb2_base + USB2_OBINTEN); ··· 430 427 { 431 428 struct rcar_gen3_chan *ch = _ch; 432 429 void __iomem *usb2_base = ch->base; 433 - u32 status = readl(usb2_base + USB2_OBINTSTA); 430 + struct device *dev = ch->dev; 434 431 irqreturn_t ret = IRQ_NONE; 432 + u32 status; 435 433 436 - if (status & ch->obint_enable_bits) { 437 - dev_vdbg(ch->dev, "%s: %08x\n", __func__, status); 438 - writel(ch->obint_enable_bits, usb2_base + USB2_OBINTSTA); 439 - rcar_gen3_device_recognition(ch); 440 - ret = IRQ_HANDLED; 434 + pm_runtime_get_noresume(dev); 435 + 436 + if (pm_runtime_suspended(dev)) 437 + goto rpm_put; 438 + 439 + scoped_guard(spinlock, &ch->lock) { 440 + status = readl(usb2_base + USB2_OBINTSTA); 441 + if (status & ch->obint_enable_bits) { 442 + dev_vdbg(dev, "%s: %08x\n", __func__, status); 443 + writel(ch->obint_enable_bits, usb2_base + USB2_OBINTSTA); 444 + rcar_gen3_device_recognition(ch); 445 + ret = IRQ_HANDLED; 446 + } 441 447 } 442 448 449 + rpm_put: 450 + pm_runtime_put_noidle(dev); 443 451 return ret; 444 452 } 445 453 ··· 460 446 struct rcar_gen3_chan *channel = rphy->ch; 461 447 void __iomem *usb2_base = channel->base; 462 448 u32 val; 463 - int ret; 464 449 465 - if (!rcar_gen3_is_any_rphy_initialized(channel) && channel->irq >= 0) { 466 - INIT_WORK(&channel->work, rcar_gen3_phy_usb2_work); 467 - ret = request_irq(channel->irq, 
rcar_gen3_phy_usb2_irq, 468 - IRQF_SHARED, dev_name(channel->dev), channel); 469 - if (ret < 0) { 470 - dev_err(channel->dev, "No irq handler (%d)\n", channel->irq); 471 - return ret; 472 - } 473 - } 450 + guard(spinlock_irqsave)(&channel->lock); 474 451 475 452 /* Initialize USB2 part */ 476 453 val = readl(usb2_base + USB2_INT_ENABLE); 477 454 val |= USB2_INT_ENABLE_UCOM_INTEN | rphy->int_enable_bits; 478 455 writel(val, usb2_base + USB2_INT_ENABLE); 479 - writel(USB2_SPD_RSM_TIMSET_INIT, usb2_base + USB2_SPD_RSM_TIMSET); 480 - writel(USB2_OC_TIMSET_INIT, usb2_base + USB2_OC_TIMSET); 481 456 482 - /* Initialize otg part */ 483 - if (channel->is_otg_channel) { 484 - if (rcar_gen3_needs_init_otg(channel)) 485 - rcar_gen3_init_otg(channel); 486 - rphy->otg_initialized = true; 457 + if (!rcar_gen3_is_any_rphy_initialized(channel)) { 458 + writel(USB2_SPD_RSM_TIMSET_INIT, usb2_base + USB2_SPD_RSM_TIMSET); 459 + writel(USB2_OC_TIMSET_INIT, usb2_base + USB2_OC_TIMSET); 487 460 } 461 + 462 + /* Initialize otg part (only if we initialize a PHY with IRQs). 
*/ 463 + if (rphy->int_enable_bits) 464 + rcar_gen3_init_otg(channel); 488 465 489 466 rphy->initialized = true; 490 467 ··· 489 484 void __iomem *usb2_base = channel->base; 490 485 u32 val; 491 486 492 - rphy->initialized = false; 487 + guard(spinlock_irqsave)(&channel->lock); 493 488 494 - if (channel->is_otg_channel) 495 - rphy->otg_initialized = false; 489 + rphy->initialized = false; 496 490 497 491 val = readl(usb2_base + USB2_INT_ENABLE); 498 492 val &= ~rphy->int_enable_bits; 499 493 if (!rcar_gen3_is_any_rphy_initialized(channel)) 500 494 val &= ~USB2_INT_ENABLE_UCOM_INTEN; 501 495 writel(val, usb2_base + USB2_INT_ENABLE); 502 - 503 - if (channel->irq >= 0 && !rcar_gen3_is_any_rphy_initialized(channel)) 504 - free_irq(channel->irq, channel); 505 496 506 497 return 0; 507 498 } ··· 510 509 u32 val; 511 510 int ret = 0; 512 511 513 - mutex_lock(&channel->lock); 514 - if (!rcar_gen3_are_all_rphys_power_off(channel)) 515 - goto out; 516 - 517 512 if (channel->vbus) { 518 513 ret = regulator_enable(channel->vbus); 519 514 if (ret) 520 - goto out; 515 + return ret; 521 516 } 517 + 518 + guard(spinlock_irqsave)(&channel->lock); 519 + 520 + if (!rcar_gen3_are_all_rphys_power_off(channel)) 521 + goto out; 522 522 523 523 val = readl(usb2_base + USB2_USBCTR); 524 524 val |= USB2_USBCTR_PLL_RST; ··· 530 528 out: 531 529 /* The powered flag should be set for any other phys anyway */ 532 530 rphy->powered = true; 533 - mutex_unlock(&channel->lock); 534 531 535 532 return 0; 536 533 } ··· 540 539 struct rcar_gen3_chan *channel = rphy->ch; 541 540 int ret = 0; 542 541 543 - mutex_lock(&channel->lock); 544 - rphy->powered = false; 542 + scoped_guard(spinlock_irqsave, &channel->lock) { 543 + rphy->powered = false; 545 544 546 - if (!rcar_gen3_are_all_rphys_power_off(channel)) 547 - goto out; 545 + if (rcar_gen3_are_all_rphys_power_off(channel)) { 546 + u32 val = readl(channel->base + USB2_USBCTR); 547 + 548 + val |= USB2_USBCTR_PLL_RST; 549 + writel(val, channel->base + 
USB2_USBCTR); 550 + } 551 + } 548 552 549 553 if (channel->vbus) 550 554 ret = regulator_disable(channel->vbus); 551 - 552 - out: 553 - mutex_unlock(&channel->lock); 554 555 555 556 return ret; 556 557 } ··· 706 703 struct device *dev = &pdev->dev; 707 704 struct rcar_gen3_chan *channel; 708 705 struct phy_provider *provider; 709 - int ret = 0, i; 706 + int ret = 0, i, irq; 710 707 711 708 if (!dev->of_node) { 712 709 dev_err(dev, "This driver needs device tree\n"); ··· 722 719 return PTR_ERR(channel->base); 723 720 724 721 channel->obint_enable_bits = USB2_OBINT_BITS; 725 - /* get irq number here and request_irq for OTG in phy_init */ 726 - channel->irq = platform_get_irq_optional(pdev, 0); 727 722 channel->dr_mode = rcar_gen3_get_dr_mode(dev->of_node); 728 723 if (channel->dr_mode != USB_DR_MODE_UNKNOWN) { 729 724 channel->is_otg_channel = true; ··· 764 763 if (phy_data->no_adp_ctrl) 765 764 channel->obint_enable_bits = USB2_OBINT_IDCHG_EN; 766 765 767 - mutex_init(&channel->lock); 766 + spin_lock_init(&channel->lock); 768 767 for (i = 0; i < NUM_OF_PHYS; i++) { 769 768 channel->rphys[i].phy = devm_phy_create(dev, NULL, 770 769 phy_data->phy_usb2_ops); ··· 788 787 goto error; 789 788 } 790 789 channel->vbus = NULL; 790 + } 791 + 792 + irq = platform_get_irq_optional(pdev, 0); 793 + if (irq < 0 && irq != -ENXIO) { 794 + ret = irq; 795 + goto error; 796 + } else if (irq > 0) { 797 + INIT_WORK(&channel->work, rcar_gen3_phy_usb2_work); 798 + ret = devm_request_irq(dev, irq, rcar_gen3_phy_usb2_irq, 799 + IRQF_SHARED, dev_name(dev), channel); 800 + if (ret < 0) { 801 + dev_err(dev, "Failed to request irq (%d)\n", irq); 802 + goto error; 803 + } 791 804 } 792 805 793 806 provider = devm_of_phy_provider_register(dev, rcar_gen3_phy_usb2_xlate);
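The rcar-gen3 changes above convert a mutex to a spinlock taken via `guard(spinlock_irqsave)` and `scoped_guard()` from `<linux/cleanup.h>`, which release the lock automatically when the scope ends. A minimal userspace sketch of that scope-guard idea, built on the compiler `cleanup` attribute that the kernel helpers use underneath (the flag below stands in for a real spinlock; all names here are illustrative, not the kernel API):

```c
#include <assert.h>

/* Toy scope guard: the compiler runs guard_release() when the guard
 * variable goes out of scope, so early returns cannot leak the lock. */
static int lock_held;
static int unlock_count;
static int counter;

static void guard_release(int **flag)
{
	**flag = 0;		/* drop the "lock" */
	unlock_count++;
}

#define guard_lock(flag) \
	int *_guard_ __attribute__((cleanup(guard_release))) = \
		(*(flag) = 1, (flag))

static int bump(void)
{
	guard_lock(&lock_held);
	assert(lock_held == 1);	/* held for the rest of this scope */
	return ++counter;	/* guard_release() runs after the return value
				 * is computed, as with the kernel guards */
}
```

This is why the diff can drop the explicit `mutex_unlock()` calls and the `out:` labels: every exit path releases the lock implicitly.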
+1 -1
drivers/phy/rockchip/phy-rockchip-samsung-dcphy.c
··· 1653 1653 return ret; 1654 1654 } 1655 1655 1656 - clk_prepare_enable(samsung->ref_clk); 1656 + ret = clk_prepare_enable(samsung->ref_clk); 1657 1657 if (ret) { 1658 1658 dev_err(samsung->dev, "Failed to enable reference clock, %d\n", ret); 1659 1659 clk_disable_unprepare(samsung->pclk);
+2
drivers/phy/rockchip/phy-rockchip-samsung-hdptx.c
··· 476 476 1, 1, 0, 0x20, 0x0c, 1, 0x0e, 0, 0, }, 477 477 { 650000, 162, 162, 1, 1, 11, 1, 1, 1, 1, 1, 1, 1, 54, 0, 16, 4, 1, 478 478 1, 1, 0, 0x20, 0x0c, 1, 0x0e, 0, 0, }, 479 + { 502500, 84, 84, 1, 1, 7, 1, 1, 1, 1, 1, 1, 1, 11, 1, 4, 5, 480 + 4, 11, 1, 0, 0x20, 0x0c, 1, 0x0e, 0, 0, }, 479 481 { 337500, 0x70, 0x70, 1, 1, 0xf, 1, 1, 1, 1, 1, 1, 1, 0x2, 0, 0x01, 5, 480 482 1, 1, 1, 0, 0x20, 0x0c, 1, 0x0e, 0, 0, }, 481 483 { 400000, 100, 100, 1, 1, 11, 1, 1, 0, 1, 0, 1, 1, 0x9, 0, 0x05, 0,
+7
drivers/phy/starfive/phy-jh7110-usb.c
··· 18 18 #include <linux/usb/of.h> 19 19 20 20 #define USB_125M_CLK_RATE 125000000 21 + #define USB_CLK_MODE_OFF 0x0 22 + #define USB_CLK_MODE_RX_NORMAL_PWR BIT(1) 21 23 #define USB_LS_KEEPALIVE_OFF 0x4 22 24 #define USB_LS_KEEPALIVE_ENABLE BIT(4) 23 25 ··· 80 78 { 81 79 struct jh7110_usb2_phy *phy = phy_get_drvdata(_phy); 82 80 int ret; 81 + unsigned int val; 83 82 84 83 ret = clk_set_rate(phy->usb_125m_clk, USB_125M_CLK_RATE); 85 84 if (ret) ··· 89 86 ret = clk_prepare_enable(phy->app_125m); 90 87 if (ret) 91 88 return ret; 89 + 90 + val = readl(phy->regs + USB_CLK_MODE_OFF); 91 + val |= USB_CLK_MODE_RX_NORMAL_PWR; 92 + writel(val, phy->regs + USB_CLK_MODE_OFF); 92 93 93 94 return 0; 94 95 }
+27 -19
drivers/phy/tegra/xusb-tegra186.c
··· 237 237 #define DATA0_VAL_PD BIT(1) 238 238 #define USE_XUSB_AO BIT(4) 239 239 240 + #define TEGRA_UTMI_PAD_MAX 4 241 + 240 242 #define TEGRA186_LANE(_name, _offset, _shift, _mask, _type) \ 241 243 { \ 242 244 .name = _name, \ ··· 271 269 272 270 /* UTMI bias and tracking */ 273 271 struct clk *usb2_trk_clk; 274 - unsigned int bias_pad_enable; 272 + DECLARE_BITMAP(utmi_pad_enabled, TEGRA_UTMI_PAD_MAX); 275 273 276 274 /* padctl context */ 277 275 struct tegra186_xusb_padctl_context context; ··· 605 603 u32 value; 606 604 int err; 607 605 608 - mutex_lock(&padctl->lock); 609 - 610 - if (priv->bias_pad_enable++ > 0) { 611 - mutex_unlock(&padctl->lock); 606 + if (!bitmap_empty(priv->utmi_pad_enabled, TEGRA_UTMI_PAD_MAX)) 612 607 return; 613 - } 614 608 615 609 err = clk_prepare_enable(priv->usb2_trk_clk); 616 610 if (err < 0) ··· 656 658 } else { 657 659 clk_disable_unprepare(priv->usb2_trk_clk); 658 660 } 659 - 660 - mutex_unlock(&padctl->lock); 661 661 } 662 662 663 663 static void tegra186_utmi_bias_pad_power_off(struct tegra_xusb_padctl *padctl) ··· 663 667 struct tegra186_xusb_padctl *priv = to_tegra186_xusb_padctl(padctl); 664 668 u32 value; 665 669 666 - mutex_lock(&padctl->lock); 667 - 668 - if (WARN_ON(priv->bias_pad_enable == 0)) { 669 - mutex_unlock(&padctl->lock); 670 + if (!bitmap_empty(priv->utmi_pad_enabled, TEGRA_UTMI_PAD_MAX)) 670 671 return; 671 - } 672 - 673 - if (--priv->bias_pad_enable > 0) { 674 - mutex_unlock(&padctl->lock); 675 - return; 676 - } 677 672 678 673 value = padctl_readl(padctl, XUSB_PADCTL_USB2_BIAS_PAD_CTL1); 679 674 value |= USB2_PD_TRK; ··· 677 690 clk_disable_unprepare(priv->usb2_trk_clk); 678 691 } 679 692 680 - mutex_unlock(&padctl->lock); 681 693 } 682 694 683 695 static void tegra186_utmi_pad_power_on(struct phy *phy) 684 696 { 685 697 struct tegra_xusb_lane *lane = phy_get_drvdata(phy); 686 698 struct tegra_xusb_padctl *padctl = lane->pad->padctl; 699 + struct tegra186_xusb_padctl *priv = 
to_tegra186_xusb_padctl(padctl); 687 700 struct tegra_xusb_usb2_port *port; 688 701 struct device *dev = padctl->dev; 689 702 unsigned int index = lane->index; ··· 692 705 if (!phy) 693 706 return; 694 707 708 + mutex_lock(&padctl->lock); 709 + if (test_bit(index, priv->utmi_pad_enabled)) { 710 + mutex_unlock(&padctl->lock); 711 + return; 712 + } 713 + 695 714 port = tegra_xusb_find_usb2_port(padctl, index); 696 715 if (!port) { 697 716 dev_err(dev, "no port found for USB2 lane %u\n", index); 717 + mutex_unlock(&padctl->lock); 698 718 return; 699 719 } 700 720 ··· 718 724 value = padctl_readl(padctl, XUSB_PADCTL_USB2_OTG_PADX_CTL1(index)); 719 725 value &= ~USB2_OTG_PD_DR; 720 726 padctl_writel(padctl, value, XUSB_PADCTL_USB2_OTG_PADX_CTL1(index)); 727 + 728 + set_bit(index, priv->utmi_pad_enabled); 729 + mutex_unlock(&padctl->lock); 721 730 } 722 731 723 732 static void tegra186_utmi_pad_power_down(struct phy *phy) 724 733 { 725 734 struct tegra_xusb_lane *lane = phy_get_drvdata(phy); 726 735 struct tegra_xusb_padctl *padctl = lane->pad->padctl; 736 + struct tegra186_xusb_padctl *priv = to_tegra186_xusb_padctl(padctl); 727 737 unsigned int index = lane->index; 728 738 u32 value; 729 739 730 740 if (!phy) 731 741 return; 742 + 743 + mutex_lock(&padctl->lock); 744 + if (!test_bit(index, priv->utmi_pad_enabled)) { 745 + mutex_unlock(&padctl->lock); 746 + return; 747 + } 732 748 733 749 dev_dbg(padctl->dev, "power down UTMI pad %u\n", index); 734 750 ··· 752 748 753 749 udelay(2); 754 750 751 + clear_bit(index, priv->utmi_pad_enabled); 752 + 755 753 tegra186_utmi_bias_pad_power_off(padctl); 754 + 755 + mutex_unlock(&padctl->lock); 756 756 } 757 757 758 758 static int tegra186_xusb_padctl_vbus_override(struct tegra_xusb_padctl *padctl,
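The xusb-tegra186 change above replaces the `bias_pad_enable` counter with a per-pad bitmap (`utmi_pad_enabled`), so power-on/power-off become idempotent per pad and the shared bias rail is keyed off `bitmap_empty()`. A small sketch of why a bitmap is more robust than a counter here (plain `unsigned long` bit ops stand in for the kernel bitmap API; values are illustrative):

```c
#include <assert.h>
#include <stdbool.h>

static unsigned long pad_enabled;	/* one bit per pad, like DECLARE_BITMAP() */
static int bias_on;			/* shared bias/tracking state */

static bool any_pad_enabled(void)
{
	return pad_enabled != 0;	/* analogue of !bitmap_empty() */
}

static void pad_power_on(unsigned int idx)
{
	if (pad_enabled & (1UL << idx))
		return;			/* already on: idempotent, unlike a counter */
	if (!any_pad_enabled())
		bias_on = 1;		/* first pad brings up the shared bias */
	pad_enabled |= 1UL << idx;
}

static void pad_power_down(unsigned int idx)
{
	if (!(pad_enabled & (1UL << idx)))
		return;			/* double power-down cannot underflow */
	pad_enabled &= ~(1UL << idx);
	if (!any_pad_enabled())
		bias_on = 0;		/* last pad off drops the bias */
}
```

A counter desynchronizes if power-on or power-off is ever called twice for the same pad; the bitmap records exactly which pads are up, so unbalanced calls are harmless.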
+4 -4
drivers/phy/tegra/xusb.c
··· 548 548 549 549 err = dev_set_name(&port->dev, "%s-%u", name, index); 550 550 if (err < 0) 551 - goto unregister; 551 + goto put_device; 552 552 553 553 err = device_add(&port->dev); 554 554 if (err < 0) 555 - goto unregister; 555 + goto put_device; 556 556 557 557 return 0; 558 558 559 - unregister: 560 - device_unregister(&port->dev); 559 + put_device: 560 + put_device(&port->dev); 561 561 return err; 562 562 } 563 563
+1 -2
drivers/platform/x86/amd/hsmp/acpi.c
··· 27 27 28 28 #include "hsmp.h" 29 29 30 - #define DRIVER_NAME "amd_hsmp" 30 + #define DRIVER_NAME "hsmp_acpi" 31 31 #define DRIVER_VERSION "2.3" 32 - #define ACPI_HSMP_DEVICE_HID "AMDI0097" 33 32 34 33 /* These are the strings specified in ACPI table */ 35 34 #define MSG_IDOFF_STR "MsgIdOffset"
+1
drivers/platform/x86/amd/hsmp/hsmp.h
··· 23 23 24 24 #define HSMP_CDEV_NAME "hsmp_cdev" 25 25 #define HSMP_DEVNODE_NAME "hsmp" 26 + #define ACPI_HSMP_DEVICE_HID "AMDI0097" 26 27 27 28 struct hsmp_mbaddr_info { 28 29 u32 base_addr;
+5 -1
drivers/platform/x86/amd/hsmp/plat.c
··· 11 11 12 12 #include <asm/amd_hsmp.h> 13 13 14 + #include <linux/acpi.h> 14 15 #include <linux/build_bug.h> 15 16 #include <linux/device.h> 16 17 #include <linux/module.h> ··· 267 266 } 268 267 case 0x1A: 269 268 switch (boot_cpu_data.x86_model) { 270 - case 0x00 ... 0x1F: 269 + case 0x00 ... 0x0F: 271 270 return true; 272 271 default: 273 272 return false; ··· 288 287 boot_cpu_data.x86, boot_cpu_data.x86_model); 289 288 return ret; 290 289 } 290 + 291 + if (acpi_dev_present(ACPI_HSMP_DEVICE_HID, NULL, -1)) 292 + return -ENODEV; 291 293 292 294 hsmp_pdev = get_hsmp_pdev(); 293 295 if (!hsmp_pdev)
+7
drivers/platform/x86/amd/pmc/pmc-quirks.c
··· 217 217 DMI_MATCH(DMI_BIOS_VERSION, "03.05"), 218 218 } 219 219 }, 220 + { 221 + .ident = "MECHREVO Wujie 14X (GX4HRXL)", 222 + .driver_data = &quirk_spurious_8042, 223 + .matches = { 224 + DMI_MATCH(DMI_BOARD_NAME, "WUJIE14-GX4HRXL"), 225 + } 226 + }, 220 227 {} 221 228 }; 222 229
+22 -1
drivers/platform/x86/amd/pmf/tee-if.c
··· 334 334 return 0; 335 335 } 336 336 337 + static inline bool amd_pmf_pb_valid(struct amd_pmf_dev *dev) 338 + { 339 + return memchr_inv(dev->policy_buf, 0xff, dev->policy_sz); 340 + } 341 + 337 342 #ifdef CONFIG_AMD_PMF_DEBUG 338 343 static void amd_pmf_hex_dump_pb(struct amd_pmf_dev *dev) 339 344 { ··· 366 361 dev->policy_buf = new_policy_buf; 367 362 dev->policy_sz = length; 368 363 364 + if (!amd_pmf_pb_valid(dev)) { 365 + ret = -EINVAL; 366 + goto cleanup; 367 + } 368 + 369 369 amd_pmf_hex_dump_pb(dev); 370 370 ret = amd_pmf_start_policy_engine(dev); 371 371 if (ret < 0) 372 - return ret; 372 + goto cleanup; 373 373 374 374 return length; 375 + 376 + cleanup: 377 + kfree(dev->policy_buf); 378 + dev->policy_buf = NULL; 379 + return ret; 375 380 } 376 381 377 382 static const struct file_operations pb_fops = { ··· 542 527 } 543 528 544 529 memcpy_fromio(dev->policy_buf, dev->policy_base, dev->policy_sz); 530 + 531 + if (!amd_pmf_pb_valid(dev)) { 532 + dev_info(dev->dev, "No Smart PC policy present\n"); 533 + ret = -EINVAL; 534 + goto err_free_policy; 535 + } 545 536 546 537 amd_pmf_hex_dump_pb(dev); 547 538
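The `amd_pmf_pb_valid()` helper added above uses `memchr_inv(dev->policy_buf, 0xff, dev->policy_sz)` to reject a policy buffer that is entirely 0xff, which is what erased flash typically reads back as. A userspace stand-in for that check (the loop mirrors `memchr_inv()` semantics for this one case; it is a sketch, not the kernel implementation):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if any byte differs from 0xff, i.e. the buffer plausibly
 * contains a policy rather than erased/blank storage. */
static bool buf_valid(const unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (buf[i] != 0xff)
			return true;
	return false;
}
```

Validating before `amd_pmf_start_policy_engine()` lets the driver fail with `-EINVAL` (and free the buffer) instead of handing a blank blob to the policy engine.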
+2 -1
drivers/platform/x86/asus-wmi.c
··· 4779 4779 goto fail_leds; 4780 4780 4781 4781 asus_wmi_get_devstate(asus, ASUS_WMI_DEVID_WLAN, &result); 4782 - if (result & (ASUS_WMI_DSTS_PRESENCE_BIT | ASUS_WMI_DSTS_USER_BIT)) 4782 + if ((result & (ASUS_WMI_DSTS_PRESENCE_BIT | ASUS_WMI_DSTS_USER_BIT)) == 4783 + (ASUS_WMI_DSTS_PRESENCE_BIT | ASUS_WMI_DSTS_USER_BIT)) 4783 4784 asus->driver->wlan_ctrl_by_user = 1; 4784 4785 4785 4786 if (!(asus->driver->wlan_ctrl_by_user && ashs_present())) {
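The asus-wmi fix above turns an any-bit test into an all-bits test: `result & (A | B)` is true when either bit is set, while `(result & (A | B)) == (A | B)` requires both. A minimal sketch of the difference (the bit values are illustrative, not the real `ASUS_WMI_DSTS_*` definitions):

```c
#include <assert.h>
#include <stdbool.h>

#define PRESENCE_BIT 0x1u	/* illustrative stand-ins */
#define USER_BIT     0x2u

/* Buggy form: true if EITHER bit is set. */
static bool any_bit_set(unsigned int status)
{
	return status & (PRESENCE_BIT | USER_BIT);
}

/* Fixed form: true only if BOTH bits are set. */
static bool all_bits_set(unsigned int status)
{
	return (status & (PRESENCE_BIT | USER_BIT)) ==
	       (PRESENCE_BIT | USER_BIT);
}
```

With the old test, a device reporting only the presence bit would wrongly be treated as user-controlled WLAN.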
+2
drivers/platform/x86/thinkpad_acpi.c
··· 11478 11478 tp->vendor = PCI_VENDOR_ID_IBM; 11479 11479 else if (dmi_name_in_vendors("LENOVO")) 11480 11480 tp->vendor = PCI_VENDOR_ID_LENOVO; 11481 + else if (dmi_name_in_vendors("NEC")) 11482 + tp->vendor = PCI_VENDOR_ID_LENOVO; 11481 11483 else 11482 11484 return 0; 11483 11485
+6 -1
drivers/regulator/max20086-regulator.c
··· 132 132 133 133 static int max20086_parse_regulators_dt(struct max20086 *chip, bool *boot_on) 134 134 { 135 - struct of_regulator_match matches[MAX20086_MAX_REGULATORS] = { }; 135 + struct of_regulator_match *matches; 136 136 struct device_node *node; 137 137 unsigned int i; 138 138 int ret; ··· 142 142 dev_err(chip->dev, "regulators node not found\n"); 143 143 return -ENODEV; 144 144 } 145 + 146 + matches = devm_kcalloc(chip->dev, chip->info->num_outputs, 147 + sizeof(*matches), GFP_KERNEL); 148 + if (!matches) 149 + return -ENOMEM; 145 150 146 151 for (i = 0; i < chip->info->num_outputs; ++i) 147 152 matches[i].name = max20086_output_names[i];
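The max20086 change above replaces a fixed-size on-stack `of_regulator_match` array with a `devm_kcalloc()` allocation sized by the chip's actual `num_outputs`. A userspace sketch of the same pattern with plain `calloc()` (struct and names are illustrative):

```c
#include <assert.h>
#include <stdlib.h>

struct of_match {
	const char *name;	/* filled in per output */
};

/* Allocate a zeroed match table sized for the actual chip variant,
 * instead of reserving the worst case on the stack. */
static struct of_match *alloc_matches(unsigned int num_outputs)
{
	return calloc(num_outputs, sizeof(struct of_match));
}
```

Sizing by `num_outputs` keeps the stack frame small and means a future variant with more outputs cannot silently overrun a fixed array.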
+5 -1
drivers/scsi/sd_zbc.c
··· 169 169 unsigned int nr_zones, size_t *buflen) 170 170 { 171 171 struct request_queue *q = sdkp->disk->queue; 172 + unsigned int max_segments; 172 173 size_t bufsize; 173 174 void *buf; 174 175 ··· 181 180 * Furthermore, since the report zone command cannot be split, make 182 181 * sure that the allocated buffer can always be mapped by limiting the 183 182 * number of pages allocated to the HBA max segments limit. 183 + * Since max segments can be larger than the max inline bio vectors, 184 + * further limit the allocated buffer to BIO_MAX_INLINE_VECS. 184 185 */ 185 186 nr_zones = min(nr_zones, sdkp->zone_info.nr_zones); 186 187 bufsize = roundup((nr_zones + 1) * 64, SECTOR_SIZE); 187 188 bufsize = min_t(size_t, bufsize, 188 189 queue_max_hw_sectors(q) << SECTOR_SHIFT); 189 - bufsize = min_t(size_t, bufsize, queue_max_segments(q) << PAGE_SHIFT); 190 + max_segments = min(BIO_MAX_INLINE_VECS, queue_max_segments(q)); 191 + bufsize = min_t(size_t, bufsize, max_segments << PAGE_SHIFT); 190 192 191 193 while (bufsize >= SECTOR_SIZE) { 192 194 buf = kvzalloc(bufsize, GFP_KERNEL | __GFP_NORETRY);
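The sd_zbc change above adds one more clamp to the report-zones buffer sizing: the segment count is now capped at `BIO_MAX_INLINE_VECS` before being multiplied by the page size. A simplified model of the sizing arithmetic (constants and limits below are made up for illustration; the real code also retries with smaller sizes on allocation failure):

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SECTOR_SIZE 512u
#define DEMO_PAGE_SIZE   4096u

static size_t min_sz(size_t a, size_t b)
{
	return a < b ? a : b;
}

/* Round the zone-report size up to a sector, then clamp by the HBA's
 * max transfer size and by the number of mappable segments. */
static size_t report_zones_bufsize(unsigned int nr_zones,
				   size_t max_hw_bytes,
				   unsigned int max_segments,
				   unsigned int max_inline_vecs)
{
	size_t bufsize = ((nr_zones + 1) * 64 + DEMO_SECTOR_SIZE - 1) /
			 DEMO_SECTOR_SIZE * DEMO_SECTOR_SIZE;
	unsigned int segs = max_segments < max_inline_vecs ?
			    max_segments : max_inline_vecs;

	bufsize = min_sz(bufsize, max_hw_bytes);
	bufsize = min_sz(bufsize, (size_t)segs * DEMO_PAGE_SIZE);
	return bufsize;
}
```

Without the extra cap, a queue advertising more segments than the bio can hold inline could produce a buffer the command cannot actually map.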
+1
drivers/scsi/storvsc_drv.c
··· 1819 1819 return SCSI_MLQUEUE_DEVICE_BUSY; 1820 1820 } 1821 1821 1822 + payload->rangecount = 1; 1822 1823 payload->range.len = length; 1823 1824 payload->range.offset = offset_in_hvpg; 1824 1825
+5 -4
drivers/soundwire/bus.c
··· 122 122 set_bit(SDW_GROUP13_DEV_NUM, bus->assigned); 123 123 set_bit(SDW_MASTER_DEV_NUM, bus->assigned); 124 124 125 + ret = sdw_irq_create(bus, fwnode); 126 + if (ret) 127 + return ret; 128 + 125 129 /* 126 130 * SDW is an enumerable bus, but devices can be powered off. So, 127 131 * they won't be able to report as present. ··· 142 138 143 139 if (ret < 0) { 144 140 dev_err(bus->dev, "Finding slaves failed:%d\n", ret); 141 + sdw_irq_delete(bus); 145 142 return ret; 146 143 } 147 144 ··· 160 155 bus->params.curr_dr_freq = bus->params.max_dr_freq; 161 156 bus->params.curr_bank = SDW_BANK0; 162 157 bus->params.next_bank = SDW_BANK1; 163 - 164 - ret = sdw_irq_create(bus, fwnode); 165 - if (ret) 166 - return ret; 167 158 168 159 return 0; 169 160 }
+1 -1
drivers/spi/spi-loopback-test.c
··· 420 420 static void spi_test_print_hex_dump(char *pre, const void *ptr, size_t len) 421 421 { 422 422 /* limit the hex_dump */ 423 - if (len < 1024) { 423 + if (len <= 1024) { 424 424 print_hex_dump(KERN_INFO, pre, 425 425 DUMP_PREFIX_OFFSET, 16, 1, 426 426 ptr, len, 0);
+4 -1
drivers/spi/spi-sun4i.c
··· 264 264 else 265 265 reg |= SUN4I_CTL_DHB; 266 266 267 + /* Now that the settings are correct, enable the interface */ 268 + reg |= SUN4I_CTL_ENABLE; 269 + 267 270 sun4i_spi_write(sspi, SUN4I_CTL_REG, reg); 268 271 269 272 /* Ensure that we have a parent clock fast enough */ ··· 407 404 } 408 405 409 406 sun4i_spi_write(sspi, SUN4I_CTL_REG, 410 - SUN4I_CTL_ENABLE | SUN4I_CTL_MASTER | SUN4I_CTL_TP); 407 + SUN4I_CTL_MASTER | SUN4I_CTL_TP); 411 408 412 409 return 0; 413 410
+3 -3
drivers/spi/spi-tegra114.c
··· 728 728 u32 inactive_cycles; 729 729 u8 cs_state; 730 730 731 - if ((setup->unit && setup->unit != SPI_DELAY_UNIT_SCK) || 732 - (hold->unit && hold->unit != SPI_DELAY_UNIT_SCK) || 733 - (inactive->unit && inactive->unit != SPI_DELAY_UNIT_SCK)) { 731 + if ((setup->value && setup->unit != SPI_DELAY_UNIT_SCK) || 732 + (hold->value && hold->unit != SPI_DELAY_UNIT_SCK) || 733 + (inactive->value && inactive->unit != SPI_DELAY_UNIT_SCK)) { 734 734 dev_err(&spi->dev, 735 735 "Invalid delay unit %d, should be SPI_DELAY_UNIT_SCK\n", 736 736 SPI_DELAY_UNIT_SCK);
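The spi-tegra114 fix above only rejects a wrong delay unit when the delay actually has a nonzero value (`setup->value &&` instead of `setup->unit &&`), since a zero-length delay carries no unit worth validating. A simplified model of the corrected check (struct and names are illustrative, not the real `struct spi_delay`):

```c
#include <assert.h>
#include <stdbool.h>

enum delay_unit {
	UNIT_USECS,	/* would be invalid for this controller */
	UNIT_SCK,	/* the only unit the hardware supports */
};

struct delay {
	unsigned int value;
	enum delay_unit unit;
};

/* A delay is acceptable if it is zero (unit irrelevant) or expressed
 * in clock cycles. */
static bool delay_ok(const struct delay *d)
{
	return !(d->value && d->unit != UNIT_SCK);
}
```

The old form rejected harmless zero-value delays whose unit field merely happened to be nonzero.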
+1 -1
drivers/usb/gadget/function/f_midi2.c
··· 475 475 /* reply a UMP EP device info */ 476 476 static void reply_ump_stream_ep_device(struct f_midi2_ep *ep) 477 477 { 478 - struct snd_ump_stream_msg_devince_info rep = { 478 + struct snd_ump_stream_msg_device_info rep = { 479 479 .type = UMP_MSG_TYPE_STREAM, 480 480 .status = UMP_STREAM_MSG_STATUS_DEVICE_INFO, 481 481 .manufacture_id = ep->info.manufacturer,
+79 -38
fs/bcachefs/backpointers.c
··· 192 192 static int backpointer_target_not_found(struct btree_trans *trans, 193 193 struct bkey_s_c_backpointer bp, 194 194 struct bkey_s_c target_k, 195 - struct bkey_buf *last_flushed) 195 + struct bkey_buf *last_flushed, 196 + bool commit) 196 197 { 197 198 struct bch_fs *c = trans->c; 198 199 struct printbuf buf = PRINTBUF; ··· 229 228 } 230 229 231 230 if (fsck_err(trans, backpointer_to_missing_ptr, 232 - "%s", buf.buf)) 231 + "%s", buf.buf)) { 233 232 ret = bch2_backpointer_del(trans, bp.k->p); 233 + if (ret || !commit) 234 + goto out; 235 + 236 + /* 237 + * Normally, on transaction commit from inside a transaction, 238 + * we'll return -BCH_ERR_transaction_restart_nested, since a 239 + * transaction commit invalidates pointers given out by peek(). 240 + * 241 + * However, since we're updating a write buffer btree, if we 242 + * return a transaction restart and loop we won't see that the 243 + * backpointer has been deleted without an additional write 244 + * buffer flush - and those are expensive. 
245 + * 246 + * So we're relying on the caller immediately advancing to the 247 + * next backpointer and starting a new transaction immediately 248 + * after backpointer_get_key() returns NULL: 249 + */ 250 + ret = bch2_trans_commit(trans, NULL, NULL, BCH_TRANS_COMMIT_no_enospc); 251 + } 252 + out: 234 253 fsck_err: 235 254 printbuf_exit(&buf); 236 255 return ret; 237 256 } 238 257 239 - struct bkey_s_c bch2_backpointer_get_key(struct btree_trans *trans, 240 - struct bkey_s_c_backpointer bp, 241 - struct btree_iter *iter, 242 - unsigned iter_flags, 243 - struct bkey_buf *last_flushed) 258 + static struct btree *__bch2_backpointer_get_node(struct btree_trans *trans, 259 + struct bkey_s_c_backpointer bp, 260 + struct btree_iter *iter, 261 + struct bkey_buf *last_flushed, 262 + bool commit) 263 + { 264 + struct bch_fs *c = trans->c; 265 + 266 + BUG_ON(!bp.v->level); 267 + 268 + bch2_trans_node_iter_init(trans, iter, 269 + bp.v->btree_id, 270 + bp.v->pos, 271 + 0, 272 + bp.v->level - 1, 273 + 0); 274 + struct btree *b = bch2_btree_iter_peek_node(trans, iter); 275 + if (IS_ERR_OR_NULL(b)) 276 + goto err; 277 + 278 + BUG_ON(b->c.level != bp.v->level - 1); 279 + 280 + if (extent_matches_bp(c, bp.v->btree_id, bp.v->level, 281 + bkey_i_to_s_c(&b->key), bp)) 282 + return b; 283 + 284 + if (btree_node_will_make_reachable(b)) { 285 + b = ERR_PTR(-BCH_ERR_backpointer_to_overwritten_btree_node); 286 + } else { 287 + int ret = backpointer_target_not_found(trans, bp, bkey_i_to_s_c(&b->key), 288 + last_flushed, commit); 289 + b = ret ? 
ERR_PTR(ret) : NULL; 290 + } 291 + err: 292 + bch2_trans_iter_exit(trans, iter); 293 + return b; 294 + } 295 + 296 + static struct bkey_s_c __bch2_backpointer_get_key(struct btree_trans *trans, 297 + struct bkey_s_c_backpointer bp, 298 + struct btree_iter *iter, 299 + unsigned iter_flags, 300 + struct bkey_buf *last_flushed, 301 + bool commit) 244 302 { 245 303 struct bch_fs *c = trans->c; 246 304 ··· 337 277 bch2_trans_iter_exit(trans, iter); 338 278 339 279 if (!bp.v->level) { 340 - int ret = backpointer_target_not_found(trans, bp, k, last_flushed); 280 + int ret = backpointer_target_not_found(trans, bp, k, last_flushed, commit); 341 281 return ret ? bkey_s_c_err(ret) : bkey_s_c_null; 342 282 } else { 343 - struct btree *b = bch2_backpointer_get_node(trans, bp, iter, last_flushed); 283 + struct btree *b = __bch2_backpointer_get_node(trans, bp, iter, last_flushed, commit); 344 284 if (b == ERR_PTR(-BCH_ERR_backpointer_to_overwritten_btree_node)) 345 285 return bkey_s_c_null; 346 286 if (IS_ERR_OR_NULL(b)) ··· 355 295 struct btree_iter *iter, 356 296 struct bkey_buf *last_flushed) 357 297 { 358 - struct bch_fs *c = trans->c; 298 + return __bch2_backpointer_get_node(trans, bp, iter, last_flushed, true); 299 + } 359 300 360 - BUG_ON(!bp.v->level); 361 - 362 - bch2_trans_node_iter_init(trans, iter, 363 - bp.v->btree_id, 364 - bp.v->pos, 365 - 0, 366 - bp.v->level - 1, 367 - 0); 368 - struct btree *b = bch2_btree_iter_peek_node(trans, iter); 369 - if (IS_ERR_OR_NULL(b)) 370 - goto err; 371 - 372 - BUG_ON(b->c.level != bp.v->level - 1); 373 - 374 - if (extent_matches_bp(c, bp.v->btree_id, bp.v->level, 375 - bkey_i_to_s_c(&b->key), bp)) 376 - return b; 377 - 378 - if (btree_node_will_make_reachable(b)) { 379 - b = ERR_PTR(-BCH_ERR_backpointer_to_overwritten_btree_node); 380 - } else { 381 - int ret = backpointer_target_not_found(trans, bp, bkey_i_to_s_c(&b->key), last_flushed); 382 - b = ret ? 
ERR_PTR(ret) : NULL; 383 - } 384 - err: 385 - bch2_trans_iter_exit(trans, iter); 386 - return b; 301 + struct bkey_s_c bch2_backpointer_get_key(struct btree_trans *trans, 302 + struct bkey_s_c_backpointer bp, 303 + struct btree_iter *iter, 304 + unsigned iter_flags, 305 + struct bkey_buf *last_flushed) 306 + { 307 + return __bch2_backpointer_get_key(trans, bp, iter, iter_flags, last_flushed, true); 387 308 } 388 309 389 310 static int bch2_check_backpointer_has_valid_bucket(struct btree_trans *trans, struct bkey_s_c k, ··· 562 521 struct bkey_s_c_backpointer other_bp = bkey_s_c_to_backpointer(bp_k); 563 522 564 523 struct bkey_s_c other_extent = 565 - bch2_backpointer_get_key(trans, other_bp, &other_extent_iter, 0, NULL); 524 + __bch2_backpointer_get_key(trans, other_bp, &other_extent_iter, 0, NULL, false); 566 525 ret = bkey_err(other_extent); 567 526 if (ret == -BCH_ERR_backpointer_to_overwritten_btree_node) 568 527 ret = 0;
+4 -5
fs/bcachefs/btree_cache.c
··· 852 852 b->sib_u64s[1] = 0; 853 853 b->whiteout_u64s = 0; 854 854 bch2_btree_keys_init(b); 855 - set_btree_node_accessed(b); 856 855 857 856 bch2_time_stats_update(&c->times[BCH_TIME_btree_node_mem_alloc], 858 857 start_time); ··· 1285 1286 six_unlock_read(&b->c.lock); 1286 1287 goto retry; 1287 1288 } 1289 + 1290 + /* avoid atomic set bit if it's not needed: */ 1291 + if (!btree_node_accessed(b)) 1292 + set_btree_node_accessed(b); 1288 1293 } 1289 1294 1290 1295 /* XXX: waiting on IO with btree locks held: */ ··· 1303 1300 prefetch(p + L1_CACHE_BYTES * 1); 1304 1301 prefetch(p + L1_CACHE_BYTES * 2); 1305 1302 } 1306 - 1307 - /* avoid atomic set bit if it's not needed: */ 1308 - if (!btree_node_accessed(b)) 1309 - set_btree_node_accessed(b); 1310 1303 1311 1304 if (unlikely(btree_node_read_error(b))) { 1312 1305 six_unlock_read(&b->c.lock);
+14 -8
fs/bcachefs/btree_iter.c
··· 1971 1971 return NULL; 1972 1972 } 1973 1973 1974 + /* 1975 + * We don't correctly handle nodes with extra intent locks here: 1976 + * downgrade so we don't violate locking invariants 1977 + */ 1978 + bch2_btree_path_downgrade(trans, path); 1979 + 1974 1980 if (!bch2_btree_node_relock(trans, path, path->level + 1)) { 1975 1981 __bch2_btree_path_unlock(trans, path); 1976 1982 path->l[path->level].b = ERR_PTR(-BCH_ERR_no_btree_node_relock); ··· 2749 2743 ret = trans_maybe_inject_restart(trans, _RET_IP_); 2750 2744 if (unlikely(ret)) { 2751 2745 k = bkey_s_c_err(ret); 2752 - goto out_no_locked; 2746 + goto out; 2753 2747 } 2754 2748 2755 2749 /* extents can't span inode numbers: */ ··· 2769 2763 ret = bch2_btree_path_traverse(trans, iter->path, iter->flags); 2770 2764 if (unlikely(ret)) { 2771 2765 k = bkey_s_c_err(ret); 2772 - goto out_no_locked; 2766 + goto out; 2773 2767 } 2774 2768 2775 2769 struct btree_path *path = btree_iter_path(trans, iter); 2776 2770 if (unlikely(!btree_path_node(path, path->level))) 2777 2771 return bkey_s_c_null; 2772 + 2773 + btree_path_set_should_be_locked(trans, path); 2778 2774 2779 2775 if ((iter->flags & BTREE_ITER_cached) || 2780 2776 !(iter->flags & (BTREE_ITER_is_extents|BTREE_ITER_filter_snapshots))) { ··· 2798 2790 if (!bkey_err(k)) 2799 2791 iter->k = *k.k; 2800 2792 /* We're not returning a key from iter->path: */ 2801 - goto out_no_locked; 2793 + goto out; 2802 2794 } 2803 2795 2804 - k = bch2_btree_path_peek_slot(trans->paths + iter->path, &iter->k); 2796 + k = bch2_btree_path_peek_slot(btree_iter_path(trans, iter), &iter->k); 2805 2797 if (unlikely(!k.k)) 2806 - goto out_no_locked; 2798 + goto out; 2807 2799 2808 2800 if (unlikely(k.k->type == KEY_TYPE_whiteout && 2809 2801 (iter->flags & BTREE_ITER_filter_snapshots) && ··· 2841 2833 } 2842 2834 2843 2835 if (unlikely(bkey_err(k))) 2844 - goto out_no_locked; 2836 + goto out; 2845 2837 2846 2838 next = k.k ? 
bkey_start_pos(k.k) : POS_MAX; 2847 2839 ··· 2863 2855 } 2864 2856 } 2865 2857 out: 2866 - btree_path_set_should_be_locked(trans, btree_iter_path(trans, iter)); 2867 - out_no_locked: 2868 2858 bch2_btree_iter_verify_entry_exit(iter); 2869 2859 bch2_btree_iter_verify(trans, iter); 2870 2860 ret = bch2_btree_iter_verify_ret(trans, iter, k);
+15 -2
fs/bcachefs/disk_accounting.c
··· 376 376 return ret; 377 377 } 378 378 379 + int bch2_accounting_mem_insert_locked(struct bch_fs *c, struct bkey_s_c_accounting a, 380 + enum bch_accounting_mode mode) 381 + { 382 + struct bch_replicas_padded r; 383 + 384 + if (mode != BCH_ACCOUNTING_read && 385 + accounting_to_replicas(&r.e, a.k->p) && 386 + !bch2_replicas_marked_locked(c, &r.e)) 387 + return -BCH_ERR_btree_insert_need_mark_replicas; 388 + 389 + return __bch2_accounting_mem_insert(c, a); 390 + } 391 + 379 392 static bool accounting_mem_entry_is_zero(struct accounting_mem_entry *e) 380 393 { 381 394 for (unsigned i = 0; i < e->nr_counters; i++) ··· 596 583 accounting_key_init(&k_i.k, &acc_k, src_v, nr); 597 584 bch2_accounting_mem_mod_locked(trans, 598 585 bkey_i_to_s_c_accounting(&k_i.k), 599 - BCH_ACCOUNTING_normal); 586 + BCH_ACCOUNTING_normal, true); 600 587 601 588 preempt_disable(); 602 589 struct bch_fs_usage_base *dst = this_cpu_ptr(c->usage); ··· 625 612 626 613 percpu_down_read(&c->mark_lock); 627 614 int ret = bch2_accounting_mem_mod_locked(trans, bkey_s_c_to_accounting(k), 628 - BCH_ACCOUNTING_read); 615 + BCH_ACCOUNTING_read, false); 629 616 percpu_up_read(&c->mark_lock); 630 617 return ret; 631 618 }
+11 -5
fs/bcachefs/disk_accounting.h
··· 136 136 }; 137 137 138 138 int bch2_accounting_mem_insert(struct bch_fs *, struct bkey_s_c_accounting, enum bch_accounting_mode); 139 + int bch2_accounting_mem_insert_locked(struct bch_fs *, struct bkey_s_c_accounting, enum bch_accounting_mode); 139 140 void bch2_accounting_mem_gc(struct bch_fs *); 140 141 141 142 static inline bool bch2_accounting_is_mem(struct disk_accounting_pos acc) ··· 151 150 */ 152 151 static inline int bch2_accounting_mem_mod_locked(struct btree_trans *trans, 153 152 struct bkey_s_c_accounting a, 154 - enum bch_accounting_mode mode) 153 + enum bch_accounting_mode mode, 154 + bool write_locked) 155 155 { 156 156 struct bch_fs *c = trans->c; 157 157 struct bch_accounting_mem *acc = &c->accounting; ··· 191 189 192 190 while ((idx = eytzinger0_find(acc->k.data, acc->k.nr, sizeof(acc->k.data[0]), 193 191 accounting_pos_cmp, &a.k->p)) >= acc->k.nr) { 194 - int ret = bch2_accounting_mem_insert(c, a, mode); 192 + int ret = 0; 193 + if (unlikely(write_locked)) 194 + ret = bch2_accounting_mem_insert_locked(c, a, mode); 195 + else 196 + ret = bch2_accounting_mem_insert(c, a, mode); 195 197 if (ret) 196 198 return ret; 197 199 } ··· 212 206 static inline int bch2_accounting_mem_add(struct btree_trans *trans, struct bkey_s_c_accounting a, bool gc) 213 207 { 214 208 percpu_down_read(&trans->c->mark_lock); 215 - int ret = bch2_accounting_mem_mod_locked(trans, a, gc ? BCH_ACCOUNTING_gc : BCH_ACCOUNTING_normal); 209 + int ret = bch2_accounting_mem_mod_locked(trans, a, gc ? BCH_ACCOUNTING_gc : BCH_ACCOUNTING_normal, false); 216 210 percpu_up_read(&trans->c->mark_lock); 217 211 return ret; 218 212 } ··· 265 259 EBUG_ON(bversion_zero(a->k.bversion)); 266 260 267 261 return likely(!(commit_flags & BCH_TRANS_COMMIT_skip_accounting_apply)) 268 - ? bch2_accounting_mem_mod_locked(trans, accounting_i_to_s_c(a), BCH_ACCOUNTING_normal) 262 + ? 
bch2_accounting_mem_mod_locked(trans, accounting_i_to_s_c(a), BCH_ACCOUNTING_normal, false) 269 263 : 0; 270 264 } 271 265 ··· 277 271 struct bkey_s_accounting a = accounting_i_to_s(a_i); 278 272 279 273 bch2_accounting_neg(a); 280 - bch2_accounting_mem_mod_locked(trans, a.c, BCH_ACCOUNTING_normal); 274 + bch2_accounting_mem_mod_locked(trans, a.c, BCH_ACCOUNTING_normal, false); 281 275 bch2_accounting_neg(a); 282 276 } 283 277 }
+3 -1
fs/bcachefs/fs.c
··· 1429 1429 if (ret) 1430 1430 goto err; 1431 1431 1432 - ret = bch2_next_fiemap_pagecache_extent(trans, inode, start, end, cur); 1432 + u64 pagecache_end = k.k ? max(start, bkey_start_offset(k.k)) : end; 1433 + 1434 + ret = bch2_next_fiemap_pagecache_extent(trans, inode, start, pagecache_end, cur); 1433 1435 if (ret) 1434 1436 goto err; 1435 1437
+1 -1
fs/bcachefs/fsck.c
··· 2446 2446 u32 parent = le32_to_cpu(s.v->fs_path_parent); 2447 2447 2448 2448 if (darray_u32_has(&subvol_path, parent)) { 2449 - if (fsck_err(c, subvol_loop, "subvolume loop")) 2449 + if (fsck_err(trans, subvol_loop, "subvolume loop")) 2450 2450 ret = reattach_subvol(trans, s); 2451 2451 break; 2452 2452 }
+13 -6
fs/bcachefs/journal_reclaim.c
··· 17 17 #include <linux/kthread.h> 18 18 #include <linux/sched/mm.h> 19 19 20 + static bool __should_discard_bucket(struct journal *, struct journal_device *); 21 + 20 22 /* Free space calculations: */ 21 23 22 24 static unsigned journal_space_from(struct journal_device *ja, ··· 205 203 ja->bucket_seq[ja->dirty_idx_ondisk] < j->last_seq_ondisk) 206 204 ja->dirty_idx_ondisk = (ja->dirty_idx_ondisk + 1) % ja->nr; 207 205 208 - if (ja->discard_idx != ja->dirty_idx_ondisk) 209 - can_discard = true; 206 + can_discard |= __should_discard_bucket(j, ja); 210 207 211 208 max_entry_size = min_t(unsigned, max_entry_size, ca->mi.bucket_size); 212 209 nr_online++; ··· 265 264 266 265 /* Discards - last part of journal reclaim: */ 267 266 267 + static bool __should_discard_bucket(struct journal *j, struct journal_device *ja) 268 + { 269 + unsigned min_free = max(4, ja->nr / 8); 270 + 271 + return bch2_journal_dev_buckets_available(j, ja, journal_space_discarded) < 272 + min_free && 273 + ja->discard_idx != ja->dirty_idx_ondisk; 274 + } 275 + 268 276 static bool should_discard_bucket(struct journal *j, struct journal_device *ja) 269 277 { 270 278 spin_lock(&j->lock); 271 - unsigned min_free = max(4, ja->nr / 8); 272 - 273 - bool ret = bch2_journal_dev_buckets_available(j, ja, journal_space_discarded) < min_free && 274 - ja->discard_idx != ja->dirty_idx_ondisk; 279 + bool ret = __should_discard_bucket(j, ja); 275 280 spin_unlock(&j->lock); 276 281 277 282 return ret;
+1 -1
fs/bcachefs/rebalance.c
··· 309 309 struct btree_iter *iter, 310 310 struct bkey_s_c k) 311 311 { 312 - if (!bch2_bkey_rebalance_opts(k)) 312 + if (k.k->type == KEY_TYPE_reflink_v || !bch2_bkey_rebalance_opts(k)) 313 313 return 0; 314 314 315 315 struct bkey_i *n = bch2_bkey_make_mut(trans, iter, &k, 0);
+47 -24
fs/binfmt_elf.c
··· 830 830 struct elf_phdr *elf_ppnt, *elf_phdata, *interp_elf_phdata = NULL; 831 831 struct elf_phdr *elf_property_phdata = NULL; 832 832 unsigned long elf_brk; 833 + bool brk_moved = false; 833 834 int retval, i; 834 835 unsigned long elf_entry; 835 836 unsigned long e_entry; ··· 1098 1097 /* Calculate any requested alignment. */ 1099 1098 alignment = maximum_alignment(elf_phdata, elf_ex->e_phnum); 1100 1099 1101 - /* 1102 - * There are effectively two types of ET_DYN 1103 - * binaries: programs (i.e. PIE: ET_DYN with PT_INTERP) 1104 - * and loaders (ET_DYN without PT_INTERP, since they 1105 - * _are_ the ELF interpreter). The loaders must 1106 - * be loaded away from programs since the program 1107 - * may otherwise collide with the loader (especially 1108 - * for ET_EXEC which does not have a randomized 1109 - * position). For example to handle invocations of 1100 + /** 1101 + * DOC: PIE handling 1102 + * 1103 + * There are effectively two types of ET_DYN ELF 1104 + * binaries: programs (i.e. PIE: ET_DYN with 1105 + * PT_INTERP) and loaders (i.e. static PIE: ET_DYN 1106 + * without PT_INTERP, usually the ELF interpreter 1107 + * itself). Loaders must be loaded away from programs 1108 + * since the program may otherwise collide with the 1109 + * loader (especially for ET_EXEC which does not have 1110 + * a randomized position). 1111 + * 1112 + * For example, to handle invocations of 1110 1113 * "./ld.so someprog" to test out a new version of 1111 1114 * the loader, the subsequent program that the 1112 1115 * loader loads must avoid the loader itself, so ··· 1123 1118 * ELF_ET_DYN_BASE and loaders are loaded into the 1124 1119 * independently randomized mmap region (0 load_bias 1125 1120 * without MAP_FIXED nor MAP_FIXED_NOREPLACE). 1121 + * 1122 + * See below for "brk" handling details, which is 1123 + * also affected by program vs loader and ASLR. 1126 1124 */ 1127 1125 if (interpreter) { 1128 1126 /* On ET_DYN with PT_INTERP, we do the ASLR. 
*/ ··· 1242 1234 start_data += load_bias; 1243 1235 end_data += load_bias; 1244 1236 1245 - current->mm->start_brk = current->mm->brk = ELF_PAGEALIGN(elf_brk); 1246 - 1247 1237 if (interpreter) { 1248 1238 elf_entry = load_elf_interp(interp_elf_ex, 1249 1239 interpreter, ··· 1297 1291 mm->end_data = end_data; 1298 1292 mm->start_stack = bprm->p; 1299 1293 1300 - if ((current->flags & PF_RANDOMIZE) && (snapshot_randomize_va_space > 1)) { 1294 + /** 1295 + * DOC: "brk" handling 1296 + * 1297 + * For architectures with ELF randomization, when executing a 1298 + * loader directly (i.e. static PIE: ET_DYN without PT_INTERP), 1299 + * move the brk area out of the mmap region and into the unused 1300 + * ELF_ET_DYN_BASE region. Since "brk" grows up it may collide 1301 + * early with the stack growing down or other regions being put 1302 + * into the mmap region by the kernel (e.g. vdso). 1303 + * 1304 + * In the CONFIG_COMPAT_BRK case, though, everything is turned 1305 + * off because we're not allowed to move the brk at all. 1306 + */ 1307 + if (!IS_ENABLED(CONFIG_COMPAT_BRK) && 1308 + IS_ENABLED(CONFIG_ARCH_HAS_ELF_RANDOMIZE) && 1309 + elf_ex->e_type == ET_DYN && !interpreter) { 1310 + elf_brk = ELF_ET_DYN_BASE; 1311 + /* This counts as moving the brk, so let brk(2) know. */ 1312 + brk_moved = true; 1313 + } 1314 + mm->start_brk = mm->brk = ELF_PAGEALIGN(elf_brk); 1315 + 1316 + if ((current->flags & PF_RANDOMIZE) && snapshot_randomize_va_space > 1) { 1301 1317 /* 1302 - * For architectures with ELF randomization, when executing 1303 - * a loader directly (i.e. no interpreter listed in ELF 1304 - * headers), move the brk area out of the mmap region 1305 - * (since it grows up, and may collide early with the stack 1306 - * growing down), and into the unused ELF_ET_DYN_BASE region. 1318 + * If we didn't move the brk to ELF_ET_DYN_BASE (above), 1319 + * leave a gap between .bss and brk. 
1307 1320 */ 1308 - if (IS_ENABLED(CONFIG_ARCH_HAS_ELF_RANDOMIZE) && 1309 - elf_ex->e_type == ET_DYN && !interpreter) { 1310 - mm->brk = mm->start_brk = ELF_ET_DYN_BASE; 1311 - } else { 1312 - /* Otherwise leave a gap between .bss and brk. */ 1321 + if (!brk_moved) 1313 1322 mm->brk = mm->start_brk = mm->brk + PAGE_SIZE; 1314 - } 1315 1323 1316 1324 mm->brk = mm->start_brk = arch_randomize_brk(mm); 1325 + brk_moved = true; 1326 + } 1327 + 1317 1328 #ifdef compat_brk_randomized 1329 + if (brk_moved) 1318 1330 current->brk_randomized = 1; 1319 1331 #endif 1320 - } 1321 1332 1322 1333 if (current->personality & MMAP_PAGE_ZERO) { 1323 1334 /* Why this, you ask??? Well SVr4 maps page 0 as read-only,
+15 -2
fs/btrfs/discard.c
··· 94 94 struct btrfs_block_group *block_group) 95 95 { 96 96 lockdep_assert_held(&discard_ctl->lock); 97 - if (!btrfs_run_discard_work(discard_ctl)) 98 - return; 99 97 100 98 if (list_empty(&block_group->discard_list) || 101 99 block_group->discard_index == BTRFS_DISCARD_INDEX_UNUSED) { ··· 114 116 struct btrfs_block_group *block_group) 115 117 { 116 118 if (!btrfs_is_block_group_data_only(block_group)) 119 + return; 120 + 121 + if (!btrfs_run_discard_work(discard_ctl)) 117 122 return; 118 123 119 124 spin_lock(&discard_ctl->lock); ··· 245 244 block_group->used != 0) { 246 245 if (btrfs_is_block_group_data_only(block_group)) { 247 246 __add_to_discard_list(discard_ctl, block_group); 247 + /* 248 + * The block group must have been moved to other 249 + * discard list even if discard was disabled in 250 + * the meantime or a transaction abort happened, 251 + * otherwise we can end up in an infinite loop, 252 + * always jumping into the 'again' label and 253 + * keep getting this block group over and over 254 + * in case there are no other block groups in 255 + * the discard lists. 256 + */ 257 + ASSERT(block_group->discard_index != 258 + BTRFS_DISCARD_INDEX_UNUSED); 248 259 } else { 249 260 list_del_init(&block_group->discard_list); 250 261 btrfs_put_block_group(block_group);
+1
fs/btrfs/fs.h
··· 300 300 #define BTRFS_FEATURE_INCOMPAT_SAFE_CLEAR 0ULL 301 301 302 302 #define BTRFS_DEFAULT_COMMIT_INTERVAL (30) 303 + #define BTRFS_WARNING_COMMIT_INTERVAL (300) 303 304 #define BTRFS_DEFAULT_MAX_INLINE (2048) 304 305 305 306 struct btrfs_dev_replace {
+7
fs/btrfs/inode.c
··· 1109 1109 struct extent_state *cached = NULL; 1110 1110 struct extent_map *em; 1111 1111 int ret = 0; 1112 + bool free_pages = false; 1112 1113 u64 start = async_extent->start; 1113 1114 u64 end = async_extent->start + async_extent->ram_size - 1; 1114 1115 ··· 1130 1129 } 1131 1130 1132 1131 if (async_extent->compress_type == BTRFS_COMPRESS_NONE) { 1132 + ASSERT(!async_extent->folios); 1133 + ASSERT(async_extent->nr_folios == 0); 1133 1134 submit_uncompressed_range(inode, async_extent, locked_folio); 1135 + free_pages = true; 1134 1136 goto done; 1135 1137 } 1136 1138 ··· 1149 1145 * fall back to uncompressed. 1150 1146 */ 1151 1147 submit_uncompressed_range(inode, async_extent, locked_folio); 1148 + free_pages = true; 1152 1149 goto done; 1153 1150 } 1154 1151 ··· 1191 1186 done: 1192 1187 if (async_chunk->blkcg_css) 1193 1188 kthread_associate_blkcg(NULL); 1189 + if (free_pages) 1190 + free_async_extent_pages(async_extent); 1194 1191 kfree(async_extent); 1195 1192 return; 1196 1193
+4
fs/btrfs/super.c
··· 569 569 break; 570 570 case Opt_commit_interval: 571 571 ctx->commit_interval = result.uint_32; 572 + if (ctx->commit_interval > BTRFS_WARNING_COMMIT_INTERVAL) { 573 + btrfs_warn(NULL, "excessive commit interval %u, use with care", 574 + ctx->commit_interval); 575 + } 572 576 if (ctx->commit_interval == 0) 573 577 ctx->commit_interval = BTRFS_DEFAULT_COMMIT_INTERVAL; 574 578 break;
+1 -3
fs/buffer.c
··· 1220 1220 /* FIXME: do we need to set this in both places? */ 1221 1221 if (bh->b_folio && bh->b_folio->mapping) 1222 1222 mapping_set_error(bh->b_folio->mapping, -EIO); 1223 - if (bh->b_assoc_map) { 1223 + if (bh->b_assoc_map) 1224 1224 mapping_set_error(bh->b_assoc_map, -EIO); 1225 - errseq_set(&bh->b_assoc_map->host->i_sb->s_wb_err, -EIO); 1226 - } 1227 1225 } 1228 1226 EXPORT_SYMBOL(mark_buffer_write_io_error); 1229 1227
+4 -3
fs/eventpoll.c
··· 2111 2111 2112 2112 write_unlock_irq(&ep->lock); 2113 2113 2114 - if (!eavail && ep_schedule_timeout(to)) 2115 - timed_out = !schedule_hrtimeout_range(to, slack, 2116 - HRTIMER_MODE_ABS); 2114 + if (!eavail) 2115 + timed_out = !ep_schedule_timeout(to) || 2116 + !schedule_hrtimeout_range(to, slack, 2117 + HRTIMER_MODE_ABS); 2117 2118 __set_current_state(TASK_RUNNING); 2118 2119 2119 2120 /*
+9
fs/nfs/client.c
··· 1105 1105 if (server->namelen == 0 || server->namelen > NFS2_MAXNAMLEN) 1106 1106 server->namelen = NFS2_MAXNAMLEN; 1107 1107 } 1108 + /* Linux 'subtree_check' borkenness mandates this setting */ 1109 + server->fh_expire_type = NFS_FH_VOL_RENAME; 1108 1110 1109 1111 if (!(fattr->valid & NFS_ATTR_FATTR)) { 1110 1112 error = ctx->nfs_mod->rpc_ops->getattr(server, ctx->mntfh, ··· 1202 1200 #if IS_ENABLED(CONFIG_NFS_V4) 1203 1201 idr_init(&nn->cb_ident_idr); 1204 1202 #endif 1203 + #if IS_ENABLED(CONFIG_NFS_V4_1) 1204 + INIT_LIST_HEAD(&nn->nfs4_data_server_cache); 1205 + spin_lock_init(&nn->nfs4_data_server_lock); 1206 + #endif 1205 1207 spin_lock_init(&nn->nfs_client_lock); 1206 1208 nn->boot_time = ktime_get_real(); 1207 1209 memset(&nn->rpcstats, 0, sizeof(nn->rpcstats)); ··· 1222 1216 nfs_cleanup_cb_ident_idr(net); 1223 1217 WARN_ON_ONCE(!list_empty(&nn->nfs_client_list)); 1224 1218 WARN_ON_ONCE(!list_empty(&nn->nfs_volume_list)); 1219 + #if IS_ENABLED(CONFIG_NFS_V4_1) 1220 + WARN_ON_ONCE(!list_empty(&nn->nfs4_data_server_cache)); 1221 + #endif 1225 1222 } 1226 1223 1227 1224 #ifdef CONFIG_PROC_FS
+14 -1
fs/nfs/dir.c
··· 2676 2676 unblock_revalidate(new_dentry); 2677 2677 } 2678 2678 2679 + static bool nfs_rename_is_unsafe_cross_dir(struct dentry *old_dentry, 2680 + struct dentry *new_dentry) 2681 + { 2682 + struct nfs_server *server = NFS_SB(old_dentry->d_sb); 2683 + 2684 + if (old_dentry->d_parent == new_dentry->d_parent) 2685 + return false; 2686 + if (server->fh_expire_type & NFS_FH_RENAME_UNSAFE) 2687 + return !(server->fh_expire_type & NFS_FH_NOEXPIRE_WITH_OPEN); 2688 + return true; 2689 + } 2690 + 2679 2691 /* 2680 2692 * RENAME 2681 2693 * FIXME: Some nfsds, like the Linux user space nfsd, may generate a ··· 2775 2763 2776 2764 } 2777 2765 2778 - if (S_ISREG(old_inode->i_mode)) 2766 + if (S_ISREG(old_inode->i_mode) && 2767 + nfs_rename_is_unsafe_cross_dir(old_dentry, new_dentry)) 2779 2768 nfs_sync_inode(old_inode); 2780 2769 task = nfs_async_rename(old_dir, new_dir, old_dentry, new_dentry, 2781 2770 must_unblock ? nfs_unblock_rename : NULL);
+1 -1
fs/nfs/direct.c
··· 757 757 { 758 758 struct nfs_direct_req *dreq = hdr->dreq; 759 759 struct nfs_commit_info cinfo; 760 - struct nfs_page *req = nfs_list_entry(hdr->pages.next); 761 760 struct inode *inode = dreq->inode; 762 761 int flags = NFS_ODIRECT_DONE; 763 762 ··· 785 786 spin_unlock(&inode->i_lock); 786 787 787 788 while (!list_empty(&hdr->pages)) { 789 + struct nfs_page *req; 788 790 789 791 req = nfs_list_entry(hdr->pages.next); 790 792 nfs_list_remove_request(req);
+3 -3
fs/nfs/filelayout/filelayoutdev.c
··· 76 76 struct page *scratch; 77 77 struct list_head dsaddrs; 78 78 struct nfs4_pnfs_ds_addr *da; 79 + struct net *net = server->nfs_client->cl_net; 79 80 80 81 /* set up xdr stream */ 81 82 scratch = alloc_page(gfp_flags); ··· 160 159 161 160 mp_count = be32_to_cpup(p); /* multipath count */ 162 161 for (j = 0; j < mp_count; j++) { 163 - da = nfs4_decode_mp_ds_addr(server->nfs_client->cl_net, 164 - &stream, gfp_flags); 162 + da = nfs4_decode_mp_ds_addr(net, &stream, gfp_flags); 165 163 if (da) 166 164 list_add_tail(&da->da_node, &dsaddrs); 167 165 } ··· 170 170 goto out_err_free_deviceid; 171 171 } 172 172 173 - dsaddr->ds_list[i] = nfs4_pnfs_ds_add(&dsaddrs, gfp_flags); 173 + dsaddr->ds_list[i] = nfs4_pnfs_ds_add(net, &dsaddrs, gfp_flags); 174 174 if (!dsaddr->ds_list[i]) 175 175 goto out_err_drain_dsaddrs; 176 176 trace_fl_getdevinfo(server, &pdev->dev_id, dsaddr->ds_list[i]->ds_remotestr);
+3 -3
fs/nfs/flexfilelayout/flexfilelayout.c
··· 1329 1329 hdr->args.offset, hdr->args.count, 1330 1330 &hdr->res.op_status, OP_READ, 1331 1331 task->tk_status); 1332 - trace_ff_layout_read_error(hdr); 1332 + trace_ff_layout_read_error(hdr, task->tk_status); 1333 1333 } 1334 1334 1335 1335 err = ff_layout_async_handle_error(task, hdr->args.context->state, ··· 1502 1502 hdr->args.offset, hdr->args.count, 1503 1503 &hdr->res.op_status, OP_WRITE, 1504 1504 task->tk_status); 1505 - trace_ff_layout_write_error(hdr); 1505 + trace_ff_layout_write_error(hdr, task->tk_status); 1506 1506 } 1507 1507 1508 1508 err = ff_layout_async_handle_error(task, hdr->args.context->state, ··· 1551 1551 data->args.offset, data->args.count, 1552 1552 &data->res.op_status, OP_COMMIT, 1553 1553 task->tk_status); 1554 - trace_ff_layout_commit_error(data); 1554 + trace_ff_layout_commit_error(data, task->tk_status); 1555 1555 } 1556 1556 1557 1557 err = ff_layout_async_handle_error(task, NULL, data->ds_clp,
+3 -3
fs/nfs/flexfilelayout/flexfilelayoutdev.c
··· 49 49 struct nfs4_pnfs_ds_addr *da; 50 50 struct nfs4_ff_layout_ds *new_ds = NULL; 51 51 struct nfs4_ff_ds_version *ds_versions = NULL; 52 + struct net *net = server->nfs_client->cl_net; 52 53 u32 mp_count; 53 54 u32 version_count; 54 55 __be32 *p; ··· 81 80 82 81 for (i = 0; i < mp_count; i++) { 83 82 /* multipath ds */ 84 - da = nfs4_decode_mp_ds_addr(server->nfs_client->cl_net, 85 - &stream, gfp_flags); 83 + da = nfs4_decode_mp_ds_addr(net, &stream, gfp_flags); 86 84 if (da) 87 85 list_add_tail(&da->da_node, &dsaddrs); 88 86 } ··· 149 149 new_ds->ds_versions = ds_versions; 150 150 new_ds->ds_versions_cnt = version_count; 151 151 152 - new_ds->ds = nfs4_pnfs_ds_add(&dsaddrs, gfp_flags); 152 + new_ds->ds = nfs4_pnfs_ds_add(net, &dsaddrs, gfp_flags); 153 153 if (!new_ds->ds) 154 154 goto out_err_drain_dsaddrs; 155 155
+1 -1
fs/nfs/localio.c
··· 278 278 new = __nfs_local_open_fh(clp, cred, fh, nfl, mode); 279 279 if (IS_ERR(new)) 280 280 return NULL; 281 + rcu_read_lock(); 281 282 /* try to swap in the pointer */ 282 283 spin_lock(&clp->cl_uuid.lock); 283 284 nf = rcu_dereference_protected(*pnf, 1); ··· 288 287 rcu_assign_pointer(*pnf, nf); 289 288 } 290 289 spin_unlock(&clp->cl_uuid.lock); 291 - rcu_read_lock(); 292 290 } 293 291 nf = nfs_local_file_get(nf); 294 292 rcu_read_unlock();
+5 -1
fs/nfs/netns.h
··· 31 31 unsigned short nfs_callback_tcpport; 32 32 unsigned short nfs_callback_tcpport6; 33 33 int cb_users[NFS4_MAX_MINOR_VERSION + 1]; 34 - #endif 34 + #endif /* CONFIG_NFS_V4 */ 35 + #if IS_ENABLED(CONFIG_NFS_V4_1) 36 + struct list_head nfs4_data_server_cache; 37 + spinlock_t nfs4_data_server_lock; 38 + #endif /* CONFIG_NFS_V4_1 */ 35 39 struct nfs_netns_client *nfs_client; 36 40 spinlock_t nfs_client_lock; 37 41 ktime_t boot_time;
+1 -1
fs/nfs/nfs3acl.c
··· 104 104 105 105 switch (status) { 106 106 case 0: 107 - status = nfs_refresh_inode(inode, res.fattr); 107 + nfs_refresh_inode(inode, res.fattr); 108 108 break; 109 109 case -EPFNOSUPPORT: 110 110 case -EPROTONOSUPPORT:
+17 -1
fs/nfs/nfs4proc.c
··· 671 671 struct nfs_client *clp = server->nfs_client; 672 672 int ret; 673 673 674 + if ((task->tk_rpc_status == -ENETDOWN || 675 + task->tk_rpc_status == -ENETUNREACH) && 676 + task->tk_flags & RPC_TASK_NETUNREACH_FATAL) { 677 + exception->delay = 0; 678 + exception->recovering = 0; 679 + exception->retry = 0; 680 + return -EIO; 681 + } 682 + 674 683 ret = nfs4_do_handle_exception(server, errorcode, exception); 675 684 if (exception->delay) { 676 685 int ret2 = nfs4_exception_should_retrans(server, exception); ··· 7083 7074 struct nfs4_unlockdata *p; 7084 7075 struct nfs4_state *state = lsp->ls_state; 7085 7076 struct inode *inode = state->inode; 7077 + struct nfs_lock_context *l_ctx; 7086 7078 7087 7079 p = kzalloc(sizeof(*p), GFP_KERNEL); 7088 7080 if (p == NULL) 7089 7081 return NULL; 7082 + l_ctx = nfs_get_lock_context(ctx); 7083 + if (!IS_ERR(l_ctx)) { 7084 + p->l_ctx = l_ctx; 7085 + } else { 7086 + kfree(p); 7087 + return NULL; 7088 + } 7090 7089 p->arg.fh = NFS_FH(inode); 7091 7090 p->arg.fl = &p->fl; 7092 7091 p->arg.seqid = seqid; ··· 7102 7085 p->lsp = lsp; 7103 7086 /* Ensure we don't close file until we're done freeing locks! */ 7104 7087 p->ctx = get_nfs_open_context(ctx); 7105 - p->l_ctx = nfs_get_lock_context(ctx); 7106 7088 locks_init_lock(&p->fl); 7107 7089 locks_copy_lock(&p->fl, fl); 7108 7090 p->server = NFS_SERVER(inode);
+22 -12
fs/nfs/nfs4trace.h
··· 2051 2051 2052 2052 DECLARE_EVENT_CLASS(nfs4_flexfiles_io_event, 2053 2053 TP_PROTO( 2054 - const struct nfs_pgio_header *hdr 2054 + const struct nfs_pgio_header *hdr, 2055 + int error 2055 2056 ), 2056 2057 2057 - TP_ARGS(hdr), 2058 + TP_ARGS(hdr, error), 2058 2059 2059 2060 TP_STRUCT__entry( 2060 2061 __field(unsigned long, error) 2062 + __field(unsigned long, nfs_error) 2061 2063 __field(dev_t, dev) 2062 2064 __field(u32, fhandle) 2063 2065 __field(u64, fileid) ··· 2075 2073 TP_fast_assign( 2076 2074 const struct inode *inode = hdr->inode; 2077 2075 2078 - __entry->error = hdr->res.op_status; 2076 + __entry->error = -error; 2077 + __entry->nfs_error = hdr->res.op_status; 2079 2078 __entry->fhandle = nfs_fhandle_hash(hdr->args.fh); 2080 2079 __entry->fileid = NFS_FILEID(inode); 2081 2080 __entry->dev = inode->i_sb->s_dev; ··· 2091 2088 2092 2089 TP_printk( 2093 2090 "error=%ld (%s) fileid=%02x:%02x:%llu fhandle=0x%08x " 2094 - "offset=%llu count=%u stateid=%d:0x%08x dstaddr=%s", 2091 + "offset=%llu count=%u stateid=%d:0x%08x dstaddr=%s " 2092 + "nfs_error=%lu (%s)", 2095 2093 -__entry->error, 2096 2094 show_nfs4_status(__entry->error), 2097 2095 MAJOR(__entry->dev), MINOR(__entry->dev), ··· 2100 2096 __entry->fhandle, 2101 2097 __entry->offset, __entry->count, 2102 2098 __entry->stateid_seq, __entry->stateid_hash, 2103 - __get_str(dstaddr) 2099 + __get_str(dstaddr), __entry->nfs_error, 2100 + show_nfs4_status(__entry->nfs_error) 2104 2101 ) 2105 2102 ); 2106 2103 2107 2104 #define DEFINE_NFS4_FLEXFILES_IO_EVENT(name) \ 2108 2105 DEFINE_EVENT(nfs4_flexfiles_io_event, name, \ 2109 2106 TP_PROTO( \ 2110 - const struct nfs_pgio_header *hdr \ 2107 + const struct nfs_pgio_header *hdr, \ 2108 + int error \ 2111 2109 ), \ 2112 - TP_ARGS(hdr)) 2110 + TP_ARGS(hdr, error)) 2113 2111 DEFINE_NFS4_FLEXFILES_IO_EVENT(ff_layout_read_error); 2114 2112 DEFINE_NFS4_FLEXFILES_IO_EVENT(ff_layout_write_error); 2115 2113 2116 2114 TRACE_EVENT(ff_layout_commit_error, 2117 2115 
TP_PROTO( 2118 - const struct nfs_commit_data *data 2116 + const struct nfs_commit_data *data, 2117 + int error 2119 2118 ), 2120 2119 2121 - TP_ARGS(data), 2120 + TP_ARGS(data, error), 2122 2121 2123 2122 TP_STRUCT__entry( 2124 2123 __field(unsigned long, error) 2124 + __field(unsigned long, nfs_error) 2125 2125 __field(dev_t, dev) 2126 2126 __field(u32, fhandle) 2127 2127 __field(u64, fileid) ··· 2139 2131 TP_fast_assign( 2140 2132 const struct inode *inode = data->inode; 2141 2133 2142 - __entry->error = data->res.op_status; 2134 + __entry->error = -error; 2135 + __entry->nfs_error = data->res.op_status; 2143 2136 __entry->fhandle = nfs_fhandle_hash(data->args.fh); 2144 2137 __entry->fileid = NFS_FILEID(inode); 2145 2138 __entry->dev = inode->i_sb->s_dev; ··· 2151 2142 2152 2143 TP_printk( 2153 2144 "error=%ld (%s) fileid=%02x:%02x:%llu fhandle=0x%08x " 2154 - "offset=%llu count=%u dstaddr=%s", 2145 + "offset=%llu count=%u dstaddr=%s nfs_error=%lu (%s)", 2155 2146 -__entry->error, 2156 2147 show_nfs4_status(__entry->error), 2157 2148 MAJOR(__entry->dev), MINOR(__entry->dev), 2158 2149 (unsigned long long)__entry->fileid, 2159 2150 __entry->fhandle, 2160 2151 __entry->offset, __entry->count, 2161 - __get_str(dstaddr) 2152 + __get_str(dstaddr), __entry->nfs_error, 2153 + show_nfs4_status(__entry->nfs_error) 2162 2154 ) 2163 2155 ); 2164 2156
+34 -17
fs/nfs/pnfs.c
··· 745 745 return remaining; 746 746 } 747 747 748 + static void pnfs_reset_return_info(struct pnfs_layout_hdr *lo) 749 + { 750 + struct pnfs_layout_segment *lseg; 751 + 752 + list_for_each_entry(lseg, &lo->plh_return_segs, pls_list) 753 + pnfs_set_plh_return_info(lo, lseg->pls_range.iomode, 0); 754 + } 755 + 748 756 static void 749 757 pnfs_free_returned_lsegs(struct pnfs_layout_hdr *lo, 750 758 struct list_head *free_me, ··· 1254 1246 static void 1255 1247 pnfs_layoutreturn_retry_later_locked(struct pnfs_layout_hdr *lo, 1256 1248 const nfs4_stateid *arg_stateid, 1257 - const struct pnfs_layout_range *range) 1249 + const struct pnfs_layout_range *range, 1250 + struct list_head *freeme) 1258 1251 { 1259 - const struct pnfs_layout_segment *lseg; 1260 - u32 seq = be32_to_cpu(arg_stateid->seqid); 1261 - 1262 1252 if (pnfs_layout_is_valid(lo) && 1263 - nfs4_stateid_match_other(&lo->plh_stateid, arg_stateid)) { 1264 - list_for_each_entry(lseg, &lo->plh_return_segs, pls_list) { 1265 - if (pnfs_seqid_is_newer(lseg->pls_seq, seq) || 1266 - !pnfs_should_free_range(&lseg->pls_range, range)) 1267 - continue; 1268 - pnfs_set_plh_return_info(lo, range->iomode, seq); 1269 - break; 1270 - } 1271 - } 1253 + nfs4_stateid_match_other(&lo->plh_stateid, arg_stateid)) 1254 + pnfs_reset_return_info(lo); 1255 + else 1256 + pnfs_mark_layout_stateid_invalid(lo, freeme); 1257 + pnfs_clear_layoutreturn_waitbit(lo); 1272 1258 } 1273 1259 1274 1260 void pnfs_layoutreturn_retry_later(struct pnfs_layout_hdr *lo, ··· 1270 1268 const struct pnfs_layout_range *range) 1271 1269 { 1272 1270 struct inode *inode = lo->plh_inode; 1271 + LIST_HEAD(freeme); 1273 1272 1274 1273 spin_lock(&inode->i_lock); 1275 - pnfs_layoutreturn_retry_later_locked(lo, arg_stateid, range); 1276 - pnfs_clear_layoutreturn_waitbit(lo); 1274 + pnfs_layoutreturn_retry_later_locked(lo, arg_stateid, range, &freeme); 1277 1275 spin_unlock(&inode->i_lock); 1276 + pnfs_free_lseg_list(&freeme); 1278 1277 } 1279 1278 1280 1279 void 
pnfs_layoutreturn_free_lsegs(struct pnfs_layout_hdr *lo, ··· 1295 1292 pnfs_mark_matching_lsegs_invalid(lo, &freeme, range, seq); 1296 1293 pnfs_free_returned_lsegs(lo, &freeme, range, seq); 1297 1294 pnfs_set_layout_stateid(lo, stateid, NULL, true); 1295 + pnfs_reset_return_info(lo); 1298 1296 } else 1299 1297 pnfs_mark_layout_stateid_invalid(lo, &freeme); 1300 1298 out_unlock: ··· 1665 1661 /* Was there an RPC level error? If not, retry */ 1666 1662 if (task->tk_rpc_status == 0) 1667 1663 break; 1664 + /* 1665 + * Is there a fatal network level error? 1666 + * If so release the layout, but flag the error. 1667 + */ 1668 + if ((task->tk_rpc_status == -ENETDOWN || 1669 + task->tk_rpc_status == -ENETUNREACH) && 1670 + task->tk_flags & RPC_TASK_NETUNREACH_FATAL) { 1671 + *ret = 0; 1672 + (*respp)->lrs_present = 0; 1673 + retval = -EIO; 1674 + break; 1675 + } 1668 1676 /* If the call was not sent, let caller handle it */ 1669 1677 if (!RPC_WAS_SENT(task)) 1670 1678 return 0; ··· 1711 1695 struct inode *inode = args->inode; 1712 1696 const nfs4_stateid *res_stateid = NULL; 1713 1697 struct nfs4_xdr_opaque_data *ld_private = args->ld_private; 1698 + LIST_HEAD(freeme); 1714 1699 1715 1700 switch (ret) { 1716 1701 case -NFS4ERR_BADSESSION: ··· 1720 1703 case -NFS4ERR_NOMATCHING_LAYOUT: 1721 1704 spin_lock(&inode->i_lock); 1722 1705 pnfs_layoutreturn_retry_later_locked(lo, &args->stateid, 1723 - &args->range); 1724 - pnfs_clear_layoutreturn_waitbit(lo); 1706 + &args->range, &freeme); 1725 1707 spin_unlock(&inode->i_lock); 1708 + pnfs_free_lseg_list(&freeme); 1726 1709 break; 1727 1710 case 0: 1728 1711 if (res->lrs_present)
+3 -1
fs/nfs/pnfs.h
··· 60 60 struct list_head ds_node; /* nfs4_pnfs_dev_hlist dev_dslist */ 61 61 char *ds_remotestr; /* comma sep list of addrs */ 62 62 struct list_head ds_addrs; 63 + const struct net *ds_net; 63 64 struct nfs_client *ds_clp; 64 65 refcount_t ds_count; 65 66 unsigned long ds_state; ··· 416 415 int pnfs_generic_scan_commit_lists(struct nfs_commit_info *cinfo, int max); 417 416 void pnfs_generic_write_commit_done(struct rpc_task *task, void *data); 418 417 void nfs4_pnfs_ds_put(struct nfs4_pnfs_ds *ds); 419 - struct nfs4_pnfs_ds *nfs4_pnfs_ds_add(struct list_head *dsaddrs, 418 + struct nfs4_pnfs_ds *nfs4_pnfs_ds_add(const struct net *net, 419 + struct list_head *dsaddrs, 420 420 gfp_t gfp_flags); 421 421 void nfs4_pnfs_v3_ds_connect_unload(void); 422 422 int nfs4_pnfs_ds_connect(struct nfs_server *mds_srv, struct nfs4_pnfs_ds *ds,
+18 -14
fs/nfs/pnfs_nfs.c
··· 16 16 #include "nfs4session.h" 17 17 #include "internal.h" 18 18 #include "pnfs.h" 19 + #include "netns.h" 19 20 20 21 #define NFSDBG_FACILITY NFSDBG_PNFS 21 22 ··· 505 504 /* 506 505 * Data server cache 507 506 * 508 - * Data servers can be mapped to different device ids. 509 - * nfs4_pnfs_ds reference counting 507 + * Data servers can be mapped to different device ids, but should 508 + * never be shared between net namespaces. 509 + * 510 + * nfs4_pnfs_ds reference counting: 510 511 * - set to 1 on allocation 511 512 * - incremented when a device id maps a data server already in the cache. 512 513 * - decremented when deviceid is removed from the cache. 513 514 */ 514 - static DEFINE_SPINLOCK(nfs4_ds_cache_lock); 515 - static LIST_HEAD(nfs4_data_server_cache); 516 515 517 516 /* Debug routines */ 518 517 static void ··· 605 604 * Lookup DS by addresses. nfs4_ds_cache_lock is held 606 605 */ 607 606 static struct nfs4_pnfs_ds * 608 - _data_server_lookup_locked(const struct list_head *dsaddrs) 607 + _data_server_lookup_locked(const struct nfs_net *nn, const struct list_head *dsaddrs) 609 608 { 610 609 struct nfs4_pnfs_ds *ds; 611 610 612 - list_for_each_entry(ds, &nfs4_data_server_cache, ds_node) 611 + list_for_each_entry(ds, &nn->nfs4_data_server_cache, ds_node) 613 612 if (_same_data_server_addrs_locked(&ds->ds_addrs, dsaddrs)) 614 613 return ds; 615 614 return NULL; ··· 654 653 655 654 void nfs4_pnfs_ds_put(struct nfs4_pnfs_ds *ds) 656 655 { 657 - if (refcount_dec_and_lock(&ds->ds_count, 658 - &nfs4_ds_cache_lock)) { 656 + struct nfs_net *nn = net_generic(ds->ds_net, nfs_net_id); 657 + 658 + if (refcount_dec_and_lock(&ds->ds_count, &nn->nfs4_data_server_lock)) { 659 659 list_del_init(&ds->ds_node); 660 - spin_unlock(&nfs4_ds_cache_lock); 660 + spin_unlock(&nn->nfs4_data_server_lock); 661 661 destroy_ds(ds); 662 662 } 663 663 } ··· 718 716 * uncached and return cached struct nfs4_pnfs_ds. 
719 717 */ 720 718 struct nfs4_pnfs_ds * 721 - nfs4_pnfs_ds_add(struct list_head *dsaddrs, gfp_t gfp_flags) 719 + nfs4_pnfs_ds_add(const struct net *net, struct list_head *dsaddrs, gfp_t gfp_flags) 722 720 { 721 + struct nfs_net *nn = net_generic(net, nfs_net_id); 723 722 struct nfs4_pnfs_ds *tmp_ds, *ds = NULL; 724 723 char *remotestr; 725 724 ··· 736 733 /* this is only used for debugging, so it's ok if its NULL */ 737 734 remotestr = nfs4_pnfs_remotestr(dsaddrs, gfp_flags); 738 735 739 - spin_lock(&nfs4_ds_cache_lock); 740 - tmp_ds = _data_server_lookup_locked(dsaddrs); 736 + spin_lock(&nn->nfs4_data_server_lock); 737 + tmp_ds = _data_server_lookup_locked(nn, dsaddrs); 741 738 if (tmp_ds == NULL) { 742 739 INIT_LIST_HEAD(&ds->ds_addrs); 743 740 list_splice_init(dsaddrs, &ds->ds_addrs); 744 741 ds->ds_remotestr = remotestr; 745 742 refcount_set(&ds->ds_count, 1); 746 743 INIT_LIST_HEAD(&ds->ds_node); 744 + ds->ds_net = net; 747 745 ds->ds_clp = NULL; 748 - list_add(&ds->ds_node, &nfs4_data_server_cache); 746 + list_add(&ds->ds_node, &nn->nfs4_data_server_cache); 749 747 dprintk("%s add new data server %s\n", __func__, 750 748 ds->ds_remotestr); 751 749 } else { ··· 758 754 refcount_read(&tmp_ds->ds_count)); 759 755 ds = tmp_ds; 760 756 } 761 - spin_unlock(&nfs4_ds_cache_lock); 757 + spin_unlock(&nn->nfs4_data_server_lock); 762 758 out: 763 759 return ds; 764 760 }
+4 -2
fs/smb/client/file.c
··· 160 160 server = cifs_pick_channel(tlink_tcon(req->cfile->tlink)->ses); 161 161 rdata->server = server; 162 162 163 - cifs_negotiate_rsize(server, cifs_sb->ctx, 164 - tlink_tcon(req->cfile->tlink)); 163 + if (cifs_sb->ctx->rsize == 0) { 164 + cifs_negotiate_rsize(server, cifs_sb->ctx, 165 + tlink_tcon(req->cfile->tlink)); 166 + } 165 167 166 168 rc = server->ops->wait_mtu_credits(server, cifs_sb->ctx->rsize, 167 169 &size, &rdata->credits);
+1 -1
fs/smb/client/smb2pdu.c
··· 2968 2968 /* Eventually save off posix specific response info and timestamps */ 2969 2969 2970 2970 err_free_rsp_buf: 2971 - free_rsp_buf(resp_buftype, rsp); 2971 + free_rsp_buf(resp_buftype, rsp_iov.iov_base); 2972 2972 kfree(pc_buf); 2973 2973 err_free_req: 2974 2974 cifs_small_buf_release(req);
+1 -1
fs/udf/truncate.c
··· 115 115 } 116 116 /* This inode entry is in-memory only and thus we don't have to mark 117 117 * the inode dirty */ 118 - if (ret == 0) 118 + if (ret >= 0) 119 119 iinfo->i_lenExtents = inode->i_size; 120 120 brelse(epos.bh); 121 121 }
+24
fs/xattr.c
··· 1428 1428 return !strncmp(name, XATTR_TRUSTED_PREFIX, XATTR_TRUSTED_PREFIX_LEN); 1429 1429 } 1430 1430 1431 + static bool xattr_is_maclabel(const char *name) 1432 + { 1433 + const char *suffix = name + XATTR_SECURITY_PREFIX_LEN; 1434 + 1435 + return !strncmp(name, XATTR_SECURITY_PREFIX, 1436 + XATTR_SECURITY_PREFIX_LEN) && 1437 + security_ismaclabel(suffix); 1438 + } 1439 + 1431 1440 /** 1432 1441 * simple_xattr_list - list all xattr objects 1433 1442 * @inode: inode from which to get the xattrs ··· 1469 1460 if (err) 1470 1461 return err; 1471 1462 1463 + err = security_inode_listsecurity(inode, buffer, remaining_size); 1464 + if (err < 0) 1465 + return err; 1466 + 1467 + if (buffer) { 1468 + if (remaining_size < err) 1469 + return -ERANGE; 1470 + buffer += err; 1471 + } 1472 + remaining_size -= err; 1473 + 1472 1474 read_lock(&xattrs->lock); 1473 1475 for (rbp = rb_first(&xattrs->rb_root); rbp; rbp = rb_next(rbp)) { 1474 1476 xattr = rb_entry(rbp, struct simple_xattr, rb_node); 1475 1477 1476 1478 /* skip "trusted." attributes for unprivileged callers */ 1477 1479 if (!trusted && xattr_is_trusted(xattr->name)) 1480 + continue; 1481 + 1482 + /* skip MAC labels; these are provided by LSM above */ 1483 + if (xattr_is_maclabel(xattr->name)) 1478 1484 continue; 1479 1485 1480 1486 err = xattr_list_one(&buffer, &remaining_size, xattr->name);
+27 -1
fs/xfs/xfs_super.c
··· 1149 1149 return 0; 1150 1150 1151 1151 free_freecounters: 1152 - while (--i > 0) 1152 + while (--i >= 0) 1153 1153 percpu_counter_destroy(&mp->m_free[i].count); 1154 1154 percpu_counter_destroy(&mp->m_delalloc_rtextents); 1155 1155 free_delalloc: ··· 2114 2114 if (error) 2115 2115 return error; 2116 2116 2117 + /* attr2 -> noattr2 */ 2118 + if (xfs_has_noattr2(new_mp)) { 2119 + if (xfs_has_crc(mp)) { 2120 + xfs_warn(mp, 2121 + "attr2 is always enabled for a V5 filesystem - can't be changed."); 2122 + return -EINVAL; 2123 + } 2124 + mp->m_features &= ~XFS_FEAT_ATTR2; 2125 + mp->m_features |= XFS_FEAT_NOATTR2; 2126 + } else if (xfs_has_attr2(new_mp)) { 2127 + /* noattr2 -> attr2 */ 2128 + mp->m_features &= ~XFS_FEAT_NOATTR2; 2129 + mp->m_features |= XFS_FEAT_ATTR2; 2130 + } 2131 + 2117 2132 /* inode32 -> inode64 */ 2118 2133 if (xfs_has_small_inums(mp) && !xfs_has_small_inums(new_mp)) { 2119 2134 mp->m_features &= ~XFS_FEAT_SMALL_INUMS; ··· 2140 2125 mp->m_features |= XFS_FEAT_SMALL_INUMS; 2141 2126 mp->m_maxagi = xfs_set_inode_alloc(mp, mp->m_sb.sb_agcount); 2142 2127 } 2128 + 2129 + /* 2130 + * Now that mp has been modified according to the remount options, we 2131 + * do a final option validation with xfs_finish_flags() just like it is 2132 + * done during mount. We cannot use xfs_finish_flags() on new_mp as it 2133 + * contains only the user given options. 2134 + */ 2135 + error = xfs_finish_flags(mp); 2136 + if (error) 2137 + return error; 2143 2138 2144 2139 /* ro -> rw */ 2145 2140 if (xfs_is_readonly(mp) && !(flags & SB_RDONLY)) {
+18 -16
fs/xfs/xfs_trans_ail.c
··· 315 315 } 316 316 317 317 /* 318 - * Delete the given item from the AIL. Return a pointer to the item. 318 + * Delete the given item from the AIL. 319 319 */ 320 320 static void 321 321 xfs_ail_delete( ··· 777 777 } 778 778 779 779 /* 780 - * xfs_trans_ail_update - bulk AIL insertion operation. 780 + * xfs_trans_ail_update_bulk - bulk AIL insertion operation. 781 781 * 782 - * @xfs_trans_ail_update takes an array of log items that all need to be 782 + * @xfs_trans_ail_update_bulk takes an array of log items that all need to be 783 783 * positioned at the same LSN in the AIL. If an item is not in the AIL, it will 784 - * be added. Otherwise, it will be repositioned by removing it and re-adding 785 - * it to the AIL. If we move the first item in the AIL, update the log tail to 786 - * match the new minimum LSN in the AIL. 784 + * be added. Otherwise, it will be repositioned by removing it and re-adding 785 + * it to the AIL. 787 786 * 788 - * This function takes the AIL lock once to execute the update operations on 789 - * all the items in the array, and as such should not be called with the AIL 790 - * lock held. As a result, once we have the AIL lock, we need to check each log 791 - * item LSN to confirm it needs to be moved forward in the AIL. 787 + * If we move the first item in the AIL, update the log tail to match the new 788 + * minimum LSN in the AIL. 792 789 * 793 - * To optimise the insert operation, we delete all the items from the AIL in 794 - * the first pass, moving them into a temporary list, then splice the temporary 795 - * list into the correct position in the AIL. This avoids needing to do an 796 - * insert operation on every item. 790 + * This function should be called with the AIL lock held. 797 791 * 798 - * This function must be called with the AIL lock held. The lock is dropped 799 - * before returning. 
792 + * To optimise the insert operation, we add all items to a temporary list, then 793 + * splice this list into the correct position in the AIL. 794 + * 795 + * Items that are already in the AIL are first deleted from their current 796 + * location before being added to the temporary list. 797 + * 798 + * This avoids needing to do an insert operation on every item. 799 + * 800 + * The AIL lock is dropped by xfs_ail_update_finish() before returning to 801 + * the caller. 800 802 */ 801 803 void 802 804 xfs_trans_ail_update_bulk(
+3 -2
fs/xfs/xfs_zone_gc.c
··· 807 807 { 808 808 struct xfs_zone_gc_data *data = chunk->data; 809 809 struct xfs_mount *mp = chunk->ip->i_mount; 810 - unsigned int folio_offset = chunk->bio.bi_io_vec->bv_offset; 810 + phys_addr_t bvec_paddr = 811 + bvec_phys(bio_first_bvec_all(&chunk->bio)); 811 812 struct xfs_gc_bio *split_chunk; 812 813 813 814 if (chunk->bio.bi_status) ··· 823 822 824 823 bio_reset(&chunk->bio, mp->m_rtdev_targp->bt_bdev, REQ_OP_WRITE); 825 824 bio_add_folio_nofail(&chunk->bio, chunk->scratch->folio, chunk->len, 826 - folio_offset); 825 + offset_in_folio(chunk->scratch->folio, bvec_paddr)); 827 826 828 827 while ((split_chunk = xfs_zone_gc_split_write(data, chunk))) 829 828 xfs_zone_gc_submit_write(data, split_chunk);
+33 -14
include/drm/drm_gpusvm.h
··· 89 89 * @ops: Pointer to the operations structure for GPU SVM device memory 90 90 * @dpagemap: The struct drm_pagemap of the pages this allocation belongs to. 91 91 * @size: Size of device memory allocation 92 + * @timeslice_expiration: Timeslice expiration in jiffies 92 93 */ 93 94 struct drm_gpusvm_devmem { 94 95 struct device *dev; ··· 98 97 const struct drm_gpusvm_devmem_ops *ops; 99 98 struct drm_pagemap *dpagemap; 100 99 size_t size; 100 + u64 timeslice_expiration; 101 101 }; 102 102 103 103 /** ··· 188 186 }; 189 187 190 188 /** 189 + * struct drm_gpusvm_range_flags - Structure representing a GPU SVM range flags 190 + * 191 + * @migrate_devmem: Flag indicating whether the range can be migrated to device memory 192 + * @unmapped: Flag indicating if the range has been unmapped 193 + * @partial_unmap: Flag indicating if the range has been partially unmapped 194 + * @has_devmem_pages: Flag indicating if the range has devmem pages 195 + * @has_dma_mapping: Flag indicating if the range has a DMA mapping 196 + * @__flags: Flags for range in u16 form (used for READ_ONCE) 197 + */ 198 + struct drm_gpusvm_range_flags { 199 + union { 200 + struct { 201 + /* All flags below must be set upon creation */ 202 + u16 migrate_devmem : 1; 203 + /* All flags below must be set / cleared under notifier lock */ 204 + u16 unmapped : 1; 205 + u16 partial_unmap : 1; 206 + u16 has_devmem_pages : 1; 207 + u16 has_dma_mapping : 1; 208 + }; 209 + u16 __flags; 210 + }; 211 + }; 212 + 213 + /** 191 214 * struct drm_gpusvm_range - Structure representing a GPU SVM range 192 215 * 193 216 * @gpusvm: Pointer to the GPU SVM structure ··· 225 198 * @dpagemap: The struct drm_pagemap of the device pages we're dma-mapping. 226 199 * Note this is assuming only one drm_pagemap per range is allowed. 
227 200 * @flags: Flags for range 228 - * @flags.migrate_devmem: Flag indicating whether the range can be migrated to device memory 229 - * @flags.unmapped: Flag indicating if the range has been unmapped 230 - * @flags.partial_unmap: Flag indicating if the range has been partially unmapped 231 - * @flags.has_devmem_pages: Flag indicating if the range has devmem pages 232 - * @flags.has_dma_mapping: Flag indicating if the range has a DMA mapping 233 201 * 234 202 * This structure represents a GPU SVM range used for tracking memory ranges 235 203 * mapped in a DRM device. ··· 238 216 unsigned long notifier_seq; 239 217 struct drm_pagemap_device_addr *dma_addr; 240 218 struct drm_pagemap *dpagemap; 241 - struct { 242 - /* All flags below must be set upon creation */ 243 - u16 migrate_devmem : 1; 244 - /* All flags below must be set / cleared under notifier lock */ 245 - u16 unmapped : 1; 246 - u16 partial_unmap : 1; 247 - u16 has_devmem_pages : 1; 248 - u16 has_dma_mapping : 1; 249 - } flags; 219 + struct drm_gpusvm_range_flags flags; 250 220 }; 251 221 252 222 /** ··· 297 283 * @check_pages_threshold: Check CPU pages for present if chunk is less than or 298 284 * equal to threshold. If not present, reduce chunk 299 285 * size. 286 + * @timeslice_ms: The timeslice MS which in minimum time a piece of memory 287 + * remains with either exclusive GPU or CPU access. 300 288 * @in_notifier: entering from a MMU notifier 301 289 * @read_only: operating on read-only memory 302 290 * @devmem_possible: possible to use device memory 291 + * @devmem_only: use only device memory 303 292 * 304 293 * Context that is DRM GPUSVM is operating in (i.e. user arguments). 
305 294 */ 306 295 struct drm_gpusvm_ctx { 307 296 unsigned long check_pages_threshold; 297 + unsigned long timeslice_ms; 308 298 unsigned int in_notifier :1; 309 299 unsigned int read_only :1; 310 300 unsigned int devmem_possible :1; 301 + unsigned int devmem_only :1; 311 302 }; 312 303 313 304 int drm_gpusvm_init(struct drm_gpusvm *gpusvm,
+1
include/linux/bio.h
··· 11 11 #include <linux/uio.h> 12 12 13 13 #define BIO_MAX_VECS 256U 14 + #define BIO_MAX_INLINE_VECS UIO_MAXIOV 14 15 15 16 struct queue_limits; 16 17
+2
include/linux/cpu.h
··· 78 78 extern ssize_t cpu_show_reg_file_data_sampling(struct device *dev, 79 79 struct device_attribute *attr, char *buf); 80 80 extern ssize_t cpu_show_ghostwrite(struct device *dev, struct device_attribute *attr, char *buf); 81 + extern ssize_t cpu_show_indirect_target_selection(struct device *dev, 82 + struct device_attribute *attr, char *buf); 81 83 82 84 extern __printf(4, 5) 83 85 struct device *cpu_device_create(struct device *parent, void *drvdata,
+10 -1
include/linux/execmem.h
··· 4 4 5 5 #include <linux/types.h> 6 6 #include <linux/moduleloader.h> 7 + #include <linux/cleanup.h> 7 8 8 9 #if (defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)) && \ 9 10 !defined(CONFIG_KASAN_VMALLOC) ··· 54 53 EXECMEM_ROX_CACHE = (1 << 1), 55 54 }; 56 55 57 - #ifdef CONFIG_ARCH_HAS_EXECMEM_ROX 56 + #if defined(CONFIG_ARCH_HAS_EXECMEM_ROX) && defined(CONFIG_EXECMEM) 58 57 /** 59 58 * execmem_fill_trapping_insns - set memory to contain instructions that 60 59 * will trap ··· 94 93 * Return: 0 on success or negative error code on failure. 95 94 */ 96 95 int execmem_restore_rox(void *ptr, size_t size); 96 + 97 + /* 98 + * Called from mark_readonly(), where the system transitions to ROX. 99 + */ 100 + void execmem_cache_make_ro(void); 97 101 #else 98 102 static inline int execmem_make_temp_rw(void *ptr, size_t size) { return 0; } 99 103 static inline int execmem_restore_rox(void *ptr, size_t size) { return 0; } 104 + static inline void execmem_cache_make_ro(void) { } 100 105 #endif 101 106 102 107 /** ··· 176 169 * @ptr: pointer to the memory that should be freed 177 170 */ 178 171 void execmem_free(void *ptr); 172 + 173 + DEFINE_FREE(execmem, void *, if (_T) execmem_free(_T)); 179 174 180 175 #ifdef CONFIG_MMU 181 176 /**
-7
include/linux/hyperv.h
··· 1167 1167 enum vmbus_packet_type type, 1168 1168 u32 flags); 1169 1169 1170 - extern int vmbus_sendpacket_pagebuffer(struct vmbus_channel *channel, 1171 - struct hv_page_buffer pagebuffers[], 1172 - u32 pagecount, 1173 - void *buffer, 1174 - u32 bufferlen, 1175 - u64 requestid); 1176 - 1177 1170 extern int vmbus_sendpacket_mpb_desc(struct vmbus_channel *channel, 1178 1171 struct vmbus_packet_mpb_array *mpb, 1179 1172 u32 desc_size,
-1
include/linux/micrel_phy.h
··· 44 44 #define MICREL_PHY_50MHZ_CLK BIT(0) 45 45 #define MICREL_PHY_FXEN BIT(1) 46 46 #define MICREL_KSZ8_P1_ERRATA BIT(2) 47 - #define MICREL_NO_EEE BIT(3) 48 47 49 48 #define MICREL_KSZ9021_EXTREG_CTRL 0xB 50 49 #define MICREL_KSZ9021_EXTREG_DATA_WRITE 0xC
+5
include/linux/module.h
··· 586 586 atomic_t refcnt; 587 587 #endif 588 588 589 + #ifdef CONFIG_MITIGATION_ITS 590 + int its_num_pages; 591 + void **its_page_array; 592 + #endif 593 + 589 594 #ifdef CONFIG_CONSTRUCTORS 590 595 /* Constructor functions. */ 591 596 ctor_fn_t *ctors;
+9 -3
include/linux/nfs_fs_sb.h
··· 213 213 char *fscache_uniq; /* Uniquifier (or NULL) */ 214 214 #endif 215 215 216 + /* The following #defines numerically match the NFSv4 equivalents */ 217 + #define NFS_FH_NOEXPIRE_WITH_OPEN (0x1) 218 + #define NFS_FH_VOLATILE_ANY (0x2) 219 + #define NFS_FH_VOL_MIGRATION (0x4) 220 + #define NFS_FH_VOL_RENAME (0x8) 221 + #define NFS_FH_RENAME_UNSAFE (NFS_FH_VOLATILE_ANY | NFS_FH_VOL_RENAME) 222 + u32 fh_expire_type; /* V4 bitmask representing file 223 + handle volatility type for 224 + this filesystem */ 216 225 u32 pnfs_blksize; /* layout_blksize attr */ 217 226 #if IS_ENABLED(CONFIG_NFS_V4) 218 227 u32 attr_bitmask[3];/* V4 bitmask representing the set ··· 245 236 u32 acl_bitmask; /* V4 bitmask representing the ACEs 246 237 that are supported on this 247 238 filesystem */ 248 - u32 fh_expire_type; /* V4 bitmask representing file 249 - handle volatility type for 250 - this filesystem */ 251 239 struct pnfs_layoutdriver_type *pnfs_curr_ld; /* Active layout driver */ 252 240 struct rpc_wait_queue roc_rpcwaitq; 253 241 void *pnfs_ld_data; /* per mount point data */
+8
include/linux/pgalloc_tag.h
··· 188 188 return tag; 189 189 } 190 190 191 + static inline struct alloc_tag *pgalloc_tag_get(struct page *page) 192 + { 193 + if (mem_alloc_profiling_enabled()) 194 + return __pgalloc_tag_get(page); 195 + return NULL; 196 + } 197 + 191 198 void pgalloc_tag_split(struct folio *folio, int old_order, int new_order); 192 199 void pgalloc_tag_swap(struct folio *new, struct folio *old); 193 200 ··· 206 199 static inline void alloc_tag_sec_init(void) {} 207 200 static inline void pgalloc_tag_split(struct folio *folio, int old_order, int new_order) {} 208 201 static inline void pgalloc_tag_swap(struct folio *new, struct folio *old) {} 202 + static inline struct alloc_tag *pgalloc_tag_get(struct page *page) { return NULL; } 209 203 210 204 #endif /* CONFIG_MEM_ALLOC_PROFILING */ 211 205
+1 -1
include/linux/soundwire/sdw_intel.h
··· 365 365 * on e.g. which machine driver to select (I2S mode, HDaudio or 366 366 * SoundWire). 367 367 */ 368 - int sdw_intel_acpi_scan(acpi_handle *parent_handle, 368 + int sdw_intel_acpi_scan(acpi_handle parent_handle, 369 369 struct sdw_intel_acpi_info *info); 370 370 371 371 void sdw_intel_process_wakeen_event(struct sdw_intel_ctx *ctx);
+20 -1
include/linux/tpm.h
··· 224 224 225 225 enum tpm2_timeouts { 226 226 TPM2_TIMEOUT_A = 750, 227 - TPM2_TIMEOUT_B = 2000, 227 + TPM2_TIMEOUT_B = 4000, 228 228 TPM2_TIMEOUT_C = 200, 229 229 TPM2_TIMEOUT_D = 30, 230 230 TPM2_DURATION_SHORT = 20, ··· 257 257 TPM2_RC_TESTING = 0x090A, /* RC_WARN */ 258 258 TPM2_RC_REFERENCE_H0 = 0x0910, 259 259 TPM2_RC_RETRY = 0x0922, 260 + TPM2_RC_SESSION_MEMORY = 0x0903, 260 261 }; 261 262 262 263 enum tpm2_command_codes { ··· 436 435 static inline u32 tpm2_rc_value(u32 rc) 437 436 { 438 437 return (rc & BIT(7)) ? rc & 0xbf : rc; 438 + } 439 + 440 + /* 441 + * Convert a return value from tpm_transmit_cmd() to POSIX error code. 442 + */ 443 + static inline ssize_t tpm_ret_to_err(ssize_t ret) 444 + { 445 + if (ret < 0) 446 + return ret; 447 + 448 + switch (tpm2_rc_value(ret)) { 449 + case TPM2_RC_SUCCESS: 450 + return 0; 451 + case TPM2_RC_SESSION_MEMORY: 452 + return -ENOMEM; 453 + default: 454 + return -EFAULT; 455 + } 439 456 } 440 457 441 458 #if defined(CONFIG_TCG_TPM) || defined(CONFIG_TCG_TPM_MODULE)
+1
include/net/bluetooth/hci_core.h
··· 1798 1798 void hci_uuids_clear(struct hci_dev *hdev); 1799 1799 1800 1800 void hci_link_keys_clear(struct hci_dev *hdev); 1801 + u8 *hci_conn_key_enc_size(struct hci_conn *conn); 1801 1802 struct link_key *hci_find_link_key(struct hci_dev *hdev, bdaddr_t *bdaddr); 1802 1803 struct link_key *hci_add_link_key(struct hci_dev *hdev, struct hci_conn *conn, 1803 1804 bdaddr_t *bdaddr, u8 *val, u8 type,
+15
include/net/sch_generic.h
··· 1031 1031 return skb; 1032 1032 } 1033 1033 1034 + static inline struct sk_buff *qdisc_dequeue_internal(struct Qdisc *sch, bool direct) 1035 + { 1036 + struct sk_buff *skb; 1037 + 1038 + skb = __skb_dequeue(&sch->gso_skb); 1039 + if (skb) { 1040 + sch->q.qlen--; 1041 + return skb; 1042 + } 1043 + if (direct) 1044 + return __qdisc_dequeue_head(&sch->q); 1045 + else 1046 + return sch->dequeue(sch); 1047 + } 1048 + 1034 1049 static inline struct sk_buff *qdisc_dequeue_head(struct Qdisc *sch) 1035 1050 { 1036 1051 struct sk_buff *skb = __qdisc_dequeue_head(&sch->q);
+2 -2
include/sound/ump_msg.h
··· 604 604 } __packed; 605 605 606 606 /* UMP Stream Message: Device Info Notification (128bit) */ 607 - struct snd_ump_stream_msg_devince_info { 607 + struct snd_ump_stream_msg_device_info { 608 608 #ifdef __BIG_ENDIAN_BITFIELD 609 609 /* 0 */ 610 610 u32 type:4; ··· 754 754 union snd_ump_stream_msg { 755 755 struct snd_ump_stream_msg_ep_discovery ep_discovery; 756 756 struct snd_ump_stream_msg_ep_info ep_info; 757 - struct snd_ump_stream_msg_devince_info device_info; 757 + struct snd_ump_stream_msg_device_info device_info; 758 758 struct snd_ump_stream_msg_stream_cfg stream_cfg; 759 759 struct snd_ump_stream_msg_fb_discovery fb_discovery; 760 760 struct snd_ump_stream_msg_fb_info fb_info;
-5
init/Kconfig
··· 87 87 default $(success,$(srctree)/scripts/cc-can-link.sh $(CC) $(CLANG_FLAGS) $(USERCFLAGS) $(USERLDFLAGS) $(m64-flag)) if 64BIT 88 88 default $(success,$(srctree)/scripts/cc-can-link.sh $(CC) $(CLANG_FLAGS) $(USERCFLAGS) $(USERLDFLAGS) $(m32-flag)) 89 89 90 - config CC_CAN_LINK_STATIC 91 - bool 92 - default $(success,$(srctree)/scripts/cc-can-link.sh $(CC) $(CLANG_FLAGS) $(USERCFLAGS) $(USERLDFLAGS) $(m64-flag) -static) if 64BIT 93 - default $(success,$(srctree)/scripts/cc-can-link.sh $(CC) $(CLANG_FLAGS) $(USERCFLAGS) $(USERLDFLAGS) $(m32-flag) -static) 94 - 95 90 # Fixed in GCC 14, 13.3, 12.4 and 11.5 96 91 # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921 97 92 config GCC_ASM_GOTO_OUTPUT_BROKEN
+25 -23
io_uring/fdinfo.c
··· 86 86 } 87 87 #endif 88 88 89 - /* 90 - * Caller holds a reference to the file already, we don't need to do 91 - * anything else to get an extra reference. 92 - */ 93 - __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *file) 89 + static void __io_uring_show_fdinfo(struct io_ring_ctx *ctx, struct seq_file *m) 94 90 { 95 - struct io_ring_ctx *ctx = file->private_data; 96 91 struct io_overflow_cqe *ocqe; 97 92 struct io_rings *r = ctx->rings; 98 93 struct rusage sq_usage; ··· 101 106 unsigned int sq_entries, cq_entries; 102 107 int sq_pid = -1, sq_cpu = -1; 103 108 u64 sq_total_time = 0, sq_work_time = 0; 104 - bool has_lock; 105 109 unsigned int i; 106 110 107 111 if (ctx->flags & IORING_SETUP_CQE32) ··· 170 176 seq_printf(m, "\n"); 171 177 } 172 178 173 - /* 174 - * Avoid ABBA deadlock between the seq lock and the io_uring mutex, 175 - * since fdinfo case grabs it in the opposite direction of normal use 176 - * cases. If we fail to get the lock, we just don't iterate any 177 - * structures that could be going away outside the io_uring mutex. 
178 - */ 179 - has_lock = mutex_trylock(&ctx->uring_lock); 180 - 181 - if (has_lock && (ctx->flags & IORING_SETUP_SQPOLL)) { 179 + if (ctx->flags & IORING_SETUP_SQPOLL) { 182 180 struct io_sq_data *sq = ctx->sq_data; 183 181 184 182 /* ··· 192 206 seq_printf(m, "SqTotalTime:\t%llu\n", sq_total_time); 193 207 seq_printf(m, "SqWorkTime:\t%llu\n", sq_work_time); 194 208 seq_printf(m, "UserFiles:\t%u\n", ctx->file_table.data.nr); 195 - for (i = 0; has_lock && i < ctx->file_table.data.nr; i++) { 209 + for (i = 0; i < ctx->file_table.data.nr; i++) { 196 210 struct file *f = NULL; 197 211 198 212 if (ctx->file_table.data.nodes[i]) ··· 204 218 } 205 219 } 206 220 seq_printf(m, "UserBufs:\t%u\n", ctx->buf_table.nr); 207 - for (i = 0; has_lock && i < ctx->buf_table.nr; i++) { 221 + for (i = 0; i < ctx->buf_table.nr; i++) { 208 222 struct io_mapped_ubuf *buf = NULL; 209 223 210 224 if (ctx->buf_table.nodes[i]) ··· 214 228 else 215 229 seq_printf(m, "%5u: <none>\n", i); 216 230 } 217 - if (has_lock && !xa_empty(&ctx->personalities)) { 231 + if (!xa_empty(&ctx->personalities)) { 218 232 unsigned long index; 219 233 const struct cred *cred; 220 234 ··· 224 238 } 225 239 226 240 seq_puts(m, "PollList:\n"); 227 - for (i = 0; has_lock && i < (1U << ctx->cancel_table.hash_bits); i++) { 241 + for (i = 0; i < (1U << ctx->cancel_table.hash_bits); i++) { 228 242 struct io_hash_bucket *hb = &ctx->cancel_table.hbs[i]; 229 243 struct io_kiocb *req; 230 244 ··· 232 246 seq_printf(m, " op=%d, task_works=%d\n", req->opcode, 233 247 task_work_pending(req->tctx->task)); 234 248 } 235 - 236 - if (has_lock) 237 - mutex_unlock(&ctx->uring_lock); 238 249 239 250 seq_puts(m, "CqOverflowList:\n"); 240 251 spin_lock(&ctx->completion_lock); ··· 244 261 } 245 262 spin_unlock(&ctx->completion_lock); 246 263 napi_show_fdinfo(ctx, m); 264 + } 265 + 266 + /* 267 + * Caller holds a reference to the file already, we don't need to do 268 + * anything else to get an extra reference. 
269 + */ 270 + __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *file) 271 + { 272 + struct io_ring_ctx *ctx = file->private_data; 273 + 274 + /* 275 + * Avoid ABBA deadlock between the seq lock and the io_uring mutex, 276 + * since fdinfo case grabs it in the opposite direction of normal use 277 + * cases. 278 + */ 279 + if (mutex_trylock(&ctx->uring_lock)) { 280 + __io_uring_show_fdinfo(ctx, m); 281 + mutex_unlock(&ctx->uring_lock); 282 + } 247 283 } 248 284 #endif
+1 -1
io_uring/memmap.c
··· 116 116 void *ptr; 117 117 118 118 if (io_check_coalesce_buffer(mr->pages, mr->nr_pages, &ifd)) { 119 - if (ifd.nr_folios == 1) { 119 + if (ifd.nr_folios == 1 && !PageHighMem(mr->pages[0])) { 120 120 mr->ptr = page_address(mr->pages[0]); 121 121 return 0; 122 122 }
+5
io_uring/uring_cmd.c
··· 254 254 return -EOPNOTSUPP; 255 255 issue_flags |= IO_URING_F_IOPOLL; 256 256 req->iopoll_completed = 0; 257 + if (ctx->flags & IORING_SETUP_HYBRID_IOPOLL) { 258 + /* make sure every req only blocks once */ 259 + req->flags &= ~REQ_F_IOPOLL_STATE; 260 + req->iopoll_start = ktime_get_ns(); 261 + } 257 262 } 258 263 259 264 ret = file->f_op->uring_cmd(ioucmd, issue_flags);
+4 -2
kernel/cgroup/cpuset.c
··· 1116 1116 1117 1117 if (top_cs) { 1118 1118 /* 1119 - * Percpu kthreads in top_cpuset are ignored 1119 + * PF_NO_SETAFFINITY tasks are ignored. 1120 + * All per cpu kthreads should have PF_NO_SETAFFINITY 1121 + * flag set, see kthread_set_per_cpu(). 1120 1122 */ 1121 - if (kthread_is_per_cpu(task)) 1123 + if (task->flags & PF_NO_SETAFFINITY) 1122 1124 continue; 1123 1125 cpumask_andnot(new_cpus, possible_mask, subpartitions_cpus); 1124 1126 } else {
+5 -4
kernel/fork.c
··· 498 498 vma_numab_state_init(new); 499 499 dup_anon_vma_name(orig, new); 500 500 501 - /* track_pfn_copy() will later take care of copying internal state. */ 502 - if (unlikely(new->vm_flags & VM_PFNMAP)) 503 - untrack_pfn_clear(new); 504 - 505 501 return new; 506 502 } 507 503 ··· 668 672 tmp = vm_area_dup(mpnt); 669 673 if (!tmp) 670 674 goto fail_nomem; 675 + 676 + /* track_pfn_copy() will later take care of copying internal state. */ 677 + if (unlikely(tmp->vm_flags & VM_PFNMAP)) 678 + untrack_pfn_clear(tmp); 679 + 671 680 retval = vma_dup_policy(mpnt, tmp); 672 681 if (retval) 673 682 goto fail_nomem_policy;
+5
kernel/module/Kconfig
··· 192 192 depends on !DEBUG_INFO_REDUCED && !DEBUG_INFO_SPLIT 193 193 # Requires ELF object files. 194 194 depends on !LTO 195 + # To avoid conflicts with the discarded __gendwarfksyms_ptr symbols on 196 + # X86, requires pahole before commit 47dcb534e253 ("btf_encoder: Stop 197 + # indexing symbols for VARs") or after commit 9810758003ce ("btf_encoder: 198 + # Verify 0 address DWARF variables are in ELF section"). 199 + depends on !X86 || !DEBUG_INFO_BTF || PAHOLE_VERSION < 128 || PAHOLE_VERSION > 129 195 200 help 196 201 Calculate symbol versions from DWARF debugging information using 197 202 gendwarfksyms. Requires DEBUG_INFO to be enabled.
+123 -68
kernel/sched/ext.c
··· 1118 1118 current->scx.kf_mask &= ~mask; 1119 1119 } 1120 1120 1121 - #define SCX_CALL_OP(mask, op, args...) \ 1121 + /* 1122 + * Track the rq currently locked. 1123 + * 1124 + * This allows kfuncs to safely operate on rq from any scx ops callback, 1125 + * knowing which rq is already locked. 1126 + */ 1127 + static DEFINE_PER_CPU(struct rq *, locked_rq); 1128 + 1129 + static inline void update_locked_rq(struct rq *rq) 1130 + { 1131 + /* 1132 + * Check whether @rq is actually locked. This can help expose bugs 1133 + * or incorrect assumptions about the context in which a kfunc or 1134 + * callback is executed. 1135 + */ 1136 + if (rq) 1137 + lockdep_assert_rq_held(rq); 1138 + __this_cpu_write(locked_rq, rq); 1139 + } 1140 + 1141 + /* 1142 + * Return the rq currently locked from an scx callback, or NULL if no rq is 1143 + * locked. 1144 + */ 1145 + static inline struct rq *scx_locked_rq(void) 1146 + { 1147 + return __this_cpu_read(locked_rq); 1148 + } 1149 + 1150 + #define SCX_CALL_OP(mask, op, rq, args...) \ 1122 1151 do { \ 1152 + update_locked_rq(rq); \ 1123 1153 if (mask) { \ 1124 1154 scx_kf_allow(mask); \ 1125 1155 scx_ops.op(args); \ ··· 1157 1127 } else { \ 1158 1128 scx_ops.op(args); \ 1159 1129 } \ 1130 + update_locked_rq(NULL); \ 1160 1131 } while (0) 1161 1132 1162 - #define SCX_CALL_OP_RET(mask, op, args...) \ 1133 + #define SCX_CALL_OP_RET(mask, op, rq, args...) \ 1163 1134 ({ \ 1164 1135 __typeof__(scx_ops.op(args)) __ret; \ 1136 + \ 1137 + update_locked_rq(rq); \ 1165 1138 if (mask) { \ 1166 1139 scx_kf_allow(mask); \ 1167 1140 __ret = scx_ops.op(args); \ ··· 1172 1139 } else { \ 1173 1140 __ret = scx_ops.op(args); \ 1174 1141 } \ 1142 + update_locked_rq(NULL); \ 1175 1143 __ret; \ 1176 1144 }) 1177 1145 ··· 1187 1153 * scx_kf_allowed_on_arg_tasks() to test whether the invocation is allowed on 1188 1154 * the specific task. 1189 1155 */ 1190 - #define SCX_CALL_OP_TASK(mask, op, task, args...) 
\ 1156 + #define SCX_CALL_OP_TASK(mask, op, rq, task, args...) \ 1191 1157 do { \ 1192 1158 BUILD_BUG_ON((mask) & ~__SCX_KF_TERMINAL); \ 1193 1159 current->scx.kf_tasks[0] = task; \ 1194 - SCX_CALL_OP(mask, op, task, ##args); \ 1160 + SCX_CALL_OP(mask, op, rq, task, ##args); \ 1195 1161 current->scx.kf_tasks[0] = NULL; \ 1196 1162 } while (0) 1197 1163 1198 - #define SCX_CALL_OP_TASK_RET(mask, op, task, args...) \ 1164 + #define SCX_CALL_OP_TASK_RET(mask, op, rq, task, args...) \ 1199 1165 ({ \ 1200 1166 __typeof__(scx_ops.op(task, ##args)) __ret; \ 1201 1167 BUILD_BUG_ON((mask) & ~__SCX_KF_TERMINAL); \ 1202 1168 current->scx.kf_tasks[0] = task; \ 1203 - __ret = SCX_CALL_OP_RET(mask, op, task, ##args); \ 1169 + __ret = SCX_CALL_OP_RET(mask, op, rq, task, ##args); \ 1204 1170 current->scx.kf_tasks[0] = NULL; \ 1205 1171 __ret; \ 1206 1172 }) 1207 1173 1208 - #define SCX_CALL_OP_2TASKS_RET(mask, op, task0, task1, args...) \ 1174 + #define SCX_CALL_OP_2TASKS_RET(mask, op, rq, task0, task1, args...) 
\ 1209 1175 ({ \ 1210 1176 __typeof__(scx_ops.op(task0, task1, ##args)) __ret; \ 1211 1177 BUILD_BUG_ON((mask) & ~__SCX_KF_TERMINAL); \ 1212 1178 current->scx.kf_tasks[0] = task0; \ 1213 1179 current->scx.kf_tasks[1] = task1; \ 1214 - __ret = SCX_CALL_OP_RET(mask, op, task0, task1, ##args); \ 1180 + __ret = SCX_CALL_OP_RET(mask, op, rq, task0, task1, ##args); \ 1215 1181 current->scx.kf_tasks[0] = NULL; \ 1216 1182 current->scx.kf_tasks[1] = NULL; \ 1217 1183 __ret; \ ··· 2206 2172 WARN_ON_ONCE(*ddsp_taskp); 2207 2173 *ddsp_taskp = p; 2208 2174 2209 - SCX_CALL_OP_TASK(SCX_KF_ENQUEUE, enqueue, p, enq_flags); 2175 + SCX_CALL_OP_TASK(SCX_KF_ENQUEUE, enqueue, rq, p, enq_flags); 2210 2176 2211 2177 *ddsp_taskp = NULL; 2212 2178 if (p->scx.ddsp_dsq_id != SCX_DSQ_INVALID) ··· 2303 2269 add_nr_running(rq, 1); 2304 2270 2305 2271 if (SCX_HAS_OP(runnable) && !task_on_rq_migrating(p)) 2306 - SCX_CALL_OP_TASK(SCX_KF_REST, runnable, p, enq_flags); 2272 + SCX_CALL_OP_TASK(SCX_KF_REST, runnable, rq, p, enq_flags); 2307 2273 2308 2274 if (enq_flags & SCX_ENQ_WAKEUP) 2309 2275 touch_core_sched(rq, p); ··· 2317 2283 __scx_add_event(SCX_EV_SELECT_CPU_FALLBACK, 1); 2318 2284 } 2319 2285 2320 - static void ops_dequeue(struct task_struct *p, u64 deq_flags) 2286 + static void ops_dequeue(struct rq *rq, struct task_struct *p, u64 deq_flags) 2321 2287 { 2322 2288 unsigned long opss; 2323 2289 ··· 2338 2304 BUG(); 2339 2305 case SCX_OPSS_QUEUED: 2340 2306 if (SCX_HAS_OP(dequeue)) 2341 - SCX_CALL_OP_TASK(SCX_KF_REST, dequeue, p, deq_flags); 2307 + SCX_CALL_OP_TASK(SCX_KF_REST, dequeue, rq, p, deq_flags); 2342 2308 2343 2309 if (atomic_long_try_cmpxchg(&p->scx.ops_state, &opss, 2344 2310 SCX_OPSS_NONE)) ··· 2371 2337 return true; 2372 2338 } 2373 2339 2374 - ops_dequeue(p, deq_flags); 2340 + ops_dequeue(rq, p, deq_flags); 2375 2341 2376 2342 /* 2377 2343 * A currently running task which is going off @rq first gets dequeued ··· 2387 2353 */ 2388 2354 if (SCX_HAS_OP(stopping) && 
task_current(rq, p)) { 2389 2355 update_curr_scx(rq); 2390 - SCX_CALL_OP_TASK(SCX_KF_REST, stopping, p, false); 2356 + SCX_CALL_OP_TASK(SCX_KF_REST, stopping, rq, p, false); 2391 2357 } 2392 2358 2393 2359 if (SCX_HAS_OP(quiescent) && !task_on_rq_migrating(p)) 2394 - SCX_CALL_OP_TASK(SCX_KF_REST, quiescent, p, deq_flags); 2360 + SCX_CALL_OP_TASK(SCX_KF_REST, quiescent, rq, p, deq_flags); 2395 2361 2396 2362 if (deq_flags & SCX_DEQ_SLEEP) 2397 2363 p->scx.flags |= SCX_TASK_DEQD_FOR_SLEEP; ··· 2411 2377 struct task_struct *p = rq->curr; 2412 2378 2413 2379 if (SCX_HAS_OP(yield)) 2414 - SCX_CALL_OP_2TASKS_RET(SCX_KF_REST, yield, p, NULL); 2380 + SCX_CALL_OP_2TASKS_RET(SCX_KF_REST, yield, rq, p, NULL); 2415 2381 else 2416 2382 p->scx.slice = 0; 2417 2383 } ··· 2421 2387 struct task_struct *from = rq->curr; 2422 2388 2423 2389 if (SCX_HAS_OP(yield)) 2424 - return SCX_CALL_OP_2TASKS_RET(SCX_KF_REST, yield, from, to); 2390 + return SCX_CALL_OP_2TASKS_RET(SCX_KF_REST, yield, rq, from, to); 2425 2391 else 2426 2392 return false; 2427 2393 } ··· 2979 2945 * emitted in switch_class(). 2980 2946 */ 2981 2947 if (SCX_HAS_OP(cpu_acquire)) 2982 - SCX_CALL_OP(SCX_KF_REST, cpu_acquire, cpu_of(rq), NULL); 2948 + SCX_CALL_OP(SCX_KF_REST, cpu_acquire, rq, cpu_of(rq), NULL); 2983 2949 rq->scx.cpu_released = false; 2984 2950 } 2985 2951 ··· 3024 2990 do { 3025 2991 dspc->nr_tasks = 0; 3026 2992 3027 - SCX_CALL_OP(SCX_KF_DISPATCH, dispatch, cpu_of(rq), 2993 + SCX_CALL_OP(SCX_KF_DISPATCH, dispatch, rq, cpu_of(rq), 3028 2994 prev_on_scx ? prev : NULL); 3029 2995 3030 2996 flush_dispatch_buf(rq); ··· 3138 3104 * Core-sched might decide to execute @p before it is 3139 3105 * dispatched. Call ops_dequeue() to notify the BPF scheduler. 
3140 3106 */ 3141 - ops_dequeue(p, SCX_DEQ_CORE_SCHED_EXEC); 3107 + ops_dequeue(rq, p, SCX_DEQ_CORE_SCHED_EXEC); 3142 3108 dispatch_dequeue(rq, p); 3143 3109 } 3144 3110 ··· 3146 3112 3147 3113 /* see dequeue_task_scx() on why we skip when !QUEUED */ 3148 3114 if (SCX_HAS_OP(running) && (p->scx.flags & SCX_TASK_QUEUED)) 3149 - SCX_CALL_OP_TASK(SCX_KF_REST, running, p); 3115 + SCX_CALL_OP_TASK(SCX_KF_REST, running, rq, p); 3150 3116 3151 3117 clr_task_runnable(p, true); 3152 3118 ··· 3227 3193 .task = next, 3228 3194 }; 3229 3195 3230 - SCX_CALL_OP(SCX_KF_CPU_RELEASE, 3231 - cpu_release, cpu_of(rq), &args); 3196 + SCX_CALL_OP(SCX_KF_CPU_RELEASE, cpu_release, rq, cpu_of(rq), &args); 3232 3197 } 3233 3198 rq->scx.cpu_released = true; 3234 3199 } ··· 3240 3207 3241 3208 /* see dequeue_task_scx() on why we skip when !QUEUED */ 3242 3209 if (SCX_HAS_OP(stopping) && (p->scx.flags & SCX_TASK_QUEUED)) 3243 - SCX_CALL_OP_TASK(SCX_KF_REST, stopping, p, true); 3210 + SCX_CALL_OP_TASK(SCX_KF_REST, stopping, rq, p, true); 3244 3211 3245 3212 if (p->scx.flags & SCX_TASK_QUEUED) { 3246 3213 set_task_runnable(rq, p); ··· 3381 3348 * verifier. 3382 3349 */ 3383 3350 if (SCX_HAS_OP(core_sched_before) && !scx_rq_bypassing(task_rq(a))) 3384 - return SCX_CALL_OP_2TASKS_RET(SCX_KF_REST, core_sched_before, 3351 + return SCX_CALL_OP_2TASKS_RET(SCX_KF_REST, core_sched_before, NULL, 3385 3352 (struct task_struct *)a, 3386 3353 (struct task_struct *)b); 3387 3354 else ··· 3418 3385 *ddsp_taskp = p; 3419 3386 3420 3387 cpu = SCX_CALL_OP_TASK_RET(SCX_KF_ENQUEUE | SCX_KF_SELECT_CPU, 3421 - select_cpu, p, prev_cpu, wake_flags); 3388 + select_cpu, NULL, p, prev_cpu, wake_flags); 3422 3389 p->scx.selected_cpu = cpu; 3423 3390 *ddsp_taskp = NULL; 3424 3391 if (ops_cpu_valid(cpu, "from ops.select_cpu()")) ··· 3463 3430 * designation pointless. Cast it away when calling the operation. 
3464 3431 */ 3465 3432 if (SCX_HAS_OP(set_cpumask)) 3466 - SCX_CALL_OP_TASK(SCX_KF_REST, set_cpumask, p, 3467 - (struct cpumask *)p->cpus_ptr); 3433 + SCX_CALL_OP_TASK(SCX_KF_REST, set_cpumask, NULL, 3434 + p, (struct cpumask *)p->cpus_ptr); 3468 3435 } 3469 3436 3470 3437 static void handle_hotplug(struct rq *rq, bool online) ··· 3477 3444 scx_idle_update_selcpu_topology(&scx_ops); 3478 3445 3479 3446 if (online && SCX_HAS_OP(cpu_online)) 3480 - SCX_CALL_OP(SCX_KF_UNLOCKED, cpu_online, cpu); 3447 + SCX_CALL_OP(SCX_KF_UNLOCKED, cpu_online, NULL, cpu); 3481 3448 else if (!online && SCX_HAS_OP(cpu_offline)) 3482 - SCX_CALL_OP(SCX_KF_UNLOCKED, cpu_offline, cpu); 3449 + SCX_CALL_OP(SCX_KF_UNLOCKED, cpu_offline, NULL, cpu); 3483 3450 else 3484 3451 scx_ops_exit(SCX_ECODE_ACT_RESTART | SCX_ECODE_RSN_HOTPLUG, 3485 3452 "cpu %d going %s, exiting scheduler", cpu, ··· 3583 3550 curr->scx.slice = 0; 3584 3551 touch_core_sched(rq, curr); 3585 3552 } else if (SCX_HAS_OP(tick)) { 3586 - SCX_CALL_OP_TASK(SCX_KF_REST, tick, curr); 3553 + SCX_CALL_OP_TASK(SCX_KF_REST, tick, rq, curr); 3587 3554 } 3588 3555 3589 3556 if (!curr->scx.slice) ··· 3660 3627 .fork = fork, 3661 3628 }; 3662 3629 3663 - ret = SCX_CALL_OP_RET(SCX_KF_UNLOCKED, init_task, p, &args); 3630 + ret = SCX_CALL_OP_RET(SCX_KF_UNLOCKED, init_task, NULL, p, &args); 3664 3631 if (unlikely(ret)) { 3665 3632 ret = ops_sanitize_err("init_task", ret); 3666 3633 return ret; ··· 3701 3668 3702 3669 static void scx_ops_enable_task(struct task_struct *p) 3703 3670 { 3671 + struct rq *rq = task_rq(p); 3704 3672 u32 weight; 3705 3673 3706 - lockdep_assert_rq_held(task_rq(p)); 3674 + lockdep_assert_rq_held(rq); 3707 3675 3708 3676 /* 3709 3677 * Set the weight before calling ops.enable() so that the scheduler ··· 3718 3684 p->scx.weight = sched_weight_to_cgroup(weight); 3719 3685 3720 3686 if (SCX_HAS_OP(enable)) 3721 - SCX_CALL_OP_TASK(SCX_KF_REST, enable, p); 3687 + SCX_CALL_OP_TASK(SCX_KF_REST, enable, rq, p); 3722 3688 
scx_set_task_state(p, SCX_TASK_ENABLED); 3723 3689 3724 3690 if (SCX_HAS_OP(set_weight)) 3725 - SCX_CALL_OP_TASK(SCX_KF_REST, set_weight, p, p->scx.weight); 3691 + SCX_CALL_OP_TASK(SCX_KF_REST, set_weight, rq, p, p->scx.weight); 3726 3692 } 3727 3693 3728 3694 static void scx_ops_disable_task(struct task_struct *p) 3729 3695 { 3730 - lockdep_assert_rq_held(task_rq(p)); 3696 + struct rq *rq = task_rq(p); 3697 + 3698 + lockdep_assert_rq_held(rq); 3731 3699 WARN_ON_ONCE(scx_get_task_state(p) != SCX_TASK_ENABLED); 3732 3700 3733 3701 if (SCX_HAS_OP(disable)) 3734 - SCX_CALL_OP_TASK(SCX_KF_REST, disable, p); 3702 + SCX_CALL_OP_TASK(SCX_KF_REST, disable, rq, p); 3735 3703 scx_set_task_state(p, SCX_TASK_READY); 3736 3704 } 3737 3705 ··· 3762 3726 } 3763 3727 3764 3728 if (SCX_HAS_OP(exit_task)) 3765 - SCX_CALL_OP_TASK(SCX_KF_REST, exit_task, p, &args); 3729 + SCX_CALL_OP_TASK(SCX_KF_REST, exit_task, task_rq(p), p, &args); 3766 3730 scx_set_task_state(p, SCX_TASK_NONE); 3767 3731 } 3768 3732 ··· 3871 3835 3872 3836 p->scx.weight = sched_weight_to_cgroup(scale_load_down(lw->weight)); 3873 3837 if (SCX_HAS_OP(set_weight)) 3874 - SCX_CALL_OP_TASK(SCX_KF_REST, set_weight, p, p->scx.weight); 3838 + SCX_CALL_OP_TASK(SCX_KF_REST, set_weight, rq, p, p->scx.weight); 3875 3839 } 3876 3840 3877 3841 static void prio_changed_scx(struct rq *rq, struct task_struct *p, int oldprio) ··· 3887 3851 * different scheduler class. Keep the BPF scheduler up-to-date. 
3888 3852 */ 3889 3853 if (SCX_HAS_OP(set_cpumask)) 3890 - SCX_CALL_OP_TASK(SCX_KF_REST, set_cpumask, p, 3891 - (struct cpumask *)p->cpus_ptr); 3854 + SCX_CALL_OP_TASK(SCX_KF_REST, set_cpumask, rq, 3855 + p, (struct cpumask *)p->cpus_ptr); 3892 3856 } 3893 3857 3894 3858 static void switched_from_scx(struct rq *rq, struct task_struct *p) ··· 3949 3913 struct scx_cgroup_init_args args = 3950 3914 { .weight = tg->scx_weight }; 3951 3915 3952 - ret = SCX_CALL_OP_RET(SCX_KF_UNLOCKED, cgroup_init, 3916 + ret = SCX_CALL_OP_RET(SCX_KF_UNLOCKED, cgroup_init, NULL, 3953 3917 tg->css.cgroup, &args); 3954 3918 if (ret) 3955 3919 ret = ops_sanitize_err("cgroup_init", ret); ··· 3971 3935 percpu_down_read(&scx_cgroup_rwsem); 3972 3936 3973 3937 if (SCX_HAS_OP(cgroup_exit) && (tg->scx_flags & SCX_TG_INITED)) 3974 - SCX_CALL_OP(SCX_KF_UNLOCKED, cgroup_exit, tg->css.cgroup); 3938 + SCX_CALL_OP(SCX_KF_UNLOCKED, cgroup_exit, NULL, tg->css.cgroup); 3975 3939 tg->scx_flags &= ~(SCX_TG_ONLINE | SCX_TG_INITED); 3976 3940 3977 3941 percpu_up_read(&scx_cgroup_rwsem); ··· 4004 3968 continue; 4005 3969 4006 3970 if (SCX_HAS_OP(cgroup_prep_move)) { 4007 - ret = SCX_CALL_OP_RET(SCX_KF_UNLOCKED, cgroup_prep_move, 3971 + ret = SCX_CALL_OP_RET(SCX_KF_UNLOCKED, cgroup_prep_move, NULL, 4008 3972 p, from, css->cgroup); 4009 3973 if (ret) 4010 3974 goto err; ··· 4018 3982 err: 4019 3983 cgroup_taskset_for_each(p, css, tset) { 4020 3984 if (SCX_HAS_OP(cgroup_cancel_move) && p->scx.cgrp_moving_from) 4021 - SCX_CALL_OP(SCX_KF_UNLOCKED, cgroup_cancel_move, p, 4022 - p->scx.cgrp_moving_from, css->cgroup); 3985 + SCX_CALL_OP(SCX_KF_UNLOCKED, cgroup_cancel_move, NULL, 3986 + p, p->scx.cgrp_moving_from, css->cgroup); 4023 3987 p->scx.cgrp_moving_from = NULL; 4024 3988 } 4025 3989 ··· 4037 4001 * cgrp_moving_from set. 
4038 4002 */ 4039 4003 if (SCX_HAS_OP(cgroup_move) && !WARN_ON_ONCE(!p->scx.cgrp_moving_from)) 4040 - SCX_CALL_OP_TASK(SCX_KF_UNLOCKED, cgroup_move, p, 4041 - p->scx.cgrp_moving_from, tg_cgrp(task_group(p))); 4004 + SCX_CALL_OP_TASK(SCX_KF_UNLOCKED, cgroup_move, NULL, 4005 + p, p->scx.cgrp_moving_from, tg_cgrp(task_group(p))); 4042 4006 p->scx.cgrp_moving_from = NULL; 4043 4007 } 4044 4008 ··· 4057 4021 4058 4022 cgroup_taskset_for_each(p, css, tset) { 4059 4023 if (SCX_HAS_OP(cgroup_cancel_move) && p->scx.cgrp_moving_from) 4060 - SCX_CALL_OP(SCX_KF_UNLOCKED, cgroup_cancel_move, p, 4061 - p->scx.cgrp_moving_from, css->cgroup); 4024 + SCX_CALL_OP(SCX_KF_UNLOCKED, cgroup_cancel_move, NULL, 4025 + p, p->scx.cgrp_moving_from, css->cgroup); 4062 4026 p->scx.cgrp_moving_from = NULL; 4063 4027 } 4064 4028 out_unlock: ··· 4071 4035 4072 4036 if (scx_cgroup_enabled && tg->scx_weight != weight) { 4073 4037 if (SCX_HAS_OP(cgroup_set_weight)) 4074 - SCX_CALL_OP(SCX_KF_UNLOCKED, cgroup_set_weight, 4038 + SCX_CALL_OP(SCX_KF_UNLOCKED, cgroup_set_weight, NULL, 4075 4039 tg_cgrp(tg), weight); 4076 4040 tg->scx_weight = weight; 4077 4041 } ··· 4260 4224 continue; 4261 4225 rcu_read_unlock(); 4262 4226 4263 - SCX_CALL_OP(SCX_KF_UNLOCKED, cgroup_exit, css->cgroup); 4227 + SCX_CALL_OP(SCX_KF_UNLOCKED, cgroup_exit, NULL, css->cgroup); 4264 4228 4265 4229 rcu_read_lock(); 4266 4230 css_put(css); ··· 4297 4261 continue; 4298 4262 rcu_read_unlock(); 4299 4263 4300 - ret = SCX_CALL_OP_RET(SCX_KF_UNLOCKED, cgroup_init, 4264 + ret = SCX_CALL_OP_RET(SCX_KF_UNLOCKED, cgroup_init, NULL, 4301 4265 css->cgroup, &args); 4302 4266 if (ret) { 4303 4267 css_put(css); ··· 4794 4758 } 4795 4759 4796 4760 if (scx_ops.exit) 4797 - SCX_CALL_OP(SCX_KF_UNLOCKED, exit, ei); 4761 + SCX_CALL_OP(SCX_KF_UNLOCKED, exit, NULL, ei); 4798 4762 4799 4763 cancel_delayed_work_sync(&scx_watchdog_work); 4800 4764 ··· 5001 4965 5002 4966 if (SCX_HAS_OP(dump_task)) { 5003 4967 ops_dump_init(s, " "); 5004 - 
SCX_CALL_OP(SCX_KF_REST, dump_task, dctx, p); 4968 + SCX_CALL_OP(SCX_KF_REST, dump_task, NULL, dctx, p); 5005 4969 ops_dump_exit(); 5006 4970 } 5007 4971 ··· 5048 5012 5049 5013 if (SCX_HAS_OP(dump)) { 5050 5014 ops_dump_init(&s, ""); 5051 - SCX_CALL_OP(SCX_KF_UNLOCKED, dump, &dctx); 5015 + SCX_CALL_OP(SCX_KF_UNLOCKED, dump, NULL, &dctx); 5052 5016 ops_dump_exit(); 5053 5017 } 5054 5018 ··· 5105 5069 used = seq_buf_used(&ns); 5106 5070 if (SCX_HAS_OP(dump_cpu)) { 5107 5071 ops_dump_init(&ns, " "); 5108 - SCX_CALL_OP(SCX_KF_REST, dump_cpu, &dctx, cpu, idle); 5072 + SCX_CALL_OP(SCX_KF_REST, dump_cpu, NULL, &dctx, cpu, idle); 5109 5073 ops_dump_exit(); 5110 5074 } 5111 5075 ··· 5364 5328 scx_idle_enable(ops); 5365 5329 5366 5330 if (scx_ops.init) { 5367 - ret = SCX_CALL_OP_RET(SCX_KF_UNLOCKED, init); 5331 + ret = SCX_CALL_OP_RET(SCX_KF_UNLOCKED, init, NULL); 5368 5332 if (ret) { 5369 5333 ret = ops_sanitize_err("init", ret); 5370 5334 cpus_read_unlock(); ··· 6827 6791 BUILD_BUG_ON(__alignof__(struct bpf_iter_scx_dsq_kern) != 6828 6792 __alignof__(struct bpf_iter_scx_dsq)); 6829 6793 6794 + /* 6795 + * next() and destroy() will be called regardless of the return value. 6796 + * Always clear $kit->dsq. 6797 + */ 6798 + kit->dsq = NULL; 6799 + 6830 6800 if (flags & ~__SCX_DSQ_ITER_USER_FLAGS) 6831 6801 return -EINVAL; 6832 6802 ··· 7119 7077 } 7120 7078 7121 7079 if (ops_cpu_valid(cpu, NULL)) { 7122 - struct rq *rq = cpu_rq(cpu); 7080 + struct rq *rq = cpu_rq(cpu), *locked_rq = scx_locked_rq(); 7081 + struct rq_flags rf; 7082 + 7083 + /* 7084 + * When called with an rq lock held, restrict the operation 7085 + * to the corresponding CPU to prevent ABBA deadlocks. 7086 + */ 7087 + if (locked_rq && rq != locked_rq) { 7088 + scx_ops_error("Invalid target CPU %d", cpu); 7089 + return; 7090 + } 7091 + 7092 + /* 7093 + * If no rq lock is held, allow to operate on any CPU by 7094 + * acquiring the corresponding rq lock. 
7095 + */ 7096 + if (!locked_rq) { 7097 + rq_lock_irqsave(rq, &rf); 7098 + update_rq_clock(rq); 7099 + } 7123 7100 7124 7101 rq->scx.cpuperf_target = perf; 7102 + cpufreq_update_util(rq, 0); 7125 7103 7126 - rcu_read_lock_sched_notrace(); 7127 - cpufreq_update_util(cpu_rq(cpu), 0); 7128 - rcu_read_unlock_sched_notrace(); 7104 + if (!locked_rq) 7105 + rq_unlock_irqrestore(rq, &rf); 7129 7106 } 7130 7107 } 7131 7108 ··· 7375 7314 BTF_ID_FLAGS(func, scx_bpf_get_possible_cpumask, KF_ACQUIRE) 7376 7315 BTF_ID_FLAGS(func, scx_bpf_get_online_cpumask, KF_ACQUIRE) 7377 7316 BTF_ID_FLAGS(func, scx_bpf_put_cpumask, KF_RELEASE) 7378 - BTF_ID_FLAGS(func, scx_bpf_get_idle_cpumask, KF_ACQUIRE) 7379 - BTF_ID_FLAGS(func, scx_bpf_get_idle_smtmask, KF_ACQUIRE) 7380 - BTF_ID_FLAGS(func, scx_bpf_put_idle_cpumask, KF_RELEASE) 7381 - BTF_ID_FLAGS(func, scx_bpf_test_and_clear_cpu_idle) 7382 - BTF_ID_FLAGS(func, scx_bpf_pick_idle_cpu, KF_RCU) 7383 - BTF_ID_FLAGS(func, scx_bpf_pick_any_cpu, KF_RCU) 7384 7317 BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU) 7385 7318 BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU) 7386 7319 BTF_ID_FLAGS(func, scx_bpf_cpu_rq)
+1 -1
kernel/sched/ext_idle.c
··· 674 674 * managed by put_prev_task_idle()/set_next_task_idle(). 675 675 */ 676 676 if (SCX_HAS_OP(update_idle) && do_notify && !scx_rq_bypassing(rq)) 677 - SCX_CALL_OP(SCX_KF_REST, update_idle, cpu_of(rq), idle); 677 + SCX_CALL_OP(SCX_KF_REST, update_idle, rq, cpu_of(rq), idle); 678 678 679 679 /* 680 680 * Update the idle masks:
+2 -1
kernel/trace/fprobe.c
··· 454 454 struct fprobe_hlist_node *node; 455 455 int ret = 0; 456 456 457 - hlist_for_each_entry_rcu(node, head, hlist) { 457 + hlist_for_each_entry_rcu(node, head, hlist, 458 + lockdep_is_held(&fprobe_mutex)) { 458 459 if (!within_module(node->addr, mod)) 459 460 continue; 460 461 if (delete_fprobe_node(node))
+5 -3
kernel/trace/ring_buffer.c
··· 1887 1887 1888 1888 head_page = cpu_buffer->head_page; 1889 1889 1890 - /* If both the head and commit are on the reader_page then we are done. */ 1891 - if (head_page == cpu_buffer->reader_page && 1892 - head_page == cpu_buffer->commit_page) 1890 + /* If the commit_buffer is the reader page, update the commit page */ 1891 + if (meta->commit_buffer == (unsigned long)cpu_buffer->reader_page->page) { 1892 + cpu_buffer->commit_page = cpu_buffer->reader_page; 1893 + /* Nothing more to do, the only page is the reader page */ 1893 1894 goto done; 1895 + } 1894 1896 1895 1897 /* Iterate until finding the commit page */ 1896 1898 for (i = 0; i < meta->nr_subbufs + 1; i++, rb_inc_page(&head_page)) {
+15 -1
kernel/trace/trace_dynevent.c
··· 16 16 #include "trace_output.h" /* for trace_event_sem */ 17 17 #include "trace_dynevent.h" 18 18 19 - static DEFINE_MUTEX(dyn_event_ops_mutex); 19 + DEFINE_MUTEX(dyn_event_ops_mutex); 20 20 static LIST_HEAD(dyn_event_ops_list); 21 21 22 22 bool trace_event_dyn_try_get_ref(struct trace_event_call *dyn_call) ··· 113 113 } 114 114 tracing_reset_all_online_cpus(); 115 115 mutex_unlock(&event_mutex); 116 + return ret; 117 + } 118 + 119 + /* 120 + * Locked version of event creation. The event creation must be protected by 121 + * dyn_event_ops_mutex because of protecting trace_probe_log. 122 + */ 123 + int dyn_event_create(const char *raw_command, struct dyn_event_operations *type) 124 + { 125 + int ret; 126 + 127 + mutex_lock(&dyn_event_ops_mutex); 128 + ret = type->create(raw_command); 129 + mutex_unlock(&dyn_event_ops_mutex); 116 130 return ret; 117 131 } 118 132
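The trace_dynevent.c hunk above routes event creation through a wrapper that holds `dyn_event_ops_mutex` across the `->create()` call, because the parse-error state in `trace_probe_log` is shared and unsynchronized. A rough userspace model of that pattern (the pthread mutex, `probe_log` buffer, and `demo_create()` are illustrative stand-ins, not the kernel API):

```c
#include <pthread.h>
#include <string.h>

/* Model of the shared, unsynchronized parse-error log (trace_probe_log). */
static char probe_log[64];
static pthread_mutex_t dyn_event_ops_mutex = PTHREAD_MUTEX_INITIALIZER;

struct dyn_event_operations {
	int (*create)(const char *raw_command);
};

/* Locked wrapper: every ->create() runs with the mutex held, so
 * concurrent creators cannot interleave writes to probe_log. */
static int dyn_event_create(const char *raw_command,
			    struct dyn_event_operations *type)
{
	int ret;

	pthread_mutex_lock(&dyn_event_ops_mutex);
	ret = type->create(raw_command);
	pthread_mutex_unlock(&dyn_event_ops_mutex);
	return ret;
}

/* Example ->create() that records its argument in the shared log. */
static int demo_create(const char *raw_command)
{
	strncpy(probe_log, raw_command, sizeof(probe_log) - 1);
	return 0;
}
```

This is why the kprobe/uprobe `create_or_delete` paths in the later hunks switch from calling `trace_kprobe_create()` / `trace_uprobe_create()` directly to going through `dyn_event_create()`.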
+1
kernel/trace/trace_dynevent.h
··· 100 100 void dyn_event_seq_stop(struct seq_file *m, void *v); 101 101 int dyn_events_release_all(struct dyn_event_operations *type); 102 102 int dyn_event_release(const char *raw_command, struct dyn_event_operations *type); 103 + int dyn_event_create(const char *raw_command, struct dyn_event_operations *type); 103 104 104 105 /* 105 106 * for_each_dyn_event - iterate over the dyn_event list
+3
kernel/trace/trace_eprobe.c
··· 969 969 goto error; 970 970 } 971 971 } 972 + trace_probe_log_clear(); 972 973 return ret; 974 + 973 975 parse_error: 974 976 ret = -EINVAL; 975 977 error: 978 + trace_probe_log_clear(); 976 979 trace_event_probe_cleanup(ep); 977 980 return ret; 978 981 }
+1 -1
kernel/trace/trace_events_trigger.c
··· 1560 1560 struct trace_event_file *file = data->private_data; 1561 1561 1562 1562 if (file) 1563 - __trace_stack(file->tr, tracing_gen_ctx(), STACK_SKIP); 1563 + __trace_stack(file->tr, tracing_gen_ctx_dec(), STACK_SKIP); 1564 1564 else 1565 1565 trace_dump_stack(STACK_SKIP); 1566 1566 }
+1 -5
kernel/trace/trace_functions.c
··· 633 633 634 634 static __always_inline void trace_stack(struct trace_array *tr) 635 635 { 636 - unsigned int trace_ctx; 637 - 638 - trace_ctx = tracing_gen_ctx(); 639 - 640 - __trace_stack(tr, trace_ctx, FTRACE_STACK_SKIP); 636 + __trace_stack(tr, tracing_gen_ctx_dec(), FTRACE_STACK_SKIP); 641 637 } 642 638 643 639 static void
+1 -1
kernel/trace/trace_kprobe.c
··· 1089 1089 if (raw_command[0] == '-') 1090 1090 return dyn_event_release(raw_command, &trace_kprobe_ops); 1091 1091 1092 - ret = trace_kprobe_create(raw_command); 1092 + ret = dyn_event_create(raw_command, &trace_kprobe_ops); 1093 1093 return ret == -ECANCELED ? -EINVAL : ret; 1094 1094 } 1095 1095
+9
kernel/trace/trace_probe.c
··· 154 154 } 155 155 156 156 static struct trace_probe_log trace_probe_log; 157 + extern struct mutex dyn_event_ops_mutex; 157 158 158 159 void trace_probe_log_init(const char *subsystem, int argc, const char **argv) 159 160 { 161 + lockdep_assert_held(&dyn_event_ops_mutex); 162 + 160 163 trace_probe_log.subsystem = subsystem; 161 164 trace_probe_log.argc = argc; 162 165 trace_probe_log.argv = argv; ··· 168 165 169 166 void trace_probe_log_clear(void) 170 167 { 168 + lockdep_assert_held(&dyn_event_ops_mutex); 169 + 171 170 memset(&trace_probe_log, 0, sizeof(trace_probe_log)); 172 171 } 173 172 174 173 void trace_probe_log_set_index(int index) 175 174 { 175 + lockdep_assert_held(&dyn_event_ops_mutex); 176 + 176 177 trace_probe_log.index = index; 177 178 } 178 179 ··· 184 177 { 185 178 char *command, *p; 186 179 int i, len = 0, pos = 0; 180 + 181 + lockdep_assert_held(&dyn_event_ops_mutex); 187 182 188 183 if (!trace_probe_log.argv) 189 184 return;
+1 -1
kernel/trace/trace_uprobe.c
··· 741 741 if (raw_command[0] == '-') 742 742 return dyn_event_release(raw_command, &trace_uprobe_ops); 743 743 744 - ret = trace_uprobe_create(raw_command); 744 + ret = dyn_event_create(raw_command, &trace_uprobe_ops); 745 745 return ret == -ECANCELED ? -EINVAL : ret; 746 746 } 747 747
+37 -3
mm/execmem.c
··· 254 254 return ptr; 255 255 } 256 256 257 + static bool execmem_cache_rox = false; 258 + 259 + void execmem_cache_make_ro(void) 260 + { 261 + struct maple_tree *free_areas = &execmem_cache.free_areas; 262 + struct maple_tree *busy_areas = &execmem_cache.busy_areas; 263 + MA_STATE(mas_free, free_areas, 0, ULONG_MAX); 264 + MA_STATE(mas_busy, busy_areas, 0, ULONG_MAX); 265 + struct mutex *mutex = &execmem_cache.mutex; 266 + void *area; 267 + 268 + execmem_cache_rox = true; 269 + 270 + mutex_lock(mutex); 271 + 272 + mas_for_each(&mas_free, area, ULONG_MAX) { 273 + unsigned long pages = mas_range_len(&mas_free) >> PAGE_SHIFT; 274 + set_memory_ro(mas_free.index, pages); 275 + } 276 + 277 + mas_for_each(&mas_busy, area, ULONG_MAX) { 278 + unsigned long pages = mas_range_len(&mas_busy) >> PAGE_SHIFT; 279 + set_memory_ro(mas_busy.index, pages); 280 + } 281 + 282 + mutex_unlock(mutex); 283 + } 284 + 257 285 static int execmem_cache_populate(struct execmem_range *range, size_t size) 258 286 { 259 287 unsigned long vm_flags = VM_ALLOW_HUGE_VMAP; ··· 302 274 /* fill memory with instructions that will trap */ 303 275 execmem_fill_trapping_insns(p, alloc_size, /* writable = */ true); 304 276 305 - err = set_memory_rox((unsigned long)p, vm->nr_pages); 306 - if (err) 307 - goto err_free_mem; 277 + if (execmem_cache_rox) { 278 + err = set_memory_rox((unsigned long)p, vm->nr_pages); 279 + if (err) 280 + goto err_free_mem; 281 + } else { 282 + err = set_memory_x((unsigned long)p, vm->nr_pages); 283 + if (err) 284 + goto err_free_mem; 285 + } 308 286 309 287 err = execmem_cache_add(p, alloc_size); 310 288 if (err)
+22 -6
mm/hugetlb.c
··· 3010 3010 struct hugepage_subpool *spool = subpool_vma(vma); 3011 3011 struct hstate *h = hstate_vma(vma); 3012 3012 struct folio *folio; 3013 - long retval, gbl_chg; 3013 + long retval, gbl_chg, gbl_reserve; 3014 3014 map_chg_state map_chg; 3015 3015 int ret, idx; 3016 3016 struct hugetlb_cgroup *h_cg = NULL; ··· 3163 3163 hugetlb_cgroup_uncharge_cgroup_rsvd(idx, pages_per_huge_page(h), 3164 3164 h_cg); 3165 3165 out_subpool_put: 3166 - if (map_chg) 3167 - hugepage_subpool_put_pages(spool, 1); 3166 + /* 3167 + * put page to subpool iff the quota of subpool's rsv_hpages is used 3168 + * during hugepage_subpool_get_pages. 3169 + */ 3170 + if (map_chg && !gbl_chg) { 3171 + gbl_reserve = hugepage_subpool_put_pages(spool, 1); 3172 + hugetlb_acct_memory(h, -gbl_reserve); 3173 + } 3174 + 3175 + 3168 3176 out_end_reservation: 3169 3177 if (map_chg != MAP_CHG_ENFORCED) 3170 3178 vma_end_reservation(h, vma, addr); ··· 7247 7239 struct vm_area_struct *vma, 7248 7240 vm_flags_t vm_flags) 7249 7241 { 7250 - long chg = -1, add = -1; 7242 + long chg = -1, add = -1, spool_resv, gbl_resv; 7251 7243 struct hstate *h = hstate_inode(inode); 7252 7244 struct hugepage_subpool *spool = subpool_inode(inode); 7253 7245 struct resv_map *resv_map; ··· 7382 7374 return true; 7383 7375 7384 7376 out_put_pages: 7385 - /* put back original number of pages, chg */ 7386 - (void)hugepage_subpool_put_pages(spool, chg); 7377 + spool_resv = chg - gbl_reserve; 7378 + if (spool_resv) { 7379 + /* put sub pool's reservation back, chg - gbl_reserve */ 7380 + gbl_resv = hugepage_subpool_put_pages(spool, spool_resv); 7381 + /* 7382 + * subpool's reserved pages can not be put back due to race, 7383 + * return to hstate. 7384 + */ 7385 + hugetlb_acct_memory(h, -gbl_resv); 7386 + } 7387 7387 out_uncharge_cgroup: 7388 7388 hugetlb_cgroup_uncharge_cgroup_rsvd(hstate_index(h), 7389 7389 chg * pages_per_huge_page(h), h_cg);
-1
mm/internal.h
··· 1590 1590 1591 1591 #ifdef CONFIG_UNACCEPTED_MEMORY 1592 1592 void accept_page(struct page *page); 1593 - void unaccepted_cleanup_work(struct work_struct *work); 1594 1593 #else /* CONFIG_UNACCEPTED_MEMORY */ 1595 1594 static inline void accept_page(struct page *page) 1596 1595 {
+1 -1
mm/memory.c
··· 3751 3751 3752 3752 /* Stabilize the mapcount vs. refcount and recheck. */ 3753 3753 folio_lock_large_mapcount(folio); 3754 - VM_WARN_ON_ONCE(folio_large_mapcount(folio) < folio_ref_count(folio)); 3754 + VM_WARN_ON_ONCE_FOLIO(folio_large_mapcount(folio) > folio_ref_count(folio), folio); 3755 3755 3756 3756 if (folio_test_large_maybe_mapped_shared(folio)) 3757 3757 goto unlock;
-1
mm/mm_init.c
··· 1441 1441 1442 1442 #ifdef CONFIG_UNACCEPTED_MEMORY 1443 1443 INIT_LIST_HEAD(&zone->unaccepted_pages); 1444 - INIT_WORK(&zone->unaccepted_cleanup, unaccepted_cleanup_work); 1445 1444 #endif 1446 1445 } 1447 1446
+20 -68
mm/page_alloc.c
··· 290 290 #endif 291 291 292 292 static bool page_contains_unaccepted(struct page *page, unsigned int order); 293 - static bool cond_accept_memory(struct zone *zone, unsigned int order); 293 + static bool cond_accept_memory(struct zone *zone, unsigned int order, 294 + int alloc_flags); 294 295 static bool __free_unaccepted(struct page *page); 295 296 296 297 int page_group_by_mobility_disabled __read_mostly; ··· 1152 1151 __pgalloc_tag_sub(page, nr); 1153 1152 } 1154 1153 1155 - static inline void pgalloc_tag_sub_pages(struct page *page, unsigned int nr) 1154 + /* When tag is not NULL, assuming mem_alloc_profiling_enabled */ 1155 + static inline void pgalloc_tag_sub_pages(struct alloc_tag *tag, unsigned int nr) 1156 1156 { 1157 - struct alloc_tag *tag; 1158 - 1159 - if (!mem_alloc_profiling_enabled()) 1160 - return; 1161 - 1162 - tag = __pgalloc_tag_get(page); 1163 1157 if (tag) 1164 1158 this_cpu_sub(tag->counters->bytes, PAGE_SIZE * nr); 1165 1159 } ··· 1164 1168 static inline void pgalloc_tag_add(struct page *page, struct task_struct *task, 1165 1169 unsigned int nr) {} 1166 1170 static inline void pgalloc_tag_sub(struct page *page, unsigned int nr) {} 1167 - static inline void pgalloc_tag_sub_pages(struct page *page, unsigned int nr) {} 1171 + static inline void pgalloc_tag_sub_pages(struct alloc_tag *tag, unsigned int nr) {} 1168 1172 1169 1173 #endif /* CONFIG_MEM_ALLOC_PROFILING */ 1170 1174 ··· 3612 3616 } 3613 3617 } 3614 3618 3615 - cond_accept_memory(zone, order); 3619 + cond_accept_memory(zone, order, alloc_flags); 3616 3620 3617 3621 /* 3618 3622 * Detect whether the number of free pages is below high ··· 3639 3643 gfp_mask)) { 3640 3644 int ret; 3641 3645 3642 - if (cond_accept_memory(zone, order)) 3646 + if (cond_accept_memory(zone, order, alloc_flags)) 3643 3647 goto try_this_zone; 3644 3648 3645 3649 /* ··· 3692 3696 3693 3697 return page; 3694 3698 } else { 3695 - if (cond_accept_memory(zone, order)) 3699 + if (cond_accept_memory(zone, order, 
alloc_flags)) 3696 3700 goto try_this_zone; 3697 3701 3698 3702 /* Try again if zone has deferred pages */ ··· 4845 4849 goto failed; 4846 4850 } 4847 4851 4848 - cond_accept_memory(zone, 0); 4852 + cond_accept_memory(zone, 0, alloc_flags); 4849 4853 retry_this_zone: 4850 4854 mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK) + nr_pages; 4851 4855 if (zone_watermark_fast(zone, 0, mark, ··· 4854 4858 break; 4855 4859 } 4856 4860 4857 - if (cond_accept_memory(zone, 0)) 4861 + if (cond_accept_memory(zone, 0, alloc_flags)) 4858 4862 goto retry_this_zone; 4859 4863 4860 4864 /* Try again if zone has deferred pages */ ··· 5061 5065 { 5062 5066 /* get PageHead before we drop reference */ 5063 5067 int head = PageHead(page); 5068 + /* get alloc tag in case the page is released by others */ 5069 + struct alloc_tag *tag = pgalloc_tag_get(page); 5064 5070 5065 5071 if (put_page_testzero(page)) 5066 5072 __free_frozen_pages(page, order, fpi_flags); 5067 5073 else if (!head) { 5068 - pgalloc_tag_sub_pages(page, (1 << order) - 1); 5074 + pgalloc_tag_sub_pages(tag, (1 << order) - 1); 5069 5075 while (order-- > 0) 5070 5076 __free_frozen_pages(page + (1 << order), order, 5071 5077 fpi_flags); ··· 7172 7174 7173 7175 #ifdef CONFIG_UNACCEPTED_MEMORY 7174 7176 7175 - /* Counts number of zones with unaccepted pages. 
*/ 7176 - static DEFINE_STATIC_KEY_FALSE(zones_with_unaccepted_pages); 7177 - 7178 7177 static bool lazy_accept = true; 7179 - 7180 - void unaccepted_cleanup_work(struct work_struct *work) 7181 - { 7182 - static_branch_dec(&zones_with_unaccepted_pages); 7183 - } 7184 7178 7185 7179 static int __init accept_memory_parse(char *p) 7186 7180 { ··· 7198 7208 static void __accept_page(struct zone *zone, unsigned long *flags, 7199 7209 struct page *page) 7200 7210 { 7201 - bool last; 7202 - 7203 7211 list_del(&page->lru); 7204 - last = list_empty(&zone->unaccepted_pages); 7205 - 7206 7212 account_freepages(zone, -MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE); 7207 7213 __mod_zone_page_state(zone, NR_UNACCEPTED, -MAX_ORDER_NR_PAGES); 7208 7214 __ClearPageUnaccepted(page); ··· 7207 7221 accept_memory(page_to_phys(page), PAGE_SIZE << MAX_PAGE_ORDER); 7208 7222 7209 7223 __free_pages_ok(page, MAX_PAGE_ORDER, FPI_TO_TAIL); 7210 - 7211 - if (last) { 7212 - /* 7213 - * There are two corner cases: 7214 - * 7215 - * - If allocation occurs during the CPU bring up, 7216 - * static_branch_dec() cannot be used directly as 7217 - * it causes a deadlock on cpu_hotplug_lock. 7218 - * 7219 - * Instead, use schedule_work() to prevent deadlock. 7220 - * 7221 - * - If allocation occurs before workqueues are initialized, 7222 - * static_branch_dec() should be called directly. 7223 - * 7224 - * Workqueues are initialized before CPU bring up, so this 7225 - * will not conflict with the first scenario. 
7226 - */ 7227 - if (system_wq) 7228 - schedule_work(&zone->unaccepted_cleanup); 7229 - else 7230 - unaccepted_cleanup_work(&zone->unaccepted_cleanup); 7231 - } 7232 7224 } 7233 7225 7234 7226 void accept_page(struct page *page) ··· 7243 7279 return true; 7244 7280 } 7245 7281 7246 - static inline bool has_unaccepted_memory(void) 7247 - { 7248 - return static_branch_unlikely(&zones_with_unaccepted_pages); 7249 - } 7250 - 7251 - static bool cond_accept_memory(struct zone *zone, unsigned int order) 7282 + static bool cond_accept_memory(struct zone *zone, unsigned int order, 7283 + int alloc_flags) 7252 7284 { 7253 7285 long to_accept, wmark; 7254 7286 bool ret = false; 7255 7287 7256 - if (!has_unaccepted_memory()) 7288 + if (list_empty(&zone->unaccepted_pages)) 7257 7289 return false; 7258 7290 7259 - if (list_empty(&zone->unaccepted_pages)) 7291 + /* Bailout, since try_to_accept_memory_one() needs to take a lock */ 7292 + if (alloc_flags & ALLOC_TRYLOCK) 7260 7293 return false; 7261 7294 7262 7295 wmark = promo_wmark_pages(zone); ··· 7286 7325 { 7287 7326 struct zone *zone = page_zone(page); 7288 7327 unsigned long flags; 7289 - bool first = false; 7290 7328 7291 7329 if (!lazy_accept) 7292 7330 return false; 7293 7331 7294 7332 spin_lock_irqsave(&zone->lock, flags); 7295 - first = list_empty(&zone->unaccepted_pages); 7296 7333 list_add_tail(&page->lru, &zone->unaccepted_pages); 7297 7334 account_freepages(zone, MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE); 7298 7335 __mod_zone_page_state(zone, NR_UNACCEPTED, MAX_ORDER_NR_PAGES); 7299 7336 __SetPageUnaccepted(page); 7300 7337 spin_unlock_irqrestore(&zone->lock, flags); 7301 - 7302 - if (first) 7303 - static_branch_inc(&zones_with_unaccepted_pages); 7304 7338 7305 7339 return true; 7306 7340 } ··· 7307 7351 return false; 7308 7352 } 7309 7353 7310 - static bool cond_accept_memory(struct zone *zone, unsigned int order) 7354 + static bool cond_accept_memory(struct zone *zone, unsigned int order, 7355 + int alloc_flags) 7311 
7356 { 7312 7357 return false; 7313 7358 } ··· 7379 7422 if (!pcp_allowed_order(order)) 7380 7423 return NULL; 7381 7424 7382 - #ifdef CONFIG_UNACCEPTED_MEMORY 7383 - /* Bailout, since try_to_accept_memory_one() needs to take a lock */ 7384 - if (has_unaccepted_memory()) 7385 - return NULL; 7386 - #endif 7387 7425 /* Bailout, since _deferred_grow_zone() needs to take a lock */ 7388 7426 if (deferred_pages_enabled()) 7389 7427 return NULL;
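The page_alloc.c changes thread `alloc_flags` into `cond_accept_memory()` so that trylock-style allocations bail out before acceptance, since accepting pages takes `zone->lock`. A simplified model of the new check (the flag value and `zone_model` struct are illustrative, not the kernel's definitions):

```c
#include <stdbool.h>

#define ALLOC_TRYLOCK 0x400	/* illustrative bit, not the kernel's value */

struct zone_model {
	bool has_unaccepted_pages;
};

/* Simplified shape of the reworked check: opportunistic (trylock)
 * allocations must not attempt acceptance, because accepting pages
 * takes zone->lock, which a trylock caller must not block on. The
 * list_empty() test also replaces the removed static branch. */
static bool cond_accept_memory(const struct zone_model *zone, int alloc_flags)
{
	if (!zone->has_unaccepted_pages)
		return false;
	if (alloc_flags & ALLOC_TRYLOCK)
		return false;
	return true;	/* would proceed to try_to_accept_memory_one() */
}
```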
+9
mm/swapfile.c
··· 3332 3332 } 3333 3333 3334 3334 /* 3335 + * The swap subsystem needs a major overhaul to support this. 3336 + * It doesn't work yet so just disable it for now. 3337 + */ 3338 + if (mapping_min_folio_order(mapping) > 0) { 3339 + error = -EINVAL; 3340 + goto bad_swap_unlock_inode; 3341 + } 3342 + 3343 + /* 3335 3344 * Read the swap header. 3336 3345 */ 3337 3346 if (!mapping->a_ops->read_folio) {
+10 -2
mm/userfaultfd.c
··· 1064 1064 src_folio->index = linear_page_index(dst_vma, dst_addr); 1065 1065 1066 1066 orig_dst_pte = mk_pte(&src_folio->page, dst_vma->vm_page_prot); 1067 - /* Follow mremap() behavior and treat the entry dirty after the move */ 1068 - orig_dst_pte = pte_mkwrite(pte_mkdirty(orig_dst_pte), dst_vma); 1067 + /* Set soft dirty bit so userspace can notice the pte was moved */ 1068 + #ifdef CONFIG_MEM_SOFT_DIRTY 1069 + orig_dst_pte = pte_mksoft_dirty(orig_dst_pte); 1070 + #endif 1071 + if (pte_dirty(orig_src_pte)) 1072 + orig_dst_pte = pte_mkdirty(orig_dst_pte); 1073 + orig_dst_pte = pte_mkwrite(orig_dst_pte, dst_vma); 1069 1074 1070 1075 set_pte_at(mm, dst_addr, dst_pte, orig_dst_pte); 1071 1076 out: ··· 1105 1100 } 1106 1101 1107 1102 orig_src_pte = ptep_get_and_clear(mm, src_addr, src_pte); 1103 + #ifdef CONFIG_MEM_SOFT_DIRTY 1104 + orig_src_pte = pte_swp_mksoft_dirty(orig_src_pte); 1105 + #endif 1108 1106 set_pte_at(mm, dst_addr, dst_pte, orig_src_pte); 1109 1107 double_pt_unlock(dst_ptl, src_ptl); 1110 1108
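The userfaultfd.c hunk stops force-dirtying the moved pte: the destination now inherits the hardware dirty bit from the source and additionally gets soft-dirty set, so userspace pte tracking notices the move. Modeling the flag propagation with plain bit masks (the `PTE_*` constants are illustrative, not the arch pte layout):

```c
#include <stdint.h>

#define PTE_DIRTY	(1u << 0)
#define PTE_SOFT_DIRTY	(1u << 1)
#define PTE_WRITE	(1u << 2)

/* Model of the fixed move path: the destination pte inherits the
 * hardware dirty bit from the source instead of being forced dirty,
 * and soft-dirty is always set so the move is visible to userspace. */
static uint32_t move_present_pte_flags(uint32_t src_pte)
{
	uint32_t dst = 0;

	dst |= PTE_SOFT_DIRTY;		/* pte_mksoft_dirty(): flag the move */
	if (src_pte & PTE_DIRTY)	/* carry dirty over, don't invent it */
		dst |= PTE_DIRTY;
	dst |= PTE_WRITE;		/* pte_mkwrite() */
	return dst;
}
```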
+4 -4
mm/zsmalloc.c
··· 1243 1243 class = zspage_class(pool, zspage); 1244 1244 off = offset_in_page(class->size * obj_idx); 1245 1245 1246 - if (off + class->size <= PAGE_SIZE) { 1246 + if (!ZsHugePage(zspage)) 1247 + off += ZS_HANDLE_SIZE; 1248 + 1249 + if (off + mem_len <= PAGE_SIZE) { 1247 1250 /* this object is contained entirely within a page */ 1248 1251 void *dst = kmap_local_zpdesc(zpdesc); 1249 1252 1250 - if (!ZsHugePage(zspage)) 1251 - off += ZS_HANDLE_SIZE; 1252 1253 memcpy(dst + off, handle_mem, mem_len); 1253 1254 kunmap_local(dst); 1254 1255 } else { 1255 1256 /* this object spans two pages */ 1256 1257 size_t sizes[2]; 1257 1258 1258 - off += ZS_HANDLE_SIZE; 1259 1259 sizes[0] = PAGE_SIZE - off; 1260 1260 sizes[1] = mem_len - sizes[0]; 1261 1261
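The zsmalloc.c fix moves the `ZS_HANDLE_SIZE` adjustment before the fits-in-one-page test: previously the handle bytes were added only after the check, so an object could pass the single-page test yet actually spill into the next page. A standalone sketch of the corrected split computation (constants and the `huge` flag are illustrative models of the zspage state):

```c
#include <stddef.h>

#define PAGE_SIZE_MODEL		4096
#define ZS_HANDLE_SIZE_MODEL	8	/* illustrative handle size */

/* Corrected logic: bump the offset by the handle size first, then test
 * whether [off, off + mem_len) stays within one page. On a split,
 * sizes[0]/sizes[1] are the two per-page copy lengths. */
static int split_copy_sizes(size_t off, size_t mem_len, int huge,
			    size_t sizes[2])
{
	if (!huge)
		off += ZS_HANDLE_SIZE_MODEL;

	if (off + mem_len <= PAGE_SIZE_MODEL) {
		sizes[0] = mem_len;	/* single-page copy */
		sizes[1] = 0;
		return 1;
	}
	sizes[0] = PAGE_SIZE_MODEL - off;
	sizes[1] = mem_len - sizes[0];
	return 2;			/* object spans two pages */
}
```

With the old ordering, an object at `off = 4080` of length 50 would have tested `4080 + 50 <= 4096`... still failing here, but offsets near `PAGE_SIZE - ZS_HANDLE_SIZE` could pass the unadjusted test and then overrun once the handle bytes were added.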
+18 -13
net/batman-adv/hard-interface.c
··· 506 506 return false; 507 507 } 508 508 509 - static void batadv_check_known_mac_addr(const struct net_device *net_dev) 509 + static void batadv_check_known_mac_addr(const struct batadv_hard_iface *hard_iface) 510 510 { 511 - const struct batadv_hard_iface *hard_iface; 511 + const struct net_device *mesh_iface = hard_iface->mesh_iface; 512 + const struct batadv_hard_iface *tmp_hard_iface; 512 513 513 - rcu_read_lock(); 514 - list_for_each_entry_rcu(hard_iface, &batadv_hardif_list, list) { 515 - if (hard_iface->if_status != BATADV_IF_ACTIVE && 516 - hard_iface->if_status != BATADV_IF_TO_BE_ACTIVATED) 514 + if (!mesh_iface) 515 + return; 516 + 517 + list_for_each_entry(tmp_hard_iface, &batadv_hardif_list, list) { 518 + if (tmp_hard_iface == hard_iface) 517 519 continue; 518 520 519 - if (hard_iface->net_dev == net_dev) 521 + if (tmp_hard_iface->mesh_iface != mesh_iface) 520 522 continue; 521 523 522 - if (!batadv_compare_eth(hard_iface->net_dev->dev_addr, 523 - net_dev->dev_addr)) 524 + if (tmp_hard_iface->if_status == BATADV_IF_NOT_IN_USE) 525 + continue; 526 + 527 + if (!batadv_compare_eth(tmp_hard_iface->net_dev->dev_addr, 528 + hard_iface->net_dev->dev_addr)) 524 529 continue; 525 530 526 531 pr_warn("The newly added mac address (%pM) already exists on: %s\n", 527 - net_dev->dev_addr, hard_iface->net_dev->name); 532 + hard_iface->net_dev->dev_addr, tmp_hard_iface->net_dev->name); 528 533 pr_warn("It is strongly recommended to keep mac addresses unique to avoid problems!\n"); 529 534 } 530 - rcu_read_unlock(); 531 535 } 532 536 533 537 /** ··· 767 763 hard_iface->net_dev->name, hardif_mtu, 768 764 required_mtu); 769 765 766 + batadv_check_known_mac_addr(hard_iface); 767 + 770 768 if (batadv_hardif_is_iface_up(hard_iface)) 771 769 batadv_hardif_activate_interface(hard_iface); 772 770 else ··· 907 901 908 902 batadv_v_hardif_init(hard_iface); 909 903 910 - batadv_check_known_mac_addr(hard_iface->net_dev); 911 904 kref_get(&hard_iface->refcount); 912 905 
list_add_tail_rcu(&hard_iface->list, &batadv_hardif_list); 913 906 batadv_hardif_generation++; ··· 993 988 if (hard_iface->if_status == BATADV_IF_NOT_IN_USE) 994 989 goto hardif_put; 995 990 996 - batadv_check_known_mac_addr(hard_iface->net_dev); 991 + batadv_check_known_mac_addr(hard_iface); 997 992 998 993 bat_priv = netdev_priv(hard_iface->mesh_iface); 999 994 bat_priv->algo_ops->iface.update_mac(hard_iface);
+24
net/bluetooth/hci_conn.c
··· 3023 3023 3024 3024 kfree_skb(skb); 3025 3025 } 3026 + 3027 + u8 *hci_conn_key_enc_size(struct hci_conn *conn) 3028 + { 3029 + if (conn->type == ACL_LINK) { 3030 + struct link_key *key; 3031 + 3032 + key = hci_find_link_key(conn->hdev, &conn->dst); 3033 + if (!key) 3034 + return NULL; 3035 + 3036 + return &key->pin_len; 3037 + } else if (conn->type == LE_LINK) { 3038 + struct smp_ltk *ltk; 3039 + 3040 + ltk = hci_find_ltk(conn->hdev, &conn->dst, conn->dst_type, 3041 + conn->role); 3042 + if (!ltk) 3043 + return NULL; 3044 + 3045 + return &ltk->enc_size; 3046 + } 3047 + 3048 + return NULL; 3049 + }
+42 -31
net/bluetooth/hci_event.c
··· 739 739 handle); 740 740 conn->enc_key_size = 0; 741 741 } else { 742 + u8 *key_enc_size = hci_conn_key_enc_size(conn); 743 + 742 744 conn->enc_key_size = rp->key_size; 743 745 status = 0; 744 746 745 - if (conn->enc_key_size < hdev->min_enc_key_size) { 747 + /* Attempt to check if the key size is too small or if it has 748 + * been downgraded from the last time it was stored as part of 749 + * the link_key. 750 + */ 751 + if (conn->enc_key_size < hdev->min_enc_key_size || 752 + (key_enc_size && conn->enc_key_size < *key_enc_size)) { 746 753 /* As slave role, the conn->state has been set to 747 754 * BT_CONNECTED and l2cap conn req might not be received 748 755 * yet, at this moment the l2cap layer almost does ··· 762 755 clear_bit(HCI_CONN_ENCRYPT, &conn->flags); 763 756 clear_bit(HCI_CONN_AES_CCM, &conn->flags); 764 757 } 758 + 759 + /* Update the key encryption size with the connection one */ 760 + if (key_enc_size && *key_enc_size != conn->enc_key_size) 761 + *key_enc_size = conn->enc_key_size; 765 762 } 766 763 767 764 hci_encrypt_cfm(conn, status); ··· 3076 3065 hci_dev_unlock(hdev); 3077 3066 } 3078 3067 3068 + static int hci_read_enc_key_size(struct hci_dev *hdev, struct hci_conn *conn) 3069 + { 3070 + struct hci_cp_read_enc_key_size cp; 3071 + u8 *key_enc_size = hci_conn_key_enc_size(conn); 3072 + 3073 + if (!read_key_size_capable(hdev)) { 3074 + conn->enc_key_size = HCI_LINK_KEY_SIZE; 3075 + return -EOPNOTSUPP; 3076 + } 3077 + 3078 + bt_dev_dbg(hdev, "hcon %p", conn); 3079 + 3080 + memset(&cp, 0, sizeof(cp)); 3081 + cp.handle = cpu_to_le16(conn->handle); 3082 + 3083 + /* If the key enc_size is already known, use it as conn->enc_key_size, 3084 + * otherwise use hdev->min_enc_key_size so the likes of 3085 + * l2cap_check_enc_key_size don't fail while waiting for 3086 + * HCI_OP_READ_ENC_KEY_SIZE response. 
3087 + */ 3088 + if (key_enc_size && *key_enc_size) 3089 + conn->enc_key_size = *key_enc_size; 3090 + else 3091 + conn->enc_key_size = hdev->min_enc_key_size; 3092 + 3093 + return hci_send_cmd(hdev, HCI_OP_READ_ENC_KEY_SIZE, sizeof(cp), &cp); 3094 + } 3095 + 3079 3096 static void hci_conn_complete_evt(struct hci_dev *hdev, void *data, 3080 3097 struct sk_buff *skb) 3081 3098 { ··· 3196 3157 if (ev->encr_mode == 1 && !test_bit(HCI_CONN_ENCRYPT, &conn->flags) && 3197 3158 ev->link_type == ACL_LINK) { 3198 3159 struct link_key *key; 3199 - struct hci_cp_read_enc_key_size cp; 3200 3160 3201 3161 key = hci_find_link_key(hdev, &ev->bdaddr); 3202 3162 if (key) { 3203 3163 set_bit(HCI_CONN_ENCRYPT, &conn->flags); 3204 - 3205 - if (!read_key_size_capable(hdev)) { 3206 - conn->enc_key_size = HCI_LINK_KEY_SIZE; 3207 - } else { 3208 - cp.handle = cpu_to_le16(conn->handle); 3209 - if (hci_send_cmd(hdev, HCI_OP_READ_ENC_KEY_SIZE, 3210 - sizeof(cp), &cp)) { 3211 - bt_dev_err(hdev, "sending read key size failed"); 3212 - conn->enc_key_size = HCI_LINK_KEY_SIZE; 3213 - } 3214 - } 3215 - 3164 + hci_read_enc_key_size(hdev, conn); 3216 3165 hci_encrypt_cfm(conn, ev->status); 3217 3166 } 3218 3167 } ··· 3639 3612 3640 3613 /* Try reading the encryption key size for encrypted ACL links */ 3641 3614 if (!ev->status && ev->encrypt && conn->type == ACL_LINK) { 3642 - struct hci_cp_read_enc_key_size cp; 3643 - 3644 - /* Only send HCI_Read_Encryption_Key_Size if the 3645 - * controller really supports it. If it doesn't, assume 3646 - * the default size (16). 
3647 - */ 3648 - if (!read_key_size_capable(hdev)) { 3649 - conn->enc_key_size = HCI_LINK_KEY_SIZE; 3615 + if (hci_read_enc_key_size(hdev, conn)) 3650 3616 goto notify; 3651 - } 3652 - 3653 - cp.handle = cpu_to_le16(conn->handle); 3654 - if (hci_send_cmd(hdev, HCI_OP_READ_ENC_KEY_SIZE, 3655 - sizeof(cp), &cp)) { 3656 - bt_dev_err(hdev, "sending read key size failed"); 3657 - conn->enc_key_size = HCI_LINK_KEY_SIZE; 3658 - goto notify; 3659 - } 3660 3617 3661 3618 goto unlock; 3662 3619 }
+6 -3
net/bluetooth/mgmt.c
··· 7506 7506 struct mgmt_cp_add_device *cp = cmd->param; 7507 7507 7508 7508 if (!err) { 7509 + struct hci_conn_params *params; 7510 + 7511 + params = hci_conn_params_lookup(hdev, &cp->addr.bdaddr, 7512 + le_addr_type(cp->addr.type)); 7513 + 7509 7514 device_added(cmd->sk, hdev, &cp->addr.bdaddr, cp->addr.type, 7510 7515 cp->action); 7511 7516 device_flags_changed(NULL, hdev, &cp->addr.bdaddr, 7512 7517 cp->addr.type, hdev->conn_flags, 7513 - PTR_UINT(cmd->user_data)); 7518 + params ? params->flags : 0); 7514 7519 } 7515 7520 7516 7521 mgmt_cmd_complete(cmd->sk, hdev->id, MGMT_OP_ADD_DEVICE, ··· 7617 7612 err = -ENOMEM; 7618 7613 goto unlock; 7619 7614 } 7620 - 7621 - cmd->user_data = UINT_PTR(current_flags); 7622 7615 7623 7616 err = hci_cmd_sync_queue(hdev, add_device_sync, cmd, 7624 7617 add_device_complete);
+2
net/core/dev.c
··· 10441 10441 if (!(features & feature) && (lower->features & feature)) { 10442 10442 netdev_dbg(upper, "Disabling feature %pNF on lower dev %s.\n", 10443 10443 &feature, lower->name); 10444 + netdev_lock_ops(lower); 10444 10445 lower->wanted_features &= ~feature; 10445 10446 __netdev_update_features(lower); 10446 10447 ··· 10450 10449 &feature, lower->name); 10451 10450 else 10452 10451 netdev_features_change(lower); 10452 + netdev_unlock_ops(lower); 10453 10453 } 10454 10454 } 10455 10455 }
+7
net/core/devmem.c
··· 200 200 201 201 refcount_set(&binding->ref, 1); 202 202 203 + mutex_init(&binding->lock); 204 + 203 205 binding->dmabuf = dmabuf; 204 206 205 207 binding->attachment = dma_buf_attach(binding->dmabuf, dev->dev.parent); ··· 381 379 xa_for_each(&binding->bound_rxqs, xa_idx, bound_rxq) { 382 380 if (bound_rxq == rxq) { 383 381 xa_erase(&binding->bound_rxqs, xa_idx); 382 + if (xa_empty(&binding->bound_rxqs)) { 383 + mutex_lock(&binding->lock); 384 + binding->dev = NULL; 385 + mutex_unlock(&binding->lock); 386 + } 384 387 break; 385 388 } 386 389 }
+2
net/core/devmem.h
··· 20 20 struct sg_table *sgt; 21 21 struct net_device *dev; 22 22 struct gen_pool *chunk_pool; 23 + /* Protect dev */ 24 + struct mutex lock; 23 25 24 26 /* The user holds a ref (via the netlink API) for as long as they want 25 27 * the binding to remain alive. Each page pool using this binding holds
+11
net/core/netdev-genl.c
··· 979 979 { 980 980 struct net_devmem_dmabuf_binding *binding; 981 981 struct net_devmem_dmabuf_binding *temp; 982 + netdevice_tracker dev_tracker; 982 983 struct net_device *dev; 983 984 984 985 mutex_lock(&priv->lock); 985 986 list_for_each_entry_safe(binding, temp, &priv->bindings, list) { 987 + mutex_lock(&binding->lock); 986 988 dev = binding->dev; 989 + if (!dev) { 990 + mutex_unlock(&binding->lock); 991 + net_devmem_unbind_dmabuf(binding); 992 + continue; 993 + } 994 + netdev_hold(dev, &dev_tracker, GFP_KERNEL); 995 + mutex_unlock(&binding->lock); 996 + 987 997 netdev_lock(dev); 988 998 net_devmem_unbind_dmabuf(binding); 989 999 netdev_unlock(dev); 1000 + netdev_put(dev, &dev_tracker); 990 1001 } 991 1002 mutex_unlock(&priv->lock); 992 1003 }
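The devmem hunks above guard `binding->dev` with a new mutex: the unbind path clears the pointer under the lock, and the reader takes the lock, checks the pointer is still live, and grabs a device reference before dropping the lock. A minimal userspace sketch of that weak back-pointer pattern (all names here are invented stand-ins, not the kernel structures):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects. */
struct dev {
	int refs;
};

struct binding {
	pthread_mutex_t lock;	/* protects dev, like net_devmem_dmabuf_binding::lock */
	struct dev *dev;	/* cleared when the last rxq is unbound */
};

/* Writer side: invalidate the back-pointer under the lock. */
static void binding_clear_dev(struct binding *b)
{
	pthread_mutex_lock(&b->lock);
	b->dev = NULL;
	pthread_mutex_unlock(&b->lock);
}

/* Reader side: only take a reference while the lock guarantees the
 * pointer is still live (mirrors the netdev_hold() call in the hunk). */
static struct dev *binding_get_dev(struct binding *b)
{
	struct dev *d;

	pthread_mutex_lock(&b->lock);
	d = b->dev;
	if (d)
		d->refs++;
	pthread_mutex_unlock(&b->lock);
	return d;
}
```

The key property is that a reader can never increment a refcount on a device that the writer has already disconnected, because both sides serialize on the same mutex.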
+4 -2
net/mac80211/main.c
··· 1354 1354 hw->wiphy->software_iftypes |= BIT(NL80211_IFTYPE_MONITOR); 1355 1355 1356 1356 1357 - local->int_scan_req = kzalloc(sizeof(*local->int_scan_req) + 1358 - sizeof(void *) * channels, GFP_KERNEL); 1357 + local->int_scan_req = kzalloc(struct_size(local->int_scan_req, 1358 + channels, channels), 1359 + GFP_KERNEL); 1359 1360 if (!local->int_scan_req) 1360 1361 return -ENOMEM; 1362 + local->int_scan_req->n_channels = channels; 1361 1363 1362 1364 eth_broadcast_addr(local->int_scan_req->bssid); 1363 1365
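The mac80211 hunk replaces an open-coded `sizeof(*req) + sizeof(void *) * channels` with `struct_size()`, the kernel helper from `<linux/overflow.h>` that sizes an allocation ending in a flexible array member and saturates on overflow. A simplified userspace sketch of what it computes (the saturation behavior of the real macro is omitted here):

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a struct ending in a flexible array member, like
 * cfg80211_scan_request's channels[]. */
struct scan_req {
	int n_channels;
	void *channels[];	/* flexible array member */
};

/* Offset of the flexible array plus space for n elements; the kernel's
 * struct_size() computes the same value but saturates at SIZE_MAX
 * instead of silently wrapping on overflow. */
static size_t scan_req_size(size_t n_channels)
{
	return offsetof(struct scan_req, channels) +
	       n_channels * sizeof(void *);
}
```

Using the struct's own layout via `offsetof()` also avoids double-counting padding, which the manual `sizeof(*req) + n * sizeof(void *)` form can do when the flexible array reuses tail padding.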
+11 -4
net/mctp/device.c
··· 117 117 struct net_device *dev; 118 118 struct ifaddrmsg *hdr; 119 119 struct mctp_dev *mdev; 120 - int ifindex, rc; 120 + int ifindex = 0, rc; 121 121 122 - hdr = nlmsg_data(cb->nlh); 123 - // filter by ifindex if requested 124 - ifindex = hdr->ifa_index; 122 + /* Filter by ifindex if a header is provided */ 123 + if (cb->nlh->nlmsg_len >= nlmsg_msg_size(sizeof(*hdr))) { 124 + hdr = nlmsg_data(cb->nlh); 125 + ifindex = hdr->ifa_index; 126 + } else { 127 + if (cb->strict_check) { 128 + NL_SET_ERR_MSG(cb->extack, "mctp: Invalid header for addr dump request"); 129 + return -EINVAL; 130 + } 131 + } 125 132 126 133 rcu_read_lock(); 127 134 for_each_netdev_dump(net, dev, mcb->ifindex) {
+3 -1
net/mctp/route.c
··· 313 313 314 314 key = flow->key; 315 315 316 - if (WARN_ON(key->dev && key->dev != dev)) 316 + if (key->dev) { 317 + WARN_ON(key->dev != dev); 317 318 return; 319 + } 318 320 319 321 mctp_dev_set_key(dev, key); 320 322 }
+1 -1
net/sched/sch_codel.c
··· 144 144 145 145 qlen = sch->q.qlen; 146 146 while (sch->q.qlen > sch->limit) { 147 - struct sk_buff *skb = __qdisc_dequeue_head(&sch->q); 147 + struct sk_buff *skb = qdisc_dequeue_internal(sch, true); 148 148 149 149 dropped += qdisc_pkt_len(skb); 150 150 qdisc_qstats_backlog_dec(sch, skb);
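Several qdisc hunks in this series (codel, fq, fq_codel, fq_pie, hhf, pie) converge the "drop excess packets when the limit shrinks" loop onto `qdisc_dequeue_internal()`. The shape of that change-limit path is simple; a toy model of it (the real helper additionally peeks at per-qdisc internal queues, which this sketch ignores):

```c
#include <assert.h>

#define QLEN_MAX 64

/* Toy queue standing in for sch->q: q[0] is the head. */
struct toy_qdisc {
	int q[QLEN_MAX];
	int qlen;
	int limit;
};

static int toy_dequeue(struct toy_qdisc *sch)
{
	int pkt = sch->q[0];

	for (int i = 1; i < sch->qlen; i++)
		sch->q[i - 1] = sch->q[i];
	sch->qlen--;
	return pkt;
}

/* Mirrors the change-limit path: after lowering sch->limit, drain
 * head-of-queue packets until the queue fits under the new limit,
 * returning how many were dropped. */
static int toy_change_limit(struct toy_qdisc *sch, int new_limit)
{
	int dropped = 0;

	sch->limit = new_limit;
	while (sch->qlen > sch->limit) {
		toy_dequeue(sch);
		dropped++;
	}
	return dropped;
}
```

The new tc-testing cases later in this diff ("qdisc limit trimming") exercise exactly this path: enqueue 10 packets, lower the limit to 1, and verify the qdisc survives the trim.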
+1 -1
net/sched/sch_fq.c
··· 1136 1136 sch_tree_lock(sch); 1137 1137 } 1138 1138 while (sch->q.qlen > sch->limit) { 1139 - struct sk_buff *skb = fq_dequeue(sch); 1139 + struct sk_buff *skb = qdisc_dequeue_internal(sch, false); 1140 1140 1141 1141 if (!skb) 1142 1142 break;
+1 -1
net/sched/sch_fq_codel.c
··· 441 441 442 442 while (sch->q.qlen > sch->limit || 443 443 q->memory_usage > q->memory_limit) { 444 - struct sk_buff *skb = fq_codel_dequeue(sch); 444 + struct sk_buff *skb = qdisc_dequeue_internal(sch, false); 445 445 446 446 q->cstats.drop_len += qdisc_pkt_len(skb); 447 447 rtnl_kfree_skbs(skb, skb);
+1 -1
net/sched/sch_fq_pie.c
··· 366 366 367 367 /* Drop excess packets if new limit is lower */ 368 368 while (sch->q.qlen > sch->limit) { 369 - struct sk_buff *skb = fq_pie_qdisc_dequeue(sch); 369 + struct sk_buff *skb = qdisc_dequeue_internal(sch, false); 370 370 371 371 len_dropped += qdisc_pkt_len(skb); 372 372 num_dropped += 1;
+1 -1
net/sched/sch_hhf.c
··· 564 564 qlen = sch->q.qlen; 565 565 prev_backlog = sch->qstats.backlog; 566 566 while (sch->q.qlen > sch->limit) { 567 - struct sk_buff *skb = hhf_dequeue(sch); 567 + struct sk_buff *skb = qdisc_dequeue_internal(sch, false); 568 568 569 569 rtnl_kfree_skbs(skb, skb); 570 570 }
+1 -1
net/sched/sch_pie.c
··· 195 195 /* Drop excess packets if new limit is lower */ 196 196 qlen = sch->q.qlen; 197 197 while (sch->q.qlen > sch->limit) { 198 - struct sk_buff *skb = __qdisc_dequeue_head(&sch->q); 198 + struct sk_buff *skb = qdisc_dequeue_internal(sch, true); 199 199 200 200 dropped += qdisc_pkt_len(skb); 201 201 qdisc_qstats_backlog_dec(sch, skb);
+2 -1
net/tls/tls_strp.c
··· 396 396 return 0; 397 397 398 398 shinfo = skb_shinfo(strp->anchor); 399 - shinfo->frag_list = NULL; 400 399 401 400 /* If we don't know the length go max plus page for cipher overhead */ 402 401 need_spc = strp->stm.full_len ?: TLS_MAX_PAYLOAD_SIZE + PAGE_SIZE; ··· 410 411 skb_fill_page_desc(strp->anchor, shinfo->nr_frags++, 411 412 page, 0, 0); 412 413 } 414 + 415 + shinfo->frag_list = NULL; 413 416 414 417 strp->copy_mode = 1; 415 418 strp->stm.offset = 0;
+1 -1
samples/ftrace/sample-trace-array.c
··· 112 112 /* 113 113 * If context specific per-cpu buffers havent already been allocated. 114 114 */ 115 - trace_printk_init_buffers(); 115 + trace_array_init_printk(tr); 116 116 117 117 simple_tsk = kthread_run(simple_thread, NULL, "sample-instance"); 118 118 if (IS_ERR(simple_tsk)) {
+12
scripts/Makefile.extrawarn
··· 37 37 # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111219 38 38 KBUILD_CFLAGS += $(call cc-disable-warning, format-overflow-non-kprintf) 39 39 KBUILD_CFLAGS += $(call cc-disable-warning, format-truncation-non-kprintf) 40 + 41 + # Clang may emit a warning when a const variable, such as the dummy variables 42 + # in typecheck(), or const member of an aggregate type are not initialized, 43 + # which can result in unexpected behavior. However, in many audited cases of 44 + # the "field" variant of the warning, this is intentional because the field is 45 + # never used within a particular call path, the field is within a union with 46 + # other non-const members, or the containing object is not const so the field 47 + # can be modified via memcpy() / memset(). While the variable warning also gets 48 + # disabled with this same switch, there should not be too much coverage lost 49 + # because -Wuninitialized will still flag when an uninitialized const variable 50 + # is used. 51 + KBUILD_CFLAGS += $(call cc-disable-warning, default-const-init-unsafe) 40 52 else 41 53 42 54 # gcc inanely warns about local variables called 'main'
+2 -2
scripts/Makefile.vmlinux
··· 94 94 endif 95 95 96 96 ifdef CONFIG_BUILDTIME_TABLE_SORT 97 - vmlinux: scripts/sorttable 97 + $(vmlinux-final): scripts/sorttable 98 98 endif 99 99 100 - # module.builtin.ranges 100 + # modules.builtin.ranges 101 101 # --------------------------------------------------------------------------- 102 102 ifdef CONFIG_BUILTIN_MODULE_RANGES 103 103 __default: modules.builtin.ranges
+2 -2
scripts/Makefile.vmlinux_o
··· 73 73 74 74 targets += vmlinux.o 75 75 76 - # module.builtin.modinfo 76 + # modules.builtin.modinfo 77 77 # --------------------------------------------------------------------------- 78 78 79 79 OBJCOPYFLAGS_modules.builtin.modinfo := -j .modinfo -O binary ··· 82 82 modules.builtin.modinfo: vmlinux.o FORCE 83 83 $(call if_changed,objcopy) 84 84 85 - # module.builtin 85 + # modules.builtin 86 86 # --------------------------------------------------------------------------- 87 87 88 88 # The second line aids cases where multiple modules share the same object.
+1
scripts/package/kernel.spec
··· 16 16 Source2: diff.patch 17 17 Provides: kernel-%{KERNELRELEASE} 18 18 BuildRequires: bc binutils bison dwarves 19 + BuildRequires: (elfutils-devel or libdw-devel) 19 20 BuildRequires: (elfutils-libelf-devel or libelf-devel) flex 20 21 BuildRequires: gcc make openssl openssl-devel perl python3 rsync 21 22
+1 -1
scripts/package/mkdebian
··· 210 210 Build-Depends: debhelper-compat (= 12) 211 211 Build-Depends-Arch: bc, bison, flex, 212 212 gcc-${host_gnu} <!pkg.${sourcename}.nokernelheaders>, 213 - kmod, libelf-dev:native, 213 + kmod, libdw-dev:native, libelf-dev:native, 214 214 libssl-dev:native, libssl-dev <!pkg.${sourcename}.nokernelheaders>, 215 215 python3:native, rsync 216 216 Homepage: https://www.kernel.org/
+2 -2
security/landlock/audit.c
··· 175 175 KUNIT_EXPECT_EQ(test, 10, get_hierarchy(&dom2, 0)->id); 176 176 KUNIT_EXPECT_EQ(test, 20, get_hierarchy(&dom2, 1)->id); 177 177 KUNIT_EXPECT_EQ(test, 30, get_hierarchy(&dom2, 2)->id); 178 - KUNIT_EXPECT_EQ(test, 30, get_hierarchy(&dom2, -1)->id); 178 + /* KUNIT_EXPECT_EQ(test, 30, get_hierarchy(&dom2, -1)->id); */ 179 179 } 180 180 181 181 #endif /* CONFIG_SECURITY_LANDLOCK_KUNIT_TEST */ ··· 437 437 return; 438 438 439 439 /* Checks if the current exec was restricting itself. */ 440 - if (subject->domain_exec & (1 << youngest_layer)) { 440 + if (subject->domain_exec & BIT(youngest_layer)) { 441 441 /* Ignores denials for the same execution. */ 442 442 if (!youngest_denied->log_same_exec) 443 443 return;
+31 -2
security/landlock/id.c
··· 7 7 8 8 #include <kunit/test.h> 9 9 #include <linux/atomic.h> 10 + #include <linux/bitops.h> 10 11 #include <linux/random.h> 11 12 #include <linux/spinlock.h> 12 13 ··· 26 25 * Ensures sure 64-bit values are always used by user space (or may 27 26 * fail with -EOVERFLOW), and makes this testable. 28 27 */ 29 - init = 1ULL << 32; 28 + init = BIT_ULL(32); 30 29 31 30 /* 32 31 * Makes a large (2^32) boot-time value to limit ID collision in logs ··· 106 105 * to get a new ID (e.g. a full landlock_restrict_self() call), and the 107 106 * cost of draining all available IDs during the system's uptime. 108 107 */ 109 - random_4bits = random_4bits % (1 << 4); 108 + random_4bits &= 0b1111; 110 109 step = number_of_ids + random_4bits; 111 110 112 111 /* It is safe to cast a signed atomic to an unsigned value. */ ··· 143 142 KUNIT_EXPECT_EQ( 144 143 test, get_id_range(get_random_u8(), &counter, get_random_u8()), 145 144 init + 2); 145 + } 146 + 147 + static void test_range1_rand15(struct kunit *const test) 148 + { 149 + atomic64_t counter; 150 + u64 init; 151 + 152 + init = get_random_u32(); 153 + atomic64_set(&counter, init); 154 + KUNIT_EXPECT_EQ(test, get_id_range(1, &counter, 15), init); 155 + KUNIT_EXPECT_EQ( 156 + test, get_id_range(get_random_u8(), &counter, get_random_u8()), 157 + init + 16); 146 158 } 147 159 148 160 static void test_range1_rand16(struct kunit *const test) ··· 210 196 init + 4); 211 197 } 212 198 199 + static void test_range2_rand15(struct kunit *const test) 200 + { 201 + atomic64_t counter; 202 + u64 init; 203 + 204 + init = get_random_u32(); 205 + atomic64_set(&counter, init); 206 + KUNIT_EXPECT_EQ(test, get_id_range(2, &counter, 15), init); 207 + KUNIT_EXPECT_EQ( 208 + test, get_id_range(get_random_u8(), &counter, get_random_u8()), 209 + init + 17); 210 + } 211 + 213 212 static void test_range2_rand16(struct kunit *const test) 214 213 { 215 214 atomic64_t counter; ··· 259 232 KUNIT_CASE(test_init_once), 260 233 KUNIT_CASE(test_range1_rand0), 
261 234 KUNIT_CASE(test_range1_rand1), 235 + KUNIT_CASE(test_range1_rand15), 262 236 KUNIT_CASE(test_range1_rand16), 263 237 KUNIT_CASE(test_range2_rand0), 264 238 KUNIT_CASE(test_range2_rand1), 265 239 KUNIT_CASE(test_range2_rand2), 240 + KUNIT_CASE(test_range2_rand15), 266 241 KUNIT_CASE(test_range2_rand16), 267 242 {} 268 243 /* clang-format on */
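The landlock id.c hunk swaps `random_4bits % (1 << 4)` for `random_4bits &= 0b1111` and `1ULL << 32` for `BIT_ULL(32)`. For unsigned values, masking with `2^k - 1` and reducing modulo `2^k` are equivalent; a quick standalone check, with local stand-ins for the kernel's `<linux/bits.h>` macros:

```c
#include <assert.h>
#include <stdint.h>

/* Local equivalents of the kernel's BIT()/BIT_ULL() helpers. */
#define BIT(n)		(1UL << (n))
#define BIT_ULL(n)	(1ULL << (n))

/* x & 0xf keeps the low four bits... */
static uint8_t mod16_mask(uint8_t x)
{
	return x & 0xf;
}

/* ...which for unsigned x is the same value as x % 16. */
static uint8_t mod16_div(uint8_t x)
{
	return x % (1 << 4);
}
```

The mask form makes the intent ("keep four random bits") explicit and sidesteps any question of how the compiler lowers the modulo.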
+2 -1
security/landlock/syscalls.c
··· 9 9 10 10 #include <asm/current.h> 11 11 #include <linux/anon_inodes.h> 12 + #include <linux/bitops.h> 12 13 #include <linux/build_bug.h> 13 14 #include <linux/capability.h> 14 15 #include <linux/cleanup.h> ··· 564 563 new_llcred->domain = new_dom; 565 564 566 565 #ifdef CONFIG_AUDIT 567 - new_llcred->domain_exec |= 1 << (new_dom->num_layers - 1); 566 + new_llcred->domain_exec |= BIT(new_dom->num_layers - 1); 568 567 #endif /* CONFIG_AUDIT */ 569 568 570 569 return commit_creds(new_cred);
+33 -19
sound/core/seq/seq_clientmgr.c
··· 732 732 */ 733 733 static int __deliver_to_subscribers(struct snd_seq_client *client, 734 734 struct snd_seq_event *event, 735 - struct snd_seq_client_port *src_port, 736 - int atomic, int hop) 735 + int port, int atomic, int hop) 737 736 { 737 + struct snd_seq_client_port *src_port; 738 738 struct snd_seq_subscribers *subs; 739 739 int err, result = 0, num_ev = 0; 740 740 union __snd_seq_event event_saved; 741 741 size_t saved_size; 742 742 struct snd_seq_port_subs_info *grp; 743 + 744 + if (port < 0) 745 + return 0; 746 + src_port = snd_seq_port_use_ptr(client, port); 747 + if (!src_port) 748 + return 0; 743 749 744 750 /* save original event record */ 745 751 saved_size = snd_seq_event_packet_size(event); ··· 781 775 read_unlock(&grp->list_lock); 782 776 else 783 777 up_read(&grp->list_mutex); 778 + snd_seq_port_unlock(src_port); 784 779 memcpy(event, &event_saved, saved_size); 785 780 return (result < 0) ? result : num_ev; 786 781 } ··· 790 783 struct snd_seq_event *event, 791 784 int atomic, int hop) 792 785 { 793 - struct snd_seq_client_port *src_port; 794 - int ret = 0, ret2; 786 + int ret; 787 + #if IS_ENABLED(CONFIG_SND_SEQ_UMP) 788 + int ret2; 789 + #endif 795 790 796 - src_port = snd_seq_port_use_ptr(client, event->source.port); 797 - if (src_port) { 798 - ret = __deliver_to_subscribers(client, event, src_port, atomic, hop); 799 - snd_seq_port_unlock(src_port); 800 - } 801 - 802 - if (client->ump_endpoint_port < 0 || 803 - event->source.port == client->ump_endpoint_port) 791 + ret = __deliver_to_subscribers(client, event, 792 + event->source.port, atomic, hop); 793 + #if IS_ENABLED(CONFIG_SND_SEQ_UMP) 794 + if (!snd_seq_client_is_ump(client) || client->ump_endpoint_port < 0) 804 795 return ret; 805 - 806 - src_port = snd_seq_port_use_ptr(client, client->ump_endpoint_port); 807 - if (!src_port) 808 - return ret; 809 - ret2 = __deliver_to_subscribers(client, event, src_port, atomic, hop); 810 - snd_seq_port_unlock(src_port); 811 - return ret2 < 0 ? 
ret2 : ret; 796 + /* If it's an event from EP port (and with a UMP group), 797 + * deliver to subscribers of the corresponding UMP group port, too. 798 + * Or, if it's from non-EP port, deliver to subscribers of EP port, too. 799 + */ 800 + if (event->source.port == client->ump_endpoint_port) 801 + ret2 = __deliver_to_subscribers(client, event, 802 + snd_seq_ump_group_port(event), 803 + atomic, hop); 804 + else 805 + ret2 = __deliver_to_subscribers(client, event, 806 + client->ump_endpoint_port, 807 + atomic, hop); 808 + if (ret2 < 0) 809 + return ret2; 810 + #endif 811 + return ret; 812 812 } 813 813 814 814 /* deliver an event to the destination port(s).
+18
sound/core/seq/seq_ump_convert.c
··· 1285 1285 else 1286 1286 return cvt_to_ump_midi1(dest, dest_port, event, atomic, hop); 1287 1287 } 1288 + 1289 + /* return the UMP group-port number of the event; 1290 + * return -1 if groupless or non-UMP event 1291 + */ 1292 + int snd_seq_ump_group_port(const struct snd_seq_event *event) 1293 + { 1294 + const struct snd_seq_ump_event *ump_ev = 1295 + (const struct snd_seq_ump_event *)event; 1296 + unsigned char type; 1297 + 1298 + if (!snd_seq_ev_is_ump(event)) 1299 + return -1; 1300 + type = ump_message_type(ump_ev->ump[0]); 1301 + if (ump_is_groupless_msg(type)) 1302 + return -1; 1303 + /* group-port number starts from 1 */ 1304 + return ump_message_group(ump_ev->ump[0]) + 1; 1305 + }
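The new `snd_seq_ump_group_port()` above derives a 1-based sequencer port number from the first 32-bit UMP word. In MIDI 2.0 Universal MIDI Packet framing, the message type occupies bits 31..28 and the group bits 27..24; a standalone sketch of those field extractions (the helper names mimic the kernel's UMP accessors but are defined here for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* UMP first-word layout: [31:28] message type, [27:24] group. */
static inline uint8_t ump_message_type(uint32_t word)
{
	return (word >> 28) & 0xf;
}

static inline uint8_t ump_message_group(uint32_t word)
{
	return (word >> 24) & 0xf;
}

/* Mirror of the added helper's arithmetic: sequencer group ports are
 * numbered from 1, so the 0-based UMP group is offset by one. */
static inline int ump_group_port(uint32_t word)
{
	return ump_message_group(word) + 1;
}
```

Groupless message types (such as utility and stream messages) carry no meaningful group field, which is why the kernel helper returns -1 for them before doing this extraction.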
+1
sound/core/seq/seq_ump_convert.h
··· 18 18 struct snd_seq_client_port *dest_port, 19 19 struct snd_seq_event *event, 20 20 int atomic, int hop); 21 + int snd_seq_ump_group_port(const struct snd_seq_event *event); 21 22 22 23 #endif /* __SEQ_UMP_CONVERT_H */
+1 -1
sound/hda/intel-sdw-acpi.c
··· 177 177 * sdw_intel_startup() is required for creation of devices and bus 178 178 * startup 179 179 */ 180 - int sdw_intel_acpi_scan(acpi_handle *parent_handle, 180 + int sdw_intel_acpi_scan(acpi_handle parent_handle, 181 181 struct sdw_intel_acpi_info *info) 182 182 { 183 183 acpi_status status;
+4 -2
sound/pci/es1968.c
··· 1561 1561 struct snd_pcm_runtime *runtime = substream->runtime; 1562 1562 struct es1968 *chip = snd_pcm_substream_chip(substream); 1563 1563 struct esschan *es; 1564 - int apu1, apu2; 1564 + int err, apu1, apu2; 1565 1565 1566 1566 apu1 = snd_es1968_alloc_apu_pair(chip, ESM_APU_PCM_CAPTURE); 1567 1567 if (apu1 < 0) ··· 1605 1605 runtime->hw = snd_es1968_capture; 1606 1606 runtime->hw.buffer_bytes_max = runtime->hw.period_bytes_max = 1607 1607 calc_available_memory_size(chip) - 1024; /* keep MIXBUF size */ 1608 - snd_pcm_hw_constraint_pow2(runtime, 0, SNDRV_PCM_HW_PARAM_BUFFER_BYTES); 1608 + err = snd_pcm_hw_constraint_pow2(runtime, 0, SNDRV_PCM_HW_PARAM_BUFFER_BYTES); 1609 + if (err < 0) 1610 + return err; 1609 1611 1610 1612 spin_lock_irq(&chip->substream_lock); 1611 1613 list_add(&es->list, &chip->substream_list);
+1 -1
sound/sh/Kconfig
··· 14 14 15 15 config SND_AICA 16 16 tristate "Dreamcast Yamaha AICA sound" 17 - depends on SH_DREAMCAST 17 + depends on SH_DREAMCAST && SH_DMA_API 18 18 select SND_PCM 19 19 select G2_DMA 20 20 help
+2
sound/soc/qcom/sc8280xp.c
··· 186 186 static const struct of_device_id snd_sc8280xp_dt_match[] = { 187 187 {.compatible = "qcom,qcm6490-idp-sndcard", "qcm6490"}, 188 188 {.compatible = "qcom,qcs6490-rb3gen2-sndcard", "qcs6490"}, 189 + {.compatible = "qcom,qcs9075-sndcard", "qcs9075"}, 190 + {.compatible = "qcom,qcs9100-sndcard", "qcs9100"}, 189 191 {.compatible = "qcom,sc8280xp-sndcard", "sc8280xp"}, 190 192 {.compatible = "qcom,sm8450-sndcard", "sm8450"}, 191 193 {.compatible = "qcom,sm8550-sndcard", "sm8550"},
+4
sound/usb/quirks.c
··· 2242 2242 QUIRK_FLAG_CTL_MSG_DELAY_1M), 2243 2243 DEVICE_FLG(0x0c45, 0x6340, /* Sonix HD USB Camera */ 2244 2244 QUIRK_FLAG_GET_SAMPLE_RATE), 2245 + DEVICE_FLG(0x0c45, 0x636b, /* Microdia JP001 USB Camera */ 2246 + QUIRK_FLAG_GET_SAMPLE_RATE), 2245 2247 DEVICE_FLG(0x0d8c, 0x0014, /* USB Audio Device */ 2246 2248 QUIRK_FLAG_CTL_MSG_DELAY_1M), 2247 2249 DEVICE_FLG(0x0ecb, 0x205c, /* JBL Quantum610 Wireless */ ··· 2252 2250 QUIRK_FLAG_FIXED_RATE), 2253 2251 DEVICE_FLG(0x0fd9, 0x0008, /* Hauppauge HVR-950Q */ 2254 2252 QUIRK_FLAG_SHARE_MEDIA_DEVICE | QUIRK_FLAG_ALIGN_TRANSFER), 2253 + DEVICE_FLG(0x1101, 0x0003, /* Audioengine D1 */ 2254 + QUIRK_FLAG_GET_SAMPLE_RATE), 2255 2255 DEVICE_FLG(0x1224, 0x2a25, /* Jieli Technology USB PHY 2.0 */ 2256 2256 QUIRK_FLAG_GET_SAMPLE_RATE | QUIRK_FLAG_MIC_RES_16), 2257 2257 DEVICE_FLG(0x1395, 0x740a, /* Sennheiser DECT */
+15 -7
tools/net/ynl/pyynl/ethtool.py
··· 338 338 print('Capabilities:') 339 339 [print(f'\t{v}') for v in bits_to_dict(tsinfo['timestamping'])] 340 340 341 - print(f'PTP Hardware Clock: {tsinfo["phc-index"]}') 341 + print(f'PTP Hardware Clock: {tsinfo.get("phc-index", "none")}') 342 342 343 - print('Hardware Transmit Timestamp Modes:') 344 - [print(f'\t{v}') for v in bits_to_dict(tsinfo['tx-types'])] 343 + if 'tx-types' in tsinfo: 344 + print('Hardware Transmit Timestamp Modes:') 345 + [print(f'\t{v}') for v in bits_to_dict(tsinfo['tx-types'])] 346 + else: 347 + print('Hardware Transmit Timestamp Modes: none') 345 348 346 - print('Hardware Receive Filter Modes:') 347 - [print(f'\t{v}') for v in bits_to_dict(tsinfo['rx-filters'])] 349 + if 'rx-filters' in tsinfo: 350 + print('Hardware Receive Filter Modes:') 351 + [print(f'\t{v}') for v in bits_to_dict(tsinfo['rx-filters'])] 352 + else: 353 + print('Hardware Receive Filter Modes: none') 348 354 349 - print('Statistics:') 350 - [print(f'\t{k}: {v}') for k, v in tsinfo['stats'].items()] 355 + if 'stats' in tsinfo and tsinfo['stats']: 356 + print('Statistics:') 357 + [print(f'\t{k}: {v}') for k, v in tsinfo['stats'].items()] 358 + 351 359 return 352 360 353 361 print(f'Settings for {args.device}:')
+3 -4
tools/net/ynl/pyynl/ynl_gen_c.py
··· 1143 1143 self.pure_nested_structs[nested].request = True 1144 1144 if attr in rs_members['reply']: 1145 1145 self.pure_nested_structs[nested].reply = True 1146 - 1147 - if spec.is_multi_val(): 1148 - child = self.pure_nested_structs.get(nested) 1149 - child.in_multi_val = True 1146 + if spec.is_multi_val(): 1147 + child = self.pure_nested_structs.get(nested) 1148 + child.in_multi_val = True 1150 1149 1151 1150 self._sort_pure_types() 1152 1151
+9
tools/objtool/arch/x86/decode.c
··· 189 189 op2 = ins.opcode.bytes[1]; 190 190 op3 = ins.opcode.bytes[2]; 191 191 192 + /* 193 + * XXX hack, decoder is buggered and thinks 0xea is 7 bytes long. 194 + */ 195 + if (op1 == 0xea) { 196 + insn->len = 1; 197 + insn->type = INSN_BUG; 198 + return 0; 199 + } 200 + 192 201 if (ins.rex_prefix.nbytes) { 193 202 rex = ins.rex_prefix.bytes[0]; 194 203 rex_w = X86_REX_W(rex) >> 3;
+1
tools/testing/selftests/Makefile
··· 121 121 TARGETS += vDSO 122 122 TARGETS += mm 123 123 TARGETS += x86 124 + TARGETS += x86/bugs 124 125 TARGETS += zram 125 126 #Please keep the TARGETS list alphabetically sorted 126 127 # Run "make quicktest=1 run_tests" or
+22 -33
tools/testing/selftests/drivers/net/hw/ncdevmem.c
··· 431 431 return 0; 432 432 } 433 433 434 + static struct netdev_queue_id *create_queues(void) 435 + { 436 + struct netdev_queue_id *queues; 437 + size_t i = 0; 438 + 439 + queues = calloc(num_queues, sizeof(*queues)); 440 + for (i = 0; i < num_queues; i++) { 441 + queues[i]._present.type = 1; 442 + queues[i]._present.id = 1; 443 + queues[i].type = NETDEV_QUEUE_TYPE_RX; 444 + queues[i].id = start_queue + i; 445 + } 446 + 447 + return queues; 448 + } 449 + 434 450 int do_server(struct memory_buffer *mem) 435 451 { 436 452 char ctrl_data[sizeof(int) * 20000]; ··· 464 448 char buffer[256]; 465 449 int socket_fd; 466 450 int client_fd; 467 - size_t i = 0; 468 451 int ret; 469 452 470 453 ret = parse_address(server_ip, atoi(port), &server_sin); ··· 486 471 487 472 sleep(1); 488 473 489 - queues = malloc(sizeof(*queues) * num_queues); 490 - 491 - for (i = 0; i < num_queues; i++) { 492 - queues[i]._present.type = 1; 493 - queues[i]._present.id = 1; 494 - queues[i].type = NETDEV_QUEUE_TYPE_RX; 495 - queues[i].id = start_queue + i; 496 - } 497 - 498 - if (bind_rx_queue(ifindex, mem->fd, queues, num_queues, &ys)) 474 + if (bind_rx_queue(ifindex, mem->fd, create_queues(), num_queues, &ys)) 499 475 error(1, 0, "Failed to bind\n"); 500 476 501 477 tmp_mem = malloc(mem->size); ··· 551 545 goto cleanup; 552 546 } 553 547 554 - i++; 555 548 for (cm = CMSG_FIRSTHDR(&msg); cm; cm = CMSG_NXTHDR(&msg, cm)) { 556 549 if (cm->cmsg_level != SOL_SOCKET || 557 550 (cm->cmsg_type != SCM_DEVMEM_DMABUF && ··· 635 630 636 631 void run_devmem_tests(void) 637 632 { 638 - struct netdev_queue_id *queues; 639 633 struct memory_buffer *mem; 640 634 struct ynl_sock *ys; 641 - size_t i = 0; 642 635 643 636 mem = provider->alloc(getpagesize() * NUM_PAGES); 644 637 ··· 644 641 if (configure_rss()) 645 642 error(1, 0, "rss error\n"); 646 643 647 - queues = calloc(num_queues, sizeof(*queues)); 648 - 649 644 if (configure_headersplit(1)) 650 645 error(1, 0, "Failed to configure header split\n"); 651 646 
652 - if (!bind_rx_queue(ifindex, mem->fd, queues, num_queues, &ys)) 647 + if (!bind_rx_queue(ifindex, mem->fd, 648 + calloc(num_queues, sizeof(struct netdev_queue_id)), 649 + num_queues, &ys)) 653 650 error(1, 0, "Binding empty queues array should have failed\n"); 654 - 655 - for (i = 0; i < num_queues; i++) { 656 - queues[i]._present.type = 1; 657 - queues[i]._present.id = 1; 658 - queues[i].type = NETDEV_QUEUE_TYPE_RX; 659 - queues[i].id = start_queue + i; 660 - } 661 651 662 652 if (configure_headersplit(0)) 663 653 error(1, 0, "Failed to configure header split\n"); 664 654 665 - if (!bind_rx_queue(ifindex, mem->fd, queues, num_queues, &ys)) 655 + if (!bind_rx_queue(ifindex, mem->fd, create_queues(), num_queues, &ys)) 666 656 error(1, 0, "Configure dmabuf with header split off should have failed\n"); 667 657 668 658 if (configure_headersplit(1)) 669 659 error(1, 0, "Failed to configure header split\n"); 670 660 671 - for (i = 0; i < num_queues; i++) { 672 - queues[i]._present.type = 1; 673 - queues[i]._present.id = 1; 674 - queues[i].type = NETDEV_QUEUE_TYPE_RX; 675 - queues[i].id = start_queue + i; 676 - } 677 - 678 - if (bind_rx_queue(ifindex, mem->fd, queues, num_queues, &ys)) 661 + if (bind_rx_queue(ifindex, mem->fd, create_queues(), num_queues, &ys)) 679 662 error(1, 0, "Failed to bind\n"); 680 663 681 664 /* Deactivating a bound queue should not be legal */
+24
tools/testing/selftests/tc-testing/tc-tests/qdiscs/codel.json
··· 189 189 "teardown": [ 190 190 "$TC qdisc del dev $DUMMY handle 1: root" 191 191 ] 192 + }, 193 + { 194 + "id": "deb1", 195 + "name": "CODEL test qdisc limit trimming", 196 + "category": ["qdisc", "codel"], 197 + "plugins": { 198 + "requires": ["nsPlugin", "scapyPlugin"] 199 + }, 200 + "setup": [ 201 + "$TC qdisc add dev $DEV1 handle 1: root codel limit 10" 202 + ], 203 + "scapy": [ 204 + { 205 + "iface": "$DEV0", 206 + "count": 10, 207 + "packet": "Ether(type=0x800)/IP(src='10.0.0.10',dst='10.0.0.20')/TCP(sport=5000,dport=10)" 208 + } 209 + ], 210 + "cmdUnderTest": "$TC qdisc change dev $DEV1 handle 1: root codel limit 1", 211 + "expExitCode": "0", 212 + "verifyCmd": "$TC qdisc show dev $DEV1", 213 + "matchPattern": "qdisc codel 1: root refcnt [0-9]+ limit 1p target 5ms interval 100ms", 214 + "matchCount": "1", 215 + "teardown": ["$TC qdisc del dev $DEV1 handle 1: root"] 192 216 } 193 217 ]
+22
tools/testing/selftests/tc-testing/tc-tests/qdiscs/fq.json
··· 377 377 "teardown": [ 378 378 "$TC qdisc del dev $DUMMY handle 1: root" 379 379 ] 380 + }, 381 + { 382 + "id": "9479", 383 + "name": "FQ test qdisc limit trimming", 384 + "category": ["qdisc", "fq"], 385 + "plugins": {"requires": ["nsPlugin", "scapyPlugin"]}, 386 + "setup": [ 387 + "$TC qdisc add dev $DEV1 handle 1: root fq limit 10" 388 + ], 389 + "scapy": [ 390 + { 391 + "iface": "$DEV0", 392 + "count": 10, 393 + "packet": "Ether(type=0x800)/IP(src='10.0.0.10',dst='10.0.0.20')/TCP(sport=5000,dport=10)" 394 + } 395 + ], 396 + "cmdUnderTest": "$TC qdisc change dev $DEV1 handle 1: root fq limit 1", 397 + "expExitCode": "0", 398 + "verifyCmd": "$TC qdisc show dev $DEV1", 399 + "matchPattern": "qdisc fq 1: root refcnt [0-9]+ limit 1p", 400 + "matchCount": "1", 401 + "teardown": ["$TC qdisc del dev $DEV1 handle 1: root"] 380 402 } 381 403 ]
+22
tools/testing/selftests/tc-testing/tc-tests/qdiscs/fq_codel.json
··· 294 294 "teardown": [ 295 295 "$TC qdisc del dev $DUMMY handle 1: root" 296 296 ] 297 + }, 298 + { 299 + "id": "0436", 300 + "name": "FQ_CODEL test qdisc limit trimming", 301 + "category": ["qdisc", "fq_codel"], 302 + "plugins": {"requires": ["nsPlugin", "scapyPlugin"]}, 303 + "setup": [ 304 + "$TC qdisc add dev $DEV1 handle 1: root fq_codel limit 10" 305 + ], 306 + "scapy": [ 307 + { 308 + "iface": "$DEV0", 309 + "count": 10, 310 + "packet": "Ether(type=0x800)/IP(src='10.0.0.10',dst='10.0.0.20')/TCP(sport=5000,dport=10)" 311 + } 312 + ], 313 + "cmdUnderTest": "$TC qdisc change dev $DEV1 handle 1: root fq_codel limit 1", 314 + "expExitCode": "0", 315 + "verifyCmd": "$TC qdisc show dev $DEV1", 316 + "matchPattern": "qdisc fq_codel 1: root refcnt [0-9]+ limit 1p flows 1024 quantum.*target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64", 317 + "matchCount": "1", 318 + "teardown": ["$TC qdisc del dev $DEV1 handle 1: root"] 297 319 } 298 320 ]
+22
tools/testing/selftests/tc-testing/tc-tests/qdiscs/fq_pie.json
··· 18 18 "matchCount": "1", 19 19 "teardown": [ 20 20 ] 21 + }, 22 + { 23 + "id": "83bf", 24 + "name": "FQ_PIE test qdisc limit trimming", 25 + "category": ["qdisc", "fq_pie"], 26 + "plugins": {"requires": ["nsPlugin", "scapyPlugin"]}, 27 + "setup": [ 28 + "$TC qdisc add dev $DEV1 handle 1: root fq_pie limit 10" 29 + ], 30 + "scapy": [ 31 + { 32 + "iface": "$DEV0", 33 + "count": 10, 34 + "packet": "Ether(type=0x800)/IP(src='10.0.0.10',dst='10.0.0.20')/TCP(sport=5000,dport=10)" 35 + } 36 + ], 37 + "cmdUnderTest": "$TC qdisc change dev $DEV1 handle 1: root fq_pie limit 1", 38 + "expExitCode": "0", 39 + "verifyCmd": "$TC qdisc show dev $DEV1", 40 + "matchPattern": "qdisc fq_pie 1: root refcnt [0-9]+ limit 1p", 41 + "matchCount": "1", 42 + "teardown": ["$TC qdisc del dev $DEV1 handle 1: root"] 21 43 } 22 44 ]
+22
tools/testing/selftests/tc-testing/tc-tests/qdiscs/hhf.json
··· 188 188 "teardown": [ 189 189 "$TC qdisc del dev $DUMMY handle 1: root" 190 190 ] 191 + }, 192 + { 193 + "id": "385f", 194 + "name": "HHF test qdisc limit trimming", 195 + "category": ["qdisc", "hhf"], 196 + "plugins": {"requires": ["nsPlugin", "scapyPlugin"]}, 197 + "setup": [ 198 + "$TC qdisc add dev $DEV1 handle 1: root hhf limit 10" 199 + ], 200 + "scapy": [ 201 + { 202 + "iface": "$DEV0", 203 + "count": 10, 204 + "packet": "Ether(type=0x800)/IP(src='10.0.0.10',dst='10.0.0.20')/TCP(sport=5000,dport=10)" 205 + } 206 + ], 207 + "cmdUnderTest": "$TC qdisc change dev $DEV1 handle 1: root hhf limit 1", 208 + "expExitCode": "0", 209 + "verifyCmd": "$TC qdisc show dev $DEV1", 210 + "matchPattern": "qdisc hhf 1: root refcnt [0-9]+ limit 1p.*hh_limit 2048 reset_timeout 40ms admit_bytes 128Kb evict_timeout 1s non_hh_weight 2", 211 + "matchCount": "1", 212 + "teardown": ["$TC qdisc del dev $DEV1 handle 1: root"] 191 213 } 192 214 ]
+24
tools/testing/selftests/tc-testing/tc-tests/qdiscs/pie.json
··· 1 + [ 2 + { 3 + "id": "6158", 4 + "name": "PIE test qdisc limit trimming", 5 + "category": ["qdisc", "pie"], 6 + "plugins": {"requires": ["nsPlugin", "scapyPlugin"]}, 7 + "setup": [ 8 + "$TC qdisc add dev $DEV1 handle 1: root pie limit 10" 9 + ], 10 + "scapy": [ 11 + { 12 + "iface": "$DEV0", 13 + "count": 10, 14 + "packet": "Ether(type=0x800)/IP(src='10.0.0.10',dst='10.0.0.20')/TCP(sport=5000,dport=10)" 15 + } 16 + ], 17 + "cmdUnderTest": "$TC qdisc change dev $DEV1 handle 1: root pie limit 1", 18 + "expExitCode": "0", 19 + "verifyCmd": "$TC qdisc show dev $DEV1", 20 + "matchPattern": "qdisc pie 1: root refcnt [0-9]+ limit 1p", 21 + "matchCount": "1", 22 + "teardown": ["$TC qdisc del dev $DEV1 handle 1: root"] 23 + } 24 + ]
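The tdc tests added above all follow the same shape: overfill a qdisc, shrink its limit with `tc qdisc change`, then apply `matchPattern` as a regex over the `verifyCmd` output and require exactly `matchCount` hits. As a sketch, the pattern from the "6158" PIE test can be checked against a sample `tc qdisc show` line (the sample line is fabricated for illustration; real field output varies by kernel and iproute2 version):

```python
import re

# matchPattern from the "6158" PIE test above; tdc applies it to the
# output of the verifyCmd ("$TC qdisc show dev $DEV1").
pattern = r"qdisc pie 1: root refcnt [0-9]+ limit 1p"

# Hypothetical post-trim `tc qdisc show` line (illustrative only).
sample = "qdisc pie 1: root refcnt 2 limit 1p target 15ms tupdate 15ms alpha 2 beta 20"

matches = re.findall(pattern, sample)
print(len(matches))  # tdc requires matchCount == "1"
```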
+3
tools/testing/selftests/x86/bugs/Makefile
··· 1 + TEST_PROGS := its_sysfs.py its_permutations.py its_indirect_alignment.py its_ret_alignment.py 2 + TEST_FILES := common.py 3 + include ../../lib.mk
+164
tools/testing/selftests/x86/bugs/common.py
··· 1 + #!/usr/bin/env python3
2 + # SPDX-License-Identifier: GPL-2.0
3 + #
4 + # Copyright (c) 2025 Intel Corporation
5 + #
6 + # This contains kselftest framework adapted common functions for testing
7 + # mitigation for x86 bugs.
8 +
9 + import os, sys, re, shutil
10 +
11 + sys.path.insert(0, '../../kselftest')
12 + import ksft
13 +
14 + def read_file(path):
15 +     if not os.path.exists(path):
16 +         return None
17 +     with open(path, 'r') as file:
18 +         return file.read().strip()
19 +
20 + def cpuinfo_has(arg):
21 +     cpuinfo = read_file('/proc/cpuinfo')
22 +     if arg in cpuinfo:
23 +         return True
24 +     return False
25 +
26 + def cmdline_has(arg):
27 +     cmdline = read_file('/proc/cmdline')
28 +     if arg in cmdline:
29 +         return True
30 +     return False
31 +
32 + def cmdline_has_either(args):
33 +     cmdline = read_file('/proc/cmdline')
34 +     for arg in args:
35 +         if arg in cmdline:
36 +             return True
37 +     return False
38 +
39 + def cmdline_has_none(args):
40 +     return not cmdline_has_either(args)
41 +
42 + def cmdline_has_all(args):
43 +     cmdline = read_file('/proc/cmdline')
44 +     for arg in args:
45 +         if arg not in cmdline:
46 +             return False
47 +     return True
48 +
49 + def get_sysfs(bug):
50 +     return read_file("/sys/devices/system/cpu/vulnerabilities/" + bug)
51 +
52 + def sysfs_has(bug, mitigation):
53 +     status = get_sysfs(bug)
54 +     if mitigation in status:
55 +         return True
56 +     return False
57 +
58 + def sysfs_has_either(bugs, mitigations):
59 +     for bug in bugs:
60 +         for mitigation in mitigations:
61 +             if sysfs_has(bug, mitigation):
62 +                 return True
63 +     return False
64 +
65 + def sysfs_has_none(bugs, mitigations):
66 +     return not sysfs_has_either(bugs, mitigations)
67 +
68 + def sysfs_has_all(bugs, mitigations):
69 +     for bug in bugs:
70 +         for mitigation in mitigations:
71 +             if not sysfs_has(bug, mitigation):
72 +                 return False
73 +     return True
74 +
75 + def bug_check_pass(bug, found):
76 +     ksft.print_msg(f"\nFound: {found}")
77 +     # ksft.print_msg(f"\ncmdline: {read_file('/proc/cmdline')}")
78 +     ksft.test_result_pass(f'{bug}: {found}')
79 +
80 + def bug_check_fail(bug, found, expected):
81 +     ksft.print_msg(f'\nFound:\t {found}')
82 +     ksft.print_msg(f'Expected:\t {expected}')
83 +     ksft.print_msg(f"\ncmdline: {read_file('/proc/cmdline')}")
84 +     ksft.test_result_fail(f'{bug}: {found}')
85 +
86 + def bug_status_unknown(bug, found):
87 +     ksft.print_msg(f'\nUnknown status: {found}')
88 +     ksft.print_msg(f"\ncmdline: {read_file('/proc/cmdline')}")
89 +     ksft.test_result_fail(f'{bug}: {found}')
90 +
91 + def basic_checks_sufficient(bug, mitigation):
92 +     if not mitigation:
93 +         bug_status_unknown(bug, "None")
94 +         return True
95 +     elif mitigation == "Not affected":
96 +         ksft.test_result_pass(bug)
97 +         return True
98 +     elif mitigation == "Vulnerable":
99 +         if cmdline_has_either([f'{bug}=off', 'mitigations=off']):
100 +             bug_check_pass(bug, mitigation)
101 +             return True
102 +     return False
103 +
104 + def get_section_info(vmlinux, section_name):
105 +     from elftools.elf.elffile import ELFFile
106 +     with open(vmlinux, 'rb') as f:
107 +         elffile = ELFFile(f)
108 +         section = elffile.get_section_by_name(section_name)
109 +         if section is None:
110 +             ksft.print_msg("Available sections in vmlinux:")
111 +             for sec in elffile.iter_sections():
112 +                 ksft.print_msg(sec.name)
113 +             raise ValueError(f"Section {section_name} not found in {vmlinux}")
114 +         return section['sh_addr'], section['sh_offset'], section['sh_size']
115 +
116 + def get_patch_sites(vmlinux, offset, size):
117 +     import struct
118 +     output = []
119 +     with open(vmlinux, 'rb') as f:
120 +         f.seek(offset)
121 +         i = 0
122 +         while i < size:
123 +             data = f.read(4)  # s32
124 +             if not data:
125 +                 break
126 +             sym_offset = struct.unpack('<i', data)[0] + i
127 +             i += 4
128 +             output.append(sym_offset)
129 +     return output
130 +
131 + def get_instruction_from_vmlinux(elffile, section, virtual_address, target_address):
132 +     from capstone import Cs, CS_ARCH_X86, CS_MODE_64
133 +     section_start = section['sh_addr']
134 +     section_end = section_start + section['sh_size']
135 +
136 +     if not (section_start <= target_address < section_end):
137 +         return None
138 +
139 +     offset = target_address - section_start
140 +     code = section.data()[offset:offset + 16]
141 +
142 +     cap = init_capstone()
143 +     for instruction in cap.disasm(code, target_address):
144 +         if instruction.address == target_address:
145 +             return instruction
146 +     return None
147 +
148 + def init_capstone():
149 +     from capstone import Cs, CS_ARCH_X86, CS_MODE_64, CS_OPT_SYNTAX_ATT
150 +     cap = Cs(CS_ARCH_X86, CS_MODE_64)
151 +     cap.syntax = CS_OPT_SYNTAX_ATT
152 +     return cap
153 +
154 + def get_runtime_kernel():
155 +     import drgn
156 +     return drgn.program_from_kernel()
157 +
158 + def check_dependencies_or_skip(modules, script_name="unknown test"):
159 +     for mod in modules:
160 +         try:
161 +             __import__(mod)
162 +         except ImportError:
163 +             ksft.test_result_skip(f"Skipping {script_name}: missing module '{mod}'")
164 +             ksft.finished()
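The `get_patch_sites()` helper in common.py decodes each `.retpoline_sites`/`.return_sites` entry as a little-endian s32 that is relative to the entry's own position, so the cursor is added back in to recover the site offset. A self-contained sketch of that arithmetic, operating on a fabricated in-memory section instead of a vmlinux:

```python
import struct

def decode_patch_sites(data):
    # Mirrors the arithmetic in common.py's get_patch_sites(): each
    # 4-byte little-endian s32 is relative to its own offset within
    # the section, so add the cursor position back in.
    output = []
    i = 0
    while i < len(data):
        sym_offset = struct.unpack_from('<i', data, i)[0] + i
        output.append(sym_offset)
        i += 4
    return output

# Fabricated section: two entries storing relative offsets 0x100 and 0xf0.
section = struct.pack('<ii', 0x100, 0xf0)
print(decode_patch_sites(section))  # → [256, 244] (0x100+0, 0xf0+4)
```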
+150
tools/testing/selftests/x86/bugs/its_indirect_alignment.py
··· 1 + #!/usr/bin/env python3
2 + # SPDX-License-Identifier: GPL-2.0
3 + #
4 + # Copyright (c) 2025 Intel Corporation
5 + #
6 + # Test for indirect target selection (ITS) mitigation.
7 + #
8 + # Test if indirect CALL/JMP are correctly patched by evaluating
9 + # the vmlinux .retpoline_sites in /proc/kcore.
10 +
11 + # Install dependencies
12 + # add-apt-repository ppa:michel-slm/kernel-utils
13 + # apt update
14 + # apt install -y python3-drgn python3-pyelftools python3-capstone
15 + #
16 + # Best to copy the vmlinux at a standard location:
17 + # mkdir -p /usr/lib/debug/lib/modules/$(uname -r)
18 + # cp $VMLINUX /usr/lib/debug/lib/modules/$(uname -r)/vmlinux
19 + #
20 + # Usage: ./its_indirect_alignment.py [vmlinux]
21 +
22 + import os, sys, argparse
23 + from pathlib import Path
24 +
25 + this_dir = os.path.dirname(os.path.realpath(__file__))
26 + sys.path.insert(0, this_dir + '/../../kselftest')
27 + import ksft
28 + import common as c
29 +
30 + bug = "indirect_target_selection"
31 +
32 + mitigation = c.get_sysfs(bug)
33 + if not mitigation or "Aligned branch/return thunks" not in mitigation:
34 +     ksft.test_result_skip("Skipping its_indirect_alignment.py: Aligned branch/return thunks not enabled")
35 +     ksft.finished()
36 +
37 + if c.sysfs_has("spectre_v2", "Retpolines"):
38 +     ksft.test_result_skip("Skipping its_indirect_alignment.py: Retpolines deployed")
39 +     ksft.finished()
40 +
41 + c.check_dependencies_or_skip(['drgn', 'elftools', 'capstone'], script_name="its_indirect_alignment.py")
42 +
43 + from elftools.elf.elffile import ELFFile
44 + from drgn.helpers.common.memory import identify_address
45 +
46 + cap = c.init_capstone()
47 +
48 + if len(os.sys.argv) > 1:
49 +     arg_vmlinux = os.sys.argv[1]
50 +     if not os.path.exists(arg_vmlinux):
51 +         ksft.test_result_fail(f"its_indirect_alignment.py: vmlinux not found at argument path: {arg_vmlinux}")
52 +         ksft.exit_fail()
53 +     os.makedirs(f"/usr/lib/debug/lib/modules/{os.uname().release}", exist_ok=True)
54 +     os.system(f'cp {arg_vmlinux} /usr/lib/debug/lib/modules/$(uname -r)/vmlinux')
55 +
56 + vmlinux = f"/usr/lib/debug/lib/modules/{os.uname().release}/vmlinux"
57 + if not os.path.exists(vmlinux):
58 +     ksft.test_result_fail(f"its_indirect_alignment.py: vmlinux not found at {vmlinux}")
59 +     ksft.exit_fail()
60 +
61 + ksft.print_msg(f"Using vmlinux: {vmlinux}")
62 +
63 + retpolines_start_vmlinux, retpolines_sec_offset, size = c.get_section_info(vmlinux, '.retpoline_sites')
64 + ksft.print_msg(f"vmlinux: Section .retpoline_sites (0x{retpolines_start_vmlinux:x}) found at 0x{retpolines_sec_offset:x} with size 0x{size:x}")
65 +
66 + sites_offset = c.get_patch_sites(vmlinux, retpolines_sec_offset, size)
67 + total_retpoline_tests = len(sites_offset)
68 + ksft.print_msg(f"Found {total_retpoline_tests} retpoline sites")
69 +
70 + prog = c.get_runtime_kernel()
71 + retpolines_start_kcore = prog.symbol('__retpoline_sites').address
72 + ksft.print_msg(f'kcore: __retpoline_sites: 0x{retpolines_start_kcore:x}')
73 +
74 + x86_indirect_its_thunk_r15 = prog.symbol('__x86_indirect_its_thunk_r15').address
75 + ksft.print_msg(f'kcore: __x86_indirect_its_thunk_r15: 0x{x86_indirect_its_thunk_r15:x}')
76 +
77 + tests_passed = 0
78 + tests_failed = 0
79 + tests_unknown = 0
80 +
81 + with open(vmlinux, 'rb') as f:
82 +     elffile = ELFFile(f)
83 +     text_section = elffile.get_section_by_name('.text')
84 +
85 +     for i in range(0, len(sites_offset)):
86 +         site = retpolines_start_kcore + sites_offset[i]
87 +         vmlinux_site = retpolines_start_vmlinux + sites_offset[i]
88 +         passed = unknown = failed = False
89 +         try:
90 +             vmlinux_insn = c.get_instruction_from_vmlinux(elffile, text_section, text_section['sh_addr'], vmlinux_site)
91 +             kcore_insn = list(cap.disasm(prog.read(site, 16), site))[0]
92 +             operand = kcore_insn.op_str
93 +             insn_end = site + kcore_insn.size - 1  # TODO handle Jcc.32 __x86_indirect_thunk_\reg
94 +             safe_site = insn_end & 0x20
95 +             site_status = "" if safe_site else "(unsafe)"
96 +
97 +             ksft.print_msg(f"\nSite {i}: {identify_address(prog, site)} <0x{site:x}> {site_status}")
98 +             ksft.print_msg(f"\tvmlinux: 0x{vmlinux_insn.address:x}:\t{vmlinux_insn.mnemonic}\t{vmlinux_insn.op_str}")
99 +             ksft.print_msg(f"\tkcore: 0x{kcore_insn.address:x}:\t{kcore_insn.mnemonic}\t{kcore_insn.op_str}")
100 +
101 +             if (site & 0x20) ^ (insn_end & 0x20):
102 +                 ksft.print_msg(f"\tSite at safe/unsafe boundary: {str(kcore_insn.bytes)} {kcore_insn.mnemonic} {operand}")
103 +             if safe_site:
104 +                 tests_passed += 1
105 +                 passed = True
106 +                 ksft.print_msg(f"\tPASSED: At safe address")
107 +                 continue
108 +
109 +             if operand.startswith('0xffffffff'):
110 +                 thunk = int(operand, 16)
111 +                 if thunk > x86_indirect_its_thunk_r15:
112 +                     insn_at_thunk = list(cap.disasm(prog.read(thunk, 16), thunk))[0]
113 +                     operand += ' -> ' + insn_at_thunk.mnemonic + ' ' + insn_at_thunk.op_str + ' <dynamic-thunk?>'
114 +                     if 'jmp' in insn_at_thunk.mnemonic and thunk & 0x20:
115 +                         ksft.print_msg(f"\tPASSED: Found {operand} at safe address")
116 +                         passed = True
117 +                 if not passed:
118 +                     if kcore_insn.operands[0].type == capstone.CS_OP_IMM:
119 +                         operand += ' <' + prog.symbol(int(operand, 16)) + '>'
120 +                         if '__x86_indirect_its_thunk_' in operand:
121 +                             ksft.print_msg(f"\tPASSED: Found {operand}")
122 +                         else:
123 +                             ksft.print_msg(f"\tPASSED: Found direct branch: {kcore_insn}, ITS thunk not required.")
124 +                         passed = True
125 +             else:
126 +                 unknown = True
127 +             if passed:
128 +                 tests_passed += 1
129 +             elif unknown:
130 +                 ksft.print_msg(f"UNKNOWN: unexpected operand: {kcore_insn}")
131 +                 tests_unknown += 1
132 +             else:
133 +                 ksft.print_msg(f'\t************* FAILED *************')
134 +                 ksft.print_msg(f"\tFound {kcore_insn.bytes} {kcore_insn.mnemonic} {operand}")
135 +                 ksft.print_msg(f'\t**********************************')
136 +                 tests_failed += 1
137 +         except Exception as e:
138 +             ksft.print_msg(f"UNKNOWN: An unexpected error occurred: {e}")
139 +             tests_unknown += 1
140 +
141 + ksft.print_msg(f"\n\nSummary:")
142 + ksft.print_msg(f"PASS: \t{tests_passed} \t/ {total_retpoline_tests}")
143 + ksft.print_msg(f"FAIL: \t{tests_failed} \t/ {total_retpoline_tests}")
144 + ksft.print_msg(f"UNKNOWN: \t{tests_unknown} \t/ {total_retpoline_tests}")
145 +
146 + if tests_failed == 0:
147 +     ksft.test_result_pass("All ITS return thunk sites passed")
148 + else:
149 +     ksft.test_result_fail(f"{tests_failed} ITS return thunk sites failed")
150 + ksft.finished()
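The `insn_end & 0x20` test above classifies each patch site by which half of its 64-byte cacheline the branch instruction ends in: ITS mistrains predictions only for branches in the lower half, so a site whose last byte has address bit 5 set is considered safe. A minimal sketch of that check (sample addresses are made up):

```python
def is_safe_site(site, insn_size):
    # ITS affects branches in the lower half of a 64-byte cacheline, so
    # a site is "safe" when the branch's last byte lands in the upper
    # half, i.e. bit 5 (0x20) of its address is set, mirroring the
    # `insn_end & 0x20` check in the selftest above.
    insn_end = site + insn_size - 1
    return bool(insn_end & 0x20)

# Hypothetical 5-byte branches at two offsets within a cacheline:
print(is_safe_site(0xffffffff81000010, 5))  # ends at ...0x14, lower half → False
print(is_safe_site(0xffffffff81000030, 5))  # ends at ...0x34, upper half → True
```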
+109
tools/testing/selftests/x86/bugs/its_permutations.py
··· 1 + #!/usr/bin/env python3
2 + # SPDX-License-Identifier: GPL-2.0
3 + #
4 + # Copyright (c) 2025 Intel Corporation
5 + #
6 + # Test for indirect target selection (ITS) cmdline permutations with other bugs
7 + # like spectre_v2 and retbleed.
8 +
9 + import os, sys, subprocess, itertools, re, shutil
10 +
11 + test_dir = os.path.dirname(os.path.realpath(__file__))
12 + sys.path.insert(0, test_dir + '/../../kselftest')
13 + import ksft
14 + import common as c
15 +
16 + bug = "indirect_target_selection"
17 + mitigation = c.get_sysfs(bug)
18 +
19 + if not mitigation or "Not affected" in mitigation:
20 +     ksft.test_result_skip("Skipping its_permutations.py: not applicable")
21 +     ksft.finished()
22 +
23 + if shutil.which('vng') is None:
24 +     ksft.test_result_skip("Skipping its_permutations.py: virtme-ng ('vng') not found in PATH.")
25 +     ksft.finished()
26 +
27 + TEST = f"{test_dir}/its_sysfs.py"
28 + default_kparam = ['clearcpuid=hypervisor', 'panic=5', 'panic_on_warn=1', 'oops=panic', 'nmi_watchdog=1', 'hung_task_panic=1']
29 +
30 + DEBUG = " -v "
31 +
32 + # Install dependencies
33 + # https://github.com/arighi/virtme-ng
34 + # apt install virtme-ng
35 + BOOT_CMD = f"vng --run {test_dir}/../../../../../arch/x86/boot/bzImage "
36 + #BOOT_CMD += DEBUG
37 +
38 + bug = "indirect_target_selection"
39 +
40 + input_options = {
41 +     'indirect_target_selection' : ['off', 'on', 'stuff', 'vmexit'],
42 +     'retbleed' : ['off', 'stuff', 'auto'],
43 +     'spectre_v2' : ['off', 'on', 'eibrs', 'retpoline', 'ibrs', 'eibrs,retpoline'],
44 + }
45 +
46 + def pretty_print(output):
47 +     OKBLUE = '\033[94m'
48 +     OKGREEN = '\033[92m'
49 +     WARNING = '\033[93m'
50 +     FAIL = '\033[91m'
51 +     ENDC = '\033[0m'
52 +     BOLD = '\033[1m'
53 +
54 +     # Define patterns and their corresponding colors
55 +     patterns = {
56 +         r"^ok \d+": OKGREEN,
57 +         r"^not ok \d+": FAIL,
58 +         r"^# Testing .*": OKBLUE,
59 +         r"^# Found: .*": WARNING,
60 +         r"^# Totals: .*": BOLD,
61 +         r"pass:([1-9]\d*)": OKGREEN,
62 +         r"fail:([1-9]\d*)": FAIL,
63 +         r"skip:([1-9]\d*)": WARNING,
64 +     }
65 +
66 +     # Apply colors based on patterns
67 +     for pattern, color in patterns.items():
68 +         output = re.sub(pattern, lambda match: f"{color}{match.group(0)}{ENDC}", output, flags=re.MULTILINE)
69 +
70 +     print(output)
71 +
72 + combinations = list(itertools.product(*input_options.values()))
73 + ksft.print_header()
74 + ksft.set_plan(len(combinations))
75 +
76 + logs = ""
77 +
78 + for combination in combinations:
79 +     append = ""
80 +     log = ""
81 +     for p in default_kparam:
82 +         append += f' --append={p}'
83 +     command = BOOT_CMD + append
84 +     test_params = ""
85 +     for i, key in enumerate(input_options.keys()):
86 +         param = f'{key}={combination[i]}'
87 +         test_params += f' {param}'
88 +         command += f" --append={param}"
89 +     command += f" -- {TEST}"
90 +     test_name = f"{bug} {test_params}"
91 +     pretty_print(f'# Testing {test_name}')
92 +     t = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
93 +     t.wait()
94 +     output, _ = t.communicate()
95 +     if t.returncode == 0:
96 +         ksft.test_result_pass(test_name)
97 +     else:
98 +         ksft.test_result_fail(test_name)
99 +     output = output.decode()
100 +     log += f" {output}"
101 +     pretty_print(log)
102 +     logs += output + "\n"
103 +
104 + # Optionally use tappy to parse the output
105 + # apt install python3-tappy
106 + with open("logs.txt", "w") as f:
107 +     f.write(logs)
108 +
109 + ksft.finished()
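its_permutations.py boots one virtme-ng VM per entry in the cross product of the three option lists, so with the lists shown it runs 4 × 3 × 6 = 72 kernel boots, each combination becoming a set of `--append=key=value` boot parameters. A sketch of that expansion:

```python
import itertools

# The option matrix from its_permutations.py above.
input_options = {
    'indirect_target_selection': ['off', 'on', 'stuff', 'vmexit'],
    'retbleed': ['off', 'stuff', 'auto'],
    'spectre_v2': ['off', 'on', 'eibrs', 'retpoline', 'ibrs', 'eibrs,retpoline'],
}

combos = list(itertools.product(*input_options.values()))
print(len(combos))  # 4 * 3 * 6 = 72 kernel boots

# Each combination is appended to the vng command line as boot parameters.
first = ' '.join(f'--append={k}={v}' for k, v in zip(input_options, combos[0]))
print(first)
```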
+139
tools/testing/selftests/x86/bugs/its_ret_alignment.py
··· 1 + #!/usr/bin/env python3
2 + # SPDX-License-Identifier: GPL-2.0
3 + #
4 + # Copyright (c) 2025 Intel Corporation
5 + #
6 + # Test for indirect target selection (ITS) mitigation.
7 + #
8 + # Tests if the RETs are correctly patched by evaluating the
9 + # vmlinux .return_sites in /proc/kcore.
10 + #
11 + # Install dependencies
12 + # add-apt-repository ppa:michel-slm/kernel-utils
13 + # apt update
14 + # apt install -y python3-drgn python3-pyelftools python3-capstone
15 + #
16 + # Run on target machine
17 + # mkdir -p /usr/lib/debug/lib/modules/$(uname -r)
18 + # cp $VMLINUX /usr/lib/debug/lib/modules/$(uname -r)/vmlinux
19 + #
20 + # Usage: ./its_ret_alignment.py
21 +
22 + import os, sys, argparse
23 + from pathlib import Path
24 +
25 + this_dir = os.path.dirname(os.path.realpath(__file__))
26 + sys.path.insert(0, this_dir + '/../../kselftest')
27 + import ksft
28 + import common as c
29 +
30 + bug = "indirect_target_selection"
31 + mitigation = c.get_sysfs(bug)
32 + if not mitigation or "Aligned branch/return thunks" not in mitigation:
33 +     ksft.test_result_skip("Skipping its_ret_alignment.py: Aligned branch/return thunks not enabled")
34 +     ksft.finished()
35 +
36 + c.check_dependencies_or_skip(['drgn', 'elftools', 'capstone'], script_name="its_ret_alignment.py")
37 +
38 + from elftools.elf.elffile import ELFFile
39 + from drgn.helpers.common.memory import identify_address
40 +
41 + cap = c.init_capstone()
42 +
43 + if len(os.sys.argv) > 1:
44 +     arg_vmlinux = os.sys.argv[1]
45 +     if not os.path.exists(arg_vmlinux):
46 +         ksft.test_result_fail(f"its_ret_alignment.py: vmlinux not found at user-supplied path: {arg_vmlinux}")
47 +         ksft.exit_fail()
48 +     os.makedirs(f"/usr/lib/debug/lib/modules/{os.uname().release}", exist_ok=True)
49 +     os.system(f'cp {arg_vmlinux} /usr/lib/debug/lib/modules/$(uname -r)/vmlinux')
50 +
51 + vmlinux = f"/usr/lib/debug/lib/modules/{os.uname().release}/vmlinux"
52 + if not os.path.exists(vmlinux):
53 +     ksft.test_result_fail(f"its_ret_alignment.py: vmlinux not found at {vmlinux}")
54 +     ksft.exit_fail()
55 +
56 + ksft.print_msg(f"Using vmlinux: {vmlinux}")
57 +
58 + rethunks_start_vmlinux, rethunks_sec_offset, size = c.get_section_info(vmlinux, '.return_sites')
59 + ksft.print_msg(f"vmlinux: Section .return_sites (0x{rethunks_start_vmlinux:x}) found at 0x{rethunks_sec_offset:x} with size 0x{size:x}")
60 +
61 + sites_offset = c.get_patch_sites(vmlinux, rethunks_sec_offset, size)
62 + total_rethunk_tests = len(sites_offset)
63 + ksft.print_msg(f"Found {total_rethunk_tests} rethunk sites")
64 +
65 + prog = c.get_runtime_kernel()
66 + rethunks_start_kcore = prog.symbol('__return_sites').address
67 + ksft.print_msg(f'kcore: __rethunk_sites: 0x{rethunks_start_kcore:x}')
68 +
69 + its_return_thunk = prog.symbol('its_return_thunk').address
70 + ksft.print_msg(f'kcore: its_return_thunk: 0x{its_return_thunk:x}')
71 +
72 + tests_passed = 0
73 + tests_failed = 0
74 + tests_unknown = 0
75 + tests_skipped = 0
76 +
77 + with open(vmlinux, 'rb') as f:
78 +     elffile = ELFFile(f)
79 +     text_section = elffile.get_section_by_name('.text')
80 +
81 +     for i in range(len(sites_offset)):
82 +         site = rethunks_start_kcore + sites_offset[i]
83 +         vmlinux_site = rethunks_start_vmlinux + sites_offset[i]
84 +         try:
85 +             passed = unknown = failed = skipped = False
86 +
87 +             symbol = identify_address(prog, site)
88 +             vmlinux_insn = c.get_instruction_from_vmlinux(elffile, text_section, text_section['sh_addr'], vmlinux_site)
89 +             kcore_insn = list(cap.disasm(prog.read(site, 16), site))[0]
90 +
91 +             insn_end = site + kcore_insn.size - 1
92 +
93 +             safe_site = insn_end & 0x20
94 +             site_status = "" if safe_site else "(unsafe)"
95 +
96 +             ksft.print_msg(f"\nSite {i}: {symbol} <0x{site:x}> {site_status}")
97 +             ksft.print_msg(f"\tvmlinux: 0x{vmlinux_insn.address:x}:\t{vmlinux_insn.mnemonic}\t{vmlinux_insn.op_str}")
98 +             ksft.print_msg(f"\tkcore: 0x{kcore_insn.address:x}:\t{kcore_insn.mnemonic}\t{kcore_insn.op_str}")
99 +
100 +             if safe_site:
101 +                 tests_passed += 1
102 +                 passed = True
103 +                 ksft.print_msg(f"\tPASSED: At safe address")
104 +                 continue
105 +
106 +             if "jmp" in kcore_insn.mnemonic:
107 +                 passed = True
108 +             elif "ret" not in kcore_insn.mnemonic:
109 +                 skipped = True
110 +
111 +             if passed:
112 +                 ksft.print_msg(f"\tPASSED: Found {kcore_insn.mnemonic} {kcore_insn.op_str}")
113 +                 tests_passed += 1
114 +             elif skipped:
115 +                 ksft.print_msg(f"\tSKIPPED: Found '{kcore_insn.mnemonic}'")
116 +                 tests_skipped += 1
117 +             elif unknown:
118 +                 ksft.print_msg(f"UNKNOWN: An unknown instruction: {kcore_insn}")
119 +                 tests_unknown += 1
120 +             else:
121 +                 ksft.print_msg(f'\t************* FAILED *************')
122 +                 ksft.print_msg(f"\tFound {kcore_insn.mnemonic} {kcore_insn.op_str}")
123 +                 ksft.print_msg(f'\t**********************************')
124 +                 tests_failed += 1
125 +         except Exception as e:
126 +             ksft.print_msg(f"UNKNOWN: An unexpected error occurred: {e}")
127 +             tests_unknown += 1
128 +
129 + ksft.print_msg(f"\n\nSummary:")
130 + ksft.print_msg(f"PASSED: \t{tests_passed} \t/ {total_rethunk_tests}")
131 + ksft.print_msg(f"FAILED: \t{tests_failed} \t/ {total_rethunk_tests}")
132 + ksft.print_msg(f"SKIPPED: \t{tests_skipped} \t/ {total_rethunk_tests}")
133 + ksft.print_msg(f"UNKNOWN: \t{tests_unknown} \t/ {total_rethunk_tests}")
134 +
135 + if tests_failed == 0:
136 +     ksft.test_result_pass("All ITS return thunk sites passed.")
137 + else:
138 +     ksft.test_result_fail(f"{tests_failed} failed sites need ITS return thunks.")
139 + ksft.finished()
+65
tools/testing/selftests/x86/bugs/its_sysfs.py
··· 1 + #!/usr/bin/env python3
2 + # SPDX-License-Identifier: GPL-2.0
3 + #
4 + # Copyright (c) 2025 Intel Corporation
5 + #
6 + # Test for Indirect Target Selection(ITS) mitigation sysfs status.
7 +
8 + import sys, os, re
9 + this_dir = os.path.dirname(os.path.realpath(__file__))
10 + sys.path.insert(0, this_dir + '/../../kselftest')
11 + import ksft
12 +
13 + from common import *
14 +
15 + bug = "indirect_target_selection"
16 + mitigation = get_sysfs(bug)
17 +
18 + ITS_MITIGATION_ALIGNED_THUNKS = "Mitigation: Aligned branch/return thunks"
19 + ITS_MITIGATION_RETPOLINE_STUFF = "Mitigation: Retpolines, Stuffing RSB"
20 + ITS_MITIGATION_VMEXIT_ONLY = "Mitigation: Vulnerable, KVM: Not affected"
21 + ITS_MITIGATION_VULNERABLE = "Vulnerable"
22 +
23 + def check_mitigation():
24 +     if mitigation == ITS_MITIGATION_ALIGNED_THUNKS:
25 +         if cmdline_has(f'{bug}=stuff') and sysfs_has("spectre_v2", "Retpolines"):
26 +             bug_check_fail(bug, ITS_MITIGATION_ALIGNED_THUNKS, ITS_MITIGATION_RETPOLINE_STUFF)
27 +             return
28 +         if cmdline_has(f'{bug}=vmexit') and cpuinfo_has('its_native_only'):
29 +             bug_check_fail(bug, ITS_MITIGATION_ALIGNED_THUNKS, ITS_MITIGATION_VMEXIT_ONLY)
30 +             return
31 +         bug_check_pass(bug, ITS_MITIGATION_ALIGNED_THUNKS)
32 +         return
33 +
34 +     if mitigation == ITS_MITIGATION_RETPOLINE_STUFF:
35 +         if cmdline_has(f'{bug}=stuff') and sysfs_has("spectre_v2", "Retpolines"):
36 +             bug_check_pass(bug, ITS_MITIGATION_RETPOLINE_STUFF)
37 +             return
38 +         if sysfs_has('retbleed', 'Stuffing'):
39 +             bug_check_pass(bug, ITS_MITIGATION_RETPOLINE_STUFF)
40 +             return
41 +         bug_check_fail(bug, ITS_MITIGATION_RETPOLINE_STUFF, ITS_MITIGATION_ALIGNED_THUNKS)
42 +
43 +     if mitigation == ITS_MITIGATION_VMEXIT_ONLY:
44 +         if cmdline_has(f'{bug}=vmexit') and cpuinfo_has('its_native_only'):
45 +             bug_check_pass(bug, ITS_MITIGATION_VMEXIT_ONLY)
46 +             return
47 +         bug_check_fail(bug, ITS_MITIGATION_VMEXIT_ONLY, ITS_MITIGATION_ALIGNED_THUNKS)
48 +
49 +     if mitigation == ITS_MITIGATION_VULNERABLE:
50 +         if sysfs_has("spectre_v2", "Vulnerable"):
51 +             bug_check_pass(bug, ITS_MITIGATION_VULNERABLE)
52 +         else:
53 +             bug_check_fail(bug, "Mitigation", ITS_MITIGATION_VULNERABLE)
54 +
55 +     bug_status_unknown(bug, mitigation)
56 +     return
57 +
58 + ksft.print_header()
59 + ksft.set_plan(1)
60 + ksft.print_msg(f'{bug}: {mitigation} ...')
61 +
62 + if not basic_checks_sufficient(bug, mitigation):
63 +     check_mitigation()
64 +
65 + ksft.finished()
+16 -12
tools/testing/vsock/vsock_test.c
··· 1264 1264 send_buf(fd, buf, sizeof(buf), 0, sizeof(buf)); 1265 1265 control_expectln("RECEIVED"); 1266 1266 1267 - ret = ioctl(fd, SIOCOUTQ, &sock_bytes_unsent); 1268 - if (ret < 0) { 1269 - if (errno == EOPNOTSUPP) { 1270 - fprintf(stderr, "Test skipped, SIOCOUTQ not supported.\n"); 1271 - } else { 1267 + /* SIOCOUTQ isn't guaranteed to instantly track sent data. Even though 1268 + * the "RECEIVED" message means that the other side has received the 1269 + * data, there can be a delay in our kernel before updating the "unsent 1270 + * bytes" counter. Repeat SIOCOUTQ until it returns 0. 1271 + */ 1272 + timeout_begin(TIMEOUT); 1273 + do { 1274 + ret = ioctl(fd, SIOCOUTQ, &sock_bytes_unsent); 1275 + if (ret < 0) { 1276 + if (errno == EOPNOTSUPP) { 1277 + fprintf(stderr, "Test skipped, SIOCOUTQ not supported.\n"); 1278 + break; 1279 + } 1272 1280 perror("ioctl"); 1273 1281 exit(EXIT_FAILURE); 1274 1282 } 1275 - } else if (ret == 0 && sock_bytes_unsent != 0) { 1276 - fprintf(stderr, 1277 - "Unexpected 'SIOCOUTQ' value, expected 0, got %i\n", 1278 - sock_bytes_unsent); 1279 - exit(EXIT_FAILURE); 1280 - } 1281 - 1283 + timeout_check("SIOCOUTQ"); 1284 + } while (sock_bytes_unsent != 0); 1285 + timeout_end(); 1282 1286 close(fd); 1283 1287 } 1284 1288
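The vsock fix above replaces a single SIOCOUTQ sample with a retry-until-zero loop bounded by a timeout, because the kernel may update the unsent-bytes counter slightly after the peer acknowledges the data. The same deadline-poll pattern, sketched in Python with the ioctl replaced by a stub counter (the `SIOCOUTQ` call itself is not reproduced here):

```python
import time

def poll_until_zero(read_counter, timeout=1.0, interval=0.01):
    # Retry a possibly-lagging counter until it drains to 0 or the
    # deadline expires, like the SIOCOUTQ loop in the vsock test.
    deadline = time.monotonic() + timeout
    while True:
        if read_counter() == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# Stand-in for ioctl(fd, SIOCOUTQ, ...): a counter that drains lazily.
samples = iter([3, 1, 0])
print(poll_until_zero(lambda: next(samples)))  # → True
```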
+4
usr/include/Makefile
··· 59 59 no-header-test += linux/bpf_perf_event.h 60 60 endif 61 61 62 + ifeq ($(SRCARCH),openrisc) 63 + no-header-test += linux/bpf_perf_event.h 64 + endif 65 + 62 66 ifeq ($(SRCARCH),powerpc) 63 67 no-header-test += linux/bpf_perf_event.h 64 68 endif