Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Lots of overlapping changes and parallel additions, stuff
like that.

Signed-off-by: David S. Miller <davem@davemloft.net>

+3698 -1069
+4
.mailmap
··· 108 108 Jason Gunthorpe <jgg@ziepe.ca> <jgunthorpe@obsidianresearch.com> 109 109 Javi Merino <javi.merino@kernel.org> <javi.merino@arm.com> 110 110 <javier@osg.samsung.com> <javier.martinez@collabora.co.uk> 111 + Jayachandran C <c.jayachandran@gmail.com> <jayachandranc@netlogicmicro.com> 112 + Jayachandran C <c.jayachandran@gmail.com> <jchandra@broadcom.com> 113 + Jayachandran C <c.jayachandran@gmail.com> <jchandra@digeo.com> 114 + Jayachandran C <c.jayachandran@gmail.com> <jnair@caviumnetworks.com> 111 115 Jean Tourrilhes <jt@hpl.hp.com> 112 116 <jean-philippe@linaro.org> <jean-philippe.brucker@arm.com> 113 117 Jeff Garzik <jgarzik@pretzel.yyz.us>
+2
Documentation/ABI/testing/sysfs-devices-system-cpu
··· 486 486 /sys/devices/system/cpu/vulnerabilities/spec_store_bypass 487 487 /sys/devices/system/cpu/vulnerabilities/l1tf 488 488 /sys/devices/system/cpu/vulnerabilities/mds 489 + /sys/devices/system/cpu/vulnerabilities/tsx_async_abort 490 + /sys/devices/system/cpu/vulnerabilities/itlb_multihit 489 491 Date: January 2018 490 492 Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org> 491 493 Description: Information about CPU vulnerabilities
+2
Documentation/admin-guide/hw-vuln/index.rst
··· 12 12 spectre 13 13 l1tf 14 14 mds 15 + tsx_async_abort 16 + multihit.rst
+163
Documentation/admin-guide/hw-vuln/multihit.rst
··· 1 + iTLB multihit 2 + ============= 3 + 4 + iTLB multihit is an erratum where some processors may incur a machine check 5 + error, possibly resulting in an unrecoverable CPU lockup, when an 6 + instruction fetch hits multiple entries in the instruction TLB. This can 7 + occur when the page size is changed along with either the physical address 8 + or cache type. A malicious guest running on a virtualized system can 9 + exploit this erratum to perform a denial of service attack. 10 + 11 + 12 + Affected processors 13 + ------------------- 14 + 15 + Variations of this erratum are present on most Intel Core and Xeon processor 16 + models. The erratum is not present on: 17 + 18 + - non-Intel processors 19 + 20 + - Some Atoms (Airmont, Bonnell, Goldmont, GoldmontPlus, Saltwell, Silvermont) 21 + 22 + - Intel processors that have the PSCHANGE_MC_NO bit set in the 23 + IA32_ARCH_CAPABILITIES MSR. 24 + 25 + 26 + Related CVEs 27 + ------------ 28 + 29 + The following CVE entry is related to this issue: 30 + 31 + ============== ================================================= 32 + CVE-2018-12207 Machine Check Error Avoidance on Page Size Change 33 + ============== ================================================= 34 + 35 + 36 + Problem 37 + ------- 38 + 39 + Privileged software, including the OS and virtual machine managers (VMMs), is in 40 + charge of memory management. A key component in memory management is the control 41 + of the page tables. Modern processors use virtual memory, a technique that creates 42 + the illusion of a very large memory for processors. This virtual space is split 43 + into pages of a given size. Page tables translate virtual addresses to physical 44 + addresses. 45 + 46 + To reduce latency when performing a virtual to physical address translation, 47 + processors include a structure, called TLB, that caches recent translations. 48 + There are separate TLBs for instruction (iTLB) and data (dTLB). 
49 + 50 + Under this erratum, instructions are fetched from a linear address translated 51 + using a 4 KB translation cached in the iTLB. Privileged software modifies the 52 + paging structure so that the same linear address is mapped using a large page size (2 MB, 4 53 + MB, 1 GB) with a different physical address or memory type. After the page 54 + structure modification but before the software invalidates any iTLB entries for 55 + the linear address, a code fetch that happens on the same linear address may 56 + cause a machine-check error which can result in a system hang or shutdown. 57 + 58 + 59 + Attack scenarios 60 + ---------------- 61 + 62 + Attacks against the iTLB multihit erratum can be mounted from malicious 63 + guests in a virtualized system. 64 + 65 + 66 + iTLB multihit system information 67 + -------------------------------- 68 + 69 + The Linux kernel provides a sysfs interface to enumerate the current iTLB 70 + multihit status of the system: whether the system is vulnerable and which 71 + mitigations are active. The relevant sysfs file is: 72 + 73 + /sys/devices/system/cpu/vulnerabilities/itlb_multihit 74 + 75 + The possible values in this file are: 76 + 77 + .. list-table:: 78 + 79 + * - Not affected 80 + - The processor is not vulnerable. 81 + * - KVM: Mitigation: Split huge pages 82 + - Software changes mitigate this issue. 83 + * - KVM: Vulnerable 84 + - The processor is vulnerable, but no mitigation is enabled. 85 + 86 + 87 + Enumeration of the erratum 88 + -------------------------------- 89 + 90 + A new bit has been allocated in the IA32_ARCH_CAPABILITIES MSR (PSCHANGE_MC_NO) 91 + and will be set on CPUs which are mitigated against this issue. 
92 + 93 + ======================================= =========== =============================== 94 + IA32_ARCH_CAPABILITIES MSR Not present Possibly vulnerable, check model 95 + IA32_ARCH_CAPABILITIES[PSCHANGE_MC_NO] '0' Likely vulnerable, check model 96 + IA32_ARCH_CAPABILITIES[PSCHANGE_MC_NO] '1' Not vulnerable 97 + ======================================= =========== =============================== 98 + 99 + 100 + Mitigation mechanism 101 + ------------------------- 102 + 103 + This erratum can be mitigated by restricting the use of large page sizes to 104 + non-executable pages. This forces all iTLB entries to be 4K, and removes 105 + the possibility of multiple hits. 106 + 107 + In order to mitigate the vulnerability, KVM initially marks all huge pages 108 + as non-executable. If the guest attempts to execute in one of those pages, 109 + the page is broken down into 4K pages, which are then marked executable. 110 + 111 + If EPT is disabled or not available on the host, KVM is in control of TLB 112 + flushes and the problematic situation cannot happen. However, the shadow 113 + EPT paging mechanism used by nested virtualization is vulnerable, because 114 + the nested guest can trigger multiple iTLB hits by modifying its own 115 + (non-nested) page tables. For simplicity, KVM will make large pages 116 + non-executable in all shadow paging modes. 117 + 118 + Mitigation control on the kernel command line and KVM - module parameter 119 + ------------------------------------------------------------------------ 120 + 121 + The KVM hypervisor mitigation mechanism for marking huge pages as 122 + non-executable can be controlled with a module parameter "nx_huge_pages=". 123 + The kernel command line allows controlling the iTLB multihit mitigations at 124 + boot time with the option "kvm.nx_huge_pages=". 
125 + 126 + The valid arguments for these options are: 127 + 128 + ========== ================================================================ 129 + force Mitigation is enabled. In this case, the mitigation implements 130 + non-executable huge pages in the Linux kernel KVM module. All huge 131 + pages in the EPT are marked as non-executable. 132 + If a guest attempts to execute in one of those pages, the page is 133 + broken down into 4K pages, which are then marked executable. 134 + 135 + off Mitigation is disabled. 136 + 137 + auto Enable mitigation only if the platform is affected and the kernel 138 + was not booted with the "mitigations=off" command line parameter. 139 + This is the default option. 140 + ========== ================================================================ 141 + 142 + 143 + Mitigation selection guide 144 + -------------------------- 145 + 146 + 1. No virtualization in use 147 + ^^^^^^^^^^^^^^^^^^^^^^^^^^^ 148 + 149 + The system is protected by the kernel unconditionally and no further 150 + action is required. 151 + 152 + 2. Virtualization with trusted guests 153 + ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 154 + 155 + If the guest comes from a trusted source, you may assume that the guest will 156 + not attempt to maliciously exploit these errata and no further action is 157 + required. 158 + 159 + 3. Virtualization with untrusted guests 160 + ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 161 + If the guest comes from an untrusted source, the host kernel will need 162 + to apply iTLB multihit mitigation via the kernel command line or kvm 163 + module parameter.
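The sysfs reporting described in the document above can be exercised from userspace. A minimal sketch, assuming a Python environment; only the sysfs path comes from the document, while the helper name and the fallback behaviour on older kernels are our own illustration:

```python
from pathlib import Path

# Path documented in the admin guide above.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")

def itlb_multihit_status(base=VULN_DIR):
    """Return the itlb_multihit mitigation string, or None when the
    file is absent (e.g. on kernels that predate this commit)."""
    try:
        return (base / "itlb_multihit").read_text().strip()
    except OSError:
        return None
```

Making the base directory a parameter keeps the sketch testable without root access or a patched kernel.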
+276
Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
··· 1 + .. SPDX-License-Identifier: GPL-2.0 2 + 3 + TAA - TSX Asynchronous Abort 4 + ====================================== 5 + 6 + TAA is a hardware vulnerability that allows unprivileged speculative access to 7 + data which is available in various CPU internal buffers by using asynchronous 8 + aborts within an Intel TSX transactional region. 9 + 10 + Affected processors 11 + ------------------- 12 + 13 + This vulnerability only affects Intel processors that support Intel 14 + Transactional Synchronization Extensions (TSX) when the TAA_NO bit (bit 8) 15 + is 0 in the IA32_ARCH_CAPABILITIES MSR. On processors where the MDS_NO bit 16 + (bit 5) is 0 in the IA32_ARCH_CAPABILITIES MSR, the existing MDS mitigations 17 + also mitigate against TAA. 18 + 19 + Whether a processor is affected or not can be read out from the TAA 20 + vulnerability file in sysfs. See :ref:`tsx_async_abort_sys_info`. 21 + 22 + Related CVEs 23 + ------------ 24 + 25 + The following CVE entry is related to this TAA issue: 26 + 27 + ============== ===== =================================================== 28 + CVE-2019-11135 TAA TSX Asynchronous Abort (TAA) condition on some 29 + microprocessors utilizing speculative execution may 30 + allow an authenticated user to potentially enable 31 + information disclosure via a side channel with 32 + local access. 33 + ============== ===== =================================================== 34 + 35 + Problem 36 + ------- 37 + 38 + When performing store, load or L1 refill operations, processors write 39 + data into temporary microarchitectural structures (buffers). The data in 40 + those buffers can be forwarded to load operations as an optimization. 41 + 42 + Intel TSX is an extension to the x86 instruction set architecture that adds 43 + hardware transactional memory support to improve performance of multi-threaded 44 + software. 
TSX lets the processor expose and exploit concurrency hidden in an 45 + application by dynamically avoiding unnecessary synchronization. 46 + 47 + TSX supports atomic memory transactions that are either committed (success) or 48 + aborted. During an abort, operations that happened within the transactional region 49 + are rolled back. An asynchronous abort takes place, among other options, when a 50 + different thread accesses a cache line that is also used within the transactional 51 + region when that access might lead to a data race. 52 + 53 + Immediately after an uncompleted asynchronous abort, certain speculatively 54 + executed loads may read data from those internal buffers and pass it to dependent 55 + operations. This can then be used to infer the value via a cache side channel 56 + attack. 57 + 58 + Because the buffers are potentially shared between Hyper-Threads, cross 59 + Hyper-Thread attacks are possible. 60 + 61 + The victim of a malicious actor does not need to make use of TSX. Only the 62 + attacker needs to begin a TSX transaction and raise an asynchronous abort, 63 + which in turn potentially leaks data stored in the buffers. 64 + 65 + More detailed technical information is available in the TAA specific x86 66 + architecture section: :ref:`Documentation/x86/tsx_async_abort.rst <tsx_async_abort>`. 67 + 68 + 69 + Attack scenarios 70 + ---------------- 71 + 72 + Attacks against the TAA vulnerability can be implemented from unprivileged 73 + applications running on hosts or guests. 74 + 75 + As with MDS, the attacker has no control over the memory addresses that can 76 + be leaked. Only the victim is responsible for bringing data to the CPU. As 77 + a result, the malicious actor has to sample as much data as possible and 78 + then postprocess it to try to infer any useful information from it. 79 + 80 + A potential attacker only has read access to the data. Also, there is no direct 81 + privilege escalation by using this technique. 82 + 83 + 84 + .. 
_tsx_async_abort_sys_info: 85 + 86 + TAA system information 87 + ----------------------- 88 + 89 + The Linux kernel provides a sysfs interface to enumerate the current TAA status 90 + of mitigated systems. The relevant sysfs file is: 91 + 92 + /sys/devices/system/cpu/vulnerabilities/tsx_async_abort 93 + 94 + The possible values in this file are: 95 + 96 + .. list-table:: 97 + 98 + * - 'Vulnerable' 99 + - The CPU is affected by this vulnerability and the microcode and kernel mitigation are not applied. 100 + * - 'Vulnerable: Clear CPU buffers attempted, no microcode' 101 + - The system tries to clear the buffers but the microcode might not support the operation. 102 + * - 'Mitigation: Clear CPU buffers' 103 + - The microcode has been updated to clear the buffers. TSX is still enabled. 104 + * - 'Mitigation: TSX disabled' 105 + - TSX is disabled. 106 + * - 'Not affected' 107 + - The CPU is not affected by this issue. 108 + 109 + .. _ucode_needed: 110 + 111 + Best effort mitigation mode 112 + ^^^^^^^^^^^^^^^^^^^^^^^^^^^ 113 + 114 + If the processor is vulnerable, but the availability of the microcode-based 115 + mitigation mechanism is not advertised via CPUID the kernel selects a best 116 + effort mitigation mode. This mode invokes the mitigation instructions 117 + without a guarantee that they clear the CPU buffers. 118 + 119 + This is done to address virtualization scenarios where the host has the 120 + microcode update applied, but the hypervisor is not yet updated to expose the 121 + CPUID to the guest. If the host has updated microcode the protection takes 122 + effect; otherwise a few CPU cycles are wasted pointlessly. 123 + 124 + The state in the tsx_async_abort sysfs file reflects this situation 125 + accordingly. 126 + 127 + 128 + Mitigation mechanism 129 + -------------------- 130 + 131 + The kernel detects the affected CPUs and the presence of the microcode which is 132 + required. 
If a CPU is affected and the microcode is available, then the kernel 133 + enables the mitigation by default. 134 + 135 + 136 + The mitigation can be controlled at boot time via a kernel command line option. 137 + See :ref:`taa_mitigation_control_command_line`. 138 + 139 + .. _virt_mechanism: 140 + 141 + Virtualization mitigation 142 + ^^^^^^^^^^^^^^^^^^^^^^^^^ 143 + 144 + Affected systems where the host has the TAA microcode and TAA is mitigated by 145 + having disabled TSX previously are not vulnerable, regardless of the status 146 + of the VMs. 147 + 148 + In all other cases, if the host either does not have the TAA microcode or 149 + the kernel is not mitigated, the system might be vulnerable. 150 + 151 + 152 + .. _taa_mitigation_control_command_line: 153 + 154 + Mitigation control on the kernel command line 155 + --------------------------------------------- 156 + 157 + The kernel command line allows controlling the TAA mitigations at boot time with 158 + the option "tsx_async_abort=". The valid arguments for this option are: 159 + 160 + ============ ============================================================= 161 + off This option disables the TAA mitigation on affected platforms. 162 + If the system has TSX enabled (see next parameter) and the CPU 163 + is affected, the system is vulnerable. 164 + 165 + full TAA mitigation is enabled. If TSX is enabled, on an affected 166 + system it will clear CPU buffers on ring transitions. On 167 + systems which are MDS-affected and deploy MDS mitigation, 168 + TAA is also mitigated. Specifying this option on those 169 + systems will have no effect. 170 + 171 + full,nosmt The same as tsx_async_abort=full, with SMT disabled on 172 + vulnerable CPUs that have TSX enabled. This is the complete 173 + mitigation. When TSX is disabled, SMT is not disabled because 174 + the CPU is not vulnerable to cross-thread TAA attacks. 
175 + ============ ============================================================= 176 + 177 + Not specifying this option is equivalent to "tsx_async_abort=full". 178 + 179 + The kernel command line also allows controlling the TSX feature using the 180 + parameter "tsx=" on CPUs which support TSX control. MSR_IA32_TSX_CTRL is used 181 + to control the TSX feature and the enumeration of the TSX feature bits (RTM 182 + and HLE) in CPUID. 183 + 184 + The valid options are: 185 + 186 + ============ ============================================================= 187 + off Disables TSX on the system. 188 + 189 + Note that this option takes effect only on newer CPUs which are 190 + not vulnerable to MDS, i.e., have MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1 191 + and which get the new IA32_TSX_CTRL MSR through a microcode 192 + update. This new MSR allows for the reliable deactivation of 193 + the TSX functionality. 194 + 195 + on Enables TSX. 196 + 197 + Although there are mitigations for all known security 198 + vulnerabilities, TSX has been known to be an accelerator for 199 + several previous speculation-related CVEs, and so there may be 200 + unknown security risks associated with leaving it enabled. 201 + 202 + auto Disables TSX if X86_BUG_TAA is present, otherwise enables TSX 203 + on the system. 204 + ============ ============================================================= 205 + 206 + Not specifying this option is equivalent to "tsx=off". 207 + 208 + The following combinations of "tsx_async_abort" and "tsx" are possible. For 209 + affected platforms, tsx=auto is equivalent to tsx=off and the result will be: 210 + 211 + ========= ========================== ========================================= 212 + tsx=on tsx_async_abort=full The system will use VERW to clear CPU 213 + buffers. Cross-thread attacks are still 214 + possible on SMT machines. 215 + tsx=on tsx_async_abort=full,nosmt As above, cross-thread attacks on SMT 216 + mitigated. 
217 + tsx=on tsx_async_abort=off The system is vulnerable. 218 + tsx=off tsx_async_abort=full TSX might be disabled if microcode 219 + provides a TSX control MSR. If so, 220 + system is not vulnerable. 221 + tsx=off tsx_async_abort=full,nosmt Ditto 222 + tsx=off tsx_async_abort=off Ditto 223 + ========= ========================== ========================================= 224 + 225 + 226 + For unaffected platforms "tsx=on" and "tsx_async_abort=full" do not clear CPU 227 + buffers. For platforms without TSX control (MSR_IA32_ARCH_CAPABILITIES.MDS_NO=0) 228 + the "tsx" command line argument has no effect. 229 + 230 + For the affected platforms, the table below indicates the mitigation status for the 231 + combinations of CPUID bit MD_CLEAR and IA32_ARCH_CAPABILITIES MSR bits MDS_NO 232 + and TSX_CTRL_MSR. 233 + 234 + ======= ========= ============= ======================================== 235 + MDS_NO MD_CLEAR TSX_CTRL_MSR Status 236 + ======= ========= ============= ======================================== 237 + 0 0 0 Vulnerable (needs microcode) 238 + 0 1 0 MDS and TAA mitigated via VERW 239 + 1 1 0 MDS fixed, TAA vulnerable if TSX enabled 240 + because MD_CLEAR has no meaning and 241 + VERW is not guaranteed to clear buffers 242 + 1 X 1 MDS fixed, TAA can be mitigated by 243 + VERW or TSX_CTRL_MSR 244 + ======= ========= ============= ======================================== 245 + 246 + Mitigation selection guide 247 + -------------------------- 248 + 249 + 1. Trusted userspace and guests 250 + ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 251 + 252 + If all user space applications are from a trusted source and do not execute 253 + untrusted code which is supplied externally, then the mitigation can be 254 + disabled. The same applies to virtualized environments with trusted guests. 255 + 256 + 257 + 2. 
Untrusted userspace and guests 258 + ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 259 + 260 + If there are untrusted applications or guests on the system, enabling TSX 261 + might allow a malicious actor to leak data from the host or from other 262 + processes running on the same physical core. 263 + 264 + If the microcode is available and TSX is disabled on the host, attacks 265 + are prevented in a virtualized environment as well, even if the VMs do not 266 + explicitly enable the mitigation. 267 + 268 + 269 + .. _taa_default_mitigations: 270 + 271 + Default mitigations 272 + ------------------- 273 + 274 + The kernel's default action for vulnerable processors is: 275 + 276 + - Deploy TSX disable mitigation (tsx_async_abort=full tsx=off).
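The MDS_NO / MD_CLEAR / TSX_CTRL_MSR status table in the admin guide above is mechanical enough to encode directly. A hypothetical sketch of that lookup; the bit meanings and status strings come from the table, while the function itself is our own illustration, not kernel code:

```python
def taa_status(mds_no, md_clear, tsx_ctrl_msr):
    """Map the three enumeration bits from the document's table to the
    documented mitigation status string."""
    if not mds_no:
        # MDS_NO=0: the MDS mitigation (VERW) covers TAA, if MD_CLEAR exists.
        if md_clear:
            return "MDS and TAA mitigated via VERW"
        return "Vulnerable (needs microcode)"
    if tsx_ctrl_msr:
        # MDS fixed; TSX_CTRL allows disabling TSX outright.
        return "MDS fixed, TAA can be mitigated by VERW or TSX_CTRL_MSR"
    # MDS fixed but no TSX control: VERW is not guaranteed to clear buffers.
    return "MDS fixed, TAA vulnerable if TSX enabled"
```

Each branch corresponds to one row of the table (MDS_NO=1 rows treat MD_CLEAR as "don't care", matching the table's "X").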
+92
Documentation/admin-guide/kernel-parameters.txt
··· 2055 2055 KVM MMU at runtime. 2056 2056 Default is 0 (off) 2057 2057 2058 + kvm.nx_huge_pages= 2059 + [KVM] Controls the software workaround for the 2060 + X86_BUG_ITLB_MULTIHIT bug. 2061 + force : Always deploy workaround. 2062 + off : Never deploy workaround. 2063 + auto : Deploy workaround based on the presence of 2064 + X86_BUG_ITLB_MULTIHIT. 2065 + 2066 + Default is 'auto'. 2067 + 2068 + If the software workaround is enabled for the host, 2069 + guests need not enable it for nested guests. 2070 + 2071 + kvm.nx_huge_pages_recovery_ratio= 2072 + [KVM] Controls how many 4KiB pages are periodically zapped 2073 + back to huge pages. 0 disables the recovery; otherwise, if 2074 + the value is N, KVM will zap 1/Nth of the 4KiB pages every 2075 + minute. The default is 60. 2076 + 2058 2077 kvm-amd.nested= [KVM,AMD] Allow nested virtualization in KVM/SVM. 2059 2078 Default is 1 (enabled) 2060 2079 ··· 2655 2636 ssbd=force-off [ARM64] 2656 2637 l1tf=off [X86] 2657 2638 mds=off [X86] 2639 + tsx_async_abort=off [X86] 2640 + kvm.nx_huge_pages=off [X86] 2641 + 2642 + Exceptions: 2643 + This does not have any effect on 2644 + kvm.nx_huge_pages when 2645 + kvm.nx_huge_pages=force. 2658 2646 2659 2647 auto (default) 2660 2648 Mitigate all CPU vulnerabilities, but leave SMT ··· 2677 2651 be fully mitigated, even if it means losing SMT. 2678 2652 Equivalent to: l1tf=flush,nosmt [X86] 2679 2653 mds=full,nosmt [X86] 2654 + tsx_async_abort=full,nosmt [X86] 2680 2655 2681 2656 mminit_loglevel= 2682 2657 [KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this ··· 4874 4847 in situations with strict latency requirements (where 4875 4848 interruptions from clocksource watchdog are not 4876 4849 acceptable). 4850 + 4851 + tsx= [X86] Control Transactional Synchronization 4852 + Extensions (TSX) feature in Intel processors that 4853 + support TSX control. 4854 + 4855 + This parameter controls the TSX feature. The options are: 4856 + 4857 + on - Enable TSX on the system. 
Although there are 4858 + mitigations for all known security vulnerabilities, 4859 + TSX has been known to be an accelerator for 4860 + several previous speculation-related CVEs, and 4861 + so there may be unknown security risks associated 4862 + with leaving it enabled. 4863 + 4864 + off - Disable TSX on the system. (Note that this 4865 + option takes effect only on newer CPUs which are 4866 + not vulnerable to MDS, i.e., have 4867 + MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1 and which get 4868 + the new IA32_TSX_CTRL MSR through a microcode 4869 + update. This new MSR allows for the reliable 4870 + deactivation of the TSX functionality.) 4871 + 4872 + auto - Disable TSX if X86_BUG_TAA is present, 4873 + otherwise enable TSX on the system. 4874 + 4875 + Not specifying this option is equivalent to tsx=off. 4876 + 4877 + See Documentation/admin-guide/hw-vuln/tsx_async_abort.rst 4878 + for more details. 4879 + 4880 + tsx_async_abort= [X86,INTEL] Control mitigation for the TSX Async 4881 + Abort (TAA) vulnerability. 4882 + 4883 + Similar to Micro-architectural Data Sampling (MDS), 4884 + certain CPUs that support Transactional 4885 + Synchronization Extensions (TSX) are vulnerable to an 4886 + exploit against CPU internal buffers which can forward 4887 + information to a disclosure gadget under certain 4888 + conditions. 4889 + 4890 + In vulnerable processors, the speculatively forwarded 4891 + data can be used in a cache side channel attack, to 4892 + access data to which the attacker does not have direct 4893 + access. 4894 + 4895 + This parameter controls the TAA mitigation. The 4896 + options are: 4897 + 4898 + full - Enable TAA mitigation on vulnerable CPUs 4899 + if TSX is enabled. 4900 + 4901 + full,nosmt - Enable TAA mitigation and disable SMT on 4902 + vulnerable CPUs. If TSX is disabled, SMT 4903 + is not disabled because the CPU is not 4904 + vulnerable to cross-thread TAA attacks. 
4905 + off - Unconditionally disable TAA mitigation 4906 + 4907 + Not specifying this option is equivalent to 4908 + tsx_async_abort=full. On CPUs which are MDS affected 4909 + and deploy MDS mitigation, TAA mitigation is not 4910 + required and doesn't provide any additional 4911 + mitigation. 4912 + 4913 + For details see: 4914 + Documentation/admin-guide/hw-vuln/tsx_async_abort.rst 4877 4915 4878 4916 turbografx.map[2|3]= [HW,JOY] 4879 4917 TurboGraFX parallel port interface
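The affectedness rule stated in the TAA admin guide above (a CPU is affected when it supports TSX and the TAA_NO bit, bit 8 of IA32_ARCH_CAPABILITIES, is clear; MDS_NO is bit 5) can be sketched as a predicate. The bit positions come from the document; the constant and function names are illustrative, not kernel identifiers:

```python
# Bit positions per the TAA documentation in this commit.
ARCH_CAP_MDS_NO = 1 << 5
ARCH_CAP_TAA_NO = 1 << 8

def taa_affected(supports_tsx, arch_capabilities):
    """True iff the CPU supports TSX and TAA_NO is clear in the
    (already-read) IA32_ARCH_CAPABILITIES value."""
    return bool(supports_tsx) and not (arch_capabilities & ARCH_CAP_TAA_NO)
```

In a real tool the MSR value would come from a privileged `rdmsr`; here it is simply passed in as an integer.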
+1
Documentation/x86/index.rst
··· 27 27 mds 28 28 microcode 29 29 resctrl_ui 30 + tsx_async_abort 30 31 usb-legacy-support 31 32 i386/index 32 33 x86_64/index
+117
Documentation/x86/tsx_async_abort.rst
··· 1 + .. SPDX-License-Identifier: GPL-2.0 2 + 3 + TSX Async Abort (TAA) mitigation 4 + ================================ 5 + 6 + .. _tsx_async_abort: 7 + 8 + Overview 9 + -------- 10 + 11 + TSX Async Abort (TAA) is a side channel attack on internal buffers in some 12 + Intel processors similar to Microarchitectural Data Sampling (MDS). In this 13 + case certain loads may speculatively pass invalid data to dependent operations 14 + when an asynchronous abort condition is pending in a Transactional 15 + Synchronization Extensions (TSX) transaction. This includes loads with no 16 + fault or assist condition. Such loads may speculatively expose stale data from 17 + the same uarch data structures as in MDS, with the same scope of exposure, i.e. 18 + same-thread and cross-thread. This issue affects all current processors that 19 + support TSX. 20 + 21 + Mitigation strategy 22 + ------------------- 23 + 24 + a) TSX disable - one of the mitigations is to disable TSX. A new MSR 25 + IA32_TSX_CTRL will be available in future and current processors after a 26 + microcode update which can be used to disable TSX. In addition, it 27 + controls the enumeration of the TSX feature bits (RTM and HLE) in CPUID. 28 + 29 + b) Clear CPU buffers - similar to MDS, clearing the CPU buffers mitigates this 30 + vulnerability. More details on this approach can be found in 31 + :ref:`Documentation/admin-guide/hw-vuln/mds.rst <mds>`. 32 + 33 + Kernel internal mitigation modes 34 + -------------------------------- 35 + 36 + ============= ============================================================ 37 + off Mitigation is disabled. Either the CPU is not affected or 38 + tsx_async_abort=off is supplied on the kernel command line. 39 + 40 + tsx disabled Mitigation is enabled. TSX feature is disabled by default at 41 + bootup on processors that support TSX control. 42 + 43 + verw Mitigation is enabled. CPU is affected and MD_CLEAR is 44 + advertised in CPUID. 45 + 46 + ucode needed Mitigation is enabled. 
CPU is affected and MD_CLEAR is not 47 + advertised in CPUID. That is mainly for virtualization 48 + scenarios where the host has the updated microcode but the 49 + hypervisor does not expose MD_CLEAR in CPUID. It's a best 50 + effort approach without a guarantee. 51 + ============= ============================================================ 52 + 53 + If the CPU is affected and the "tsx_async_abort" kernel command line parameter is 54 + not provided, then the kernel selects an appropriate mitigation depending on the 55 + status of RTM and MD_CLEAR CPUID bits. 56 + 57 + The tables below indicate the impact of the tsx=on|off|auto cmdline options on the state of 58 + TAA mitigation, VERW behavior and TSX feature for various combinations of 59 + MSR_IA32_ARCH_CAPABILITIES bits. 60 + 61 + 1. "tsx=off" 62 + 63 + ========= ========= ============ ============ ============== =================== ====================== 64 + MSR_IA32_ARCH_CAPABILITIES bits Result with cmdline tsx=off 65 + ---------------------------------- ------------------------------------------------------------------------- 66 + TAA_NO MDS_NO TSX_CTRL_MSR TSX state VERW can clear TAA mitigation TAA mitigation 67 + after bootup CPU buffers tsx_async_abort=off tsx_async_abort=full 68 + ========= ========= ============ ============ ============== =================== ====================== 69 + 0 0 0 HW default Yes Same as MDS Same as MDS 70 + 0 0 1 Invalid case Invalid case Invalid case Invalid case 71 + 0 1 0 HW default No Need ucode update Need ucode update 72 + 0 1 1 Disabled Yes TSX disabled TSX disabled 73 + 1 X 1 Disabled X None needed None needed 74 + ========= ========= ============ ============ ============== =================== ====================== 75 + 76 + 2. 
"tsx=on" 77 + 78 + ========= ========= ============ ============ ============== =================== ====================== 79 + MSR_IA32_ARCH_CAPABILITIES bits Result with cmdline tsx=on 80 + ---------------------------------- ------------------------------------------------------------------------- 81 + TAA_NO MDS_NO TSX_CTRL_MSR TSX state VERW can clear TAA mitigation TAA mitigation 82 + after bootup CPU buffers tsx_async_abort=off tsx_async_abort=full 83 + ========= ========= ============ ============ ============== =================== ====================== 84 + 0 0 0 HW default Yes Same as MDS Same as MDS 85 + 0 0 1 Invalid case Invalid case Invalid case Invalid case 86 + 0 1 0 HW default No Need ucode update Need ucode update 87 + 0 1 1 Enabled Yes None Same as MDS 88 + 1 X 1 Enabled X None needed None needed 89 + ========= ========= ============ ============ ============== =================== ====================== 90 + 91 + 3. "tsx=auto" 92 + 93 + ========= ========= ============ ============ ============== =================== ====================== 94 + MSR_IA32_ARCH_CAPABILITIES bits Result with cmdline tsx=auto 95 + ---------------------------------- ------------------------------------------------------------------------- 96 + TAA_NO MDS_NO TSX_CTRL_MSR TSX state VERW can clear TAA mitigation TAA mitigation 97 + after bootup CPU buffers tsx_async_abort=off tsx_async_abort=full 98 + ========= ========= ============ ============ ============== =================== ====================== 99 + 0 0 0 HW default Yes Same as MDS Same as MDS 100 + 0 0 1 Invalid case Invalid case Invalid case Invalid case 101 + 0 1 0 HW default No Need ucode update Need ucode update 102 + 0 1 1 Disabled Yes TSX disabled TSX disabled 103 + 1 X 1 Enabled X None needed None needed 104 + ========= ========= ============ ============ ============== =================== ====================== 105 + 106 + In the tables, TSX_CTRL_MSR is a new bit in MSR_IA32_ARCH_CAPABILITIES that 107 + 
indicates whether MSR_IA32_TSX_CTRL is supported. 108 + 109 + There are two control bits in IA32_TSX_CTRL MSR: 110 + 111 + Bit 0: When set, it disables the Restricted Transactional Memory (RTM) 112 + sub-feature of TSX (will force all transactions to abort on the 113 + XBEGIN instruction). 114 + 115 + Bit 1: When set, it disables the enumeration of the RTM and HLE features 116 + (i.e. it will make CPUID(EAX=7).EBX{bit4} and 117 + CPUID(EAX=7).EBX{bit11} read as 0).
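The two IA32_TSX_CTRL control bits just listed can be decoded from a raw MSR value. A sketch under the document's bit definitions; reading the MSR itself requires a privileged `rdmsr`, so the value here is just an integer, and the names are ours:

```python
# Bit layout per the IA32_TSX_CTRL description above.
TSX_CTRL_RTM_DISABLE = 1 << 0   # bit 0: force all RTM transactions to abort at XBEGIN
TSX_CTRL_CPUID_CLEAR = 1 << 1   # bit 1: hide RTM/HLE bits in CPUID(EAX=7).EBX

def decode_tsx_ctrl(value):
    """Decode a raw IA32_TSX_CTRL value into the two documented flags."""
    return {
        "rtm_disabled": bool(value & TSX_CTRL_RTM_DISABLE),
        "cpuid_cleared": bool(value & TSX_CTRL_CPUID_CLEAR),
    }
```

For example, the "tsx=off" mode described earlier corresponds to both bits being set.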
-2
MAINTAINERS
··· 3268 3268 F: drivers/cpufreq/bmips-cpufreq.c 3269 3269 3270 3270 BROADCOM BMIPS MIPS ARCHITECTURE 3271 - M: Kevin Cernekee <cernekee@gmail.com> 3272 3271 M: Florian Fainelli <f.fainelli@gmail.com> 3273 3272 L: bcm-kernel-feedback-list@broadcom.com 3274 3273 L: linux-mips@vger.kernel.org ··· 3744 3745 3745 3746 CAVIUM THUNDERX2 ARM64 SOC 3746 3747 M: Robert Richter <rrichter@cavium.com> 3747 - M: Jayachandran C <jnair@caviumnetworks.com> 3748 3748 L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) 3749 3749 S: Maintained 3750 3750 F: arch/arm64/boot/dts/cavium/thunder2-99xx*
+4 -1
Makefile
··· 2 2 VERSION = 5 3 3 PATCHLEVEL = 4 4 4 SUBLEVEL = 0 5 - EXTRAVERSION = -rc6 5 + EXTRAVERSION = -rc7 6 6 NAME = Kleptomaniac Octopus 7 7 8 8 # *DOCUMENTATION* ··· 916 916 ifeq ($(CONFIG_RELR),y) 917 917 LDFLAGS_vmlinux += --pack-dyn-relocs=relr 918 918 endif 919 + 920 + # make the checker run with the right architecture 921 + CHECKFLAGS += --arch=$(ARCH) 919 922 920 923 # insure the checker run with the right endianness 921 924 CHECKFLAGS += $(if $(CONFIG_CPU_BIG_ENDIAN),-mbig-endian,-mlittle-endian)
+4
arch/arm/boot/dts/imx6-logicpd-baseboard.dtsi
··· 328 328 pinctrl-0 = <&pinctrl_pwm3>; 329 329 }; 330 330 331 + &snvs_pwrkey { 332 + status = "okay"; 333 + }; 334 + 331 335 &ssi2 { 332 336 status = "okay"; 333 337 };
+8
arch/arm/boot/dts/imx6qdl-sabreauto.dtsi
··· 230 230 accelerometer@1c { 231 231 compatible = "fsl,mma8451"; 232 232 reg = <0x1c>; 233 + pinctrl-names = "default"; 234 + pinctrl-0 = <&pinctrl_mma8451_int>; 233 235 interrupt-parent = <&gpio6>; 234 236 interrupts = <31 IRQ_TYPE_LEVEL_LOW>; 235 237 }; ··· 627 625 pinctrl_max7310: max7310grp { 628 626 fsl,pins = < 629 627 MX6QDL_PAD_SD2_DAT0__GPIO1_IO15 0x1b0b0 628 + >; 629 + }; 630 + 631 + pinctrl_mma8451_int: mma8451intgrp { 632 + fsl,pins = < 633 + MX6QDL_PAD_EIM_BCLK__GPIO6_IO31 0xb0b1 630 634 >; 631 635 }; 632 636
+2 -11
arch/arm/boot/dts/stm32mp157c-ev1.dts
··· 183 183 184 184 ov5640: camera@3c { 185 185 compatible = "ovti,ov5640"; 186 - pinctrl-names = "default"; 187 - pinctrl-0 = <&ov5640_pins>; 188 186 reg = <0x3c>; 189 187 clocks = <&clk_ext_camera>; 190 188 clock-names = "xclk"; 191 189 DOVDD-supply = <&v2v8>; 192 - powerdown-gpios = <&stmfx_pinctrl 18 GPIO_ACTIVE_HIGH>; 193 - reset-gpios = <&stmfx_pinctrl 19 GPIO_ACTIVE_LOW>; 190 + powerdown-gpios = <&stmfx_pinctrl 18 (GPIO_ACTIVE_HIGH | GPIO_PUSH_PULL)>; 191 + reset-gpios = <&stmfx_pinctrl 19 (GPIO_ACTIVE_LOW | GPIO_PUSH_PULL)>; 194 192 rotation = <180>; 195 193 status = "okay"; 196 194 ··· 221 223 222 224 joystick_pins: joystick { 223 225 pins = "gpio0", "gpio1", "gpio2", "gpio3", "gpio4"; 224 - drive-push-pull; 225 226 bias-pull-down; 226 - }; 227 - 228 - ov5640_pins: camera { 229 - pins = "agpio2", "agpio3"; /* stmfx pins 18 & 19 */ 230 - drive-push-pull; 231 - output-low; 232 227 }; 233 228 }; 234 229 };
+2 -2
arch/arm/boot/dts/stm32mp157c.dtsi
··· 932 932 interrupt-names = "int0", "int1"; 933 933 clocks = <&rcc CK_HSE>, <&rcc FDCAN_K>; 934 934 clock-names = "hclk", "cclk"; 935 - bosch,mram-cfg = <0x1400 0 0 32 0 0 2 2>; 935 + bosch,mram-cfg = <0x0 0 0 32 0 0 2 2>; 936 936 status = "disabled"; 937 937 }; 938 938 ··· 945 945 interrupt-names = "int0", "int1"; 946 946 clocks = <&rcc CK_HSE>, <&rcc FDCAN_K>; 947 947 clock-names = "hclk", "cclk"; 948 - bosch,mram-cfg = <0x0 0 0 32 0 0 2 2>; 948 + bosch,mram-cfg = <0x1400 0 0 32 0 0 2 2>; 949 949 status = "disabled"; 950 950 }; 951 951
+1
arch/arm/boot/dts/sun8i-a83t-tbs-a711.dts
··· 192 192 vqmmc-supply = <&reg_dldo1>; 193 193 non-removable; 194 194 wakeup-source; 195 + keep-power-in-suspend; 195 196 status = "okay"; 196 197 197 198 brcmf: wifi@1 {
+5 -1
arch/arm/mach-sunxi/mc_smp.c
··· 481 481 static int sunxi_cpu_powerdown(unsigned int cpu, unsigned int cluster) 482 482 { 483 483 u32 reg; 484 + int gating_bit = cpu; 484 485 485 486 pr_debug("%s: cluster %u cpu %u\n", __func__, cluster, cpu); 486 487 if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS) 487 488 return -EINVAL; 488 489 490 + if (is_a83t && cpu == 0) 491 + gating_bit = 4; 492 + 489 493 /* gate processor power */ 490 494 reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster)); 491 - reg |= PRCM_PWROFF_GATING_REG_CORE(cpu); 495 + reg |= PRCM_PWROFF_GATING_REG_CORE(gating_bit); 492 496 writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster)); 493 497 udelay(20); 494 498
+1 -1
arch/arm64/boot/dts/freescale/fsl-ls1028a-qds.dts
··· 127 127 status = "okay"; 128 128 129 129 i2c-mux@77 { 130 - compatible = "nxp,pca9847"; 130 + compatible = "nxp,pca9547"; 131 131 reg = <0x77>; 132 132 #address-cells = <1>; 133 133 #size-cells = <0>;
+3 -3
arch/arm64/boot/dts/freescale/imx8mm.dtsi
··· 394 394 }; 395 395 396 396 sdma2: dma-controller@302c0000 { 397 - compatible = "fsl,imx8mm-sdma", "fsl,imx7d-sdma"; 397 + compatible = "fsl,imx8mm-sdma", "fsl,imx8mq-sdma"; 398 398 reg = <0x302c0000 0x10000>; 399 399 interrupts = <GIC_SPI 103 IRQ_TYPE_LEVEL_HIGH>; 400 400 clocks = <&clk IMX8MM_CLK_SDMA2_ROOT>, ··· 405 405 }; 406 406 407 407 sdma3: dma-controller@302b0000 { 408 - compatible = "fsl,imx8mm-sdma", "fsl,imx7d-sdma"; 408 + compatible = "fsl,imx8mm-sdma", "fsl,imx8mq-sdma"; 409 409 reg = <0x302b0000 0x10000>; 410 410 interrupts = <GIC_SPI 34 IRQ_TYPE_LEVEL_HIGH>; 411 411 clocks = <&clk IMX8MM_CLK_SDMA3_ROOT>, ··· 737 737 }; 738 738 739 739 sdma1: dma-controller@30bd0000 { 740 - compatible = "fsl,imx8mm-sdma", "fsl,imx7d-sdma"; 740 + compatible = "fsl,imx8mm-sdma", "fsl,imx8mq-sdma"; 741 741 reg = <0x30bd0000 0x10000>; 742 742 interrupts = <GIC_SPI 2 IRQ_TYPE_LEVEL_HIGH>; 743 743 clocks = <&clk IMX8MM_CLK_SDMA1_ROOT>,
+3 -3
arch/arm64/boot/dts/freescale/imx8mn.dtsi
··· 288 288 }; 289 289 290 290 sdma3: dma-controller@302b0000 { 291 - compatible = "fsl,imx8mn-sdma", "fsl,imx7d-sdma"; 291 + compatible = "fsl,imx8mn-sdma", "fsl,imx8mq-sdma"; 292 292 reg = <0x302b0000 0x10000>; 293 293 interrupts = <GIC_SPI 34 IRQ_TYPE_LEVEL_HIGH>; 294 294 clocks = <&clk IMX8MN_CLK_SDMA3_ROOT>, ··· 299 299 }; 300 300 301 301 sdma2: dma-controller@302c0000 { 302 - compatible = "fsl,imx8mn-sdma", "fsl,imx7d-sdma"; 302 + compatible = "fsl,imx8mn-sdma", "fsl,imx8mq-sdma"; 303 303 reg = <0x302c0000 0x10000>; 304 304 interrupts = <GIC_SPI 103 IRQ_TYPE_LEVEL_HIGH>; 305 305 clocks = <&clk IMX8MN_CLK_SDMA2_ROOT>, ··· 612 612 }; 613 613 614 614 sdma1: dma-controller@30bd0000 { 615 - compatible = "fsl,imx8mn-sdma", "fsl,imx7d-sdma"; 615 + compatible = "fsl,imx8mn-sdma", "fsl,imx8mq-sdma"; 616 616 reg = <0x30bd0000 0x10000>; 617 617 interrupts = <GIC_SPI 2 IRQ_TYPE_LEVEL_HIGH>; 618 618 clocks = <&clk IMX8MN_CLK_SDMA1_ROOT>,
+1 -1
arch/arm64/boot/dts/freescale/imx8mq-zii-ultra.dtsi
··· 88 88 regulator-name = "0V9_ARM"; 89 89 regulator-min-microvolt = <900000>; 90 90 regulator-max-microvolt = <1000000>; 91 - gpios = <&gpio3 19 GPIO_ACTIVE_HIGH>; 91 + gpios = <&gpio3 16 GPIO_ACTIVE_HIGH>; 92 92 states = <1000000 0x1 93 93 900000 0x0>; 94 94 regulator-always-on;
-7
arch/arm64/include/asm/vdso/vsyscall.h
··· 31 31 #define __arch_get_clock_mode __arm64_get_clock_mode 32 32 33 33 static __always_inline 34 - int __arm64_use_vsyscall(struct vdso_data *vdata) 35 - { 36 - return !vdata[CS_HRES_COARSE].clock_mode; 37 - } 38 - #define __arch_use_vsyscall __arm64_use_vsyscall 39 - 40 - static __always_inline 41 34 void __arm64_update_vsyscall(struct vdso_data *vdata, struct timekeeper *tk) 42 35 { 43 36 vdata[CS_HRES_COARSE].mask = VDSO_PRECISION_MASK;
-7
arch/mips/include/asm/vdso/vsyscall.h
··· 28 28 } 29 29 #define __arch_get_clock_mode __mips_get_clock_mode 30 30 31 - static __always_inline 32 - int __mips_use_vsyscall(struct vdso_data *vdata) 33 - { 34 - return (vdata[CS_HRES_COARSE].clock_mode != VDSO_CLOCK_NONE); 35 - } 36 - #define __arch_use_vsyscall __mips_use_vsyscall 37 - 38 31 /* The asm-generic header needs to be included after the definitions above */ 39 32 #include <asm-generic/vdso/vsyscall.h> 40 33
-7
arch/mips/sgi-ip27/Kconfig
··· 38 38 Say Y here to enable replicating the kernel text across multiple 39 39 nodes in a NUMA cluster. This trades memory for speed. 40 40 41 - config REPLICATE_EXHANDLERS 42 - bool "Exception handler replication support" 43 - depends on SGI_IP27 44 - help 45 - Say Y here to enable replicating the kernel exception handlers 46 - across multiple nodes in a NUMA cluster. This trades memory for 47 - speed.
+6 -15
arch/mips/sgi-ip27/ip27-init.c
··· 69 69 70 70 hub_rtc_init(cnode); 71 71 72 - #ifdef CONFIG_REPLICATE_EXHANDLERS 73 - /* 74 - * If this is not a headless node initialization, 75 - * copy over the caliased exception handlers. 76 - */ 77 - if (get_compact_nodeid() == cnode) { 78 - extern char except_vec2_generic, except_vec3_generic; 79 - extern void build_tlb_refill_handler(void); 80 - 81 - memcpy((void *)(CKSEG0 + 0x100), &except_vec2_generic, 0x80); 82 - memcpy((void *)(CKSEG0 + 0x180), &except_vec3_generic, 0x80); 83 - build_tlb_refill_handler(); 84 - memcpy((void *)(CKSEG0 + 0x100), (void *) CKSEG0, 0x80); 85 - memcpy((void *)(CKSEG0 + 0x180), &except_vec3_generic, 0x100); 72 + if (nasid) { 73 + /* copy exception handlers from first node to current node */ 74 + memcpy((void *)NODE_OFFSET_TO_K0(nasid, 0), 75 + (void *)CKSEG0, 0x200); 86 76 __flush_cache_all(); 77 + /* switch to node local exception handlers */ 78 + REMOTE_HUB_S(nasid, PI_CALIAS_SIZE, PI_CALIAS_SIZE_8K); 87 79 } 88 - #endif 89 80 } 90 81 91 82 void per_cpu_init(void)
-4
arch/mips/sgi-ip27/ip27-memory.c
··· 332 332 * thinks it is a node 0 address. 333 333 */ 334 334 REMOTE_HUB_S(nasid, PI_REGION_PRESENT, (region_mask | 1)); 335 - #ifdef CONFIG_REPLICATE_EXHANDLERS 336 - REMOTE_HUB_S(nasid, PI_CALIAS_SIZE, PI_CALIAS_SIZE_8K); 337 - #else 338 335 REMOTE_HUB_S(nasid, PI_CALIAS_SIZE, PI_CALIAS_SIZE_0); 339 - #endif 340 336 341 337 #ifdef LATER 342 338 /*
+2 -2
arch/sparc/vdso/Makefile
··· 65 65 # 66 66 # vDSO code runs in userspace and -pg doesn't help with profiling anyway. 67 67 # 68 - CFLAGS_REMOVE_vdso-note.o = -pg 69 68 CFLAGS_REMOVE_vclock_gettime.o = -pg 69 + CFLAGS_REMOVE_vdso32/vclock_gettime.o = -pg 70 70 71 71 $(obj)/%.so: OBJCOPYFLAGS := -S 72 72 $(obj)/%.so: $(obj)/%.so.dbg FORCE 73 73 $(call if_changed,objcopy) 74 74 75 - CPPFLAGS_vdso32.lds = $(CPPFLAGS_vdso.lds) 75 + CPPFLAGS_vdso32/vdso32.lds = $(CPPFLAGS_vdso.lds) 76 76 VDSO_LDFLAGS_vdso32.lds = -m elf32_sparc -soname linux-gate.so.1 77 77 78 78 #This makes sure the $(obj) subdirectory exists even though vdso32/
+45
arch/x86/Kconfig
··· 1940 1940 1941 1941 If unsure, say y. 1942 1942 1943 + choice 1944 + prompt "TSX enable mode" 1945 + depends on CPU_SUP_INTEL 1946 + default X86_INTEL_TSX_MODE_OFF 1947 + help 1948 + Intel's TSX (Transactional Synchronization Extensions) feature 1949 + allows optimizing locking protocols through lock elision, which 1950 + can lead to a noticeable performance boost. 1951 + 1952 + On the other hand, it has been shown that TSX can be exploited 1953 + to form side channel attacks (e.g. TAA) and chances are there 1954 + will be more of those attacks discovered in the future. 1955 + 1956 + Therefore TSX is not enabled by default (aka tsx=off). An admin 1957 + might override this decision with the tsx=on command line parameter. 1958 + Even with TSX enabled, the kernel will attempt to enable the best 1959 + possible TAA mitigation setting depending on the microcode available 1960 + for the particular machine. 1961 + 1962 + This option sets the default tsx mode between tsx=on, =off 1963 + and =auto. See Documentation/admin-guide/kernel-parameters.txt for more 1964 + details. 1965 + 1966 + Say off if not sure, auto if TSX is in use but should only be enabled 1967 + on safe platforms, or on if TSX is in use and the security aspect of 1968 + TSX is not relevant. 1969 + 1970 + config X86_INTEL_TSX_MODE_OFF 1971 + bool "off" 1972 + help 1973 + TSX is disabled if possible - equals the tsx=off command line parameter. 1974 + 1975 + config X86_INTEL_TSX_MODE_ON 1976 + bool "on" 1977 + help 1978 + TSX is always enabled on TSX capable HW - equals the tsx=on command 1979 + line parameter. 1980 + 1981 + config X86_INTEL_TSX_MODE_AUTO 1982 + bool "auto" 1983 + help 1984 + TSX is enabled on TSX capable HW that is believed to be safe against 1985 + side channel attacks - equals the tsx=auto command line parameter. 1986 + endchoice 1987 + 1943 1988 config EFI 1944 1989 bool "EFI runtime service support" 1945 1990 depends on ACPI
+2
arch/x86/include/asm/cpufeatures.h
··· 399 399 #define X86_BUG_MDS X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */ 400 400 #define X86_BUG_MSBDS_ONLY X86_BUG(20) /* CPU is only affected by the MSDBS variant of BUG_MDS */ 401 401 #define X86_BUG_SWAPGS X86_BUG(21) /* CPU is affected by speculation through SWAPGS */ 402 + #define X86_BUG_TAA X86_BUG(22) /* CPU is affected by TSX Async Abort(TAA) */ 403 + #define X86_BUG_ITLB_MULTIHIT X86_BUG(23) /* CPU may incur MCE during certain page attribute changes */ 402 404 403 405 #endif /* _ASM_X86_CPUFEATURES_H */
+6
arch/x86/include/asm/kvm_host.h
··· 312 312 struct kvm_mmu_page { 313 313 struct list_head link; 314 314 struct hlist_node hash_link; 315 + struct list_head lpage_disallowed_link; 316 + 315 317 bool unsync; 316 318 u8 mmu_valid_gen; 317 319 bool mmio_cached; 320 + bool lpage_disallowed; /* Can't be replaced by an equiv large page */ 318 321 319 322 /* 320 323 * The following two entries are used to key the shadow page in the ··· 862 859 */ 863 860 struct list_head active_mmu_pages; 864 861 struct list_head zapped_obsolete_pages; 862 + struct list_head lpage_disallowed_mmu_pages; 865 863 struct kvm_page_track_notifier_node mmu_sp_tracker; 866 864 struct kvm_page_track_notifier_head track_notifier_head; 867 865 ··· 937 933 bool exception_payload_enabled; 938 934 939 935 struct kvm_pmu_event_filter *pmu_event_filter; 936 + struct task_struct *nx_lpage_recovery_thread; 940 937 }; 941 938 942 939 struct kvm_vm_stat { ··· 951 946 ulong mmu_unsync; 952 947 ulong remote_tlb_flush; 953 948 ulong lpages; 949 + ulong nx_lpage_splits; 954 950 ulong max_mmu_page_hash_collisions; 955 951 }; 956 952
+16
arch/x86/include/asm/msr-index.h
··· 93 93 * Microarchitectural Data 94 94 * Sampling (MDS) vulnerabilities. 95 95 */ 96 + #define ARCH_CAP_PSCHANGE_MC_NO BIT(6) /* 97 + * The processor is not susceptible to a 98 + * machine check error due to modifying the 99 + * code page size along with either the 100 + * physical address or cache type 101 + * without TLB invalidation. 102 + */ 103 + #define ARCH_CAP_TSX_CTRL_MSR BIT(7) /* MSR for TSX control is available. */ 104 + #define ARCH_CAP_TAA_NO BIT(8) /* 105 + * Not susceptible to 106 + * TSX Async Abort (TAA) vulnerabilities. 107 + */ 96 108 97 109 #define MSR_IA32_FLUSH_CMD 0x0000010b 98 110 #define L1D_FLUSH BIT(0) /* ··· 114 102 115 103 #define MSR_IA32_BBL_CR_CTL 0x00000119 116 104 #define MSR_IA32_BBL_CR_CTL3 0x0000011e 105 + 106 + #define MSR_IA32_TSX_CTRL 0x00000122 107 + #define TSX_CTRL_RTM_DISABLE BIT(0) /* Disable RTM feature */ 108 + #define TSX_CTRL_CPUID_CLEAR BIT(1) /* Disable TSX enumeration */ 117 109 118 110 #define MSR_IA32_SYSENTER_CS 0x00000174 119 111 #define MSR_IA32_SYSENTER_ESP 0x00000175
+2 -2
arch/x86/include/asm/nospec-branch.h
··· 314 314 #include <asm/segment.h> 315 315 316 316 /** 317 - * mds_clear_cpu_buffers - Mitigation for MDS vulnerability 317 + * mds_clear_cpu_buffers - Mitigation for MDS and TAA vulnerability 318 318 * 319 319 * This uses the otherwise unused and obsolete VERW instruction in 320 320 * combination with microcode which triggers a CPU buffer flush when the ··· 337 337 } 338 338 339 339 /** 340 - * mds_user_clear_cpu_buffers - Mitigation for MDS vulnerability 340 + * mds_user_clear_cpu_buffers - Mitigation for MDS and TAA vulnerability 341 341 * 342 342 * Clear CPU buffers if the corresponding static key is enabled 343 343 */
+7
arch/x86/include/asm/processor.h
··· 988 988 MDS_MITIGATION_VMWERV, 989 989 }; 990 990 991 + enum taa_mitigations { 992 + TAA_MITIGATION_OFF, 993 + TAA_MITIGATION_UCODE_NEEDED, 994 + TAA_MITIGATION_VERW, 995 + TAA_MITIGATION_TSX_DISABLED, 996 + }; 997 + 991 998 #endif /* _ASM_X86_PROCESSOR_H */
+15 -13
arch/x86/kernel/apic/apic.c
··· 1586 1586 { 1587 1587 int cpu = smp_processor_id(); 1588 1588 unsigned int value; 1589 - #ifdef CONFIG_X86_32 1590 - int logical_apicid, ldr_apicid; 1591 - #endif 1592 1589 1593 1590 if (disable_apic) { 1594 1591 disable_ioapic_support(); ··· 1623 1626 apic->init_apic_ldr(); 1624 1627 1625 1628 #ifdef CONFIG_X86_32 1626 - /* 1627 - * APIC LDR is initialized. If logical_apicid mapping was 1628 - * initialized during get_smp_config(), make sure it matches the 1629 - * actual value. 1630 - */ 1631 - logical_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu); 1632 - ldr_apicid = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR)); 1633 - WARN_ON(logical_apicid != BAD_APICID && logical_apicid != ldr_apicid); 1634 - /* always use the value from LDR */ 1635 - early_per_cpu(x86_cpu_to_logical_apicid, cpu) = ldr_apicid; 1629 + if (apic->dest_logical) { 1630 + int logical_apicid, ldr_apicid; 1631 + 1632 + /* 1633 + * APIC LDR is initialized. If logical_apicid mapping was 1634 + * initialized during get_smp_config(), make sure it matches 1635 + * the actual value. 1636 + */ 1637 + logical_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu); 1638 + ldr_apicid = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR)); 1639 + if (logical_apicid != BAD_APICID) 1640 + WARN_ON(logical_apicid != ldr_apicid); 1641 + /* Always use the value from LDR. */ 1642 + early_per_cpu(x86_cpu_to_logical_apicid, cpu) = ldr_apicid; 1643 + } 1636 1644 #endif 1637 1645 1638 1646 /*
+1 -1
arch/x86/kernel/cpu/Makefile
··· 30 30 obj-$(CONFIG_X86_FEATURE_NAMES) += capflags.o powerflags.o 31 31 32 32 ifdef CONFIG_CPU_SUP_INTEL 33 - obj-y += intel.o intel_pconfig.o 33 + obj-y += intel.o intel_pconfig.o tsx.o 34 34 obj-$(CONFIG_PM) += intel_epb.o 35 35 endif 36 36 obj-$(CONFIG_CPU_SUP_AMD) += amd.o
+155 -4
arch/x86/kernel/cpu/bugs.c
··· 39 39 static void __init ssb_select_mitigation(void); 40 40 static void __init l1tf_select_mitigation(void); 41 41 static void __init mds_select_mitigation(void); 42 + static void __init taa_select_mitigation(void); 42 43 43 44 /* The base value of the SPEC_CTRL MSR that always has to be preserved. */ 44 45 u64 x86_spec_ctrl_base; ··· 106 105 ssb_select_mitigation(); 107 106 l1tf_select_mitigation(); 108 107 mds_select_mitigation(); 108 + taa_select_mitigation(); 109 109 110 110 arch_smt_update(); 111 111 ··· 269 267 return 0; 270 268 } 271 269 early_param("mds", mds_cmdline); 270 + 271 + #undef pr_fmt 272 + #define pr_fmt(fmt) "TAA: " fmt 273 + 274 + /* Default mitigation for TAA-affected CPUs */ 275 + static enum taa_mitigations taa_mitigation __ro_after_init = TAA_MITIGATION_VERW; 276 + static bool taa_nosmt __ro_after_init; 277 + 278 + static const char * const taa_strings[] = { 279 + [TAA_MITIGATION_OFF] = "Vulnerable", 280 + [TAA_MITIGATION_UCODE_NEEDED] = "Vulnerable: Clear CPU buffers attempted, no microcode", 281 + [TAA_MITIGATION_VERW] = "Mitigation: Clear CPU buffers", 282 + [TAA_MITIGATION_TSX_DISABLED] = "Mitigation: TSX disabled", 283 + }; 284 + 285 + static void __init taa_select_mitigation(void) 286 + { 287 + u64 ia32_cap; 288 + 289 + if (!boot_cpu_has_bug(X86_BUG_TAA)) { 290 + taa_mitigation = TAA_MITIGATION_OFF; 291 + return; 292 + } 293 + 294 + /* TSX previously disabled by tsx=off */ 295 + if (!boot_cpu_has(X86_FEATURE_RTM)) { 296 + taa_mitigation = TAA_MITIGATION_TSX_DISABLED; 297 + goto out; 298 + } 299 + 300 + if (cpu_mitigations_off()) { 301 + taa_mitigation = TAA_MITIGATION_OFF; 302 + return; 303 + } 304 + 305 + /* TAA mitigation is turned off on the cmdline (tsx_async_abort=off) */ 306 + if (taa_mitigation == TAA_MITIGATION_OFF) 307 + goto out; 308 + 309 + if (boot_cpu_has(X86_FEATURE_MD_CLEAR)) 310 + taa_mitigation = TAA_MITIGATION_VERW; 311 + else 312 + taa_mitigation = TAA_MITIGATION_UCODE_NEEDED; 313 + 314 + /* 315 + * VERW doesn't 
clear the CPU buffers when MD_CLEAR=1 and MDS_NO=1. 316 + * A microcode update fixes this behavior to clear CPU buffers. It also 317 + * adds support for MSR_IA32_TSX_CTRL which is enumerated by the 318 + * ARCH_CAP_TSX_CTRL_MSR bit. 319 + * 320 + * On MDS_NO=1 CPUs if ARCH_CAP_TSX_CTRL_MSR is not set, microcode 321 + * update is required. 322 + */ 323 + ia32_cap = x86_read_arch_cap_msr(); 324 + if ( (ia32_cap & ARCH_CAP_MDS_NO) && 325 + !(ia32_cap & ARCH_CAP_TSX_CTRL_MSR)) 326 + taa_mitigation = TAA_MITIGATION_UCODE_NEEDED; 327 + 328 + /* 329 + * TSX is enabled, select alternate mitigation for TAA which is 330 + * the same as MDS. Enable MDS static branch to clear CPU buffers. 331 + * 332 + * For guests that can't determine whether the correct microcode is 333 + * present on host, enable the mitigation for UCODE_NEEDED as well. 334 + */ 335 + static_branch_enable(&mds_user_clear); 336 + 337 + if (taa_nosmt || cpu_mitigations_auto_nosmt()) 338 + cpu_smt_disable(false); 339 + 340 + out: 341 + pr_info("%s\n", taa_strings[taa_mitigation]); 342 + } 343 + 344 + static int __init tsx_async_abort_parse_cmdline(char *str) 345 + { 346 + if (!boot_cpu_has_bug(X86_BUG_TAA)) 347 + return 0; 348 + 349 + if (!str) 350 + return -EINVAL; 351 + 352 + if (!strcmp(str, "off")) { 353 + taa_mitigation = TAA_MITIGATION_OFF; 354 + } else if (!strcmp(str, "full")) { 355 + taa_mitigation = TAA_MITIGATION_VERW; 356 + } else if (!strcmp(str, "full,nosmt")) { 357 + taa_mitigation = TAA_MITIGATION_VERW; 358 + taa_nosmt = true; 359 + } 360 + 361 + return 0; 362 + } 363 + early_param("tsx_async_abort", tsx_async_abort_parse_cmdline); 272 364 273 365 #undef pr_fmt 274 366 #define pr_fmt(fmt) "Spectre V1 : " fmt ··· 882 786 } 883 787 884 788 #define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n" 789 + #define TAA_MSG_SMT "TAA CPU bug present and SMT on, data leak possible. 
See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.\n" 885 790 886 791 void cpu_bugs_smt_update(void) 887 792 { 888 - /* Enhanced IBRS implies STIBP. No update required. */ 889 - if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED) 890 - return; 891 - 892 793 mutex_lock(&spec_ctrl_mutex); 893 794 894 795 switch (spectre_v2_user) { ··· 909 816 update_mds_branch_idle(); 910 817 break; 911 818 case MDS_MITIGATION_OFF: 819 + break; 820 + } 821 + 822 + switch (taa_mitigation) { 823 + case TAA_MITIGATION_VERW: 824 + case TAA_MITIGATION_UCODE_NEEDED: 825 + if (sched_smt_active()) 826 + pr_warn_once(TAA_MSG_SMT); 827 + break; 828 + case TAA_MITIGATION_TSX_DISABLED: 829 + case TAA_MITIGATION_OFF: 912 830 break; 913 831 } 914 832 ··· 1253 1149 x86_amd_ssb_disable(); 1254 1150 } 1255 1151 1152 + bool itlb_multihit_kvm_mitigation; 1153 + EXPORT_SYMBOL_GPL(itlb_multihit_kvm_mitigation); 1154 + 1256 1155 #undef pr_fmt 1257 1156 #define pr_fmt(fmt) "L1TF: " fmt 1258 1157 ··· 1411 1304 l1tf_vmx_states[l1tf_vmx_mitigation], 1412 1305 sched_smt_active() ? "vulnerable" : "disabled"); 1413 1306 } 1307 + 1308 + static ssize_t itlb_multihit_show_state(char *buf) 1309 + { 1310 + if (itlb_multihit_kvm_mitigation) 1311 + return sprintf(buf, "KVM: Mitigation: Split huge pages\n"); 1312 + else 1313 + return sprintf(buf, "KVM: Vulnerable\n"); 1314 + } 1414 1315 #else 1415 1316 static ssize_t l1tf_show_state(char *buf) 1416 1317 { 1417 1318 return sprintf(buf, "%s\n", L1TF_DEFAULT_MSG); 1319 + } 1320 + 1321 + static ssize_t itlb_multihit_show_state(char *buf) 1322 + { 1323 + return sprintf(buf, "Processor vulnerable\n"); 1418 1324 } 1419 1325 #endif 1420 1326 ··· 1445 1325 } 1446 1326 1447 1327 return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation], 1328 + sched_smt_active() ? 
"vulnerable" : "disabled"); 1329 + } 1330 + 1331 + static ssize_t tsx_async_abort_show_state(char *buf) 1332 + { 1333 + if ((taa_mitigation == TAA_MITIGATION_TSX_DISABLED) || 1334 + (taa_mitigation == TAA_MITIGATION_OFF)) 1335 + return sprintf(buf, "%s\n", taa_strings[taa_mitigation]); 1336 + 1337 + if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) { 1338 + return sprintf(buf, "%s; SMT Host state unknown\n", 1339 + taa_strings[taa_mitigation]); 1340 + } 1341 + 1342 + return sprintf(buf, "%s; SMT %s\n", taa_strings[taa_mitigation], 1448 1343 sched_smt_active() ? "vulnerable" : "disabled"); 1449 1344 } 1450 1345 ··· 1533 1398 case X86_BUG_MDS: 1534 1399 return mds_show_state(buf); 1535 1400 1401 + case X86_BUG_TAA: 1402 + return tsx_async_abort_show_state(buf); 1403 + 1404 + case X86_BUG_ITLB_MULTIHIT: 1405 + return itlb_multihit_show_state(buf); 1406 + 1536 1407 default: 1537 1408 break; 1538 1409 } ··· 1574 1433 ssize_t cpu_show_mds(struct device *dev, struct device_attribute *attr, char *buf) 1575 1434 { 1576 1435 return cpu_show_common(dev, attr, buf, X86_BUG_MDS); 1436 + } 1437 + 1438 + ssize_t cpu_show_tsx_async_abort(struct device *dev, struct device_attribute *attr, char *buf) 1439 + { 1440 + return cpu_show_common(dev, attr, buf, X86_BUG_TAA); 1441 + } 1442 + 1443 + ssize_t cpu_show_itlb_multihit(struct device *dev, struct device_attribute *attr, char *buf) 1444 + { 1445 + return cpu_show_common(dev, attr, buf, X86_BUG_ITLB_MULTIHIT); 1577 1446 } 1578 1447 #endif
+64 -33
arch/x86/kernel/cpu/common.c
··· 1016 1016 #endif 1017 1017 } 1018 1018 1019 - #define NO_SPECULATION BIT(0) 1020 - #define NO_MELTDOWN BIT(1) 1021 - #define NO_SSB BIT(2) 1022 - #define NO_L1TF BIT(3) 1023 - #define NO_MDS BIT(4) 1024 - #define MSBDS_ONLY BIT(5) 1025 - #define NO_SWAPGS BIT(6) 1019 + #define NO_SPECULATION BIT(0) 1020 + #define NO_MELTDOWN BIT(1) 1021 + #define NO_SSB BIT(2) 1022 + #define NO_L1TF BIT(3) 1023 + #define NO_MDS BIT(4) 1024 + #define MSBDS_ONLY BIT(5) 1025 + #define NO_SWAPGS BIT(6) 1026 + #define NO_ITLB_MULTIHIT BIT(7) 1026 1027 1027 1028 #define VULNWL(_vendor, _family, _model, _whitelist) \ 1028 1029 { X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist } ··· 1044 1043 VULNWL(NSC, 5, X86_MODEL_ANY, NO_SPECULATION), 1045 1044 1046 1045 /* Intel Family 6 */ 1047 - VULNWL_INTEL(ATOM_SALTWELL, NO_SPECULATION), 1048 - VULNWL_INTEL(ATOM_SALTWELL_TABLET, NO_SPECULATION), 1049 - VULNWL_INTEL(ATOM_SALTWELL_MID, NO_SPECULATION), 1050 - VULNWL_INTEL(ATOM_BONNELL, NO_SPECULATION), 1051 - VULNWL_INTEL(ATOM_BONNELL_MID, NO_SPECULATION), 1046 + VULNWL_INTEL(ATOM_SALTWELL, NO_SPECULATION | NO_ITLB_MULTIHIT), 1047 + VULNWL_INTEL(ATOM_SALTWELL_TABLET, NO_SPECULATION | NO_ITLB_MULTIHIT), 1048 + VULNWL_INTEL(ATOM_SALTWELL_MID, NO_SPECULATION | NO_ITLB_MULTIHIT), 1049 + VULNWL_INTEL(ATOM_BONNELL, NO_SPECULATION | NO_ITLB_MULTIHIT), 1050 + VULNWL_INTEL(ATOM_BONNELL_MID, NO_SPECULATION | NO_ITLB_MULTIHIT), 1052 1051 1053 - VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS), 1054 - VULNWL_INTEL(ATOM_SILVERMONT_D, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS), 1055 - VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS), 1056 - VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS), 1057 - VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS), 1058 - VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS), 1052 + VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | 
NO_SWAPGS | NO_ITLB_MULTIHIT), 1053 + VULNWL_INTEL(ATOM_SILVERMONT_D, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT), 1054 + VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT), 1055 + VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT), 1056 + VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT), 1057 + VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT), 1059 1058 1060 1059 VULNWL_INTEL(CORE_YONAH, NO_SSB), 1061 1060 1062 - VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF | MSBDS_ONLY | NO_SWAPGS), 1063 - VULNWL_INTEL(ATOM_AIRMONT_NP, NO_L1TF | NO_SWAPGS), 1061 + VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT), 1062 + VULNWL_INTEL(ATOM_AIRMONT_NP, NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT), 1064 1063 1065 - VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF | NO_SWAPGS), 1066 - VULNWL_INTEL(ATOM_GOLDMONT_D, NO_MDS | NO_L1TF | NO_SWAPGS), 1067 - VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF | NO_SWAPGS), 1064 + VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT), 1065 + VULNWL_INTEL(ATOM_GOLDMONT_D, NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT), 1066 + VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT), 1068 1067 1069 1068 /* 1070 1069 * Technically, swapgs isn't serializing on AMD (despite it previously ··· 1074 1073 * good enough for our purposes. 
1075 1074 */ 1076 1075 1076 + VULNWL_INTEL(ATOM_TREMONT_D, NO_ITLB_MULTIHIT), 1077 + 1077 1078 /* AMD Family 0xf - 0x12 */ 1078 - VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS), 1079 - VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS), 1080 - VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS), 1081 - VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS), 1079 + VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT), 1080 + VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT), 1081 + VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT), 1082 + VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT), 1082 1083 1083 1084 /* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */ 1084 - VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS), 1085 - VULNWL_HYGON(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS), 1085 + VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT), 1086 + VULNWL_HYGON(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT), 1086 1087 {} 1087 1088 }; 1088 1089 ··· 1095 1092 return m && !!(m->driver_data & which); 1096 1093 } 1097 1094 1098 - static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) 1095 + u64 x86_read_arch_cap_msr(void) 1099 1096 { 1100 1097 u64 ia32_cap = 0; 1098 + 1099 + if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES)) 1100 + rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap); 1101 + 1102 + return ia32_cap; 1103 + } 1104 + 1105 + static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) 1106 + { 1107 + u64 ia32_cap = x86_read_arch_cap_msr(); 1108 + 1109 + /* Set ITLB_MULTIHIT bug if cpu is not in the whitelist and not mitigated */ 1110 + if (!cpu_matches(NO_ITLB_MULTIHIT) && !(ia32_cap & ARCH_CAP_PSCHANGE_MC_NO)) 1111 + 
setup_force_cpu_bug(X86_BUG_ITLB_MULTIHIT); 1101 1112 1102 1113 if (cpu_matches(NO_SPECULATION)) 1103 1114 return; 1104 1115 1105 1116 setup_force_cpu_bug(X86_BUG_SPECTRE_V1); 1106 1117 setup_force_cpu_bug(X86_BUG_SPECTRE_V2); 1107 - 1108 - if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES)) 1109 - rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap); 1110 1118 1111 1119 if (!cpu_matches(NO_SSB) && !(ia32_cap & ARCH_CAP_SSB_NO) && 1112 1120 !cpu_has(c, X86_FEATURE_AMD_SSB_NO)) ··· 1134 1120 1135 1121 if (!cpu_matches(NO_SWAPGS)) 1136 1122 setup_force_cpu_bug(X86_BUG_SWAPGS); 1123 + 1124 + /* 1125 + * When the CPU is not mitigated for TAA (TAA_NO=0) set TAA bug when: 1126 + * - TSX is supported or 1127 + * - TSX_CTRL is present 1128 + * 1129 + * TSX_CTRL check is needed for cases when TSX could be disabled before 1130 + * the kernel boot e.g. kexec. 1131 + * TSX_CTRL check alone is not sufficient for cases when the microcode 1132 + * update is not present or running as guest that don't get TSX_CTRL. 1133 + */ 1134 + if (!(ia32_cap & ARCH_CAP_TAA_NO) && 1135 + (cpu_has(c, X86_FEATURE_RTM) || 1136 + (ia32_cap & ARCH_CAP_TSX_CTRL_MSR))) 1137 + setup_force_cpu_bug(X86_BUG_TAA); 1137 1138 1138 1139 if (cpu_matches(NO_MELTDOWN)) 1139 1140 return; ··· 1583 1554 #endif 1584 1555 cpu_detect_tlb(&boot_cpu_data); 1585 1556 setup_cr_pinning(); 1557 + 1558 + tsx_init(); 1586 1559 } 1587 1560 1588 1561 void identify_secondary_cpu(struct cpuinfo_x86 *c)
+18
arch/x86/kernel/cpu/cpu.h
··· 44 44 extern const struct cpu_dev *const __x86_cpu_dev_start[], 45 45 *const __x86_cpu_dev_end[]; 46 46 47 + #ifdef CONFIG_CPU_SUP_INTEL 48 + enum tsx_ctrl_states { 49 + TSX_CTRL_ENABLE, 50 + TSX_CTRL_DISABLE, 51 + TSX_CTRL_NOT_SUPPORTED, 52 + }; 53 + 54 + extern __ro_after_init enum tsx_ctrl_states tsx_ctrl_state; 55 + 56 + extern void __init tsx_init(void); 57 + extern void tsx_enable(void); 58 + extern void tsx_disable(void); 59 + #else 60 + static inline void tsx_init(void) { } 61 + #endif /* CONFIG_CPU_SUP_INTEL */ 62 + 47 63 extern void get_cpu_cap(struct cpuinfo_x86 *c); 48 64 extern void get_cpu_address_sizes(struct cpuinfo_x86 *c); 49 65 extern void cpu_detect_cache_sizes(struct cpuinfo_x86 *c); ··· 77 61 unsigned int aperfmperf_get_khz(int cpu); 78 62 79 63 extern void x86_spec_ctrl_setup_ap(void); 64 + 65 + extern u64 x86_read_arch_cap_msr(void); 80 66 81 67 #endif /* ARCH_X86_CPU_H */
+5
arch/x86/kernel/cpu/intel.c
··· 762 762 detect_tme(c); 763 763 764 764 init_intel_misc_features(c); 765 + 766 + if (tsx_ctrl_state == TSX_CTRL_ENABLE) 767 + tsx_enable(); 768 + if (tsx_ctrl_state == TSX_CTRL_DISABLE) 769 + tsx_disable(); 765 770 } 766 771 767 772 #ifdef CONFIG_X86_32
+4
arch/x86/kernel/cpu/resctrl/ctrlmondata.c
··· 522 522 int ret = 0; 523 523 524 524 rdtgrp = rdtgroup_kn_lock_live(of->kn); 525 + if (!rdtgrp) { 526 + ret = -ENOENT; 527 + goto out; 528 + } 525 529 526 530 md.priv = of->kn->priv; 527 531 resid = md.u.rid;
-4
arch/x86/kernel/cpu/resctrl/rdtgroup.c
··· 461 461 } 462 462 463 463 rdtgrp = rdtgroup_kn_lock_live(of->kn); 464 - rdt_last_cmd_clear(); 465 464 if (!rdtgrp) { 466 465 ret = -ENOENT; 467 - rdt_last_cmd_puts("Directory was removed\n"); 468 466 goto unlock; 469 467 } 470 468 ··· 2646 2648 int ret; 2647 2649 2648 2650 prdtgrp = rdtgroup_kn_lock_live(prgrp_kn); 2649 - rdt_last_cmd_clear(); 2650 2651 if (!prdtgrp) { 2651 2652 ret = -ENODEV; 2652 - rdt_last_cmd_puts("Directory was removed\n"); 2653 2653 goto out_unlock; 2654 2654 } 2655 2655
+140
arch/x86/kernel/cpu/tsx.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* 3 + * Intel Transactional Synchronization Extensions (TSX) control. 4 + * 5 + * Copyright (C) 2019 Intel Corporation 6 + * 7 + * Author: 8 + * Pawan Gupta <pawan.kumar.gupta@linux.intel.com> 9 + */ 10 + 11 + #include <linux/cpufeature.h> 12 + 13 + #include <asm/cmdline.h> 14 + 15 + #include "cpu.h" 16 + 17 + enum tsx_ctrl_states tsx_ctrl_state __ro_after_init = TSX_CTRL_NOT_SUPPORTED; 18 + 19 + void tsx_disable(void) 20 + { 21 + u64 tsx; 22 + 23 + rdmsrl(MSR_IA32_TSX_CTRL, tsx); 24 + 25 + /* Force all transactions to immediately abort */ 26 + tsx |= TSX_CTRL_RTM_DISABLE; 27 + 28 + /* 29 + * Ensure TSX support is not enumerated in CPUID. 30 + * This is visible to userspace and will ensure they 31 + * do not waste resources trying TSX transactions that 32 + * will always abort. 33 + */ 34 + tsx |= TSX_CTRL_CPUID_CLEAR; 35 + 36 + wrmsrl(MSR_IA32_TSX_CTRL, tsx); 37 + } 38 + 39 + void tsx_enable(void) 40 + { 41 + u64 tsx; 42 + 43 + rdmsrl(MSR_IA32_TSX_CTRL, tsx); 44 + 45 + /* Enable the RTM feature in the cpu */ 46 + tsx &= ~TSX_CTRL_RTM_DISABLE; 47 + 48 + /* 49 + * Ensure TSX support is enumerated in CPUID. 50 + * This is visible to userspace and will ensure they 51 + * can enumerate and use the TSX feature. 52 + */ 53 + tsx &= ~TSX_CTRL_CPUID_CLEAR; 54 + 55 + wrmsrl(MSR_IA32_TSX_CTRL, tsx); 56 + } 57 + 58 + static bool __init tsx_ctrl_is_supported(void) 59 + { 60 + u64 ia32_cap = x86_read_arch_cap_msr(); 61 + 62 + /* 63 + * TSX is controlled via MSR_IA32_TSX_CTRL. However, support for this 64 + * MSR is enumerated by ARCH_CAP_TSX_MSR bit in MSR_IA32_ARCH_CAPABILITIES. 65 + * 66 + * TSX control (aka MSR_IA32_TSX_CTRL) is only available after a 67 + * microcode update on CPUs that have their MSR_IA32_ARCH_CAPABILITIES 68 + * bit MDS_NO=1. CPUs with MDS_NO=0 are not planned to get 69 + * MSR_IA32_TSX_CTRL support even after a microcode update. Thus,
70 + tsx= cmdline requests will do nothing on CPUs without 71 + MSR_IA32_TSX_CTRL support. 72 + */ 73 + return !!(ia32_cap & ARCH_CAP_TSX_CTRL_MSR); 74 + } 75 + 76 + static enum tsx_ctrl_states x86_get_tsx_auto_mode(void) 77 + { 78 + if (boot_cpu_has_bug(X86_BUG_TAA)) 79 + return TSX_CTRL_DISABLE; 80 + 81 + return TSX_CTRL_ENABLE; 82 + } 83 + 84 + void __init tsx_init(void) 85 + { 86 + char arg[5] = {}; 87 + int ret; 88 + 89 + if (!tsx_ctrl_is_supported()) 90 + return; 91 + 92 + ret = cmdline_find_option(boot_command_line, "tsx", arg, sizeof(arg)); 93 + if (ret >= 0) { 94 + if (!strcmp(arg, "on")) { 95 + tsx_ctrl_state = TSX_CTRL_ENABLE; 96 + } else if (!strcmp(arg, "off")) { 97 + tsx_ctrl_state = TSX_CTRL_DISABLE; 98 + } else if (!strcmp(arg, "auto")) { 99 + tsx_ctrl_state = x86_get_tsx_auto_mode(); 100 + } else { 101 + tsx_ctrl_state = TSX_CTRL_DISABLE; 102 + pr_err("tsx: invalid option, defaulting to off\n"); 103 + } 104 + } else { 105 + /* tsx= not provided */ 106 + if (IS_ENABLED(CONFIG_X86_INTEL_TSX_MODE_AUTO)) 107 + tsx_ctrl_state = x86_get_tsx_auto_mode(); 108 + else if (IS_ENABLED(CONFIG_X86_INTEL_TSX_MODE_OFF)) 109 + tsx_ctrl_state = TSX_CTRL_DISABLE; 110 + else 111 + tsx_ctrl_state = TSX_CTRL_ENABLE; 112 + } 113 + 114 + if (tsx_ctrl_state == TSX_CTRL_DISABLE) { 115 + tsx_disable(); 116 + 117 + /* 118 + * tsx_disable() will change the state of the 119 + * RTM CPUID bit. Clear it here since it is now 120 + * expected to be not set. 121 + */ 122 + setup_clear_cpu_cap(X86_FEATURE_RTM); 123 + } else if (tsx_ctrl_state == TSX_CTRL_ENABLE) { 124 + 125 + /* 126 + * HW defaults TSX to be enabled at bootup. 127 + * We may still need the TSX enable support 128 + * during init for special cases like 129 + * kexec after TSX is disabled. 130 + */ 131 + tsx_enable(); 132 + 133 + /* 134 + * tsx_enable() will change the state of the 135 + * RTM CPUID bit. Force it here since it is now 136 + * expected to be set.
137 + */ 138 + setup_force_cpu_cap(X86_FEATURE_RTM); 139 + } 140 + }
+7
arch/x86/kernel/dumpstack_64.c
··· 94 94 BUILD_BUG_ON(N_EXCEPTION_STACKS != 6); 95 95 96 96 begin = (unsigned long)__this_cpu_read(cea_exception_stacks); 97 + /* 98 + * Handle the case where a stack trace is collected _before_ 99 + * cea_exception_stacks has been initialized. 100 + */ 101 + if (!begin) 102 + return false; 103 + 97 104 end = begin + sizeof(struct cea_exception_stacks); 98 105 /* Bail if @stack is outside the exception stack area. */ 99 106 if (stk < begin || stk >= end)
+2
arch/x86/kernel/early-quirks.c
··· 710 710 */ 711 711 { PCI_VENDOR_ID_INTEL, 0x0f00, 712 712 PCI_CLASS_BRIDGE_HOST, PCI_ANY_ID, 0, force_disable_hpet}, 713 + { PCI_VENDOR_ID_INTEL, 0x3ec4, 714 + PCI_CLASS_BRIDGE_HOST, PCI_ANY_ID, 0, force_disable_hpet}, 713 715 { PCI_VENDOR_ID_BROADCOM, 0x4331, 714 716 PCI_CLASS_NETWORK_OTHER, PCI_ANY_ID, 0, apple_airport_reset}, 715 717 {}
+3
arch/x86/kernel/tsc.c
··· 1505 1505 return; 1506 1506 } 1507 1507 1508 + if (tsc_clocksource_reliable || no_tsc_watchdog) 1509 + clocksource_tsc_early.flags &= ~CLOCK_SOURCE_MUST_VERIFY; 1510 + 1508 1511 clocksource_register_khz(&clocksource_tsc_early, tsc_khz); 1509 1512 detect_art(); 1510 1513 }
+272 -10
arch/x86/kvm/mmu.c
··· 37 37 #include <linux/uaccess.h> 38 38 #include <linux/hash.h> 39 39 #include <linux/kern_levels.h> 40 + #include <linux/kthread.h> 40 41 41 42 #include <asm/page.h> 42 43 #include <asm/pat.h> ··· 47 46 #include <asm/vmx.h> 48 47 #include <asm/kvm_page_track.h> 49 48 #include "trace.h" 49 + 50 + extern bool itlb_multihit_kvm_mitigation; 51 + 52 + static int __read_mostly nx_huge_pages = -1; 53 + #ifdef CONFIG_PREEMPT_RT 54 + /* Recovery can cause latency spikes, disable it for PREEMPT_RT. */ 55 + static uint __read_mostly nx_huge_pages_recovery_ratio = 0; 56 + #else 57 + static uint __read_mostly nx_huge_pages_recovery_ratio = 60; 58 + #endif 59 + 60 + static int set_nx_huge_pages(const char *val, const struct kernel_param *kp); 61 + static int set_nx_huge_pages_recovery_ratio(const char *val, const struct kernel_param *kp); 62 + 63 + static struct kernel_param_ops nx_huge_pages_ops = { 64 + .set = set_nx_huge_pages, 65 + .get = param_get_bool, 66 + }; 67 + 68 + static struct kernel_param_ops nx_huge_pages_recovery_ratio_ops = { 69 + .set = set_nx_huge_pages_recovery_ratio, 70 + .get = param_get_uint, 71 + }; 72 + 73 + module_param_cb(nx_huge_pages, &nx_huge_pages_ops, &nx_huge_pages, 0644); 74 + __MODULE_PARM_TYPE(nx_huge_pages, "bool"); 75 + module_param_cb(nx_huge_pages_recovery_ratio, &nx_huge_pages_recovery_ratio_ops, 76 + &nx_huge_pages_recovery_ratio, 0644); 77 + __MODULE_PARM_TYPE(nx_huge_pages_recovery_ratio, "uint"); 50 78 51 79 /* 52 80 * When setting this variable to true it enables Two-Dimensional-Paging ··· 380 350 { 381 351 MMU_WARN_ON(is_mmio_spte(spte)); 382 352 return (spte & SPTE_SPECIAL_MASK) != SPTE_AD_ENABLED_MASK; 353 + } 354 + 355 + static bool is_nx_huge_page_enabled(void) 356 + { 357 + return READ_ONCE(nx_huge_pages); 383 358 } 384 359 385 360 static inline u64 spte_shadow_accessed_mask(u64 spte) ··· 1225 1190 kvm_mmu_gfn_disallow_lpage(slot, gfn); 1226 1191 } 1227 1192 1193 + static void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp)
1194 + { 1195 + if (sp->lpage_disallowed) 1196 + return; 1197 + 1198 + ++kvm->stat.nx_lpage_splits; 1199 + list_add_tail(&sp->lpage_disallowed_link, 1200 + &kvm->arch.lpage_disallowed_mmu_pages); 1201 + sp->lpage_disallowed = true; 1202 + } 1203 + 1228 1204 static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp) 1229 1205 { 1230 1206 struct kvm_memslots *slots; ··· 1251 1205 KVM_PAGE_TRACK_WRITE); 1252 1206 1253 1207 kvm_mmu_gfn_allow_lpage(slot, gfn); 1208 + } 1209 + 1210 + static void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp) 1211 + { 1212 + --kvm->stat.nx_lpage_splits; 1213 + sp->lpage_disallowed = false; 1214 + list_del(&sp->lpage_disallowed_link); 1254 1215 } 1255 1216 1256 1217 static bool __mmu_gfn_lpage_is_disallowed(gfn_t gfn, int level, ··· 2845 2792 kvm_reload_remote_mmus(kvm); 2846 2793 } 2847 2794 2795 + if (sp->lpage_disallowed) 2796 + unaccount_huge_nx_page(kvm, sp); 2797 + 2848 2798 sp->role.invalid = 1; 2849 2799 return list_unstable; 2850 2800 } ··· 3069 3013 if (!speculative) 3070 3014 spte |= spte_shadow_accessed_mask(spte); 3071 3015 3016 + if (level > PT_PAGE_TABLE_LEVEL && (pte_access & ACC_EXEC_MASK) && 3017 + is_nx_huge_page_enabled()) { 3018 + pte_access &= ~ACC_EXEC_MASK; 3019 + } 3020 + 3072 3021 if (pte_access & ACC_EXEC_MASK) 3073 3022 spte |= shadow_x_mask; 3074 3023 else ··· 3294 3233 __direct_pte_prefetch(vcpu, sp, sptep); 3295 3234 } 3296 3235 3236 + static void disallowed_hugepage_adjust(struct kvm_shadow_walk_iterator it, 3237 + gfn_t gfn, kvm_pfn_t *pfnp, int *levelp) 3238 + { 3239 + int level = *levelp; 3240 + u64 spte = *it.sptep; 3241 + 3242 + if (it.level == level && level > PT_PAGE_TABLE_LEVEL && 3243 + is_nx_huge_page_enabled() && 3244 + is_shadow_present_pte(spte) && 3245 + !is_large_pte(spte)) { 3246 + /* 3247 + * A small SPTE exists for this pfn, but FNAME(fetch) 3248 + * and __direct_map would like to create a large PTE 3249 + * instead: just force them to go down another level,
3250 + patching back for them into pfn the next 9 bits of 3251 + the address. 3252 + */ 3253 + u64 page_mask = KVM_PAGES_PER_HPAGE(level) - KVM_PAGES_PER_HPAGE(level - 1); 3254 + *pfnp |= gfn & page_mask; 3255 + (*levelp)--; 3256 + } 3257 + } 3258 + 3297 3259 static int __direct_map(struct kvm_vcpu *vcpu, gpa_t gpa, int write, 3298 3260 int map_writable, int level, kvm_pfn_t pfn, 3299 - bool prefault) 3261 + bool prefault, bool lpage_disallowed) 3300 3262 { 3301 3263 struct kvm_shadow_walk_iterator it; 3302 3264 struct kvm_mmu_page *sp; ··· 3332 3248 3333 3249 trace_kvm_mmu_spte_requested(gpa, level, pfn); 3334 3250 for_each_shadow_entry(vcpu, gpa, it) { 3251 + /* 3252 + * We cannot overwrite existing page tables with an NX 3253 + * large page, as the leaf could be executable. 3254 + */ 3255 + disallowed_hugepage_adjust(it, gfn, &pfn, &level); 3256 + 3335 3257 base_gfn = gfn & ~(KVM_PAGES_PER_HPAGE(it.level) - 1); 3336 3258 if (it.level == level) 3337 3259 break; ··· 3348 3258 it.level - 1, true, ACC_ALL); 3349 3259 3350 3260 link_shadow_page(vcpu, it.sptep, sp); 3261 + if (lpage_disallowed) 3262 + account_huge_nx_page(vcpu->kvm, sp); 3351 3263 } 3352 3264 } 3353 3265 ··· 3398 3306 * here.
3399 3307 */ 3400 3308 if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn) && 3401 - level == PT_PAGE_TABLE_LEVEL && 3309 + !kvm_is_zone_device_pfn(pfn) && level == PT_PAGE_TABLE_LEVEL && 3402 3310 PageTransCompoundMap(pfn_to_page(pfn)) && 3403 3311 !mmu_gfn_lpage_is_disallowed(vcpu, gfn, PT_DIRECTORY_LEVEL)) { 3404 3312 unsigned long mask; ··· 3642 3550 { 3643 3551 int r; 3644 3552 int level; 3645 - bool force_pt_level = false; 3553 + bool force_pt_level; 3646 3554 kvm_pfn_t pfn; 3647 3555 unsigned long mmu_seq; 3648 3556 bool map_writable, write = error_code & PFERR_WRITE_MASK; 3557 + bool lpage_disallowed = (error_code & PFERR_FETCH_MASK) && 3558 + is_nx_huge_page_enabled(); 3649 3559 3560 + force_pt_level = lpage_disallowed; 3650 3561 level = mapping_level(vcpu, gfn, &force_pt_level); 3651 3562 if (likely(!force_pt_level)) { 3652 3563 /* ··· 3683 3588 goto out_unlock; 3684 3589 if (likely(!force_pt_level)) 3685 3590 transparent_hugepage_adjust(vcpu, gfn, &pfn, &level); 3686 - r = __direct_map(vcpu, v, write, map_writable, level, pfn, prefault); 3591 + r = __direct_map(vcpu, v, write, map_writable, level, pfn, 3592 + prefault, false); 3687 3593 out_unlock: 3688 3594 spin_unlock(&vcpu->kvm->mmu_lock); 3689 3595 kvm_release_pfn_clean(pfn); ··· 4270 4174 unsigned long mmu_seq; 4271 4175 int write = error_code & PFERR_WRITE_MASK; 4272 4176 bool map_writable; 4177 + bool lpage_disallowed = (error_code & PFERR_FETCH_MASK) && 4178 + is_nx_huge_page_enabled(); 4273 4179 4274 4180 MMU_WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root_hpa)); 4275 4181 ··· 4282 4184 if (r) 4283 4185 return r; 4284 4186 4285 - force_pt_level = !check_hugepage_cache_consistency(vcpu, gfn, 4286 - PT_DIRECTORY_LEVEL); 4187 + force_pt_level = 4188 + lpage_disallowed || 4189 + !check_hugepage_cache_consistency(vcpu, gfn, PT_DIRECTORY_LEVEL); 4287 4190 level = mapping_level(vcpu, gfn, &force_pt_level); 4288 4191 if (likely(!force_pt_level)) { 4289 4192 if (level > PT_DIRECTORY_LEVEL && ··· 4313 4214 goto out_unlock;
4314 4215 if (likely(!force_pt_level)) 4315 4216 transparent_hugepage_adjust(vcpu, gfn, &pfn, &level); 4316 - r = __direct_map(vcpu, gpa, write, map_writable, level, pfn, prefault); 4217 + r = __direct_map(vcpu, gpa, write, map_writable, level, pfn, 4218 + prefault, lpage_disallowed); 4317 4219 out_unlock: 4318 4220 spin_unlock(&vcpu->kvm->mmu_lock); 4319 4221 kvm_release_pfn_clean(pfn); ··· 6014 5914 * the guest, and the guest page table is using 4K page size 6015 5915 * mapping if the indirect sp has level = 1. 6016 5916 */ 6017 - if (sp->role.direct && 6018 - !kvm_is_reserved_pfn(pfn) && 6019 - PageTransCompoundMap(pfn_to_page(pfn))) { 5917 + if (sp->role.direct && !kvm_is_reserved_pfn(pfn) && 5918 + !kvm_is_zone_device_pfn(pfn) && 5919 + PageTransCompoundMap(pfn_to_page(pfn))) { 6020 5920 pte_list_remove(rmap_head, sptep); 6021 5921 6022 5922 if (kvm_available_flush_tlb_with_range()) ··· 6255 6155 kvm_mmu_set_mmio_spte_mask(mask, mask, ACC_WRITE_MASK | ACC_USER_MASK); 6256 6156 } 6257 6157 6158 + static bool get_nx_auto_mode(void) 6159 + { 6160 + /* Return true when CPU has the bug, and mitigations are ON */ 6161 + return boot_cpu_has_bug(X86_BUG_ITLB_MULTIHIT) && !cpu_mitigations_off(); 6162 + } 6163 + 6164 + static void __set_nx_huge_pages(bool val) 6165 + { 6166 + nx_huge_pages = itlb_multihit_kvm_mitigation = val; 6167 + } 6168 + 6169 + static int set_nx_huge_pages(const char *val, const struct kernel_param *kp) 6170 + { 6171 + bool old_val = nx_huge_pages; 6172 + bool new_val; 6173 + 6174 + /* In "auto" mode deploy workaround only if CPU has the bug. */
6175 + if (sysfs_streq(val, "off")) 6176 + new_val = 0; 6177 + else if (sysfs_streq(val, "force")) 6178 + new_val = 1; 6179 + else if (sysfs_streq(val, "auto")) 6180 + new_val = get_nx_auto_mode(); 6181 + else if (strtobool(val, &new_val) < 0) 6182 + return -EINVAL; 6183 + 6184 + __set_nx_huge_pages(new_val); 6185 + 6186 + if (new_val != old_val) { 6187 + struct kvm *kvm; 6188 + 6189 + mutex_lock(&kvm_lock); 6190 + 6191 + list_for_each_entry(kvm, &vm_list, vm_list) { 6192 + mutex_lock(&kvm->slots_lock); 6193 + kvm_mmu_zap_all_fast(kvm); 6194 + mutex_unlock(&kvm->slots_lock); 6195 + 6196 + wake_up_process(kvm->arch.nx_lpage_recovery_thread); 6197 + } 6198 + mutex_unlock(&kvm_lock); 6199 + } 6200 + 6201 + return 0; 6202 + } 6203 + 6258 6204 int kvm_mmu_module_init(void) 6259 6205 { 6260 6206 int ret = -ENOMEM; 6207 + 6208 + if (nx_huge_pages == -1) 6209 + __set_nx_huge_pages(get_nx_auto_mode()); 6261 6210 6262 6211 /* 6263 6212 * MMU roles use union aliasing which is, generally speaking, an ··· 6386 6237 percpu_counter_destroy(&kvm_total_used_mmu_pages); 6387 6238 unregister_shrinker(&mmu_shrinker); 6388 6239 mmu_audit_disable(); 6240 + } 6241 + 6242 + static int set_nx_huge_pages_recovery_ratio(const char *val, const struct kernel_param *kp) 6243 + { 6244 + unsigned int old_val; 6245 + int err; 6246 + 6247 + old_val = nx_huge_pages_recovery_ratio; 6248 + err = param_set_uint(val, kp); 6249 + if (err) 6250 + return err; 6251 + 6252 + if (READ_ONCE(nx_huge_pages) && 6253 + !old_val && nx_huge_pages_recovery_ratio) { 6254 + struct kvm *kvm; 6255 + 6256 + mutex_lock(&kvm_lock); 6257 + 6258 + list_for_each_entry(kvm, &vm_list, vm_list) 6259 + wake_up_process(kvm->arch.nx_lpage_recovery_thread); 6260 + 6261 + mutex_unlock(&kvm_lock); 6262 + } 6263 + 6264 + return err; 6265 + } 6266 + 6267 + static void kvm_recover_nx_lpages(struct kvm *kvm) 6268 + { 6269 + int rcu_idx; 6270 + struct kvm_mmu_page *sp; 6271 + unsigned int ratio; 6272 + LIST_HEAD(invalid_list); 6273 + ulong to_zap;
6274 + 6275 + rcu_idx = srcu_read_lock(&kvm->srcu); 6276 + spin_lock(&kvm->mmu_lock); 6277 + 6278 + ratio = READ_ONCE(nx_huge_pages_recovery_ratio); 6279 + to_zap = ratio ? DIV_ROUND_UP(kvm->stat.nx_lpage_splits, ratio) : 0; 6280 + while (to_zap && !list_empty(&kvm->arch.lpage_disallowed_mmu_pages)) { 6281 + /* 6282 + * We use a separate list instead of just using active_mmu_pages 6283 + * because the number of lpage_disallowed pages is expected to 6284 + * be relatively small compared to the total. 6285 + */ 6286 + sp = list_first_entry(&kvm->arch.lpage_disallowed_mmu_pages, 6287 + struct kvm_mmu_page, 6288 + lpage_disallowed_link); 6289 + WARN_ON_ONCE(!sp->lpage_disallowed); 6290 + kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list); 6291 + WARN_ON_ONCE(sp->lpage_disallowed); 6292 + 6293 + if (!--to_zap || need_resched() || spin_needbreak(&kvm->mmu_lock)) { 6294 + kvm_mmu_commit_zap_page(kvm, &invalid_list); 6295 + if (to_zap) 6296 + cond_resched_lock(&kvm->mmu_lock); 6297 + } 6298 + } 6299 + 6300 + spin_unlock(&kvm->mmu_lock); 6301 + srcu_read_unlock(&kvm->srcu, rcu_idx); 6302 + } 6303 + 6304 + static long get_nx_lpage_recovery_timeout(u64 start_time) 6305 + { 6306 + return READ_ONCE(nx_huge_pages) && READ_ONCE(nx_huge_pages_recovery_ratio) 6307 + ? start_time + 60 * HZ - get_jiffies_64()
6308 + : MAX_SCHEDULE_TIMEOUT; 6309 + } 6310 + 6311 + static int kvm_nx_lpage_recovery_worker(struct kvm *kvm, uintptr_t data) 6312 + { 6313 + u64 start_time; 6314 + long remaining_time; 6315 + 6316 + while (true) { 6317 + start_time = get_jiffies_64(); 6318 + remaining_time = get_nx_lpage_recovery_timeout(start_time); 6319 + 6320 + set_current_state(TASK_INTERRUPTIBLE); 6321 + while (!kthread_should_stop() && remaining_time > 0) { 6322 + schedule_timeout(remaining_time); 6323 + remaining_time = get_nx_lpage_recovery_timeout(start_time); 6324 + set_current_state(TASK_INTERRUPTIBLE); 6325 + } 6326 + 6327 + set_current_state(TASK_RUNNING); 6328 + 6329 + if (kthread_should_stop()) 6330 + return 0; 6331 + 6332 + kvm_recover_nx_lpages(kvm); 6333 + } 6334 + } 6335 + 6336 + int kvm_mmu_post_init_vm(struct kvm *kvm) 6337 + { 6338 + int err; 6339 + 6340 + err = kvm_vm_create_worker_thread(kvm, kvm_nx_lpage_recovery_worker, 0, 6341 + "kvm-nx-lpage-recovery", 6342 + &kvm->arch.nx_lpage_recovery_thread); 6343 + if (!err) 6344 + kthread_unpark(kvm->arch.nx_lpage_recovery_thread); 6345 + 6346 + return err; 6347 + } 6348 + 6349 + void kvm_mmu_pre_destroy_vm(struct kvm *kvm) 6350 + { 6351 + if (kvm->arch.nx_lpage_recovery_thread) 6352 + kthread_stop(kvm->arch.nx_lpage_recovery_thread); 6389 6353 }
+4
arch/x86/kvm/mmu.h
··· 210 210 bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm, 211 211 struct kvm_memory_slot *slot, u64 gfn); 212 212 int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu); 213 + 214 + int kvm_mmu_post_init_vm(struct kvm *kvm); 215 + void kvm_mmu_pre_destroy_vm(struct kvm *kvm); 216 + 213 217 #endif
+23 -6
arch/x86/kvm/paging_tmpl.h
··· 614 614 static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr, 615 615 struct guest_walker *gw, 616 616 int write_fault, int hlevel, 617 - kvm_pfn_t pfn, bool map_writable, bool prefault) 617 + kvm_pfn_t pfn, bool map_writable, bool prefault, 618 + bool lpage_disallowed) 618 619 { 619 620 struct kvm_mmu_page *sp = NULL; 620 621 struct kvm_shadow_walk_iterator it; 621 622 unsigned direct_access, access = gw->pt_access; 622 623 int top_level, ret; 623 - gfn_t base_gfn; 624 + gfn_t gfn, base_gfn; 624 625 625 626 direct_access = gw->pte_access; 626 627 ··· 666 665 link_shadow_page(vcpu, it.sptep, sp); 667 666 } 668 667 669 - base_gfn = gw->gfn; 668 + /* 669 + * FNAME(page_fault) might have clobbered the bottom bits of 670 + * gw->gfn, restore them from the virtual address. 671 + */ 672 + gfn = gw->gfn | ((addr & PT_LVL_OFFSET_MASK(gw->level)) >> PAGE_SHIFT); 673 + base_gfn = gfn; 670 674 671 675 trace_kvm_mmu_spte_requested(addr, gw->level, pfn); 672 676 673 677 for (; shadow_walk_okay(&it); shadow_walk_next(&it)) { 674 678 clear_sp_write_flooding_count(it.sptep); 675 - base_gfn = gw->gfn & ~(KVM_PAGES_PER_HPAGE(it.level) - 1); 679 + 680 + /* 681 + * We cannot overwrite existing page tables with an NX 682 + * large page, as the leaf could be executable. 
683 + */ 684 + disallowed_hugepage_adjust(it, gfn, &pfn, &hlevel); 685 + 686 + base_gfn = gfn & ~(KVM_PAGES_PER_HPAGE(it.level) - 1); 676 687 if (it.level == hlevel) 677 688 break; 678 689 ··· 696 683 sp = kvm_mmu_get_page(vcpu, base_gfn, addr, 697 684 it.level - 1, true, direct_access); 698 685 link_shadow_page(vcpu, it.sptep, sp); 686 + if (lpage_disallowed) 687 + account_huge_nx_page(vcpu->kvm, sp); 699 688 } 700 689 } 701 690 ··· 774 759 int r; 775 760 kvm_pfn_t pfn; 776 761 int level = PT_PAGE_TABLE_LEVEL; 777 - bool force_pt_level = false; 778 762 unsigned long mmu_seq; 779 763 bool map_writable, is_self_change_mapping; 764 + bool lpage_disallowed = (error_code & PFERR_FETCH_MASK) && 765 + is_nx_huge_page_enabled(); 766 + bool force_pt_level = lpage_disallowed; 780 767 781 768 pgprintk("%s: addr %lx err %x\n", __func__, addr, error_code); 782 769 ··· 868 851 if (!force_pt_level) 869 852 transparent_hugepage_adjust(vcpu, walker.gfn, &pfn, &level); 870 853 r = FNAME(fetch)(vcpu, addr, &walker, write_fault, 871 - level, pfn, map_writable, prefault); 854 + level, pfn, map_writable, prefault, lpage_disallowed); 872 855 kvm_mmu_audit(vcpu, AUDIT_POST_PAGE_FAULT); 873 856 874 857 out_unlock:
+20 -3
arch/x86/kvm/vmx/vmx.c
··· 1268 1268 if (!pi_test_sn(pi_desc) && vcpu->cpu == cpu) 1269 1269 return; 1270 1270 1271 + /* 1272 + * If the 'nv' field is POSTED_INTR_WAKEUP_VECTOR, do not change 1273 + * PI.NDST: pi_post_block is the one expected to change PID.NDST and the 1274 + * wakeup handler expects the vCPU to be on the blocked_vcpu_list that 1275 + * matches PI.NDST. Otherwise, a vcpu may not be able to be woken up 1276 + * correctly. 1277 + */ 1278 + if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR || vcpu->cpu == cpu) { 1279 + pi_clear_sn(pi_desc); 1280 + goto after_clear_sn; 1281 + } 1282 + 1271 1283 /* The full case. */ 1272 1284 do { 1273 1285 old.control = new.control = pi_desc->control; ··· 1295 1283 } while (cmpxchg64(&pi_desc->control, old.control, 1296 1284 new.control) != old.control); 1297 1285 1286 + after_clear_sn: 1287 + 1298 1288 /* 1299 1289 * Clear SN before reading the bitmap. The VT-d firmware 1300 1290 * writes the bitmap and reads SN atomically (5.2.3 in the ··· 1305 1291 */ 1306 1292 smp_mb__after_atomic(); 1307 1293 1308 - if (!bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS)) 1294 + if (!pi_is_pir_empty(pi_desc)) 1309 1295 pi_set_on(pi_desc); 1310 1296 } 1311 1297 ··· 6151 6137 if (pi_test_on(&vmx->pi_desc)) { 6152 6138 pi_clear_on(&vmx->pi_desc); 6153 6139 /* 6154 - * IOMMU can write to PIR.ON, so the barrier matters even on UP. 6140 + * IOMMU can write to PID.ON, so the barrier matters even on UP. 6155 6141 * But on x86 this is just a compiler barrier anyway. 6156 6142 */ 6157 6143 smp_mb__after_atomic(); ··· 6181 6167 6182 6168 static bool vmx_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu) 6183 6169 { 6184 - return pi_test_on(vcpu_to_pi_desc(vcpu)); 6170 + struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu); 6171 + 6172 + return pi_test_on(pi_desc) || 6173 + (pi_test_sn(pi_desc) && !pi_is_pir_empty(pi_desc)); 6185 6174 } 6186 6175 6187 6176 static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
+11
arch/x86/kvm/vmx/vmx.h
··· 355 355 return test_and_set_bit(vector, (unsigned long *)pi_desc->pir); 356 356 } 357 357 358 + static inline bool pi_is_pir_empty(struct pi_desc *pi_desc) 359 + { 360 + return bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS); 361 + } 362 + 358 363 static inline void pi_set_sn(struct pi_desc *pi_desc) 359 364 { 360 365 set_bit(POSTED_INTR_SN, ··· 375 370 static inline void pi_clear_on(struct pi_desc *pi_desc) 376 371 { 377 372 clear_bit(POSTED_INTR_ON, 373 + (unsigned long *)&pi_desc->control); 374 + } 375 + 376 + static inline void pi_clear_sn(struct pi_desc *pi_desc) 377 + { 378 + clear_bit(POSTED_INTR_SN, 378 379 (unsigned long *)&pi_desc->control); 379 380 } 380 381
+69 -30
arch/x86/kvm/x86.c
··· 213 213 { "mmu_unsync", VM_STAT(mmu_unsync) }, 214 214 { "remote_tlb_flush", VM_STAT(remote_tlb_flush) }, 215 215 { "largepages", VM_STAT(lpages, .mode = 0444) }, 216 + { "nx_largepages_splitted", VM_STAT(nx_lpage_splits, .mode = 0444) }, 216 217 { "max_mmu_page_hash_collisions", 217 218 VM_STAT(max_mmu_page_hash_collisions) }, 218 219 { NULL } ··· 1133 1132 * List of msr numbers which we expose to userspace through KVM_GET_MSRS 1134 1133 * and KVM_SET_MSRS, and KVM_GET_MSR_INDEX_LIST. 1135 1134 * 1136 - * This list is modified at module load time to reflect the 1135 + * The three MSR lists (msrs_to_save, emulated_msrs, msr_based_features) 1136 + * extract the supported MSRs from the related const lists. 1137 + * msrs_to_save is selected from the msrs_to_save_all to reflect the 1137 1138 * capabilities of the host cpu. This capabilities test skips MSRs that are 1138 - * kvm-specific. Those are put in emulated_msrs; filtering of emulated_msrs 1139 + * kvm-specific. Those are put in emulated_msrs_all; filtering of emulated_msrs 1139 1140 * may depend on host virtualization features rather than host cpu features. 1140 1141 */ 1141 1142 1142 - static u32 msrs_to_save[] = { 1143 + static const u32 msrs_to_save_all[] = { 1143 1144 MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP, 1144 1145 MSR_STAR, 1145 1146 #ifdef CONFIG_X86_64 ··· 1182 1179 MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17, 1183 1180 }; 1184 1181 1182 + static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_all)]; 1185 1183 static unsigned num_msrs_to_save; 1186 1184 1187 - static u32 emulated_msrs[] = { 1185 + static const u32 emulated_msrs_all[] = { 1188 1186 MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK, 1189 1187 MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW, 1190 1188 HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, ··· 1224 1220 * by arch/x86/kvm/vmx/nested.c based on CPUID or other MSRs.
1225 1221 * We always support the "true" VMX control MSRs, even if the host 1226 1222 * processor does not, so I am putting these registers here rather 1227 - * than in msrs_to_save. 1223 + * than in msrs_to_save_all. 1228 1224 */ 1229 1225 MSR_IA32_VMX_BASIC, 1230 1226 MSR_IA32_VMX_TRUE_PINBASED_CTLS, ··· 1243 1239 MSR_KVM_POLL_CONTROL, 1244 1240 }; 1245 1241 1242 + static u32 emulated_msrs[ARRAY_SIZE(emulated_msrs_all)]; 1246 1243 static unsigned num_emulated_msrs; 1247 1244 1248 1245 /* 1249 1246 * List of msr numbers which are used to expose MSR-based features that 1250 1247 * can be used by a hypervisor to validate requested CPU features. 1251 1248 */ 1252 - static u32 msr_based_features[] = { 1249 + static const u32 msr_based_features_all[] = { 1253 1250 MSR_IA32_VMX_BASIC, 1254 1251 MSR_IA32_VMX_TRUE_PINBASED_CTLS, 1255 1252 MSR_IA32_VMX_PINBASED_CTLS, ··· 1275 1270 MSR_IA32_ARCH_CAPABILITIES, 1276 1271 }; 1277 1272 1273 + static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all)]; 1278 1274 static unsigned int num_msr_based_features; 1279 1275 1280 1276 static u64 kvm_get_arch_capabilities(void) ··· 1284 1278 1285 1279 if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES)) 1286 1280 rdmsrl(MSR_IA32_ARCH_CAPABILITIES, data); 1281 + 1282 + /* 1283 + * If nx_huge_pages is enabled, KVM's shadow paging will ensure that 1284 + * the nested hypervisor runs with NX huge pages. If it is not, 1285 + * L1 is anyway vulnerable to ITLB_MULTIHIT exploits from other 1286 + * L1 guests, so it need not worry about its own (L2) guests. 1287 + */ 1288 + data |= ARCH_CAP_PSCHANGE_MC_NO; 1287 1289 1288 1290 /* 1289 1291 * If we're doing cache flushes (either "always" or "cond") ··· 1311 1297 data |= ARCH_CAP_SSB_NO; 1312 1298 if (!boot_cpu_has_bug(X86_BUG_MDS)) 1313 1299 data |= ARCH_CAP_MDS_NO; 1300 + 1301 + /* 1302 + * On TAA affected systems, export MDS_NO=0 when: 1303 + * - TSX is enabled on the host, i.e. X86_FEATURE_RTM=1. 1304 + * - Updated microcode is present. This is detected by
1305 + the presence of ARCH_CAP_TSX_CTRL_MSR and ensures 1306 + that VERW clears CPU buffers. 1307 + * 1308 + * When MDS_NO=0 is exported, guests deploy clear CPU buffer 1309 + mitigation and don't complain: 1310 + * 1311 + * "Vulnerable: Clear CPU buffers attempted, no microcode" 1312 + * 1313 + * If TSX is disabled on the system, guests are also mitigated against 1314 + * TAA and clear CPU buffer mitigation is not required for guests. 1315 + */ 1316 + if (boot_cpu_has_bug(X86_BUG_TAA) && boot_cpu_has(X86_FEATURE_RTM) && 1317 + (data & ARCH_CAP_TSX_CTRL_MSR)) 1318 + data &= ~ARCH_CAP_MDS_NO; 1314 1319 1315 1320 return data; 1316 1321 } ··· 5123 5090 { 5124 5091 struct x86_pmu_capability x86_pmu; 5125 5092 u32 dummy[2]; 5126 - unsigned i, j; 5093 + unsigned i; 5127 5094 5128 5095 BUILD_BUG_ON_MSG(INTEL_PMC_MAX_FIXED != 4, 5129 - "Please update the fixed PMCs in msrs_to_save[]"); 5096 + "Please update the fixed PMCs in msrs_to_save_all[]"); 5130 5097 5131 5098 perf_get_x86_pmu_capability(&x86_pmu); 5132 5099 5133 - for (i = j = 0; i < ARRAY_SIZE(msrs_to_save); i++) { 5134 - if (rdmsr_safe(msrs_to_save[i], &dummy[0], &dummy[1]) < 0) 5100 + num_msrs_to_save = 0; 5101 + num_emulated_msrs = 0; 5102 + num_msr_based_features = 0; 5103 + 5104 + for (i = 0; i < ARRAY_SIZE(msrs_to_save_all); i++) { 5105 + if (rdmsr_safe(msrs_to_save_all[i], &dummy[0], &dummy[1]) < 0) 5135 5106 continue; 5136 5107 5137 5108 /* 5138 5109 * Even MSRs that are valid in the host may not be exposed 5139 5110 * to the guests in some cases. 5140 5111 */ 5141 5112 switch (msrs_to_save_all[i]) { 5142 5113 case MSR_IA32_BNDCFGS: 5143 5114 if (!kvm_mpx_supported()) 5144 5115 continue; ··· 5170 5133 break; 5171 5134 case MSR_IA32_RTIT_ADDR0_A ... MSR_IA32_RTIT_ADDR3_B: {
5172 5135 if (!kvm_x86_ops->pt_supported() || 5173 - msrs_to_save[i] - MSR_IA32_RTIT_ADDR0_A >= 5136 + msrs_to_save_all[i] - MSR_IA32_RTIT_ADDR0_A >= 5174 5137 intel_pt_validate_hw_cap(PT_CAP_num_address_ranges) * 2) 5175 5138 continue; 5176 5139 break; 5177 5140 case MSR_ARCH_PERFMON_PERFCTR0 ... MSR_ARCH_PERFMON_PERFCTR0 + 17: 5178 - if (msrs_to_save[i] - MSR_ARCH_PERFMON_PERFCTR0 >= 5141 + if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_PERFCTR0 >= 5179 5142 min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp)) 5180 5143 continue; 5181 5144 break; 5182 5145 case MSR_ARCH_PERFMON_EVENTSEL0 ... MSR_ARCH_PERFMON_EVENTSEL0 + 17: 5183 - if (msrs_to_save[i] - MSR_ARCH_PERFMON_EVENTSEL0 >= 5146 + if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >= 5184 5147 min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp)) 5185 5148 continue; 5186 5149 } ··· 5188 5151 break; 5189 5152 } 5190 5153 5191 - if (j < i) 5192 - msrs_to_save[j] = msrs_to_save[i]; 5193 - j++; 5154 + msrs_to_save[num_msrs_to_save++] = msrs_to_save_all[i]; 5194 5155 } 5195 - num_msrs_to_save = j; 5196 5156 5197 - for (i = j = 0; i < ARRAY_SIZE(emulated_msrs); i++) { 5198 - if (!kvm_x86_ops->has_emulated_msr(emulated_msrs[i])) 5157 + for (i = 0; i < ARRAY_SIZE(emulated_msrs_all); i++) { 5158 + if (!kvm_x86_ops->has_emulated_msr(emulated_msrs_all[i])) 5199 5159 continue; 5200 5160 5201 - if (j < i) 5202 - emulated_msrs[j] = emulated_msrs[i]; 5203 - j++; 5161 + emulated_msrs[num_emulated_msrs++] = emulated_msrs_all[i]; 5204 5162 } 5205 - num_emulated_msrs = j; 5206 5163 5207 - for (i = j = 0; i < ARRAY_SIZE(msr_based_features); i++) { 5164 + for (i = 0; i < ARRAY_SIZE(msr_based_features_all); i++) { 5208 5165 struct kvm_msr_entry msr; 5209 5166 5210 - msr.index = msr_based_features[i]; 5167 + msr.index = msr_based_features_all[i]; 5211 5168 if (kvm_get_msr_feature(&msr)) 5212 5169 continue; 5213 5170 5214 - if (j < i) 5215 - msr_based_features[j] = msr_based_features[i]; 5216 - j++; 5171 + msr_based_features[num_msr_based_features++] = msr_based_features_all[i];
5217 5172 } 5218 - num_msr_based_features = j; 5219 5173 } 5220 5174 5221 5175 static int vcpu_mmio_write(struct kvm_vcpu *vcpu, gpa_t addr, int len, ··· 9456 9428 INIT_HLIST_HEAD(&kvm->arch.mask_notifier_list); 9457 9429 INIT_LIST_HEAD(&kvm->arch.active_mmu_pages); 9458 9430 INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages); 9431 + INIT_LIST_HEAD(&kvm->arch.lpage_disallowed_mmu_pages); 9459 9432 INIT_LIST_HEAD(&kvm->arch.assigned_dev_head); 9460 9433 atomic_set(&kvm->arch.noncoherent_dma_count, 0); 9461 9434 ··· 9483 9454 kvm_mmu_init_vm(kvm); 9484 9455 9485 9456 return kvm_x86_ops->vm_init(kvm); 9457 + } 9458 + 9459 + int kvm_arch_post_init_vm(struct kvm *kvm) 9460 + { 9461 + return kvm_mmu_post_init_vm(kvm); 9486 9462 } 9487 9463 9488 9464 static void kvm_unload_vcpu_mmu(struct kvm_vcpu *vcpu) ··· 9590 9556 return r; 9591 9557 } 9592 9558 EXPORT_SYMBOL_GPL(x86_set_memory_region); 9559 + 9560 + void kvm_arch_pre_destroy_vm(struct kvm *kvm) 9561 + { 9562 + kvm_mmu_pre_destroy_vm(kvm); 9563 + } 9593 9564 9594 9565 void kvm_arch_destroy_vm(struct kvm *kvm) 9595 9566 {
+26 -6
block/bfq-iosched.c
··· 2713 2713 } 2714 2714 } 2715 2715 2716 + 2717 + static 2718 + void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq) 2719 + { 2720 + /* 2721 + * To prevent bfqq's service guarantees from being violated, 2722 + * bfqq may be left busy, i.e., queued for service, even if 2723 + * empty (see comments in __bfq_bfqq_expire() for 2724 + * details). But, if no process will send requests to bfqq any 2725 + * longer, then there is no point in keeping bfqq queued for 2726 + * service. In addition, keeping bfqq queued for service, but 2727 + * with no process ref any longer, may have caused bfqq to be 2728 + * freed when dequeued from service. But this is assumed to 2729 + * never happen. 2730 + */ 2731 + if (bfq_bfqq_busy(bfqq) && RB_EMPTY_ROOT(&bfqq->sort_list) && 2732 + bfqq != bfqd->in_service_queue) 2733 + bfq_del_bfqq_busy(bfqd, bfqq, false); 2734 + 2735 + bfq_put_queue(bfqq); 2736 + } 2737 + 2716 2738 static void 2717 2739 bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic, 2718 2740 struct bfq_queue *bfqq, struct bfq_queue *new_bfqq) ··· 2805 2783 */ 2806 2784 new_bfqq->pid = -1; 2807 2785 bfqq->bic = NULL; 2808 - /* release process reference to bfqq */ 2809 - bfq_put_queue(bfqq); 2786 + bfq_release_process_ref(bfqd, bfqq); 2810 2787 } 2811 2788 2812 2789 static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq, ··· 4920 4899 4921 4900 bfq_put_cooperator(bfqq); 4922 4901 4923 - bfq_put_queue(bfqq); /* release process reference */ 4902 + bfq_release_process_ref(bfqd, bfqq); 4924 4903 } 4925 4904 4926 4905 static void bfq_exit_icq_bfqq(struct bfq_io_cq *bic, bool is_sync) ··· 5022 5001 5023 5002 bfqq = bic_to_bfqq(bic, false); 5024 5003 if (bfqq) { 5025 - /* release process reference on this queue */ 5026 - bfq_put_queue(bfqq); 5004 + bfq_release_process_ref(bfqd, bfqq); 5027 5005 bfqq = bfq_get_queue(bfqd, bio, BLK_RW_ASYNC, bic); 5028 5006 bic_set_bfqq(bic, bfqq, false); 5029 5007 } ··· 5983 5963 5984 5964 
bfq_put_cooperator(bfqq); 5985 5965 5986 - bfq_put_queue(bfqq); 5966 + bfq_release_process_ref(bfqq->bfqd, bfqq); 5987 5967 return NULL; 5988 5968 } 5989 5969
+1 -1
block/bio.c
··· 751 751 if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED))) 752 752 return false; 753 753 754 - if (bio->bi_vcnt > 0) { 754 + if (bio->bi_vcnt > 0 && !bio_full(bio, len)) { 755 755 struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1]; 756 756 757 757 if (page_is_mergeable(bv, page, len, off, same_page)) {
+6 -2
block/blk-iocost.c
··· 1057 1057 atomic64_set(&iocg->active_period, cur_period); 1058 1058 1059 1059 /* already activated or breaking leaf-only constraint? */ 1060 - for (i = iocg->level; i > 0; i--) 1061 - if (!list_empty(&iocg->active_list)) 1060 + if (!list_empty(&iocg->active_list)) 1061 + goto succeed_unlock; 1062 + for (i = iocg->level - 1; i > 0; i--) 1063 + if (!list_empty(&iocg->ancestors[i]->active_list)) 1062 1064 goto fail_unlock; 1065 + 1063 1066 if (iocg->child_active_sum) 1064 1067 goto fail_unlock; 1065 1068 ··· 1104 1101 ioc_start_period(ioc, now); 1105 1102 } 1106 1103 1104 + succeed_unlock: 1107 1105 spin_unlock_irq(&ioc->lock); 1108 1106 return true; 1109 1107
+17
drivers/base/cpu.c
··· 554 554 return sprintf(buf, "Not affected\n"); 555 555 } 556 556 557 + ssize_t __weak cpu_show_tsx_async_abort(struct device *dev, 558 + struct device_attribute *attr, 559 + char *buf) 560 + { 561 + return sprintf(buf, "Not affected\n"); 562 + } 563 + 564 + ssize_t __weak cpu_show_itlb_multihit(struct device *dev, 565 + struct device_attribute *attr, char *buf) 566 + { 567 + return sprintf(buf, "Not affected\n"); 568 + } 569 + 557 570 static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL); 558 571 static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL); 559 572 static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL); 560 573 static DEVICE_ATTR(spec_store_bypass, 0444, cpu_show_spec_store_bypass, NULL); 561 574 static DEVICE_ATTR(l1tf, 0444, cpu_show_l1tf, NULL); 562 575 static DEVICE_ATTR(mds, 0444, cpu_show_mds, NULL); 576 + static DEVICE_ATTR(tsx_async_abort, 0444, cpu_show_tsx_async_abort, NULL); 577 + static DEVICE_ATTR(itlb_multihit, 0444, cpu_show_itlb_multihit, NULL); 563 578 564 579 static struct attribute *cpu_root_vulnerabilities_attrs[] = { 565 580 &dev_attr_meltdown.attr, ··· 583 568 &dev_attr_spec_store_bypass.attr, 584 569 &dev_attr_l1tf.attr, 585 570 &dev_attr_mds.attr, 571 + &dev_attr_tsx_async_abort.attr, 572 + &dev_attr_itlb_multihit.attr, 586 573 NULL 587 574 }; 588 575
+36
drivers/base/memory.c
··· 872 872 } 873 873 return ret; 874 874 } 875 + 876 + struct for_each_memory_block_cb_data { 877 + walk_memory_blocks_func_t func; 878 + void *arg; 879 + }; 880 + 881 + static int for_each_memory_block_cb(struct device *dev, void *data) 882 + { 883 + struct memory_block *mem = to_memory_block(dev); 884 + struct for_each_memory_block_cb_data *cb_data = data; 885 + 886 + return cb_data->func(mem, cb_data->arg); 887 + } 888 + 889 + /** 890 + * for_each_memory_block - walk through all present memory blocks 891 + * 892 + * @arg: argument passed to func 893 + * @func: callback for each memory block walked 894 + * 895 + * This function walks through all present memory blocks, calling func on 896 + * each memory block. 897 + * 898 + * In case func() returns an error, walking is aborted and the error is 899 + * returned. 900 + */ 901 + int for_each_memory_block(void *arg, walk_memory_blocks_func_t func) 902 + { 903 + struct for_each_memory_block_cb_data cb_data = { 904 + .func = func, 905 + .arg = arg, 906 + }; 907 + 908 + return bus_for_each_dev(&memory_subsys, NULL, &cb_data, 909 + for_each_memory_block_cb); 910 + }
+1 -1
drivers/block/rbd.c
··· 2087 2087 struct rbd_device *rbd_dev = obj_req->img_request->rbd_dev; 2088 2088 struct ceph_osd_data *osd_data; 2089 2089 u64 objno; 2090 - u8 state, new_state, current_state; 2090 + u8 state, new_state, uninitialized_var(current_state); 2091 2091 bool has_current_state; 2092 2092 void *p; 2093 2093
+2
drivers/block/rsxx/core.c
··· 1000 1000 1001 1001 cancel_work_sync(&card->event_work); 1002 1002 1003 + destroy_workqueue(card->event_wq); 1003 1004 rsxx_destroy_dev(card); 1004 1005 rsxx_dma_destroy(card); 1006 + destroy_workqueue(card->creg_ctrl.creg_wq); 1005 1007 1006 1008 spin_lock_irqsave(&card->irq_lock, flags); 1007 1009 rsxx_disable_ier_and_isr(card, CR_INTR_ALL);
+1 -4
drivers/char/hw_random/core.c
··· 13 13 #include <linux/delay.h> 14 14 #include <linux/device.h> 15 15 #include <linux/err.h> 16 - #include <linux/freezer.h> 17 16 #include <linux/fs.h> 18 17 #include <linux/hw_random.h> 19 18 #include <linux/kernel.h> ··· 421 422 { 422 423 long rc; 423 424 424 - set_freezable(); 425 - 426 - while (!kthread_freezable_should_stop(NULL)) { 425 + while (!kthread_should_stop()) { 427 426 struct hwrng *rng; 428 427 429 428 rng = get_current_rng();
+1 -3
drivers/char/random.c
··· 327 327 #include <linux/percpu.h> 328 328 #include <linux/cryptohash.h> 329 329 #include <linux/fips.h> 330 - #include <linux/freezer.h> 331 330 #include <linux/ptrace.h> 332 331 #include <linux/workqueue.h> 333 332 #include <linux/irq.h> ··· 2499 2500 * We'll be woken up again once below random_write_wakeup_thresh, 2500 2501 * or when the calling thread is about to terminate. 2501 2502 */ 2502 - wait_event_freezable(random_write_wait, 2503 - kthread_should_stop() || 2503 + wait_event_interruptible(random_write_wait, kthread_should_stop() || 2504 2504 ENTROPY_BITS(&input_pool) <= random_write_wakeup_bits); 2505 2505 mix_pool_bytes(poolp, buffer, count); 2506 2506 credit_entropy_bits(poolp, entropy);
+11 -5
drivers/clocksource/sh_mtu2.c
··· 328 328 return 0; 329 329 } 330 330 331 + static const unsigned int sh_mtu2_channel_offsets[] = { 332 + 0x300, 0x380, 0x000, 333 + }; 334 + 331 335 static int sh_mtu2_setup_channel(struct sh_mtu2_channel *ch, unsigned int index, 332 336 struct sh_mtu2_device *mtu) 333 337 { 334 - static const unsigned int channel_offsets[] = { 335 - 0x300, 0x380, 0x000, 336 - }; 337 338 char name[6]; 338 339 int irq; 339 340 int ret; ··· 357 356 return ret; 358 357 } 359 358 360 - ch->base = mtu->mapbase + channel_offsets[index]; 359 + ch->base = mtu->mapbase + sh_mtu2_channel_offsets[index]; 361 360 ch->index = index; 362 361 363 362 return sh_mtu2_register(ch, dev_name(&mtu->pdev->dev)); ··· 409 408 } 410 409 411 410 /* Allocate and setup the channels. */ 412 - mtu->num_channels = 3; 411 + ret = platform_irq_count(pdev); 412 + if (ret < 0) 413 + goto err_unmap; 414 + 415 + mtu->num_channels = min_t(unsigned int, ret, 416 + ARRAY_SIZE(sh_mtu2_channel_offsets)); 413 417 414 418 mtu->channels = kcalloc(mtu->num_channels, sizeof(*mtu->channels), 415 419 GFP_KERNEL);
+2 -8
drivers/clocksource/timer-mediatek.c
··· 268 268 269 269 ret = timer_of_init(node, &to); 270 270 if (ret) 271 - goto err; 271 + return ret; 272 272 273 273 clockevents_config_and_register(&to.clkevt, timer_of_rate(&to), 274 274 TIMER_SYNC_TICKS, 0xffffffff); 275 275 276 276 return 0; 277 - err: 278 - timer_of_cleanup(&to); 279 - return ret; 280 277 } 281 278 282 279 static int __init mtk_gpt_init(struct device_node *node) ··· 290 293 291 294 ret = timer_of_init(node, &to); 292 295 if (ret) 293 - goto err; 296 + return ret; 294 297 295 298 /* Configure clock source */ 296 299 mtk_gpt_setup(&to, TIMER_CLK_SRC, GPT_CTRL_OP_FREERUN); ··· 308 311 mtk_gpt_enable_irq(&to, TIMER_CLK_EVT); 309 312 310 313 return 0; 311 - err: 312 - timer_of_cleanup(&to); 313 - return ret; 314 314 } 315 315 TIMER_OF_DECLARE(mtk_mt6577, "mediatek,mt6577-timer", mtk_gpt_init); 316 316 TIMER_OF_DECLARE(mtk_mt6765, "mediatek,mt6765-timer", mtk_syst_init);
+16 -22
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
··· 950 950 struct amdgpu_firmware_info *ucode) 951 951 { 952 952 struct amdgpu_device *adev = psp->adev; 953 - const struct sdma_firmware_header_v1_0 *sdma_hdr = 954 - (const struct sdma_firmware_header_v1_0 *) 955 - adev->sdma.instance[ucode->ucode_id - AMDGPU_UCODE_ID_SDMA0].fw->data; 956 - const struct gfx_firmware_header_v1_0 *ce_hdr = 957 - (const struct gfx_firmware_header_v1_0 *)adev->gfx.ce_fw->data; 958 - const struct gfx_firmware_header_v1_0 *pfp_hdr = 959 - (const struct gfx_firmware_header_v1_0 *)adev->gfx.pfp_fw->data; 960 - const struct gfx_firmware_header_v1_0 *me_hdr = 961 - (const struct gfx_firmware_header_v1_0 *)adev->gfx.me_fw->data; 962 - const struct gfx_firmware_header_v1_0 *mec_hdr = 963 - (const struct gfx_firmware_header_v1_0 *)adev->gfx.mec_fw->data; 964 - const struct rlc_firmware_header_v2_0 *rlc_hdr = 965 - (const struct rlc_firmware_header_v2_0 *)adev->gfx.rlc_fw->data; 966 - const struct smc_firmware_header_v1_0 *smc_hdr = 967 - (const struct smc_firmware_header_v1_0 *)adev->pm.fw->data; 953 + struct common_firmware_header *hdr; 968 954 969 955 switch (ucode->ucode_id) { 970 956 case AMDGPU_UCODE_ID_SDMA0: ··· 961 975 case AMDGPU_UCODE_ID_SDMA5: 962 976 case AMDGPU_UCODE_ID_SDMA6: 963 977 case AMDGPU_UCODE_ID_SDMA7: 964 - amdgpu_ucode_print_sdma_hdr(&sdma_hdr->header); 978 + hdr = (struct common_firmware_header *) 979 + adev->sdma.instance[ucode->ucode_id - AMDGPU_UCODE_ID_SDMA0].fw->data; 980 + amdgpu_ucode_print_sdma_hdr(hdr); 965 981 break; 966 982 case AMDGPU_UCODE_ID_CP_CE: 967 - amdgpu_ucode_print_gfx_hdr(&ce_hdr->header); 983 + hdr = (struct common_firmware_header *)adev->gfx.ce_fw->data; 984 + amdgpu_ucode_print_gfx_hdr(hdr); 968 985 break; 969 986 case AMDGPU_UCODE_ID_CP_PFP: 970 - amdgpu_ucode_print_gfx_hdr(&pfp_hdr->header); 987 + hdr = (struct common_firmware_header *)adev->gfx.pfp_fw->data; 988 + amdgpu_ucode_print_gfx_hdr(hdr); 971 989 break; 972 990 case AMDGPU_UCODE_ID_CP_ME: 973 - 
amdgpu_ucode_print_gfx_hdr(&me_hdr->header); 991 + hdr = (struct common_firmware_header *)adev->gfx.me_fw->data; 992 + amdgpu_ucode_print_gfx_hdr(hdr); 974 993 break; 975 994 case AMDGPU_UCODE_ID_CP_MEC1: 976 - amdgpu_ucode_print_gfx_hdr(&mec_hdr->header); 995 + hdr = (struct common_firmware_header *)adev->gfx.mec_fw->data; 996 + amdgpu_ucode_print_gfx_hdr(hdr); 977 997 break; 978 998 case AMDGPU_UCODE_ID_RLC_G: 979 - amdgpu_ucode_print_rlc_hdr(&rlc_hdr->header); 999 + hdr = (struct common_firmware_header *)adev->gfx.rlc_fw->data; 1000 + amdgpu_ucode_print_rlc_hdr(hdr); 980 1001 break; 981 1002 case AMDGPU_UCODE_ID_SMC: 982 - amdgpu_ucode_print_smc_hdr(&smc_hdr->header); 1003 + hdr = (struct common_firmware_header *)adev->pm.fw->data; 1004 + amdgpu_ucode_print_smc_hdr(hdr); 983 1005 break; 984 1006 default: 985 1007 break;
+3
drivers/gpu/drm/i915/display/intel_display_power.c
··· 4896 4896 4897 4897 power_domains->initializing = true; 4898 4898 4899 + /* Must happen before power domain init on VLV/CHV */ 4900 + intel_update_rawclk(i915); 4901 + 4899 4902 if (INTEL_GEN(i915) >= 11) { 4900 4903 icl_display_core_init(i915, resume); 4901 4904 } else if (IS_CANNONLAKE(i915)) {
+5
drivers/gpu/drm/i915/gem/i915_gem_context.c
··· 319 319 free_engines(rcu_access_pointer(ctx->engines)); 320 320 mutex_destroy(&ctx->engines_mutex); 321 321 322 + kfree(ctx->jump_whitelist); 323 + 322 324 if (ctx->timeline) 323 325 intel_timeline_put(ctx->timeline); 324 326 ··· 442 440 443 441 for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++) 444 442 ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES; 443 + 444 + ctx->jump_whitelist = NULL; 445 + ctx->jump_whitelist_cmds = 0; 445 446 446 447 return ctx; 447 448
+7
drivers/gpu/drm/i915/gem/i915_gem_context_types.h
··· 192 192 * per vm, which may be one per context or shared with the global GTT) 193 193 */ 194 194 struct radix_tree_root handles_vma; 195 + 196 + /** jump_whitelist: Bit array for tracking cmds during cmdparsing 197 + * Guarded by struct_mutex 198 + */ 199 + unsigned long *jump_whitelist; 200 + /** jump_whitelist_cmds: No of cmd slots available */ 201 + u32 jump_whitelist_cmds; 195 202 }; 196 203 197 204 #endif /* __I915_GEM_CONTEXT_TYPES_H__ */
+80 -31
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
··· 296 296 297 297 static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb) 298 298 { 299 - return intel_engine_needs_cmd_parser(eb->engine) && eb->batch_len; 299 + return intel_engine_requires_cmd_parser(eb->engine) || 300 + (intel_engine_using_cmd_parser(eb->engine) && 301 + eb->args->batch_len); 300 302 } 301 303 302 304 static int eb_create(struct i915_execbuffer *eb) ··· 1957 1955 return 0; 1958 1956 } 1959 1957 1960 - static struct i915_vma *eb_parse(struct i915_execbuffer *eb, bool is_master) 1958 + static struct i915_vma * 1959 + shadow_batch_pin(struct i915_execbuffer *eb, struct drm_i915_gem_object *obj) 1960 + { 1961 + struct drm_i915_private *dev_priv = eb->i915; 1962 + struct i915_vma * const vma = *eb->vma; 1963 + struct i915_address_space *vm; 1964 + u64 flags; 1965 + 1966 + /* 1967 + * PPGTT backed shadow buffers must be mapped RO, to prevent 1968 + * post-scan tampering 1969 + */ 1970 + if (CMDPARSER_USES_GGTT(dev_priv)) { 1971 + flags = PIN_GLOBAL; 1972 + vm = &dev_priv->ggtt.vm; 1973 + } else if (vma->vm->has_read_only) { 1974 + flags = PIN_USER; 1975 + vm = vma->vm; 1976 + i915_gem_object_set_readonly(obj); 1977 + } else { 1978 + DRM_DEBUG("Cannot prevent post-scan tampering without RO capable vm\n"); 1979 + return ERR_PTR(-EINVAL); 1980 + } 1981 + 1982 + return i915_gem_object_pin(obj, vm, NULL, 0, 0, flags); 1983 + } 1984 + 1985 + static struct i915_vma *eb_parse(struct i915_execbuffer *eb) 1961 1986 { 1962 1987 struct intel_engine_pool_node *pool; 1963 1988 struct i915_vma *vma; 1989 + u64 batch_start; 1990 + u64 shadow_batch_start; 1964 1991 int err; 1965 1992 1966 1993 pool = intel_engine_pool_get(&eb->engine->pool, eb->batch_len); 1967 1994 if (IS_ERR(pool)) 1968 1995 return ERR_CAST(pool); 1969 1996 1970 - err = intel_engine_cmd_parser(eb->engine, 1997 + vma = shadow_batch_pin(eb, pool->obj); 1998 + if (IS_ERR(vma)) 1999 + goto err; 2000 + 2001 + batch_start = gen8_canonical_addr(eb->batch->node.start) + 2002 + 
eb->batch_start_offset; 2003 + 2004 + shadow_batch_start = gen8_canonical_addr(vma->node.start); 2005 + 2006 + err = intel_engine_cmd_parser(eb->gem_context, 2007 + eb->engine, 1971 2008 eb->batch->obj, 1972 - pool->obj, 2009 + batch_start, 1973 2010 eb->batch_start_offset, 1974 2011 eb->batch_len, 1975 - is_master); 2012 + pool->obj, 2013 + shadow_batch_start); 2014 + 1976 2015 if (err) { 1977 - if (err == -EACCES) /* unhandled chained batch */ 2016 + i915_vma_unpin(vma); 2017 + 2018 + /* 2019 + * Unsafe GGTT-backed buffers can still be submitted safely 2020 + * as non-secure. 2021 + * For PPGTT backing however, we have no choice but to forcibly 2022 + * reject unsafe buffers 2023 + */ 2024 + if (CMDPARSER_USES_GGTT(eb->i915) && (err == -EACCES)) 2025 + /* Execute original buffer non-secure */ 1978 2026 vma = NULL; 1979 2027 else 1980 2028 vma = ERR_PTR(err); 1981 2029 goto err; 1982 2030 } 1983 2031 1984 - vma = i915_gem_object_ggtt_pin(pool->obj, NULL, 0, 0, 0); 1985 - if (IS_ERR(vma)) 1986 - goto err; 1987 - 1988 2032 eb->vma[eb->buffer_count] = i915_vma_get(vma); 1989 2033 eb->flags[eb->buffer_count] = 1990 2034 __EXEC_OBJECT_HAS_PIN | __EXEC_OBJECT_HAS_REF; 1991 2035 vma->exec_flags = &eb->flags[eb->buffer_count]; 1992 2036 eb->buffer_count++; 2037 + 2038 + eb->batch_start_offset = 0; 2039 + eb->batch = vma; 2040 + 2041 + if (CMDPARSER_USES_GGTT(eb->i915)) 2042 + eb->batch_flags |= I915_DISPATCH_SECURE; 2043 + 2044 + /* eb->batch_len unchanged */ 1993 2045 1994 2046 vma->private = pool; 1995 2047 return vma; ··· 2477 2421 struct drm_i915_gem_exec_object2 *exec, 2478 2422 struct drm_syncobj **fences) 2479 2423 { 2424 + struct drm_i915_private *i915 = to_i915(dev); 2480 2425 struct i915_execbuffer eb; 2481 2426 struct dma_fence *in_fence = NULL; 2482 2427 struct dma_fence *exec_fence = NULL; ··· 2489 2432 BUILD_BUG_ON(__EXEC_OBJECT_INTERNAL_FLAGS & 2490 2433 ~__EXEC_OBJECT_UNKNOWN_FLAGS); 2491 2434 2492 - eb.i915 = to_i915(dev); 2435 + eb.i915 = i915; 2493 2436 
eb.file = file; 2494 2437 eb.args = args; 2495 2438 if (DBG_FORCE_RELOC || !(args->flags & I915_EXEC_NO_RELOC)) ··· 2509 2452 2510 2453 eb.batch_flags = 0; 2511 2454 if (args->flags & I915_EXEC_SECURE) { 2455 + if (INTEL_GEN(i915) >= 11) 2456 + return -ENODEV; 2457 + 2458 + /* Return -EPERM to trigger fallback code on old binaries. */ 2459 + if (!HAS_SECURE_BATCHES(i915)) 2460 + return -EPERM; 2461 + 2512 2462 if (!drm_is_current_master(file) || !capable(CAP_SYS_ADMIN)) 2513 - return -EPERM; 2463 + return -EPERM; 2514 2464 2515 2465 eb.batch_flags |= I915_DISPATCH_SECURE; 2516 2466 } ··· 2594 2530 goto err_vma; 2595 2531 } 2596 2532 2533 + if (eb.batch_len == 0) 2534 + eb.batch_len = eb.batch->size - eb.batch_start_offset; 2535 + 2597 2536 if (eb_use_cmdparser(&eb)) { 2598 2537 struct i915_vma *vma; 2599 2538 2600 - vma = eb_parse(&eb, drm_is_current_master(file)); 2539 + vma = eb_parse(&eb); 2601 2540 if (IS_ERR(vma)) { 2602 2541 err = PTR_ERR(vma); 2603 2542 goto err_vma; 2604 2543 } 2605 - 2606 - if (vma) { 2607 - /* 2608 - * Batch parsed and accepted: 2609 - * 2610 - * Set the DISPATCH_SECURE bit to remove the NON_SECURE 2611 - * bit from MI_BATCH_BUFFER_START commands issued in 2612 - * the dispatch_execbuffer implementations. We 2613 - * specifically don't want that set on batches the 2614 - * command parser has accepted. 2615 - */ 2616 - eb.batch_flags |= I915_DISPATCH_SECURE; 2617 - eb.batch_start_offset = 0; 2618 - eb.batch = vma; 2619 - } 2620 2544 } 2621 - 2622 - if (eb.batch_len == 0) 2623 - eb.batch_len = eb.batch->size - eb.batch_start_offset; 2624 2545 2625 2546 /* 2626 2547 * snb/ivb/vlv conflate the "batch in ppgtt" bit with the "non-secure
+10 -3
drivers/gpu/drm/i915/gt/intel_engine_types.h
··· 475 475 476 476 struct intel_engine_hangcheck hangcheck; 477 477 478 - #define I915_ENGINE_NEEDS_CMD_PARSER BIT(0) 478 + #define I915_ENGINE_USING_CMD_PARSER BIT(0) 479 479 #define I915_ENGINE_SUPPORTS_STATS BIT(1) 480 480 #define I915_ENGINE_HAS_PREEMPTION BIT(2) 481 481 #define I915_ENGINE_HAS_SEMAPHORES BIT(3) 482 482 #define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(4) 483 483 #define I915_ENGINE_IS_VIRTUAL BIT(5) 484 + #define I915_ENGINE_REQUIRES_CMD_PARSER BIT(7) 484 485 unsigned int flags; 485 486 486 487 /* ··· 542 541 }; 543 542 544 543 static inline bool 545 - intel_engine_needs_cmd_parser(const struct intel_engine_cs *engine) 544 + intel_engine_using_cmd_parser(const struct intel_engine_cs *engine) 546 545 { 547 - return engine->flags & I915_ENGINE_NEEDS_CMD_PARSER; 546 + return engine->flags & I915_ENGINE_USING_CMD_PARSER; 547 + } 548 + 549 + static inline bool 550 + intel_engine_requires_cmd_parser(const struct intel_engine_cs *engine) 551 + { 552 + return engine->flags & I915_ENGINE_REQUIRES_CMD_PARSER; 548 553 } 549 554 550 555 static inline bool
+8
drivers/gpu/drm/i915/gt/intel_gt_pm.c
··· 38 38 gt->awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ); 39 39 GEM_BUG_ON(!gt->awake); 40 40 41 + if (NEEDS_RC6_CTX_CORRUPTION_WA(i915)) 42 + intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL); 43 + 41 44 intel_enable_gt_powersave(i915); 42 45 43 46 i915_update_gfx_val(i915); ··· 69 66 i915_pmu_gt_parked(i915); 70 67 if (INTEL_GEN(i915) >= 6) 71 68 gen6_rps_idle(i915); 69 + 70 + if (NEEDS_RC6_CTX_CORRUPTION_WA(i915)) { 71 + i915_rc6_ctx_wa_check(i915); 72 + intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL); 73 + } 72 74 73 75 /* Everything switched off, flush any residual interrupt just in case */ 74 76 intel_synchronize_irq(i915);
+1 -9
drivers/gpu/drm/i915/gt/intel_mocs.c
··· 199 199 MOCS_ENTRY(15, \ 200 200 LE_3_WB | LE_TC_1_LLC | LE_LRUM(2) | LE_AOM(1), \ 201 201 L3_3_WB), \ 202 - /* Bypass LLC - Uncached (EHL+) */ \ 203 - MOCS_ENTRY(16, \ 204 - LE_1_UC | LE_TC_1_LLC | LE_SCF(1), \ 205 - L3_1_UC), \ 206 - /* Bypass LLC - L3 (Read-Only) (EHL+) */ \ 207 - MOCS_ENTRY(17, \ 208 - LE_1_UC | LE_TC_1_LLC | LE_SCF(1), \ 209 - L3_3_WB), \ 210 202 /* Self-Snoop - L3 + LLC */ \ 211 203 MOCS_ENTRY(18, \ 212 204 LE_3_WB | LE_TC_1_LLC | LE_LRUM(3) | LE_SSE(3), \ ··· 262 270 L3_1_UC), 263 271 /* HW Special Case (Displayable) */ 264 272 MOCS_ENTRY(61, 265 - LE_1_UC | LE_TC_1_LLC | LE_SCF(1), 273 + LE_1_UC | LE_TC_1_LLC, 266 274 L3_3_WB), 267 275 }; 268 276
+2 -2
drivers/gpu/drm/i915/gvt/dmabuf.c
··· 498 498 goto out_free_gem; 499 499 } 500 500 501 - i915_gem_object_put(obj); 502 - 503 501 ret = dma_buf_fd(dmabuf, DRM_CLOEXEC | DRM_RDWR); 504 502 if (ret < 0) { 505 503 gvt_vgpu_err("create dma-buf fd failed ret:%d\n", ret); ··· 521 523 dmabuf_fd, 522 524 file_count(dmabuf->file), 523 525 kref_read(&obj->base.refcount)); 526 + 527 + i915_gem_object_put(obj); 524 528 525 529 return dmabuf_fd; 526 530
+308 -131
drivers/gpu/drm/i915/i915_cmd_parser.c
··· 53 53 * granting userspace undue privileges. There are three categories of privilege. 54 54 * 55 55 * First, commands which are explicitly defined as privileged or which should 56 - * only be used by the kernel driver. The parser generally rejects such 57 - * commands, though it may allow some from the drm master process. 56 + * only be used by the kernel driver. The parser rejects such commands 58 57 * 59 58 * Second, commands which access registers. To support correct/enhanced 60 59 * userspace functionality, particularly certain OpenGL extensions, the parser 61 - * provides a whitelist of registers which userspace may safely access (for both 62 - * normal and drm master processes). 60 + * provides a whitelist of registers which userspace may safely access 63 61 * 64 62 * Third, commands which access privileged memory (i.e. GGTT, HWS page, etc). 65 63 * The parser always rejects such commands. ··· 82 84 * in the per-engine command tables. 83 85 * 84 86 * Other command table entries map fairly directly to high level categories 85 - * mentioned above: rejected, master-only, register whitelist. The parser 86 - * implements a number of checks, including the privileged memory checks, via a 87 - * general bitmasking mechanism. 87 + * mentioned above: rejected, register whitelist. The parser implements a number 88 + * of checks, including the privileged memory checks, via a general bitmasking 89 + * mechanism. 
88 90 */ 89 91 90 92 /* ··· 102 104 * CMD_DESC_REJECT: The command is never allowed 103 105 * CMD_DESC_REGISTER: The command should be checked against the 104 106 * register whitelist for the appropriate ring 105 - * CMD_DESC_MASTER: The command is allowed if the submitting process 106 - * is the DRM master 107 107 */ 108 108 u32 flags; 109 109 #define CMD_DESC_FIXED (1<<0) ··· 109 113 #define CMD_DESC_REJECT (1<<2) 110 114 #define CMD_DESC_REGISTER (1<<3) 111 115 #define CMD_DESC_BITMASK (1<<4) 112 - #define CMD_DESC_MASTER (1<<5) 113 116 114 117 /* 115 118 * The command's unique identification bits and the bitmask to get them. ··· 189 194 #define CMD(op, opm, f, lm, fl, ...) \ 190 195 { \ 191 196 .flags = (fl) | ((f) ? CMD_DESC_FIXED : 0), \ 192 - .cmd = { (op), ~0u << (opm) }, \ 197 + .cmd = { (op & ~0u << (opm)), ~0u << (opm) }, \ 193 198 .length = { (lm) }, \ 194 199 __VA_ARGS__ \ 195 200 } ··· 204 209 #define R CMD_DESC_REJECT 205 210 #define W CMD_DESC_REGISTER 206 211 #define B CMD_DESC_BITMASK 207 - #define M CMD_DESC_MASTER 208 212 209 213 /* Command Mask Fixed Len Action 210 214 ---------------------------------------------------------- */ 211 - static const struct drm_i915_cmd_descriptor common_cmds[] = { 215 + static const struct drm_i915_cmd_descriptor gen7_common_cmds[] = { 212 216 CMD( MI_NOOP, SMI, F, 1, S ), 213 217 CMD( MI_USER_INTERRUPT, SMI, F, 1, R ), 214 - CMD( MI_WAIT_FOR_EVENT, SMI, F, 1, M ), 218 + CMD( MI_WAIT_FOR_EVENT, SMI, F, 1, R ), 215 219 CMD( MI_ARB_CHECK, SMI, F, 1, S ), 216 220 CMD( MI_REPORT_HEAD, SMI, F, 1, S ), 217 221 CMD( MI_SUSPEND_FLUSH, SMI, F, 1, S ), ··· 240 246 CMD( MI_BATCH_BUFFER_START, SMI, !F, 0xFF, S ), 241 247 }; 242 248 243 - static const struct drm_i915_cmd_descriptor render_cmds[] = { 249 + static const struct drm_i915_cmd_descriptor gen7_render_cmds[] = { 244 250 CMD( MI_FLUSH, SMI, F, 1, S ), 245 251 CMD( MI_ARB_ON_OFF, SMI, F, 1, R ), 246 252 CMD( MI_PREDICATE, SMI, F, 1, S ), ··· 307 313 CMD( 
MI_URB_ATOMIC_ALLOC, SMI, F, 1, S ), 308 314 CMD( MI_SET_APPID, SMI, F, 1, S ), 309 315 CMD( MI_RS_CONTEXT, SMI, F, 1, S ), 310 - CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, M ), 316 + CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, R ), 311 317 CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, R ), 312 318 CMD( MI_LOAD_REGISTER_REG, SMI, !F, 0xFF, W, 313 319 .reg = { .offset = 1, .mask = 0x007FFFFC, .step = 1 } ), ··· 324 330 CMD( GFX_OP_3DSTATE_BINDING_TABLE_EDIT_PS, S3D, !F, 0x1FF, S ), 325 331 }; 326 332 327 - static const struct drm_i915_cmd_descriptor video_cmds[] = { 333 + static const struct drm_i915_cmd_descriptor gen7_video_cmds[] = { 328 334 CMD( MI_ARB_ON_OFF, SMI, F, 1, R ), 329 335 CMD( MI_SET_APPID, SMI, F, 1, S ), 330 336 CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, B, ··· 368 374 CMD( MFX_WAIT, SMFX, F, 1, S ), 369 375 }; 370 376 371 - static const struct drm_i915_cmd_descriptor vecs_cmds[] = { 377 + static const struct drm_i915_cmd_descriptor gen7_vecs_cmds[] = { 372 378 CMD( MI_ARB_ON_OFF, SMI, F, 1, R ), 373 379 CMD( MI_SET_APPID, SMI, F, 1, S ), 374 380 CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, B, ··· 406 412 }}, ), 407 413 }; 408 414 409 - static const struct drm_i915_cmd_descriptor blt_cmds[] = { 415 + static const struct drm_i915_cmd_descriptor gen7_blt_cmds[] = { 410 416 CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, R ), 411 417 CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, B, 412 418 .bits = {{ ··· 440 446 }; 441 447 442 448 static const struct drm_i915_cmd_descriptor hsw_blt_cmds[] = { 443 - CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, M ), 449 + CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, R ), 444 450 CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, R ), 451 + }; 452 + 453 + /* 454 + * For Gen9 we can still rely on the h/w to enforce cmd security, and only 455 + * need to re-enforce the register access checks. We therefore only need to 456 + * teach the cmdparser how to find the end of each command, and identify 457 + * register accesses. 
The table doesn't need to reject any commands, and so 458 + * the only commands listed here are: 459 + * 1) Those that touch registers 460 + * 2) Those that do not have the default 8-bit length 461 + * 462 + * Note that the default MI length mask chosen for this table is 0xFF, not 463 + * the 0x3F used on older devices. This is because the vast majority of MI 464 + * cmds on Gen9 use a standard 8-bit Length field. 465 + * All the Gen9 blitter instructions are standard 0xFF length mask, and 466 + * none allow access to non-general registers, so in fact no BLT cmds are 467 + * included in the table at all. 468 + * 469 + */ 470 + static const struct drm_i915_cmd_descriptor gen9_blt_cmds[] = { 471 + CMD( MI_NOOP, SMI, F, 1, S ), 472 + CMD( MI_USER_INTERRUPT, SMI, F, 1, S ), 473 + CMD( MI_WAIT_FOR_EVENT, SMI, F, 1, S ), 474 + CMD( MI_FLUSH, SMI, F, 1, S ), 475 + CMD( MI_ARB_CHECK, SMI, F, 1, S ), 476 + CMD( MI_REPORT_HEAD, SMI, F, 1, S ), 477 + CMD( MI_ARB_ON_OFF, SMI, F, 1, S ), 478 + CMD( MI_SUSPEND_FLUSH, SMI, F, 1, S ), 479 + CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, S ), 480 + CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, S ), 481 + CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, S ), 482 + CMD( MI_LOAD_REGISTER_IMM(1), SMI, !F, 0xFF, W, 483 + .reg = { .offset = 1, .mask = 0x007FFFFC, .step = 2 } ), 484 + CMD( MI_UPDATE_GTT, SMI, !F, 0x3FF, S ), 485 + CMD( MI_STORE_REGISTER_MEM_GEN8, SMI, F, 4, W, 486 + .reg = { .offset = 1, .mask = 0x007FFFFC } ), 487 + CMD( MI_FLUSH_DW, SMI, !F, 0x3F, S ), 488 + CMD( MI_LOAD_REGISTER_MEM_GEN8, SMI, F, 4, W, 489 + .reg = { .offset = 1, .mask = 0x007FFFFC } ), 490 + CMD( MI_LOAD_REGISTER_REG, SMI, !F, 0xFF, W, 491 + .reg = { .offset = 1, .mask = 0x007FFFFC, .step = 1 } ), 492 + 493 + /* 494 + * We allow BB_START but apply further checks. We just sanitize the 495 + * basic fields here. 
496 + */ 497 + #define MI_BB_START_OPERAND_MASK GENMASK(SMI-1, 0) 498 + #define MI_BB_START_OPERAND_EXPECT (MI_BATCH_PPGTT_HSW | 1) 499 + CMD( MI_BATCH_BUFFER_START_GEN8, SMI, !F, 0xFF, B, 500 + .bits = {{ 501 + .offset = 0, 502 + .mask = MI_BB_START_OPERAND_MASK, 503 + .expected = MI_BB_START_OPERAND_EXPECT, 504 + }}, ), 445 505 }; 446 506 447 507 static const struct drm_i915_cmd_descriptor noop_desc = ··· 511 463 #undef R 512 464 #undef W 513 465 #undef B 514 - #undef M 515 466 516 - static const struct drm_i915_cmd_table gen7_render_cmds[] = { 517 - { common_cmds, ARRAY_SIZE(common_cmds) }, 518 - { render_cmds, ARRAY_SIZE(render_cmds) }, 467 + static const struct drm_i915_cmd_table gen7_render_cmd_table[] = { 468 + { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) }, 469 + { gen7_render_cmds, ARRAY_SIZE(gen7_render_cmds) }, 519 470 }; 520 471 521 - static const struct drm_i915_cmd_table hsw_render_ring_cmds[] = { 522 - { common_cmds, ARRAY_SIZE(common_cmds) }, 523 - { render_cmds, ARRAY_SIZE(render_cmds) }, 472 + static const struct drm_i915_cmd_table hsw_render_ring_cmd_table[] = { 473 + { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) }, 474 + { gen7_render_cmds, ARRAY_SIZE(gen7_render_cmds) }, 524 475 { hsw_render_cmds, ARRAY_SIZE(hsw_render_cmds) }, 525 476 }; 526 477 527 - static const struct drm_i915_cmd_table gen7_video_cmds[] = { 528 - { common_cmds, ARRAY_SIZE(common_cmds) }, 529 - { video_cmds, ARRAY_SIZE(video_cmds) }, 478 + static const struct drm_i915_cmd_table gen7_video_cmd_table[] = { 479 + { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) }, 480 + { gen7_video_cmds, ARRAY_SIZE(gen7_video_cmds) }, 530 481 }; 531 482 532 - static const struct drm_i915_cmd_table hsw_vebox_cmds[] = { 533 - { common_cmds, ARRAY_SIZE(common_cmds) }, 534 - { vecs_cmds, ARRAY_SIZE(vecs_cmds) }, 483 + static const struct drm_i915_cmd_table hsw_vebox_cmd_table[] = { 484 + { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) }, 485 + { gen7_vecs_cmds, ARRAY_SIZE(gen7_vecs_cmds) 
}, 535 486 }; 536 487 537 - static const struct drm_i915_cmd_table gen7_blt_cmds[] = { 538 - { common_cmds, ARRAY_SIZE(common_cmds) }, 539 - { blt_cmds, ARRAY_SIZE(blt_cmds) }, 488 + static const struct drm_i915_cmd_table gen7_blt_cmd_table[] = { 489 + { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) }, 490 + { gen7_blt_cmds, ARRAY_SIZE(gen7_blt_cmds) }, 540 491 }; 541 492 542 - static const struct drm_i915_cmd_table hsw_blt_ring_cmds[] = { 543 - { common_cmds, ARRAY_SIZE(common_cmds) }, 544 - { blt_cmds, ARRAY_SIZE(blt_cmds) }, 493 + static const struct drm_i915_cmd_table hsw_blt_ring_cmd_table[] = { 494 + { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) }, 495 + { gen7_blt_cmds, ARRAY_SIZE(gen7_blt_cmds) }, 545 496 { hsw_blt_cmds, ARRAY_SIZE(hsw_blt_cmds) }, 546 497 }; 498 + 499 + static const struct drm_i915_cmd_table gen9_blt_cmd_table[] = { 500 + { gen9_blt_cmds, ARRAY_SIZE(gen9_blt_cmds) }, 501 + }; 502 + 547 503 548 504 /* 549 505 * Register whitelists, sorted by increasing register offset. 
··· 664 612 REG64_IDX(RING_TIMESTAMP, BLT_RING_BASE), 665 613 }; 666 614 667 - static const struct drm_i915_reg_descriptor ivb_master_regs[] = { 668 - REG32(FORCEWAKE_MT), 669 - REG32(DERRMR), 670 - REG32(GEN7_PIPE_DE_LOAD_SL(PIPE_A)), 671 - REG32(GEN7_PIPE_DE_LOAD_SL(PIPE_B)), 672 - REG32(GEN7_PIPE_DE_LOAD_SL(PIPE_C)), 673 - }; 674 - 675 - static const struct drm_i915_reg_descriptor hsw_master_regs[] = { 676 - REG32(FORCEWAKE_MT), 677 - REG32(DERRMR), 615 + static const struct drm_i915_reg_descriptor gen9_blt_regs[] = { 616 + REG64_IDX(RING_TIMESTAMP, RENDER_RING_BASE), 617 + REG64_IDX(RING_TIMESTAMP, BSD_RING_BASE), 618 + REG32(BCS_SWCTRL), 619 + REG64_IDX(RING_TIMESTAMP, BLT_RING_BASE), 620 + REG64_IDX(BCS_GPR, 0), 621 + REG64_IDX(BCS_GPR, 1), 622 + REG64_IDX(BCS_GPR, 2), 623 + REG64_IDX(BCS_GPR, 3), 624 + REG64_IDX(BCS_GPR, 4), 625 + REG64_IDX(BCS_GPR, 5), 626 + REG64_IDX(BCS_GPR, 6), 627 + REG64_IDX(BCS_GPR, 7), 628 + REG64_IDX(BCS_GPR, 8), 629 + REG64_IDX(BCS_GPR, 9), 630 + REG64_IDX(BCS_GPR, 10), 631 + REG64_IDX(BCS_GPR, 11), 632 + REG64_IDX(BCS_GPR, 12), 633 + REG64_IDX(BCS_GPR, 13), 634 + REG64_IDX(BCS_GPR, 14), 635 + REG64_IDX(BCS_GPR, 15), 678 636 }; 679 637 680 638 #undef REG64 ··· 693 631 struct drm_i915_reg_table { 694 632 const struct drm_i915_reg_descriptor *regs; 695 633 int num_regs; 696 - bool master; 697 634 }; 698 635 699 636 static const struct drm_i915_reg_table ivb_render_reg_tables[] = { 700 - { gen7_render_regs, ARRAY_SIZE(gen7_render_regs), false }, 701 - { ivb_master_regs, ARRAY_SIZE(ivb_master_regs), true }, 637 + { gen7_render_regs, ARRAY_SIZE(gen7_render_regs) }, 702 638 }; 703 639 704 640 static const struct drm_i915_reg_table ivb_blt_reg_tables[] = { 705 - { gen7_blt_regs, ARRAY_SIZE(gen7_blt_regs), false }, 706 - { ivb_master_regs, ARRAY_SIZE(ivb_master_regs), true }, 641 + { gen7_blt_regs, ARRAY_SIZE(gen7_blt_regs) }, 707 642 }; 708 643 709 644 static const struct drm_i915_reg_table hsw_render_reg_tables[] = { 710 - { 
gen7_render_regs, ARRAY_SIZE(gen7_render_regs), false }, 711 - { hsw_render_regs, ARRAY_SIZE(hsw_render_regs), false }, 712 - { hsw_master_regs, ARRAY_SIZE(hsw_master_regs), true }, 645 + { gen7_render_regs, ARRAY_SIZE(gen7_render_regs) }, 646 + { hsw_render_regs, ARRAY_SIZE(hsw_render_regs) }, 713 647 }; 714 648 715 649 static const struct drm_i915_reg_table hsw_blt_reg_tables[] = { 716 - { gen7_blt_regs, ARRAY_SIZE(gen7_blt_regs), false }, 717 - { hsw_master_regs, ARRAY_SIZE(hsw_master_regs), true }, 650 + { gen7_blt_regs, ARRAY_SIZE(gen7_blt_regs) }, 651 + }; 652 + 653 + static const struct drm_i915_reg_table gen9_blt_reg_tables[] = { 654 + { gen9_blt_regs, ARRAY_SIZE(gen9_blt_regs) }, 718 655 }; 719 656 720 657 static u32 gen7_render_get_cmd_length_mask(u32 cmd_header) ··· 765 704 if (client == INSTR_MI_CLIENT) 766 705 return 0x3F; 767 706 else if (client == INSTR_BC_CLIENT) 707 + return 0xFF; 708 + 709 + DRM_DEBUG_DRIVER("CMD: Abnormal blt cmd length! 0x%08X\n", cmd_header); 710 + return 0; 711 + } 712 + 713 + static u32 gen9_blt_get_cmd_length_mask(u32 cmd_header) 714 + { 715 + u32 client = cmd_header >> INSTR_CLIENT_SHIFT; 716 + 717 + if (client == INSTR_MI_CLIENT || client == INSTR_BC_CLIENT) 768 718 return 0xFF; 769 719 770 720 DRM_DEBUG_DRIVER("CMD: Abnormal blt cmd length! 
0x%08X\n", cmd_header); ··· 939 867 int cmd_table_count; 940 868 int ret; 941 869 942 - if (!IS_GEN(engine->i915, 7)) 870 + if (!IS_GEN(engine->i915, 7) && !(IS_GEN(engine->i915, 9) && 871 + engine->class == COPY_ENGINE_CLASS)) 943 872 return; 944 873 945 874 switch (engine->class) { 946 875 case RENDER_CLASS: 947 876 if (IS_HASWELL(engine->i915)) { 948 - cmd_tables = hsw_render_ring_cmds; 877 + cmd_tables = hsw_render_ring_cmd_table; 949 878 cmd_table_count = 950 - ARRAY_SIZE(hsw_render_ring_cmds); 879 + ARRAY_SIZE(hsw_render_ring_cmd_table); 951 880 } else { 952 - cmd_tables = gen7_render_cmds; 953 - cmd_table_count = ARRAY_SIZE(gen7_render_cmds); 881 + cmd_tables = gen7_render_cmd_table; 882 + cmd_table_count = ARRAY_SIZE(gen7_render_cmd_table); 954 883 } 955 884 956 885 if (IS_HASWELL(engine->i915)) { ··· 961 888 engine->reg_tables = ivb_render_reg_tables; 962 889 engine->reg_table_count = ARRAY_SIZE(ivb_render_reg_tables); 963 890 } 964 - 965 891 engine->get_cmd_length_mask = gen7_render_get_cmd_length_mask; 966 892 break; 967 893 case VIDEO_DECODE_CLASS: 968 - cmd_tables = gen7_video_cmds; 969 - cmd_table_count = ARRAY_SIZE(gen7_video_cmds); 894 + cmd_tables = gen7_video_cmd_table; 895 + cmd_table_count = ARRAY_SIZE(gen7_video_cmd_table); 970 896 engine->get_cmd_length_mask = gen7_bsd_get_cmd_length_mask; 971 897 break; 972 898 case COPY_ENGINE_CLASS: 973 - if (IS_HASWELL(engine->i915)) { 974 - cmd_tables = hsw_blt_ring_cmds; 975 - cmd_table_count = ARRAY_SIZE(hsw_blt_ring_cmds); 899 + engine->get_cmd_length_mask = gen7_blt_get_cmd_length_mask; 900 + if (IS_GEN(engine->i915, 9)) { 901 + cmd_tables = gen9_blt_cmd_table; 902 + cmd_table_count = ARRAY_SIZE(gen9_blt_cmd_table); 903 + engine->get_cmd_length_mask = 904 + gen9_blt_get_cmd_length_mask; 905 + 906 + /* BCS Engine unsafe without parser */ 907 + engine->flags |= I915_ENGINE_REQUIRES_CMD_PARSER; 908 + } else if (IS_HASWELL(engine->i915)) { 909 + cmd_tables = hsw_blt_ring_cmd_table; 910 + cmd_table_count = 
ARRAY_SIZE(hsw_blt_ring_cmd_table); 976 911 } else { 977 - cmd_tables = gen7_blt_cmds; 978 - cmd_table_count = ARRAY_SIZE(gen7_blt_cmds); 912 + cmd_tables = gen7_blt_cmd_table; 913 + cmd_table_count = ARRAY_SIZE(gen7_blt_cmd_table); 979 914 } 980 915 981 - if (IS_HASWELL(engine->i915)) { 916 + if (IS_GEN(engine->i915, 9)) { 917 + engine->reg_tables = gen9_blt_reg_tables; 918 + engine->reg_table_count = 919 + ARRAY_SIZE(gen9_blt_reg_tables); 920 + } else if (IS_HASWELL(engine->i915)) { 982 921 engine->reg_tables = hsw_blt_reg_tables; 983 922 engine->reg_table_count = ARRAY_SIZE(hsw_blt_reg_tables); 984 923 } else { 985 924 engine->reg_tables = ivb_blt_reg_tables; 986 925 engine->reg_table_count = ARRAY_SIZE(ivb_blt_reg_tables); 987 926 } 988 - 989 - engine->get_cmd_length_mask = gen7_blt_get_cmd_length_mask; 990 927 break; 991 928 case VIDEO_ENHANCEMENT_CLASS: 992 - cmd_tables = hsw_vebox_cmds; 993 - cmd_table_count = ARRAY_SIZE(hsw_vebox_cmds); 929 + cmd_tables = hsw_vebox_cmd_table; 930 + cmd_table_count = ARRAY_SIZE(hsw_vebox_cmd_table); 994 931 /* VECS can use the same length_mask function as VCS */ 995 932 engine->get_cmd_length_mask = gen7_bsd_get_cmd_length_mask; 996 933 break; ··· 1026 943 return; 1027 944 } 1028 945 1029 - engine->flags |= I915_ENGINE_NEEDS_CMD_PARSER; 946 + engine->flags |= I915_ENGINE_USING_CMD_PARSER; 1030 947 } 1031 948 1032 949 /** ··· 1038 955 */ 1039 956 void intel_engine_cleanup_cmd_parser(struct intel_engine_cs *engine) 1040 957 { 1041 - if (!intel_engine_needs_cmd_parser(engine)) 958 + if (!intel_engine_using_cmd_parser(engine)) 1042 959 return; 1043 960 1044 961 fini_hash_table(engine); ··· 1112 1029 } 1113 1030 1114 1031 static const struct drm_i915_reg_descriptor * 1115 - find_reg(const struct intel_engine_cs *engine, bool is_master, u32 addr) 1032 + find_reg(const struct intel_engine_cs *engine, u32 addr) 1116 1033 { 1117 1034 const struct drm_i915_reg_table *table = engine->reg_tables; 1035 + const struct 
drm_i915_reg_descriptor *reg = NULL; 1118 1036 int count = engine->reg_table_count; 1119 1037 1120 - for (; count > 0; ++table, --count) { 1121 - if (!table->master || is_master) { 1122 - const struct drm_i915_reg_descriptor *reg; 1038 + for (; !reg && (count > 0); ++table, --count) 1039 + reg = __find_reg(table->regs, table->num_regs, addr); 1123 1040 1124 - reg = __find_reg(table->regs, table->num_regs, addr); 1125 - if (reg != NULL) 1126 - return reg; 1127 - } 1128 - } 1129 - 1130 - return NULL; 1041 + return reg; 1131 1042 } 1132 1043 1133 1044 /* Returns a vmap'd pointer to dst_obj, which the caller must unmap */ ··· 1205 1128 1206 1129 static bool check_cmd(const struct intel_engine_cs *engine, 1207 1130 const struct drm_i915_cmd_descriptor *desc, 1208 - const u32 *cmd, u32 length, 1209 - const bool is_master) 1131 + const u32 *cmd, u32 length) 1210 1132 { 1211 1133 if (desc->flags & CMD_DESC_SKIP) 1212 1134 return true; 1213 1135 1214 1136 if (desc->flags & CMD_DESC_REJECT) { 1215 1137 DRM_DEBUG_DRIVER("CMD: Rejected command: 0x%08X\n", *cmd); 1216 - return false; 1217 - } 1218 - 1219 - if ((desc->flags & CMD_DESC_MASTER) && !is_master) { 1220 - DRM_DEBUG_DRIVER("CMD: Rejected master-only command: 0x%08X\n", 1221 - *cmd); 1222 1138 return false; 1223 1139 } 1224 1140 ··· 1228 1158 offset += step) { 1229 1159 const u32 reg_addr = cmd[offset] & desc->reg.mask; 1230 1160 const struct drm_i915_reg_descriptor *reg = 1231 - find_reg(engine, is_master, reg_addr); 1161 + find_reg(engine, reg_addr); 1232 1162 1233 1163 if (!reg) { 1234 1164 DRM_DEBUG_DRIVER("CMD: Rejected register 0x%08X in command: 0x%08X (%s)\n", ··· 1306 1236 return true; 1307 1237 } 1308 1238 1239 + static int check_bbstart(const struct i915_gem_context *ctx, 1240 + u32 *cmd, u32 offset, u32 length, 1241 + u32 batch_len, 1242 + u64 batch_start, 1243 + u64 shadow_batch_start) 1244 + { 1245 + u64 jump_offset, jump_target; 1246 + u32 target_cmd_offset, target_cmd_index; 1247 + 1248 + /* For igt 
compatibility on older platforms */ 1249 + if (CMDPARSER_USES_GGTT(ctx->i915)) { 1250 + DRM_DEBUG("CMD: Rejecting BB_START for ggtt based submission\n"); 1251 + return -EACCES; 1252 + } 1253 + 1254 + if (length != 3) { 1255 + DRM_DEBUG("CMD: Recursive BB_START with bad length(%u)\n", 1256 + length); 1257 + return -EINVAL; 1258 + } 1259 + 1260 + jump_target = *(u64*)(cmd+1); 1261 + jump_offset = jump_target - batch_start; 1262 + 1263 + /* 1264 + * Any underflow of jump_target is guaranteed to be outside the range 1265 + * of a u32, so >= test catches both too large and too small 1266 + */ 1267 + if (jump_offset >= batch_len) { 1268 + DRM_DEBUG("CMD: BB_START to 0x%llx jumps out of BB\n", 1269 + jump_target); 1270 + return -EINVAL; 1271 + } 1272 + 1273 + /* 1274 + * This cannot overflow a u32 because we already checked jump_offset 1275 + * is within the BB, and the batch_len is a u32 1276 + */ 1277 + target_cmd_offset = lower_32_bits(jump_offset); 1278 + target_cmd_index = target_cmd_offset / sizeof(u32); 1279 + 1280 + *(u64*)(cmd + 1) = shadow_batch_start + target_cmd_offset; 1281 + 1282 + if (target_cmd_index == offset) 1283 + return 0; 1284 + 1285 + if (ctx->jump_whitelist_cmds <= target_cmd_index) { 1286 + DRM_DEBUG("CMD: Rejecting BB_START - truncated whitelist array\n"); 1287 + return -EINVAL; 1288 + } else if (!test_bit(target_cmd_index, ctx->jump_whitelist)) { 1289 + DRM_DEBUG("CMD: BB_START to 0x%llx not a previously executed cmd\n", 1290 + jump_target); 1291 + return -EINVAL; 1292 + } 1293 + 1294 + return 0; 1295 + } 1296 + 1297 + static void init_whitelist(struct i915_gem_context *ctx, u32 batch_len) 1298 + { 1299 + const u32 batch_cmds = DIV_ROUND_UP(batch_len, sizeof(u32)); 1300 + const u32 exact_size = BITS_TO_LONGS(batch_cmds); 1301 + u32 next_size = BITS_TO_LONGS(roundup_pow_of_two(batch_cmds)); 1302 + unsigned long *next_whitelist; 1303 + 1304 + if (CMDPARSER_USES_GGTT(ctx->i915)) 1305 + return; 1306 + 1307 + if (batch_cmds <= 
ctx->jump_whitelist_cmds) { 1308 + bitmap_zero(ctx->jump_whitelist, batch_cmds); 1309 + return; 1310 + } 1311 + 1312 + again: 1313 + next_whitelist = kcalloc(next_size, sizeof(long), GFP_KERNEL); 1314 + if (next_whitelist) { 1315 + kfree(ctx->jump_whitelist); 1316 + ctx->jump_whitelist = next_whitelist; 1317 + ctx->jump_whitelist_cmds = 1318 + next_size * BITS_PER_BYTE * sizeof(long); 1319 + return; 1320 + } 1321 + 1322 + if (next_size > exact_size) { 1323 + next_size = exact_size; 1324 + goto again; 1325 + } 1326 + 1327 + DRM_DEBUG("CMD: Failed to extend whitelist. BB_START may be disallowed\n"); 1328 + bitmap_zero(ctx->jump_whitelist, ctx->jump_whitelist_cmds); 1329 + 1330 + return; 1331 + } 1332 + 1309 1333 #define LENGTH_BIAS 2 1310 1334 1311 1335 /** 1312 1336 * i915_parse_cmds() - parse a submitted batch buffer for privilege violations 1337 + * @ctx: the context in which the batch is to execute 1313 1338 * @engine: the engine on which the batch is to execute 1314 1339 * @batch_obj: the batch buffer in question 1315 - * @shadow_batch_obj: copy of the batch buffer in question 1340 + * @batch_start: Canonical base address of batch 1316 1341 * @batch_start_offset: byte offset in the batch at which execution starts 1317 1342 * @batch_len: length of the commands in batch_obj 1318 - * @is_master: is the submitting process the drm master? 1343 + * @shadow_batch_obj: copy of the batch buffer in question 1344 + * @shadow_batch_start: Canonical base address of shadow_batch_obj 1319 1345 * 1320 1346 * Parses the specified batch buffer looking for privilege violations as 1321 1347 * described in the overview. 
··· 1419 1253 * Return: non-zero if the parser finds violations or otherwise fails; -EACCES 1420 1254 * if the batch appears legal but should use hardware parsing 1421 1255 */ 1422 - int intel_engine_cmd_parser(struct intel_engine_cs *engine, 1256 + 1257 + int intel_engine_cmd_parser(struct i915_gem_context *ctx, 1258 + struct intel_engine_cs *engine, 1423 1259 struct drm_i915_gem_object *batch_obj, 1424 - struct drm_i915_gem_object *shadow_batch_obj, 1260 + u64 batch_start, 1425 1261 u32 batch_start_offset, 1426 1262 u32 batch_len, 1427 - bool is_master) 1263 + struct drm_i915_gem_object *shadow_batch_obj, 1264 + u64 shadow_batch_start) 1428 1265 { 1429 - u32 *cmd, *batch_end; 1266 + u32 *cmd, *batch_end, offset = 0; 1430 1267 struct drm_i915_cmd_descriptor default_desc = noop_desc; 1431 1268 const struct drm_i915_cmd_descriptor *desc = &default_desc; 1432 1269 bool needs_clflush_after = false; ··· 1443 1274 return PTR_ERR(cmd); 1444 1275 } 1445 1276 1277 + init_whitelist(ctx, batch_len); 1278 + 1446 1279 /* 1447 1280 * We use the batch length as size because the shadow object is as 1448 1281 * large or larger and copy_batch() will write MI_NOPs to the extra ··· 1454 1283 do { 1455 1284 u32 length; 1456 1285 1457 - if (*cmd == MI_BATCH_BUFFER_END) { 1458 - if (needs_clflush_after) { 1459 - void *ptr = page_mask_bits(shadow_batch_obj->mm.mapping); 1460 - drm_clflush_virt_range(ptr, 1461 - (void *)(cmd + 1) - ptr); 1462 - } 1286 + if (*cmd == MI_BATCH_BUFFER_END) 1463 1287 break; 1464 - } 1465 1288 1466 1289 desc = find_cmd(engine, *cmd, desc, &default_desc); 1467 1290 if (!desc) { 1468 1291 DRM_DEBUG_DRIVER("CMD: Unrecognized command: 0x%08X\n", 1469 1292 *cmd); 1470 1293 ret = -EINVAL; 1471 - break; 1472 - } 1473 - 1474 - /* 1475 - * If the batch buffer contains a chained batch, return an 1476 - * error that tells the caller to abort and dispatch the 1477 - * workload as a non-secure batch. 
1478 - */ 1479 - if (desc->cmd.value == MI_BATCH_BUFFER_START) { 1480 - ret = -EACCES; 1481 - break; 1294 + goto err; 1482 1295 } 1483 1296 1484 1297 if (desc->flags & CMD_DESC_FIXED) ··· 1476 1321 length, 1477 1322 batch_end - cmd); 1478 1323 ret = -EINVAL; 1324 + goto err; 1325 + } 1326 + 1327 + if (!check_cmd(engine, desc, cmd, length)) { 1328 + ret = -EACCES; 1329 + goto err; 1330 + } 1331 + 1332 + if (desc->cmd.value == MI_BATCH_BUFFER_START) { 1333 + ret = check_bbstart(ctx, cmd, offset, length, 1334 + batch_len, batch_start, 1335 + shadow_batch_start); 1336 + 1337 + if (ret) 1338 + goto err; 1479 1339 break; 1480 1340 } 1481 1341 1482 - if (!check_cmd(engine, desc, cmd, length, is_master)) { 1483 - ret = -EACCES; 1484 - break; 1485 - } 1342 + if (ctx->jump_whitelist_cmds > offset) 1343 + set_bit(offset, ctx->jump_whitelist); 1486 1344 1487 1345 cmd += length; 1346 + offset += length; 1488 1347 if (cmd >= batch_end) { 1489 1348 DRM_DEBUG_DRIVER("CMD: Got to the end of the buffer w/o a BBE cmd!\n"); 1490 1349 ret = -EINVAL; 1491 - break; 1350 + goto err; 1492 1351 } 1493 1352 } while (1); 1494 1353 1354 + if (needs_clflush_after) { 1355 + void *ptr = page_mask_bits(shadow_batch_obj->mm.mapping); 1356 + 1357 + drm_clflush_virt_range(ptr, (void *)(cmd + 1) - ptr); 1358 + } 1359 + 1360 + err: 1495 1361 i915_gem_object_unpin_map(shadow_batch_obj); 1496 1362 return ret; 1497 1363 } ··· 1533 1357 1534 1358 /* If the command parser is not enabled, report 0 - unsupported */ 1535 1359 for_each_uabi_engine(engine, dev_priv) { 1536 - if (intel_engine_needs_cmd_parser(engine)) { 1360 + if (intel_engine_using_cmd_parser(engine)) { 1537 1361 active = true; 1538 1362 break; 1539 1363 } ··· 1558 1382 * the parser enabled. 1559 1383 * 9. Don't whitelist or handle oacontrol specially, as ownership 1560 1384 * for oacontrol state is moving to i915-perf. 1385 + * 10. Support for Gen9 BCS Parsing 1561 1386 */ 1562 - return 9; 1387 + return 10; 1563 1388 }
+4 -3
drivers/gpu/drm/i915/i915_drv.c
··· 364 364 if (ret) 365 365 goto cleanup_vga_client; 366 366 367 - /* must happen before intel_power_domains_init_hw() on VLV/CHV */ 368 - intel_update_rawclk(dev_priv); 369 - 370 367 intel_power_domains_init_hw(dev_priv, false); 371 368 372 369 intel_csr_ucode_init(dev_priv); ··· 1847 1850 1848 1851 i915_gem_suspend_late(dev_priv); 1849 1852 1853 + i915_rc6_ctx_wa_suspend(dev_priv); 1854 + 1850 1855 intel_uncore_suspend(&dev_priv->uncore); 1851 1856 1852 1857 intel_power_domains_suspend(dev_priv, ··· 2051 2052 intel_sanitize_gt_powersave(dev_priv); 2052 2053 2053 2054 intel_power_domains_resume(dev_priv); 2055 + 2056 + i915_rc6_ctx_wa_resume(dev_priv); 2054 2057 2055 2058 intel_gt_sanitize(&dev_priv->gt, true); 2056 2059
+26 -5
drivers/gpu/drm/i915/i915_drv.h
··· 593 593 594 594 struct intel_rc6 { 595 595 bool enabled; 596 + bool ctx_corrupted; 597 + intel_wakeref_t ctx_corrupted_wakeref; 596 598 u64 prev_hw_residency[4]; 597 599 u64 cur_residency[4]; 598 600 }; ··· 2077 2075 #define VEBOX_MASK(dev_priv) \ 2078 2076 ENGINE_INSTANCES_MASK(dev_priv, VECS0, I915_MAX_VECS) 2079 2077 2078 + /* 2079 + * The Gen7 cmdparser copies the scanned buffer to the ggtt for execution 2080 + * All later gens can run the final buffer from the ppgtt 2081 + */ 2082 + #define CMDPARSER_USES_GGTT(dev_priv) IS_GEN(dev_priv, 7) 2083 + 2080 2084 #define HAS_LLC(dev_priv) (INTEL_INFO(dev_priv)->has_llc) 2081 2085 #define HAS_SNOOP(dev_priv) (INTEL_INFO(dev_priv)->has_snoop) 2082 2086 #define HAS_EDRAM(dev_priv) ((dev_priv)->edram_size_mb) 2087 + #define HAS_SECURE_BATCHES(dev_priv) (INTEL_GEN(dev_priv) < 6) 2083 2088 #define HAS_WT(dev_priv) ((IS_HASWELL(dev_priv) || \ 2084 2089 IS_BROADWELL(dev_priv)) && HAS_EDRAM(dev_priv)) 2085 2090 ··· 2119 2110 /* Early gen2 have a totally busted CS tlb and require pinned batches. 
*/ 2120 2111 #define HAS_BROKEN_CS_TLB(dev_priv) (IS_I830(dev_priv) || IS_I845G(dev_priv)) 2121 2112 2113 + #define NEEDS_RC6_CTX_CORRUPTION_WA(dev_priv) \ 2114 + (IS_BROADWELL(dev_priv) || IS_GEN(dev_priv, 9)) 2115 + 2122 2116 /* WaRsDisableCoarsePowerGating:skl,cnl */ 2123 2117 #define NEEDS_WaRsDisableCoarsePowerGating(dev_priv) \ 2124 - (IS_CANNONLAKE(dev_priv) || \ 2125 - IS_SKL_GT3(dev_priv) || IS_SKL_GT4(dev_priv)) 2118 + (IS_CANNONLAKE(dev_priv) || IS_GEN(dev_priv, 9)) 2126 2119 2127 2120 #define HAS_GMBUS_IRQ(dev_priv) (INTEL_GEN(dev_priv) >= 4) 2128 2121 #define HAS_GMBUS_BURST_READ(dev_priv) (INTEL_GEN(dev_priv) >= 10 || \ ··· 2295 2284 unsigned long flags); 2296 2285 #define I915_GEM_OBJECT_UNBIND_ACTIVE BIT(0) 2297 2286 2287 + struct i915_vma * __must_check 2288 + i915_gem_object_pin(struct drm_i915_gem_object *obj, 2289 + struct i915_address_space *vm, 2290 + const struct i915_ggtt_view *view, 2291 + u64 size, 2292 + u64 alignment, 2293 + u64 flags); 2294 + 2298 2295 void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv); 2299 2296 2300 2297 static inline int __must_check ··· 2412 2393 int i915_cmd_parser_get_version(struct drm_i915_private *dev_priv); 2413 2394 void intel_engine_init_cmd_parser(struct intel_engine_cs *engine); 2414 2395 void intel_engine_cleanup_cmd_parser(struct intel_engine_cs *engine); 2415 - int intel_engine_cmd_parser(struct intel_engine_cs *engine, 2396 + int intel_engine_cmd_parser(struct i915_gem_context *cxt, 2397 + struct intel_engine_cs *engine, 2416 2398 struct drm_i915_gem_object *batch_obj, 2417 - struct drm_i915_gem_object *shadow_batch_obj, 2399 + u64 user_batch_start, 2418 2400 u32 batch_start_offset, 2419 2401 u32 batch_len, 2420 - bool is_master); 2402 + struct drm_i915_gem_object *shadow_batch_obj, 2403 + u64 shadow_batch_start); 2421 2404 2422 2405 /* intel_device_info.c */ 2423 2406 static inline struct intel_device_info *
+15 -1
drivers/gpu/drm/i915/i915_gem.c
··· 964 964 { 965 965 struct drm_i915_private *dev_priv = to_i915(obj->base.dev); 966 966 struct i915_address_space *vm = &dev_priv->ggtt.vm; 967 + 968 + return i915_gem_object_pin(obj, vm, view, size, alignment, 969 + flags | PIN_GLOBAL); 970 + } 971 + 972 + struct i915_vma * 973 + i915_gem_object_pin(struct drm_i915_gem_object *obj, 974 + struct i915_address_space *vm, 975 + const struct i915_ggtt_view *view, 976 + u64 size, 977 + u64 alignment, 978 + u64 flags) 979 + { 980 + struct drm_i915_private *dev_priv = to_i915(obj->base.dev); 967 981 struct i915_vma *vma; 968 982 int ret; 969 983 ··· 1052 1038 return ERR_PTR(ret); 1053 1039 } 1054 1040 1055 - ret = i915_vma_pin(vma, size, alignment, flags | PIN_GLOBAL); 1041 + ret = i915_vma_pin(vma, size, alignment, flags); 1056 1042 if (ret) 1057 1043 return ERR_PTR(ret); 1058 1044
+1 -1
drivers/gpu/drm/i915/i915_getparam.c
··· 62 62 value = !!(i915->caps.scheduler & I915_SCHEDULER_CAP_SEMAPHORES); 63 63 break; 64 64 case I915_PARAM_HAS_SECURE_BATCHES: 65 - value = capable(CAP_SYS_ADMIN); 65 + value = HAS_SECURE_BATCHES(i915) && capable(CAP_SYS_ADMIN); 66 66 break; 67 67 case I915_PARAM_CMD_PARSER_VERSION: 68 68 value = i915_cmd_parser_get_version(i915);
+10
drivers/gpu/drm/i915/i915_reg.h
··· 471 471 #define ECOCHK_PPGTT_WT_HSW (0x2 << 3) 472 472 #define ECOCHK_PPGTT_WB_HSW (0x3 << 3) 473 473 474 + #define GEN8_RC6_CTX_INFO _MMIO(0x8504) 475 + 474 476 #define GAC_ECO_BITS _MMIO(0x14090) 475 477 #define ECOBITS_SNB_BIT (1 << 13) 476 478 #define ECOBITS_PPGTT_CACHE64B (3 << 8) ··· 556 554 * Registers used only by the command parser 557 555 */ 558 556 #define BCS_SWCTRL _MMIO(0x22200) 557 + 558 + /* There are 16 GPR registers */ 559 + #define BCS_GPR(n) _MMIO(0x22600 + (n) * 8) 560 + #define BCS_GPR_UDW(n) _MMIO(0x22600 + (n) * 8 + 4) 559 561 560 562 #define GPGPU_THREADS_DISPATCHED _MMIO(0x2290) 561 563 #define GPGPU_THREADS_DISPATCHED_UDW _MMIO(0x2290 + 4) ··· 7216 7210 #define BXT_CSR_DC3_DC5_COUNT _MMIO(0x80038) 7217 7211 #define TGL_DMC_DEBUG_DC5_COUNT _MMIO(0x101084) 7218 7212 #define TGL_DMC_DEBUG_DC6_COUNT _MMIO(0x101088) 7213 + 7214 + /* Display Internal Timeout Register */ 7215 + #define RM_TIMEOUT _MMIO(0x42060) 7216 + #define MMIO_TIMEOUT_US(us) ((us) << 0) 7219 7217 7220 7218 /* interrupts */ 7221 7219 #define DE_MASTER_IRQ_CONTROL (1 << 31)
+120 -2
drivers/gpu/drm/i915/intel_pm.c
··· 126 126 */ 127 127 I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) | 128 128 PWM1_GATING_DIS | PWM2_GATING_DIS); 129 + 130 + /* 131 + * Lower the display internal timeout. 132 + * This is needed to avoid any hard hangs when DSI port PLL 133 + * is off and a MMIO access is attempted by any privilege 134 + * application, using batch buffers or any other means. 135 + */ 136 + I915_WRITE(RM_TIMEOUT, MMIO_TIMEOUT_US(950)); 129 137 } 130 138 131 139 static void glk_init_clock_gating(struct drm_i915_private *dev_priv) ··· 8552 8544 dev_priv->ips.corr = (lcfuse & LCFUSE_HIV_MASK); 8553 8545 } 8554 8546 8547 + static bool i915_rc6_ctx_corrupted(struct drm_i915_private *dev_priv) 8548 + { 8549 + return !I915_READ(GEN8_RC6_CTX_INFO); 8550 + } 8551 + 8552 + static void i915_rc6_ctx_wa_init(struct drm_i915_private *i915) 8553 + { 8554 + if (!NEEDS_RC6_CTX_CORRUPTION_WA(i915)) 8555 + return; 8556 + 8557 + if (i915_rc6_ctx_corrupted(i915)) { 8558 + DRM_INFO("RC6 context corrupted, disabling runtime power management\n"); 8559 + i915->gt_pm.rc6.ctx_corrupted = true; 8560 + i915->gt_pm.rc6.ctx_corrupted_wakeref = 8561 + intel_runtime_pm_get(&i915->runtime_pm); 8562 + } 8563 + } 8564 + 8565 + static void i915_rc6_ctx_wa_cleanup(struct drm_i915_private *i915) 8566 + { 8567 + if (i915->gt_pm.rc6.ctx_corrupted) { 8568 + intel_runtime_pm_put(&i915->runtime_pm, 8569 + i915->gt_pm.rc6.ctx_corrupted_wakeref); 8570 + i915->gt_pm.rc6.ctx_corrupted = false; 8571 + } 8572 + } 8573 + 8574 + /** 8575 + * i915_rc6_ctx_wa_suspend - system suspend sequence for the RC6 CTX WA 8576 + * @i915: i915 device 8577 + * 8578 + * Perform any steps needed to clean up the RC6 CTX WA before system suspend. 
8579 + */ 8580 + void i915_rc6_ctx_wa_suspend(struct drm_i915_private *i915) 8581 + { 8582 + if (i915->gt_pm.rc6.ctx_corrupted) 8583 + intel_runtime_pm_put(&i915->runtime_pm, 8584 + i915->gt_pm.rc6.ctx_corrupted_wakeref); 8585 + } 8586 + 8587 + /** 8588 + * i915_rc6_ctx_wa_resume - system resume sequence for the RC6 CTX WA 8589 + * @i915: i915 device 8590 + * 8591 + * Perform any steps needed to re-init the RC6 CTX WA after system resume. 8592 + */ 8593 + void i915_rc6_ctx_wa_resume(struct drm_i915_private *i915) 8594 + { 8595 + if (!i915->gt_pm.rc6.ctx_corrupted) 8596 + return; 8597 + 8598 + if (i915_rc6_ctx_corrupted(i915)) { 8599 + i915->gt_pm.rc6.ctx_corrupted_wakeref = 8600 + intel_runtime_pm_get(&i915->runtime_pm); 8601 + return; 8602 + } 8603 + 8604 + DRM_INFO("RC6 context restored, re-enabling runtime power management\n"); 8605 + i915->gt_pm.rc6.ctx_corrupted = false; 8606 + } 8607 + 8608 + static void intel_disable_rc6(struct drm_i915_private *dev_priv); 8609 + 8610 + /** 8611 + * i915_rc6_ctx_wa_check - check for a new RC6 CTX corruption 8612 + * @i915: i915 device 8613 + * 8614 + * Check if an RC6 CTX corruption has happened since the last check and if so 8615 + * disable RC6 and runtime power management. 8616 + * 8617 + * Return false if no context corruption has happened since the last call of 8618 + * this function, true otherwise. 
8619 + */ 8620 + bool i915_rc6_ctx_wa_check(struct drm_i915_private *i915) 8621 + { 8622 + if (!NEEDS_RC6_CTX_CORRUPTION_WA(i915)) 8623 + return false; 8624 + 8625 + if (i915->gt_pm.rc6.ctx_corrupted) 8626 + return false; 8627 + 8628 + if (!i915_rc6_ctx_corrupted(i915)) 8629 + return false; 8630 + 8631 + DRM_NOTE("RC6 context corruption, disabling runtime power management\n"); 8632 + 8633 + intel_disable_rc6(i915); 8634 + i915->gt_pm.rc6.ctx_corrupted = true; 8635 + i915->gt_pm.rc6.ctx_corrupted_wakeref = 8636 + intel_runtime_pm_get_noresume(&i915->runtime_pm); 8637 + 8638 + return true; 8639 + } 8640 + 8555 8641 void intel_init_gt_powersave(struct drm_i915_private *dev_priv) 8556 8642 { 8557 8643 struct intel_rps *rps = &dev_priv->gt_pm.rps; ··· 8658 8556 DRM_INFO("RC6 disabled, disabling runtime PM support\n"); 8659 8557 pm_runtime_get(&dev_priv->drm.pdev->dev); 8660 8558 } 8559 + 8560 + i915_rc6_ctx_wa_init(dev_priv); 8661 8561 8662 8562 /* Initialize RPS limits (for userspace) */ 8663 8563 if (IS_CHERRYVIEW(dev_priv)) ··· 8699 8595 if (IS_VALLEYVIEW(dev_priv)) 8700 8596 valleyview_cleanup_gt_powersave(dev_priv); 8701 8597 8598 + i915_rc6_ctx_wa_cleanup(dev_priv); 8599 + 8702 8600 if (!HAS_RC6(dev_priv)) 8703 8601 pm_runtime_put(&dev_priv->drm.pdev->dev); 8704 8602 } ··· 8729 8623 i915->gt_pm.llc_pstate.enabled = false; 8730 8624 } 8731 8625 8732 - static void intel_disable_rc6(struct drm_i915_private *dev_priv) 8626 + static void __intel_disable_rc6(struct drm_i915_private *dev_priv) 8733 8627 { 8734 8628 lockdep_assert_held(&dev_priv->gt_pm.rps.lock); 8735 8629 ··· 8746 8640 gen6_disable_rc6(dev_priv); 8747 8641 8748 8642 dev_priv->gt_pm.rc6.enabled = false; 8643 + } 8644 + 8645 + static void intel_disable_rc6(struct drm_i915_private *dev_priv) 8646 + { 8647 + struct intel_rps *rps = &dev_priv->gt_pm.rps; 8648 + 8649 + mutex_lock(&rps->lock); 8650 + __intel_disable_rc6(dev_priv); 8651 + mutex_unlock(&rps->lock); 8749 8652 } 8750 8653 8751 8654 static void 
intel_disable_rps(struct drm_i915_private *dev_priv) ··· 8782 8667 { 8783 8668 mutex_lock(&dev_priv->gt_pm.rps.lock); 8784 8669 8785 - intel_disable_rc6(dev_priv); 8670 + __intel_disable_rc6(dev_priv); 8786 8671 intel_disable_rps(dev_priv); 8787 8672 if (HAS_LLC(dev_priv)) 8788 8673 intel_disable_llc_pstate(dev_priv); ··· 8807 8692 lockdep_assert_held(&dev_priv->gt_pm.rps.lock); 8808 8693 8809 8694 if (dev_priv->gt_pm.rc6.enabled) 8695 + return; 8696 + 8697 + if (dev_priv->gt_pm.rc6.ctx_corrupted) 8810 8698 return; 8811 8699 8812 8700 if (IS_CHERRYVIEW(dev_priv))
+3
drivers/gpu/drm/i915/intel_pm.h
··· 36 36 void intel_sanitize_gt_powersave(struct drm_i915_private *dev_priv); 37 37 void intel_enable_gt_powersave(struct drm_i915_private *dev_priv); 38 38 void intel_disable_gt_powersave(struct drm_i915_private *dev_priv); 39 + bool i915_rc6_ctx_wa_check(struct drm_i915_private *i915); 40 + void i915_rc6_ctx_wa_suspend(struct drm_i915_private *i915); 41 + void i915_rc6_ctx_wa_resume(struct drm_i915_private *i915); 39 42 void gen6_rps_busy(struct drm_i915_private *dev_priv); 40 43 void gen6_rps_idle(struct drm_i915_private *dev_priv); 41 44 void gen6_rps_boost(struct i915_request *rq);
+1 -1
drivers/gpu/drm/sun4i/sun4i_tcon.c
··· 488 488 489 489 WARN_ON(!tcon->quirks->has_channel_0); 490 490 491 - tcon->dclk_min_div = 6; 491 + tcon->dclk_min_div = 1; 492 492 tcon->dclk_max_div = 127; 493 493 sun4i_tcon0_mode_set_common(tcon, mode); 494 494
+3
drivers/hwtracing/intel_th/gth.c
··· 626 626 if (!count) 627 627 dev_dbg(&thdev->dev, "timeout waiting for CTS Trigger\n"); 628 628 629 + /* De-assert the trigger */ 630 + iowrite32(0, gth->base + REG_CTS_CTL); 631 + 629 632 intel_th_gth_stop(gth, output, false); 630 633 intel_th_gth_start(gth, output); 631 634 }
+8 -3
drivers/hwtracing/intel_th/msu.c
··· 164 164 }; 165 165 166 166 static LIST_HEAD(msu_buffer_list); 167 - static struct mutex msu_buffer_mutex; 167 + static DEFINE_MUTEX(msu_buffer_mutex); 168 168 169 169 /** 170 170 * struct msu_buffer_entry - internal MSU buffer bookkeeping ··· 327 327 struct msc_block_desc *bdesc = sg_virt(sg); 328 328 329 329 if (msc_block_wrapped(bdesc)) 330 - return win->nr_blocks << PAGE_SHIFT; 330 + return (size_t)win->nr_blocks << PAGE_SHIFT; 331 331 332 332 size += msc_total_sz(bdesc); 333 333 if (msc_block_last_written(bdesc)) ··· 1848 1848 len = cp - buf; 1849 1849 1850 1850 mode = kstrndup(buf, len, GFP_KERNEL); 1851 + if (!mode) 1852 + return -ENOMEM; 1853 + 1851 1854 i = match_string(msc_mode, ARRAY_SIZE(msc_mode), mode); 1852 - if (i >= 0) 1855 + if (i >= 0) { 1856 + kfree(mode); 1853 1857 goto found; 1858 + } 1854 1859 1855 1860 /* Buffer sinks only work with a usable IRQ */ 1856 1861 if (!msc->do_irq) {
+10
drivers/hwtracing/intel_th/pci.c
··· 200 200 .driver_data = (kernel_ulong_t)&intel_th_2x, 201 201 }, 202 202 { 203 + /* Comet Lake PCH */ 204 + PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x06a6), 205 + .driver_data = (kernel_ulong_t)&intel_th_2x, 206 + }, 207 + { 203 208 /* Ice Lake NNPI */ 204 209 PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x45c5), 205 210 .driver_data = (kernel_ulong_t)&intel_th_2x, ··· 212 207 { 213 208 /* Tiger Lake PCH */ 214 209 PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xa0a6), 210 + .driver_data = (kernel_ulong_t)&intel_th_2x, 211 + }, 212 + { 213 + /* Jasper Lake PCH */ 214 + PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x4da6), 215 215 .driver_data = (kernel_ulong_t)&intel_th_2x, 216 216 }, 217 217 { 0 },
+2 -2
drivers/iio/adc/stm32-adc.c
··· 1399 1399 cookie = dmaengine_submit(desc); 1400 1400 ret = dma_submit_error(cookie); 1401 1401 if (ret) { 1402 - dmaengine_terminate_all(adc->dma_chan); 1402 + dmaengine_terminate_sync(adc->dma_chan); 1403 1403 return ret; 1404 1404 } 1405 1405 ··· 1477 1477 stm32_adc_conv_irq_disable(adc); 1478 1478 1479 1479 if (adc->dma_chan) 1480 - dmaengine_terminate_all(adc->dma_chan); 1480 + dmaengine_terminate_sync(adc->dma_chan); 1481 1481 1482 1482 if (stm32_adc_set_trig(indio_dev, NULL)) 1483 1483 dev_err(&indio_dev->dev, "Can't clear trigger\n");
+4 -1
drivers/iio/imu/adis16480.c
··· 317 317 struct adis16480 *st = iio_priv(indio_dev); 318 318 unsigned int t, reg; 319 319 320 + if (val < 0 || val2 < 0) 321 + return -EINVAL; 322 + 320 323 t = val * 1000 + val2 / 1000; 321 - if (t <= 0) 324 + if (t == 0) 322 325 return -EINVAL; 323 326 324 327 /*
+9
drivers/iio/imu/inv_mpu6050/inv_mpu_core.c
··· 114 114 .name = "MPU6050", 115 115 .reg = &reg_set_6050, 116 116 .config = &chip_config_6050, 117 + .fifo_size = 1024, 117 118 }, 118 119 { 119 120 .whoami = INV_MPU6500_WHOAMI_VALUE, 120 121 .name = "MPU6500", 121 122 .reg = &reg_set_6500, 122 123 .config = &chip_config_6050, 124 + .fifo_size = 512, 123 125 }, 124 126 { 125 127 .whoami = INV_MPU6515_WHOAMI_VALUE, 126 128 .name = "MPU6515", 127 129 .reg = &reg_set_6500, 128 130 .config = &chip_config_6050, 131 + .fifo_size = 512, 129 132 }, 130 133 { 131 134 .whoami = INV_MPU6000_WHOAMI_VALUE, 132 135 .name = "MPU6000", 133 136 .reg = &reg_set_6050, 134 137 .config = &chip_config_6050, 138 + .fifo_size = 1024, 135 139 }, 136 140 { 137 141 .whoami = INV_MPU9150_WHOAMI_VALUE, 138 142 .name = "MPU9150", 139 143 .reg = &reg_set_6050, 140 144 .config = &chip_config_6050, 145 + .fifo_size = 1024, 141 146 }, 142 147 { 143 148 .whoami = INV_MPU9250_WHOAMI_VALUE, 144 149 .name = "MPU9250", 145 150 .reg = &reg_set_6500, 146 151 .config = &chip_config_6050, 152 + .fifo_size = 512, 147 153 }, 148 154 { 149 155 .whoami = INV_MPU9255_WHOAMI_VALUE, 150 156 .name = "MPU9255", 151 157 .reg = &reg_set_6500, 152 158 .config = &chip_config_6050, 159 + .fifo_size = 512, 153 160 }, 154 161 { 155 162 .whoami = INV_ICM20608_WHOAMI_VALUE, 156 163 .name = "ICM20608", 157 164 .reg = &reg_set_6500, 158 165 .config = &chip_config_6050, 166 + .fifo_size = 512, 159 167 }, 160 168 { 161 169 .whoami = INV_ICM20602_WHOAMI_VALUE, 162 170 .name = "ICM20602", 163 171 .reg = &reg_set_icm20602, 164 172 .config = &chip_config_6050, 173 + .fifo_size = 1008, 165 174 }, 166 175 }; 167 176
+2
drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h
··· 100 100 * @name: name of the chip. 101 101 * @reg: register map of the chip. 102 102 * @config: configuration of the chip. 103 + * @fifo_size: size of the FIFO in bytes. 103 104 */ 104 105 struct inv_mpu6050_hw { 105 106 u8 whoami; 106 107 u8 *name; 107 108 const struct inv_mpu6050_reg_map *reg; 108 109 const struct inv_mpu6050_chip_config *config; 110 + size_t fifo_size; 109 111 }; 110 112 111 113 /*
+12 -3
drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c
··· 180 180 "failed to ack interrupt\n"); 181 181 goto flush_fifo; 182 182 } 183 - /* handle fifo overflow by reseting fifo */ 184 - if (int_status & INV_MPU6050_BIT_FIFO_OVERFLOW_INT) 185 - goto flush_fifo; 186 183 if (!(int_status & INV_MPU6050_BIT_RAW_DATA_RDY_INT)) { 187 184 dev_warn(regmap_get_device(st->map), 188 185 "spurious interrupt with status 0x%x\n", int_status); ··· 208 211 if (result) 209 212 goto end_session; 210 213 fifo_count = get_unaligned_be16(&data[0]); 214 + 215 + /* 216 + * Handle fifo overflow by resetting fifo. 217 + * Reset if there is only 3 data set free remaining to mitigate 218 + * possible delay between reading fifo count and fifo data. 219 + */ 220 + nb = 3 * bytes_per_datum; 221 + if (fifo_count >= st->hw->fifo_size - nb) { 222 + dev_warn(regmap_get_device(st->map), "fifo overflow reset\n"); 223 + goto flush_fifo; 224 + } 225 + 211 226 /* compute and process all complete datum */ 212 227 nb = fifo_count / bytes_per_datum; 213 228 inv_mpu6050_update_period(st, pf->timestamp, nb);
+15 -14
drivers/iio/proximity/srf04.c
··· 110 110 udelay(data->cfg->trigger_pulse_us); 111 111 gpiod_set_value(data->gpiod_trig, 0); 112 112 113 - /* it cannot take more than 20 ms */ 113 + /* it should not take more than 20 ms until echo is rising */ 114 114 ret = wait_for_completion_killable_timeout(&data->rising, HZ/50); 115 115 if (ret < 0) { 116 116 mutex_unlock(&data->lock); ··· 120 120 return -ETIMEDOUT; 121 121 } 122 122 123 - ret = wait_for_completion_killable_timeout(&data->falling, HZ/50); 123 + /* it cannot take more than 50 ms until echo is falling */ 124 + ret = wait_for_completion_killable_timeout(&data->falling, HZ/20); 124 125 if (ret < 0) { 125 126 mutex_unlock(&data->lock); 126 127 return ret; ··· 136 135 137 136 dt_ns = ktime_to_ns(ktime_dt); 138 137 /* 139 - * measuring more than 3 meters is beyond the capabilities of 140 - * the sensor 138 + * measuring more than 6,45 meters is beyond the capabilities of 139 + * the supported sensors 141 140 * ==> filter out invalid results for not measuring echos of 142 141 * another us sensor 143 142 * 144 143 * formula: 145 - * distance 3 m 146 - * time = ---------- = --------- = 9404389 ns 147 - * speed 319 m/s 144 + * distance 6,45 * 2 m 145 + * time = ---------- = ------------ = 40438871 ns 146 + * speed 319 m/s 148 147 * 149 148 * using a minimum speed at -20 °C of 319 m/s 150 149 */ 151 - if (dt_ns > 9404389) 150 + if (dt_ns > 40438871) 152 151 return -EIO; 153 152 154 153 time_ns = dt_ns; ··· 160 159 * with Temp in °C 161 160 * and speed in m/s 162 161 * 163 - * use 343 m/s as ultrasonic speed at 20 °C here in absence of the 162 + * use 343,5 m/s as ultrasonic speed at 20 °C here in absence of the 164 163 * temperature 165 164 * 166 165 * therefore: 167 - * time 343 168 - * distance = ------ * ----- 169 - * 10^6 2 166 + * time 343,5 time * 106 167 + * distance = ------ * ------- = ------------ 168 + * 10^6 2 617176 170 169 * with time in ns 171 170 * and distance in mm (one way) 172 171 * 173 - * because we limit to 3 meters the 
multiplication with 343 just 172 + * because we limit to 6,45 meters the multiplication with 106 just 174 173 * fits into 32 bit 175 174 */ 176 - distance_mm = time_ns * 343 / 2000000; 175 + distance_mm = time_ns * 106 / 617176; 177 176 178 177 return distance_mm; 179 178 }
-1
drivers/infiniband/hw/hfi1/init.c
··· 1489 1489 goto bail_dev; 1490 1490 } 1491 1491 1492 - hfi1_compute_tid_rdma_flow_wt(); 1493 1492 /* 1494 1493 * These must be called before the driver is registered with 1495 1494 * the PCI subsystem.
+3 -1
drivers/infiniband/hw/hfi1/pcie.c
··· 319 319 /* 320 320 * bus->max_bus_speed is set from the bridge's linkcap Max Link Speed 321 321 */ 322 - if (parent && dd->pcidev->bus->max_bus_speed != PCIE_SPEED_8_0GT) { 322 + if (parent && 323 + (dd->pcidev->bus->max_bus_speed == PCIE_SPEED_2_5GT || 324 + dd->pcidev->bus->max_bus_speed == PCIE_SPEED_5_0GT)) { 323 325 dd_dev_info(dd, "Parent PCIe bridge does not support Gen3\n"); 324 326 dd->link_gen3_capable = 0; 325 327 }
+8 -8
drivers/infiniband/hw/hfi1/rc.c
··· 2209 2209 if (qp->s_flags & RVT_S_WAIT_RNR) 2210 2210 goto bail_stop; 2211 2211 rdi = ib_to_rvt(qp->ibqp.device); 2212 - if (qp->s_rnr_retry == 0 && 2213 - !((rdi->post_parms[wqe->wr.opcode].flags & 2214 - RVT_OPERATION_IGN_RNR_CNT) && 2215 - qp->s_rnr_retry_cnt == 0)) { 2216 - status = IB_WC_RNR_RETRY_EXC_ERR; 2217 - goto class_b; 2212 + if (!(rdi->post_parms[wqe->wr.opcode].flags & 2213 + RVT_OPERATION_IGN_RNR_CNT)) { 2214 + if (qp->s_rnr_retry == 0) { 2215 + status = IB_WC_RNR_RETRY_EXC_ERR; 2216 + goto class_b; 2217 + } 2218 + if (qp->s_rnr_retry_cnt < 7 && qp->s_rnr_retry_cnt > 0) 2219 + qp->s_rnr_retry--; 2218 2220 } 2219 - if (qp->s_rnr_retry_cnt < 7 && qp->s_rnr_retry_cnt > 0) 2220 - qp->s_rnr_retry--; 2221 2221 2222 2222 /* 2223 2223 * The last valid PSN is the previous PSN. For TID RDMA WRITE
+32 -25
drivers/infiniband/hw/hfi1/tid_rdma.c
··· 107 107 * C - Capcode 108 108 */ 109 109 110 - static u32 tid_rdma_flow_wt; 111 - 112 110 static void tid_rdma_trigger_resume(struct work_struct *work); 113 111 static void hfi1_kern_exp_rcv_free_flows(struct tid_rdma_request *req); 114 112 static int hfi1_kern_exp_rcv_alloc_flows(struct tid_rdma_request *req, ··· 133 135 struct hfi1_ctxtdata *rcd, 134 136 struct tid_rdma_flow *flow, 135 137 bool fecn); 138 + 139 + static void validate_r_tid_ack(struct hfi1_qp_priv *priv) 140 + { 141 + if (priv->r_tid_ack == HFI1_QP_WQE_INVALID) 142 + priv->r_tid_ack = priv->r_tid_tail; 143 + } 144 + 145 + static void tid_rdma_schedule_ack(struct rvt_qp *qp) 146 + { 147 + struct hfi1_qp_priv *priv = qp->priv; 148 + 149 + priv->s_flags |= RVT_S_ACK_PENDING; 150 + hfi1_schedule_tid_send(qp); 151 + } 152 + 153 + static void tid_rdma_trigger_ack(struct rvt_qp *qp) 154 + { 155 + validate_r_tid_ack(qp->priv); 156 + tid_rdma_schedule_ack(qp); 157 + } 136 158 137 159 static u64 tid_rdma_opfn_encode(struct tid_rdma_params *p) 138 160 { ··· 3023 3005 qpriv->s_nak_state = IB_NAK_PSN_ERROR; 3024 3006 /* We are NAK'ing the next expected PSN */ 3025 3007 qpriv->s_nak_psn = mask_psn(flow->flow_state.r_next_psn); 3026 - qpriv->s_flags |= RVT_S_ACK_PENDING; 3027 - if (qpriv->r_tid_ack == HFI1_QP_WQE_INVALID) 3028 - qpriv->r_tid_ack = qpriv->r_tid_tail; 3029 - hfi1_schedule_tid_send(qp); 3008 + tid_rdma_trigger_ack(qp); 3030 3009 } 3031 3010 goto unlock; 3032 3011 } ··· 3386 3371 return sizeof(ohdr->u.tid_rdma.w_req) / sizeof(u32); 3387 3372 } 3388 3373 3389 - void hfi1_compute_tid_rdma_flow_wt(void) 3374 + static u32 hfi1_compute_tid_rdma_flow_wt(struct rvt_qp *qp) 3390 3375 { 3391 3376 /* 3392 3377 * Heuristic for computing the RNR timeout when waiting on the flow 3393 3378 * queue. 
Rather than a computationaly expensive exact estimate of when 3394 3379 * a flow will be available, we assume that if a QP is at position N in 3395 3380 * the flow queue it has to wait approximately (N + 1) * (number of 3396 - * segments between two sync points), assuming PMTU of 4K. The rationale 3397 - * for this is that flows are released and recycled at each sync point. 3381 + * segments between two sync points). The rationale for this is that 3382 + * flows are released and recycled at each sync point. 3398 3383 */ 3399 - tid_rdma_flow_wt = MAX_TID_FLOW_PSN * enum_to_mtu(OPA_MTU_4096) / 3400 - TID_RDMA_MAX_SEGMENT_SIZE; 3384 + return (MAX_TID_FLOW_PSN * qp->pmtu) >> TID_RDMA_SEGMENT_SHIFT; 3401 3385 } 3402 3386 3403 3387 static u32 position_in_queue(struct hfi1_qp_priv *qpriv, ··· 3519 3505 if (qpriv->flow_state.index >= RXE_NUM_TID_FLOWS) { 3520 3506 ret = hfi1_kern_setup_hw_flow(qpriv->rcd, qp); 3521 3507 if (ret) { 3522 - to_seg = tid_rdma_flow_wt * 3508 + to_seg = hfi1_compute_tid_rdma_flow_wt(qp) * 3523 3509 position_in_queue(qpriv, 3524 3510 &rcd->flow_queue); 3525 3511 break; ··· 3540 3526 /* 3541 3527 * If overtaking req->acked_tail, send an RNR NAK. 
Because the 3542 3528 * QP is not queued in this case, and the issue can only be 3543 - * caused due a delay in scheduling the second leg which we 3529 + * caused by a delay in scheduling the second leg which we 3544 3530 * cannot estimate, we use a rather arbitrary RNR timeout of 3545 3531 * (MAX_FLOWS / 2) segments 3546 3532 */ ··· 3548 3534 MAX_FLOWS)) { 3549 3535 ret = -EAGAIN; 3550 3536 to_seg = MAX_FLOWS >> 1; 3551 - qpriv->s_flags |= RVT_S_ACK_PENDING; 3552 - hfi1_schedule_tid_send(qp); 3537 + tid_rdma_trigger_ack(qp); 3553 3538 break; 3554 3539 } 3555 3540 ··· 4348 4335 trace_hfi1_tid_req_rcv_write_data(qp, 0, e->opcode, e->psn, e->lpsn, 4349 4336 req); 4350 4337 trace_hfi1_tid_write_rsp_rcv_data(qp); 4351 - if (priv->r_tid_ack == HFI1_QP_WQE_INVALID) 4352 - priv->r_tid_ack = priv->r_tid_tail; 4338 + validate_r_tid_ack(priv); 4353 4339 4354 4340 if (opcode == TID_OP(WRITE_DATA_LAST)) { 4355 4341 release_rdma_sge_mr(e); ··· 4387 4375 } 4388 4376 4389 4377 done: 4390 - priv->s_flags |= RVT_S_ACK_PENDING; 4391 - hfi1_schedule_tid_send(qp); 4378 + tid_rdma_schedule_ack(qp); 4392 4379 exit: 4393 4380 priv->r_next_psn_kdeth = flow->flow_state.r_next_psn; 4394 4381 if (fecn) ··· 4399 4388 if (!priv->s_nak_state) { 4400 4389 priv->s_nak_state = IB_NAK_PSN_ERROR; 4401 4390 priv->s_nak_psn = flow->flow_state.r_next_psn; 4402 - priv->s_flags |= RVT_S_ACK_PENDING; 4403 - if (priv->r_tid_ack == HFI1_QP_WQE_INVALID) 4404 - priv->r_tid_ack = priv->r_tid_tail; 4405 - hfi1_schedule_tid_send(qp); 4391 + tid_rdma_trigger_ack(qp); 4406 4392 } 4407 4393 goto done; 4408 4394 } ··· 4947 4939 qpriv->resync = true; 4948 4940 /* RESYNC request always gets a TID RDMA ACK. */ 4949 4941 qpriv->s_nak_state = 0; 4950 - qpriv->s_flags |= RVT_S_ACK_PENDING; 4951 - hfi1_schedule_tid_send(qp); 4942 + tid_rdma_trigger_ack(qp); 4952 4943 bail: 4953 4944 if (fecn) 4954 4945 qp->s_flags |= RVT_S_ECN;
+1 -2
drivers/infiniband/hw/hfi1/tid_rdma.h
··· 17 17 #define TID_RDMA_MIN_SEGMENT_SIZE BIT(18) /* 256 KiB (for now) */ 18 18 #define TID_RDMA_MAX_SEGMENT_SIZE BIT(18) /* 256 KiB (for now) */ 19 19 #define TID_RDMA_MAX_PAGES (BIT(18) >> PAGE_SHIFT) 20 + #define TID_RDMA_SEGMENT_SHIFT 18 20 21 21 22 /* 22 23 * Bit definitions for priv->s_flags. ··· 274 273 u32 hfi1_build_tid_rdma_write_req(struct rvt_qp *qp, struct rvt_swqe *wqe, 275 274 struct ib_other_headers *ohdr, 276 275 u32 *bth1, u32 *bth2, u32 *len); 277 - 278 - void hfi1_compute_tid_rdma_flow_wt(void); 279 276 280 277 void hfi1_rc_rcv_tid_rdma_write_req(struct hfi1_packet *packet); 281 278
+1 -1
drivers/infiniband/hw/hns/hns_roce_hem.h
··· 59 59 60 60 #define HNS_ROCE_HEM_CHUNK_LEN \ 61 61 ((256 - sizeof(struct list_head) - 2 * sizeof(int)) / \ 62 - (sizeof(struct scatterlist))) 62 + (sizeof(struct scatterlist) + sizeof(void *))) 63 63 64 64 #define check_whether_bt_num_3(type, hop_num) \ 65 65 (type < HEM_TYPE_MTT && hop_num == 2)
+1 -1
drivers/infiniband/hw/hns/hns_roce_srq.c
··· 376 376 srq->max = roundup_pow_of_two(srq_init_attr->attr.max_wr + 1); 377 377 srq->max_gs = srq_init_attr->attr.max_sge; 378 378 379 - srq_desc_size = max(16, 16 * srq->max_gs); 379 + srq_desc_size = roundup_pow_of_two(max(16, 16 * srq->max_gs)); 380 380 381 381 srq->wqe_shift = ilog2(srq_desc_size); 382 382
+9
drivers/input/ff-memless.c
··· 489 489 { 490 490 struct ml_device *ml = ff->private; 491 491 492 + /* 493 + * Even though we stop all playing effects when tearing down 494 + * an input device (via input_device_flush() that calls into 495 + * input_ff_flush() that stops and erases all effects), we 496 + * do not actually stop the timer, and therefore we should 497 + * do it here. 498 + */ 499 + del_timer_sync(&ml->timer); 500 + 492 501 kfree(ml->private); 493 502 } 494 503
+1
drivers/input/mouse/synaptics.c
··· 177 177 "LEN0096", /* X280 */ 178 178 "LEN0097", /* X280 -> ALPS trackpoint */ 179 179 "LEN009b", /* T580 */ 180 + "LEN0402", /* X1 Extreme 2nd Generation */ 180 181 "LEN200f", /* T450s */ 181 182 "LEN2054", /* E480 */ 182 183 "LEN2055", /* E580 */
+3 -6
drivers/input/rmi4/rmi_f11.c
··· 510 510 struct rmi_2d_sensor_platform_data sensor_pdata; 511 511 unsigned long *abs_mask; 512 512 unsigned long *rel_mask; 513 - unsigned long *result_bits; 514 513 }; 515 514 516 515 enum f11_finger_state { ··· 1056 1057 /* 1057 1058 ** init instance data, fill in values and create any sysfs files 1058 1059 */ 1059 - f11 = devm_kzalloc(&fn->dev, sizeof(struct f11_data) + mask_size * 3, 1060 + f11 = devm_kzalloc(&fn->dev, sizeof(struct f11_data) + mask_size * 2, 1060 1061 GFP_KERNEL); 1061 1062 if (!f11) 1062 1063 return -ENOMEM; ··· 1075 1076 + sizeof(struct f11_data)); 1076 1077 f11->rel_mask = (unsigned long *)((char *)f11 1077 1078 + sizeof(struct f11_data) + mask_size); 1078 - f11->result_bits = (unsigned long *)((char *)f11 1079 - + sizeof(struct f11_data) + mask_size * 2); 1080 1079 1081 1080 set_bit(fn->irq_pos, f11->abs_mask); 1082 1081 set_bit(fn->irq_pos + 1, f11->rel_mask); ··· 1281 1284 valid_bytes = f11->sensor.attn_size; 1282 1285 memcpy(f11->sensor.data_pkt, drvdata->attn_data.data, 1283 1286 valid_bytes); 1284 - drvdata->attn_data.data += f11->sensor.attn_size; 1285 - drvdata->attn_data.size -= f11->sensor.attn_size; 1287 + drvdata->attn_data.data += valid_bytes; 1288 + drvdata->attn_data.size -= valid_bytes; 1286 1289 } else { 1287 1290 error = rmi_read_block(rmi_dev, 1288 1291 data_base_addr, f11->sensor.data_pkt,
+28 -4
drivers/input/rmi4/rmi_f12.c
··· 55 55 56 56 const struct rmi_register_desc_item *data15; 57 57 u16 data15_offset; 58 + 59 + unsigned long *abs_mask; 60 + unsigned long *rel_mask; 58 61 }; 59 62 60 63 static int rmi_f12_read_sensor_tuning(struct f12_data *f12) ··· 212 209 valid_bytes = sensor->attn_size; 213 210 memcpy(sensor->data_pkt, drvdata->attn_data.data, 214 211 valid_bytes); 215 - drvdata->attn_data.data += sensor->attn_size; 216 - drvdata->attn_data.size -= sensor->attn_size; 212 + drvdata->attn_data.data += valid_bytes; 213 + drvdata->attn_data.size -= valid_bytes; 217 214 } else { 218 215 retval = rmi_read_block(rmi_dev, f12->data_addr, 219 216 sensor->data_pkt, sensor->pkt_size); ··· 294 291 static int rmi_f12_config(struct rmi_function *fn) 295 292 { 296 293 struct rmi_driver *drv = fn->rmi_dev->driver; 294 + struct f12_data *f12 = dev_get_drvdata(&fn->dev); 295 + struct rmi_2d_sensor *sensor; 297 296 int ret; 298 297 299 - drv->set_irq_bits(fn->rmi_dev, fn->irq_mask); 298 + sensor = &f12->sensor; 299 + 300 + if (!sensor->report_abs) 301 + drv->clear_irq_bits(fn->rmi_dev, f12->abs_mask); 302 + else 303 + drv->set_irq_bits(fn->rmi_dev, f12->abs_mask); 304 + 305 + drv->clear_irq_bits(fn->rmi_dev, f12->rel_mask); 300 306 301 307 ret = rmi_f12_write_control_regs(fn); 302 308 if (ret) ··· 327 315 struct rmi_device_platform_data *pdata = rmi_get_platform_data(rmi_dev); 328 316 struct rmi_driver_data *drvdata = dev_get_drvdata(&rmi_dev->dev); 329 317 u16 data_offset = 0; 318 + int mask_size; 330 319 331 320 rmi_dbg(RMI_DEBUG_FN, &fn->dev, "%s\n", __func__); 321 + 322 + mask_size = BITS_TO_LONGS(drvdata->irq_count) * sizeof(unsigned long); 332 323 333 324 ret = rmi_read(fn->rmi_dev, query_addr, &buf); 334 325 if (ret < 0) { ··· 347 332 return -ENODEV; 348 333 } 349 334 350 - f12 = devm_kzalloc(&fn->dev, sizeof(struct f12_data), GFP_KERNEL); 335 + f12 = devm_kzalloc(&fn->dev, sizeof(struct f12_data) + mask_size * 2, 336 + GFP_KERNEL); 351 337 if (!f12) 352 338 return -ENOMEM; 339 + 340 + 
f12->abs_mask = (unsigned long *)((char *)f12 341 + + sizeof(struct f12_data)); 342 + f12->rel_mask = (unsigned long *)((char *)f12 343 + + sizeof(struct f12_data) + mask_size); 344 + 345 + set_bit(fn->irq_pos, f12->abs_mask); 346 + set_bit(fn->irq_pos + 1, f12->rel_mask); 353 347 354 348 f12->has_dribble = !!(buf & BIT(3)); 355 349
+3 -2
drivers/input/rmi4/rmi_f54.c
··· 359 359 static const struct vb2_queue rmi_f54_queue = { 360 360 .type = V4L2_BUF_TYPE_VIDEO_CAPTURE, 361 361 .io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF | VB2_READ, 362 - .buf_struct_size = sizeof(struct vb2_buffer), 362 + .buf_struct_size = sizeof(struct vb2_v4l2_buffer), 363 363 .ops = &rmi_f54_queue_ops, 364 364 .mem_ops = &vb2_vmalloc_memops, 365 365 .timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC, ··· 601 601 { 602 602 struct rmi_driver *drv = fn->rmi_dev->driver; 603 603 604 - drv->set_irq_bits(fn->rmi_dev, fn->irq_mask); 604 + drv->clear_irq_bits(fn->rmi_dev, fn->irq_mask); 605 605 606 606 return 0; 607 607 } ··· 730 730 731 731 video_unregister_device(&f54->vdev); 732 732 v4l2_device_unregister(&f54->v4l2); 733 + destroy_workqueue(f54->workqueue); 733 734 } 734 735 735 736 struct rmi_function_handler rmi_f54_handler = {
-7
drivers/input/touchscreen/cyttsp4_core.c
··· 1990 1990 1991 1991 /* get sysinfo */ 1992 1992 md->si = &cd->sysinfo; 1993 - if (!md->si) { 1994 - dev_err(dev, "%s: Fail get sysinfo pointer from core p=%p\n", 1995 - __func__, md->si); 1996 - goto error_get_sysinfo; 1997 - } 1998 1993 1999 1994 rc = cyttsp4_setup_input_device(cd); 2000 1995 if (rc) ··· 1999 2004 2000 2005 error_init_input: 2001 2006 input_free_device(md->input); 2002 - error_get_sysinfo: 2003 - input_set_drvdata(md->input, NULL); 2004 2007 error_alloc_failed: 2005 2008 dev_err(dev, "%s failed.\n", __func__); 2006 2009 return rc;
+4
drivers/interconnect/core.c
··· 405 405 if (!path) 406 406 return; 407 407 408 + mutex_lock(&icc_lock); 409 + 408 410 for (i = 0; i < path->num_nodes; i++) 409 411 path->reqs[i].tag = tag; 412 + 413 + mutex_unlock(&icc_lock); 410 414 } 411 415 EXPORT_SYMBOL_GPL(icc_set_tag); 412 416
+2 -1
drivers/interconnect/qcom/qcs404.c
··· 433 433 if (!qp) 434 434 return -ENOMEM; 435 435 436 - data = devm_kcalloc(dev, num_nodes, sizeof(*node), GFP_KERNEL); 436 + data = devm_kzalloc(dev, struct_size(data, nodes, num_nodes), 437 + GFP_KERNEL); 437 438 if (!data) 438 439 return -ENOMEM; 439 440
+2 -1
drivers/interconnect/qcom/sdm845.c
··· 790 790 if (!qp) 791 791 return -ENOMEM; 792 792 793 - data = devm_kcalloc(&pdev->dev, num_nodes, sizeof(*node), GFP_KERNEL); 793 + data = devm_kzalloc(&pdev->dev, struct_size(data, nodes, num_nodes), 794 + GFP_KERNEL); 794 795 if (!data) 795 796 return -ENOMEM; 796 797
+1 -1
drivers/mmc/host/sdhci-of-at91.c
··· 358 358 pm_runtime_use_autosuspend(&pdev->dev); 359 359 360 360 /* HS200 is broken at this moment */ 361 - host->quirks2 = SDHCI_QUIRK2_BROKEN_HS200; 361 + host->quirks2 |= SDHCI_QUIRK2_BROKEN_HS200; 362 362 363 363 ret = sdhci_add_host(host); 364 364 if (ret)
+1
drivers/net/can/slcan.c
··· 617 617 sl->tty = NULL; 618 618 tty->disc_data = NULL; 619 619 clear_bit(SLF_INUSE, &sl->flags); 620 + free_netdev(sl->dev); 620 621 621 622 err_exit: 622 623 rtnl_unlock();
+13
drivers/net/dsa/mv88e6xxx/ptp.c
··· 273 273 int pin; 274 274 int err; 275 275 276 + /* Reject requests with unsupported flags */ 277 + if (rq->extts.flags & ~(PTP_ENABLE_FEATURE | 278 + PTP_RISING_EDGE | 279 + PTP_FALLING_EDGE | 280 + PTP_STRICT_FLAGS)) 281 + return -EOPNOTSUPP; 282 + 283 + /* Reject requests to enable time stamping on both edges. */ 284 + if ((rq->extts.flags & PTP_STRICT_FLAGS) && 285 + (rq->extts.flags & PTP_ENABLE_FEATURE) && 286 + (rq->extts.flags & PTP_EXTTS_EDGES) == PTP_EXTTS_EDGES) 287 + return -EOPNOTSUPP; 288 + 276 289 pin = ptp_find_pin(chip->ptp_clock, PTP_PF_EXTTS, rq->extts.index); 277 290 278 291 if (pin < 0)
+4
drivers/net/ethernet/broadcom/tg3.c
··· 6280 6280 6281 6281 switch (rq->type) { 6282 6282 case PTP_CLK_REQ_PEROUT: 6283 + /* Reject requests with unsupported flags */ 6284 + if (rq->perout.flags) 6285 + return -EOPNOTSUPP; 6286 + 6283 6287 if (rq->perout.index != 0) 6284 6288 return -EINVAL; 6285 6289
+3 -2
drivers/net/ethernet/cirrus/ep93xx_eth.c
··· 763 763 { 764 764 struct net_device *dev; 765 765 struct ep93xx_priv *ep; 766 + struct resource *mem; 766 767 767 768 dev = platform_get_drvdata(pdev); 768 769 if (dev == NULL) ··· 779 778 iounmap(ep->base_addr); 780 779 781 780 if (ep->res != NULL) { 782 - release_resource(ep->res); 783 - kfree(ep->res); 781 + mem = platform_get_resource(pdev, IORESOURCE_MEM, 0); 782 + release_mem_region(mem->start, resource_size(mem)); 784 783 } 785 784 786 785 free_netdev(dev);
+1
drivers/net/ethernet/cortina/gemini.c
··· 2524 2524 struct gemini_ethernet_port *port = platform_get_drvdata(pdev); 2525 2525 2526 2526 gemini_port_remove(port); 2527 + free_netdev(port->netdev); 2527 2528 return 0; 2528 2529 } 2529 2530
+9 -1
drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
··· 2260 2260 err_service_reg: 2261 2261 free_channel(priv, channel); 2262 2262 err_alloc_ch: 2263 - if (err == -EPROBE_DEFER) 2263 + if (err == -EPROBE_DEFER) { 2264 + for (i = 0; i < priv->num_channels; i++) { 2265 + channel = priv->channel[i]; 2266 + nctx = &channel->nctx; 2267 + dpaa2_io_service_deregister(channel->dpio, nctx, dev); 2268 + free_channel(priv, channel); 2269 + } 2270 + priv->num_channels = 0; 2264 2271 return err; 2272 + } 2265 2273 2266 2274 if (cpumask_empty(&priv->dpio_cpumask)) { 2267 2275 dev_err(dev, "No cpu with an affine DPIO/DPCON\n");
-5
drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
··· 70 70 #define HNS3_NIC_LB_TEST_TX_CNT_ERR 2 71 71 #define HNS3_NIC_LB_TEST_RX_CNT_ERR 3 72 72 73 - struct hns3_link_mode_mapping { 74 - u32 hns3_link_mode; 75 - u32 ethtool_link_mode; 76 - }; 77 - 78 73 static int hns3_lp_setup(struct net_device *ndev, enum hnae3_loop loop, bool en) 79 74 { 80 75 struct hnae3_handle *h = hns3_get_handle(ndev);
+17 -2
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
··· 124 124 if (ret) 125 125 return ret; 126 126 127 - for (i = 0; i < HNAE3_MAX_TC; i++) { 127 + for (i = 0; i < hdev->tc_max; i++) { 128 128 switch (ets->tc_tsa[i]) { 129 129 case IEEE_8021QAZ_TSA_STRICT: 130 130 if (hdev->tm_info.tc_info[i].tc_sch_mode != ··· 318 318 struct net_device *netdev = h->kinfo.netdev; 319 319 struct hclge_dev *hdev = vport->back; 320 320 u8 i, j, pfc_map, *prio_tc; 321 + int ret; 321 322 322 323 if (!(hdev->dcbx_cap & DCB_CAP_DCBX_VER_IEEE) || 323 324 hdev->flag & HCLGE_FLAG_MQPRIO_ENABLE) ··· 348 347 349 348 hclge_tm_pfc_info_update(hdev); 350 349 351 - return hclge_pause_setup_hw(hdev, false); 350 + ret = hclge_pause_setup_hw(hdev, false); 351 + if (ret) 352 + return ret; 353 + 354 + ret = hclge_notify_client(hdev, HNAE3_DOWN_CLIENT); 355 + if (ret) 356 + return ret; 357 + 358 + ret = hclge_buffer_alloc(hdev); 359 + if (ret) { 360 + hclge_notify_client(hdev, HNAE3_UP_CLIENT); 361 + return ret; 362 + } 363 + 364 + return hclge_notify_client(hdev, HNAE3_UP_CLIENT); 352 365 } 353 366 354 367 /* DCBX configuration */
+14 -2
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
··· 6366 6366 6367 6367 func_id = hclge_get_port_number(HOST_PORT, 0, vfid, 0); 6368 6368 req = (struct hclge_mac_vlan_switch_cmd *)desc.data; 6369 + 6370 + /* read current config parameter */ 6369 6371 hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_MAC_VLAN_SWITCH_PARAM, 6370 - false); 6372 + true); 6371 6373 req->roce_sel = HCLGE_MAC_VLAN_NIC_SEL; 6372 6374 req->func_id = cpu_to_le32(func_id); 6373 - req->switch_param = switch_param; 6375 + 6376 + ret = hclge_cmd_send(&hdev->hw, &desc, 1); 6377 + if (ret) { 6378 + dev_err(&hdev->pdev->dev, 6379 + "read mac vlan switch parameter fail, ret = %d\n", ret); 6380 + return ret; 6381 + } 6382 + 6383 + /* modify and write new config parameter */ 6384 + hclge_cmd_reuse_desc(&desc, false); 6385 + req->switch_param = (req->switch_param & param_mask) | switch_param; 6374 6386 req->param_mask = param_mask; 6375 6387 6376 6388 ret = hclge_cmd_send(&hdev->hw, &desc, 1);
+17
drivers/net/ethernet/intel/igb/igb_ptp.c
··· 521 521 522 522 switch (rq->type) { 523 523 case PTP_CLK_REQ_EXTTS: 524 + /* Reject requests with unsupported flags */ 525 + if (rq->extts.flags & ~(PTP_ENABLE_FEATURE | 526 + PTP_RISING_EDGE | 527 + PTP_FALLING_EDGE | 528 + PTP_STRICT_FLAGS)) 529 + return -EOPNOTSUPP; 530 + 531 + /* Reject requests failing to enable both edges. */ 532 + if ((rq->extts.flags & PTP_STRICT_FLAGS) && 533 + (rq->extts.flags & PTP_ENABLE_FEATURE) && 534 + (rq->extts.flags & PTP_EXTTS_EDGES) != PTP_EXTTS_EDGES) 535 + return -EOPNOTSUPP; 536 + 524 537 if (on) { 525 538 pin = ptp_find_pin(igb->ptp_clock, PTP_PF_EXTTS, 526 539 rq->extts.index); ··· 564 551 return 0; 565 552 566 553 case PTP_CLK_REQ_PEROUT: 554 + /* Reject requests with unsupported flags */ 555 + if (rq->perout.flags) 556 + return -EOPNOTSUPP; 557 + 567 558 if (on) { 568 559 pin = ptp_find_pin(igb->ptp_clock, PTP_PF_PEROUT, 569 560 rq->perout.index);
+2 -2
drivers/net/ethernet/marvell/octeontx2/af/cgx.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0 2 - * Marvell OcteonTx2 CGX driver 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + /* Marvell OcteonTx2 CGX driver 3 3 * 4 4 * Copyright (C) 2018 Marvell International Ltd. 5 5 *
+2 -2
drivers/net/ethernet/marvell/octeontx2/af/cgx_fw_if.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0 2 - * Marvell OcteonTx2 CGX driver 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + /* Marvell OcteonTx2 CGX driver 3 3 * 4 4 * Copyright (C) 2018 Marvell International Ltd. 5 5 *
+2 -2
drivers/net/ethernet/marvell/octeontx2/af/common.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0 2 - * Marvell OcteonTx2 RVU Admin Function driver 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + /* Marvell OcteonTx2 RVU Admin Function driver 3 3 * 4 4 * Copyright (C) 2018 Marvell International Ltd. 5 5 *
+2 -2
drivers/net/ethernet/marvell/octeontx2/af/mbox.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0 2 - * Marvell OcteonTx2 RVU Admin Function driver 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + /* Marvell OcteonTx2 RVU Admin Function driver 3 3 * 4 4 * Copyright (C) 2018 Marvell International Ltd. 5 5 *
+2 -2
drivers/net/ethernet/marvell/octeontx2/af/npc.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0 2 - * Marvell OcteonTx2 RVU Admin Function driver 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + /* Marvell OcteonTx2 RVU Admin Function driver 3 3 * 4 4 * Copyright (C) 2018 Marvell International Ltd. 5 5 *
+2 -2
drivers/net/ethernet/marvell/octeontx2/af/npc_profile.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0 2 - * Marvell OcteonTx2 RVU Admin Function driver 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + /* Marvell OcteonTx2 RVU Admin Function driver 3 3 * 4 4 * Copyright (C) 2018 Marvell International Ltd. 5 5 *
+2 -2
drivers/net/ethernet/marvell/octeontx2/af/rvu.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0 2 - * Marvell OcteonTx2 RVU Admin Function driver 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + /* Marvell OcteonTx2 RVU Admin Function driver 3 3 * 4 4 * Copyright (C) 2018 Marvell International Ltd. 5 5 *
+2 -2
drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0 2 - * Marvell OcteonTx2 RVU Admin Function driver 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + /* Marvell OcteonTx2 RVU Admin Function driver 3 3 * 4 4 * Copyright (C) 2018 Marvell International Ltd. 5 5 *
+2 -2
drivers/net/ethernet/marvell/octeontx2/af/rvu_struct.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0 2 - * Marvell OcteonTx2 RVU Admin Function driver 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + /* Marvell OcteonTx2 RVU Admin Function driver 3 3 * 4 4 * Copyright (C) 2018 Marvell International Ltd. 5 5 *
+17
drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
··· 236 236 if (!MLX5_PPS_CAP(mdev)) 237 237 return -EOPNOTSUPP; 238 238 239 + /* Reject requests with unsupported flags */ 240 + if (rq->extts.flags & ~(PTP_ENABLE_FEATURE | 241 + PTP_RISING_EDGE | 242 + PTP_FALLING_EDGE | 243 + PTP_STRICT_FLAGS)) 244 + return -EOPNOTSUPP; 245 + 246 + /* Reject requests to enable time stamping on both edges. */ 247 + if ((rq->extts.flags & PTP_STRICT_FLAGS) && 248 + (rq->extts.flags & PTP_ENABLE_FEATURE) && 249 + (rq->extts.flags & PTP_EXTTS_EDGES) == PTP_EXTTS_EDGES) 250 + return -EOPNOTSUPP; 251 + 239 252 if (rq->extts.index >= clock->ptp_info.n_pins) 240 253 return -EINVAL; 241 254 ··· 301 288 s64 ns; 302 289 303 290 if (!MLX5_PPS_CAP(mdev)) 291 + return -EOPNOTSUPP; 292 + 293 + /* Reject requests with unsupported flags */ 294 + if (rq->perout.flags) 304 295 return -EOPNOTSUPP; 305 296 306 297 if (rq->perout.index >= clock->ptp_info.n_pins)
+4
drivers/net/ethernet/microchip/lan743x_ptp.c
··· 492 492 unsigned int index = perout_request->index; 493 493 struct lan743x_ptp_perout *perout = &ptp->perout[index]; 494 494 495 + /* Reject requests with unsupported flags */ 496 + if (perout_request->flags) 497 + return -EOPNOTSUPP; 498 + 495 499 if (on) { 496 500 perout_pin = ptp_find_pin(ptp->ptp_clock, PTP_PF_PEROUT, 497 501 perout_request->index);
+2 -1
drivers/net/ethernet/renesas/ravb.h
··· 955 955 #define NUM_RX_QUEUE 2 956 956 #define NUM_TX_QUEUE 2 957 957 958 + #define RX_BUF_SZ (2048 - ETH_FCS_LEN + sizeof(__sum16)) 959 + 958 960 /* TX descriptors per packet */ 959 961 #define NUM_TX_DESC_GEN2 2 960 962 #define NUM_TX_DESC_GEN3 1 ··· 1020 1018 u32 dirty_rx[NUM_RX_QUEUE]; /* Producer ring indices */ 1021 1019 u32 cur_tx[NUM_TX_QUEUE]; 1022 1020 u32 dirty_tx[NUM_TX_QUEUE]; 1023 - u32 rx_buf_sz; /* Based on MTU+slack. */ 1024 1021 struct napi_struct napi[NUM_RX_QUEUE]; 1025 1022 struct work_struct work; 1026 1023 /* MII transceiver section. */
+14 -12
drivers/net/ethernet/renesas/ravb_main.c
··· 230 230 le32_to_cpu(desc->dptr))) 231 231 dma_unmap_single(ndev->dev.parent, 232 232 le32_to_cpu(desc->dptr), 233 - priv->rx_buf_sz, 233 + RX_BUF_SZ, 234 234 DMA_FROM_DEVICE); 235 235 } 236 236 ring_size = sizeof(struct ravb_ex_rx_desc) * ··· 293 293 for (i = 0; i < priv->num_rx_ring[q]; i++) { 294 294 /* RX descriptor */ 295 295 rx_desc = &priv->rx_ring[q][i]; 296 - rx_desc->ds_cc = cpu_to_le16(priv->rx_buf_sz); 296 + rx_desc->ds_cc = cpu_to_le16(RX_BUF_SZ); 297 297 dma_addr = dma_map_single(ndev->dev.parent, priv->rx_skb[q][i]->data, 298 - priv->rx_buf_sz, 298 + RX_BUF_SZ, 299 299 DMA_FROM_DEVICE); 300 300 /* We just set the data size to 0 for a failed mapping which 301 301 * should prevent DMA from happening... ··· 342 342 int ring_size; 343 343 int i; 344 344 345 - priv->rx_buf_sz = (ndev->mtu <= 1492 ? PKT_BUF_SZ : ndev->mtu) + 346 - ETH_HLEN + VLAN_HLEN + sizeof(__sum16); 347 - 348 345 /* Allocate RX and TX skb rings */ 349 346 priv->rx_skb[q] = kcalloc(priv->num_rx_ring[q], 350 347 sizeof(*priv->rx_skb[q]), GFP_KERNEL); ··· 351 354 goto error; 352 355 353 356 for (i = 0; i < priv->num_rx_ring[q]; i++) { 354 - skb = netdev_alloc_skb(ndev, priv->rx_buf_sz + RAVB_ALIGN - 1); 357 + skb = netdev_alloc_skb(ndev, RX_BUF_SZ + RAVB_ALIGN - 1); 355 358 if (!skb) 356 359 goto error; 357 360 ravb_set_buffer_align(skb); ··· 581 584 skb = priv->rx_skb[q][entry]; 582 585 priv->rx_skb[q][entry] = NULL; 583 586 dma_unmap_single(ndev->dev.parent, le32_to_cpu(desc->dptr), 584 - priv->rx_buf_sz, 587 + RX_BUF_SZ, 585 588 DMA_FROM_DEVICE); 586 589 get_ts &= (q == RAVB_NC) ? 
587 590 RAVB_RXTSTAMP_TYPE_V2_L2_EVENT : ··· 614 617 for (; priv->cur_rx[q] - priv->dirty_rx[q] > 0; priv->dirty_rx[q]++) { 615 618 entry = priv->dirty_rx[q] % priv->num_rx_ring[q]; 616 619 desc = &priv->rx_ring[q][entry]; 617 - desc->ds_cc = cpu_to_le16(priv->rx_buf_sz); 620 + desc->ds_cc = cpu_to_le16(RX_BUF_SZ); 618 621 619 622 if (!priv->rx_skb[q][entry]) { 620 623 skb = netdev_alloc_skb(ndev, 621 - priv->rx_buf_sz + 624 + RX_BUF_SZ + 622 625 RAVB_ALIGN - 1); 623 626 if (!skb) 624 627 break; /* Better luck next round. */ ··· 1798 1801 1799 1802 static int ravb_change_mtu(struct net_device *ndev, int new_mtu) 1800 1803 { 1801 - if (netif_running(ndev)) 1802 - return -EBUSY; 1804 + struct ravb_private *priv = netdev_priv(ndev); 1803 1805 1804 1806 ndev->mtu = new_mtu; 1807 + 1808 + if (netif_running(ndev)) { 1809 + synchronize_irq(priv->emac_irq); 1810 + ravb_emac_init(ndev); 1811 + } 1812 + 1805 1813 netdev_update_features(ndev); 1806 1814 1807 1815 return 0;
+11
drivers/net/ethernet/renesas/ravb_ptp.c
··· 182 182 struct net_device *ndev = priv->ndev; 183 183 unsigned long flags; 184 184 185 + /* Reject requests with unsupported flags */ 186 + if (req->flags & ~(PTP_ENABLE_FEATURE | 187 + PTP_RISING_EDGE | 188 + PTP_FALLING_EDGE | 189 + PTP_STRICT_FLAGS)) 190 + return -EOPNOTSUPP; 191 + 185 192 if (req->index) 186 193 return -EINVAL; 187 194 ··· 217 210 struct ravb_ptp_perout *perout; 218 211 unsigned long flags; 219 212 int error = 0; 213 + 214 + /* Reject requests with unsupported flags */ 215 + if (req->flags) 216 + return -EOPNOTSUPP; 220 217 221 218 if (req->index) 222 219 return -EINVAL;
+1 -1
drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
··· 1227 1227 dwmac_mux: 1228 1228 sun8i_dwmac_unset_syscon(gmac); 1229 1229 dwmac_exit: 1230 - sun8i_dwmac_exit(pdev, plat_dat->bsp_priv); 1230 + stmmac_pltfr_remove(pdev); 1231 1231 return ret; 1232 1232 } 1233 1233
+1 -1
drivers/net/ethernet/stmicro/stmmac/dwmac5.h
··· 1 - // SPDX-License-Identifier: (GPL-2.0 OR MIT) 1 + /* SPDX-License-Identifier: (GPL-2.0 OR MIT) */ 2 2 // Copyright (c) 2017 Synopsys, Inc. and/or its affiliates. 3 3 // stmmac Support for 5.xx Ethernet QoS cores 4 4
+1 -1
drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
··· 1 - // SPDX-License-Identifier: (GPL-2.0 OR MIT) 1 + /* SPDX-License-Identifier: (GPL-2.0 OR MIT) */ 2 2 /* 3 3 * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates. 4 4 * stmmac XGMAC definitions.
+1 -1
drivers/net/ethernet/stmicro/stmmac/hwif.h
··· 1 - // SPDX-License-Identifier: (GPL-2.0 OR MIT) 1 + /* SPDX-License-Identifier: (GPL-2.0 OR MIT) */ 2 2 // Copyright (c) 2018 Synopsys, Inc. and/or its affiliates. 3 3 // stmmac HW Interface Callbacks 4 4
+4
drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c
··· 140 140 141 141 switch (rq->type) { 142 142 case PTP_CLK_REQ_PEROUT: 143 + /* Reject requests with unsupported flags */ 144 + if (rq->perout.flags) 145 + return -EOPNOTSUPP; 146 + 143 147 cfg = &priv->pps[rq->perout.index]; 144 148 145 149 cfg->start.tv_sec = rq->perout.start.sec;
+16
drivers/net/phy/dp83640.c
··· 469 469 470 470 switch (rq->type) { 471 471 case PTP_CLK_REQ_EXTTS: 472 + /* Reject requests with unsupported flags */ 473 + if (rq->extts.flags & ~(PTP_ENABLE_FEATURE | 474 + PTP_RISING_EDGE | 475 + PTP_FALLING_EDGE | 476 + PTP_STRICT_FLAGS)) 477 + return -EOPNOTSUPP; 478 + 479 + /* Reject requests to enable time stamping on both edges. */ 480 + if ((rq->extts.flags & PTP_STRICT_FLAGS) && 481 + (rq->extts.flags & PTP_ENABLE_FEATURE) && 482 + (rq->extts.flags & PTP_EXTTS_EDGES) == PTP_EXTTS_EDGES) 483 + return -EOPNOTSUPP; 484 + 472 485 index = rq->extts.index; 473 486 if (index >= N_EXT_TS) 474 487 return -EINVAL; ··· 504 491 return 0; 505 492 506 493 case PTP_CLK_REQ_PEROUT: 494 + /* Reject requests with unsupported flags */ 495 + if (rq->perout.flags) 496 + return -EOPNOTSUPP; 507 497 if (rq->perout.index >= N_PER_OUT) 508 498 return -EINVAL; 509 499 return periodic_output(clock, rq, on, rq->perout.index);
+6 -5
drivers/net/phy/mdio_bus.c
··· 64 64 if (mdiodev->dev.of_node) 65 65 reset = devm_reset_control_get_exclusive(&mdiodev->dev, 66 66 "phy"); 67 - if (PTR_ERR(reset) == -ENOENT || 68 - PTR_ERR(reset) == -ENOTSUPP) 69 - reset = NULL; 70 - else if (IS_ERR(reset)) 71 - return PTR_ERR(reset); 67 + if (IS_ERR(reset)) { 68 + if (PTR_ERR(reset) == -ENOENT || PTR_ERR(reset) == -ENOSYS) 69 + reset = NULL; 70 + else 71 + return PTR_ERR(reset); 72 + } 72 73 73 74 mdiodev->reset_ctrl = reset; 74 75
+1
drivers/net/slip/slip.c
··· 855 855 sl->tty = NULL; 856 856 tty->disc_data = NULL; 857 857 clear_bit(SLF_INUSE, &sl->flags); 858 + free_netdev(sl->dev); 858 859 859 860 err_exit: 860 861 rtnl_unlock();
+1 -1
drivers/net/usb/ax88172a.c
··· 196 196 197 197 /* Get the MAC address */ 198 198 ret = asix_read_cmd(dev, AX_CMD_READ_NODE_ID, 0, 0, ETH_ALEN, buf, 0); 199 - if (ret < 0) { 199 + if (ret < ETH_ALEN) { 200 200 netdev_err(dev->net, "Failed to read MAC address: %d\n", ret); 201 201 goto free; 202 202 }
+1 -1
drivers/net/usb/cdc_ncm.c
··· 579 579 err = usbnet_read_cmd(dev, USB_CDC_GET_MAX_DATAGRAM_SIZE, 580 580 USB_TYPE_CLASS | USB_DIR_IN | USB_RECIP_INTERFACE, 581 581 0, iface_no, &max_datagram_size, sizeof(max_datagram_size)); 582 - if (err < sizeof(max_datagram_size)) { 582 + if (err != sizeof(max_datagram_size)) { 583 583 dev_dbg(&dev->intf->dev, "GET_MAX_DATAGRAM_SIZE failed\n"); 584 584 goto out; 585 585 }
+2
drivers/net/usb/qmi_wwan.c
··· 1371 1371 {QMI_QUIRK_SET_DTR(0x2c7c, 0x0191, 4)}, /* Quectel EG91 */ 1372 1372 {QMI_FIXED_INTF(0x2c7c, 0x0296, 4)}, /* Quectel BG96 */ 1373 1373 {QMI_QUIRK_SET_DTR(0x2cb7, 0x0104, 4)}, /* Fibocom NL678 series */ 1374 + {QMI_FIXED_INTF(0x0489, 0xe0b4, 0)}, /* Foxconn T77W968 LTE */ 1375 + {QMI_FIXED_INTF(0x0489, 0xe0b5, 0)}, /* Foxconn T77W968 LTE with eSIM support */ 1374 1376 1375 1377 /* 4. Gobi 1000 devices */ 1376 1378 {QMI_GOBI1K_DEVICE(0x05c6, 0x9212)}, /* Acer Gobi Modem Device */
+7 -13
drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c
··· 251 251 struct ieee80211_hdr *hdr = (void *)skb->data; 252 252 unsigned int snap_ip_tcp_hdrlen, ip_hdrlen, total_len, hdr_room; 253 253 unsigned int mss = skb_shinfo(skb)->gso_size; 254 - u16 length, iv_len, amsdu_pad; 254 + u16 length, amsdu_pad; 255 255 u8 *start_hdr; 256 256 struct iwl_tso_hdr_page *hdr_page; 257 257 struct page **page_ptr; 258 258 struct tso_t tso; 259 - 260 - /* if the packet is protected, then it must be CCMP or GCMP */ 261 - iv_len = ieee80211_has_protected(hdr->frame_control) ? 262 - IEEE80211_CCMP_HDR_LEN : 0; 263 259 264 260 trace_iwlwifi_dev_tx(trans->dev, skb, tfd, sizeof(*tfd), 265 261 &dev_cmd->hdr, start_len, 0); 266 262 267 263 ip_hdrlen = skb_transport_header(skb) - skb_network_header(skb); 268 264 snap_ip_tcp_hdrlen = 8 + ip_hdrlen + tcp_hdrlen(skb); 269 - total_len = skb->len - snap_ip_tcp_hdrlen - hdr_len - iv_len; 265 + total_len = skb->len - snap_ip_tcp_hdrlen - hdr_len; 270 266 amsdu_pad = 0; 271 267 272 268 /* total amount of header we may need for this A-MSDU */ 273 269 hdr_room = DIV_ROUND_UP(total_len, mss) * 274 - (3 + snap_ip_tcp_hdrlen + sizeof(struct ethhdr)) + iv_len; 270 + (3 + snap_ip_tcp_hdrlen + sizeof(struct ethhdr)); 275 271 276 272 /* Our device supports 9 segments at most, it will fit in 1 page */ 277 273 hdr_page = get_page_hdr(trans, hdr_room); ··· 278 282 start_hdr = hdr_page->pos; 279 283 page_ptr = (void *)((u8 *)skb->cb + trans_pcie->page_offs); 280 284 *page_ptr = hdr_page->page; 281 - memcpy(hdr_page->pos, skb->data + hdr_len, iv_len); 282 - hdr_page->pos += iv_len; 283 285 284 286 /* 285 - * Pull the ieee80211 header + IV to be able to use TSO core, 287 + * Pull the ieee80211 header to be able to use TSO core, 286 288 * we will restore it for the tx_status flow. 
287 289 */ 288 - skb_pull(skb, hdr_len + iv_len); 290 + skb_pull(skb, hdr_len); 289 291 290 292 /* 291 293 * Remove the length of all the headers that we don't actually ··· 358 364 } 359 365 } 360 366 361 - /* re-add the WiFi header and IV */ 362 - skb_push(skb, hdr_len + iv_len); 367 + /* re-add the WiFi header */ 368 + skb_push(skb, hdr_len); 363 369 364 370 return 0; 365 371
+4 -2
drivers/nfc/nxp-nci/i2c.c
··· 220 220 221 221 if (r == -EREMOTEIO) { 222 222 phy->hard_fault = r; 223 - skb = NULL; 224 - } else if (r < 0) { 223 + if (info->mode == NXP_NCI_MODE_FW) 224 + nxp_nci_fw_recv_frame(phy->ndev, NULL); 225 + } 226 + if (r < 0) { 225 227 nfc_err(&client->dev, "Read failed with error %d\n", r); 226 228 goto exit_irq_handled; 227 229 }
+13 -13
drivers/pinctrl/intel/pinctrl-cherryview.c
··· 147 147 * @pctldesc: Pin controller description 148 148 * @pctldev: Pointer to the pin controller device 149 149 * @chip: GPIO chip in this pin controller 150 + * @irqchip: IRQ chip in this pin controller 150 151 * @regs: MMIO registers 151 152 * @intr_lines: Stores mapping between 16 HW interrupt wires and GPIO 152 153 * offset (in GPIO number space) ··· 163 162 struct pinctrl_desc pctldesc; 164 163 struct pinctrl_dev *pctldev; 165 164 struct gpio_chip chip; 165 + struct irq_chip irqchip; 166 166 void __iomem *regs; 167 167 unsigned intr_lines[16]; 168 168 const struct chv_community *community; ··· 1468 1466 return 0; 1469 1467 } 1470 1468 1471 - static struct irq_chip chv_gpio_irqchip = { 1472 - .name = "chv-gpio", 1473 - .irq_startup = chv_gpio_irq_startup, 1474 - .irq_ack = chv_gpio_irq_ack, 1475 - .irq_mask = chv_gpio_irq_mask, 1476 - .irq_unmask = chv_gpio_irq_unmask, 1477 - .irq_set_type = chv_gpio_irq_type, 1478 - .flags = IRQCHIP_SKIP_SET_WAKE, 1479 - }; 1480 - 1481 1469 static void chv_gpio_irq_handler(struct irq_desc *desc) 1482 1470 { 1483 1471 struct gpio_chip *gc = irq_desc_get_handler_data(desc); ··· 1551 1559 intsel >>= CHV_PADCTRL0_INTSEL_SHIFT; 1552 1560 1553 1561 if (intsel >= community->nirqs) 1554 - clear_bit(i, valid_mask); 1562 + clear_bit(desc->number, valid_mask); 1555 1563 } 1556 1564 } ··· 1617 1625 } 1618 1626 } 1619 1627 1620 - ret = gpiochip_irqchip_add(chip, &chv_gpio_irqchip, 0, 1628 + pctrl->irqchip.name = "chv-gpio"; 1629 + pctrl->irqchip.irq_startup = chv_gpio_irq_startup; 1630 + pctrl->irqchip.irq_ack = chv_gpio_irq_ack; 1631 + pctrl->irqchip.irq_mask = chv_gpio_irq_mask; 1632 + pctrl->irqchip.irq_unmask = chv_gpio_irq_unmask; 1633 + pctrl->irqchip.irq_set_type = chv_gpio_irq_type; 1634 + pctrl->irqchip.flags = IRQCHIP_SKIP_SET_WAKE; 1635 + 1636 + ret = gpiochip_irqchip_add(chip, &pctrl->irqchip, 0, 1621 1637 handle_bad_irq, IRQ_TYPE_NONE); 1622 1638 if (ret) { 1623 1639 dev_err(pctrl->dev, "failed to add IRQ chip\n"); ··· 1642 1642 } 1643 1643 } 1644 1644 1645 - gpiochip_set_chained_irqchip(chip, &chv_gpio_irqchip, irq, 1645 + gpiochip_set_chained_irqchip(chip, &pctrl->irqchip, irq, 1646 1646 chv_gpio_irq_handler); 1647 1647 return 0; 1648 1648 }
+20 -1
drivers/pinctrl/intel/pinctrl-intel.c
··· 52 52 #define PADCFG0_GPIROUTNMI BIT(17) 53 53 #define PADCFG0_PMODE_SHIFT 10 54 54 #define PADCFG0_PMODE_MASK GENMASK(13, 10) 55 + #define PADCFG0_PMODE_GPIO 0 55 56 #define PADCFG0_GPIORXDIS BIT(9) 56 57 #define PADCFG0_GPIOTXDIS BIT(8) 57 58 #define PADCFG0_GPIORXSTATE BIT(1) ··· 333 332 cfg1 = readl(intel_get_padcfg(pctrl, pin, PADCFG1)); 334 333 335 334 mode = (cfg0 & PADCFG0_PMODE_MASK) >> PADCFG0_PMODE_SHIFT; 336 - if (!mode) 335 + if (mode == PADCFG0_PMODE_GPIO) 337 336 seq_puts(s, "GPIO "); 338 337 else 339 338 seq_printf(s, "mode %d ", mode); ··· 459 458 writel(value, padcfg0); 460 459 } 461 460 461 + static int intel_gpio_get_gpio_mode(void __iomem *padcfg0) 462 + { 463 + return (readl(padcfg0) & PADCFG0_PMODE_MASK) >> PADCFG0_PMODE_SHIFT; 464 + } 465 + 462 466 static void intel_gpio_set_gpio_mode(void __iomem *padcfg0) 463 467 { 464 468 u32 value; ··· 497 491 } 498 492 499 493 padcfg0 = intel_get_padcfg(pctrl, pin, PADCFG0); 494 + 495 + /* 496 + * If pin is already configured in GPIO mode, we assume that 497 + * firmware provides correct settings. In such case we avoid 498 + * potential glitches on the pin. Otherwise, for the pin in 499 + * alternative mode, consumer has to supply respective flags. 500 + */ 501 + if (intel_gpio_get_gpio_mode(padcfg0) == PADCFG0_PMODE_GPIO) { 502 + raw_spin_unlock_irqrestore(&pctrl->lock, flags); 503 + return 0; 504 + } 505 + 500 506 intel_gpio_set_gpio_mode(padcfg0); 507 + 501 508 /* Disable TX buffer and enable RX (this will be input) */ 502 509 __intel_gpio_set_direction(padcfg0, true); 503 510
-14
drivers/pinctrl/pinctrl-stmfx.c
··· 585 585 return stmfx_function_enable(pctl->stmfx, func); 586 586 } 587 587 588 - static int stmfx_pinctrl_gpio_init_valid_mask(struct gpio_chip *gc, 589 - unsigned long *valid_mask, 590 - unsigned int ngpios) 591 - { 592 - struct stmfx_pinctrl *pctl = gpiochip_get_data(gc); 593 - u32 n; 594 - 595 - for_each_clear_bit(n, &pctl->gpio_valid_mask, ngpios) 596 - clear_bit(n, valid_mask); 597 - 598 - return 0; 599 - } 600 - 601 588 static int stmfx_pinctrl_probe(struct platform_device *pdev) 602 589 { 603 590 struct stmfx *stmfx = dev_get_drvdata(pdev->dev.parent); ··· 647 660 pctl->gpio_chip.ngpio = pctl->pctl_desc.npins; 648 661 pctl->gpio_chip.can_sleep = true; 649 662 pctl->gpio_chip.of_node = np; 650 - pctl->gpio_chip.init_valid_mask = stmfx_pinctrl_gpio_init_valid_mask; 651 663 652 664 ret = devm_gpiochip_add_data(pctl->dev, &pctl->gpio_chip, pctl); 653 665 if (ret) {
+15 -5
drivers/ptp/ptp_chardev.c
··· 149 149 err = -EFAULT; 150 150 break; 151 151 } 152 - if (((req.extts.flags & ~PTP_EXTTS_VALID_FLAGS) || 153 - req.extts.rsv[0] || req.extts.rsv[1]) && 154 - cmd == PTP_EXTTS_REQUEST2) { 155 - err = -EINVAL; 156 - break; 152 + if (cmd == PTP_EXTTS_REQUEST2) { 153 + /* Tell the drivers to check the flags carefully. */ 154 + req.extts.flags |= PTP_STRICT_FLAGS; 155 + /* Make sure no reserved bit is set. */ 156 + if ((req.extts.flags & ~PTP_EXTTS_VALID_FLAGS) || 157 + req.extts.rsv[0] || req.extts.rsv[1]) { 158 + err = -EINVAL; 159 + break; 160 + } 161 + /* Ensure one of the rising/falling edge bits is set. */ 162 + if ((req.extts.flags & PTP_ENABLE_FEATURE) && 163 + (req.extts.flags & PTP_EXTTS_EDGES) == 0) { 164 + err = -EINVAL; 165 + break; 166 + } 157 167 } else if (cmd == PTP_EXTTS_REQUEST) { 158 168 req.extts.flags &= PTP_EXTTS_V1_VALID_FLAGS; 159 169 req.extts.rsv[0] = 0;
+3 -2
drivers/reset/core.c
··· 76 76 * of_reset_simple_xlate - translate reset_spec to the reset line number 77 77 * @rcdev: a pointer to the reset controller device 78 78 * @reset_spec: reset line specifier as found in the device tree 79 - * @flags: a flags pointer to fill in (optional) 80 79 * 81 80 * This simple translation function should be used for reset controllers 82 81 * with 1:1 mapping, where reset lines can be indexed by number without gaps. ··· 747 748 for (i = 0; i < resets->num_rstcs; i++) 748 749 __reset_control_put_internal(resets->rstc[i]); 749 750 mutex_unlock(&reset_list_mutex); 751 + kfree(resets); 750 752 } 751 753 752 754 /** ··· 825 825 } 826 826 EXPORT_SYMBOL_GPL(__device_reset); 827 827 828 - /** 828 + /* 829 829 * APIs to manage an array of reset controls. 830 830 */ 831 + 831 832 /** 832 833 * of_reset_control_get_count - Count number of resets available with a device 833 834 *
+5 -3
drivers/scsi/qla2xxx/qla_mid.c
··· 76 76 * ensures no active vp_list traversal while the vport is removed 77 77 * from the queue) 78 78 */ 79 - for (i = 0; i < 10 && atomic_read(&vha->vref_count); i++) 80 - wait_event_timeout(vha->vref_waitq, 81 - atomic_read(&vha->vref_count), HZ); 79 + for (i = 0; i < 10; i++) { 80 + if (wait_event_timeout(vha->vref_waitq, 81 + !atomic_read(&vha->vref_count), HZ) > 0) 82 + break; 83 + } 82 84 83 85 spin_lock_irqsave(&ha->vport_slock, flags); 84 86 if (atomic_read(&vha->vref_count)) {
+5 -3
drivers/scsi/qla2xxx/qla_os.c
··· 1119 1119 1120 1120 qla2x00_mark_all_devices_lost(vha, 0); 1121 1121 1122 - for (i = 0; i < 10; i++) 1123 - wait_event_timeout(vha->fcport_waitQ, test_fcport_count(vha), 1124 - HZ); 1122 + for (i = 0; i < 10; i++) { 1123 + if (wait_event_timeout(vha->fcport_waitQ, 1124 + test_fcport_count(vha), HZ) > 0) 1125 + break; 1126 + } 1125 1127 1126 1128 flush_workqueue(vha->hw->wq); 1127 1129 }
+2 -1
drivers/scsi/scsi_lib.c
··· 1883 1883 { 1884 1884 unsigned int cmd_size, sgl_size; 1885 1885 1886 - sgl_size = scsi_mq_inline_sgl_size(shost); 1886 + sgl_size = max_t(unsigned int, sizeof(struct scatterlist), 1887 + scsi_mq_inline_sgl_size(shost)); 1887 1888 cmd_size = sizeof(struct scsi_cmnd) + shost->hostt->cmd_size + sgl_size; 1888 1889 if (scsi_host_get_prot(shost)) 1889 1890 cmd_size += sizeof(struct scsi_data_buffer) +
+10 -19
drivers/scsi/sd_zbc.c
··· 263 263 int result = cmd->result; 264 264 struct request *rq = cmd->request; 265 265 266 - switch (req_op(rq)) { 267 - case REQ_OP_ZONE_RESET: 268 - case REQ_OP_ZONE_RESET_ALL: 269 - 270 - if (result && 271 - sshdr->sense_key == ILLEGAL_REQUEST && 272 - sshdr->asc == 0x24) 273 - /* 274 - * INVALID FIELD IN CDB error: reset of a conventional 275 - * zone was attempted. Nothing to worry about, so be 276 - * quiet about the error. 277 - */ 278 - rq->rq_flags |= RQF_QUIET; 279 - break; 280 - 281 - case REQ_OP_WRITE: 282 - case REQ_OP_WRITE_ZEROES: 283 - case REQ_OP_WRITE_SAME: 284 - break; 266 + if (req_op(rq) == REQ_OP_ZONE_RESET && 267 + result && 268 + sshdr->sense_key == ILLEGAL_REQUEST && 269 + sshdr->asc == 0x24) { 270 + /* 271 + * INVALID FIELD IN CDB error: reset of a conventional 272 + * zone was attempted. Nothing to worry about, so be 273 + * quiet about the error. 274 + */ 275 + rq->rq_flags |= RQF_QUIET; 285 276 } 286 277 } 287 278
+4 -4
drivers/soc/imx/gpc.c
··· 249 249 }; 250 250 251 251 static struct imx_pm_domain imx_gpc_domains[] = { 252 - [GPC_PGC_DOMAIN_ARM] { 252 + [GPC_PGC_DOMAIN_ARM] = { 253 253 .base = { 254 254 .name = "ARM", 255 255 .flags = GENPD_FLAG_ALWAYS_ON, 256 256 }, 257 257 }, 258 - [GPC_PGC_DOMAIN_PU] { 258 + [GPC_PGC_DOMAIN_PU] = { 259 259 .base = { 260 260 .name = "PU", 261 261 .power_off = imx6_pm_domain_power_off, ··· 266 266 .reg_offs = 0x260, 267 267 .cntr_pdn_bit = 0, 268 268 }, 269 - [GPC_PGC_DOMAIN_DISPLAY] { 269 + [GPC_PGC_DOMAIN_DISPLAY] = { 270 270 .base = { 271 271 .name = "DISPLAY", 272 272 .power_off = imx6_pm_domain_power_off, ··· 275 275 .reg_offs = 0x240, 276 276 .cntr_pdn_bit = 4, 277 277 }, 278 - [GPC_PGC_DOMAIN_PCI] { 278 + [GPC_PGC_DOMAIN_PCI] = { 279 279 .base = { 280 280 .name = "PCI", 281 281 .power_off = imx6_pm_domain_power_off,
+1
drivers/soundwire/Kconfig
··· 5 5 6 6 menuconfig SOUNDWIRE 7 7 tristate "SoundWire support" 8 + depends on ACPI || OF 8 9 help 9 10 SoundWire is a 2-Pin interface with data and clock line ratified 10 11 by the MIPI Alliance. SoundWire is used for transporting data
+2 -2
drivers/soundwire/intel.c
··· 900 900 /* Create PCM DAIs */ 901 901 stream = &cdns->pcm; 902 902 903 - ret = intel_create_dai(cdns, dais, INTEL_PDI_IN, stream->num_in, 903 + ret = intel_create_dai(cdns, dais, INTEL_PDI_IN, cdns->pcm.num_in, 904 904 off, stream->num_ch_in, true); 905 905 if (ret) 906 906 return ret; ··· 931 931 if (ret) 932 932 return ret; 933 933 934 - off += cdns->pdm.num_bd; 934 + off += cdns->pdm.num_out; 935 935 ret = intel_create_dai(cdns, dais, INTEL_PDI_BD, cdns->pdm.num_bd, 936 936 off, stream->num_ch_bd, false); 937 937 if (ret)
+2 -1
drivers/soundwire/slave.c
··· 128 128 struct device_node *node; 129 129 130 130 for_each_child_of_node(bus->dev->of_node, node) { 131 - int link_id, sdw_version, ret, len; 131 + int link_id, ret, len; 132 + unsigned int sdw_version; 132 133 const char *compat = NULL; 133 134 struct sdw_slave_id id; 134 135 const __be32 *addr;
-1
drivers/thunderbolt/nhi_ops.c
··· 80 80 { 81 81 u32 data; 82 82 83 - pci_read_config_dword(nhi->pdev, VS_CAP_19, &data); 84 83 data = (cmd << VS_CAP_19_CMD_SHIFT) & VS_CAP_19_CMD_MASK; 85 84 pci_write_config_dword(nhi->pdev, VS_CAP_19, data | VS_CAP_19_VALID); 86 85 }
+11 -17
drivers/thunderbolt/switch.c
··· 896 896 */ 897 897 bool tb_dp_port_is_enabled(struct tb_port *port) 898 898 { 899 - u32 data; 899 + u32 data[2]; 900 900 901 - if (tb_port_read(port, &data, TB_CFG_PORT, port->cap_adap, 1)) 901 + if (tb_port_read(port, data, TB_CFG_PORT, port->cap_adap, 902 + ARRAY_SIZE(data))) 902 903 return false; 903 904 904 - return !!(data & (TB_DP_VIDEO_EN | TB_DP_AUX_EN)); 905 + return !!(data[0] & (TB_DP_VIDEO_EN | TB_DP_AUX_EN)); 905 906 } 906 907 907 908 /** ··· 915 914 */ 916 915 int tb_dp_port_enable(struct tb_port *port, bool enable) 917 916 { 918 - u32 data; 917 + u32 data[2]; 919 918 int ret; 920 919 921 - ret = tb_port_read(port, &data, TB_CFG_PORT, port->cap_adap, 1); 920 + ret = tb_port_read(port, data, TB_CFG_PORT, port->cap_adap, 921 + ARRAY_SIZE(data)); 922 922 if (ret) 923 923 return ret; 924 924 925 925 if (enable) 926 - data |= TB_DP_VIDEO_EN | TB_DP_AUX_EN; 926 + data[0] |= TB_DP_VIDEO_EN | TB_DP_AUX_EN; 927 927 else 928 - data &= ~(TB_DP_VIDEO_EN | TB_DP_AUX_EN); 928 + data[0] &= ~(TB_DP_VIDEO_EN | TB_DP_AUX_EN); 929 929 930 - return tb_port_write(port, &data, TB_CFG_PORT, port->cap_adap, 1); 930 + return tb_port_write(port, data, TB_CFG_PORT, port->cap_adap, 931 + ARRAY_SIZE(data)); 931 932 } 932 933 933 934 /* switch utility functions */ ··· 1034 1031 if (sw->authorized) 1035 1032 goto unlock; 1036 1033 1037 - /* 1038 - * Make sure there is no PCIe rescan ongoing when a new PCIe 1039 - * tunnel is created. Otherwise the PCIe rescan code might find 1040 - * the new tunnel too early. 1041 - */ 1042 - pci_lock_rescan_remove(); 1043 - 1044 1034 switch (val) { 1045 1035 /* Approve switch */ 1046 1036 case 1: ··· 1052 1056 default: 1053 1057 break; 1054 1058 } 1055 - 1056 - pci_unlock_rescan_remove(); 1057 1059 1058 1060 if (!ret) { 1059 1061 sw->authorized = val;
+1
drivers/watchdog/bd70528_wdt.c
··· 288 288 MODULE_AUTHOR("Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>"); 289 289 MODULE_DESCRIPTION("BD70528 watchdog driver"); 290 290 MODULE_LICENSE("GPL"); 291 + MODULE_ALIAS("platform:bd70528-wdt");
+7 -1
drivers/watchdog/cpwd.c
··· 26 26 #include <linux/interrupt.h> 27 27 #include <linux/ioport.h> 28 28 #include <linux/timer.h> 29 + #include <linux/compat.h> 29 30 #include <linux/slab.h> 30 31 #include <linux/mutex.h> 31 32 #include <linux/io.h> ··· 474 473 return 0; 475 474 } 476 475 476 + static long cpwd_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg) 477 + { 478 + return cpwd_ioctl(file, cmd, (unsigned long)compat_ptr(arg)); 479 + } 480 + 477 481 static ssize_t cpwd_write(struct file *file, const char __user *buf, 478 482 size_t count, loff_t *ppos) 479 483 { ··· 503 497 static const struct file_operations cpwd_fops = { 504 498 .owner = THIS_MODULE, 505 499 .unlocked_ioctl = cpwd_ioctl, 506 - .compat_ioctl = compat_ptr_ioctl, 500 + .compat_ioctl = cpwd_compat_ioctl, 507 501 .open = cpwd_open, 508 502 .write = cpwd_write, 509 503 .read = cpwd_read,
+7 -1
drivers/watchdog/imx_sc_wdt.c
··· 99 99 { 100 100 struct arm_smccc_res res; 101 101 102 + /* 103 + * SCU firmware calculates pretimeout based on current time 104 + * stamp instead of watchdog timeout stamp, need to convert 105 + * the pretimeout to SCU firmware's timeout value. 106 + */ 102 107 arm_smccc_smc(IMX_SIP_TIMER, IMX_SIP_TIMER_SET_PRETIME_WDOG, 103 - pretimeout * 1000, 0, 0, 0, 0, 0, &res); 108 + (wdog->timeout - pretimeout) * 1000, 0, 0, 0, 109 + 0, 0, &res); 104 110 if (res.a0) 105 111 return -EACCES; 106 112
+2 -2
drivers/watchdog/meson_gxbb_wdt.c
··· 89 89 90 90 reg = readl(data->reg_base + GXBB_WDT_TCNT_REG); 91 91 92 - return ((reg >> GXBB_WDT_TCNT_CNT_SHIFT) - 93 - (reg & GXBB_WDT_TCNT_SETUP_MASK)) / 1000; 92 + return ((reg & GXBB_WDT_TCNT_SETUP_MASK) - 93 + (reg >> GXBB_WDT_TCNT_CNT_SHIFT)) / 1000; 94 94 } 95 95 96 96 static const struct watchdog_ops meson_gxbb_wdt_ops = {
+11 -4
drivers/watchdog/pm8916_wdt.c
··· 163 163 164 164 irq = platform_get_irq(pdev, 0); 165 165 if (irq > 0) { 166 - if (devm_request_irq(dev, irq, pm8916_wdt_isr, 0, "pm8916_wdt", 167 - wdt)) 168 - irq = 0; 166 + err = devm_request_irq(dev, irq, pm8916_wdt_isr, 0, 167 + "pm8916_wdt", wdt); 168 + if (err) 169 + return err; 170 + 171 + wdt->wdev.info = &pm8916_wdt_pt_ident; 172 + } else { 173 + if (irq == -EPROBE_DEFER) 174 + return -EPROBE_DEFER; 175 + 176 + wdt->wdev.info = &pm8916_wdt_ident; 169 177 } 170 178 171 179 /* Configure watchdog to hard-reset mode */ ··· 185 177 return err; 186 178 } 187 179 188 - wdt->wdev.info = (irq > 0) ? &pm8916_wdt_pt_ident : &pm8916_wdt_ident, 189 180 wdt->wdev.ops = &pm8916_wdt_ops, 190 181 wdt->wdev.parent = dev; 191 182 wdt->wdev.min_timeout = PM8916_WDT_MIN_TIMEOUT;
+6 -1
fs/afs/dir.c
··· 803 803 continue; 804 804 805 805 if (cookie->inodes[i]) { 806 - afs_vnode_commit_status(&fc, AFS_FS_I(cookie->inodes[i]), 806 + struct afs_vnode *iv = AFS_FS_I(cookie->inodes[i]); 807 + 808 + if (test_bit(AFS_VNODE_UNSET, &iv->flags)) 809 + continue; 810 + 811 + afs_vnode_commit_status(&fc, iv, 807 812 scb->cb_break, NULL, scb); 808 813 continue; 809 814 }
+5 -5
fs/aio.c
··· 2179 2179 #ifdef CONFIG_COMPAT 2180 2180 2181 2181 struct __compat_aio_sigset { 2182 - compat_sigset_t __user *sigmask; 2182 + compat_uptr_t sigmask; 2183 2183 compat_size_t sigsetsize; 2184 2184 }; 2185 2185 ··· 2193 2193 struct old_timespec32 __user *, timeout, 2194 2194 const struct __compat_aio_sigset __user *, usig) 2195 2195 { 2196 - struct __compat_aio_sigset ksig = { NULL, }; 2196 + struct __compat_aio_sigset ksig = { 0, }; 2197 2197 struct timespec64 t; 2198 2198 bool interrupted; 2199 2199 int ret; ··· 2204 2204 if (usig && copy_from_user(&ksig, usig, sizeof(ksig))) 2205 2205 return -EFAULT; 2206 2206 2207 - ret = set_compat_user_sigmask(ksig.sigmask, ksig.sigsetsize); 2207 + ret = set_compat_user_sigmask(compat_ptr(ksig.sigmask), ksig.sigsetsize); 2208 2208 if (ret) 2209 2209 return ret; 2210 2210 ··· 2228 2228 struct __kernel_timespec __user *, timeout, 2229 2229 const struct __compat_aio_sigset __user *, usig) 2230 2230 { 2231 - struct __compat_aio_sigset ksig = { NULL, }; 2231 + struct __compat_aio_sigset ksig = { 0, }; 2232 2232 struct timespec64 t; 2233 2233 bool interrupted; 2234 2234 int ret; ··· 2239 2239 if (usig && copy_from_user(&ksig, usig, sizeof(ksig))) 2240 2240 return -EFAULT; 2241 2241 2242 - ret = set_compat_user_sigmask(ksig.sigmask, ksig.sigsetsize); 2242 + ret = set_compat_user_sigmask(compat_ptr(ksig.sigmask), ksig.sigsetsize); 2243 2243 if (ret) 2244 2244 return ret; 2245 2245
+3 -2
fs/autofs/expire.c
··· 459 459 */ 460 460 how &= ~AUTOFS_EXP_LEAVES; 461 461 found = should_expire(expired, mnt, timeout, how); 462 - if (!found || found != expired) 463 - /* Something has changed, continue */ 462 + if (found != expired) { // something has changed, continue 463 + dput(found); 464 464 goto next; 465 + } 465 466 466 467 if (expired != dentry) 467 468 dput(dentry);
+29 -1
fs/btrfs/inode.c
··· 474 474 u64 start = async_chunk->start; 475 475 u64 end = async_chunk->end; 476 476 u64 actual_end; 477 + u64 i_size; 477 478 int ret = 0; 478 479 struct page **pages = NULL; 479 480 unsigned long nr_pages; ··· 489 488 inode_should_defrag(BTRFS_I(inode), start, end, end - start + 1, 490 489 SZ_16K); 491 490 492 - actual_end = min_t(u64, i_size_read(inode), end + 1); 491 + /* 492 + * We need to save i_size before now because it could change in between 493 + * us evaluating the size and assigning it. This is because we lock and 494 + * unlock the page in truncate and fallocate, and then modify the i_size 495 + * later on. 496 + * 497 + * The barriers are to emulate READ_ONCE, remove that once i_size_read 498 + * does that for us. 499 + */ 500 + barrier(); 501 + i_size = i_size_read(inode); 502 + barrier(); 503 + actual_end = min_t(u64, i_size, end + 1); 493 504 again: 494 505 will_compress = 0; 495 506 nr_pages = (end >> PAGE_SHIFT) - (start >> PAGE_SHIFT) + 1; ··· 9744 9731 commit_transaction = true; 9745 9732 } 9746 9733 if (commit_transaction) { 9734 + /* 9735 + * We may have set commit_transaction when logging the new name 9736 + * in the destination root, in which case we left the source 9737 + * root context in the list of log contexts. So make sure we 9738 + * remove it to avoid invalid memory accesses, since the context 9739 + * was allocated in our stack frame. 9740 + */ 9741 + if (sync_log_root) { 9742 + mutex_lock(&root->log_mutex); 9743 + list_del_init(&ctx_root.list); 9744 + mutex_unlock(&root->log_mutex); 9745 + } 9747 9746 ret = btrfs_commit_transaction(trans); 9748 9747 } else { 9749 9748 int ret2; ··· 9768 9743 up_read(&fs_info->subvol_sem); 9769 9744 if (old_ino == BTRFS_FIRST_FREE_OBJECTID) 9770 9745 up_read(&fs_info->subvol_sem); 9746 + 9747 + ASSERT(list_empty(&ctx_root.list)); 9748 + ASSERT(list_empty(&ctx_dest.list)); 9771 9749 9772 9750 return ret; 9773 9751 }
-6
fs/btrfs/ioctl.c
··· 4195 4195 u64 transid; 4196 4196 int ret; 4197 4197 4198 - btrfs_warn(root->fs_info, 4199 - "START_SYNC ioctl is deprecated and will be removed in kernel 5.7"); 4200 - 4201 4198 trans = btrfs_attach_transaction_barrier(root); 4202 4199 if (IS_ERR(trans)) { 4203 4200 if (PTR_ERR(trans) != -ENOENT) ··· 4221 4224 void __user *argp) 4222 4225 { 4223 4226 u64 transid; 4224 - 4225 - btrfs_warn(fs_info, 4226 - "WAIT_SYNC ioctl is deprecated and will be removed in kernel 5.7"); 4227 4227 4228 4228 if (argp) { 4229 4229 if (copy_from_user(&transid, argp, sizeof(transid)))
+21
fs/btrfs/space-info.c
··· 893 893 while (ticket->bytes > 0 && ticket->error == 0) { 894 894 ret = prepare_to_wait_event(&ticket->wait, &wait, TASK_KILLABLE); 895 895 if (ret) { 896 + /* 897 + * Delete us from the list. After we unlock the space 898 + * info, we don't want the async reclaim job to reserve 899 + * space for this ticket. If that would happen, then the 900 + * ticket's task would not know that space was reserved 901 + * despite getting an error, resulting in a space leak 902 + * (bytes_may_use counter of our space_info). 903 + */ 904 + list_del_init(&ticket->list); 896 905 ticket->error = -EINTR; 897 906 break; 898 907 } ··· 954 945 spin_lock(&space_info->lock); 955 946 ret = ticket->error; 956 947 if (ticket->bytes || ticket->error) { 948 + /* 949 + * Need to delete here for priority tickets. For regular tickets 950 + * either the async reclaim job deletes the ticket from the list 951 + * or we delete it ourselves at wait_reserve_ticket(). 952 + */ 957 953 list_del_init(&ticket->list); 958 954 if (!ret) 959 955 ret = -ENOSPC; 960 956 } 961 957 spin_unlock(&space_info->lock); 962 958 ASSERT(list_empty(&ticket->list)); 959 + /* 960 + * Check that we can't have an error set if the reservation succeeded, 961 + * as that would confuse tasks and lead them to error out without 962 + * releasing reserved space (if an error happens the expectation is that 963 + * space wasn't reserved at all). 964 + */ 965 + ASSERT(!(ticket->bytes == 0 && ticket->error)); 963 966 return ret; 964 967
-8
fs/btrfs/tree-checker.c
···
686 686 static int check_dev_item(struct extent_buffer *leaf,
687 687 struct btrfs_key *key, int slot)
688 688 {
689 - struct btrfs_fs_info *fs_info = leaf->fs_info;
690 689 struct btrfs_dev_item *ditem;
691 - u64 max_devid = max(BTRFS_MAX_DEVS(fs_info), BTRFS_MAX_DEVS_SYS_CHUNK);
692 690
693 691 if (key->objectid != BTRFS_DEV_ITEMS_OBJECTID) {
694 692 dev_item_err(leaf, slot,
695 693 "invalid objectid: has=%llu expect=%llu",
696 694 key->objectid, BTRFS_DEV_ITEMS_OBJECTID);
697 - return -EUCLEAN;
698 - }
699 - if (key->offset > max_devid) {
700 - dev_item_err(leaf, slot,
701 - "invalid devid: has=%llu expect=[0, %llu]",
702 - key->offset, max_devid);
703 695 return -EUCLEAN;
704 696 }
705 697 ditem = btrfs_item_ptr(leaf, slot, struct btrfs_dev_item);
+1
fs/btrfs/volumes.c
···
4967 4967 } else if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
4968 4968 max_stripe_size = SZ_32M;
4969 4969 max_chunk_size = 2 * max_stripe_size;
4970 + devs_max = min_t(int, devs_max, BTRFS_MAX_DEVS_SYS_CHUNK);
4970 4971 } else {
4971 4972 btrfs_err(info, "invalid chunk type 0x%llx requested",
4972 4973 type);
+22 -7
fs/ceph/file.c
···
753 753 if (!atomic_dec_and_test(&aio_req->pending_reqs))
754 754 return;
755 755
756 + if (aio_req->iocb->ki_flags & IOCB_DIRECT)
757 + inode_dio_end(inode);
758 +
756 759 ret = aio_req->error;
757 760 if (!ret)
758 761 ret = aio_req->total_len;
···
1094 1091 CEPH_CAP_FILE_RD);
1095 1092
1096 1093 list_splice(&aio_req->osd_reqs, &osd_reqs);
1094 + inode_dio_begin(inode);
1097 1095 while (!list_empty(&osd_reqs)) {
1098 1096 req = list_first_entry(&osd_reqs,
1099 1097 struct ceph_osd_request,
···
1268 1264 dout("aio_read %p %llx.%llx %llu~%u trying to get caps on %p\n",
1269 1265 inode, ceph_vinop(inode), iocb->ki_pos, (unsigned)len, inode);
1270 1266
1267 + if (iocb->ki_flags & IOCB_DIRECT)
1268 + ceph_start_io_direct(inode);
1269 + else
1270 + ceph_start_io_read(inode);
1271 +
1271 1272 if (fi->fmode & CEPH_FILE_MODE_LAZY)
1272 1273 want = CEPH_CAP_FILE_CACHE | CEPH_CAP_FILE_LAZYIO;
1273 1274 else
1274 1275 want = CEPH_CAP_FILE_CACHE;
1275 1276 ret = ceph_get_caps(filp, CEPH_CAP_FILE_RD, want, -1,
1276 1277 &got, &pinned_page);
1277 - if (ret < 0)
1278 + if (ret < 0) {
1279 + if (iocb->ki_flags & IOCB_DIRECT)
1280 + ceph_end_io_direct(inode);
1281 + else
1282 + ceph_end_io_read(inode);
1278 1283 return ret;
1284 + }
1279 1285
1280 1286 if ((got & (CEPH_CAP_FILE_CACHE|CEPH_CAP_FILE_LAZYIO)) == 0 ||
1281 1287 (iocb->ki_flags & IOCB_DIRECT) ||
···
1297 1283
1298 1284 if (ci->i_inline_version == CEPH_INLINE_NONE) {
1299 1285 if (!retry_op && (iocb->ki_flags & IOCB_DIRECT)) {
1300 - ceph_start_io_direct(inode);
1301 1286 ret = ceph_direct_read_write(iocb, to,
1302 1287 NULL, NULL);
1303 - ceph_end_io_direct(inode);
1304 1288 if (ret >= 0 && ret < len)
1305 1289 retry_op = CHECK_EOF;
1306 1290 } else {
1307 - ceph_start_io_read(inode);
1308 1291 ret = ceph_sync_read(iocb, to, &retry_op);
1309 - ceph_end_io_read(inode);
1310 1292 }
1311 1293 } else {
1312 1294 retry_op = READ_INLINE;
···
1313 1303 inode, ceph_vinop(inode), iocb->ki_pos, (unsigned)len,
1314 1304 ceph_cap_string(got));
1315 1305 ceph_add_rw_context(fi, &rw_ctx);
1316 - ceph_start_io_read(inode);
1317 1306 ret = generic_file_read_iter(iocb, to);
1318 - ceph_end_io_read(inode);
1319 1307 ceph_del_rw_context(fi, &rw_ctx);
1320 1308 }
1309 +
1321 1310 dout("aio_read %p %llx.%llx dropping cap refs on %s = %d\n",
1322 1311 inode, ceph_vinop(inode), ceph_cap_string(got), (int)ret);
1323 1312 if (pinned_page) {
···
1324 1315 pinned_page = NULL;
1325 1316 }
1326 1317 ceph_put_cap_refs(ci, got);
1318 +
1319 + if (iocb->ki_flags & IOCB_DIRECT)
1320 + ceph_end_io_direct(inode);
1321 + else
1322 + ceph_end_io_read(inode);
1323 +
1327 1324 if (retry_op > HAVE_RETRIED && ret >= 0) {
1328 1325 int statret;
1329 1326 struct page *page = NULL;
+1
fs/cifs/smb2pdu.h
···
838 838 struct create_context ccontext;
839 839 __u8 Name[8];
840 840 struct durable_reconnect_context_v2 dcontext;
841 + __u8 Pad[4];
841 842 } __packed;
842 843
843 844 /* See MS-SMB2 2.2.13.2.5 */
+1 -1
fs/configfs/symlink.c
···
101 101 }
102 102 target_sd->s_links++;
103 103 spin_unlock(&configfs_dirent_lock);
104 - ret = configfs_get_target_path(item, item, body);
104 + ret = configfs_get_target_path(parent_item, item, body);
105 105 if (!ret)
106 106 ret = configfs_create_link(target_sd, parent_item->ci_dentry,
107 107 dentry, body);
+53 -31
fs/ecryptfs/inode.c
···
128 128 struct inode *inode)
129 129 {
130 130 struct dentry *lower_dentry = ecryptfs_dentry_to_lower(dentry);
131 - struct inode *lower_dir_inode = ecryptfs_inode_to_lower(dir);
132 131 struct dentry *lower_dir_dentry;
132 + struct inode *lower_dir_inode;
133 133 int rc;
134 134
135 - dget(lower_dentry);
136 - lower_dir_dentry = lock_parent(lower_dentry);
137 - rc = vfs_unlink(lower_dir_inode, lower_dentry, NULL);
135 + lower_dir_dentry = ecryptfs_dentry_to_lower(dentry->d_parent);
136 + lower_dir_inode = d_inode(lower_dir_dentry);
137 + inode_lock_nested(lower_dir_inode, I_MUTEX_PARENT);
138 + dget(lower_dentry); // don't even try to make the lower negative
139 + if (lower_dentry->d_parent != lower_dir_dentry)
140 + rc = -EINVAL;
141 + else if (d_unhashed(lower_dentry))
142 + rc = -EINVAL;
143 + else
144 + rc = vfs_unlink(lower_dir_inode, lower_dentry, NULL);
138 145 if (rc) {
139 146 printk(KERN_ERR "Error in vfs_unlink; rc = [%d]\n", rc);
140 147 goto out_unlock;
···
149 142 fsstack_copy_attr_times(dir, lower_dir_inode);
150 143 set_nlink(inode, ecryptfs_inode_to_lower(inode)->i_nlink);
151 144 inode->i_ctime = dir->i_ctime;
152 - d_drop(dentry);
153 145 out_unlock:
154 - unlock_dir(lower_dir_dentry);
155 146 dput(lower_dentry);
147 + inode_unlock(lower_dir_inode);
148 + if (!rc)
149 + d_drop(dentry);
156 150 return rc;
157 151 }
158 152
···
319 311 static struct dentry *ecryptfs_lookup_interpose(struct dentry *dentry,
320 312 struct dentry *lower_dentry)
321 313 {
322 - struct inode *inode, *lower_inode = d_inode(lower_dentry);
314 + struct path *path = ecryptfs_dentry_to_lower_path(dentry->d_parent);
315 + struct inode *inode, *lower_inode;
323 316 struct ecryptfs_dentry_info *dentry_info;
324 - struct vfsmount *lower_mnt;
325 317 int rc = 0;
326 318
327 319 dentry_info = kmem_cache_alloc(ecryptfs_dentry_info_cache, GFP_KERNEL);
···
330 322 return ERR_PTR(-ENOMEM);
331 323 }
332 324
333 - lower_mnt = mntget(ecryptfs_dentry_to_lower_mnt(dentry->d_parent));
334 325 fsstack_copy_attr_atime(d_inode(dentry->d_parent),
335 - d_inode(lower_dentry->d_parent));
326 + d_inode(path->dentry));
336 327 BUG_ON(!d_count(lower_dentry));
337 328
338 329 ecryptfs_set_dentry_private(dentry, dentry_info);
339 - dentry_info->lower_path.mnt = lower_mnt;
330 + dentry_info->lower_path.mnt = mntget(path->mnt);
340 331 dentry_info->lower_path.dentry = lower_dentry;
341 332
342 - if (d_really_is_negative(lower_dentry)) {
333 + /*
334 + * negative dentry can go positive under us here - its parent is not
335 + * locked. That's OK and that could happen just as we return from
336 + * ecryptfs_lookup() anyway. Just need to be careful and fetch
337 + * ->d_inode only once - it's not stable here.
338 + */
339 + lower_inode = READ_ONCE(lower_dentry->d_inode);
340 +
341 + if (!lower_inode) {
343 342 /* We want to add because we couldn't find in lower */
344 343 d_add(dentry, NULL);
345 344 return NULL;
···
527 512 {
528 513 struct dentry *lower_dentry;
529 514 struct dentry *lower_dir_dentry;
515 + struct inode *lower_dir_inode;
530 516 int rc;
531 517
532 518 lower_dentry = ecryptfs_dentry_to_lower(dentry);
533 - dget(dentry);
534 - lower_dir_dentry = lock_parent(lower_dentry);
535 - dget(lower_dentry);
536 - rc = vfs_rmdir(d_inode(lower_dir_dentry), lower_dentry);
537 - dput(lower_dentry);
538 - if (!rc && d_really_is_positive(dentry))
519 + lower_dir_dentry = ecryptfs_dentry_to_lower(dentry->d_parent);
520 + lower_dir_inode = d_inode(lower_dir_dentry);
521 +
522 + inode_lock_nested(lower_dir_inode, I_MUTEX_PARENT);
523 + dget(lower_dentry); // don't even try to make the lower negative
524 + if (lower_dentry->d_parent != lower_dir_dentry)
525 + rc = -EINVAL;
526 + else if (d_unhashed(lower_dentry))
527 + rc = -EINVAL;
528 + else
529 + rc = vfs_rmdir(lower_dir_inode, lower_dentry);
530 + if (!rc) {
539 531 clear_nlink(d_inode(dentry));
540 - fsstack_copy_attr_times(dir, d_inode(lower_dir_dentry));
541 - set_nlink(dir, d_inode(lower_dir_dentry)->i_nlink);
542 - unlock_dir(lower_dir_dentry);
532 + fsstack_copy_attr_times(dir, lower_dir_inode);
533 + set_nlink(dir, lower_dir_inode->i_nlink);
534 + }
535 + dput(lower_dentry);
536 + inode_unlock(lower_dir_inode);
543 537 if (!rc)
544 538 d_drop(dentry);
545 - dput(dentry);
546 539 return rc;
547 540 }
548 541
···
588 565 struct dentry *lower_new_dentry;
589 566 struct dentry *lower_old_dir_dentry;
590 567 struct dentry *lower_new_dir_dentry;
591 - struct dentry *trap = NULL;
568 + struct dentry *trap;
592 569 struct inode *target_inode;
593 570
594 571 if (flags)
595 572 return -EINVAL;
596 573
574 + lower_old_dir_dentry = ecryptfs_dentry_to_lower(old_dentry->d_parent);
575 + lower_new_dir_dentry = ecryptfs_dentry_to_lower(new_dentry->d_parent);
576 +
597 577 lower_old_dentry = ecryptfs_dentry_to_lower(old_dentry);
598 578 lower_new_dentry = ecryptfs_dentry_to_lower(new_dentry);
599 - dget(lower_old_dentry);
600 - dget(lower_new_dentry);
601 - lower_old_dir_dentry = dget_parent(lower_old_dentry);
602 - lower_new_dir_dentry = dget_parent(lower_new_dentry);
579 +
603 580 target_inode = d_inode(new_dentry);
581 +
604 582 trap = lock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
583 + dget(lower_new_dentry);
605 584 rc = -EINVAL;
606 585 if (lower_old_dentry->d_parent != lower_old_dir_dentry)
607 586 goto out_lock;
···
631 606 if (new_dir != old_dir)
632 607 fsstack_copy_attr_all(old_dir, d_inode(lower_old_dir_dentry));
633 608 out_lock:
634 - unlock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
635 - dput(lower_new_dir_dentry);
636 - dput(lower_old_dir_dentry);
637 609 dput(lower_new_dentry);
638 - dput(lower_old_dentry);
610 + unlock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
639 611 return rc;
640 612 }
+19 -12
fs/exportfs/expfs.c
···
519 519 * inode is actually connected to the parent.
520 520 */
521 521 err = exportfs_get_name(mnt, target_dir, nbuf, result);
522 - if (!err) {
523 - inode_lock(target_dir->d_inode);
524 - nresult = lookup_one_len(nbuf, target_dir,
525 - strlen(nbuf));
526 - inode_unlock(target_dir->d_inode);
527 - if (!IS_ERR(nresult)) {
528 - if (nresult->d_inode) {
529 - dput(result);
530 - result = nresult;
531 - } else
532 - dput(nresult);
533 - }
522 + if (err) {
523 + dput(target_dir);
524 + goto err_result;
534 525 }
535 526
527 + inode_lock(target_dir->d_inode);
528 + nresult = lookup_one_len(nbuf, target_dir, strlen(nbuf));
529 + if (!IS_ERR(nresult)) {
530 + if (unlikely(nresult->d_inode != result->d_inode)) {
531 + dput(nresult);
532 + nresult = ERR_PTR(-ESTALE);
533 + }
534 + }
535 + inode_unlock(target_dir->d_inode);
536 536 /*
537 537 * At this point we are done with the parent, but it's pinned
538 538 * by the child dentry anyway.
539 539 */
540 540 dput(target_dir);
541 +
542 + if (IS_ERR(nresult)) {
543 + err = PTR_ERR(nresult);
544 + goto err_result;
545 + }
546 + dput(result);
547 + result = nresult;
541 548
542 549 /*
543 550 * And finally make sure the dentry is actually acceptable
+24 -8
fs/io_uring.c
···
326 326 #define REQ_F_TIMEOUT 1024 /* timeout request */
327 327 #define REQ_F_ISREG 2048 /* regular file */
328 328 #define REQ_F_MUST_PUNT 4096 /* must be punted even for NONBLOCK */
329 + #define REQ_F_TIMEOUT_NOSEQ 8192 /* no timeout sequence */
329 330 u64 user_data;
330 331 u32 result;
331 332 u32 sequence;
···
454 453 struct io_kiocb *req;
455 454
456 455 req = list_first_entry_or_null(&ctx->timeout_list, struct io_kiocb, list);
457 - if (req && !__io_sequence_defer(ctx, req)) {
458 - list_del_init(&req->list);
459 - return req;
456 + if (req) {
457 + if (req->flags & REQ_F_TIMEOUT_NOSEQ)
458 + return NULL;
459 + if (!__io_sequence_defer(ctx, req)) {
460 + list_del_init(&req->list);
461 + return req;
462 + }
460 463 }
461 464
462 465 return NULL;
···
1230 1225 }
1231 1226 }
1232 1227
1233 - return 0;
1228 + return len;
1234 1229 }
1235 1230
1236 1231 static ssize_t io_import_iovec(struct io_ring_ctx *ctx, int rw,
···
1946 1941 if (get_timespec64(&ts, u64_to_user_ptr(sqe->addr)))
1947 1942 return -EFAULT;
1948 1943
1944 + req->flags |= REQ_F_TIMEOUT;
1945 +
1949 1946 /*
1950 1947 * sqe->off holds how many events that need to occur for this
1951 - * timeout event to be satisfied.
1948 + * timeout event to be satisfied. If it isn't set, then this is
1949 + * a pure timeout request, sequence isn't used.
1952 1950 */
1953 1951 count = READ_ONCE(sqe->off);
1954 - if (!count)
1955 - count = 1;
1952 + if (!count) {
1953 + req->flags |= REQ_F_TIMEOUT_NOSEQ;
1954 + spin_lock_irq(&ctx->completion_lock);
1955 + entry = ctx->timeout_list.prev;
1956 + goto add;
1957 + }
1956 1958
1957 1959 req->sequence = ctx->cached_sq_head + count - 1;
1958 1960 /* reuse it to store the count */
1959 1961 req->submit.sequence = count;
1960 - req->flags |= REQ_F_TIMEOUT;
1961 1962
1962 1963 /*
1963 1964 * Insertion sort, ensuring the first entry in the list is always
···
1974 1963 struct io_kiocb *nxt = list_entry(entry, struct io_kiocb, list);
1975 1964 unsigned nxt_sq_head;
1976 1965 long long tmp, tmp_nxt;
1966 +
1967 + if (nxt->flags & REQ_F_TIMEOUT_NOSEQ)
1968 + continue;
1977 1969
1978 1970 /*
1979 1971 * Since cached_sq_head + count - 1 can overflow, use type long
···
2004 1990 nxt->sequence++;
2005 1991 }
2006 1992 req->sequence -= span;
1993 + add:
2007 1994 list_add(&req->list, entry);
2008 1995 spin_unlock_irq(&ctx->completion_lock);
2009 1996
···
2298 2283 switch (op) {
2299 2284 case IORING_OP_NOP:
2300 2285 case IORING_OP_POLL_REMOVE:
2286 + case IORING_OP_TIMEOUT:
2301 2287 return false;
2302 2288 default:
2303 2289 return true;
+7 -8
fs/namespace.c
···
2478 2478
2479 2479 time64_to_tm(sb->s_time_max, 0, &tm);
2480 2480
2481 - pr_warn("Mounted %s file system at %s supports timestamps until %04ld (0x%llx)\n",
2482 - sb->s_type->name, mntpath,
2481 + pr_warn("%s filesystem being %s at %s supports timestamps until %04ld (0x%llx)\n",
2482 + sb->s_type->name,
2483 + is_mounted(mnt) ? "remounted" : "mounted",
2484 + mntpath,
2483 2485 tm.tm_year+1900, (unsigned long long)sb->s_time_max);
2484 2486
2485 2487 free_page((unsigned long)buf);
···
2766 2764 if (IS_ERR(mnt))
2767 2765 return PTR_ERR(mnt);
2768 2766
2769 - error = do_add_mount(real_mount(mnt), mountpoint, mnt_flags);
2770 - if (error < 0) {
2771 - mntput(mnt);
2772 - return error;
2773 - }
2774 -
2775 2767 mnt_warn_timestamp_expiry(mountpoint, mnt);
2776 2768
2769 + error = do_add_mount(real_mount(mnt), mountpoint, mnt_flags);
2770 + if (error < 0)
2771 + mntput(mnt);
2777 2772 return error;
2778 2773 }
-7
include/asm-generic/vdso/vsyscall.h
···
25 25 }
26 26 #endif /* __arch_get_clock_mode */
27 27
28 - #ifndef __arch_use_vsyscall
29 - static __always_inline int __arch_use_vsyscall(struct vdso_data *vdata)
30 - {
31 - return 1;
32 - }
33 - #endif /* __arch_use_vsyscall */
34 -
35 28 #ifndef __arch_update_vsyscall
36 29 static __always_inline void __arch_update_vsyscall(struct vdso_data *vdata,
37 30 struct timekeeper *tk)
+1
include/linux/can/core.h
···
65 65 void *data);
66 66
67 67 extern int can_send(struct sk_buff *skb, int loop);
68 + void can_sock_destruct(struct sock *sk);
68 69
69 70 #endif /* !_CAN_CORE_H */
+7 -23
include/linux/cpu.h
···
59 59 struct device_attribute *attr, char *buf);
60 60 extern ssize_t cpu_show_mds(struct device *dev,
61 61 struct device_attribute *attr, char *buf);
62 + extern ssize_t cpu_show_tsx_async_abort(struct device *dev,
63 + struct device_attribute *attr,
64 + char *buf);
65 + extern ssize_t cpu_show_itlb_multihit(struct device *dev,
66 + struct device_attribute *attr, char *buf);
62 67
63 68 extern __printf(4, 5)
64 69 struct device *cpu_device_create(struct device *parent, void *drvdata,
···
218 213 static inline int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval) { return 0; }
219 214 #endif
220 215
221 - /*
222 - * These are used for a global "mitigations=" cmdline option for toggling
223 - * optional CPU mitigations.
224 - */
225 - enum cpu_mitigations {
226 - CPU_MITIGATIONS_OFF,
227 - CPU_MITIGATIONS_AUTO,
228 - CPU_MITIGATIONS_AUTO_NOSMT,
229 - };
230 -
231 - extern enum cpu_mitigations cpu_mitigations;
232 -
233 - /* mitigations=off */
234 - static inline bool cpu_mitigations_off(void)
235 - {
236 - return cpu_mitigations == CPU_MITIGATIONS_OFF;
237 - }
238 -
239 - /* mitigations=auto,nosmt */
240 - static inline bool cpu_mitigations_auto_nosmt(void)
241 - {
242 - return cpu_mitigations == CPU_MITIGATIONS_AUTO_NOSMT;
243 - }
216 + extern bool cpu_mitigations_off(void);
217 + extern bool cpu_mitigations_auto_nosmt(void);
244 218
245 219 #endif /* _LINUX_CPU_H_ */
+7
include/linux/kvm_host.h
···
966 966 void kvm_vcpu_kick(struct kvm_vcpu *vcpu);
967 967
968 968 bool kvm_is_reserved_pfn(kvm_pfn_t pfn);
969 + bool kvm_is_zone_device_pfn(kvm_pfn_t pfn);
969 970
970 971 struct kvm_irq_ack_notifier {
971 972 struct hlist_node link;
···
1382 1381 return 0;
1383 1382 }
1384 1383 #endif /* CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE */
1384 +
1385 + typedef int (*kvm_vm_thread_fn_t)(struct kvm *kvm, uintptr_t data);
1386 +
1387 + int kvm_vm_create_worker_thread(struct kvm *kvm, kvm_vm_thread_fn_t thread_fn,
1388 + uintptr_t data, const char *name,
1389 + struct task_struct **thread_ptr);
1385 1390
1386 1391 #endif
+1
include/linux/memory.h
···
119 119 typedef int (*walk_memory_blocks_func_t)(struct memory_block *, void *);
120 120 extern int walk_memory_blocks(unsigned long start, unsigned long size,
121 121 void *arg, walk_memory_blocks_func_t func);
122 + extern int for_each_memory_block(void *arg, walk_memory_blocks_func_t func);
122 123 #define CONFIG_MEM_BLOCK_SIZE (PAGES_PER_SECTION<<PAGE_SHIFT)
123 124 #endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
124 125
+2 -2
include/linux/reset-controller.h
···
7 7 struct reset_controller_dev;
8 8
9 9 /**
10 - * struct reset_control_ops
10 + * struct reset_control_ops - reset controller driver callbacks
11 11 *
12 12 * @reset: for self-deasserting resets, does all necessary
13 13 * things to reset the device
···
33 33 * @provider: name of the reset controller device controlling this reset line
34 34 * @index: ID of the reset controller in the reset controller device
35 35 * @dev_id: name of the device associated with this reset line
36 - * @con_id name of the reset line (can be NULL)
36 + * @con_id: name of the reset line (can be NULL)
37 37 */
38 38 struct reset_control_lookup {
39 39 struct list_head list;
+1 -1
include/linux/reset.h
···
143 143 * If this function is called more than once for the same reset_control it will
144 144 * return -EBUSY.
145 145 *
146 - * See reset_control_get_shared for details on shared references to
146 + * See reset_control_get_shared() for details on shared references to
147 147 * reset-controls.
148 148 *
149 149 * Use of id names is optional.
+1 -1
include/trace/events/tcp.h
···
86 86 sk->sk_v6_rcv_saddr, sk->sk_v6_daddr);
87 87 ),
88 88
89 - TP_printk("sport=%hu dport=%hu saddr=%pI4 daddr=%pI4 saddrv6=%pI6c daddrv6=%pI6c state=%s\n",
89 + TP_printk("sport=%hu dport=%hu saddr=%pI4 daddr=%pI4 saddrv6=%pI6c daddrv6=%pI6c state=%s",
90 90 __entry->sport, __entry->dport, __entry->saddr, __entry->daddr,
91 91 __entry->saddr_v6, __entry->daddr_v6,
92 92 show_tcp_state_name(__entry->state))
+2 -1
include/uapi/linux/devlink.h
···
421 421
422 422 DEVLINK_ATTR_RELOAD_FAILED, /* u8 0 or 1 */
423 423
424 + DEVLINK_ATTR_HEALTH_REPORTER_DUMP_TS_NS, /* u64 */
425 +
424 426 DEVLINK_ATTR_NETNS_FD, /* u32 */
425 427 DEVLINK_ATTR_NETNS_PID, /* u32 */
426 428 DEVLINK_ATTR_NETNS_ID, /* u32 */
427 -
428 429 /* add new attributes above here, update the policy in devlink.c */
429 430
430 431 __DEVLINK_ATTR_MAX,
+4 -1
include/uapi/linux/ptp_clock.h
···
31 31 #define PTP_ENABLE_FEATURE (1<<0)
32 32 #define PTP_RISING_EDGE (1<<1)
33 33 #define PTP_FALLING_EDGE (1<<2)
34 + #define PTP_STRICT_FLAGS (1<<3)
35 + #define PTP_EXTTS_EDGES (PTP_RISING_EDGE | PTP_FALLING_EDGE)
34 36
35 37 /*
36 38 * flag fields valid for the new PTP_EXTTS_REQUEST2 ioctl.
37 39 */
38 40 #define PTP_EXTTS_VALID_FLAGS (PTP_ENABLE_FEATURE | \
39 41 PTP_RISING_EDGE | \
40 - PTP_FALLING_EDGE)
42 + PTP_FALLING_EDGE | \
43 + PTP_STRICT_FLAGS)
41 44
42 45 /*
43 46 * flag fields valid for the original PTP_EXTTS_REQUEST ioctl.
+1 -1
kernel/audit_watch.c
···
351 351 struct dentry *d = kern_path_locked(watch->path, parent);
352 352 if (IS_ERR(d))
353 353 return PTR_ERR(d);
354 - inode_unlock(d_backing_inode(parent->dentry));
355 354 if (d_is_positive(d)) {
356 355 /* update watch filter fields */
357 356 watch->dev = d->d_sb->s_dev;
358 357 watch->ino = d_backing_inode(d)->i_ino;
359 358 }
359 + inode_unlock(d_backing_inode(parent->dentry));
360 360 dput(d);
361 361 return 0;
362 362 }
+3 -2
kernel/cgroup/cgroup.c
···
2119 2119
2120 2120 nsdentry = kernfs_node_dentry(cgrp->kn, sb);
2121 2121 dput(fc->root);
2122 - fc->root = nsdentry;
2123 2122 if (IS_ERR(nsdentry)) {
2124 - ret = PTR_ERR(nsdentry);
2125 2123 deactivate_locked_super(sb);
2124 + ret = PTR_ERR(nsdentry);
2125 + nsdentry = NULL;
2126 2126 }
2127 + fc->root = nsdentry;
2127 2128 }
2128 2129
2129 2130 if (!ctx->kfc.new_sb_created)
+26 -1
kernel/cpu.c
···
2373 2373 this_cpu_write(cpuhp_state.state, CPUHP_ONLINE);
2374 2374 }
2375 2375
2376 - enum cpu_mitigations cpu_mitigations __ro_after_init = CPU_MITIGATIONS_AUTO;
2376 + /*
2377 + * These are used for a global "mitigations=" cmdline option for toggling
2378 + * optional CPU mitigations.
2379 + */
2380 + enum cpu_mitigations {
2381 + CPU_MITIGATIONS_OFF,
2382 + CPU_MITIGATIONS_AUTO,
2383 + CPU_MITIGATIONS_AUTO_NOSMT,
2384 + };
2385 +
2386 + static enum cpu_mitigations cpu_mitigations __ro_after_init =
2387 + CPU_MITIGATIONS_AUTO;
2377 2388
2378 2389 static int __init mitigations_parse_cmdline(char *arg)
2379 2390 {
···
2401 2390 return 0;
2402 2391 }
2403 2392 early_param("mitigations", mitigations_parse_cmdline);
2393 +
2394 + /* mitigations=off */
2395 + bool cpu_mitigations_off(void)
2396 + {
2397 + return cpu_mitigations == CPU_MITIGATIONS_OFF;
2398 + }
2399 + EXPORT_SYMBOL_GPL(cpu_mitigations_off);
2400 +
2401 + /* mitigations=auto,nosmt */
2402 + bool cpu_mitigations_auto_nosmt(void)
2403 + {
2404 + return cpu_mitigations == CPU_MITIGATIONS_AUTO_NOSMT;
2405 + }
2406 + EXPORT_SYMBOL_GPL(cpu_mitigations_auto_nosmt);
+19 -4
kernel/events/core.c
···
1031 1031 {
1032 1032 }
1033 1033
1034 - void
1034 + static inline void
1035 1035 perf_cgroup_switch(struct task_struct *task, struct task_struct *next)
1036 1036 {
1037 1037 }
···
10535 10535 goto err_ns;
10536 10536 }
10537 10537
10538 + /*
10539 + * Disallow uncore-cgroup events, they don't make sense as the cgroup will
10540 + * be different on other CPUs in the uncore mask.
10541 + */
10542 + if (pmu->task_ctx_nr == perf_invalid_context && cgroup_fd != -1) {
10543 + err = -EINVAL;
10544 + goto err_pmu;
10545 + }
10546 +
10538 10547 if (event->attr.aux_output &&
10539 10548 !(pmu->capabilities & PERF_PMU_CAP_AUX_OUTPUT)) {
10540 10549 err = -EOPNOTSUPP;
···
11332 11323 int err;
11333 11324
11334 11325 /*
11335 - * Get the target context (task or percpu):
11326 + * Grouping is not supported for kernel events, neither is 'AUX',
11327 + * make sure the caller's intentions are adjusted.
11336 11328 */
11329 + if (attr->aux_output)
11330 + return ERR_PTR(-EINVAL);
11337 11331
11338 11332 event = perf_event_alloc(attr, cpu, task, NULL, NULL,
11339 11333 overflow_handler, context, -1);
···
11348 11336 /* Mark owner so we could distinguish it from user events. */
11349 11337 event->owner = TASK_TOMBSTONE;
11350 11338
11339 + /*
11340 + * Get the target context (task or percpu):
11341 + */
11351 11342 ctx = find_get_context(event->pmu, task, event);
11352 11343 if (IS_ERR(ctx)) {
11353 11344 err = PTR_ERR(ctx);
···
11802 11787 GFP_KERNEL);
11803 11788 if (!child_ctx->task_ctx_data) {
11804 11789 free_event(child_event);
11805 - return NULL;
11790 + return ERR_PTR(-ENOMEM);
11806 11791 }
11807 11792
···
11905 11890 if (IS_ERR(child_ctr))
11906 11891 return PTR_ERR(child_ctr);
11907 11892
11908 - if (sub->aux_event == parent_event &&
11893 + if (sub->aux_event == parent_event && child_ctr &&
11909 11894 !perf_get_aux_event(child_ctr, leader))
11910 11895 return -EINVAL;
11911 11896 }
+1 -1
kernel/irq/irqdomain.c
···
51 51 * @type: Type of irqchip_fwnode. See linux/irqdomain.h
52 52 * @name: Optional user provided domain name
53 53 * @id: Optional user provided id if name != NULL
54 - * @data: Optional user-provided data
54 + * @pa: Optional user-provided physical address
55 55 *
56 56 * Allocate a struct irqchip_fwid, and return a poiner to the embedded
57 57 * fwnode_handle (or NULL on failure).
+16 -7
kernel/sched/core.c
···
1073 1073 task_rq_unlock(rq, p, &rf);
1074 1074 }
1075 1075
1076 + #ifdef CONFIG_UCLAMP_TASK_GROUP
1076 1077 static inline void
1077 1078 uclamp_update_active_tasks(struct cgroup_subsys_state *css,
1078 1079 unsigned int clamps)
···
1092 1091 css_task_iter_end(&it);
1093 1092 }
1094 1093
1095 - #ifdef CONFIG_UCLAMP_TASK_GROUP
1096 1094 static void cpu_util_update_eff(struct cgroup_subsys_state *css);
1097 1095 static void uclamp_update_root_tg(void)
1098 1096 {
···
3929 3929 }
3930 3930
3931 3931 restart:
3932 + #ifdef CONFIG_SMP
3932 3933 /*
3933 - * Ensure that we put DL/RT tasks before the pick loop, such that they
3934 - * can PULL higher prio tasks when we lower the RQ 'priority'.
3934 + * We must do the balancing pass before put_next_task(), such
3935 + * that when we release the rq->lock the task is in the same
3936 + * state as before we took rq->lock.
3937 + *
3938 + * We can terminate the balance pass as soon as we know there is
3939 + * a runnable task of @class priority or higher.
3935 3940 */
3936 - prev->sched_class->put_prev_task(rq, prev, rf);
3937 - if (!rq->nr_running)
3938 - newidle_balance(rq, rf);
3941 + for_class_range(class, prev->sched_class, &idle_sched_class) {
3942 + if (class->balance(rq, prev, rf))
3943 + break;
3944 + }
3945 + #endif
3946 +
3947 + put_prev_task(rq, prev);
3939 3948
3940 3949 for_each_class(class) {
3941 3950 p = class->pick_next_task(rq, NULL, NULL);
···
6210 6201 for_each_class(class) {
6211 6202 next = class->pick_next_task(rq, NULL, NULL);
6212 6203 if (next) {
6213 - next->sched_class->put_prev_task(rq, next, NULL);
6204 + next->sched_class->put_prev_task(rq, next);
6214 6205 return next;
6215 6206 }
6216 6207 }
+20 -20
kernel/sched/deadline.c
···
1691 1691 resched_curr(rq);
1692 1692 }
1693 1693
1694 + static int balance_dl(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
1695 + {
1696 + if (!on_dl_rq(&p->dl) && need_pull_dl_task(rq, p)) {
1697 + /*
1698 + * This is OK, because current is on_cpu, which avoids it being
1699 + * picked for load-balance and preemption/IRQs are still
1700 + * disabled avoiding further scheduler activity on it and we've
1701 + * not yet started the picking loop.
1702 + */
1703 + rq_unpin_lock(rq, rf);
1704 + pull_dl_task(rq);
1705 + rq_repin_lock(rq, rf);
1706 + }
1707 +
1708 + return sched_stop_runnable(rq) || sched_dl_runnable(rq);
1709 + }
1694 1710 #endif /* CONFIG_SMP */
1695 1711
1696 1712 /*
···
1774 1758 pick_next_task_dl(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
1775 1759 {
1776 1760 struct sched_dl_entity *dl_se;
1761 + struct dl_rq *dl_rq = &rq->dl;
1777 1762 struct task_struct *p;
1778 - struct dl_rq *dl_rq;
1779 1763
1780 1764 WARN_ON_ONCE(prev || rf);
1781 1765
1782 - dl_rq = &rq->dl;
1783 -
1784 - if (unlikely(!dl_rq->dl_nr_running))
1766 + if (!sched_dl_runnable(rq))
1785 1767 return NULL;
1786 1768
1787 1769 dl_se = pick_next_dl_entity(rq, dl_rq);
1788 1770 BUG_ON(!dl_se);
1789 -
1790 1771 p = dl_task_of(dl_se);
1791 -
1792 1772 set_next_task_dl(rq, p);
1793 -
1794 1773 return p;
1795 1774 }
1796 1775
1797 - static void put_prev_task_dl(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
1776 + static void put_prev_task_dl(struct rq *rq, struct task_struct *p)
1798 1777 {
1799 1778 update_curr_dl(rq);
1800 1779
1801 1780 update_dl_rq_load_avg(rq_clock_pelt(rq), rq, 1);
1802 1781 if (on_dl_rq(&p->dl) && p->nr_cpus_allowed > 1)
1803 1782 enqueue_pushable_dl_task(rq, p);
1804 -
1805 - if (rf && !on_dl_rq(&p->dl) && need_pull_dl_task(rq, p)) {
1806 - /*
1807 - * This is OK, because current is on_cpu, which avoids it being
1808 - * picked for load-balance and preemption/IRQs are still
1809 - * disabled avoiding further scheduler activity on it and we've
1810 - * not yet started the picking loop.
1811 - */
1812 - rq_unpin_lock(rq, rf);
1813 - pull_dl_task(rq);
1814 - rq_repin_lock(rq, rf);
1815 - }
1816 1783 }
1817 1784
1818 1785 /*
···
2441 2442 .set_next_task = set_next_task_dl,
2442 2443
2443 2444 #ifdef CONFIG_SMP
2445 + .balance = balance_dl,
2444 2446 .select_task_rq = select_task_rq_dl,
2445 2447 .migrate_task_rq = migrate_task_rq_dl,
2446 2448 .set_cpus_allowed = set_cpus_allowed_dl,
+12 -3
kernel/sched/fair.c
···
6570 6570 {
6571 6571 remove_entity_load_avg(&p->se);
6572 6572 }
6573 +
6574 + static int
6575 + balance_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
6576 + {
6577 + if (rq->nr_running)
6578 + return 1;
6579 +
6580 + return newidle_balance(rq, rf) != 0;
6581 + }
6573 6582 #endif /* CONFIG_SMP */
6574 6583
6575 6584 static unsigned long wakeup_gran(struct sched_entity *se)
···
6755 6746 int new_tasks;
6756 6747
6757 6748 again:
6758 - if (!cfs_rq->nr_running)
6749 + if (!sched_fair_runnable(rq))
6759 6750 goto idle;
6760 6751
6761 6752 #ifdef CONFIG_FAIR_GROUP_SCHED
···
6893 6884 /*
6894 6885 * Account for a descheduled task:
6895 6886 */
6896 - static void put_prev_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
6887 + static void put_prev_task_fair(struct rq *rq, struct task_struct *prev)
6897 6888 {
6898 6889 struct sched_entity *se = &prev->se;
6899 6890 struct cfs_rq *cfs_rq;
···
10423 10414 .check_preempt_curr = check_preempt_wakeup,
10424 10415
10425 10416 .pick_next_task = pick_next_task_fair,
10426 -
10427 10417 .put_prev_task = put_prev_task_fair,
10428 10418 .set_next_task = set_next_task_fair,
10429 10419
10430 10420 #ifdef CONFIG_SMP
10421 + .balance = balance_fair,
10431 10422 .select_task_rq = select_task_rq_fair,
10432 10423 .migrate_task_rq = migrate_task_rq_fair,
10433 10424
+8 -1
kernel/sched/idle.c
···
365 365 {
366 366 return task_cpu(p); /* IDLE tasks as never migrated */
367 367 }
368 +
369 + static int
370 + balance_idle(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
371 + {
372 + return WARN_ON_ONCE(1);
373 + }
368 374 #endif
369 375
370 376 /*
···
381 375 resched_curr(rq);
382 376 }
383 377
384 - static void put_prev_task_idle(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
378 + static void put_prev_task_idle(struct rq *rq, struct task_struct *prev)
385 379 {
386 380 }
387 381
···
466 460 .set_next_task = set_next_task_idle,
467 461
468 462 #ifdef CONFIG_SMP
463 + .balance = balance_idle,
469 464 .select_task_rq = select_task_rq_idle,
470 465 .set_cpus_allowed = set_cpus_allowed_common,
471 466 #endif
+19 -18
kernel/sched/rt.c
···
1469 1469 resched_curr(rq);
1470 1470 }
1471 1471
1472 + static int balance_rt(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
1473 + {
1474 + if (!on_rt_rq(&p->rt) && need_pull_rt_task(rq, p)) {
1475 + /*
1476 + * This is OK, because current is on_cpu, which avoids it being
1477 + * picked for load-balance and preemption/IRQs are still
1478 + * disabled avoiding further scheduler activity on it and we've
1479 + * not yet started the picking loop.
1480 + */
1481 + rq_unpin_lock(rq, rf);
1482 + pull_rt_task(rq);
1483 + rq_repin_lock(rq, rf);
1484 + }
1485 +
1486 + return sched_stop_runnable(rq) || sched_dl_runnable(rq) || sched_rt_runnable(rq);
1487 + }
1472 1488 #endif /* CONFIG_SMP */
1473 1489
1474 1490 /*
···
1568 1552 pick_next_task_rt(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
1569 1553 {
1570 1554 struct task_struct *p;
1571 - struct rt_rq *rt_rq = &rq->rt;
1572 1555
1573 1556 WARN_ON_ONCE(prev || rf);
1574 1557
1575 - if (!rt_rq->rt_queued)
1558 + if (!sched_rt_runnable(rq))
1576 1559 return NULL;
1577 1560
1578 1561 p = _pick_next_task_rt(rq);
1579 -
1580 1562 set_next_task_rt(rq, p);
1581 -
1582 1563 return p;
1583 1564 }
1584 1565
1585 - static void put_prev_task_rt(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
1566 + static void put_prev_task_rt(struct rq *rq, struct task_struct *p)
1586 1567 {
1587 1568 update_curr_rt(rq);
1588 1569
···
1591 1578 */
1592 1579 if (on_rt_rq(&p->rt) && p->nr_cpus_allowed > 1)
1593 1580 enqueue_pushable_task(rq, p);
1594 -
1595 - if (rf && !on_rt_rq(&p->rt) && need_pull_rt_task(rq, p)) {
1596 - /*
1597 - * This is OK, because current is on_cpu, which avoids it being
1598 - * picked for load-balance and preemption/IRQs are still
1599 - * disabled avoiding further scheduler activity on it and we've
1600 - * not yet started the picking loop.
1601 - */
1602 - rq_unpin_lock(rq, rf);
1603 - pull_rt_task(rq);
1604 - rq_repin_lock(rq, rf);
1605 - }
1606 1581 }
1607 1582
1608 1583 #ifdef CONFIG_SMP
···
2367 2366 .set_next_task = set_next_task_rt,
2368 2367
2369 2368 #ifdef CONFIG_SMP
2369 + .balance = balance_rt,
2370 2370 .select_task_rq = select_task_rq_rt,
2371 -
2372 2371 .set_cpus_allowed = set_cpus_allowed_common,
2373 2372 .rq_online = rq_online_rt,
2374 2373 .rq_offline = rq_offline_rt,
+27 -3
kernel/sched/sched.h
···
1727 1727 struct task_struct * (*pick_next_task)(struct rq *rq,
1728 1728 struct task_struct *prev,
1729 1729 struct rq_flags *rf);
1730 - void (*put_prev_task)(struct rq *rq, struct task_struct *p, struct rq_flags *rf);
1730 + void (*put_prev_task)(struct rq *rq, struct task_struct *p);
1731 1731 void (*set_next_task)(struct rq *rq, struct task_struct *p);
1732 1732
1733 1733 #ifdef CONFIG_SMP
1734 + int (*balance)(struct rq *rq, struct task_struct *prev, struct rq_flags *rf);
1734 1735 int (*select_task_rq)(struct task_struct *p, int task_cpu, int sd_flag, int flags);
1735 1736 void (*migrate_task_rq)(struct task_struct *p, int new_cpu);
1736 1737
···
1774 1773 static inline void put_prev_task(struct rq *rq, struct task_struct *prev)
1775 1774 {
1776 1775 WARN_ON_ONCE(rq->curr != prev);
1777 - prev->sched_class->put_prev_task(rq, prev, NULL);
1776 + prev->sched_class->put_prev_task(rq, prev);
1778 1777 }
1779 1778
1780 1779 static inline void set_next_task(struct rq *rq, struct task_struct *next)
···
1788 1787 #else
1789 1788 #define sched_class_highest (&dl_sched_class)
1790 1789 #endif
1790 +
1791 + #define for_class_range(class, _from, _to) \
1792 + for (class = (_from); class != (_to); class = class->next)
1793 +
1791 1794 #define for_each_class(class) \
1792 - for (class = sched_class_highest; class; class = class->next)
1795 + for_class_range(class, sched_class_highest, NULL)
1793 1796
1794 1797 extern const struct sched_class stop_sched_class;
1795 1798 extern const struct sched_class dl_sched_class;
···
1801 1796 extern const struct sched_class fair_sched_class;
1802 1797 extern const struct sched_class idle_sched_class;
1803 1798
1799 + static inline bool sched_stop_runnable(struct rq *rq)
1800 + {
1801 + return rq->stop && task_on_rq_queued(rq->stop);
1802 + }
1803 +
1804 + static inline bool sched_dl_runnable(struct rq *rq)
1805 + {
1806 + return rq->dl.dl_nr_running > 0;
1807 + }
1808 +
1809 + static inline bool sched_rt_runnable(struct rq *rq)
1810 + {
1811 + return rq->rt.rt_queued > 0;
1812 + }
1813 +
1814 + static inline bool sched_fair_runnable(struct rq *rq)
1815 + {
1816 + return rq->cfs.nr_running > 0;
1817 + }
1804 1818
1805 1819 #ifdef CONFIG_SMP
1806 1820
+11 -7
kernel/sched/stop_task.c
···
15 15 {
16 16 return task_cpu(p); /* stop tasks as never migrate */
17 17 }
18 +
19 + static int
20 + balance_stop(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
21 + {
22 + return sched_stop_runnable(rq);
23 + }
18 24 #endif /* CONFIG_SMP */
19 25
20 26 static void
···
37 31 static struct task_struct *
38 32 pick_next_task_stop(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
39 33 {
40 - struct task_struct *stop = rq->stop;
41 -
42 34 WARN_ON_ONCE(prev || rf);
43 35
44 - if (!stop || !task_on_rq_queued(stop))
36 + if (!sched_stop_runnable(rq))
45 37 return NULL;
46 38
47 - set_next_task_stop(rq, stop);
48 -
49 - return stop;
39 + set_next_task_stop(rq, rq->stop);
40 + return rq->stop;
50 41 }
51 42
52 43 static void
···
63 60 BUG(); /* the stop task should never yield, its pointless. */
64 61 }
65 62
66 - static void put_prev_task_stop(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
63 + static void put_prev_task_stop(struct rq *rq, struct task_struct *prev)
67 64 {
68 65 struct task_struct *curr = rq->curr;
69 66 u64 delta_exec;
···
132 129 .set_next_task = set_next_task_stop,
133 130
134 131 #ifdef CONFIG_SMP
132 + .balance = balance_stop,
135 133 .select_task_rq = select_task_rq_stop,
136 134 .set_cpus_allowed = set_cpus_allowed_common,
137 135 #endif
+1 -1
kernel/signal.c
···
2205 2205 */
2206 2206 preempt_disable();
2207 2207 read_unlock(&tasklist_lock);
2208 - preempt_enable_no_resched();
2209 2208 cgroup_enter_frozen();
2209 + preempt_enable_no_resched();
2210 2210 freezable_schedule();
2211 2211 cgroup_leave_frozen(true);
2212 2212 } else {
+4 -2
kernel/stacktrace.c
···
141 141 struct stacktrace_cookie c = {
142 142 .store = store,
143 143 .size = size,
144 - .skip = skipnr + 1,
144 + /* skip this function if they are tracing us */
145 + .skip = skipnr + !!(current == tsk),
145 146 };
146 147
147 148 if (!try_get_task_stack(tsk))
···
299 298 struct stack_trace trace = {
300 299 .entries = store,
301 300 .max_entries = size,
302 - .skip = skipnr + 1,
301 + /* skip this function if they are tracing us */
302 + .skip = skipnr + !!(current == task),
303 303 };
304 304
305 305 save_stack_trace_tsk(task, &trace);
+1 -1
kernel/time/ntp.c
···
771 771 /* fill PPS status fields */
772 772 pps_fill_timex(txc);
773 773
774 - txc->time.tv_sec = (time_t)ts->tv_sec;
774 + txc->time.tv_sec = ts->tv_sec;
775 775 txc->time.tv_usec = ts->tv_nsec;
776 776 if (!(time_status & STA_NANO))
777 777 txc->time.tv_usec = ts->tv_nsec / NSEC_PER_USEC;
+3 -6
kernel/time/vsyscall.c
···
110 110 nsec = nsec + tk->wall_to_monotonic.tv_nsec;
111 111 vdso_ts->sec += __iter_div_u64_rem(nsec, NSEC_PER_SEC, &vdso_ts->nsec);
112 112
113 - if (__arch_use_vsyscall(vdata))
114 - update_vdso_data(vdata, tk);
113 + update_vdso_data(vdata, tk);
115 114
116 115 __arch_update_vsyscall(vdata, tk);
117 116
···
123 124 {
124 125 struct vdso_data *vdata = __arch_get_k_vdso_data();
125 126
126 - if (__arch_use_vsyscall(vdata)) {
127 - vdata[CS_HRES_COARSE].tz_minuteswest = sys_tz.tz_minuteswest;
128 - vdata[CS_HRES_COARSE].tz_dsttime = sys_tz.tz_dsttime;
129 - }
127 + vdata[CS_HRES_COARSE].tz_minuteswest = sys_tz.tz_minuteswest;
128 + vdata[CS_HRES_COARSE].tz_dsttime = sys_tz.tz_dsttime;
130 129
131 130 __arch_sync_vdso_data(vdata);
132 131 }
-1
lib/Kconfig
···
447 447 config HAS_IOMEM
448 448 bool
449 449 depends on !NO_IOMEM
450 - select GENERIC_IO
451 450 default y
452 451
453 452 config HAS_IOPORT_MAP
+1
lib/xz/xz_dec_lzma2.c
···
1146 1146
1147 1147 if (DEC_IS_DYNALLOC(s->dict.mode)) {
1148 1148 if (s->dict.allocated < s->dict.size) {
1149 + s->dict.allocated = s->dict.size;
1149 1150 vfree(s->dict.buf);
1150 1151 s->dict.buf = vmalloc(s->dict.size);
1151 1152 if (s->dict.buf == NULL) {
+17 -14
mm/debug.c
···
67 67 */
68 68 mapcount = PageSlab(page) ? 0 : page_mapcount(page);
69 69
70 - pr_warn("page:%px refcount:%d mapcount:%d mapping:%px index:%#lx",
71 - page, page_ref_count(page), mapcount,
72 - page->mapping, page_to_pgoff(page));
73 70 if (PageCompound(page))
74 - pr_cont(" compound_mapcount: %d", compound_mapcount(page));
75 - pr_cont("\n");
76 - if (PageAnon(page))
77 - pr_warn("anon ");
78 - else if (PageKsm(page))
79 - pr_warn("ksm ");
71 + pr_warn("page:%px refcount:%d mapcount:%d mapping:%px "
72 + "index:%#lx compound_mapcount: %d\n",
73 + page, page_ref_count(page), mapcount,
74 + page->mapping, page_to_pgoff(page),
75 + compound_mapcount(page));
76 + else
77 + pr_warn("page:%px refcount:%d mapcount:%d mapping:%px index:%#lx\n",
78 + page, page_ref_count(page), mapcount,
79 + page->mapping, page_to_pgoff(page));
80 + if (PageKsm(page))
81 + pr_warn("ksm flags: %#lx(%pGp)\n", page->flags, &page->flags);
82 + else if (PageAnon(page))
83 + pr_warn("anon flags: %#lx(%pGp)\n", page->flags, &page->flags);
80 84 else if (mapping) {
81 - pr_warn("%ps ", mapping->a_ops);
82 85 if (mapping->host && mapping->host->i_dentry.first) {
83 86 struct dentry *dentry;
84 87 dentry = container_of(mapping->host->i_dentry.first, struct dentry, d_u.d_alias);
85 - pr_warn("name:\"%pd\" ", dentry);
86 - }
88 + pr_warn("%ps name:\"%pd\"\n", mapping->a_ops, dentry);
89 + } else
90 + pr_warn("%ps\n", mapping->a_ops);
91 + pr_warn("flags: %#lx(%pGp)\n", page->flags, &page->flags);
87 92 }
88 93 BUILD_BUG_ON(ARRAY_SIZE(pageflag_names) != __NR_PAGEFLAGS + 1);
89 -
90 - pr_warn("flags: %#lx(%pGp)\n", page->flags, &page->flags);
91 94
92 95 hex_only:
93 96 print_hex_dump(KERN_WARNING, "raw: ", DUMP_PREFIX_NONE, 32,
+1 -1
mm/hugetlb_cgroup.c
···
196 196 again:
197 197 rcu_read_lock();
198 198 h_cg = hugetlb_cgroup_from_task(current);
199 - if (!css_tryget_online(&h_cg->css)) {
199 + if (!css_tryget(&h_cg->css)) {
200 200 rcu_read_unlock();
201 201 goto again;
202 202 }
+16 -12
mm/khugepaged.c
···
1602 1602 result = SCAN_FAIL;
1603 1603 goto xa_unlocked;
1604 1604 }
1605 - } else if (!PageUptodate(page)) {
1606 - xas_unlock_irq(&xas);
1607 - wait_on_page_locked(page);
1608 - if (!trylock_page(page)) {
1609 - result = SCAN_PAGE_LOCK;
1610 - goto xa_unlocked;
1611 - }
1612 - get_page(page);
1613 - } else if (PageDirty(page)) {
1614 - result = SCAN_FAIL;
1615 - goto xa_locked;
1616 1605 } else if (trylock_page(page)) {
1617 1606 get_page(page);
1618 1607 xas_unlock_irq(&xas);
···
1616 1627 * without racing with truncate.
1617 1628 */
1618 1629 VM_BUG_ON_PAGE(!PageLocked(page), page);
1619 - VM_BUG_ON_PAGE(!PageUptodate(page), page);
1630 +
1631 + /* make sure the page is up to date */
1632 + if (unlikely(!PageUptodate(page))) {
1633 + result = SCAN_FAIL;
1634 + goto out_unlock;
1635 + }
1620 1636
1621 1637 /*
1622 1638 * If file was truncated then extended, or hole-punched, before
···
1634 1640
1635 1641 if (page_mapping(page) != mapping) {
1636 1642 result = SCAN_TRUNCATED;
1643 + goto out_unlock;
1644 + }
1645 +
1646 + if (!is_shmem && PageDirty(page)) {
1647 + /*
1648 + * khugepaged only works on read-only fd, so this
1649 + * page is dirty because it hasn't been flushed
1650 + * since first write.
1651 + */
1652 + result = SCAN_FAIL;
1637 1653 goto out_unlock;
1638 1654 }
1639 1655
+12 -4
mm/madvise.c
···
363 363 ClearPageReferenced(page);
364 364 test_and_clear_page_young(page);
365 365 if (pageout) {
366 - if (!isolate_lru_page(page))
367 - list_add(&page->lru, &page_list);
366 + if (!isolate_lru_page(page)) {
367 + if (PageUnevictable(page))
368 + putback_lru_page(page);
369 + else
370 + list_add(&page->lru, &page_list);
371 + }
368 372 } else
369 373 deactivate_page(page);
370 374 huge_unlock:
···
445 441 ClearPageReferenced(page);
446 442 test_and_clear_page_young(page);
447 443 if (pageout) {
448 - if (!isolate_lru_page(page))
449 - list_add(&page->lru, &page_list);
444 + if (!isolate_lru_page(page)) {
445 + if (PageUnevictable(page))
446 + putback_lru_page(page);
447 + else
448 + list_add(&page->lru, &page_list);
449 + }
450 450 } else
451 451 deactivate_page(page);
452 452 }
+1 -1
mm/memcontrol.c
···
960 960 if (unlikely(!memcg))
961 961 memcg = root_mem_cgroup;
962 962 }
963 - } while (!css_tryget_online(&memcg->css));
963 + } while (!css_tryget(&memcg->css));
964 964 rcu_read_unlock();
965 965 return memcg;
966 966 }
+28 -17
mm/memory_hotplug.c
···
1646 1646 return 0;
1647 1647 }
1648 1648
1649 + static int check_no_memblock_for_node_cb(struct memory_block *mem, void *arg)
1650 + {
1651 + int nid = *(int *)arg;
1652 +
1653 + /*
1654 + * If a memory block belongs to multiple nodes, the stored nid is not
1655 + * reliable. However, such blocks are always online (e.g., cannot get
1656 + * offlined) and, therefore, are still spanned by the node.
1657 + */
1658 + return mem->nid == nid ? -EEXIST : 0;
1659 + }
1660 +
1649 1661 /**
1650 1662 * try_offline_node
1651 1663 * @nid: the node ID
···
1670 1658 void try_offline_node(int nid)
1671 1659 {
1672 1660 pg_data_t *pgdat = NODE_DATA(nid);
1673 - unsigned long start_pfn = pgdat->node_start_pfn;
1674 - unsigned long end_pfn = start_pfn + pgdat->node_spanned_pages;
1675 - unsigned long pfn;
1661 + int rc;
1676 1662
1677 - for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
1678 - unsigned long section_nr = pfn_to_section_nr(pfn);
1679 -
1680 - if (!present_section_nr(section_nr))
1681 - continue;
1682 -
1683 - if (pfn_to_nid(pfn) != nid)
1684 - continue;
1685 -
1686 - /*
1687 - * some memory sections of this node are not removed, and we
1688 - * can't offline node now.
1689 - */
1663 + /*
1664 + * If the node still spans pages (especially ZONE_DEVICE), don't
1665 + * offline it. A node spans memory after move_pfn_range_to_zone(),
1666 + * e.g., after the memory block was onlined.
1667 + */
1668 + if (pgdat->node_spanned_pages)
1690 1669 return;
1691 - }
1670 +
1671 + /*
1672 + * Especially offline memory blocks might not be spanned by the
1673 + * node. They will get spanned by the node once they get onlined.
1674 + * However, they link to the node in sysfs and can get onlined later.
1675 + */
1676 + rc = for_each_memory_block(&nid, check_no_memblock_for_node_cb);
1677 + if (rc)
1678 + return;
1692 1679
1693 1680 if (check_cpu_on_node(pgdat))
1694 1681 return;
+9 -5
mm/mempolicy.c
···
672 672 * 1 - there is unmovable page, but MPOL_MF_MOVE* & MPOL_MF_STRICT were
673 673 * specified.
674 674 * 0 - queue pages successfully or no misplaced page.
675 - * -EIO - there is misplaced page and only MPOL_MF_STRICT was specified.
675 + * errno - i.e. misplaced pages with MPOL_MF_STRICT specified (-EIO) or
676 + * memory range specified by nodemask and maxnode points outside
677 + * your accessible address space (-EFAULT)
676 678 */
677 679 static int
678 680 queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end,
···
1288 1286 flags | MPOL_MF_INVERT, &pagelist);
1289 1287
1290 1288 if (ret < 0) {
1291 - err = -EIO;
1289 + err = ret;
1292 1290 goto up_out;
1293 1291 }
1294 1292
···
1307 1305
1308 1306 if ((ret > 0) || (nr_failed && (flags & MPOL_MF_STRICT)))
1309 1307 err = -EIO;
1310 - } else
1311 - putback_movable_pages(&pagelist);
1312 -
1308 + } else {
1313 1309 up_out:
1310 + if (!list_empty(&pagelist))
1311 + putback_movable_pages(&pagelist);
1312 + }
1313 +
1314 1314 up_write(&mm->mmap_sem);
1315 1315 mpol_out:
1316 1316 mpol_put(new);
+3 -3
mm/page_io.c
···
73 73 {
74 74 struct swap_info_struct *sis;
75 75 struct gendisk *disk;
76 + swp_entry_t entry;
76 77
77 78 /*
78 79 * There is no guarantee that the page is in swap cache - the software
···
105 104 * we again wish to reclaim it.
106 105 */
107 106 disk = sis->bdev->bd_disk;
108 - if (disk->fops->swap_slot_free_notify) {
109 - swp_entry_t entry;
107 + entry.val = page_private(page);
108 + if (disk->fops->swap_slot_free_notify && __swap_count(entry) == 1) {
110 109 unsigned long offset;
111 110
112 - entry.val = page_private(page);
113 111 offset = swp_offset(entry);
114 112
115 113 SetPageDirty(page);
+9 -30
mm/slub.c
···
1433 1433 void *old_tail = *tail ? *tail : *head;
1434 1434 int rsize;
1435 1435
1436 - if (slab_want_init_on_free(s)) {
1437 - void *p = NULL;
1436 + /* Head and tail of the reconstructed freelist */
1437 + *head = NULL;
1438 + *tail = NULL;
1438 1439
1439 - do {
1440 - object = next;
1441 - next = get_freepointer(s, object);
1440 + do {
1441 + object = next;
1442 + next = get_freepointer(s, object);
1443 +
1444 + if (slab_want_init_on_free(s)) {
1442 1445 /*
1443 1446 * Clear the object and the metadata, but don't touch
1444 1447 * the redzone.
···
1451 1448 : 0;
1452 1449 memset((char *)object + s->inuse, 0,
1453 1450 s->size - s->inuse - rsize);
1454 - set_freepointer(s, object, p);
1455 - p = object;
1456 - } while (object != old_tail);
1457 - }
1458 1451
1459 - /*
1460 - * Compiler cannot detect this function can be removed if slab_free_hook()
1461 - * evaluates to nothing. Thus, catch all relevant config debug options here.
1462 - */
1463 - #if defined(CONFIG_LOCKDEP) || \
1464 - defined(CONFIG_DEBUG_KMEMLEAK) || \
1465 - defined(CONFIG_DEBUG_OBJECTS_FREE) || \
1466 - defined(CONFIG_KASAN)
1467 -
1468 - next = *head;
1469 -
1470 - /* Head and tail of the reconstructed freelist */
1471 - *head = NULL;
1472 - *tail = NULL;
1473 -
1474 - do {
1475 - object = next;
1476 - next = get_freepointer(s, object);
1452 + }
1477 1453 /* If object's reuse doesn't have to be delayed */
1478 1454 if (!slab_free_hook(s, object)) {
1479 1455 /* Move object to the new freelist */
···
1467 1485 *tail = NULL;
1468 1486
1469 1487 return *head != NULL;
1470 - #else
1471 - return true;
1472 - #endif
1473 1488 }
1474 1489
1475 1490 static void *setup_object(struct kmem_cache *s, struct page *page,
+2 -1
net/can/af_can.c
···
86 86
87 87 /* af_can socket functions */
88 88
89 - static void can_sock_destruct(struct sock *sk)
89 + void can_sock_destruct(struct sock *sk)
90 90 {
91 91 skb_queue_purge(&sk->sk_receive_queue);
92 92 skb_queue_purge(&sk->sk_error_queue);
93 93 }
94 + EXPORT_SYMBOL(can_sock_destruct);
94 95
95 96 static const struct can_proto *can_get_proto(int protocol)
96 97 {
+9
net/can/j1939/main.c
···
51 51 if (!skb)
52 52 return;
53 53
54 + j1939_priv_get(priv);
54 55 can_skb_set_owner(skb, iskb->sk);
55 56
56 57 /* get a pointer to the header of the skb
···
105 104 j1939_simple_recv(priv, skb);
106 105 j1939_sk_recv(priv, skb);
107 106 done:
107 + j1939_priv_put(priv);
108 108 kfree_skb(skb);
109 109 }
110 110
···
151 149 struct net_device *ndev = priv->ndev;
152 150
153 151 netdev_dbg(priv->ndev, "%s: 0x%p\n", __func__, priv);
152 +
153 + WARN_ON_ONCE(!list_empty(&priv->active_session_list));
154 + WARN_ON_ONCE(!list_empty(&priv->ecus));
155 + WARN_ON_ONCE(!list_empty(&priv->j1939_socks));
154 156
155 157 dev_put(ndev);
156 158 kfree(priv);
···
212 206 static inline struct j1939_priv *j1939_ndev_to_priv(struct net_device *ndev)
213 207 {
214 208 struct can_ml_priv *can_ml_priv = ndev->ml_priv;
209 +
210 + if (!can_ml_priv)
211 + return NULL;
215 212
216 213 return can_ml_priv->j1939_priv;
217 214 }
+74 -20
net/can/j1939/socket.c
···
78 78 {
79 79 jsk->state |= J1939_SOCK_BOUND;
80 80 j1939_priv_get(priv);
81 - jsk->priv = priv;
82 81
83 82 spin_lock_bh(&priv->j1939_socks_lock);
84 83 list_add_tail(&jsk->list, &priv->j1939_socks);
···
90 91 list_del_init(&jsk->list);
91 92 spin_unlock_bh(&priv->j1939_socks_lock);
92 93
93 - jsk->priv = NULL;
94 94 j1939_priv_put(priv);
95 95 jsk->state &= ~J1939_SOCK_BOUND;
96 96 }
···
347 349 spin_unlock_bh(&priv->j1939_socks_lock);
348 350 }
349 351
352 + static void j1939_sk_sock_destruct(struct sock *sk)
353 + {
354 + struct j1939_sock *jsk = j1939_sk(sk);
355 +
356 + /* This function will be called by the generic networking code, when
357 + * the socket is ultimately closed (sk->sk_destruct).
358 + *
359 + * The race between
360 + * - processing a received CAN frame
361 + * (can_receive -> j1939_can_recv)
362 + * and accessing j1939_priv
363 + * ... and ...
364 + * - closing a socket
365 + * (j1939_can_rx_unregister -> can_rx_unregister)
366 + * and calling the final j1939_priv_put()
367 + *
368 + * is avoided by calling the final j1939_priv_put() from this
369 + * RCU deferred cleanup call.
370 + */
371 + if (jsk->priv) {
372 + j1939_priv_put(jsk->priv);
373 + jsk->priv = NULL;
374 + }
375 +
376 + /* call generic CAN sock destruct */
377 + can_sock_destruct(sk);
378 + }
379 +
350 380 static int j1939_sk_init(struct sock *sk)
351 381 {
352 382 struct j1939_sock *jsk = j1939_sk(sk);
···
397 371 atomic_set(&jsk->skb_pending, 0);
398 372 spin_lock_init(&jsk->sk_session_queue_lock);
399 373 INIT_LIST_HEAD(&jsk->sk_session_queue);
374 + sk->sk_destruct = j1939_sk_sock_destruct;
400 375
401 376 return 0;
402 377 }
···
470 443 }
471 444
472 445 jsk->ifindex = addr->can_ifindex;
446 +
447 + /* the corresponding j1939_priv_put() is called via
448 + * sk->sk_destruct, which points to j1939_sk_sock_destruct()
449 + */
450 + j1939_priv_get(priv);
451 + jsk->priv = priv;
473 452 }
474 453
475 454 /* set default transmit pgn */
···
593 560 if (!sk)
594 561 return 0;
595 562
596 - jsk = j1939_sk(sk);
597 563 lock_sock(sk);
564 + jsk = j1939_sk(sk);
598 565
599 566 if (jsk->state & J1939_SOCK_BOUND) {
600 567 struct j1939_priv *priv = jsk->priv;
···
1092 1059 {
1093 1060 struct sock *sk = sock->sk;
1094 1061 struct j1939_sock *jsk = j1939_sk(sk);
1095 - struct j1939_priv *priv = jsk->priv;
1062 + struct j1939_priv *priv;
1096 1063 int ifindex;
1097 1064 int ret;
1098 1065
1066 + lock_sock(sock->sk);
1099 1067 /* various socket state tests */
1100 - if (!(jsk->state & J1939_SOCK_BOUND))
1101 - return -EBADFD;
1068 + if (!(jsk->state & J1939_SOCK_BOUND)) {
1069 + ret = -EBADFD;
1070 + goto sendmsg_done;
1071 + }
1102 1072
1073 + priv = jsk->priv;
1103 1074 ifindex = jsk->ifindex;
1104 1075
1105 - if (!jsk->addr.src_name && jsk->addr.sa == J1939_NO_ADDR)
1076 + if (!jsk->addr.src_name && jsk->addr.sa == J1939_NO_ADDR) {
1106 1077 /* no source address assigned yet */
1107 - return -EBADFD;
1078 + ret = -EBADFD;
1079 + goto sendmsg_done;
1080 + }
1108 1081
1109 1082 /* deal with provided destination address info */
1110 1083 if (msg->msg_name) {
1111 1084 struct sockaddr_can *addr = msg->msg_name;
1112 1085
1113 - if (msg->msg_namelen < J1939_MIN_NAMELEN)
1114 - return -EINVAL;
1086 + if (msg->msg_namelen < J1939_MIN_NAMELEN) {
1087 + ret = -EINVAL;
1088 + goto sendmsg_done;
1089 + }
1115 1090
1116 - if (addr->can_family != AF_CAN)
1117 - return -EINVAL;
1091 + if (addr->can_family != AF_CAN) {
1092 + ret = -EINVAL;
1093 + goto sendmsg_done;
1094 + }
1118 1095
1119 - if (addr->can_ifindex && addr->can_ifindex != ifindex)
1120 - return -EBADFD;
1096 + if (addr->can_ifindex && addr->can_ifindex != ifindex) {
1097 + ret = -EBADFD;
1098 + goto sendmsg_done;
1099 + }
1121 1100
1122 1101 if (j1939_pgn_is_valid(addr->can_addr.j1939.pgn) &&
1123 - !j1939_pgn_is_clean_pdu(addr->can_addr.j1939.pgn))
1124 - return -EINVAL;
1102 + !j1939_pgn_is_clean_pdu(addr->can_addr.j1939.pgn)) {
1103 + ret = -EINVAL;
1104 + goto sendmsg_done;
1105 + }
1125 1106
1126 1107 if (!addr->can_addr.j1939.name &&
1127 1108 addr->can_addr.j1939.addr == J1939_NO_ADDR &&
1128 - !sock_flag(sk, SOCK_BROADCAST))
1109 + !sock_flag(sk, SOCK_BROADCAST)) {
1129 1110 /* broadcast, but SO_BROADCAST not set */
1130 - return -EACCES;
1111 + ret = -EACCES;
1112 + goto sendmsg_done;
1113 + }
1131 1114 } else {
1132 1115 if (!jsk->addr.dst_name && jsk->addr.da == J1939_NO_ADDR &&
1133 - !sock_flag(sk, SOCK_BROADCAST))
1116 + !sock_flag(sk, SOCK_BROADCAST)) {
1134 1117 /* broadcast, but SO_BROADCAST not set */
1135 - return -EACCES;
1118 + ret = -EACCES;
1119 + goto sendmsg_done;
1120 + }
1136 1121 }
1137 1122
1138 1123 ret = j1939_sk_send_loop(priv, sk, msg, size);
1124 +
1125 + sendmsg_done:
1126 + release_sock(sock->sk);
1139 1127
1140 1128 return ret;
1141 1129 }
+27 -9
net/can/j1939/transport.c
···
255 255 return;
256 256
257 257 j1939_sock_pending_del(session->sk);
258 + sock_put(session->sk);
258 259 }
259 260
260 261 static void j1939_session_destroy(struct j1939_session *session)
···
266 265 j1939_sk_errqueue(session, J1939_ERRQUEUE_ACK);
267 266
268 267 netdev_dbg(session->priv->ndev, "%s: 0x%p\n", __func__, session);
268 +
269 + WARN_ON_ONCE(!list_empty(&session->sk_session_queue_entry));
270 + WARN_ON_ONCE(!list_empty(&session->active_session_list_entry));
269 271
270 272 skb_queue_purge(&session->skb_queue);
271 273 __j1939_session_drop(session);
···
1046 1042 j1939_sk_queue_activate_next(session);
1047 1043 }
1048 1044
1049 - static void j1939_session_cancel(struct j1939_session *session,
1045 + static void __j1939_session_cancel(struct j1939_session *session,
1050 1046 enum j1939_xtp_abort err)
1051 1047 {
1052 1048 struct j1939_priv *priv = session->priv;
1053 1049
1054 1050 WARN_ON_ONCE(!err);
1051 + lockdep_assert_held(&session->priv->active_session_list_lock);
1055 1052
1056 1053 session->err = j1939_xtp_abort_to_errno(priv, err);
1057 1054 /* do not send aborts on incoming broadcasts */
···
1065 1060
1066 1061 if (session->sk)
1067 1062 j1939_sk_send_loop_abort(session->sk, session->err);
1063 + }
1064 +
1065 + static void j1939_session_cancel(struct j1939_session *session,
1066 + enum j1939_xtp_abort err)
1067 + {
1068 + j1939_session_list_lock(session->priv);
1069 +
1070 + if (session->state >= J1939_SESSION_ACTIVE &&
1071 + session->state < J1939_SESSION_WAITING_ABORT) {
1072 + j1939_tp_set_rxtimeout(session, J1939_XTP_ABORT_TIMEOUT_MS);
1073 + __j1939_session_cancel(session, err);
1074 + }
1075 +
1076 + j1939_session_list_unlock(session->priv);
1068 1077 }
1069 1078
1070 1079 static enum hrtimer_restart j1939_tp_txtimer(struct hrtimer *hrtimer)
···
1127 1108 netdev_alert(priv->ndev, "%s: 0x%p: tx aborted with unknown reason: %i\n",
1128 1109 __func__, session, ret);
1129 1110 if (session->skcb.addr.type != J1939_SIMPLE) {
1130 - j1939_tp_set_rxtimeout(session,
1131 - J1939_XTP_ABORT_TIMEOUT_MS);
1132 1111 j1939_session_cancel(session, J1939_XTP_ABORT_OTHER);
1133 1112 } else {
1134 1113 session->err = ret;
···
1186 1169 hrtimer_start(&session->rxtimer,
1187 1170 ms_to_ktime(J1939_XTP_ABORT_TIMEOUT_MS),
1188 1171 HRTIMER_MODE_REL_SOFT);
1189 - j1939_session_cancel(session, J1939_XTP_ABORT_TIMEOUT);
1172 + __j1939_session_cancel(session, J1939_XTP_ABORT_TIMEOUT);
1190 1173 }
1191 1174 j1939_session_list_unlock(session->priv);
1192 1175 }
···
1392 1375
1393 1376 out_session_cancel:
1394 1377 j1939_session_timers_cancel(session);
1395 - j1939_tp_set_rxtimeout(session, J1939_XTP_ABORT_TIMEOUT_MS);
1396 1378 j1939_session_cancel(session, err);
1397 1379 }
···
1588 1572
1589 1573 /* RTS on active session */
1590 1574 j1939_session_timers_cancel(session);
1591 - j1939_tp_set_rxtimeout(session, J1939_XTP_ABORT_TIMEOUT_MS);
1592 1575 j1939_session_cancel(session, J1939_XTP_ABORT_BUSY);
1593 1576 }
···
1598 1583 session->last_cmd);
1599 1584
1600 1585 j1939_session_timers_cancel(session);
1601 - j1939_tp_set_rxtimeout(session, J1939_XTP_ABORT_TIMEOUT_MS);
1602 1586 j1939_session_cancel(session, J1939_XTP_ABORT_BUSY);
1603 1587
1604 1588 return -EBUSY;
···
1799 1785
1800 1786 out_session_cancel:
1801 1787 j1939_session_timers_cancel(session);
1802 - j1939_tp_set_rxtimeout(session, J1939_XTP_ABORT_TIMEOUT_MS);
1803 1788 j1939_session_cancel(session, J1939_XTP_ABORT_FAULT);
1804 1789 j1939_session_put(session);
1805 1790 }
···
1879 1866 return ERR_PTR(-ENOMEM);
1880 1867
1881 1868 /* skb is recounted in j1939_session_new() */
1869 + sock_hold(skb->sk);
1882 1870 session->sk = skb->sk;
1883 1871 session->transmission = true;
1884 1872 session->pkt.total = (size + 6) / 7;
···
2042 2028 &priv->active_session_list,
2043 2029 active_session_list_entry) {
2044 2030 if (!sk || sk == session->sk) {
2045 2031 if (hrtimer_try_to_cancel(&session->txtimer) == 1)
2032 + j1939_session_put(session);
2033 + if (hrtimer_try_to_cancel(&session->rxtimer) == 1)
2034 + j1939_session_put(session);
2035 +
2046 2036 session->err = ESHUTDOWN;
2047 2037 j1939_session_deactivate_locked(session);
2048 2038 }
+7 -1
net/core/devlink.c
···
2812 2812 struct net *dest_net = NULL;
2813 2813 int err;
2814 2814
2815 - if (!devlink_reload_supported(devlink))
2815 + if (!devlink_reload_supported(devlink) || !devlink->reload_enabled)
2816 2816 return -EOPNOTSUPP;
2817 2817
2818 2818 err = devlink_resources_validate(devlink, NULL, info);
···
4747 4747 bool auto_recover;
4748 4748 u8 health_state;
4749 4749 u64 dump_ts;
4750 + u64 dump_real_ts;
4750 4751 u64 error_count;
4751 4752 u64 recovery_count;
4752 4753 u64 last_recovery_ts;
···
4924 4923 goto dump_err;
4925 4924
4926 4925 reporter->dump_ts = jiffies;
4926 + reporter->dump_real_ts = ktime_get_real_ns();
4927 4927
4928 4928 return 0;
4929 4929
···
5073 5071 nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_DUMP_TS,
5074 5072 jiffies_to_msecs(reporter->dump_ts),
5075 5073 DEVLINK_ATTR_PAD))
5074 + goto reporter_nest_cancel;
5075 + if (reporter->dump_fmsg &&
5076 + nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_DUMP_TS_NS,
5077 + reporter->dump_real_ts, DEVLINK_ATTR_PAD))
5076 5078 goto reporter_nest_cancel;
5077 5079
5078 5080 nla_nest_end(msg, reporter_attr);
+1 -1
net/dsa/tag_8021q.c
···
105 105 slave = dsa_to_port(ds, port)->slave;
106 106
107 107 err = br_vlan_get_pvid(slave, &pvid);
108 - if (err < 0)
108 + if (!pvid || err < 0)
109 109 /* There is no pvid on the bridge for this port, which is
110 110 * perfectly valid. Nothing to restore, bye-bye!
111 111 */
+2 -1
net/ipv4/ipmr.c
···
2291 2291 rcu_read_unlock();
2292 2292 return -ENODEV;
2293 2293 }
2294 - skb2 = skb_clone(skb, GFP_ATOMIC);
2294 +
2295 + skb2 = skb_realloc_headroom(skb, sizeof(struct iphdr));
2295 2296 if (!skb2) {
2296 2297 read_unlock(&mrt_lock);
2297 2298 rcu_read_unlock();
+11
net/ipv6/seg6_local.c
···
81 81 if (!pskb_may_pull(skb, srhoff + len))
82 82 return NULL;
83 83
84 + /* note that pskb_may_pull may change pointers in header;
85 + * for this reason it is necessary to reload them when needed.
86 + */
87 + srh = (struct ipv6_sr_hdr *)(skb->data + srhoff);
88 +
84 89 if (!seg6_validate_srh(srh, len))
85 90 return NULL;
86 91
···
341 336 if (!ipv6_addr_any(&slwt->nh6))
342 337 nhaddr = &slwt->nh6;
343 338
339 + skb_set_transport_header(skb, sizeof(struct ipv6hdr));
340 +
344 341 seg6_lookup_nexthop(skb, nhaddr, 0);
345 342
346 343 return dst_input(skb);
···
372 365
373 366 skb_dst_drop(skb);
374 367
368 + skb_set_transport_header(skb, sizeof(struct iphdr));
369 +
375 370 err = ip_route_input(skb, nhaddr, iph->saddr, 0, skb->dev);
376 371 if (err)
377 372 goto drop;
···
393 384
394 385 if (!pskb_may_pull(skb, sizeof(struct ipv6hdr)))
395 386 goto drop;
396 387
388 + skb_set_transport_header(skb, sizeof(struct ipv6hdr));
389 +
397 390 seg6_lookup_nexthop(skb, NULL, slwt->table);
398 391
+15 -8
net/rds/ib_cm.c
···
513 513 struct ib_qp_init_attr attr;
514 514 struct ib_cq_init_attr cq_attr = {};
515 515 struct rds_ib_device *rds_ibdev;
516 + unsigned long max_wrs;
516 517 int ret, fr_queue_space;
517 518 struct dma_pool *pool;
518 519
···
534 533 /* add the conn now so that connection establishment has the dev */
535 534 rds_ib_add_conn(rds_ibdev, conn);
536 535
537 - if (rds_ibdev->max_wrs < ic->i_send_ring.w_nr + 1)
538 - rds_ib_ring_resize(&ic->i_send_ring, rds_ibdev->max_wrs - 1);
539 - if (rds_ibdev->max_wrs < ic->i_recv_ring.w_nr + 1)
540 - rds_ib_ring_resize(&ic->i_recv_ring, rds_ibdev->max_wrs - 1);
536 + max_wrs = rds_ibdev->max_wrs < rds_ib_sysctl_max_send_wr + 1 ?
537 + rds_ibdev->max_wrs - 1 : rds_ib_sysctl_max_send_wr;
538 + if (ic->i_send_ring.w_nr != max_wrs)
539 + rds_ib_ring_resize(&ic->i_send_ring, max_wrs);
540 +
541 + max_wrs = rds_ibdev->max_wrs < rds_ib_sysctl_max_recv_wr + 1 ?
542 + rds_ibdev->max_wrs - 1 : rds_ib_sysctl_max_recv_wr;
543 + if (ic->i_recv_ring.w_nr != max_wrs)
544 + rds_ib_ring_resize(&ic->i_recv_ring, max_wrs);
541 545
542 546 /* Protection domain and memory range */
543 547 ic->i_pd = rds_ibdev->pd;
···
1182 1176 ic->i_flowctl = 0;
1183 1177 atomic_set(&ic->i_credits, 0);
1184 1178
1185 - rds_ib_ring_init(&ic->i_send_ring, rds_ib_sysctl_max_send_wr);
1186 - rds_ib_ring_init(&ic->i_recv_ring, rds_ib_sysctl_max_recv_wr);
1179 + /* Re-init rings, but retain sizes. */
1180 + rds_ib_ring_init(&ic->i_send_ring, ic->i_send_ring.w_nr);
1181 + rds_ib_ring_init(&ic->i_recv_ring, ic->i_recv_ring.w_nr);
1187 1182
1188 1183 if (ic->i_ibinc) {
1189 1184 rds_inc_put(&ic->i_ibinc->ii_inc);
···
1231 1224 * rds_ib_conn_shutdown() waits for these to be emptied so they
1232 1225 * must be initialized before it can be called.
1233 1226 */
1234 - rds_ib_ring_init(&ic->i_send_ring, rds_ib_sysctl_max_send_wr);
1235 - rds_ib_ring_init(&ic->i_recv_ring, rds_ib_sysctl_max_recv_wr);
1227 + rds_ib_ring_init(&ic->i_send_ring, 0);
1228 + rds_ib_ring_init(&ic->i_recv_ring, 0);
1236 1229
1237 1230 ic->conn = conn;
1238 1231 conn->c_transport_data = ic;
+2 -1
net/smc/af_smc.c
···
799 799 smc->sk.sk_err = EPIPE;
800 800 else if (signal_pending(current))
801 801 smc->sk.sk_err = -sock_intr_errno(timeo);
802 + sock_put(&smc->sk); /* passive closing */
802 803 goto out;
803 804 }
804 805
···
1737 1736 case TCP_FASTOPEN_KEY:
1738 1737 case TCP_FASTOPEN_NO_COOKIE:
1739 1738 /* option not supported by SMC */
1740 - if (sk->sk_state == SMC_INIT) {
1739 + if (sk->sk_state == SMC_INIT && !smc->connect_nonblock) {
1741 1740 smc_switch_to_fallback(smc);
1742 1741 smc->fallback_rsn = SMC_CLC_DECL_OPTUNSUPP;
1743 1742 } else {
-2
net/tipc/core.c
··· 34 34 * POSSIBILITY OF SUCH DAMAGE. 35 35 */ 36 36 37 - #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt 38 - 39 37 #include "core.h" 40 38 #include "name_table.h" 41 39 #include "subscr.h"
+6
net/tipc/core.h
··· 61 61 #include <net/genetlink.h> 62 62 #include <net/netns/hash.h> 63 63 64 + #ifdef pr_fmt 65 + #undef pr_fmt 66 + #endif 67 + 68 + #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt 69 + 64 70 struct tipc_node; 65 71 struct tipc_bearer; 66 72 struct tipc_bc_base;
+3
net/xfrm/xfrm_input.c
··· 480 480 else 481 481 XFRM_INC_STATS(net, 482 482 LINUX_MIB_XFRMINSTATEINVALID); 483 + 484 + if (encap_type == -1) 485 + dev_put(skb->dev); 483 486 goto drop; 484 487 } 485 488
+2
net/xfrm/xfrm_state.c
··· 495 495 x->type->destructor(x); 496 496 xfrm_put_type(x->type); 497 497 } 498 + if (x->xfrag.page) 499 + put_page(x->xfrag.page); 498 500 xfrm_dev_state_free(x); 499 501 security_xfrm_state_free(x); 500 502 xfrm_state_free(x);
+4 -4
scripts/tools-support-relr.sh
··· 4 4 tmp_file=$(mktemp) 5 5 trap "rm -f $tmp_file.o $tmp_file $tmp_file.bin" EXIT 6 6 7 - cat << "END" | "$CC" -c -x c - -o $tmp_file.o >/dev/null 2>&1 7 + cat << "END" | $CC -c -x c - -o $tmp_file.o >/dev/null 2>&1 8 8 void *p = &p; 9 9 END 10 - "$LD" $tmp_file.o -shared -Bsymbolic --pack-dyn-relocs=relr -o $tmp_file 10 + $LD $tmp_file.o -shared -Bsymbolic --pack-dyn-relocs=relr -o $tmp_file 11 11 12 12 # Despite printing an error message, GNU nm still exits with exit code 0 if it 13 13 # sees a relr section. So we need to check that nothing is printed to stderr. 14 - test -z "$("$NM" $tmp_file 2>&1 >/dev/null)" 14 + test -z "$($NM $tmp_file 2>&1 >/dev/null)" 15 15 16 - "$OBJCOPY" -O binary $tmp_file $tmp_file.bin 16 + $OBJCOPY -O binary $tmp_file $tmp_file.bin
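The quoting change above matters when the tool variables carry flags (e.g. a CC value of `gcc -m32`): quoted, the whole string is looked up as one literal command name and fails; unquoted, it word-splits into command plus arguments. A minimal sketch with a made-up CC value:

```shell
#!/bin/sh
# Stand-in for the relr probe's tool invocation; CC value is illustrative.
CC="printf %s"
"$CC" hello 2>/dev/null || echo "quoted form failed"  # no binary named "printf %s"
$CC hello                                             # word-splits, prints "hello"
```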
+6 -2
sound/core/pcm_lib.c
··· 1782 1782 struct snd_pcm_runtime *runtime; 1783 1783 unsigned long flags; 1784 1784 1785 - if (PCM_RUNTIME_CHECK(substream)) 1785 + if (snd_BUG_ON(!substream)) 1786 1786 return; 1787 - runtime = substream->runtime; 1788 1787 1789 1788 snd_pcm_stream_lock_irqsave(substream, flags); 1789 + if (PCM_RUNTIME_CHECK(substream)) 1790 + goto _unlock; 1791 + runtime = substream->runtime; 1792 + 1790 1793 if (!snd_pcm_running(substream) || 1791 1794 snd_pcm_update_hw_ptr0(substream, 1) < 0) 1792 1795 goto _end; ··· 1800 1797 #endif 1801 1798 _end: 1802 1799 kill_fasync(&runtime->fasync, SIGIO, POLL_IN); 1800 + _unlock: 1803 1801 snd_pcm_stream_unlock_irqrestore(substream, flags); 1804 1802 } 1805 1803 EXPORT_SYMBOL(snd_pcm_period_elapsed);
+3
sound/pci/hda/hda_intel.c
··· 2396 2396 /* CometLake-H */ 2397 2397 { PCI_DEVICE(0x8086, 0x06C8), 2398 2398 .driver_data = AZX_DRIVER_SKL | AZX_DCAPS_INTEL_SKYLAKE}, 2399 + /* CometLake-S */ 2400 + { PCI_DEVICE(0x8086, 0xa3f0), 2401 + .driver_data = AZX_DRIVER_SKL | AZX_DCAPS_INTEL_SKYLAKE}, 2399 2402 /* Icelake */ 2400 2403 { PCI_DEVICE(0x8086, 0x34c8), 2401 2404 .driver_data = AZX_DRIVER_SKL | AZX_DCAPS_INTEL_SKYLAKE},
+3 -1
sound/pci/hda/patch_hdmi.c
··· 46 46 ((codec)->core.vendor_id == 0x80862800)) 47 47 #define is_cannonlake(codec) ((codec)->core.vendor_id == 0x8086280c) 48 48 #define is_icelake(codec) ((codec)->core.vendor_id == 0x8086280f) 49 + #define is_tigerlake(codec) ((codec)->core.vendor_id == 0x80862812) 49 50 #define is_haswell_plus(codec) (is_haswell(codec) || is_broadwell(codec) \ 50 51 || is_skylake(codec) || is_broxton(codec) \ 51 52 || is_kabylake(codec) || is_geminilake(codec) \ 52 - || is_cannonlake(codec) || is_icelake(codec)) 53 + || is_cannonlake(codec) || is_icelake(codec) \ 54 + || is_tigerlake(codec)) 53 55 #define is_valleyview(codec) ((codec)->core.vendor_id == 0x80862882) 54 56 #define is_cherryview(codec) ((codec)->core.vendor_id == 0x80862883) 55 57 #define is_valleyview_plus(codec) (is_valleyview(codec) || is_cherryview(codec))
+3
sound/usb/endpoint.c
··· 388 388 } 389 389 390 390 prepare_outbound_urb(ep, ctx); 391 + /* can be stopped during prepare callback */ 392 + if (unlikely(!test_bit(EP_FLAG_RUNNING, &ep->flags))) 393 + goto exit_clear; 391 394 } else { 392 395 retire_inbound_urb(ep, ctx); 393 396 /* can be stopped during retire callback */
+3 -1
sound/usb/mixer.c
··· 1229 1229 if (cval->min + cval->res < cval->max) { 1230 1230 int last_valid_res = cval->res; 1231 1231 int saved, test, check; 1232 - get_cur_mix_raw(cval, minchn, &saved); 1232 + if (get_cur_mix_raw(cval, minchn, &saved) < 0) 1233 + goto no_res_check; 1233 1234 for (;;) { 1234 1235 test = saved; 1235 1236 if (test < cval->max) ··· 1250 1249 snd_usb_set_cur_mix_value(cval, minchn, 0, saved); 1251 1250 } 1252 1251 1252 + no_res_check: 1253 1253 cval->initialized = 1; 1254 1254 } 1255 1255
+2 -2
sound/usb/quirks.c
··· 248 248 NULL, USB_MS_MIDI_OUT_JACK); 249 249 if (!injd && !outjd) 250 250 return -ENODEV; 251 - if (!(injd && snd_usb_validate_midi_desc(injd)) || 252 - !(outjd && snd_usb_validate_midi_desc(outjd))) 251 + if ((injd && !snd_usb_validate_midi_desc(injd)) || 252 + (outjd && !snd_usb_validate_midi_desc(outjd))) 253 253 return -ENODEV; 254 254 if (injd && (injd->bLength < 5 || 255 255 (injd->bJackType != USB_MS_EMBEDDED &&
+3 -3
sound/usb/validate.c
··· 81 81 switch (v->protocol) { 82 82 case UAC_VERSION_1: 83 83 default: 84 - /* bNrChannels, wChannelConfig, iChannelNames, bControlSize */ 85 - len += 1 + 2 + 1 + 1; 86 - if (d->bLength < len) /* bControlSize */ 84 + /* bNrChannels, wChannelConfig, iChannelNames */ 85 + len += 1 + 2 + 1; 86 + if (d->bLength < len + 1) /* bControlSize */ 87 87 return false; 88 88 m = hdr[len]; 89 89 len += 1 + m + 1; /* bControlSize, bmControls, iProcessing */
+1 -1
tools/perf/util/hist.c
··· 1625 1625 return 0; 1626 1626 } 1627 1627 1628 - static int hist_entry__sort(struct hist_entry *a, struct hist_entry *b) 1628 + static int64_t hist_entry__sort(struct hist_entry *a, struct hist_entry *b) 1629 1629 { 1630 1630 struct hists *hists = a->hists; 1631 1631 struct perf_hpp_fmt *fmt;
+6 -2
tools/perf/util/scripting-engines/trace-event-perl.c
··· 539 539 540 540 static int perl_generate_script(struct tep_handle *pevent, const char *outfile) 541 541 { 542 + int i, not_first, count, nr_events; 543 + struct tep_event **all_events; 542 544 struct tep_event *event = NULL; 543 545 struct tep_format_field *f; 544 546 char fname[PATH_MAX]; 545 - int not_first, count; 546 547 FILE *ofp; 547 548 548 549 sprintf(fname, "%s.pl", outfile); ··· 604 603 }\n\n\ 605 604 "); 606 605 606 + nr_events = tep_get_events_count(pevent); 607 + all_events = tep_list_events(pevent, TEP_EVENT_SORT_ID); 607 608 608 - while ((event = trace_find_next_event(pevent, event))) { 609 + for (i = 0; all_events && i < nr_events; i++) { 610 + event = all_events[i]; 609 611 fprintf(ofp, "sub %s::%s\n{\n", event->system, event->name); 610 612 fprintf(ofp, "\tmy ("); 611 613
+7 -2
tools/perf/util/scripting-engines/trace-event-python.c
··· 1687 1687 1688 1688 static int python_generate_script(struct tep_handle *pevent, const char *outfile) 1689 1689 { 1690 + int i, not_first, count, nr_events; 1691 + struct tep_event **all_events; 1690 1692 struct tep_event *event = NULL; 1691 1693 struct tep_format_field *f; 1692 1694 char fname[PATH_MAX]; 1693 - int not_first, count; 1694 1695 FILE *ofp; 1695 1696 1696 1697 sprintf(fname, "%s.py", outfile); ··· 1736 1735 fprintf(ofp, "def trace_end():\n"); 1737 1736 fprintf(ofp, "\tprint(\"in trace_end\")\n\n"); 1738 1737 1739 - while ((event = trace_find_next_event(pevent, event))) { 1738 + nr_events = tep_get_events_count(pevent); 1739 + all_events = tep_list_events(pevent, TEP_EVENT_SORT_ID); 1740 + 1741 + for (i = 0; all_events && i < nr_events; i++) { 1742 + event = all_events[i]; 1740 1743 fprintf(ofp, "def %s__%s(", event->system, event->name); 1741 1744 fprintf(ofp, "event_name, "); 1742 1745 fprintf(ofp, "context, ");
-31
tools/perf/util/trace-event-parse.c
··· 173 173 return tep_parse_event(pevent, buf, size, sys); 174 174 } 175 175 176 - struct tep_event *trace_find_next_event(struct tep_handle *pevent, 177 - struct tep_event *event) 178 - { 179 - static int idx; 180 - int events_count; 181 - struct tep_event *all_events; 182 - 183 - all_events = tep_get_first_event(pevent); 184 - events_count = tep_get_events_count(pevent); 185 - if (!pevent || !all_events || events_count < 1) 186 - return NULL; 187 - 188 - if (!event) { 189 - idx = 0; 190 - return all_events; 191 - } 192 - 193 - if (idx < events_count && event == (all_events + idx)) { 194 - idx++; 195 - if (idx == events_count) 196 - return NULL; 197 - return (all_events + idx); 198 - } 199 - 200 - for (idx = 1; idx < events_count; idx++) { 201 - if (event == (all_events + (idx - 1))) 202 - return (all_events + idx); 203 - } 204 - return NULL; 205 - } 206 - 207 176 struct flag { 208 177 const char *name; 209 178 unsigned long long value;
-2
tools/perf/util/trace-event.h
··· 47 47 48 48 ssize_t trace_report(int fd, struct trace_event *tevent, bool repipe); 49 49 50 - struct tep_event *trace_find_next_event(struct tep_handle *pevent, 51 - struct tep_event *event); 52 50 unsigned long long read_size(struct tep_event *event, void *ptr, int size); 53 51 unsigned long long eval_flag(const char *flag); 54 52
+6 -2
tools/testing/selftests/drivers/net/mlxsw/vxlan.sh
··· 112 112 RET=0 113 113 114 114 ip link add dev br0 type bridge mcast_snooping 0 115 + ip link add name dummy1 up type dummy 115 116 116 117 ip link add name vxlan0 up type vxlan id 10 nolearning noudpcsum \ 117 118 ttl 20 tos inherit local 198.51.100.1 dstport 4789 \ 118 - dev $swp2 group 239.0.0.1 119 + dev dummy1 group 239.0.0.1 119 120 120 121 sanitization_single_dev_test_fail 121 122 122 123 ip link del dev vxlan0 124 + ip link del dev dummy1 123 125 ip link del dev br0 124 126 125 127 log_test "vxlan device with a multicast group" ··· 183 181 RET=0 184 182 185 183 ip link add dev br0 type bridge mcast_snooping 0 184 + ip link add name dummy1 up type dummy 186 185 187 186 ip link add name vxlan0 up type vxlan id 10 nolearning noudpcsum \ 188 - ttl 20 tos inherit local 198.51.100.1 dstport 4789 dev $swp2 187 + ttl 20 tos inherit local 198.51.100.1 dstport 4789 dev dummy1 189 188 190 189 sanitization_single_dev_test_fail 191 190 192 191 ip link del dev vxlan0 192 + ip link del dev dummy1 193 193 ip link del dev br0 194 194 195 195 log_test "vxlan device with local interface"
+2 -2
tools/testing/selftests/kvm/lib/assert.c
··· 55 55 #pragma GCC diagnostic pop 56 56 } 57 57 58 - static pid_t gettid(void) 58 + static pid_t _gettid(void) 59 59 { 60 60 return syscall(SYS_gettid); 61 61 } ··· 72 72 fprintf(stderr, "==== Test Assertion Failure ====\n" 73 73 " %s:%u: %s\n" 74 74 " pid=%d tid=%d - %s\n", 75 - file, line, exp_str, getpid(), gettid(), 75 + file, line, exp_str, getpid(), _gettid(), 76 76 strerror(errno)); 77 77 test_dump_stack(); 78 78 if (fmt) {
+51 -2
tools/testing/selftests/ptp/testptp.c
··· 44 44 } 45 45 #endif 46 46 47 + static void show_flag_test(int rq_index, unsigned int flags, int err) 48 + { 49 + printf("PTP_EXTTS_REQUEST%c flags 0x%08x : (%d) %s\n", 50 + rq_index ? '1' + rq_index : ' ', 51 + flags, err, strerror(errno)); 52 + /* sigh, uClibc ... */ 53 + errno = 0; 54 + } 55 + 56 + static void do_flag_test(int fd, unsigned int index) 57 + { 58 + struct ptp_extts_request extts_request; 59 + unsigned long request[2] = { 60 + PTP_EXTTS_REQUEST, 61 + PTP_EXTTS_REQUEST2, 62 + }; 63 + unsigned int enable_flags[5] = { 64 + PTP_ENABLE_FEATURE, 65 + PTP_ENABLE_FEATURE | PTP_RISING_EDGE, 66 + PTP_ENABLE_FEATURE | PTP_FALLING_EDGE, 67 + PTP_ENABLE_FEATURE | PTP_RISING_EDGE | PTP_FALLING_EDGE, 68 + PTP_ENABLE_FEATURE | (PTP_EXTTS_VALID_FLAGS + 1), 69 + }; 70 + int err, i, j; 71 + 72 + memset(&extts_request, 0, sizeof(extts_request)); 73 + extts_request.index = index; 74 + 75 + for (i = 0; i < 2; i++) { 76 + for (j = 0; j < 5; j++) { 77 + extts_request.flags = enable_flags[j]; 78 + err = ioctl(fd, request[i], &extts_request); 79 + show_flag_test(i, extts_request.flags, err); 80 + 81 + extts_request.flags = 0; 82 + err = ioctl(fd, request[i], &extts_request); 83 + } 84 + } 85 + } 86 + 47 87 static clockid_t get_clockid(int fd) 48 88 { 49 89 #define CLOCKFD 3 ··· 136 96 " -s set the ptp clock time from the system time\n" 137 97 " -S set the system time from the ptp clock time\n" 138 98 " -t val shift the ptp clock time by 'val' seconds\n" 139 - " -T val set the ptp clock time to 'val' seconds\n", 99 + " -T val set the ptp clock time to 'val' seconds\n" 100 + " -z test combinations of rising/falling external time stamp flags\n", 140 101 progname); 141 102 } 142 103 ··· 163 122 int adjtime = 0; 164 123 int capabilities = 0; 165 124 int extts = 0; 125 + int flagtest = 0; 166 126 int gettime = 0; 167 127 int index = 0; 168 128 int list_pins = 0; ··· 180 138 181 139 progname = strrchr(argv[0], '/'); 182 140 progname = progname ? 
1+progname : argv[0]; 183 - while (EOF != (c = getopt(argc, argv, "cd:e:f:ghi:k:lL:p:P:sSt:T:v"))) { 141 + while (EOF != (c = getopt(argc, argv, "cd:e:f:ghi:k:lL:p:P:sSt:T:z"))) { 184 142 switch (c) { 185 143 case 'c': 186 144 capabilities = 1; ··· 232 190 case 'T': 233 191 settime = 3; 234 192 seconds = atoi(optarg); 193 + break; 194 + case 'z': 195 + flagtest = 1; 235 196 break; 236 197 case 'h': 237 198 usage(progname); ··· 365 320 if (ioctl(fd, PTP_EXTTS_REQUEST, &extts_request)) { 366 321 perror("PTP_EXTTS_REQUEST"); 367 322 } 323 + } 324 + 325 + if (flagtest) { 326 + do_flag_test(fd, index); 368 327 } 369 328 370 329 if (list_pins) {
+160 -15
virt/kvm/kvm_main.c
··· 50 50 #include <linux/bsearch.h> 51 51 #include <linux/io.h> 52 52 #include <linux/lockdep.h> 53 + #include <linux/kthread.h> 53 54 54 55 #include <asm/processor.h> 55 56 #include <asm/ioctl.h> ··· 122 121 unsigned long arg); 123 122 #define KVM_COMPAT(c) .compat_ioctl = (c) 124 123 #else 124 + /* 125 + * For architectures that don't implement a compat infrastructure, 126 + * adopt a double line of defense: 127 + * - Prevent a compat task from opening /dev/kvm 128 + * - If the open has been done by a 64bit task, and the KVM fd 129 + * passed to a compat task, let the ioctls fail. 130 + */ 125 131 static long kvm_no_compat_ioctl(struct file *file, unsigned int ioctl, 126 132 unsigned long arg) { return -EINVAL; } 127 - #define KVM_COMPAT(c) .compat_ioctl = kvm_no_compat_ioctl 133 + 134 + static int kvm_no_compat_open(struct inode *inode, struct file *file) 135 + { 136 + return is_compat_task() ? -ENODEV : 0; 137 + } 138 + #define KVM_COMPAT(c) .compat_ioctl = kvm_no_compat_ioctl, \ 139 + .open = kvm_no_compat_open 128 140 #endif 129 141 static int hardware_enable_all(void); 130 142 static void hardware_disable_all(void); ··· 163 149 return 0; 164 150 } 165 151 152 + bool kvm_is_zone_device_pfn(kvm_pfn_t pfn) 153 + { 154 + /* 155 + * The metadata used by is_zone_device_page() to determine whether or 156 + * not a page is ZONE_DEVICE is guaranteed to be valid if and only if 157 + * the device has been pinned, e.g. by get_user_pages(). WARN if the 158 + * page_count() is zero to help detect bad usage of this helper. 159 + */ 160 + if (!pfn_valid(pfn) || WARN_ON_ONCE(!page_count(pfn_to_page(pfn)))) 161 + return false; 162 + 163 + return is_zone_device_page(pfn_to_page(pfn)); 164 + } 165 + 166 166 bool kvm_is_reserved_pfn(kvm_pfn_t pfn) 167 167 { 168 + /* 169 + * ZONE_DEVICE pages currently set PG_reserved, but from a refcounting 170 + * perspective they are "normal" pages, albeit with slightly different 171 + * usage rules. 
172 + */ 168 173 if (pfn_valid(pfn)) 169 - return PageReserved(pfn_to_page(pfn)); 174 + return PageReserved(pfn_to_page(pfn)) && 175 + !kvm_is_zone_device_pfn(pfn); 170 176 171 177 return true; 172 178 } ··· 659 625 return 0; 660 626 } 661 627 628 + /* 629 + * Called after the VM is otherwise initialized, but just before adding it to 630 + * the vm_list. 631 + */ 632 + int __weak kvm_arch_post_init_vm(struct kvm *kvm) 633 + { 634 + return 0; 635 + } 636 + 637 + /* 638 + * Called just after removing the VM from the vm_list, but before doing any 639 + * other destruction. 640 + */ 641 + void __weak kvm_arch_pre_destroy_vm(struct kvm *kvm) 642 + { 643 + } 644 + 662 645 static struct kvm *kvm_create_vm(unsigned long type) 663 646 { 664 647 struct kvm *kvm = kvm_arch_alloc_vm(); ··· 696 645 697 646 BUILD_BUG_ON(KVM_MEM_SLOTS_NUM > SHRT_MAX); 698 647 648 + if (init_srcu_struct(&kvm->srcu)) 649 + goto out_err_no_srcu; 650 + if (init_srcu_struct(&kvm->irq_srcu)) 651 + goto out_err_no_irq_srcu; 652 + 653 + refcount_set(&kvm->users_count, 1); 699 654 for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { 700 655 struct kvm_memslots *slots = kvm_alloc_memslots(); 701 656 ··· 719 662 goto out_err_no_arch_destroy_vm; 720 663 } 721 664 722 - refcount_set(&kvm->users_count, 1); 723 665 r = kvm_arch_init_vm(kvm, type); 724 666 if (r) 725 667 goto out_err_no_arch_destroy_vm; ··· 731 675 INIT_HLIST_HEAD(&kvm->irq_ack_notifier_list); 732 676 #endif 733 677 734 - if (init_srcu_struct(&kvm->srcu)) 735 - goto out_err_no_srcu; 736 - if (init_srcu_struct(&kvm->irq_srcu)) 737 - goto out_err_no_irq_srcu; 738 - 739 678 r = kvm_init_mmu_notifier(kvm); 679 + if (r) 680 + goto out_err_no_mmu_notifier; 681 + 682 + r = kvm_arch_post_init_vm(kvm); 740 683 if (r) 741 684 goto out_err; 742 685 ··· 748 693 return kvm; 749 694 750 695 out_err: 751 - cleanup_srcu_struct(&kvm->irq_srcu); 752 - out_err_no_irq_srcu: 753 - cleanup_srcu_struct(&kvm->srcu); 754 - out_err_no_srcu: 696 + #if 
defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER) 697 + if (kvm->mmu_notifier.ops) 698 + mmu_notifier_unregister(&kvm->mmu_notifier, current->mm); 699 + #endif 700 + out_err_no_mmu_notifier: 755 701 hardware_disable_all(); 756 702 out_err_no_disable: 757 703 kvm_arch_destroy_vm(kvm); 758 - WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count)); 759 704 out_err_no_arch_destroy_vm: 705 + WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count)); 760 706 for (i = 0; i < KVM_NR_BUSES; i++) 761 707 kfree(kvm_get_bus(kvm, i)); 762 708 for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) 763 709 kvm_free_memslots(kvm, __kvm_memslots(kvm, i)); 710 + cleanup_srcu_struct(&kvm->irq_srcu); 711 + out_err_no_irq_srcu: 712 + cleanup_srcu_struct(&kvm->srcu); 713 + out_err_no_srcu: 764 714 kvm_arch_free_vm(kvm); 765 715 mmdrop(current->mm); 766 716 return ERR_PTR(r); ··· 797 737 mutex_lock(&kvm_lock); 798 738 list_del(&kvm->vm_list); 799 739 mutex_unlock(&kvm_lock); 740 + kvm_arch_pre_destroy_vm(kvm); 741 + 800 742 kvm_free_irq_routing(kvm); 801 743 for (i = 0; i < KVM_NR_BUSES; i++) { 802 744 struct kvm_io_bus *bus = kvm_get_bus(kvm, i); ··· 1919 1857 1920 1858 void kvm_set_pfn_dirty(kvm_pfn_t pfn) 1921 1859 { 1922 - if (!kvm_is_reserved_pfn(pfn)) { 1860 + if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn)) { 1923 1861 struct page *page = pfn_to_page(pfn); 1924 1862 1925 1863 SetPageDirty(page); ··· 1929 1867 1930 1868 void kvm_set_pfn_accessed(kvm_pfn_t pfn) 1931 1869 { 1932 - if (!kvm_is_reserved_pfn(pfn)) 1870 + if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn)) 1933 1871 mark_page_accessed(pfn_to_page(pfn)); 1934 1872 } 1935 1873 EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed); ··· 4433 4371 kvm_vfio_ops_exit(); 4434 4372 } 4435 4373 EXPORT_SYMBOL_GPL(kvm_exit); 4374 + 4375 + struct kvm_vm_worker_thread_context { 4376 + struct kvm *kvm; 4377 + struct task_struct *parent; 4378 + struct completion init_done; 4379 + kvm_vm_thread_fn_t thread_fn; 4380 + 
uintptr_t data; 4381 + int err; 4382 + }; 4383 + 4384 + static int kvm_vm_worker_thread(void *context) 4385 + { 4386 + /* 4387 + * The init_context is allocated on the stack of the parent thread, so 4388 + * we have to locally copy anything that is needed beyond initialization 4389 + */ 4390 + struct kvm_vm_worker_thread_context *init_context = context; 4391 + struct kvm *kvm = init_context->kvm; 4392 + kvm_vm_thread_fn_t thread_fn = init_context->thread_fn; 4393 + uintptr_t data = init_context->data; 4394 + int err; 4395 + 4396 + err = kthread_park(current); 4397 + /* kthread_park(current) is never supposed to return an error */ 4398 + WARN_ON(err != 0); 4399 + if (err) 4400 + goto init_complete; 4401 + 4402 + err = cgroup_attach_task_all(init_context->parent, current); 4403 + if (err) { 4404 + kvm_err("%s: cgroup_attach_task_all failed with err %d\n", 4405 + __func__, err); 4406 + goto init_complete; 4407 + } 4408 + 4409 + set_user_nice(current, task_nice(init_context->parent)); 4410 + 4411 + init_complete: 4412 + init_context->err = err; 4413 + complete(&init_context->init_done); 4414 + init_context = NULL; 4415 + 4416 + if (err) 4417 + return err; 4418 + 4419 + /* Wait to be woken up by the spawner before proceeding. 
*/ 4420 + kthread_parkme(); 4421 + 4422 + if (!kthread_should_stop()) 4423 + err = thread_fn(kvm, data); 4424 + 4425 + return err; 4426 + } 4427 + 4428 + int kvm_vm_create_worker_thread(struct kvm *kvm, kvm_vm_thread_fn_t thread_fn, 4429 + uintptr_t data, const char *name, 4430 + struct task_struct **thread_ptr) 4431 + { 4432 + struct kvm_vm_worker_thread_context init_context = {}; 4433 + struct task_struct *thread; 4434 + 4435 + *thread_ptr = NULL; 4436 + init_context.kvm = kvm; 4437 + init_context.parent = current; 4438 + init_context.thread_fn = thread_fn; 4439 + init_context.data = data; 4440 + init_completion(&init_context.init_done); 4441 + 4442 + thread = kthread_run(kvm_vm_worker_thread, &init_context, 4443 + "%s-%d", name, task_pid_nr(current)); 4444 + if (IS_ERR(thread)) 4445 + return PTR_ERR(thread); 4446 + 4447 + /* kthread_run is never supposed to return NULL */ 4448 + WARN_ON(thread == NULL); 4449 + 4450 + wait_for_completion(&init_context.init_done); 4451 + 4452 + if (!init_context.err) 4453 + *thread_ptr = thread; 4454 + 4455 + return init_context.err; 4456 + }