···
 		/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
 		/sys/devices/system/cpu/vulnerabilities/l1tf
 		/sys/devices/system/cpu/vulnerabilities/mds
+		/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
+		/sys/devices/system/cpu/vulnerabilities/itlb_multihit
 Date:		January 2018
 Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
 Description:	Information about CPU vulnerabilities
···
+iTLB multihit
+=============
+
+iTLB multihit is an erratum where some processors may incur a machine check
+error, possibly resulting in an unrecoverable CPU lockup, when an
+instruction fetch hits multiple entries in the instruction TLB. This can
+occur when the page size is changed along with either the physical address
+or cache type. A malicious guest running on a virtualized system can
+exploit this erratum to perform a denial of service attack.
+
+
+Affected processors
+-------------------
+
+Variations of this erratum are present on most Intel Core and Xeon processor
+models. The erratum is not present on:
+
+   - non-Intel processors
+
+   - some Atoms (Airmont, Bonnell, Goldmont, GoldmontPlus, Saltwell, Silvermont)
+
+   - Intel processors that have the PSCHANGE_MC_NO bit set in the
+     IA32_ARCH_CAPABILITIES MSR.
+
+
+Related CVEs
+------------
+
+The following CVE entry is related to this issue:
+
+   ==============  =================================================
+   CVE-2018-12207  Machine Check Error Avoidance on Page Size Change
+   ==============  =================================================
+
+
+Problem
+-------
+
+Privileged software, including the OS and virtual machine managers (VMMs), is
+in charge of memory management. A key component of memory management is
+control of the page tables. Modern processors use virtual memory, a technique
+that creates the illusion of a very large memory for processors. This virtual
+space is split into pages of a given size. Page tables translate virtual
+addresses to physical addresses.
+
+To reduce the latency of virtual to physical address translation, processors
+include a structure, called the TLB, that caches recent translations. There
+are separate TLBs for instructions (iTLB) and data (dTLB).
+
+Under this erratum, instructions are fetched from a linear address translated
+using a 4 KB translation cached in the iTLB. Privileged software modifies the
+paging structure so that the same linear address is translated using a large
+page size (2 MB, 4 MB, 1 GB) with a different physical address or memory
+type. After the page structure modification, but before the software
+invalidates any iTLB entries for the linear address, a code fetch from the
+same linear address may cause a machine-check error, which can result in a
+system hang or shutdown.
+
+
+Attack scenarios
+----------------
+
+Attacks against the iTLB multihit erratum can be mounted from malicious
+guests in a virtualized system.
+
+
+iTLB multihit system information
+--------------------------------
+
+The Linux kernel provides a sysfs interface to enumerate the current iTLB
+multihit status of the system: whether the system is vulnerable and which
+mitigations are active. The relevant sysfs file is:
+
+/sys/devices/system/cpu/vulnerabilities/itlb_multihit
+
+The possible values in this file are:
+
+.. list-table::
+
+   * - Not affected
+     - The processor is not vulnerable.
+   * - KVM: Mitigation: Split huge pages
+     - Software changes mitigate this issue.
+   * - KVM: Vulnerable
+     - The processor is vulnerable, but no mitigation is enabled.
+
+
+Enumeration of the erratum
+--------------------------
+
+A new bit has been allocated in the IA32_ARCH_CAPABILITIES MSR
+(PSCHANGE_MC_NO); it is set on CPUs which are mitigated against this issue.
+
+   =======================================  ===========  ================================
+   IA32_ARCH_CAPABILITIES MSR               Not present  Possibly vulnerable, check model
+   IA32_ARCH_CAPABILITIES[PSCHANGE_MC_NO]   '0'          Likely vulnerable, check model
+   IA32_ARCH_CAPABILITIES[PSCHANGE_MC_NO]   '1'          Not vulnerable
+   =======================================  ===========  ================================
+
+
+Mitigation mechanism
+--------------------
+
+This erratum can be mitigated by restricting the use of large page sizes to
+non-executable pages. This forces all iTLB entries to be 4K and removes
+the possibility of multiple hits.
+
+In order to mitigate the vulnerability, KVM initially marks all huge pages
+as non-executable. If the guest attempts to execute in one of those pages,
+the page is broken down into 4K pages, which are then marked executable.
+
+If EPT is disabled or not available on the host, KVM is in control of TLB
+flushes and the problematic situation cannot happen. However, the shadow
+EPT paging mechanism used by nested virtualization is vulnerable, because
+the nested guest can trigger multiple iTLB hits by modifying its own
+(non-nested) page tables. For simplicity, KVM will make large pages
+non-executable in all shadow paging modes.
+
+Mitigation control on the kernel command line and KVM - module parameter
+------------------------------------------------------------------------
+
+The KVM hypervisor mitigation mechanism for marking huge pages as
+non-executable can be controlled with the module parameter "nx_huge_pages=".
+The kernel command line allows controlling the iTLB multihit mitigations at
+boot time with the option "kvm.nx_huge_pages=".
+
+The valid arguments for these options are:
+
+  ==========  ================================================================
+  force       Mitigation is enabled. In this case, the mitigation implements
+              non-executable huge pages in the Linux kernel KVM module. All
+              huge pages in the EPT are marked as non-executable.
+              If a guest attempts to execute in one of those pages, the page
+              is broken down into 4K pages, which are then marked executable.
+
+  off         Mitigation is disabled.
+
+  auto        Enable mitigation only if the platform is affected and the
+              kernel was not booted with the "mitigations=off" command line
+              parameter. This is the default option.
+  ==========  ================================================================
+
+
+Mitigation selection guide
+--------------------------
+
+1. No virtualization in use
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+   The system is protected by the kernel unconditionally and no further
+   action is required.
+
+2. Virtualization with trusted guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+   If the guest comes from a trusted source, you may assume that the guest
+   will not attempt to maliciously exploit this erratum and no further
+   action is required.
+
+3. Virtualization with untrusted guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+   If the guest comes from an untrusted source, the host kernel will need
+   to apply the iTLB multihit mitigation via the kernel command line or the
+   kvm module parameter.
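The sysfs interface described above can be checked programmatically. A minimal sketch in Python; the `classify` helper and its category names are illustrative conventions, not part of the kernel interface:

```python
from pathlib import Path

# Path documented above; present on x86 systems with a recent kernel.
SYSFS_FILE = Path("/sys/devices/system/cpu/vulnerabilities/itlb_multihit")

def classify(status: str) -> str:
    """Map an itlb_multihit sysfs string to a coarse category."""
    status = status.strip()
    if status == "Not affected":
        return "safe"
    if status.startswith("KVM: Mitigation"):
        return "mitigated"
    # e.g. "KVM: Vulnerable"
    return "vulnerable"

if SYSFS_FILE.exists():
    print(classify(SYSFS_FILE.read_text()))
```

On a host where the split-huge-pages mitigation is active, the file reads "KVM: Mitigation: Split huge pages" and the sketch would report "mitigated".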
···
+.. SPDX-License-Identifier: GPL-2.0
+
+TAA - TSX Asynchronous Abort
+============================
+
+TAA is a hardware vulnerability that allows unprivileged speculative access to
+data which is available in various CPU internal buffers by using asynchronous
+aborts within an Intel TSX transactional region.
+
+Affected processors
+-------------------
+
+This vulnerability only affects Intel processors that support Intel
+Transactional Synchronization Extensions (TSX) when the TAA_NO bit (bit 8)
+is 0 in the IA32_ARCH_CAPABILITIES MSR. On processors where the MDS_NO bit
+(bit 5) is 0 in the IA32_ARCH_CAPABILITIES MSR, the existing MDS mitigations
+also mitigate against TAA.
+
+Whether a processor is affected or not can be read out from the TAA
+vulnerability file in sysfs. See :ref:`tsx_async_abort_sys_info`.
+
+Related CVEs
+------------
+
+The following CVE entry is related to this TAA issue:
+
+   ==============  =====  ===================================================
+   CVE-2019-11135  TAA    TSX Asynchronous Abort (TAA) condition on some
+                          microprocessors utilizing speculative execution may
+                          allow an authenticated user to potentially enable
+                          information disclosure via a side channel with
+                          local access.
+   ==============  =====  ===================================================
+
+Problem
+-------
+
+When performing store, load or L1 refill operations, processors write
+data into temporary microarchitectural structures (buffers). The data in
+those buffers can be forwarded to load operations as an optimization.
+
+Intel TSX is an extension to the x86 instruction set architecture that adds
+hardware transactional memory support to improve the performance of
+multi-threaded software. TSX lets the processor expose and exploit
+concurrency hidden in an application by dynamically avoiding unnecessary
+synchronization.
+
+TSX supports atomic memory transactions that are either committed (success)
+or aborted. During an abort, operations that happened within the
+transactional region are rolled back. An asynchronous abort takes place,
+among other options, when a different thread accesses a cache line that is
+also used within the transactional region when that access might lead to a
+data race.
+
+Immediately after an uncompleted asynchronous abort, certain speculatively
+executed loads may read data from those internal buffers and pass it to
+dependent operations. This can then be used to infer the value via a cache
+side channel attack.
+
+Because the buffers are potentially shared between Hyper-Threads, cross
+Hyper-Thread attacks are possible.
+
+The victim of a malicious actor does not need to make use of TSX. Only the
+attacker needs to begin a TSX transaction and raise an asynchronous abort
+which in turn potentially leaks data stored in the buffers.
+
+More detailed technical information is available in the TAA specific x86
+architecture section: :ref:`Documentation/x86/tsx_async_abort.rst <tsx_async_abort>`.
+
+
+Attack scenarios
+----------------
+
+Attacks against the TAA vulnerability can be implemented from unprivileged
+applications running on hosts or guests.
+
+As for MDS, the attacker has no control over the memory addresses that can
+be leaked. Only the victim is responsible for bringing data to the CPU. As
+a result, the malicious actor has to sample as much data as possible and
+then postprocess it to try to infer any useful information from it.
+
+A potential attacker only has read access to the data. Also, there is no
+direct privilege escalation by using this technique.
+
+
+.. _tsx_async_abort_sys_info:
+
+TAA system information
+----------------------
+
+The Linux kernel provides a sysfs interface to enumerate the current TAA
+status of mitigated systems. The relevant sysfs file is:
+
+/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
+
+The possible values in this file are:
+
+.. list-table::
+
+   * - 'Vulnerable'
+     - The CPU is affected by this vulnerability and the microcode and kernel mitigation are not applied.
+   * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
+     - The system tries to clear the buffers but the microcode might not support the operation.
+   * - 'Mitigation: Clear CPU buffers'
+     - The microcode has been updated to clear the buffers. TSX is still enabled.
+   * - 'Mitigation: TSX disabled'
+     - TSX is disabled.
+   * - 'Not affected'
+     - The CPU is not affected by this issue.
+
+.. _ucode_needed:
+
+Best effort mitigation mode
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If the processor is vulnerable, but the availability of the microcode-based
+mitigation mechanism is not advertised via CPUID, the kernel selects a best
+effort mitigation mode. This mode invokes the mitigation instructions
+without a guarantee that they clear the CPU buffers.
+
+This is done to address virtualization scenarios where the host has the
+microcode update applied, but the hypervisor is not yet updated to expose
+the CPUID to the guest. If the host has updated microcode the protection
+takes effect; otherwise a few CPU cycles are wasted pointlessly.
+
+The state in the tsx_async_abort sysfs file reflects this situation
+accordingly.
+
+
+Mitigation mechanism
+--------------------
+
+The kernel detects the affected CPUs and the presence of the microcode which
+is required. If a CPU is affected and the microcode is available, then the
+kernel enables the mitigation by default.
+
+The mitigation can be controlled at boot time via a kernel command line
+option. See :ref:`taa_mitigation_control_command_line`.
+
+.. _virt_mechanism:
+
+Virtualization mitigation
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Affected systems where the host has the TAA microcode and TAA is mitigated
+by having disabled TSX previously are not vulnerable regardless of the
+status of the VMs.
+
+In all other cases, if the host either does not have the TAA microcode or
+the kernel is not mitigated, the system might be vulnerable.
+
+
+.. _taa_mitigation_control_command_line:
+
+Mitigation control on the kernel command line
+---------------------------------------------
+
+The kernel command line allows controlling the TAA mitigations at boot time
+with the option "tsx_async_abort=". The valid arguments for this option are:
+
+  ============  =============================================================
+  off           This option disables the TAA mitigation on affected
+                platforms. If the system has TSX enabled (see next parameter)
+                and the CPU is affected, the system is vulnerable.
+
+  full          TAA mitigation is enabled. If TSX is enabled, on an affected
+                system it will clear CPU buffers on ring transitions. On
+                systems which are MDS-affected and deploy MDS mitigation,
+                TAA is also mitigated. Specifying this option on those
+                systems will have no effect.
+
+  full,nosmt    The same as tsx_async_abort=full, with SMT disabled on
+                vulnerable CPUs that have TSX enabled. This is the complete
+                mitigation. When TSX is disabled, SMT is not disabled
+                because the CPU is not vulnerable to cross-thread TAA
+                attacks.
+  ============  =============================================================
+
+Not specifying this option is equivalent to "tsx_async_abort=full".
+
+The kernel command line also allows controlling the TSX feature using the
+parameter "tsx=" on CPUs which support TSX control. MSR_IA32_TSX_CTRL is
+used to control the TSX feature and the enumeration of the TSX feature bits
+(RTM and HLE) in CPUID.
+
+The valid options are:
+
+  ============  =============================================================
+  off           Disables TSX on the system.
+
+                Note that this option takes effect only on newer CPUs which
+                are not vulnerable to MDS, i.e., have
+                MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1 and which get the new
+                IA32_TSX_CTRL MSR through a microcode update. This new MSR
+                allows for the reliable deactivation of the TSX
+                functionality.
+
+  on            Enables TSX.
+
+                Although there are mitigations for all known security
+                vulnerabilities, TSX has been known to be an accelerator for
+                several previous speculation-related CVEs, and so there may
+                be unknown security risks associated with leaving it
+                enabled.
+
+  auto          Disables TSX if X86_BUG_TAA is present, otherwise enables
+                TSX on the system.
+  ============  =============================================================
+
+Not specifying this option is equivalent to "tsx=off".
+
+The following combinations of the "tsx_async_abort" and "tsx" options are
+possible. For affected platforms tsx=auto is equivalent to tsx=off and the
+result will be:
+
+  =========  ==========================  =========================================
+  tsx=on     tsx_async_abort=full        The system will use VERW to clear CPU
+                                         buffers. Cross-thread attacks are still
+                                         possible on SMT machines.
+  tsx=on     tsx_async_abort=full,nosmt  As above, cross-thread attacks on SMT
+                                         mitigated.
+  tsx=on     tsx_async_abort=off         The system is vulnerable.
+  tsx=off    tsx_async_abort=full        TSX might be disabled if microcode
+                                         provides a TSX control MSR. If so,
+                                         the system is not vulnerable.
+  tsx=off    tsx_async_abort=full,nosmt  Ditto
+  tsx=off    tsx_async_abort=off         Ditto
+  =========  ==========================  =========================================
+
+For unaffected platforms "tsx=on" and "tsx_async_abort=full" do not clear
+CPU buffers. For platforms without TSX control
+(MSR_IA32_ARCH_CAPABILITIES.MDS_NO=0) the "tsx" command line argument has no
+effect.
+
+For affected platforms, the table below indicates the mitigation status for
+the combinations of CPUID bit MD_CLEAR and IA32_ARCH_CAPABILITIES MSR bits
+MDS_NO and TSX_CTRL_MSR.
+
+  =======  =========  =============  ========================================
+  MDS_NO   MD_CLEAR   TSX_CTRL_MSR   Status
+  =======  =========  =============  ========================================
+    0         0            0         Vulnerable (needs microcode)
+    0         1            0         MDS and TAA mitigated via VERW
+    1         1            0         MDS fixed, TAA vulnerable if TSX enabled
+                                     because MD_CLEAR has no meaning and
+                                     VERW is not guaranteed to clear buffers
+    1         X            1         MDS fixed, TAA can be mitigated by
+                                     VERW or TSX_CTRL_MSR
+  =======  =========  =============  ========================================
+
+Mitigation selection guide
+--------------------------
+
+1. Trusted userspace and guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If all user space applications are from a trusted source and do not execute
+untrusted code which is supplied externally, then the mitigation can be
+disabled. The same applies to virtualized environments with trusted guests.
+
+
+2. Untrusted userspace and guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If there are untrusted applications or guests on the system, enabling TSX
+might allow a malicious actor to leak data from the host or from other
+processes running on the same physical core.
+
+If the microcode is available and TSX is disabled on the host, attacks
+are prevented in a virtualized environment as well, even if the VMs do not
+explicitly enable the mitigation.
+
+
+.. _taa_default_mitigations:
+
+Default mitigations
+-------------------
+
+The kernel's default action for vulnerable processors is:
+
+  - Deploy TSX disable mitigation (tsx_async_abort=full tsx=off).
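The affected-processor rules above reduce to two bit tests on IA32_ARCH_CAPABILITIES. A small sketch of that logic; the helper names are illustrative and the MSR value would in practice come from a privileged read (e.g. /dev/cpu/*/msr):

```python
# Bit positions documented above for IA32_ARCH_CAPABILITIES.
TAA_NO = 1 << 8   # set => CPU is not affected by TAA
MDS_NO = 1 << 5   # set => CPU is not affected by MDS

def taa_affected(arch_caps: int, cpu_has_tsx: bool) -> bool:
    """Per the text above: only TSX-capable CPUs with TAA_NO clear are affected."""
    return cpu_has_tsx and not (arch_caps & TAA_NO)

def mds_mitigation_covers_taa(arch_caps: int) -> bool:
    """If MDS_NO is clear, the existing MDS mitigations also mitigate TAA."""
    return not (arch_caps & MDS_NO)
```

For example, a TSX-capable CPU reporting arch_caps with both bits clear is TAA-affected but already covered by the MDS mitigation.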
Documentation/admin-guide/kernel-parameters.txt (+92)
···
 			KVM MMU at runtime.
 			Default is 0 (off)
 
+	kvm.nx_huge_pages=
+			[KVM] Controls the software workaround for the
+			X86_BUG_ITLB_MULTIHIT bug.
+			force	: Always deploy workaround.
+			off	: Never deploy workaround.
+			auto	: Deploy workaround based on the presence of
+				  X86_BUG_ITLB_MULTIHIT.
+
+			Default is 'auto'.
+
+			If the software workaround is enabled for the host,
+			guests need not enable it for nested guests.
+
+	kvm.nx_huge_pages_recovery_ratio=
+			[KVM] Controls how many 4KiB pages are periodically
+			zapped back to huge pages. 0 disables the recovery;
+			otherwise, if the value is N, KVM will zap 1/Nth of
+			the 4KiB pages every minute. The default is 60.
+
 	kvm-amd.nested=	[KVM,AMD] Allow nested virtualization in KVM/SVM.
 			Default is 1 (enabled)
···
 				ssbd=force-off [ARM64]
 				l1tf=off [X86]
 				mds=off [X86]
+				tsx_async_abort=off [X86]
+				kvm.nx_huge_pages=off [X86]
+
+				Exceptions:
+					This does not have any effect on
+					kvm.nx_huge_pages when
+					kvm.nx_huge_pages=force.
 
 		auto (default)
 			Mitigate all CPU vulnerabilities, but leave SMT
···
 			be fully mitigated, even if it means losing SMT.
 			Equivalent to: l1tf=flush,nosmt [X86]
 				       mds=full,nosmt [X86]
+				       tsx_async_abort=full,nosmt [X86]
 
 	mminit_loglevel=
 		[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
···
 			in situations with strict latency requirements (where
 			interruptions from clocksource watchdog are not
 			acceptable).
+
+	tsx=		[X86] Control Transactional Synchronization
+			Extensions (TSX) feature in Intel processors that
+			support TSX control.
+
+			This parameter controls the TSX feature. The options
+			are:
+
+			on	- Enable TSX on the system. Although there
+				  are mitigations for all known security
+				  vulnerabilities, TSX has been known to be
+				  an accelerator for several previous
+				  speculation-related CVEs, and so there may
+				  be unknown security risks associated with
+				  leaving it enabled.
+
+			off	- Disable TSX on the system. (Note that this
+				  option takes effect only on newer CPUs
+				  which are not vulnerable to MDS, i.e.,
+				  have MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1
+				  and which get the new IA32_TSX_CTRL MSR
+				  through a microcode update. This new MSR
+				  allows for the reliable deactivation of
+				  the TSX functionality.)
+
+			auto	- Disable TSX if X86_BUG_TAA is present,
+				  otherwise enable TSX on the system.
+
+			Not specifying this option is equivalent to tsx=off.
+
+			See Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
+			for more details.
+
+	tsx_async_abort= [X86,INTEL] Control mitigation for the TSX Async
+			Abort (TAA) vulnerability.
+
+			Similar to Micro-architectural Data Sampling (MDS),
+			certain CPUs that support Transactional
+			Synchronization Extensions (TSX) are vulnerable to an
+			exploit against CPU internal buffers which can
+			forward information to a disclosure gadget under
+			certain conditions.
+
+			In vulnerable processors, the speculatively forwarded
+			data can be used in a cache side channel attack, to
+			access data to which the attacker does not have
+			direct access.
+
+			This parameter controls the TAA mitigation. The
+			options are:
+
+			full       - Enable TAA mitigation on vulnerable CPUs
+				     if TSX is enabled.
+
+			full,nosmt - Enable TAA mitigation and disable SMT on
+				     vulnerable CPUs. If TSX is disabled, SMT
+				     is not disabled because the CPU is not
+				     vulnerable to cross-thread TAA attacks.
+			off        - Unconditionally disable TAA mitigation
+
+			Not specifying this option is equivalent to
+			tsx_async_abort=full. On CPUs which are MDS affected
+			and deploy MDS mitigation, TAA mitigation is not
+			required and doesn't provide any additional
+			mitigation.
+
+			For details see:
+			Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
+
 	turbografx.map[2|3]=	[HW,JOY]
 			TurboGraFX parallel port interface
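The valid values documented above can be validated with a small parser. A sketch only; the option tables are taken from this document, while the `check_cmdline` helper itself is hypothetical:

```python
# Recognized mitigation options and their valid values, per the text above.
VALID = {
    "kvm.nx_huge_pages": {"force", "off", "auto"},
    "tsx_async_abort": {"full", "full,nosmt", "off"},
    "tsx": {"on", "off", "auto"},
}

def check_cmdline(cmdline: str) -> dict:
    """Return the recognized, validly-valued mitigation options on a kernel command line."""
    found = {}
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key in VALID and val in VALID[key]:
            found[key] = val
    return found
```

For instance, a boot line containing "tsx=off tsx_async_abort=full,nosmt" would be reported with both options recognized, while a malformed value such as "tsx=bogus" would be ignored.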
···
+.. SPDX-License-Identifier: GPL-2.0
+
+TSX Async Abort (TAA) mitigation
+================================
+
+.. _tsx_async_abort:
+
+Overview
+--------
+
+TSX Async Abort (TAA) is a side channel attack on internal buffers in some
+Intel processors similar to Microarchitectural Data Sampling (MDS). In this
+case, certain loads may speculatively pass invalid data to dependent
+operations when an asynchronous abort condition is pending in a
+Transactional Synchronization Extensions (TSX) transaction. This includes
+loads with no fault or assist condition. Such loads may speculatively expose
+stale data from the same uarch data structures as in MDS, with the same
+scope of exposure, i.e. same-thread and cross-thread. This issue affects all
+current processors that support TSX.
+
+Mitigation strategy
+-------------------
+
+a) TSX disable - one of the mitigations is to disable TSX. A new MSR
+   IA32_TSX_CTRL will be available in future and current processors after a
+   microcode update which can be used to disable TSX. In addition, it
+   controls the enumeration of the TSX feature bits (RTM and HLE) in CPUID.
+
+b) Clear CPU buffers - similar to MDS, clearing the CPU buffers mitigates
+   this vulnerability. More details on this approach can be found in
+   :ref:`Documentation/admin-guide/hw-vuln/mds.rst <mds>`.
+
+Kernel internal mitigation modes
+--------------------------------
+
+ =============  ============================================================
+ off            Mitigation is disabled. Either the CPU is not affected or
+                tsx_async_abort=off is supplied on the kernel command line.
+
+ tsx disabled   Mitigation is enabled. TSX feature is disabled by default at
+                bootup on processors that support TSX control.
+
+ verw           Mitigation is enabled. CPU is affected and MD_CLEAR is
+                advertised in CPUID.
+
+ ucode needed   Mitigation is enabled. CPU is affected and MD_CLEAR is not
+                advertised in CPUID. That is mainly for virtualization
+                scenarios where the host has the updated microcode but the
+                hypervisor does not expose MD_CLEAR in CPUID. It's a best
+                effort approach without guarantee.
+ =============  ============================================================
+
+If the CPU is affected and the "tsx_async_abort" kernel command line
+parameter is not provided, then the kernel selects an appropriate mitigation
+depending on the status of the RTM and MD_CLEAR CPUID bits.
+
+The tables below indicate the impact of the tsx=on|off|auto cmdline options
+on the state of the TAA mitigation, VERW behavior and the TSX feature for
+various combinations of MSR_IA32_ARCH_CAPABILITIES bits.
+
+1. "tsx=off"
+
+=========  =========  ============  ============  ==============  ===================  ======================
+MSR_IA32_ARCH_CAPABILITIES bits     Result with cmdline tsx=off
+----------------------------------  -------------------------------------------------------------------------
+TAA_NO     MDS_NO     TSX_CTRL_MSR  TSX state     VERW can clear  TAA mitigation       TAA mitigation
+                                    after bootup  CPU buffers     tsx_async_abort=off  tsx_async_abort=full
+=========  =========  ============  ============  ==============  ===================  ======================
+  0          0            0         HW default    Yes             Same as MDS          Same as MDS
+  0          0            1         Invalid case  Invalid case    Invalid case         Invalid case
+  0          1            0         HW default    No              Need ucode update    Need ucode update
+  0          1            1         Disabled      Yes             TSX disabled         TSX disabled
+  1          X            1         Disabled      X               None needed          None needed
+=========  =========  ============  ============  ==============  ===================  ======================
+
+2. "tsx=on"
+
+=========  =========  ============  ============  ==============  ===================  ======================
+MSR_IA32_ARCH_CAPABILITIES bits     Result with cmdline tsx=on
+----------------------------------  -------------------------------------------------------------------------
+TAA_NO     MDS_NO     TSX_CTRL_MSR  TSX state     VERW can clear  TAA mitigation       TAA mitigation
+                                    after bootup  CPU buffers     tsx_async_abort=off  tsx_async_abort=full
+=========  =========  ============  ============  ==============  ===================  ======================
+  0          0            0         HW default    Yes             Same as MDS          Same as MDS
+  0          0            1         Invalid case  Invalid case    Invalid case         Invalid case
+  0          1            0         HW default    No              Need ucode update    Need ucode update
+  0          1            1         Enabled       Yes             None                 Same as MDS
+  1          X            1         Enabled       X               None needed          None needed
+=========  =========  ============  ============  ==============  ===================  ======================
+
+3. "tsx=auto"
+
+=========  =========  ============  ============  ==============  ===================  ======================
+MSR_IA32_ARCH_CAPABILITIES bits     Result with cmdline tsx=auto
+----------------------------------  -------------------------------------------------------------------------
+TAA_NO     MDS_NO     TSX_CTRL_MSR  TSX state     VERW can clear  TAA mitigation       TAA mitigation
+                                    after bootup  CPU buffers     tsx_async_abort=off  tsx_async_abort=full
+=========  =========  ============  ============  ==============  ===================  ======================
+  0          0            0         HW default    Yes             Same as MDS          Same as MDS
+  0          0            1         Invalid case  Invalid case    Invalid case         Invalid case
+  0          1            0         HW default    No              Need ucode update    Need ucode update
+  0          1            1         Disabled      Yes             TSX disabled         TSX disabled
+  1          X            1         Enabled       X               None needed          None needed
+=========  =========  ============  ============  ==============  ===================  ======================
+
+In the tables, TSX_CTRL_MSR is a new bit in MSR_IA32_ARCH_CAPABILITIES that
+indicates whether MSR_IA32_TSX_CTRL is supported.
+
+There are two control bits in the IA32_TSX_CTRL MSR:
+
+  Bit 0: When set, it disables the Restricted Transactional Memory (RTM)
+         sub-feature of TSX (will force all transactions to abort on the
+         XBEGIN instruction).
+
+  Bit 1: When set, it disables the enumeration of the RTM and HLE features
+         (i.e. it will make CPUID(EAX=7).EBX{bit4} and
+         CPUID(EAX=7).EBX{bit11} read as 0).
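The two IA32_TSX_CTRL control bits described above compose the MSR value the kernel would write for the "tsx=" modes. A minimal sketch; the `tsx_ctrl_for` helper is illustrative, not kernel code:

```python
# IA32_TSX_CTRL control bits, per the description above.
TSX_CTRL_RTM_DISABLE = 1 << 0   # bit 0: force all RTM transactions to abort
TSX_CTRL_CPUID_CLEAR = 1 << 1   # bit 1: hide the RTM/HLE CPUID feature bits

def tsx_ctrl_for(mode: str) -> int:
    """MSR value plausibly written for tsx=on / tsx=off on TSX-control CPUs."""
    if mode == "off":
        # Disable RTM and hide the feature from CPUID enumeration.
        return TSX_CTRL_RTM_DISABLE | TSX_CTRL_CPUID_CLEAR
    # tsx=on: leave both control bits clear.
    return 0
```

So "tsx=off" corresponds to writing both bits (value 0b11), and "tsx=on" to clearing both.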
MAINTAINERS (-2)
···
 F:	drivers/cpufreq/bmips-cpufreq.c
 
 BROADCOM BMIPS MIPS ARCHITECTURE
-M:	Kevin Cernekee <cernekee@gmail.com>
 M:	Florian Fainelli <f.fainelli@gmail.com>
 L:	bcm-kernel-feedback-list@broadcom.com
 L:	linux-mips@vger.kernel.org
···
 
 CAVIUM THUNDERX2 ARM64 SOC
 M:	Robert Richter <rrichter@cavium.com>
-M:	Jayachandran C <jnair@caviumnetworks.com>
 L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
 S:	Maintained
 F:	arch/arm64/boot/dts/cavium/thunder2-99xx*
Makefile (+4-1)
···
 VERSION = 5
 PATCHLEVEL = 4
 SUBLEVEL = 0
-EXTRAVERSION = -rc6
+EXTRAVERSION = -rc7
 NAME = Kleptomaniac Octopus
 
 # *DOCUMENTATION*
···
 ifeq ($(CONFIG_RELR),y)
 LDFLAGS_vmlinux	+= --pack-dyn-relocs=relr
 endif
+
+# make the checker run with the right architecture
+CHECKFLAGS += --arch=$(ARCH)
 
 # insure the checker run with the right endianness
 CHECKFLAGS += $(if $(CONFIG_CPU_BIG_ENDIAN),-mbig-endian,-mlittle-endian)
···
 }
 #define __arch_get_clock_mode __mips_get_clock_mode
 
-static __always_inline
-int __mips_use_vsyscall(struct vdso_data *vdata)
-{
-	return (vdata[CS_HRES_COARSE].clock_mode != VDSO_CLOCK_NONE);
-}
-#define __arch_use_vsyscall __mips_use_vsyscall
-
 /* The asm-generic header needs to be included after the definitions above */
 #include <asm-generic/vdso/vsyscall.h>
arch/mips/sgi-ip27/Kconfig (-7)
···
 	  Say Y here to enable replicating the kernel text across multiple
 	  nodes in a NUMA cluster.  This trades memory for speed.
 
-config REPLICATE_EXHANDLERS
-	bool "Exception handler replication support"
-	depends on SGI_IP27
-	help
-	  Say Y here to enable replicating the kernel exception handlers
-	  across multiple nodes in a NUMA cluster.  This trades memory for
-	  speed.
+6-15
arch/mips/sgi-ip27/ip27-init.c
···
 	hub_rtc_init(cnode);

-#ifdef CONFIG_REPLICATE_EXHANDLERS
-	/*
-	 * If this is not a headless node initialization,
-	 * copy over the caliased exception handlers.
-	 */
-	if (get_compact_nodeid() == cnode) {
-		extern char except_vec2_generic, except_vec3_generic;
-		extern void build_tlb_refill_handler(void);
-
-		memcpy((void *)(CKSEG0 + 0x100), &except_vec2_generic, 0x80);
-		memcpy((void *)(CKSEG0 + 0x180), &except_vec3_generic, 0x80);
-		build_tlb_refill_handler();
-		memcpy((void *)(CKSEG0 + 0x100), (void *) CKSEG0, 0x80);
-		memcpy((void *)(CKSEG0 + 0x180), &except_vec3_generic, 0x100);
+	if (nasid) {
+		/* copy exception handlers from first node to current node */
+		memcpy((void *)NODE_OFFSET_TO_K0(nasid, 0),
+		       (void *)CKSEG0, 0x200);
 		__flush_cache_all();
+		/* switch to node local exception handlers */
+		REMOTE_HUB_S(nasid, PI_CALIAS_SIZE, PI_CALIAS_SIZE_8K);
 	}
-#endif
 }

 void per_cpu_init(void)
-4
arch/mips/sgi-ip27/ip27-memory.c
···
 	 * thinks it is a node 0 address.
 	 */
 	REMOTE_HUB_S(nasid, PI_REGION_PRESENT, (region_mask | 1));
-#ifdef CONFIG_REPLICATE_EXHANDLERS
-	REMOTE_HUB_S(nasid, PI_CALIAS_SIZE, PI_CALIAS_SIZE_8K);
-#else
 	REMOTE_HUB_S(nasid, PI_CALIAS_SIZE, PI_CALIAS_SIZE_0);
-#endif

 #ifdef LATER
 	/*
+2-2
arch/sparc/vdso/Makefile
···
 #
 # vDSO code runs in userspace and -pg doesn't help with profiling anyway.
 #
-CFLAGS_REMOVE_vdso-note.o = -pg
 CFLAGS_REMOVE_vclock_gettime.o = -pg
+CFLAGS_REMOVE_vdso32/vclock_gettime.o = -pg

 $(obj)/%.so: OBJCOPYFLAGS := -S
 $(obj)/%.so: $(obj)/%.so.dbg FORCE
 	$(call if_changed,objcopy)

-CPPFLAGS_vdso32.lds = $(CPPFLAGS_vdso.lds)
+CPPFLAGS_vdso32/vdso32.lds = $(CPPFLAGS_vdso.lds)
 VDSO_LDFLAGS_vdso32.lds = -m elf32_sparc -soname linux-gate.so.1

 #This makes sure the $(obj) subdirectory exists even though vdso32/
+45
arch/x86/Kconfig
···

 	  If unsure, say y.

+choice
+	prompt "TSX enable mode"
+	depends on CPU_SUP_INTEL
+	default X86_INTEL_TSX_MODE_OFF
+	help
+	  Intel's TSX (Transactional Synchronization Extensions) feature
+	  allows to optimize locking protocols through lock elision which
+	  can lead to a noticeable performance boost.
+
+	  On the other hand it has been shown that TSX can be exploited
+	  to form side channel attacks (e.g. TAA) and chances are there
+	  will be more of those attacks discovered in the future.
+
+	  Therefore TSX is not enabled by default (aka tsx=off). An admin
+	  might override this decision by tsx=on the command line parameter.
+	  Even with TSX enabled, the kernel will attempt to enable the best
+	  possible TAA mitigation setting depending on the microcode available
+	  for the particular machine.
+
+	  This option allows to set the default tsx mode between tsx=on, =off
+	  and =auto. See Documentation/admin-guide/kernel-parameters.txt for more
+	  details.
+
+	  Say off if not sure, auto if TSX is in use but it should be used on safe
+	  platforms or on if TSX is in use and the security aspect of tsx is not
+	  relevant.
+
+config X86_INTEL_TSX_MODE_OFF
+	bool "off"
+	help
+	  TSX is disabled if possible - equals to tsx=off command line parameter.
+
+config X86_INTEL_TSX_MODE_ON
+	bool "on"
+	help
+	  TSX is always enabled on TSX capable HW - equals the tsx=on command
+	  line parameter.
+
+config X86_INTEL_TSX_MODE_AUTO
+	bool "auto"
+	help
+	  TSX is enabled on TSX capable HW that is believed to be safe against
+	  side channel attacks- equals the tsx=auto command line parameter.
+endchoice
+
 config EFI
 	bool "EFI runtime service support"
 	depends on ACPI
+2
arch/x86/include/asm/cpufeatures.h
···
 #define X86_BUG_MDS			X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */
 #define X86_BUG_MSBDS_ONLY		X86_BUG(20) /* CPU is only affected by the MSDBS variant of BUG_MDS */
 #define X86_BUG_SWAPGS			X86_BUG(21) /* CPU is affected by speculation through SWAPGS */
+#define X86_BUG_TAA			X86_BUG(22) /* CPU is affected by TSX Async Abort(TAA) */
+#define X86_BUG_ITLB_MULTIHIT		X86_BUG(23) /* CPU may incur MCE during certain page attribute changes */

 #endif /* _ASM_X86_CPUFEATURES_H */
+6
arch/x86/include/asm/kvm_host.h
···
 struct kvm_mmu_page {
 	struct list_head link;
 	struct hlist_node hash_link;
+	struct list_head lpage_disallowed_link;
+
 	bool unsync;
 	u8 mmu_valid_gen;
 	bool mmio_cached;
+	bool lpage_disallowed; /* Can't be replaced by an equiv large page */

 	/*
 	 * The following two entries are used to key the shadow page in the
···
 	 */
 	struct list_head active_mmu_pages;
 	struct list_head zapped_obsolete_pages;
+	struct list_head lpage_disallowed_mmu_pages;
 	struct kvm_page_track_notifier_node mmu_sp_tracker;
 	struct kvm_page_track_notifier_head track_notifier_head;
···
 	bool exception_payload_enabled;

 	struct kvm_pmu_event_filter *pmu_event_filter;
+	struct task_struct *nx_lpage_recovery_thread;
 };

 struct kvm_vm_stat {
···
 	ulong mmu_unsync;
 	ulong remote_tlb_flush;
 	ulong lpages;
+	ulong nx_lpage_splits;
 	ulong max_mmu_page_hash_collisions;
 };
+16
arch/x86/include/asm/msr-index.h
···
 						  * Microarchitectural Data
 						  * Sampling (MDS) vulnerabilities.
 						  */
+#define ARCH_CAP_PSCHANGE_MC_NO		BIT(6)	 /*
+						  * The processor is not susceptible to a
+						  * machine check error due to modifying the
+						  * code page size along with either the
+						  * physical address or cache type
+						  * without TLB invalidation.
+						  */
+#define ARCH_CAP_TSX_CTRL_MSR		BIT(7)	/* MSR for TSX control is available. */
+#define ARCH_CAP_TAA_NO			BIT(8)	/*
+						 * Not susceptible to
+						 * TSX Async Abort (TAA) vulnerabilities.
+						 */

 #define MSR_IA32_FLUSH_CMD		0x0000010b
 #define L1D_FLUSH			BIT(0)	/*
···
 #define MSR_IA32_BBL_CR_CTL		0x00000119
 #define MSR_IA32_BBL_CR_CTL3		0x0000011e
+
+#define MSR_IA32_TSX_CTRL		0x00000122
+#define TSX_CTRL_RTM_DISABLE		BIT(0)	/* Disable RTM feature */
+#define TSX_CTRL_CPUID_CLEAR		BIT(1)	/* Disable TSX enumeration */

 #define MSR_IA32_SYSENTER_CS		0x00000174
 #define MSR_IA32_SYSENTER_ESP		0x00000175
+2-2
arch/x86/include/asm/nospec-branch.h
···
 #include <asm/segment.h>

 /**
- * mds_clear_cpu_buffers - Mitigation for MDS vulnerability
+ * mds_clear_cpu_buffers - Mitigation for MDS and TAA vulnerability
  *
  * This uses the otherwise unused and obsolete VERW instruction in
  * combination with microcode which triggers a CPU buffer flush when the
···
 }

 /**
- * mds_user_clear_cpu_buffers - Mitigation for MDS vulnerability
+ * mds_user_clear_cpu_buffers - Mitigation for MDS and TAA vulnerability
  *
  * Clear CPU buffers if the corresponding static key is enabled
  */
···
 {
 	int cpu = smp_processor_id();
 	unsigned int value;
-#ifdef CONFIG_X86_32
-	int logical_apicid, ldr_apicid;
-#endif

 	if (disable_apic) {
 		disable_ioapic_support();
···
 	apic->init_apic_ldr();

 #ifdef CONFIG_X86_32
-	/*
-	 * APIC LDR is initialized.  If logical_apicid mapping was
-	 * initialized during get_smp_config(), make sure it matches the
-	 * actual value.
-	 */
-	logical_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
-	ldr_apicid = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR));
-	WARN_ON(logical_apicid != BAD_APICID && logical_apicid != ldr_apicid);
-	/* always use the value from LDR */
-	early_per_cpu(x86_cpu_to_logical_apicid, cpu) = ldr_apicid;
+	if (apic->dest_logical) {
+		int logical_apicid, ldr_apicid;
+
+		/*
+		 * APIC LDR is initialized.  If logical_apicid mapping was
+		 * initialized during get_smp_config(), make sure it matches
+		 * the actual value.
+		 */
+		logical_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
+		ldr_apicid = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR));
+		if (logical_apicid != BAD_APICID)
+			WARN_ON(logical_apicid != ldr_apicid);
+		/* Always use the value from LDR. */
+		early_per_cpu(x86_cpu_to_logical_apicid, cpu) = ldr_apicid;
+	}
 #endif

 	/*
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel Transactional Synchronization Extensions (TSX) control.
+ *
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Author:
+ *	Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
+ */
+
+#include <linux/cpufeature.h>
+
+#include <asm/cmdline.h>
+
+#include "cpu.h"
+
+enum tsx_ctrl_states tsx_ctrl_state __ro_after_init = TSX_CTRL_NOT_SUPPORTED;
+
+void tsx_disable(void)
+{
+	u64 tsx;
+
+	rdmsrl(MSR_IA32_TSX_CTRL, tsx);
+
+	/* Force all transactions to immediately abort */
+	tsx |= TSX_CTRL_RTM_DISABLE;
+
+	/*
+	 * Ensure TSX support is not enumerated in CPUID.
+	 * This is visible to userspace and will ensure they
+	 * do not waste resources trying TSX transactions that
+	 * will always abort.
+	 */
+	tsx |= TSX_CTRL_CPUID_CLEAR;
+
+	wrmsrl(MSR_IA32_TSX_CTRL, tsx);
+}
+
+void tsx_enable(void)
+{
+	u64 tsx;
+
+	rdmsrl(MSR_IA32_TSX_CTRL, tsx);
+
+	/* Enable the RTM feature in the cpu */
+	tsx &= ~TSX_CTRL_RTM_DISABLE;
+
+	/*
+	 * Ensure TSX support is enumerated in CPUID.
+	 * This is visible to userspace and will ensure they
+	 * can enumerate and use the TSX feature.
+	 */
+	tsx &= ~TSX_CTRL_CPUID_CLEAR;
+
+	wrmsrl(MSR_IA32_TSX_CTRL, tsx);
+}
+
+static bool __init tsx_ctrl_is_supported(void)
+{
+	u64 ia32_cap = x86_read_arch_cap_msr();
+
+	/*
+	 * TSX is controlled via MSR_IA32_TSX_CTRL.  However, support for this
+	 * MSR is enumerated by ARCH_CAP_TSX_MSR bit in MSR_IA32_ARCH_CAPABILITIES.
+	 *
+	 * TSX control (aka MSR_IA32_TSX_CTRL) is only available after a
+	 * microcode update on CPUs that have their MSR_IA32_ARCH_CAPABILITIES
+	 * bit MDS_NO=1. CPUs with MDS_NO=0 are not planned to get
+	 * MSR_IA32_TSX_CTRL support even after a microcode update. Thus,
+	 * tsx= cmdline requests will do nothing on CPUs without
+	 * MSR_IA32_TSX_CTRL support.
+	 */
+	return !!(ia32_cap & ARCH_CAP_TSX_CTRL_MSR);
+}
+
+static enum tsx_ctrl_states x86_get_tsx_auto_mode(void)
+{
+	if (boot_cpu_has_bug(X86_BUG_TAA))
+		return TSX_CTRL_DISABLE;
+
+	return TSX_CTRL_ENABLE;
+}
+
+void __init tsx_init(void)
+{
+	char arg[5] = {};
+	int ret;
+
+	if (!tsx_ctrl_is_supported())
+		return;
+
+	ret = cmdline_find_option(boot_command_line, "tsx", arg, sizeof(arg));
+	if (ret >= 0) {
+		if (!strcmp(arg, "on")) {
+			tsx_ctrl_state = TSX_CTRL_ENABLE;
+		} else if (!strcmp(arg, "off")) {
+			tsx_ctrl_state = TSX_CTRL_DISABLE;
+		} else if (!strcmp(arg, "auto")) {
+			tsx_ctrl_state = x86_get_tsx_auto_mode();
+		} else {
+			tsx_ctrl_state = TSX_CTRL_DISABLE;
+			pr_err("tsx: invalid option, defaulting to off\n");
+		}
+	} else {
+		/* tsx= not provided */
+		if (IS_ENABLED(CONFIG_X86_INTEL_TSX_MODE_AUTO))
+			tsx_ctrl_state = x86_get_tsx_auto_mode();
+		else if (IS_ENABLED(CONFIG_X86_INTEL_TSX_MODE_OFF))
+			tsx_ctrl_state = TSX_CTRL_DISABLE;
+		else
+			tsx_ctrl_state = TSX_CTRL_ENABLE;
+	}
+
+	if (tsx_ctrl_state == TSX_CTRL_DISABLE) {
+		tsx_disable();
+
+		/*
+		 * tsx_disable() will change the state of the
+		 * RTM CPUID bit.  Clear it here since it is now
+		 * expected to be not set.
+		 */
+		setup_clear_cpu_cap(X86_FEATURE_RTM);
+	} else if (tsx_ctrl_state == TSX_CTRL_ENABLE) {
+
+		/*
+		 * HW defaults TSX to be enabled at bootup.
+		 * We may still need the TSX enable support
+		 * during init for special cases like
+		 * kexec after TSX is disabled.
+		 */
+		tsx_enable();
+
+		/*
+		 * tsx_enable() will change the state of the
+		 * RTM CPUID bit.  Force it here since it is now
+		 * expected to be set.
+		 */
+		setup_force_cpu_cap(X86_FEATURE_RTM);
+	}
+}
+7
arch/x86/kernel/dumpstack_64.c
···
 	BUILD_BUG_ON(N_EXCEPTION_STACKS != 6);

 	begin = (unsigned long)__this_cpu_read(cea_exception_stacks);
+	/*
+	 * Handle the case where stack trace is collected _before_
+	 * cea_exception_stacks had been initialized.
+	 */
+	if (!begin)
+		return false;
+
 	end = begin + sizeof(struct cea_exception_stacks);
 	/* Bail if @stack is outside the exception stack area. */
 	if (stk < begin || stk >= end)
···
 static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 			 struct guest_walker *gw,
 			 int write_fault, int hlevel,
-			 kvm_pfn_t pfn, bool map_writable, bool prefault)
+			 kvm_pfn_t pfn, bool map_writable, bool prefault,
+			 bool lpage_disallowed)
 {
 	struct kvm_mmu_page *sp = NULL;
 	struct kvm_shadow_walk_iterator it;
 	unsigned direct_access, access = gw->pt_access;
 	int top_level, ret;
-	gfn_t base_gfn;
+	gfn_t gfn, base_gfn;

 	direct_access = gw->pte_access;
···
 		link_shadow_page(vcpu, it.sptep, sp);
 	}

-	base_gfn = gw->gfn;
+	/*
+	 * FNAME(page_fault) might have clobbered the bottom bits of
+	 * gw->gfn, restore them from the virtual address.
+	 */
+	gfn = gw->gfn | ((addr & PT_LVL_OFFSET_MASK(gw->level)) >> PAGE_SHIFT);
+	base_gfn = gfn;

 	trace_kvm_mmu_spte_requested(addr, gw->level, pfn);

 	for (; shadow_walk_okay(&it); shadow_walk_next(&it)) {
 		clear_sp_write_flooding_count(it.sptep);
-		base_gfn = gw->gfn & ~(KVM_PAGES_PER_HPAGE(it.level) - 1);
+
+		/*
+		 * We cannot overwrite existing page tables with an NX
+		 * large page, as the leaf could be executable.
+		 */
+		disallowed_hugepage_adjust(it, gfn, &pfn, &hlevel);
+
+		base_gfn = gfn & ~(KVM_PAGES_PER_HPAGE(it.level) - 1);
 		if (it.level == hlevel)
 			break;
···
 			sp = kvm_mmu_get_page(vcpu, base_gfn, addr,
 					      it.level - 1, true, direct_access);
 			link_shadow_page(vcpu, it.sptep, sp);
+			if (lpage_disallowed)
+				account_huge_nx_page(vcpu->kvm, sp);
 		}
 	}
···
 	int r;
 	kvm_pfn_t pfn;
 	int level = PT_PAGE_TABLE_LEVEL;
-	bool force_pt_level = false;
 	unsigned long mmu_seq;
 	bool map_writable, is_self_change_mapping;
+	bool lpage_disallowed = (error_code & PFERR_FETCH_MASK) &&
+				is_nx_huge_page_enabled();
+	bool force_pt_level = lpage_disallowed;

 	pgprintk("%s: addr %lx err %x\n", __func__, addr, error_code);
···
 	if (!force_pt_level)
 		transparent_hugepage_adjust(vcpu, walker.gfn, &pfn, &level);
 	r = FNAME(fetch)(vcpu, addr, &walker, write_fault,
-			 level, pfn, map_writable, prefault);
+			 level, pfn, map_writable, prefault, lpage_disallowed);
 	kvm_mmu_audit(vcpu, AUDIT_POST_PAGE_FAULT);

 out_unlock:
+20-3
arch/x86/kvm/vmx/vmx.c
···
 	if (!pi_test_sn(pi_desc) && vcpu->cpu == cpu)
 		return;

+	/*
+	 * If the 'nv' field is POSTED_INTR_WAKEUP_VECTOR, do not change
+	 * PI.NDST: pi_post_block is the one expected to change PID.NDST and the
+	 * wakeup handler expects the vCPU to be on the blocked_vcpu_list that
+	 * matches PI.NDST. Otherwise, a vcpu may not be able to be woken up
+	 * correctly.
+	 */
+	if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR || vcpu->cpu == cpu) {
+		pi_clear_sn(pi_desc);
+		goto after_clear_sn;
+	}
+
 	/* The full case. */
 	do {
 		old.control = new.control = pi_desc->control;
···
 	} while (cmpxchg64(&pi_desc->control, old.control,
 			   new.control) != old.control);

+after_clear_sn:
+
 	/*
 	 * Clear SN before reading the bitmap.  The VT-d firmware
 	 * writes the bitmap and reads SN atomically (5.2.3 in the
···
 	 */
 	smp_mb__after_atomic();

-	if (!bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS))
+	if (!pi_is_pir_empty(pi_desc))
 		pi_set_on(pi_desc);
 }
···
 	if (pi_test_on(&vmx->pi_desc)) {
 		pi_clear_on(&vmx->pi_desc);
 		/*
-		 * IOMMU can write to PIR.ON, so the barrier matters even on UP.
+		 * IOMMU can write to PID.ON, so the barrier matters even on UP.
 		 * But on x86 this is just a compiler barrier anyway.
 		 */
 		smp_mb__after_atomic();
···

 static bool vmx_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu)
 {
-	return pi_test_on(vcpu_to_pi_desc(vcpu));
+	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
+
+	return pi_test_on(pi_desc) ||
+		(pi_test_sn(pi_desc) && !pi_is_pir_empty(pi_desc));
 }

 static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
+11
arch/x86/kvm/vmx/vmx.h
···
 	return test_and_set_bit(vector, (unsigned long *)pi_desc->pir);
 }

+static inline bool pi_is_pir_empty(struct pi_desc *pi_desc)
+{
+	return bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS);
+}
+
 static inline void pi_set_sn(struct pi_desc *pi_desc)
 {
 	set_bit(POSTED_INTR_SN,
···
 static inline void pi_clear_on(struct pi_desc *pi_desc)
 {
 	clear_bit(POSTED_INTR_ON,
+		  (unsigned long *)&pi_desc->control);
+}
+
+static inline void pi_clear_sn(struct pi_desc *pi_desc)
+{
+	clear_bit(POSTED_INTR_SN,
 		  (unsigned long *)&pi_desc->control);
 }
+69-30
arch/x86/kvm/x86.c
···
 	{ "mmu_unsync", VM_STAT(mmu_unsync) },
 	{ "remote_tlb_flush", VM_STAT(remote_tlb_flush) },
 	{ "largepages", VM_STAT(lpages, .mode = 0444) },
+	{ "nx_largepages_splitted", VM_STAT(nx_lpage_splits, .mode = 0444) },
 	{ "max_mmu_page_hash_collisions",
 		VM_STAT(max_mmu_page_hash_collisions) },
 	{ NULL }
···
  * List of msr numbers which we expose to userspace through KVM_GET_MSRS
  * and KVM_SET_MSRS, and KVM_GET_MSR_INDEX_LIST.
  *
- * This list is modified at module load time to reflect the
+ * The three MSR lists(msrs_to_save, emulated_msrs, msr_based_features)
+ * extract the supported MSRs from the related const lists.
+ * msrs_to_save is selected from the msrs_to_save_all to reflect the
  * capabilities of the host cpu. This capabilities test skips MSRs that are
- * kvm-specific. Those are put in emulated_msrs; filtering of emulated_msrs
+ * kvm-specific. Those are put in emulated_msrs_all; filtering of emulated_msrs
  * may depend on host virtualization features rather than host cpu features.
  */

-static u32 msrs_to_save[] = {
+static const u32 msrs_to_save_all[] = {
 	MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
 	MSR_STAR,
 #ifdef CONFIG_X86_64
···
 	MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,
 };

+static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_all)];
 static unsigned num_msrs_to_save;

-static u32 emulated_msrs[] = {
+static const u32 emulated_msrs_all[] = {
 	MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
 	MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW,
 	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,
···
 	 * by arch/x86/kvm/vmx/nested.c based on CPUID or other MSRs.
 	 * We always support the "true" VMX control MSRs, even if the host
 	 * processor does not, so I am putting these registers here rather
-	 * than in msrs_to_save.
+	 * than in msrs_to_save_all.
 	 */
 	MSR_IA32_VMX_BASIC,
 	MSR_IA32_VMX_TRUE_PINBASED_CTLS,
···
 	MSR_KVM_POLL_CONTROL,
 };

+static u32 emulated_msrs[ARRAY_SIZE(emulated_msrs_all)];
 static unsigned num_emulated_msrs;

 /*
  * List of msr numbers which are used to expose MSR-based features that
  * can be used by a hypervisor to validate requested CPU features.
  */
-static u32 msr_based_features[] = {
+static const u32 msr_based_features_all[] = {
 	MSR_IA32_VMX_BASIC,
 	MSR_IA32_VMX_TRUE_PINBASED_CTLS,
 	MSR_IA32_VMX_PINBASED_CTLS,
···
 	MSR_IA32_ARCH_CAPABILITIES,
 };

+static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all)];
 static unsigned int num_msr_based_features;

 static u64 kvm_get_arch_capabilities(void)
···

 	if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
 		rdmsrl(MSR_IA32_ARCH_CAPABILITIES, data);
+
+	/*
+	 * If nx_huge_pages is enabled, KVM's shadow paging will ensure that
+	 * the nested hypervisor runs with NX huge pages.  If it is not,
+	 * L1 is anyway vulnerable to ITLB_MULTIHIT explots from other
+	 * L1 guests, so it need not worry about its own (L2) guests.
+	 */
+	data |= ARCH_CAP_PSCHANGE_MC_NO;

 	/*
 	 * If we're doing cache flushes (either "always" or "cond")
···
 		data |= ARCH_CAP_SSB_NO;
 	if (!boot_cpu_has_bug(X86_BUG_MDS))
 		data |= ARCH_CAP_MDS_NO;
+
+	/*
+	 * On TAA affected systems, export MDS_NO=0 when:
+	 *	- TSX is enabled on the host, i.e. X86_FEATURE_RTM=1.
+	 *	- Updated microcode is present. This is detected by
+	 *	  the presence of ARCH_CAP_TSX_CTRL_MSR and ensures
+	 *	  that VERW clears CPU buffers.
+	 *
+	 * When MDS_NO=0 is exported, guests deploy clear CPU buffer
+	 * mitigation and don't complain:
+	 *
+	 *	"Vulnerable: Clear CPU buffers attempted, no microcode"
+	 *
+	 * If TSX is disabled on the system, guests are also mitigated against
+	 * TAA and clear CPU buffer mitigation is not required for guests.
+	 */
+	if (boot_cpu_has_bug(X86_BUG_TAA) && boot_cpu_has(X86_FEATURE_RTM) &&
+	    (data & ARCH_CAP_TSX_CTRL_MSR))
+		data &= ~ARCH_CAP_MDS_NO;

 	return data;
 }
···
 {
 	struct x86_pmu_capability x86_pmu;
 	u32 dummy[2];
-	unsigned i, j;
+	unsigned i;

 	BUILD_BUG_ON_MSG(INTEL_PMC_MAX_FIXED != 4,
-			 "Please update the fixed PMCs in msrs_to_save[]");
+			 "Please update the fixed PMCs in msrs_to_saved_all[]");

 	perf_get_x86_pmu_capability(&x86_pmu);

-	for (i = j = 0; i < ARRAY_SIZE(msrs_to_save); i++) {
-		if (rdmsr_safe(msrs_to_save[i], &dummy[0], &dummy[1]) < 0)
+	num_msrs_to_save = 0;
+	num_emulated_msrs = 0;
+	num_msr_based_features = 0;
+
+	for (i = 0; i < ARRAY_SIZE(msrs_to_save_all); i++) {
+		if (rdmsr_safe(msrs_to_save_all[i], &dummy[0], &dummy[1]) < 0)
 			continue;

 		/*
 		 * Even MSRs that are valid in the host may not be exposed
 		 * to the guests in some cases.
 		 */
-		switch (msrs_to_save[i]) {
+		switch (msrs_to_save_all[i]) {
 		case MSR_IA32_BNDCFGS:
 			if (!kvm_mpx_supported())
 				continue;
···
 			break;
 		case MSR_IA32_RTIT_ADDR0_A ... MSR_IA32_RTIT_ADDR3_B: {
 			if (!kvm_x86_ops->pt_supported() ||
-				msrs_to_save[i] - MSR_IA32_RTIT_ADDR0_A >=
+				msrs_to_save_all[i] - MSR_IA32_RTIT_ADDR0_A >=
 				intel_pt_validate_hw_cap(PT_CAP_num_address_ranges) * 2)
 				continue;
 			break;
 		case MSR_ARCH_PERFMON_PERFCTR0 ... MSR_ARCH_PERFMON_PERFCTR0 + 17:
-			if (msrs_to_save[i] - MSR_ARCH_PERFMON_PERFCTR0 >=
+			if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_PERFCTR0 >=
 			    min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
 				continue;
 			break;
 		case MSR_ARCH_PERFMON_EVENTSEL0 ... MSR_ARCH_PERFMON_EVENTSEL0 + 17:
-			if (msrs_to_save[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
+			if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
 			    min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
 				continue;
 		}
···
 			break;
 		}

-		if (j < i)
-			msrs_to_save[j] = msrs_to_save[i];
-		j++;
+		msrs_to_save[num_msrs_to_save++] = msrs_to_save_all[i];
 	}
-	num_msrs_to_save = j;

-	for (i = j = 0; i < ARRAY_SIZE(emulated_msrs); i++) {
-		if (!kvm_x86_ops->has_emulated_msr(emulated_msrs[i]))
+	for (i = 0; i < ARRAY_SIZE(emulated_msrs_all); i++) {
+		if (!kvm_x86_ops->has_emulated_msr(emulated_msrs_all[i]))
 			continue;

-		if (j < i)
-			emulated_msrs[j] = emulated_msrs[i];
-		j++;
+		emulated_msrs[num_emulated_msrs++] = emulated_msrs_all[i];
 	}
-	num_emulated_msrs = j;

-	for (i = j = 0; i < ARRAY_SIZE(msr_based_features); i++) {
+	for (i = 0; i < ARRAY_SIZE(msr_based_features_all); i++) {
 		struct kvm_msr_entry msr;

-		msr.index = msr_based_features[i];
+		msr.index = msr_based_features_all[i];
 		if (kvm_get_msr_feature(&msr))
 			continue;

-		if (j < i)
-			msr_based_features[j] = msr_based_features[i];
-		j++;
+		msr_based_features[num_msr_based_features++] = msr_based_features_all[i];
 	}
-	num_msr_based_features = j;
 }

 static int vcpu_mmio_write(struct kvm_vcpu *vcpu, gpa_t addr, int len,
···
 	INIT_HLIST_HEAD(&kvm->arch.mask_notifier_list);
 	INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
 	INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages);
+	INIT_LIST_HEAD(&kvm->arch.lpage_disallowed_mmu_pages);
 	INIT_LIST_HEAD(&kvm->arch.assigned_dev_head);
 	atomic_set(&kvm->arch.noncoherent_dma_count, 0);
···
 	kvm_mmu_init_vm(kvm);

 	return kvm_x86_ops->vm_init(kvm);
+}
+
+int kvm_arch_post_init_vm(struct kvm *kvm)
+{
+	return kvm_mmu_post_init_vm(kvm);
 }

 static void kvm_unload_vcpu_mmu(struct kvm_vcpu *vcpu)
···
 	return r;
 }
 EXPORT_SYMBOL_GPL(x86_set_memory_region);
+
+void kvm_arch_pre_destroy_vm(struct kvm *kvm)
+{
+	kvm_mmu_pre_destroy_vm(kvm);
+}

 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
+26-6
block/bfq-iosched.c
···
 	}
 }

+
+static
+void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq)
+{
+	/*
+	 * To prevent bfqq's service guarantees from being violated,
+	 * bfqq may be left busy, i.e., queued for service, even if
+	 * empty (see comments in __bfq_bfqq_expire() for
+	 * details). But, if no process will send requests to bfqq any
+	 * longer, then there is no point in keeping bfqq queued for
+	 * service. In addition, keeping bfqq queued for service, but
+	 * with no process ref any longer, may have caused bfqq to be
+	 * freed when dequeued from service. But this is assumed to
+	 * never happen.
+	 */
+	if (bfq_bfqq_busy(bfqq) && RB_EMPTY_ROOT(&bfqq->sort_list) &&
+	    bfqq != bfqd->in_service_queue)
+		bfq_del_bfqq_busy(bfqd, bfqq, false);
+
+	bfq_put_queue(bfqq);
+}
+
 static void
 bfq_merge_bfqqs(struct bfq_data *bfqd, struct bfq_io_cq *bic,
 		struct bfq_queue *bfqq, struct bfq_queue *new_bfqq)
···
 	 */
 	new_bfqq->pid = -1;
 	bfqq->bic = NULL;
-	/* release process reference to bfqq */
-	bfq_put_queue(bfqq);
+	bfq_release_process_ref(bfqd, bfqq);
 }

 static bool bfq_allow_bio_merge(struct request_queue *q, struct request *rq,
···

 	bfq_put_cooperator(bfqq);

-	bfq_put_queue(bfqq); /* release process reference */
+	bfq_release_process_ref(bfqd, bfqq);
 }

 static void bfq_exit_icq_bfqq(struct bfq_io_cq *bic, bool is_sync)
···

 	bfqq = bic_to_bfqq(bic, false);
 	if (bfqq) {
-		/* release process reference on this queue */
-		bfq_put_queue(bfqq);
+		bfq_release_process_ref(bfqd, bfqq);
 		bfqq = bfq_get_queue(bfqd, bio, BLK_RW_ASYNC, bic);
 		bic_set_bfqq(bic, bfqq, false);
 	}
···

 	bfq_put_cooperator(bfqq);

-	bfq_put_queue(bfqq);
+	bfq_release_process_ref(bfqq->bfqd, bfqq);
 	return NULL;
 }
+1-1
block/bio.c
···
 	if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
 		return false;

-	if (bio->bi_vcnt > 0) {
+	if (bio->bi_vcnt > 0 && !bio_full(bio, len)) {
 		struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];

 		if (page_is_mergeable(bv, page, len, off, same_page)) {
+6-2
block/blk-iocost.c
···
 	atomic64_set(&iocg->active_period, cur_period);

 	/* already activated or breaking leaf-only constraint? */
-	for (i = iocg->level; i > 0; i--)
-		if (!list_empty(&iocg->active_list))
+	if (!list_empty(&iocg->active_list))
+		goto succeed_unlock;
+	for (i = iocg->level - 1; i > 0; i--)
+		if (!list_empty(&iocg->ancestors[i]->active_list))
 			goto fail_unlock;
+
 	if (iocg->child_active_sum)
 		goto fail_unlock;
···
 		ioc_start_period(ioc, now);
 	}

+succeed_unlock:
 	spin_unlock_irq(&iocg->lock);
 	return true;
···
 #include <linux/delay.h>
 #include <linux/device.h>
 #include <linux/err.h>
-#include <linux/freezer.h>
 #include <linux/fs.h>
 #include <linux/hw_random.h>
 #include <linux/kernel.h>
···
 {
 	long rc;

-	set_freezable();
-
-	while (!kthread_freezable_should_stop(NULL)) {
+	while (!kthread_should_stop()) {
 		struct hwrng *rng;

 		rng = get_current_rng();
+1-3
drivers/char/random.c
···
 #include <linux/percpu.h>
 #include <linux/cryptohash.h>
 #include <linux/fips.h>
-#include <linux/freezer.h>
 #include <linux/ptrace.h>
 #include <linux/workqueue.h>
 #include <linux/irq.h>
···
 	 * We'll be woken up again once below random_write_wakeup_thresh,
 	 * or when the calling thread is about to terminate.
 	 */
-	wait_event_freezable(random_write_wait,
-			kthread_should_stop() ||
+	wait_event_interruptible(random_write_wait, kthread_should_stop() ||
 			ENTROPY_BITS(&input_pool) <= random_write_wakeup_bits);
 	mix_pool_bytes(poolp, buffer, count);
 	credit_entropy_bits(poolp, entropy);
+11-5
drivers/clocksource/sh_mtu2.c
···
 	return 0;
 }

+static const unsigned int sh_mtu2_channel_offsets[] = {
+	0x300, 0x380, 0x000,
+};
+
 static int sh_mtu2_setup_channel(struct sh_mtu2_channel *ch, unsigned int index,
 				 struct sh_mtu2_device *mtu)
 {
-	static const unsigned int channel_offsets[] = {
-		0x300, 0x380, 0x000,
-	};
 	char name[6];
 	int irq;
 	int ret;
···
 		return ret;
 	}

-	ch->base = mtu->mapbase + channel_offsets[index];
+	ch->base = mtu->mapbase + sh_mtu2_channel_offsets[index];
 	ch->index = index;

 	return sh_mtu2_register(ch, dev_name(&mtu->pdev->dev));
···
 	}

 	/* Allocate and setup the channels. */
-	mtu->num_channels = 3;
+	ret = platform_irq_count(pdev);
+	if (ret < 0)
+		goto err_unmap;
+
+	mtu->num_channels = min_t(unsigned int, ret,
+				  ARRAY_SIZE(sh_mtu2_channel_offsets));

 	mtu->channels = kcalloc(mtu->num_channels, sizeof(*mtu->channels),
 				GFP_KERNEL);
···
 
     power_domains->initializing = true;
 
+    /* Must happen before power domain init on VLV/CHV */
+    intel_update_rawclk(i915);
+
     if (INTEL_GEN(i915) >= 11) {
         icl_display_core_init(i915, resume);
     } else if (IS_CANNONLAKE(i915)) {
+5
drivers/gpu/drm/i915/gem/i915_gem_context.c
···
     free_engines(rcu_access_pointer(ctx->engines));
     mutex_destroy(&ctx->engines_mutex);
 
+    kfree(ctx->jump_whitelist);
+
     if (ctx->timeline)
         intel_timeline_put(ctx->timeline);
 
···
 
     for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++)
         ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
+
+    ctx->jump_whitelist = NULL;
+    ctx->jump_whitelist_cmds = 0;
 
     return ctx;
 
+7
drivers/gpu/drm/i915/gem/i915_gem_context_types.h
···
      * per vm, which may be one per context or shared with the global GTT)
      */
     struct radix_tree_root handles_vma;
+
+    /** jump_whitelist: Bit array for tracking cmds during cmdparsing
+     *  Guarded by struct_mutex
+     */
+    unsigned long *jump_whitelist;
+    /** jump_whitelist_cmds: No of cmd slots available */
+    u32 jump_whitelist_cmds;
 };
 
 #endif /* __I915_GEM_CONTEXT_TYPES_H__ */
+80 -31
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
···
 
 static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb)
 {
-    return intel_engine_needs_cmd_parser(eb->engine) && eb->batch_len;
+    return intel_engine_requires_cmd_parser(eb->engine) ||
+           (intel_engine_using_cmd_parser(eb->engine) &&
+            eb->args->batch_len);
 }
 
 static int eb_create(struct i915_execbuffer *eb)
···
     return 0;
 }
 
-static struct i915_vma *eb_parse(struct i915_execbuffer *eb, bool is_master)
+static struct i915_vma *
+shadow_batch_pin(struct i915_execbuffer *eb, struct drm_i915_gem_object *obj)
+{
+    struct drm_i915_private *dev_priv = eb->i915;
+    struct i915_vma * const vma = *eb->vma;
+    struct i915_address_space *vm;
+    u64 flags;
+
+    /*
+     * PPGTT backed shadow buffers must be mapped RO, to prevent
+     * post-scan tampering
+     */
+    if (CMDPARSER_USES_GGTT(dev_priv)) {
+        flags = PIN_GLOBAL;
+        vm = &dev_priv->ggtt.vm;
+    } else if (vma->vm->has_read_only) {
+        flags = PIN_USER;
+        vm = vma->vm;
+        i915_gem_object_set_readonly(obj);
+    } else {
+        DRM_DEBUG("Cannot prevent post-scan tampering without RO capable vm\n");
+        return ERR_PTR(-EINVAL);
+    }
+
+    return i915_gem_object_pin(obj, vm, NULL, 0, 0, flags);
+}
+
+static struct i915_vma *eb_parse(struct i915_execbuffer *eb)
 {
     struct intel_engine_pool_node *pool;
     struct i915_vma *vma;
+    u64 batch_start;
+    u64 shadow_batch_start;
     int err;
 
     pool = intel_engine_pool_get(&eb->engine->pool, eb->batch_len);
     if (IS_ERR(pool))
         return ERR_CAST(pool);
 
-    err = intel_engine_cmd_parser(eb->engine,
+    vma = shadow_batch_pin(eb, pool->obj);
+    if (IS_ERR(vma))
+        goto err;
+
+    batch_start = gen8_canonical_addr(eb->batch->node.start) +
+                  eb->batch_start_offset;
+
+    shadow_batch_start = gen8_canonical_addr(vma->node.start);
+
+    err = intel_engine_cmd_parser(eb->gem_context,
+                                  eb->engine,
                                   eb->batch->obj,
-                                  pool->obj,
+                                  batch_start,
                                   eb->batch_start_offset,
                                   eb->batch_len,
-                                  is_master);
+                                  pool->obj,
+                                  shadow_batch_start);
+
     if (err) {
-        if (err == -EACCES) /* unhandled chained batch */
+        i915_vma_unpin(vma);
+
+        /*
+         * Unsafe GGTT-backed buffers can still be submitted safely
+         * as non-secure.
+         * For PPGTT backing however, we have no choice but to forcibly
+         * reject unsafe buffers
+         */
+        if (CMDPARSER_USES_GGTT(eb->i915) && (err == -EACCES))
+            /* Execute original buffer non-secure */
             vma = NULL;
         else
             vma = ERR_PTR(err);
         goto err;
     }
 
-    vma = i915_gem_object_ggtt_pin(pool->obj, NULL, 0, 0, 0);
-    if (IS_ERR(vma))
-        goto err;
-
     eb->vma[eb->buffer_count] = i915_vma_get(vma);
     eb->flags[eb->buffer_count] =
         __EXEC_OBJECT_HAS_PIN | __EXEC_OBJECT_HAS_REF;
     vma->exec_flags = &eb->flags[eb->buffer_count];
     eb->buffer_count++;
+
+    eb->batch_start_offset = 0;
+    eb->batch = vma;
+
+    if (CMDPARSER_USES_GGTT(eb->i915))
+        eb->batch_flags |= I915_DISPATCH_SECURE;
+
+    /* eb->batch_len unchanged */
 
     vma->private = pool;
     return vma;
···
                struct drm_i915_gem_exec_object2 *exec,
                struct drm_syncobj **fences)
 {
+    struct drm_i915_private *i915 = to_i915(dev);
     struct i915_execbuffer eb;
     struct dma_fence *in_fence = NULL;
     struct dma_fence *exec_fence = NULL;
···
     BUILD_BUG_ON(__EXEC_OBJECT_INTERNAL_FLAGS &
                  ~__EXEC_OBJECT_UNKNOWN_FLAGS);
 
-    eb.i915 = to_i915(dev);
+    eb.i915 = i915;
     eb.file = file;
     eb.args = args;
     if (DBG_FORCE_RELOC || !(args->flags & I915_EXEC_NO_RELOC))
···
 
     eb.batch_flags = 0;
     if (args->flags & I915_EXEC_SECURE) {
+        if (INTEL_GEN(i915) >= 11)
+            return -ENODEV;
+
+        /* Return -EPERM to trigger fallback code on old binaries. */
+        if (!HAS_SECURE_BATCHES(i915))
+            return -EPERM;
+
         if (!drm_is_current_master(file) || !capable(CAP_SYS_ADMIN))
-            return -EPERM;
+            return -EPERM;
 
         eb.batch_flags |= I915_DISPATCH_SECURE;
     }
···
         goto err_vma;
     }
 
+    if (eb.batch_len == 0)
+        eb.batch_len = eb.batch->size - eb.batch_start_offset;
+
     if (eb_use_cmdparser(&eb)) {
         struct i915_vma *vma;
 
-        vma = eb_parse(&eb, drm_is_current_master(file));
+        vma = eb_parse(&eb);
         if (IS_ERR(vma)) {
             err = PTR_ERR(vma);
             goto err_vma;
         }
-
-        if (vma) {
-            /*
-             * Batch parsed and accepted:
-             *
-             * Set the DISPATCH_SECURE bit to remove the NON_SECURE
-             * bit from MI_BATCH_BUFFER_START commands issued in
-             * the dispatch_execbuffer implementations. We
-             * specifically don't want that set on batches the
-             * command parser has accepted.
-             */
-            eb.batch_flags |= I915_DISPATCH_SECURE;
-            eb.batch_start_offset = 0;
-            eb.batch = vma;
-        }
     }
-
-    if (eb.batch_len == 0)
-        eb.batch_len = eb.batch->size - eb.batch_start_offset;
 
     /*
      * snb/ivb/vlv conflate the "batch in ppgtt" bit with the "non-secure
···
         value = !!(i915->caps.scheduler & I915_SCHEDULER_CAP_SEMAPHORES);
         break;
     case I915_PARAM_HAS_SECURE_BATCHES:
-        value = capable(CAP_SYS_ADMIN);
+        value = HAS_SECURE_BATCHES(i915) && capable(CAP_SYS_ADMIN);
         break;
     case I915_PARAM_CMD_PARSER_VERSION:
         value = i915_cmd_parser_get_version(i915);
···
      */
     I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
                PWM1_GATING_DIS | PWM2_GATING_DIS);
+
+    /*
+     * Lower the display internal timeout.
+     * This is needed to avoid any hard hangs when DSI port PLL
+     * is off and a MMIO access is attempted by any privilege
+     * application, using batch buffers or any other means.
+     */
+    I915_WRITE(RM_TIMEOUT, MMIO_TIMEOUT_US(950));
 }
 
 static void glk_init_clock_gating(struct drm_i915_private *dev_priv)
···
     dev_priv->ips.corr = (lcfuse & LCFUSE_HIV_MASK);
 }
 
+static bool i915_rc6_ctx_corrupted(struct drm_i915_private *dev_priv)
+{
+    return !I915_READ(GEN8_RC6_CTX_INFO);
+}
+
+static void i915_rc6_ctx_wa_init(struct drm_i915_private *i915)
+{
+    if (!NEEDS_RC6_CTX_CORRUPTION_WA(i915))
+        return;
+
+    if (i915_rc6_ctx_corrupted(i915)) {
+        DRM_INFO("RC6 context corrupted, disabling runtime power management\n");
+        i915->gt_pm.rc6.ctx_corrupted = true;
+        i915->gt_pm.rc6.ctx_corrupted_wakeref =
+            intel_runtime_pm_get(&i915->runtime_pm);
+    }
+}
+
+static void i915_rc6_ctx_wa_cleanup(struct drm_i915_private *i915)
+{
+    if (i915->gt_pm.rc6.ctx_corrupted) {
+        intel_runtime_pm_put(&i915->runtime_pm,
+                             i915->gt_pm.rc6.ctx_corrupted_wakeref);
+        i915->gt_pm.rc6.ctx_corrupted = false;
+    }
+}
+
+/**
+ * i915_rc6_ctx_wa_suspend - system suspend sequence for the RC6 CTX WA
+ * @i915: i915 device
+ *
+ * Perform any steps needed to clean up the RC6 CTX WA before system suspend.
+ */
+void i915_rc6_ctx_wa_suspend(struct drm_i915_private *i915)
+{
+    if (i915->gt_pm.rc6.ctx_corrupted)
+        intel_runtime_pm_put(&i915->runtime_pm,
+                             i915->gt_pm.rc6.ctx_corrupted_wakeref);
+}
+
+/**
+ * i915_rc6_ctx_wa_resume - system resume sequence for the RC6 CTX WA
+ * @i915: i915 device
+ *
+ * Perform any steps needed to re-init the RC6 CTX WA after system resume.
+ */
+void i915_rc6_ctx_wa_resume(struct drm_i915_private *i915)
+{
+    if (!i915->gt_pm.rc6.ctx_corrupted)
+        return;
+
+    if (i915_rc6_ctx_corrupted(i915)) {
+        i915->gt_pm.rc6.ctx_corrupted_wakeref =
+            intel_runtime_pm_get(&i915->runtime_pm);
+        return;
+    }
+
+    DRM_INFO("RC6 context restored, re-enabling runtime power management\n");
+    i915->gt_pm.rc6.ctx_corrupted = false;
+}
+
+static void intel_disable_rc6(struct drm_i915_private *dev_priv);
+
+/**
+ * i915_rc6_ctx_wa_check - check for a new RC6 CTX corruption
+ * @i915: i915 device
+ *
+ * Check if an RC6 CTX corruption has happened since the last check and if so
+ * disable RC6 and runtime power management.
+ *
+ * Return false if no context corruption has happened since the last call of
+ * this function, true otherwise.
+ */
+bool i915_rc6_ctx_wa_check(struct drm_i915_private *i915)
+{
+    if (!NEEDS_RC6_CTX_CORRUPTION_WA(i915))
+        return false;
+
+    if (i915->gt_pm.rc6.ctx_corrupted)
+        return false;
+
+    if (!i915_rc6_ctx_corrupted(i915))
+        return false;
+
+    DRM_NOTE("RC6 context corruption, disabling runtime power management\n");
+
+    intel_disable_rc6(i915);
+    i915->gt_pm.rc6.ctx_corrupted = true;
+    i915->gt_pm.rc6.ctx_corrupted_wakeref =
+        intel_runtime_pm_get_noresume(&i915->runtime_pm);
+
+    return true;
+}
+
 void intel_init_gt_powersave(struct drm_i915_private *dev_priv)
 {
     struct intel_rps *rps = &dev_priv->gt_pm.rps;
···
         DRM_INFO("RC6 disabled, disabling runtime PM support\n");
         pm_runtime_get(&dev_priv->drm.pdev->dev);
     }
+
+    i915_rc6_ctx_wa_init(dev_priv);
 
     /* Initialize RPS limits (for userspace) */
     if (IS_CHERRYVIEW(dev_priv))
···
     if (IS_VALLEYVIEW(dev_priv))
         valleyview_cleanup_gt_powersave(dev_priv);
 
+    i915_rc6_ctx_wa_cleanup(dev_priv);
+
     if (!HAS_RC6(dev_priv))
         pm_runtime_put(&dev_priv->drm.pdev->dev);
 }
···
     i915->gt_pm.llc_pstate.enabled = false;
 }
 
-static void intel_disable_rc6(struct drm_i915_private *dev_priv)
+static void __intel_disable_rc6(struct drm_i915_private *dev_priv)
 {
     lockdep_assert_held(&dev_priv->gt_pm.rps.lock);
···
         gen6_disable_rc6(dev_priv);
 
     dev_priv->gt_pm.rc6.enabled = false;
+}
+
+static void intel_disable_rc6(struct drm_i915_private *dev_priv)
+{
+    struct intel_rps *rps = &dev_priv->gt_pm.rps;
+
+    mutex_lock(&rps->lock);
+    __intel_disable_rc6(dev_priv);
+    mutex_unlock(&rps->lock);
 }
 
 static void intel_disable_rps(struct drm_i915_private *dev_priv)
···
 {
     mutex_lock(&dev_priv->gt_pm.rps.lock);
 
-    intel_disable_rc6(dev_priv);
+    __intel_disable_rc6(dev_priv);
     intel_disable_rps(dev_priv);
     if (HAS_LLC(dev_priv))
         intel_disable_llc_pstate(dev_priv);
···
     lockdep_assert_held(&dev_priv->gt_pm.rps.lock);
 
     if (dev_priv->gt_pm.rc6.enabled)
+        return;
+
+    if (dev_priv->gt_pm.rc6.ctx_corrupted)
         return;
 
     if (IS_CHERRYVIEW(dev_priv))
···
  * @name: name of the chip.
  * @reg: register map of the chip.
  * @config: configuration of the chip.
+ * @fifo_size: size of the FIFO in bytes.
  */
 struct inv_mpu6050_hw {
     u8 whoami;
     u8 *name;
     const struct inv_mpu6050_reg_map *reg;
     const struct inv_mpu6050_chip_config *config;
+    size_t fifo_size;
 };
 
 /*
+12 -3
drivers/iio/imu/inv_mpu6050/inv_mpu_ring.c
···
             "failed to ack interrupt\n");
         goto flush_fifo;
     }
-    /* handle fifo overflow by reseting fifo */
-    if (int_status & INV_MPU6050_BIT_FIFO_OVERFLOW_INT)
-        goto flush_fifo;
     if (!(int_status & INV_MPU6050_BIT_RAW_DATA_RDY_INT)) {
         dev_warn(regmap_get_device(st->map),
             "spurious interrupt with status 0x%x\n", int_status);
···
     if (result)
         goto end_session;
     fifo_count = get_unaligned_be16(&data[0]);
+
+    /*
+     * Handle fifo overflow by resetting fifo.
+     * Reset if there is only 3 data set free remaining to mitigate
+     * possible delay between reading fifo count and fifo data.
+     */
+    nb = 3 * bytes_per_datum;
+    if (fifo_count >= st->hw->fifo_size - nb) {
+        dev_warn(regmap_get_device(st->map), "fifo overflow reset\n");
+        goto flush_fifo;
+    }
+
     /* compute and process all complete datum */
     nb = fifo_count / bytes_per_datum;
     inv_mpu6050_update_period(st, pf->timestamp, nb);
+15 -14
drivers/iio/proximity/srf04.c
···
     udelay(data->cfg->trigger_pulse_us);
     gpiod_set_value(data->gpiod_trig, 0);
 
-    /* it cannot take more than 20 ms */
+    /* it should not take more than 20 ms until echo is rising */
     ret = wait_for_completion_killable_timeout(&data->rising, HZ/50);
     if (ret < 0) {
         mutex_unlock(&data->lock);
···
         return -ETIMEDOUT;
     }
 
-    ret = wait_for_completion_killable_timeout(&data->falling, HZ/50);
+    /* it cannot take more than 50 ms until echo is falling */
+    ret = wait_for_completion_killable_timeout(&data->falling, HZ/20);
     if (ret < 0) {
         mutex_unlock(&data->lock);
         return ret;
···
 
     dt_ns = ktime_to_ns(ktime_dt);
     /*
-     * measuring more than 3 meters is beyond the capabilities of
-     * the sensor
+     * measuring more than 6,45 meters is beyond the capabilities of
+     * the supported sensors
      * ==> filter out invalid results for not measuring echos of
      *     another us sensor
      *
      * formula:
-     *         distance     3 m
-     * time = ---------- = --------- = 9404389 ns
-     *          speed      319 m/s
+     *         distance     6,45 * 2 m
+     * time = ---------- = ------------ = 40438871 ns
+     *          speed       319 m/s
      *
      * using a minimum speed at -20 °C of 319 m/s
      */
-    if (dt_ns > 9404389)
+    if (dt_ns > 40438871)
         return -EIO;
 
     time_ns = dt_ns;
···
      * with Temp in °C
      * and speed in m/s
      *
-     * use 343 m/s as ultrasonic speed at 20 °C here in absence of the
+     * use 343,5 m/s as ultrasonic speed at 20 °C here in absence of the
      * temperature
      *
      * therefore:
-     *             time     343
-     * distance = ------ * -----
-     *             10^6      2
+     *             time     343,5     time * 106
+     * distance = ------ * ------- = ------------
+     *             10^6       2         617176
      * with time in ns
      * and distance in mm (one way)
      *
-     * because we limit to 3 meters the multiplication with 343 just
+     * because we limit to 6,45 meters the multiplication with 106 just
      * fits into 32 bit
      */
-    distance_mm = time_ns * 343 / 2000000;
+    distance_mm = time_ns * 106 / 617176;
 
     return distance_mm;
 }
···
     /*
      * bus->max_bus_speed is set from the bridge's linkcap Max Link Speed
      */
-    if (parent && dd->pcidev->bus->max_bus_speed != PCIE_SPEED_8_0GT) {
+    if (parent &&
+        (dd->pcidev->bus->max_bus_speed == PCIE_SPEED_2_5GT ||
+         dd->pcidev->bus->max_bus_speed == PCIE_SPEED_5_0GT)) {
         dd_dev_info(dd, "Parent PCIe bridge does not support Gen3\n");
         dd->link_gen3_capable = 0;
     }
+8 -8
drivers/infiniband/hw/hfi1/rc.c
···
     if (qp->s_flags & RVT_S_WAIT_RNR)
         goto bail_stop;
     rdi = ib_to_rvt(qp->ibqp.device);
-    if (qp->s_rnr_retry == 0 &&
-        !((rdi->post_parms[wqe->wr.opcode].flags &
-           RVT_OPERATION_IGN_RNR_CNT) &&
-          qp->s_rnr_retry_cnt == 0)) {
-        status = IB_WC_RNR_RETRY_EXC_ERR;
-        goto class_b;
+    if (!(rdi->post_parms[wqe->wr.opcode].flags &
+          RVT_OPERATION_IGN_RNR_CNT)) {
+        if (qp->s_rnr_retry == 0) {
+            status = IB_WC_RNR_RETRY_EXC_ERR;
+            goto class_b;
+        }
+        if (qp->s_rnr_retry_cnt < 7 && qp->s_rnr_retry_cnt > 0)
+            qp->s_rnr_retry--;
     }
-    if (qp->s_rnr_retry_cnt < 7 && qp->s_rnr_retry_cnt > 0)
-        qp->s_rnr_retry--;
 
     /*
      * The last valid PSN is the previous PSN. For TID RDMA WRITE
+32 -25
drivers/infiniband/hw/hfi1/tid_rdma.c
···
  *   C - Capcode
  */
 
-static u32 tid_rdma_flow_wt;
-
 static void tid_rdma_trigger_resume(struct work_struct *work);
 static void hfi1_kern_exp_rcv_free_flows(struct tid_rdma_request *req);
 static int hfi1_kern_exp_rcv_alloc_flows(struct tid_rdma_request *req,
···
                               struct hfi1_ctxtdata *rcd,
                               struct tid_rdma_flow *flow,
                               bool fecn);
+
+static void validate_r_tid_ack(struct hfi1_qp_priv *priv)
+{
+    if (priv->r_tid_ack == HFI1_QP_WQE_INVALID)
+        priv->r_tid_ack = priv->r_tid_tail;
+}
+
+static void tid_rdma_schedule_ack(struct rvt_qp *qp)
+{
+    struct hfi1_qp_priv *priv = qp->priv;
+
+    priv->s_flags |= RVT_S_ACK_PENDING;
+    hfi1_schedule_tid_send(qp);
+}
+
+static void tid_rdma_trigger_ack(struct rvt_qp *qp)
+{
+    validate_r_tid_ack(qp->priv);
+    tid_rdma_schedule_ack(qp);
+}
 
 static u64 tid_rdma_opfn_encode(struct tid_rdma_params *p)
 {
···
         qpriv->s_nak_state = IB_NAK_PSN_ERROR;
         /* We are NAK'ing the next expected PSN */
         qpriv->s_nak_psn = mask_psn(flow->flow_state.r_next_psn);
-        qpriv->s_flags |= RVT_S_ACK_PENDING;
-        if (qpriv->r_tid_ack == HFI1_QP_WQE_INVALID)
-            qpriv->r_tid_ack = qpriv->r_tid_tail;
-        hfi1_schedule_tid_send(qp);
+        tid_rdma_trigger_ack(qp);
     }
     goto unlock;
 }
···
     return sizeof(ohdr->u.tid_rdma.w_req) / sizeof(u32);
 }
 
-void hfi1_compute_tid_rdma_flow_wt(void)
+static u32 hfi1_compute_tid_rdma_flow_wt(struct rvt_qp *qp)
 {
     /*
      * Heuristic for computing the RNR timeout when waiting on the flow
      * queue. Rather than a computationaly expensive exact estimate of when
      * a flow will be available, we assume that if a QP is at position N in
      * the flow queue it has to wait approximately (N + 1) * (number of
-     * segments between two sync points), assuming PMTU of 4K. The rationale
-     * for this is that flows are released and recycled at each sync point.
+     * segments between two sync points). The rationale for this is that
+     * flows are released and recycled at each sync point.
      */
-    tid_rdma_flow_wt = MAX_TID_FLOW_PSN * enum_to_mtu(OPA_MTU_4096) /
-        TID_RDMA_MAX_SEGMENT_SIZE;
+    return (MAX_TID_FLOW_PSN * qp->pmtu) >> TID_RDMA_SEGMENT_SHIFT;
 }
 
 static u32 position_in_queue(struct hfi1_qp_priv *qpriv,
···
     if (qpriv->flow_state.index >= RXE_NUM_TID_FLOWS) {
         ret = hfi1_kern_setup_hw_flow(qpriv->rcd, qp);
         if (ret) {
-            to_seg = tid_rdma_flow_wt *
+            to_seg = hfi1_compute_tid_rdma_flow_wt(qp) *
                      position_in_queue(qpriv,
                                        &rcd->flow_queue);
             break;
···
         /*
          * If overtaking req->acked_tail, send an RNR NAK. Because the
          * QP is not queued in this case, and the issue can only be
-         * caused due a delay in scheduling the second leg which we
+         * caused by a delay in scheduling the second leg which we
          * cannot estimate, we use a rather arbitrary RNR timeout of
          * (MAX_FLOWS / 2) segments
          */
···
                 MAX_FLOWS)) {
             ret = -EAGAIN;
             to_seg = MAX_FLOWS >> 1;
-            qpriv->s_flags |= RVT_S_ACK_PENDING;
-            hfi1_schedule_tid_send(qp);
+            tid_rdma_trigger_ack(qp);
             break;
         }
 
···
     trace_hfi1_tid_req_rcv_write_data(qp, 0, e->opcode, e->psn, e->lpsn,
                                       req);
     trace_hfi1_tid_write_rsp_rcv_data(qp);
-    if (priv->r_tid_ack == HFI1_QP_WQE_INVALID)
-        priv->r_tid_ack = priv->r_tid_tail;
+    validate_r_tid_ack(priv);
 
     if (opcode == TID_OP(WRITE_DATA_LAST)) {
         release_rdma_sge_mr(e);
···
     }
 
 done:
-    priv->s_flags |= RVT_S_ACK_PENDING;
-    hfi1_schedule_tid_send(qp);
+    tid_rdma_schedule_ack(qp);
 exit:
     priv->r_next_psn_kdeth = flow->flow_state.r_next_psn;
     if (fecn)
···
     if (!priv->s_nak_state) {
         priv->s_nak_state = IB_NAK_PSN_ERROR;
         priv->s_nak_psn = flow->flow_state.r_next_psn;
-        priv->s_flags |= RVT_S_ACK_PENDING;
-        if (priv->r_tid_ack == HFI1_QP_WQE_INVALID)
-            priv->r_tid_ack = priv->r_tid_tail;
-        hfi1_schedule_tid_send(qp);
+        tid_rdma_trigger_ack(qp);
     }
     goto done;
 }
···
     qpriv->resync = true;
     /* RESYNC request always gets a TID RDMA ACK. */
     qpriv->s_nak_state = 0;
-    qpriv->s_flags |= RVT_S_ACK_PENDING;
-    hfi1_schedule_tid_send(qp);
+    tid_rdma_trigger_ack(qp);
 bail:
     if (fecn)
         qp->s_flags |= RVT_S_ECN;
···
 {
     struct ml_device *ml = ff->private;
 
+    /*
+     * Even though we stop all playing effects when tearing down
+     * an input device (via input_device_flush() that calls into
+     * input_ff_flush() that stops and erases all effects), we
+     * do not actually stop the timer, and therefore we should
+     * do it here.
+     */
+    del_timer_sync(&ml->timer);
+
     kfree(ml->private);
 }
 
···
 
     /* get sysinfo */
     md->si = &cd->sysinfo;
-    if (!md->si) {
-        dev_err(dev, "%s: Fail get sysinfo pointer from core p=%p\n",
-            __func__, md->si);
-        goto error_get_sysinfo;
-    }
 
     rc = cyttsp4_setup_input_device(cd);
     if (rc)
···
 
 error_init_input:
     input_free_device(md->input);
-error_get_sysinfo:
-    input_set_drvdata(md->input, NULL);
 error_alloc_failed:
     dev_err(dev, "%s failed.\n", __func__);
     return rc;
+4
drivers/interconnect/core.c
···
     if (!path)
         return;
 
+    mutex_lock(&icc_lock);
+
     for (i = 0; i < path->num_nodes; i++)
         path->reqs[i].tag = tag;
+
+    mutex_unlock(&icc_lock);
 }
 EXPORT_SYMBOL_GPL(icc_set_tag);
 
+2 -1
drivers/interconnect/qcom/qcs404.c
···
     if (!qp)
         return -ENOMEM;
 
-    data = devm_kcalloc(dev, num_nodes, sizeof(*node), GFP_KERNEL);
+    data = devm_kzalloc(dev, struct_size(data, nodes, num_nodes),
+                        GFP_KERNEL);
     if (!data)
         return -ENOMEM;
 
+2 -1
drivers/interconnect/qcom/sdm845.c
···
     if (!qp)
         return -ENOMEM;
 
-    data = devm_kcalloc(&pdev->dev, num_nodes, sizeof(*node), GFP_KERNEL);
+    data = devm_kzalloc(&pdev->dev, struct_size(data, nodes, num_nodes),
+                        GFP_KERNEL);
     if (!data)
         return -ENOMEM;
 
+1 -1
drivers/mmc/host/sdhci-of-at91.c
···
     pm_runtime_use_autosuspend(&pdev->dev);
 
     /* HS200 is broken at this moment */
-    host->quirks2 = SDHCI_QUIRK2_BROKEN_HS200;
+    host->quirks2 |= SDHCI_QUIRK2_BROKEN_HS200;
 
     ret = sdhci_add_host(host);
     if (ret)
···
 err_service_reg:
     free_channel(priv, channel);
 err_alloc_ch:
-    if (err == -EPROBE_DEFER)
+    if (err == -EPROBE_DEFER) {
+        for (i = 0; i < priv->num_channels; i++) {
+            channel = priv->channel[i];
+            nctx = &channel->nctx;
+            dpaa2_io_service_deregister(channel->dpio, nctx, dev);
+            free_channel(priv, channel);
+        }
+        priv->num_channels = 0;
         return err;
+    }
 
     if (cpumask_empty(&priv->dpio_cpumask)) {
         dev_err(dev, "No cpu with an affine DPIO/DPCON\n");
···
-// SPDX-License-Identifier: (GPL-2.0 OR MIT)
+/* SPDX-License-Identifier: (GPL-2.0 OR MIT) */
 // Copyright (c) 2017 Synopsys, Inc. and/or its affiliates.
 // stmmac Support for 5.xx Ethernet QoS cores
 
+1 -1
drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
···
-// SPDX-License-Identifier: (GPL-2.0 OR MIT)
+/* SPDX-License-Identifier: (GPL-2.0 OR MIT) */
 /*
  * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
  * stmmac XGMAC definitions.
+1 -1
drivers/net/ethernet/stmicro/stmmac/hwif.h
···
-// SPDX-License-Identifier: (GPL-2.0 OR MIT)
+/* SPDX-License-Identifier: (GPL-2.0 OR MIT) */
 // Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
 // stmmac HW Interface Callbacks
 
+4
drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c
···
 
     switch (rq->type) {
     case PTP_CLK_REQ_PEROUT:
+        /* Reject requests with unsupported flags */
+        if (rq->perout.flags)
+            return -EOPNOTSUPP;
+
         cfg = &priv->pps[rq->perout.index];
 
         cfg->start.tv_sec = rq->perout.start.sec;
+16
drivers/net/phy/dp83640.c
···
 
     switch (rq->type) {
     case PTP_CLK_REQ_EXTTS:
+        /* Reject requests with unsupported flags */
+        if (rq->extts.flags & ~(PTP_ENABLE_FEATURE |
+                                PTP_RISING_EDGE |
+                                PTP_FALLING_EDGE |
+                                PTP_STRICT_FLAGS))
+            return -EOPNOTSUPP;
+
+        /* Reject requests to enable time stamping on both edges. */
+        if ((rq->extts.flags & PTP_STRICT_FLAGS) &&
+            (rq->extts.flags & PTP_ENABLE_FEATURE) &&
+            (rq->extts.flags & PTP_EXTTS_EDGES) == PTP_EXTTS_EDGES)
+            return -EOPNOTSUPP;
+
         index = rq->extts.index;
         if (index >= N_EXT_TS)
             return -EINVAL;
···
         return 0;
 
     case PTP_CLK_REQ_PEROUT:
+        /* Reject requests with unsupported flags */
+        if (rq->perout.flags)
+            return -EOPNOTSUPP;
         if (rq->perout.index >= N_PER_OUT)
             return -EINVAL;
         return periodic_output(clock, rq, on, rq->perout.index);
+6 -5
drivers/net/phy/mdio_bus.c
···
     if (mdiodev->dev.of_node)
         reset = devm_reset_control_get_exclusive(&mdiodev->dev,
                                                  "phy");
-    if (PTR_ERR(reset) == -ENOENT ||
-        PTR_ERR(reset) == -ENOTSUPP)
-        reset = NULL;
-    else if (IS_ERR(reset))
-        return PTR_ERR(reset);
+    if (IS_ERR(reset)) {
+        if (PTR_ERR(reset) == -ENOENT || PTR_ERR(reset) == -ENOSYS)
+            reset = NULL;
+        else
+            return PTR_ERR(reset);
+    }
 
     mdiodev->reset_ctrl = reset;
 
···
 
     /* Get the MAC address */
     ret = asix_read_cmd(dev, AX_CMD_READ_NODE_ID, 0, 0, ETH_ALEN, buf, 0);
-    if (ret < 0) {
+    if (ret < ETH_ALEN) {
         netdev_err(dev->net, "Failed to read MAC address: %d\n", ret);
         goto free;
     }
···
     struct ieee80211_hdr *hdr = (void *)skb->data;
     unsigned int snap_ip_tcp_hdrlen, ip_hdrlen, total_len, hdr_room;
     unsigned int mss = skb_shinfo(skb)->gso_size;
-    u16 length, iv_len, amsdu_pad;
+    u16 length, amsdu_pad;
     u8 *start_hdr;
     struct iwl_tso_hdr_page *hdr_page;
     struct page **page_ptr;
     struct tso_t tso;
-
-    /* if the packet is protected, then it must be CCMP or GCMP */
-    iv_len = ieee80211_has_protected(hdr->frame_control) ?
-        IEEE80211_CCMP_HDR_LEN : 0;
 
     trace_iwlwifi_dev_tx(trans->dev, skb, tfd, sizeof(*tfd),
                          &dev_cmd->hdr, start_len, 0);
 
     ip_hdrlen = skb_transport_header(skb) - skb_network_header(skb);
     snap_ip_tcp_hdrlen = 8 + ip_hdrlen + tcp_hdrlen(skb);
-    total_len = skb->len - snap_ip_tcp_hdrlen - hdr_len - iv_len;
+    total_len = skb->len - snap_ip_tcp_hdrlen - hdr_len;
     amsdu_pad = 0;
 
     /* total amount of header we may need for this A-MSDU */
     hdr_room = DIV_ROUND_UP(total_len, mss) *
-        (3 + snap_ip_tcp_hdrlen + sizeof(struct ethhdr)) + iv_len;
+        (3 + snap_ip_tcp_hdrlen + sizeof(struct ethhdr));
 
     /* Our device supports 9 segments at most, it will fit in 1 page */
     hdr_page = get_page_hdr(trans, hdr_room);
···
     start_hdr = hdr_page->pos;
     page_ptr = (void *)((u8 *)skb->cb + trans_pcie->page_offs);
     *page_ptr = hdr_page->page;
-    memcpy(hdr_page->pos, skb->data + hdr_len, iv_len);
-    hdr_page->pos += iv_len;
 
     /*
-     * Pull the ieee80211 header + IV to be able to use TSO core,
+     * Pull the ieee80211 header to be able to use TSO core,
      * we will restore it for the tx_status flow.
      */
-    skb_pull(skb, hdr_len + iv_len);
+    skb_pull(skb, hdr_len);
 
     /*
      * Remove the length of all the headers that we don't actually
···
         }
     }
 
-    /* re -add the WiFi header and IV */
-    skb_push(skb, hdr_len + iv_len);
+    /* re -add the WiFi header */
+    skb_push(skb, hdr_len);
 
     return 0;
 
+4 -2
drivers/nfc/nxp-nci/i2c.c
···
 
     if (r == -EREMOTEIO) {
         phy->hard_fault = r;
-        skb = NULL;
-    } else if (r < 0) {
+        if (info->mode == NXP_NCI_MODE_FW)
+            nxp_nci_fw_recv_frame(phy->ndev, NULL);
+    }
+    if (r < 0) {
         nfc_err(&client->dev, "Read failed with error %d\n", r);
         goto exit_irq_handled;
     }
···
 #define PADCFG0_GPIROUTNMI    BIT(17)
 #define PADCFG0_PMODE_SHIFT   10
 #define PADCFG0_PMODE_MASK    GENMASK(13, 10)
+#define PADCFG0_PMODE_GPIO    0
 #define PADCFG0_GPIORXDIS     BIT(9)
 #define PADCFG0_GPIOTXDIS     BIT(8)
 #define PADCFG0_GPIORXSTATE   BIT(1)
···
     cfg1 = readl(intel_get_padcfg(pctrl, pin, PADCFG1));
 
     mode = (cfg0 & PADCFG0_PMODE_MASK) >> PADCFG0_PMODE_SHIFT;
-    if (!mode)
+    if (mode == PADCFG0_PMODE_GPIO)
         seq_puts(s, "GPIO ");
     else
         seq_printf(s, "mode %d ", mode);
···
     writel(value, padcfg0);
 }
 
+static int intel_gpio_get_gpio_mode(void __iomem *padcfg0)
+{
+    return (readl(padcfg0) & PADCFG0_PMODE_MASK) >> PADCFG0_PMODE_SHIFT;
+}
+
 static void intel_gpio_set_gpio_mode(void __iomem *padcfg0)
 {
     u32 value;
···
     }
 
     padcfg0 = intel_get_padcfg(pctrl, pin, PADCFG0);
+
+    /*
+     * If pin is already configured in GPIO mode, we assume that
+     * firmware provides correct settings. In such case we avoid
+     * potential glitches on the pin. Otherwise, for the pin in
+     * alternative mode, consumer has to supply respective flags.
+     */
+    if (intel_gpio_get_gpio_mode(padcfg0) == PADCFG0_PMODE_GPIO) {
+        raw_spin_unlock_irqrestore(&pctrl->lock, flags);
+        return 0;
+    }
+
     intel_gpio_set_gpio_mode(padcfg0);
+
     /* Disable TX buffer and enable RX (this will be input) */
     __intel_gpio_set_direction(padcfg0, true);
 
drivers/pinctrl/pinctrl-stmfx.c | -14

@@ -585 +585 @@
 	return stmfx_function_enable(pctl->stmfx, func);
 }
 
-static int stmfx_pinctrl_gpio_init_valid_mask(struct gpio_chip *gc,
-					      unsigned long *valid_mask,
-					      unsigned int ngpios)
-{
-	struct stmfx_pinctrl *pctl = gpiochip_get_data(gc);
-	u32 n;
-
-	for_each_clear_bit(n, &pctl->gpio_valid_mask, ngpios)
-		clear_bit(n, valid_mask);
-
-	return 0;
-}
-
 static int stmfx_pinctrl_probe(struct platform_device *pdev)
 {
 	struct stmfx *stmfx = dev_get_drvdata(pdev->dev.parent);
@@ -647 +660 @@
 	pctl->gpio_chip.ngpio = pctl->pctl_desc.npins;
 	pctl->gpio_chip.can_sleep = true;
 	pctl->gpio_chip.of_node = np;
-	pctl->gpio_chip.init_valid_mask = stmfx_pinctrl_gpio_init_valid_mask;
 
 	ret = devm_gpiochip_add_data(pctl->dev, &pctl->gpio_chip, pctl);
 	if (ret) {
drivers/ptp/ptp_chardev.c | +15 -5

@@ -149 +149 @@
 			err = -EFAULT;
 			break;
 		}
-		if (((req.extts.flags & ~PTP_EXTTS_VALID_FLAGS) ||
-			req.extts.rsv[0] || req.extts.rsv[1]) &&
-			cmd == PTP_EXTTS_REQUEST2) {
-			err = -EINVAL;
-			break;
+		if (cmd == PTP_EXTTS_REQUEST2) {
+			/* Tell the drivers to check the flags carefully. */
+			req.extts.flags |= PTP_STRICT_FLAGS;
+			/* Make sure no reserved bit is set. */
+			if ((req.extts.flags & ~PTP_EXTTS_VALID_FLAGS) ||
+			    req.extts.rsv[0] || req.extts.rsv[1]) {
+				err = -EINVAL;
+				break;
+			}
+			/* Ensure one of the rising/falling edge bits is set. */
+			if ((req.extts.flags & PTP_ENABLE_FEATURE) &&
+			    (req.extts.flags & PTP_EXTTS_EDGES) == 0) {
+				err = -EINVAL;
+				break;
+			}
 		} else if (cmd == PTP_EXTTS_REQUEST) {
 			req.extts.flags &= PTP_EXTTS_V1_VALID_FLAGS;
 			req.extts.rsv[0] = 0;
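The PTP_EXTTS_REQUEST2 path above now rejects reserved flag bits and "enable" requests that select neither edge. A minimal sketch of that validation in plain C; the flag bit values mirror the kernel's uapi `ptp_clock.h` conventions (enable in bit 0, edges in bits 1-2, strict in bit 3), but treat them here as an illustrative assumption, and `-1` stands in for `-EINVAL`:

```c
/* Flag bits, modeled after include/uapi/linux/ptp_clock.h (assumed layout). */
#define PTP_ENABLE_FEATURE (1u << 0)
#define PTP_RISING_EDGE    (1u << 1)
#define PTP_FALLING_EDGE   (1u << 2)
#define PTP_STRICT_FLAGS   (1u << 3)
#define PTP_EXTTS_EDGES       (PTP_RISING_EDGE | PTP_FALLING_EDGE)
#define PTP_EXTTS_VALID_FLAGS (PTP_ENABLE_FEATURE | PTP_EXTTS_EDGES | PTP_STRICT_FLAGS)

/*
 * Validation performed by the v2 ioctl path above:
 *  - no bit outside the valid set may be set;
 *  - enabling external timestamps requires at least one edge bit.
 * Returns 0 on success, -1 (for -EINVAL) on rejection.
 */
int extts2_check_flags(unsigned int flags)
{
	if (flags & ~PTP_EXTTS_VALID_FLAGS)
		return -1;
	if ((flags & PTP_ENABLE_FEATURE) && (flags & PTP_EXTTS_EDGES) == 0)
		return -1;
	return 0;
}
```

Note that a disable request (no `PTP_ENABLE_FEATURE`) is allowed to carry no edge bits, which is why the edge check is gated on the enable bit.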
drivers/reset/core.c | +3 -2

@@ -76 +76 @@
  * of_reset_simple_xlate - translate reset_spec to the reset line number
  * @rcdev: a pointer to the reset controller device
  * @reset_spec: reset line specifier as found in the device tree
- * @flags: a flags pointer to fill in (optional)
  *
  * This simple translation function should be used for reset controllers
  * with 1:1 mapping, where reset lines can be indexed by number without gaps.
@@ -747 +748 @@
 	for (i = 0; i < resets->num_rstcs; i++)
 		__reset_control_put_internal(resets->rstc[i]);
 	mutex_unlock(&reset_list_mutex);
+	kfree(resets);
 }
 
 /**
@@ -825 +825 @@
 }
 EXPORT_SYMBOL_GPL(__device_reset);
 
-/**
+/*
  * APIs to manage an array of reset controls.
  */
+
 /**
  * of_reset_control_get_count - Count number of resets available with a device
  *
drivers/scsi/qla2xxx/qla_mid.c | +5 -3

@@ -76 +76 @@
	 * ensures no active vp_list traversal while the vport is removed
	 * from the queue)
	 */
-	for (i = 0; i < 10 && atomic_read(&vha->vref_count); i++)
-		wait_event_timeout(vha->vref_waitq,
-		    atomic_read(&vha->vref_count), HZ);
+	for (i = 0; i < 10; i++) {
+		if (wait_event_timeout(vha->vref_waitq,
+		    !atomic_read(&vha->vref_count), HZ) > 0)
+			break;
+	}
 
 	spin_lock_irqsave(&ha->vport_slock, flags);
 	if (atomic_read(&vha->vref_count)) {
drivers/scsi/qla2xxx/qla_os.c | +5 -3

@@ -1119 +1119 @@
 
 	qla2x00_mark_all_devices_lost(vha, 0);
 
-	for (i = 0; i < 10; i++)
-		wait_event_timeout(vha->fcport_waitQ, test_fcport_count(vha),
-		    HZ);
+	for (i = 0; i < 10; i++) {
+		if (wait_event_timeout(vha->fcport_waitQ,
+		    test_fcport_count(vha), HZ) > 0)
+			break;
+	}
 
 	flush_workqueue(vha->hw->wq);
 }
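Both qla2xxx hunks rework a bounded wait so the loop checks the return value of `wait_event_timeout()` (which is greater than zero when the condition became true) and breaks out early instead of always burning all ten iterations. A small user-space sketch of that retry shape, with a hypothetical `poll_once()` callback standing in for the kernel wait API:

```c
/*
 * Retry a timed wait up to max_tries times, but stop as soon as the wait
 * reports success - the same shape as the corrected qla2xxx loops, where
 * wait_event_timeout() returning > 0 means the condition was satisfied.
 */
static int bounded_wait(int (*poll_once)(void *), void *arg, int max_tries)
{
	int i;

	for (i = 0; i < max_tries; i++) {
		if (poll_once(arg) > 0)
			return 1;	/* condition met, stop early */
	}
	return 0;		/* gave up after max_tries attempts */
}

/* Example predicate: reports success once a countdown reaches zero. */
static int countdown_poll(void *arg)
{
	int *remaining = arg;

	return (*remaining)-- <= 0;
}

/* Convenience wrapper so the pattern is easy to exercise. */
int bounded_wait_demo(int countdown, int max_tries)
{
	return bounded_wait(countdown_poll, &countdown, max_tries);
}
```

The original loops also spun when the condition was already true (`qla_mid.c`) or ignored the wait result entirely (`qla_os.c`); checking the wait's return value removes both problems.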
@@ -263 +263 @@
 	int result = cmd->result;
 	struct request *rq = cmd->request;
 
-	switch (req_op(rq)) {
-	case REQ_OP_ZONE_RESET:
-	case REQ_OP_ZONE_RESET_ALL:
-
-		if (result &&
-		    sshdr->sense_key == ILLEGAL_REQUEST &&
-		    sshdr->asc == 0x24)
-			/*
-			 * INVALID FIELD IN CDB error: reset of a conventional
-			 * zone was attempted. Nothing to worry about, so be
-			 * quiet about the error.
-			 */
-			rq->rq_flags |= RQF_QUIET;
-		break;
-
-	case REQ_OP_WRITE:
-	case REQ_OP_WRITE_ZEROES:
-	case REQ_OP_WRITE_SAME:
-		break;
+	if (req_op(rq) == REQ_OP_ZONE_RESET &&
+	    result &&
+	    sshdr->sense_key == ILLEGAL_REQUEST &&
+	    sshdr->asc == 0x24) {
+		/*
+		 * INVALID FIELD IN CDB error: reset of a conventional
+		 * zone was attempted. Nothing to worry about, so be
+		 * quiet about the error.
+		 */
+		rq->rq_flags |= RQF_QUIET;
 	}
 }
@@ -5 +6 @@
 
 menuconfig SOUNDWIRE
 	tristate "SoundWire support"
+	depends on ACPI || OF
 	help
 	  SoundWire is a 2-Pin interface with data and clock line ratified
 	  by the MIPI Alliance. SoundWire is used for transporting data
drivers/soundwire/intel.c | +2 -2

@@ -900 +900 @@
 	/* Create PCM DAIs */
 	stream = &cdns->pcm;
 
-	ret = intel_create_dai(cdns, dais, INTEL_PDI_IN, stream->num_in,
+	ret = intel_create_dai(cdns, dais, INTEL_PDI_IN, cdns->pcm.num_in,
 			       off, stream->num_ch_in, true);
 	if (ret)
 		return ret;
@@ -931 +931 @@
 	if (ret)
 		return ret;
 
-	off += cdns->pdm.num_bd;
+	off += cdns->pdm.num_out;
 	ret = intel_create_dai(cdns, dais, INTEL_PDI_BD, cdns->pdm.num_bd,
 			       off, stream->num_ch_bd, false);
 	if (ret)
drivers/soundwire/slave.c | +2 -1

@@ -128 +128 @@
 	struct device_node *node;
 
 	for_each_child_of_node(bus->dev->of_node, node) {
-		int link_id, sdw_version, ret, len;
+		int link_id, ret, len;
+		unsigned int sdw_version;
 		const char *compat = NULL;
 		struct sdw_slave_id id;
 		const __be32 *addr;
drivers/thunderbolt/nhi_ops.c | -1

@@ -80 +80 @@
 {
 	u32 data;
 
-	pci_read_config_dword(nhi->pdev, VS_CAP_19, &data);
 	data = (cmd << VS_CAP_19_CMD_SHIFT) & VS_CAP_19_CMD_MASK;
 	pci_write_config_dword(nhi->pdev, VS_CAP_19, data | VS_CAP_19_VALID);
 }
drivers/thunderbolt/switch.c | +11 -17

@@ -896 +896 @@
  */
 bool tb_dp_port_is_enabled(struct tb_port *port)
 {
-	u32 data;
+	u32 data[2];
 
-	if (tb_port_read(port, &data, TB_CFG_PORT, port->cap_adap, 1))
+	if (tb_port_read(port, data, TB_CFG_PORT, port->cap_adap,
+			 ARRAY_SIZE(data)))
 		return false;
 
-	return !!(data & (TB_DP_VIDEO_EN | TB_DP_AUX_EN));
+	return !!(data[0] & (TB_DP_VIDEO_EN | TB_DP_AUX_EN));
 }
 
 /**
@@ -915 +914 @@
  */
 int tb_dp_port_enable(struct tb_port *port, bool enable)
 {
-	u32 data;
+	u32 data[2];
 	int ret;
 
-	ret = tb_port_read(port, &data, TB_CFG_PORT, port->cap_adap, 1);
+	ret = tb_port_read(port, data, TB_CFG_PORT, port->cap_adap,
+			   ARRAY_SIZE(data));
 	if (ret)
 		return ret;
 
 	if (enable)
-		data |= TB_DP_VIDEO_EN | TB_DP_AUX_EN;
+		data[0] |= TB_DP_VIDEO_EN | TB_DP_AUX_EN;
 	else
-		data &= ~(TB_DP_VIDEO_EN | TB_DP_AUX_EN);
+		data[0] &= ~(TB_DP_VIDEO_EN | TB_DP_AUX_EN);
 
-	return tb_port_write(port, &data, TB_CFG_PORT, port->cap_adap, 1);
+	return tb_port_write(port, data, TB_CFG_PORT, port->cap_adap,
+			     ARRAY_SIZE(data));
 }
 
 /* switch utility functions */
@@ -1034 +1031 @@
 	if (sw->authorized)
 		goto unlock;
 
-	/*
-	 * Make sure there is no PCIe rescan ongoing when a new PCIe
-	 * tunnel is created. Otherwise the PCIe rescan code might find
-	 * the new tunnel too early.
-	 */
-	pci_lock_rescan_remove();
-
 	switch (val) {
 	/* Approve switch */
 	case 1:
@@ -1052 +1056 @@
 	default:
 		break;
 	}
-
-	pci_unlock_rescan_remove();
 
 	if (!ret) {
 		sw->authorized = val;
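The switch.c hunks above widen the DP adapter config access from one dword to two, while the enable flags are still tested only in the first word. A minimal sketch of that access pattern; the bit positions below are hypothetical stand-ins, not the real `TB_DP_VIDEO_EN`/`TB_DP_AUX_EN` values:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions for illustration only. */
#define DP_VIDEO_EN (1u << 6)
#define DP_AUX_EN   (1u << 30)

/*
 * Mirrors the corrected tb_dp_port_is_enabled(): the adapter config is two
 * dwords wide, so the full array is read, but the enable flags live in the
 * first dword and only data[0] is tested.
 */
bool dp_port_is_enabled(const uint32_t data[2])
{
	return (data[0] & (DP_VIDEO_EN | DP_AUX_EN)) != 0;
}

/* Wrapper taking the two words separately, for easy exercising. */
bool dp_port_is_enabled_words(uint32_t w0, uint32_t w1)
{
	uint32_t data[2] = { w0, w1 };

	return dp_port_is_enabled(data);
}
```

Reading the full two-dword block while only touching `data[0]` keeps the read size matched to the hardware register layout without changing which bits are interpreted.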
@@ -459 +459 @@
	 */
	how &= ~AUTOFS_EXP_LEAVES;
	found = should_expire(expired, mnt, timeout, how);
-	if (!found || found != expired)
-		/* Something has changed, continue */
+	if (found != expired) { // something has changed, continue
+		dput(found);
 		goto next;
+	}
 
 	if (expired != dentry)
 		dput(dentry);
fs/btrfs/inode.c | +29 -1

@@ -474 +474 @@
 	u64 start = async_chunk->start;
 	u64 end = async_chunk->end;
 	u64 actual_end;
+	u64 i_size;
 	int ret = 0;
 	struct page **pages = NULL;
 	unsigned long nr_pages;
@@ -489 +488 @@
 	inode_should_defrag(BTRFS_I(inode), start, end, end - start + 1,
 			SZ_16K);
 
-	actual_end = min_t(u64, i_size_read(inode), end + 1);
+	/*
+	 * We need to save i_size before now because it could change in between
+	 * us evaluating the size and assigning it. This is because we lock and
+	 * unlock the page in truncate and fallocate, and then modify the i_size
+	 * later on.
+	 *
+	 * The barriers are to emulate READ_ONCE, remove that once i_size_read
+	 * does that for us.
+	 */
+	barrier();
+	i_size = i_size_read(inode);
+	barrier();
+	actual_end = min_t(u64, i_size, end + 1);
 again:
 	will_compress = 0;
 	nr_pages = (end >> PAGE_SHIFT) - (start >> PAGE_SHIFT) + 1;
@@ -9744 +9731 @@
 		commit_transaction = true;
 	}
 	if (commit_transaction) {
+		/*
+		 * We may have set commit_transaction when logging the new name
+		 * in the destination root, in which case we left the source
+		 * root context in the list of log contextes. So make sure we
+		 * remove it to avoid invalid memory accesses, since the context
+		 * was allocated in our stack frame.
+		 */
+		if (sync_log_root) {
+			mutex_lock(&root->log_mutex);
+			list_del_init(&ctx_root.list);
+			mutex_unlock(&root->log_mutex);
+		}
 		ret = btrfs_commit_transaction(trans);
 	} else {
 		int ret2;
@@ -9768 +9743 @@
 	up_read(&fs_info->subvol_sem);
 	if (old_ino == BTRFS_FIRST_FREE_OBJECTID)
 		up_read(&fs_info->subvol_sem);
+
+	ASSERT(list_empty(&ctx_root.list));
+	ASSERT(list_empty(&ctx_dest.list));
 
 	return ret;
 }
fs/btrfs/ioctl.c | -6

@@ -4195 +4195 @@
 	u64 transid;
 	int ret;
 
-	btrfs_warn(root->fs_info,
-	"START_SYNC ioctl is deprecated and will be removed in kernel 5.7");
-
 	trans = btrfs_attach_transaction_barrier(root);
 	if (IS_ERR(trans)) {
 		if (PTR_ERR(trans) != -ENOENT)
@@ -4221 +4224 @@
 					   void __user *argp)
 {
 	u64 transid;
-
-	btrfs_warn(fs_info,
-	"WAIT_SYNC ioctl is deprecated and will be removed in kernel 5.7");
 
 	if (argp) {
 		if (copy_from_user(&transid, argp, sizeof(transid)))
fs/btrfs/space-info.c | +21

@@ -893 +893 @@
 	while (ticket->bytes > 0 && ticket->error == 0) {
 		ret = prepare_to_wait_event(&ticket->wait, &wait, TASK_KILLABLE);
 		if (ret) {
+			/*
+			 * Delete us from the list. After we unlock the space
+			 * info, we don't want the async reclaim job to reserve
+			 * space for this ticket. If that would happen, then the
+			 * ticket's task would not known that space was reserved
+			 * despite getting an error, resulting in a space leak
+			 * (bytes_may_use counter of our space_info).
+			 */
+			list_del_init(&ticket->list);
 			ticket->error = -EINTR;
 			break;
 		}
@@ -954 +945 @@
 	spin_lock(&space_info->lock);
 	ret = ticket->error;
 	if (ticket->bytes || ticket->error) {
+		/*
+		 * Need to delete here for priority tickets. For regular tickets
+		 * either the async reclaim job deletes the ticket from the list
+		 * or we delete it ourselves at wait_reserve_ticket().
+		 */
 		list_del_init(&ticket->list);
 		if (!ret)
 			ret = -ENOSPC;
 	}
 	spin_unlock(&space_info->lock);
 	ASSERT(list_empty(&ticket->list));
+	/*
+	 * Check that we can't have an error set if the reservation succeeded,
+	 * as that would confuse tasks and lead them to error out without
+	 * releasing reserved space (if an error happens the expectation is that
+	 * space wasn't reserved at all).
+	 */
+	ASSERT(!(ticket->bytes == 0 && ticket->error));
 	return ret;
 }
@@ -101 +101 @@
 	}
 	target_sd->s_links++;
 	spin_unlock(&configfs_dirent_lock);
-	ret = configfs_get_target_path(item, item, body);
+	ret = configfs_get_target_path(parent_item, item, body);
 	if (!ret)
 		ret = configfs_create_link(target_sd, parent_item->ci_dentry,
 					   dentry, body);
fs/ecryptfs/inode.c | +53 -31

@@ -128 +128 @@
 			      struct inode *inode)
 {
 	struct dentry *lower_dentry = ecryptfs_dentry_to_lower(dentry);
-	struct inode *lower_dir_inode = ecryptfs_inode_to_lower(dir);
 	struct dentry *lower_dir_dentry;
+	struct inode *lower_dir_inode;
 	int rc;
 
-	dget(lower_dentry);
-	lower_dir_dentry = lock_parent(lower_dentry);
-	rc = vfs_unlink(lower_dir_inode, lower_dentry, NULL);
+	lower_dir_dentry = ecryptfs_dentry_to_lower(dentry->d_parent);
+	lower_dir_inode = d_inode(lower_dir_dentry);
+	inode_lock_nested(lower_dir_inode, I_MUTEX_PARENT);
+	dget(lower_dentry);	// don't even try to make the lower negative
+	if (lower_dentry->d_parent != lower_dir_dentry)
+		rc = -EINVAL;
+	else if (d_unhashed(lower_dentry))
+		rc = -EINVAL;
+	else
+		rc = vfs_unlink(lower_dir_inode, lower_dentry, NULL);
 	if (rc) {
 		printk(KERN_ERR "Error in vfs_unlink; rc = [%d]\n", rc);
 		goto out_unlock;
@@ -149 +142 @@
 	fsstack_copy_attr_times(dir, lower_dir_inode);
 	set_nlink(inode, ecryptfs_inode_to_lower(inode)->i_nlink);
 	inode->i_ctime = dir->i_ctime;
-	d_drop(dentry);
 out_unlock:
-	unlock_dir(lower_dir_dentry);
 	dput(lower_dentry);
+	inode_unlock(lower_dir_inode);
+	if (!rc)
+		d_drop(dentry);
 	return rc;
 }
 
@@ -319 +311 @@
 static struct dentry *ecryptfs_lookup_interpose(struct dentry *dentry,
 				     struct dentry *lower_dentry)
 {
-	struct inode *inode, *lower_inode = d_inode(lower_dentry);
+	struct path *path = ecryptfs_dentry_to_lower_path(dentry->d_parent);
+	struct inode *inode, *lower_inode;
 	struct ecryptfs_dentry_info *dentry_info;
-	struct vfsmount *lower_mnt;
 	int rc = 0;
 
 	dentry_info = kmem_cache_alloc(ecryptfs_dentry_info_cache, GFP_KERNEL);
@@ -330 +322 @@
 		return ERR_PTR(-ENOMEM);
 	}
 
-	lower_mnt = mntget(ecryptfs_dentry_to_lower_mnt(dentry->d_parent));
 	fsstack_copy_attr_atime(d_inode(dentry->d_parent),
-				d_inode(lower_dentry->d_parent));
+				d_inode(path->dentry));
 	BUG_ON(!d_count(lower_dentry));
 
 	ecryptfs_set_dentry_private(dentry, dentry_info);
-	dentry_info->lower_path.mnt = lower_mnt;
+	dentry_info->lower_path.mnt = mntget(path->mnt);
 	dentry_info->lower_path.dentry = lower_dentry;
 
-	if (d_really_is_negative(lower_dentry)) {
+	/*
+	 * negative dentry can go positive under us here - its parent is not
+	 * locked. That's OK and that could happen just as we return from
+	 * ecryptfs_lookup() anyway. Just need to be careful and fetch
+	 * ->d_inode only once - it's not stable here.
+	 */
+	lower_inode = READ_ONCE(lower_dentry->d_inode);
+
+	if (!lower_inode) {
 		/* We want to add because we couldn't find in lower */
 		d_add(dentry, NULL);
 		return NULL;
@@ -527 +512 @@
 {
 	struct dentry *lower_dentry;
 	struct dentry *lower_dir_dentry;
+	struct inode *lower_dir_inode;
 	int rc;
 
 	lower_dentry = ecryptfs_dentry_to_lower(dentry);
-	dget(dentry);
-	lower_dir_dentry = lock_parent(lower_dentry);
-	dget(lower_dentry);
-	rc = vfs_rmdir(d_inode(lower_dir_dentry), lower_dentry);
-	dput(lower_dentry);
-	if (!rc && d_really_is_positive(dentry))
+	lower_dir_dentry = ecryptfs_dentry_to_lower(dentry->d_parent);
+	lower_dir_inode = d_inode(lower_dir_dentry);
+
+	inode_lock_nested(lower_dir_inode, I_MUTEX_PARENT);
+	dget(lower_dentry);	// don't even try to make the lower negative
+	if (lower_dentry->d_parent != lower_dir_dentry)
+		rc = -EINVAL;
+	else if (d_unhashed(lower_dentry))
+		rc = -EINVAL;
+	else
+		rc = vfs_rmdir(lower_dir_inode, lower_dentry);
+	if (!rc) {
 		clear_nlink(d_inode(dentry));
-	fsstack_copy_attr_times(dir, d_inode(lower_dir_dentry));
-	set_nlink(dir, d_inode(lower_dir_dentry)->i_nlink);
-	unlock_dir(lower_dir_dentry);
+		fsstack_copy_attr_times(dir, lower_dir_inode);
+		set_nlink(dir, lower_dir_inode->i_nlink);
+	}
+	dput(lower_dentry);
+	inode_unlock(lower_dir_inode);
 	if (!rc)
 		d_drop(dentry);
-	dput(dentry);
 	return rc;
 }
 
@@ -588 +565 @@
 	struct dentry *lower_new_dentry;
 	struct dentry *lower_old_dir_dentry;
 	struct dentry *lower_new_dir_dentry;
-	struct dentry *trap = NULL;
+	struct dentry *trap;
 	struct inode *target_inode;
 
 	if (flags)
 		return -EINVAL;
 
+	lower_old_dir_dentry = ecryptfs_dentry_to_lower(old_dentry->d_parent);
+	lower_new_dir_dentry = ecryptfs_dentry_to_lower(new_dentry->d_parent);
+
 	lower_old_dentry = ecryptfs_dentry_to_lower(old_dentry);
 	lower_new_dentry = ecryptfs_dentry_to_lower(new_dentry);
-	dget(lower_old_dentry);
-	dget(lower_new_dentry);
-	lower_old_dir_dentry = dget_parent(lower_old_dentry);
-	lower_new_dir_dentry = dget_parent(lower_new_dentry);
+
 	target_inode = d_inode(new_dentry);
+
 	trap = lock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
+	dget(lower_new_dentry);
 	rc = -EINVAL;
 	if (lower_old_dentry->d_parent != lower_old_dir_dentry)
 		goto out_lock;
@@ -631 +606 @@
 	if (new_dir != old_dir)
 		fsstack_copy_attr_all(old_dir, d_inode(lower_old_dir_dentry));
 out_lock:
-	unlock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
-	dput(lower_new_dir_dentry);
-	dput(lower_old_dir_dentry);
 	dput(lower_new_dentry);
-	dput(lower_old_dentry);
+	unlock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
 	return rc;
 }
fs/exportfs/expfs.c | +19 -12

@@ -519 +519 @@
 		 * inode is actually connected to the parent.
 		 */
 		err = exportfs_get_name(mnt, target_dir, nbuf, result);
-		if (!err) {
-			inode_lock(target_dir->d_inode);
-			nresult = lookup_one_len(nbuf, target_dir,
-						 strlen(nbuf));
-			inode_unlock(target_dir->d_inode);
-			if (!IS_ERR(nresult)) {
-				if (nresult->d_inode) {
-					dput(result);
-					result = nresult;
-				} else
-					dput(nresult);
-			}
+		if (err) {
+			dput(target_dir);
+			goto err_result;
 		}
 
+		inode_lock(target_dir->d_inode);
+		nresult = lookup_one_len(nbuf, target_dir, strlen(nbuf));
+		if (!IS_ERR(nresult)) {
+			if (unlikely(nresult->d_inode != result->d_inode)) {
+				dput(nresult);
+				nresult = ERR_PTR(-ESTALE);
+			}
+		}
+		inode_unlock(target_dir->d_inode);
 		/*
 		 * At this point we are done with the parent, but it's pinned
 		 * by the child dentry anyway.
 		 */
 		dput(target_dir);
+
+		if (IS_ERR(nresult)) {
+			err = PTR_ERR(nresult);
+			goto err_result;
+		}
+		dput(result);
+		result = nresult;
 
 		/*
 		 * And finally make sure the dentry is actually acceptable
fs/io_uring.c | +24 -8

@@ -326 +326 @@
 #define REQ_F_TIMEOUT		1024	/* timeout request */
 #define REQ_F_ISREG		2048	/* regular file */
 #define REQ_F_MUST_PUNT		4096	/* must be punted even for NONBLOCK */
+#define REQ_F_TIMEOUT_NOSEQ	8192	/* no timeout sequence */
 	u64	user_data;
 	u32	result;
 	u32	sequence;
@@ -454 +453 @@
 	struct io_kiocb *req;
 
 	req = list_first_entry_or_null(&ctx->timeout_list, struct io_kiocb, list);
-	if (req && !__io_sequence_defer(ctx, req)) {
-		list_del_init(&req->list);
-		return req;
+	if (req) {
+		if (req->flags & REQ_F_TIMEOUT_NOSEQ)
+			return NULL;
+		if (!__io_sequence_defer(ctx, req)) {
+			list_del_init(&req->list);
+			return req;
+		}
 	}
 
 	return NULL;
@@ -1230 +1225 @@
 		}
 	}
 
-	return 0;
+	return len;
 }
 
 static ssize_t io_import_iovec(struct io_ring_ctx *ctx, int rw,
@@ -1946 +1941 @@
 	if (get_timespec64(&ts, u64_to_user_ptr(sqe->addr)))
 		return -EFAULT;
 
+	req->flags |= REQ_F_TIMEOUT;
+
 	/*
 	 * sqe->off holds how many events that need to occur for this
-	 * timeout event to be satisfied.
+	 * timeout event to be satisfied. If it isn't set, then this is
+	 * a pure timeout request, sequence isn't used.
 	 */
 	count = READ_ONCE(sqe->off);
-	if (!count)
-		count = 1;
+	if (!count) {
+		req->flags |= REQ_F_TIMEOUT_NOSEQ;
+		spin_lock_irq(&ctx->completion_lock);
+		entry = ctx->timeout_list.prev;
+		goto add;
+	}
 
 	req->sequence = ctx->cached_sq_head + count - 1;
 	/* reuse it to store the count */
 	req->submit.sequence = count;
-	req->flags |= REQ_F_TIMEOUT;
 
 	/*
 	 * Insertion sort, ensuring the first entry in the list is always
@@ -1974 +1963 @@
 		struct io_kiocb *nxt = list_entry(entry, struct io_kiocb, list);
 		unsigned nxt_sq_head;
 		long long tmp, tmp_nxt;
+
+		if (nxt->flags & REQ_F_TIMEOUT_NOSEQ)
+			continue;
 
 		/*
 		 * Since cached_sq_head + count - 1 can overflow, use type long
@@ -2004 +1990 @@
 		nxt->sequence++;
 	}
 	req->sequence -= span;
+add:
 	list_add(&req->list, entry);
 	spin_unlock_irq(&ctx->completion_lock);
 
@@ -2298 +2283 @@
 	switch (op) {
 	case IORING_OP_NOP:
 	case IORING_OP_POLL_REMOVE:
+	case IORING_OP_TIMEOUT:
 		return false;
 	default:
 		return true;
fs/namespace.c | +7 -8

@@ -2478 +2478 @@
 
 	time64_to_tm(sb->s_time_max, 0, &tm);
 
-	pr_warn("Mounted %s file system at %s supports timestamps until %04ld (0x%llx)\n",
-		sb->s_type->name, mntpath,
+	pr_warn("%s filesystem being %s at %s supports timestamps until %04ld (0x%llx)\n",
+		sb->s_type->name,
+		is_mounted(mnt) ? "remounted" : "mounted",
+		mntpath,
 		tm.tm_year+1900, (unsigned long long)sb->s_time_max);
 
 	free_page((unsigned long)buf);
@@ -2766 +2764 @@
 	if (IS_ERR(mnt))
 		return PTR_ERR(mnt);
 
-	error = do_add_mount(real_mount(mnt), mountpoint, mnt_flags);
-	if (error < 0) {
-		mntput(mnt);
-		return error;
-	}
-
 	mnt_warn_timestamp_expiry(mountpoint, mnt);
 
+	error = do_add_mount(real_mount(mnt), mountpoint, mnt_flags);
+	if (error < 0)
+		mntput(mnt);
 	return error;
 }
@@ -119 +119 @@
 typedef int (*walk_memory_blocks_func_t)(struct memory_block *, void *);
 extern int walk_memory_blocks(unsigned long start, unsigned long size,
 			      void *arg, walk_memory_blocks_func_t func);
+extern int for_each_memory_block(void *arg, walk_memory_blocks_func_t func);
 #define CONFIG_MEM_BLOCK_SIZE	(PAGES_PER_SECTION<<PAGE_SHIFT)
 #endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
include/linux/reset-controller.h | +2 -2

@@ -7 +7 @@
 struct reset_controller_dev;
 
 /**
- * struct reset_control_ops
+ * struct reset_control_ops - reset controller driver callbacks
  *
  * @reset: for self-deasserting resets, does all necessary
  *         things to reset the device
@@ -33 +33 @@
  * @provider: name of the reset controller device controlling this reset line
  * @index: ID of the reset controller in the reset controller device
  * @dev_id: name of the device associated with this reset line
- * @con_id name of the reset line (can be NULL)
+ * @con_id: name of the reset line (can be NULL)
  */
 struct reset_control_lookup {
 	struct list_head list;
include/linux/reset.h | +1 -1

@@ -143 +143 @@
 * If this function is called more than once for the same reset_control it will
 * return -EBUSY.
 *
- * See reset_control_get_shared for details on shared references to
+ * See reset_control_get_shared() for details on shared references to
 * reset-controls.
 *
 * Use of id names is optional.
@@ -2119 +2119 @@
 
 		nsdentry = kernfs_node_dentry(cgrp->kn, sb);
 		dput(fc->root);
-		fc->root = nsdentry;
 		if (IS_ERR(nsdentry)) {
-			ret = PTR_ERR(nsdentry);
 			deactivate_locked_super(sb);
+			ret = PTR_ERR(nsdentry);
+			nsdentry = NULL;
 		}
+		fc->root = nsdentry;
 	}
 
 	if (!ctx->kfc.new_sb_created)
kernel/cpu.c | +26 -1

@@ -2373 +2373 @@
 	this_cpu_write(cpuhp_state.state, CPUHP_ONLINE);
 }
 
-enum cpu_mitigations cpu_mitigations __ro_after_init = CPU_MITIGATIONS_AUTO;
+/*
+ * These are used for a global "mitigations=" cmdline option for toggling
+ * optional CPU mitigations.
+ */
+enum cpu_mitigations {
+	CPU_MITIGATIONS_OFF,
+	CPU_MITIGATIONS_AUTO,
+	CPU_MITIGATIONS_AUTO_NOSMT,
+};
+
+static enum cpu_mitigations cpu_mitigations __ro_after_init =
+	CPU_MITIGATIONS_AUTO;
 
 static int __init mitigations_parse_cmdline(char *arg)
 {
@@ -2401 +2390 @@
 	return 0;
 }
 early_param("mitigations", mitigations_parse_cmdline);
+
+/* mitigations=off */
+bool cpu_mitigations_off(void)
+{
+	return cpu_mitigations == CPU_MITIGATIONS_OFF;
+}
+EXPORT_SYMBOL_GPL(cpu_mitigations_off);
+
+/* mitigations=auto,nosmt */
+bool cpu_mitigations_auto_nosmt(void)
+{
+	return cpu_mitigations == CPU_MITIGATIONS_AUTO_NOSMT;
+}
+EXPORT_SYMBOL_GPL(cpu_mitigations_auto_nosmt);
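The kernel/cpu.c hunk above moves the `cpu_mitigations` enum out of a header and adds accessors for the `mitigations=` boot parameter. A minimal user-space sketch of the keyword mapping that parameter implies; the kernel's real `mitigations_parse_cmdline()` accepts the same three keywords, while the fallback-to-auto behavior for unknown strings is a simplification here:

```c
#include <string.h>

/* Mirrors the enum introduced in the hunk above. */
enum cpu_mitigations {
	CPU_MITIGATIONS_OFF,
	CPU_MITIGATIONS_AUTO,
	CPU_MITIGATIONS_AUTO_NOSMT,
};

/*
 * Map a "mitigations=" value to the enum. Unknown values fall back to the
 * default, CPU_MITIGATIONS_AUTO (the kernel additionally logs a warning).
 */
enum cpu_mitigations parse_mitigations(const char *arg)
{
	if (strcmp(arg, "off") == 0)
		return CPU_MITIGATIONS_OFF;
	if (strcmp(arg, "auto,nosmt") == 0)
		return CPU_MITIGATIONS_AUTO_NOSMT;
	return CPU_MITIGATIONS_AUTO;
}
```

Keeping the enum private to cpu.c and exposing only the two boolean accessors (`cpu_mitigations_off()`, `cpu_mitigations_auto_nosmt()`) means callers never compare against the enum directly.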
kernel/events/core.c | +19 -4

@@ -1031 +1031 @@
 {
 }
 
-void
+static inline void
 perf_cgroup_switch(struct task_struct *task, struct task_struct *next)
 {
 }
@@ -10535 +10535 @@
 		goto err_ns;
 	}
 
+	/*
+	 * Disallow uncore-cgroup events, they don't make sense as the cgroup will
+	 * be different on other CPUs in the uncore mask.
+	 */
+	if (pmu->task_ctx_nr == perf_invalid_context && cgroup_fd != -1) {
+		err = -EINVAL;
+		goto err_pmu;
+	}
+
 	if (event->attr.aux_output &&
 	    !(pmu->capabilities & PERF_PMU_CAP_AUX_OUTPUT)) {
 		err = -EOPNOTSUPP;
@@ -11332 +11323 @@
 	int err;
 
 	/*
-	 * Get the target context (task or percpu):
+	 * Grouping is not supported for kernel events, neither is 'AUX',
+	 * make sure the caller's intentions are adjusted.
 	 */
+	if (attr->aux_output)
+		return ERR_PTR(-EINVAL);
 
 	event = perf_event_alloc(attr, cpu, task, NULL, NULL,
 				 overflow_handler, context, -1);
@@ -11348 +11336 @@
 	/* Mark owner so we could distinguish it from user events. */
 	event->owner = TASK_TOMBSTONE;
 
+	/*
+	 * Get the target context (task or percpu):
+	 */
 	ctx = find_get_context(event->pmu, task, event);
 	if (IS_ERR(ctx)) {
 		err = PTR_ERR(ctx);
@@ -11802 +11787 @@
 						   GFP_KERNEL);
 		if (!child_ctx->task_ctx_data) {
 			free_event(child_event);
-			return NULL;
+			return ERR_PTR(-ENOMEM);
 		}
 	}
 
@@ -11905 +11890 @@
 		if (IS_ERR(child_ctr))
 			return PTR_ERR(child_ctr);
 
-		if (sub->aux_event == parent_event &&
+		if (sub->aux_event == parent_event && child_ctr &&
 		    !perf_get_aux_event(child_ctr, leader))
 			return -EINVAL;
 	}
kernel/irq/irqdomain.c | +1 -1

@@ -51 +51 @@
 * @type:	Type of irqchip_fwnode. See linux/irqdomain.h
 * @name:	Optional user provided domain name
 * @id:		Optional user provided id if name != NULL
- * @data:	Optional user-provided data
+ * @pa:		Optional user-provided physical address
 *
 * Allocate a struct irqchip_fwid, and return a poiner to the embedded
 * fwnode_handle (or NULL on failure).
kernel/sched/core.c | +16 -7

@@ -1073 +1073 @@
 	task_rq_unlock(rq, p, &rf);
 }
 
+#ifdef CONFIG_UCLAMP_TASK_GROUP
 static inline void
 uclamp_update_active_tasks(struct cgroup_subsys_state *css,
 			   unsigned int clamps)
@@ -1092 +1091 @@
 	css_task_iter_end(&it);
 }
 
-#ifdef CONFIG_UCLAMP_TASK_GROUP
 static void cpu_util_update_eff(struct cgroup_subsys_state *css);
 static void uclamp_update_root_tg(void)
 {
@@ -3929 +3929 @@
 	}
 
 restart:
+#ifdef CONFIG_SMP
 	/*
-	 * Ensure that we put DL/RT tasks before the pick loop, such that they
-	 * can PULL higher prio tasks when we lower the RQ 'priority'.
+	 * We must do the balancing pass before put_next_task(), such
+	 * that when we release the rq->lock the task is in the same
+	 * state as before we took rq->lock.
+	 *
+	 * We can terminate the balance pass as soon as we know there is
+	 * a runnable task of @class priority or higher.
	 */
-	prev->sched_class->put_prev_task(rq, prev, rf);
-	if (!rq->nr_running)
-		newidle_balance(rq, rf);
+	for_class_range(class, prev->sched_class, &idle_sched_class) {
+		if (class->balance(rq, prev, rf))
+			break;
+	}
+#endif
+
+	put_prev_task(rq, prev);
 
 	for_each_class(class) {
 		p = class->pick_next_task(rq, NULL, NULL);
@@ -6210 +6201 @@
 	for_each_class(class) {
 		next = class->pick_next_task(rq, NULL, NULL);
 		if (next) {
-			next->sched_class->put_prev_task(rq, next, NULL);
+			next->sched_class->put_prev_task(rq, next);
 			return next;
 		}
 	}
kernel/sched/deadline.c | +20 -20

@@ -1691 +1691 @@
 	resched_curr(rq);
 }
 
+static int balance_dl(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
+{
+	if (!on_dl_rq(&p->dl) && need_pull_dl_task(rq, p)) {
+		/*
+		 * This is OK, because current is on_cpu, which avoids it being
+		 * picked for load-balance and preemption/IRQs are still
+		 * disabled avoiding further scheduler activity on it and we've
+		 * not yet started the picking loop.
+		 */
+		rq_unpin_lock(rq, rf);
+		pull_dl_task(rq);
+		rq_repin_lock(rq, rf);
+	}
+
+	return sched_stop_runnable(rq) || sched_dl_runnable(rq);
+}
 #endif /* CONFIG_SMP */
 
 /*
@@ -1774 +1758 @@
 pick_next_task_dl(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 {
 	struct sched_dl_entity *dl_se;
+	struct dl_rq *dl_rq = &rq->dl;
 	struct task_struct *p;
-	struct dl_rq *dl_rq;
 
 	WARN_ON_ONCE(prev || rf);
 
-	dl_rq = &rq->dl;
-
-	if (unlikely(!dl_rq->dl_nr_running))
+	if (!sched_dl_runnable(rq))
 		return NULL;
 
 	dl_se = pick_next_dl_entity(rq, dl_rq);
 	BUG_ON(!dl_se);
-
 	p = dl_task_of(dl_se);
-
 	set_next_task_dl(rq, p);
-
 	return p;
 }
 
-static void put_prev_task_dl(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
+static void put_prev_task_dl(struct rq *rq, struct task_struct *p)
 {
 	update_curr_dl(rq);
 
 	update_dl_rq_load_avg(rq_clock_pelt(rq), rq, 1);
 	if (on_dl_rq(&p->dl) && p->nr_cpus_allowed > 1)
 		enqueue_pushable_dl_task(rq, p);
-
-	if (rf && !on_dl_rq(&p->dl) && need_pull_dl_task(rq, p)) {
-		/*
-		 * This is OK, because current is on_cpu, which avoids it being
-		 * picked for load-balance and preemption/IRQs are still
-		 * disabled avoiding further scheduler activity on it and we've
-		 * not yet started the picking loop.
-		 */
-		rq_unpin_lock(rq, rf);
-		pull_dl_task(rq);
-		rq_repin_lock(rq, rf);
-	}
 }
 
 /*
@@ -2441 +2442 @@
 	.set_next_task		= set_next_task_dl,
 
 #ifdef CONFIG_SMP
+	.balance		= balance_dl,
 	.select_task_rq		= select_task_rq_dl,
 	.migrate_task_rq	= migrate_task_rq_dl,
 	.set_cpus_allowed	= set_cpus_allowed_dl,
···
 		resched_curr(rq);
 }

+static int balance_rt(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
+{
+	if (!on_rt_rq(&p->rt) && need_pull_rt_task(rq, p)) {
+		/*
+		 * This is OK, because current is on_cpu, which avoids it being
+		 * picked for load-balance and preemption/IRQs are still
+		 * disabled avoiding further scheduler activity on it and we've
+		 * not yet started the picking loop.
+		 */
+		rq_unpin_lock(rq, rf);
+		pull_rt_task(rq);
+		rq_repin_lock(rq, rf);
+	}
+
+	return sched_stop_runnable(rq) || sched_dl_runnable(rq) || sched_rt_runnable(rq);
+}
 #endif /* CONFIG_SMP */

 /*
···
 pick_next_task_rt(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 {
 	struct task_struct *p;
-	struct rt_rq *rt_rq = &rq->rt;

 	WARN_ON_ONCE(prev || rf);

-	if (!rt_rq->rt_queued)
+	if (!sched_rt_runnable(rq))
 		return NULL;

 	p = _pick_next_task_rt(rq);
-
 	set_next_task_rt(rq, p);
-
 	return p;
 }

-static void put_prev_task_rt(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
+static void put_prev_task_rt(struct rq *rq, struct task_struct *p)
 {
 	update_curr_rt(rq);

···
 	 */
 	if (on_rt_rq(&p->rt) && p->nr_cpus_allowed > 1)
 		enqueue_pushable_task(rq, p);
-
-	if (rf && !on_rt_rq(&p->rt) && need_pull_rt_task(rq, p)) {
-		/*
-		 * This is OK, because current is on_cpu, which avoids it being
-		 * picked for load-balance and preemption/IRQs are still
-		 * disabled avoiding further scheduler activity on it and we've
-		 * not yet started the picking loop.
-		 */
-		rq_unpin_lock(rq, rf);
-		pull_rt_task(rq);
-		rq_repin_lock(rq, rf);
-	}
 }

 #ifdef CONFIG_SMP
···
 	.set_next_task		= set_next_task_rt,

 #ifdef CONFIG_SMP
+	.balance		= balance_rt,
 	.select_task_rq		= select_task_rq_rt,
-
 	.set_cpus_allowed	= set_cpus_allowed_common,
 	.rq_online		= rq_online_rt,
 	.rq_offline		= rq_offline_rt,
···
 again:
 	rcu_read_lock();
 	h_cg = hugetlb_cgroup_from_task(current);
-	if (!css_tryget_online(&h_cg->css)) {
+	if (!css_tryget(&h_cg->css)) {
 		rcu_read_unlock();
 		goto again;
 	}
+16-12
mm/khugepaged.c
···
 				result = SCAN_FAIL;
 				goto xa_unlocked;
 			}
-		} else if (!PageUptodate(page)) {
-			xas_unlock_irq(&xas);
-			wait_on_page_locked(page);
-			if (!trylock_page(page)) {
-				result = SCAN_PAGE_LOCK;
-				goto xa_unlocked;
-			}
-			get_page(page);
-		} else if (PageDirty(page)) {
-			result = SCAN_FAIL;
-			goto xa_locked;
 		} else if (trylock_page(page)) {
 			get_page(page);
 			xas_unlock_irq(&xas);
···
 		 * without racing with truncate.
 		 */
 		VM_BUG_ON_PAGE(!PageLocked(page), page);
-		VM_BUG_ON_PAGE(!PageUptodate(page), page);
+
+		/* make sure the page is up to date */
+		if (unlikely(!PageUptodate(page))) {
+			result = SCAN_FAIL;
+			goto out_unlock;
+		}

 		/*
 		 * If file was truncated then extended, or hole-punched, before
···

 		if (page_mapping(page) != mapping) {
 			result = SCAN_TRUNCATED;
+			goto out_unlock;
+		}
+
+		if (!is_shmem && PageDirty(page)) {
+			/*
+			 * khugepaged only works on read-only fd, so this
+			 * page is dirty because it hasn't been flushed
+			 * since first write.
+			 */
+			result = SCAN_FAIL;
 			goto out_unlock;
 		}

+12-4
mm/madvise.c
···
 		ClearPageReferenced(page);
 		test_and_clear_page_young(page);
 		if (pageout) {
-			if (!isolate_lru_page(page))
-				list_add(&page->lru, &page_list);
+			if (!isolate_lru_page(page)) {
+				if (PageUnevictable(page))
+					putback_lru_page(page);
+				else
+					list_add(&page->lru, &page_list);
+			}
 		} else
 			deactivate_page(page);
 huge_unlock:
···
 		ClearPageReferenced(page);
 		test_and_clear_page_young(page);
 		if (pageout) {
-			if (!isolate_lru_page(page))
-				list_add(&page->lru, &page_list);
+			if (!isolate_lru_page(page)) {
+				if (PageUnevictable(page))
+					putback_lru_page(page);
+				else
+					list_add(&page->lru, &page_list);
+			}
 		} else
 			deactivate_page(page);
 	}
+1-1
mm/memcontrol.c
···
 			if (unlikely(!memcg))
 				memcg = root_mem_cgroup;
 		}
-	} while (!css_tryget_online(&memcg->css));
+	} while (!css_tryget(&memcg->css));
 	rcu_read_unlock();
 	return memcg;
 }
+28-17
mm/memory_hotplug.c
···
 	return 0;
 }

+static int check_no_memblock_for_node_cb(struct memory_block *mem, void *arg)
+{
+	int nid = *(int *)arg;
+
+	/*
+	 * If a memory block belongs to multiple nodes, the stored nid is not
+	 * reliable. However, such blocks are always online (e.g., cannot get
+	 * offlined) and, therefore, are still spanned by the node.
+	 */
+	return mem->nid == nid ? -EEXIST : 0;
+}
+
 /**
  * try_offline_node
  * @nid: the node ID
···
 void try_offline_node(int nid)
 {
 	pg_data_t *pgdat = NODE_DATA(nid);
-	unsigned long start_pfn = pgdat->node_start_pfn;
-	unsigned long end_pfn = start_pfn + pgdat->node_spanned_pages;
-	unsigned long pfn;
+	int rc;

-	for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
-		unsigned long section_nr = pfn_to_section_nr(pfn);
-
-		if (!present_section_nr(section_nr))
-			continue;
-
-		if (pfn_to_nid(pfn) != nid)
-			continue;
-
-		/*
-		 * some memory sections of this node are not removed, and we
-		 * can't offline node now.
-		 */
+	/*
+	 * If the node still spans pages (especially ZONE_DEVICE), don't
+	 * offline it. A node spans memory after move_pfn_range_to_zone(),
+	 * e.g., after the memory block was onlined.
+	 */
+	if (pgdat->node_spanned_pages)
 		return;
-	}
+
+	/*
+	 * Especially offline memory blocks might not be spanned by the
+	 * node. They will get spanned by the node once they get onlined.
+	 * However, they link to the node in sysfs and can get onlined later.
+	 */
+	rc = for_each_memory_block(&nid, check_no_memblock_for_node_cb);
+	if (rc)
+		return;

 	if (check_cpu_on_node(pgdat))
 		return;
+9-5
mm/mempolicy.c
···
  * 1 - there is unmovable page, but MPOL_MF_MOVE* & MPOL_MF_STRICT were
  *     specified.
  * 0 - queue pages successfully or no misplaced page.
- * -EIO - there is misplaced page and only MPOL_MF_STRICT was specified.
+ * errno - i.e. misplaced pages with MPOL_MF_STRICT specified (-EIO) or
+ *         memory range specified by nodemask and maxnode points outside
+ *         your accessible address space (-EFAULT)
  */
 static int
 queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end,
···
 			flags | MPOL_MF_INVERT, &pagelist);

 	if (ret < 0) {
-		err = -EIO;
+		err = ret;
 		goto up_out;
 	}
···

 		if ((ret > 0) || (nr_failed && (flags & MPOL_MF_STRICT)))
 			err = -EIO;
-	} else
-		putback_movable_pages(&pagelist);
-
+	} else {
 up_out:
+		if (!list_empty(&pagelist))
+			putback_movable_pages(&pagelist);
+	}
+
 	up_write(&mm->mmap_sem);
 mpol_out:
 	mpol_put(new);
+3-3
mm/page_io.c
···
 {
 	struct swap_info_struct *sis;
 	struct gendisk *disk;
+	swp_entry_t entry;

 	/*
 	 * There is no guarantee that the page is in swap cache - the software
···
 	 * we again wish to reclaim it.
 	 */
 	disk = sis->bdev->bd_disk;
-	if (disk->fops->swap_slot_free_notify) {
-		swp_entry_t entry;
+	entry.val = page_private(page);
+	if (disk->fops->swap_slot_free_notify && __swap_count(entry) == 1) {
 		unsigned long offset;

-		entry.val = page_private(page);
 		offset = swp_offset(entry);

 		SetPageDirty(page);
+9-30
mm/slub.c
···
 	void *old_tail = *tail ? *tail : *head;
 	int rsize;

-	if (slab_want_init_on_free(s)) {
-		void *p = NULL;
+	/* Head and tail of the reconstructed freelist */
+	*head = NULL;
+	*tail = NULL;

-		do {
-			object = next;
-			next = get_freepointer(s, object);
+	do {
+		object = next;
+		next = get_freepointer(s, object);
+
+		if (slab_want_init_on_free(s)) {
 			/*
 			 * Clear the object and the metadata, but don't touch
 			 * the redzone.
···
 				   : 0;
 			memset((char *)object + s->inuse, 0,
 			       s->size - s->inuse - rsize);
-			set_freepointer(s, object, p);
-			p = object;
-		} while (object != old_tail);
-	}

-/*
- * Compiler cannot detect this function can be removed if slab_free_hook()
- * evaluates to nothing. Thus, catch all relevant config debug options here.
- */
-#if defined(CONFIG_LOCKDEP)	|| \
-	defined(CONFIG_DEBUG_KMEMLEAK) || \
-	defined(CONFIG_DEBUG_OBJECTS_FREE) || \
-	defined(CONFIG_KASAN)
-
-	next = *head;
-
-	/* Head and tail of the reconstructed freelist */
-	*head = NULL;
-	*tail = NULL;
-
-	do {
-		object = next;
-		next = get_freepointer(s, object);
+		}
 		/* If object's reuse doesn't have to be delayed */
 		if (!slab_free_hook(s, object)) {
 			/* Move object to the new freelist */
···
 		*tail = NULL;

 	return *head != NULL;
-#else
-	return true;
-#endif
 }

 static void *setup_object(struct kmem_cache *s, struct page *page,
···
 	slave = dsa_to_port(ds, port)->slave;

 	err = br_vlan_get_pvid(slave, &pvid);
-	if (err < 0)
+	if (!pvid || err < 0)
 		/* There is no pvid on the bridge for this port, which is
 		 * perfectly valid. Nothing to restore, bye-bye!
 		 */
···
 		}

 		prepare_outbound_urb(ep, ctx);
+		/* can be stopped during prepare callback */
+		if (unlikely(!test_bit(EP_FLAG_RUNNING, &ep->flags)))
+			goto exit_clear;
 	} else {
 		retire_inbound_urb(ep, ctx);
 		/* can be stopped during retire callback */
+3-1
sound/usb/mixer.c
···
 	if (cval->min + cval->res < cval->max) {
 		int last_valid_res = cval->res;
 		int saved, test, check;
-		get_cur_mix_raw(cval, minchn, &saved);
+		if (get_cur_mix_raw(cval, minchn, &saved) < 0)
+			goto no_res_check;
 		for (;;) {
 			test = saved;
 			if (test < cval->max)
···
 			snd_usb_set_cur_mix_value(cval, minchn, 0, saved);
 	}

+no_res_check:
 	cval->initialized = 1;
 }
+2-2
sound/usb/quirks.c
···
 					NULL, USB_MS_MIDI_OUT_JACK);
 	if (!injd && !outjd)
 		return -ENODEV;
-	if (!(injd && snd_usb_validate_midi_desc(injd)) ||
-	    !(outjd && snd_usb_validate_midi_desc(outjd)))
+	if ((injd && !snd_usb_validate_midi_desc(injd)) ||
+	    (outjd && !snd_usb_validate_midi_desc(outjd)))
 		return -ENODEV;
 	if (injd && (injd->bLength < 5 ||
 		     (injd->bJackType != USB_MS_EMBEDDED &&
+3-3
sound/usb/validate.c
···
 	switch (v->protocol) {
 	case UAC_VERSION_1:
 	default:
-		/* bNrChannels, wChannelConfig, iChannelNames, bControlSize */
-		len += 1 + 2 + 1 + 1;
-		if (d->bLength < len) /* bControlSize */
+		/* bNrChannels, wChannelConfig, iChannelNames */
+		len += 1 + 2 + 1;
+		if (d->bLength < len + 1) /* bControlSize */
 			return false;
 		m = hdr[len];
 		len += 1 + m + 1; /* bControlSize, bmControls, iProcessing */
···
 	RET=0

 	ip link add dev br0 type bridge mcast_snooping 0
+	ip link add name dummy1 up type dummy

 	ip link add name vxlan0 up type vxlan id 10 nolearning noudpcsum \
 		ttl 20 tos inherit local 198.51.100.1 dstport 4789 \
-		dev $swp2 group 239.0.0.1
+		dev dummy1 group 239.0.0.1

 	sanitization_single_dev_test_fail

 	ip link del dev vxlan0
+	ip link del dev dummy1
 	ip link del dev br0

 	log_test "vxlan device with a multicast group"
···
 	RET=0

 	ip link add dev br0 type bridge mcast_snooping 0
+	ip link add name dummy1 up type dummy

 	ip link add name vxlan0 up type vxlan id 10 nolearning noudpcsum \
-		ttl 20 tos inherit local 198.51.100.1 dstport 4789 dev $swp2
+		ttl 20 tos inherit local 198.51.100.1 dstport 4789 dev dummy1

 	sanitization_single_dev_test_fail

 	ip link del dev vxlan0
+	ip link del dev dummy1
 	ip link del dev br0

 	log_test "vxlan device with local interface"
···
 }
 #endif

+static void show_flag_test(int rq_index, unsigned int flags, int err)
+{
+	printf("PTP_EXTTS_REQUEST%c flags 0x%08x : (%d) %s\n",
+	       rq_index ? '1' + rq_index : ' ',
+	       flags, err, strerror(errno));
+	/* sigh, uClibc ... */
+	errno = 0;
+}
+
+static void do_flag_test(int fd, unsigned int index)
+{
+	struct ptp_extts_request extts_request;
+	unsigned long request[2] = {
+		PTP_EXTTS_REQUEST,
+		PTP_EXTTS_REQUEST2,
+	};
+	unsigned int enable_flags[5] = {
+		PTP_ENABLE_FEATURE,
+		PTP_ENABLE_FEATURE | PTP_RISING_EDGE,
+		PTP_ENABLE_FEATURE | PTP_FALLING_EDGE,
+		PTP_ENABLE_FEATURE | PTP_RISING_EDGE | PTP_FALLING_EDGE,
+		PTP_ENABLE_FEATURE | (PTP_EXTTS_VALID_FLAGS + 1),
+	};
+	int err, i, j;
+
+	memset(&extts_request, 0, sizeof(extts_request));
+	extts_request.index = index;
+
+	for (i = 0; i < 2; i++) {
+		for (j = 0; j < 5; j++) {
+			extts_request.flags = enable_flags[j];
+			err = ioctl(fd, request[i], &extts_request);
+			show_flag_test(i, extts_request.flags, err);
+
+			extts_request.flags = 0;
+			err = ioctl(fd, request[i], &extts_request);
+		}
+	}
+}
+
 static clockid_t get_clockid(int fd)
 {
 #define CLOCKFD 3
···
 		" -s         set the ptp clock time from the system time\n"
 		" -S         set the system time from the ptp clock time\n"
 		" -t val     shift the ptp clock time by 'val' seconds\n"
-		" -T val     set the ptp clock time to 'val' seconds\n",
+		" -T val     set the ptp clock time to 'val' seconds\n"
+		" -z         test combinations of rising/falling external time stamp flags\n",
 		progname);
 }
···
 	int adjtime = 0;
 	int capabilities = 0;
 	int extts = 0;
+	int flagtest = 0;
 	int gettime = 0;
 	int index = 0;
 	int list_pins = 0;
···

 	progname = strrchr(argv[0], '/');
 	progname = progname ? 1+progname : argv[0];
-	while (EOF != (c = getopt(argc, argv, "cd:e:f:ghi:k:lL:p:P:sSt:T:v"))) {
+	while (EOF != (c = getopt(argc, argv, "cd:e:f:ghi:k:lL:p:P:sSt:T:z"))) {
 		switch (c) {
 		case 'c':
 			capabilities = 1;
···
 		case 'T':
 			settime = 3;
 			seconds = atoi(optarg);
+			break;
+		case 'z':
+			flagtest = 1;
 			break;
 		case 'h':
 			usage(progname);
···
 		if (ioctl(fd, PTP_EXTTS_REQUEST, &extts_request)) {
 			perror("PTP_EXTTS_REQUEST");
 		}
+	}
+
+	if (flagtest) {
+		do_flag_test(fd, index);
 	}

 	if (list_pins) {
+160-15
virt/kvm/kvm_main.c
···
 #include <linux/bsearch.h>
 #include <linux/io.h>
 #include <linux/lockdep.h>
+#include <linux/kthread.h>

 #include <asm/processor.h>
 #include <asm/ioctl.h>
···
 				  unsigned long arg);
 #define KVM_COMPAT(c)	.compat_ioctl	= (c)
 #else
+/*
+ * For architectures that don't implement a compat infrastructure,
+ * adopt a double line of defense:
+ * - Prevent a compat task from opening /dev/kvm
+ * - If the open has been done by a 64bit task, and the KVM fd
+ *   passed to a compat task, let the ioctls fail.
+ */
 static long kvm_no_compat_ioctl(struct file *file, unsigned int ioctl,
 				unsigned long arg) { return -EINVAL; }
-#define KVM_COMPAT(c)	.compat_ioctl	= kvm_no_compat_ioctl
+
+static int kvm_no_compat_open(struct inode *inode, struct file *file)
+{
+	return is_compat_task() ? -ENODEV : 0;
+}
+#define KVM_COMPAT(c)	.compat_ioctl	= kvm_no_compat_ioctl,	\
+			.open		= kvm_no_compat_open
 #endif
 static int hardware_enable_all(void);
 static void hardware_disable_all(void);
···
 	return 0;
 }

+bool kvm_is_zone_device_pfn(kvm_pfn_t pfn)
+{
+	/*
+	 * The metadata used by is_zone_device_page() to determine whether or
+	 * not a page is ZONE_DEVICE is guaranteed to be valid if and only if
+	 * the device has been pinned, e.g. by get_user_pages().  WARN if the
+	 * page_count() is zero to help detect bad usage of this helper.
+	 */
+	if (!pfn_valid(pfn) || WARN_ON_ONCE(!page_count(pfn_to_page(pfn))))
+		return false;
+
+	return is_zone_device_page(pfn_to_page(pfn));
+}
+
 bool kvm_is_reserved_pfn(kvm_pfn_t pfn)
 {
+	/*
+	 * ZONE_DEVICE pages currently set PG_reserved, but from a refcounting
+	 * perspective they are "normal" pages, albeit with slightly different
+	 * usage rules.
+	 */
 	if (pfn_valid(pfn))
-		return PageReserved(pfn_to_page(pfn));
+		return PageReserved(pfn_to_page(pfn)) &&
+		       !kvm_is_zone_device_pfn(pfn);

 	return true;
 }
···
 	return 0;
 }

+/*
+ * Called after the VM is otherwise initialized, but just before adding it to
+ * the vm_list.
+ */
+int __weak kvm_arch_post_init_vm(struct kvm *kvm)
+{
+	return 0;
+}
+
+/*
+ * Called just after removing the VM from the vm_list, but before doing any
+ * other destruction.
+ */
+void __weak kvm_arch_pre_destroy_vm(struct kvm *kvm)
+{
+}
+
 static struct kvm *kvm_create_vm(unsigned long type)
 {
 	struct kvm *kvm = kvm_arch_alloc_vm();
···

 	BUILD_BUG_ON(KVM_MEM_SLOTS_NUM > SHRT_MAX);

+	if (init_srcu_struct(&kvm->srcu))
+		goto out_err_no_srcu;
+	if (init_srcu_struct(&kvm->irq_srcu))
+		goto out_err_no_irq_srcu;
+
+	refcount_set(&kvm->users_count, 1);
 	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
 		struct kvm_memslots *slots = kvm_alloc_memslots();
···
 			goto out_err_no_arch_destroy_vm;
 	}

-	refcount_set(&kvm->users_count, 1);
 	r = kvm_arch_init_vm(kvm, type);
 	if (r)
 		goto out_err_no_arch_destroy_vm;
···
 	INIT_HLIST_HEAD(&kvm->irq_ack_notifier_list);
 #endif

-	if (init_srcu_struct(&kvm->srcu))
-		goto out_err_no_srcu;
-	if (init_srcu_struct(&kvm->irq_srcu))
-		goto out_err_no_irq_srcu;
-
 	r = kvm_init_mmu_notifier(kvm);
+	if (r)
+		goto out_err_no_mmu_notifier;
+
+	r = kvm_arch_post_init_vm(kvm);
 	if (r)
 		goto out_err;
···
 	return kvm;

 out_err:
-	cleanup_srcu_struct(&kvm->irq_srcu);
-out_err_no_irq_srcu:
-	cleanup_srcu_struct(&kvm->srcu);
-out_err_no_srcu:
+#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
+	if (kvm->mmu_notifier.ops)
+		mmu_notifier_unregister(&kvm->mmu_notifier, current->mm);
+#endif
+out_err_no_mmu_notifier:
 	hardware_disable_all();
 out_err_no_disable:
 	kvm_arch_destroy_vm(kvm);
-	WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
 out_err_no_arch_destroy_vm:
+	WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
 	for (i = 0; i < KVM_NR_BUSES; i++)
 		kfree(kvm_get_bus(kvm, i));
 	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++)
 		kvm_free_memslots(kvm, __kvm_memslots(kvm, i));
+	cleanup_srcu_struct(&kvm->irq_srcu);
+out_err_no_irq_srcu:
+	cleanup_srcu_struct(&kvm->srcu);
+out_err_no_srcu:
 	kvm_arch_free_vm(kvm);
 	mmdrop(current->mm);
 	return ERR_PTR(r);
···
 	mutex_lock(&kvm_lock);
 	list_del(&kvm->vm_list);
 	mutex_unlock(&kvm_lock);
+	kvm_arch_pre_destroy_vm(kvm);
+
 	kvm_free_irq_routing(kvm);
 	for (i = 0; i < KVM_NR_BUSES; i++) {
 		struct kvm_io_bus *bus = kvm_get_bus(kvm, i);
···

 void kvm_set_pfn_dirty(kvm_pfn_t pfn)
 {
-	if (!kvm_is_reserved_pfn(pfn)) {
+	if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn)) {
 		struct page *page = pfn_to_page(pfn);

 		SetPageDirty(page);
···

 void kvm_set_pfn_accessed(kvm_pfn_t pfn)
 {
-	if (!kvm_is_reserved_pfn(pfn))
+	if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn))
 		mark_page_accessed(pfn_to_page(pfn));
 }
 EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
···
 	kvm_vfio_ops_exit();
 }
 EXPORT_SYMBOL_GPL(kvm_exit);
+
+struct kvm_vm_worker_thread_context {
+	struct kvm *kvm;
+	struct task_struct *parent;
+	struct completion init_done;
+	kvm_vm_thread_fn_t thread_fn;
+	uintptr_t data;
+	int err;
+};
+
+static int kvm_vm_worker_thread(void *context)
+{
+	/*
+	 * The init_context is allocated on the stack of the parent thread, so
+	 * we have to locally copy anything that is needed beyond initialization
+	 */
+	struct kvm_vm_worker_thread_context *init_context = context;
+	struct kvm *kvm = init_context->kvm;
+	kvm_vm_thread_fn_t thread_fn = init_context->thread_fn;
+	uintptr_t data = init_context->data;
+	int err;
+
+	err = kthread_park(current);
+	/* kthread_park(current) is never supposed to return an error */
+	WARN_ON(err != 0);
+	if (err)
+		goto init_complete;
+
+	err = cgroup_attach_task_all(init_context->parent, current);
+	if (err) {
+		kvm_err("%s: cgroup_attach_task_all failed with err %d\n",
+			__func__, err);
+		goto init_complete;
+	}
+
+	set_user_nice(current, task_nice(init_context->parent));
+
+init_complete:
+	init_context->err = err;
+	complete(&init_context->init_done);
+	init_context = NULL;
+
+	if (err)
+		return err;
+
+	/* Wait to be woken up by the spawner before proceeding. */
+	kthread_parkme();
+
+	if (!kthread_should_stop())
+		err = thread_fn(kvm, data);
+
+	return err;
+}
+
+int kvm_vm_create_worker_thread(struct kvm *kvm, kvm_vm_thread_fn_t thread_fn,
+				uintptr_t data, const char *name,
+				struct task_struct **thread_ptr)
+{
+	struct kvm_vm_worker_thread_context init_context = {};
+	struct task_struct *thread;
+
+	*thread_ptr = NULL;
+	init_context.kvm = kvm;
+	init_context.parent = current;
+	init_context.thread_fn = thread_fn;
+	init_context.data = data;
+	init_completion(&init_context.init_done);
+
+	thread = kthread_run(kvm_vm_worker_thread, &init_context,
+			     "%s-%d", name, task_pid_nr(current));
+	if (IS_ERR(thread))
+		return PTR_ERR(thread);
+
+	/* kthread_run is never supposed to return NULL */
+	WARN_ON(thread == NULL);
+
+	wait_for_completion(&init_context.init_done);
+
+	if (!init_context.err)
+		*thread_ptr = thread;
+
+	return init_context.err;
+}