Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
"Certain AMD processors are vulnerable to a cross-thread return address
predictions bug. When running in SMT mode and one of the sibling
threads transitions out of C0 state, the other thread gets access to
twice as many entries in the RSB, but unfortunately the predictions of
the now-halted logical processor are not purged. Therefore, the
executing processor could speculatively execute from locations that
the now-halted processor had trained the RSB on.

The Spectre v2 mitigations cover the Linux kernel, as it fills the RSB
when context switching to the idle thread. However, KVM allows a VMM
to prevent exiting guest mode when transitioning out of C0 via the
KVM_CAP_X86_DISABLE_EXITS capability. To mitigate the cross-thread
return address predictions bug, a VMM must not be allowed to override
the default behavior of intercepting C0 transitions.

These patches introduce a KVM module parameter that, if set, will
prevent the user from disabling the HLT, MWAIT and CSTATE exits"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
Documentation/hw-vuln: Add documentation for Cross-Thread Return Predictions
KVM: x86: Mitigate the cross-thread return address predictions bug
x86/speculation: Identify processors vulnerable to SMT RSB predictions

+133 -13
+92
Documentation/admin-guide/hw-vuln/cross-thread-rsb.rst
.. SPDX-License-Identifier: GPL-2.0

Cross-Thread Return Address Predictions
=======================================

Certain AMD and Hygon processors are subject to a cross-thread return address
predictions vulnerability. When running in SMT mode and one sibling thread
transitions out of C0 state, the other sibling thread could use return target
predictions from the sibling thread that transitioned out of C0.

The Spectre v2 mitigations protect the Linux kernel, as it fills the return
address prediction entries with safe targets when context switching to the idle
thread. However, KVM does allow a VMM to prevent exiting guest mode when
transitioning out of C0. This could result in a guest-controlled return target
being consumed by the sibling thread.

Affected processors
-------------------

The following CPUs are vulnerable:

- AMD Family 17h processors
- Hygon Family 18h processors

Related CVEs
------------

The following CVE entry is related to this issue:

============== =======================================
CVE-2022-27672 Cross-Thread Return Address Predictions
============== =======================================

Problem
-------

Affected SMT-capable processors support 1T and 2T modes of execution when SMT
is enabled. In 2T mode, both threads in a core are executing code. For the
processor core to enter 1T mode, it is required that one of the threads
requests to transition out of the C0 state. This can be communicated with the
HLT instruction or with an MWAIT instruction that requests non-C0.
When the thread re-enters the C0 state, the processor transitions back
to 2T mode, assuming the other thread is also still in C0 state.

In affected processors, the return address predictor (RAP) is partitioned
depending on the SMT mode. For instance, in 2T mode each thread uses a private
16-entry RAP, but in 1T mode, the active thread uses a 32-entry RAP. Upon
transition between 1T/2T mode, the RAP contents are not modified but the RAP
pointers (which control the next return target to use for predictions) may
change. This behavior may result in return targets from one SMT thread being
used by RET predictions in the sibling thread following a 1T/2T switch. In
particular, a RET instruction executed immediately after a transition to 1T may
use a return target from the thread that just became idle. In theory, this
could lead to information disclosure if the return targets used do not come
from trustworthy code.

Attack scenarios
----------------

An attack can be mounted on affected processors by performing a series of CALL
instructions with targeted return locations and then transitioning out of C0
state.

Mitigation mechanism
--------------------

Before entering idle state, the kernel context switches to the idle thread. The
context switch fills the RAP entries (referred to as the RSB in Linux) with safe
targets by performing a sequence of CALL instructions.

Prevent a guest VM from directly putting the processor into an idle state by
intercepting HLT and MWAIT instructions.

Both mitigations are required to fully address this issue.

Mitigation control on the kernel command line
---------------------------------------------

Use existing Spectre v2 mitigations that will fill the RSB on context switch.

Mitigation control for KVM - module parameter
---------------------------------------------

By default, the KVM hypervisor mitigates this issue by intercepting guest
attempts to transition out of C0. A VMM can use the KVM_CAP_X86_DISABLE_EXITS
capability to override those interceptions, but since this is not common, the
mitigation that covers this path is not enabled by default.

The mitigation for the KVM_CAP_X86_DISABLE_EXITS capability can be turned on
using the boolean module parameter mitigate_smt_rsb, e.g.:

	kvm.mitigate_smt_rsb=1
+1
Documentation/admin-guide/hw-vuln/index.rst
···
     core-scheduling.rst
     l1d_flush.rst
     processor_mmio_stale_data.rst
+    cross-thread-rsb.rst
+1
arch/x86/include/asm/cpufeatures.h
···
 #define X86_BUG_MMIO_UNKNOWN		X86_BUG(26) /* CPU is too old and its MMIO Stale Data status is unknown */
 #define X86_BUG_RETBLEED		X86_BUG(27) /* CPU is affected by RETBleed */
 #define X86_BUG_EIBRS_PBRSB		X86_BUG(28) /* EIBRS is vulnerable to Post Barrier RSB Predictions */
+#define X86_BUG_SMT_RSB			X86_BUG(29) /* CPU is vulnerable to Cross-Thread Return Address Predictions */

 #endif /* _ASM_X86_CPUFEATURES_H */
+7 -2
arch/x86/kernel/cpu/common.c
···
 #define MMIO_SBDS	BIT(2)
 /* CPU is affected by RETbleed, speculating where you would not expect it */
 #define RETBLEED	BIT(3)
+/* CPU is affected by SMT (cross-thread) return predictions */
+#define SMT_RSB		BIT(4)

 static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
 	VULNBL_INTEL_STEPPINGS(IVYBRIDGE,	X86_STEPPING_ANY,		SRBDS),
···
 	VULNBL_AMD(0x15, RETBLEED),
 	VULNBL_AMD(0x16, RETBLEED),
-	VULNBL_AMD(0x17, RETBLEED),
-	VULNBL_HYGON(0x18, RETBLEED),
+	VULNBL_AMD(0x17, RETBLEED | SMT_RSB),
+	VULNBL_HYGON(0x18, RETBLEED | SMT_RSB),
 	{}
 };
···
 	    !cpu_matches(cpu_vuln_whitelist, NO_EIBRS_PBRSB) &&
 	    !(ia32_cap & ARCH_CAP_PBRSB_NO))
 		setup_force_cpu_bug(X86_BUG_EIBRS_PBRSB);
+
+	if (cpu_matches(cpu_vuln_blacklist, SMT_RSB))
+		setup_force_cpu_bug(X86_BUG_SMT_RSB);

 	if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
 		return;
+32 -11
arch/x86/kvm/x86.c
···
 bool __read_mostly eager_page_split = true;
 module_param(eager_page_split, bool, 0644);

+/* Enable/disable SMT_RSB bug mitigation */
+bool __read_mostly mitigate_smt_rsb;
+module_param(mitigate_smt_rsb, bool, 0444);
+
 /*
  * Restoring the host value for MSRs that are only consumed when running in
  * usermode, e.g. SYSCALL MSRs and TSC_AUX, can be deferred until the CPU
···
 		r = KVM_CLOCK_VALID_FLAGS;
 		break;
 	case KVM_CAP_X86_DISABLE_EXITS:
-		r |= KVM_X86_DISABLE_EXITS_HLT | KVM_X86_DISABLE_EXITS_PAUSE |
-		     KVM_X86_DISABLE_EXITS_CSTATE;
-		if(kvm_can_mwait_in_guest())
-			r |= KVM_X86_DISABLE_EXITS_MWAIT;
+		r = KVM_X86_DISABLE_EXITS_PAUSE;
+
+		if (!mitigate_smt_rsb) {
+			r |= KVM_X86_DISABLE_EXITS_HLT |
+			     KVM_X86_DISABLE_EXITS_CSTATE;
+
+			if (kvm_can_mwait_in_guest())
+				r |= KVM_X86_DISABLE_EXITS_MWAIT;
+		}
 		break;
 	case KVM_CAP_X86_SMM:
 		if (!IS_ENABLED(CONFIG_KVM_SMM))
···
 		if (cap->args[0] & ~KVM_X86_DISABLE_VALID_EXITS)
 			break;

-		if ((cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT) &&
-			kvm_can_mwait_in_guest())
-			kvm->arch.mwait_in_guest = true;
-		if (cap->args[0] & KVM_X86_DISABLE_EXITS_HLT)
-			kvm->arch.hlt_in_guest = true;
 		if (cap->args[0] & KVM_X86_DISABLE_EXITS_PAUSE)
 			kvm->arch.pause_in_guest = true;
-		if (cap->args[0] & KVM_X86_DISABLE_EXITS_CSTATE)
-			kvm->arch.cstate_in_guest = true;
+
+#define SMT_RSB_MSG "This processor is affected by the Cross-Thread Return Predictions vulnerability. " \
+		    "KVM_CAP_X86_DISABLE_EXITS should only be used with SMT disabled or trusted guests."
+
+		if (!mitigate_smt_rsb) {
+			if (boot_cpu_has_bug(X86_BUG_SMT_RSB) && cpu_smt_possible() &&
+			    (cap->args[0] & ~KVM_X86_DISABLE_EXITS_PAUSE))
+				pr_warn_once(SMT_RSB_MSG);
+
+			if ((cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT) &&
+			    kvm_can_mwait_in_guest())
+				kvm->arch.mwait_in_guest = true;
+			if (cap->args[0] & KVM_X86_DISABLE_EXITS_HLT)
+				kvm->arch.hlt_in_guest = true;
+			if (cap->args[0] & KVM_X86_DISABLE_EXITS_CSTATE)
+				kvm->arch.cstate_in_guest = true;
+		}
+
 		r = 0;
 		break;
 	case KVM_CAP_MSR_PLATFORM_INFO:
···
 static int __init kvm_x86_init(void)
 {
 	kvm_mmu_x86_module_init();
+	mitigate_smt_rsb &= boot_cpu_has_bug(X86_BUG_SMT_RSB) && cpu_smt_possible();
 	return 0;
 }
 module_init(kvm_x86_init);