Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'vmscape-for-linus-20250904' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull vmscape mitigation fixes from Dave Hansen:
"Mitigate vmscape issue with indirect branch predictor flushes.

vmscape is a vulnerability that essentially takes Spectre-v2 and
attacks host userspace from a guest. It particularly affects
hypervisors like QEMU.

Even if a hypervisor has no sensitive data like disk encryption keys,
guest userspace may be able to attack the guest kernel using the
hypervisor as a confused deputy.

There are many ways to mitigate vmscape using the existing Spectre-v2
defenses like IBRS variants or the IBPB flushes. This series focuses
solely on IBPB because it works universally across vendors and all
vulnerable processors. Further work doing vendor and model-specific
optimizations can build on top of this if needed / wanted.

Do the normal issue mitigation dance:

- Add the CPU bug boilerplate

- Add a list of vulnerable CPUs

- Use IBPB to flush the branch predictors after running guests"

* tag 'vmscape-for-linus-20250904' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vmscape: Add old Intel CPUs to affected list
x86/vmscape: Warn when STIBP is disabled with SMT
x86/bugs: Move cpu_bugs_smt_update() down
x86/vmscape: Enable the mitigation
x86/vmscape: Add conditional IBPB mitigation
x86/vmscape: Enumerate VMSCAPE bug
Documentation/hw-vuln: Add VMSCAPE documentation

+414 -113
+1
Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -586,6 +586,7 @@
 		/sys/devices/system/cpu/vulnerabilities/srbds
 		/sys/devices/system/cpu/vulnerabilities/tsa
 		/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
+		/sys/devices/system/cpu/vulnerabilities/vmscape
 Date:		January 2018
 Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
 Description:	Information about CPU vulnerabilities
+1
Documentation/admin-guide/hw-vuln/index.rst
@@ -26,3 +26,4 @@
    rsb
    old_microcode
    indirect-target-selection
+   vmscape
+110
Documentation/admin-guide/hw-vuln/vmscape.rst
@@ -0,0 +1,110 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+VMSCAPE
+=======
+
+VMSCAPE is a vulnerability that may allow a guest to influence the branch
+prediction in host userspace. It particularly affects hypervisors like QEMU.
+
+Even if a hypervisor has no sensitive data like disk encryption keys, guest
+userspace may be able to attack the guest kernel using the hypervisor as a
+confused deputy.
+
+Affected processors
+-------------------
+
+The following CPU families are affected by VMSCAPE:
+
+**Intel processors:**
+  - Skylake generation (parts without Enhanced IBRS)
+  - Cascade Lake generation (parts affected by ITS guest/host separation)
+  - Alder Lake and newer (parts affected by BHI)
+
+Note that BHI-affected parts that use the BHB-clearing software mitigation,
+e.g. Ice Lake, are not vulnerable to VMSCAPE.
+
+**AMD processors:**
+  - Zen series (families 0x17, 0x19, 0x1a)
+
+**Hygon processors:**
+  - Family 0x18
+
+Mitigation
+----------
+
+Conditional IBPB
+----------------
+
+The kernel tracks when a CPU has run a potentially malicious guest and issues
+an IBPB before the first exit to userspace after VM-exit. If userspace did
+not run between VM-exit and the next VM-entry, no IBPB is issued.
+
+Note that the existing userspace mitigations against Spectre-v2 are effective
+at protecting userspace from other processes, but they are insufficient to
+protect userspace VMMs from a malicious guest. This is because Spectre-v2
+mitigations are applied at context-switch time, while a userspace VMM can run
+after a VM-exit without a context switch.
+
+Vulnerability enumeration and mitigation are not applied inside a guest. This
+is because nested hypervisors should already be deploying IBPB to isolate
+themselves from nested guests.
+
+SMT considerations
+------------------
+
+When Simultaneous Multi-Threading (SMT) is enabled, hypervisors can be
+vulnerable to cross-thread attacks. For complete protection against VMSCAPE
+attacks in SMT environments, STIBP should be enabled.
+
+The kernel will issue a warning if SMT is enabled without adequate STIBP
+protection. The warning is not issued when:
+
+- SMT is disabled
+- STIBP is enabled system-wide
+- Intel eIBRS is enabled (which implies STIBP protection)
+
+System information and options
+------------------------------
+
+The sysfs file showing VMSCAPE mitigation status is:
+
+  /sys/devices/system/cpu/vulnerabilities/vmscape
+
+The possible values in this file are:
+
+ * 'Not affected':
+
+   The processor is not vulnerable to VMSCAPE attacks.
+
+ * 'Vulnerable':
+
+   The processor is vulnerable and no mitigation has been applied.
+
+ * 'Mitigation: IBPB before exit to userspace':
+
+   Conditional IBPB mitigation is enabled. The kernel tracks when a CPU has
+   run a potentially malicious guest and issues an IBPB before the first
+   exit to userspace after VM-exit.
+
+ * 'Mitigation: IBPB on VMEXIT':
+
+   IBPB is issued on every VM-exit. This occurs when other mitigations like
+   RETBLEED or SRSO are already issuing IBPB on VM-exit.
+
+Mitigation control on the kernel command line
+---------------------------------------------
+
+The mitigation can be controlled via the ``vmscape=`` command line parameter:
+
+ * ``vmscape=off``:
+
+   Disable the VMSCAPE mitigation.
+
+ * ``vmscape=ibpb``:
+
+   Enable conditional IBPB mitigation (default when CONFIG_MITIGATION_VMSCAPE=y).
+
+ * ``vmscape=force``:
+
+   Force vulnerability detection and mitigation even on processors that are
+   not known to be affected.
+11
Documentation/admin-guide/kernel-parameters.txt
@@ -3829,6 +3829,7 @@
 			srbds=off [X86,INTEL]
 			ssbd=force-off [ARM64]
 			tsx_async_abort=off [X86]
+			vmscape=off [X86]

 		Exceptions:
 			This does not have any effect on
@@ -8041,6 +8040,16 @@
 	vmpoff=		[KNL,S390] Perform z/VM CP command after power off.
 			Format: <command>
+
+	vmscape=	[X86] Controls mitigation for VMscape attacks.
+			VMscape attacks can leak information from a userspace
+			hypervisor to a guest via speculative side-channels.
+
+			off    - disable the mitigation
+			ibpb   - use Indirect Branch Prediction Barrier
+				 (IBPB) mitigation (default)
+			force  - force vulnerability detection even on
+				 unaffected processors

 	vsyscall=	[X86-64,EARLY]
 			Controls the behavior of vsyscalls (i.e. calls to
+9
arch/x86/Kconfig
@@ -2701,6 +2701,15 @@
 	  security vulnerability on AMD CPUs which can lead to forwarding of
 	  invalid info to subsequent instructions and thus can affect their
 	  timing and thereby cause a leakage.
+
+config MITIGATION_VMSCAPE
+	bool "Mitigate VMSCAPE"
+	depends on KVM
+	default y
+	help
+	  Enable mitigation for VMSCAPE attacks. VMSCAPE is a hardware security
+	  vulnerability on Intel and AMD CPUs that may allow a guest to do
+	  Spectre v2 style attacks on userspace hypervisor.
 endif

 config ARCH_HAS_ADD_PAGES
+2
arch/x86/include/asm/cpufeatures.h
@@ -495,6 +495,7 @@
 #define X86_FEATURE_TSA_SQ_NO		(21*32+11) /* AMD CPU not vulnerable to TSA-SQ */
 #define X86_FEATURE_TSA_L1_NO		(21*32+12) /* AMD CPU not vulnerable to TSA-L1 */
 #define X86_FEATURE_CLEAR_CPU_BUF_VM	(21*32+13) /* Clear CPU buffers using VERW before VMRUN */
+#define X86_FEATURE_IBPB_EXIT_TO_USER	(21*32+14) /* Use IBPB on exit-to-userspace, see VMSCAPE bug */

 /*
  * BUG word(s)
@@ -552,4 +553,5 @@
 #define X86_BUG_ITS			X86_BUG( 1*32+ 7) /* "its" CPU is affected by Indirect Target Selection */
 #define X86_BUG_ITS_NATIVE_ONLY		X86_BUG( 1*32+ 8) /* "its_native_only" CPU is affected by ITS, VMX is not affected */
 #define X86_BUG_TSA			X86_BUG( 1*32+ 9) /* "tsa" CPU is affected by Transient Scheduler Attacks */
+#define X86_BUG_VMSCAPE			X86_BUG( 1*32+10) /* "vmscape" CPU is affected by VMSCAPE attacks from guests */
 #endif /* _ASM_X86_CPUFEATURES_H */
+7
arch/x86/include/asm/entry-common.h
@@ -93,6 +93,13 @@
 	 * 8 (ia32) bits.
 	 */
 	choose_random_kstack_offset(rdtsc());
+
+	/* Avoid unnecessary reads of 'x86_ibpb_exit_to_user' */
+	if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER) &&
+	    this_cpu_read(x86_ibpb_exit_to_user)) {
+		indirect_branch_prediction_barrier();
+		this_cpu_write(x86_ibpb_exit_to_user, false);
+	}
 }
 #define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare
+2
arch/x86/include/asm/nospec-branch.h
@@ -530,6 +530,8 @@
 		: "memory");
 }

+DECLARE_PER_CPU(bool, x86_ibpb_exit_to_user);
+
 static inline void indirect_branch_prediction_barrier(void)
 {
 	asm_inline volatile(ALTERNATIVE("", "call write_ibpb", X86_FEATURE_IBPB)
+203 -82
arch/x86/kernel/cpu/bugs.c
@@
 static void __init its_apply_mitigation(void);
 static void __init tsa_select_mitigation(void);
 static void __init tsa_apply_mitigation(void);
+static void __init vmscape_select_mitigation(void);
+static void __init vmscape_update_mitigation(void);
+static void __init vmscape_apply_mitigation(void);

 /* The base value of the SPEC_CTRL MSR without task-specific bits set */
 u64 x86_spec_ctrl_base;
@@
 /* The current value of the SPEC_CTRL MSR with task-specific bits set */
 DEFINE_PER_CPU(u64, x86_spec_ctrl_current);
 EXPORT_PER_CPU_SYMBOL_GPL(x86_spec_ctrl_current);
+
+/*
+ * Set when the CPU has run a potentially malicious guest. An IBPB will
+ * be needed before running userspace. That IBPB will flush the branch
+ * predictor content.
+ */
+DEFINE_PER_CPU(bool, x86_ibpb_exit_to_user);
+EXPORT_PER_CPU_SYMBOL_GPL(x86_ibpb_exit_to_user);

 u64 x86_pred_cmd __ro_after_init = PRED_CMD_IBPB;
@@
 	its_select_mitigation();
 	bhi_select_mitigation();
 	tsa_select_mitigation();
+	vmscape_select_mitigation();

 	/*
 	 * After mitigations are selected, some may need to update their
@@
 	bhi_update_mitigation();
 	/* srso_update_mitigation() depends on retbleed_update_mitigation(). */
 	srso_update_mitigation();
+	vmscape_update_mitigation();

 	spectre_v1_apply_mitigation();
 	spectre_v2_apply_mitigation();
@@
 	its_apply_mitigation();
 	bhi_apply_mitigation();
 	tsa_apply_mitigation();
+	vmscape_apply_mitigation();
 }

 /*
@@
 	}
 }

-#define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n"
-#define TAA_MSG_SMT "TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.\n"
-#define MMIO_MSG_SMT "MMIO Stale Data CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/processor_mmio_stale_data.html for more details.\n"
-
-void cpu_bugs_smt_update(void)
-{
-	mutex_lock(&spec_ctrl_mutex);
-
-	if (sched_smt_active() && unprivileged_ebpf_enabled() &&
-	    spectre_v2_enabled == SPECTRE_V2_EIBRS_LFENCE)
-		pr_warn_once(SPECTRE_V2_EIBRS_LFENCE_EBPF_SMT_MSG);
-
-	switch (spectre_v2_user_stibp) {
-	case SPECTRE_V2_USER_NONE:
-		break;
-	case SPECTRE_V2_USER_STRICT:
-	case SPECTRE_V2_USER_STRICT_PREFERRED:
-		update_stibp_strict();
-		break;
-	case SPECTRE_V2_USER_PRCTL:
-	case SPECTRE_V2_USER_SECCOMP:
-		update_indir_branch_cond();
-		break;
-	}
-
-	switch (mds_mitigation) {
-	case MDS_MITIGATION_FULL:
-	case MDS_MITIGATION_AUTO:
-	case MDS_MITIGATION_VMWERV:
-		if (sched_smt_active() && !boot_cpu_has(X86_BUG_MSBDS_ONLY))
-			pr_warn_once(MDS_MSG_SMT);
-		update_mds_branch_idle();
-		break;
-	case MDS_MITIGATION_OFF:
-		break;
-	}
-
-	switch (taa_mitigation) {
-	case TAA_MITIGATION_VERW:
-	case TAA_MITIGATION_AUTO:
-	case TAA_MITIGATION_UCODE_NEEDED:
-		if (sched_smt_active())
-			pr_warn_once(TAA_MSG_SMT);
-		break;
-	case TAA_MITIGATION_TSX_DISABLED:
-	case TAA_MITIGATION_OFF:
-		break;
-	}
-
-	switch (mmio_mitigation) {
-	case MMIO_MITIGATION_VERW:
-	case MMIO_MITIGATION_AUTO:
-	case MMIO_MITIGATION_UCODE_NEEDED:
-		if (sched_smt_active())
-			pr_warn_once(MMIO_MSG_SMT);
-		break;
-	case MMIO_MITIGATION_OFF:
-		break;
-	}
-
-	switch (tsa_mitigation) {
-	case TSA_MITIGATION_USER_KERNEL:
-	case TSA_MITIGATION_VM:
-	case TSA_MITIGATION_AUTO:
-	case TSA_MITIGATION_FULL:
-		/*
-		 * TSA-SQ can potentially lead to info leakage between
-		 * SMT threads.
-		 */
-		if (sched_smt_active())
-			static_branch_enable(&cpu_buf_idle_clear);
-		else
-			static_branch_disable(&cpu_buf_idle_clear);
-		break;
-	case TSA_MITIGATION_NONE:
-	case TSA_MITIGATION_UCODE_NEEDED:
-		break;
-	}
-
-	mutex_unlock(&spec_ctrl_mutex);
-}
-
 #undef pr_fmt
 #define pr_fmt(fmt) "Speculative Store Bypass: " fmt
@@
 }

 #undef pr_fmt
+#define pr_fmt(fmt)	"VMSCAPE: " fmt
+
+enum vmscape_mitigations {
+	VMSCAPE_MITIGATION_NONE,
+	VMSCAPE_MITIGATION_AUTO,
+	VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER,
+	VMSCAPE_MITIGATION_IBPB_ON_VMEXIT,
+};
+
+static const char * const vmscape_strings[] = {
+	[VMSCAPE_MITIGATION_NONE]		= "Vulnerable",
+	/* [VMSCAPE_MITIGATION_AUTO] */
+	[VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER]	= "Mitigation: IBPB before exit to userspace",
+	[VMSCAPE_MITIGATION_IBPB_ON_VMEXIT]	= "Mitigation: IBPB on VMEXIT",
+};
+
+static enum vmscape_mitigations vmscape_mitigation __ro_after_init =
+	IS_ENABLED(CONFIG_MITIGATION_VMSCAPE) ? VMSCAPE_MITIGATION_AUTO : VMSCAPE_MITIGATION_NONE;
+
+static int __init vmscape_parse_cmdline(char *str)
+{
+	if (!str)
+		return -EINVAL;
+
+	if (!strcmp(str, "off")) {
+		vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
+	} else if (!strcmp(str, "ibpb")) {
+		vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER;
+	} else if (!strcmp(str, "force")) {
+		setup_force_cpu_bug(X86_BUG_VMSCAPE);
+		vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
+	} else {
+		pr_err("Ignoring unknown vmscape=%s option.\n", str);
+	}
+
+	return 0;
+}
+early_param("vmscape", vmscape_parse_cmdline);
+
+static void __init vmscape_select_mitigation(void)
+{
+	if (cpu_mitigations_off() ||
+	    !boot_cpu_has_bug(X86_BUG_VMSCAPE) ||
+	    !boot_cpu_has(X86_FEATURE_IBPB)) {
+		vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
+		return;
+	}
+
+	if (vmscape_mitigation == VMSCAPE_MITIGATION_AUTO)
+		vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER;
+}
+
+static void __init vmscape_update_mitigation(void)
+{
+	if (!boot_cpu_has_bug(X86_BUG_VMSCAPE))
+		return;
+
+	if (retbleed_mitigation == RETBLEED_MITIGATION_IBPB ||
+	    srso_mitigation == SRSO_MITIGATION_IBPB_ON_VMEXIT)
+		vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_ON_VMEXIT;
+
+	pr_info("%s\n", vmscape_strings[vmscape_mitigation]);
+}
+
+static void __init vmscape_apply_mitigation(void)
+{
+	if (vmscape_mitigation == VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER)
+		setup_force_cpu_cap(X86_FEATURE_IBPB_EXIT_TO_USER);
+}
+
+#undef pr_fmt
 #define pr_fmt(fmt) fmt
+
+#define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n"
+#define TAA_MSG_SMT "TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.\n"
+#define MMIO_MSG_SMT "MMIO Stale Data CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/processor_mmio_stale_data.html for more details.\n"
+#define VMSCAPE_MSG_SMT "VMSCAPE: SMT on, STIBP is required for full protection. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/vmscape.html for more details.\n"
+
+void cpu_bugs_smt_update(void)
+{
+	mutex_lock(&spec_ctrl_mutex);
+
+	if (sched_smt_active() && unprivileged_ebpf_enabled() &&
+	    spectre_v2_enabled == SPECTRE_V2_EIBRS_LFENCE)
+		pr_warn_once(SPECTRE_V2_EIBRS_LFENCE_EBPF_SMT_MSG);
+
+	switch (spectre_v2_user_stibp) {
+	case SPECTRE_V2_USER_NONE:
+		break;
+	case SPECTRE_V2_USER_STRICT:
+	case SPECTRE_V2_USER_STRICT_PREFERRED:
+		update_stibp_strict();
+		break;
+	case SPECTRE_V2_USER_PRCTL:
+	case SPECTRE_V2_USER_SECCOMP:
+		update_indir_branch_cond();
+		break;
+	}
+
+	switch (mds_mitigation) {
+	case MDS_MITIGATION_FULL:
+	case MDS_MITIGATION_AUTO:
+	case MDS_MITIGATION_VMWERV:
+		if (sched_smt_active() && !boot_cpu_has(X86_BUG_MSBDS_ONLY))
+			pr_warn_once(MDS_MSG_SMT);
+		update_mds_branch_idle();
+		break;
+	case MDS_MITIGATION_OFF:
+		break;
+	}
+
+	switch (taa_mitigation) {
+	case TAA_MITIGATION_VERW:
+	case TAA_MITIGATION_AUTO:
+	case TAA_MITIGATION_UCODE_NEEDED:
+		if (sched_smt_active())
+			pr_warn_once(TAA_MSG_SMT);
+		break;
+	case TAA_MITIGATION_TSX_DISABLED:
+	case TAA_MITIGATION_OFF:
+		break;
+	}
+
+	switch (mmio_mitigation) {
+	case MMIO_MITIGATION_VERW:
+	case MMIO_MITIGATION_AUTO:
+	case MMIO_MITIGATION_UCODE_NEEDED:
+		if (sched_smt_active())
+			pr_warn_once(MMIO_MSG_SMT);
+		break;
+	case MMIO_MITIGATION_OFF:
+		break;
+	}
+
+	switch (tsa_mitigation) {
+	case TSA_MITIGATION_USER_KERNEL:
+	case TSA_MITIGATION_VM:
+	case TSA_MITIGATION_AUTO:
+	case TSA_MITIGATION_FULL:
+		/*
+		 * TSA-SQ can potentially lead to info leakage between
+		 * SMT threads.
+		 */
+		if (sched_smt_active())
+			static_branch_enable(&cpu_buf_idle_clear);
+		else
+			static_branch_disable(&cpu_buf_idle_clear);
+		break;
+	case TSA_MITIGATION_NONE:
+	case TSA_MITIGATION_UCODE_NEEDED:
+		break;
+	}
+
+	switch (vmscape_mitigation) {
+	case VMSCAPE_MITIGATION_NONE:
+	case VMSCAPE_MITIGATION_AUTO:
+		break;
+	case VMSCAPE_MITIGATION_IBPB_ON_VMEXIT:
+	case VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER:
+		/*
+		 * Hypervisors can be attacked across-threads, warn for SMT when
+		 * STIBP is not already enabled system-wide.
+		 *
+		 * Intel eIBRS (!AUTOIBRS) implies STIBP on.
+		 */
+		if (!sched_smt_active() ||
+		    spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT ||
+		    spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED ||
+		    (spectre_v2_in_eibrs_mode(spectre_v2_enabled) &&
+		     !boot_cpu_has(X86_FEATURE_AUTOIBRS)))
+			break;
+		pr_warn_once(VMSCAPE_MSG_SMT);
+		break;
+	}
+
+	mutex_unlock(&spec_ctrl_mutex);
+}

 #ifdef CONFIG_SYSFS
@@
 	return sysfs_emit(buf, "%s\n", tsa_strings[tsa_mitigation]);
 }

+static ssize_t vmscape_show_state(char *buf)
+{
+	return sysfs_emit(buf, "%s\n", vmscape_strings[vmscape_mitigation]);
+}
+
 static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
 			       char *buf, unsigned int bug)
 {
@@
 	case X86_BUG_TSA:
 		return tsa_show_state(buf);
+
+	case X86_BUG_VMSCAPE:
+		return vmscape_show_state(buf);

 	default:
 		break;
@@
 ssize_t cpu_show_tsa(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	return cpu_show_common(dev, attr, buf, X86_BUG_TSA);
+}
+
+ssize_t cpu_show_vmscape(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	return cpu_show_common(dev, attr, buf, X86_BUG_VMSCAPE);
 }
 #endif
+55 -31
arch/x86/kernel/cpu/common.c
@@
 #define ITS_NATIVE_ONLY		BIT(9)
 /* CPU is affected by Transient Scheduler Attacks */
 #define TSA			BIT(10)
+/* CPU is affected by VMSCAPE */
+#define VMSCAPE			BIT(11)

 static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
-	VULNBL_INTEL_STEPS(INTEL_IVYBRIDGE,	X86_STEP_MAX,	SRBDS),
-	VULNBL_INTEL_STEPS(INTEL_HASWELL,	X86_STEP_MAX,	SRBDS),
-	VULNBL_INTEL_STEPS(INTEL_HASWELL_L,	X86_STEP_MAX,	SRBDS),
-	VULNBL_INTEL_STEPS(INTEL_HASWELL_G,	X86_STEP_MAX,	SRBDS),
-	VULNBL_INTEL_STEPS(INTEL_HASWELL_X,	X86_STEP_MAX,	MMIO),
-	VULNBL_INTEL_STEPS(INTEL_BROADWELL_D,	X86_STEP_MAX,	MMIO),
-	VULNBL_INTEL_STEPS(INTEL_BROADWELL_G,	X86_STEP_MAX,	SRBDS),
-	VULNBL_INTEL_STEPS(INTEL_BROADWELL_X,	X86_STEP_MAX,	MMIO),
-	VULNBL_INTEL_STEPS(INTEL_BROADWELL,	X86_STEP_MAX,	SRBDS),
-	VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X,	0x5,		MMIO | RETBLEED | GDS),
-	VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X,	X86_STEP_MAX,	MMIO | RETBLEED | GDS | ITS),
-	VULNBL_INTEL_STEPS(INTEL_SKYLAKE_L,	X86_STEP_MAX,	MMIO | RETBLEED | GDS | SRBDS),
-	VULNBL_INTEL_STEPS(INTEL_SKYLAKE,	X86_STEP_MAX,	MMIO | RETBLEED | GDS | SRBDS),
-	VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L,	0xb,		MMIO | RETBLEED | GDS | SRBDS),
-	VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L,	X86_STEP_MAX,	MMIO | RETBLEED | GDS | SRBDS | ITS),
-	VULNBL_INTEL_STEPS(INTEL_KABYLAKE,	0xc,		MMIO | RETBLEED | GDS | SRBDS),
-	VULNBL_INTEL_STEPS(INTEL_KABYLAKE,	X86_STEP_MAX,	MMIO | RETBLEED | GDS | SRBDS | ITS),
-	VULNBL_INTEL_STEPS(INTEL_CANNONLAKE_L,	X86_STEP_MAX,	RETBLEED),
+	VULNBL_INTEL_STEPS(INTEL_SANDYBRIDGE_X,	X86_STEP_MAX,	VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_SANDYBRIDGE,	X86_STEP_MAX,	VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_IVYBRIDGE_X,	X86_STEP_MAX,	VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_IVYBRIDGE,	X86_STEP_MAX,	SRBDS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_HASWELL,	X86_STEP_MAX,	SRBDS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_HASWELL_L,	X86_STEP_MAX,	SRBDS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_HASWELL_G,	X86_STEP_MAX,	SRBDS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_HASWELL_X,	X86_STEP_MAX,	MMIO | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_BROADWELL_D,	X86_STEP_MAX,	MMIO | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_BROADWELL_X,	X86_STEP_MAX,	MMIO | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_BROADWELL_G,	X86_STEP_MAX,	SRBDS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_BROADWELL,	X86_STEP_MAX,	SRBDS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X,	0x5,		MMIO | RETBLEED | GDS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X,	X86_STEP_MAX,	MMIO | RETBLEED | GDS | ITS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_SKYLAKE_L,	X86_STEP_MAX,	MMIO | RETBLEED | GDS | SRBDS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_SKYLAKE,	X86_STEP_MAX,	MMIO | RETBLEED | GDS | SRBDS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L,	0xb,		MMIO | RETBLEED | GDS | SRBDS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L,	X86_STEP_MAX,	MMIO | RETBLEED | GDS | SRBDS | ITS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_KABYLAKE,	0xc,		MMIO | RETBLEED | GDS | SRBDS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_KABYLAKE,	X86_STEP_MAX,	MMIO | RETBLEED | GDS | SRBDS | ITS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_CANNONLAKE_L,	X86_STEP_MAX,	RETBLEED | VMSCAPE),
 	VULNBL_INTEL_STEPS(INTEL_ICELAKE_L,	X86_STEP_MAX,	MMIO | MMIO_SBDS | RETBLEED | GDS | ITS | ITS_NATIVE_ONLY),
 	VULNBL_INTEL_STEPS(INTEL_ICELAKE_D,	X86_STEP_MAX,	MMIO | GDS | ITS | ITS_NATIVE_ONLY),
 	VULNBL_INTEL_STEPS(INTEL_ICELAKE_X,	X86_STEP_MAX,	MMIO | GDS | ITS | ITS_NATIVE_ONLY),
-	VULNBL_INTEL_STEPS(INTEL_COMETLAKE,	X86_STEP_MAX,	MMIO | MMIO_SBDS | RETBLEED | GDS | ITS),
-	VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L,	0x0,		MMIO | RETBLEED | ITS),
-	VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L,	X86_STEP_MAX,	MMIO | MMIO_SBDS | RETBLEED | GDS | ITS),
+	VULNBL_INTEL_STEPS(INTEL_COMETLAKE,	X86_STEP_MAX,	MMIO | MMIO_SBDS | RETBLEED | GDS | ITS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L,	0x0,		MMIO | RETBLEED | ITS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L,	X86_STEP_MAX,	MMIO | MMIO_SBDS | RETBLEED | GDS | ITS | VMSCAPE),
 	VULNBL_INTEL_STEPS(INTEL_TIGERLAKE_L,	X86_STEP_MAX,	GDS | ITS | ITS_NATIVE_ONLY),
 	VULNBL_INTEL_STEPS(INTEL_TIGERLAKE,	X86_STEP_MAX,	GDS | ITS | ITS_NATIVE_ONLY),
 	VULNBL_INTEL_STEPS(INTEL_LAKEFIELD,	X86_STEP_MAX,	MMIO | MMIO_SBDS | RETBLEED),
 	VULNBL_INTEL_STEPS(INTEL_ROCKETLAKE,	X86_STEP_MAX,	MMIO | RETBLEED | GDS | ITS | ITS_NATIVE_ONLY),
-	VULNBL_INTEL_TYPE(INTEL_ALDERLAKE,	ATOM,		RFDS),
-	VULNBL_INTEL_STEPS(INTEL_ALDERLAKE_L,	X86_STEP_MAX,	RFDS),
-	VULNBL_INTEL_TYPE(INTEL_RAPTORLAKE,	ATOM,		RFDS),
-	VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE_P,	X86_STEP_MAX,	RFDS),
-	VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE_S,	X86_STEP_MAX,	RFDS),
-	VULNBL_INTEL_STEPS(INTEL_ATOM_GRACEMONT, X86_STEP_MAX,	RFDS),
+	VULNBL_INTEL_TYPE(INTEL_ALDERLAKE,	ATOM,		RFDS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_ALDERLAKE,	X86_STEP_MAX,	VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_ALDERLAKE_L,	X86_STEP_MAX,	RFDS | VMSCAPE),
+	VULNBL_INTEL_TYPE(INTEL_RAPTORLAKE,	ATOM,		RFDS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE,	X86_STEP_MAX,	VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE_P,	X86_STEP_MAX,	RFDS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE_S,	X86_STEP_MAX,	RFDS | VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_METEORLAKE_L,	X86_STEP_MAX,	VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_ARROWLAKE_H,	X86_STEP_MAX,	VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_ARROWLAKE,	X86_STEP_MAX,	VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_ARROWLAKE_U,	X86_STEP_MAX,	VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_LUNARLAKE_M,	X86_STEP_MAX,	VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_SAPPHIRERAPIDS_X, X86_STEP_MAX, VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_GRANITERAPIDS_X, X86_STEP_MAX, VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_EMERALDRAPIDS_X, X86_STEP_MAX, VMSCAPE),
+	VULNBL_INTEL_STEPS(INTEL_ATOM_GRACEMONT, X86_STEP_MAX,	RFDS | VMSCAPE),
 	VULNBL_INTEL_STEPS(INTEL_ATOM_TREMONT,	X86_STEP_MAX,	MMIO | MMIO_SBDS | RFDS),
 	VULNBL_INTEL_STEPS(INTEL_ATOM_TREMONT_D, X86_STEP_MAX,	MMIO | RFDS),
 	VULNBL_INTEL_STEPS(INTEL_ATOM_TREMONT_L, X86_STEP_MAX,	MMIO | MMIO_SBDS | RFDS),
 	VULNBL_INTEL_STEPS(INTEL_ATOM_GOLDMONT,	X86_STEP_MAX,	RFDS),
 	VULNBL_INTEL_STEPS(INTEL_ATOM_GOLDMONT_D, X86_STEP_MAX,	RFDS),
 	VULNBL_INTEL_STEPS(INTEL_ATOM_GOLDMONT_PLUS, X86_STEP_MAX, RFDS),
+	VULNBL_INTEL_STEPS(INTEL_ATOM_CRESTMONT_X, X86_STEP_MAX, VMSCAPE),

 	VULNBL_AMD(0x15, RETBLEED),
 	VULNBL_AMD(0x16, RETBLEED),
-	VULNBL_AMD(0x17, RETBLEED | SMT_RSB | SRSO),
-	VULNBL_HYGON(0x18, RETBLEED | SMT_RSB | SRSO),
-	VULNBL_AMD(0x19, SRSO | TSA),
-	VULNBL_AMD(0x1a, SRSO),
+	VULNBL_AMD(0x17, RETBLEED | SMT_RSB | SRSO | VMSCAPE),
+	VULNBL_HYGON(0x18, RETBLEED | SMT_RSB | SRSO | VMSCAPE),
+	VULNBL_AMD(0x19, SRSO | TSA | VMSCAPE),
+	VULNBL_AMD(0x1a, SRSO | VMSCAPE),
 	{}
 };
@@
 			setup_force_cpu_bug(X86_BUG_TSA);
 		}
 	}
+
+	/*
+	 * Set the bug only on bare-metal. A nested hypervisor should already be
+	 * deploying IBPB to isolate itself from nested guests.
+	 */
+	if (cpu_matches(cpu_vuln_blacklist, VMSCAPE) &&
+	    !boot_cpu_has(X86_FEATURE_HYPERVISOR))
+		setup_force_cpu_bug(X86_BUG_VMSCAPE);

 	if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
 		return;
+9
arch/x86/kvm/x86.c
@@ -11011,6 +11011,15 @@
 	wrmsrq(MSR_IA32_XFD_ERR, 0);

 	/*
+	 * Mark this CPU as needing a branch predictor flush before running
+	 * userspace. Must be done before enabling preemption to ensure it gets
+	 * set for the CPU that actually ran the guest, and not the CPU that it
+	 * may migrate to.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER))
+		this_cpu_write(x86_ibpb_exit_to_user, true);
+
+	/*
 	 * Consume any pending interrupts, including the possible source of
 	 * VM-Exit on SVM and any ticks that occur between VM-Exit and now.
 	 * An instruction is required after local_irq_enable() to fully unblock
+3
drivers/base/cpu.c
@@ -603,6 +603,7 @@
 CPU_SHOW_VULN_FALLBACK(old_microcode);
 CPU_SHOW_VULN_FALLBACK(indirect_target_selection);
 CPU_SHOW_VULN_FALLBACK(tsa);
+CPU_SHOW_VULN_FALLBACK(vmscape);

 static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
 static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
@@
 static DEVICE_ATTR(old_microcode, 0444, cpu_show_old_microcode, NULL);
 static DEVICE_ATTR(indirect_target_selection, 0444, cpu_show_indirect_target_selection, NULL);
 static DEVICE_ATTR(tsa, 0444, cpu_show_tsa, NULL);
+static DEVICE_ATTR(vmscape, 0444, cpu_show_vmscape, NULL);

 static struct attribute *cpu_root_vulnerabilities_attrs[] = {
 	&dev_attr_meltdown.attr,
@@
 	&dev_attr_old_microcode.attr,
 	&dev_attr_indirect_target_selection.attr,
 	&dev_attr_tsa.attr,
+	&dev_attr_vmscape.attr,
 	NULL
 };
+1
include/linux/cpu.h
@@ -83,6 +83,7 @@
 extern ssize_t cpu_show_indirect_target_selection(struct device *dev,
 						  struct device_attribute *attr, char *buf);
 extern ssize_t cpu_show_tsa(struct device *dev, struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_vmscape(struct device *dev, struct device_attribute *attr, char *buf);

 extern __printf(4, 5)
 struct device *cpu_device_create(struct device *parent, void *drvdata,