Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

x86/sev: Skip ROM range scans and validation for SEV-SNP guests

SEV-SNP requires encrypted memory to be validated before access.
Because the ROM memory range is not part of the e820 table, it is not
pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
to access this range, the guest must first validate the range.

The current SEV-SNP code does indeed scan the ROM range during early
boot and thus attempts to validate the ROM range in probe_roms().
However, this behavior is neither sufficient nor necessary for the
following reasons:

* With regard to sufficiency, if EFI_CONFIG_TABLES are not enabled and
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
falls in the ROM range) prior to validation.

For example, Project Oak Stage 0 provides a minimal guest firmware
that currently meets these configuration conditions, meaning guests
booting atop Oak Stage 0 firmware encounter a problematic call chain
during dmi_setup() -> dmi_scan_machine() that results in a crash
during boot if SEV-SNP is enabled.

* With regard to necessity, SEV-SNP guests generally read garbage
(which changes across boots) from the ROM range, meaning these scans
are unnecessary. The guest reads garbage because the legacy ROM range
is unencrypted data but is accessed via an encrypted PMD during early
boot (where the PMD is marked as encrypted due to potentially mapping
actually-encrypted data in other PMD-contained ranges).

In one exceptional case, EISA probing treats the ROM range as
unencrypted data, which is inconsistent with other probing.

Continuing to allow SEV-SNP guests to use garbage and to inconsistently
classify ROM range encryption status can trigger undesirable behavior.
For instance, if garbage bytes appear to be a valid signature, memory
may be unnecessarily reserved for the ROM range. Future code or other
use cases may result in more problematic (arbitrary) behavior that
should be avoided.
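
To make the "garbage bytes appear to be a valid signature" concern concrete, here is a minimal user-space sketch, not the kernel's code. It assumes the conventional option ROM signature (bytes 0x55, 0xAA, i.e. 0xaa55 read as a little-endian 16-bit value) and the 2048-byte scan stride visible in probe_roms(); the helper names `has_rom_signature` and `count_signature_hits` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Option ROM signature: bytes 0x55, 0xAA (0xaa55 as a little-endian u16). */
#define ROM_SIGNATURE 0xaa55u

/* Hypothetical helper: 1 if the two bytes at p form the ROM signature. */
static int has_rom_signature(const uint8_t *p)
{
	assert(p != NULL);
	uint16_t sig = (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
	return sig == ROM_SIGNATURE;
}

/*
 * Hypothetical helper: walk buf in fixed strides and count how many
 * windows start with the signature. With garbage input, any hit is a
 * false positive that a real scan would act upon.
 */
static size_t count_signature_hits(const uint8_t *buf, size_t len, size_t stride)
{
	size_t hits = 0;

	assert(buf != NULL && stride > 0);
	for (size_t off = 0; off + 2 <= len; off += stride)
		hits += (size_t)has_rom_signature(buf + off);
	return hits;
}
```

If the bytes read were uniformly random (an assumption, since the source only says they change across boots), each probed window would match with probability 1/65536, which is small but nonzero on every boot.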

While one solution would be to overhaul the early PMD mapping to always
treat the ROM region of the PMD as unencrypted, SEV-SNP guests do not
currently rely on data from the ROM region during early boot (and even
if they did, they would be mostly relying on garbage data anyway).

As a simpler solution, skip the ROM range scans (and the otherwise-
necessary range validation) during SEV-SNP guest early boot. The
potential SEV-SNP guest crash due to lack of ROM range validation is
thus avoided by simply not accessing the ROM range.

In most cases, skip the scans by overriding problematic x86_init
functions during sme_early_init() to SNP-safe variants, which can be
likened to x86_init overrides done for other platforms (e.g., Xen); such
overrides also avoid the spread of cc_platform_has() checks throughout
the tree.
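
The override pattern described above can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernel implementation: `struct init_resources`, `early_init()`, `setup_arch_sketch()`, and the counters are hypothetical stand-ins for x86_init.resources, sme_early_init(), setup_arch(), and the real ROM/DMI work, and `efi_config_tables` stands in for efi_enabled(EFI_CONFIG_TABLES).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's table of init hooks. */
struct init_resources {
	void (*probe_roms)(void);
	void (*dmi_setup)(void);
};

static int rom_scans;          /* counts simulated ROM range scans */
static int dmi_runs;           /* counts simulated DMI setups */
static bool efi_config_tables; /* stands in for efi_enabled(EFI_CONFIG_TABLES) */

static void default_probe_roms(void) { rom_scans++; }
static void default_dmi_setup(void) { dmi_runs++; }
static void init_noop(void) { }

/* SNP-safe variant: the decision is deferred until the hook is called. */
static void snp_safe_dmi_setup(void)
{
	if (efi_config_tables)
		default_dmi_setup();
}

static struct init_resources resources = {
	.probe_roms = default_probe_roms,
	.dmi_setup  = default_dmi_setup,
};

/* Early platform detection swaps hooks once; callers stay unchanged. */
static void early_init(bool snp_guest)
{
	if (snp_guest) {
		resources.probe_roms = init_noop;
		resources.dmi_setup  = snp_safe_dmi_setup;
	}
}

/* Generic setup path: no platform checks, only indirect calls. */
static void setup_arch_sketch(void)
{
	/* Hooks are replaced with no-ops, never with NULL. */
	assert(resources.probe_roms != NULL && resources.dmi_setup != NULL);
	resources.dmi_setup();
	resources.probe_roms();
}
```

The point of the design is that the generic path contains no guest-type checks, and the DMI decision is made at call time, after the EFI state is known, rather than when the hook is installed.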

In the exceptional EISA case, still use cc_platform_has() for the
simplest change, given (1) checks for guest type (e.g., Xen domain status)
are already performed here, and (2) these checks occur in a subsys
initcall instead of an x86_init function.

[ bp: Massage commit message, remove "we"s. ]

Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
Signed-off-by: Kevin Loughlin <kevinloughlin@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20240313121546.2964854-1-kevinloughlin@google.com

authored by Kevin Loughlin and committed by Borislav Petkov (AMD)
0f4a1e80 4969d75d

+39 -31
+2 -2
arch/x86/include/asm/sev.h
···
 					 unsigned long npages);
 void early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
 				 unsigned long npages);
-void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
 void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
 void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
 void snp_set_wakeup_secondary_cpu(void);
 bool snp_init(struct boot_params *bp);
 void __noreturn snp_abort(void);
+void snp_dmi_setup(void);
 int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct snp_guest_request_ioctl *rio);
 void snp_accept_memory(phys_addr_t start, phys_addr_t end);
 u64 snp_get_unsupported_features(u64 status);
···
 early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
 static inline void __init
 early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
-static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
 static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
 static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
 static inline void snp_set_wakeup_secondary_cpu(void) { }
 static inline bool snp_init(struct boot_params *bp) { return false; }
 static inline void snp_abort(void) { }
+static inline void snp_dmi_setup(void) { }
 static inline int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct snp_guest_request_ioctl *rio)
 {
 	return -ENOTTY;
+2 -1
arch/x86/include/asm/x86_init.h
···
  * @reserve_resources:		reserve the standard resources for the
  *				platform
  * @memory_setup:		platform specific memory setup
- *
+ * @dmi_setup:			platform specific DMI setup
  */
 struct x86_init_resources {
 	void (*probe_roms)(void);
 	void (*reserve_resources)(void);
 	char *(*memory_setup)(void);
+	void (*dmi_setup)(void);
 };
 
 /**
+2 -1
arch/x86/kernel/eisa.c
···
 /*
  * EISA specific code
  */
+#include <linux/cc_platform.h>
 #include <linux/ioport.h>
 #include <linux/eisa.h>
 #include <linux/io.h>
···
 {
 	void __iomem *p;
 
-	if (xen_pv_domain() && !xen_initial_domain())
+	if ((xen_pv_domain() && !xen_initial_domain()) || cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
 		return 0;
 
 	p = ioremap(0x0FFFD9, 4);
-10
arch/x86/kernel/probe_roms.c
···
 	unsigned char c;
 	int i;
 
-	/*
-	 * The ROM memory range is not part of the e820 table and is therefore not
-	 * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
-	 * memory, and SNP requires encrypted memory to be validated before access.
-	 * Do that here.
-	 */
-	snp_prep_memory(video_rom_resource.start,
-			((system_rom_resource.end + 1) - video_rom_resource.start),
-			SNP_PAGE_STATE_PRIVATE);
-
 	/* video rom */
 	upper = adapter_rom_resources[0].start;
 	for (start = video_rom_resource.start; start < upper; start += 2048) {
+1 -2
arch/x86/kernel/setup.c
···
 #include <linux/console.h>
 #include <linux/crash_dump.h>
 #include <linux/dma-map-ops.h>
-#include <linux/dmi.h>
 #include <linux/efi.h>
 #include <linux/ima.h>
 #include <linux/init_ohci1394_dma.h>
···
 	efi_init();
 
 	reserve_ibft_region();
-	dmi_setup();
+	x86_init.resources.dmi_setup();
 
 	/*
 	 * VMware detection requires dmi to be available, so this
+12 -15
arch/x86/kernel/sev.c
···
 #include <linux/platform_device.h>
 #include <linux/io.h>
 #include <linux/psp-sev.h>
+#include <linux/dmi.h>
 #include <uapi/linux/sev-guest.h>
 
 #include <asm/init.h>
···
 
 	/* Ask hypervisor to mark the memory pages shared in the RMP table. */
 	early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED);
-}
-
-void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
-{
-	unsigned long vaddr, npages;
-
-	vaddr = (unsigned long)__va(paddr);
-	npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
-
-	if (op == SNP_PAGE_STATE_PRIVATE)
-		early_snp_set_memory_private(vaddr, paddr, npages);
-	else if (op == SNP_PAGE_STATE_SHARED)
-		early_snp_set_memory_shared(vaddr, paddr, npages);
-	else
-		WARN(1, "invalid memory op %d\n", op);
 }
 
 static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
···
 void __head __noreturn snp_abort(void)
 {
 	sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+}
+
+/*
+ * SEV-SNP guests should only execute dmi_setup() if EFI_CONFIG_TABLES are
+ * enabled, as the alternative (fallback) logic for DMI probing in the legacy
+ * ROM region can cause a crash since this region is not pre-validated.
+ */
+void __init snp_dmi_setup(void)
+{
+	if (efi_enabled(EFI_CONFIG_TABLES))
+		dmi_setup();
 }
 
 static void dump_cpuid_table(void)
+2
arch/x86/kernel/x86_init.c
···
  *
  * For licencing details see kernel-base/COPYING
  */
+#include <linux/dmi.h>
 #include <linux/init.h>
 #include <linux/ioport.h>
 #include <linux/export.h>
···
 		.probe_roms		= probe_roms,
 		.reserve_resources	= reserve_standard_io_resources,
 		.memory_setup		= e820__memory_setup_default,
+		.dmi_setup		= dmi_setup,
 	},
 
 	.mpparse = {
+18
arch/x86/mm/mem_encrypt_amd.c
···
 	 */
 	if (sev_status & MSR_AMD64_SEV_ENABLED)
 		ia32_disable();
+
+	/*
+	 * Override init functions that scan the ROM region in SEV-SNP guests,
+	 * as this memory is not pre-validated and would thus cause a crash.
+	 */
+	if (sev_status & MSR_AMD64_SEV_SNP_ENABLED) {
+		x86_init.mpparse.find_mptable = x86_init_noop;
+		x86_init.pci.init_irq = x86_init_noop;
+		x86_init.resources.probe_roms = x86_init_noop;
+
+		/*
+		 * DMI setup behavior for SEV-SNP guests depends on
+		 * efi_enabled(EFI_CONFIG_TABLES), which hasn't been
+		 * parsed yet. snp_dmi_setup() will run after that
+		 * parsing has happened.
+		 */
+		x86_init.resources.dmi_setup = snp_dmi_setup;
+	}
 }
 
 void __init mem_encrypt_free_decrypted_mem(void)