Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

KVM: VMX: Inject #UD if guest tries to execute SEAMCALL or TDCALL

Add VMX exit handlers for SEAMCALL and TDCALL to inject a #UD if a non-TD
guest attempts to execute SEAMCALL or TDCALL. Neither SEAMCALL nor TDCALL
is gated by any software enablement other than VMXON, and so will generate
a VM-Exit instead of e.g. a native #UD when executed from the guest kernel.
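
As a rough illustration of the dispatch described above, here is a standalone sketch of an exit-handler table in plain C. It is not KVM code: `struct toy_vcpu`, `toy_queue_exception()`, `toy_handle_exit()`, and `TOY_MAX_EXIT_REASON` are hypothetical stand-ins invented for this example. Only the exit reason numbers (76 and 77, per the vmx.h hunk) and the #UD vector (6) are taken from the real definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Exit reason numbers as they appear in arch/x86/include/uapi/asm/vmx.h. */
#define EXIT_REASON_SEAMCALL	76
#define EXIT_REASON_TDCALL	77
#define UD_VECTOR		6	/* #UD is exception vector 6 on x86 */
#define TOY_MAX_EXIT_REASON	128	/* hypothetical table size */

/* Hypothetical stand-in for struct kvm_vcpu: tracks one pending exception. */
struct toy_vcpu {
	int exception_pending;
	int exception_vector;
};

typedef int (*toy_exit_handler)(struct toy_vcpu *vcpu);

/* Hypothetical stand-in for kvm_queue_exception(). */
static void toy_queue_exception(struct toy_vcpu *vcpu, int vector)
{
	vcpu->exception_pending = 1;
	vcpu->exception_vector = vector;
}

/* Same shape as the handler in the patch: queue #UD, then resume the guest. */
static int toy_handle_tdx_instruction(struct toy_vcpu *vcpu)
{
	toy_queue_exception(vcpu, UD_VECTOR);
	return 1;	/* 1 = handled, resume the guest */
}

/* Both exit reasons share one handler; all other slots stay NULL. */
static const toy_exit_handler toy_handlers[TOY_MAX_EXIT_REASON] = {
	[EXIT_REASON_SEAMCALL]	= toy_handle_tdx_instruction,
	[EXIT_REASON_TDCALL]	= toy_handle_tdx_instruction,
};

/*
 * Look up and run the handler for an exit reason. Returning -1 for a
 * missing handler is this sketch's own convention, not KVM's behavior.
 */
static int toy_handle_exit(struct toy_vcpu *vcpu, unsigned int reason)
{
	assert(reason < TOY_MAX_EXIT_REASON);
	if (!toy_handlers[reason])
		return -1;
	return toy_handlers[reason](vcpu);
}
```

Calling `toy_handle_exit()` with either of the two exit reasons leaves a pending #UD on the toy vCPU, which is the guest-visible effect the patch aims for.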

Note! No unprivileged DoS of the L1 kernel is possible as TDCALL and
SEAMCALL #GP at CPL > 0, and the CPL check is performed prior to the VMX
non-root (VM-Exit) check, i.e. userspace can't crash the VM. And for a
nested guest, KVM forwards unknown exits to L1, i.e. an L2 kernel can
crash itself, but not L1.
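
The check ordering in the note above can be sketched as a small decision function. This is a toy model under the assumption that the ordering is exactly as this changelog reports (CPL first, then VMX non-root). The enum, the function name, and the "other checks" bucket are hypothetical and do not correspond to anything in the SDM or in KVM.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes for executing SEAMCALL or TDCALL. */
enum toy_outcome {
	TOY_GP_FAULT,		/* #GP delivered without leaving the guest */
	TOY_VM_EXIT,		/* control transfers to the hypervisor */
	TOY_OTHER_CHECKS,	/* later checks, not modeled here */
};

/*
 * Apply the two checks in the order the changelog describes: the CPL
 * check comes first, so code at CPL > 0 never reaches the VM-Exit check.
 */
static enum toy_outcome toy_tdx_insn_outcome(int cpl, bool vmx_non_root)
{
	assert(cpl >= 0 && cpl <= 3);	/* x86 privilege levels are 0..3 */

	if (cpl > 0)
		return TOY_GP_FAULT;
	if (vmx_non_root)
		return TOY_VM_EXIT;
	return TOY_OTHER_CHECKS;
}
```

In this model, guest userspace (CPL 3 in VMX non-root) always lands on `TOY_GP_FAULT`, and only guest kernel code (CPL 0) produces `TOY_VM_EXIT`, which is why the exit cannot be triggered by an unprivileged process.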

Note #2! The Intel® Trust Domain CPU Architectural Extensions spec's
pseudocode shows the CPL > 0 check for SEAMCALL coming _after_ the VM-Exit,
but that appears to be a documentation bug (likely because the CPL > 0
check was incorrectly bundled with other lower-priority #GP checks).
Testing on SPR and EMR shows that the CPL > 0 check is performed before
the VMX non-root check, i.e. SEAMCALL #GPs when executed in usermode.

Note #3! The aforementioned Trust Domain spec uses confusing pseudocode
that says that SEAMCALL will #UD if executed "inSEAM", but "inSEAM"
specifically means in SEAM Root Mode, i.e. in the TDX-Module. The long-
form description explicitly states that SEAMCALL generates an exit when
executed in "SEAM VMX non-root operation". But that's a moot point as the
TDX-Module injects #UD if the guest attempts to execute SEAMCALL, as
documented in the "Unconditionally Blocked Instructions" section of the
TDX-Module base specification.

Cc: stable@vger.kernel.org
Cc: Kai Huang <kai.huang@intel.com>
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20251016182148.69085-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

Diffstat: 3 files changed, 17 insertions(+)

arch/x86/include/uapi/asm/vmx.h (+1)
···
 #define EXIT_REASON_TPAUSE              68
 #define EXIT_REASON_BUS_LOCK            74
 #define EXIT_REASON_NOTIFY              75
+#define EXIT_REASON_SEAMCALL            76
 #define EXIT_REASON_TDCALL              77
 #define EXIT_REASON_MSR_READ_IMM        84
 #define EXIT_REASON_MSR_WRITE_IMM       85

arch/x86/kvm/vmx/nested.c (+8)
···
 	case EXIT_REASON_NOTIFY:
 		/* Notify VM exit is not exposed to L1 */
 		return false;
+	case EXIT_REASON_SEAMCALL:
+	case EXIT_REASON_TDCALL:
+		/*
+		 * SEAMCALL and TDCALL unconditionally VM-Exit, but aren't
+		 * virtualized by KVM for L1 hypervisors, i.e. L1 should
+		 * never want or expect such an exit.
+		 */
+		return false;
 	default:
 		return true;
 	}

arch/x86/kvm/vmx/vmx.c (+8)
···
 	return 1;
 }
 
+static int handle_tdx_instruction(struct kvm_vcpu *vcpu)
+{
+	kvm_queue_exception(vcpu, UD_VECTOR);
+	return 1;
+}
+
 #ifndef CONFIG_X86_SGX_KVM
 static int handle_encls(struct kvm_vcpu *vcpu)
 {
···
 	[EXIT_REASON_ENCLS] = handle_encls,
 	[EXIT_REASON_BUS_LOCK] = handle_bus_lock_vmexit,
 	[EXIT_REASON_NOTIFY] = handle_notify,
+	[EXIT_REASON_SEAMCALL] = handle_tdx_instruction,
+	[EXIT_REASON_TDCALL] = handle_tdx_instruction,
 	[EXIT_REASON_MSR_READ_IMM] = handle_rdmsr_imm,
 	[EXIT_REASON_MSR_WRITE_IMM] = handle_wrmsr_imm,
 };