Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

KVM: x86: Initialize allow_smaller_maxphyaddr earlier in setup

Initialize allow_smaller_maxphyaddr during hardware setup as soon as KVM
knows whether or not TDP will be utilized. To avoid having to teach KVM's
emulator all about CET, KVM's upcoming CET virtualization support will be
mutually exclusive with allow_smaller_maxphyaddr, i.e. will disable SHSTK
and IBT if allow_smaller_maxphyaddr is enabled.

In general, allow_smaller_maxphyaddr should be initialized as soon as
possible since it's globally visible while its only input is whether or
not EPT/NPT is enabled. I.e. there's effectively zero risk of setting
allow_smaller_maxphyaddr too early, and substantial risk of setting it
too late.

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20250922184743.1745778-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

+23 -23
+15 -15
arch/x86/kvm/svm/svm.c
···
 			  get_npt_level(), PG_LEVEL_1G);
 	pr_info("Nested Paging %s\n", str_enabled_disabled(npt_enabled));
 
+	/*
+	 * It seems that on AMD processors PTE's accessed bit is
+	 * being set by the CPU hardware before the NPF vmexit.
+	 * This is not expected behaviour and our tests fail because
+	 * of it.
+	 * A workaround here is to disable support for
+	 * GUEST_MAXPHYADDR < HOST_MAXPHYADDR if NPT is enabled.
+	 * In this case userspace can know if there is support using
+	 * KVM_CAP_SMALLER_MAXPHYADDR extension and decide how to handle
+	 * it
+	 * If future AMD CPU models change the behaviour described above,
+	 * this variable can be changed accordingly
+	 */
+	allow_smaller_maxphyaddr = !npt_enabled;
+
 	/* Setup shadow_me_value and shadow_me_mask */
 	kvm_mmu_set_me_spte_mask(sme_me_mask, sme_me_mask);
 
···
 		pr_info("PMU virtualization is disabled\n");
 
 	svm_set_cpu_caps();
-
-	/*
-	 * It seems that on AMD processors PTE's accessed bit is
-	 * being set by the CPU hardware before the NPF vmexit.
-	 * This is not expected behaviour and our tests fail because
-	 * of it.
-	 * A workaround here is to disable support for
-	 * GUEST_MAXPHYADDR < HOST_MAXPHYADDR if NPT is enabled.
-	 * In this case userspace can know if there is support using
-	 * KVM_CAP_SMALLER_MAXPHYADDR extension and decide how to handle
-	 * it
-	 * If future AMD CPU models change the behaviour described above,
-	 * this variable can be changed accordingly
-	 */
-	allow_smaller_maxphyaddr = !npt_enabled;
 
 	kvm_caps.inapplicable_quirks &= ~KVM_X86_QUIRK_CD_NW_CLEARED;
 	return 0;
+8 -8
arch/x86/kvm/vmx/vmx.c
···
 		return -EOPNOTSUPP;
 	}
 
+	/*
+	 * Shadow paging doesn't have a (further) performance penalty
+	 * from GUEST_MAXPHYADDR < HOST_MAXPHYADDR so enable it
+	 * by default
+	 */
+	if (!enable_ept)
+		allow_smaller_maxphyaddr = true;
+
 	if (!cpu_has_vmx_ept_ad_bits() || !enable_ept)
 		enable_ept_ad_bits = 0;
 
···
 	}
 
 	vmx_check_vmcs12_offsets();
-
-	/*
-	 * Shadow paging doesn't have a (further) performance penalty
-	 * from GUEST_MAXPHYADDR < HOST_MAXPHYADDR so enable it
-	 * by default
-	 */
-	if (!enable_ept)
-		allow_smaller_maxphyaddr = true;
 
 	return 0;
 