Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

KVM: arm64: Ensure a VMID is allocated before programming VTTBR_EL2

Vladimir reports that a race when attaching a VMID to a stage-2 MMU
sometimes results in a vCPU entering the guest with a VMID of 0:

  CPU1                                  CPU2

                                        kvm_arch_vcpu_ioctl_run
                                          vcpu_load            <= load VTTBR_EL2
                                                                  kvm_vmid->id = 0

  kvm_arch_vcpu_ioctl_run
    vcpu_load            <= load VTTBR_EL2
                            with kvm_vmid->id = 0
    kvm_arm_vmid_update  <= allocates fresh
                            kvm_vmid->id and
                            reload VTTBR_EL2

                                        kvm_arm_vmid_update  <= observes that kvm_vmid->id
                                                                already allocated,
                                                                skips reload VTTBR_EL2

Oh yeah, it's as bad as it looks. Remember that VHE loads the stage-2
MMU eagerly but a VMID only gets attached to the MMU later on in the
KVM_RUN loop.

Even in the "best case" where VTTBR_EL2 correctly gets reprogrammed
before entering the EL1&0 regime, there is a period of time where
hardware is configured with VMID 0. That's completely insane. So, rather
than decorating the 'late' binding with another hack, just allocate the
damn thing up front.
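The resulting ordering can be illustrated with a toy model in plain C. All names here (mock_mmu, vcpu_load(), load_stage2()) are invented stand-ins, not the kernel's definitions: the point is only that the VMID is allocated before the stage-2 base register is programmed, so hardware never observes VMID 0 for a guest.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the fix (invented names, not kernel code). */
struct mock_mmu {
	uint64_t vmid;   /* 0 means "not yet allocated" */
	uint64_t baddr;  /* stage-2 page table base */
};

static uint64_t next_vmid = 1;
static uint64_t vttbr;           /* stands in for VTTBR_EL2 */

static void vmid_update(struct mock_mmu *mmu)
{
	/* Allocate lazily, but always before the register is written. */
	if (mmu->vmid == 0)
		mmu->vmid = next_vmid++;
}

/* VTTBR_EL2 carries the VMID in its top bits, above the table base. */
static void load_stage2(const struct mock_mmu *mmu)
{
	vttbr = (mmu->vmid << 48) | mmu->baddr;
}

/* New ordering: allocate first, then program the register. */
static void vcpu_load(struct mock_mmu *mmu)
{
	vmid_update(mmu);
	load_stage2(mmu);
}
```

With the old ordering the two calls in vcpu_load() were effectively swapped, which is exactly the window the race diagram above exploits.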

Attaching a VMID from vcpu_load() is still rollover safe since
(surprise!) it'll always get called after a vCPU was preempted.
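Why calling kvm_arm_vmid_update() on every vcpu_load() stays cheap and rollover-safe can be sketched with a simplified, single-CPU model of the allocator's generation scheme (C11 atomics; vmid_update(), gen_matches() and new_vmid() are loose stand-ins for the kernel's helpers, the per-CPU active-VMID tracking is collapsed to one variable, and the slow-path lock is elided):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Loose sketch, not kernel code: VMIDs carry a generation in their
 * high bits; a VMID from a stale generation must be reallocated.
 */
#define VMID_GEN_SHIFT 16

static _Atomic uint64_t current_gen = 1ULL << VMID_GEN_SHIFT;
static _Atomic uint64_t active_vmid;    /* per-CPU in the real allocator */
static uint64_t next_id = 1;

static int gen_matches(uint64_t vmid)
{
	return (vmid & ~0xffffULL) ==
	       atomic_load_explicit(&current_gen, memory_order_relaxed);
}

static uint64_t new_vmid(void)
{
	return atomic_load(&current_gen) | next_id++;
}

static void vmid_update(_Atomic uint64_t *kvm_vmid)
{
	uint64_t vmid = atomic_load(kvm_vmid);
	uint64_t old = atomic_load(&active_vmid);

	/* Fast path: current-generation VMID; just mark it active again. */
	if (old != 0 && gen_matches(vmid) &&
	    atomic_compare_exchange_strong(&active_vmid, &old, vmid))
		return;

	/* Slow path (lock elided here): reallocate after a rollover. */
	if (!gen_matches(vmid)) {
		vmid = new_vmid();
		atomic_store(kvm_vmid, vmid);
	}
	atomic_store(&active_vmid, vmid);
}
```

The common case is the relaxed cmpxchg fast path; only a generation mismatch after a rollover takes the slow path and allocates a fresh VMID, which is exactly what a preempted vCPU needs on its next vcpu_load().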

Excuse me while I go find a brown paper bag.

Cc: stable@vger.kernel.org
Fixes: 934bf871f011 ("KVM: arm64: Load the stage-2 MMU context in kvm_vcpu_load_vhe()")
Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20250219220737.130842-1-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>

Authored by Oliver Upton, committed by Marc Zyngier
fa808ed4 102c51c5

3 files changed, 14 insertions(+), 21 deletions(-)

arch/arm64/include/asm/kvm_host.h (+1 -1)

@@ -1259,7 +1259,7 @@
 extern unsigned int __ro_after_init kvm_arm_vmid_bits;
 int __init kvm_arm_vmid_alloc_init(void);
 void __init kvm_arm_vmid_alloc_free(void);
-bool kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid);
+void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid);
 void kvm_arm_vmid_clear_active(void);
 
 static inline void kvm_arm_pvtime_vcpu_init(struct kvm_vcpu_arch *vcpu_arch)
arch/arm64/kvm/arm.c (+10 -12)

@@ -560,6 +560,16 @@
 	last_ran = this_cpu_ptr(mmu->last_vcpu_ran);
 
 	/*
+	 * Ensure a VMID is allocated for the MMU before programming VTTBR_EL2,
+	 * which happens eagerly in VHE.
+	 *
+	 * Also, the VMID allocator only preserves VMIDs that are active at the
+	 * time of rollover, so KVM might need to grab a new VMID for the MMU if
+	 * this is called from kvm_sched_in().
+	 */
+	kvm_arm_vmid_update(&mmu->vmid);
+
+	/*
 	 * We guarantee that both TLBs and I-cache are private to each
 	 * vcpu. If detecting that a vcpu from the same VM has
 	 * previously run on the same physical CPU, call into the
@@ -1147,18 +1137,6 @@
 		 * non-preemptible context.
 		 */
 		preempt_disable();
-
-		/*
-		 * The VMID allocator only tracks active VMIDs per
-		 * physical CPU, and therefore the VMID allocated may not be
-		 * preserved on VMID roll-over if the task was preempted,
-		 * making a thread's VMID inactive. So we need to call
-		 * kvm_arm_vmid_update() in non-premptible context.
-		 */
-		if (kvm_arm_vmid_update(&vcpu->arch.hw_mmu->vmid) &&
-		    has_vhe())
-			__load_stage2(vcpu->arch.hw_mmu,
-				      vcpu->arch.hw_mmu->arch);
 
 		kvm_pmu_flush_hwstate(vcpu);
 
arch/arm64/kvm/vmid.c (+3 -8)

@@ -135,11 +135,10 @@
 	atomic64_set(this_cpu_ptr(&active_vmids), VMID_ACTIVE_INVALID);
 }
 
-bool kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid)
+void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid)
 {
 	unsigned long flags;
 	u64 vmid, old_active_vmid;
-	bool updated = false;
 
 	vmid = atomic64_read(&kvm_vmid->id);
 
@@ -156,21 +157,17 @@
 	if (old_active_vmid != 0 && vmid_gen_match(vmid) &&
 	    0 != atomic64_cmpxchg_relaxed(this_cpu_ptr(&active_vmids),
 					  old_active_vmid, vmid))
-		return false;
+		return;
 
 	raw_spin_lock_irqsave(&cpu_vmid_lock, flags);
 
 	/* Check that our VMID belongs to the current generation. */
 	vmid = atomic64_read(&kvm_vmid->id);
-	if (!vmid_gen_match(vmid)) {
+	if (!vmid_gen_match(vmid))
 		vmid = new_vmid(kvm_vmid);
-		updated = true;
-	}
 
 	atomic64_set(this_cpu_ptr(&active_vmids), vmid);
 	raw_spin_unlock_irqrestore(&cpu_vmid_lock, flags);
-
-	return updated;
 }
 
 /*