Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

KVM: arm64: GICv4.1: Fix race with doorbell on VPE activation/deactivation

To save the vgic LPI pending state with GICv4.1, the VPEs must all be
unmapped from the ITSs so that the sGIC caches can be flushed.
The opposite is done once the state is saved.

This is all done by using the activate/deactivate irqdomain callbacks
directly from the vgic code. Crucially, this is done without holding
the irqdesc lock for the interrupts that represent the VPE. And these
callbacks are changing the state of the irqdesc. What could possibly
go wrong?

If a doorbell fires while we are messing with the irqdesc state,
it will acquire the lock and change the interrupt state concurrently.
Since we don't hold the lock, corruption occurs in the interrupt
state. Oh well.
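
One way such corruption can arise is a lost update. The sketch below is a
deterministic userspace model, not kernel code: struct mock_desc, the two
MOCK_* flags and both functions are hypothetical names invented for this
illustration, and the interleaving is forced by hand rather than produced
by a real interrupt. It only shows why an unlocked read-modify-write can
discard a change made by a path that does take the lock.

```c
#include <stdint.h>

/* Hypothetical state bits; these are not the kernel's irqdesc flags. */
#define MOCK_ACTIVATED	(1u << 0)
#define MOCK_INPROGRESS	(1u << 1)

/* Hypothetical stand-in for the per-interrupt descriptor state. */
struct mock_desc {
	uint32_t state;
};

/* Models the doorbell path, which would update the state under the lock. */
static void mock_doorbell_fires(struct mock_desc *d)
{
	d->state |= MOCK_INPROGRESS;
}

/*
 * Models a deactivate callback invoked without the lock. When
 * doorbell_races is non-zero, the doorbell update lands between the
 * read and the write-back, and is overwritten by the stale snapshot.
 */
static uint32_t mock_unlocked_deactivate(struct mock_desc *d, int doorbell_races)
{
	uint32_t snapshot = d->state;		/* 1: read, no lock held */

	if (doorbell_races)
		mock_doorbell_fires(d);		/* 2: concurrent update */

	d->state = snapshot & ~MOCK_ACTIVATED;	/* 3: stale write-back */
	return d->state;
}
```

With doorbell_races set, the final state no longer carries the bit set in
step 2; when the two paths are serialized instead, the bit survives.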

While acquiring the lock would fix this (and this was Shanker's
initial approach), this is still a layering violation we could do
without. A better approach is actually to free the VPE interrupt,
do what we have to do, and re-request it.
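
The free/work/re-request sequence can be sketched with a small userspace
model. Everything prefixed with mock_ is a hypothetical stand-in written
for this illustration (the mock free routine, for instance, omits the
dev_id argument that the kernel's free_irq() takes), and the model assumes
that a doorbell arriving while the line is free is simply not handled.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_NR_IRQS 4

typedef void (*mock_handler_t)(int irq, void *dev_id);

/* Hypothetical per-line bookkeeping; handler is NULL while the line is free. */
struct mock_irq_slot {
	mock_handler_t handler;
	void *dev_id;
};

static struct mock_irq_slot mock_irqs[MOCK_NR_IRQS];

static int mock_request_irq(int irq, mock_handler_t handler, void *dev_id)
{
	assert(irq >= 0 && irq < MOCK_NR_IRQS);
	if (mock_irqs[irq].handler)
		return -1;			/* line already requested */
	mock_irqs[irq].handler = handler;
	mock_irqs[irq].dev_id = dev_id;
	return 0;
}

static void mock_free_irq(int irq)
{
	assert(irq >= 0 && irq < MOCK_NR_IRQS);
	mock_irqs[irq].handler = NULL;
	mock_irqs[irq].dev_id = NULL;
}

/* Returns 1 if a handler ran, 0 if the interrupt found no handler. */
static int mock_fire(int irq)
{
	assert(irq >= 0 && irq < MOCK_NR_IRQS);
	if (!mock_irqs[irq].handler)
		return 0;
	mock_irqs[irq].handler(irq, mock_irqs[irq].dev_id);
	return 1;
}

/* Doorbell handler for the model: counts invocations through dev_id. */
static void mock_doorbell(int irq, void *dev_id)
{
	(void)irq;
	++*(int *)dev_id;
}

/*
 * Free the line, do the work, re-request the line. Returns the number
 * of doorbells handled inside the window, or -1 if re-requesting fails.
 */
static int mock_save_pending_tables(int irq, int *hits)
{
	int handled_in_window;

	mock_free_irq(irq);
	handled_in_window = mock_fire(irq);	/* doorbell during the window */
	/* ... the pending state would be saved here ... */
	if (mock_request_irq(irq, mock_doorbell, hits))
		return -1;
	return handled_in_window;
}
```

In this model a doorbell fired inside the window never reaches the
handler, so nothing can touch the per-interrupt state while the work is
in progress, and delivery resumes once the line has been re-requested.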

It is more work, but this usually happens only once in the lifetime
of the VM and we don't really care about this sort of overhead.

Fixes: f66b7b151e00 ("KVM: arm64: GICv4.1: Try to save VLPI state in save_pending_tables")
Reported-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230118022348.4137094-1-sdonthineni@nvidia.com

+18 -16
+11 -14
arch/arm64/kvm/vgic/vgic-v3.c
···
  * The deactivation of the doorbell interrupt will trigger the
  * unmapping of the associated vPE.
  */
-static void unmap_all_vpes(struct vgic_dist *dist)
+static void unmap_all_vpes(struct kvm *kvm)
 {
-	struct irq_desc *desc;
+	struct vgic_dist *dist = &kvm->arch.vgic;
 	int i;
 
-	for (i = 0; i < dist->its_vm.nr_vpes; i++) {
-		desc = irq_to_desc(dist->its_vm.vpes[i]->irq);
-		irq_domain_deactivate_irq(irq_desc_get_irq_data(desc));
-	}
+	for (i = 0; i < dist->its_vm.nr_vpes; i++)
+		free_irq(dist->its_vm.vpes[i]->irq, kvm_get_vcpu(kvm, i));
 }
 
-static void map_all_vpes(struct vgic_dist *dist)
+static void map_all_vpes(struct kvm *kvm)
 {
-	struct irq_desc *desc;
+	struct vgic_dist *dist = &kvm->arch.vgic;
 	int i;
 
-	for (i = 0; i < dist->its_vm.nr_vpes; i++) {
-		desc = irq_to_desc(dist->its_vm.vpes[i]->irq);
-		irq_domain_activate_irq(irq_desc_get_irq_data(desc), false);
-	}
+	for (i = 0; i < dist->its_vm.nr_vpes; i++)
+		WARN_ON(vgic_v4_request_vpe_irq(kvm_get_vcpu(kvm, i),
+						dist->its_vm.vpes[i]->irq));
 }
 
 /**
···
 	 * and enabling of the doorbells have already been done.
 	 */
 	if (kvm_vgic_global_state.has_gicv4_1) {
-		unmap_all_vpes(dist);
+		unmap_all_vpes(kvm);
 		vlpi_avail = true;
 	}
 
···
 
 out:
 	if (vlpi_avail)
-		map_all_vpes(dist);
+		map_all_vpes(kvm);
 
 	return ret;
 }
+6 -2
arch/arm64/kvm/vgic/vgic-v4.c
···
 	*val = !!(*ptr & mask);
 }
 
+int vgic_v4_request_vpe_irq(struct kvm_vcpu *vcpu, int irq)
+{
+	return request_irq(irq, vgic_v4_doorbell_handler, 0, "vcpu", vcpu);
+}
+
 /**
  * vgic_v4_init - Initialize the GICv4 data structures
  * @kvm:	Pointer to the VM being initialized
···
 		irq_flags &= ~IRQ_NOAUTOEN;
 		irq_set_status_flags(irq, irq_flags);
 
-		ret = request_irq(irq, vgic_v4_doorbell_handler,
-				  0, "vcpu", vcpu);
+		ret = vgic_v4_request_vpe_irq(vcpu, irq);
 		if (ret) {
 			kvm_err("failed to allocate vcpu IRQ%d\n", irq);
 			/*
+1
arch/arm64/kvm/vgic/vgic.h
···
 void vgic_v4_teardown(struct kvm *kvm);
 void vgic_v4_configure_vsgis(struct kvm *kvm);
 void vgic_v4_get_vlpi_state(struct vgic_irq *irq, bool *val);
+int vgic_v4_request_vpe_irq(struct kvm_vcpu *vcpu, int irq);
 
 #endif