Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'kvm-arm-for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM Changes for v4.4-rc1

Includes a number of fixes for the arch-timer, introducing proper
level-triggered semantics for the arch-timers, a series of patches to
synchronously halt a guest (prerequisite for IRQ forwarding), some tracepoint
improvements, a tweak for the EL2 panic handlers, some more VGIC cleanups
getting rid of redundant state, and finally a stylistic change that gets rid of
some ctags warnings.

Conflicts:
arch/x86/include/asm/kvm_host.h

+678 -299
+187
Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
KVM/ARM VGIC Forwarded Physical Interrupts
==========================================

The KVM/ARM code implements software support for the ARM Generic
Interrupt Controller's (GIC's) hardware support for virtualization by
allowing software to inject virtual interrupts to a VM, which the guest
OS sees as regular interrupts.  The code is famously known as the VGIC.

Some of these virtual interrupts, however, correspond to physical
interrupts from real physical devices.  One example could be the
architected timer, which itself supports virtualization, and therefore
lets a guest OS program the hardware device directly to raise an
interrupt at some point in time.  When such an interrupt is raised, the
host OS initially handles the interrupt and must somehow signal this
event as a virtual interrupt to the guest.  Another example could be a
passthrough device, where the physical interrupts are initially handled
by the host, but the device driver for the device lives in the guest OS
and KVM must therefore somehow inject a virtual interrupt on behalf of
the physical one to the guest OS.

These virtual interrupts corresponding to a physical interrupt on the
host are called forwarded physical interrupts, but are also sometimes
referred to as 'virtualized physical interrupts' and 'mapped interrupts'.

Forwarded physical interrupts are handled slightly differently compared
to virtual interrupts generated purely by a software emulated device.


The HW bit
----------
Virtual interrupts are signalled to the guest by programming the List
Registers (LRs) on the GIC before running a VCPU.  The LR is programmed
with the virtual IRQ number and the state of the interrupt (Pending,
Active, or Pending+Active).  When the guest ACKs and EOIs a virtual
interrupt, the LR state moves from Pending to Active, and finally to
inactive.

The LRs include an extra bit, called the HW bit.  When this bit is set,
KVM must also program an additional field in the LR, the physical IRQ
number, to link the virtual with the physical IRQ.

When the HW bit is set, KVM must EITHER set the Pending OR the Active
bit, never both at the same time.

Setting the HW bit causes the hardware to deactivate the physical
interrupt on the physical distributor when the guest deactivates the
corresponding virtual interrupt.


Forwarded Physical Interrupts Life Cycle
----------------------------------------

The state of forwarded physical interrupts is managed in the following way:

  - The physical interrupt is acked by the host, and becomes active on
    the physical distributor (*).
  - KVM sets the LR.Pending bit, because this is the only way the GICV
    interface is going to present it to the guest.
  - LR.Pending will stay set as long as the guest has not acked the interrupt.
  - LR.Pending transitions to LR.Active on the guest read of the IAR, as
    expected.
  - On guest EOI, the *physical distributor* active bit gets cleared,
    but the LR.Active is left untouched (set).
  - KVM clears the LR on VM exits when the physical distributor
    active state has been cleared.

(*): The host handling is slightly more complicated.  For some forwarded
interrupts (shared), KVM directly sets the active state on the physical
distributor before entering the guest, because the interrupt is never actually
handled on the host (see details on the timer as an example below).  For other
forwarded interrupts (non-shared) the host does not deactivate the interrupt
when the host ISR completes, but leaves the interrupt active until the guest
deactivates it.  Leaving the interrupt active is allowed, because Linux
configures the physical GIC with EOIMode=1, which causes EOI operations to
perform a priority drop allowing the GIC to receive other interrupts of the
default priority.


Forwarded Edge and Level Triggered PPIs and SPIs
------------------------------------------------
A forwarded physical interrupt should always be active on the
physical distributor when injected to a guest.

Level-triggered interrupts will keep the interrupt line to the GIC
asserted, typically until the guest programs the device to deassert the
line.  This means that the interrupt will remain pending on the physical
distributor until the guest has reprogrammed the device.  Since we
always run the VM with interrupts enabled on the CPU, a pending
interrupt will exit the guest as soon as we switch into the guest,
preventing the guest from ever making progress as the process repeats
over and over.  Therefore, the active state on the physical distributor
must be set when entering the guest, preventing the GIC from forwarding
the pending interrupt to the CPU.  As soon as the guest deactivates the
interrupt, the physical line is sampled by the hardware again and the host
takes a new interrupt if and only if the physical line is still asserted.

Edge-triggered interrupts do not exhibit the same problem with
preventing guest execution that level-triggered interrupts do.  One
option is to not use the HW bit at all, and inject edge-triggered interrupts
from a physical device as pure virtual interrupts.  But that would
potentially slow down handling of the interrupt in the guest, because a
physical interrupt occurring in the middle of the guest ISR would
preempt the guest for the host to handle the interrupt.  Additionally,
if you configure the system to handle interrupts on a separate physical
core from that running your VCPU, you still have to interrupt the VCPU
to queue the pending state onto the LR, even though the guest won't use
this information until the guest ISR completes.  Therefore, the HW
bit should always be set for forwarded edge-triggered interrupts.  With
the HW bit set, the virtual interrupt is injected and additional
physical interrupts occurring before the guest deactivates the interrupt
simply mark the state on the physical distributor as Pending+Active.  As
soon as the guest deactivates the interrupt, the host takes another
interrupt if and only if there was a physical interrupt between injecting
the forwarded interrupt to the guest and the guest deactivating the
interrupt.

Consequently, whenever we schedule a VCPU with one or more LRs with the
HW bit set, the interrupt must also be active on the physical
distributor.


Forwarded LPIs
--------------
LPIs, introduced in GICv3, are always edge-triggered and do not have an
active state.  They become pending when a device signals them, and as
soon as they are acked by the CPU, they are inactive again.

It therefore doesn't make sense, and is not supported, to set the HW bit
for physical LPIs that are forwarded to a VM as virtual interrupts,
typically virtual SPIs.

For LPIs, there is no other choice than to preempt the VCPU thread if
necessary, and queue the pending state onto the LR.


Putting It Together: The Architected Timer
------------------------------------------
The architected timer is a device that signals interrupts with level
triggered semantics.  The timer hardware is directly accessed by VCPUs
which program the timer to fire at some point in time.  Each VCPU on a
system programs the timer to fire at different times, and therefore the
hardware is multiplexed between multiple VCPUs.  This is implemented by
context-switching the timer state along with each VCPU thread.

However, this means that a scenario like the following is entirely
possible, and in fact, typical:

  1.  KVM runs the VCPU
  2.  The guest programs the timer to fire at T+100
  3.  The guest is idle and calls WFI (wait-for-interrupts)
  4.  The hardware traps to the host
  5.  KVM stores the timer state to memory and disables the hardware timer
  6.  KVM schedules a soft timer to fire in T+(100 - time since step 2)
  7.  KVM puts the VCPU thread to sleep (on a waitqueue)
  8.  The soft timer fires, waking up the VCPU thread
  9.  KVM reprograms the timer hardware with the VCPU's values
  10. KVM marks the timer interrupt as active on the physical distributor
  11. KVM injects a forwarded physical interrupt to the guest
  12. KVM runs the VCPU

Notice that KVM injects a forwarded physical interrupt in step 11 without
the corresponding interrupt having actually fired on the host.  That is
exactly why we mark the timer interrupt as active in step 10, because
the active state on the physical distributor is part of the state
belonging to the timer hardware, which is context-switched along with
the VCPU thread.

If the guest does not idle because it is busy, the flow looks like this
instead:

  1.  KVM runs the VCPU
  2.  The guest programs the timer to fire at T+100
  3.  At T+100 the timer fires and a physical IRQ causes the VM to exit
      (note that this initially only traps to EL2 and does not run the host ISR
      until KVM has returned to the host).
  4.  With interrupts still disabled on the CPU coming back from the guest, KVM
      stores the virtual timer state to memory and disables the virtual hw timer.
  5.  KVM looks at the timer state (in memory) and injects a forwarded physical
      interrupt because it concludes the timer has expired.
  6.  KVM marks the timer interrupt as active on the physical distributor
  7.  KVM enables the timer, enables interrupts, and runs the VCPU

Notice that again the forwarded physical interrupt is injected to the
guest without having actually been handled on the host.  In this case it
is because the physical interrupt is never actually seen by the host because the
timer is disabled upon guest return, and the virtual forwarded interrupt is
injected on the KVM guest entry path.
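The life cycle described above can be sketched as a tiny state machine. This is a toy userspace model, not kernel code; the type and function names (`fwd_irq`, `inject`, `vm_exit_sync`, etc.) are invented for illustration of the documented LR and distributor transitions:

```c
#include <stdbool.h>

/* Toy model of one List Register tracking a forwarded physical IRQ. */
enum lr_state { LR_INACTIVE, LR_PENDING, LR_ACTIVE };

struct fwd_irq {
	enum lr_state lr;	/* state presented to the guest via the LR  */
	bool hw;		/* HW bit: LR is linked to a physical IRQ   */
	bool phys_active;	/* active state on the physical distributor */
};

/* Host acks the IRQ and KVM injects it: active on the distributor,
 * Pending in the LR (the only state the GICV interface will present). */
static void inject(struct fwd_irq *irq)
{
	irq->hw = true;
	irq->phys_active = true;
	irq->lr = LR_PENDING;
}

/* Guest reads the IAR: LR.Pending transitions to LR.Active. */
static void guest_ack(struct fwd_irq *irq)
{
	if (irq->lr == LR_PENDING)
		irq->lr = LR_ACTIVE;
}

/* Guest deactivates: with the HW bit set, the hardware also clears the
 * active state on the physical distributor; LR.Active is left set. */
static void guest_deactivate(struct fwd_irq *irq)
{
	if (irq->hw)
		irq->phys_active = false;
}

/* On VM exit, KVM clears the LR once the distributor active state is gone. */
static void vm_exit_sync(struct fwd_irq *irq)
{
	if (!irq->phys_active)
		irq->lr = LR_INACTIVE;
}
```

Walking one interrupt through inject, ack, deactivate, and exit-time sync reproduces the bullet list above: Pending, then Active, then cleared only after the distributor's active state is gone.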
+10 -8
Documentation/virtual/kvm/devices/arm-vgic.txt
···
 44  44   Attributes:
 45  45     The attr field of kvm_device_attr encodes two values:
 46  46     bits: | 63 .... 40 | 39 .. 32 | 31 .... 0 |
 47      -  values: | reserved | cpu id | offset |
     47  +  values: | reserved | vcpu_index | offset |
 48  48
 49  49     All distributor regs are (rw, 32-bit)
 50  50
 51  51     The offset is relative to the "Distributor base address" as defined in the
 52  52     GICv2 specs.  Getting or setting such a register has the same effect as
 53      -  reading or writing the register on the actual hardware from the cpu
 54      -  specified with cpu id field.  Note that most distributor fields are not
 55      -  banked, but return the same value regardless of the cpu id used to access
 56      -  the register.
     53  +  reading or writing the register on the actual hardware from the cpu whose
     54  +  index is specified with the vcpu_index field.  Note that most distributor
     55  +  fields are not banked, but return the same value regardless of the
     56  +  vcpu_index used to access the register.
 57  57   Limitations:
 58  58     - Priorities are not implemented, and registers are RAZ/WI
 59  59     - Currently only implemented for KVM_DEV_TYPE_ARM_VGIC_V2.
 60  60   Errors:
 61      -  -ENODEV: Getting or setting this register is not yet supported
     61  +  -ENXIO: Getting or setting this register is not yet supported
 62  62     -EBUSY: One or more VCPUs are running
     63  +  -EINVAL: Invalid vcpu_index supplied
 63  64
 64  65   KVM_DEV_ARM_VGIC_GRP_CPU_REGS
 65  66   Attributes:
 66  67     The attr field of kvm_device_attr encodes two values:
 67  68     bits: | 63 .... 40 | 39 .. 32 | 31 .... 0 |
 68      -  values: | reserved | cpu id | offset |
     69  +  values: | reserved | vcpu_index | offset |
 69  70
 70  71     All CPU interface regs are (rw, 32-bit)
 71  72
···
 92  91     - Priorities are not implemented, and registers are RAZ/WI
 93  92     - Currently only implemented for KVM_DEV_TYPE_ARM_VGIC_V2.
 94  93   Errors:
 95      -  -ENODEV: Getting or setting this register is not yet supported
     94  +  -ENXIO: Getting or setting this register is not yet supported
 96  95     -EBUSY: One or more VCPUs are running
     96  +  -EINVAL: Invalid vcpu_index supplied
 97  97
 98  98   KVM_DEV_ARM_VGIC_GRP_NR_IRQS
 99  99   Attributes:
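The documented attr layout (reserved in bits 63..40, vcpu_index in bits 39..32, register offset in bits 31..0) can be exercised with a few shift-and-mask helpers. A minimal sketch; the helper names are invented, only the bit layout comes from the document:

```c
#include <stdint.h>

/* attr layout: | 63 .... 40 reserved | 39 .. 32 vcpu_index | 31 .... 0 offset | */
#define VGIC_ATTR_VCPU_SHIFT	32
#define VGIC_ATTR_VCPU_MASK	0xffULL
#define VGIC_ATTR_OFFSET_MASK	0xffffffffULL

static uint64_t vgic_attr_pack(uint8_t vcpu_index, uint32_t offset)
{
	return ((uint64_t)vcpu_index << VGIC_ATTR_VCPU_SHIFT) | offset;
}

static uint8_t vgic_attr_vcpu(uint64_t attr)
{
	return (attr >> VGIC_ATTR_VCPU_SHIFT) & VGIC_ATTR_VCPU_MASK;
}

static uint32_t vgic_attr_offset(uint64_t attr)
{
	return attr & VGIC_ATTR_OFFSET_MASK;
}
```

A round trip through pack and unpack leaves the reserved top bits zero, which is what the kernel expects when validating the attr field.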
+20
arch/arm/include/asm/kvm_arm.h
··· 218 218 #define HSR_DABT_CM (1U << 8) 219 219 #define HSR_DABT_EA (1U << 9) 220 220 221 + #define kvm_arm_exception_type \ 222 + {0, "RESET" }, \ 223 + {1, "UNDEFINED" }, \ 224 + {2, "SOFTWARE" }, \ 225 + {3, "PREF_ABORT" }, \ 226 + {4, "DATA_ABORT" }, \ 227 + {5, "IRQ" }, \ 228 + {6, "FIQ" }, \ 229 + {7, "HVC" } 230 + 231 + #define HSRECN(x) { HSR_EC_##x, #x } 232 + 233 + #define kvm_arm_exception_class \ 234 + HSRECN(UNKNOWN), HSRECN(WFI), HSRECN(CP15_32), HSRECN(CP15_64), \ 235 + HSRECN(CP14_MR), HSRECN(CP14_LS), HSRECN(CP_0_13), HSRECN(CP10_ID), \ 236 + HSRECN(JAZELLE), HSRECN(BXJ), HSRECN(CP14_64), HSRECN(SVC_HYP), \ 237 + HSRECN(HVC), HSRECN(SMC), HSRECN(IABT), HSRECN(IABT_HYP), \ 238 + HSRECN(DABT), HSRECN(DABT_HYP) 239 + 240 + 221 241 #endif /* __ARM_KVM_ARM_H__ */
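The `kvm_arm_exception_type` and `kvm_arm_exception_class` tables added here are {value, name} pairs meant for `__print_symbolic()` in the tracepoint below. A userspace sketch of that lookup, assuming nothing beyond the table contents shown in the diff:

```c
#include <stddef.h>

/* Sketch of what __print_symbolic() does with a {value, name} table:
 * map a numeric trace field to its symbolic name. */
struct sym { int val; const char *name; };

static const struct sym arm_exception_type[] = {
	{0, "RESET"}, {1, "UNDEFINED"}, {2, "SOFTWARE"}, {3, "PREF_ABORT"},
	{4, "DATA_ABORT"}, {5, "IRQ"}, {6, "FIQ"}, {7, "HVC"},
};

static const char *sym_lookup(const struct sym *tbl, size_t n, int val)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].val == val)
			return tbl[i].name;
	return "UNKNOWN";	/* trace output falls back to the raw value */
}
```

With this in place a raw exit index of 5 renders as "IRQ" in the trace, which is the readability win the tracepoint change below is after.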
+4 -1
arch/arm/include/asm/kvm_host.h
··· 126 126 * here. 127 127 */ 128 128 129 - /* Don't run the guest on this vcpu */ 129 + /* vcpu power-off state */ 130 + bool power_off; 131 + 132 + /* Don't run the guest (internal implementation need) */ 130 133 bool pause; 131 134 132 135 /* IO related fields */
+3
arch/arm/kvm/Kconfig
··· 21 21 depends on MMU && OF 22 22 select PREEMPT_NOTIFIERS 23 23 select ANON_INODES 24 + select ARM_GIC 24 25 select HAVE_KVM_CPU_RELAX_INTERCEPT 25 26 select HAVE_KVM_ARCH_TLB_FLUSH_ALL 26 27 select KVM_MMIO ··· 45 44 bool 46 45 ---help--- 47 46 Provides host support for ARM processors. 47 + 48 + source drivers/vhost/Kconfig 48 49 49 50 endif # VIRTUALIZATION
+61 -17
arch/arm/kvm/arm.c
··· 271 271 return kvm_timer_should_fire(vcpu); 272 272 } 273 273 274 + void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) 275 + { 276 + kvm_timer_schedule(vcpu); 277 + } 278 + 279 + void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) 280 + { 281 + kvm_timer_unschedule(vcpu); 282 + } 283 + 274 284 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) 275 285 { 276 286 /* Force users to call KVM_ARM_VCPU_INIT */ ··· 318 308 int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu, 319 309 struct kvm_mp_state *mp_state) 320 310 { 321 - if (vcpu->arch.pause) 311 + if (vcpu->arch.power_off) 322 312 mp_state->mp_state = KVM_MP_STATE_STOPPED; 323 313 else 324 314 mp_state->mp_state = KVM_MP_STATE_RUNNABLE; ··· 331 321 { 332 322 switch (mp_state->mp_state) { 333 323 case KVM_MP_STATE_RUNNABLE: 334 - vcpu->arch.pause = false; 324 + vcpu->arch.power_off = false; 335 325 break; 336 326 case KVM_MP_STATE_STOPPED: 337 - vcpu->arch.pause = true; 327 + vcpu->arch.power_off = true; 338 328 break; 339 329 default: 340 330 return -EINVAL; ··· 352 342 */ 353 343 int kvm_arch_vcpu_runnable(struct kvm_vcpu *v) 354 344 { 355 - return !!v->arch.irq_lines || kvm_vgic_vcpu_pending_irq(v); 345 + return ((!!v->arch.irq_lines || kvm_vgic_vcpu_pending_irq(v)) 346 + && !v->arch.power_off && !v->arch.pause); 356 347 } 357 348 358 349 /* Just ensure a guest exit from a particular CPU */ ··· 479 468 return vgic_initialized(kvm); 480 469 } 481 470 482 - static void vcpu_pause(struct kvm_vcpu *vcpu) 471 + static void kvm_arm_halt_guest(struct kvm *kvm) __maybe_unused; 472 + static void kvm_arm_resume_guest(struct kvm *kvm) __maybe_unused; 473 + 474 + static void kvm_arm_halt_guest(struct kvm *kvm) 475 + { 476 + int i; 477 + struct kvm_vcpu *vcpu; 478 + 479 + kvm_for_each_vcpu(i, vcpu, kvm) 480 + vcpu->arch.pause = true; 481 + force_vm_exit(cpu_all_mask); 482 + } 483 + 484 + static void kvm_arm_resume_guest(struct kvm *kvm) 485 + { 486 + int i; 487 + struct kvm_vcpu *vcpu; 488 + 489 + kvm_for_each_vcpu(i, 
vcpu, kvm) { 490 + wait_queue_head_t *wq = kvm_arch_vcpu_wq(vcpu); 491 + 492 + vcpu->arch.pause = false; 493 + wake_up_interruptible(wq); 494 + } 495 + } 496 + 497 + static void vcpu_sleep(struct kvm_vcpu *vcpu) 483 498 { 484 499 wait_queue_head_t *wq = kvm_arch_vcpu_wq(vcpu); 485 500 486 - wait_event_interruptible(*wq, !vcpu->arch.pause); 501 + wait_event_interruptible(*wq, ((!vcpu->arch.power_off) && 502 + (!vcpu->arch.pause))); 487 503 } 488 504 489 505 static int kvm_vcpu_initialized(struct kvm_vcpu *vcpu) ··· 560 522 561 523 update_vttbr(vcpu->kvm); 562 524 563 - if (vcpu->arch.pause) 564 - vcpu_pause(vcpu); 525 + if (vcpu->arch.power_off || vcpu->arch.pause) 526 + vcpu_sleep(vcpu); 565 527 566 528 /* 567 529 * Disarming the background timer must be done in a ··· 587 549 run->exit_reason = KVM_EXIT_INTR; 588 550 } 589 551 590 - if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) { 552 + if (ret <= 0 || need_new_vmid_gen(vcpu->kvm) || 553 + vcpu->arch.power_off || vcpu->arch.pause) { 591 554 local_irq_enable(); 555 + kvm_timer_sync_hwstate(vcpu); 592 556 kvm_vgic_sync_hwstate(vcpu); 593 557 preempt_enable(); 594 - kvm_timer_sync_hwstate(vcpu); 595 558 continue; 596 559 } 597 560 ··· 635 596 * guest time. 636 597 */ 637 598 kvm_guest_exit(); 638 - trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu)); 599 + trace_kvm_exit(ret, kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu)); 600 + 601 + /* 602 + * We must sync the timer state before the vgic state so that 603 + * the vgic can properly sample the updated state of the 604 + * interrupt line. 605 + */ 606 + kvm_timer_sync_hwstate(vcpu); 639 607 640 608 kvm_vgic_sync_hwstate(vcpu); 641 609 642 610 preempt_enable(); 643 - 644 - kvm_timer_sync_hwstate(vcpu); 645 611 646 612 ret = handle_exit(vcpu, run, ret); 647 613 } ··· 809 765 vcpu_reset_hcr(vcpu); 810 766 811 767 /* 812 - * Handle the "start in power-off" case by marking the VCPU as paused. 768 + * Handle the "start in power-off" case. 
813 769 */ 814 770 if (test_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features)) 815 - vcpu->arch.pause = true; 771 + vcpu->arch.power_off = true; 816 772 else 817 - vcpu->arch.pause = false; 773 + vcpu->arch.power_off = false; 818 774 819 775 return 0; 820 776 } ··· 1124 1080 */ 1125 1081 err = kvm_timer_hyp_init(); 1126 1082 if (err) 1127 - goto out_free_mappings; 1083 + goto out_free_context; 1128 1084 1129 1085 #ifndef CONFIG_HOTPLUG_CPU 1130 1086 free_boot_hyp_pgd();
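The arm.c changes split the old `pause` flag into `power_off` (guest-visible PSCI state) and `pause` (a host-internal request to halt the guest, used by `kvm_arm_halt_guest`). The resulting runnable check can be modeled standalone; a toy sketch with invented stand-in fields, not the kernel structures:

```c
#include <stdbool.h>

/* Sketch of kvm_arch_vcpu_runnable() after the pause/power_off split. */
struct toy_vcpu {
	bool irq_lines;		/* stand-in for !!v->arch.irq_lines            */
	bool vgic_pending;	/* stand-in for kvm_vgic_vcpu_pending_irq(v)   */
	bool power_off;		/* guest-visible PSCI power-off state          */
	bool pause;		/* host-internal halt (kvm_arm_halt_guest)     */
};

static bool toy_vcpu_runnable(const struct toy_vcpu *v)
{
	/* A VCPU only runs if it has a pending interrupt source AND is
	 * neither powered off by the guest nor paused by the host. */
	return (v->irq_lines || v->vgic_pending) &&
	       !v->power_off && !v->pause;
}
```

This captures why the split matters: a paused VCPU with a pending interrupt must stay asleep until the host clears `pause`, whereas before the change one flag served both meanings.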
+5 -5
arch/arm/kvm/psci.c
··· 63 63 64 64 static void kvm_psci_vcpu_off(struct kvm_vcpu *vcpu) 65 65 { 66 - vcpu->arch.pause = true; 66 + vcpu->arch.power_off = true; 67 67 } 68 68 69 69 static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu) ··· 87 87 */ 88 88 if (!vcpu) 89 89 return PSCI_RET_INVALID_PARAMS; 90 - if (!vcpu->arch.pause) { 90 + if (!vcpu->arch.power_off) { 91 91 if (kvm_psci_version(source_vcpu) != KVM_ARM_PSCI_0_1) 92 92 return PSCI_RET_ALREADY_ON; 93 93 else ··· 115 115 * the general puspose registers are undefined upon CPU_ON. 116 116 */ 117 117 *vcpu_reg(vcpu, 0) = context_id; 118 - vcpu->arch.pause = false; 118 + vcpu->arch.power_off = false; 119 119 smp_mb(); /* Make sure the above is visible */ 120 120 121 121 wq = kvm_arch_vcpu_wq(vcpu); ··· 153 153 mpidr = kvm_vcpu_get_mpidr_aff(tmp); 154 154 if ((mpidr & target_affinity_mask) == target_affinity) { 155 155 matching_cpus++; 156 - if (!tmp->arch.pause) 156 + if (!tmp->arch.power_off) 157 157 return PSCI_0_2_AFFINITY_LEVEL_ON; 158 158 } 159 159 } ··· 179 179 * re-initialized. 180 180 */ 181 181 kvm_for_each_vcpu(i, tmp, vcpu->kvm) { 182 - tmp->arch.pause = true; 182 + tmp->arch.power_off = true; 183 183 kvm_vcpu_kick(tmp); 184 184 } 185 185
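The PSCI AFFINITY_INFO walk renamed here matches a VCPU when its masked MPIDR equals the target affinity, and reports ON if any matching VCPU is not powered off. A minimal sketch of that logic with illustrative constants and a toy vcpu type (the real return codes live in the PSCI headers):

```c
#include <stdbool.h>
#include <stdint.h>

#define AFFINITY_LEVEL_ON	0
#define AFFINITY_LEVEL_OFF	1
#define AFFINITY_INVALID	-1	/* stand-in for PSCI_RET_INVALID_PARAMS */

struct toy_vcpu { uint64_t mpidr; bool power_off; };

static int affinity_info(const struct toy_vcpu *vcpus, int n,
			 uint64_t target, uint64_t mask)
{
	int matching = 0;

	for (int i = 0; i < n; i++) {
		if ((vcpus[i].mpidr & mask) == target) {
			matching++;
			/* One running VCPU in the cluster is enough for ON. */
			if (!vcpus[i].power_off)
				return AFFINITY_LEVEL_ON;
		}
	}
	return matching ? AFFINITY_LEVEL_OFF : AFFINITY_INVALID;
}
```

The rename in this patch makes the predicate honest: the query is about guest-visible power state (`power_off`), not about the host's internal `pause` request.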
+7 -3
arch/arm/kvm/trace.h
··· 25 25 ); 26 26 27 27 TRACE_EVENT(kvm_exit, 28 - TP_PROTO(unsigned int exit_reason, unsigned long vcpu_pc), 29 - TP_ARGS(exit_reason, vcpu_pc), 28 + TP_PROTO(int idx, unsigned int exit_reason, unsigned long vcpu_pc), 29 + TP_ARGS(idx, exit_reason, vcpu_pc), 30 30 31 31 TP_STRUCT__entry( 32 + __field( int, idx ) 32 33 __field( unsigned int, exit_reason ) 33 34 __field( unsigned long, vcpu_pc ) 34 35 ), 35 36 36 37 TP_fast_assign( 38 + __entry->idx = idx; 37 39 __entry->exit_reason = exit_reason; 38 40 __entry->vcpu_pc = vcpu_pc; 39 41 ), 40 42 41 - TP_printk("HSR_EC: 0x%04x, PC: 0x%08lx", 43 + TP_printk("%s: HSR_EC: 0x%04x (%s), PC: 0x%08lx", 44 + __print_symbolic(__entry->idx, kvm_arm_exception_type), 42 45 __entry->exit_reason, 46 + __print_symbolic(__entry->exit_reason, kvm_arm_exception_class), 43 47 __entry->vcpu_pc) 44 48 ); 45 49
+16
arch/arm64/include/asm/kvm_arm.h
··· 200 200 /* Hyp Prefetch Fault Address Register (HPFAR/HDFAR) */ 201 201 #define HPFAR_MASK (~UL(0xf)) 202 202 203 + #define kvm_arm_exception_type \ 204 + {0, "IRQ" }, \ 205 + {1, "TRAP" } 206 + 207 + #define ECN(x) { ESR_ELx_EC_##x, #x } 208 + 209 + #define kvm_arm_exception_class \ 210 + ECN(UNKNOWN), ECN(WFx), ECN(CP15_32), ECN(CP15_64), ECN(CP14_MR), \ 211 + ECN(CP14_LS), ECN(FP_ASIMD), ECN(CP10_ID), ECN(CP14_64), ECN(SVC64), \ 212 + ECN(HVC64), ECN(SMC64), ECN(SYS64), ECN(IMP_DEF), ECN(IABT_LOW), \ 213 + ECN(IABT_CUR), ECN(PC_ALIGN), ECN(DABT_LOW), ECN(DABT_CUR), \ 214 + ECN(SP_ALIGN), ECN(FP_EXC32), ECN(FP_EXC64), ECN(SERROR), \ 215 + ECN(BREAKPT_LOW), ECN(BREAKPT_CUR), ECN(SOFTSTP_LOW), \ 216 + ECN(SOFTSTP_CUR), ECN(WATCHPT_LOW), ECN(WATCHPT_CUR), \ 217 + ECN(BKPT32), ECN(VECTOR32), ECN(BRK64) 218 + 203 219 #endif /* __ARM64_KVM_ARM_H__ */
+4 -1
arch/arm64/include/asm/kvm_host.h
··· 149 149 u32 mdscr_el1; 150 150 } guest_debug_preserved; 151 151 152 - /* Don't run the guest */ 152 + /* vcpu power-off state */ 153 + bool power_off; 154 + 155 + /* Don't run the guest (internal implementation need) */ 153 156 bool pause; 154 157 155 158 /* IO related fields */
+2
arch/arm64/kvm/Kconfig
··· 41 41 ---help--- 42 42 Provides host support for ARM processors. 43 43 44 + source drivers/vhost/Kconfig 45 + 44 46 endif # VIRTUALIZATION
+8
arch/arm64/kvm/hyp.S
··· 880 880 881 881 bl __restore_sysregs 882 882 883 + /* 884 + * Make sure we have a valid host stack, and don't leave junk in the 885 + * frame pointer that will give us a misleading host stack unwinding. 886 + */ 887 + ldr x22, [x2, #CPU_GP_REG_OFFSET(CPU_SP_EL1)] 888 + msr sp_el1, x22 889 + mov x29, xzr 890 + 883 891 1: adr x0, __hyp_panic_str 884 892 adr x1, 2f 885 893 ldp x2, x3, [x1]
+2
arch/mips/include/asm/kvm_host.h
··· 847 847 struct kvm_memory_slot *slot) {} 848 848 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {} 849 849 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} 850 + static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {} 851 + static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {} 850 852 851 853 #endif /* __MIPS_KVM_HOST_H__ */
+2
arch/powerpc/include/asm/kvm_host.h
··· 718 718 static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {} 719 719 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} 720 720 static inline void kvm_arch_exit(void) {} 721 + static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {} 722 + static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {} 721 723 722 724 #endif /* __POWERPC_KVM_HOST_H__ */
+2
arch/s390/include/asm/kvm_host.h
··· 644 644 static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {} 645 645 static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm, 646 646 struct kvm_memory_slot *slot) {} 647 + static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {} 648 + static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {} 647 649 648 650 #endif
+4
arch/x86/include/asm/kvm_host.h
··· 1261 1261 1262 1262 void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e, 1263 1263 struct kvm_lapic_irq *irq); 1264 + 1265 + static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {} 1266 + static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {} 1267 + 1264 1268 #endif /* _ASM_X86_KVM_HOST_H */
+3 -1
include/kvm/arm_arch_timer.h
··· 51 51 bool armed; 52 52 53 53 /* Timer IRQ */ 54 - const struct kvm_irq_level *irq; 54 + struct kvm_irq_level irq; 55 55 56 56 /* VGIC mapping */ 57 57 struct irq_phys_map *map; ··· 71 71 int kvm_arm_timer_set_reg(struct kvm_vcpu *, u64 regid, u64 value); 72 72 73 73 bool kvm_timer_should_fire(struct kvm_vcpu *vcpu); 74 + void kvm_timer_schedule(struct kvm_vcpu *vcpu); 75 + void kvm_timer_unschedule(struct kvm_vcpu *vcpu); 74 76 75 77 #endif
+3 -13
include/kvm/arm_vgic.h
··· 112 112 struct vgic_ops { 113 113 struct vgic_lr (*get_lr)(const struct kvm_vcpu *, int); 114 114 void (*set_lr)(struct kvm_vcpu *, int, struct vgic_lr); 115 - void (*sync_lr_elrsr)(struct kvm_vcpu *, int, struct vgic_lr); 116 115 u64 (*get_elrsr)(const struct kvm_vcpu *vcpu); 117 116 u64 (*get_eisr)(const struct kvm_vcpu *vcpu); 118 117 void (*clear_eisr)(struct kvm_vcpu *vcpu); ··· 158 159 u32 virt_irq; 159 160 u32 phys_irq; 160 161 u32 irq; 161 - bool active; 162 162 }; 163 163 164 164 struct irq_phys_map_entry { ··· 294 296 }; 295 297 296 298 struct vgic_cpu { 297 - /* per IRQ to LR mapping */ 298 - u8 *vgic_irq_lr_map; 299 - 300 299 /* Pending/active/both interrupts on this VCPU */ 301 - DECLARE_BITMAP( pending_percpu, VGIC_NR_PRIVATE_IRQS); 302 - DECLARE_BITMAP( active_percpu, VGIC_NR_PRIVATE_IRQS); 303 - DECLARE_BITMAP( pend_act_percpu, VGIC_NR_PRIVATE_IRQS); 300 + DECLARE_BITMAP(pending_percpu, VGIC_NR_PRIVATE_IRQS); 301 + DECLARE_BITMAP(active_percpu, VGIC_NR_PRIVATE_IRQS); 302 + DECLARE_BITMAP(pend_act_percpu, VGIC_NR_PRIVATE_IRQS); 304 303 305 304 /* Pending/active/both shared interrupts, dynamically sized */ 306 305 unsigned long *pending_shared; 307 306 unsigned long *active_shared; 308 307 unsigned long *pend_act_shared; 309 - 310 - /* Bitmap of used/free list registers */ 311 - DECLARE_BITMAP( lr_used, VGIC_V2_MAX_LRS); 312 308 313 309 /* Number of list registers on this CPU */ 314 310 int nr_lr; ··· 346 354 struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, 347 355 int virt_irq, int irq); 348 356 int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map); 349 - bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map); 350 - void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active); 351 357 352 358 #define irqchip_in_kernel(k) (!!((k)->arch.vgic.in_kernel)) 353 359 #define vgic_initialized(k) (!!((k)->arch.vgic.nr_cpus))
+2
include/linux/kvm_host.h
··· 647 647 void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn); 648 648 649 649 void kvm_vcpu_block(struct kvm_vcpu *vcpu); 650 + void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu); 651 + void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu); 650 652 void kvm_vcpu_kick(struct kvm_vcpu *vcpu); 651 653 int kvm_vcpu_yield_to(struct kvm_vcpu *target); 652 654 void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu);
+129 -49
virt/kvm/arm/arch_timer.c
··· 28 28 #include <kvm/arm_vgic.h> 29 29 #include <kvm/arm_arch_timer.h> 30 30 31 + #include "trace.h" 32 + 31 33 static struct timecounter *timecounter; 32 34 static struct workqueue_struct *wqueue; 33 35 static unsigned int host_vtimer_irq; ··· 59 57 cancel_work_sync(&timer->expired); 60 58 timer->armed = false; 61 59 } 62 - } 63 - 64 - static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu) 65 - { 66 - int ret; 67 - struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; 68 - 69 - kvm_vgic_set_phys_irq_active(timer->map, true); 70 - ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id, 71 - timer->map, 72 - timer->irq->level); 73 - WARN_ON(ret); 74 60 } 75 61 76 62 static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id) ··· 101 111 return HRTIMER_NORESTART; 102 112 } 103 113 114 + static bool kvm_timer_irq_can_fire(struct kvm_vcpu *vcpu) 115 + { 116 + struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; 117 + 118 + return !(timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) && 119 + (timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE); 120 + } 121 + 104 122 bool kvm_timer_should_fire(struct kvm_vcpu *vcpu) 105 123 { 106 124 struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; 107 125 cycle_t cval, now; 108 126 109 - if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) || 110 - !(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE) || 111 - kvm_vgic_get_phys_irq_active(timer->map)) 127 + if (!kvm_timer_irq_can_fire(vcpu)) 112 128 return false; 113 129 114 130 cval = timer->cntv_cval; ··· 123 127 return cval <= now; 124 128 } 125 129 126 - /** 127 - * kvm_timer_flush_hwstate - prepare to move the virt timer to the cpu 128 - * @vcpu: The vcpu pointer 129 - * 130 - * Disarm any pending soft timers, since the world-switch code will write the 131 - * virtual timer state back to the physical CPU. 
130 + static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level) 131 + { 132 + int ret; 133 + struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; 134 + 135 + BUG_ON(!vgic_initialized(vcpu->kvm)); 136 + 137 + timer->irq.level = new_level; 138 + trace_kvm_timer_update_irq(vcpu->vcpu_id, timer->map->virt_irq, 139 + timer->irq.level); 140 + ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id, 141 + timer->map, 142 + timer->irq.level); 143 + WARN_ON(ret); 144 + } 145 + 146 + /* 147 + * Check if there was a change in the timer state (should we raise or lower 148 + * the line level to the GIC). 132 149 */ 133 - void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu) 150 + static void kvm_timer_update_state(struct kvm_vcpu *vcpu) 134 151 { 135 152 struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; 136 153 137 154 /* 138 - * We're about to run this vcpu again, so there is no need to 139 - * keep the background timer running, as we're about to 140 - * populate the CPU timer again. 155 + * If userspace modified the timer registers via SET_ONE_REG before 156 + * the vgic was initialized, we mustn't set the timer->irq.level value 157 + * because the guest would never see the interrupt. Instead wait 158 + * until we call this function from kvm_timer_flush_hwstate. 141 159 */ 142 - timer_disarm(timer); 160 + if (!vgic_initialized(vcpu->kvm)) 161 + return; 162 + 163 + if (kvm_timer_should_fire(vcpu) != timer->irq.level) 164 + kvm_timer_update_irq(vcpu, !timer->irq.level); 165 + } 166 + 167 + /* 168 + * Schedule the background timer before calling kvm_vcpu_block, so that this 169 + * thread is removed from its waitqueue and made runnable when there's a timer 170 + * interrupt to handle. 
171 + */ 172 + void kvm_timer_schedule(struct kvm_vcpu *vcpu) 173 + { 174 + struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; 175 + u64 ns; 176 + cycle_t cval, now; 177 + 178 + BUG_ON(timer_is_armed(timer)); 143 179 144 180 /* 145 - * If the timer expired while we were not scheduled, now is the time 146 - * to inject it. 181 + * No need to schedule a background timer if the guest timer has 182 + * already expired, because kvm_vcpu_block will return before putting 183 + * the thread to sleep. 147 184 */ 148 185 if (kvm_timer_should_fire(vcpu)) 149 - kvm_timer_inject_irq(vcpu); 186 + return; 187 + 188 + /* 189 + * If the timer is not capable of raising interrupts (disabled or 190 + * masked), then there's no more work for us to do. 191 + */ 192 + if (!kvm_timer_irq_can_fire(vcpu)) 193 + return; 194 + 195 + /* The timer has not yet expired, schedule a background timer */ 196 + cval = timer->cntv_cval; 197 + now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff; 198 + 199 + ns = cyclecounter_cyc2ns(timecounter->cc, 200 + cval - now, 201 + timecounter->mask, 202 + &timecounter->frac); 203 + timer_arm(timer, ns); 204 + } 205 + 206 + void kvm_timer_unschedule(struct kvm_vcpu *vcpu) 207 + { 208 + struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; 209 + timer_disarm(timer); 210 + } 211 + 212 + /** 213 + * kvm_timer_flush_hwstate - prepare to move the virt timer to the cpu 214 + * @vcpu: The vcpu pointer 215 + * 216 + * Check if the virtual timer has expired while we were running in the host, 217 + * and inject an interrupt if that was the case. 
218 + */ 219 + void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu) 220 + { 221 + struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; 222 + bool phys_active; 223 + int ret; 224 + 225 + kvm_timer_update_state(vcpu); 226 + 227 + /* 228 + * If we enter the guest with the virtual input level to the VGIC 229 + * asserted, then we have already told the VGIC what we need to, and 230 + * we don't need to exit from the guest until the guest deactivates 231 + * the already injected interrupt, so we should set the 232 + * hardware active state to prevent unnecessary exits from the guest. 233 + * 234 + * Conversely, if the virtual input level is deasserted, then always 235 + * clear the hardware active state to ensure that hardware interrupts 236 + * from the timer trigger a guest exit. 237 + */ 238 + if (timer->irq.level) 239 + phys_active = true; 240 + else 241 + phys_active = false; 242 + 243 + ret = irq_set_irqchip_state(timer->map->irq, 244 + IRQCHIP_STATE_ACTIVE, 245 + phys_active); 246 + WARN_ON(ret); 150 247 } 151 248 152 249 /** 153 250 * kvm_timer_sync_hwstate - sync timer state from cpu 154 251 * @vcpu: The vcpu pointer 155 252 * 156 - * Check if the virtual timer was armed and either schedule a corresponding 157 - * soft timer or inject directly if already expired. 253 + * Check if the virtual timer has expired while we were running in the guest, 254 + * and inject an interrupt if that was the case. 158 255 */ 159 256 void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu) 160 257 { 161 258 struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; 162 - cycle_t cval, now; 163 - u64 ns; 164 259 165 260 BUG_ON(timer_is_armed(timer)); 166 261 167 - if (kvm_timer_should_fire(vcpu)) { 168 - /* 169 - * Timer has already expired while we were not 170 - * looking. Inject the interrupt and carry on. 
171 - */ 172 - kvm_timer_inject_irq(vcpu); 173 - return; 174 - } 175 - 176 - cval = timer->cntv_cval; 177 - now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff; 178 - 179 - ns = cyclecounter_cyc2ns(timecounter->cc, cval - now, timecounter->mask, 180 - &timecounter->frac); 181 - timer_arm(timer, ns); 262 + /* 263 + * The guest could have modified the timer registers or the timer 264 + * could have expired, update the timer state. 265 + */ 266 + kvm_timer_update_state(vcpu); 182 267 } 183 268 184 269 int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu, ··· 274 197 * kvm_vcpu_set_target(). To handle this, we determine 275 198 * vcpu timer irq number when the vcpu is reset. 276 199 */ 277 - timer->irq = irq; 200 + timer->irq.irq = irq->irq; 278 201 279 202 /* 280 203 * The bits in CNTV_CTL are architecturally reset to UNKNOWN for ARMv8 ··· 283 206 * the ARMv7 architecture. 284 207 */ 285 208 timer->cntv_ctl = 0; 209 + kvm_timer_update_state(vcpu); 286 210 287 211 /* 288 212 * Tell the VGIC that the virtual interrupt is tied to a ··· 328 250 default: 329 251 return -1; 330 252 } 253 + 254 + kvm_timer_update_state(vcpu); 331 255 return 0; 332 256 } 333 257
+63
virt/kvm/arm/trace.h
··· 1 + #if !defined(_TRACE_KVM_H) || defined(TRACE_HEADER_MULTI_READ) 2 + #define _TRACE_KVM_H 3 + 4 + #include <linux/tracepoint.h> 5 + 6 + #undef TRACE_SYSTEM 7 + #define TRACE_SYSTEM kvm 8 + 9 + /* 10 + * Tracepoints for vgic 11 + */ 12 + TRACE_EVENT(vgic_update_irq_pending, 13 + TP_PROTO(unsigned long vcpu_id, __u32 irq, bool level), 14 + TP_ARGS(vcpu_id, irq, level), 15 + 16 + TP_STRUCT__entry( 17 + __field( unsigned long, vcpu_id ) 18 + __field( __u32, irq ) 19 + __field( bool, level ) 20 + ), 21 + 22 + TP_fast_assign( 23 + __entry->vcpu_id = vcpu_id; 24 + __entry->irq = irq; 25 + __entry->level = level; 26 + ), 27 + 28 + TP_printk("VCPU: %ld, IRQ %d, level: %d", 29 + __entry->vcpu_id, __entry->irq, __entry->level) 30 + ); 31 + 32 + /* 33 + * Tracepoints for arch_timer 34 + */ 35 + TRACE_EVENT(kvm_timer_update_irq, 36 + TP_PROTO(unsigned long vcpu_id, __u32 irq, int level), 37 + TP_ARGS(vcpu_id, irq, level), 38 + 39 + TP_STRUCT__entry( 40 + __field( unsigned long, vcpu_id ) 41 + __field( __u32, irq ) 42 + __field( int, level ) 43 + ), 44 + 45 + TP_fast_assign( 46 + __entry->vcpu_id = vcpu_id; 47 + __entry->irq = irq; 48 + __entry->level = level; 49 + ), 50 + 51 + TP_printk("VCPU: %ld, IRQ %d, level %d", 52 + __entry->vcpu_id, __entry->irq, __entry->level) 53 + ); 54 + 55 + #endif /* _TRACE_KVM_H */ 56 + 57 + #undef TRACE_INCLUDE_PATH 58 + #define TRACE_INCLUDE_PATH ../../../virt/kvm/arm 59 + #undef TRACE_INCLUDE_FILE 60 + #define TRACE_INCLUDE_FILE trace 61 + 62 + /* This part must be outside protection */ 63 + #include <trace/define_trace.h>
+1 -5
virt/kvm/arm/vgic-v2.c
··· 79 79 lr_val |= (lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT); 80 80 81 81 vcpu->arch.vgic_cpu.vgic_v2.vgic_lr[lr] = lr_val; 82 - } 83 82 84 - static void vgic_v2_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr, 85 - struct vgic_lr lr_desc) 86 - { 87 83 if (!(lr_desc.state & LR_STATE_MASK)) 88 84 vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr |= (1ULL << lr); 89 85 else ··· 154 158 * anyway. 155 159 */ 156 160 vcpu->arch.vgic_cpu.vgic_v2.vgic_vmcr = 0; 161 + vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr = ~0; 157 162 158 163 /* Get the show on the road... */ 159 164 vcpu->arch.vgic_cpu.vgic_v2.vgic_hcr = GICH_HCR_EN; ··· 163 166 static const struct vgic_ops vgic_v2_ops = { 164 167 .get_lr = vgic_v2_get_lr, 165 168 .set_lr = vgic_v2_set_lr, 166 - .sync_lr_elrsr = vgic_v2_sync_lr_elrsr, 167 169 .get_elrsr = vgic_v2_get_elrsr, 168 170 .get_eisr = vgic_v2_get_eisr, 169 171 .clear_eisr = vgic_v2_clear_eisr,
+1 -5
virt/kvm/arm/vgic-v3.c
··· 112 112 } 113 113 114 114 vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)] = lr_val; 115 - } 116 115 117 - static void vgic_v3_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr, 118 - struct vgic_lr lr_desc) 119 - { 120 116 if (!(lr_desc.state & LR_STATE_MASK)) 121 117 vcpu->arch.vgic_cpu.vgic_v3.vgic_elrsr |= (1U << lr); 122 118 else ··· 189 193 * anyway. 190 194 */ 191 195 vgic_v3->vgic_vmcr = 0; 196 + vgic_v3->vgic_elrsr = ~0; 192 197 193 198 /* 194 199 * If we are emulating a GICv3, we do it in an non-GICv2-compatible ··· 208 211 static const struct vgic_ops vgic_v3_ops = { 209 212 .get_lr = vgic_v3_get_lr, 210 213 .set_lr = vgic_v3_set_lr, 211 - .sync_lr_elrsr = vgic_v3_sync_lr_elrsr, 212 214 .get_elrsr = vgic_v3_get_elrsr, 213 215 .get_eisr = vgic_v3_get_eisr, 214 216 .clear_eisr = vgic_v3_clear_eisr,
+136 -191
virt/kvm/arm/vgic.c
··· 34 34 #include <asm/kvm.h> 35 35 #include <kvm/iodev.h> 36 36 37 + #define CREATE_TRACE_POINTS 38 + #include "trace.h" 39 + 37 40 /* 38 41 * How the whole thing works (courtesy of Christoffer Dall): 39 42 * ··· 105 102 #include "vgic.h" 106 103 107 104 static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu); 108 - static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu); 105 + static void vgic_retire_lr(int lr_nr, struct kvm_vcpu *vcpu); 109 106 static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr); 110 107 static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc); 108 + static u64 vgic_get_elrsr(struct kvm_vcpu *vcpu); 111 109 static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu, 112 110 int virt_irq); 111 + static int compute_pending_for_cpu(struct kvm_vcpu *vcpu); 113 112 114 113 static const struct vgic_ops *vgic_ops; 115 114 static const struct vgic_params *vgic; ··· 362 357 struct vgic_dist *dist = &vcpu->kvm->arch.vgic; 363 358 364 359 vgic_bitmap_set_irq_val(&dist->irq_soft_pend, vcpu->vcpu_id, irq, 0); 360 + if (!vgic_dist_irq_get_level(vcpu, irq)) { 361 + vgic_dist_irq_clear_pending(vcpu, irq); 362 + if (!compute_pending_for_cpu(vcpu)) 363 + clear_bit(vcpu->vcpu_id, dist->irq_pending_on_cpu); 364 + } 365 365 } 366 366 367 367 static int vgic_dist_irq_is_pending(struct kvm_vcpu *vcpu, int irq) ··· 664 654 vgic_reg_access(mmio, &val, offset, 665 655 ACCESS_READ_VALUE | ACCESS_WRITE_VALUE); 666 656 if (mmio->is_write) { 667 - if (offset < 8) { 668 - *reg = ~0U; /* Force PPIs/SGIs to 1 */ 657 + /* Ignore writes to read-only SGI and PPI bits */ 658 + if (offset < 8) 669 659 return false; 670 - } 671 660 672 661 val = vgic_cfg_compress(val); 673 662 if (offset & 4) { ··· 692 683 void vgic_unqueue_irqs(struct kvm_vcpu *vcpu) 693 684 { 694 685 struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; 686 + u64 elrsr = vgic_get_elrsr(vcpu); 687 + unsigned long *elrsr_ptr = u64_to_bitmask(&elrsr); 
695 688 int i; 696 689 697 - for_each_set_bit(i, vgic_cpu->lr_used, vgic_cpu->nr_lr) { 690 + for_each_clear_bit(i, elrsr_ptr, vgic_cpu->nr_lr) { 698 691 struct vgic_lr lr = vgic_get_lr(vcpu, i); 699 692 700 693 /* ··· 717 706 * interrupt then move the active state to the 718 707 * distributor tracking bit. 719 708 */ 720 - if (lr.state & LR_STATE_ACTIVE) { 709 + if (lr.state & LR_STATE_ACTIVE) 721 710 vgic_irq_set_active(vcpu, lr.irq); 722 - lr.state &= ~LR_STATE_ACTIVE; 723 - } 724 711 725 712 /* 726 713 * Reestablish the pending state on the distributor and the 727 - * CPU interface. It may have already been pending, but that 728 - * is fine, then we are only setting a few bits that were 729 - * already set. 714 + * CPU interface and mark the LR as free for other use. 730 715 */ 731 - if (lr.state & LR_STATE_PENDING) { 732 - vgic_dist_irq_set_pending(vcpu, lr.irq); 733 - lr.state &= ~LR_STATE_PENDING; 734 - } 735 - 736 - vgic_set_lr(vcpu, i, lr); 737 - 738 - /* 739 - * Mark the LR as free for other use. 740 - */ 741 - BUG_ON(lr.state & LR_STATE_MASK); 742 - vgic_retire_lr(i, lr.irq, vcpu); 743 - vgic_irq_clear_queued(vcpu, lr.irq); 716 + vgic_retire_lr(i, vcpu); 744 717 745 718 /* Finally update the VGIC state. 
*/ 746 719 vgic_update_state(vcpu->kvm); ··· 977 982 pend_percpu = vcpu->arch.vgic_cpu.pending_percpu; 978 983 pend_shared = vcpu->arch.vgic_cpu.pending_shared; 979 984 985 + if (!dist->enabled) { 986 + bitmap_zero(pend_percpu, VGIC_NR_PRIVATE_IRQS); 987 + bitmap_zero(pend_shared, nr_shared); 988 + return 0; 989 + } 990 + 980 991 pending = vgic_bitmap_get_cpu_map(&dist->irq_pending, vcpu_id); 981 992 enabled = vgic_bitmap_get_cpu_map(&dist->irq_enabled, vcpu_id); 982 993 bitmap_and(pend_percpu, pending, enabled, VGIC_NR_PRIVATE_IRQS); ··· 1010 1009 struct kvm_vcpu *vcpu; 1011 1010 int c; 1012 1011 1013 - if (!dist->enabled) { 1014 - set_bit(0, dist->irq_pending_on_cpu); 1015 - return; 1016 - } 1017 - 1018 1012 kvm_for_each_vcpu(c, vcpu, kvm) { 1019 1013 if (compute_pending_for_cpu(vcpu)) 1020 1014 set_bit(c, dist->irq_pending_on_cpu); ··· 1030 1034 struct vgic_lr vlr) 1031 1035 { 1032 1036 vgic_ops->set_lr(vcpu, lr, vlr); 1033 - } 1034 - 1035 - static void vgic_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr, 1036 - struct vgic_lr vlr) 1037 - { 1038 - vgic_ops->sync_lr_elrsr(vcpu, lr, vlr); 1039 1037 } 1040 1038 1041 1039 static inline u64 vgic_get_elrsr(struct kvm_vcpu *vcpu) ··· 1077 1087 vgic_ops->enable(vcpu); 1078 1088 } 1079 1089 1080 - static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu) 1090 + static void vgic_retire_lr(int lr_nr, struct kvm_vcpu *vcpu) 1081 1091 { 1082 - struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; 1083 1092 struct vgic_lr vlr = vgic_get_lr(vcpu, lr_nr); 1093 + 1094 + vgic_irq_clear_queued(vcpu, vlr.irq); 1095 + 1096 + /* 1097 + * We must transfer the pending state back to the distributor before 1098 + * retiring the LR, otherwise we may lose edge-triggered interrupts. 
1099 + */ 1100 + if (vlr.state & LR_STATE_PENDING) { 1101 + vgic_dist_irq_set_pending(vcpu, vlr.irq); 1102 + vlr.hwirq = 0; 1103 + } 1084 1104 1085 1105 vlr.state = 0; 1086 1106 vgic_set_lr(vcpu, lr_nr, vlr); 1087 - clear_bit(lr_nr, vgic_cpu->lr_used); 1088 - vgic_cpu->vgic_irq_lr_map[irq] = LR_EMPTY; 1089 - vgic_sync_lr_elrsr(vcpu, lr_nr, vlr); 1090 1107 } 1091 1108 1092 1109 /* ··· 1107 1110 */ 1108 1111 static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu) 1109 1112 { 1110 - struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; 1113 + u64 elrsr = vgic_get_elrsr(vcpu); 1114 + unsigned long *elrsr_ptr = u64_to_bitmask(&elrsr); 1111 1115 int lr; 1112 1116 1113 - for_each_set_bit(lr, vgic_cpu->lr_used, vgic->nr_lr) { 1117 + for_each_clear_bit(lr, elrsr_ptr, vgic->nr_lr) { 1114 1118 struct vgic_lr vlr = vgic_get_lr(vcpu, lr); 1115 1119 1116 - if (!vgic_irq_is_enabled(vcpu, vlr.irq)) { 1117 - vgic_retire_lr(lr, vlr.irq, vcpu); 1118 - if (vgic_irq_is_queued(vcpu, vlr.irq)) 1119 - vgic_irq_clear_queued(vcpu, vlr.irq); 1120 - } 1120 + if (!vgic_irq_is_enabled(vcpu, vlr.irq)) 1121 + vgic_retire_lr(lr, vcpu); 1121 1122 } 1122 1123 } 1123 1124 ··· 1127 1132 kvm_debug("Set active, clear distributor: 0x%x\n", vlr.state); 1128 1133 vgic_irq_clear_active(vcpu, irq); 1129 1134 vgic_update_state(vcpu->kvm); 1130 - } else if (vgic_dist_irq_is_pending(vcpu, irq)) { 1135 + } else { 1136 + WARN_ON(!vgic_dist_irq_is_pending(vcpu, irq)); 1131 1137 vlr.state |= LR_STATE_PENDING; 1132 1138 kvm_debug("Set pending: 0x%x\n", vlr.state); 1133 1139 } ··· 1155 1159 } 1156 1160 1157 1161 vgic_set_lr(vcpu, lr_nr, vlr); 1158 - vgic_sync_lr_elrsr(vcpu, lr_nr, vlr); 1159 1162 } 1160 1163 1161 1164 /* ··· 1164 1169 */ 1165 1170 bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq) 1166 1171 { 1167 - struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; 1168 1172 struct vgic_dist *dist = &vcpu->kvm->arch.vgic; 1173 + u64 elrsr = vgic_get_elrsr(vcpu); 1174 + unsigned long *elrsr_ptr = 
u64_to_bitmask(&elrsr); 1169 1175 struct vgic_lr vlr; 1170 1176 int lr; 1171 1177 ··· 1177 1181 1178 1182 kvm_debug("Queue IRQ%d\n", irq); 1179 1183 1180 - lr = vgic_cpu->vgic_irq_lr_map[irq]; 1181 - 1182 1184 /* Do we have an active interrupt for the same CPUID? */ 1183 - if (lr != LR_EMPTY) { 1185 + for_each_clear_bit(lr, elrsr_ptr, vgic->nr_lr) { 1184 1186 vlr = vgic_get_lr(vcpu, lr); 1185 - if (vlr.source == sgi_source_id) { 1187 + if (vlr.irq == irq && vlr.source == sgi_source_id) { 1186 1188 kvm_debug("LR%d piggyback for IRQ%d\n", lr, vlr.irq); 1187 - BUG_ON(!test_bit(lr, vgic_cpu->lr_used)); 1188 1189 vgic_queue_irq_to_lr(vcpu, irq, lr, vlr); 1189 1190 return true; 1190 1191 } 1191 1192 } 1192 1193 1193 1194 /* Try to use another LR for this interrupt */ 1194 - lr = find_first_zero_bit((unsigned long *)vgic_cpu->lr_used, 1195 - vgic->nr_lr); 1195 + lr = find_first_bit(elrsr_ptr, vgic->nr_lr); 1196 1196 if (lr >= vgic->nr_lr) 1197 1197 return false; 1198 1198 1199 1199 kvm_debug("LR%d allocated for IRQ%d %x\n", lr, irq, sgi_source_id); 1200 - vgic_cpu->vgic_irq_lr_map[irq] = lr; 1201 - set_bit(lr, vgic_cpu->lr_used); 1202 1200 1203 1201 vlr.irq = irq; 1204 1202 vlr.source = sgi_source_id; ··· 1230 1240 struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; 1231 1241 struct vgic_dist *dist = &vcpu->kvm->arch.vgic; 1232 1242 unsigned long *pa_percpu, *pa_shared; 1233 - int i, vcpu_id, lr, ret; 1243 + int i, vcpu_id; 1234 1244 int overflow = 0; 1235 1245 int nr_shared = vgic_nr_shared_irqs(dist); 1236 1246 ··· 1285 1295 */ 1286 1296 clear_bit(vcpu_id, dist->irq_pending_on_cpu); 1287 1297 } 1298 + } 1288 1299 1289 - for (lr = 0; lr < vgic->nr_lr; lr++) { 1290 - struct vgic_lr vlr; 1300 + static int process_queued_irq(struct kvm_vcpu *vcpu, 1301 + int lr, struct vgic_lr vlr) 1302 + { 1303 + int pending = 0; 1291 1304 1292 - if (!test_bit(lr, vgic_cpu->lr_used)) 1293 - continue; 1305 + /* 1306 + * If the IRQ was EOIed (called from vgic_process_maintenance) or it 1307 + 
* went from active to non-active (called from vgic_sync_hwirq) it was 1308 + * also ACKed and we therefore assume we can clear the soft pending 1309 + * state (should it have been set) for this interrupt. 1310 + * 1311 + * Note: if the IRQ soft pending state was set after the IRQ was 1312 + * acked, it actually shouldn't be cleared, but we have no way of 1313 + * knowing that unless we start trapping ACKs when the soft-pending 1314 + * state is set. 1315 + */ 1316 + vgic_dist_irq_clear_soft_pend(vcpu, vlr.irq); 1294 1317 1295 - vlr = vgic_get_lr(vcpu, lr); 1318 + /* 1319 + * Tell the gic to start sampling this interrupt again. 1320 + */ 1321 + vgic_irq_clear_queued(vcpu, vlr.irq); 1296 1322 1297 - /* 1298 - * If we have a mapping, and the virtual interrupt is 1299 - * presented to the guest (as pending or active), then we must 1300 - * set the state to active in the physical world. See 1301 - * Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt. 1302 - */ 1303 - if (vlr.state & LR_HW) { 1304 - struct irq_phys_map *map; 1305 - map = vgic_irq_map_search(vcpu, vlr.irq); 1306 - 1307 - ret = irq_set_irqchip_state(map->irq, 1308 - IRQCHIP_STATE_ACTIVE, 1309 - true); 1310 - WARN_ON(ret); 1323 + /* Any additional pending interrupt? */ 1324 + if (vgic_irq_is_edge(vcpu, vlr.irq)) { 1325 + BUG_ON(!(vlr.state & LR_HW)); 1326 + pending = vgic_dist_irq_is_pending(vcpu, vlr.irq); 1327 + } else { 1328 + if (vgic_dist_irq_get_level(vcpu, vlr.irq)) { 1329 + vgic_cpu_irq_set(vcpu, vlr.irq); 1330 + pending = 1; 1331 + } else { 1332 + vgic_dist_irq_clear_pending(vcpu, vlr.irq); 1333 + vgic_cpu_irq_clear(vcpu, vlr.irq); 1311 1334 } 1312 1335 } 1336 + 1337 + /* 1338 + * Despite being EOIed, the LR may not have 1339 + * been marked as empty. 
1340 + */ 1341 + vlr.state = 0; 1342 + vlr.hwirq = 0; 1343 + vgic_set_lr(vcpu, lr, vlr); 1344 + 1345 + return pending; 1313 1346 } 1314 1347 1315 1348 static bool vgic_process_maintenance(struct kvm_vcpu *vcpu) 1316 1349 { 1317 1350 u32 status = vgic_get_interrupt_status(vcpu); 1318 1351 struct vgic_dist *dist = &vcpu->kvm->arch.vgic; 1319 - bool level_pending = false; 1320 1352 struct kvm *kvm = vcpu->kvm; 1353 + int level_pending = 0; 1321 1354 1322 1355 kvm_debug("STATUS = %08x\n", status); 1323 1356 ··· 1355 1342 1356 1343 for_each_set_bit(lr, eisr_ptr, vgic->nr_lr) { 1357 1344 struct vgic_lr vlr = vgic_get_lr(vcpu, lr); 1345 + 1358 1346 WARN_ON(vgic_irq_is_edge(vcpu, vlr.irq)); 1359 - 1360 - spin_lock(&dist->lock); 1361 - vgic_irq_clear_queued(vcpu, vlr.irq); 1362 1347 WARN_ON(vlr.state & LR_STATE_MASK); 1363 - vlr.state = 0; 1364 - vgic_set_lr(vcpu, lr, vlr); 1365 1348 1366 - /* 1367 - * If the IRQ was EOIed it was also ACKed and we we 1368 - * therefore assume we can clear the soft pending 1369 - * state (should it had been set) for this interrupt. 1370 - * 1371 - * Note: if the IRQ soft pending state was set after 1372 - * the IRQ was acked, it actually shouldn't be 1373 - * cleared, but we have no way of knowing that unless 1374 - * we start trapping ACKs when the soft-pending state 1375 - * is set. 1376 - */ 1377 - vgic_dist_irq_clear_soft_pend(vcpu, vlr.irq); 1378 1349 1379 1350 /* 1380 1351 * kvm_notify_acked_irq calls kvm_set_irq() 1381 - * to reset the IRQ level. Need to release the 1382 - * lock for kvm_set_irq to grab it. 1352 + * to reset the IRQ level, which grabs the dist->lock 1353 + * so we call this before taking the dist->lock. 1383 1354 */ 1384 - spin_unlock(&dist->lock); 1385 - 1386 1355 kvm_notify_acked_irq(kvm, 0, 1387 1356 vlr.irq - VGIC_NR_PRIVATE_IRQS); 1357 + 1388 1358 spin_lock(&dist->lock); 1389 - 1390 - /* Any additional pending interrupt? 
*/ 1391 - if (vgic_dist_irq_get_level(vcpu, vlr.irq)) { 1392 - vgic_cpu_irq_set(vcpu, vlr.irq); 1393 - level_pending = true; 1394 - } else { 1395 - vgic_dist_irq_clear_pending(vcpu, vlr.irq); 1396 - vgic_cpu_irq_clear(vcpu, vlr.irq); 1397 - } 1398 - 1359 + level_pending |= process_queued_irq(vcpu, lr, vlr); 1399 1360 spin_unlock(&dist->lock); 1400 - 1401 - /* 1402 - * Despite being EOIed, the LR may not have 1403 - * been marked as empty. 1404 - */ 1405 - vgic_sync_lr_elrsr(vcpu, lr, vlr); 1406 1361 } 1407 1362 } 1408 1363 ··· 1391 1410 /* 1392 1411 * Save the physical active state, and reset it to inactive. 1393 1412 * 1394 - * Return 1 if HW interrupt went from active to inactive, and 0 otherwise. 1413 + * Return true if there's a pending forwarded interrupt to queue. 1395 1414 */ 1396 - static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr) 1415 + static bool vgic_sync_hwirq(struct kvm_vcpu *vcpu, int lr, struct vgic_lr vlr) 1397 1416 { 1417 + struct vgic_dist *dist = &vcpu->kvm->arch.vgic; 1398 1418 struct irq_phys_map *map; 1419 + bool phys_active; 1420 + bool level_pending; 1399 1421 int ret; 1400 1422 1401 1423 if (!(vlr.state & LR_HW)) 1402 - return 0; 1424 + return false; 1403 1425 1404 1426 map = vgic_irq_map_search(vcpu, vlr.irq); 1405 - BUG_ON(!map || !map->active); 1427 + BUG_ON(!map); 1406 1428 1407 1429 ret = irq_get_irqchip_state(map->irq, 1408 1430 IRQCHIP_STATE_ACTIVE, 1409 - &map->active); 1431 + &phys_active); 1410 1432 1411 1433 WARN_ON(ret); 1412 1434 1413 - if (map->active) { 1414 - ret = irq_set_irqchip_state(map->irq, 1415 - IRQCHIP_STATE_ACTIVE, 1416 - false); 1417 - WARN_ON(ret); 1435 + if (phys_active) 1418 1436 return 0; 1419 - } 1420 1437 1421 - return 1; 1438 + spin_lock(&dist->lock); 1439 + level_pending = process_queued_irq(vcpu, lr, vlr); 1440 + spin_unlock(&dist->lock); 1441 + return level_pending; 1422 1442 } 1423 1443 1424 1444 /* Sync back the VGIC state after a guest run */ 1425 1445 static void 
__kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu) 1426 1446 { 1427 - struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; 1428 1447 struct vgic_dist *dist = &vcpu->kvm->arch.vgic; 1429 1448 u64 elrsr; 1430 1449 unsigned long *elrsr_ptr; ··· 1432 1451 bool level_pending; 1433 1452 1434 1453 level_pending = vgic_process_maintenance(vcpu); 1435 - elrsr = vgic_get_elrsr(vcpu); 1436 - elrsr_ptr = u64_to_bitmask(&elrsr); 1437 1454 1438 1455 /* Deal with HW interrupts, and clear mappings for empty LRs */ 1439 1456 for (lr = 0; lr < vgic->nr_lr; lr++) { 1440 - struct vgic_lr vlr; 1457 + struct vgic_lr vlr = vgic_get_lr(vcpu, lr); 1441 1458 1442 - if (!test_bit(lr, vgic_cpu->lr_used)) 1443 - continue; 1444 - 1445 - vlr = vgic_get_lr(vcpu, lr); 1446 - if (vgic_sync_hwirq(vcpu, vlr)) { 1447 - /* 1448 - * So this is a HW interrupt that the guest 1449 - * EOI-ed. Clean the LR state and allow the 1450 - * interrupt to be sampled again. 1451 - */ 1452 - vlr.state = 0; 1453 - vlr.hwirq = 0; 1454 - vgic_set_lr(vcpu, lr, vlr); 1455 - vgic_irq_clear_queued(vcpu, vlr.irq); 1456 - set_bit(lr, elrsr_ptr); 1457 - } 1458 - 1459 - if (!test_bit(lr, elrsr_ptr)) 1460 - continue; 1461 - 1462 - clear_bit(lr, vgic_cpu->lr_used); 1463 - 1459 + level_pending |= vgic_sync_hwirq(vcpu, lr, vlr); 1464 1460 BUG_ON(vlr.irq >= dist->nr_irqs); 1465 - vgic_cpu->vgic_irq_lr_map[vlr.irq] = LR_EMPTY; 1466 1461 } 1467 1462 1468 1463 /* Check if we still have something up our sleeve... 
*/ 1464 + elrsr = vgic_get_elrsr(vcpu); 1465 + elrsr_ptr = u64_to_bitmask(&elrsr); 1469 1466 pending = find_first_zero_bit(elrsr_ptr, vgic->nr_lr); 1470 1467 if (level_pending || pending < vgic->nr_lr) 1471 1468 set_bit(vcpu->vcpu_id, dist->irq_pending_on_cpu); ··· 1533 1574 int enabled; 1534 1575 bool ret = true, can_inject = true; 1535 1576 1577 + trace_vgic_update_irq_pending(cpuid, irq_num, level); 1578 + 1536 1579 if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020)) 1537 1580 return -EINVAL; 1538 1581 ··· 1568 1607 } else { 1569 1608 if (level_triggered) { 1570 1609 vgic_dist_irq_clear_level(vcpu, irq_num); 1571 - if (!vgic_dist_irq_soft_pend(vcpu, irq_num)) 1610 + if (!vgic_dist_irq_soft_pend(vcpu, irq_num)) { 1572 1611 vgic_dist_irq_clear_pending(vcpu, irq_num); 1612 + vgic_cpu_irq_clear(vcpu, irq_num); 1613 + if (!compute_pending_for_cpu(vcpu)) 1614 + clear_bit(cpuid, dist->irq_pending_on_cpu); 1615 + } 1573 1616 } 1574 1617 1575 1618 ret = false; ··· 1814 1849 } 1815 1850 1816 1851 /** 1817 - * kvm_vgic_get_phys_irq_active - Return the active state of a mapped IRQ 1818 - * 1819 - * Return the logical active state of a mapped interrupt. This doesn't 1820 - * necessarily reflects the current HW state. 1821 - */ 1822 - bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map) 1823 - { 1824 - BUG_ON(!map); 1825 - return map->active; 1826 - } 1827 - 1828 - /** 1829 - * kvm_vgic_set_phys_irq_active - Set the active state of a mapped IRQ 1830 - * 1831 - * Set the logical active state of a mapped interrupt. This doesn't 1832 - * immediately affects the HW state. 
1833 - */ 1834 - void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active) 1835 - { 1836 - BUG_ON(!map); 1837 - map->active = active; 1838 - } 1839 - 1840 - /** 1841 1852 * kvm_vgic_unmap_phys_irq - Remove a virtual to physical IRQ mapping 1842 1853 * @vcpu: The VCPU pointer 1843 1854 * @map: The pointer to a mapping obtained through kvm_vgic_map_phys_irq ··· 1868 1927 kfree(vgic_cpu->pending_shared); 1869 1928 kfree(vgic_cpu->active_shared); 1870 1929 kfree(vgic_cpu->pend_act_shared); 1871 - kfree(vgic_cpu->vgic_irq_lr_map); 1872 1930 vgic_destroy_irq_phys_map(vcpu->kvm, &vgic_cpu->irq_phys_map_list); 1873 1931 vgic_cpu->pending_shared = NULL; 1874 1932 vgic_cpu->active_shared = NULL; 1875 1933 vgic_cpu->pend_act_shared = NULL; 1876 - vgic_cpu->vgic_irq_lr_map = NULL; 1877 1934 } 1878 1935 1879 1936 static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs) ··· 1882 1943 vgic_cpu->pending_shared = kzalloc(sz, GFP_KERNEL); 1883 1944 vgic_cpu->active_shared = kzalloc(sz, GFP_KERNEL); 1884 1945 vgic_cpu->pend_act_shared = kzalloc(sz, GFP_KERNEL); 1885 - vgic_cpu->vgic_irq_lr_map = kmalloc(nr_irqs, GFP_KERNEL); 1886 1946 1887 1947 if (!vgic_cpu->pending_shared 1888 1948 || !vgic_cpu->active_shared 1889 - || !vgic_cpu->pend_act_shared 1890 - || !vgic_cpu->vgic_irq_lr_map) { 1949 + || !vgic_cpu->pend_act_shared) { 1891 1950 kvm_vgic_vcpu_destroy(vcpu); 1892 1951 return -ENOMEM; 1893 1952 } 1894 - 1895 - memset(vgic_cpu->vgic_irq_lr_map, LR_EMPTY, nr_irqs); 1896 1953 1897 1954 /* 1898 1955 * Store the number of LRs per vcpu, so we don't have to go ··· 2031 2096 break; 2032 2097 } 2033 2098 2034 - for (i = 0; i < dist->nr_irqs; i++) { 2035 - if (i < VGIC_NR_PPIS) 2099 + /* 2100 + * Enable and configure all SGIs to be edge-triggered and 2101 + * configure all PPIs as level-triggered. 
2102 + */ 2103 + for (i = 0; i < VGIC_NR_PRIVATE_IRQS; i++) { 2104 + if (i < VGIC_NR_SGIS) { 2105 + /* SGIs */ 2036 2106 vgic_bitmap_set_irq_val(&dist->irq_enabled, 2037 2107 vcpu->vcpu_id, i, 1); 2038 - if (i < VGIC_NR_PRIVATE_IRQS) 2039 2108 vgic_bitmap_set_irq_val(&dist->irq_cfg, 2040 2109 vcpu->vcpu_id, i, 2041 2110 VGIC_CFG_EDGE); 2111 + } else if (i < VGIC_NR_PRIVATE_IRQS) { 2112 + /* PPIs */ 2113 + vgic_bitmap_set_irq_val(&dist->irq_cfg, 2114 + vcpu->vcpu_id, i, 2115 + VGIC_CFG_LEVEL); 2116 + } 2042 2117 } 2043 2118 2044 2119 vgic_enable(vcpu);
+3
virt/kvm/kvm_main.c
··· 2021 2021 } while (single_task_running() && ktime_before(cur, stop)); 2022 2022 } 2023 2023 2024 + kvm_arch_vcpu_blocking(vcpu); 2025 + 2024 2026 for (;;) { 2025 2027 prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE); 2026 2028 ··· 2036 2034 finish_wait(&vcpu->wq, &wait); 2037 2035 cur = ktime_get(); 2038 2036 2037 + kvm_arch_vcpu_unblocking(vcpu); 2039 2038 out: 2040 2039 block_ns = ktime_to_ns(cur) - ktime_to_ns(start); 2041 2040