Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'kvmarm-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm updates for Linux 5.5:

- Allow non-ISV data aborts to be reported to userspace
- Allow injection of data aborts from userspace
- Expose stolen time to guests
- GICv4 performance improvements
- vgic ITS emulation fixes
- Simplify FWB handling
- Enable halt pool counters
- Make the emulated timer PREEMPT_RT compliant

Conflicts:
include/uapi/linux/kvm.h

+1017 -277
+3 -3
Documentation/admin-guide/kernel-parameters.txt
···
 			[X86,PV_OPS] Disable paravirtualized VMware scheduler
 			clock and use the default one.

-	no-steal-acc	[X86,KVM] Disable paravirtualized steal time accounting.
-			steal time is computed, but won't influence scheduler
-			behaviour
+	no-steal-acc	[X86,KVM,ARM64] Disable paravirtualized steal time
+			accounting. steal time is computed, but won't
+			influence scheduler behaviour

 	nolapic		[X86-32,APIC] Do not enable or use the local APIC.
+54 -1
Documentation/virt/kvm/api.txt
···
 -EINVAL. Setting anything other than the lower 24bits of exception.serror_esr
 will return -EINVAL.

+It is not possible to read back a pending external abort (injected via
+KVM_SET_VCPU_EVENTS or otherwise) because such an exception is always delivered
+directly to the virtual CPU.
+
+
 struct kvm_vcpu_events {
 	struct {
 		__u8 serror_pending;
 		__u8 serror_has_esr;
+		__u8 ext_dabt_pending;
 		/* Align it to 8 bytes */
-		__u8 pad[6];
+		__u8 pad[5];
 		__u64 serror_esr;
 	} exception;
 	__u32 reserved[12];
···
 ARM/ARM64:

+User space may need to inject several types of events to the guest.
+
 Set the pending SError exception state for this VCPU. It is not possible to
 'cancel' an SError that has been made pending.
+
+If the guest performed an access to I/O memory which could not be handled by
+userspace, for example because of missing instruction syndrome decode
+information or because there is no device mapped at the accessed IPA, then
+userspace can ask the kernel to inject an external abort using the address
+from the exiting fault on the VCPU. It is a programming error to set
+ext_dabt_pending after an exit which was not either KVM_EXIT_MMIO or
+KVM_EXIT_ARM_NISV. This feature is only available if the system supports
+KVM_CAP_ARM_INJECT_EXT_DABT. This is a helper which provides commonality in
+how userspace reports accesses for the above cases to guests, across different
+userspace implementations. Nevertheless, userspace can still emulate all Arm
+exceptions by manipulating individual registers using the KVM_SET_ONE_REG API.

 See KVM_GET_VCPU_EVENTS for the data structure.
···
 	Hyper-V SynIC state change. Notification is used to remap SynIC
 	event/message pages and to enable/disable SynIC messages/events processing
 	in userspace.

+		/* KVM_EXIT_ARM_NISV */
+		struct {
+			__u64 esr_iss;
+			__u64 fault_ipa;
+		} arm_nisv;
+
+Used on arm and arm64 systems. If a guest accesses memory not in a memslot,
+KVM will typically return to userspace and ask it to do MMIO emulation on its
+behalf. However, for certain classes of instructions, no instruction decode
+(direction, length of memory access) is provided, and fetching and decoding
+the instruction from the VM is overly complicated to live in the kernel.
+
+Historically, when this situation occurred, KVM would print a warning and kill
+the VM. KVM assumed that if the guest accessed non-memslot memory, it was
+trying to do I/O, which just couldn't be emulated, and the warning message was
+phrased accordingly. However, what happened more often was that a guest bug
+caused access outside the guest memory areas which should lead to a more
+meaningful warning message and an external abort in the guest, if the access
+did not fall within an I/O window.
+
+Userspace implementations can query for KVM_CAP_ARM_NISV_TO_USER, and enable
+this capability at VM creation. Once this is done, these types of errors will
+instead return to userspace with KVM_EXIT_ARM_NISV, with the valid bits from
+the HSR (arm) and ESR_EL2 (arm64) in the esr_iss field, and the faulting IPA
+in the fault_ipa field. Userspace can either fix up the access if it's
+actually an I/O access by decoding the instruction from guest memory (if it's
+very brave) and continue executing the guest, or it can decide to suspend,
+dump, or restart the guest.
+
+Note that KVM does not skip the faulting instruction as it does for
+KVM_EXIT_MMIO, but userspace has to emulate any change to the processing state
+if it decides to decode and emulate the instruction.

 /* Fix the size of the union. */
 char padding[256];
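The esr_iss field above carries only the syndrome bits KVM considers safe to expose for a non-ISV abort: the cache maintenance (CM), write-not-read (WNR), and fault status code (FSC) fields. As a rough illustration of how a VMM might extract those bits, here is a small sketch; the mask values mirror the arm64 ESR_ELx encoding, and the helper name is made up for this example:

```c
#include <assert.h>
#include <stdint.h>

/* Masks mirroring the kernel's ESR_ELx bit definitions (illustrative). */
#define ESR_ELx_CM   (1ULL << 8)   /* cache maintenance operation */
#define ESR_ELx_WNR  (1ULL << 6)   /* write, not read */
#define ESR_ELx_FSC  0x3fULL       /* fault status code */

/*
 * Userspace view of the sanitization KVM applies before reporting
 * esr_iss in a KVM_EXIT_ARM_NISV exit: everything outside CM, WNR
 * and FSC is masked off.
 */
static uint64_t nisv_sanitized_iss(uint64_t esr)
{
	return esr & (ESR_ELx_CM | ESR_ELx_WNR | ESR_ELx_FSC);
}
```

A VMM would typically look at WNR to pick a transfer direction and at FSC to distinguish translation faults from other abort classes before deciding whether to attempt instruction decode.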
+80
Documentation/virt/kvm/arm/pvtime.rst
···
+.. SPDX-License-Identifier: GPL-2.0
+
+Paravirtualized time support for arm64
+======================================
+
+Arm specification DEN0057/A defines a standard for paravirtualised time
+support for AArch64 guests:
+
+https://developer.arm.com/docs/den0057/a
+
+KVM/arm64 implements the stolen time part of this specification by providing
+some hypervisor service calls to support a paravirtualized guest obtaining a
+view of the amount of time stolen from its execution.
+
+Two new SMCCC compatible hypercalls are defined:
+
+* PV_TIME_FEATURES: 0xC5000020
+* PV_TIME_ST:       0xC5000021
+
+These are only available in the SMC64/HVC64 calling convention as
+paravirtualized time is not available to 32 bit Arm guests. The existence of
+the PV_FEATURES hypercall should be probed using the SMCCC 1.1 ARCH_FEATURES
+mechanism before calling it.
+
+PV_TIME_FEATURES
+    =============    ========    ==========
+    Function ID:     (uint32)    0xC5000020
+    PV_call_id:      (uint32)    The function to query for support.
+                                 Currently only PV_TIME_ST is supported.
+    Return value:    (int64)     NOT_SUPPORTED (-1) or SUCCESS (0) if the
+                                 relevant PV-time feature is supported by
+                                 the hypervisor.
+    =============    ========    ==========
+
+PV_TIME_ST
+    =============    ========    ==========
+    Function ID:     (uint32)    0xC5000021
+    Return value:    (int64)     IPA of the stolen time data structure for
+                                 this VCPU. On failure:
+                                 NOT_SUPPORTED (-1)
+    =============    ========    ==========
+
+The IPA returned by PV_TIME_ST should be mapped by the guest as normal memory
+with inner and outer write back caching attributes, in the inner shareable
+domain. A total of 16 bytes from the IPA returned are guaranteed to be
+meaningfully filled by the hypervisor (see structure below).
+
+PV_TIME_ST returns the structure for the calling VCPU.
+
+Stolen Time
+-----------
+
+The structure pointed to by the PV_TIME_ST hypercall is as follows:
+
++-------------+-------------+-------------+----------------------------+
+| Field       | Byte Length | Byte Offset | Description                |
++=============+=============+=============+============================+
+| Revision    | 4           | 0           | Must be 0 for version 1.0  |
++-------------+-------------+-------------+----------------------------+
+| Attributes  | 4           | 4           | Must be 0                  |
++-------------+-------------+-------------+----------------------------+
+| Stolen time | 8           | 8           | Stolen time in unsigned    |
+|             |             |             | nanoseconds indicating how |
+|             |             |             | much time this VCPU thread |
+|             |             |             | was involuntarily not      |
+|             |             |             | running on a physical CPU. |
++-------------+-------------+-------------+----------------------------+
+
+All values in the structure are stored little-endian.
+
+The structure will be updated by the hypervisor prior to scheduling a VCPU. It
+will be present within a reserved region of the normal memory given to the
+guest. The guest should not attempt to write into this memory. There is a
+structure per VCPU of the guest.
+
+It is advisable that one or more 64k pages are set aside for the purpose of
+these structures and not used for other purposes, this enables the guest to map
+the region using 64k pages and avoids conflicting attributes with other memory.
+
+For the user space interface see Documentation/virt/kvm/devices/vcpu.txt
+section "3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL".
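The table above describes a 64-byte, packed, little-endian structure. A minimal sketch of how a guest-side consumer might model that layout and decode the stolen-time counter from raw bytes (the helper name is illustrative, not from the patch):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the DEN0057/A stolen time structure described above (sketch). */
struct stolen_time_page {
	uint32_t revision;     /* must be 0 for version 1.0 */
	uint32_t attributes;   /* must be 0 */
	uint64_t stolen_time;  /* little-endian nanoseconds */
	uint8_t  padding[48];  /* pad the structure to 64 bytes */
} __attribute__((packed));

/* The layout constraints stated in the table, checked at compile time. */
_Static_assert(sizeof(struct stolen_time_page) == 64, "64-byte structure");
_Static_assert(offsetof(struct stolen_time_page, stolen_time) == 8,
	       "stolen time at byte offset 8");

/*
 * Portable little-endian decode of the stolen time field from a raw
 * view of the shared page, independent of host byte order.
 */
static uint64_t stolen_time_ns(const uint8_t *raw)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | raw[8 + i];
	return v;
}
```

In the real guest implementation (arch/arm64/kernel/paravirt.c below) the equivalent read is `le64_to_cpu(READ_ONCE(reg->kaddr->stolen_time))` on a `memremap`'d mapping of the page.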
+14
Documentation/virt/kvm/devices/vcpu.txt
···
 configured values on other VCPUs.  Userspace should configure the interrupt
 numbers on at least one VCPU after creating all VCPUs and before running any
 VCPUs.
+
+3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL
+Architectures: ARM64
+
+3.1 ATTRIBUTE: KVM_ARM_VCPU_PVTIME_IPA
+Parameters: 64-bit base address
+Returns: -ENXIO:  Stolen time not implemented
+         -EEXIST: Base address already set for this VCPU
+         -EINVAL: Base address not 64 byte aligned
+
+Specifies the base address of the stolen time structure for this VCPU. The
+base address must be 64 byte aligned and exist within a valid guest memory
+region. See Documentation/virt/kvm/arm/pvtime.txt for more information
+including the layout of the stolen time structure.
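The -EINVAL case above reflects a simple alignment rule: the guest physical base address must be a multiple of 64, matching the 64-byte structure size. A sketch of the check a VMM might apply before issuing KVM_SET_DEVICE_ATTR (the helper name is hypothetical, not the kernel's actual code):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Validate a candidate KVM_ARM_VCPU_PVTIME_IPA base address the way the
 * documented ABI requires: reject anything not 64 byte aligned.
 */
static int pvtime_check_base(uint64_t base)
{
	if (base & 63)		/* must be 64 byte aligned */
		return -EINVAL;
	return 0;
}
```

On success the VMM would pass the address via a `struct kvm_device_attr` with group KVM_ARM_VCPU_PVTIME_CTRL and attr KVM_ARM_VCPU_PVTIME_IPA; the address must also fall inside a registered memslot, which only the kernel can verify.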
+1
arch/arm/include/asm/kvm_arm.h
···
 #define HSR_ISV		(_AC(1, UL) << HSR_ISV_SHIFT)
 #define HSR_SRT_SHIFT	(16)
 #define HSR_SRT_MASK	(0xf << HSR_SRT_SHIFT)
+#define HSR_CM		(1 << 8)
 #define HSR_FSC		(0x3f)
 #define HSR_FSC_TYPE	(0x3c)
 #define HSR_SSE		(1 << 21)
+7 -2
arch/arm/include/asm/kvm_emulate.h
···
 	return (unsigned long *)&vcpu->arch.hcr;
 }

-static inline void vcpu_clear_wfe_traps(struct kvm_vcpu *vcpu)
+static inline void vcpu_clear_wfx_traps(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.hcr &= ~HCR_TWE;
 }

-static inline void vcpu_set_wfe_traps(struct kvm_vcpu *vcpu)
+static inline void vcpu_set_wfx_traps(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.hcr |= HCR_TWE;
 }
···
 static inline bool kvm_vcpu_dabt_isvalid(struct kvm_vcpu *vcpu)
 {
 	return kvm_vcpu_get_hsr(vcpu) & HSR_ISV;
+}
+
+static inline unsigned long kvm_vcpu_dabt_iss_nisv_sanitized(const struct kvm_vcpu *vcpu)
+{
+	return kvm_vcpu_get_hsr(vcpu) & (HSR_CM | HSR_WNR | HSR_FSC);
 }

 static inline bool kvm_vcpu_dabt_iswrite(struct kvm_vcpu *vcpu)
+33
arch/arm/include/asm/kvm_host.h
···
 #ifndef __ARM_KVM_HOST_H__
 #define __ARM_KVM_HOST_H__

+#include <linux/arm-smccc.h>
 #include <linux/errno.h>
 #include <linux/types.h>
 #include <linux/kvm_types.h>
···
 	KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_IRQ_PENDING	KVM_ARCH_REQ(1)
 #define KVM_REQ_VCPU_RESET	KVM_ARCH_REQ(2)
+#define KVM_REQ_RECORD_STEAL	KVM_ARCH_REQ(3)

 DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
···
 	/* Mandated version of PSCI */
 	u32 psci_version;
+
+	/*
+	 * If we encounter a data abort without valid instruction syndrome
+	 * information, report this to user space.  User space can (and
+	 * should) opt in to this feature if KVM_CAP_ARM_NISV_TO_USER is
+	 * supported.
+	 */
+	bool return_nisv_io_abort_to_user;
 };

 #define KVM_NR_MEM_OBJS     40
···
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
+
+static inline long kvm_hypercall_pv_features(struct kvm_vcpu *vcpu)
+{
+	return SMCCC_RET_NOT_SUPPORTED;
+}
+
+static inline gpa_t kvm_init_stolen_time(struct kvm_vcpu *vcpu)
+{
+	return GPA_INVALID;
+}
+
+static inline void kvm_update_stolen_time(struct kvm_vcpu *vcpu)
+{
+}
+
+static inline void kvm_arm_pvtime_vcpu_init(struct kvm_vcpu_arch *vcpu_arch)
+{
+}
+
+static inline bool kvm_arm_is_pvtime_enabled(struct kvm_vcpu_arch *vcpu_arch)
+{
+	return false;
+}

 void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
+2 -1
arch/arm/include/uapi/asm/kvm.h
···
 	struct {
 		__u8 serror_pending;
 		__u8 serror_has_esr;
+		__u8 ext_dabt_pending;
 		/* Align it to 8 bytes */
-		__u8 pad[6];
+		__u8 pad[5];
 		__u64 serror_esr;
 	} exception;
 	__u32 reserved[12];
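Note how the new ext_dabt_pending flag simply consumes one byte of the existing padding, so the exception block keeps its size and the 8-byte alignment of serror_esr: the uapi stays binary compatible. A sanity-check sketch using a local mirror of the layout (illustrative, not the uapi header itself):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the updated exception block, for illustration only. */
struct vcpu_events_exception {
	uint8_t  serror_pending;
	uint8_t  serror_has_esr;
	uint8_t  ext_dabt_pending;
	/* Align it to 8 bytes */
	uint8_t  pad[5];
	uint64_t serror_esr;
};

/* 3 flag bytes + 5 pad bytes put serror_esr at offset 8, total 16 bytes. */
_Static_assert(sizeof(struct vcpu_events_exception) == 16,
	       "layout unchanged by the new flag");
_Static_assert(offsetof(struct vcpu_events_exception, serror_esr) == 8,
	       "serror_esr stays 8-byte aligned");
```

Before this change the same 16 bytes were occupied by two flags and pad[6]; old binaries that zero the structure therefore continue to work, reading ext_dabt_pending as "not pending".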
+1 -1
arch/arm/kvm/Makefile
···
 obj-y += handle_exit.o guest.o emulate.o reset.o
 obj-y += coproc.o coproc_a15.o coproc_a7.o vgic-v3-coproc.o
 obj-y += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o
-obj-y += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
+obj-y += $(KVM)/arm/psci.o $(KVM)/arm/perf.o $(KVM)/arm/hypercalls.o
 obj-y += $(KVM)/arm/aarch32.o

 obj-y += $(KVM)/arm/vgic/vgic.o
+14
arch/arm/kvm/guest.c
···
 #define VCPU_STAT(x) { #x, offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU }

 struct kvm_stats_debugfs_item debugfs_entries[] = {
+	VCPU_STAT(halt_successful_poll),
+	VCPU_STAT(halt_attempted_poll),
+	VCPU_STAT(halt_poll_invalid),
+	VCPU_STAT(halt_wakeup),
 	VCPU_STAT(hvc_exit_stat),
 	VCPU_STAT(wfe_exit_stat),
 	VCPU_STAT(wfi_exit_stat),
···
 {
 	events->exception.serror_pending = !!(*vcpu_hcr(vcpu) & HCR_VA);

+	/*
+	 * We never return a pending ext_dabt here because we deliver it to
+	 * the virtual CPU directly when setting the event and it's no longer
+	 * 'pending' at this point.
+	 */
+
 	return 0;
 }
···
 {
 	bool serror_pending = events->exception.serror_pending;
 	bool has_esr = events->exception.serror_has_esr;
+	bool ext_dabt_pending = events->exception.ext_dabt_pending;

 	if (serror_pending && has_esr)
 		return -EINVAL;
 	else if (serror_pending)
 		kvm_inject_vabt(vcpu);
+
+	if (ext_dabt_pending)
+		kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));

 	return 0;
 }
+1 -1
arch/arm/kvm/handle_exit.c
···
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_coproc.h>
 #include <asm/kvm_mmu.h>
-#include <kvm/arm_psci.h>
+#include <kvm/arm_hypercalls.h>
 #include <trace/events/kvm.h>

 #include "trace.h"
+7 -14
arch/arm/mm/proc-v7-bugs.c
···
 // SPDX-License-Identifier: GPL-2.0
 #include <linux/arm-smccc.h>
 #include <linux/kernel.h>
-#include <linux/psci.h>
 #include <linux/smp.h>

 #include <asm/cp15.h>
···
 	case ARM_CPU_PART_CORTEX_A72: {
 		struct arm_smccc_res res;

-		if (psci_ops.smccc_version == SMCCC_VERSION_1_0)
-			break;
+		arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
+				     ARM_SMCCC_ARCH_WORKAROUND_1, &res);
+		if ((int)res.a0 != 0)
+			return;

-		switch (psci_ops.conduit) {
-		case PSCI_CONDUIT_HVC:
-			arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
-					  ARM_SMCCC_ARCH_WORKAROUND_1, &res);
-			if ((int)res.a0 != 0)
-				break;
+		switch (arm_smccc_1_1_get_conduit()) {
+		case SMCCC_CONDUIT_HVC:
 			per_cpu(harden_branch_predictor_fn, cpu) =
 				call_hvc_arch_workaround_1;
 			cpu_do_switch_mm = cpu_v7_hvc_switch_mm;
 			spectre_v2_method = "hypervisor";
 			break;

-		case PSCI_CONDUIT_SMC:
-			arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
-					  ARM_SMCCC_ARCH_WORKAROUND_1, &res);
-			if ((int)res.a0 != 0)
-				break;
+		case SMCCC_CONDUIT_SMC:
 			per_cpu(harden_branch_predictor_fn, cpu) =
 				call_smc_arch_workaround_1;
 			cpu_do_switch_mm = cpu_v7_smc_switch_mm;
+1 -2
arch/arm64/include/asm/kvm_arm.h
···
  * RW:		64bit by default, can be overridden for 32bit VMs
  * TAC:		Trap ACTLR
  * TSC:		Trap SMC
- * TVM:		Trap VM ops (until M+C set in SCTLR_EL1)
  * TSW:		Trap cache operations by set/way
  * TWE:		Trap WFE
  * TWI:		Trap WFI
···
  * SWIO:	Turn set/way invalidates into set/way clean+invalidate
  */
 #define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \
-			 HCR_TVM | HCR_BSU_IS | HCR_FB | HCR_TAC | \
+			 HCR_BSU_IS | HCR_FB | HCR_TAC | \
 			 HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | HCR_TLOR | \
 			 HCR_FMO | HCR_IMO)
 #define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF)
+23 -3
arch/arm64/include/asm/kvm_emulate.h
···
 		/* trap error record accesses */
 		vcpu->arch.hcr_el2 |= HCR_TERR;
 	}
-	if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
+
+	if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB)) {
 		vcpu->arch.hcr_el2 |= HCR_FWB;
+	} else {
+		/*
+		 * For non-FWB CPUs, we trap VM ops (HCR_EL2.TVM) until M+C
+		 * get set in SCTLR_EL1 such that we can detect when the guest
+		 * MMU gets turned on and do the necessary cache maintenance
+		 * then.
+		 */
+		vcpu->arch.hcr_el2 |= HCR_TVM;
+	}

 	if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features))
 		vcpu->arch.hcr_el2 &= ~HCR_RW;
···
 	return (unsigned long *)&vcpu->arch.hcr_el2;
 }

-static inline void vcpu_clear_wfe_traps(struct kvm_vcpu *vcpu)
+static inline void vcpu_clear_wfx_traps(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.hcr_el2 &= ~HCR_TWE;
+	if (atomic_read(&vcpu->arch.vgic_cpu.vgic_v3.its_vpe.vlpi_count))
+		vcpu->arch.hcr_el2 &= ~HCR_TWI;
+	else
+		vcpu->arch.hcr_el2 |= HCR_TWI;
 }

-static inline void vcpu_set_wfe_traps(struct kvm_vcpu *vcpu)
+static inline void vcpu_set_wfx_traps(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.hcr_el2 |= HCR_TWE;
+	vcpu->arch.hcr_el2 |= HCR_TWI;
 }

 static inline void vcpu_ptrauth_enable(struct kvm_vcpu *vcpu)
···
 static inline bool kvm_vcpu_dabt_isvalid(const struct kvm_vcpu *vcpu)
 {
 	return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_ISV);
+}
+
+static inline unsigned long kvm_vcpu_dabt_iss_nisv_sanitized(const struct kvm_vcpu *vcpu)
+{
+	return kvm_vcpu_get_hsr(vcpu) & (ESR_ELx_CM | ESR_ELx_WNR | ESR_ELx_FSC);
 }

 static inline bool kvm_vcpu_dabt_issext(const struct kvm_vcpu *vcpu)
+37
arch/arm64/include/asm/kvm_host.h
···
 	KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_IRQ_PENDING	KVM_ARCH_REQ(1)
 #define KVM_REQ_VCPU_RESET	KVM_ARCH_REQ(2)
+#define KVM_REQ_RECORD_STEAL	KVM_ARCH_REQ(3)

 DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
···
 	/* Mandated version of PSCI */
 	u32 psci_version;
+
+	/*
+	 * If we encounter a data abort without valid instruction syndrome
+	 * information, report this to user space.  User space can (and
+	 * should) opt in to this feature if KVM_CAP_ARM_NISV_TO_USER is
+	 * supported.
+	 */
+	bool return_nisv_io_abort_to_user;
 };

 #define KVM_NR_MEM_OBJS     40
···
 	/* True when deferrable sysregs are loaded on the physical CPU,
 	 * see kvm_vcpu_load_sysregs and kvm_vcpu_put_sysregs. */
 	bool sysregs_loaded_on_cpu;
+
+	/* Guest PV state */
+	struct {
+		u64 steal;
+		u64 last_steal;
+		gpa_t base;
+	} steal;
 };

 /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
···
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
+
+long kvm_hypercall_pv_features(struct kvm_vcpu *vcpu);
+gpa_t kvm_init_stolen_time(struct kvm_vcpu *vcpu);
+void kvm_update_stolen_time(struct kvm_vcpu *vcpu);
+
+int kvm_arm_pvtime_set_attr(struct kvm_vcpu *vcpu,
+			    struct kvm_device_attr *attr);
+int kvm_arm_pvtime_get_attr(struct kvm_vcpu *vcpu,
+			    struct kvm_device_attr *attr);
+int kvm_arm_pvtime_has_attr(struct kvm_vcpu *vcpu,
+			    struct kvm_device_attr *attr);
+
+static inline void kvm_arm_pvtime_vcpu_init(struct kvm_vcpu_arch *vcpu_arch)
+{
+	vcpu_arch->steal.base = GPA_INVALID;
+}
+
+static inline bool kvm_arm_is_pvtime_enabled(struct kvm_vcpu_arch *vcpu_arch)
+{
+	return (vcpu_arch->steal.base != GPA_INVALID);
+}

 void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
+8 -1
arch/arm64/include/asm/paravirt.h
···
 {
 	return pv_ops.time.steal_clock(cpu);
 }
-#endif
+
+int __init pv_time_init(void);
+
+#else
+
+#define pv_time_init() do {} while (0)
+
+#endif // CONFIG_PARAVIRT

 #endif
+17
arch/arm64/include/asm/pvclock-abi.h
···
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2019 Arm Ltd. */
+
+#ifndef __ASM_PVCLOCK_ABI_H
+#define __ASM_PVCLOCK_ABI_H
+
+/* The below structure is defined in ARM DEN0057A */
+
+struct pvclock_vcpu_stolen_time {
+	__le32 revision;
+	__le32 attributes;
+	__le64 stolen_time;
+	/* Structure must be 64 byte aligned, pad to that size */
+	u8 padding[48];
+} __packed;
+
+#endif
+4 -1
arch/arm64/include/uapi/asm/kvm.h
···
 	struct {
 		__u8 serror_pending;
 		__u8 serror_has_esr;
+		__u8 ext_dabt_pending;
 		/* Align it to 8 bytes */
-		__u8 pad[6];
+		__u8 pad[5];
 		__u64 serror_esr;
 	} exception;
 	__u32 reserved[12];
···
 #define KVM_ARM_VCPU_TIMER_CTRL		1
 #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER	0
 #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER	1
+#define KVM_ARM_VCPU_PVTIME_CTRL	2
+#define   KVM_ARM_VCPU_PVTIME_IPA	0

 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_VCPU2_SHIFT		28
+34 -70
arch/arm64/kernel/cpu_errata.c
···
  */

 #include <linux/arm-smccc.h>
-#include <linux/psci.h>
 #include <linux/types.h>
 #include <linux/cpu.h>
 #include <asm/cpu.h>
···
 }
 #endif	/* CONFIG_KVM_INDIRECT_VECTORS */

-#include <uapi/linux/psci.h>
 #include <linux/arm-smccc.h>
-#include <linux/psci.h>

 static void call_smc_arch_workaround_1(void)
 {
···
 	struct arm_smccc_res res;
 	u32 midr = read_cpuid_id();

-	if (psci_ops.smccc_version == SMCCC_VERSION_1_0)
-		return -1;
+	arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
+			     ARM_SMCCC_ARCH_WORKAROUND_1, &res);

-	switch (psci_ops.conduit) {
-	case PSCI_CONDUIT_HVC:
-		arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
-				  ARM_SMCCC_ARCH_WORKAROUND_1, &res);
-		switch ((int)res.a0) {
-		case 1:
-			/* Firmware says we're just fine */
-			return 0;
-		case 0:
-			cb = call_hvc_arch_workaround_1;
-			/* This is a guest, no need to patch KVM vectors */
-			smccc_start = NULL;
-			smccc_end = NULL;
-			break;
-		default:
-			return -1;
-		}
+	switch ((int)res.a0) {
+	case 1:
+		/* Firmware says we're just fine */
+		return 0;
+	case 0:
+		break;
+	default:
+		return -1;
+	}
+
+	switch (arm_smccc_1_1_get_conduit()) {
+	case SMCCC_CONDUIT_HVC:
+		cb = call_hvc_arch_workaround_1;
+		/* This is a guest, no need to patch KVM vectors */
+		smccc_start = NULL;
+		smccc_end = NULL;
 		break;

-	case PSCI_CONDUIT_SMC:
-		arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
-				  ARM_SMCCC_ARCH_WORKAROUND_1, &res);
-		switch ((int)res.a0) {
-		case 1:
-			/* Firmware says we're just fine */
-			return 0;
-		case 0:
-			cb = call_smc_arch_workaround_1;
-			smccc_start = __smccc_workaround_1_smc_start;
-			smccc_end = __smccc_workaround_1_smc_end;
-			break;
-		default:
-			return -1;
-		}
+	case SMCCC_CONDUIT_SMC:
+		cb = call_smc_arch_workaround_1;
+		smccc_start = __smccc_workaround_1_smc_start;
+		smccc_end = __smccc_workaround_1_smc_end;
 		break;

 	default:
···
 	BUG_ON(nr_inst != 1);

-	switch (psci_ops.conduit) {
-	case PSCI_CONDUIT_HVC:
+	switch (arm_smccc_1_1_get_conduit()) {
+	case SMCCC_CONDUIT_HVC:
 		insn = aarch64_insn_get_hvc_value();
 		break;
-	case PSCI_CONDUIT_SMC:
+	case SMCCC_CONDUIT_SMC:
 		insn = aarch64_insn_get_smc_value();
 		break;
 	default:
···
 void arm64_set_ssbd_mitigation(bool state)
 {
+	int conduit;
+
 	if (!IS_ENABLED(CONFIG_ARM64_SSBD)) {
 		pr_info_once("SSBD disabled by kernel configuration\n");
 		return;
···
 		return;
 	}

-	switch (psci_ops.conduit) {
-	case PSCI_CONDUIT_HVC:
-		arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_2, state, NULL);
-		break;
+	conduit = arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_WORKAROUND_2, state,
+				       NULL);

-	case PSCI_CONDUIT_SMC:
-		arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2, state, NULL);
-		break;
-
-	default:
-		WARN_ON_ONCE(1);
-		break;
-	}
+	WARN_ON_ONCE(conduit == SMCCC_CONDUIT_NONE);
 }

 static bool has_ssbd_mitigation(const struct arm64_cpu_capabilities *entry,
···
 	bool required = true;
 	s32 val;
 	bool this_cpu_safe = false;
+	int conduit;

 	WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
···
 		goto out_printmsg;
 	}

-	if (psci_ops.smccc_version == SMCCC_VERSION_1_0) {
-		ssbd_state = ARM64_SSBD_UNKNOWN;
-		if (!this_cpu_safe)
-			__ssb_safe = false;
-		return false;
-	}
+	conduit = arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
+				       ARM_SMCCC_ARCH_WORKAROUND_2, &res);

-	switch (psci_ops.conduit) {
-	case PSCI_CONDUIT_HVC:
-		arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
-				  ARM_SMCCC_ARCH_WORKAROUND_2, &res);
-		break;
-
-	case PSCI_CONDUIT_SMC:
-		arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
-				  ARM_SMCCC_ARCH_WORKAROUND_2, &res);
-		break;
-
-	default:
+	if (conduit == SMCCC_CONDUIT_NONE) {
 		ssbd_state = ARM64_SSBD_UNKNOWN;
 		if (!this_cpu_safe)
 			__ssb_safe = false;
+140
arch/arm64/kernel/paravirt.c
···
  * Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
  */

+#define pr_fmt(fmt) "arm-pv: " fmt
+
+#include <linux/arm-smccc.h>
+#include <linux/cpuhotplug.h>
 #include <linux/export.h>
+#include <linux/io.h>
 #include <linux/jump_label.h>
+#include <linux/printk.h>
+#include <linux/psci.h>
+#include <linux/reboot.h>
+#include <linux/slab.h>
 #include <linux/types.h>
+
 #include <asm/paravirt.h>
+#include <asm/pvclock-abi.h>
+#include <asm/smp_plat.h>

 struct static_key paravirt_steal_enabled;
 struct static_key paravirt_steal_rq_enabled;

 struct paravirt_patch_template pv_ops;
 EXPORT_SYMBOL_GPL(pv_ops);
+
+struct pv_time_stolen_time_region {
+	struct pvclock_vcpu_stolen_time *kaddr;
+};
+
+static DEFINE_PER_CPU(struct pv_time_stolen_time_region, stolen_time_region);
+
+static bool steal_acc = true;
+static int __init parse_no_stealacc(char *arg)
+{
+	steal_acc = false;
+	return 0;
+}
+
+early_param("no-steal-acc", parse_no_stealacc);
+
+/* return stolen time in ns by asking the hypervisor */
+static u64 pv_steal_clock(int cpu)
+{
+	struct pv_time_stolen_time_region *reg;
+
+	reg = per_cpu_ptr(&stolen_time_region, cpu);
+	if (!reg->kaddr) {
+		pr_warn_once("stolen time enabled but not configured for cpu %d\n",
+			     cpu);
+		return 0;
+	}
+
+	return le64_to_cpu(READ_ONCE(reg->kaddr->stolen_time));
+}
+
+static int stolen_time_dying_cpu(unsigned int cpu)
+{
+	struct pv_time_stolen_time_region *reg;
+
+	reg = this_cpu_ptr(&stolen_time_region);
+	if (!reg->kaddr)
+		return 0;
+
+	memunmap(reg->kaddr);
+	memset(reg, 0, sizeof(*reg));
+
+	return 0;
+}
+
+static int init_stolen_time_cpu(unsigned int cpu)
+{
+	struct pv_time_stolen_time_region *reg;
+	struct arm_smccc_res res;
+
+	reg = this_cpu_ptr(&stolen_time_region);
+
+	arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_TIME_ST, &res);
+
+	if (res.a0 == SMCCC_RET_NOT_SUPPORTED)
+		return -EINVAL;
+
+	reg->kaddr = memremap(res.a0,
+			      sizeof(struct pvclock_vcpu_stolen_time),
+			      MEMREMAP_WB);
+
+	if (!reg->kaddr) {
+		pr_warn("Failed to map stolen time data structure\n");
+		return -ENOMEM;
+	}
+
+	if (le32_to_cpu(reg->kaddr->revision) != 0 ||
+	    le32_to_cpu(reg->kaddr->attributes) != 0) {
+		pr_warn_once("Unexpected revision or attributes in stolen time data\n");
+		return -ENXIO;
+	}
+
+	return 0;
+}
+
+static int pv_time_init_stolen_time(void)
+{
+	int ret;
+
+	ret = cpuhp_setup_state(CPUHP_AP_ARM_KVMPV_STARTING,
+				"hypervisor/arm/pvtime:starting",
+				init_stolen_time_cpu, stolen_time_dying_cpu);
+	if (ret < 0)
+		return ret;
+	return 0;
+}
+
+static bool has_pv_steal_clock(void)
+{
+	struct arm_smccc_res res;
+
+	/* To detect the presence of PV time support we require SMCCC 1.1+ */
+	if (psci_ops.smccc_version < SMCCC_VERSION_1_1)
+		return false;
+
+	arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
+			     ARM_SMCCC_HV_PV_TIME_FEATURES, &res);
+
+	if (res.a0 != SMCCC_RET_SUCCESS)
+		return false;
+
+	arm_smccc_1_1_invoke(ARM_SMCCC_HV_PV_TIME_FEATURES,
+			     ARM_SMCCC_HV_PV_TIME_ST, &res);
+
+	return (res.a0 == SMCCC_RET_SUCCESS);
+}
+
+int __init pv_time_init(void)
+{
+	int ret;
+
+	if (!has_pv_steal_clock())
+		return 0;
+
+	ret = pv_time_init_stolen_time();
+	if (ret)
+		return ret;
+
+	pv_ops.time.steal_clock = pv_steal_clock;
+
+	static_key_slow_inc(&paravirt_steal_enabled);
+	if (steal_acc)
+		static_key_slow_inc(&paravirt_steal_rq_enabled);
+
+	pr_info("using stolen time PV\n");
+
+	return 0;
+}
+2 -1
arch/arm64/kernel/sdei.c
···
 // Copyright (C) 2017 Arm Ltd.
 #define pr_fmt(fmt) "sdei: " fmt

+#include <linux/arm-smccc.h>
 #include <linux/arm_sdei.h>
 #include <linux/hardirq.h>
 #include <linux/irqflags.h>
···
 		return 0;
 	}

-	sdei_exit_mode = (conduit == CONDUIT_HVC) ? SDEI_EXIT_HVC : SDEI_EXIT_SMC;
+	sdei_exit_mode = (conduit == SMCCC_CONDUIT_HVC) ? SDEI_EXIT_HVC : SDEI_EXIT_SMC;

 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
 	if (arm64_kernel_unmapped_at_el0()) {
+3
arch/arm64/kernel/time.c
···
 #include <asm/thread_info.h>
 #include <asm/stacktrace.h>
+#include <asm/paravirt.h>

 unsigned long profile_pc(struct pt_regs *regs)
 {
···
 	/* Calibrate the delay loop directly */
 	lpj_fine = arch_timer_rate / HZ;
+
+	pv_time_init();
 }
+4
arch/arm64/kvm/Kconfig
···
 config KVM
 	bool "Kernel-based Virtual Machine (KVM) support"
 	depends on OF
+	# for TASKSTATS/TASK_DELAY_ACCT:
+	depends on NET && MULTIUSER
 	select MMU_NOTIFIER
 	select PREEMPT_NOTIFIERS
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
···
 	select IRQ_BYPASS_MANAGER
 	select HAVE_KVM_IRQ_BYPASS
 	select HAVE_KVM_VCPU_RUN_PID_CHANGE
+	select TASKSTATS
+	select TASK_DELAY_ACCT
 	---help---
 	  Support hosting virtualized guest machines.
 	  We don't support KVM with 16K page tables yet, due to the multiple
+2
arch/arm64/kvm/Makefile
··· 13 13 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o $(KVM)/vfio.o 14 14 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o 15 15 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o $(KVM)/arm/perf.o 16 + kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hypercalls.o 17 + kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/pvtime.o 16 18 17 19 kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o 18 20 kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
+23
arch/arm64/kvm/guest.c
··· 34 34 #define VCPU_STAT(x) { #x, offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU } 35 35 36 36 struct kvm_stats_debugfs_item debugfs_entries[] = { 37 + VCPU_STAT(halt_successful_poll), 38 + VCPU_STAT(halt_attempted_poll), 39 + VCPU_STAT(halt_poll_invalid), 40 + VCPU_STAT(halt_wakeup), 37 41 VCPU_STAT(hvc_exit_stat), 38 42 VCPU_STAT(wfe_exit_stat), 39 43 VCPU_STAT(wfi_exit_stat), ··· 716 712 if (events->exception.serror_pending && events->exception.serror_has_esr) 717 713 events->exception.serror_esr = vcpu_get_vsesr(vcpu); 718 714 715 + /* 716 + * We never return a pending ext_dabt here because we deliver it to 717 + * the virtual CPU directly when setting the event and it's no longer 718 + * 'pending' at this point. 719 + */ 720 + 719 721 return 0; 720 722 } 721 723 ··· 730 720 { 731 721 bool serror_pending = events->exception.serror_pending; 732 722 bool has_esr = events->exception.serror_has_esr; 723 + bool ext_dabt_pending = events->exception.ext_dabt_pending; 733 724 734 725 if (serror_pending && has_esr) { 735 726 if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) ··· 743 732 } else if (serror_pending) { 744 733 kvm_inject_vabt(vcpu); 745 734 } 735 + 736 + if (ext_dabt_pending) 737 + kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu)); 746 738 747 739 return 0; 748 740 } ··· 872 858 case KVM_ARM_VCPU_TIMER_CTRL: 873 859 ret = kvm_arm_timer_set_attr(vcpu, attr); 874 860 break; 861 + case KVM_ARM_VCPU_PVTIME_CTRL: 862 + ret = kvm_arm_pvtime_set_attr(vcpu, attr); 863 + break; 875 864 default: 876 865 ret = -ENXIO; 877 866 break; ··· 895 878 case KVM_ARM_VCPU_TIMER_CTRL: 896 879 ret = kvm_arm_timer_get_attr(vcpu, attr); 897 880 break; 881 + case KVM_ARM_VCPU_PVTIME_CTRL: 882 + ret = kvm_arm_pvtime_get_attr(vcpu, attr); 883 + break; 898 884 default: 899 885 ret = -ENXIO; 900 886 break; ··· 917 897 break; 918 898 case KVM_ARM_VCPU_TIMER_CTRL: 919 899 ret = kvm_arm_timer_has_attr(vcpu, attr); 900 + break; 901 + case KVM_ARM_VCPU_PVTIME_CTRL: 902 + ret = 
kvm_arm_pvtime_has_attr(vcpu, attr); 920 903 break; 921 904 default: 922 905 ret = -ENXIO;
+2 -2
arch/arm64/kvm/handle_exit.c
··· 11 11 #include <linux/kvm.h> 12 12 #include <linux/kvm_host.h> 13 13 14 - #include <kvm/arm_psci.h> 15 - 16 14 #include <asm/esr.h> 17 15 #include <asm/exception.h> 18 16 #include <asm/kvm_asm.h> ··· 19 21 #include <asm/kvm_mmu.h> 20 22 #include <asm/debug-monitors.h> 21 23 #include <asm/traps.h> 24 + 25 + #include <kvm/arm_hypercalls.h> 22 26 23 27 #define CREATE_TRACE_POINTS 24 28 #include "trace.h"
+2 -2
arch/arm64/kvm/inject_fault.c
··· 109 109 110 110 /** 111 111 * kvm_inject_dabt - inject a data abort into the guest 112 - * @vcpu: The VCPU to receive the undefined exception 112 + * @vcpu: The VCPU to receive the data abort 113 113 * @addr: The address to report in the DFAR 114 114 * 115 115 * It is assumed that this code is called from the VCPU thread and that the ··· 125 125 126 126 /** 127 127 * kvm_inject_pabt - inject a prefetch abort into the guest 128 - * @vcpu: The VCPU to receive the undefined exception 128 + * @vcpu: The VCPU to receive the prefetch abort 129 129 * @addr: The address to report in the DFAR 130 130 * 131 131 * It is assumed that this code is called from the VCPU thread and that the
+6 -6
drivers/firmware/arm_sdei.c
··· 967 967 if (np) { 968 968 if (of_property_read_string(np, "method", &method)) { 969 969 pr_warn("missing \"method\" property\n"); 970 - return CONDUIT_INVALID; 970 + return SMCCC_CONDUIT_NONE; 971 971 } 972 972 973 973 if (!strcmp("hvc", method)) { 974 974 sdei_firmware_call = &sdei_smccc_hvc; 975 - return CONDUIT_HVC; 975 + return SMCCC_CONDUIT_HVC; 976 976 } else if (!strcmp("smc", method)) { 977 977 sdei_firmware_call = &sdei_smccc_smc; 978 - return CONDUIT_SMC; 978 + return SMCCC_CONDUIT_SMC; 979 979 } 980 980 981 981 pr_warn("invalid \"method\" property: %s\n", method); 982 982 } else if (IS_ENABLED(CONFIG_ACPI) && !acpi_disabled) { 983 983 if (acpi_psci_use_hvc()) { 984 984 sdei_firmware_call = &sdei_smccc_hvc; 985 - return CONDUIT_HVC; 985 + return SMCCC_CONDUIT_HVC; 986 986 } else { 987 987 sdei_firmware_call = &sdei_smccc_smc; 988 - return CONDUIT_SMC; 988 + return SMCCC_CONDUIT_SMC; 989 989 } 990 990 } 991 991 992 - return CONDUIT_INVALID; 992 + return SMCCC_CONDUIT_NONE; 993 993 } 994 994 995 995 static int sdei_probe(struct platform_device *pdev)
+16 -8
drivers/firmware/psci/psci.c
··· 53 53 } 54 54 55 55 struct psci_operations psci_ops = { 56 - .conduit = PSCI_CONDUIT_NONE, 56 + .conduit = SMCCC_CONDUIT_NONE, 57 57 .smccc_version = SMCCC_VERSION_1_0, 58 58 }; 59 + 60 + enum arm_smccc_conduit arm_smccc_1_1_get_conduit(void) 61 + { 62 + if (psci_ops.smccc_version < SMCCC_VERSION_1_1) 63 + return SMCCC_CONDUIT_NONE; 64 + 65 + return psci_ops.conduit; 66 + } 59 67 60 68 typedef unsigned long (psci_fn)(unsigned long, unsigned long, 61 69 unsigned long, unsigned long); ··· 220 212 0, 0, 0); 221 213 } 222 214 223 - static void set_conduit(enum psci_conduit conduit) 215 + static void set_conduit(enum arm_smccc_conduit conduit) 224 216 { 225 217 switch (conduit) { 226 - case PSCI_CONDUIT_HVC: 218 + case SMCCC_CONDUIT_HVC: 227 219 invoke_psci_fn = __invoke_psci_fn_hvc; 228 220 break; 229 - case PSCI_CONDUIT_SMC: 221 + case SMCCC_CONDUIT_SMC: 230 222 invoke_psci_fn = __invoke_psci_fn_smc; 231 223 break; 232 224 default: ··· 248 240 } 249 241 250 242 if (!strcmp("hvc", method)) { 251 - set_conduit(PSCI_CONDUIT_HVC); 243 + set_conduit(SMCCC_CONDUIT_HVC); 252 244 } else if (!strcmp("smc", method)) { 253 - set_conduit(PSCI_CONDUIT_SMC); 245 + set_conduit(SMCCC_CONDUIT_SMC); 254 246 } else { 255 247 pr_warn("invalid \"method\" property: %s\n", method); 256 248 return -EINVAL; ··· 591 583 pr_info("probing for conduit method from ACPI.\n"); 592 584 593 585 if (acpi_psci_use_hvc()) 594 - set_conduit(PSCI_CONDUIT_HVC); 586 + set_conduit(SMCCC_CONDUIT_HVC); 595 587 else 596 - set_conduit(PSCI_CONDUIT_SMC); 588 + set_conduit(SMCCC_CONDUIT_SMC); 597 589 598 590 return psci_probe(); 599 591 }
+6 -1
drivers/irqchip/irq-gic-v4.c
··· 141 141 int its_schedule_vpe(struct its_vpe *vpe, bool on) 142 142 { 143 143 struct its_cmd_info info; 144 + int ret; 144 145 145 146 WARN_ON(preemptible()); 146 147 147 148 info.cmd_type = on ? SCHEDULE_VPE : DESCHEDULE_VPE; 148 149 149 - return its_send_vpe_cmd(vpe, &info); 150 + ret = its_send_vpe_cmd(vpe, &info); 151 + if (!ret) 152 + vpe->resident = on; 153 + 154 + return ret; 150 155 } 151 156 152 157 int its_invall_vpe(struct its_vpe *vpe)
+2
include/Kbuild
··· 67 67 header-test- += keys/request_key_auth-type.h 68 68 header-test- += keys/trusted.h 69 69 header-test- += kvm/arm_arch_timer.h 70 + header-test-$(CONFIG_ARM) += kvm/arm_hypercalls.h 71 + header-test-$(CONFIG_ARM64) += kvm/arm_hypercalls.h 70 72 header-test- += kvm/arm_pmu.h 71 73 header-test-$(CONFIG_ARM) += kvm/arm_psci.h 72 74 header-test-$(CONFIG_ARM64) += kvm/arm_psci.h
+43
include/kvm/arm_hypercalls.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + /* Copyright (C) 2019 Arm Ltd. */ 3 + 4 + #ifndef __KVM_ARM_HYPERCALLS_H 5 + #define __KVM_ARM_HYPERCALLS_H 6 + 7 + #include <asm/kvm_emulate.h> 8 + 9 + int kvm_hvc_call_handler(struct kvm_vcpu *vcpu); 10 + 11 + static inline u32 smccc_get_function(struct kvm_vcpu *vcpu) 12 + { 13 + return vcpu_get_reg(vcpu, 0); 14 + } 15 + 16 + static inline unsigned long smccc_get_arg1(struct kvm_vcpu *vcpu) 17 + { 18 + return vcpu_get_reg(vcpu, 1); 19 + } 20 + 21 + static inline unsigned long smccc_get_arg2(struct kvm_vcpu *vcpu) 22 + { 23 + return vcpu_get_reg(vcpu, 2); 24 + } 25 + 26 + static inline unsigned long smccc_get_arg3(struct kvm_vcpu *vcpu) 27 + { 28 + return vcpu_get_reg(vcpu, 3); 29 + } 30 + 31 + static inline void smccc_set_retval(struct kvm_vcpu *vcpu, 32 + unsigned long a0, 33 + unsigned long a1, 34 + unsigned long a2, 35 + unsigned long a3) 36 + { 37 + vcpu_set_reg(vcpu, 0, a0); 38 + vcpu_set_reg(vcpu, 1, a1); 39 + vcpu_set_reg(vcpu, 2, a2); 40 + vcpu_set_reg(vcpu, 3, a3); 41 + } 42 + 43 + #endif
+1 -1
include/kvm/arm_psci.h
··· 40 40 } 41 41 42 42 43 - int kvm_hvc_call_handler(struct kvm_vcpu *vcpu); 43 + int kvm_psci_call(struct kvm_vcpu *vcpu); 44 44 45 45 struct kvm_one_reg; 46 46
+3 -5
include/kvm/arm_vgic.h
··· 240 240 * Contains the attributes and gpa of the LPI configuration table. 241 241 * Since we report GICR_TYPER.CommonLPIAff as 0b00, we can share 242 242 * one address across all redistributors. 243 - * GICv3 spec: 6.1.2 "LPI Configuration tables" 243 + * GICv3 spec: IHI 0069E 6.1.1 "LPI Configuration tables" 244 244 */ 245 245 u64 propbaser; 246 246 ··· 378 378 return kvm_vgic_global_state.max_gic_vcpus; 379 379 } 380 380 381 - int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi); 382 - 383 381 /** 384 382 * kvm_vgic_setup_default_irq_routing: 385 383 * Setup a default flat gsi routing table mapping all SPIs ··· 394 396 int kvm_vgic_v4_unset_forwarding(struct kvm *kvm, int irq, 395 397 struct kvm_kernel_irq_routing_entry *irq_entry); 396 398 397 - void kvm_vgic_v4_enable_doorbell(struct kvm_vcpu *vcpu); 398 - void kvm_vgic_v4_disable_doorbell(struct kvm_vcpu *vcpu); 399 + int vgic_v4_load(struct kvm_vcpu *vcpu); 400 + int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db); 399 401 400 402 #endif /* __KVM_ARM_VGIC_H */
+75
include/linux/arm-smccc.h
··· 45 45 #define ARM_SMCCC_OWNER_SIP 2 46 46 #define ARM_SMCCC_OWNER_OEM 3 47 47 #define ARM_SMCCC_OWNER_STANDARD 4 48 + #define ARM_SMCCC_OWNER_STANDARD_HYP 5 48 49 #define ARM_SMCCC_OWNER_TRUSTED_APP 48 49 50 #define ARM_SMCCC_OWNER_TRUSTED_APP_END 49 50 51 #define ARM_SMCCC_OWNER_TRUSTED_OS 50 ··· 81 80 82 81 #include <linux/linkage.h> 83 82 #include <linux/types.h> 83 + 84 + enum arm_smccc_conduit { 85 + SMCCC_CONDUIT_NONE, 86 + SMCCC_CONDUIT_SMC, 87 + SMCCC_CONDUIT_HVC, 88 + }; 89 + 90 + /** 91 + * arm_smccc_1_1_get_conduit() 92 + * 93 + * Returns the conduit to be used for SMCCCv1.1 or later. 94 + * 95 + * When SMCCCv1.1 is not present, returns SMCCC_CONDUIT_NONE. 96 + */ 97 + enum arm_smccc_conduit arm_smccc_1_1_get_conduit(void); 98 + 84 99 /** 85 100 * struct arm_smccc_res - Result from SMC/HVC call 86 101 * @a0-a3 result values from registers 0 to 3 ··· 318 301 #define SMCCC_RET_SUCCESS 0 319 302 #define SMCCC_RET_NOT_SUPPORTED -1 320 303 #define SMCCC_RET_NOT_REQUIRED -2 304 + 305 + /* 306 + * Like arm_smccc_1_1* but always returns SMCCC_RET_NOT_SUPPORTED. 307 + * Used when the SMCCC conduit is not defined. The empty asm statement 308 + * avoids compiler warnings about unused variables. 309 + */ 310 + #define __fail_smccc_1_1(...) \ 311 + do { \ 312 + __declare_args(__count_args(__VA_ARGS__), __VA_ARGS__); \ 313 + asm ("" __constraints(__count_args(__VA_ARGS__))); \ 314 + if (___res) \ 315 + ___res->a0 = SMCCC_RET_NOT_SUPPORTED; \ 316 + } while (0) 317 + 318 + /* 319 + * arm_smccc_1_1_invoke() - make an SMCCC v1.1 compliant call 320 + * 321 + * This is a variadic macro taking one to eight source arguments, and 322 + * an optional return structure. 323 + * 324 + * @a0-a7: arguments passed in registers 0 to 7 325 + * @res: result values from registers 0 to 3 326 + * 327 + * This macro will make either an HVC call or an SMC call depending on the 328 + * current SMCCC conduit. 
If no valid conduit is available then -1 329 + * (SMCCC_RET_NOT_SUPPORTED) is returned in @res.a0 (if supplied). 330 + * 331 + * The return value also provides the conduit that was used. 332 + */ 333 + #define arm_smccc_1_1_invoke(...) ({ \ 334 + int method = arm_smccc_1_1_get_conduit(); \ 335 + switch (method) { \ 336 + case SMCCC_CONDUIT_HVC: \ 337 + arm_smccc_1_1_hvc(__VA_ARGS__); \ 338 + break; \ 339 + case SMCCC_CONDUIT_SMC: \ 340 + arm_smccc_1_1_smc(__VA_ARGS__); \ 341 + break; \ 342 + default: \ 343 + __fail_smccc_1_1(__VA_ARGS__); \ 344 + method = SMCCC_CONDUIT_NONE; \ 345 + break; \ 346 + } \ 347 + method; \ 348 + }) 349 + 350 + /* Paravirtualised time calls (defined by ARM DEN0057A) */ 351 + #define ARM_SMCCC_HV_PV_TIME_FEATURES \ 352 + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ 353 + ARM_SMCCC_SMC_64, \ 354 + ARM_SMCCC_OWNER_STANDARD_HYP, \ 355 + 0x20) 356 + 357 + #define ARM_SMCCC_HV_PV_TIME_ST \ 358 + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ 359 + ARM_SMCCC_SMC_64, \ 360 + ARM_SMCCC_OWNER_STANDARD_HYP, \ 361 + 0x21) 321 362 322 363 #endif /*__ASSEMBLY__*/ 323 364 #endif /*__LINUX_ARM_SMCCC_H*/
-6
include/linux/arm_sdei.h
··· 5 5 6 6 #include <uapi/linux/arm_sdei.h> 7 7 8 - enum sdei_conduit_types { 9 - CONDUIT_INVALID = 0, 10 - CONDUIT_SMC, 11 - CONDUIT_HVC, 12 - }; 13 - 14 8 #include <acpi/ghes.h> 15 9 16 10 #ifdef CONFIG_ARM_SDE_INTERFACE
+1
include/linux/cpuhotplug.h
··· 136 136 /* Must be the last timer callback */ 137 137 CPUHP_AP_DUMMY_TIMER_STARTING, 138 138 CPUHP_AP_ARM_XEN_STARTING, 139 + CPUHP_AP_ARM_KVMPV_STARTING, 139 140 CPUHP_AP_ARM_CORESIGHT_STARTING, 140 141 CPUHP_AP_ARM64_ISNDEP_STARTING, 141 142 CPUHP_AP_SMPCFD_DYING,
+4
include/linux/irqchip/arm-gic-v4.h
··· 32 32 struct its_vpe { 33 33 struct page *vpt_page; 34 34 struct its_vm *its_vm; 35 + /* per-vPE VLPI tracking */ 36 + atomic_t vlpi_count; 35 37 /* Doorbell interrupt */ 36 38 int irq; 37 39 irq_hw_number_t vpe_db_lpi; 40 + /* VPE resident */ 41 + bool resident; 38 42 /* VPE proxy mapping */ 39 43 int vpe_proxy_event; 40 44 /*
+24 -2
include/linux/kvm_host.h
··· 741 741 unsigned long len); 742 742 int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct gfn_to_hva_cache *ghc, 743 743 gpa_t gpa, unsigned long len); 744 + 745 + #define __kvm_put_guest(kvm, gfn, offset, value, type) \ 746 + ({ \ 747 + unsigned long __addr = gfn_to_hva(kvm, gfn); \ 748 + type __user *__uaddr = (type __user *)(__addr + offset); \ 749 + int __ret = -EFAULT; \ 750 + \ 751 + if (!kvm_is_error_hva(__addr)) \ 752 + __ret = put_user(value, __uaddr); \ 753 + if (!__ret) \ 754 + mark_page_dirty(kvm, gfn); \ 755 + __ret; \ 756 + }) 757 + 758 + #define kvm_put_guest(kvm, gpa, value, type) \ 759 + ({ \ 760 + gpa_t __gpa = gpa; \ 761 + struct kvm *__kvm = kvm; \ 762 + __kvm_put_guest(__kvm, __gpa >> PAGE_SHIFT, \ 763 + offset_in_page(__gpa), (value), type); \ 764 + }) 765 + 744 766 int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len); 745 767 int kvm_clear_guest(struct kvm *kvm, gpa_t gpa, unsigned long len); 746 768 struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn); ··· 1259 1237 extern unsigned int halt_poll_ns_shrink; 1260 1238 1261 1239 struct kvm_device { 1262 - struct kvm_device_ops *ops; 1240 + const struct kvm_device_ops *ops; 1263 1241 struct kvm *kvm; 1264 1242 void *private; 1265 1243 struct list_head vm_node; ··· 1312 1290 void kvm_device_get(struct kvm_device *dev); 1313 1291 void kvm_device_put(struct kvm_device *dev); 1314 1292 struct kvm_device *kvm_device_from_filp(struct file *filp); 1315 - int kvm_register_device_ops(struct kvm_device_ops *ops, u32 type); 1293 + int kvm_register_device_ops(const struct kvm_device_ops *ops, u32 type); 1316 1294 void kvm_unregister_device_ops(u32 type); 1317 1295 1318 1296 extern struct kvm_device_ops kvm_mpic_ops;
+2
include/linux/kvm_types.h
··· 35 35 typedef u64 gpa_t; 36 36 typedef u64 gfn_t; 37 37 38 + #define GPA_INVALID (~(gpa_t)0) 39 + 38 40 typedef unsigned long hva_t; 39 41 typedef u64 hpa_t; 40 42 typedef u64 hfn_t;
+2 -7
include/linux/psci.h
··· 7 7 #ifndef __LINUX_PSCI_H 8 8 #define __LINUX_PSCI_H 9 9 10 + #include <linux/arm-smccc.h> 10 11 #include <linux/init.h> 11 12 #include <linux/types.h> 12 13 ··· 18 17 19 18 int psci_cpu_suspend_enter(u32 state); 20 19 bool psci_power_state_is_valid(u32 state); 21 - 22 - enum psci_conduit { 23 - PSCI_CONDUIT_NONE, 24 - PSCI_CONDUIT_SMC, 25 - PSCI_CONDUIT_HVC, 26 - }; 27 20 28 21 enum smccc_version { 29 22 SMCCC_VERSION_1_0, ··· 33 38 int (*affinity_info)(unsigned long target_affinity, 34 39 unsigned long lowest_affinity_level); 35 40 int (*migrate_info_type)(void); 36 - enum psci_conduit conduit; 41 + enum arm_smccc_conduit conduit; 37 42 enum smccc_version smccc_version; 38 43 }; 39 44
+10
include/uapi/linux/kvm.h
··· 235 235 #define KVM_EXIT_S390_STSI 25 236 236 #define KVM_EXIT_IOAPIC_EOI 26 237 237 #define KVM_EXIT_HYPERV 27 238 + #define KVM_EXIT_ARM_NISV 28 238 239 239 240 /* For KVM_EXIT_INTERNAL_ERROR */ 240 241 /* Emulate instruction failed. */ ··· 395 394 } eoi; 396 395 /* KVM_EXIT_HYPERV */ 397 396 struct kvm_hyperv_exit hyperv; 397 + /* KVM_EXIT_ARM_NISV */ 398 + struct { 399 + __u64 esr_iss; 400 + __u64 fault_ipa; 401 + } arm_nisv; 398 402 /* Fix the size of the union. */ 399 403 char padding[256]; 400 404 }; ··· 1007 1001 #define KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 174 1008 1002 #define KVM_CAP_HYPERV_DIRECT_TLBFLUSH 175 1009 1003 #define KVM_CAP_PPC_GUEST_DEBUG_SSTEP 176 1004 + #define KVM_CAP_ARM_NISV_TO_USER 177 1005 + #define KVM_CAP_ARM_INJECT_EXT_DABT 178 1010 1006 1011 1007 #ifdef KVM_CAP_IRQ_ROUTING 1012 1008 ··· 1236 1228 #define KVM_DEV_TYPE_ARM_VGIC_ITS KVM_DEV_TYPE_ARM_VGIC_ITS 1237 1229 KVM_DEV_TYPE_XIVE, 1238 1230 #define KVM_DEV_TYPE_XIVE KVM_DEV_TYPE_XIVE 1231 + KVM_DEV_TYPE_ARM_PV_TIME, 1232 + #define KVM_DEV_TYPE_ARM_PV_TIME KVM_DEV_TYPE_ARM_PV_TIME 1239 1233 KVM_DEV_TYPE_MAX, 1240 1234 }; 1241 1235
+4 -4
virt/kvm/arm/arch_timer.c
··· 80 80 static void soft_timer_start(struct hrtimer *hrt, u64 ns) 81 81 { 82 82 hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns), 83 - HRTIMER_MODE_ABS); 83 + HRTIMER_MODE_ABS_HARD); 84 84 } 85 85 86 86 static void soft_timer_cancel(struct hrtimer *hrt) ··· 697 697 update_vtimer_cntvoff(vcpu, kvm_phys_timer_read()); 698 698 ptimer->cntvoff = 0; 699 699 700 - hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS); 700 + hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD); 701 701 timer->bg_timer.function = kvm_bg_timer_expire; 702 702 703 - hrtimer_init(&vtimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS); 704 - hrtimer_init(&ptimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS); 703 + hrtimer_init(&vtimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD); 704 + hrtimer_init(&ptimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD); 705 705 vtimer->hrtimer.function = kvm_hrtimer_expire; 706 706 ptimer->hrtimer.function = kvm_hrtimer_expire; 707 707
+43 -6
virt/kvm/arm/arm.c
··· 40 40 #include <asm/kvm_coproc.h> 41 41 #include <asm/sections.h> 42 42 43 + #include <kvm/arm_hypercalls.h> 44 + #include <kvm/arm_pmu.h> 45 + #include <kvm/arm_psci.h> 46 + 43 47 #ifdef REQUIRES_VIRT 44 48 __asm__(".arch_extension virt"); 45 49 #endif ··· 102 98 return 0; 103 99 } 104 100 101 + int kvm_vm_ioctl_enable_cap(struct kvm *kvm, 102 + struct kvm_enable_cap *cap) 103 + { 104 + int r; 105 + 106 + if (cap->flags) 107 + return -EINVAL; 108 + 109 + switch (cap->cap) { 110 + case KVM_CAP_ARM_NISV_TO_USER: 111 + r = 0; 112 + kvm->arch.return_nisv_io_abort_to_user = true; 113 + break; 114 + default: 115 + r = -EINVAL; 116 + break; 117 + } 118 + 119 + return r; 120 + } 105 121 106 122 /** 107 123 * kvm_arch_init_vm - initializes a VM data structure ··· 221 197 case KVM_CAP_IMMEDIATE_EXIT: 222 198 case KVM_CAP_VCPU_EVENTS: 223 199 case KVM_CAP_ARM_IRQ_LINE_LAYOUT_2: 200 + case KVM_CAP_ARM_NISV_TO_USER: 201 + case KVM_CAP_ARM_INJECT_EXT_DABT: 224 202 r = 1; 225 203 break; 226 204 case KVM_CAP_ARM_SET_DEVICE_ADDR: ··· 348 322 /* 349 323 * If we're about to block (most likely because we've just hit a 350 324 * WFI), we need to sync back the state of the GIC CPU interface 351 - * so that we have the lastest PMR and group enables. This ensures 325 + * so that we have the latest PMR and group enables. This ensures 352 326 * that kvm_arch_vcpu_runnable has up-to-date data to decide 353 327 * whether we have pending interrupts. 328 + * 329 + * For the same reason, we want to tell GICv4 that we need 330 + * doorbells to be signalled, should an interrupt become pending. 
354 331 */ 355 332 preempt_disable(); 356 333 kvm_vgic_vmcr_sync(vcpu); 334 + vgic_v4_put(vcpu, true); 357 335 preempt_enable(); 358 - 359 - kvm_vgic_v4_enable_doorbell(vcpu); 360 336 } 361 337 362 338 void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) 363 339 { 364 - kvm_vgic_v4_disable_doorbell(vcpu); 340 + preempt_disable(); 341 + vgic_v4_load(vcpu); 342 + preempt_enable(); 365 343 } 366 344 367 345 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) ··· 380 350 kvm_pmu_vcpu_init(vcpu); 381 351 382 352 kvm_arm_reset_debug_ptr(vcpu); 353 + 354 + kvm_arm_pvtime_vcpu_init(&vcpu->arch); 383 355 384 356 return kvm_vgic_vcpu_init(vcpu); 385 357 } ··· 412 380 kvm_vcpu_load_sysregs(vcpu); 413 381 kvm_arch_vcpu_load_fp(vcpu); 414 382 kvm_vcpu_pmu_restore_guest(vcpu); 383 + if (kvm_arm_is_pvtime_enabled(&vcpu->arch)) 384 + kvm_make_request(KVM_REQ_RECORD_STEAL, vcpu); 415 385 416 386 if (single_task_running()) 417 - vcpu_clear_wfe_traps(vcpu); 387 + vcpu_clear_wfx_traps(vcpu); 418 388 else 419 - vcpu_set_wfe_traps(vcpu); 389 + vcpu_set_wfx_traps(vcpu); 420 390 421 391 vcpu_ptrauth_setup_lazy(vcpu); 422 392 } ··· 679 645 * that a VCPU sees new virtual interrupts. 680 646 */ 681 647 kvm_check_request(KVM_REQ_IRQ_PENDING, vcpu); 648 + 649 + if (kvm_check_request(KVM_REQ_RECORD_STEAL, vcpu)) 650 + kvm_update_stolen_time(vcpu); 682 651 } 683 652 } 684 653
+71
virt/kvm/arm/hypercalls.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + // Copyright (C) 2019 Arm Ltd. 3 + 4 + #include <linux/arm-smccc.h> 5 + #include <linux/kvm_host.h> 6 + 7 + #include <asm/kvm_emulate.h> 8 + 9 + #include <kvm/arm_hypercalls.h> 10 + #include <kvm/arm_psci.h> 11 + 12 + int kvm_hvc_call_handler(struct kvm_vcpu *vcpu) 13 + { 14 + u32 func_id = smccc_get_function(vcpu); 15 + long val = SMCCC_RET_NOT_SUPPORTED; 16 + u32 feature; 17 + gpa_t gpa; 18 + 19 + switch (func_id) { 20 + case ARM_SMCCC_VERSION_FUNC_ID: 21 + val = ARM_SMCCC_VERSION_1_1; 22 + break; 23 + case ARM_SMCCC_ARCH_FEATURES_FUNC_ID: 24 + feature = smccc_get_arg1(vcpu); 25 + switch (feature) { 26 + case ARM_SMCCC_ARCH_WORKAROUND_1: 27 + switch (kvm_arm_harden_branch_predictor()) { 28 + case KVM_BP_HARDEN_UNKNOWN: 29 + break; 30 + case KVM_BP_HARDEN_WA_NEEDED: 31 + val = SMCCC_RET_SUCCESS; 32 + break; 33 + case KVM_BP_HARDEN_NOT_REQUIRED: 34 + val = SMCCC_RET_NOT_REQUIRED; 35 + break; 36 + } 37 + break; 38 + case ARM_SMCCC_ARCH_WORKAROUND_2: 39 + switch (kvm_arm_have_ssbd()) { 40 + case KVM_SSBD_FORCE_DISABLE: 41 + case KVM_SSBD_UNKNOWN: 42 + break; 43 + case KVM_SSBD_KERNEL: 44 + val = SMCCC_RET_SUCCESS; 45 + break; 46 + case KVM_SSBD_FORCE_ENABLE: 47 + case KVM_SSBD_MITIGATED: 48 + val = SMCCC_RET_NOT_REQUIRED; 49 + break; 50 + } 51 + break; 52 + case ARM_SMCCC_HV_PV_TIME_FEATURES: 53 + val = SMCCC_RET_SUCCESS; 54 + break; 55 + } 56 + break; 57 + case ARM_SMCCC_HV_PV_TIME_FEATURES: 58 + val = kvm_hypercall_pv_features(vcpu); 59 + break; 60 + case ARM_SMCCC_HV_PV_TIME_ST: 61 + gpa = kvm_init_stolen_time(vcpu); 62 + if (gpa != GPA_INVALID) 63 + val = gpa; 64 + break; 65 + default: 66 + return kvm_psci_call(vcpu); 67 + } 68 + 69 + smccc_set_retval(vcpu, val, 0, 0, 0); 70 + return 1; 71 + }
+8 -1
virt/kvm/arm/mmio.c
··· 167 167 if (ret) 168 168 return ret; 169 169 } else { 170 - kvm_err("load/store instruction decoding not implemented\n"); 170 + if (vcpu->kvm->arch.return_nisv_io_abort_to_user) { 171 + run->exit_reason = KVM_EXIT_ARM_NISV; 172 + run->arm_nisv.esr_iss = kvm_vcpu_dabt_iss_nisv_sanitized(vcpu); 173 + run->arm_nisv.fault_ipa = fault_ipa; 174 + return 0; 175 + } 176 + 177 + kvm_pr_unimpl("Data abort outside memslots with no valid syndrome info\n"); 171 178 return -ENOSYS; 172 179 } 173 180
+2 -82
virt/kvm/arm/psci.c
··· 15 15 #include <asm/kvm_host.h> 16 16 17 17 #include <kvm/arm_psci.h> 18 + #include <kvm/arm_hypercalls.h> 18 19 19 20 /* 20 21 * This is an implementation of the Power State Coordination Interface ··· 23 22 */ 24 23 25 24 #define AFFINITY_MASK(level) ~((0x1UL << ((level) * MPIDR_LEVEL_BITS)) - 1) 26 - 27 - static u32 smccc_get_function(struct kvm_vcpu *vcpu) 28 - { 29 - return vcpu_get_reg(vcpu, 0); 30 - } 31 - 32 - static unsigned long smccc_get_arg1(struct kvm_vcpu *vcpu) 33 - { 34 - return vcpu_get_reg(vcpu, 1); 35 - } 36 - 37 - static unsigned long smccc_get_arg2(struct kvm_vcpu *vcpu) 38 - { 39 - return vcpu_get_reg(vcpu, 2); 40 - } 41 - 42 - static unsigned long smccc_get_arg3(struct kvm_vcpu *vcpu) 43 - { 44 - return vcpu_get_reg(vcpu, 3); 45 - } 46 - 47 - static void smccc_set_retval(struct kvm_vcpu *vcpu, 48 - unsigned long a0, 49 - unsigned long a1, 50 - unsigned long a2, 51 - unsigned long a3) 52 - { 53 - vcpu_set_reg(vcpu, 0, a0); 54 - vcpu_set_reg(vcpu, 1, a1); 55 - vcpu_set_reg(vcpu, 2, a2); 56 - vcpu_set_reg(vcpu, 3, a3); 57 - } 58 25 59 26 static unsigned long psci_affinity_mask(unsigned long affinity_level) 60 27 { ··· 342 373 * Errors: 343 374 * -EINVAL: Unrecognized PSCI function 344 375 */ 345 - static int kvm_psci_call(struct kvm_vcpu *vcpu) 376 + int kvm_psci_call(struct kvm_vcpu *vcpu) 346 377 { 347 378 switch (kvm_psci_version(vcpu, vcpu->kvm)) { 348 379 case KVM_ARM_PSCI_1_0: ··· 354 385 default: 355 386 return -EINVAL; 356 387 }; 357 - } 358 - 359 - int kvm_hvc_call_handler(struct kvm_vcpu *vcpu) 360 - { 361 - u32 func_id = smccc_get_function(vcpu); 362 - u32 val = SMCCC_RET_NOT_SUPPORTED; 363 - u32 feature; 364 - 365 - switch (func_id) { 366 - case ARM_SMCCC_VERSION_FUNC_ID: 367 - val = ARM_SMCCC_VERSION_1_1; 368 - break; 369 - case ARM_SMCCC_ARCH_FEATURES_FUNC_ID: 370 - feature = smccc_get_arg1(vcpu); 371 - switch(feature) { 372 - case ARM_SMCCC_ARCH_WORKAROUND_1: 373 - switch (kvm_arm_harden_branch_predictor()) { 374 - case 
KVM_BP_HARDEN_UNKNOWN: 375 - break; 376 - case KVM_BP_HARDEN_WA_NEEDED: 377 - val = SMCCC_RET_SUCCESS; 378 - break; 379 - case KVM_BP_HARDEN_NOT_REQUIRED: 380 - val = SMCCC_RET_NOT_REQUIRED; 381 - break; 382 - } 383 - break; 384 - case ARM_SMCCC_ARCH_WORKAROUND_2: 385 - switch (kvm_arm_have_ssbd()) { 386 - case KVM_SSBD_FORCE_DISABLE: 387 - case KVM_SSBD_UNKNOWN: 388 - break; 389 - case KVM_SSBD_KERNEL: 390 - val = SMCCC_RET_SUCCESS; 391 - break; 392 - case KVM_SSBD_FORCE_ENABLE: 393 - case KVM_SSBD_MITIGATED: 394 - val = SMCCC_RET_NOT_REQUIRED; 395 - break; 396 - } 397 - break; 398 - } 399 - break; 400 - default: 401 - return kvm_psci_call(vcpu); 402 - } 403 - 404 - smccc_set_retval(vcpu, val, 0, 0, 0); 405 - return 1; 406 388 } 407 389 408 390 int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
+131
virt/kvm/arm/pvtime.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + // Copyright (C) 2019 Arm Ltd. 3 + 4 + #include <linux/arm-smccc.h> 5 + #include <linux/kvm_host.h> 6 + 7 + #include <asm/kvm_mmu.h> 8 + #include <asm/pvclock-abi.h> 9 + 10 + #include <kvm/arm_hypercalls.h> 11 + 12 + void kvm_update_stolen_time(struct kvm_vcpu *vcpu) 13 + { 14 + struct kvm *kvm = vcpu->kvm; 15 + u64 steal; 16 + __le64 steal_le; 17 + u64 offset; 18 + int idx; 19 + u64 base = vcpu->arch.steal.base; 20 + 21 + if (base == GPA_INVALID) 22 + return; 23 + 24 + /* Let's do the local bookkeeping */ 25 + steal = vcpu->arch.steal.steal; 26 + steal += current->sched_info.run_delay - vcpu->arch.steal.last_steal; 27 + vcpu->arch.steal.last_steal = current->sched_info.run_delay; 28 + vcpu->arch.steal.steal = steal; 29 + 30 + steal_le = cpu_to_le64(steal); 31 + idx = srcu_read_lock(&kvm->srcu); 32 + offset = offsetof(struct pvclock_vcpu_stolen_time, stolen_time); 33 + kvm_put_guest(kvm, base + offset, steal_le, u64); 34 + srcu_read_unlock(&kvm->srcu, idx); 35 + } 36 + 37 + long kvm_hypercall_pv_features(struct kvm_vcpu *vcpu) 38 + { 39 + u32 feature = smccc_get_arg1(vcpu); 40 + long val = SMCCC_RET_NOT_SUPPORTED; 41 + 42 + switch (feature) { 43 + case ARM_SMCCC_HV_PV_TIME_FEATURES: 44 + case ARM_SMCCC_HV_PV_TIME_ST: 45 + val = SMCCC_RET_SUCCESS; 46 + break; 47 + } 48 + 49 + return val; 50 + } 51 + 52 + gpa_t kvm_init_stolen_time(struct kvm_vcpu *vcpu) 53 + { 54 + struct pvclock_vcpu_stolen_time init_values = {}; 55 + struct kvm *kvm = vcpu->kvm; 56 + u64 base = vcpu->arch.steal.base; 57 + int idx; 58 + 59 + if (base == GPA_INVALID) 60 + return base; 61 + 62 + /* 63 + * Start counting stolen time from the time the guest requests 64 + * the feature enabled. 
65 + */ 66 + vcpu->arch.steal.steal = 0; 67 + vcpu->arch.steal.last_steal = current->sched_info.run_delay; 68 + 69 + idx = srcu_read_lock(&kvm->srcu); 70 + kvm_write_guest(kvm, base, &init_values, sizeof(init_values)); 71 + srcu_read_unlock(&kvm->srcu, idx); 72 + 73 + return base; 74 + } 75 + 76 + int kvm_arm_pvtime_set_attr(struct kvm_vcpu *vcpu, 77 + struct kvm_device_attr *attr) 78 + { 79 + u64 __user *user = (u64 __user *)attr->addr; 80 + struct kvm *kvm = vcpu->kvm; 81 + u64 ipa; 82 + int ret = 0; 83 + int idx; 84 + 85 + if (attr->attr != KVM_ARM_VCPU_PVTIME_IPA) 86 + return -ENXIO; 87 + 88 + if (get_user(ipa, user)) 89 + return -EFAULT; 90 + if (!IS_ALIGNED(ipa, 64)) 91 + return -EINVAL; 92 + if (vcpu->arch.steal.base != GPA_INVALID) 93 + return -EEXIST; 94 + 95 + /* Check the address is in a valid memslot */ 96 + idx = srcu_read_lock(&kvm->srcu); 97 + if (kvm_is_error_hva(gfn_to_hva(kvm, ipa >> PAGE_SHIFT))) 98 + ret = -EINVAL; 99 + srcu_read_unlock(&kvm->srcu, idx); 100 + 101 + if (!ret) 102 + vcpu->arch.steal.base = ipa; 103 + 104 + return ret; 105 + } 106 + 107 + int kvm_arm_pvtime_get_attr(struct kvm_vcpu *vcpu, 108 + struct kvm_device_attr *attr) 109 + { 110 + u64 __user *user = (u64 __user *)attr->addr; 111 + u64 ipa; 112 + 113 + if (attr->attr != KVM_ARM_VCPU_PVTIME_IPA) 114 + return -ENXIO; 115 + 116 + ipa = vcpu->arch.steal.base; 117 + 118 + if (put_user(ipa, user)) 119 + return -EFAULT; 120 + return 0; 121 + } 122 + 123 + int kvm_arm_pvtime_has_attr(struct kvm_vcpu *vcpu, 124 + struct kvm_device_attr *attr) 125 + { 126 + switch (attr->attr) { 127 + case KVM_ARM_VCPU_PVTIME_IPA: 128 + return 0; 129 + } 130 + return -ENXIO; 131 + }
+1
virt/kvm/arm/vgic/vgic-init.c
··· 203 203 204 204 INIT_LIST_HEAD(&vgic_cpu->ap_list_head); 205 205 raw_spin_lock_init(&vgic_cpu->ap_list_lock); 206 + atomic_set(&vgic_cpu->vgic_v3.its_vpe.vlpi_count, 0); 206 207 207 208 /* 208 209 * Enable and configure all SGIs to be edge-triggered and
+3
virt/kvm/arm/vgic/vgic-its.c
··· 360 360 if (ret) 361 361 return ret; 362 362 363 + if (map.vpe) 364 + atomic_dec(&map.vpe->vlpi_count); 363 365 map.vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe; 366 + atomic_inc(&map.vpe->vlpi_count); 364 367 365 368 ret = its_map_vlpi(irq->host_irq, &map); 366 369 }
+8 -4
virt/kvm/arm/vgic/vgic-v3.c
··· 357 357 } 358 358 359 359 /** 360 - * vgic_its_save_pending_tables - Save the pending tables into guest RAM 360 + * vgic_v3_save_pending_tables - Save the pending tables into guest RAM 361 361 * kvm lock and all vcpu lock must be held 362 362 */ 363 363 int vgic_v3_save_pending_tables(struct kvm *kvm) 364 364 { 365 365 struct vgic_dist *dist = &kvm->arch.vgic; 366 - int last_byte_offset = -1; 367 366 struct vgic_irq *irq; 367 + gpa_t last_ptr = ~(gpa_t)0; 368 368 int ret; 369 369 u8 val; 370 370 ··· 384 384 bit_nr = irq->intid % BITS_PER_BYTE; 385 385 ptr = pendbase + byte_offset; 386 386 387 - if (byte_offset != last_byte_offset) { 387 + if (ptr != last_ptr) { 388 388 ret = kvm_read_guest_lock(kvm, ptr, &val, 1); 389 389 if (ret) 390 390 return ret; 391 - last_byte_offset = byte_offset; 391 + last_ptr = ptr; 392 392 } 393 393 394 394 stored = val & (1U << bit_nr); ··· 664 664 665 665 if (has_vhe()) 666 666 __vgic_v3_activate_traps(vcpu); 667 + 668 + WARN_ON(vgic_v4_load(vcpu)); 667 669 } 668 670 669 671 void vgic_v3_vmcr_sync(struct kvm_vcpu *vcpu) ··· 678 676 679 677 void vgic_v3_put(struct kvm_vcpu *vcpu) 680 678 { 679 + WARN_ON(vgic_v4_put(vcpu, false)); 680 + 681 681 vgic_v3_vmcr_sync(vcpu); 682 682 683 683 kvm_call_hyp(__vgic_v3_save_aprs, vcpu);
+29 -30
virt/kvm/arm/vgic/vgic-v4.c
··· 85 85 { 86 86 struct kvm_vcpu *vcpu = info; 87 87 88 + /* We got the message, no need to fire again */ 89 + if (!irqd_irq_disabled(&irq_to_desc(irq)->irq_data)) 90 + disable_irq_nosync(irq); 91 + 88 92 vcpu->arch.vgic_cpu.vgic_v3.its_vpe.pending_last = true; 89 93 kvm_make_request(KVM_REQ_IRQ_PENDING, vcpu); 90 94 kvm_vcpu_kick(vcpu); ··· 196 192 its_vm->vpes = NULL; 197 193 } 198 194 199 - int vgic_v4_sync_hwstate(struct kvm_vcpu *vcpu) 195 + int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db) 200 196 { 201 - if (!vgic_supports_direct_msis(vcpu->kvm)) 197 + struct its_vpe *vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe; 198 + struct irq_desc *desc = irq_to_desc(vpe->irq); 199 + 200 + if (!vgic_supports_direct_msis(vcpu->kvm) || !vpe->resident) 202 201 return 0; 203 202 204 - return its_schedule_vpe(&vcpu->arch.vgic_cpu.vgic_v3.its_vpe, false); 203 + /* 204 + * If blocking, a doorbell is required. Undo the nested 205 + * disable_irq() calls... 206 + */ 207 + while (need_db && irqd_irq_disabled(&desc->irq_data)) 208 + enable_irq(vpe->irq); 209 + 210 + return its_schedule_vpe(vpe, false); 205 211 } 206 212 207 - int vgic_v4_flush_hwstate(struct kvm_vcpu *vcpu) 213 + int vgic_v4_load(struct kvm_vcpu *vcpu) 208 214 { 209 - int irq = vcpu->arch.vgic_cpu.vgic_v3.its_vpe.irq; 215 + struct its_vpe *vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe; 210 216 int err; 211 217 212 - if (!vgic_supports_direct_msis(vcpu->kvm)) 218 + if (!vgic_supports_direct_msis(vcpu->kvm) || vpe->resident) 213 219 return 0; 214 220 215 221 /* ··· 228 214 * doc in drivers/irqchip/irq-gic-v4.c to understand how this 229 215 * turns into a VMOVP command at the ITS level. 
230 216 */ 231 - err = irq_set_affinity(irq, cpumask_of(smp_processor_id())); 217 + err = irq_set_affinity(vpe->irq, cpumask_of(smp_processor_id())); 232 218 if (err) 233 219 return err; 234 220 235 - err = its_schedule_vpe(&vcpu->arch.vgic_cpu.vgic_v3.its_vpe, true); 221 + /* Disabled the doorbell, as we're about to enter the guest */ 222 + disable_irq_nosync(vpe->irq); 223 + 224 + err = its_schedule_vpe(vpe, true); 236 225 if (err) 237 226 return err; 238 227 ··· 243 226 * Now that the VPE is resident, let's get rid of a potential 244 227 * doorbell interrupt that would still be pending. 245 228 */ 246 - err = irq_set_irqchip_state(irq, IRQCHIP_STATE_PENDING, false); 247 - 248 - return err; 229 + return irq_set_irqchip_state(vpe->irq, IRQCHIP_STATE_PENDING, false); 249 230 } 250 231 251 232 static struct vgic_its *vgic_get_its(struct kvm *kvm, ··· 281 266 282 267 mutex_lock(&its->its_lock); 283 268 284 - /* Perform then actual DevID/EventID -> LPI translation. */ 269 + /* Perform the actual DevID/EventID -> LPI translation. 
*/ 285 270 ret = vgic_its_resolve_lpi(kvm, its, irq_entry->msi.devid, 286 271 irq_entry->msi.data, &irq); 287 272 if (ret) ··· 309 294 310 295 irq->hw = true; 311 296 irq->host_irq = virq; 297 + atomic_inc(&map.vpe->vlpi_count); 312 298 313 299 out: 314 300 mutex_unlock(&its->its_lock); ··· 343 327 344 328 WARN_ON(!(irq->hw && irq->host_irq == virq)); 345 329 if (irq->hw) { 330 + atomic_dec(&irq->target_vcpu->arch.vgic_cpu.vgic_v3.its_vpe.vlpi_count); 346 331 irq->hw = false; 347 332 ret = its_unmap_vlpi(virq); 348 333 } ··· 351 334 out: 352 335 mutex_unlock(&its->its_lock); 353 336 return ret; 354 - } 355 - 356 - void kvm_vgic_v4_enable_doorbell(struct kvm_vcpu *vcpu) 357 - { 358 - if (vgic_supports_direct_msis(vcpu->kvm)) { 359 - int irq = vcpu->arch.vgic_cpu.vgic_v3.its_vpe.irq; 360 - if (irq) 361 - enable_irq(irq); 362 - } 363 - } 364 - 365 - void kvm_vgic_v4_disable_doorbell(struct kvm_vcpu *vcpu) 366 - { 367 - if (vgic_supports_direct_msis(vcpu->kvm)) { 368 - int irq = vcpu->arch.vgic_cpu.vgic_v3.its_vpe.irq; 369 - if (irq) 370 - disable_irq(irq); 371 - } 372 337 }
-4
virt/kvm/arm/vgic/vgic.c
··· 857 857 { 858 858 struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; 859 859 860 - WARN_ON(vgic_v4_sync_hwstate(vcpu)); 861 - 862 860 /* An empty ap_list_head implies used_lrs == 0 */ 863 861 if (list_empty(&vcpu->arch.vgic_cpu.ap_list_head)) 864 862 return; ··· 880 882 /* Flush our emulation state into the GIC hardware before entering the guest. */ 881 883 void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu) 882 884 { 883 - WARN_ON(vgic_v4_flush_hwstate(vcpu)); 884 - 885 885 /* 886 886 * If there are no virtual interrupts active or pending for this 887 887 * VCPU, then there is no work to do and we can bail out without
-2
virt/kvm/arm/vgic/vgic.h
··· 316 316 bool vgic_supports_direct_msis(struct kvm *kvm); 317 317 int vgic_v4_init(struct kvm *kvm); 318 318 void vgic_v4_teardown(struct kvm *kvm); 319 - int vgic_v4_sync_hwstate(struct kvm_vcpu *vcpu); 320 - int vgic_v4_flush_hwstate(struct kvm_vcpu *vcpu); 321 319 322 320 #endif
+3 -3
virt/kvm/kvm_main.c
··· 3062 3062 return filp->private_data; 3063 3063 } 3064 3064 3065 - static struct kvm_device_ops *kvm_device_ops_table[KVM_DEV_TYPE_MAX] = { 3065 + static const struct kvm_device_ops *kvm_device_ops_table[KVM_DEV_TYPE_MAX] = { 3066 3066 #ifdef CONFIG_KVM_MPIC 3067 3067 [KVM_DEV_TYPE_FSL_MPIC_20] = &kvm_mpic_ops, 3068 3068 [KVM_DEV_TYPE_FSL_MPIC_42] = &kvm_mpic_ops, 3069 3069 #endif 3070 3070 }; 3071 3071 3072 - int kvm_register_device_ops(struct kvm_device_ops *ops, u32 type) 3072 + int kvm_register_device_ops(const struct kvm_device_ops *ops, u32 type) 3073 3073 { 3074 3074 if (type >= ARRAY_SIZE(kvm_device_ops_table)) 3075 3075 return -ENOSPC; ··· 3090 3090 static int kvm_ioctl_create_device(struct kvm *kvm, 3091 3091 struct kvm_create_device *cd) 3092 3092 { 3093 - struct kvm_device_ops *ops = NULL; 3093 + const struct kvm_device_ops *ops = NULL; 3094 3094 struct kvm_device *dev; 3095 3095 bool test = cd->flags & KVM_CREATE_DEVICE_TEST; 3096 3096 int type;