Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'kvmarm-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for 6.15

- Nested virtualization support for VGICv3, giving the nested
hypervisor control of the VGIC hardware when running an L2 VM

- Removal of 'late' nested virtualization feature register masking,
making the supported feature set directly visible to userspace

- Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage
of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers

- Paravirtual interface for discovering the set of CPU implementations
where a VM may run, addressing a longstanding issue of guest CPU
errata awareness in big-little systems and cross-implementation VM
migration

- Userspace control of the registers responsible for identifying a
particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1),
allowing VMs to be migrated cross-implementation

- pKVM updates, including support for tracking stage-2 page table
allocations in the protected hypervisor in the 'SecPageTable' stat

- Fixes to vPMU, ensuring that userspace updates to the vPMU after
KVM_RUN are reflected into the backing perf events

+2167 -746
+18
Documentation/virt/kvm/api.rst
··· 8262 8262 depending on which executed at the time of an exit. Userspace must 8263 8263 take care to differentiate between these cases. 8264 8264 8265 + 7.37 KVM_CAP_ARM_WRITABLE_IMP_ID_REGS 8266 + ------------------------------------- 8267 + 8268 + :Architectures: arm64 8269 + :Target: VM 8270 + :Parameters: None 8271 + :Returns: 0 on success, -EINVAL if vCPUs have been created before enabling this 8272 + capability. 8273 + 8274 + This capability changes the behavior of the registers that identify a PE 8275 + implementation of the Arm architecture: MIDR_EL1, REVIDR_EL1, and AIDR_EL1. 8276 + By default, these registers are visible to userspace but treated as invariant. 8277 + 8278 + When this capability is enabled, KVM allows userspace to change the 8279 + aforementioned registers before the first KVM_RUN. These registers are VM 8280 + scoped, meaning that the same set of values are presented on all vCPUs in a 8281 + given VM. 8282 + 8265 8283 8. Other capabilities. 8266 8284 ====================== 8267 8285
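A minimal userspace sketch of the flow the new capability documentation describes (not taken from this series): enable the capability on the VM fd before any vCPU exists, then write MIDR_EL1 through the usual one-reg interface before the first KVM_RUN. KVM_CAP_ARM_WRITABLE_IMP_ID_REGS comes from the updated uapi headers; the sysreg encoding used for MIDR_EL1 (op0=3, op1=0, CRn=0, CRm=0, op2=0) is the architectural one, and error handling is elided.

    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* Must be called before the first KVM_CREATE_VCPU on this VM. */
    static int make_imp_id_regs_writable(int vm_fd)
    {
            struct kvm_enable_cap cap = {
                    .cap = KVM_CAP_ARM_WRITABLE_IMP_ID_REGS,
            };

            return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
    }

    /* VM-scoped: every vCPU in the VM then observes the same MIDR_EL1. */
    static int set_guest_midr(int vcpu_fd, __u64 midr)
    {
            struct kvm_one_reg reg = {
                    .id   = ARM64_SYS_REG(3, 0, 0, 0, 0),   /* MIDR_EL1 */
                    .addr = (__u64)&midr,
            };

            return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
    }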
+14 -1
Documentation/virt/kvm/arm/fw-pseudo-registers.rst
··· 116 116 ARM DEN0057A. 117 117 118 118 * KVM_REG_ARM_VENDOR_HYP_BMAP: 119 - Controls the bitmap of the Vendor specific Hypervisor Service Calls. 119 + Controls the bitmap of the Vendor specific Hypervisor Service Calls[0-63]. 120 120 121 121 The following bits are accepted: 122 122 ··· 126 126 127 127 Bit-1: KVM_REG_ARM_VENDOR_HYP_BIT_PTP: 128 128 The bit represents the Precision Time Protocol KVM service. 129 + 130 + * KVM_REG_ARM_VENDOR_HYP_BMAP_2: 131 + Controls the bitmap of the Vendor specific Hypervisor Service Calls[64-127]. 132 + 133 + The following bits are accepted: 134 + 135 + Bit-0: KVM_REG_ARM_VENDOR_HYP_BIT_DISCOVER_IMPL_VER 136 + This represents the ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_VER_FUNC_ID 137 + function-id. This is reset to 0. 138 + 139 + Bit-1: KVM_REG_ARM_VENDOR_HYP_BIT_DISCOVER_IMPL_CPUS 140 + This represents the ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_CPUS_FUNC_ID 141 + function-id. This is reset to 0. 129 142 130 143 Errors: 131 144
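A hedged sketch for the new bitmap register: the two discovery bits reset to 0, so a VMM that has set up the target implementation list would turn them on by read-modify-writing KVM_REG_ARM_VENDOR_HYP_BMAP_2 through the one-reg interface. Register and bit names are the uapi ones added by this series; the helper name, vcpu_fd handling, and error paths are assumptions of the example.

    static int enable_impl_discovery(int vcpu_fd)
    {
            __u64 bmap;
            struct kvm_one_reg reg = {
                    .id   = KVM_REG_ARM_VENDOR_HYP_BMAP_2,
                    .addr = (__u64)&bmap,
            };

            if (ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg))
                    return -1;

            bmap |= (1ULL << KVM_REG_ARM_VENDOR_HYP_BIT_DISCOVER_IMPL_VER) |
                    (1ULL << KVM_REG_ARM_VENDOR_HYP_BIT_DISCOVER_IMPL_CPUS);

            return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
    }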
+59
Documentation/virt/kvm/arm/hypercalls.rst
··· 142 142 | | | +---------------------------------------------+ 143 143 | | | | ``INVALID_PARAMETER (-3)`` | 144 144 +---------------------+----------+----+---------------------------------------------+ 145 + 146 + ``ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_VER_FUNC_ID`` 147 + ------------------------------------------------------- 148 + Request the target CPU implementation version information and the number of target 149 + implementations for the Guest VM. 150 + 151 + +---------------------+-------------------------------------------------------------+ 152 + | Presence: | Optional; KVM/ARM64 Guests only | 153 + +---------------------+-------------------------------------------------------------+ 154 + | Calling convention: | HVC64 | 155 + +---------------------+----------+--------------------------------------------------+ 156 + | Function ID: | (uint32) | 0xC6000040 | 157 + +---------------------+----------+--------------------------------------------------+ 158 + | Arguments: | None | 159 + +---------------------+----------+----+---------------------------------------------+ 160 + | Return Values: | (int64) | R0 | ``SUCCESS (0)`` | 161 + | | | +---------------------------------------------+ 162 + | | | | ``NOT_SUPPORTED (-1)`` | 163 + | +----------+----+---------------------------------------------+ 164 + | | (uint64) | R1 | Bits [63:32] Reserved/Must be zero | 165 + | | | +---------------------------------------------+ 166 + | | | | Bits [31:16] Major version | 167 + | | | +---------------------------------------------+ 168 + | | | | Bits [15:0] Minor version | 169 + | +----------+----+---------------------------------------------+ 170 + | | (uint64) | R2 | Number of target implementations | 171 + | +----------+----+---------------------------------------------+ 172 + | | (uint64) | R3 | Reserved / Must be zero | 173 + +---------------------+----------+----+---------------------------------------------+ 174 + 175 + ``ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_CPUS_FUNC_ID`` 176 + ------------------------------------------------------- 177 + 178 + Request the target CPU implementation information for the Guest VM. The Guest kernel 179 + will use this information to enable the associated errata. 
180 + 181 + +---------------------+-------------------------------------------------------------+ 182 + | Presence: | Optional; KVM/ARM64 Guests only | 183 + +---------------------+-------------------------------------------------------------+ 184 + | Calling convention: | HVC64 | 185 + +---------------------+----------+--------------------------------------------------+ 186 + | Function ID: | (uint32) | 0xC6000041 | 187 + +---------------------+----------+----+---------------------------------------------+ 188 + | Arguments: | (uint64) | R1 | selected implementation index | 189 + | +----------+----+---------------------------------------------+ 190 + | | (uint64) | R2 | Reserved / Must be zero | 191 + | +----------+----+---------------------------------------------+ 192 + | | (uint64) | R3 | Reserved / Must be zero | 193 + +---------------------+----------+----+---------------------------------------------+ 194 + | Return Values: | (int64) | R0 | ``SUCCESS (0)`` | 195 + | | | +---------------------------------------------+ 196 + | | | | ``INVALID_PARAMETER (-3)`` | 197 + | +----------+----+---------------------------------------------+ 198 + | | (uint64) | R1 | MIDR_EL1 of the selected implementation | 199 + | +----------+----+---------------------------------------------+ 200 + | | (uint64) | R2 | REVIDR_EL1 of the selected implementation | 201 + | +----------+----+---------------------------------------------+ 202 + | | (uint64) | R3 | AIDR_EL1 of the selected implementation | 203 + +---------------------+----------+----+---------------------------------------------+
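A guest-side sketch of how the two hypercalls above compose, loosely modelled on what the new guest discovery code (kvm_arm_target_impl_cpu_init(), declared in asm/hypervisor.h below) is expected to do: query the version and implementation count first, then iterate over the implementations to collect their MIDR/REVIDR/AIDR tuples. The function-ID macro names follow the documentation above and are assumed to live in <linux/arm-smccc.h>; error handling is trimmed.

    #include <linux/arm-smccc.h>
    #include <asm/cputype.h>        /* struct target_impl_cpu */

    static void discover_target_impls(struct target_impl_cpu *cpus, u64 max)
    {
            struct arm_smccc_res res;
            u64 i, nr;

            arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_VER_FUNC_ID,
                                 &res);
            if ((long)res.a0 < 0)
                    return;                 /* discovery not implemented */

            nr = res.a2;                    /* R2: number of implementations */
            if (nr > max)
                    nr = max;

            for (i = 0; i < nr; i++) {
                    /* R1 on input selects the implementation index */
                    arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_CPUS_FUNC_ID,
                                         i, &res);
                    if ((long)res.a0 < 0)
                            break;

                    cpus[i].midr   = res.a1;
                    cpus[i].revidr = res.a2;
                    cpus[i].aidr   = res.a3;
            }
    }

The collected array would then be handed to cpu_errata_set_target_impl(), so that the errata framework matches against every target MIDR rather than only the local CPU's, as the cpu_errata.c changes below implement.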
+4 -1
Documentation/virt/kvm/devices/arm-vgic-its.rst
··· 126 126 ITS Restore Sequence: 127 127 --------------------- 128 128 129 - The following ordering must be followed when restoring the GIC and the ITS: 129 + The following ordering must be followed when restoring the GIC, ITS, and 130 + KVM_IRQFD assignments: 130 131 131 132 a) restore all guest memory and create vcpus 132 133 b) restore all redistributors ··· 139 138 2. Restore all other ``GITS_`` registers, except GITS_CTLR! 140 139 3. Load the ITS table data (KVM_DEV_ARM_ITS_RESTORE_TABLES) 141 140 4. Restore GITS_CTLR 141 + 142 + e) restore KVM_IRQFD assignments for MSIs 142 143 143 144 Then vcpus can be started. 144 145
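A hedged VMM-side sketch of the new step (e): the MSI irqfd assignments are re-established only after the ITS tables and GITS_CTLR have been restored, so that KVM can resolve each MSI route against the restored ITS state. The eventfd/GSI pair is whatever the VMM recorded for each MSI route when saving; error handling is omitted.

    /* Step (e): run once per saved MSI route, after GITS_CTLR restore. */
    static int restore_msi_irqfd(int vm_fd, int eventfd, unsigned int gsi)
    {
            struct kvm_irqfd irqfd = {
                    .fd  = eventfd,
                    .gsi = gsi,
            };

            return ioctl(vm_fd, KVM_IRQFD, &irqfd);
    }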
+11 -1
Documentation/virt/kvm/devices/arm-vgic-v3.rst
··· 291 291 | Aff3 | Aff2 | Aff1 | Aff0 | 292 292 293 293 Errors: 294 - 295 294 ======= ============================================= 296 295 -EINVAL vINTID is not multiple of 32 or info field is 297 296 not VGIC_LEVEL_INFO_LINE_LEVEL 298 297 ======= ============================================= 298 + 299 + KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ 300 + Attributes: 301 + 302 + The attr field of kvm_device_attr encodes the following values: 303 + 304 + bits: | 31 .... 5 | 4 .... 0 | 305 + values: | RES0 | vINTID | 306 + 307 + The vINTID specifies which interrupt is generated when the vGIC 308 + must generate a maintenance interrupt. This must be a PPI.
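A hedged userspace sketch of programming the new group: vgic_fd is assumed to be the device fd returned by KVM_CREATE_DEVICE for the VGICv3 device, and per the text above the vINTID sits in bits [4:0] of the attr field and must be a PPI.

    static int set_vgic_maint_irq(int vgic_fd, unsigned int vintid)
    {
            struct kvm_device_attr attr = {
                    .group = KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ,
                    .attr  = vintid,        /* bits [4:0]: vINTID (a PPI) */
            };

            return ioctl(vgic_fd, KVM_SET_DEVICE_ATTR, &attr);
    }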
+1
arch/arm64/include/asm/apple_m1_pmu.h
··· 37 37 #define PMCR0_PMI_ENABLE_8_9 GENMASK(45, 44) 38 38 39 39 #define SYS_IMP_APL_PMCR1_EL1 sys_reg(3, 1, 15, 1, 0) 40 + #define SYS_IMP_APL_PMCR1_EL12 sys_reg(3, 1, 15, 7, 2) 40 41 #define PMCR1_COUNT_A64_EL0_0_7 GENMASK(15, 8) 41 42 #define PMCR1_COUNT_A64_EL1_0_7 GENMASK(23, 16) 42 43 #define PMCR1_COUNT_A64_EL0_8_9 GENMASK(41, 40)
+2
arch/arm64/include/asm/cpucaps.h
··· 71 71 * KVM MPAM support doesn't rely on the host kernel supporting MPAM. 72 72 */ 73 73 return true; 74 + case ARM64_HAS_PMUV3: 75 + return IS_ENABLED(CONFIG_HW_PERF_EVENTS); 74 76 } 75 77 76 78 return true;
+5 -23
arch/arm64/include/asm/cpufeature.h
··· 525 525 return cpuid_feature_extract_unsigned_field_width(features, field, 4); 526 526 } 527 527 528 - /* 529 - * Fields that identify the version of the Performance Monitors Extension do 530 - * not follow the standard ID scheme. See ARM DDI 0487E.a page D13-2825, 531 - * "Alternative ID scheme used for the Performance Monitors Extension version". 532 - */ 533 - static inline u64 __attribute_const__ 534 - cpuid_feature_cap_perfmon_field(u64 features, int field, u64 cap) 535 - { 536 - u64 val = cpuid_feature_extract_unsigned_field(features, field); 537 - u64 mask = GENMASK_ULL(field + 3, field); 538 - 539 - /* Treat IMPLEMENTATION DEFINED functionality as unimplemented */ 540 - if (val == ID_AA64DFR0_EL1_PMUVer_IMP_DEF) 541 - val = 0; 542 - 543 - if (val > cap) { 544 - features &= ~mask; 545 - features |= (cap << field) & mask; 546 - } 547 - 548 - return features; 549 - } 550 - 551 528 static inline u64 arm64_ftr_mask(const struct arm64_ftr_bits *ftrp) 552 529 { 553 530 return (u64)GENMASK(ftrp->shift + ftrp->width - 1, ftrp->shift); ··· 841 864 static __always_inline bool system_supports_mpam_hcr(void) 842 865 { 843 866 return alternative_has_cap_unlikely(ARM64_MPAM_HCR); 867 + } 868 + 869 + static inline bool system_supports_pmuv3(void) 870 + { 871 + return cpus_have_final_cap(ARM64_HAS_PMUV3); 844 872 } 845 873 846 874 int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
+17 -23
arch/arm64/include/asm/cputype.h
··· 232 232 #define read_cpuid(reg) read_sysreg_s(SYS_ ## reg) 233 233 234 234 /* 235 + * The CPU ID never changes at run time, so we might as well tell the 236 + * compiler that it's constant. Use this function to read the CPU ID 237 + * rather than directly reading processor_id or read_cpuid() directly. 238 + */ 239 + static inline u32 __attribute_const__ read_cpuid_id(void) 240 + { 241 + return read_cpuid(MIDR_EL1); 242 + } 243 + 244 + /* 235 245 * Represent a range of MIDR values for a given CPU model and a 236 246 * range of variant/revision values. 237 247 * ··· 276 266 return _model == model && rv >= rv_min && rv <= rv_max; 277 267 } 278 268 279 - static inline bool is_midr_in_range(u32 midr, struct midr_range const *range) 280 - { 281 - return midr_is_cpu_model_range(midr, range->model, 282 - range->rv_min, range->rv_max); 283 - } 269 + struct target_impl_cpu { 270 + u64 midr; 271 + u64 revidr; 272 + u64 aidr; 273 + }; 284 274 285 - static inline bool 286 - is_midr_in_range_list(u32 midr, struct midr_range const *ranges) 287 - { 288 - while (ranges->model) 289 - if (is_midr_in_range(midr, ranges++)) 290 - return true; 291 - return false; 292 - } 293 - 294 - /* 295 - * The CPU ID never changes at run time, so we might as well tell the 296 - * compiler that it's constant. Use this function to read the CPU ID 297 - * rather than directly reading processor_id or read_cpuid() directly. 298 - */ 299 - static inline u32 __attribute_const__ read_cpuid_id(void) 300 - { 301 - return read_cpuid(MIDR_EL1); 302 - } 275 + bool cpu_errata_set_target_impl(u64 num, void *impl_cpus); 276 + bool is_midr_in_range_list(struct midr_range const *ranges); 303 277 304 278 static inline u64 __attribute_const__ read_cpuid_mpidr(void) 305 279 {
+1
arch/arm64/include/asm/hypervisor.h
··· 6 6 7 7 void kvm_init_hyp_services(void); 8 8 bool kvm_arm_hyp_service_available(u32 func_id); 9 + void kvm_arm_target_impl_cpu_init(void); 9 10 10 11 #ifdef CONFIG_ARM_PKVM_GUEST 11 12 void pkvm_init_hyp_services(void);
+2 -2
arch/arm64/include/asm/kvm_arm.h
··· 92 92 * SWIO: Turn set/way invalidates into set/way clean+invalidate 93 93 * PTW: Take a stage2 fault if a stage1 walk steps in device memory 94 94 * TID3: Trap EL1 reads of group 3 ID registers 95 - * TID2: Trap CTR_EL0, CCSIDR2_EL1, CLIDR_EL1, and CSSELR_EL1 95 + * TID1: Trap REVIDR_EL1, AIDR_EL1, and SMIDR_EL1 96 96 */ 97 97 #define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \ 98 98 HCR_BSU_IS | HCR_FB | HCR_TACR | \ 99 99 HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | HCR_TLOR | \ 100 - HCR_FMO | HCR_IMO | HCR_PTW | HCR_TID3) 100 + HCR_FMO | HCR_IMO | HCR_PTW | HCR_TID3 | HCR_TID1) 101 101 #define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK | HCR_ATA) 102 102 #define HCR_HOST_NVHE_PROTECTED_FLAGS (HCR_HOST_NVHE_FLAGS | HCR_TSC) 103 103 #define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
+37
arch/arm64/include/asm/kvm_emulate.h
··· 275 275 return vcpu->arch.fault.esr_el2; 276 276 } 277 277 278 + static inline bool guest_hyp_wfx_traps_enabled(const struct kvm_vcpu *vcpu) 279 + { 280 + u64 esr = kvm_vcpu_get_esr(vcpu); 281 + bool is_wfe = !!(esr & ESR_ELx_WFx_ISS_WFE); 282 + u64 hcr_el2 = __vcpu_sys_reg(vcpu, HCR_EL2); 283 + 284 + if (!vcpu_has_nv(vcpu) || vcpu_is_el2(vcpu)) 285 + return false; 286 + 287 + return ((is_wfe && (hcr_el2 & HCR_TWE)) || 288 + (!is_wfe && (hcr_el2 & HCR_TWI))); 289 + } 290 + 278 291 static __always_inline int kvm_vcpu_get_condition(const struct kvm_vcpu *vcpu) 279 292 { 280 293 u64 esr = kvm_vcpu_get_esr(vcpu); ··· 661 648 static inline bool guest_hyp_sve_traps_enabled(const struct kvm_vcpu *vcpu) 662 649 { 663 650 return __guest_hyp_cptr_xen_trap_enabled(vcpu, ZEN); 651 + } 652 + 653 + static inline void vcpu_set_hcrx(struct kvm_vcpu *vcpu) 654 + { 655 + struct kvm *kvm = vcpu->kvm; 656 + 657 + if (cpus_have_final_cap(ARM64_HAS_HCX)) { 658 + /* 659 + * In general, all HCRX_EL2 bits are gated by a feature. 660 + * The only reason we can set SMPME without checking any 661 + * feature is that its effects are not directly observable 662 + * from the guest. 663 + */ 664 + vcpu->arch.hcrx_el2 = HCRX_EL2_SMPME; 665 + 666 + if (kvm_has_feat(kvm, ID_AA64ISAR2_EL1, MOPS, IMP)) 667 + vcpu->arch.hcrx_el2 |= (HCRX_EL2_MSCEn | HCRX_EL2_MCE2); 668 + 669 + if (kvm_has_tcr2(kvm)) 670 + vcpu->arch.hcrx_el2 |= HCRX_EL2_TCR2En; 671 + 672 + if (kvm_has_fpmr(kvm)) 673 + vcpu->arch.hcrx_el2 |= HCRX_EL2_EnFPM; 674 + } 664 675 } 665 676 #endif /* __ARM64_KVM_EMULATE_H__ */
+56 -9
arch/arm64/include/asm/kvm_host.h
··· 44 44 45 45 #define KVM_REQ_SLEEP \ 46 46 KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) 47 - #define KVM_REQ_IRQ_PENDING KVM_ARCH_REQ(1) 48 - #define KVM_REQ_VCPU_RESET KVM_ARCH_REQ(2) 49 - #define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3) 50 - #define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4) 51 - #define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5) 52 - #define KVM_REQ_SUSPEND KVM_ARCH_REQ(6) 53 - #define KVM_REQ_RESYNC_PMU_EL0 KVM_ARCH_REQ(7) 54 - #define KVM_REQ_NESTED_S2_UNMAP KVM_ARCH_REQ(8) 47 + #define KVM_REQ_IRQ_PENDING KVM_ARCH_REQ(1) 48 + #define KVM_REQ_VCPU_RESET KVM_ARCH_REQ(2) 49 + #define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3) 50 + #define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4) 51 + #define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5) 52 + #define KVM_REQ_SUSPEND KVM_ARCH_REQ(6) 53 + #define KVM_REQ_RESYNC_PMU_EL0 KVM_ARCH_REQ(7) 54 + #define KVM_REQ_NESTED_S2_UNMAP KVM_ARCH_REQ(8) 55 + #define KVM_REQ_GUEST_HYP_IRQ_PENDING KVM_ARCH_REQ(9) 55 56 56 57 #define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \ 57 58 KVM_DIRTY_LOG_INITIALLY_SET) ··· 87 86 phys_addr_t head; 88 87 unsigned long nr_pages; 89 88 struct pkvm_mapping *mapping; /* only used from EL1 */ 89 + 90 + #define HYP_MEMCACHE_ACCOUNT_STAGE2 BIT(1) 91 + unsigned long flags; 90 92 }; 91 93 92 94 static inline void push_hyp_memcache(struct kvm_hyp_memcache *mc, ··· 241 237 struct kvm_smccc_features { 242 238 unsigned long std_bmap; 243 239 unsigned long std_hyp_bmap; 244 - unsigned long vendor_hyp_bmap; 240 + unsigned long vendor_hyp_bmap; /* Function numbers 0-63 */ 241 + unsigned long vendor_hyp_bmap_2; /* Function numbers 64-127 */ 245 242 }; 246 243 247 244 typedef unsigned int pkvm_handle_t; ··· 250 245 struct kvm_protected_vm { 251 246 pkvm_handle_t handle; 252 247 struct kvm_hyp_memcache teardown_mc; 248 + struct kvm_hyp_memcache stage2_teardown_mc; 253 249 bool enabled; 254 250 }; 255 251 ··· 340 334 #define KVM_ARCH_FLAG_FGU_INITIALIZED 8 341 335 /* SVE exposed to guest */ 342 336 #define KVM_ARCH_FLAG_GUEST_HAS_SVE 9 337 + /* MIDR_EL1, REVIDR_EL1, and AIDR_EL1 are writable from userspace */ 338 + #define KVM_ARCH_FLAG_WRITABLE_IMP_ID_REGS 10 343 339 unsigned long flags; 344 340 345 341 /* VM-wide vCPU feature set */ ··· 381 373 #define KVM_ARM_ID_REG_NUM (IDREG_IDX(sys_reg(3, 0, 0, 7, 7)) + 1) 382 374 u64 id_regs[KVM_ARM_ID_REG_NUM]; 383 375 376 + u64 midr_el1; 377 + u64 revidr_el1; 378 + u64 aidr_el1; 384 379 u64 ctr_el0; 385 380 386 381 /* Masks for VNCR-backed and general EL2 sysregs */ ··· 568 557 VNCR(CNTP_CVAL_EL0), 569 558 VNCR(CNTP_CTL_EL0), 570 559 560 + VNCR(ICH_LR0_EL2), 561 + VNCR(ICH_LR1_EL2), 562 + VNCR(ICH_LR2_EL2), 563 + VNCR(ICH_LR3_EL2), 564 + VNCR(ICH_LR4_EL2), 565 + VNCR(ICH_LR5_EL2), 566 + VNCR(ICH_LR6_EL2), 567 + VNCR(ICH_LR7_EL2), 568 + VNCR(ICH_LR8_EL2), 569 + VNCR(ICH_LR9_EL2), 570 + VNCR(ICH_LR10_EL2), 571 + VNCR(ICH_LR11_EL2), 572 + VNCR(ICH_LR12_EL2), 573 + VNCR(ICH_LR13_EL2), 574 + VNCR(ICH_LR14_EL2), 575 + VNCR(ICH_LR15_EL2), 576 + 577 + VNCR(ICH_AP0R0_EL2), 578 + VNCR(ICH_AP0R1_EL2), 579 + VNCR(ICH_AP0R2_EL2), 580 + VNCR(ICH_AP0R3_EL2), 581 + VNCR(ICH_AP1R0_EL2), 582 + VNCR(ICH_AP1R1_EL2), 583 + VNCR(ICH_AP1R2_EL2), 584 + VNCR(ICH_AP1R3_EL2), 571 585 VNCR(ICH_HCR_EL2), 586 + VNCR(ICH_VMCR_EL2), 572 587 573 588 NR_SYS_REGS /* Nothing after this line! 
*/ 574 589 }; ··· 906 869 #define VCPU_INITIALIZED __vcpu_single_flag(cflags, BIT(0)) 907 870 /* SVE config completed */ 908 871 #define VCPU_SVE_FINALIZED __vcpu_single_flag(cflags, BIT(1)) 872 + /* pKVM VCPU setup completed */ 873 + #define VCPU_PKVM_FINALIZED __vcpu_single_flag(cflags, BIT(2)) 909 874 910 875 /* Exception pending */ 911 876 #define PENDING_EXCEPTION __vcpu_single_flag(iflags, BIT(0)) ··· 958 919 #define PMUSERENR_ON_CPU __vcpu_single_flag(sflags, BIT(5)) 959 920 /* WFI instruction trapped */ 960 921 #define IN_WFI __vcpu_single_flag(sflags, BIT(6)) 922 + /* KVM is currently emulating a nested ERET */ 923 + #define IN_NESTED_ERET __vcpu_single_flag(sflags, BIT(7)) 961 924 962 925 963 926 /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */ ··· 1500 1459 return &ka->id_regs[IDREG_IDX(reg)]; 1501 1460 case SYS_CTR_EL0: 1502 1461 return &ka->ctr_el0; 1462 + case SYS_MIDR_EL1: 1463 + return &ka->midr_el1; 1464 + case SYS_REVIDR_EL1: 1465 + return &ka->revidr_el1; 1466 + case SYS_AIDR_EL1: 1467 + return &ka->aidr_el1; 1503 1468 default: 1504 1469 WARN_ON_ONCE(1); 1505 1470 return NULL;
+2
arch/arm64/include/asm/kvm_hyp.h
··· 76 76 77 77 int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu); 78 78 79 + u64 __gic_v3_get_lr(unsigned int lr); 80 + 79 81 void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if); 80 82 void __vgic_v3_restore_state(struct vgic_v3_cpu_if *cpu_if); 81 83 void __vgic_v3_activate_traps(struct vgic_v3_cpu_if *cpu_if);
+1
arch/arm64/include/asm/kvm_nested.h
··· 188 188 } 189 189 190 190 int kvm_init_nv_sysregs(struct kvm_vcpu *vcpu); 191 + u64 limit_nv_id_reg(struct kvm *kvm, u32 reg, u64 val); 191 192 192 193 #ifdef CONFIG_ARM64_PTR_AUTH 193 194 bool kvm_auth_eretax(struct kvm_vcpu *vcpu, u64 *elr);
+1
arch/arm64/include/asm/kvm_pkvm.h
··· 19 19 int pkvm_init_host_vm(struct kvm *kvm); 20 20 int pkvm_create_hyp_vm(struct kvm *kvm); 21 21 void pkvm_destroy_hyp_vm(struct kvm *kvm); 22 + int pkvm_create_hyp_vcpu(struct kvm_vcpu *vcpu); 22 23 23 24 /* 24 25 * This functions as an allow-list of protected VM capabilities.
+1 -2
arch/arm64/include/asm/mmu.h
··· 101 101 if (IS_ENABLED(CONFIG_CAVIUM_ERRATUM_27456)) { 102 102 extern const struct midr_range cavium_erratum_27456_cpus[]; 103 103 104 - if (is_midr_in_range_list(read_cpuid_id(), 105 - cavium_erratum_27456_cpus)) 104 + if (is_midr_in_range_list(cavium_erratum_27456_cpus)) 106 105 return false; 107 106 } 108 107
-30
arch/arm64/include/asm/sysreg.h
··· 562 562 563 563 #define SYS_ICH_VSEIR_EL2 sys_reg(3, 4, 12, 9, 4) 564 564 #define SYS_ICC_SRE_EL2 sys_reg(3, 4, 12, 9, 5) 565 - #define SYS_ICH_HCR_EL2 sys_reg(3, 4, 12, 11, 0) 566 - #define SYS_ICH_VTR_EL2 sys_reg(3, 4, 12, 11, 1) 567 - #define SYS_ICH_MISR_EL2 sys_reg(3, 4, 12, 11, 2) 568 565 #define SYS_ICH_EISR_EL2 sys_reg(3, 4, 12, 11, 3) 569 566 #define SYS_ICH_ELRSR_EL2 sys_reg(3, 4, 12, 11, 5) 570 567 #define SYS_ICH_VMCR_EL2 sys_reg(3, 4, 12, 11, 7) ··· 982 985 #define SYS_MPIDR_SAFE_VAL (BIT(31)) 983 986 984 987 /* GIC Hypervisor interface registers */ 985 - /* ICH_MISR_EL2 bit definitions */ 986 - #define ICH_MISR_EOI (1 << 0) 987 - #define ICH_MISR_U (1 << 1) 988 - 989 988 /* ICH_LR*_EL2 bit definitions */ 990 989 #define ICH_LR_VIRTUAL_ID_MASK ((1ULL << 32) - 1) 991 990 ··· 995 1002 #define ICH_LR_PHYS_ID_MASK (0x3ffULL << ICH_LR_PHYS_ID_SHIFT) 996 1003 #define ICH_LR_PRIORITY_SHIFT 48 997 1004 #define ICH_LR_PRIORITY_MASK (0xffULL << ICH_LR_PRIORITY_SHIFT) 998 - 999 - /* ICH_HCR_EL2 bit definitions */ 1000 - #define ICH_HCR_EN (1 << 0) 1001 - #define ICH_HCR_UIE (1 << 1) 1002 - #define ICH_HCR_NPIE (1 << 3) 1003 - #define ICH_HCR_TC (1 << 10) 1004 - #define ICH_HCR_TALL0 (1 << 11) 1005 - #define ICH_HCR_TALL1 (1 << 12) 1006 - #define ICH_HCR_TDIR (1 << 14) 1007 - #define ICH_HCR_EOIcount_SHIFT 27 1008 - #define ICH_HCR_EOIcount_MASK (0x1f << ICH_HCR_EOIcount_SHIFT) 1009 1005 1010 1006 /* ICH_VMCR_EL2 bit definitions */ 1011 1007 #define ICH_VMCR_ACK_CTL_SHIFT 2 ··· 1015 1033 #define ICH_VMCR_ENG0_MASK (1 << ICH_VMCR_ENG0_SHIFT) 1016 1034 #define ICH_VMCR_ENG1_SHIFT 1 1017 1035 #define ICH_VMCR_ENG1_MASK (1 << ICH_VMCR_ENG1_SHIFT) 1018 - 1019 - /* ICH_VTR_EL2 bit definitions */ 1020 - #define ICH_VTR_PRI_BITS_SHIFT 29 1021 - #define ICH_VTR_PRI_BITS_MASK (7 << ICH_VTR_PRI_BITS_SHIFT) 1022 - #define ICH_VTR_ID_BITS_SHIFT 23 1023 - #define ICH_VTR_ID_BITS_MASK (7 << ICH_VTR_ID_BITS_SHIFT) 1024 - #define ICH_VTR_SEIS_SHIFT 22 1025 - #define ICH_VTR_SEIS_MASK (1 << ICH_VTR_SEIS_SHIFT) 1026 - #define ICH_VTR_A3V_SHIFT 21 1027 - #define ICH_VTR_A3V_MASK (1 << ICH_VTR_A3V_SHIFT) 1028 - #define ICH_VTR_TDS_SHIFT 19 1029 - #define ICH_VTR_TDS_MASK (1 << ICH_VTR_TDS_SHIFT) 1030 1036 1031 1037 /* 1032 1038 * Permission Indirection Extension (PIE) permission encodings.
+14
arch/arm64/include/uapi/asm/kvm.h
··· 105 105 #define KVM_ARM_VCPU_PTRAUTH_ADDRESS 5 /* VCPU uses address authentication */ 106 106 #define KVM_ARM_VCPU_PTRAUTH_GENERIC 6 /* VCPU uses generic authentication */ 107 107 #define KVM_ARM_VCPU_HAS_EL2 7 /* Support nested virtualization */ 108 + #define KVM_ARM_VCPU_HAS_EL2_E2H0 8 /* Limit NV support to E2H RES0 */ 108 109 109 110 struct kvm_vcpu_init { 110 111 __u32 target; ··· 372 371 #endif 373 372 }; 374 373 374 + /* Vendor hyper call function numbers 0-63 */ 375 375 #define KVM_REG_ARM_VENDOR_HYP_BMAP KVM_REG_ARM_FW_FEAT_BMAP_REG(2) 376 376 377 377 enum { ··· 380 378 KVM_REG_ARM_VENDOR_HYP_BIT_PTP = 1, 381 379 #ifdef __KERNEL__ 382 380 KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_COUNT, 381 + #endif 382 + }; 383 + 384 + /* Vendor hyper call function numbers 64-127 */ 385 + #define KVM_REG_ARM_VENDOR_HYP_BMAP_2 KVM_REG_ARM_FW_FEAT_BMAP_REG(3) 386 + 387 + enum { 388 + KVM_REG_ARM_VENDOR_HYP_BIT_DISCOVER_IMPL_VER = 0, 389 + KVM_REG_ARM_VENDOR_HYP_BIT_DISCOVER_IMPL_CPUS = 1, 390 + #ifdef __KERNEL__ 391 + KVM_REG_ARM_VENDOR_HYP_BMAP_2_BIT_COUNT, 383 392 #endif 384 393 }; 385 394 ··· 416 403 #define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6 417 404 #define KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO 7 418 405 #define KVM_DEV_ARM_VGIC_GRP_ITS_REGS 8 406 + #define KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ 9 419 407 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT 10 420 408 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK \ 421 409 (0x3fffffULL << KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT)
+107 -10
arch/arm64/kernel/cpu_errata.c
··· 14 14 #include <asm/kvm_asm.h> 15 15 #include <asm/smp_plat.h> 16 16 17 + static u64 target_impl_cpu_num; 18 + static struct target_impl_cpu *target_impl_cpus; 19 + 20 + bool cpu_errata_set_target_impl(u64 num, void *impl_cpus) 21 + { 22 + if (target_impl_cpu_num || !num || !impl_cpus) 23 + return false; 24 + 25 + target_impl_cpu_num = num; 26 + target_impl_cpus = impl_cpus; 27 + return true; 28 + } 29 + 30 + static inline bool is_midr_in_range(struct midr_range const *range) 31 + { 32 + int i; 33 + 34 + if (!target_impl_cpu_num) 35 + return midr_is_cpu_model_range(read_cpuid_id(), range->model, 36 + range->rv_min, range->rv_max); 37 + 38 + for (i = 0; i < target_impl_cpu_num; i++) { 39 + if (midr_is_cpu_model_range(target_impl_cpus[i].midr, 40 + range->model, 41 + range->rv_min, range->rv_max)) 42 + return true; 43 + } 44 + return false; 45 + } 46 + 47 + bool is_midr_in_range_list(struct midr_range const *ranges) 48 + { 49 + while (ranges->model) 50 + if (is_midr_in_range(ranges++)) 51 + return true; 52 + return false; 53 + } 54 + EXPORT_SYMBOL_GPL(is_midr_in_range_list); 55 + 17 56 static bool __maybe_unused 18 - is_affected_midr_range(const struct arm64_cpu_capabilities *entry, int scope) 57 + __is_affected_midr_range(const struct arm64_cpu_capabilities *entry, 58 + u32 midr, u32 revidr) 19 59 { 20 60 const struct arm64_midr_revidr *fix; 21 - u32 midr = read_cpuid_id(), revidr; 22 - 23 - WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible()); 24 - if (!is_midr_in_range(midr, &entry->midr_range)) 61 + if (!is_midr_in_range(&entry->midr_range)) 25 62 return false; 26 63 27 64 midr &= MIDR_REVISION_MASK | MIDR_VARIANT_MASK; 28 - revidr = read_cpuid(REVIDR_EL1); 29 65 for (fix = entry->fixed_revs; fix && fix->revidr_mask; fix++) 30 66 if (midr == fix->midr_rv && (revidr & fix->revidr_mask)) 31 67 return false; 32 - 33 68 return true; 69 + } 70 + 71 + static bool __maybe_unused 72 + is_affected_midr_range(const struct arm64_cpu_capabilities *entry, int scope) 73 + { 74 + int i; 75 + 76 + if (!target_impl_cpu_num) { 77 + WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible()); 78 + return __is_affected_midr_range(entry, read_cpuid_id(), 79 + read_cpuid(REVIDR_EL1)); 80 + } 81 + 82 + for (i = 0; i < target_impl_cpu_num; i++) { 83 + if (__is_affected_midr_range(entry, target_impl_cpus[i].midr, 84 + target_impl_cpus[i].midr)) 85 + return true; 86 + } 87 + return false; 34 88 } 35 89 36 90 static bool __maybe_unused ··· 92 38 int scope) 93 39 { 94 40 WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible()); 95 - return is_midr_in_range_list(read_cpuid_id(), entry->midr_range_list); 41 + return is_midr_in_range_list(entry->midr_range_list); 96 42 } 97 43 98 44 static bool __maybe_unused ··· 240 186 has_neoverse_n1_erratum_1542419(const struct arm64_cpu_capabilities *entry, 241 187 int scope) 242 188 { 243 - u32 midr = read_cpuid_id(); 244 189 bool has_dic = read_cpuid_cachetype() & BIT(CTR_EL0_DIC_SHIFT); 245 190 const struct midr_range range = MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1); 246 191 247 192 WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible()); 248 - return is_midr_in_range(midr, &range) && has_dic; 193 + return is_midr_in_range(&range) && has_dic; 194 + } 195 + 196 + static const struct midr_range impdef_pmuv3_cpus[] = { 197 + MIDR_ALL_VERSIONS(MIDR_APPLE_M1_ICESTORM), 198 + MIDR_ALL_VERSIONS(MIDR_APPLE_M1_FIRESTORM), 199 + MIDR_ALL_VERSIONS(MIDR_APPLE_M1_ICESTORM_PRO), 200 + MIDR_ALL_VERSIONS(MIDR_APPLE_M1_FIRESTORM_PRO), 201 + MIDR_ALL_VERSIONS(MIDR_APPLE_M1_ICESTORM_MAX), 202 + 
MIDR_ALL_VERSIONS(MIDR_APPLE_M1_FIRESTORM_MAX), 203 + MIDR_ALL_VERSIONS(MIDR_APPLE_M2_BLIZZARD), 204 + MIDR_ALL_VERSIONS(MIDR_APPLE_M2_AVALANCHE), 205 + MIDR_ALL_VERSIONS(MIDR_APPLE_M2_BLIZZARD_PRO), 206 + MIDR_ALL_VERSIONS(MIDR_APPLE_M2_AVALANCHE_PRO), 207 + MIDR_ALL_VERSIONS(MIDR_APPLE_M2_BLIZZARD_MAX), 208 + MIDR_ALL_VERSIONS(MIDR_APPLE_M2_AVALANCHE_MAX), 209 + {}, 210 + }; 211 + 212 + static bool has_impdef_pmuv3(const struct arm64_cpu_capabilities *entry, int scope) 213 + { 214 + u64 dfr0 = read_sanitised_ftr_reg(SYS_ID_AA64DFR0_EL1); 215 + unsigned int pmuver; 216 + 217 + if (!is_kernel_in_hyp_mode()) 218 + return false; 219 + 220 + pmuver = cpuid_feature_extract_unsigned_field(dfr0, 221 + ID_AA64DFR0_EL1_PMUVer_SHIFT); 222 + if (pmuver != ID_AA64DFR0_EL1_PMUVer_IMP_DEF) 223 + return false; 224 + 225 + return is_midr_in_range_list(impdef_pmuv3_cpus); 226 + } 227 + 228 + static void cpu_enable_impdef_pmuv3_traps(const struct arm64_cpu_capabilities *__unused) 229 + { 230 + sysreg_clear_set_s(SYS_HACR_EL2, 0, BIT(56)); 249 231 } 250 232 251 233 #ifdef CONFIG_ARM64_WORKAROUND_REPEAT_TLBI ··· 883 793 MIDR_ALL_VERSIONS(MIDR_QCOM_ORYON_X1), 884 794 {} 885 795 })), 796 + }, 797 + { 798 + .desc = "Apple IMPDEF PMUv3 Traps", 799 + .capability = ARM64_WORKAROUND_PMUV3_IMPDEF_TRAPS, 800 + .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM, 801 + .matches = has_impdef_pmuv3, 802 + .cpu_enable = cpu_enable_impdef_pmuv3_traps, 886 803 }, 887 804 { 888 805 }
+48 -5
arch/arm64/kernel/cpufeature.c
··· 86 86 #include <asm/kvm_host.h> 87 87 #include <asm/mmu_context.h> 88 88 #include <asm/mte.h> 89 + #include <asm/hypervisor.h> 89 90 #include <asm/processor.h> 90 91 #include <asm/smp.h> 91 92 #include <asm/sysreg.h> ··· 498 497 499 498 static const struct arm64_ftr_bits ftr_id_aa64mmfr4[] = { 500 499 S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR4_EL1_E2H0_SHIFT, 4, 0), 500 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR4_EL1_NV_frac_SHIFT, 4, 0), 501 501 ARM64_FTR_END, 502 502 }; 503 503 ··· 1794 1792 char const *str = "kpti command line option"; 1795 1793 bool meltdown_safe; 1796 1794 1797 - meltdown_safe = is_midr_in_range_list(read_cpuid_id(), kpti_safe_list); 1795 + meltdown_safe = is_midr_in_range_list(kpti_safe_list); 1798 1796 1799 1797 /* Defer to CPU feature registers */ 1800 1798 if (has_cpuid_feature(entry, scope)) ··· 1864 1862 1865 1863 return (__system_matches_cap(ARM64_HAS_NESTED_VIRT) && 1866 1864 !(has_cpuid_feature(entry, scope) || 1867 - is_midr_in_range_list(read_cpuid_id(), nv1_ni_list))); 1865 + is_midr_in_range_list(nv1_ni_list))); 1868 1866 } 1869 1867 1870 1868 #if defined(ID_AA64MMFR0_EL1_TGRAN_LPA2) && defined(ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_LPA2) ··· 1897 1895 static bool has_lpa2(const struct arm64_cpu_capabilities *entry, int scope) 1898 1896 { 1899 1897 return false; 1898 + } 1899 + #endif 1900 + 1901 + #ifdef CONFIG_HW_PERF_EVENTS 1902 + static bool has_pmuv3(const struct arm64_cpu_capabilities *entry, int scope) 1903 + { 1904 + u64 dfr0 = read_sanitised_ftr_reg(SYS_ID_AA64DFR0_EL1); 1905 + unsigned int pmuver; 1906 + 1907 + /* 1908 + * PMUVer follows the standard ID scheme for an unsigned field with the 1909 + * exception of 0xF (IMP_DEF) which is treated specially and implies 1910 + * FEAT_PMUv3 is not implemented. 1911 + * 1912 + * See DDI0487L.a D24.1.3.2 for more details. 
1913 + */ 1914 + pmuver = cpuid_feature_extract_unsigned_field(dfr0, 1915 + ID_AA64DFR0_EL1_PMUVer_SHIFT); 1916 + if (pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF) 1917 + return false; 1918 + 1919 + return pmuver >= ID_AA64DFR0_EL1_PMUVer_IMP; 1900 1920 } 1901 1921 #endif 1902 1922 ··· 2069 2045 {}, 2070 2046 }; 2071 2047 2072 - return is_midr_in_range_list(read_cpuid_id(), cpus); 2048 + return is_midr_in_range_list(cpus); 2073 2049 } 2074 2050 2075 2051 static bool cpu_can_use_dbm(const struct arm64_cpu_capabilities *cap) ··· 2186 2162 if (kvm_get_mode() != KVM_MODE_NV) 2187 2163 return false; 2188 2164 2189 - if (!has_cpuid_feature(cap, scope)) { 2165 + if (!cpucap_multi_entry_cap_matches(cap, scope)) { 2190 2166 pr_warn("unavailable: %s\n", cap->desc); 2191 2167 return false; 2192 2168 } ··· 2543 2519 .capability = ARM64_HAS_NESTED_VIRT, 2544 2520 .type = ARM64_CPUCAP_SYSTEM_FEATURE, 2545 2521 .matches = has_nested_virt_support, 2546 - ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, NV, NV2) 2522 + .match_list = (const struct arm64_cpu_capabilities []){ 2523 + { 2524 + .matches = has_cpuid_feature, 2525 + ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, NV, NV2) 2526 + }, 2527 + { 2528 + .matches = has_cpuid_feature, 2529 + ARM64_CPUID_FIELDS(ID_AA64MMFR4_EL1, NV_frac, NV2_ONLY) 2530 + }, 2531 + { /* Sentinel */ } 2532 + }, 2547 2533 }, 2548 2534 { 2549 2535 .capability = ARM64_HAS_32BIT_EL0_DO_NOT_USE, ··· 3031 2997 .cpu_enable = cpu_enable_gcs, 3032 2998 .matches = has_cpuid_feature, 3033 2999 ARM64_CPUID_FIELDS(ID_AA64PFR1_EL1, GCS, IMP) 3000 + }, 3001 + #endif 3002 + #ifdef CONFIG_HW_PERF_EVENTS 3003 + { 3004 + .desc = "PMUv3", 3005 + .capability = ARM64_HAS_PMUV3, 3006 + .type = ARM64_CPUCAP_SYSTEM_FEATURE, 3007 + .matches = has_pmuv3, 3034 3008 }, 3035 3009 #endif 3036 3010 {}, ··· 3722 3680 3723 3681 static void __init setup_boot_cpu_capabilities(void) 3724 3682 { 3683 + kvm_arm_target_impl_cpu_init(); 3725 3684 /* 3726 3685 * The boot CPU's feature register values have been recorded. Detect 3727 3686 * boot cpucaps and local cpucaps for the boot CPU, then enable and
+1 -5
arch/arm64/kernel/image-vars.h
··· 49 49 PROVIDE(__pi_arm64_use_ng_mappings = arm64_use_ng_mappings); 50 50 #ifdef CONFIG_CAVIUM_ERRATUM_27456 51 51 PROVIDE(__pi_cavium_erratum_27456_cpus = cavium_erratum_27456_cpus); 52 + PROVIDE(__pi_is_midr_in_range_list = is_midr_in_range_list); 52 53 #endif 53 54 PROVIDE(__pi__ctype = _ctype); 54 55 PROVIDE(__pi_memstart_offset_seed = memstart_offset_seed); ··· 112 111 /* EL2 exception handling */ 113 112 KVM_NVHE_ALIAS(__start___kvm_ex_table); 114 113 KVM_NVHE_ALIAS(__stop___kvm_ex_table); 115 - 116 - /* PMU available static key */ 117 - #ifdef CONFIG_HW_PERF_EVENTS 118 - KVM_NVHE_ALIAS(kvm_arm_pmu_available); 119 - #endif 120 114 121 115 /* Position-independent library routines */ 122 116 KVM_NVHE_ALIAS_HYP(clear_page, __pi_clear_page);
+8 -9
arch/arm64/kernel/proton-pack.c
··· 172 172 return SPECTRE_UNAFFECTED; 173 173 174 174 /* Alternatively, we have a list of unaffected CPUs */ 175 - if (is_midr_in_range_list(read_cpuid_id(), spectre_v2_safe_list)) 175 + if (is_midr_in_range_list(spectre_v2_safe_list)) 176 176 return SPECTRE_UNAFFECTED; 177 177 178 178 return SPECTRE_VULNERABLE; ··· 331 331 }; 332 332 333 333 WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible()); 334 - return is_midr_in_range_list(read_cpuid_id(), spectre_v3a_unsafe_list); 334 + return is_midr_in_range_list(spectre_v3a_unsafe_list); 335 335 } 336 336 337 337 void spectre_v3a_enable_mitigation(const struct arm64_cpu_capabilities *__unused) ··· 475 475 { /* sentinel */ }, 476 476 }; 477 477 478 - if (is_midr_in_range_list(read_cpuid_id(), spectre_v4_safe_list)) 478 + if (is_midr_in_range_list(spectre_v4_safe_list)) 479 479 return SPECTRE_UNAFFECTED; 480 480 481 481 /* CPU features are detected first */ ··· 878 878 {}, 879 879 }; 880 880 881 - if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k32_list)) 881 + if (is_midr_in_range_list(spectre_bhb_k32_list)) 882 882 k = 32; 883 - else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k24_list)) 883 + else if (is_midr_in_range_list(spectre_bhb_k24_list)) 884 884 k = 24; 885 - else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k11_list)) 885 + else if (is_midr_in_range_list(spectre_bhb_k11_list)) 886 886 k = 11; 887 - else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k8_list)) 887 + else if (is_midr_in_range_list(spectre_bhb_k8_list)) 888 888 k = 8; 889 889 890 890 max_bhb_k = max(max_bhb_k, k); ··· 926 926 MIDR_ALL_VERSIONS(MIDR_CORTEX_A75), 927 927 {}, 928 928 }; 929 - bool cpu_in_list = is_midr_in_range_list(read_cpuid_id(), 930 - spectre_bhb_firmware_mitigated_list); 929 + bool cpu_in_list = is_midr_in_range_list(spectre_bhb_firmware_mitigated_list); 931 930 932 931 if (scope != SCOPE_LOCAL_CPU) 933 932 return system_affected;
+1 -1
arch/arm64/kvm/Makefile
··· 23 23 vgic/vgic-v3.o vgic/vgic-v4.o \ 24 24 vgic/vgic-mmio.o vgic/vgic-mmio-v2.o \ 25 25 vgic/vgic-mmio-v3.o vgic/vgic-kvm-device.o \ 26 - vgic/vgic-its.o vgic/vgic-debug.o 26 + vgic/vgic-its.o vgic/vgic-debug.o vgic/vgic-v3-nested.o 27 27 28 28 kvm-$(CONFIG_HW_PERF_EVENTS) += pmu-emul.o pmu.o 29 29 kvm-$(CONFIG_ARM64_PTR_AUTH) += pauth.o
+64 -12
arch/arm64/kvm/arm.c
··· 125 125 } 126 126 mutex_unlock(&kvm->slots_lock); 127 127 break; 128 + case KVM_CAP_ARM_WRITABLE_IMP_ID_REGS: 129 + mutex_lock(&kvm->lock); 130 + if (!kvm->created_vcpus) { 131 + r = 0; 132 + set_bit(KVM_ARCH_FLAG_WRITABLE_IMP_ID_REGS, &kvm->arch.flags); 133 + } 134 + mutex_unlock(&kvm->lock); 135 + break; 128 136 default: 129 137 break; 130 138 } ··· 321 313 case KVM_CAP_ARM_SYSTEM_SUSPEND: 322 314 case KVM_CAP_IRQFD_RESAMPLE: 323 315 case KVM_CAP_COUNTER_OFFSET: 316 + case KVM_CAP_ARM_WRITABLE_IMP_ID_REGS: 324 317 r = 1; 325 318 break; 326 319 case KVM_CAP_SET_GUEST_DEBUG2: ··· 375 366 r = get_num_wrps(); 376 367 break; 377 368 case KVM_CAP_ARM_PMU_V3: 378 - r = kvm_arm_support_pmu_v3(); 369 + r = kvm_supports_guest_pmuv3(); 379 370 break; 380 371 case KVM_CAP_ARM_INJECT_SERROR_ESR: 381 372 r = cpus_have_final_cap(ARM64_HAS_RAS_EXTN); ··· 475 466 if (err) 476 467 return err; 477 468 478 - return kvm_share_hyp(vcpu, vcpu + 1); 469 + err = kvm_share_hyp(vcpu, vcpu + 1); 470 + if (err) 471 + kvm_vgic_vcpu_destroy(vcpu); 472 + 473 + return err; 479 474 } 480 475 481 476 void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu) ··· 599 586 nommu: 600 587 vcpu->cpu = cpu; 601 588 602 - kvm_vgic_load(vcpu); 589 + /* 590 + * The timer must be loaded before the vgic to correctly set up physical 591 + * interrupt deactivation in nested state (e.g. timer interrupt). 592 + */ 603 593 kvm_timer_vcpu_load(vcpu); 594 + kvm_vgic_load(vcpu); 604 595 kvm_vcpu_load_debug(vcpu); 605 596 if (has_vhe()) 606 597 kvm_vcpu_load_vhe(vcpu); ··· 842 825 if (ret) 843 826 return ret; 844 827 828 + if (vcpu_has_nv(vcpu)) { 829 + ret = kvm_vgic_vcpu_nv_init(vcpu); 830 + if (ret) 831 + return ret; 832 + } 833 + 845 834 /* 846 835 * This needs to happen after any restriction has been applied 847 836 * to the feature set. ··· 858 835 if (ret) 859 836 return ret; 860 837 861 - ret = kvm_arm_pmu_v3_enable(vcpu); 862 - if (ret) 863 - return ret; 838 + if (kvm_vcpu_has_pmu(vcpu)) { 839 + ret = kvm_arm_pmu_v3_enable(vcpu); 840 + if (ret) 841 + return ret; 842 + } 864 843 865 844 if (is_protected_kvm_enabled()) { 866 845 ret = pkvm_create_hyp_vm(kvm); 846 + if (ret) 847 + return ret; 848 + 849 + ret = pkvm_create_hyp_vcpu(vcpu); 867 850 if (ret) 868 851 return ret; 869 852 } ··· 1177 1148 */ 1178 1149 preempt_disable(); 1179 1150 1180 - kvm_pmu_flush_hwstate(vcpu); 1151 + if (kvm_vcpu_has_pmu(vcpu)) 1152 + kvm_pmu_flush_hwstate(vcpu); 1181 1153 1182 1154 local_irq_disable(); 1183 1155 ··· 1197 1167 if (ret <= 0 || kvm_vcpu_exit_request(vcpu, &ret)) { 1198 1168 vcpu->mode = OUTSIDE_GUEST_MODE; 1199 1169 isb(); /* Ensure work in x_flush_hwstate is committed */ 1200 - kvm_pmu_sync_hwstate(vcpu); 1170 + if (kvm_vcpu_has_pmu(vcpu)) 1171 + kvm_pmu_sync_hwstate(vcpu); 1201 1172 if (unlikely(!irqchip_in_kernel(vcpu->kvm))) 1202 1173 kvm_timer_sync_user(vcpu); 1203 1174 kvm_vgic_sync_hwstate(vcpu); ··· 1228 1197 * that the vgic can properly sample the updated state of the 1229 1198 * interrupt line. 
1230 1199 */ 1231 - kvm_pmu_sync_hwstate(vcpu); 1200 + if (kvm_vcpu_has_pmu(vcpu)) 1201 + kvm_pmu_sync_hwstate(vcpu); 1232 1202 1233 1203 /* 1234 1204 * Sync the vgic state before syncing the timer state because ··· 1418 1386 if (!cpus_have_final_cap(ARM64_HAS_32BIT_EL1)) 1419 1387 clear_bit(KVM_ARM_VCPU_EL1_32BIT, &features); 1420 1388 1421 - if (!kvm_arm_support_pmu_v3()) 1389 + if (!kvm_supports_guest_pmuv3()) 1422 1390 clear_bit(KVM_ARM_VCPU_PMU_V3, &features); 1423 1391 1424 1392 if (!system_supports_sve()) ··· 2339 2307 goto out; 2340 2308 } 2341 2309 2310 + if (kvm_mode == KVM_MODE_NV && 2311 + !(vgic_present && kvm_vgic_global_state.type == VGIC_V3)) { 2312 + kvm_err("NV support requires GICv3, giving up\n"); 2313 + err = -EINVAL; 2314 + goto out; 2315 + } 2316 + 2342 2317 /* 2343 2318 * Init HYP architected timer support 2344 2319 */ ··· 2753 2714 { 2754 2715 struct kvm_kernel_irqfd *irqfd = 2755 2716 container_of(cons, struct kvm_kernel_irqfd, consumer); 2717 + struct kvm_kernel_irq_routing_entry *irq_entry = &irqfd->irq_entry; 2718 + 2719 + /* 2720 + * The only thing we have a chance of directly-injecting is LPIs. Maybe 2721 + * one day... 2722 + */ 2723 + if (irq_entry->type != KVM_IRQ_ROUTING_MSI) 2724 + return 0; 2756 2725 2757 2726 return kvm_vgic_v4_set_forwarding(irqfd->kvm, prod->irq, 2758 2727 &irqfd->irq_entry); ··· 2770 2723 { 2771 2724 struct kvm_kernel_irqfd *irqfd = 2772 2725 container_of(cons, struct kvm_kernel_irqfd, consumer); 2726 + struct kvm_kernel_irq_routing_entry *irq_entry = &irqfd->irq_entry; 2727 + 2728 + if (irq_entry->type != KVM_IRQ_ROUTING_MSI) 2729 + return; 2773 2730 2774 2731 kvm_vgic_v4_unset_forwarding(irqfd->kvm, prod->irq, 2775 2732 &irqfd->irq_entry); ··· 2854 2803 if (err) 2855 2804 goto out_hyp; 2856 2805 2857 - kvm_info("%s%sVHE mode initialized successfully\n", 2806 + kvm_info("%s%sVHE%s mode initialized successfully\n", 2858 2807 in_hyp_mode ? "" : (is_protected_kvm_enabled() ? 2859 2808 "Protected " : "Hyp "), 2860 2809 in_hyp_mode ? "" : (cpus_have_final_cap(ARM64_KVM_HVHE) ? 2861 - "h" : "n")); 2810 + "h" : "n"), 2811 + cpus_have_final_cap(ARM64_HAS_NESTED_VIRT) ? "+NV2": ""); 2862 2812 2863 2813 /* 2864 2814 * FIXME: Do something reasonable if kvm_init() fails after pKVM
+14 -10
arch/arm64/kvm/emulate-nested.c
··· 412 412 }, 413 413 [CGT_ICH_HCR_TC] = { 414 414 .index = ICH_HCR_EL2, 415 - .value = ICH_HCR_TC, 416 - .mask = ICH_HCR_TC, 415 + .value = ICH_HCR_EL2_TC, 416 + .mask = ICH_HCR_EL2_TC, 417 417 .behaviour = BEHAVE_FORWARD_RW, 418 418 }, 419 419 [CGT_ICH_HCR_TALL0] = { 420 420 .index = ICH_HCR_EL2, 421 - .value = ICH_HCR_TALL0, 422 - .mask = ICH_HCR_TALL0, 421 + .value = ICH_HCR_EL2_TALL0, 422 + .mask = ICH_HCR_EL2_TALL0, 423 423 .behaviour = BEHAVE_FORWARD_RW, 424 424 }, 425 425 [CGT_ICH_HCR_TALL1] = { 426 426 .index = ICH_HCR_EL2, 427 - .value = ICH_HCR_TALL1, 428 - .mask = ICH_HCR_TALL1, 427 + .value = ICH_HCR_EL2_TALL1, 428 + .mask = ICH_HCR_EL2_TALL1, 429 429 .behaviour = BEHAVE_FORWARD_RW, 430 430 }, 431 431 [CGT_ICH_HCR_TDIR] = { 432 432 .index = ICH_HCR_EL2, 433 - .value = ICH_HCR_TDIR, 434 - .mask = ICH_HCR_TDIR, 433 + .value = ICH_HCR_EL2_TDIR, 434 + .mask = ICH_HCR_EL2_TDIR, 435 435 .behaviour = BEHAVE_FORWARD_RW, 436 436 }, 437 437 }; ··· 2503 2503 } 2504 2504 2505 2505 preempt_disable(); 2506 + vcpu_set_flag(vcpu, IN_NESTED_ERET); 2506 2507 kvm_arch_vcpu_put(vcpu); 2507 2508 2508 2509 if (!esr_iss_is_eretax(esr)) ··· 2515 2514 *vcpu_cpsr(vcpu) = spsr; 2516 2515 2517 2516 kvm_arch_vcpu_load(vcpu, smp_processor_id()); 2517 + vcpu_clear_flag(vcpu, IN_NESTED_ERET); 2518 2518 preempt_enable(); 2519 2519 2520 - kvm_pmu_nested_transition(vcpu); 2520 + if (kvm_vcpu_has_pmu(vcpu)) 2521 + kvm_pmu_nested_transition(vcpu); 2521 2522 } 2522 2523 2523 2524 static void kvm_inject_el2_exception(struct kvm_vcpu *vcpu, u64 esr_el2, ··· 2602 2599 kvm_arch_vcpu_load(vcpu, smp_processor_id()); 2603 2600 preempt_enable(); 2604 2601 2605 - kvm_pmu_nested_transition(vcpu); 2602 + if (kvm_vcpu_has_pmu(vcpu)) 2603 + kvm_pmu_nested_transition(vcpu); 2606 2604 2607 2605 return 1; 2608 2606 }
+5 -1
arch/arm64/kvm/handle_exit.c
··· 129 129 static int kvm_handle_wfx(struct kvm_vcpu *vcpu) 130 130 { 131 131 u64 esr = kvm_vcpu_get_esr(vcpu); 132 + bool is_wfe = !!(esr & ESR_ELx_WFx_ISS_WFE); 132 133 133 - if (esr & ESR_ELx_WFx_ISS_WFE) { 134 + if (guest_hyp_wfx_traps_enabled(vcpu)) 135 + return kvm_inject_nested_sync(vcpu, kvm_vcpu_get_esr(vcpu)); 136 + 137 + if (is_wfe) { 134 138 trace_kvm_wfx_arm64(*vcpu_pc(vcpu), true); 135 139 vcpu->stat.wfe_exit_stat++; 136 140 } else {
+2 -2
arch/arm64/kvm/hyp/include/hyp/switch.h
··· 244 244 * counter, which could make a PMXEVCNTR_EL0 access UNDEF at 245 245 * EL1 instead of being trapped to EL2. 246 246 */ 247 - if (kvm_arm_support_pmu_v3()) { 247 + if (system_supports_pmuv3()) { 248 248 struct kvm_cpu_context *hctxt; 249 249 250 250 write_sysreg(0, pmselr_el0); ··· 281 281 write_sysreg(*host_data_ptr(host_debug_state.mdcr_el2), mdcr_el2); 282 282 283 283 write_sysreg(0, hstr_el2); 284 - if (kvm_arm_support_pmu_v3()) { 284 + if (system_supports_pmuv3()) { 285 285 struct kvm_cpu_context *hctxt; 286 286 287 287 hctxt = host_data_ptr(host_ctxt);
+13 -1
arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
··· 43 43 return &ctxt_sys_reg(ctxt, MDSCR_EL1); 44 44 } 45 45 46 + static inline u64 ctxt_midr_el1(struct kvm_cpu_context *ctxt) 47 + { 48 + struct kvm *kvm = kern_hyp_va(ctxt_to_vcpu(ctxt)->kvm); 49 + 50 + if (!(ctxt_is_guest(ctxt) && 51 + test_bit(KVM_ARCH_FLAG_WRITABLE_IMP_ID_REGS, &kvm->arch.flags))) 52 + return read_cpuid_id(); 53 + 54 + return kvm_read_vm_id_reg(kvm, SYS_MIDR_EL1); 55 + } 56 + 46 57 static inline void __sysreg_save_common_state(struct kvm_cpu_context *ctxt) 47 58 { 48 59 *ctxt_mdscr_el1(ctxt) = read_sysreg(mdscr_el1); ··· 179 168 } 180 169 181 170 static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt, 182 - u64 mpidr) 171 + u64 midr, u64 mpidr) 183 172 { 173 + write_sysreg(midr, vpidr_el2); 184 174 write_sysreg(mpidr, vmpidr_el2); 185 175 186 176 if (has_vhe() ||
+1 -1
arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
··· 56 56 57 57 int hyp_pin_shared_mem(void *from, void *to); 58 58 void hyp_unpin_shared_mem(void *from, void *to); 59 - void reclaim_guest_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc); 59 + void reclaim_pgtable_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc); 60 60 int refill_memcache(struct kvm_hyp_memcache *mc, unsigned long min_pages, 61 61 struct kvm_hyp_memcache *host_mc); 62 62
-6
arch/arm64/kvm/hyp/include/nvhe/pkvm.h
··· 43 43 struct hyp_pool pool; 44 44 hyp_spinlock_t lock; 45 45 46 - /* 47 - * The number of vcpus initialized and ready to run. 48 - * Modifying this is protected by 'vm_table_lock'. 49 - */ 50 - unsigned int nr_vcpus; 51 - 52 46 /* Array of the hyp vCPU structures for this VM. */ 53 47 struct pkvm_hyp_vcpu *vcpus[]; 54 48 };
+1 -1
arch/arm64/kvm/hyp/nvhe/mem_protect.c
··· 266 266 return 0; 267 267 } 268 268 269 - void reclaim_guest_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc) 269 + void reclaim_pgtable_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc) 270 270 { 271 271 struct hyp_page *page; 272 272 void *addr;
+51 -28
arch/arm64/kvm/hyp/nvhe/pkvm.c
··· 46 46 vcpu->arch.hcr_el2 |= HCR_FWB; 47 47 48 48 if (cpus_have_final_cap(ARM64_HAS_EVT) && 49 - !cpus_have_final_cap(ARM64_MISMATCHED_CACHE_TYPE)) 49 + !cpus_have_final_cap(ARM64_MISMATCHED_CACHE_TYPE) && 50 + kvm_read_vm_id_reg(vcpu->kvm, SYS_CTR_EL0) == read_cpuid(CTR_EL0)) 50 51 vcpu->arch.hcr_el2 |= HCR_TID4; 51 52 else 52 53 vcpu->arch.hcr_el2 |= HCR_TID2; ··· 167 166 168 167 pkvm_vcpu_reset_hcr(vcpu); 169 168 170 - if ((!pkvm_hyp_vcpu_is_protected(hyp_vcpu))) 169 + if ((!pkvm_hyp_vcpu_is_protected(hyp_vcpu))) { 170 + struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu; 171 + 172 + /* Trust the host for non-protected vcpu features. */ 173 + vcpu->arch.hcrx_el2 = host_vcpu->arch.hcrx_el2; 171 174 return 0; 175 + } 172 176 173 177 ret = pkvm_check_pvm_cpu_features(vcpu); 174 178 if (ret) ··· 181 175 182 176 pvm_init_traps_hcr(vcpu); 183 177 pvm_init_traps_mdcr(vcpu); 178 + vcpu_set_hcrx(vcpu); 184 179 185 180 return 0; 186 181 } ··· 246 239 247 240 hyp_spin_lock(&vm_table_lock); 248 241 hyp_vm = get_vm_by_handle(handle); 249 - if (!hyp_vm || hyp_vm->nr_vcpus <= vcpu_idx) 242 + if (!hyp_vm || hyp_vm->kvm.created_vcpus <= vcpu_idx) 250 243 goto unlock; 251 244 252 245 hyp_vcpu = hyp_vm->vcpus[vcpu_idx]; 246 + if (!hyp_vcpu) 247 + goto unlock; 253 248 254 249 /* Ensure vcpu isn't loaded on more than one cpu simultaneously. */ 255 250 if (unlikely(hyp_vcpu->loaded_hyp_vcpu)) { ··· 324 315 unsigned long host_arch_flags = READ_ONCE(host_kvm->arch.flags); 325 316 DECLARE_BITMAP(allowed_features, KVM_VCPU_MAX_FEATURES); 326 317 318 + /* CTR_EL0 is always under host control, even for protected VMs. */ 319 + hyp_vm->kvm.arch.ctr_el0 = host_kvm->arch.ctr_el0; 320 + 327 321 if (test_bit(KVM_ARCH_FLAG_MTE_ENABLED, &host_kvm->arch.flags)) 328 322 set_bit(KVM_ARCH_FLAG_MTE_ENABLED, &kvm->arch.flags); 329 323 ··· 337 325 bitmap_copy(kvm->arch.vcpu_features, 338 326 host_kvm->arch.vcpu_features, 339 327 KVM_VCPU_MAX_FEATURES); 328 + 329 + if (test_bit(KVM_ARCH_FLAG_WRITABLE_IMP_ID_REGS, &host_arch_flags)) 330 + hyp_vm->kvm.arch.midr_el1 = host_kvm->arch.midr_el1; 331 + 340 332 return; 341 333 } 342 334 ··· 377 361 { 378 362 int i; 379 363 380 - for (i = 0; i < nr_vcpus; i++) 381 - unpin_host_vcpu(hyp_vcpus[i]->host_vcpu); 364 + for (i = 0; i < nr_vcpus; i++) { 365 + struct pkvm_hyp_vcpu *hyp_vcpu = hyp_vcpus[i]; 366 + 367 + if (!hyp_vcpu) 368 + continue; 369 + 370 + unpin_host_vcpu(hyp_vcpu->host_vcpu); 371 + } 382 372 } 383 373 384 374 static void init_pkvm_hyp_vm(struct kvm *host_kvm, struct pkvm_hyp_vm *hyp_vm, ··· 408 386 409 387 static int init_pkvm_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu, 410 388 struct pkvm_hyp_vm *hyp_vm, 411 - struct kvm_vcpu *host_vcpu, 412 - unsigned int vcpu_idx) 389 + struct kvm_vcpu *host_vcpu) 413 390 { 414 391 int ret = 0; 415 392 416 393 if (hyp_pin_shared_mem(host_vcpu, host_vcpu + 1)) 417 394 return -EBUSY; 418 395 419 - if (host_vcpu->vcpu_idx != vcpu_idx) { 420 - ret = -EINVAL; 421 - goto done; 422 - } 423 - 424 396 hyp_vcpu->host_vcpu = host_vcpu; 425 397 426 398 hyp_vcpu->vcpu.kvm = &hyp_vm->kvm; 427 399 hyp_vcpu->vcpu.vcpu_id = READ_ONCE(host_vcpu->vcpu_id); 428 - hyp_vcpu->vcpu.vcpu_idx = vcpu_idx; 400 + hyp_vcpu->vcpu.vcpu_idx = READ_ONCE(host_vcpu->vcpu_idx); 429 401 430 402 hyp_vcpu->vcpu.arch.hw_mmu = &hyp_vm->kvm.arch.mmu; 431 403 hyp_vcpu->vcpu.arch.cflags = READ_ONCE(host_vcpu->arch.cflags); ··· 657 641 goto unlock; 658 642 } 659 643 660 - idx = hyp_vm->nr_vcpus; 644 + ret = init_pkvm_hyp_vcpu(hyp_vcpu, hyp_vm, host_vcpu); 645 + if (ret) 646 + goto 
unlock; 647 + 648 + idx = hyp_vcpu->vcpu.vcpu_idx; 661 649 if (idx >= hyp_vm->kvm.created_vcpus) { 662 650 ret = -EINVAL; 663 651 goto unlock; 664 652 } 665 653 666 - ret = init_pkvm_hyp_vcpu(hyp_vcpu, hyp_vm, host_vcpu, idx); 667 - if (ret) 654 + if (hyp_vm->vcpus[idx]) { 655 + ret = -EINVAL; 668 656 goto unlock; 657 + } 669 658 670 659 hyp_vm->vcpus[idx] = hyp_vcpu; 671 - hyp_vm->nr_vcpus++; 672 660 unlock: 673 661 hyp_spin_unlock(&vm_table_lock); 674 662 675 - if (ret) { 663 + if (ret) 676 664 unmap_donated_memory(hyp_vcpu, sizeof(*hyp_vcpu)); 677 - return ret; 678 - } 679 - 680 - return 0; 665 + return ret; 681 666 } 682 667 683 668 static void ··· 695 678 696 679 int __pkvm_teardown_vm(pkvm_handle_t handle) 697 680 { 698 - struct kvm_hyp_memcache *mc; 681 + struct kvm_hyp_memcache *mc, *stage2_mc; 699 682 struct pkvm_hyp_vm *hyp_vm; 700 683 struct kvm *host_kvm; 701 684 unsigned int idx; ··· 723 706 724 707 /* Reclaim guest pages (including page-table pages) */ 725 708 mc = &host_kvm->arch.pkvm.teardown_mc; 726 - reclaim_guest_pages(hyp_vm, mc); 727 - unpin_host_vcpus(hyp_vm->vcpus, hyp_vm->nr_vcpus); 709 + stage2_mc = &host_kvm->arch.pkvm.stage2_teardown_mc; 710 + reclaim_pgtable_pages(hyp_vm, stage2_mc); 711 + unpin_host_vcpus(hyp_vm->vcpus, hyp_vm->kvm.created_vcpus); 728 712 729 713 /* Push the metadata pages to the teardown memcache */ 730 - for (idx = 0; idx < hyp_vm->nr_vcpus; ++idx) { 714 + for (idx = 0; idx < hyp_vm->kvm.created_vcpus; ++idx) { 731 715 struct pkvm_hyp_vcpu *hyp_vcpu = hyp_vm->vcpus[idx]; 732 - struct kvm_hyp_memcache *vcpu_mc = &hyp_vcpu->vcpu.arch.pkvm_memcache; 716 + struct kvm_hyp_memcache *vcpu_mc; 717 + 718 + if (!hyp_vcpu) 719 + continue; 720 + 721 + vcpu_mc = &hyp_vcpu->vcpu.arch.pkvm_memcache; 733 722 734 723 while (vcpu_mc->nr_pages) { 735 724 void *addr = pop_hyp_memcache(vcpu_mc, hyp_phys_to_virt); 736 725 737 - push_hyp_memcache(mc, addr, hyp_virt_to_phys); 726 + push_hyp_memcache(stage2_mc, addr, hyp_virt_to_phys); 738 727 unmap_donated_memory_noclear(addr, PAGE_SIZE); 739 728 } 740 729
+3 -1
arch/arm64/kvm/hyp/nvhe/sysreg-sr.c
··· 28 28 29 29 void __sysreg_restore_state_nvhe(struct kvm_cpu_context *ctxt) 30 30 { 31 - __sysreg_restore_el1_state(ctxt, ctxt_sys_reg(ctxt, MPIDR_EL1)); 31 + u64 midr = ctxt_midr_el1(ctxt); 32 + 33 + __sysreg_restore_el1_state(ctxt, midr, ctxt_sys_reg(ctxt, MPIDR_EL1)); 32 34 __sysreg_restore_common_state(ctxt); 33 35 __sysreg_restore_user_state(ctxt); 34 36 __sysreg_restore_el2_return_state(ctxt);
+8 -8
arch/arm64/kvm/hyp/vgic-v3-sr.c
··· 18 18 #define vtr_to_nr_pre_bits(v) ((((u32)(v) >> 26) & 7) + 1) 19 19 #define vtr_to_nr_apr_regs(v) (1 << (vtr_to_nr_pre_bits(v) - 5)) 20 20 21 - static u64 __gic_v3_get_lr(unsigned int lr) 21 + u64 __gic_v3_get_lr(unsigned int lr) 22 22 { 23 23 switch (lr & 0xf) { 24 24 case 0: ··· 218 218 219 219 elrsr = read_gicreg(ICH_ELRSR_EL2); 220 220 221 - write_gicreg(cpu_if->vgic_hcr & ~ICH_HCR_EN, ICH_HCR_EL2); 221 + write_gicreg(cpu_if->vgic_hcr & ~ICH_HCR_EL2_En, ICH_HCR_EL2); 222 222 223 223 for (i = 0; i < used_lrs; i++) { 224 224 if (elrsr & (1 << i)) ··· 274 274 * system registers to trap to EL1 (duh), force ICC_SRE_EL1.SRE to 1 275 275 * so that the trap bits can take effect. Yes, we *loves* the GIC. 276 276 */ 277 - if (!(cpu_if->vgic_hcr & ICH_HCR_EN)) { 277 + if (!(cpu_if->vgic_hcr & ICH_HCR_EL2_En)) { 278 278 write_gicreg(ICC_SRE_EL1_SRE, ICC_SRE_EL1); 279 279 isb(); 280 280 } else if (!cpu_if->vgic_sre) { ··· 752 752 u32 hcr; 753 753 754 754 hcr = read_gicreg(ICH_HCR_EL2); 755 - hcr += 1 << ICH_HCR_EOIcount_SHIFT; 755 + hcr += 1 << ICH_HCR_EL2_EOIcount_SHIFT; 756 756 write_gicreg(hcr, ICH_HCR_EL2); 757 757 } 758 758 ··· 1069 1069 case SYS_ICC_EOIR0_EL1: 1070 1070 case SYS_ICC_HPPIR0_EL1: 1071 1071 case SYS_ICC_IAR0_EL1: 1072 - return ich_hcr & ICH_HCR_TALL0; 1072 + return ich_hcr & ICH_HCR_EL2_TALL0; 1073 1073 1074 1074 case SYS_ICC_IGRPEN1_EL1: 1075 1075 if (is_read && ··· 1090 1090 case SYS_ICC_EOIR1_EL1: 1091 1091 case SYS_ICC_HPPIR1_EL1: 1092 1092 case SYS_ICC_IAR1_EL1: 1093 - return ich_hcr & ICH_HCR_TALL1; 1093 + return ich_hcr & ICH_HCR_EL2_TALL1; 1094 1094 1095 1095 case SYS_ICC_DIR_EL1: 1096 - if (ich_hcr & ICH_HCR_TDIR) 1096 + if (ich_hcr & ICH_HCR_EL2_TDIR) 1097 1097 return true; 1098 1098 1099 1099 fallthrough; ··· 1101 1101 case SYS_ICC_RPR_EL1: 1102 1102 case SYS_ICC_CTLR_EL1: 1103 1103 case SYS_ICC_PMR_EL1: 1104 - return ich_hcr & ICH_HCR_TC; 1104 + return ich_hcr & ICH_HCR_EL2_TC; 1105 1105 1106 1106 default: 1107 1107 return false;
+22
arch/arm64/kvm/hyp/vhe/switch.c
··· 527 527 return kvm_hyp_handle_sysreg(vcpu, exit_code); 528 528 } 529 529 530 + static bool kvm_hyp_handle_impdef(struct kvm_vcpu *vcpu, u64 *exit_code) 531 + { 532 + u64 iss; 533 + 534 + if (!cpus_have_final_cap(ARM64_WORKAROUND_PMUV3_IMPDEF_TRAPS)) 535 + return false; 536 + 537 + /* 538 + * Compute a synthetic ESR for a sysreg trap. Conveniently, AFSR1_EL2 539 + * is populated with a correct ISS for a sysreg trap. These fruity 540 + * parts are 64bit only, so unconditionally set IL. 541 + */ 542 + iss = ESR_ELx_ISS(read_sysreg_s(SYS_AFSR1_EL2)); 543 + vcpu->arch.fault.esr_el2 = FIELD_PREP(ESR_ELx_EC_MASK, ESR_ELx_EC_SYS64) | 544 + FIELD_PREP(ESR_ELx_ISS_MASK, iss) | 545 + ESR_ELx_IL; 546 + return false; 547 + } 548 + 530 549 static const exit_handler_fn hyp_exit_handlers[] = { 531 550 [0 ... ESR_ELx_EC_MAX] = NULL, 532 551 [ESR_ELx_EC_CP15_32] = kvm_hyp_handle_cp15_32, ··· 557 538 [ESR_ELx_EC_WATCHPT_LOW] = kvm_hyp_handle_watchpt_low, 558 539 [ESR_ELx_EC_ERET] = kvm_hyp_handle_eret, 559 540 [ESR_ELx_EC_MOPS] = kvm_hyp_handle_mops, 541 + 542 + /* Apple shenanigans */ 543 + [0x3F] = kvm_hyp_handle_impdef, 560 544 }; 561 545 562 546 static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
+10 -18
arch/arm64/kvm/hyp/vhe/sysreg-sr.c
··· 87 87 write_sysreg(__vcpu_sys_reg(vcpu, PAR_EL1), par_el1); 88 88 write_sysreg(__vcpu_sys_reg(vcpu, TPIDR_EL1), tpidr_el1); 89 89 90 - write_sysreg(__vcpu_sys_reg(vcpu, MPIDR_EL1), vmpidr_el2); 91 - write_sysreg_el1(__vcpu_sys_reg(vcpu, MAIR_EL2), SYS_MAIR); 92 - write_sysreg_el1(__vcpu_sys_reg(vcpu, VBAR_EL2), SYS_VBAR); 93 - write_sysreg_el1(__vcpu_sys_reg(vcpu, CONTEXTIDR_EL2), SYS_CONTEXTIDR); 94 - write_sysreg_el1(__vcpu_sys_reg(vcpu, AMAIR_EL2), SYS_AMAIR); 90 + write_sysreg(ctxt_midr_el1(&vcpu->arch.ctxt), vpidr_el2); 91 + write_sysreg(__vcpu_sys_reg(vcpu, MPIDR_EL1), vmpidr_el2); 92 + write_sysreg_el1(__vcpu_sys_reg(vcpu, MAIR_EL2), SYS_MAIR); 93 + write_sysreg_el1(__vcpu_sys_reg(vcpu, VBAR_EL2), SYS_VBAR); 94 + write_sysreg_el1(__vcpu_sys_reg(vcpu, CONTEXTIDR_EL2), SYS_CONTEXTIDR); 95 + write_sysreg_el1(__vcpu_sys_reg(vcpu, AMAIR_EL2), SYS_AMAIR); 95 96 96 97 if (vcpu_el2_e2h_is_set(vcpu)) { 97 98 /* ··· 192 191 { 193 192 struct kvm_cpu_context *guest_ctxt = &vcpu->arch.ctxt; 194 193 struct kvm_cpu_context *host_ctxt; 195 - u64 mpidr; 194 + u64 midr, mpidr; 196 195 197 196 host_ctxt = host_data_ptr(host_ctxt); 198 197 __sysreg_save_user_state(host_ctxt); ··· 222 221 } else { 223 222 if (vcpu_has_nv(vcpu)) { 224 223 /* 225 - * Use the guest hypervisor's VPIDR_EL2 when in a 226 - * nested state. The hardware value of MIDR_EL1 gets 227 - * restored on put. 228 - */ 229 - write_sysreg(ctxt_sys_reg(guest_ctxt, VPIDR_EL2), vpidr_el2); 230 - 231 - /* 232 224 * As we're restoring a nested guest, set the value 233 225 * provided by the guest hypervisor. 234 226 */ 227 + midr = ctxt_sys_reg(guest_ctxt, VPIDR_EL2); 235 228 mpidr = ctxt_sys_reg(guest_ctxt, VMPIDR_EL2); 236 229 } else { 230 + midr = ctxt_midr_el1(guest_ctxt); 237 231 mpidr = ctxt_sys_reg(guest_ctxt, MPIDR_EL1); 238 232 } 239 233 240 - __sysreg_restore_el1_state(guest_ctxt, mpidr); 234 + __sysreg_restore_el1_state(guest_ctxt, midr, mpidr); 241 235 } 242 236 243 237 vcpu_set_flag(vcpu, SYSREGS_ON_CPU); ··· 266 270 267 271 /* Restore host user state */ 268 272 __sysreg_restore_user_state(host_ctxt); 269 - 270 - /* If leaving a nesting guest, restore MIDR_EL1 default view */ 271 - if (vcpu_has_nv(vcpu)) 272 - write_sysreg(read_cpuid_id(), vpidr_el2); 273 273 274 274 vcpu_clear_flag(vcpu, SYSREGS_ON_CPU); 275 275 }
+13
arch/arm64/kvm/hypercalls.c
··· 15 15 GENMASK(KVM_REG_ARM_STD_HYP_BMAP_BIT_COUNT - 1, 0) 16 16 #define KVM_ARM_SMCCC_VENDOR_HYP_FEATURES \ 17 17 GENMASK(KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_COUNT - 1, 0) 18 + #define KVM_ARM_SMCCC_VENDOR_HYP_FEATURES_2 \ 19 + GENMASK(KVM_REG_ARM_VENDOR_HYP_BMAP_2_BIT_COUNT - 1, 0) 18 20 19 21 static void kvm_ptp_get_time(struct kvm_vcpu *vcpu, u64 *val) 20 22 { ··· 362 360 break; 363 361 case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID: 364 362 val[0] = smccc_feat->vendor_hyp_bmap; 363 + /* Function numbers 2-63 are reserved for pKVM for now */ 364 + val[2] = smccc_feat->vendor_hyp_bmap_2; 365 365 break; 366 366 case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID: 367 367 kvm_ptp_get_time(vcpu, val); ··· 391 387 KVM_REG_ARM_STD_BMAP, 392 388 KVM_REG_ARM_STD_HYP_BMAP, 393 389 KVM_REG_ARM_VENDOR_HYP_BMAP, 390 + KVM_REG_ARM_VENDOR_HYP_BMAP_2, 394 391 }; 395 392 396 393 void kvm_arm_init_hypercalls(struct kvm *kvm) ··· 502 497 case KVM_REG_ARM_VENDOR_HYP_BMAP: 503 498 val = READ_ONCE(smccc_feat->vendor_hyp_bmap); 504 499 break; 500 + case KVM_REG_ARM_VENDOR_HYP_BMAP_2: 501 + val = READ_ONCE(smccc_feat->vendor_hyp_bmap_2); 502 + break; 505 503 default: 506 504 return -ENOENT; 507 505 } ··· 534 526 case KVM_REG_ARM_VENDOR_HYP_BMAP: 535 527 fw_reg_bmap = &smccc_feat->vendor_hyp_bmap; 536 528 fw_reg_features = KVM_ARM_SMCCC_VENDOR_HYP_FEATURES; 529 + break; 530 + case KVM_REG_ARM_VENDOR_HYP_BMAP_2: 531 + fw_reg_bmap = &smccc_feat->vendor_hyp_bmap_2; 532 + fw_reg_features = KVM_ARM_SMCCC_VENDOR_HYP_FEATURES_2; 537 533 break; 538 534 default: 539 535 return -ENOENT; ··· 645 633 case KVM_REG_ARM_STD_BMAP: 646 634 case KVM_REG_ARM_STD_HYP_BMAP: 647 635 case KVM_REG_ARM_VENDOR_HYP_BMAP: 636 + case KVM_REG_ARM_VENDOR_HYP_BMAP_2: 648 637 return kvm_arm_set_fw_reg_bmap(vcpu, reg->id, val); 649 638 default: 650 639 return -ENOENT;
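A guest-side sketch of consuming the two vendor-hypervisor bitmaps once the FEATURES hypercall has returned them; the register assignment (first bitmap in the first result register, second bitmap in the third, matching val[0]/val[2] above) and the assumption that the second bitmap covers function numbers 64-127 are stated here for illustration only.

#include <stdbool.h>
#include <stdint.h>

static bool vendor_hyp_func_supported(uint64_t bmap, uint64_t bmap_2,
				       unsigned int func)
{
	if (func >= 128)
		return false;

	if (func < 64)
		return bmap & (1ULL << func);

	return bmap_2 & (1ULL << (func - 64));
}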
+17 -5
arch/arm64/kvm/mmu.c
··· 1086 1086 } 1087 1087 } 1088 1088 1089 - static void hyp_mc_free_fn(void *addr, void *unused) 1089 + static void hyp_mc_free_fn(void *addr, void *mc) 1090 1090 { 1091 + struct kvm_hyp_memcache *memcache = mc; 1092 + 1093 + if (memcache->flags & HYP_MEMCACHE_ACCOUNT_STAGE2) 1094 + kvm_account_pgtable_pages(addr, -1); 1095 + 1091 1096 free_page((unsigned long)addr); 1092 1097 } 1093 1098 1094 - static void *hyp_mc_alloc_fn(void *unused) 1099 + static void *hyp_mc_alloc_fn(void *mc) 1095 1100 { 1096 - return (void *)__get_free_page(GFP_KERNEL_ACCOUNT); 1101 + struct kvm_hyp_memcache *memcache = mc; 1102 + void *addr; 1103 + 1104 + addr = (void *)__get_free_page(GFP_KERNEL_ACCOUNT); 1105 + if (addr && memcache->flags & HYP_MEMCACHE_ACCOUNT_STAGE2) 1106 + kvm_account_pgtable_pages(addr, 1); 1107 + 1108 + return addr; 1097 1109 } 1098 1110 1099 1111 void free_hyp_memcache(struct kvm_hyp_memcache *mc) ··· 1114 1102 return; 1115 1103 1116 1104 kfree(mc->mapping); 1117 - __free_hyp_memcache(mc, hyp_mc_free_fn, kvm_host_va, NULL); 1105 + __free_hyp_memcache(mc, hyp_mc_free_fn, kvm_host_va, mc); 1118 1106 } 1119 1107 1120 1108 int topup_hyp_memcache(struct kvm_hyp_memcache *mc, unsigned long min_pages) ··· 1129 1117 } 1130 1118 1131 1119 return __topup_hyp_memcache(mc, min_pages, hyp_mc_alloc_fn, 1132 - kvm_host_pa, NULL); 1120 + kvm_host_pa, mc); 1133 1121 } 1134 1122 1135 1123 /**
+159 -127
arch/arm64/kvm/nested.c
··· 16 16 17 17 #include "sys_regs.h" 18 18 19 - /* Protection against the sysreg repainting madness... */ 20 - #define NV_FTR(r, f) ID_AA64##r##_EL1_##f 21 - 22 19 /* 23 20 * Ratio of live shadow S2 MMU per vcpu. This is a trade-off between 24 21 * memory usage and potential number of different sets of S2 PTs in ··· 50 53 struct kvm *kvm = vcpu->kvm; 51 54 struct kvm_s2_mmu *tmp; 52 55 int num_mmus, ret = 0; 56 + 57 + if (test_bit(KVM_ARM_VCPU_HAS_EL2_E2H0, kvm->arch.vcpu_features) && 58 + !cpus_have_final_cap(ARM64_HAS_HCR_NV1)) 59 + return -EINVAL; 53 60 54 61 /* 55 62 * Let's treat memory allocation failures as benign: If we fail to ··· 808 807 * This list should get updated as new features get added to the NV 809 808 * support, and new extension to the architecture. 810 809 */ 811 - static void limit_nv_id_regs(struct kvm *kvm) 810 + u64 limit_nv_id_reg(struct kvm *kvm, u32 reg, u64 val) 812 811 { 813 - u64 val, tmp; 812 + switch (reg) { 813 + case SYS_ID_AA64ISAR0_EL1: 814 + /* Support everything but TME */ 815 + val &= ~ID_AA64ISAR0_EL1_TME; 816 + break; 814 817 815 - /* Support everything but TME */ 816 - val = kvm_read_vm_id_reg(kvm, SYS_ID_AA64ISAR0_EL1); 817 - val &= ~NV_FTR(ISAR0, TME); 818 - kvm_set_vm_id_reg(kvm, SYS_ID_AA64ISAR0_EL1, val); 818 + case SYS_ID_AA64ISAR1_EL1: 819 + /* Support everything but LS64 and Spec Invalidation */ 820 + val &= ~(ID_AA64ISAR1_EL1_LS64 | 821 + ID_AA64ISAR1_EL1_SPECRES); 822 + break; 819 823 820 - /* Support everything but Spec Invalidation and LS64 */ 821 - val = kvm_read_vm_id_reg(kvm, SYS_ID_AA64ISAR1_EL1); 822 - val &= ~(NV_FTR(ISAR1, LS64) | 823 - NV_FTR(ISAR1, SPECRES)); 824 - kvm_set_vm_id_reg(kvm, SYS_ID_AA64ISAR1_EL1, val); 824 + case SYS_ID_AA64PFR0_EL1: 825 + /* No RME, AMU, MPAM, S-EL2, or RAS */ 826 + val &= ~(ID_AA64PFR0_EL1_RME | 827 + ID_AA64PFR0_EL1_AMU | 828 + ID_AA64PFR0_EL1_MPAM | 829 + ID_AA64PFR0_EL1_SEL2 | 830 + ID_AA64PFR0_EL1_RAS | 831 + ID_AA64PFR0_EL1_EL3 | 832 + ID_AA64PFR0_EL1_EL2 | 833 + ID_AA64PFR0_EL1_EL1 | 834 + ID_AA64PFR0_EL1_EL0); 835 + /* 64bit only at any EL */ 836 + val |= SYS_FIELD_PREP_ENUM(ID_AA64PFR0_EL1, EL0, IMP); 837 + val |= SYS_FIELD_PREP_ENUM(ID_AA64PFR0_EL1, EL1, IMP); 838 + val |= SYS_FIELD_PREP_ENUM(ID_AA64PFR0_EL1, EL2, IMP); 839 + val |= SYS_FIELD_PREP_ENUM(ID_AA64PFR0_EL1, EL3, IMP); 840 + break; 825 841 826 - /* No AMU, MPAM, S-EL2, or RAS */ 827 - val = kvm_read_vm_id_reg(kvm, SYS_ID_AA64PFR0_EL1); 828 - val &= ~(GENMASK_ULL(55, 52) | 829 - NV_FTR(PFR0, AMU) | 830 - NV_FTR(PFR0, MPAM) | 831 - NV_FTR(PFR0, SEL2) | 832 - NV_FTR(PFR0, RAS) | 833 - NV_FTR(PFR0, EL3) | 834 - NV_FTR(PFR0, EL2) | 835 - NV_FTR(PFR0, EL1) | 836 - NV_FTR(PFR0, EL0)); 837 - /* 64bit only at any EL */ 838 - val |= FIELD_PREP(NV_FTR(PFR0, EL0), 0b0001); 839 - val |= FIELD_PREP(NV_FTR(PFR0, EL1), 0b0001); 840 - val |= FIELD_PREP(NV_FTR(PFR0, EL2), 0b0001); 841 - val |= FIELD_PREP(NV_FTR(PFR0, EL3), 0b0001); 842 - kvm_set_vm_id_reg(kvm, SYS_ID_AA64PFR0_EL1, val); 842 + case SYS_ID_AA64PFR1_EL1: 843 + /* Only support BTI, SSBS, CSV2_frac */ 844 + val &= (ID_AA64PFR1_EL1_BT | 845 + ID_AA64PFR1_EL1_SSBS | 846 + ID_AA64PFR1_EL1_CSV2_frac); 847 + break; 843 848 844 - /* Only support BTI, SSBS, CSV2_frac */ 845 - val = kvm_read_vm_id_reg(kvm, SYS_ID_AA64PFR1_EL1); 846 - val &= (NV_FTR(PFR1, BT) | 847 - NV_FTR(PFR1, SSBS) | 848 - NV_FTR(PFR1, CSV2_frac)); 849 - kvm_set_vm_id_reg(kvm, SYS_ID_AA64PFR1_EL1, val); 849 + case SYS_ID_AA64MMFR0_EL1: 850 + /* Hide ExS, Secure Memory */ 851 + val &= ~(ID_AA64MMFR0_EL1_EXS | 852 + 
ID_AA64MMFR0_EL1_TGRAN4_2 | 853 + ID_AA64MMFR0_EL1_TGRAN16_2 | 854 + ID_AA64MMFR0_EL1_TGRAN64_2 | 855 + ID_AA64MMFR0_EL1_SNSMEM); 850 856 851 - /* Hide ECV, ExS, Secure Memory */ 852 - val = kvm_read_vm_id_reg(kvm, SYS_ID_AA64MMFR0_EL1); 853 - val &= ~(NV_FTR(MMFR0, ECV) | 854 - NV_FTR(MMFR0, EXS) | 855 - NV_FTR(MMFR0, TGRAN4_2) | 856 - NV_FTR(MMFR0, TGRAN16_2) | 857 - NV_FTR(MMFR0, TGRAN64_2) | 858 - NV_FTR(MMFR0, SNSMEM)); 857 + /* Hide CNTPOFF if present */ 858 + val = ID_REG_LIMIT_FIELD_ENUM(val, ID_AA64MMFR0_EL1, ECV, IMP); 859 859 860 - /* Disallow unsupported S2 page sizes */ 861 - switch (PAGE_SIZE) { 862 - case SZ_64K: 863 - val |= FIELD_PREP(NV_FTR(MMFR0, TGRAN16_2), 0b0001); 864 - fallthrough; 865 - case SZ_16K: 866 - val |= FIELD_PREP(NV_FTR(MMFR0, TGRAN4_2), 0b0001); 867 - fallthrough; 868 - case SZ_4K: 869 - /* Support everything */ 860 + /* Disallow unsupported S2 page sizes */ 861 + switch (PAGE_SIZE) { 862 + case SZ_64K: 863 + val |= SYS_FIELD_PREP_ENUM(ID_AA64MMFR0_EL1, TGRAN16_2, NI); 864 + fallthrough; 865 + case SZ_16K: 866 + val |= SYS_FIELD_PREP_ENUM(ID_AA64MMFR0_EL1, TGRAN4_2, NI); 867 + fallthrough; 868 + case SZ_4K: 869 + /* Support everything */ 870 + break; 871 + } 872 + 873 + /* 874 + * Since we can't support a guest S2 page size smaller 875 + * than the host's own page size (due to KVM only 876 + * populating its own S2 using the kernel's page 877 + * size), advertise the limitation using FEAT_GTG. 878 + */ 879 + switch (PAGE_SIZE) { 880 + case SZ_4K: 881 + val |= SYS_FIELD_PREP_ENUM(ID_AA64MMFR0_EL1, TGRAN4_2, IMP); 882 + fallthrough; 883 + case SZ_16K: 884 + val |= SYS_FIELD_PREP_ENUM(ID_AA64MMFR0_EL1, TGRAN16_2, IMP); 885 + fallthrough; 886 + case SZ_64K: 887 + val |= SYS_FIELD_PREP_ENUM(ID_AA64MMFR0_EL1, TGRAN64_2, IMP); 888 + break; 889 + } 890 + 891 + /* Cap PARange to 48bits */ 892 + val = ID_REG_LIMIT_FIELD_ENUM(val, ID_AA64MMFR0_EL1, PARANGE, 48); 893 + break; 894 + 895 + case SYS_ID_AA64MMFR1_EL1: 896 + val &= (ID_AA64MMFR1_EL1_HCX | 897 + ID_AA64MMFR1_EL1_PAN | 898 + ID_AA64MMFR1_EL1_LO | 899 + ID_AA64MMFR1_EL1_HPDS | 900 + ID_AA64MMFR1_EL1_VH | 901 + ID_AA64MMFR1_EL1_VMIDBits); 902 + /* FEAT_E2H0 implies no VHE */ 903 + if (test_bit(KVM_ARM_VCPU_HAS_EL2_E2H0, kvm->arch.vcpu_features)) 904 + val &= ~ID_AA64MMFR1_EL1_VH; 905 + break; 906 + 907 + case SYS_ID_AA64MMFR2_EL1: 908 + val &= ~(ID_AA64MMFR2_EL1_BBM | 909 + ID_AA64MMFR2_EL1_TTL | 910 + GENMASK_ULL(47, 44) | 911 + ID_AA64MMFR2_EL1_ST | 912 + ID_AA64MMFR2_EL1_CCIDX | 913 + ID_AA64MMFR2_EL1_VARange); 914 + 915 + /* Force TTL support */ 916 + val |= SYS_FIELD_PREP_ENUM(ID_AA64MMFR2_EL1, TTL, IMP); 917 + break; 918 + 919 + case SYS_ID_AA64MMFR4_EL1: 920 + /* 921 + * You get EITHER 922 + * 923 + * - FEAT_VHE without FEAT_E2H0 924 + * - FEAT_NV limited to FEAT_NV2 925 + * - HCR_EL2.NV1 being RES0 926 + * 927 + * OR 928 + * 929 + * - FEAT_E2H0 without FEAT_VHE nor FEAT_NV 930 + * 931 + * Life is too short for anything else. 
932 + */ 933 + if (test_bit(KVM_ARM_VCPU_HAS_EL2_E2H0, kvm->arch.vcpu_features)) { 934 + val = 0; 935 + } else { 936 + val = SYS_FIELD_PREP_ENUM(ID_AA64MMFR4_EL1, NV_frac, NV2_ONLY); 937 + val |= SYS_FIELD_PREP_ENUM(ID_AA64MMFR4_EL1, E2H0, NI_NV1); 938 + } 939 + break; 940 + 941 + case SYS_ID_AA64DFR0_EL1: 942 + /* Only limited support for PMU, Debug, BPs, WPs, and HPMN0 */ 943 + val &= (ID_AA64DFR0_EL1_PMUVer | 944 + ID_AA64DFR0_EL1_WRPs | 945 + ID_AA64DFR0_EL1_BRPs | 946 + ID_AA64DFR0_EL1_DebugVer| 947 + ID_AA64DFR0_EL1_HPMN0); 948 + 949 + /* Cap Debug to ARMv8.1 */ 950 + val = ID_REG_LIMIT_FIELD_ENUM(val, ID_AA64DFR0_EL1, DebugVer, VHE); 870 951 break; 871 952 } 872 - /* 873 - * Since we can't support a guest S2 page size smaller than 874 - * the host's own page size (due to KVM only populating its 875 - * own S2 using the kernel's page size), advertise the 876 - * limitation using FEAT_GTG. 877 - */ 878 - switch (PAGE_SIZE) { 879 - case SZ_4K: 880 - val |= FIELD_PREP(NV_FTR(MMFR0, TGRAN4_2), 0b0010); 881 - fallthrough; 882 - case SZ_16K: 883 - val |= FIELD_PREP(NV_FTR(MMFR0, TGRAN16_2), 0b0010); 884 - fallthrough; 885 - case SZ_64K: 886 - val |= FIELD_PREP(NV_FTR(MMFR0, TGRAN64_2), 0b0010); 887 - break; 888 - } 889 - /* Cap PARange to 48bits */ 890 - tmp = FIELD_GET(NV_FTR(MMFR0, PARANGE), val); 891 - if (tmp > 0b0101) { 892 - val &= ~NV_FTR(MMFR0, PARANGE); 893 - val |= FIELD_PREP(NV_FTR(MMFR0, PARANGE), 0b0101); 894 - } 895 - kvm_set_vm_id_reg(kvm, SYS_ID_AA64MMFR0_EL1, val); 896 953 897 - val = kvm_read_vm_id_reg(kvm, SYS_ID_AA64MMFR1_EL1); 898 - val &= (NV_FTR(MMFR1, HCX) | 899 - NV_FTR(MMFR1, PAN) | 900 - NV_FTR(MMFR1, LO) | 901 - NV_FTR(MMFR1, HPDS) | 902 - NV_FTR(MMFR1, VH) | 903 - NV_FTR(MMFR1, VMIDBits)); 904 - kvm_set_vm_id_reg(kvm, SYS_ID_AA64MMFR1_EL1, val); 905 - 906 - val = kvm_read_vm_id_reg(kvm, SYS_ID_AA64MMFR2_EL1); 907 - val &= ~(NV_FTR(MMFR2, BBM) | 908 - NV_FTR(MMFR2, TTL) | 909 - GENMASK_ULL(47, 44) | 910 - NV_FTR(MMFR2, ST) | 911 - NV_FTR(MMFR2, CCIDX) | 912 - NV_FTR(MMFR2, VARange)); 913 - 914 - /* Force TTL support */ 915 - val |= FIELD_PREP(NV_FTR(MMFR2, TTL), 0b0001); 916 - kvm_set_vm_id_reg(kvm, SYS_ID_AA64MMFR2_EL1, val); 917 - 918 - val = 0; 919 - if (!cpus_have_final_cap(ARM64_HAS_HCR_NV1)) 920 - val |= FIELD_PREP(NV_FTR(MMFR4, E2H0), 921 - ID_AA64MMFR4_EL1_E2H0_NI_NV1); 922 - kvm_set_vm_id_reg(kvm, SYS_ID_AA64MMFR4_EL1, val); 923 - 924 - /* Only limited support for PMU, Debug, BPs, WPs, and HPMN0 */ 925 - val = kvm_read_vm_id_reg(kvm, SYS_ID_AA64DFR0_EL1); 926 - val &= (NV_FTR(DFR0, PMUVer) | 927 - NV_FTR(DFR0, WRPs) | 928 - NV_FTR(DFR0, BRPs) | 929 - NV_FTR(DFR0, DebugVer) | 930 - NV_FTR(DFR0, HPMN0)); 931 - 932 - /* Cap Debug to ARMv8.1 */ 933 - tmp = FIELD_GET(NV_FTR(DFR0, DebugVer), val); 934 - if (tmp > 0b0111) { 935 - val &= ~NV_FTR(DFR0, DebugVer); 936 - val |= FIELD_PREP(NV_FTR(DFR0, DebugVer), 0b0111); 937 - } 938 - kvm_set_vm_id_reg(kvm, SYS_ID_AA64DFR0_EL1, val); 954 + return val; 939 955 } 940 956 941 957 u64 kvm_vcpu_apply_reg_masks(const struct kvm_vcpu *vcpu, ··· 999 981 if (!kvm->arch.sysreg_masks) 1000 982 return -ENOMEM; 1001 983 1002 - limit_nv_id_regs(kvm); 1003 - 1004 984 /* VTTBR_EL2 */ 1005 985 res0 = res1 = 0; 1006 986 if (!kvm_has_feat_enum(kvm, ID_AA64MMFR1_EL1, VMIDBits, 16)) ··· 1037 1021 res0 |= HCR_FIEN; 1038 1022 if (!kvm_has_feat(kvm, ID_AA64MMFR2_EL1, FWB, IMP)) 1039 1023 res0 |= HCR_FWB; 1040 - if (!kvm_has_feat(kvm, ID_AA64MMFR2_EL1, NV, NV2)) 1041 - res0 |= HCR_NV2; 1042 - if (!kvm_has_feat(kvm, ID_AA64MMFR2_EL1, NV, 
IMP)) 1043 - res0 |= (HCR_AT | HCR_NV1 | HCR_NV); 1024 + /* Implementation choice: NV2 is the only supported config */ 1025 + if (!kvm_has_feat(kvm, ID_AA64MMFR4_EL1, NV_frac, NV2_ONLY)) 1026 + res0 |= (HCR_NV2 | HCR_NV | HCR_AT); 1027 + if (!kvm_has_feat(kvm, ID_AA64MMFR4_EL1, E2H0, NI)) 1028 + res0 |= HCR_NV1; 1044 1029 if (!(kvm_vcpu_has_feature(kvm, KVM_ARM_VCPU_PTRAUTH_ADDRESS) && 1045 1030 kvm_vcpu_has_feature(kvm, KVM_ARM_VCPU_PTRAUTH_GENERIC))) 1046 1031 res0 |= (HCR_API | HCR_APK); ··· 1051 1034 res0 |= (HCR_TEA | HCR_TERR); 1052 1035 if (!kvm_has_feat(kvm, ID_AA64MMFR1_EL1, LO, IMP)) 1053 1036 res0 |= HCR_TLOR; 1037 + if (!kvm_has_feat(kvm, ID_AA64MMFR1_EL1, VH, IMP)) 1038 + res0 |= HCR_E2H; 1054 1039 if (!kvm_has_feat(kvm, ID_AA64MMFR4_EL1, E2H0, IMP)) 1055 1040 res1 |= HCR_E2H; 1056 1041 set_sysreg_masks(kvm, HCR_EL2, res0, res1); ··· 1309 1290 res0 |= GENMASK(11, 8); 1310 1291 set_sysreg_masks(kvm, CNTHCTL_EL2, res0, res1); 1311 1292 1293 + /* ICH_HCR_EL2 */ 1294 + res0 = ICH_HCR_EL2_RES0; 1295 + res1 = ICH_HCR_EL2_RES1; 1296 + if (!(kvm_vgic_global_state.ich_vtr_el2 & ICH_VTR_EL2_TDS)) 1297 + res0 |= ICH_HCR_EL2_TDIR; 1298 + /* No GICv4 is presented to the guest */ 1299 + res0 |= ICH_HCR_EL2_DVIM | ICH_HCR_EL2_vSGIEOICount; 1300 + set_sysreg_masks(kvm, ICH_HCR_EL2, res0, res1); 1301 + 1312 1302 out: 1313 1303 for (enum vcpu_sysreg sr = __SANITISED_REG_START__; sr < NR_SYS_REGS; sr++) 1314 1304 (void)__vcpu_sys_reg(vcpu, sr); ··· 1337 1309 } 1338 1310 write_unlock(&vcpu->kvm->mmu_lock); 1339 1311 } 1312 + 1313 + /* Must be last, as may switch context! */ 1314 + if (kvm_check_request(KVM_REQ_GUEST_HYP_IRQ_PENDING, vcpu)) 1315 + kvm_inject_nested_irq(vcpu); 1340 1316 }
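The per-register res0/res1 pairs computed above are later applied to guest-written values by kvm_vcpu_apply_reg_masks(); a minimal sketch of that application rule (reserved-as-zero bits cleared, reserved-as-one bits forced to 1):

#include <stdint.h>

static inline uint64_t apply_reg_masks(uint64_t val, uint64_t res0, uint64_t res1)
{
	return (val & ~res0) | res1;
}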
+39 -36
arch/arm64/kvm/pkvm.c
··· 111 111 112 112 host_kvm->arch.pkvm.handle = 0; 113 113 free_hyp_memcache(&host_kvm->arch.pkvm.teardown_mc); 114 + free_hyp_memcache(&host_kvm->arch.pkvm.stage2_teardown_mc); 115 + } 116 + 117 + static int __pkvm_create_hyp_vcpu(struct kvm_vcpu *vcpu) 118 + { 119 + size_t hyp_vcpu_sz = PAGE_ALIGN(PKVM_HYP_VCPU_SIZE); 120 + pkvm_handle_t handle = vcpu->kvm->arch.pkvm.handle; 121 + void *hyp_vcpu; 122 + int ret; 123 + 124 + vcpu->arch.pkvm_memcache.flags |= HYP_MEMCACHE_ACCOUNT_STAGE2; 125 + 126 + hyp_vcpu = alloc_pages_exact(hyp_vcpu_sz, GFP_KERNEL_ACCOUNT); 127 + if (!hyp_vcpu) 128 + return -ENOMEM; 129 + 130 + ret = kvm_call_hyp_nvhe(__pkvm_init_vcpu, handle, vcpu, hyp_vcpu); 131 + if (!ret) 132 + vcpu_set_flag(vcpu, VCPU_PKVM_FINALIZED); 133 + else 134 + free_pages_exact(hyp_vcpu, hyp_vcpu_sz); 135 + 136 + return ret; 114 137 } 115 138 116 139 /* ··· 148 125 */ 149 126 static int __pkvm_create_hyp_vm(struct kvm *host_kvm) 150 127 { 151 - size_t pgd_sz, hyp_vm_sz, hyp_vcpu_sz; 152 - struct kvm_vcpu *host_vcpu; 153 - pkvm_handle_t handle; 128 + size_t pgd_sz, hyp_vm_sz; 154 129 void *pgd, *hyp_vm; 155 - unsigned long idx; 156 130 int ret; 157 131 158 132 if (host_kvm->created_vcpus < 1) ··· 181 161 if (ret < 0) 182 162 goto free_vm; 183 163 184 - handle = ret; 185 - 186 - host_kvm->arch.pkvm.handle = handle; 187 - 188 - /* Donate memory for the vcpus at hyp and initialize it. */ 189 - hyp_vcpu_sz = PAGE_ALIGN(PKVM_HYP_VCPU_SIZE); 190 - kvm_for_each_vcpu(idx, host_vcpu, host_kvm) { 191 - void *hyp_vcpu; 192 - 193 - /* Indexing of the vcpus to be sequential starting at 0. */ 194 - if (WARN_ON(host_vcpu->vcpu_idx != idx)) { 195 - ret = -EINVAL; 196 - goto destroy_vm; 197 - } 198 - 199 - hyp_vcpu = alloc_pages_exact(hyp_vcpu_sz, GFP_KERNEL_ACCOUNT); 200 - if (!hyp_vcpu) { 201 - ret = -ENOMEM; 202 - goto destroy_vm; 203 - } 204 - 205 - ret = kvm_call_hyp_nvhe(__pkvm_init_vcpu, handle, host_vcpu, 206 - hyp_vcpu); 207 - if (ret) { 208 - free_pages_exact(hyp_vcpu, hyp_vcpu_sz); 209 - goto destroy_vm; 210 - } 211 - } 164 + host_kvm->arch.pkvm.handle = ret; 165 + host_kvm->arch.pkvm.stage2_teardown_mc.flags |= HYP_MEMCACHE_ACCOUNT_STAGE2; 166 + kvm_account_pgtable_pages(pgd, pgd_sz / PAGE_SIZE); 212 167 213 168 return 0; 214 - 215 - destroy_vm: 216 - __pkvm_destroy_hyp_vm(host_kvm); 217 - return ret; 218 169 free_vm: 219 170 free_pages_exact(hyp_vm, hyp_vm_sz); 220 171 free_pgd: ··· 201 210 if (!host_kvm->arch.pkvm.handle) 202 211 ret = __pkvm_create_hyp_vm(host_kvm); 203 212 mutex_unlock(&host_kvm->arch.config_lock); 213 + 214 + return ret; 215 + } 216 + 217 + int pkvm_create_hyp_vcpu(struct kvm_vcpu *vcpu) 218 + { 219 + int ret = 0; 220 + 221 + mutex_lock(&vcpu->kvm->arch.config_lock); 222 + if (!vcpu_get_flag(vcpu, VCPU_PKVM_FINALIZED)) 223 + ret = __pkvm_create_hyp_vcpu(vcpu); 224 + mutex_unlock(&vcpu->kvm->arch.config_lock); 204 225 205 226 return ret; 206 227 }
+115 -79
arch/arm64/kvm/pmu-emul.c
··· 17 17 18 18 #define PERF_ATTR_CFG1_COUNTER_64BIT BIT(0) 19 19 20 - DEFINE_STATIC_KEY_FALSE(kvm_arm_pmu_available); 21 - 22 20 static LIST_HEAD(arm_pmus); 23 21 static DEFINE_MUTEX(arm_pmus_lock); 24 22 25 23 static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc); 26 24 static void kvm_pmu_release_perf_event(struct kvm_pmc *pmc); 27 25 static bool kvm_pmu_counter_is_enabled(struct kvm_pmc *pmc); 26 + 27 + bool kvm_supports_guest_pmuv3(void) 28 + { 29 + guard(mutex)(&arm_pmus_lock); 30 + return !list_empty(&arm_pmus); 31 + } 28 32 29 33 static struct kvm_vcpu *kvm_pmc_to_vcpu(const struct kvm_pmc *pmc) 30 34 { ··· 154 150 */ 155 151 u64 kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u64 select_idx) 156 152 { 157 - if (!kvm_vcpu_has_pmu(vcpu)) 158 - return 0; 159 - 160 153 return kvm_pmu_get_pmc_value(kvm_vcpu_idx_to_pmc(vcpu, select_idx)); 161 154 } 162 155 ··· 192 191 */ 193 192 void kvm_pmu_set_counter_value(struct kvm_vcpu *vcpu, u64 select_idx, u64 val) 194 193 { 195 - if (!kvm_vcpu_has_pmu(vcpu)) 196 - return; 197 - 198 194 kvm_pmu_set_pmc_value(kvm_vcpu_idx_to_pmc(vcpu, select_idx), val, false); 195 + } 196 + 197 + /** 198 + * kvm_pmu_set_counter_value_user - set PMU counter value from user 199 + * @vcpu: The vcpu pointer 200 + * @select_idx: The counter index 201 + * @val: The counter value 202 + */ 203 + void kvm_pmu_set_counter_value_user(struct kvm_vcpu *vcpu, u64 select_idx, u64 val) 204 + { 205 + kvm_pmu_release_perf_event(kvm_vcpu_idx_to_pmc(vcpu, select_idx)); 206 + __vcpu_sys_reg(vcpu, counter_index_to_reg(select_idx)) = val; 207 + kvm_make_request(KVM_REQ_RELOAD_PMU, vcpu); 199 208 } 200 209 201 210 /** ··· 256 245 257 246 for (i = 0; i < KVM_ARMV8_PMU_MAX_COUNTERS; i++) 258 247 pmu->pmc[i].idx = i; 259 - } 260 - 261 - /** 262 - * kvm_pmu_vcpu_reset - reset pmu state for cpu 263 - * @vcpu: The vcpu pointer 264 - * 265 - */ 266 - void kvm_pmu_vcpu_reset(struct kvm_vcpu *vcpu) 267 - { 268 - unsigned long mask = kvm_pmu_implemented_counter_mask(vcpu); 269 - int i; 270 - 271 - for_each_set_bit(i, &mask, 32) 272 - kvm_pmu_stop_counter(kvm_vcpu_idx_to_pmc(vcpu, i)); 273 248 } 274 249 275 250 /** ··· 347 350 { 348 351 int i; 349 352 350 - if (!kvm_vcpu_has_pmu(vcpu) || !val) 353 + if (!val) 351 354 return; 352 355 353 356 for (i = 0; i < KVM_ARMV8_PMU_MAX_COUNTERS; i++) { ··· 397 400 { 398 401 struct kvm_pmu *pmu = &vcpu->arch.pmu; 399 402 bool overflow; 400 - 401 - if (!kvm_vcpu_has_pmu(vcpu)) 402 - return; 403 403 404 404 overflow = kvm_pmu_overflow_status(vcpu); 405 405 if (pmu->irq_level == overflow) ··· 593 599 { 594 600 int i; 595 601 596 - if (!kvm_vcpu_has_pmu(vcpu)) 597 - return; 598 - 599 602 /* Fixup PMCR_EL0 to reconcile the PMU version and the LP bit */ 600 603 if (!kvm_has_feat(vcpu->kvm, ID_AA64DFR0_EL1, PMUVer, V3P5)) 601 604 val &= ~ARMV8_PMU_PMCR_LP; ··· 664 673 return kvm_pmc_read_evtreg(pmc) & ARMV8_PMU_INCLUDE_EL2; 665 674 } 666 675 676 + static int kvm_map_pmu_event(struct kvm *kvm, unsigned int eventsel) 677 + { 678 + struct arm_pmu *pmu = kvm->arch.arm_pmu; 679 + 680 + /* 681 + * The CPU PMU likely isn't PMUv3; let the driver provide a mapping 682 + * for the guest's PMUv3 event ID. 
683 + */ 684 + if (unlikely(pmu->map_pmuv3_event)) 685 + return pmu->map_pmuv3_event(eventsel); 686 + 687 + return eventsel; 688 + } 689 + 667 690 /** 668 691 * kvm_pmu_create_perf_event - create a perf event for a counter 669 692 * @pmc: Counter context ··· 688 683 struct arm_pmu *arm_pmu = vcpu->kvm->arch.arm_pmu; 689 684 struct perf_event *event; 690 685 struct perf_event_attr attr; 691 - u64 eventsel, evtreg; 686 + int eventsel; 687 + u64 evtreg; 692 688 693 689 evtreg = kvm_pmc_read_evtreg(pmc); 694 690 ··· 713 707 */ 714 708 if (vcpu->kvm->arch.pmu_filter && 715 709 !test_bit(eventsel, vcpu->kvm->arch.pmu_filter)) 710 + return; 711 + 712 + /* 713 + * Don't create an event if we're running on hardware that requires 714 + * PMUv3 event translation and we couldn't find a valid mapping. 715 + */ 716 + eventsel = kvm_map_pmu_event(vcpu->kvm, eventsel); 717 + if (eventsel < 0) 716 718 return; 717 719 718 720 memset(&attr, 0, sizeof(struct perf_event_attr)); ··· 780 766 struct kvm_pmc *pmc = kvm_vcpu_idx_to_pmc(vcpu, select_idx); 781 767 u64 reg; 782 768 783 - if (!kvm_vcpu_has_pmu(vcpu)) 784 - return; 785 - 786 769 reg = counter_index_to_evtreg(pmc->idx); 787 770 __vcpu_sys_reg(vcpu, reg) = data & kvm_pmu_evtyper_mask(vcpu->kvm); 788 771 ··· 797 786 if (!pmuv3_implemented(kvm_arm_pmu_get_pmuver_limit())) 798 787 return; 799 788 800 - mutex_lock(&arm_pmus_lock); 789 + guard(mutex)(&arm_pmus_lock); 801 790 802 791 entry = kmalloc(sizeof(*entry), GFP_KERNEL); 803 792 if (!entry) 804 - goto out_unlock; 793 + return; 805 794 806 795 entry->arm_pmu = pmu; 807 796 list_add_tail(&entry->entry, &arm_pmus); 808 - 809 - if (list_is_singular(&arm_pmus)) 810 - static_branch_enable(&kvm_arm_pmu_available); 811 - 812 - out_unlock: 813 - mutex_unlock(&arm_pmus_lock); 814 797 } 815 798 816 799 static struct arm_pmu *kvm_pmu_probe_armpmu(void) 817 800 { 818 - struct arm_pmu *tmp, *pmu = NULL; 819 801 struct arm_pmu_entry *entry; 802 + struct arm_pmu *pmu; 820 803 int cpu; 821 804 822 - mutex_lock(&arm_pmus_lock); 805 + guard(mutex)(&arm_pmus_lock); 823 806 824 807 /* 825 808 * It is safe to use a stale cpu to iterate the list of PMUs so long as ··· 834 829 */ 835 830 cpu = raw_smp_processor_id(); 836 831 list_for_each_entry(entry, &arm_pmus, entry) { 837 - tmp = entry->arm_pmu; 832 + pmu = entry->arm_pmu; 838 833 839 - if (cpumask_test_cpu(cpu, &tmp->supported_cpus)) { 840 - pmu = tmp; 841 - break; 842 - } 834 + if (cpumask_test_cpu(cpu, &pmu->supported_cpus)) 835 + return pmu; 843 836 } 844 837 845 - mutex_unlock(&arm_pmus_lock); 838 + return NULL; 839 + } 846 840 847 - return pmu; 841 + static u64 __compute_pmceid(struct arm_pmu *pmu, bool pmceid1) 842 + { 843 + u32 hi[2], lo[2]; 844 + 845 + bitmap_to_arr32(lo, pmu->pmceid_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS); 846 + bitmap_to_arr32(hi, pmu->pmceid_ext_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS); 847 + 848 + return ((u64)hi[pmceid1] << 32) | lo[pmceid1]; 849 + } 850 + 851 + static u64 compute_pmceid0(struct arm_pmu *pmu) 852 + { 853 + u64 val = __compute_pmceid(pmu, 0); 854 + 855 + /* always support SW_INCR */ 856 + val |= BIT(ARMV8_PMUV3_PERFCTR_SW_INCR); 857 + /* always support CHAIN */ 858 + val |= BIT(ARMV8_PMUV3_PERFCTR_CHAIN); 859 + return val; 860 + } 861 + 862 + static u64 compute_pmceid1(struct arm_pmu *pmu) 863 + { 864 + u64 val = __compute_pmceid(pmu, 1); 865 + 866 + /* 867 + * Don't advertise STALL_SLOT*, as PMMIR_EL0 is handled 868 + * as RAZ 869 + */ 870 + val &= ~(BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT - 32) | 871 + 
BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT_FRONTEND - 32) | 872 + BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT_BACKEND - 32)); 873 + return val; 848 874 } 849 875 850 876 u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1) 851 877 { 878 + struct arm_pmu *cpu_pmu = vcpu->kvm->arch.arm_pmu; 852 879 unsigned long *bmap = vcpu->kvm->arch.pmu_filter; 853 880 u64 val, mask = 0; 854 881 int base, i, nr_events; 855 882 856 - if (!kvm_vcpu_has_pmu(vcpu)) 857 - return 0; 858 - 859 883 if (!pmceid1) { 860 - val = read_sysreg(pmceid0_el0); 861 - /* always support CHAIN */ 862 - val |= BIT(ARMV8_PMUV3_PERFCTR_CHAIN); 884 + val = compute_pmceid0(cpu_pmu); 863 885 base = 0; 864 886 } else { 865 - val = read_sysreg(pmceid1_el0); 866 - /* 867 - * Don't advertise STALL_SLOT*, as PMMIR_EL0 is handled 868 - * as RAZ 869 - */ 870 - val &= ~(BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT - 32) | 871 - BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT_FRONTEND - 32) | 872 - BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT_BACKEND - 32)); 887 + val = compute_pmceid1(cpu_pmu); 873 888 base = 32; 874 889 } 875 890 ··· 925 900 926 901 int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu) 927 902 { 928 - if (!kvm_vcpu_has_pmu(vcpu)) 929 - return 0; 930 - 931 903 if (!vcpu->arch.pmu.created) 932 904 return -EINVAL; 933 905 ··· 946 924 } else if (kvm_arm_pmu_irq_initialized(vcpu)) { 947 925 return -EINVAL; 948 926 } 949 - 950 - /* One-off reload of the PMU on first run */ 951 - kvm_make_request(KVM_REQ_RELOAD_PMU, vcpu); 952 927 953 928 return 0; 954 929 } ··· 1012 993 u8 kvm_arm_pmu_get_max_counters(struct kvm *kvm) 1013 994 { 1014 995 struct arm_pmu *arm_pmu = kvm->arch.arm_pmu; 996 + 997 + /* 998 + * PMUv3 requires that all event counters are capable of counting any 999 + * event, though the same may not be true of non-PMUv3 hardware. 1000 + */ 1001 + if (cpus_have_final_cap(ARM64_WORKAROUND_PMUV3_IMPDEF_TRAPS)) 1002 + return 1; 1015 1003 1016 1004 /* 1017 1005 * The arm_pmu->cntr_mask considers the fixed counter(s) as well. ··· 1231 1205 1232 1206 u8 kvm_arm_pmu_get_pmuver_limit(void) 1233 1207 { 1234 - u64 tmp; 1208 + unsigned int pmuver; 1235 1209 1236 - tmp = read_sanitised_ftr_reg(SYS_ID_AA64DFR0_EL1); 1237 - tmp = cpuid_feature_cap_perfmon_field(tmp, 1238 - ID_AA64DFR0_EL1_PMUVer_SHIFT, 1239 - ID_AA64DFR0_EL1_PMUVer_V3P5); 1240 - return FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), tmp); 1210 + pmuver = SYS_FIELD_GET(ID_AA64DFR0_EL1, PMUVer, 1211 + read_sanitised_ftr_reg(SYS_ID_AA64DFR0_EL1)); 1212 + 1213 + /* 1214 + * Spoof a barebones PMUv3 implementation if the system supports IMPDEF 1215 + * traps of the PMUv3 sysregs 1216 + */ 1217 + if (cpus_have_final_cap(ARM64_WORKAROUND_PMUV3_IMPDEF_TRAPS)) 1218 + return ID_AA64DFR0_EL1_PMUVer_IMP; 1219 + 1220 + /* 1221 + * Otherwise, treat IMPLEMENTATION DEFINED functionality as 1222 + * unimplemented 1223 + */ 1224 + if (pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF) 1225 + return 0; 1226 + 1227 + return min(pmuver, ID_AA64DFR0_EL1_PMUVer_V3P5); 1241 1228 } 1242 1229 1243 1230 /** ··· 1269 1230 bool reprogrammed = false; 1270 1231 unsigned long mask; 1271 1232 int i; 1272 - 1273 - if (!kvm_vcpu_has_pmu(vcpu)) 1274 - return; 1275 1233 1276 1234 mask = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0); 1277 1235 for_each_set_bit(i, &mask, 32) {
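A standalone restatement of the PMUVer clamping policy in kvm_arm_pmu_get_pmuver_limit() above, with the relevant ID_AA64DFR0_EL1.PMUVer encodings spelled out as plain constants (values taken from the architected field encoding):

#include <stdbool.h>
#include <stdint.h>

#define PMUVER_NI	0x0	/* no PMU */
#define PMUVER_IMP	0x1	/* baseline PMUv3 */
#define PMUVER_V3P5	0x6	/* FEAT_PMUv3p5 */
#define PMUVER_IMP_DEF	0xf	/* IMPLEMENTATION DEFINED PMU, not PMUv3 */

static uint8_t pmuver_limit(uint8_t host_pmuver, bool impdef_traps)
{
	/* Spoof a barebones PMUv3 when only the IMPDEF traps are usable. */
	if (impdef_traps)
		return PMUVER_IMP;

	/* An IMPLEMENTATION DEFINED PMU is treated as no PMU at all. */
	if (host_pmuver == PMUVER_IMP_DEF)
		return PMUVER_NI;

	/* Otherwise cap what the guest may see at PMUv3p5. */
	return host_pmuver < PMUVER_V3P5 ? host_pmuver : PMUVER_V3P5;
}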
+5 -5
arch/arm64/kvm/pmu.c
··· 41 41 { 42 42 struct kvm_pmu_events *pmu = kvm_get_pmu_events(); 43 43 44 - if (!kvm_arm_support_pmu_v3() || !kvm_pmu_switch_needed(attr)) 44 + if (!system_supports_pmuv3() || !kvm_pmu_switch_needed(attr)) 45 45 return; 46 46 47 47 if (!attr->exclude_host) ··· 57 57 { 58 58 struct kvm_pmu_events *pmu = kvm_get_pmu_events(); 59 59 60 - if (!kvm_arm_support_pmu_v3()) 60 + if (!system_supports_pmuv3()) 61 61 return; 62 62 63 63 pmu->events_host &= ~clr; ··· 133 133 struct kvm_pmu_events *pmu; 134 134 u64 events_guest, events_host; 135 135 136 - if (!kvm_arm_support_pmu_v3() || !has_vhe()) 136 + if (!system_supports_pmuv3() || !has_vhe()) 137 137 return; 138 138 139 139 preempt_disable(); ··· 154 154 struct kvm_pmu_events *pmu; 155 155 u64 events_guest, events_host; 156 156 157 - if (!kvm_arm_support_pmu_v3() || !has_vhe()) 157 + if (!system_supports_pmuv3() || !has_vhe()) 158 158 return; 159 159 160 160 pmu = kvm_get_pmu_events(); ··· 180 180 struct kvm_cpu_context *hctxt; 181 181 struct kvm_vcpu *vcpu; 182 182 183 - if (!kvm_arm_support_pmu_v3() || !has_vhe()) 183 + if (!system_supports_pmuv3() || !has_vhe()) 184 184 return false; 185 185 186 186 vcpu = kvm_get_running_vcpu();
-3
arch/arm64/kvm/reset.c
··· 196 196 vcpu->arch.reset_state.reset = false; 197 197 spin_unlock(&vcpu->arch.mp_state_lock); 198 198 199 - /* Reset PMU outside of the non-preemptible section */ 200 - kvm_pmu_vcpu_reset(vcpu); 201 - 202 199 preempt_disable(); 203 200 loaded = (vcpu->cpu != -1); 204 201 if (loaded)
+314 -164
arch/arm64/kvm/sys_regs.c
··· 17 17 #include <linux/mm.h> 18 18 #include <linux/printk.h> 19 19 #include <linux/uaccess.h> 20 + #include <linux/irqchip/arm-gic-v3.h> 20 21 21 22 #include <asm/arm_pmuv3.h> 22 23 #include <asm/cacheflush.h> ··· 532 531 if (p->is_write) 533 532 return ignore_write(vcpu, p); 534 533 535 - p->regval = vcpu->arch.vgic_cpu.vgic_v3.vgic_sre; 534 + if (p->Op1 == 4) { /* ICC_SRE_EL2 */ 535 + p->regval = (ICC_SRE_EL2_ENABLE | ICC_SRE_EL2_SRE | 536 + ICC_SRE_EL1_DIB | ICC_SRE_EL1_DFB); 537 + } else { /* ICC_SRE_EL1 */ 538 + p->regval = vcpu->arch.vgic_cpu.vgic_v3.vgic_sre; 539 + } 540 + 536 541 return true; 537 542 } 538 543 ··· 967 960 return 0; 968 961 } 969 962 963 + static int set_pmu_evcntr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r, 964 + u64 val) 965 + { 966 + u64 idx; 967 + 968 + if (r->CRn == 9 && r->CRm == 13 && r->Op2 == 0) 969 + /* PMCCNTR_EL0 */ 970 + idx = ARMV8_PMU_CYCLE_IDX; 971 + else 972 + /* PMEVCNTRn_EL0 */ 973 + idx = ((r->CRm & 3) << 3) | (r->Op2 & 7); 974 + 975 + kvm_pmu_set_counter_value_user(vcpu, idx, val); 976 + return 0; 977 + } 978 + 970 979 static bool access_pmu_evcntr(struct kvm_vcpu *vcpu, 971 980 struct sys_reg_params *p, 972 981 const struct sys_reg_desc *r) ··· 1074 1051 1075 1052 static int set_pmreg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r, u64 val) 1076 1053 { 1077 - bool set; 1054 + u64 mask = kvm_pmu_accessible_counter_mask(vcpu); 1078 1055 1079 - val &= kvm_pmu_accessible_counter_mask(vcpu); 1080 - 1081 - switch (r->reg) { 1082 - case PMOVSSET_EL0: 1083 - /* CRm[1] being set indicates a SET register, and CLR otherwise */ 1084 - set = r->CRm & 2; 1085 - break; 1086 - default: 1087 - /* Op2[0] being set indicates a SET register, and CLR otherwise */ 1088 - set = r->Op2 & 1; 1089 - break; 1090 - } 1091 - 1092 - if (set) 1093 - __vcpu_sys_reg(vcpu, r->reg) |= val; 1094 - else 1095 - __vcpu_sys_reg(vcpu, r->reg) &= ~val; 1056 + __vcpu_sys_reg(vcpu, r->reg) = val & mask; 1057 + kvm_make_request(KVM_REQ_RELOAD_PMU, vcpu); 1096 1058 1097 1059 return 0; 1098 1060 } ··· 1237 1229 val |= ARMV8_PMU_PMCR_LC; 1238 1230 1239 1231 __vcpu_sys_reg(vcpu, r->reg) = val; 1232 + kvm_make_request(KVM_REQ_RELOAD_PMU, vcpu); 1233 + 1240 1234 return 0; 1241 1235 } 1242 1236 ··· 1265 1255 #define PMU_PMEVCNTR_EL0(n) \ 1266 1256 { PMU_SYS_REG(PMEVCNTRn_EL0(n)), \ 1267 1257 .reset = reset_pmevcntr, .get_user = get_pmu_evcntr, \ 1258 + .set_user = set_pmu_evcntr, \ 1268 1259 .access = access_pmu_evcntr, .reg = (PMEVCNTR0_EL0 + n), } 1269 1260 1270 1261 /* Macro to expand the PMEVTYPERn_EL0 register */ ··· 1638 1627 break; 1639 1628 case SYS_ID_AA64MMFR2_EL1: 1640 1629 val &= ~ID_AA64MMFR2_EL1_CCIDX_MASK; 1630 + val &= ~ID_AA64MMFR2_EL1_NV; 1641 1631 break; 1642 1632 case SYS_ID_AA64MMFR3_EL1: 1643 1633 val &= ID_AA64MMFR3_EL1_TCRX | ID_AA64MMFR3_EL1_S1POE | ··· 1648 1636 val &= ~ARM64_FEATURE_MASK(ID_MMFR4_EL1_CCIDX); 1649 1637 break; 1650 1638 } 1639 + 1640 + if (vcpu_has_nv(vcpu)) 1641 + val = limit_nv_id_reg(vcpu->kvm, id, val); 1651 1642 1652 1643 return val; 1653 1644 } ··· 1678 1663 * Return true if the register's (Op0, Op1, CRn, CRm, Op2) is 1679 1664 * (3, 0, 0, crm, op2), where 1<=crm<8, 0<=op2<8, which is the range of ID 1680 1665 * registers KVM maintains on a per-VM basis. 1666 + * 1667 + * Additionally, the implementation ID registers and CTR_EL0 are handled as 1668 + * per-VM registers. 
1681 1669 */ 1682 1670 static inline bool is_vm_ftr_id_reg(u32 id) 1683 1671 { 1684 - if (id == SYS_CTR_EL0) 1672 + switch (id) { 1673 + case SYS_CTR_EL0: 1674 + case SYS_MIDR_EL1: 1675 + case SYS_REVIDR_EL1: 1676 + case SYS_AIDR_EL1: 1685 1677 return true; 1678 + default: 1679 + return (sys_reg_Op0(id) == 3 && sys_reg_Op1(id) == 0 && 1680 + sys_reg_CRn(id) == 0 && sys_reg_CRm(id) >= 1 && 1681 + sys_reg_CRm(id) < 8); 1686 1682 1687 - return (sys_reg_Op0(id) == 3 && sys_reg_Op1(id) == 0 && 1688 - sys_reg_CRn(id) == 0 && sys_reg_CRm(id) >= 1 && 1689 - sys_reg_CRm(id) < 8); 1683 + } 1690 1684 } 1691 1685 1692 1686 static inline bool is_vcpu_ftr_id_reg(u32 id) ··· 1826 1802 return val; 1827 1803 } 1828 1804 1829 - #define ID_REG_LIMIT_FIELD_ENUM(val, reg, field, limit) \ 1830 - ({ \ 1831 - u64 __f_val = FIELD_GET(reg##_##field##_MASK, val); \ 1832 - (val) &= ~reg##_##field##_MASK; \ 1833 - (val) |= FIELD_PREP(reg##_##field##_MASK, \ 1834 - min(__f_val, \ 1835 - (u64)SYS_FIELD_VALUE(reg, field, limit))); \ 1836 - (val); \ 1837 - }) 1838 - 1839 1805 static u64 sanitise_id_aa64dfr0_el1(const struct kvm_vcpu *vcpu, u64 val) 1840 1806 { 1841 1807 val = ID_REG_LIMIT_FIELD_ENUM(val, ID_AA64DFR0_EL1, DebugVer, V8P8); ··· 1884 1870 static u64 read_sanitised_id_dfr0_el1(struct kvm_vcpu *vcpu, 1885 1871 const struct sys_reg_desc *rd) 1886 1872 { 1887 - u8 perfmon = pmuver_to_perfmon(kvm_arm_pmu_get_pmuver_limit()); 1873 + u8 perfmon; 1888 1874 u64 val = read_sanitised_ftr_reg(SYS_ID_DFR0_EL1); 1889 1875 1890 1876 val &= ~ID_DFR0_EL1_PerfMon_MASK; 1891 - if (kvm_vcpu_has_pmu(vcpu)) 1877 + if (kvm_vcpu_has_pmu(vcpu)) { 1878 + perfmon = pmuver_to_perfmon(kvm_arm_pmu_get_pmuver_limit()); 1892 1879 val |= SYS_FIELD_PREP(ID_DFR0_EL1, PerfMon, perfmon); 1880 + } 1893 1881 1894 1882 val = ID_REG_LIMIT_FIELD_ENUM(val, ID_DFR0_EL1, CopDbg, Debugv8p8); 1895 1883 ··· 1957 1941 /* See set_id_aa64pfr0_el1 for comment about MPAM */ 1958 1942 if ((hw_val & mpam_mask) == (user_val & mpam_mask)) 1959 1943 user_val &= ~ID_AA64PFR1_EL1_MPAM_frac_MASK; 1944 + 1945 + return set_id_reg(vcpu, rd, user_val); 1946 + } 1947 + 1948 + static int set_id_aa64mmfr0_el1(struct kvm_vcpu *vcpu, 1949 + const struct sys_reg_desc *rd, u64 user_val) 1950 + { 1951 + u64 sanitized_val = kvm_read_sanitised_id_reg(vcpu, rd); 1952 + u64 tgran2_mask = ID_AA64MMFR0_EL1_TGRAN4_2_MASK | 1953 + ID_AA64MMFR0_EL1_TGRAN16_2_MASK | 1954 + ID_AA64MMFR0_EL1_TGRAN64_2_MASK; 1955 + 1956 + if (vcpu_has_nv(vcpu) && 1957 + ((sanitized_val & tgran2_mask) != (user_val & tgran2_mask))) 1958 + return -EINVAL; 1959 + 1960 + return set_id_reg(vcpu, rd, user_val); 1961 + } 1962 + 1963 + static int set_id_aa64mmfr2_el1(struct kvm_vcpu *vcpu, 1964 + const struct sys_reg_desc *rd, u64 user_val) 1965 + { 1966 + u64 hw_val = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1); 1967 + u64 nv_mask = ID_AA64MMFR2_EL1_NV_MASK; 1968 + 1969 + /* 1970 + * We made the mistake to expose the now deprecated NV field, 1971 + * so allow userspace to write it, but silently ignore it. 1972 + */ 1973 + if ((hw_val & nv_mask) == (user_val & nv_mask)) 1974 + user_val &= ~nv_mask; 1960 1975 1961 1976 return set_id_reg(vcpu, rd, user_val); 1962 1977 } ··· 2313 2266 * from userspace. 
2314 2267 */ 2315 2268 2269 + #define ID_DESC_DEFAULT_CALLBACKS \ 2270 + .access = access_id_reg, \ 2271 + .get_user = get_id_reg, \ 2272 + .set_user = set_id_reg, \ 2273 + .visibility = id_visibility, \ 2274 + .reset = kvm_read_sanitised_id_reg 2275 + 2316 2276 #define ID_DESC(name) \ 2317 2277 SYS_DESC(SYS_##name), \ 2318 - .access = access_id_reg, \ 2319 - .get_user = get_id_reg \ 2278 + ID_DESC_DEFAULT_CALLBACKS 2320 2279 2321 2280 /* sys_reg_desc initialiser for known cpufeature ID registers */ 2322 2281 #define ID_SANITISED(name) { \ 2323 2282 ID_DESC(name), \ 2324 - .set_user = set_id_reg, \ 2325 - .visibility = id_visibility, \ 2326 - .reset = kvm_read_sanitised_id_reg, \ 2327 2283 .val = 0, \ 2328 2284 } 2329 2285 2330 2286 /* sys_reg_desc initialiser for known cpufeature ID registers */ 2331 2287 #define AA32_ID_SANITISED(name) { \ 2332 2288 ID_DESC(name), \ 2333 - .set_user = set_id_reg, \ 2334 2289 .visibility = aa32_id_visibility, \ 2335 - .reset = kvm_read_sanitised_id_reg, \ 2336 2290 .val = 0, \ 2337 2291 } 2338 2292 2339 2293 /* sys_reg_desc initialiser for writable ID registers */ 2340 2294 #define ID_WRITABLE(name, mask) { \ 2341 2295 ID_DESC(name), \ 2342 - .set_user = set_id_reg, \ 2343 - .visibility = id_visibility, \ 2344 - .reset = kvm_read_sanitised_id_reg, \ 2345 2296 .val = mask, \ 2346 2297 } 2347 2298 ··· 2347 2302 #define ID_FILTERED(sysreg, name, mask) { \ 2348 2303 ID_DESC(sysreg), \ 2349 2304 .set_user = set_##name, \ 2350 - .visibility = id_visibility, \ 2351 - .reset = kvm_read_sanitised_id_reg, \ 2352 2305 .val = (mask), \ 2353 2306 } 2354 2307 ··· 2356 2313 * (1 <= crm < 8, 0 <= Op2 < 8). 2357 2314 */ 2358 2315 #define ID_UNALLOCATED(crm, op2) { \ 2316 + .name = "S3_0_0_" #crm "_" #op2, \ 2359 2317 Op0(3), Op1(0), CRn(0), CRm(crm), Op2(op2), \ 2360 - .access = access_id_reg, \ 2361 - .get_user = get_id_reg, \ 2362 - .set_user = set_id_reg, \ 2318 + ID_DESC_DEFAULT_CALLBACKS, \ 2363 2319 .visibility = raz_visibility, \ 2364 - .reset = kvm_read_sanitised_id_reg, \ 2365 2320 .val = 0, \ 2366 2321 } 2367 2322 ··· 2370 2329 */ 2371 2330 #define ID_HIDDEN(name) { \ 2372 2331 ID_DESC(name), \ 2373 - .set_user = set_id_reg, \ 2374 2332 .visibility = raz_visibility, \ 2375 - .reset = kvm_read_sanitised_id_reg, \ 2376 2333 .val = 0, \ 2377 2334 } 2378 2335 ··· 2465 2426 vq = SYS_FIELD_GET(ZCR_ELx, LEN, p->regval) + 1; 2466 2427 vq = min(vq, vcpu_sve_max_vq(vcpu)); 2467 2428 vcpu_write_sys_reg(vcpu, vq - 1, ZCR_EL2); 2429 + 2430 + return true; 2431 + } 2432 + 2433 + static bool access_gic_vtr(struct kvm_vcpu *vcpu, 2434 + struct sys_reg_params *p, 2435 + const struct sys_reg_desc *r) 2436 + { 2437 + if (p->is_write) 2438 + return write_to_read_only(vcpu, p, r); 2439 + 2440 + p->regval = kvm_vgic_global_state.ich_vtr_el2; 2441 + p->regval &= ~(ICH_VTR_EL2_DVIM | 2442 + ICH_VTR_EL2_A3V | 2443 + ICH_VTR_EL2_IDbits); 2444 + p->regval |= ICH_VTR_EL2_nV4; 2445 + 2446 + return true; 2447 + } 2448 + 2449 + static bool access_gic_misr(struct kvm_vcpu *vcpu, 2450 + struct sys_reg_params *p, 2451 + const struct sys_reg_desc *r) 2452 + { 2453 + if (p->is_write) 2454 + return write_to_read_only(vcpu, p, r); 2455 + 2456 + p->regval = vgic_v3_get_misr(vcpu); 2457 + 2458 + return true; 2459 + } 2460 + 2461 + static bool access_gic_eisr(struct kvm_vcpu *vcpu, 2462 + struct sys_reg_params *p, 2463 + const struct sys_reg_desc *r) 2464 + { 2465 + if (p->is_write) 2466 + return write_to_read_only(vcpu, p, r); 2467 + 2468 + p->regval = vgic_v3_get_eisr(vcpu); 2469 + 2470 + return true; 
2471 + } 2472 + 2473 + static bool access_gic_elrsr(struct kvm_vcpu *vcpu, 2474 + struct sys_reg_params *p, 2475 + const struct sys_reg_desc *r) 2476 + { 2477 + if (p->is_write) 2478 + return write_to_read_only(vcpu, p, r); 2479 + 2480 + p->regval = vgic_v3_get_elrsr(vcpu); 2481 + 2468 2482 return true; 2469 2483 } 2470 2484 ··· 2585 2493 return true; 2586 2494 } 2587 2495 2496 + /* 2497 + * For historical (ahem ABI) reasons, KVM treated MIDR_EL1, REVIDR_EL1, and 2498 + * AIDR_EL1 as "invariant" registers, meaning userspace cannot change them. 2499 + * The values made visible to userspace were the register values of the boot 2500 + * CPU. 2501 + * 2502 + * At the same time, reads from these registers at EL1 previously were not 2503 + * trapped, allowing the guest to read the actual hardware value. On big-little 2504 + * machines, this means the VM can see different values depending on where a 2505 + * given vCPU got scheduled. 2506 + * 2507 + * These registers are now trapped as collateral damage from SME, and what 2508 + * follows attempts to give a user / guest view consistent with the existing 2509 + * ABI. 2510 + */ 2511 + static bool access_imp_id_reg(struct kvm_vcpu *vcpu, 2512 + struct sys_reg_params *p, 2513 + const struct sys_reg_desc *r) 2514 + { 2515 + if (p->is_write) 2516 + return write_to_read_only(vcpu, p, r); 2517 + 2518 + /* 2519 + * Return the VM-scoped implementation ID register values if userspace 2520 + * has made them writable. 2521 + */ 2522 + if (test_bit(KVM_ARCH_FLAG_WRITABLE_IMP_ID_REGS, &vcpu->kvm->arch.flags)) 2523 + return access_id_reg(vcpu, p, r); 2524 + 2525 + /* 2526 + * Otherwise, fall back to the old behavior of returning the value of 2527 + * the current CPU. 2528 + */ 2529 + switch (reg_to_encoding(r)) { 2530 + case SYS_REVIDR_EL1: 2531 + p->regval = read_sysreg(revidr_el1); 2532 + break; 2533 + case SYS_AIDR_EL1: 2534 + p->regval = read_sysreg(aidr_el1); 2535 + break; 2536 + default: 2537 + WARN_ON_ONCE(1); 2538 + } 2539 + 2540 + return true; 2541 + } 2542 + 2543 + static u64 __ro_after_init boot_cpu_midr_val; 2544 + static u64 __ro_after_init boot_cpu_revidr_val; 2545 + static u64 __ro_after_init boot_cpu_aidr_val; 2546 + 2547 + static void init_imp_id_regs(void) 2548 + { 2549 + boot_cpu_midr_val = read_sysreg(midr_el1); 2550 + boot_cpu_revidr_val = read_sysreg(revidr_el1); 2551 + boot_cpu_aidr_val = read_sysreg(aidr_el1); 2552 + } 2553 + 2554 + static u64 reset_imp_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r) 2555 + { 2556 + switch (reg_to_encoding(r)) { 2557 + case SYS_MIDR_EL1: 2558 + return boot_cpu_midr_val; 2559 + case SYS_REVIDR_EL1: 2560 + return boot_cpu_revidr_val; 2561 + case SYS_AIDR_EL1: 2562 + return boot_cpu_aidr_val; 2563 + default: 2564 + KVM_BUG_ON(1, vcpu->kvm); 2565 + return 0; 2566 + } 2567 + } 2568 + 2569 + static int set_imp_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r, 2570 + u64 val) 2571 + { 2572 + struct kvm *kvm = vcpu->kvm; 2573 + u64 expected; 2574 + 2575 + guard(mutex)(&kvm->arch.config_lock); 2576 + 2577 + expected = read_id_reg(vcpu, r); 2578 + if (expected == val) 2579 + return 0; 2580 + 2581 + if (!test_bit(KVM_ARCH_FLAG_WRITABLE_IMP_ID_REGS, &kvm->arch.flags)) 2582 + return -EINVAL; 2583 + 2584 + /* 2585 + * Once the VM has started the ID registers are immutable. Reject the 2586 + * write if userspace tries to change it. 
2587 + */ 2588 + if (kvm_vm_has_ran_once(kvm)) 2589 + return -EBUSY; 2590 + 2591 + /* 2592 + * Any value is allowed for the implementation ID registers so long as 2593 + * it is within the writable mask. 2594 + */ 2595 + if ((val & r->val) != val) 2596 + return -EINVAL; 2597 + 2598 + kvm_set_vm_id_reg(kvm, reg_to_encoding(r), val); 2599 + return 0; 2600 + } 2601 + 2602 + #define IMPLEMENTATION_ID(reg, mask) { \ 2603 + SYS_DESC(SYS_##reg), \ 2604 + .access = access_imp_id_reg, \ 2605 + .get_user = get_id_reg, \ 2606 + .set_user = set_imp_id_reg, \ 2607 + .reset = reset_imp_id_reg, \ 2608 + .val = mask, \ 2609 + } 2588 2610 2589 2611 /* 2590 2612 * Architected system registers. ··· 2748 2542 2749 2543 { SYS_DESC(SYS_DBGVCR32_EL2), undef_access, reset_val, DBGVCR32_EL2, 0 }, 2750 2544 2545 + IMPLEMENTATION_ID(MIDR_EL1, GENMASK_ULL(31, 0)), 2751 2546 { SYS_DESC(SYS_MPIDR_EL1), NULL, reset_mpidr, MPIDR_EL1 }, 2547 + IMPLEMENTATION_ID(REVIDR_EL1, GENMASK_ULL(63, 0)), 2752 2548 2753 2549 /* 2754 2550 * ID regs: all ID_SANITISED() entries here must have corresponding ··· 2868 2660 ID_UNALLOCATED(6,7), 2869 2661 2870 2662 /* CRm=7 */ 2871 - ID_WRITABLE(ID_AA64MMFR0_EL1, ~(ID_AA64MMFR0_EL1_RES0 | 2872 - ID_AA64MMFR0_EL1_TGRAN4_2 | 2873 - ID_AA64MMFR0_EL1_TGRAN64_2 | 2874 - ID_AA64MMFR0_EL1_TGRAN16_2 | 2663 + ID_FILTERED(ID_AA64MMFR0_EL1, id_aa64mmfr0_el1, 2664 + ~(ID_AA64MMFR0_EL1_RES0 | 2875 2665 ID_AA64MMFR0_EL1_ASIDBITS)), 2876 2666 ID_WRITABLE(ID_AA64MMFR1_EL1, ~(ID_AA64MMFR1_EL1_RES0 | 2877 2667 ID_AA64MMFR1_EL1_HCX | ··· 2877 2671 ID_AA64MMFR1_EL1_XNX | 2878 2672 ID_AA64MMFR1_EL1_VH | 2879 2673 ID_AA64MMFR1_EL1_VMIDBits)), 2880 - ID_WRITABLE(ID_AA64MMFR2_EL1, ~(ID_AA64MMFR2_EL1_RES0 | 2674 + ID_FILTERED(ID_AA64MMFR2_EL1, 2675 + id_aa64mmfr2_el1, ~(ID_AA64MMFR2_EL1_RES0 | 2881 2676 ID_AA64MMFR2_EL1_EVT | 2882 2677 ID_AA64MMFR2_EL1_FWB | 2883 2678 ID_AA64MMFR2_EL1_IDS | ··· 2887 2680 ID_WRITABLE(ID_AA64MMFR3_EL1, (ID_AA64MMFR3_EL1_TCRX | 2888 2681 ID_AA64MMFR3_EL1_S1PIE | 2889 2682 ID_AA64MMFR3_EL1_S1POE)), 2890 - ID_SANITISED(ID_AA64MMFR4_EL1), 2683 + ID_WRITABLE(ID_AA64MMFR4_EL1, ID_AA64MMFR4_EL1_NV_frac), 2891 2684 ID_UNALLOCATED(7,5), 2892 2685 ID_UNALLOCATED(7,6), 2893 2686 ID_UNALLOCATED(7,7), ··· 3021 2814 .set_user = set_clidr, .val = ~CLIDR_EL1_RES0 }, 3022 2815 { SYS_DESC(SYS_CCSIDR2_EL1), undef_access }, 3023 2816 { SYS_DESC(SYS_SMIDR_EL1), undef_access }, 2817 + IMPLEMENTATION_ID(AIDR_EL1, GENMASK_ULL(63, 0)), 3024 2818 { SYS_DESC(SYS_CSSELR_EL1), access_csselr, reset_unknown, CSSELR_EL1 }, 3025 2819 ID_FILTERED(CTR_EL0, ctr_el0, 3026 2820 CTR_EL0_DIC_MASK | ··· 3058 2850 .access = access_pmceid, .reset = NULL }, 3059 2851 { PMU_SYS_REG(PMCCNTR_EL0), 3060 2852 .access = access_pmu_evcntr, .reset = reset_unknown, 3061 - .reg = PMCCNTR_EL0, .get_user = get_pmu_evcntr}, 2853 + .reg = PMCCNTR_EL0, .get_user = get_pmu_evcntr, 2854 + .set_user = set_pmu_evcntr }, 3062 2855 { PMU_SYS_REG(PMXEVTYPER_EL0), 3063 2856 .access = access_pmu_evtyper, .reset = NULL }, 3064 2857 { PMU_SYS_REG(PMXEVCNTR_EL0), ··· 3311 3102 EL2_REG(RVBAR_EL2, access_rw, reset_val, 0), 3312 3103 { SYS_DESC(SYS_RMR_EL2), undef_access }, 3313 3104 3105 + EL2_REG_VNCR(ICH_AP0R0_EL2, reset_val, 0), 3106 + EL2_REG_VNCR(ICH_AP0R1_EL2, reset_val, 0), 3107 + EL2_REG_VNCR(ICH_AP0R2_EL2, reset_val, 0), 3108 + EL2_REG_VNCR(ICH_AP0R3_EL2, reset_val, 0), 3109 + EL2_REG_VNCR(ICH_AP1R0_EL2, reset_val, 0), 3110 + EL2_REG_VNCR(ICH_AP1R1_EL2, reset_val, 0), 3111 + EL2_REG_VNCR(ICH_AP1R2_EL2, reset_val, 0), 3112 + 
EL2_REG_VNCR(ICH_AP1R3_EL2, reset_val, 0), 3113 + 3114 + { SYS_DESC(SYS_ICC_SRE_EL2), access_gic_sre }, 3115 + 3314 3116 EL2_REG_VNCR(ICH_HCR_EL2, reset_val, 0), 3117 + { SYS_DESC(SYS_ICH_VTR_EL2), access_gic_vtr }, 3118 + { SYS_DESC(SYS_ICH_MISR_EL2), access_gic_misr }, 3119 + { SYS_DESC(SYS_ICH_EISR_EL2), access_gic_eisr }, 3120 + { SYS_DESC(SYS_ICH_ELRSR_EL2), access_gic_elrsr }, 3121 + EL2_REG_VNCR(ICH_VMCR_EL2, reset_val, 0), 3122 + 3123 + EL2_REG_VNCR(ICH_LR0_EL2, reset_val, 0), 3124 + EL2_REG_VNCR(ICH_LR1_EL2, reset_val, 0), 3125 + EL2_REG_VNCR(ICH_LR2_EL2, reset_val, 0), 3126 + EL2_REG_VNCR(ICH_LR3_EL2, reset_val, 0), 3127 + EL2_REG_VNCR(ICH_LR4_EL2, reset_val, 0), 3128 + EL2_REG_VNCR(ICH_LR5_EL2, reset_val, 0), 3129 + EL2_REG_VNCR(ICH_LR6_EL2, reset_val, 0), 3130 + EL2_REG_VNCR(ICH_LR7_EL2, reset_val, 0), 3131 + EL2_REG_VNCR(ICH_LR8_EL2, reset_val, 0), 3132 + EL2_REG_VNCR(ICH_LR9_EL2, reset_val, 0), 3133 + EL2_REG_VNCR(ICH_LR10_EL2, reset_val, 0), 3134 + EL2_REG_VNCR(ICH_LR11_EL2, reset_val, 0), 3135 + EL2_REG_VNCR(ICH_LR12_EL2, reset_val, 0), 3136 + EL2_REG_VNCR(ICH_LR13_EL2, reset_val, 0), 3137 + EL2_REG_VNCR(ICH_LR14_EL2, reset_val, 0), 3138 + EL2_REG_VNCR(ICH_LR15_EL2, reset_val, 0), 3315 3139 3316 3140 EL2_REG(CONTEXTIDR_EL2, access_rw, reset_val, 0), 3317 3141 EL2_REG(TPIDR_EL2, access_rw, reset_val, 0), ··· 4514 4272 * Certain AArch32 ID registers are handled by rerouting to the AArch64 4515 4273 * system register table. Registers in the ID range where CRm=0 are 4516 4274 * excluded from this scheme as they do not trivially map into AArch64 4517 - * system register encodings. 4275 + * system register encodings, except for AIDR/REVIDR. 4518 4276 */ 4519 - if (params.Op1 == 0 && params.CRn == 0 && params.CRm) 4277 + if (params.Op1 == 0 && params.CRn == 0 && 4278 + (params.CRm || params.Op2 == 6 /* REVIDR */)) 4279 + return kvm_emulate_cp15_id_reg(vcpu, &params); 4280 + if (params.Op1 == 1 && params.CRn == 0 && 4281 + params.CRm == 0 && params.Op2 == 7 /* AIDR */) 4520 4282 return kvm_emulate_cp15_id_reg(vcpu, &params); 4521 4283 4522 4284 return kvm_handle_cp_32(vcpu, &params, cp15_regs, ARRAY_SIZE(cp15_regs)); ··· 4719 4473 } 4720 4474 4721 4475 set_bit(KVM_ARCH_FLAG_ID_REGS_INITIALIZED, &kvm->arch.flags); 4476 + 4477 + if (kvm_vcpu_has_pmu(vcpu)) 4478 + kvm_make_request(KVM_REQ_RELOAD_PMU, vcpu); 4722 4479 } 4723 4480 4724 4481 /** ··· 4827 4578 return r; 4828 4579 } 4829 4580 4830 - /* 4831 - * These are the invariant sys_reg registers: we let the guest see the 4832 - * host versions of these, so they're part of the guest state. 4833 - * 4834 - * A future CPU may provide a mechanism to present different values to 4835 - * the guest, or a future kvm may trap them. 
4836 - */ 4837 - 4838 - #define FUNCTION_INVARIANT(reg) \ 4839 - static u64 reset_##reg(struct kvm_vcpu *v, \ 4840 - const struct sys_reg_desc *r) \ 4841 - { \ 4842 - ((struct sys_reg_desc *)r)->val = read_sysreg(reg); \ 4843 - return ((struct sys_reg_desc *)r)->val; \ 4844 - } 4845 - 4846 - FUNCTION_INVARIANT(midr_el1) 4847 - FUNCTION_INVARIANT(revidr_el1) 4848 - FUNCTION_INVARIANT(aidr_el1) 4849 - 4850 - /* ->val is filled in by kvm_sys_reg_table_init() */ 4851 - static struct sys_reg_desc invariant_sys_regs[] __ro_after_init = { 4852 - { SYS_DESC(SYS_MIDR_EL1), NULL, reset_midr_el1 }, 4853 - { SYS_DESC(SYS_REVIDR_EL1), NULL, reset_revidr_el1 }, 4854 - { SYS_DESC(SYS_AIDR_EL1), NULL, reset_aidr_el1 }, 4855 - }; 4856 - 4857 - static int get_invariant_sys_reg(u64 id, u64 __user *uaddr) 4858 - { 4859 - const struct sys_reg_desc *r; 4860 - 4861 - r = get_reg_by_id(id, invariant_sys_regs, 4862 - ARRAY_SIZE(invariant_sys_regs)); 4863 - if (!r) 4864 - return -ENOENT; 4865 - 4866 - return put_user(r->val, uaddr); 4867 - } 4868 - 4869 - static int set_invariant_sys_reg(u64 id, u64 __user *uaddr) 4870 - { 4871 - const struct sys_reg_desc *r; 4872 - u64 val; 4873 - 4874 - r = get_reg_by_id(id, invariant_sys_regs, 4875 - ARRAY_SIZE(invariant_sys_regs)); 4876 - if (!r) 4877 - return -ENOENT; 4878 - 4879 - if (get_user(val, uaddr)) 4880 - return -EFAULT; 4881 - 4882 - /* This is what we mean by invariant: you can't change it. */ 4883 - if (r->val != val) 4884 - return -EINVAL; 4885 - 4886 - return 0; 4887 - } 4888 - 4889 4581 static int demux_c15_get(struct kvm_vcpu *vcpu, u64 id, void __user *uaddr) 4890 4582 { 4891 4583 u32 val; ··· 4908 4718 int kvm_arm_sys_reg_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) 4909 4719 { 4910 4720 void __user *uaddr = (void __user *)(unsigned long)reg->addr; 4911 - int err; 4912 4721 4913 4722 if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_DEMUX) 4914 4723 return demux_c15_get(vcpu, reg->id, uaddr); 4915 - 4916 - err = get_invariant_sys_reg(reg->id, uaddr); 4917 - if (err != -ENOENT) 4918 - return err; 4919 4724 4920 4725 return kvm_sys_reg_get_user(vcpu, reg, 4921 4726 sys_reg_descs, ARRAY_SIZE(sys_reg_descs)); ··· 4947 4762 int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) 4948 4763 { 4949 4764 void __user *uaddr = (void __user *)(unsigned long)reg->addr; 4950 - int err; 4951 4765 4952 4766 if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_DEMUX) 4953 4767 return demux_c15_set(vcpu, reg->id, uaddr); 4954 - 4955 - err = set_invariant_sys_reg(reg->id, uaddr); 4956 - if (err != -ENOENT) 4957 - return err; 4958 4768 4959 4769 return kvm_sys_reg_set_user(vcpu, reg, 4960 4770 sys_reg_descs, ARRAY_SIZE(sys_reg_descs)); ··· 5039 4859 5040 4860 unsigned long kvm_arm_num_sys_reg_descs(struct kvm_vcpu *vcpu) 5041 4861 { 5042 - return ARRAY_SIZE(invariant_sys_regs) 5043 - + num_demux_regs() 4862 + return num_demux_regs() 5044 4863 + walk_sys_regs(vcpu, (u64 __user *)NULL); 5045 4864 } 5046 4865 5047 4866 int kvm_arm_copy_sys_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices) 5048 4867 { 5049 - unsigned int i; 5050 4868 int err; 5051 - 5052 - /* Then give them all the invariant registers' indices. 
*/ 5053 - for (i = 0; i < ARRAY_SIZE(invariant_sys_regs); i++) { 5054 - if (put_user(sys_reg_to_index(&invariant_sys_regs[i]), uindices)) 5055 - return -EFAULT; 5056 - uindices++; 5057 - } 5058 4869 5059 4870 err = walk_sys_regs(vcpu, uindices); 5060 4871 if (err < 0) ··· 5142 4971 mutex_lock(&kvm->arch.config_lock); 5143 4972 vcpu_set_hcr(vcpu); 5144 4973 vcpu_set_ich_hcr(vcpu); 5145 - 5146 - if (cpus_have_final_cap(ARM64_HAS_HCX)) { 5147 - /* 5148 - * In general, all HCRX_EL2 bits are gated by a feature. 5149 - * The only reason we can set SMPME without checking any 5150 - * feature is that its effects are not directly observable 5151 - * from the guest. 5152 - */ 5153 - vcpu->arch.hcrx_el2 = HCRX_EL2_SMPME; 5154 - 5155 - if (kvm_has_feat(kvm, ID_AA64ISAR2_EL1, MOPS, IMP)) 5156 - vcpu->arch.hcrx_el2 |= (HCRX_EL2_MSCEn | HCRX_EL2_MCE2); 5157 - 5158 - if (kvm_has_tcr2(kvm)) 5159 - vcpu->arch.hcrx_el2 |= HCRX_EL2_TCR2En; 5160 - 5161 - if (kvm_has_fpmr(kvm)) 5162 - vcpu->arch.hcrx_el2 |= HCRX_EL2_EnFPM; 5163 - } 4974 + vcpu_set_hcrx(vcpu); 5164 4975 5165 4976 if (test_bit(KVM_ARCH_FLAG_FGU_INITIALIZED, &kvm->arch.flags)) 5166 4977 goto out; ··· 5254 5101 valid &= check_sysreg_table(cp14_64_regs, ARRAY_SIZE(cp14_64_regs), true); 5255 5102 valid &= check_sysreg_table(cp15_regs, ARRAY_SIZE(cp15_regs), true); 5256 5103 valid &= check_sysreg_table(cp15_64_regs, ARRAY_SIZE(cp15_64_regs), true); 5257 - valid &= check_sysreg_table(invariant_sys_regs, ARRAY_SIZE(invariant_sys_regs), false); 5258 5104 valid &= check_sysreg_table(sys_insn_descs, ARRAY_SIZE(sys_insn_descs), false); 5259 5105 5260 5106 if (!valid) 5261 5107 return -EINVAL; 5262 5108 5263 - /* We abuse the reset function to overwrite the table itself. */ 5264 - for (i = 0; i < ARRAY_SIZE(invariant_sys_regs); i++) 5265 - invariant_sys_regs[i].reset(NULL, &invariant_sys_regs[i]); 5109 + init_imp_id_regs(); 5266 5110 5267 5111 ret = populate_nv_trap_config(); 5268 5112
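With MIDR_EL1, REVIDR_EL1 and AIDR_EL1 handled as VM-scoped, optionally writable registers above, userspace can overwrite them before the first KVM_RUN. A hypothetical userspace sketch, assuming the writable-implementation-ID capability tracked by KVM_ARCH_FLAG_WRITABLE_IMP_ID_REGS has already been enabled on the VM and that no vCPU has run yet:

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>
#include <asm/kvm.h>

/* MIDR_EL1 is op0=3, op1=0, CRn=0, CRm=0, op2=0 in the uapi encoding. */
static int set_guest_midr(int vcpu_fd, uint64_t midr)
{
	struct kvm_one_reg reg = {
		.id   = ARM64_SYS_REG(3, 0, 0, 0, 0),
		.addr = (uint64_t)&midr,
	};

	return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
}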
+10
arch/arm64/kvm/sys_regs.h
··· 247 247 CRn(sys_reg_CRn(reg)), CRm(sys_reg_CRm(reg)), \ 248 248 Op2(sys_reg_Op2(reg)) 249 249 250 + #define ID_REG_LIMIT_FIELD_ENUM(val, reg, field, limit) \ 251 + ({ \ 252 + u64 __f_val = FIELD_GET(reg##_##field##_MASK, val); \ 253 + (val) &= ~reg##_##field##_MASK; \ 254 + (val) |= FIELD_PREP(reg##_##field##_MASK, \ 255 + min(__f_val, \ 256 + (u64)SYS_FIELD_VALUE(reg, field, limit))); \ 257 + (val); \ 258 + }) 259 + 250 260 #endif /* __ARM64_KVM_SYS_REGS_LOCAL_H__ */
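The ID_REG_LIMIT_FIELD_ENUM() helper added above clamps one enumerated field of an ID register to an upper limit, using the generated <reg>_<field>_MASK and SYS_FIELD_VALUE() definitions. A minimal usage sketch follows; the register, field and limit are chosen for illustration and are not taken from the sys_regs.c hunks:

/* Hypothetical caller: never advertise more than PMUv3 for Armv8.1. */
static u64 limit_guest_pmuver(u64 val)
{
        val = ID_REG_LIMIT_FIELD_ENUM(val, ID_AA64DFR0_EL1, PMUVer, V3P1);
        return val;
}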
+4 -4
arch/arm64/kvm/vgic-sys-reg-v3.c
··· 35 35 36 36 vgic_v3_cpu->num_id_bits = host_id_bits; 37 37 38 - host_seis = FIELD_GET(ICH_VTR_SEIS_MASK, kvm_vgic_global_state.ich_vtr_el2); 38 + host_seis = FIELD_GET(ICH_VTR_EL2_SEIS, kvm_vgic_global_state.ich_vtr_el2); 39 39 seis = FIELD_GET(ICC_CTLR_EL1_SEIS_MASK, val); 40 40 if (host_seis != seis) 41 41 return -EINVAL; 42 42 43 - host_a3v = FIELD_GET(ICH_VTR_A3V_MASK, kvm_vgic_global_state.ich_vtr_el2); 43 + host_a3v = FIELD_GET(ICH_VTR_EL2_A3V, kvm_vgic_global_state.ich_vtr_el2); 44 44 a3v = FIELD_GET(ICC_CTLR_EL1_A3V_MASK, val); 45 45 if (host_a3v != a3v) 46 46 return -EINVAL; ··· 68 68 val |= FIELD_PREP(ICC_CTLR_EL1_PRI_BITS_MASK, vgic_v3_cpu->num_pri_bits - 1); 69 69 val |= FIELD_PREP(ICC_CTLR_EL1_ID_BITS_MASK, vgic_v3_cpu->num_id_bits); 70 70 val |= FIELD_PREP(ICC_CTLR_EL1_SEIS_MASK, 71 - FIELD_GET(ICH_VTR_SEIS_MASK, 71 + FIELD_GET(ICH_VTR_EL2_SEIS, 72 72 kvm_vgic_global_state.ich_vtr_el2)); 73 73 val |= FIELD_PREP(ICC_CTLR_EL1_A3V_MASK, 74 - FIELD_GET(ICH_VTR_A3V_MASK, kvm_vgic_global_state.ich_vtr_el2)); 74 + FIELD_GET(ICH_VTR_EL2_A3V, kvm_vgic_global_state.ich_vtr_el2)); 75 75 /* 76 76 * The VMCR.CTLR value is in ICC_CTLR_EL1 layout. 77 77 * Extract it directly using ICC_CTLR_EL1 reg definitions.
+29
arch/arm64/kvm/vgic/vgic-init.c
··· 198 198 return 0; 199 199 } 200 200 201 + /* Default GICv3 Maintenance Interrupt INTID, as per SBSA */ 202 + #define DEFAULT_MI_INTID 25 203 + 204 + int kvm_vgic_vcpu_nv_init(struct kvm_vcpu *vcpu) 205 + { 206 + int ret; 207 + 208 + guard(mutex)(&vcpu->kvm->arch.config_lock); 209 + 210 + /* 211 + * Matching the tradition established with the timers, provide 212 + * a default PPI for the maintenance interrupt. It makes 213 + * things easier to reason about. 214 + */ 215 + if (vcpu->kvm->arch.vgic.mi_intid == 0) 216 + vcpu->kvm->arch.vgic.mi_intid = DEFAULT_MI_INTID; 217 + ret = kvm_vgic_set_owner(vcpu, vcpu->kvm->arch.vgic.mi_intid, vcpu); 218 + 219 + return ret; 220 + } 221 + 201 222 static int vgic_allocate_private_irqs_locked(struct kvm_vcpu *vcpu, u32 type) 202 223 { 203 224 struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; ··· 609 588 610 589 static irqreturn_t vgic_maintenance_handler(int irq, void *data) 611 590 { 591 + struct kvm_vcpu *vcpu = *(struct kvm_vcpu **)data; 592 + 612 593 /* 613 594 * We cannot rely on the vgic maintenance interrupt to be 614 595 * delivered synchronously. This means we can only use it to 615 596 * exit the VM, and we perform the handling of EOIed 616 597 * interrupts on the exit path (see vgic_fold_lr_state). 598 + * 599 + * Of course, NV throws a wrench in this plan, and needs 600 + * something special. 617 601 */ 602 + if (vcpu && vgic_state_is_nested(vcpu)) 603 + vgic_v3_handle_nested_maint_irq(vcpu); 604 + 618 605 return IRQ_HANDLED; 619 606 } 620 607
+27 -2
arch/arm64/kvm/vgic/vgic-kvm-device.c
··· 303 303 VGIC_NR_PRIVATE_IRQS, uaddr); 304 304 break; 305 305 } 306 + case KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ: { 307 + u32 __user *uaddr = (u32 __user *)(long)attr->addr; 308 + 309 + r = put_user(dev->kvm->arch.vgic.mi_intid, uaddr); 310 + break; 311 + } 306 312 } 307 313 308 314 return r; ··· 523 517 struct vgic_reg_attr reg_attr; 524 518 gpa_t addr; 525 519 struct kvm_vcpu *vcpu; 526 - bool uaccess; 520 + bool uaccess, post_init = true; 527 521 u32 val; 528 522 int ret; 529 523 ··· 539 533 /* Sysregs uaccess is performed by the sysreg handling code */ 540 534 uaccess = false; 541 535 break; 536 + case KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ: 537 + post_init = false; 538 + fallthrough; 542 539 default: 543 540 uaccess = true; 544 541 } ··· 561 552 562 553 mutex_lock(&dev->kvm->arch.config_lock); 563 554 564 - if (unlikely(!vgic_initialized(dev->kvm))) { 555 + if (post_init != vgic_initialized(dev->kvm)) { 565 556 ret = -EBUSY; 566 557 goto out; 567 558 } ··· 591 582 } 592 583 break; 593 584 } 585 + case KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ: 586 + if (!is_write) { 587 + val = dev->kvm->arch.vgic.mi_intid; 588 + ret = 0; 589 + break; 590 + } 591 + 592 + ret = -EINVAL; 593 + if ((val < VGIC_NR_PRIVATE_IRQS) && (val >= VGIC_NR_SGIS)) { 594 + dev->kvm->arch.vgic.mi_intid = val; 595 + ret = 0; 596 + } 597 + break; 594 598 default: 595 599 ret = -EINVAL; 596 600 break; ··· 630 608 case KVM_DEV_ARM_VGIC_GRP_REDIST_REGS: 631 609 case KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS: 632 610 case KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO: 611 + case KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ: 633 612 return vgic_v3_attr_regs_access(dev, attr, true); 634 613 default: 635 614 return vgic_set_common_attr(dev, attr); ··· 645 622 case KVM_DEV_ARM_VGIC_GRP_REDIST_REGS: 646 623 case KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS: 647 624 case KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO: 625 + case KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ: 648 626 return vgic_v3_attr_regs_access(dev, attr, false); 649 627 default: 650 628 return vgic_get_common_attr(dev, attr); ··· 669 645 case KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS: 670 646 return vgic_v3_has_attr_regs(dev, attr); 671 647 case KVM_DEV_ARM_VGIC_GRP_NR_IRQS: 648 + case KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ: 672 649 return 0; 673 650 case KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO: { 674 651 if (((attr->attr & KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK) >>
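The new KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ device group lets a VMM pick the PPI that a nested hypervisor sees as the GIC maintenance interrupt: the handlers above copy a u32 to/from userspace, only accept INTIDs with VGIC_NR_SGIS <= intid < VGIC_NR_PRIVATE_IRQS, and the write path (post_init = false) requires the vgic not to be initialised yet. A rough userspace sketch, assuming vgic_fd is the fd returned by KVM_CREATE_DEVICE for the GICv3 device and that the attr field can be left at zero (its exact encoding is not visible in this hunk):

#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <stdint.h>

static int set_nested_maint_irq(int vgic_fd, uint32_t intid)
{
        struct kvm_device_attr attr = {
                .group = KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ,
                .attr  = 0,                              /* assumed, see note above */
                .addr  = (uint64_t)(unsigned long)&intid,
        };

        /* e.g. intid = 25, the SBSA default picked by kvm_vgic_vcpu_nv_init() */
        return ioctl(vgic_fd, KVM_SET_DEVICE_ATTR, &attr);
}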
+409
arch/arm64/kvm/vgic/vgic-v3-nested.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + 3 + #include <linux/cpu.h> 4 + #include <linux/kvm.h> 5 + #include <linux/kvm_host.h> 6 + #include <linux/interrupt.h> 7 + #include <linux/io.h> 8 + #include <linux/uaccess.h> 9 + 10 + #include <kvm/arm_vgic.h> 11 + 12 + #include <asm/kvm_arm.h> 13 + #include <asm/kvm_emulate.h> 14 + #include <asm/kvm_nested.h> 15 + 16 + #include "vgic.h" 17 + 18 + #define ICH_LRN(n) (ICH_LR0_EL2 + (n)) 19 + #define ICH_AP0RN(n) (ICH_AP0R0_EL2 + (n)) 20 + #define ICH_AP1RN(n) (ICH_AP1R0_EL2 + (n)) 21 + 22 + struct mi_state { 23 + u16 eisr; 24 + u16 elrsr; 25 + bool pend; 26 + }; 27 + 28 + /* 29 + * The shadow registers loaded to the hardware when running a L2 guest 30 + * with the virtual IMO/FMO bits set. 31 + */ 32 + struct shadow_if { 33 + struct vgic_v3_cpu_if cpuif; 34 + unsigned long lr_map; 35 + }; 36 + 37 + static DEFINE_PER_CPU(struct shadow_if, shadow_if); 38 + 39 + /* 40 + * Nesting GICv3 support 41 + * 42 + * On a non-nesting VM (only running at EL0/EL1), the host hypervisor 43 + * completely controls the interrupts injected via the list registers. 44 + * Consequently, most of the state that is modified by the guest (by ACK-ing 45 + * and EOI-ing interrupts) is synced by KVM on each entry/exit, so that we 46 + * keep a semi-consistent view of the interrupts. 47 + * 48 + * This still applies for a NV guest, but only while "InHost" (either 49 + * running at EL2, or at EL0 with HCR_EL2.{E2H,TGE}=={1,1}). 50 + * 51 + * When running a L2 guest ("not InHost"), things are radically different, 52 + * as the L1 guest is in charge of provisioning the interrupts via its own 53 + * view of the ICH_LR*_EL2 registers, which conveniently live in the VNCR 54 + * page. This means that the flow described above does work (there is no 55 + * state to rebuild in the L0 hypervisor), and that most things happen on L2 56 + * load/put: 57 + * 58 + * - on L2 load: move the in-memory L1 vGIC configuration into a shadow, 59 + * per-CPU data structure that is used to populate the actual LRs. This is 60 + * an extra copy that we could avoid, but life is short. In the process, 61 + * we remap any interrupt that has the HW bit set to the mapped interrupt 62 + * on the host, should the host consider it a HW one. This allows the HW 63 + * deactivation to take its course, such as for the timer. 64 + * 65 + * - on L2 put: perform the inverse transformation, so that the result of L2 66 + * running becomes visible to L1 in the VNCR-accessible registers. 67 + * 68 + * - there is nothing to do on L2 entry, as everything will have happened 69 + * on load. However, this is the point where we detect that an interrupt 70 + * is targeting L1 and prepare the grand switcheroo. 71 + * 72 + * - on L2 exit: emulate the HW bit, and deactivate the corresponding L1 73 + * interrupt. The L0 active state will be cleared by the HW if the L1 74 + * interrupt was itself backed by a HW interrupt. 75 + * 76 + * Maintenance Interrupt (MI) management: 77 + * 78 + * Since the L2 guest runs the vgic in its full glory, MIs get delivered and 79 + * used as a handover point between L2 and L1. 80 + * 81 + * - on delivery of a MI to L0 while L2 is running: make the L1 MI pending, 82 + * and let it rip. This will initiate a vcpu_put() on L2, and allow L1 to 83 + * run and process the MI. 84 + * 85 + * - L1 MI is a fully virtual interrupt, not linked to the host's MI. Its 86 + * state must be computed at each entry/exit of the guest, much like we do 87 + * it for the PMU interrupt.
88 + * 89 + * - because most of the ICH_*_EL2 registers live in the VNCR page, the 90 + * quality of emulation is poor: L1 can setup the vgic so that an MI would 91 + * immediately fire, and not observe anything until the next exit. Trying 92 + * to read ICH_MISR_EL2 would do the trick, for example. 93 + * 94 + * System register emulation: 95 + * 96 + * We get two classes of registers: 97 + * 98 + * - those backed by memory (LRs, APRs, HCR, VMCR): L1 can freely access 99 + * them, and L0 doesn't see a thing. 100 + * 101 + * - those that always trap (ELRSR, EISR, MISR): these are status registers 102 + * that are built on the fly based on the in-memory state. 103 + * 104 + * Only L1 can access the ICH_*_EL2 registers. A non-NV L2 obviously cannot, 105 + * and a NV L2 would either access the VNCR page provided by L1 (memory 106 + * based registers), or see the access redirected to L1 (registers that 107 + * trap) thanks to NV being set by L1. 108 + */ 109 + 110 + bool vgic_state_is_nested(struct kvm_vcpu *vcpu) 111 + { 112 + u64 xmo; 113 + 114 + if (vcpu_has_nv(vcpu) && !is_hyp_ctxt(vcpu)) { 115 + xmo = __vcpu_sys_reg(vcpu, HCR_EL2) & (HCR_IMO | HCR_FMO); 116 + WARN_ONCE(xmo && xmo != (HCR_IMO | HCR_FMO), 117 + "Separate virtual IRQ/FIQ settings not supported\n"); 118 + 119 + return !!xmo; 120 + } 121 + 122 + return false; 123 + } 124 + 125 + static struct shadow_if *get_shadow_if(void) 126 + { 127 + return this_cpu_ptr(&shadow_if); 128 + } 129 + 130 + static bool lr_triggers_eoi(u64 lr) 131 + { 132 + return !(lr & (ICH_LR_STATE | ICH_LR_HW)) && (lr & ICH_LR_EOI); 133 + } 134 + 135 + static void vgic_compute_mi_state(struct kvm_vcpu *vcpu, struct mi_state *mi_state) 136 + { 137 + u16 eisr = 0, elrsr = 0; 138 + bool pend = false; 139 + 140 + for (int i = 0; i < kvm_vgic_global_state.nr_lr; i++) { 141 + u64 lr = __vcpu_sys_reg(vcpu, ICH_LRN(i)); 142 + 143 + if (lr_triggers_eoi(lr)) 144 + eisr |= BIT(i); 145 + if (!(lr & ICH_LR_STATE)) 146 + elrsr |= BIT(i); 147 + pend |= (lr & ICH_LR_PENDING_BIT); 148 + } 149 + 150 + mi_state->eisr = eisr; 151 + mi_state->elrsr = elrsr; 152 + mi_state->pend = pend; 153 + } 154 + 155 + u16 vgic_v3_get_eisr(struct kvm_vcpu *vcpu) 156 + { 157 + struct mi_state mi_state; 158 + 159 + vgic_compute_mi_state(vcpu, &mi_state); 160 + return mi_state.eisr; 161 + } 162 + 163 + u16 vgic_v3_get_elrsr(struct kvm_vcpu *vcpu) 164 + { 165 + struct mi_state mi_state; 166 + 167 + vgic_compute_mi_state(vcpu, &mi_state); 168 + return mi_state.elrsr; 169 + } 170 + 171 + u64 vgic_v3_get_misr(struct kvm_vcpu *vcpu) 172 + { 173 + struct mi_state mi_state; 174 + u64 reg = 0, hcr, vmcr; 175 + 176 + hcr = __vcpu_sys_reg(vcpu, ICH_HCR_EL2); 177 + vmcr = __vcpu_sys_reg(vcpu, ICH_VMCR_EL2); 178 + 179 + vgic_compute_mi_state(vcpu, &mi_state); 180 + 181 + if (mi_state.eisr) 182 + reg |= ICH_MISR_EL2_EOI; 183 + 184 + if (__vcpu_sys_reg(vcpu, ICH_HCR_EL2) & ICH_HCR_EL2_UIE) { 185 + int used_lrs = kvm_vgic_global_state.nr_lr; 186 + 187 + used_lrs -= hweight16(mi_state.elrsr); 188 + reg |= (used_lrs <= 1) ? 
ICH_MISR_EL2_U : 0; 189 + } 190 + 191 + if ((hcr & ICH_HCR_EL2_LRENPIE) && FIELD_GET(ICH_HCR_EL2_EOIcount_MASK, hcr)) 192 + reg |= ICH_MISR_EL2_LRENP; 193 + 194 + if ((hcr & ICH_HCR_EL2_NPIE) && !mi_state.pend) 195 + reg |= ICH_MISR_EL2_NP; 196 + 197 + if ((hcr & ICH_HCR_EL2_VGrp0EIE) && (vmcr & ICH_VMCR_ENG0_MASK)) 198 + reg |= ICH_MISR_EL2_VGrp0E; 199 + 200 + if ((hcr & ICH_HCR_EL2_VGrp0DIE) && !(vmcr & ICH_VMCR_ENG0_MASK)) 201 + reg |= ICH_MISR_EL2_VGrp0D; 202 + 203 + if ((hcr & ICH_HCR_EL2_VGrp1EIE) && (vmcr & ICH_VMCR_ENG1_MASK)) 204 + reg |= ICH_MISR_EL2_VGrp1E; 205 + 206 + if ((hcr & ICH_HCR_EL2_VGrp1DIE) && !(vmcr & ICH_VMCR_ENG1_MASK)) 207 + reg |= ICH_MISR_EL2_VGrp1D; 208 + 209 + return reg; 210 + } 211 + 212 + /* 213 + * For LRs which have HW bit set such as timer interrupts, we modify them to 214 + * have the host hardware interrupt number instead of the virtual one programmed 215 + * by the guest hypervisor. 216 + */ 217 + static void vgic_v3_create_shadow_lr(struct kvm_vcpu *vcpu, 218 + struct vgic_v3_cpu_if *s_cpu_if) 219 + { 220 + unsigned long lr_map = 0; 221 + int index = 0; 222 + 223 + for (int i = 0; i < kvm_vgic_global_state.nr_lr; i++) { 224 + u64 lr = __vcpu_sys_reg(vcpu, ICH_LRN(i)); 225 + struct vgic_irq *irq; 226 + 227 + if (!(lr & ICH_LR_STATE)) 228 + lr = 0; 229 + 230 + if (!(lr & ICH_LR_HW)) 231 + goto next; 232 + 233 + /* We have the HW bit set, check for validity of pINTID */ 234 + irq = vgic_get_vcpu_irq(vcpu, FIELD_GET(ICH_LR_PHYS_ID_MASK, lr)); 235 + if (!irq || !irq->hw || irq->intid > VGIC_MAX_SPI ) { 236 + /* There was no real mapping, so nuke the HW bit */ 237 + lr &= ~ICH_LR_HW; 238 + if (irq) 239 + vgic_put_irq(vcpu->kvm, irq); 240 + goto next; 241 + } 242 + 243 + /* It is illegal to have the EOI bit set with HW */ 244 + lr &= ~ICH_LR_EOI; 245 + 246 + /* Translate the virtual mapping to the real one */ 247 + lr &= ~ICH_LR_PHYS_ID_MASK; 248 + lr |= FIELD_PREP(ICH_LR_PHYS_ID_MASK, (u64)irq->hwintid); 249 + 250 + vgic_put_irq(vcpu->kvm, irq); 251 + 252 + next: 253 + s_cpu_if->vgic_lr[index] = lr; 254 + if (lr) { 255 + lr_map |= BIT(i); 256 + index++; 257 + } 258 + } 259 + 260 + container_of(s_cpu_if, struct shadow_if, cpuif)->lr_map = lr_map; 261 + s_cpu_if->used_lrs = index; 262 + } 263 + 264 + void vgic_v3_sync_nested(struct kvm_vcpu *vcpu) 265 + { 266 + struct shadow_if *shadow_if = get_shadow_if(); 267 + int i, index = 0; 268 + 269 + for_each_set_bit(i, &shadow_if->lr_map, kvm_vgic_global_state.nr_lr) { 270 + u64 lr = __vcpu_sys_reg(vcpu, ICH_LRN(i)); 271 + struct vgic_irq *irq; 272 + 273 + if (!(lr & ICH_LR_HW) || !(lr & ICH_LR_STATE)) 274 + goto next; 275 + 276 + /* 277 + * If we had a HW lr programmed by the guest hypervisor, we 278 + * need to emulate the HW effect between the guest hypervisor 279 + * and the nested guest. 280 + */ 281 + irq = vgic_get_vcpu_irq(vcpu, FIELD_GET(ICH_LR_PHYS_ID_MASK, lr)); 282 + if (WARN_ON(!irq)) /* Shouldn't happen as we check on load */ 283 + goto next; 284 + 285 + lr = __gic_v3_get_lr(index); 286 + if (!(lr & ICH_LR_STATE)) 287 + irq->active = false; 288 + 289 + vgic_put_irq(vcpu->kvm, irq); 290 + next: 291 + index++; 292 + } 293 + } 294 + 295 + static void vgic_v3_create_shadow_state(struct kvm_vcpu *vcpu, 296 + struct vgic_v3_cpu_if *s_cpu_if) 297 + { 298 + struct vgic_v3_cpu_if *host_if = &vcpu->arch.vgic_cpu.vgic_v3; 299 + u64 val = 0; 300 + int i; 301 + 302 + /* 303 + * If we're on a system with a broken vgic that requires 304 + * trapping, propagate the trapping requirements. 
305 + * 306 + * Ah, the smell of rotten fruits... 307 + */ 308 + if (static_branch_unlikely(&vgic_v3_cpuif_trap)) 309 + val = host_if->vgic_hcr & (ICH_HCR_EL2_TALL0 | ICH_HCR_EL2_TALL1 | 310 + ICH_HCR_EL2_TC | ICH_HCR_EL2_TDIR); 311 + s_cpu_if->vgic_hcr = __vcpu_sys_reg(vcpu, ICH_HCR_EL2) | val; 312 + s_cpu_if->vgic_vmcr = __vcpu_sys_reg(vcpu, ICH_VMCR_EL2); 313 + s_cpu_if->vgic_sre = host_if->vgic_sre; 314 + 315 + for (i = 0; i < 4; i++) { 316 + s_cpu_if->vgic_ap0r[i] = __vcpu_sys_reg(vcpu, ICH_AP0RN(i)); 317 + s_cpu_if->vgic_ap1r[i] = __vcpu_sys_reg(vcpu, ICH_AP1RN(i)); 318 + } 319 + 320 + vgic_v3_create_shadow_lr(vcpu, s_cpu_if); 321 + } 322 + 323 + void vgic_v3_load_nested(struct kvm_vcpu *vcpu) 324 + { 325 + struct shadow_if *shadow_if = get_shadow_if(); 326 + struct vgic_v3_cpu_if *cpu_if = &shadow_if->cpuif; 327 + 328 + BUG_ON(!vgic_state_is_nested(vcpu)); 329 + 330 + vgic_v3_create_shadow_state(vcpu, cpu_if); 331 + 332 + __vgic_v3_restore_vmcr_aprs(cpu_if); 333 + __vgic_v3_activate_traps(cpu_if); 334 + 335 + __vgic_v3_restore_state(cpu_if); 336 + 337 + /* 338 + * Propagate the number of used LRs for the benefit of the HYP 339 + * GICv3 emulation code. Yes, this is a pretty sorry hack. 340 + */ 341 + vcpu->arch.vgic_cpu.vgic_v3.used_lrs = cpu_if->used_lrs; 342 + } 343 + 344 + void vgic_v3_put_nested(struct kvm_vcpu *vcpu) 345 + { 346 + struct shadow_if *shadow_if = get_shadow_if(); 347 + struct vgic_v3_cpu_if *s_cpu_if = &shadow_if->cpuif; 348 + u64 val; 349 + int i; 350 + 351 + __vgic_v3_save_vmcr_aprs(s_cpu_if); 352 + __vgic_v3_deactivate_traps(s_cpu_if); 353 + __vgic_v3_save_state(s_cpu_if); 354 + 355 + /* 356 + * Translate the shadow state HW fields back to the virtual ones 357 + * before copying the shadow struct back to the nested one. 358 + */ 359 + val = __vcpu_sys_reg(vcpu, ICH_HCR_EL2); 360 + val &= ~ICH_HCR_EL2_EOIcount_MASK; 361 + val |= (s_cpu_if->vgic_hcr & ICH_HCR_EL2_EOIcount_MASK); 362 + __vcpu_sys_reg(vcpu, ICH_HCR_EL2) = val; 363 + __vcpu_sys_reg(vcpu, ICH_VMCR_EL2) = s_cpu_if->vgic_vmcr; 364 + 365 + for (i = 0; i < 4; i++) { 366 + __vcpu_sys_reg(vcpu, ICH_AP0RN(i)) = s_cpu_if->vgic_ap0r[i]; 367 + __vcpu_sys_reg(vcpu, ICH_AP1RN(i)) = s_cpu_if->vgic_ap1r[i]; 368 + } 369 + 370 + for_each_set_bit(i, &shadow_if->lr_map, kvm_vgic_global_state.nr_lr) { 371 + val = __vcpu_sys_reg(vcpu, ICH_LRN(i)); 372 + 373 + val &= ~ICH_LR_STATE; 374 + val |= s_cpu_if->vgic_lr[i] & ICH_LR_STATE; 375 + 376 + __vcpu_sys_reg(vcpu, ICH_LRN(i)) = val; 377 + s_cpu_if->vgic_lr[i] = 0; 378 + } 379 + 380 + shadow_if->lr_map = 0; 381 + vcpu->arch.vgic_cpu.vgic_v3.used_lrs = 0; 382 + } 383 + 384 + /* 385 + * If we exit a L2 VM with a pending maintenance interrupt from the GIC, 386 + * then we need to forward this to L1 so that it can re-sync the appropriate 387 + * LRs and sample level triggered interrupts again. 
388 + */ 389 + void vgic_v3_handle_nested_maint_irq(struct kvm_vcpu *vcpu) 390 + { 391 + bool state = read_sysreg_s(SYS_ICH_MISR_EL2); 392 + 393 + /* This will force a switch back to L1 if the level is high */ 394 + kvm_vgic_inject_irq(vcpu->kvm, vcpu, 395 + vcpu->kvm->arch.vgic.mi_intid, state, vcpu); 396 + 397 + sysreg_clear_set_s(SYS_ICH_HCR_EL2, ICH_HCR_EL2_En, 0); 398 + } 399 + 400 + void vgic_v3_nested_update_mi(struct kvm_vcpu *vcpu) 401 + { 402 + bool level; 403 + 404 + level = __vcpu_sys_reg(vcpu, ICH_HCR_EL2) & ICH_HCR_EL2_En; 405 + if (level) 406 + level &= vgic_v3_get_misr(vcpu); 407 + kvm_vgic_inject_irq(vcpu->kvm, vcpu, 408 + vcpu->kvm->arch.vgic.mi_intid, level, vcpu); 409 + }
+28 -18
arch/arm64/kvm/vgic/vgic-v3.c
··· 24 24 { 25 25 struct vgic_v3_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v3; 26 26 27 - cpuif->vgic_hcr |= ICH_HCR_UIE; 27 + cpuif->vgic_hcr |= ICH_HCR_EL2_UIE; 28 28 } 29 29 30 30 static bool lr_signals_eoi_mi(u64 lr_val) ··· 42 42 43 43 DEBUG_SPINLOCK_BUG_ON(!irqs_disabled()); 44 44 45 - cpuif->vgic_hcr &= ~ICH_HCR_UIE; 45 + cpuif->vgic_hcr &= ~ICH_HCR_EL2_UIE; 46 46 47 47 for (lr = 0; lr < cpuif->used_lrs; lr++) { 48 48 u64 val = cpuif->vgic_lr[lr]; ··· 284 284 vgic_v3->vgic_sre = 0; 285 285 } 286 286 287 - vcpu->arch.vgic_cpu.num_id_bits = (kvm_vgic_global_state.ich_vtr_el2 & 288 - ICH_VTR_ID_BITS_MASK) >> 289 - ICH_VTR_ID_BITS_SHIFT; 290 - vcpu->arch.vgic_cpu.num_pri_bits = ((kvm_vgic_global_state.ich_vtr_el2 & 291 - ICH_VTR_PRI_BITS_MASK) >> 292 - ICH_VTR_PRI_BITS_SHIFT) + 1; 287 + vcpu->arch.vgic_cpu.num_id_bits = FIELD_GET(ICH_VTR_EL2_IDbits, 288 + kvm_vgic_global_state.ich_vtr_el2); 289 + vcpu->arch.vgic_cpu.num_pri_bits = FIELD_GET(ICH_VTR_EL2_PRIbits, 290 + kvm_vgic_global_state.ich_vtr_el2) + 1; 293 291 294 292 /* Get the show on the road... */ 295 - vgic_v3->vgic_hcr = ICH_HCR_EN; 293 + vgic_v3->vgic_hcr = ICH_HCR_EL2_En; 296 294 } 297 295 298 296 void vcpu_set_ich_hcr(struct kvm_vcpu *vcpu) ··· 299 301 300 302 /* Hide GICv3 sysreg if necessary */ 301 303 if (!kvm_has_gicv3(vcpu->kvm)) { 302 - vgic_v3->vgic_hcr |= ICH_HCR_TALL0 | ICH_HCR_TALL1 | ICH_HCR_TC; 304 + vgic_v3->vgic_hcr |= (ICH_HCR_EL2_TALL0 | ICH_HCR_EL2_TALL1 | 305 + ICH_HCR_EL2_TC); 303 306 return; 304 307 } 305 308 306 309 if (group0_trap) 307 - vgic_v3->vgic_hcr |= ICH_HCR_TALL0; 310 + vgic_v3->vgic_hcr |= ICH_HCR_EL2_TALL0; 308 311 if (group1_trap) 309 - vgic_v3->vgic_hcr |= ICH_HCR_TALL1; 312 + vgic_v3->vgic_hcr |= ICH_HCR_EL2_TALL1; 310 313 if (common_trap) 311 - vgic_v3->vgic_hcr |= ICH_HCR_TC; 314 + vgic_v3->vgic_hcr |= ICH_HCR_EL2_TC; 312 315 if (dir_trap) 313 - vgic_v3->vgic_hcr |= ICH_HCR_TDIR; 316 + vgic_v3->vgic_hcr |= ICH_HCR_EL2_TDIR; 314 317 } 315 318 316 319 int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq) ··· 631 632 632 633 static bool vgic_v3_broken_seis(void) 633 634 { 634 - return ((kvm_vgic_global_state.ich_vtr_el2 & ICH_VTR_SEIS_MASK) && 635 - is_midr_in_range_list(read_cpuid_id(), broken_seis)); 635 + return ((kvm_vgic_global_state.ich_vtr_el2 & ICH_VTR_EL2_SEIS) && 636 + is_midr_in_range_list(broken_seis)); 636 637 } 637 638 638 639 /** ··· 705 706 if (vgic_v3_broken_seis()) { 706 707 kvm_info("GICv3 with broken locally generated SEI\n"); 707 708 708 - kvm_vgic_global_state.ich_vtr_el2 &= ~ICH_VTR_SEIS_MASK; 709 + kvm_vgic_global_state.ich_vtr_el2 &= ~ICH_VTR_EL2_SEIS; 709 710 group0_trap = true; 710 711 group1_trap = true; 711 - if (ich_vtr_el2 & ICH_VTR_TDS_MASK) 712 + if (ich_vtr_el2 & ICH_VTR_EL2_TDS) 712 713 dir_trap = true; 713 714 else 714 715 common_trap = true; ··· 734 735 { 735 736 struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3; 736 737 738 + /* If the vgic is nested, perform the full state loading */ 739 + if (vgic_state_is_nested(vcpu)) { 740 + vgic_v3_load_nested(vcpu); 741 + return; 742 + } 743 + 737 744 if (likely(!is_protected_kvm_enabled())) 738 745 kvm_call_hyp(__vgic_v3_restore_vmcr_aprs, cpu_if); 739 746 ··· 752 747 void vgic_v3_put(struct kvm_vcpu *vcpu) 753 748 { 754 749 struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3; 750 + 751 + if (vgic_state_is_nested(vcpu)) { 752 + vgic_v3_put_nested(vcpu); 753 + return; 754 + } 755 755 756 756 if (likely(!is_protected_kvm_enabled())) 757 757 
kvm_call_hyp(__vgic_v3_save_vmcr_aprs, cpu_if);
+28 -7
arch/arm64/kvm/vgic/vgic-v4.c
··· 336 336 its_vm->vpes = NULL; 337 337 } 338 338 339 + static inline bool vgic_v4_want_doorbell(struct kvm_vcpu *vcpu) 340 + { 341 + if (vcpu_get_flag(vcpu, IN_WFI)) 342 + return true; 343 + 344 + if (likely(!vcpu_has_nv(vcpu))) 345 + return false; 346 + 347 + /* 348 + * GICv4 hardware is only ever used for the L1. Mark the vPE (i.e. the 349 + * L1 context) nonresident and request a doorbell to kick us out of the 350 + * L2 when an IRQ becomes pending. 351 + */ 352 + return vcpu_get_flag(vcpu, IN_NESTED_ERET); 353 + } 354 + 339 355 int vgic_v4_put(struct kvm_vcpu *vcpu) 340 356 { 341 357 struct its_vpe *vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe; ··· 359 343 if (!vgic_supports_direct_msis(vcpu->kvm) || !vpe->resident) 360 344 return 0; 361 345 362 - return its_make_vpe_non_resident(vpe, !!vcpu_get_flag(vcpu, IN_WFI)); 346 + return its_make_vpe_non_resident(vpe, vgic_v4_want_doorbell(vcpu)); 363 347 } 364 348 365 349 int vgic_v4_load(struct kvm_vcpu *vcpu) ··· 431 415 struct vgic_irq *irq; 432 416 struct its_vlpi_map map; 433 417 unsigned long flags; 434 - int ret; 418 + int ret = 0; 435 419 436 420 if (!vgic_supports_direct_msis(kvm)) 437 421 return 0; ··· 446 430 447 431 mutex_lock(&its->its_lock); 448 432 449 - /* Perform the actual DevID/EventID -> LPI translation. */ 450 - ret = vgic_its_resolve_lpi(kvm, its, irq_entry->msi.devid, 451 - irq_entry->msi.data, &irq); 452 - if (ret) 433 + /* 434 + * Perform the actual DevID/EventID -> LPI translation. 435 + * 436 + * Silently exit if translation fails as the guest (or userspace!) has 437 + * managed to do something stupid. Emulated LPI injection will still 438 + * work if the guest figures itself out at a later time. 439 + */ 440 + if (vgic_its_resolve_lpi(kvm, its, irq_entry->msi.devid, 441 + irq_entry->msi.data, &irq)) 453 442 goto out; 454 443 455 444 /* Silently exit if the vLPI is already mapped */ ··· 533 512 if (ret) 534 513 goto out; 535 514 536 - WARN_ON(!(irq->hw && irq->host_irq == virq)); 515 + WARN_ON(irq->hw && irq->host_irq != virq); 537 516 if (irq->hw) { 538 517 atomic_dec(&irq->target_vcpu->arch.vgic_cpu.vgic_v3.its_vpe.vlpi_count); 539 518 irq->hw = false;
+38
arch/arm64/kvm/vgic/vgic.c
··· 872 872 { 873 873 int used_lrs; 874 874 875 + /* If nesting, emulate the HW effect from L0 to L1 */ 876 + if (vgic_state_is_nested(vcpu)) { 877 + vgic_v3_sync_nested(vcpu); 878 + return; 879 + } 880 + 881 + if (vcpu_has_nv(vcpu)) 882 + vgic_v3_nested_update_mi(vcpu); 883 + 875 884 /* An empty ap_list_head implies used_lrs == 0 */ 876 885 if (list_empty(&vcpu->arch.vgic_cpu.ap_list_head)) 877 886 return; ··· 909 900 /* Flush our emulation state into the GIC hardware before entering the guest. */ 910 901 void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu) 911 902 { 903 + /* 904 + * If in a nested state, we must return early. Two possibilities: 905 + * 906 + * - If we have any pending IRQ for the guest and the guest 907 + * expects IRQs to be handled in its virtual EL2 mode (the 908 + * virtual IMO bit is set) and it is not already running in 909 + * virtual EL2 mode, then we have to emulate an IRQ 910 + * exception to virtual EL2. 911 + * 912 + * We do that by placing a request to ourselves which will 913 + * abort the entry procedure and inject the exception at the 914 + * beginning of the run loop. 915 + * 916 + * - Otherwise, do exactly *NOTHING*. The guest state is 917 + * already loaded, and we can carry on with running it. 918 + * 919 + * If we have NV, but are not in a nested state, compute the 920 + * maintenance interrupt state, as it may fire. 921 + */ 922 + if (vgic_state_is_nested(vcpu)) { 923 + if (kvm_vgic_vcpu_pending_irq(vcpu)) 924 + kvm_make_request(KVM_REQ_GUEST_HYP_IRQ_PENDING, vcpu); 925 + 926 + return; 927 + } 928 + 929 + if (vcpu_has_nv(vcpu)) 930 + vgic_v3_nested_update_mi(vcpu); 931 + 912 932 /* 913 933 * If there are no virtual interrupts active or pending for this 914 934 * VCPU, then there is no work to do and we can bail out without
+6
arch/arm64/kvm/vgic/vgic.h
··· 353 353 return kvm_has_feat(kvm, ID_AA64PFR0_EL1, GIC, IMP); 354 354 } 355 355 356 + void vgic_v3_sync_nested(struct kvm_vcpu *vcpu); 357 + void vgic_v3_load_nested(struct kvm_vcpu *vcpu); 358 + void vgic_v3_put_nested(struct kvm_vcpu *vcpu); 359 + void vgic_v3_handle_nested_maint_irq(struct kvm_vcpu *vcpu); 360 + void vgic_v3_nested_update_mi(struct kvm_vcpu *vcpu); 361 + 356 362 #endif
+2
arch/arm64/tools/cpucaps
··· 45 45 HAS_MOPS 46 46 HAS_NESTED_VIRT 47 47 HAS_PAN 48 + HAS_PMUV3 48 49 HAS_S1PIE 49 50 HAS_S1POE 50 51 HAS_RAS_EXTN ··· 105 104 WORKAROUND_CLEAN_CACHE 106 105 WORKAROUND_DEVICE_LOAD_ACQUIRE 107 106 WORKAROUND_NVIDIA_CARMEL_CNP 107 + WORKAROUND_PMUV3_IMPDEF_TRAPS 108 108 WORKAROUND_QCOM_FALKOR_E1003 109 109 WORKAROUND_QCOM_ORYON_CNTVOFF 110 110 WORKAROUND_REPEAT_TLBI
+48
arch/arm64/tools/sysreg
··· 3035 3035 Field 15:0 PhyPARTID28 3036 3036 EndSysreg 3037 3037 3038 + Sysreg ICH_HCR_EL2 3 4 12 11 0 3039 + Res0 63:32 3040 + Field 31:27 EOIcount 3041 + Res0 26:16 3042 + Field 15 DVIM 3043 + Field 14 TDIR 3044 + Field 13 TSEI 3045 + Field 12 TALL1 3046 + Field 11 TALL0 3047 + Field 10 TC 3048 + Res0 9 3049 + Field 8 vSGIEOICount 3050 + Field 7 VGrp1DIE 3051 + Field 6 VGrp1EIE 3052 + Field 5 VGrp0DIE 3053 + Field 4 VGrp0EIE 3054 + Field 3 NPIE 3055 + Field 2 LRENPIE 3056 + Field 1 UIE 3057 + Field 0 En 3058 + EndSysreg 3059 + 3060 + Sysreg ICH_VTR_EL2 3 4 12 11 1 3061 + Res0 63:32 3062 + Field 31:29 PRIbits 3063 + Field 28:26 PREbits 3064 + Field 25:23 IDbits 3065 + Field 22 SEIS 3066 + Field 21 A3V 3067 + Field 20 nV4 3068 + Field 19 TDS 3069 + Field 18 DVIM 3070 + Res0 17:5 3071 + Field 4:0 ListRegs 3072 + EndSysreg 3073 + 3074 + Sysreg ICH_MISR_EL2 3 4 12 11 2 3075 + Res0 63:8 3076 + Field 7 VGrp1D 3077 + Field 6 VGrp1E 3078 + Field 5 VGrp0D 3079 + Field 4 VGrp0E 3080 + Field 3 NP 3081 + Field 2 LRENP 3082 + Field 1 U 3083 + Field 0 EOI 3084 + EndSysreg 3085 + 3038 3086 Sysreg CONTEXTIDR_EL2 3 4 13 0 1 3039 3087 Fields CONTEXTIDR_ELx 3040 3088 EndSysreg
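The three Sysreg blocks added above are what provide the ICH_HCR_EL2_*, ICH_VTR_EL2_* and ICH_MISR_EL2_* names used throughout the vgic and irqchip hunks in this merge, replacing the hand-rolled ICH_HCR_*/ICH_VTR_*/ICH_MISR_* shift/mask pairs (the tools copy of which is deleted from sysreg.h further down). Roughly, each Field line becomes a mask definition in the generated header, which is what allows the open-coded shifts to be replaced by FIELD_GET(). An approximate sketch, not the literal generator output:

#include <linux/bits.h>
#include <linux/bitfield.h>

/* Approximation of the definitions generated from ICH_VTR_EL2 above. */
#define ICH_VTR_EL2_PRIbits     GENMASK(31, 29)
#define ICH_VTR_EL2_IDbits      GENMASK(25, 23)
#define ICH_VTR_EL2_SEIS        BIT(22)

/* ...which is why vgic-v3.c can now simply do: */
static inline unsigned int ich_vtr_pri_bits(u64 ich_vtr_el2)
{
        return FIELD_GET(ICH_VTR_EL2_PRIbits, ich_vtr_el2) + 1;
}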
+1 -1
drivers/clocksource/arm_arch_timer.c
··· 842 842 {}, 843 843 }; 844 844 845 - if (is_midr_in_range_list(read_cpuid_id(), broken_cval_midrs)) { 845 + if (is_midr_in_range_list(broken_cval_midrs)) { 846 846 pr_warn_once("Broken CNTx_CVAL_EL1, using 31 bit TVAL instead.\n"); 847 847 return CLOCKSOURCE_MASK(31); 848 848 }
+66
drivers/firmware/smccc/kvm_guest.c
··· 6 6 #include <linux/bitmap.h> 7 7 #include <linux/cache.h> 8 8 #include <linux/kernel.h> 9 + #include <linux/memblock.h> 9 10 #include <linux/string.h> 11 + 12 + #include <uapi/linux/psci.h> 10 13 11 14 #include <asm/hypervisor.h> 12 15 ··· 54 51 return test_bit(func_id, __kvm_arm_hyp_services); 55 52 } 56 53 EXPORT_SYMBOL_GPL(kvm_arm_hyp_service_available); 54 + 55 + #ifdef CONFIG_ARM64 56 + void __init kvm_arm_target_impl_cpu_init(void) 57 + { 58 + int i; 59 + u32 ver; 60 + u64 max_cpus; 61 + struct arm_smccc_res res; 62 + struct target_impl_cpu *target; 63 + 64 + if (!kvm_arm_hyp_service_available(ARM_SMCCC_KVM_FUNC_DISCOVER_IMPL_VER) || 65 + !kvm_arm_hyp_service_available(ARM_SMCCC_KVM_FUNC_DISCOVER_IMPL_CPUS)) 66 + return; 67 + 68 + arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_VER_FUNC_ID, 69 + 0, &res); 70 + if (res.a0 != SMCCC_RET_SUCCESS) 71 + return; 72 + 73 + /* Version info is in lower 32 bits and is in SMMCCC_VERSION format */ 74 + ver = lower_32_bits(res.a1); 75 + if (PSCI_VERSION_MAJOR(ver) != 1) { 76 + pr_warn("Unsupported target CPU implementation version v%d.%d\n", 77 + PSCI_VERSION_MAJOR(ver), PSCI_VERSION_MINOR(ver)); 78 + return; 79 + } 80 + 81 + if (!res.a2) { 82 + pr_warn("No target implementation CPUs specified\n"); 83 + return; 84 + } 85 + 86 + max_cpus = res.a2; 87 + target = memblock_alloc(sizeof(*target) * max_cpus, __alignof__(*target)); 88 + if (!target) { 89 + pr_warn("Not enough memory for struct target_impl_cpu\n"); 90 + return; 91 + } 92 + 93 + for (i = 0; i < max_cpus; i++) { 94 + arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_CPUS_FUNC_ID, 95 + i, &res); 96 + if (res.a0 != SMCCC_RET_SUCCESS) { 97 + pr_warn("Discovering target implementation CPUs failed\n"); 98 + goto mem_free; 99 + } 100 + target[i].midr = res.a1; 101 + target[i].revidr = res.a2; 102 + target[i].aidr = res.a3; 103 + }; 104 + 105 + if (!cpu_errata_set_target_impl(max_cpus, target)) { 106 + pr_warn("Failed to set target implementation CPUs\n"); 107 + goto mem_free; 108 + } 109 + 110 + pr_info("Number of target implementation CPUs is %lld\n", max_cpus); 111 + return; 112 + 113 + mem_free: 114 + memblock_free(target, sizeof(*target) * max_cpus); 115 + } 116 + #endif
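Note that the is_midr_in_range_list() callers elsewhere in this merge (vgic-v3.c, arm_arch_timer.c, coresight-etm4x-core.c) stop passing read_cpuid_id(), which fits with the MIDR/REVIDR/AIDR tuples being registered here through cpu_errata_set_target_impl(): once a target-implementation list exists, an erratum has to be considered applicable if any listed CPU matches, not just the one currently executing. The cpu_errata/cpufeature side is not part of this excerpt, so the following is only a conceptual sketch with hypothetical names:

/* Hypothetical illustration, not the actual arm64 cpu_errata code. */
static struct target_impl_cpu *target_cpus;     /* registered via cpu_errata_set_target_impl() */
static u64 nr_target_cpus;

static bool midr_matches_any_target(bool (*midr_in_range)(u64 midr))
{
        if (!nr_target_cpus)
                return midr_in_range(read_cpuid_id());

        for (u64 i = 0; i < nr_target_cpus; i++)
                if (midr_in_range(target_cpus[i].midr))
                        return true;

        return false;
}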
+1 -1
drivers/hwtracing/coresight/coresight-etm4x-core.c
··· 1216 1216 * recorded value for 'drvdata->ccitmin' to workaround 1217 1217 * this problem. 1218 1218 */ 1219 - if (is_midr_in_range_list(read_cpuid_id(), etm_wrong_ccitmin_cpus)) { 1219 + if (is_midr_in_range_list(etm_wrong_ccitmin_cpus)) { 1220 1220 if (drvdata->ccitmin == 256) 1221 1221 drvdata->ccitmin = 4; 1222 1222 }
+4 -4
drivers/irqchip/irq-apple-aic.c
··· 409 409 * in use, and be cleared when coming back from the handler. 410 410 */ 411 411 if (is_kernel_in_hyp_mode() && 412 - (read_sysreg_s(SYS_ICH_HCR_EL2) & ICH_HCR_EN) && 412 + (read_sysreg_s(SYS_ICH_HCR_EL2) & ICH_HCR_EL2_En) && 413 413 read_sysreg_s(SYS_ICH_MISR_EL2) != 0) { 414 414 generic_handle_domain_irq(aic_irqc->hw_domain, 415 415 AIC_FIQ_HWIRQ(AIC_VGIC_MI)); 416 416 417 - if (unlikely((read_sysreg_s(SYS_ICH_HCR_EL2) & ICH_HCR_EN) && 417 + if (unlikely((read_sysreg_s(SYS_ICH_HCR_EL2) & ICH_HCR_EL2_En) && 418 418 read_sysreg_s(SYS_ICH_MISR_EL2))) { 419 419 pr_err_ratelimited("vGIC IRQ fired and not handled by KVM, disabling.\n"); 420 - sysreg_clear_set_s(SYS_ICH_HCR_EL2, ICH_HCR_EN, 0); 420 + sysreg_clear_set_s(SYS_ICH_HCR_EL2, ICH_HCR_EL2_En, 0); 421 421 } 422 422 } 423 423 } ··· 841 841 VM_TMR_FIQ_ENABLE_V | VM_TMR_FIQ_ENABLE_P, 0); 842 842 843 843 /* vGIC maintenance IRQ */ 844 - sysreg_clear_set_s(SYS_ICH_HCR_EL2, ICH_HCR_EN, 0); 844 + sysreg_clear_set_s(SYS_ICH_HCR_EL2, ICH_HCR_EL2_En, 0); 845 845 } 846 846 847 847 /* PMC FIQ */
+80 -21
drivers/perf/apple_m1_cpu_pmu.c
··· 12 12 13 13 #include <linux/of.h> 14 14 #include <linux/perf/arm_pmu.h> 15 + #include <linux/perf/arm_pmuv3.h> 15 16 #include <linux/platform_device.h> 16 17 17 18 #include <asm/apple_m1_pmu.h> ··· 121 120 */ 122 121 M1_PMU_CFG_COUNT_USER = BIT(8), 123 122 M1_PMU_CFG_COUNT_KERNEL = BIT(9), 123 + M1_PMU_CFG_COUNT_HOST = BIT(10), 124 + M1_PMU_CFG_COUNT_GUEST = BIT(11), 124 125 }; 125 126 126 127 /* ··· 173 170 [PERF_COUNT_HW_INSTRUCTIONS] = M1_PMU_PERFCTR_INST_ALL, 174 171 [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = M1_PMU_PERFCTR_INST_BRANCH, 175 172 [PERF_COUNT_HW_BRANCH_MISSES] = M1_PMU_PERFCTR_BRANCH_MISPRED_NONSPEC, 173 + }; 174 + 175 + #define M1_PMUV3_EVENT_MAP(pmuv3_event, m1_event) \ 176 + [ARMV8_PMUV3_PERFCTR_##pmuv3_event] = M1_PMU_PERFCTR_##m1_event 177 + 178 + static const u16 m1_pmu_pmceid_map[ARMV8_PMUV3_MAX_COMMON_EVENTS] = { 179 + [0 ... ARMV8_PMUV3_MAX_COMMON_EVENTS - 1] = HW_OP_UNSUPPORTED, 180 + M1_PMUV3_EVENT_MAP(INST_RETIRED, INST_ALL), 181 + M1_PMUV3_EVENT_MAP(CPU_CYCLES, CORE_ACTIVE_CYCLE), 182 + M1_PMUV3_EVENT_MAP(BR_RETIRED, INST_BRANCH), 183 + M1_PMUV3_EVENT_MAP(BR_MIS_PRED_RETIRED, BRANCH_MISPRED_NONSPEC), 176 184 }; 177 185 178 186 /* sysfs definitions */ ··· 341 327 __m1_pmu_enable_counter_interrupt(index, false); 342 328 } 343 329 344 - static void m1_pmu_configure_counter(unsigned int index, u8 event, 345 - bool user, bool kernel) 330 + static void __m1_pmu_configure_event_filter(unsigned int index, bool user, 331 + bool kernel, bool host) 346 332 { 347 - u64 val, user_bit, kernel_bit; 348 - int shift; 333 + u64 clear, set, user_bit, kernel_bit; 349 334 350 335 switch (index) { 351 336 case 0 ... 7: ··· 359 346 BUG(); 360 347 } 361 348 362 - val = read_sysreg_s(SYS_IMP_APL_PMCR1_EL1); 363 - 349 + clear = set = 0; 364 350 if (user) 365 - val |= user_bit; 351 + set |= user_bit; 366 352 else 367 - val &= ~user_bit; 353 + clear |= user_bit; 368 354 369 355 if (kernel) 370 - val |= kernel_bit; 356 + set |= kernel_bit; 371 357 else 372 - val &= ~kernel_bit; 358 + clear |= kernel_bit; 373 359 374 - write_sysreg_s(val, SYS_IMP_APL_PMCR1_EL1); 360 + if (host) 361 + sysreg_clear_set_s(SYS_IMP_APL_PMCR1_EL1, clear, set); 362 + else if (is_kernel_in_hyp_mode()) 363 + sysreg_clear_set_s(SYS_IMP_APL_PMCR1_EL12, clear, set); 364 + } 365 + 366 + static void __m1_pmu_configure_eventsel(unsigned int index, u8 event) 367 + { 368 + u64 clear = 0, set = 0; 369 + int shift; 375 370 376 371 /* 377 372 * Counters 0 and 1 have fixed events. For anything else, ··· 392 371 break; 393 372 case 2 ... 5: 394 373 shift = (index - 2) * 8; 395 - val = read_sysreg_s(SYS_IMP_APL_PMESR0_EL1); 396 - val &= ~((u64)0xff << shift); 397 - val |= (u64)event << shift; 398 - write_sysreg_s(val, SYS_IMP_APL_PMESR0_EL1); 374 + clear |= (u64)0xff << shift; 375 + set |= (u64)event << shift; 376 + sysreg_clear_set_s(SYS_IMP_APL_PMESR0_EL1, clear, set); 399 377 break; 400 378 case 6 ... 
9: 401 379 shift = (index - 6) * 8; 402 - val = read_sysreg_s(SYS_IMP_APL_PMESR1_EL1); 403 - val &= ~((u64)0xff << shift); 404 - val |= (u64)event << shift; 405 - write_sysreg_s(val, SYS_IMP_APL_PMESR1_EL1); 380 + clear |= (u64)0xff << shift; 381 + set |= (u64)event << shift; 382 + sysreg_clear_set_s(SYS_IMP_APL_PMESR1_EL1, clear, set); 406 383 break; 407 384 } 385 + } 386 + 387 + static void m1_pmu_configure_counter(unsigned int index, unsigned long config_base) 388 + { 389 + bool kernel = config_base & M1_PMU_CFG_COUNT_KERNEL; 390 + bool guest = config_base & M1_PMU_CFG_COUNT_GUEST; 391 + bool host = config_base & M1_PMU_CFG_COUNT_HOST; 392 + bool user = config_base & M1_PMU_CFG_COUNT_USER; 393 + u8 evt = config_base & M1_PMU_CFG_EVENT; 394 + 395 + __m1_pmu_configure_event_filter(index, user && host, kernel && host, true); 396 + __m1_pmu_configure_event_filter(index, user && guest, kernel && guest, false); 397 + __m1_pmu_configure_eventsel(index, evt); 408 398 } 409 399 410 400 /* arm_pmu backend */ ··· 432 400 m1_pmu_disable_counter(event->hw.idx); 433 401 isb(); 434 402 435 - m1_pmu_configure_counter(event->hw.idx, evt, user, kernel); 403 + m1_pmu_configure_counter(event->hw.idx, event->hw.config_base); 436 404 m1_pmu_enable_counter(event->hw.idx); 437 405 m1_pmu_enable_counter_interrupt(event->hw.idx); 438 406 isb(); ··· 570 538 return armpmu_map_event(event, &m1_pmu_perf_map, NULL, M1_PMU_CFG_EVENT); 571 539 } 572 540 541 + static int m1_pmu_map_pmuv3_event(unsigned int eventsel) 542 + { 543 + u16 m1_event = HW_OP_UNSUPPORTED; 544 + 545 + if (eventsel < ARMV8_PMUV3_MAX_COMMON_EVENTS) 546 + m1_event = m1_pmu_pmceid_map[eventsel]; 547 + 548 + return m1_event == HW_OP_UNSUPPORTED ? -EOPNOTSUPP : m1_event; 549 + } 550 + 551 + static void m1_pmu_init_pmceid(struct arm_pmu *pmu) 552 + { 553 + unsigned int event; 554 + 555 + for (event = 0; event < ARMV8_PMUV3_MAX_COMMON_EVENTS; event++) { 556 + if (m1_pmu_map_pmuv3_event(event) >= 0) 557 + set_bit(event, pmu->pmceid_bitmap); 558 + } 559 + } 560 + 573 561 static void m1_pmu_reset(void *info) 574 562 { 575 563 int i; ··· 610 558 { 611 559 unsigned long config_base = 0; 612 560 613 - if (!attr->exclude_guest) { 561 + if (!attr->exclude_guest && !is_kernel_in_hyp_mode()) { 614 562 pr_debug("ARM performance counters do not support mode exclusion\n"); 615 563 return -EOPNOTSUPP; 616 564 } ··· 618 566 config_base |= M1_PMU_CFG_COUNT_KERNEL; 619 567 if (!attr->exclude_user) 620 568 config_base |= M1_PMU_CFG_COUNT_USER; 569 + if (!attr->exclude_host) 570 + config_base |= M1_PMU_CFG_COUNT_HOST; 571 + if (!attr->exclude_guest) 572 + config_base |= M1_PMU_CFG_COUNT_GUEST; 621 573 622 574 event->config_base = config_base; 623 575 ··· 649 593 650 594 cpu_pmu->reset = m1_pmu_reset; 651 595 cpu_pmu->set_event_filter = m1_pmu_set_event_filter; 596 + 597 + cpu_pmu->map_pmuv3_event = m1_pmu_map_pmuv3_event; 598 + m1_pmu_init_pmceid(cpu_pmu); 652 599 653 600 bitmap_set(cpu_pmu->cntr_mask, 0, M1_PMU_NR_COUNTERS); 654 601 cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] = &m1_pmu_events_attr_group;
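Together, the map_pmuv3_event() callback and the PMCEID bitmap filled in at init time describe which PMUv3 common events this non-PMUv3 PMU can back (the purpose spelled out by the new arm_pmu callback comment further down). A small illustration of what the m1_pmu_pmceid_map table above implies; it would only compile inside this driver, since the helper is static:

static void m1_pmuv3_mapping_examples(void)
{
        /* A common event the table maps directly: */
        WARN_ON(m1_pmu_map_pmuv3_event(ARMV8_PMUV3_PERFCTR_CPU_CYCLES) !=
                M1_PMU_PERFCTR_CORE_ACTIVE_CYCLE);

        /* Anything absent from the table is reported as unsupported: */
        WARN_ON(m1_pmu_map_pmuv3_event(ARMV8_PMUV3_PERFCTR_L1D_CACHE) != -EOPNOTSUPP);
}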
+6 -11
include/kvm/arm_pmu.h
··· 37 37 struct arm_pmu *arm_pmu; 38 38 }; 39 39 40 - DECLARE_STATIC_KEY_FALSE(kvm_arm_pmu_available); 41 - 42 - static __always_inline bool kvm_arm_support_pmu_v3(void) 43 - { 44 - return static_branch_likely(&kvm_arm_pmu_available); 45 - } 46 - 40 + bool kvm_supports_guest_pmuv3(void); 47 41 #define kvm_arm_pmu_irq_initialized(v) ((v)->arch.pmu.irq_num >= VGIC_NR_SGIS) 48 42 u64 kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u64 select_idx); 49 43 void kvm_pmu_set_counter_value(struct kvm_vcpu *vcpu, u64 select_idx, u64 val); 44 + void kvm_pmu_set_counter_value_user(struct kvm_vcpu *vcpu, u64 select_idx, u64 val); 50 45 u64 kvm_pmu_implemented_counter_mask(struct kvm_vcpu *vcpu); 51 46 u64 kvm_pmu_accessible_counter_mask(struct kvm_vcpu *vcpu); 52 47 u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1); 53 48 void kvm_pmu_vcpu_init(struct kvm_vcpu *vcpu); 54 - void kvm_pmu_vcpu_reset(struct kvm_vcpu *vcpu); 55 49 void kvm_pmu_vcpu_destroy(struct kvm_vcpu *vcpu); 56 50 void kvm_pmu_reprogram_counter_mask(struct kvm_vcpu *vcpu, u64 val); 57 51 void kvm_pmu_flush_hwstate(struct kvm_vcpu *vcpu); ··· 80 86 */ 81 87 #define kvm_pmu_update_vcpu_events(vcpu) \ 82 88 do { \ 83 - if (!has_vhe() && kvm_arm_support_pmu_v3()) \ 89 + if (!has_vhe() && system_supports_pmuv3()) \ 84 90 vcpu->arch.pmu.events = *kvm_get_pmu_events(); \ 85 91 } while (0) 86 92 ··· 96 102 struct kvm_pmu { 97 103 }; 98 104 99 - static inline bool kvm_arm_support_pmu_v3(void) 105 + static inline bool kvm_supports_guest_pmuv3(void) 100 106 { 101 107 return false; 102 108 } ··· 109 115 } 110 116 static inline void kvm_pmu_set_counter_value(struct kvm_vcpu *vcpu, 111 117 u64 select_idx, u64 val) {} 118 + static inline void kvm_pmu_set_counter_value_user(struct kvm_vcpu *vcpu, 119 + u64 select_idx, u64 val) {} 112 120 static inline u64 kvm_pmu_implemented_counter_mask(struct kvm_vcpu *vcpu) 113 121 { 114 122 return 0; ··· 120 124 return 0; 121 125 } 122 126 static inline void kvm_pmu_vcpu_init(struct kvm_vcpu *vcpu) {} 123 - static inline void kvm_pmu_vcpu_reset(struct kvm_vcpu *vcpu) {} 124 127 static inline void kvm_pmu_vcpu_destroy(struct kvm_vcpu *vcpu) {} 125 128 static inline void kvm_pmu_reprogram_counter_mask(struct kvm_vcpu *vcpu, u64 val) {} 126 129 static inline void kvm_pmu_flush_hwstate(struct kvm_vcpu *vcpu) {}
+10
include/kvm/arm_vgic.h
··· 249 249 250 250 int nr_spis; 251 251 252 + /* The GIC maintenance IRQ for nested hypervisors. */ 253 + u32 mi_intid; 254 + 252 255 /* base addresses in guest physical address space: */ 253 256 gpa_t vgic_dist_base; /* distributor */ 254 257 union { ··· 372 369 int kvm_set_legacy_vgic_v2_addr(struct kvm *kvm, struct kvm_arm_device_addr *dev_addr); 373 370 void kvm_vgic_early_init(struct kvm *kvm); 374 371 int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu); 372 + int kvm_vgic_vcpu_nv_init(struct kvm_vcpu *vcpu); 375 373 int kvm_vgic_create(struct kvm *kvm, u32 type); 376 374 void kvm_vgic_destroy(struct kvm *kvm); 377 375 void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu); ··· 392 388 393 389 void kvm_vgic_load(struct kvm_vcpu *vcpu); 394 390 void kvm_vgic_put(struct kvm_vcpu *vcpu); 391 + 392 + u16 vgic_v3_get_eisr(struct kvm_vcpu *vcpu); 393 + u16 vgic_v3_get_elrsr(struct kvm_vcpu *vcpu); 394 + u64 vgic_v3_get_misr(struct kvm_vcpu *vcpu); 395 395 396 396 #define irqchip_in_kernel(k) (!!((k)->arch.vgic.in_kernel)) 397 397 #define vgic_initialized(k) ((k)->arch.vgic.initialized) ··· 440 432 int vgic_v4_load(struct kvm_vcpu *vcpu); 441 433 void vgic_v4_commit(struct kvm_vcpu *vcpu); 442 434 int vgic_v4_put(struct kvm_vcpu *vcpu); 435 + 436 + bool vgic_state_is_nested(struct kvm_vcpu *vcpu); 443 437 444 438 /* CPU HP callbacks */ 445 439 void kvm_vgic_cpu_up(void);
+15
include/linux/arm-smccc.h
··· 179 179 #define ARM_SMCCC_KVM_FUNC_PKVM_RESV_62 62 180 180 #define ARM_SMCCC_KVM_FUNC_PKVM_RESV_63 63 181 181 /* End of pKVM hypercall range */ 182 + #define ARM_SMCCC_KVM_FUNC_DISCOVER_IMPL_VER 64 183 + #define ARM_SMCCC_KVM_FUNC_DISCOVER_IMPL_CPUS 65 184 + 182 185 #define ARM_SMCCC_KVM_FUNC_FEATURES_2 127 183 186 #define ARM_SMCCC_KVM_NUM_FUNCS 128 184 187 ··· 227 224 ARM_SMCCC_SMC_64, \ 228 225 ARM_SMCCC_OWNER_VENDOR_HYP, \ 229 226 ARM_SMCCC_KVM_FUNC_MMIO_GUARD) 227 + 228 + #define ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_VER_FUNC_ID \ 229 + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ 230 + ARM_SMCCC_SMC_64, \ 231 + ARM_SMCCC_OWNER_VENDOR_HYP, \ 232 + ARM_SMCCC_KVM_FUNC_DISCOVER_IMPL_VER) 233 + 234 + #define ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_CPUS_FUNC_ID \ 235 + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ 236 + ARM_SMCCC_SMC_64, \ 237 + ARM_SMCCC_OWNER_VENDOR_HYP, \ 238 + ARM_SMCCC_KVM_FUNC_DISCOVER_IMPL_CPUS) 230 239 231 240 /* ptp_kvm counter type ID */ 232 241 #define KVM_PTP_VIRT_COUNTER 0
+4
include/linux/perf/arm_pmu.h
··· 100 100 void (*stop)(struct arm_pmu *); 101 101 void (*reset)(void *); 102 102 int (*map_event)(struct perf_event *event); 103 + /* 104 + * Called by KVM to map the PMUv3 event space onto non-PMUv3 hardware. 105 + */ 106 + int (*map_pmuv3_event)(unsigned int eventsel); 103 107 DECLARE_BITMAP(cntr_mask, ARMPMU_MAX_HWEVENTS); 104 108 bool secure_access; /* 32-bit ARM only */ 105 109 #define ARMV8_PMUV3_MAX_COMMON_EVENTS 0x40
+1
include/uapi/linux/kvm.h
··· 929 929 #define KVM_CAP_PRE_FAULT_MEMORY 236 930 930 #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237 931 931 #define KVM_CAP_X86_GUEST_MODE 238 932 + #define KVM_CAP_ARM_WRITABLE_IMP_ID_REGS 239 932 933 933 934 struct kvm_irq_routing_irqchip { 934 935 __u32 irqchip;
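KVM_CAP_ARM_WRITABLE_IMP_ID_REGS is the VM capability behind making MIDR_EL1/REVIDR_EL1/AIDR_EL1 writable; the set_id_regs selftest further down enables it immediately after creating the VM, before adding a vCPU, and then writes the registers before the guest runs. A hedged raw-ioctl sketch of the same flow (probing with KVM_CHECK_EXTENSION and error handling omitted):

#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <stdint.h>

/* Enable on the VM before any KVM_CREATE_VCPU, mirroring the selftest order. */
static int make_impl_id_regs_writable(int vm_fd)
{
        struct kvm_enable_cap cap = { .cap = KVM_CAP_ARM_WRITABLE_IMP_ID_REGS };

        return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}

/* Give the guest a custom MIDR_EL1; expected to happen before the first KVM_RUN. */
static int set_guest_midr(int vcpu_fd, uint64_t midr)
{
        struct kvm_one_reg reg = {
                .id   = ARM64_SYS_REG(3, 0, 0, 0, 0),   /* MIDR_EL1: op0=3, op1=0, CRn=0, CRm=0, op2=0 */
                .addr = (uint64_t)(unsigned long)&midr,
        };

        return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
}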
+1
tools/arch/arm/include/uapi/asm/kvm.h
··· 246 246 #define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6 247 247 #define KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO 7 248 248 #define KVM_DEV_ARM_VGIC_GRP_ITS_REGS 8 249 + #define KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ 9 249 250 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT 10 250 251 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK \ 251 252 (0x3fffffULL << KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT)
-30
tools/arch/arm64/include/asm/sysreg.h
··· 558 558 559 559 #define SYS_ICH_VSEIR_EL2 sys_reg(3, 4, 12, 9, 4) 560 560 #define SYS_ICC_SRE_EL2 sys_reg(3, 4, 12, 9, 5) 561 - #define SYS_ICH_HCR_EL2 sys_reg(3, 4, 12, 11, 0) 562 - #define SYS_ICH_VTR_EL2 sys_reg(3, 4, 12, 11, 1) 563 - #define SYS_ICH_MISR_EL2 sys_reg(3, 4, 12, 11, 2) 564 561 #define SYS_ICH_EISR_EL2 sys_reg(3, 4, 12, 11, 3) 565 562 #define SYS_ICH_ELRSR_EL2 sys_reg(3, 4, 12, 11, 5) 566 563 #define SYS_ICH_VMCR_EL2 sys_reg(3, 4, 12, 11, 7) ··· 978 981 #define SYS_MPIDR_SAFE_VAL (BIT(31)) 979 982 980 983 /* GIC Hypervisor interface registers */ 981 - /* ICH_MISR_EL2 bit definitions */ 982 - #define ICH_MISR_EOI (1 << 0) 983 - #define ICH_MISR_U (1 << 1) 984 - 985 984 /* ICH_LR*_EL2 bit definitions */ 986 985 #define ICH_LR_VIRTUAL_ID_MASK ((1ULL << 32) - 1) 987 986 ··· 991 998 #define ICH_LR_PHYS_ID_MASK (0x3ffULL << ICH_LR_PHYS_ID_SHIFT) 992 999 #define ICH_LR_PRIORITY_SHIFT 48 993 1000 #define ICH_LR_PRIORITY_MASK (0xffULL << ICH_LR_PRIORITY_SHIFT) 994 - 995 - /* ICH_HCR_EL2 bit definitions */ 996 - #define ICH_HCR_EN (1 << 0) 997 - #define ICH_HCR_UIE (1 << 1) 998 - #define ICH_HCR_NPIE (1 << 3) 999 - #define ICH_HCR_TC (1 << 10) 1000 - #define ICH_HCR_TALL0 (1 << 11) 1001 - #define ICH_HCR_TALL1 (1 << 12) 1002 - #define ICH_HCR_TDIR (1 << 14) 1003 - #define ICH_HCR_EOIcount_SHIFT 27 1004 - #define ICH_HCR_EOIcount_MASK (0x1f << ICH_HCR_EOIcount_SHIFT) 1005 1001 1006 1002 /* ICH_VMCR_EL2 bit definitions */ 1007 1003 #define ICH_VMCR_ACK_CTL_SHIFT 2 ··· 1011 1029 #define ICH_VMCR_ENG0_MASK (1 << ICH_VMCR_ENG0_SHIFT) 1012 1030 #define ICH_VMCR_ENG1_SHIFT 1 1013 1031 #define ICH_VMCR_ENG1_MASK (1 << ICH_VMCR_ENG1_SHIFT) 1014 - 1015 - /* ICH_VTR_EL2 bit definitions */ 1016 - #define ICH_VTR_PRI_BITS_SHIFT 29 1017 - #define ICH_VTR_PRI_BITS_MASK (7 << ICH_VTR_PRI_BITS_SHIFT) 1018 - #define ICH_VTR_ID_BITS_SHIFT 23 1019 - #define ICH_VTR_ID_BITS_MASK (7 << ICH_VTR_ID_BITS_SHIFT) 1020 - #define ICH_VTR_SEIS_SHIFT 22 1021 - #define ICH_VTR_SEIS_MASK (1 << ICH_VTR_SEIS_SHIFT) 1022 - #define ICH_VTR_A3V_SHIFT 21 1023 - #define ICH_VTR_A3V_MASK (1 << ICH_VTR_A3V_SHIFT) 1024 - #define ICH_VTR_TDS_SHIFT 19 1025 - #define ICH_VTR_TDS_MASK (1 << ICH_VTR_TDS_SHIFT) 1026 1032 1027 1033 /* 1028 1034 * Permission Indirection Extension (PIE) permission encodings.
+12
tools/arch/arm64/include/uapi/asm/kvm.h
··· 374 374 #endif 375 375 }; 376 376 377 + /* Vendor hyper call function numbers 0-63 */ 377 378 #define KVM_REG_ARM_VENDOR_HYP_BMAP KVM_REG_ARM_FW_FEAT_BMAP_REG(2) 378 379 379 380 enum { ··· 382 381 KVM_REG_ARM_VENDOR_HYP_BIT_PTP = 1, 383 382 #ifdef __KERNEL__ 384 383 KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_COUNT, 384 + #endif 385 + }; 386 + 387 + /* Vendor hyper call function numbers 64-127 */ 388 + #define KVM_REG_ARM_VENDOR_HYP_BMAP_2 KVM_REG_ARM_FW_FEAT_BMAP_REG(3) 389 + 390 + enum { 391 + KVM_REG_ARM_VENDOR_HYP_BIT_DISCOVER_IMPL_VER = 0, 392 + KVM_REG_ARM_VENDOR_HYP_BIT_DISCOVER_IMPL_CPUS = 1, 393 + #ifdef __KERNEL__ 394 + KVM_REG_ARM_VENDOR_HYP_BMAP_2_BIT_COUNT, 385 395 #endif 386 396 }; 387 397
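KVM_REG_ARM_VENDOR_HYP_BMAP_2 is a firmware pseudo-register like the existing bitmaps, but the hypercalls selftest below shows it resets to 0, so the implementation-discovery hypercalls stay hidden until userspace opts in. A minimal sketch of a VMM exposing them to a guest, done before the vCPU first runs:

#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <stdint.h>

static int expose_impl_discovery(int vcpu_fd)
{
        uint64_t bmap = (1ULL << KVM_REG_ARM_VENDOR_HYP_BIT_DISCOVER_IMPL_VER) |
                        (1ULL << KVM_REG_ARM_VENDOR_HYP_BIT_DISCOVER_IMPL_CPUS);
        struct kvm_one_reg reg = {
                .id   = KVM_REG_ARM_VENDOR_HYP_BMAP_2,
                .addr = (uint64_t)(unsigned long)&bmap,
        };

        return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
}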
+1
tools/testing/selftests/kvm/arm64/get-reg-list.c
··· 332 332 KVM_REG_ARM_FW_FEAT_BMAP_REG(0), /* KVM_REG_ARM_STD_BMAP */ 333 333 KVM_REG_ARM_FW_FEAT_BMAP_REG(1), /* KVM_REG_ARM_STD_HYP_BMAP */ 334 334 KVM_REG_ARM_FW_FEAT_BMAP_REG(2), /* KVM_REG_ARM_VENDOR_HYP_BMAP */ 335 + KVM_REG_ARM_FW_FEAT_BMAP_REG(3), /* KVM_REG_ARM_VENDOR_HYP_BMAP_2 */ 335 336 ARM64_SYS_REG(3, 3, 14, 3, 1), /* CNTV_CTL_EL0 */ 336 337 ARM64_SYS_REG(3, 3, 14, 3, 2), /* CNTV_CVAL_EL0 */ 337 338 ARM64_SYS_REG(3, 3, 14, 0, 2),
+36 -10
tools/testing/selftests/kvm/arm64/hypercalls.c
··· 21 21 #define KVM_REG_ARM_STD_BMAP_BIT_MAX 0 22 22 #define KVM_REG_ARM_STD_HYP_BMAP_BIT_MAX 0 23 23 #define KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_MAX 1 24 + #define KVM_REG_ARM_VENDOR_HYP_BMAP_2_BIT_MAX 1 25 + 26 + #define KVM_REG_ARM_STD_BMAP_RESET_VAL FW_REG_ULIMIT_VAL(KVM_REG_ARM_STD_BMAP_BIT_MAX) 27 + #define KVM_REG_ARM_STD_HYP_BMAP_RESET_VAL FW_REG_ULIMIT_VAL(KVM_REG_ARM_STD_HYP_BMAP_BIT_MAX) 28 + #define KVM_REG_ARM_VENDOR_HYP_BMAP_RESET_VAL FW_REG_ULIMIT_VAL(KVM_REG_ARM_VENDOR_HYP_BMAP_BIT_MAX) 29 + #define KVM_REG_ARM_VENDOR_HYP_BMAP_2_RESET_VAL 0 24 30 25 31 struct kvm_fw_reg_info { 26 32 uint64_t reg; /* Register definition */ 27 33 uint64_t max_feat_bit; /* Bit that represents the upper limit of the feature-map */ 34 + uint64_t reset_val; /* Reset value for the register */ 28 35 }; 29 36 30 37 #define FW_REG_INFO(r) \ 31 38 { \ 32 39 .reg = r, \ 33 40 .max_feat_bit = r##_BIT_MAX, \ 41 + .reset_val = r##_RESET_VAL \ 34 42 } 35 43 36 44 static const struct kvm_fw_reg_info fw_reg_info[] = { 37 45 FW_REG_INFO(KVM_REG_ARM_STD_BMAP), 38 46 FW_REG_INFO(KVM_REG_ARM_STD_HYP_BMAP), 39 47 FW_REG_INFO(KVM_REG_ARM_VENDOR_HYP_BMAP), 48 + FW_REG_INFO(KVM_REG_ARM_VENDOR_HYP_BMAP_2), 40 49 }; 41 50 42 51 enum test_stage { ··· 180 171 181 172 for (i = 0; i < ARRAY_SIZE(fw_reg_info); i++) { 182 173 const struct kvm_fw_reg_info *reg_info = &fw_reg_info[i]; 174 + uint64_t set_val; 183 175 184 - /* First 'read' should be an upper limit of the features supported */ 176 + /* First 'read' should be the reset value for the reg */ 185 177 val = vcpu_get_reg(vcpu, reg_info->reg); 186 - TEST_ASSERT(val == FW_REG_ULIMIT_VAL(reg_info->max_feat_bit), 187 - "Expected all the features to be set for reg: 0x%lx; expected: 0x%lx; read: 0x%lx", 188 - reg_info->reg, FW_REG_ULIMIT_VAL(reg_info->max_feat_bit), val); 178 + TEST_ASSERT(val == reg_info->reset_val, 179 + "Unexpected reset value for reg: 0x%lx; expected: 0x%lx; read: 0x%lx", 180 + reg_info->reg, reg_info->reset_val, val); 189 181 190 - /* Test a 'write' by disabling all the features of the register map */ 191 - ret = __vcpu_set_reg(vcpu, reg_info->reg, 0); 182 + if (reg_info->reset_val) 183 + set_val = 0; 184 + else 185 + set_val = FW_REG_ULIMIT_VAL(reg_info->max_feat_bit); 186 + 187 + ret = __vcpu_set_reg(vcpu, reg_info->reg, set_val); 192 188 TEST_ASSERT(ret == 0, 189 + "Failed to %s all the features of reg: 0x%lx; ret: %d", 190 + (set_val ? "set" : "clear"), reg_info->reg, errno); 191 + 192 + val = vcpu_get_reg(vcpu, reg_info->reg); 193 + TEST_ASSERT(val == set_val, 194 + "Expected all the features to be %s for reg: 0x%lx", 195 + (set_val ? "set" : "cleared"), reg_info->reg); 196 + 197 + /* 198 + * If the reg has been set, clear it as test_fw_regs_after_vm_start() 199 + * expects it to be cleared. 200 + */ 201 + if (set_val) { 202 + ret = __vcpu_set_reg(vcpu, reg_info->reg, 0); 203 + TEST_ASSERT(ret == 0, 193 204 "Failed to clear all the features of reg: 0x%lx; ret: %d", 194 205 reg_info->reg, errno); 195 - 196 - val = vcpu_get_reg(vcpu, reg_info->reg); 197 - TEST_ASSERT(val == 0, 198 - "Expected all the features to be cleared for reg: 0x%lx", reg_info->reg); 206 + } 199 207 200 208 /* 201 209 * Test enabling a feature that's not supported.
+33 -7
tools/testing/selftests/kvm/arm64/set_id_regs.c
··· 146 146 static const struct reg_ftr_bits ftr_id_aa64mmfr0_el1[] = { 147 147 REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, ECV, 0), 148 148 REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, EXS, 0), 149 + REG_FTR_BITS(FTR_EXACT, ID_AA64MMFR0_EL1, TGRAN4_2, 1), 150 + REG_FTR_BITS(FTR_EXACT, ID_AA64MMFR0_EL1, TGRAN64_2, 1), 151 + REG_FTR_BITS(FTR_EXACT, ID_AA64MMFR0_EL1, TGRAN16_2, 1), 149 152 S_REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, TGRAN4, 0), 150 153 S_REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, TGRAN64, 0), 151 154 REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, TGRAN16, 0), ··· 233 230 GUEST_REG_SYNC(SYS_ID_AA64MMFR2_EL1); 234 231 GUEST_REG_SYNC(SYS_ID_AA64ZFR0_EL1); 235 232 GUEST_REG_SYNC(SYS_CTR_EL0); 233 + GUEST_REG_SYNC(SYS_MIDR_EL1); 234 + GUEST_REG_SYNC(SYS_REVIDR_EL1); 235 + GUEST_REG_SYNC(SYS_AIDR_EL1); 236 236 237 237 GUEST_DONE(); 238 238 } ··· 615 609 test_reg_vals[encoding_to_range_idx(SYS_CTR_EL0)] = ctr; 616 610 } 617 611 618 - static void test_vcpu_ftr_id_regs(struct kvm_vcpu *vcpu) 612 + static void test_id_reg(struct kvm_vcpu *vcpu, u32 id) 619 613 { 620 614 u64 val; 621 615 616 + val = vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(id)); 617 + val++; 618 + vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(id), val); 619 + test_reg_vals[encoding_to_range_idx(id)] = val; 620 + } 621 + 622 + static void test_vcpu_ftr_id_regs(struct kvm_vcpu *vcpu) 623 + { 622 624 test_clidr(vcpu); 623 625 test_ctr(vcpu); 624 626 625 - val = vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(SYS_MPIDR_EL1)); 626 - val++; 627 - vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_MPIDR_EL1), val); 627 + test_id_reg(vcpu, SYS_MPIDR_EL1); 628 + ksft_test_result_pass("%s\n", __func__); 629 + } 628 630 629 - test_reg_vals[encoding_to_range_idx(SYS_MPIDR_EL1)] = val; 631 + static void test_vcpu_non_ftr_id_regs(struct kvm_vcpu *vcpu) 632 + { 633 + test_id_reg(vcpu, SYS_MIDR_EL1); 634 + test_id_reg(vcpu, SYS_REVIDR_EL1); 635 + test_id_reg(vcpu, SYS_AIDR_EL1); 636 + 630 637 ksft_test_result_pass("%s\n", __func__); 631 638 } 632 639 ··· 666 647 test_assert_id_reg_unchanged(vcpu, SYS_MPIDR_EL1); 667 648 test_assert_id_reg_unchanged(vcpu, SYS_CLIDR_EL1); 668 649 test_assert_id_reg_unchanged(vcpu, SYS_CTR_EL0); 650 + test_assert_id_reg_unchanged(vcpu, SYS_MIDR_EL1); 651 + test_assert_id_reg_unchanged(vcpu, SYS_REVIDR_EL1); 652 + test_assert_id_reg_unchanged(vcpu, SYS_AIDR_EL1); 669 653 670 654 ksft_test_result_pass("%s\n", __func__); 671 655 } ··· 682 660 int test_cnt; 683 661 684 662 TEST_REQUIRE(kvm_has_cap(KVM_CAP_ARM_SUPPORTED_REG_MASK_RANGES)); 663 + TEST_REQUIRE(kvm_has_cap(KVM_CAP_ARM_WRITABLE_IMP_ID_REGS)); 685 664 686 - vm = vm_create_with_one_vcpu(&vcpu, guest_code); 665 + vm = vm_create(1); 666 + vm_enable_cap(vm, KVM_CAP_ARM_WRITABLE_IMP_ID_REGS, 0); 667 + vcpu = vm_vcpu_add(vm, 0, guest_code); 687 668 688 669 /* Check for AARCH64 only system */ 689 670 val = vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(SYS_ID_AA64PFR0_EL1)); ··· 700 675 ARRAY_SIZE(ftr_id_aa64isar2_el1) + ARRAY_SIZE(ftr_id_aa64pfr0_el1) + 701 676 ARRAY_SIZE(ftr_id_aa64pfr1_el1) + ARRAY_SIZE(ftr_id_aa64mmfr0_el1) + 702 677 ARRAY_SIZE(ftr_id_aa64mmfr1_el1) + ARRAY_SIZE(ftr_id_aa64mmfr2_el1) + 703 - ARRAY_SIZE(ftr_id_aa64zfr0_el1) - ARRAY_SIZE(test_regs) + 2 + 678 + ARRAY_SIZE(ftr_id_aa64zfr0_el1) - ARRAY_SIZE(test_regs) + 3 + 704 679 MPAM_IDREG_TEST; 705 680 706 681 ksft_set_plan(test_cnt); 707 682 708 683 test_vm_ftr_id_regs(vcpu, aarch64_only); 709 684 test_vcpu_ftr_id_regs(vcpu); 685 + test_vcpu_non_ftr_id_regs(vcpu); 710 686 test_user_set_mpam_reg(vcpu); 711 687 712 
688 test_guest_reg_read(vcpu);