Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

KVM: arm64: vgic-v3: Optimize affinity-based SGI injection

Our affinity-based SGI injection code is a bit daft. We iterate
over all the CPUs, trying to match the set of affinities that the
guest is trying to reach, leading to some very bad behaviours
when the selected targets sit at high vcpu indices.

Instead, we can now use the fact that we have an optimised
MPIDR to vcpu mapping, and only look at the relevant values.

This results in much faster injection for large VMs, in near
constant time irrespective of the position in the vcpu index
space.

As a bonus, this is mostly deleting a lot of hard-to-read
code. Nobody will complain about that.

Suggested-by: Xu Zhao <zhaoxu.35@bytedance.com>
Tested-by: Joey Gouly <joey.gouly@arm.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230927090911.3355209-11-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>

Authored by Marc Zyngier, committed by Oliver Upton
b5daffb1 54a8006d

+11 -53
arch/arm64/kvm/vgic/vgic-mmio-v3.c
···
 
 	return 0;
 }
-
-/*
- * Compare a given affinity (level 1-3 and a level 0 mask, from the SGI
- * generation register ICC_SGI1R_EL1) with a given VCPU.
- * If the VCPU's MPIDR matches, return the level0 affinity, otherwise
- * return -1.
- */
-static int match_mpidr(u64 sgi_aff, u16 sgi_cpu_mask, struct kvm_vcpu *vcpu)
-{
-	unsigned long affinity;
-	int level0;
-
-	/*
-	 * Split the current VCPU's MPIDR into affinity level 0 and the
-	 * rest as this is what we have to compare against.
-	 */
-	affinity = kvm_vcpu_get_mpidr_aff(vcpu);
-	level0 = MPIDR_AFFINITY_LEVEL(affinity, 0);
-	affinity &= ~MPIDR_LEVEL_MASK;
-
-	/* bail out if the upper three levels don't match */
-	if (sgi_aff != affinity)
-		return -1;
-
-	/* Is this VCPU's bit set in the mask ? */
-	if (!(sgi_cpu_mask & BIT(level0)))
-		return -1;
-
-	return level0;
-}
 
 /*
  * The ICC_SGI* registers encode the affinity differently from the MPIDR,
···
  * This will trap in sys_regs.c and call this function.
  * This ICC_SGI1R_EL1 register contains the upper three affinity levels of the
  * target processors as well as a bitmask of 16 Aff0 CPUs.
- * If the interrupt routing mode bit is not set, we iterate over all VCPUs to
- * check for matching ones. If this bit is set, we signal all, but not the
- * calling VCPU.
+ *
+ * If the interrupt routing mode bit is not set, we iterate over the Aff0
+ * bits and signal the VCPUs matching the provided Aff{3,2,1}.
+ *
+ * If this bit is set, we signal all, but not the calling VCPU.
  */
 void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg, bool allow_group1)
 {
···
 	struct kvm_vcpu *c_vcpu;
 	unsigned long target_cpus;
 	u64 mpidr;
-	u32 sgi;
+	u32 sgi, aff0;
 	unsigned long c;
 
 	sgi = FIELD_GET(ICC_SGI1R_SGI_ID_MASK, reg);
···
 		return;
 	}
 
+	/* We iterate over affinities to find the corresponding vcpus */
 	mpidr = SGI_AFFINITY_LEVEL(reg, 3);
 	mpidr |= SGI_AFFINITY_LEVEL(reg, 2);
 	mpidr |= SGI_AFFINITY_LEVEL(reg, 1);
 	target_cpus = FIELD_GET(ICC_SGI1R_TARGET_LIST_MASK, reg);
 
-	/*
-	 * We iterate over all VCPUs to find the MPIDRs matching the request.
-	 * If we have handled one CPU, we clear its bit to detect early
-	 * if we are already finished. This avoids iterating through all
-	 * VCPUs when most of the times we just signal a single VCPU.
-	 */
-	kvm_for_each_vcpu(c, c_vcpu, kvm) {
-		int level0;
-
-		/* Exit early if we have dealt with all requested CPUs */
-		if (target_cpus == 0)
-			break;
-
-		level0 = match_mpidr(mpidr, target_cpus, c_vcpu);
-		if (level0 == -1)
-			continue;
-
-		/* remove this matching VCPU from the mask */
-		target_cpus &= ~BIT(level0);
-
-		vgic_v3_queue_sgi(c_vcpu, sgi, allow_group1);
+	for_each_set_bit(aff0, &target_cpus, hweight_long(ICC_SGI1R_TARGET_LIST_MASK)) {
+		c_vcpu = kvm_mpidr_to_vcpu(kvm, mpidr | aff0);
+		if (c_vcpu)
+			vgic_v3_queue_sgi(c_vcpu, sgi, allow_group1);
 	}
 }