
Merge tag 'hyperv-next-signed-20250602' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv updates from Wei Liu:

- Support for Virtual Trust Level (VTL) on arm64 (Roman Kisel)

- Fixes for Hyper-V UIO driver (Long Li)

- Fixes for Hyper-V PCI driver (Michael Kelley)

- Select CONFIG_SYSFB for Hyper-V guests (Michael Kelley)

- Documentation updates for Hyper-V VMBus (Michael Kelley)

- Enhance logging for hv_kvp_daemon (Shradha Gupta)

* tag 'hyperv-next-signed-20250602' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (23 commits)
Drivers: hv: Always select CONFIG_SYSFB for Hyper-V guests
Drivers: hv: vmbus: Add comments about races with "channels" sysfs dir
Documentation: hyperv: Update VMBus doc with new features and info
PCI: hv: Remove unnecessary flex array in struct pci_packet
Drivers: hv: Remove hv_alloc/free_* helpers
Drivers: hv: Use kzalloc for panic page allocation
uio_hv_generic: Align ring size to system page
uio_hv_generic: Use correct size for interrupt and monitor pages
Drivers: hv: Allocate interrupt and monitor pages aligned to system page boundary
arch/x86: Provide the CPU number in the wakeup AP callback
x86/hyperv: Fix APIC ID and VP index confusion in hv_snp_boot_ap()
PCI: hv: Get vPCI MSI IRQ domain from DeviceTree
ACPI: irq: Introduce acpi_get_gsi_dispatcher()
Drivers: hv: vmbus: Introduce hv_get_vmbus_root_device()
Drivers: hv: vmbus: Get the IRQ number from DeviceTree
dt-bindings: microsoft,vmbus: Add interrupt and DMA coherence properties
arm64, x86: hyperv: Report the VTL the system boots in
arm64: hyperv: Initialize the Virtual Trust Level field
Drivers: hv: Provide arch-neutral implementation of get_vtl()
Drivers: hv: Enable VTL mode for arm64
...

+562 -235
+14 -2
Documentation/devicetree/bindings/bus/microsoft,vmbus.yaml
···
       - Saurabh Sengar <ssengar@linux.microsoft.com>

   description:
-    VMBus is a software bus that implement the protocols for communication
-    between the root or host OS and guest OSs (virtual machines).
+    VMBus is a software bus that implements the protocols for communication
+    between the root or host OS and guest OS'es (virtual machines).

   properties:
     compatible:
···
     '#size-cells':
       const: 1

+    dma-coherent: true
+
+    interrupts:
+      maxItems: 1
+      description: Interrupt is used to report a message from the host.
+
   required:
     - compatible
     - ranges
+    - interrupts
     - '#address-cells'
     - '#size-cells'

···

   examples:
     - |
+      #include <dt-bindings/interrupt-controller/irq.h>
+      #include <dt-bindings/interrupt-controller/arm-gic.h>
       soc {
         #address-cells = <2>;
         #size-cells = <1>;
···
           #address-cells = <2>;
           #size-cells = <1>;
           ranges = <0x0f 0xf0000000 0x0f 0xf0000000 0x10000000>;
+          dma-coherent;
+          interrupt-parent = <&gic>;
+          interrupts = <GIC_PPI 2 IRQ_TYPE_EDGE_RISING>;
         };
       };
     };
+24 -4
Documentation/virt/hyperv/vmbus.rst
···
   or /proc/irq corresponding to individual VMBus channel interrupts.

   An online CPU in a Linux guest may not be taken offline if it has
-  VMBus channel interrupts assigned to it. Any such channel
-  interrupts must first be manually reassigned to another CPU as
-  described above. When no channel interrupts are assigned to the
-  CPU, it can be taken offline.
+  VMBus channel interrupts assigned to it. Starting in kernel v6.15,
+  any such interrupts are automatically reassigned to some other CPU
+  at the time of offlining. The "other" CPU is chosen by the
+  implementation and is not load balanced or otherwise intelligently
+  determined. If the CPU is onlined again, channel interrupts previously
+  assigned to it are not moved back. As a result, after multiple CPUs
+  have been offlined, and perhaps onlined again, the interrupt-to-CPU
+  mapping may be scrambled and non-optimal. In such a case, optimal
+  assignments must be re-established manually. For kernels v6.14 and
+  earlier, any conflicting channel interrupts must first be manually
+  reassigned to another CPU as described above. Then when no channel
+  interrupts are assigned to the CPU, it can be taken offline.

   The VMBus channel interrupt handling code is designed to work
   correctly even if an interrupt is received on a CPU other than the
···
   its previous existence. Such a device might be re-added later,
   in which case it is treated as an entirely new device. See
   vmbus_onoffer_rescind().
+
+  For some devices, such as the KVP device, Hyper-V automatically
+  sends a rescind message when the primary channel is closed,
+  likely as a result of unbinding the device from its driver.
+  The rescind causes Linux to remove the device. But then Hyper-V
+  immediately reoffers the device to the guest, causing a new
+  instance of the device to be created in Linux. For other
+  devices, such as the synthetic SCSI and NIC devices, closing the
+  primary channel does *not* result in Hyper-V sending a rescind
+  message. The device continues to exist in Linux on the VMBus,
+  but with no driver bound to it. The same driver or a new driver
+  can subsequently be bound to the existing instance of the device.
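The manual reassignment described above for kernels v6.14 and earlier can be scripted against the sysfs layout this document describes. A minimal sketch; the CPU numbers are arbitrary examples, and the `/sys/bus/vmbus/devices/*/channels/*/cpu` paths assume the per-channel sysfs attributes documented in vmbus.rst (run as root inside a Hyper-V guest):

```shell
# Hypothetical sketch for kernels v6.14 and earlier: move any VMBus channel
# interrupts off CPU 3 so that CPU can then be taken offline.
src=3   # CPU we want to offline (example value)
dst=0   # arbitrary CPU to receive the channel interrupts (example value)
if [ -d /sys/bus/vmbus/devices ]; then
    for chan in /sys/bus/vmbus/devices/*/channels/*; do
        # each channel subdirectory exposes a writable "cpu" attribute
        [ -f "$chan/cpu" ] || continue
        if [ "$(cat "$chan/cpu")" = "$src" ]; then
            echo "$dst" > "$chan/cpu"
        fi
    done
    # with no channel interrupts left on it, the CPU can go offline
    echo 0 > /sys/devices/system/cpu/cpu$src/online
else
    echo "no VMBus on this system; nothing to do"
fi
```

On v6.15 and later this is unnecessary for offlining itself, but the same loop can be used to re-establish a sensible interrupt-to-CPU mapping after repeated offline/online cycles.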
+48 -5
arch/arm64/hyperv/mshyperv.c
···
 }
 EXPORT_SYMBOL_GPL(hv_get_hypervisor_version);

+#ifdef CONFIG_ACPI
+
+static bool __init hyperv_detect_via_acpi(void)
+{
+    if (acpi_disabled)
+        return false;
+    /*
+     * Hypervisor ID is only available in ACPI v6+, and the
+     * structure layout was extended in v6 to accommodate that
+     * new field.
+     *
+     * At the very minimum, this check makes sure not to read
+     * past the FADT structure.
+     *
+     * It is also needed to catch running in some unknown
+     * non-Hyper-V environment that has ACPI 5.x or less.
+     * In such a case, it can't be Hyper-V.
+     */
+    if (acpi_gbl_FADT.header.revision < 6)
+        return false;
+    return strncmp((char *)&acpi_gbl_FADT.hypervisor_id, "MsHyperV", 8) == 0;
+}
+
+#else
+
+static bool __init hyperv_detect_via_acpi(void)
+{
+    return false;
+}
+
+#endif
+
+static bool __init hyperv_detect_via_smccc(void)
+{
+    uuid_t hyperv_uuid = UUID_INIT(
+        0x58ba324d, 0x6447, 0x24cd,
+        0x75, 0x6c, 0xef, 0x8e,
+        0x24, 0x70, 0x59, 0x16);
+
+    return arm_smccc_hypervisor_has_uuid(&hyperv_uuid);
+}
+
 static int __init hyperv_init(void)
 {
     struct hv_get_vp_registers_output result;
···
     /*
      * Allow for a kernel built with CONFIG_HYPERV to be running in
-     * a non-Hyper-V environment, including on DT instead of ACPI.
+     * a non-Hyper-V environment.
+     *
      * In such cases, do nothing and return success.
      */
-    if (acpi_disabled)
-        return 0;
-
-    if (strncmp((char *)&acpi_gbl_FADT.hypervisor_id, "MsHyperV", 8))
+    if (!hyperv_detect_via_acpi() && !hyperv_detect_via_smccc())
         return 0;

     /* Setup the guest ID */
···
     if (ms_hyperv.priv_high & HV_ACCESS_PARTITION_ID)
         hv_get_partition_id();
+    ms_hyperv.vtl = get_vtl();
+    if (ms_hyperv.vtl > 0) /* non default VTL */
+        pr_info("Linux runs in Hyper-V Virtual Trust Level %d\n", ms_hyperv.vtl);

     ms_hyperv_late_init();
+6 -4
arch/arm64/kvm/hypercalls.c
···
     u32 feature;
     u8 action;
     gpa_t gpa;
+    uuid_t uuid;

     action = kvm_smccc_get_action(vcpu, func_id);
     switch (action) {
···
         val[0] = gpa;
         break;
     case ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID:
-        val[0] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_0;
-        val[1] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_1;
-        val[2] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_2;
-        val[3] = ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_3;
+        uuid = ARM_SMCCC_VENDOR_HYP_UID_KVM;
+        val[0] = smccc_uuid_to_reg(&uuid, 0);
+        val[1] = smccc_uuid_to_reg(&uuid, 1);
+        val[2] = smccc_uuid_to_reg(&uuid, 2);
+        val[3] = smccc_uuid_to_reg(&uuid, 3);
         break;
     case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
         val[0] = smccc_feat->vendor_hyp_bmap;
+2 -11
arch/x86/coco/sev/core.c
···
     return page_address(p + 1);
 }

-static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip)
+static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip, unsigned int cpu)
 {
     struct sev_es_save_area *cur_vmsa, *vmsa;
     struct svsm_ca *caa;
     u8 sipi_vector;
-    int cpu, ret;
+    int ret;
     u64 cr4;

     /*
···
     /* Override start_ip with known protected guest start IP */
     start_ip = real_mode_header->sev_es_trampoline_start;
-
-    /* Find the logical CPU for the APIC ID */
-    for_each_present_cpu(cpu) {
-        if (arch_match_cpu_phys_id(cpu, apic_id))
-            break;
-    }
-    if (cpu >= nr_cpu_ids)
-        return -EINVAL;
-
     cur_vmsa = per_cpu(sev_vmsa, cpu);

     /*
+33 -34
arch/x86/hyperv/hv_init.c
···
     old_setup_percpu_clockev();
 }

-#if IS_ENABLED(CONFIG_HYPERV_VTL_MODE)
-static u8 __init get_vtl(void)
-{
-    u64 control = HV_HYPERCALL_REP_COMP_1 | HVCALL_GET_VP_REGISTERS;
-    struct hv_input_get_vp_registers *input;
-    struct hv_output_get_vp_registers *output;
-    unsigned long flags;
-    u64 ret;
-
-    local_irq_save(flags);
-    input = *this_cpu_ptr(hyperv_pcpu_input_arg);
-    output = *this_cpu_ptr(hyperv_pcpu_output_arg);
-
-    memset(input, 0, struct_size(input, names, 1));
-    input->partition_id = HV_PARTITION_ID_SELF;
-    input->vp_index = HV_VP_INDEX_SELF;
-    input->input_vtl.as_uint8 = 0;
-    input->names[0] = HV_REGISTER_VSM_VP_STATUS;
-
-    ret = hv_do_hypercall(control, input, output);
-    if (hv_result_success(ret)) {
-        ret = output->values[0].reg8 & HV_X64_VTL_MASK;
-    } else {
-        pr_err("Failed to get VTL(error: %lld) exiting...\n", ret);
-        BUG();
-    }
-
-    local_irq_restore(flags);
-    return ret;
-}
-#else
-static inline u8 get_vtl(void) { return 0; }
-#endif
-
 /*
  * This function is to be invoked early in the boot sequence after the
  * hypervisor has been detected.
···
     return hypercall_msr.enable;
 }
 EXPORT_SYMBOL_GPL(hv_is_hyperv_initialized);
+
+int hv_apicid_to_vp_index(u32 apic_id)
+{
+    u64 control;
+    u64 status;
+    unsigned long irq_flags;
+    struct hv_get_vp_from_apic_id_in *input;
+    u32 *output, ret;
+
+    local_irq_save(irq_flags);
+
+    input = *this_cpu_ptr(hyperv_pcpu_input_arg);
+    memset(input, 0, sizeof(*input));
+    input->partition_id = HV_PARTITION_ID_SELF;
+    input->apic_ids[0] = apic_id;
+
+    output = *this_cpu_ptr(hyperv_pcpu_output_arg);
+
+    control = HV_HYPERCALL_REP_COMP_1 | HVCALL_GET_VP_INDEX_FROM_APIC_ID;
+    status = hv_do_hypercall(control, input, output);
+    ret = output[0];
+
+    local_irq_restore(irq_flags);
+
+    if (!hv_result_success(status)) {
+        pr_err("failed to get vp index from apic id %d, status %#llx\n",
+               apic_id, status);
+        return -EINVAL;
+    }
+
+    return ret;
+}
+EXPORT_SYMBOL_GPL(hv_apicid_to_vp_index);
+13 -48
arch/x86/hyperv/hv_vtl.c
···
 void __init hv_vtl_init_platform(void)
 {
-    pr_info("Linux runs in Hyper-V Virtual Trust Level\n");
+    /*
+     * This function is a no-op if the VTL mode is not enabled.
+     * If it is, this function runs if and only the kernel boots in
+     * VTL2 which the x86 hv initialization path makes sure of.
+     */
+    pr_info("Linux runs in Hyper-V Virtual Trust Level %d\n", ms_hyperv.vtl);

     x86_platform.realmode_reserve = x86_init_noop;
     x86_platform.realmode_init = x86_init_noop;
···
     return ret;
 }

-static int hv_vtl_apicid_to_vp_id(u32 apic_id)
+static int hv_vtl_wakeup_secondary_cpu(u32 apicid, unsigned long start_eip, unsigned int cpu)
 {
-    u64 control;
-    u64 status;
-    unsigned long irq_flags;
-    struct hv_get_vp_from_apic_id_in *input;
-    u32 *output, ret;
-
-    local_irq_save(irq_flags);
-
-    input = *this_cpu_ptr(hyperv_pcpu_input_arg);
-    memset(input, 0, sizeof(*input));
-    input->partition_id = HV_PARTITION_ID_SELF;
-    input->apic_ids[0] = apic_id;
-
-    output = *this_cpu_ptr(hyperv_pcpu_output_arg);
-
-    control = HV_HYPERCALL_REP_COMP_1 | HVCALL_GET_VP_ID_FROM_APIC_ID;
-    status = hv_do_hypercall(control, input, output);
-    ret = output[0];
-
-    local_irq_restore(irq_flags);
-
-    if (!hv_result_success(status)) {
-        pr_err("failed to get vp id from apic id %d, status %#llx\n",
-               apic_id, status);
-        return -EINVAL;
-    }
-
-    return ret;
-}
-
-static int hv_vtl_wakeup_secondary_cpu(u32 apicid, unsigned long start_eip)
-{
-    int vp_id, cpu;
-
-    /* Find the logical CPU for the APIC ID */
-    for_each_present_cpu(cpu) {
-        if (arch_match_cpu_phys_id(cpu, apicid))
-            break;
-    }
-    if (cpu >= nr_cpu_ids)
-        return -EINVAL;
+    int vp_index;

     pr_debug("Bringing up CPU with APIC ID %d in VTL2...\n", apicid);
-    vp_id = hv_vtl_apicid_to_vp_id(apicid);
+    vp_index = hv_apicid_to_vp_index(apicid);

-    if (vp_id < 0) {
+    if (vp_index < 0) {
         pr_err("Couldn't find CPU with APIC ID %d\n", apicid);
         return -EINVAL;
     }
-    if (vp_id > ms_hyperv.max_vp_index) {
-        pr_err("Invalid CPU id %d for APIC ID %d\n", vp_id, apicid);
+    if (vp_index > ms_hyperv.max_vp_index) {
+        pr_err("Invalid CPU id %d for APIC ID %d\n", vp_index, apicid);
         return -EINVAL;
     }

-    return hv_vtl_bringup_vcpu(vp_id, cpu, start_eip);
+    return hv_vtl_bringup_vcpu(vp_index, cpu, start_eip);
 }

 int __init hv_vtl_early_init(void)
+9 -2
arch/x86/hyperv/ivm.c
···
 #include <linux/bitfield.h>
 #include <linux/types.h>
 #include <linux/slab.h>
+#include <linux/cpu.h>
 #include <asm/svm.h>
 #include <asm/sev.h>
 #include <asm/io.h>
···
     free_page((unsigned long)vmsa);
 }

-int hv_snp_boot_ap(u32 cpu, unsigned long start_ip)
+int hv_snp_boot_ap(u32 apic_id, unsigned long start_ip, unsigned int cpu)
 {
     struct sev_es_save_area *vmsa = (struct sev_es_save_area *)
         __get_free_page(GFP_KERNEL | __GFP_ZERO);
···
     u64 ret, retry = 5;
     struct hv_enable_vp_vtl *start_vp_input;
     unsigned long flags;
+    int vp_index;

     if (!vmsa)
         return -ENOMEM;
+
+    /* Find the Hyper-V VP index which might be not the same as APIC ID */
+    vp_index = hv_apicid_to_vp_index(apic_id);
+    if (vp_index < 0 || vp_index > ms_hyperv.max_vp_index)
+        return -EINVAL;

     native_store_gdt(&gdtr);
···
     start_vp_input = (struct hv_enable_vp_vtl *)ap_start_input_arg;
     memset(start_vp_input, 0, sizeof(*start_vp_input));
     start_vp_input->partition_id = -1;
-    start_vp_input->vp_index = cpu;
+    start_vp_input->vp_index = vp_index;
     start_vp_input->target_vtl.target_vtl = ms_hyperv.vtl;
     *(u64 *)&start_vp_input->vp_context = __pa(vmsa) | 1;
+4 -4
arch/x86/include/asm/apic.h
···
     u32 (*get_apic_id)(u32 id);

     /* wakeup_secondary_cpu */
-    int (*wakeup_secondary_cpu)(u32 apicid, unsigned long start_eip);
+    int (*wakeup_secondary_cpu)(u32 apicid, unsigned long start_eip, unsigned int cpu);
     /* wakeup secondary CPU using 64-bit wakeup point */
-    int (*wakeup_secondary_cpu_64)(u32 apicid, unsigned long start_eip);
+    int (*wakeup_secondary_cpu_64)(u32 apicid, unsigned long start_eip, unsigned int cpu);

     char *name;
 };
···
     void (*send_IPI_self)(int vector);
     u64 (*icr_read)(void);
     void (*icr_write)(u32 low, u32 high);
-    int (*wakeup_secondary_cpu)(u32 apicid, unsigned long start_eip);
-    int (*wakeup_secondary_cpu_64)(u32 apicid, unsigned long start_eip);
+    int (*wakeup_secondary_cpu)(u32 apicid, unsigned long start_eip, unsigned int cpu);
+    int (*wakeup_secondary_cpu_64)(u32 apicid, unsigned long start_eip, unsigned int cpu);
 };

 /*
+5 -2
arch/x86/include/asm/mshyperv.h
···
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 bool hv_ghcb_negotiate_protocol(void);
 void __noreturn hv_ghcb_terminate(unsigned int set, unsigned int reason);
-int hv_snp_boot_ap(u32 cpu, unsigned long start_ip);
+int hv_snp_boot_ap(u32 apic_id, unsigned long start_ip, unsigned int cpu);
 #else
 static inline bool hv_ghcb_negotiate_protocol(void) { return false; }
 static inline void hv_ghcb_terminate(unsigned int set, unsigned int reason) {}
-static inline int hv_snp_boot_ap(u32 cpu, unsigned long start_ip) { return 0; }
+static inline int hv_snp_boot_ap(u32 apic_id, unsigned long start_ip,
+                                 unsigned int cpu) { return 0; }
 #endif

 #if defined(CONFIG_AMD_MEM_ENCRYPT) || defined(CONFIG_INTEL_TDX_GUEST)
···
 {
     return native_rdmsrq(reg);
 }
+int hv_apicid_to_vp_index(u32 apic_id);

 #else /* CONFIG_HYPERV */
 static inline void hyperv_init(void) {}
···
 static inline u64 hv_get_msr(unsigned int reg) { return 0; }
 static inline void hv_set_non_nested_msr(unsigned int reg, u64 value) { }
 static inline u64 hv_get_non_nested_msr(unsigned int reg) { return 0; }
+static inline int hv_apicid_to_vp_index(u32 apic_id) { return -EINVAL; }
 #endif /* CONFIG_HYPERV */
+1 -1
arch/x86/kernel/acpi/madt_wakeup.c
···
     return 0;
 }

-static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip)
+static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip, unsigned int cpu)
 {
     if (!acpi_mp_wake_mailbox_paddr) {
         pr_warn_once("No MADT mailbox: cannot bringup secondary CPUs. Booting with kexec?\n");
+7 -1
arch/x86/kernel/apic/apic_noop.c
···
 static void noop_send_IPI_all(int vector) { }
 static void noop_send_IPI_self(int vector) { }
 static void noop_apic_icr_write(u32 low, u32 id) { }
-static int noop_wakeup_secondary_cpu(u32 apicid, unsigned long start_eip) { return -1; }
+
+static int noop_wakeup_secondary_cpu(u32 apicid, unsigned long start_eip,
+                                     unsigned int cpu)
+{
+    return -1;
+}
+
 static u64 noop_apic_icr_read(void) { return 0; }
 static u32 noop_get_apic_id(u32 apicid) { return 0; }
 static void noop_apic_eoi(void) { }
+1 -1
arch/x86/kernel/apic/apic_numachip.c
···
     numachip2_write32_lcsr(NUMACHIP2_APIC_ICR, (apicid << 12) | val);
 }

-static int numachip_wakeup_secondary(u32 phys_apicid, unsigned long start_rip)
+static int numachip_wakeup_secondary(u32 phys_apicid, unsigned long start_rip, unsigned int cpu)
 {
     numachip_apic_icr_write(phys_apicid, APIC_DM_INIT);
     numachip_apic_icr_write(phys_apicid, APIC_DM_STARTUP |
+1 -1
arch/x86/kernel/apic/x2apic_uv_x.c
···
     }
 }

-static int uv_wakeup_secondary(u32 phys_apicid, unsigned long start_rip)
+static int uv_wakeup_secondary(u32 phys_apicid, unsigned long start_rip, unsigned int cpu)
 {
     unsigned long val;
     int pnode;
+5 -5
arch/x86/kernel/smpboot.c
···
 /*
  * Wake up AP by INIT, INIT, STARTUP sequence.
  */
-static int wakeup_secondary_cpu_via_init(u32 phys_apicid, unsigned long start_eip)
+static int wakeup_secondary_cpu_via_init(u32 phys_apicid, unsigned long start_eip, unsigned int cpu)
 {
     unsigned long send_status = 0, accept_status = 0;
     int num_starts, j, maxlvt;
···
  * Returns zero if startup was successfully sent, else error code from
  * ->wakeup_secondary_cpu.
  */
-static int do_boot_cpu(u32 apicid, int cpu, struct task_struct *idle)
+static int do_boot_cpu(u32 apicid, unsigned int cpu, struct task_struct *idle)
 {
     unsigned long start_ip = real_mode_header->trampoline_start;
     int ret;
···
      * - Use an INIT boot APIC message
      */
     if (apic->wakeup_secondary_cpu_64)
-        ret = apic->wakeup_secondary_cpu_64(apicid, start_ip);
+        ret = apic->wakeup_secondary_cpu_64(apicid, start_ip, cpu);
     else if (apic->wakeup_secondary_cpu)
-        ret = apic->wakeup_secondary_cpu(apicid, start_ip);
+        ret = apic->wakeup_secondary_cpu(apicid, start_ip, cpu);
     else
-        ret = wakeup_secondary_cpu_via_init(apicid, start_ip);
+        ret = wakeup_secondary_cpu_via_init(apicid, start_ip, cpu);

     /* If the wakeup mechanism failed, cleanup the warm reset vector */
     if (ret)
+14 -2
drivers/acpi/irq.c
···
 enum acpi_irq_model_id acpi_irq_model;

-static struct fwnode_handle *(*acpi_get_gsi_domain_id)(u32 gsi);
+static acpi_gsi_domain_disp_fn acpi_get_gsi_domain_id;
 static u32 (*acpi_gsi_to_irq_fallback)(u32 gsi);

 /**
···
  *      for a given GSI
  */
 void __init acpi_set_irq_model(enum acpi_irq_model_id model,
-                               struct fwnode_handle *(*fn)(u32))
+                               acpi_gsi_domain_disp_fn fn)
 {
     acpi_irq_model = model;
     acpi_get_gsi_domain_id = fn;
 }
+
+/*
+ * acpi_get_gsi_dispatcher() - Get the GSI dispatcher function
+ *
+ * Return the dispatcher function that computes the domain fwnode for
+ * a given GSI.
+ */
+acpi_gsi_domain_disp_fn acpi_get_gsi_dispatcher(void)
+{
+    return acpi_get_gsi_domain_id;
+}
+EXPORT_SYMBOL_GPL(acpi_get_gsi_dispatcher);

 /**
  * acpi_set_gsi_to_irq_fallback - Register a GSI transfer
+2 -8
drivers/firmware/smccc/kvm_guest.c
···
 void __init kvm_init_hyp_services(void)
 {
+    uuid_t kvm_uuid = ARM_SMCCC_VENDOR_HYP_UID_KVM;
     struct arm_smccc_res res;
     u32 val[4];

-    if (arm_smccc_1_1_get_conduit() != SMCCC_CONDUIT_HVC)
-        return;
-
-    arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, &res);
-    if (res.a0 != ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_0 ||
-        res.a1 != ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_1 ||
-        res.a2 != ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_2 ||
-        res.a3 != ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_3)
+    if (!arm_smccc_hypervisor_has_uuid(&kvm_uuid))
         return;

     memset(&res, 0, sizeof(res));
+17
drivers/firmware/smccc/smccc.c
···
 }
 EXPORT_SYMBOL_GPL(arm_smccc_get_soc_id_revision);

+bool arm_smccc_hypervisor_has_uuid(const uuid_t *hyp_uuid)
+{
+    struct arm_smccc_res res = {};
+    uuid_t uuid;
+
+    if (arm_smccc_1_1_get_conduit() != SMCCC_CONDUIT_HVC)
+        return false;
+
+    arm_smccc_1_1_hvc(ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, &res);
+    if (res.a0 == SMCCC_RET_NOT_SUPPORTED)
+        return false;
+
+    uuid = smccc_res_to_uuid(res.a0, res.a1, res.a2, res.a3);
+    return uuid_equal(&uuid, hyp_uuid);
+}
+EXPORT_SYMBOL_GPL(arm_smccc_hypervisor_has_uuid);
+
 static int __init smccc_devices_init(void)
 {
     struct platform_device *pdev;
+4 -3
drivers/hv/Kconfig
···
 config HYPERV
     tristate "Microsoft Hyper-V client drivers"
     depends on (X86 && X86_LOCAL_APIC && HYPERVISOR_GUEST) \
-        || (ACPI && ARM64 && !CPU_BIG_ENDIAN)
+        || (ARM64 && !CPU_BIG_ENDIAN)
     select PARAVIRT
     select X86_HV_CALLBACK_VECTOR if X86
     select OF_EARLY_FLATTREE if OF
+    select SYSFB if !HYPERV_VTL_MODE
     help
       Select this option to run Linux as a Hyper-V client operating
       system.

 config HYPERV_VTL_MODE
     bool "Enable Linux to boot in VTL context"
-    depends on X86_64 && HYPERV
+    depends on (X86_64 || ARM64) && HYPERV
     depends on SMP
     default n
     help
···
       Select this option to build a Linux kernel to run at a VTL other than
       the normal VTL0, which currently is only VTL2.  This option
-      initializes the x86 platform for VTL2, and adds the ability to boot
+      initializes the kernel to run in VTL2, and adds the ability to boot
       secondary CPUs directly into 64-bit context as required for VTLs other
       than 0.  A kernel built with this option must run at VTL2, and will
       not run as a normal guest.
+17 -6
drivers/hv/connection.c
···
     mutex_init(&vmbus_connection.channel_mutex);

     /*
+     * The following Hyper-V interrupt and monitor pages can be used by
+     * UIO for mapping to user-space, so they should always be allocated on
+     * system page boundaries. The system page size must be >= the Hyper-V
+     * page size.
+     */
+    BUILD_BUG_ON(PAGE_SIZE < HV_HYP_PAGE_SIZE);
+
+    /*
      * Setup the vmbus event connection for channel interrupt
      * abstraction stuff
      */
-    vmbus_connection.int_page = hv_alloc_hyperv_zeroed_page();
+    vmbus_connection.int_page =
+        (void *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
     if (vmbus_connection.int_page == NULL) {
         ret = -ENOMEM;
         goto cleanup;
···
      * Setup the monitor notification facility. The 1st page for
      * parent->child and the 2nd page for child->parent
      */
-    vmbus_connection.monitor_pages[0] = hv_alloc_hyperv_page();
-    vmbus_connection.monitor_pages[1] = hv_alloc_hyperv_page();
+    vmbus_connection.monitor_pages[0] = (void *)__get_free_page(GFP_KERNEL);
+    vmbus_connection.monitor_pages[1] = (void *)__get_free_page(GFP_KERNEL);
     if ((vmbus_connection.monitor_pages[0] == NULL) ||
         (vmbus_connection.monitor_pages[1] == NULL)) {
         ret = -ENOMEM;
···
     destroy_workqueue(vmbus_connection.work_queue);

     if (vmbus_connection.int_page) {
-        hv_free_hyperv_page(vmbus_connection.int_page);
+        free_page((unsigned long)vmbus_connection.int_page);
         vmbus_connection.int_page = NULL;
     }

     if (vmbus_connection.monitor_pages[0]) {
         if (!set_memory_encrypted(
             (unsigned long)vmbus_connection.monitor_pages[0], 1))
-            hv_free_hyperv_page(vmbus_connection.monitor_pages[0]);
+            free_page((unsigned long)
+                      vmbus_connection.monitor_pages[0]);
         vmbus_connection.monitor_pages[0] = NULL;
     }

     if (vmbus_connection.monitor_pages[1]) {
         if (!set_memory_encrypted(
             (unsigned long)vmbus_connection.monitor_pages[1], 1))
-            hv_free_hyperv_page(vmbus_connection.monitor_pages[1]);
+            free_page((unsigned long)
+                      vmbus_connection.monitor_pages[1]);
         vmbus_connection.monitor_pages[1] = NULL;
     }
 }
+34 -42
drivers/hv/hv_common.c
···
     hv_synic_eventring_tail = NULL;
 }

-/*
- * Functions for allocating and freeing memory with size and
- * alignment HV_HYP_PAGE_SIZE. These functions are needed because
- * the guest page size may not be the same as the Hyper-V page
- * size. We depend upon kmalloc() aligning power-of-two size
- * allocations to the allocation size boundary, so that the
- * allocated memory appears to Hyper-V as a page of the size
- * it expects.
- */
-
-void *hv_alloc_hyperv_page(void)
-{
-    BUILD_BUG_ON(PAGE_SIZE < HV_HYP_PAGE_SIZE);
-
-    if (PAGE_SIZE == HV_HYP_PAGE_SIZE)
-        return (void *)__get_free_page(GFP_KERNEL);
-    else
-        return kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL);
-}
-EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page);
-
-void *hv_alloc_hyperv_zeroed_page(void)
-{
-    if (PAGE_SIZE == HV_HYP_PAGE_SIZE)
-        return (void *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
-    else
-        return kzalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL);
-}
-EXPORT_SYMBOL_GPL(hv_alloc_hyperv_zeroed_page);
-
-void hv_free_hyperv_page(void *addr)
-{
-    if (PAGE_SIZE == HV_HYP_PAGE_SIZE)
-        free_page((unsigned long)addr);
-    else
-        kfree(addr);
-}
-EXPORT_SYMBOL_GPL(hv_free_hyperv_page);
-
 static void *hv_panic_page;

 /*
···
     atomic_notifier_chain_unregister(&panic_notifier_list,
                                      &hyperv_panic_report_block);

-    hv_free_hyperv_page(hv_panic_page);
+    kfree(hv_panic_page);
     hv_panic_page = NULL;
 }

···
 {
     int ret;

-    hv_panic_page = hv_alloc_hyperv_zeroed_page();
+    hv_panic_page = kzalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL);
     if (!hv_panic_page) {
         pr_err("Hyper-V: panic message page memory allocation failed\n");
         return;
···
     ret = kmsg_dump_register(&hv_kmsg_dumper);
     if (ret) {
         pr_err("Hyper-V: kmsg dump register error 0x%x\n", ret);
-        hv_free_hyperv_page(hv_panic_page);
+        kfree(hv_panic_page);
         hv_panic_page = NULL;
     }
 }
···
         pr_err("Hyper-V: failed to get partition ID: %#x\n",
                hv_result(status));
 }
+#if IS_ENABLED(CONFIG_HYPERV_VTL_MODE)
+u8 __init get_vtl(void)
+{
+    u64 control = HV_HYPERCALL_REP_COMP_1 | HVCALL_GET_VP_REGISTERS;
+    struct hv_input_get_vp_registers *input;
+    struct hv_output_get_vp_registers *output;
+    unsigned long flags;
+    u64 ret;
+
+    local_irq_save(flags);
+    input = *this_cpu_ptr(hyperv_pcpu_input_arg);
+    output = *this_cpu_ptr(hyperv_pcpu_output_arg);
+
+    memset(input, 0, struct_size(input, names, 1));
+    input->partition_id = HV_PARTITION_ID_SELF;
+    input->vp_index = HV_VP_INDEX_SELF;
+    input->input_vtl.as_uint8 = 0;
+    input->names[0] = HV_REGISTER_VSM_VP_STATUS;
+
+    ret = hv_do_hypercall(control, input, output);
+    if (hv_result_success(ret)) {
+        ret = output->values[0].reg8 & HV_VTL_MASK;
+    } else {
+        pr_err("Failed to get VTL(error: %lld) exiting...\n", ret);
+        BUG();
+    }
+
+    local_irq_restore(flags);
+    return ret;
+}
+#endif

 int __init hv_common_init(void)
 {
+85 -10
drivers/hv/vmbus_drv.c
···
     struct hv_vmbus_device_id id;
 };

-static struct device *hv_dev;
+/* VMBus Root Device */
+static struct device *vmbus_root_device;

 static int hyperv_cpuhp_online;
···
 static struct resource *hyperv_mmio;
 static DEFINE_MUTEX(hyperv_mmio_lock);

+struct device *hv_get_vmbus_root_device(void)
+{
+    return vmbus_root_device;
+}
+EXPORT_SYMBOL_GPL(hv_get_vmbus_root_device);
+
 static int vmbus_exists(void)
 {
-    if (hv_dev == NULL)
+    if (vmbus_root_device == NULL)
         return -ENODEV;

     return 0;
···
     return id;
 }

-/* vmbus_add_dynid - add a new device ID to this driver and re-probe devices */
+/* vmbus_add_dynid - add a new device ID to this driver and re-probe devices
+ *
+ * This function can race with vmbus_device_register(). This function is
+ * typically running on a user thread in response to writing to the "new_id"
+ * sysfs entry for a driver. vmbus_device_register() is running on a
+ * workqueue thread in response to the Hyper-V host offering a device to the
+ * guest. This function calls driver_attach(), which looks for an existing
+ * device matching the new id, and attaches the driver to which the new id
+ * has been assigned. vmbus_device_register() calls device_register(), which
+ * looks for a driver that matches the device being registered. If both
+ * operations are running simultaneously, the device driver probe function runs
+ * on whichever thread establishes the linkage between the driver and device.
+ *
+ * In most cases, it doesn't matter which thread runs the driver probe
+ * function. But if vmbus_device_register() does not find a matching driver,
+ * it proceeds to create the "channels" subdirectory and numbered per-channel
+ * subdirectory in sysfs. While that multi-step creation is in progress, this
+ * function could run the driver probe function. If the probe function checks
+ * for, or operates on, entries in the "channels" subdirectory, including by
+ * calling hv_create_ring_sysfs(), the operation may or may not succeed
+ * depending on the race. The race can't create a kernel failure in VMBus
+ * or device subsystem code, but probe functions in VMBus drivers doing such
+ * operations must be prepared for the failure case.
+ */
 static int vmbus_add_dynid(struct hv_driver *drv, guid_t *guid)
 {
     struct vmbus_dynid *dynid;
···
      * On x86/x64 coherence is assumed and these calls have no effect.
      */
     hv_setup_dma_ops(child_device,
-            device_get_dma_attr(hv_dev) == DEV_DMA_COHERENT);
+            device_get_dma_attr(vmbus_root_device) == DEV_DMA_COHERENT);
     return 0;
 }
···
  * ring for userspace to use.
  * Note: Race conditions can happen with userspace and it is not encouraged to create new
  * use-cases for this. This was added to maintain backward compatibility, while solving
- * one of the race conditions in uio_hv_generic while creating sysfs.
+ * one of the race conditions in uio_hv_generic while creating sysfs. See comments with
+ * vmbus_add_dynid() and vmbus_device_register().
  *
  * Returns 0 on success or error code on failure.
  */
···
         &child_device_obj->channel->offermsg.offer.if_instance);

     child_device_obj->device.bus = &hv_bus;
-    child_device_obj->device.parent = hv_dev;
+    child_device_obj->device.parent = vmbus_root_device;
     child_device_obj->device.release = vmbus_device_release;

     child_device_obj->device.dma_parms = &child_device_obj->dma_parms;
···
         return ret;
     }

+    /*
+     * If device_register() found a driver to assign to the device, the
+     * driver's probe function has already run at this point. If that
+     * probe function accesses or operates on the "channels" subdirectory
+     * in sysfs, those operations will have failed because the "channels"
+     * subdirectory doesn't exist until the code below runs. Or if the
+     * probe function creates a /dev entry, a user space program could
+     * find and open the /dev entry, and then create a race by accessing
+     * the "channels" subdirectory while the creation steps are in progress
+     * here. The race can't result in a kernel failure, but the user space
+     * program may get an error in accessing "channels" or its
+     * subdirectories. See also comments with vmbus_add_dynid() about a
+     * related race condition.
2071 + */ 2089 2072 child_device_obj->channels_kset = kset_create_and_add("channels", 2090 2073 NULL, kobj); 2091 2074 if (!child_device_obj->channels_kset) { ··· 2457 2412 struct acpi_device *ancestor; 2458 2413 struct acpi_device *device = ACPI_COMPANION(&pdev->dev); 2459 2414 2460 - hv_dev = &device->dev; 2415 + vmbus_root_device = &device->dev; 2461 2416 2462 2417 /* 2463 2418 * Older versions of Hyper-V for ARM64 fail to include the _CCA ··· 2510 2465 } 2511 2466 #endif 2512 2467 2468 + static int vmbus_set_irq(struct platform_device *pdev) 2469 + { 2470 + struct irq_data *data; 2471 + int irq; 2472 + irq_hw_number_t hwirq; 2473 + 2474 + irq = platform_get_irq(pdev, 0); 2475 + /* platform_get_irq() may not return 0. */ 2476 + if (irq < 0) 2477 + return irq; 2478 + 2479 + data = irq_get_irq_data(irq); 2480 + if (!data) { 2481 + pr_err("No interrupt data for VMBus virq %d\n", irq); 2482 + return -ENODEV; 2483 + } 2484 + hwirq = irqd_to_hwirq(data); 2485 + 2486 + vmbus_irq = irq; 2487 + vmbus_interrupt = hwirq; 2488 + pr_debug("VMBus virq %d, hwirq %d\n", vmbus_irq, vmbus_interrupt); 2489 + 2490 + return 0; 2491 + } 2492 + 2513 2493 static int vmbus_device_add(struct platform_device *pdev) 2514 2494 { 2515 2495 struct resource **cur_res = &hyperv_mmio; ··· 2543 2473 struct device_node *np = pdev->dev.of_node; 2544 2474 int ret; 2545 2475 2546 - hv_dev = &pdev->dev; 2476 + vmbus_root_device = &pdev->dev; 2547 2477 2548 2478 ret = of_range_parser_init(&parser, np); 2479 + if (ret) 2480 + return ret; 2481 + 2482 + if (!__is_defined(HYPERVISOR_CALLBACK_VECTOR)) 2483 + ret = vmbus_set_irq(pdev); 2549 2484 if (ret) 2550 2485 return ret; 2551 2486 ··· 2861 2786 if (ret) 2862 2787 return ret; 2863 2788 2864 - if (!hv_dev) { 2789 + if (!vmbus_root_device) { 2865 2790 ret = -ENODEV; 2866 2791 goto cleanup; 2867 2792 } ··· 2892 2817 2893 2818 cleanup: 2894 2819 platform_driver_unregister(&vmbus_platform_driver); 2895 - hv_dev = NULL; 2820 + vmbus_root_device = NULL; 2896 
2821 return ret; 2897 2822 } 2898 2823
+78 -21
drivers/pci/controller/pci-hyperv.c
··· 50 50 #include <linux/irqdomain.h> 51 51 #include <linux/acpi.h> 52 52 #include <linux/sizes.h> 53 + #include <linux/of_irq.h> 53 54 #include <asm/mshyperv.h> 54 55 55 56 /* ··· 310 309 void (*completion_func)(void *context, struct pci_response *resp, 311 310 int resp_packet_size); 312 311 void *compl_ctxt; 313 - 314 - struct pci_message message[]; 315 312 }; 316 313 317 314 /* ··· 816 817 int ret; 817 818 818 819 fwspec.fwnode = domain->parent->fwnode; 819 - fwspec.param_count = 2; 820 - fwspec.param[0] = hwirq; 821 - fwspec.param[1] = IRQ_TYPE_EDGE_RISING; 820 + if (is_of_node(fwspec.fwnode)) { 821 + /* SPI lines for OF translations start at offset 32 */ 822 + fwspec.param_count = 3; 823 + fwspec.param[0] = 0; 824 + fwspec.param[1] = hwirq - 32; 825 + fwspec.param[2] = IRQ_TYPE_EDGE_RISING; 826 + } else { 827 + fwspec.param_count = 2; 828 + fwspec.param[0] = hwirq; 829 + fwspec.param[1] = IRQ_TYPE_EDGE_RISING; 830 + } 822 831 823 832 ret = irq_domain_alloc_irqs_parent(domain, virq, 1, &fwspec); 824 833 if (ret) ··· 894 887 .activate = hv_pci_vec_irq_domain_activate, 895 888 }; 896 889 890 + #ifdef CONFIG_OF 891 + 892 + static struct irq_domain *hv_pci_of_irq_domain_parent(void) 893 + { 894 + struct device_node *parent; 895 + struct irq_domain *domain; 896 + 897 + parent = of_irq_find_parent(hv_get_vmbus_root_device()->of_node); 898 + if (!parent) 899 + return NULL; 900 + domain = irq_find_host(parent); 901 + of_node_put(parent); 902 + 903 + return domain; 904 + } 905 + 906 + #endif 907 + 908 + #ifdef CONFIG_ACPI 909 + 910 + static struct irq_domain *hv_pci_acpi_irq_domain_parent(void) 911 + { 912 + acpi_gsi_domain_disp_fn gsi_domain_disp_fn; 913 + 914 + gsi_domain_disp_fn = acpi_get_gsi_dispatcher(); 915 + if (!gsi_domain_disp_fn) 916 + return NULL; 917 + return irq_find_matching_fwnode(gsi_domain_disp_fn(0), 918 + DOMAIN_BUS_ANY); 919 + } 920 + 921 + #endif 922 + 897 923 static int hv_pci_irqchip_init(void) 898 924 { 899 925 static struct hv_pci_chip_data 
*chip_data; 900 926 struct fwnode_handle *fn = NULL; 927 + struct irq_domain *irq_domain_parent = NULL; 901 928 int ret = -ENOMEM; 902 929 903 930 chip_data = kzalloc(sizeof(*chip_data), GFP_KERNEL); ··· 948 907 * way to ensure that all the corresponding devices are also gone and 949 908 * no interrupts will be generated. 950 909 */ 951 - hv_msi_gic_irq_domain = acpi_irq_create_hierarchy(0, HV_PCI_MSI_SPI_NR, 952 - fn, &hv_pci_domain_ops, 953 - chip_data); 910 + #ifdef CONFIG_ACPI 911 + if (!acpi_disabled) 912 + irq_domain_parent = hv_pci_acpi_irq_domain_parent(); 913 + #endif 914 + #ifdef CONFIG_OF 915 + if (!irq_domain_parent) 916 + irq_domain_parent = hv_pci_of_irq_domain_parent(); 917 + #endif 918 + if (!irq_domain_parent) { 919 + WARN_ONCE(1, "Invalid firmware configuration for VMBus interrupts\n"); 920 + ret = -EINVAL; 921 + goto free_chip; 922 + } 923 + 924 + hv_msi_gic_irq_domain = irq_domain_create_hierarchy(irq_domain_parent, 0, 925 + HV_PCI_MSI_SPI_NR, 926 + fn, &hv_pci_domain_ops, 927 + chip_data); 954 928 955 929 if (!hv_msi_gic_irq_domain) { 956 930 pr_err("Failed to create Hyper-V arm64 vPCI MSI IRQ domain\n"); ··· 1494 1438 memset(&pkt, 0, sizeof(pkt)); 1495 1439 pkt.pkt.completion_func = hv_pci_read_config_compl; 1496 1440 pkt.pkt.compl_ctxt = &comp_pkt; 1497 - read_blk = (struct pci_read_block *)&pkt.pkt.message; 1441 + read_blk = (struct pci_read_block *)pkt.buf; 1498 1442 read_blk->message_type.type = PCI_READ_BLOCK; 1499 1443 read_blk->wslot.slot = devfn_to_wslot(pdev->devfn); 1500 1444 read_blk->block_id = block_id; ··· 1574 1518 memset(&pkt, 0, sizeof(pkt)); 1575 1519 pkt.pkt.completion_func = hv_pci_write_config_compl; 1576 1520 pkt.pkt.compl_ctxt = &comp_pkt; 1577 - write_blk = (struct pci_write_block *)&pkt.pkt.message; 1521 + write_blk = (struct pci_write_block *)pkt.buf; 1578 1522 write_blk->message_type.type = PCI_WRITE_BLOCK; 1579 1523 write_blk->wslot.slot = devfn_to_wslot(pdev->devfn); 1580 1524 write_blk->block_id = block_id; ··· 
1655 1599 return; 1656 1600 } 1657 1601 memset(&ctxt, 0, sizeof(ctxt)); 1658 - int_pkt = (struct pci_delete_interrupt *)&ctxt.pkt.message; 1602 + int_pkt = (struct pci_delete_interrupt *)ctxt.buffer; 1659 1603 int_pkt->message_type.type = 1660 1604 PCI_DELETE_INTERRUPT_MESSAGE; 1661 1605 int_pkt->wslot.slot = hpdev->desc.win_slot.slot; ··· 2538 2482 comp_pkt.hpdev = hpdev; 2539 2483 pkt.init_packet.compl_ctxt = &comp_pkt; 2540 2484 pkt.init_packet.completion_func = q_resource_requirements; 2541 - res_req = (struct pci_child_message *)&pkt.init_packet.message; 2485 + res_req = (struct pci_child_message *)pkt.buffer; 2542 2486 res_req->message_type.type = PCI_QUERY_RESOURCE_REQUIREMENTS; 2543 2487 res_req->wslot.slot = desc->win_slot.slot; 2544 2488 ··· 2916 2860 pci_destroy_slot(hpdev->pci_slot); 2917 2861 2918 2862 memset(&ctxt, 0, sizeof(ctxt)); 2919 - ejct_pkt = (struct pci_eject_response *)&ctxt.pkt.message; 2863 + ejct_pkt = (struct pci_eject_response *)ctxt.buffer; 2920 2864 ejct_pkt->message_type.type = PCI_EJECTION_COMPLETE; 2921 2865 ejct_pkt->wslot.slot = hpdev->desc.win_slot.slot; 2922 2866 vmbus_sendpacket(hbus->hdev->channel, ejct_pkt, ··· 3174 3118 init_completion(&comp_pkt.host_event); 3175 3119 pkt->completion_func = hv_pci_generic_compl; 3176 3120 pkt->compl_ctxt = &comp_pkt; 3177 - version_req = (struct pci_version_request *)&pkt->message; 3121 + version_req = (struct pci_version_request *)(pkt + 1); 3178 3122 version_req->message_type.type = PCI_QUERY_PROTOCOL_VERSION; 3179 3123 3180 3124 for (i = 0; i < num_version; i++) { ··· 3396 3340 init_completion(&comp_pkt.host_event); 3397 3341 pkt->completion_func = hv_pci_generic_compl; 3398 3342 pkt->compl_ctxt = &comp_pkt; 3399 - d0_entry = (struct pci_bus_d0_entry *)&pkt->message; 3343 + d0_entry = (struct pci_bus_d0_entry *)(pkt + 1); 3400 3344 d0_entry->message_type.type = PCI_BUS_D0ENTRY; 3401 3345 d0_entry->mmio_base = hbus->mem_config->start; 3402 3346 ··· 3554 3498 3555 3499 if 
(hbus->protocol_version < PCI_PROTOCOL_VERSION_1_2) { 3556 3500 res_assigned = 3557 - (struct pci_resources_assigned *)&pkt->message; 3501 + (struct pci_resources_assigned *)(pkt + 1); 3558 3502 res_assigned->message_type.type = 3559 3503 PCI_RESOURCES_ASSIGNED; 3560 3504 res_assigned->wslot.slot = hpdev->desc.win_slot.slot; 3561 3505 } else { 3562 3506 res_assigned2 = 3563 - (struct pci_resources_assigned2 *)&pkt->message; 3507 + (struct pci_resources_assigned2 *)(pkt + 1); 3564 3508 res_assigned2->message_type.type = 3565 3509 PCI_RESOURCES_ASSIGNED2; 3566 3510 res_assigned2->wslot.slot = hpdev->desc.win_slot.slot; 3567 3511 } 3568 3512 put_pcichild(hpdev); 3569 3513 3570 - ret = vmbus_sendpacket(hdev->channel, &pkt->message, 3514 + ret = vmbus_sendpacket(hdev->channel, pkt + 1, 3571 3515 size_res, (unsigned long)pkt, 3572 3516 VM_PKT_DATA_INBAND, 3573 3517 VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED); ··· 3865 3809 struct pci_packet teardown_packet; 3866 3810 u8 buffer[sizeof(struct pci_message)]; 3867 3811 } pkt; 3812 + struct pci_message *msg; 3868 3813 struct hv_pci_compl comp_pkt; 3869 3814 struct hv_pci_dev *hpdev, *tmp; 3870 3815 unsigned long flags; ··· 3911 3854 init_completion(&comp_pkt.host_event); 3912 3855 pkt.teardown_packet.completion_func = hv_pci_generic_compl; 3913 3856 pkt.teardown_packet.compl_ctxt = &comp_pkt; 3914 - pkt.teardown_packet.message[0].type = PCI_BUS_D0EXIT; 3857 + msg = (struct pci_message *)pkt.buffer; 3858 + msg->type = PCI_BUS_D0EXIT; 3915 3859 3916 - ret = vmbus_sendpacket_getid(chan, &pkt.teardown_packet.message, 3917 - sizeof(struct pci_message), 3860 + ret = vmbus_sendpacket_getid(chan, msg, sizeof(*msg), 3918 3861 (unsigned long)&pkt.teardown_packet, 3919 3862 &trans_id, VM_PKT_DATA_INBAND, 3920 3863 VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
+5 -2
drivers/uio/uio_hv_generic.c
··· 243 243 if (!ring_size) 244 244 ring_size = SZ_2M; 245 245 246 + /* Adjust ring size if necessary to have it page aligned */ 247 + ring_size = VMBUS_RING_SIZE(ring_size); 248 + 246 249 pdata = devm_kzalloc(&dev->device, sizeof(*pdata), GFP_KERNEL); 247 250 if (!pdata) 248 251 return -ENOMEM; ··· 277 274 pdata->info.mem[INT_PAGE_MAP].name = "int_page"; 278 275 pdata->info.mem[INT_PAGE_MAP].addr 279 276 = (uintptr_t)vmbus_connection.int_page; 280 - pdata->info.mem[INT_PAGE_MAP].size = PAGE_SIZE; 277 + pdata->info.mem[INT_PAGE_MAP].size = HV_HYP_PAGE_SIZE; 281 278 pdata->info.mem[INT_PAGE_MAP].memtype = UIO_MEM_LOGICAL; 282 279 283 280 pdata->info.mem[MON_PAGE_MAP].name = "monitor_page"; 284 281 pdata->info.mem[MON_PAGE_MAP].addr 285 282 = (uintptr_t)vmbus_connection.monitor_pages[1]; 286 - pdata->info.mem[MON_PAGE_MAP].size = PAGE_SIZE; 283 + pdata->info.mem[MON_PAGE_MAP].size = HV_HYP_PAGE_SIZE; 287 284 pdata->info.mem[MON_PAGE_MAP].memtype = UIO_MEM_LOGICAL; 288 285 289 286 if (channel->device_id == HV_NIC) {
+6 -4
include/asm-generic/mshyperv.h
··· 236 236 int hv_common_cpu_die(unsigned int cpu); 237 237 void hv_identify_partition_type(void); 238 238 239 - void *hv_alloc_hyperv_page(void); 240 - void *hv_alloc_hyperv_zeroed_page(void); 241 - void hv_free_hyperv_page(void *addr); 242 - 243 239 /** 244 240 * hv_cpu_number_to_vp_number() - Map CPU to VP. 245 241 * @cpu_number: CPU number in Linux terms ··· 373 377 return -EOPNOTSUPP; 374 378 } 375 379 #endif /* CONFIG_MSHV_ROOT */ 380 + 381 + #if IS_ENABLED(CONFIG_HYPERV_VTL_MODE) 382 + u8 __init get_vtl(void); 383 + #else 384 + static inline u8 get_vtl(void) { return 0; } 385 + #endif 376 386 377 387 #endif
+2 -2
include/hyperv/hvgdk_mini.h
··· 475 475 #define HVCALL_CREATE_PORT 0x0095 476 476 #define HVCALL_CONNECT_PORT 0x0096 477 477 #define HVCALL_START_VP 0x0099 478 - #define HVCALL_GET_VP_ID_FROM_APIC_ID 0x009a 478 + #define HVCALL_GET_VP_INDEX_FROM_APIC_ID 0x009a 479 479 #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE 0x00af 480 480 #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_LIST 0x00b0 481 481 #define HVCALL_SIGNAL_EVENT_DIRECT 0x00c0 ··· 1228 1228 u64 cpu_mask; 1229 1229 } __packed; 1230 1230 1231 - #define HV_X64_VTL_MASK GENMASK(3, 0) 1231 + #define HV_VTL_MASK GENMASK(3, 0) 1232 1232 1233 1233 /* Hyper-V memory host visibility */ 1234 1234 enum hv_mem_host_visibility {
+4 -1
include/linux/acpi.h
··· 335 335 int acpi_gsi_to_irq (u32 gsi, unsigned int *irq); 336 336 int acpi_isa_irq_to_gsi (unsigned isa_irq, u32 *gsi); 337 337 338 + typedef struct fwnode_handle *(*acpi_gsi_domain_disp_fn)(u32); 339 + 338 340 void acpi_set_irq_model(enum acpi_irq_model_id model, 339 - struct fwnode_handle *(*)(u32)); 341 + acpi_gsi_domain_disp_fn fn); 342 + acpi_gsi_domain_disp_fn acpi_get_gsi_dispatcher(void); 340 343 void acpi_set_gsi_to_irq_fallback(u32 (*)(u32)); 341 344 342 345 struct irq_domain *acpi_irq_create_hierarchy(unsigned int flags,
+60 -4
include/linux/arm-smccc.h
··· 7 7 8 8 #include <linux/args.h> 9 9 #include <linux/init.h> 10 + 11 + #ifndef __ASSEMBLY__ 12 + #include <linux/uuid.h> 13 + #endif 14 + 10 15 #include <uapi/linux/const.h> 11 16 12 17 /* ··· 112 107 ARM_SMCCC_FUNC_QUERY_CALL_UID) 113 108 114 109 /* KVM UID value: 28b46fb6-2ec5-11e9-a9ca-4b564d003a74 */ 115 - #define ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_0 0xb66fb428U 116 - #define ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_1 0xe911c52eU 117 - #define ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_2 0x564bcaa9U 118 - #define ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_3 0x743a004dU 110 + #define ARM_SMCCC_VENDOR_HYP_UID_KVM UUID_INIT(\ 111 + 0xb66fb428, 0xc52e, 0xe911, \ 112 + 0xa9, 0xca, 0x4b, 0x56, \ 113 + 0x4d, 0x00, 0x3a, 0x74) 119 114 120 115 /* KVM "vendor specific" services */ 121 116 #define ARM_SMCCC_KVM_FUNC_FEATURES 0 ··· 352 347 * When ARM_SMCCC_ARCH_SOC_ID is not present, returns SMCCC_RET_NOT_SUPPORTED. 353 348 */ 354 349 s32 arm_smccc_get_soc_id_revision(void); 350 + 351 + #ifndef __ASSEMBLY__ 352 + 353 + /* 354 + * Returns whether a specific hypervisor UUID is advertised for the 355 + * Vendor Specific Hypervisor Service range. 
356 + */ 357 + bool arm_smccc_hypervisor_has_uuid(const uuid_t *uuid); 358 + 359 + static inline uuid_t smccc_res_to_uuid(u32 r0, u32 r1, u32 r2, u32 r3) 360 + { 361 + uuid_t uuid = { 362 + .b = { 363 + [0] = (r0 >> 0) & 0xff, 364 + [1] = (r0 >> 8) & 0xff, 365 + [2] = (r0 >> 16) & 0xff, 366 + [3] = (r0 >> 24) & 0xff, 367 + 368 + [4] = (r1 >> 0) & 0xff, 369 + [5] = (r1 >> 8) & 0xff, 370 + [6] = (r1 >> 16) & 0xff, 371 + [7] = (r1 >> 24) & 0xff, 372 + 373 + [8] = (r2 >> 0) & 0xff, 374 + [9] = (r2 >> 8) & 0xff, 375 + [10] = (r2 >> 16) & 0xff, 376 + [11] = (r2 >> 24) & 0xff, 377 + 378 + [12] = (r3 >> 0) & 0xff, 379 + [13] = (r3 >> 8) & 0xff, 380 + [14] = (r3 >> 16) & 0xff, 381 + [15] = (r3 >> 24) & 0xff, 382 + }, 383 + }; 384 + 385 + return uuid; 386 + } 387 + 388 + static inline u32 smccc_uuid_to_reg(const uuid_t *uuid, int reg) 389 + { 390 + u32 val = 0; 391 + 392 + val |= (u32)(uuid->b[4 * reg + 0] << 0); 393 + val |= (u32)(uuid->b[4 * reg + 1] << 8); 394 + val |= (u32)(uuid->b[4 * reg + 2] << 16); 395 + val |= (u32)(uuid->b[4 * reg + 3] << 24); 396 + 397 + return val; 398 + } 399 + 400 + #endif /* !__ASSEMBLY__ */ 355 401 356 402 /** 357 403 * struct arm_smccc_res - Result from SMC/HVC call
+2
include/linux/hyperv.h
··· 1276 1276 return dev_get_drvdata(&dev->device); 1277 1277 } 1278 1278 1279 + struct device *hv_get_vmbus_root_device(void); 1280 + 1279 1281 struct hv_ring_buffer_debug_info { 1280 1282 u32 current_interrupt_mask; 1281 1283 u32 current_read_index;
+59 -5
tools/hv/hv_kvp_daemon.c
··· 84 84 }; 85 85 86 86 static int in_hand_shake; 87 + static int debug; 87 88 88 89 static char *os_name = ""; 89 90 static char *os_major = ""; ··· 185 184 kvp_release_lock(pool); 186 185 } 187 186 187 + static void kvp_dump_initial_pools(int pool) 188 + { 189 + int i; 190 + 191 + syslog(LOG_DEBUG, "===Start dumping the contents of pool %d ===\n", 192 + pool); 193 + 194 + for (i = 0; i < kvp_file_info[pool].num_records; i++) 195 + syslog(LOG_DEBUG, "pool: %d, %d/%d key=%s val=%s\n", 196 + pool, i + 1, kvp_file_info[pool].num_records, 197 + kvp_file_info[pool].records[i].key, 198 + kvp_file_info[pool].records[i].value); 199 + } 200 + 188 201 static void kvp_update_mem_state(int pool) 189 202 { 190 203 FILE *filep; ··· 286 271 return 1; 287 272 kvp_file_info[i].num_records = 0; 288 273 kvp_update_mem_state(i); 274 + if (debug) 275 + kvp_dump_initial_pools(i); 289 276 } 290 277 291 278 return 0; ··· 315 298 * Found a match; just move the remaining 316 299 * entries up. 317 300 */ 301 + if (debug) 302 + syslog(LOG_DEBUG, "%s: deleting the KVP: pool=%d key=%s val=%s", 303 + __func__, pool, record[i].key, record[i].value); 318 304 if (i == (num_records - 1)) { 319 305 kvp_file_info[pool].num_records--; 320 306 kvp_update_file(pool); ··· 336 316 kvp_update_file(pool); 337 317 return 0; 338 318 } 319 + 320 + if (debug) 321 + syslog(LOG_DEBUG, "%s: could not delete KVP: pool=%d key=%s. 
Record not found", 322 + __func__, pool, key); 323 + 339 324 return 1; 340 325 } 341 326 342 327 static int kvp_key_add_or_modify(int pool, const __u8 *key, int key_size, 343 328 const __u8 *value, int value_size) 344 329 { 345 - int i; 346 - int num_records; 347 330 struct kvp_record *record; 331 + int num_records; 348 332 int num_blocks; 333 + int i; 334 + 335 + if (debug) 336 + syslog(LOG_DEBUG, "%s: got a KVP: pool=%d key=%s val=%s", 337 + __func__, pool, key, value); 349 338 350 339 if ((key_size > HV_KVP_EXCHANGE_MAX_KEY_SIZE) || 351 - (value_size > HV_KVP_EXCHANGE_MAX_VALUE_SIZE)) 340 + (value_size > HV_KVP_EXCHANGE_MAX_VALUE_SIZE)) { 341 + syslog(LOG_ERR, "%s: Too long key or value: key=%s, val=%s", 342 + __func__, key, value); 343 + 344 + if (debug) 345 + syslog(LOG_DEBUG, "%s: Too long key or value: pool=%d, key=%s, val=%s", 346 + __func__, pool, key, value); 352 347 return 1; 348 + } 353 349 354 350 /* 355 351 * First update the in-memory state. ··· 385 349 */ 386 350 memcpy(record[i].value, value, value_size); 387 351 kvp_update_file(pool); 352 + if (debug) 353 + syslog(LOG_DEBUG, "%s: updated: pool=%d key=%s val=%s", 354 + __func__, pool, key, value); 388 355 return 0; 389 356 } 390 357 ··· 399 360 record = realloc(record, sizeof(struct kvp_record) * 400 361 ENTRIES_PER_BLOCK * (num_blocks + 1)); 401 362 402 - if (record == NULL) 363 + if (!record) { 364 + syslog(LOG_ERR, "%s: Memory alloc failure", __func__); 403 365 return 1; 366 + } 404 367 kvp_file_info[pool].num_blocks++; 405 368 406 369 } ··· 410 369 memcpy(record[i].key, key, key_size); 411 370 kvp_file_info[pool].records = record; 412 371 kvp_file_info[pool].num_records++; 372 + 373 + if (debug) 374 + syslog(LOG_DEBUG, "%s: added: pool=%d key=%s val=%s", 375 + __func__, pool, key, value); 376 + 413 377 kvp_update_file(pool); 414 378 return 0; 415 379 } ··· 1768 1722 fprintf(stderr, "Usage: %s [options]\n" 1769 1723 "Options are:\n" 1770 1724 " -n, --no-daemon stay in foreground, don't 
daemonize\n" 1725 + " -d, --debug Enable debug logs(syslog debug by default)\n" 1771 1726 " -h, --help print this help\n", argv[0]); 1772 1727 } 1773 1728 ··· 1790 1743 static struct option long_options[] = { 1791 1744 {"help", no_argument, 0, 'h' }, 1792 1745 {"no-daemon", no_argument, 0, 'n' }, 1746 + {"debug", no_argument, 0, 'd' }, 1793 1747 {0, 0, 0, 0 } 1794 1748 }; 1795 1749 1796 - while ((opt = getopt_long(argc, argv, "hn", long_options, 1750 + while ((opt = getopt_long(argc, argv, "hnd", long_options, 1797 1751 &long_index)) != -1) { 1798 1752 switch (opt) { 1799 1753 case 'n': ··· 1803 1755 case 'h': 1804 1756 print_usage(argv); 1805 1757 exit(0); 1758 + case 'd': 1759 + debug = 1; 1760 + break; 1806 1761 default: 1807 1762 print_usage(argv); 1808 1763 exit(EXIT_FAILURE); ··· 1827 1776 * unpredictable amount of time to finish. 1828 1777 */ 1829 1778 kvp_get_domain_name(full_domain_name, sizeof(full_domain_name)); 1779 + 1780 + if (debug) 1781 + syslog(LOG_INFO, "Logging debug info in syslog(debug)"); 1830 1782 1831 1783 if (kvp_file_init()) { 1832 1784 syslog(LOG_ERR, "Failed to initialize the pools");