Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

x86/hyperv: fix kexec crash due to VP assist page corruption

commit 9636be85cc5b ("x86/hyperv: Fix hyperv_pcpu_input_arg handling when
CPUs go online/offline") introduces a new cpuhp state for hyperv
initialization.

cpuhp_setup_state() returns the state number if state is
CPUHP_AP_ONLINE_DYN or CPUHP_BP_PREPARE_DYN and 0 on success for all
other states. Since the hyperv case now uses its own new cpuhp state, it
returns 0. However, in hv_machine_shutdown(), the cpuhp_remove_state()
call is conditioned upon "hyperv_init_cpuhp > 0". This is never true, so
hv_cpu_die() is not called on all CPUs and the VP assist page is not
reset. When the kexec kernel tries to set up the VP assist page again,
the hypervisor corrupts the memory region of the old VP assist page,
causing a panic if the kexec kernel is using that memory elsewhere.
This panic was originally fixed in commit dfe94d4086e4 ("x86/hyperv: Fix
kexec panic/hang issues").

Get rid of hyperv_init_cpuhp entirely since we are no longer using a
dynamic cpuhp state and use CPUHP_AP_HYPERV_ONLINE directly with
cpuhp_remove_state().
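
The return-value mismatch described above can be illustrated with a small
userspace model. This is only a sketch and not kernel code: the MODEL_*
enum values, model_cpuhp_setup_state(), old_guard_fires() and
new_guard_fires() are hypothetical names invented for illustration, and
the model returns the requested state number for dynamic states, whereas
the real kernel returns the dynamically allocated one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel cpuhp states; the numbers are made up. */
enum model_cpuhp_state {
	MODEL_CPUHP_BP_PREPARE_DYN   = 10,
	MODEL_CPUHP_AP_HYPERV_ONLINE = 20,
	MODEL_CPUHP_AP_ONLINE_DYN    = 30,
};

/*
 * Models only the success-path return convention from the commit
 * message: a positive state number for the two dynamic states, 0 for
 * any statically defined state.
 */
static int model_cpuhp_setup_state(enum model_cpuhp_state state)
{
	if (state == MODEL_CPUHP_AP_ONLINE_DYN ||
	    state == MODEL_CPUHP_BP_PREPARE_DYN)
		return (int)state;
	return 0;
}

/* Guard before the fix: depends on the saved setup return value. */
static bool old_guard_fires(bool kexec_in_progress, int hyperv_init_cpuhp)
{
	return kexec_in_progress && hyperv_init_cpuhp > 0;
}

/* Guard after the fix: the state is a known constant, nothing is saved. */
static bool new_guard_fires(bool kexec_in_progress)
{
	return kexec_in_progress;
}

/* Walks through the scenario from the commit message. */
void model_demo(void)
{
	/* With a static state, setup returns 0 even though it succeeded. */
	int saved = model_cpuhp_setup_state(MODEL_CPUHP_AP_HYPERV_ONLINE);

	assert(saved == 0);
	/* Old guard never fires, so the state teardown is skipped on kexec. */
	assert(!old_guard_fires(true, saved));
	/* New guard fires whenever a kexec is in progress. */
	assert(new_guard_fires(true));
}
```

In this model the old guard only works while the state is a dynamic one,
which is why dropping the saved variable and passing the constant state
directly removes the dependency on the return value.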

Cc: stable@vger.kernel.org
Fixes: 9636be85cc5b ("x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline")
Signed-off-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/20240828112158.3538342-1-anirudh@anirudhrb.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20240828112158.3538342-1-anirudh@anirudhrb.com>

Authored by Anirudh Rayabharam (Microsoft) and committed by Wei Liu
b9af6418 44305569

+3 -7
+1 -4
arch/x86/hyperv/hv_init.c
···
 #include <clocksource/hyperv_timer.h>
 #include <linux/highmem.h>
 
-int hyperv_init_cpuhp;
 u64 hv_current_partition_id = ~0ull;
 EXPORT_SYMBOL_GPL(hv_current_partition_id);
 
···
 
 	register_syscore_ops(&hv_syscore_ops);
 
-	hyperv_init_cpuhp = cpuhp;
-
 	if (cpuid_ebx(HYPERV_CPUID_FEATURES) & HV_ACCESS_PARTITION_ID)
 		hv_get_partition_id();
 
···
 clean_guest_os_id:
 	wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
 	hv_ivm_msr_write(HV_X64_MSR_GUEST_OS_ID, 0);
-	cpuhp_remove_state(cpuhp);
+	cpuhp_remove_state(CPUHP_AP_HYPERV_ONLINE);
 free_ghcb_page:
 	free_percpu(hv_ghcb_pg);
 free_vp_assist_page:
-1
arch/x86/include/asm/mshyperv.h
···
 }
 
 #if IS_ENABLED(CONFIG_HYPERV)
-extern int hyperv_init_cpuhp;
 extern bool hyperv_paravisor_present;
 
 extern void *hv_hypercall_pg;
+2 -2
arch/x86/kernel/cpu/mshyperv.c
···
 	 * Call hv_cpu_die() on all the CPUs, otherwise later the hypervisor
 	 * corrupts the old VP Assist Pages and can crash the kexec kernel.
 	 */
-	if (kexec_in_progress && hyperv_init_cpuhp > 0)
-		cpuhp_remove_state(hyperv_init_cpuhp);
+	if (kexec_in_progress)
+		cpuhp_remove_state(CPUHP_AP_HYPERV_ONLINE);
 
 	/* The function calls stop_other_cpus(). */
 	native_machine_shutdown();