From: Praveen K Paladugu
To: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, tglx@linutronix.de, mingo@redhat.com,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
	hpa@zytor.com, arnd@arndb.de
Cc: anbelski@linux.microsoft.com, prapal@linux.microsoft.com,
	easwar.hariharan@linux.microsoft.com,
	nunodasneves@linux.microsoft.com, skinsburskii@linux.microsoft.com
Subject: [PATCH v3 2/2] hyperv: Enable clean shutdown for root partition with MSHV
Date: Mon, 27 Oct 2025 15:28:43 -0500
Message-ID: <20251027202859.72006-3-prapal@linux.microsoft.com>
In-Reply-To: <20251027202859.72006-1-prapal@linux.microsoft.com>
References: <20251027202859.72006-1-prapal@linux.microsoft.com>

Without configuring sleep state info within mshv, if a root partition is
shut down, it will try to shut down by writing to ACPI regions. These
writes are intercepted by mshv and will result in a Machine Check
Exception (MCE).
Root eventually panics with a trace similar to:

[   81.306348] reboot: Power down
[   81.314709] mce: [Hardware Error]: CPU 0: Machine Check Exception: 4 Bank 0: b2000000c0060001
[   81.314711] mce: [Hardware Error]: TSC 3b8cb60a66 PPIN 11d98332458e4ea9
[   81.314713] mce: [Hardware Error]: PROCESSOR 0:606a6 TIME 1759339405 SOCKET 0 APIC 0 microcode ffffffff
[   81.314715] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[   81.314716] mce: [Hardware Error]: Machine check: Processor context corrupt
[   81.314717] Kernel panic - not syncing: Fatal machine check

To prevent this, properly configure sleep states within MSHV and register
a reboot notifier, allowing the root partition to shut down cleanly
without panicking. Only the S5 sleep state is enabled for now.

Signed-off-by: Praveen K Paladugu
Co-developed-by: Anatol Belski
Signed-off-by: Anatol Belski
Reviewed-by: Easwar Hariharan
---
 arch/x86/hyperv/hv_init.c       |   8 +++
 arch/x86/include/asm/mshyperv.h |   2 +
 drivers/hv/mshv_common.c        | 103 ++++++++++++++++++++++++++++++++
 3 files changed, 113 insertions(+)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index e28737ec7054..0a8856009792 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -555,6 +555,14 @@ void __init hyperv_init(void)
 
 		hv_remap_tsc_clocksource();
 		hv_root_crash_init();
+
+		/*
+		 * The notifier registration might fail at various hops.
+		 * Corresponding error messages will land in dmesg. There is
+		 * otherwise nothing that can be specifically done to handle
+		 * failures here.
+		 */
+		hv_sleep_notifiers_register();
 	} else {
 		hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
 		wrmsrq(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 1342d55c2545..fbc1233175ce 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -181,8 +181,10 @@ int hyperv_fill_flush_guest_mapping_list(
 void hv_apic_init(void);
 void __init hv_init_spinlocks(void);
 bool hv_vcpu_is_preempted(int vcpu);
+void hv_sleep_notifiers_register(void);
 #else
 static inline void hv_apic_init(void) {}
+static inline void hv_sleep_notifiers_register(void) {}
 #endif
 
 struct irq_domain *hv_create_pci_msi_domain(void);
diff --git a/drivers/hv/mshv_common.c b/drivers/hv/mshv_common.c
index aa2be51979fd..12b248ea3498 100644
--- a/drivers/hv/mshv_common.c
+++ b/drivers/hv/mshv_common.c
@@ -14,6 +14,9 @@
 #include
 #include
 #include
+#include
+#include
+#include
 
 #include "mshv.h"
 
@@ -138,3 +141,103 @@ int hv_call_get_partition_property(u64 partition_id,
 	return 0;
 }
 EXPORT_SYMBOL_GPL(hv_call_get_partition_property);
+
+#if IS_ENABLED(CONFIG_ACPI)
+/*
+ * Corresponding sleep states have to be initialized in order for a subsequent
+ * HVCALL_ENTER_SLEEP_STATE call to succeed. Currently only S5 state as per
+ * ACPI 6.4 chapter 7.4.2 is relevant, while S1, S2 and S3 can be supported.
+ *
+ * ACPI should be initialized and should support S5 sleep state when this method
+ * is called, so that it can extract correct PM values and pass them to hv.
+ */
+static int hv_initialize_sleep_states(void)
+{
+	u64 status;
+	unsigned long flags;
+	struct hv_input_set_system_property *in;
+	acpi_status acpi_status;
+	u8 sleep_type_a, sleep_type_b;
+
+	if (!acpi_sleep_state_supported(ACPI_STATE_S5)) {
+		pr_err("%s: S5 sleep state not supported.\n", __func__);
+		return -ENODEV;
+	}
+
+	acpi_status = acpi_get_sleep_type_data(ACPI_STATE_S5, &sleep_type_a,
+					       &sleep_type_b);
+	if (ACPI_FAILURE(acpi_status))
+		return -ENODEV;
+
+	local_irq_save(flags);
+	in = *this_cpu_ptr(hyperv_pcpu_input_arg);
+	memset(in, 0, sizeof(*in));
+
+	in->property_id = HV_SYSTEM_PROPERTY_SLEEP_STATE;
+	in->set_sleep_state_info.sleep_state = HV_SLEEP_STATE_S5;
+	in->set_sleep_state_info.pm1a_slp_typ = sleep_type_a;
+	in->set_sleep_state_info.pm1b_slp_typ = sleep_type_b;
+
+	status = hv_do_hypercall(HVCALL_SET_SYSTEM_PROPERTY, in, NULL);
+	local_irq_restore(flags);
+
+	if (!hv_result_success(status)) {
+		hv_status_err(status, "\n");
+		return hv_result_to_errno(status);
+	}
+
+	return 0;
+}
+
+static int hv_call_enter_sleep_state(u32 sleep_state)
+{
+	u64 status;
+	int ret;
+	unsigned long flags;
+	struct hv_input_enter_sleep_state *in;
+
+	ret = hv_initialize_sleep_states();
+	if (ret)
+		return ret;
+
+	local_irq_save(flags);
+	in = *this_cpu_ptr(hyperv_pcpu_input_arg);
+	in->sleep_state = sleep_state;
+
+	status = hv_do_hypercall(HVCALL_ENTER_SLEEP_STATE, in, NULL);
+	local_irq_restore(flags);
+
+	if (!hv_result_success(status)) {
+		hv_status_err(status, "\n");
+		return hv_result_to_errno(status);
+	}
+
+	return 0;
+}
+
+static int hv_reboot_notifier_handler(struct notifier_block *this,
+				      unsigned long code, void *another)
+{
+	int ret = 0;
+
+	if (code == SYS_HALT || code == SYS_POWER_OFF)
+		ret = hv_call_enter_sleep_state(HV_SLEEP_STATE_S5);
+
+	return ret ? NOTIFY_DONE : NOTIFY_OK;
+}
+
+static struct notifier_block hv_reboot_notifier = {
+	.notifier_call = hv_reboot_notifier_handler,
+	.priority = INT_MIN, /* may not return */
+};
+
+void hv_sleep_notifiers_register(void)
+{
+	int ret;
+
+	ret = register_reboot_notifier(&hv_reboot_notifier);
+	if (ret)
+		pr_err("%s: cannot register reboot notifier %d\n", __func__,
+		       ret);
+}
+#endif
-- 
2.51.0
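
The return-value mapping in hv_reboot_notifier_handler() can be checked
outside the kernel. What follows is a minimal user-space sketch and is not
part of the patch: mock_enter_sleep_state() is a hypothetical stand-in for
hv_call_enter_sleep_state(), MOCK_SLEEP_STATE_S5 is a placeholder value,
and the NOTIFY_* and SYS_* constants are local copies assumed to match
include/linux/notifier.h and include/linux/reboot.h.

```c
#include <stdio.h>

/* Local copies; assumed to match the kernel headers, not included from them. */
#define NOTIFY_DONE	0x0000
#define NOTIFY_OK	0x0001
#define SYS_RESTART	0x0001
#define SYS_HALT	0x0002
#define SYS_POWER_OFF	0x0003

/* Placeholder only; not necessarily the value of HV_SLEEP_STATE_S5. */
#define MOCK_SLEEP_STATE_S5	5

/* Test knobs: result the mock returns (0 or negative errno), and a call count. */
int mock_enter_sleep_state_result;
int mock_enter_sleep_state_calls;

/* Hypothetical stand-in for hv_call_enter_sleep_state(); issues no hypercall. */
int mock_enter_sleep_state(unsigned int sleep_state)
{
	(void)sleep_state;
	mock_enter_sleep_state_calls++;
	return mock_enter_sleep_state_result;
}

/*
 * Mirrors the handler's logic: only halt and power-off attempt to enter S5,
 * and a failure is reported as NOTIFY_DONE rather than NOTIFY_OK.
 */
int mock_reboot_handler(unsigned long code)
{
	int ret = 0;

	if (code == SYS_HALT || code == SYS_POWER_OFF)
		ret = mock_enter_sleep_state(MOCK_SLEEP_STATE_S5);

	return ret ? NOTIFY_DONE : NOTIFY_OK;
}
```

Neither NOTIFY_OK nor NOTIFY_DONE stops the notifier chain, so in this
sketch a failed call still leaves the rest of the shutdown path to run.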