From: Vishal Verma <vishal.l.verma@intel.com>
Use the TDH.SYS.DISABLE SEAMCALL, which disables the TDX module,
reclaims all memory resources assigned to TDX, and clears any
partial-write induced poison, to allow kexec and kdump on platforms with
the partial write errata.
On TDX-capable platforms with the partial write erratum, kexec has been
disabled because the new kernel could hit a machine check reading a
previously poisoned memory location.
Later TDX modules support TDH.SYS.DISABLE, which disables the module and
reclaims all TDX memory resources, allowing the new kernel to re-initialize
TDX from scratch. This operation also clears the old memory, cleaning up
any poison.
Add tdx_sys_disable() to tdx_shutdown(), which is called in the
syscore_shutdown path for kexec. This is done just before tdx_shutdown()
disables VMX on all CPUs.
For kdump, call tdx_sys_disable() in the crash path before
x86_virt_emergency_disable_virtualization_cpu() does VMXOFF.
Since this clears any poison on TDX-managed memory, the
X86_BUG_TDX_PW_MCE check in machine_kexec() that blocked kexec on
partial write errata platforms can be removed.
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
arch/x86/kernel/crash.c | 2 ++
arch/x86/kernel/machine_kexec_64.c | 16 ----------------
arch/x86/virt/vmx/tdx/tdx.c | 1 +
3 files changed, 3 insertions(+), 16 deletions(-)
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index cd796818d94d..623d4474631a 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -38,6 +38,7 @@
#include <linux/kdebug.h>
#include <asm/cpu.h>
#include <asm/reboot.h>
+#include <asm/tdx.h>
#include <asm/intel_pt.h>
#include <asm/crash.h>
#include <asm/cmdline.h>
@@ -112,6 +113,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
crash_smp_send_stop();
+ tdx_sys_disable();
x86_virt_emergency_disable_virtualization_cpu();
/*
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 0590d399d4f1..c3f4a389992d 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -347,22 +347,6 @@ int machine_kexec_prepare(struct kimage *image)
unsigned long reloc_end = (unsigned long)__relocate_kernel_end;
int result;
- /*
- * Some early TDX-capable platforms have an erratum. A kernel
- * partial write (a write transaction of less than cacheline
- * lands at memory controller) to TDX private memory poisons that
- * memory, and a subsequent read triggers a machine check.
- *
- * On those platforms the old kernel must reset TDX private
- * memory before jumping to the new kernel otherwise the new
- * kernel may see unexpected machine check. For simplicity
- * just fail kexec/kdump on those platforms.
- */
- if (boot_cpu_has_bug(X86_BUG_TDX_PW_MCE)) {
- pr_info_once("Not allowed on platform with tdx_pw_mce bug\n");
- return -EOPNOTSUPP;
- }
-
/* Setup the identity mapped 64bit page table */
result = init_pgtable(image, __pa(control_page));
if (result)
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 68bd2618dde4..b388fbce5d76 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -252,6 +252,7 @@ static void tdx_shutdown_cpu(void *ign)
static void tdx_shutdown(void *ign)
{
+ tdx_sys_disable();
on_each_cpu(tdx_shutdown_cpu, NULL, 1);
}
--
2.53.0
A few nits below:
The scope "KVM: x86" is wrong as this doesn't touch any KVM code.
On Fri, Mar 06, 2026 at 05:03:58PM -0800, Rick Edgecombe wrote:
>From: Vishal Verma <vishal.l.verma@intel.com>
>
>Use the TDH.SYS.DISABLE SEAMCALL, which disables the TDX module,
>reclaims all memory resources assigned to TDX, and clears any
>partial-write induced poison, to allow kexec and kdump on platforms with
>the partial write errata.
>
>On TDX-capable platforms with the partial write erratum, kexec has been
>disabled because the new kernel could hit a machine check reading a
>previously poisoned memory location.
>
>Later TDX modules support TDH.SYS.DISABLE, which disables the module and
>reclaims all TDX memory resources, allowing the new kernel to re-initialize
>TDX from scratch. This operation also clears the old memory, cleaning up
>any poison.
>
>Add tdx_sys_disable() to tdx_shutdown(), which is called in the
>syscore_shutdown path for kexec. This is done just before tdx_shutdown()
>disables VMX on all CPUs.
>
>For kdump, call tdx_sys_disable() in the crash path before
>x86_virt_emergency_disable_virtualization_cpu() does VMXOFF.
>
>Since this clears any poison on TDX-managed memory, the
>X86_BUG_TDX_PW_MCE check in machine_kexec() that blocked kexec on
>partial write errata platforms can be removed.
Use imperative mood here: "Since ..., remove the X86_BUG_TDX_PW_MCE check..."
>
>Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
>Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
>---
> arch/x86/kernel/crash.c | 2 ++
> arch/x86/kernel/machine_kexec_64.c | 16 ----------------
> arch/x86/virt/vmx/tdx/tdx.c | 1 +
> 3 files changed, 3 insertions(+), 16 deletions(-)
>
>diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
>index cd796818d94d..623d4474631a 100644
>--- a/arch/x86/kernel/crash.c
>+++ b/arch/x86/kernel/crash.c
>@@ -38,6 +38,7 @@
> #include <linux/kdebug.h>
> #include <asm/cpu.h>
> #include <asm/reboot.h>
>+#include <asm/tdx.h>
> #include <asm/intel_pt.h>
> #include <asm/crash.h>
> #include <asm/cmdline.h>
>@@ -112,6 +113,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
>
> crash_smp_send_stop();
>
>+ tdx_sys_disable();
> x86_virt_emergency_disable_virtualization_cpu();
>
> /*
>diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
>index 0590d399d4f1..c3f4a389992d 100644
>--- a/arch/x86/kernel/machine_kexec_64.c
>+++ b/arch/x86/kernel/machine_kexec_64.c
>@@ -347,22 +347,6 @@ int machine_kexec_prepare(struct kimage *image)
> unsigned long reloc_end = (unsigned long)__relocate_kernel_end;
> int result;
>
>- /*
>- * Some early TDX-capable platforms have an erratum. A kernel
>- * partial write (a write transaction of less than cacheline
>- * lands at memory controller) to TDX private memory poisons that
>- * memory, and a subsequent read triggers a machine check.
>- *
>- * On those platforms the old kernel must reset TDX private
>- * memory before jumping to the new kernel otherwise the new
>- * kernel may see unexpected machine check. For simplicity
>- * just fail kexec/kdump on those platforms.
>- */
>- if (boot_cpu_has_bug(X86_BUG_TDX_PW_MCE)) {
>- pr_info_once("Not allowed on platform with tdx_pw_mce bug\n");
>- return -EOPNOTSUPP;
>- }
With this series, we need to update the "Kexec" section in tdx.rst.
>-
> /* Setup the identity mapped 64bit page table */
> result = init_pgtable(image, __pa(control_page));
> if (result)
>diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
>index 68bd2618dde4..b388fbce5d76 100644
>--- a/arch/x86/virt/vmx/tdx/tdx.c
>+++ b/arch/x86/virt/vmx/tdx/tdx.c
>@@ -252,6 +252,7 @@ static void tdx_shutdown_cpu(void *ign)
>
> static void tdx_shutdown(void *ign)
> {
>+ tdx_sys_disable();
> on_each_cpu(tdx_shutdown_cpu, NULL, 1);
> }
>
>--
>2.53.0
>
On Mon, 2026-03-09 at 16:15 +0800, Chao Gao wrote:
> > - /*
> > - * Some early TDX-capable platforms have an erratum. A
> > kernel
> > - * partial write (a write transaction of less than
> > cacheline
> > - * lands at memory controller) to TDX private memory
> > poisons that
> > - * memory, and a subsequent read triggers a machine check.
> > - *
> > - * On those platforms the old kernel must reset TDX
> > private
> > - * memory before jumping to the new kernel otherwise the
> > new
> > - * kernel may see unexpected machine check. For
> > simplicity
> > - * just fail kexec/kdump on those platforms.
> > - */
> > - if (boot_cpu_has_bug(X86_BUG_TDX_PW_MCE)) {
> > - pr_info_once("Not allowed on platform with
> > tdx_pw_mce bug\n");
> > - return -EOPNOTSUPP;
> > - }
>
> With this series, we need to update the "Kexec" section in tdx.rst.
Nice catch, and I agree on the others. Will update it. Thanks.
© 2016 - 2026 Red Hat, Inc.