Currently kexec() support and TDX host are mutually exclusive in the
Kconfig. This series adds TDX host kexec support so that they can
work together and be enabled at the same time in the Kconfig.
Hi maintainers,
This series aims to go through the tip tree, but I also CC'ed Sean/Paolo
because, once KVM TDX comes into play, a KVM patch [*] is needed to complete
the kexec support for TDX. Also copying Dan for TDX connect.
Thanks for your time!
=== More information ===
If the kernel has ever enabled TDX, part of system memory remains TDX
private memory when kexec happens. E.g., the PAMT (Physical Address
Metadata Table) pages used by the TDX module to track each TDX memory
page's state are never freed once the TDX module is initialized. TDX
guests also have guest private memory and secure-EPT pages.
Similar to AMD SME, to support kexec the kernel needs to flush dirty
cachelines for TDX private memory before booting to the second kernel.
Also, the kernel needs to reset TDX private memory to normal (using
MOVDIR64B) before booting to the second kernel when the platform has
the "partial write machine check" erratum; otherwise the second kernel
may see an unexpected machine check.
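As a rough illustration of the reset step described above, the sketch below overwrites a region one full 64-byte line at a time, which is the property MOVDIR64B provides. It is a userspace model only: write_full_line() and reset_region() are hypothetical names, memcpy() stands in for the MOVDIR64B instruction, and writing zeros is an assumption. The real instruction also requires a 64-byte-aligned destination, which this model does not check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHELINE_SIZE 64

/* Hypothetical stand-in for MOVDIR64B: store one full 64-byte line. */
static void write_full_line(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, CACHELINE_SIZE);
}

/*
 * Overwrite [base, base + size) one whole cacheline at a time.
 * Returns the number of lines written.
 */
static size_t reset_region(uint8_t *base, size_t size)
{
	static const uint8_t zero_line[CACHELINE_SIZE];
	size_t off;

	/* Partial lines are what the erratum is about, so refuse them. */
	assert(size % CACHELINE_SIZE == 0);

	for (off = 0; off < size; off += CACHELINE_SIZE)
		write_full_line(base + off, zero_line);

	return size / CACHELINE_SIZE;
}
```

Calling reset_region() on each PAMT-like region in turn would model the core-kernel part of the reset. Pages owned by other in-kernel users are out of scope for this sketch.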
Most of the code change in this series handles "resetting TDX private
memory" (the cache flushing part is relatively straightforward). Because
the kernel currently has no unified way to tell whether a given
page is TDX private or not, this series chooses to only reset the PAMT in
the core-kernel kexec code, and requires the in-kernel TDX users (e.g.,
KVM) to reset the TDX private pages that they manage (see [*]).
Other options are also mentioned in the changelog of the patch:
x86/kexec: Reset TDX private memory on platforms with TDX erratum
...which also contains more information about the above TDX erratum.
This series also covers crash kexec, but no special handling is needed
for crash kexec:
1) The kdump kernel uses reserved memory from the first kernel, and the
reserved memory will never be used as TDX memory.
2) /proc/vmcore in the kdump kernel is only read, and reads by
themselves do not poison TDX private memory, so they won't cause an
unexpected machine check (only a "partial write" will).
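The conditions described in this cover letter can be summarized as two predicates. This is only a sketch of the decision logic as stated here, not the code in the patches: struct kexec_ctx, its fields, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the state that matters at kexec time. */
struct kexec_ctx {
	bool bare_metal;             /* not running as a guest */
	bool tdx_erratum;            /* "partial write machine check" erratum present */
	bool tdx_module_initialized; /* module status implies TDX private memory exists */
	bool crash;                  /* crash kexec (kdump) rather than normal kexec */
};

/* WBINVD is done unconditionally on bare metal and skipped in guests. */
static bool need_cache_flush(const struct kexec_ctx *c)
{
	assert(c != NULL);
	return c->bare_metal;
}

/* The PAMT is reset only for normal kexec on platforms with the erratum. */
static bool need_pamt_reset(const struct kexec_ctx *c)
{
	assert(c != NULL);
	if (c->crash)
		return false; /* kdump only reads old memory, so no reset is needed */
	return c->tdx_erratum && c->tdx_module_initialized;
}
```

Keeping the crash case as an early return mirrors the argument above: the kdump kernel never writes to memory that was TDX private, so the erratum cannot be triggered there.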
v4 -> v5:
- Rebase to tip/master.
- Remove the TDX-specific callback because there is no need to reset TDX
private memory for crash kexec.
- Add a new patch to make module status immutable in reboot notifier
(split from v1) in order to use module status to tell the presence of
TDX private memory.
- Minor changelog updates, trivial comment improvements.
- Add Tom's Reviewed-by tag.
v4: https://lore.kernel.org/all/cover.1713439632.git.kai.huang@intel.com/
v3 -> v4:
- Updated changelog and comments of patch 1/2 per comments from
Kirill and Tom (see specific patch for details).
v3: https://lore.kernel.org/linux-kernel/cover.1712493366.git.kai.huang@intel.com/
v2 -> v3:
- Change to only do WBINVD for bare-metal, as Kirill/Tom pointed out
WBINVD in TDX guests and SEV-ES/SEV-SNP guests triggers #VE.
v2: https://lore.kernel.org/linux-kernel/cover.1710811610.git.kai.huang@intel.com/
v1 -> v2:
- Do unconditional WBINVD during kexec() -- Boris
- Change to cover crash kexec() -- Rick
- Add a new patch (last one) to add a mechanism to reset all TDX private
pages due to having to cover crash kexec().
- Other code improvements -- Dave
- Rebase to latest tip/master.
v1: https://lore.kernel.org/linux-kernel/cover.1706698706.git.kai.huang@intel.com/
[*]: https://github.com/intel/tdx/commit/513e24d7913457ba87b6f25644d02fbed0848f21
Kai Huang (5):
x86/kexec: do unconditional WBINVD for bare-metal in stop_this_cpu()
x86/kexec: do unconditional WBINVD for bare-metal in relocate_kernel()
x86/virt/tdx: Make module initialization state immutable in reboot
notifier
x86/kexec: Reset TDX private memory on platforms with TDX erratum
x86/virt/tdx: Remove the !KEXEC_CORE dependency
arch/x86/Kconfig | 1 -
arch/x86/include/asm/kexec.h | 2 +-
arch/x86/include/asm/tdx.h | 2 +
arch/x86/kernel/machine_kexec_64.c | 29 +++++++++--
arch/x86/kernel/process.c | 19 ++++---
arch/x86/kernel/relocate_kernel_64.S | 19 +++++--
arch/x86/virt/vmx/tdx/tdx.c | 78 ++++++++++++++++++++++++++++
7 files changed, 129 insertions(+), 21 deletions(-)
base-commit: b8c7cbc324dc17b9e42379b42603613580bec2d8
--
2.45.2
> Currently kexec() support and TDX host are mutually exclusive in the
> Kconfig. This series adds the TDX host kexec support so that they can
> work together and can be enabled at the same time in the Kconfig.

I tried testing the kexec functionality and noticed that the TDX module
fails initialization on the second kernel, so you can't actually kexec
between 2 kernels that enable TDX. Is that the expected behavior? Are
there future patches to enable that functionality?

On 20/08/2024 9:21 am, Sagi Shahar wrote:
> I tried testing the kexec functionality and noticed that the TDX module
> fails initialization on the second kernel so you can't actually kexec
> between 2 kernels that enable TDX. Is that the expected behavior? Are
> there future patches to enable that functionality?

Thanks for testing!

Yes, this is the expected behaviour. If the first kernel has enabled
TDX, then the second kernel will fail to init TDX. The reason is that the
first SEAMCALL to initialize the TDX module in the second kernel will fail
because the module has already been initialized.

However, if the first kernel has not enabled TDX, the second kernel is
able to enable it.

On Mon, Aug 19, 2024 at 5:16 PM Huang, Kai <kai.huang@intel.com> wrote:
> Yes, this is the expected behaviour. If the first kernel has enabled
> TDX, then the second kernel will fail to init TDX. The reason is that the
> first SEAMCALL to initialize the TDX module in the second kernel will fail
> because the module has already been initialized.
>
> However, if the first kernel has not enabled TDX, the second kernel is
> able to enable it.

Are there any plans to support both kernels being able to enable TDX
in the future? Either by changes to KVM or the TDX module?

On 20/08/2024 10:28 am, Sagi Shahar wrote:
> Are there any plans to support both kernels being able to enable TDX
> in the future? Either by changes to KVM or the TDX module?

AFAICT we haven't received such a requirement so far. Let me double check
internally and get back here.

Btw, if we want to do this purely in software, changing KVM isn't the
right thing to do. We need to somehow pass the key data structures managing
the TDX module to the second kernel, e.g., module status and locations of
PAMTs. And the second kernel needs to be modified to understand those,
which means some old (second) kernels with TDX support may not be able
to support this even if we add this to the kernel.

On Mon, Aug 19, 2024 at 5:44 PM Huang, Kai <kai.huang@intel.com> wrote:
> AFAICT we haven't received such a requirement so far. Let me double check
> internally and get back here.
>
> Btw, if we want to do this purely in software, changing KVM isn't the
> right thing to do. We need to somehow pass the key data structures managing
> the TDX module to the second kernel, e.g., module status and locations of
> PAMTs. And the second kernel needs to be modified to understand those,
> which means some old (second) kernels with TDX support may not be able
> to support this even if we add this to the kernel.

Would it be possible to tear down the TDX module and re-initialize it
on the next kernel? I don't think there's a requirement for the TDX
module data structures to remain intact during kexec, but it could be
useful if TDX can be enabled on the new kernel.

On Fri, 2024-08-23 at 11:15 -0500, Sagi Shahar wrote:
> Would it be possible to tear down the TDX module and re-initialize it
> on the next kernel? I don't think there's a requirement for the TDX
> module data structures to remain intact during kexec, but it could be
> useful if TDX can be enabled on the new kernel.

We discussed this internally. The TDX module cannot be re-initialized after
being torn down. However, the new kernel can reload the (same) TDX module and
re-initialize it (the P-SEAMLDR supports reloading or runtime-updating the
TDX module).

However, our primary focus is to enable baseline TDX support in upstream. For
TDX host kexec, at this stage we focus on: 1) enabling both TDX and kexec in the
Kconfig; 2) allowing normal kexec and kdump to work when TDX is enabled. Making
the second kernel able to use TDX is the next-step plan.

May I ask whether there is any real use case that requires the second kernel
to be able to use TDX at this stage?

On Sat, Aug 24, 2024 at 4:31 AM Huang, Kai <kai.huang@intel.com> wrote:
> However, our primary focus is to enable baseline TDX support in upstream. For
> TDX host kexec, at this stage we focus on: 1) enabling both TDX and kexec in the
> Kconfig; 2) allowing normal kexec and kdump to work when TDX is enabled. Making
> the second kernel able to use TDX is the next-step plan.
>
> May I ask whether there is any real use case that requires the second kernel
> to be able to use TDX at this stage?

[Again in plaintext]

Right now I don't think we have production requirements for kexec at
all besides kdump support. Kexec from a TDX-enabled kernel to a non-TDX
kernel definitely doesn't have a production requirement.

It would be nice to be able to kexec to a TDX-enabled kernel to speed
up the development cycle instead of waiting for a full reboot, but
that's not a high priority at the moment.

On Mon, 2024-08-26 at 14:22 -0500, Sagi Shahar wrote:
> Right now I don't think we have production requirements for kexec at
> all besides kdump support. Kexec from a TDX-enabled kernel to a non-TDX
> kernel definitely doesn't have a production requirement.
>
> It would be nice to be able to kexec to a TDX-enabled kernel to speed
> up the development cycle instead of waiting for a full reboot, but
> that's not a high priority at the moment.

Yeah, agreed. But then let's do this as future work after baseline TDX
support is done.