The patch brings basic enabling of kexec in TDX guests.

By "basic enabling" I mean, kexec in the guests with a single CPU.
TDX guests use ACPI MADT MPWK to bring up secondary CPUs. The mechanism
doesn't allow to put a CPU back offline if it has woken up.

We are looking into this, but it might take time.

Kirill A. Shutemov (2):
  x86/kexec: Preserve CR4.MCE during kexec
  x86/tdx: Convert shared memory back to private on kexec

 arch/x86/coco/tdx/Makefile           |  1 +
 arch/x86/coco/tdx/kexec.c            | 82 ++++++++++++++++++++++++++++
 arch/x86/include/asm/tdx.h           |  4 ++
 arch/x86/kernel/machine_kexec_64.c   |  2 +
 arch/x86/kernel/relocate_kernel_64.S |  6 +-
 5 files changed, 94 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/coco/tdx/kexec.c

-- 
2.39.1
On Tue, 2023-02-14 at 02:48 +0300, Kirill A. Shutemov wrote:
> The patch brings basic enabling of kexec in TDX guests.
>
> By "basic enabling" I mean, kexec in the guests with a single CPU.
> TDX guests use ACPI MADT MPWK to bring up secondary CPUs. The mechanism
> doesn't allow to put a CPU back offline if it has woken up.
>
> We are looking into this, but it might take time.

Can't we park the secondary CPUs in a purgatory-like thing of their own
and wake them from there when we want them?

Patches for that were floating around once, although the primary reason
then was latency, and we decided to address that differently by doing
the bringup in parallel instead.
On Wed, Feb 22, 2023 at 10:26:22AM +0000, David Woodhouse wrote:
> On Tue, 2023-02-14 at 02:48 +0300, Kirill A. Shutemov wrote:
> > The patch brings basic enabling of kexec in TDX guests.
> >
> > By "basic enabling" I mean, kexec in the guests with a single CPU.
> > TDX guests use ACPI MADT MPWK to bring up secondary CPUs. The mechanism
> > doesn't allow to put a CPU back offline if it has woken up.
> >
> > We are looking into this, but it might take time.
>
> Can't we park the secondary CPUs in a purgatory-like thing of their own
> and wake them from there when we want them?
>
> Patches for that were floating around once, although the primary reason
> then was latency, and we decided to address that differently by doing
> the bringup in parallel instead.

That's plan B.

It is suboptimal. kexec() can happen into something that is not Linux
which will not be able to wake up CPUs.

Ideally, it has to be addressed on BIOS level: it has to provide a way to
offline CPUs, putting it back to pre-wakeup state.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
On 2/24/23 06:30, Kirill A. Shutemov wrote:
> Ideally, it has to be addressed on BIOS level: it has to provide a way to
> offline CPUs, putting it back to pre-wakeup state.

Is there anything stopping us from just parking the CPUs in a loop
looking at 'acpi_mp_wake_mailbox_paddr'?  Basically park them in a way
which is indistinguishable from what the BIOS did.
On Fri, Feb 24, 2023 at 07:22:18AM -0800, Dave Hansen wrote:
> On 2/24/23 06:30, Kirill A. Shutemov wrote:
> > Ideally, it has to be addressed on BIOS level: it has to provide a way to
> > offline CPUs, putting it back to pre-wakeup state.
>
> Is there anything stopping us from just parking the CPUs in a loop
> looking at 'acpi_mp_wake_mailbox_paddr'?  Basically park them in a way
> which is indistinguishable from what the BIOS did.

+Rafael.

 - Forward compatibility can be an issue. Version 0 of the mailbox
   supports only a single Wakeup command. Future specs may define a new
   command that the kernel implementation doesn't support.

 - BIOS owns the mailbox page and can re-use it for something else after
   the last CPU has woken up. (I know it is very theoretical, but still.)

 - We can patch the ACPI table to point to a mailbox page in
   kernel-allocated memory, but that brings another problem. If the first
   kernel didn't wake up all CPUs for some reason (CONFIG_SMP=n or
   nr_cpus= or something), the second kernel would not be able to wake
   them up either, since they are still looping on the old address.

But ultimately, I think it is clearly missing BIOS functionality and has
to be addressed there. Hacking around it in the kernel will lead to more
problems down the road.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
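
For readers following the mailbox discussion, here is a minimal sketch of
one iteration of such a parking loop, and of where the forward-compatibility
concern shows up. It is an illustration only, not code from this series. The
struct layout and the two command values follow my reading of the ACPI
multiprocessor wakeup mailbox definition, and acknowledging a wakeup by
resetting the command to Noop is likewise an assumption taken from that
reading. The names mp_wake_mailbox, park_poll_once and park_result are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Commands assumed to be defined for version 0 of the mailbox. */
#define MP_WAKE_CMD_NOOP	0
#define MP_WAKE_CMD_WAKEUP	1

/* Assumed mailbox layout: one 4 KiB page shared by OS and firmware. */
struct mp_wake_mailbox {
	uint16_t command;
	uint16_t reserved;
	uint32_t apic_id;
	uint64_t wakeup_vector;
	uint8_t  reserved_os[2032];
	uint8_t  reserved_fw[2048];
};

static_assert(sizeof(struct mp_wake_mailbox) == 4096,
	      "mailbox is expected to fill exactly one page");

enum park_result {
	PARK_KEEP_WAITING,	/* nothing addressed to this CPU yet */
	PARK_WOKEN,		/* *vector holds the address to jump to */
	PARK_UNKNOWN_COMMAND,	/* command newer than this code understands */
};

/*
 * One poll of the mailbox on behalf of the parked CPU whose APIC ID is
 * 'self'. A real parking loop would call this repeatedly (with a pause
 * hint between polls) and jump to *vector once PARK_WOKEN is returned.
 */
static enum park_result park_poll_once(volatile struct mp_wake_mailbox *mb,
				       uint32_t self, uint64_t *vector)
{
	uint16_t cmd = mb->command;

	if (cmd == MP_WAKE_CMD_NOOP)
		return PARK_KEEP_WAITING;

	/* The forward-compatibility problem: a future spec could add this. */
	if (cmd != MP_WAKE_CMD_WAKEUP)
		return PARK_UNKNOWN_COMMAND;

	if (mb->apic_id != self)
		return PARK_KEEP_WAITING;

	*vector = mb->wakeup_vector;
	mb->command = MP_WAKE_CMD_NOOP;	/* acknowledge the wakeup */
	return PARK_WOKEN;
}
```

The PARK_UNKNOWN_COMMAND branch is the first objection above in code form: a
kernel-provided loop can only imitate the mailbox version it was written
against, and has no sensible answer for a command defined later.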
On 2/13/23 15:48, Kirill A. Shutemov wrote:
> The patch brings basic enabling of kexec in TDX guests.
>
> By "basic enabling" I mean, kexec in the guests with a single CPU.
> TDX guests use ACPI MADT MPWK to bring up secondary CPUs. The mechanism
> doesn't allow to put a CPU back offline if it has woken up.
>
> We are looking into this, but it might take time.

This is simple enough.  But, nobody will _actually_ use this code as-is,
right?  What's the point of applying it now?
On Thu, Feb 16, 2023 at 09:50:32AM -0800, Dave Hansen wrote:
> On 2/13/23 15:48, Kirill A. Shutemov wrote:
> > The patch brings basic enabling of kexec in TDX guests.
> >
> > By "basic enabling" I mean, kexec in the guests with a single CPU.
> > TDX guests use ACPI MADT MPWK to bring up secondary CPUs. The mechanism
> > doesn't allow to put a CPU back offline if it has woken up.
> >
> > We are looking into this, but it might take time.
>
> This is simple enough.  But, nobody will _actually_ use this code as-is,
> right?  What's the point of applying it now?

Why nobody? Single CPU VMs are not that uncommon.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
On 2/16/23 10:12, Kirill A. Shutemov wrote:
> On Thu, Feb 16, 2023 at 09:50:32AM -0800, Dave Hansen wrote:
>> On 2/13/23 15:48, Kirill A. Shutemov wrote:
>>> The patch brings basic enabling of kexec in TDX guests.
>>>
>>> By "basic enabling" I mean, kexec in the guests with a single CPU.
>>> TDX guests use ACPI MADT MPWK to bring up secondary CPUs. The mechanism
>>> doesn't allow to put a CPU back offline if it has woken up.
>>>
>>> We are looking into this, but it might take time.
>> This is simple enough.  But, nobody will _actually_ use this code as-is,
>> right?  What's the point of applying it now?
> Why nobody? Single CPU VMs are not that uncommon.

Here's one data point: the only "General Purpose" ones I see AWS
offering are Haswell era:

	https://aws.amazon.com/ec2/instance-types/

That _might_ be because of concerns about SMT side-channel exposure on
anything newer.

So, we can argue about what "uncommon" means.  But, a minority of folks
care about 1-cpu VMs.  Also, a separate minority of folks care about
kexec().  I'm worried that the overlap between the two will be an
*OVERWHELMING* minority of folks.

In other words, so few people will use this code that it'll just
bitrot.  I'm looking for compelling arguments why mainline should carry
this.