[PATCH v3 14/21] qemu: Add FakeReboot support for TDX guest

Zhenzhong Duan posted 21 patches 5 months, 2 weeks ago
There is a newer version of this series
[PATCH v3 14/21] qemu: Add FakeReboot support for TDX guest
Posted by Zhenzhong Duan 5 months, 2 weeks ago
Use the existing fake reboot mechanism to reboot TDX guests.

Unlike a normal guest, a TDX guest doesn't support system_reset,
so we have to kill the old guest and start a new one to simulate the reboot.

Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 src/qemu/qemu_process.c | 80 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 77 insertions(+), 3 deletions(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 529c65b9b0..7c8f785f32 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -446,6 +446,67 @@ qemuProcessHandleReset(qemuMonitor *mon G_GNUC_UNUSED,
 }
 
 
+/*
+ * Secure guest doesn't support fake reboot via machine CPU reset.
+ * We thus fake reboot via QEMU re-creation.
+ */
+static void
+qemuProcessFakeRebootViaRecreate(virDomainObj *vm)
+{
+    qemuDomainObjPrivate *priv = vm->privateData;
+    virQEMUDriver *driver = priv->driver;
+    int ret = -1;
+
+    VIR_DEBUG("Handle secure guest reboot: destroy phase");
+
+    virObjectLock(vm);
+    if (qemuProcessBeginStopJob(vm, VIR_JOB_DESTROY, 0) < 0)
+        goto cleanup;
+
+    if (virDomainObjCheckActive(vm) < 0) {
+        qemuProcessEndStopJob(vm);
+        goto cleanup;
+    }
+
+    qemuProcessStop(vm, VIR_DOMAIN_SHUTOFF_DESTROYED, VIR_ASYNC_JOB_NONE, 0);
+    virDomainAuditStop(vm, "destroyed");
+
+    /* Skip removing the inactive domain from the active list */
+    qemuProcessEndStopJob(vm);
+
+    VIR_DEBUG("Handle secure guest reboot: boot phase");
+
+    if (qemuProcessBeginJob(vm, VIR_DOMAIN_JOB_OPERATION_START, 0) < 0) {
+        qemuDomainRemoveInactive(vm, 0, false);
+        goto cleanup;
+    }
+
+    if (qemuProcessStart(NULL, driver, vm, NULL, VIR_ASYNC_JOB_START,
+                         NULL, -1, NULL, NULL, NULL,
+                         VIR_NETDEV_VPORT_PROFILE_OP_CREATE,
+                         0) < 0) {
+        virDomainAuditStart(vm, "booted", false);
+        qemuDomainRemoveInactive(vm, 0, false);
+        goto endjob;
+    }
+
+    virDomainAuditStart(vm, "booted", true);
+
+    qemuDomainSaveStatus(vm);
+    ret = 0;
+
+ endjob:
+    qemuProcessEndJob(vm);
+
+ cleanup:
+    priv->pausedShutdown = false;
+    qemuDomainSetFakeReboot(vm, false);
+    if (ret == -1)
+        ignore_value(qemuProcessKill(vm, VIR_QEMU_PROCESS_KILL_FORCE));
+    virDomainObjEndAPI(&vm);
+}
+
+
 /*
  * Since we have the '-no-shutdown' flag set, the
  * QEMU process will currently have guest OS shutdown
@@ -455,15 +516,13 @@ qemuProcessHandleReset(qemuMonitor *mon G_GNUC_UNUSED,
  * guest OS booting up again
  */
 static void
-qemuProcessFakeReboot(void *opaque)
+qemuProcessFakeRebootViaReset(virDomainObj *vm)
 {
-    virDomainObj *vm = opaque;
     qemuDomainObjPrivate *priv = vm->privateData;
     virQEMUDriver *driver = priv->driver;
     virDomainRunningReason reason = VIR_DOMAIN_RUNNING_BOOTED;
     int ret = -1, rc;
 
-    VIR_DEBUG("vm=%p", vm);
     virObjectLock(vm);
     if (virDomainObjBeginJob(vm, VIR_JOB_MODIFY) < 0)
         goto cleanup;
@@ -509,6 +568,21 @@ qemuProcessFakeReboot(void *opaque)
 }
 
 
+static void
+qemuProcessFakeReboot(void *opaque)
+{
+    virDomainObj *vm = opaque;
+
+    VIR_DEBUG("vm=%p", vm);
+
+    if (vm->def->sec &&
+        vm->def->sec->sectype == VIR_DOMAIN_LAUNCH_SECURITY_TDX)
+        qemuProcessFakeRebootViaRecreate(vm);
+    else
+        qemuProcessFakeRebootViaReset(vm);
+}
+
+
 void
 qemuProcessShutdownOrReboot(virDomainObj *vm)
 {
-- 
2.34.1
Re: [PATCH v3 14/21] qemu: Add FakeReboot support for TDX guest
Posted by Daniel P. Berrangé via Devel 5 months, 1 week ago
On Mon, Jun 30, 2025 at 02:17:25PM +0800, Zhenzhong Duan wrote:
> Utilize the existing fake reboot mechanism to do reboot for TDX guest.
> 
> Different from normal guest, TDX guest doesn't support system_reset,
> so have to kill the old guest and start a new one to simulate the reboot.
> 
> Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>  src/qemu/qemu_process.c | 80 +++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 77 insertions(+), 3 deletions(-)

One thing I noticed during testing is that when a guest crashes
during boot up, e.g. via a triple-fault, we'll endlessly re-create
QEMU, which is quite expensive as memory pages are allocated/deallocated,
and it also burns through domain ID values.

I'm not sure there's much (anything) we can do about these downsides
though.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
RE: [PATCH v3 14/21] qemu: Add FakeReboot support for TDX guest
Posted by Duan, Zhenzhong 5 months, 1 week ago

>-----Original Message-----
>From: Daniel P. Berrangé <berrange@redhat.com>
>Subject: Re: [PATCH v3 14/21] qemu: Add FakeReboot support for TDX guest
>
>On Mon, Jun 30, 2025 at 02:17:25PM +0800, Zhenzhong Duan wrote:
>> Utilize the existing fake reboot mechanism to do reboot for TDX guest.
>>
>> Different from normal guest, TDX guest doesn't support system_reset,
>> so have to kill the old guest and start a new one to simulate the reboot.
>>
>> Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>  src/qemu/qemu_process.c | 80 +++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 77 insertions(+), 3 deletions(-)
>
>One thing I noticed during testing is that when a guest crashes
>during boot up eg via a triple-fault, we'll endlessly re-create
>QEMU which is quite expensive as memory pages are allocated/deallocated,
>and also burn through domain ID values.

Is it because you enabled SEPT #VE? What's your <on_crash> setting?

>
>I'm not sure there's much (anything) we can do about these downsides
>though.

As for sept-ve-disable, it's a must for the Linux kernel, but maybe not for other guests.
We could check for "TD misconfiguration: SEPT #VE has to be disabled", but that would not be clean code.
Or maybe we should document it?

Thanks
Zhenzhong

>
>Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
>
>
>With regards,
>Daniel

Re: [PATCH v3 14/21] qemu: Add FakeReboot support for TDX guest
Posted by Daniel P. Berrangé via Devel 5 months, 1 week ago
On Wed, Jul 09, 2025 at 09:44:42AM +0000, Duan, Zhenzhong wrote:
> 
> 
> >-----Original Message-----
> >From: Daniel P. Berrangé <berrange@redhat.com>
> >Subject: Re: [PATCH v3 14/21] qemu: Add FakeReboot support for TDX guest
> >
> >On Mon, Jun 30, 2025 at 02:17:25PM +0800, Zhenzhong Duan wrote:
> >> Utilize the existing fake reboot mechanism to do reboot for TDX guest.
> >>
> >> Different from normal guest, TDX guest doesn't support system_reset,
> >> so have to kill the old guest and start a new one to simulate the reboot.
> >>
> >> Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com>
> >> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> >> ---
> >>  src/qemu/qemu_process.c | 80 +++++++++++++++++++++++++++++++++++++++--
> >>  1 file changed, 77 insertions(+), 3 deletions(-)
> >
> >One thing I noticed during testing is that when a guest crashes
> >during boot up eg via a triple-fault, we'll endlessly re-create
> >QEMU which is quite expensive as memory pages are allocated/deallocated,
> >and also burn through domain ID values.
> 
> Is it because you enabled SEPT #VE? What's your <on_crash> setting?

The SEPT #VE config mistake was a later config problem - that one
did NOT cause reboots - the VM simply powered off.

The thing causing my endless reboots is a bug exposed by a recent
EDK2 update in Fedora that results in a triple-fault, which
in turn causes a CPU reset, and thus this reboot logic triggers.
We have not identified the cause of that EDK2 problem yet:

https://bugzilla.redhat.com/show_bug.cgi?id=2376851

> >I'm not sure there's much (anything) we can do about these downsides
> >though.
> 
> About the sept-ve-disable, it's a must for linux kernel, but may be not for others.
> Maybe checking "TD misconfiguration: SEPT #VE has to be disabled", but it's not clean code.
> Or maybe document it?

In the common case users simply should not set the 'policy'
attribute value at all, in which case the default values
should result in SEPT #VE being disabled.

My mistake was that I was playing with the configuration
and had set policy to 0x0 and then forgot I did this when
coming back later.
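
To illustrate why a policy of 0x0 ends up with SEPT #VE enabled, here is a minimal C sketch. It assumes that SEPT_VE_DISABLE is bit 28 of the TD attributes carried in 'policy'; that bit position is my reading of the TDX attribute layout and should be checked against the TDX module specification. The helper and function names are hypothetical, not libvirt or QEMU APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 28 of the TD attributes is SEPT_VE_DISABLE. */
#define TD_ATTR_SEPT_VE_DISABLE (UINT64_C(1) << 28)

/* Hypothetical helper: true if the policy value disables SEPT #VE. */
bool
policy_disables_sept_ve(uint64_t policy)
{
    return (policy & TD_ATTR_SEPT_VE_DISABLE) != 0;
}

/* Hypothetical usage: compare an explicit zero policy with one that
 * has only the assumed SEPT_VE_DISABLE bit set. */
void
example_usage(void)
{
    /* policy=0x0 clears every attribute bit, so SEPT #VE stays enabled. */
    assert(!policy_disables_sept_ve(UINT64_C(0x0)));

    /* A value with bit 28 set keeps SEPT #VE disabled. */
    assert(policy_disables_sept_ve(UINT64_C(0x10000000)));
}
```

Under that assumption, leaving 'policy' unset and relying on the defaults is the simplest way to avoid clearing the bit by accident.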


With regards,
Daniel
RE: [PATCH v3 14/21] qemu: Add FakeReboot support for TDX guest
Posted by Duan, Zhenzhong 5 months, 1 week ago

>-----Original Message-----
>From: Daniel P. Berrangé <berrange@redhat.com>
>Subject: Re: [PATCH v3 14/21] qemu: Add FakeReboot support for TDX guest
>
>On Wed, Jul 09, 2025 at 09:44:42AM +0000, Duan, Zhenzhong wrote:
>>
>>
>> >-----Original Message-----
>> >From: Daniel P. Berrangé <berrange@redhat.com>
>> >Subject: Re: [PATCH v3 14/21] qemu: Add FakeReboot support for TDX guest
>> >
>> >On Mon, Jun 30, 2025 at 02:17:25PM +0800, Zhenzhong Duan wrote:
>> >> Utilize the existing fake reboot mechanism to do reboot for TDX guest.
>> >>
>> >> Different from normal guest, TDX guest doesn't support system_reset,
>> >> so have to kill the old guest and start a new one to simulate the reboot.
>> >>
>> >> Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com>
>> >> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> >> ---
>> >>  src/qemu/qemu_process.c | 80 +++++++++++++++++++++++++++++++++++++++--
>> >>  1 file changed, 77 insertions(+), 3 deletions(-)
>> >
>> >One thing I noticed during testing is that when a guest crashes
>> >during boot up eg via a triple-fault, we'll endlessly re-create
>> >QEMU which is quite expensive as memory pages are allocated/deallocated,
>> >and also burn through domain ID values.
>>
>> Is it because you enabled SEPT #VE? What's your <on_crash> setting?
>
>The SEPT #VE config mistake was a later config problem - that one
>did NOT cause reboots - the VM simply powered off.
>
>The thing causing my endless reboots is a bug exposed in a recent
>EDK2 update in Fedora that is resulting in a triple-fault, which
>in turns causes a CPU reset, and thus this reboot logic triggers.
>We have not identified the cause of that EDK problem yet

I see. I don't think there is much we can do about this. We are emulating the
same behavior as a normal guest, though with some extra overhead.

>
>https://bugzilla.redhat.com/show_bug.cgi?id=2376851
>
>> >I'm not sure there's much (anything) we can do about these downsides
>> >though.
>>
>> About the sept-ve-disable, it's a must for linux kernel, but may be not for others.
>> Maybe checking "TD misconfiguration: SEPT #VE has to be disabled", but it's not clean code.
>> Or maybe document it?
>
>In the common case users simply should not set the 'policy'
>attribute value at all, in which case the default values
>should result in SEPT #VE being disabled.
>
>My mistake was that I was playing with the configuration
>and had set policy to 0x0 and then forgot I did this when
>coming back later.

OK.

Thanks
Zhenzhong