[libvirt] [PATCH RESEND] qemu: Remove inactive vm when failed to start it

Yi Wang posted 1 patch 6 years, 9 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/1498445798-40874-1-git-send-email-wang.yi59@zte.com.cn
src/qemu/qemu_driver.c | 2 ++
1 file changed, 2 insertions(+)
[libvirt] [PATCH RESEND] qemu: Remove inactive vm when failed to start it
Posted by Yi Wang 6 years, 9 months ago
Libvirt forgets to remove inactive vm when failed to start a defined vm.
That may result in residual domain in driver->domains on such condition:
during the process of starting a vm, undefine it, and qemu exit because
of some exception. As we undefined the vm successfully, the vm->persistent
was set to 0, we will always fail to undefine it until restart libvirtd.

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
---
 src/qemu/qemu_driver.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index cdb727b..af8afab 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -7185,6 +7185,8 @@ qemuDomainCreateWithFlags(virDomainPtr dom, unsigned int flags)
 
  endjob:
     qemuProcessEndJob(driver, vm);
+    if (ret < 0)
+        qemuDomainRemoveInactive(driver, vm);
 
  cleanup:
     virDomainObjEndAPI(&vm);
-- 
1.8.3.1


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH RESEND] qemu: Remove inactive vm when failed to start it
Posted by John Ferlan 6 years, 9 months ago

On 06/25/2017 10:56 PM, Yi Wang wrote:
> Libvirt forgets to remove inactive vm when failed to start a defined vm.
> That may result in residual domain in driver->domains on such condition:
> during the process of starting a vm, undefine it, and qemu exit because
> of some exception. As we undefined the vm successfully, the vm->persistent
> was set to 0, we will always fail to undefine it until restart libvirtd.

This seems to be a strange sequence of operations, but the claim is that
by adding this logic to CreateWithFlags, then the problem you're facing
is resolved. However, is adding this to the Create logic the right thing
to do?

IIUC:

CreateXMLWithFlags is successful
Undefine domain is successful
  -> Logic at top says, "if !vm->persistent", then fail
  -> Logic at bottom has:
    -> vm->persistent = 0
    -> if !virDomainObjIsActive then qemuDomainRemoveInactive

   ... Thus if domain is active, expectation is that RemoveInactive is
done when the domain is "normally" shutdown...

... time goes by...

QEMU exits w/ some exception

  -> qemuDomainDestroyFlags isn't run
  -> ? Thus avoiding some libvirt logic to handle the exception (would
that be qemuProcessAutoDestroyAdd assuming VIR_DOMAIN_START_AUTODESTROY
was used) ?

So your fix is, the next time a CreateWithFlags happens, if anything
fails, then domain removal can be done with the assumption that
something within ObjStart will fail eventually because Undefine deleted
most traces of the guest.

Of course if libvirtd restart happens, then because the config no longer
exists the domain object isn't created.

I wonder if perhaps (yuck) another flag is required to indicate that a
domain went from persistent to !persistent in Undefine, but because it
was still running the RemoveInactive wasn't called.

The interesting part of that is how would the reaping happen. Seems this
would be part of some qemuProcessHandle* function.  Something that would
be resetting vm->def->id and/or running qemuProcessStop (e.g. processing
via AutoDestroy)

Some more details and understanding why/what qemuProcessHandle* may (or
may not be) called I think is necessary.

It doesn't seem to me that domain removal for this condition should
require going through Start processing, but something in that monitor
event processing logic would seem to need to be taught about this
possibility.


> 
> Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
> ---
>  src/qemu/qemu_driver.c | 2 ++
>  1 file changed, 2 insertions(+)
> 

There's two reasons to get to the endjob: label...

#1 virDomainObjIsActive (domain already active)

#2 qemuDomainObjStart fails

So this certainly wouldn't be right for the IsActive condition, under
"normal" circumstances. And again, I don't think requiring a subsequent
call to Start would be "correct" either.  I am curious though if when in
this condition, is the vm->def->id set?  IOW: What does 'virsh list'
indicate or even virsh dominfo $domain?

John

> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index cdb727b..af8afab 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -7185,6 +7185,8 @@ qemuDomainCreateWithFlags(virDomainPtr dom, unsigned int flags)
>  
>   endjob:
>      qemuProcessEndJob(driver, vm);
> +    if (ret < 0)
> +        qemuDomainRemoveInactive(driver, vm);
>  
>   cleanup:
>      virDomainObjEndAPI(&vm);
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list