[libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache

Claudio Fontana posted 1 patch 2 years ago
Test syntax-check passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20220412091815.11231-1-cfontana@suse.de
src/qemu/qemu_saveimage.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
[libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache
Posted by Claudio Fontana 2 years ago
align the "save" with the "restore" code,
by only using the wrapper when using --bypass-cache.

This avoids a copy, resulting in better performance.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 src/qemu/qemu_saveimage.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
index 4fd4c5cfcd..5ea1b2fbcc 100644
--- a/src/qemu/qemu_saveimage.c
+++ b/src/qemu/qemu_saveimage.c
@@ -289,8 +289,10 @@ qemuSaveImageCreate(virQEMUDriver *driver,
     if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def, fd) < 0)
         goto cleanup;
 
-    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
-        goto cleanup;
+    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
+        if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
+            goto cleanup;
+    }
 
     if (virQEMUSaveDataWrite(data, fd, path) < 0)
         goto cleanup;
-- 
2.34.1
Re: [libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache
Posted by Daniel P. Berrangé 2 years ago
On Tue, Apr 12, 2022 at 11:18:15AM +0200, Claudio Fontana wrote:
> align the "save" with the "restore" code,
> by only using the wrapper when using --bypass-cache.
> 
> This avoids a copy, resulting in better performance.
> 
> Signed-off-by: Claudio Fontana <cfontana@suse.de>
> ---
>  src/qemu/qemu_saveimage.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
> index 4fd4c5cfcd..5ea1b2fbcc 100644
> --- a/src/qemu/qemu_saveimage.c
> +++ b/src/qemu/qemu_saveimage.c
> @@ -289,8 +289,10 @@ qemuSaveImageCreate(virQEMUDriver *driver,
>      if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def, fd) < 0)
>          goto cleanup;
>  
> -    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> -        goto cleanup;
> +    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
> +        if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> +            goto cleanup;
> +    }

This effectively reverts:

  commit c4caab538effb57411ad787fedb61e80d557caae
  Author: Jiri Denemark <jdenemar@redhat.com>
  Date:   Wed Feb 8 14:08:54 2012 +0100

    qemu: Always use iohelper for domain save
    
    This is probably not strictly needed as save operation is not live but
    we may have other reasons to avoid blocking qemu's main loop.

As the commit message mentions, using a plain file FD results
in the QEMU thread being blocked, even when O_NONBLOCK is used.

Originally this impacted the main QEMU event loop, which means
things like timers won't fire, or VNC clients won't be serviced,
and the monitor won't respond while the save is taking place.

Today IIUC, the migrate should run in a background thread, so
much of these ill effects go away, as only the migration thread
gets blocked.

This would however impact the ability to cancel the migration
operation. The cancel request could be delayed significantly
if there's alot of pending disk I/O, as the migrate thread
will be in kernel space in an uninterruptable sleep.

Similarly if there is a problem with the host storage,
eg dead NFS server, we'll be unable to cancel migration at
all on the QEMU side.

With the I/O helper, we'll still be unable to kill the I/O
helper for the same reasons, but at least the impact is
isolated from QEMU.

I didn't realize that we actually passed the file FD to
QEMU directly when restoring - I thought  we always used
the I/O helper there too. IMHO this is a bug in libvirt,
as again QEMU has always (implicitly) expected a socket/pipe
FD for fd: protocol with migration, and explicitly never
tried to provide a 'file:' protocol. 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
Re: [libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache
Posted by Claudio Fontana 2 years ago
On 4/21/22 6:27 PM, Daniel P. Berrangé wrote:
> On Tue, Apr 12, 2022 at 11:18:15AM +0200, Claudio Fontana wrote:
>> align the "save" with the "restore" code,
>> by only using the wrapper when using --bypass-cache.
>>
>> This avoids a copy, resulting in better performance.
>>
>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>> ---
>>  src/qemu/qemu_saveimage.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
>> index 4fd4c5cfcd..5ea1b2fbcc 100644
>> --- a/src/qemu/qemu_saveimage.c
>> +++ b/src/qemu/qemu_saveimage.c
>> @@ -289,8 +289,10 @@ qemuSaveImageCreate(virQEMUDriver *driver,
>>      if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def, fd) < 0)
>>          goto cleanup;
>>  
>> -    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
>> -        goto cleanup;
>> +    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
>> +        if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
>> +            goto cleanup;
>> +    }
> 
> This effectively reverts:
> 
>   commit c4caab538effb57411ad787fedb61e80d557caae
>   Author: Jiri Denemark <jdenemar@redhat.com>
>   Date:   Wed Feb 8 14:08:54 2012 +0100
> 
>     qemu: Always use iohelper for domain save
>     
>     This is probably not strictly needed as save operation is not live but
>     we may have other reasons to avoid blocking qemu's main loop.
> 
> As the commit message mentions, using a plain file FD results
> in the QEMU thread being blocked, even when O_NONBLOCK is used.
> 
> Originally this impacted the main QEMU event loop, which means
> things like timers won't fire, or VNC clients won't be serviced,
> and the monitor won't respond while the save is taking place.
> 
> Today IIUC, the migrate should run in a background thread, so
> much of these ill effects go away, as only the migration thread
> gets blocked.
> 
> This would however impact the ability to cancel the migration
> operation. The cancel request could be delayed significantly
> if there's alot of pending disk I/O, as the migrate thread
> will be in kernel space in an uninterruptable sleep.
> 
> Similarly if there is a problem with the host storage,
> eg dead NFS server, we'll be unable to cancel migration at
> all on the QEMU side.
> 
> With the I/O helper, we'll still be unable to kill the I/O
> helper for the same reasons, but at least the impact is
> isolated from QEMU.
> 
> I didn't realize that we actually passed the file FD to
> QEMU directly when restoring - I thought  we always used
> the I/O helper there too. IMHO this is a bug in libvirt,

I think there might be more behind that.

I think it would be overkill to always use the wrapper, because of how qemuSaveImageOpen() has been implemented,
as opposed to qemuSaveImageCreate().

qemuSaveImageOpen is used for a plethora of use cases, and this stems I think from the fact that
sometimes libvirt just wants to operate on the XML it stores mixed up with the QEMU VM,
and sometimes it wants to actually read the whole QEMU VM.

(I wonder if storing them in separate files would have been preferable)

So the mid-way that has been found there seems to be, only use the wrapper when opening with bypass-cache,
otherwise the iohelper would be employed each and every time libvirt wants to access the xml as well.

Of course I cannot be sure of the intentions there.

Thanks,

Claudio

> as again QEMU has always (implicitly) expected a socket/pipe
> FD for fd: protocol with migration, and explicitly never
> tried to provide a 'file:' protocol. 
> 
> With regards,
> Daniel
> 
Re: [libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache
Posted by Daniel P. Berrangé 2 years ago
On Thu, Apr 21, 2022 at 05:27:09PM +0100, Daniel P. Berrangé wrote:
> On Tue, Apr 12, 2022 at 11:18:15AM +0200, Claudio Fontana wrote:
> > align the "save" with the "restore" code,
> > by only using the wrapper when using --bypass-cache.
> > 
> > This avoids a copy, resulting in better performance.
> > 
> > Signed-off-by: Claudio Fontana <cfontana@suse.de>
> > ---
> >  src/qemu/qemu_saveimage.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
> > index 4fd4c5cfcd..5ea1b2fbcc 100644
> > --- a/src/qemu/qemu_saveimage.c
> > +++ b/src/qemu/qemu_saveimage.c
> > @@ -289,8 +289,10 @@ qemuSaveImageCreate(virQEMUDriver *driver,
> >      if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def, fd) < 0)
> >          goto cleanup;
> >  
> > -    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> > -        goto cleanup;
> > +    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
> > +        if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> > +            goto cleanup;
> > +    }
> 
> This effectively reverts:
> 
>   commit c4caab538effb57411ad787fedb61e80d557caae
>   Author: Jiri Denemark <jdenemar@redhat.com>
>   Date:   Wed Feb 8 14:08:54 2012 +0100
> 
>     qemu: Always use iohelper for domain save
>     
>     This is probably not strictly needed as save operation is not live but
>     we may have other reasons to avoid blocking qemu's main loop.
> 
> As the commit message mentions, using a plain file FD results
> in the QEMU thread being blocked, even when O_NONBLOCK is used.

Oh and the bit about 'save operation is not live' is actually
outdated. The VM CPUs are stopped when using 'virsh save'
but are not stopped when using 'virsh snapshot-create' for
live snapshots.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache
Posted by Claudio Fontana 2 years ago
On 4/21/22 6:52 PM, Daniel P. Berrangé wrote:
> On Thu, Apr 21, 2022 at 05:27:09PM +0100, Daniel P. Berrangé wrote:
>> On Tue, Apr 12, 2022 at 11:18:15AM +0200, Claudio Fontana wrote:
>>> align the "save" with the "restore" code,
>>> by only using the wrapper when using --bypass-cache.
>>>
>>> This avoids a copy, resulting in better performance.
>>>
>>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>>> ---
>>>  src/qemu/qemu_saveimage.c | 6 ++++--
>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
>>> index 4fd4c5cfcd..5ea1b2fbcc 100644
>>> --- a/src/qemu/qemu_saveimage.c
>>> +++ b/src/qemu/qemu_saveimage.c
>>> @@ -289,8 +289,10 @@ qemuSaveImageCreate(virQEMUDriver *driver,
>>>      if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def, fd) < 0)
>>>          goto cleanup;
>>>  
>>> -    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
>>> -        goto cleanup;
>>> +    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
>>> +        if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
>>> +            goto cleanup;
>>> +    }
>>
>> This effectively reverts:
>>
>>   commit c4caab538effb57411ad787fedb61e80d557caae
>>   Author: Jiri Denemark <jdenemar@redhat.com>
>>   Date:   Wed Feb 8 14:08:54 2012 +0100
>>
>>     qemu: Always use iohelper for domain save
>>     
>>     This is probably not strictly needed as save operation is not live but
>>     we may have other reasons to avoid blocking qemu's main loop.
>>
>> As the commit message mentions, using a plain file FD results
>> in the QEMU thread being blocked, even when O_NONBLOCK is used.
> 

Then we are stuck with the extra copy apparently, if the helper cannot be taken out of the picture.

The problem is, the only way currently to get the maximum bandwidth out for a regular save (non-multifd), is to skip libvirt entirely,
which is very hard to maintain.

Maybe we need a --skip-iohelper option?


> Oh and the bit about 'save operation is not live' is actually
> outdated. The VM CPUs are stopped when using 'virsh save'
> but are not stopped when using 'virsh snapshot-create' for
> live snapshots.

So --skip-iohelper should be an option for only save and managedsave?

> 
> With regards,
> Daniel
> 

Thanks,

Claudio
Re: [libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache
Posted by Ani Sinha 2 years ago
On Tue, Apr 12, 2022 at 2:48 PM Claudio Fontana <cfontana@suse.de> wrote:
>
> align the "save" with the "restore" code,
> by only using the wrapper when using --bypass-cache.
>
> This avoids a copy, resulting in better performance.
>
> Signed-off-by: Claudio Fontana <cfontana@suse.de>
> ---
>  src/qemu/qemu_saveimage.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
> index 4fd4c5cfcd..5ea1b2fbcc 100644
> --- a/src/qemu/qemu_saveimage.c
> +++ b/src/qemu/qemu_saveimage.c
> @@ -289,8 +289,10 @@ qemuSaveImageCreate(virQEMUDriver *driver,
>      if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def, fd) < 0)
>          goto cleanup;
>
> -    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> -        goto cleanup;
> +    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
> +        if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> +            goto cleanup;
> +    }

There is an obvious issue with this code. We are trying to close and
free a file descriptor that we have not opened when
VIR_DOMAIN_SAVE_BYPASS_CACHE is set in flags.

>
>      if (virQEMUSaveDataWrite(data, fd, path) < 0)
>          goto cleanup;
> --
> 2.34.1
>
Re: [libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache
Posted by Ani Sinha 2 years ago
On Tue, Apr 12, 2022 at 10:09 PM Ani Sinha <ani@anisinha.ca> wrote:

> On Tue, Apr 12, 2022 at 2:48 PM Claudio Fontana <cfontana@suse.de> wrote:
> >
> > align the "save" with the "restore" code,
> > by only using the wrapper when using --bypass-cache.
> >
> > This avoids a copy, resulting in better performance.
> >
> > Signed-off-by: Claudio Fontana <cfontana@suse.de>
> > ---
> >  src/qemu/qemu_saveimage.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
> > index 4fd4c5cfcd..5ea1b2fbcc 100644
> > --- a/src/qemu/qemu_saveimage.c
> > +++ b/src/qemu/qemu_saveimage.c
> > @@ -289,8 +289,10 @@ qemuSaveImageCreate(virQEMUDriver *driver,
> >      if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def,
> fd) < 0)
> >          goto cleanup;
> >
> > -    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> > -        goto cleanup;
> > +    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
> > +        if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> > +            goto cleanup;
> > +    }
>
> There is an obvious issue with this code. We are trying to close and
> free a file descriptor that we have not opened when
> VIR_DOMAIN_SAVE_BYPASS_CACHE is set in flags.


I meant *not* set in flags.


>
> >
> >      if (virQEMUSaveDataWrite(data, fd, path) < 0)
> >          goto cleanup;
> > --
> > 2.34.1
> >
>
Re: [libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache
Posted by Claudio Fontana 2 years ago
On 4/12/22 7:23 PM, Ani Sinha wrote:
> On Tue, Apr 12, 2022 at 10:09 PM Ani Sinha <ani@anisinha.ca> wrote:
> 
>> On Tue, Apr 12, 2022 at 2:48 PM Claudio Fontana <cfontana@suse.de> wrote:
>>>
>>> align the "save" with the "restore" code,
>>> by only using the wrapper when using --bypass-cache.
>>>
>>> This avoids a copy, resulting in better performance.
>>>
>>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>>> ---
>>>  src/qemu/qemu_saveimage.c | 6 ++++--
>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
>>> index 4fd4c5cfcd..5ea1b2fbcc 100644
>>> --- a/src/qemu/qemu_saveimage.c
>>> +++ b/src/qemu/qemu_saveimage.c
>>> @@ -289,8 +289,10 @@ qemuSaveImageCreate(virQEMUDriver *driver,
>>>      if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def,
>> fd) < 0)
>>>          goto cleanup;
>>>
>>> -    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
>>> -        goto cleanup;
>>> +    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
>>> +        if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
>>> +            goto cleanup;
>>> +    }
>>
>> There is an obvious issue with this code. We are trying to close and
>> free a file descriptor that we have not opened when
>> VIR_DOMAIN_SAVE_BYPASS_CACHE is set in flags.
> 
> 
> I meant *not* set in flags.

I don't think so. I don't think it is obvious, so can you be more specific?

From the function header comments it seems lecit and defined to call the function with NULL:

/**                                                                                                                     
 * virFileWrapperFdClose:                                                                                               
 * @wfd: fd wrapper, or NULL                                                                                            
 *                                                                                                                      
 * If @wfd is valid, then ensure that I/O has completed, which may                                                      
 * include reaping a child process.  Return 0 if all data for the                                                       
 * wrapped fd is complete, or -1 on failure with an error emitted.                                                      
 * This function intentionally returns 0 when @wfd is NULL, so that                                                     
 * callers can conditionally create a virFileWrapperFd wrapper but                                                      
 * unconditionally call the cleanup code.  To avoid deadlock, only                                                      
 * call this after closing the fd resulting from virFileWrapperFdNew().                                                 
 *                                                                                                                      
 * This function can be safely called multiple times on the same @wfd.                                                  
 */

Seems it has been specifically designed to simplify situations like this.

Thanks,

Claudio


> 
> 
>>
>>>
>>>      if (virQEMUSaveDataWrite(data, fd, path) < 0)
>>>          goto cleanup;
>>> --
>>> 2.34.1
>>>
>>
>
Re: [libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache
Posted by Ani Sinha 2 years ago
On Wed, Apr 13, 2022 at 12:49 AM Claudio Fontana <cfontana@suse.de> wrote:
>
> On 4/12/22 7:23 PM, Ani Sinha wrote:
> > On Tue, Apr 12, 2022 at 10:09 PM Ani Sinha <ani@anisinha.ca> wrote:
> >
> >> On Tue, Apr 12, 2022 at 2:48 PM Claudio Fontana <cfontana@suse.de> wrote:
> >>>
> >>> align the "save" with the "restore" code,
> >>> by only using the wrapper when using --bypass-cache.
> >>>
> >>> This avoids a copy, resulting in better performance.
> >>>
> >>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
> >>> ---
> >>>  src/qemu/qemu_saveimage.c | 6 ++++--
> >>>  1 file changed, 4 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
> >>> index 4fd4c5cfcd..5ea1b2fbcc 100644
> >>> --- a/src/qemu/qemu_saveimage.c
> >>> +++ b/src/qemu/qemu_saveimage.c
> >>> @@ -289,8 +289,10 @@ qemuSaveImageCreate(virQEMUDriver *driver,
> >>>      if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def,
> >> fd) < 0)
> >>>          goto cleanup;
> >>>
> >>> -    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> >>> -        goto cleanup;
> >>> +    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
> >>> +        if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> >>> +            goto cleanup;
> >>> +    }
> >>
> >> There is an obvious issue with this code. We are trying to close and
> >> free a file descriptor that we have not opened when
> >> VIR_DOMAIN_SAVE_BYPASS_CACHE is set in flags.
> >
> >
> > I meant *not* set in flags.
>
> I don't think so. I don't think it is obvious, so can you be more specific?

See
   if (qemuDomainFileWrapperFDClose(vm, wrapperFd) < 0)
        goto cleanup;

and under cleanup:
   if (qemuDomainFileWrapperFDClose(vm, wrapperFd) < 0)
        ret = -1;

if VIR_DOMAIN_SAVE_BYPASS_CACHE not set in flags, both the above
function calls use wrapperFd initialized to NULL.
I think this patch would be incomplete if you do not handle all these scenarios.

>
> From the function header comments it seems lecit and defined to call the function with NULL:
>
> /**
>  * virFileWrapperFdClose:
>  * @wfd: fd wrapper, or NULL
>  *
>  * If @wfd is valid, then ensure that I/O has completed, which may
>  * include reaping a child process.  Return 0 if all data for the
>  * wrapped fd is complete, or -1 on failure with an error emitted.
>  * This function intentionally returns 0 when @wfd is NULL, so that
>  * callers can conditionally create a virFileWrapperFd wrapper but
>  * unconditionally call the cleanup code.  To avoid deadlock, only
>  * call this after closing the fd resulting from virFileWrapperFdNew().
>  *
>  * This function can be safely called multiple times on the same @wfd.
>  */
>
> Seems it has been specifically designed to simplify situations like this.
>
> Thanks,
>
> Claudio
>
>
> >
> >
> >>
> >>>
> >>>      if (virQEMUSaveDataWrite(data, fd, path) < 0)
> >>>          goto cleanup;
> >>> --
> >>> 2.34.1
> >>>
> >>
> >
>
Re: [libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache
Posted by Claudio Fontana 2 years ago
On 4/13/22 6:13 AM, Ani Sinha wrote:
> On Wed, Apr 13, 2022 at 12:49 AM Claudio Fontana <cfontana@suse.de> wrote:
>>
>> On 4/12/22 7:23 PM, Ani Sinha wrote:
>>> On Tue, Apr 12, 2022 at 10:09 PM Ani Sinha <ani@anisinha.ca> wrote:
>>>
>>>> On Tue, Apr 12, 2022 at 2:48 PM Claudio Fontana <cfontana@suse.de> wrote:
>>>>>
>>>>> align the "save" with the "restore" code,
>>>>> by only using the wrapper when using --bypass-cache.
>>>>>
>>>>> This avoids a copy, resulting in better performance.
>>>>>
>>>>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>>>>> ---
>>>>>  src/qemu/qemu_saveimage.c | 6 ++++--
>>>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
>>>>> index 4fd4c5cfcd..5ea1b2fbcc 100644
>>>>> --- a/src/qemu/qemu_saveimage.c
>>>>> +++ b/src/qemu/qemu_saveimage.c
>>>>> @@ -289,8 +289,10 @@ qemuSaveImageCreate(virQEMUDriver *driver,
>>>>>      if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def,
>>>> fd) < 0)
>>>>>          goto cleanup;
>>>>>
>>>>> -    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
>>>>> -        goto cleanup;
>>>>> +    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
>>>>> +        if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
>>>>> +            goto cleanup;
>>>>> +    }
>>>>
>>>> There is an obvious issue with this code. We are trying to close and
>>>> free a file descriptor that we have not opened when
>>>> VIR_DOMAIN_SAVE_BYPASS_CACHE is set in flags.
>>>
>>>
>>> I meant *not* set in flags.
>>
>> I don't think so. I don't think it is obvious, so can you be more specific?
> 
> See
>    if (qemuDomainFileWrapperFDClose(vm, wrapperFd) < 0)
>         goto cleanup;
> 
> and under cleanup:
>    if (qemuDomainFileWrapperFDClose(vm, wrapperFd) < 0)
>         ret = -1;
> 
> if VIR_DOMAIN_SAVE_BYPASS_CACHE not set in flags, both the above
> function calls use wrapperFd initialized to NULL.
> I think this patch would be incomplete if you do not handle all these scenarios.
> 
>>
>> From the function header comments it seems lecit and defined to call the function with NULL:
>>
>> /**
>>  * virFileWrapperFdClose:
>>  * @wfd: fd wrapper, or NULL
>>  *
>>  * If @wfd is valid, then ensure that I/O has completed, which may
>>  * include reaping a child process.  Return 0 if all data for the
>>  * wrapped fd is complete, or -1 on failure with an error emitted.
>>  * This function intentionally returns 0 when @wfd is NULL, so that
>>  * callers can conditionally create a virFileWrapperFd wrapper but
>>  * unconditionally call the cleanup code.  To avoid deadlock, only
>>  * call this after closing the fd resulting from virFileWrapperFdNew().
>>  *
>>  * This function can be safely called multiple times on the same @wfd.
>>  */
>>
>> Seems it has been specifically designed to simplify situations like this.
>>

Hi Ani, did you read the comments snippet above?

To me it looks like there is nothing else to do here, because someone thought about this,
and made the wrapper API easier to use.

Thanks,

Claudio
Re: [libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache
Posted by Ani Sinha 2 years ago
On Wed, Apr 13, 2022 at 11:57 AM Claudio Fontana <cfontana@suse.de> wrote:
>
> On 4/13/22 6:13 AM, Ani Sinha wrote:
> > On Wed, Apr 13, 2022 at 12:49 AM Claudio Fontana <cfontana@suse.de> wrote:
> >>
> >> On 4/12/22 7:23 PM, Ani Sinha wrote:
> >>> On Tue, Apr 12, 2022 at 10:09 PM Ani Sinha <ani@anisinha.ca> wrote:
> >>>
> >>>> On Tue, Apr 12, 2022 at 2:48 PM Claudio Fontana <cfontana@suse.de> wrote:
> >>>>>
> >>>>> align the "save" with the "restore" code,
> >>>>> by only using the wrapper when using --bypass-cache.
> >>>>>
> >>>>> This avoids a copy, resulting in better performance.
> >>>>>
> >>>>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
> >>>>> ---
> >>>>>  src/qemu/qemu_saveimage.c | 6 ++++--
> >>>>>  1 file changed, 4 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
> >>>>> index 4fd4c5cfcd..5ea1b2fbcc 100644
> >>>>> --- a/src/qemu/qemu_saveimage.c
> >>>>> +++ b/src/qemu/qemu_saveimage.c
> >>>>> @@ -289,8 +289,10 @@ qemuSaveImageCreate(virQEMUDriver *driver,
> >>>>>      if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def,
> >>>> fd) < 0)
> >>>>>          goto cleanup;
> >>>>>
> >>>>> -    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> >>>>> -        goto cleanup;
> >>>>> +    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
> >>>>> +        if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> >>>>> +            goto cleanup;
> >>>>> +    }
> >>>>
> >>>> There is an obvious issue with this code. We are trying to close and
> >>>> free a file descriptor that we have not opened when
> >>>> VIR_DOMAIN_SAVE_BYPASS_CACHE is set in flags.
> >>>
> >>>
> >>> I meant *not* set in flags.
> >>
> >> I don't think so. I don't think it is obvious, so can you be more specific?
> >
> > See
> >    if (qemuDomainFileWrapperFDClose(vm, wrapperFd) < 0)
> >         goto cleanup;
> >
> > and under cleanup:
> >    if (qemuDomainFileWrapperFDClose(vm, wrapperFd) < 0)
> >         ret = -1;
> >
> > if VIR_DOMAIN_SAVE_BYPASS_CACHE not set in flags, both the above
> > function calls use wrapperFd initialized to NULL.
> > I think this patch would be incomplete if you do not handle all these scenarios.
> >
> >>
> >> From the function header comments it seems lecit and defined to call the function with NULL:
> >>
> >> /**
> >>  * virFileWrapperFdClose:
> >>  * @wfd: fd wrapper, or NULL
> >>  *
> >>  * If @wfd is valid, then ensure that I/O has completed, which may
> >>  * include reaping a child process.  Return 0 if all data for the
> >>  * wrapped fd is complete, or -1 on failure with an error emitted.
> >>  * This function intentionally returns 0 when @wfd is NULL, so that
> >>  * callers can conditionally create a virFileWrapperFd wrapper but
> >>  * unconditionally call the cleanup code.  To avoid deadlock, only
> >>  * call this after closing the fd resulting from virFileWrapperFdNew().
> >>  *
> >>  * This function can be safely called multiple times on the same @wfd.
> >>  */
> >>
> >> Seems it has been specifically designed to simplify situations like this.
> >>
>
> Hi Ani, did you read the comments snippet above?

yes and I get that the called functions today are written with safety
for such cases in mind (check for null file descriptors). However, I
still think it is quite unclean (and looks buggy) to do stuff that
does not apply. With this change, wrapperFd only used when
VIR_DOMAIN_SAVE_BYPASS_CACHE is set and we should avoid any operations
on the fd when VIR_DOMAIN_SAVE_BYPASS_CACHE is not set.
That being said, previous to this patch, under error conditions from
call to virFileWrapperFdNew() also would call close() and free()
without checks for null fd from cleanup. So I guess we are not at
least making things any worse by not checking for null fd from the
caller of those two functions.
Re: [libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache
Posted by Claudio Fontana 2 years ago
On 4/13/22 8:57 AM, Ani Sinha wrote:
> On Wed, Apr 13, 2022 at 11:57 AM Claudio Fontana <cfontana@suse.de> wrote:
>>
>> On 4/13/22 6:13 AM, Ani Sinha wrote:
>>> On Wed, Apr 13, 2022 at 12:49 AM Claudio Fontana <cfontana@suse.de> wrote:
>>>>
>>>> On 4/12/22 7:23 PM, Ani Sinha wrote:
>>>>> On Tue, Apr 12, 2022 at 10:09 PM Ani Sinha <ani@anisinha.ca> wrote:
>>>>>
>>>>>> On Tue, Apr 12, 2022 at 2:48 PM Claudio Fontana <cfontana@suse.de> wrote:
>>>>>>>
>>>>>>> align the "save" with the "restore" code,
>>>>>>> by only using the wrapper when using --bypass-cache.
>>>>>>>
>>>>>>> This avoids a copy, resulting in better performance.
>>>>>>>
>>>>>>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>>>>>>> ---
>>>>>>>  src/qemu/qemu_saveimage.c | 6 ++++--
>>>>>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
>>>>>>> index 4fd4c5cfcd..5ea1b2fbcc 100644
>>>>>>> --- a/src/qemu/qemu_saveimage.c
>>>>>>> +++ b/src/qemu/qemu_saveimage.c
>>>>>>> @@ -289,8 +289,10 @@ qemuSaveImageCreate(virQEMUDriver *driver,
>>>>>>>      if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def,
>>>>>> fd) < 0)
>>>>>>>          goto cleanup;
>>>>>>>
>>>>>>> -    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
>>>>>>> -        goto cleanup;
>>>>>>> +    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
>>>>>>> +        if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
>>>>>>> +            goto cleanup;
>>>>>>> +    }
>>>>>>
>>>>>> There is an obvious issue with this code. We are trying to close and
>>>>>> free a file descriptor that we have not opened when
>>>>>> VIR_DOMAIN_SAVE_BYPASS_CACHE is set in flags.
>>>>>
>>>>>
>>>>> I meant *not* set in flags.
>>>>
>>>> I don't think so. I don't think it is obvious, so can you be more specific?
>>>
>>> See
>>>    if (qemuDomainFileWrapperFDClose(vm, wrapperFd) < 0)
>>>         goto cleanup;
>>>
>>> and under cleanup:
>>>    if (qemuDomainFileWrapperFDClose(vm, wrapperFd) < 0)
>>>         ret = -1;
>>>
>>> if VIR_DOMAIN_SAVE_BYPASS_CACHE not set in flags, both the above
>>> function calls use wrapperFd initialized to NULL.
>>> I think this patch would be incomplete if you do not handle all these scenarios.
>>>
>>>>
>>>> From the function header comments it seems lecit and defined to call the function with NULL:
>>>>
>>>> /**
>>>>  * virFileWrapperFdClose:
>>>>  * @wfd: fd wrapper, or NULL
>>>>  *
>>>>  * If @wfd is valid, then ensure that I/O has completed, which may
>>>>  * include reaping a child process.  Return 0 if all data for the
>>>>  * wrapped fd is complete, or -1 on failure with an error emitted.
>>>>  * This function intentionally returns 0 when @wfd is NULL, so that
>>>>  * callers can conditionally create a virFileWrapperFd wrapper but
>>>>  * unconditionally call the cleanup code.  To avoid deadlock, only
>>>>  * call this after closing the fd resulting from virFileWrapperFdNew().
>>>>  *
>>>>  * This function can be safely called multiple times on the same @wfd.
>>>>  */
>>>>
>>>> Seems it has been specifically designed to simplify situations like this.
>>>>
>>
>> Hi Ani, did you read the comments snippet above?
> 
> yes and I get that the called functions today are written with safety
> for such cases in mind (check for null file descriptors). However, I
> still think it is quite unclean (and looks buggy) to do stuff that
> does not apply. With this change, wrapperFd only used when
> VIR_DOMAIN_SAVE_BYPASS_CACHE is set and we should avoid any operations
> on the fd when VIR_DOMAIN_SAVE_BYPASS_CACHE is not set.
> That being said, previous to this patch, under error conditions from
> call to virFileWrapperFdNew() also would call close() and free()
> without checks for null fd from cleanup. So I guess we are not at
> least making things any worse by not checking for null fd from the
> caller of those two functions.
> 

It is a curious approach I agree, but I think this is how libvirt wants it, based on specifically this part of the comment in the code:

"
 This function intentionally returns 0 when @wfd is NULL, so that
 * callers can conditionally create a virFileWrapperFd wrapper but
 * unconditionally call the cleanup code. 
"

Thanks,

Claudio
Re: [libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache
Posted by Ani Sinha 2 years ago
On Wed, Apr 13, 2022 at 9:43 AM Ani Sinha <ani@anisinha.ca> wrote:
>
> On Wed, Apr 13, 2022 at 12:49 AM Claudio Fontana <cfontana@suse.de> wrote:
> >
> > On 4/12/22 7:23 PM, Ani Sinha wrote:
> > > On Tue, Apr 12, 2022 at 10:09 PM Ani Sinha <ani@anisinha.ca> wrote:
> > >
> > >> On Tue, Apr 12, 2022 at 2:48 PM Claudio Fontana <cfontana@suse.de> wrote:
> > >>>
> > >>> align the "save" with the "restore" code,
> > >>> by only using the wrapper when using --bypass-cache.
> > >>>
> > >>> This avoids a copy, resulting in better performance.
> > >>>
> > >>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
> > >>> ---
> > >>>  src/qemu/qemu_saveimage.c | 6 ++++--
> > >>>  1 file changed, 4 insertions(+), 2 deletions(-)
> > >>>
> > >>> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
> > >>> index 4fd4c5cfcd..5ea1b2fbcc 100644
> > >>> --- a/src/qemu/qemu_saveimage.c
> > >>> +++ b/src/qemu/qemu_saveimage.c
> > >>> @@ -289,8 +289,10 @@ qemuSaveImageCreate(virQEMUDriver *driver,
> > >>>      if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def,
> > >> fd) < 0)
> > >>>          goto cleanup;
> > >>>
> > >>> -    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> > >>> -        goto cleanup;
> > >>> +    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
> > >>> +        if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> > >>> +            goto cleanup;
> > >>> +    }
> > >>
> > >> There is an obvious issue with this code. We are trying to close and
> > >> free a file descriptor that we have not opened when
> > >> VIR_DOMAIN_SAVE_BYPASS_CACHE is set in flags.
> > >
> > >
> > > I meant *not* set in flags.
> >
> > I don't think so. I don't think it is obvious, so can you be more specific?
>
> See
>    if (qemuDomainFileWrapperFDClose(vm, wrapperFd) < 0)
>         goto cleanup;
>
> and under cleanup:
>    if (qemuDomainFileWrapperFDClose(vm, wrapperFd) < 0)
>         ret = -1;

Also see virFileWrapperFdFree() in cleanup.

>
> if VIR_DOMAIN_SAVE_BYPASS_CACHE not set in flags, both the above
> function calls use wrapperFd initialized to NULL.
> I think this patch would be incomplete if you do not handle all these scenarios.
>
> >
> > From the function header comments it seems lecit and defined to call the function with NULL:
> >
> > /**
> >  * virFileWrapperFdClose:
> >  * @wfd: fd wrapper, or NULL
> >  *
> >  * If @wfd is valid, then ensure that I/O has completed, which may
> >  * include reaping a child process.  Return 0 if all data for the
> >  * wrapped fd is complete, or -1 on failure with an error emitted.
> >  * This function intentionally returns 0 when @wfd is NULL, so that
> >  * callers can conditionally create a virFileWrapperFd wrapper but
> >  * unconditionally call the cleanup code.  To avoid deadlock, only
> >  * call this after closing the fd resulting from virFileWrapperFdNew().
> >  *
> >  * This function can be safely called multiple times on the same @wfd.
> >  */
> >
> > Seems it has been specifically designed to simplify situations like this.
> >
> > Thanks,
> >
> > Claudio
> >
> >
> > >
> > >
> > >>
> > >>>
> > >>>      if (virQEMUSaveDataWrite(data, fd, path) < 0)
> > >>>          goto cleanup;
> > >>> --
> > >>> 2.34.1
> > >>>
> > >>
> > >
> >
Re: [libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache
Posted by Ani Sinha 2 years ago
On Tue, Apr 12, 2022 at 2:48 PM Claudio Fontana <cfontana@suse.de> wrote:
>
> align the "save" with the "restore" code,
> by only using the wrapper when using --bypass-cache.
>
> This avoids a copy, resulting in better performance.
>
> Signed-off-by: Claudio Fontana <cfontana@suse.de>

Reviewed-by: Ani Sinha <ani@anisinha.ca>

> ---
>  src/qemu/qemu_saveimage.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
> index 4fd4c5cfcd..5ea1b2fbcc 100644
> --- a/src/qemu/qemu_saveimage.c
> +++ b/src/qemu/qemu_saveimage.c
> @@ -289,8 +289,10 @@ qemuSaveImageCreate(virQEMUDriver *driver,
>      if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def, fd) < 0)
>          goto cleanup;
>
> -    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> -        goto cleanup;
> +    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
> +        if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> +            goto cleanup;
> +    }
>
>      if (virQEMUSaveDataWrite(data, fd, path) < 0)
>          goto cleanup;
> --
> 2.34.1
>
Re: [libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache
Posted by Claudio Fontana 2 years ago
a ping on this one;

this change makes sense to me, and makes performance better in specific cases, by avoiding --bypass-cache before reaching the "cache trashing" state.

However, before virsh save was guarenteed to go through iohelper, now it is not.

The iohelper was guaranteeing a virFileDataSync before exiting,
while after the change the data will still be in-flight between the kernel and the device as virsh save returns to the prompt.

Seems more efficient, but wanted to mention this as a user-visible change.

Thoughts?

Claudio


On 4/13/22 9:18 AM, Ani Sinha wrote:
> On Tue, Apr 12, 2022 at 2:48 PM Claudio Fontana <cfontana@suse.de> wrote:
>>
>> align the "save" with the "restore" code,
>> by only using the wrapper when using --bypass-cache.
>>
>> This avoids a copy, resulting in better performance.
>>
>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
> 
> Reviewed-by: Ani Sinha <ani@anisinha.ca>
> 
>> ---
>>  src/qemu/qemu_saveimage.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
>> index 4fd4c5cfcd..5ea1b2fbcc 100644
>> --- a/src/qemu/qemu_saveimage.c
>> +++ b/src/qemu/qemu_saveimage.c
>> @@ -289,8 +289,10 @@ qemuSaveImageCreate(virQEMUDriver *driver,
>>      if (qemuSecuritySetImageFDLabel(driver->securityManager, vm->def, fd) < 0)
>>          goto cleanup;
>>
>> -    if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
>> -        goto cleanup;
>> +    if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
>> +        if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
>> +            goto cleanup;
>> +    }
>>
>>      if (virQEMUSaveDataWrite(data, fd, path) < 0)
>>          goto cleanup;
>> --
>> 2.34.1
>>
>
Re: [libvirt RFC] qemu_saveimage: only use wrapper when using --bypass-cache
Posted by Daniel P. Berrangé 2 years ago
On Thu, Apr 21, 2022 at 04:55:53PM +0200, Claudio Fontana wrote:
> a ping on this one;
> 
> this change makes sense to me, and makes performance better in specific cases, by avoiding --bypass-cache before reaching the "cache trashing" state.
> 
> However, before virsh save was guarenteed to go through iohelper, now it is not.
> 
> The iohelper was guaranteeing a virFileDataSync before exiting,
> while after the change the data will still be in-flight between
> the kernel and the device as virsh save returns to the prompt.

This is a correctness problem I'm afraid, notably with network
filesystems

commit f32e3a2dd686f3692cd2bd3147c03e90f82df987
Author: Michal Prívozník <mprivozn@redhat.com>
Date:   Tue Oct 30 19:15:48 2012 +0100

    iohelper: fdatasync() at the end
    
    Currently, when we are doing (managed) save, we insert the
    iohelper between the qemu and OS. The pipe is created, the
    writing end is passed to qemu and the reading end to the
    iohelper. It reads data and write them into given file. However,
    with write() being asynchronous data may still be in OS
    caches and hence in some (corner) cases, all migration data
    may have been read and written (not physically though). So
    qemu will report success, as well as iohelper. However, with
    some non local filesystems, where ENOSPACE is polled every X
    time units, we may get into situation where all operations
    succeeded but data hasn't reached the disk. And in fact will
    never do. Therefore we ought sync caches to make sure data
    has reached the block device on remote host.


It is also a crash-safety issue. ie if the host OS crashes
after we've created the save image, the image will still
exist on disk but can have lost arbitrary amounts of data,
in a way they might not be detectable until the restored
VM randomly crashes hours/days later.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|