[libvirt RFCv4] virfile: set pipe size in virFileWrapperFdNew to improve throughput

Claudio Fontana posted 1 patch 2 years, 1 month ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20220325151019.17935-1-cfontana@suse.de
Test syntax-check passed
There is a newer version of this series
src/util/virfile.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 46 insertions(+)
[libvirt RFCv4] virfile: set pipe size in virFileWrapperFdNew to improve throughput
Posted by Claudio Fontana 2 years, 1 month ago
currently the only user of virFileWrapperFdNew is the qemu driver;
virsh save is very slow with a default pipe size.
This change improves throughput by ~400% on fast nvme or ramdisk.

Best value currently measured is 1MB, which happens to be also
the kernel default for the pipe-max-size.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 src/util/virfile.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

see v2 at
https://listman.redhat.com/archives/libvir-list/2022-March/229423.html

Changes v3 -> v4:

* changed INFO and WARN messages to DEBUG (Daniel)

Changes v2 -> v3:

* removed reading of max-pipe-size from procfs,
  instead make multiple attempts on EPERM with smaller sizes.
  In the regular case, this should succeed on the first try.
  (Daniel)

Changes v1 -> v2:

* removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing
  unconditional (Michal)

* moved code to separate functions (Michal)

* removed ternary op, disliked in libvirt (Michal)

* added #ifdef __linux__ (Ani Sinha)

* try smallest value between currently best measured value (1MB)
  and the pipe-max-size setting. If pipe-max-size cannot be read,
  try kernel default max (1MB). (Daniel)


diff --git a/src/util/virfile.c b/src/util/virfile.c
index a04f888e06..87539be0b9 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -201,6 +201,50 @@ struct _virFileWrapperFd {
 };
 
 #ifndef WIN32
+
+#ifdef __linux__
+
+/**
+ * virFileWrapperSetPipeSize:
+ * @fd: the fd of the pipe
+ *
+ * Set best pipe size on the passed file descriptor for bulk transfers of data.
+ *
+ * default pipe size (usually 64K) is generally not suited for large transfers
+ * to fast devices. A value of 1MB has been measured to improve virsh save
+ * by 400% in ideal conditions. We retry multiple times with smaller sizes
+ * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size.
+ *
+ * OS note: only for linux, on other OS this is a no-op.
+ */
+static void
+virFileWrapperSetPipeSize(int fd)
+{
+    int sz;
+
+    for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) {
+        int rv = fcntl(fd, F_SETPIPE_SZ, sz);
+        if (rv < 0 && errno == EPERM) {
+            VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz);
+            continue; /* retry with half the size */
+        }
+        if (rv < 0) {
+            break;
+        }
+        VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz);
+        return;
+    }
+    virReportSystemError(errno, "%s", _("unable to set pipe size, data transfer might be slow"));
+}
+
+#else /* !__linux__ */
+static void virFileWrapperSetPipeSize(int fd)
+{
+    return;
+}
+#endif /* !__linux__ */
+
+
 /**
  * virFileWrapperFdNew:
  * @fd: pointer to fd to wrap
@@ -282,6 +326,8 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags)
 
     ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
 
+    virFileWrapperSetPipeSize(pipefd[!output]);
+
     if (output) {
         virCommandSetInputFD(ret->cmd, pipefd[0]);
         virCommandSetOutputFD(ret->cmd, fd);
-- 
2.35.1
Re: [libvirt RFCv4] virfile: set pipe size in virFileWrapperFdNew to improve throughput
Posted by Michal Prívozník 2 years, 1 month ago
On 3/25/22 16:10, Claudio Fontana wrote:
> currently the only user of virFileWrapperFdNew is the qemu driver;
> virsh save is very slow with a default pipe size.
> This change improves throughput by ~400% on fast nvme or ramdisk.
> 
> Best value currently measured is 1MB, which happens to be also
> the kernel default for the pipe-max-size.
> 
> Signed-off-by: Claudio Fontana <cfontana@suse.de>
> ---
>  src/util/virfile.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 46 insertions(+)
> 
> see v2 at
> https://listman.redhat.com/archives/libvir-list/2022-March/229423.html
> 
> Changes v3 -> v4:
> 
> * changed INFO and WARN messages to DEBUG (Daniel)
> 
> Changes v2 -> v3:
> 
> * removed reading of max-pipe-size from procfs,
>   instead make multiple attempts on EPERM with smaller sizes.
>   In the regular case, this should succeed on the first try.
>   (Daniel)
> 
> Changes v1 -> v2:
> 
> * removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing
>   unconditional (Michal)
> 
> * moved code to separate functions (Michal)
> 
> * removed ternary op, disliked in libvirt (Michal)
> 
> * added #ifdef __linux__ (Ani Sinha)
> 
> * try smallest value between currently best measured value (1MB)
>   and the pipe-max-size setting. If pipe-max-size cannot be read,
>   try kernel default max (1MB). (Daniel)
> 
> 
> diff --git a/src/util/virfile.c b/src/util/virfile.c
> index a04f888e06..87539be0b9 100644
> --- a/src/util/virfile.c
> +++ b/src/util/virfile.c
> @@ -201,6 +201,50 @@ struct _virFileWrapperFd {
>  };
>  
>  #ifndef WIN32
> +
> +#ifdef __linux__
> +
> +/**
> + * virFileWrapperSetPipeSize:
> + * @fd: the fd of the pipe
> + *
> + * Set best pipe size on the passed file descriptor for bulk transfers of data.
> + *
> + * default pipe size (usually 64K) is generally not suited for large transfers
> + * to fast devices. A value of 1MB has been measured to improve virsh save
> + * by 400% in ideal conditions. We retry multiple times with smaller sizes
> + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size.
> + *
> + * OS note: only for linux, on other OS this is a no-op.
> + */
> +static void
> +virFileWrapperSetPipeSize(int fd)
> +{
> +    int sz;
> +
> +    for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) {
> +        int rv = fcntl(fd, F_SETPIPE_SZ, sz);
> +        if (rv < 0 && errno == EPERM) {
> +            VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz);
> +            continue; /* retry with half the size */
> +        }
> +        if (rv < 0) {
> +            break;
> +        }
> +        VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz);
> +        return;
> +    }
> +    virReportSystemError(errno, "%s", _("unable to set pipe size, data transfer might be slow"));

This should have been VIR_WARN(). It's weird to report an error when the
function returns void.

> +}
> +
> +#else /* !__linux__ */
> +static void virFileWrapperSetPipeSize(int fd)
The @fd argument is unused and thus has to be marked as such.

> +{
> +    return;
> +}
> +#endif /* !__linux__ */
> +
> +
>  /**
>   * virFileWrapperFdNew:
>   * @fd: pointer to fd to wrap
> @@ -282,6 +326,8 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags)
>  
>      ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
>  
> +    virFileWrapperSetPipeSize(pipefd[!output]);

This feels weird, because just a few lines below the pipefd[!output]) is
closed. As I said earlier, it doesn't matter what end of the pipe we set
the size on, therefore, let's switch over to pipefd[output].

> +
>      if (output) {
>          virCommandSetInputFD(ret->cmd, pipefd[0]);
>          virCommandSetOutputFD(ret->cmd, fd);

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>

and pushed.

Michal
Re: [libvirt RFCv4] virfile: set pipe size in virFileWrapperFdNew to improve throughput
Posted by Daniel P. Berrangé 2 years, 1 month ago
On Mon, Mar 28, 2022 at 01:06:03PM +0200, Michal Prívozník wrote:
> On 3/25/22 16:10, Claudio Fontana wrote:
> > currently the only user of virFileWrapperFdNew is the qemu driver;
> > virsh save is very slow with a default pipe size.
> > This change improves throughput by ~400% on fast nvme or ramdisk.
> > 
> > Best value currently measured is 1MB, which happens to be also
> > the kernel default for the pipe-max-size.
> > 
> > Signed-off-by: Claudio Fontana <cfontana@suse.de>
> > ---
> >  src/util/virfile.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 46 insertions(+)
> > 
> > see v2 at
> > https://listman.redhat.com/archives/libvir-list/2022-March/229423.html
> > 
> > Changes v3 -> v4:
> > 
> > * changed INFO and WARN messages to DEBUG (Daniel)
> > 
> > Changes v2 -> v3:
> > 
> > * removed reading of max-pipe-size from procfs,
> >   instead make multiple attempts on EPERM with smaller sizes.
> >   In the regular case, this should succeed on the first try.
> >   (Daniel)
> > 
> > Changes v1 -> v2:
> > 
> > * removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing
> >   unconditional (Michal)
> > 
> > * moved code to separate functions (Michal)
> > 
> > * removed ternary op, disliked in libvirt (Michal)
> > 
> > * added #ifdef __linux__ (Ani Sinha)
> > 
> > * try smallest value between currently best measured value (1MB)
> >   and the pipe-max-size setting. If pipe-max-size cannot be read,
> >   try kernel default max (1MB). (Daniel)
> > 
> > 
> > diff --git a/src/util/virfile.c b/src/util/virfile.c
> > index a04f888e06..87539be0b9 100644
> > --- a/src/util/virfile.c
> > +++ b/src/util/virfile.c
> > @@ -201,6 +201,50 @@ struct _virFileWrapperFd {
> >  };
> >  
> >  #ifndef WIN32
> > +
> > +#ifdef __linux__
> > +
> > +/**
> > + * virFileWrapperSetPipeSize:
> > + * @fd: the fd of the pipe
> > + *
> > + * Set best pipe size on the passed file descriptor for bulk transfers of data.
> > + *
> > + * default pipe size (usually 64K) is generally not suited for large transfers
> > + * to fast devices. A value of 1MB has been measured to improve virsh save
> > + * by 400% in ideal conditions. We retry multiple times with smaller sizes
> > + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size.
> > + *
> > + * OS note: only for linux, on other OS this is a no-op.
> > + */
> > +static void
> > +virFileWrapperSetPipeSize(int fd)
> > +{
> > +    int sz;
> > +
> > +    for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) {
> > +        int rv = fcntl(fd, F_SETPIPE_SZ, sz);
> > +        if (rv < 0 && errno == EPERM) {
> > +            VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz);
> > +            continue; /* retry with half the size */
> > +        }
> > +        if (rv < 0) {
> > +            break;
> > +        }
> > +        VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz);
> > +        return;
> > +    }
> > +    virReportSystemError(errno, "%s", _("unable to set pipe size, data transfer might be slow"));
> 
> This should have been VIR_WARN(). It's weird to report an error when the
> function returns void.

Actually I said to report an error in prvious version, as I figured we
were handling the expect EPERM, but I guess we could even fail the
last 64 KB iteration and stick with the default. So we need a slight
tweak:

 static void
 virFileWrapperSetPipeSize(int fd)
 {
     int sz;
 
     for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) {
         int rv = fcntl(fd, F_SETPIPE_SZ, sz);
         if (rv < 0 && errno == EPERM) {
             VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz);
             continue; /* retry with half the size */
         }
         if (rv < 0) {
            virReportSystemError(errno, "%s", _("unable to set pipe size"));
            return -1;
        }
        VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz);
        return 0;
    }
    VIR_WARN("Could set pipe size to 64 KB, leaving on default size");
    return 0;
 }

then the caller can treat -1 as fatal


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [libvirt RFCv4] virfile: set pipe size in virFileWrapperFdNew to improve throughput
Posted by Michal Prívozník 2 years, 1 month ago
On 3/28/22 13:10, Daniel P. Berrangé wrote:
> On Mon, Mar 28, 2022 at 01:06:03PM +0200, Michal Prívozník wrote:
>> On 3/25/22 16:10, Claudio Fontana wrote:
>>> currently the only user of virFileWrapperFdNew is the qemu driver;
>>> virsh save is very slow with a default pipe size.
>>> This change improves throughput by ~400% on fast nvme or ramdisk.
>>>
>>> Best value currently measured is 1MB, which happens to be also
>>> the kernel default for the pipe-max-size.
>>>
>>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>>> ---
>>>  src/util/virfile.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 46 insertions(+)
>>>
>>> see v2 at
>>> https://listman.redhat.com/archives/libvir-list/2022-March/229423.html
>>>
>>> Changes v3 -> v4:
>>>
>>> * changed INFO and WARN messages to DEBUG (Daniel)
>>>
>>> Changes v2 -> v3:
>>>
>>> * removed reading of max-pipe-size from procfs,
>>>   instead make multiple attempts on EPERM with smaller sizes.
>>>   In the regular case, this should succeed on the first try.
>>>   (Daniel)
>>>
>>> Changes v1 -> v2:
>>>
>>> * removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing
>>>   unconditional (Michal)
>>>
>>> * moved code to separate functions (Michal)
>>>
>>> * removed ternary op, disliked in libvirt (Michal)
>>>
>>> * added #ifdef __linux__ (Ani Sinha)
>>>
>>> * try smallest value between currently best measured value (1MB)
>>>   and the pipe-max-size setting. If pipe-max-size cannot be read,
>>>   try kernel default max (1MB). (Daniel)
>>>
>>>
>>> diff --git a/src/util/virfile.c b/src/util/virfile.c
>>> index a04f888e06..87539be0b9 100644
>>> --- a/src/util/virfile.c
>>> +++ b/src/util/virfile.c
>>> @@ -201,6 +201,50 @@ struct _virFileWrapperFd {
>>>  };
>>>  
>>>  #ifndef WIN32
>>> +
>>> +#ifdef __linux__
>>> +
>>> +/**
>>> + * virFileWrapperSetPipeSize:
>>> + * @fd: the fd of the pipe
>>> + *
>>> + * Set best pipe size on the passed file descriptor for bulk transfers of data.
>>> + *
>>> + * default pipe size (usually 64K) is generally not suited for large transfers
>>> + * to fast devices. A value of 1MB has been measured to improve virsh save
>>> + * by 400% in ideal conditions. We retry multiple times with smaller sizes
>>> + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size.
>>> + *
>>> + * OS note: only for linux, on other OS this is a no-op.
>>> + */
>>> +static void
>>> +virFileWrapperSetPipeSize(int fd)
>>> +{
>>> +    int sz;
>>> +
>>> +    for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) {
>>> +        int rv = fcntl(fd, F_SETPIPE_SZ, sz);
>>> +        if (rv < 0 && errno == EPERM) {
>>> +            VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz);
>>> +            continue; /* retry with half the size */
>>> +        }
>>> +        if (rv < 0) {
>>> +            break;
>>> +        }
>>> +        VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz);
>>> +        return;
>>> +    }
>>> +    virReportSystemError(errno, "%s", _("unable to set pipe size, data transfer might be slow"));
>>
>> This should have been VIR_WARN(). It's weird to report an error when the
>> function returns void.
> 
> Actually I said to report an error in prvious version, as I figured we
> were handling the expect EPERM, but I guess we could even fail the
> last 64 KB iteration and stick with the default. So we need a slight
> tweak:
> 
>  static void
>  virFileWrapperSetPipeSize(int fd)
>  {
>      int sz;
>  
>      for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) {
>          int rv = fcntl(fd, F_SETPIPE_SZ, sz);
>          if (rv < 0 && errno == EPERM) {
>              VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz);
>              continue; /* retry with half the size */
>          }
>          if (rv < 0) {
>             virReportSystemError(errno, "%s", _("unable to set pipe size"));
>             return -1;
>         }
>         VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz);
>         return 0;
>     }
>     VIR_WARN("Could set pipe size to 64 KB, leaving on default size");
>     return 0;
>  }
> 
> then the caller can treat -1 as fatal

Yes, in that case we could call virReportSystemError(), but the way the
code is currently written doesn't make much sense. Anyway, let me post a
follow up patch that does report error.

Michal

Re: [libvirt RFCv4] virfile: set pipe size in virFileWrapperFdNew to improve throughput
Posted by Claudio Fontana 2 years, 1 month ago
On 3/28/22 1:06 PM, Michal Prívozník wrote:
> On 3/25/22 16:10, Claudio Fontana wrote:
>> currently the only user of virFileWrapperFdNew is the qemu driver;
>> virsh save is very slow with a default pipe size.
>> This change improves throughput by ~400% on fast nvme or ramdisk.
>>
>> Best value currently measured is 1MB, which happens to be also
>> the kernel default for the pipe-max-size.
>>
>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>> ---
>>  src/util/virfile.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 46 insertions(+)
>>
>> see v2 at
>> https://listman.redhat.com/archives/libvir-list/2022-March/229423.html
>>
>> Changes v3 -> v4:
>>
>> * changed INFO and WARN messages to DEBUG (Daniel)
>>
>> Changes v2 -> v3:
>>
>> * removed reading of max-pipe-size from procfs,
>>   instead make multiple attempts on EPERM with smaller sizes.
>>   In the regular case, this should succeed on the first try.
>>   (Daniel)
>>
>> Changes v1 -> v2:
>>
>> * removed VIR_FILE_WRAPPER_BIG_PIPE, made the new pipe resizing
>>   unconditional (Michal)
>>
>> * moved code to separate functions (Michal)
>>
>> * removed ternary op, disliked in libvirt (Michal)
>>
>> * added #ifdef __linux__ (Ani Sinha)
>>
>> * try smallest value between currently best measured value (1MB)
>>   and the pipe-max-size setting. If pipe-max-size cannot be read,
>>   try kernel default max (1MB). (Daniel)
>>
>>
>> diff --git a/src/util/virfile.c b/src/util/virfile.c
>> index a04f888e06..87539be0b9 100644
>> --- a/src/util/virfile.c
>> +++ b/src/util/virfile.c
>> @@ -201,6 +201,50 @@ struct _virFileWrapperFd {
>>  };
>>  
>>  #ifndef WIN32
>> +
>> +#ifdef __linux__
>> +
>> +/**
>> + * virFileWrapperSetPipeSize:
>> + * @fd: the fd of the pipe
>> + *
>> + * Set best pipe size on the passed file descriptor for bulk transfers of data.
>> + *
>> + * default pipe size (usually 64K) is generally not suited for large transfers
>> + * to fast devices. A value of 1MB has been measured to improve virsh save
>> + * by 400% in ideal conditions. We retry multiple times with smaller sizes
>> + * on EPERM to account for possible small values of /proc/sys/fs/pipe-max-size.
>> + *
>> + * OS note: only for linux, on other OS this is a no-op.
>> + */
>> +static void
>> +virFileWrapperSetPipeSize(int fd)
>> +{
>> +    int sz;
>> +
>> +    for (sz = 1024 * 1024; sz >= 64 * 1024; sz /= 2) {
>> +        int rv = fcntl(fd, F_SETPIPE_SZ, sz);
>> +        if (rv < 0 && errno == EPERM) {
>> +            VIR_DEBUG("EPERM trying to set fd %d pipe size to %d", fd, sz);
>> +            continue; /* retry with half the size */
>> +        }
>> +        if (rv < 0) {
>> +            break;
>> +        }
>> +        VIR_DEBUG("fd %d pipe size adjusted to %d", fd, sz);
>> +        return;
>> +    }
>> +    virReportSystemError(errno, "%s", _("unable to set pipe size, data transfer might be slow"));
> 
> This should have been VIR_WARN(). It's weird to report an error when the
> function returns void.
> 
>> +}
>> +
>> +#else /* !__linux__ */
>> +static void virFileWrapperSetPipeSize(int fd)
> The @fd argument is unused and thus has to be marked as such.
> 
>> +{
>> +    return;
>> +}
>> +#endif /* !__linux__ */
>> +
>> +
>>  /**
>>   * virFileWrapperFdNew:
>>   * @fd: pointer to fd to wrap
>> @@ -282,6 +326,8 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned int flags)
>>  
>>      ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
>>  
>> +    virFileWrapperSetPipeSize(pipefd[!output]);
> 
> This feels weird, because just a few lines below the pipefd[!output]) is
> closed. As I said earlier, it doesn't matter what end of the pipe we set
> the size on, therefore, let's switch over to pipefd[output].
> 
>> +
>>      if (output) {
>>          virCommandSetInputFD(ret->cmd, pipefd[0]);
>>          virCommandSetOutputFD(ret->cmd, fd);
> 
> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
> 
> and pushed.
> 
> Michal
> 

Thanks a lot for fixing up the remaining few pieces!

Ciao,

Claudio