While dumping very large VMs (over 128GB), iohelper seems to cause
very intense IO usage on the disk, which causes some processes
(like journald) to hang and, depending on kernel configuration,
to panic.

This change creates a time window after every 10GB written, so
these processes can write to the disk and avoid hanging.
Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>
---
src/util/iohelper.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/src/util/iohelper.c b/src/util/iohelper.c
index ddc338b7c7..164c1e2085 100644
--- a/src/util/iohelper.c
+++ b/src/util/iohelper.c
@@ -52,6 +52,8 @@ runIO(const char *path, int fd, int oflags)
     unsigned long long total = 0;
     bool direct = O_DIRECT && ((oflags & O_DIRECT) != 0);
     off_t end = 0;
+    const unsigned long long sleep_step = (long long)10*1024*1024*1024;
+    unsigned long long next_sleep = sleep_step;
 
 #if HAVE_POSIX_MEMALIGN
     if (posix_memalign(&base, alignMask + 1, buflen)) {
@@ -128,6 +130,12 @@ runIO(const char *path, int fd, int oflags)
 
         total += got;
 
+        /* sleep briefly to avoid hanging other tasks */
+        if (total > next_sleep) {
+            next_sleep += sleep_step;
+            usleep(100*1000);
+        }
+
         /* handle last write size align in direct case */
         if (got < buflen && direct && fdout == fd) {
             ssize_t aligned_got = (got + alignMask) & ~alignMask;
--
2.20.1
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
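
The bookkeeping the patch adds can be exercised on its own to see how often the pause would fire. The following is a minimal sketch, not part of the patch: `count_pauses` and the `SLEEP_STEP` macro are hypothetical names, and the chunk size passed in is an assumption standing in for whatever buffer length iohelper actually uses. Only the 10GB step and the `total > next_sleep` comparison come from the diff above.

```c
#include <stdio.h>

/* Step taken from the patch: 10 * 1024^3 bytes. */
#define SLEEP_STEP (10ULL * 1024 * 1024 * 1024)

/*
 * Hypothetical helper (not in iohelper.c): replay the patch's
 * "total > next_sleep" check for a transfer of total_bytes copied in
 * chunk-sized pieces, and return how many pauses would be taken.
 */
unsigned long long
count_pauses(unsigned long long total_bytes, unsigned long long chunk)
{
    unsigned long long total = 0;
    unsigned long long next_sleep = SLEEP_STEP;
    unsigned long long pauses = 0;

    if (chunk == 0)
        return 0;

    while (total < total_bytes) {
        unsigned long long got = total_bytes - total;

        if (got > chunk)
            got = chunk;
        total += got;

        /* Same comparison as the patch; it calls usleep(100*1000) here. */
        if (total > next_sleep) {
            next_sleep += SLEEP_STEP;
            pauses++;
        }
    }
    return pauses;
}
```

Under these assumptions, a 128GB dump copied in 1MB chunks crosses the 10GB, 20GB, ..., 120GB thresholds, so it would pause 12 times, for roughly 1.2 seconds of sleep in total at 100ms per pause. A dump of exactly 10GB would not pause at all, because the comparison is strict.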
On Thu, 2019-05-09 at 17:59 -0300, Leonardo Bras wrote:
> While dumping very large VMs (over 128GB), iohelper seems to cause
> very intense IO usage on the disk, which causes some processes
> (like journald) to hang and, depending on kernel configuration,
> to panic.
>
> This change creates a time window after every 10GB written, so
> these processes can write to the disk and avoid hanging.
>
> Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>
> ---
> src/util/iohelper.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/src/util/iohelper.c b/src/util/iohelper.c
> index ddc338b7c7..164c1e2085 100644
> --- a/src/util/iohelper.c
> +++ b/src/util/iohelper.c
> @@ -52,6 +52,8 @@ runIO(const char *path, int fd, int oflags)
>      unsigned long long total = 0;
>      bool direct = O_DIRECT && ((oflags & O_DIRECT) != 0);
>      off_t end = 0;
> +    const unsigned long long sleep_step = (long long)10*1024*1024*1024;
> +    unsigned long long next_sleep = sleep_step;
>
>  #if HAVE_POSIX_MEMALIGN
>      if (posix_memalign(&base, alignMask + 1, buflen)) {
> @@ -128,6 +130,12 @@ runIO(const char *path, int fd, int oflags)
>
>          total += got;
>
> +        /* sleep briefly to avoid hanging other tasks */
> +        if (total > next_sleep) {
> +            next_sleep += sleep_step;
> +            usleep(100*1000);
> +        }
> +
>          /* handle last write size align in direct case */
>          if (got < buflen && direct && fdout == fd) {
>              ssize_t aligned_got = (got + alignMask) & ~alignMask;

Please note there is a typo in the subject:

current: iohelper: Introduces a small sleep do avoid hunging other tasks
fixed:   iohelper: Introduces a small sleep to avoid hunging other tasks
On Thu, May 09, 2019 at 05:59:36PM -0300, Leonardo Bras wrote:
> While dumping very large VMs (over 128GB), iohelper seems to cause
> very intense IO usage on the disk, which causes some processes
> (like journald) to hang and, depending on kernel configuration,
> to panic.

What causes a panic here? This just sounds like an I/O scheduler
issue in the end. The storage can't cope with the huge data sizes and
the kernel is not fairly giving time to other processes. Surely the
answer here is to tune the kernel params to improve I/O fairness.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Thu, May 09, 2019 at 17:59:36 -0300, Leonardo Bras wrote:
> While dumping very large VMs (over 128GB), iohelper seems to cause
> very intense IO usage on the disk, which causes some processes
> (like journald) to hang and, depending on kernel configuration,
> to panic.
>
> This change creates a time window after every 10GB written, so
> these processes can write to the disk and avoid hanging.
>
> Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>

I don't think this approach makes sense. It looks like something on
your host is seriously wrong, as anything that writes a large file
would do the same to your host.

Unfortunately, libvirt's API has no way to pass in a bandwidth limit
for the dump operation, which would allow limiting the bandwidth when
calling the API.

As a workaround, you can e.g. use cgroups to limit the bandwidth of
the children of libvirtd, or perhaps switch to a better I/O scheduler.
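
To make the cgroups suggestion concrete, here is a minimal sketch of what a write-bandwidth cap looks like, assuming a host that uses the cgroup v2 `io` controller. That controller's `io.max` file accepts lines of the form `MAJ:MIN wbps=BYTES`. The helper name `format_io_max_wbps` is hypothetical, and the thread does not say which cgroup version or cgroup path applies to the reporter's host, so both are assumptions.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build the line that cgroup v2 expects in a
 * group's io.max file to cap write bandwidth on one block device.
 * Returns the snprintf() result: the length needed (excluding the
 * terminating NUL), or a negative value on error.
 */
int
format_io_max_wbps(char *buf, size_t len,
                   unsigned int dev_major, unsigned int dev_minor,
                   unsigned long long bytes_per_sec)
{
    return snprintf(buf, len, "%u:%u wbps=%llu",
                    dev_major, dev_minor, bytes_per_sec);
}
```

For a device numbered 8:16 and a 100MB/s cap, this produces `8:16 wbps=104857600`. The line would then be written to the `io.max` file of whichever cgroup contains the libvirtd child processes; that location depends on the host setup and is not given in this thread.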