[libvirt PATCH] ci: print test suite logs on failure for Cirrus jobs

Daniel P. Berrangé posted 1 patch 2 years ago
git fetch https://github.com/patchew-project/libvirt tags/patchew/20220426091217.572678-1-berrange@redhat.com
[libvirt PATCH] ci: print test suite logs on failure for Cirrus jobs
Posted by Daniel P. Berrangé 2 years ago
We don't have access to the 'testlog.txt' file, so we need meson to
print the failures for any broken tests directly.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---

The CI pipeline for macOS started failing a few days ago. It was not
triggered by any commit, as the pipeline immediately preceding the
first failure used the same commit hash on master. The logs show
glib2 being updated from 2.72.0 to 2.72.1.

With this patch applied I can see the test logs

   https://gitlab.com/berrange/libvirt/-/jobs/2376891003

and all the failing tests are hitting:

(process:50961): GLib-WARNING **: 01:56:14.162: poll(2) failed due to:
Bad file descriptor.

so something to do with the QEMU monitor/event loop AFAIK, but not
sure what.

 ci/cirrus/build.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ci/cirrus/build.yml b/ci/cirrus/build.yml
index 867d5f297b..f03ad58143 100644
--- a/ci/cirrus/build.yml
+++ b/ci/cirrus/build.yml
@@ -26,4 +26,4 @@ build_task:
     - meson setup build
     - meson dist -C build --no-tests
     - meson compile -C build
-    - meson test -C build --no-suite syntax-check
+    - meson test -C build --no-suite syntax-check --print-errorlogs
-- 
2.35.1

Re: [libvirt PATCH] ci: print test suite logs on failure for Cirrus jobs
Posted by Andrea Bolognani 2 years ago
On Tue, Apr 26, 2022 at 10:12:17AM +0100, Daniel P. Berrangé wrote:
> We don't have access to the 'testlog.txt' file, so we need meson to
> print the failures for any broken tests directly.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  ci/cirrus/build.yml | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Andrea Bolognani <abologna@redhat.com>

> The CI pipeline for macOS started failing a few days ago. It was not
> triggered by any commit, as the pipeline immediately preceding the
> first failure used the same commit hash on master. The logs show
> glib2 being updated from 2.72.0 to 2.72.1
>
> With this patch applied I can see the test logs
>
>    https://gitlab.com/berrange/libvirt/-/jobs/2376891003
>
> and all the failing tests are hitting:
>
> (process:50961): GLib-WARNING **: 01:56:14.162: poll(2) failed due to:
> Bad file descriptor.
>
> so something to do with the QEMU monitor/event loop AFAIK, but not
> sure what.

Looking at

  https://gitlab.gnome.org/GNOME/glib/-/releases/2.72.1

the interesting change seems to be

  * Fix detection of broken poll() function on macOS

which would correspond to

  https://gitlab.gnome.org/GNOME/glib/-/merge_requests/2571

So it looks like this was very intentional, and motivated by the
needs of QEMU of all projects... I wonder what they're doing
differently from us?

-- 
Andrea Bolognani / Red Hat / Virtualization
Re: [libvirt PATCH] ci: print test suite logs on failure for Cirrus jobs
Posted by Daniel P. Berrangé 2 years ago
On Tue, Apr 26, 2022 at 12:07:40PM +0000, Andrea Bolognani wrote:
> On Tue, Apr 26, 2022 at 10:12:17AM +0100, Daniel P. Berrangé wrote:
> > We don't have access to the 'testlog.txt' file, so we need meson to
> > print the failures for any broken tests directly.
> >
> > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> > ---
> >  ci/cirrus/build.yml | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Reviewed-by: Andrea Bolognani <abologna@redhat.com>
> 
> > The CI pipeline for macOS started failing a few days ago. It was not
> > triggered by any commit, as the pipeline immediately preceding the
> > first failure used the same commit hash on master. The logs show
> > glib2 being updated from 2.72.0 to 2.72.1
> >
> > With this patch applied I can see the test logs
> >
> >    https://gitlab.com/berrange/libvirt/-/jobs/2376891003
> >
> > and all the failing tests are hitting:
> >
> > (process:50961): GLib-WARNING **: 01:56:14.162: poll(2) failed due to:
> > Bad file descriptor.
> >
> > so something to do with the QEMU monitor/event loop AFAIK, but not
> > sure what.
> 
> Looking at
> 
>   https://gitlab.gnome.org/GNOME/glib/-/releases/2.72.1
> 
> the interesting change seems to be
> 
>   * Fix detection of broken poll() function on macOS
> 
> which would correspond to
> 
>   https://gitlab.gnome.org/GNOME/glib/-/merge_requests/2571

Yep, I have since filed a bug

https://gitlab.com/libvirt/libvirt/-/issues/303

The problem is that poll() handles bad file descriptors by setting
the POLLNVAL event on them, i.e. the poll() syscall itself succeeds.

With the BROKEN_POLL macro set, glib emulates poll() using
select(), which fails hard with EBADF when given a bad file
descriptor.
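
To illustrate the difference between the two calls, here is a minimal
sketch in Python. It is not taken from libvirt or glib; the helper
names (closed_fd, poll_result, select_errno) are made up for this
example, and it only exercises the plain poll() and select() syscalls,
not glib's emulation layer.

```python
import errno
import os
import select


def closed_fd():
    """Return a file descriptor number that has already been closed."""
    r, w = os.pipe()
    os.close(w)
    os.close(r)
    return r


def poll_result(fd):
    """poll() on a bad FD: the call itself succeeds and the FD is
    reported back with POLLNVAL set in its returned events."""
    p = select.poll()
    p.register(fd, select.POLLIN)
    return p.poll(0)


def select_errno(fd):
    """select() on a bad FD: the whole call fails. Return the errno
    it failed with, or None if it unexpectedly succeeded."""
    try:
        select.select([fd], [], [], 0)
    except OSError as e:
        return e.errno
    return None


if __name__ == "__main__":
    fd = closed_fd()
    for got_fd, events in poll_result(fd):
        print("poll: fd", got_fd, "POLLNVAL set:",
              bool(events & select.POLLNVAL))
    print("select errno is EBADF:", select_errno(fd) == errno.EBADF)
```

A caller that ignores the returned events never notices the bad FD
with poll(), whereas with select() the whole wait fails, which would
be consistent with the warning seen in the test logs.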

IOW, I suspect there's a latent bug hiding in libvirt tests (or
real code) that is passing a bad FD to poll(), and we've been
lucky that it was harmless with POLLNVAL, and now we're seeing
the bug exposed.

> So it looks like this was very intentional, and motivated by the
> needs of QEMU of all projects... I wonder what they're doing
> differently from us?

Tracing the history back, poll() on macOS fails when used with
FDs associated with a device node under /dev/*. Not likely
something libvirt does, but certainly an issue for QEMU with
tap devices.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|