chardev/char-io.c | 2 +- hw/remote/mpqemu-link.c | 2 +- include/io/channel-socket.h | 2 + include/io/channel.h | 38 +- io/channel-buffer.c | 1 + io/channel-command.c | 1 + io/channel-file.c | 1 + io/channel-socket.c | 118 ++++- io/channel-tls.c | 1 + io/channel-websock.c | 1 + io/channel.c | 49 +- meson.build | 1 + migration/channel.c | 3 +- migration/migration.c | 52 ++- migration/migration.h | 6 + migration/multifd.c | 74 ++- migration/multifd.h | 4 +- migration/ram.c | 29 +- migration/rdma.c | 1 + migration/socket.c | 12 +- monitor/hmp-cmds.c | 6 + qapi/migration.json | 24 + scsi/pr-manager-helper.c | 2 +- tests/qtest/meson.build | 12 +- tests/qtest/migration-helpers.c | 13 + tests/qtest/migration-helpers.h | 1 + tests/qtest/migration-test.c | 867 +++++++++++++++++++++++++++++++---- tests/unit/crypto-tls-psk-helpers.c | 18 +- tests/unit/crypto-tls-psk-helpers.h | 1 + tests/unit/crypto-tls-x509-helpers.c | 16 +- tests/unit/crypto-tls-x509-helpers.h | 53 +++ tests/unit/test-crypto-tlssession.c | 11 +- tests/unit/test-io-channel-socket.c | 1 + 33 files changed, 1284 insertions(+), 139 deletions(-)
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
The following changes since commit 178bacb66d98d9ee7a702b9f2a4dfcd88b72a9ab:
Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging (2022-05-09 11:07:04 -0700)
are available in the Git repository at:
https://gitlab.com/dagrh/qemu.git tags/pull-migration-20220510a
for you to fetch changes up to 511f4a0506af1d380115a61f3362149953646871:
multifd: Implement zero copy write in multifd migration (multifd-zero-copy) (2022-05-10 09:15:06 +0100)
----------------------------------------------------------------
Migration pull 2022-05-10
This replaces yesterdays and the pull originally sent on 28th April;
compared to yesterdays this fixes an accidental change to skiboot.
It contains:
TLS test fixes from Dan
Zerocopy migration feature from Leo
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
----------------------------------------------------------------
Daniel P. Berrangé (9):
tests: fix encoding of IP addresses in x509 certs
tests: add more helper macros for creating TLS x509 certs
tests: add migration tests of TLS with PSK credentials
tests: add migration tests of TLS with x509 credentials
tests: convert XBZRLE migration test to use common helper
tests: convert multifd migration tests to use common helper
tests: add multifd migration tests of TLS with PSK credentials
tests: add multifd migration tests of TLS with x509 credentials
tests: ensure migration status isn't reported as failed
Leonardo Bras (7):
QIOChannel: Add flags on io_writev and introduce io_flush callback
QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX
migration: Add zero-copy-send parameter for QMP/HMP for Linux
migration: Add migrate_use_tls() helper
multifd: multifd_send_sync_main now returns negative on error
multifd: Send header packet without flags if zero-copy-send is enabled
multifd: Implement zero copy write in multifd migration (multifd-zero-copy)
chardev/char-io.c | 2 +-
hw/remote/mpqemu-link.c | 2 +-
include/io/channel-socket.h | 2 +
include/io/channel.h | 38 +-
io/channel-buffer.c | 1 +
io/channel-command.c | 1 +
io/channel-file.c | 1 +
io/channel-socket.c | 118 ++++-
io/channel-tls.c | 1 +
io/channel-websock.c | 1 +
io/channel.c | 49 +-
meson.build | 1 +
migration/channel.c | 3 +-
migration/migration.c | 52 ++-
migration/migration.h | 6 +
migration/multifd.c | 74 ++-
migration/multifd.h | 4 +-
migration/ram.c | 29 +-
migration/rdma.c | 1 +
migration/socket.c | 12 +-
monitor/hmp-cmds.c | 6 +
qapi/migration.json | 24 +
scsi/pr-manager-helper.c | 2 +-
tests/qtest/meson.build | 12 +-
tests/qtest/migration-helpers.c | 13 +
tests/qtest/migration-helpers.h | 1 +
tests/qtest/migration-test.c | 867 +++++++++++++++++++++++++++++++----
tests/unit/crypto-tls-psk-helpers.c | 18 +-
tests/unit/crypto-tls-psk-helpers.h | 1 +
tests/unit/crypto-tls-x509-helpers.c | 16 +-
tests/unit/crypto-tls-x509-helpers.h | 53 +++
tests/unit/test-crypto-tlssession.c | 11 +-
tests/unit/test-io-channel-socket.c | 1 +
33 files changed, 1284 insertions(+), 139 deletions(-)
* Dr. David Alan Gilbert (git) (dgilbert@redhat.com) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> The following changes since commit 178bacb66d98d9ee7a702b9f2a4dfcd88b72a9ab:
>
> Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging (2022-05-09 11:07:04 -0700)
>
> are available in the Git repository at:
>
> https://gitlab.com/dagrh/qemu.git tags/pull-migration-20220510a
>
> for you to fetch changes up to 511f4a0506af1d380115a61f3362149953646871:
>
> multifd: Implement zero copy write in multifd migration (multifd-zero-copy) (2022-05-10 09:15:06 +0100)
Nack
This is still failing the Alpine build test:
ninja: job failed: cc -m64 -mcx16 -Ilibio.fa.p -I. -I.. -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/p11-kit-1 -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -fdiagnostics-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem /builds/dagrh/qemu/linux-headers -isystem linux-headers -iquote . -iquote /builds/dagrh/qemu -iquote /builds/dagrh/qemu/include -iquote /builds/dagrh/qemu/disas/libvixl -iquote /builds/dagrh/qemu/tcg/i386 -pthread -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong -fPIE -MD -MQ libio.fa.p/io_channel-socket.c.o -MF libio.fa.p/io_channel-socket.c.o.d -o libio.fa.p/io_channel-socket.c.o -c ../io/channel-socket.c
In file included from /usr/include/linux/errqueue.h:6,
from ../io/channel-socket.c:29:
/usr/include/linux/time_types.h:7:8: error: redefinition of 'struct __kernel_timespec'
7 | struct __kernel_timespec {
| ^~~~~~~~~~~~~~~~~
In file included from /usr/include/liburing.h:19,
from /builds/dagrh/qemu/include/block/aio.h:18,
from /builds/dagrh/qemu/include/io/channel.h:26,
from /builds/dagrh/qemu/include/io/channel-socket.h:24,
from ../io/channel-socket.c:24:
/usr/include/liburing/compat.h:9:8: note: originally defined here
9 | struct __kernel_timespec {
| ^~~~~~~~~~~~~~~~~
ninja: subcommand failed
make: *** [Makefile:163: run-ninja] Error 1
> ----------------------------------------------------------------
> Migration pull 2022-05-10
>
> This replaces yesterdays and the pull originally sent on 28th April;
> compared to yesterdays this fixes an accidental change to skiboot.
>
> It contains:
> TLS test fixes from Dan
> Zerocopy migration feature from Leo
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>
> ----------------------------------------------------------------
> Daniel P. Berrangé (9):
> tests: fix encoding of IP addresses in x509 certs
> tests: add more helper macros for creating TLS x509 certs
> tests: add migration tests of TLS with PSK credentials
> tests: add migration tests of TLS with x509 credentials
> tests: convert XBZRLE migration test to use common helper
> tests: convert multifd migration tests to use common helper
> tests: add multifd migration tests of TLS with PSK credentials
> tests: add multifd migration tests of TLS with x509 credentials
> tests: ensure migration status isn't reported as failed
>
> Leonardo Bras (7):
> QIOChannel: Add flags on io_writev and introduce io_flush callback
> QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX
> migration: Add zero-copy-send parameter for QMP/HMP for Linux
> migration: Add migrate_use_tls() helper
> multifd: multifd_send_sync_main now returns negative on error
> multifd: Send header packet without flags if zero-copy-send is enabled
> multifd: Implement zero copy write in multifd migration (multifd-zero-copy)
>
> chardev/char-io.c | 2 +-
> hw/remote/mpqemu-link.c | 2 +-
> include/io/channel-socket.h | 2 +
> include/io/channel.h | 38 +-
> io/channel-buffer.c | 1 +
> io/channel-command.c | 1 +
> io/channel-file.c | 1 +
> io/channel-socket.c | 118 ++++-
> io/channel-tls.c | 1 +
> io/channel-websock.c | 1 +
> io/channel.c | 49 +-
> meson.build | 1 +
> migration/channel.c | 3 +-
> migration/migration.c | 52 ++-
> migration/migration.h | 6 +
> migration/multifd.c | 74 ++-
> migration/multifd.h | 4 +-
> migration/ram.c | 29 +-
> migration/rdma.c | 1 +
> migration/socket.c | 12 +-
> monitor/hmp-cmds.c | 6 +
> qapi/migration.json | 24 +
> scsi/pr-manager-helper.c | 2 +-
> tests/qtest/meson.build | 12 +-
> tests/qtest/migration-helpers.c | 13 +
> tests/qtest/migration-helpers.h | 1 +
> tests/qtest/migration-test.c | 867 +++++++++++++++++++++++++++++++----
> tests/unit/crypto-tls-psk-helpers.c | 18 +-
> tests/unit/crypto-tls-psk-helpers.h | 1 +
> tests/unit/crypto-tls-x509-helpers.c | 16 +-
> tests/unit/crypto-tls-x509-helpers.h | 53 +++
> tests/unit/test-crypto-tlssession.c | 11 +-
> tests/unit/test-io-channel-socket.c | 1 +
> 33 files changed, 1284 insertions(+), 139 deletions(-)
>
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On Tue, May 10, 2022 at 10:58:30AM +0100, Dr. David Alan Gilbert wrote:
> * Dr. David Alan Gilbert (git) (dgilbert@redhat.com) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >
> > The following changes since commit 178bacb66d98d9ee7a702b9f2a4dfcd88b72a9ab:
> >
> > Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging (2022-05-09 11:07:04 -0700)
> >
> > are available in the Git repository at:
> >
> > https://gitlab.com/dagrh/qemu.git tags/pull-migration-20220510a
> >
> > for you to fetch changes up to 511f4a0506af1d380115a61f3362149953646871:
> >
> > multifd: Implement zero copy write in multifd migration (multifd-zero-copy) (2022-05-10 09:15:06 +0100)
>
> Nack
> This is still failing the Alpine build test:
>
> ninja: job failed: cc -m64 -mcx16 -Ilibio.fa.p -I. -I.. -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/p11-kit-1 -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -fdiagnostics-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem /builds/dagrh/qemu/linux-headers -isystem linux-headers -iquote . -iquote /builds/dagrh/qemu -iquote /builds/dagrh/qemu/include -iquote /builds/dagrh/qemu/disas/libvixl -iquote /builds/dagrh/qemu/tcg/i386 -pthread -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong -fPIE -MD -MQ libio.fa.p/io_channel-socket.c.o -MF libio.fa.p/io_channel-socket.c.o.d -o libio.fa.p/io_channel-socket.c.o -c ../io/channel-socket.c
> In file included from /usr/include/linux/errqueue.h:6,
> from ../io/channel-socket.c:29:
> /usr/include/linux/time_types.h:7:8: error: redefinition of 'struct __kernel_timespec'
> 7 | struct __kernel_timespec {
> | ^~~~~~~~~~~~~~~~~
> In file included from /usr/include/liburing.h:19,
> from /builds/dagrh/qemu/include/block/aio.h:18,
> from /builds/dagrh/qemu/include/io/channel.h:26,
> from /builds/dagrh/qemu/include/io/channel-socket.h:24,
> from ../io/channel-socket.c:24:
> /usr/include/liburing/compat.h:9:8: note: originally defined here
> 9 | struct __kernel_timespec {
> | ^~~~~~~~~~~~~~~~~
> ninja: subcommand failed
> make: *** [Makefile:163: run-ninja] Error 1
Yuk. That very much looks like a bug in liburing itself to me.
We've exposed the latent bug by including linux/errqueue.h
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Tue, May 10, 2022 at 10:58:30AM +0100, Dr. David Alan Gilbert wrote:
> > * Dr. David Alan Gilbert (git) (dgilbert@redhat.com) wrote:
> > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > >
> > > The following changes since commit 178bacb66d98d9ee7a702b9f2a4dfcd88b72a9ab:
> > >
> > > Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging (2022-05-09 11:07:04 -0700)
> > >
> > > are available in the Git repository at:
> > >
> > > https://gitlab.com/dagrh/qemu.git tags/pull-migration-20220510a
> > >
> > > for you to fetch changes up to 511f4a0506af1d380115a61f3362149953646871:
> > >
> > > multifd: Implement zero copy write in multifd migration (multifd-zero-copy) (2022-05-10 09:15:06 +0100)
> >
> > Nack
> > This is still failing the Alpine build test:
> >
> > ninja: job failed: cc -m64 -mcx16 -Ilibio.fa.p -I. -I.. -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/p11-kit-1 -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -fdiagnostics-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem /builds/dagrh/qemu/linux-headers -isystem linux-headers -iquote . -iquote /builds/dagrh/qemu -iquote /builds/dagrh/qemu/include -iquote /builds/dagrh/qemu/disas/libvixl -iquote /builds/dagrh/qemu/tcg/i386 -pthread -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong -fPIE -MD -MQ libio.fa.p/io_channel-socket.c.o -MF libio.fa.p/io_channel-socket.c.o.d -o libio.fa.p/io_channel-socket.c.o -c ../io/channel-socket.c
> > In file included from /usr/include/linux/errqueue.h:6,
> > from ../io/channel-socket.c:29:
> > /usr/include/linux/time_types.h:7:8: error: redefinition of 'struct __kernel_timespec'
> > 7 | struct __kernel_timespec {
> > | ^~~~~~~~~~~~~~~~~
> > In file included from /usr/include/liburing.h:19,
> > from /builds/dagrh/qemu/include/block/aio.h:18,
> > from /builds/dagrh/qemu/include/io/channel.h:26,
> > from /builds/dagrh/qemu/include/io/channel-socket.h:24,
> > from ../io/channel-socket.c:24:
> > /usr/include/liburing/compat.h:9:8: note: originally defined here
> > 9 | struct __kernel_timespec {
> > | ^~~~~~~~~~~~~~~~~
> > ninja: subcommand failed
> > make: *** [Makefile:163: run-ninja] Error 1
>
> Yuk. That very much looks like a bug in liburing itself to me.
>
>
> We've exposed the latent bug by including linux/errqueue.h
Yes, I think there was a thread after the 1st pull where Leo identified
the patch that fixed it; but it's not in that image.
Dave
> With regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
From a previous thread: On Thu, Apr 28, 2022 at 1:20 PM Dr. David Alan Gilbert <dgilbert@redhat.com> wrote: > > Leo: > Unfortunately this is failing a couple of CI tests; the MSG_ZEROCOPY > one I guess is the simpler one; I think Stefanha managed to find the > liburing fix for the __kernel_timespec case, but that looks like a bit > more fun! > > Dave I thought Stefanha had fixed this bug, and we were just waiting for a new alpine rootfs/image with that fixed. Is that correct? On Tue, May 10, 2022 at 7:43 AM Dr. David Alan Gilbert <dgilbert@redhat.com> wrote: > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > On Tue, May 10, 2022 at 10:58:30AM +0100, Dr. David Alan Gilbert wrote: [...] > > > > Yuk. That very much looks like a bug in liburing itself to me. > > > > > > We've exposed the latent bug by including linux/errqueue.h > > Yes, I think there was a thread after the 1st pull where Leo identified > the patch that fixed it; but it's not in that image. I only fixed the MSG_ZEROCOPY missing define bug, as I got that Stefanha had already fixed the issue in liburing/alpine. questions: - Has Stefanha really fixed that, and we are just waiting for a new image, or have I got that wrong? - How should I proceed with that? - If we proceed with fixing this up in alpine, will that require this patchset to be on pause until it's fixed there? - If so, is there any suggestion on how to fix that in qemu code? (this header is needed because of SO_EE_* defines) Thank you all! Best regards, Leo > > Dave > > > With regards, > > Daniel > > -- > > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > > |: https://libvirt.org -o- https://fstop138.berrange.com :| > > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| > > > -- > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK >
On Wed, May 11, 2022 at 12:40:07AM -0300, Leonardo Bras Soares Passos wrote: > From a previous thread: > > On Thu, Apr 28, 2022 at 1:20 PM Dr. David Alan Gilbert > <dgilbert@redhat.com> wrote: > > > > Leo: > > Unfortunately this is failing a couple of CI tests; the MSG_ZEROCOPY > > one I guess is the simpler one; I think Stefanha managed to find the > > liburing fix for the __kernel_timespec case, but that looks like a bit > > more fun! > > > > Dave > > I thought Stefanha had fixed this bug, and we were just waiting for a > new alpine rootfs/image with that fixed. > Is that correct? I didn't fix the liburing __kernel_timespec issue on alpine Linux. It's probably a packaging bug and I gave Dave the email address of the package maintainer. I'm not sure if the package maintainer has been contacted or released a fixed liburing package. Stefan
On Mon, May 16, 2022 at 09:51:12AM +0100, Stefan Hajnoczi wrote: > On Wed, May 11, 2022 at 12:40:07AM -0300, Leonardo Bras Soares Passos wrote: > > From a previous thread: > > > > On Thu, Apr 28, 2022 at 1:20 PM Dr. David Alan Gilbert > > <dgilbert@redhat.com> wrote: > > > > > > Leo: > > > Unfortunately this is failing a couple of CI tests; the MSG_ZEROCOPY > > > one I guess is the simpler one; I think Stefanha managed to find the > > > liburing fix for the __kernel_timespec case, but that looks like a bit > > > more fun! > > > > > > Dave > > > > I thought Stefanha had fixed this bug, and we were just waiting for a > > new alpine rootfs/image with that fixed. > > Is that correct? > > I didn't fix the liburing __kernel_timespec issue on alpine Linux. It's > probably a packaging bug and I gave Dave the email address of the > package maintainer. I'm not sure if the package maintainer has been > contacted or released a fixed liburing package. It was prety easy to bisect the problem with liburing so I filed a bug pointing to the fix needed: https://gitlab.alpinelinux.org/alpine/aports/-/issues/13813 With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Mon, May 16, 2022 at 10:40:15AM +0100, Daniel P. Berrangé wrote: > On Mon, May 16, 2022 at 09:51:12AM +0100, Stefan Hajnoczi wrote: > > On Wed, May 11, 2022 at 12:40:07AM -0300, Leonardo Bras Soares Passos wrote: > > > From a previous thread: > > > > > > On Thu, Apr 28, 2022 at 1:20 PM Dr. David Alan Gilbert > > > <dgilbert@redhat.com> wrote: > > > > > > > > Leo: > > > > Unfortunately this is failing a couple of CI tests; the MSG_ZEROCOPY > > > > one I guess is the simpler one; I think Stefanha managed to find the > > > > liburing fix for the __kernel_timespec case, but that looks like a bit > > > > more fun! > > > > > > > > Dave > > > > > > I thought Stefanha had fixed this bug, and we were just waiting for a > > > new alpine rootfs/image with that fixed. > > > Is that correct? > > > > I didn't fix the liburing __kernel_timespec issue on alpine Linux. It's > > probably a packaging bug and I gave Dave the email address of the > > package maintainer. I'm not sure if the package maintainer has been > > contacted or released a fixed liburing package. > > It was prety easy to bisect the problem with liburing so I filed a > bug pointing to the fix needed: > > https://gitlab.alpinelinux.org/alpine/aports/-/issues/13813 Alpine win the prize for quickest distro bug fix response. The liburing problem is now fixed in all current Alpine release streams, just minutes after I filed the above bug. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Mon, May 16, 2022 at 11:09:23AM +0100, Daniel P. Berrangé wrote: > On Mon, May 16, 2022 at 10:40:15AM +0100, Daniel P. Berrangé wrote: > > On Mon, May 16, 2022 at 09:51:12AM +0100, Stefan Hajnoczi wrote: > > > On Wed, May 11, 2022 at 12:40:07AM -0300, Leonardo Bras Soares Passos wrote: > > > > From a previous thread: > > > > > > > > On Thu, Apr 28, 2022 at 1:20 PM Dr. David Alan Gilbert > > > > <dgilbert@redhat.com> wrote: > > > > > > > > > > Leo: > > > > > Unfortunately this is failing a couple of CI tests; the MSG_ZEROCOPY > > > > > one I guess is the simpler one; I think Stefanha managed to find the > > > > > liburing fix for the __kernel_timespec case, but that looks like a bit > > > > > more fun! > > > > > > > > > > Dave > > > > > > > > I thought Stefanha had fixed this bug, and we were just waiting for a > > > > new alpine rootfs/image with that fixed. > > > > Is that correct? > > > > > > I didn't fix the liburing __kernel_timespec issue on alpine Linux. It's > > > probably a packaging bug and I gave Dave the email address of the > > > package maintainer. I'm not sure if the package maintainer has been > > > contacted or released a fixed liburing package. > > > > It was prety easy to bisect the problem with liburing so I filed a > > bug pointing to the fix needed: > > > > https://gitlab.alpinelinux.org/alpine/aports/-/issues/13813 > > Alpine win the prize for quickest distro bug fix response. The liburing > problem is now fixed in all current Alpine release streams, just minutes > after I filed the above bug. Awesome, thank you for identifying the issue and filing the issue! Stefan
* Leonardo Bras Soares Passos (leobras@redhat.com) wrote: > From a previous thread: > > On Thu, Apr 28, 2022 at 1:20 PM Dr. David Alan Gilbert > <dgilbert@redhat.com> wrote: > > > > Leo: > > Unfortunately this is failing a couple of CI tests; the MSG_ZEROCOPY > > one I guess is the simpler one; I think Stefanha managed to find the > > liburing fix for the __kernel_timespec case, but that looks like a bit > > more fun! > > > > Dave > > I thought Stefanha had fixed this bug, and we were just waiting for a > new alpine rootfs/image with that fixed. > Is that correct? > > On Tue, May 10, 2022 at 7:43 AM Dr. David Alan Gilbert > <dgilbert@redhat.com> wrote: > > > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > On Tue, May 10, 2022 at 10:58:30AM +0100, Dr. David Alan Gilbert wrote: > [...] > > > > > > Yuk. That very much looks like a bug in liburing itself to me. > > > > > > > > > We've exposed the latent bug by including linux/errqueue.h > > > > Yes, I think there was a thread after the 1st pull where Leo identified > > the patch that fixed it; but it's not in that image. > > I only fixed the MSG_ZEROCOPY missing define bug, as I got that > Stefanha had already fixed the issue in liburing/alpine. > > questions: > - Has Stefanha really fixed that, and we are just waiting for a new > image, or have I got that wrong? > - How should I proceed with that? > > - If we proceed with fixing this up in alpine, will that require this > patchset to be on pause until it's fixed there? It needs to pass in CI; so yes. > - If so, is there any suggestion on how to fix that in qemu code? > (this header is needed because of SO_EE_* defines) I've not actually looked at the detail of the failure; but yes I think we need a qemu workaround here. If there's no simple fix, then adding a test to meson.build to conditionally disable liburing might be best; like the test code for libcap_ng I guess (search in meson.build for libcap_ng.found at around line 540. Dave > Thank you all! > > Best regards, > Leo > > > > > Dave > > > > > With regards, > > > Daniel > > > -- > > > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > > > |: https://libvirt.org -o- https://fstop138.berrange.com :| > > > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| > > > > > -- > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK > > > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On Wed, May 11, 2022 at 5:55 AM Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
>
> * Leonardo Bras Soares Passos (leobras@redhat.com) wrote:
> > From a previous thread:
> >
> > On Thu, Apr 28, 2022 at 1:20 PM Dr. David Alan Gilbert
> > <dgilbert@redhat.com> wrote:
> > >
> > > Leo:
> > > Unfortunately this is failing a couple of CI tests; the MSG_ZEROCOPY
> > > one I guess is the simpler one; I think Stefanha managed to find the
> > > liburing fix for the __kernel_timespec case, but that looks like a bit
> > > more fun!
> > >
> > > Dave
> >
> > I thought Stefanha had fixed this bug, and we were just waiting for a
> > new alpine rootfs/image with that fixed.
> > Is that correct?
> >
> > On Tue, May 10, 2022 at 7:43 AM Dr. David Alan Gilbert
> > <dgilbert@redhat.com> wrote:
> > >
> > > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > > On Tue, May 10, 2022 at 10:58:30AM +0100, Dr. David Alan Gilbert wrote:
> > [...]
> > > >
> > > > Yuk. That very much looks like a bug in liburing itself to me.
> > > >
> > > >
> > > > We've exposed the latent bug by including linux/errqueue.h
> > >
> > > Yes, I think there was a thread after the 1st pull where Leo identified
> > > the patch that fixed it; but it's not in that image.
> >
> > I only fixed the MSG_ZEROCOPY missing define bug, as I got that
> > Stefanha had already fixed the issue in liburing/alpine.
> >
> > questions:
> > - Has Stefanha really fixed that, and we are just waiting for a new
> > image, or have I got that wrong?
> > - How should I proceed with that?
> >
> > - If we proceed with fixing this up in alpine, will that require this
> > patchset to be on pause until it's fixed there?
>
> It needs to pass in CI; so yes.
>
> > - If so, is there any suggestion on how to fix that in qemu code?
> > (this header is needed because of SO_EE_* defines)
>
> I've not actually looked at the detail of the failure; but yes I think
> we need a qemu workaround here.
>
> If there's no simple fix, then adding a test to meson.build to
> conditionally disable liburing might be best; like the test code for
> libcap_ng I guess (search in meson.build for libcap_ng.found at around
> line 540.
Hello Dave,
I solved this issue by doing this:
+linux_io_uring_test = '''
+ #include <liburing.h>
+ #include <linux/errqueue.h>
+
+ int main(void) { return 0; }'''
+
linux_io_uring = not_found
if not get_option('linux_io_uring').auto() or have_block
linux_io_uring = dependency('liburing', version: '>=0.3',
required: get_option('linux_io_uring'),
method: 'pkg-config', kwargs: static_kwargs)
+ if not cc.links(linux_io_uring_test)
+ linux_io_uring = not_found
+ endif
endif
+
Seems to work fine in CI, and now Alpine does not fail anymore.
(See pipeline https://gitlab.com/LeoBras/qemu/-/pipelines/538123933
for reference)
I am not sure if this is the right thing to do, but I will be sending
it as a full new patchset (v13), with the first patch being the one
with the above change and the rest just carrying the recommended
fixes.
I was also thinking I could instead send the single "fix" patch, and
recommend adding it before my v12. If that is the correct approach for
this case, please let me know so I can improve in the future. (I am
trying to figure out what is simpler/best for maintainers)
Best regards,
Leo
>
> Dave
>
> > Thank you all!
> >
> > Best regards,
> > Leo
> >
> > >
> > > Dave
> > >
> > > > With regards,
> > > > Daniel
> > > > --
> > > > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> > > > |: https://libvirt.org -o- https://fstop138.berrange.com :|
> > > > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
> > > >
> > > --
> > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > >
> >
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
* Leonardo Bras Soares Passos (leobras@redhat.com) wrote:
> On Wed, May 11, 2022 at 5:55 AM Dr. David Alan Gilbert
> <dgilbert@redhat.com> wrote:
> >
> > * Leonardo Bras Soares Passos (leobras@redhat.com) wrote:
> > > From a previous thread:
> > >
> > > On Thu, Apr 28, 2022 at 1:20 PM Dr. David Alan Gilbert
> > > <dgilbert@redhat.com> wrote:
> > > >
> > > > Leo:
> > > > Unfortunately this is failing a couple of CI tests; the MSG_ZEROCOPY
> > > > one I guess is the simpler one; I think Stefanha managed to find the
> > > > liburing fix for the __kernel_timespec case, but that looks like a bit
> > > > more fun!
> > > >
> > > > Dave
> > >
> > > I thought Stefanha had fixed this bug, and we were just waiting for a
> > > new alpine rootfs/image with that fixed.
> > > Is that correct?
> > >
> > > On Tue, May 10, 2022 at 7:43 AM Dr. David Alan Gilbert
> > > <dgilbert@redhat.com> wrote:
> > > >
> > > > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > > > On Tue, May 10, 2022 at 10:58:30AM +0100, Dr. David Alan Gilbert wrote:
> > > [...]
> > > > >
> > > > > Yuk. That very much looks like a bug in liburing itself to me.
> > > > >
> > > > >
> > > > > We've exposed the latent bug by including linux/errqueue.h
> > > >
> > > > Yes, I think there was a thread after the 1st pull where Leo identified
> > > > the patch that fixed it; but it's not in that image.
> > >
> > > I only fixed the MSG_ZEROCOPY missing define bug, as I got that
> > > Stefanha had already fixed the issue in liburing/alpine.
> > >
> > > questions:
> > > - Has Stefanha really fixed that, and we are just waiting for a new
> > > image, or have I got that wrong?
> > > - How should I proceed with that?
> > >
> > > - If we proceed with fixing this up in alpine, will that require this
> > > patchset to be on pause until it's fixed there?
> >
> > It needs to pass in CI; so yes.
> >
> > > - If so, is there any suggestion on how to fix that in qemu code?
> > > (this header is needed because of SO_EE_* defines)
> >
> > I've not actually looked at the detail of the failure; but yes I think
> > we need a qemu workaround here.
> >
> > If there's no simple fix, then adding a test to meson.build to
> > conditionally disable liburing might be best; like the test code for
> > libcap_ng I guess (search in meson.build for libcap_ng.found at around
> > line 540.
>
> Hello Dave,
>
> I solved this issue by doing this:
>
> +linux_io_uring_test = '''
> + #include <liburing.h>
> + #include <linux/errqueue.h>
> +
> + int main(void) { return 0; }'''
> +
> linux_io_uring = not_found
> if not get_option('linux_io_uring').auto() or have_block
> linux_io_uring = dependency('liburing', version: '>=0.3',
> required: get_option('linux_io_uring'),
> method: 'pkg-config', kwargs: static_kwargs)
> + if not cc.links(linux_io_uring_test)
> + linux_io_uring = not_found
> + endif
> endif
> +
>
> Seems to work fine in CI, and now Alpine does not fail anymore.
> (See pipeline https://gitlab.com/LeoBras/qemu/-/pipelines/538123933
> for reference)
>
> I am not sure if this is the right thing to do, but I will be sending
> it as a full new patchset (v13), with the first patch being the one
> with the above change and the rest just carrying the recommended
> fixes.
Thanks! That looks promising. I'll cook a new pull.
> I was also thinking I could instead send the single "fix" patch, and
> recommend adding it before my v12. If that is the correct approach for
> this case, please let me know so I can improve in the future. (I am
> trying to figure out what is simpler/best for maintainers)
Either way would be fine; the full series is probably better.
Dave
> Best regards,
> Leo
>
>
>
>
>
>
>
> >
> > Dave
> >
> > > Thank you all!
> > >
> > > Best regards,
> > > Leo
> > >
> > > >
> > > > Dave
> > > >
> > > > > With regards,
> > > > > Daniel
> > > > > --
> > > > > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> > > > > |: https://libvirt.org -o- https://fstop138.berrange.com :|
> > > > > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
> > > > >
> > > > --
> > > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > > >
> > >
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
© 2016 - 2026 Red Hat, Inc.