[PATCH 0/4] migration: workaround GNUTLS live migration crashes

Daniel P. Berrangé posted 4 patches 3 months, 4 weeks ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20250718150514.2635338-1-berrange@redhat.com
Maintainers: "Daniel P. Berrangé" <berrange@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, "Marc-André Lureau" <marcandre.lureau@redhat.com>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>
Posted by Daniel P. Berrangé 3 months, 4 weeks ago
TL;DR: GNUTLS is liable to crash QEMU when live migration is run
with TLS enabled and a return path channel is present, once approx
64 GB of data has been transferred. This is easily triggered in a
16 GB VM with 4 CPUs by running 'stress-ng --vm 4 --vm-bytes 80%' to
prevent convergence until 64 GB of RAM has been copied. Then
triggering post-copy switchover, or removing the stress workload
to allow completion, will crash it.

The only live migration scenario that should avoid this danger
is multifd, since the high volume data transfers are handled in
dedicated TCP connections which are unidirectional. The main
bi-directional TCP connection is only for co-ordination purposes.

This series implements a workaround that will prevent future QEMU
versions from triggering the crash.

The only way to avoid the crash with *existing* running QEMU
processes is to change the TLS cipher priority string to avoid
use of AES with TLS 1.3. This can be done with the 'priority'
field in the 'tls-creds-x509' object, e.g.

  -object tls-creds-x509,id=tls0,priority=NORMAL:-AES-256-GCM:-AES-128-GCM:-AES-128-CCM

which should force the use of CHACHA20-POLY1305, which does not
require TLS re-keying after 16 million sent records (64 GB of
migration data).

  https://gitlab.com/qemu-project/qemu/-/issues/1937
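
As a rough check of the figures above: 16 million records is 2^24, and
64 GB spread over that many records works out to 4 KiB per record. The
record size is inferred from those two numbers and is not stated
anywhere in this series, so treat it as an assumption. A minimal sketch
of the arithmetic:

```python
# Rough check of the re-keying threshold quoted above.
# Assumption: "16 million records" means 2**24, and each TLS record
# carries about 4 KiB of migration data (inferred from 64 GB / 16M).
REKEY_RECORDS = 2**24
RECORD_PAYLOAD_BYTES = 4 * 1024


def bytes_before_rekey(records: int = REKEY_RECORDS,
                       payload: int = RECORD_PAYLOAD_BYTES) -> int:
    """Return how many payload bytes are sent before the record limit."""
    return records * payload


if __name__ == "__main__":
    total = bytes_before_rekey()
    print(f"{total / 2**30:.0f} GiB before re-keying")
```

With larger records the same record count would be reached only after
proportionally more data, so the 64 GB figure should be read as
specific to this migration traffic pattern.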

On RHEL/Fedora distros you can also use the system-wide crypto
priorities to override this from the migration *target* host
by creating /etc/crypto-policies/local.d/gnutls-qemu.config
containing

  QEMU=NONE:+ECDHE-RSA:+ECDHE-ECDSA:+RSA:+DHE-RSA:+GROUP-X25519:+GROUP-X448:+GROUP-SECP256R1:+GROUP-SECP384R1:+GROUP-SECP521R1:+GROUP-FF

and running 'update-crypto-policies'. I recommend the QEMU-level
'tls-creds-x509' workaround though, which new libvirt
patches will soon be able to apply:

  https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/LX5KMIUFZSP5DPUXKJDFYBZI5TIE3E5N/
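
For management tooling that builds the QEMU command line, the
'tls-creds-x509' argument shown earlier could be assembled along these
lines. This is only a sketch: the helper name and its defaults are
hypothetical, the cipher names and the 'priority' property are taken
from the example above, and a real invocation would carry further
properties that are omitted here just as they are in that example.

```python
# Hypothetical helper: assemble the '-object tls-creds-x509' argument
# with the AES ciphers removed from the GNUTLS priority string.
# Only the cipher names and the 'priority' property come from the
# cover letter; the function name and defaults are illustrative.
EXCLUDED_CIPHERS = ("AES-256-GCM", "AES-128-GCM", "AES-128-CCM")


def tls_creds_object_arg(obj_id: str = "tls0",
                         base: str = "NORMAL",
                         excluded=EXCLUDED_CIPHERS) -> str:
    """Build the value passed to '-object' for the TLS credentials."""
    # Each excluded cipher is appended as ':-NAME' to the base priority.
    priority = ":".join([base] + [f"-{cipher}" for cipher in excluded])
    return f"tls-creds-x509,id={obj_id},priority={priority}"


if __name__ == "__main__":
    print("-object", tls_creds_object_arg())
```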

Daniel P. Berrangé (4):
  crypto: implement workaround for GNUTLS thread safety problems
  io: add support for activating TLS thread safety workaround
  migration: activate TLS thread safety workaround
  crypto: add tracing & warning about GNUTLS countermeasures

 crypto/tlssession.c           | 99 +++++++++++++++++++++++++++++++++--
 crypto/trace-events           |  2 +
 include/crypto/tlssession.h   | 14 +++++
 include/io/channel.h          |  1 +
 io/channel-tls.c              |  5 ++
 meson.build                   |  9 ++++
 meson_options.txt             |  2 +
 migration/tls.c               |  9 ++++
 scripts/meson-buildoptions.sh |  5 ++
 9 files changed, 143 insertions(+), 3 deletions(-)

-- 
2.50.1


Re: [PATCH 0/4] migration: workaround GNUTLS live migration crashes
Posted by Michael Tokarev 3 months, 3 weeks ago
On 18.07.2025 18:05, Daniel P. Berrangé wrote:
> [...]

It is a large(ish) change, but this patch set looks like a good
candidate for the qemu-stable series, at least for 10.0.x.  What do you
think?

Thanks,

/mjt

Re: [PATCH 0/4] migration: workaround GNUTLS live migration crashes
Posted by Daniel P. Berrangé 3 months, 2 weeks ago
On Sat, Jul 26, 2025 at 09:24:10AM +0300, Michael Tokarev wrote:
> [...]
> 
> It is a large(ish) change, but this patch set looks like a good
> candidate for the qemu-stable series, at least for 10.0.x.  What do you
> think?

Yeah, given broken gnutls is everywhere, I'd take it to any stable tree
where it is reasonably easy to do a clean backport.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH 0/4] migration: workaround GNUTLS live migration crashes
Posted by Fabiano Rosas 3 months, 3 weeks ago
Daniel P. Berrangé <berrange@redhat.com> writes:

> [...]

Hi, thank you for getting to the bottom of this.

Do you think it would be too cumbersome to add a test for this
somewhere? That would keep us from regressing the workaround, and the
test would also tell us when GNUTLS is fixed.
Re: [PATCH 0/4] migration: workaround GNUTLS live migration crashes
Posted by Daniel P. Berrangé 3 months, 3 weeks ago
On Mon, Jul 21, 2025 at 11:56:09AM -0300, Fabiano Rosas wrote:
> [...]
> 
> Hi, thank you for getting to the bottom of this.
> 
> Do you think it would be too cumbersome to add a test for this
> somewhere? That would keep us from regressing the workaround, and the
> test would also tell us when GNUTLS is fixed.

The reproducer scenario is very expensive. I'm doing it with a 16 GB RAM
guest with 4 CPUs, running a 'stress-ng' guest workload. With that, it
takes between 10 and 20 minutes before live migration gets GNUTLS into
the potentially broken state, and the failure is not 100% guaranteed at
that point.
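
To put that cost in perspective, here is a rough estimate of the
average transfer rate those timings imply. It assumes the "potentially
broken state" corresponds to the 64 GB threshold from the cover letter;
that link is an inference, and the function name is illustrative only.

```python
# Implied average throughput if 64 GiB is copied in 10 to 20 minutes.
# Assumption: the "potentially broken state" is reached once the
# 64 GB re-keying threshold from the cover letter has been sent.
THRESHOLD_BYTES = 64 * 2**30


def implied_mib_per_sec(minutes: float,
                        total_bytes: int = THRESHOLD_BYTES) -> float:
    """Average MiB/s needed to move total_bytes in the given minutes."""
    return total_bytes / 2**20 / (minutes * 60)


if __name__ == "__main__":
    for minutes in (10, 20):
        rate = implied_mib_per_sec(minutes)
        print(f"{minutes} min -> {rate:.0f} MiB/s")
```

Any automated test would need to sustain a rate of that order for the
whole run, and would still not be guaranteed to hit the failure.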

With regards,
Daniel


Re: [PATCH 0/4] migration: workaround GNUTLS live migration crashes
Posted by Fabiano Rosas 3 months, 3 weeks ago
Daniel P. Berrangé <berrange@redhat.com> writes:

> [...]
>
> The reproducer scenario is very expensive. I'm doing it with a 16 GB RAM
> guest with 4 CPUs, running a 'stress-ng' guest workload. With that, it
> takes between 10 and 20 minutes before live migration gets GNUTLS into
> the potentially broken state, and the failure is not 100% guaranteed at
> that point.
>

Makes sense. Thanks.

Will you take the series or should I?

Re: [PATCH 0/4] migration: workaround GNUTLS live migration crashes
Posted by Daniel P. Berrangé 3 months, 3 weeks ago
On Mon, Jul 21, 2025 at 12:14:51PM -0300, Fabiano Rosas wrote:
> [...]
> 
> Makes sense. Thanks.
> 
> Will you take the series or should I?

I presume you've already got another migration pull request planned, so
feel free to add this.


With regards,
Daniel