[libvirt PATCH v2 0/6] <interface> <teaming> element (was: virtio failover / vfio auto-plug-on-migrate)

Laine Stump posted 6 patches 4 years, 3 months ago
Posted by Laine Stump 4 years, 3 months ago
V1: https://www.redhat.com/archives/libvir-list/2020-January/msg00813.html

This all used different names in V1 - in that incarnation the
configuration was done using "failover" and "backupAlias" attributes
added to the <driver> subelement of <interface>. But the resulting
code was cumbersome and had little bits scattered all over the place
due to needing it in both hostdev and interface parsing/formatting.

In his review of V1, danpb suggested just adding a new subelement for
this configuration to free ourselves from the constraints of <driver>
parsing/formatting. This ended up dramatically simplifying the code
(hence the lack of V1's refactoring patches in V2, and a decrease in
patch count from 12 to 6).

During further discussion in email and on IRC, we decided that naming
the element <failover> was too limiting, as it implied the behavior of
what is, to libvirt, just two network devices that should be
teamed/bonded together - it's completely up to the hypervisor and
guest what is done with this information. In light of that, we decided
to name the new subelement <teaming>, and to specify the two
interfaces as "persistent" (an interface that will always remain
plugged in) and "transient" (an interface that may be periodically
unplugged, e.g. during migration in the case of QEMU). So the virtio
device will have

   <teaming type='persistent'/>

and the hostdev device will have

   <teaming type='transient' persistent='ua-myvirtio'/>

(note that the persistent interface must have <alias name='ua-myvirtio'/>)
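
To make the alias requirement concrete, here is a minimal sketch (not
part of the patches) that checks it against a domain fragment. The
network name, PCI address and MAC below are made up for illustration,
and check_teaming() is a hypothetical helper, not a libvirt API:

```python
import xml.etree.ElementTree as ET

# Hypothetical <devices> fragment; source names, PCI address and MAC are invented.
DEVICES_XML = """
<devices>
  <interface type='network'>
    <source network='mybridge'/>
    <mac address='00:11:22:33:44:55'/>
    <model type='virtio'/>
    <teaming type='persistent'/>
    <alias name='ua-myvirtio'/>
  </interface>
  <interface type='hostdev'>
    <source>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x07' function='0x1'/>
    </source>
    <mac address='00:11:22:33:44:55'/>
    <teaming type='transient' persistent='ua-myvirtio'/>
  </interface>
</devices>
"""


def check_teaming(devices_xml):
    """Return a list of problems found in the <teaming> configuration."""
    root = ET.fromstring(devices_xml.strip())

    # Collect the aliases of all interfaces marked as the persistent half.
    persistent_aliases = set()
    for iface in root.findall("interface"):
        teaming = iface.find("teaming")
        alias = iface.find("alias")
        if teaming is not None and teaming.get("type") == "persistent":
            if alias is not None:
                persistent_aliases.add(alias.get("name"))

    # Every transient interface must name one of those aliases.
    problems = []
    for iface in root.findall("interface"):
        teaming = iface.find("teaming")
        if teaming is None or teaming.get("type") != "transient":
            continue
        target = teaming.get("persistent")
        if target not in persistent_aliases:
            problems.append(
                f"transient interface refers to unknown alias {target!r}")
    return problems


print(check_teaming(DEVICES_XML))
```

An empty list means every transient interface points at an existing
persistent one.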

Given this config, libvirt will add "failover=on" to the device
commandline arg for the virtio device, and
"failover_pair_id=ua-myvirtio" to the arg for the hostdev device (and
when a migration is requested, it will notice if there is a hostdev
that has <teaming type='transient'/> set, and will allow the migration
in this case, but still disallow migrations of domains with normal
hostdevs).
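
The two rules in the previous paragraph can be sketched roughly as
follows. This is a toy model under stated assumptions, not the code in
qemu_command.c or qemu_migration.c: the Device class, its field names
and the example base arguments are all hypothetical; only the
"failover=on" and "failover_pair_id=" options come from the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Device:
    kind: str                                  # "virtio" or "hostdev" (labels invented here)
    base_arg: str                              # -device argument before teaming options
    teaming_type: Optional[str] = None         # "persistent", "transient" or None
    teaming_persistent: Optional[str] = None   # alias named by a transient device


def device_arg(dev):
    """Append the teaming-related option, if any, to the -device argument."""
    if dev.teaming_type == "persistent":
        return dev.base_arg + ",failover=on"
    if dev.teaming_type == "transient":
        return dev.base_arg + ",failover_pair_id=" + dev.teaming_persistent
    return dev.base_arg


def migration_allowed(devices):
    """Allow migration only if every hostdev is a transient teaming member."""
    return all(d.teaming_type == "transient"
               for d in devices if d.kind == "hostdev")


virtio = Device("virtio", "virtio-net-pci,netdev=hostnet0,id=ua-myvirtio",
                teaming_type="persistent")
hostdev = Device("hostdev", "vfio-pci,host=0000:03:07.1,id=hostdev0",
                 teaming_type="transient", teaming_persistent="ua-myvirtio")
plain_hostdev = Device("hostdev", "vfio-pci,host=0000:03:07.2,id=hostdev1")

print(device_arg(virtio))
print(device_arg(hostdev))
print(migration_allowed([virtio, hostdev]))
print(migration_allowed([virtio, hostdev, plain_hostdev]))
```

The last call shows the "normal hostdev" case: a single hostdev without
<teaming type='transient'/> is enough to block the migration.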

In response to these extra commandline options, QEMU will set some
extra capabilities in the virtio device PCI capabilities data, and
will also automatically unplug/re-plug the hostdev device before and
after migration.

In the guest, the virtio-net driver will notice the extra PCI
capabilities and use this as a clue that it should search for another
device with matching MAC address (NB: the guest driver requires the
two devices to have matching MAC addresses) to join into a bond with
the virtio-net device. This bond is hard-wired to always prefer the
hostdev device whenever it is present, and use the virtio device as
backup when the hostdev is unplugged.
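
As a rough illustration of the guest-side behavior just described (a
toy model only, not the guest driver's code), the pairing is by MAC
address and the choice of active path depends solely on whether the
hostdev is present. The function names and the device names in the
example are hypothetical:

```python
def find_partner(virtio_mac, other_devices):
    """Return the name of the first device whose MAC matches, else None.

    other_devices maps a device name to its MAC address string.
    """
    for name, mac in other_devices.items():
        if mac.lower() == virtio_mac.lower():
            return name
    return None


def active_path(hostdev_present):
    """The bond always prefers the hostdev; virtio is only the backup."""
    return "hostdev" if hostdev_present else "virtio"


guest_nics = {"eth1": "52:54:00:aa:bb:cc", "eth2": "00:11:22:33:44:55"}
print(find_partner("00:11:22:33:44:55", guest_nics))
print(active_path(True))    # hostdev plugged in
print(active_path(False))   # hostdev unplugged, e.g. during migration
```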

----

As mentioned in a followup to the V1 cover letter, there is a
regression in QEMU 4.2.0 that causes QEMU to segv when a hostdev is
unplugged. That bug is fixed with this upstream QEMU patch:

https://git.qemu.org/?p=qemu.git;a=commitdiff;h=0446f8121723b134ca1d1ed0b73e96d4a0a8689d;hp=48008198270e3ebcc9394401d676c54ed5ac139c

Be sure to use a qemu build with this patch applied, or you may not
even be able to start the guest! Also we've found that the
DEVICE_DELETED event is never sent to libvirt when one of these
hostdevs is manually unplugged, meaning that libvirt keeps the device
marked as "in-use", and it therefore cannot be re-plugged to the guest
until
after a complete guest "power cycle". AFAIK there isn't yet a fix for
that bug, so don't expect manual unplug of the device to work.

Laine Stump (6):
  qemu: add capabilities flag for failover feature
  conf: parse/format <teaming> subelement of <interface>
  qemu: support interface <teaming> functionality
  qemu: allow migration with assigned PCI hostdev if <teaming> is set
  qemu: add wait-unplug to qemu migration status enum
  docs: document <interface> subelement <teaming>

 docs/formatdomain.html.in                     | 100 ++++++++++++++++++
 docs/news.xml                                 |  28 +++++
 docs/schemas/domaincommon.rng                 |  19 ++++
 src/conf/domain_conf.c                        |  45 ++++++++
 src/conf/domain_conf.h                        |  14 +++
 src/qemu/qemu_capabilities.c                  |   4 +
 src/qemu/qemu_capabilities.h                  |   3 +
 src/qemu/qemu_command.c                       |   9 ++
 src/qemu/qemu_domain.c                        |  36 ++++++-
 src/qemu/qemu_migration.c                     |  53 +++++++++-
 src/qemu/qemu_monitor.c                       |   1 +
 src/qemu/qemu_monitor.h                       |   1 +
 src/qemu/qemu_monitor_json.c                  |   1 +
 .../caps_4.2.0.aarch64.xml                    |   1 +
 .../caps_4.2.0.x86_64.xml                     |   1 +
 .../net-virtio-teaming-network.xml            |  37 +++++++
 .../qemuxml2argvdata/net-virtio-teaming.args  |  40 +++++++
 tests/qemuxml2argvdata/net-virtio-teaming.xml |  50 +++++++++
 tests/qemuxml2argvtest.c                      |   4 +
 .../net-virtio-teaming-network.xml            |  51 +++++++++
 .../qemuxml2xmloutdata/net-virtio-teaming.xml |  66 ++++++++++++
 tests/qemuxml2xmltest.c                       |   6 ++
 22 files changed, 563 insertions(+), 7 deletions(-)
 create mode 100644 tests/qemuxml2argvdata/net-virtio-teaming-network.xml
 create mode 100644 tests/qemuxml2argvdata/net-virtio-teaming.args
 create mode 100644 tests/qemuxml2argvdata/net-virtio-teaming.xml
 create mode 100644 tests/qemuxml2xmloutdata/net-virtio-teaming-network.xml
 create mode 100644 tests/qemuxml2xmloutdata/net-virtio-teaming.xml

-- 
2.24.1

Re: [libvirt PATCH v2 0/6] <interface> <teaming> element (was: virtio failover / vfio auto-plug-on-migrate)
Posted by Dan Kenigsberg 4 years, 3 months ago
On Fri, 24 Jan 2020, 17:54 Laine Stump, <laine@redhat.com> wrote:

> V1: https://www.redhat.com/archives/libvir-list/2020-January/msg00813.html
>
> This all used different names in V1 - in that incarnation the
> configuration was done using "failover" and "backupAlias" attributes
> added to the <driver> subelement of <interface>. But the resulting
> code was cumbersome and had little bits scattered all over the place
> due to needing it in both hostdev and interface parsing/formatting.
>
> In his review of V1, danpb suggesting just adding a new subelement for
> this configuration to free ourselves from the constraints of <driver>
> parsing/formatting. This ended up dramatically simplifying the code
> (hence the lack of V1's refactoring patches in V2, and a decrease in
> patch count from 12 to 6).
>
> During further discussion in email and on IRC, we decided that naming
> the element <failover> was too limiting, as it implied the behavior of
> what is, to libvirt, just two network devices that should be
> teamed/bonded together - it's completely up to the hypervisor and
> guest what is done with this information. In light of that, we decided
> to name the new subelement <teaming>, and to specify the two
> interfaces as "persistent" (an interface that will always remain
> plugged in) and "transient" (an interface that may be periodically
> unplugged (during migration, in the case of QEMU). So the virtio
> device will have
>
>    <teaming type='persistent'/>
>
> and the hostdev device will have
>
>    <teaming type='transient' persistent='ua-myvirtio'/>
>
> (note that the persistent interface must have <alias name='ua-myvirtio'/>)
>
> Given this config, libvirt will add "failover=on" to the device
> commandline arg for the virtio device, and
> "failover_pair_id=ua-myvirtio" to the arg for the hostdev device (and
> when a migration is requested, it will notice if there is a hostdev
> that has <teaming type='transient'/> set, and will allow the migration
> in this case, but still disallow migrations of domains with normal
> hostdevs).
>
> In response to these extra commandline options, QEMU will set some
> extra capabilities in the virtio device PCI capabilities data, and
> will also automatically unplug/re-plug the hostdev device before and
> after migration.
>
> In the guest, the virtio-net driver will notice the extra PCI
> capabilities and use this as a clue that it should search for another
> device with matching MAC address (NB: the guest driver requires the
> two devices to have matching MAC addresses) to join into a bond with
> the virtio-net device.


I like the <teaming/> abstraction.

As I wrote earlier, as a virt-manager user I'd like to specify that two
interfaces are teamed; I would rather not have to copy the MAC address of
one onto the other. I'd prefer that libvirt hide this virtio awkwardness by
passing the "persistent" MAC address to both QEMU NICs. Would libvirt
provide this service to its multiple clients?
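
To illustrate the request, here is a sketch of what such a MAC
propagation step might look like. This is purely hypothetical: the
posted patches do not do this, propagate_mac() is an invented helper,
and the network names in the fragment are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in which the transient interface has no <mac> of its own.
TEAMED_XML = """
<devices>
  <interface type='network'>
    <source network='mybridge'/>
    <mac address='00:11:22:33:44:55'/>
    <model type='virtio'/>
    <teaming type='persistent'/>
    <alias name='ua-myvirtio'/>
  </interface>
  <interface type='network'>
    <source network='hostdev-pool'/>
    <teaming type='transient' persistent='ua-myvirtio'/>
  </interface>
</devices>
"""


def propagate_mac(devices_xml):
    """Copy each persistent interface's MAC onto its transient partner."""
    root = ET.fromstring(devices_xml.strip())

    # Map alias name -> MAC address for every persistent interface.
    macs = {}
    for iface in root.findall("interface"):
        teaming = iface.find("teaming")
        alias = iface.find("alias")
        mac = iface.find("mac")
        if (teaming is not None and teaming.get("type") == "persistent"
                and alias is not None and mac is not None):
            macs[alias.get("name")] = mac.get("address")

    # Set (or create) <mac> on each transient interface that names a known alias.
    for iface in root.findall("interface"):
        teaming = iface.find("teaming")
        if teaming is None or teaming.get("type") != "transient":
            continue
        address = macs.get(teaming.get("persistent"))
        if address is None:
            continue
        mac = iface.find("mac")
        if mac is None:
            mac = ET.SubElement(iface, "mac")
        mac.set("address", address)
    return root


root = propagate_mac(TEAMED_XML)
print(ET.tostring(root, encoding="unicode"))
```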


> This bond is hard-wired to always prefer the
> hostdev device whenever it is present, and use the virtio device as
> backup when the hostdev is unplugged.
>
> ----
>
> As mentioned in a followup to the V1 cover letter, there is a
> regression in QEMU 4.2.0 that causes QEMU to segv when a hostdev is
> unplugged. That bug is fixed with this upstream QEMU patch:
>
>
> https://git.qemu.org/?p=qemu.git;a=commitdiff;h=0446f8121723b134ca1d1ed0b73e96d4a0a8689d;hp=48008198270e3ebcc9394401d676c54ed5ac139c
>
> Be sure to use a qemu build with this patch applied, or you may not
> even be able to start the guest! Also we've found that the
> DEVICE_DELETED event is never sent to libvirt when one of these
> hostdevs is manually unplugged, meaning that libvirt keeps the device
> marked as
> "in-use", and it therefore cannot be re-plugged to the guest until
> after a complete guest "power cycle". AFAIK there isn't yet a fix for
> that bug, so don't expect manual unplug of the device to work.
>
> Laine Stump (6):
>   qemu: add capabilities flag for failover feature
>   conf: parse/format <teaming> subelement of <interface>
>   qemu: support interface <teaming> functionality
>   qemu: allow migration with assigned PCI hostdev if <teaming> is set
>   qemu: add wait-unplug to qemu migration status enum
>   docs: document <interface> subelement <teaming>