Commit 01ca4010d86 (libvirt v5.1.0) moved address reservation for
hotplugged interface devices up to an earlier point in
qemuDomainAttachNetDevice() because some function called when
hotplugging on aarch64 needed to know the address type, and failed
when it was "none".
This unfortunately caused a regression, because it also made PCI
address reservation happen before we noticed that the device was a
*hostdev* interface. Those interfaces are hotplugged by just calling
out to qemuDomainAttachHostdevDevice() - that function would then also
attempt to reserve the *same PCI address* that had just been reserved
in qemuDomainAttachNetDevice().
The solution is to move the bit of code that short-circuits out to
virDomainHostdevAttach() up *even earlier* so that no PCI address has
been allocated by the time it's called.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1744523
Signed-off-by: Laine Stump <laine@redhat.com>
---
src/qemu/qemu_hotplug.c | 26 +++++++++++++++-----------
1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index 55696ebe9a..80ddee6587 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -1181,6 +1181,17 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver,
if (qemuAssignDeviceNetAlias(vm->def, net, -1) < 0)
goto cleanup;
+ if (actualType == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
+ /* This is really a "smart hostdev", so it should be attached
+ * as a hostdev (the hostdev code will reach over into the
+ * netdev-specific code as appropriate), then also added to
+ * the nets list (see cleanup:) if successful.
+ */
+ ret = qemuDomainAttachHostDevice(driver, vm,
+ virDomainNetGetActualHostdev(net));
+ goto cleanup;
+ }
+
if (qemuDomainIsS390CCW(vm->def) &&
net->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI &&
virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_CCW)) {
@@ -1260,17 +1271,6 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver,
goto cleanup;
break;
- case VIR_DOMAIN_NET_TYPE_HOSTDEV:
- /* This is really a "smart hostdev", so it should be attached
- * as a hostdev (the hostdev code will reach over into the
- * netdev-specific code as appropriate), then also added to
- * the nets list (see cleanup:) if successful.
- */
- ret = qemuDomainAttachHostDevice(driver, vm,
- virDomainNetGetActualHostdev(net));
- goto cleanup;
- break;
-
case VIR_DOMAIN_NET_TYPE_VHOSTUSER:
queueSize = net->driver.virtio.queues;
if (!queueSize)
@@ -1313,6 +1313,10 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver,
}
break;
+ case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+ /* this switch is skipped by hostdev interfaces */
+ break;
+
case VIR_DOMAIN_NET_TYPE_SERVER:
case VIR_DOMAIN_NET_TYPE_CLIENT:
case VIR_DOMAIN_NET_TYPE_MCAST:
--
2.21.0
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
On Sat, 2019-10-19 at 02:18 -0400, Laine Stump wrote: > Commit 01ca4010d86 (libvirt v5.1.0) moved address reservation for > hotplugged interface devices up to an earlier point in > qemuDomainAttachNetDevice() because some function called when > hotplugging on aarch64 needed to know the address type, and failed > when it was "none". I don't see anything in the original commit suggesting the change was connected to aarch64? In fact, for the most part I would expect aarch64/virt guests to go down the very same code paths as x86/q35 guests. > This unfortunately caused a regression, because it also made PCI > address reservation happen before we noticed that the device was a > *hostdev* interface. Those interfaces are hotplugged by just calling > out to qemuDomainAttachHostdevDevice() - that function would then also > attempt to reserve the *same PCI address* that had just been reserved > in qemuDomainAttachNetDevice(). And of course at the time we didn't have, and after this patch still don't have, test suite coverage for that O:-) > The solution is to move the bit of code that short-circuits out to > virDomainHostdevAttach() up *even earlier* so that no PCI address has > been allocated by the time it's called. > > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1744523 > Signed-off-by: Laine Stump <laine@redhat.com> The prevailing style seems to be ... been allocated by the time it's called. https://bugzilla.redhat.com/show_bug.cgi?id=1744523 Signed-off-by: Laine Stump <laine@redhat.com> Feel free to change it or leave it as-is, I personally don't feel very strongly either way. > +++ b/src/qemu/qemu_hotplug.c > + case VIR_DOMAIN_NET_TYPE_HOSTDEV: > + /* this switch is skipped by hostdev interfaces */ Maybe something like /* hostdev interfaces have been dealt with earlier */ instead? With at least the reference to aarch64 removed (unless of course I have misunderstood the situation and the original commit was really aarch64-related), Reviewed-by: Andrea Bolognani <abologna@redhat.com> -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On 10/21/19 9:02 AM, Andrea Bolognani wrote: > On Sat, 2019-10-19 at 02:18 -0400, Laine Stump wrote: >> Commit 01ca4010d86 (libvirt v5.1.0) moved address reservation for >> hotplugged interface devices up to an earlier point in >> qemuDomainAttachNetDevice() because some function called when >> hotplugging on aarch64 needed to know the address type, and failed >> when it was "none". > I don't see anything in the original commit suggesting the change > was connected to aarch64? In fact, for the most part I would expect > aarch64/virt guests to go down the very same code paths as x86/q35 > guests. The secret info is in the emails surrounding the V1 patch that turned into the V2 patch that was eventually pushed: https://www.redhat.com/archives/libvir-list/2018-December/msg00439.html https://www.redhat.com/archives/libvir-list/2018-December/msg00460.html I think the issue is that in the VIR_DOMAIN_NET_TYPE_VHOSTUSER case of the switch statement in qemuDomainAttachNetDevice(), it is calling qemuDomainSupportsNicdev(), and qemuDomainSupportsNicdev has this code: /* non-virtio ARM nics require legacy -net nic */ if (((def->os.arch == VIR_ARCH_ARMV6L) || (def->os.arch == VIR_ARCH_ARMV7L) || (def->os.arch == VIR_ARCH_AARCH64)) && net->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO && net->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI) return false; return true; So if you were attempting to hotplug a vhostuser device on aarch64 with no address specified in the XML, it would return false, which would cause qemuDomainAttachNetDevice() to fail and log an error. > >> This unfortunately caused a regression, because it also made PCI >> address reservation happen before we noticed that the device was a >> *hostdev* interface. Those interfaces are hotplugged by just calling >> out to qemuDomainAttachHostdevDevice() - that function would then also >> attempt to reserve the *same PCI address* that had just been reserved >> in qemuDomainAttachNetDevice(). > And of course at the time we didn't have, and after this patch still > don't have, test suite coverage for that O:-) I'll choose to pretend I didn't see that. > >> The solution is to move the bit of code that short-circuits out to >> virDomainHostdevAttach() up *even earlier* so that no PCI address has >> been allocated by the time it's called. >> >> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1744523 >> Signed-off-by: Laine Stump <laine@redhat.com> > The prevailing style seems to be > > ... > been allocated by the time it's called. > > https://bugzilla.redhat.com/show_bug.cgi?id=1744523 > > Signed-off-by: Laine Stump <laine@redhat.com> > > Feel free to change it or leave it as-is, I personally don't feel > very strongly either way. Remember when BZes weren't listed *at all* in commit messages? Stumperidge Farms remembers. I guess the debate about this happened on one of the many patch threads that I didn't get around to reading (I remember seeing some private messages, but I thought those were only wrt downstream distro patches, not upstream). I can see the point with patches that are *in support* of some patch that fixes a bug (and therefore shouldn't claim "resolves"), but I think it's nice to have an explicit indicator in a patch that means "immediately prior to this patch, the bug still existed. After this patch, the bug no longer exists". Yeah yeah, I know it's more complicated than that. Anyway, if the winds are currently blowing in the direction of "don't say "Resolves:" then fine, I won't says "Resolves:". > >> +++ b/src/qemu/qemu_hotplug.c >> + case VIR_DOMAIN_NET_TYPE_HOSTDEV: >> + /* this switch is skipped by hostdev interfaces */ > Maybe something like > > /* hostdev interfaces have been dealt with earlier */ > > instead? Sure. > > > With at least the reference to aarch64 removed (unless of course I > have misunderstood the situation and the original commit was really > aarch64-related), Still want me to remove that, or add extra comments describing the original problem more thoroughly? (seems kind of the wrong place for that - that documentation train already sailed) > > Reviewed-by: Andrea Bolognani <abologna@redhat.com> > -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Tue, 2019-10-22 at 14:25 -0400, Laine Stump wrote: > On 10/21/19 9:02 AM, Andrea Bolognani wrote: > > On Sat, 2019-10-19 at 02:18 -0400, Laine Stump wrote: > > > Commit 01ca4010d86 (libvirt v5.1.0) moved address reservation for > > > hotplugged interface devices up to an earlier point in > > > qemuDomainAttachNetDevice() because some function called when > > > hotplugging on aarch64 needed to know the address type, and failed > > > when it was "none". > > > > I don't see anything in the original commit suggesting the change > > was connected to aarch64? In fact, for the most part I would expect > > aarch64/virt guests to go down the very same code paths as x86/q35 > > guests. > > The secret info is in the emails surrounding the V1 patch that turned > into the V2 patch that was eventually pushed: > > https://www.redhat.com/archives/libvir-list/2018-December/msg00439.html > https://www.redhat.com/archives/libvir-list/2018-December/msg00460.html > > I think the issue is that in the VIR_DOMAIN_NET_TYPE_VHOSTUSER case of > the switch statement in qemuDomainAttachNetDevice(), it is calling > qemuDomainSupportsNicdev(), and qemuDomainSupportsNicdev has this code: > > /* non-virtio ARM nics require legacy -net nic */ > if (((def->os.arch == VIR_ARCH_ARMV6L) || > (def->os.arch == VIR_ARCH_ARMV7L) || > (def->os.arch == VIR_ARCH_AARCH64)) && > net->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO && > net->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI) > return false; > return true; > > So if you were attempting to hotplug a vhostuser device on aarch64 with > no address specified in the XML, it would return false, which would > cause qemuDomainAttachNetDevice() to fail and log an error. Alright, it makes more sense now. Maybe you could mention qemuDomainSupportsNicdev() specifically instead of vaguely talking about "some function"? That would make the explanation a bit more self-contained and possibly make it unnecessary for the next developer to dig through the mailing list archives. Up to you. -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
© 2016 - 2024 Red Hat, Inc.