passt (https://passt.top) provides a method of connecting QEMU virtual
machines to the external network without requiring special privileges
or capabilities of any participating processes - even libvirt itself
can run unprivileged and create an instance of passt (which *always*
runs unprivileged) that is then connected to the qemu process (and
thus the virtual machine) with a unix socket.

Originally passt used its own protocol for this socket, sending both
control messages and data packets over the socket. This works, and is
already much more efficient than slirp, which was previously the only
unprivileged networking solution.

But recently passt added support for using the vhost-user protocol for
communication between the passt process (which is connected to the
external network) and the QEMU process (and thus the VM). vhost-user
also uses a unix socket, but only for control plane messages - all
data packets are "sent" between the VM and passt process via a shared
memory region. This is unsurprisingly much more efficient.

From the point of view of QEMU, the passt process looks identical to
any normal vhost-user backend, so we can run QEMU with exactly the
same interface commandline options as normal vhost-user. Also, the
passt process supports all of the same options as it does when used in
its "traditional" mode, so really in the end all we need to do is
twist libvirt around so that when <backend type='passt'/> is specified
for an <interface type='vhostuser'>, it will run passt just as before
(except with the added "--vhost-user" option so that passt will know
to use that), and then force feed the vhost-user code in libvirt with
the same socket path used by passt.

This series does that, while also switching up a few bits of code
prior to adding in the new functionality.
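As a rough illustration of the "same socket path on both sides" idea, here is a minimal Python sketch. It is not libvirt's implementation: the function names `passt_args` and `qemu_vhostuser_args`, the `net0` alias, and the example socket path are all made up for this sketch, and the QEMU arguments are deliberately incomplete (a real vhost-user setup also needs shared guest memory configured). The only thing taken from the description above is that passt gets the extra "--vhost-user" option and that both processes are pointed at one unix socket.

```python
def passt_args(socket_path, vhost_user):
    """Build a hypothetical passt command line for one interface.

    Assumption: passt is told where to listen with --socket; the only
    difference for vhost-user mode is the added --vhost-user flag.
    """
    args = ["passt", "--socket", socket_path]
    if vhost_user:
        args.append("--vhost-user")
    return args


def qemu_vhostuser_args(socket_path, alias="net0"):
    """Build illustrative QEMU arguments for a generic vhost-user netdev.

    QEMU does not need to know the backend is passt: it only sees a unix
    socket chardev attached to a vhost-user netdev.
    """
    chardev_id = "char" + alias
    return [
        "-chardev", f"socket,id={chardev_id},path={socket_path}",
        "-netdev", f"vhost-user,id=host{alias},chardev={chardev_id}",
    ]
```

The point of the sketch is only that one path is computed once and then handed to both command lines, which is what "force feed the vhost-user code with the same socket path" amounts to.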
So far this has been tested both unprivileged and privileged on Fedora
40 (with latest passt package) and selinux enabled (there are a couple
of selinux policy tweaks that still need to be pushed to
passt-selinux) as well as unprivileged on debian (I *think* with
AppArmor enabled) and everything seems to work.

(I haven't gotten to testing hotplug, but it *should* work, and I'll
be testing it while (hopefully) someone is reviewing these patches.)

I also need to make the patch to update formatdomain.rst before the
rest of it can be pushed, but I wanted to get this posted to save time
later.

This series does require patch 1 of the series I posted a couple days
ago that changes several functions that can't fail to return void.

Also, you will need the latest (20250121) passt package.

This Resolves: https://issues.redhat.com/browse/RHEL-69455

Laine Stump (9):
  conf: change virDomainHostdevInsert() to return void
  qemu: fix qemu validation to forbid guest-side IP address for
    type='vdpa'
  qemu: validate that model is virtio for vhostuser and vdpa interfaces
    in the same place
  qemu: automatically set model type='virtio' for interface
    type='vhostuser'
  qemu: do all vhostuser attribute validation in qemu driver
  conf/qemu: make <source> element *almost* optional for type=vhostuser
  qemu: use switch instead of if in qemuProcessPrepareDomainNetwork()
  qemu: make qemuPasstCreateSocketPath() public
  qemu: complete vhostuser + passt support

 src/conf/domain_conf.c                        | 107 +++++++++---------
 src/conf/domain_conf.h                        |   2 +-
 src/conf/domain_validate.c                    |  83 ++++----------
 src/conf/schemas/domaincommon.rng             |  32 +++++-
 src/libxl/libxl_domain.c                      |   5 +-
 src/libxl/libxl_driver.c                      |   3 +-
 src/lxc/lxc_driver.c                          |   3 +-
 src/qemu/qemu_command.c                       |   7 +-
 src/qemu/qemu_driver.c                        |   3 +-
 src/qemu/qemu_extdevice.c                     |   6 +-
 src/qemu/qemu_hotplug.c                       |  21 +++-
 src/qemu/qemu_passt.c                         |   5 +-
 src/qemu/qemu_passt.h                         |   3 +
 src/qemu/qemu_postparse.c                     |   3 +-
 src/qemu/qemu_process.c                       |  84 +++++++++-----
 src/qemu/qemu_validate.c                      |  56 ++++++---
 ...t-user-slirp-portforward.x86_64-latest.err |   2 +-
 .../net-vhostuser-passt.x86_64-latest.args    |  42 +++++++
 .../net-vhostuser-passt.x86_64-latest.xml     |  72 ++++++++++++
 tests/qemuxmlconfdata/net-vhostuser-passt.xml |  70 ++++++++++++
 tests/qemuxmlconftest.c                       |   1 +
 21 files changed, 429 insertions(+), 181 deletions(-)
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.args
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.xml
 create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.xml

-- 
2.47.1

On Thu, Feb 13, 2025 at 01:19:44PM -0500, Laine Stump wrote:
> passt (https://passt.top) provides a method of connecting QEMU virtual
> machines to the external network without requiring special privileges
> or capabilities of any participating processes - even libvirt itself
> can run unprivileged and create an instance of passt (which *always*
> runs unprivileged) that is then connected to the qemu process (and
> thus the virtual machine) with a unix socket.
>
[...]
>
> So far this has been tested both unprivileged and privileged on Fedora
> 40 (with latest passt packet) and selinux enabled (there are a couple
> of selinux policy tweaks that still need to be pushed to
> passt-selinux) as well as unprivileged on debian (I *think* with
> AppArmor enabled) and everything seems to work.

Unfortunately unprivileged VMs don't actually benefit from AppArmor
isolation. See [1] for a recent discussion about this.

> Also, you will need the latest (20250121) passt package.

I truly appreciate the amount of information you've included in the
cover letter, especially the details about required passt version and
missing SELinux bits. That made it much easier for me to jump in with
some confidence.

Speaking of SELinux, with the current policy on Fedora 41 I get a
couple of AVC denials related to accessing the shared memory file.
I understand that's expected, based on the above, but it's still
quite surprising to me that the VM would start at all in this case.

Just like the scenario that I've mentioned in my reply to 9/9, the
network interface quietly being broken doesn't make for a great user
experience. I believe this specific failure scenario, unlike the
other one, is pre-existing and not easy to deal with purely through
XML validation, but I really think that we should spend some effort
(as a follow-up) on making sure that, if passt can't set up the
network interface successfully, we report a useful error to the user
instead of just leaving things broken with no clear indication that
they are.


[1] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/R52J7KGT2X5A6WEYKNOSLQUDQKUC5ORA/
-- 
Andrea Bolognani / Red Hat / Virtualization

On Fri, 14 Feb 2025 03:17:06 -0800
Andrea Bolognani <abologna@redhat.com> wrote:

> On Thu, Feb 13, 2025 at 01:19:44PM -0500, Laine Stump wrote:
> > passt (https://passt.top) provides a method of connecting QEMU virtual
> > machines to the external network without requiring special privileges
> > or capabilities of any participating processes - even libvirt itself
> > can run unprivileged and create an instance of passt (which *always*
> > runs unprivileged) that is then connected to the qemu process (and
> > thus the virtual machine) with a unix socket.
> >
> [...]
> >
> > So far this has been tested both unprivileged and privileged on Fedora
> > 40 (with latest passt packet) and selinux enabled (there are a couple
> > of selinux policy tweaks that still need to be pushed to
> > passt-selinux) as well as unprivileged on debian (I *think* with
> > AppArmor enabled) and everything seems to work.
>
> Unfortunately unprivileged VMs don't actually benefit from AppArmor
> isolation. See [1] for a recent discussion about this.

I quickly reported to Laine about a test I made with the workaround I
proposed there. That's what it means by "working with AppArmor". It's
simply passt with:

  https://archives.passt.top/passt-dev/20250205163101.3793658-1-sbrivio@redhat.com/

(merged but not released yet).

> > Also, you will need the latest (20250121) passt package.
>
> I truly appreciate the amount of information you've included in the
> cover letter, especially the details about required passt version and
> missing SELinux bits. That made it much easier for me to jump in with
> some confidence.
>
> Speaking of SELinux, with the current policy on Fedora 41 I get a
> couple of AVC denials related to accessing the shared memory file.
> I understand that's expected, based on the above, but it's still
> quite surprising to me that the VM would start at all in this case.

Just for the record, it's with this:

  https://archives.passt.top/passt-dev/20250213221642.4085986-1-sbrivio@redhat.com/

as you found out meanwhile. Of course, it's temporary (and not even
released yet).

> Just like the scenario that I've mentioned in my reply to 9/9, the
> network interface quietly being broken doesn't make for a great user
> experience. I believe this specific failure scenario, unlike the
> other one, is pre-existing and not easy to deal with purely through
> XML validation, but I really think that we should spend some effort
> (as a follow-up) on making sure that, if passt can't set up the
> network interface successfully, we report a useful error to the user
> instead of just leaving things broken with no clear indication that
> they are.

I guess that's a valid follow-up improvement regardless of the
workaround I'm about to release in passt's policy.

> [1] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/R52J7KGT2X5A6WEYKNOSLQUDQKUC5ORA/

-- 
Stefano

On 2/14/25 6:17 AM, Andrea Bolognani wrote:
> On Thu, Feb 13, 2025 at 01:19:44PM -0500, Laine Stump wrote:
>> passt (https://passt.top) provides a method of connecting QEMU virtual
>> machines to the external network without requiring special privileges
>> or capabilities of any participating processes - even libvirt itself
>> can run unprivileged and create an instance of passt (which *always*
>> runs unprivileged) that is then connected to the qemu process (and
>> thus the virtual machine) with a unix socket.
>>
> [...]
>>
>> So far this has been tested both unprivileged and privileged on Fedora
>> 40 (with latest passt packet) and selinux enabled (there are a couple
>> of selinux policy tweaks that still need to be pushed to
>> passt-selinux) as well as unprivileged on debian (I *think* with
>> AppArmor enabled) and everything seems to work.
>
> Unfortunately unprivileged VMs don't actually benefit from AppArmor
> isolation. See [1] for a recent discussion about this.
>
>> Also, you will need the latest (20250121) passt package.
>
> I truly appreciate the amount of information you've included in the
> cover letter, especially the details about required passt version and
> missing SELinux bits. That made it much easier for me to jump in with
> some confidence.
>
> Speaking of SELinux, with the current policy on Fedora 41 I get a
> couple of AVC denials related to accessing the shared memory file.
> I understand that's expected, based on the above, but it's still
> quite surprising to me that the VM would start at all in this case.
>
> Just like the scenario that I've mentioned in my reply to 9/9, the
> network interface quietly being broken doesn't make for a great user
> experience. I believe this specific failure scenario, unlike the
> other one, is pre-existing and not easy to deal with purely through
> XML validation, but I really think that we should spend some effort
> (as a follow-up) on making sure that, if passt can't set up the
> network interface successfully, we report a useful error to the user
> instead of just leaving things broken with no clear indication that
> they are.

(I guess you're talking about reporting the selinux denial?)

The difficulty here is that it's not libvirt getting the selinux
denial, it's passt and/or QEMU, and we don't report errors from either
of those unless they are fatal (i.e. the other process exits right
away with an error code). Practically speaking though, the selinux
issue you're seeing should never happen in production (as long as all
packages are up to date) so I don't think it's as essential as the
shared memory config thing to figure out all the contortions necessary
to report it. (Translated: "Error reporting is hard!!!! Let's all go
shopping!!!!"). If you've got any bright ideas feel free to
pontificate though :-)

>
>
> [1] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/R52J7KGT2X5A6WEYKNOSLQUDQKUC5ORA/
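
To make the "only fatal errors get reported" limitation concrete, here is a small Python sketch of a launcher that notices a helper's failure only if the helper exits with an error code within a grace period. This is an illustration under assumptions, not how libvirt starts passt or QEMU: `start_helper`, `HelperStartError`, the grace period, and the two stand-in commands are invented for the example.

```python
import subprocess
import sys


class HelperStartError(RuntimeError):
    """Raised when the helper exits early with a non-zero status."""

    def __init__(self, returncode, stderr_text):
        super().__init__(
            f"helper exited with status {returncode}: {stderr_text.strip()}"
        )
        self.returncode = returncode
        self.stderr_text = stderr_text


def start_helper(argv, grace_seconds):
    """Start argv and report only failures that are fatal right away."""
    proc = subprocess.Popen(argv, stderr=subprocess.PIPE, text=True)
    try:
        _out, err = proc.communicate(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # Still running: from here on, anything the helper merely logs
        # (or any denial it hits later) is invisible to the caller.
        return proc
    if proc.returncode != 0:
        raise HelperStartError(proc.returncode, err)
    return proc


# Stand-in helpers: one dies immediately, one keeps running.
FAILS_EARLY = [sys.executable, "-c",
               "import sys; sys.stderr.write('cannot start\\n'); sys.exit(3)"]
KEEPS_RUNNING = [sys.executable, "-c", "import time; time.sleep(30)"]
```

With `FAILS_EARLY` the caller gets an exception carrying the exit status and stderr; with `KEEPS_RUNNING` it gets a live process handle and no further information, which is the situation described above for a denial that is hit after startup.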

On Fri, Feb 14, 2025 at 09:08:36AM -0500, Laine Stump wrote:
> On 2/14/25 6:17 AM, Andrea Bolognani wrote:
> > Speaking of SELinux, with the current policy on Fedora 41 I get a
> > couple of AVC denials related to accessing the shared memory file.
> > I understand that's expected, based on the above, but it's still
> > quite surprising to me that the VM would start at all in this case.
> >
> > Just like the scenario that I've mentioned in my reply to 9/9, the
> > network interface quietly being broken doesn't make for a great user
> > experience. I believe this specific failure scenario, unlike the
> > other one, is pre-existing and not easy to deal with purely through
> > XML validation, but I really think that we should spend some effort
> > (as a follow-up) on making sure that, if passt can't set up the
> > network interface successfully, we report a useful error to the user
> > instead of just leaving things broken with no clear indication that
> > they are.
>
> (I guess you're talking about reporting the selinux denial?)
>
> The difficulty here is that it's not libvirt getting the selinux denial,
> it's passt and/or QEMU, and we don't report errors from either of those
> unless they are fatal (i.e. the other process exits right away with an error
> code). Practically speaking though, the selinux issue you're seeing should
> never happen in production (as long as all packages are up to date) so I
> don't think it's as essential as the share memory config thing to figure out
> all the contortions necessary to report it. (Translated: "Error reporting is
> hard!!!! Let's all go shopping!!!!"). If you've got any bright ideas feel
> free to pontificate though :-)

I haven't looked into it in any detail, so no specific suggestions.
And I agree that it won't be seen in production so we can proceed as
is for now, and only consider improving things further as a
follow-up.

In abstract terms, we need to be able to catch startup errors from
passt more consistently. As a point of comparison, swtpm will
complain very loudly and refuse to start if it can't manufacture a
TPM device; passt being unable to create the network interface
backend or connect to the frontend should result in a similar VM
startup failure.

My impression is that in many cases passt will attempt to proceed and
simply log an error message on failure, possibly because the
underlying problem can be fixed after the fact. In the context of
libvirt, I don't think this applies. So maybe we need a "strict mode"
of sorts, where passt is more willing to call it quits whenever
something doesn't work?

I don't know how feasible that is. It's entirely possible that I have
incorrectly described how passt does error handling. All I know for
sure is that the current situation, in which a VM can successfully be
started despite ending up with a non-working network interface, is
very clearly not acceptable in the long run.

-- 
Andrea Bolognani / Red Hat / Virtualization

On Fri, 14 Feb 2025 06:47:53 -0800
Andrea Bolognani <abologna@redhat.com> wrote:

> On Fri, Feb 14, 2025 at 09:08:36AM -0500, Laine Stump wrote:
> > On 2/14/25 6:17 AM, Andrea Bolognani wrote:
> > > Speaking of SELinux, with the current policy on Fedora 41 I get a
> > > couple of AVC denials related to accessing the shared memory file.
> > > I understand that's expected, based on the above, but it's still
> > > quite surprising to me that the VM would start at all in this case.
> > >
> > > Just like the scenario that I've mentioned in my reply to 9/9, the
> > > network interface quietly being broken doesn't make for a great user
> > > experience. I believe this specific failure scenario, unlike the
> > > other one, is pre-existing and not easy to deal with purely through
> > > XML validation, but I really think that we should spend some effort
> > > (as a follow-up) on making sure that, if passt can't set up the
> > > network interface successfully, we report a useful error to the user
> > > instead of just leaving things broken with no clear indication that
> > > they are.
> >
> > (I guess you're talking about reporting the selinux denial?)
> >
> > The difficulty here is that it's not libvirt getting the selinux denial,
> > it's passt and/or QEMU, and we don't report errors from either of those
> > unless they are fatal (i.e. the other process exits right away with an error
> > code). Practically speaking though, the selinux issue you're seeing should
> > never happen in production (as long as all packages are up to date) so I
> > don't think it's as essential as the share memory config thing to figure out
> > all the contortions necessary to report it. (Translated: "Error reporting is
> > hard!!!! Let's all go shopping!!!!"). If you've got any bright ideas feel
> > free to pontificate though :-)
>
> I haven't looked into it in any detail, so no specific suggestions.
> And I agree that it won't be seen in production so we can proceed as
> is for now, and only consider improving things further as a
> follow-up.
>
> In abstract terms, we need to be able to catch startup errors from
> passt more consistently. As a point of comparison, swtpm will
> complain very loudly and refuse to start if it can't manufacture a
> TPM device; passt being unable to create the network interface
> backend or connect to the frontend should result in a similar VM
> startup failure.
>
> My impression is that in many cases passt will attempt to proceed and
> simply log an error message on failure, possibly because the
> underlying problem can be fixed after the fact.

In this case what you see is not a desired behaviour of passt and it's
a bit more complicated compared to the non-vhost-user case (hence the
issue that nobody had thought about).

With or without --vhost-user, passt goes to background once its
*control interface* is up and running, which should make it convenient
for libvirt, libkrun, and even "manual" users (typically using
scripts). You start it without having to fork. If it forks and exits
successfully, you know it's ready. Nothing fancy here, that's a pretty
much established UNIX daemon thing.

In the non-vhost-user case, that socket is also the data interface, so
not much can go wrong after this phase. QEMU might fail to start (or
fail to connect to passt and hence fail to start), and in that case
libvirt will terminate passt, but that's about it.

In the vhost-user case, passt still forks once the control interface is
up and running: libvirt knows that QEMU can start, now. But the data
connection is not ready, yet, because that's negotiated as part of the
vhost-user protocol.

So, QEMU connects, and passes to passt a passing file descriptor, en
passant, via SCM_RIGHTS (admittedly a bit passé), to represent the
shared (guest) memory that passt can map. If passt can't map the
memory, it fails with a resounding:

	die_perror("vhost-user region mmap error");

but that's too late: passt declared success, the guest started, and
libvirt can't do much. We don't have guarantees as to when this failure
can happen, either.

Without vhost-user, libvirt gets a NETDEV_STREAM_DISCONNECTED if passt
terminates for whatever reason, and it handles that by restarting
passt. QEMU's interface is configured with a --reconnect-ms option, so
restarting passt should be enough to give the guest its connectivity
back.

For vhost-user, a patch from Laurent would introduce equivalent
functionality:

  https://lore.kernel.org/qemu-devel/20250214072629.1033314-1-lvivier@redhat.com/

but that only helps when things are up and running. If failure happens
before the guest ever had working connectivity, it's pretty unclear to
me what the desired behaviour is. And:

> In the context of
> libvirt, I don't think this applies. So maybe we need a "strict mode"
> of sorts, where passt is more willing to call it quits whenever
> something doesn't work?

...while this already happens...

> I don't know how feasible that is. It's entirely possible that I have
> incorrectly described how passt does error handling. All I know for
> sure is that the current situation, in which a VM can successfully be
> started despite ending up with a non-working network interface, is
> very clearly not acceptable in the long run.

...should libvirt really bring down the guest because oops, on a second
thought, your network interface will never work? Again, we have no
timing guarantees.

Or perhaps QEMU should terminate instead? Are you aware of any similar
case we can try to use as established practice?

-- 
Stefano
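
The ordering problem above (control socket ready first, memory mapping only afterwards) can be sketched in a few lines of Python. This assumes Python 3.9+ on a Unix system and does not model the real vhost-user messages: a socketpair stands in for the unix socket, a temporary file stands in for guest memory, and `send_region`, `recv_and_map`, and `demo` are made-up names.

```python
import mmap
import os
import socket
import tempfile

REGION_SIZE = mmap.PAGESIZE


def send_region(sock, fd):
    """Frontend side: pass a file descriptor as SCM_RIGHTS ancillary data."""
    # At least one byte of ordinary payload must accompany the descriptor.
    socket.send_fds(sock, [b"M"], [fd])


def recv_and_map(sock, size):
    """Backend side: receive the descriptor, then try to map it.

    The control connection already exists when this runs, so an mmap
    failure here comes after the peer has been told the backend is ready.
    """
    _msg, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    if not fds:
        raise RuntimeError("no file descriptor received")
    fd = fds[0]
    try:
        return mmap.mmap(fd, size)  # shared, read-write mapping on Unix
    finally:
        os.close(fd)  # the mapping keeps its own duplicate


def demo():
    """Share one page between the two ends and write through the mapping."""
    frontend, backend = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with frontend, backend, tempfile.TemporaryFile() as mem:
        mem.truncate(REGION_SIZE)
        send_region(frontend, mem.fileno())
        region = recv_and_map(backend, REGION_SIZE)
        try:
            region[:5] = b"hello"
            mem.seek(0)
            return mem.read(5)
        finally:
            region.close()
```

In this sketch the mapping step can only raise after both ends are already connected, which mirrors why a failure at that point is "too late" for whoever launched the backend.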

On a Thursday in 2025, Laine Stump wrote:

[...]

>This Resolves: https://issues.redhat.com/browse/RHEL-69455
>
>Laine Stump (9):
>  conf: change virDomainHostdevInsert() to return void
>  qemu: fix qemu validation to forbid guest-side IP address for
>    type='vdpa'
>  qemu: validate that model is virtio for vhostuser and vdpa interfaces
>    in the same place
>  qemu: automatically set model type='virtio' for interface
>    type='vhostuser'
>  qemu: do all vhostuser attribute validation in qemu driver
>  conf/qemu: make <source> element *almost* optional for type=vhostuser
>  qemu: use switch instead of if in qemuProcessPrepareDomainNetwork()
>  qemu: make qemuPasstCreateSocketPath() public
>  qemu: complete vhostuser + passt support
>
> src/conf/domain_conf.c                        | 107 +++++++++---------
> src/conf/domain_conf.h                        |   2 +-
> src/conf/domain_validate.c                    |  83 ++++----------
> src/conf/schemas/domaincommon.rng             |  32 +++++-
> src/libxl/libxl_domain.c                      |   5 +-
> src/libxl/libxl_driver.c                      |   3 +-
> src/lxc/lxc_driver.c                          |   3 +-
> src/qemu/qemu_command.c                       |   7 +-
> src/qemu/qemu_driver.c                        |   3 +-
> src/qemu/qemu_extdevice.c                     |   6 +-
> src/qemu/qemu_hotplug.c                       |  21 +++-
> src/qemu/qemu_passt.c                         |   5 +-
> src/qemu/qemu_passt.h                         |   3 +
> src/qemu/qemu_postparse.c                     |   3 +-
> src/qemu/qemu_process.c                       |  84 +++++++++-----
> src/qemu/qemu_validate.c                      |  56 ++++++---
> ...t-user-slirp-portforward.x86_64-latest.err |   2 +-
> .../net-vhostuser-passt.x86_64-latest.args    |  42 +++++++
> .../net-vhostuser-passt.x86_64-latest.xml     |  72 ++++++++++++
> tests/qemuxmlconfdata/net-vhostuser-passt.xml |  70 ++++++++++++
> tests/qemuxmlconftest.c                       |   1 +
> 21 files changed, 429 insertions(+), 181 deletions(-)
> create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.args
> create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.x86_64-latest.xml
> create mode 100644 tests/qemuxmlconfdata/net-vhostuser-passt.xml
>

Reviewed-by: Ján Tomko <jtomko@redhat.com>

Jano