libvirt.spec.in | 5 + meson.build | 1 + po/POTFILES | 2 + src/conf/virnetworkobj.c | 40 + src/conf/virnetworkobj.h | 11 + src/libvirt_private.syms | 68 +- src/network/bridge_driver.c | 40 +- src/network/bridge_driver_conf.c | 44 + src/network/bridge_driver_conf.h | 3 + src/network/bridge_driver_linux.c | 241 +++-- src/network/bridge_driver_nop.c | 6 +- src/network/bridge_driver_platform.h | 6 +- src/network/libvirtd_network.aug | 39 + src/network/meson.build | 11 + src/network/network.conf | 24 + src/network/test_libvirtd_network.aug.in | 5 + src/nwfilter/nwfilter_ebiptables_driver.c | 16 +- src/util/meson.build | 2 + src/util/virebtables.c | 4 +- src/util/virfirewall.c | 490 ++++++++-- src/util/virfirewall.h | 51 +- src/util/viriptables.c | 762 ++++----------- src/util/viriptables.h | 222 ++--- src/util/virnetfilter.c | 892 ++++++++++++++++++ src/util/virnetfilter.h | 159 ++++ src/util/virnftables.c | 698 ++++++++++++++ src/util/virnftables.h | 118 +++ .../{base.args => base.iptables} | 0 tests/networkxml2firewalldata/base.nftables | 256 +++++ ...-linux.args => nat-default-linux.iptables} | 0 .../nat-default-linux.nftables | 248 +++++ ...pv6-linux.args => nat-ipv6-linux.iptables} | 0 .../nat-ipv6-linux.nftables | 384 ++++++++ ...rgs => nat-ipv6-masquerade-linux.iptables} | 0 .../nat-ipv6-masquerade-linux.nftables | 456 +++++++++ ...linux.args => nat-many-ips-linux.iptables} | 0 .../nat-many-ips-linux.nftables | 472 +++++++++ ...-linux.args => nat-no-dhcp-linux.iptables} | 0 .../nat-no-dhcp-linux.nftables | 384 ++++++++ ...ftp-linux.args => nat-tftp-linux.iptables} | 0 .../nat-tftp-linux.nftables | 274 ++++++ ...inux.args => route-default-linux.iptables} | 0 .../route-default-linux.nftables | 162 ++++ tests/networkxml2firewalltest.c | 56 +- tests/virfirewalltest.c | 20 +- 45 files changed, 5718 insertions(+), 954 deletions(-) create mode 100644 src/network/libvirtd_network.aug create mode 100644 src/network/network.conf create mode 100644 src/network/test_libvirtd_network.aug.in create mode 100644 src/util/virnetfilter.c create mode 100644 src/util/virnetfilter.h create mode 100644 src/util/virnftables.c create mode 100644 src/util/virnftables.h rename tests/networkxml2firewalldata/{base.args => base.iptables} (100%) create mode 100644 tests/networkxml2firewalldata/base.nftables rename tests/networkxml2firewalldata/{nat-default-linux.args => nat-default-linux.iptables} (100%) create mode 100644 tests/networkxml2firewalldata/nat-default-linux.nftables rename tests/networkxml2firewalldata/{nat-ipv6-linux.args => nat-ipv6-linux.iptables} (100%) create mode 100644 tests/networkxml2firewalldata/nat-ipv6-linux.nftables rename tests/networkxml2firewalldata/{nat-ipv6-masquerade-linux.args => nat-ipv6-masquerade-linux.iptables} (100%) create mode 100644 tests/networkxml2firewalldata/nat-ipv6-masquerade-linux.nftables rename tests/networkxml2firewalldata/{nat-many-ips-linux.args => nat-many-ips-linux.iptables} (100%) create mode 100644 tests/networkxml2firewalldata/nat-many-ips-linux.nftables rename tests/networkxml2firewalldata/{nat-no-dhcp-linux.args => nat-no-dhcp-linux.iptables} (100%) create mode 100644 tests/networkxml2firewalldata/nat-no-dhcp-linux.nftables rename tests/networkxml2firewalldata/{nat-tftp-linux.args => nat-tftp-linux.iptables} (100%) create mode 100644 tests/networkxml2firewalldata/nat-tftp-linux.nftables rename tests/networkxml2firewalldata/{route-default-linux.args => route-default-linux.iptables} (100%) create mode 100644 tests/networkxml2firewalldata/route-default-linux.nftables
This patch series enables libvirt to use nftables rules rather than iptables *when setting up virtual networks* (it does *not* add nftables support to the nwfilter driver). It accomplishes this by abstracting several iptables functions (from viriptables.[ch] called by the virtual network driver into a rudimentary "virNetfilter API" (in virnetfilter.[ch], having the virtual network driver call the virNetFilter API rather than calling the existing iptables functions directly, and then finally adding an equivalent virNftables backend that can be used instead of iptables (selected manually via a network.conf setting, or automatically if iptables isn't found on the host). A first look at the result may have you thinking that it's filled with a lot of bad decisions. While I would agree with that in many cases, I think that overall they are the "least bad" decisions, or at least "bad within acceptable limits / no worse than something else", and point out that it's been done in a way that minimizes (actually eliminates) the need for immediate changes to nwfilter (the other consumer of iptables, which *also* needs to be updated to use native nftables), and makes it much easier to change our mind about the details in the future. When I first started on this (long, protracted, repeatedly interrupted for extended periods - many of these patches are > a year old) task, I considered doing an all-at-once complete replacement of iptables with nftables, since all the Linux distros we support have had nftables for several years, and I'm pretty sure nobody has it disabled (not even sure if it's possible to disable nftables while still enabling iptables, since they both use xtables in the kernel). But due to libvirt's use of "-t mangle -j CHECKSUM --checksum-fill" (see commit fd5b15ff all the way back in July 2010 for details) which has no equivalent in nftables rules (and we don't *want* it to!!), and the desire to be able to easily switch back to iptables in case of an unforeseen regression, we decided that both iptables and nftables need to be supported (for now), with the default (for now) remaining as iptables. Just allowing for dual backends complicated matters, since it means that we have to have a config file, a setting, detection of which backends are available, and of course some sort of concept of an abstracted frontend that can use either backend based on the config setting (and/or auto-detection). Combining that with the fact that it would just be "too big" of a project to switch over nwfilter's iptables usage at the same time means that we have to keep around a lot of existing code for compatibility's sake rather than just wiping it all away and starting over. So, what I've ended up with is: 1) a network.conf file (didn't exist before) with a single setting "firewall_backend". If unset, the network driver tries to use iptables on the backend, and if that's missing, then tries to use nftables. 2) a new (internal-only, so transient!) virNetFilterXXX API that is used by the network driver in place of the iptablesXXX API, and calls either iptablesXXX or: 3) a virNftablesXXX API that exactly replicates the filtering rules of the existing iptablesXXX API (except in the custom "libvirt" base table rather than the system "filter" and "nat" tables). This means that: 4) when the nftables backend is used, the rules added are *exactly the same* (functionally speaking) as we currently add for iptables (except they are in the "libvirt" table). We had spent some time in IRC discussing different ways of using new functionality available in nftables to make a more efficient/performant implemention of the desired filtering, and there are some really great possibilities that need to be explored, but in the end there were too many details up in the air, and I decided that it would be more "accomplishable" (coined a new word there!) to first replicate existing behavior with nftables, but do it inside a framework that makes it easy to modify the details in the future (in particular making it painless to switch back and forth between builds with differing filter models at runtime) - this way we'll be able to separate the infrastructure work from the details of the rules (which we can then more easily work on and experiment with). (This implies that the main objective right now is "get rid of iptables dependencies", not "make the filtering faster and more efficient"). Notable features of this patchset: * allows switching between iptables/nftables backends without rebooting or restarting networks/guests. Because the commands required to remove a network's filter rules are now saved in the network status XML, each time libvirtd (or virtnetworkd) is restarted, it will execute exactly the commands needed to remove the filter rules that had been added by the previous libvirtd/virtnetworkd (rather than just making a guess, as we've always done up until now), and then add new rules using the current backend+binary's set of rules (while also saving the info needed for future removal of these new rules back into the network's status XML). * firewall_backend can be explicitly set in (new) /etc/libvirt/network.conf, but if it's not explicitly set, libvirt will default to the iptables backend if the iptables binary is found, and otherwise fall back to nftables as long as the nft binary is found; otherwise the first attempt to start a network will fail with an appropriate error. Things that seem ugly / that I would like to clean up / that I think are just fine as they are: * virFirewall does *not* provide a backend-agnostic interface [this is fine] * We need to maintain a backward-compatible API for virFirewall so that we don't have to touch nwfilter code. Trying to make its API backend-agnostic would require individually considering/changing every nwfilter use of virFirewall. * instead virFirewall objects are just a way to build a collection of commands to execute to build a firewall, then execute them while collecting info for and building a collection of commands that will tear down that firewall in the future. Do I want to "fix" this in the future by making virFirewall a higher level interface that accepts tokens describing the type of rule to add (rather than backend-specific arguments to a backend-specific command)? No. I think I like the way virFirewall works (as described in that previous bullet-point), instead I'm thinking that it is just slightly mis-named - I've lately been thinking of it as a "virNetFilterCmdList". Similarly, the virFirewallRules that it has a list of aren't really "rules", they are better described as commands or actions, so maybe they should be renamed to virNetfilterCmd or virNetfilterAction. But that is just cosmetic, so I didn't want to get into it in these patches (especially in case someone disagrees, or has a better idea for naming). * Speaking of renaming - I should probably rename all the "iptablesXXX" functions to "virIptablesXXX" to be consistent with so much of our other code. I lost the ambition to deal with it right now though, so I'm leaving that for later cleanup (or I could do it now if it really makes someone's day :-). * I could have chosen a higher place in the callchain to make the virNetfilter abstraction, e.g. at the level of "networkAddXXXFirewallRules()" rather than at the lower level of iptablesXXX(). That is actually probably what will happen in the future (since it will be necessary in order for an nftables-based firewall to be significantly different in structure from an iptables-based firewall). But that's the beauty of an API being private - we can freely add/remove things as needed. the important thing is that we now have the basic structure there. For now, the split is just above the existing iptablesXXX API (util/viriptables.[ch], which seems like a "narrow" enough place. Most iptablesXXX functions are written in terms of just 10 *other* iptablesXXX functions that add iptables-specific commands - I've just moved those functions into virnetfilter.[ch] (appropriately renamed), and changed them to call the 10 virNetfilterXXX functions that will in-turn call those 10 iptablesXXX (or equivalent virNftablesXXX) functions. * Some people may dislike that the 10 virNetfilterXXX functions are each written with a switch statement that has cases to directly call each backend, rather than each backend driver having a table of pointers to API functions, with the virNetfilter API function calling backends[fwBackend]->XXX() (ie the pattern for so many drivers in libvirt). But for just 2 backends, that really seemed like overkill and unnecessary obfuscation. * As implemented here, I am storing a "<fwRemoval>" element in the network status XML - it contains a serialized virFirewall object that directly contains the commands necessary to remove the firewall. I could instead just store "<firewall>", which would include all the commands that were used to *create* the firewall in addition to the commands needed to remove the firewall. The way it's done currently takes up less space; switching to storing the full firewall *might* be more informative to somebody, but on the other hand would make the network status XML *very* long. If anybody has an opinion about this, now is the time to bring it up - do you think it's worth having a separate list of all the commands that were used to create a network's firewall (keeping in mind that there is no public API to access it)? Or is it enough to just store what's needed to remove the firewall? * Several months ago Eric Garver posted patches for a pure firewalld backend, and I requested that they not be pushed because I wanted that to be integrated with my nftables backend support. Due to the fact that the firewalld backend is almost entirely implemented by putting the bridge into a new firewalld "zone", with no individual rules added, that won't happen as just another backend driver file in parallel to iptables and nftables; it will instead work by checking firewall_backend at a higher level in the network driver, thus avoiding the calls to virNetfilterXXX() entirely. I have locally merged Eric's patches over the top of these patches, and there are surprisingly few conflicts, but since his patches didn't account for a user-settable config (but instead just always used the firewalld backend if firewalld was active), some of the patches are going to require a bit of rework, which I'll take care of after getting these patches in. Laine Stump (28): util: add -w/--concurrent when applying the rule rather than when building it util: new virFirewallRuleGet*() APIs util: determine ignoreErrors value when creating rule, not when applying util: rename iptables helpers that will become the frontend for ip&nftables util: move backend-agnostic virNetfilter*() functions to their own file util: make netfilter action a proper typedefed (virFirewall) enum util: #define the names used for private packet filter chains util: move/rename virFirewallApplyRuleDirect to virIptablesApplyFirewallRule util/network: reintroduce virFirewallBackend, but different network: add (empty) network.conf file to distribution files network: allow setting firewallBackend from network.conf network: do not add DHCP checksum mangle rule unless using iptables network: call backend agnostic function to init private filter chains util: setup functions in virnetfilter which will call appropriate backend build: add nft to the list of binaries we attempt to locate util: add nftables backend to virnetfilter API used by network driver tests: test cases for nftables backend util: new functions to support adding individual rollback rules util: check for 0 args when applying iptables rule util: implement rollback rule autosave for iptables backend util: implement rollback rule autosave for nftables backend network: turn on auto-rollback for the rules added for virtual networks util: new function virFirewallNewFromRollback() util: new functions virFirewallParseXML() and virFirewallFormat() conf: add a virFirewall object to virNetworkObj network: use previously saved list of firewall rules when removing network: save network status when firewall rules are reloaded network: improve log message when reloading virtual network firewall rules libvirt.spec.in | 5 + meson.build | 1 + po/POTFILES | 2 + src/conf/virnetworkobj.c | 40 + src/conf/virnetworkobj.h | 11 + src/libvirt_private.syms | 68 +- src/network/bridge_driver.c | 40 +- src/network/bridge_driver_conf.c | 44 + src/network/bridge_driver_conf.h | 3 + src/network/bridge_driver_linux.c | 241 +++-- src/network/bridge_driver_nop.c | 6 +- src/network/bridge_driver_platform.h | 6 +- src/network/libvirtd_network.aug | 39 + src/network/meson.build | 11 + src/network/network.conf | 24 + src/network/test_libvirtd_network.aug.in | 5 + src/nwfilter/nwfilter_ebiptables_driver.c | 16 +- src/util/meson.build | 2 + src/util/virebtables.c | 4 +- src/util/virfirewall.c | 490 ++++++++-- src/util/virfirewall.h | 51 +- src/util/viriptables.c | 762 ++++----------- src/util/viriptables.h | 222 ++--- src/util/virnetfilter.c | 892 ++++++++++++++++++ src/util/virnetfilter.h | 159 ++++ src/util/virnftables.c | 698 ++++++++++++++ src/util/virnftables.h | 118 +++ .../{base.args => base.iptables} | 0 tests/networkxml2firewalldata/base.nftables | 256 +++++ ...-linux.args => nat-default-linux.iptables} | 0 .../nat-default-linux.nftables | 248 +++++ ...pv6-linux.args => nat-ipv6-linux.iptables} | 0 .../nat-ipv6-linux.nftables | 384 ++++++++ ...rgs => nat-ipv6-masquerade-linux.iptables} | 0 .../nat-ipv6-masquerade-linux.nftables | 456 +++++++++ ...linux.args => nat-many-ips-linux.iptables} | 0 .../nat-many-ips-linux.nftables | 472 +++++++++ ...-linux.args => nat-no-dhcp-linux.iptables} | 0 .../nat-no-dhcp-linux.nftables | 384 ++++++++ ...ftp-linux.args => nat-tftp-linux.iptables} | 0 .../nat-tftp-linux.nftables | 274 ++++++ ...inux.args => route-default-linux.iptables} | 0 .../route-default-linux.nftables | 162 ++++ tests/networkxml2firewalltest.c | 56 +- tests/virfirewalltest.c | 20 +- 45 files changed, 5718 insertions(+), 954 deletions(-) create mode 100644 src/network/libvirtd_network.aug create mode 100644 src/network/network.conf create mode 100644 src/network/test_libvirtd_network.aug.in create mode 100644 src/util/virnetfilter.c create mode 100644 src/util/virnetfilter.h create mode 100644 src/util/virnftables.c create mode 100644 src/util/virnftables.h rename tests/networkxml2firewalldata/{base.args => base.iptables} (100%) create mode 100644 tests/networkxml2firewalldata/base.nftables rename tests/networkxml2firewalldata/{nat-default-linux.args => nat-default-linux.iptables} (100%) create mode 100644 tests/networkxml2firewalldata/nat-default-linux.nftables rename tests/networkxml2firewalldata/{nat-ipv6-linux.args => nat-ipv6-linux.iptables} (100%) create mode 100644 tests/networkxml2firewalldata/nat-ipv6-linux.nftables rename tests/networkxml2firewalldata/{nat-ipv6-masquerade-linux.args => nat-ipv6-masquerade-linux.iptables} (100%) create mode 100644 tests/networkxml2firewalldata/nat-ipv6-masquerade-linux.nftables rename tests/networkxml2firewalldata/{nat-many-ips-linux.args => nat-many-ips-linux.iptables} (100%) create mode 100644 tests/networkxml2firewalldata/nat-many-ips-linux.nftables rename tests/networkxml2firewalldata/{nat-no-dhcp-linux.args => nat-no-dhcp-linux.iptables} (100%) create mode 100644 tests/networkxml2firewalldata/nat-no-dhcp-linux.nftables rename tests/networkxml2firewalldata/{nat-tftp-linux.args => nat-tftp-linux.iptables} (100%) create mode 100644 tests/networkxml2firewalldata/nat-tftp-linux.nftables rename tests/networkxml2firewalldata/{route-default-linux.args => route-default-linux.iptables} (100%) create mode 100644 tests/networkxml2firewalldata/route-default-linux.nftables -- 2.39.2
Laine Stump wrote: > This patch series enables libvirt to use nftables rules rather than > iptables *when setting up virtual networks* (it does *not* add > nftables support to the nwfilter driver). It accomplishes this by > abstracting several iptables functions (from viriptables.[ch] called > by the virtual network driver into a rudimentary "virNetfilter API" > (in virnetfilter.[ch], having the virtual network driver call the > virNetFilter API rather than calling the existing iptables functions > directly, and then finally adding an equivalent virNftables backend > that can be used instead of iptables (selected manually via a > network.conf setting, or automatically if iptables isn't found on the > host). [resend to a proper list] Hi, Apparently, I'm late to the discussion. I noticed that now I cannot use the bridge driver on FreeBSD as it's failing to initialize both iptables and nftables backends (which is expect). What would be a good way to address that? I see at least two options: 1. Add a Noop firewall driver 2. Implement a "real" FreeBSD driver based either on pf or ipfw (that's been on my TODO list forever, but I somehow got stuck on the very first step on choosing between pf and ipfw). This obviously will take much more time. Maybe there are other options I'm missing. What do you think? Thanks, Roman
On Mon, Jun 10, 2024 at 09:10:08PM +0200, Roman Bogorodskiy wrote: > Laine Stump wrote: > > > This patch series enables libvirt to use nftables rules rather than > > iptables *when setting up virtual networks* (it does *not* add > > nftables support to the nwfilter driver). It accomplishes this by > > abstracting several iptables functions (from viriptables.[ch] called > > by the virtual network driver into a rudimentary "virNetfilter API" > > (in virnetfilter.[ch], having the virtual network driver call the > > virNetFilter API rather than calling the existing iptables functions > > directly, and then finally adding an equivalent virNftables backend > > that can be used instead of iptables (selected manually via a > > network.conf setting, or automatically if iptables isn't found on the > > host). > > [resend to a proper list] > > Hi, > > Apparently, I'm late to the discussion. > > I noticed that now I cannot use the bridge driver on FreeBSD as it's > failing to initialize both iptables and nftables backends (which is > expect). > > What would be a good way to address that? I see at least two options: > > 1. Add a Noop firewall driver > 2. Implement a "real" FreeBSD driver based either on pf or ipfw (that's > been on my TODO list forever, but I somehow got stuck on the very first > step on choosing between pf and ipfw). This obviously will take much > more time. How about both :-) There will always be platforms for which no suitable FW driver exists, so a no-op driver that just returns errors for everything will be beneficial for many cases. Then you can worry about a real freebsd driver at your leisure. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Mon, Jun 10, 2024 at 09:10:08PM GMT, Roman Bogorodskiy wrote: > Laine Stump wrote: > > > This patch series enables libvirt to use nftables rules rather than > > iptables *when setting up virtual networks* (it does *not* add > > nftables support to the nwfilter driver). It accomplishes this by > > abstracting several iptables functions (from viriptables.[ch] called > > by the virtual network driver into a rudimentary "virNetfilter API" > > (in virnetfilter.[ch], having the virtual network driver call the > > virNetFilter API rather than calling the existing iptables functions > > directly, and then finally adding an equivalent virNftables backend > > that can be used instead of iptables (selected manually via a > > network.conf setting, or automatically if iptables isn't found on the > > host). > > [resend to a proper list] > > Hi, > > Apparently, I'm late to the discussion. > > I noticed that now I cannot use the bridge driver on FreeBSD as it's > failing to initialize both iptables and nftables backends (which is > expect). > > What would be a good way to address that? I see at least two options: > > 1. Add a Noop firewall driver > 2. Implement a "real" FreeBSD driver based either on pf or ipfw (that's > been on my TODO list forever, but I somehow got stuck on the very first > step on choosing between pf and ipfw). This obviously will take much > more time. > > Maybe there are other options I'm missing. > > What do you think? If I understand correctly, the new behavior might cause problems on macOS too, see [1]. I wouldn't be surprised if the other BSDs were similarly affected. I wonder how things could work before if the iptables backend is not functional. Is it possible that we used to ignore failures to initialize the backend, but now we consider them fatal instead? A proper platform-specific backend would obviously be the right approach in the long run, but the priority should be restoring the previous status quo. A noop backend might be the answer, but honestly I just don't understand enough about networking to know for sure. I thought that these firewall rules were necessary in order to give network access to VMs, but if FreeBSD has been doing fine without iptables so far clearly that's not the case? [1] https://gitlab.com/libvirt/libvirt/-/issues/642 -- Andrea Bolognani / Red Hat / Virtualization
On Tue, Jun 11, 2024 at 02:38:58AM GMT, Andrea Bolognani wrote: > On Mon, Jun 10, 2024 at 09:10:08PM GMT, Roman Bogorodskiy wrote: > > Laine Stump wrote: > > > > > This patch series enables libvirt to use nftables rules rather than > > > iptables *when setting up virtual networks* (it does *not* add > > > nftables support to the nwfilter driver). It accomplishes this by > > > abstracting several iptables functions (from viriptables.[ch] called > > > by the virtual network driver into a rudimentary "virNetfilter API" > > > (in virnetfilter.[ch], having the virtual network driver call the > > > virNetFilter API rather than calling the existing iptables functions > > > directly, and then finally adding an equivalent virNftables backend > > > that can be used instead of iptables (selected manually via a > > > network.conf setting, or automatically if iptables isn't found on the > > > host). > > > > [resend to a proper list] > > > > Hi, > > > > Apparently, I'm late to the discussion. > > > > I noticed that now I cannot use the bridge driver on FreeBSD as it's > > failing to initialize both iptables and nftables backends (which is > > expect). > > > > What would be a good way to address that? I see at least two options: > > > > 1. Add a Noop firewall driver > > 2. Implement a "real" FreeBSD driver based either on pf or ipfw (that's > > been on my TODO list forever, but I somehow got stuck on the very first > > step on choosing between pf and ipfw). This obviously will take much > > more time. > > > > Maybe there are other options I'm missing. > > > > What do you think? > > If I understand correctly, the new behavior might cause problems on > macOS too, see [1]. I wouldn't be surprised if the other BSDs were > similarly affected. > > I wonder how things could work before if the iptables backend is not > functional. Is it possible that we used to ignore failures to > initialize the backend, but now we consider them fatal instead? > > A proper platform-specific backend would obviously be the right > approach in the long run, but the priority should be restoring the > previous status quo. A noop backend might be the answer, but honestly > I just don't understand enough about networking to know for sure. I > thought that these firewall rules were necessary in order to give > network access to VMs, but if FreeBSD has been doing fine without > iptables so far clearly that's not the case? One additional issue with this: $ PATH=/usr/bin /usr/sbin/libvirtd error : virNetworkLoadDriverConfig:146 : internal error: could not find a usable firewall backend error : virStateInitialize:672 : Initialization of bridge state driver failed: internal error: could not find a usable firewall backend error : daemonRunStateInit:617 : Driver state initialization failed This happens because nft and iptables are both in /usr/sbin, so if the user's $PATH doesn't include that directory they won't be found and the driver will fail to initialize. Not a big deal on Fedora, where /usr/sbin is part of the default $PATH for users, but that's not the case on Debian, where qemu:///session is just completely broken right now. I was testing out a patch that addressed the situation by switching backend detection to virFindFileInPathFull(), but then I realized that it's fairly pointless to look for nft/iptables when a regular user can't run them anyway. So what I think we need to do is, make the failure to detect a working backend non-fatal, unless the user has explicitly asked for a specific backend to be used. That should bring us back to the previous situation. -- Andrea Bolognani / Red Hat / Virtualization
On Tue, Jun 11, 2024 at 08:49:42AM -0700, Andrea Bolognani wrote: > On Tue, Jun 11, 2024 at 02:38:58AM GMT, Andrea Bolognani wrote: > > On Mon, Jun 10, 2024 at 09:10:08PM GMT, Roman Bogorodskiy wrote: > > > Laine Stump wrote: > > > > > > > This patch series enables libvirt to use nftables rules rather than > > > > iptables *when setting up virtual networks* (it does *not* add > > > > nftables support to the nwfilter driver). It accomplishes this by > > > > abstracting several iptables functions (from viriptables.[ch] called > > > > by the virtual network driver into a rudimentary "virNetfilter API" > > > > (in virnetfilter.[ch], having the virtual network driver call the > > > > virNetFilter API rather than calling the existing iptables functions > > > > directly, and then finally adding an equivalent virNftables backend > > > > that can be used instead of iptables (selected manually via a > > > > network.conf setting, or automatically if iptables isn't found on the > > > > host). > > > > > > [resend to a proper list] > > > > > > Hi, > > > > > > Apparently, I'm late to the discussion. > > > > > > I noticed that now I cannot use the bridge driver on FreeBSD as it's > > > failing to initialize both iptables and nftables backends (which is > > > expect). > > > > > > What would be a good way to address that? I see at least two options: > > > > > > 1. Add a Noop firewall driver > > > 2. Implement a "real" FreeBSD driver based either on pf or ipfw (that's > > > been on my TODO list forever, but I somehow got stuck on the very first > > > step on choosing between pf and ipfw). This obviously will take much > > > more time. > > > > > > Maybe there are other options I'm missing. > > > > > > What do you think? > > > > If I understand correctly, the new behavior might cause problems on > > macOS too, see [1]. I wouldn't be surprised if the other BSDs were > > similarly affected. > > > > I wonder how things could work before if the iptables backend is not > > functional. Is it possible that we used to ignore failures to > > initialize the backend, but now we consider them fatal instead? > > > > A proper platform-specific backend would obviously be the right > > approach in the long run, but the priority should be restoring the > > previous status quo. A noop backend might be the answer, but honestly > > I just don't understand enough about networking to know for sure. I > > thought that these firewall rules were necessary in order to give > > network access to VMs, but if FreeBSD has been doing fine without > > iptables so far clearly that's not the case? > > One additional issue with this: > > $ PATH=/usr/bin /usr/sbin/libvirtd > error : virNetworkLoadDriverConfig:146 : internal error: could not > find a usable firewall backend > error : virStateInitialize:672 : Initialization of bridge state > driver failed: internal error: could not find a usable firewall > backend > error : daemonRunStateInit:617 : Driver state initialization failed > > This happens because nft and iptables are both in /usr/sbin, so if > the user's $PATH doesn't include that directory they won't be found > and the driver will fail to initialize. > > Not a big deal on Fedora, where /usr/sbin is part of the default > $PATH for users, but that's not the case on Debian, where > qemu:///session is just completely broken right now. > > I was testing out a patch that addressed the situation by switching > backend detection to virFindFileInPathFull(), but then I realized > that it's fairly pointless to look for nft/iptables when a regular > user can't run them anyway. > > So what I think we need to do is, make the failure to detect a > working backend non-fatal, unless the user has explicitly asked for a > specific backend to be used. That should bring us back to the > previous situation. This is probably another reason to have a "no op" backend that merely raises errors for every operation - see my Roman's mail about FreeBSD With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Tue, Jun 11, 2024 at 05:27:42PM GMT, Daniel P. Berrangé wrote: > On Tue, Jun 11, 2024 at 08:49:42AM -0700, Andrea Bolognani wrote: > > One additional issue with this: > > > > $ PATH=/usr/bin /usr/sbin/libvirtd > > error : virNetworkLoadDriverConfig:146 : internal error: could not > > find a usable firewall backend > > error : virStateInitialize:672 : Initialization of bridge state > > driver failed: internal error: could not find a usable firewall > > backend > > error : daemonRunStateInit:617 : Driver state initialization failed > > > > This happens because nft and iptables are both in /usr/sbin, so if > > the user's $PATH doesn't include that directory they won't be found > > and the driver will fail to initialize. > > > > Not a big deal on Fedora, where /usr/sbin is part of the default > > $PATH for users, but that's not the case on Debian, where > > qemu:///session is just completely broken right now. > > > > I was testing out a patch that addressed the situation by switching > > backend detection to virFindFileInPathFull(), but then I realized > > that it's fairly pointless to look for nft/iptables when a regular > > user can't run them anyway. > > > > So what I think we need to do is, make the failure to detect a > > working backend non-fatal, unless the user has explicitly asked for a > > specific backend to be used. That should bring us back to the > > previous situation. > > This is probably another reason to have a "no op" backend that merely > raises errors for every operation - see my Roman's mail about FreeBSD Is there much of a difference between having an explicit noop backend that is checked for availability after all other ones, and simply not failing to initialize the driver if a backend can't be found? Anyway, I'd be happy with either solution. I'm still unclear on how networking on FreeBSD could work at all until now. Aren't the iptables rules needed for guest connectivity? Or did I misunderstand their purpose? -- Andrea Bolognani / Red Hat / Virtualization
On Wed, Jun 12, 2024 at 01:54:47AM -0700, Andrea Bolognani wrote: > On Tue, Jun 11, 2024 at 05:27:42PM GMT, Daniel P. Berrangé wrote: > > On Tue, Jun 11, 2024 at 08:49:42AM -0700, Andrea Bolognani wrote: > > > One additional issue with this: > > > > > > $ PATH=/usr/bin /usr/sbin/libvirtd > > > error : virNetworkLoadDriverConfig:146 : internal error: could not > > > find a usable firewall backend > > > error : virStateInitialize:672 : Initialization of bridge state > > > driver failed: internal error: could not find a usable firewall > > > backend > > > error : daemonRunStateInit:617 : Driver state initialization failed > > > > > > This happens because nft and iptables are both in /usr/sbin, so if > > > the user's $PATH doesn't include that directory they won't be found > > > and the driver will fail to initialize. > > > > > > Not a big deal on Fedora, where /usr/sbin is part of the default > > > $PATH for users, but that's not the case on Debian, where > > > qemu:///session is just completely broken right now. > > > > > > I was testing out a patch that addressed the situation by switching > > > backend detection to virFindFileInPathFull(), but then I realized > > > that it's fairly pointless to look for nft/iptables when a regular > > > user can't run them anyway. > > > > > > So what I think we need to do is, make the failure to detect a > > > working backend non-fatal, unless the user has explicitly asked for a > > > specific backend to be used. That should bring us back to the > > > previous situation. > > > > This is probably another reason to have a "no op" backend that merely > > raises errors for every operation - see my Roman's mail about FreeBSD > > Is there much of a difference between having an explicit noop backend > that is checked for availability after all other ones, and simply not > failing to initialize the driver if a backend can't be found? I actually sent a patch for the latter last night > > Anyway, I'd be happy with either solution. > > I'm still unclear on how networking on FreeBSD could work at all > until now. Aren't the iptables rules needed for guest connectivity? > Or did I misunderstand their purpose? It wouldn't have worked, but the problem is that we now kill the entire libvirtd startup, instead of successfully starting a (broken) network driver. Both are broken, but now the brokenness has spread to the bits that do matter. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Wed, Jun 12, 2024 at 09:57:15AM GMT, Daniel P. Berrangé wrote: > On Wed, Jun 12, 2024 at 01:54:47AM -0700, Andrea Bolognani wrote: > > Is there much of a difference between having an explicit noop backend > > that is checked for availability after all other ones, and simply not > > failing to initialize the driver if a backend can't be found? > > I actually sent a patch for the latter last night Awesome, thanks! > > I'm still unclear on how networking on FreeBSD could work at all > > until now. Aren't the iptables rules needed for guest connectivity? > > Or did I misunderstand their purpose? > > It wouldn't have worked, but the problem is that we now kill the > entire libvirtd startup, instead of successfully starting a (broken) > network driver. Both are broken, but now the brokenness has spread > to the bits that do matter. I get that, it's just that I'd be extremely surprised to learn that guest network connectivity hasn't worked on FreeBSD all this time. Surely that can't be right! Roman, what am I missing? -- Andrea Bolognani / Red Hat / Virtualization
On Wed, Jun 12, 2024 at 03:27:24AM -0700, Andrea Bolognani wrote: > On Wed, Jun 12, 2024 at 09:57:15AM GMT, Daniel P. Berrangé wrote: > > On Wed, Jun 12, 2024 at 01:54:47AM -0700, Andrea Bolognani wrote: > > > Is there much of a difference between having an explicit noop backend > > > that is checked for availability after all other ones, and simply not > > > failing to initialize the driver if a backend can't be found? > > > > I actually sent a patch for the latter last night > > Awesome, thanks! > > > > I'm still unclear on how networking on FreeBSD could work at all > > > until now. Aren't the iptables rules needed for guest connectivity? > > > Or did I misunderstand their purpose? > > > > It wouldn't have worked, but the problem is that we now kill the > > entire libvirtd startup, instead of successfully starting a (broken) > > network driver. Both are broken, but now the brokenness has spread > > to the bits that do matter. > > I get that, it's just that I'd be extremely surprised to learn that > guest network connectivity hasn't worked on FreeBSD all this time. > Surely that can't be right! Roman, what am I missing? This is only the libvirt virtual network backend. I presume BSD hosted guests could just use one of the other network backend options. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 6/12/24 6:47 AM, Daniel P. Berrangé wrote: > On Wed, Jun 12, 2024 at 03:27:24AM -0700, Andrea Bolognani wrote: >> On Wed, Jun 12, 2024 at 09:57:15AM GMT, Daniel P. Berrangé wrote: >>> On Wed, Jun 12, 2024 at 01:54:47AM -0700, Andrea Bolognani wrote: >>>> Is there much of a difference between having an explicit noop backend >>>> that is checked for availability after all other ones, and simply not >>>> failing to initialize the driver if a backend can't be found? >>> >>> I actually sent a patch for the latter last night >> >> Awesome, thanks! >> >>>> I'm still unclear on how networking on FreeBSD could work at all >>>> until now. Aren't the iptables rules needed for guest connectivity? >>>> Or did I misunderstand their purpose? >>> >>> It wouldn't have worked, but the problem is that we now kill the >>> entire libvirtd startup, instead of successfully starting a (broken) >>> network driver. Both are broken, but now the brokenness has spread >>> to the bits that do matter. >> >> I get that, it's just that I'd be extremely surprised to learn that >> guest network connectivity hasn't worked on FreeBSD all this time. >> Surely that can't be right! Roman, what am I missing? > > This is only the libvirt virtual network backend. I presume BSD hosted > guests could just use one of the other network backend options. > Based on the wording of Roman's initial message, I wondered if possibly people had been using the virtual network driver with <forward mode='open'/> - this wouldn't ever call any firewall functions, so it should succeed. I'm pretty sure none of the other network types are supported on BSD (macvtap/direct, or pools of SRIOV VFs used via VFIO device assignment). (I had asked about this in a reply night before last, but I think it wasn't seen by anyone because I replied to his first message that was accidentally sent to the old list and I'd iniially just hit reply (sending my reply to the old list too), then re-sent the message to the new list, but I think my email client changed the In-Reply-To: so it wasn't properly threaded and appeared separate from the rest of the thread.)
On Wed, Jun 12, 2024 at 08:42:48AM GMT, Laine Stump wrote: > On 6/12/24 6:47 AM, Daniel P. Berrangé wrote: > > On Wed, Jun 12, 2024 at 03:27:24AM -0700, Andrea Bolognani wrote: > > > [...] I'd be extremely surprised to learn that > > > guest network connectivity hasn't worked on FreeBSD all this time. > > > Surely that can't be right! Roman, what am I missing? > > > > This is only the libvirt virtual network backend. I presume BSD hosted > > guests could just use one of the other network backend options. > > Based on the wording of Roman's initial message, I wondered if possibly > people had been using the virtual network driver with <forward mode='open'/> > - this wouldn't ever call any firewall functions, so it should succeed. It looks like it fails before it can even get to the point where firewall rules would be created: # virsh net-start default error: Failed to start network default error: Unable to create bridge device: Invalid argument For reference, here's what the configuration looks like: # virsh net-dumpxml default <network> <name>default</name> <uuid>2bd47e50-eab7-4988-b7a5-7da41a53f9c8</uuid> <forward mode='open'/> <bridge name='virbr0' stp='on' delay='0'/> <mac address='52:54:00:f2:ce:e4'/> <ip address='192.168.122.1' netmask='255.255.255.0'> <dhcp> <range start='192.168.122.2' end='192.168.122.254'/> </dhcp> </ip> </network> > I'm > pretty sure none of the other network types are supported on BSD > (macvtap/direct, or pools of SRIOV VFs used via VFIO device assignment). Maybe <interface type='bridge'> works? I'm not even sure why the network driver is enabled on FreeBSD in the first place. Only the QEMU driver can use it, right? And that's compiled out by default on FreeBSD, if I'm interpreting the port[1] correctly. So, at the very least, I would expect the network driver to only be enabled when the QEMU driver is, i.e. not in the default binary package. [1] https://github.com/freebsd/freebsd-ports/blob/main/devel/libvirt/Makefile -- Andrea Bolognani / Red Hat / Virtualization
On 6/12/24 9:18 AM, Andrea Bolognani wrote: > On Wed, Jun 12, 2024 at 08:42:48AM GMT, Laine Stump wrote: >> On 6/12/24 6:47 AM, Daniel P. Berrangé wrote: >>> On Wed, Jun 12, 2024 at 03:27:24AM -0700, Andrea Bolognani wrote: >>>> [...] I'd be extremely surprised to learn that >>>> guest network connectivity hasn't worked on FreeBSD all this time. >>>> Surely that can't be right! Roman, what am I missing? >>> >>> This is only the libvirt virtual network backend. I presume BSD hosted >>> guests could just use one of the other network backend options. >> >> Based on the wording of Roman's initial message, I wondered if possibly >> people had been using the virtual network driver with <forward mode='open'/> >> - this wouldn't ever call any firewall functions, so it should succeed. > > It looks like it fails before it can even get to the point where > firewall rules would be created: > > # virsh net-start default > error: Failed to start network default > error: Unable to create bridge device: Invalid argument Okay, then I guess I read too much into what Roman said: I noticed that now I cannot use the bridge driver on FreeBSD as it's failing to initialize both iptables and nftables backends I figured that meant the bridge driver (aka the network driver) had previously been usable on FreeBSD, but if your test is typical, then that's not the case; maybe only <interface type='bridge'> works, and Roman just assumed that the network driver was needed in order for that to function. If a platform supports standard tap devices (which FreeBSD does), the three things the network driver needs to function properly are 1) a functioning firewall backend, 2) dnsmasq, and 3) the ability to manage a bridge device (all the functions in virnetdevbridge.c). (1) is obviously missing, but (2) is present on FreeBSD, and it looks like, at least for some *BSDs, (3) is also available (there is a build-time flag WITH_BSD_BRIDGE_MGMT that is set if certain ioctls are defined in net/if_bridgevar.h). Is WITH_BSD_BRIDGE_MGMT set in your FreeBSD build? Does net/if_bridgevar.h exist? > > For reference, here's what the configuration looks like: > > # virsh net-dumpxml default > <network> > <name>default</name> > <uuid>2bd47e50-eab7-4988-b7a5-7da41a53f9c8</uuid> > <forward mode='open'/> > <bridge name='virbr0' stp='on' delay='0'/> > <mac address='52:54:00:f2:ce:e4'/> > <ip address='192.168.122.1' netmask='255.255.255.0'> > <dhcp> > <range start='192.168.122.2' end='192.168.122.254'/> > </dhcp> > </ip> > </network> > >> I'm >> pretty sure none of the other network types are supported on BSD >> (macvtap/direct, or pools of SRIOV VFs used via VFIO device assignment). > > Maybe <interface type='bridge'> works? > > I'm not even sure why the network driver is enabled on FreeBSD in the > first place. Only the QEMU driver can use it, right? And that's > compiled out by default on FreeBSD, if I'm interpreting the port[1] > correctly. So, at the very least, I would expect the network driver > to only be enabled when the QEMU driver is, i.e. not in the default > binary package. > > > [1] https://github.com/freebsd/freebsd-ports/blob/main/devel/libvirt/Makefile
On Wed, Jun 12, 2024 at 10:42:43AM GMT, Laine Stump wrote: > On 6/12/24 9:18 AM, Andrea Bolognani wrote: > > On Wed, Jun 12, 2024 at 08:42:48AM GMT, Laine Stump wrote: > > > On 6/12/24 6:47 AM, Daniel P. Berrangé wrote: > > > > On Wed, Jun 12, 2024 at 03:27:24AM -0700, Andrea Bolognani wrote: > > > > > [...] I'd be extremely surprised to learn that > > > > > guest network connectivity hasn't worked on FreeBSD all this time. > > > > > Surely that can't be right! Roman, what am I missing? > > > > > > > > This is only the libvirt virtual network backend. I presume BSD hosted > > > > guests could just use one of the other network backend options. > > > > > > Based on the wording of Roman's initial message, I wondered if possibly > > > people had been using the virtual network driver with <forward mode='open'/> > > > - this wouldn't ever call any firewall functions, so it should succeed. > > > > It looks like it fails before it can even get to the point where > > firewall rules would be created: > > > > # virsh net-start default > > error: Failed to start network default > > error: Unable to create bridge device: Invalid argument > > Okay, then I guess I read too much into what Roman said: > > I noticed that now I cannot use the bridge driver > on FreeBSD as it's failing to initialize both > iptables and nftables backends > > I figured that meant the bridge driver (aka the network driver) had > previously been usable on FreeBSD, but if your test is typical, then that's > not the case; maybe only <interface type='bridge'> works, and Roman just > assumed that the network driver was needed in order for that to function. This is a bog-standard FreeBSD installation, so whatever I'm seeing is the out-of-the-box experience. Maybe there's a way to get things working with further tweaking, I don't know. > If a platform supports standard tap devices (which FreeBSD does), the three > things the network driver needs to function properly are 1) a functioning > firewall backend, 2) dnsmasq, and 3) the ability to manage a bridge device > (all the functions in virnetdevbridge.c). (1) is obviously missing, but (2) > is present on FreeBSD, and it looks like, at least for some *BSDs, (3) is > also available (there is a build-time flag WITH_BSD_BRIDGE_MGMT that is set > if certain ioctls are defined in net/if_bridgevar.h). > > Is WITH_BSD_BRIDGE_MGMT set in your FreeBSD build? Yes. > Does net/if_bridgevar.h exist? Also yes. > > I'm not even sure why the network driver is enabled on FreeBSD in the > > first place. Only the QEMU driver can use it, right? And that's > > compiled out by default on FreeBSD, if I'm interpreting the port[1] > > correctly. So, at the very least, I would expect the network driver > > to only be enabled when the QEMU driver is, i.e. not in the default > > binary package. More on this. When installing libvirt, the following message is printed out: NOTE ON CONFIGURATION: The libvirt port does not come with networking configuration enabled. The 'default' network definition is available at: /usr/local/share/examples/libvirt/networks/default.xml To enable this network please do the following: cp /usr/local/share/examples/libvirt/networks/default.xml /usr/local/etc/libvirt/qemu/networks To configure this network for autostart, execute the following: ln -s ../default.xml /usr/local/etc/libvirt/qemu/networks/autostart/default.xml If you have libvirtd already running you'll need to restart it for changes to take effect. As seen earlier, these instructions are pointless, as the default network won't be able to start. But the fact that they exist at all seems to indicate that, at some point, the default network could work on FreeBSD? -- Andrea Bolognani / Red Hat / Virtualization
On 6/12/24 11:46 AM, Andrea Bolognani wrote: > On Wed, Jun 12, 2024 at 10:42:43AM GMT, Laine Stump wrote: >> On 6/12/24 9:18 AM, Andrea Bolognani wrote: >>> On Wed, Jun 12, 2024 at 08:42:48AM GMT, Laine Stump wrote: >>>> On 6/12/24 6:47 AM, Daniel P. Berrangé wrote: >>>>> On Wed, Jun 12, 2024 at 03:27:24AM -0700, Andrea Bolognani wrote: >>>>>> [...] I'd be extremely surprised to learn that >>>>>> guest network connectivity hasn't worked on FreeBSD all this time. >>>>>> Surely that can't be right! Roman, what am I missing? >>>>> >>>>> This is only the libvirt virtual network backend. I presume BSD hosted >>>>> guests could just use one of the other network backend options. >>>> >>>> Based on the wording of Roman's initial message, I wondered if possibly >>>> people had been using the virtual network driver with <forward mode='open'/> >>>> - this wouldn't ever call any firewall functions, so it should succeed. >>> >>> It looks like it fails before it can even get to the point where >>> firewall rules would be created: >>> >>> # virsh net-start default >>> error: Failed to start network default >>> error: Unable to create bridge device: Invalid argument >> >> Okay, then I guess I read too much into what Roman said: >> >> I noticed that now I cannot use the bridge driver >> on FreeBSD as it's failing to initialize both >> iptables and nftables backends >> >> I figured that meant the bridge driver (aka the network driver) had >> previously been usable on FreeBSD, but if your test is typical, then that's >> not the case; maybe only <interface type='bridge'> works, and Roman just >> assumed that the network driver was needed in order for that to function. > > This is a bog-standard FreeBSD installation, so whatever I'm seeing > is the out-of-the-box experience. Maybe there's a way to get things > working with further tweaking, I don't know. > >> If a platform supports standard tap devices (which FreeBSD does), the three >> things the network driver needs to function properly are 1) a functioning >> firewall backend, 2) dnsmasq, and 3) the ability to manage a bridge device >> (all the functions in virnetdevbridge.c). (1) is obviously missing, but (2) >> is present on FreeBSD, and it looks like, at least for some *BSDs, (3) is >> also available (there is a build-time flag WITH_BSD_BRIDGE_MGMT that is set >> if certain ioctls are defined in net/if_bridgevar.h). >> >> Is WITH_BSD_BRIDGE_MGMT set in your FreeBSD build? > > Yes. > >> Does net/if_bridgevar.h exist? > > Also yes. > >>> I'm not even sure why the network driver is enabled on FreeBSD in the >>> first place. Only the QEMU driver can use it, right? And that's >>> compiled out by default on FreeBSD, if I'm interpreting the port[1] >>> correctly. So, at the very least, I would expect the network driver >>> to only be enabled when the QEMU driver is, i.e. not in the default >>> binary package. > > More on this. When installing libvirt, the following message is > printed out: > > NOTE ON CONFIGURATION: > > The libvirt port does not come with networking configuration enabled. > The 'default' network definition is available at: > > /usr/local/share/examples/libvirt/networks/default.xml > > To enable this network please do the following: > > cp /usr/local/share/examples/libvirt/networks/default.xml > /usr/local/etc/libvirt/qemu/networks > > To configure this network for autostart, execute the following: > > ln -s ../default.xml > /usr/local/etc/libvirt/qemu/networks/autostart/default.xml > > If you have libvirtd already running you'll need to restart it for changes > to take effect. > > As seen earlier, these instructions are pointless, as the default > network won't be able to start. But the fact that they exist at all > seems to indicate that, at some point, the default network could work > on FreeBSD? That is very strange, since the default network uses NAT, which requires firewall functionality (unless they've modified the contents of default.xml), so I don't see how it could have ever worked.
On Sun, Apr 30, 2023 at 11:19:15PM -0400, Laine Stump wrote: > This patch series enables libvirt to use nftables rules rather than > iptables *when setting up virtual networks* (it does *not* add > nftables support to the nwfilter driver). It accomplishes this by > abstracting several iptables functions (from viriptables.[ch] called > by the virtual network driver into a rudimentary "virNetfilter API" > (in virnetfilter.[ch], having the virtual network driver call the > virNetFilter API rather than calling the existing iptables functions > directly, and then finally adding an equivalent virNftables backend > that can be used instead of iptables (selected manually via a > network.conf setting, or automatically if iptables isn't found on the > host). > > A first look at the result may have you thinking that it's filled with > a lot of bad decisions. While I would agree with that in many cases, I > think that overall they are the "least bad" decisions, or at least > "bad within acceptable limits / no worse than something else", and > point out that it's been done in a way that minimizes (actually > eliminates) the need for immediate changes to nwfilter (the other > consumer of iptables, which *also* needs to be updated to use native > nftables), and makes it much easier to change our mind about the > details in the future. > > When I first started on this (long, protracted, repeatedly interrupted > for extended periods - many of these patches are > a year old) task, I > considered doing an all-at-once complete replacement of iptables with > nftables, since all the Linux distros we support have had nftables for > several years, and I'm pretty sure nobody has it disabled (not even > sure if it's possible to disable nftables while still enabling > iptables, since they both use xtables in the kernel). But due to > libvirt's use of "-t mangle -j CHECKSUM --checksum-fill" (see commit > fd5b15ff all the way back in July 2010 for details) which has no > equivalent in nftables rules (and we don't *want* it to!!), and the > desire to be able to easily switch back to iptables in case of an > unforeseen regression, we decided that both iptables and nftables need > to be supported (for now), with the default (for now) remaining as > iptables. > > Just allowing for dual backends complicated matters, since it means > that we have to have a config file, a setting, detection of which > backends are available, and of course some sort of concept of an > abstracted frontend that can use either backend based on the config > setting (and/or auto-detection). Combining that with the fact that it > would just be "too big" of a project to switch over nwfilter's > iptables usage at the same time means that we have to keep around a > lot of existing code for compatibility's sake rather than just wiping > it all away and starting over. > > So, what I've ended up with is: > > 1) a network.conf file (didn't exist before) with a single setting > "firewall_backend". If unset, the network driver tries to use iptables > on the backend, and if that's missing, then tries to use nftables. When testing your git branch active-nft-10 leavnig it unset didn't work: Running './src/libvirtd'... 2023-05-04 10:16:11.447+0000: 115377: info : libvirt version: 9.3.0 2023-05-04 10:16:11.447+0000: 115377: info : hostname: localhost.localdomain 2023-05-04 10:16:11.447+0000: 115377: error : virFirewallNew:118 : internal error: firewall_backend wasn't set, and no usable setting could be auto-detected 2023-05-04 10:16:11.447+0000: 115377: error : virNetFilterBackendUnsetError:51 : internal error: firewall_backend wasn't set, and no usable setting could be auto-detected 2023-05-04 10:16:11.447+0000: 115377: error : virNetFilterBackendUnsetError:51 : internal error: firewall_backend wasn't set, and no usable setting could be auto-detected 2023-05-04 10:16:11.473+0000: 115377: error : virFirewallNew:118 : internal error: firewall_backend wasn't set, and no usable setting could be auto-detected With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 5/4/23 6:47 AM, Daniel P. Berrangé wrote: > On Sun, Apr 30, 2023 at 11:19:15PM -0400, Laine Stump wrote: >> This patch series enables libvirt to use nftables rules rather than >> iptables *when setting up virtual networks* (it does *not* add >> nftables support to the nwfilter driver). It accomplishes this by >> abstracting several iptables functions (from viriptables.[ch] called >> by the virtual network driver into a rudimentary "virNetfilter API" >> (in virnetfilter.[ch], having the virtual network driver call the >> virNetFilter API rather than calling the existing iptables functions >> directly, and then finally adding an equivalent virNftables backend >> that can be used instead of iptables (selected manually via a >> network.conf setting, or automatically if iptables isn't found on the >> host). >> >> A first look at the result may have you thinking that it's filled with >> a lot of bad decisions. While I would agree with that in many cases, I >> think that overall they are the "least bad" decisions, or at least >> "bad within acceptable limits / no worse than something else", and >> point out that it's been done in a way that minimizes (actually >> eliminates) the need for immediate changes to nwfilter (the other >> consumer of iptables, which *also* needs to be updated to use native >> nftables), and makes it much easier to change our mind about the >> details in the future. >> >> When I first started on this (long, protracted, repeatedly interrupted >> for extended periods - many of these patches are > a year old) task, I >> considered doing an all-at-once complete replacement of iptables with >> nftables, since all the Linux distros we support have had nftables for >> several years, and I'm pretty sure nobody has it disabled (not even >> sure if it's possible to disable nftables while still enabling >> iptables, since they both use xtables in the kernel). But due to >> libvirt's use of "-t mangle -j CHECKSUM --checksum-fill" (see commit >> fd5b15ff all the way back in July 2010 for details) which has no >> equivalent in nftables rules (and we don't *want* it to!!), and the >> desire to be able to easily switch back to iptables in case of an >> unforeseen regression, we decided that both iptables and nftables need >> to be supported (for now), with the default (for now) remaining as >> iptables. >> >> Just allowing for dual backends complicated matters, since it means >> that we have to have a config file, a setting, detection of which >> backends are available, and of course some sort of concept of an >> abstracted frontend that can use either backend based on the config >> setting (and/or auto-detection). Combining that with the fact that it >> would just be "too big" of a project to switch over nwfilter's >> iptables usage at the same time means that we have to keep around a >> lot of existing code for compatibility's sake rather than just wiping >> it all away and starting over. >> >> So, what I've ended up with is: >> >> 1) a network.conf file (didn't exist before) with a single setting >> "firewall_backend". If unset, the network driver tries to use iptables >> on the backend, and if that's missing, then tries to use nftables. > > When testing your git branch active-nft-10 leavnig it unset didn't > work: > > Running './src/libvirtd'... > 2023-05-04 10:16:11.447+0000: 115377: info : libvirt version: 9.3.0 > 2023-05-04 10:16:11.447+0000: 115377: info : hostname: localhost.localdomain > 2023-05-04 10:16:11.447+0000: 115377: error : virFirewallNew:118 : internal error: firewall_backend wasn't set, and no usable setting could be auto-detected > 2023-05-04 10:16:11.447+0000: 115377: error : virNetFilterBackendUnsetError:51 : internal error: firewall_backend wasn't set, and no usable setting could be auto-detected > 2023-05-04 10:16:11.447+0000: 115377: error : virNetFilterBackendUnsetError:51 : internal error: firewall_backend wasn't set, and no usable setting could be auto-detected > 2023-05-04 10:16:11.473+0000: 115377: error : virFirewallNew:118 : internal error: firewall_backend wasn't set, and no usable setting could be auto-detected Ugh :-( I guess I did brak something with all the pre-posting cleanups after my last round of testing, as that was working just fine as of the end of last week.. I'll see what I broke, and if it's simple I'll update the branch on gitlab so that you can try it out if you want (even though it's obvious I need to change some things)
On Sun, Apr 30, 2023 at 11:19:15PM -0400, Laine Stump wrote: > When I first started on this (long, protracted, repeatedly interrupted > for extended periods - many of these patches are > a year old) task, I > considered doing an all-at-once complete replacement of iptables with > nftables, since all the Linux distros we support have had nftables for > several years, and I'm pretty sure nobody has it disabled (not even > sure if it's possible to disable nftables while still enabling > iptables, since they both use xtables in the kernel). But due to > libvirt's use of "-t mangle -j CHECKSUM --checksum-fill" (see commit > fd5b15ff all the way back in July 2010 for details) which has no > equivalent in nftables rules (and we don't *want* it to!!), and the > desire to be able to easily switch back to iptables in case of an > unforeseen regression, we decided that both iptables and nftables need > to be supported (for now), with the default (for now) remaining as > iptables. What happens if you use iptables -j CHECKSUM --checksum-fill, and this is just the iptables compat shim that is secretly talking to NFT in the kernel ? Is the checksum stuff just ignored entirely then ? If so, then the key decision factor for us, is whether any of the supported distros still officially support use of the iptables KMOD instead of nft KMOD. > So, what I've ended up with is: > > 1) a network.conf file (didn't exist before) with a single setting > "firewall_backend". If unset, the network driver tries to use iptables > on the backend, and if that's missing, then tries to use nftables. For most distros I feel like nftables should be the default and not require a user opt in. That raises the question of how do we deal with the upgrade path. Historically when starting libvirt we'll blow away our existing iptables rules and create them fresh. For an upgrade path we'll need a variant which blows away iptables rules and instead creates nftables rules. If that works, then we could default to nftables without user config. > We had spent some time in IRC discussing different ways of using new > functionality available in nftables to make a more > efficient/performant implemention of the desired filtering, and there > are some really great possibilities that need to be explored, but in > the end there were too many details up in the air, and I decided that > it would be more "accomplishable" (coined a new word there!) to first > replicate existing behavior with nftables, but do it inside a > framework that makes it easy to modify the details in the future (in > particular making it painless to switch back and forth between builds > with differing filter models at runtime) - this way we'll be able to > separate the infrastructure work from the details of the rules (which > we can then more easily work on and experiment with). (This implies > that the main objective right now is "get rid of iptables > dependencies", not "make the filtering faster and more efficient"). I feel like filtering wrt virtual networks isn't our main performance bottleneck. Most machines only have a couple of virtual networks, so most host traffic only has to match a few rules. With nwfilter though, if we have 1000 VMs we'll have 1000 top level rules that need evaluating for every packet. IOW, the performance optimizations allowed by nftables are much more relevant to nwfilter. > * allows switching between iptables/nftables backends without > rebooting or restarting networks/guests. > > Because the commands required to remove a network's filter rules are > now saved in the network status XML, each time libvirtd (or > virtnetworkd) is restarted, it will execute exactly the commands > needed to remove the filter rules that had been added by the > previous libvirtd/virtnetworkd (rather than just making a guess, as > we've always done up until now), and then add new rules using the > current backend+binary's set of rules (while also saving the info > needed for future removal of these new rules back into the network's > status XML). Ok that's answering my question about upgrades then. It looks like we should be able to default to nftables out of the box, if we assume that absence of info in the network status XML implies iptables. > * virFirewall does *not* provide a backend-agnostic interface [this is fine] > > * We need to maintain a backward-compatible API for virFirewall so > that we don't have to touch nwfilter code. Trying to make its API > backend-agnostic would require individually considering/changing > every nwfilter use of virFirewall. > > * instead virFirewall objects are just a way to build a collection > of commands to execute to build a firewall, then execute them > while collecting info for and building a collection of commands > that will tear down that firewall in the future. > > Do I want to "fix" this in the future by making virFirewall a higher > level interface that accepts tokens describing the type of rule to > add (rather than backend-specific arguments to a backend-specific > command)? No. I think I like the way virFirewall works (as > described in that previous bullet-point), instead I'm thinking that > it is just slightly mis-named - I've lately been thinking of it as a > "virNetFilterCmdList". Similarly, the virFirewallRules that it has a > list of aren't really "rules", they are better described as commands > or actions, so maybe they should be renamed to virNetfilterCmd or > virNetfilterAction. But that is just cosmetic, so I didn't want to > get into it in these patches (especially in case someone disagrees, > or has a better idea for naming). The main thing I'd like to see is to maintain the clean split of code between what is general purpose, and what is specific to the virtual network driver and what is specific to the nwfilter driver. AFAICT, this series mixes that up with circular dependancies between viriptables.c, virfirewall.c and virnetfilter.c. I'd like virfirwall.c to remain indepent of anything related to the virtual network driver. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 5/3/23 11:40 AM, Daniel P. Berrangé wrote: > On Sun, Apr 30, 2023 at 11:19:15PM -0400, Laine Stump wrote: >> When I first started on this (long, protracted, repeatedly interrupted >> for extended periods - many of these patches are > a year old) task, I >> considered doing an all-at-once complete replacement of iptables with >> nftables, since all the Linux distros we support have had nftables for >> several years, and I'm pretty sure nobody has it disabled (not even >> sure if it's possible to disable nftables while still enabling >> iptables, since they both use xtables in the kernel). But due to >> libvirt's use of "-t mangle -j CHECKSUM --checksum-fill" (see commit >> fd5b15ff all the way back in July 2010 for details) which has no >> equivalent in nftables rules (and we don't *want* it to!!), and the >> desire to be able to easily switch back to iptables in case of an >> unforeseen regression, we decided that both iptables and nftables need >> to be supported (for now), with the default (for now) remaining as >> iptables. > > What happens if you use iptables -j CHECKSUM --checksum-fill, and > this is just the iptables compat shim that is secretly talking to > NFT in the kernel ? Is the checksum stuff just ignored entirely > then ? > > If so, then the key decision factor for us, is whether any of the > supported distros still officially support use of the iptables KMOD > instead of nft KMOD. > >> So, what I've ended up with is: >> >> 1) a network.conf file (didn't exist before) with a single setting >> "firewall_backend". If unset, the network driver tries to use iptables >> on the backend, and if that's missing, then tries to use nftables. > > For most distros I feel like nftables should be the default > and not require a user opt in. That raises the question of > how do we deal with the upgrade path. > > Historically when starting libvirt we'll blow away our existing > iptables rules and create them fresh. For an upgrade path we'll > need a variant which blows away iptables rules and instead creates > nftables rules. If that works, then we could default to nftables > without user config. Since my patches do that, I would be totally fine with preferring nftables over iptables when no config is present. It would definitely give a better feeling of "modernizing" if we could do that. (Actually, I was considering that when I tweak Eric Garver's patches for native firewalld networks, I would prefer firewalld over nftables if firewalld.service is running. In that case, the order of preference (still overrideable, of course) would be firewalld->nftables->iptables, which from a functionality standpoint makes the most sense.) > > >> We had spent some time in IRC discussing different ways of using new >> functionality available in nftables to make a more >> efficient/performant implemention of the desired filtering, and there >> are some really great possibilities that need to be explored, but in >> the end there were too many details up in the air, and I decided that >> it would be more "accomplishable" (coined a new word there!) to first >> replicate existing behavior with nftables, but do it inside a >> framework that makes it easy to modify the details in the future (in >> particular making it painless to switch back and forth between builds >> with differing filter models at runtime) - this way we'll be able to >> separate the infrastructure work from the details of the rules (which >> we can then more easily work on and experiment with). (This implies >> that the main objective right now is "get rid of iptables >> dependencies", not "make the filtering faster and more efficient"). > > I feel like filtering wrt virtual networks isn't our main > performance bottleneck. Most machines only have a couple > of virtual networks, so most host traffic only has to > match a few rules. > > With nwfilter though, if we have 1000 VMs we'll have > 1000 top level rules that need evaluating for every > packet. IOW, the performance optimizations allowed > by nftables are much more relevant to nwfilter. That's a good point, and makes me feel better about not trying to immediately optimize the rules created here. (although I do recall someone once a long time ago who filed a bug about poor performance when they had > 500 virtual networks :-P) (OTOH, I've been reading through your response to 18/28, and what you're suggesting there, in the name of simpler removal rather than runtime performance, is changing what rules are generated in the initial version. I'm still wrapping my head around that - I had confused "chain" with "hook" in a strange way, and thought that the behavior was different than it apparently is). > >> * allows switching between iptables/nftables backends without >> rebooting or restarting networks/guests. >> >> Because the commands required to remove a network's filter rules are >> now saved in the network status XML, each time libvirtd (or >> virtnetworkd) is restarted, it will execute exactly the commands >> needed to remove the filter rules that had been added by the >> previous libvirtd/virtnetworkd (rather than just making a guess, as >> we've always done up until now), and then add new rules using the >> current backend+binary's set of rules (while also saving the info >> needed for future removal of these new rules back into the network's >> status XML). > > Ok that's answering my question about upgrades then. It looks like we > should be able to default to nftables out of the box, if we assume > that absence of info in the network status XML implies iptables. Yep, that's what the code does - if there is no <fwRemoval> element, then we assume that the current network instance was setup with iptables rules. > >> * virFirewall does *not* provide a backend-agnostic interface [this is fine] >> >> * We need to maintain a backward-compatible API for virFirewall so >> that we don't have to touch nwfilter code. Trying to make its API >> backend-agnostic would require individually considering/changing >> every nwfilter use of virFirewall. >> >> * instead virFirewall objects are just a way to build a collection >> of commands to execute to build a firewall, then execute them >> while collecting info for and building a collection of commands >> that will tear down that firewall in the future. >> >> Do I want to "fix" this in the future by making virFirewall a higher >> level interface that accepts tokens describing the type of rule to >> add (rather than backend-specific arguments to a backend-specific >> command)? No. I think I like the way virFirewall works (as >> described in that previous bullet-point), instead I'm thinking that >> it is just slightly mis-named - I've lately been thinking of it as a >> "virNetFilterCmdList". Similarly, the virFirewallRules that it has a >> list of aren't really "rules", they are better described as commands >> or actions, so maybe they should be renamed to virNetfilterCmd or >> virNetfilterAction. But that is just cosmetic, so I didn't want to >> get into it in these patches (especially in case someone disagrees, >> or has a better idea for naming). > > The main thing I'd like to see is to maintain the clean split of > code between what is general purpose, and what is specific to the > virtual network driver and what is specific to the nwfilter > driver. AFAICT, this series mixes that up with circular dependancies > between viriptables.c, virfirewall.c and virnetfilter.c. Yeah, after reading through your comments about that part, I agree. It all started because viriptables.[ch] were originally put in util instead of network, and I was too focused on maintaining the status quo where possible. As for the circular dependency, that was one of the bits where I felt a bit dirty writing the code, as mentioned earlier. I'm going to try and eliminate that. > > I'd like virfirwall.c to remain indepent of anything related to > the virtual network driver. Definitely. And if viriptables is moved from util to network, then we *must* eliminate this bit.
On a Sunday in 2023, Laine Stump wrote: >This patch series enables libvirt to use nftables rules rather than >iptables *when setting up virtual networks* (it does *not* add >nftables support to the nwfilter driver). It accomplishes this by > getting these patches in. [... 150 lines delted ...] > >Laine Stump (28): > util: add -w/--concurrent when applying the rule rather than when > building it > util: new virFirewallRuleGet*() APIs > util: determine ignoreErrors value when creating rule, not when > applying > util: rename iptables helpers that will become the frontend for > ip&nftables > util: move backend-agnostic virNetfilter*() functions to their own > file > util: make netfilter action a proper typedefed (virFirewall) enum > util: #define the names used for private packet filter chains > util: move/rename virFirewallApplyRuleDirect to > virIptablesApplyFirewallRule > util/network: reintroduce virFirewallBackend, but different > network: add (empty) network.conf file to distribution files > network: allow setting firewallBackend from network.conf > network: do not add DHCP checksum mangle rule unless using iptables > network: call backend agnostic function to init private filter chains > util: setup functions in virnetfilter which will call appropriate > backend > build: add nft to the list of binaries we attempt to locate > util: add nftables backend to virnetfilter API used by network driver > tests: test cases for nftables backend > util: new functions to support adding individual rollback rules > util: check for 0 args when applying iptables rule > util: implement rollback rule autosave for iptables backend > util: implement rollback rule autosave for nftables backend > network: turn on auto-rollback for the rules added for virtual > networks > util: new function virFirewallNewFromRollback() > util: new functions virFirewallParseXML() and virFirewallFormat() > conf: add a virFirewall object to virNetworkObj > network: use previously saved list of firewall rules when removing > network: save network status when firewall rules are reloaded > network: improve log message when reloading virtual network firewall > rules > > libvirt.spec.in | 5 + > meson.build | 1 + > po/POTFILES | 2 + > src/conf/virnetworkobj.c | 40 + > src/conf/virnetworkobj.h | 11 + > src/libvirt_private.syms | 68 +- > src/network/bridge_driver.c | 40 +- > src/network/bridge_driver_conf.c | 44 + > src/network/bridge_driver_conf.h | 3 + > src/network/bridge_driver_linux.c | 241 +++-- > src/network/bridge_driver_nop.c | 6 +- > src/network/bridge_driver_platform.h | 6 +- > src/network/libvirtd_network.aug | 39 + > src/network/meson.build | 11 + > src/network/network.conf | 24 + > src/network/test_libvirtd_network.aug.in | 5 + > src/nwfilter/nwfilter_ebiptables_driver.c | 16 +- > src/util/meson.build | 2 + > src/util/virebtables.c | 4 +- > src/util/virfirewall.c | 490 ++++++++-- > src/util/virfirewall.h | 51 +- > src/util/viriptables.c | 762 ++++----------- > src/util/viriptables.h | 222 ++--- > src/util/virnetfilter.c | 892 ++++++++++++++++++ > src/util/virnetfilter.h | 159 ++++ > src/util/virnftables.c | 698 ++++++++++++++ > src/util/virnftables.h | 118 +++ > .../{base.args => base.iptables} | 0 > tests/networkxml2firewalldata/base.nftables | 256 +++++ > ...-linux.args => nat-default-linux.iptables} | 0 > .../nat-default-linux.nftables | 248 +++++ > ...pv6-linux.args => nat-ipv6-linux.iptables} | 0 > .../nat-ipv6-linux.nftables | 384 ++++++++ > ...rgs => nat-ipv6-masquerade-linux.iptables} | 0 > .../nat-ipv6-masquerade-linux.nftables | 456 +++++++++ > ...linux.args => nat-many-ips-linux.iptables} | 0 > .../nat-many-ips-linux.nftables | 472 +++++++++ > ...-linux.args => nat-no-dhcp-linux.iptables} | 0 > .../nat-no-dhcp-linux.nftables | 384 ++++++++ > ...ftp-linux.args => nat-tftp-linux.iptables} | 0 > .../nat-tftp-linux.nftables | 274 ++++++ > ...inux.args => route-default-linux.iptables} | 0 > .../route-default-linux.nftables | 162 ++++ > tests/networkxml2firewalltest.c | 56 +- > tests/virfirewalltest.c | 20 +- > 45 files changed, 5718 insertions(+), 954 deletions(-) Reviewed-by: Ján Tomko <jtomko@redhat.com> Jano
© 2016 - 2024 Red Hat, Inc.