[PATCH 5/7] network: NetworkManager script to monitor/resolve conflicts with new interfaces

Laine Stump posted 7 patches 3 months, 2 weeks ago
[PATCH 5/7] network: NetworkManager script to monitor/resolve conflicts with new interfaces
Posted by Laine Stump 3 months, 2 weeks ago
There has been a problem for several years with libvirt's default
virtual network conflicting with the host physical network connection
on new installs, particularly when the "host" is actually a virtual
machine that is, itself, connected to the libvirt default network on
its respective host. If the two default networks use the same subnet,
and if the nested host's libvirt happens to start its network before
the system networking connects to the L0 host, then network
connectivity to the L1 guest is just silently non-working.

We've tried several things over the years to eliminate this problem,
including:

1) Checking for conflicting routes/interfaces during installation of
   the libvirt-daemon-config-network package (the one containing the
   default network config file) which tries different subnets until it
   finds one that doesn't conflict. This helps in some cases, but
   fails on 2 points: a) if the installation environment is different
   from the environment where the system is actually run (e.g. a "live
   CD" image of a distro, which is built in a container by the distro
   maintainers, then could later run in any number of places, and b)
   it modifies the installed files during the rpm installation %post
   script, which is now discouraged because people building container
   images don't like dealing with that.

2) When starting a libvirt network, we now check for any route or
   interface that conflicts with the new network's IP's and
   routes. This doesn't fix the problem, but does notify the user of
   the problem *as long as libvirt starts its networks after the host
   has started its system networks*.

3) New code (in the commits immediately previous to this one) add
   support for an "autoaddr" attribute in each virtual network <ip>
   element; when autoaddr is set, the network driver goes one step
   beyond (2) and actually finds an unused subnet and sets the new
   virtual network's addresses accordingly.

These are all nice in their own ways, but none of them helps in the
situation where libvirt's networks are started first (before the
host's own network connections are all up). This led to this patch,
which does the following:

4) Using a NetworkManager facility (dispatcher.d pscripts, which are
   run whenever any interface is brought up or down), check for any
   libvirt networks that conflict with a newly started NetworkManager
   interface, and if a conflict is found then log a message and
   destroy the libvirt network. Most usefully, though, if this
   destroyed network has autoaddr='yes' then the script will
   immediately restart the network, which will find a new, unused
   subnet.

Once this is in place, the only issues are:

1) It only works with NetworkManager. But of course almost all of the
   cases where this problem has been an issue, networking is managed
   by NetworkManager.

2) If there are guests already running and connected to the network,
   they will be disconnected, and won't be reconnected until
   libvirtd/virtqemud is restarted (one of the things the QEMU driver
   does when rereading the status of active guests is to make sure all
   their interfaces are connected to their respective networks).

Signed-off-by: Laine Stump <laine@redhat.com>
---
 libvirt.spec.in                         |   2 +
 src/network/meson.build                 |   6 +
 src/network/nm-dispatcher-check-nets.py | 196 ++++++++++++++++++++++++
 3 files changed, 204 insertions(+)
 create mode 100755 src/network/nm-dispatcher-check-nets.py

diff --git a/libvirt.spec.in b/libvirt.spec.in
index 29101e74fe..51cecfa598 100644
--- a/libvirt.spec.in
+++ b/libvirt.spec.in
@@ -603,6 +603,7 @@ Network filter configuration files for cleaning guest traffic
 Summary: Network driver plugin for the libvirtd daemon
 Requires: libvirt-daemon-common = %{version}-%{release}
 Requires: libvirt-libs = %{version}-%{release}
+Requires: python3-libxml2
 Requires: dnsmasq >= 2.41
     %if %{prefer_nftables}
 Requires: nftables
@@ -2151,6 +2152,7 @@ exit 0
 %dir %attr(0755, root, root) %{_localstatedir}/lib/libvirt/dnsmasq/
 %attr(0755, root, root) %{_libexecdir}/libvirt_leaseshelper
 %{_libdir}/libvirt/connection-driver/libvirt_driver_network.so
+%{_prefix}/lib/NetworkManager/dispatcher.d/50-libvirt-check-nets
 %{_mandir}/man8/virtnetworkd.8*
     %if %{with_firewalld_zone}
 %{_prefix}/lib/firewalld/zones/libvirt.xml
diff --git a/src/network/meson.build b/src/network/meson.build
index 8faff6eb1c..f620407759 100644
--- a/src/network/meson.build
+++ b/src/network/meson.build
@@ -169,4 +169,10 @@ if conf.has('WITH_NETWORK')
       rename: [ 'libvirt-routed-in.xml' ],
     )
   endif
+
+  install_data(
+    'nm-dispatcher-check-nets.py',
+    install_dir: prefix / 'lib' / 'NetworkManager' / 'dispatcher.d',
+    rename: [ '50-libvirt-check-nets' ],
+  )
 endif
diff --git a/src/network/nm-dispatcher-check-nets.py b/src/network/nm-dispatcher-check-nets.py
new file mode 100755
index 0000000000..454c434c88
--- /dev/null
+++ b/src/network/nm-dispatcher-check-nets.py
@@ -0,0 +1,196 @@
+#!/usr/bin/env python3
+#
+# Copyright (C) 2012-2019 Red Hat, Inc.
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library.  If not, see
+# <http://www.gnu.org/licenses/>.
+
+import libvirt
+import sys
+import os
+import libxml2
+from ipaddress import ip_network
+
+# This script should be installed in
+# /usr/lib/NetworkManager/dispatcher.d/50-libvirt-check-nets. It will be
+# called by NetworkManager every time a network interface is taken up
+# or down. When a new network comes up, it checks the libvirt virtual
+# networks to see if their IP address(es) (including any static
+# routes) are in conflict with the IP address(es) (or static routes)
+# of the newly added interface.  If so, the libvirt network is
+# disabled. It is assumed that the user will notice that their guests
+# no longer have network connectvity (and/or the message logged by
+# this script), see that the network has been disabled, and then
+# realize the conflict when they try to restart it.
+#
+# set checkDefaultOnly=False to check *all* active libvirt networks
+# for conflicts with the new interface. Set to True to check only the
+# libvirt default network (since most networks other than the default
+# network are added post-install at a time when all of the hosts other
+# networks are already active, it may be overkill to check all of the
+# libvirt networks for conflict here (and instead just add more
+# needless overheard to bringing up a new host interface).
+#
+checkDefaultOnly = False
+
+# NB: since this file is installed in /usr/lib, it really shouldn't be
+# modified by the user, but instead should be copied to
+# /etc/NetworkManager/dispatcher.d, where it will override the copy in
+# /usr/lib. Even that isn't a proper solution though - if we're going
+# to actually have this config knob, perhaps we should check for it in
+# the environment, and if someone wants to modify it they can put a
+# short script in /etc that exports and environment variable and then
+# calls this script? Just thinking out loud here.
+
+
+def checkconflict(conn, netname, hostnets, hostif):
+
+    # ignore if the network has been brought down or removed since we
+    # got the list
+    try:
+        net = conn.networkLookupByName(netname)
+    except libvirt.libvirtError:
+        return
+
+    if not net.isActive():
+        return
+
+    xml = net.XMLDesc()
+    doc = libxml2.parseDoc(xml)
+    ctx = doc.xpathNewContext()
+
+    # see if NetworkManager is informing us that this libvirt network
+    # itself is coming online
+    bridge = ctx.xpathEval("/network/bridge/@name")
+    if bridge and bridge[0].content == hostif:
+        return
+
+    # check *all* the addresses of this network
+    addrs = ctx.xpathEval("/network/*[@address]")
+    for ip in addrs:
+        ctx.setContextNode(ip)
+        address = ctx.xpathEval("@address")
+        prefix = ctx.xpathEval("@prefix")
+        netmask = ctx.xpathEval("@netmask")
+        autoaddr = ctx.xpathEval("@autoaddr")
+
+        isAutoaddr = False
+        if autoaddr and len(autoaddr[0].content):
+            isAutoaddr = (autoaddr[0].content == "yes")
+
+        if not (address and len(address[0].content)):
+            continue
+
+        addrstr = address[0].content
+        if not (prefix and len(prefix[0].content)):
+            # check for a netmask
+            if not (netmask and len(netmask[0].content)):
+                # this element has address, but no prefix or netmask
+                # probably it is <mac address ...> so we can ignore it
+                continue
+            # convert netmask to prefix
+            prefixstr = str(ip_network("0.0.0.0/%s" % netmask[0].content).prefixlen)
+        else:
+            prefixstr = prefix[0].content
+
+        virtnetaddress = ip_network("%s/%s" % (addrstr, prefixstr), strict=False)
+
+        for hostnet in hostnets:
+            if virtnetaddress == hostnet:
+                # There is a conflict with this libvirt network and the specified
+                # net, so we need to disable the libvirt network
+                print("Stopping libvirt network '%s' because its subnet %s conflicts with newly started interface '%s'')"
+                      % (netname, str(hostnet), hostif))
+                try:
+                    net.destroy()
+                except libvirt.libvirtError:
+                    print("Failed to destroy network %s" % netname)
+                    return
+
+                if isAutoaddr:
+                    print("Restarting autoaddr libvirt network '%s'with new subnet" % (netname))
+                    try:
+                        net.create()
+                    except libvirt.libvirtError:
+                        print("Failed to restart network '%s'" % netname)
+                return
+    return
+
+
+def addHostNets(hostnets, countenv, addrenv):
+
+    count = os.getenv(countenv)
+    if not count or count == 0:
+        return
+
+    for num in range(int(count)):
+        addrstr = os.getenv("%s_%d" % (addrenv, num))
+        if not addrstr or addrstr == "":
+            continue
+
+        net = ip_network(addrstr.split()[0], strict=False)
+        if net:
+            hostnets.append(net)
+    return
+
+
+############################################################
+
+if sys.argv[2] != "up":
+    sys.exit(0)
+
+hostif = sys.argv[1]
+
+try:
+    conn = libvirt.open(None)
+except libvirt.libvirtError:
+    print('Failed to open connection to the hypervisor')
+    sys.exit(0)
+
+if checkDefaultOnly:
+    nets = []
+    net = conn.networkLookupByName("default")
+    if not (net and net.isActive()):
+        sys.exit(0)
+    nets.append(net)
+else:
+    nets = conn.listAllNetworks(libvirt.VIR_CONNECT_LIST_NETWORKS_ACTIVE)
+    if not nets:
+        sys.exit(0)
+
+# We have at least one active network. Build a list of all network
+# routes added by the new interface, and compare that list to the list
+# of all networks used by each active libvirt network. If any are an
+# exact match, then we have a conflict and need to shut down the
+# libvirt network to avoid killing host networking.
+
+# When NetworkManager calls scripts in /etc/NetworkManager/dispatcher.d
+# it will have all IP addresses and routes associated with the interface
+# that is going up or down in the following environment variables:
+#
+# IP4_NUM_ADDRESSES - number of IPv4 addresses
+# IP4_ADDRESS_N - one variable for each address, starting at _0
+# IP4_NUM_ROUTES - number of IPv5 routes
+# IP4_ROUTE_N - one for each route, starting at _0
+# (replace "IP4" with "IP6" and repeat)
+#
+hostnets = []
+addHostNets(hostnets, "IP4_NUM_ADDRESSES", "IP4_ADDRESS")
+addHostNets(hostnets, "IP4_NUM_ROUTES", "IP4_ROUTE")
+addHostNets(hostnets, "IP6_NUM_ADDRESSES", "IP6_ADDRESS")
+addHostNets(hostnets, "IP6_NUM_ROUTES", "IP6_ROUTE")
+
+for net in nets:
+
+    checkconflict(conn, net.name(), hostnets, hostif)
-- 
2.45.2
Re: [PATCH 5/7] network: NetworkManager script to monitor/resolve conflicts with new interfaces
Posted by Daniel P. Berrangé 3 months, 2 weeks ago
On Wed, Aug 07, 2024 at 01:16:01PM -0400, Laine Stump wrote:
> There has been a problem for several years with libvirt's default
> virtual network conflicting with the host physical network connection
> on new installs, particularly when the "host" is actually a virtual
> machine that is, itself, connected to the libvirt default network on
> its respective host. If the two default networks use the same subnet,
> and if the nested host's libvirt happens to start its network before
> the system networking connects to the L0 host, then network
> connectivity to the L1 guest is just silently non-working.
> 
> We've tried several things over the years to eliminate this problem,
> including:
> 
> 1) Checking for conflicting routes/interfaces during installation of
>    the libvirt-daemon-config-network package (the one containing the
>    default network config file) which tries different subnets until it
>    finds one that doesn't conflict. This helps in some cases, but
>    fails on 2 points: a) if the installation environment is different
>    from the environment where the system is actually run (e.g. a "live
>    CD" image of a distro, which is built in a container by the distro
>    maintainers, then could later run in any number of places, and b)
>    it modifies the installed files during the rpm installation %post
>    script, which is now discouraged because people building container
>    images don't like dealing with that.
> 
> 2) When starting a libvirt network, we now check for any route or
>    interface that conflicts with the new network's IP's and
>    routes. This doesn't fix the problem, but does notify the user of
>    the problem *as long as libvirt starts its networks after the host
>    has started its system networks*.
> 
> 3) New code (in the commits immediately previous to this one) add
>    support for an "autoaddr" attribute in each virtual network <ip>
>    element; when autoaddr is set, the network driver goes one step
>    beyond (2) and actually finds an unused subnet and sets the new
>    virtual network's addresses accordingly.
> 
> These are all nice in their own ways, but none of them helps in the
> situation where libvirt's networks are started first (before the
> host's own network connections are all up). This led to this patch,
> which does the following:
> 
> 4) Using a NetworkManager facility (dispatcher.d pscripts, which are
>    run whenever any interface is brought up or down), check for any
>    libvirt networks that conflict with a newly started NetworkManager
>    interface, and if a conflict is found then log a message and
>    destroy the libvirt network. Most usefully, though, if this
>    destroyed network has autoaddr='yes' then the script will
>    immediately restart the network, which will find a new, unused
>    subnet.
> 
> Once this is in place, the only issues are:
> 
> 1) It only works with NetworkManager. But of course almost all of the
>    cases where this problem has been an issue, networking is managed
>    by NetworkManager.
> 
> 2) If there are guests already running and connected to the network,
>    they will be disconnected, and won't be reconnected until
>    libvirtd/virtqemud is restarted (one of the things the QEMU driver
>    does when rereading the status of active guests is to make sure all
>    their interfaces are connected to their respective networks).
> 
> Signed-off-by: Laine Stump <laine@redhat.com>
> ---
>  libvirt.spec.in                         |   2 +
>  src/network/meson.build                 |   6 +
>  src/network/nm-dispatcher-check-nets.py | 196 ++++++++++++++++++++++++
>  3 files changed, 204 insertions(+)
>  create mode 100755 src/network/nm-dispatcher-check-nets.py
> 
> diff --git a/libvirt.spec.in b/libvirt.spec.in
> index 29101e74fe..51cecfa598 100644
> --- a/libvirt.spec.in
> +++ b/libvirt.spec.in
> @@ -603,6 +603,7 @@ Network filter configuration files for cleaning guest traffic
>  Summary: Network driver plugin for the libvirtd daemon
>  Requires: libvirt-daemon-common = %{version}-%{release}
>  Requires: libvirt-libs = %{version}-%{release}
> +Requires: python3-libxml2
>  Requires: dnsmasq >= 2.41
>      %if %{prefer_nftables}
>  Requires: nftables
> @@ -2151,6 +2152,7 @@ exit 0
>  %dir %attr(0755, root, root) %{_localstatedir}/lib/libvirt/dnsmasq/
>  %attr(0755, root, root) %{_libexecdir}/libvirt_leaseshelper
>  %{_libdir}/libvirt/connection-driver/libvirt_driver_network.so
> +%{_prefix}/lib/NetworkManager/dispatcher.d/50-libvirt-check-nets
>  %{_mandir}/man8/virtnetworkd.8*
>      %if %{with_firewalld_zone}
>  %{_prefix}/lib/firewalld/zones/libvirt.xml
> diff --git a/src/network/meson.build b/src/network/meson.build
> index 8faff6eb1c..f620407759 100644
> --- a/src/network/meson.build
> +++ b/src/network/meson.build
> @@ -169,4 +169,10 @@ if conf.has('WITH_NETWORK')
>        rename: [ 'libvirt-routed-in.xml' ],
>      )
>    endif
> +
> +  install_data(
> +    'nm-dispatcher-check-nets.py',
> +    install_dir: prefix / 'lib' / 'NetworkManager' / 'dispatcher.d',
> +    rename: [ '50-libvirt-check-nets' ],
> +  )
>  endif
> diff --git a/src/network/nm-dispatcher-check-nets.py b/src/network/nm-dispatcher-check-nets.py
> new file mode 100755
> index 0000000000..454c434c88
> --- /dev/null
> +++ b/src/network/nm-dispatcher-check-nets.py
> @@ -0,0 +1,196 @@
> +#!/usr/bin/env python3
> +#
> +# Copyright (C) 2012-2019 Red Hat, Inc.
> +#
> +# This library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +#
> +# This library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +# Lesser General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with this library.  If not, see
> +# <http://www.gnu.org/licenses/>.
> +
> +import libvirt
> +import sys
> +import os
> +import libxml2
> +from ipaddress import ip_network
> +
> +# This script should be installed in
> +# /usr/lib/NetworkManager/dispatcher.d/50-libvirt-check-nets. It will be
> +# called by NetworkManager every time a network interface is taken up
> +# or down. When a new network comes up, it checks the libvirt virtual
> +# networks to see if their IP address(es) (including any static
> +# routes) are in conflict with the IP address(es) (or static routes)
> +# of the newly added interface.  If so, the libvirt network is
> +# disabled. It is assumed that the user will notice that their guests
> +# no longer have network connectvity (and/or the message logged by
> +# this script), see that the network has been disabled, and then
> +# realize the conflict when they try to restart it.
> +#
> +# set checkDefaultOnly=False to check *all* active libvirt networks
> +# for conflicts with the new interface. Set to True to check only the
> +# libvirt default network (since most networks other than the default
> +# network are added post-install at a time when all of the hosts other
> +# networks are already active, it may be overkill to check all of the
> +# libvirt networks for conflict here (and instead just add more
> +# needless overheard to bringing up a new host interface).
> +#
> +checkDefaultOnly = False
> +
> +# NB: since this file is installed in /usr/lib, it really shouldn't be
> +# modified by the user, but instead should be copied to
> +# /etc/NetworkManager/dispatcher.d, where it will override the copy in
> +# /usr/lib. Even that isn't a proper solution though - if we're going
> +# to actually have this config knob, perhaps we should check for it in
> +# the environment, and if someone wants to modify it they can put a
> +# short script in /etc that exports and environment variable and then
> +# calls this script? Just thinking out loud here.
> +
> +
> +def checkconflict(conn, netname, hostnets, hostif):

...snip complex code...

It occurrs to me that this python code is entirely duplicating
logic that already exists in virtnetworkd that checks for
conflicts.

This makes me want to figure out a way to just re-use the existing
code rather than having to write it again.

Could we just send a SIGUSR2 to virtnetworkd as a way to trigger it
to re-validate all running networks ? That would be easy to trigger
from non-network manager setups too.

But then how about ignoring network manager entirely and going right
to the source, ie the kernel. Does it send out either udev or netlink
events (probably the latter), when NICs have their link status set
online/offline ? If so, virtnetworkd could just listen to the kernel
events and "do the right thing",  regardless of what tool is managing
the NICs.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
Re: [PATCH 5/7] network: NetworkManager script to monitor/resolve conflicts with new interfaces
Posted by Laine Stump 3 months, 2 weeks ago
On 8/7/24 1:32 PM, Daniel P. Berrangé wrote:
> On Wed, Aug 07, 2024 at 01:16:01PM -0400, Laine Stump wrote:
>> There has been a problem for several years with libvirt's default
>> virtual network conflicting with the host physical network connection
>> on new installs, particularly when the "host" is actually a virtual
>> machine that is, itself, connected to the libvirt default network on
>> its respective host. If the two default networks use the same subnet,
>> and if the nested host's libvirt happens to start its network before
>> the system networking connects to the L0 host, then network
>> connectivity to the L1 guest is just silently non-working.
>>
>> We've tried several things over the years to eliminate this problem,
>> including:
>>
>> 1) Checking for conflicting routes/interfaces during installation of
>>     the libvirt-daemon-config-network package (the one containing the
>>     default network config file) which tries different subnets until it
>>     finds one that doesn't conflict. This helps in some cases, but
>>     fails on 2 points: a) if the installation environment is different
>>     from the environment where the system is actually run (e.g. a "live
>>     CD" image of a distro, which is built in a container by the distro
>>     maintainers, then could later run in any number of places, and b)
>>     it modifies the installed files during the rpm installation %post
>>     script, which is now discouraged because people building container
>>     images don't like dealing with that.
>>
>> 2) When starting a libvirt network, we now check for any route or
>>     interface that conflicts with the new network's IP's and
>>     routes. This doesn't fix the problem, but does notify the user of
>>     the problem *as long as libvirt starts its networks after the host
>>     has started its system networks*.
>>
>> 3) New code (in the commits immediately previous to this one) add
>>     support for an "autoaddr" attribute in each virtual network <ip>
>>     element; when autoaddr is set, the network driver goes one step
>>     beyond (2) and actually finds an unused subnet and sets the new
>>     virtual network's addresses accordingly.
>>
>> These are all nice in their own ways, but none of them helps in the
>> situation where libvirt's networks are started first (before the
>> host's own network connections are all up). This led to this patch,
>> which does the following:
>>
>> 4) Using a NetworkManager facility (dispatcher.d pscripts, which are
>>     run whenever any interface is brought up or down), check for any
>>     libvirt networks that conflict with a newly started NetworkManager
>>     interface, and if a conflict is found then log a message and
>>     destroy the libvirt network. Most usefully, though, if this
>>     destroyed network has autoaddr='yes' then the script will
>>     immediately restart the network, which will find a new, unused
>>     subnet.
>>
>> Once this is in place, the only issues are:
>>
>> 1) It only works with NetworkManager. But of course almost all of the
>>     cases where this problem has been an issue, networking is managed
>>     by NetworkManager.
>>
>> 2) If there are guests already running and connected to the network,
>>     they will be disconnected, and won't be reconnected until
>>     libvirtd/virtqemud is restarted (one of the things the QEMU driver
>>     does when rereading the status of active guests is to make sure all
>>     their interfaces are connected to their respective networks).
>>
>> Signed-off-by: Laine Stump <laine@redhat.com>
>> ---
>>   libvirt.spec.in                         |   2 +
>>   src/network/meson.build                 |   6 +
>>   src/network/nm-dispatcher-check-nets.py | 196 ++++++++++++++++++++++++
>>   3 files changed, 204 insertions(+)
>>   create mode 100755 src/network/nm-dispatcher-check-nets.py
>>
>> diff --git a/libvirt.spec.in b/libvirt.spec.in
>> index 29101e74fe..51cecfa598 100644
>> --- a/libvirt.spec.in
>> +++ b/libvirt.spec.in
>> @@ -603,6 +603,7 @@ Network filter configuration files for cleaning guest traffic
>>   Summary: Network driver plugin for the libvirtd daemon
>>   Requires: libvirt-daemon-common = %{version}-%{release}
>>   Requires: libvirt-libs = %{version}-%{release}
>> +Requires: python3-libxml2
>>   Requires: dnsmasq >= 2.41
>>       %if %{prefer_nftables}
>>   Requires: nftables
>> @@ -2151,6 +2152,7 @@ exit 0
>>   %dir %attr(0755, root, root) %{_localstatedir}/lib/libvirt/dnsmasq/
>>   %attr(0755, root, root) %{_libexecdir}/libvirt_leaseshelper
>>   %{_libdir}/libvirt/connection-driver/libvirt_driver_network.so
>> +%{_prefix}/lib/NetworkManager/dispatcher.d/50-libvirt-check-nets
>>   %{_mandir}/man8/virtnetworkd.8*
>>       %if %{with_firewalld_zone}
>>   %{_prefix}/lib/firewalld/zones/libvirt.xml
>> diff --git a/src/network/meson.build b/src/network/meson.build
>> index 8faff6eb1c..f620407759 100644
>> --- a/src/network/meson.build
>> +++ b/src/network/meson.build
>> @@ -169,4 +169,10 @@ if conf.has('WITH_NETWORK')
>>         rename: [ 'libvirt-routed-in.xml' ],
>>       )
>>     endif
>> +
>> +  install_data(
>> +    'nm-dispatcher-check-nets.py',
>> +    install_dir: prefix / 'lib' / 'NetworkManager' / 'dispatcher.d',
>> +    rename: [ '50-libvirt-check-nets' ],
>> +  )
>>   endif
>> diff --git a/src/network/nm-dispatcher-check-nets.py b/src/network/nm-dispatcher-check-nets.py
>> new file mode 100755
>> index 0000000000..454c434c88
>> --- /dev/null
>> +++ b/src/network/nm-dispatcher-check-nets.py
>> @@ -0,0 +1,196 @@
>> +#!/usr/bin/env python3
>> +#
>> +# Copyright (C) 2012-2019 Red Hat, Inc.
>> +#
>> +# This library is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU Lesser General Public
>> +# License as published by the Free Software Foundation; either
>> +# version 2.1 of the License, or (at your option) any later version.
>> +#
>> +# This library is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +# Lesser General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU Lesser General Public
>> +# License along with this library.  If not, see
>> +# <http://www.gnu.org/licenses/>.
>> +
>> +import libvirt
>> +import sys
>> +import os
>> +import libxml2
>> +from ipaddress import ip_network
>> +
>> +# This script should be installed in
>> +# /usr/lib/NetworkManager/dispatcher.d/50-libvirt-check-nets. It will be
>> +# called by NetworkManager every time a network interface is taken up
>> +# or down. When a new network comes up, it checks the libvirt virtual
>> +# networks to see if their IP address(es) (including any static
>> +# routes) are in conflict with the IP address(es) (or static routes)
>> +# of the newly added interface.  If so, the libvirt network is
>> +# disabled. It is assumed that the user will notice that their guests
>> +# no longer have network connectvity (and/or the message logged by
>> +# this script), see that the network has been disabled, and then
>> +# realize the conflict when they try to restart it.
>> +#
>> +# set checkDefaultOnly=False to check *all* active libvirt networks
>> +# for conflicts with the new interface. Set to True to check only the
>> +# libvirt default network (since most networks other than the default
>> +# network are added post-install at a time when all of the hosts other
>> +# networks are already active, it may be overkill to check all of the
>> +# libvirt networks for conflict here (and instead just add more
>> +# needless overheard to bringing up a new host interface).
>> +#
>> +checkDefaultOnly = False
>> +
>> +# NB: since this file is installed in /usr/lib, it really shouldn't be
>> +# modified by the user, but instead should be copied to
>> +# /etc/NetworkManager/dispatcher.d, where it will override the copy in
>> +# /usr/lib. Even that isn't a proper solution though - if we're going
>> +# to actually have this config knob, perhaps we should check for it in
>> +# the environment, and if someone wants to modify it they can put a
>> +# short script in /etc that exports and environment variable and then
>> +# calls this script? Just thinking out loud here.
>> +
>> +
>> +def checkconflict(conn, netname, hostnets, hostif):
> 
> ...snip complex code...
> 
> It occurrs to me that this python code is entirely duplicating
> logic that already exists in virtnetworkd that checks for
> conflicts.

Possibly not *exactly* what's in virtnetworkd, since I'm not certain if 
NM includes all static routes in the list it puts in the environment 
variables (while they *would* show up in the system routing table that 
our network driver uses). Of course that sounds like another reason to 
prefer re-using the existing code!

> 
> This makes me want to figure out a way to just re-use the existing
> code rather than having to write it again.
> 
> Could we just send a SIGUSR2 to virtnetworkd as a way to trigger it
> to re-validate all running networks ? That would be easy to trigger
> from non-network manager setups too.

That sounds simpler (and probably quicker), and if we could do it that I 
would prefer it ("only works with NetworkManager" is #1 on my list of 
'issues' above after all :-)), but what would we do in the case that 
virtnetworkd (or libvirtd) wasn't currently running? Run some virsh 
command to trigger virtnetworkd to start, and then send the SIGUSR2?

(this does
> But then how about ignoring network manager entirely and going right
> to the source, ie the kernel. Does it send out either udev or netlink
> events (probably the latter), when NICs have their link status set
> online/offline ? If so, virtnetworkd could just listen to the kernel
> events and "do the right thing",  regardless of what tool is managing
> the NICs.

I imagine *something* is sent out, and it's definitely worth me 
investigating. But wouldn't virtnetworkd have to be running all of the 
time in order to use something like that?
Re: [PATCH 5/7] network: NetworkManager script to monitor/resolve conflicts with new interfaces
Posted by Daniel P. Berrangé 3 months, 1 week ago
On Wed, Aug 07, 2024 at 04:00:09PM -0400, Laine Stump wrote:
> On 8/7/24 1:32 PM, Daniel P. Berrangé wrote:
> > On Wed, Aug 07, 2024 at 01:16:01PM -0400, Laine Stump wrote:
> > > +
> > > +import libvirt
> > > +import sys
> > > +import os
> > > +import libxml2
> > > +from ipaddress import ip_network
> > > +
> > > +# This script should be installed in
> > > +# /usr/lib/NetworkManager/dispatcher.d/50-libvirt-check-nets. It will be
> > > +# called by NetworkManager every time a network interface is taken up
> > > +# or down. When a new network comes up, it checks the libvirt virtual
> > > +# networks to see if their IP address(es) (including any static
> > > +# routes) are in conflict with the IP address(es) (or static routes)
> > > +# of the newly added interface.  If so, the libvirt network is
> > > +# disabled. It is assumed that the user will notice that their guests
> > > +# no longer have network connectvity (and/or the message logged by
> > > +# this script), see that the network has been disabled, and then
> > > +# realize the conflict when they try to restart it.
> > > +#
> > > +# set checkDefaultOnly=False to check *all* active libvirt networks
> > > +# for conflicts with the new interface. Set to True to check only the
> > > +# libvirt default network (since most networks other than the default
> > > +# network are added post-install at a time when all of the hosts other
> > > +# networks are already active, it may be overkill to check all of the
> > > +# libvirt networks for conflict here (and instead just add more
> > > +# needless overheard to bringing up a new host interface).
> > > +#
> > > +checkDefaultOnly = False
> > > +
> > > +# NB: since this file is installed in /usr/lib, it really shouldn't be
> > > +# modified by the user, but instead should be copied to
> > > +# /etc/NetworkManager/dispatcher.d, where it will override the copy in
> > > +# /usr/lib. Even that isn't a proper solution though - if we're going
> > > +# to actually have this config knob, perhaps we should check for it in
> > > +# the environment, and if someone wants to modify it they can put a
> > > +# short script in /etc that exports and environment variable and then
> > > +# calls this script? Just thinking out loud here.
> > > +
> > > +
> > > +def checkconflict(conn, netname, hostnets, hostif):
> > 
> > ...snip complex code...
> > 
> > It occurrs to me that this python code is entirely duplicating
> > logic that already exists in virtnetworkd that checks for
> > conflicts.
> 
> Possibly not *exactly* what's in virtnetworkd, since I'm not certain if NM
> includes all static routes in the list it puts in the environment variables
> (while they *would* show up in the system routing table that our network
> driver uses). Of course that sounds like another reason to prefer re-using
> the existing code!

Oh definitely a strong reason to reuse the code. We don't want a situation
where users file bugs because we don't detect clashes in one scenario, but
do in the other.

> > This makes me want to figure out a way to just re-use the existing
> > code rather than having to write it again.
> > 
> > Could we just send a SIGUSR2 to virtnetworkd as a way to trigger it
> > to re-validate all running networks ? That would be easy to trigger
> > from non-network manager setups too.
> 
> That sounds simpler (and probably quicker), and if we could do it that I
> would prefer it ("only works with NetworkManager" is #1 on my list of
> 'issues' above after all :-)), but what would we do in the case that
> virtnetworkd (or libvirtd) wasn't currently running? Run some virsh command
> to trigger virtnetworkd to start, and then send the SIGUSR2?

As long as there are networks running, libvirtd/virnetworkd should not
shut down due to the auto shutdown timer.

They might temporarily shutdown due to a restart on package upgrades.

If our startup code runs the "check for clashes" logic, then we'll
do the right thing.

The only problem will be if a user manually stops the daemon and
leaves it stopped, while still having a network present. I'm happy
to call that user error and not worry about solving clashes in
that case, as no normal system should get in that state.

> (this does
> > But then how about ignoring network manager entirely and going right
> > to the source, ie the kernel. Does it send out either udev or netlink
> > events (probably the latter), when NICs have their link status set
> > online/offline ? If so, virtnetworkd could just listen to the kernel
> > events and "do the right thing",  regardless of what tool is managing
> > the NICs.
> 
> I imagine *something* is sent out, and it's definitely worth me
> investigating. But wouldn't virtnetworkd have to be running all of the time
> in order to use something like that?



With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
Re: [PATCH 5/7] network: NetworkManager script to monitor/resolve conflicts with new interfaces
Posted by Laine Stump 3 months, 1 week ago
On 8/16/24 11:23 AM, Daniel P. Berrangé wrote:
> On Wed, Aug 07, 2024 at 04:00:09PM -0400, Laine Stump wrote:
>> On 8/7/24 1:32 PM, Daniel P. Berrangé wrote:
>>> On Wed, Aug 07, 2024 at 01:16:01PM -0400, Laine Stump wrote:
>>>> +
>>>> +import libvirt
>>>> +import sys
>>>> +import os
>>>> +import libxml2
>>>> +from ipaddress import ip_network
>>>> +
>>>> +# This script should be installed in
>>>> +# /usr/lib/NetworkManager/dispatcher.d/50-libvirt-check-nets. It will be
>>>> +# called by NetworkManager every time a network interface is taken up
>>>> +# or down. When a new network comes up, it checks the libvirt virtual
>>>> +# networks to see if their IP address(es) (including any static
>>>> +# routes) are in conflict with the IP address(es) (or static routes)
>>>> +# of the newly added interface.  If so, the libvirt network is
>>>> +# disabled. It is assumed that the user will notice that their guests
>>>> +# no longer have network connectvity (and/or the message logged by
>>>> +# this script), see that the network has been disabled, and then
>>>> +# realize the conflict when they try to restart it.
>>>> +#
>>>> +# set checkDefaultOnly=False to check *all* active libvirt networks
>>>> +# for conflicts with the new interface. Set to True to check only the
>>>> +# libvirt default network (since most networks other than the default
>>>> +# network are added post-install at a time when all of the hosts other
>>>> +# networks are already active, it may be overkill to check all of the
>>>> +# libvirt networks for conflict here (and instead just add more
>>>> +# needless overheard to bringing up a new host interface).
>>>> +#
>>>> +checkDefaultOnly = False
>>>> +
>>>> +# NB: since this file is installed in /usr/lib, it really shouldn't be
>>>> +# modified by the user, but instead should be copied to
>>>> +# /etc/NetworkManager/dispatcher.d, where it will override the copy in
>>>> +# /usr/lib. Even that isn't a proper solution though - if we're going
>>>> +# to actually have this config knob, perhaps we should check for it in
>>>> +# the environment, and if someone wants to modify it they can put a
>>>> +# short script in /etc that exports and environment variable and then
>>>> +# calls this script? Just thinking out loud here.
>>>> +
>>>> +
>>>> +def checkconflict(conn, netname, hostnets, hostif):
>>>
>>> ...snip complex code...
>>>
>>> It occurrs to me that this python code is entirely duplicating
>>> logic that already exists in virtnetworkd that checks for
>>> conflicts.
>>
>> Possibly not *exactly* what's in virtnetworkd, since I'm not certain if NM
>> includes all static routes in the list it puts in the environment variables
>> (while they *would* show up in the system routing table that our network
>> driver uses). Of course that sounds like another reason to prefer re-using
>> the existing code!
> 
> Oh definitely a strong reason to reuse the code. We don't want a situation
> where users file bugs because we don't detect clashes in one scenario, but
> do in the other.
> 
>>> This makes me want to figure out a way to just re-use the existing
>>> code rather than having to write it again.
>>>
>>> Could we just send a SIGUSR2 to virtnetworkd as a way to trigger it
>>> to re-validate all running networks ? That would be easy to trigger
>>> from non-network manager setups too.
>>
>> That sounds simpler (and probably quicker), and if we could do it that I
>> would prefer it ("only works with NetworkManager" is #1 on my list of
>> 'issues' above after all :-)), but what would we do in the case that
>> virtnetworkd (or libvirtd) wasn't currently running? Run some virsh command
>> to trigger virtnetworkd to start, and then send the SIGUSR2?
> 
> As long as there are networks running, libvirtd/virnetworkd should not
> shut down due to the auto shutdown timer.

But it *does* auto-shutdown (even if there is an active guest that's 
using an active virtual network) :-). So I guess that bit was never 
wired up for virtnetworkd.

I'd never before bothered to learn how the idle shutdown timer works, 
but just spent 5 minutes cscoping back through the code, and didn't get 
to the magic call made by the qemu driver that's missing from the 
network driver. If someone can point me at the right bit so I can avoid 
more digging, that would be appreciated.

Anyway, if virtnetworkd doesn't do an idle shutdown, that simplifies 
things quite a bit (I had just assumed that we didn't want that because 
it took up extra resources and was (until now) unnecessary).


> 
> They might temporarily shutdown due to a restart on package upgrades.
> 
> If our startup code runs the "check for clashes" logic, then we'll
> do the right thing.
> 
> The only problem will be if a user manually stops the daemon and
> leaves it stopped, while still having a network present. I'm happy
> to call that user error and not worry about solving clashes in
> that case, as no normal system should get in that state.

Yeah, if you want things to work, play by the rules!

> 
>> (this does
>>> But then how about ignoring network manager entirely and going right
>>> to the source, ie the kernel. Does it send out either udev or netlink
>>> events (probably the latter), when NICs have their link status set
>>> online/offline ? If so, virtnetworkd could just listen to the kernel
>>> events and "do the right thing",  regardless of what tool is managing
>>> the NICs.
>>
>> I imagine *something* is sent out, and it's definitely worth me
>> investigating. But wouldn't virtnetworkd have to be running all of the time
>> in order to use something like that?

Right before I left town for a week away from the computer, I found the 
blog of someone who was wanting to do exactly this. I'm going to go read 
up on it now, and hopefully it won't lead me down the wrong path! :-)

(I have to say though, that needing to look at netlink to try and figure 
out a new bug report (also just before leaving town) reminded me of how 
unnecessarily complicated netlink is :-/)