[libvirt] [PATCH] nwfilter: directly use poll to wait for packets instead of pcap_next

Daniel P. Berrangé posted 1 patch 5 years, 11 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20180521121554.11952-1-berrange@redhat.com
Test syntax-check passed
src/nwfilter/nwfilter_learnipaddr.c | 37 +++++++++++++++++++++++++++--
1 file changed, 35 insertions(+), 2 deletions(-)
[libvirt] [PATCH] nwfilter: directly use poll to wait for packets instead of pcap_next
Posted by Daniel P. Berrangé 5 years, 11 months ago
When a QEMU VM shuts down its TAP device gets deleted while nwfilter
IP address learning thread is still capturing packets. It is seen that
with TPACKET_V3 support in libcap, the pcap_next() call will not always
exit its poll() when the NIC is removed. This prevents the learning
thread from exiting which blocks the rest of libvirtd waiting on mutex
acquisition. By switching to do poll() in libvirt code, we can ensure
that we always exit the poll() at a time that is right for libvirt.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 src/nwfilter/nwfilter_learnipaddr.c | 37 +++++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/src/nwfilter/nwfilter_learnipaddr.c b/src/nwfilter/nwfilter_learnipaddr.c
index 061b39d72b..e117be9ce2 100644
--- a/src/nwfilter/nwfilter_learnipaddr.c
+++ b/src/nwfilter/nwfilter_learnipaddr.c
@@ -31,6 +31,7 @@
 
 #include <fcntl.h>
 #include <sys/ioctl.h>
+#include <poll.h>
 
 #include <arpa/inet.h>
 #include <net/ethernet.h>
@@ -414,6 +415,7 @@ learnIPAddressThread(void *arg)
     bool showError = true;
     enum howDetect howDetected = 0;
     virNWFilterTechDriverPtr techdriver = req->techdriver;
+    struct pollfd fds[1];
 
     if (virNWFilterLockIface(req->ifname) < 0)
        goto err_no_lock;
@@ -435,6 +437,9 @@ learnIPAddressThread(void *arg)
         goto done;
     }
 
+    fds[0].fd = pcap_fileno(handle);
+    fds[0].events = POLLIN | POLLERR;
+
     virMacAddrFormat(&req->macaddr, macaddr);
 
     switch (req->howDetect) {
@@ -483,17 +488,45 @@ learnIPAddressThread(void *arg)
     pcap_freecode(&fp);
 
     while (req->status == 0 && vmaddr == 0) {
+        int n = poll(fds, ARRAY_CARDINALITY(fds), PKT_TIMEOUT_MS);
+
+        if (n < 0) {
+            if (errno == EAGAIN || errno == EINTR)
+                continue;
+
+            req->status = errno;
+            showError = true;
+            break;
+        }
+
+        if (n == 0) {
+            if (threadsTerminate || req->terminate) {
+                VIR_DEBUG("Terminate request seen, cancelling pcap");
+                req->status = ECANCELED;
+                showError = false;
+                break;
+            }
+            continue;
+        }
+
+        if (fds[0].revents & (POLLHUP | POLLERR)) {
+            VIR_DEBUG("Error from FD probably dev deleted");
+            req->status = ENODEV;
+            showError = false;
+            break;
+        }
+
         packet = pcap_next(handle, &header);
 
         if (!packet) {
-
+            /* Already handled with poll, but lets be sure */
             if (threadsTerminate || req->terminate) {
                 req->status = ECANCELED;
                 showError = false;
                 break;
             }
 
-            /* check whether VM's dev is still there */
+            /* Again, already handled above, but lets be sure */
             if (virNetDevValidateConfig(req->ifname, NULL, req->ifindex) <= 0) {
                 virResetLastError();
                 req->status = ENODEV;
-- 
2.17.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] nwfilter: directly use poll to wait for packets instead of pcap_next
Posted by John Ferlan 5 years, 11 months ago

On 05/21/2018 08:15 AM, Daniel P. Berrangé wrote:
> When a QEMU VM shuts down its TAP device gets deleted while nwfilter
> IP address learning thread is still capturing packets. It is seen that
> with TPACKET_V3 support in libcap, the pcap_next() call will not always
> exit its poll() when the NIC is removed. This prevents the learning
> thread from exiting which blocks the rest of libvirtd waiting on mutex
> acquisition. By switching to do poll() in libvirt code, we can ensure
> that we always exit the poll() at a time that is right for libvirt.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  src/nwfilter/nwfilter_learnipaddr.c | 37 +++++++++++++++++++++++++++--
>  1 file changed, 35 insertions(+), 2 deletions(-)
> 

FWIW: This patch is related to:

https://www.redhat.com/archives/libvir-list/2018-May/msg01440.html

Reviewed-by: John Ferlan <jferlan@redhat.com>

I think this needs to be pushed before the release! Since Daniel is
away, I can take care of that.  I will wait for Monday though to ensure
there's no objection...

There's also a companion patch:

https://www.redhat.com/archives/libvir-list/2018-May/msg01429.html

but I have some issues with that and will post an alternative that would
also need to be pushed before the release.

John

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 2/1] nwfilter: Fix IP address learning
Posted by John Ferlan 5 years, 11 months ago
In a previous commit:

   commit d4bf8f415074759baf051644559e04fe78888f8b
   Author: Daniel P. Berrangé <berrange@redhat.com>
   Date:   Wed Feb 14 09:43:59 2018 +0000

     nwfilter: handle missing switch enum cases

     Ensure all enum cases are listed in switch statements, or cast away
     enum type in places where we don't wish to cover all cases.

     Reviewed-by: John Ferlan <jferlan@redhat.com>
     Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

we changed a switch in the nwfilter learning thread so that it had
explict cases for all enum entries. Unfortunately the parameters in the
method had been declared with incorrect type. The "howDetect" parameter
does *not* accept "enum howDetect" values, rather it accepts a bitmask
of "enum howDetect" values, so it should have been an "int" type.

The caller always passes DETECT_STATIC|DETECT_DHCP, so essentially the
IP addressing learning was completely broken by the above change, as it
never matched any switch case, hitting the default leading to EINVAL.

Rather than treat the howDetect as a numerical enum, let's treat it
as a bitmask/flag since that's the way it's used. Thus, the switch
changes to a simple if bit is set keeping the original intention that
if @howDetect == DETECT_DHCP, then use applyDHCPOnlyRules; otherwise,
use applyBasicRules to determine the IP Address.

Proposed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
---

 Here's to hoping I've used the correct magic incantation to get this
 to reply to Daniel's poll series as this should be pushed in
 conjunction with that (and thus would need review). This should be
 done before the 4.4.0 release.

 BTW: This also fixes a strange phenomena I was seeing in my testing
      environment where domain startups would periodically fail with:

          error: Failed to start domain f23
          error: Machine 'qemu-6-f23' already exists

      but could be started with enough retry attempts. What causes me
      to believe there's a relationship with this series is that if
      running libvirtd debug, I see the following as output:


          Detaching after fork from child process 22442.
          2018-05-25 20:37:24.966+0000: 25923: error :
              virSystemdCreateMachine:361 : Machine 'qemu-6-f23' already exists
          2018-05-25 20:37:24.988+0000: 22440: error :
              learnIPAddressThread:668   : encountered an error on interface
              vnet0 index 150: Invalid argument
          [Thread 0x7fffacdb7700 (LWP 22440) exited]

      With these two patches in place, no more errors.

 src/nwfilter/nwfilter_learnipaddr.c | 27 +++++++++++++--------------
 src/nwfilter/nwfilter_learnipaddr.h | 10 +++++-----
 2 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/src/nwfilter/nwfilter_learnipaddr.c b/src/nwfilter/nwfilter_learnipaddr.c
index 506b98fd07..ab2697274c 100644
--- a/src/nwfilter/nwfilter_learnipaddr.c
+++ b/src/nwfilter/nwfilter_learnipaddr.c
@@ -145,7 +145,7 @@ struct _virNWFilterIPAddrLearnReq {
     char *filtername;
     virHashTablePtr filterparams;
     virNWFilterDriverStatePtr driver;
-    enum howDetect howDetect;
+    virNWFilterLearnIPHowDetectFlags howDetect;
 
     int status;
     volatile bool terminate;
@@ -344,7 +344,7 @@ virNWFilterDeregisterLearnReq(int ifindex)
 static void
 procDHCPOpts(struct dhcp *dhcp, int dhcp_opts_len,
              uint32_t *vmaddr, uint32_t *bcastaddr,
-             enum howDetect *howDetected)
+             virNWFilterLearnIPHowDetectFlags *howDetected)
 {
     struct dhcp_option *dhcpopt = &dhcp->options[0];
 
@@ -413,7 +413,7 @@ learnIPAddressThread(void *arg)
     char *filter = NULL;
     uint16_t etherType;
     bool showError = true;
-    enum howDetect howDetected = 0;
+    virNWFilterLearnIPHowDetectFlags howDetected = 0;
     virNWFilterTechDriverPtr techdriver = req->techdriver;
     struct pollfd fds[1];
 
@@ -442,28 +442,27 @@ learnIPAddressThread(void *arg)
 
     virMacAddrFormat(&req->macaddr, macaddr);
 
-    switch (req->howDetect) {
-    case DETECT_DHCP:
+    if (req->howDetect == DETECT_DHCP) {
         if (techdriver->applyDHCPOnlyRules(req->ifname,
                                            &req->macaddr,
                                            NULL, false) < 0) {
+            VIR_DEBUG("Unable to apply DHCP only rules");
             req->status = EINVAL;
             goto done;
         }
         virBufferAddLit(&buf, "src port 67 and dst port 68");
-        break;
-    case DETECT_STATIC:
+    } else {
+        /* howDetect == DETECT_DHCP ||
+         * howDetect == (DETECT_DHCP | DETECT_STATIC)
+         */
         if (techdriver->applyBasicRules(req->ifname,
                                         &req->macaddr) < 0) {
+            VIR_DEBUG("Unable to apply basic rules");
             req->status = EINVAL;
             goto done;
         }
         virBufferAsprintf(&buf, "ether host %s or ether dst ff:ff:ff:ff:ff:ff",
                           macaddr);
-        break;
-    default:
-        req->status = EINVAL;
-        goto done;
     }
 
     if (virBufferError(&buf)) {
@@ -726,7 +725,7 @@ learnIPAddressThread(void *arg)
  *               once its IP address has been detected
  * @driver : the network filter driver
  * @howDetect : the method on how the thread is supposed to detect the
- *              IP address; must choose any of the available flags
+ *              IP address using virNWFilterLearnIPHowDetectFlags bitmask
  *
  * Instruct to learn the IP address being used on a given interface (ifname).
  * Unless there already is a thread attempting to learn the IP address
@@ -744,7 +743,7 @@ virNWFilterLearnIPAddress(virNWFilterTechDriverPtr techdriver,
                           const char *filtername,
                           virHashTablePtr filterparams,
                           virNWFilterDriverStatePtr driver,
-                          enum howDetect howDetect)
+                          virNWFilterLearnIPHowDetectFlags howDetect)
 {
     int rc;
     virThread thread;
@@ -833,7 +832,7 @@ virNWFilterLearnIPAddress(virNWFilterTechDriverPtr techdriver ATTRIBUTE_UNUSED,
                           const char *filtername ATTRIBUTE_UNUSED,
                           virHashTablePtr filterparams ATTRIBUTE_UNUSED,
                           virNWFilterDriverStatePtr driver ATTRIBUTE_UNUSED,
-                          enum howDetect howDetect ATTRIBUTE_UNUSED)
+                          virNWFilterLearnIPHowDetectFlags howDetect ATTRIBUTE_UNUSED)
 {
     virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                    _("IP parameter must be given since libvirt "
diff --git a/src/nwfilter/nwfilter_learnipaddr.h b/src/nwfilter/nwfilter_learnipaddr.h
index 06fea5bff8..ab21603090 100644
--- a/src/nwfilter/nwfilter_learnipaddr.h
+++ b/src/nwfilter/nwfilter_learnipaddr.h
@@ -30,10 +30,10 @@
 # include "nwfilter_tech_driver.h"
 # include <net/if.h>
 
-enum howDetect {
-  DETECT_DHCP = 1,
-  DETECT_STATIC = 2,
-};
+typedef enum {
+  DETECT_DHCP = 1 << 0,
+  DETECT_STATIC = 1 << 1,
+} virNWFilterLearnIPHowDetectFlags;
 
 int virNWFilterLearnIPAddress(virNWFilterTechDriverPtr techdriver,
                               const char *ifname,
@@ -43,7 +43,7 @@ int virNWFilterLearnIPAddress(virNWFilterTechDriverPtr techdriver,
                               const char *filtername,
                               virHashTablePtr filterparams,
                               virNWFilterDriverStatePtr driver,
-                              enum howDetect howDetect);
+                              virNWFilterLearnIPHowDetectFlags howDetect);
 
 bool virNWFilterHasLearnReq(int ifindex);
 int virNWFilterTerminateLearnReq(const char *ifname);
-- 
2.14.3

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 2/1] nwfilter: Fix IP address learning
Posted by Daniel P. Berrangé 5 years, 10 months ago
On Sat, May 26, 2018 at 08:27:47AM -0400, John Ferlan wrote:
> In a previous commit:
> 
>    commit d4bf8f415074759baf051644559e04fe78888f8b
>    Author: Daniel P. Berrangé <berrange@redhat.com>
>    Date:   Wed Feb 14 09:43:59 2018 +0000
> 
>      nwfilter: handle missing switch enum cases
> 
>      Ensure all enum cases are listed in switch statements, or cast away
>      enum type in places where we don't wish to cover all cases.
> 
>      Reviewed-by: John Ferlan <jferlan@redhat.com>
>      Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> 
> we changed a switch in the nwfilter learning thread so that it had
> explict cases for all enum entries. Unfortunately the parameters in the
> method had been declared with incorrect type. The "howDetect" parameter
> does *not* accept "enum howDetect" values, rather it accepts a bitmask
> of "enum howDetect" values, so it should have been an "int" type.
> 
> The caller always passes DETECT_STATIC|DETECT_DHCP, so essentially the
> IP addressing learning was completely broken by the above change, as it
> never matched any switch case, hitting the default leading to EINVAL.
> 
> Rather than treat the howDetect as a numerical enum, let's treat it
> as a bitmask/flag since that's the way it's used. Thus, the switch
> changes to a simple if bit is set keeping the original intention that
> if @howDetect == DETECT_DHCP, then use applyDHCPOnlyRules; otherwise,
> use applyBasicRules to determine the IP Address.
> 
> Proposed-by: Daniel P. Berrangé <berrange@redhat.com>
> Signed-off-by: John Ferlan <jferlan@redhat.com>
> ---
> 
>  Here's to hoping I've used the correct magic incantation to get this
>  to reply to Daniel's poll series as this should be pushed in
>  conjunction with that (and thus would need review). This should be
>  done before the 4.4.0 release.
> 
>  BTW: This also fixes a strange phenomena I was seeing in my testing
>       environment where domain startups would periodically fail with:
> 
>           error: Failed to start domain f23
>           error: Machine 'qemu-6-f23' already exists
> 
>       but could be started with enough retry attempts. What causes me
>       to believe there's a relationship with this series is that if
>       running libvirtd debug, I see the following as output:
> 
> 
>           Detaching after fork from child process 22442.
>           2018-05-25 20:37:24.966+0000: 25923: error :
>               virSystemdCreateMachine:361 : Machine 'qemu-6-f23' already exists
>           2018-05-25 20:37:24.988+0000: 22440: error :
>               learnIPAddressThread:668   : encountered an error on interface
>               vnet0 index 150: Invalid argument
>           [Thread 0x7fffacdb7700 (LWP 22440) exited]
> 
>       With these two patches in place, no more errors.
> 
>  src/nwfilter/nwfilter_learnipaddr.c | 27 +++++++++++++--------------
>  src/nwfilter/nwfilter_learnipaddr.h | 10 +++++-----
>  2 files changed, 18 insertions(+), 19 deletions(-)
> 
> diff --git a/src/nwfilter/nwfilter_learnipaddr.c b/src/nwfilter/nwfilter_learnipaddr.c
> index 506b98fd07..ab2697274c 100644
> --- a/src/nwfilter/nwfilter_learnipaddr.c
> +++ b/src/nwfilter/nwfilter_learnipaddr.c
> @@ -145,7 +145,7 @@ struct _virNWFilterIPAddrLearnReq {
>      char *filtername;
>      virHashTablePtr filterparams;
>      virNWFilterDriverStatePtr driver;
> -    enum howDetect howDetect;
> +    virNWFilterLearnIPHowDetectFlags howDetect;

We don't normally use typedefs for bitmasks because they are misleading
as evidenced by this bug. So I still prefer my original patch that uses
'int'.

We could just simplify the code though. Given that it is hardcoded to
always pass DETECT_DHCP | DETECT_STATIC, there's little obvious point
in it being configurable right now, so we could just remove the enum
entirely and hardwire the only thing we actually use.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 2/1] nwfilter: Fix IP address learning
Posted by John Ferlan 5 years, 10 months ago

On 06/01/2018 06:33 AM, Daniel P. Berrangé wrote:
> On Sat, May 26, 2018 at 08:27:47AM -0400, John Ferlan wrote:
>> In a previous commit:
>>
>>    commit d4bf8f415074759baf051644559e04fe78888f8b
>>    Author: Daniel P. Berrangé <berrange@redhat.com>
>>    Date:   Wed Feb 14 09:43:59 2018 +0000
>>
>>      nwfilter: handle missing switch enum cases
>>
>>      Ensure all enum cases are listed in switch statements, or cast away
>>      enum type in places where we don't wish to cover all cases.
>>
>>      Reviewed-by: John Ferlan <jferlan@redhat.com>
>>      Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
>>
>> we changed a switch in the nwfilter learning thread so that it had
>> explict cases for all enum entries. Unfortunately the parameters in the
>> method had been declared with incorrect type. The "howDetect" parameter
>> does *not* accept "enum howDetect" values, rather it accepts a bitmask
>> of "enum howDetect" values, so it should have been an "int" type.
>>
>> The caller always passes DETECT_STATIC|DETECT_DHCP, so essentially the
>> IP addressing learning was completely broken by the above change, as it
>> never matched any switch case, hitting the default leading to EINVAL.
>>
>> Rather than treat the howDetect as a numerical enum, let's treat it
>> as a bitmask/flag since that's the way it's used. Thus, the switch
>> changes to a simple if bit is set keeping the original intention that
>> if @howDetect == DETECT_DHCP, then use applyDHCPOnlyRules; otherwise,
>> use applyBasicRules to determine the IP Address.
>>
>> Proposed-by: Daniel P. Berrangé <berrange@redhat.com>
>> Signed-off-by: John Ferlan <jferlan@redhat.com>
>> ---
>>
>>  Here's to hoping I've used the correct magic incantation to get this
>>  to reply to Daniel's poll series as this should be pushed in
>>  conjunction with that (and thus would need review). This should be
>>  done before the 4.4.0 release.
>>
>>  BTW: This also fixes a strange phenomena I was seeing in my testing
>>       environment where domain startups would periodically fail with:
>>
>>           error: Failed to start domain f23
>>           error: Machine 'qemu-6-f23' already exists
>>
>>       but could be started with enough retry attempts. What causes me
>>       to believe there's a relationship with this series is that if
>>       running libvirtd debug, I see the following as output:
>>
>>
>>           Detaching after fork from child process 22442.
>>           2018-05-25 20:37:24.966+0000: 25923: error :
>>               virSystemdCreateMachine:361 : Machine 'qemu-6-f23' already exists
>>           2018-05-25 20:37:24.988+0000: 22440: error :
>>               learnIPAddressThread:668   : encountered an error on interface
>>               vnet0 index 150: Invalid argument
>>           [Thread 0x7fffacdb7700 (LWP 22440) exited]
>>
>>       With these two patches in place, no more errors.
>>
>>  src/nwfilter/nwfilter_learnipaddr.c | 27 +++++++++++++--------------
>>  src/nwfilter/nwfilter_learnipaddr.h | 10 +++++-----
>>  2 files changed, 18 insertions(+), 19 deletions(-)
>>
>> diff --git a/src/nwfilter/nwfilter_learnipaddr.c b/src/nwfilter/nwfilter_learnipaddr.c
>> index 506b98fd07..ab2697274c 100644
>> --- a/src/nwfilter/nwfilter_learnipaddr.c
>> +++ b/src/nwfilter/nwfilter_learnipaddr.c
>> @@ -145,7 +145,7 @@ struct _virNWFilterIPAddrLearnReq {
>>      char *filtername;
>>      virHashTablePtr filterparams;
>>      virNWFilterDriverStatePtr driver;
>> -    enum howDetect howDetect;
>> +    virNWFilterLearnIPHowDetectFlags howDetect;
> 
> We don't normally use typedefs for bitmasks because they are misleading
> as evidenced by this bug. So I still prefer my original patch that uses
> 'int'.
> 

Yay - more unwritten rules ;-)

OK, sure usually I see bitmasks and unsigned int flags in use...  Still
the value is used as a bitmask in one instance and a constant in
another. Fortunately (1 << 0) and (1 << 1) worked out to the same thing
as the existing code. In the long run, it doesn't really matter to me
how that's "worked out". Using = 1 and = 2, then later using as a
bitmask is probably wrong - it's a consistency thing.

As an aside, passing a typedef enum as a parameter is done a few times
such as virDomainPCIConnectFlags, qemuBlockIoTuneSetFlags, and
virFileCloseFlags using a quick search.

> We could just simplify the code though. Given that it is hardcoded to
> always pass DETECT_DHCP | DETECT_STATIC, there's little obvious point
> in it being configurable right now, so we could just remove the enum
> entirely and hardwire the only thing we actually use.
> 

Yeah - I thought of that too, but fear of any unknown future plans and
history of the code kept me away from taking that option.

I suppose taking the hardwire route or perhaps going with unsigned int
howDetect; would be fine... I also think that instead of a switch the if
then else would be fine.

John

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list