[libvirt] [RFC PATCH] udev: Remove udev handle from main loop when udev thread stops

Marc Hartmayer posted 1 patch 5 years, 2 months ago
Test syntax-check passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20190207160832.10455-1-mhartmay@linux.ibm.com
src/node_device/node_device_udev.c | 7 +++++++
1 file changed, 7 insertions(+)
[libvirt] [RFC PATCH] udev: Remove udev handle from main loop when udev thread stops
Posted by Marc Hartmayer 5 years, 2 months ago
Commit "nodedev: Move device enumumeration out of nodeStateInitialize"
(9f0ae0b18e3e620) has moved the heavy task of device enumeration into
a separate thread. The problem with this commit is that there is a
functionality change in the cleanup when udevEnumerateDevices
fails. Before commit 9f0ae0b18e3e620, the entire driver was cleaned up
by calling nodeStateCleanup (defined in node_device_udev.c) which
resulted in libvirtd stopping with the error message
'daemonRunStateInit:800 : Driver state initialization failed'. With
the commit 9f0ae0b18e3e620 there is only a signal to the udev thread
that it must stop. This means that for example the watch handle isn't
removed from the main loop and this can result in the main loop
consuming 100% CPU time as soon as a new udev event occurs.

This patch proposes a simple solution to the described problem. In
case the udev thread stops the watch handle is removed from the main
loop.

Fixes: 9f0ae0b18e3e620 ("nodedev: Move device enumumeration out of nodeStateInitialize")
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
---

Note: I'm not sure whether we should stop libvirtd (as it would have
      been done before) or if this patch is already sufficient.

---
 src/node_device/node_device_udev.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
index b1e5f00067e8..299f55260129 100644
--- a/src/node_device/node_device_udev.c
+++ b/src/node_device/node_device_udev.c
@@ -1628,6 +1628,13 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
         }
 
         if (priv->threadQuit) {
+            if (priv->watch != -1) {
+                /* Since the udev thread getting stopped remove the
+                 * watch handle from the main loop */
+                virEventRemoveHandle(priv->watch);
+                priv->watch = -1;
+            }
+
             virObjectUnlock(priv);
             return;
         }
-- 
2.17.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC PATCH] udev: Remove udev handle from main loop when udev thread stops
Posted by Marc Hartmayer 5 years, 2 months ago
On Thu, Feb 07, 2019 at 05:08 PM +0100, Marc Hartmayer <mhartmay@linux.ibm.com> wrote:
> Commit "nodedev: Move device enumumeration out of nodeStateInitialize"
> (9f0ae0b18e3e620) has moved the heavy task of device enumeration into
> a separate thread. The problem with this commit is that there is a
> functionality change in the cleanup when udevEnumerateDevices
> fails. Before commit 9f0ae0b18e3e620, the entire driver was cleaned up
> by calling nodeStateCleanup (defined in node_device_udev.c) which
> resulted in libvirtd stopping with the error message
> 'daemonRunStateInit:800 : Driver state initialization failed'. With
> the commit 9f0ae0b18e3e620 there is only a signal to the udev thread
> that it must stop. This means that for example the watch handle isn't
> removed from the main loop and this can result in the main loop
> consuming 100% CPU time as soon as a new udev event occurs.
>
> This patch proposes a simple solution to the described problem. In
> case the udev thread stops the watch handle is removed from the main
> loop.

Another option would be to stop libvirtd if udevEnumerateDevices fails
in nodeStateInitializeEnumerate.

[…snip]


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC PATCH] udev: Remove udev handle from main loop when udev thread stops
Posted by Marc Hartmayer 5 years, 2 months ago
On Thu, Feb 07, 2019 at 05:08 PM +0100, Marc Hartmayer <mhartmay@linux.ibm.com> wrote:
> Commit "nodedev: Move device enumumeration out of nodeStateInitialize"
> (9f0ae0b18e3e620) has moved the heavy task of device enumeration into
> a separate thread. The problem with this commit is that there is a
> functionality change in the cleanup when udevEnumerateDevices
> fails. Before commit 9f0ae0b18e3e620, the entire driver was cleaned up
> by calling nodeStateCleanup (defined in node_device_udev.c) which
> resulted in libvirtd stopping with the error message
> 'daemonRunStateInit:800 : Driver state initialization failed'. With
> the commit 9f0ae0b18e3e620 there is only a signal to the udev thread
> that it must stop. This means that for example the watch handle isn't
> removed from the main loop and this can result in the main loop
> consuming 100% CPU time as soon as a new udev event occurs.
>
> This patch proposes a simple solution to the described problem. In
> case the udev thread stops the watch handle is removed from the main
> loop.
>
> Fixes: 9f0ae0b18e3e620 ("nodedev: Move device enumumeration out of nodeStateInitialize")
> Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
> ---
>
> Note: I'm not sure whether we should stop libvirtd (as it would have
>       been done before) or if this patch is already sufficient.
>
> ---
>  src/node_device/node_device_udev.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
> index b1e5f00067e8..299f55260129 100644
> --- a/src/node_device/node_device_udev.c
> +++ b/src/node_device/node_device_udev.c
> @@ -1628,6 +1628,13 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
>          }
>
>          if (priv->threadQuit) {
> +            if (priv->watch != -1) {
> +                /* Since the udev thread getting stopped remove the
> +                 * watch handle from the main loop */
> +                virEventRemoveHandle(priv->watch);
> +                priv->watch = -1;
> +            }
> +
>              virObjectUnlock(priv);
>              return;
>          }
> --
> 2.17.0
>
> --
> libvir-list mailing list
> libvir-list@redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list

Sorry… I forgot to add John to CC.

Kind regards / Beste Grüße
   Marc Hartmayer

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Matthias Hartmann
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC PATCH] udev: Remove udev handle from main loop when udev thread stops
Posted by John Ferlan 5 years, 2 months ago

On 2/7/19 11:08 AM, Marc Hartmayer wrote:
> Commit "nodedev: Move device enumumeration out of nodeStateInitialize"
> (9f0ae0b18e3e620) has moved the heavy task of device enumeration into
> a separate thread. The problem with this commit is that there is a
> functionality change in the cleanup when udevEnumerateDevices
> fails. Before commit 9f0ae0b18e3e620, the entire driver was cleaned up
> by calling nodeStateCleanup (defined in node_device_udev.c) which
> resulted in libvirtd stopping with the error message
> 'daemonRunStateInit:800 : Driver state initialization failed'. With
> the commit 9f0ae0b18e3e620 there is only a signal to the udev thread
> that it must stop. This means that for example the watch handle isn't
> removed from the main loop and this can result in the main loop
> consuming 100% CPU time as soon as a new udev event occurs.
> 
> This patch proposes a simple solution to the described problem. In
> case the udev thread stops the watch handle is removed from the main
> loop.
> 
> Fixes: 9f0ae0b18e3e620 ("nodedev: Move device enumumeration out of nodeStateInitialize")
> Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
> ---
> 
> Note: I'm not sure whether we should stop libvirtd (as it would have
>       been done before) or if this patch is already sufficient.
> 
> ---
>  src/node_device/node_device_udev.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 

What you have seems reasonable - although I wonder if it would be better
to remove the event handle in the error of nodeStateInitializeEnumerate.

I think the assumption made was that by setting threadQuit that the next
event have some sort of failure through udevEventMonitorSanityCheck
which removes the EventHandle too. Although I cannot be sure it's been
too long to remember for sure ;-)

Furthermore, should nodeStateCleanup be altered to check for priv->watch
== -1 and thus not worry about the code to set threadQuit, signal, and
joining the thread.

John

BTW: I'm subscribed to the list, no need to CC...

> diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
> index b1e5f00067e8..299f55260129 100644
> --- a/src/node_device/node_device_udev.c
> +++ b/src/node_device/node_device_udev.c
> @@ -1628,6 +1628,13 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
>          }
>  
>          if (priv->threadQuit) {
> +            if (priv->watch != -1) {
> +                /* Since the udev thread getting stopped remove the
> +                 * watch handle from the main loop */
> +                virEventRemoveHandle(priv->watch);
> +                priv->watch = -1;
> +            }
> +
>              virObjectUnlock(priv);
>              return;
>          }
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC PATCH] udev: Remove udev handle from main loop when udev thread stops
Posted by Marc Hartmayer 5 years, 1 month ago
On Tue, Feb 12, 2019 at 09:46 PM +0100, John Ferlan <jferlan@redhat.com> wrote:
> On 2/7/19 11:08 AM, Marc Hartmayer wrote:
>> Commit "nodedev: Move device enumumeration out of nodeStateInitialize"
>> (9f0ae0b18e3e620) has moved the heavy task of device enumeration into
>> a separate thread. The problem with this commit is that there is a
>> functionality change in the cleanup when udevEnumerateDevices
>> fails. Before commit 9f0ae0b18e3e620, the entire driver was cleaned up
>> by calling nodeStateCleanup (defined in node_device_udev.c) which
>> resulted in libvirtd stopping with the error message
>> 'daemonRunStateInit:800 : Driver state initialization failed'. With
>> the commit 9f0ae0b18e3e620 there is only a signal to the udev thread
>> that it must stop. This means that for example the watch handle isn't
>> removed from the main loop and this can result in the main loop
>> consuming 100% CPU time as soon as a new udev event occurs.
>>
>> This patch proposes a simple solution to the described problem. In
>> case the udev thread stops the watch handle is removed from the main
>> loop.
>>
>> Fixes: 9f0ae0b18e3e620 ("nodedev: Move device enumumeration out of nodeStateInitialize")
>> Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
>> ---
>>
>> Note: I'm not sure whether we should stop libvirtd (as it would have
>>       been done before) or if this patch is already sufficient.
>>
>> ---
>>  src/node_device/node_device_udev.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>

[…snip…]

> Furthermore, should nodeStateCleanup be altered to check for priv->watch
> == -1 and thus not worry about the code to set threadQuit, signal, and
> joining the thread.

Hmm, I looked at the code again and it seems like your suggested change
could be a small performance improvement but it makes the code not more
readable. The current code is not wrong/problematic because changing
`priv->threadQuit` to true changes nothing. virCondSignal doesn’t block
and virThreadJoin/pthread_join returns immediately if the passed thread
has already terminated
(http://man7.org/linux/man-pages/man3/pthread_join.3.html).

The join would also still be needed with your proposed change since
otherwise there would be a (possible) race condition between
nodeStateCleanup() and setting `priv->threadQuit`.

>
> John
>
> BTW: I'm subscribed to the list, no need to CC...
>
>> diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
>> index b1e5f00067e8..299f55260129 100644
>> --- a/src/node_device/node_device_udev.c
>> +++ b/src/node_device/node_device_udev.c
>> @@ -1628,6 +1628,13 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
>>          }
>>
>>          if (priv->threadQuit) {
>> +            if (priv->watch != -1) {
>> +                /* Since the udev thread getting stopped remove the
>> +                 * watch handle from the main loop */
>> +                virEventRemoveHandle(priv->watch);
>> +                priv->watch = -1;
>> +            }
>> +
>>              virObjectUnlock(priv);
>>              return;
>>          }
>>
>
--
Kind regards / Beste Grüße
   Marc Hartmayer

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Matthias Hartmann
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC PATCH] udev: Remove udev handle from main loop when udev thread stops
Posted by Marc Hartmayer 5 years, 1 month ago
On Tue, Feb 12, 2019 at 09:46 PM +0100, John Ferlan <jferlan@redhat.com> wrote:
> On 2/7/19 11:08 AM, Marc Hartmayer wrote:
>> Commit "nodedev: Move device enumumeration out of nodeStateInitialize"
>> (9f0ae0b18e3e620) has moved the heavy task of device enumeration into
>> a separate thread. The problem with this commit is that there is a
>> functionality change in the cleanup when udevEnumerateDevices
>> fails. Before commit 9f0ae0b18e3e620, the entire driver was cleaned up
>> by calling nodeStateCleanup (defined in node_device_udev.c) which
>> resulted in libvirtd stopping with the error message
>> 'daemonRunStateInit:800 : Driver state initialization failed'. With
>> the commit 9f0ae0b18e3e620 there is only a signal to the udev thread
>> that it must stop. This means that for example the watch handle isn't
>> removed from the main loop and this can result in the main loop
>> consuming 100% CPU time as soon as a new udev event occurs.
>>
>> This patch proposes a simple solution to the described problem. In
>> case the udev thread stops the watch handle is removed from the main
>> loop.
>>
>> Fixes: 9f0ae0b18e3e620 ("nodedev: Move device enumumeration out of nodeStateInitialize")
>> Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
>> ---
>>
>> Note: I'm not sure whether we should stop libvirtd (as it would have
>>       been done before) or if this patch is already sufficient.
>>
>> ---
>>  src/node_device/node_device_udev.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>
> What you have seems reasonable - although I wonder if it would be better
> to remove the event handle in the error of nodeStateInitializeEnumerate.
>
> I think the assumption made was that by setting threadQuit that the next
> event have some sort of failure through udevEventMonitorSanityCheck
> which removes the EventHandle too. Although I cannot be sure it's been
> too long to remember for sure ;-)
>
> Furthermore, should nodeStateCleanup be altered to check for priv->watch
> == -1 and thus not worry about the code to set threadQuit, signal, and
> joining the thread.

A simple check for `priv->threadQuit` is probably even more useful since
it’s not safe that the watch callback is set.

I’ll send a v2 with the changes you suggested.

>
> John
>
> BTW: I'm subscribed to the list, no need to CC...
>
>> diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
>> index b1e5f00067e8..299f55260129 100644
>> --- a/src/node_device/node_device_udev.c
>> +++ b/src/node_device/node_device_udev.c
>> @@ -1628,6 +1628,13 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
>>          }
>>
>>          if (priv->threadQuit) {
>> +            if (priv->watch != -1) {
>> +                /* Since the udev thread getting stopped remove the
>> +                 * watch handle from the main loop */
>> +                virEventRemoveHandle(priv->watch);
>> +                priv->watch = -1;
>> +            }
>> +
>>              virObjectUnlock(priv);
>>              return;
>>          }
>>
>
--
Kind regards / Beste Grüße
   Marc Hartmayer

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Matthias Hartmann
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC PATCH] udev: Remove udev handle from main loop when udev thread stops
Posted by Marc Hartmayer 5 years, 2 months ago
On Tue, Feb 12, 2019 at 09:46 PM +0100, John Ferlan <jferlan@redhat.com> wrote:
> On 2/7/19 11:08 AM, Marc Hartmayer wrote:
>> Commit "nodedev: Move device enumumeration out of nodeStateInitialize"
>> (9f0ae0b18e3e620) has moved the heavy task of device enumeration into
>> a separate thread. The problem with this commit is that there is a
>> functionality change in the cleanup when udevEnumerateDevices
>> fails. Before commit 9f0ae0b18e3e620, the entire driver was cleaned up
>> by calling nodeStateCleanup (defined in node_device_udev.c) which
>> resulted in libvirtd stopping with the error message
>> 'daemonRunStateInit:800 : Driver state initialization failed'. With
>> the commit 9f0ae0b18e3e620 there is only a signal to the udev thread
>> that it must stop. This means that for example the watch handle isn't
>> removed from the main loop and this can result in the main loop
>> consuming 100% CPU time as soon as a new udev event occurs.
>>
>> This patch proposes a simple solution to the described problem. In
>> case the udev thread stops the watch handle is removed from the main
>> loop.
>>
>> Fixes: 9f0ae0b18e3e620 ("nodedev: Move device enumumeration out of nodeStateInitialize")
>> Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
>> ---
>>
>> Note: I'm not sure whether we should stop libvirtd (as it would have
>>       been done before) or if this patch is already sufficient.
>>
>> ---
>>  src/node_device/node_device_udev.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>
> What you have seems reasonable - although I wonder if it would be better
> to remove the event handle in the error of
> nodeStateInitializeEnumerate.

Might be better, yes - I’ve no strong opinion about that. I’ve added the
removal here because it doesn’t make sense to still have the watch
handle registered in the main loop if the udev thread stops - at least I
think so (just to be bulletproof).

The more important point is before your change, the behavior was that
libvirtd stops after this error. Now (with this patch) it only removes
the watch handle and stops the udev thread. Is this change of behavior
okay?

>
> I think the assumption made was that by setting threadQuit that the next
> event have some sort of failure through udevEventMonitorSanityCheck
> which removes the EventHandle too. Although I cannot be sure it's been
> too long to remember for sure ;-)
>
> Furthermore, should nodeStateCleanup be altered to check for priv->watch
> == -1 and thus not worry about the code to set threadQuit, signal, and
> joining the thread.
>
> John
>
> BTW: I'm subscribed to the list, no need to CC...

Yep, I know :) But adding you to CC increases the chance that you’re
looking for this patch since it was your commit.

Thanks for the feedback!

>
>> diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
>> index b1e5f00067e8..299f55260129 100644
>> --- a/src/node_device/node_device_udev.c
>> +++ b/src/node_device/node_device_udev.c
>> @@ -1628,6 +1628,13 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
>>          }
>>
>>          if (priv->threadQuit) {
>> +            if (priv->watch != -1) {
>> +                /* Since the udev thread getting stopped remove the
>> +                 * watch handle from the main loop */
>> +                virEventRemoveHandle(priv->watch);
>> +                priv->watch = -1;
>> +            }
>> +
>>              virObjectUnlock(priv);
>>              return;
>>          }
>>
>
--
Kind regards / Beste Grüße
   Marc Hartmayer

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Matthias Hartmann
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC PATCH] udev: Remove udev handle from main loop when udev thread stops
Posted by John Ferlan 5 years, 2 months ago

On 2/13/19 4:34 AM, Marc Hartmayer wrote:
> On Tue, Feb 12, 2019 at 09:46 PM +0100, John Ferlan <jferlan@redhat.com> wrote:
>> On 2/7/19 11:08 AM, Marc Hartmayer wrote:
>>> Commit "nodedev: Move device enumumeration out of nodeStateInitialize"
>>> (9f0ae0b18e3e620) has moved the heavy task of device enumeration into
>>> a separate thread. The problem with this commit is that there is a
>>> functionality change in the cleanup when udevEnumerateDevices
>>> fails. Before commit 9f0ae0b18e3e620, the entire driver was cleaned up
>>> by calling nodeStateCleanup (defined in node_device_udev.c) which
>>> resulted in libvirtd stopping with the error message
>>> 'daemonRunStateInit:800 : Driver state initialization failed'. With
>>> the commit 9f0ae0b18e3e620 there is only a signal to the udev thread
>>> that it must stop. This means that for example the watch handle isn't
>>> removed from the main loop and this can result in the main loop
>>> consuming 100% CPU time as soon as a new udev event occurs.
>>>
>>> This patch proposes a simple solution to the described problem. In
>>> case the udev thread stops the watch handle is removed from the main
>>> loop.
>>>
>>> Fixes: 9f0ae0b18e3e620 ("nodedev: Move device enumumeration out of nodeStateInitialize")
>>> Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
>>> ---
>>>
>>> Note: I'm not sure whether we should stop libvirtd (as it would have
>>>       been done before) or if this patch is already sufficient.
>>>
>>> ---
>>>  src/node_device/node_device_udev.c | 7 +++++++
>>>  1 file changed, 7 insertions(+)
>>>
>>
>> What you have seems reasonable - although I wonder if it would be better
>> to remove the event handle in the error of
>> nodeStateInitializeEnumerate.
> 
> Might be better, yes - I’ve no strong opinion about that. I’ve added the
> removal here because it doesn’t make sense to still have the watch
> handle registered in the main loop if the udev thread stops - at least I
> think so (just to be bulletproof).
> 
> The more important point is before your change, the behavior was that
> libvirtd stops after this error. Now (with this patch) it only removes
> the watch handle and stops the udev thread. Is this change of behavior
> okay?
> 

This was mentioned during review of the patch - final response here:

https://www.redhat.com/archives/libvir-list/2017-December/msg00531.html

with the feeling that at least allowing other aspects of libvirt to
function even though udev processing is crippled wasn't a bad thing.

Without the patch something would need to be done before starting
libvirtd anyway. The "missing piece" was actually having what you had
happen occur. Part of me wonders what fails such that enumeration fails
and is that something we should be chasing instead. IOW: Is there
something in udevEnumerateDevices failure that could just be ignored and
we continue to operate logging the failure (like you've done in the
patch you recently posted).  Does that fix what caused this patch?

John

>>
>> I think the assumption made was that by setting threadQuit that the next
>> event have some sort of failure through udevEventMonitorSanityCheck
>> which removes the EventHandle too. Although I cannot be sure it's been
>> too long to remember for sure ;-)
>>
>> Furthermore, should nodeStateCleanup be altered to check for priv->watch
>> == -1 and thus not worry about the code to set threadQuit, signal, and
>> joining the thread.
>>
>> John
>>
>> BTW: I'm subscribed to the list, no need to CC...
> 
> Yep, I know :) But adding you to CC increases the chance that you’re
> looking for this patch since it was your commit.
> 
> Thanks for the feedback!
> 
>>
>>> diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
>>> index b1e5f00067e8..299f55260129 100644
>>> --- a/src/node_device/node_device_udev.c
>>> +++ b/src/node_device/node_device_udev.c
>>> @@ -1628,6 +1628,13 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
>>>          }
>>>
>>>          if (priv->threadQuit) {
>>> +            if (priv->watch != -1) {
>>> +                /* Since the udev thread getting stopped remove the
>>> +                 * watch handle from the main loop */
>>> +                virEventRemoveHandle(priv->watch);
>>> +                priv->watch = -1;
>>> +            }
>>> +
>>>              virObjectUnlock(priv);
>>>              return;
>>>          }
>>>
>>
> --
> Kind regards / Beste Grüße
>    Marc Hartmayer
> 
> IBM Deutschland Research & Development GmbH
> Vorsitzende des Aufsichtsrats: Matthias Hartmann
> Geschäftsführung: Dirk Wittkopp
> Sitz der Gesellschaft: Böblingen
> Registergericht: Amtsgericht Stuttgart, HRB 243294
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC PATCH] udev: Remove udev handle from main loop when udev thread stops
Posted by Marc Hartmayer 5 years, 2 months ago
On Wed, Feb 13, 2019 at 03:03 PM +0100, John Ferlan <jferlan@redhat.com> wrote:
> On 2/13/19 4:34 AM, Marc Hartmayer wrote:
>> On Tue, Feb 12, 2019 at 09:46 PM +0100, John Ferlan <jferlan@redhat.com> wrote:
>>> On 2/7/19 11:08 AM, Marc Hartmayer wrote:
>>>> Commit "nodedev: Move device enumumeration out of nodeStateInitialize"
>>>> (9f0ae0b18e3e620) has moved the heavy task of device enumeration into
>>>> a separate thread. The problem with this commit is that there is a
>>>> functionality change in the cleanup when udevEnumerateDevices
>>>> fails. Before commit 9f0ae0b18e3e620, the entire driver was cleaned up
>>>> by calling nodeStateCleanup (defined in node_device_udev.c) which
>>>> resulted in libvirtd stopping with the error message
>>>> 'daemonRunStateInit:800 : Driver state initialization failed'. With
>>>> the commit 9f0ae0b18e3e620 there is only a signal to the udev thread
>>>> that it must stop. This means that for example the watch handle isn't
>>>> removed from the main loop and this can result in the main loop
>>>> consuming 100% CPU time as soon as a new udev event occurs.
>>>>
>>>> This patch proposes a simple solution to the described problem. In
>>>> case the udev thread stops the watch handle is removed from the main
>>>> loop.
>>>>
>>>> Fixes: 9f0ae0b18e3e620 ("nodedev: Move device enumumeration out of nodeStateInitialize")
>>>> Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
>>>> ---
>>>>
>>>> Note: I'm not sure whether we should stop libvirtd (as it would have
>>>>       been done before) or if this patch is already sufficient.
>>>>
>>>> ---
>>>>  src/node_device/node_device_udev.c | 7 +++++++
>>>>  1 file changed, 7 insertions(+)
>>>>
>>>
>>> What you have seems reasonable - although I wonder if it would be better
>>> to remove the event handle in the error of
>>> nodeStateInitializeEnumerate.
>>
>> Might be better, yes - I’ve no strong opinion about that. I’ve added the
>> removal here because it doesn’t make sense to still have the watch
>> handle registered in the main loop if the udev thread stops - at least I
>> think so (just to be bulletproof).
>>
>> The more important point is before your change, the behavior was that
>> libvirtd stops after this error. Now (with this patch) it only removes
>> the watch handle and stops the udev thread. Is this change of behavior
>> okay?
>>
>
> This was mentioned during review of the patch - final response here:
>
> https://www.redhat.com/archives/libvir-list/2017-December/msg00531.html
>
> with the feeling that at least allowing other aspects of libvirt to
> function even though udev processing is crippled wasn't a bad thing.

Okay.

>
> Without the patch something would need to be done before starting
> libvirtd anyway. The "missing piece" was actually having what you had
> happen occur. Part of me wonders what fails such that enumeration fails
> and is that something we should be chasing instead. IOW: Is there
> something in udevEnumerateDevices failure that could just be ignored and
> we continue to operate logging the failure (like you've done in the
> patch you recently posted).  Does that fix what caused this patch?

Yep, it does - if you’re referring to my mail
id:20190213123838.2826-1-mhartmay@linux.ibm.com.

>

[…snip]

--
Kind regards / Beste Grüße
   Marc Hartmayer

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Matthias Hartmann
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list