[PATCH v2 3/4] hw/vfio-user: wait for proxy close correctly

John Levon posted 4 patches 4 months ago
Maintainers: John Levon <john.levon@nutanix.com>, Thanos Makatos <thanos.makatos@nutanix.com>, "Cédric Le Goater" <clg@redhat.com>, Alex Williamson <alex.williamson@redhat.com>
There is a newer version of this series
[PATCH v2 3/4] hw/vfio-user: wait for proxy close correctly
Posted by John Levon 4 months ago
Coverity reported:

CID 1611806: Concurrent data access violations (BAD_CHECK_OF_WAIT_COND)

A wait is performed without a loop. If there is a spurious wakeup, the
condition may not be satisfied.

Fix this by checking ->state for VFIO_PROXY_CLOSED in a loop.

Resolves: Coverity CID 1611806
Fixes: 0b3d881a ("vfio-user: implement message receive infrastructure")
Signed-off-by: John Levon <john.levon@nutanix.com>
---
 hw/vfio-user/proxy.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/hw/vfio-user/proxy.c b/hw/vfio-user/proxy.c
index c418954440..2275d3fe39 100644
--- a/hw/vfio-user/proxy.c
+++ b/hw/vfio-user/proxy.c
@@ -32,7 +32,6 @@ static void vfio_user_recycle(VFIOUserProxy *proxy, VFIOUserMsg *msg);
 
 static void vfio_user_recv(void *opaque);
 static void vfio_user_send(void *opaque);
-static void vfio_user_cb(void *opaque);
 
 static void vfio_user_request(void *opaque);
 
@@ -492,7 +491,7 @@ static void vfio_user_send(void *opaque)
     }
 }
 
-static void vfio_user_cb(void *opaque)
+static void vfio_user_close_cb(void *opaque)
 {
     VFIOUserProxy *proxy = opaque;
 
@@ -984,8 +983,11 @@ void vfio_user_disconnect(VFIOUserProxy *proxy)
      * handler to run after the proxy fd handlers were
      * deleted above.
      */
-    aio_bh_schedule_oneshot(proxy->ctx, vfio_user_cb, proxy);
-    qemu_cond_wait(&proxy->close_cv, &proxy->lock);
+    aio_bh_schedule_oneshot(proxy->ctx, vfio_user_close_cb, proxy);
+
+    while (proxy->state != VFIO_PROXY_CLOSED) {
+        qemu_cond_wait(&proxy->close_cv, &proxy->lock);
+    }
 
     /* we now hold the only ref to proxy */
     qemu_mutex_unlock(&proxy->lock);
-- 
2.43.0
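
For readers outside the thread, the pattern the patch applies can be sketched
in plain POSIX threads. This is only an illustration under stated assumptions,
not QEMU code: demo_proxy, demo_state, demo_close_cb and demo_disconnect are
hypothetical stand-ins for VFIOUserProxy, its state field, the close callback
and vfio_user_disconnect(), and a helper thread stands in for the bottom half
scheduled with aio_bh_schedule_oneshot().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the proxy and its state; not QEMU's types. */
enum demo_state { DEMO_CONNECTED, DEMO_CLOSING, DEMO_CLOSED };

struct demo_proxy {
    pthread_mutex_t lock;
    pthread_cond_t close_cv;
    enum demo_state state;
};

/* Plays the role of the close callback: publish the state, then signal. */
static void *demo_close_cb(void *opaque)
{
    struct demo_proxy *proxy = opaque;

    pthread_mutex_lock(&proxy->lock);
    proxy->state = DEMO_CLOSED;
    pthread_cond_signal(&proxy->close_cv);
    pthread_mutex_unlock(&proxy->lock);
    return NULL;
}

/* Returns 0 on success, or the error code from pthread_create(). */
static int demo_disconnect(struct demo_proxy *proxy)
{
    pthread_t closer;
    int ret;

    pthread_mutex_lock(&proxy->lock);
    proxy->state = DEMO_CLOSING;

    /* The helper thread blocks on the lock until the wait below drops it. */
    ret = pthread_create(&closer, NULL, demo_close_cb, proxy);
    if (ret != 0) {
        pthread_mutex_unlock(&proxy->lock);
        return ret;
    }

    /*
     * Re-check the predicate after every wakeup: pthread_cond_wait() may
     * return spuriously, so a single unguarded wait proves nothing.
     */
    while (proxy->state != DEMO_CLOSED) {
        pthread_cond_wait(&proxy->close_cv, &proxy->lock);
    }
    assert(proxy->state == DEMO_CLOSED);

    pthread_mutex_unlock(&proxy->lock);
    pthread_join(closer, NULL);
    return 0;
}
```

The loop also covers the case where the state has already changed before the
waiter reaches the wait: the predicate is then false on the first check and
the wait is skipped entirely.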
Re: [PATCH v2 3/4] hw/vfio-user: wait for proxy close correctly
Posted by Mark Cave-Ayland 4 months ago
On 15/07/2025 06:52, John Levon wrote:

> Coverity reported:
> 
> CID 1611806: Concurrent data access violations (BAD_CHECK_OF_WAIT_COND)
> 
> A wait is performed without a loop. If there is a spurious wakeup, the
> condition may not be satisfied.
> 
> Fix this by checking ->state for VFIO_PROXY_CLOSED in a loop.
> 
> Resolves: Coverity CID 1611806
> Fixes: 0b3d881a ("vfio-user: implement message receive infrastructure")
> Signed-off-by: John Levon <john.levon@nutanix.com>

Is this definitely the right patch? The v2 posted at 
https://patchew.org/QEMU/20250711124500.1611628-1-john.levon@nutanix.com/ 
contains the updated commit message mentioning the rename of the 
callback, whereas this one doesn't?

> ---
>   hw/vfio-user/proxy.c | 10 ++++++----
>   1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/vfio-user/proxy.c b/hw/vfio-user/proxy.c
> index c418954440..2275d3fe39 100644
> --- a/hw/vfio-user/proxy.c
> +++ b/hw/vfio-user/proxy.c
> @@ -32,7 +32,6 @@ static void vfio_user_recycle(VFIOUserProxy *proxy, VFIOUserMsg *msg);
>   
>   static void vfio_user_recv(void *opaque);
>   static void vfio_user_send(void *opaque);
> -static void vfio_user_cb(void *opaque);
>   
>   static void vfio_user_request(void *opaque);
>   
> @@ -492,7 +491,7 @@ static void vfio_user_send(void *opaque)
>       }
>   }
>   
> -static void vfio_user_cb(void *opaque)
> +static void vfio_user_close_cb(void *opaque)
>   {
>       VFIOUserProxy *proxy = opaque;
>   
> @@ -984,8 +983,11 @@ void vfio_user_disconnect(VFIOUserProxy *proxy)
>        * handler to run after the proxy fd handlers were
>        * deleted above.
>        */
> -    aio_bh_schedule_oneshot(proxy->ctx, vfio_user_cb, proxy);
> -    qemu_cond_wait(&proxy->close_cv, &proxy->lock);
> +    aio_bh_schedule_oneshot(proxy->ctx, vfio_user_close_cb, proxy);
> +
> +    while (proxy->state != VFIO_PROXY_CLOSED) {
> +        qemu_cond_wait(&proxy->close_cv, &proxy->lock);
> +    }
>   
>       /* we now hold the only ref to proxy */
>       qemu_mutex_unlock(&proxy->lock);


ATB,

Mark.
Re: [PATCH v2 3/4] hw/vfio-user: wait for proxy close correctly
Posted by John Levon 4 months ago
On Tue, Jul 15, 2025 at 10:01:59AM +0100, Mark Cave-Ayland wrote:

> On 15/07/2025 06:52, John Levon wrote:
> 
> > Coverity reported:
> > 
> > CID 1611806: Concurrent data access violations (BAD_CHECK_OF_WAIT_COND)
> > 
> > A wait is performed without a loop. If there is a spurious wakeup, the
> > condition may not be satisfied.
> > 
> > Fix this by checking ->state for VFIO_PROXY_CLOSED in a loop.
> > 
> > Resolves: Coverity CID 1611806
> > Fixes: 0b3d881a ("vfio-user: implement message receive infrastructure")
> > Signed-off-by: John Levon <john.levon@nutanix.com>
> 
> Is this definitely the right patch? The v2 posted at
> https://patchew.org/QEMU/20250711124500.1611628-1-john.levon@nutanix.com/
> contains the updated commit message mentioning the rename of the callback,
> whereas this one doesn't?

Yep, sorry, picked the wrong commit message (but same commit contents).

Should I resend?

regards
john
Re: [PATCH v2 3/4] hw/vfio-user: wait for proxy close correctly
Posted by Cédric Le Goater 4 months ago
On 7/15/25 11:33, John Levon wrote:
> On Tue, Jul 15, 2025 at 10:01:59AM +0100, Mark Cave-Ayland wrote:
> 
>> On 15/07/2025 06:52, John Levon wrote:
>>
>>> Coverity reported:
>>>
>>> CID 1611806: Concurrent data access violations (BAD_CHECK_OF_WAIT_COND)
>>>
>>> A wait is performed without a loop. If there is a spurious wakeup, the
>>> condition may not be satisfied.
>>>
>>> Fix this by checking ->state for VFIO_PROXY_CLOSED in a loop.
>>>
>>> Resolves: Coverity CID 1611806
>>> Fixes: 0b3d881a ("vfio-user: implement message receive infrastructure")
>>> Signed-off-by: John Levon <john.levon@nutanix.com>
>>
>> Is this definitely the right patch? The v2 posted at
>> https://patchew.org/QEMU/20250711124500.1611628-1-john.levon@nutanix.com/
>> contains the updated commit message mentioning the rename of the callback,
>> whereas this one doesn't?
> 
> Yep, sorry, picked the wrong commit message (but same commit contents).
> 
> Should I resend?

Yep, please resend, and pick up the trailers when you do.

Thanks,

C.