The point of QIONetListener is to allow a server to listen to more
than one socket address at a time, and respond to clients in a
first-come-first-serve order across any of those addresses. While
some servers (like NBD) really do want to serve multiple simultaneous
clients, many other servers only care about the first client to
connect, and will immediately deregister the callback, possibly by
dropping their reference to the QIONetListener. The existing code
ensures that all other pending callbacks remain safe once the first
callback drops the listener, by adding an extra reference to the
listener for each GSource created, where those references pair to the
eventual teardown of each GSource after a given callback has been
serviced or aborted. But it is equally acceptable to hoist the
reference to the listener outside the loop - as long as there is a
callback function registered, it is sufficient to have a single
reference live for the entire array of sioc, rather than one reference
per sioc in the array.
Hoisting the reference like this will make it easier for an upcoming
patch to still ensure the listener cannot be prematurely garbage
collected during the user's callback, even when the callback no longer
uses a per-sioc GSource.
Signed-off-by: Eric Blake <eblake@redhat.com>
---
io/net-listener.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/io/net-listener.c b/io/net-listener.c
index afdacdd5ff4..ce29bf3c993 100644
--- a/io/net-listener.c
+++ b/io/net-listener.c
@@ -118,12 +118,14 @@ qio_net_listener_watch(QIONetListener *listener, size_t i, const char *caller)
}
trace_qio_net_listener_watch_enabled(listener, listener->io_func, caller);
- for ( ; i < listener->nsioc; i++) {
+ if (i == 0) {
object_ref(OBJECT(listener));
+ }
+ for ( ; i < listener->nsioc; i++) {
listener->io_source[i] = qio_channel_add_watch_source(
QIO_CHANNEL(listener->sioc[i]), G_IO_IN,
qio_net_listener_channel_func,
- listener, (GDestroyNotify)object_unref, listener->context);
+ listener, NULL, listener->context);
}
}
@@ -144,6 +146,7 @@ qio_net_listener_unwatch(QIONetListener *listener, const char *caller)
listener->io_source[i] = NULL;
}
}
+ object_unref(OBJECT(listener));
}
void qio_net_listener_add(QIONetListener *listener,
--
2.51.1
On Mon, Nov 03, 2025 at 02:10:57PM -0600, Eric Blake wrote:
> The point of QIONetListener is to allow a server to listen to more
> than one socket address at a time, and respond to clients in a
> first-come-first-serve order across any of those addresses. While
> some servers (like NBD) really do want to serve multiple simultaneous
> clients, many other servers only care about the first client to
> connect, and will immediately deregister the callback, possibly by
> dropping their reference to the QIONetListener. The existing code
> ensures that all other pending callbacks remain safe once the first
> callback drops the listener, by adding an extra reference to the
> listener for each GSource created, where those references pair to the
> eventual teardown of each GSource after a given callback has been
> serviced or aborted. But it is equally acceptable to hoist the
> reference to the listener outside the loop - as long as there is a
> callback function registered, it is sufficient to have a single
> reference live for the entire array of sioc, rather than one reference
> per sioc in the array.
>
> Hoisting the reference like this will make it easier for an upcoming
> patch to still ensure the listener cannot be prematurely garbage
> collected during the user's callback, even when the callback no longer
> uses a per-sioc GSource.
It isn't quite this simple. Glib reference counts the callback
func / data, holding a reference when dispatching the callback.
IOW, even if the GSource is unrefed, the callback 'notify'
function won't be called if the main loop is in the process
of dispatching.
With this change, the reference on 'listener' can now be
released even if the callback is currently dispatching.
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> io/net-listener.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/io/net-listener.c b/io/net-listener.c
> index afdacdd5ff4..ce29bf3c993 100644
> --- a/io/net-listener.c
> +++ b/io/net-listener.c
> @@ -118,12 +118,14 @@ qio_net_listener_watch(QIONetListener *listener, size_t i, const char *caller)
> }
>
> trace_qio_net_listener_watch_enabled(listener, listener->io_func, caller);
> - for ( ; i < listener->nsioc; i++) {
> + if (i == 0) {
> object_ref(OBJECT(listener));
> + }
> + for ( ; i < listener->nsioc; i++) {
> listener->io_source[i] = qio_channel_add_watch_source(
> QIO_CHANNEL(listener->sioc[i]), G_IO_IN,
> qio_net_listener_channel_func,
> - listener, (GDestroyNotify)object_unref, listener->context);
> + listener, NULL, listener->context);
> }
> }
>
> @@ -144,6 +146,7 @@ qio_net_listener_unwatch(QIONetListener *listener, const char *caller)
> listener->io_source[i] = NULL;
> }
> }
> + object_unref(OBJECT(listener));
> }
>
> void qio_net_listener_add(QIONetListener *listener,
> --
> 2.51.1
>
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Tue, Nov 04, 2025 at 11:13:48AM +0000, Daniel P. Berrangé wrote:
> On Mon, Nov 03, 2025 at 02:10:57PM -0600, Eric Blake wrote:
> > The point of QIONetListener is to allow a server to listen to more
> > than one socket address at a time, and respond to clients in a
> > first-come-first-serve order across any of those addresses. While
> > some servers (like NBD) really do want to serve multiple simultaneous
> > clients, many other servers only care about the first client to
> > connect, and will immediately deregister the callback, possibly by
> > dropping their reference to the QIONetListener. The existing code
> > ensures that all other pending callbacks remain safe once the first
> > callback drops the listener, by adding an extra reference to the
> > listener for each GSource created, where those references pair to the
> > eventual teardown of each GSource after a given callback has been
> > serviced or aborted. But it is equally acceptable to hoist the
> > reference to the listener outside the loop - as long as there is a
> > callback function registered, it is sufficient to have a single
> > reference live for the entire array of sioc, rather than one reference
> > per sioc in the array.
> >
> > Hoisting the reference like this will make it easier for an upcoming
> > patch to still ensure the listener cannot be prematurely garbage
> > collected during the user's callback, even when the callback no longer
> > uses a per-sioc GSource.
>
> It isn't quite this simple. Glib reference counts the callback
> func / data, holding a reference when dispatching the callback.
>
> IOW, even if the GSource is unrefed, the callback 'notify'
> function won't be called if the main loop is in the process
> of dispatching.
I'm not sure I follow your argument. Glib holds a reference on the
GSource object, not on the opaque data that is handed to the GSource.
It is possible to use g_source_set_callback_indirect() where GSource
can learn how to use the same reference counting on data as external
code, by the use of function pointers for ref and unref, but QIO uses
merely g_source_set_callback().
https://gitlab.gnome.org/GNOME/glib/-/blob/main/glib/gmain.c#L1844
shows that glib then wraps that opaque pointer into an internal
GSourceCallback object which itself is reference counted, so that the
notify function is not called until the GSource is finalized, but that
is reference counting on the container, not on the opaque object
itself (which in this patch is the QIONetListener).
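For illustration, this is roughly what the indirect form would look
like if QIO adopted it; g_source_set_callback_indirect() and
GSourceCallbackFuncs are real glib API, but the listener_cb_* helpers
are a hypothetical sketch, not existing QIO code:
static void listener_cb_ref(gpointer cb_data)
{
    /* reference the QIONetListener itself, not a glib container */
    object_ref(OBJECT(cb_data));
}
static void listener_cb_unref(gpointer cb_data)
{
    /* only dropped once glib is done dispatching the callback */
    object_unref(OBJECT(cb_data));
}
static void listener_cb_get(gpointer cb_data, GSource *source,
                            GSourceFunc *func, gpointer *data)
{
    *func = (GSourceFunc)qio_net_listener_channel_func;
    *data = cb_data;
}
static GSourceCallbackFuncs listener_cb_funcs = {
    .ref = listener_cb_ref,
    .unref = listener_cb_unref,
    .get = listener_cb_get,
};
/* instead of g_source_set_callback(src, func, listener, notify): */
g_source_set_callback_indirect(src, listener, &listener_cb_funcs);
With that, the ref/unref pair would land on the listener itself rather
than on glib's internal GSourceCallback container.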
>
> With this change, the reference on 'listener' can now be
> released even if the callback is currently dispatching.
So if I'm understanding your concern, you're worried that the unwatch
code can finish looping through the g_source_destroy and then reach
the point where it unrefs listener, but that a late-breaking client
connection can trigger a callback that can still be executing in
another thread/coroutine after the listener is unref'ed but before the
GSource has been finalized? If so, would squashing this in fix the
problem you are seeing?
diff --git i/io/net-listener.c w/io/net-listener.c
index 9f4e3c0be0c..1fcbbeb7a76 100644
--- i/io/net-listener.c
+++ w/io/net-listener.c
@@ -67,8 +67,10 @@ static void qio_net_listener_aio_func(void *opaque)
{
QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(opaque);
+ object_ref(OBJECT(sioc->listener));
qio_net_listener_channel_func(QIO_CHANNEL(sioc), G_IO_IN,
sioc->listener);
+ object_unref(OBJECT(sioc->listener));
}
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
On Wed, Nov 05, 2025 at 03:57:29PM -0600, Eric Blake wrote:
> On Tue, Nov 04, 2025 at 11:13:48AM +0000, Daniel P. Berrangé wrote:
> > On Mon, Nov 03, 2025 at 02:10:57PM -0600, Eric Blake wrote:
> > > The point of QIONetListener is to allow a server to listen to more
> > > than one socket address at a time, and respond to clients in a
> > > first-come-first-serve order across any of those addresses. While
> > > some servers (like NBD) really do want to serve multiple simultaneous
> > > clients, many other servers only care about the first client to
> > > connect, and will immediately deregister the callback, possibly by
> > > dropping their reference to the QIONetListener. The existing code
> > > ensures that all other pending callbacks remain safe once the first
> > > callback drops the listener, by adding an extra reference to the
> > > listener for each GSource created, where those references pair to the
> > > eventual teardown of each GSource after a given callback has been
> > > serviced or aborted. But it is equally acceptable to hoist the
> > > reference to the listener outside the loop - as long as there is a
> > > callback function registered, it is sufficient to have a single
> > > reference live for the entire array of sioc, rather than one reference
> > > per sioc in the array.
> > >
> > > Hoisting the reference like this will make it easier for an upcoming
> > > patch to still ensure the listener cannot be prematurely garbage
> > > collected during the user's callback, even when the callback no longer
> > > uses a per-sioc GSource.
> >
> > It isn't quite this simple. Glib reference counts the callback
> > func / data, holding a reference when dispatching the callback.
> >
> > IOW, even if the GSource is unrefed, the callback 'notify'
> > function won't be called if the main loop is in the process
> > of dispatching.
>
> I'm not sure I follow your argument. Glib holds a reference on the
> GSource object, not on the opaque data that is handed to the GSource.
> It is possible to use g_source_set_callback_indirect() where GSource
> can learn how to use the same reference counting on data as external
> code, by the use of function pointers for ref and unref, but QIO uses
> merely g_source_set_callback().
> https://gitlab.gnome.org/GNOME/glib/-/blob/main/glib/gmain.c#L1844
> shows that glib then wraps that opaque pointer into an internal
> GSourceCallback object which itself is reference counted, so that the
> notify function is not called until the GSource is finalized, but that
> is reference counting on the container, not on the opaque object
> itself (which in this patch is the QIONetListener).
>
> >
> > With this change, the reference on 'listener' can now be
> > released even if the callback is currently dispatching.
>
> So if I'm understanding your concern, you're worried that the unwatch
> code can finish looping through the g_source_destroy and then reach
> the point where it unrefs listener, but that a late-breaking client
> connection can trigger a callback that can still be executing in
> another thread/coroutine after the listener is unref'ed but before the
> GSource has been finalized? If so, would squashing this in fix the
> problem you are seeing?
Consider the following scenario, where we have two threads, one
is calling QIONetListener APIs, and one is the event thread.
In the current code, when we unref() the GSource for the socket
watch, the destroy-notify does not get called, because the
event thread is in the middle of a dispatch callback for the
I/O event. When the dispatch callback returns control to the
event loop, the GSourceCallback is unrefed, and this triggers
the destroy-notify call, which unrefs the listener.
The flow looks like this:
Thread 1:
qio_net_listener_set_client_func(lstnr, f, ...);
=> foreach sock: socket
=> object_ref(lstnr)
=> sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, object_unref);
Thread 2:
poll()
=> event POLLIN on socket
=> ref(GSourceCallback)
=> call dispatch(sock)
...do stuff..
Thread 1:
qio_net_listener_set_client_func(lstnr, NULL, ...);
=> foreach sock: socket
=> g_source_unref(sock_src)
unref(lstnr) (the final reference)
=> finalize(lstnr)
Thread 2:
=> return dispatch(sock)
=> unref(GSourceCallback)
=> destroy-notify
=> object_unref
> diff --git i/io/net-listener.c w/io/net-listener.c
> index 9f4e3c0be0c..1fcbbeb7a76 100644
> --- i/io/net-listener.c
> +++ w/io/net-listener.c
> @@ -67,8 +67,10 @@ static void qio_net_listener_aio_func(void *opaque)
> {
> QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(opaque);
>
> + object_ref(OBJECT(sioc->listener));
> qio_net_listener_channel_func(QIO_CHANNEL(sioc), G_IO_IN,
> sioc->listener);
> + object_unref(OBJECT(sioc->listener));
> }
Now consider this patch, plus this extra hunk...
Thread 1:
qio_net_listener_set_client_func(lstnr, f, ...);
=> object_ref(listener)
=> foreach sock: socket
=> sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, NULL);
Thread 2:
poll()
=> event POLLIN on socket
=> call dispatch(sock)
=> ref(lstnr)
...do stuff..
Thread 1:
qio_net_listener_set_client_func(lstnr, NULL, ...);
=> foreach sock: socket
=> g_source_unref(sock_src)
=> object_unref(listener)
unref(lstnr) (still 1 reference left)
Thread 2:
=> unref(lstnr) (the final reference)
=> finalize(lstnr)
=> return dispatch(sock)
=> unref(GSourceCallback)
That appears to work ok, however, there's still a race window that is
not solved. Between the time thread 2 sees POLLIN, and when it calls
the dispatch(sock) function, it is possible that thread 1 will drop
the last reference:
Thread 1:
qio_net_listener_set_client_func(lstnr, f, ...);
=> object_ref(listener)
=> foreach sock: socket
=> sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, NULL);
Thread 2:
poll()
=> event POLLIN on socket
Thread 1:
qio_net_listener_set_client_func(lstnr, NULL, ...);
=> foreach sock: socket
=> g_source_unref(sock_src)
=> object_unref(listener)
unref(lstnr) (still 1 reference left)
Thread 2:
=> call dispatch(sock)
=> ref(lstnr)
...do stuff..
=> unref(lstnr) (the final reference)
=> finalize(lstnr)
=> return dispatch(sock)
=> unref(GSourceCallback)
I don't see a way to solve this without synchronization with the event
loop for releasing the reference on the opaque data for the dispatcher
callback. That's what the current code does, but I'm seeing no way for
the AioContext event loop callbacks to have anything equivalent. This
feels like a gap in the AioContext design.
This is admittedly an incredibly hard to trigger race condition. It would
need a client to be calling a QMP command that tears down the NBD server,
at the exact same time as a new NBD client was incoming. Or the same kind
of scenario for other pieces of QEMU code using QIONetListener. This still
makes me worried though, as rare races have a habit of hitting QEMU
eventually.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Tue, Nov 11, 2025 at 02:43:16PM +0000, Daniel P. Berrangé wrote:
> > > > Hoisting the reference like this will make it easier for an upcoming
> > > > patch to still ensure the listener cannot be prematurely garbage
> > > > collected during the user's callback, even when the callback no longer
> > > > uses a per-sioc GSource.
> > >
> > > It isn't quite this simple. Glib reference counts the callback
> > > func / data, holding a reference when dispatching the callback.
> > >
> > > IOW, even if the GSource is unrefed, the callback 'notify'
> > > function won't be called if the main loop is in the process
> > > of dispatching.
> >
> > I'm not sure I follow your argument. Glib holds a reference on the
> > GSource object, not on the opaque data that is handed to the GSource.
> > It is possible to use g_source_set_callback_indirect() where GSource
> > can learn how to use the same reference counting on data as external
> > code, by the use of function pointers for ref and unref, but QIO uses
> > merely g_source_set_callback().
> > https://gitlab.gnome.org/GNOME/glib/-/blob/main/glib/gmain.c#L1844
> > shows that glib then wraps that opaque pointer into an internal
> > GSourceCallback object which itself is reference counted, so that the
> > notify function is not called until the GSource is finalized, but that
> > is reference counting on the container, not on the opaque object
> > itself (which in this patch is the QIONetListener).
> >
> > >
> > > With this change, the reference on 'listener' can now be
> > > released even if the callback is currently dispatching.
> >
> > So if I'm understanding your concern, you're worried that the unwatch
> > code can finish looping through the g_source_destroy and then reach
> > the point where it unrefs listener, but that a late-breaking client
> > connection can trigger a callback that can still be executing in
> > another thread/coroutine after the listener is unref'ed but before the
> > GSource has been finalized? If so, would squashing this in fix the
> > problem you are seeing?
>
> Consider the following scenario, where we have two threads, one
> is calling QIONetListener APIs, and one is the event thread.
>
> In the current code, when we unref() the GSource for the socket
> watch, the destroy-notify does not get called, because the
> event thread is in the middle of a dispatch callback for the
> I/O event. When the dispatch callback returns control to the
> event loop, the GSourceCallback is unrefed, and this triggers
> the destroy-notify call, which unrefs the listener.
>
> The flow looks like this:
>
> Thread 1:
> qio_net_listener_set_client_func(lstnr, f, ...);
> => foreach sock: socket
> => object_ref(lstnr)
> => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, object_unref);
>
> Thread 2:
> poll()
> => event POLLIN on socket
> => ref(GSourceCallback)
> => call dispatch(sock)
> ...do stuff..
>
> Thread 1:
> qio_net_listener_set_client_func(lstnr, NULL, ...);
> => foreach sock: socket
> => g_source_unref(sock_src)
> unref(lstnr) (the final reference)
> => finalize(lstnr)
If I understand correctly, _this_ unref(lstnr) is NOT the final
reference, because of the additional reference still owned by the
GSource that is still dispatching, so finalize(lstnr) is not reached
here...
>
> Thread 2:
> => return dispatch(sock)
> => unref(GSourceCallback)
> => destroy-notify
> => object_unref
...but instead here. And that is the desirable property of the
pre-patch behavior with a per-GSource reference on the listener object
- we are guaranteed that listener can't be finalized while there are
any pending dispatch in flight, even if the caller has unref'd their
last mention of lstnr, and therefore the dispatch never has a
use-after-free.
>
>
>
> > diff --git i/io/net-listener.c w/io/net-listener.c
> > index 9f4e3c0be0c..1fcbbeb7a76 100644
> > --- i/io/net-listener.c
> > +++ w/io/net-listener.c
> > @@ -67,8 +67,10 @@ static void qio_net_listener_aio_func(void *opaque)
> > {
> > QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(opaque);
> >
> > + object_ref(OBJECT(sioc->listener));
> > qio_net_listener_channel_func(QIO_CHANNEL(sioc), G_IO_IN,
> > sioc->listener);
> > + object_unref(OBJECT(sioc->listener));
> > }
>
> Now consider this patch, plus this extra hunk...
>
> Thread 1:
> qio_net_listener_set_client_func(lstnr, f, ...);
> => object_ref(listener)
> => foreach sock: socket
> => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, NULL);
>
> Thread 2:
> poll()
> => event POLLIN on socket
> => call dispatch(sock)
> => ref(lstnr)
> ...do stuff..
>
> Thread 1:
> qio_net_listener_set_client_func(lstnr, NULL, ...);
> => foreach sock: socket
> => g_source_unref(sock_src)
> => object_unref(listener)
> unref(lstnr) (still 1 reference left)
>
> Thread 2:
> => unref(lstnr) (the final reference)
> => finalize(lstnr)
> => return dispatch(sock)
> => unref(GSourceCallback)
>
Yes, that's the race I plugged between v1 and v2 by adding the
referencing around the client callback.
>
> That appears to work ok, however, there's still a race window that is
> not solved. Between the time thread 2 sees POLLIN, and when it calls
> the dispatch(sock) function, it is possible that thread 1 will drop
> the last reference:
>
>
>
> Thread 1:
> qio_net_listener_set_client_func(lstnr, f, ...);
> => object_ref(listener)
> => foreach sock: socket
> => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, NULL);
>
> Thread 2:
> poll()
> => event POLLIN on socket
>
> Thread 1:
> qio_net_listener_set_client_func(lstnr, NULL, ...);
> => foreach sock: socket
> => g_source_unref(sock_src)
> => object_unref(listener)
> unref(lstnr) (still 1 reference left)
Another copy-paste problem? I think your argument is that if we lose
this race, this particular unref() drops the count to 0 triggering
finalize() early...
>
> Thread 2:
> => call dispatch(sock)
> => ref(lstnr)
...before this one can compensate, and thus having a UAF for a
finalized lstnr passed to the callback. Yes, I can see the race now;
thanks for persisting with the explanation.
> ...do stuff..
> => unref(lstnr) (the final reference)
> => finalize(lstnr)
> => return dispatch(sock)
> => unref(GSourceCallback)
>
>
> I don't see a way to solve this without synchronization with the event
> loop for releasing the reference on the opaque data for the dispatcher
> callback. That's what the current code does, but I'm seeing no way for
> the AioContext event loop callbacks to have anything equivalent. This
> feels like a gap in the AioContext design.
Yes, that was one thing I was frustrated with while writing v2 - the
AioContext code simply lacks a notify hook, even though it is
demonstrably useful with GSource. But plumbing in a notify hook is
more invasive, and puts the series at risk of missing 10.2. Still,
it's something worth getting right, so I'll try a v3.
>
> This is admittedly an incredibly hard to trigger race condition. It would
> need a client to be calling a QMP command that tears down the NBD server,
> at the exact same time as a new NBD client was incoming. Or the same kind
> of scenario for other pieces of QEMU code using QIONetListener. This still
> makes me worried though, as rare races have a habit of hitting QEMU
> eventually.
Indeed. And a hard-to-trigger race is, in some regards, worse than a
deadlock because it's harder to reproduce and prove whether it is
fixed.
I'm thinking about whether any of the GSourceCallback solution (using
a reference-counter wrapper around the user's original opaque data to
prove how many in-flight callbacks are still being dispatched) may be
usable with AioContext.
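Roughly this shape, where every name is hypothetical:
typedef struct QIOCallbackRef {
    int refcount;          /* the owner plus any in-flight dispatches */
    void *opaque;          /* the user's original data (here, the listener) */
    GDestroyNotify notify; /* fires once the last reference is dropped */
} QIOCallbackRef;
static QIOCallbackRef *qio_callback_ref_new(void *opaque,
                                            GDestroyNotify notify)
{
    QIOCallbackRef *cb = g_new0(QIOCallbackRef, 1);
    cb->refcount = 1;
    cb->opaque = opaque;
    cb->notify = notify;
    return cb;
}
static void qio_callback_ref_get(QIOCallbackRef *cb)
{
    qatomic_inc(&cb->refcount);
}
static void qio_callback_ref_put(QIOCallbackRef *cb)
{
    if (qatomic_fetch_dec(&cb->refcount) == 1) {
        if (cb->notify) {
            cb->notify(cb->opaque);
        }
        g_free(cb);
    }
}
The catch is the race you identified: the event loop itself would have
to call qio_callback_ref_get() before invoking the handler, so
AioContext still needs a hook at dispatch time for this to be airtight.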
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
On Tue, Nov 11, 2025 at 01:09:24PM -0600, Eric Blake wrote:
> On Tue, Nov 11, 2025 at 02:43:16PM +0000, Daniel P. Berrangé wrote:
> > > > > Hoisting the reference like this will make it easier for an upcoming
> > > > > patch to still ensure the listener cannot be prematurely garbage
> > > > > collected during the user's callback, even when the callback no longer
> > > > > uses a per-sioc GSource.
> > > >
> > > > It isn't quite this simple. Glib reference counts the callback
> > > > func / data, holding a reference when dispatching the callback.
> > > >
> > > > IOW, even if the GSource is unrefed, the callback 'notify'
> > > > function won't be called if the main loop is in the process
> > > > of dispatching.
> > >
> > > I'm not sure I follow your argument. Glib holds a reference on the
> > > GSource object, not on the opaque data that is handed to the GSource.
> > > It is possible to use g_source_set_callback_indirect() where GSource
> > > can learn how to use the same reference counting on data as external
> > > code, by the use of function pointers for ref and unref, but QIO uses
> > > merely g_source_set_callback().
> > > https://gitlab.gnome.org/GNOME/glib/-/blob/main/glib/gmain.c#L1844
> > > shows that glib then wraps that opaque pointer into an internal
> > > GSourceCallback object which itself is reference counted, so that the
> > > notify function is not called until the GSource is finalized, but that
> > > is reference counting on the container, not on the opaque object
> > > itself (which in this patch is the QIONetListener).
> > >
> > > >
> > > > With this change, the reference on 'listener' can now be
> > > > released even if the callback is currently dispatching.
> > >
> > > So if I'm understanding your concern, you're worried that the unwatch
> > > code can finish looping through the g_source_destroy and then reach
> > > the point where it unrefs listener, but that a late-breaking client
> > > connection can trigger a callback that can still be executing in
> > > another thread/coroutine after the listener is unref'ed but before the
> > > GSource has been finalized? If so, would squashing this in fix the
> > > problem you are seeing?
> >
> > Consider the following scenario, where we have two threads, one
> > is calling QIONetListener APIs, and one is the event thread.
> >
> > In the current code, when we unref() the GSource for the socket
> > watch, the destroy-notify does not get called, because the
> > event thread is in the middle of a dispatch callback for the
> > I/O event. When the dispatch callback returns control to the
> > event loop, the GSourceCallback is unrefed, and this triggers
> > the destroy-notify call, which unrefs the listener.
> >
> > The flow looks like this:
> >
> > Thread 1:
> > qio_net_listener_set_client_func(lstnr, f, ...);
> > => foreach sock: socket
> > => object_ref(lstnr)
> > => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, object_unref);
> >
> > Thread 2:
> > poll()
> > => event POLLIN on socket
> > => ref(GSourceCallback)
> > => call dispatch(sock)
> > ...do stuff..
> >
> > Thread 1:
> > qio_net_listener_set_client_func(lstnr, NULL, ...);
> > => foreach sock: socket
> > => g_source_unref(sock_src)
> > unref(lstnr) (the final reference)
> > => finalize(lstnr)
>
> If I understand correctly, _this_ unref(lstnr) is NOT the final
> reference, because of the additional reference still owned by the
> GSource that is still dispatching, so finalize(lstnr) is not reached
> here...
Sorry, yes, my bad.
>
> >
> > Thread 2:
> > => return dispatch(sock)
> > => unref(GSourceCallback)
> > => destroy-notify
> > => object_unref
>
> ...but instead here. And that is the desirable property of the
> pre-patch behavior with a per-GSource reference on the listener object
> - we are guaranteed that listener can't be finalized while there are
> any pending dispatch in flight, even if the caller has unref'd their
> last mention of lstnr, and therefore the dispatch never has a
> use-after-free.
Correct.
> > That appears to work ok, however, there's still a race window that is
> > not solved. Between the time thread 2 sees POLLIN, and when it calls
> > the dispatch(sock) function, it is possible that thread 1 will drop
> > the last reference:
> >
> > Thread 1:
> > qio_net_listener_set_client_func(lstnr, f, ...);
> > => object_ref(listener)
> > => foreach sock: socket
> > => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, NULL);
> >
> > Thread 2:
> > poll()
> > => event POLLIN on socket
> >
> > Thread 1:
> > qio_net_listener_set_client_func(lstnr, NULL, ...);
> > => foreach sock: socket
> > => g_source_unref(sock_src)
> > => object_unref(listener)
> > unref(lstnr) (still 1 reference left)
>
> Another copy-paste problem? I think your argument is that if we lose
> this race, this particular unref() drops the count to 0 triggering
> finalize() early...
Sigh, yes, somehow I got the examples inverted slightly. It was correct
in my head at least :-)
>
> >
> > Thread 2:
> > => call dispatch(sock)
> > => ref(lstnr)
>
> ...before this one can compensate, and thus having a UAF for a
> finalized lstnr passed to the callback. Yes, I can see the race now;
> thanks for persisting with the explanation.
>
> > ...do stuff..
> > => unref(lstnr) (the final reference)
> > => finalize(lstnr)
> > => return dispatch(sock)
> > => unref(GSourceCallback)
> >
> > I don't see a way to solve this without synchronization with the event
> > loop for releasing the reference on the opaque data for the dispatcher
> > callback. That's what the current code does, but I'm seeing no way for
> > the AioContext event loop callbacks to have anything equivalent. This
> > feels like a gap in the AioContext design.
>
> Yes, that was one thing I was frustrated with while writing v2 - the
> AioContext code simply lacks a notify hook, even though it is
> demonstrably useful with GSource. But plumbing in a notify hook is
> more invasive, and puts the series at risk of missing 10.2. Still,
> it's something worth getting right, so I'll try a v3.
I don't mind a short-term NBD-only solution for 10.2 if we think we can
rationalize that the specific NBD usage pattern is safe, if it is easier
to delay AioContext thoughts until the 11.0 dev cycle.
> > This is admittedly an incredibly hard to trigger race condition. It would
> > need a client to be calling a QMP command that tears down the NBD server,
> > at the exact same time as a new NBD client was incoming. Or the same kind
> > of scenario for other pieces of QEMU code using QIONetListener. This still
> > makes me worried though, as rare races have a habit of hitting QEMU
> > eventually.
>
> Indeed. And a hard-to-trigger race is, in some regards, worse than a
> deadlock because it's harder to reproduce and prove whether it is
> fixed.
>
> I'm thinking about whether any of the GSourceCallback solution (using
> a reference-counter wrapper around the user's original opaque data to
> prove how many in-flight callbacks are still being dispatched) may be
> usable with AioContext.
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.
> Virtualization: qemu.org | libguestfs.org
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Tue, Nov 11, 2025 at 01:09:24PM -0600, Eric Blake wrote:
> > In the current code, when we unref() the GSource for the socket
> > watch, the destroy-notify does not get called, because the
> > event thread is in the middle of a dispatch callback for the
> > I/O event. When the dispatch callback returns control to the
> > event loop, the GSourceCallback is unrefed, and this triggers
> > the destroy-notify call, which unrefs the listener.
> >
> > The flow looks like this:
> >
> > Thread 1:
> > qio_net_listener_set_client_func(lstnr, f, ...);
> > => foreach sock: socket
> > => object_ref(lstnr)
> > => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, object_unref);
> >
> > Thread 2:
> > poll()
> > => event POLLIN on socket
> > => ref(GSourceCallback)
> > => call dispatch(sock)
> > ...do stuff..
> >
> > Thread 1:
> > qio_net_listener_set_client_func(lstnr, NULL, ...);
> > => foreach sock: socket
> > => g_source_unref(sock_src)
> > unref(lstnr) (the final reference)
> > => finalize(lstnr)
>
> If I understand correctly, _this_ unref(lstnr) is NOT the final
> reference, because of the additional reference still owned by the
> GSource that is still dispatching, so finalize(lstnr) is not reached
> here...
>
> >
> > Thread 2:
> > => return dispatch(sock)
> > => unref(GSourceCallback)
> > => destroy-notify
> > => object_unref
>
> ...but instead here. And that is the desirable property of the
> > pre-patch behavior with a per-GSource reference on the listener object
> - we are guaranteed that listener can't be finalized while there are
> any pending dispatch in flight, even if the caller has unref'd their
> last mention of lstnr, and therefore the dispatch never has a
> use-after-free.
But thinking further, even if it is not an object_unref, but merely a
change of the async callback function, is this the sort of thing where
a mutex is needed to make things safer?
Even without my patch series, we have this scenario:
Thread 1:
qio_net_listener_set_client_func(lstnr, f1, ...);
=> foreach sock: socket
=> object_ref(lstnr)
=> sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, object_unref);
Thread 2:
poll()
=> event POLLIN on socket
=> ref(GSourceCallback) // while lstnr->io_func is f1
...do stuff..
Thread 1:
qio_net_listener_set_client_func(lstnr, f2, ...);
=> foreach sock: socket
=> g_source_unref(sock_src)
=> foreach sock: socket
=> object_ref(lstnr)
=> sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, object_unref);
Thread 2:
=> access lstnr->io_func // now sees f2
=> call f2(sock)
=> return dispatch(sock)
=> unref(GSourceCallback)
=> destroy-notify
=> object_unref
So even though thread 2 noticed a client while f1 was registered,
because we lack a mutex and are not using atomic access primitives,
there is nothing preventing thread 1 from changing the io_func from f1
to f2 before thread 2 finally calls the callback. And if f2 is NULL
(because the caller is deregistering the callback), I could see the
race resulting in qio_net_listener_channel_func() with its pre-patch
code of:
if (listener->io_func) {
listener->io_func(listener, sioc, listener->io_data);
}
resulting in a NULL-pointer deref if thread 1 changed out
listener->io_func after the if but before the dispatch.
Adding a mutex might make QIONetListener slower in the time to grab
the mutex (even if it is uncontended most of the time), but if we have
a proper mutex in place, we could at least guarantee that
listener->io_func, listener->io_data, and listener->io_notify are
changed as a group, rather than sharded if the polling thread is
managing a dispatch with listener as its opaque data at the same time
the user's thread is trying to change the registered callback.
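Concretely, something like this is what I have in mind - a sketch only,
since QIONetListener has no lock field today:
void qio_net_listener_set_client_func(QIONetListener *listener,
                                      QIONetListenerClientFunc func,
                                      gpointer data,
                                      GDestroyNotify notify)
{
    qemu_mutex_lock(&listener->lock);  /* hypothetical new QemuMutex */
    /* replace func/data/notify as one group */
    if (listener->io_notify) {
        listener->io_notify(listener->io_data);
    }
    listener->io_func = func;
    listener->io_data = data;
    listener->io_notify = notify;
    qemu_mutex_unlock(&listener->lock);
    /* ...existing watch/unwatch handling... */
}
and on the dispatch side, snapshot under the same lock and call outside
it:
    QIONetListenerClientFunc func;
    gpointer data;
    qemu_mutex_lock(&listener->lock);
    func = listener->io_func;
    data = listener->io_data;
    qemu_mutex_unlock(&listener->lock);
    if (func) {
        /* the NULL check and the call use one coherent snapshot */
        func(listener, sioc, data);
    }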
By itself, GSource DOES have a mutex lock around the GMainContext it
is attached to; but our particular use of GSource is operating outside
that locking scheme, so it looks like I've uncovered another latent
problem.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
On Tue, Nov 11, 2025 at 02:07:50PM -0600, Eric Blake wrote:
> On Tue, Nov 11, 2025 at 01:09:24PM -0600, Eric Blake wrote:
> > > In the current code, when we unref() the GSource for the socket
> > > watch, the destroy-notify does not get called, because the
> > > event thread is in the middle of a dispatch callback for the
> > > I/O event. When the dispatch callback returns control to the
> > > event loop, the GSourceCallback is unrefed, and this triggers
> > > the destroy-notify call, which unrefs the listener.
> > >
> > > The flow looks like this:
> > >
> > > Thread 1:
> > > qio_net_listener_set_client_func(lstnr, f, ...);
> > > => foreach sock: socket
> > > => object_ref(lstnr)
> > > => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, object_unref);
> > >
> > > Thread 2:
> > > poll()
> > > => event POLLIN on socket
> > > => ref(GSourceCallback)
> > > => call dispatch(sock)
> > > ...do stuff..
> > >
> > > Thread 1:
> > > qio_net_listener_set_client_func(lstnr, NULL, ...);
> > > => foreach sock: socket
> > > => g_source_unref(sock_src)
> > > unref(lstnr) (the final reference)
> > > => finalize(lstnr)
> >
> > If I understand correctly, _this_ unref(lstnr) is NOT the final
> > reference, because of the additional reference still owned by the
> > GSource that is still dispatching, so finalize(lstnr) is not reached
> > here...
> >
> > >
> > > Thread 2:
> > > => return dispatch(sock)
> > > => unref(GSourceCallback)
> > > => destroy-notify
> > > => object_unref
> >
> > ...but instead here. And that is the desirable property of the
> > pre-patch behavior with a per-GSource reference on the listsner object
> > - we are guaranteed that listener can't be finalized while there are
> > any pending dispatch in flight, even if the caller has unref'd their
> > last mention of lstnr, and therefore the dispatch never has a
> > use-after-free.
>
> But thinking further, even if it is not an object_unref, but merely a
> change of the async callback function, is this the sort of thing where
> a mutex is needed to make things safer?
>
> Even without my patch series, we have this scenario:
>
> Thread 1:
> qio_net_listener_set_client_func(lstnr, f1, ...);
> => foreach sock: socket
> => object_ref(lstnr)
> => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, object_unref);
>
> Thread 2:
> poll()
> => event POLLIN on socket
> => ref(GSourceCallback) // while lstnr->io_func is f1
> ...do stuff..
>
> Thread 1:
> qio_net_listener_set_client_func(lstnr, f2, ...);
> => foreach sock: socket
> => g_source_unref(sock_src)
> => foreach sock: socket
> => object_ref(lstnr)
> => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, object_unref);
>
> Thread 2:
> => access lstnr->io_func // now sees f2
> => call f2(sock)
> => return dispatch(sock)
> => unref(GSourceCallback)
> => destroy-notify
> => object_unref
>
> So even though thread 2 noticed a client while f1 was registered,
> because we lack a mutex and are not using atomic access primitives,
> there is nothing preventing thread 1 from changing the io_func from f1
> to f2 before thread 2 finally calls the callback. And if f2 is NULL
> (because the caller is deregistering the callback), I could see the
> race resulting in qio_net_listener_channel_func() with its pre-patch
> code of:
>
> if (listener->io_func) {
> listener->io_func(listener, sioc, listener->io_data);
> }
>
> resulting in a NULL-pointer deref if thread 1 changed out
> listener->io_func after the if but before the dispatch.
Hmm, yes, that is a risk I hadn't thought of before.
> Adding a mutex might make QIONetListener slower in the time to grab
> the mutex (even if it is uncontended most of the time), but if we have
> a proper mutex in place, we could at least guarantee that
> listener->io_func, listener->io_data, and listener->io_notify are
> changed as a group, rather than sharded if the polling thread is
> managing a dispatch with listener as its opaque data at the same time
> the user's thread is trying to change the registered callback.
Yeah, a mutex should be ok, as it will be uncontended 99.999% of
the time.
> By itself, GSource DOES have a mutex lock around the GMainContext it
> is attached to; but our particular use of GSource is operating outside
> that locking scheme, so it looks like I've uncovered another latent
> problem.
Agreed.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Am 11.11.2025 um 15:43 hat Daniel P. Berrangé geschrieben:
> On Wed, Nov 05, 2025 at 03:57:29PM -0600, Eric Blake wrote:
> > On Tue, Nov 04, 2025 at 11:13:48AM +0000, Daniel P. Berrangé wrote:
> > > On Mon, Nov 03, 2025 at 02:10:57PM -0600, Eric Blake wrote:
> > > > The point of QIONetListener is to allow a server to listen to more
> > > > than one socket address at a time, and respond to clients in a
> > > > first-come-first-serve order across any of those addresses. While
> > > > some servers (like NBD) really do want to serve multiple simultaneous
> > > > clients, many other servers only care about the first client to
> > > > connect, and will immediately deregister the callback, possibly by
> > > > dropping their reference to the QIONetListener. The existing code
> > > > ensures that all other pending callbacks remain safe once the first
> > > > callback drops the listener, by adding an extra reference to the
> > > > listener for each GSource created, where those references pair to the
> > > > eventual teardown of each GSource after a given callback has been
> > > > serviced or aborted. But it is equally acceptable to hoist the
> > > > reference to the listener outside the loop - as long as there is a
> > > > callback function registered, it is sufficient to have a single
> > > > reference live for the entire array of sioc, rather than one reference
> > > > per sioc in the array.
> > > >
> > > > Hoisting the reference like this will make it easier for an upcoming
> > > > patch to still ensure the listener cannot be prematurely garbage
> > > > collected during the user's callback, even when the callback no longer
> > > > uses a per-sioc GSource.
> > >
> > > It isn't quite this simple. Glib reference counts the callback
> > > func / data, holding a reference when dispatching the callback.
> > >
> > > IOW, even if the GSource is unrefed, the callback 'notify'
> > > function won't be called if the main loop is in the process
> > > of dispatching.
> >
> > I'm not sure I follow your argument. Glib holds a reference on the
> > GSource object, not on the opaque data that is handed to the GSource.
> > It is possible to use g_source_set_callback_indirect() where GSource
> > can learn how to use the same reference counting on data as external
> > code, by the use of function pointers for ref and unref, but QIO uses
> > merely g_source_set_callback().
> > https://gitlab.gnome.org/GNOME/glib/-/blob/main/glib/gmain.c#L1844
> > shows that glib then wraps that opaque pointer into an internal
> > GSourceCallback object which itself is reference counted, so that the
> > notify function is not called until the GSource is finalized, but that
> > is reference counting on the container, not on the opaque object
> > itself (which in this patch is the QIONetListener).
> >
> > >
> > > With this change, the reference on 'listener' can now be
> > > released even if the callback is currently dispatching.
> >
> > So if I'm understanding your concern, you're worried that the unwatch
> > code can finish looping through the g_source_destroy and then reach
> > the point where it unrefs listener, but that a late-breaking client
> > connection can trigger a callback that can still be executing in
> > another thread/coroutine after the listener is unref'ed but before the
> > GSource has been finalized? If so, would squashing this in fix the
> > problem you are seeing?
>
> Consider the following scenario, where we have two threads, one
> is calling QIONetListener APIs, and one is the event thread.
>
> In the current code, when we unref() the GSource for the socket
> watch, the destroy-notify does not get called, because the
> event thread is in the middle of a dispatch callback for the
> I/O event. When the dispatch callback returns control to the
> event loop, the GSourceCallback is unrefed, and this triggers
> the destroy-notify call, which unrefs the listener.
>
> The flow looks like this:
>
> Thread 1:
> qio_net_listener_set_client_func(lstnr, f, ...);
> => foreach sock: socket
> => object_ref(lstnr)
> => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, object_unref);
>
> Thread 2:
> poll()
> => event POLLIN on socket
> => ref(GSourceCallback)
> => call dispatch(sock)
> ...do stuff..
>
> Thread 1:
> qio_net_listener_set_client_func(lstnr, NULL, ...);
> => foreach sock: socket
> => g_source_unref(sock_src)
> unref(lstnr) (the final reference)
> => finalize(lstnr)
>
> Thread 2:
> => return dispatch(sock)
> => unref(GSourceCallback)
> => destroy-notify
> => object_unref
>
>
>
> > diff --git i/io/net-listener.c w/io/net-listener.c
> > index 9f4e3c0be0c..1fcbbeb7a76 100644
> > --- i/io/net-listener.c
> > +++ w/io/net-listener.c
> > @@ -67,8 +67,10 @@ static void qio_net_listener_aio_func(void *opaque)
> > {
> > QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(opaque);
> >
> > + object_ref(OBJECT(sioc->listener));
> > qio_net_listener_channel_func(QIO_CHANNEL(sioc), G_IO_IN,
> > sioc->listener);
> > + object_unref(OBJECT(sioc->listener));
> > }
>
> Now consider this patch, plus this extra hunk...
>
> Thread 1:
> qio_net_listener_set_client_func(lstnr, f, ...);
> => object_ref(listener)
> => foreach sock: socket
> => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, NULL);
>
> Thread 2:
> poll()
> => event POLLIN on socket
> => call dispatch(sock)
> => ref(lstnr)
> ...do stuff..
>
> Thread 1:
> qio_net_listener_set_client_func(lstnr, NULL, ...);
> => foreach sock: socket
> => g_source_unref(sock_src)
> => object_unref(listener)
> unref(lstnr) (still 1 reference left)
>
> Thread 2:
> => unref(lstnr) (the final reference)
> => finalize(lstnr)
> => return dispatch(sock)
> => unref(GSourceCallback)
>
>
> That appears to work ok, however, there's still a race window that is
> not solved. Between the time thread 2 sees POLLIN, and when it calls
> the dispatch(sock) function, it is possible that thread 1 will drop
> the last reference:
>
>
>
> Thread 1:
> qio_net_listener_set_client_func(lstnr, f, ...);
> => object_ref(listener)
> => foreach sock: socket
> => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, NULL);
>
> Thread 2:
> poll()
> => event POLLIN on socket
>
> Thread 1:
> qio_net_listener_set_client_func(lstnr, NULL, ...);
> => foreach sock: socket
> => g_source_unref(sock_src)
> => object_unref(listener)
> unref(lstnr) (still 1 reference left)
Is what you're worried about that there is still a reference left in
the opaque pointer of an fd handler, but it's not in the refcount and
therefore this already frees the listener while thread 2 will still
access it?
>
> Thread 2:
> => call dispatch(sock)
> => ref(lstnr)
> ...do stuff..
> => unref(lstnr) (the final reference)
> => finalize(lstnr)
> => return dispatch(sock)
> => unref(GSourceCallback)
>
>
> I don't see a way to solve this without synchronization with the event
> loop for releasing the reference on the opaque data for the dispatcher
> callback. That's what the current code does, but I'm seeing no way for
> the AioContext event loop callbacks to have anything equivalent. This
> feels like a gap in the AioContext design.
I think the way you would normally do this is schedule a BH in thread 2
to do the critical work. If you delete the fd handler and unref the
listener in thread 2, then there is no race.
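Something like this, with invented names (and assuming 'ctx' is the
AioContext whose event loop owns the fd handlers):
static void listener_teardown_bh(void *opaque)
{
    QIONetListener *listener = opaque;
    /* Runs in the event loop thread itself, so no dispatch can be
     * mid-flight while we deregister and drop the reference. */
    qio_net_listener_set_client_func(listener, NULL, NULL, NULL);
    object_unref(OBJECT(listener));
}
/* from the other thread, instead of tearing down directly: */
aio_bh_schedule_oneshot(ctx, listener_teardown_bh, listener);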
But maybe adding a callback for node deletion in AioHandler wouldn't
hurt because the opaque pointer pretty much always references something
and doing an unref when deleting the AioHandler should be a pretty
common pattern.
> This is admittedly an incredibly hard to trigger race condition. It would
> need a client to be calling a QMP command that tears down the NBD server,
> at the exact same time as a new NBD client was incoming. Or the same kind
> of scenario for other pieces of QEMU code using QIONetListener. This still
> makes me worried though, as rare races have a habit of hitting QEMU
> eventually.
Aren't both QMP and incoming NBD connections always handled in the main
thread? I'm not sure if I know a case where we would actually get this
pattern with two different threads today. Of course, that doesn't mean
that we couldn't get it in the future.
Kevin
On Tue, Nov 11, 2025 at 04:48:04PM +0100, Kevin Wolf wrote:
> Am 11.11.2025 um 15:43 hat Daniel P. Berrangé geschrieben:
> > On Wed, Nov 05, 2025 at 03:57:29PM -0600, Eric Blake wrote:
> > > On Tue, Nov 04, 2025 at 11:13:48AM +0000, Daniel P. Berrangé wrote:
> > > > On Mon, Nov 03, 2025 at 02:10:57PM -0600, Eric Blake wrote:
> > > > > The point of QIONetListener is to allow a server to listen to more
> > > > > than one socket address at a time, and respond to clients in a
> > > > > first-come-first-serve order across any of those addresses. While
> > > > > some servers (like NBD) really do want to serve multiple simultaneous
> > > > > clients, many other servers only care about the first client to
> > > > > connect, and will immediately deregister the callback, possibly by
> > > > > dropping their reference to the QIONetListener. The existing code
> > > > > ensures that all other pending callbacks remain safe once the first
> > > > > callback drops the listener, by adding an extra reference to the
> > > > > listener for each GSource created, where those references pair to the
> > > > > eventual teardown of each GSource after a given callback has been
> > > > > serviced or aborted. But it is equally acceptable to hoist the
> > > > > reference to the listener outside the loop - as long as there is a
> > > > > callback function registered, it is sufficient to have a single
> > > > > reference live for the entire array of sioc, rather than one reference
> > > > > per sioc in the array.
> > > > >
> > > > > Hoisting the reference like this will make it easier for an upcoming
> > > > > patch to still ensure the listener cannot be prematurely garbage
> > > > > collected during the user's callback, even when the callback no longer
> > > > > uses a per-sioc GSource.
> > > >
> > > > It isn't quite this simple. Glib reference counts the callback
> > > > func / data, holding a reference when dispatching the callback.
> > > >
> > > > IOW, even if the GSource is unrefed, the callback 'notify'
> > > > function won't be called if the main loop is in the process
> > > > of dispatching.
snip
> > That appears to work ok, however, there's still a race window that is
> > not solved. Between the time thread 2 sees POLLIN, and when it calls
> > the dispatch(sock) function, it is possible that thread 1 will drop
> > the last reference:
> >
> > Thread 1:
> > qio_net_listener_set_client_func(lstnr, f, ...);
> > => object_ref(listener)
> > => foreach sock: socket
> > => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, NULL);
> >
> > Thread 2:
> > poll()
> > => event POLLIN on socket
> >
> > Thread 1:
> > qio_net_listener_set_client_func(lstnr, NULL, ...);
> > => foreach sock: socket
> > => g_source_unref(sock_src)
> > => object_unref(listener)
> > unref(lstnr) (still 1 reference left)
>
> Is what you're worried about that there is still a reference left in
> the opaque pointer of an fd handler, but it's not in the refcount and
> therefore this already frees the listener while thread 2 will still
> access it?
Yes, exactly.
>
> >
> > Thread 2:
> > => call dispatch(sock)
> > => ref(lstnr)
> > ...do stuff..
> > => unref(lstnr) (the final reference)
> > => finalize(lstnr)
> > => return dispatch(sock)
> > => unref(GSourceCallback)
> >
> > I don't see a way to solve this without synchronization with the event
> > loop for releasing the reference on the opaque data for the dispatcher
> > callback. That's what the current code does, but I'm seeing no way for
> > the AioContext event loop callbacks to have anything equivalent. This
> > feels like a gap in the AioContext design.
>
> I think the way you would normally do this is schedule a BH in thread 2
> to do the critical work. If you delete the fd handler and unref the
> listener in thread 2, then there is no race.
Yes, using a BH would be safe, provided you put the BH in the right
loop, given that we have choice of the main event loop, or a
non-default GMainContext, or the special AioContext that NBD is relying
on.
> But maybe adding a callback for node deletion in AioHandler wouldn't
> hurt because the opaque pointer pretty much always references something
> and doing an unref when deleting the AioHandler should be a pretty
> common pattern.
That would likely make the scenario less error-prone, compared to
remembering to use a BH to synchronize.
> > This is admittedly an incredibly hard to trigger race condition. It would
> > need a client to be calling a QMP command that tears down the NBD server,
> > at the exact same time as a new NBD client was incoming. Or the same kind
> > of scenario for other pieces of QEMU code using QIONetListener. This still
> > makes me worried though, as rare races have a habit of hitting QEMU
> > eventually.
>
> Aren't both QMP and incoming NBD connections always handled in the main
> thread? I'm not sure if I know a case where we would actually get this
> pattern with two different threads today. Of course, that doesn't mean
> that we couldn't get it in the future.
Yeah, I believe we're probably safe in today's usage. My concern was
primarily about any surprises in the conceptual design that might
impact us in future.
I guess if NBD is the only thing using AioContext for QIONetListener
today, we could hoist the ref/unref only when using an AioContext, and
keep the GDestroyNotify usage for the GMainContext code path to
significantly limit the exposure. That avoids needing to do anything
extra for AioHandlers right before release.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|