[PATCH v2 4/4] block: Close block exports in two steps

Sergio Lopez posted 4 patches 4 years, 11 months ago
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>, Stefano Stabellini <sstabellini@kernel.org>, Paul Durrant <paul@xen.org>, Stefan Hajnoczi <stefanha@redhat.com>, Kevin Wolf <kwolf@redhat.com>, Fam Zheng <fam@euphon.net>, Eric Blake <eblake@redhat.com>, Anthony Perard <anthony.perard@citrix.com>, Max Reitz <mreitz@redhat.com>
There is a newer version of this series
[PATCH v2 4/4] block: Close block exports in two steps
Posted by Sergio Lopez 4 years, 11 months ago
There's a cross-dependency between closing the block exports and
draining the block layer. The latter needs that we close all export's
client connections to ensure they won't queue more requests, but the
exports may have coroutines yielding in the block layer, which implies
they can't be fully closed until we drain it.

To break this cross-dependency, this change adds a "bool wait"
argument to blk_exp_close_all() and blk_exp_close_all_type(), so
callers can decide whether they want to wait for the exports to be
fully quiesced, or just return after requesting them to shut down.

Then, in bdrv_close_all we make two calls, one without waiting to
close all client connections, and another after draining the block
layer, this time waiting for the exports to be fully quiesced.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1900505
Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 block.c                   | 20 +++++++++++++++++++-
 block/export/export.c     | 10 ++++++----
 blockdev-nbd.c            |  2 +-
 include/block/export.h    |  4 ++--
 qemu-nbd.c                |  2 +-
 stubs/blk-exp-close-all.c |  2 +-
 6 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index bc8a66ab6e..41db70ac07 100644
--- a/block.c
+++ b/block.c
@@ -4472,13 +4472,31 @@ static void bdrv_close(BlockDriverState *bs)
 void bdrv_close_all(void)
 {
     assert(job_next(NULL) == NULL);
-    blk_exp_close_all();
+
+    /*
+     * There's a cross-dependency between closing the block exports and
+     * draining the block layer. The latter needs that we close all export's
+     * client connections to ensure they won't queue more requests, but the
+     * exports may have coroutines yielding in the block layer, which implies
+     * they can't be fully closed until we drain it.
+     *
+     * Make a first call to close all export's client connections, without
+     * waiting for each export to be fully quiesced.
+     */
+    blk_exp_close_all(false);
 
     /* Drop references from requests still in flight, such as canceled block
      * jobs whose AIO context has not been polled yet */
     bdrv_drain_all();
 
     blk_remove_all_bs();
+
+    /*
+     * Make a second call to shut down the exports, this time waiting for them
+     * to be fully quiesced.
+     */
+    blk_exp_close_all(true);
+
     blockdev_close_all_bdrv_states();
 
     assert(QTAILQ_EMPTY(&all_bdrv_states));
diff --git a/block/export/export.c b/block/export/export.c
index bad6f21b1c..0124ebd9f9 100644
--- a/block/export/export.c
+++ b/block/export/export.c
@@ -280,7 +280,7 @@ static bool blk_exp_has_type(BlockExportType type)
 }
 
 /* type == BLOCK_EXPORT_TYPE__MAX for all types */
-void blk_exp_close_all_type(BlockExportType type)
+void blk_exp_close_all_type(BlockExportType type, bool wait)
 {
     BlockExport *exp, *next;
 
@@ -293,12 +293,14 @@ void blk_exp_close_all_type(BlockExportType type)
         blk_exp_request_shutdown(exp);
     }
 
-    AIO_WAIT_WHILE(NULL, blk_exp_has_type(type));
+    if (wait) {
+        AIO_WAIT_WHILE(NULL, blk_exp_has_type(type));
+    }
 }
 
-void blk_exp_close_all(void)
+void blk_exp_close_all(bool wait)
 {
-    blk_exp_close_all_type(BLOCK_EXPORT_TYPE__MAX);
+    blk_exp_close_all_type(BLOCK_EXPORT_TYPE__MAX, wait);
 }
 
 void qmp_block_export_add(BlockExportOptions *export, Error **errp)
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index d8443d235b..d71d4da7c2 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -266,7 +266,7 @@ void qmp_nbd_server_stop(Error **errp)
         return;
     }
 
-    blk_exp_close_all_type(BLOCK_EXPORT_TYPE_NBD);
+    blk_exp_close_all_type(BLOCK_EXPORT_TYPE_NBD, true);
 
     nbd_server_free(nbd_server);
     nbd_server = NULL;
diff --git a/include/block/export.h b/include/block/export.h
index 7feb02e10d..71c25928ce 100644
--- a/include/block/export.h
+++ b/include/block/export.h
@@ -83,7 +83,7 @@ BlockExport *blk_exp_find(const char *id);
 void blk_exp_ref(BlockExport *exp);
 void blk_exp_unref(BlockExport *exp);
 void blk_exp_request_shutdown(BlockExport *exp);
-void blk_exp_close_all(void);
-void blk_exp_close_all_type(BlockExportType type);
+void blk_exp_close_all(bool wait);
+void blk_exp_close_all_type(BlockExportType type, bool wait);
 
 #endif
diff --git a/qemu-nbd.c b/qemu-nbd.c
index a7075c5419..928f4466f6 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -1122,7 +1122,7 @@ int main(int argc, char **argv)
     do {
         main_loop_wait(false);
         if (state == TERMINATE) {
-            blk_exp_close_all();
+            blk_exp_close_all(true);
             state = TERMINATED;
         }
     } while (state != TERMINATED);
diff --git a/stubs/blk-exp-close-all.c b/stubs/blk-exp-close-all.c
index 1c71316763..ecd0ce611f 100644
--- a/stubs/blk-exp-close-all.c
+++ b/stubs/blk-exp-close-all.c
@@ -2,6 +2,6 @@
 #include "block/export.h"
 
 /* Only used in programs that support block exports (libblockdev.fa) */
-void blk_exp_close_all(void)
+void blk_exp_close_all(bool wait)
 {
 }
-- 
2.26.2


Re: [PATCH v2 4/4] block: Close block exports in two steps
Posted by Kevin Wolf 4 years, 11 months ago
Am 14.12.2020 um 18:05 hat Sergio Lopez geschrieben:
> There's a cross-dependency between closing the block exports and
> draining the block layer. The latter needs that we close all export's
> client connections to ensure they won't queue more requests, but the
> exports may have coroutines yielding in the block layer, which implies
> they can't be fully closed until we drain it.

A coroutine that yielded must have some way to be reentered. So I guess
the quesiton becomes why they aren't reentered until drain. We do
process events:

    AIO_WAIT_WHILE(NULL, blk_exp_has_type(type));

So in theory, anything that would finalise the block export closing
should still execute.

What is the difference that drain makes compared to a simple
AIO_WAIT_WHILE, so that coroutine are reentered during drain, but not
during AIO_WAIT_WHILE?

This is an even more interesting question because the NBD server isn't a
block node nor a BdrvChildClass implementation, so it shouldn't even
notice a drain operation.

Kevin

> To break this cross-dependency, this change adds a "bool wait"
> argument to blk_exp_close_all() and blk_exp_close_all_type(), so
> callers can decide whether they want to wait for the exports to be
> fully quiesced, or just return after requesting them to shut down.
> 
> Then, in bdrv_close_all we make two calls, one without waiting to
> close all client connections, and another after draining the block
> layer, this time waiting for the exports to be fully quiesced.
> 
> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1900505
> Signed-off-by: Sergio Lopez <slp@redhat.com>
> ---
>  block.c                   | 20 +++++++++++++++++++-
>  block/export/export.c     | 10 ++++++----
>  blockdev-nbd.c            |  2 +-
>  include/block/export.h    |  4 ++--
>  qemu-nbd.c                |  2 +-
>  stubs/blk-exp-close-all.c |  2 +-
>  6 files changed, 30 insertions(+), 10 deletions(-)
> 
> diff --git a/block.c b/block.c
> index bc8a66ab6e..41db70ac07 100644
> --- a/block.c
> +++ b/block.c
> @@ -4472,13 +4472,31 @@ static void bdrv_close(BlockDriverState *bs)
>  void bdrv_close_all(void)
>  {
>      assert(job_next(NULL) == NULL);
> -    blk_exp_close_all();
> +
> +    /*
> +     * There's a cross-dependency between closing the block exports and
> +     * draining the block layer. The latter needs that we close all export's
> +     * client connections to ensure they won't queue more requests, but the
> +     * exports may have coroutines yielding in the block layer, which implies
> +     * they can't be fully closed until we drain it.
> +     *
> +     * Make a first call to close all export's client connections, without
> +     * waiting for each export to be fully quiesced.
> +     */
> +    blk_exp_close_all(false);
>  
>      /* Drop references from requests still in flight, such as canceled block
>       * jobs whose AIO context has not been polled yet */
>      bdrv_drain_all();
>  
>      blk_remove_all_bs();
> +
> +    /*
> +     * Make a second call to shut down the exports, this time waiting for them
> +     * to be fully quiesced.
> +     */
> +    blk_exp_close_all(true);
> +
>      blockdev_close_all_bdrv_states();
>  
>      assert(QTAILQ_EMPTY(&all_bdrv_states));
> diff --git a/block/export/export.c b/block/export/export.c
> index bad6f21b1c..0124ebd9f9 100644
> --- a/block/export/export.c
> +++ b/block/export/export.c
> @@ -280,7 +280,7 @@ static bool blk_exp_has_type(BlockExportType type)
>  }
>  
>  /* type == BLOCK_EXPORT_TYPE__MAX for all types */
> -void blk_exp_close_all_type(BlockExportType type)
> +void blk_exp_close_all_type(BlockExportType type, bool wait)
>  {
>      BlockExport *exp, *next;
>  
> @@ -293,12 +293,14 @@ void blk_exp_close_all_type(BlockExportType type)
>          blk_exp_request_shutdown(exp);
>      }
>  
> -    AIO_WAIT_WHILE(NULL, blk_exp_has_type(type));
> +    if (wait) {
> +        AIO_WAIT_WHILE(NULL, blk_exp_has_type(type));
> +    }
>  }
>  
> -void blk_exp_close_all(void)
> +void blk_exp_close_all(bool wait)
>  {
> -    blk_exp_close_all_type(BLOCK_EXPORT_TYPE__MAX);
> +    blk_exp_close_all_type(BLOCK_EXPORT_TYPE__MAX, wait);
>  }
>  
>  void qmp_block_export_add(BlockExportOptions *export, Error **errp)
> diff --git a/blockdev-nbd.c b/blockdev-nbd.c
> index d8443d235b..d71d4da7c2 100644
> --- a/blockdev-nbd.c
> +++ b/blockdev-nbd.c
> @@ -266,7 +266,7 @@ void qmp_nbd_server_stop(Error **errp)
>          return;
>      }
>  
> -    blk_exp_close_all_type(BLOCK_EXPORT_TYPE_NBD);
> +    blk_exp_close_all_type(BLOCK_EXPORT_TYPE_NBD, true);
>  
>      nbd_server_free(nbd_server);
>      nbd_server = NULL;
> diff --git a/include/block/export.h b/include/block/export.h
> index 7feb02e10d..71c25928ce 100644
> --- a/include/block/export.h
> +++ b/include/block/export.h
> @@ -83,7 +83,7 @@ BlockExport *blk_exp_find(const char *id);
>  void blk_exp_ref(BlockExport *exp);
>  void blk_exp_unref(BlockExport *exp);
>  void blk_exp_request_shutdown(BlockExport *exp);
> -void blk_exp_close_all(void);
> -void blk_exp_close_all_type(BlockExportType type);
> +void blk_exp_close_all(bool wait);
> +void blk_exp_close_all_type(BlockExportType type, bool wait);
>  
>  #endif
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index a7075c5419..928f4466f6 100644
> --- a/qemu-nbd.c
> +++ b/qemu-nbd.c
> @@ -1122,7 +1122,7 @@ int main(int argc, char **argv)
>      do {
>          main_loop_wait(false);
>          if (state == TERMINATE) {
> -            blk_exp_close_all();
> +            blk_exp_close_all(true);
>              state = TERMINATED;
>          }
>      } while (state != TERMINATED);
> diff --git a/stubs/blk-exp-close-all.c b/stubs/blk-exp-close-all.c
> index 1c71316763..ecd0ce611f 100644
> --- a/stubs/blk-exp-close-all.c
> +++ b/stubs/blk-exp-close-all.c
> @@ -2,6 +2,6 @@
>  #include "block/export.h"
>  
>  /* Only used in programs that support block exports (libblockdev.fa) */
> -void blk_exp_close_all(void)
> +void blk_exp_close_all(bool wait)
>  {
>  }
> -- 
> 2.26.2
> 


Re: [PATCH v2 4/4] block: Close block exports in two steps
Posted by Sergio Lopez 4 years, 11 months ago
On Tue, Dec 15, 2020 at 04:34:05PM +0100, Kevin Wolf wrote:
> Am 14.12.2020 um 18:05 hat Sergio Lopez geschrieben:
> > There's a cross-dependency between closing the block exports and
> > draining the block layer. The latter needs that we close all export's
> > client connections to ensure they won't queue more requests, but the
> > exports may have coroutines yielding in the block layer, which implies
> > they can't be fully closed until we drain it.
> 
> A coroutine that yielded must have some way to be reentered. So I guess
> the quesiton becomes why they aren't reentered until drain. We do
> process events:
> 
>     AIO_WAIT_WHILE(NULL, blk_exp_has_type(type));
> 
> So in theory, anything that would finalise the block export closing
> should still execute.
> 
> What is the difference that drain makes compared to a simple
> AIO_WAIT_WHILE, so that coroutine are reentered during drain, but not
> during AIO_WAIT_WHILE?
> 
> This is an even more interesting question because the NBD server isn't a
> block node nor a BdrvChildClass implementation, so it shouldn't even
> notice a drain operation.

I agree in that this deserves a deeper analysis. I'm going to drop
this patch from the series, and will re-analyze the issue later.

Thanks,
Sergio.
Re: [PATCH v2 4/4] block: Close block exports in two steps
Posted by Sergio Lopez 4 years, 11 months ago
On Tue, Dec 15, 2020 at 04:34:05PM +0100, Kevin Wolf wrote:
> Am 14.12.2020 um 18:05 hat Sergio Lopez geschrieben:
> > There's a cross-dependency between closing the block exports and
> > draining the block layer. The latter needs that we close all export's
> > client connections to ensure they won't queue more requests, but the
> > exports may have coroutines yielding in the block layer, which implies
> > they can't be fully closed until we drain it.
> 
> A coroutine that yielded must have some way to be reentered. So I guess
> the quesiton becomes why they aren't reentered until drain. We do
> process events:
> 
>     AIO_WAIT_WHILE(NULL, blk_exp_has_type(type));
> 
> So in theory, anything that would finalise the block export closing
> should still execute.
> 
> What is the difference that drain makes compared to a simple
> AIO_WAIT_WHILE, so that coroutine are reentered during drain, but not
> during AIO_WAIT_WHILE?
> 
> This is an even more interesting question because the NBD server isn't a
> block node nor a BdrvChildClass implementation, so it shouldn't even
> notice a drain operation.

OK, took a deeper dive into the issue. While shutting down the guest,
some co-routines from the NBD server are stuck here:

nbd_trip
 nbd_handle_request
  nbd_do_cmd_read
   nbd_co_send_sparse_read
    blk_pread
     blk_prw
      blk_read_entry
       blk_do_preadv
        blk_wait_while_drained
         qemu_co_queue_wait

This happens because bdrv_close_all() is called after
bdrv_drain_all_begin(), so all block backends are quiesced.

An alternative approach to this patch would be moving
blk_exp_close_all() to vl.c:qemu_cleanup, before
bdrv_drain_all_begin().

Do you have a preference for one of these options?

Thanks,
Sergio.