On 03.02.2025 at 20:19, Stefan Hajnoczi wrote:
> On Fri, Jan 31, 2025 at 10:50:48AM +0100, Kevin Wolf wrote:
> > In order to support running an NBD export on inactive nodes, we must
> > make sure to return errors for any operations that aren't allowed on
> > inactive nodes. Reads are the only operation we know we need for
> > inactive images, so to err on the side of caution, return errors for
> > everything else, even if some operations could possibly be okay.
> >
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> > nbd/server.c | 17 +++++++++++++++++
> > 1 file changed, 17 insertions(+)
> >
> > diff --git a/nbd/server.c b/nbd/server.c
> > index f64e47270c..2076fb2666 100644
> > --- a/nbd/server.c
> > +++ b/nbd/server.c
> > @@ -2026,6 +2026,7 @@ static void nbd_export_delete(BlockExport *blk_exp)
> > const BlockExportDriver blk_exp_nbd = {
> > .type = BLOCK_EXPORT_TYPE_NBD,
> > .instance_size = sizeof(NBDExport),
> > + .supports_inactive = true,
> > .create = nbd_export_create,
> > .delete = nbd_export_delete,
> > .request_shutdown = nbd_export_request_shutdown,
> > @@ -2920,6 +2921,22 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
> > NBDExport *exp = client->exp;
> > char *msg;
> > size_t i;
> > + bool inactive;
> > +
> > + WITH_GRAPH_RDLOCK_GUARD() {
> > + inactive = bdrv_is_inactive(blk_bs(exp->common.blk));
> > + if (inactive) {
> > + switch (request->type) {
> > + case NBD_CMD_READ:
> > + /* These commands are allowed on inactive nodes */
> > + break;
> > + default:
> > + /* Return an error for the rest */
> > + return nbd_send_generic_reply(client, request, -EPERM,
> > + "export is inactive", errp);
> > + }
> > + }
> > + }
>
> Hmm...end of lock guard. What prevents the race where inactive changes
> before the request is performed?

That's a good question. Probably nothing. Extending the lock guard to
cover the rest of the function wouldn't prevent it either because
inactivating doesn't change the structure of the graph and therefore
also doesn't take the writer lock.

We should probably drain nodes around setting BDRV_O_INACTIVE. Generally
the expectation has always been that the block node is idle when we try
to inactivate an image. With exports, this isn't automatically true any
more, but draining gives us the guarantee we need.
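As a minimal sketch of what draining buys us (again invented names, not the QEMU API; real drained sections defer requests rather than reject them): stop new requests from starting, wait until in-flight ones complete, and only then flip the flag:

```c
#include <pthread.h>
#include <stdbool.h>
#include <assert.h>

/* Toy node with an in-flight request counter. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t idle;
    int in_flight;
    bool quiesced;   /* set by drain_begin() */
    bool inactive;
} DrainNode;

/* A request only starts if the node is neither quiesced nor inactive. */
static bool drain_request_begin(DrainNode *n)
{
    pthread_mutex_lock(&n->lock);
    if (n->quiesced || n->inactive) {
        pthread_mutex_unlock(&n->lock);
        return false;
    }
    n->in_flight++;
    pthread_mutex_unlock(&n->lock);
    return true;
}

static void drain_request_end(DrainNode *n)
{
    pthread_mutex_lock(&n->lock);
    if (--n->in_flight == 0) {
        pthread_cond_broadcast(&n->idle);
    }
    pthread_mutex_unlock(&n->lock);
}

/* Models bdrv_drained_begin(): block new requests, then wait
 * until the node is idle. */
static void drain_begin(DrainNode *n)
{
    pthread_mutex_lock(&n->lock);
    n->quiesced = true;
    while (n->in_flight > 0) {
        pthread_cond_wait(&n->idle, &n->lock);
    }
    pthread_mutex_unlock(&n->lock);
}

static void drain_end(DrainNode *n)
{
    pthread_mutex_lock(&n->lock);
    n->quiesced = false;
    pthread_mutex_unlock(&n->lock);
}
```

Setting the inactive flag between drain_begin() and drain_end() is then safe: no request can be in flight while the flag changes, which is the guarantee the patch below relies on.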

This also seems to fix a qed crash I noticed with the new test cases
where the timer still wants to write to the image after we set the
inactive flag. Draining cancels the timer.

Kevin

diff --git a/block.c b/block.c
index 7eeb8d076e..1601b25f66 100644
--- a/block.c
+++ b/block.c
@@ -7032,7 +7032,9 @@ bdrv_inactivate_recurse(BlockDriverState *bs, bool top_level)
return -EPERM;
}
+ bdrv_drained_begin(bs);
bs->open_flags |= BDRV_O_INACTIVE;
+ bdrv_drained_end(bs);
/*
* Update permissions, they may differ for inactive nodes.