From: Vladimir Sementsov-Ogievskiy
To: qemu-devel@nongnu.org, qemu-block@nongnu.org
Cc: kwolf@redhat.com, vsementsov@virtuozzo.com, armbru@redhat.com,
    mreitz@redhat.com, stefanha@redhat.com, den@openvz.org
Date: Thu, 11 Apr 2019 20:27:08 +0300
Message-Id: <20190411172709.205032-7-vsementsov@virtuozzo.com>
In-Reply-To: <20190411172709.205032-1-vsementsov@virtuozzo.com>
References: <20190411172709.205032-1-vsementsov@virtuozzo.com>
Subject: [Qemu-devel] [PATCH v6 6/7] block/nbd-client: nbd reconnect

Implement reconnect. To achieve this:

1. Add new modes:
   connecting-wait: reconnecting is in progress and only a small number
     of reconnect attempts have been made, so all requests wait for the
     connection.
   connecting-nowait: reconnecting is in progress and many reconnect
     attempts have already been made; all requests return errors.

   The two old modes are used too:
   connected: normal state
   quit: exiting after a fatal error or on close

   Possible transitions are:

   * -> quit
   connecting-* -> connected
   connecting-wait -> connecting-nowait (transition is done after
                      reconnect-delay seconds in connecting-wait mode)
   connected -> connecting-wait

2. Implement reconnect in connection_co. In connecting-* mode,
   connection_co retries the connection an unlimited number of times.

3. Retry NBD requests on channel error if we are in connecting-wait
   state.

Signed-off-by: Vladimir Sementsov-Ogievskiy
---
 block/nbd-client.h |   7 +
 block/nbd-client.c | 336 +++++++++++++++++++++++++++++++++++----------
 2 files changed, 273 insertions(+), 70 deletions(-)

diff --git a/block/nbd-client.h b/block/nbd-client.h
index 91a6b32bdd..f366c90e5e 100644
--- a/block/nbd-client.h
+++ b/block/nbd-client.h
@@ -24,6 +24,8 @@ typedef struct {
 } NBDClientRequest;
 
 typedef enum NBDClientState {
+    NBD_CLIENT_CONNECTING_WAIT,
+    NBD_CLIENT_CONNECTING_NOWAIT,
     NBD_CLIENT_CONNECTED,
     NBD_CLIENT_QUIT
 } NBDClientState;
@@ -38,10 +40,15 @@ typedef struct NBDClientSession {
     Coroutine *connection_co;
     int in_flight;
     NBDClientState state;
+    int connect_status;
+    Error *connect_err;
+    bool wait_in_flight;
 
     NBDClientRequest requests[MAX_NBD_REQUESTS];
     NBDReply reply;
     BlockDriverState *bs;
+
+    uint32_t reconnect_delay;
 } NBDClientSession;
 
 NBDClientSession *nbd_get_client_session(BlockDriverState *bs);
diff --git a/block/nbd-client.c b/block/nbd-client.c
index 1359aff162..b829a1a1bd 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -1,6 +1,7 @@
 /*
  * QEMU Block driver for NBD
  *
+ * Copyright (c) 2019 Virtuozzo International GmbH. All rights reserved.
  * Copyright (C) 2016 Red Hat, Inc.
  * Copyright (C) 2008 Bull S.A.S.
  *     Author: Laurent Vivier
@@ -36,10 +37,27 @@
 #define HANDLE_TO_INDEX(bs, handle) ((handle) ^ (uint64_t)(intptr_t)(bs))
 #define INDEX_TO_HANDLE(bs, index) ((index) ^ (uint64_t)(intptr_t)(bs))
 
-/* @ret will be used for reconnect in future */
+static int nbd_client_connect(BlockDriverState *bs,
+                              SocketAddress *saddr,
+                              const char *export,
+                              QCryptoTLSCreds *tlscreds,
+                              const char *hostname,
+                              const char *x_dirty_bitmap,
+                              Error **errp);
+
 static void nbd_channel_error(NBDClientSession *s, int ret)
 {
-    s->state = NBD_CLIENT_QUIT;
+    if (ret == -EIO) {
+        if (s->state == NBD_CLIENT_CONNECTED) {
+            s->state = s->reconnect_delay ? NBD_CLIENT_CONNECTING_WAIT :
+                                            NBD_CLIENT_CONNECTING_NOWAIT;
+        }
+    } else {
+        if (s->state == NBD_CLIENT_CONNECTED) {
+            qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
+        }
+        s->state = NBD_CLIENT_QUIT;
+    }
 }
 
 static void nbd_recv_coroutines_wake_all(NBDClientSession *s)
@@ -59,24 +77,133 @@ static void nbd_teardown_connection(BlockDriverState *bs)
 {
     NBDClientSession *client = nbd_get_client_session(bs);
 
-    assert(client->ioc);
-
-    /* finish any pending coroutines */
-    qio_channel_shutdown(client->ioc,
-                         QIO_CHANNEL_SHUTDOWN_BOTH,
-                         NULL);
+    if (client->state == NBD_CLIENT_CONNECTED) {
+        /* finish any pending coroutines */
+        assert(client->ioc);
+        qio_channel_shutdown(client->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
+    }
+    client->state = NBD_CLIENT_QUIT;
+    if (client->connection_co) {
+        qemu_co_sleep_wake(client->connection_co);
+    }
     BDRV_POLL_WHILE(bs, client->connection_co);
+}
+
+typedef struct NBDConnection {
+    BlockDriverState *bs;
+    SocketAddress *saddr;
+    const char *export;
+    QCryptoTLSCreds *tlscreds;
+    const char *hostname;
+    const char *x_dirty_bitmap;
+} NBDConnection;
+
+static bool nbd_client_connecting(NBDClientSession *client)
+{
+    return client->state == NBD_CLIENT_CONNECTING_WAIT ||
+           client->state == NBD_CLIENT_CONNECTING_NOWAIT;
+}
+
+static bool nbd_client_connecting_wait(NBDClientSession *client)
+{
+    return client->state == NBD_CLIENT_CONNECTING_WAIT;
+}
+
+static coroutine_fn void nbd_reconnect_attempt(NBDConnection *con)
+{
+    NBDClientSession *s = nbd_get_client_session(con->bs);
+    Error *local_err = NULL;
+
+    if (!nbd_client_connecting(s)) {
+        return;
+    }
+    assert(nbd_client_connecting(s));
+
+    /* Wait for completion of all in-flight requests */
+
+    qemu_co_mutex_lock(&s->send_mutex);
 
-    nbd_client_detach_aio_context(bs);
-    object_unref(OBJECT(client->sioc));
-    client->sioc = NULL;
-    object_unref(OBJECT(client->ioc));
-    client->ioc = NULL;
+    while (s->in_flight > 0) {
+        qemu_co_mutex_unlock(&s->send_mutex);
+        nbd_recv_coroutines_wake_all(s);
+        s->wait_in_flight = true;
+        qemu_coroutine_yield();
+        s->wait_in_flight = false;
+        qemu_co_mutex_lock(&s->send_mutex);
+    }
+
+    qemu_co_mutex_unlock(&s->send_mutex);
+
+    if (!nbd_client_connecting(s)) {
+        return;
+    }
+
+    /*
+     * Now we are sure that nobody is accessing the channel, and nobody
+     * will try to access it until we set the state to CONNECTED.
+     */
+
+    /* Finalize previous connection if any */
+    if (s->ioc) {
+        nbd_client_detach_aio_context(con->bs);
+        object_unref(OBJECT(s->sioc));
+        s->sioc = NULL;
+        object_unref(OBJECT(s->ioc));
+        s->ioc = NULL;
+    }
+
+    s->connect_status = nbd_client_connect(con->bs, con->saddr,
+                                           con->export, con->tlscreds,
+                                           con->hostname, con->x_dirty_bitmap,
+                                           &local_err);
+    error_free(s->connect_err);
+    s->connect_err = NULL;
+    error_propagate(&s->connect_err, local_err);
+    local_err = NULL;
+
+    if (s->connect_status < 0) {
+        /* failed attempt */
+        return;
+    }
+
+    /* successfully connected */
+    s->state = NBD_CLIENT_CONNECTED;
+    qemu_co_queue_restart_all(&s->free_sema);
+}
+
+static coroutine_fn void nbd_reconnect_loop(NBDConnection *con)
+{
+    NBDClientSession *s = nbd_get_client_session(con->bs);
+    uint64_t start_time_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
+    uint64_t delay_ns = s->reconnect_delay * 1000000000ULL;
+    uint64_t timeout = 1000000000ULL; /* 1 sec */
+    uint64_t max_timeout = 16000000000ULL; /* 16 sec */
+
+    nbd_reconnect_attempt(con);
+
+    while (nbd_client_connecting(s)) {
+        if (s->state == NBD_CLIENT_CONNECTING_WAIT &&
+            qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start_time_ns > delay_ns)
+        {
+            s->state = NBD_CLIENT_CONNECTING_NOWAIT;
+            qemu_co_queue_restart_all(&s->free_sema);
+        }
+
+        bdrv_dec_in_flight(con->bs);
+        qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, timeout);
+        bdrv_inc_in_flight(con->bs);
+        if (timeout < max_timeout) {
+            timeout *= 2;
+        }
+
+        nbd_reconnect_attempt(con);
+    }
 }
 
 static coroutine_fn void nbd_connection_entry(void *opaque)
 {
-    NBDClientSession *s = opaque;
+    NBDConnection *con = opaque;
+    NBDClientSession *s = nbd_get_client_session(con->bs);
     uint64_t i;
     int ret = 0;
     Error *local_err = NULL;
@@ -91,16 +218,25 @@ static coroutine_fn void nbd_connection_entry(void *opaque)
          * Therefore we keep an additional in_flight reference all the time and
          * only drop it temporarily here.
          */
+
+        if (nbd_client_connecting(s)) {
+            nbd_reconnect_loop(con);
+        }
+
+        if (s->state != NBD_CLIENT_CONNECTED) {
+            continue;
+        }
+
         assert(s->reply.handle == 0);
         ret = nbd_receive_reply(s->bs, s->ioc, &s->reply, &local_err);
-
         if (local_err) {
             trace_nbd_read_reply_entry_fail(ret, error_get_pretty(local_err));
             error_free(local_err);
+            local_err = NULL;
         }
         if (ret <= 0) {
             nbd_channel_error(s, ret ? ret : -EIO);
-            break;
+            continue;
         }
 
         /* There's no need for a mutex on the receive side, because the
@@ -114,7 +250,7 @@ static coroutine_fn void nbd_connection_entry(void *opaque)
             (nbd_reply_is_structured(&s->reply) && !s->info.structured_reply))
         {
             nbd_channel_error(s, -EINVAL);
-            break;
+            continue;
         }
 
         /* We're woken up again by the request itself. Note that there
@@ -132,10 +268,21 @@ static coroutine_fn void nbd_connection_entry(void *opaque)
         qemu_coroutine_yield();
     }
 
+    qemu_co_queue_restart_all(&s->free_sema);
     nbd_recv_coroutines_wake_all(s);
     bdrv_dec_in_flight(s->bs);
 
     s->connection_co = NULL;
+    if (s->ioc) {
+        nbd_client_detach_aio_context(con->bs);
+        object_unref(OBJECT(s->sioc));
+        s->sioc = NULL;
+        object_unref(OBJECT(s->ioc));
+        s->ioc = NULL;
+    }
+
+    g_free(con);
+
     aio_wait_kick();
 }
 
@@ -147,7 +294,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
     int rc, i = -1;
 
     qemu_co_mutex_lock(&s->send_mutex);
-    while (s->in_flight == MAX_NBD_REQUESTS) {
+    while (s->in_flight == MAX_NBD_REQUESTS || nbd_client_connecting_wait(s)) {
         qemu_co_queue_wait(&s->free_sema, &s->send_mutex);
     }
 
@@ -198,7 +345,11 @@ err:
             s->requests[i].coroutine = NULL;
             s->in_flight--;
         }
-        qemu_co_queue_next(&s->free_sema);
+        if (s->in_flight == 0 && s->wait_in_flight) {
+            aio_co_wake(s->connection_co);
+        } else {
+            qemu_co_queue_next(&s->free_sema);
+        }
     }
     qemu_co_mutex_unlock(&s->send_mutex);
     return rc;
@@ -577,10 +728,15 @@ static coroutine_fn int nbd_co_receive_one_chunk(
         if (reply) {
             *reply = s->reply;
         }
-        s->reply.handle = 0;
     }
+    s->reply.handle = 0;
 
-    if (s->connection_co) {
+    if (s->connection_co && !s->wait_in_flight) {
+        /*
+         * We must check s->wait_in_flight, because we may have been entered
+         * by nbd_recv_coroutines_wake_all(); in that case we should not
+         * wake connection_co here, it will be woken by the last request.
+         */
         aio_co_wake(s->connection_co);
     }
 
@@ -688,7 +844,11 @@ break_loop:
 
     qemu_co_mutex_lock(&s->send_mutex);
     s->in_flight--;
-    qemu_co_queue_next(&s->free_sema);
+    if (s->in_flight == 0 && s->wait_in_flight) {
+        aio_co_wake(s->connection_co);
+    } else {
+        qemu_co_queue_next(&s->free_sema);
+    }
     qemu_co_mutex_unlock(&s->send_mutex);
 
     return false;
@@ -832,20 +992,26 @@ static int nbd_co_request(BlockDriverState *bs, NBDRequest *request,
     } else {
         assert(request->type != NBD_CMD_WRITE);
     }
-    ret = nbd_co_send_request(bs, request, write_qiov);
-    if (ret < 0) {
-        return ret;
-    }
 
-    ret = nbd_co_receive_return_code(client, request->handle,
-                                     &request_ret, &local_err);
-    if (local_err) {
-        trace_nbd_co_request_fail(request->from, request->len, request->handle,
-                                  request->flags, request->type,
-                                  nbd_cmd_lookup(request->type),
-                                  ret, error_get_pretty(local_err));
-        error_free(local_err);
-    }
+    do {
+        ret = nbd_co_send_request(bs, request, write_qiov);
+        if (ret < 0) {
+            continue;
+        }
+
+        ret = nbd_co_receive_return_code(client, request->handle,
+                                         &request_ret, &local_err);
+        if (local_err) {
+            trace_nbd_co_request_fail(request->from, request->len,
+                                      request->handle, request->flags,
+                                      request->type,
+                                      nbd_cmd_lookup(request->type),
+                                      ret, error_get_pretty(local_err));
+            error_free(local_err);
+            local_err = NULL;
+        }
+    } while (ret < 0 && nbd_client_connecting_wait(client));
+
     return ret ? ret : request_ret;
 }
 
@@ -886,20 +1052,24 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
         request.len -= slop;
     }
 
-    ret = nbd_co_send_request(bs, &request, NULL);
-    if (ret < 0) {
-        return ret;
-    }
+    do {
+        ret = nbd_co_send_request(bs, &request, NULL);
+        if (ret < 0) {
+            continue;
+        }
+
+        ret = nbd_co_receive_cmdread_reply(client, request.handle, offset, qiov,
+                                           &request_ret, &local_err);
+        if (local_err) {
+            trace_nbd_co_request_fail(request.from, request.len, request.handle,
+                                      request.flags, request.type,
+                                      nbd_cmd_lookup(request.type),
+                                      ret, error_get_pretty(local_err));
+            error_free(local_err);
+            local_err = NULL;
+        }
+    } while (ret < 0 && nbd_client_connecting_wait(client));
 
-    ret = nbd_co_receive_cmdread_reply(client, request.handle, offset, qiov,
-                                       &request_ret, &local_err);
-    if (local_err) {
-        trace_nbd_co_request_fail(request.from, request.len, request.handle,
-                                  request.flags, request.type,
-                                  nbd_cmd_lookup(request.type),
-                                  ret, error_get_pretty(local_err));
-        error_free(local_err);
-    }
     return ret ? ret : request_ret;
 }
 
@@ -1033,20 +1203,25 @@ int coroutine_fn nbd_client_co_block_status(BlockDriverState *bs,
     if (client->info.min_block) {
         assert(QEMU_IS_ALIGNED(request.len, client->info.min_block));
     }
-    ret = nbd_co_send_request(bs, &request, NULL);
-    if (ret < 0) {
-        return ret;
-    }
+    do {
+        ret = nbd_co_send_request(bs, &request, NULL);
+        if (ret < 0) {
+            continue;
+        }
+
+        ret = nbd_co_receive_blockstatus_reply(client, request.handle, bytes,
+                                               &extent, &request_ret,
+                                               &local_err);
+        if (local_err) {
+            trace_nbd_co_request_fail(request.from, request.len, request.handle,
+                                      request.flags, request.type,
+                                      nbd_cmd_lookup(request.type),
+                                      ret, error_get_pretty(local_err));
+            error_free(local_err);
+            local_err = NULL;
+        }
+    } while (ret < 0 && nbd_client_connecting_wait(client));
 
-    ret = nbd_co_receive_blockstatus_reply(client, request.handle, bytes,
-                                           &extent, &request_ret, &local_err);
-    if (local_err) {
-        trace_nbd_co_request_fail(request.from, request.len, request.handle,
-                                  request.flags, request.type,
-                                  nbd_cmd_lookup(request.type),
-                                  ret, error_get_pretty(local_err));
-        error_free(local_err);
-    }
     if (ret < 0 || request_ret < 0) {
         return ret ? ret : request_ret;
     }
@@ -1083,13 +1258,22 @@ void nbd_client_attach_aio_context(BlockDriverState *bs,
                                    AioContext *new_context)
 {
     NBDClientSession *client = nbd_get_client_session(bs);
-    qio_channel_attach_aio_context(QIO_CHANNEL(client->ioc), new_context);
 
-    bdrv_inc_in_flight(bs);
+    /*
+     * client->connection_co is either yielded from nbd_receive_reply or from
+     * nbd_reconnect_loop(), in latter case we do nothing
+     */
+    if (client->state == NBD_CLIENT_CONNECTED) {
+        qio_channel_attach_aio_context(QIO_CHANNEL(client->ioc), new_context);
 
-    /* Need to wait here for the BH to run because the BH must run while the
-     * node is still drained. */
-    aio_wait_bh_oneshot(new_context, nbd_client_attach_aio_context_bh, bs);
+        bdrv_inc_in_flight(bs);
+
+        /*
+         * Need to wait here for the BH to run because the BH must run while the
+         * node is still drained.
+         */
+        aio_wait_bh_oneshot(new_context, nbd_client_attach_aio_context_bh, bs);
+    }
 }
 
 void nbd_client_close(BlockDriverState *bs)
@@ -1097,9 +1281,9 @@ void nbd_client_close(BlockDriverState *bs)
     NBDClientSession *client = nbd_get_client_session(bs);
     NBDRequest request = { .type = NBD_CMD_DISC };
 
-    assert(client->ioc);
-
-    nbd_send_request(client->ioc, &request);
+    if (client->ioc) {
+        nbd_send_request(client->ioc, &request);
+    }
 
     nbd_teardown_connection(bs);
 }
@@ -1223,6 +1407,7 @@ int nbd_client_init(BlockDriverState *bs,
 {
     int ret;
     NBDClientSession *client = nbd_get_client_session(bs);
+    NBDConnection *con;
 
     client->bs = bs;
     qemu_co_mutex_init(&client->send_mutex);
@@ -1233,8 +1418,19 @@ int nbd_client_init(BlockDriverState *bs,
     if (ret < 0) {
         return ret;
     }
+    /* successfully connected */
+    client->state = NBD_CLIENT_CONNECTED;
+    client->reconnect_delay = reconnect_delay;
+
+    con = g_new(NBDConnection, 1);
+    con->bs = bs;
+    con->saddr = saddr;
+    con->export = export;
+    con->tlscreds = tlscreds;
+    con->hostname = hostname;
+    con->x_dirty_bitmap = x_dirty_bitmap;
 
-    client->connection_co = qemu_coroutine_create(nbd_connection_entry, client);
+    client->connection_co = qemu_coroutine_create(nbd_connection_entry, con);
     bdrv_inc_in_flight(bs);
     aio_co_schedule(bdrv_get_aio_context(bs), client->connection_co);
 
-- 
2.18.0
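
Illustration only, not part of the patch: the mode transitions from the
commit message and the retry back-off used by nbd_reconnect_loop() can be
sketched as a standalone check. The enum mirrors NBDClientState from the
diff; nbd_transition_allowed() and nbd_retry_sleep_sec() are hypothetical
helper names that do not exist in QEMU. The connected -> connecting-nowait
edge is taken from nbd_channel_error(), which appears to skip the wait mode
when reconnect_delay is 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the NBDClientState enum introduced by the patch. */
typedef enum {
    NBD_CLIENT_CONNECTING_WAIT,
    NBD_CLIENT_CONNECTING_NOWAIT,
    NBD_CLIENT_CONNECTED,
    NBD_CLIENT_QUIT
} NBDClientState;

/* Hypothetical helper (not in QEMU): is the transition from -> to allowed? */
static bool nbd_transition_allowed(NBDClientState from, NBDClientState to)
{
    if (to == NBD_CLIENT_QUIT) {
        return true; /* "* -> quit" */
    }

    switch (from) {
    case NBD_CLIENT_CONNECTING_WAIT:
        /* reconnect succeeded, or reconnect-delay seconds have elapsed */
        return to == NBD_CLIENT_CONNECTED ||
               to == NBD_CLIENT_CONNECTING_NOWAIT;
    case NBD_CLIENT_CONNECTING_NOWAIT:
        return to == NBD_CLIENT_CONNECTED;
    case NBD_CLIENT_CONNECTED:
        /* on -EIO: wait mode, or nowait directly if reconnect_delay == 0 */
        return to == NBD_CLIENT_CONNECTING_WAIT ||
               to == NBD_CLIENT_CONNECTING_NOWAIT;
    default:
        return false; /* quit is terminal */
    }
}

/*
 * Hypothetical helper (not in QEMU): seconds slept before retry number
 * 'attempt' (counted from 0). Starts at 1 s and doubles up to a 16 s cap,
 * following the timeout/max_timeout logic in nbd_reconnect_loop().
 */
static uint64_t nbd_retry_sleep_sec(unsigned attempt)
{
    uint64_t timeout = 1;

    while (attempt-- > 0 && timeout < 16) {
        timeout *= 2;
    }
    return timeout;
}
```

Under these assumptions the sleep sequence is 1, 2, 4, 8, 16, 16, ...
seconds, and no edge leads back from connecting-nowait to connecting-wait.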