From nobody Sun May 5 04:25:55 2024
From: Vladimir Sementsov-Ogievskiy
To: qemu-block@nongnu.org
Cc: fam@euphon.net, kwolf@redhat.com, vsementsov@virtuozzo.com, sheepdog@lists.wpkg.org, qemu-devel@nongnu.org, mreitz@redhat.com, stefanha@redhat.com, den@openvz.org, namei.unix@gmail.com
Date: Wed, 21 Aug 2019 19:52:13 +0300
Message-Id: <20190821165215.61406-2-vsementsov@virtuozzo.com>
In-Reply-To: <20190821165215.61406-1-vsementsov@virtuozzo.com>
References: <20190821165215.61406-1-vsementsov@virtuozzo.com>
Subject: [Qemu-devel] [PATCH v8 1/3] qemu-coroutine-sleep: introduce qemu_co_sleep_wake

Introduce a function to gracefully wake a coroutine sleeping in
qemu_co_sleep_ns().

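The handle protocol this patch introduces can be modelled roughly as follows. This is a toy Python sketch, not QEMU code: `SleepState`, `sleep_begin`, `sleep_wake` and `wake_if_sleeping` are hypothetical names, and a one-element list stands in for the `void **sleep_state` argument. It only illustrates that the handle is published while the coroutine sleeps, is reset on wake-up (by timer or explicitly), and must be checked before waking.

```python
class SleepState:
    """Toy stand-in for the per-sleep state kept on the sleeper's stack."""

    def __init__(self, slot):
        self.slot = slot          # one-element list modelling `void **sleep_state`
        self.timer_armed = True   # models the pending timer
        self.woken = False        # models the coroutine having been re-entered


def sleep_begin(slot=None):
    """Model of entering the sleep: arm the timer and publish the handle."""
    state = SleepState(slot)
    if slot is not None:
        slot[0] = state
    return state


def sleep_wake(state):
    """Model of the wake path; the timer callback would run the same steps."""
    assert not state.woken, "sleeper was already woken"
    if state.slot is not None:
        state.slot[0] = None      # handle is reset so it cannot be reused
    state.timer_armed = False     # timer is deleted
    state.woken = True            # sleeper resumes


def wake_if_sleeping(slot):
    """Caller-side pattern: wake only when the handle is currently set."""
    if slot[0] is not None:
        sleep_wake(slot[0])
        return True
    return False


handle = [None]
sleep_begin(handle)
print(wake_if_sleeping(handle))   # True: a sleeper was found and woken
print(wake_if_sleeping(handle))   # False: the handle was already cleared
```

The second call returning False corresponds to the rule in the header comment that the caller checks the variable for non-NULL before calling qemu_co_sleep_wake().
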
Signed-off-by: Vladimir Sementsov-Ogievskiy
Reviewed-by: Eric Blake
Reviewed-by: Kevin Wolf
---
 include/qemu/coroutine.h    | 17 ++++++++++++--
 block/null.c                |  2 +-
 block/sheepdog.c            |  2 +-
 tests/test-bdrv-drain.c     |  6 ++---
 tests/test-block-iothread.c |  2 +-
 util/qemu-coroutine-sleep.c | 47 +++++++++++++++++++++++++++----------
 6 files changed, 55 insertions(+), 21 deletions(-)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index 9801e7f5a4..96780a4902 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -274,9 +274,22 @@ void qemu_co_rwlock_wrlock(CoRwlock *lock);
 void qemu_co_rwlock_unlock(CoRwlock *lock);
 
 /**
- * Yield the coroutine for a given duration
+ * Yield the coroutine for a given duration. During this yield @sleep_state (if
+ * not NULL) is set to opaque pointer, which may be used for
+ * qemu_co_sleep_wake(). Be careful, the pointer is set back to zero when timer
+ * shoots. Don't save obtained value to other variables and don't call
+ * qemu_co_sleep_wake from another aio context.
  */
-void coroutine_fn qemu_co_sleep_ns(QEMUClockType type, int64_t ns);
+void coroutine_fn qemu_co_sleep_ns(QEMUClockType type, int64_t ns,
+                                   void **sleep_state);
+
+/**
+ * Wake a coroutine if it is sleeping by qemu_co_sleep_ns. Timer will be
+ * deleted. @sleep_state must be the variable which address was given to
+ * qemu_co_sleep_ns() and should be checked to be non-NULL before calling
+ * qemu_co_sleep_wake().
+ */
+void qemu_co_sleep_wake(void *sleep_state);
 
 /**
  * Yield until a file descriptor becomes readable
diff --git a/block/null.c b/block/null.c
index 699aa295cb..1e3f26b07e 100644
--- a/block/null.c
+++ b/block/null.c
@@ -109,7 +109,7 @@ static coroutine_fn int null_co_common(BlockDriverState *bs)
     BDRVNullState *s = bs->opaque;
 
     if (s->latency_ns) {
-        qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, s->latency_ns);
+        qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, s->latency_ns, NULL);
     }
     return 0;
 }
diff --git a/block/sheepdog.c b/block/sheepdog.c
index 773dfc6ab1..3a7ef2f209 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -743,7 +743,7 @@ static coroutine_fn void reconnect_to_sdog(void *opaque)
         if (s->fd < 0) {
             trace_sheepdog_reconnect_to_sdog();
             error_report_err(local_err);
-            qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 1000000000ULL);
+            qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 1000000000ULL, NULL);
         }
     };
 
diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index 374bef6bb2..2f53a7add5 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -43,7 +43,7 @@ static void coroutine_fn bdrv_test_co_drain_begin(BlockDriverState *bs)
     BDRVTestState *s = bs->opaque;
     s->drain_count++;
     if (s->sleep_in_drain_begin) {
-        qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 100000);
+        qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 100000, NULL);
     }
 }
 
@@ -74,7 +74,7 @@ static int coroutine_fn bdrv_test_co_preadv(BlockDriverState *bs,
      * it to complete. We need to sleep a while as bdrv_drain_invoke() comes
      * first and polls its result, too, but it shouldn't accidentally complete
      * this request yet. */
-    qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 100000);
+    qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 100000, NULL);
 
     if (s->bh_indirection_ctx) {
         aio_bh_schedule_oneshot(s->bh_indirection_ctx, co_reenter_bh,
@@ -829,7 +829,7 @@ static int coroutine_fn test_job_run(Job *job, Error **errp)
         /* Avoid job_sleep_ns() because it marks the job as !busy. We want to
          * emulate some actual activity (probably some I/O) here so that drain
          * has to wait for this activity to stop. */
-        qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 1000000);
+        qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 1000000, NULL);
 
         job_pause_point(&s->common.job);
     }
diff --git a/tests/test-block-iothread.c b/tests/test-block-iothread.c
index 926577b1f9..a1ac5efcaa 100644
--- a/tests/test-block-iothread.c
+++ b/tests/test-block-iothread.c
@@ -381,7 +381,7 @@ static int coroutine_fn test_job_run(Job *job, Error **errp)
          * emulate some actual activity (probably some I/O) here so that the
          * drain involved in AioContext switches has to wait for this activity
          * to stop. */
-        qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 1000000);
+        qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 1000000, NULL);
 
         job_pause_point(&s->common.job);
     }
diff --git a/util/qemu-coroutine-sleep.c b/util/qemu-coroutine-sleep.c
index 4bfdd30cbf..48a64bb8d8 100644
--- a/util/qemu-coroutine-sleep.c
+++ b/util/qemu-coroutine-sleep.c
@@ -17,31 +17,52 @@
 #include "qemu/timer.h"
 #include "block/aio.h"
 
-static void co_sleep_cb(void *opaque)
-{
-    Coroutine *co = opaque;
+const char *qemu_co_sleep_ns__scheduled = "qemu_co_sleep_ns";
+
+typedef struct QemuCoSleepState {
+    Coroutine *co;
+    QEMUTimer *ts;
+    void **user_state_pointer;
+} QemuCoSleepState;
 
+void qemu_co_sleep_wake(void *sleep_state)
+{
+    QemuCoSleepState *s = (QemuCoSleepState *)sleep_state;
     /* Write of schedule protected by barrier write in aio_co_schedule */
-    atomic_set(&co->scheduled, NULL);
-    aio_co_wake(co);
+    const char *scheduled = atomic_cmpxchg(&s->co->scheduled,
+                                           qemu_co_sleep_ns__scheduled, NULL);
+
+    assert(scheduled == qemu_co_sleep_ns__scheduled);
+    if (s->user_state_pointer) {
+        *s->user_state_pointer = NULL;
+    }
+    timer_del(s->ts);
+    aio_co_wake(s->co);
 }
 
-void coroutine_fn qemu_co_sleep_ns(QEMUClockType type, int64_t ns)
+void coroutine_fn qemu_co_sleep_ns(QEMUClockType type, int64_t ns,
+                                   void **sleep_state)
 {
     AioContext *ctx = qemu_get_current_aio_context();
-    QEMUTimer *ts;
-    Coroutine *co = qemu_coroutine_self();
+    QemuCoSleepState state = {
+        .co = qemu_coroutine_self(),
+        .ts = aio_timer_new(ctx, type, SCALE_NS, qemu_co_sleep_wake, &state),
+        .user_state_pointer = sleep_state,
+    };
 
-    const char *scheduled = atomic_cmpxchg(&co->scheduled, NULL, __func__);
+    const char *scheduled = atomic_cmpxchg(&state.co->scheduled, NULL,
+                                           qemu_co_sleep_ns__scheduled);
     if (scheduled) {
         fprintf(stderr,
                 "%s: Co-routine was already scheduled in '%s'\n",
                 __func__, scheduled);
         abort();
     }
-    ts = aio_timer_new(ctx, type, SCALE_NS, co_sleep_cb, co);
-    timer_mod(ts, qemu_clock_get_ns(type) + ns);
+
+    if (sleep_state) {
+        *sleep_state = &state;
+    }
+    timer_mod(state.ts, qemu_clock_get_ns(type) + ns);
     qemu_coroutine_yield();
-    timer_del(ts);
-    timer_free(ts);
+    timer_free(state.ts);
 }
-- 
2.18.0

From nobody Sun May 5 04:25:55 2024
From: Vladimir Sementsov-Ogievskiy
To: qemu-block@nongnu.org
Cc: fam@euphon.net, kwolf@redhat.com, vsementsov@virtuozzo.com, sheepdog@lists.wpkg.org, qemu-devel@nongnu.org, mreitz@redhat.com, stefanha@redhat.com, den@openvz.org, namei.unix@gmail.com
Date: Wed, 21 Aug 2019 19:52:14 +0300
Message-Id: <20190821165215.61406-3-vsementsov@virtuozzo.com>
In-Reply-To: <20190821165215.61406-1-vsementsov@virtuozzo.com>
References: <20190821165215.61406-1-vsementsov@virtuozzo.com>
Subject: [Qemu-devel] [PATCH v8 2/3] block/nbd: nbd reconnect

Implement reconnect. To achieve this:

1. add new modes:
   connecting-wait: means, that reconnecting is in progress, and there
     were small number of reconnect attempts, so all requests are
     waiting for the connection.
   connecting-nowait: reconnecting is in progress, there were a lot of
     attempts of reconnect, all requests will return errors.

   two old modes are used too:
   connected: normal state
   quit: exiting after fatal error or on close

Possible transitions are:

   * -> quit
   connecting-* -> connected
   connecting-wait -> connecting-nowait (transition is done after
                      reconnect-delay seconds in connecting-wait mode)
   connected -> connecting-wait

2. Implement reconnect in connection_co. So, in connecting-* mode,
    connection_co, tries to reconnect unlimited times.

3. Retry nbd queries on channel error, if we are in connecting-wait
    state.

Signed-off-by: Vladimir Sementsov-Ogievskiy
---
 block/nbd.c | 335 ++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 271 insertions(+), 64 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index beed46fb34..f272154d4b 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -1,6 +1,7 @@
 /*
  * QEMU Block driver for NBD
  *
+ * Copyright (c) 2019 Virtuozzo International GmbH.
  * Copyright (C) 2016 Red Hat, Inc.
  * Copyright (C) 2008 Bull S.A.S.
  * Author: Laurent Vivier
@@ -55,6 +56,8 @@ typedef struct {
 } NBDClientRequest;
 
 typedef enum NBDClientState {
+    NBD_CLIENT_CONNECTING_WAIT,
+    NBD_CLIENT_CONNECTING_NOWAIT,
     NBD_CLIENT_CONNECTED,
     NBD_CLIENT_QUIT
 } NBDClientState;
@@ -67,8 +70,14 @@ typedef struct BDRVNBDState {
     CoMutex send_mutex;
     CoQueue free_sema;
     Coroutine *connection_co;
+    void *connection_co_sleep_ns_state;
+    bool drained;
+    bool wait_drained_end;
     int in_flight;
     NBDClientState state;
+    int connect_status;
+    Error *connect_err;
+    bool wait_in_flight;
 
     NBDClientRequest requests[MAX_NBD_REQUESTS];
     NBDReply reply;
@@ -83,10 +92,21 @@ typedef struct BDRVNBDState {
     char *x_dirty_bitmap;
 } BDRVNBDState;
 
-/* @ret will be used for reconnect in future */
+static int nbd_client_connect(BlockDriverState *bs, Error **errp);
+
 static void nbd_channel_error(BDRVNBDState *s, int ret)
 {
-    s->state = NBD_CLIENT_QUIT;
+    if (ret == -EIO) {
+        if (s->state == NBD_CLIENT_CONNECTED) {
+            s->state = s->reconnect_delay ? NBD_CLIENT_CONNECTING_WAIT :
+                                            NBD_CLIENT_CONNECTING_NOWAIT;
+        }
+    } else {
+        if (s->state == NBD_CLIENT_CONNECTED) {
+            qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
+        }
+        s->state = NBD_CLIENT_QUIT;
+    }
 }
 
 static void nbd_recv_coroutines_wake_all(BDRVNBDState *s)
@@ -129,7 +149,13 @@ static void nbd_client_attach_aio_context(BlockDriverState *bs,
 {
     BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
 
-    qio_channel_attach_aio_context(QIO_CHANNEL(s->ioc), new_context);
+    /*
+     * s->connection_co is either yielded from nbd_receive_reply or from
+     * nbd_reconnect_loop()
+     */
+    if (s->state == NBD_CLIENT_CONNECTED) {
+        qio_channel_attach_aio_context(QIO_CHANNEL(s->ioc), new_context);
+    }
 
     bdrv_inc_in_flight(bs);
 
@@ -140,29 +166,157 @@ static void nbd_client_attach_aio_context(BlockDriverState *bs,
     aio_wait_bh_oneshot(new_context, nbd_client_attach_aio_context_bh, bs);
 }
 
+static void coroutine_fn nbd_client_co_drain_begin(BlockDriverState *bs)
+{
+    BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
 
-static void nbd_teardown_connection(BlockDriverState *bs)
+    s->drained = true;
+    if (s->connection_co_sleep_ns_state) {
+        qemu_co_sleep_wake(s->connection_co_sleep_ns_state);
+    }
+}
+
+static void coroutine_fn nbd_client_co_drain_end(BlockDriverState *bs)
 {
     BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
 
-    assert(s->ioc);
+    s->drained = false;
+    if (s->wait_drained_end) {
+        s->wait_drained_end = false;
+        aio_co_wake(s->connection_co);
+    }
+}
+
 
-    /* finish any pending coroutines */
-    qio_channel_shutdown(s->ioc,
-                         QIO_CHANNEL_SHUTDOWN_BOTH,
-                         NULL);
+static void nbd_teardown_connection(BlockDriverState *bs)
+{
+    BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
+
+    if (s->state == NBD_CLIENT_CONNECTED) {
+        /* finish any pending coroutines */
+        assert(s->ioc);
+        qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
+    }
+    s->state = NBD_CLIENT_QUIT;
+    if (s->connection_co) {
+        if (s->connection_co_sleep_ns_state) {
+            qemu_co_sleep_wake(s->connection_co_sleep_ns_state);
+        }
+    }
     BDRV_POLL_WHILE(bs, s->connection_co);
+}
 
-    nbd_client_detach_aio_context(bs);
-    object_unref(OBJECT(s->sioc));
-    s->sioc = NULL;
-    object_unref(OBJECT(s->ioc));
-    s->ioc = NULL;
+static bool nbd_client_connecting(BDRVNBDState *s)
+{
+    return s->state == NBD_CLIENT_CONNECTING_WAIT ||
+           s->state == NBD_CLIENT_CONNECTING_NOWAIT;
+}
+
+static bool nbd_client_connecting_wait(BDRVNBDState *s)
+{
+    return s->state == NBD_CLIENT_CONNECTING_WAIT;
+}
+
+static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
+{
+    Error *local_err = NULL;
+
+    if (!nbd_client_connecting(s)) {
+        return;
+    }
+    assert(nbd_client_connecting(s));
+
+    /* Wait for completion of all in-flight requests */
+
+    qemu_co_mutex_lock(&s->send_mutex);
+
+    while (s->in_flight > 0) {
+        qemu_co_mutex_unlock(&s->send_mutex);
+        nbd_recv_coroutines_wake_all(s);
+        s->wait_in_flight = true;
+        qemu_coroutine_yield();
+        s->wait_in_flight = false;
+        qemu_co_mutex_lock(&s->send_mutex);
+    }
+
+    qemu_co_mutex_unlock(&s->send_mutex);
+
+    if (!nbd_client_connecting(s)) {
+        return;
+    }
+
+    /*
+     * Now we are sure that nobody is accessing the channel, and no one will
+     * try until we set the state to CONNECTED.
+     */
+
+    /* Finalize previous connection if any */
+    if (s->ioc) {
+        nbd_client_detach_aio_context(s->bs);
+        object_unref(OBJECT(s->sioc));
+        s->sioc = NULL;
+        object_unref(OBJECT(s->ioc));
+        s->ioc = NULL;
+    }
+
+    s->connect_status = nbd_client_connect(s->bs, &local_err);
+    error_free(s->connect_err);
+    s->connect_err = NULL;
+    error_propagate(&s->connect_err, local_err);
+    local_err = NULL;
+
+    if (s->connect_status < 0) {
+        /* failed attempt */
+        return;
+    }
+
+    /* successfully connected */
+    s->state = NBD_CLIENT_CONNECTED;
+    qemu_co_queue_restart_all(&s->free_sema);
+}
+
+static coroutine_fn void nbd_reconnect_loop(BDRVNBDState *s)
+{
+    uint64_t start_time_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
+    uint64_t delay_ns = s->reconnect_delay * NANOSECONDS_PER_SECOND;
+    uint64_t timeout = 1 * NANOSECONDS_PER_SECOND;
+    uint64_t max_timeout = 16 * NANOSECONDS_PER_SECOND;
+
+    nbd_reconnect_attempt(s);
+
+    while (nbd_client_connecting(s)) {
+        if (s->state == NBD_CLIENT_CONNECTING_WAIT &&
+            qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start_time_ns > delay_ns)
+        {
+            s->state = NBD_CLIENT_CONNECTING_NOWAIT;
+            qemu_co_queue_restart_all(&s->free_sema);
+        }
+
+        qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, timeout,
+                         &s->connection_co_sleep_ns_state);
+        if (s->drained) {
+            bdrv_dec_in_flight(s->bs);
+            s->wait_drained_end = true;
+            while (s->drained) {
+                /*
+                 * We may be entered once from nbd_client_attach_aio_context_bh
+                 * and then from nbd_client_co_drain_end. So here is a loop.
+                 */
+                qemu_coroutine_yield();
+            }
+            bdrv_inc_in_flight(s->bs);
+        }
+        if (timeout < max_timeout) {
+            timeout *= 2;
+        }
+
+        nbd_reconnect_attempt(s);
+    }
 }
 
 static coroutine_fn void nbd_connection_entry(void *opaque)
 {
-    BDRVNBDState *s = opaque;
+    BDRVNBDState *s = (BDRVNBDState *)opaque;
     uint64_t i;
     int ret = 0;
     Error *local_err = NULL;
@@ -177,16 +331,26 @@ static coroutine_fn void nbd_connection_entry(void *opaque)
          * Therefore we keep an additional in_flight reference all the time and
          * only drop it temporarily here.
          */
+
+        if (nbd_client_connecting(s)) {
+            nbd_reconnect_loop(s);
+        }
+
+        if (s->state != NBD_CLIENT_CONNECTED) {
+            continue;
+        }
+
         assert(s->reply.handle == 0);
         ret = nbd_receive_reply(s->bs, s->ioc, &s->reply, &local_err);
 
         if (local_err) {
             trace_nbd_read_reply_entry_fail(ret, error_get_pretty(local_err));
             error_free(local_err);
+            local_err = NULL;
         }
         if (ret <= 0) {
             nbd_channel_error(s, ret ? ret : -EIO);
-            break;
+            continue;
         }
 
         /*
@@ -201,7 +365,7 @@ static coroutine_fn void nbd_connection_entry(void *opaque)
             (nbd_reply_is_structured(&s->reply) && !s->info.structured_reply))
         {
             nbd_channel_error(s, -EINVAL);
-            break;
+            continue;
         }
 
         /*
@@ -220,10 +384,19 @@ static coroutine_fn void nbd_connection_entry(void *opaque)
         qemu_coroutine_yield();
     }
 
+    qemu_co_queue_restart_all(&s->free_sema);
     nbd_recv_coroutines_wake_all(s);
     bdrv_dec_in_flight(s->bs);
 
     s->connection_co = NULL;
+    if (s->ioc) {
+        nbd_client_detach_aio_context(s->bs);
+        object_unref(OBJECT(s->sioc));
+        s->sioc = NULL;
+        object_unref(OBJECT(s->ioc));
+        s->ioc = NULL;
+    }
+
     aio_wait_kick();
 }
 
@@ -235,7 +408,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
     int rc, i = -1;
 
     qemu_co_mutex_lock(&s->send_mutex);
-    while (s->in_flight == MAX_NBD_REQUESTS) {
+    while (s->in_flight == MAX_NBD_REQUESTS || nbd_client_connecting_wait(s)) {
         qemu_co_queue_wait(&s->free_sema, &s->send_mutex);
     }
 
@@ -286,7 +459,11 @@ err:
             s->requests[i].coroutine = NULL;
             s->in_flight--;
         }
-        qemu_co_queue_next(&s->free_sema);
+        if (s->in_flight == 0 && s->wait_in_flight) {
+            aio_co_wake(s->connection_co);
+        } else {
+            qemu_co_queue_next(&s->free_sema);
+        }
     }
     qemu_co_mutex_unlock(&s->send_mutex);
     return rc;
@@ -666,10 +843,15 @@ static coroutine_fn int nbd_co_receive_one_chunk(
     } else {
         /* For assert at loop start in nbd_connection_entry */
         *reply = s->reply;
-        s->reply.handle = 0;
     }
+    s->reply.handle = 0;
 
-    if (s->connection_co) {
+    if (s->connection_co && !s->wait_in_flight) {
+        /*
+         * We must check s->wait_in_flight, because we may entered by
+         * nbd_recv_coroutines_wake_all(), in this case we should not
+         * wake connection_co here, it will woken by last request.
+         */
         aio_co_wake(s->connection_co);
     }
 
@@ -781,7 +963,11 @@ break_loop:
 
     qemu_co_mutex_lock(&s->send_mutex);
     s->in_flight--;
-    qemu_co_queue_next(&s->free_sema);
+    if (s->in_flight == 0 && s->wait_in_flight) {
+        aio_co_wake(s->connection_co);
+    } else {
+        qemu_co_queue_next(&s->free_sema);
+    }
     qemu_co_mutex_unlock(&s->send_mutex);
 
     return false;
@@ -927,20 +1113,26 @@ static int nbd_co_request(BlockDriverState *bs, NBDRequest *request,
     } else {
         assert(request->type != NBD_CMD_WRITE);
     }
-    ret = nbd_co_send_request(bs, request, write_qiov);
-    if (ret < 0) {
-        return ret;
-    }
 
-    ret = nbd_co_receive_return_code(s, request->handle,
-                                     &request_ret, &local_err);
-    if (local_err) {
-        trace_nbd_co_request_fail(request->from, request->len, request->handle,
-                                  request->flags, request->type,
-                                  nbd_cmd_lookup(request->type),
-                                  ret, error_get_pretty(local_err));
-        error_free(local_err);
-    }
+    do {
+        ret = nbd_co_send_request(bs, request, write_qiov);
+        if (ret < 0) {
+            continue;
+        }
+
+        ret = nbd_co_receive_return_code(s, request->handle,
+                                         &request_ret, &local_err);
+        if (local_err) {
+            trace_nbd_co_request_fail(request->from, request->len,
+                                      request->handle, request->flags,
+                                      request->type,
+                                      nbd_cmd_lookup(request->type),
+                                      ret, error_get_pretty(local_err));
+            error_free(local_err);
+            local_err = NULL;
+        }
+    } while (ret < 0 && nbd_client_connecting_wait(s));
+
     return ret ? ret : request_ret;
 }
 
@@ -981,20 +1173,24 @@ static int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
         request.len -= slop;
     }
 
-    ret = nbd_co_send_request(bs, &request, NULL);
-    if (ret < 0) {
-        return ret;
-    }
+    do {
+        ret = nbd_co_send_request(bs, &request, NULL);
+        if (ret < 0) {
+            continue;
+        }
+
+        ret = nbd_co_receive_cmdread_reply(s, request.handle, offset, qiov,
+                                           &request_ret, &local_err);
+        if (local_err) {
+            trace_nbd_co_request_fail(request.from, request.len, request.handle,
+                                      request.flags, request.type,
+                                      nbd_cmd_lookup(request.type),
+                                      ret, error_get_pretty(local_err));
+            error_free(local_err);
+            local_err = NULL;
+        }
+    } while (ret < 0 && nbd_client_connecting_wait(s));
 
-    ret = nbd_co_receive_cmdread_reply(s, request.handle, offset, qiov,
-                                       &request_ret, &local_err);
-    if (local_err) {
-        trace_nbd_co_request_fail(request.from, request.len, request.handle,
-                                  request.flags, request.type,
-                                  nbd_cmd_lookup(request.type),
-                                  ret, error_get_pretty(local_err));
-        error_free(local_err);
-    }
     return ret ? ret : request_ret;
 }
 
@@ -1127,20 +1323,25 @@ static int coroutine_fn nbd_client_co_block_status(
     if (s->info.min_block) {
         assert(QEMU_IS_ALIGNED(request.len, s->info.min_block));
     }
-    ret = nbd_co_send_request(bs, &request, NULL);
-    if (ret < 0) {
-        return ret;
-    }
+    do {
+        ret = nbd_co_send_request(bs, &request, NULL);
+        if (ret < 0) {
+            continue;
+        }
+
+        ret = nbd_co_receive_blockstatus_reply(s, request.handle, bytes,
+                                               &extent, &request_ret,
+                                               &local_err);
+        if (local_err) {
+            trace_nbd_co_request_fail(request.from, request.len, request.handle,
+                                      request.flags, request.type,
+                                      nbd_cmd_lookup(request.type),
+                                      ret, error_get_pretty(local_err));
+            error_free(local_err);
+            local_err = NULL;
+        }
+    } while (ret < 0 && nbd_client_connecting_wait(s));
 
-    ret = nbd_co_receive_blockstatus_reply(s, request.handle, bytes,
-                                           &extent, &request_ret, &local_err);
-    if (local_err) {
-        trace_nbd_co_request_fail(request.from, request.len, request.handle,
-                                  request.flags, request.type,
-                                  nbd_cmd_lookup(request.type),
-                                  ret, error_get_pretty(local_err));
-        error_free(local_err);
-    }
     if (ret < 0 || request_ret < 0) {
         return ret ? ret : request_ret;
     }
@@ -1159,9 +1360,9 @@ static void nbd_client_close(BlockDriverState *bs)
     BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
     NBDRequest request = { .type = NBD_CMD_DISC };
 
-    assert(s->ioc);
-
-    nbd_send_request(s->ioc, &request);
+    if (s->ioc) {
+        nbd_send_request(s->ioc, &request);
+    }
 
     nbd_teardown_connection(bs);
 }
@@ -1804,6 +2005,8 @@ static BlockDriver bdrv_nbd = {
     .bdrv_getlength             = nbd_getlength,
     .bdrv_detach_aio_context    = nbd_client_detach_aio_context,
     .bdrv_attach_aio_context    = nbd_client_attach_aio_context,
+    .bdrv_co_drain_begin        = nbd_client_co_drain_begin,
+    .bdrv_co_drain_end          = nbd_client_co_drain_end,
     .bdrv_refresh_filename      = nbd_refresh_filename,
     .bdrv_co_block_status       = nbd_client_co_block_status,
     .bdrv_dirname               = nbd_dirname,
@@ -1826,6 +2029,8 @@ static BlockDriver bdrv_nbd_tcp = {
     .bdrv_getlength             = nbd_getlength,
     .bdrv_detach_aio_context    = nbd_client_detach_aio_context,
     .bdrv_attach_aio_context    = nbd_client_attach_aio_context,
+    .bdrv_co_drain_begin        = nbd_client_co_drain_begin,
+    .bdrv_co_drain_end          = nbd_client_co_drain_end,
     .bdrv_refresh_filename      = nbd_refresh_filename,
     .bdrv_co_block_status       = nbd_client_co_block_status,
     .bdrv_dirname               = nbd_dirname,
@@ -1848,6 +2053,8 @@ static BlockDriver bdrv_nbd_unix = {
     .bdrv_getlength             = nbd_getlength,
     .bdrv_detach_aio_context    = nbd_client_detach_aio_context,
     .bdrv_attach_aio_context    = nbd_client_attach_aio_context,
+    .bdrv_co_drain_begin        = nbd_client_co_drain_begin,
+    .bdrv_co_drain_end          = nbd_client_co_drain_end,
     .bdrv_refresh_filename      = nbd_refresh_filename,
     .bdrv_co_block_status       = nbd_client_co_block_status,
     .bdrv_dirname               = nbd_dirname,
-- 
2.18.0

From nobody Sun May 5 04:25:55 2024
From: Vladimir Sementsov-Ogievskiy
To: qemu-block@nongnu.org
Cc: fam@euphon.net, kwolf@redhat.com, vsementsov@virtuozzo.com, sheepdog@lists.wpkg.org, qemu-devel@nongnu.org, mreitz@redhat.com, stefanha@redhat.com, den@openvz.org, namei.unix@gmail.com
Date: Wed, 21 Aug 2019 19:52:15 +0300
Message-Id: <20190821165215.61406-4-vsementsov@virtuozzo.com>
In-Reply-To: <20190821165215.61406-1-vsementsov@virtuozzo.com>
References: <20190821165215.61406-1-vsementsov@virtuozzo.com>
Subject: [Qemu-devel] [PATCH v8 3/3] iotests: test nbd reconnect

Add test, which starts backup to nbd target and restarts nbd server
during backup.

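For context on the timing such a test relies on, here is a rough Python model of the retry schedule in nbd_reconnect_loop() from patch 2/3. It is only a sketch: `reconnect_schedule` is a hypothetical helper, connection attempts are assumed to take no time, and early wake-ups via drain are ignored. The 1 s starting timeout, the 16 s cap, the doubling, and the mode names are taken from the series.

```python
def reconnect_schedule(reconnect_delay_s, retries):
    """Return (time_s, mode) for each retry after the initial failed attempt."""
    timeout, max_timeout = 1, 16
    now = 0
    # A channel error enters connecting-wait only if reconnect-delay is nonzero.
    mode = "connecting-wait" if reconnect_delay_s else "connecting-nowait"
    schedule = []
    for _ in range(retries):
        # The mode is re-evaluated before each sleep, not after it.
        if mode == "connecting-wait" and now > reconnect_delay_s:
            mode = "connecting-nowait"
        now += timeout                # sleep between attempts
        if timeout < max_timeout:
            timeout *= 2              # exponential backoff up to the cap
        schedule.append((now, mode))
    return schedule


for when, mode in reconnect_schedule(10, 6):
    print(when, mode)
```

Under these assumptions, with reconnect-delay set to 10 the retries at 1 s and 3 s both happen in connecting-wait mode, so a server that is down for only a second or two leaves requests queued rather than failed.
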
Signed-off-by: Vladimir Sementsov-Ogievskiy
---
 tests/qemu-iotests/264        | 65 +++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/264.out    | 12 +++++++
 tests/qemu-iotests/group      |  1 +
 tests/qemu-iotests/iotests.py |  4 +++
 4 files changed, 82 insertions(+)
 create mode 100755 tests/qemu-iotests/264
 create mode 100644 tests/qemu-iotests/264.out

diff --git a/tests/qemu-iotests/264 b/tests/qemu-iotests/264
new file mode 100755
index 0000000000..e70f91c5ca
--- /dev/null
+++ b/tests/qemu-iotests/264
@@ -0,0 +1,65 @@
+#!/usr/bin/env python
+#
+# Test nbd reconnect
+#
+# Copyright (c) 2019 Virtuozzo International GmbH. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see .
+#
+
+import time
+
+import iotests
+from iotests import qemu_img_create, file_path, qemu_nbd_popen, log
+
+disk_a, disk_b, nbd_sock = file_path('disk_a', 'disk_b', 'nbd-sock')
+
+qemu_img_create('-f', iotests.imgfmt, disk_a, '5M')
+qemu_img_create('-f', iotests.imgfmt, disk_b, '5M')
+srv = qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, disk_b)
+time.sleep(1)
+
+vm = iotests.VM().add_drive(disk_a)
+vm.launch()
+vm.hmp_qemu_io('drive0', 'write 0 5M')
+
+vm.qmp_log('blockdev-add', filters=[iotests.filter_qmp_testfiles],
+           **{'node_name': 'backup0',
+              'driver': 'raw',
+              'file': {'driver': 'nbd',
+                       'server': {'type': 'unix', 'path': nbd_sock},
+                       'reconnect-delay': 10}})
+vm.qmp_log('blockdev-backup', device='drive0', sync='full', target='backup0',
+           speed=(1 * 1024 * 1024))
+
+time.sleep(1)
+log('Kill NBD server')
+srv.kill()
+
+jobs = vm.qmp('query-block-jobs')['return']
+if jobs and jobs[0]['offset'] < jobs[0]['len']:
+    log('Backup job is still in progress')
+
+time.sleep(1)
+
+log('Start NBD server')
+srv = qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, disk_b)
+
+vm.qmp_log('block-job-set-speed', device='drive0', speed=0)
+e = vm.event_wait('BLOCK_JOB_COMPLETED')
+log('Backup completed: {}'.format(e['data']['offset']))
+
+vm.qmp_log('blockdev-del', node_name='backup0')
+srv.kill()
+vm.shutdown()
diff --git a/tests/qemu-iotests/264.out b/tests/qemu-iotests/264.out
new file mode 100644
index 0000000000..4a2f4aa509
--- /dev/null
+++ b/tests/qemu-iotests/264.out
@@ -0,0 +1,12 @@
+{"execute": "blockdev-add", "arguments": {"driver": "raw", "file": {"driver": "nbd", "reconnect-delay": 10, "server": {"path": "TEST_DIR/PID-nbd-sock", "type": "unix"}}, "node-name": "backup0"}}
+{"return": {}}
+{"execute": "blockdev-backup", "arguments": {"device": "drive0", "speed": 1048576, "sync": "full", "target": "backup0"}}
+{"return": {}}
+Kill NBD server
+Backup job is still in progress
+Start NBD server
+{"execute": "block-job-set-speed", "arguments": {"device": "drive0", "speed": 0}}
+{"return": {}}
+Backup completed: 5242880
+{"execute": "blockdev-del", "arguments": {"node-name": "backup0"}}
+{"return": {}}
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index d95d556414..34c2b89108 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -274,3 +274,4 @@
 257 rw
 258 rw quick
 262 rw quick migration
+264 rw quick
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 84438e837c..ab2f8f7d1b 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -229,6 +229,10 @@ def qemu_nbd_early_pipe(*args):
     else:
         return exitcode, subp.communicate()[0]
 
+def qemu_nbd_popen(*args):
+    '''Start a persistent qemu-nbd server and return its subprocess.Popen object'''
+    return subprocess.Popen(qemu_nbd_args + ['--persistent'] + list(args))
+
 def compare_images(img1, img2, fmt1=imgfmt, fmt2=imgfmt):
     '''Return True if two image files are identical'''
     return qemu_img('compare', '-f', fmt1,
-- 
2.18.0
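
A side note on the time.sleep(1) calls that follow qemu_nbd_popen() in
test 264: a fixed sleep only guesses when the server is ready to accept
connections. Below is a minimal sketch of one possible alternative,
assuming a small polling helper would be acceptable in the test. The
name wait_for_unix_socket and its timeout and interval parameters are
hypothetical; they are not part of this patch or of iotests.py. The
sketch uses only the Python 3 standard library.

```python
import os
import socket
import time


def wait_for_unix_socket(path, timeout=5.0, interval=0.05):
    """Poll until a Unix stream socket at `path` accepts a connection.

    Hypothetical helper, not part of iotests.py. Returns True once a
    connect() succeeds, or False if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            try:
                probe.connect(path)
                return True
            except OSError:
                # Socket file exists but nobody is listening yet.
                pass
            finally:
                probe.close()
        time.sleep(interval)
    return False
```

In the test this could be called as wait_for_unix_socket(nbd_sock)
right after each qemu_nbd_popen(...) call. The probe connects and then
closes without performing an NBD handshake; how qemu-nbd reacts to such
a client has not been verified here, so treat this as an illustration
of the polling idea only.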