From: "Rao, Lei" <lei.rao@intel.com>
To: chen.zhang@intel.com, eblake@redhat.com, vsementsov@virtuozzo.com, kwolf@redhat.com, hreitz@redhat.com, berrange@redhat.com
Cc: "Rao, Lei" <lei.rao@intel.com>, qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: [PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits cleanly in some corner cases
Date: Wed, 1 Dec 2021 15:54:27 +0800
Message-Id: <20211201075427.155702-1-lei.rao@intel.com>

We found that the QIO channel coroutine could not be awakened in some corner cases during our stress test for COLO.
The problem this patch fixes looks like this:

    #0  0x00007fad72e24bf6 in __ppoll (fds=0x5563d75861f0, nfds=1, timeout=<optimized out>, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44
    #1  0x00005563d6977c68 in qemu_poll_ns (fds=0x5563d75861f0, nfds=1, timeout=599999697630) at util/qemu-timer.c:348
    #2  0x00005563d697ac44 in aio_poll (ctx=0x5563d755dfa0, blocking=true) at util/aio-posix.c:669
    #3  0x00005563d68ba24f in bdrv_do_drained_begin (bs=0x5563d75ea0c0, recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at block/io.c:432
    #4  0x00005563d68ba338 in bdrv_drained_begin (bs=0x5563d75ea0c0) at block/io.c:438
    #5  0x00005563d689c6d2 in quorum_del_child (bs=0x5563d75ea0c0, child=0x5563d7908600, errp=0x7fff3cf5b960) at block/quorum.c:1063
    #6  0x00005563d684328f in bdrv_del_child (parent_bs=0x5563d75ea0c0, child=0x5563d7908600, errp=0x7fff3cf5b960) at block.c:6568
    #7  0x00005563d658499e in qmp_x_blockdev_change (parent=0x5563d80ec4c0 "colo-disk0", has_child=true, child=0x5563d7577d30 "children.1", has_node=false, node=0x0, errp=0x7fff3cf5b960) at blockdev.c:4494
    #8  0x00005563d67e8b4e in qmp_marshal_x_blockdev_change (args=0x7fad6400ada0, ret=0x7fff3cf5b9f8, errp=0x7fff3cf5b9f0) at qapi/qapi-commands-block-core.c:1538
    #9  0x00005563d691cd9e in do_qmp_dispatch (cmds=0x5563d719a4f0, request=0x7fad64009d80, allow_oob=true, errp=0x7fff3cf5ba98) at qapi/qmp-dispatch.c:132
    #10 0x00005563d691cfab in qmp_dispatch (cmds=0x5563d719a4f0, request=0x7fad64009d80, allow_oob=true) at qapi/qmp-dispatch.c:175
    #11 0x00005563d67b7787 in monitor_qmp_dispatch (mon=0x5563d7605d40, req=0x7fad64009d80) at monitor/qmp.c:145
    #12 0x00005563d67b7b24 in monitor_qmp_bh_dispatcher (data=0x0) at monitor/qmp.c:234
    #13 0x00005563d69754c2 in aio_bh_call (bh=0x5563d7473230) at util/async.c:89
    #14 0x00005563d697555e in aio_bh_poll (ctx=0x5563d7471da0) at util/async.c:117
    #15 0x00005563d697a423 in aio_dispatch (ctx=0x5563d7471da0) at util/aio-posix.c:459
    #16 0x00005563d6975917 in aio_ctx_dispatch (source=0x5563d7471da0, callback=0x0, user_data=0x0) at util/async.c:260
    #17 0x00007fad730e1fbd in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
    #18 0x00005563d6978cda in glib_pollfds_poll () at util/main-loop.c:219
    #19 0x00005563d6978d58 in os_host_main_loop_wait (timeout=977650) at util/main-loop.c:242
    #20 0x00005563d6978e69 in main_loop_wait (nonblocking=0) at util/main-loop.c:518
    #21 0x00005563d658f5ed in main_loop () at vl.c:1814
    #22 0x00005563d6596bb7 in main (argc=61, argv=0x7fff3cf5c0c8, envp=0x7fff3cf5c2b8) at vl.c:450

From the call trace, we can see that the QEMU main thread is waiting for in_flight to drop to zero, but in_flight never reaches 0. After analysis, the root cause is that the NBD coroutine was not awakened. Although this bug was found in the COLO test, I think it is a general problem in the QIO module, and it affects any module that depends on QIO, such as NBD. We dumped the following data:

    $2 = {
      in_flight = 2,
      state = NBD_CLIENT_QUIT,
      connect_status = 0,
      connect_err = 0x0,
      wait_in_flight = false,
      requests = {{
          coroutine = 0x0,
          offset = 273077071872,
          receiving = false,
        }, {
          coroutine = 0x7f1164571df0,
          offset = 498792529920,
          receiving = false,
        }, {
          coroutine = 0x0,
          offset = 500663590912,
          receiving = false,
        }, {
          ...
        }},
      reply = {
        simple = {
          magic = 1732535960,
          error = 0,
          handle = 0
        },
        structured = {
          magic = 1732535960,
          flags = 0,
          type = 0,
          handle = 0,
          length = 0
        },
        {
          magic = 1732535960,
          _skip = 0,
          handle = 0
        }
      },
      bs = 0x7f11640e2140,
      reconnect_delay = 0,
      saddr = 0x7f11640e1f80,
      export = 0x7f11640dc560 "parent0",
    }

From the data, we can see that the NBD coroutine does not exit normally when the NBD server is killed (in the COLO stress test we kill the Secondary VM), so it never executes in_flight--, and the QEMU main thread hangs there.

On further analysis, I found that the NBD coroutine yields in nbd_send_request() or qio_channel_write_all() within nbd_co_send_request(); the yield happens because the kernel returns EAGAIN under the stress test. However, NBD does not know that it yielded there, so nbd_recv_coroutines_wake will not wake it up, because req->receiving is false. If the NBD server terminates abnormally at the same time, the G_IO_OUT condition becomes invalid and the coroutine is never awakened. In addition, s->ioc is released immediately, so forcibly waking the NBD coroutine would make any access to ioc->xxx trigger a segmentation fault. Finally, I add a force_quit state to the QIOChannel to ensure that QIO exits immediately, and add qio_channel_coroutines_wake() to wake the coroutine up.
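To illustrate why the existing wake path cannot help here: it only re-enters coroutines that are parked receiving a reply. A simplified paraphrase (not the verbatim source) of that loop in block/nbd.c:

    static void nbd_recv_coroutines_wake(BDRVNBDState *s)
    {
        int i;

        for (i = 0; i < MAX_NBD_REQUESTS; i++) {
            NBDClientRequest *req = &s->requests[i];

            /* A coroutine parked in a G_IO_OUT yield inside
             * nbd_co_send_request() has req->receiving == false,
             * so this guard skips it and it is never re-entered. */
            if (req->coroutine && req->receiving) {
                aio_co_wake(req->coroutine);
            }
        }
    }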
Signed-off-by: Lei Rao <lei.rao@intel.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
---
 block/nbd.c          |  5 +++++
 include/io/channel.h | 19 +++++++++++++++++++
 io/channel.c         | 22 ++++++++++++++++++++++
 3 files changed, 46 insertions(+)
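For reviewers, a minimal sketch of the calling pattern the new API is meant to support; example_teardown() is hypothetical, and only the qio_channel_set_force_quit()/qio_channel_coroutines_wake() calls are introduced by this patch:

    /* Hypothetical caller, mirroring the nbd_teardown_connection() hunk below. */
    static void example_teardown(QIOChannel *ioc)
    {
        /* Flag the channel so qio_channel_yield() returns as soon as the
         * parked coroutine is re-entered, instead of touching channel
         * state that is about to be freed. */
        qio_channel_set_force_quit(ioc, true);

        /* Re-enter any coroutine parked in a read or write yield so it
         * can observe force_quit, unwind, and drop its in_flight count. */
        qio_channel_coroutines_wake(ioc);

        /* Only now is it safe to shut the channel down. */
        qio_channel_shutdown(ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
    }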
diff --git a/block/nbd.c b/block/nbd.c
index 5ef462db1b..5ee4eaaf57 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -208,6 +208,8 @@ static void nbd_teardown_connection(BlockDriverState *bs)
     assert(!s->in_flight);
 
     if (s->ioc) {
+        qio_channel_set_force_quit(s->ioc, true);
+        qio_channel_coroutines_wake(s->ioc);
         qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
         yank_unregister_function(BLOCKDEV_YANK_INSTANCE(s->bs->node_name),
                                  nbd_yank, s->bs);
@@ -315,6 +317,7 @@ int coroutine_fn nbd_co_do_establish_connection(BlockDriverState *bs,
 
     /* successfully connected */
     s->state = NBD_CLIENT_CONNECTED;
+    qio_channel_set_force_quit(s->ioc, false);
     qemu_co_queue_restart_all(&s->free_sema);
 
     return 0;
@@ -345,6 +348,8 @@ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
 
     /* Finalize previous connection if any */
     if (s->ioc) {
+        qio_channel_set_force_quit(s->ioc, true);
+        qio_channel_coroutines_wake(s->ioc);
         qio_channel_detach_aio_context(QIO_CHANNEL(s->ioc));
         yank_unregister_function(BLOCKDEV_YANK_INSTANCE(s->bs->node_name),
                                  nbd_yank, s->bs);
diff --git a/include/io/channel.h b/include/io/channel.h
index 88988979f8..bc5af8cdd6 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -78,6 +78,7 @@ struct QIOChannel {
     AioContext *ctx;
     Coroutine *read_coroutine;
     Coroutine *write_coroutine;
+    bool force_quit;
 #ifdef _WIN32
     HANDLE event; /* For use with GSource on Win32 */
 #endif
@@ -484,6 +485,24 @@ int qio_channel_set_blocking(QIOChannel *ioc,
                              bool enabled,
                              Error **errp);
 
+/**
+ * qio_channel_set_force_quit:
+ * @ioc: the channel object
+ * @quit: the new flag state
+ *
+ * Set the channel's force_quit flag
+ */
+void qio_channel_set_force_quit(QIOChannel *ioc, bool quit);
+
+/**
+ * qio_channel_coroutines_wake:
+ * @ioc: the channel object
+ *
+ * Wake up the coroutines so that they exit normally when the
+ * server terminates abnormally
+ */
+void qio_channel_coroutines_wake(QIOChannel *ioc);
+
 /**
  * qio_channel_close:
  * @ioc: the channel object
diff --git a/io/channel.c b/io/channel.c
index e8b019dc36..bc1a9e055c 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -136,6 +136,9 @@ int qio_channel_readv_full_all_eof(QIOChannel *ioc,
         if (len == QIO_CHANNEL_ERR_BLOCK) {
             if (qemu_in_coroutine()) {
                 qio_channel_yield(ioc, G_IO_IN);
+                if (ioc->force_quit) {
+                    goto cleanup;
+                }
             } else {
                 qio_channel_wait(ioc, G_IO_IN);
             }
@@ -242,6 +245,9 @@ int qio_channel_writev_full_all(QIOChannel *ioc,
         if (len == QIO_CHANNEL_ERR_BLOCK) {
             if (qemu_in_coroutine()) {
                 qio_channel_yield(ioc, G_IO_OUT);
+                if (ioc->force_quit) {
+                    goto cleanup;
+                }
             } else {
                 qio_channel_wait(ioc, G_IO_OUT);
             }
@@ -543,6 +549,9 @@ void coroutine_fn qio_channel_yield(QIOChannel *ioc,
     }
     qio_channel_set_aio_fd_handlers(ioc);
     qemu_coroutine_yield();
+    if (ioc->force_quit) {
+        return;
+    }
 
     /* Allow interrupting the operation by reentering the coroutine other than
      * through the aio_fd_handlers. */
@@ -555,6 +564,19 @@ void coroutine_fn qio_channel_yield(QIOChannel *ioc,
     }
 }
 
+void qio_channel_set_force_quit(QIOChannel *ioc, bool quit)
+{
+    ioc->force_quit = quit;
+}
+
+void qio_channel_coroutines_wake(QIOChannel *ioc)
+{
+    if (ioc->read_coroutine) {
+        qio_channel_restart_read(ioc);
+    } else if (ioc->write_coroutine) {
+        qio_channel_restart_write(ioc);
+    }
+}
 
 static gboolean qio_channel_wait_complete(QIOChannel *ioc,
                                           GIOCondition condition,
-- 
2.32.0