From nobody Tue May 21 06:04:55 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=yandex-team.ru ARC-Seal: i=1; a=rsa-sha256; t=1611906090; cv=none; d=zohomail.com; s=zohoarc; b=nFM0Zi+LN4QOrJnBH1x61Y1wOCvvqMsVKGZ91hW1HuudP0skmaMEtNzk8MXyDXwKlM1H1MuS/ck+A5elS/MzfybnlfVWyRAOctaZDIHOYJeBdAgY2iHGZKkukZW/2QsWOnPUkOBn6QSFC5mLh39NUxpnDOkRhZzlbNCY9xsDuc4= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1611906090; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=JnuoYilODbCdaWsV5gPSIV0qw3rSuchMIyJ5xvvg2Nc=; b=KGhT9ZmhZKIqzSD7UH8F0SahiOj1UN74DX6US5LwD0vrPf8JvIXlvSkQHULfMUUzunkRCDC8C/TQjs8oMPhoEuwe0KScP0rpAetRp7dfnnO15cNCNGQDe+tO9KqjxOZMj+y+Z6KtKxOPf8xfT/P0KU2uqAOcXNUoHAEubbN8ZOM= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail header.from= (p=none dis=none) header.from= Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1611906090769284.3961207882586; Thu, 28 Jan 2021 23:41:30 -0800 (PST) Received: from localhost ([::1]:45098 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1l5OPJ-0000X5-KS for importer@patchew.org; Fri, 29 Jan 2021 02:41:29 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:56376) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1l5OMz-0007LY-Uw; Fri, 29 Jan 2021 02:39:05 -0500 Received: from forwardcorp1o.mail.yandex.net ([95.108.205.193]:51006) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1l5OMx-0004LS-Su; Fri, 29 Jan 2021 02:39:05 -0500 Received: from vla1-fdfb804fb3f3.qloud-c.yandex.net (vla1-fdfb804fb3f3.qloud-c.yandex.net [IPv6:2a02:6b8:c0d:3199:0:640:fdfb:804f]) by forwardcorp1o.mail.yandex.net (Yandex) with ESMTP id 9DD182E1E11; Fri, 29 Jan 2021 10:39:01 +0300 (MSK) Received: from vla1-81430ab5870b.qloud-c.yandex.net (vla1-81430ab5870b.qloud-c.yandex.net [2a02:6b8:c0d:35a1:0:640:8143:ab5]) by vla1-fdfb804fb3f3.qloud-c.yandex.net (mxbackcorp/Yandex) with ESMTP id iKJDzONt78-d0waNa8Y; Fri, 29 Jan 2021 10:39:01 +0300 Received: from dynamic-red3.dhcp.yndx.net (dynamic-red3.dhcp.yndx.net [2a02:6b8:0:419:7359:4dc3:71d:4c5a]) by vla1-81430ab5870b.qloud-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id jIpaBE76QG-d0mWLGOf; Fri, 29 Jan 2021 10:39:00 +0300 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) (Client certificate not present) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1611905941; bh=JnuoYilODbCdaWsV5gPSIV0qw3rSuchMIyJ5xvvg2Nc=; h=In-Reply-To:Message-Id:References:Date:Subject:To:From:Cc; b=wHWWaBJIbcdXYrAmyvwt6VhlRkC1a0KUCBe3JIynn8k2QqWzZEezH8SpobZgFLzAE a2JJ0BrOQStnqpfJ4seNMbfXqDDiu0fWqE2BNzK26uGQEBO8ddlLZPVYHGzaFpQqS1 yfQ2eR/jboQfN5RkfN3/vqIUkYlLdgcPpgHkw9V4= Authentication-Results: vla1-fdfb804fb3f3.qloud-c.yandex.net; dkim=pass header.i=@yandex-team.ru From: Roman Kagan To: qemu-devel@nongnu.org Subject: [PATCH v2 1/3] block/nbd: only detach existing iochannel from aio_context Date: Fri, 29 Jan 2021 10:38:57 +0300 Message-Id: <20210129073859.683063-2-rvkagan@yandex-team.ru> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210129073859.683063-1-rvkagan@yandex-team.ru> References: <20210129073859.683063-1-rvkagan@yandex-team.ru> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=95.108.205.193; envelope-from=rvkagan@yandex-team.ru; helo=forwardcorp1o.mail.yandex.net X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Vladimir Sementsov-Ogievskiy , qemu-block@nongnu.org, Max Reitz Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) Content-Type: text/plain; charset="utf-8" When the reconnect in NBD client is in progress, the iochannel used for NBD connection doesn't exist. Therefore an attempt to detach it from the aio_context of the parent BlockDriverState results in a NULL pointer dereference. The problem is triggerable, in particular, when an outgoing migration is about to finish, and stopping the dataplane tries to move the BlockDriverState from the iothread aio_context to the main loop. If the NBD connection is lost before this point, and the NBD client has entered the reconnect procedure, QEMU crashes: #0 qemu_aio_coroutine_enter (ctx=3D0x5618056c7580, co=3D0x0) at /build/qemu-6MF7tq/qemu-5.0.1/util/qemu-coroutine.c:109 #1 0x00005618034b1b68 in nbd_client_attach_aio_context_bh ( opaque=3D0x561805ed4c00) at /build/qemu-6MF7tq/qemu-5.0.1/block/nbd.c= :164 #2 0x000056180353116b in aio_wait_bh (opaque=3D0x7f60e1e63700) at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-wait.c:55 #3 0x0000561803530633 in aio_bh_call (bh=3D0x7f60d40a7e80) at /build/qemu-6MF7tq/qemu-5.0.1/util/async.c:136 #4 aio_bh_poll (ctx=3Dctx@entry=3D0x5618056c7580) at /build/qemu-6MF7tq/qemu-5.0.1/util/async.c:164 #5 0x0000561803533e5a in aio_poll (ctx=3Dctx@entry=3D0x5618056c7580, blocking=3Dblocking@entry=3Dtrue) at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-posix.c:650 #6 0x000056180353128d in aio_wait_bh_oneshot (ctx=3D0x5618056c7580, cb=3D, opaque=3D) at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-wait.c:71 #7 0x000056180345c50a in bdrv_attach_aio_context (new_context=3D0x561805= 6c7580, bs=3D0x561805ed4c00) at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6172 #8 bdrv_set_aio_context_ignore (bs=3Dbs@entry=3D0x561805ed4c00, new_context=3Dnew_context@entry=3D0x5618056c7580, ignore=3Dignore@entry=3D0x7f60e1e63780) at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6237 #9 0x000056180345c969 in bdrv_child_try_set_aio_context ( bs=3Dbs@entry=3D0x561805ed4c00, ctx=3D0x5618056c7580, ignore_child=3D, errp=3D) at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6332 #10 0x00005618034957db in blk_do_set_aio_context (blk=3D0x56180695b3f0, new_context=3D0x5618056c7580, update_root_node=3Dupdate_root_node@ent= ry=3Dtrue, errp=3Derrp@entry=3D0x0) at /build/qemu-6MF7tq/qemu-5.0.1/block/block-backend.c:1989 #11 0x00005618034980bd in blk_set_aio_context (blk=3D, new_context=3D, errp=3Derrp@entry=3D0x0) at /build/qemu-6MF7tq/qemu-5.0.1/block/block-backend.c:2010 #12 0x0000561803197953 in virtio_blk_data_plane_stop (vdev=3D) at /build/qemu-6MF7tq/qemu-5.0.1/hw/block/dataplane/virtio-blk.c:292 #13 0x00005618033d67bf in virtio_bus_stop_ioeventfd (bus=3D0x5618056d9f08) at /build/qemu-6MF7tq/qemu-5.0.1/hw/virtio/virtio-bus.c:245 #14 0x00005618031c9b2e in virtio_vmstate_change (opaque=3D0x5618056d9f90, running=3D0, state=3D) at /build/qemu-6MF7tq/qemu-5.0.1/hw/virtio/virtio.c:3220 #15 0x0000561803208bfd in vm_state_notify (running=3Drunning@entry=3D0, state=3Dstate@entry=3DRUN_STATE_FINISH_MIGRATE) at /build/qemu-6MF7tq/qemu-5.0.1/softmmu/vl.c:1275 #16 0x0000561803155c02 in do_vm_stop (state=3DRUN_STATE_FINISH_MIGRATE, send_stop=3D) at /build/qemu-6MF7tq/qemu-5.0.1/cpus.c:= 1032 #17 0x00005618033e3765 in migration_completion (s=3D0x5618056e6960) at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:2914 #18 migration_iteration_run (s=3D0x5618056e6960) at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:3275 #19 migration_thread (opaque=3Dopaque@entry=3D0x5618056e6960) at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:3439 #20 0x0000561803536ad6 in qemu_thread_start (args=3D) at /build/qemu-6MF7tq/qemu-5.0.1/util/qemu-thread-posix.c:519 #21 0x00007f61085d06ba in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 #22 0x00007f610830641d in sysctl () from /lib/x86_64-linux-gnu/libc.so.6 #23 0x0000000000000000 in ?? () Fix it by checking that the iochannel is non-null before trying to detach it from the aio_context. If it is null, no detaching is needed, and it will get reattached in the proper aio_context once the connection is reestablished. Signed-off-by: Roman Kagan Reviewed-by: Vladimir Sementsov-Ogievskiy --- block/nbd.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/block/nbd.c b/block/nbd.c index 42e10c7c93..bcd6641e90 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -235,7 +235,14 @@ static void nbd_client_detach_aio_context(BlockDriverS= tate *bs) =20 /* Timer is deleted in nbd_client_co_drain_begin() */ assert(!s->reconnect_delay_timer); - qio_channel_detach_aio_context(QIO_CHANNEL(s->ioc)); + /* + * If reconnect is in progress we may have no ->ioc. It will be + * re-instantiated in the proper aio context once the connection is + * reestablished. + */ + if (s->ioc) { + qio_channel_detach_aio_context(QIO_CHANNEL(s->ioc)); + } } =20 static void nbd_client_attach_aio_context_bh(void *opaque) --=20 2.29.2 From nobody Tue May 21 06:04:55 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=yandex-team.ru ARC-Seal: i=1; a=rsa-sha256; t=1611906091; cv=none; d=zohomail.com; s=zohoarc; b=d6JLLP7mB4fZrqlxTblTKrNNktB+tHB0LKvfh4YSWmnW90sxiqOku37lJT+2Y02X9jwavWJ9M1czSQjEjRJB6AGPGZOeN0MhAOl6dXqghlHsITneYXpgV4FHrd0rzqzX704yXZx+GzICyDuxqd/qGsXcn0NFwTIyLBZXEvYRhV4= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1611906091; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=kU/u5e3DrvCv3UfEIUFW5g+as5m2didDZrTa2yypoYE=; b=YKgj7hHgtbC4HIWPjgPT5N0BjfY5L83sY0lErILINpuUYS51wMhIkRNUBKvdYbCse5CPBiNv/G6Fi1mu3fGDa08pq4rOHR2fmjoXWiW30y3R7WyFwo7pfQQVjRTyFUOnG7JTTn4FvXwO1lJXMkJyILroZ5u8S7ygD95xNHVJypQ= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail header.from= (p=none dis=none) header.from= Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1611906091809442.74713862523777; Thu, 28 Jan 2021 23:41:31 -0800 (PST) Received: from localhost ([::1]:45124 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1l5OPK-0000Xh-Lo for importer@patchew.org; Fri, 29 Jan 2021 02:41:30 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:56396) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1l5ON2-0007Ox-2q; Fri, 29 Jan 2021 02:39:08 -0500 Received: from forwardcorp1p.mail.yandex.net ([2a02:6b8:0:1472:2741:0:8b6:217]:34514) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1l5OMz-0004M7-Ol; Fri, 29 Jan 2021 02:39:07 -0500 Received: from vla1-fdfb804fb3f3.qloud-c.yandex.net (vla1-fdfb804fb3f3.qloud-c.yandex.net [IPv6:2a02:6b8:c0d:3199:0:640:fdfb:804f]) by forwardcorp1p.mail.yandex.net (Yandex) with ESMTP id 502E22E1DA6; Fri, 29 Jan 2021 10:39:02 +0300 (MSK) Received: from vla1-81430ab5870b.qloud-c.yandex.net (vla1-81430ab5870b.qloud-c.yandex.net [2a02:6b8:c0d:35a1:0:640:8143:ab5]) by vla1-fdfb804fb3f3.qloud-c.yandex.net (mxbackcorp/Yandex) with ESMTP id fJ3QJLVsue-d1wqhtwX; Fri, 29 Jan 2021 10:39:02 +0300 Received: from dynamic-red3.dhcp.yndx.net (dynamic-red3.dhcp.yndx.net [2a02:6b8:0:419:7359:4dc3:71d:4c5a]) by vla1-81430ab5870b.qloud-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id jIpaBE76QG-d1mW5XHT; Fri, 29 Jan 2021 10:39:01 +0300 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) (Client certificate not present) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1611905942; bh=kU/u5e3DrvCv3UfEIUFW5g+as5m2didDZrTa2yypoYE=; h=In-Reply-To:Message-Id:References:Date:Subject:To:From:Cc; b=h/rj6p5rlg9l5D0Zut+Tt9teOIrssygzO13VqtMOuGHuFpid7I98eD9TQgAadjQ5c 3CjcCLkdEWivUJ556Yt/cSFvtCE6u830yllMQZiFS2eq1NS13DoqNW7GA8Uo0jhEFO cT6shZOYehEN4i2+jJFUbeOsGCnfmspiBDrXgIgA= Authentication-Results: vla1-fdfb804fb3f3.qloud-c.yandex.net; dkim=pass header.i=@yandex-team.ru From: Roman Kagan To: qemu-devel@nongnu.org Subject: [PATCH v2 2/3] block/nbd: only enter connection coroutine if it's present Date: Fri, 29 Jan 2021 10:38:58 +0300 Message-Id: <20210129073859.683063-3-rvkagan@yandex-team.ru> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210129073859.683063-1-rvkagan@yandex-team.ru> References: <20210129073859.683063-1-rvkagan@yandex-team.ru> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a02:6b8:0:1472:2741:0:8b6:217; envelope-from=rvkagan@yandex-team.ru; helo=forwardcorp1p.mail.yandex.net X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Vladimir Sementsov-Ogievskiy , qemu-block@nongnu.org, Max Reitz Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) Content-Type: text/plain; charset="utf-8" When an NBD block driver state is moved from one aio_context to another (e.g. when doing a drain in a migration thread), nbd_client_attach_aio_context_bh is executed that enters the connection coroutine. However, the assumption that ->connection_co is always present here appears incorrect: the connection may have encountered an error other than -EIO in the underlying transport, and thus may have decided to quit rather than keep trying to reconnect, and therefore it may have terminated the connection coroutine. As a result an attempt to reassign the client in this state (NBD_CLIENT_QUIT) to a different aio_context leads to a null pointer dereference: #0 qio_channel_detach_aio_context (ioc=3D0x0) at /build/qemu-gYtjVn/qemu-5.0.1/io/channel.c:452 #1 0x0000562a242824b3 in bdrv_detach_aio_context (bs=3D0x562a268d6a00) at /build/qemu-gYtjVn/qemu-5.0.1/block.c:6151 #2 bdrv_set_aio_context_ignore (bs=3Dbs@entry=3D0x562a268d6a00, new_context=3Dnew_context@entry=3D0x562a260c9580, ignore=3Dignore@entry=3D0x7feeadc9b780) at /build/qemu-gYtjVn/qemu-5.0.1/block.c:6230 #3 0x0000562a24282969 in bdrv_child_try_set_aio_context (bs=3Dbs@entry=3D0x562a268d6a00, ctx=3D0x562a260c9580, ignore_child=3D, errp=3D) at /build/qemu-gYtjVn/qemu-5.0.1/block.c:6332 #4 0x0000562a242bb7db in blk_do_set_aio_context (blk=3D0x562a2735d0d0, new_context=3D0x562a260c9580, update_root_node=3Dupdate_root_node@entry=3Dtrue, errp=3Derrp@entry= =3D0x0) at /build/qemu-gYtjVn/qemu-5.0.1/block/block-backend.c:1989 #5 0x0000562a242be0bd in blk_set_aio_context (blk=3D, new_context=3D, errp=3Derrp@entry=3D0x0) at /build/qemu-gYtjVn/qemu-5.0.1/block/block-backend.c:2010 #6 0x0000562a23fbd953 in virtio_blk_data_plane_stop (vdev=3D) at /build/qemu-gYtjVn/qemu-5.0.1/hw/block/dataplane/virtio-blk.c:292 #7 0x0000562a241fc7bf in virtio_bus_stop_ioeventfd (bus=3D0x562a260dbf08) at /build/qemu-gYtjVn/qemu-5.0.1/hw/virtio/virtio-bus.c:245 #8 0x0000562a23fefb2e in virtio_vmstate_change (opaque=3D0x562a260dbf90, running=3D0, state=3D) at /build/qemu-gYtjVn/qemu-5.0.1/hw/virtio/virtio.c:3220 #9 0x0000562a2402ebfd in vm_state_notify (running=3Drunning@entry=3D0, state=3Dstate@entry=3DRUN_STATE_FINISH_MIGRATE) at /build/qemu-gYtjVn/qemu-5.0.1/softmmu/vl.c:1275 #10 0x0000562a23f7bc02 in do_vm_stop (state=3DRUN_STATE_FINISH_MIGRATE, send_stop=3D) at /build/qemu-gYtjVn/qemu-5.0.1/cpus.c:1032 #11 0x0000562a24209765 in migration_completion (s=3D0x562a260e83a0) at /build/qemu-gYtjVn/qemu-5.0.1/migration/migration.c:2914 #12 migration_iteration_run (s=3D0x562a260e83a0) at /build/qemu-gYtjVn/qemu-5.0.1/migration/migration.c:3275 #13 migration_thread (opaque=3Dopaque@entry=3D0x562a260e83a0) at /build/qemu-gYtjVn/qemu-5.0.1/migration/migration.c:3439 #14 0x0000562a2435ca96 in qemu_thread_start (args=3D) at /build/qemu-gYtjVn/qemu-5.0.1/util/qemu-thread-posix.c:519 #15 0x00007feed31466ba in start_thread (arg=3D0x7feeadc9c700) at pthread_create.c:333 #16 0x00007feed2e7c41d in __GI___sysctl (name=3D0x0, nlen=3D608471908, oldval=3D0x562a2452b138, oldlenp=3D0x0, newval=3D0x562a2452c5e0 <__func__.28102>, newlen=3D0) at ../sysdeps/unix/sysv/linux/sysctl.c:30 #17 0x0000000000000000 in ?? () Fix it by checking that the connection coroutine is non-null before trying to enter it. If it is null, no entering is needed, as the connection is probably going down anyway. Signed-off-by: Roman Kagan Reviewed-by: Vladimir Sementsov-Ogievskiy --- block/nbd.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/block/nbd.c b/block/nbd.c index bcd6641e90..b3cbbeb4b0 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -250,13 +250,15 @@ static void nbd_client_attach_aio_context_bh(void *op= aque) BlockDriverState *bs =3D opaque; BDRVNBDState *s =3D (BDRVNBDState *)bs->opaque; =20 - /* - * The node is still drained, so we know the coroutine has yielded in - * nbd_read_eof(), the only place where bs->in_flight can reach 0, or = it is - * entered for the first time. Both places are safe for entering the - * coroutine. - */ - qemu_aio_coroutine_enter(bs->aio_context, s->connection_co); + if (s->connection_co) { + /* + * The node is still drained, so we know the coroutine has yielded= in + * nbd_read_eof(), the only place where bs->in_flight can reach 0,= or + * it is entered for the first time. Both places are safe for ente= ring + * the coroutine. + */ + qemu_aio_coroutine_enter(bs->aio_context, s->connection_co); + } bdrv_dec_in_flight(bs); } =20 --=20 2.29.2 From nobody Tue May 21 06:04:55 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=yandex-team.ru ARC-Seal: i=1; a=rsa-sha256; t=1611906224; cv=none; d=zohomail.com; s=zohoarc; b=hVUUfxnYiZQU4GWxnsQ9/zYp+ckoOvtk1FU5ILn/RpsWumvwhcOmju7bfCvCch4BiNV+oXHun9dFoH1vSPG91SkID/ghwycZ72+w2OOOc2L3Q5u4WA55oRVaicspoWtquD6X5RmrOWKK/VnFRGBAYfklO762ZRxcA+dYL4ZEfMM= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1611906224; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=gzy55dftt9Amp91HbZtNUUPI6Lgxpk4o7F6uPERRcNs=; b=fLYomHgh3Qcb4syMq8nnw5ZdbTIy2tLzYc2BzB+YYXtL5b6Y0rMg2YgsOmCWQVXmuhvoltL+k4DGIri8C7uH4rDOCJx2NGzER0G/0EvdjzvCo1aIvmCXRMsUobrcrkgP1vUn3ZUh0Ay/3uBG+xSv9Gm52YLho5vDHJ/ORcTEiPY= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail header.from= (p=none dis=none) header.from= Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1611906224738889.3459045652211; Thu, 28 Jan 2021 23:43:44 -0800 (PST) Received: from localhost ([::1]:49754 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1l5ORT-0002Uv-Ju for importer@patchew.org; Fri, 29 Jan 2021 02:43:43 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:56384) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1l5ON0-0007MC-Mc; Fri, 29 Jan 2021 02:39:06 -0500 Received: from forwardcorp1o.mail.yandex.net ([2a02:6b8:0:1a2d::193]:37032) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1l5OMy-0004M5-G0; Fri, 29 Jan 2021 02:39:06 -0500 Received: from vla1-fdfb804fb3f3.qloud-c.yandex.net (vla1-fdfb804fb3f3.qloud-c.yandex.net [IPv6:2a02:6b8:c0d:3199:0:640:fdfb:804f]) by forwardcorp1o.mail.yandex.net (Yandex) with ESMTP id B005C2E23DC; Fri, 29 Jan 2021 10:39:02 +0300 (MSK) Received: from vla1-81430ab5870b.qloud-c.yandex.net (vla1-81430ab5870b.qloud-c.yandex.net [2a02:6b8:c0d:35a1:0:640:8143:ab5]) by vla1-fdfb804fb3f3.qloud-c.yandex.net (mxbackcorp/Yandex) with ESMTP id o1Wtf9HWua-d2wq1vFn; Fri, 29 Jan 2021 10:39:02 +0300 Received: from dynamic-red3.dhcp.yndx.net (dynamic-red3.dhcp.yndx.net [2a02:6b8:0:419:7359:4dc3:71d:4c5a]) by vla1-81430ab5870b.qloud-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id jIpaBE76QG-d2mWB5CA; Fri, 29 Jan 2021 10:39:02 +0300 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) (Client certificate not present) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1611905942; bh=gzy55dftt9Amp91HbZtNUUPI6Lgxpk4o7F6uPERRcNs=; h=In-Reply-To:Message-Id:References:Date:Subject:To:From:Cc; b=07o23AUabzkuntZGTMk6WU1U8alK38yPsown6EUWfFk9l3Z943+VeTE87rkJ/Z0vH hUEyt7sqnXnnx434hOhHkCP6rrUw8kDbxE66oBIN4JNv5/3pyg1FZ3GWlDw78aIKBs mcsqmNbvCOeDbAW8ooDsdukjXvN9DTCP+nuJl5NM= Authentication-Results: vla1-fdfb804fb3f3.qloud-c.yandex.net; dkim=pass header.i=@yandex-team.ru From: Roman Kagan To: qemu-devel@nongnu.org Subject: [PATCH v2 3/3] nbd: make nbd_read* return -EIO on error Date: Fri, 29 Jan 2021 10:38:59 +0300 Message-Id: <20210129073859.683063-4-rvkagan@yandex-team.ru> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210129073859.683063-1-rvkagan@yandex-team.ru> References: <20210129073859.683063-1-rvkagan@yandex-team.ru> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a02:6b8:0:1a2d::193; envelope-from=rvkagan@yandex-team.ru; helo=forwardcorp1o.mail.yandex.net X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Vladimir Sementsov-Ogievskiy , qemu-block@nongnu.org, Max Reitz Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) Content-Type: text/plain; charset="utf-8" NBD reconnect logic considers the error code from the functions that read NBD messages to tell if reconnect should be attempted or not: it is attempted on -EIO, otherwise the client transitions to NBD_CLIENT_QUIT state (see nbd_channel_error). This error code is propagated from the primitives like nbd_read. The problem, however, is that nbd_read itself turns every error into -1 rather than -EIO. As a result, if the NBD server happens to die while sending the message, the client in QEMU receives less data than it expects, considers it as a fatal error, and wouldn't attempt reestablishing the connection. Fix it by turning every negative return from qio_channel_read_all into -EIO returned from nbd_read. Apparently that was the original behavior, but got broken later. Also adjust nbd_readXX to follow. Fixes: e6798f06a6 ("nbd: generalize usage of nbd_read") Signed-off-by: Roman Kagan Reviewed-by: Vladimir Sementsov-Ogievskiy --- include/block/nbd.h | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/include/block/nbd.h b/include/block/nbd.h index 4a52a43ef5..5f34d23bb0 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -364,7 +364,7 @@ static inline int nbd_read(QIOChannel *ioc, void *buffe= r, size_t size, if (desc) { error_prepend(errp, "Failed to read %s: ", desc); } - return -1; + return ret; } =20 return 0; @@ -375,8 +375,9 @@ static inline int nbd_read##bits(QIOChannel *ioc, = \ uint##bits##_t *val, \ const char *desc, Error **errp) \ { \ - if (nbd_read(ioc, val, sizeof(*val), desc, errp) < 0) { \ - return -1; \ + int ret =3D nbd_read(ioc, val, sizeof(*val), desc, errp); \ + if (ret < 0) { \ + return ret; \ } \ *val =3D be##bits##_to_cpu(*val); \ return 0; \ --=20 2.29.2