From: Liang Li
Date: Thu, 27 Sep 2018 22:31:35 +0800
To: "Michael S. Tsirkin", Jason Wang, qemu-devel@nongnu.org
Message-ID: <20180927140931.GA56333@liangdeMacBook-Pro.local>
Subject: [Qemu-devel] [PATCH] vhost-user: fix qemu crash caused by failed backend

During live migration, stopping a vhost-user device calls 'vhost_dev_stop',
which issues a batch of 'vhost_user_read' and 'vhost_user_write' calls.
If a previous 'vhost_user_read' or 'vhost_user_write' failed because the
vhost-user backend failed, the 'CHR_EVENT_CLOSED' event is triggered,
followed by the call chain

  chr_closed_bh() -> vhost_user_stop() -> vhost_net_cleanup() ->
  vhost_dev_cleanup()

'vhost_dev_cleanup' clears the vhost_dev struct, so a later
'vhost_user_read' or 'vhost_user_write' dereferences a null pointer and
crashes QEMU.

Fix this by adding a 'break_down' flag to struct vhost_dev. The flag is
set when a read or write fails, and from net_vhost_user_event() via the
new vhost_net_mark_break_down(). 'vhost_user_read' and 'vhost_user_write'
return an error early when the flag is set.

Signed-off-by: Liang Li
---
 hw/net/vhost_net.c        |  6 ++++++
 hw/virtio/vhost-user.c    | 15 +++++++++++++--
 include/hw/virtio/vhost.h |  1 +
 include/net/vhost_net.h   |  1 +
 net/vhost-user.c          |  3 +++
 5 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index e037db6..77994e9 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -113,6 +113,11 @@ uint64_t vhost_net_get_features(struct vhost_net *net, uint64_t features)
             features);
 }
 
+void vhost_net_mark_break_down(struct vhost_net *net)
+{
+    net->dev.break_down = true;
+}
+
 void vhost_net_ack_features(struct vhost_net *net, uint64_t features)
 {
     net->dev.acked_features = net->dev.backend_features;
@@ -156,6 +161,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
     net->dev.max_queues = 1;
     net->dev.nvqs = 2;
     net->dev.vqs = net->vqs;
+    net->dev.break_down = false;
 
     if (backend_kernel) {
         r = vhost_net_get_fd(options->net_backend);
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index b041343..1394719 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -213,14 +213,20 @@ static bool ioeventfd_enabled(void)
 static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
 {
     struct vhost_user *u = dev->opaque;
-    CharBackend *chr = u->user->chr;
+    CharBackend *chr;
     uint8_t *p = (uint8_t *) msg;
     int r, size = VHOST_USER_HDR_SIZE;
 
+    if (dev->break_down) {
+        goto fail;
+    }
+
+    chr = u->user->chr;
     r = qemu_chr_fe_read_all(chr, p, size);
     if (r != size) {
         error_report("Failed to read msg header. Read %d instead of %d."
                      " Original request %d.", r, size, msg->hdr.request);
+        dev->break_down = true;
         goto fail;
     }
 
@@ -299,9 +305,12 @@ static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
                             int *fds, int fd_num)
 {
     struct vhost_user *u = dev->opaque;
-    CharBackend *chr = u->user->chr;
+    CharBackend *chr;
     int ret, size = VHOST_USER_HDR_SIZE + msg->hdr.size;
 
+    if (dev->break_down) {
+        return -1;
+    }
     /*
      * For non-vring specific requests, like VHOST_USER_SET_MEM_TABLE,
      * we just need send it once in the first time. For later such
@@ -312,6 +321,7 @@ static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
         return 0;
     }
 
+    chr = u->user->chr;
     if (qemu_chr_fe_set_msgfds(chr, fds, fd_num) < 0) {
         error_report("Failed to set msg fds.");
         return -1;
@@ -319,6 +329,7 @@ static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
 
     ret = qemu_chr_fe_write_all(chr, (const uint8_t *) msg, size);
     if (ret != size) {
+        dev->break_down = true;
         error_report("Failed to write msg."
                      " Wrote %d instead of %d.", ret, size);
         return -1;
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index a7f449f..86d0dc5 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -74,6 +74,7 @@ struct vhost_dev {
     bool started;
     bool log_enabled;
     uint64_t log_size;
+    bool break_down;
     Error *migration_blocker;
     const VhostOps *vhost_ops;
     void *opaque;
diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
index 77e4739..06f2c08 100644
--- a/include/net/vhost_net.h
+++ b/include/net/vhost_net.h
@@ -27,6 +27,7 @@ void vhost_net_cleanup(VHostNetState *net);
 
 uint64_t vhost_net_get_features(VHostNetState *net, uint64_t features);
 void vhost_net_ack_features(VHostNetState *net, uint64_t features);
+void vhost_net_mark_break_down(VHostNetState *net);
 
 bool vhost_net_virtqueue_pending(VHostNetState *net, int n);
 void vhost_net_virtqueue_mask(VHostNetState *net, VirtIODevice *dev,
diff --git a/net/vhost-user.c b/net/vhost-user.c
index a39f9c9..b99e20b 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -270,6 +270,9 @@ static void net_vhost_user_event(void *opaque, int event)
         if (s->watch) {
             AioContext *ctx = qemu_get_current_aio_context();
 
+            if (s->vhost_net) {
+                vhost_net_mark_break_down(s->vhost_net);
+            }
             g_source_remove(s->watch);
             s->watch = 0;
             qemu_chr_fe_set_handlers(&s->chr, NULL, NULL, NULL, NULL,
-- 
1.8.3.1