From nobody Fri May 17 03:54:24 2024
From: Hanna Czenczek
To: qemu-devel@nongnu.org, virtio-fs@redhat.com
Cc: Hanna Czenczek, Stefan Hajnoczi, German Maglione, Anton Kuchin,
 Juan Quintela, "Michael S . Tsirkin", Stefano Garzarella
Subject: [PATCH 1/4] vhost: Re-enable vrings after setting features
Date: Tue, 11 Apr 2023 17:05:12 +0200
Message-Id: <20230411150515.14020-2-hreitz@redhat.com>
In-Reply-To: <20230411150515.14020-1-hreitz@redhat.com>
References: <20230411150515.14020-1-hreitz@redhat.com>

If the back-end supports the VHOST_USER_F_PROTOCOL_FEATURES feature,
setting the vhost features will set this feature, too. Doing so
disables all vrings, which may not be intended.

For example, enabling or disabling logging during migration requires
setting those features (to set or unset VHOST_F_LOG_ALL), which will
automatically disable all vrings.
In either case, the VM is running
(disabling logging is done after a failed or cancelled migration, and
only once the VM is running again, see comment in
memory_global_dirty_log_stop()), so the vrings should really be enabled.
As a result, the back-end seems to hang.

To fix this, we must remember whether the vrings are supposed to be
enabled, and, if so, re-enable them after a SET_FEATURES call that set
VHOST_USER_F_PROTOCOL_FEATURES.

It seems less than ideal that there is a short period in which the VM is
running but the vrings will be stopped (between SET_FEATURES and
SET_VRING_ENABLE). To fix this, we would need to change the protocol,
e.g. by introducing a new flag or vhost-user protocol feature to disable
disabling vrings whenever VHOST_USER_F_PROTOCOL_FEATURES is set, or add
new functions for setting/clearing singular feature bits (so that
F_LOG_ALL can be set/cleared without touching F_PROTOCOL_FEATURES).

Even with such a potential addition to the protocol, we still need this
fix here, because we cannot expect that back-ends will implement this
addition.

Signed-off-by: Hanna Czenczek
---
 include/hw/virtio/vhost.h | 10 ++++++++++
 hw/virtio/vhost.c         | 13 +++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index a52f273347..2fe02ed5d4 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -90,6 +90,16 @@ struct vhost_dev {
     int vq_index_end;
     /* if non-zero, minimum required value for max_queues */
     int num_queues;
+
+    /*
+     * Whether the virtqueues are supposed to be enabled (via
+     * SET_VRING_ENABLE). Setting the features (e.g. for
+     * enabling/disabling logging) will disable all virtqueues if
+     * VHOST_USER_F_PROTOCOL_FEATURES is set, so then we need to
+     * re-enable them if this field is set.
+     */
+    bool enable_vqs;
+
     /**
      * vhost feature handling requires matching the feature set
      * offered by a backend which may be a subset of the total
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index a266396576..cbff589efa 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -50,6 +50,8 @@ static unsigned int used_memslots;
 static QLIST_HEAD(, vhost_dev) vhost_devices =
     QLIST_HEAD_INITIALIZER(vhost_devices);
 
+static int vhost_dev_set_vring_enable(struct vhost_dev *hdev, int enable);
+
 bool vhost_has_free_slot(void)
 {
     unsigned int slots_limit = ~0U;
@@ -899,6 +901,15 @@ static int vhost_dev_set_features(struct vhost_dev *dev,
         }
     }
 
+    if (dev->enable_vqs) {
+        /*
+         * Setting VHOST_USER_F_PROTOCOL_FEATURES would have disabled all
+         * virtqueues, even if that was not intended; re-enable them if
+         * necessary.
+         */
+        vhost_dev_set_vring_enable(dev, true);
+    }
+
 out:
     return r;
 }
@@ -1896,6 +1907,8 @@ int vhost_dev_get_inflight(struct vhost_dev *dev, uint16_t queue_size,
 
 static int vhost_dev_set_vring_enable(struct vhost_dev *hdev, int enable)
 {
+    hdev->enable_vqs = enable;
+
     if (!hdev->vhost_ops->vhost_set_vring_enable) {
         return 0;
     }
-- 
2.39.1

From nobody Fri May 17 03:54:24 2024
From: Hanna Czenczek
To: qemu-devel@nongnu.org, virtio-fs@redhat.com
Cc: Hanna Czenczek, Stefan Hajnoczi, German Maglione, Anton Kuchin,
 Juan Quintela, "Michael S . Tsirkin", Stefano Garzarella
Subject: [PATCH 2/4] vhost-user: Interface for migration state transfer
Date: Tue, 11 Apr 2023 17:05:13 +0200
Message-Id: <20230411150515.14020-3-hreitz@redhat.com>
In-Reply-To: <20230411150515.14020-1-hreitz@redhat.com>
References: <20230411150515.14020-1-hreitz@redhat.com>

So-called "internal" virtio-fs migration refers to transporting the
back-end's (virtiofsd's) state through qemu's migration stream. To do
this, we need to be able to transfer virtiofsd's internal state to and
from virtiofsd.

Because virtiofsd's internal state will not be too large, we believe it
is best to transfer it as a single binary blob after the streaming
phase. Because this method should be useful to other vhost-user
implementations, too, it is introduced as a general-purpose addition to
the protocol, not limited to vhost-user-fs.

These are the additions to the protocol:
- New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE:
  This feature signals support for transferring state, and is added so
  that migration can fail early when the back-end has no support.

- SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe
  over which to transfer the state. The front-end sends an FD to the
  back-end into/from which it can write/read its state, and the back-end
  can decide to either use it, or reply with a different FD for the
  front-end to override the front-end's choice.
  The front-end creates a simple pipe to transfer the state, but maybe
  the back-end already has an FD into/from which it has to write/read
  its state, in which case it will want to override the simple pipe.
  Conversely, maybe in the future we find a way to have the front-end
  get an immediate FD for the migration stream (in some cases), in which
  case we will want to send this to the back-end instead of creating a
  pipe.
  Hence the negotiation: If one side has a better idea than a plain
  pipe, we will want to use that.

- CHECK_DEVICE_STATE: After the state has been transferred through the
  pipe (the end indicated by EOF), the front-end invokes this function
  to verify success. There is no in-band way (through the pipe) to
  indicate failure, so we need to check explicitly.

Once the transfer pipe has been established via SET_DEVICE_STATE_FD
(which includes establishing the direction of transfer and migration
phase), the sending side writes its data into the pipe, and the reading
side reads it until it sees an EOF. Then, the front-end will check for
success via CHECK_DEVICE_STATE, which on the destination side includes
checking for integrity (i.e. errors during deserialization).

Suggested-by: Stefan Hajnoczi
Signed-off-by: Hanna Czenczek
---
 include/hw/virtio/vhost-backend.h |  24 +++++
 include/hw/virtio/vhost.h         |  79 ++++++++++++++++
 hw/virtio/vhost-user.c            | 147 ++++++++++++++++++++++++++++++
 hw/virtio/vhost.c                 |  37 ++++++++
 4 files changed, 287 insertions(+)

diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index ec3fbae58d..5935b32fe3 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -26,6 +26,18 @@ typedef enum VhostSetConfigType {
     VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
 } VhostSetConfigType;
 
+typedef enum VhostDeviceStateDirection {
+    /* Transfer state from back-end (device) to front-end */
+    VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0,
+    /* Transfer state from front-end to back-end (device) */
+    VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1,
+} VhostDeviceStateDirection;
+
+typedef enum VhostDeviceStatePhase {
+    /* The device (and all its vrings) is stopped */
+    VHOST_TRANSFER_STATE_PHASE_STOPPED = 0,
+} VhostDeviceStatePhase;
+
 struct vhost_inflight;
 struct vhost_dev;
 struct vhost_log;
@@ -133,6 +145,15 @@ typedef int (*vhost_set_config_call_op)(struct vhost_dev *dev,
 
 typedef void (*vhost_reset_status_op)(struct vhost_dev *dev);
 
+typedef bool (*vhost_supports_migratory_state_op)(struct vhost_dev *dev);
+typedef int (*vhost_set_device_state_fd_op)(struct vhost_dev *dev,
+                                            VhostDeviceStateDirection direction,
+                                            VhostDeviceStatePhase phase,
+                                            int fd,
+                                            int *reply_fd,
+                                            Error **errp);
+typedef int (*vhost_check_device_state_op)(struct vhost_dev *dev, Error **errp);
+
 typedef struct VhostOps {
     VhostBackendType backend_type;
     vhost_backend_init vhost_backend_init;
@@ -181,6 +202,9 @@ typedef struct VhostOps {
     vhost_force_iommu_op vhost_force_iommu;
     vhost_set_config_call_op vhost_set_config_call;
     vhost_reset_status_op vhost_reset_status;
+    vhost_supports_migratory_state_op vhost_supports_migratory_state;
+    vhost_set_device_state_fd_op vhost_set_device_state_fd;
+    vhost_check_device_state_op vhost_check_device_state;
 } VhostOps;
 
 int vhost_backend_update_device_iotlb(struct vhost_dev *dev,
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 2fe02ed5d4..29449e0fe2 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -346,4 +346,83 @@ int vhost_dev_set_inflight(struct vhost_dev *dev,
                            struct vhost_inflight *inflight);
 int vhost_dev_get_inflight(struct vhost_dev *dev, uint16_t queue_size,
                            struct vhost_inflight *inflight);
+
+/**
+ * vhost_supports_migratory_state(): Checks whether the back-end
+ * supports transferring internal state for the purpose of migration.
+ * Support for this feature is required for vhost_set_device_state_fd()
+ * and vhost_check_device_state().
+ *
+ * @dev: The vhost device
+ *
+ * Returns true if the device supports these commands, and false if it
+ * does not.
+ */
+bool vhost_supports_migratory_state(struct vhost_dev *dev);
+
+/**
+ * vhost_set_device_state_fd(): Begin transfer of internal state from/to
+ * the back-end for the purpose of migration. Data is to be transferred
+ * over a pipe according to @direction and @phase. The sending end must
+ * only write to the pipe, and the receiving end must only read from it.
+ * Once the sending end is done, it closes its FD. The receiving end
+ * must take this as the end-of-transfer signal and close its FD, too.
+ *
+ * @fd is the back-end's end of the pipe: The write FD for SAVE, and the
+ * read FD for LOAD. This function transfers ownership of @fd to the
+ * back-end, i.e. closes it in the front-end.
+ *
+ * The back-end may optionally reply with an FD of its own, if this
+ * improves efficiency on its end. In this case, the returned FD is
+ * stored in *reply_fd. The back-end will discard the FD sent to it,
+ * and the front-end must use *reply_fd for transferring state to/from
+ * the back-end.
+ *
+ * @dev: The vhost device
+ * @direction: The direction in which the state is to be transferred.
+ *             For outgoing migrations, this is SAVE, and data is read
+ *             from the back-end and stored by the front-end in the
+ *             migration stream.
+ *             For incoming migrations, this is LOAD, and data is read
+ *             by the front-end from the migration stream and sent to
+ *             the back-end to restore the saved state.
+ * @phase: Which migration phase we are in. Currently, there is only
+ *         STOPPED (device and all vrings are stopped), in the future,
+ *         more phases such as PRE_COPY or POST_COPY may be added.
+ * @fd: Back-end's end of the pipe through which to transfer state; note
+ *      that ownership is transferred to the back-end, so this function
+ *      closes @fd in the front-end.
+ * @reply_fd: If the back-end wishes to use a different pipe for state
+ *            transfer, this will contain an FD for the front-end to
+ *            use. Otherwise, -1 is stored here.
+ * @errp: Potential error description
+ *
+ * Returns 0 on success, and -errno on failure.
+ */
+int vhost_set_device_state_fd(struct vhost_dev *dev,
+                              VhostDeviceStateDirection direction,
+                              VhostDeviceStatePhase phase,
+                              int fd,
+                              int *reply_fd,
+                              Error **errp);
+
+/**
+ * vhost_check_device_state(): After transferring state from/to the
+ * back-end via vhost_set_device_state_fd(), i.e. once the sending end
+ * has closed the pipe, ask the back-end to report any potential
+ * errors that have occurred on its side. This allows detecting errors
+ * like:
+ * - During outgoing migration, when the source side had already started
+ *   to produce its state, something went wrong and it failed to finish
+ * - During incoming migration, when the received state is somehow
+ *   invalid and cannot be processed by the back-end
+ *
+ * @dev: The vhost device
+ * @errp: Potential error description
+ *
+ * Returns 0 when the back-end reports successful state transfer and
+ * processing, and -errno when an error occurred somewhere.
+ */
+int vhost_check_device_state(struct vhost_dev *dev, Error **errp);
+
 #endif
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index e5285df4ba..93d8f2494a 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -83,6 +83,7 @@ enum VhostUserProtocolFeature {
     /* Feature 14 reserved for VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS. */
     VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS = 15,
     VHOST_USER_PROTOCOL_F_STATUS = 16,
+    VHOST_USER_PROTOCOL_F_MIGRATORY_STATE = 17,
     VHOST_USER_PROTOCOL_F_MAX
 };
 
@@ -130,6 +131,8 @@ typedef enum VhostUserRequest {
     VHOST_USER_REM_MEM_REG = 38,
     VHOST_USER_SET_STATUS = 39,
     VHOST_USER_GET_STATUS = 40,
+    VHOST_USER_SET_DEVICE_STATE_FD = 41,
+    VHOST_USER_CHECK_DEVICE_STATE = 42,
     VHOST_USER_MAX
 } VhostUserRequest;
 
@@ -210,6 +213,12 @@ typedef struct {
     uint32_t size; /* the following payload size */
 } QEMU_PACKED VhostUserHeader;
 
+/* Request payload of VHOST_USER_SET_DEVICE_STATE_FD */
+typedef struct VhostUserTransferDeviceState {
+    uint32_t direction;
+    uint32_t phase;
+} VhostUserTransferDeviceState;
+
 typedef union {
 #define VHOST_USER_VRING_IDX_MASK   (0xff)
 #define VHOST_USER_VRING_NOFD_MASK  (0x1 << 8)
@@ -224,6 +233,7 @@ typedef union {
         VhostUserCryptoSession session;
         VhostUserVringArea area;
         VhostUserInflight inflight;
+        VhostUserTransferDeviceState transfer_state;
 } VhostUserPayload;
 
 typedef struct VhostUserMsg {
@@ -2681,6 +2691,140 @@ static int vhost_user_dev_start(struct vhost_dev *dev, bool started)
     }
 }
 
+static bool vhost_user_supports_migratory_state(struct vhost_dev *dev)
+{
+    return virtio_has_feature(dev->protocol_features,
+                              VHOST_USER_PROTOCOL_F_MIGRATORY_STATE);
+}
+
+static int vhost_user_set_device_state_fd(struct vhost_dev *dev,
+                                          VhostDeviceStateDirection direction,
+                                          VhostDeviceStatePhase phase,
+                                          int fd,
+                                          int *reply_fd,
+                                          Error **errp)
+{
+    int ret;
+    struct vhost_user *vu = dev->opaque;
+    VhostUserMsg msg = {
+        .hdr = {
+            .request = VHOST_USER_SET_DEVICE_STATE_FD,
+            .flags = VHOST_USER_VERSION,
+            .size = sizeof(msg.payload.transfer_state),
+        },
+        .payload.transfer_state = {
+            .direction = direction,
+            .phase = phase,
+        },
+    };
+
+    *reply_fd = -1;
+
+    if (!vhost_user_supports_migratory_state(dev)) {
+        close(fd);
+        error_setg(errp, "Back-end does not support migration state transfer");
+        return -ENOTSUP;
+    }
+
+    ret = vhost_user_write(dev, &msg, &fd, 1);
+    close(fd);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret,
+                         "Failed to send SET_DEVICE_STATE_FD message");
+        return ret;
+    }
+
+    ret = vhost_user_read(dev, &msg);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret,
+                         "Failed to receive SET_DEVICE_STATE_FD reply");
+        return ret;
+    }
+
+    if (msg.hdr.request != VHOST_USER_SET_DEVICE_STATE_FD) {
+        error_setg(errp,
+                   "Received unexpected message type, expected %d, received %d",
+                   VHOST_USER_SET_DEVICE_STATE_FD, msg.hdr.request);
+        return -EPROTO;
+    }
+
+    if (msg.hdr.size != sizeof(msg.payload.u64)) {
+        error_setg(errp,
+                   "Received bad message size, expected %zu, received %" PRIu32,
+                   sizeof(msg.payload.u64), msg.hdr.size);
+        return -EPROTO;
+    }
+
+    if ((msg.payload.u64 & 0xff) != 0) {
+        error_setg(errp, "Back-end did not accept migration state transfer");
+        return -EIO;
+    }
+
+    if (!(msg.payload.u64 & VHOST_USER_VRING_NOFD_MASK)) {
+        *reply_fd = qemu_chr_fe_get_msgfd(vu->user->chr);
+        if (*reply_fd < 0) {
+            error_setg(errp,
+                       "Failed to get back-end-provided transfer pipe FD");
+            *reply_fd = -1;
+            return -EIO;
+        }
+    }
+
+    return 0;
+}
+
+static int vhost_user_check_device_state(struct vhost_dev *dev, Error **errp)
+{
+    int ret;
+    VhostUserMsg msg = {
+        .hdr = {
+            .request = VHOST_USER_CHECK_DEVICE_STATE,
+            .flags = VHOST_USER_VERSION,
+            .size = 0,
+        },
+    };
+
+    if (!vhost_user_supports_migratory_state(dev)) {
+        error_setg(errp, "Back-end does not support migration state transfer");
+        return -ENOTSUP;
+    }
+
+    ret = vhost_user_write(dev, &msg, NULL, 0);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret,
+                         "Failed to send CHECK_DEVICE_STATE message");
+        return ret;
+    }
+
+    ret = vhost_user_read(dev, &msg);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret,
+                         "Failed to receive CHECK_DEVICE_STATE reply");
+        return ret;
+    }
+
+    if (msg.hdr.request != VHOST_USER_CHECK_DEVICE_STATE) {
+        error_setg(errp,
+                   "Received unexpected message type, expected %d, received %d",
+                   VHOST_USER_CHECK_DEVICE_STATE, msg.hdr.request);
+        return -EPROTO;
+    }
+
+    if (msg.hdr.size != sizeof(msg.payload.u64)) {
+        error_setg(errp,
+                   "Received bad message size, expected %zu, received %" PRIu32,
+                   sizeof(msg.payload.u64), msg.hdr.size);
+        return -EPROTO;
+    }
+
+    if (msg.payload.u64 != 0) {
+        error_setg(errp, "Back-end failed to process its internal state");
+        return -EIO;
+    }
+
+    return 0;
+}
+
 const VhostOps user_ops = {
         .backend_type = VHOST_BACKEND_TYPE_USER,
         .vhost_backend_init = vhost_user_backend_init,
@@ -2716,4 +2860,7 @@ const VhostOps user_ops = {
         .vhost_get_inflight_fd = vhost_user_get_inflight_fd,
         .vhost_set_inflight_fd = vhost_user_set_inflight_fd,
         .vhost_dev_start = vhost_user_dev_start,
+        .vhost_supports_migratory_state = vhost_user_supports_migratory_state,
+        .vhost_set_device_state_fd = vhost_user_set_device_state_fd,
+        .vhost_check_device_state = vhost_user_check_device_state,
 };
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index cbff589efa..90099d8f6a 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -2088,3 +2088,40 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
 
     return -ENOSYS;
 }
+
+bool vhost_supports_migratory_state(struct vhost_dev *dev)
+{
+    if (dev->vhost_ops->vhost_supports_migratory_state) {
+        return dev->vhost_ops->vhost_supports_migratory_state(dev);
+    }
+
+    return false;
+}
+
+int vhost_set_device_state_fd(struct vhost_dev *dev,
+                              VhostDeviceStateDirection direction,
+                              VhostDeviceStatePhase phase,
+                              int fd,
+                              int *reply_fd,
+                              Error **errp)
+{
+    if (dev->vhost_ops->vhost_set_device_state_fd) {
+        return dev->vhost_ops->vhost_set_device_state_fd(dev, direction, phase,
+                                                         fd, reply_fd, errp);
+    }
+
+    error_setg(errp,
+               "vhost transport does not support migration state transfer");
+    return -ENOSYS;
+}
+
+int vhost_check_device_state(struct vhost_dev *dev, Error **errp)
+{
+    if (dev->vhost_ops->vhost_check_device_state) {
+        return dev->vhost_ops->vhost_check_device_state(dev, errp);
+    }
+
+    error_setg(errp,
+               "vhost transport does not support migration state transfer");
+    return -ENOSYS;
+}
-- 
2.39.1

From nobody Fri May 17 03:54:24 2024
From: Hanna Czenczek
To: qemu-devel@nongnu.org, virtio-fs@redhat.com
Cc: Hanna Czenczek, Stefan Hajnoczi, German Maglione, Anton Kuchin,
 Juan Quintela, "Michael S . Tsirkin", Stefano Garzarella
Subject: [PATCH 3/4] vhost: Add high-level state save/load functions
Date: Tue, 11 Apr 2023 17:05:14 +0200
Message-Id: <20230411150515.14020-4-hreitz@redhat.com>
In-Reply-To: <20230411150515.14020-1-hreitz@redhat.com>
References: <20230411150515.14020-1-hreitz@redhat.com>

vhost_save_backend_state() and vhost_load_backend_state() can be used by
vhost front-ends to easily save and load the back-end's state to/from
the migration stream.
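
For illustration only, here is a minimal sketch of the stream framing that
the two helpers rely on: the state is split into chunks, each chunk is
prefixed by its length as a big-endian 32-bit integer, and a 0-length chunk
marks the end. The sketch assumes plain memory buffers instead of QEMUFile
and a pipe, and the names put_chunk(), frame_state() and unframe_state() are
hypothetical; they are not part of this series or of QEMU.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write one chunk (be32 length + payload) into out.
 * Returns the number of bytes written, or 0 if out is too small. */
static size_t put_chunk(uint8_t *out, size_t cap, const void *data,
                        uint32_t len)
{
    if (cap < 4 || cap - 4 < len) {
        return 0;
    }
    out[0] = (uint8_t)(len >> 24);
    out[1] = (uint8_t)(len >> 16);
    out[2] = (uint8_t)(len >> 8);
    out[3] = (uint8_t)len;
    if (len) {
        memcpy(out + 4, data, len);
    }
    return 4 + (size_t)len;
}

/* Hypothetical save side: frame state in chunks of at most chunk_size
 * bytes and terminate with a 0-length chunk.  Returns the framed size,
 * or 0 on error (buffer too small or chunk_size == 0). */
static size_t frame_state(uint8_t *out, size_t cap, const uint8_t *state,
                          size_t state_len, uint32_t chunk_size)
{
    size_t pos = 0, off = 0, w;

    if (chunk_size == 0) {
        return 0;
    }
    while (off < state_len) {
        size_t left = state_len - off;
        uint32_t n = left < chunk_size ? (uint32_t)left : chunk_size;

        w = put_chunk(out + pos, cap - pos, state + off, n);
        if (!w) {
            return 0;
        }
        pos += w;
        off += n;
    }
    /* The 0-length chunk is the in-stream end-of-state marker */
    w = put_chunk(out + pos, cap - pos, NULL, 0);
    return w ? pos + w : 0;
}

/* Hypothetical load side: parse chunks until the 0-length terminator.
 * Returns the total payload size, or -1 if the stream is truncated or
 * the payload does not fit into state. */
static long unframe_state(const uint8_t *in, size_t in_len, uint8_t *state,
                          size_t cap)
{
    size_t pos = 0, total = 0;

    for (;;) {
        uint32_t n;

        if (in_len - pos < 4) {
            return -1; /* missing length prefix or terminator */
        }
        n = (uint32_t)in[pos] << 24 | (uint32_t)in[pos + 1] << 16 |
            (uint32_t)in[pos + 2] << 8 | (uint32_t)in[pos + 3];
        pos += 4;
        if (n == 0) {
            return (long)total;
        }
        if (in_len - pos < n || cap - total < n) {
            return -1;
        }
        memcpy(state + total, in + pos, n);
        pos += n;
        total += n;
    }
}
```

A reader of such a stream does not need to know the total state size in
advance; it only needs a buffer for one chunk at a time, which is presumably
why a fixed maximum chunk size is used on the save side.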
Because we do not know the full state size ahead of time, vhost_save_backend_state() simply reads the data in 1 MB chunks, and writes each chunk consecutively into the migration stream, prefixed by its length. EOF is indicated by a 0-length chunk. Signed-off-by: Hanna Czenczek --- include/hw/virtio/vhost.h | 35 +++++++ hw/virtio/vhost.c | 196 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 231 insertions(+) diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h index 29449e0fe2..d1f1e9e1f3 100644 --- a/include/hw/virtio/vhost.h +++ b/include/hw/virtio/vhost.h @@ -425,4 +425,39 @@ int vhost_set_device_state_fd(struct vhost_dev *dev, */ int vhost_check_device_state(struct vhost_dev *dev, Error **errp); =20 +/** + * vhost_save_backend_state(): High-level function to receive a vhost + * back-end's state, and save it in `f`. Uses + * `vhost_set_device_state_fd()` to get the data from the back-end, and + * stores it in consecutive chunks that are each prefixed by their + * respective length (be32). The end is marked by a 0-length chunk. + * + * Must only be called while the device and all its vrings are stopped + * (`VHOST_TRANSFER_STATE_PHASE_STOPPED`). + * + * @dev: The vhost device from which to save the state + * @f: Migration stream in which to save the state + * @errp: Potential error message + * + * Returns 0 on success, and -errno otherwise. + */ +int vhost_save_backend_state(struct vhost_dev *dev, QEMUFile *f, Error **e= rrp); + +/** + * vhost_load_backend_state(): High-level function to load a vhost + * back-end's state from `f`, and send it over to the back-end. Reads + * the data from `f` in the format used by `vhost_save_backend_state()`, and + * uses `vhost_set_device_state_fd()` to transfer it to the back-end. + * + * Must only be called while the device and all its vrings are stopped + * (`VHOST_TRANSFER_STATE_PHASE_STOPPED`). 
+ * + * @dev: The vhost device to which to send the state + * @f: Migration stream from which to load the state + * @errp: Potential error message + * + * Returns 0 on success, and -errno otherwise. + */ +int vhost_load_backend_state(struct vhost_dev *dev, QEMUFile *f, Error **e= rrp); + #endif diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index 90099d8f6a..d08849c691 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -2125,3 +2125,199 @@ int vhost_check_device_state(struct vhost_dev *dev,= Error **errp) "vhost transport does not support migration state transfer"= ); return -ENOSYS; } + +int vhost_save_backend_state(struct vhost_dev *dev, QEMUFile *f, Error **e= rrp) +{ + /* Maximum chunk size in which to transfer the state */ + const size_t chunk_size =3D 1 * 1024 * 1024; + void *transfer_buf =3D NULL; + g_autoptr(GError) g_err =3D NULL; + int pipe_fds[2], read_fd =3D -1, write_fd =3D -1, reply_fd =3D -1; + int ret; + + /* [0] for reading (our end), [1] for writing (back-end's end) */ + if (!g_unix_open_pipe(pipe_fds, FD_CLOEXEC, &g_err)) { + error_setg(errp, "Failed to set up state transfer pipe: %s", + g_err->message); + ret =3D -EINVAL; + goto fail; + } + + read_fd =3D pipe_fds[0]; + write_fd =3D pipe_fds[1]; + + /* VHOST_TRANSFER_STATE_PHASE_STOPPED means the device must be stopped= */ + assert(!dev->started && !dev->enable_vqs); + + /* Transfer ownership of write_fd to the back-end */ + ret =3D vhost_set_device_state_fd(dev, + VHOST_TRANSFER_STATE_DIRECTION_SAVE, + VHOST_TRANSFER_STATE_PHASE_STOPPED, + write_fd, + &reply_fd, + errp); + if (ret < 0) { + error_prepend(errp, "Failed to initiate state transfer: "); + goto fail; + } + + /* If the back-end wishes to use a different pipe, switch over */ + if (reply_fd >=3D 0) { + close(read_fd); + read_fd =3D reply_fd; + } + + transfer_buf =3D g_malloc(chunk_size); + + while (true) { + ssize_t read_ret; + + read_ret =3D read(read_fd, transfer_buf, chunk_size); + if (read_ret < 0) { + ret =3D -errno; + 
error_setg_errno(errp, -ret, "Failed to receive state"); + goto fail; + } + + assert(read_ret <=3D chunk_size); + qemu_put_be32(f, read_ret); + + if (read_ret =3D=3D 0) { + /* EOF */ + break; + } + + qemu_put_buffer(f, transfer_buf, read_ret); + } + + /* + * Back-end will not really care, but be clean and close our end of th= e pipe + * before inquiring the back-end about whether transfer was successful + */ + close(read_fd); + read_fd =3D -1; + + /* Also, verify that the device is still stopped */ + assert(!dev->started && !dev->enable_vqs); + + ret =3D vhost_check_device_state(dev, errp); + if (ret < 0) { + goto fail; + } + + ret =3D 0; +fail: + g_free(transfer_buf); + if (read_fd >=3D 0) { + close(read_fd); + } + + return ret; +} + +int vhost_load_backend_state(struct vhost_dev *dev, QEMUFile *f, Error **e= rrp) +{ + size_t transfer_buf_size =3D 0; + void *transfer_buf =3D NULL; + g_autoptr(GError) g_err =3D NULL; + int pipe_fds[2], read_fd =3D -1, write_fd =3D -1, reply_fd =3D -1; + int ret; + + /* [0] for reading (back-end's end), [1] for writing (our end) */ + if (!g_unix_open_pipe(pipe_fds, FD_CLOEXEC, &g_err)) { + error_setg(errp, "Failed to set up state transfer pipe: %s", + g_err->message); + ret =3D -EINVAL; + goto fail; + } + + read_fd =3D pipe_fds[0]; + write_fd =3D pipe_fds[1]; + + /* VHOST_TRANSFER_STATE_PHASE_STOPPED means the device must be stopped= */ + assert(!dev->started && !dev->enable_vqs); + + /* Transfer ownership of read_fd to the back-end */ + ret =3D vhost_set_device_state_fd(dev, + VHOST_TRANSFER_STATE_DIRECTION_LOAD, + VHOST_TRANSFER_STATE_PHASE_STOPPED, + read_fd, + &reply_fd, + errp); + if (ret < 0) { + error_prepend(errp, "Failed to initiate state transfer: "); + goto fail; + } + + /* If the back-end wishes to use a different pipe, switch over */ + if (reply_fd >=3D 0) { + close(write_fd); + write_fd =3D reply_fd; + } + + while (true) { + size_t this_chunk_size =3D qemu_get_be32(f); + ssize_t write_ret; + const uint8_t 
*transfer_pointer; + + if (this_chunk_size =3D=3D 0) { + /* End of state */ + break; + } + + if (transfer_buf_size < this_chunk_size) { + transfer_buf =3D g_realloc(transfer_buf, this_chunk_size); + transfer_buf_size =3D this_chunk_size; + } + + if (qemu_get_buffer(f, transfer_buf, this_chunk_size) < + this_chunk_size) + { + error_setg(errp, "Failed to read state"); + ret =3D -EINVAL; + goto fail; + } + + transfer_pointer =3D transfer_buf; + while (this_chunk_size > 0) { + write_ret =3D write(write_fd, transfer_pointer, this_chunk_siz= e); + if (write_ret < 0) { + ret =3D -errno; + error_setg_errno(errp, -ret, "Failed to send state"); + goto fail; + } else if (write_ret =3D=3D 0) { + error_setg(errp, "Failed to send state: Connection is clos= ed"); + ret =3D -ECONNRESET; + goto fail; + } + + assert(write_ret <=3D this_chunk_size); + this_chunk_size -=3D write_ret; + transfer_pointer +=3D write_ret; + } + } + + /* + * Close our end, thus ending transfer, before inquiring the back-end = about + * whether transfer was successful + */ + close(write_fd); + write_fd =3D -1; + + /* Also, verify that the device is still stopped */ + assert(!dev->started && !dev->enable_vqs); + + ret =3D vhost_check_device_state(dev, errp); + if (ret < 0) { + goto fail; + } + + ret =3D 0; +fail: + g_free(transfer_buf); + if (write_fd >=3D 0) { + close(write_fd); + } + + return ret; +} --=20 2.39.1 From nobody Fri May 17 03:54:24 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1681225547; cv=none; d=zohomail.com; s=zohoarc; b=d2PfxotCOiexSgFv3sEX7WT3LYzpwU7sd7gD1GrZKdIHxQGd+7iTj5ID9VjaTA/feacedUcPvUGyhPcZFgO+Aeo657Wkq2ODRvVCTDzgJ5JbOUmhevgCeFtpITay/rWEqLJpDxHaHHXz1eKmP5iywiRXanw4/J3GIyA0w4OrL+E= 
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1681225547; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=xgAt48T0Lqdx3L8EU5najhywTFEgFyFlL7pLqzNAov0=; b=jEQEdhtFLnjoP1pv4VSfzO7ALUB7hFQ2NawPF0kB6QNPz62zCOAE0mIxkif5jhnytzYjprTt1LTi02NubpDvS7271Tlvrafohcg+DyjAGdJV0CUnEa9oqumEoHm5AsrPrCM15fXKaHnIJB7GjJ28/uwdOo/lnH863umSUvw2k6s= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1681225547201536.9942835430558; Tue, 11 Apr 2023 08:05:47 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pmFYp-0000yv-N3; Tue, 11 Apr 2023 11:05:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pmFYo-0000yH-EB for qemu-devel@nongnu.org; Tue, 11 Apr 2023 11:05:30 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pmFYm-00031y-78 for qemu-devel@nongnu.org; Tue, 11 Apr 2023 11:05:30 -0400 Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-67-Dp4DfCbEO3OearBACYbb9Q-1; Tue, 11 Apr 2023 11:05:23 -0400 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher 
AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 8ECF81010424; Tue, 11 Apr 2023 15:05:23 +0000 (UTC) Received: from localhost (unknown [10.39.194.86]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 25FDB40C83A9; Tue, 11 Apr 2023 15:05:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1681225527; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xgAt48T0Lqdx3L8EU5najhywTFEgFyFlL7pLqzNAov0=; b=OOsFXXiDfbll+P4Mp6b28izSsb2Rdy4TFfrgvyLdpZLJehuvGHCkkh2ZjXP6AR3uhOmbok ELIguuD6yhJrgb7xcbAKyCdxGCy9jVMiEiN5QRGlXkbifZ8pFioIdXPcNG0RNRFtV5I0LL 9tvKHV2YyzrlxMLW2ctZLpkHTu0/V0Q= X-MC-Unique: Dp4DfCbEO3OearBACYbb9Q-1 From: Hanna Czenczek To: qemu-devel@nongnu.org, virtio-fs@redhat.com Cc: Hanna Czenczek , Stefan Hajnoczi , German Maglione , Anton Kuchin , Juan Quintela , "Michael S . 
Tsirkin" , Stefano Garzarella Subject: [PATCH 4/4] vhost-user-fs: Implement internal migration Date: Tue, 11 Apr 2023 17:05:15 +0200 Message-Id: <20230411150515.14020-5-hreitz@redhat.com> In-Reply-To: <20230411150515.14020-1-hreitz@redhat.com> References: <20230411150515.14020-1-hreitz@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=hreitz@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1681225666374100001 Content-Type: text/plain; charset="utf-8" A virtio-fs device's VM state consists of: - the virtio device (vring) state (VMSTATE_VIRTIO_DEVICE) - the back-end's (virtiofsd's) internal state We get/set the latter via the new vhost operations to transfer migratory state. It is its own dedicated subsection, so that for external migration, it can be disabled. 
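To illustrate the point about the dedicated subsection, here is a standalone toy model of how a "needed" predicate can gate whether the back-end state is written at all. This is only a sketch under assumptions: ToySection, ToyDevice, the external_migration flag and toy_save() are hypothetical names made up for this illustration and are not QEMU's vmstate API; only the two section names are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a VMStateDescription; NOT QEMU's real structure. */
typedef struct ToySection {
    const char *name;
    /* Returns whether the section should be written; NULL means "always" */
    bool (*needed)(void *opaque);
    /* NULL-terminated list of subsections, or NULL if there are none */
    const struct ToySection *const *subsections;
} ToySection;

/* Hypothetical device with a flag selecting external migration */
typedef struct {
    bool external_migration;
} ToyDevice;

/* The back-end subsection is only needed for internal migration */
static bool backend_needed(void *opaque)
{
    const ToyDevice *dev = opaque;
    return !dev->external_migration;
}

static const ToySection toy_backend = {
    .name = "vhost-user-fs-backend",
    .needed = backend_needed,
};

static const ToySection *const toy_subsections[] = { &toy_backend, NULL };

static const ToySection toy_device = {
    .name = "vhost-user-fs",
    .subsections = toy_subsections,
};

/*
 * Append the names of all sections that would be written to `out`
 * (comma-separated).  `out` must hold a NUL-terminated string on entry.
 */
static void toy_save(const ToySection *sec, void *opaque,
                     char *out, size_t out_size)
{
    if (sec->needed && !sec->needed(opaque)) {
        return; /* section disabled, nothing is written for it */
    }

    if (out[0] != '\0') {
        strncat(out, ",", out_size - strlen(out) - 1);
    }
    strncat(out, sec->name, out_size - strlen(out) - 1);

    if (sec->subsections) {
        for (size_t i = 0; sec->subsections[i]; i++) {
            toy_save(sec->subsections[i], opaque, out, out_size);
        }
    }
}
```

Calling toy_save(&toy_device, &dev, buf, sizeof(buf)) on an empty buffer lists both sections when dev.external_migration is false, and only the outer one when it is true. In the patch itself, the predicate currently always returns true (see the TODO in vuf_is_internal_migration()).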
Signed-off-by: Hanna Czenczek --- hw/virtio/vhost-user-fs.c | 101 +++++++++++++++++++++++++++++++++++++- 1 file changed, 100 insertions(+), 1 deletion(-) diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c index 83fc20e49e..f9c19f7a3d 100644 --- a/hw/virtio/vhost-user-fs.c +++ b/hw/virtio/vhost-user-fs.c @@ -298,9 +298,108 @@ static struct vhost_dev *vuf_get_vhost(VirtIODevice *= vdev) return &fs->vhost_dev; } =20 +/** + * Fetch the internal state from virtiofsd and save it to `f`. + */ +static int vuf_save_state(QEMUFile *f, void *pv, size_t size, + const VMStateField *field, JSONWriter *vmdesc) +{ + VirtIODevice *vdev =3D pv; + VHostUserFS *fs =3D VHOST_USER_FS(vdev); + Error *local_error =3D NULL; + int ret; + + ret =3D vhost_save_backend_state(&fs->vhost_dev, f, &local_error); + if (ret < 0) { + error_reportf_err(local_error, + "Error saving back-end state of %s device %s " + "(tag: \"%s\"): ", + vdev->name, vdev->parent_obj.canonical_path, + fs->conf.tag ?: ""); + return ret; + } + + return 0; +} + +/** + * Load virtiofsd's internal state from `f` and send it over to virtiofsd. 
+ */ +static int vuf_load_state(QEMUFile *f, void *pv, size_t size, + const VMStateField *field) +{ + VirtIODevice *vdev =3D pv; + VHostUserFS *fs =3D VHOST_USER_FS(vdev); + Error *local_error =3D NULL; + int ret; + + ret =3D vhost_load_backend_state(&fs->vhost_dev, f, &local_error); + if (ret < 0) { + error_reportf_err(local_error, + "Error loading back-end state of %s device %s " + "(tag: \"%s\"): ", + vdev->name, vdev->parent_obj.canonical_path, + fs->conf.tag ?: ""); + return ret; + } + + return 0; +} + +static bool vuf_is_internal_migration(void *opaque) +{ + /* TODO: Return false when an external migration is requested */ + return true; +} + +static int vuf_check_migration_support(void *opaque) +{ + VirtIODevice *vdev =3D opaque; + VHostUserFS *fs =3D VHOST_USER_FS(vdev); + + if (!vhost_supports_migratory_state(&fs->vhost_dev)) { + error_report("Back-end of %s device %s (tag: \"%s\") does not supp= ort " + "migration through qemu", + vdev->name, vdev->parent_obj.canonical_path, + fs->conf.tag ?: ""); + return -ENOTSUP; + } + + return 0; +} + +static const VMStateDescription vuf_backend_vmstate; + static const VMStateDescription vuf_vmstate =3D { .name =3D "vhost-user-fs", - .unmigratable =3D 1, + .version_id =3D 0, + .fields =3D (VMStateField[]) { + VMSTATE_VIRTIO_DEVICE, + VMSTATE_END_OF_LIST() + }, + .subsections =3D (const VMStateDescription * []) { + &vuf_backend_vmstate, + NULL, + } +}; + +static const VMStateDescription vuf_backend_vmstate =3D { + .name =3D "vhost-user-fs-backend", + .version_id =3D 0, + .needed =3D vuf_is_internal_migration, + .pre_load =3D vuf_check_migration_support, + .pre_save =3D vuf_check_migration_support, + .fields =3D (VMStateField[]) { + { + .name =3D "back-end", + .info =3D &(const VMStateInfo) { + .name =3D "virtio-fs back-end state", + .get =3D vuf_load_state, + .put =3D vuf_save_state, + }, + }, + VMSTATE_END_OF_LIST() + }, }; =20 static Property vuf_properties[] =3D { --=20 2.39.1
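For readers who want to see the stream format from patch 3/4 in isolation (chunks prefixed by a be32 length, terminated by a 0-length chunk), the following is a minimal standalone sketch. It is an illustration under assumptions, not the code from the series: Stream, put_be32(), get_be32(), save_chunked() and load_chunked() are hypothetical helpers operating on a fixed in-memory buffer rather than a QEMUFile, and the chunk size is passed in by the caller instead of being dictated by what read() returns from the pipe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory stand-in for the migration stream (not QEMUFile) */
typedef struct {
    uint8_t data[256];
    size_t len; /* number of bytes written so far */
    size_t pos; /* read cursor */
} Stream;

/* Append a 32-bit big-endian integer */
static void put_be32(Stream *s, uint32_t v)
{
    assert(s->len + 4 <= sizeof(s->data));
    s->data[s->len++] = (uint8_t)(v >> 24);
    s->data[s->len++] = (uint8_t)(v >> 16);
    s->data[s->len++] = (uint8_t)(v >> 8);
    s->data[s->len++] = (uint8_t)v;
}

/* Read a 32-bit big-endian integer; returns -1 if the stream is truncated */
static int get_be32(Stream *s, uint32_t *v)
{
    if (s->len - s->pos < 4) {
        return -1;
    }
    *v = ((uint32_t)s->data[s->pos] << 24) |
         ((uint32_t)s->data[s->pos + 1] << 16) |
         ((uint32_t)s->data[s->pos + 2] << 8) |
         (uint32_t)s->data[s->pos + 3];
    s->pos += 4;
    return 0;
}

/* Write `n` bytes as length-prefixed chunks, then a 0-length end marker */
static void save_chunked(Stream *s, const uint8_t *src, size_t n,
                         size_t chunk_size)
{
    while (n > 0) {
        size_t this_chunk = n < chunk_size ? n : chunk_size;

        put_be32(s, (uint32_t)this_chunk);
        assert(s->len + this_chunk <= sizeof(s->data));
        memcpy(s->data + s->len, src, this_chunk);
        s->len += this_chunk;
        src += this_chunk;
        n -= this_chunk;
    }
    put_be32(s, 0); /* end of state */
}

/*
 * Read chunks until the 0-length marker.  Returns the number of payload
 * bytes stored in `dst`, or -1 on a truncated stream or a too-small `dst`.
 */
static long load_chunked(Stream *s, uint8_t *dst, size_t dst_size)
{
    size_t total = 0;

    for (;;) {
        uint32_t this_chunk;

        if (get_be32(s, &this_chunk) < 0) {
            return -1;
        }
        if (this_chunk == 0) {
            return (long)total; /* end of state */
        }
        if (s->len - s->pos < this_chunk || dst_size - total < this_chunk) {
            return -1;
        }
        memcpy(dst + total, s->data + s->pos, this_chunk);
        s->pos += this_chunk;
        total += this_chunk;
    }
}
```

Because every chunk carries its own length and the end is explicit, the loading side never needs to know the total state size up front, which is the property the commit message of patch 3/4 relies on.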