From: Eugenio Pérez <eperezma@redhat.com>
To: qemu-devel@nongnu.org
Cc: Laurent Vivier, Jason Wang, Dragos Tatulea DE, Jonah Palmer,
    "Michael S. Tsirkin", Eugenio Pérez, Lei Yang, Koushik Dutta,
    Stefano Garzarella, qemu-stable@nongnu.org, Cindy Lu, Maxime Coquelin
Subject: [PATCH 6/7] vhost: add in_order feature to shadow virtqueue
Date: Wed, 4 Mar 2026 18:35:34 +0100
Message-ID: <20260304173535.2702587-7-eperezma@redhat.com>
In-Reply-To: <20260304173535.2702587-1-eperezma@redhat.com>
References: <20260304173535.2702587-1-eperezma@redhat.com>

Some vdpa devices benefit from the in order feature. Add support to SVQ
so QEMU can migrate these devices.
Signed-off-by: Eugenio Pérez
Acked-by: Jason Wang
---
 hw/virtio/vhost-shadow-virtqueue.c | 137 +++++++++++++++++++++++++++--
 hw/virtio/vhost-shadow-virtqueue.h |  36 ++++++--
 2 files changed, 160 insertions(+), 13 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 2d8fc82cc06f..60212fcd7bf3 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -12,11 +12,14 @@

 #include "qemu/error-report.h"
 #include "qapi/error.h"
+#include "qemu/iov.h"
 #include "qemu/main-loop.h"
 #include "qemu/log.h"
 #include "qemu/memalign.h"
 #include "linux-headers/linux/vhost.h"

+#define VIRTIO_RING_NOT_IN_BATCH UINT16_MAX
+
 /**
  * Validate the transport device features that both guests can use with the SVQ
  * and SVQs can use with the device.
@@ -150,7 +153,33 @@ static bool vhost_svq_translate_addr(const VhostShadowVirtqueue *svq,
 static uint16_t vhost_svq_next_desc(const VhostShadowVirtqueue *svq,
                                     uint16_t id)
 {
-    return svq->desc_state[id].next;
+    if (virtio_vdev_has_feature(svq->vdev, VIRTIO_F_IN_ORDER)) {
+        return (id == svq->vring.num) ? 0 : ++id;
+    } else {
+        return svq->desc_state[id].next;
+    }
+}
+
+/**
+ * Updates the SVQ free_head member after adding descriptors to the SVQ
+ * avail ring. The new free_head is the next descriptor that SVQ will make
+ * available by forwarding a new guest descriptor.
+ *
+ * @svq Shadow Virtqueue
+ * @num Number of descriptors added
+ * @id ID of the last descriptor added to the SVQ avail ring.
+ */
+static void vhost_svq_update_free_head(VhostShadowVirtqueue *svq,
+                                       size_t num, uint16_t id)
+{
+    if (virtio_vdev_has_feature(svq->vdev, VIRTIO_F_IN_ORDER)) {
+        svq->free_head += num;
+        if (svq->free_head >= svq->vring.num) {
+            svq->free_head -= svq->vring.num;
+        }
+    } else {
+        svq->free_head = vhost_svq_next_desc(svq, id);
+    }
+}

 /**
@@ -202,7 +231,7 @@ static bool vhost_svq_vring_write_descs(VhostShadowVirtqueue *svq, hwaddr *sg,
         i = next;
     }

-    svq->free_head = vhost_svq_next_desc(svq, last);
+    vhost_svq_update_free_head(svq, num, last);
     return true;
 }

@@ -306,6 +335,9 @@ int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
     svq->num_free -= ndescs;
     svq->desc_state[qemu_head].elem = elem;
     svq->desc_state[qemu_head].ndescs = ndescs;
+    if (virtio_vdev_has_feature(svq->vdev, VIRTIO_F_IN_ORDER)) {
+        svq->desc_state[qemu_head].in_bytes = iov_size(in_sg, in_num);
+    }
     vhost_svq_kick(svq);
     return 0;
 }
@@ -401,6 +433,12 @@ static void vhost_handle_guest_kick_notifier(EventNotifier *n)
 static bool vhost_svq_more_used(VhostShadowVirtqueue *svq)
 {
     uint16_t *used_idx = &svq->vring.used->idx;
+
+    if (virtio_vdev_has_feature(svq->vdev, VIRTIO_F_IN_ORDER) &&
+        svq->batch_last.id != VIRTIO_RING_NOT_IN_BATCH) {
+        return true;
+    }
+
     if (svq->last_used_idx != svq->shadow_used_idx) {
         return true;
     }
@@ -463,6 +501,47 @@ static uint16_t vhost_svq_get_last_used_split(VhostShadowVirtqueue *svq,
     return le32_to_cpu(used->ring[last_used].id);
 }

+/*
+ * Gets the next buffer id and moves forward the used idx, so the next
+ * time SVQ calls this function it will get the next one. IN_ORDER version.
+ *
+ * @svq: Shadow VirtQueue
+ * @len: Consumed length by the device.
+ *
+ * Return the next descriptor consumed by the device.
+ */
+static int32_t vhost_svq_get_last_used_split_in_order(
+                                                     VhostShadowVirtqueue *svq,
+                                                     uint32_t *len)
+{
+    unsigned num = svq->vring.num;
+    const vring_used_t *used = svq->vring.used;
+    uint16_t last_used = svq->last_used & (num - 1);
+    uint16_t last_used_idx = svq->last_used_idx & (num - 1);
+
+    if (svq->batch_last.id == VIRTIO_RING_NOT_IN_BATCH) {
+        svq->batch_last.id = le32_to_cpu(used->ring[last_used_idx].id);
+        svq->batch_last.len = le32_to_cpu(used->ring[last_used_idx].len);
+    }
+
+    if (unlikely(last_used >= num)) {
+        qemu_log_mask(LOG_GUEST_ERROR, "Device %s says index %u is used",
+                      svq->vdev->name, last_used);
+        return -1;
+    }
+
+    if (svq->batch_last.id == last_used) {
+        svq->batch_last.id = VIRTIO_RING_NOT_IN_BATCH;
+        *len = svq->batch_last.len;
+    } else {
+        *len = svq->desc_state[last_used].in_bytes;
+    }
+
+    svq->last_used += svq->desc_state[last_used].ndescs;
+    svq->last_used_idx++;
+    return last_used;
+}
+
 static uint16_t vhost_svq_last_desc_of_chain(const VhostShadowVirtqueue *svq,
                                              uint16_t num, uint16_t i)
 {
@@ -474,8 +553,8 @@ static uint16_t vhost_svq_last_desc_of_chain(const VhostShadowVirtqueue *svq,
 }

 G_GNUC_WARN_UNUSED_RESULT
-static VirtQueueElement *vhost_svq_detach_buf(VhostShadowVirtqueue *svq,
-                                              uint16_t id)
+static VirtQueueElement *vhost_svq_detach_buf_split(VhostShadowVirtqueue *svq,
+                                                    uint16_t id)
 {
     uint16_t num = svq->desc_state[id].ndescs;
     uint16_t last_used_chain = vhost_svq_last_desc_of_chain(svq, num, id);
@@ -486,6 +565,33 @@ static VirtQueueElement *vhost_svq_detach_buf(VhostShadowVirtqueue *svq,
     return g_steal_pointer(&svq->desc_state[id].elem);
 }

+G_GNUC_WARN_UNUSED_RESULT
+static VirtQueueElement *vhost_svq_detach_buf_split_in_order(
+                                                     VhostShadowVirtqueue *svq,
+                                                     uint16_t id)
+{
+    return g_steal_pointer(&svq->desc_state[id].elem);
+}
+
+/*
+ * Return the descriptor id (and the chain of ids) to the free list
+ *
+ * @svq: Shadow Virtqueue
+ * @id: Id of the buffer to return.
+ *
+ * Return the element associated to the buffer if any.
+ */
+G_GNUC_WARN_UNUSED_RESULT
+static VirtQueueElement *vhost_svq_detach_buf(VhostShadowVirtqueue *svq,
+                                              uint16_t id)
+{
+    if (virtio_vdev_has_feature(svq->vdev, VIRTIO_F_IN_ORDER)) {
+        return vhost_svq_detach_buf_split_in_order(svq, id);
+    } else {
+        return vhost_svq_detach_buf_split(svq, id);
+    }
+}
+
 G_GNUC_WARN_UNUSED_RESULT
 static VirtQueueElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
                                            uint32_t *len)
@@ -498,7 +604,18 @@ static VirtQueueElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,

     /* Only get used array entries after they have been exposed by dev */
     smp_rmb();
-    last_used = vhost_svq_get_last_used_split(svq, len);
+
+    if (virtio_vdev_has_feature(svq->vdev, VIRTIO_F_IN_ORDER)) {
+        int32_t r;
+        r = vhost_svq_get_last_used_split_in_order(svq, len);
+        if (r < 0) {
+            return NULL;
+        }
+
+        last_used = r;
+    } else {
+        last_used = vhost_svq_get_last_used_split(svq, len);
+    }

     if (unlikely(last_used >= svq->vring.num)) {
         qemu_log_mask(LOG_GUEST_ERROR, "Device %s says index %u is used",
@@ -726,6 +843,8 @@ void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
     svq->next_guest_avail_elem = NULL;
     svq->shadow_avail_idx = 0;
     svq->shadow_used_idx = 0;
+    memset(&svq->batch_last, 0, sizeof(svq->batch_last));
+    svq->last_used = 0;
     svq->last_used_idx = 0;
     svq->vdev = vdev;
     svq->vq = vq;
@@ -742,8 +861,12 @@ void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
                            PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
                            -1, 0);
     svq->desc_state = g_new0(SVQDescState, svq->vring.num);
-    for (unsigned i = 0; i < svq->vring.num - 1; i++) {
-        svq->desc_state[i].next = i + 1;
+    if (virtio_vdev_has_feature(svq->vdev, VIRTIO_F_IN_ORDER)) {
+        svq->batch_last.id = VIRTIO_RING_NOT_IN_BATCH;
+    } else {
+        for (unsigned i = 0; i < svq->vring.num - 1; i++) {
+            svq->desc_state[i].next = i + 1;
+        }
     }
 }

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index f52c33e65046..ec16a1e83858 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -24,11 +24,19 @@ typedef struct SVQDescState {
      */
     unsigned int ndescs;

-    /*
-     * Backup next field for each descriptor so we can recover securely, not
-     * needing to trust the device access.
-     */
-    uint16_t next;
+    union {
+        /*
+         * Total length of the available buffer that is writable by the device.
+         * Only used in packed vq.
+         */
+        uint32_t in_bytes;
+
+        /*
+         * Backup next field for each descriptor so we can recover securely, not
+         * needing to trust the device access. Only used in split vq.
+         */
+        uint16_t next;
+    };
 } SVQDescState;

 typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
@@ -99,9 +107,25 @@ typedef struct VhostShadowVirtqueue {
     /* Next head to expose to the device */
     uint16_t shadow_avail_idx;

-    /* Next free descriptor */
+    /*
+     * Next free descriptor.
+     *
+     * Without IN_ORDER free_head is used as a linked list head, and
+     * desc_next[id] is the next element.
+     * With IN_ORDER free_head is the next available buffer index.
+     */
     uint16_t free_head;

+    /*
+     * Last used element of the processing batch of used descriptors if
+     * IN_ORDER.
+     * If SVQ is not processing a batch of descriptors id is set to UINT_MAX.
+     */
+    vring_used_elem_t batch_last;
+
+    /* Last used id if IN_ORDER and split vq */
+    uint16_t last_used;
+
     /* Last seen used idx */
     uint16_t shadow_used_idx;

-- 
2.53.0