From nobody Tue Sep 2 09:50:27 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7A113308F09 for ; Mon, 18 Aug 2025 08:57:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755507447; cv=none; b=Db1/OA11j/mTQzhWqVNGWFKIq081LFlOt68cs8PDAIxSyjgcm0+KMoAIbtt2c1J3p3fCM/ObD/Da166x/ipG7TUMhMuF87L6Fy3OPQdVUoM1STh7CYp0EHBzlRiX9yEXkf/JocdIFHOI9zN1hWqVTP93mH7UbXoMi/2966R6waA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755507447; c=relaxed/simple; bh=Mq/AfDbkzeahYi6sN3Rcg3Dlb4puLIjsK36j9gN6tVE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=ChK9C+nRFrTW8Pvc35nc9iBXEgHVlmsh507X50DSgQrXTK3poPz3JsksbSyOokuftv/c5TEQEviWs9KJJI/LDwIbn2ncDKHTtn8e69CLwCguPZfRQUqQsoiVfGJjdJMuw9es47zq4oXrhkCAburiDIKUKe6/euujyyv5fl19WT8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Jp52qTJM; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Jp52qTJM" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1755507445; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ahzdshQiL3Xzni/Jqm6kcPlFcj6smcBvlHHrtbsnZWI=; b=Jp52qTJMIyFNlMBRutJiyyCRxDbf5+2JBS3TkI6+8QcfEYokFUMDd7CKo4uhRYkIor7iU2 BJSsMJqwILwX1q1S/P6FYztEtvYHOypCpBM8Ghu8JFRcbCYkyUN4fBOdsd9KPFCvB0qhrF 4tnhoRD26Ge29KRAKB48LdvbaGEJxqw= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-685-GMg2T3ajOSG-vO-Bz9LOFg-1; Mon, 18 Aug 2025 04:57:23 -0400 X-MC-Unique: GMg2T3ajOSG-vO-Bz9LOFg-1 X-Mimecast-MFC-AGG-ID: GMg2T3ajOSG-vO-Bz9LOFg_1755507442 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 3FD7D1800286; Mon, 18 Aug 2025 08:57:22 +0000 (UTC) Received: from fedora.redhat.com (unknown [10.44.32.213]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 59F9B1800446; Mon, 18 Aug 2025 08:57:18 +0000 (UTC) From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= To: "Michael S . 
Tsirkin "
Cc: =?UTF-8?q?Eugenio=20P=C3=A9rez?= , Laurent Vivier , virtualization@lists.linux.dev, jasowang@redhat.com, Cindy Lu , linux-kernel@vger.kernel.org, Maxime Coquelin , Yongji Xie , Stefano Garzarella , Xuan Zhuo
Subject: [RFC v3 1/7] vduse: add v1 API definition
Date: Mon, 18 Aug 2025 10:57:05 +0200
Message-ID: <20250818085711.3461758-2-eperezma@redhat.com>
In-Reply-To: <20250818085711.3461758-1-eperezma@redhat.com>
References: <20250818085711.3461758-1-eperezma@redhat.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111

This allows the kernel to detect whether the userspace VDUSE device
supports the VQ group and ASID features. VDUSE devices that don't set
the V1 API will not receive the new messages, and the vdpa device will
be created with only one vq group and ASID.

The next patches implement the new feature incrementally, only enabling
the VDUSE device to set the V1 API version by the end of the series.

Signed-off-by: Eugenio P=C3=A9rez
---
 include/uapi/linux/vduse.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
index 68a627d04afa..9a56d0416bfe 100644
--- a/include/uapi/linux/vduse.h
+++ b/include/uapi/linux/vduse.h
@@ -10,6 +10,10 @@
=20
 #define VDUSE_API_VERSION 0
=20
+/* VQ groups and ASID support */
+
+#define VDUSE_API_VERSION_1 1
+
 /*
  * Get the version of VDUSE API that kernel supported (VDUSE_API_VERSION).
  * This is used for future extension.
--=20 2.50.1 From nobody Tue Sep 2 09:50:27 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C3DC2308F02 for ; Mon, 18 Aug 2025 08:57:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755507454; cv=none; b=ogTYs1AyZTwnFwAH+8isHOeyqPRrCfQPUFpa9Orzwc9kCgH7TKqz5SXhi1pwaz7HmxpV1T3VDCuuydXudQ/sn2yY+XNcpTC21n4vEM+1JH/Rs25Ayc4Gj0c4xEWjqa5bd6hRSPtfIWEFlvEm6Ay0eHep/uSoe67lm28YY5babHo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755507454; c=relaxed/simple; bh=bIJXmp+NJrMo9m0v1Ptbem41UpFckZaQZLVKL5jzZ4Q=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=YMnDYN+TgdD06JiV/OKQAQwt0i9XvpIQ53kQVtq4r88plF9pQyCRn2s3P/f8Wi5UCX5kSLrb4M8xmtC2GpRJzlZOJXOLjhruYzCsTmobqBDHpQRaXb4kqceTqVwGYkwMrK2XscUHmS4FcI3XHrGSt1a46YgWZoOrH8nnUz5MCvY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Ij0Jwcvk; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Ij0Jwcvk" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1755507451; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=S/3SgPuyeB/3g1noTu/JJgOcwr+hehlqJojKmHQcytI=; b=Ij0Jwcvk9fUDHQbSOgpoaDNZDNVaXfPCiq1OIXH0hlftM5AZoL1hGv3GTqNZ8tK+0IO72X UcbxH/Ff2VJcmwYDsjf1P5M8rJEei8JJQlSjei32uWcmF38Fka92VgGdQNQCQwYWn1eIct CDtvuYMrmyfQ6tmRkUXNLVm6K1fdV5I= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-630--H-08AnMMqiyhY1rZ4zMUg-1; Mon, 18 Aug 2025 04:57:28 -0400 X-MC-Unique: -H-08AnMMqiyhY1rZ4zMUg-1 X-Mimecast-MFC-AGG-ID: -H-08AnMMqiyhY1rZ4zMUg_1755507447 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id CED411800280; Mon, 18 Aug 2025 08:57:26 +0000 (UTC) Received: from fedora.redhat.com (unknown [10.44.32.213]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id C7528180028A; Mon, 18 Aug 2025 08:57:22 +0000 (UTC) From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= To: "Michael S . 
Tsirkin "
Cc: =?UTF-8?q?Eugenio=20P=C3=A9rez?= , Laurent Vivier , virtualization@lists.linux.dev, jasowang@redhat.com, Cindy Lu , linux-kernel@vger.kernel.org, Maxime Coquelin , Yongji Xie , Stefano Garzarella , Xuan Zhuo
Subject: [RFC v3 2/7] vduse: add vq group support
Date: Mon, 18 Aug 2025 10:57:06 +0200
Message-ID: <20250818085711.3461758-3-eperezma@redhat.com>
In-Reply-To: <20250818085711.3461758-1-eperezma@redhat.com>
References: <20250818085711.3461758-1-eperezma@redhat.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111

This allows separating the different virtqueues into groups that share
the same address space. The VDUSE device is asked for the group of each
vq at the beginning, as the groups are needed for the DMA API.

Allocate 3 vq groups, as net is the device that needs the most groups:
* Dataplane (guest passthrough)
* CVQ
* Shadowed vrings.

Future versions of the series can include dynamic allocation of the
groups array so VDUSE can declare more groups.

Signed-off-by: Eugenio P=C3=A9rez
---
v3:
* Increase VDUSE_MAX_VQ_GROUPS to 0xffff (Jason). It was set to a lower
  value to reduce memory consumption, but vqs are already limited to
  that value and userspace VDUSE is able to allocate that many vqs.
* Remove the descs vq group capability as it will not be used and we can
  add it on top.
* Do not ask for vq groups if the number of vq groups is < 2.
* Move the valid vq groups range check to vduse_validate_config.

v2:
* Cache group information in kernel, as we need to provide the vq map
  tokens properly.
* Add descs vq group to optimize SVQ forwarding and support indirect
  descriptors out of the box.
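
For illustration only, the group query described in the commit message
could be modelled in userspace roughly as follows. This is a sketch, not
part of the patch: struct vq_group_reply is a local mirror of the new
struct vduse_vq_group, the group numbers and both helper names are
hypothetical, and FEATURES_OK is assumed to be the status bit mask 8 from
the virtio uapi header. The first helper detects the moment FEATURES_OK
becomes set (when the kernel would ask for the groups); the second shows
how a net device might answer, placing the control vq, which is the last
vq in virtio-net, in its own group.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Status bit mask as defined by the virtio spec (a mask, not a bit index). */
#define VIRTIO_CONFIG_S_FEATURES_OK 8

/* Local mirror of the struct vduse_vq_group added by this patch. */
struct vq_group_reply {
	uint32_t index;
	uint32_t group;
};

/* Hypothetical group numbering chosen by the userspace device. */
enum { GROUP_DATAPLANE = 0, GROUP_CVQ = 1 };

/* True only when FEATURES_OK goes from clear to set. */
static bool features_ok_newly_set(uint8_t prev, uint8_t next)
{
	return ((prev ^ next) & VIRTIO_CONFIG_S_FEATURES_OK) &&
	       (next & VIRTIO_CONFIG_S_FEATURES_OK);
}

/* Answer for one vq: the control vq, when present, is the last vq. */
static struct vq_group_reply net_vq_group(uint32_t index, uint32_t nvqs,
					  bool has_cvq)
{
	struct vq_group_reply r = { .index = index, .group = GROUP_DATAPLANE };

	assert(index < nvqs);
	if (has_cvq && index == nvqs - 1)
		r.group = GROUP_CVQ;
	return r;
}
```

A device that declares fewer than two groups never sees the query, so it
does not need the second helper at all.
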
--- drivers/vdpa/vdpa_user/vduse_dev.c | 53 ++++++++++++++++++++++++++++-- include/uapi/linux/vduse.h | 21 +++++++++++- 2 files changed, 70 insertions(+), 4 deletions(-) diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vd= use_dev.c index 3260edefdf0d..e42d14888ca2 100644 --- a/drivers/vdpa/vdpa_user/vduse_dev.c +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -58,6 +58,7 @@ struct vduse_virtqueue { struct vdpa_vq_state state; bool ready; bool kicked; + u32 vq_group; spinlock_t kick_lock; spinlock_t irq_lock; struct eventfd_ctx *kickfd; @@ -115,6 +116,7 @@ struct vduse_dev { u8 status; u32 vq_num; u32 vq_align; + u32 ngroups; struct vduse_umem *umem; struct mutex mem_lock; unsigned int bounce_size; @@ -593,6 +595,13 @@ static int vduse_vdpa_set_vq_state(struct vdpa_device = *vdpa, u16 idx, return 0; } =20 +static u32 vduse_get_vq_group(struct vdpa_device *vdpa, u16 idx) +{ + struct vduse_dev *dev =3D vdpa_to_vduse(vdpa); + + return dev->vqs[idx]->vq_group; +} + static int vduse_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 idx, struct vdpa_vq_state *state) { @@ -679,13 +688,42 @@ static u8 vduse_vdpa_get_status(struct vdpa_device *v= dpa) return dev->status; } =20 +static int vduse_fill_vq_groups(struct vduse_dev *dev) +{ + /* All vqs and descs must be in vq group 0 if ngroups < 2 */ + if (dev->ngroups < 2) + return 0; + + for (int i =3D 0; i < dev->vdev->vdpa.nvqs; ++i) { + struct vduse_dev_msg msg =3D { 0 }; + int ret; + + msg.req.type =3D VDUSE_GET_VQ_GROUP; + msg.req.vq_group.index =3D i; + ret =3D vduse_dev_msg_sync(dev, &msg); + if (ret) + return ret; + + dev->vqs[i]->vq_group =3D msg.resp.vq_group.group; + } + + return 0; +} + static void vduse_vdpa_set_status(struct vdpa_device *vdpa, u8 status) { struct vduse_dev *dev =3D vdpa_to_vduse(vdpa); + u8 previous_status =3D dev->status; =20 if (vduse_dev_set_status(dev, status)) return; =20 + if ((dev->status ^ previous_status) & + BIT_ULL(VIRTIO_CONFIG_S_FEATURES_OK) && + status & (1ULL << 
VIRTIO_CONFIG_S_FEATURES_OK)) + if (vduse_fill_vq_groups(dev)) + return; + dev->status =3D status; } =20 @@ -790,6 +828,7 @@ static const struct vdpa_config_ops vduse_vdpa_config_o= ps =3D { .set_vq_cb =3D vduse_vdpa_set_vq_cb, .set_vq_num =3D vduse_vdpa_set_vq_num, .get_vq_size =3D vduse_vdpa_get_vq_size, + .get_vq_group =3D vduse_get_vq_group, .set_vq_ready =3D vduse_vdpa_set_vq_ready, .get_vq_ready =3D vduse_vdpa_get_vq_ready, .set_vq_state =3D vduse_vdpa_set_vq_state, @@ -1738,12 +1777,19 @@ static bool features_is_valid(struct vduse_dev_conf= ig *config) return true; } =20 -static bool vduse_validate_config(struct vduse_dev_config *config) +static bool vduse_validate_config(struct vduse_dev_config *config, + u64 api_version) { if (!is_mem_zero((const char *)config->reserved, sizeof(config->reserved))) return false; =20 + if (api_version < VDUSE_API_VERSION_1 && config->ngroups) + return false; + + if (api_version >=3D VDUSE_API_VERSION_1 && config->ngroups > 0xffff) + return false; + if (config->vq_align > PAGE_SIZE) return false; =20 @@ -1859,6 +1905,7 @@ static int vduse_create_dev(struct vduse_dev_config *= config, dev->device_features =3D config->features; dev->device_id =3D config->device_id; dev->vendor_id =3D config->vendor_id; + dev->ngroups =3D (dev->api_version < 1) ? 
1 : (config->ngroups ?: 1); dev->name =3D kstrdup(config->name, GFP_KERNEL); if (!dev->name) goto err_str; @@ -1937,7 +1984,7 @@ static long vduse_ioctl(struct file *file, unsigned i= nt cmd, break; =20 ret =3D -EINVAL; - if (vduse_validate_config(&config) =3D=3D false) + if (!vduse_validate_config(&config, control->api_version)) break; =20 buf =3D vmemdup_user(argp + size, config.config_size); @@ -2018,7 +2065,7 @@ static int vduse_dev_init_vdpa(struct vduse_dev *dev,= const char *name) =20 vdev =3D vdpa_alloc_device(struct vduse_vdpa, vdpa, dev->dev, &vduse_vdpa_config_ops, &vduse_map_ops, - 1, 1, name, true); + dev->ngroups, 1, name, true); if (IS_ERR(vdev)) return PTR_ERR(vdev); =20 diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h index 9a56d0416bfe..b1c0e47d71fb 100644 --- a/include/uapi/linux/vduse.h +++ b/include/uapi/linux/vduse.h @@ -31,6 +31,7 @@ * @features: virtio features * @vq_num: the number of virtqueues * @vq_align: the allocation alignment of virtqueue's metadata + * @ngroups: number of vq groups that VDUSE device declares * @reserved: for future use, needs to be initialized to zero * @config_size: the size of the configuration space * @config: the buffer of the configuration space @@ -45,7 +46,8 @@ struct vduse_dev_config { __u64 features; __u32 vq_num; __u32 vq_align; - __u32 reserved[13]; + __u32 ngroups; /* if VDUSE_API_VERSION >=3D 1 */ + __u32 reserved[12]; __u32 config_size; __u8 config[]; }; @@ -160,6 +162,16 @@ struct vduse_vq_state_packed { __u16 last_used_idx; }; =20 +/** + * struct vduse_vq_group - virtqueue group + * @index: Index of the virtqueue + * @group: Virtqueue group + */ +struct vduse_vq_group { + __u32 index; + __u32 group; +}; + /** * struct vduse_vq_info - information of a virtqueue * @index: virtqueue index @@ -274,6 +286,7 @@ enum vduse_req_type { VDUSE_GET_VQ_STATE, VDUSE_SET_STATUS, VDUSE_UPDATE_IOTLB, + VDUSE_GET_VQ_GROUP, }; =20 /** @@ -316,6 +329,7 @@ struct vduse_iova_range { * @vq_state: 
virtqueue state, only index field is available * @s: device status * @iova: IOVA range for updating + * @vq_group: virtqueue group of a virtqueue * @padding: padding * * Structure used by read(2) on /dev/vduse/$NAME. @@ -328,6 +342,8 @@ struct vduse_dev_request { struct vduse_vq_state vq_state; struct vduse_dev_status s; struct vduse_iova_range iova; + /* Only if vduse api version >=3D 1 */; + struct vduse_vq_group vq_group; __u32 padding[32]; }; }; @@ -338,6 +354,7 @@ struct vduse_dev_request { * @result: the result of request * @reserved: for future use, needs to be initialized to zero * @vq_state: virtqueue state + * @vq_group: virtqueue group of a virtqueue * @padding: padding * * Structure used by write(2) on /dev/vduse/$NAME. @@ -350,6 +367,8 @@ struct vduse_dev_response { __u32 reserved[4]; union { struct vduse_vq_state vq_state; + /* Only if vduse api version >=3D 1 */ + struct vduse_vq_group vq_group; __u32 padding[32]; }; }; --=20 2.50.1 From nobody Tue Sep 2 09:50:27 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C54233090F5 for ; Mon, 18 Aug 2025 08:57:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755507458; cv=none; b=X6rIRXTPcWeijH6OJF19nMWWmXdyt/9JhN13PvW79ADjc0g8Mp3ASKr2GFgBi5z5JUWSrmaotbY6q9MReWHYkIYQG21j4QOu97HllmKCDjMmgydpCw+DnneKinl8ND/roJ7Hpxsb6HwL1TEkPRZ48T56AVqVgB0FmdTSUT/tu1Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755507458; c=relaxed/simple; bh=usLxxLCW1qadP/ouGgiHlbKdjEzjAq87u4VZ8+i0TZY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=JbEqfu8Ocz6zZ81UAYogXbWPaAjnZ7iDykKX8Qtl2li0yCPqCDa4uwof8Dv82E6zmiYIc/6GvB8htwb7x+RyCrZr2ADIY5OBYhIcZkyl3nxrJb4QyRBG1tEFfLSXYIpzjrbsBnGj+dbvG/RGvu3j1g94L30n+MFMvhZX/qX7lJQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=iqPw2orn; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="iqPw2orn" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1755507456; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=c2Z2NV/NWbWOP0kVy/5ym3JJhAbGWPlFBi1uAM5sl6o=; b=iqPw2ornGEcHsfXVG88Ym5J+6dZS+L9TimLyLMa9O++ODbCZ5zO3vRCAcjL1Ri724pNzvJ 20ZpA8nDfXj7pq12BG+fSJW7FjC7uvHQr+J7+Um9On6Iw7Py8dF5ZpA6Il8jpRLIeXWFrX B5pRYzLnjHHL15qyOZLNbIhEnARWThE= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-153-1e-QIgacMlCx5tRx79SBdA-1; Mon, 18 Aug 2025 04:57:32 -0400 X-MC-Unique: 1e-QIgacMlCx5tRx79SBdA-1 X-Mimecast-MFC-AGG-ID: 1e-QIgacMlCx5tRx79SBdA_1755507451 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 
bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 8F394180045C; Mon, 18 Aug 2025 08:57:31 +0000 (UTC) Received: from fedora.redhat.com (unknown [10.44.32.213]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 6244A180028A; Mon, 18 Aug 2025 08:57:27 +0000 (UTC) From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= To: "Michael S . Tsirkin " Cc: =?UTF-8?q?Eugenio=20P=C3=A9rez?= , Laurent Vivier , virtualization@lists.linux.dev, jasowang@redhat.com, Cindy Lu , linux-kernel@vger.kernel.org, Maxime Coquelin , Yongji Xie , Stefano Garzarella , Xuan Zhuo Subject: [RFC v3 3/7] vdpa: change map_token from void * to an empty struct Date: Mon, 18 Aug 2025 10:57:07 +0200 Message-ID: <20250818085711.3461758-4-eperezma@redhat.com> In-Reply-To: <20250818085711.3461758-1-eperezma@redhat.com> References: <20250818085711.3461758-1-eperezma@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 Proposal to make it type safe, never casting to void, but allow the backend to use whatever struct it needs. Next patches will move the token from a domain to a custom struct. 
Signed-off-by: Eugenio P=C3=A9rez --- drivers/vdpa/vdpa_user/iova_domain.h | 1 + drivers/vdpa/vdpa_user/vduse_dev.c | 60 ++++++++++++++++------------ drivers/virtio/virtio_ring.c | 6 ++- include/linux/virtio.h | 8 +++- include/linux/virtio_config.h | 34 +++++++++------- 5 files changed, 67 insertions(+), 42 deletions(-) diff --git a/drivers/vdpa/vdpa_user/iova_domain.h b/drivers/vdpa/vdpa_user/= iova_domain.h index 1f3c30be272a..c0f97dfaf94f 100644 --- a/drivers/vdpa/vdpa_user/iova_domain.h +++ b/drivers/vdpa/vdpa_user/iova_domain.h @@ -26,6 +26,7 @@ struct vduse_bounce_map { }; =20 struct vduse_iova_domain { + struct vring_mapping_opaque token; struct iova_domain stream_iovad; struct iova_domain consistent_iovad; struct vduse_bounce_map *bounce_maps; diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vd= use_dev.c index e42d14888ca2..e3c8fc1aa446 100644 --- a/drivers/vdpa/vdpa_user/vduse_dev.c +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -164,6 +164,11 @@ static inline struct vduse_dev *dev_to_vduse(struct de= vice *dev) return vdpa_to_vduse(vdpa); } =20 +static struct vduse_iova_domain *vduse_token_to_domain(struct vring_mappin= g_opaque *token) +{ + return container_of(token, struct vduse_iova_domain, token); +} + static struct vduse_dev_msg *vduse_find_msg(struct list_head *head, uint32_t request_id) { @@ -854,47 +859,50 @@ static const struct vdpa_config_ops vduse_vdpa_config= _ops =3D { .free =3D vduse_vdpa_free, }; =20 -static void vduse_dev_sync_single_for_device(void *token, +static void vduse_dev_sync_single_for_device(struct vring_mapping_opaque *= token, dma_addr_t dma_addr, size_t size, enum dma_data_direction dir) { - struct vduse_iova_domain *domain =3D token; + struct vduse_iova_domain *domain =3D vduse_token_to_domain(token); =20 vduse_domain_sync_single_for_device(domain, dma_addr, size, dir); } =20 -static void vduse_dev_sync_single_for_cpu(void *token, - dma_addr_t dma_addr, size_t size, - enum dma_data_direction dir) 
+static void vduse_dev_sync_single_for_cpu(struct vring_mapping_opaque *tok= en, + dma_addr_t dma_addr, size_t size, + enum dma_data_direction dir) { - struct vduse_iova_domain *domain =3D token; + struct vduse_iova_domain *domain =3D vduse_token_to_domain(token); =20 vduse_domain_sync_single_for_cpu(domain, dma_addr, size, dir); } =20 -static dma_addr_t vduse_dev_map_page(void *token, struct page *page, +static dma_addr_t vduse_dev_map_page(struct vring_mapping_opaque *token, + struct page *page, unsigned long offset, size_t size, enum dma_data_direction dir, unsigned long attrs) { - struct vduse_iova_domain *domain =3D token; + struct vduse_iova_domain *domain =3D vduse_token_to_domain(token); =20 return vduse_domain_map_page(domain, page, offset, size, dir, attrs); } =20 -static void vduse_dev_unmap_page(void *token, dma_addr_t dma_addr, - size_t size, enum dma_data_direction dir, - unsigned long attrs) +static void vduse_dev_unmap_page(struct vring_mapping_opaque *token, + dma_addr_t dma_addr, size_t size, + enum dma_data_direction dir, + unsigned long attrs) { - struct vduse_iova_domain *domain =3D token; + struct vduse_iova_domain *domain =3D vduse_token_to_domain(token); =20 return vduse_domain_unmap_page(domain, dma_addr, size, dir, attrs); } =20 -static void *vduse_dev_alloc_coherent(void *token, size_t size, - dma_addr_t *dma_addr, gfp_t flag) +static void *vduse_dev_alloc_coherent(struct vring_mapping_opaque *token, + size_t size, dma_addr_t *dma_addr, + gfp_t flag) { - struct vduse_iova_domain *domain =3D token; + struct vduse_iova_domain *domain =3D vduse_token_to_domain(token); unsigned long iova; void *addr; =20 @@ -909,32 +917,34 @@ static void *vduse_dev_alloc_coherent(void *token, si= ze_t size, return addr; } =20 -static void vduse_dev_free_coherent(void *token, size_t size, - void *vaddr, dma_addr_t dma_addr, - unsigned long attrs) +static void vduse_dev_free_coherent(struct vring_mapping_opaque *token, + size_t size, void *vaddr, + dma_addr_t 
dma_addr, unsigned long attrs) { - struct vduse_iova_domain *domain =3D token; + struct vduse_iova_domain *domain =3D vduse_token_to_domain(token); =20 vduse_domain_free_coherent(domain, size, vaddr, dma_addr, attrs); } =20 -static bool vduse_dev_need_sync(void *token, dma_addr_t dma_addr) +static bool vduse_dev_need_sync(struct vring_mapping_opaque *token, + dma_addr_t dma_addr) { - struct vduse_iova_domain *domain =3D token; + struct vduse_iova_domain *domain =3D vduse_token_to_domain(token); =20 return dma_addr < domain->bounce_size; } =20 -static int vduse_dev_mapping_error(void *token, dma_addr_t dma_addr) +static int vduse_dev_mapping_error(struct vring_mapping_opaque *token, + dma_addr_t dma_addr) { if (unlikely(dma_addr =3D=3D DMA_MAPPING_ERROR)) return -ENOMEM; return 0; } =20 -static size_t vduse_dev_max_mapping_size(void *token) +static size_t vduse_dev_max_mapping_size(struct vring_mapping_opaque *toke= n) { - struct vduse_iova_domain *domain =3D token; + struct vduse_iova_domain *domain =3D vduse_token_to_domain(token); =20 return domain->bounce_size; } @@ -2103,7 +2113,7 @@ static int vdpa_dev_add(struct vdpa_mgmt_dev *mdev, c= onst char *name, return -ENOMEM; } =20 - dev->vdev->vdpa.mapping_token.token =3D dev->domain; + dev->vdev->vdpa.mapping_token.token =3D &dev->domain->token; ret =3D _vdpa_register_device(&dev->vdev->vdpa, dev->vq_num); if (ret) { put_device(&dev->vdev->vdpa.dev); diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c index fc0f5faa8523..4fc588458b23 100644 --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -349,8 +349,12 @@ size_t virtio_max_dma_size(const struct virtio_device = *vdev) =20 if (vring_use_map_api(vdev)) { if (vdev->map) + /* + * TODO we should be able to get the token here, not + * cast to void + */ max_segment_size =3D - vdev->map->max_mapping_size(vdev->dev.parent); + vdev->map->max_mapping_size((void *)vdev->dev.parent); else max_segment_size =3D 
dma_max_mapping_size(vdev->dev.parent); diff --git a/include/linux/virtio.h b/include/linux/virtio.h index ceca93348aed..c446c511b8c1 100644 --- a/include/linux/virtio.h +++ b/include/linux/virtio.h @@ -40,11 +40,17 @@ struct virtqueue { void *priv; }; =20 +/* + * Base struct for the transport specific token used for doing map. + * It allows to convert between the transport specific type to the mapping + * token with a valid type always. + */ +struct vring_mapping_opaque {}; union vring_mapping_token { /* Device that performs DMA */ struct device *dma_dev; /* Transport specific token used for doing map */ - void *token; + struct vring_mapping_opaque *token; }; =20 int virtqueue_add_outbuf(struct virtqueue *vq, diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h index 4566ac87feb7..02d98fb1309c 100644 --- a/include/linux/virtio_config.h +++ b/include/linux/virtio_config.h @@ -191,24 +191,28 @@ struct virtio_config_ops { * Returns: the maximum buffer size that can be mapped */ struct virtio_map_ops { - dma_addr_t (*map_page)(void *token, struct page *page, - unsigned long offset, size_t size, - enum dma_data_direction dir, unsigned long attrs); - void (*unmap_page)(void *token, dma_addr_t map_handle, - size_t size, enum dma_data_direction dir, - unsigned long attrs); - void (*sync_single_for_cpu)(void *token, dma_addr_t map_handle, - size_t size, enum dma_data_direction dir); - void (*sync_single_for_device)(void *token, + dma_addr_t (*map_page)(struct vring_mapping_opaque *token, + struct page *page, unsigned long offset, + size_t size, enum dma_data_direction dir, + unsigned long attrs); + void (*unmap_page)(struct vring_mapping_opaque *token, + dma_addr_t map_handle, size_t size, + enum dma_data_direction dir, unsigned long attrs); + void (*sync_single_for_cpu)(struct vring_mapping_opaque *token, + dma_addr_t map_handle, size_t size, + enum dma_data_direction dir); + void (*sync_single_for_device)(struct vring_mapping_opaque *token, 
dma_addr_t map_handle, size_t size, enum dma_data_direction dir); - void *(*alloc)(void *token, size_t size, + void *(*alloc)(struct vring_mapping_opaque *token, size_t size, dma_addr_t *map_handle, gfp_t gfp); - void (*free)(void *token, size_t size, void *vaddr, - dma_addr_t map_handle, unsigned long attrs); - bool (*need_sync)(void *token, dma_addr_t map_handle); - int (*mapping_error)(void *token, dma_addr_t map_handle); - size_t (*max_mapping_size)(void *token); + void (*free)(struct vring_mapping_opaque *token, size_t size, + void *vaddr, dma_addr_t map_handle, unsigned long attrs); + bool (*need_sync)(struct vring_mapping_opaque *token, + dma_addr_t map_handle); + int (*mapping_error)(struct vring_mapping_opaque *token, + dma_addr_t map_handle); + size_t (*max_mapping_size)(struct vring_mapping_opaque *token); }; =20 /* If driver didn't advertise the feature, it will never appear. */ --=20 2.50.1 From nobody Tue Sep 2 09:50:27 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DDC0330BF65 for ; Mon, 18 Aug 2025 08:57:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755507463; cv=none; b=gmQ39eC94wzM4+Borme1YSL7AW+/WOEEhsP9N7iA/j0gOyV61cB/QlFEeXUD8GYtF++mPZgfAkQjCJ+82AxUKxB9pb9NKNd5DwgyCxiooJBQ8ty7XSYTMlCjyLdeCxZPAjb+nne6rdOtjFnZ8KC3f2iiNZwkyLbsfWnYlxjbxcg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755507463; c=relaxed/simple; bh=KukE7lIDYgBkgVivBel/763Uqu2vc2JHySiAPWZJYj4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=bj6ZsXQLBbs77cNc+rcWOFdTTDp56FsaHBJbVERtWH5g9vRcCUAQHy/nVtkb/pBk9jhFYIW9YnYkIA/uWhb05m07j7zlmkbT5UNCLjNqXNAb9lUiGyBVe5PWykaSbeQDnpUbvdlBkK4BPFX6Cc6iAImC9G78772zG203G1Gphas= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=dTtISHbG; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="dTtISHbG" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1755507460; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Vv2fIHHkGpDijoACTLcDqW3ci7RJYD8mt/1gDnxJgTA=; b=dTtISHbGAkcZgfOY55LXgxmnBBHyfRDnznTgxaIHI2dLAusNEgHdPkIN6G7gltJpX/0RcS quG/ndD2HCjd3WvzCjgOHkqNafgYPxOWLFgqildaEEOZxEqR504lZUxMWKv4jSgd1fNjx+ pp0z4ZIheJTYORpOBY2F1ltrfS5GsRM= Received: from mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-93-s1bxgSckNY6Xv3T4BMjAGw-1; Mon, 18 Aug 2025 04:57:37 -0400 X-MC-Unique: s1bxgSckNY6Xv3T4BMjAGw-1 X-Mimecast-MFC-AGG-ID: s1bxgSckNY6Xv3T4BMjAGw_1755507456 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 
bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id A07331956086; Mon, 18 Aug 2025 08:57:36 +0000 (UTC) Received: from fedora.redhat.com (unknown [10.44.32.213]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 5BCB3180028B; Mon, 18 Aug 2025 08:57:31 +0000 (UTC) From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= To: "Michael S . Tsirkin " Cc: =?UTF-8?q?Eugenio=20P=C3=A9rez?= , Laurent Vivier , virtualization@lists.linux.dev, jasowang@redhat.com, Cindy Lu , linux-kernel@vger.kernel.org, Maxime Coquelin , Yongji Xie , Stefano Garzarella , Xuan Zhuo Subject: [RFC v3 4/7] vduse: return internal vq group struct as map token Date: Mon, 18 Aug 2025 10:57:08 +0200 Message-ID: <20250818085711.3461758-5-eperezma@redhat.com> In-Reply-To: <20250818085711.3461758-1-eperezma@redhat.com> References: <20250818085711.3461758-1-eperezma@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 Return the internal struct that represents the vq group as the virtqueue map token, instead of the device. This allows the map functions to access per-group information. At this moment all the virtqueues share the same vq group, which can only point to ASID 0. This change prepares the infrastructure for actual per-group address space handling. Signed-off-by: Eugenio P=C3=A9rez --- v3: * Make the vq groups a dynamic array to support an arbitrary number of them.
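For readers following the token change, here is a minimal userspace sketch of the pattern this patch relies on: the map token is embedded in the per-group struct, and the callbacks recover the group (and through it the device) with container_of(). The demo_* names, the integer "domain" field and the local container_of() macro are assumptions made for illustration only; they are not part of the patch or of the kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of() (illustrative only). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Placeholder for the opaque token type; only its address matters here. */
struct vring_mapping_opaque { int unused; };

struct demo_dev;

/* Hypothetical analogue of struct vduse_vq_group_int. */
struct demo_vq_group {
	struct vring_mapping_opaque token;	/* handed out as the map token */
	struct demo_dev *dev;			/* back-pointer to the owner */
};

/* Hypothetical analogue of the relevant parts of struct vduse_dev. */
struct demo_dev {
	unsigned int ngroups;
	struct demo_vq_group *groups;
	int domain;	/* stands in for the single shared iova domain */
};

/* Recover the enclosing group from a pointer to its embedded token. */
static struct demo_vq_group *token_to_group(struct vring_mapping_opaque *token)
{
	assert(token != NULL);
	return container_of(token, struct demo_vq_group, token);
}

/* Allocate the group array and point every group back at the device. */
static int demo_dev_init(struct demo_dev *dev, unsigned int ngroups)
{
	dev->ngroups = ngroups ? ngroups : 1;
	dev->groups = calloc(dev->ngroups, sizeof(dev->groups[0]));
	if (!dev->groups)
		return -1;
	for (unsigned int i = 0; i < dev->ngroups; i++)
		dev->groups[i].dev = dev;
	return 0;
}

/* What each map callback does first: token -> group -> device -> domain. */
static int demo_token_to_domain(struct vring_mapping_opaque *token)
{
	return token_to_group(token)->dev->domain;
}
```

A caller would run demo_dev_init() once, hand out &dev->groups[i].token as the token of every virtqueue in group i, and call demo_token_to_domain() from each callback. Because the token lives inside the group, a later patch can add per-group state without touching the callback signatures.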
--- drivers/vdpa/vdpa_user/iova_domain.h | 1 - drivers/vdpa/vdpa_user/vduse_dev.c | 65 +++++++++++++++++++++++----- 2 files changed, 54 insertions(+), 12 deletions(-) diff --git a/drivers/vdpa/vdpa_user/iova_domain.h b/drivers/vdpa/vdpa_user/= iova_domain.h index c0f97dfaf94f..1f3c30be272a 100644 --- a/drivers/vdpa/vdpa_user/iova_domain.h +++ b/drivers/vdpa/vdpa_user/iova_domain.h @@ -26,7 +26,6 @@ struct vduse_bounce_map { }; =20 struct vduse_iova_domain { - struct vring_mapping_opaque token; struct iova_domain stream_iovad; struct iova_domain consistent_iovad; struct vduse_bounce_map *bounce_maps; diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vd= use_dev.c index e3c8fc1aa446..45b188dc2659 100644 --- a/drivers/vdpa/vdpa_user/vduse_dev.c +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -84,6 +84,11 @@ struct vduse_umem { struct mm_struct *mm; }; =20 +struct vduse_vq_group_int { + struct vring_mapping_opaque token; + struct vduse_dev *dev; +}; + struct vduse_dev { struct vduse_vdpa *vdev; struct device *dev; @@ -118,6 +123,7 @@ struct vduse_dev { u32 vq_align; u32 ngroups; struct vduse_umem *umem; + struct vduse_vq_group_int *groups; struct mutex mem_lock; unsigned int bounce_size; struct mutex domain_lock; @@ -164,9 +170,9 @@ static inline struct vduse_dev *dev_to_vduse(struct dev= ice *dev) return vdpa_to_vduse(vdpa); } =20 -static struct vduse_iova_domain *vduse_token_to_domain(struct vring_mappin= g_opaque *token) +static struct vduse_vq_group_int *vduse_token_to_vq_group(struct vring_map= ping_opaque *token) { - return container_of(token, struct vduse_iova_domain, token); + return container_of(token, struct vduse_vq_group_int, token); } =20 static struct vduse_dev_msg *vduse_find_msg(struct list_head *head, @@ -607,6 +613,16 @@ static u32 vduse_get_vq_group(struct vdpa_device *vdpa= , u16 idx) return dev->vqs[idx]->vq_group; } =20 +static union vring_mapping_token vduse_get_vq_mapping_token(struct vdpa_de= vice *vdpa, + u16 idx) +{ + 
struct vduse_dev *dev =3D vdpa_to_vduse(vdpa); + u32 vq_group =3D dev->vqs[idx]->vq_group; + union vring_mapping_token ret =3D { .token =3D &dev->groups[vq_group].tok= en }; + + return ret; +} + static int vduse_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 idx, struct vdpa_vq_state *state) { @@ -856,6 +872,7 @@ static const struct vdpa_config_ops vduse_vdpa_config_o= ps =3D { .get_vq_affinity =3D vduse_vdpa_get_vq_affinity, .reset =3D vduse_vdpa_reset, .set_map =3D vduse_vdpa_set_map, + .get_vq_mapping_token =3D vduse_get_vq_mapping_token, .free =3D vduse_vdpa_free, }; =20 @@ -863,7 +880,9 @@ static void vduse_dev_sync_single_for_device(struct vri= ng_mapping_opaque *token, dma_addr_t dma_addr, size_t size, enum dma_data_direction dir) { - struct vduse_iova_domain *domain =3D vduse_token_to_domain(token); + struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); + struct vduse_dev *vdev =3D group->dev; + struct vduse_iova_domain *domain =3D vdev->domain; =20 vduse_domain_sync_single_for_device(domain, dma_addr, size, dir); } @@ -872,7 +891,9 @@ static void vduse_dev_sync_single_for_cpu(struct vring_= mapping_opaque *token, dma_addr_t dma_addr, size_t size, enum dma_data_direction dir) { - struct vduse_iova_domain *domain =3D vduse_token_to_domain(token); + struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); + struct vduse_dev *vdev =3D group->dev; + struct vduse_iova_domain *domain =3D vdev->domain; =20 vduse_domain_sync_single_for_cpu(domain, dma_addr, size, dir); } @@ -883,7 +904,9 @@ static dma_addr_t vduse_dev_map_page(struct vring_mappi= ng_opaque *token, enum dma_data_direction dir, unsigned long attrs) { - struct vduse_iova_domain *domain =3D vduse_token_to_domain(token); + struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); + struct vduse_dev *vdev =3D group->dev; + struct vduse_iova_domain *domain =3D vdev->domain; =20 return vduse_domain_map_page(domain, page, offset, size, dir, attrs); } @@ -893,7 +916,9 
@@ static void vduse_dev_unmap_page(struct vring_mapping_o= paque *token, enum dma_data_direction dir, unsigned long attrs) { - struct vduse_iova_domain *domain =3D vduse_token_to_domain(token); + struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); + struct vduse_dev *vdev =3D group->dev; + struct vduse_iova_domain *domain =3D vdev->domain; =20 return vduse_domain_unmap_page(domain, dma_addr, size, dir, attrs); } @@ -902,7 +927,9 @@ static void *vduse_dev_alloc_coherent(struct vring_mapp= ing_opaque *token, size_t size, dma_addr_t *dma_addr, gfp_t flag) { - struct vduse_iova_domain *domain =3D vduse_token_to_domain(token); + struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); + struct vduse_dev *vdev =3D group->dev; + struct vduse_iova_domain *domain =3D vdev->domain; unsigned long iova; void *addr; =20 @@ -921,7 +948,9 @@ static void vduse_dev_free_coherent(struct vring_mappin= g_opaque *token, size_t size, void *vaddr, dma_addr_t dma_addr, unsigned long attrs) { - struct vduse_iova_domain *domain =3D vduse_token_to_domain(token); + struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); + struct vduse_dev *vdev =3D group->dev; + struct vduse_iova_domain *domain =3D vdev->domain; =20 vduse_domain_free_coherent(domain, size, vaddr, dma_addr, attrs); } @@ -929,7 +958,9 @@ static void vduse_dev_free_coherent(struct vring_mappin= g_opaque *token, static bool vduse_dev_need_sync(struct vring_mapping_opaque *token, dma_addr_t dma_addr) { - struct vduse_iova_domain *domain =3D vduse_token_to_domain(token); + struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); + struct vduse_dev *vdev =3D group->dev; + struct vduse_iova_domain *domain =3D vdev->domain; =20 return dma_addr < domain->bounce_size; } @@ -944,7 +975,9 @@ static int vduse_dev_mapping_error(struct vring_mapping= _opaque *token, =20 static size_t vduse_dev_max_mapping_size(struct vring_mapping_opaque *toke= n) { - struct vduse_iova_domain *domain =3D 
vduse_token_to_domain(token); + struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); + struct vduse_dev *vdev =3D group->dev; + struct vduse_iova_domain *domain =3D vdev->domain; =20 return domain->bounce_size; } @@ -1750,6 +1783,7 @@ static int vduse_destroy_dev(char *name) if (dev->domain) vduse_domain_destroy(dev->domain); kfree(dev->name); + kfree(dev->groups); vduse_dev_destroy(dev); module_put(THIS_MODULE); =20 @@ -1915,7 +1949,15 @@ static int vduse_create_dev(struct vduse_dev_config = *config, dev->device_features =3D config->features; dev->device_id =3D config->device_id; dev->vendor_id =3D config->vendor_id; + dev->ngroups =3D (dev->api_version < 1) ? 1 : (config->ngroups ?: 1); + dev->groups =3D kcalloc(dev->ngroups, sizeof(dev->groups[0]), + GFP_KERNEL); + if (!dev->groups) + goto err_vq_groups; + for (u32 i =3D 0; i < dev->ngroups; ++i) + dev->groups[i].dev =3D dev; + dev->name =3D kstrdup(config->name, GFP_KERNEL); if (!dev->name) goto err_str; @@ -1952,6 +1994,8 @@ static int vduse_create_dev(struct vduse_dev_config *= config, err_idr: kfree(dev->name); err_str: + kfree(dev->groups); +err_vq_groups: vduse_dev_destroy(dev); err: return ret; @@ -2113,7 +2157,6 @@ static int vdpa_dev_add(struct vdpa_mgmt_dev *mdev, c= onst char *name, return -ENOMEM; } =20 - dev->vdev->vdpa.mapping_token.token =3D &dev->domain->token; ret =3D _vdpa_register_device(&dev->vdev->vdpa, dev->vq_num); if (ret) { put_device(&dev->vdev->vdpa.dev); --=20 2.50.1 From nobody Tue Sep 2 09:50:27 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5C35430BF7B for ; Mon, 18 Aug 2025 08:57:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1755507467; cv=none; b=UBM7+P9+ZAPi3F9AEKiEcmeaQH7A25VZLBKi/IZJc16/qslyb2QG9ZhvrsIXZQeutNfIlW46l32F6dOnOOsuowLIy3ldKWAY9iQJA2WDpWcyvr/sLdXA2S8U7BcdaoC7s+59nJGV8mj4AIHZYp4yH3HrEOBeRp8IgTxB4v7jjqs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755507467; c=relaxed/simple; bh=YeVHgpr0G/Epi9J9NURDwYa/iqPbsgPFjaSKQrD2xBs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Ni9QRSeMPBj/wFhFwtJCckwaDPf9MQRcONAygN5uP/vDA3RPY7hHeQiD0UH1vVvA3OGzy3cjX9B5WCYc+vZerNbPlGBujVcYvhZe+ymIZev1m78wl3+ZSbZd3Aa8tBmprLhGbELtRHROvb51VOtzGBBcZJCngNpu2xaBEU4x5Ig= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Vkkr68/2; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Vkkr68/2" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1755507464; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=AaLpftRzHAuIfiitvGtvjTGLY40NdnI2bh71byGQyKc=; b=Vkkr68/2DHFvvFlkUobmMJUPifBU44EhwSD84DZ2Llf7BcGqAF5IRzJJ3uLssAXQz+O5BA L/3MSKca1wSIg3ysTMSfCiEAwMbc9q0Nopofxu62gKP0+c8RVZHnArUhgZwXMCrr+DSaos LTVAtG8HSU01eTbMKUfZiKBlvNeGspM= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, 
cipher=TLS_AES_256_GCM_SHA384) id us-mta-108-3CINtWkyO7arldixSZTtvA-1; Mon, 18 Aug 2025 04:57:43 -0400 X-MC-Unique: 3CINtWkyO7arldixSZTtvA-1 X-Mimecast-MFC-AGG-ID: 3CINtWkyO7arldixSZTtvA_1755507462 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id B3E7919560A6; Mon, 18 Aug 2025 08:57:41 +0000 (UTC) Received: from fedora.redhat.com (unknown [10.44.32.213]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 4849A180028B; Mon, 18 Aug 2025 08:57:36 +0000 (UTC) From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= To: "Michael S . Tsirkin " Cc: =?UTF-8?q?Eugenio=20P=C3=A9rez?= , Laurent Vivier , virtualization@lists.linux.dev, jasowang@redhat.com, Cindy Lu , linux-kernel@vger.kernel.org, Maxime Coquelin , Yongji Xie , Stefano Garzarella , Xuan Zhuo Subject: [RFC v3 5/7] vduse: create vduse_as to make it an array Date: Mon, 18 Aug 2025 10:57:09 +0200 Message-ID: <20250818085711.3461758-6-eperezma@redhat.com> In-Reply-To: <20250818085711.3461758-1-eperezma@redhat.com> References: <20250818085711.3461758-1-eperezma@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 This is a first step towards supporting more than one address space. No change in the code flow is intended.
Signed-off-by: Eugenio P=C3=A9rez --- drivers/vdpa/vdpa_user/vduse_dev.c | 115 +++++++++++++++-------------- 1 file changed, 59 insertions(+), 56 deletions(-) diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vd= use_dev.c index 45b188dc2659..561aab048d1a 100644 --- a/drivers/vdpa/vdpa_user/vduse_dev.c +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -84,6 +84,12 @@ struct vduse_umem { struct mm_struct *mm; }; =20 +struct vduse_as { + struct vduse_iova_domain *domain; + struct vduse_umem *umem; + struct mutex mem_lock; +}; + struct vduse_vq_group_int { struct vring_mapping_opaque token; struct vduse_dev *dev; @@ -93,8 +99,7 @@ struct vduse_dev { struct vduse_vdpa *vdev; struct device *dev; struct vduse_virtqueue **vqs; - struct vduse_iova_domain *domain; - struct vduse_iova_domain *dom; + struct vduse_as as; char *name; struct mutex lock; spinlock_t msg_lock; @@ -122,9 +127,7 @@ struct vduse_dev { u32 vq_num; u32 vq_align; u32 ngroups; - struct vduse_umem *umem; struct vduse_vq_group_int *groups; - struct mutex mem_lock; unsigned int bounce_size; struct mutex domain_lock; }; @@ -444,7 +447,7 @@ static __poll_t vduse_dev_poll(struct file *file, poll_= table *wait) static void vduse_dev_reset(struct vduse_dev *dev) { int i; - struct vduse_iova_domain *domain =3D dev->domain; + struct vduse_iova_domain *domain =3D dev->as.domain; =20 /* The coherent mappings are handled in vduse_dev_free_coherent() */ if (domain && domain->bounce_map) @@ -823,13 +826,13 @@ static int vduse_vdpa_set_map(struct vdpa_device *vdp= a, struct vduse_dev *dev =3D vdpa_to_vduse(vdpa); int ret; =20 - ret =3D vduse_domain_set_map(dev->domain, iotlb); + ret =3D vduse_domain_set_map(dev->as.domain, iotlb); if (ret) return ret; =20 ret =3D vduse_dev_update_iotlb(dev, 0ULL, ULLONG_MAX); if (ret) { - vduse_domain_clear_map(dev->domain, iotlb); + vduse_domain_clear_map(dev->as.domain, iotlb); return ret; } =20 @@ -882,7 +885,7 @@ static void vduse_dev_sync_single_for_device(struct 
vri= ng_mapping_opaque *token, { struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); struct vduse_dev *vdev =3D group->dev; - struct vduse_iova_domain *domain =3D vdev->domain; + struct vduse_iova_domain *domain =3D vdev->as.domain; =20 vduse_domain_sync_single_for_device(domain, dma_addr, size, dir); } @@ -893,7 +896,7 @@ static void vduse_dev_sync_single_for_cpu(struct vring_= mapping_opaque *token, { struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); struct vduse_dev *vdev =3D group->dev; - struct vduse_iova_domain *domain =3D vdev->domain; + struct vduse_iova_domain *domain =3D vdev->as.domain; =20 vduse_domain_sync_single_for_cpu(domain, dma_addr, size, dir); } @@ -906,7 +909,7 @@ static dma_addr_t vduse_dev_map_page(struct vring_mappi= ng_opaque *token, { struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); struct vduse_dev *vdev =3D group->dev; - struct vduse_iova_domain *domain =3D vdev->domain; + struct vduse_iova_domain *domain =3D vdev->as.domain; =20 return vduse_domain_map_page(domain, page, offset, size, dir, attrs); } @@ -918,7 +921,7 @@ static void vduse_dev_unmap_page(struct vring_mapping_o= paque *token, { struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); struct vduse_dev *vdev =3D group->dev; - struct vduse_iova_domain *domain =3D vdev->domain; + struct vduse_iova_domain *domain =3D vdev->as.domain; =20 return vduse_domain_unmap_page(domain, dma_addr, size, dir, attrs); } @@ -929,7 +932,7 @@ static void *vduse_dev_alloc_coherent(struct vring_mapp= ing_opaque *token, { struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); struct vduse_dev *vdev =3D group->dev; - struct vduse_iova_domain *domain =3D vdev->domain; + struct vduse_iova_domain *domain =3D vdev->as.domain; unsigned long iova; void *addr; =20 @@ -950,7 +953,7 @@ static void vduse_dev_free_coherent(struct vring_mappin= g_opaque *token, { struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); 
struct vduse_dev *vdev =3D group->dev; - struct vduse_iova_domain *domain =3D vdev->domain; + struct vduse_iova_domain *domain =3D vdev->as.domain; =20 vduse_domain_free_coherent(domain, size, vaddr, dma_addr, attrs); } @@ -960,7 +963,7 @@ static bool vduse_dev_need_sync(struct vring_mapping_op= aque *token, { struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); struct vduse_dev *vdev =3D group->dev; - struct vduse_iova_domain *domain =3D vdev->domain; + struct vduse_iova_domain *domain =3D vdev->as.domain; =20 return dma_addr < domain->bounce_size; } @@ -977,7 +980,7 @@ static size_t vduse_dev_max_mapping_size(struct vring_m= apping_opaque *token) { struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); struct vduse_dev *vdev =3D group->dev; - struct vduse_iova_domain *domain =3D vdev->domain; + struct vduse_iova_domain *domain =3D vdev->as.domain; =20 return domain->bounce_size; } @@ -1124,29 +1127,29 @@ static int vduse_dev_dereg_umem(struct vduse_dev *d= ev, { int ret; =20 - mutex_lock(&dev->mem_lock); + mutex_lock(&dev->as.mem_lock); ret =3D -ENOENT; - if (!dev->umem) + if (!dev->as.umem) goto unlock; =20 ret =3D -EINVAL; - if (!dev->domain) + if (!dev->as.domain) goto unlock; =20 - if (dev->umem->iova !=3D iova || size !=3D dev->domain->bounce_size) + if (dev->as.umem->iova !=3D iova || size !=3D dev->as.domain->bounce_size) goto unlock; =20 - vduse_domain_remove_user_bounce_pages(dev->domain); - unpin_user_pages_dirty_lock(dev->umem->pages, - dev->umem->npages, true); - atomic64_sub(dev->umem->npages, &dev->umem->mm->pinned_vm); - mmdrop(dev->umem->mm); - vfree(dev->umem->pages); - kfree(dev->umem); - dev->umem =3D NULL; + vduse_domain_remove_user_bounce_pages(dev->as.domain); + unpin_user_pages_dirty_lock(dev->as.umem->pages, + dev->as.umem->npages, true); + atomic64_sub(dev->as.umem->npages, &dev->as.umem->mm->pinned_vm); + mmdrop(dev->as.umem->mm); + vfree(dev->as.umem->pages); + kfree(dev->as.umem); + dev->as.umem =3D NULL; 
ret =3D 0; unlock: - mutex_unlock(&dev->mem_lock); + mutex_unlock(&dev->as.mem_lock); return ret; } =20 @@ -1159,14 +1162,14 @@ static int vduse_dev_reg_umem(struct vduse_dev *dev, unsigned long npages, lock_limit; int ret; =20 - if (!dev->domain || !dev->domain->bounce_map || - size !=3D dev->domain->bounce_size || + if (!dev->as.domain || !dev->as.domain->bounce_map || + size !=3D dev->as.domain->bounce_size || iova !=3D 0 || uaddr & ~PAGE_MASK) return -EINVAL; =20 - mutex_lock(&dev->mem_lock); + mutex_lock(&dev->as.mem_lock); ret =3D -EEXIST; - if (dev->umem) + if (dev->as.umem) goto unlock; =20 ret =3D -ENOMEM; @@ -1190,7 +1193,7 @@ static int vduse_dev_reg_umem(struct vduse_dev *dev, goto out; } =20 - ret =3D vduse_domain_add_user_bounce_pages(dev->domain, + ret =3D vduse_domain_add_user_bounce_pages(dev->as.domain, page_list, pinned); if (ret) goto out; @@ -1203,7 +1206,7 @@ static int vduse_dev_reg_umem(struct vduse_dev *dev, umem->mm =3D current->mm; mmgrab(current->mm); =20 - dev->umem =3D umem; + dev->as.umem =3D umem; out: if (ret && pinned > 0) unpin_user_pages(page_list, pinned); @@ -1214,7 +1217,7 @@ static int vduse_dev_reg_umem(struct vduse_dev *dev, vfree(page_list); kfree(umem); } - mutex_unlock(&dev->mem_lock); + mutex_unlock(&dev->as.mem_lock); return ret; } =20 @@ -1260,12 +1263,12 @@ static long vduse_dev_ioctl(struct file *file, unsi= gned int cmd, break; =20 mutex_lock(&dev->domain_lock); - if (!dev->domain) { + if (!dev->as.domain) { mutex_unlock(&dev->domain_lock); break; } - spin_lock(&dev->domain->iotlb_lock); - map =3D vhost_iotlb_itree_first(dev->domain->iotlb, + spin_lock(&dev->as.domain->iotlb_lock); + map =3D vhost_iotlb_itree_first(dev->as.domain->iotlb, entry.start, entry.last); if (map) { map_file =3D (struct vdpa_map_file *)map->opaque; @@ -1275,7 +1278,7 @@ static long vduse_dev_ioctl(struct file *file, unsign= ed int cmd, entry.last =3D map->last; entry.perm =3D map->perm; } - spin_unlock(&dev->domain->iotlb_lock); + 
spin_unlock(&dev->as.domain->iotlb_lock); mutex_unlock(&dev->domain_lock); ret =3D -EINVAL; if (!f) @@ -1469,22 +1472,22 @@ static long vduse_dev_ioctl(struct file *file, unsi= gned int cmd, break; =20 mutex_lock(&dev->domain_lock); - if (!dev->domain) { + if (!dev->as.domain) { mutex_unlock(&dev->domain_lock); break; } - spin_lock(&dev->domain->iotlb_lock); - map =3D vhost_iotlb_itree_first(dev->domain->iotlb, + spin_lock(&dev->as.domain->iotlb_lock); + map =3D vhost_iotlb_itree_first(dev->as.domain->iotlb, info.start, info.last); if (map) { info.start =3D map->start; info.last =3D map->last; info.capability =3D 0; - if (dev->domain->bounce_map && map->start =3D=3D 0 && - map->last =3D=3D dev->domain->bounce_size - 1) + if (dev->as.domain->bounce_map && map->start =3D=3D 0 && + map->last =3D=3D dev->as.domain->bounce_size - 1) info.capability |=3D VDUSE_IOVA_CAP_UMEM; } - spin_unlock(&dev->domain->iotlb_lock); + spin_unlock(&dev->as.domain->iotlb_lock); mutex_unlock(&dev->domain_lock); if (!map) break; @@ -1509,8 +1512,8 @@ static int vduse_dev_release(struct inode *inode, str= uct file *file) struct vduse_dev *dev =3D file->private_data; =20 mutex_lock(&dev->domain_lock); - if (dev->domain) - vduse_dev_dereg_umem(dev, 0, dev->domain->bounce_size); + if (dev->as.domain) + vduse_dev_dereg_umem(dev, 0, dev->as.domain->bounce_size); mutex_unlock(&dev->domain_lock); spin_lock(&dev->msg_lock); /* Make sure the inflight messages can processed after reconncection */ @@ -1729,7 +1732,7 @@ static struct vduse_dev *vduse_dev_create(void) return NULL; =20 mutex_init(&dev->lock); - mutex_init(&dev->mem_lock); + mutex_init(&dev->as.mem_lock); mutex_init(&dev->domain_lock); spin_lock_init(&dev->msg_lock); INIT_LIST_HEAD(&dev->send_list); @@ -1780,8 +1783,8 @@ static int vduse_destroy_dev(char *name) idr_remove(&vduse_idr, dev->minor); kvfree(dev->config); vduse_dev_deinit_vqs(dev); - if (dev->domain) - vduse_domain_destroy(dev->domain); + if (dev->as.domain) + 
vduse_domain_destroy(dev->as.domain); kfree(dev->name); kfree(dev->groups); vduse_dev_destroy(dev); @@ -1897,7 +1900,7 @@ static ssize_t bounce_size_store(struct device *devic= e, =20 ret =3D -EPERM; mutex_lock(&dev->domain_lock); - if (dev->domain) + if (dev->as.domain) goto unlock; =20 ret =3D kstrtouint(buf, 10, &bounce_size); @@ -2148,11 +2151,11 @@ static int vdpa_dev_add(struct vdpa_mgmt_dev *mdev,= const char *name, return ret; =20 mutex_lock(&dev->domain_lock); - if (!dev->domain) - dev->domain =3D vduse_domain_create(VDUSE_IOVA_SIZE - 1, + if (!dev->as.domain) + dev->as.domain =3D vduse_domain_create(VDUSE_IOVA_SIZE - 1, dev->bounce_size); mutex_unlock(&dev->domain_lock); - if (!dev->domain) { + if (!dev->as.domain) { put_device(&dev->vdev->vdpa.dev); return -ENOMEM; } @@ -2161,8 +2164,8 @@ static int vdpa_dev_add(struct vdpa_mgmt_dev *mdev, c= onst char *name, if (ret) { put_device(&dev->vdev->vdpa.dev); mutex_lock(&dev->domain_lock); - vduse_domain_destroy(dev->domain); - dev->domain =3D NULL; + vduse_domain_destroy(dev->as.domain); + dev->as.domain =3D NULL; mutex_unlock(&dev->domain_lock); return ret; } --=20 2.50.1 From nobody Tue Sep 2 09:50:27 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4143730C37B for ; Mon, 18 Aug 2025 08:57:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755507473; cv=none; b=kutgwhhHnuhtYTfkw3YLi2nXrhcJpR5+gEpb2y+tB108h/X/jKor1WP7CHK0Zu1B8gzaNw745qjxeRJvEwoN34ku1Qa6OZWCir66ZtXuXplcRlLnZgnBm3hs5656GzyVxgAJojowho94sh63AgmUV1gYlxdWiaySORENKGpRmZs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755507473; c=relaxed/simple; 
bh=/yF/mLVUyznfs6F8Ow0lhgfc03D0XfBzZQwB5h3AjNI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=OyzoELO8mDSX0fzrq249fW49nk57UX3YlE4EQQvjvlDC+tLq6OCEaMfyVkPqkQlFhosDDQAtze/ElVJjYNNS6ZYBIpgjipI80W6UCnMo26LljtiFEAQiRxt250inuQUT6NjYzoyqs25/21+JLNbvOJW1sVAZxjtctrr3eaJqd1M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=BROTaTbE; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="BROTaTbE" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1755507470; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=u/6TWQGiXsjOBEy7mumrLTCVexTxEL4HnXh6O++H7Lc=; b=BROTaTbEv99Q1DuN7TdWxFVDRssPqEvgbjZwxXctM4Viyw8uUPYRPZYMvjZUSBX7lXBFEh qy3lkAqNVzjpQ7Xsan8jkaCPX9lGWUPjRfbeRSTU1wbT/s8a/7ZYid0JPjBN2P7J2gIsYK EQODZaPgdA8/jVzMIyt3g8PLscfUgf0= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-138-FuotNUrMOY6AzUMJNDTIdQ-1; Mon, 18 Aug 2025 04:57:47 -0400 X-MC-Unique: FuotNUrMOY6AzUMJNDTIdQ-1 X-Mimecast-MFC-AGG-ID: FuotNUrMOY6AzUMJNDTIdQ_1755507466 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com 
(mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 5294C19775AC; Mon, 18 Aug 2025 08:57:46 +0000 (UTC) Received: from fedora.redhat.com (unknown [10.44.32.213]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 45F92180028A; Mon, 18 Aug 2025 08:57:42 +0000 (UTC) From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= To: "Michael S . Tsirkin " Cc: =?UTF-8?q?Eugenio=20P=C3=A9rez?= , Laurent Vivier , virtualization@lists.linux.dev, jasowang@redhat.com, Cindy Lu , linux-kernel@vger.kernel.org, Maxime Coquelin , Yongji Xie , Stefano Garzarella , Xuan Zhuo Subject: [RFC v3 6/7] vduse: add vq group asid support Date: Mon, 18 Aug 2025 10:57:10 +0200 Message-ID: <20250818085711.3461758-7-eperezma@redhat.com> In-Reply-To: <20250818085711.3461758-1-eperezma@redhat.com> References: <20250818085711.3461758-1-eperezma@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 Add support for assigning Address Space Identifiers (ASIDs) to each VQ group. This enables mapping each group into a distinct memory space. Now that the driver can change the ASID in the middle of operation, the domain that each vq group points to is also protected by domain_lock. Signed-off-by: Eugenio P=C3=A9rez --- v3: * Increase VDUSE_MAX_VQ_GROUPS to 0xffff (Jason). It was set to a lower value to reduce memory consumption, but vqs are already limited to that value and userspace VDUSE is able to allocate that many vqs. * Remove TODO about merging VDUSE_IOTLB_GET_FD ioctl with VDUSE_IOTLB_GET_INFO.
* Use of array_index_nospec in VDUSE device ioctls. * Embed vduse_iotlb_entry into vduse_iotlb_entry_v2. * Move the umem mutex to asid struct so there is no contention between ASIDs. v2: * Make iotlb entry the last one of vduse_iotlb_entry_v2 so the first part of the struct is the same. --- drivers/vdpa/vdpa_user/vduse_dev.c | 290 +++++++++++++++++++++-------- include/uapi/linux/vduse.h | 52 +++++- 2 files changed, 259 insertions(+), 83 deletions(-) diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vd= use_dev.c index 561aab048d1a..de9550fd1cc8 100644 --- a/drivers/vdpa/vdpa_user/vduse_dev.c +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -92,6 +92,7 @@ struct vduse_as { =20 struct vduse_vq_group_int { struct vring_mapping_opaque token; + struct vduse_iova_domain *domain; struct vduse_dev *dev; }; =20 @@ -99,7 +100,7 @@ struct vduse_dev { struct vduse_vdpa *vdev; struct device *dev; struct vduse_virtqueue **vqs; - struct vduse_as as; + struct vduse_as *as; char *name; struct mutex lock; spinlock_t msg_lock; @@ -127,6 +128,7 @@ struct vduse_dev { u32 vq_num; u32 vq_align; u32 ngroups; + u32 nas; struct vduse_vq_group_int *groups; unsigned int bounce_size; struct mutex domain_lock; @@ -322,7 +324,7 @@ static int vduse_dev_set_status(struct vduse_dev *dev, = u8 status) return vduse_dev_msg_sync(dev, &msg); } =20 -static int vduse_dev_update_iotlb(struct vduse_dev *dev, +static int vduse_dev_update_iotlb(struct vduse_dev *dev, u32 asid, u64 start, u64 last) { struct vduse_dev_msg msg =3D { 0 }; @@ -331,8 +333,14 @@ static int vduse_dev_update_iotlb(struct vduse_dev *de= v, return -EINVAL; =20 msg.req.type =3D VDUSE_UPDATE_IOTLB; - msg.req.iova.start =3D start; - msg.req.iova.last =3D last; + if (dev->api_version < VDUSE_API_VERSION_1) { + msg.req.iova.start =3D start; + msg.req.iova.last =3D last; + } else { + msg.req.iova_v2.start =3D start; + msg.req.iova_v2.last =3D last; + msg.req.iova_v2.asid =3D asid; + } =20 return vduse_dev_msg_sync(dev, 
&msg); } @@ -444,14 +452,28 @@ static __poll_t vduse_dev_poll(struct file *file, pol= l_table *wait) return mask; } =20 +/* Force set the asid to a vq group without a message to the VDUSE device = */ +static void vduse_set_group_asid_nomsg(struct vduse_dev *dev, + unsigned int group, unsigned int asid) +{ + guard(mutex)(&dev->domain_lock); + dev->groups[group].domain =3D dev->as[asid].domain; +} + static void vduse_dev_reset(struct vduse_dev *dev) { int i; - struct vduse_iova_domain *domain =3D dev->as.domain; =20 /* The coherent mappings are handled in vduse_dev_free_coherent() */ - if (domain && domain->bounce_map) - vduse_domain_reset_bounce_map(domain); + for (i =3D 0; i < dev->nas; i++) { + struct vduse_iova_domain *domain =3D dev->as[i].domain; + + if (domain && domain->bounce_map) + vduse_domain_reset_bounce_map(domain); + } + + for (i =3D 0; i < dev->ngroups; i++) + vduse_set_group_asid_nomsg(dev, i, 0); =20 down_write(&dev->rwsem); =20 @@ -626,6 +648,29 @@ static union vring_mapping_token vduse_get_vq_mapping_= token(struct vdpa_device * return ret; } =20 +static int vduse_set_group_asid(struct vdpa_device *vdpa, unsigned int gro= up, + unsigned int asid) +{ + struct vduse_dev *dev =3D vdpa_to_vduse(vdpa); + struct vduse_dev_msg msg =3D { 0 }; + int r; + + if (dev->api_version < VDUSE_API_VERSION_1 || + group >=3D dev->ngroups || asid >=3D dev->nas) + return -EINVAL; + + msg.req.type =3D VDUSE_SET_VQ_GROUP_ASID; + msg.req.vq_group_asid.group =3D group; + msg.req.vq_group_asid.asid =3D asid; + + r =3D vduse_dev_msg_sync(dev, &msg); + if (r < 0) + return r; + + vduse_set_group_asid_nomsg(dev, group, asid); + return 0; +} + static int vduse_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 idx, struct vdpa_vq_state *state) { @@ -826,13 +871,13 @@ static int vduse_vdpa_set_map(struct vdpa_device *vdp= a, struct vduse_dev *dev =3D vdpa_to_vduse(vdpa); int ret; =20 - ret =3D vduse_domain_set_map(dev->as.domain, iotlb); + ret =3D 
vduse_domain_set_map(dev->as[asid].domain, iotlb); if (ret) return ret; =20 - ret =3D vduse_dev_update_iotlb(dev, 0ULL, ULLONG_MAX); + ret =3D vduse_dev_update_iotlb(dev, asid, 0ULL, ULLONG_MAX); if (ret) { - vduse_domain_clear_map(dev->as.domain, iotlb); + vduse_domain_clear_map(dev->as[asid].domain, iotlb); return ret; } =20 @@ -875,6 +920,7 @@ static const struct vdpa_config_ops vduse_vdpa_config_o= ps =3D { .get_vq_affinity =3D vduse_vdpa_get_vq_affinity, .reset =3D vduse_vdpa_reset, .set_map =3D vduse_vdpa_set_map, + .set_group_asid =3D vduse_set_group_asid, .get_vq_mapping_token =3D vduse_get_vq_mapping_token, .free =3D vduse_vdpa_free, }; @@ -885,8 +931,10 @@ static void vduse_dev_sync_single_for_device(struct vr= ing_mapping_opaque *token, { struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); struct vduse_dev *vdev =3D group->dev; - struct vduse_iova_domain *domain =3D vdev->as.domain; + struct vduse_iova_domain *domain; =20 + guard(mutex)(&vdev->domain_lock); + domain =3D group->domain; vduse_domain_sync_single_for_device(domain, dma_addr, size, dir); } =20 @@ -896,8 +944,10 @@ static void vduse_dev_sync_single_for_cpu(struct vring= _mapping_opaque *token, { struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); struct vduse_dev *vdev =3D group->dev; - struct vduse_iova_domain *domain =3D vdev->as.domain; + struct vduse_iova_domain *domain; =20 + guard(mutex)(&vdev->domain_lock); + domain =3D group->domain; vduse_domain_sync_single_for_cpu(domain, dma_addr, size, dir); } =20 @@ -909,8 +959,10 @@ static dma_addr_t vduse_dev_map_page(struct vring_mapp= ing_opaque *token, { struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); struct vduse_dev *vdev =3D group->dev; - struct vduse_iova_domain *domain =3D vdev->as.domain; + struct vduse_iova_domain *domain; =20 + guard(mutex)(&vdev->domain_lock); + domain =3D group->domain; return vduse_domain_map_page(domain, page, offset, size, dir, attrs); } =20 @@ -921,8 +973,10 
@@ static void vduse_dev_unmap_page(struct vring_mapping_= opaque *token, { struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); struct vduse_dev *vdev =3D group->dev; - struct vduse_iova_domain *domain =3D vdev->as.domain; + struct vduse_iova_domain *domain; =20 + guard(mutex)(&vdev->domain_lock); + domain =3D group->domain; return vduse_domain_unmap_page(domain, dma_addr, size, dir, attrs); } =20 @@ -932,11 +986,13 @@ static void *vduse_dev_alloc_coherent(struct vring_ma= pping_opaque *token, { struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); struct vduse_dev *vdev =3D group->dev; - struct vduse_iova_domain *domain =3D vdev->as.domain; + struct vduse_iova_domain *domain; unsigned long iova; void *addr; =20 *dma_addr =3D DMA_MAPPING_ERROR; + guard(mutex)(&vdev->domain_lock); + domain =3D group->domain; addr =3D vduse_domain_alloc_coherent(domain, size, (dma_addr_t *)&iova, flag); if (!addr) @@ -953,8 +1009,10 @@ static void vduse_dev_free_coherent(struct vring_mapp= ing_opaque *token, { struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); struct vduse_dev *vdev =3D group->dev; - struct vduse_iova_domain *domain =3D vdev->as.domain; + struct vduse_iova_domain *domain; =20 + guard(mutex)(&vdev->domain_lock); + domain =3D group->domain; vduse_domain_free_coherent(domain, size, vaddr, dma_addr, attrs); } =20 @@ -963,8 +1021,10 @@ static bool vduse_dev_need_sync(struct vring_mapping_= opaque *token, { struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); struct vduse_dev *vdev =3D group->dev; - struct vduse_iova_domain *domain =3D vdev->as.domain; + struct vduse_iova_domain *domain; =20 + guard(mutex)(&vdev->domain_lock); + domain =3D group->domain; return dma_addr < domain->bounce_size; } =20 @@ -980,8 +1040,10 @@ static size_t vduse_dev_max_mapping_size(struct vring= _mapping_opaque *token) { struct vduse_vq_group_int *group =3D vduse_token_to_vq_group(token); struct vduse_dev *vdev =3D group->dev; - 
struct vduse_iova_domain *domain =3D vdev->as.domain; + struct vduse_iova_domain *domain; =20 + guard(mutex)(&vdev->domain_lock); + domain =3D group->domain; return domain->bounce_size; } =20 @@ -1122,39 +1184,40 @@ static int vduse_dev_queue_irq_work(struct vduse_de= v *dev, return ret; } =20 -static int vduse_dev_dereg_umem(struct vduse_dev *dev, +static int vduse_dev_dereg_umem(struct vduse_dev *dev, u32 asid, u64 iova, u64 size) { int ret; =20 - mutex_lock(&dev->as.mem_lock); + mutex_lock(&dev->as[asid].mem_lock); ret =3D -ENOENT; - if (!dev->as.umem) + if (!dev->as[asid].umem) goto unlock; =20 ret =3D -EINVAL; - if (!dev->as.domain) + if (!dev->as[asid].domain) goto unlock; =20 - if (dev->as.umem->iova !=3D iova || size !=3D dev->as.domain->bounce_size) + if (dev->as[asid].umem->iova !=3D iova || + size !=3D dev->as[asid].domain->bounce_size) goto unlock; =20 - vduse_domain_remove_user_bounce_pages(dev->as.domain); - unpin_user_pages_dirty_lock(dev->as.umem->pages, - dev->as.umem->npages, true); - atomic64_sub(dev->as.umem->npages, &dev->as.umem->mm->pinned_vm); - mmdrop(dev->as.umem->mm); - vfree(dev->as.umem->pages); - kfree(dev->as.umem); - dev->as.umem =3D NULL; + vduse_domain_remove_user_bounce_pages(dev->as[asid].domain); + unpin_user_pages_dirty_lock(dev->as[asid].umem->pages, + dev->as[asid].umem->npages, true); + atomic64_sub(dev->as[asid].umem->npages, &dev->as[asid].umem->mm->pinned_= vm); + mmdrop(dev->as[asid].umem->mm); + vfree(dev->as[asid].umem->pages); + kfree(dev->as[asid].umem); + dev->as[asid].umem =3D NULL; ret =3D 0; unlock: - mutex_unlock(&dev->as.mem_lock); + mutex_unlock(&dev->as[asid].mem_lock); return ret; } =20 static int vduse_dev_reg_umem(struct vduse_dev *dev, - u64 iova, u64 uaddr, u64 size) + u32 asid, u64 iova, u64 uaddr, u64 size) { struct page **page_list =3D NULL; struct vduse_umem *umem =3D NULL; @@ -1162,14 +1225,14 @@ static int vduse_dev_reg_umem(struct vduse_dev *dev, unsigned long npages, lock_limit; int ret; =20 - if 
(!dev->as.domain || !dev->as.domain->bounce_map || - size !=3D dev->as.domain->bounce_size || + if (!dev->as[asid].domain || !dev->as[asid].domain->bounce_map || + size !=3D dev->as[asid].domain->bounce_size || iova !=3D 0 || uaddr & ~PAGE_MASK) return -EINVAL; =20 - mutex_lock(&dev->as.mem_lock); + mutex_lock(&dev->as[asid].mem_lock); ret =3D -EEXIST; - if (dev->as.umem) + if (dev->as[asid].umem) goto unlock; =20 ret =3D -ENOMEM; @@ -1193,7 +1256,7 @@ static int vduse_dev_reg_umem(struct vduse_dev *dev, goto out; } =20 - ret =3D vduse_domain_add_user_bounce_pages(dev->as.domain, + ret =3D vduse_domain_add_user_bounce_pages(dev->as[asid].domain, page_list, pinned); if (ret) goto out; @@ -1206,7 +1269,7 @@ static int vduse_dev_reg_umem(struct vduse_dev *dev, umem->mm =3D current->mm; mmgrab(current->mm); =20 - dev->as.umem =3D umem; + dev->as[asid].umem =3D umem; out: if (ret && pinned > 0) unpin_user_pages(page_list, pinned); @@ -1217,7 +1280,7 @@ static int vduse_dev_reg_umem(struct vduse_dev *dev, vfree(page_list); kfree(umem); } - mutex_unlock(&dev->as.mem_lock); + mutex_unlock(&dev->as[asid].mem_lock); return ret; } =20 @@ -1249,47 +1312,66 @@ static long vduse_dev_ioctl(struct file *file, unsi= gned int cmd, =20 switch (cmd) { case VDUSE_IOTLB_GET_FD: { - struct vduse_iotlb_entry entry; + struct vduse_iotlb_entry_v2 entry; struct vhost_iotlb_map *map; struct vdpa_map_file *map_file; struct file *f =3D NULL; + u32 asid; =20 ret =3D -EFAULT; - if (copy_from_user(&entry, argp, sizeof(entry))) - break; + if (dev->api_version >=3D VDUSE_API_VERSION_1) { + if (copy_from_user(&entry, argp, sizeof(entry))) + break; + } else { + entry.asid =3D 0; + if (copy_from_user(&entry.v1, argp, + sizeof(entry.v1))) + break; + } =20 ret =3D -EINVAL; - if (entry.start > entry.last) + if (entry.v1.start > entry.v1.last) + break; + + if (entry.asid >=3D dev->nas) break; =20 mutex_lock(&dev->domain_lock); - if (!dev->as.domain) { + asid =3D array_index_nospec(entry.asid, dev->nas); + 
if (!dev->as[asid].domain) { mutex_unlock(&dev->domain_lock); break; } - spin_lock(&dev->as.domain->iotlb_lock); - map =3D vhost_iotlb_itree_first(dev->as.domain->iotlb, - entry.start, entry.last); + spin_lock(&dev->as[asid].domain->iotlb_lock); + map =3D vhost_iotlb_itree_first(dev->as[asid].domain->iotlb, + entry.v1.start, entry.v1.last); if (map) { map_file =3D (struct vdpa_map_file *)map->opaque; f =3D get_file(map_file->file); - entry.offset =3D map_file->offset; - entry.start =3D map->start; - entry.last =3D map->last; - entry.perm =3D map->perm; + entry.v1.offset =3D map_file->offset; + entry.v1.start =3D map->start; + entry.v1.last =3D map->last; + entry.v1.perm =3D map->perm; } - spin_unlock(&dev->as.domain->iotlb_lock); + spin_unlock(&dev->as[asid].domain->iotlb_lock); mutex_unlock(&dev->domain_lock); ret =3D -EINVAL; if (!f) break; =20 ret =3D -EFAULT; - if (copy_to_user(argp, &entry, sizeof(entry))) { + if (dev->api_version >=3D VDUSE_API_VERSION_1) + ret =3D copy_to_user(argp, &entry, + sizeof(entry)); + else + ret =3D copy_to_user(argp, &entry.v1, + sizeof(entry.v1)); + + if (ret) { fput(f); break; } - ret =3D receive_fd(f, NULL, perm_to_file_flags(entry.perm)); + ret =3D receive_fd(f, NULL, perm_to_file_flags(entry.v1.perm)); fput(f); break; } @@ -1422,6 +1504,7 @@ static long vduse_dev_ioctl(struct file *file, unsign= ed int cmd, } case VDUSE_IOTLB_REG_UMEM: { struct vduse_iova_umem umem; + u32 asid; =20 ret =3D -EFAULT; if (copy_from_user(&umem, argp, sizeof(umem))) @@ -1429,17 +1512,21 @@ static long vduse_dev_ioctl(struct file *file, unsi= gned int cmd, =20 ret =3D -EINVAL; if (!is_mem_zero((const char *)umem.reserved, - sizeof(umem.reserved))) + sizeof(umem.reserved)) || + (dev->api_version < VDUSE_API_VERSION_1 && + umem.asid !=3D 0) || umem.asid >=3D dev->nas) break; =20 mutex_lock(&dev->domain_lock); - ret =3D vduse_dev_reg_umem(dev, umem.iova, + asid =3D array_index_nospec(umem.asid, dev->nas); + ret =3D vduse_dev_reg_umem(dev, asid, 
umem.iova, umem.uaddr, umem.size); mutex_unlock(&dev->domain_lock); break; } case VDUSE_IOTLB_DEREG_UMEM: { struct vduse_iova_umem umem; + u32 asid; =20 ret =3D -EFAULT; if (copy_from_user(&umem, argp, sizeof(umem))) @@ -1447,10 +1534,15 @@ static long vduse_dev_ioctl(struct file *file, unsi= gned int cmd, =20 ret =3D -EINVAL; if (!is_mem_zero((const char *)umem.reserved, - sizeof(umem.reserved))) + sizeof(umem.reserved)) || + (dev->api_version < VDUSE_API_VERSION_1 && + umem.asid !=3D 0) || + umem.asid >=3D dev->nas) break; + mutex_lock(&dev->domain_lock); - ret =3D vduse_dev_dereg_umem(dev, umem.iova, + asid =3D array_index_nospec(umem.asid, dev->nas); + ret =3D vduse_dev_dereg_umem(dev, asid, umem.iova, umem.size); mutex_unlock(&dev->domain_lock); break; @@ -1458,6 +1550,7 @@ static long vduse_dev_ioctl(struct file *file, unsign= ed int cmd, case VDUSE_IOTLB_GET_INFO: { struct vduse_iova_info info; struct vhost_iotlb_map *map; + u32 asid; =20 ret =3D -EFAULT; if (copy_from_user(&info, argp, sizeof(info))) @@ -1471,23 +1564,31 @@ static long vduse_dev_ioctl(struct file *file, unsi= gned int cmd, sizeof(info.reserved))) break; =20 + if (dev->api_version < VDUSE_API_VERSION_1) { + if (info.asid) + break; + } else if (info.asid >=3D dev->nas) + break; + mutex_lock(&dev->domain_lock); - if (!dev->as.domain) { + asid =3D array_index_nospec(info.asid, dev->nas); + if (!dev->as[asid].domain) { mutex_unlock(&dev->domain_lock); break; } - spin_lock(&dev->as.domain->iotlb_lock); - map =3D vhost_iotlb_itree_first(dev->as.domain->iotlb, + spin_lock(&dev->as[asid].domain->iotlb_lock); + map =3D vhost_iotlb_itree_first(dev->as[asid].domain->iotlb, info.start, info.last); if (map) { info.start =3D map->start; info.last =3D map->last; info.capability =3D 0; - if (dev->as.domain->bounce_map && map->start =3D=3D 0 && - map->last =3D=3D dev->as.domain->bounce_size - 1) + if (dev->as[asid].domain->bounce_map && + map->start =3D=3D 0 && + map->last =3D=3D 
dev->as[asid].domain->bounce_size - 1) info.capability |=3D VDUSE_IOVA_CAP_UMEM; } - spin_unlock(&dev->as.domain->iotlb_lock); + spin_unlock(&dev->as[asid].domain->iotlb_lock); mutex_unlock(&dev->domain_lock); if (!map) break; @@ -1512,8 +1613,10 @@ static int vduse_dev_release(struct inode *inode, st= ruct file *file) struct vduse_dev *dev =3D file->private_data; =20 mutex_lock(&dev->domain_lock); - if (dev->as.domain) - vduse_dev_dereg_umem(dev, 0, dev->as.domain->bounce_size); + for (int i =3D 0; i < dev->nas; i++) + if (dev->as[i].domain) + vduse_dev_dereg_umem(dev, i, 0, + dev->as[i].domain->bounce_size); mutex_unlock(&dev->domain_lock); spin_lock(&dev->msg_lock); /* Make sure the inflight messages can processed after reconncection */ @@ -1732,7 +1835,6 @@ static struct vduse_dev *vduse_dev_create(void) return NULL; =20 mutex_init(&dev->lock); - mutex_init(&dev->as.mem_lock); mutex_init(&dev->domain_lock); spin_lock_init(&dev->msg_lock); INIT_LIST_HEAD(&dev->send_list); @@ -1783,8 +1885,11 @@ static int vduse_destroy_dev(char *name) idr_remove(&vduse_idr, dev->minor); kvfree(dev->config); vduse_dev_deinit_vqs(dev); - if (dev->as.domain) - vduse_domain_destroy(dev->as.domain); + for (int i =3D 0; i < dev->nas; i++) { + if (dev->as[i].domain) + vduse_domain_destroy(dev->as[i].domain); + } + kfree(dev->as); kfree(dev->name); kfree(dev->groups); vduse_dev_destroy(dev); @@ -1831,12 +1936,16 @@ static bool vduse_validate_config(struct vduse_dev_= config *config, sizeof(config->reserved))) return false; =20 - if (api_version < VDUSE_API_VERSION_1 && config->ngroups) + if (api_version < VDUSE_API_VERSION_1 && + (config->ngroups || config->nas)) return false; =20 if (api_version >=3D VDUSE_API_VERSION_1 && config->ngroups > 0xffff) return false; =20 + if (api_version >=3D VDUSE_API_VERSION_1 && config->nas > 0xffff) + return false; + if (config->vq_align > PAGE_SIZE) return false; =20 @@ -1900,7 +2009,8 @@ static ssize_t bounce_size_store(struct device *devic= e, =20 
ret =3D -EPERM; mutex_lock(&dev->domain_lock); - if (dev->as.domain) + /* Assuming that if the first domain is allocated, all are allocated */ + if (dev->as[0].domain) goto unlock; =20 ret =3D kstrtouint(buf, 10, &bounce_size); @@ -1961,6 +2071,13 @@ static int vduse_create_dev(struct vduse_dev_config = *config, for (u32 i =3D 0; i < dev->ngroups; ++i) dev->groups[i].dev =3D dev; =20 + dev->nas =3D (dev->api_version < 1) ? 1 : (config->nas ?: 1); + dev->as =3D kcalloc(dev->nas, sizeof(dev->as[0]), GFP_KERNEL); + if (!dev->as) + goto err_as; + for (int i =3D 0; i < dev->nas; i++) + mutex_init(&dev->as[i].mem_lock); + dev->name =3D kstrdup(config->name, GFP_KERNEL); if (!dev->name) goto err_str; @@ -1997,6 +2114,8 @@ static int vduse_create_dev(struct vduse_dev_config *= config, err_idr: kfree(dev->name); err_str: + kfree(dev->as); +err_as: kfree(dev->groups); err_vq_groups: vduse_dev_destroy(dev); @@ -2122,7 +2241,7 @@ static int vduse_dev_init_vdpa(struct vduse_dev *dev,= const char *name) =20 vdev =3D vdpa_alloc_device(struct vduse_vdpa, vdpa, dev->dev, &vduse_vdpa_config_ops, &vduse_map_ops, - dev->ngroups, 1, name, true); + dev->ngroups, dev->nas, name, true); if (IS_ERR(vdev)) return PTR_ERR(vdev); =20 @@ -2151,11 +2270,20 @@ static int vdpa_dev_add(struct vdpa_mgmt_dev *mdev,= const char *name, return ret; =20 mutex_lock(&dev->domain_lock); - if (!dev->as.domain) - dev->as.domain =3D vduse_domain_create(VDUSE_IOVA_SIZE - 1, - dev->bounce_size); + ret =3D 0; + + for (int i =3D 0; i < dev->nas; ++i) { + dev->as[i].domain =3D vduse_domain_create(VDUSE_IOVA_SIZE - 1, + dev->bounce_size); + if (!dev->as[i].domain) { + ret =3D -ENOMEM; + for (int j =3D 0; j < i; ++j) + vduse_domain_destroy(dev->as[j].domain); + } + } + mutex_unlock(&dev->domain_lock); - if (!dev->as.domain) { + if (ret =3D=3D -ENOMEM) { put_device(&dev->vdev->vdpa.dev); return -ENOMEM; } @@ -2164,8 +2292,12 @@ static int vdpa_dev_add(struct vdpa_mgmt_dev *mdev, = const char *name, if (ret) { 
put_device(&dev->vdev->vdpa.dev); mutex_lock(&dev->domain_lock); - vduse_domain_destroy(dev->as.domain); - dev->as.domain =3D NULL; + for (int i =3D 0; i < dev->nas; i++) { + if (dev->as[i].domain) { + vduse_domain_destroy(dev->as[i].domain); + dev->as[i].domain =3D NULL; + } + } mutex_unlock(&dev->domain_lock); return ret; } diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h index b1c0e47d71fb..54da965a65dc 100644 --- a/include/uapi/linux/vduse.h +++ b/include/uapi/linux/vduse.h @@ -47,7 +47,8 @@ struct vduse_dev_config { __u32 vq_num; __u32 vq_align; __u32 ngroups; /* if VDUSE_API_VERSION >=3D 1 */ - __u32 reserved[12]; + __u32 nas; /* if VDUSE_API_VERSION >=3D 1 */ + __u32 reserved[11]; __u32 config_size; __u8 config[]; }; @@ -82,6 +83,18 @@ struct vduse_iotlb_entry { __u8 perm; }; =20 +/** + * struct vduse_iotlb_entry_v2 - entry of IOTLB to describe one IOVA regio= n in an ASID + * @v1: the original vduse_iotlb_entry + * @asid: address space ID of the IOVA region + * + * Structure used by VDUSE_IOTLB_GET_FD ioctl to find an overlapped IOVA r= egion. + */ +struct vduse_iotlb_entry_v2 { + struct vduse_iotlb_entry v1; + __u32 asid; +}; + /* * Find the first IOVA region that overlaps with the range [start, last] * and return the corresponding file descriptor. 
Return -EINVAL means the @@ -172,6 +185,16 @@ struct vduse_vq_group { __u32 group; }; =20 +/** + * struct vduse_vq_group_asid - ASID of a virtqueue group + * @group: Index of the virtqueue group + * @asid: Address space ID of the group + */ +struct vduse_vq_group_asid { + __u32 group; + __u32 asid; +}; + /** * struct vduse_vq_info - information of a virtqueue * @index: virtqueue index @@ -231,6 +254,7 @@ struct vduse_vq_eventfd { * @uaddr: start address of userspace memory, it must be aligned to page s= ize * @iova: start of the IOVA region * @size: size of the IOVA region + * @asid: Address space ID of the IOVA region * @reserved: for future use, needs to be initialized to zero * * Structure used by VDUSE_IOTLB_REG_UMEM and VDUSE_IOTLB_DEREG_UMEM @@ -240,7 +264,8 @@ struct vduse_iova_umem { __u64 uaddr; __u64 iova; __u64 size; - __u64 reserved[3]; + __u32 asid; + __u32 reserved[5]; }; =20 /* Register userspace memory for IOVA regions */ @@ -254,6 +279,7 @@ struct vduse_iova_umem { * @start: start of the IOVA region * @last: last of the IOVA region * @capability: capability of the IOVA regsion + * @asid: Address space ID of the IOVA region, only if device API version = >=3D 1 * @reserved: for future use, needs to be initialized to zero * * Structure used by VDUSE_IOTLB_GET_INFO ioctl to get information of @@ -264,7 +290,8 @@ struct vduse_iova_info { __u64 last; #define VDUSE_IOVA_CAP_UMEM (1 << 0) __u64 capability; - __u64 reserved[3]; + __u32 asid; /* Only if device API version >=3D 1 */ + __u32 reserved[5]; }; =20 /* @@ -287,6 +314,7 @@ enum vduse_req_type { VDUSE_SET_STATUS, VDUSE_UPDATE_IOTLB, VDUSE_GET_VQ_GROUP, + VDUSE_SET_VQ_GROUP_ASID, }; =20 /** @@ -321,6 +349,18 @@ struct vduse_iova_range { __u64 last; }; =20 +/** + * struct vduse_iova_range_v2 - IOVA range [start, last] if API_VERSION >=3D 1 + * @start: start of the IOVA range + * @last: last of the IOVA range + * @asid: address space ID of the IOVA range + */ +struct vduse_iova_range_v2 { + __u64 start; + __u64 last; +
__u32 asid; +}; + /** * struct vduse_dev_request - control request * @type: request type @@ -330,6 +370,8 @@ struct vduse_iova_range { * @s: device status * @iova: IOVA range for updating * @vq_group: virtqueue group of a virtqueue + * @iova_v2: IOVA range for updating if API_VERSION >=3D 1 + * @vq_group_asid: ASID of a virtqueue group * @padding: padding * * Structure used by read(2) on /dev/vduse/$NAME. @@ -342,8 +384,10 @@ struct vduse_dev_request { struct vduse_vq_state vq_state; struct vduse_dev_status s; struct vduse_iova_range iova; - /* Only if vduse api version >=3D 1 */; + /* Following members only if vduse api version >=3D 1 */; struct vduse_vq_group vq_group; + struct vduse_iova_range_v2 iova_v2; + struct vduse_vq_group_asid vq_group_asid; __u32 padding[32]; }; }; --=20 2.50.1
From nobody Tue Sep 2 09:50:27 2025 From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= To: "Michael S . Tsirkin " Cc: =?UTF-8?q?Eugenio=20P=C3=A9rez?= , Laurent Vivier , virtualization@lists.linux.dev, jasowang@redhat.com, Cindy Lu , linux-kernel@vger.kernel.org, Maxime Coquelin , Yongji Xie , Stefano Garzarella , Xuan Zhuo Subject: [RFC v3 7/7] vduse: bump version number Date: Mon, 18 Aug 2025 10:57:11 +0200 Message-ID: <20250818085711.3461758-8-eperezma@redhat.com> In-Reply-To: <20250818085711.3461758-1-eperezma@redhat.com> References: <20250818085711.3461758-1-eperezma@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable
Finalize the series by advertising VDUSE API v1 support to userspace. Now that all required infrastructure for v1 (ASIDs, VQ groups, update_iotlb_v2) is in place, VDUSE devices can opt in to the new features. Signed-off-by: Eugenio P=C3=A9rez --- drivers/vdpa/vdpa_user/vduse_dev.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vd= use_dev.c index de9550fd1cc8..dbabb1e527bf 100644 --- a/drivers/vdpa/vdpa_user/vduse_dev.c +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -2143,7 +2143,7 @@ static long vduse_ioctl(struct file *file, unsigned i= nt cmd, break; =20 ret =3D -EINVAL; - if (api_version > VDUSE_API_VERSION) + if (api_version > VDUSE_API_VERSION_1) break; =20 ret =3D 0; @@ -2210,7 +2210,7 @@ static int vduse_open(struct inode *inode, struct fil= e *file) if (!control) return -ENOMEM; =20 - control->api_version =3D VDUSE_API_VERSION; + control->api_version =3D VDUSE_API_VERSION_1; file->private_data =3D control; =20 return 0; --=20 2.50.1
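A minimal sketch of what the version bump means for a VDUSE userspace device, under stated assumptions. With the previous patch in this series, vduse_dev_update_iotlb() fills msg.req.iova_v2 (which carries the ASID) when the device API version is at least VDUSE_API_VERSION_1, and msg.req.iova otherwise, so userspace has to read the union member that matches the negotiated version. The struct names and the decode_update_iotlb() helper below are hypothetical local mirrors written for illustration, only the members relevant to VDUSE_UPDATE_IOTLB are reproduced, and VDUSE_API_VERSION_1 is assumed to equal 1. Real code would take the definitions from <linux/vduse.h> on a kernel carrying this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirrors of struct vduse_iova_range and vduse_iova_range_v2. */
struct iova_range {
	uint64_t start;
	uint64_t last;
};

struct iova_range_v2 {
	uint64_t start;
	uint64_t last;
	uint32_t asid;
};

/* Hypothetical mirror of the request union, reduced to the IOTLB members. */
union update_iotlb_payload {
	struct iova_range iova;		/* filled when API version < 1 */
	struct iova_range_v2 iova_v2;	/* filled when API version >= 1 */
	uint32_t padding[32];
};

/* Version-independent view handed to the rest of the device code. */
struct decoded_range {
	uint64_t start;
	uint64_t last;
	uint32_t asid;
};

/*
 * Read the union member that matches the negotiated API version.
 * Before v1 there is a single address space, reported here as ASID 0.
 * The literal 1 stands in for VDUSE_API_VERSION_1 (assumed value).
 */
static struct decoded_range
decode_update_iotlb(uint64_t api_version, const union update_iotlb_payload *p)
{
	struct decoded_range r;

	assert(p != NULL);

	if (api_version < 1) {
		r.start = p->iova.start;
		r.last = p->iova.last;
		r.asid = 0;
	} else {
		r.start = p->iova_v2.start;
		r.last = p->iova_v2.last;
		r.asid = p->iova_v2.asid;
	}
	return r;
}
```

Since start and last sit at the same offsets in both range structs, only the asid handling differs between the two branches. A caller would pass the version it negotiated on the control device together with a pointer to the request payload it read from /dev/vduse/$NAME.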